SACM - United States of America
Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9668
2 results
Search Results
Item Restricted
Efficient Processing of Convolutional Neural Networks on the Edge: A Hybrid Approach Using Hardware Acceleration and Dual-Teacher Compression
(University of Central Florida, 2024-07-05) Alhussain, Azzam; Lin, Mingjie
This dissertation addresses the challenge of accelerating Convolutional Neural Networks (CNNs) for edge computing in computer vision applications by developing specialized hardware solutions that maintain high accuracy and perform real-time inference. Driven by open-source hardware design frameworks such as FINN and HLS4ML, this research focuses on hardware acceleration, model compression, and efficient implementation of CNN algorithms on AMD SoC-FPGAs using High-Level Synthesis (HLS) to optimize resource utilization and improve the throughput/watt of FPGA-based AI accelerators compared to traditional fixed-logic chips, such as CPUs, GPUs, and other edge accelerators. The dissertation introduces a novel CNN compression technique, "Two-Teachers Net," which utilizes PyTorch FX-graph mode to train an 8-bit quantized student model using knowledge distillation from two teacher models, improving the accuracy of the compressed model by 1%-2% compared to existing solutions for edge platforms. This method can be applied to any CNN model and dataset for image classification and seamlessly integrated into existing AI hardware and software optimization toolchains, including Vitis-AI, OpenVINO, TensorRT, and ONNX, without architectural adjustments. This provides a scalable solution for deploying high-accuracy CNNs on low-power edge devices across various applications, such as autonomous vehicles, surveillance systems, robotics, healthcare, and smart cities.

Item Restricted
THERMAL ANALYSIS OF HIGH-PERFORMANCE FPGA-BASED MULTI-CHANNEL TIME-TO-DIGITAL CONVERTERS BASED ON TAPPED DELAY LINES ARCHITECTURE
(University of Dayton, 2024-03-27) Alshehry, Awwad; Chodavarapu, Vamsy
We describe a study on the effect of temperature variations on multi-channel Time-to-Digital Converters (TDCs). The objective is to study the impact of ambient thermal variations on the performance of Field Programmable Gate Array (FPGA)-based Tapped Delay Line (TDL) TDC systems, while simultaneously meeting the requirements of high-precision time measurement, low-cost implementation, small size, and low power consumption. For our study, we chose two devices: the Xilinx Artix-7 and the Microsemi ProASIC3L. The radiation-tolerant ProASIC3L device offers better stability in terms of thermal sensitivity and power consumption compared to the Artix-7. To assess the performance of the TDCs under varying thermal conditions, a laboratory thermal chamber was used to maintain ambient temperatures ranging from -75 to 80 °C, allowing the TDCs' performance to be evaluated across a wide operational range. With the Artix-7 and ProASIC3L devices, we achieved Root Mean Square (RMS) resolutions of 24.7 ps and 554.59 ps, respectively. We characterized the temperature sensitivity of both FPGA devices: the Artix-7 showed a low temperature coefficient, while the ProASIC3L delivered stable, temperature-insensitive performance. Total on-chip power was 0.968 W for the Artix-7 and less than 1.988 mW for the ProASIC3L. The results and analysis presented in this study indicate that the proposed design, using new generations of FPGAs, can help in the design and optimization of FPGA-based TDCs for many applications.