Efficient Processing of Convolutional Neural Networks on the Edge: A Hybrid Approach Using Hardware Acceleration and Dual-Teacher Compression
dc.contributor.advisor | Lin, Mingjie | |
dc.contributor.author | Alhussain, Azzam | |
dc.date.accessioned | 2024-07-24T07:42:59Z | |
dc.date.available | 2024-07-24T07:42:59Z | |
dc.date.issued | 2024-07-05 | |
dc.description | This dissertation advances the deployment of Artificial Intelligence (AI) on small, resource-constrained devices, making cutting-edge technologies more accessible and practical for everyday use. The research develops techniques to compress and optimize deep learning algorithms so that complex AI systems can run efficiently on small, battery-powered devices while maintaining high performance. On-device processing also preserves data security and privacy, since no cloud involvement is required. The impact of this work spans diverse sectors, including healthcare, public safety, and consumer electronics, enabling transformative applications that enhance quality of life and benefit society. | |
dc.description.abstract | This dissertation addresses the challenge of accelerating Convolutional Neural Networks (CNNs) for edge computing in computer vision applications by developing specialized hardware solutions that maintain high accuracy and perform real-time inference. Building on open-source hardware design frameworks such as FINN and HLS4ML, this research focuses on hardware acceleration, model compression, and efficient implementation of CNN algorithms on AMD SoC-FPGAs using High-Level Synthesis (HLS) to optimize resource utilization and improve the throughput per watt of FPGA-based AI accelerators relative to fixed-architecture processors such as CPUs, GPUs, and other edge accelerators. The dissertation introduces a novel CNN compression technique, "Two-Teachers Net," which uses PyTorch FX graph mode to train an 8-bit quantized student model via knowledge distillation from two teacher models, improving the accuracy of the compressed model by 1%-2% over existing solutions for edge platforms. The method applies to any CNN model and dataset for image classification and integrates seamlessly into existing AI hardware and software optimization toolchains, including Vitis-AI, OpenVINO, TensorRT, and ONNX, without architectural adjustments. This provides a scalable solution for deploying high-accuracy CNNs on low-power edge devices across applications such as autonomous vehicles, surveillance systems, robotics, healthcare, and smart cities. | |
dc.format.extent | 189 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14154/72674 | |
dc.language.iso | en_US | |
dc.publisher | University of Central Florida | |
dc.subject | Deep Neural Networks | |
dc.subject | FPGA | |
dc.subject | Computer Vision | |
dc.subject | Edge AI | |
dc.subject | CNN | |
dc.title | Efficient Processing of Convolutional Neural Networks on the Edge: A Hybrid Approach Using Hardware Acceleration and Dual-Teacher Compression | |
dc.type | Thesis | |
sdl.degree.department | Electrical and Computer Engineering | |
sdl.degree.discipline | Electrical Engineering | |
sdl.degree.grantor | University of Central Florida | |
sdl.degree.name | Doctor of Philosophy |
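
The dual-teacher distillation described in the abstract can be sketched as follows. This is a minimal, hedged illustration of the general two-teacher knowledge-distillation idea, not the dissertation's actual "Two-Teachers Net" implementation (which uses PyTorch FX graph mode and 8-bit quantization); the function names and the averaging of the two teachers' soft targets are assumptions for exposition.

```python
# Sketch of dual-teacher knowledge distillation: the student's softened
# predictions are pulled toward the average of two teachers' temperature-
# softened output distributions via KL divergence. Pure-Python for clarity;
# a real pipeline would operate on framework tensors.
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def dual_teacher_kd_loss(student_logits, teacher1_logits, teacher2_logits,
                         temperature=4.0):
    """Distillation loss against the averaged soft targets of two teachers."""
    t1 = softmax(teacher1_logits, temperature)
    t2 = softmax(teacher2_logits, temperature)
    target = [(a + b) / 2 for a, b in zip(t1, t2)]  # ensemble soft target
    student = softmax(student_logits, temperature)
    # Scale by T^2, the standard correction for softened-gradient magnitude.
    return temperature ** 2 * kl_divergence(target, student)

# A student that matches both teachers exactly incurs (near) zero loss;
# in training this term is typically combined with a hard-label loss.
loss = dual_teacher_kd_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
```

In practice this distillation term would be weighted against a standard cross-entropy loss on ground-truth labels while the quantized student is trained.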