ADVANCES IN REAL-TIME AMERICAN SIGN LANGUAGE RECOGNITION SYSTEM USING DEEP LEARNING TECHNIQUES FOR ENHANCED ACCESSIBILITY

Date

2026

Publisher

Saudi Digital Library

Abstract

Advancements in technology have significantly contributed to the development of innovative tools aimed at improving communication and accessibility for individuals with hearing impairments. This dissertation explores machine learning and deep learning techniques for recognizing American Sign Language (ASL) gestures, with the goal of enhancing accessibility and bridging the communication gap between hearing-impaired and hearing individuals. Traditional machine learning models, such as Random Forest, Support Vector Machines (SVM), and K-Nearest Neighbors (KNN), were investigated alongside deep learning architectures including AlexNet, ResNet-50, EfficientNet, ConvNeXt, and Vision Transformer. Experiments on a dataset of 87,000 ASL gesture images revealed exceptional recognition accuracy, with ResNet-50 achieving 99.98% and Random Forest reaching 99.55%, while the remaining models performed in the 97%–98% range.

Building on these findings, a real-time recognition system was developed that integrates computer vision and deep learning. The system initially combined MediaPipe for precise hand-movement tracking with YOLOv8, a state-of-the-art object detection model, to translate ASL gestures into text in real time. A dataset of 29,820 annotated images was created to ensure strong generalization across diverse hand positions and lighting conditions, and MediaPipe's hand landmark annotations significantly improved input quality and the YOLOv8 model's training accuracy.

A more advanced framework was later designed that integrates YOLOv11 with MediaPipe for robust real-time ASL alphabet recognition. This system was trained on a large-scale dataset of 130,000 annotated images with custom keypoint-based annotations, enabling the model to capture subtle variations in hand and finger positions.
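As a rough illustration of the keypoint-based annotation step described above, the 21 normalized hand landmarks that MediaPipe Hands produces can be reduced to a YOLO-format bounding-box label. This is a minimal sketch, not the dissertation's actual pipeline; the helper name and the 5% padding value are illustrative assumptions.

```python
# Sketch: derive a YOLO-format label ("class cx cy w h", all values
# normalized to [0, 1]) from 21 hand landmarks such as those produced
# by MediaPipe Hands. The padding value and helper name are assumptions.

def landmarks_to_yolo_label(landmarks, class_id, pad=0.05):
    """landmarks: list of 21 (x, y) pairs in [0, 1] image coordinates."""
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    x_min = max(min(xs) - pad, 0.0)   # pad the box slightly, clamp to image
    x_max = min(max(xs) + pad, 1.0)
    y_min = max(min(ys) - pad, 0.0)
    y_max = min(max(ys) + pad, 1.0)
    cx = (x_min + x_max) / 2          # box centre, normalized
    cy = (y_min + y_max) / 2
    w = x_max - x_min                 # box size, normalized
    h = y_max - y_min
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# Example: a fist-like cluster of landmarks near the image centre.
points = [(0.45 + 0.005 * i, 0.50 + 0.004 * i) for i in range(21)]
print(landmarks_to_yolo_label(points, class_id=0))
```

One label line of this form per hand, alongside the image, is the standard input format for training YOLO-family detectors.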
Experimental evaluation demonstrated outstanding performance, achieving a mean Average Precision (mAP@0.5) of 98.2% with minimal latency, confirming its suitability for real-time applications in education, healthcare, and professional environments. Overall, the findings of this dissertation underscore the transformative potential of AI-driven solutions for ASL recognition. By bridging communication gaps through both traditional classification models and real-time deep learning frameworks, this work contributes to fostering inclusivity, accessibility, and independence for individuals with hearing impairments.
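The mAP@0.5 figure reported above counts a detection as correct only when its intersection-over-union (IoU) with a ground-truth box reaches 0.5. The following stdlib-only sketch shows that matching criterion; the box format and function names are illustrative assumptions, not code from the dissertation.

```python
# Sketch of the IoU >= 0.5 matching rule underlying an mAP@0.5 score.
# Boxes are (x_min, y_min, x_max, y_max) tuples; names are illustrative.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def is_true_positive(pred, gt, threshold=0.5):
    """A prediction counts toward precision only if IoU >= threshold."""
    return iou(pred, gt) >= threshold

gt_box = (0.30, 0.30, 0.70, 0.70)
good = (0.32, 0.28, 0.72, 0.68)   # heavy overlap -> true positive
bad = (0.60, 0.60, 0.95, 0.95)    # slight overlap -> false positive
print(is_true_positive(good, gt_box), is_true_positive(bad, gt_box))
```

Averaging precision over recall levels (and over classes) under this rule yields the mean Average Precision at the 0.5 threshold.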

Keywords

American Sign Language (ASL), Real-time sign language recognition, Deep learning, Computer vision, Hand landmark tracking, YOLO-based detection, Transfer learning, Assistive technology, Accessibility

Copyright owned by the Saudi Digital Library (SDL) © 2026