Predicting Delayed Flights for International Airports Using Artificial Intelligence Models & Techniques
Date
2025
Publisher
Saudi Digital Library
Abstract
Delayed flights are a pervasive challenge in the aviation industry, significantly affecting operational efficiency, passenger satisfaction, and economic costs. This thesis aims to develop predictive models that perform strongly and reliably, maintain high accuracy on the tested datasets, and show potential for application in a range of real-world aviation scenarios. The models use artificial intelligence and deep learning techniques to address the complexity of predicting delayed flights. The study evaluates Long Short-Term Memory (LSTM) networks, Convolutional Neural Networks (CNN), and their hybrid (LSTM-CNN), which combines temporal and spatial pattern analysis, alongside Large Language Models (LLMs, specifically OpenAI's Babbage model), which excel at processing structured and unstructured text data. In addition, the research introduces a unified machine learning framework that uses a Gradient Boosting Machine (GBM) for regression and a Light Gradient Boosting Machine (LGBM) for classification, to estimate both flight delay durations and their underlying causes.
The models were tested on high-dimensional datasets from John F. Kennedy International Airport (JFK) and on a synthetic dataset from King Abdulaziz International Airport (KAIA). Among the evaluated models, the hybrid LSTM-CNN model demonstrated the best performance, achieving 99.91% prediction accuracy with a prediction time of 2.18 seconds, faster than both GBM (98.5% accuracy, 6.75 seconds) and LGBM (99.99% precision, 4.88 seconds). In the unified framework, GBM achieved a coefficient of determination of R² = 0.9086 in predicting delay durations, while LGBM reached 99.99% precision in identifying delay causes. Results indicated that National Aviation System delays (correlation: 0.600), carrier-related delays (0.561), and late aircraft arrivals (0.519) were the most significant contributors, while weather factors played a moderate role.
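The results above are reported in several different metrics: accuracy, precision, R², and correlation. As a minimal sketch of how these quantities are conventionally defined, the following Python computes each one on small made-up arrays. The function names and toy values are illustrative assumptions; they are not taken from the thesis, its code, or its datasets.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Of the samples predicted as `positive`, the fraction that truly are."""
    truths_for_predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not truths_for_predicted_pos:
        return 0.0  # no positive predictions were made
    hits = sum(t == positive for t in truths_for_predicted_pos)
    return hits / len(truths_for_predicted_pos)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)


# Toy classification labels (1 = delayed, 0 = on time); values are invented.
delayed_true = [1, 0, 1, 1, 0, 0, 1, 0]
delayed_pred = [1, 0, 1, 0, 0, 0, 1, 0]

# Toy delay durations in minutes; values are invented.
minutes_true = [10, 20, 30, 40]
minutes_pred = [12, 18, 33, 37]

print("accuracy :", accuracy(delayed_true, delayed_pred))
print("precision:", precision(delayed_true, delayed_pred))
print("R^2      :", round(r_squared(minutes_true, minutes_pred), 3))
print("pearson  :", pearson([1, 2, 3], [1, 3, 2]))
```

In this toy example accuracy and precision differ (one delayed flight is missed, but every flight predicted as delayed really is delayed), which is why an accuracy figure for one model and a precision figure for another are not directly interchangeable.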
These findings underscore the exceptional accuracy and efficiency of LSTM-CNN, establishing it as the optimal model for predicting delayed flights due to its superior performance and speed. The study highlights the potential for integrating LSTM-CNN into real-time airport management systems, enhancing operational efficiency and decision-making while paving the way for smarter, AI-driven air traffic systems.
Description
This thesis addresses the critical challenge of flight delays in the aviation industry by developing and evaluating advanced artificial intelligence and deep learning models for accurate and efficient delay prediction. The research investigates the performance of LSTM, CNN, hybrid LSTM–CNN, and Large Language Models (LLM) on real and synthetic datasets from JFK and King Abdulaziz International Airports. Additionally, Gradient Boosting Machine (GBM) and Light Gradient Boosting Machine (LGBM) models are applied for regression and classification tasks to estimate delay durations and identify their causes. Experimental results reveal that the hybrid LSTM–CNN model achieves superior accuracy and speed compared to other models, demonstrating its potential for real-time integration in airport management systems. The study contributes to enhancing operational efficiency, improving decision-making, and advancing AI-driven air traffic management.
This thesis addresses the critical challenges associated with flight delays in the aviation sector by developing and evaluating advanced artificial intelligence and deep learning models that predict delays accurately and efficiently. The study reviews the performance of Long Short-Term Memory (LSTM) networks, Convolutional Neural Networks (CNN), and the hybrid LSTM–CNN model, in addition to Large Language Models (LLM), using real data from John F. Kennedy Airport (JFK) and synthetic data from King Abdulaziz International Airport (KAIA). Regression and classification were also carried out with Gradient Boosting Machine (GBM) models and the LightGBM algorithm to estimate delay durations and identify their causes. The results showed that the hybrid LSTM–CNN model outperformed the other models in accuracy and speed, highlighting its potential for integration into real-time airport management systems. This study contributes to improving operational efficiency, supporting decision-making, and strengthening AI-based air traffic management systems.
Keywords
Computer Science, Deep Learning, Artificial Intelligence, Machine Learning, Operations Research, Air Traffic Management, Predictive Analytics, Data Science, Aviation Management, Transportation Engineering