Predictive Modelling of Flight Delay Times Using Machine Learning Algorithms: A Data-driven Approach to Operational Efficiency in Air Travel
Author(s)
Kashyap, Hillol
Abstract
Flight delays are a significant concern in the aviation industry, affecting operational
efficiency, customer satisfaction, and overall airline performance. This dissertation aims to
develop a predictive framework using machine learning techniques to forecast departure delay
times based on a variety of features, including airport ratings, carrier performance metrics,
and real-time weather conditions.
The study employs several regression algorithms, including Linear Regression, Decision Tree,
Random Forest, XGBoost, Ridge, Lasso, Support Vector Regression (SVR), and K-Nearest
Neighbors (KNN), to build and evaluate predictive models. Each model is assessed using three
key performance indicators: R² score, Mean Absolute Error (MAE), and Root Mean Squared
Error (RMSE). Among the models compared, XGBoost emerged as the most robust and accurate,
offering superior performance in capturing complex, nonlinear interactions among features.
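For readers unfamiliar with the three indicators named above, the following is a minimal sketch of how they can be computed from actual and predicted delays. It is not taken from the dissertation: the function name `regression_metrics` and the sample delay values are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for two equal-length sequences of delays in minutes."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average size of the error, in the same unit as the target.
    mae = sum(abs(e) for e in errors) / n

    # RMSE: like MAE, but squaring penalises large misses more heavily.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R²: share of the variance in the actual delays explained by the model.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return r2, mae, rmse


# Invented departure delays in minutes (illustration only, not study data).
actual = [10.0, 0.0, 25.0, 5.0]
predicted = [12.0, 1.0, 20.0, 7.0]

r2, mae, rmse = regression_metrics(actual, predicted)
print(f"R2={r2:.3f}  MAE={mae:.2f}  RMSE={rmse:.2f}")
```

A higher R² and lower MAE and RMSE indicate a better fit, so a comparison across models would typically rank candidates on all three on the same held-out data.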
To ensure real-world applicability, a Gradio-based web application was developed, enabling
users to interact with the model and obtain real-time predictions by inputting relevant flight
and weather parameters. Exploratory Data Analysis (EDA), feature importance ranking, and
correlation studies were also conducted to enhance model interpretability and domain
relevance.
File(s)
Name
2023MMBA07ASB005.pdf
Size
1.51 MB
Format
Adobe PDF
Checksum
(MD5): cdffbc6f092f5567572ef5120b2ae61c
