A Comparative Analysis of ARIMAX and LSTM for Predicting TSLA Stock Prices Using S&P 500 and VIX Data
Date
2025
Publisher
Saudi Digital Library
Abstract
Forecasting stock prices is a vital and complex task in financial research, made harder by the rising volatility typical of modern markets. Statistical models have long been used for this task, but machine-learning and deep-learning methods have emerged as strong alternatives for financial time-series data, whose complicated, non-linear nature poses significant challenges for analysis. This dissertation investigates the long-standing debate between classical and modern forecasting methods by focusing on one question: how well each can forecast a notoriously volatile stock, Tesla Inc. (TSLA), during a period of extreme market volatility from 2020 to 2025. The study is motivated by the question of whether a classical forecasting algorithm, aided by additional market data, can come close to matching a state-of-the-art deep-learning algorithm in such exceptional conditions.
To address the research question, a comparative quantitative study was carried out following the Cross-Industry Standard Process for Data Mining (CRISP-DM). Two models were built and compared: a statistical baseline, the AutoRegressive Integrated Moving Average with eXogenous variables (ARIMAX) model, and a Long Short-Term Memory (LSTM) neural network representing the deep-learning paradigm. Both models were trained on a dataset of historical daily TSLA prices, supplemented with data from a widely used market benchmark, the S&P 500, and a well-known volatility index, the CBOE Volatility Index (VIX), together with a set of engineered technical indicators intended to capture market trends, volatility, and momentum. Predictive accuracy was assessed on an unseen test dataset using standard regression metrics. The results were conclusive. The LSTM outperformed ARIMAX on all evaluation criteria, suggesting a greater capacity to recognise the complex patterns in stock data. The LSTM attained a Mean Absolute Percentage Error (MAPE) of 5.02%, compared with 27.76% for ARIMAX. Similarly, the LSTM’s Root Mean Squared Error (RMSE) was more than five times lower than that of ARIMAX ($20.34 compared with $107.83).
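For readers less familiar with the two reported metrics, the following is a minimal sketch of how MAPE and RMSE are conventionally computed. The function names `mape` and `rmse` and the sample prices are illustrative assumptions; they are not taken from the dissertation, whose own implementation is not described here.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes equal-length sequences and no zero values in `actual`.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error, in the same units as the inputs (e.g. dollars)."""
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


# Hypothetical closing prices and forecasts, for illustration only.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]

mape_value = mape(actual, predicted)
rmse_value = rmse(actual, predicted)
print(f"MAPE: {mape_value:.2f}%  RMSE: ${rmse_value:.2f}")
```

MAPE is scale-free, which is why it is quoted as a percentage, whereas RMSE is expressed in price units and penalises large errors more heavily.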
The main contribution of this work is empirical evidence that, in environments with high asset volatility, the forecasting accuracy of deep-learning models substantially exceeds that of traditional linear statistical models. A key implication for financial practitioners and quantitative analysts is that relying on simpler models for such assets is insufficient and leads to significant forecasting errors. The study therefore points to the need to build and use sophisticated deep-learning methods to gain further, actionable insight
within the dynamic realm of financial markets.
Keywords
LSTM, Machine learning, ARIMAX, RMSE, MAE, Forecasting, Stock market
