Forecasting Solar Power Time Series: Strategies For Multi-Modal Data Fusion, Feature Relevance, and Sparse Data Management
Abstract
The forecasting of solar photovoltaic power (SPVP) is a significant challenge.
Solar is the least reliable renewable energy, as it depends on the weather,
among other things. However, it is also one of the cheapest sources if it can
be harnessed, particularly during daylight hours when people work and use
electricity. The ultimate aim of forecasting solar power using deep learning
(DL) techniques is to enable the aggregated use of solar power stations by
day, supplemented by alternative sources of electricity whenever solar energy
is forecast to fall below a particular level. The more accurate the solar power
predictions, the better the use and supply of this valuable resource.
This thesis proposes an SPVP forecasting method that applies DL methodologies to real data from multiple solar power stations. SPVP time series data is complex and characterized by variable, dynamic, and multidimensional attributes. Consequently, the research in this thesis has to address various challenges, predominantly stemming from the inherent characteristics
of SPVP data. The multifaceted nature of these challenges includes data
variability and non-stationarity, where the influence of diverse environmental
conditions, seasonal variations, and geographical factors introduces significant
fluctuation and unpredictability into the data. To address this variability,
forecasting models are needed that can adapt to changing patterns and
predict accordingly. Additionally, the multi-dimensional nature of
the inputs required for precise forecasting poses another hurdle.
Accurate SPVP generation forecasting models need to integrate multiple types of data, not only historical generation data but also exogenous variables such as weather conditions. Compounding these challenges is the issue of data availability. Many solar installations, especially new ones or those in less-studied regions, do not have the extensive historical data crucial for training robust forecasting models. Traditional machine learning methods often prove inadequate, as they depend on extensive data manipulation and feature engineering and therefore require deep domain expertise, a capability that is not always available. These methods struggle to
capture and utilize the dynamic interplay between the factors affecting SPVP
generation, and this underscores the need for innovative approaches that can
navigate these complexities more effectively.
Motivated by the limitations of existing forecasting approaches, this research explores innovative DL techniques capable of handling the complexities of SPVP data. To address the challenges posed by data variability, we introduce an aggregated SPVP model with a Wavelet-based-coefficient (Wcoeff) approach that is used for univariate data decomposition to denoise the data. The Wcoeff model redefines the wavelet transform (WT) application to streamline feature extraction. This approach provides a scalable and accurate forecasting solution by reducing computational complexity while retaining temporal relationships.
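As a rough illustration of wavelet-based denoising of a univariate series, the sketch below applies a one-level Haar transform, soft-thresholds the detail coefficients, and reconstructs the signal. It is not the Wcoeff model itself: the Haar wavelet, the single decomposition level, the threshold value, and all function names are assumptions chosen only to keep the example small.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_decompose(signal):
    """One-level Haar transform: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold` (assumed value)."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def denoise(signal, threshold):
    """Keep the approximation, shrink the detail, then reconstruct."""
    approx, detail = haar_decompose(signal)
    return haar_reconstruct(approx, soft_threshold(detail, threshold))

# Hypothetical power readings; a threshold of 0 leaves the series unchanged.
smoothed = denoise([1.0, 3.0, 2.0, 2.0], threshold=10.0)
```

With a threshold larger than every detail coefficient, each pair of samples collapses to its mean, which shows how the approximation coefficients carry the slow-moving part of the series.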
Exogenous data is then integrated to enhance forecasting accuracy, and
the research addresses the multi-dimensional nature of these inputs through
the innovations of the Multilevel Data Fusion and Neural Basis Expansion
Analysis (MF-NBEA) model. This model represents a pivotal advance in
using DL for SPVP forecasting. Indeed, understanding the most important
lagged variables influencing the generation is crucial for refining forecasting
models. Given the high dimensionality and evolving nature of the data to
be used, a dynamic approach to lagged variable selection and modeling is
required. The research develops dynamic feature selection that adjusts to
changing conditions and highlights the most predictive variables over time.
This adaptability ensures models remain accurate and relevant, even as the
underlying data patterns shift.
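One simple way to picture lagged variable selection that adapts over time is to re-rank candidate lags inside a moving window. The sketch below scores each lag by the absolute Pearson correlation between the windowed target and its lagged copy. This is a stand-in for illustration, not the selection mechanism developed in the thesis; the scoring rule, window size, and function names are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_lags(series, lags, window_end, window_size):
    """Rank candidate lags, best first, using only the window
    series[window_end - window_size : window_end]."""
    start = window_end - window_size
    if start - max(lags) < 0:
        raise ValueError("not enough history for the largest lag")
    target = series[start:window_end]
    scores = {}
    for lag in lags:
        lagged = series[start - lag:window_end - lag]
        scores[lag] = abs(pearson(target, lagged))
    return sorted(lags, key=lambda lag: scores[lag], reverse=True)

# Hypothetical series with a period of 3 steps.
series = [0.0, 1.0, 5.0] * 8
ranking = rank_lags(series, lags=[1, 3], window_end=24, window_size=9)
```

Calling `rank_lags` again with a later `window_end` re-scores the lags on more recent data, so the ranking can change as the underlying patterns shift.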
Finally, we introduce a novel methodology that integrates learned knowledge from multiple source domains to address the critical challenges in forecasting accuracy when data is scarce. This innovative transfer learning approach marks a significant departure from traditional single-source forecasting methods. By leveraging the wealth of data available from already established solar power installations, the new methodology enhances the forecasting
model’s ability to predict solar power output in new locations or locations with
limited historical data. The essence of the novelty is in the strategic fusion of
knowledge from across multiple domains, utilizing advanced techniques such
as average weights fusion and evolutionary-optimization-based fusion.
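A minimal sketch of average weights fusion, assuming each source-domain model is represented as a dictionary of named parameter lists with identical shapes. The representation, the function name `fuse_weights`, and the optional `coeffs` argument are hypothetical; the thesis does not specify them here. Unequal coefficients are included only to suggest where a search procedure, such as an evolutionary optimizer, could tune the mix.

```python
def fuse_weights(source_models, coeffs=None):
    """Combine parameters from several source models into one set.

    source_models: list of dicts mapping a parameter name to a list of floats.
    coeffs: optional mixing coefficients (assumed); defaults to a plain average.
    """
    n = len(source_models)
    if n == 0:
        raise ValueError("at least one source model is required")
    if coeffs is None:
        coeffs = [1.0 / n] * n
    if len(coeffs) != n:
        raise ValueError("one coefficient per source model is required")
    total = sum(coeffs)
    coeffs = [c / total for c in coeffs]  # normalise so coefficients sum to 1

    fused = {}
    for name, reference in source_models[0].items():
        fused[name] = [
            sum(c * model[name][i] for c, model in zip(coeffs, source_models))
            for i in range(len(reference))
        ]
    return fused

# Two hypothetical source-domain models with a single parameter vector each.
station_a = {"w": [1.0, 3.0]}
station_b = {"w": [3.0, 5.0]}
fused = fuse_weights([station_a, station_b])
```

The fused parameters could then initialise a model for a new installation, which would be fine-tuned on whatever limited local data is available.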
This thesis makes a significant contribution to the field of DL models and
renewable energy forecasting by providing scalable, efficient, and adaptable
models. The findings underscore the potential for advanced DL techniques
to navigate the complexities of SPVP time series data and offer insights that
will facilitate the broader integration of solar energy into the power grid. This
work opens avenues for future research to enhance model interpretability,
explore cross-domain applications of transfer learning, and further optimize
models for real-time forecasting applications.
Keywords
Multivariate Time Series, Time Series Analysis, Deep Learning, Transfer Learning, Dynamic Feature Selection, Wavelet Transform, Data Scarcity, Solar Power Forecasting, Photovoltaic, Multi-step Forecasting, Lagged Variables Importance.