Transferable species distribution modelling: Comparative performance evaluation and interpretation of novel Generalized Functional Response models
Date
2023-11
Publisher
University of Glasgow
Abstract
Predictive species distribution models (SDMs) are becoming increasingly important in
ecology, in the light of rapid environmental change. The predictions of most current SDMs
are specific to the habitat composition of the environments in which such models were
fitted. However, species respond differently to a given habitat depending on the availability
of all habitats in their environment, a phenomenon known as a functional response
in resource selection. The Generalised Functional Response (GFR) framework captures
this dependence by formulating the SDM coefficients as functions of habitat availability
in the broader environment. The original GFR implementation used global polynomial
functions of habitat availability to describe functional responses. In the present thesis, I
develop several refinements of this approach and compare their explanatory and predictive
performance using two simulated and three real datasets.
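As an illustration of the idea, the following minimal sketch expresses a habitat selection coefficient as a global polynomial in habitat availability. It is not the thesis implementation: the single-covariate setting, the function names (`gfr_coefficient`, `log_intensity`) and all numerical values are assumptions chosen for illustration only.

```python
import math

def gfr_coefficient(gammas, availability):
    """Selection coefficient as a global polynomial in habitat availability.

    gammas[k] multiplies availability**k (hypothetical parameterisation).
    """
    return sum(g * availability ** k for k, g in enumerate(gammas))

def log_intensity(intercept, gammas, local_habitat, availability):
    """Log-linear predictor for one habitat covariate.

    The coefficient applied to the local habitat value is not constant:
    it depends on how available that habitat is in the wider environment.
    """
    beta = gfr_coefficient(gammas, availability)
    return intercept + beta * local_habitat

# Hypothetical first-order response: selection weakens as the habitat
# becomes more widely available.
gammas = [2.0, -1.5]
scarce = gfr_coefficient(gammas, 0.2)    # coefficient where habitat is scarce
abundant = gfr_coefficient(gammas, 0.8)  # coefficient where habitat is common
print(scarce > abundant)
```

A stationary SDM corresponds to the special case `gammas = [beta]`, where the coefficient is the same in every environment.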
First, I use local radial basis functions (RBFs), a more flexible approach than global polynomials,
to represent the habitat selection coefficients, together with regularization to balance bias
and variance and prevent over-fitting. Second, I use the RBF-GFR and GFR models in
combination with classification and regression trees (CART), which offer more flexibility
and better predictive power for non-linear modelling. As further extensions, I use
random forest (RF) and extreme gradient boosting (XGBoost) ensemble approaches, which
consistently reduce the variance component of the generalization error.
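The RBF representation with regularization can be sketched as follows. This is a hedged illustration rather than the thesis code: the Gaussian form of the basis functions, the L2 (ridge) penalty, the squared-error loss and every name and number below are assumptions, since the abstract does not specify them.

```python
import math

def rbf_features(availability, centres, width):
    """Gaussian radial basis functions evaluated at one availability value."""
    return [
        math.exp(-((availability - c) ** 2) / (2.0 * width ** 2))
        for c in centres
    ]

def rbf_coefficient(weights, availability, centres, width):
    """Selection coefficient as a weighted sum of local basis functions."""
    phi = rbf_features(availability, centres, width)
    return sum(w * p for w, p in zip(weights, phi))

def penalised_loss(weights, data, centres, width, lam):
    """Squared error plus an L2 penalty on the RBF weights (assumed form).

    data: iterable of (local_habitat, availability, response) tuples.
    lam:  penalty strength; larger values shrink the weights and trade
          variance for bias.
    """
    sse = 0.0
    for habitat, availability, response in data:
        pred = rbf_coefficient(weights, availability, centres, width) * habitat
        sse += (response - pred) ** 2
    return sse + lam * sum(w * w for w in weights)

# Hypothetical setup: three centres spread over the availability range.
centres = [0.0, 0.5, 1.0]
width = 0.25
data = [(1.0, 0.5, 2.0)]
print(penalised_loss([0.0, 2.0, 0.0], data, centres, width, lam=0.1))
```

Because each basis function is only appreciably non-zero near its centre, changing one weight alters the coefficient locally, whereas changing one polynomial term alters it across the whole availability range.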
After applying the original and extended models to four different datasets, I find that
the different methods perform consistently across the datasets, such that their approximate
ranking for out-of-sample prediction is preserved. The traditional stationary approach to
SDMs, which does not include the GFR, consistently ranks last. The
best methods in my list provide non-negligible improvements in predictive performance,
in some cases raising the out-of-sample R² score from 0.3 to 0.7.
At times of rapid environmental change and spatial non-stationarity, ignoring the effects
of functional responses on SDMs results in two different types of prediction bias
(under-prediction or mis-positioning of distribution hotspots). However, not all functional
response models are created equal. The more volatile GFR models may fall foul of similar
biases. My results indicate that there are robust GFR approaches that achieve
transferability consistently across very different datasets.
In addition to these improvements in predictive performance resulting from the GFR,
RBF-GFR and their extensions, it is also essential to know whether these models can
offer insights into the mechanisms mediating species distributions. I use one of the simulated
datasets to interpret two of the models that provide the best predictive power for
this dataset. The selection coefficients estimated by the two models are similar, which
accounts for why the two models describe the observed data in similar ways. In addition,
the behaviour of the availability-filtered selectivity coefficients is consistent with the
known mechanisms generating the data. These findings indicate that, despite their purely
statistical nature, these fundamentally different models show convergent and realistic behaviour.
To test the transferability of the improved versions of the GFR model on a large-scale,
multi-species dataset, I use the challenging North American Breeding Bird
Survey (BBS) dataset. I discuss how the information in the dataset affects how well
the abundance of each species can be predicted. My recent extensions of the GFR model double the
biodiversity prediction accuracy relative to the standard generalised linear model (GLM)
and the original GFR model.
Keywords
Generalised linear models, Habitat selection, Predictive species distribution models, Radial basis functions, Random forests, Transferability