Predicting Osteoarthritis in Older Adults Using Literature-Based, Non-Invasive Risk Factors: A Cross-Sectional Analysis of ELSA Wave 9
Date
2025
Publisher
Saudi Digital Library
Abstract
Osteoarthritis (OA) is a prevalent joint disorder in older adults that is often diagnosed at a late stage,
as clinical assessments typically rely on imaging and laboratory tests that are not readily accessible in
all settings. This study aimed to develop and evaluate machine learning models that predict OA using
non-invasive, self-reported features from Wave 9 of the English Longitudinal Study of Ageing (ELSA).
A total of 4,723 participants aged 60 and above were included. An initial set of 32 features was selected
based on existing literature and refined through a structured feature selection pipeline, resulting in a
final set of 25 features, including joint pain and mobility limitations. Four supervised models (Logistic
Regression, Random Forest, XGBoost, and CatBoost) were trained using a stratified train-test split
and resampling to address class imbalance. The upsampled logistic regression model achieved the
highest sensitivity (0.769) and strong overall performance (AUC = 0.755), while CatBoost showed the
highest specificity (0.759) and an AUC of 0.747. A reduced logistic regression model using only the
top 15 features retained similar accuracy and AUC. These findings demonstrate that OA can be
predicted without imaging or biomarkers. The resulting models, particularly the logistic regression
model, offer promise as cost-effective screening tools to support early identification and guide
decisions about further clinical assessment, making them well suited for primary care and digital
health settings, especially where resources are limited.
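
The evaluation steps named in the abstract (stratified train-test split, upsampling to address class imbalance, and reporting sensitivity and specificity) can be illustrated with a minimal sketch. This is not the study's code: the 80/20 split ratio, random oversampling with replacement, applying the resampling to the training portion only, the toy labels, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with the class ratio preserved in each part."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def upsample_minority(indices, labels, seed=0):
    """Resample smaller classes with replacement until all classes are the same size."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for idx in by_class.values():
        balanced.extend(idx)
        balanced.extend(rng.choices(idx, k=target - len(idx)))
    return balanced


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    Assumes both classes occur in y_true (otherwise a denominator is zero).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels (hypothetical): 30 OA cases (1) among 100 participants.
labels = [1] * 30 + [0] * 70
train_idx, test_idx = stratified_split(labels)

# Only the training portion is upsampled; the test portion keeps its natural ratio.
balanced_idx = upsample_minority(train_idx, labels)
train_counts = Counter(labels[i] for i in balanced_idx)
```

Keeping the test portion at its original class ratio means the reported sensitivity and specificity reflect the prevalence a screening tool would actually face, while the balanced training portion keeps the model from simply favouring the majority class.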
Keywords
osteoarthritis, machine learning, predictive modelling, OA, data, aging population
