Saudi Cultural Missions Theses & Dissertations

Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10

  • Item (Restricted)
    Exploring Nonlinear Associations and Interactions of Risk Factors for Breast Cancer Incidence Using Machine Learning Approaches
    (Imperial College London, 2024-08) Alqarni, Lina; Heath, Alicia; Berrington, Amy
    BACKGROUND: Breast cancer is influenced by a complex array of risk factors. This study aimed to identify nonlinear associations and interactions between various risk factors and breast cancer incidence using computationally efficient, interpretable methods.
    METHODS: Data from the Generations Study, a long-term prospective cohort of 104,423 women, were analysed. Risk factors evaluated included demographic, medical, reproductive, hormonal, and lifestyle variables. We compared the performance of traditional Cox proportional hazards models with tree-based methods, including Classification and Regression Trees (CART) and random forests, using the C-statistic. SHapley Additive exPlanations (SHAP) values were extracted to interpret random forest outputs, highlighting key risk factors and interactions. Stability selection was applied to enhance computational efficiency and to identify the most stable and important variables.
    RESULTS: The multivariable Cox model achieved the highest predictive accuracy, with a C-index of 0.657, slightly outperforming the random forest model (C-index of 0.650). However, the random forest model revealed nonlinear associations and interactions not captured by the Cox model. Age, family history of breast cancer, and benign breast disease were among the most critical factors identified. Complex interactions were noted between age, body mass index at entry, and family history on the one hand and other risk factors on the other, such as hormone replacement therapy duration, oral contraceptive duration, and smoking pack-years. Stability selection effectively reduced the number of variables without compromising model performance.
    CONCLUSIONS: While linear models capture dominant associations, tree-based models such as random forests offer additional insights into complex, nonlinear relationships among breast cancer risk factors, highlighting the potential for more personalised screening and prevention strategies.
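
    The abstract compares models by their C-statistic (C-index). As a rough illustration of what that number measures, the sketch below computes Harrell's concordance index for right-censored data in plain Python. It is not the thesis's implementation, which this record does not describe: the function name `harrell_c_index`, the simplified handling of tied follow-up times (such pairs are skipped), and the toy values are all assumptions made for illustration, and none of the numbers come from the Generations Study.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Illustrative Harrell's C-index for right-censored data.

    times       -- follow-up time per subject
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk

    A pair is comparable when the subject with the shorter follow-up
    actually had the event. The pair is concordant when that subject
    also has the higher risk score; tied scores count as 0.5.
    Pairs with identical follow-up times are skipped (a simplification).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore tied follow-up times
        # a is the subject with the shorter follow-up
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier subject was censored: pair not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
print(toy_c)
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.657 and 0.650 should be read.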

Copyright owned by the Saudi Digital Library (SDL) © 2025