Saudi Cultural Missions Theses & Dissertations

Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10

Search Results

  • Item (restricted access)
    EXPERIMENTAL STUDY OF THE IMPORTANCE OF DATA FOR MACHINE LEARNING-BASED BREAST CANCER OUTCOME PREDICTION
    (Saudi Digital Library, 2024) Yamani, Wid; Wojtusiak, Janusz
    Wid Yamani, Ph.D. George Mason University, 2025. Dissertation Director: Dr. Janusz Wojtusiak.
    Researchers have used various large-scale datasets to develop and validate models for breast cancer outcome prediction. However, these datasets have not been systematically compared with respect to predictive performance, feature availability, and suitability for different analytical objectives. Each dataset has its own strengths and limitations, yet no comprehensive study evaluates how these differences affect model performance, particularly across different prediction timeframes and across survival and recurrence outcomes. This gap makes it hard for researchers to choose the most appropriate dataset for a given research question. Effective modeling and prediction of breast cancer outcomes (such as survival and recurrence) depend on the quality of the dataset, the pre-processing techniques used to clean and transform the data, and the choice of predictive model. Selecting a suitable dataset and identifying relevant variables are therefore as crucial as the choice of the model itself. This dissertation addresses the gap by systematically comparing five prominent datasets (SEER Research 8, SEER Research 17, SEER Research Plus, SEER-Medicare, and Medicare Claims data), focusing on breast cancer survival and recurrence. It evaluates the predictive performance of each dataset using supervised machine learning methods, including logistic regression, random forest, and gradient boosting. The models were evaluated using metrics such as AUC, accuracy, recall, and precision, with gradient boosting delivering the most accurate results.
    The findings indicate that SEER-Medicare, which integrates cancer registry data with three years of retrospective claims, outperformed the other datasets, achieving AUCs of 0.891 for 5-year survival and 0.942 for 10-year survival. Its comprehensive health information, including pre-existing conditions and other claims data, makes it particularly valuable for outcome prediction. A drawback of SEER-Medicare is that, because it is based on Medicare data, it primarily includes patients aged 65 and older. This limits its suitability for predicting outcomes in younger breast cancer patients, a significant subgroup with distinct risk factors and treatment responses. SEER Research Plus ranked second, offering data on patient demographics, breast cancer characteristics, staging, outcomes, and treatment, with AUC values of 0.877, 0.901, and 0.937 for 5-year, 10-year, and 15-year survival, respectively. SEER Research 17 and SEER Research 8 include patient demographics, breast cancer characteristics, and staging information but lack treatment details. SEER Research 17, which covers a larger population with more variables, yielded AUC values of 0.870, 0.897, and 0.920 for 5-year, 10-year, and 15-year survival, respectively. SEER Research 8, which covers a smaller population over a longer period, yielded slightly lower AUC values of 0.857, 0.868, and 0.880 for 5-year, 10-year, and 15-year survival, respectively. The results indicate that including treatment and additional variables significantly enhances prediction accuracy, while data size is less critical. This thesis is the first study to compare these SEER datasets, and its comprehensive evaluation provides insight into how data characteristics influence breast cancer outcome modeling.
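    The evaluation metrics named in the abstract (AUC, accuracy, recall, and precision) can be illustrated with a minimal sketch. This is not the dissertation's code: the function names `auc` and `threshold_metrics`, the 0.5 decision threshold, and the six toy labels and scores are all assumptions made for illustration and have no connection to the SEER or Medicare data.

```python
def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs in which the positive
    case receives the higher score; tied scores count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, scores, threshold=0.5):
    """Accuracy, recall, and precision after cutting scores at `threshold`
    (the 0.5 default is an assumption, not a value from the abstract)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Toy labels (1 = event occurred) and model scores, invented for illustration.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]

print(round(auc(y_true, scores), 3))
print(threshold_metrics(y_true, scores))
```

    AUC is threshold-free, which is one reason it is convenient for comparing datasets: it measures how well the scores rank cases, whereas accuracy, recall, and precision all depend on where the cutoff is placed.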
  • Item (restricted access)
    UNRAVELING THE LINK BETWEEN ANTI-INFLAMMATORY DIET, ZINC, AND CADMIUM TOXICITY IN INFLAMMATION REGULATION AMONG CHILDREN AND ADOLESCENTS
    (Florida International University, 2024) Mobarki, Huda; Liuzzi, Juan
    Zinc (Zn) is known for its antioxidant and anti-inflammatory properties and is important in regulating the body’s inflammatory response. However, there is limited evidence on how factors such as diet and heavy metal toxicity contribute to inflammation in children, and whether these effects are influenced by Zn status. This study aimed to investigate the links between diet, Zn, and cadmium (Cd) toxicity and inflammation, using high-sensitivity C-reactive protein (hsCRP) and white blood cell count (WBCs) as biomarkers. Using data from the 2015-2016 National Health and Nutrition Examination Survey (NHANES), which included 3,507 children in the U.S. aged 2-19 years, we explored the associations between the main exposure variables (Zn, Anti-inflammatory Diet Score (ADS), and Cd) and inflammatory biomarkers. Statistical analysis was conducted using a linear regression model. Of the participants, 49.4% were male and 50.6% female. After adjusting for covariates, we observed an inverse relationship between serum Zn and inflammation (β = -.236, p = .008 for WBCs, and β = -.223, p = .035 for hsCRP). Although ADS was inversely associated with inflammation, the relationship was not significant (β = -.006, p = .186 for WBCs, and β = -.003, p = .210 for hsCRP). A significant association was found between blood Cd and WBCs (β = .436, p = .008), but not between blood Cd and hsCRP. After adjusting for Zn, the association between Cd and inflammation became inverse (β = -.083 for WBCs, β = -.099 for hsCRP), although these results were not significant, suggesting that Zn may mitigate Cd’s inflammatory effects. To further support the epidemiological findings, we conducted studies using young C. elegans. The experiment consisted of two studies analyzing the effects of Zn and Cd on the survival of the worms using two-way ANOVA and Tukey tests.
    The results showed that Cd treatment significantly decreased the survival of the worms; however, co-incubation with Zn attenuated this effect when the concentrations of Cd and Zn were equal (100 µM). In conclusion, the epidemiological data indicate that serum Zn is a more reliable indicator of inflammation in children than Zn intake. The study also suggests that Zn status neutralizes Cd's pro-inflammatory effects on inflammatory biomarkers. Additionally, the C. elegans model demonstrated that Zn supplementation mitigated Cd-induced toxicity. These findings highlight the importance of maintaining adequate Zn status to mitigate the harmful effects of Cd exposure in children. Dietary interventions that improve Zn status could therefore potentially reduce inflammation and counteract the adverse impact of Cd exposure at the population level.
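    The abstract reports that the Cd coefficient changed sign once Zn was added to the linear regression. The sketch below shows how such a reversal can arise when two exposures are correlated. It is an illustration only: the helper `ols`, the five toy observations, and the generating coefficients are invented, are not NHANES values, and the sketch uses plain least squares without the covariates or survey design of the actual analysis.

```python
def ols(predictors, y):
    """Ordinary least squares with an intercept, solved through the normal
    equations. `predictors` is a list of columns; returns
    [intercept, b1, b2, ...]. Hypothetical helper for illustration."""
    n = len(y)
    X = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Invented toy data: Zn and Cd levels are negatively correlated.
zn = [1.0, 2.0, 3.0, 4.0, 5.0]
cd = [5.0, 3.0, 4.0, 1.0, 2.0]
# Invented outcome: strongly lowered by Zn, slightly lowered by Cd.
marker = [10.0 - 2.0 * z - 0.1 * c for z, c in zip(zn, cd)]

unadjusted = ols([cd], marker)      # marker ~ Cd
adjusted = ols([cd, zn], marker)    # marker ~ Cd + Zn

print("Cd slope, unadjusted:", round(unadjusted[1], 3))
print("Cd slope, adjusted for Zn:", round(adjusted[1], 3))
```

    In this toy setup the unadjusted Cd slope is positive only because high Cd coincides with low Zn; once Zn is in the model, the Cd slope returns to the small negative value used to generate the data. Whether the same mechanism explains the study's result cannot be determined from the abstract alone.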

Copyright owned by the Saudi Digital Library (SDL) © 2026