SACM - United States of America

Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9668

Restricted
INTELLIGENT ROBOTICS WITH DIGITAL-TWIN ALIGNMENT: SEMANTIC NAVIGATION, MANIPULATION, PLANNING, AND HUMAN-TO-ROBOT ACTION TRANSFORMATION
(Saudi Digital Library, 2025) Alanazi, Ahmed Hamdan; Lee, Yugyung
This dissertation advances AI-empowered indoor robotics through four interconnected contributions that unify navigation, manipulation, semantic planning, and human-to-robot action transformation within a digital-twin-aligned framework. GRIP, a grid-aware semantic navigation module, integrates symbolic scene understanding with hybrid search-and-policy execution to achieve robust and context-aware ObjectNav. PathFormer, a transformer-based manipulation model structured around a 3D spatial-semantic grid, generates smooth, interpretable, and physically consistent trajectories that remain tightly aligned with digital-twin simulation. KG-Transformer, a knowledge-guided semantic planner, leverages a lightweight digital twin to calibrate execution, veto unsafe behaviors, and autonomously repair failing plans across diverse indoor environments. ActionFormer, an action-generation transformer, introduces a unified imitation-learning pipeline that integrates human-activity recognition, human-motion generation, and robot-motion generation. ActionFormer supports more than twenty complex human activities, producing robot-ready demonstrations that generalize across platforms and enable end-to-end imitation learning from video and landmark sequences. Collectively, these contributions establish a coherent foundation for AI-empowered robotics grounded in digital-twin intelligence. Across benchmarks and real-world deployments, GRIP yields up to 9.6% higher success rate and more than 2× gains in path efficiency (SPL, SAE). PathFormer produces digitally consistent manipulation trajectories validated through robust sim-to-real transfer. KG-Transformer achieves 99.6% executability, delivers a +4.6-point improvement on unseen-scene tasks, and eliminates safety violations in both simulated and multi-robot execution. 
ActionFormer attains state-of-the-art performance in human-activity recognition and high execution accuracy across more than 20 activities, generating realistic human-motion traces and corresponding robot-motion trajectories for embodied robotic demonstration. Together, these advances deliver a trustworthy, semantically aligned, and high-performance simulation-to-reality pipeline that significantly enhances the adaptability, reliability, and real-world readiness of autonomous indoor robotic systems.
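For readers unfamiliar with the SPL path-efficiency metric named in the abstract above, the following is a minimal sketch of the commonly used definition (Success weighted by Path Length): the mean over episodes of success × shortest-path length / max(taken-path length, shortest-path length). The function name, argument names, and toy values are illustrative assumptions and are not taken from the dissertation; the dissertation's exact evaluation protocol may differ.

```python
from typing import Sequence


def spl(successes: Sequence[int],
        shortest_lengths: Sequence[float],
        path_lengths: Sequence[float]) -> float:
    """Success weighted by Path Length, averaged over episodes.

    successes[i]        -- 1 if episode i reached the goal, else 0
    shortest_lengths[i] -- shortest-path length to the goal (assumed > 0)
    path_lengths[i]     -- length of the path the agent actually took
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    total = 0.0
    for s, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        # A successful episode scores 1.0 only when the agent's path is as
        # short as the shortest path; longer paths are penalised.
        total += s * shortest / max(taken, shortest)
    return total / n


# Hypothetical toy episodes: one optimal success, one success with a path
# twice as long as necessary, and one failure.
score = spl([1, 1, 0], [5.0, 4.0, 6.0], [5.0, 8.0, 7.0])
```

A failed episode contributes zero regardless of path length, so SPL can never exceed the plain success rate.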
Restricted
Sensing, Scheduling, and Learning for Resource-Constrained Edge Systems
(Saudi Digital Library, 2025) Bukhari, Abdulrahman; Kim, Hyoseung
Recent advances in Internet of Things (IoT) technologies have sparked significant interest in developing learning-based sensing applications on embedded edge devices. These efforts, however, are challenged by adapting to unforeseen conditions in open-world environments and by the practical limitations of low-cost sensors in the field. This dissertation presents the design, implementation, and evaluation of resource-constrained edge systems that address these challenges through time-series sensing, scheduling, and classification. First, we present OpenSense, an open-world time-series sensing framework for performing inference and incremental classification on an embedded edge device, eliminating reliance on powerful cloud servers. To create time for on-device updates without missing events and to reduce sensing and communication overhead, we introduce two dynamic sensor-scheduling techniques: (i) a class-level period assignment scheduler that selects an appropriate sensing period for each inferred class and (ii) a Q-learning–based scheduler that learns event patterns to choose the sensing interval at each classification moment. Experimental results show that OpenSense incrementally adapts to unforeseen conditions and schedules effectively on a resource-constrained device. Second, to bridge the gap between theoretical potential and field practice for low-cost sensors, we present a comprehensive evaluation of a sensing and classification system for early stress and disease detection in avocado plants. The greenhouse deployment spans 72 plants in four treatment categories over six months. For leaves, spectral reflectance coupled with multivariate analysis and permutation testing yields statistically significant results and reliable inference. For soils, we develop a two-level hierarchical classification approach tailored to treatment characteristics that achieves 75–86% accuracy across avocado genotypes and outperforms conventional approaches by over 20%. 
Embedded evaluations on Raspberry Pi and Jetson report end-to-end latency, computation, memory usage, and power consumption, demonstrating practical feasibility. In summary, the contributions are a generalized framework for dynamic, open-world learning on edge devices and an application-specific system for robust classification in noisy field deployments. These real-world deployments collectively outline a practical framework for designing intelligent, cloud-independent edge systems from sensing to inference.
Restricted
Sensing, Scheduling, and Learning for Resource-Constrained Edge Systems
(Saudi Digital Library, 2025) Bukhari, Abdulrahman Ismail Ibrahim; Kim, Hyoseung
Recent advances in Internet of Things (IoT) technologies have sparked significant interest in developing learning-based sensing applications on embedded edge devices. These efforts, however, are challenged by adapting to unforeseen conditions in open-world environments and by the practical limitations of low-cost sensors in the field. This dissertation presents the design, implementation, and evaluation of resource-constrained edge systems that address these challenges through time-series sensing, scheduling, and classification. First, we present OpenSense, an open-world time-series sensing framework for performing inference and incremental classification on an embedded edge device, eliminating reliance on powerful cloud servers. To create time for on-device updates without missing events and to reduce sensing and communication overhead, we introduce two dynamic sensor-scheduling techniques: (i) a class-level period assignment scheduler that selects an appropriate sensing period for each inferred class and (ii) a Q-learning–based scheduler that learns event patterns to choose the sensing interval at each classification moment. Experimental results show that OpenSense incrementally adapts to unforeseen conditions and schedules effectively on a resource-constrained device. Second, to bridge the gap between theoretical potential and field practice for low-cost sensors, we present a comprehensive evaluation of a sensing and classification system for early stress and disease detection in avocado plants. The greenhouse deployment spans 72 plants in four treatment categories over six months. For leaves, spectral reflectance coupled with multivariate analysis and permutation testing yields statistically significant results and reliable inference. For soils, we develop a two-level hierarchical classification approach tailored to treatment characteristics that achieves 75–86% accuracy across avocado genotypes and outperforms conventional approaches by over 20%. 
Embedded evaluations on Raspberry Pi and Jetson report end-to-end latency, computation, memory usage, and power consumption, demonstrating practical feasibility. In summary, the contributions are a generalized framework for dynamic, open-world learning on edge devices and an application-specific system for robust classification in noisy field deployments. These real-world deployments collectively outline a practical framework for designing intelligent, cloud-independent edge systems from sensing to inference.
Restricted
EXPERIMENTAL STUDY OF THE IMPORTANCE OF DATA FOR MACHINE LEARNING-BASED BREAST CANCER OUTCOME PREDICTION
(Saudi Digital Library, 2024) Yamani, Wid; Wojtusiak, Janusz
Wid Yamani, Ph.D., George Mason University, 2025. Dissertation Director: Dr. Janusz Wojtusiak. Researchers have used various large-scale datasets to develop and validate predictive models in breast cancer outcome prediction. However, a notable gap exists due to the lack of a systematic comparison among these datasets regarding predictive performance, feature availability, and suitability for different analytical objectives. While each dataset has unique strengths and limitations, no comprehensive studies evaluate how these differences impact model performance, particularly across diverse timeframes, survival, and recurrence outcomes. This gap limits researchers in making informed choices about the most appropriate dataset for specific research questions. Effective modeling and prediction of breast cancer outcomes (such as cancer survival and recurrence) rely on the dataset's quality, the pre-processing techniques used to clean and transform data, and the choice of predictive models. Therefore, selecting a suitable dataset and identifying relevant variables are as crucial as the choice of the model itself. This dissertation addresses this gap by systematically comparing five prominent datasets for predicting breast cancer outcomes (SEER Research 8, SEER Research 17, SEER Research Plus, SEER-Medicare, and Medicare Claims data), focusing on breast cancer survival and recurrence. It evaluates the predictive performance of each dataset using supervised machine learning methods, including logistic regression, random forest, and gradient boosting. The models were tested on metrics such as AUC, accuracy, recall, and precision, with gradient boosting delivering the most accurate results. 
The findings indicate that SEER-Medicare, which integrates cancer registry data with three years of retrospective claims, outperformed the other datasets, achieving AUCs of 0.891 for 5-year survival and 0.942 for 10-year survival. This dataset's inclusion of comprehensive health information, including pre-existing conditions and other claims data, makes it particularly valuable for outcome prediction. However, a drawback of SEER-Medicare is that it primarily includes patients aged 65 and older, as it is based on Medicare data. This limitation reduces its suitability for predicting outcomes in younger breast cancer patients, a significant subgroup with distinct risk factors and treatment responses. SEER Research Plus ranked second, offering data on patient demographics, breast cancer characteristics, staging, outcomes, and treatment, with AUC values of 0.877, 0.901, and 0.937 for 5-year, 10-year, and 15-year survival, respectively. SEER Research 17 and SEER Research 8 include patient demographics, breast cancer characteristics, and staging information but lack treatment details. SEER Research 17, which covers a larger population with more variables, yielded AUC values of 0.870 for 5-year survival, 0.897 for 10-year survival, and 0.920 for 15-year survival. SEER Research 8, which covers a smaller population over a more extended period, yielded slightly lower AUC values of 0.857, 0.868, and 0.880 for 5-year, 10-year, and 15-year survival, respectively. Results indicate that including treatment and additional variables significantly enhances prediction accuracy, whereas dataset size is less critical. This dissertation is the first study to compare these SEER datasets comprehensively, providing insights into how data characteristics influence breast cancer outcome modeling.
Restricted
Cross Dataset Fairness Evaluation of Transformer Based Sentiment Models
(Saudi Digital Library, 2025-05-10) Zuiran, Sara; Bhattacharyya, Siddhartha
With the growing exploration of Natural Language Processing (NLP) systems in decision-making environments, it is essential to evaluate the technical and ethical aspects of both the dataset and the NLP model to improve fairness. To assess fairness, the thesis examines demographic imbalances in sentiment classification models by evaluating transformer-based models fine-tuned on the Stanford Sentiment Treebank version 2 dataset (SST-2) against the demographically annotated Comprehensive Assessment of Language Model dataset (CALM). This work identifies performance disparities in sentiment prediction across demographic groups by examining sensitive attributes such as gender and race. The study evaluates both the RoBERTa and MentalBERT transformer models using a complete set of fairness metrics consisting of Statistical Parity Difference (SPD), Equal Opportunity Difference (EOD), False Positive Rates (FPR), False Negative Rates (FNR), Jensen-Shannon Divergence (JSD), and Wasserstein Distance (WD). The analysis examines both group-vs-rest and pairwise subgroup comparisons, including gender and ethnicity. Results show that applying adversarial mitigation reduced fairness disparities across demographic subgroups, with the most notable improvements observed for non-binary and Asian users. The observed disparities emphasize the challenge of reducing performance gaps across demographic subgroups in sentiment classification tasks. The thesis introduces a practical framework for evaluating demographic disparities, extends fairness analysis, and assesses the impact of mitigation techniques in cross-dataset sentiment classification. This research proposes a framework that demonstrates a path toward creating inclusive NLP systems and establishes the groundwork for upcoming ethical Artificial Intelligence (AI) studies.
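As an illustration of two of the fairness metrics named in the abstract above, the sketch below computes Statistical Parity Difference and Equal Opportunity Difference in a group-vs-rest setting using their standard textbook definitions. The function names, the sign convention (group minus rest), and the toy labels are assumptions made for illustration; they are not taken from the thesis and may not match its implementation.

```python
from typing import Hashable, List, Sequence


def _rate(values: List[int]) -> float:
    """Mean of a 0/1 list; 0.0 when the list is empty."""
    return sum(values) / len(values) if values else 0.0


def statistical_parity_difference(y_pred: Sequence[int],
                                  group: Sequence[Hashable],
                                  target: Hashable) -> float:
    """P(pred = 1 | group == target) - P(pred = 1 | group != target)."""
    inside = [p for p, g in zip(y_pred, group) if g == target]
    rest = [p for p, g in zip(y_pred, group) if g != target]
    return _rate(inside) - _rate(rest)


def equal_opportunity_difference(y_true: Sequence[int],
                                 y_pred: Sequence[int],
                                 group: Sequence[Hashable],
                                 target: Hashable) -> float:
    """True-positive rate of the target group minus that of the rest."""
    inside = [p for t, p, g in zip(y_true, y_pred, group)
              if t == 1 and g == target]
    rest = [p for t, p, g in zip(y_true, y_pred, group)
            if t == 1 and g != target]
    return _rate(inside) - _rate(rest)


# Hypothetical toy data: binary sentiment labels for two groups "a" and "b".
y_true = [1, 1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

spd = statistical_parity_difference(y_pred, group, "a")
eod = equal_opportunity_difference(y_true, y_pred, group, "a")
```

In this toy data both values are negative, meaning group "a" receives positive predictions less often than the rest and has a lower true-positive rate; a value of zero would indicate parity on that metric.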
Restricted
GRAPH-BASED APPROACH: BRIDGING INSIGHTS FROM STRUCTURED AND UNSTRUCTURED DATA
(Temple University, 2025) Aljurbua, Rafaa; Obradovic, Zoran
Graph-based methodologies provide powerful tools for uncovering intricate relationships and patterns in complex data, enabling the integration of structured and unstructured information for insightful decision-making across diverse domains. Our research focuses on constructing graphs from structured and unstructured data, demonstrating their applications in healthcare and power systems. In healthcare, we examine how social networks influence the attitudes of hemodialysis patients toward kidney transplantation. Using a network-based approach, we investigate how social networks within hemodialysis clinics affect patients' attitudes, contributing to a growing understanding of this dynamic. Our findings emphasize that social networks improve the performance of machine learning models, highlighting the importance of social interactions in clinical settings (Aljurbua et al., 2022). We further introduce Node2VecFuseClassifier, a graph-based model that combines patient interactions with patient characteristics. By comparing problem representations that focus on sociodemographics versus social interactions, we demonstrate that incorporating patient-to-patient and patient-to-staff interactions results in more accurate predictions. This multi-modal analysis, which merges patient experiences with staff expertise, underscores the role of social networks in influencing attitudes toward transplantation (Aljurbua et al., 2024b). In power systems, we explore the impact of severe weather events that lead to power outages, specifically focusing on predicting weather-induced outages three hours in advance at the county level in the Pacific Northwest of the United States. By utilizing a multi-model multiplex network that integrates data from multiple sources including weather, transmission lines, lightning, vegetation, and social media posts from two leading platforms (Twitter and Reddit), we show how multiplex networks offer valuable insights for predicting power outages. 
This integration of diverse data sources and network-based modeling emphasizes the importance of leveraging multiple perspectives to enhance the understanding and prediction of power disruptions (Aljurbua et al., 2023). We further present HMN-RTS, a hierarchical multiplex network that classifies disruption severity by temporal learning from integrated weather recordings and social media posts. The multiplex network layers of this framework gather information about power outages, weather, lightning, land cover, transmission lines, and social media comments. By incorporating multiplex network layers consisting of data collected over time and across regions, we demonstrate that HMN-RTS significantly improves the accuracy of predicting the duration of weather-related outages. This framework enables grid operators to make more reliable predictions up to 6 hours in advance, supporting early risk assessment and proactive mitigation (Aljurbua et al., 2024a, 2025a). Additionally, we introduce SMN-WVF, a spatiotemporal multiplex network designed to predict the duration of power outages in distribution grids. By integrating a network-based approach with multi-modal data across space and time, SMN-WVF offers a novel method for predicting disruption durations in distribution grids, enhancing decision-making and mitigation efforts while highlighting the critical role of network-based approaches in forecasting (Aljurbua et al., 2025b). Overall, our research showcases the potential of graph-based models in tackling complex challenges in both power systems and healthcare. By combining the network-based approach with multi-modal data, we present innovative solutions for predicting power outages and understanding patient attitudes.
Restricted
Quantifying and Profiling Echo Chambers on Social Media
(Arizona State University, 2024) Alatawi, Faisal; Liu, Huan; Sen, Arunabha; Davulcu, Hasan; Shu, Kai
Echo chambers on social media have become a critical focus in the study of online behavior and public discourse. These environments, characterized by the ideological homogeneity of users and limited exposure to opposing viewpoints, contribute to polarization, the spread of misinformation, and the entrenchment of biases. While significant research has been devoted to proving the existence of echo chambers, less attention has been given to understanding their internal dynamics. This dissertation addresses this gap by developing novel methodologies for quantifying and profiling echo chambers, with the goal of providing deeper insights into how these communities function and how they can be measured. The first core contribution of this work is the introduction of the Echo Chamber Score (ECS), a new metric for measuring the degree of ideological segregation in social media interaction networks. The ECS captures both the cohesion within communities and the separation between them, offering a more nuanced approach to assessing polarization. By using a self-supervised Graph Auto-Encoder (EchoGAE), the ECS bypasses the need for explicit ideological labeling, instead embedding users based on their interactions and linguistic patterns. The second contribution is a Heterogeneous Information Network (HIN)-based framework for profiling echo chambers. This framework integrates social and linguistic features, allowing for a comprehensive analysis of the relationships between users, topics, and language within echo chambers. By combining community detection, topic modeling, and language analysis, the profiling method reveals how discourse and group behavior reinforce ideological boundaries. Through the application of these methods to real-world social media datasets, this dissertation demonstrates their effectiveness in identifying polarized communities and profiling their internal discourse. 
The findings highlight how linguistic homophily and social identity theory shape echo chambers and contribute to polarization. Overall, this research advances the understanding of echo chambers by moving beyond detection to explore their structural and linguistic complexities, offering new tools for measuring and addressing polarization on social media platforms.
Restricted
Deep Learning Approaches for Multivariate Time Series: Advances in Feature Selection, Classification, and Forecasting
(New Mexico State University, 2024) Alshammari, Khaznah Raghyan; Tran, Son; Hamdi, Shah Muhammad
In this work, we present the latest developments and advancements in the machine learning-based prediction and feature selection of multivariate time series (MVTS) data. MVTS data, which involves multiple interrelated time series, presents significant challenges due to its high dimensionality, complex temporal dependencies, and inter-variable relationships. These challenges are critical in domains such as space weather prediction, environmental monitoring, healthcare, sensor networks, and finance. Our research addresses these challenges by developing and implementing advanced machine-learning algorithms specifically designed for MVTS data. We introduce innovative methodologies that focus on three key areas: feature selection, classification, and forecasting. Our contributions include the development of deep learning models, such as Long Short-Term Memory (LSTM) networks and Transformer-based architectures, which are optimized to capture and model complex temporal and inter-parameter dependencies in MVTS data. Additionally, we propose a novel feature selection framework that gradually identifies the most relevant variables, enhancing model interpretability and predictive accuracy. Through extensive experimentation and validation, we demonstrate the superior performance of our approaches compared to existing methods. The results highlight the practical applicability of our solutions, providing valuable tools and insights for researchers and practitioners working with high-dimensional time series data. This work advances the state of the art in MVTS analysis, offering robust methodologies that address both theoretical and practical challenges in this field.
Restricted
Toward a Better Understanding of Accessibility Adoption: Developer Perceptions and Challenges
(University of North Texas, 2024-12) Alghamdi, Asmaa Mansour; Ludi, Stephanie
The primary aim of this dissertation is to explore the challenges developers face in interpreting and implementing accessibility in web applications. We analyze developers’ discussions on web accessibility to gain a comprehensive understanding of the challenges, misconceptions, and best practices prevalent within the development community. As part of this analysis, we built a taxonomy of accessibility aspects discussed by developers on Stack Overflow, identifying recurring trends, common obstacles, and the types of disabilities associated with the features addressed by developers in their posts. This dissertation also evaluates the extent to which developers on online platforms engage with and deliberate upon accessibility issues, assessing their awareness and implementation of accessibility standards throughout the web application development process. Given the volume and variety of these discussions, manual analysis alone would be insufficient to capture the full scope of accessibility challenges. Therefore, we employed supervised machine learning techniques to classify these posts based on their relevance to the principles of the WCAG 2.2 guidelines. By training our models on labeled data, we were able to automatically detect patterns and keywords that indicate specific accessibility issues, even when the language used by developers is not directly aligned with the official guidelines. The results emphasize developers’ struggles with complex accessibility issues, such as time-based media customization and screen reader configuration. The findings indicate that machine learning holds significant potential for enhancing compliance with accessibility standards, providing a pathway for more efficient and accurate adherence to these guidelines.
Restricted
Online conversations: A study of their toxicity
(University of Illinois Urbana-Champaign, 2024) Alkhabaz, Ridha; Sundaram, Hari
Social media platforms are essential spaces for modern human communication. There is a dire need to make these spaces more welcoming and engaging for their participants. A potential threat to this need is the propagation of toxic content in online spaces. Hence, it becomes crucial for social media platforms to detect early signs of a toxic conversation. In this work, we tackle the problem of toxicity prediction by proposing a definition for conversational structures. This definition empowers us to provide a new framework for toxicity prediction. We examine more than 1.18 million threads on X (formerly known as Twitter), made by 4.4 million users, to provide a few key insights about the current state of online conversations. Our results indicate that most X threads do not exhibit a conversational structure. Also, our newly defined structures are distributed differently from what was previously assumed about online conversations. Additionally, our definitions give models a meaningful signal for when to start predicting the future toxicity of online conversations. We also show that message-passing graph neural networks outperform state-of-the-art gradient-boosting trees for toxicity prediction. Most importantly, we find that once we observe the first two terminating conversational structures, we can predict the future toxicity of online threads with ≈88% accuracy. We hope our findings will help social media platforms better curate content in their spaces and promote more conversations in online spaces.