SACM - United States of America
Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9668
11 results
Search Results
Item Restricted
Machine Learning Techniques for Calorimeter Cluster Calibration of the CMS Particle Flow Algorithm
(Baylor University, 2025-05) Ghazwani, Noorah; Kenichi, Hatakeyam
The Electromagnetic Calorimeter (ECAL) and Hadronic Calorimeter (HCAL) are key components of the CMS detector. The ECAL is designed to measure the energies of electrons and photons, while the HCAL primarily measures the energies of charged and neutral hadrons. An algorithm called Particle Flow (PF) integrates information from various CMS sub-detectors to reconstruct and identify all particles produced in proton collisions. Photons and neutral hadrons are reconstructed using calorimeter energy deposit clusters, while the reconstruction of charged particle candidates and their separation from neutral particle candidates rely on measurements of charged particle tracks and calorimeter clusters. A proper calibration enhances particle identification and reduces the likelihood of misreconstructed energy excess. Machine learning techniques, such as Boosted Decision Trees (BDT) and Graph Neural Networks (GNN), are employed to calibrate PF energy clusters, improving both the response and the resolution of the measured energy. In this thesis, BDT is applied to calibrate PF ECAL clusters, while GNN is tested for hadronic cluster calibration.

Item Restricted
Learning Based Ethereum Phishing Detection: Evaluation, Robustness, and Improvement
(University of Central Florida, 2025) Alghuried, Ahod; Mohaisen, David
Phishing attacks continue to pose a significant threat to the Ethereum ecosystem, accounting for a major share of Ethereum-related cybercrimes. To enhance the detection of such fraudulent transactions, this dissertation develops a comprehensive framework for machine learning-based phishing detection in Ethereum transactions. The framework addresses critical aspects such as feature selection, class imbalance, model robustness, and the vulnerability of detection models to adversarial attacks.
By systematically evaluating these key practices, this work contributes to the development of more effective detection methods. The first part of the dissertation assesses the current state of phishing detection methods, identifying gaps in feature selection, dataset composition, and model optimization. We propose a systematic framework that evaluates these factors, providing a foundation for improving the overall performance and reliability of detection models. The second part explores the vulnerability of machine learning models, including Random Forest, Decision Tree, and K-Nearest Neighbors, to single-feature adversarial attacks. Through extensive experimentation, we analyze the impact of various adversarial strategies on model performance and uncover alarming weaknesses in existing models. However, the varied effects of these attacks across different algorithms present opportunities for mitigation through adversarial training and improved feature selection. Finally, the dissertation investigates how phishing detection models generalize across datasets, focusing on the role of preprocessing techniques such as feature engineering and class balancing. Our findings show that optimizing these techniques enhances model accuracy and robustness, making detection methods more adaptable to evolving threats. Overall, this work presents a comprehensive framework that addresses the critical elements of phishing detection in Ethereum transactions, offering valuable insights for the development of more robust and generalizable machine learning-based security models. The proposed framework has broad implications for improving blockchain security and advancing the field of phishing detection.

Item Restricted
Disinformation Classification Using Transformer based Machine Learning
(Howard University, 2024) Alshaqi, Mohammed Al; Rawat, Danda B
The proliferation of false information via social media has become an increasingly pressing problem.
Digital means of communication and social media platforms facilitate the rapid spread of disinformation, which calls for the development of advanced techniques for identifying incorrect information. This dissertation endeavors to devise effective multimodal techniques for identifying fraudulent news, considering the noteworthy influence that deceptive stories have on society. The study proposes and evaluates multiple approaches, starting with a transformer-based model that uses word embeddings for accurate text classification. This model significantly outperforms baseline methods such as hybrid CNN and RNN, achieving higher accuracy. The dissertation also introduces a novel BERT-powered multimodal approach to fake news detection, combining textual data with extracted text from images to improve accuracy. By leveraging the strengths of the BERT-base-uncased model for text processing and integrating it with image text extraction via OCR, this approach calculates a confidence score indicating the likelihood of news being real or fake. Rigorous training and evaluation show significant improvements in performance compared to state-of-the-art methods. Furthermore, the study explores the complexities of multimodal fake news detection, integrating text, images, and videos into a unified framework. By employing BERT for textual analysis and CNN for visual data, the multimodal approach demonstrates superior performance over traditional models in handling multiple media formats. Comprehensive evaluations using datasets such as ISOT and MediaEval 2016 confirm the robustness and adaptability of these methods in combating the spread of fake news. This dissertation contributes valuable insights to fake news detection, highlighting the effectiveness of transformer-based models, emotion-aware classifiers, and multimodal frameworks.
The findings provide robust solutions for detecting misinformation across diverse platforms and data types, offering a path forward for future research in this critical area.

Item Restricted
HYBRID MACHINE LEARNING APPROACHES FOR SOC AND RUL ESTIMATION IN BATTERY MANAGEMENT SYSTEMS
(Oakland University, 2024) Hawsawi, Tarik Abdullah; Zohdy, Mohamed
With the fast development of electric vehicles (EVs), new technologies are needed to manage batteries more efficiently, optimize performance, and extend battery life. A significant problem that must be solved is the accurate estimation of the State-of-Charge (SoC) to avoid fully discharging a battery, which shortens battery life and prolongs the time it takes to charge the battery. This dissertation introduces a new approach that uses Edge Computing and real-time predictive analytics to assess the status of EV batteries and send alerts when necessary, thus facilitating energy efficiency. The Edge Impulse platform is used to predict the Remaining Useful Life (RUL) of batteries with enhanced accuracy using EON-Tuner and DSP processing blocks, enhancing computational capability and making it feasible for edge devices. Traditional SoC estimation tools such as Kalman filters and Extended Kalman filters are effective but have a considerable drawback when estimating the SoC under changing battery parameters, so this study proposes a multi-variable optimization method. The method enhances performance prediction after key parameters are iteratively adjusted, thus resolving the emergence hypotheses of most existing techniques. The system was designed and tested on Jupyter Notebook, and performance indicators of accuracy, MSE, and efficiency further validated the design.
This study helps ensure proper energy use and long battery life for e-vehicles, which promotes clean energy use.

Item Restricted
Developing Machine Learning and Time-Series Analysis Methods with Applications in Diverse Fields
(Virginia Commonwealth University, 2024-05-06) Aljifri, Muhammed; Qian, Yanjun
This dissertation introduces methodologies that combine machine learning models with time-series analysis to tackle data analysis challenges in varied fields. The first study enhances the traditional cumulative sum control charts with machine learning models to leverage their predictive power for better detection of process shifts, applying this advanced control chart to monitor hospital readmission rates. The second project develops multi-layer models for predicting chemical concentrations from ultraviolet-visible spectroscopy data, specifically addressing the challenge of analyzing chemicals with a wide range of concentrations. The third study presents a new method for detecting multiple changepoints in autocorrelated ordinal time series, using the autoregressive ordered probit model in conjunction with a genetic algorithm. This technique is applied to the air quality index data for Los Angeles, aiming to detect significant changes in air quality over time.

Item Restricted
Comprehensive Patient-Specific Prediction Models for Diagnosis and Prognosis of Temporomandibular Joint Osteoarthritis
(Saudi Digital Library, 2023) Alturkestani, Najla; Cevidanes, Lucia
Osteoarthritis is the most common degenerative joint disease, affecting 15% of the global population. Osteoarthritis in the temporomandibular joint (TMJ OA) can cause chronic pain, facial deformity, and joint dysfunction, impacting the quality of life. Unlike weight-bearing joints, TMJ OA primarily affects individuals between the ages of 20 and 40 and can also appear in adolescents. Current standards for diagnosing TMJ OA rely on clinical and imaging criteria.
However, these criteria have limited efficacy in detecting early-stage TMJ OA, posing challenges to timely intervention and mitigation of irreversible tissue damage. Hence, it becomes imperative to identify additional objective diagnostic criteria. In addition, determining which patients are at increased risk of disease progression is critical for making informed clinical decisions and designing more effective and individualized treatments. Radiomics is a newly established field propelled by advancements in computational power. It extracts quantitative imaging features from radiological images, aiming to identify subtle tissue variations and reduce subjectivity in image interpretation. Beyond radiomics, metabolic abnormalities in joint tissues serve as early indicators of osteoarthritis. Although there has been progress in studying osteoarthritis biomarkers, they have not yet been clinically established. Evaluating multiple markers may reveal their intricate interrelations and fully harness their potential. With the advent of powerful machine learning (ML) methods, analysis of complex multisource data became feasible. Nevertheless, applying feature selection methods is crucial to eliminate redundant and irrelevant data, improving the output accuracy. Unlike knee osteoarthritis, which has been extensively studied using ML models, TMJ OA remains an underexplored area. Therefore, we aimed to 1) develop a reliable prediction tool for TMJ OA progression and identify the contributing factors during a 2–3-year follow-up period, 2) develop a comprehensive prediction tool tailored for TMJ OA diagnosis and use explainable methods to identify key factors driving diagnosis, and 3) investigate the feasibility of privileged learning in addressing missing data when diagnosing TMJ OA. We successfully developed an open-source tool which combined 18 feature selection and ML methods.
This allowed for the prediction of disease progression with accuracy=0.87, area under the ROC curve (AUC)=0.72, and F1 score=0.82. Using the interpretable SHAP analysis method, we identified the strongest predictors for TMJ OA progression. These included: clinical (headache, lower back pain, restless sleep), quantitative imaging (condyle high-grey-level-run-emphasis (HGLRE), articular fossa GL-non-uniformity, and long-run-low-GLRE, joint space), and biological markers in saliva (Osteoprotegerin, Angiogenin, VEGF, and MMP-7) and serum samples (ENA-78). Utilizing clinical, CBCT imaging, and biological data from 162 prospectively recruited subjects, we evaluated 77 ML methods. Random forest demonstrated the best diagnostic performance, achieving AUC=0.90, accuracy=0.79, precision=0.80, and F1=0.80. The integration of clinical, imaging, and biological markers enhanced TMJ OA diagnosis. The top contributing features were clinical (headache, restless sleep, mouth opening, muscle soreness), objective quantitative imaging (condyle Cluster-Prominence, HGLRE, SRHGLRE, Trabecular Thickness), and biological markers in saliva (TGFB-1, TRANCE, TIMP-1, PAI-1, VECadherin, CXCL-16) and serum (Angiogenin, PAI-1, VEGF, TRANCE, TIMP-1, BDNF, VECadherin). Lastly, we developed the KRVFL+ diagnostic tool, which can be used when only clinical and imaging data are available. It achieved an AUC, specificity, and precision of 0.81, 0.79, and 0.77, respectively.
Collectively, these efforts emphasize the immense potential of multi-source data and ML applications in presenting solutions for predicting TMJ OA progression and diagnosis, with potential implications for timely interventions and a transformative impact on TMJ OA healthcare delivery.

Item Restricted
TOWARDS ROBUST SENSOR-BASED HUMAN ACTIVITY RECOGNITION IN REAL-WORLD ENVIRONMENTS
(Saudi Digital Library, 2023-11-21) Alkhoshi, Enas; Rasheed, Khaled; Arabnia, Hamid; Maier, Frederick; Gay, Jennifer
Human Activity Recognition (HAR) using wearable sensors has become a popular research area in recent years due to its potential applications in various fields, such as healthcare, fitness, security, and smart homes. Even though numerous HAR systems are being developed, it is still challenging to create one that can accurately identify and classify human activities in actual environments. This dissertation presents methods for recognizing human activity using a single accelerometer-based system. The research explores the two pillars of making wearable sensor-based HAR systems robust and reliable: a free-living dataset that represents real-world scenarios and a user-independent system. Towards enhancing the robustness of the sensor-based HAR system, we applied deep analysis to several machine-learning techniques and models for identifying human activity using a pseudo-free-living dataset obtained from 20 participants at the University of Georgia. We found that hierarchical meta-classifiers outperformed deep learning and classical models by 6% for classifying seven activities. We classified the metabolic equivalent (METs) levels of physical activities and achieved 80% inter-subject accuracy. We introduced model personalization, and it increased the accuracy to 87% by including 50% of the participant's data. This approach is promoted since it lowers the inter-subject variability of the dataset.
We built a user-independent sensor-based human activity recognition system to explore the impact of using demographic data and anthropometric features to improve the classification of the metabolic equivalent (METs) level of physical activities based on free-living data. We used a wearable accelerometer dataset collected from 270 participants from different cities in the state of Georgia performing various physical activities. We found that including demographic data and anthropometric features in the models improves their accuracy in classifying MET levels. We built and modified a transformer's self-attention mechanism to analyze motion signals over time, which expresses individual relationships between signal levels within a time series. Furthermore, model personalization was able to reduce the dataset's inter-subject variability and raised accuracy to 94.84% by including only 30% of the participant's data in training. Achieving high performance for subject-independent systems remains challenging when using real-world data.

Item Restricted
An Effective Ensemble Learning-Based Real-Time Intrusion Detection Scheme for In-Vehicle Network
(Saudi Digital Library, 2023-11-13) Alalwany, Easa; Mahgoub, Imadeldin
Connectivity and automation have expanded with the development of autonomous vehicle technology. One of several automotive serial protocols that can be used in a wide range of vehicles is the controller area network (CAN). The growing functionality and connectivity of modern vehicles make them more vulnerable to cyberattacks aimed at vehicular networks. The CAN bus protocol is vulnerable to numerous attacks as it lacks security mechanisms by design. It is crucial to design intrusion detection systems (IDS) with high accuracy to detect attacks on the CAN bus.
In this dissertation, to address all these concerns, we design an effective machine learning-based IDS scheme for binary classification that utilizes eight supervised ML algorithms, along with ensemble classifiers, to detect normal and abnormal activities in the CAN bus. Moreover, we design an effective ensemble learning-based IDS scheme for detecting and classifying DoS, fuzzing, replay, and spoofing attacks. These are common CAN bus attacks that can threaten the safety of a vehicle's driver, passengers, and pedestrians. For this purpose, we utilize supervised machine learning in combination with ensemble methods. Ensemble learning aims to achieve better classification results through the use of different classifiers that are combined into a single classifier. Furthermore, in the pursuit of real-time attack detection and classification, we propose an IDS scheme that accurately detects and classifies CAN bus attacks in real time using ensemble techniques and the Kappa architecture. The Kappa architecture enables real-time attack detection, while ensemble learning combines multiple machine learning classifiers to enhance the accuracy of attack detection. We build this system using the most recent CAN intrusion dataset provided by the IEEE DataPort. We carried out the performance evaluation of the proposed system in terms of accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (ROC-AUC). For the binary classification, the ensemble classifiers outperformed the individual supervised ML classifiers and improved the effectiveness of the classifier. For detecting and classifying CAN bus attacks, the ensemble learning methods resulted in a robust and accurate multi-classification IDS for common CAN bus attacks. The stacking ensemble method outperformed other recently proposed methods, achieving the highest performance.
For the real-time attack detection and classification, the ensemble methods significantly enhance the accuracy of real-time CAN bus attack detection and classification. By combining the strengths of multiple models, the stacking ensemble technique outperformed individual supervised models and other ensembles.

Item Restricted
Applications Of Artificial Intelligence In Supply Chain Management In The Era Of Industry 4.0
(2023) Ali, Arishi; Krishna, Krishnan
An emerging trend in Supply Chain Management (SCM) is a shift in focus from the classical Supply Chain (SC) to the digital SC. However, decisions in the digital SC context require new tools and methodologies that consider the digitalization environment. Artificial Intelligence (AI) methodologies can provide learning, predictive, and automated decision-making capabilities in the digital environment. Among a wide range of problems in the SCM field, risk management, logistics, and transportation have received less attention from an AI perspective. The work presented in this dissertation proposes three AI-based approaches to help SCs manage their operations more effectively using creative risk monitoring and logistics/transportation solutions in the era of Industry 4.0. In the first study, a Digital Twin (DT) framework for analyzing and predicting the impact of COVID-19 disruptions on the manufacturing SC is developed to support the decision-making process in a disrupted SC. The proposed Digital SC Twin (DSCT) model is intended to work as an online control tower that monitors the behavior of the physical SC in the digital environment and guides SCM managers in making the necessary adjustments to minimize risks and maintain SC stability during disruptions. In the second study, a contactless truck-drone delivery model for last-mile problems in the SC is introduced to support logistics and transportation operations during pandemics.
A hybrid AI approach is developed to provide quality real-time solutions for the introduced truck-drone delivery system. In the third study, a collaborative Multi-Agent Deep Reinforcement Learning (MADRL) approach for vehicle routing in the SCM is designed to facilitate collaboration and communication among multiple vehicles in the SC distribution networks. Overall, the methods and models presented in this dissertation can enable SCs to transform their traditional practices, provide cost savings, support real-time decision-making, and enable self-optimization and self-healing capabilities in the age of Industry 4.0.

Item Restricted
Jointly Mining News and User-Generated Content: Machine Learning, Information and Social Network Perspective
(Temple University, 2023-05-22) Alshehri, Jumanah; Obradovic, Zoran
The number of published news articles is steadily increasing, and readers are shifting toward online platforms because of the convenience and affordable technology costs (Shearer, 2021). Users have become more engaged with online news articles. This engagement creates a rich corpus, which makes it a powerful means to understand public opinion, emerging events, and their evolution. Therefore, many organizations invest in mining this large-scale user-generated content to improve their products, services, and, more importantly, their decision-making process. Studying users’ reactions to online news is essential for social scientists, policymakers, and journalists. This type of engagement is an area of study introduced previously. In the statistical and machine learning community, many survey-based studies tried to understand the users’ behavior by characterizing and categorizing comments in online news. Some studies focus on mining user opinions from social media and online news comments. Other works look into bias in the news and its influence on user-generated content.
At the same time, the social network community addresses the problem of mining large-scale online news from different angles. Some work focuses on constructing knowledge graphs from the text. Others focus on building high-level graphs, where nodes are users and posts or documents, and links represent the relationship between nodes. Another line of work looked into the word level of the text, extracting entities and topics by combining Natural Language Processing and graph techniques. From a Machine Learning perspective, there are three main challenges in all these studies: 1) jointly mining massive user-generated data, 2) from multiple sources and platforms, and 3) the unpredictable quality of user-generated content. To address these issues, we tackle the problem of jointly learning and mining valuable information from online news articles and user-generated content. We start by studying and understanding the relationship between users’ comments and articles in online news. To understand the level of relevancy between articles and their comments, we labeled a few article-comment pairs in this work. We proposed BERTAC (Alshehri et al., 2021), a BERT-based model that jointly learns article-comment embeddings and infers the relevance class of a comment. However, we found that the disagreement among annotators as a part of a human (expert) labeling process produces noisy labels, which affect the performance of supervised learning algorithms. On the other hand, working only with high agreement annotations introduces another challenge: the data imbalance problem (Alshehri et al., 2022). As in many machine learning problems, labeling a sufficient number of examples is costly and time-consuming. Therefore, we propose a framework for aligning comments and news articles under a constrained budget (Alshehri et al., 2023a).
The proposed model accounts for data imbalance, where we have only a few examples from one class; in addition, it considers the degrees of annotator disagreement. Within the framework, we consider two solutions: 1) semi-automatic labeling based on human-AI collaboration and 2) synthetic data augmentation. Another critical aspect of mining news articles and user-generated content is understanding emerging events and their associated entities. However, this is challenging, especially with the massive growth of online articles and user-generated content across different platforms. Therefore, we proposed MultiLayerET (Alshehri et al., 2023b), a unified representation of online news articles and comments. This work highlights the relationship between entities and topics in news articles and user-generated content. It projects entities and topics as a multi-layer graph, which gives a high-level understanding of the story behind a large corpus. We showed that such graphs enrich the textual representation and enhance the model learning performance in many downstream applications, such as media bias classification and fake news detection.