SACM - United States of America

Permanent URI for this collectionhttps://drepo.sdl.edu.sa/handle/20.500.14154/9668

Browse

Search Results

Now showing 1 - 10 of 21

Restricted
MULTIDIMENSIONAL APPROACHES IN BUG DETECTION FOR PARALLEL PROGRAMMING AND TEXT-TO-CODE SEMANTIC PARSING
(University of Central Florida, 2025) Alsofyani, May; Wang Liqiang
This dissertation applies deep learning and large language models to two domains: parallel programming fault detection and text-to-code translation, aiming to enhance software reliability and natural language-driven code generation. Due to their unpredictable nature, concurrency bugs-particularly data race bugs— present significant challenges in fault detection for parallel programming. We investigate deep learning and LLM-based approaches for detecting data race bugs in OpenMP programs. Our proposed methods include a transformer encoder and GPT-4 through prompt engineering and fine-tuning. Experimental results demonstrate that the transformer encoder achieves competitive accuracy compared to LLMs, highlighting its effectiveness in understanding complex OpenMP directives. Expanding this research, we explore the role of LLMs in detecting faults in Pthreads, which requires a deep understanding of thread-based logic and synchronization mechanisms. We analyze ChatGPT's effectiveness in Pthreads fault detection through dialogue-based interactions and advanced prompt engineering techniques, including Zero-Shot, Few-Shot, Chain-of-Thought, and Retrieval-Augmented Generation. Additionally, we introduce three hybrid prompting techniques—Chain-of-Thought with Few-Shot Prompting, Retrieval-Augmented Generation with Few-Shot Prompting, and Prompt Chaining with Few-Shot Prompting—to enhance fault detection performance. In the semantic parsing domain, our research bridges the gap between natural language and executable code, focusing on text-to-SQL translation. To address SQL's limitations in statistical analysis, we introduce SIGMA, a dataset for text-to-code semantic parsing with statistical analysis capabilities. In addition, we address the gap in cross-domain context-dependent text-to-SQL translation for the Arabic language. While prior research has focused on English and Chinese datasets, no efforts have been made to explore Arabic cross-domain conversational querying. We introduce Ar-SParC, the first Arabic cross-domain, context-dependent text-to-SQL dataset. This dissertation contributes to fault detection in parallel programming and semantic parsing with statistical analysis, leveraging cutting-edge deep learning and LLMs techniques. Our findings advance bug detection in high-performance computing and natural language-based code generation, significantly improving software reliability and accessibility.
30 0
Restricted
ADAPTIVE SELF-LEARNING AND MULTI-STAGE MODELING FOR EFFICIENT MEDICAL AND DENTAL IMAGE SEGMENTATION
(University of Missouir - Kansas City, 2025) Alqarni, Saeed; Yugyung, Lee
Medical imaging has revolutionized healthcare by enabling non-invasive visualization of anatomical structures and pathologies, significantly improving diagnostic accuracy, treatment planning, and patient monitoring. Modalities like computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound provide critical insights into the human body, yet precise medical image segmentation remains a challenging task. This difficulty arises from factors such as image variability, noise, artifacts, and the limited availability of annotated data necessary to train robust segmentation models. Overcoming these hurdles is essential to unlock the full potential of medical imaging in diverse clinical applications. This dissertation presents a novel framework for efficient and accurate medical image segmentation, incorporating multi-stage transfer learning, uncertainty-driven data selection, and weakly supervised learning. By combining human-guided refinement with adaptive data selection, this research addresses fundamental barriers such as data scarcity, computational resource limitations, and the high cost of annotation. The framework is structured around three key objectives: 1. Adaptive Uncertainty Sampling with SAM (AUSAM), which introduces a flexible, real-time data selection and segmentation approach, reducing reliance on large annotated datasets through dynamic thresholds and DBSCAN clustering. 2. AUSAM-SL - Active Self-Learning with SAM, which integrates entropy-based active learning with iterative self-labeling, supported by SAM for initial training, refining the selection criteria, and enhancing model predictions. 3. AUSAM-3D- 3D Modeling for Domain-Aware Segmentation and Aggregation, which builds upon AUSAM by incorporating a spatial and volumetric dimension, improving segmentation accuracy for organs and tumors, and enabling more clinically relevant outcomes. Preliminary results on medical and dental imaging datasets (MRI, CT, X-ray) validate the effectiveness of the proposed framework in improving segmentation accuracy while maintaining computational efficiency. The research offers scalable solutions suitable for resource-constrained environments by integrating human feedback with semisupervised and weakly supervised learning techniques. This work advances the field of medical and dental image segmentation and provides practical methods for leveraging multi-stage learning in real-world applications where data and computational resources are limited.
27 0
Restricted
DEEP LEARNING-ASSISTED EPILEPSY DETECTION AND PREDICTION
(Florida Atlantic University, 2024) Saem Aldahr, Raghdah; Ilyas, Mohammad
Epilepsy is a multifaceted neurological disorder characterized by superfluous and recurrent seizure activity. Electroencephalogram (EEG) signals are indispensable tools for epilepsy diagnosis that reflect real-time insights of brain activity. Recently, epilepsy researchers have increasingly utilized Deep Learning (DL) architectures for early and timely diagnosis. This research focuses on resolving the challenges, such as data diversity, scarcity, limited labels, and privacy, by proposing potential contributions for epilepsy detection, prediction, and forecasting tasks without impacting the accuracy of the outcome. The proposed design of diversity-enhanced data augmentation initially averts data scarcity and inter-patient variability constraints for multiclass epilepsy detection. The potential features are extracted using a graph theory-based approach by analyzing the inherently dynamic characteristics of augmented EEG data. It utilizes a novel temporal weight fluctuation method to recognize the drastic temporal fluctuations and data patterns realized in EEG signals. Designing the Siamese neural network-based few-shot learning strategy offers a robust framework for multiclass epilepsy detection. Subsequently, Federated Learning (FL) architecture enables epileptic seizure prediction and enhances the generalization capability by utilizing numerous seizure patterns across diversified and globally distributed epileptic patients. By capturing the potential patterns, the hybrid model design potentially offers superior prediction accuracy by integrating a spiking encoder with graph convolutional neural networks. The preictal probability of each local model then aggregates the weights of the local medical centers with the global FL. Furthermore, applying the adaptive neuro-fuzzy inference system ensures a patient-specific preictal probability by combining the local model with patientspecific clinical features. Finally, epileptic seizure forecasting utilizes Self-Supervised Learning (SSL) capabilities to overcome the limitations of annotated EEG data. This selfsupervised transfer learning improves the training efficiency in massively arriving EEG data streams. The dual-feature embedding enhances the learning ability while a lightweight prediction utilizes the embeddings from the pretext task for epilepsy forecasting in the downstream task. The performance testing on the benchmark datasets reveals the accuracy of epilepsy detection, prediction, and forecasting by addressing the limitations of the existing approaches for effective patient management. The research outcomes ultimately enable real-time, transparent, and personalized solutions to ensure commitment towards the quality of life.
21 0
Restricted
Leveraging Deep Learning for Change Detection in Bi-Temporal Remote Sensing Imagery
(University of Missouri-Columbia, 2024) Alshehri, Mariam; Hurt, J. Alex
Deforestation in the Brazilian Amazon poses significant threats to global climate stability, biodiversity, and local communities. This dissertation presents advanced deep learning approaches to improve deforestation detection using bi-temporal Sentinel-2 satellite imagery. We developed a specialized dataset capturing deforestation events between 2020 and 2021 in key conservation units of the Amazon. We first adapted transformer-based change detection models to the deforestation context, leveraging attention mechanisms to analyze spatial and temporal patterns. While these models showed high accuracy, limitations remained in effectively capturing subtle environmental changes. To address this, we introduce DeforestNet, a novel deep learning framework that integrates advanced semantic segmentation encoders within a siamese architecture. DeforestNet employs cross-temporal interaction mechanisms and temporal fusion strategies to enhance the discrimination of true deforestation events from background noise. Experimental results demonstrate that DeforestNet outperforms existing models, achieving higher precision, recall, and F1-scores in deforestation detection. Additionally, it generalizes well to other change detection tasks, as evidenced by its performance on the LEVIR-CD urban building change detection dataset. This research contributes a robust and efficient framework for accurate change detection in remote sensing imagery, offering valuable tools for environmental monitoring and aiding global efforts in sustainable forest management and conservation.
28 0
Restricted
Early Detection of Pleuropulmonary Blastoma Using Transformers Models
(Bowie State University, 2024) Almenwer, Sahar; El-Sayed, Hoda
Childhood cancer is the second leading cause of death among children under the age of fifteen, according to the American Cancer Society. The number of diagnosed cancer cases in children continues to rise each year, leading to many tragic fatalities. One specific type of cancer, pleuropulmonary blastoma (PPB), affects children from newborns to those as old as six years. The most common way to diagnose PPB is through imaging; this method is quick, cost-effective, and does not require specialized equipment or laboratory tests. However, relying solely on imaging for early detection of PPB can be challenging because of lower accuracy and sensitivity. It is time consuming and susceptible to errors because of the numerous potential differential diagnoses. A more accurate diagnosis of PPB depends on identifying mutations in the DICER1 gene. Recent advancements in biological analysis and computer learning are transforming cancer treatment. Deep learning (DL) methods for diagnosing PPB are becoming increasingly popular. Despite facing some challenges, DL shows a significant promise in supporting oncologists. However, some advanced models possess a limited local receptive field, which may restrict their ability to comprehend the overall context. This research employs the vision transformer (ViT) model to address these limitations. ViT reduces computation time and yields better results than existing models. It utilizes self-attention among image patches to process visual information effectively. The experiments in this study are conducted using two types of datasets, medical images and genomic datasets, employing two different methodologies. One approach uses the ViT model combined with an explainability framework on large medical image datasets with various modalities. The other involves developing a new hybrid model that integrates the vision transformer with bidirectional long short-term memory (ViT-BiLSTM) for genomic datasets. The results demonstrate that the ViT model and the new hybrid model, ViT-BiLSTM, significantly outperform established models, as validated by multiple performance metrics. Consequently, this research holds great promise for the early diagnosis of PPB, reducing misdiagnosis occurrences, and facilitating timely intervention and treatment. These findings could revolutionize medical diagnosis and shape the future of healthcare.
11 0
Restricted
Next-Generation Diagnostics: Deep Learning based Approaches for Medical Image Analysis
(Florida Institute of Technology, 2024-12) Alsubaie, Mohammed; Li, Xianqi
High-resolution medical imaging plays a pivotal role in accurate diagnostics and effective patient care. However, the extended acquisition times required for detailed imaging often lead to patient discomfort, motion artifacts, and increased scan failures. To address these challenges, advanced deep learning approaches are emerging as transformative tools in medical imaging. In this study, we propose a conditional denoising diffusion model-based framework designed to enhance the resolution and reconstruction quality of medical images, including Magnetic Resonance Imaging (MRI) and Magnetic Resonance Spectroscopic Imaging (MRSI). The framework incorporates a data fidelity term into the reverse sampling process to ensure consistency with physical acquisition models while improving reconstruction accuracy. Furthermore, it leverages a Self-Attention UNet architecture to upsample low-resolution MRSI data, preserving fine-grained details and critical structural information essential for clinical diagnostics. The proposed model demonstrates adaptability across varying undersampling rates and spatial resolutions, as a network trained on acceleration factor 8 generalizes effectively to other acceleration factors. Evaluations on publicly available fastMRI datasets and MRSI data highlight significant improvements over state-of-the-art methods, achieving superior metrics in SSIM, PSNR, and LPIPS while maintaining diagnostic relevance. Notably, the diffusion model excels in preserving intricate structural details, detecting small tumors, and maintaining texture integrity, particularly in glioma imaging for mapping tumor metabolism associated with IDH1 and IDH2 mutations. These findings underscore the potential of deep learning-based diffusion models to revolutionize medical imaging, enabling faster, more accurate scans and improving diagnostic workflows across clinical and research applications.
19 0
Restricted
Deep Learning Approaches for Multivariate Time Series: Advances in Feature Selection, Classification, and Forecasting
(New Mexico State University, 2024) Alshammari, Khaznah Raghyan; Tran, Son; Hamdi, Shah Muhammad
In this work, we present the latest developments and advancements in the machine learning-based prediction and feature selection of multivariate time series (MVTS) data. MVTS data, which involves multiple interrelated time series, presents significant challenges due to its high dimensionality, complex temporal dependencies, and inter-variable relationships. These challenges are critical in domains such as space weather prediction, environmental monitoring, healthcare, sensor networks, and finance. Our research addresses these challenges by developing and implementing advanced machine-learning algorithms specifically designed for MVTS data. We introduce innovative methodologies that focus on three key areas: feature selection, classification, and forecasting. Our contributions include the development of deep learning models, such as Long Short-Term Memory (LSTM) networks and Transformer-based architectures, which are optimized to capture and model complex temporal and inter-parameter dependencies in MVTS data. Additionally, we propose a novel feature selection framework that gradually identifies the most relevant variables, enhancing model interpretability and predictive accuracy. Through extensive experimentation and validation, we demonstrate the superior performance of our approaches compared to existing methods. The results highlight the practical applicability of our solutions, providing valuable tools and insights for researchers and practitioners working with high-dimensional time series data. This work advances the state of the art in MVTS analysis, offering robust methodologies that address both theoretical and practical challenges in this field.
42 0
Restricted
TOWARDS ROBUST AND ACCURATE TEXT-TO-CODE GENERATION
(University of Central Florida, 2024) almohaimeed, saleh; Wang, Liqiang
Databases play a vital role in today’s digital landscape, enabling effective data storage, manage- ment, and retrieval for businesses and other organizations. However, interacting with databases often requires knowledge of query (e.g., SQL) and analysis, which can be a barrier for many users. In natural language processing, the text-to-code task, which converts natural language text into query and analysis code, bridges this gap by allowing users to access and manipulate data using everyday language. This dissertation investigates different challenges in text-to-code (including text-to-SQL as a subtask), with a focus on four primary contributions to the field. As a solution to the lack of statistical analysis in current text-to-code tasks, we introduce SIGMA, a text-to- Code dataset with statistical analysis, featuring 6000 questions with Python code labels. Baseline models show promising results, indicating that our new task can support both statistical analysis and SQL queries simultaneously. Second, we present Ar-Spider, the first Arabic cross-domain text-to-SQL dataset that addresses multilingual limitations. We have conducted experiments with LGESQL and S2SQL models, enhanced by our Context Similarity Relationship (CSR) approach, which demonstrates competitive performance, reducing the performance gap between the Arabic and English text-to-SQL datasets. Third, we address context-dependent text-to-SQL task, often overlooked by current models. The SParC dataset was explored by utilizing different question rep- resentations and in-context learning prompt engineering techniques. Then, we propose GAT-SQL, an advanced prompt engineering approach that improves both zero-shot and in-context learning experiments. GAT-SQL sets new benchmarks in both SParC and CoSQL datasets. Finally, we introduce Ar-SParC, a context-dependent Arabic text-to-SQL dataset that enables users to interact with the model through a series of interrelated questions. In total, 40 experiments were conducted to investigate this dataset using various prompt engineering techniques, and a novel technique called GAT Corrector was developed, which significantly improved the performance of all base- line models.
34 0
Restricted
ADAPTIVE INTRUSION DETECTION SYSTEM FOR THE INTERNET OF MEDICAL THINGS (IOMT): ENHANCING SECURITY THROUGH IMPROVED MUTUAL INFORMATION FEATURE SELECTION AND META-LEARNING
(Towson University, 2024-12) Alalhareth, Mousa; Hong, Sungchul
The Internet of Medical Things (IoMT) has revolutionized healthcare by enabling continuous patient monitoring and diagnostics but also introduces significant cybersecurity risks. IoMT devices are vulnerable to cyber-attacks that threaten patient data and safety. To address these challenges, Intrusion Detection Systems (IDS) using machine learning algorithms have been introduced. However, the high data dimensionality in IoMT environments often leads to overfitting and reduced detection accuracy. This dissertation presents several methodologies to enhance IDS performance in IoMT. First, the Logistic Redundancy Coefficient Gradual Upweighting Mutual Information Feature Selection (LRGU-MIFS) method is introduced to balance the trade-off between relevance and redundancy, while improving redundancy estimation in cases of data sparsity. This method achieves 95% accuracy, surpassing the 92% reported in related studies. Second, a fuzzy-based self-tuning Long Short-Term Memory (LSTM) IDS model is proposed, which dynamically adjusts training epochs and uses early stopping to prevent overfitting and underfitting. This model achieves 97% accuracy, a 10% false positive rate, and a 94% detection rate, outperforming prior models that reported 95% accuracy, a 12% false positive rate, and a 93% detection rate. Finally, a performance-driven meta-learning technique for ensemble learning is introduced. This technique dynamically adjusts classifier voting weights based on factors such as accuracy, loss, and prediction confidence levels. As a result, this method achieves 98% accuracy, a 97% detection rate, and a 99% F1 score, while reducing the false positive rate to 10%, surpassing previous results of 97% accuracy, a 93% detection rate, a 97% F1 score, and an 11% false positive rate. These contributions significantly enhance IDS effectiveness in IoMT, providing stronger protection for sensitive medical data and improving the security and reliability of healthcare networks.
25 0
Restricted
DEEP LEARNING FOR MOLECULAR DESIGN: MODELS, FRAMEWORKS, AND APPLICATIONS
(Cornell University, 2024-08) Alshehri, Abdulelah Saeed; You, Fengqi; Gomes, Carla; Abbott, Nicholas L.
The vast and complex landscape of chemical space has traditionally been explored through a combination of experimentation and knowledge-based computational approaches. However, the limitations of these methods have hindered the efficient design of molecules with desired properties. The advent of deep learning, coupled with the availability of big chemical data, presents transformative opportunities for computational molecular design. This dissertation explores the convergence of deep learning and chemical engineering, presenting novel methodologies and frameworks to address challenges in molecular property prediction, molecular design, chemical data extraction, molecular conformation generation, and peptide design. In Chapter 2, we develop parallel models for the estimation of 25 pure component properties across over 24,000 chemicals, employing both traditional regression and machine learning methods on functional group representations. These models demonstrate robust accuracy in predicting a broad range of physicochemical properties, enabling streamlined product and process design. Chapter 3 addresses the inherent uncertainty in CMD by introducing DRL-CMD, an uncertainty-aware deep reinforcement learning framework. By explicitly quantifying and managing uncertainties, DRL-CMD reduces constraint violations by 39% and uncertainty margins by 27% compared to literature-reported molecules, particularly in complex design scenarios with limited data and extreme property ranges. This approach offers a more reliable path to molecules with tailored properties toward accelerating product and process design. In Chapter 4, the focus is on the extraction of chemical data from scientific literature, critical for model training and discovery. ChemREL, a novel deep learning pipeline, achieves an F1-score of 95.4% for property extraction, outperforming existing methods and GPT-4. Its transferability is demonstrated by successful adaptation from melting point extraction to LD50 extraction with minimal additional training, highlighting the potential to accelerate the construction of large-scale chemical datasets. In Chapter 5, we explore the utilization of abundant 2D molecular graph data to enhance 3D conformer generation, a crucial step in drug discovery. By pretraining graph neural networks on 2D data and improving the GeoMol method, we achieve a 7.7% average improvement in generated conformer quality compared to state-of-the-art sequential methods, improving the accuracy and efficiency of molecular modeling. Chapter 6 addresses the global challenge of plastic pollution by presenting an integrated framework combining biophysics-based insights, evidential deep learning, and metaheuristic search for the design of plastic-binding peptides. This approach leads to significant increases in binding free energies for polypropylene (18%) and polystyrene (34%) compared to previous designs, offering a promising bio-inspired solution for plastic remediation. By developing these novel deep learning approaches, the resulting advances improve predicting molecular properties, designing molecules with tailored properties while managing uncertainties, constructing a versatile pipeline for chemical data extraction, enhancing the quality of 3D conformer generation, and generating high-affinity plastic-binding peptides for potential environmental remediation. These works signify a step forward in the integration of deep learning and chemical engineering, paving the way for accelerated discovery and innovation in the field.
33 0

SACM - United States of America

Browse

Filters

Settings

Sort By

Results per page

Search Results