Saudi Cultural Missions Theses & Dissertations
Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10
Search Results (7 results)
Item Restricted: Adversarial Machine Learning: Safeguarding AI Models from Attacks (Lancaster University, 2025-01-10)
Alammar, Ghaida; Bilal, Muhammad
The field of adversarial machine learning (AML) has gained considerable attention in recent years, with researchers seeking to identify gaps and new opportunities for growth. The goal of this report is to offer an in-depth survey of adversarial attacks and defences in machine learning by examining gaps in current algorithms and their implications for deployed systems. By exploring evasion, poisoning, extraction, and inference attacks, the report reveals the weaknesses of existing defences such as adversarial training, data sanitization, and differential privacy. These techniques often do not generalise to newer threats, raising concerns about their effectiveness in practical use. The research contributes to the field by conducting an extensive literature review of 35 articles and highlighting the need for adaptive and diverse defence strategies, as well as empirical studies that evaluate the effectiveness of AML mechanisms. Strategic suggestions include incorporating continuous training frameworks, optimising real-time monitoring processes, and improving privacy-preserving methods to safeguard confidential information. The analysis is intended to provide practical evidence for developing robust AI systems that remain resilient to a wide range of adversarial threats across vital sectors. The study examines the basic design and consequences of the various attacks, including the impact of subtle manipulation of input data on model behaviour and privacy, and addresses the emerging challenges posed by large language models (LLMs) and autonomous systems. It also emphasises the importance of robust protection against adversarial attacks in strategic domains and evaluates present-day defence mechanisms, including adversarial training, input preprocessing, and model hardening. By assessing the efficiency of these defences and identifying key areas for improvement, the dissertation provides insights into enhancing the security and reliability of AI systems and exposes the need for continual advances in data protection.
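As an illustration of the evasion attacks and adversarial-training defence surveyed in this abstract, the following minimal sketch (not taken from the dissertation; the model, data ranges, and hyperparameters are assumed placeholders) shows a fast gradient sign method (FGSM) perturbation and one adversarial-training step in PyTorch.

```python
# Minimal sketch, assuming a PyTorch classifier with inputs scaled to [0, 1].
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.03):
    """Craft an evasion example by nudging x along the sign of the loss gradient (FGSM)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach().clamp(0.0, 1.0)

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One defence step: mix clean and adversarial losses so the model learns both."""
    x_adv = fgsm_perturb(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The 50/50 weighting of clean and adversarial losses is one common choice; surveys such as this one typically compare it against purely adversarial training and input-preprocessing defences.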
Item Restricted: Automating the Formulation of Competency Questions in Ontology Engineering (University of Liverpool, 2025)
Alharbi, Reham; Tamma, Valentina; Grasso, Floriana; Payne, Terry
Ontology reuse is a fundamental aspect of ontology development, ensuring that new ontologies align with established models to facilitate seamless integration and interoperability across systems. Despite decades of research promoting ontology reuse, practical solutions for semi-automatically assessing the suitability of candidate ontologies remain limited. A key challenge is the lack of explicit requirement representations that allow for meaningful comparisons between ontologies. Competency Questions (CQs), which define functional requirements in the form of natural language questions, offer a promising means of evaluating ontology reuse potential. However, in practice, CQs are often not published alongside their ontology, making it difficult to assess whether an existing ontology aligns with new requirements and ultimately hindering reuse.

This thesis tackles the challenge of ontology reuse by introducing an automated approach to retrofitting CQs into existing ontologies. Leveraging generative AI, specifically Large Language Models (LLMs), the approach generates CQs from ontological statements, enabling the systematic extraction of functional requirements even when they were not explicitly documented. The performance of both open-source and closed-source LLMs is evaluated, with key parameters such as prompt specificity and temperature explored to control hallucinations and improve the quality of retrofitted CQs. Results indicate high recall and stability, demonstrating that CQs can be reliably retrofitted and aligned with an ontology's intended design. However, precision varies due to long-tail data effects, and potential data leakage may artificially inflate recall, necessitating further research. By enabling the reconstruction of CQs, this approach provides a foundation for assessing ontology reuse based on requirement similarity: CQ similarity can serve as an indicator of how well an existing ontology aligns with the needs of a new ontology development effort. To operationalize this idea, the thesis proposes a reuse recommendation phase within ontology development methodologies. This phase systematically identifies candidate ontologies based on requirement overlap, offering a structured approach to reuse assessment. The methodology is validated through a practical case study, demonstrating its effectiveness in real-world ontology design. By embedding an explicit reuse recommendation step in the ontology engineering process, the approach gives ontology engineers a systematic method for identifying suitable candidate ontologies, enhancing the overall design process.
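To make the retrofitting idea concrete, here is a minimal sketch (illustrative only; the prompt wording, model name, temperature value, and example axiom are assumptions, not the thesis's actual configuration) of asking an LLM to phrase an ontological statement as candidate CQs via the OpenAI chat API.

```python
# Sketch: retrofitting competency questions from an ontology statement with an LLM.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def retrofit_cqs(axiom_text: str, n_questions: int = 3, temperature: float = 0.2) -> str:
    """Ask an LLM to rewrite an ontological statement as competency questions.
    A low temperature is used here to limit hallucinated terms, mirroring the
    parameter exploration described in the abstract."""
    prompt = (
        "You are an ontology engineer. Rewrite the following ontological statement "
        f"as {n_questions} competency questions that the ontology should answer. "
        "Use only terms that appear in the statement.\n\n"
        f"Statement: {axiom_text}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        temperature=temperature,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Example with a toy axiom verbalisation
print(retrofit_cqs("Every Pizza has at least one Topping."))
```

In practice the generated questions would then be compared against the requirements of a new ontology project to support the reuse recommendation phase described above.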
Item Restricted: Evaluating Chess Moves by Analysing Sentiments in Teaching Textbooks (The University of Manchester, 2025)
Alrdahi, Haifa Saleh T; Batista-Navarro, Riza
The rules of chess are simple to comprehend, yet it is challenging to make accurate decisions in the game. Chess therefore lends itself well to the development of artificial intelligence (AI) systems that simulate real-life decision-making problems. Learning chess strategies has been widely investigated, with most studies focused on learning from previous games using search algorithms. Chess textbooks, by contrast, encapsulate grandmaster knowledge that explains playing strategies. This thesis investigates three research questions on unlocking the knowledge hidden in chess teaching textbooks. Firstly, we contribute a new heterogeneous chess dataset, "LEAP", consisting of structured data that represents the environment (the board state) and unstructured data that explains strategic moves. Additionally, we build a larger unstructured synthetic chess dataset to improve large language models' familiarity with the chess teaching context. With the LEAP dataset, we examine the characteristics of chess teaching textbooks and the challenges of using such a source for training a natural language (NL)-based chess agent, and we show empirically that the common approach of sentence-level evaluation of moves is not insightful. Secondly, we observe that chess teaching textbooks explain a move's outcome for both players and often discuss multiple moves in one sentence, which confuses models during move evaluation. To address this, we introduce an auxiliary task that evaluates individual moves at the verb-phrase level. We further show empirically the usefulness of adopting Aspect-Based Sentiment Analysis (ABSA) as a method for evaluating chess moves expressed in free text, and we develop a fine-grained annotation scheme and a small-scale dataset for the chess-ABSA domain, "ASSESS". We then fine-tune an LLM encoder model for chess ABSA and show that its move evaluations are comparable to scores obtained from the Stockfish chess engine. Thirdly, we develop an instruction-based explanation framework that uses prompt engineering with zero-shot learning to generate an explanation of a move's outcome. The framework also uses an instruction-format chess-ABSA decoder model, which shows an overall performance improvement on the ASSESS dataset. Finally, we evaluate the framework and discuss the possibilities and current challenges of generating large-scale unstructured data for chess and its effect on the chess-ABSA decoder model.

Item Restricted: Evaluating Text Summarization with Goal-Oriented Metrics: A Case Study using Large Language Models (LLMs) and Empowered GQM (University of Birmingham, 2024-09)
Altamimi, Rana; Bahsoon, Rami
This study evaluates the performance of Large Language Models (LLMs) in dialogue summarization tasks, focusing on Gemma and Flan-T5. Employing a mixed-methods approach, we utilized the SAMSum dataset and developed an enhanced Goal-Question-Metric (GQM) framework for comprehensive assessment. Our evaluation combined traditional quantitative metrics (ROUGE, BLEU) with qualitative assessments performed by GPT-4, addressing multiple dimensions of summary quality. Results revealed that Flan-T5 consistently outperformed Gemma across both quantitative and qualitative metrics. Flan-T5 excelled in lexical overlap measures (ROUGE-1: 53.03, BLEU: 13.91) and demonstrated superior performance in qualitative assessments, particularly in conciseness (81.84/100) and coherence (77.89/100). Gemma, while competent, lagged behind Flan-T5 on most metrics. The study highlights the effectiveness of Flan-T5 in dialogue summarization and underscores the importance of a multi-faceted evaluation approach when assessing LLM performance. Our findings suggest that future work should focus on enhancing both lexical fidelity and higher-level qualities such as coherence and conciseness. The study contributes to the growing body of research on LLM evaluation and offers insights for improving dialogue summarization techniques.
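As an illustration of the quantitative side of such an evaluation (a sketch only; the package choice and example texts are assumptions, not the study's actual pipeline), ROUGE and BLEU for a candidate summary can be computed as follows.

```python
# Minimal ROUGE/BLEU scoring sketch, assuming the rouge-score and sacrebleu packages.
from rouge_score import rouge_scorer
import sacrebleu

reference = "Amanda baked cookies and will bring Jerry some tomorrow."   # illustrative
candidate = "Amanda will bring some cookies to Jerry tomorrow."          # illustrative

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
rouge = scorer.score(reference, candidate)          # (target, prediction)
bleu = sacrebleu.sentence_bleu(candidate, [reference])

print(f"ROUGE-1 F1: {rouge['rouge1'].fmeasure:.2%}")
print(f"ROUGE-L F1: {rouge['rougeL'].fmeasure:.2%}")
print(f"BLEU: {bleu.score:.2f}")
```

In a GQM-style setup, scores like these would answer the lexical-overlap questions, while the qualitative dimensions (coherence, conciseness) would be rated separately, for example by a judge model.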
Item Restricted: An In-Depth Analysis of the Adoption of Large Language Models in Clinical Settings: A Fuzzy Multi-Criteria Decision-Making Approach (University of Bridgeport, 2024-08-05)
Aldwean, Abdullah; Tenney, Dan
The growing capabilities of large language models (LLMs) in the medical field hold promise for transformational change. The evolution of LLMs such as BioBERT and MedGPT has created new opportunities for enhancing the quality of healthcare services, improving clinical operational efficiency, and addressing numerous existing healthcare challenges. However, adopting these technologies in clinical settings is a complex, multifaceted decision problem influenced by factors spanning societal, technological, organizational, regulatory, and economic (STORE) perspectives. This dissertation aims to identify and rank the challenges facing the integration of LLMs into clinical settings and to evaluate different adoption solutions. To achieve this goal, a combined approach based on the Fuzzy Analytic Hierarchy Process (FAHP) and the Fuzzy Technique for Order of Preference by Similarity to Ideal Solution (Fuzzy TOPSIS) is employed to prioritize the challenges and then to rank potential LLM adoption solutions based on expert opinion. The findings indicate that regulatory concerns, such as accountability and compliance, are the most critical challenges facing the LLM adoption decision. The research provides a thorough, evidence-based assessment of LLMs in clinical settings and offers a structured framework that helps decision-makers navigate the complexities of leveraging such disruptive innovations in clinical practice.

Item Restricted: A Method for Formal Analysis and Simulation of Standard Operating Procedures (SOPs) to Meet Safety Standards (George Mason University, 2024)
Bashatah, Jomana; Sherry, Lance
Standard Operating Procedures (SOPs) are the "glue" that holds a command-and-control center together: step-by-step instructions that guide operators in controlling complex human-machine systems. While the machine is certified and the operators are licensed, SOPs are only loosely regulated. Time and cost constraints limit the testing of SOPs against the variability in human performance (i.e., SOP execution time) and in the operational environment. Additionally, SOPs mainly exist as static text documents (e.g., Word documents), hindering the ability to revise them consistently and to maintain configuration integrity. To address these limitations, this dissertation developed a framework comprising a digital SOP representation, metrics, and a simulation model to aid in creating, revising, and evaluating SOPs. A canonical structure, the extended Procedure Representation Language (e-PRL), was developed to decompose SOP steps into perceptual, cognitive, and motor elements. Methods were developed for using Large Language Models (LLMs) to generate SOP steps from owners' manuals and for classifying the text of SOP steps into e-PRL components. Techniques for the e-PRL representation were also developed, including Monte Carlo simulations to assess human performance and quantitative metrics that evaluate SOP content and training requirements. Three case studies demonstrate the applicability of the methods in the following domains: (1) aviation operational SOPs, (2) International Space Station (ISS) Habitable Airlock (HAL) SOPs, and (3) semi-autonomous vehicle SOPs. The implications of the results for each case study, along with the limitations and future work for the methods, are discussed.
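To illustrate the kind of Monte Carlo analysis described above (a sketch under assumed, illustrative timing distributions; the e-PRL element parameters here are not the dissertation's data), one can sample per-element times for each SOP step and estimate the distribution of total procedure execution time.

```python
# Monte Carlo sketch of SOP execution time, decomposing each step into
# perceptual, cognitive, and motor elements (illustrative lognormal parameters).
import numpy as np

rng = np.random.default_rng(42)

# (mean, sigma) of log-time in seconds for each e-PRL element of each step -- assumed values
steps = [
    {"perceptual": (0.5, 0.2), "cognitive": (1.0, 0.4), "motor": (0.8, 0.3)},
    {"perceptual": (0.3, 0.1), "cognitive": (1.5, 0.5), "motor": (1.2, 0.4)},
]

def simulate_total_time(n_runs: int = 10_000) -> np.ndarray:
    """Sum sampled element times across all steps for each simulated run."""
    totals = np.zeros(n_runs)
    for step in steps:
        for mu, sigma in step.values():
            totals += rng.lognormal(mean=mu, sigma=sigma, size=n_runs)
    return totals

times = simulate_total_time()
print(f"median: {np.median(times):.1f} s, 95th percentile: {np.percentile(times, 95):.1f} s")
```

Percentiles of the simulated totals can then be compared against the time windows a safety standard allows for the procedure.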
Item Restricted: Development Techniques for Large Language Models for Low Resource Languages (University of Texas at Dallas, 2023-12)
Alsarra, Sultan; Khan, Latifur
Recent advancements in Natural Language Processing (NLP), driven by large language models, have brought transformative changes to sectors that rely on extensive text-based research. This dissertation presents techniques for crafting domain-specific large language models tailored to low-resource languages, offering support to researchers engaged in large-scale text analysis. The primary focus of these models is to address the nuances of politics, conflicts, and violence in the Middle East and Latin America using domain-specific, pre-trained large language models in Arabic and Spanish. During the development of these models, we construct a range of downstream tasks, including named entity recognition, binary classification, multi-label classification, and question answering, and we lay out a roadmap for the creation of domain-specific large language models. Our core objective is to devise NLP strategies and methodologies that overcome the challenges posed by low-resource languages. This contribution extends to curating an extensive corpus of texts on regional politics and conflicts in Spanish and Arabic, thereby enriching research on large language models for low-resource languages. We assess the performance of our models against the Bidirectional Encoder Representations from Transformers (BERT) model as a baseline. Our findings establish that domain-specific pre-trained language models markedly enhance the performance of NLP models in politics and conflict analysis, in both Arabic and Spanish and across diverse downstream tasks. Consequently, our work equips researchers working on large language models for low-resource languages with potent tools, while offering political and conflict analysts, including policymakers, scholars, and practitioners, new approaches for deciphering the dynamics of local politics and conflicts directly in Arabic and Spanish.
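As a minimal illustration of the kind of downstream evaluation described in this abstract (a sketch only; the checkpoint name, toy data, and labels are placeholders rather than the dissertation's corpora or models), fine-tuning a pre-trained Arabic encoder for a binary classification task with Hugging Face Transformers might look like this.

```python
# Sketch of fine-tuning a pre-trained encoder for a binary classification downstream task.
# The checkpoint and examples are placeholders; any domain-specific Arabic or Spanish
# model and politics/conflict corpus could be substituted.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import Dataset

model_name = "aubmindlab/bert-base-arabertv2"   # assumed baseline checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy, illustrative examples (label 1 = conflict-related, 0 = not)
data = Dataset.from_dict({
    "text": ["نص عن نزاع سياسي", "نص عن الطقس"],
    "label": [1, 0],
}).map(lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length", max_length=64))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2, report_to="none"),
    train_dataset=data,
)
trainer.train()
```

The same loop, pointed at a general-domain checkpoint, would give the BERT baseline comparison mentioned in the abstract.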