SACM - United Kingdom

Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9667


Search Results

Now showing 1 - 4 of 4
  • Item (Restricted)
    Adversarial Machine Learning: Safeguarding AI Models from Attacks
    (Lancaster University, 2025-01-10) Alammar, Ghaida; Bilal, Muhammad
    The field of adversarial machine learning (AML) has gained considerable attention over the years, with researchers seeking to explore gaps and new opportunities for growth. The goal of this report is to offer an in-depth survey of adversarial attacks and defences in machine learning by examining existing gaps in current algorithms and understanding their implications for deployed systems. By exploring evasion, poisoning, extraction, and inference attacks, the report reveals the weaknesses of existing methodologies such as adversarial training, data sanitization, and differential privacy. These techniques often fail to adapt to newer threats, raising concerns about their effectiveness in practical use. The research contributes to the field by conducting an extensive literature review of 35 articles and highlighting the need for adaptive and diverse defence strategies, as well as empirical studies that evaluate the effectiveness of AML mechanisms. Strategic suggestions include incorporating continuous training frameworks, optimising real-time monitoring processes, and improving privacy-preserving methods to safeguard confidential information. This analysis is intended to offer practical insight that fosters the development of AML and supports the design of robust AI systems able to withstand various kinds of adversarial threats across numerous vital sectors. The study examines the basic design and consequences of various attacks, as well as the impact of subtle manipulation of input data on model behaviour and privacy. The report further addresses the modern challenges posed by large language models (LLMs) and autonomous systems, and emphasises the significance of robust protection against adversarial attacks in strategic areas. It additionally evaluates present-day protection mechanisms, including adversarial training, input preprocessing, and techniques for making models stronger and more reliable. By assessing the effectiveness of these defences and identifying key areas for improvement, the dissertation provides valuable insights into enhancing the security and reliability of machine learning systems. The review of these attacks and defences exposes the need for continual advancement in data protection across systems.
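To make the attack families surveyed in the item above concrete, here is a minimal illustrative sketch (not drawn from the thesis) of an evasion attack using the fast gradient sign method, assuming a trained PyTorch classifier `model`, an input batch `x`, and labels `y`; the closing comment indicates how adversarial training, in its simplest form, folds such perturbed examples back into the training loss.

```python
# Hypothetical example: a basic FGSM evasion attack on a PyTorch classifier.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Return inputs perturbed within an epsilon ball to increase the loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the direction of the loss gradient's sign, bounded by epsilon.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0, 1).detach()

# Adversarial training, at its simplest, mixes such examples into each batch:
# loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(fgsm_attack(model, x, y)), y)
```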
  • Item (Restricted)
    Automating the Formulation of Competency Questions in Ontology Engineering
    (University of Liverpool, 2025) Alharbi, Reham; Tamma, Valentina; Grasso, Floriana; Payne, Terry
    Ontology reuse is a fundamental aspect of ontology development, ensuring that new ontologies align with established models to facilitate seamless integration and interoperability across systems. Despite decades of research promoting ontology reuse, practical solutions for semi-automatically assessing the suitability of candidate ontologies remain limited. A key challenge is the lack of explicit requirement representations that allow for meaningful comparisons between ontologies. Competency Questions (CQs), which define functional requirements in the form of natural language questions, offer a promising means of evaluating ontology reuse potential. However, in practice, CQs are often not published alongside their ontology, making it difficult to assess whether an existing ontology aligns with new requirements, ultimately hindering reuse. This thesis tackles the challenge of ontology reuse by introducing an automated approach to retrofitting CQs into existing ontologies. Leveraging Generative AI, specifically Large Language Models (LLMs), this approach generates CQs from ontological statements, enabling the systematic extraction of functional requirements even when they were not explicitly documented. The performance of both open-source and closed-source LLMs is evaluated, with key parameters such as prompt specificity and temperature explored to control hallucinations and improve the quality of retrofitted CQs. Results indicate high recall and stability, demonstrating that CQs can be reliably retrofitted and aligned with an ontology’s intended design. However, precision varies due to long-tail data effects, and potential data leakage may artificially inflate recall, necessitating further research. By enabling the reconstruction of CQs, this approach provides a foundation for assessing ontology reuse based on requirement similarity. Specifically, CQ similarity can serve as an indicator of how well an existing ontology aligns with the needs of a new ontology development effort. To operationalize this idea, this thesis proposes a reuse recommendation phase within ontology development methodologies. This phase systematically identifies candidate ontologies based on requirement overlap, offering a structured approach to reuse assessment. The methodology is validated through a practical case study, demonstrating its effectiveness in real-world ontology design. By embedding an explicit reuse recommendation step in the ontology engineering process, this approach provides ontology engineers with a systematic method to identify suitable candidate ontologies, enhancing the overall design process.
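As a rough illustration of retrofitting a Competency Question with an LLM (a sketch only, not the authors' pipeline), the snippet below sends a single hypothetical ontological statement to a chat-completion endpoint at low temperature, the parameter the thesis highlights for controlling hallucination; the model name, prompt wording, and example axiom are placeholder assumptions.

```python
# Hypothetical sketch: generating one Competency Question from an ontology axiom.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

axiom = "Pizza hasTopping some CheeseTopping"  # placeholder ontological statement
prompt = (
    "You are an ontology engineer. Write one Competency Question, phrased as a "
    "natural-language question the ontology should answer, for this statement: "
    + axiom
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; the thesis compares open- and closed-source LLMs
    messages=[{"role": "user", "content": prompt}],
    temperature=0.2,      # low temperature to curb hallucinated CQs
)
print(response.choices[0].message.content)
```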
  • Item (Restricted)
    Evaluating Chess Moves by Analysing Sentiments in Teaching Textbooks
    (The University of Manchester, 2025) Alrdahi, Haifa Saleh T; Batista-Navarro, Riza
    The rules of playing chess are simple to comprehend, and yet it is challenging to make accurate decisions in the game. Hence, chess lends itself well to the development of an artificial intelligence (AI) system that simulates real-life problems, such as decision-making processes. Learning chess strategies has been widely investigated, with most studies focused on learning from previous games using search algorithms. Chess textbooks encapsulate grandmaster knowledge that explains playing strategies. This thesis investigates three research questions on the possibility of unlocking the hidden knowledge in chess teaching textbooks. Firstly, we contribute to the chess domain a new heterogeneous chess dataset, “LEAP”, which consists of structured data representing the environment (the board state) and unstructured data representing explanations of strategic moves. Additionally, we build a larger unstructured synthetic chess dataset to improve large language models’ familiarity with the chess teaching context. With the LEAP dataset, we examine the characteristics of chess teaching textbooks and the challenges of using such a data source to train a Natural Language (NL)-based chess agent. We show through empirical experiments that the common approach of sentence-level evaluation of moves is not insightful. Secondly, we observe that chess teaching textbooks focus on explaining a move’s outcome for both players and often discuss multiple moves in one sentence, which confuses models during move evaluation. To address this, we introduce an auxiliary task that evaluates individual moves at the verb-phrase level. Furthermore, we show through empirical experiments the usefulness of adopting Aspect-Based Sentiment Analysis (ABSA) as a method for evaluating chess moves expressed in free text. To this end, we develop a fine-grained annotation scheme and a small-scale dataset for the chess-ABSA domain, “ASSESS”. We then examine the performance of a fine-tuned LLM encoder model for chess-ABSA and show that its move evaluations are comparable to scores obtained from a chess engine, Stockfish. Thirdly, we develop an instruction-based explanation framework that uses prompt engineering with zero-shot learning to generate explanation text for a move’s outcome. The framework also uses a chess-ABSA decoder model that follows an instruction format, and its evaluation on the ASSESS dataset shows an overall performance improvement. Finally, we evaluate the performance of the framework and discuss the possibilities and current challenges of generating large-scale unstructured data for chess, and the effect on the chess-ABSA decoder model.
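To illustrate what an aspect-based view of a commentary sentence could look like (a hypothetical example, not the ASSESS annotation scheme itself), the snippet below treats each move mentioned in a sentence as its own aspect with its own sentiment, so that a sentence discussing two moves no longer collapses into a single confusing label; the sentence and labels are invented.

```python
# Hypothetical example: aspect-based sentiment labels for a chess commentary sentence.
sentence = "14. Bxf7+ wins a pawn, but 14...Kxf7 leaves White's attack overextended."

absa_annotation = [
    {"aspect": "14. Bxf7+", "sentiment": "positive"},   # the move gains material
    {"aspect": "14...Kxf7", "sentiment": "negative"},   # the reply turns out badly for the attacker
]

for label in absa_annotation:
    print(f"{label['aspect']}: {label['sentiment']}")
```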
  • Item (Restricted)
    Evaluating Text Summarization with Goal-Oriented Metrics: A Case Study using Large Language Models (LLMs) and Empowered GQM
    (University of Birmingham, 2024-09) Altamimi, Rana; Bahsoon, Rami
    This study evaluates the performance of Large Language Models (LLMs) in dialogue summarization tasks, focusing on Gemma and Flan-T5. Employing a mixed-methods approach, we utilized the SAMSum dataset and developed an enhanced Goal-Question-Metric (GQM) framework for comprehensive assessment. Our evaluation combined traditional quantitative metrics (ROUGE, BLEU) with qualitative assessments performed by GPT-4, addressing multiple dimensions of summary quality. Results revealed that Flan-T5 consistently outperformed Gemma across both quantitative and qualitative metrics. Flan-T5 excelled in lexical overlap measures (ROUGE-1: 53.03, BLEU: 13.91) and demonstrated superior performance in qualitative assessments, particularly in conciseness (81.84/100) and coherence (77.89/100). Gemma, while showing competence, lagged behind Flan-T5 in most metrics. This study highlights the effectiveness of Flan-T5 in dialogue summarization tasks and underscores the importance of a multi-faceted evaluation approach in assessing LLM performance. Our findings suggest that future developments in this field should focus on enhancing lexical fidelity and higher-level qualities such as coherence and conciseness. This study contributes to the growing body of research on LLM evaluation and offers insights for improving dialogue summarization techniques.
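For context on the quantitative side of the evaluation described in the item above, the following minimal sketch (illustrative only, not the dissertation's pipeline) computes ROUGE-1 and a smoothed sentence-level BLEU for a single summary pair using the rouge-score and NLTK packages; the reference and candidate texts are short invented examples in the style of SAMSum summaries.

```python
# Illustrative sketch: ROUGE-1 and BLEU for one candidate summary.
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "Amanda baked cookies and will bring Jerry some tomorrow."
candidate = "Amanda made cookies and will bring some to Jerry tomorrow."

# ROUGE-1 F-measure (unigram overlap between reference and candidate).
rouge = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)
rouge1 = rouge.score(reference, candidate)["rouge1"].fmeasure

# Sentence-level BLEU with smoothing to avoid zero scores on short texts.
bleu = sentence_bleu(
    [reference.split()], candidate.split(),
    smoothing_function=SmoothingFunction().method1,
)

print(f"ROUGE-1 F1: {rouge1:.2f}  BLEU: {bleu:.2f}")
```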

Copyright owned by the Saudi Digital Library (SDL) © 2025