Saudi Cultural Missions Theses & Dissertations

Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10

Search Results

Now showing 1 - 2 of 2
  • Item Restricted
    Evaluating Chess Moves by Analysing Sentiments in Teaching Textbooks
    (The University of Manchester, 2025) Alrdahi, Haifa Saleh T; Batista-Navarro, Riza
    The rules of chess are simple to comprehend, yet making accurate decisions in the game is challenging. Hence, chess lends itself well to the development of artificial intelligence (AI) systems that simulate real-life problems, such as decision-making processes. Learning chess strategies has been widely investigated, with most studies focusing on learning from previous games using search algorithms. Chess textbooks encapsulate grandmaster knowledge and explain playing strategies. This thesis investigates three research questions on the possibility of unlocking the hidden knowledge in chess teaching textbooks. Firstly, we contribute to the chess domain a new heterogeneous chess dataset, “LEAP”, which consists of structured data representing the environment (the board state) and unstructured data representing explanations of strategic moves. Additionally, we build a larger unstructured synthetic chess dataset to improve large language models’ familiarity with the chess teaching context. With the LEAP dataset, we examine the characteristics of chess teaching textbooks and the challenges of using such a data source for training a natural language (NL)-based chess agent. We show through empirical experiments that the common approach of sentence-level evaluation of moves is not insightful. Secondly, we observe that chess teaching textbooks focus on explaining a move’s outcome for both players and often discuss multiple moves in one sentence, which confuses the models in move evaluation. To address this, we introduce an auxiliary task that evaluates individual moves at the verb-phrase level. Furthermore, we show through empirical experiments the usefulness of adopting the Aspect-based Sentiment Analysis (ABSA) approach as a method for evaluating chess moves expressed in free text. With this, we develop a fine-grained annotation and a small-scale dataset for the chess-ABSA domain, “ASSESS”. Finally, we examine the performance of a fine-tuned LLM encoder model for chess-ABSA and show that the model’s performance in evaluating chess moves is comparable to scores obtained from a chess engine, Stockfish. Thirdly, we develop an instruction-based explanation framework that uses prompt engineering with zero-shot learning to generate an explanation text of the move outcome. The framework also uses a chess-ABSA decoder model with an instruction format; we evaluate its performance on the ASSESS dataset, which shows an overall improvement in performance. Finally, we evaluate the performance of the framework and discuss the possibilities and current challenges of generating large-scale unstructured data for chess, and the effect on the chess-ABSA decoder model.
  • Item Restricted
    EXPLORING LANGUAGE MODELS AND QUESTION ANSWERING IN BIOMEDICAL AND ARABIC DOMAINS
    (University of Delaware, 2024-05-10) Alrowili, Sultan; Shanker, K. Vijay
    Despite the success of the Transformer model and its variants (e.g., BERT, ALBERT, ELECTRA, T5) in addressing NLP tasks, similar success has not been achieved when these models are applied to specific domains (e.g., biomedical) and limited-resource languages (e.g., Arabic). This research addresses some of the challenges in applying Transformer models to specialized domains and to languages that lack language processing resources. One reason for reduced performance in limited domains may be the lack of quality contextual representations. We address this issue by adapting different types of language models and introducing five BioM-Transformer models for the biomedical domain and Funnel Transformer and T5 models for the Arabic language. For each of our models, we present experiments studying the impact of design factors (e.g., corpora and vocabulary domain, model scale, architecture design) on performance and efficiency. Our evaluation of the BioM-Transformer models shows that they obtain state-of-the-art results on several biomedical NLP tasks and rank as the top-performing models on the BLURB leaderboard. The evaluation of our small-scale Arabic Funnel and T5 models shows that they achieve comparable performance while requiring less fine-tuning computation than existing Arabic models. Further, our base-scale Arabic language models extend state-of-the-art results on several Arabic NLP tasks while maintaining a fine-tuning cost comparable to that of existing base-scale models. Next, we focus on the question-answering task, specifically tackling issues in specialized domains and low-resource languages, such as the limited size of question-answering datasets and their limited topic coverage. We employ several methods to address these issues in the biomedical domain, including the use of domain-adapted models and Task-to-Task Transfer Learning. We evaluate the effectiveness of these methods in the BioASQ10 (2022) challenge, where our system was the top performer on several batches. In Arabic, we address similar issues by introducing a novel approach to creating question-answer-passage triplets and proposing a pipeline, Pair2Passage, for creating large QA datasets. Using this method and the pipeline, we create the ArTrivia dataset, a new Arabic question-answering dataset comprising more than 10,000 high-quality question-answer-passage triplets. We present a quantitative and qualitative analysis of ArTrivia that shows the importance of some often-overlooked components, such as answer normalization, in enhancing the quality of question-answering datasets and future annotation. In addition, our evaluation shows that ArTrivia can be used to build a question-answering model that addresses the out-of-distribution issue in existing Arabic QA datasets.

Copyright owned by the Saudi Digital Library (SDL) © 2025