SACM - United States of America

Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9668

Search Results

Now showing 1 - 4 of 4
  • ItemRestricted
    Improving Feature Location in Source Code via Large Language Model-Based Descriptive Annotations
    (Arizona State University, 2025-05) Alneif, Sultan; Alhindawi, Nouh
    Feature location is a crucial task in software maintenance, aiding developers in identifying the precise segments of code responsible for specific functionalities. Traditional feature location methods, such as grep and static analysis, often result in high false-positive rates and inadequate ranking accuracy, increasing developer effort and reducing productivity. Information Retrieval (IR) techniques like Latent Semantic Indexing (LSI) have improved precision and recall but still struggle with lexical mismatches and semantic ambiguities. This research introduces an innovative method to enhance feature location by augmenting source code corpora with descriptive annotations generated by Large Language Models (LLMs), specifically Code Llama. The enriched corpora provide deeper semantic contexts, improving the alignment between developer queries and relevant source code components. Empirical evaluations were conducted on two open-source systems, HippoDraw and Qt, using standard IR performance metrics: precision, recall, First Relevant Position (FRP), and Last Relevant Position (LRP). Results showed significant performance gains: a 40% precision improvement in HippoDraw and a 26% improvement in Qt. Recall improved by 32% in HippoDraw and 24% in Qt. The findings highlight the efficacy of incorporating LLM-generated annotations, significantly reducing developer effort and enhancing software comprehension and maintainability. This research provides a practical and scalable solution for software maintenance and evolution tasks. (An illustrative sketch of the precision, recall, FRP, and LRP computations appears after this list.)
  • ItemRestricted
    An In-Depth Analysis of the Adoption of Large Language Models in Clinical Settings: A Fuzzy Multi-Criteria Decision-Making Approach
    (University of Bridgeport, 2024-08-05) Aldwean, Abdullah; Tenney, Dan
    The growing capabilities of large language models (LLMs) in the medical field hold promise for transformational change. The evolution of LLMs, such as BioBERT and MedGPT, has created new opportunities for enhancing the quality of healthcare services, improving clinical operational efficiency, and addressing numerous existing healthcare challenges. However, the adoption of these innovative technologies in clinical settings is a complex, multifaceted decision problem influenced by various factors. This dissertation aims to identify and rank the challenges facing the integration of LLMs into clinical settings and to evaluate different adoption solutions. To achieve this goal, a combined approach based on the Fuzzy Analytic Hierarchy Process (FAHP) and the Fuzzy Technique for Order of Preference by Similarity to Ideal Solution (Fuzzy TOPSIS) has been employed to prioritize these challenges and then to use them as criteria for ranking potential LLM adoption solutions based on experts' opinions. These challenges span societal, technological, organizational, regulatory, and economic (STORE) perspectives. The findings indicate that regulatory concerns, such as accountability and compliance, are considered the most critical challenges facing the LLM adoption decision. This research provides a thorough, evidence-based assessment of LLMs in clinical settings. It offers a structured framework that helps decision-makers navigate the complexities of leveraging such disruptive innovations in clinical practice. (A simplified fuzzy TOPSIS sketch appears after this list.)
  • ItemUnknown
    A Method for Formal Analysis and Simulation of Standard Operating Procedures (SOPs) to Meet Safety Standards
    (George Mason University, 2024) Bashatah, Jomana; Sherry, Lance
    Standard Operating Procedures (SOPs) are the “glue” that holds the command-and-control center together. They are step-by-step instructions that guide operators in controlling complex human-machine systems. While the machine is certified and the operators are licensed, SOPs are loosely regulated. Time and cost constraints limit the testing of SOPs to account for the variability in human performance (i.e., SOP execution time) and the variability in the operational environment. Additionally, SOPs mainly exist as static text documents (e.g., Word documents), hindering the ability to revise SOPs consistently and maintain configuration integrity. To address these limitations, this dissertation developed a framework comprising a digital SOP representation, metrics, and a simulation model to aid in creating, revising, and evaluating SOPs. A canonical structure, the extended Procedure Representation Language (e-PRL), was developed to decompose SOP steps into perceptual, cognitive, and motor elements. A method for using Large Language Models (LLMs) to generate SOP steps from owner's manuals and a method for classifying the text of SOP steps into e-PRL components were developed. Techniques were also developed for the e-PRL representation, including Monte Carlo simulations to assess human performance and quantitative metrics that evaluate SOP content and training requirements. Three case studies demonstrating the applicability of the methods are presented from the following domains: (1) aviation operational SOPs, (2) International Space Station (ISS) Habitable Airlock (HAL) SOPs, and (3) semi-autonomous vehicle SOPs. The implications of the results for each case study, along with the limitations and future work for the methods, are discussed. (A small Monte Carlo sketch of SOP step-time simulation appears after this list.)
  • ItemRestricted
    Development Techniques for Large Language Models for Low Resource Languages
    (University of Texas at Dallas, 2023-12) Alsarra, Sultan; Khan, Latifur
    Recent advancements in Natural Language Processing (NLP) driven by large language models have brought about transformative changes in various sectors reliant on extensive text-based research. This dissertation presents techniques for crafting domain-specific large language models tailored to low-resource languages, offering invaluable support to researchers engaged in large-scale text analysis. The primary focus of these models is to address the nuances of politics, conflicts, and violence in the Middle East and Latin America using domain-specific, pre-trained large language models in Arabic and Spanish. Throughout the development of these language models, we construct a multitude of downstream tasks, including named entity recognition, binary classification, multi-label classification, and question answering. Additionally, we lay out a roadmap for the creation of domain-specific large language models. Our core objective is to devise NLP strategies and methodologies that surmount the challenges posed by low-resource languages. This contribution extends to curating an extensive corpus of texts centered on regional politics and conflicts in Spanish and Arabic, thereby enriching research on large language models for low-resource languages. We assess the performance of our models against the Bidirectional Encoder Representations from Transformers (BERT) model as a baseline. Our findings establish that the use of domain-specific pre-trained language models markedly enhances the performance of NLP models in the realm of politics and conflict analysis, in both Arabic and Spanish and across diverse downstream tasks. Consequently, our work equips researchers working on large language models for low-resource languages with potent tools. At the same time, it offers political and conflict analysts, including policymakers, scholars, and practitioners, novel approaches and instruments for deciphering the intricate dynamics of local politics and conflicts, directly in Arabic and Spanish. (A minimal sketch of a downstream fine-tuning comparison appears after this list.)
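The first item ("Improving Feature Location in Source Code via Large Language Model-Based Descriptive Annotations") evaluates retrieval quality with precision, recall, FRP, and LRP. The following Python sketch is a minimal illustration, not code from the dissertation, of how those four metrics can be computed for a single feature-location query; the function name and example data are hypothetical.

```python
# Illustrative sketch: precision, recall, First Relevant Position (FRP), and
# Last Relevant Position (LRP) for one query over a ranked list of code units.

def ir_metrics(ranked_results, relevant):
    """ranked_results: list of code-unit ids in ranked order; relevant: set of ids."""
    retrieved_relevant = [doc for doc in ranked_results if doc in relevant]
    precision = len(retrieved_relevant) / len(ranked_results) if ranked_results else 0.0
    recall = len(retrieved_relevant) / len(relevant) if relevant else 0.0
    # 1-based ranks at which relevant units appear in the ranked list.
    positions = [i + 1 for i, doc in enumerate(ranked_results) if doc in relevant]
    frp = positions[0] if positions else None   # First Relevant Position
    lrp = positions[-1] if positions else None  # Last Relevant Position
    return precision, recall, frp, lrp

# Example: the relevant methods appear at ranks 2 and 5 of a 10-result list.
print(ir_metrics([f"m{i}" for i in range(10)], {"m1", "m4"}))  # (0.2, 1.0, 2, 5)
```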
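The second item ("An In-Depth Analysis of the Adoption of Large Language Models in Clinical Settings") ranks challenges and solutions with FAHP and fuzzy TOPSIS. The sketch below shows a deliberately simplified, Chen-style fuzzy TOPSIS over made-up criteria, weights, and triangular fuzzy ratings; the dissertation's actual criteria, expert judgments, and aggregation steps differ.

```python
# Simplified fuzzy TOPSIS sketch with triangular fuzzy numbers (TFNs).
import math

def dist(a, b):
    """Vertex distance between two TFNs (l, m, u)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3)

# Rows: hypothetical alternatives; columns: benefit criteria rated as TFNs on a 0-1 scale.
ratings = {
    "On-premise fine-tuned LLM": [(0.5, 0.7, 0.9), (0.3, 0.5, 0.7)],
    "Vendor-hosted API":         [(0.3, 0.5, 0.7), (0.5, 0.7, 0.9)],
}
# FAHP-derived criterion weights would go here; crisp weights keep the sketch short.
weights = [0.6, 0.4]  # e.g. regulatory compliance, operational efficiency (assumed)

scores = {}
for alt, tfns in ratings.items():
    d_plus = d_minus = 0.0
    for (l, m, u), w in zip(tfns, weights):
        weighted = (l * w, m * w, u * w)
        d_plus += dist(weighted, (w, w, w))   # distance to the positive ideal
        d_minus += dist(weighted, (0, 0, 0))  # distance to the negative ideal
    scores[alt] = d_minus / (d_plus + d_minus)  # closeness coefficient

for alt, cc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{cc:.3f}  {alt}")
```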
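The third item ("A Method for Formal Analysis and Simulation of Standard Operating Procedures") uses Monte Carlo simulation over e-PRL step decompositions to assess human performance. The sketch below is a hypothetical illustration of that idea: each step's perceptual, cognitive, and motor elements receive a sampled duration, and the distribution of total execution time is summarized. The step values and lognormal spread are assumptions, not the dissertation's data.

```python
# Monte Carlo sketch of SOP execution time over perceptual/cognitive/motor elements.
import math
import random

# Each step: (perceptual, cognitive, motor) nominal durations in seconds (made-up).
sop_steps = [(1.0, 2.5, 3.0), (0.5, 4.0, 1.5), (1.2, 1.0, 6.0)]

def simulate_once(steps, sigma=0.3):
    """One simulated execution; lognormal spread models operator variability."""
    total = 0.0
    for perceptual, cognitive, motor in steps:
        for nominal in (perceptual, cognitive, motor):
            total += random.lognormvariate(math.log(nominal), sigma)
    return total

times = sorted(simulate_once(sop_steps) for _ in range(10_000))
mean_t = sum(times) / len(times)
p95 = times[int(0.95 * len(times))]
print(f"mean completion {mean_t:.1f}s, 95th percentile {p95:.1f}s")
```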
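The fourth item ("Development Techniques for Large Language Models for Low Resource Languages") fine-tunes domain-specific pretrained encoders and compares them against a BERT baseline on downstream tasks. The sketch below only illustrates the shared Hugging Face interface such a comparison would use for a binary-classification task; the input text is dummy data, no fine-tuning is performed, and the domain-adapted checkpoint is left as a comment because its hub name is project-specific.

```python
# Skeleton for comparing a generic baseline with a domain-adapted encoder.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

texts = ["Dummy politics/conflict-related sentence for illustration."]

checkpoints = [
    "bert-base-multilingual-cased",  # generic multilingual BERT baseline
    # An Arabic or Spanish domain-adapted checkpoint would be listed here;
    # its name is project-specific, so it is not invented in this sketch.
]

for checkpoint in checkpoints:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    # A real experiment would fine-tune each model on labeled NER, classification,
    # or QA data (e.g. with transformers.Trainer) and compare task metrics such as F1;
    # this untuned forward pass only demonstrates the common interface.
    with torch.no_grad():
        logits = model(**enc).logits
    print(checkpoint, logits.softmax(dim=-1))
```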

Copyright owned by the Saudi Digital Library (SDL) © 2025