SACM - United States of America

Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9668

  • AI-GENERATED TEXT DETECTOR FOR ARABIC LANGUAGE
    (University of Bridgeport, 2024-08) Alshammari, Hamed; Elleithy, Khaled
    The rise of AI-generated texts (AIGTs), particularly with the arrival of advanced language models like ChatGPT, has spurred a growing need for effective detection methods. While these models offer various beneficial applications, their potential for misuse, such as facilitating plagiarism and generating fake textual content, raises significant ethical concerns. These concerns have sparked extensive academic research into detecting AIGTs, and efforts to mitigate misuse include commercial platforms such as Turnitin and GPTZero. Notably, most evaluations of current AI detectors have focused predominantly on English or other languages written in Latin-derived scripts. The effectiveness of existing AI detectors is notably hampered, however, when processing Arabic texts, due to the unique challenges posed by the language's diacritics: small marks placed above or below letters to indicate pronunciation. These diacritics can cause human-written texts (HWTs) to be misclassified as AIGTs. Recognizing the limitations of current detectors, this research first established a baseline performance assessment by evaluating existing detection systems, such as the OpenAI Text Classifier and GPTZero, on a newly developed benchmark dataset of Arabic texts containing both HWTs and AIGTs. This evaluation highlighted critical weaknesses in existing detectors' ability to handle diacritics and to differentiate between HWTs and AIGTs, particularly in essay-length texts. To address these limitations, this research introduces a novel AI text detector designed explicitly for Arabic, leveraging transformer-based pre-trained models trained on several novel datasets. Our resulting detector significantly outperforms existing detection models in accurately identifying both HWTs and AIGTs in Arabic. Although the research focused on Arabic due to its unique writing challenges, our detector architecture is adaptable to other languages.
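The diacritics problem the abstract describes can be made concrete with a small, purely illustrative preprocessing sketch (not the dissertation's actual pipeline): Arabic short-vowel marks occupy a compact Unicode range, so a detector can normalize them away before classification.

```python
import re

# Common Arabic diacritics (tashkeel) live in U+064B..U+0652.
DIACRITICS = re.compile(r"[\u064B-\u0652]")

def strip_diacritics(text: str) -> str:
    """Remove short-vowel marks so heavily diacritized human-written
    text is not flagged as machine-generated on that basis alone."""
    return DIACRITICS.sub("", text)

print(strip_diacritics("كَتَبَ"))  # diacritized 'kataba' -> bare consonants
```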
  • Towards Automated Security and Privacy Policies Specification and Analysis
    (Colorado State University, 2024-07-03) Alqurashi, Saja; Ray, Indrakshi
    Security and privacy policies, vital for information systems, are typically expressed in natural language documents. Security policy is represented by Access Control Policies (ACPs) within security requirements, initially drafted in natural language and subsequently translated into enforceable policy. The unstructured and ambiguous nature of natural language documents makes the manual translation process tedious, expensive, labor-intensive, and prone to error. Privacy policies, on the other hand, present unique challenges of length and complexity: their dense language and extensive content can be overwhelming, hindering both novice users and experts from fully understanding the practices related to data collection and sharing. The disclosure of these data practices to users, as mandated by privacy regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), is of utmost importance. To address these challenges, we turn to Natural Language Processing (NLP) to automate the extraction of critical information from natural language documents and the analysis of security and privacy policies. This dissertation therefore addresses two primary research questions. Question 1: How can we automate the translation of Access Control Policies (ACPs) from natural language expressions to the formal model of Next Generation Access Control (NGAC) and subsequently analyze the generated model? Question 2: How can we automate the extraction and analysis of data practices from privacy policies to ensure alignment with privacy regulations (GDPR and CCPA)? Addressing these research questions necessitates a comprehensive framework comprising two key components. The first component, SR2ACM, focuses on translating natural language ACPs into the NGAC model and introduces a series of contributions to the analysis of security policies. At the core of our contributions is an automated approach to constructing ACPs within the NGAC specification directly from natural language documents. Our approach integrates machine learning with software testing, a novel methodology that ensures the quality of the extracted access control model. The second component, Privacy2Practice, automates the extraction and analysis of data practices from privacy policies written in natural language. We have developed an automated method to extract the data practices mandated by privacy regulations and to analyze the disclosure of those practices within the privacy policies. The novelty of this research lies in creating a comprehensive framework that identifies the critical elements within security and privacy policies, enabling automated extraction and analysis of both types of policies directly from natural language documents.
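As a toy illustration of the translation step SR2ACM automates (the dissertation uses machine learning; this regex stand-in is purely hypothetical), a simple access control sentence can be reduced to a subject-action-resource tuple of the kind an NGAC-style model consumes:

```python
import re

# Hypothetical rule-based pass over one sentence pattern; a real system
# must handle far more varied and ambiguous policy language.
PATTERN = re.compile(
    r"(?P<subject>[A-Za-z ]+?) (?:can|may|is allowed to) "
    r"(?P<action>read|write|delete|update) (?P<resource>[A-Za-z ]+)"
)

def extract_acp(sentence: str):
    """Return (subject, action, resource) or None if no match."""
    m = PATTERN.search(sentence)
    if not m:
        return None
    return (m.group("subject").strip(),
            m.group("action"),
            m.group("resource").strip().rstrip("."))

print(extract_acp("A nurse can read patient records."))
```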
  • A Human-Centered Approach to Improving Adolescent Real-Time Online Risk Detection Algorithms
    (Vanderbilt University, 2024-05-15) Alsoubai, Ashwaq; Wisniewski, Pamela
    Computational risk detection holds promise for shielding particularly vulnerable groups from online harm. A thorough literature review of real-time computational risk detection methods revealed that most research defined 'real-time' as approaches that analyze content retrospectively as early as possible, or as preventive approaches that keep risks from reaching online environments at all. This review yielded a research agenda to advance the field, highlighting key areas: employing ecologically valid datasets, basing models and features on human understanding, developing responsive models, and evaluating model performance through detection timing and human assessment. This dissertation embraces human-centered methods both for gaining empirical insights into young people's online risk experiences and for developing a real-time risk detection system using a dataset of youth social media. By analyzing adolescent posts on an online peer-support mental health forum through a mixed-methods approach, it was discovered that the online risks youth face are often compounded by other factors, such as mental health issues, suggesting the multidimensional nature of these risks. Leveraging these insights, a statistical model was used to create profiles of youth based on their reported online and offline risks, which were then mapped to their actual online discussions. This empirical study uncovered that approximately 20% of youth fall into the highest risk category, necessitating immediate intervention. Building on this critical finding, the third study of this dissertation introduced a novel algorithmic framework aimed at the 'timely' identification of high-risk situations in youth online interactions. This framework prioritizes the riskiest interactions for high-risk evaluation, rather than uniformly assessing all youth discussions. A notable aspect of this study is the application of reinforcement learning to prioritize conversations that need urgent attention.
This method uses decision-making processes to flag conversations as high or low priority. After training several deep learning models, the study identified Bidirectional Long Short-Term Memory (Bi-LSTM) networks as the most effective for categorizing conversation priority. The Bi-LSTM model's capability to retain information over long durations is crucial for ongoing online risk monitoring. This dissertation sheds light on crucial factors that enhance the capability to detect risks in real time within private conversations among youth.
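The prioritization idea (evaluate the riskiest conversations first instead of scanning all discussions uniformly) can be sketched with a plain max-heap. The dissertation's reinforcement-learning policy is replaced here by a precomputed risk score, an assumption made purely for illustration:

```python
import heapq

class TriageQueue:
    """Pop conversations in descending risk order for costly high-risk review."""

    def __init__(self):
        self._heap = []
        self._n = 0  # insertion counter: deterministic tie-breaking

    def push(self, conversation_id: str, risk_score: float) -> None:
        # Negate the score: heapq is a min-heap, we want max-risk first.
        heapq.heappush(self._heap, (-risk_score, self._n, conversation_id))
        self._n += 1

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

q = TriageQueue()
q.push("c1", 0.2)
q.push("c2", 0.9)
q.push("c3", 0.5)
print(q.pop())  # highest-risk conversation is reviewed first
```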
  • Developing Novel Antiviral Agents: Targeting the N-Terminal Domain of SARS-CoV-2 Nucleocapsid Protein with Small Molecule Inhibitors
    (Virginia Commonwealth University, 2024-05-13) Alkhairi, Mona A.; Safo, Martin K.
    The COVID-19 pandemic, caused by SARS-CoV-2, persists globally with over 7 million deaths and 774 million infections. Urgent research is needed to understand virus behavior, especially considering the limited availability of approved medications. Despite vaccination efforts, the virus continues to pose a significant threat, highlighting the need for innovative approaches to combat it. The SARS-CoV-2 nucleocapsid protein (NP) emerges as a crucial target due to its role in viral replication and pathogenesis. The SARS-CoV-2 NP, essential for various stages of the viral life cycle, including genomic replication, virion assembly, and evasion of host immune defenses, comprises three critical domains: the N-terminal domain (NTD), the C-terminal domain (CTD), and the central linker region (LKR). Notably, the NTD is characterized by a conserved electropositive pocket, which is crucial for viral RNA binding during packaging. This highlights the multifunctionality of the nucleocapsid protein and its potential as a therapeutic target, given its essential roles and conserved features across diverse pathogenic coronavirus species. Our collaborators previously initiated a drug repurposing screen, identifying certain β-lactam antibiotics as potential SARS-CoV-2 NP-NTD protein inhibitors in vitro. The current study employed an ensemble of computational methodologies together with biophysical, biochemical, and X-ray crystallographic studies to discover novel chemotype hits against NP-NTD. Utilizing a combination of traditional molecular docking tools such as AutoDock Vina alongside AI-enhanced techniques including Gnina and DiffDock, eleven structurally diverse hit compounds predicted to target the SARS-CoV-2 NP-NTD were identified from the virtual screening (VS) studies. The hits include MY1, MY2, MY3, MY4, NP6, NP7, NP1, NP2, NP3, NP4, and NP5, which demonstrated favorable binding orientations and affinity scores.
Additionally, one supplementary compound provided by Dr. Cen's laboratory (denoted CE) was assessed in parallel. These hits were further evaluated for their in vitro activity using various biophysical and biochemical techniques, including differential scanning fluorimetry (DSF), microscale thermophoresis (MST), fluorescence polarization (FP), and the electrophoretic mobility shift assay (EMSA). DSF revealed that native NTD had a baseline thermal melting temperature (Tm) of 43.82°C. The compounds NP3, NP6, and NP7 notably increased the Tm by 2.55°C, 2.47°C, and 2.93°C, respectively, indicating strong thermal stabilization relative to the native protein, whereas NP4 and NP5 achieved only marginal Tm increases. MST studies showed that NP1, NP3, and NP7 exhibited the strongest affinity, with low micromolar dissociation constants (KD) of 0.32 μM, 0.57 μM, and 0.87 μM, respectively, significantly outperforming the control compounds PJ34 and Suramin, whose dissociation constants were 8.35 μM and 5.24 μM, respectively. Although NP2, NP6, and CE showed relatively weaker affinity, these compounds, with dissociation constants of 4.1 μM, 2.50 μM, and 1.81 μM, respectively, still bound more tightly than the controls PJ34 and Suramin. These results substantiate the potential of these scaffolds as modulators of NTD activity. In FP competition assays, NP1 and NP3 exhibited the lowest half-maximal inhibitory concentrations (IC50) of 5.18 μM and 5.66 μM, respectively, indicating the highest potency at disrupting the NTD-ssRNA complex among the compounds and outperforming the positive controls PJ34 and Suramin, whose IC50 values were 21.72 μM and 17.03 μM, respectively. The compounds NP6, NP7, CE, and NP2 also showed significant IC50 values, ranging from 7.00 μM to 10.13 μM. EMSA studies confirmed the compounds' ability to disrupt the NTD-ssRNA complex, with NP1 and NP3 the most potent at IC50 values of 2.70 μM and 3.31 μM, respectively.
These values compare with IC50 values of 8.64 μM and 3.61 μM for the positive controls PJ34 and Suramin, respectively. NP7, CE, NP6, and NP2 also showed IC50 values ranging from 4.31 μM to 7.61 μM. Experiments with the full-length nucleocapsid protein likewise showed that NP1 and NP3 disrupted NP-ssRNA binding, with IC50 values of 1.67 μM and 1.95 μM, better than Suramin's IC50 of 3.24 μM. These consistent results from both FP and EMSA highlight the superior effectiveness of NP1 and NP3 in disrupting nucleocapsid protein-ssRNA binding, showcasing their potential as particularly powerful antiviral agents. Extensive crystallization trials were conducted to elucidate the atomic structures of SARS-CoV-2 NP-NTD in complex with selected hit compounds, assessing over 8,000 unique crystallization conditions. Ultimately, only a PJ34-bound structure could be determined, albeit with weak ligand density, likely due to tight crystal packing impeding binding-site access. The crystal structure was determined to 2.2 Å resolution by molecular replacement, using the published apo NP-NTD coordinates (PDB 7CDZ) as a search model, and refined to R-factors of 0.193 (Rwork) and 0.234 (Rfree). The refined NP-NTD structure showed the same conserved intermolecular interactions between PJ34 and the RNA-binding pocket as observed in the previously reported HCoV-OC43 NP-NTD-PJ34 complex (PDB 4KXJ). This multi-faceted drug discovery endeavor, combining computational screening and in vitro assays, resulted in the successful identification of novel compounds inhibiting the SARS-CoV-2 nucleocapsid N-terminal domain. Biophysical and biochemical studies established NP1 and NP3 as superior hits with low micromolar binding affinities and low micromolar potency, superior to the standard inhibitors at disrupting both isolated NP-NTD-RNA and full-length nucleocapsid-RNA complex formation. Though crystallographic efforts encountered challenges, important validation was achieved through the resolved crystal structure of PJ34 in complex with NP-NTD. Future efforts will focus on obtaining co-crystals of NP-NTD with our compounds to enable targeted structure modification and improve their potency.
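To put the reported dissociation constants in perspective, a standard 1:1 binding model (an assumption of this sketch, not a method from the dissertation) gives fractional occupancy [L]/(Kd + [L]); at 1 µM free ligand, the tightest hit NP1 (Kd ≈ 0.32 µM) occupies roughly three-quarters of the protein, versus about a tenth for the PJ34 control (Kd ≈ 8.35 µM):

```python
# Kd values taken from the abstract; the 1:1 occupancy formula is the
# textbook assumption, not a claim about the dissertation's analysis.
def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Fractional occupancy for simple 1:1 binding at a given free-ligand concentration."""
    return ligand_um / (kd_um + ligand_um)

print(round(fraction_bound(1.0, 0.32), 2))  # NP1 at 1 uM ligand
print(round(fraction_bound(1.0, 8.35), 2))  # PJ34 control at 1 uM ligand
```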
  • Adaptive Cyber Security for Smart Home Systems
    (Howard University, 2024-04-29) Alsabilah, Nasser; Rawat, Danda B.
    Over the past decade, smart homes have expanded enormously around the world among residential customers; hence, the most intimate place in people's lives has become connected to cyberspace. This environment attracts hackers because of the amount and nature of the data it holds. Furthermore, many new technologies struggle to afford their users a proper level of security. Cybersecurity in smart homes is therefore becoming a real and growing concern, and conventional security methods are not effective in the smart home environment. The consequences of cyber attacks in this environment can extend beyond the direct users to society at large. Historically, many cybersecurity breaches have been reported within smart homes, aimed either at gaining information from connected smart devices or at exploiting smart home devices within botnets to execute Distributed Denial of Service (DDoS) and other attacks. There is thus a pressing demand to detect the malicious attacks targeting smart homes in order to protect security and privacy. This dissertation presents a comprehensive approach to these challenges, leveraging insights from energy consumption and network traffic analysis to enhance cybersecurity in smart home environments. The first objective of this research focuses on estimating vulnerability indices of smart devices within smart home systems using energy consumption data. Through a methodology based on the Kalman filter and the Shapiro-Wilk test, this objective provides estimates of the vulnerability indices of smart devices in a smart home system. Building on the understanding that energy consumption is strongly affected by network traffic, supported by many empirical observations revealing alterations in the energy consumption and network behavior of compromised devices, the subsequent objectives complement the first by developing adaptive techniques for cyber-attack detection and cyber-behavior prediction using Rough Set Theory combined with XGBoost. These objectives aim to detect and predict cyber threats, thus enhancing the overall security posture of smart home systems.
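The Kalman-filter stage of the first objective can be sketched in one dimension; the noise parameters and readings below are hypothetical, chosen only to show how a device's sudden consumption spike is tracked gradually rather than absorbed at face value:

```python
def kalman_1d(measurements, q=1e-3, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter. q: process-noise variance, r: measurement-noise
    variance, x0/p0: initial state and uncertainty (all assumed values)."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p += q            # predict: uncertainty grows between readings
        k = p / (p + r)   # Kalman gain
        x += k * (z - x)  # correct toward the new measurement
        p *= 1 - k
        estimates.append(x)
    return estimates

# Steady readings around 5 W, then a suspicious spike to 12 W:
smoothed = kalman_1d([5.1, 4.9, 5.3, 5.0, 12.0])
```

The final estimate moves only part of the way toward the 12 W spike, leaving a large residual, which is what makes residual-based anomaly flags workable.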
  • EXPLORING LANGUAGE MODELS AND QUESTION ANSWERING IN BIOMEDICAL AND ARABIC DOMAINS
    (University of Delaware, 2024-05-10) Alrowili, Sultan; Shanker, K. Vijay
    Despite the success of the Transformer model and its variants (e.g., BERT, ALBERT, ELECTRA, T5) in addressing NLP tasks, similar success has not been achieved when these models are applied to specialized domains (e.g., biomedical) and limited-resource languages (e.g., Arabic). This research addresses several challenges in applying Transformer models to specialized domains and to languages that lack language-processing resources. One reason for reduced performance in limited domains may be the lack of quality contextual representations. We address this issue by adapting different types of language models, introducing five BioM-Transformer models for the biomedical domain and Funnel Transformer and T5 models for the Arabic language. For each of our models, we present experiments studying the impact of design factors (e.g., corpus and vocabulary domain, model scale, architecture design) on performance and efficiency. Our evaluation of the BioM-Transformer models shows that we obtain state-of-the-art results on several biomedical NLP tasks and achieved the top-performing models on the BLURB leaderboard. The evaluation of our small-scale Arabic Funnel Transformer and T5 models shows that we achieve comparable performance while utilizing less computation than the fine-tuning cost of existing Arabic models. Further, our base-scale Arabic language models extend state-of-the-art results on several Arabic NLP tasks while maintaining a fine-tuning cost comparable to existing base-scale models. Next, we focus on the question-answering task, specifically tackling issues in specialized domains and low-resource languages such as the limited size of question-answering datasets and the limited topic coverage within them. We employ several methods to address these issues in the biomedical domain, including models adapted to the domain and Task-to-Task Transfer Learning. We evaluate the effectiveness of these methods at the BioASQ10 (2022) challenge, achieving the top-performing system on several batches of the BioASQ10 challenge. In Arabic, we address similar issues by introducing a novel approach to creating question-answer-passage triplets and propose a pipeline, Pair2Passage, for creating large QA datasets. Using this method and pipeline, we create ArTrivia, a new Arabic question-answering dataset comprising more than 10,000 high-quality question-answer-passage triplets. We present a quantitative and qualitative analysis of ArTrivia that shows the importance of often overlooked components, such as answer normalization, in enhancing the quality of question-answer datasets and future annotation. In addition, our evaluation shows that ArTrivia can be used to build a question-answering model that addresses the out-of-distribution issue in existing Arabic QA datasets.
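The answer-normalization component highlighted in the ArTrivia analysis can be illustrated with a minimal sketch. The rules below (diacritic removal, alef and teh marbuta unification) are common Arabic normalizations assumed for illustration, not necessarily the dissertation's exact set:

```python
import re

def normalize_answer(text: str) -> str:
    """Collapse common Arabic orthographic variants so string-match
    evaluation does not punish equivalent answers."""
    text = re.sub(r"[\u064B-\u0652]", "", text)  # drop diacritics
    text = re.sub(r"[أإآ]", "ا", text)           # unify alef variants
    text = text.replace("ة", "ه")                 # teh marbuta -> heh
    text = text.replace("ى", "ي")                 # alef maqsura -> yeh
    return text.strip()

print(normalize_answer("مَدْرَسَة"))  # diacritized 'madrasa' normalized
```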
  • Automated Repair of Accessibility Issues in Mobile Applications
    (Saudi Digital Library, 2023-11-29) Alotaibi, Ali; Halfond, William GJ
    Mobile accessibility is more critical than ever due to the significant increase in mobile app usage, particularly among people with disabilities who rely on mobile devices to access essential information and services. People with vision and motor disabilities often use assistive technologies to interact with mobile applications. However, recent studies show that a significant percentage of mobile apps remain inaccessible due to layout accessibility issues, making them challenging to use for older adults and people with disabilities. Unfortunately, existing techniques are limited in helping developers debug these issues; they can only detect issues but not repair them. Therefore, the repair of layout accessibility issues remains a manual, labor-intensive, and error-prone process. Automated repair of layout accessibility issues is complicated by several challenges. First, a repair must account for multiple issues holistically in order to preserve the relative consistency of the original app design. Second, due to the complex relationship between UI components, there is no straightforward way of identifying the set of elements and properties that need to be modified for a given issue. Third, assuming the relevant views and properties could be identified, the number of possible changes that need to be considered grows exponentially as more elements and properties need to be considered. Finally, a change in one element can create cascading changes that lead to new problems in other areas of the UI. Together, these challenges make a seemingly simple repair difficult to achieve. In this dissertation, I introduce a repair framework that builds and analyzes models of the User Interface (UI) and leverages multi-objective genetic search algorithms to repair layout accessibility issues. To evaluate the effectiveness of the framework, I instantiated it to repair the different known types of layout accessibility issues in mobile apps. 
The empirical evaluation of these instantiations on real-world mobile apps demonstrated their effectiveness in repairing these issues. In addition, I conducted user studies to assess the impact of the repairs on UI quality and aesthetics. The results demonstrated that the repaired UIs were not only more accessible but also did not distort or significantly change their original design. Overall, these results confirm my dissertation's hypothesis that a repair framework employing a multi-objective, genetic search-based approach can be highly effective in automatically repairing layout accessibility issues in mobile applications.
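The multi-objective comparison at the heart of such a genetic search can be sketched with Pareto dominance. The two objectives here (remaining accessibility violations and visual deviation from the original layout, both lower-is-better) are plausible stand-ins, not the dissertation's exact fitness functions:

```python
def dominates(a, b):
    """True if candidate a is no worse than b on every objective
    and strictly better on at least one (lower is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep only candidates no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o != c)]

# (violations remaining, visual deviation) for four hypothetical repairs:
repairs = [(0, 9.0), (2, 1.0), (1, 3.0), (2, 4.0)]
print(pareto_front(repairs))  # (2, 4.0) is dominated by (2, 1.0)
```

The surviving set is the trade-off curve the search presents: no member can improve one objective without worsening another.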
  • Artificial Intelligence Applied To Cybersecurity And Health Care
    (NDSU, 2023-09-21) Alenezi, Rafa; Ludwig, Simone
    Nowadays, artificial intelligence is considered a potential solution for various problems, including classification and regression optimization, in fields such as science, technology, and the humanities. It can also be applied in areas such as cybersecurity and healthcare. With the increasing complexity and impact of cybersecurity threats, it is essential to develop mechanisms for detecting new types of attacks. Hackers often target the Domain Name Server (DNS) component of a network architecture, which stores information about IP addresses and associated domain names, to gain access to a server or compromise network connectivity. Machine learning techniques can be used not only for cyber threat detection but also for other applications in various fields. In this dissertation, the first study investigates the use of classification models, including Random Forest classifiers, Keras Sequential models, and XGBoost classification, for detecting attacks. Additionally, the Tree, Deep, and Kernel explainers of SHapley Additive exPlanations (SHAP) are used to interpret the results of these models. The second study focuses on detecting DNS attacks using appropriate classifiers to enable quick and effective responses. In the medical field, there is a growing trend of using algorithms to identify diseases, particularly in medical imaging. Deep learning models have been developed to detect pneumonia, but their accuracy is not always optimal and they require large datasets for training. Two studies were conducted to develop more accurate detection models for pneumonia in chest X-ray images. The third study developed a model based on Reinforcement Learning (RL) with a Convolutional Neural Network (CNN) and showed improved accuracy values. The fourth study used Teaching Learning Based Optimization (TLBO) with a Convolutional Neural Network (CNN) to improve pneumonia detection accuracy, resulting in high accuracy rates.
Overall, these studies provide insights into the potential of artificial intelligence to improve disease detection and cyber threat detection.
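The attribution idea behind SHAP can be shown exactly for a two-feature model (this is the general Shapley computation, not the specific Tree, Deep, or Kernel explainers used in the dissertation): each feature receives its average marginal contribution over both orderings, and the contributions sum to the model output minus the baseline:

```python
def shapley_two_features(v):
    """Exact Shapley values for two features 'a' and 'b'.
    v maps feature coalitions (frozensets) to the model's output."""
    empty = v[frozenset()]
    a, b, ab = v[frozenset("a")], v[frozenset("b")], v[frozenset("ab")]
    # Average of a's marginal contribution over orderings (a,b) and (b,a):
    phi_a = 0.5 * ((a - empty) + (ab - b))
    phi_b = 0.5 * ((b - empty) + (ab - a))
    return phi_a, phi_b

# Hypothetical model outputs as features are revealed:
v = {frozenset(): 0.1, frozenset("a"): 0.5,
     frozenset("b"): 0.3, frozenset("ab"): 0.9}
print(shapley_two_features(v))  # contributions sum to 0.9 - 0.1
```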
  • INVERSE MAPPERS FOR QCD GLOBAL ANALYSIS
    (Saudi Digital Library, 2023-08) Almaeen, Manal; Li, Yaohang
    Inverse problems – using measured observations to determine unknown parameters – are well motivated but challenging in many scientific fields. Mapping parameters to observables is a well-posed problem with unique solutions, and it can therefore be solved with differential equations or linear algebra solvers. The inverse problem, however, requires a backward mapping from observable space to parameter space, which is often nonunique. Consequently, solving inverse problems is ill-posed and a far more challenging computational problem. Our motivating application in this dissertation is the inverse problem in nuclear physics of characterizing the internal structure of hadrons. We first present a machine learning framework called the Variational Autoencoder Inverse Mapper (VAIM), an autoencoder-based neural network architecture that constructs an effective "inverse function" mapping experimental data to QCFs. In addition to well-known inverse-problem challenges such as ill-posedness, an application-specific issue is that the experimental data are observed on kinematics bins that are usually irregular and varying. To address this ill-defined problem, we represent the observables together with their kinematics bins as an unstructured, high-dimensional point cloud. The point cloud representation is incorporated into the VAIM framework: our new architecture, point-cloud-based VAIM (PC-VAIM), enables the underlying deep neural networks to learn how the observables are distributed across kinematics. Next, we present our methods for extracting the leading-twist Compton form factors (CFFs) from polarization observables. In this context, we extend the VAIM framework to the Conditional-VAIM to extract the CFFs from the DVCS cross sections at several kinematics. Connected to this effort is a study of the effectiveness of incorporating physics knowledge into machine learning. We start by incorporating physics constraints into the forward problem of mapping the kinematics to the cross sections. First, we develop Physics-Constrained Neural Networks (PCNNs) for Deeply Virtual Exclusive Scattering (DVCS) cross sections by integrating some physics laws, such as the symmetry constraints of the cross sections. This provides a starting point for incorporating physics rules into our inverse mappers, which will be one of the directions of our future research.
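Why inverse mapping is ill-posed can be seen in a toy example (unrelated to VAIM's actual architecture): the forward map x ↦ x² is single-valued, but a naive inverse built by nearest-observable lookup must silently discard one of two equally valid preimages:

```python
def forward(x: float) -> float:
    """Well-posed forward model: each parameter gives one observable."""
    return x * x

def naive_inverse(y: float, grid):
    """Pick the grid parameter whose forward value is closest to y.
    Nonuniqueness means this returns only one of the valid preimages."""
    return min(grid, key=lambda x: abs(forward(x) - y))

grid = [x / 10 for x in range(-20, 21)]  # parameters in [-2, 2]
print(naive_inverse(4.0, grid))  # both -2.0 and +2.0 map to 4.0; one is lost
```

Frameworks like VAIM address exactly this: rather than collapsing to a single point estimate, the latent space of the autoencoder can represent the set of parameters consistent with an observation.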

Copyright owned by the Saudi Digital Library (SDL) © 2025