SACM - United Kingdom

Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9667


Search Results

Now showing 1 - 10 of 38
  • Item (Restricted)
    GAN-Enhanced Super-Resolution Pipeline for Robust Word Recognition in Low-Quality Non-English Handwriting.
    (Saudi Digital Library, 2025) Shahbi, Bilqis; Xia, Panqiu
    Executive summary: The dissertation tackles a critical shortcoming of current optical character recognition (OCR) technologies, namely correctly identifying handwritten non-English scripts in low-quality and degraded conditions. While OCR technologies have matured for printed English and other Latin-based languages, scripts such as Arabic, Devanagari, and Telugu remain underrepresented due to structural complexities, cursive connections, diacritics, ligatures, and the limited availability of annotated datasets. These challenges are amplified by real-world factors such as low-resolution scans, noisy archival documents, and mobile phone captures, where fine details necessary for recognition are lost. The study proposes a two-stage deep learning pipeline that integrates super-resolution with recognition, explicitly designed to address these shortcomings.
    The first stage of the pipeline utilises Real-ESRGAN, a generative adversarial network specifically optimised for real-world image degradation. Unlike earlier models such as SRCNN, VDSR, and ESRGAN, which often prioritise aesthetics or hallucinate features, Real-ESRGAN reconstructs high-resolution images with sharper strokes, preserved ligatures, and clear diacritics. Its Residual-in-Residual Dense Block (RRDB) architecture combines residual learning and dense connections, enabling robust recovery of fine-grained textual details. By preserving structural fidelity rather than merely visual appeal, Real-ESRGAN ensures that enhanced images retain the critical features necessary for recognition.
    The second stage utilises a Convolutional Recurrent Neural Network (CRNN) with a Connectionist Temporal Classification (CTC) loss function. The CRNN combines convolutional layers for feature extraction, bidirectional LSTM layers for capturing sequential dependencies, and CTC decoding for alignment-free sequence prediction. This integration eliminates the need for explicit segmentation, a complicated task in cursive or densely written scripts. Together, the two stages form a cohesive system in which image enhancement directly supports recognition accuracy.
    To ensure robustness, the research incorporated extensive dataset preparation and preprocessing. Handwritten datasets for Arabic, Devanagari, and Telugu scripts were selected to reflect structural diversity. Preprocessing included resizing, artificial noise simulation (Gaussian noise, blur, and compression artefacts), and augmentation (rotations, elastic distortions, and brightness adjustments). These techniques increased dataset variability and improved the model's ability to generalise to real-world handwriting scenarios.
    Evaluation was conducted at both the image and recognition levels. Image quality was assessed using the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM), while recognition performance was measured using the Character Error Rate (CER) and the Word Error Rate (WER). This dual evaluation ensured that improvements in image clarity translated into tangible recognition gains.
    The results confirm the effectiveness of the proposed pipeline. Real-ESRGAN showed improvements over SRCNN, VDSR, and ESRGAN, with higher PSNR and SSIM values across scripts. These gains reflect superior structural fidelity, particularly in preserving Arabic cursive flows, Devanagari's horizontal headstrokes, and Telugu's stacked ligatures. Recognition accuracy also improved compared to baseline low-resolution inputs. Script-specific analysis showed more precise word boundaries in Arabic, sharper conjuncts and diacritics in Devanagari, and more distinct glyph separations in Telugu. When benchmarked against traditional OCR systems such as Tesseract, the pipeline demonstrated clearer recognition outcomes, underscoring the critical role of task-specific super-resolution in OCR: enhancing input quality directly strengthens recognition performance.
    The dissertation makes contributions across methodological, empirical, theoretical, and practical domains. Methodologically, it demonstrates the value of integrating enhancement and recognition stages in a fine-tuned pipeline, ensuring that improvements in image clarity yield measurable gains in recognition. Empirically, it validates the effectiveness of Real-ESRGAN for handwritten text, showing consistent improvements across structurally diverse scripts. Theoretically, it advances the understanding of script-sensitive OCR, emphasising the preservation of structural features such as diacritics and ligatures. Practically, the work highlights applications in archival preservation, e-governance, and education. By enabling more accurate digitisation of handwritten records, the system supports inclusive access to information and the preservation of linguistic heritage.
    The study acknowledges several limitations. The scarcity of diverse annotated datasets constrains the model's generalisability to other scripts such as Amharic or Khmer. The computational expense of training Real-ESRGAN limits its feasibility in low-resource settings. Occasional GAN artefacts, where spurious strokes or distortions appear, pose risks in sensitive applications such as legal documents. Moreover, the pipeline has not been extensively tested on mixed-script texts, which are common in multilingual societies. These limitations suggest avenues for future work, including developing larger and more diverse datasets, designing lightweight models for real-time and mobile deployment, integrating script identification for mixed-language documents, and incorporating explainable AI for greater transparency in recognition decisions.
    In conclusion, the dissertation demonstrates that GAN-enhanced super-resolution is not merely a cosmetic tool but an essential step toward robust OCR for non-English handwritten texts. By aligning image enhancement with recognition objectives, the proposed pipeline reduces error rates and generalises across diverse scripts. Its implications extend beyond technical achievement to cultural preservation, digital inclusion, and the democratisation of access to information. At the same time, the identified limitations provide a roadmap for future research, ensuring that multilingual OCR evolves into a truly global and inclusive technology.
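The recognition stage described above combines a convolutional feature extractor, bidirectional LSTMs, and CTC decoding. As a hedged illustration only (layer sizes, image dimensions, and the 80-symbol alphabet are assumptions, not the dissertation's configuration), a minimal PyTorch sketch of such a CRNN trained with CTC loss might look like this:

```python
# Illustrative CRNN + CTC sketch; sizes and alphabet are assumptions.
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, num_classes, img_height=32):
        super().__init__()
        # Convolutional feature extractor: halves height and width twice.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        feat_height = img_height // 4
        # Bidirectional LSTM over the width (time) axis.
        self.rnn = nn.LSTM(128 * feat_height, 256, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(512, num_classes)   # num_classes includes the CTC blank

    def forward(self, x):                       # x: (batch, 1, H, W)
        f = self.cnn(x)                         # (batch, C, H/4, W/4)
        b, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(b, w, c * h)   # (batch, time, features)
        f, _ = self.rnn(f)
        return self.fc(f)                       # (batch, time, num_classes)

model = CRNN(num_classes=80)                    # assumed alphabet size + blank
ctc = nn.CTCLoss(blank=0, zero_infinity=True)

images = torch.randn(4, 1, 32, 128)             # toy batch of word images
logits = model(images).log_softmax(2).permute(1, 0, 2)   # (time, batch, classes)
targets = torch.randint(1, 80, (4, 10))                   # toy label sequences
input_lengths = torch.full((4,), logits.size(0), dtype=torch.long)
target_lengths = torch.full((4,), 10, dtype=torch.long)
loss = ctc(logits, targets, input_lengths, target_lengths)
loss.backward()
```

Because CTC aligns predictions to labels during decoding, no character-level segmentation of the cursive word image is needed, which is the property the abstract highlights.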
  • Item (Restricted)
    CADM: Creative Accounting Detection Model in Saudi-Listed Companies
    (Saudi Digital Library, 2025) Bineid, Maysoon Mohamed; Beloff, Natalia
    In business, financial statements are the primary source of information for investors and other stakeholders. Despite extensive regulatory efforts, the quality of financial reporting in Saudi Arabia still requires improvement, as prior studies have documented evidence of creative accounting. This practice occurs when managers manipulate accounting figures within the boundaries of the International Financial Reporting Standards to present a more favourable image of the company. Although various fraud detection methods exist, identifying manipulations that are legal yet misleading remains a significant challenge. This research introduces the Creative Accounting Detection Model (CADM), a deep learning (DL)-based approach that employs Long Short-Term Memory (LSTM) networks to identify Saudi-listed companies engaging in creative accounting. Two versions of the model were developed: CADM1, trained on a simulated dataset based on established accounting measures from the literature, and CADM2, trained on a dataset tailored to reflect financial patterns observed in the Saudi market. Both datasets incorporated financial and non-financial features derived from a preliminary survey of Saudi business experts. The models achieved training accuracies of 100% (CADM1) and 95% (CADM2). Both models were then tested on real-world data from the Saudi energy sector (2019–2023). CADM1 classified one company as engaging in creative accounting, whereas CADM2 classified all companies as non-creative but demonstrated greater stability in prediction confidence. To interpret these results, a follow-up qualitative study involving expert interviews confirmed CADM as a promising supplementary tool for auditors, enhancing analytical and oversight capabilities. These findings highlight CADM’s potential to support regulatory oversight, strengthen auditing procedures, and improve investor trust in the transparency of financial statements.
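For illustration only, a minimal PyTorch sketch of the kind of LSTM classifier CADM describes: it reads a company's sequence of yearly financial and non-financial features and emits a creative-accounting probability. The feature count, sequence length, and layer sizes are assumptions, not the thesis's actual configuration.

```python
# Hypothetical LSTM classifier over yearly financial-statement features.
import torch
import torch.nn as nn

class CreativeAccountingLSTM(nn.Module):
    def __init__(self, num_features=20, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(num_features, hidden, num_layers=2,
                            batch_first=True, dropout=0.2)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                     # x: (batch, years, num_features)
        _, (h, _) = self.lstm(x)              # h: (num_layers, batch, hidden)
        return torch.sigmoid(self.head(h[-1]))   # probability per company

model = CreativeAccountingLSTM()
batch = torch.randn(8, 5, 20)                  # 8 companies, 5 fiscal years, 20 features
probs = model(batch)                           # shape (8, 1)
loss = nn.BCELoss()(probs, torch.randint(0, 2, (8, 1)).float())
loss.backward()
```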
  • Item (Restricted)
    The Additional Regulatory Challenges Posed by AI In Financial Trading
    (Saudi Digital Library, 2025) Almutairi, Nasser; Azzutti, Alessio
    Algorithmic trading has shifted from rule-based speed to adaptive autonomy, with deep learning and reinforcement learning agents that learn, re-parameterize, and redeploy in near real time, amplifying opacity, correlated behaviours, and flash-crash dynamics. Against this backdrop, the dissertation asks whether existing EU and US legal frameworks can keep pace with new generations of AI trading systems. It adopts a doctrinal and comparative method, reading MiFID II and MAR, the EU AI Act, SEC and CFTC regimes, and global soft law (IOSCO, NIST) through an engineering lens of AI lifecycles and value chains to test functional adequacy. Chapter 1 maps the evolution from deterministic code to self-optimizing agents and locates the shrinking space for real-time human oversight. Chapter 2 reframes technical attributes as risk vectors, such as herding, feedback loops, and brittle liquidity, and illustrates enforcement and stability implications. Chapter 3 exposes human-centric assumptions (intent, explainability, “kill switches”) embedded in current rules and the gaps they create for attribution, auditing, and cross-border supervision. Chapter 4 proposes a hybrid, lifecycle-based model of oversight that combines value-chain accountability, tiered AI-agent licensing, mandatory pre-deployment verification, explainable AI (XAI) requirements, cryptographically sealed audit trails, human-in-the-loop controls, continuous monitoring, and sandboxed co-regulation. The contribution is threefold: (1) a technology-aware risk typology linking engineering realities to market integrity outcomes; (2) a comparative map of EU and US regimes that surfaces avenues for regulatory arbitrage; and (3) a practicable governance toolkit that restores traceable accountability without stifling beneficial innovation. Overall, the thesis argues for moving from incremental, disclosure-centric tweaks to proactive, lifecycle governance that embeds accountability at design, deployment, and post-trade, aligning next-generation trading technology with the enduring goals of fair, orderly, and resilient markets.
  • Item (Restricted)
    Semi-Supervised Approach For Automatic Head Gesture Classification
    (Saudi Digital Library, 2025) Alsharif, Wejdan; Shimodaira, Hiroshi
    This study applies a semi-supervised method, specifically self-training, to automatic head gesture recognition from motion capture data. It explores and compares fully supervised deep learning models and self-training pipelines in terms of their performance and training approaches. The proposed approach achieved an accuracy of 52% and a macro F1 score of 44% in cross-validation. The results show that incorporating self-training into the learning process improves model performance, because the generated pseudo-labeled data effectively supplements the original labeled dataset, enabling the model to learn from a larger and more diverse set of training examples.
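A minimal sketch of the self-training idea described above, under assumed settings (a small scikit-learn MLP stands in for the deep models, and the 0.9 confidence threshold and three rounds are illustrative): train on the labeled windows, pseudo-label the unlabeled windows the model is confident about, and retrain on the enlarged set.

```python
# Illustrative self-training (pseudo-labeling) loop; model and thresholds assumed.
import numpy as np
from sklearn.neural_network import MLPClassifier

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=3):
    X, y, pool = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
    for _ in range(rounds):
        clf.fit(X, y)
        if len(pool) == 0:
            break
        proba = clf.predict_proba(pool)
        confident = proba.max(axis=1) >= threshold       # high-confidence samples only
        if not confident.any():
            break
        # Add confidently pseudo-labeled windows to the training set.
        X = np.vstack([X, pool[confident]])
        y = np.concatenate([y, clf.classes_[proba[confident].argmax(axis=1)]])
        pool = pool[~confident]
    return clf

# Toy example with random stand-ins for motion-capture feature windows.
rng = np.random.default_rng(0)
clf = self_train(rng.normal(size=(100, 30)), rng.integers(0, 4, size=100),
                 rng.normal(size=(500, 30)))
```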
  • Item (Restricted)
    ENHANCING DATA REPRESENTATION IN DISTRIBUTED MACHINE LEARNING
    (Saudi Digital Library, 2025) Aladwani, Tahani Abed; Anagnostopoulos, Christos
    Distributed computing devices, ranging from smartphones to edge micro-servers—collectively referred to as clients—are capable of gathering and storing diverse types of data, such as images and voice recordings. This wide array of data sources has the potential to significantly enhance the accuracy and robustness of Deep Learning (DL) models across a variety of tasks. However, this data is intrinsically heterogeneous, due to the differences in users’ preferences, lifestyles, locations, and other factors. Consequently, it necessitates comprehensive preprocessing (e.g., labeling, filtering, relevance assessment, and balancing) to ensure its suitability for the development of effective and reliable models. Therefore, this thesis explores the feasibility of conducting predictive analytics and model inference on edge computing (EC) systems when access to data is limited, and on clients’ devices through federated learning (FL) when direct access to data is entirely restricted.
    The first part of this thesis focuses on reducing the data transmission rate between clients and EC servers by employing techniques such as data and task caching, identifying data overlaps, and evaluating task popularity. While this strategy can significantly minimize data offloading, it does not entirely eliminate dependence on third-party entities.
    The second part of this thesis eliminates the dependency on third-party entities by implementing FL, where direct access to raw data is not possible. In this context, node and data selection are guided by predictions and model performance. The objective is to identify the most suitable nodes and relevant data for training by clustering nodes based on data characteristics and analyzing the overlap between query boundaries and cluster boundaries.
    The third part of this thesis introduces a mechanism designed to support classification tasks, such as image classification. These tasks present significant challenges when building models on distributed data, particularly due to issues like label shifting or missing labels across clients. To address these challenges, the proposed method mitigates the impact of imbalances across clients by employing multiple cluster-based meta-models, each tailored to specific label distributions.
    The fourth part of this thesis introduces a two-phase federated self-learning framework, termed 2PFL, which addresses the challenges of extreme data scarcity and skewness when training classifiers over distributed labeled and unlabeled data. 2PFL demonstrates the capability to achieve high-performance models even when trained with only 10% to 20% labeled data relative to the available unlabeled data.
    The conclusion chapter underscores the importance of adaptable learning mechanisms that can respond to the continuous changes in clients’ data volume, requirements, formats, and protection regulations. By incorporating the EC layer, we can alleviate concerns related to data privacy, reduce the volume of data needing offloading, expedite task execution, and facilitate the training of complex models. For scenarios demanding stricter privacy-preserving measures, FL offers a viable solution, enabling multiple clients to collaboratively train models while adhering to user privacy protection, data security, and government regulations. However, due to the indirect access to data inherent in FL, several challenges must be addressed to ensure the development of high-performance models. These challenges include imbalanced data distribution across clients, partially labeled data, and fully unlabeled data, all of which are explored and demonstrated through experimental evaluations.
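The federated learning setting this thesis builds on keeps raw data on the clients and shares only model updates. The following is a hedged sketch of that baseline mechanic (plain federated averaging, not the thesis's node-selection, clustering, or 2PFL methods; all shapes and hyperparameters are illustrative):

```python
# Minimal federated-averaging round: clients train locally, server averages weights.
import copy
import torch
import torch.nn as nn

def local_update(global_model, data, targets, epochs=1, lr=0.01):
    model = copy.deepcopy(global_model)          # client starts from the global model
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(data), targets).backward()
        opt.step()
    return model.state_dict()                    # only weights leave the client

def federated_average(client_states):
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in client_states]).mean(dim=0)
    return avg

global_model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))
clients = [(torch.randn(16, 10), torch.randint(0, 3, (16,))) for _ in range(5)]
states = [local_update(global_model, x, y) for x, y in clients]
global_model.load_state_dict(federated_average(states))
```

The thesis's contributions sit on top of this loop, for example by choosing which clients and which of their data participate in a round.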
  • Item (Restricted)
    Deep Learning-Based White Blood Cell Classification Through a Free and Accessible Application
    (Saudi Digital Library, 2025) Alluwaim, Yaseer; Campbell, Neill
    Background: Microscopy of peripheral blood smears (PBS) continues to play a fundamental role in hematology diagnostics, offering detailed morphological insights that complement automated blood counts. Examination of a stained blood film by a trained technician is among the most frequently performed tests in clinical hematology laboratories. Nevertheless, manual smear analysis is labor-intensive, time-consuming, and prone to considerable variability between observers. These challenges have spurred interest in automated, deep learning-based approaches to enhance efficiency and consistency in blood cell assessment.
    Methods: We designed a convolutional neural network (CNN) with a ResNet-50 backbone, applying standard transfer-learning techniques for white blood cell (WBC) classification. The model was trained on a publicly available dataset of approximately 4,000 annotated peripheral smear images representing eight WBC types. The image processing workflow included automated nucleus detection, normalization, and extensive augmentation (rotation, scaling, etc.) to improve model generalization. Training was performed with the PyTorch Lightning framework for efficient development.
    Application: The final model was integrated into a lightweight web application and deployed on Hugging Face Spaces, allowing accessible browser-based inference. The application provides an easy-to-use interface for uploading images, which are then automatically cropped and analyzed in real time. This open and free tool provides immediate classification results and serves laboratory technologists without requiring specialized hardware or software.
    Results: Testing on an independent set showed that the ResNet-50 network reached 98.67% overall accuracy, with consistently high performance across all eight WBC categories. Precision, recall, and specificity closely matched the overall accuracy, indicating well-balanced classification. To assess real-world generalization, the model was also tested on an external, heterogeneous dataset drawn from different sources, where it achieved 86.33% accuracy, reflecting strong performance outside its main training data. The confusion matrix showed negligible misclassifications, suggesting consistent distinction between leukocyte types.
    Conclusion: This study indicates that a lightweight AI tool can support peripheral smear analysis by offering rapid and consistent WBC identification via a web interface. Such a system may reduce laboratory workload and observer variability, particularly in resource-limited or remote settings where expert microscopists are scarce, and serve as a practical training aid for personnel learning cell morphology. Limitations include reliance on a single dataset, which may not encompass all staining or imaging variations, and evaluation performed offline. Future work will aim to expand dataset diversity, enable real-time integration with digital microscopes, and conduct clinical validation to broaden applicability and adoption. Application link: https://huggingface.co/spaces/xDyas/wbc-classifier
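A hedged sketch of the transfer-learning recipe the abstract describes: a torchvision ResNet-50 with its final layer replaced by an eight-class WBC head. The thesis reports using PyTorch Lightning; plain PyTorch is used here only to keep the example self-contained, and the augmentation values are illustrative assumptions.

```python
# Illustrative ResNet-50 transfer-learning setup for 8-class WBC classification.
import torch
import torch.nn as nn
from torchvision import models, transforms

NUM_WBC_CLASSES = 8   # per the abstract: eight white blood cell types

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)  # ImageNet backbone
model.fc = nn.Linear(model.fc.in_features, NUM_WBC_CLASSES)       # new classification head

# Typical augmentation/normalization pipeline for smear crops (values assumed).
train_tf = transforms.Compose([
    transforms.RandomRotation(20),
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# One illustrative training step on a dummy batch.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
images = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, NUM_WBC_CLASSES, (4,))
loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()
optimizer.step()
```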
  • Item (Restricted)
    Deep Multi-Modality Fusion for Integrative Healthcare
    (Queen Mary University of London, 2025) Alwazzan, Omnia; Slabaugh, Gregory
    The healthcare industry generates vast amounts of data, driving advancements in patient diagnosis, treatment, and therapeutic discovery. A single patient’s electronic healthcare record often includes multiple modalities, each providing unique insights into their condition. Yet, integrating these diverse, complementary sources to gain deeper insights remains a challenge. While deep learning has transformed single-modality analysis, many clinical scenarios, particularly in cancer care, require integrating complementary data sources for a holistic understanding.
    In cancer care, two key modalities provide complementary perspectives: histopathology whole-slide images (WSIs) and omics data (genomic, transcriptomic, epigenomic). WSIs deliver high-resolution views of tissue morphology and cellular structures, while omics data reveal molecular-level details of disease mechanisms. In this domain, single-modality approaches fall short: histopathology misses molecular heterogeneity, and traditional bulk or non-spatial omics data lack spatial context. Although recent advances in spatial omics technologies aim to bridge this gap by capturing molecular data within spatially resolved tissue architecture, such approaches are still emerging and are not explored in this thesis. Consequently, integrating conventional WSIs and non-spatial omics data through effective fusion strategies becomes essential for uncovering their joint potential.
    Effective fusion of these modalities holds the potential to reveal rich, cross-modal patterns that help identify signals associated with tumor behavior. But key questions arise: How can we effectively align these heterogeneous modalities (high-resolution images and diverse molecular data) into a unified framework? How can we leverage their interactions to maximize complementary insights? How can we tailor fusion strategies to maximize the strengths of dominant modalities across diverse clinical tasks?
    This thesis tackles these questions head-on, advancing integrative healthcare by developing novel deep multi-modal fusion methods. Our primary focus is on integrating the aforementioned key modalities, proposing innovative approaches to enhance omics–WSI fusion in cancer research. While the downstream applications of these methods span diagnosis, prognosis, and treatment stratification, the core contribution lies in the design and evaluation of fusion strategies that effectively harness the complementary strengths of each modality. Our research develops a multi-modal fusion method to enhance cross-modality interactions between WSIs and omics data, using advanced architectures to integrate their heterogeneous feature spaces and produce discriminative representations that improve cancer grading accuracy. These methods are flexibly designed and can be applied to fuse data from diverse sources across various application domains; however, this thesis focuses primarily on cancer-related tasks. We also introduce cross-modal attention mechanisms to refine feature representation and interpretability, functioning effectively in both single-modality and bimodal settings, with applications in breast cancer classification (using mammography, MRI, and clinical metadata) and brain tumor grading (using WSIs and gene expression data). Additionally, we propose dual fusion strategies combining early and late fusion to address challenges in omics–WSI integration, such as explainability and high-dimensional omics data, aligning omics with localized WSI regions to identify tumor subtypes without patch-level labels, and capturing global interactions for a holistic perspective.
    We deliver three key contributions: the Multi-modal Outer Arithmetic Block (MOAB), a novel fusion method integrating latent representations from WSIs and omics data using arithmetic operations and a channel fusion technique, achieving state-of-the-art brain cancer grading performance with publicly available code; the Flattened Outer Arithmetic Attention (FOAA), an attention-based framework extending MOAB for single- and bimodal tasks, surpassing existing methods in breast and brain tumor classification; and the Multi-modal Outer Arithmetic Block Dual Fusion Network (MOAD-FNet), combining early and late fusion for explainable omics–WSI integration, outperforming benchmarks on The Cancer Genome Atlas (TCGA) and NHNN BRAIN UK datasets with interpretable WSI heatmaps aligned with expert diagnoses. Together, these contributions provide reliable, interpretable, and adaptable solutions for the multi-modal fusion domain, with a specific focus on advancing diagnostics, prognosis, and personalized healthcare strategies while addressing the critical questions driving this field forward.
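As a rough illustration of the outer-arithmetic fusion idea behind MOAB (this is not the published implementation; the choice of operations, dimensions, and the small convolutional reducer are all assumptions), two latent vectors, say a WSI embedding and an omics embedding, can be combined into multi-channel outer-arithmetic maps and reduced to class logits:

```python
# Hypothetical outer-arithmetic fusion of two latent vectors into class logits.
import torch
import torch.nn as nn

class OuterArithmeticFusion(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.reduce = nn.Sequential(
            nn.Conv2d(4, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 4 * 4, num_classes),
        )

    def forward(self, wsi_vec, omics_vec):        # each: (batch, dim)
        a = wsi_vec.unsqueeze(2)                   # (batch, dim, 1)
        b = omics_vec.unsqueeze(1)                 # (batch, 1, dim)
        channels = torch.stack([
            a + b,                                 # outer addition
            a * b,                                 # outer product
            a - b,                                 # outer subtraction
            a / (b + 1e-6),                        # outer division (epsilon keeps the toy finite)
        ], dim=1)                                  # (batch, 4, dim, dim)
        return self.reduce(channels)

fusion = OuterArithmeticFusion()
logits = fusion(torch.randn(2, 64), torch.randn(2, 64))   # (2, num_classes)
```

The point of the sketch is only that pairwise arithmetic between every element of the two embeddings yields dense cross-modal interaction maps that a downstream network can learn from; the published MOAB, FOAA, and MOAD-FNet architectures differ in their details.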
  • Item (Restricted)
    Enhancing Gravitational-Wave Detection from Cosmic String Cusps in Real Noise Using Deep Learning
    (Saudi Digital Library, 2025) Bahlool, Taghreed; Sutton, Patrick
    Cosmic strings are topological defects that may have formed in the early universe and could produce bursts of gravitational waves through cusp events. Detecting such signals is particularly challenging due to the presence of transient non-astrophysical artifacts—known as glitches—in gravitational-wave detector data. In this work, we develop a deep learning-based classifier designed to distinguish cosmic string cusp signals from common transient noise types, such as blips, using raw, whitened 1D time-series data extracted from real detector noise. Unlike previous approaches that rely on simulated or idealized noise environments, our method is trained and tested entirely on real noise, making it more applicable to real-world search pipelines. Using a dataset of 50,000 labeled 2-second samples, our model achieves a classification accuracy of 84.8%, a recall of 78.7%, and a false-positive rate of 9.1% on unseen data. This demonstrates the feasibility of cusp–glitch discrimination directly in the time domain, without requiring time-frequency representations or synthetic data, and contributes toward robust detection of exotic astrophysical signals in realistic gravitational-wave conditions.
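For illustration, a minimal 1D-CNN binary classifier over whitened 2-second strain segments, in the spirit of the cusp-versus-glitch classifier described above; the sampling rate, kernel sizes, and channel counts are assumptions rather than the dissertation's configuration.

```python
# Illustrative 1D CNN for cusp-vs-glitch classification of time-series segments.
import torch
import torch.nn as nn

SAMPLE_RATE = 4096                       # assumed; 2 s segments -> 8192 samples

class CuspGlitchCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=16, stride=2), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=8, stride=2), nn.ReLU(), nn.MaxPool1d(4),
            nn.AdaptiveAvgPool1d(8),
        )
        self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(32 * 8, 2))

    def forward(self, x):                # x: (batch, 1, 2 * SAMPLE_RATE)
        return self.classifier(self.features(x))

model = CuspGlitchCNN()
segments = torch.randn(4, 1, 2 * SAMPLE_RATE)    # toy whitened time series
logits = model(segments)                          # (4, 2): cusp vs glitch scores
```

Working directly on the whitened time series is what lets such a classifier skip the spectrogram step the abstract mentions.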
  • Item (Restricted)
    Predicting Delayed Flights for International Airports Using Artificial Intelligence Models & Techniques
    (Saudi Digital Library, 2025) Alsharif, Waleed; M'Hallah, Rym
    Delayed flights are a pervasive challenge in the aviation industry, significantly impacting operational efficiency, passenger satisfaction, and economic costs. This thesis aims to develop predictive models that demonstrate strong performance and reliability, capable of maintaining high accuracy on the tested dataset and showing potential for application in various real-world aviation scenarios. These models leverage advanced artificial intelligence and deep learning techniques to address the complexity of predicting delayed flights. The study evaluates the performance of Long Short-Term Memory (LSTM) networks, Convolutional Neural Networks (CNN), and their hybrid model (LSTM-CNN), which combines temporal and spatial pattern analysis, alongside Large Language Models (LLMs, specifically OpenAI's Babbage model), which excel at processing structured and unstructured text data. Additionally, the research introduces a unified machine learning framework utilizing a Gradient Boosting Machine (GBM) for regression and a Light Gradient Boosting Machine (LGBM) for classification, aimed at estimating both flight delay durations and their underlying causes. The models were tested on high-dimensional datasets from John F. Kennedy International Airport (JFK) and a synthetic dataset from King Abdulaziz International Airport (KAIA). Among the evaluated models, the hybrid LSTM-CNN demonstrated the best performance, achieving 99.91% prediction accuracy with a prediction time of 2.18 seconds, outperforming the GBM model (98.5% accuracy, 6.75 seconds) and LGBM (99.99% precision, 4.88 seconds). Additionally, GBM achieved a strong correlation score (R² = 0.9086) in predicting delay durations, while LGBM exhibited exceptionally high precision (99.99%) in identifying delay causes. Results indicated that National Aviation System delays (correlation: 0.600), carrier-related delays (0.561), and late aircraft arrivals (0.519) were the most significant contributors, while weather factors played a moderate role. These findings underscore the accuracy and efficiency of LSTM-CNN, establishing it as the optimal model for predicting delayed flights due to its superior performance and speed. The study highlights the potential for integrating LSTM-CNN into real-time airport management systems, enhancing operational efficiency and decision-making while paving the way for smarter, AI-driven air traffic systems.
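A hedged sketch of a hybrid CNN-LSTM of the kind the thesis evaluates: a 1D convolution extracts local patterns from a flight's feature sequence, and an LSTM models longer-range temporal dependencies before a delayed/on-time classification head. The feature count, sequence length, and layer sizes are illustrative assumptions, not the thesis's configuration.

```python
# Illustrative hybrid CNN-LSTM for flight-delay classification.
import torch
import torch.nn as nn

class DelayCNNLSTM(nn.Module):
    def __init__(self, num_features=24, hidden=64):
        super().__init__()
        # 1D convolution over the time axis extracts local patterns,
        # then an LSTM models longer-range temporal dependencies.
        self.conv = nn.Sequential(
            nn.Conv1d(num_features, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)     # delayed vs on-time

    def forward(self, x):                    # x: (batch, time, num_features)
        f = self.conv(x.transpose(1, 2))     # (batch, 32, time)
        out, _ = self.lstm(f.transpose(1, 2))
        return self.head(out[:, -1])         # classify from the last time step

model = DelayCNNLSTM()
flights = torch.randn(8, 12, 24)             # 8 flights, 12 time steps, 24 features
logits = model(flights)                      # (8, 2)
```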
  • Item (Restricted)
    Paraphrase Generation and Identification at Paragraph-Level
    (Saudi Digital Library, 2025) Alsaqaabi, Arwa; Stewart, Craig; Akrida, Eleni; Cristea, Alexandra
    The widespread availability of the Internet and the ease of accessing written content have significantly contributed to the rising incidence of plagiarism across various domains, including education. This behaviour directly undermines academic integrity, as evidenced by reports highlighting increased plagiarism in student work. Notably, students tend to plagiarize entire paragraphs more often than individual sentences, further complicating efforts to detect and prevent academic dishonesty. Additionally, advancements in natural language processing (NLP) have further facilitated plagiarism, particularly through online paraphrasing tools and deep-learning language models designed to generate paraphrased text. These developments underscore the critical need to develop and refine effective paraphrase identification (PI) methodologies.
    This thesis addresses one of the most challenging aspects of plagiarism detection (PD): identifying instances of plagiarism at the paragraph level, with a particular emphasis on paraphrased paragraphs rather than individual sentences. By focusing on this level of granularity, the approach considers both intra-sentence and inter-sentence relationships, offering a more comprehensive solution to the detection of sophisticated forms of plagiarism.
    To achieve this aim, the research examines the influence of text length on the performance of NLP machine learning (ML) and deep learning (DL) models. Furthermore, it introduces ALECS-SS (ALECS – Social Sciences), a large-scale dataset of paragraph-length paraphrases, and develops three novel SALAC algorithms designed to preserve semantic integrity while restructuring paragraph content. These algorithms offer a novel approach that modifies the structure of paragraphs while maintaining their semantics.
    The methodology involves converting text into a graph where each node corresponds to a sentence’s semantic vector, and each edge is weighted by a numerical value representing the sentence-order probability. Subsequently, a masking approach is applied to the reconstructed paragraphs, modifying lexical elements while preserving the original semantic content. This step introduces variability to the dataset while maintaining its core meaning, effectively simulating paraphrased text. Human and automatic evaluations assess the reliability and quality of the paraphrases, and additional studies examine the adaptability of SALAC across multiple academic domains. Moreover, state-of-the-art large language models (LLMs) are analysed for their ability to differentiate between human-written and machine-paraphrased text. This investigation involves multiple PI datasets in addition to the newly established paragraph-level paraphrase dataset (ALECS-SS).
    The findings demonstrate that text length significantly affects model performance, with limitations arising from dataset segmentation. Additionally, the results show that the SALAC algorithms effectively maintain semantic integrity and coherence across different domains, highlighting their potential for domain-independent paraphrasing. The thesis also analysed the performance of state-of-the-art LLMs in detecting auto-paraphrased content and distinguishing it from human-written content at both the sentence and paragraph levels, showing that the models could reliably identify reworded content from individual sentences up to entire paragraphs. Collectively, these findings contribute to educational applications and plagiarism detection by improving how paraphrased content is generated and recognized, and they advance NLP-driven paraphrasing techniques by providing strategies that ensure that meaning and coherence are preserved in reworded material.
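A hedged sketch of the graph-construction step described above: each sentence becomes a node carrying a semantic vector, and each directed edge carries a weight standing in for the probability that one sentence follows another. The embedding function and the order-probability estimate below are placeholders, not components of the actual SALAC algorithms.

```python
# Placeholder sentence graph: nodes hold semantic vectors, edges hold order weights.
import numpy as np
import networkx as nx

def embed(sentence: str) -> np.ndarray:
    # Placeholder semantic vector (a real system would use a sentence encoder).
    rng = np.random.default_rng(abs(hash(sentence)) % (2**32))
    return rng.normal(size=64)

def order_probability(vec_a: np.ndarray, vec_b: np.ndarray) -> float:
    # Placeholder: cosine similarity squashed into (0, 1) as a stand-in
    # for a learned sentence-order model.
    cos = vec_a @ vec_b / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b))
    return float(1 / (1 + np.exp(-cos)))

def build_sentence_graph(sentences):
    graph = nx.DiGraph()
    vectors = {s: embed(s) for s in sentences}
    for s in sentences:
        graph.add_node(s, vector=vectors[s])
    for a in sentences:
        for b in sentences:
            if a != b:
                graph.add_edge(a, b, weight=order_probability(vectors[a], vectors[b]))
    return graph

paragraph = ["Plagiarism is rising.", "Paraphrasing tools make it easy.",
             "Detection must work at the paragraph level."]
g = build_sentence_graph(paragraph)
# A restructuring step could then follow high-weight edges, e.g. pick the most
# probable successor of the first sentence:
best_next = max(g.successors(paragraph[0]), key=lambda s: g[paragraph[0]][s]["weight"])
```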

Copyright owned by the Saudi Digital Library (SDL) © 2026