SACM - United States of America
Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9668
33 results
Search Results
Item Restricted
Deep Learning Approaches for Multivariate Time Series: Advances in Feature Selection, Classification, and Forecasting (New Mexico State University, 2024)
Alshammari, Khaznah Raghyan; Tran, Son; Hamdi, Shah Muhammad
In this work, we present the latest developments and advancements in machine-learning-based prediction and feature selection for multivariate time series (MVTS) data. MVTS data, which involves multiple interrelated time series, presents significant challenges due to its high dimensionality, complex temporal dependencies, and inter-variable relationships. These challenges are critical in domains such as space weather prediction, environmental monitoring, healthcare, sensor networks, and finance. Our research addresses these challenges by developing and implementing advanced machine-learning algorithms specifically designed for MVTS data. We introduce innovative methodologies that focus on three key areas: feature selection, classification, and forecasting. Our contributions include the development of deep learning models, such as Long Short-Term Memory (LSTM) networks and Transformer-based architectures, which are optimized to capture and model complex temporal and inter-parameter dependencies in MVTS data. Additionally, we propose a novel feature selection framework that gradually identifies the most relevant variables, enhancing model interpretability and predictive accuracy. Through extensive experimentation and validation, we demonstrate the superior performance of our approaches compared to existing methods. The results highlight the practical applicability of our solutions, providing valuable tools and insights for researchers and practitioners working with high-dimensional time series data. This work advances the state of the art in MVTS analysis, offering robust methodologies that address both theoretical and practical challenges in this field.
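A minimal sketch of the kind of LSTM-based MVTS classifier this abstract describes, written in Keras; the sequence length, number of variables, class count, and layer sizes are illustrative assumptions rather than the thesis configuration.

```python
# Minimal sketch of an LSTM classifier for multivariate time series (MVTS).
# Shapes and hyperparameters are illustrative assumptions, not the thesis setup.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

TIMESTEPS, N_VARS, N_CLASSES = 60, 24, 4  # hypothetical MVTS dimensions

model = keras.Sequential([
    layers.Input(shape=(TIMESTEPS, N_VARS)),
    layers.LSTM(64, return_sequences=True),   # capture temporal dependencies
    layers.LSTM(32),                          # summarize the whole sequence
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy data standing in for a labeled MVTS dataset.
X = np.random.randn(128, TIMESTEPS, N_VARS).astype("float32")
y = np.random.randint(0, N_CLASSES, size=128)
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```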
Item Restricted
Toward a Better Understanding of Accessibility Adoption: Developer Perceptions and Challenges (University Of North Texas, 2024-12)
Alghamdi, Asmaa Mansour; Ludi, Stephanie
The primary aim of this dissertation is to explore the challenges developers face in interpreting and implementing accessibility in web applications. We analyze developers’ discussions on web accessibility to gain a comprehensive understanding of the challenges, misconceptions, and best practices prevalent within the development community. As part of this analysis, we built a taxonomy of accessibility aspects discussed by developers on Stack Overflow, identifying recurring trends, common obstacles, and the types of disabilities associated with the features addressed by developers in their posts. This dissertation also evaluates the extent to which developers on online platforms engage with and deliberate upon accessibility issues, assessing their awareness and implementation of accessibility standards throughout the web application development process. Given the volume and variety of these discussions, manual analysis alone would be insufficient to capture the full scope of accessibility challenges. Therefore, we employed supervised machine learning techniques to classify these posts based on their relevance to the different principles of the WCAG 2.2 guidelines. By training our models on labeled data, we were able to automatically detect patterns and keywords that indicate specific accessibility issues, even when the language used by developers is not directly aligned with the official guidelines. The results emphasize developers’ struggles with complex accessibility issues, such as time-based media customization and screen reader configuration. The findings indicate that machine learning holds significant potential for enhancing compliance with accessibility standards, providing a pathway for more efficient and accurate adherence to these guidelines.

Item Restricted
Online conversations: A study of their toxicity (University of Illinois Urbana-Champaign, 2024)
Alkhabaz, Ridha; Sundaram, Hari
Social media platforms are essential spaces for modern human communication, and there is a dire need to make these spaces as welcoming and engaging as possible for their participants. A potential threat to this need is the propagation of toxic content in online spaces. Hence, it becomes crucial for social media platforms to detect early signs of a toxic conversation. In this work, we tackle the problem of toxicity prediction by proposing a definition of conversational structures. This definition empowers us to provide a new framework for toxicity prediction. We examine more than 1.18 million threads on X (formerly known as Twitter), made by 4.4 million users, to provide a few key insights about the current state of online conversations. Our results indicate that most X threads do not exhibit a conversational structure. Also, our newly defined structures are distributed differently than previously assumed for online conversations. Additionally, our definitions give models a meaningful signal for predicting the future toxicity of online conversations. We also show that message-passing graph neural networks outperform state-of-the-art gradient-boosting trees for toxicity prediction. Most importantly, we find that once we observe the first two terminating conversational structures, we can predict the future toxicity of online threads with ≈88% accuracy. We hope our findings will help social media platforms better curate content and promote more conversations in online spaces.
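As a rough illustration of the gradient-boosted baseline that the toxicity study compares its graph neural networks against, the sketch below trains scikit-learn's GradientBoostingClassifier on hypothetical thread-level features; the features and labels are synthetic stand-ins, not the study's data.

```python
# Sketch of a gradient-boosted baseline for thread toxicity prediction.
# Feature names are hypothetical; the study's actual features come from its
# conversational-structure definitions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_threads = 1000
# Hypothetical per-thread features observed after the first few structures:
# reply depth, reply count, unique users, early toxicity score.
X = rng.random((n_threads, 4))
y = (X[:, 3] + 0.1 * rng.standard_normal(n_threads) > 0.5).astype(int)  # toxic or not

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = GradientBoostingClassifier().fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```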
Item Restricted
ECG CLASSIFICATION USING NEURAL NETWORK (University of Bridgeport, 2018)
Alhassani, Ahmad; Faezipour, Miad
An electrocardiogram (ECG) is a biomedical signal that provides very useful information about heart problems. This thesis contributes to giving heart-monitoring machines a greater ability to make accurate and fast diagnoses, so that the lives of more patients might be saved. PhysioBank was the source of our dataset; it offers many real examples of heart disease to choose from for our studies. Five heart cases were used in this research: normal (N), atrial premature beat (PAC), premature ventricular contraction (PVC), left bundle branch block beat (LBBB), and right bundle branch block beat (RBBB). Classifying these five cases with high efficiency and accuracy using a neural network is our final goal. To achieve this goal, ECG signals must go through specific procedures. The first procedure is ECG signal preprocessing, which has three sub-steps: signal filtering, signal detrending, and signal smoothing. The second procedure is extracting features from the ECG signals. The third is classifying the ECG signals using a neural network. Finally, the neural network results are saved for future purposes. Our system was implemented in MATLAB because it is very powerful software for signal processing and signal analysis. Our research ended with some good achievements and optimizations. For example, we discovered good techniques for filtering, found a new way of extracting features, built one neural network to classify multiple heart diseases, and achieved a high accuracy of 96.88%.
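A minimal sketch of the three preprocessing sub-steps named above (filtering, detrending, smoothing), here in Python with SciPy rather than the MATLAB pipeline the thesis used; the sampling rate, cutoff frequencies, and window size are illustrative assumptions.

```python
# Sketch of the ECG preprocessing sub-steps described above:
# filtering, detrending, and smoothing, applied to a synthetic signal.
import numpy as np
from scipy.signal import butter, filtfilt, detrend, savgol_filter

fs = 360.0                       # sampling rate (Hz), an assumed value
t = np.arange(0, 10, 1 / fs)
ecg = np.sin(2 * np.pi * 1.2 * t) + 0.05 * np.random.randn(t.size)  # stand-in ECG

# 1) Band-pass filter to suppress baseline wander and high-frequency noise.
b, a = butter(3, [0.5 / (fs / 2), 40.0 / (fs / 2)], btype="band")
filtered = filtfilt(b, a, ecg)

# 2) Detrend to remove any remaining slow drift.
detrended = detrend(filtered)

# 3) Smooth with a Savitzky-Golay filter before feature extraction.
smoothed = savgol_filter(detrended, window_length=15, polyorder=3)
```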
Item Restricted
Network Alignment Using Topological And Node Embedding Features (Purdue University, 2024-08)
Almulhim, Aljohara; AlHasan, Mohammad
In today’s big data environment, the development of robust knowledge discovery solutions depends on the integration of data from various sources. For example, intelligence agencies fuse data from multiple sources to identify criminal activities; e-commerce platforms consolidate user activities across platforms and devices to build better user profiles; scientists connect data from various modalities to develop new drugs and treatments. In all such activities, entities from different data sources need to be aligned—first, to ensure accurate analysis and, more importantly, to discover novel knowledge regarding these entities. If the data sources are networks, aligning entities from different sources leads to the task of network alignment, which is the focus of this thesis. The main objective of this task is to find an optimal one-to-one correspondence among nodes in two or more networks utilizing graph topology and node/edge attributes. In existing works, diverse computational schemes have been adopted for solving the network alignment task; these schemes include finding eigen-decompositions of similarity matrices, solving quadratic assignment problems via sub-gradient optimization, and designing iterative greedy matching techniques. Contemporary works approach this problem using a deep learning framework, learning node representations to identify matches. Key challenges of node matching include computational complexity and scalability. Moreover, privacy concerns or unavailability often prevent the use of node attributes in real-world scenarios. In light of this, we aim to solve this problem by relying solely on the graph structure, without the need for prior knowledge, external attributes, or guidance from landmark nodes. Clearly, topology-based matching emerges as a harder problem than other network matching tasks. In this thesis, I propose two original works to solve the topology-based network alignment task. The first work, Graphlet-based Alignment (Graphlet-Align), employs a topological approach to network alignment. Graphlet-Align represents each node with a local graphlet-count-based signature and uses that as a feature for deriving node-to-node similarity across a pair of networks. By using these similarity values in a bipartite matching algorithm, Graphlet-Align obtains a preliminary alignment. It then uses higher-order information extending to the k-hop neighborhood of a node to further refine the alignment, achieving better accuracy. We validated Graphlet-Align’s efficacy by applying it to various large real-world networks, achieving accuracy improvements ranging from 20% to 72% over state-of-the-art methods on both duplicated and noisy graphs. Expanding on this paradigm that focuses solely on topology for solving graph alignment, in my second work I develop a self-supervised learning framework known as Self-Supervised Topological Alignment (SST-Align). SST-Align uses the graphlet-based signature to create self-supervised node alignment labels, and then uses those labels to generate node embedding vectors for both networks in a joint space from which the node alignment task can be effectively and accurately solved. It starts with an optimization process that applies average pooling on top of the extracted graphlet signature to construct an initial node assignment. Next, a self-supervised Siamese network architecture utilizes both the initial node assignment and graph convolutional networks to generate node embeddings through a contrastive loss. By applying kd-tree similarity to the two networks’ embeddings, we achieve the final node mapping. Extensive testing on real-world graph alignment datasets shows that our methodology has competitive results compared to seven existing models in terms of node mapping accuracy. Additionally, we conduct an ablation study to evaluate the two-stage accuracy, excluding the representation-learning part and comparing the mapping accuracy accordingly. This thesis enhances the theoretical understanding of topological features in the analysis of graph data for the network alignment task, hence facilitating future advancements in the field.

Item Restricted
EAVESDROPPING-DRIVEN PROFILING ATTACKS ON ENCRYPTED WIFI NETWORKS: UNVEILING VULNERABILITIES IN IOT DEVICE SECURITY (University of Central Florida, 2024-08-02)
Alwhbi, Ibrahim; Zou, Changchun
This dissertation investigates the privacy implications of WiFi communication in Internet-of-Things (IoT) environments, focusing on the threat posed by out-of-network observers. Recent research has shown that in-network observers can glean information about IoT devices, user identities, and activities. However, the potential for information inference by out-of-network observers, who do not have WiFi network access, has not been thoroughly examined. The first study provides a detailed summary dataset, utilizing Random Forest for data summary classification. This study highlights the significant privacy threat to WiFi networks and IoT applications from out-of-network observers. Building on this investigation, the second study extends the research by utilizing a new set of time-series-monitored WiFi data frames and advanced machine learning algorithms, specifically XGBoost, for time series classification. This extension achieved accuracy of up to 94% in identifying IoT devices and their working status, demonstrating faster IoT device profiling while maintaining classification accuracy. Furthermore, the study underscores the ease with which outside intruders can harm IoT devices without joining a WiFi network, launching attacks quickly and leaving no detectable footprints. Additionally, the dissertation presents a comprehensive survey of recent advancements in machine-learning-driven encrypted traffic analysis and classification. Given the challenges posed by encryption for traditional packet and traffic inspection, understanding and classifying encrypted traffic are crucial. The survey provides insights into utilizing machine learning for encrypted network traffic analysis and classification, reviewing state-of-the-art techniques and methodologies. This survey serves as a valuable resource for network administrators, cybersecurity professionals, and policy enforcement entities, offering insights into current practices and future directions in encrypted traffic analysis and classification.
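A minimal sketch of the kind of XGBoost-based traffic classification the WiFi profiling study describes: per-window features summarizing captured frames are mapped to a device/status label. The feature names, window counts, and class labels below are synthetic assumptions, not the dissertation's dataset.

```python
# Sketch of XGBoost classification of WiFi frame summaries into device/status
# classes. All features and labels are synthetic stand-ins.
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)
n_windows = 2000
# Hypothetical per-window features: mean frame size, frame rate,
# inter-arrival variance, fraction of retransmissions.
X = rng.random((n_windows, 4))
y = rng.integers(0, 5, n_windows)   # 5 hypothetical device/status classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = XGBClassifier(n_estimators=200, max_depth=4)
clf.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```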
Item Restricted
Exploring the Impact of Sentiment Analysis on Price Prediction (Lehigh University, 2024-07)
Zahhar, Abdulkarim Ali Y.; Robinson, Daniel P.
The integration of sentiment analysis into predictive models for financial markets, particularly Bitcoin, combines behavioral finance with quantitative analysis. This thesis investigates the extent to which sentiment data, derived from social media platforms such as X (formerly Twitter), can enhance the accuracy of Bitcoin price predictions. A key idea in the study is that public sentiment, as expressed on social media, affects Bitcoin’s market price. The research uses linear regression models that combine Bitcoin’s opening prices with sentiment scores from social media to forecast closing prices. The analysis covers the period from January 2012 to December 2019. Sentiment scores were computed using the VADER and TextBlob lexicons. The empirical findings show that models incorporating sentiment scores enhance predictive accuracy. For example, incorporating daily average sentiment scores (v_avg and B_avg) into the models reduced the Mean Squared Error (MSE) from 81184 to 81129 and improved other metrics such as Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), particularly at specific lag times such as 8 and 76 days. These results emphasize the potential of sentiment analysis to improve financial forecasting models. However, the study also acknowledges limitations related to the scope of the data and the complexities of accurately measuring sentiment. Future research is encouraged to explore more sophisticated models and diverse data sources to further enhance and validate the integration of sentiment analysis in financial forecasting.
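A minimal sketch of the sentiment-augmented regression described above: a linear model predicting the closing price from the opening price plus a daily average VADER compound score. The posts and prices are synthetic, and the use of the vaderSentiment package is an assumption about tooling, not the thesis's exact setup.

```python
# Sketch: linear regression of closing price on opening price + daily
# average VADER sentiment. Data below is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
posts_by_day = [
    ["bitcoin to the moon", "great day for BTC"],
    ["selling everything", "market looks weak"],
    ["steady gains today", "holding long term"],
    ["regulation fears grow", "prices sliding fast"],
]
# Daily average of the VADER compound score (a stand-in for v_avg).
daily_sentiment = [np.mean([analyzer.polarity_scores(p)["compound"] for p in posts])
                   for posts in posts_by_day]

open_prices = np.array([42000.0, 41500.0, 41800.0, 41200.0])
close_prices = np.array([42350.0, 41100.0, 42050.0, 40800.0])

X = np.column_stack([open_prices, daily_sentiment])
model = LinearRegression().fit(X, close_prices)
predicted_close = model.predict(X)
```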
Item Restricted
Towards Cost-Effective Noise-Resilient Machine Learning Solutions (University of Georgia, 2026-06-04)
Gharawi, Abdulrahman Ahmed; Ramaswamy, Lakshmish
Machine learning models have demonstrated exceptional performance in various applications as a result of the emergence of large labeled datasets. Although many datasets are available, acquiring high-quality labeled datasets is challenging, since it involves extensive human supervision or expert annotation, which is extremely labor-intensive and time-consuming. The problem is magnified by the considerable amount of label noise present in datasets from real-world scenarios, which significantly undermines the accuracy of machine learning models. Since noisy datasets can affect the performance of machine learning models, acquiring high-quality datasets without label noise becomes a critical problem. However, it is challenging to significantly decrease label noise in real-world datasets without hiring expensive expert annotators. Based on extensive testing and research, this dissertation examines the impact of different levels of label noise on the accuracy of machine learning models. It also investigates ways to cut labeling expenses without sacrificing required accuracy. Finally, to enhance the robustness of machine learning models and mitigate the pervasive issue of label noise, we present a novel, cost-effective approach called Self Enhanced Supervised Training (SEST).

Item Restricted
Detecting Flaky Tests Without Rerunning Tests (George Mason University, 2024-07-26)
Alshammari, Abdulrahman Turqi; Lam, Wing; Ammann, Paul
A critical component of modern software development practices, particularly continuous integration (CI), is the halt of development activities in response to test failures, which require further investigation and debugging. As software changes, regression testing becomes vital to verify that new code does not affect existing functionality. However, this process is often delayed by the presence of flaky tests—those that yield inconsistent results on the same codebase, alternating between pass and fail. Test flakiness challenges trust in testing outcomes and undermines the reliability of the CI process. The typical approach to identifying flaky tests has involved executing them multiple times; if a test yields both passing and failing results without any modifications to the codebase, it is flaky, as discussed by Luo et al. in their empirical study. This approach, while straightforward, can be resource-intensive and time-consuming, resulting in considerable overhead costs for development teams. Moreover, this technique might not consistently reveal flakiness in tests that behave differently across execution environments. Given these challenges, the research community has been actively seeking more efficient and reliable alternatives to the repetitive execution of tests for flakiness detection. These explorations aim to uncover methods that can accurately detect flaky tests without the need for multiple reruns, thereby reducing the time and resources required for testing. This dissertation addresses three principal dimensions of test flakiness. First, it presents a machine learning classifier designed to detect which tests are flaky, based on previously detected flaky tests. Second, it proposes three deduplication-based approaches to assist developers in determining whether a test failure is due to flakiness. Third, it highlights the impact of test flakiness on other testing activities (particularly mutation testing) and discusses how to mitigate those effects. The dissertation explores the detection of test flakiness by conducting an empirical study on the limitations of rerunning tests as a method for identifying flaky tests, which results in a large dataset of flaky tests. This dataset is then utilized to develop FlakeFlagger, a machine learning classifier designed to automatically predict the likelihood of a test being flaky through static and dynamic analysis. The objective is to leverage FlakeFlagger to identify flaky tests without the need for reruns by detecting patterns and symptoms common among previously identified flaky tests. In addressing the challenge of detecting whether a failure is due to flakiness, this dissertation demonstrates how developers can better manage flaky tests within their test suites. It proposes three deduplication-based methods to help developers determine whether a specific failure is genuinely flaky or not. Furthermore, the dissertation discusses the effects of test flakiness on mutation testing, a critical activity for assessing the quality of test suites. It includes an extensive rerun experiment on the mutation analysis of flaky tests identified earlier in the study, highlighting the significant impact of flaky tests on the validity of mutation testing.
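A minimal sketch of a FlakeFlagger-style predictor: a classifier trained on features of previously labeled flaky and non-flaky tests, so that new tests can be flagged without reruns. The feature set and data below are hypothetical illustrations, not FlakeFlagger's actual features or dataset.

```python
# Sketch of a classifier that predicts test flakiness from static/dynamic
# test features, trained on previously labeled flaky tests. Features are
# hypothetical stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n_tests = 500
# Hypothetical features: run time, covered lines, touches network,
# reads system clock, number of assertions.
X = np.column_stack([
    rng.exponential(2.0, n_tests),       # run time (s)
    rng.integers(10, 2000, n_tests),     # covered lines
    rng.integers(0, 2, n_tests),         # touches network I/O
    rng.integers(0, 2, n_tests),         # reads system clock
    rng.integers(1, 30, n_tests),        # assertion count
])
y = rng.integers(0, 2, n_tests)          # 1 = previously observed as flaky

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```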
Item Restricted
A Deep Learning Framework for Blockage Mitigation in mmWave Wireless (Portland State University, 2024-05-28)
Almutairi, Ahmed; Aryafar, Ehsan
Millimeter-Wave (mmWave) communication is a key technology to enable next-generation wireless systems. However, mmWave systems are highly susceptible to blockages, which can lead to a substantial decrease in signal strength at the receiver. Identifying blockages and mitigating them is thus a key challenge in achieving next-generation wireless technology goals, such as enhanced mobile broadband (eMBB) and Ultra-Reliable and Low-Latency Communication (URLLC). This thesis proposes several deep learning (DL) frameworks for mmWave wireless blockage detection, mitigation, and duration prediction. First, we propose a DL framework to address the problem of identifying whether the mmWave wireless channel between two devices (e.g., a base station and a client device) is Line-of-Sight (LoS) or non-Line-of-Sight (nLoS). Specifically, we show that existing beamforming training messages that are exchanged periodically between mmWave wireless devices can also be used in a DL model to solve the channel classification problem with no additional overhead. We extend this DL framework by developing a transfer learning model (t-LNCC) that is trained on simulated data and can successfully solve the channel classification problem on any commercial-off-the-shelf (COTS) mmWave device with or without any real-world labeled data. The second part of the thesis leverages our channel classification mechanism from the first part and introduces new DL frameworks to mitigate the negative impacts of blockages. Previous research on blockage mitigation has introduced several model- and protocol-based solutions that focus on one technique at a time, such as handoff to a different base station or beam adaptation to the same base station. We go beyond those techniques by proposing DL frameworks that address the overarching problem: which blockage mitigation method should be employed, and what is the optimal sub-selection within that method? To do so, we developed two Gated Recurrent Unit (GRU) models that are trained using periodically exchanged messages in mmWave systems. Specifically, we first developed a GRU model that tackles the blockage mitigation problem in a single-antenna client wireless environment. Then, we proposed another GRU model to expand our investigation to more complex scenarios where both base stations and clients are equipped with multiple antennas and collaboratively mitigate blockages. The two models are trained on datasets gathered using a commercially available mmWave simulator. Both models achieve outstanding results in selecting the optimal blockage mitigation method, with accuracy higher than 93% and 91% for single-antenna and multiple-antenna clients, respectively. We also show that the proposed methods significantly increase the amount of transferred data compared to several other blockage mitigation policies.
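A minimal sketch of a GRU-based mitigation selector in the spirit of the models described above: it consumes a sequence of features derived from periodically exchanged beamforming reports and outputs a mitigation action. The feature dimension, sequence length, and action set are assumptions for illustration, not the thesis's architecture.

```python
# Sketch of a GRU classifier over sequences of beamforming-report features
# that selects a blockage mitigation action. Dimensions are assumptions.
import torch
import torch.nn as nn

class MitigationGRU(nn.Module):
    def __init__(self, n_features=16, hidden=64, n_actions=3):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, x):               # x: (batch, time, n_features)
        _, h = self.gru(x)              # h: (1, batch, hidden)
        return self.head(h.squeeze(0))  # logits over mitigation actions

model = MitigationGRU()
reports = torch.randn(8, 20, 16)        # 8 clients, 20 report intervals
logits = model(reports)
action = logits.argmax(dim=-1)          # chosen mitigation action per client
```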