SACM - United States of America
Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9668
4 results
Search Results
Item Restricted
Deep Learning Approaches for Multivariate Time Series: Advances in Feature Selection, Classification, and Forecasting
(New Mexico State University, 2024) Alshammari, Khaznah Raghyan; Tran, Son; Hamdi, Shah Muhammad
In this work, we present the latest developments and advancements in the machine learning-based prediction and feature selection of multivariate time series (MVTS) data. MVTS data, which involves multiple interrelated time series, presents significant challenges due to its high dimensionality, complex temporal dependencies, and inter-variable relationships. These challenges are critical in domains such as space weather prediction, environmental monitoring, healthcare, sensor networks, and finance. Our research addresses these challenges by developing and implementing advanced machine-learning algorithms specifically designed for MVTS data. We introduce innovative methodologies that focus on three key areas: feature selection, classification, and forecasting. Our contributions include the development of deep learning models, such as Long Short-Term Memory (LSTM) networks and Transformer-based architectures, which are optimized to capture and model complex temporal and inter-parameter dependencies in MVTS data. Additionally, we propose a novel feature selection framework that gradually identifies the most relevant variables, enhancing model interpretability and predictive accuracy. Through extensive experimentation and validation, we demonstrate the superior performance of our approaches compared to existing methods. The results highlight the practical applicability of our solutions, providing valuable tools and insights for researchers and practitioners working with high-dimensional time series data.
This work advances the state of the art in MVTS analysis, offering robust methodologies that address both theoretical and practical challenges in this field.

Item Restricted
Network Alignment Using Topological And Node Embedding Features
(Purdue University, 2024-08) Almulhim, Aljohara; AlHasan, Mohammad
In today’s big data environment, development of robust knowledge discovery solutions depends on integration of data from various sources. For example, intelligence agencies fuse data from multiple sources to identify criminal activities; e-commerce platforms consolidate user activities on various platforms and devices to build better user profiles; scientists connect data from various modalities to develop new drugs and treatments. In all such activities, entities from different data sources need to be aligned, first to ensure accurate analysis and, more importantly, to discover novel knowledge regarding these entities. If the data sources are networks, aligning entities from different sources leads to the task of network alignment, which is the focus of this thesis. The main objective of this task is to find an optimal one-to-one correspondence among nodes in two or more networks utilizing graph topology and node and edge attributes. In existing works, diverse computational schemes have been adopted for solving the network alignment task; these schemes include finding eigen-decomposition of similarity matrices, solving quadratic assignment problems via sub-gradient optimization, and designing iterative greedy matching techniques. Contemporary works approach this problem using a deep learning framework by learning node representations to identify matches. Node matching’s key challenges include computational complexity and scalability. Moreover, privacy concerns or unavailability often prevent the utilization of node attributes in real-world scenarios.
In light of this, we aim to solve this problem by relying solely on the graph structure, without the need for prior knowledge, external attributes, or guidance from landmark nodes. Clearly, topology-based matching emerges as a hard problem when compared to other network matching tasks. In this thesis, I propose two original works to solve the topology-based network alignment task. The first work, Graphlet-based Alignment (Graphlet-Align), employs a topological approach to network alignment. Graphlet-Align represents each node with a local graphlet-count-based signature and uses it as a feature for deriving node-to-node similarity across a pair of networks. By using these similarity values in a bipartite matching algorithm, Graphlet-Align obtains a preliminary alignment. It then uses higher-order information extending to the k-hop neighborhood of a node to further refine the alignment, achieving better accuracy. We validated Graphlet-Align’s efficacy by applying it to various large real-world networks, achieving accuracy improvements ranging from 20% to 72% over state-of-the-art methods on both duplicated and noisy graphs. Expanding on this paradigm that focuses solely on topology for solving graph alignment, in my second work, I develop a self-supervised learning framework known as Self-Supervised Topological Alignment (SST-Align). SST-Align uses graphlet-based signatures for creating self-supervised node alignment labels, and then uses those labels to generate node embedding vectors of both networks in a joint space from which the node alignment task can be effectively and accurately solved. It starts with an optimization process that applies average pooling on top of the extracted graphlet signatures to construct an initial node assignment. Next, a self-supervised Siamese network architecture utilizes both the initial node assignment and graph convolutional networks to generate node embeddings through a contrastive loss.
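The preliminary alignment step described for Graphlet-Align (per-node graphlet signatures, signature similarity, then bipartite matching) can be illustrated with a minimal sketch. This is not the thesis's implementation: the three-feature signature (degree, open wedges, triangles), the Euclidean distance, and the function names `node_signatures` and `preliminary_alignment` are all assumptions made for illustration, and the k-hop refinement step is omitted.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def node_signatures(adj: np.ndarray) -> np.ndarray:
    """Toy graphlet-style signature per node: (degree, open wedges, triangles).

    Assumes a simple undirected graph: symmetric 0/1 adjacency, zero diagonal.
    """
    adj = (np.asarray(adj) > 0).astype(np.int64)
    degree = adj.sum(axis=1)
    # diag(A^3)[v] counts closed 3-walks from v; each triangle at v appears twice.
    triangles = np.diagonal(adj @ adj @ adj) // 2
    # Neighbour pairs of v that are not themselves connected.
    open_wedges = degree * (degree - 1) // 2 - triangles
    return np.stack([degree, open_wedges, triangles], axis=1)


def preliminary_alignment(adj_a: np.ndarray, adj_b: np.ndarray):
    """Match nodes of A to nodes of B by minimising total signature distance."""
    sig_a = node_signatures(adj_a).astype(float)
    sig_b = node_signatures(adj_b).astype(float)
    # cost[i, j] = Euclidean distance between signature of A's node i and B's node j.
    cost = np.linalg.norm(sig_a[:, None, :] - sig_b[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)  # optimal bipartite matching
    mapping = dict(zip(rows.tolist(), cols.tolist()))
    return mapping, float(cost[rows, cols].sum())


# A triangle (0, 1, 2) with a pendant node 3 attached to node 2.
adj_a = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]])
perm = [2, 0, 3, 1]                    # node i of B is node perm[i] of A
adj_b = adj_a[np.ix_(perm, perm)]      # relabelled copy of the same graph
mapping, total_cost = preliminary_alignment(adj_a, adj_b)
```

Nodes with identical signatures (here nodes 0 and 1 of the first graph) cannot be told apart by this step alone, which is one reason a refinement stage using wider neighborhood information is useful.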
By applying kd-tree similarity search to the two networks’ embeddings, we obtain the final node mapping. Extensive testing on real-world graph alignment datasets shows that our methodology achieves competitive node mapping accuracy compared with seven existing models. Additionally, we conduct an ablation study to evaluate the two-stage accuracy, excluding the representation learning part and comparing the mapping accuracy accordingly. This thesis enhances the theoretical understanding of topological features in the analysis of graph data for the network alignment task, hence facilitating future advancements in the field.

Item Restricted
Retrieval and Labeling of Documents Using Ontologies: Aided by a Collaborative Filtering
(2023) Alshammari, Asma Abdulkarim; Bhatnagar, Raj
Information retrieval is one of the common tasks in today’s world, and retrieval systems are aided by various text mining and analysis methods. The objective of retrieval is to obtain information resources from a collection that are relevant to a specified query. The retrieval process begins with a query provided by a user. A search engine then finds the relevant resources. Typically, the queries are formed using the same terms (words) that also occur within the resources. Situations in which a document should match terms that do not occur in it are illustrated by the following examples. We want to retrieve documents relevant to query terms that do not explicitly occur in the documents but are relevant to their contents. We want to retrieve documents using queries that contain labels from the ontology tree, and these labels may not explicitly occur in the documents. We may have a large collection of documents in an organization, and various user communities that may want to refer to the documents using their community-specific ontologies.
Several information retrieval methods use clustering of documents followed by determining signatures for each cluster describing the terms predominantly present in each of the clusters. We have designed and implemented a clustering algorithm that partitions the data space in a step-wise manner and seeks to optimize clusters that have good-quality signatures representing the documents in the clusters. The clustering algorithm is modeled on a bi-clustering strategy using the spectral co-clustering method at each step and then optimizing towards clusters that have strong representative signatures. We have shown that this clustering algorithm performs better than other known clustering algorithms such as K-Means and Latent Dirichlet Allocation (LDA). We have accomplished our goal of improving information retrieval systems’ capabilities and performance by presenting a new method to generate predicted terms for the documents by using Singular Value Decomposition (SVD) based collaborative filtering methods. We have shown that retrievals made using such recommended terms for documents retrieve correct documents with reasonably high accuracy. In addition, including predicted terms in the clustering process improves the purity of clusters and the quality of retrieval. We have achieved our goal of integrating ontological labels with information retrieval by adding terms to a document from ontologies and using a collaborative filtering approach to associate ontology labels with other relevant documents. We have tested the performance of our method with many cases of integrating ontologies: single ontology label, single large ontology with all complexities of an ontology tree, and multiple ontology trees. We have tested this method on our document collections and have obtained promising results. 
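The idea of generating predicted terms for documents with SVD-based collaborative filtering can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's method: it treats a binary document-term matrix like a user-item matrix, takes a rank-`rank` truncated SVD, and ranks each document's absent terms by their reconstructed scores. The function name `predict_terms`, the parameters, and the toy matrix are hypothetical.

```python
import numpy as np


def predict_terms(doc_term, rank: int, top_n: int) -> list:
    """For each document, return indices of absent terms with the highest low-rank scores."""
    doc_term = np.asarray(doc_term, dtype=float)
    u, s, vt = np.linalg.svd(doc_term, full_matrices=False)
    # Rank-limited reconstruction: keeps only the strongest co-occurrence patterns.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    predictions = []
    for row, present in zip(approx, doc_term > 0):
        candidates = np.flatnonzero(~present)          # terms not already in the document
        ranked = candidates[np.argsort(-row[candidates])]
        predictions.append(ranked[:top_n].tolist())
    return predictions


# Rows are documents, columns are terms; terms 0-2 and terms 3-4 form two topics.
doc_term = np.array([[1, 1, 0, 0, 0],
                     [1, 1, 1, 0, 0],
                     [0, 1, 1, 0, 0],
                     [0, 0, 0, 1, 1],
                     [0, 0, 0, 1, 0]])
predicted = predict_terms(doc_term, rank=2, top_n=1)
```

In this toy matrix, document 0 contains terms 0 and 1 but not term 2, which co-occurs with them elsewhere, so term 2 is its top predicted term. A query for term 2 could then retrieve document 0 even though the term never appears in it.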
Our method has higher performance than other existing methods.

Item Restricted
Religious Hatred in Arabic Social Media: Analysis, Detection, and Personalization
(2023-05) Albadi, Nuha; Mishra, Shivakant
Middle Eastern societies have long suffered from civil wars and domestic tensions that are partly caused by conflicting religious beliefs. This thesis examines the extent of religious hate in Arabic social media, evaluates the impact of automated accounts (i.e., bots) and personalized recommendation algorithms on its spread, and investigates social computing methods for automatically recognizing Arabic-language content and bots promoting religious hatred. First, the thesis addresses the scarcity of Arabic resources in the field by creating two publicly available, annotated Arabic datasets for Twitter and YouTube through crowdsourcing. It then presents a comprehensive analysis highlighting the prevalence of religious hatred on Arabic social networks, the most targeted religious groups, the unique characteristics of perpetrators, and the distinctions between Twitter and YouTube in terms of hate speech volume and targeted groups. Based on gathered insights, it then develops and evaluates several supervised machine learning models to automatically and efficiently detect hateful content. This thesis also contributes new insights into the role of Arabic-language bots in spreading religious hatred on Twitter and introduces a novel regression model tailored to detect Arabic-tweeting bots. Finally, the thesis audits YouTube’s recommendation algorithm to assess the effect of personalization based on demographics and watch history on the extent of hateful content recommended to users. The research presented in this thesis offers practical implications for platform designers to facilitate enforcing their policy against hate and malicious automation and contributes to the broader effort to combat online radicalization.