GRAPH-BASED APPROACH: BRIDGING INSIGHTS FROM STRUCTURED AND UNSTRUCTURED DATA
Date
2025
Publisher
Temple University
Abstract
Graph-based methodologies provide powerful tools for uncovering intricate relationships and patterns in complex data, enabling the integration of structured and unstructured information for insightful decision-making across diverse domains. Our research focuses on constructing graphs from structured and unstructured data, demonstrating their applications in healthcare and power systems.
In healthcare, we examine how social networks influence the attitudes of hemodialysis patients toward kidney transplantation. Using a network-based approach, we investigate how social networks within hemodialysis clinics affect patients' attitudes, contributing to a growing understanding of this dynamic. Our findings emphasize that social networks improve the performance of machine learning models, highlighting the importance of social interactions in clinical settings (Aljurbua et al., 2022). We further introduce Node2VecFuseClassifier, a graph-based model that combines patient interactions with patient characteristics. By comparing problem representations that focus on sociodemographics versus social interactions, we demonstrate that incorporating patient-to-patient and patient-to-staff interactions results in more accurate predictions. This multi-modal analysis, which merges patient experiences with staff expertise, underscores the role of social networks in influencing attitudes toward transplantation (Aljurbua et al., 2024b).
In power systems, we explore the impact of severe weather events that lead to power outages, specifically focusing on predicting weather-induced outages three hours in advance at the county level in the Pacific Northwest of the United States. By utilizing a multi-modal multiplex network that integrates data from multiple sources including weather, transmission lines, lightning, vegetation, and social media posts from two leading platforms (Twitter and Reddit), we show how multiplex networks offer valuable insights for predicting power outages. This integration of diverse data sources and network-based modeling emphasizes the importance of leveraging multiple perspectives to enhance the understanding and prediction of power disruptions (Aljurbua et al., 2023). We further present HMN-RTS, a hierarchical multiplex network that classifies disruption severity by temporal learning from integrated weather recordings and social media posts. The multiplex network layers of this framework gather information about power outages, weather, lightning, land cover, transmission lines, and social media comments. By incorporating multiplex network layers consisting of data collected over time and across regions, we demonstrate that HMN-RTS significantly improves the accuracy of predicting the duration of weather-related outages. This framework enables grid operators to make more reliable predictions up to 6 hours in advance, supporting early risk assessment and proactive mitigation (Aljurbua et al., 2024a, 2025a). Additionally, we introduce SMN-WVF, a spatiotemporal multiplex network designed to predict the duration of power outages in distribution grids. By integrating a network-based approach with multi-modal data across space and time, SMN-WVF offers a novel method for predicting disruption durations in distribution grids, enhancing decision-making and mitigation efforts while highlighting the critical role of network-based approaches in forecasting (Aljurbua et al., 2025b).
Overall, our research showcases the potential of graph-based models in tackling complex challenges in both power systems and healthcare. By combining the network-based approach with multi-modal data, we present innovative solutions for predicting power outages and understanding patient attitudes.
Description
We examined whether a patient’s position within the hemodialysis clinic social network could improve machine learning classification of the patient’s positive or negative attitude towards kidney transplantation when compared to sociodemographic and clinical variables. Hemodialysis clinic patient social networks may reinforce both positive and negative attitudes towards kidney transplantation. We conducted a cross-sectional social network survey of hemodialysis patients in two geographically and demographically different clinics.
We evaluated whether machine learning logistic regression models using sociodemographic data or network data could best predict the participants’ attitudes towards transplantation. The models integrated both structured sociodemographic data and unstructured network data, addressing the challenge of harmonizing different data types. Models were assessed for accuracy, precision, recall, and F1-score. The results show that incorporating social network data improved the machine learning algorithm’s ability to classify attitudes towards kidney transplantation, emphasizing the significant role of hemodialysis clinic social networks in shaping attitudes towards transplantation. This work underscores the potential of leveraging social network information to address challenges in patient attitude prediction and highlights the need for domain expertise in interpreting complex data sources (Aljurbua et al., 2022).
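As an illustration of how a patient's position in a clinic network might be turned into a model input, the sketch below computes degree centrality and appends it to a sociodemographic feature vector. It is a minimal example under stated assumptions: the choice of degree centrality, the function names, and the toy patients and values are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return degree / (n - 1) for every node of an undirected graph."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def add_network_feature(features, edges):
    """Append each patient's degree centrality to that patient's feature vector."""
    centrality = degree_centrality(edges)
    # Patients with no reported ties get a centrality of 0.0.
    return {pid: vec + [centrality.get(pid, 0.0)] for pid, vec in features.items()}


# Hypothetical toy data: three patients with made-up values
# (age, years on dialysis). Not data from the study.
edges = [("p1", "p2"), ("p2", "p3")]
features = {"p1": [61.0, 2.0], "p2": [54.0, 5.0], "p3": [70.0, 1.0]}
augmented = add_network_feature(features, edges)
```

The augmented vectors could then be passed to any classifier, so the same model can be trained once with and once without the network column to compare the two representations.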
In the follow-up study (Aljurbua et al., 2024b), we classify hemodialysis clinic patients into positive and negative attitudes toward transplantation. We introduce a graph-based model named Node2VecFuseClassifier that integrates both patient interactions and patient features. To emphasize the significance of social interaction, we compare the benefits of using a sociodemographic patient-centric versus a social-centric problem representation that considers patient-to-patient and patient-to-staff interactions. We believe that including the social aspect, in the form of patient-to-patient and patient-to-staff network features, enhances all machine learning models’ performance as compared to relying on patient-centric features alone. By combining patient experiences with staff expertise, the multilevel analysis enhances predictive capabilities by incorporating diverse roles such as patients and staff. Thus, we combine patient and staff networks and observe that such a multi-network approach boosts the F1-score in predicting patient attitude toward transplantation compared to using only the patient network or the staff network. The proposed Node2VecFuseClassifier, which combines Node2Vec embeddings and features, improves the accuracy of transplantation attitude prediction. Overall, our study shows that integrating the interaction between patients and staff provides beneficial insights for accurately predicting transplantation attitudes for dialysis patients.
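To make the idea of combining Node2Vec embeddings with patient features more concrete, the sketch below shows the two ingredients in isolation: a Node2Vec-style biased random walk on an unweighted graph, and a fusion step. It is not the Node2VecFuseClassifier itself. Fusion by simple concatenation is an assumption, the function names and the toy patient/staff graph are hypothetical, and the step that learns embeddings from the walks (a skip-gram model in the usual Node2Vec pipeline) is omitted.

```python
import random


def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Generate one second-order biased random walk in the style of Node2Vec.

    adj maps each node to the set of its neighbours (undirected, unweighted).
    p is the return parameter and q is the in-out parameter.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])
        if not nbrs:
            break
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move away from the previous node
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


def fuse(embedding, features):
    """Fuse a node embedding with tabular features by concatenation (an assumption)."""
    return list(embedding) + list(features)


# Hypothetical toy graph: two patients (p1, p2) and one staff member (s1).
adj = {"p1": {"p2", "s1"}, "p2": {"p1", "s1"}, "s1": {"p1", "p2"}}
walk = node2vec_walk(adj, "p1", length=5, p=0.5, q=2.0, rng=random.Random(0))
fused = fuse([0.1, 0.2], [1.0])
```

Placing patients and staff in one graph, as in the toy example, is one way a walk can cross between the two roles; the fused vector is what a downstream classifier would consume.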
While medical technology and innovations in healthcare are crucial to improving patient outcomes, advancements in other fields, such as energy and power systems, also play a significant role in our daily lives. In this study (Aljurbua et al., 2023), we investigate severe weather events that lead to power outages. Despite extensive research on using social media during disasters, little work has focused on combining social media information with power outage data. To address this limitation, we propose a novel and effective approach to enhance the prediction accuracy of weather-related power outages by learning a spatio-temporal multiplex network that integrates information on the impact of inclement weather on residents, extracted from their social media posts, with relevant weather, geographic, and grid topology data. Experiments were conducted to predict the risk of weather-related power outages three hours in advance. The results demonstrate that the proposed spatio-temporal multiplex network-based approach offers beneficial insights for predicting power outages three hours ahead at the county level.
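A multiplex network can be pictured as several layers of edges drawn over one shared set of nodes. The sketch below is a minimal data structure for that idea, with counties as nodes and one layer per data source. The class name, layer names, edge weights, and the per-layer weighted degree summary are illustrative assumptions rather than the construction used in the study, and the temporal dimension is left out for brevity.

```python
from collections import defaultdict


class MultiplexNetwork:
    """Layers share one node set; each layer keeps its own undirected weighted edges."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.layers = {}

    def add_edge(self, layer, u, v, weight=1.0):
        """Add an undirected edge between two known nodes in the given layer."""
        if u not in self.nodes or v not in self.nodes:
            raise KeyError("both endpoints must be known nodes")
        adj = self.layers.setdefault(layer, defaultdict(dict))
        adj[u][v] = weight
        adj[v][u] = weight

    def node_profile(self, node):
        """Return the node's weighted degree in every layer, keyed by layer name."""
        return {
            layer: sum(self.layers[layer].get(node, {}).values())
            for layer in sorted(self.layers)
        }


# Hypothetical counties and edge weights, for illustration only.
net = MultiplexNetwork(["A", "B", "C"])
net.add_edge("weather", "A", "B", 1.0)
net.add_edge("weather", "B", "C", 0.5)
net.add_edge("social_media", "A", "B", 2.0)
profile_b = net.node_profile("B")
```

A per-layer profile like `profile_b` is one simple way to turn a node's position in each layer into features for an outage predictor; repeating the structure per time window would give a spatio-temporal variant.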
In the follow-up studies (Aljurbua et al., 2024a, 2025a), we address the significant impact of long power outages caused by severe weather on the economy, infrastructure, and overall quality of life. The unpredictability and challenges posed by weather-related disruptions are compounded by gaps in weather recordings, which hinder early warning systems and accurate predictions. To overcome this, we introduce HMN-RTS, a hierarchical multiplex network framework that leverages temporal learning from integrated weather recordings and social media posts to classify disruption severity. This approach employs multiplex network layers that gather data across diverse domains, including power outages, weather conditions, lightning, land cover, transmission lines, and social media comments. Our findings demonstrate that HMN-RTS significantly improves the accuracy of predicting the duration of weather-related outages, achieving enhanced performance in forecasting outage severity 3 hours ahead, even for a complex five-class problem. The framework also supports grid operators in executing timely mitigation strategies by providing reliable predictions up to 6 hours ahead, ultimately enhancing the early risk assessment of weather-related disruptions.
While HMN-RTS offers a critical advancement in outage prediction for large-scale networks, accurately estimating outage durations in distribution grids presents unique challenges due to their smaller, more complex nature. To tackle these complexities, we introduce SMN-WVF (Aljurbua et al., 2025b), a spatiotemporal multiplex network designed to predict power outage durations. Unlike transmission grid predictions, which benefit from relatively straightforward data, distribution grids require the integration of more granular and multifaceted data layers to handle their intricate structure. SMN-WVF incorporates multi-modal data across both spatial and temporal dimensions, including critical layers such as power outages, weather conditions, weather forecasts, vegetation, and the distances between substations. Our study demonstrates that adding these supplementary data sources enhances the model’s predictive accuracy, reflected in continuous improvements in the macro F1 score. By emphasizing the value of multi-modal data, SMN-WVF offers a novel and effective solution for predicting the duration of disruptions in complex distribution grids, supporting better-informed decision-making and more effective mitigation measures.
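For readers unfamiliar with the metric, the macro F1 score is the unweighted mean of the per-class F1 scores, so every duration class counts equally regardless of how many outages fall into it. The sketch below computes it from scratch. The function name and the class labels in the example are hypothetical, and a class with no true or predicted instances is scored as 0.0 by convention here.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical duration classes, for illustration only.
y_true = ["short", "short", "long", "long"]
y_pred = ["short", "long", "long", "long"]
score = macro_f1(y_true, y_pred)
```

In this toy case the per-class F1 scores are 2/3 for "short" and 4/5 for "long", so the macro average is 11/15.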
Keywords
Artificial Intelligence, Machine Learning, Medical Informatics, Power Systems, Social Network Analysis.