THE APPLICATION OF UNSUPERVISED MACHINE LEARNING TECHNIQUE IN CHARACTERIZING DENGUE AND MALARIA ON TWITTER DATA FOR EFFECTIVE HEALTH COMMUNICATION PLANNING
Abstract
Public health surveillance is an important approach for monitoring the spread of infectious diseases and deploying rapid responses when there are signs of an emerging epidemic. In contemporary health communication, disease surveillance has therefore become a significant method for predicting disease outbreaks from social media posts. Although several scholars have used supervised learning techniques to handle very large datasets and make valid decisions about a disease, the huge volume of data combined with the small amount of ground truth available to supervised learning schemes remains a serious obstacle to identifying complex diseases that share similar clinical presentations, such as dengue and malaria. This study therefore aims to solve the problem of characterizing dengue and malaria illness in Twitter messages, using unsupervised learning techniques to address this research gap. To do so, the study collected tweets relevant to the two diseases and compared several unsupervised learning techniques (i.e., clustering algorithms) to characterize the diseases effectively from their textual context. The findings revealed that the Simple Means algorithm was the best of the compared algorithms for grouping tweets that share similar features, such as those about dengue and malaria. The text analysis results indicate that some of the tweets described the dengue lifecycle. The study also contributes to the analysis of disease surveillance systems and demonstrates the efficiency of using unsupervised learning techniques for characterizing topics associated with dengue and malaria from users’ messages on Twitter. The proposed mechanism can therefore be applied to major health-related issues such as disease recognition.