ENHANCED CRIMINAL BEHAVIOR DETECTION IN MOBILE PHONE DATA USING T-DBSCAN, CGAN, AND SEMI-SUPERVISED SELF-TRAINING

Date

2025-07

Publisher

Saudi Digital Library

Abstract

With the rapid expansion of mobile telecommunication networks and the widespread adoption of smartphones, vast amounts of mobile phone data are being generated. These data contain digital traces that support the analysis of criminal activities and the detection of suspicious behavior from calling and mobility patterns, both of which are widely used in criminal investigations. However, the inherent characteristics of mobile phone data, such as sparsity, noise, missing values, and lack of labeling, make it difficult to cluster crucial locations (stay points) accurately and compromise the precision of the spatiotemporal information that is vital for modeling criminal mobility patterns. These challenges are exacerbated by variations in data formats and by gaps caused by low network coverage and privacy constraints. Consequently, analyses are often limited to a single behavioral aspect, such as mobility or calling patterns, which hinders the ability to capture temporal variations in criminal behavior.

To address these challenges, this research proposes t-DBSCAN (Temporal Density-Based Spatial Clustering of Applications with Noise), an enhanced version of DBSCAN. The method integrates mobile phone data with crime data to improve the clustering and identification of stay points, which in turn supports the modeling of critical activities such as crime hotspots and residential behavior. Compared with the baseline methods DBSCAN, K-means, and OPTICS (Ordering Points to Identify the Clustering Structure), the proposed t-DBSCAN achieves superior clustering performance: a Silhouette Index (SI) of 0.9889 and a Davies-Bouldin Index (DBI) of 0.0213 for clustering home locations, and an SI of 0.8136 and a DBI of 0.1671 for identifying suspicious activities. These values indicate high intra-cluster cohesion and strong inter-cluster separation.
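The idea of adding a temporal constraint to density-based clustering can be illustrated with a minimal sketch. This is not the t-DBSCAN algorithm proposed in this research, whose details are not given in the abstract. It is a plain DBSCAN in which two records count as neighbours only if they are close in both space and time. The function names, the thresholds `eps_m`, `eps_s`, and `min_pts`, and the sample coordinates are all hypothetical.

```python
from math import radians, sin, cos, asin, sqrt

NOISE = -1  # label for records that belong to no cluster


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(a))


def neighbours(points, i, eps_m, eps_s):
    """Indices of records within eps_m metres AND eps_s seconds of record i."""
    lat, lon, t = points[i]
    return [
        j
        for j, (la, lo, tj) in enumerate(points)
        if abs(tj - t) <= eps_s and haversine_m((lat, lon), (la, lo)) <= eps_m
    ]


def spatiotemporal_dbscan(points, eps_m=200.0, eps_s=1800.0, min_pts=3):
    """Cluster (lat, lon, unix_seconds) records; thresholds are illustrative."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(points, i, eps_m, eps_s)
        if len(seeds) < min_pts:  # the count includes record i itself
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border record: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(points, j, eps_m, eps_s)
            if len(nbrs) >= min_pts:  # core record: keep growing the cluster
                queue.extend(nbrs)
    return labels


# Made-up records: two visits to the same place ten hours apart, plus one outlier.
records = [
    (24.7136, 46.6753, 0), (24.7137, 46.6753, 600), (24.7136, 46.6754, 1200),
    (24.7136, 46.6753, 36000), (24.7137, 46.6753, 36600), (24.7136, 46.6754, 37200),
    (24.9000, 46.9000, 5000),
]
labels = spatiotemporal_dbscan(records)
print(labels)
```

In this toy input a purely spatial DBSCAN would merge the first six records into one cluster, whereas the temporal threshold separates them into two distinct stays at the same location.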
Additionally, a Conditional Generative Adversarial Network (cGAN) is introduced to address missing values in mobile phone data by reconstructing user profiles that integrate both mobility and calling features. The proposed cGAN is evaluated against baseline probabilistic and deep generative methods using the Jensen-Shannon Divergence (JSD) and Cosine Similarity (CS) metrics. It achieves a JSD of 0.0062 for stay duration, 0.0940 for trajectory length, and 0.1373 for visit frequency, significantly outperforming all baseline methods (JSD > 0.25). It also records the highest CS scores, 0.8521 for the spatial distribution and 0.7950 for the spatiotemporal distribution, surpassing deep generative models (CS ≤ 0.82) and probabilistic methods (CS < 0.63).

Furthermore, a Semi-Supervised Self-Training (SSST) approach is employed to enhance criminal behavior detection by leveraging both labeled and unlabeled data. The proposed SSST, implemented with base classifiers such as Random Forest (SSST-RF) and Decision Tree (SSST-DT), achieves higher classification accuracy than the corresponding supervised baselines. Specifically, SSST-RF and SSST-DT achieve average accuracies of 87.82% and 82.91%, respectively, outperforming their baseline counterparts by 5.3% and 4.7%.

In conclusion, this research meets its objectives and underscores the potential of mobile phone data mining as a tool for enhancing crime pattern detection, criminal identification, and crime prevention strategies.
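For readers unfamiliar with the two metrics used to evaluate the cGAN, the following minimal sketch shows how the Jensen-Shannon Divergence and Cosine Similarity can be computed for a pair of histograms. It is an illustration under stated assumptions, not the evaluation code used in this research. The base-2 logarithm (which bounds JSD between 0 and 1) is an assumption, since the abstract does not state the base, and the function names and sample histograms are hypothetical.

```python
from math import log2, sqrt


def normalise(hist):
    """Scale non-negative counts so that they sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence between two histograms; 0 means identical."""
    p, q = normalise(p), normalise(q)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1 means the same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical binned stay durations: observed data versus generated data.
real_hist = [30, 45, 15, 10]
generated_hist = [28, 47, 14, 11]
print(jensen_shannon_divergence(real_hist, generated_hist))
print(cosine_similarity(real_hist, generated_hist))
```

A lower JSD means the generated distribution is closer to the observed one, while a CS closer to 1 means the two vectors are more closely aligned.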

Keywords

Mobile Phone Data, Call Detail Records, Crime Patterns, Criminal Activities, Criminal Behavior Detection, t-DBSCAN, Conditional Generative Adversarial Network, Semi-Supervised Self-Training.

Copyright owned by the Saudi Digital Library (SDL) © 2025