Privacy vs Utility inspired by GANs
Abstract
The emergence of personalised services raises a difficult problem: preserving privacy while still getting the best utility from trained models and online platforms. Several research solutions and industrial prototypes have been proposed to balance privacy and utility. This project aims to build a model, inspired by the concept of Generative Adversarial Networks (GANs), that determines the right balance between privacy and utility for any structured dataset used in supervised learning. The proposed model was evaluated on two real-world datasets: South Korea Covid-19 and Twitter User Gender Classification. To measure the privacy score, we first identified the sensitivity level of each feature using the privacy assessment model. We then selected the essential features by multiplying each feature's privacy score by its prediction weight, as obtained from feature selection methods. The model then ran several rounds, reducing the privacy score of the dataset by eliminating the most privacy-sensitive features until the pre-defined accuracy threshold was reached.
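The round-based elimination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the exact elimination rule, so the sketch assumes that each round drops the remaining feature with the highest privacy score and stops as soon as a drop would push accuracy below the threshold. The function names, the feature names, the scores, and the toy `evaluate` function are all hypothetical. In practice `evaluate` would retrain and score a real classifier on the remaining features.

```python
from typing import Callable, Dict, List, Tuple


def combined_scores(privacy: Dict[str, float], weight: Dict[str, float]) -> Dict[str, float]:
    """Multiply each feature's privacy score by its prediction weight."""
    return {name: privacy[name] * weight[name] for name in privacy}


def eliminate_features(
    privacy: Dict[str, float],
    evaluate: Callable[[List[str]], float],
    threshold: float,
) -> Tuple[List[str], float]:
    """Drop privacy-sensitive features round by round.

    Assumption: each round removes the remaining feature with the highest
    privacy score, and the loop stops when removing it would bring the
    accuracy returned by `evaluate` below `threshold`.
    """
    kept = sorted(privacy)  # deterministic starting order
    while len(kept) > 1:
        candidate = max(kept, key=lambda name: privacy[name])
        trial = [name for name in kept if name != candidate]
        if evaluate(trial) < threshold:
            break  # this drop would cost too much utility
        kept = trial
    # Dataset privacy score here is simply the sum over kept features (assumption).
    return kept, sum(privacy[name] for name in kept)


# Made-up example values, for illustration only.
privacy = {"age": 0.6, "city": 0.8, "patient_id": 1.0, "symptom": 0.2}
weight = {"age": 0.30, "city": 0.10, "patient_id": 0.05, "symptom": 0.55}


def evaluate(features: List[str]) -> float:
    # Toy stand-in for model accuracy: grows with the total weight kept.
    return 0.5 + 0.5 * sum(weight[name] for name in features)


scores = combined_scores(privacy, weight)
kept, remaining_privacy = eliminate_features(privacy, evaluate, threshold=0.9)
print(kept, round(remaining_privacy, 2))
```

With these toy values, the two most sensitive features are removed while the stand-in accuracy stays above the threshold, and the loop stops before removing a third. A different elimination order, for example one driven by the combined scores, would only require changing the `key` used to pick the candidate.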