Model-Based Learning to Augment Collaborative Filtering: Prediction and Evaluation
No Thumbnail Available
Date
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Collaborative filtering (CF) is a novel statistical technique developed to retrieve useful information and to generate predictions based on provided data from users. It is fundamentally characterized by recommender systems (RSs), which have recently attracted researchers’ attention. CF can be defined as systems and software tools that automatically and effectively generate a list of recommendations of the most suitable items to a target user by predicting a user’s future ratings for unseen items. As a field of study, the dramatic evolution of machine learning solves problems that appeared in the early time of CF systems. Researchers claim that the main research focus of current CF research, especially in the big data era, is how to effectively develop models to address the problems of data sparsity and limited coverage that CF systems unexpectedly experience.
The particular objectives of this empirical research are: (1) providing a novel model based on machine learning and data mining algorithms that address the data sparsity problem in CF, (2) providing a novel similarity model based on rating alignment that results in better accuracy of rating prediction compared to the other similarity models used for CF, (3) providing a novel model to incorporate information from social network sites (SNSs) to augment CF and solve the data sparsity problem, and (4) providing academic advisory RS based on a web-based framework.
To validate the effectiveness of the proposed models, intensive experiments are conducted to compare the performance of the proposed models with the state-of-the-art CF models using four datasets collected from four popular RSs’ domains (music, jokes, books, and movies). These proposed models are computationally efficient and effectively generalized to other related fields in RSs. Results retained from evaluation metrics reveal that the proposed models can demonstrate promising prediction accuracy and improve the prediction performance better than the state-of-the-art algorithms. Furthermore, the proposed models can successfully address the data sparsity and the limited coverage problems.