Unsupervised Semantic Change Detection in Arabic

Date

2023-10-23

Publisher

Queen Mary University of London

Abstract

This study employs pretrained BERT models (AraBERT, CAMeLBERT-CA, and CAMeLBERT-MSA) to investigate semantic change in Arabic across distinct time periods. Analyzing word embeddings and cosine distance scores reveals variation in how well each model captures semantic shifts. The research highlights the significance of training data quality and diversity, while acknowledging limitations in data scope. The project's outcome, a list of the most stable and most changed words, contributes to Arabic NLP by shedding light on semantic change detection, suggesting model selection strategies and areas for future exploration.
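The approach the abstract describes, comparing contextual BERT embeddings of the same word across two time periods via cosine distance, can be illustrated with the minimal sketch below. It is not the thesis's actual pipeline: the checkpoint name (aubmindlab/bert-base-arabertv2), the helpers word_embedding and period_embedding, and the corpora old_sents and new_sents are all illustrative assumptions.

```python
# Sketch: score semantic change for a target word by comparing its
# averaged contextual embeddings from two time-period corpora.
import torch
from transformers import AutoTokenizer, AutoModel
from scipy.spatial.distance import cosine

MODEL_NAME = "aubmindlab/bert-base-arabertv2"  # an AraBERT checkpoint; swap in
# "CAMeL-Lab/bert-base-arabic-camelbert-ca" or "...-msa" to compare models.
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def word_embedding(sentence, target):
    """Average the last-hidden-state vectors of the subword tokens
    spanning `target` inside `sentence`; None if `target` is absent."""
    enc = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]   # (seq_len, hidden_dim)
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    for i in range(len(ids) - len(target_ids) + 1):  # locate the subword span
        if ids[i:i + len(target_ids)] == target_ids:
            return hidden[i:i + len(target_ids)].mean(dim=0)
    return None

def period_embedding(sentences, target):
    """Mean embedding of `target` over all sentences of one time period."""
    vecs = [word_embedding(s, target) for s in sentences]
    vecs = [v for v in vecs if v is not None]
    return torch.stack(vecs).mean(dim=0) if vecs else None

# Hypothetical usage: old_sents / new_sents hold sentences from two periods.
# old_vec = period_embedding(old_sents, "عين")
# new_vec = period_embedding(new_sents, "عين")
# score = cosine(old_vec.numpy(), new_vec.numpy())  # higher = more change
```

Ranking a vocabulary by this distance score would yield the kind of "most stable and most changed words" list the abstract reports; how the thesis actually aggregates occurrences per period is not specified here.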

Keywords

Natural Language Processing, Arabic NLP, Language Models, BERT, Data Science, Semantic Change, Unsupervised

Copyright owned by the Saudi Digital Library (SDL) © 2024