Measuring The Quality of Wikipedia Articles Among Different Topics

Aljohani, Thamer

dc.contributor.advisor	Niesen, Jitse
dc.contributor.author	Aljohani, Thamer
dc.date.accessioned	2023-11-30T12:11:55Z
dc.date.available	2023-11-30T12:11:55Z
dc.date.issued	2023-11-23
dc.description.abstract	Wikipedia, a globally known online encyclopedia, offers millions of articles across diverse topics. Its open editing policy, which allows contributions from volunteers, has made it a valuable resource. However, its reliability has been questioned, particularly in academic circles. To improve the understanding of Wikipedia’s quality, and because assessing quality under Wikipedia’s own approach is difficult, this study presents an innovative approach to evaluating article quality. The study aims to create a simple, quantifiable model based on measurable attributes, such as the length of articles, the number of references, and the number of edits. This model facilitates the calculation of article quality and the subsequent assignment of quality classifications. The proposed model achieves accuracy approximately equal to that of a random forest model, which is considered a complex model. Furthermore, the research explores variations in article quality across topics, shedding light on topics where high-quality content is prevalent and on areas that require improvement. Data was collected from the Wikipedia API, and quality assessments were made based on these measurable features. The findings indicate that Astronomy topics have a higher level of quality, while Language topics have a lower proportion of high-quality articles. These findings suggest that the attributes used to measure quality in this study are sufficient and efficient for assessing article quality on Wikipedia. Moreover, the study highlights topics such as Language, Business, or Mathematics, where experts should focus their efforts on improving articles to enhance the overall quality of content in these topics.
dc.format.extent	47
dc.identifier.uri	https://hdl.handle.net/20.500.14154/69966
dc.language.iso	en
dc.publisher	Saudi Digital Library
dc.subject	Wikipedia
dc.subject	Quality
dc.subject	Simple model
dc.subject	Random Forest
dc.subject	Accuracy
dc.subject	Classifications
dc.title	Measuring The Quality of Wikipedia Articles Among Different Topics
dc.type	Thesis
sdl.degree.department	Mathematics
sdl.degree.discipline	Data Science and Analytics
sdl.degree.grantor	University of Leeds
sdl.degree.name	Master of Science

Collections

SACM - United Kingdom
