LLMs in Geocoding: Converting Social Media Self-Described Locations into Geographic Coordinates

dc.contributor.advisor Bjorn, Ross
dc.contributor.author Yaseen, Leena
dc.date.accessioned 2025-12-02T21:56:32Z
dc.date.issued 2025
dc.description.abstract Twitter (now X) gains users every year. As it is one of the largest platforms worldwide, the amount of location-related data it carries has grown, yet only a small fraction of posts are geotagged. Traditional geocoding systems and methods struggle to read social media text because of its noisy and ambiguous nature. However, recent advances in AI and Large Language Models (LLMs) have influenced the geospatial field by demonstrating strong potential for predicting geoinformation. Hybrid approaches combining LLMs and gazetteers in particular have gained significant interest. This dissertation explores building a hybrid pipeline that integrates LLMs with a social network community voting mechanism to create an end-to-end geocoding system that predicts geo-coordinates and improves prediction performance. We found that LLMs such as GPT-4o performed best in their base form on both coordinate and polygon prediction, reaching 55.54% accuracy for coordinates, which increased to 62.62% when integrated with the network signal. The findings encourage the integration of LLMs into hybrid geospatial approaches, particularly for incorporating free-form text into geospatial databases and for informing the development of advanced hybrid models.
dc.format.extent 42
dc.identifier.uri https://hdl.handle.net/20.500.14154/77281
dc.language.iso en_US
dc.publisher Saudi Digital Library
dc.subject Large Language Models (LLM)
dc.subject Geocoding
dc.subject Social Media Analysis
dc.title LLMs in Geocoding: Converting Social Media Self-Described Locations into Geographic Coordinates
dc.type Thesis
sdl.degree.department Informatics
sdl.degree.discipline Data Science
sdl.degree.grantor The University of Edinburgh
sdl.degree.name Masters of Data Science

Files

Original bundle

Name: SACM-Dissertation.pdf
Size: 4.55 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed to upon submission

Copyright owned by the Saudi Digital Library (SDL) © 2026