Contributors: Ludi, Stephanie; Do, Hyunsook; Alsanousi, Bassam Jameel A
Dates: 2025-06-26; 2025-05-10
URI: https://hdl.handle.net/20.500.14154/75676
Description: This dissertation introduces the use of large language models (LLMs) for analyzing mobile app user feedback and investigates usability issues in AI-enabled mobile applications.
Abstract: The expanding use of artificial intelligence (AI) in mobile applications has intensified the need to investigate how integrated AI features affect user experience (UX). While research in this area is growing, a significant gap remains in evaluating the usability of AI-enabled apps across languages, platforms, and domains. Furthermore, analyzing large-scale user feedback remains challenging despite the automation potential of recent large language models (LLMs). Evaluating mobile apps against International Organization for Standardization (ISO) standards has shown promise for exploring UX while uncovering weaknesses and emerging issues. Accordingly, in this study, we evaluated mobile apps by analyzing user reviews through the lens of the ISO 9241-11 usability model and the ISO/International Electrotechnical Commission (IEC) 25010 quality standard. This dissertation has two main objectives. The first is to examine the performance of AI-enabled apps across different domains and platforms (iOS and Android) and in multiple languages to identify emerging usability issues. The second is to develop trustworthy automated tools, built on LLMs and the ISO standards, that improve the semantic analysis of user feedback regarding usability and software quality and thus support the handling of large amounts of data. Our results provide valuable insights into the benefits and difficulties of AI-enabled mobile apps in various domains. Through sentiment analysis, we find that users are generally positive about these apps; however, critical issues underlie the negative reviews related to AI (e.g., unclear responses, algorithmic bias, privacy concerns, voice and image recognition limitations, ethical sensitivity, and insufficient transparency in AI decision-making processes). Furthermore, the tools we developed prove effective at automatically analyzing user reviews according to the ISO standards when compared with other advanced models (e.g., GPT-4o, Llama 2, and Gemini). In addition, our research uniquely applies an interpretability technique, Local Interpretable Model-Agnostic Explanations (LIME), to develop LLMs capable of explaining their output, aiding the creation of trustworthy models. These findings give developers, app owners, and researchers insight into user perceptions of AI-enabled apps while presenting advanced strategies for automating the analysis of user reviews effectively.
Extent: 172 pages
Language: en-US
Keywords: Usability; LLMs; User experience; Sentiment analysis; ISO standard; AI-enabled mobile apps; User feedback
Title: Leveraging LLMs for the Analysis of Mobile App User Feedback: In-Depth Evaluation of User Perspectives on AI-Enabled Mobile Apps
Type: Thesis
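
Note: The abstract mentions sentiment analysis of user reviews and the use of LIME to make the review-analysis models interpretable. The following is a minimal sketch of that general technique, not the dissertation's actual tooling: it uses the lime and transformers Python packages to explain a stock sentiment classifier's prediction on a hypothetical app review. The model name, class labels, and review text are assumptions introduced only for illustration.

import numpy as np
from lime.lime_text import LimeTextExplainer
from transformers import pipeline

# Assumed off-the-shelf sentiment model; the dissertation's fine-tuned,
# ISO-aware classifiers are not publicly specified, so this stands in.
clf = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    return_all_scores=True,
)

class_names = ["NEGATIVE", "POSITIVE"]

def predict_proba(texts):
    """Return an (n_samples, n_classes) probability matrix, as LIME expects."""
    rows = clf(list(texts))
    # Each row is a list of {"label": ..., "score": ...}; order by class_names.
    return np.array(
        [[next(s["score"] for s in row if s["label"] == c) for c in class_names]
         for row in rows]
    )

explainer = LimeTextExplainer(class_names=class_names)
review = ("The AI assistant keeps giving unclear answers and the voice "
          "recognition fails with my accent.")  # hypothetical review text

# LIME perturbs the review, fits a local surrogate model, and reports which
# words push the prediction toward NEGATIVE or POSITIVE.
explanation = explainer.explain_instance(review, predict_proba, num_features=6)
print(explanation.as_list())

In a pipeline like the one the abstract describes, such word-level attributions could be inspected alongside the ISO 9241-11 and ISO/IEC 25010 labels assigned to each review to help judge whether the model's decisions are trustworthy.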