Exploring Malnutrition in Residential Aged Care: A Study on Nursing Notes using Natural Language Processing and Large Language Models

Date

2024-03-21

Publisher

University of Wollongong

Abstract

Population ageing has led to increasing demand for services for older people. Residential aged care facilities (RACFs) in Australia provide a range of services for older people who can no longer live independently at home. These include accommodation, personal care, health care services, and social and emotional support. Despite efforts to provide comprehensive care, managing nutrition for older people in RACFs has been complex. Malnutrition has emerged as a prevalent issue within these facilities, raising serious health concerns. Therefore, understanding and addressing malnutrition has become a critical concern for the Australian government. To date, there has been a reliance on nutrition screening tools to assess older people’s nutritional care needs. Conducting these assessments requires adequate healthcare training and is time consuming, so they are not carried out as frequently as needed to uncover the risk of malnutrition in a timely manner. In Australia, the majority of RACFs have established electronic health record (EHR) systems to capture and record care recipients’ information. These records include medical diagnoses, regular nursing assessments, weight charts, care plans, periodic reviews, incident and infection reviews, and nursing progress reports. RAC EHRs therefore contain a wealth of information that can be mined to support aged care services. Advances in natural language processing (NLP) technologies, specifically large language models (LLMs), provide an opportunity to uncover useful insights from RAC EHRs. This PhD research therefore aims to extend NLP technology to the under-studied area of RAC, and to design, implement and evaluate LLM applications for nutrition management among older individuals living in RACFs. It aims to design and develop a machine learning framework capable of analysing both structured and unstructured EHR data to gain comprehensive insights into malnutrition.
Drawing on insights from the literature, the study begins by employing word embedding techniques integrated with cosine similarity and the UMLS ontology to extract nutrition-related terms from nursing notes in RACFs. This uncovered the language style and terminology used by practising nursing and aged care workers in managing nutrition for the older people under their care. The subsequent development of 13 extraction rules identified notes indicative of malnutrition, forming the basis for a training data set of 2,278 relevant nursing notes, which was used in the LLM implementation. To enhance the LLM’s understanding of nursing notes, we randomly selected 500,000 notes for pre-training a domain-specific LLM based on the established RoBERTa model. This was followed by fine-tuning the LLM specifically for malnutrition note detection. Achieving an F1-score of 0.96, our model significantly surpassed previous models, ensuring more accurate classification of notes documenting malnutrition. Furthermore, we developed a framework integrating a generative LLM, Llama 2, with a retrieval-augmented generation (RAG) system to extract comprehensive summary information from malnutrition-related notes. This framework demonstrated high accuracy (90%) in identifying malnutrition risk factors from 1,399 notes. It generated detailed summaries of nutrition status from EHRs with 99% accuracy. Our study reveals a malnutrition prevalence rate of approximately 33% in the studied RACFs. There are 15 main categories and 43 subcategories of malnutrition risk factors. For the first time, this research identified the primary risk factors of malnutrition in RACFs, the most common being poor appetite, which affects 17% of older people, followed by insufficient oral intake and dementia progression. To enhance malnutrition predictive capabilities, we fine-tuned the RAC domain-specific model to address the 512-token sequence length limitation of the RoBERTa model.
This is achieved by extending the sequence length to support 1,536 tokens. Augmented with risk factors, our model achieved an F1-score of 0.687, demonstrating its effectiveness in predicting malnutrition risk one month before the event onset. In conclusion, this research designs, develops and evaluates an innovative AI framework that leverages advanced AI technologies, particularly NLP and domain-specific LLMs, to tackle malnutrition among older people in residential aged care facilities. By analysing text data in EHRs, the AI framework identifies risk factors, summarises nutrition information, and predicts malnutrition one month before the event onset. After thorough evaluation by domain experts, the AI framework can be implemented as an automated assessment tool. Its implementation in aged care services will alleviate the time burden associated with nutrition care for health and aged care practitioners, supporting them in identifying risk factors of malnutrition for the older people under their care and in managing malnutrition efficiently. The framework’s scalability extends beyond residential aged care facilities. It can be further extended to other healthcare settings to improve nutrition care effectiveness and quality of life for consumers.
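
The term-extraction step mentioned in the abstract (word embeddings compared by cosine similarity) can be illustrated with a minimal sketch. The toy three-dimensional vectors, the seed term, the 0.8 threshold, and the function names below are assumptions made purely for illustration. The abstract does not state the embedding model, the thresholds, or how the UMLS ontology was combined with the similarity scores, so none of that is reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def related_terms(seed, embeddings, threshold=0.8):
    """Return (term, score) pairs whose similarity to `seed` meets the
    threshold, sorted from most to least similar.

    `threshold` is a hypothetical cut-off, not a value from the thesis.
    """
    seed_vec = embeddings[seed]
    scored = [
        (term, cosine_similarity(seed_vec, vec))
        for term, vec in embeddings.items()
        if term != seed
    ]
    kept = [(term, score) for term, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumed values); real embeddings would be learned
# from the nursing notes and have far more dimensions.
TOY_EMBEDDINGS = {
    "appetite": [0.9, 0.1, 0.0],
    "intake": [0.8, 0.2, 0.1],
    "weight": [0.7, 0.3, 0.0],
    "wound": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for term, score in related_terms("appetite", TOY_EMBEDDINGS):
        print(f"{term}: {score:.3f}")
```

In this toy setting, terms whose vectors point in nearly the same direction as the seed term are kept as candidate nutrition-related vocabulary, while unrelated terms fall below the cut-off. A real pipeline would still need manual or ontology-based review of the candidates.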

Keywords

Machine learning, large language models, Llama, RoBERTa, BERT, Retrieval-augmented generation (RAG), Health informatics, Malnutrition

Copyright owned by the Saudi Digital Library (SDL) © 2024