Fine-tuning of language models for automated structuring of medical exam reports to improve patient screening and analysis

Abstract The analysis of medical imaging reports is labour-intensive but crucial for accurate diagnosis and effective patient screening. Often presented as unstructured text, these reports require systematic organisation for efficient interpretation. This study applies Natural Language Processing (N...

Bibliographic Details
Main Authors: Luis B. Elvas, Rafaela Santos, João C. Ferreira
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-05695-6
Description: The analysis of medical imaging reports is labour-intensive but crucial for accurate diagnosis and effective patient screening. Often presented as unstructured text, these reports require systematic organisation for efficient interpretation. This study applies Natural Language Processing (NLP) techniques tailored for European Portuguese to automate the analysis of cardiology reports, streamlining patient screening. Using a methodology involving tokenization, part-of-speech tagging and manual annotation, the MediAlbertina PT-PT language model was fine-tuned, achieving 96.13% accuracy in entity recognition. The system enables rapid identification of conditions such as aortic stenosis through an interactive interface, substantially reducing the time and effort required for manual review. It also facilitates patient monitoring and disease quantification, optimising healthcare resource allocation. This research highlights the potential of NLP tools in Portuguese healthcare contexts, demonstrating their applicability to medical report analysis and their broader relevance in improving efficiency and decision-making in diverse clinical environments.
ISSN: 2045-2322
Volume: 15
Issue: 1
Pages: 1-30
Source: DOAJ
Record ID: doaj-art-b24d4fe91e674ec0b6e4ee3bcd4f7f74
Author Affiliations: Luis B. Elvas (Department of Logistics, Molde, University College); Rafaela Santos (ISTAR, Instituto Universitário de Lisboa (ISCTE-IUL)); João C. Ferreira (Department of Logistics, Molde, University College)
Keywords: Natural Language processing (NLP); Data mining; Language models; Healthcare; Named entity recognition (NER)