Comparative analysis of generative LLMs for labeling entities in clinical notes
Abstract This paper evaluates and compares different fine-tuned variations of generative large language models (LLM) in the zero-shot named entity recognition (NER) task for the clinical domain. As part of the 8th Biomedical Linked Annotation Hackathon, we examined Llama 2 and Mistral models, includ...
| Main Authors: | Rodrigo del Moral-González, Helena Gómez-Adorno, Orlando Ramos-Flores |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BioMed Central, 2025-02-01 |
| Series: | Genomics & Informatics |
| Subjects: | Zero-shot; Named entity recognition; Generative language models; Clinical domain; BLAH8 |
| Online Access: | https://doi.org/10.1186/s44342-024-00036-x |
| _version_ | 1850243091659751424 |
|---|---|
| author | Rodrigo del Moral-González; Helena Gómez-Adorno; Orlando Ramos-Flores |
| author_sort | Rodrigo del Moral-González |
| collection | DOAJ |
| description | Abstract: This paper evaluates and compares fine-tuned variants of generative large language models (LLMs) on zero-shot named entity recognition (NER) in the clinical domain. As part of the 8th Biomedical Linked Annotation Hackathon, we examined Llama 2 and Mistral models, including base versions and versions fine-tuned for code, chat, and instruction-following tasks. We assess both the number of correctly identified entities and the models’ ability to return entities in structured formats. For the evaluation, we used a publicly available set of clinical cases labeled with mentions of diseases, symptoms, and medical procedures. Results show that instruction fine-tuned models recognize entities better than chat fine-tuned and base models, and that models perform better when simple output structures are requested. |
| format | Article |
| id | doaj-art-5300776c0ed1487bbdc52fac13928cd7 |
| institution | OA Journals |
| issn | 2234-0742 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | BioMed Central |
| record_format | Article |
| series | Genomics & Informatics |
| spelling | doaj-art-5300776c0ed1487bbdc52fac13928cd7; 2025-08-20T02:00:06Z; eng; BioMed Central; Genomics & Informatics; 2234-0742; 2025-02-01; 23118; 10.1186/s44342-024-00036-x; Comparative analysis of generative LLMs for labeling entities in clinical notes; Rodrigo del Moral-González (Posgrado en Ciencia e Ingeniería de la Computación, Universidad Nacional Autónoma de México); Helena Gómez-Adorno (Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México); Orlando Ramos-Flores (Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México); https://doi.org/10.1186/s44342-024-00036-x; Zero-shot; Named entity recognition; Generative language models; Clinical domain; BLAH8 |
| title | Comparative analysis of generative LLMs for labeling entities in clinical notes |
| topic | Zero-shot; Named entity recognition; Generative language models; Clinical domain; BLAH8 |
| url | https://doi.org/10.1186/s44342-024-00036-x |