Multicriteria Optimization of Language Models for Heart Failure With Preserved Ejection Fraction Symptom Detection in Spanish Electronic Health Records: Comparative Modeling Study
Abstract. Background: Heart failure with preserved ejection fraction (HFpEF) is a major clinical manifestation of cardiac amyloidosis, a condition frequently underdiagnosed due to its nonspecific symptomatology. Electronic health records (EHRs) offer a promising avenue for suppor...
| Main Authors: | Jacinto Mata, Victoria Pachón, Ana Manovel, Manuel J Maña, Manuel de la Villa |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | JMIR Publications, 2025-07-01 |
| Series: | Journal of Medical Internet Research |
| Online Access: | https://www.jmir.org/2025/1/e76433 |
| _version_ | 1849708678297419776 |
|---|---|
| author | Jacinto Mata; Victoria Pachón; Ana Manovel; Manuel J Maña; Manuel de la Villa |
| author_sort | Jacinto Mata |
| collection | DOAJ |
| description |
Abstract
Background: Heart failure with preserved ejection fraction (HFpEF) is a major clinical manifestation of cardiac amyloidosis, a condition frequently underdiagnosed due to its nonspecific symptomatology. Electronic health records (EHRs) offer a promising avenue for supporting early symptom detection through natural language processing. However, identifying relevant clinical cues within unstructured narratives, particularly in Spanish, remains a significant challenge due to the scarcity of annotated corpora and domain-specific models. This study proposes and evaluates a Transformer-based natural language processing framework for automated detection of HFpEF-related symptoms in Spanish EHRs.
Objective: The aim of this study is to assess the feasibility of leveraging unstructured clinical narratives to support early identification of heart failure phenotypes indicative of cardiac amyloidosis. It also examines how domain-specific language models and clinically guided optimization strategies can improve the reliability, sensitivity, and generalizability of symptom detection in real-world EHRs.
Methods: A novel corpus of 15,304 Spanish clinical documents was manually annotated and validated by cardiology experts. The corpus was derived from the records of 262 patients (173 with suspected cardiac amyloidosis and 89 without). In total, 8 Transformer-based language models were evaluated, including general-purpose models, biomedical-specialized variants, and Longformers. Three clinically motivated optimization strategies were implemented to align models’ behavior with different diagnostic priorities: maximizing area under the curve (AUC) to enhance overall discrimination, optimizing F1 […]
Results: All models achieved high performance, with AUC values above 0.940. The best-performing model, Longformer Biomedical-clinical […] F1 […]
Conclusions: Transformer-based models can reliably detect HFpEF-related symptoms from Spanish EHRs, even in the presence of class imbalance and substantial linguistic complexity. The results show that combining domain-specific pretraining with long-context modeling architectures and clinically aligned optimization strategies leads to substantial gains in classification performance, particularly in sensitivity. These models not only achieve high accuracy and generalization on unseen patients but also demonstrate robustness in handling the semantic nuances and narrative structure of real-world clinical documentation. These findings support the potential deployment of Transformer-based systems as effective screening tools to prioritize patients at risk for cardiac amyloidosis in Spanish-speaking health care settings. |
| format | Article |
| id | doaj-art-3b0b0ebdafb64ee58b37c6a182ef8779 |
| institution | DOAJ |
| issn | 1438-8871 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | JMIR Publications |
| record_format | Article |
| series | Journal of Medical Internet Research |
| spelling | doaj-art-3b0b0ebdafb64ee58b37c6a182ef8779; 2025-08-20T03:15:35Z; eng; JMIR Publications; Journal of Medical Internet Research; ISSN 1438-8871; 2025-07-01; vol. 27; e76433; DOI 10.2196/76433; Jacinto Mata (http://orcid.org/0000-0001-5329-9622); Victoria Pachón (http://orcid.org/0000-0003-0697-4044); Ana Manovel (http://orcid.org/0000-0001-7015-170X); Manuel J Maña (http://orcid.org/0000-0002-7551-2401); Manuel de la Villa (http://orcid.org/0000-0003-3464-2944); https://www.jmir.org/2025/1/e76433 |
| title | Multicriteria Optimization of Language Models for Heart Failure With Preserved Ejection Fraction Symptom Detection in Spanish Electronic Health Records: Comparative Modeling Study |
| url | https://www.jmir.org/2025/1/e76433 |