Natural Language Processing for Enhanced Clinical Decision Support in Allergy Verification for Medication Prescriptions

Bibliographic Details
Main Authors: Juan Pablo Botero-Aguirre, MS, Michael Andrés García-Rivera, MS
Format: Article
Language: English
Published: Elsevier 2025-09-01
Series: Mayo Clinic Proceedings: Digital Health
Online Access: http://www.sciencedirect.com/science/article/pii/S2949761225000513
Description
Summary: Objective: To develop and validate a named entity recognition (NER) model, based on a BERT model trained on Spanish-language corpora, for extracting allergy-related information from unstructured electronic health records. Patients and Methods: The model was fine-tuned using 16,176 manually annotated allergy-related entities from anonymized records of patients hospitalized between January 1, 2021, and June 30, 2024. The data set was divided into training (80%) and testing (20%) subsets, and model performance was evaluated using accuracy, recall, and F1 score. The validated model was then applied to a separate data set of 80,917 medication prescriptions from 5859 hospitalized patients with at least one prescribed medication (August and September 2024) to detect potential prescription errors. Sensitivity, specificity, and Cohen κ were calculated using manual expert review as the gold standard. Results: The model achieved an accuracy of 87.28% and an F1 score of 0.80. It effectively identified medication names (F1=0.91) and adverse reactions (F1=0.85) but struggled with recommendation-related entities (F1=0.29). The model detected prescription errors in 0.96% of cases, with a sensitivity of 75.73% and a specificity of 99.98%. The weighted κ score (0.7797) indicated substantial agreement with expert annotations. Conclusion: The NER model, built on a BERT-based model trained on Spanish-language corpora, demonstrated strong performance in identifying nonallergic cases (specificity, 99.98%; negative predictive value, 99.97%) and showed promise for clinical decision support. Despite moderate sensitivity (75.73%), these results highlight the feasibility of using Spanish-language NER models to enhance medication safety.
ISSN: 2949-7612
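
Illustrative note: the summary reports sensitivity, specificity, negative predictive value, and Cohen κ against expert review. The sketch below shows how those quantities are conventionally derived from a 2x2 confusion matrix. It is not the authors' code. The class and function names are hypothetical, the counts are invented for illustration and are not taken from the study, and the κ shown is the plain (unweighted) Cohen κ for binary labels, which may differ from the weighting scheme used in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts of model flags versus expert review (hypothetical container)."""
    tp: int  # model flagged an error, expert agreed
    fp: int  # model flagged an error, expert disagreed
    fn: int  # model missed an error the expert found
    tn: int  # neither model nor expert flagged an error


def sensitivity(c: ConfusionCounts) -> float:
    # Share of expert-confirmed errors that the model detected.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Share of expert-confirmed non-errors that the model left unflagged.
    return c.tn / (c.tn + c.fp)


def negative_predictive_value(c: ConfusionCounts) -> float:
    # Share of unflagged prescriptions that were truly error-free.
    return c.tn / (c.tn + c.fn)


def cohen_kappa(c: ConfusionCounts) -> float:
    # Unweighted Cohen kappa: observed agreement corrected for chance agreement.
    n = c.tp + c.fp + c.fn + c.tn
    observed = (c.tp + c.tn) / n
    expected = (
        (c.tp + c.fp) * (c.tp + c.fn) + (c.fn + c.tn) * (c.fp + c.tn)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Invented counts for illustration only; NOT data from the study.
example = ConfusionCounts(tp=30, fp=2, fn=10, tn=958)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.4f}")
    print(f"specificity = {specificity(example):.4f}")
    print(f"NPV         = {negative_predictive_value(example):.4f}")
    print(f"kappa       = {cohen_kappa(example):.4f}")
```

When errors are rare, as in the summary (0.96% of cases), specificity and negative predictive value can be very high even with moderate sensitivity, which is why κ is reported alongside them as a chance-corrected measure of agreement.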