Semantics-driven improvements in electronic health records data quality: a systematic review

Abstract Background: The data quality (DQ) of electronic health records (EHRs) is crucial for the advancement of health informatization, yet it remains a significant challenge. Scholars are showing a growing interest in leveraging semantic technologies to enhance EHR data quality. However, previous studies...

Bibliographic Details
Main Authors: Yirong Wu, Mudan Ren, Na Chen, Liu Yang
Format: Article
Language: English
Published: BMC 2025-08-01
Series: BMC Medical Informatics and Decision Making
Subjects:
Online Access: https://doi.org/10.1186/s12911-025-03146-w
collection DOAJ
description Abstract. Background: The data quality (DQ) of electronic health records (EHRs) is crucial for the advancement of health informatization, yet it remains a significant challenge. Scholars are showing a growing interest in leveraging semantic technologies to enhance EHR data quality. However, previous studies have focused predominantly on specific semantic technologies, scenarios, or objectives, such as interoperability, often overlooking the potential of various semantic technologies across different scenarios. Objective: This systematic review aimed to explore the potential of employing a range of semantic technologies to improve EHR data quality in a broader spectrum of application scenarios. Methods: Our systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Three databases were searched: PubMed, IEEE Xplore, and Web of Science Core Collection. The search terms were “Semantic*”, “Quality”, “Electronic Health Record*”, “EHR*”, “Electronic Medical Record*”, and “EMR*”. These terms were combined via various Boolean operators to formulate multiple search queries. Results: Thirty-seven papers published between 2008 and 2024 that met the inclusion criteria were analyzed. Six semantic technologies were identified as instrumental in improving EHR DQ: EHR standardization, controlled vocabulary, ontology, semantic web, knowledge graph, and natural language processing (NLP). These technologies were further mapped to 16 core data quality indicators and the FAIR principles (Findable, Accessible, Interoperable, and Reusable), highlighting their contributions across both technical and governance dimensions. Conclusions: The six identified semantic technologies can be categorized into three levels: foundational, general, and advanced. These technologies show significant potential in enhancing EHR DQ, particularly in the areas of conformance, portability, usability, and applicability. They are suitable for a variety of contexts beyond interoperability and are consistent with FAIR-aligned best practices in data management and reuse.
format Article
id doaj-art-1c458bc28c0a45c580a29fa13e06e073
institution Kabale University
issn 1472-6947
spelling doaj-art-1c458bc28c0a45c580a29fa13e06e073; 2025-08-20T03:46:00Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2025-08-01; 251118; 10.1186/s12911-025-03146-w; Semantics-driven improvements in electronic health records data quality: a systematic review; Yirong Wu [0], Mudan Ren [1], Na Chen [2], Liu Yang [3]; [0] Institute of Advanced Studies in Humanities and Social Sciences, Beijing Normal University; [1] School of Government, Beijing Normal University; [2] School of Government, Beijing Normal University; [3] School of Government, Beijing Normal University; https://doi.org/10.1186/s12911-025-03146-w; Semantic; Ontology; Data quality (DQ); Electronic health record (EHR)
topic Semantic
Ontology
Data quality (DQ)
Electronic health record (EHR)