Semantics-driven improvements in electronic health records data quality: a systematic review
Abstract Background Data quality (DQ) of electronic health record (EHR) is crucial for the advancement of health informatization, yet it remains a significant challenge. Scholars are showing a growing interest in leveraging semantic technologies to enhance EHR data quality. However, previous studies...
| Main Authors: | Yirong Wu, Mudan Ren, Na Chen, Liu Yang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-08-01 |
| Series: | BMC Medical Informatics and Decision Making |
| Subjects: | Semantic; Ontology; Data quality (DQ); Electronic health record (EHR) |
| Online Access: | https://doi.org/10.1186/s12911-025-03146-w |
| _version_ | 1849333058293989376 |
|---|---|
| author | Yirong Wu; Mudan Ren; Na Chen; Liu Yang |
| author_facet | Yirong Wu; Mudan Ren; Na Chen; Liu Yang |
| author_sort | Yirong Wu |
| collection | DOAJ |
| description | Abstract. Background: The data quality (DQ) of electronic health records (EHRs) is crucial for the advancement of health informatization, yet it remains a significant challenge. Scholars are showing a growing interest in leveraging semantic technologies to enhance EHR data quality. However, previous studies have focused predominantly on specific semantic technologies, scenarios, or objectives (such as interoperability), often overlooking the potential of various semantic technologies across different scenarios. Objective: This systematic review aimed to explore the potential of employing a range of semantic technologies to improve EHR data quality in a broader spectrum of application scenarios. Methods: Our systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Three databases were searched: PubMed, IEEE Xplore, and Web of Science Core Collection. The search terms were “Semantic*”, “Quality”, “Electronic Health Record*”, “EHR*”, “Electronic Medical Record*”, and “EMR*”. These terms were combined with Boolean operators to formulate multiple search queries. Results: Thirty-seven papers published between 2008 and 2024 that met the inclusion criteria were analyzed. Six semantic technologies were identified as instrumental in improving EHR DQ: EHR standardization, controlled vocabulary, ontology, semantic web, knowledge graph, and natural language processing (NLP). These technologies were further mapped to 16 core data quality indicators and the FAIR principles (Findable, Accessible, Interoperable, and Reusable), highlighting their contributions across both technical and governance dimensions. Conclusions: The six identified semantic technologies can be categorized into three levels: foundational, general, and advanced. These technologies show significant potential in enhancing EHR DQ, particularly in the areas of conformance, portability, usability, and applicability, and they are suitable for a variety of contexts beyond interoperability, consistent with FAIR-aligned best practices in data management and reuse. |
| format | Article |
| id | doaj-art-1c458bc28c0a45c580a29fa13e06e073 |
| institution | Kabale University |
| issn | 1472-6947 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Medical Informatics and Decision Making |
| spelling | doaj-art-1c458bc28c0a45c580a29fa13e06e073; 2025-08-20T03:46:00Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2025-08-01; vol. 25, no. 1, pp. 1-18; 10.1186/s12911-025-03146-w; Semantics-driven improvements in electronic health records data quality: a systematic review; Yirong Wu (Institute of Advanced Studies in Humanities and Social Sciences, Beijing Normal University); Mudan Ren (School of Government, Beijing Normal University); Na Chen (School of Government, Beijing Normal University); Liu Yang (School of Government, Beijing Normal University); Abstract. Background: The data quality (DQ) of electronic health records (EHRs) is crucial for the advancement of health informatization, yet it remains a significant challenge. Scholars are showing a growing interest in leveraging semantic technologies to enhance EHR data quality. However, previous studies have focused predominantly on specific semantic technologies, scenarios, or objectives (such as interoperability), often overlooking the potential of various semantic technologies across different scenarios. Objective: This systematic review aimed to explore the potential of employing a range of semantic technologies to improve EHR data quality in a broader spectrum of application scenarios. Methods: Our systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Three databases were searched: PubMed, IEEE Xplore, and Web of Science Core Collection. The search terms were “Semantic*”, “Quality”, “Electronic Health Record*”, “EHR*”, “Electronic Medical Record*”, and “EMR*”. These terms were combined with Boolean operators to formulate multiple search queries. Results: Thirty-seven papers published between 2008 and 2024 that met the inclusion criteria were analyzed. Six semantic technologies were identified as instrumental in improving EHR DQ: EHR standardization, controlled vocabulary, ontology, semantic web, knowledge graph, and natural language processing (NLP). These technologies were further mapped to 16 core data quality indicators and the FAIR principles (Findable, Accessible, Interoperable, and Reusable), highlighting their contributions across both technical and governance dimensions. Conclusions: The six identified semantic technologies can be categorized into three levels: foundational, general, and advanced. These technologies show significant potential in enhancing EHR DQ, particularly in the areas of conformance, portability, usability, and applicability, and they are suitable for a variety of contexts beyond interoperability, consistent with FAIR-aligned best practices in data management and reuse.; https://doi.org/10.1186/s12911-025-03146-w; Semantic; Ontology; Data quality (DQ); Electronic health record (EHR) |
| spellingShingle | Yirong Wu; Mudan Ren; Na Chen; Liu Yang; Semantics-driven improvements in electronic health records data quality: a systematic review; BMC Medical Informatics and Decision Making; Semantic; Ontology; Data quality (DQ); Electronic health record (EHR) |
| title | Semantics-driven improvements in electronic health records data quality: a systematic review |
| title_full | Semantics-driven improvements in electronic health records data quality: a systematic review |
| title_fullStr | Semantics-driven improvements in electronic health records data quality: a systematic review |
| title_full_unstemmed | Semantics-driven improvements in electronic health records data quality: a systematic review |
| title_short | Semantics-driven improvements in electronic health records data quality: a systematic review |
| title_sort | semantics driven improvements in electronic health records data quality a systematic review |
| topic | Semantic; Ontology; Data quality (DQ); Electronic health record (EHR) |
| url | https://doi.org/10.1186/s12911-025-03146-w |
| work_keys_str_mv | AT yirongwu semanticsdrivenimprovementsinelectronichealthrecordsdataqualityasystematicreview AT mudanren semanticsdrivenimprovementsinelectronichealthrecordsdataqualityasystematicreview AT nachen semanticsdrivenimprovementsinelectronichealthrecordsdataqualityasystematicreview AT liuyang semanticsdrivenimprovementsinelectronichealthrecordsdataqualityasystematicreview |