A novel machine learning methodology for the systematic extraction of chronic kidney disease comorbidities from abstracts
Background: Chronic Kidney Disease (CKD) is a global health concern and is frequently underdiagnosed due to its subtle initial symptoms, contributing to increasing morbidity and mortality. A comprehensive understanding of CKD comorbidities could lead to the identification of risk-groups, more effectiv...
Main Authors: | Eszter Sághy, Mostafa Elsharkawy, Frank Moriarty, Sándor Kovács, István Wittmann, Antal Zemplényi |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2025-02-01 |
Series: | Frontiers in Digital Health |
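Subjects: | chronic kidney disease; comorbidities; systematic literature review; machine learning; active learning; named entity recognition |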
Online Access: | https://www.frontiersin.org/articles/10.3389/fdgth.2025.1495879/full |
_version_ | 1832542318483210240 |
---|---|
author | Eszter Sághy, Mostafa Elsharkawy, Frank Moriarty, Sándor Kovács, István Wittmann, Antal Zemplényi |
author_sort | Eszter Sághy |
collection | DOAJ |
description | Background: Chronic Kidney Disease (CKD) is a global health concern and is frequently underdiagnosed due to its subtle initial symptoms, contributing to increasing morbidity and mortality. A comprehensive understanding of CKD comorbidities could lead to the identification of risk-groups, more effective treatment and improved patient outcomes. Our research presents a two-fold objective: developing an effective machine learning (ML) workflow for text classification and entity relation extraction, and assembling a broad list of diseases influencing CKD development and progression. Methods: We analysed 39,680 abstracts with CKD in the title from the Embase library. Abstracts about a disease affecting CKD development and/or progression were selected by multiple ML classifiers trained on a human-labelled sample. The best classifier was further trained with active learning. The disease names in question were extracted from the selected abstracts using a novel entity relation extraction methodology. The resulting disease list and the corresponding abstracts were manually checked and a final disease list was created. Findings: The SVM model gave the best results and was chosen for further training with active learning. This optimised ML workflow enabled us to discern 68 comorbidities across 15 ICD-10 disease groups contributing to CKD progression or development. Reading the ML-selected abstracts showed that some diseases have a direct causal effect on CKD, while others, like schizophrenia, have an indirect causal effect on CKD. Interpretation: These findings have the potential to guide future CKD investigations by facilitating the inclusion of a broader array of comorbidities in CKD prognostic models. Ultimately, our study enhances understanding of prognostic comorbidities and supports clinical practice by enabling improved patient monitoring, preventive strategies, and early detection for individuals at higher risk of CKD development or progression. |
format | Article |
id | doaj-art-8e032fe720da47e0a4557d656eb5ab39 |
institution | Kabale University |
issn | 2673-253X |
language | English |
publishDate | 2025-02-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Digital Health |
spelling | doaj-art-8e032fe720da47e0a4557d656eb5ab39; 2025-02-04T06:32:07Z; eng; Frontiers Media S.A.; Frontiers in Digital Health; 2673-253X; 2025-02-01; 7; 10.3389/fdgth.2025.1495879; 1495879; A novel machine learning methodology for the systematic extraction of chronic kidney disease comorbidities from abstracts; Eszter Sághy (Faculty of Pharmacy, University of Pécs, Pécs, Hungary); Mostafa Elsharkawy (Faculty of Sciences, University of Pécs, Pécs, Hungary); Frank Moriarty (School of Pharmacy and Biomolecular Sciences, Royal College of Surgeons in Ireland, Dublin, Ireland); Sándor Kovács (Faculty of Pharmacy, University of Pécs, Pécs, Hungary); István Wittmann (Medical School, University of Pécs, Pécs, Hungary); Antal Zemplényi (Faculty of Pharmacy, University of Pécs, Pécs, Hungary; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado, Anschutz Medical Campus, Denver, CO, United States); https://www.frontiersin.org/articles/10.3389/fdgth.2025.1495879/full; chronic kidney disease; comorbidities; systematic literature review; machine learning; active learning; named entity recognition |
title | A novel machine learning methodology for the systematic extraction of chronic kidney disease comorbidities from abstracts |
topic | chronic kidney disease comorbidities systematic literature review machine learning active learning named entity recognition |
url | https://www.frontiersin.org/articles/10.3389/fdgth.2025.1495879/full |
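
The Methods summarised in the description (classifiers trained on a human-labelled sample, with the best one, an SVM, trained further by active learning) can be illustrated with a small sketch. This is not the authors' pipeline: the record does not state their features, SVM implementation, or query strategy, so the bag-of-words features, the subgradient-descent linear SVM, the uncertainty-sampling rule, and the toy texts and labels below are all assumptions made for illustration.

```python
import random
from collections import Counter

# Toy, invented examples (NOT data from the study).
# +1 = text is about a disease affecting CKD, -1 = it is not.
LABELLED = [
    ("diabetes accelerates ckd progression", 1),
    ("hypertension increases risk of ckd development", 1),
    ("dialysis scheduling software for ckd clinics", -1),
    ("cost analysis of ckd screening programmes", -1),
]
# ORACLE stands in for the human annotator; its keys form the unlabelled pool.
ORACLE = {
    "obesity increases risk of ckd progression": 1,
    "software for ckd clinics cost": -1,
    "gout and ckd": 1,
}
POOL = list(ORACLE)


def featurize(text):
    """Bag-of-words term counts (an assumed feature choice)."""
    return Counter(text.lower().split())


def score(weights, bias, feats):
    """Linear decision value w.x + b for a sparse feature mapping."""
    return sum(weights.get(tok, 0.0) * cnt for tok, cnt in feats.items()) + bias


def train_linear_svm(labelled, epochs=300, lam=0.01, lr=0.05, seed=0):
    """Fit a linear SVM by subgradient descent on the L2-regularised hinge loss."""
    rng = random.Random(seed)
    samples = [(featurize(text), label) for text, label in labelled]
    weights, bias = {}, 0.0
    for _ in range(epochs):
        rng.shuffle(samples)
        for feats, label in samples:
            margin = label * score(weights, bias, feats)
            # Regularisation step: shrink every weight slightly.
            for tok in weights:
                weights[tok] *= 1.0 - lr * lam
            # Hinge-loss step: only examples inside the margin move the model.
            if margin < 1.0:
                for tok, cnt in feats.items():
                    weights[tok] = weights.get(tok, 0.0) + lr * label * cnt
                bias += lr * label
    return weights, bias


def predict(model, text):
    """Return +1 or -1 for a text under a trained (weights, bias) model."""
    weights, bias = model
    return 1 if score(weights, bias, featurize(text)) >= 0 else -1


def active_learning(labelled, pool, oracle, rounds=2):
    """Uncertainty sampling: label the pool item closest to the boundary, retrain."""
    labelled, pool = list(labelled), list(pool)
    model = train_linear_svm(labelled)
    for _ in range(min(rounds, len(pool))):
        weights, bias = model
        idx = min(range(len(pool)),
                  key=lambda i: abs(score(weights, bias, featurize(pool[i]))))
        text = pool.pop(idx)
        labelled.append((text, oracle(text)))  # ask the "annotator"
        model = train_linear_svm(labelled)
    return model, labelled


MODEL, FINAL_LABELLED = active_learning(LABELLED, POOL, ORACLE.__getitem__, rounds=2)
```

Calling `predict(MODEL, some_text)` returns the predicted class for a new text. In practice a library implementation of the SVM and richer text features would replace the hand-written training loop; it is written out here only to keep the sketch dependency-free.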