Machine-learning-based identification of patients with IgA nephropathy using a computerized medical billing database.

Although the billing database of the universal healthcare system in Japan potentially includes large-cohort data on patients with immunoglobulin A nephropathy, diagnosis codes assigned for billing should not be used directly for clinical research because of the risk of misdiagnosis. To solve this problem, we aim...

Bibliographic Details
Main Authors: Ryoya Tsunoda, Keitaro Kume, Rina Kagawa, Masaru Sanuki, Hiroyuki Kitagawa, Kaori Mase, Kunihiro Yamagata
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0312915
collection DOAJ
description Although the billing database of the universal healthcare system in Japan potentially includes large-cohort data on patients with immunoglobulin A nephropathy, diagnosis codes assigned for billing should not be used directly for clinical research because of the risk of misdiagnosis. To solve this problem, we aimed to develop a novel method for identifying patients with immunoglobulin A nephropathy from billing data using machine learning. The medical records and bills of 3,743 patients who consulted nephrologists at a single center were extracted. Patients were labeled as having been diagnosed with immunoglobulin A nephropathy on the basis of a review of their medical records. Both a manual analysis of diagnostic accuracy and machine learning were performed. For machine learning, the datasets were preprocessed in three patterns and passed to XGBoost with five-fold cross-validation. Of all the participants, 437 were labeled as having been diagnosed with immunoglobulin A nephropathy. Billing codes for immunoglobulin A nephropathy had been assigned to approximately half of them. The manually created criteria, which consisted of the examinations and treatments recommended in the Japanese guidelines for immunoglobulin A nephropathy, showed specificity and sensitivity both below 0.8. In contrast, in the receiver operating characteristic curve analysis, machine learning yielded area under the curve values above 0.9 when the data were preprocessed from a clinical viewpoint. Applying machine learning to a dataset preprocessed from a clinical viewpoint thus achieved high performance in detecting patients with immunoglobulin A nephropathy. This methodology contributes to the construction of disease-specific cohorts from large-scale billing data.
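The evaluation quantities named in the description (sensitivity, specificity, area under the ROC curve, five-fold splitting) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names are hypothetical, the XGBoost model itself is not reproduced, and the labels and scores are toy values chosen only to show the arithmetic.

```python
import random


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def five_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Toy data (hypothetical): 1 = labeled as IgA nephropathy, 0 = not.
labels = [1, 1, 0, 0]
rule_based = [1, 0, 0, 0]          # output of a hand-written criterion
model_scores = [0.9, 0.4, 0.5, 0.1]  # output of some scoring model

sens, spec = sensitivity_specificity(labels, rule_based)
auc = roc_auc(labels, model_scores)
folds = five_fold_indices(10)
```

In a cross-validated setting, each fold in turn serves as the held-out set while a model is fitted on the remaining four, and the metrics above are computed on the held-out predictions.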
format Article
id doaj-art-d2a0c88896f0416cbd23a582fc79c39b
institution OA Journals
issn 1932-6203
language English
publishDate 2024-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-d2a0c88896f0416cbd23a582fc79c39b | 2025-08-20T02:21:46Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2024-01-01 | 19 | 12 | e0312915 | 10.1371/journal.pone.0312915 | https://doi.org/10.1371/journal.pone.0312915
url https://doi.org/10.1371/journal.pone.0312915