Machine-learning-based identification of patients with IgA nephropathy using a computerized medical billing database.
Although the billing database of the universal healthcare system in Japan potentially includes large-cohort data of patients with immunoglobulin A nephropathy, diagnosis codes aimed at billing should not be directly used for clinical research because of the risk of misdiagnosis. To solve this problem, we aim...
| Main Authors: | Ryoya Tsunoda, Keitaro Kume, Rina Kagawa, Masaru Sanuki, Hiroyuki Kitagawa, Kaori Mase, Kunihiro Yamagata |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2024-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0312915 |
| _version_ | 1850165414626066432 |
|---|---|
| author | Ryoya Tsunoda Keitaro Kume Rina Kagawa Masaru Sanuki Hiroyuki Kitagawa Kaori Mase Kunihiro Yamagata |
| author_facet | Ryoya Tsunoda Keitaro Kume Rina Kagawa Masaru Sanuki Hiroyuki Kitagawa Kaori Mase Kunihiro Yamagata |
| author_sort | Ryoya Tsunoda |
| collection | DOAJ |
| description | Although the billing database of the universal healthcare system in Japan potentially includes large-cohort data of patients with immunoglobulin A nephropathy, diagnosis codes aimed at billing should not be directly used for clinical research because of the risk of misdiagnosis. To solve this problem, we aimed to develop a novel method for identifying patients with immunoglobulin A nephropathy from billing data using machine learning. The medical records and bills of 3,743 patients who consulted nephrologists at a single center were extracted. Patients were labeled as having been diagnosed with immunoglobulin A nephropathy through a review of medical records. Both a manual analysis of diagnostic accuracy and machine learning were performed. For machine learning, the datasets were preprocessed in three patterns and input to XGBoost using five-fold cross-validation. Of all the participants, 437 were labeled as having been diagnosed with immunoglobulin A nephropathy. Bill codes for immunoglobulin A nephropathy had been assigned to approximately half of them. The manually created criteria, consisting of the examinations and treatments recommended in the Japanese guidelines for immunoglobulin A nephropathy, showed specificity and sensitivity both below 0.8. In contrast, in the receiver operating characteristic curve analysis, machine learning yielded area under the curve values over 0.9 when the data were preprocessed from a clinical viewpoint. Applying machine learning to a dataset preprocessed from a clinical viewpoint achieved high performance in detecting patients with immunoglobulin A nephropathy. This methodology contributes to the construction of disease-specific cohorts from large-scale billing data. |
| format | Article |
| id | doaj-art-d2a0c88896f0416cbd23a582fc79c39b |
| institution | OA Journals |
| issn | 1932-6203 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| title | Machine-learning-based identification of patients with IgA nephropathy using a computerized medical billing database. |
| url | https://doi.org/10.1371/journal.pone.0312915 |
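The description reports two kinds of metrics: sensitivity and specificity for the manually created criteria, and area under the receiver operating characteristic curve (AUC) for the machine learning model. The following is a minimal sketch of how such metrics can be computed, not the authors' code. The function names, the toy labels, the rule flags, and the classifier scores are all hypothetical and are not taken from the study; the XGBoost training and five-fold cross-validation steps are omitted.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, made-up data (NOT from the study).
# 1 = labeled as IgA nephropathy by chart review, 0 = not.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical output of a rule-based criterion built from billing items.
rule_flags = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
# Hypothetical predicted probabilities from a trained classifier.
model_scores = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(labels, rule_flags)
auc = roc_auc(labels, model_scores)
print(f"rule: sensitivity={sens:.2f}, specificity={spec:.2f}; model AUC={auc:.3f}")
```

The pairwise AUC definition is used here because it needs no external library and is equivalent to the area under the ROC curve; for a cohort of several thousand patients, a rank-based implementation from an established library would be the more practical choice.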