Natural language processing for kidney ultrasound analysis: correlating imaging reports with chronic kidney disease diagnosis

Bibliographic Details
Main Authors: Chenlu Wang, Ritwik Banerjee, Harry Kuperstein, Hamza Malick, Ruqiyya Bano, Robin L. Cunningham, Hira Tahir, Priyal Sakhuja, Janos Hajagos, Farrukh M. Koraishy
Format: Article
Language: English
Published: Taylor & Francis Group 2025-12-01
Series: Renal Failure
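Subjects: Natural language processing; deep syntactic structures; language embeddings; kidney ultrasound reports; chronic kidney disease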
Online Access: https://www.tandfonline.com/doi/10.1080/0886022X.2025.2539938
author Chenlu Wang
Ritwik Banerjee
Harry Kuperstein
Hamza Malick
Ruqiyya Bano
Robin L. Cunningham
Hira Tahir
Priyal Sakhuja
Janos Hajagos
Farrukh M. Koraishy
collection DOAJ
description Introduction: Natural language processing (NLP) has been used to analyze unstructured imaging report data, yet its application in identifying chronic kidney disease (CKD) features from kidney ultrasound reports remains unexplored. Methods: In a single-center pilot study, we analyzed 1,068 kidney ultrasound reports using NLP techniques. To identify kidney echogenicity as either “normal” or “increased,” we used two methods: one that looks at individual words and another that analyzes full sentences. Kidney length was identified as “small” if its length was below the 10th percentile. Nephrologists reviewed 100 randomly selected reports to create the reference standard (ground truth) for initial model training, followed by model validation on an independent set of 100 reports. Results: The word-level NLP model outperformed the sentence-level approach in classifying increased echogenicity (accuracy: 0.96 vs. 0.89 for the left kidney; 0.97 vs. 0.92 for the right kidney). This model was then applied to the full dataset to assess associations with CKD. Multivariable logistic regression identified bilaterally increased echogenicity as the strongest predictor of CKD (odds ratio [OR] = 7.642, 95% confidence interval [CI]: 4.887–11.949; p < 0.0001), followed by bilaterally small kidneys (OR = 4.981, 95% CI: 1.522–16.300; p = 0.008). Among individuals without CKD, those with bilaterally increased echogenicity had significantly lower kidney function than those with normal echogenicity. Conclusions: State-of-the-art NLP models can accurately extract CKD-related features from ultrasound reports, with the potential of providing a scalable tool for early detection and risk stratification. Future research should focus on validating these models across different healthcare systems.
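The odds ratios and confidence intervals reported in the abstract can be checked for internal consistency. The sketch below is not the authors' analysis: it assumes the intervals are Wald intervals (symmetric on the log-odds scale), which the abstract does not state, and the function name `wald_summary` is made up for illustration. Under that assumption, the geometric mean of the interval bounds should reproduce the reported OR, and the implied standard error gives an approximate two-sided p-value.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def wald_summary(odds_ratio, ci_low, ci_high):
    """Back out log-scale quantities from a reported OR and 95% CI.

    Assumption (not stated in the abstract): the CI is a Wald interval,
    i.e. log(OR) +/- Z_95 * SE.
    """
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    se = (log_high - log_low) / (2 * Z_95)            # implied SE of log(OR)
    implied_or = math.exp((log_low + log_high) / 2)   # geometric mean of the bounds
    z = math.log(odds_ratio) / se                     # Wald z statistic
    p = math.erfc(abs(z) / math.sqrt(2))              # two-sided normal p-value
    return {"se": se, "implied_or": implied_or, "z": z, "p": p}


# Values taken from the abstract.
echogenicity = wald_summary(7.642, 4.887, 11.949)
small_kidneys = wald_summary(4.981, 1.522, 16.300)

for label, s in [("increased echogenicity", echogenicity),
                 ("small kidneys", small_kidneys)]:
    print(f"{label}: implied OR={s['implied_or']:.3f}, "
          f"SE={s['se']:.3f}, z={s['z']:.2f}, p={s['p']:.4g}")
```

If the Wald assumption holds, the implied ORs should land close to the reported 7.642 and 4.981, and the p-values should agree with the reported p < 0.0001 and p = 0.008.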
format Article
id doaj-art-150f33c19aba42919ca79d2de761611e
institution DOAJ
issn 0886-022X
1525-6049
language English
publishDate 2025-12-01
publisher Taylor & Francis Group
record_format Article
series Renal Failure
spelling doaj-art-150f33c19aba42919ca79d2de761611e
2025-08-20T03:22:19Z
eng
Taylor & Francis Group
Renal Failure
0886-022X
1525-6049
2025-12-01
Volume 47, Issue 1
10.1080/0886022X.2025.2539938
Chenlu Wang (Department of Computer Science, Stony Brook University, NY, USA)
Ritwik Banerjee (Department of Computer Science, Stony Brook University, NY, USA)
Harry Kuperstein (Renaissance School of Medicine, Stony Brook University, NY, USA)
Hamza Malick (Renaissance School of Medicine, Stony Brook University, NY, USA)
Ruqiyya Bano (Department of Medicine, Stony Brook University, NY, USA)
Robin L. Cunningham (Department of Radiology, Stony Brook University, NY, USA)
Hira Tahir (Department of Medicine, Stony Brook University, NY, USA)
Priyal Sakhuja (Department of Medicine, Stony Brook University, NY, USA)
Janos Hajagos (Department of Biomedical Informatics, Stony Brook University, NY, USA)
Farrukh M. Koraishy (Department of Medicine, Stony Brook University, NY, USA)
title Natural language processing for kidney ultrasound analysis: correlating imaging reports with chronic kidney disease diagnosis
topic Natural language processing
deep syntactic structures
language embeddings
kidney ultrasound reports
chronic kidney disease
url https://www.tandfonline.com/doi/10.1080/0886022X.2025.2539938