Developing an ICD-10 Coding Assistant: Pilot Study Using RoBERTa and GPT-4 for Term Extraction and Description-Based Code Selection

Bibliographic Details
Main Authors: Sander Puts, Catharina M L Zegers, Andre Dekker, Iñigo Bermejo
Format: Article
Language: English
Published: JMIR Publications 2025-02-01
Series: JMIR Formative Research
Online Access: https://formative.jmir.org/2025/1/e60095
collection DOAJ
description Abstract
Background: The International Classification of Diseases (ICD), developed by the World Health Organization, standardizes health condition coding to support health care policy, research, and billing. Artificial intelligence automation, while promising, still underperforms compared with human accuracy and lacks the explainability needed for adoption in medical settings.
Objective: The potential of large language models for assisting medical coders in ICD-10 coding was explored through the development of a computer-assisted coding system. This study aimed to augment human coding by first identifying lead terms and then using retrieval-augmented generation (RAG)–based methods to assign codes.
Methods: The explainability dataset from the CodiEsp challenge (CodiEsp-X) was used, featuring 1000 Spanish clinical cases annotated with ICD-10 codes. A new dataset, CodiEsp-X-lead, was generated using GPT-4 to replace full-text evidence annotations with lead term annotations. A Robustly Optimized BERT (Bidirectional Encoder Representations from Transformers) Pretraining Approach transformer model was fine-tuned for named entity recognition to extract lead terms. GPT-4 was subsequently employed to generate code descriptions from the extracted textual evidence. Using a RAG approach, ICD codes were assigned to the lead terms by querying a vector database of ICD code descriptions with OpenAI’s text-embedding-ada-002 model.
Results: The fine-tuned Robustly Optimized BERT Pretraining Approach achieved an overall F1 [...]
Conclusions: While lead term extraction showed promising results, the subsequent RAG-based code assignment using GPT-4 and code descriptions was less effective. Future research should focus on refining the approach to more closely mimic the medical coder’s workflow, potentially integrating the alphabetic index and official coding guidelines rather than relying solely on code descriptions. This alignment may enhance system accuracy and better support medical coders in practice.
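The retrieval step described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for an embedding model such as text-embedding-ada-002, the three-entry `CODE_DESCRIPTIONS` dictionary is a hypothetical substitute for a vector database of ICD code descriptions, and the function names are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical mini "vector database": a few illustrative ICD-10 code
# descriptions. A real system would index the full code set.
CODE_DESCRIPTIONS = {
    "J18.9": "pneumonia, unspecified organism",
    "E11.9": "type 2 diabetes mellitus without complications",
    "I10": "essential (primary) hypertension",
}


def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Embed every code description once, up front.
INDEX = {code: embed(desc) for code, desc in CODE_DESCRIPTIONS.items()}


def retrieve_codes(generated_description, k=2):
    """Return the k codes whose descriptions are most similar to the query text."""
    query = embed(generated_description)
    ranked = sorted(INDEX, key=lambda code: cosine(query, INDEX[code]), reverse=True)
    return ranked[:k]
```

In the pipeline the abstract describes, the query text would be a code description generated by GPT-4 from an extracted lead term, and the candidates returned by a function like `retrieve_codes` would be offered to the medical coder rather than assigned automatically.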
id doaj-art-b2f1cdb7de2e4eb595d5f6a6b98df99f
institution DOAJ
issn 2561-326X
doi 10.2196/60095
volume 9
orcid Sander Puts http://orcid.org/0000-0003-4148-1755
Catharina M L Zegers http://orcid.org/0000-0002-9772-0869
Andre Dekker http://orcid.org/0000-0002-0422-7996
Iñigo Bermejo http://orcid.org/0000-0001-9105-8088