Enhancing hepatopathy clinical trial efficiency: a secure, large language model-powered pre-screening pipeline

Bibliographic Details
Main Authors: Xiongbin Gui, Hanlin Lv, Xiao Wang, Longting Lv, Yi Xiao, Lei Wang
Format: Article
Language: English
Published: BMC 2025-06-01
Series: BioData Mining
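Subjects: Patient recruitment; Electronic health records; Natural language processing; Liver disease; Eligibility criteria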
Online Access: https://doi.org/10.1186/s13040-025-00458-5
author Xiongbin Gui
Hanlin Lv
Xiao Wang
Longting Lv
Yi Xiao
Lei Wang
collection DOAJ
description Abstract
Background: Recruitment for cohorts involving complex liver diseases, such as hepatocellular carcinoma and liver cirrhosis, often requires interpreting semantically complex eligibility criteria. Traditional manual screening is time-consuming and error-prone. AI-powered pre-screening offers potential solutions, but challenges remain regarding accuracy, efficiency, and data privacy.
Methods: We developed a novel patient pre-screening pipeline that leverages clinical expertise to guide the precise, safe, and efficient application of large language models. The pipeline breaks complex criteria down into a series of composite questions and then answers them by semantic question-answering over electronic health records, using one of two strategies: (1) Pathway A, an Anthropomorphized Experts’ Chain of Thought strategy; and (2) Pathway B, a Preset Stances within an Agent Collaboration strategy, intended particularly for complex clinical reasoning scenarios. The pipeline is evaluated on precision, recall, time consumption, and counterfactual inference, at both the question and criterion levels.
Results: The pipeline achieved a balance of high precision (e.g., 0.921 at the criterion level) and good overall recall (e.g., approximately 0.82 at the criterion level), alongside high efficiency (0.44 s per task). Pathway B excelled in high-precision complex reasoning (while exhibiting a specific recall profile conducive to accuracy), whereas Pathway A was particularly effective for tasks requiring both robust precision and recall (e.g., direct data extraction), often with faster processing times. The two pathways achieved comparable overall precision while offering different strengths in the precision-recall trade-off. The pipeline showed promising precision-focused results in hepatocellular carcinoma (0.878) and cirrhosis (0.843) trials.
Conclusions: This data-secure and time-efficient pipeline shows high precision and good recall in hepatopathy trials, offering a promising way to streamline clinical trial workflows. Its efficiency, adaptability, and balanced performance profile make it suitable for improving patient recruitment, and its ability to run in resource-constrained environments further enhances its utility in clinical settings.
format Article
id doaj-art-700cef0622ce4e42bd7da8d96648f4c3
institution Kabale University
issn 1756-0381
language English
publishDate 2025-06-01
publisher BMC
record_format Article
series BioData Mining
spelling doaj-art-700cef0622ce4e42bd7da8d96648f4c3 (2025-08-20T03:45:11Z)
BioData Mining (BMC), ISSN 1756-0381, 2025-06-01, vol. 18, no. 1, pp. 1-15
https://doi.org/10.1186/s13040-025-00458-5
Xiongbin Gui: The First Affiliated Hospital of Guangxi University of Chinese Medicine
Hanlin Lv: Institute of Biointelligence Technology, BGI Research
Xiao Wang: Institute of Biointelligence Technology, BGI Research
Longting Lv: Institute of Biointelligence Technology, BGI Research
Yi Xiao: The First Affiliated Hospital of Guangxi University of Chinese Medicine
Lei Wang: Institute of Biointelligence Technology, BGI Research
title Enhancing hepatopathy clinical trial efficiency: a secure, large language model-powered pre-screening pipeline
topic Patient recruitment
Electronic health records
Natural language processing
Liver disease
Eligibility criteria
url https://doi.org/10.1186/s13040-025-00458-5