Accelerating Disease Model Parameter Extraction: An LLM-Based Ranking Approach to Select Initial Studies for Literature Review Automation

As climate change transforms our environment and human intrusion into natural ecosystems escalates, there is a growing demand for disease spread models to forecast and plan for the next zoonotic disease outbreak. Accurate parametrization of these models requires data from diverse sources, including...

Bibliographic Details
Main Authors: Masood Sujau, Masako Wada, Emilie Vallée, Natalie Hillis, Teo Sušnjak
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Machine Learning and Knowledge Extraction
Subjects:
Online Access: https://www.mdpi.com/2504-4990/7/2/28
author Masood Sujau
Masako Wada
Emilie Vallée
Natalie Hillis
Teo Sušnjak
collection DOAJ
description As climate change transforms our environment and human intrusion into natural ecosystems escalates, there is a growing demand for disease spread models to forecast and plan for the next zoonotic disease outbreak. Accurate parametrization of these models requires data from diverse sources, including the scientific literature. Despite the abundance of scientific publications, the manual extraction of these data via systematic literature reviews remains a significant bottleneck, requiring extensive time and resources, and is susceptible to human error. This study examines the application of a large language model (LLM) as an assessor for screening prioritisation in climate-sensitive zoonotic disease research. By framing the selection criteria of articles as a question–answer task and utilising zero-shot chain-of-thought prompting, the proposed method achieves a saving of at least 70% work effort compared to manual screening at a recall level of 95% (NWSS@95%). This was validated across four datasets containing four distinct zoonotic diseases and a critical climate variable (rainfall). The approach additionally produces explainable AI rationales for each ranked article. The effectiveness of the approach across multiple diseases demonstrates the potential for broad application in systematic literature reviews. The substantial reduction in screening effort, along with the provision of explainable AI rationales, marks an important step toward automated parameter extraction from the scientific literature.
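The description reports work saved at 95% recall (NWSS@95%) without defining the metric. The following is a minimal sketch, assuming NWSS denotes normalised work saved over sampling, computed as the share of irrelevant articles a reviewer never has to screen when reading down the ranked list until 95% of the relevant articles are found. The function name, argument names, and toy labels are hypothetical and are not taken from the article.

```python
import math
from typing import Sequence


def nwss_at_recall(ranked_labels: Sequence[int], recall: float = 0.95) -> float:
    """Normalised work saved over sampling at a target recall level.

    ranked_labels holds ground-truth relevance (1 = relevant, 0 = irrelevant)
    in the order produced by the ranker, best-ranked article first.
    Assumption: NWSS is taken to be the true negative rate at the cutoff
    where the target recall is first reached.
    """
    total_relevant = sum(ranked_labels)
    total_irrelevant = len(ranked_labels) - total_relevant
    if total_relevant == 0 or total_irrelevant == 0:
        raise ValueError("need at least one relevant and one irrelevant article")

    # Number of relevant articles that must be found to hit the target recall.
    # The small epsilon guards against floating-point overshoot before ceil.
    needed = max(1, math.ceil(recall * total_relevant - 1e-9))

    found = 0
    screened_irrelevant = 0
    for label in ranked_labels:
        if label:
            found += 1
            if found >= needed:
                break  # target recall reached; stop screening here
        else:
            screened_irrelevant += 1

    # Irrelevant articles below the cutoff are the work that was saved.
    return 1 - screened_irrelevant / total_irrelevant
```

For the toy ranking `[1, 1, 0, 1, 0, 0, 0, 0, 0, 0]`, all three relevant articles are found after screening one of the seven irrelevant ones, so the function returns 6/7 (about 0.86). A perfect ranking yields 1.0 and a fully inverted ranking yields 0.0.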
format Article
id doaj-art-0f08be7e92fa4064b8fd1771d31fc0ed
institution DOAJ
issn 2504-4990
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Machine Learning and Knowledge Extraction
spelling doaj-art-0f08be7e92fa4064b8fd1771d31fc0ed
2025-08-20T03:16:19Z
eng
MDPI AG
Machine Learning and Knowledge Extraction
2504-4990
2025-03-01
Volume 7, Issue 2, Article 28
10.3390/make7020028
Accelerating Disease Model Parameter Extraction: An LLM-Based Ranking Approach to Select Initial Studies for Literature Review Automation
Masood Sujau (School of Veterinary Science, Massey University, Palmerston North 4442, New Zealand)
Masako Wada (School of Veterinary Science, Massey University, Palmerston North 4442, New Zealand)
Emilie Vallée (School of Veterinary Science, Massey University, Palmerston North 4442, New Zealand)
Natalie Hillis (School of Veterinary Science, Massey University, Palmerston North 4442, New Zealand)
Teo Sušnjak (School of Mathematical and Computational Sciences, Massey University, Auckland 0632, New Zealand)
title Accelerating Disease Model Parameter Extraction: An LLM-Based Ranking Approach to Select Initial Studies for Literature Review Automation
topic large language models in systematic reviews
automated AI literature screening
zero-shot relevancy ranking
climate-sensitive zoonotic disease modelling
information retrieval in medical literature
systematic literature review automation
url https://www.mdpi.com/2504-4990/7/2/28