Improving Systematic Review Updates With Natural Language Processing Through Abstract Component Classification and Selection: Algorithm Development and Validation

Bibliographic Details
Main Authors: Tatsuki Hasegawa, Hayato Kizaki, Keisho Ikegami, Shungo Imai, Yuki Yanagisawa, Shuntaro Yada, Eiji Aramaki, Satoko Hori
Format: Article
Language: English
Published: JMIR Publications, 2025-03-01
Series: JMIR Medical Informatics
Online Access: https://medinform.jmir.org/2025/1/e65371
Collection: DOAJ
Description:

Background: A challenge in updating systematic reviews is the workload of screening articles. Many screening models using natural language processing have been implemented to screen articles based on titles and abstracts. While these approaches show promise, traditional models typically treat abstracts as uniform text. We hypothesized that selective training on specific abstract components could enhance model performance for systematic review screening.

Objective: We evaluated the efficacy of a novel screening model that selects specific components from abstracts to improve performance, and we developed an automatic systematic review update model that uses an abstract component classifier to categorize abstracts by their components.

Methods: A screening model was created from the articles included in and excluded from an existing systematic review and used as the scheme for automatically updating that review. A prior publication was selected as the target systematic review, and the articles included or excluded during its screening process were used as training data. Titles and abstracts were classified into 5 categories (Title, Introduction, Methods, Results, and Conclusion). Thirty-one component-composition datasets were created by combining the 5 component datasets (every nonempty combination of the 5 components). We implemented 31 screening models using these datasets and compared their performance. Comparisons were conducted using 3 pretrained models: Bidirectional Encoder Representations from Transformers (BERT), BioLinkBERT, and BioM-ELECTRA (where ELECTRA denotes Efficiently Learning an Encoder that Classifies Token Replacements Accurately). Moreover, to automate the component selection of abstracts, we developed an Abstract Component Classifier Model and created component datasets from its classifications. Using these automatically classified component datasets, we created the 10 component-composition datasets used by the 10 best-performing screening models among those trained on the manually classified component datasets. Ten screening models were implemented using these datasets, and their performance was compared with that of the models built on the manually classified component-composition datasets. The primary evaluation metric was the F10-score, which weights recall over precision.

Results: A total of 256 included articles and 1261 excluded articles were extracted from the selected systematic review. Among the screening models trained on manually classified datasets, several surpassed the models trained on all components (BERT: 9 models; BioLinkBERT: 6 models; BioM-ELECTRA: 21 models). Among the models trained on datasets classified by the Abstract Component Classifier Model, several (BERT: 7 models; BioM-ELECTRA: 9 models) also surpassed the models trained on all components. These models achieved an 88.6% reduction in manual screening workload while maintaining high recall (0.93).

Conclusions: Component selection from the title and abstract can improve the performance of screening models and substantially reduce the manual screening workload in systematic review updates. Future research should focus on validating this approach across different systematic review domains.
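The Methods section's two key mechanics can be sketched briefly: the 31 component-composition datasets correspond to every nonempty subset of the 5 abstract components, and the primary metric is the F-beta score with beta = 10, which weights recall far above precision. The sketch below is illustrative only and is not taken from the paper's code; the function names and structure are assumptions.

```python
from itertools import combinations

# The 5 abstract components used in the study.
COMPONENTS = ["Title", "Introduction", "Methods", "Results", "Conclusion"]

def component_compositions(components=COMPONENTS):
    """Enumerate every nonempty subset of the components.

    For 5 components this yields 2**5 - 1 = 31 combinations,
    matching the 31 component-composition datasets in the study.
    """
    subsets = []
    for r in range(1, len(components) + 1):
        subsets.extend(combinations(components, r))
    return subsets

def f_beta(precision, recall, beta=10.0):
    """F-beta score; beta = 10 weights recall heavily over precision,
    matching the study's primary metric (the F10-score)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(len(component_compositions()))  # 31
```

With beta = 10, a model with recall 1.0 and precision 0.5 scores far higher than one with precision 1.0 and recall 0.5, which is why the metric suits screening tasks where missing a relevant article is costlier than an extra manual check.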
ISSN: 2291-9694
DOI: 10.2196/65371
Author ORCID iDs: Tatsuki Hasegawa (0009-0008-4708-3460), Hayato Kizaki (0000-0002-4572-1333), Keisho Ikegami (0009-0005-5025-3078), Shungo Imai (0000-0001-5706-613X), Yuki Yanagisawa (0000-0001-6998-6654), Shuntaro Yada (0000-0002-6209-1054), Eiji Aramaki (0000-0003-0201-3609), Satoko Hori (0000-0002-4596-5418)