Predicting retracted research: a dataset and machine learning approaches

Bibliographic Details
Main Authors: Aaron H. A. Fletcher, Mark Stevenson
Format: Article
Language: English
Published: BMC, 2025-06-01
Series: Research Integrity and Peer Review
Subjects: Retraction prediction; Machine learning; Scientific publishing
Online Access: https://doi.org/10.1186/s41073-025-00168-w
author Aaron H. A. Fletcher
Mark Stevenson
collection DOAJ
description Abstract
Background: Retractions undermine the scientific record’s reliability and can lead to the continued propagation of flawed research. This study aimed to (1) create a dataset aggregating retraction information with bibliographic metadata, (2) train and evaluate various machine learning approaches to predict article retractions, and (3) assess each feature’s contribution to feature-based classifier performance using ablation studies.
Methods: An open-access dataset was developed by combining information from the Retraction Watch database and the OpenAlex API. Using a case-controlled design, retracted research articles were paired with non-retracted articles published in the same period. Traditional feature-based classifiers and models leveraging contextual language representations were then trained and evaluated. Model performance was assessed using accuracy, precision, recall, and the F1-score.
Results: The Llama 3.2 base model achieved the highest overall accuracy. The Random Forest classifier achieved a precision of 0.687 for identifying non-retracted articles, while the Llama 3.2 base model reached a precision of 0.683 for identifying retracted articles. Traditional feature-based classifiers generally outperformed most contextual language models, except for the Llama 3.2 base model, which showed competitive performance across several metrics.
Conclusions: Although no single model excelled across all metrics, our findings indicate that machine learning techniques can effectively support the identification of retracted research. These results provide a foundation for developing automated tools to assist publishers and reviewers in detecting potentially problematic publications. Further research should focus on refining these models and investigating additional features to improve predictive performance.
Trial registration: Not applicable.
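As an illustration of the kind of pipeline the abstract describes, the sketch below fetches one work's bibliographic metadata from the OpenAlex API by DOI and scores a binary retracted/non-retracted classifier with precision, recall, and F1. This is a minimal sketch, not the authors' implementation: the feature choices and the helper names (fetch_openalex_work, simple_features) are illustrative assumptions, and the toy labels exist only to demonstrate the metrics.

```python
# Minimal sketch (not the paper's pipeline): pull metadata for one article
# from the OpenAlex API and compute the evaluation metrics named in the
# abstract. Feature choices here are illustrative assumptions.
import requests
from sklearn.metrics import precision_recall_fscore_support

def fetch_openalex_work(doi: str) -> dict:
    """Retrieve a work's OpenAlex record via its DOI."""
    resp = requests.get(f"https://api.openalex.org/works/https://doi.org/{doi}")
    resp.raise_for_status()
    return resp.json()

def simple_features(work: dict) -> dict:
    """A few metadata features a feature-based classifier might use."""
    return {
        "n_authors": len(work.get("authorships", [])),
        "cited_by_count": work.get("cited_by_count", 0),
        "n_references": len(work.get("referenced_works", [])),
        "publication_year": work.get("publication_year"),
    }

if __name__ == "__main__":
    work = fetch_openalex_work("10.1186/s41073-025-00168-w")
    print(simple_features(work), "retracted:", work.get("is_retracted"))

    # Toy evaluation only: 1 = retracted, 0 = non-retracted.
    y_true = [1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 0, 1, 1]
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="binary", pos_label=1
    )
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In the case-controlled design described above, each retracted article would be paired with a non-retracted article from the same period, which keeps the two classes balanced and controls for publication-date effects before features like these are extracted.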
format Article
id doaj-art-d75ae7e7bdc84116a2a4c9c66b564fe5
institution Kabale University
issn 2058-8615
language English
publishDate 2025-06-01
publisher BMC
record_format Article
series Research Integrity and Peer Review
spelling Aaron H. A. Fletcher; Mark Stevenson (School of Computer Science, The University of Sheffield). Predicting retracted research: a dataset and machine learning approaches. Research Integrity and Peer Review (ISSN 2058-8615), BMC, 2025-06-01. https://doi.org/10.1186/s41073-025-00168-w
title Predicting retracted research: a dataset and machine learning approaches
topic Retraction prediction
Machine learning
Scientific publishing
url https://doi.org/10.1186/s41073-025-00168-w