Workflow for detecting biomedical articles with underlying open and restricted-access datasets.

Bibliographic Details
Main Authors: Anastasiia Iarkaeva, Vladislav Nachev, Evgeny Bobrov
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-01-01
Series: PLoS ONE
Online Access: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0302787&type=printable
collection DOAJ
description Monitoring the sharing of research data through repositories is increasingly of interest to institutions and funders, as well as from a meta-research perspective. Automated screening tools exist, but they are based on either narrow or vague definitions of open data. Where manual validation has been performed, it was based on a small article sample. At our biomedical research institution, we developed detailed criteria for such screening, as well as a workflow that combines an automated and a manual step and considers both fully open and restricted-access data. We use the results for an internal incentivization scheme, as well as for monitoring in a dashboard. Here, we describe in detail our screening procedure and its validation, based on automated screening of 11035 biomedical research articles, of which 1381 articles with potential data sharing were subsequently screened manually. The screening results were highly reliable, as shown by inter-rater reliability values of ≥0.8 (Krippendorff's alpha) in two different validation samples. We also report the results of the screening, both for our institution and for an independent sample from a meta-research study. In the largest of the three samples, the 2021 institutional sample, underlying data had been openly shared for 7.8% of research articles. For an additional 1.0% of articles, restricted-access data had been shared, resulting in 8.3% of articles overall having open and/or restricted-access data. The extraction workflow is then discussed with regard to its applicability in different contexts, limitations, possible variations, and future developments. In summary, we present a comprehensive, validated, semi-automated workflow for the detection of shared research data underlying biomedical article publications.
format Article
id doaj-art-22fa9600cfd34044a6186338fd5d9527
institution Kabale University
issn 1932-6203
language English
publishDate 2024-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
citation PLoS ONE 19(5): e0302787 (2024-01-01), doi:10.1371/journal.pone.0302787