Identifying genomic data use with the Data Citation Explorer

Abstract Increases in sequencing capacity, combined with rapid accumulation of publications and associated data resources, have increased the complexity of maintaining associations between literature and genomic data. As the volume of literature and data has exceeded the capacity of manual curation...

Bibliographic Details
Main Authors: Neil Byers, Charles Parker, Chris Beecroft, T. B. K. Reddy, Hugh Salamon, George Garrity, Kjiersten Fagnan
Format: Article
Language: English
Published: Nature Portfolio 2024-11-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-024-04049-7
author Neil Byers
Charles Parker
Chris Beecroft
T. B. K. Reddy
Hugh Salamon
George Garrity
Kjiersten Fagnan
collection DOAJ
description Abstract Increases in sequencing capacity, combined with rapid accumulation of publications and associated data resources, have increased the complexity of maintaining associations between literature and genomic data. As the volume of literature and data has exceeded the capacity of manual curation, automated approaches to maintaining and confirming associations among these resources have become necessary. Here we present the Data Citation Explorer (DCE), which discovers literature incorporating genomic data that was not formally cited. This service provides advantages over manual curation methods including consistent resource coverage, metadata enrichment, documentation of new use cases, and identification of conflicting metadata. The service reduces labor costs associated with manual review, improves the quality of genome metadata maintained by the U.S. Department of Energy Joint Genome Institute (JGI), and increases the number of known publications that incorporate its data products. The DCE facilitates an understanding of JGI impact, improves credit attribution for data generators, and can encourage data sharing by allowing scientists to see how reuse amplifies the impact of their original studies.
format Article
id doaj-art-ddcdfb096ca34ba2af8896433c71afb1
institution DOAJ
issn 2052-4463
language English
publishDate 2024-11-01
publisher Nature Portfolio
record_format Article
series Scientific Data
spelling doaj-art-ddcdfb096ca34ba2af8896433c71afb1
2025-08-20T02:50:00Z
eng
Nature Portfolio
Scientific Data
2052-4463
2024-11-01
Volume 11, issue 1, pages 1-13
10.1038/s41597-024-04049-7
Identifying genomic data use with the Data Citation Explorer
Neil Byers (DOE Joint Genome Institute, Lawrence Berkeley National Laboratory)
Charles Parker (DOE Joint Genome Institute, Lawrence Berkeley National Laboratory)
Chris Beecroft (DOE Joint Genome Institute, Lawrence Berkeley National Laboratory)
T. B. K. Reddy (DOE Joint Genome Institute, Lawrence Berkeley National Laboratory)
Hugh Salamon (DOE Joint Genome Institute, Lawrence Berkeley National Laboratory)
George Garrity (Michigan State University, Department of Microbiology & Molecular Genetics)
Kjiersten Fagnan (DOE Joint Genome Institute, Lawrence Berkeley National Laboratory)
title Identifying genomic data use with the Data Citation Explorer
url https://doi.org/10.1038/s41597-024-04049-7