Sm-Nd Isotope Data Compilation from Geoscientific Literature Using an Automated Tabular Extraction Method


Bibliographic Details
Main Authors: Zhixin Guo, Tao Wang, Chaoyang Wang, Jianping Zhou, Guanjie Zheng, Xinbing Wang, Chenghu Zhou
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-024-04229-5
author Zhixin Guo
Tao Wang
Chaoyang Wang
Jianping Zhou
Guanjie Zheng
Xinbing Wang
Chenghu Zhou
collection DOAJ
description Abstract The rare earth elements Sm and Nd help address fundamental questions about crustal growth, such as its spatiotemporal evolution and the interplay between orogenesis and crustal accretion. Their relative immobility during high-grade metamorphism makes the Sm-Nd isotopic system crucial for inferring crustal formation times. Historically, Sm-Nd data have been published sporadically across the scientific literature because sampling procedures are complicated and costly, resulting in a fragmented knowledge base. Compiling critical geoscience data scattered across many publications by hand demands significant human effort and time. In response, we present an automated tabular extraction method for harvesting tabular geoscience data. Using this method, we collect 10,624 Sm-Nd data entries from 9,138 tables in over 20,000 geoscience publications. From these entries, we manually selected 2,118 data points to supplement the previously constructed global Sm-Nd dataset, increasing its sample count by over 20%. Our automated data collection method enhances the efficiency of data acquisition across various scientific domains.
format Article
id doaj-art-0229e5f88c5c43948556eced712e1dad
institution Kabale University
issn 2052-4463
language English
publishDate 2025-02-01
publisher Nature Portfolio
record_format Article
series Scientific Data
spelling doaj-art-0229e5f88c5c43948556eced712e1dad
2025-02-09T12:11:40Z
eng
Nature Portfolio
Scientific Data
2052-4463
2025-02-01
Volume 12, Issue 1, pp. 1-13
10.1038/s41597-024-04229-5
Sm-Nd Isotope Data Compilation from Geoscientific Literature Using an Automated Tabular Extraction Method
Zhixin Guo (Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering)
Tao Wang (Chinese Academy of Geological Sciences, Institute of Geology)
Chaoyang Wang (Chinese Academy of Geological Sciences, Institute of Geology)
Jianping Zhou (Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering)
Guanjie Zheng (Shanghai Jiao Tong University, John Hopcroft Center for Computer Science)
Xinbing Wang (Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering)
Chenghu Zhou (Chinese Academy of Sciences, Institute of Geological Sciences and Natural Resources Research)
https://doi.org/10.1038/s41597-024-04229-5
title Sm-Nd Isotope Data Compilation from Geoscientific Literature Using an Automated Tabular Extraction Method
url https://doi.org/10.1038/s41597-024-04229-5