Defining geosciences research data through metadata reuse

Objective. Research data refers to factual records used as primary scientific research resources. Reusing research data metadata provides a new perspective, allowing the presentation of new tests, hypotheses, and new research developments. This study aims to identify the nature of the types of Geos...

Bibliographic Details
Main Authors: Alexandre Ribas Semeler, Luana Farias Sales, Adilson Luiz Pinto, Roberta Pereira da Silva de Paula, Valquer Cleyton Paes Gandra, Heloisa Costa
Format: Article
Language: Spanish
Published: University Library System, University of Pittsburgh 2025-02-01
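Series: Biblios
Subjects: Research data; Research data reuse; Metadata; Research data repository; Data Web Scraping; Geosciences
Online Access: https://biblios.pitt.edu/ojs/biblios/article/view/1233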
author Alexandre Ribas Semeler
Luana Farias Sales
Adilson Luiz Pinto
Roberta Pereira da Silva de Paula
Valquer Cleyton Paes Gandra
Heloisa Costa
collection DOAJ
description Objective. Research data are factual records used as primary resources for scientific research. Reusing research data metadata offers a new perspective, enabling new tests, hypotheses, and research developments. This study aims to identify the nature of the types of Geosciences research data by reusing metadata from PANGAEA, the Data Publisher for Earth and Environmental Science (https://www.pangaea.de/). The research question is: “Can the processes of analyzing and manipulating PANGAEA research data metadata be used to define a concept of Geosciences research data?” To address it, we considered the data specification attributes that data journals use to describe the nature of research data: domain of specialization, accessibility, language, data type, acquisition, source location, specific subject area, and related publications.
Method. The methodology involved collecting, analyzing, and visualizing PANGAEA research data metadata. In total, 426,272 records were downloaded from the data repository and compared with the data specifications that data journals use to describe the nature of research data in data papers. This required techniques and technologies for descriptive analysis, information retrieval, data manipulation, and visualization of Dublin Core metadata, implemented in the Python programming language together with other data manipulation software, including OpenRefine and VOSviewer.
Results. The analysis yielded a detailed examination of the metadata for 137,218 research data records from six Geosciences collections: Geochemistry (73,992 records), Atmospheric Sciences (32,314), Paleontology (25,903), Oceanography (22,287), Geophysics (4,175), and Hydrology (834). PANGAEA's six research data metadata collections support discussion of a concept of Geosciences research data as data from studies of the Earth, atmosphere, and oceans across different geo-disciplines. The data come from disciplines including geochemistry, atmospheric science, paleontology, oceanography, geophysics, and hydrology, and are produced with technologies such as satellites, electron microscopes, climate sensors, ships, and computer modeling, among others. They are also augmented by other sources related to the study of the Earth and its processes.
Conclusions. Research data metadata are domain-specific objects that serve as valuable research resources regardless of when, why, or by whom they are used, and regardless of the characteristics of the data. Geosciences research data combine laboratory and fieldwork techniques, using technologies such as satellites and climate sensors to study Earth’s processes. PANGAEA metadata define Geosciences research data as including observations, experiments, and modeling. Geosciences research data support replication, reinterpretation, and new research across disciplines, showcasing various facets of data reuse in scientific research.
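The Method part of the description mentions descriptive analysis of Dublin Core metadata in Python. The following is a minimal sketch of what one such step could look like, not the authors' actual code: it parses a few Dublin Core records and counts their subject terms. The sample records, the `<records>` wrapper element, and the function names `parse_records` and `count_subjects` are all hypothetical; only the two XML namespaces are the standard OAI Dublin Core ones.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard namespaces for Dublin Core records in the oai_dc format.
NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample data, invented for illustration only.
# These are NOT real PANGAEA records.
SAMPLE_XML = """<records
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <oai_dc:dc>
    <dc:title>Hypothetical sediment core chemistry</dc:title>
    <dc:subject>Geochemistry</dc:subject>
    <dc:language>en</dc:language>
  </oai_dc:dc>
  <oai_dc:dc>
    <dc:title>Hypothetical water column profile</dc:title>
    <dc:subject>Oceanography</dc:subject>
    <dc:subject>Geochemistry</dc:subject>
    <dc:language>en</dc:language>
  </oai_dc:dc>
  <oai_dc:dc>
    <dc:title>Hypothetical river discharge series</dc:title>
    <dc:subject>Hydrology</dc:subject>
    <dc:language>en</dc:language>
  </oai_dc:dc>
</records>"""


def parse_records(xml_text):
    """Return one dict per Dublin Core record found under the root element."""
    root = ET.fromstring(xml_text)
    records = []
    for dc in root.findall("oai_dc:dc", NS):
        records.append({
            "title": dc.findtext("dc:title", default="", namespaces=NS),
            # A record may carry several dc:subject elements.
            "subjects": [
                e.text.strip() for e in dc.findall("dc:subject", NS) if e.text
            ],
            "language": dc.findtext("dc:language", default="", namespaces=NS),
        })
    return records


def count_subjects(records):
    """Count how many records mention each subject term."""
    return Counter(s for r in records for s in r["subjects"])


records = parse_records(SAMPLE_XML)
subject_counts = count_subjects(records)
for subject, n in subject_counts.most_common():
    print(f"{subject}: {n}")
```

The same counting step could in principle be applied to other Dublin Core elements, such as `dc:type` or `dc:language`, to cover the other data specification attributes listed in the description.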
format Article
id doaj-art-b4bb517530b1413d9906d11586e0f138
institution Kabale University
issn 1562-4730
language Spanish
publishDate 2025-02-01
publisher University Library System, University of Pittsburgh
record_format Article
series Biblios
spelling doaj-art-b4bb517530b1413d9906d11586e0f138 | 2025-02-08T03:39:00Z | spa | University Library System, University of Pittsburgh | Biblios | 1562-4730 | 2025-02-01 | 87 | 10.5195/biblios.2024.1233
Defining geosciences research data through metadata reuse
Alexandre Ribas Semeler (Federal University of Rio Grande do Sul)
Luana Farias Sales (Instituto Brasileiro de Informação em Ciência e Tecnologia)
Adilson Luiz Pinto (Universidade Federal de Santa Catarina)
Roberta Pereira da Silva de Paula (Instituto Brasileiro de Informação em Ciência e Tecnologia)
Valquer Cleyton Paes Gandra (Instituto Brasileiro de Informação em Ciência e Tecnologia)
Heloisa Costa (Universidade Federal de Santa Catarina)
title Defining geosciences research data through metadata reuse
topic Research data
Research data reuse
Metadata
Research data repository
Data Web Scraping
Geosciences
url https://biblios.pitt.edu/ojs/biblios/article/view/1233