Starrydata: from published plots to shared materials data

We have developed the Starrydata2 web system, an open, web-based database for collecting and organizing experimental material property data from the literature. It assists users worldwide in extracting and sharing curve data from plot images in published papers, along with relevant sample informatio...

Bibliographic Details
Main Authors: Yukari Katsura, Masaya Kumagai, Tomoya Mato, Yu Takada, Yuki Ando, Erina Fujita, Fumikazu Hosono, Eiji Koyama, Farhan Mudasar, Ton Nu Thanh Phuong, Naoto Saito, Yoshihiro Sakamoto, Atsumi Tanaka, Dewi Yana, Kaoru Kimura, Koji Tsuda, Masahiko Demura
Format: Article
Language: English
Published: Taylor & Francis Group 2025-12-01
Series: Science and Technology of Advanced Materials: Methods
Online Access: https://www.tandfonline.com/doi/10.1080/27660400.2025.2506976
collection DOAJ
description We have developed the Starrydata2 web system, an open, web-based database for collecting and organizing experimental material property data from the literature. It assists users worldwide in extracting and sharing curve data from plot images in published papers, along with relevant sample information such as chemical compositions and fabrication methods. Starrydata2 streamlines the manual data collection process through partial automation. Currently, Starrydata encompasses over 194,000 curves extracted from more than 82,000 physical samples, as reported in over 13,000 publications on functional inorganic materials, including thermoelectric and magnetic materials. All data in Starrydata are openly accessible to the public for both commercial and non-commercial purposes. In this paper, we introduce the web interface, data curation workflow, data structure, and system architecture of Starrydata2. We then describe in detail the datasets currently included in Starrydata2 and discuss their use cases. We also present the methods for applying the collected dataset, including a unique large-scale data representation method called ‘all-data plots’, which provides a comprehensive overview of the entire dataset. Finally, we report on how the collected datasets are being utilized in data-driven materials research through machine learning, modelling and simulation.
id doaj-art-4afb9f549a424e0faac06e95d5638f03
institution Kabale University
issn 2766-0400
spelling doaj-art-4afb9f549a424e0faac06e95d5638f03 2025-08-20T03:27:28Z eng Taylor & Francis Group, Science and Technology of Advanced Materials: Methods, ISSN 2766-0400, 2025-12-01, volume 5, issue 1, doi:10.1080/27660400.2025.2506976
Starrydata: from published plots to shared materials data
Author affiliations (indexed in record order):
0 Yukari Katsura: Center for Basic Research on Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
1 Masaya Kumagai: RIKEN Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan
2 Tomoya Mato: Center for Basic Research on Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
3 Yu Takada: Center for Basic Research on Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
4 Yuki Ando: Research Center for Structural Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
5 Erina Fujita: Department of Advanced Data Science, The Institute of Statistical Mathematics (ISM), Research Organization of Information and Systems, Tokyo, Japan
6 Fumikazu Hosono: Center for Basic Research on Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
7 Eiji Koyama: Center for Basic Research on Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
8 Farhan Mudasar: Department of Physical and Environmental Sciences, University of Toronto Scarborough, Toronto, Canada
9 Ton Nu Thanh Phuong: Research Center for Magnetic and Spintronic Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
10 Naoto Saito: Center for Basic Research on Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
11 Yoshihiro Sakamoto: RIKEN Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan
12 Atsumi Tanaka: Center for Basic Research on Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
13 Dewi Yana: Center for Basic Research on Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
14 Kaoru Kimura: Department of Advanced Data Science, The Institute of Statistical Mathematics (ISM), Research Organization of Information and Systems, Tokyo, Japan
15 Koji Tsuda: Center for Basic Research on Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
16 Masahiko Demura: Center for Basic Research on Materials, National Institute for Materials Science (NIMS), Tsukuba, Japan
topic Materials informatics
database
data mining
plot digitization
open data
literature data