Starrydata: from published plots to shared materials data
| Main Authors: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Taylor & Francis Group, 2025-12-01 |
| Series: | Science and Technology of Advanced Materials: Methods |
| Subjects: | |
| Online Access: | https://www.tandfonline.com/doi/10.1080/27660400.2025.2506976 |
| Summary: | We have developed the Starrydata2 web system, an open, web-based database for collecting and organizing experimental material property data from the literature. It assists users worldwide in extracting and sharing curve data from plot images in published papers, along with relevant sample information such as chemical compositions and fabrication methods. Starrydata2 streamlines the manual data collection process through partial automation. Currently, Starrydata encompasses over 194,000 curves extracted from more than 82,000 physical samples, as reported in over 13,000 publications on functional inorganic materials, including thermoelectric and magnetic materials. All data in Starrydata are openly accessible to the public for both commercial and non-commercial purposes. In this paper, we introduce the web interface, data curation workflow, data structure, and system architecture of Starrydata2. We then describe in detail the datasets currently included in Starrydata2 and discuss their use cases. We also present methods for applying the collected dataset, including a unique large-scale data representation method called ‘all-data plots’, which provides a comprehensive overview of the entire dataset. Finally, we report on how the collected datasets are being utilized in data-driven materials research through machine learning, modelling and simulation. |
|---|---|
| ISSN: | 2766-0400 |