Scaling law of Sim2Real transfer learning in expanding computational materials databases for real-world predictions
Abstract: To address the challenge of limited experimental materials data, extensive physical property databases are being developed based on high-throughput computational experiments, such as molecular dynamics simulations. Previous studies have shown that fine-tuning a predictor pretrained on a com...
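| Main Authors: | Shunya Minami, Yoshihiro Hayashi, Stephen Wu, Kenji Fukumizu, Hiroki Sugisawa, Masashi Ishii, Isao Kuwajima, Kazuya Shiratori, Ryo Yoshida |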
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | npj Computational Materials |
| Online Access: | https://doi.org/10.1038/s41524-025-01606-5 |
| _version_ | 1850207483736358912 |
|---|---|
| author | Shunya Minami Yoshihiro Hayashi Stephen Wu Kenji Fukumizu Hiroki Sugisawa Masashi Ishii Isao Kuwajima Kazuya Shiratori Ryo Yoshida |
| collection | DOAJ |
| description | To address the challenge of limited experimental materials data, extensive physical property databases are being developed based on high-throughput computational experiments, such as molecular dynamics simulations. Previous studies have shown that fine-tuning a predictor pretrained on a computational database to a real system can result in models with outstanding generalization capabilities compared to learning from scratch. This study demonstrates the scaling law of simulation-to-real (Sim2Real) transfer learning for several machine learning tasks in materials science. Case studies of three prediction tasks for polymers and inorganic materials reveal that the prediction error on real systems decreases according to a power law as the size of the computational data increases. Observing the scaling behavior offers various insights for database development, such as determining the sample size necessary to achieve a desired performance, identifying equivalent sample sizes for physical and computational experiments, and guiding the design of data production protocols for downstream real-world tasks. |
| format | Article |
| id | doaj-art-feff0b14968a41c3b30b68553019f464 |
| institution | OA Journals |
| issn | 2057-3960 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | npj Computational Materials |
| spelling | doaj-art-feff0b14968a41c3b30b68553019f464; 2025-08-20T02:10:31Z; eng; Nature Portfolio; npj Computational Materials; 2057-3960; 2025-05-01; 11; 1; 1; 10; 10.1038/s41524-025-01606-5; Scaling law of Sim2Real transfer learning in expanding computational materials databases for real-world predictions; Shunya Minami (The Institute of Statistical Mathematics, Research Organization of Information and Systems); Yoshihiro Hayashi (The Institute of Statistical Mathematics, Research Organization of Information and Systems); Stephen Wu (The Institute of Statistical Mathematics, Research Organization of Information and Systems); Kenji Fukumizu (The Institute of Statistical Mathematics, Research Organization of Information and Systems); Hiroki Sugisawa (Science & Innovation Center, Mitsubishi Chemical Corporation); Masashi Ishii (Research and Service Division of Materials Data and Integrated System, National Institute for Materials Science); Isao Kuwajima (Research and Service Division of Materials Data and Integrated System, National Institute for Materials Science); Kazuya Shiratori (Science & Innovation Center, Mitsubishi Chemical Corporation); Ryo Yoshida (The Institute of Statistical Mathematics, Research Organization of Information and Systems); https://doi.org/10.1038/s41524-025-01606-5 |
| title | Scaling law of Sim2Real transfer learning in expanding computational materials databases for real-world predictions |
| url | https://doi.org/10.1038/s41524-025-01606-5 |
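
The power-law behavior described in the abstract, and its use for estimating the sample size needed to reach a target error, can be illustrated with a minimal sketch. It assumes a pure power law, error ≈ D · n^(-alpha), with no irreducible error floor. That functional form, the function names `fit_power_law` and `required_size`, and the synthetic numbers are assumptions made here for illustration; none of them is taken from the article.

```python
import numpy as np


def fit_power_law(n, err):
    """Fit err ~ D * n**(-alpha) by linear least squares in log-log space.

    Assumed form (hypothetical, not the article's model): no error floor.
    Returns (D, alpha).
    """
    slope, intercept = np.polyfit(np.log(n), np.log(err), 1)
    return float(np.exp(intercept)), float(-slope)


def required_size(target_err, D, alpha):
    """Invert err = D * n**(-alpha) to get the n that reaches target_err."""
    return (D / target_err) ** (1.0 / alpha)


# Synthetic, noise-free illustration (NOT data from the article):
# error measured at five computational-database sizes.
n = np.array([100, 300, 1000, 3000, 10000], dtype=float)
err = 5.0 * n ** -0.25

D, alpha = fit_power_law(n, err)
n_needed = required_size(0.4, D, alpha)
print(f"D={D:.3f}, alpha={alpha:.3f}, n for err 0.4: {n_needed:.0f}")
```

With these synthetic values the fit recovers the generating parameters (D = 5, alpha = 0.25), and the inverted law gives roughly 24,400 samples for a target error of 0.4. On real, noisy measurements a model with an additive error floor and a nonlinear fit may be more appropriate.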