MatSwarm: trusted swarm transfer learning driven materials computation for secure big data sharing

Abstract The rapid advancement of Industry 4.0 necessitates close collaboration among material research institutions to accelerate the development of novel materials. However, multi-institutional cooperation faces significant challenges in protecting sensitive data, leading to data silos. Additional...

Bibliographic Details
Main Authors: Ran Wang, Cheng Xu, Shuhao Zhang, Fangwen Ye, Yusen Tang, Sisui Tang, Hangning Zhang, Wendi Du, Xiaotong Zhang
Format: Article
Language: English
Published: Nature Portfolio 2024-10-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-024-53431-x
Collection: DOAJ
Description: Abstract The rapid advancement of Industry 4.0 necessitates close collaboration among material research institutions to accelerate the development of novel materials. However, multi-institutional cooperation faces significant challenges in protecting sensitive data, leading to data silos. Additionally, the heterogeneous and non-independent and identically distributed (non-i.i.d.) nature of material data hinders model accuracy and generalization in collaborative computing. In this paper, we introduce the MatSwarm framework, built on swarm learning, which integrates federated learning with blockchain technology. MatSwarm features two key innovations: a swarm transfer learning method with a regularization term to enhance the alignment of local model parameters, and the use of Trusted Execution Environments (TEE) with Intel SGX for heightened security. These advancements significantly improve accuracy and generalization and ensure data confidentiality throughout the model training and aggregation processes. Implemented within the National Material Data Management and Services (NMDMS) platform, MatSwarm has successfully aggregated over 14 million material data entries from more than thirty research institutions across China. The framework has demonstrated superior accuracy and generalization compared to models trained independently by individual institutions.
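Illustrative note: the description mentions a regularization term that aligns local model parameters but does not give its form. The sketch below is not the authors' method. It assumes a FedProx-style proximal penalty, (mu/2)(w - w_global)^2, added to each site's loss, a one-parameter linear model, and plain averaging. All names (local_loss, local_step, train_round, mu, sites) are hypothetical, and the blockchain and TEE components are not modeled.

```python
# Illustrative sketch only. ASSUMPTION: the regularizer is a proximal
# (FedProx-style) penalty; the record does not state MatSwarm's actual term.

def local_loss(w, data, w_global, mu):
    """MSE of the 1-D model y = w * x, plus (mu / 2) * (w - w_global)^2."""
    mse = sum((w * x - y) ** 2 for x, y in data) / len(data)
    return mse + 0.5 * mu * (w - w_global) ** 2

def local_step(w, data, w_global, mu, lr):
    """One gradient-descent step on local_loss with respect to w."""
    grad_mse = sum(2 * x * (w * x - y) for x, y in data) / len(data)
    grad_reg = mu * (w - w_global)  # pulls w back toward the shared value
    return w - lr * (grad_mse + grad_reg)

def train_round(sites, w_global, mu, lr=0.05, steps=50):
    """Each site trains locally from w_global; results are averaged."""
    local_ws = []
    for data in sites:
        w = w_global
        for _ in range(steps):
            w = local_step(w, data, w_global, mu, lr)
        local_ws.append(w)
    return sum(local_ws) / len(local_ws), local_ws

# Two hypothetical sites with non-i.i.d. data (true slopes 1 and 3).
sites = [[(1.0, 1.0), (2.0, 2.0)], [(1.0, 3.0), (2.0, 6.0)]]
_, free = train_round(sites, w_global=2.0, mu=0.0)  # no regularization
_, tied = train_round(sites, w_global=2.0, mu=5.0)  # with proximal term
spread_free = max(free) - min(free)
spread_tied = max(tied) - min(tied)
print(spread_free, spread_tied)
```

Under these assumptions, the spread between the two sites' local parameters is smaller with mu = 5.0 than with mu = 0.0, which is the alignment effect the description refers to in general terms.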
Record ID: doaj-art-28246a154c38461eb50c0ffed045e320
Institution: OA Journals
ISSN: 2041-1723
Volume: 15, Issue: 1, Pages: 1-14
Author Affiliations:
Ran Wang: School of Computer and Communication Engineering, University of Science and Technology Beijing
Cheng Xu: School of Computer and Communication Engineering, University of Science and Technology Beijing
Shuhao Zhang: College of Computing and Data Science, Nanyang Technological University
Fangwen Ye: School of Computer and Communication Engineering, University of Science and Technology Beijing
Yusen Tang: School of Computer and Communication Engineering, University of Science and Technology Beijing
Sisui Tang: School of Computer and Communication Engineering, University of Science and Technology Beijing
Hangning Zhang: School of Computer and Communication Engineering, University of Science and Technology Beijing
Wendi Du: School of Computer and Communication Engineering, University of Science and Technology Beijing
Xiaotong Zhang: School of Computer and Communication Engineering, University of Science and Technology Beijing