Training Large Models on Heterogeneous and Geo-Distributed Resource with Constricted Networks

As the computational demands driven by large model technologies continue to grow rapidly, leveraging GPU hardware to expedite parallel training processes has emerged as a commonly-used strategy. When computational resources within a single cluster are insufficient for large-model training, the hybri...

Bibliographic Details
Main Authors: Zan Zong, Minkun Guo, Mingshu Zhai, Yinan Tang, Jianjiang Li, Jidong Zhai
Format: Article
Language: English
Published: Tsinghua University Press 2025-06-01
Series: Big Data Mining and Analytics
Online Access: https://www.sciopen.com/article/10.26599/BDMA.2025.9020031