A snapshot of parallelism in distributed deep learning training

The accelerated development of artificial intelligence applications has driven the creation of increasingly complex neural network models with enormous numbers of parameters, currently reaching into the trillions. Training such models is therefore nearly impossible without parallelization. Parallelism, applied through different approaches, is the mechanism that has been used to make training feasible at this scale. This paper presents a glimpse of the state of the art on parallelism in deep learning training from multiple points of view. It covers pipeline parallelism, hybrid parallelism, mixture-of-experts and auto-parallelism, topics that currently play a leading role in scientific research in this area. Finally, the authors develop a series of experiments with data parallelism and model parallelism, so that the reader can observe the performance of the two types of parallelism and understand each approach more clearly.
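
The record itself contains no code, but to make the two experimental approaches named in the abstract concrete: in data parallelism, every worker holds a full replica of the model and trains on a different shard of the data, synchronizing gradients after each backward pass. A minimal sketch, assuming PyTorch's DistributedDataParallel wrapper (this is an illustration of the general technique, not the paper's actual experiment code):

```python
# Minimal data-parallel training sketch (assumed setup: PyTorch with
# DistributedDataParallel; not the paper's actual experiment code).
# Every process keeps a full model replica and sees a different shard
# of the data; gradients are averaged across replicas on backward().
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # `torchrun` sets RANK, LOCAL_RANK and WORLD_SIZE for each process.
    dist.init_process_group(backend="gloo")  # use "nccl" on GPU nodes

    model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
    ddp_model = DDP(model)  # hooks gradient all-reduce into backward()
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(10):
        # A real run would shard a dataset with DistributedSampler;
        # random tensors stand in for one local mini-batch here.
        x = torch.randn(32, 512)
        y = torch.randint(0, 10, (32,))
        optimizer.zero_grad()
        loss = loss_fn(ddp_model(x), y)
        loss.backward()   # gradients are all-reduced across replicas here
        optimizer.step()  # every replica applies the same averaged update

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched as, say, `torchrun --nproc_per_node=4 train_dp.py` (a hypothetical file name), each of the four processes trains a full replica on its own quarter of the global batch.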

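Model parallelism, the other approach the paper evaluates, instead splits the network itself across devices, so no single device ever holds the full model; activations, rather than gradients, cross the device boundary. A minimal sketch, assuming two CUDA devices (`cuda:0`, `cuda:1`) and the same hypothetical PyTorch setting:

```python
# Minimal model-parallel sketch (assumed setup: PyTorch with two CUDA
# devices; an illustration of the technique, not the paper's code).
# The network is split layer-wise: each stage lives on its own device,
# and activations are copied between devices in forward().
import torch
import torch.nn as nn

class TwoStageModel(nn.Module):
    def __init__(self, dev0="cuda:0", dev1="cuda:1"):
        super().__init__()
        self.dev0, self.dev1 = dev0, dev1
        self.stage0 = nn.Sequential(nn.Linear(512, 1024), nn.ReLU()).to(dev0)
        self.stage1 = nn.Linear(1024, 10).to(dev1)

    def forward(self, x):
        x = self.stage0(x.to(self.dev0))
        # Activation transfer between devices: the communication cost
        # that distinguishes model parallelism from data parallelism.
        return self.stage1(x.to(self.dev1))

model = TwoStageModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 512)
y = torch.randint(0, 10, (32,), device="cuda:1")  # labels live with the output stage
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()   # autograd routes gradients back across the device boundary
optimizer.step()
```

Unlike the data-parallel case, a single forward pass here is sequential across devices, which is the inefficiency that the pipeline-parallel schemes surveyed in the paper are designed to mitigate.
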
Bibliographic Details
Main Authors: Hairol Romero-Sandí (Universidad Nacional), Gabriel Núñez (Universidad Nacional), Elvis Rojas (Universidad Nacional; National High Technology Center)
Format: Article
Language: English
Published: Universidad Autónoma de Bucaramanga, 2024-06-01
Series: Revista Colombiana de Computación, vol. 25, no. 1 (2024)
ISSN: 1657-2831, 2539-2115
DOI: 10.29375/25392115.5054
Online Access: https://revistasunabeduco.biteca.online/index.php/rcc/article/view/5054