Gradient Amplification: An Efficient Way to Train Deep Neural Networks

Improving the performance of deep learning models and reducing their training times are ongoing challenges in deep neural networks. Several approaches have been proposed to address these challenges, one of which is to increase the depth of the network; such deeper networks, however, not only take longer to train but also suffer from the vanishing gradient problem during training. In this work, we propose a gradient amplification approach for training deep learning models that prevents vanishing gradients, and we develop a training strategy that enables or disables gradient amplification across several epochs with different learning rates. We perform experiments on VGG-19 and ResNet models (ResNet-18 and ResNet-34) and study the impact of the amplification parameters on these models in detail. The proposed approach improves the performance of these deep learning models even at higher learning rates, thereby allowing them to achieve higher performance with reduced training time.

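The abstract describes the method only at a high level. Below is a minimal PyTorch sketch of what layer-wise gradient amplification with an epoch-based on/off schedule could look like; the amplification factor, the layers chosen for amplification, and the switch-over epoch are illustrative assumptions, not the authors' published implementation.

    # Illustrative sketch only: tensor hooks multiply the gradients of selected
    # layers by an amplification factor, and a simple epoch schedule turns the
    # amplification on or off alongside the learning-rate change.
    import torch
    import torch.nn as nn

    class GradientAmplifier:
        """Scales gradients of the given layers by `factor` while enabled."""
        def __init__(self, layers, factor=2.0):
            self.factor = factor
            self.enabled = False
            for layer in layers:
                for p in layer.parameters():
                    # Tensor hooks run during backward and may return a modified gradient.
                    p.register_hook(self._amplify)

        def _amplify(self, grad):
            return grad * self.factor if self.enabled else grad

    model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
    amplifier = GradientAmplifier(layers=[model[0]], factor=2.0)  # amplify the first layer only
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    for epoch in range(30):
        # Hypothetical schedule: amplify during the early high-learning-rate epochs,
        # then disable amplification when the learning rate is lowered.
        amplifier.enabled = epoch < 20
        for group in optimizer.param_groups:
            group["lr"] = 0.1 if epoch < 20 else 0.01
        # ... usual training loop here: forward pass, loss.backward(), optimizer.step() ...
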
Bibliographic Details
Main Authors: Sunitha Basodi, Chunyan Ji, Haiping Zhang, Yi Pan
Format: Article
Language: English
Published: Tsinghua University Press, 2020-09-01
Series: Big Data Mining and Analytics
Subjects: deep learning; gradient amplification; learning rate; backpropagation; vanishing gradients
Online Access: https://www.sciopen.com/article/10.26599/BDMA.2020.9020004
author Sunitha Basodi
Chunyan Ji
Haiping Zhang
Yi Pan
author_facet Sunitha Basodi
Chunyan Ji
Haiping Zhang
Yi Pan
author_sort Sunitha Basodi
collection DOAJ
description Improving the performance of deep learning models and reducing their training times are ongoing challenges in deep neural networks. Several approaches have been proposed to address these challenges, one of which is to increase the depth of the network; such deeper networks, however, not only take longer to train but also suffer from the vanishing gradient problem during training. In this work, we propose a gradient amplification approach for training deep learning models that prevents vanishing gradients, and we develop a training strategy that enables or disables gradient amplification across several epochs with different learning rates. We perform experiments on VGG-19 and ResNet models (ResNet-18 and ResNet-34) and study the impact of the amplification parameters on these models in detail. The proposed approach improves the performance of these deep learning models even at higher learning rates, thereby allowing them to achieve higher performance with reduced training time.
format Article
id doaj-art-5433e311cbbb4c2e914915b00c94d99e
institution Kabale University
issn 2096-0654
language English
publishDate 2020-09-01
publisher Tsinghua University Press
record_format Article
series Big Data Mining and Analytics
spelling doaj-art-5433e311cbbb4c2e914915b00c94d99e (indexed 2025-02-02T23:47:57Z); English; Tsinghua University Press; Big Data Mining and Analytics, ISSN 2096-0654; published 2020-09-01; Volume 3, Issue 3, pp. 196-207; DOI 10.26599/BDMA.2020.9020004; Gradient Amplification: An Efficient Way to Train Deep Neural Networks.
Author affiliations:
Sunitha Basodi, Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA.
Chunyan Ji, Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA.
Haiping Zhang, Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
Yi Pan, Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA.
Online access: https://www.sciopen.com/article/10.26599/BDMA.2020.9020004; keywords: deep learning, gradient amplification, learning rate, backpropagation, vanishing gradients.
spellingShingle Sunitha Basodi
Chunyan Ji
Haiping Zhang
Yi Pan
Gradient Amplification: An Efficient Way to Train Deep Neural Networks
Big Data Mining and Analytics
deep learning
gradient amplification
learning rate
backpropagation
vanishing gradients
title Gradient Amplification: An Efficient Way to Train Deep Neural Networks
title_full Gradient Amplification: An Efficient Way to Train Deep Neural Networks
title_fullStr Gradient Amplification: An Efficient Way to Train Deep Neural Networks
title_full_unstemmed Gradient Amplification: An Efficient Way to Train Deep Neural Networks
title_short Gradient Amplification: An Efficient Way to Train Deep Neural Networks
title_sort gradient amplification an efficient way to train deep neural networks
topic deep learning
gradient amplification
learning rate
backpropagation
vanishing gradients
url https://www.sciopen.com/article/10.26599/BDMA.2020.9020004
work_keys_str_mv AT sunithabasodi gradientamplificationanefficientwaytotraindeepneuralnetworks
AT chunyanji gradientamplificationanefficientwaytotraindeepneuralnetworks
AT haipingzhang gradientamplificationanefficientwaytotraindeepneuralnetworks
AT yipan gradientamplificationanefficientwaytotraindeepneuralnetworks