Gradient Amplification: An Efficient Way to Train Deep Neural Networks

Improving the performance of deep learning models and reducing their training times are ongoing challenges in deep neural networks. Several approaches have been proposed to address these challenges, one of which is to increase the depth of the network; such deeper networks, however, not only take longer to train but also suffer from the vanishing gradient problem during training. In this work, we propose a gradient amplification approach for training deep learning models that prevents vanishing gradients, and we develop a training strategy that enables or disables gradient amplification across several epochs with different learning rates. We perform experiments on VGG-19 and ResNet models (ResNet-18 and ResNet-34) and study the impact of the amplification parameters on these models in detail. The proposed approach improves the performance of these deep learning models even at higher learning rates, thereby allowing them to achieve higher performance with reduced training time.

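The abstract describes the method only at a high level. Below is a minimal PyTorch sketch of what layer-wise gradient amplification with an epoch-based on/off schedule could look like; the amplification factor, the layers chosen for amplification, and the switch-over epoch are illustrative assumptions, not the authors' published implementation.

    # Illustrative sketch only: tensor hooks multiply the gradients of selected
    # layers by an amplification factor, and a simple epoch schedule turns the
    # amplification on or off alongside the learning-rate change.
    import torch
    import torch.nn as nn

    class GradientAmplifier:
        """Scales gradients of the given layers by `factor` while enabled."""
        def __init__(self, layers, factor=2.0):
            self.factor = factor
            self.enabled = False
            for layer in layers:
                for p in layer.parameters():
                    # Tensor hooks run during backward and may return a modified gradient.
                    p.register_hook(self._amplify)

        def _amplify(self, grad):
            return grad * self.factor if self.enabled else grad

    model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
    amplifier = GradientAmplifier(layers=[model[0]], factor=2.0)  # amplify the first layer only
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    for epoch in range(30):
        # Hypothetical schedule: amplify during the early high-learning-rate epochs,
        # then disable amplification when the learning rate is lowered.
        amplifier.enabled = epoch < 20
        for group in optimizer.param_groups:
            group["lr"] = 0.1 if epoch < 20 else 0.01
        # ... usual training loop here: forward pass, loss.backward(), optimizer.step() ...
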
Bibliographic Details
Main Authors: Sunitha Basodi, Chunyan Ji, Haiping Zhang, Yi Pan
Format: Article
Language: English
Published: Tsinghua University Press, 2020-09-01
Series: Big Data Mining and Analytics
Subjects: deep learning; gradient amplification; learning rate; backpropagation; vanishing gradients
Online Access: https://www.sciopen.com/article/10.26599/BDMA.2020.9020004
author Sunitha Basodi
Chunyan Ji
Haiping Zhang
Yi Pan
author_facet Sunitha Basodi
Chunyan Ji
Haiping Zhang
Yi Pan
author_sort Sunitha Basodi
collection DOAJ
description Improving the performance of deep learning models and reducing their training times are ongoing challenges in deep neural networks. Several approaches have been proposed to address these challenges, one of which is to increase the depth of the network; such deeper networks, however, not only take longer to train but also suffer from the vanishing gradient problem during training. In this work, we propose a gradient amplification approach for training deep learning models that prevents vanishing gradients, and we develop a training strategy that enables or disables gradient amplification across several epochs with different learning rates. We perform experiments on VGG-19 and ResNet models (ResNet-18 and ResNet-34) and study the impact of the amplification parameters on these models in detail. The proposed approach improves the performance of these deep learning models even at higher learning rates, thereby allowing them to achieve higher performance with reduced training time.
format Article
id doaj-art-5433e311cbbb4c2e914915b00c94d99e
institution Kabale University
issn 2096-0654
language English
publishDate 2020-09-01
publisher Tsinghua University Press
record_format Article
series Big Data Mining and Analytics
spelling doaj-art-5433e311cbbb4c2e914915b00c94d99e (indexed 2025-02-02T23:47:57Z); English; Tsinghua University Press; Big Data Mining and Analytics, ISSN 2096-0654; published 2020-09-01; Volume 3, Issue 3, pp. 196-207; DOI 10.26599/BDMA.2020.9020004; Gradient Amplification: An Efficient Way to Train Deep Neural Networks.
Author affiliations:
Sunitha Basodi, Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA.
Chunyan Ji, Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA.
Haiping Zhang, Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
Yi Pan, Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA.
Online access: https://www.sciopen.com/article/10.26599/BDMA.2020.9020004; keywords: deep learning, gradient amplification, learning rate, backpropagation, vanishing gradients.
spellingShingle Sunitha Basodi
Chunyan Ji
Haiping Zhang
Yi Pan
Gradient Amplification: An Efficient Way to Train Deep Neural Networks
Big Data Mining and Analytics
deep learning
gradient amplification
learning rate
backpropagation
vanishing gradients
title Gradient Amplification: An Efficient Way to Train Deep Neural Networks
title_full Gradient Amplification: An Efficient Way to Train Deep Neural Networks
title_fullStr Gradient Amplification: An Efficient Way to Train Deep Neural Networks
title_full_unstemmed Gradient Amplification: An Efficient Way to Train Deep Neural Networks
title_short Gradient Amplification: An Efficient Way to Train Deep Neural Networks
title_sort gradient amplification an efficient way to train deep neural networks
topic deep learning
gradient amplification
learning rate
backpropagation
vanishing gradients
url https://www.sciopen.com/article/10.26599/BDMA.2020.9020004
work_keys_str_mv AT sunithabasodi gradientamplificationanefficientwaytotraindeepneuralnetworks
AT chunyanji gradientamplificationanefficientwaytotraindeepneuralnetworks
AT haipingzhang gradientamplificationanefficientwaytotraindeepneuralnetworks
AT yipan gradientamplificationanefficientwaytotraindeepneuralnetworks