Gradient Amplification: An Efficient Way to Train Deep Neural Networks
Improving the performance of deep learning models and reducing their training times are ongoing challenges in deep neural networks. Several approaches have been proposed to address these challenges, one of which is to increase the depth of the neural networks. Such deeper networks not only increase training times, but also suffer from the vanishing gradient problem during training. In this work, we propose a gradient amplification approach for training deep learning models to prevent vanishing gradients, and we also develop a training strategy to enable or disable gradient amplification across several epochs with different learning rates. We perform experiments on VGG-19 and ResNet models (ResNet-18 and ResNet-34), and study the impact of the amplification parameters on these models in detail. Our proposed approach improves the performance of these deep learning models even at higher learning rates, thereby allowing them to achieve higher performance with reduced training time.
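The abstract only sketches the idea at a high level. Below is a minimal, illustrative sketch in PyTorch of how gradient amplification with an on/off training schedule could be wired up: backward hooks scale the gradients flowing through selected layers, and amplification is enabled only during the high-learning-rate phase. The `GradientAmplifier` class, the choice of amplified layers, the factor of 2.0, and the epoch/learning-rate schedule are all assumptions made here for illustration, not the exact method or settings reported in the paper.

```python
import torch
import torch.nn as nn


class GradientAmplifier:
    """Scales gradients flowing backward through selected layers by a fixed factor."""

    def __init__(self, layers, factor=2.0):
        self.layers = list(layers)
        self.factor = factor
        self.handles = []

    def _scale(self, module, grad_input, grad_output):
        # Returning a new grad_input replaces the gradient that keeps
        # propagating toward earlier (shallower) layers.
        return tuple(g * self.factor if g is not None else None for g in grad_input)

    def enable(self):
        if not self.handles:
            self.handles = [layer.register_full_backward_hook(self._scale)
                            for layer in self.layers]

    def disable(self):
        for handle in self.handles:
            handle.remove()
        self.handles = []


def train(model, loader, device="cpu"):
    criterion = nn.CrossEntropyLoss()
    # Hypothetical schedule: amplify only during the initial high-learning-rate phase.
    phases = [  # (epochs, learning rate, amplification on?)
        (50, 0.1, True),
        (30, 0.01, False),
        (10, 0.001, False),
    ]
    # Amplify gradients at the convolutional layers (an arbitrary illustrative choice).
    amplifier = GradientAmplifier(
        (m for m in model.modules() if isinstance(m, nn.Conv2d)), factor=2.0
    )
    model.to(device)
    for epochs, lr, amplify in phases:
        optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        if amplify:
            amplifier.enable()
        else:
            amplifier.disable()
        for _ in range(epochs):
            for inputs, targets in loader:
                inputs, targets = inputs.to(device), targets.to(device)
                optimizer.zero_grad()
                loss = criterion(model(inputs), targets)
                loss.backward()
                optimizer.step()
    amplifier.disable()
```

Toggling the hooks between phases mirrors the training strategy mentioned in the abstract, where amplification is enabled or disabled across epoch ranges with different learning rates.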
Saved in:
Main Authors: | Sunitha Basodi, Chunyan Ji, Haiping Zhang, Yi Pan |
---|---|
Format: | Article |
Language: | English |
Published: | Tsinghua University Press, 2020-09-01 |
Series: | Big Data Mining and Analytics |
Subjects: | deep learning; gradient amplification; learning rate; backpropagation; vanishing gradients |
Online Access: | https://www.sciopen.com/article/10.26599/BDMA.2020.9020004 |
_version_ | 1832568907054972928 |
---|---|
author | Sunitha Basodi; Chunyan Ji; Haiping Zhang; Yi Pan |
author_facet | Sunitha Basodi; Chunyan Ji; Haiping Zhang; Yi Pan |
author_sort | Sunitha Basodi |
collection | DOAJ |
description | Improving the performance of deep learning models and reducing their training times are ongoing challenges in deep neural networks. Several approaches have been proposed to address these challenges, one of which is to increase the depth of the neural networks. Such deeper networks not only increase training times, but also suffer from the vanishing gradient problem during training. In this work, we propose a gradient amplification approach for training deep learning models to prevent vanishing gradients, and we also develop a training strategy to enable or disable gradient amplification across several epochs with different learning rates. We perform experiments on VGG-19 and ResNet models (ResNet-18 and ResNet-34), and study the impact of the amplification parameters on these models in detail. Our proposed approach improves the performance of these deep learning models even at higher learning rates, thereby allowing them to achieve higher performance with reduced training time. |
format | Article |
id | doaj-art-5433e311cbbb4c2e914915b00c94d99e |
institution | Kabale University |
issn | 2096-0654 |
language | English |
publishDate | 2020-09-01 |
publisher | Tsinghua University Press |
record_format | Article |
series | Big Data Mining and Analytics |
spelling | doaj-art-5433e311cbbb4c2e914915b00c94d99e (2025-02-02T23:47:57Z); eng; Tsinghua University Press; Big Data Mining and Analytics, ISSN 2096-0654, 2020-09-01, Volume 3, Issue 3, pp. 196-207, doi:10.26599/BDMA.2020.9020004. Gradient Amplification: An Efficient Way to Train Deep Neural Networks. Sunitha Basodi, Chunyan Ji, and Yi Pan (Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA); Haiping Zhang (Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China). https://www.sciopen.com/article/10.26599/BDMA.2020.9020004. Keywords: deep learning; gradient amplification; learning rate; backpropagation; vanishing gradients. |
spellingShingle | Sunitha Basodi; Chunyan Ji; Haiping Zhang; Yi Pan. Gradient Amplification: An Efficient Way to Train Deep Neural Networks. Big Data Mining and Analytics. deep learning; gradient amplification; learning rate; backpropagation; vanishing gradients |
title | Gradient Amplification: An Efficient Way to Train Deep Neural Networks |
title_full | Gradient Amplification: An Efficient Way to Train Deep Neural Networks |
title_fullStr | Gradient Amplification: An Efficient Way to Train Deep Neural Networks |
title_full_unstemmed | Gradient Amplification: An Efficient Way to Train Deep Neural Networks |
title_short | Gradient Amplification: An Efficient Way to Train Deep Neural Networks |
title_sort | gradient amplification an efficient way to train deep neural networks |
topic | deep learning; gradient amplification; learning rate; backpropagation; vanishing gradients |
url | https://www.sciopen.com/article/10.26599/BDMA.2020.9020004 |
work_keys_str_mv | AT sunithabasodi gradientamplificationanefficientwaytotraindeepneuralnetworks AT chunyanji gradientamplificationanefficientwaytotraindeepneuralnetworks AT haipingzhang gradientamplificationanefficientwaytotraindeepneuralnetworks AT yipan gradientamplificationanefficientwaytotraindeepneuralnetworks |