Prox-STA-LSTM: A Sparse Representation for the Attention-Based LSTM Networks for Industrial Soft Sensor Development

For deep-learning-based soft sensors, the spatiotemporal attention (STA)-LSTM is a recently developed technique that provides efficient predictions of quality variables in industrial processes. However, the STA-LSTM method calls for an enormous network structure, which contains redundant network weigh...

Bibliographic Details
Main Authors: Yurun Wang, Yi Huang, Dongsheng Chen, Longyan Wang, Lingjian Ye, Feifan Shen
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Soft sensor; LSTM; attention mechanism; proximal operator; sparse representation
Online Access: https://ieeexplore.ieee.org/document/10549946/
collection DOAJ
description For deep-learning-based soft sensors, the spatiotemporal attention (STA)-LSTM is a recently developed technique that provides efficient predictions of quality variables in industrial processes. However, the STA-LSTM method calls for an enormous network structure, which contains redundant network weights and therefore diminishes the model's generalization ability. In this paper, we consider a sparse model representation for the STA-LSTM to address this problem. The $\ell_1$-regularization, a popular means of promoting sparsity, is introduced into the loss function of the STA-LSTM. The $\ell_1$-regularized formulation is a non-smooth optimization problem, which cannot be solved well by common gradient descent approaches. We deploy the proximal operator, a well-principled mathematical tool for handling non-smooth optimization problems, to solve the $\ell_1$-regularized STA-LSTM formulation. The new algorithm is developed within the framework of the state-of-the-art Adam algorithm, and the sparse representation of the STA-LSTM is referred to as Prox-STA-LSTM. Finally, two industrial cases, a carbon absorber and a desulfurization process, are investigated using the new soft sensor. The results show that Prox-STA-LSTM can successfully sparsify the STA-LSTM networks. More importantly, the prediction performance is also improved.
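The abstract describes an Adam-based update combined with a proximal operator for the $\ell_1$ penalty, but this record does not give the paper's update equations. The following is a minimal sketch of the general pattern only, under stated assumptions: one ordinary Adam step on the smooth part of the loss, followed by soft-thresholding (the proximal operator of the $\ell_1$ norm) with threshold `lr * lam`. The function names `soft_threshold` and `prox_adam_step`, the plain-list weight layout, and the choice of threshold are hypothetical and are not taken from the paper.

```python
import math


def soft_threshold(w, tau):
    """Proximal operator of tau * |w|: shrink w toward zero by tau."""
    if w > tau:
        return w - tau
    if w < -tau:
        return w + tau
    return 0.0  # weights with |w| <= tau become exactly zero


def prox_adam_step(weights, grads, m, v, t, lr=1e-3, lam=1e-2,
                   beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step on the smooth loss, then an l1 proximal step.

    weights, grads, m, v are equal-length lists of floats; t is the
    1-based step counter. The threshold lr * lam is an assumption of
    this sketch, not necessarily the rule used in the paper.
    """
    new_w, new_m, new_v = [], [], []
    for w, g, mi, vi in zip(weights, grads, m, v):
        # Standard Adam moment estimates with bias correction.
        mi = beta1 * mi + (1.0 - beta1) * g
        vi = beta2 * vi + (1.0 - beta2) * g * g
        m_hat = mi / (1.0 - beta1 ** t)
        v_hat = vi / (1.0 - beta2 ** t)
        # Gradient-based update on the smooth part of the loss.
        w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
        # Proximal step handles the non-smooth l1 term.
        w = soft_threshold(w, lr * lam)
        new_w.append(w)
        new_m.append(mi)
        new_v.append(vi)
    return new_w, new_m, new_v


if __name__ == "__main__":
    w, m, v = prox_adam_step([0.5, 1e-4], [0.1, 0.0], [0.0, 0.0], [0.0, 0.0],
                             t=1, lr=1e-3, lam=1.0)
    print(w)  # the second, near-zero weight is set exactly to zero
```

The point of the proximal step is that it can set small weights exactly to zero, which a plain subgradient step on the $\ell_1$ term generally does not do. Whether the threshold should also be scaled by Adam's per-weight step size is a design choice this sketch leaves open.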
format Article
id doaj-art-a8e629f0c8444568adf3d6051dc5c57c
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-a8e629f0c8444568adf3d6051dc5c57c
2025-08-20T03:31:23Z
eng
IEEE
IEEE Access
2169-3536
2024-01-01
Volume 12, pages 80633-80645
DOI: 10.1109/ACCESS.2024.3409899
10549946
Prox-STA-LSTM: A Sparse Representation for the Attention-Based LSTM Networks for Industrial Soft Sensor Development
Yurun Wang (0), Huzhou Key Laboratory of Intelligent Sensing and Optimal Control for Industrial Systems, School of Engineering, Huzhou University, Huzhou, China
Yi Huang (1), Huzhou Key Laboratory of Intelligent Sensing and Optimal Control for Industrial Systems, School of Engineering, Huzhou University, Huzhou, China
Dongsheng Chen (2), Huzhou Key Laboratory of Intelligent Sensing and Optimal Control for Industrial Systems, School of Engineering, Huzhou University, Huzhou, China
Longyan Wang (3), Huzhou Key Laboratory of Intelligent Sensing and Optimal Control for Industrial Systems, School of Engineering, Huzhou University, Huzhou, China
Lingjian Ye (4), https://orcid.org/0000-0001-8732-593X, Huzhou Key Laboratory of Intelligent Sensing and Optimal Control for Industrial Systems, School of Engineering, Huzhou University, Huzhou, China
Feifan Shen (5), School of Information Science and Engineering, NingboTech University, Ningbo, China
https://ieeexplore.ieee.org/document/10549946/
Soft sensor
LSTM
attention mechanism
proximal operator
sparse representation
topic Soft sensor
LSTM
attention mechanism
proximal operator
sparse representation
url https://ieeexplore.ieee.org/document/10549946/