Alignment-Enhanced Interactive Fusion Model for Complete and Incomplete Multimodal Hand Gesture Recognition
Hand gesture recognition (HGR) based on surface electromyogram (sEMG) and accelerometer (ACC) signals is increasingly attractive, and fusion strategies are crucial to performance yet remain challenging. Currently, neural network-based fusion methods have achieved superior performance. Nevertheless,...
| Main Authors: | Shengcai Duan, Le Wu, Aiping Liu, Xun Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2023-01-01 |
| Series: | IEEE Transactions on Neural Systems and Rehabilitation Engineering |
| Subjects: | Multimodal fusion; hand gesture recognition; myoelectric control; accelerometer; incomplete multimodal; alignment |
| Online Access: | https://ieeexplore.ieee.org/document/10323506/ |
| _version_ | 1849735173259657216 |
|---|---|
| author | Shengcai Duan; Le Wu; Aiping Liu; Xun Chen |
| author_facet | Shengcai Duan; Le Wu; Aiping Liu; Xun Chen |
| author_sort | Shengcai Duan |
| collection | DOAJ |
| description | Hand gesture recognition (HGR) based on surface electromyogram (sEMG) and accelerometer (ACC) signals is increasingly attractive, and fusion strategies are crucial to performance yet remain challenging. Currently, neural network-based fusion methods have achieved superior performance. Nevertheless, these methods typically fuse sEMG and ACC either in the early or late stages, overlooking the integration of the entire cross-modal hierarchy of information within each individual hidden layer, thus inducing inefficient inter-modal fusion. To this end, we propose a novel Alignment-Enhanced Interactive Fusion (AiFusion) model, which achieves effective fusion via a progressive hierarchical fusion strategy. Notably, AiFusion can flexibly perform both complete and incomplete multimodal HGR. Specifically, AiFusion contains two unimodal branches and a cascaded transformer-based multimodal fusion branch. The fusion branch is first designed to adequately characterize modality-interactive knowledge by adaptively capturing inter-modal similarity and fusing hierarchical features from all branches layer by layer. Then, the modality-interactive knowledge is aligned with that of each unimodal branch using cross-modal supervised contrastive learning and online distillation in the embedding and probability spaces, respectively. These alignments further promote fusion quality and refine modality-specific representations. Finally, the recognition outcome is determined by the available modalities, which helps handle the incomplete multimodal HGR problem frequently encountered in real-world scenarios. Experimental results on five public datasets demonstrate that AiFusion outperforms most state-of-the-art benchmarks in complete multimodal HGR. Impressively, it also surpasses the unimodal baselines in the challenging incomplete multimodal HGR setting. The proposed AiFusion provides a promising solution for realizing effective and robust multimodal HGR-based interfaces. |
| format | Article |
| id | doaj-art-a91095580e2e4ada89c9b8f0e64a9e7a |
| institution | DOAJ |
| issn | 1534-4320 1558-0210 |
| language | English |
| publishDate | 2023-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Transactions on Neural Systems and Rehabilitation Engineering |
| spelling | doaj-art-a91095580e2e4ada89c9b8f0e64a9e7a; 2025-08-20T03:07:37Z; eng; IEEE; IEEE Transactions on Neural Systems and Rehabilitation Engineering; ISSN 1534-4320, 1558-0210; 2023-01-01; vol. 31, pp. 4661–4671; DOI 10.1109/TNSRE.2023.3335101; article 10323506; Alignment-Enhanced Interactive Fusion Model for Complete and Incomplete Multimodal Hand Gesture Recognition; Shengcai Duan (https://orcid.org/0000-0002-7582-3891), Le Wu (https://orcid.org/0000-0002-8565-9626), Aiping Liu (https://orcid.org/0000-0001-8849-5228), Xun Chen (https://orcid.org/0000-0002-4922-8116); School of Information Science and Technology, University of Science and Technology of China, Hefei, Anhui, China (Duan, Wu, Liu); Department of Neurosurgery, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, and the Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, China (Chen); https://ieeexplore.ieee.org/document/10323506/; keywords: multimodal fusion, hand gesture recognition, myoelectric control, accelerometer, incomplete multimodal, alignment |
| spellingShingle | Shengcai Duan; Le Wu; Aiping Liu; Xun Chen; Alignment-Enhanced Interactive Fusion Model for Complete and Incomplete Multimodal Hand Gesture Recognition; IEEE Transactions on Neural Systems and Rehabilitation Engineering; Multimodal fusion; hand gesture recognition; myoelectric control; accelerometer; incomplete multimodal; alignment |
| title | Alignment-Enhanced Interactive Fusion Model for Complete and Incomplete Multimodal Hand Gesture Recognition |
| title_sort | alignment enhanced interactive fusion model for complete and incomplete multimodal hand gesture recognition |
| topic | Multimodal fusion; hand gesture recognition; myoelectric control; accelerometer; incomplete multimodal; alignment |
| url | https://ieeexplore.ieee.org/document/10323506/ |