Alignment-Enhanced Interactive Fusion Model for Complete and Incomplete Multimodal Hand Gesture Recognition

Hand gesture recognition (HGR) based on surface electromyogram (sEMG) and accelerometer (ACC) signals is increasingly attractive, and fusion strategies, while crucial to performance, remain challenging. Neural network-based fusion methods currently achieve superior performance. Nevertheless, these methods typically fuse sEMG and ACC either in the early or late stages, overlooking the integration of cross-modal hierarchical information within each individual hidden layer and thereby inducing inefficient inter-modal fusion. To this end, we propose a novel Alignment-Enhanced Interactive Fusion (AiFusion) model, which achieves effective fusion via a progressive hierarchical fusion strategy. Notably, AiFusion can flexibly perform both complete and incomplete multimodal HGR. Specifically, AiFusion contains two unimodal branches and a cascaded transformer-based multimodal fusion branch. The fusion branch is designed to adequately characterize modality-interactive knowledge by adaptively capturing inter-modal similarity and fusing hierarchical features from all branches layer by layer. The modality-interactive knowledge is then aligned with that of each unimodal branch using cross-modal supervised contrastive learning and online distillation in the embedding and probability spaces, respectively. These alignments further improve fusion quality and refine modality-specific representations. Finally, recognition outcomes are determined by the available modalities, which handles the incomplete multimodal HGR problem frequently encountered in real-world scenarios. Experimental results on five public datasets demonstrate that AiFusion outperforms most state-of-the-art benchmarks in complete multimodal HGR. Impressively, it also surpasses the unimodal baselines in the challenging incomplete multimodal HGR setting. AiFusion thus provides a promising solution for effective and robust multimodal HGR-based interfaces.
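
The progressive hierarchical fusion strategy can be pictured as a fusion branch that, at every depth, cross-attends to the hidden states of both unimodal branches. The following is a minimal PyTorch sketch of one such layer; the module name, dimensions, and the use of multi-head cross-attention are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of one layer of a progressive hierarchical fusion
# branch: the fused stream cross-attends to same-depth sEMG and ACC hidden
# states, so inter-modal similarity is captured adaptively by the attention
# weights. All names and sizes are assumptions for illustration.
import torch
import torch.nn as nn

class InteractiveFusionLayer(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, fused, h_emg, h_acc):
        # Stack both unimodal hidden states as the attention context so the
        # fused tokens can draw on either modality at this depth.
        context = torch.cat([h_emg, h_acc], dim=1)
        attn_out, _ = self.cross_attn(fused, context, context)
        fused = self.norm1(fused + attn_out)
        return self.norm2(fused + self.ffn(fused))
```

Cascading several such layers, each fed the corresponding layer's unimodal features, gives the layer-by-layer fusion the abstract describes.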
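In the embedding space, the cross-modal supervised contrastive alignment can be read as pulling fused and unimodal embeddings of the same gesture class together. Below is a generic SupCon-style loss under that reading; the temperature, positive-pair definition, and absence of projection heads are assumptions, and the paper's exact formulation may differ.

```python
# Generic supervised-contrastive alignment between fusion-branch embeddings
# and one unimodal branch's embeddings. Not the paper's exact loss.
import torch
import torch.nn.functional as F

def cross_modal_supcon(z_fused, z_uni, labels, tau: float = 0.1):
    """z_fused, z_uni: (N, d) embeddings; labels: (N,) gesture class ids.
    Same-class cross-modal pairs are treated as positives."""
    z_fused = F.normalize(z_fused, dim=1)
    z_uni = F.normalize(z_uni, dim=1)
    sim = z_fused @ z_uni.t() / tau                      # (N, N) similarities
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Mean log-likelihood over each anchor's positive set.
    return -((log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)).mean()
```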
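In the probability space, online distillation commonly amounts to a KL term between temperature-softened class distributions of two jointly trained branches, rather than distillation from a fixed pre-trained teacher. A sketch of that standard formulation follows; the temperature and the detach on the teacher side are assumptions.

```python
# Standard online-distillation term between two jointly trained heads.
import torch.nn.functional as F

def online_distill(student_logits, teacher_logits, T: float = 2.0):
    """KL between softened distributions; the 'teacher' side is detached for
    this term because both branches train simultaneously ('online')."""
    p_t = F.softmax(teacher_logits.detach() / T, dim=1)
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    # T^2 rescales gradients to match the hard-label loss, as in Hinton et al.
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * (T * T)
```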
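Because each branch retains its own classifier, test-time prediction can simply route to whichever head matches the modalities actually present, which is how the model covers the incomplete multimodal case. A hypothetical dispatch wrapper (class and branch names are assumptions) is sketched below.

```python
# Hypothetical inference wrapper: dispatch by modality availability.
class AvailabilityAwareHGR:
    """The three trained branches are passed in as callables that map input
    tensors to gesture class scores."""

    def __init__(self, emg_branch, acc_branch, fusion_branch):
        self.emg_branch = emg_branch
        self.acc_branch = acc_branch
        self.fusion_branch = fusion_branch

    def predict(self, emg=None, acc=None):
        if emg is not None and acc is not None:
            return self.fusion_branch(emg, acc)  # complete multimodal HGR
        if emg is not None:
            return self.emg_branch(emg)          # ACC stream missing
        if acc is not None:
            return self.acc_branch(acc)          # sEMG stream missing
        raise ValueError("at least one modality must be provided")
```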

Bibliographic Details
Main Authors: Shengcai Duan, Le Wu, Aiping Liu, Xun Chen
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 31, pp. 4661-4671
ISSN: 1534-4320, 1558-0210
DOI: 10.1109/TNSRE.2023.3335101
Subjects: Multimodal fusion; hand gesture recognition; myoelectric control; accelerometer; incomplete multimodal; alignment
Online Access: https://ieeexplore.ieee.org/document/10323506/
Author Affiliations:
Shengcai Duan (ORCID 0000-0002-7582-3891): School of Information Science and Technology, University of Science and Technology of China, Hefei, Anhui, China
Le Wu (ORCID 0000-0002-8565-9626): School of Information Science and Technology, University of Science and Technology of China, Hefei, Anhui, China
Aiping Liu (ORCID 0000-0001-8849-5228): School of Information Science and Technology, University of Science and Technology of China, Hefei, Anhui, China
Xun Chen (ORCID 0000-0002-4922-8116): Department of Neurosurgery, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, and Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, China