SVIT‐SSR: A sEMG‐based vision transformer approach for silent speech recognition


Bibliographic Details
Main Authors: Zhao Li, Bin Ma, Weifan Mao, Jianxing Zhang, Zhuting Yu, Yizhou Lu
Format: Article
Language: English
Published: Wiley 2024-11-01
Series: Electronics Letters
Subjects: electromyography; speech recognition
Online Access: https://doi.org/10.1049/ell2.13285
author Zhao Li
Bin Ma
Weifan Mao
Jianxing Zhang
Zhuting Yu
Yizhou Lu
collection DOAJ
description Abstract Silent speech recognition (SSR) based on surface electromyography (sEMG) is a voice interaction technology proposed for scenarios requiring silent operation. This article abstracts the sEMG‐based SSR task into a short‐term image sequence classification task. Time‐frequency domain feature extraction and data reconstruction are performed on the muscle activity segment data. Additionally, the temporal and spatial dimensions are analysed to capture the intrinsic correlation representation of muscle activity. The SVIT‐SSR model is proposed based on the vision transformer (VIT) framework. Finally, experiments are designed to identify 33 types of typical silent speech commands in the SSR dataset. The results demonstrate that the proposed model achieves an accuracy of 96.67 ± 1.15%, outperforming similar algorithms.
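The abstract's first two steps (time‐frequency feature extraction, then reconstruction of a muscle activity segment into a short image sequence) can be illustrated with a minimal sketch. This is not the authors' pipeline: the record gives no channel count, sampling rate, or window settings, so the 6 channels, 1024 samples, 64‐sample Hann window, hop of 32, and 4 spectrogram frames per image below are all assumptions, as are the function names `stft_magnitude` and `segment_to_image_sequence`. The input is synthetic noise standing in for sEMG data.

```python
import numpy as np

def stft_magnitude(signal, frame_len=64, hop=32):
    """Magnitude spectrogram of a 1-D signal from a Hann-windowed short-time FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # One row per time frame, one column per frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1))

def segment_to_image_sequence(segment, frames_per_image=4, **stft_kwargs):
    """Reshape a (channels, samples) segment into (T, channels, frames_per_image, freq_bins)."""
    # Per-channel spectrograms stacked to (channels, time_frames, freq_bins).
    spec = np.stack([stft_magnitude(ch, **stft_kwargs) for ch in segment])
    n_images = spec.shape[1] // frames_per_image
    spec = spec[:, : n_images * frames_per_image]  # drop leftover frames
    channels, _, bins_ = spec.shape
    images = spec.reshape(channels, n_images, frames_per_image, bins_)
    # Put the sequence (time) axis first, as a sequence classifier would expect.
    return images.transpose(1, 0, 2, 3)

# Synthetic stand-in for one muscle activity segment (assumed 6 channels, 1024 samples).
rng = np.random.default_rng(0)
segment = rng.standard_normal((6, 1024))
sequence = segment_to_image_sequence(segment)
print(sequence.shape)
```

Each element of the resulting sequence is a small multi‐channel time‐frequency "image"; a vision transformer style classifier would then consume this sequence, which is the part the sketch leaves out.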
format Article
id doaj-art-407142e260e049eeaed04dd46d7a0d51
institution OA Journals
issn 0013-5194
1350-911X
language English
publishDate 2024-11-01
publisher Wiley
record_format Article
series Electronics Letters
spelling doaj-art-407142e260e049eeaed04dd46d7a0d51
2025-08-20T02:12:49Z
eng
Wiley
Electronics Letters
0013-5194
1350-911X
2024-11-01
Volume 60, Issue 21, pages n/a
10.1049/ell2.13285
SVIT‐SSR: A sEMG‐based vision transformer approach for silent speech recognition
Zhao Li, Bin Ma, Weifan Mao, Jianxing Zhang, Zhuting Yu, Yizhou Lu (all: Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, China)
https://doi.org/10.1049/ell2.13285
electromyography
speech recognition
title SVIT‐SSR: A sEMG‐based vision transformer approach for silent speech recognition
topic electromyography
speech recognition
url https://doi.org/10.1049/ell2.13285