HCANet: a micro expression recognition model based on hierarchical Transformer architecture
Facial micro expressions are subtle, involuntary facial movements that reveal true emotions. To enhance recognition accuracy by exploiting spatial correlations among facial landmarks, a hierarchical Transformer architecture, the hierarchical continuous attention network (HCANet), was proposed to effectively leverage the self-attention mechanism for capturing relationships between landmarks in sequences.
| Main Authors: | YANG Aonan, MO Hong, ZHAO Shili, OUYANG Yuqi |
|---|---|
| Format: | Article |
| Language: | zho |
| Published: | POSTS&TELECOM PRESS Co., LTD, 2025-06-01 |
| Series: | 智能科学与技术学报 |
| Subjects: | micro expression recognition; facial feature; deep learning; attention mechanism |
| Online Access: | http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202525/ |
| _version_ | 1849409222171688960 |
|---|---|
| author | YANG Aonan MO Hong ZHAO Shili OUYANG Yuqi |
| author_facet | YANG Aonan MO Hong ZHAO Shili OUYANG Yuqi |
| author_sort | YANG Aonan |
| collection | DOAJ |
| description | Facial micro expressions are subtle, involuntary facial movements that reveal true emotions. To enhance recognition accuracy by exploiting spatial correlations among facial landmarks, a hierarchical Transformer architecture, the hierarchical continuous attention network (HCANet), was proposed to effectively leverage the self-attention mechanism for capturing relationships between landmarks in sequences. HCANet models optical flow differences between onset and apex frames to capture local details and reduce identity interference, thereby avoiding the loss of local detail that occurs when features are extracted directly from full video frames. It consists of a Transformer layer for local temporal feature extraction and an aggregation layer for global facial feature learning. Initially, the face was divided into four regions. Within the Transformer layer, a continuous attention block (CAB) was introduced to focus on the local, minute muscular movements within individual regions for extracting local temporal features. Subsequently, the aggregation layer concentrated on learning inter-region interactions, extracting global semantic facial features through a cross-layer attention mechanism. Finally, comparative validations were conducted using leave-one-out cross-validation on four publicly available micro expression datasets (CASME Ⅱ, CASME Ⅲ, SMIC, SAMM) against six other algorithms. The results demonstrate that HCANet achieves improved classification accuracy on the CASME Ⅲ, SMIC, and SAMM datasets, and exhibits stronger robustness in complex scenarios (e.g., low frame rates, background noise). |
| format | Article |
| id | doaj-art-0814809e81c94d7bb0d18842664cd76f |
| institution | Kabale University |
| issn | 2096-6652 |
| language | zho |
| publishDate | 2025-06-01 |
| publisher | POSTS&TELECOM PRESS Co., LTD |
| record_format | Article |
| series | 智能科学与技术学报 |
| spelling | doaj-art-0814809e81c94d7bb0d18842664cd76f2025-08-20T03:35:34ZzhoPOSTS&TELECOM PRESS Co., LTD智能科学与技术学报2096-66522025-06-017277286117464757HCANet: a micro expression recognition model based on hierarchical Transformer architectureYANG AonanMO HongZHAO ShiliOUYANG YuqiFacial micro expressions are subtle, involuntary facial movements that reveal true emotions. To enhance recognition accuracy by exploiting spatial correlations among facial landmarks, a hierarchical Transformer architecture, the hierarchical continuous attention network (HCANet), was proposed to effectively leverage the self-attention mechanism for capturing relationships between landmarks in sequences. HCANet models optical flow differences between onset and apex frames to capture local details and reduce identity interference, thereby avoiding the loss of local detail that occurs when features are extracted directly from full video frames. It consists of a Transformer layer for local temporal feature extraction and an aggregation layer for global facial feature learning. Initially, the face was divided into four regions. Within the Transformer layer, a continuous attention block (CAB) was introduced to focus on the local, minute muscular movements within individual regions for extracting local temporal features. Subsequently, the aggregation layer concentrated on learning inter-region interactions, extracting global semantic facial features through a cross-layer attention mechanism. Finally, comparative validations were conducted using leave-one-out cross-validation on four publicly available micro expression datasets (CASME Ⅱ, CASME Ⅲ, SMIC, SAMM) against six other algorithms. The results demonstrate that HCANet achieves improved classification accuracy on the CASME Ⅲ, SMIC, and SAMM datasets, and exhibits stronger robustness in complex scenarios (e.g., low frame rates, background noise).http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202525/micro expression recognitionfacial featuredeep learningattention mechanism |
| spellingShingle | YANG Aonan MO Hong ZHAO Shili OUYANG Yuqi HCANet: a micro expression recognition model based on hierarchical Transformer architecture 智能科学与技术学报 micro expression recognition facial feature deep learning attention mechanism |
| title | HCANet: a micro expression recognition model based on hierarchical Transformer architecture |
| title_full | HCANet: a micro expression recognition model based on hierarchical Transformer architecture |
| title_fullStr | HCANet: a micro expression recognition model based on hierarchical Transformer architecture |
| title_full_unstemmed | HCANet: a micro expression recognition model based on hierarchical Transformer architecture |
| title_short | HCANet: a micro expression recognition model based on hierarchical Transformer architecture |
| title_sort | hcanet a micro expression recognition model based on hierarchical transformer architecture |
| topic | micro expression recognition facial feature deep learning attention mechanism |
| url | http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202525/ |
| work_keys_str_mv | AT yangaonan hcanetamicroexpressionrecognitionmodelbasedonhierarchicaltransformerarchitecture AT mohong hcanetamicroexpressionrecognitionmodelbasedonhierarchicaltransformerarchitecture AT zhaoshili hcanetamicroexpressionrecognitionmodelbasedonhierarchicaltransformerarchitecture AT ouyangyuqi hcanetamicroexpressionrecognitionmodelbasedonhierarchicaltransformerarchitecture |
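The abstract describes a two-stage design: self-attention within each of four facial regions to capture local muscle movements, followed by an aggregation stage that attends across region summaries to learn inter-region interactions. The following is a minimal, hypothetical NumPy sketch of that idea only, not the authors' implementation; all names, dimensions, and the mean-pooling of region tokens are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # Scaled dot-product self-attention over rows of x
    # (identity Q/K/V projections, kept minimal for illustration).
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    return softmax(scores) @ x

# Toy input: 20 landmark feature vectors of dimension 8, standing in for
# optical-flow descriptors computed between onset and apex frames.
rng = np.random.default_rng(0)
landmarks = rng.standard_normal((20, 8))

# Stage 1 (Transformer layer): attend within each of 4 facial regions
# to model local, minute muscular movements.
regions = np.split(landmarks, 4)                 # 4 regions x 5 landmarks
local = [self_attention(r) for r in regions]

# Stage 2 (aggregation layer): pool each region into one token, then
# attend across the 4 region tokens to capture inter-region interactions.
region_tokens = np.stack([r.mean(axis=0) for r in local])   # shape (4, 8)
global_features = self_attention(region_tokens)             # shape (4, 8)
print(global_features.shape)
```

The sketch shows only the control flow of the hierarchy; the paper's continuous attention block (CAB) and cross-layer attention mechanism would replace the plain `self_attention` calls at stages 1 and 2 respectively.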