HCANet: a micro expression recognition model based on hierarchical Transformer architecture

Facial micro expressions are subtle, involuntary facial movements that reveal true emotions. To enhance recognition accuracy by exploiting spatial correlations among facial landmarks, a hierarchical continuous attention network (HCANet), a hierarchical Transformer architecture, was proposed to effective...


Bibliographic Details
Main Authors: YANG Aonan, MO Hong, ZHAO Shili, OUYANG Yuqi
Format: Article
Language: zho
Published: POSTS&TELECOM PRESS Co., LTD 2025-06-01
Series:智能科学与技术学报
Subjects:
Online Access:http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202525/
_version_ 1849409222171688960
author YANG Aonan
MO Hong
ZHAO Shili
OUYANG Yuqi
author_facet YANG Aonan
MO Hong
ZHAO Shili
OUYANG Yuqi
author_sort YANG Aonan
collection DOAJ
description Facial micro expressions are subtle, involuntary facial movements that reveal true emotions. To enhance recognition accuracy by exploiting spatial correlations among facial landmarks, a hierarchical continuous attention network (HCANet), a hierarchical Transformer architecture, was proposed to effectively leverage the self-attention mechanism for capturing relationships between landmarks in sequences. HCANet models optical flow differences between onset and apex frames to capture local details and reduce identity interference, thereby avoiding the oversight of local details that occurs when features are extracted directly from full video frames. It consists of a Transformer layer for local temporal feature extraction and an aggregation layer for global facial feature learning. Initially, the face was divided into four regions. Within the Transformer layer, a continuous attention block (CAB) was introduced to focus on the local, minute muscular movements within individual regions and extract local temporal features. Subsequently, the aggregation layer concentrated on learning inter-region interactions, extracting global semantic facial features through a cross-layer attention mechanism. Finally, comparative validations were conducted using leave-one-out cross-validation on four publicly available micro expression datasets (CASME Ⅱ, CASME Ⅲ, SMIC, SAMM) against six other algorithms. The results demonstrate that HCANet achieves improved classification accuracy on the CASME Ⅲ, SMIC, and SAMM datasets and exhibits stronger robustness in complex scenarios (e.g., low frame rates, background noise).
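The two-stage flow the description outlines (local self-attention within each facial region, then attention across region-level features for global aggregation) can be sketched as follows. This is a minimal NumPy illustration of the hierarchical idea only, not the authors' implementation: the region count, token counts, feature dimension, single-head attention, and mean pooling are all assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    # Single-head scaled dot-product self-attention over X: (tokens, dim).
    # For brevity, queries/keys/values are the inputs themselves
    # (a real model would apply learned projections).
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)          # (tokens, tokens)
    return softmax(scores, axis=-1) @ X    # (tokens, dim)

def hierarchical_sketch(regions):
    # regions: list of (landmark_tokens, dim) arrays, one per facial region.
    # Stage 1 (Transformer layer): attention within each region captures
    # local muscular-movement features.
    local = [self_attention(r) for r in regions]
    # Pool each region to a single token (mean pooling, an assumption here).
    tokens = np.stack([r.mean(axis=0) for r in local])   # (n_regions, dim)
    # Stage 2 (aggregation layer): attention across region tokens learns
    # inter-region interactions for a global facial representation.
    return self_attention(tokens)                        # (n_regions, dim)

rng = np.random.default_rng(0)
# 4 facial regions, 5 landmark tokens each, 8-dimensional features (assumed sizes)
regions = [rng.standard_normal((5, 8)) for _ in range(4)]
global_feats = hierarchical_sketch(regions)
print(global_feats.shape)  # (4, 8)
```

The point of the hierarchy is that the quadratic attention cost is paid within small regions first, and only a handful of pooled region tokens interact at the global stage.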
format Article
id doaj-art-0814809e81c94d7bb0d18842664cd76f
institution Kabale University
issn 2096-6652
language zho
publishDate 2025-06-01
publisher POSTS&TELECOM PRESS Co., LTD
record_format Article
series 智能科学与技术学报
spelling doaj-art-0814809e81c94d7bb0d18842664cd76f2025-08-20T03:35:34ZzhoPOSTS&TELECOM PRESS Co., LTD智能科学与技术学报2096-66522025-06-017277286117464757HCANet: a micro expression recognition model based on hierarchical Transformer architectureYANG AonanMO HongZHAO ShiliOUYANG YuqiFacial micro expressions are subtle, involuntary facial movements that reveal true emotions. To enhance recognition accuracy by exploiting spatial correlations among facial landmarks, a hierarchical continuous attention network (HCANet), a hierarchical Transformer architecture, was proposed to effectively leverage the self-attention mechanism for capturing relationships between landmarks in sequences. HCANet models optical flow differences between onset and apex frames to capture local details and reduce identity interference, thereby avoiding the oversight of local details that occurs when features are extracted directly from full video frames. It consists of a Transformer layer for local temporal feature extraction and an aggregation layer for global facial feature learning. Initially, the face was divided into four regions. Within the Transformer layer, a continuous attention block (CAB) was introduced to focus on the local, minute muscular movements within individual regions and extract local temporal features. Subsequently, the aggregation layer concentrated on learning inter-region interactions, extracting global semantic facial features through a cross-layer attention mechanism. Finally, comparative validations were conducted using leave-one-out cross-validation on four publicly available micro expression datasets (CASME Ⅱ, CASME Ⅲ, SMIC, SAMM) against six other algorithms. The results demonstrate that HCANet achieves improved classification accuracy on the CASME Ⅲ, SMIC, and SAMM datasets and exhibits stronger robustness in complex scenarios (e.g., low frame rates, background noise).http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202525/micro expression recognitionfacial featuredeep learningattention mechanism
spellingShingle YANG Aonan
MO Hong
ZHAO Shili
OUYANG Yuqi
HCANet: a micro expression recognition model based on hierarchical Transformer architecture
智能科学与技术学报
micro expression recognition
facial feature
deep learning
attention mechanism
title HCANet: a micro expression recognition model based on hierarchical Transformer architecture
title_full HCANet: a micro expression recognition model based on hierarchical Transformer architecture
title_fullStr HCANet: a micro expression recognition model based on hierarchical Transformer architecture
title_full_unstemmed HCANet: a micro expression recognition model based on hierarchical Transformer architecture
title_short HCANet: a micro expression recognition model based on hierarchical Transformer architecture
title_sort hcanet a micro expression recognition model based on hierarchical transformer architecture
topic micro expression recognition
facial feature
deep learning
attention mechanism
url http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202525/
work_keys_str_mv AT yangaonan hcanetamicroexpressionrecognitionmodelbasedonhierarchicaltransformerarchitecture
AT mohong hcanetamicroexpressionrecognitionmodelbasedonhierarchicaltransformerarchitecture
AT zhaoshili hcanetamicroexpressionrecognitionmodelbasedonhierarchicaltransformerarchitecture
AT ouyangyuqi hcanetamicroexpressionrecognitionmodelbasedonhierarchicaltransformerarchitecture