A fine-grained human facial key feature extraction and fusion method for emotion recognition

Bibliographic Details
Main Authors: Shiwei Li, Jisen Wang, Linbo Tian, Jianqiang Wang, Yan Huang
Format: Article
Language:English
Published: Nature Portfolio 2025-02-01
Series:Scientific Reports
Subjects: Emotion recognition; Global features; Local features; Feature fusion; Valence-Arousal model
Online Access:https://doi.org/10.1038/s41598-025-90440-2
collection DOAJ
description Abstract Emotion, a fundamental mapping of human responses to external stimuli, has been extensively studied in human–computer interaction, particularly in areas such as intelligent cockpits and systems. However, accurately recognizing emotions from facial expressions remains a significant challenge due to lighting conditions, posture, and micro-expressions. Emotion recognition using global or local facial features is a key research direction. However, relying solely on global or local features often yields models whose attention is unevenly distributed across facial features, neglecting key variations critical for detecting emotional changes. This paper proposes a method for modeling and extracting key facial features by integrating global and local facial data. First, we construct a comprehensive image preprocessing model that includes super-resolution processing, lighting and shading processing, and texture enhancement; this preprocessing step significantly enriches the expression of image features. Second, a global facial feature recognition model is developed using an encoder-decoder architecture, which effectively eliminates environmental noise and generates a comprehensive global feature dataset for facial analysis. Simultaneously, a Haar cascade classifier is employed to extract refined features from key facial regions, including the eyes, mouth, and overall face, resulting in a corresponding local feature dataset. Finally, a two-branch convolutional neural network is designed to integrate both global and local facial feature datasets, enhancing the model’s ability to recognize facial characteristics accurately. The global feature branch fully characterizes the global features of the face, while the local feature branch focuses on local details. An adaptive fusion module integrates the two feature streams, enhancing the model’s ability to differentiate subtle emotional changes.
To evaluate the accuracy and robustness of the model, we train and test it on the FER-2013 and JAFFE emotion datasets, achieving average accuracies of 80.59% and 97.61%, respectively. Compared to existing state-of-the-art models, our refined face feature extraction and fusion model demonstrates superior performance in emotion recognition. Additionally, comparative analysis shows that emotional features are similar across different faces. Building on psychological research, we therefore categorize the dataset into three emotion classes: positive, neutral, and negative. Emotion recognition accuracy improves significantly under this coarser classification. Furthermore, a self-built dataset is used to further validate that this classification approach has important implications for practical applications.
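The three-class regrouping described above can be illustrated as a simple relabeling of the seven FER-2013 expression categories. The exact assignment below (e.g. whether "surprise" counts as positive) is a hypothetical mapping for illustration, not the one given in the paper.

```python
# Hypothetical mapping of the seven FER-2013 labels onto the three coarser
# classes (positive / neutral / negative) mentioned in the abstract.
THREE_CLASS = {
    "happy": "positive",
    "surprise": "positive",   # assumption: surprise treated as positive
    "neutral": "neutral",
    "angry": "negative",
    "disgust": "negative",
    "fear": "negative",
    "sad": "negative",
}

def regroup(label: str) -> str:
    """Map a fine-grained expression label to its coarse emotion class."""
    return THREE_CLASS[label]

print(regroup("happy"))  # positive
print(regroup("fear"))   # negative
```

Collapsing visually similar fine-grained classes in this way is one plausible explanation for the accuracy gains the abstract reports under the new criteria.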
id doaj-art-72c28eafa07b4eb3ab6d75fa3bce54e3
institution OA Journals
issn 2045-2322
affiliation School of Traffic and Transportation, Lanzhou Jiaotong University (all five authors)
topic Emotion recognition
Global features
Local features
Feature fusion
Valence-Arousal model