A fine-grained human facial key feature extraction and fusion method for emotion recognition
Abstract Emotion, a fundamental mapping of human responses to external stimuli, has been extensively studied in human–computer interaction, particularly in areas such as intelligent cockpits and systems. However, accurately recognizing emotions from facial expressions remains a significant challenge...
| Main Authors: | Shiwei Li, Jisen Wang, Linbo Tian, Jianqiang Wang, Yan Huang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-02-01 |
| Series: | Scientific Reports |
| Subjects: | Emotion recognition; Global features; Local features; Feature fusion; Valence-Arousal model |
| Online Access: | https://doi.org/10.1038/s41598-025-90440-2 |
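The record's abstract notes that a Haar cascade classifier extracts the eye, mouth, and whole-face regions. As background only, the sketch below shows the integral-image (summed-area table) computation that the Haar-like features inside such cascades are built on; all function names here are illustrative and none come from the paper itself.

```python
# Minimal sketch of the integral-image machinery behind Haar-like features,
# the building block of the Haar cascade classifiers mentioned in the abstract.
# Illustrative names; not code from the paper.

def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle with top-left corner (x, y),
    recovered with four lookups into the integral image."""
    a = ii[y - 1][x - 1] if x > 0 and y > 0 else 0
    b = ii[y - 1][x + w - 1] if y > 0 else 0
    c = ii[y + h - 1][x - 1] if x > 0 else 0
    d = ii[y + h - 1][x + w - 1]
    return d - b - c + a

def two_rect_haar_feature(ii, x, y, w, h):
    """A horizontal two-rectangle Haar-like feature: left-half sum minus
    right-half sum, a simple vertical-edge response."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

In practice such features are evaluated by pretrained cascades (e.g. OpenCV's `cv2.CascadeClassifier` with its shipped frontal-face and eye cascade files), which is presumably how the paper obtains its eye/mouth/face crops; the snippet above only illustrates the underlying computation.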
| _version_ | 1850191160814862336 |
|---|---|
| author | Shiwei Li; Jisen Wang; Linbo Tian; Jianqiang Wang; Yan Huang |
| author_facet | Shiwei Li; Jisen Wang; Linbo Tian; Jianqiang Wang; Yan Huang |
| author_sort | Shiwei Li |
| collection | DOAJ |
| description | Abstract Emotion, a fundamental mapping of human responses to external stimuli, has been extensively studied in human–computer interaction, particularly in areas such as intelligent cockpits and systems. However, accurately recognizing emotions from facial expressions remains a significant challenge due to factors such as lighting conditions, posture, and micro-expressions. Emotion recognition using global or local facial features is a key research direction. However, relying solely on global or local features often results in models that exhibit uneven attention across facial features, neglecting key variations critical for detecting emotional changes. This paper proposes a method for modeling and extracting key facial features by integrating global and local facial data. First, we construct a comprehensive image preprocessing model that includes super-resolution processing, lighting and shading processing, and texture enhancement. This preprocessing step significantly enriches the expression of image features. Second, a global facial feature recognition model is developed using an encoder-decoder architecture, which effectively eliminates environmental noise and generates a comprehensive global feature dataset for facial analysis. Simultaneously, a Haar cascade classifier is employed to extract refined features from key facial regions, including the eyes, mouth, and overall face, resulting in a corresponding local feature dataset. Finally, a two-branch convolutional neural network is designed to integrate both global and local facial feature datasets, enhancing the model’s ability to recognize facial characteristics accurately. The global feature branch fully characterizes the global features of the face, while the local feature branch focuses on the local features. An adaptive fusion module integrates the two, enhancing the model’s ability to differentiate subtle emotional changes. To evaluate the accuracy and robustness of the model, we train and test it on the FER-2013 and JAFFE emotion datasets, achieving average accuracies of 80.59% and 97.61%, respectively. Compared with existing state-of-the-art models, our refined facial feature extraction and fusion model demonstrates superior performance in emotion recognition. The comparative analysis also shows that emotional features across different faces exhibit similarities. Building on psychological research, we categorize the dataset into three emotion classes: positive, neutral, and negative. Emotion recognition accuracy improves significantly under this classification scheme. Additionally, a self-built dataset is used to further validate the practical value of this classification approach. |
| format | Article |
| id | doaj-art-72c28eafa07b4eb3ab6d75fa3bce54e3 |
| institution | OA Journals |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-72c28eafa07b4eb3ab6d75fa3bce54e3; 2025-08-20T02:15:00Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-02-01; 15; 1; 1; 29; 10.1038/s41598-025-90440-2; A fine-grained human facial key feature extraction and fusion method for emotion recognition; Shiwei Li, Jisen Wang, Linbo Tian, Jianqiang Wang, Yan Huang (all: School of Traffic and Transportation, Lanzhou Jiaotong University); abstract as in the description field above; https://doi.org/10.1038/s41598-025-90440-2; Emotion recognition; Global features; Local features; Feature fusion; Valence-Arousal model |
| spellingShingle | Shiwei Li; Jisen Wang; Linbo Tian; Jianqiang Wang; Yan Huang; A fine-grained human facial key feature extraction and fusion method for emotion recognition; Scientific Reports; Emotion recognition; Global features; Local features; Feature fusion; Valence-Arousal model |
| title | A fine-grained human facial key feature extraction and fusion method for emotion recognition |
| title_full | A fine-grained human facial key feature extraction and fusion method for emotion recognition |
| title_fullStr | A fine-grained human facial key feature extraction and fusion method for emotion recognition |
| title_full_unstemmed | A fine-grained human facial key feature extraction and fusion method for emotion recognition |
| title_short | A fine-grained human facial key feature extraction and fusion method for emotion recognition |
| title_sort | fine grained human facial key feature extraction and fusion method for emotion recognition |
| topic | Emotion recognition; Global features; Local features; Feature fusion; Valence-Arousal model |
| url | https://doi.org/10.1038/s41598-025-90440-2 |
| work_keys_str_mv | AT shiweili afinegrainedhumanfacialkeyfeatureextractionandfusionmethodforemotionrecognition AT jisenwang afinegrainedhumanfacialkeyfeatureextractionandfusionmethodforemotionrecognition AT linbotian afinegrainedhumanfacialkeyfeatureextractionandfusionmethodforemotionrecognition AT jianqiangwang afinegrainedhumanfacialkeyfeatureextractionandfusionmethodforemotionrecognition AT yanhuang afinegrainedhumanfacialkeyfeatureextractionandfusionmethodforemotionrecognition AT shiweili finegrainedhumanfacialkeyfeatureextractionandfusionmethodforemotionrecognition AT jisenwang finegrainedhumanfacialkeyfeatureextractionandfusionmethodforemotionrecognition AT linbotian finegrainedhumanfacialkeyfeatureextractionandfusionmethodforemotionrecognition AT jianqiangwang finegrainedhumanfacialkeyfeatureextractionandfusionmethodforemotionrecognition AT yanhuang finegrainedhumanfacialkeyfeatureextractionandfusionmethodforemotionrecognition |
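The abstract describes an adaptive fusion module that blends the global and local feature branches but gives no formula. One common realization of adaptive fusion is a learned sigmoid gate over the concatenated features; the sketch below assumes that design, and the parameters `gate_weights` and `gate_bias` are hypothetical stand-ins, not values or names from the paper.

```python
# Sketch of gated adaptive fusion of a global and a local feature vector.
# This assumes a sigmoid-gate design, one common choice; the paper's exact
# module is not described in this record.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def adaptive_fuse(global_feat, local_feat, gate_weights, gate_bias):
    """Per-dimension gate g in (0, 1), computed from both branches:
    fused[i] = g[i] * global_feat[i] + (1 - g[i]) * local_feat[i].
    gate_weights: dim rows, each of length 2*dim (hypothetical parameters)."""
    concat = global_feat + local_feat  # list concatenation: [global; local]
    gates = []
    for i, row in enumerate(gate_weights):
        z = sum(w * v for w, v in zip(row, concat)) + gate_bias[i]
        gates.append(sigmoid(z))
    return [g * gf + (1 - g) * lf
            for g, gf, lf in zip(gates, global_feat, local_feat)]
```

With zero weights and bias the gate sits at 0.5 and the module averages the two branches; a strongly positive gate passes the global branch through, a strongly negative one the local branch, which is the "adaptive" behavior the abstract alludes to.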