A CNN-transformer framework for emotion recognition in code-mixed English–Hindi data

Abstract Social media is an open platform for users to express their views and thoughts through emotions about a particular topic in natural language, leading to the generation of a vast amount of emotional data on micro-blogging sites. This data needs to be processed to extract meaningful insights...

Bibliographic Details
Main Authors: Shreya Patankar, Madhura Phadke
Format: Article
Language: English
Published: Springer 2025-07-01
Series: Discover Artificial Intelligence
Subjects: Deep learning; Natural language processing; Emotion; Code-mixed; Embeddings
Online Access: https://doi.org/10.1007/s44163-025-00400-y
author Shreya Patankar
Madhura Phadke
collection DOAJ
description Abstract: Social media is an open platform where users express their views and emotions about particular topics in natural language, generating a vast amount of emotional data on micro-blogging sites. This data needs to be processed to extract meaningful insights and analyze emotions from text, which in turn supports various Natural Language Processing (NLP) applications. While humans can easily infer emotions using commonsense knowledge, machines lack this perception and must be trained to detect emotions, making emotion detection a highly researched task in NLP. The challenge intensifies in Indian contexts, where users often switch between two or more languages (code-mixing) to express opinions. The lack of annotated datasets for such multilingual data makes this a promising and underexplored area of research. To address this, we propose a hybrid deep learning framework using a CNN-Transformer model trained on Hindi–English code-mixed tweets categorized into four emotions: Happy, Sad, Anger, and Neutral. Unlike previous approaches that rely solely on monolingual models or pre-trained transformers, our method combines local feature extraction via CNNs with global contextual modeling through Transformers specifically designed for code-mixed structures. We also use pre-trained word embeddings fine-tuned on the dataset to improve semantic representation. The proposed model achieves an F1-score of 0.82, outperforming both CNN-only and Transformer-only baselines, and demonstrates robust emotion classification in code-mixed social media text.
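The abstract reports an F1-score of 0.82 over four emotion classes, but this record does not say which averaging scheme was used. As a rough illustration only, the sketch below computes a macro-averaged F1 (an assumption, not necessarily the authors' metric) in plain Python. The function names and the toy labels are hypothetical and are not taken from the article or its dataset.

```python
# Minimal sketch: macro-averaged F1 over the four emotion classes named in the abstract.
# Assumption: macro averaging; the record does not state which variant the authors used.

LABELS = ["Happy", "Sad", "Anger", "Neutral"]


def per_class_f1(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical toy labels, for illustration only (not data from the article).
y_true = ["Happy", "Sad", "Anger", "Neutral", "Happy", "Sad"]
y_pred = ["Happy", "Sad", "Anger", "Happy", "Happy", "Neutral"]

score = macro_f1(y_true, y_pred)
```

Macro averaging weights every class equally, which matters when one emotion (often Neutral) dominates the data; a support-weighted average would instead weight each class by how often it occurs.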
format Article
id doaj-art-b96ac0e3fc724b419209cb855f53d647
institution Kabale University
issn 2731-0809
language English
publishDate 2025-07-01
publisher Springer
record_format Article
series Discover Artificial Intelligence
spelling Shreya Patankar (K. J. Somaiya Institute of Technology); Madhura Phadke (K. J. Somaiya Institute of Technology). A CNN-transformer framework for emotion recognition in code-mixed English–Hindi data. Discover Artificial Intelligence, Springer, ISSN 2731-0809, 2025-07-01. https://doi.org/10.1007/s44163-025-00400-y. Keywords: Deep learning; Natural language processing; Emotion; Code-mixed; Embeddings. Record doaj-art-b96ac0e3fc724b419209cb855f53d647, 2025-08-20T03:43:10Z, eng.
title A CNN-transformer framework for emotion recognition in code-mixed English–Hindi data
topic Deep learning
Natural language processing
Emotion
Code-mixed
Embeddings
url https://doi.org/10.1007/s44163-025-00400-y