Enhancement of Virtual Assistants Through Multimodal AI for Emotion Recognition

Bibliographic Details
Main Authors: Shaun George Rajesh, Smriti Vipin Madangarli, Gauri Santosh Pisharady, Rolla Subrahmanyam
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access:https://ieeexplore.ieee.org/document/11028073/
Description
Summary: Emotion recognition is becoming increasingly critical for enhancing human-computer interactions, as emotions play a vital role in shaping human interactions and overall well-being. Machines that can detect and respond to emotional cues as humans do are essential in multiple industries. Emotionally responsive agents find applications in education, healthcare, gaming, marketing, customer service, human-robot interaction, and entertainment. This study explores the potential of enhancing virtual assistants through multimodal Artificial Intelligence (AI), utilizing various emotion recognition techniques to create more empathetic and effective systems. The proposed methodology uses facial expressions and textual cues to enhance the system's emotional awareness and achieve user satisfaction through empathetic conversation. The Facial Emotion Recognition (FER) model achieved 71% real-time accuracy, whereas the Textual Emotion Recognition (TER) model achieved 59% validation accuracy, demonstrating effective Multimodal Emotion Recognition (MER). Unlike prior multimodal emotion-aware systems, our lightweight architecture ensures real-time inference and uniquely integrates facial and textual emotion recognition with DialoGPT-based response generation, demonstrating compatibility with large language models for empathetic dialogue.
ISSN: 2169-3536
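
Illustrative sketch (not part of the record): the article's code is not reproduced here, so the Python snippet below only sketches the kind of pipeline the summary describes, a late fusion of facial and textual emotion probabilities followed by DialoGPT-based response generation. The emotion label set, fusion weights, prompt format, and the microsoft/DialoGPT-small checkpoint are assumptions made for illustration, not details taken from the paper.

# Minimal late-fusion sketch of a multimodal emotion-aware assistant.
# All interfaces, weights, and labels below are assumed, not the authors' implementation.
import numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed emotion label set shared by the FER and TER models.
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

def fuse_emotions(fer_probs, ter_probs, w_face=0.6, w_text=0.4):
    """Late fusion: weighted average of per-modality emotion probabilities."""
    fused = w_face * np.asarray(fer_probs) + w_text * np.asarray(ter_probs)
    return EMOTIONS[int(np.argmax(fused))]

def empathetic_reply(user_text, emotion, model, tokenizer, max_new_tokens=60):
    """Condition DialoGPT on the detected emotion by prepending it to the prompt."""
    prompt = f"[user feels {emotion}] {user_text}{tokenizer.eos_token}"
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,
        top_p=0.9,
    )
    # Return only the newly generated tokens as the assistant's reply.
    return tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)

if __name__ == "__main__":
    tok = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
    lm = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
    # Placeholder probability vectors standing in for the FER and TER model outputs.
    fer = [0.05, 0.02, 0.03, 0.05, 0.10, 0.70, 0.05]
    ter = [0.10, 0.05, 0.05, 0.05, 0.15, 0.55, 0.05]
    emotion = fuse_emotions(fer, ter)
    print(emotion, "->", empathetic_reply("I failed my exam today.", emotion, lm, tok))

In this sketch the fused emotion label is simply prepended to the user utterance before generation; the system described in the article may condition the dialogue model differently.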