Machine Learning-Based Expression Generation Technology for Virtual Characters in Film and Television Art

Bibliographic Details
Main Authors: Yi Zhang (School of Computer Engineering, Suzhou Polytechnic University), Junting Qian (Faculty of Fine and Applied Arts, Bangkok Thonburi University)
Format: Article
Language: English
Published: Springer, 2025-08-01
Series: International Journal of Computational Intelligence Systems, Vol. 18, No. 1 (2025)
Subjects: Expression generation; Television; Film art; Machine learning; Generative adversarial networks; Self-attention mechanism
Online Access:https://doi.org/10.1007/s44196-025-00952-y
ISSN: 1875-6883
Collection: DOAJ

Abstract
Facial expression generation plays an essential role in virtual avatars, television, film art, and human–computer interaction. Existing synthesis models struggle with poor generalization, unnatural distortion, identity inconsistency, and weak lip synchronization with audio. In addition, conventional approaches fail to balance expression fidelity, real-time synthesis, and identity preservation. These difficulties are addressed by introducing the Machine Learning-based Expression Generation Model (ML-EGM), which improves expression generation accuracy. The ML-EGM uses spatial and temporal features to generate expressions by integrating generative adversarial networks, transformer encoding, and lip synchronization. In the generative framework, a generator synthesizes expressive face outputs while a discriminator is trained to distinguish real from generated expressions; this adversarial process teaches the generator to mimic real facial emotions. Transformer encoding manages temporal consistency through a self-attention mechanism, minimizing error and improving the similarity rate. Further, the generated expressions are integrated with the speech synthesis process to enhance overall per-frame expression generation efficiency. Evaluated on the MEAD dataset, the model attains 99.1% expression generation accuracy, a 5.7% improvement over MFA methods. The ML-EGM sets a new standard for realism, emotional consistency, and synthesis speed in expression generation; possible use cases include virtual assistants, AI avatars, deepfake image detection, expressive robotics, and emotion-sensitive human–computer systems.
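As a reading aid for the abstract's adversarial setup, here is a minimal sketch of one generator/discriminator training step. Everything concrete in it (flattened 64×64 RGB frames, an 8-way emotion code matching MEAD's eight emotion categories, MLP networks, BCE losses, Adam settings) is an illustrative assumption; the record does not describe the paper's actual architecture or losses.

```python
# Minimal sketch of the adversarial loop the abstract describes: a generator
# synthesizes expressive face outputs while a discriminator learns to separate
# real expressions from generated ones.
import torch
import torch.nn as nn

FRAME = 64 * 64 * 3   # assumed flattened RGB frame size
EMOTION = 8           # assumed one-hot emotion code (MEAD has 8 emotion categories)

generator = nn.Sequential(
    nn.Linear(FRAME + EMOTION, 512), nn.ReLU(),
    nn.Linear(512, FRAME), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(FRAME, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 1),
)

bce = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(identity_frame, real_expression, emotion_code):
    """One adversarial update: D learns real vs. fake, G learns to fool D."""
    fake = generator(torch.cat([identity_frame, emotion_code], dim=-1))

    # Discriminator update: real frames scored toward 1, generated toward 0.
    d_loss = bce(discriminator(real_expression),
                 torch.ones(real_expression.size(0), 1)) + \
             bce(discriminator(fake.detach()),
                 torch.zeros(fake.size(0), 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: push D's score on generated frames toward "real".
    g_loss = bce(discriminator(fake), torch.ones(fake.size(0), 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```

A real system would operate on convolutional image features rather than flattened pixels; the MLPs here only make the adversarial mechanics concrete.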
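The abstract attributes temporal consistency to transformer encoding with self-attention over the frame sequence. A sketch of that idea follows; the feature size, head count, layer depth, and clip length are all assumed for illustration, not taken from the paper.

```python
# Self-attention over per-frame feature vectors: each frame's representation
# is recomputed in the context of every other frame in the clip, which is the
# mechanism the abstract credits with reducing frame-to-frame error.
import torch
import torch.nn as nn

FEATURE_DIM = 256  # assumed per-frame feature size

encoder_layer = nn.TransformerEncoderLayer(
    d_model=FEATURE_DIM, nhead=8, batch_first=True)
temporal_encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)

frames = torch.randn(2, 30, FEATURE_DIM)  # (batch, time, features): 30-frame clips
smoothed = temporal_encoder(frames)       # same shape, temporally contextualized
```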
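For the lip-synchronization component the record names no mechanism, so the sketch below uses one common formulation from talking-head work (a SyncNet-style agreement score between audio and mouth-region embeddings); the encoders and input sizes are assumptions, not the paper's method.

```python
# Embed an audio window and the corresponding mouth-region frames into a
# shared space and score their agreement with cosine similarity; matching
# audio/mouth pairs are driven together by minimizing the loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

AUDIO_DIM, MOUTH_DIM, EMBED = 80 * 16, 96 * 96, 128  # assumed input sizes

audio_encoder = nn.Sequential(nn.Linear(AUDIO_DIM, 256), nn.ReLU(), nn.Linear(256, EMBED))
mouth_encoder = nn.Sequential(nn.Linear(MOUTH_DIM, 256), nn.ReLU(), nn.Linear(256, EMBED))

def sync_loss(audio_window, mouth_window):
    """Lower loss when audio and mouth embeddings point the same way."""
    a = F.normalize(audio_encoder(audio_window), dim=-1)
    v = F.normalize(mouth_encoder(mouth_window), dim=-1)
    return 1.0 - (a * v).sum(dim=-1).mean()

audio = torch.randn(4, AUDIO_DIM)   # e.g. 16 mel frames of 80 bins, flattened
mouth = torch.randn(4, MOUTH_DIM)   # e.g. a flattened 96x96 mouth crop
loss = sync_loss(audio, mouth)
```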