Machine Learning-Based Expression Generation Technology for Virtual Characters in Film and Television Art
Abstract: Facial expression generation plays an essential role in virtual avatars, television, film art, and human–computer interaction. Existing synthesis models face difficulties with poor generalization, unnatural distortion, identity inconsistency, and weak lip synchronization with audio. …
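The record's full description below characterizes the approach as an adversarial generator–discriminator setup. Since the paper itself is not reproduced in this record, the following is only a minimal PyTorch sketch of that general adversarial pattern; every module name, layer size, and shape here is an illustrative assumption, not a detail taken from ML-EGM.

```python
# Minimal sketch of the adversarial pattern the description mentions:
# a generator synthesizes expressive face frames while a discriminator
# learns to separate real frames from generated ones. All layer sizes
# and shapes are illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, cond_dim=256, img_dim=64 * 64 * 3):
        super().__init__()
        # Maps a conditioning vector (e.g. identity + audio features)
        # to a flattened face frame in [-1, 1].
        self.net = nn.Sequential(
            nn.Linear(cond_dim, 512), nn.ReLU(),
            nn.Linear(512, img_dim), nn.Tanh(),
        )

    def forward(self, cond):
        return self.net(cond)

class Discriminator(nn.Module):
    def __init__(self, img_dim=64 * 64 * 3):
        super().__init__()
        # Produces a real/fake logit for a flattened frame.
        self.net = nn.Sequential(
            nn.Linear(img_dim, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1),
        )

    def forward(self, img):
        return self.net(img)

def train_step(gen, disc, real_frames, cond, g_opt, d_opt):
    bce = nn.BCEWithLogitsLoss()
    ones = torch.ones(real_frames.size(0), 1)
    zeros = torch.zeros(real_frames.size(0), 1)

    # Discriminator step: score real frames toward 1, generated toward 0.
    fake = gen(cond).detach()
    d_loss = bce(disc(real_frames), ones) + bce(disc(fake), zeros)
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: try to make the discriminator score fakes as real.
    g_loss = bce(disc(gen(cond)), ones)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```

In a model like the one described, the conditioning vector would presumably carry identity and audio features so that generated lip motion stays synchronized with speech; that coupling is hinted at in the description but not specified in this record.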
| Main Authors: | Yi Zhang, Junting Qian |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-08-01 |
| Series: | International Journal of Computational Intelligence Systems |
| Subjects: | Expression generation; Television; Film art; Machine learning; Generative adversarial networks; Self-attention mechanism |
| Online Access: | https://doi.org/10.1007/s44196-025-00952-y |
| _version_ | 1849225901372342272 |
|---|---|
| author | Yi Zhang; Junting Qian |
| collection | DOAJ |
| description | Abstract: Facial expression generation plays an essential role in virtual avatars, television, film art, and human–computer interaction. Existing synthesis models face difficulties with poor generalization, unnatural distortion, identity inconsistency, and weak lip synchronization with audio. In addition, conventional approaches fail to balance expression fidelity, real-time synthesis, and identity preservation. These difficulties are addressed by introducing the Machine Learning-based Expression Generation Model (ML-EGM) to improve expression generation accuracy. The ML-EGM model uses spatial and temporal features to generate expressions by integrating generative adversarial networks, transformer encoding, and lip synchronization. In the generative framework, a generator synthesizes expressive face outputs while a discriminator is trained to distinguish real expressions from generated ones; this adversarial process teaches the generator to mimic genuine facial emotions. Transformer encoding is then applied to maintain temporal consistency, using a self-attention mechanism to minimize error and improve the similarity rate. Finally, the generated expressions are integrated with the speech synthesis process to improve overall per-frame generation efficiency. Evaluated on the MEAD dataset, the model attains 99.1% expression generation accuracy, a 5.7% improvement over MFA methods. ML-EGM sets a new standard for realism, emotional consistency, and synthesis speed; possible use cases include virtual assistants, AI avatars, deepfake image detection, expressive robotic systems, and emotion-sensitive human–computer systems. |
| format | Article |
| id | doaj-art-3abb532915d7406d996b7ee75a4461ed |
| institution | Kabale University |
| issn | 1875-6883 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Springer |
| record_format | Article |
| series | International Journal of Computational Intelligence Systems |
| spelling | Yi Zhang (School of Computer Engineering, Suzhou Polytechnic University); Junting Qian (Faculty of Fine and Applied Arts, Bangkok Thonburi University). International Journal of Computational Intelligence Systems, Springer, 2025-08-01. https://doi.org/10.1007/s44196-025-00952-y |
| title | Machine Learning-Based Expression Generation Technology for Virtual Characters in Film and Television Art |
| topic | Expression generation; Television; Film art; Machine learning; Generative adversarial networks; Self-attention mechanism |
| url | https://doi.org/10.1007/s44196-025-00952-y |
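The description above also credits transformer encoding with self-attention for keeping generated frames temporally consistent. Below is a minimal, hedged sketch of that idea, smoothing a sequence of per-frame expression features with a standard PyTorch transformer encoder; the feature dimension, head count, layer count, and sequence length are assumptions for illustration only, not ML-EGM's actual configuration.

```python
# Minimal sketch of self-attention over per-frame features, the mechanism
# the description credits with enforcing temporal consistency. Dimensions
# and layer counts are assumptions, not values from the paper.
import torch
import torch.nn as nn

class TemporalEncoder(nn.Module):
    def __init__(self, feat_dim=256, n_heads=4, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, frame_feats):
        # frame_feats: (batch, time, feat_dim) per-frame expression features.
        # Self-attention lets every frame attend to every other frame,
        # which smooths jitter across the sequence.
        return self.encoder(frame_feats)

feats = torch.randn(2, 30, 256)      # 2 clips, 30 frames each (assumed)
smoothed = TemporalEncoder()(feats)  # same shape, temporally mixed
print(smoothed.shape)                # torch.Size([2, 30, 256])
```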