Digital Human Intelligent Interaction System Based on Multimodal Pre-training Mode

Bibliographic Details
Main Authors: Xuliang Yang, Yong Fang, Lili Wang, Rodolfo C. Raga
Format: Article
Language: English
Published: Taylor & Francis Group 2024-12-01
Series: Applied Artificial Intelligence
Online Access: https://www.tandfonline.com/doi/10.1080/08839514.2024.2405953
Collection: DOAJ
Description: Digital humans are at the forefront of intelligent human–computer interaction. Their application scenarios are increasingly diverse, but they also face many challenges. To enhance the intelligence and interaction capability of digital humans, the study first optimizes the traditional UniLM model, then introduces a multi-head attention mechanism, and mitigates exposure bias through adversarial training and random replacement of the decoder, yielding an improved UniLM model. Finally, the improved model is applied to a digital human management system. Compared with the other three models, the improved UniLM model raises precision by 7.68%, 6.4%, and 4.96%, recall by 11.94%, 9.69%, and 8.83%, and F1-score by 8.34%, 6.41%, and 7.68%, respectively, indicating better precision, robustness, and generalization ability. Its perplexity is 135, 95, 76, 71, and 55 at five text lengths, significantly lower than that of the other models, indicating better text generation ability. These results demonstrate the performance of the proposed digital human system based on the improved UniLM model and provide a direction for the further development of digital human technology.
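The description reports precision, recall, F1-score, and perplexity without defining them. As a reference point, the sketch below implements the standard textbook definitions of these metrics. The function names and the toy inputs are illustrative assumptions and are not taken from the article, whose exact evaluation protocol may differ.

```python
import math


def precision_recall_f1(tp, fp, fn):
    """Standard definitions from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative natural-log probability per token).

    Lower values mean the model assigns higher probability to the text.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


if __name__ == "__main__":
    # Toy counts (hypothetical, not from the article).
    print(precision_recall_f1(tp=8, fp=2, fn=2))
    # A model that gives every token probability 0.25 (hypothetical).
    print(perplexity([math.log(0.25)] * 4))
```

Under these definitions, a lower perplexity at a given text length means the model finds the reference text less surprising, which is the sense in which the description treats lower perplexity as better generation ability.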
ISSN: 0883-9514, 1087-6545
Volume: 38
Issue: 1
Affiliations:
Xuliang Yang, Yong Fang, Lili Wang: Dongguan City University, University and Urban Integration Development Research Center, Dongguan, China
Rodolfo C. Raga: College of Computing & Information Technologies, National University, Manila, Philippines