Construction of a multi-modal digital human education platform based on GAN and vision transformer

Abstract With the rapid development of artificial intelligence technology, digital human education platforms have become a research hotspot in education. This paper proposes a method to build a multi-modal digital human education platform based on a Generative Adversarial Network and a Vision Transf...

Bibliographic Details
Main Authors: Xuliang Yang, Aimin Pan, Rodolfo C. Raga
Format: Article
Language: English
Published: Nature Portfolio 2025-04-01
Series: Scientific Reports
Subjects: GAN; Vision transformer; Multimodal; Educational platform
Online Access: https://doi.org/10.1038/s41598-025-97662-4
author Xuliang Yang
Aimin Pan
Rodolfo C. Raga
collection DOAJ
description Abstract With the rapid development of artificial intelligence technology, digital human education platforms have become a research hotspot in education. This paper proposes a method for building a multi-modal digital human education platform based on a Generative Adversarial Network (GAN) and a Vision Transformer (ViT). The platform enables high-quality avatar generation and interactive learning experiences. For the experiments, we construct a large-scale dataset containing 1000 students and 50 teachers to evaluate the performance of the proposed method. The experimental results show that, compared with existing digital human education platforms, the proposed method significantly improves avatar authenticity, interaction response speed, and learning outcomes. Specifically, the average recognition accuracy of avatars increases by 12%, the interaction response time is shortened by 25%, and students’ academic performance improves by 8% on average. These results suggest that the multi-modal digital human education platform based on GAN and ViT has strong application potential and can provide new solutions for future education models.
format Article
id doaj-art-3ec1f35b5eef479abb43e44ace3b286f
institution OA Journals
issn 2045-2322
language English
publishDate 2025-04-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-3ec1f35b5eef479abb43e44ace3b286f | 2025-08-20T02:10:49Z | eng | Nature Portfolio | Scientific Reports | 2045-2322 | 2025-04-01 | 151117 | 10.1038/s41598-025-97662-4 | Construction of a multi-modal digital human education platform based on GAN and vision transformer | Xuliang Yang 0 | Aimin Pan 1 | Rodolfo C. Raga 2 | University and Urban Integration Development Research Center, Dongguan City University | College of Computing and Information Technologies, National University-Manila | College of Computing and Information Technologies, National University-Manila | Abstract With the rapid development of artificial intelligence technology, digital human education platforms have become a research hotspot in education. This paper proposes a method for building a multi-modal digital human education platform based on a Generative Adversarial Network (GAN) and a Vision Transformer (ViT). The platform enables high-quality avatar generation and interactive learning experiences. For the experiments, we construct a large-scale dataset containing 1000 students and 50 teachers to evaluate the performance of the proposed method. The experimental results show that, compared with existing digital human education platforms, the proposed method significantly improves avatar authenticity, interaction response speed, and learning outcomes. Specifically, the average recognition accuracy of avatars increases by 12%, the interaction response time is shortened by 25%, and students’ academic performance improves by 8% on average. These results suggest that the multi-modal digital human education platform based on GAN and ViT has strong application potential and can provide new solutions for future education models. | https://doi.org/10.1038/s41598-025-97662-4 | GAN | Vision transformer | Multimodal | Educational platform
title Construction of a multi-modal digital human education platform based on GAN and vision transformer
topic GAN
Vision transformer
Multimodal
Educational platform
url https://doi.org/10.1038/s41598-025-97662-4