AvatarWild: Fully controllable head avatars in the wild

Recent advancements in the field have resulted in significant progress in achieving realistic head reconstruction and manipulation using neural radiance fields (NeRF). Despite these advances, capturing intricate facial details remains a persistent challenge. Moreover, casually captured input, involving both head poses and camera movements, introduces additional difficulties to existing methods of head avatar reconstruction. To address the challenge posed by video data captured with camera motion, we propose a novel method, AvatarWild, for reconstructing head avatars from monocular videos taken by consumer devices. Notably, our approach decouples the camera pose and head pose, allowing reconstructed avatars to be visualized with different poses and expressions from novel viewpoints. To enhance the visual quality of the reconstructed facial avatar, we introduce a view-dependent detail enhancement module designed to augment local facial details without compromising viewpoint consistency. Our method demonstrates superior performance compared to existing approaches, as evidenced by reconstruction and animation results on both multi-view and single-view datasets. Remarkably, our approach stands out by exclusively relying on video data captured by portable devices, such as smartphones. This not only underscores the practicality of our method but also extends its applicability to real-world scenarios where accessibility and ease of data capture are crucial.
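The abstract's central design choice is decoupling the camera pose from the head pose. The sketch below is only an illustrative, hypothetical rendering of that idea in NumPy, not the authors' implementation: all function names and numeric values are invented here. It treats both poses as independent world-space rigid transforms, so rendering from a novel viewpoint changes only the camera transform while the head pose (and expression parameters) stay fixed.

```python
# Conceptual sketch of camera/head pose decoupling (hypothetical, not from the paper).
import numpy as np

def rigid_transform(rotation: np.ndarray, translation: np.ndarray) -> np.ndarray:
    """Assemble a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector translation."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Head pose in world coordinates (canonical orientation, placed at the origin).
head_to_world = rigid_transform(np.eye(3), np.zeros(3))

# Capture camera, 2 m in front of the head (example values).
camera_to_world = rigid_transform(np.eye(3), np.array([0.0, 0.0, 2.0]))

# A novel viewpoint: only the camera transform changes; head pose and expression are untouched.
novel_camera_to_world = rigid_transform(np.eye(3), np.array([0.5, 0.0, 2.0]))

# Rays are cast in camera space, so rendering composes world->camera with head->world.
head_to_camera = np.linalg.inv(novel_camera_to_world) @ head_to_world
print(head_to_camera)  # 4x4 transform mapping canonical head points into the novel view
```

Under a coupled formulation, only the single head-to-camera transform is observable, so the two motions cannot be edited separately; tracking them independently, as sketched above, is what allows re-rendering the same avatar from new viewpoints.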

Bibliographic Details
Main Authors: Shaoxu Meng, Tong Wu, Fang-Lue Zhang, Shu-Yu Chen, Yuewen Ma, Wenbo Hu, Lin Gao
Format: Article
Language: English
Published: Elsevier 2024-09-01
Series: Visual Informatics
Subjects: Neural radiance fields; Head avatar synthesis; Face reconstruction; Face reenactment; Facial animation
Online Access: http://www.sciencedirect.com/science/article/pii/S2468502X24000421
author Shaoxu Meng
Tong Wu
Fang-Lue Zhang
Shu-Yu Chen
Yuewen Ma
Wenbo Hu
Lin Gao
collection DOAJ
description Recent advancements in the field have resulted in significant progress in achieving realistic head reconstruction and manipulation using neural radiance fields (NeRF). Despite these advances, capturing intricate facial details remains a persistent challenge. Moreover, casually captured input, involving both head poses and camera movements, introduces additional difficulties to existing methods of head avatar reconstruction. To address the challenge posed by video data captured with camera motion, we propose a novel method, AvatarWild, for reconstructing head avatars from monocular videos taken by consumer devices. Notably, our approach decouples the camera pose and head pose, allowing reconstructed avatars to be visualized with different poses and expressions from novel viewpoints. To enhance the visual quality of the reconstructed facial avatar, we introduce a view-dependent detail enhancement module designed to augment local facial details without compromising viewpoint consistency. Our method demonstrates superior performance compared to existing approaches, as evidenced by reconstruction and animation results on both multi-view and single-view datasets. Remarkably, our approach stands out by exclusively relying on video data captured by portable devices, such as smartphones. This not only underscores the practicality of our method but also extends its applicability to real-world scenarios where accessibility and ease of data capture are crucial.
format Article
id doaj-art-0bfbfcda949846f0ad70d1480c22f52e
institution OA Journals
issn 2468-502X
language English
publishDate 2024-09-01
publisher Elsevier
record_format Article
series Visual Informatics
spelling Visual Informatics, Vol. 8, No. 3 (2024-09-01), pp. 96-106. DOI: 10.1016/j.visinf.2024.09.001
Author affiliations:
Shaoxu Meng: Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
Tong Wu: Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
Fang-Lue Zhang: Victoria University of Wellington, Wellington, New Zealand
Shu-Yu Chen: Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Yuewen Ma: ByteDance Ltd., Beijing, China
Wenbo Hu: ByteDance Ltd., Beijing, China
Lin Gao (Corresponding author): Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
title AvatarWild: Fully controllable head avatars in the wild
topic Neural radiance fields
Head avatar synthesis
Face reconstruction
Face reenactment
Facial animation
url http://www.sciencedirect.com/science/article/pii/S2468502X24000421