AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation

Computer vision advancements allow motion transfer for animating static objects in images. However, current methods rely on manually collected motion labels and struggle with accurate shape and pose representation, particularly for human bodies, due to occlusions and background variations. Thus, we...


Bibliographic Details
Main Authors: Nega Asebe Teka, Kumie Gedamu Alemu, Maregu Assefa, Feidu Akmel, Zhenting Zhou, Weijie Wu, Jianwen Chen
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Image animation; motion transfer; representation learning
Online Access: https://ieeexplore.ieee.org/document/11007652/
_version_ 1849472002892496896
author Nega Asebe Teka
Kumie Gedamu Alemu
Maregu Assefa
Feidu Akmel
Zhenting Zhou
Weijie Wu
Jianwen Chen
author_facet Nega Asebe Teka
Kumie Gedamu Alemu
Maregu Assefa
Feidu Akmel
Zhenting Zhou
Weijie Wu
Jianwen Chen
author_sort Nega Asebe Teka
collection DOAJ
description Computer vision advancements allow motion transfer for animating static objects in images. However, current methods rely on manually collected motion labels and struggle with accurate shape and pose representation, particularly for human bodies, due to occlusions and background variations. Thus, we propose an Adversarial Motion Transfer Network with a disentangled Shape and Pose representation for realistic image Animation (AMT-Net), built on an encoder-decoder adversarial structure. Specifically, we design a pose and shape learning module that captures shape and pose information independently by training a discriminator with an adversarial loss, which improves the coherence of the generated animated frames. Furthermore, a motion estimation module generates masks for objects in consecutive frames and identifies occluded parts by creating occlusion maps from these masks and dense motion vectors. To evaluate the effectiveness of our approach, we conducted extensive experiments on four publicly available datasets: VoxCeleb, TaiChiHD, TED-Talks, and MGif. The results emphasize the importance of landmark detection for video annotation and smooth transitions, while the independent shape and pose module helps capture precise representations.
format Article
id doaj-art-bb77b77c82b44b809b2f01ffe6d69cb5
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-bb77b77c82b44b809b2f01ffe6d69cb5 | 2025-08-20T03:24:39Z | eng | IEEE | IEEE Access | 2169-3536 | 2025-01-01 | vol. 13, pp. 92712-92729 | 10.1109/ACCESS.2025.3571760 | 11007652
AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation
Nega Asebe Teka (https://orcid.org/0000-0002-6509-1367), School of Information and Communication Engineering, UESTC, Chengdu, China
Kumie Gedamu Alemu (https://orcid.org/0000-0002-6458-1882), Sichuan Artificial Intelligence Research Institute, UESTC, Yibin, China
Maregu Assefa, School of Computing and Mathematics, Khalifa University, Abu Dhabi, United Arab Emirates
Feidu Akmel, School of Information and Communication Engineering, UESTC, Chengdu, China
Zhenting Zhou, School of Information and Communication Engineering, UESTC, Chengdu, China
Weijie Wu, School of Information and Communication Engineering, UESTC, Chengdu, China
Jianwen Chen, School of Information and Communication Engineering, UESTC, Chengdu, China
https://ieeexplore.ieee.org/document/11007652/
Image animation
motion transfer
representation learning
spellingShingle Nega Asebe Teka
Kumie Gedamu Alemu
Maregu Assefa
Feidu Akmel
Zhenting Zhou
Weijie Wu
Jianwen Chen
AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation
IEEE Access
Image animation
motion transfer
representation learning
title AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation
title_full AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation
title_fullStr AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation
title_full_unstemmed AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation
title_short AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation
title_sort amt net adversarial motion transfer network with disentangled shape and pose for realistic image animation
topic Image animation
motion transfer
representation learning
url https://ieeexplore.ieee.org/document/11007652/
work_keys_str_mv AT negaasebeteka amtnetadversarialmotiontransfernetworkwithdisentangledshapeandposeforrealisticimageanimation
AT kumiegedamualemu amtnetadversarialmotiontransfernetworkwithdisentangledshapeandposeforrealisticimageanimation
AT mareguassefa amtnetadversarialmotiontransfernetworkwithdisentangledshapeandposeforrealisticimageanimation
AT feiduakmel amtnetadversarialmotiontransfernetworkwithdisentangledshapeandposeforrealisticimageanimation
AT zhentingzhou amtnetadversarialmotiontransfernetworkwithdisentangledshapeandposeforrealisticimageanimation
AT weijiewu amtnetadversarialmotiontransfernetworkwithdisentangledshapeandposeforrealisticimageanimation
AT jianwenchen amtnetadversarialmotiontransfernetworkwithdisentangledshapeandposeforrealisticimageanimation