AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation
Computer vision advancements allow motion transfer for animating static objects in images. However, current methods rely on manually collected motion labels and struggle with accurate shape and pose representation, particularly for human bodies, due to occlusions and background variations. Thus, we...
| Main Authors: | Nega Asebe Teka, Kumie Gedamu Alemu, Maregu Assefa, Feidu Akmel, Zhenting Zhou, Weijie Wu, Jianwen Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Image animation; motion transfer; representation learning |
| Online Access: | https://ieeexplore.ieee.org/document/11007652/ |
| _version_ | 1849472002892496896 |
|---|---|
| author | Nega Asebe Teka; Kumie Gedamu Alemu; Maregu Assefa; Feidu Akmel; Zhenting Zhou; Weijie Wu; Jianwen Chen |
| author_sort | Nega Asebe Teka |
| collection | DOAJ |
| description | Advances in computer vision make it possible to animate static objects in images through motion transfer. However, current methods rely on manually collected motion labels and struggle to represent shape and pose accurately, particularly for human bodies, because of occlusions and background variations. We therefore propose AMT-Net, an Adversarial Motion Transfer Network with a disentangled shape and pose representation for realistic image animation, built on an encoder-decoder adversarial structure. Specifically, we design a pose and shape learning module that captures shape and pose information independently by training a discriminator with an adversarial loss, which improves the coherence of the generated animated frames. Furthermore, a motion estimation module generates masks for objects in consecutive frames and identifies occluded parts by creating occlusion maps from these masks and dense motion vectors. To evaluate the approach, we conducted extensive experiments on four publicly available datasets: VoxCeleb, TaiChiHD, TED-Talks, and MGif. The results emphasize the importance of landmark detection for video annotation and smooth transitions, while the independent shape and pose module helps capture precise representations. |
| format | Article |
| id | doaj-art-bb77b77c82b44b809b2f01ffe6d69cb5 |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-bb77b77c82b44b809b2f01ffe6d69cb5; 2025-08-20T03:24:39Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 92712; 92729; 10.1109/ACCESS.2025.3571760; 11007652; AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation; Nega Asebe Teka [0] https://orcid.org/0000-0002-6509-1367; Kumie Gedamu Alemu [1] https://orcid.org/0000-0002-6458-1882; Maregu Assefa [2]; Feidu Akmel [3]; Zhenting Zhou [4]; Weijie Wu [5]; Jianwen Chen [6]; School of Information and Communication Engineering, UESTC, Chengdu, China; Sichuan Artificial Intelligence Research Institute, UESTC, Yibin, China; School of Computing and Mathematics, Khalifa University, Abu Dhabi, United Arab Emirates; School of Information and Communication Engineering, UESTC, Chengdu, China; School of Information and Communication Engineering, UESTC, Chengdu, China; School of Information and Communication Engineering, UESTC, Chengdu, China; School of Information and Communication Engineering, UESTC, Chengdu, China; https://ieeexplore.ieee.org/document/11007652/; Image animation; motion transfer; representation learning |
| title | AMT-Net: Adversarial Motion Transfer Network With Disentangled Shape and Pose for Realistic Image Animation |
| topic | Image animation; motion transfer; representation learning |
| url | https://ieeexplore.ieee.org/document/11007652/ |