Appearance consistency and motion coherence learning for internal video inpainting

Abstract Internal learning‐based video inpainting methods have shown promising results by exploiting the intrinsic properties of the video to fill in the missing region without external dataset supervision. However, existing internal learning‐based video inpainting methods often produce inconsistent...

Bibliographic Details
Main Authors: Ruixin Liu, Yuesheng Zhu, GuiBo Luo
Format: Article
Language:English
Published: Wiley 2025-06-01
Series:CAAI Transactions on Intelligence Technology
Subjects:
Online Access:https://doi.org/10.1049/cit2.12405
_version_ 1849473019142995968
author Ruixin Liu
Yuesheng Zhu
GuiBo Luo
author_facet Ruixin Liu
Yuesheng Zhu
GuiBo Luo
author_sort Ruixin Liu
collection DOAJ
description Abstract Internal learning‐based video inpainting methods have shown promising results by exploiting the intrinsic properties of the video to fill in the missing region without external dataset supervision. However, existing internal learning‐based video inpainting methods often produce inconsistent structures or blurry textures due to the insufficient utilisation of motion priors within the video sequence. In this paper, the authors propose a new internal learning‐based video inpainting model called the appearance consistency and motion coherence network (ACMC‐Net), which not only learns the recurrence of the appearance prior but also captures a motion coherence prior to improve the quality of the inpainting results. In ACMC‐Net, a transformer‐based appearance network is developed to capture global context information within the video frame for representing appearance consistency accurately. Additionally, a novel motion coherence learning scheme is proposed to learn the motion prior in a video sequence effectively. Finally, the learnt internal appearance consistency and motion coherence are implicitly propagated to the missing regions to achieve high‐quality inpainting. Extensive experiments conducted on the DAVIS dataset show that the proposed model achieves superior performance on quantitative measurements and produces more visually plausible results compared with state‐of‐the‐art methods.
format Article
id doaj-art-9defbc77a08348e4b4f8943eb3d4cc0c
institution Kabale University
issn 2468-2322
language English
publishDate 2025-06-01
publisher Wiley
record_format Article
series CAAI Transactions on Intelligence Technology
spelling doaj-art-9defbc77a08348e4b4f8943eb3d4cc0c (2025-08-20T03:24:17Z)
Language: eng | Publisher: Wiley | Series: CAAI Transactions on Intelligence Technology | ISSN: 2468-2322 | Published: 2025-06-01 | Vol. 10, Iss. 3, pp. 827-841 | DOI: 10.1049/cit2.12405
Title: Appearance consistency and motion coherence learning for internal video inpainting
Authors: Ruixin Liu, Yuesheng Zhu, GuiBo Luo (Shenzhen Graduate School, Peking University, Shenzhen, China)
Online access: https://doi.org/10.1049/cit2.12405
Keywords: deep internal learning; motion coherence; spatial-temporal priors; transformer network; video inpainting
spellingShingle Ruixin Liu
Yuesheng Zhu
GuiBo Luo
Appearance consistency and motion coherence learning for internal video inpainting
CAAI Transactions on Intelligence Technology
deep internal learning
motion coherence
spatial‐temporal priors
transformer network
video inpainting
title Appearance consistency and motion coherence learning for internal video inpainting
title_full Appearance consistency and motion coherence learning for internal video inpainting
title_fullStr Appearance consistency and motion coherence learning for internal video inpainting
title_full_unstemmed Appearance consistency and motion coherence learning for internal video inpainting
title_short Appearance consistency and motion coherence learning for internal video inpainting
title_sort appearance consistency and motion coherence learning for internal video inpainting
topic deep internal learning
motion coherence
spatial‐temporal priors
transformer network
video inpainting
url https://doi.org/10.1049/cit2.12405
work_keys_str_mv AT ruixinliu appearanceconsistencyandmotioncoherencelearningforinternalvideoinpainting
AT yueshengzhu appearanceconsistencyandmotioncoherencelearningforinternalvideoinpainting
AT guiboluo appearanceconsistencyandmotioncoherencelearningforinternalvideoinpainting