Recurrent Flow Update Model Using Image Pyramid Structure for 4K Video Frame Interpolation

Video frame interpolation (VFI) is a task that generates intermediate frames from two consecutive frames. Previous studies have employed two main approaches to extract the necessary information from both frames: pixel-level synthesis and flow-based methods. However, when synthesizing high-resolution...

Bibliographic Details
Main Authors:	Sangjin Lee, Chajin Shin, Hong-Goo Kang, Sangyoun Lee
Format:	Article
Language:	English
Published:	MDPI AG 2025-01-01
Series:	Sensors
Subjects:	video frame interpolation; end-to-end learning; hierarchical flow refinement; difference map
Online Access:	https://www.mdpi.com/1424-8220/25/1/290

author	Sangjin Lee, Chajin Shin, Hong-Goo Kang, Sangyoun Lee
collection	DOAJ
description	Video frame interpolation (VFI) is a task that generates intermediate frames from two consecutive frames. Previous studies have employed two main approaches to extract the necessary information from both frames: pixel-level synthesis and flow-based methods. However, when synthesizing high-resolution videos using VFI, each approach has its limitations. Pixel-level synthesis based on the transformer architecture requires high complexity to achieve 4K video results. In the case of flow-based methods, forward warping can produce holes where pixels are not allocated, while backward warping approaches struggle to obtain accurate backward flow. Additionally, there are challenges during the training stage; previous works have often generated suboptimal results by training multi-stage model architectures separately. To address these issues, we propose a Recurrent Flow Update (RFU) model trained in an end-to-end manner. We introduce a global flow update module that leverages global information to mitigate the weaknesses of forward flow and gradually correct errors. We demonstrate the effectiveness of our method through several ablation studies. Our approach achieves state-of-the-art performance not only on the XTest and Davis datasets, which have 4K resolution, but also on the SNU-FILM dataset, which features large motions at low resolution.
format	Article
id	doaj-art-94f296ab8a4248d1b00c6e5c31673706
institution	Kabale University
issn	1424-8220
language	English
publishDate	2025-01-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj-art-94f296ab8a4248d1b00c6e5c31673706 | 2025-01-10T13:21:29Z | eng | MDPI AG | Sensors | ISSN 1424-8220 | 2025-01-01 | vol. 25, no. 1, article 290 | DOI 10.3390/s25010290 | Recurrent Flow Update Model Using Image Pyramid Structure for 4K Video Frame Interpolation | Sangjin Lee, Chajin Shin, Hong-Goo Kang, Sangyoun Lee (all: School of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Republic of Korea) | https://www.mdpi.com/1424-8220/25/1/290 | video frame interpolation; end-to-end learning; hierarchical flow refinement; difference map
title	Recurrent Flow Update Model Using Image Pyramid Structure for 4K Video Frame Interpolation
topic	video frame interpolation; end-to-end learning; hierarchical flow refinement; difference map
url	https://www.mdpi.com/1424-8220/25/1/290

Similar Items