FDI-VSR: Video Super-Resolution Through Frequency-Domain Integration and Dynamic Offset Estimation

The increasing adoption of high-resolution imaging sensors across various fields has led to a growing demand for techniques to enhance video quality. Video super-resolution (VSR) addresses this need by reconstructing high-resolution videos from lower-resolution inputs; however, directly applying sin...

Full description

Saved in:
Bibliographic Details
Main Authors: Donghun Lim, Janghoon Choi
Format: Article
Language:English
Published: MDPI AG 2025-04-01
Series:Sensors
Subjects:
Online Access:https://www.mdpi.com/1424-8220/25/8/2402
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The increasing adoption of high-resolution imaging sensors across various fields has led to a growing demand for techniques to enhance video quality. Video super-resolution (VSR) addresses this need by reconstructing high-resolution videos from lower-resolution inputs; however, directly applying single-image super-resolution (SISR) methods to video sequences neglects temporal information, resulting in inconsistent and unnatural outputs. In this paper, we propose FDI-VSR, a novel framework that integrates spatiotemporal dynamics and frequency-domain analysis into conventional SISR models without extensive modifications. We introduce two key modules: the Spatiotemporal Feature Extraction Module (STFEM), which employs dynamic offset estimation, spatial alignment, and multi-stage temporal aggregation using residual channel attention blocks (RCABs); and the Frequency–Spatial Integration Module (FSIM), which transforms deep features into the frequency domain to effectively capture global context beyond the limited receptive field of standard convolutions. Extensive experiments on the Vid4, SPMCs, REDS4, and UDM10 benchmarks, supported by detailed ablation studies, demonstrate that FDI-VSR not only surpasses conventional VSR methods but also achieves competitive results compared to recent state-of-the-art methods, with improvements of up to 0.82 dB in PSNR on the SPMCs benchmark and notable reductions in visual artifacts, all while maintaining lower computational complexity and faster inference.
ISSN:1424-8220