MCATD: Multi-Scale Contextual Attention Transformer Diffusion for Unsupervised Low-Light Image Enhancement

Low-light image enhancement (LLIE) remains a challenging task due to the complex degradation patterns in images captured under insufficient illumination, including non-linear intensity mappings, spatially-varying noise distributions, and content-dependent color distortions. Despite significant advan...

Bibliographic Details
Main Authors:	Cheng da, Yongsheng Qian, Junwei Zeng, Xuting Wei, Futao Zhang
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Deep learning; unsupervised methods; attention mechanisms; diffusion models
Online Access:	https://ieeexplore.ieee.org/document/11014086/

author	Cheng da, Yongsheng Qian, Junwei Zeng, Xuting Wei, Futao Zhang
collection	DOAJ
description	Low-light image enhancement (LLIE) remains a challenging task due to the complex degradation patterns in images captured under insufficient illumination, including non-linear intensity mappings, spatially-varying noise distributions, and content-dependent color distortions. Despite significant advances, existing methods struggle with three fundamental challenges: 1) difficulty in simultaneously preserving structural details while reducing noise, 2) limited generalization across diverse lighting conditions and scene types, and 3) computational inefficiency when processing complex natural scenes. While recent diffusion-based methods have shown promise, they often struggle with generalization and require paired training data. We propose MCATD, a novel unsupervised framework that integrates adaptive sampling, multi-scale feature extraction, and dynamic enhancement capabilities into diffusion models for LLIE. The framework consists of three key components: 1) a Dynamic Adaptive Diffusion Sampling (DADS) strategy that adjusts sampling steps based on image complexity, 2) a Multi-scale Contextual Attention Transformer (MCAT) network that captures features at different scales with attention mechanisms, and 3) a Multi-scale Dynamic Structure-Preserving (MDSP) loss that preserves image structure while optimizing perceptual quality. Experimental results on multiple benchmarks demonstrate that our method outperforms state-of-the-art unsupervised approaches and achieves comparable performance to supervised methods while maintaining better generalization ability. Furthermore, ablation studies validate the effectiveness of each proposed component. The proposed framework not only advances the field of unsupervised LLIE but also provides insights into leveraging diffusion models for image restoration tasks.
format	Article
id	doaj-art-3241fa763877484987aadf95ff18b8b9
institution	DOAJ
issn	2169-3536
language	English
publishDate	2025-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-3241fa763877484987aadf95ff18b8b9 | 2025-08-20T03:11:17Z | eng | IEEE | IEEE Access | 2169-3536 | 2025-01-01 | 13 | 97348 | 97363 | 10.1109/ACCESS.2025.3573171 | 11014086 | MCATD: Multi-Scale Contextual Attention Transformer Diffusion for Unsupervised Low-Light Image Enhancement | Cheng da | Yongsheng Qian, https://orcid.org/0000-0002-6056-4209 | Junwei Zeng, https://orcid.org/0000-0001-6227-4849 | Xuting Wei, https://orcid.org/0000-0003-3570-3024 | Futao Zhang, https://orcid.org/0000-0002-2103-3712 | School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou, China (all five authors) | https://ieeexplore.ieee.org/document/11014086/ | Deep learning; unsupervised methods; attention mechanisms; diffusion models
title	MCATD: Multi-Scale Contextual Attention Transformer Diffusion for Unsupervised Low-Light Image Enhancement
topic	Deep learning; unsupervised methods; attention mechanisms; diffusion models
url	https://ieeexplore.ieee.org/document/11014086/


Similar Items