MFF: A Deep Learning Model for Multi-Modal Image Fusion Based on Multiple Filters

Multi-modal image fusion refers primarily to fusing the features of two or more images of different modalities captured from the same viewpoint, so as to increase the amount of information contained in a single image. This study proposes a multi-modal image fusion deep network called the MFF network. Compared with traditional image fusion models, the MFF network decomposes high-frequency features more finely; in contrast to popular transformer networks, it uses multiple filter networks to extract the corresponding high- and low-frequency features, which shortens training and inference time. First, the MFF network uses GaborNet filtering modules to extract high-frequency texture features and invertible neural network (INN) modules to extract high-frequency edge features; together, these two sets of features constitute the high-frequency characteristics of an image. The LEF module serves as a low-pass filter that captures the low-frequency characteristics of an image. Training and fusion then exploit the correlation among low-frequency features and the non-correlation among high-frequency features. Systematic comparisons with other state-of-the-art image fusion models on the TNO, MSRS, and RoadScene datasets show that the MFF model achieves superior performance in visible-infrared image fusion. Evaluations on the LLVIP dataset further confirm the model's effectiveness in downstream machine vision tasks, and comparisons on the MRI_CT, MRI_PET, and MRI_SPECT datasets demonstrate that the MFF model also excels at medical image fusion.

Bibliographic Details
Main Authors: Yuequn Wang, Zhengwei Li, Jianli Wang, Leqiang Yang, Bo Dong, Hanfu Zhang, Jie Liu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Autoencoder; deep learning; filter; image fusion; visible-infrared image fusion
Online Access: https://ieeexplore.ieee.org/document/10877823/
_version_ 1849425094554681344
author Yuequn Wang
Zhengwei Li
Jianli Wang
Leqiang Yang
Bo Dong
Hanfu Zhang
Jie Liu
author_facet Yuequn Wang
Zhengwei Li
Jianli Wang
Leqiang Yang
Bo Dong
Hanfu Zhang
Jie Liu
author_sort Yuequn Wang
collection DOAJ
description Multi-modal image fusion refers primarily to fusing the features of two or more images of different modalities captured from the same viewpoint, so as to increase the amount of information contained in a single image. This study proposes a multi-modal image fusion deep network called the MFF network. Compared with traditional image fusion models, the MFF network decomposes high-frequency features more finely; in contrast to popular transformer networks, it uses multiple filter networks to extract the corresponding high- and low-frequency features, which shortens training and inference time. First, the MFF network uses GaborNet filtering modules to extract high-frequency texture features and invertible neural network (INN) modules to extract high-frequency edge features; together, these two sets of features constitute the high-frequency characteristics of an image. The LEF module serves as a low-pass filter that captures the low-frequency characteristics of an image. Training and fusion then exploit the correlation among low-frequency features and the non-correlation among high-frequency features. Systematic comparisons with other state-of-the-art image fusion models on the TNO, MSRS, and RoadScene datasets show that the MFF model achieves superior performance in visible-infrared image fusion. Evaluations on the LLVIP dataset further confirm the model's effectiveness in downstream machine vision tasks, and comparisons on the MRI_CT, MRI_PET, and MRI_SPECT datasets demonstrate that the MFF model also excels at medical image fusion.
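The description above is the one technical passage in this record: it names a GaborNet branch for high-frequency texture, an INN branch for high-frequency edges, and an LEF low-pass branch for the low-frequency base. Since the paper's code is not part of this record, the following is only a minimal PyTorch sketch of that three-branch decomposition under stated assumptions: GaborBank, CouplingBlock, low_pass, and every kernel size and channel count are illustrative inventions, not the authors' GaborNet, INN, or LEF implementations.

```python
# Minimal sketch of the three-branch, multi-filter decomposition described in
# the abstract. All names, kernel sizes, and channel counts are assumptions
# for illustration; this is NOT the authors' released MFF code.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


def gabor_kernel(ksize=7, sigma=2.0, theta=0.0, lambd=4.0, gamma=0.5):
    """Real-valued Gabor kernel: an oriented band-pass filter for texture."""
    half = ksize // 2
    y, x = torch.meshgrid(
        torch.arange(-half, half + 1, dtype=torch.float32),
        torch.arange(-half, half + 1, dtype=torch.float32),
        indexing="ij",
    )
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = torch.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    return envelope * torch.cos(2 * math.pi * xr / lambd)


class GaborBank(nn.Module):
    """Fixed bank of oriented Gabor filters -> high-frequency texture features
    (a stand-in for the paper's GaborNet module, whose filters are learned)."""

    def __init__(self, orientations=4, ksize=7):
        super().__init__()
        bank = torch.stack(
            [gabor_kernel(ksize, theta=i * math.pi / orientations)
             for i in range(orientations)]
        ).unsqueeze(1)                      # (orientations, 1, k, k)
        self.register_buffer("bank", bank)
        self.pad = ksize // 2

    def forward(self, x):                   # x: (B, 1, H, W) grayscale
        return F.conv2d(x, self.bank, padding=self.pad)


class CouplingBlock(nn.Module):
    """Additive coupling layer, the basic invertible-network (INN) building
    block; here it stands in for the paper's high-frequency edge branch."""

    def __init__(self, channels=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels // 2, channels // 2, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels // 2, channels // 2, 3, padding=1),
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)          # split channels in half
        return torch.cat([x1, x2 + self.net(x1)], dim=1)  # invertible update


def low_pass(x, ksize=9):
    """Box-filter low-pass stand-in for the paper's LEF module."""
    w = torch.full((1, 1, ksize, ksize), 1.0 / ksize ** 2, device=x.device)
    return F.conv2d(x, w, padding=ksize // 2)


# Decompose one image into the three feature sets named in the abstract.
img = torch.rand(1, 1, 128, 128)            # stand-in grayscale input
texture = GaborBank()(img)                  # high-frequency texture features
lift = nn.Conv2d(1, 4, 3, padding=1)        # lift to an even channel count
edges = CouplingBlock(4)(lift(img))         # high-frequency edge features
base = low_pass(img)                        # low-frequency characteristics
```

The additive coupling layer is the standard building block of invertible neural networks, which is why it stands in for the INN branch here; the fusion rule itself (correlated low-frequency features, non-correlated high-frequency features) would operate on these three feature sets and is not sketched.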
format Article
id doaj-art-8a88de79e10e4d4393a16f6e8514b8d5
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-8a88de79e10e4d4393a16f6e8514b8d5 (indexed 2025-08-20T03:29:52Z)
IEEE Access, vol. 13, pp. 38076-38090, 2025-01-01. ISSN 2169-3536. Publisher: IEEE. Language: English. DOI: 10.1109/ACCESS.2025.3540007. IEEE document: 10877823.
MFF: A Deep Learning Model for Multi-Modal Image Fusion Based on Multiple Filters
Authors: Yuequn Wang (https://orcid.org/0000-0001-7154-0539), Zhengwei Li, Jianli Wang (https://orcid.org/0000-0002-2969-1664), Leqiang Yang, Bo Dong, Hanfu Zhang, Jie Liu
Affiliation (all authors): Changchun Institute of Optics, Fine Mechanics and Physics (CIOMP), Chinese Academy of Sciences, Changchun, China
Abstract: as given in the description field above.
Online access: https://ieeexplore.ieee.org/document/10877823/
Keywords: Autoencoder; deep learning; filter; image fusion; visible-infrared image fusion
spellingShingle Yuequn Wang
Zhengwei Li
Jianli Wang
Leqiang Yang
Bo Dong
Hanfu Zhang
Jie Liu
MFF: A Deep Learning Model for Multi-Modal Image Fusion Based on Multiple Filters
IEEE Access
Autoencoder
deep learning
filter
image fusion
visible-infrared image fusion
title MFF: A Deep Learning Model for Multi-Modal Image Fusion Based on Multiple Filters
title_full MFF: A Deep Learning Model for Multi-Modal Image Fusion Based on Multiple Filters
title_fullStr MFF: A Deep Learning Model for Multi-Modal Image Fusion Based on Multiple Filters
title_full_unstemmed MFF: A Deep Learning Model for Multi-Modal Image Fusion Based on Multiple Filters
title_short MFF: A Deep Learning Model for Multi-Modal Image Fusion Based on Multiple Filters
title_sort mff a deep learning model for multi modal image fusion based on multiple filters
topic Autoencoder
deep learning
filter
image fusion
visible-infrared image fusion
url https://ieeexplore.ieee.org/document/10877823/
work_keys_str_mv AT yuequnwang mffadeeplearningmodelformultimodalimagefusionbasedonmultiplefilters
AT zhengweili mffadeeplearningmodelformultimodalimagefusionbasedonmultiplefilters
AT jianliwang mffadeeplearningmodelformultimodalimagefusionbasedonmultiplefilters
AT leqiangyang mffadeeplearningmodelformultimodalimagefusionbasedonmultiplefilters
AT bodong mffadeeplearningmodelformultimodalimagefusionbasedonmultiplefilters
AT hanfuzhang mffadeeplearningmodelformultimodalimagefusionbasedonmultiplefilters
AT jieliu mffadeeplearningmodelformultimodalimagefusionbasedonmultiplefilters