MFN: Multi-Scale Frequency Feature Fusion Network for Multi-Classification Image Segmentation

Bibliographic Details
Main Authors: Ji Xiao, Li Jianfang, Zhao Peng, Li Xiaochen, Zhang Chengchun, Zheng Junyi, Pang Yonghui, Huang Xiangsheng
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11015700/
Description
Summary: Semantic segmentation is a fundamental task in computer vision. In multi-class scenarios, the abundance of categories, feature similarity across classes, class imbalance, and the complexity of feature-space modeling pose significant challenges in delineating features and class boundaries. Recent multi-class segmentation methods often suffer from boundary ambiguity, loss of detail, and inadequate preservation of contextual information. To address these challenges, this paper introduces the Multi-scale Frequency Feature Fusion Network (MFN), a convolutional neural network designed to effectively integrate multi-scale information. First, the DeepLab network architecture is enhanced by refining the ASPP module: finer-grained atrous convolutions are introduced in its parallel branches to capture multi-scale semantic information, enabling the extraction of richer and more detailed features. The network incorporates a hybrid attention module to emphasize important spatial locations and strengthen feature dependencies between channels. During the feature fusion stage, the FreqFusion module adaptively smooths high-level features to address intra-class inconsistency and boundary misalignment. In addition, a combined loss function integrating cross-entropy loss and the Dice coefficient is designed to improve model training and achieve optimal segmentation performance. Extensive experiments on the Cityscapes and Pascal VOC datasets demonstrate that the proposed method achieves superior performance.
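The combined loss mentioned in the summary (cross-entropy plus a Dice term) can be sketched as follows. This is a minimal per-pixel illustration under stated assumptions, not the paper's implementation: the exact Dice formulation, the smoothing constant `eps`, and the weighting factor `lam` are assumptions, since the record does not specify them.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def combined_loss(logits, labels, num_classes, lam=1.0, eps=1e-6):
    """Cross-entropy plus soft Dice over a flat list of pixels.

    logits: one list of class scores per pixel; labels: true class id per pixel.
    lam weights the Dice term (hypothetical; the paper's weighting is not given here).
    """
    probs = [softmax(l) for l in logits]
    # Cross-entropy: mean negative log-probability of the true class.
    ce = -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)
    # Soft Dice per class: 1 - 2|P ∩ G| / (|P| + |G|), averaged over classes.
    dice = 0.0
    for c in range(num_classes):
        inter = sum(p[c] for p, y in zip(probs, labels) if y == c)
        denom = sum(p[c] for p in probs) + sum(1 for y in labels if y == c)
        dice += 1.0 - (2.0 * inter + eps) / (denom + eps)
    dice /= num_classes
    return ce + lam * dice
```

The Dice term directly rewards region overlap per class, which is why such hybrids are commonly used to counter class imbalance: confident, correct predictions drive both terms toward zero, while a misassigned class raises both.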
ISSN:2169-3536