Urban informal settlements interpretation via a novel multi-modal Kolmogorov–Arnold fusion network by exploring hierarchical features from remote sensing and street view images

Urban informal settlements (UIS) interpretation has important scientific value for achieving urban sustainable development. Recent research on UIS interpretation tasks mainly includes the single-modality method, which uses remote sensing images, and the multi-modality method, which uses remote sensin...

Bibliographic Details
Main Authors: Hongyang Niu, Runyu Fan, Jiajun Chen, Zijian Xu, Ruyi Feng
Format: Article
Language: English
Published: Elsevier 2025-06-01
Series: Science of Remote Sensing
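Subjects: Urban informal settlements; Multimodal; Remote sensing image (RSI); Street view image (SVI); Kolmogorov–Arnold Network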
Online Access: http://www.sciencedirect.com/science/article/pii/S2666017225000148
_version_ 1850217749163278336
author Hongyang Niu
Runyu Fan
Jiajun Chen
Zijian Xu
Ruyi Feng
collection DOAJ
description Urban informal settlements (UIS) interpretation has important scientific value for achieving urban sustainable development. Recent research on UIS interpretation tasks mainly includes the single-modality method, which uses remote sensing images, and the multi-modality method, which uses remote sensing and geospatial data. However, from a single remote sensing perspective, the inter-class similarities and the regional mixture of complex geo-objects seen from a bird's-eye view of UIS areas make UIS interpretation extremely challenging. Current multi-modal methods either cannot fully explore the modality-specific features within each modality or ignore the modality-correlation features between different modalities. To address these issues, this study proposes a novel multi-modal Kolmogorov–Arnold fusion network, namely KANFusion, which explores the modality-specific features within each modality and fuses the modality-correlation features between different modalities to boost UIS interpretation using remote sensing and street view images. The proposed KANFusion model employs the Kolmogorov–Arnold Network (KAN) instead of the conventional MLP structure to enhance the model's capability to fit heterogeneous modality-specific features, and uses a novel Multi-level Feature Fusion Module with KAN block (MFF) to fuse the hierarchical modality-specific and modality-fusion features from remote sensing and street view images for better UIS interpretation performance. We conducted extensive experiments on the manually annotated ChinaUIS dataset of eight megacities in China and on the public S2UV dataset, and compared the proposed KANFusion with other state-of-the-art methods. The experimental results confirmed the superiority of the proposed KANFusion. This work is available at https://github.com/cyg-nhyang/KANFusion.
format Article
id doaj-art-91915bd5ea2b4d40aa461abd3e263d7f
institution OA Journals
issn 2666-0172
language English
publishDate 2025-06-01
publisher Elsevier
record_format Article
series Science of Remote Sensing
spelling doaj-art-91915bd5ea2b4d40aa461abd3e263d7f
2025-08-20T02:07:59Z
eng
Elsevier
Science of Remote Sensing
2666-0172
2025-06-01
Volume 11, Article 100208
10.1016/j.srs.2025.100208
Urban informal settlements interpretation via a novel multi-modal Kolmogorov–Arnold fusion network by exploring hierarchical features from remote sensing and street view images
Hongyang Niu (School of Computer Science, China University of Geosciences, Wuhan, 430078, China)
Runyu Fan (Corresponding author; School of Computer Science, China University of Geosciences, Wuhan, 430078, China)
Jiajun Chen (School of Computer Science, China University of Geosciences, Wuhan, 430078, China)
Zijian Xu (School of Computer Science, China University of Geosciences, Wuhan, 430078, China)
Ruyi Feng (School of Computer Science, China University of Geosciences, Wuhan, 430078, China)
http://www.sciencedirect.com/science/article/pii/S2666017225000148
Urban informal settlements
Multimodal
Remote sensing image (RSI)
Street view image (SVI)
Kolmogorov–Arnold Network
title Urban informal settlements interpretation via a novel multi-modal Kolmogorov–Arnold fusion network by exploring hierarchical features from remote sensing and street view images
topic Urban informal settlements
Multimodal
Remote sensing image (RSI)
Street view image (SVI)
Kolmogorov–Arnold Network
url http://www.sciencedirect.com/science/article/pii/S2666017225000148