Exploring Dynamic Hierarchical Fusion for Multi-View Clustering


Bibliographic Details
Main Authors: Zhenshan Chen, Xuran Tian, Lizhen Gao, Gengbin Liao, Yang Shen, Kainan Zheng, Jianglin Fang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10925213/
Description
Summary: Multi-view clustering is effective at uncovering the latent structures within different views or modalities. However, existing approaches often oversimplify the problem by treating the contribution and granularity of information from all views as uniform, neglecting the semantic richness and diversity inherent in different views. To address this limitation, we propose a dynamic hierarchical fusion method that not only integrates multi-granularity representations from multiple views but also dynamically computes and adjusts the contribution of each view's representation to the clustering task through weighted fusion. Specifically, we design a multi-view hierarchical feature fusion module that adaptively maps the intermediate representations from all view-specific encoders into a unified representation space, enabling effective multi-scale fusion. This process yields a set of intermediate representations that transition from coarse to fine granularity across multiple views and scales. Additionally, we introduce a multi-view gated fusion module, which uses a set of learnable normalized parameters to dynamically compute the contribution of each view's representation and its multi-scale features. This weighted fusion produces a unified clustering representation that captures the most relevant information for the clustering task. Experimental results on benchmark datasets such as Scene-15 show that our method outperforms existing state-of-the-art methods by up to 1.88% in clustering accuracy (e.g., 41.08% compared to 39.20%). Ablation studies demonstrate the importance of the multi-view gated fusion module in learning the relative contributions of different views, which significantly enhances clustering performance.
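The two fusion steps in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the averaging across scales, and the softmax parameterization of the "learnable normalized parameters" are illustrative assumptions based only on the abstract (fixed random projection matrices stand in for trained encoder mappings).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(w):
    """Normalize raw weights so they are positive and sum to 1."""
    e = np.exp(w - w.max())
    return e / e.sum()

def hierarchical_fusion(view_feats, proj_mats):
    """Hypothetical hierarchical feature fusion: map each view's
    multi-scale intermediate features into a shared d-dim space via
    per-scale projections, then average the scales per view.

    view_feats: list over views; each entry is a list of (d_s,) feature
                vectors, one per encoder scale (coarse to fine).
    proj_mats:  matching list of (d_s, d) projection matrices.
    Returns one (d,) representation per view.
    """
    fused = []
    for feats, mats in zip(view_feats, proj_mats):
        unified = [h @ W for h, W in zip(feats, mats)]  # into shared space
        fused.append(np.mean(unified, axis=0))
    return fused

def gated_fusion(view_reprs, logits):
    """Hypothetical gated fusion: softmax-normalized weighted sum of the
    per-view representations, yielding one clustering representation."""
    alphas = softmax(logits)  # dynamic per-view contributions
    fused = sum(a * v for a, v in zip(alphas, view_reprs))
    return fused, alphas

# Toy usage: 2 views, each with 2 scales of differing dimensionality,
# all mapped into a shared 4-dim space.
d = 4
view_feats = [[rng.standard_normal(8), rng.standard_normal(16)],
              [rng.standard_normal(8), rng.standard_normal(16)]]
proj_mats = [[rng.standard_normal((8, d)), rng.standard_normal((16, d))]
             for _ in range(2)]
per_view = hierarchical_fusion(view_feats, proj_mats)
z, alphas = gated_fusion(per_view, logits=np.zeros(2))
```

In a trained model the projection matrices and the gating logits would be learned jointly with the clustering objective; here equal logits simply give each view a weight of 0.5.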
ISSN:2169-3536