FW-S3KIFCM: Feature Weighted Safe-Semi-Supervised Kernel-Based Intuitionistic Fuzzy C-Means Clustering Method

Bibliographic Details
Main Authors: Shirin Khezri, Nasser Aghazadeh, Mahdi Hashemzadeh, Amin Golzari Oskouei
Format: Article
Language: English
Published: Tsinghua University Press 2025-07-01
Series: Fuzzy Information and Engineering
Online Access: https://www.sciopen.com/article/10.26599/FIE.2025.9270061
Description
Summary: Semi-supervised clustering (SSC) methods have emerged as a notable research area in machine learning. These methods integrate prior knowledge of the class distribution into the clustering process. Despite their efficiency and simplicity, SSC methods face several fundamental issues. First, unlabeled data generally far outnumber labeled data, which makes the uncertainty of the unlabeled data difficult to handle; this issue arises in many real-world problems. Second, existing SSC techniques do not differentiate between the attributes in the feature space. When forming clusters, they assume that all attributes are equally significant and disregard variations in feature importance, which hinders the creation of optimal clusters. Third, all existing approaches employ the Euclidean distance metric, which is susceptible to noise and outliers. This paper proposes a robust safe-semi-supervised clustering algorithm to mitigate these shortcomings. For the first time, the approach combines Intuitionistic Fuzzy C-Means (IFCM) clustering with Safe-Semi-Supervised Fuzzy C-Means (S3FCM) clustering to address the uncertainty of unlabeled data. It also uses a kernel function as the distance metric to reduce the influence of noise and outliers. In addition, a feature-weighting parameter incorporated into the objective function emphasizes the significant features when forming clusters. The proposed method is evaluated on various benchmark datasets and compared with state-of-the-art methods. The results show that the proposed method outperforms its competitors.
ISSN: 1616-8658, 1616-8666
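
The summary mentions two ingredients that a small example may make more concrete: a kernel function used in place of the Euclidean distance, and per-feature weights. The sketch below is only an illustration under assumptions, not the paper's formulation: it assumes a Gaussian kernel with a hypothetical width parameter `gamma`, the commonly used kernel-induced dissimilarity 1 - K(x, v) applied per feature, and hypothetical fixed `weights` (the record does not say how the method defines or learns them). It omits the intuitionistic fuzzy and safe-semi-supervised parts entirely.

```python
import math

def weighted_kernel_distance(x, v, weights, gamma=1.0):
    """Feature-weighted dissimilarity between point x and center v.

    Assumption: each feature contributes w * (1 - exp(-gamma * diff**2)),
    a Gaussian-kernel-induced term bounded in [0, 1).
    """
    return sum(
        w * (1.0 - math.exp(-gamma * (xi - vi) ** 2))
        for xi, vi, w in zip(x, v, weights)
    )

def squared_euclidean(x, v):
    """Plain squared Euclidean distance, for comparison."""
    return sum((xi - vi) ** 2 for xi, vi in zip(x, v))

# Hypothetical toy data: one cluster center and a point with an
# extreme value in the first feature.
center = [0.0, 0.0]
outlier = [100.0, 0.1]
weights = [0.5, 0.5]  # hypothetical feature weights summing to 1

# The Euclidean term grows without bound with the outlying feature,
# while the kernel-induced dissimilarity can never exceed sum(weights).
print(squared_euclidean(center, outlier))
print(weighted_kernel_distance(center, outlier, weights))

# Setting the weight of the noisy feature to zero removes its influence.
print(weighted_kernel_distance(center, outlier, [0.0, 1.0]))
```

Under these assumptions, each feature's contribution saturates, so a single extreme value cannot dominate the dissimilarity, and lowering a feature's weight reduces its influence on cluster assignment.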