New Evaluation Method for Fuzzy Cluster Validity Indices

Bibliographic Details
Main Authors: Ismay Perez-Sanchez, Miguel Angel Medina-Perez, Raul Monroy, Octavio Loyola-Gonzalez, Andres Eduardo Gutierrez-Rodriguez
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10855404/
Description
Summary: Cluster analysis is the process of associating data objects with classes of similar objects. A key aspect of the clustering problem is determining the optimal number of groups into which a data set should be partitioned. The process of establishing accurate indicators is called cluster validity, and it evaluates the quality of a clustering, including the optimal number of clusters. A cluster validity index (CVI) is a function that carries out that evaluation based on features such as intra-class compactness, inter-class separation, object density, and the membership degree of objects to each cluster, among others. The best-c validation protocol has previously been used to evaluate CVIs, but it makes a strong assumption about the correctness of the clustering performed. We propose a new protocol based on mapping the internal CVI values to an external CVI and performing statistical tests to analyze the results. The experiments evaluate the partitions generated by the fuzzy c-means clustering algorithm over 84 UCI datasets with 31 fuzzy CVIs. The experimental results show the need for larger experimental setups than those used in previous studies. They also show that, according to a Friedman test, none of the CVIs involved differs significantly from the others. However, the Wilcoxon signed-rank test shows that, in 1 vs. 1 comparisons, the CVI KPBM outperforms the others for most values of the fuzzifier (m). The exception is $m=2.0$, which is consistent with previous studies in which none of the CVIs consistently outperformed the rest of the selected group.
ISSN: 2169-3536
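
To illustrate what the summary means by a CVI computed from the membership degree of objects to each cluster, here is a minimal sketch of Bezdek's partition coefficient, a classic fuzzy CVI. This record does not list the 31 indices used in the article, so the choice of index is an assumption made purely for illustration, as are the function name and the toy membership matrices. It is not the authors' implementation.

```python
from typing import Sequence


def partition_coefficient(memberships: Sequence[Sequence[float]]) -> float:
    """Partition coefficient: mean over objects of the sum of squared memberships.

    `memberships` holds one row per object and one column per cluster
    (hypothetical layout chosen for this sketch). Values range from 1/c
    (maximally fuzzy) to 1.0 (crisp), where c is the number of clusters.
    """
    if not memberships:
        raise ValueError("membership matrix must contain at least one object")
    total = 0.0
    for row in memberships:
        # Fuzzy c-means memberships of one object sum to 1 across clusters.
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of memberships must sum to 1")
        total += sum(u * u for u in row)
    return total / len(memberships)


# Toy data (illustrative only): a crisp and a maximally fuzzy 2-cluster partition.
crisp = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

print(partition_coefficient(crisp))  # 1.0
print(partition_coefficient(fuzzy))  # 0.5
```

An index like this would be computed for each candidate number of clusters, and the value considered best would indicate the preferred partition. How such internal values are mapped to an external CVI is described in the article itself, not in this record.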