New Evaluation Method for Fuzzy Cluster Validity Indices

Cluster analysis is the process of associating data objects with classes of similar objects. One key aspect of the clustering problem is to determine the optimal number of groups on a data set to effectively partition it. The process of establishing accurate indicators is called cluster validity, an...

Bibliographic Details
Main Authors: Ismay Perez-Sanchez, Miguel Angel Medina-Perez, Raul Monroy, Octavio Loyola-Gonzalez, Andres Eduardo Gutierrez-Rodriguez
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Fuzzy cluster validation index; evaluation method; fuzzy clustering; fuzzy sets
Online Access: https://ieeexplore.ieee.org/document/10855404/
_version_ 1825207040008519680
author Ismay Perez-Sanchez
Miguel Angel Medina-Perez
Raul Monroy
Octavio Loyola-Gonzalez
Andres Eduardo Gutierrez-Rodriguez
author_sort Ismay Perez-Sanchez
collection DOAJ
description Cluster analysis is the process of associating data objects with classes of similar objects. One key aspect of the clustering problem is determining the optimal number of groups into which a data set should be partitioned. The process of establishing accurate indicators is called cluster validity, and it evaluates the quality of a clustering, including the optimal number of clusters. A cluster validity index (CVI) is a function that carries out that evaluation based on features such as intra-class compactness, inter-class separation, object density, and the membership degree of objects to each cluster, among others. The Best-c validation protocol has been used to evaluate CVIs previously, but it makes a strong assumption about the correctness of the clustering performed. We propose a new protocol based on mapping the internal CVI values to an external CVI and performing statistical tests to analyze the results. The experiments evaluate the partitions generated by the fuzzy c-means clustering algorithm over 84 UCI datasets with 31 fuzzy CVIs. The experimental results show the need for larger experimental setups than those used in previous studies. They also show that, according to a Friedman test, no CVI differs from the others with statistical significance. However, the Wilcoxon signed-rank test shows that, in one-to-one comparisons, the CVI KPBM outperforms the others for most values of the fuzzifier (m). The exception is $m=2.0$, which is consistent with previous studies in which none of the selected CVIs consistently outperformed the rest.
format Article
id doaj-art-a72581fe1af0465396a14ff6ba1dbbf2
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-a72581fe1af0465396a14ff6ba1dbbf2
2025-02-07T00:01:38Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
22728
22744
10.1109/ACCESS.2025.3535417
10855404
New Evaluation Method for Fuzzy Cluster Validity Indices
Ismay Perez-Sanchez 0 https://orcid.org/0000-0003-0917-8764
Miguel Angel Medina-Perez 1 https://orcid.org/0000-0003-4511-2252
Raul Monroy 2 https://orcid.org/0000-0002-3465-995X
Octavio Loyola-Gonzalez 3 https://orcid.org/0000-0002-6910-5922
Andres Eduardo Gutierrez-Rodriguez 4 https://orcid.org/0000-0003-4178-1635
School of Engineering and Sciences, Tecnologico de Monterrey, Atizapán de Zaragoza, Estado de México, Mexico
Kayak Analytics, Dubai, United Arab Emirates
School of Engineering and Sciences, Tecnologico de Monterrey, Atizapán de Zaragoza, Estado de México, Mexico
NTT DATA, Madrid, Spain
Institute for the Future of Education, Tecnologico de Monterrey, Monterrey, Mexico
https://ieeexplore.ieee.org/document/10855404/
Fuzzy cluster validation index
evaluation method
fuzzy clustering
fuzzy sets
title New Evaluation Method for Fuzzy Cluster Validity Indices
topic Fuzzy cluster validation index
evaluation method
fuzzy clustering
fuzzy sets
url https://ieeexplore.ieee.org/document/10855404/