A Clustering Algorithm Based on the Detection of Density Peaks and the Interaction Degree Between Clusters

In order to cope with data with an irregular shape and uneven density, this paper proposes a two-phase clustering algorithm based on detecting the peaks of dimensional density and the degree of interaction between clusters (CPDD-ID). In the partitioning phase, the local densities of the data in all...

Bibliographic Details
Main Authors: Yangming Liu, Jiaman Ding, Hongbin Wang, Yi Du
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Applied Sciences
Subjects: clustering; density peak; intersection; shared neighbor; similarity metric
Online Access: https://www.mdpi.com/2076-3417/15/7/3612
author Yangming Liu
Jiaman Ding
Hongbin Wang
Yi Du
author_sort Yangming Liu
collection DOAJ
description To handle data with irregular shapes and uneven density, this paper proposes a two-phase clustering algorithm based on detecting the peaks of the per-dimension density and the degree of interaction between clusters (CPDD-ID). In the partitioning phase, the local density of the data in each dimension is calculated using kernel density estimation, and a density curve is constructed for each dimension from these densities. The peaks of each density curve serve as benchmarks: a Kd-Tree is constructed to search for the data points closest to each peak, which partitions the data into initial sub-clusters. The initial sub-clusters obtained from all dimensions are then intersected to yield the final sub-clusters. This partitioning strategy accurately identifies clusters that differ in density and is effective on data with irregular shapes and uneven densities. In the merging phase, a new similarity measure based on the interaction degree between clusters is proposed. The method computes the interaction degree of shared k-nearest neighbors between neighboring sub-clusters and iteratively merges the sub-clusters with maximum similarity. The proposed similarity measure is effective when clusters overlap heavily or have ambiguous boundaries. The algorithm is tested on 10 synthetic datasets and 10 real UCI datasets and compared with existing state-of-the-art algorithms. The experimental results show that CPDD-ID accurately identifies potential cluster structures and performs well in terms of clustering accuracy.
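The partitioning phase summarized in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `partition_by_dimension_peaks`, the `grid_size` parameter, the use of SciPy's default (Scott) bandwidth, and the choice to build the Kd-Tree over the peak positions and query it with every data point are all assumptions made for illustration. The record itself does not specify these details.

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.spatial import cKDTree
from scipy.stats import gaussian_kde


def partition_by_dimension_peaks(X, grid_size=256):
    """Hypothetical sketch: label points by the nearest 1-D density peak in
    every dimension, then intersect the per-dimension partitions."""
    n, d = X.shape
    per_dim_labels = np.zeros((n, d), dtype=int)
    for j in range(d):
        col = X[:, j]
        # Density curve of this dimension (assumed: Gaussian KDE on a grid).
        grid = np.linspace(col.min(), col.max(), grid_size)
        density = gaussian_kde(col)(grid)
        # Pad with zeros so a maximum at either end of the grid is detected.
        padded = np.concatenate(([0.0], density, [0.0]))
        peak_idx, _ = find_peaks(padded)
        peaks = grid[peak_idx - 1]
        # Kd-Tree over the peak positions; each point takes its closest peak.
        tree = cKDTree(peaks.reshape(-1, 1))
        _, nearest = tree.query(col.reshape(-1, 1))
        per_dim_labels[:, j] = nearest
    # Intersection: two points share a sub-cluster only if they were assigned
    # to the same peak in every dimension.
    _, labels = np.unique(per_dim_labels, axis=0, return_inverse=True)
    return labels.ravel()


# Usage on two well-separated synthetic blobs (illustrative data only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(100, 2)),
    rng.normal(10.0, 0.3, size=(100, 2)),
])
labels = partition_by_dimension_peaks(X)
```

The sketch covers only the partitioning phase. The merging phase depends on the paper's definition of the interaction degree between sub-clusters, which this record does not give, so it is left out.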
format Article
id doaj-art-b11f0a1515f54530bb2e58ba6aca067a
institution OA Journals
issn 2076-3417
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-b11f0a1515f54530bb2e58ba6aca067a
2025-08-20T02:17:00Z
eng
MDPI AG
Applied Sciences
2076-3417
2025-03-01
vol. 15, no. 7, article 3612
10.3390/app15073612
A Clustering Algorithm Based on the Detection of Density Peaks and the Interaction Degree Between Clusters
Yangming Liu (0): Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Jiaman Ding (1): Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Hongbin Wang (2): Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Yi Du (3): City College, Kunming University of Science and Technology, Kunming 650051, China
https://www.mdpi.com/2076-3417/15/7/3612
clustering
density peak
intersection
shared neighbor
similarity metric
title A Clustering Algorithm Based on the Detection of Density Peaks and the Interaction Degree Between Clusters
topic clustering
density peak
intersection
shared neighbor
similarity metric
url https://www.mdpi.com/2076-3417/15/7/3612