A Clustering Algorithm Based on the Detection of Density Peaks and the Interaction Degree Between Clusters

In order to cope with data with an irregular shape and uneven density, this paper proposes a two-phase clustering algorithm based on detecting the peaks of dimensional density and the degree of interaction between clusters (CPDD-ID). In the partitioning phase, the local densities of the data in all...


Bibliographic Details
Main Authors:	Yangming Liu, Jiaman Ding, Hongbin Wang, Yi Du
Format:	Article
Language:	English
Published:	MDPI AG 2025-03-01
Series:	Applied Sciences
Subjects:	clustering; density peak; intersection; shared neighbor; similarity metric
Online Access:	https://www.mdpi.com/2076-3417/15/7/3612

collection	DOAJ
description	To handle data with irregular shapes and uneven densities, this paper proposes a two-phase clustering algorithm based on detecting per-dimension density peaks and the degree of interaction between clusters (CPDD-ID). In the partitioning phase, the local density of the data in each dimension is estimated with kernel density estimation, and a density curve is constructed for that dimension from the densities of all data points. The peaks of each density curve serve as reference points: a Kd-Tree is built to find the data points closest to each peak, which yields the initial sub-clusters for that dimension. The initial sub-clusters obtained from all dimensions are then intersected to obtain the final sub-clusters. This partitioning strategy accurately identifies clusters with density differences and is effective on data with irregular shapes and uneven densities. In the merging phase, a new similarity measure based on the interaction degree between clusters is proposed. It computes the interaction degree of shared k-nearest neighbors between neighboring sub-clusters and iteratively merges the sub-clusters with maximum similarity. This similarity measure is effective when clusters overlap heavily or have ambiguous boundaries. The algorithm is evaluated on 10 synthetic datasets and 10 real UCI datasets and compared with existing state-of-the-art algorithms. The experimental results show that CPDD-ID accurately identifies potential cluster structures and performs well in terms of clustering accuracy.
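The partitioning phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed bandwidth, and the grid-based peak detection are all assumptions made for illustration. To stay within the Python standard library, a sorted list of peaks with `bisect` stands in for the Kd-Tree nearest-peak search, which is equivalent for one-dimensional queries.

```python
import bisect
import math


def kde_1d(values, grid, bandwidth):
    """Gaussian kernel density estimate of `values`, evaluated at each grid point."""
    norm = len(values) * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in values) / norm
        for g in grid
    ]


def density_peaks(values, bandwidth, grid_size=200):
    """Return grid positions where the 1-D density curve has a local maximum."""
    lo = min(values) - 3 * bandwidth
    hi = max(values) + 3 * bandwidth
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    dens = kde_1d(values, grid, bandwidth)
    return [
        grid[i]
        for i in range(1, grid_size - 1)
        if dens[i] > dens[i - 1] and dens[i] >= dens[i + 1]
    ]


def nearest_peak(peaks, x):
    """Index of the peak closest to x; `peaks` must be sorted ascending.

    Assumption: this bisect lookup replaces the Kd-Tree query for brevity.
    """
    j = bisect.bisect_left(peaks, x)
    candidates = [k for k in (j - 1, j) if 0 <= k < len(peaks)]
    return min(candidates, key=lambda k: abs(peaks[k] - x))


def partition(points, bandwidth):
    """Assign each point a sub-cluster id by intersecting per-dimension labels."""
    n_dims = len(points[0])
    per_dim = []
    for d in range(n_dims):
        column = [p[d] for p in points]
        peaks = density_peaks(column, bandwidth)
        per_dim.append([nearest_peak(peaks, x) for x in column])
    # Two points share a final sub-cluster only if they share a peak in every dimension.
    keys = list(zip(*per_dim))
    ids = {}
    return [ids.setdefault(key, len(ids)) for key in keys]


# Hypothetical toy data: three small groups, two of which overlap in the first dimension.
offsets = [(0.0, 0.0), (0.1, 0.2), (-0.2, 0.1), (0.2, -0.1), (-0.1, -0.2)]
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]
labels = partition(points, bandwidth=0.5)
```

On this toy data the first and third groups fall under the same peak in the first dimension and are separated only by the second dimension, which is the role the intersection step plays. The merging phase (shared k-nearest-neighbor interaction degree) is not covered by this sketch.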
id	doaj-art-b11f0a1515f54530bb2e58ba6aca067a
institution	OA Journals
issn	2076-3417
doi	10.3390/app15073612
volume	15
issue	7
article_number	3612
affiliation	Yangming Liu; Jiaman Ding; Hongbin Wang: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
affiliation	Yi Du: City College, Kunming University of Science and Technology, Kunming 650051, China
