k-Means Clustering Algorithm and Its Simulation Based on Distributed Computing Platform

Bibliographic Details
Main Authors: Chunqiong Wu, Bingwen Yan, Rongrui Yu, Baoqin Yu, Xiukao Zhou, Yanliang Yu, Na Chen
Format: Article
Language: English
Published: Wiley 2021-01-01
Series: Complexity
Online Access: http://dx.doi.org/10.1155/2021/9446653
collection DOAJ
description The explosive growth of data and the scale of mass storage have brought problems such as high computational complexity and insufficient computing power to clustering research. A distributed computing platform uses load balancing to dynamically configure a large number of virtual computing resources, which effectively breaks through the bottleneck of time and energy consumption and gives it distinct advantages in massive data mining. This paper studies parallel k-means in depth. The method first performs initialization by random sampling and then parallelizes the distance calculation, which is independent across data objects, so that cluster analysis can run in parallel. With MapReduce, the distance calculation is spread across many nodes, which speeds up the algorithm. Finally, the clustering of data objects is parallelized. Results show that the method provides services efficiently and stably and has good convergence.
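The steps named in the description (random-sample initialization, a parallel distance calculation, and a parallel clustering update) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `map_assign`, `reduce_centers`, and `parallel_kmeans` are hypothetical, a local thread pool stands in for MapReduce nodes, and data chunks stand in for input splits.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def map_assign(chunk, centers):
    """Map step (assumed form): for one chunk, assign each point to its
    nearest center and return per-center coordinate sums and counts."""
    partial = {}
    for point in chunk:
        idx = min(range(len(centers)),
                  key=lambda i: squared_distance(point, centers[i]))
        sums, count = partial.get(idx, ([0.0] * len(point), 0))
        partial[idx] = ([s + x for s, x in zip(sums, point)], count + 1)
    return partial


def reduce_centers(partials, centers):
    """Reduce step (assumed form): merge the partial sums from all chunks
    and recompute each center as the mean of its assigned points."""
    totals = {}
    for partial in partials:
        for idx, (sums, count) in partial.items():
            acc_sums, acc_count = totals.get(idx, ([0.0] * len(sums), 0))
            totals[idx] = ([a + s for a, s in zip(acc_sums, sums)],
                           acc_count + count)
    new_centers = []
    for i, old in enumerate(centers):
        if i in totals:
            sums, count = totals[i]
            new_centers.append(tuple(s / count for s in sums))
        else:
            new_centers.append(old)  # empty cluster: keep the old center
    return new_centers


def parallel_kmeans(points, k, n_chunks=4, max_iter=50, tol=1e-9, seed=0):
    """k-means with the distance calculation spread over worker threads."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialization by random sampling
    chunks = [points[i::n_chunks] for i in range(n_chunks)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        for _ in range(max_iter):
            # Each chunk is processed independently of the others.
            partials = list(pool.map(lambda ch: map_assign(ch, centers), chunks))
            new_centers = reduce_centers(partials, centers)
            shift = max(squared_distance(a, b)
                        for a, b in zip(centers, new_centers))
            centers = new_centers
            if shift <= tol:  # centers stopped moving
                break
    return centers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    print(sorted(parallel_kmeans(data, k=2)))
```

Because each point's nearest-center search depends only on that point and the current centers, the map step needs no communication between chunks; only the small per-center sums and counts are merged in the reduce step. Threads are used here for brevity and do not give real speedup for pure-Python arithmetic; on an actual cluster each chunk would be processed on a separate node.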
id doaj-art-957bb936754a4c09942848e187708ff5
institution OA Journals
issn 1076-2787
1099-0526
spelling doaj-art-957bb936754a4c09942848e187708ff5 2025-08-20T02:10:04Z eng Wiley Complexity 1076-2787 1099-0526 2021-01-01 2021 10.1155/2021/9446653 9446653
k-Means Clustering Algorithm and Its Simulation Based on Distributed Computing Platform
Chunqiong Wu, Bingwen Yan, Rongrui Yu, Baoqin Yu, Xiukao Zhou, Yanliang Yu: Business College, Yango University, Fuzhou, Fujian Province 350015, China
Na Chen: Big Data Business Intelligence Engineering Research Center, Fujian University, Fuzhou, Fujian Province 350015, China