An Enhanced k-Means Clustering Algorithm for Pattern Discovery in Healthcare Data

The huge amounts of data generated by media sensors in health monitoring systems, by medical diagnoses that produce media (audio, video, image, and text) content, and by health service providers are too complex and voluminous to be processed and analyzed by traditional methods. Data mining approac...

Bibliographic Details
Main Authors: Ramzi A. Haraty, Mohamad Dimishkieh, Mehedi Masud
Format: Article
Language: English
Published: Wiley 2015-06-01
Series: International Journal of Distributed Sensor Networks
Online Access: https://doi.org/10.1155/2015/615740
collection DOAJ
description The huge amounts of data generated by media sensors in health monitoring systems, by medical diagnoses that produce media (audio, video, image, and text) content, and by health service providers are too complex and voluminous to be processed and analyzed by traditional methods. Data mining approaches offer the methodology and technology to transform these heterogeneous data into meaningful information for decision making. This paper studies data mining applications in healthcare. In particular, we study k-means clustering algorithms on large datasets and present an enhancement to k-means clustering that requires k or fewer passes over a dataset. The proposed algorithm, which we call G-means, uses a greedy approach to produce the preliminary centroids and then takes k or fewer passes over the dataset to adjust these center points. Our experimental results, obtained from experiments run in an increasing manner on the same dataset, show that G-means outperforms k-means in terms of entropy and F-scores. The experiments also yield better results for G-means in terms of the coefficient of variance and the execution time.
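The two stages named in the abstract (greedy preliminary centroids, then at most k adjustment passes) can be illustrated with a minimal Python sketch. The abstract does not say how the greedy step chooses centroids, so the farthest-point rule below is an assumption made only for illustration, and the names `greedy_seeds` and `bounded_kmeans` are hypothetical. This is not the authors' G-means implementation.

```python
import math


def greedy_seeds(points, k):
    """Pick k preliminary centroids greedily.

    ASSUMPTION: farthest-point selection. The abstract only says "greedy";
    the actual rule used by G-means is not given in this record.
    """
    seeds = [points[0]]
    while len(seeds) < k:
        # Choose the point whose nearest already-chosen seed is farthest away.
        farthest = max(points, key=lambda p: min(math.dist(p, s) for s in seeds))
        seeds.append(farthest)
    return seeds


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def bounded_kmeans(points, k):
    """Greedy seeding followed by at most k assignment passes over the data."""
    centroids = greedy_seeds(points, k)
    labels = None
    passes = 0
    for _ in range(k):  # never more than k passes over the dataset
        new_labels = assign(points, centroids)
        passes += 1
        if new_labels == labels:
            break  # assignments are stable, so stop early
        labels = new_labels
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels, passes


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels, passes = bounded_kmeans(data, 2)
    print(centroids, labels, passes)
```

The pass limit is the only property taken directly from the abstract; a standard k-means loop would instead iterate until convergence with no bound tied to k.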
id doaj-art-3dd5378671dc46d89dca487acd273da2
institution Kabale University
issn 1550-1477
spelling doaj-art-3dd5378671dc46d89dca487acd273da2; 2025-02-03T05:48:31Z; eng; Wiley; International Journal of Distributed Sensor Networks; ISSN 1550-1477; 2015-06-01; volume 11; DOI 10.1155/2015/615740; article 615740
affiliations Ramzi A. Haraty: Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
Mohamad Dimishkieh: Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
Mehedi Masud: Department of Computer Science, Taif University, Taif, Saudi Arabia