An Enhanced k-Means Clustering Algorithm for Pattern Discovery in Healthcare Data
The huge amounts of data generated by media sensors in health monitoring systems, by medical diagnoses that produce media (audio, video, image, and text) content, and from health service providers are too complex and voluminous to be processed and analyzed by traditional methods. Data mining approac...
Main Authors: | Ramzi A. Haraty, Mohamad Dimishkieh, Mehedi Masud |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2015-06-01 |
Series: | International Journal of Distributed Sensor Networks |
Online Access: | https://doi.org/10.1155/2015/615740 |
_version_ | 1832555305377988608 |
---|---|
author | Ramzi A. Haraty, Mohamad Dimishkieh, Mehedi Masud |
author_facet | Ramzi A. Haraty, Mohamad Dimishkieh, Mehedi Masud |
author_sort | Ramzi A. Haraty |
collection | DOAJ |
description | The huge amounts of data generated by media sensors in health monitoring systems, by medical diagnoses that produce media (audio, video, image, and text) content, and from health service providers are too complex and voluminous to be processed and analyzed by traditional methods. Data mining approaches offer the methodology and technology to transform these heterogeneous data into meaningful information for decision making. This paper studies data mining applications in healthcare. In particular, we study k-means clustering algorithms on large datasets and present an enhancement to k-means clustering that requires k or fewer passes over a dataset. The proposed algorithm, which we call G-means, uses a greedy approach to produce the preliminary centroids and then takes k or fewer passes over the dataset to adjust these center points. Our experiments, run in an incremental manner on the same dataset, show that G-means outperforms k-means in terms of entropy and F-scores. The experiments also yield better results for G-means in terms of the coefficient of variance and the execution time. |
format | Article |
id | doaj-art-3dd5378671dc46d89dca487acd273da2 |
institution | Kabale University |
issn | 1550-1477 |
language | English |
publishDate | 2015-06-01 |
publisher | Wiley |
record_format | Article |
series | International Journal of Distributed Sensor Networks |
spelling | doaj-art-3dd5378671dc46d89dca487acd273da2; 2025-02-03T05:48:31Z; eng; Wiley; International Journal of Distributed Sensor Networks; 1550-1477; 2015-06-01; 11; 10.1155/2015/615740; 615740; An Enhanced k-Means Clustering Algorithm for Pattern Discovery in Healthcare Data; Ramzi A. Haraty (Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon); Mohamad Dimishkieh (Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon); Mehedi Masud (Department of Computer Science, Taif University, Taif, Saudi Arabia) |
spellingShingle | Ramzi A. Haraty; Mohamad Dimishkieh; Mehedi Masud; An Enhanced k-Means Clustering Algorithm for Pattern Discovery in Healthcare Data; International Journal of Distributed Sensor Networks |
title | An Enhanced k-Means Clustering Algorithm for Pattern Discovery in Healthcare Data |
title_full | An Enhanced k-Means Clustering Algorithm for Pattern Discovery in Healthcare Data |
title_fullStr | An Enhanced k-Means Clustering Algorithm for Pattern Discovery in Healthcare Data |
title_full_unstemmed | An Enhanced k-Means Clustering Algorithm for Pattern Discovery in Healthcare Data |
title_short | An Enhanced k-Means Clustering Algorithm for Pattern Discovery in Healthcare Data |
title_sort | enhanced means clustering algorithm for pattern discovery in healthcare data |
url | https://doi.org/10.1155/2015/615740 |
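
The description in this record says the proposed algorithm picks preliminary centroids greedily and then adjusts them in k or fewer passes over the dataset. The record does not say which greedy rule the authors use, so the sketch below is only an illustration of that general shape, not the published G-means procedure. It assumes farthest-first seeding as the greedy step and caps the standard k-means update loop at k passes. All function names (`squared_distance`, `greedy_seeds`, `bounded_kmeans`) are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def greedy_seeds(points, k):
    """Farthest-first seeding (an ASSUMED greedy rule, not taken from the paper).

    Starts from the first point, then repeatedly adds the point whose
    nearest already-chosen seed is farthest away.
    """
    seeds = [points[0]]
    while len(seeds) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, s) for s in seeds),
        )
        seeds.append(farthest)
    return seeds


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def bounded_kmeans(points, k, max_passes=None):
    """k-means with greedy seeds and at most `max_passes` passes (default: k).

    Returns (centroids, labels, passes_used). If the pass budget runs out
    before convergence, the labels are those computed in the last pass,
    before the final centroid update.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    if max_passes is None:
        max_passes = k
    centroids = greedy_seeds(points, k)
    labels = None
    passes_used = 0
    for passes_used in range(1, max_passes + 1):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels, passes_used


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels, passes_used = bounded_kmeans(data, k=2)
    print(centroids, labels, passes_used)
```

Capping the loop at k passes mirrors the bound stated in the description; whether the published algorithm reaches that bound in the same way is not stated in this record.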