Big data processing using hybrid Gaussian mixture model with salp swarm algorithm

Abstract The traditional methods used in big data, like cluster creation and query-based data extraction, fail to yield accurate results on massive networks. To address such issues, the proposed approach involves using the Hadoop Distributed File System (HDFS) for data processing, the map-reduce pro...

Bibliographic Details
Main Authors:	R. Saravanakumar, T. TamilSelvi, Digvijay Pandey, Binay Kumar Pandey, Darshan A. Mahajan, Mesfin Esayas Lelisho
Format:	Article
Language:	English
Published:	SpringerOpen 2024-11-01
Series:	Journal of Big Data
Subjects:	Hadoop distributed file system (HDFS) Map-reduce Gaussian mixture model (GMM) Salp swarm algorithm (SSA) Secure hash algorithms (SHA)
Online Access:	https://doi.org/10.1186/s40537-024-01015-3

_version_	1850162730139385856
author	R. Saravanakumar T. TamilSelvi Digvijay Pandey Binay Kumar Pandey Darshan A. Mahajan Mesfin Esayas Lelisho
author_sort	R. Saravanakumar
collection	DOAJ
description	Abstract Traditional big data methods, such as cluster creation and query-based data extraction, fail to yield accurate results on massive networks. To address this, the proposed approach uses the Hadoop Distributed File System (HDFS) and the map-reduce programming paradigm for data processing, together with query optimization techniques, to extract accurate outcomes quickly and effectively from a variety of options with a high processing capacity. The methodology uses a Gaussian Mixture Model (GMM) for data clustering and the Salp Swarm Algorithm (SSA) for optimization. SHA algorithms secure the preprocessed data stored on interconnected networked clusters. Finally, the experimental performance of the model is evaluated with respect to the important parameters. The estimated input file sizes in this work range from 60 to 100 MB. Processing 100 MB of input files yielded an accuracy of 96%, a specificity of 90%, and a sensitivity of 93%. The outcomes were compared with well-known methods such as fuzzy C-means and K-means, and the results show that the proposed method distributes accurate data processing to cluster nodes with low latency. Moreover, it uses the least amount of memory when operating on functional CPUs. As a result, the proposed approach outperforms existing techniques.
format	Article
id	doaj-art-d078bbed49be4f47b192e3010e8c29b3
institution	OA Journals
issn	2196-1115
language	English
publishDate	2024-11-01
publisher	SpringerOpen
record_format	Article
series	Journal of Big Data
spelling	doaj-art-d078bbed49be4f47b192e3010e8c29b3 | 2025-08-20T02:22:29Z | eng | SpringerOpen | Journal of Big Data | 2196-1115 | 2024-11-01 | vol. 11, no. 1, pp. 1-29 | 10.1186/s40537-024-01015-3 | Big data processing using hybrid Gaussian mixture model with salp swarm algorithm | R. Saravanakumar (Department of CSE, Dayananda Sagar Academy of Technology & Management) | T. TamilSelvi (Department of CSE, Panimalar Institute of Technology) | Digvijay Pandey (Department of Technical Education Uttar Pradesh) | Binay Kumar Pandey (Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology Pantnagar) | Darshan A. Mahajan (NICMAR University Pune) | Mesfin Esayas Lelisho (Department of Statistics, Mizan-Tepi University) | https://doi.org/10.1186/s40537-024-01015-3 | Hadoop distributed file system (HDFS) | Map-reduce | Gaussian mixture model (GMM) | Salp swarm algorithm (SSA) | Secure hash algorithms (SHA)
title	Big data processing using hybrid Gaussian mixture model with salp swarm algorithm
topic	Hadoop distributed file system (HDFS) Map-reduce Gaussian mixture model (GMM) Salp swarm algorithm (SSA) Secure hash algorithms (SHA)
url	https://doi.org/10.1186/s40537-024-01015-3
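
The abstract in this record reports accuracy, specificity, and sensitivity without stating how they are derived. As a minimal sketch, assuming the standard binary confusion-matrix definitions (the record does not confirm which definitions the authors used), the three measures can be computed as shown here. The function name `classification_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    positives = tp + fn  # all actual positives
    negatives = tn + fp  # all actual negatives
    return {
        # share of all samples that were classified correctly
        "accuracy": (tp + tn) / total,
        # true positive rate; undefined (NaN) when there are no positives
        "sensitivity": tp / positives if positives else float("nan"),
        # true negative rate; undefined (NaN) when there are no negatives
        "specificity": tn / negatives if negatives else float("nan"),
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = classification_metrics(tp=45, fp=10, tn=40, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are reported separately from accuracy because accuracy alone can look high on imbalanced data even when one class is handled poorly.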
