Secure K-Means Clustering Scheme for Confidential Data Based on Paillier Cryptosystem

In this paper, we propose a secure homomorphic K-means clustering protocol based on the Paillier cryptosystem to address the urgent need for privacy-preserving clustering techniques in sensitive domains such as healthcare and finance. The protocol uses the additive homomorphism property of the Paill...

Bibliographic Details
Main Authors: Zhengqi Zhang, Zixin Xiong, Jun Ye
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Applied Sciences
Subjects: K-means; privacy preserving; multi-key; fully homomorphic encryption; outsourced computing
Online Access: https://www.mdpi.com/2076-3417/15/12/6918
author Zhengqi Zhang
Zixin Xiong
Jun Ye
collection DOAJ
description We propose a secure homomorphic K-means clustering protocol based on the Paillier cryptosystem, addressing the need for privacy-preserving clustering in sensitive domains such as healthcare and finance. The protocol uses the additive homomorphism of the Paillier cryptosystem to run K-means clustering on encrypted data, keeping the data confidential throughout the computation. It consists of three components: the secure computation distance (SCD) protocol, the secure cluster assignment (SCA) protocol, and the secure cluster center update (SUCC) protocol. The SCD protocol securely computes the squared Euclidean distance between an encrypted data point and an encrypted cluster center. The SCA protocol securely assigns data points to clusters based on these encrypted distances. The SUCC protocol securely updates the cluster centers without leaking the actual data points or the intermediate sums. Security analysis and experiments demonstrate the protocol's effectiveness and practicality. This work provides a practical solution for secure clustering based on homomorphic encryption and contributes to research on privacy-preserving data mining. Although the protocol addresses secure distance computation, cluster assignment, and centroid update, open problems remain: improving its computational efficiency, exploring other homomorphic encryption schemes that may perform better, and extending it to more complex clustering algorithms.
format Article
id doaj-art-bc90ab160b8f41f3b8ede65e50cb17ea
institution Kabale University
issn 2076-3417
language English
publishDate 2025-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-bc90ab160b8f41f3b8ede65e50cb17ea
2025-08-20T03:27:26Z
eng
MDPI AG
Applied Sciences
2076-3417
2025-06-01
15
12
6918
10.3390/app15126918
Secure K-Means Clustering Scheme for Confidential Data Based on Paillier Cryptosystem
Zhengqi Zhang 0
Zixin Xiong 1
Jun Ye 2
0 Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou 571158, China
1 Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou 571158, China
2 Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou 571158, China
https://www.mdpi.com/2076-3417/15/12/6918
K-means
privacy preserving
multi-key
fully homomorphic encryption
outsourced computing
title Secure K-Means Clustering Scheme for Confidential Data Based on Paillier Cryptosystem
topic K-means
privacy preserving
multi-key
fully homomorphic encryption
outsourced computing
url https://www.mdpi.com/2076-3417/15/12/6918
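
The description in this record says the protocol relies on the additive homomorphism of the Paillier cryptosystem. The Python sketch below illustrates only that underlying property, not the paper's SCD, SCA, or SUCC protocols, whose details are not given in this record. It assumes a textbook Paillier construction with generator g = n + 1 and two tiny demonstration primes (1009 and 1013), which are far too small to be secure. All function names (keygen, encrypt, decrypt, add, scale, encrypted_cluster_sum) are hypothetical and chosen for this illustration.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy Paillier key pair from two small primes (assumed values, insecure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    # With generator g = n + 1, mu is the inverse of lam modulo n.
    # pow(x, -1, n) needs Python 3.8 or newer.
    mu = pow(lam, -1, n)
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n as c = (n + 1)^m * r^n mod n^2."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


def scale(n, c, k):
    """Multiply the hidden plaintext by a public non-negative integer k."""
    return pow(c, k, n * n)


def encrypted_cluster_sum(n, enc_points):
    """Add encrypted points coordinate by coordinate without decrypting them."""
    totals = list(enc_points[0])
    for point in enc_points[1:]:
        totals = [add(n, t, c) for t, c in zip(totals, point)]
    return totals


# Hypothetical cluster of three 2-D integer points.
n, priv = keygen()
points = [(3, 4), (5, 6), (10, 2)]
enc_points = [[encrypt(n, x) for x in pt] for pt in points]

# The party holding only ciphertexts computes the encrypted coordinate sums.
enc_sums = encrypted_cluster_sum(n, enc_points)

# The key holder decrypts the sums and divides by the cluster size.
sums = [decrypt(n, priv, c) for c in enc_sums]
centroid = [s / len(points) for s in sums]
```

In this sketch the division by the cluster size happens after decryption, because Paillier ciphertexts support addition and multiplication by a public integer but not division. How the published protocol hides the intermediate sums during the center update is not stated in this record, so that part is left out.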