Secure K-Means Clustering Scheme for Confidential Data Based on Paillier Cryptosystem
In this paper, we propose a secure homomorphic K-means clustering protocol based on the Paillier cryptosystem to address the urgent need for privacy-preserving clustering techniques in sensitive domains such as healthcare and finance. The protocol uses the additive homomorphism property of the Paill...
| Main Authors: | Zhengqi Zhang, Zixin Xiong, Jun Ye |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Applied Sciences |
| Subjects: | K-means; privacy preserving; multi-key; fully homomorphic encryption; outsourced computing |
| Online Access: | https://www.mdpi.com/2076-3417/15/12/6918 |
| _version_ | 1849432151742742528 |
|---|---|
| author | Zhengqi Zhang; Zixin Xiong; Jun Ye |
| collection | DOAJ |
| description | This paper proposes a secure homomorphic K-means clustering protocol based on the Paillier cryptosystem, addressing the need for privacy-preserving clustering in sensitive domains such as healthcare and finance. The protocol uses the additive homomorphic property of the Paillier cryptosystem to perform K-means clustering on encrypted data, which keeps the data confidential throughout the computation. The protocol consists of three main components: the secure computation distance (SCD) protocol, the secure cluster assignment (SCA) protocol, and the secure cluster center update (SUCC) protocol. The SCD protocol securely computes the squared Euclidean distance between an encrypted data point and an encrypted cluster center. The SCA protocol securely assigns data points to clusters based on these encrypted distances. Finally, the SUCC protocol securely updates the cluster centers without leaking the actual data points or the intermediate sums. Security analysis and experimental verification demonstrate the effectiveness and practicality of the protocol. This work provides a practical solution for secure clustering based on homomorphic encryption and contributes to research on privacy-preserving data mining. Although the protocol addresses the key problems of secure distance computation, cluster assignment, and centroid update, several areas remain for further research: optimizing the computational efficiency of the protocol, exploring other homomorphic encryption schemes that may provide better performance, and extending the protocol to handle more complex clustering algorithms. |
| format | Article |
| id | doaj-art-bc90ab160b8f41f3b8ede65e50cb17ea |
| institution | Kabale University |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-bc90ab160b8f41f3b8ede65e50cb17ea; 2025-08-20T03:27:26Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-06-01; vol. 15, no. 12, art. 6918; DOI 10.3390/app15126918; Secure K-Means Clustering Scheme for Confidential Data Based on Paillier Cryptosystem; Zhengqi Zhang, Zixin Xiong, Jun Ye (all: Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou 571158, China); https://www.mdpi.com/2076-3417/15/12/6918; K-means; privacy preserving; multi-key; fully homomorphic encryption; outsourced computing |
| title | Secure K-Means Clustering Scheme for Confidential Data Based on Paillier Cryptosystem |
| topic | K-means; privacy preserving; multi-key; fully homomorphic encryption; outsourced computing |
| url | https://www.mdpi.com/2076-3417/15/12/6918 |
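
The description above relies on the additive homomorphism of the Paillier cryptosystem: multiplying two ciphertexts yields an encryption of the sum of their plaintexts, and raising a ciphertext to a plaintext constant yields an encryption of the product. The following is a minimal sketch of that property only, not the authors' SCD, SCA, or SUCC protocols. The function names (`keygen`, `encrypt`, `decrypt`, `add`, `mul_const`, `encrypted_sum`), the tiny fixed primes, and the sample points are all assumptions made for illustration; the key size is far too small for real use.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy key generation from two tiny fixed primes (insecure; illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                  # a common simple choice of generator
    mu = pow(lam, -1, n)       # with g = n + 1, mu is the inverse of lam mod n
    return (n, g), (lam, mu)

def encrypt(pk, m):
    """Encrypt an integer 0 <= m < n as g^m * r^n mod n^2."""
    n, g = pk
    n2 = n * n
    while True:
        # A real implementation would draw r from a cryptographic RNG.
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pk, sk, c):
    """Recover m using L(u) = (u - 1) // n applied to c^lam mod n^2."""
    n, _ = pk
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add(pk, c1, c2):
    """Homomorphic addition: Enc(a) * Enc(b) mod n^2 decrypts to a + b."""
    n, _ = pk
    return (c1 * c2) % (n * n)

def mul_const(pk, c, k):
    """Homomorphic scaling: Enc(a)^k mod n^2 decrypts to k * a."""
    n, _ = pk
    return pow(c, k, n * n)

def encrypted_sum(pk, enc_points):
    """Coordinate-wise sum of encrypted points, computed without decrypting."""
    total = list(enc_points[0])
    for point in enc_points[1:]:
        total = [add(pk, t, c) for t, c in zip(total, point)]
    return total

pk, sk = keygen()
points = [(3, 4), (5, 8), (10, 6)]          # hypothetical members of one cluster
enc_points = [[encrypt(pk, x) for x in pt] for pt in points]
enc_total = encrypted_sum(pk, enc_points)   # still encrypted
print([decrypt(pk, sk, c) for c in enc_total])  # [18, 18]
```

A coordinate-wise sum of this kind is the sort of quantity a centroid update needs. Squaring an encrypted value, as a squared Euclidean distance between two encrypted vectors requires, goes beyond additive homomorphism alone, so a full protocol needs extra machinery that this sketch does not attempt.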