Research on collaborative filtering algorithm based on improved K-means algorithm for user attribute rating and co-rating
Abstract In response to the challenges of information overload in large datasets, as well as issues related to data sparsity and cold starts in traditional recommendation algorithms, this paper proposes a collaborative filtering algorithm based on an enhanced K-means algorithm for user attribute rat...
| Main Authors: | ShengShai Zhang, Shiping Chen, Xiaodong Yu, Shaowei Mei |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-06-01 |
| Series: | Scientific Reports |
| Online Access: | https://doi.org/10.1038/s41598-025-96705-0 |
| _version_ | 1850136589873709056 |
|---|---|
| author | ShengShai Zhang Shiping Chen Xiaodong Yu Shaowei Mei |
| author_facet | ShengShai Zhang Shiping Chen Xiaodong Yu Shaowei Mei |
| author_sort | ShengShai Zhang |
| collection | DOAJ |
| description | Abstract: In response to the challenges of information overload in large datasets, as well as data sparsity and cold starts in traditional recommendation algorithms, this paper proposes a collaborative filtering algorithm based on an enhanced K-means algorithm for user attribute rating and common rating. The proposed algorithm leverages a pre-constructed user attribute rating matrix and computes the objective weights of user attribute rating features using variance analysis. Specifically, it is observed that features with larger sample variances exert greater influence on clustering outcomes. Furthermore, this study integrates the similarity computed by the improved K-means clustering algorithm with user-item common rating similarities to derive a novel similarity computation method. Consequently, a new collaborative filtering model emerges from this approach. To validate the effectiveness of the model, we constructed datasets utilizing samples from MovieLens 100K and MovieLens 1M. Comparative experiments revealed that, first, employing both the silhouette coefficient method and cross-validation yielded optimal clustering results at K = 7 and K = 6, respectively. Second, through Mean Absolute Error (MAE) calculations, we verified that the MAE values associated with our proposed improved similarity measure were significantly lower, by 65% and 60%, compared to those derived from other methods such as Pearson correlation, Jaccard index, and RJ-Pearson similarity calculations. Finally, analyses conducted using two distinct datasets indicated that our enhanced KUR-CF model achieved improvements in Precision values by 60% and Recall values by 35%, relative to other conventional collaborative filtering algorithms. In terms of Recall rate calculations specifically analyzed within this context, our proposed KUR-CF model demonstrated enhancements of 55% and 25% when compared against traditional collaborative filtering approaches. Through the experimental analysis presented above, it can be concluded that the KUR-CF algorithm proposed in this paper significantly enhances recommendation performance. This finding indicates that the proposed method is indeed effective. |
| format | Article |
| id | doaj-art-e2f8b61e0ba04767bba86c8c9126e9bc |
| institution | OA Journals |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-e2f8b61e0ba04767bba86c8c9126e9bc; 2025-08-20T02:31:04Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-06-01; volume 15, issue 1, pages 1-21; 10.1038/s41598-025-96705-0; Research on collaborative filtering algorithm based on improved K-means algorithm for user attribute rating and co-rating; ShengShai Zhang (School of Management, Management Science and Engineering, University of Shanghai for Science and Technology); Shiping Chen (School of Management, Management Science and Engineering, University of Shanghai for Science and Technology); Xiaodong Yu (School of Information Science and Technology, Computer Science and Technology, Shanghai Sanda University); Shaowei Mei (Electronic Information, School of Computer and Information Engineering, Shanghai Second Polytechnic University); https://doi.org/10.1038/s41598-025-96705-0 |
| spellingShingle | ShengShai Zhang Shiping Chen Xiaodong Yu Shaowei Mei Research on collaborative filtering algorithm based on improved K-means algorithm for user attribute rating and co-rating Scientific Reports |
| title | Research on collaborative filtering algorithm based on improved K-means algorithm for user attribute rating and co-rating |
| title_sort | research on collaborative filtering algorithm based on improved k means algorithm for user attribute rating and co rating |
| url | https://doi.org/10.1038/s41598-025-96705-0 |
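
The abstract in this record describes three ingredients: variance-derived feature weights, a clustering distance in which high-variance features count for more, and a similarity that combines the attribute-based measure with a common-rating measure, evaluated by MAE. The following is a minimal Python sketch of those ideas, not the authors' implementation. The record does not give the paper's formulas, so everything here is an assumption made for illustration: weights proportional to each feature's sample variance, the `1 / (1 + distance)` attribute similarity, Jaccard overlap as the common-rating term, the linear blend with a hypothetical `alpha` parameter, and all function names.

```python
from math import sqrt
from statistics import variance


def variance_weights(rows):
    """Weight each feature by its share of the total sample variance.

    Assumption: weights are simply proportional to sample variance.
    `rows` needs at least two users for the variance to be defined.
    """
    columns = list(zip(*rows))
    variances = [variance(col) for col in columns]
    total = sum(variances)
    if total == 0:
        # Every feature is constant: fall back to equal weights.
        return [1.0 / len(columns)] * len(columns)
    return [v / total for v in variances]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance, usable as a K-means distance."""
    return sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def attribute_similarity(a, b, weights):
    """Map the weighted distance into (0, 1]; identical users give 1.0."""
    return 1.0 / (1.0 + weighted_distance(a, b, weights))


def co_rating_similarity(items_a, items_b):
    """Jaccard overlap of two users' rated-item sets (illustrative choice)."""
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0


def blended_similarity(attr_sim, co_sim, alpha=0.5):
    """Linear blend of the two similarities; alpha is a hypothetical knob."""
    return alpha * attr_sim + (1.0 - alpha) * co_sim


def mean_absolute_error(predicted, actual):
    """MAE between predicted and observed ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy user attribute rating matrix: 3 users x 2 attribute features.
ratings = [[1.0, 5.0], [2.0, 1.0], [3.0, 3.0]]
weights = variance_weights(ratings)
attr_sim = attribute_similarity(ratings[0], ratings[1], weights)
co_sim = co_rating_similarity({1, 2, 3}, {2, 3, 4})
similarity = blended_similarity(attr_sim, co_sim)
```

In this toy matrix the second feature varies four times as much as the first, so it receives four times the weight and dominates the distance between users, which mirrors the abstract's remark that larger sample variances exert greater influence on clustering outcomes.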