A Gene Clustering Algorithm Based on the CCA-Hierarchical Clustering
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | zho |
| Published: | Harbin University of Science and Technology Publications, 2023-10-01 |
| Series: | Journal of Harbin University of Science and Technology |
| Subjects: | |
| Online Access: | https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=2261 |
| Summary: | To mine the biological information and potential biological mechanisms contained in the massive gene expression data produced by gene chip technology, this paper proposes a gene clustering algorithm based on CCA-hierarchical clustering (CCA-Hc). The algorithm introduces canonical correlation analysis (CCA) into hierarchical clustering and optimizes the calculation of the similarity matrix. First, canonical correlation analysis is used to measure the correlation between genes by combining the multiple feature information of each gene, yielding a gene similarity matrix. This similarity matrix is then used as the proximity matrix for agglomerative hierarchical clustering. The clustering performance of CCA-Hc was tested on a gene expression dataset of Oryza sativa L. (rice). The results show that CCA-Hc outperforms the traditional hierarchical clustering algorithm using Euclidean distance (EUC-Hc) on both the internal stability index and the biological functional index, with better robustness and clustering accuracy, and that it is more conducive to discovering co-expression relationships between genes. |
|---|---|
| ISSN: | 1007-2683 |
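
The two-step procedure described in the summary (a CCA-based similarity matrix followed by agglomerative hierarchical clustering) might be sketched roughly as below. This is an illustrative sketch only, not the authors' implementation. The record does not say how the multiple features of each gene are arranged, how similarity is converted to a distance, or which linkage is used, so the following are all assumptions: each gene is a (samples x features) matrix with full column rank, similarity is the first canonical correlation, distance is `1 - similarity`, and linkage is average. The function names and the toy data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def first_canonical_correlation(X, Y):
    """Largest canonical correlation between two (samples x features) matrices.

    Assumes more samples than features and full column rank (an assumption
    of this sketch). Uses the QR + SVD formulation of CCA.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)  # orthonormal basis of the centered columns
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))


def cca_similarity_matrix(genes):
    """Pairwise gene similarity; `genes` is a list of per-gene feature matrices."""
    n = len(genes)
    S = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            S[i, j] = S[j, i] = first_canonical_correlation(genes[i], genes[j])
    return S


def cca_hc(genes, n_clusters):
    """Agglomerative clustering on a CCA-derived distance (assumed: 1 - similarity)."""
    D = 1.0 - cca_similarity_matrix(genes)
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D, checks=False), method="average")  # linkage is an assumption
    return fcluster(Z, t=n_clusters, criterion="maxclust")


def make_toy_genes(seed=0, n_samples=40, noise=0.05):
    """Hypothetical toy data: two groups of three genes, each group driven by one latent signal."""
    rng = np.random.default_rng(seed)
    genes = []
    for _ in range(2):
        latent = rng.standard_normal(n_samples)
        for _ in range(3):
            features = np.column_stack([latent, -0.5 * latent])
            genes.append(features + noise * rng.standard_normal((n_samples, 2)))
    return genes


if __name__ == "__main__":
    print(cca_hc(make_toy_genes(), n_clusters=2))
```

On the toy data, genes driven by the same latent signal have a first canonical correlation close to 1 and so merge early, while the two groups join only at the final merge. A Euclidean baseline in the spirit of EUC-Hc could be obtained by swapping the distance matrix for pairwise Euclidean distances between flattened feature vectors; that comparison is likewise an assumption about the setup, not a detail given in this record.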