An Improved Pearson’s Correlation Proximity-Based Hierarchical Clustering for Mining Biological Association between Genes

Microarray gene expression datasets have attracted considerable attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A search made wi...

Bibliographic Details
Main Authors: P. M. Booma, S. Prabhakaran, R. Dhanalakshmi
Format: Article
Language: English
Published: Wiley 2014-01-01
Series: The Scientific World Journal
Online Access: http://dx.doi.org/10.1155/2014/357873
collection DOAJ
description Microarray gene expression datasets have attracted considerable attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for standard biological processes measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process has not been addressed efficiently. To monitor higher rates of expression levels between genes, a hierarchical clustering model is proposed in which the biological association between genes is measured simultaneously using an improved Pearson's correlation proximity measure (PCPHC). Additionally, the Seed Augment algorithm applies average linkage methods to rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify associations between the clusters. Moreover, GL-PCPHC applies a pattern growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations of the GL-PCPHC model are conducted on standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, pattern size, significance level, biological association efficiency, and pattern quality.
format Article
id doaj-art-7016462957bd41ca9e6f6b4d57dc6373
institution Kabale University
issn 2356-6140
1537-744X
affiliation P. M. Booma: Department of Computer and Engineering, KCG College of Technology, KCG Nagar, Rajiv Gandhi Salai, Karapakkam, Chennai, Tamil Nadu 600097, India
S. Prabhakaran: Department of Computer Science and Engineering, SRM University, SRM Nagar, Kattankulathur, Kanchipuram, National Highway 45, Potheri, Tamil Nadu 603203, India
R. Dhanalakshmi: Department of Computer and Engineering, KCG College of Technology, KCG Nagar, Rajiv Gandhi Salai, Karapakkam, Chennai, Tamil Nadu 600097, India
volume 2014