A two stage grading approach for feature selection and classification of microarray data using Pareto based feature ranking techniques: A case study

Bibliographic Details
Main Author: Rasmita Dash
Format: Article
Language: English
Published: Springer 2020-02-01
Series: Journal of King Saud University: Computer and Information Sciences
Online Access: http://www.sciencedirect.com/science/article/pii/S1319157817301581
Description
Summary: Microarray data combine a very large number of genes with only a few dozen samples, and this high-dimensional search space makes such datasets hard to analyze. Not all genes are significant, so the informative ones must be extracted, which makes dimension reduction necessary. Ranking approaches are widely used in the literature for feature selection. However, different ranking techniques may assign different ranks to the same gene, and a selection based on one set of ranks may not suit every problem. Relying on a single ranking technique can therefore reject some important genes and select some insignificant ones, which may degrade classifier performance. To overcome this problem, a bi-objective, rank-based Pareto front technique is proposed, in which two ranking techniques are used to generate a Pareto-optimal set of features. For the experiments, 21 models based on 7 feature ranking strategies are considered. Eight microarray datasets are used to find a suitable ranking combination. A grading method is used to rank the models, and a statistical test is performed to validate the findings.
Keywords: Feature ranking technique, Statistical analysis, Pareto front, Multi-objective optimization, Classification technique, Microarray database
ISSN: 1319-1578
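
Illustration: the summary describes building a Pareto-optimal feature set from two rankings. The following is a minimal sketch of that idea, not the article's implementation. It assumes that a lower rank means a more relevant gene and that a gene is kept when no other gene is at least as good under both rankings and strictly better under one. The function name `pareto_front` and the rank values are hypothetical. The last line rests on a further assumption: that each of the 21 models corresponds to one unordered pair of the 7 ranking strategies.

```python
from itertools import combinations


def pareto_front(rank_a, rank_b):
    """Return indices of features that no other feature dominates.

    Feature j dominates feature i when j is ranked no worse than i under
    both rankings and strictly better under at least one (lower = better).
    """
    n = len(rank_a)
    front = []
    for i in range(n):
        dominated = any(
            rank_a[j] <= rank_a[i]
            and rank_b[j] <= rank_b[i]
            and (rank_a[j] < rank_a[i] or rank_b[j] < rank_b[i])
            for j in range(n)
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical ranks of six genes under two ranking techniques.
rank_a = [1, 2, 3, 4, 5, 6]
rank_b = [4, 2, 1, 6, 3, 5]
front = pareto_front(rank_a, rank_b)

# Assumption: one model per unordered pair of the 7 ranking strategies.
n_models = len(list(combinations(range(7), 2)))
```

With these made-up ranks, genes 0, 1 and 2 form the front: each is best or near-best under one ranking, and none is beaten under both rankings at once. The pairwise scan is quadratic in the number of genes, which is acceptable for a sketch but would likely need a sort-based method on full microarray data.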