The Comparison of Three Measures in Feature Selection

Bibliographic Details
Main Authors: SONG Zhi-chao, KANG Jian, SUN Guang-lu, HE Yong-jun
Format: Article
Language: zho
Published: Harbin University of Science and Technology Publications 2018-02-01
Series: Journal of Harbin University of Science and Technology
Online Access: https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=1489
Description
Summary: Linear or nonlinear correlation may exist both between features (feature-to-feature) and between features and the class (feature-to-class) in a dataset. This paper studies how the selected feature subset differs when different kinds of measures are applied with the same feature selection method on different kinds of datasets. Three representative linear or nonlinear measures are selected: the linear correlation coefficient, symmetrical uncertainty, and mutual information. Each is combined with the fast correlation-based filter (FCBF) feature selection method, and the resulting feature subsets are compared on 8 gene microarray and image datasets. Experimental results indicate that the feature subsets selected by FCBF based on the linear correlation coefficient obtain better classification accuracy on gene microarray datasets than on image datasets, while FCBF based on mutual information or symmetrical uncertainty tends to obtain better results on image datasets. Moreover, FCBF based on symmetrical uncertainty is the more robust choice across all datasets.
ISSN: 1007-2683
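
The summary names three measures and the FCBF filter but gives no formulas, so the following is only a minimal sketch of how they might fit together, not the authors' implementation. It assumes discrete (or already discretized) values for the entropy-based measures, uses the standard definitions I(X;Y) = H(X) + H(Y) - H(X,Y) and SU(X,Y) = 2 I(X;Y) / (H(X) + H(Y)), and follows the commonly described FCBF redundancy rule. The function names, the `threshold` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def pearson(x, y):
    """Linear correlation coefficient between two numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A constant sequence has no defined correlation; treat it as 0 here.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def entropy(values):
    """Shannon entropy in bits of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def symmetrical_uncertainty(x, y):
    """SU(X,Y) = 2 * I(X;Y) / (H(X) + H(Y)), a normalized form of I(X;Y)."""
    hx, hy = entropy(x), entropy(y)
    return 2.0 * mutual_information(x, y) / (hx + hy) if hx + hy > 0 else 0.0


def fcbf(features, labels, measure, threshold=0.0):
    """FCBF-style selection with a pluggable measure (hypothetical helper).

    features: dict mapping feature name -> sequence of values.
    Returns the names of the selected features, most relevant first.
    """
    # Relevance of each feature to the class; abs() so that a strong
    # negative linear correlation also counts as relevant.
    relevance = {name: abs(measure(col, labels)) for name, col in features.items()}
    ranked = sorted(
        (name for name in relevance if relevance[name] > threshold),
        key=lambda name: relevance[name],
        reverse=True,
    )
    selected = []
    for name in ranked:
        # Keep a feature only if no already selected (more relevant) feature
        # is at least as correlated with it as it is with the class.
        if all(
            abs(measure(features[kept], features[name])) < relevance[name]
            for kept in selected
        ):
            selected.append(name)
    return selected


# Toy data (made up): the class depends on the feature quadratically,
# and feature "b" is an exact copy of feature "a".
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
toy_features = {"a": x, "b": list(x)}

linear_pick = fcbf(toy_features, y, pearson)
su_pick = fcbf(toy_features, y, symmetrical_uncertainty)
```

On this toy data the linear correlation coefficient sees no relation between the feature and the class, so the linear variant selects nothing, while the symmetrical uncertainty variant keeps one copy of the feature and discards the duplicate as redundant. This illustrates how the choice of measure can change the selected subset, and says nothing about the accuracy figures reported in the paper.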