OSFS‐Vague: Online streaming feature selection algorithm based on vague set

Bibliographic Details
Main Authors: Jie Yang, Zhijun Wang, Guoyin Wang, Yanmin Liu, Yi He, Di Wu
Format: Article
Language: English
Published: Wiley 2024-12-01
Series: CAAI Transactions on Intelligence Technology
Subjects: feature selection; online feature selection; three‐way decision; vague set
Online Access: https://doi.org/10.1049/cit2.12327
collection DOAJ
description Online streaming feature selection (OSFS), an online learning approach for handling streaming features, is critical for dealing with high‐dimensional data. In real big‐data applications, the patterns and distributions of streaming features change constantly over time because the data‐generation environment is dynamic. However, existing OSFS methods rely on preset, fixed hyperparameters, which leads to poor selection performance on dynamic features. To address this shortcoming, the authors propose OSFS‐Vague, a novel OSFS algorithm based on vague sets. Its main idea is to combine uncertainty theory and three‐way decision theory so that feature selection moves from the traditional dichotomous (two‐way) scheme to a trichotomous (three‐way) one. OSFS‐Vague also improves how the correlation between features and labels is calculated. Moreover, it uses the distance correlation coefficient to classify streaming features as relevant, weakly redundant, or redundant. Finally, the relevant and weakly redundant features are filtered to obtain an optimal feature set. To evaluate OSFS‐Vague, extensive experiments were conducted on 11 datasets. The results show that OSFS‐Vague outperforms six state‐of‐the‐art OSFS algorithms in selection accuracy and computational efficiency.
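The description names two ingredients, the distance correlation coefficient and a three‐way split of incoming features, without giving formulas. The Python sketch below illustrates only those two ideas and should not be read as the authors' procedure. It computes the standard sample distance correlation and then places a score in one of three regions. The function names, the thresholds `alpha` and `beta`, and the choice to threshold a single score are all assumptions made for illustration; the record does not state how OSFS‐Vague sets its boundaries or which quantities it compares.

```python
import numpy as np


def _centered_distances(x):
    """Double-centered pairwise Euclidean distance matrix of the samples in x."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)  # treat a 1-D vector as n samples of dimension 1
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; returns 0.0 if either input is constant."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def three_way_label(score, alpha=0.6, beta=0.2):
    """Place a score in one of three regions.

    alpha and beta are hypothetical thresholds chosen for this sketch;
    they are not taken from the paper.
    """
    if score >= alpha:
        return "relevant"
    if score <= beta:
        return "redundant"
    return "weakly redundant"


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    label = rng.normal(size=200)
    feature = label ** 2 + 0.1 * rng.normal(size=200)  # nonlinear dependence on the label
    score = distance_correlation(feature, label)
    print(round(score, 3), three_way_label(score))
```

Distance correlation is zero (in the population sense) only under independence, so unlike Pearson correlation it can detect the nonlinear dependence used in the demo, which may be why a streaming selector would prefer it.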
id doaj-art-1947981b1b9246d8a51a47d727d469bb
institution Kabale University
issn 2468-2322
spelling doaj-art-1947981b1b9246d8a51a47d727d469bb 2025-01-13T14:05:51Z
Language: eng
Publisher: Wiley
Journal: CAAI Transactions on Intelligence Technology (ISSN 2468-2322)
Issue: 2024-12-01, vol. 9, no. 6, pp. 1451-1466
DOI: https://doi.org/10.1049/cit2.12327
Author affiliations:
Jie Yang: School of Physics and Electronic Science, Zunyi Normal University, Zunyi, China
Zhijun Wang: Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications, Chongqing, China
Guoyin Wang: Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications, Chongqing, China
Yanmin Liu: School of Physics and Electronic Science, Zunyi Normal University, Zunyi, China
Yi He: Department of Computer Science, Old Dominion University, Norfolk, Virginia, USA
Di Wu: College of Computer and Information Science, Southwest University, Chongqing, China
Keywords: feature selection; online feature selection; three‐way decision; vague set
topic feature selection
online feature selection
three‐way decision
vague set