OSFS‐Vague: Online streaming feature selection algorithm based on vague set
Abstract: Online streaming feature selection (OSFS), an online learning approach for handling streaming features, is critical in addressing high‐dimensional data. In real big data‐related applications, the patterns and distributions of streaming features constantly change over time due to dynamic data...
Main Authors: | Jie Yang, Zhijun Wang, Guoyin Wang, Yanmin Liu, Yi He, Di Wu |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2024-12-01 |
Series: | CAAI Transactions on Intelligence Technology |
Subjects: | feature selection; online feature selection; three‐way decision; vague set |
Online Access: | https://doi.org/10.1049/cit2.12327 |
_version_ | 1841543292422979584 |
---|---|
author | Jie Yang, Zhijun Wang, Guoyin Wang, Yanmin Liu, Yi He, Di Wu |
author_sort | Jie Yang |
collection | DOAJ |
description | Abstract: Online streaming feature selection (OSFS), an online learning approach for handling streaming features, is critical in addressing high‐dimensional data. In real big data‐related applications, the patterns and distributions of streaming features change constantly over time because of dynamic data generation environments. However, existing OSFS methods rely on preset, fixed hyperparameters, which leads to poor selection performance when features are dynamic. To address this shortcoming, the authors propose a novel OSFS algorithm based on vague sets, named OSFS‐Vague. Its main idea is to combine uncertainty and three‐way decision theories, moving feature selection from the traditional dichotomous method to a trichotomous one. OSFS‐Vague also improves the method for calculating the correlation between features and labels. Moreover, OSFS‐Vague uses the distance correlation coefficient to classify streaming features into relevant, weakly redundant, and redundant features. Finally, the relevant and weakly redundant features are filtered to obtain an optimal feature set. To evaluate OSFS‐Vague, extensive experiments were conducted on 11 datasets. The results show that OSFS‐Vague outperforms six state‐of‐the‐art OSFS algorithms in selection accuracy and computational efficiency. |
format | Article |
id | doaj-art-1947981b1b9246d8a51a47d727d469bb |
institution | Kabale University |
issn | 2468-2322 |
language | English |
publishDate | 2024-12-01 |
publisher | Wiley |
record_format | Article |
series | CAAI Transactions on Intelligence Technology |
spelling | doaj-art-1947981b1b9246d8a51a47d727d469bb; 2025-01-13T14:05:51Z; eng; Wiley; CAAI Transactions on Intelligence Technology; 2468-2322; 2024-12-01; vol. 9, no. 6, pp. 1451-1466; 10.1049/cit2.12327; OSFS‐Vague: Online streaming feature selection algorithm based on vague set; Jie Yang (School of Physics and Electronic Science, Zunyi Normal University, Zunyi, China); Zhijun Wang (Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications, Chongqing, China); Guoyin Wang (Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications, Chongqing, China); Yanmin Liu (School of Physics and Electronic Science, Zunyi Normal University, Zunyi, China); Yi He (Department of Computer Science, Old Dominion University, Norfolk, Virginia, USA); Di Wu (College of Computer and Information Science, Southwest University, Chongqing, China); https://doi.org/10.1049/cit2.12327; feature selection; online feature selection; three‐way decision; vague set |
title | OSFS‐Vague: Online streaming feature selection algorithm based on vague set |
topic | feature selection; online feature selection; three‐way decision; vague set |
url | https://doi.org/10.1049/cit2.12327 |
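
The abstract in this record mentions a distance correlation coefficient and a three‐way (trichotomous) split of streaming features, but it gives no formulas or thresholds. The Python sketch below is only an illustration under stated assumptions, not the authors' OSFS‐Vague implementation: it computes the standard sample distance correlation between two numeric sequences and then places the score into one of three regions. The function names, the cut-offs `low` and `high`, and the mapping of regions to the labels "relevant", "weakly redundant", and "redundant" are hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence


def _centered_distances(v: Sequence[float]) -> List[List[float]]:
    """Double-centered matrix of pairwise absolute differences."""
    n = len(v)
    d = [[abs(v[i] - v[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample distance correlation in [0, 1]; returns 0.0 if either input is constant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p: List[List[float]], q: List[List[float]]) -> float:
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)                       # squared distance covariance
    dvar2 = mean_product(a, a) * mean_product(b, b)  # product of squared distance variances
    if dvar2 <= 0.0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar2))


def three_way_label(score: float, low: float = 0.3, high: float = 0.7) -> str:
    """Hypothetical three-way split of a redundancy score.

    Assumption: `score` is the distance correlation between a newly arrived
    feature and an already selected one, so a high value suggests redundancy.
    The thresholds are placeholders, not values from the paper.
    """
    if score >= high:
        return "redundant"
    if score >= low:
        return "weakly redundant"
    return "relevant"
```

For example, `three_way_label(distance_correlation(selected, incoming))` would label an incoming feature against one selected feature, where `selected` and `incoming` are equal-length lists of numbers. A fixed pair of cut-offs is used here only to keep the sketch short; the abstract itself criticises fixed hyperparameters, so the real method presumably derives its regions differently.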