OMAL: A Multi-Label Active Learning Approach from Data Streams

With the rapid growth of digital computing, communication, and storage devices applied in various real-world scenarios, more and more data have been collected and stored to drive the development of machine learning techniques. It is also noted that the data that emerge in real-world applications ten...

Bibliographic Details
Main Authors: Qiao Fang, Chen Xiang, Jicong Duan, Benallal Soufiyan, Changbin Shao, Xibei Yang, Sen Xu, Hualong Yu
Format: Article
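Language: English
Published: MDPI AG 2025-03-01
Series: Entropy
Subjects: active learning; multi-label data stream; query strategy; classifier chains; weighted extreme learning machine; label correlations
Online Access: https://www.mdpi.com/1099-4300/27/4/363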
_version_ 1849713843466403840
author Qiao Fang
Chen Xiang
Jicong Duan
Benallal Soufiyan
Changbin Shao
Xibei Yang
Sen Xu
Hualong Yu
author_sort Qiao Fang
collection DOAJ
description With the rapid growth of digital computing, communication, and storage devices deployed in real-world scenarios, more and more data are being collected and stored, driving the development of machine learning techniques. The data that emerge in real-world applications also tend to become more complex. In this study, we consider one complex data type, multi-label data, acquired under a time constraint in a dynamic online scenario. Under such conditions, constructing a learning model faces two challenges: the model must dynamically adapt to variations in label correlations and to imbalanced data distributions, and it incurs a higher labeling cost. To address these two issues, we propose a novel online multi-label active learning (OMAL) algorithm whose active query strategy simultaneously considers uncertainty (measured by the average entropy of prediction probabilities) and diversity (measured by the average cosine distance between feature vectors). Specifically, to exploit label correlations, we use a classifier chain (CC) as the multi-label learning model and design a label co-occurrence ranking strategy to order the labels in the CC. To adapt to the naturally imbalanced distribution of multi-label data, we select the weighted extreme learning machine (WELM) as the base binary classifier in the CC. Experimental results on ten benchmark multi-label datasets that were transformed into streams show that the proposed method outperforms several popular static multi-label active learning algorithms in terms of both the Macro-F1 and Micro-F1 metrics, indicating that it is specifically adapted to the dynamic data stream environment.
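The query strategy named in the description combines an uncertainty term with a diversity term. The following is a minimal Python sketch of that idea, not the authors' implementation: it assumes uncertainty is the mean binary entropy over per-label probabilities, diversity is the mean cosine distance to already-labeled instances, and the two are mixed by a weight `alpha` and compared against `threshold`. The function names, `alpha`, and `threshold` are hypothetical; the record does not specify how the two terms are combined.

```python
import numpy as np


def average_binary_entropy(probs):
    """Mean binary entropy (in bits) over per-label positive-class probabilities."""
    # Clip to avoid log2(0) for fully confident predictions.
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1 - 1e-12)
    h = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    return float(h.mean())


def average_cosine_distance(x, labeled):
    """Mean cosine distance between feature vector x and each row of `labeled`."""
    x = np.asarray(x, dtype=float)
    rows = np.asarray(labeled, dtype=float)
    # Small constant guards against division by zero for all-zero vectors.
    sims = (rows @ x) / (np.linalg.norm(rows, axis=1) * np.linalg.norm(x) + 1e-12)
    return float((1.0 - sims).mean())


def should_query(probs, x, labeled, alpha=0.5, threshold=0.5):
    """Decide whether to request labels for an incoming stream instance.

    alpha and threshold are illustrative assumptions, not values from the paper.
    """
    score = (alpha * average_binary_entropy(probs)
             + (1 - alpha) * average_cosine_distance(x, labeled))
    return score >= threshold, score


if __name__ == "__main__":
    # An instance with maximally uncertain predictions that is orthogonal
    # to the only labeled instance seen so far.
    decision, score = should_query([0.5, 0.5], [1.0, 0.0], [[0.0, 1.0]])
    print(decision, score)
```

In a stream setting, a function like `should_query` would be called once per arriving instance, and only instances that pass the threshold would be sent for labeling and used to update the model.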
format Article
id doaj-art-e7481a97d9b947838a57a650ead52c7d
institution DOAJ
issn 1099-4300
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Entropy
spelling doaj-art-e7481a97d9b947838a57a650ead52c7d; 2025-08-20T03:13:51Z; eng; MDPI AG; Entropy; ISSN 1099-4300; 2025-03-01; vol. 27, no. 4, art. 363; DOI 10.3390/e27040363
OMAL: A Multi-Label Active Learning Approach from Data Streams
Qiao Fang (School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China)
Chen Xiang (School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China)
Jicong Duan (School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China)
Benallal Soufiyan (School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China)
Changbin Shao (School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China)
Xibei Yang (School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China)
Sen Xu (School of Information Technology, Yancheng Institute of Technology, Yancheng 224051, China)
Hualong Yu (School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China)
https://www.mdpi.com/1099-4300/27/4/363
title OMAL: A Multi-Label Active Learning Approach from Data Streams
topic active learning
multi-label data stream
query strategy
classifier chains
weighted extreme learning machine
label correlations
url https://www.mdpi.com/1099-4300/27/4/363