OMAL: A Multi-Label Active Learning Approach from Data Streams
With the rapid growth of digital computing, communication, and storage devices applied in various real-world scenarios, more and more data have been collected and stored to drive the development of machine learning techniques. It is also noted that the data that emerge in real-world applications ten...
| Main Authors: | Qiao Fang, Chen Xiang, Jicong Duan, Benallal Soufiyan, Changbin Shao, Xibei Yang, Sen Xu, Hualong Yu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Entropy |
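| Subjects: | active learning; multi-label data stream; query strategy; classifier chains; weighted extreme learning machine; label correlations |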
| Online Access: | https://www.mdpi.com/1099-4300/27/4/363 |
| _version_ | 1849713843466403840 |
|---|---|
| author | Qiao Fang, Chen Xiang, Jicong Duan, Benallal Soufiyan, Changbin Shao, Xibei Yang, Sen Xu, Hualong Yu |
| author_sort | Qiao Fang |
| collection | DOAJ |
| description | With the rapid growth of digital computing, communication, and storage devices in real-world scenarios, more and more data are being collected and stored, driving the development of machine learning techniques. The data that emerge in real-world applications also tend to become more complex. In this study, we consider a complex data type, multi-label data, acquired under a time constraint in a dynamic online scenario. Under such conditions, constructing a learning model faces two challenges: the model must adapt dynamically to variations in label correlations and to imbalanced data distributions, and it incurs a higher labeling cost. To address these two issues, we propose a novel online multi-label active learning (OMAL) algorithm whose active query strategy combines uncertainty (measured by the average entropy of prediction probabilities) and diversity (measured by the average cosine distance between feature vectors). Specifically, to capture label correlations, we use a classifier chain (CC) as the multi-label learning model and design a label co-occurrence ranking strategy to order the labels in the CC. To adapt to the naturally imbalanced distribution of multi-label data, we select the weighted extreme learning machine (WELM) as the base binary classifier in the CC. Experimental results on ten benchmark multi-label datasets that were transformed into streams show that the proposed method is superior to several popular static multi-label active learning algorithms in terms of both the Macro-F1 and Micro-F1 metrics, indicating that it is specifically adapted to the dynamic data stream environment. |
| format | Article |
| id | doaj-art-e7481a97d9b947838a57a650ead52c7d |
| institution | DOAJ |
| issn | 1099-4300 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Entropy |
| spelling | Record: doaj-art-e7481a97d9b947838a57a650ead52c7d; indexed 2025-08-20T03:13:51Z; language: eng; publisher: MDPI AG; series: Entropy; ISSN: 1099-4300; published: 2025-03-01; volume 27, issue 4, article 363; DOI: 10.3390/e27040363; title: OMAL: A Multi-Label Active Learning Approach from Data Streams; authors (in order): Qiao Fang, Chen Xiang, Jicong Duan, Benallal Soufiyan, Changbin Shao, Xibei Yang, Sen Xu, Hualong Yu; affiliations: Sen Xu is with the School of Information Technology, Yancheng Institute of Technology, Yancheng 224051, China, and all other authors are with the School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China; URL: https://www.mdpi.com/1099-4300/27/4/363; keywords: active learning, multi-label data stream, query strategy, classifier chains, weighted extreme learning machine, label correlations |
| title | OMAL: A Multi-Label Active Learning Approach from Data Streams |
| topic | active learning; multi-label data stream; query strategy; classifier chains; weighted extreme learning machine; label correlations |
| url | https://www.mdpi.com/1099-4300/27/4/363 |
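
The description above names two ingredients of the query strategy: uncertainty, measured by the average entropy of prediction probabilities, and diversity, measured by the average cosine distance between feature vectors. The following is a minimal sketch of how such a score might be computed for one incoming stream instance. It is not the authors' implementation: this record does not say how the two terms are combined, so the weighted sum, the `alpha` weight, the `threshold`, and all function names here are hypothetical choices made for illustration only.

```python
import math
from typing import Sequence


def average_binary_entropy(probs: Sequence[float]) -> float:
    """Mean binary entropy (in bits) of per-label positive-class probabilities."""
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)
        total += -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
    return total / len(probs)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; defined as 1.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def average_cosine_distance(
    x: Sequence[float], labeled: Sequence[Sequence[float]]
) -> float:
    """Mean cosine distance from x to the already-labeled feature vectors."""
    if not labeled:
        return 1.0  # assumption: with nothing labeled yet, treat x as diverse
    return sum(cosine_distance(x, z) for z in labeled) / len(labeled)


def query_score(
    probs: Sequence[float],
    x: Sequence[float],
    labeled: Sequence[Sequence[float]],
    alpha: float = 0.5,  # hypothetical trade-off weight, not from the paper
) -> float:
    """Assumed combination: weighted sum of uncertainty and diversity."""
    uncertainty = average_binary_entropy(probs)
    diversity = average_cosine_distance(x, labeled)
    return alpha * uncertainty + (1.0 - alpha) * diversity


def should_query(
    probs: Sequence[float],
    x: Sequence[float],
    labeled: Sequence[Sequence[float]],
    alpha: float = 0.5,
    threshold: float = 0.5,  # hypothetical labeling threshold
) -> bool:
    """Request labels for the instance when its score reaches the threshold."""
    return query_score(probs, x, labeled, alpha) >= threshold
```

In a stream setting, `probs` would come from the current multi-label model (for example, one probability per label from a classifier chain), and an instance would be sent to the annotator only when `should_query` returns `True`. An instance that the model predicts confidently and that closely resembles already-labeled data scores low on both terms and is skipped.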