A Hubness Information-Based k-Nearest Neighbor Approach for Multi-Label Learning

Bibliographic Details
Main Authors: Zeyu Teng, Shanshan Tang, Min Huang, Xingwei Wang
Format: Article
Language: English
Published: MDPI AG, 2025-04-01
Series: Mathematics
Subjects:
Online Access: https://www.mdpi.com/2227-7390/13/7/1202
Description
Summary: Multi-label classification (MLC) plays a crucial role in various real-world scenarios, and prediction with nearest neighbors has achieved competitive performance in MLC. Hubness, a phenomenon in which a few points appear in the k-nearest neighbor (kNN) lists of many points in high-dimensional spaces, can significantly affect machine learning applications and has recently attracted extensive attention. However, it has not been adequately addressed in the development of MLC algorithms. To address this issue, this paper proposes a hubness-aware kNN-based MLC algorithm, named multi-label hubness information-based k-nearest neighbor (MLHiKNN). Specifically, it introduces a fuzzy measure of label relevance and employs a weighted kNN scheme. Hubness information is used to compute each training example's membership in the relevance and irrelevance sets of each label, and to calculate weights for the nearest neighbors of a query point. MLHiKNN then exploits high-order label correlations by training a logistic regression model for each label on the kNN voting results with respect to all labels. Experimental results on 28 benchmark datasets demonstrate that MLHiKNN is competitive with nine well-established MLC algorithms and three commonly used hubness reduction techniques on MLC problems.
ISSN: 2227-7390
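The core idea described in the summary — counting how often each training point occurs in other points' kNN lists, turning those occurrence counts into fuzzy per-label relevance memberships, and using the memberships to weight kNN votes — can be sketched as follows. This is a simplified, hypothetical illustration based only on the abstract, not the authors' implementation: the function names, the Laplace-smoothed membership estimate, and the use of the mean neighbor membership as a label score are all assumptions.

```python
import numpy as np

def knn_indices(X, k):
    """Brute-force k-nearest-neighbor lists for every training point."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def hubness_label_counts(X, Y, k):
    """For each training point i and label l, count how often i occurs in the
    kNN lists of points relevant ('good') vs. irrelevant ('bad') to l.
    These occurrence counts are the hubness information that, in the paper,
    drives the fuzzy relevance/irrelevance memberships."""
    n, L = Y.shape
    nn = knn_indices(X, k)
    good = np.zeros((n, L))
    bad = np.zeros((n, L))
    for j in range(n):          # point j's kNN list credits its neighbors
        for i in nn[j]:
            good[i] += Y[j]     # j is relevant to these labels
            bad[i] += 1 - Y[j]  # j is irrelevant to these labels
    return good, bad

def hubness_weighted_vote(X, Y, query, k, smooth=1.0):
    """Hubness-weighted kNN voting score per label for one query point.
    A simplified stand-in for MLHiKNN's scheme: each neighbor contributes its
    fuzzy relevance membership, estimated from its good/bad occurrence counts
    with Laplace smoothing, and scores are averaged over the k neighbors."""
    good, bad = hubness_label_counts(X, Y, k)
    membership = (good + smooth) / (good + bad + 2.0 * smooth)  # in (0, 1)
    nn = np.argsort(np.linalg.norm(X - query, axis=1))[:k]
    return membership[nn].mean(axis=0)

# Toy usage: 20 training points, 5 features, 3 labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Y = (rng.random((20, 3)) < 0.4).astype(int)
scores = hubness_weighted_vote(X, Y, rng.normal(size=5), k=5)
```

In the full method, these per-label voting scores (over all labels, to capture high-order label correlations) would then be fed into one logistic regression model per label; that second stage is omitted from the sketch.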