Hierarchical contrastive learning for multi-label text classification

Bibliographic Details
Main Authors: Wei Zhang, Yun Jiang, Yun Fang, Shuai Pan
Format: Article
Language: English
Published: Nature Portfolio 2025-04-01
Series: Scientific Reports
Subjects: Contrastive learning; Hierarchical structure; Multi-task; Multi-label text classification
Online Access: https://doi.org/10.1038/s41598-025-97597-w
_version_ 1849713110595665920
author Wei Zhang
Yun Jiang
Yun Fang
Shuai Pan
author_sort Wei Zhang
collection DOAJ
description Abstract: Multi-label text classification is a challenging text classification task, particularly when the labels are hierarchical, that is, organized in a tree-like structure that captures parent-child and sibling relationships. This hierarchy reflects semantic dependencies among labels, with higher-level labels representing broader categories and lower-level labels capturing more specific distinctions. Traditional methods often fail to exploit this hierarchical structure, overlooking the subtle semantic differences and correlations that distinguish one label from another. To address this shortcoming, we introduce a novel method called Hierarchical Contrastive Learning for Multi-label Text Classification (HCL-MTC). Our approach leverages the contrastive knowledge embedded within label relationships by constructing a graph representation that explicitly models the hierarchical dependencies among labels. Specifically, we recast multi-label text classification as a multi-task learning problem, incorporating a hierarchical contrastive loss that is computed through a carefully designed sampling process. This loss function enables our model to capture both the correlations and the distinctions among labels, thereby improving its ability to learn the label hierarchy. Experimental results on widely used datasets, such as RCV1-v2 and WoS, demonstrate that our proposed HCL-MTC model achieves substantial performance gains compared to baseline methods.
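The abstract describes a multi-task objective that combines multi-label classification with a hierarchical contrastive loss computed over sampled label pairs, but this record does not give the loss formula or the sampling rule. The following is only an illustrative sketch under stated assumptions, not the authors' HCL-MTC implementation: it assumes parent-child pairs are treated as positives, sibling pairs as negatives, cosine similarity with a margin as the contrastive term, and a fixed weight to combine the two tasks. All function names and parameters (`sample_pairs`, `margin`, `weight`) are hypothetical.

```python
import math


def sample_pairs(parent):
    """Build label pairs from a parent map {label: parent or None}.

    Assumption: (child, parent) pairs are positives, sibling pairs are negatives.
    """
    positives = [(c, p) for c, p in parent.items() if p is not None]
    children = {}
    for c, p in parent.items():
        if p is not None:
            children.setdefault(p, []).append(c)
    negatives = [
        (a, b)
        for sibs in children.values()
        for i, a in enumerate(sibs)
        for b in sibs[i + 1:]
    ]
    return positives, negatives


def cosine(u, v):
    """Cosine similarity between two plain-Python vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v + 1e-12)


def hierarchical_contrastive_loss(emb, positives, negatives, margin=0.5):
    """Pull positive pairs together; push negatives below a similarity margin."""
    pos_term = sum(1.0 - cosine(emb[a], emb[b]) for a, b in positives)
    neg_term = sum(max(0.0, cosine(emb[a], emb[b]) - margin) for a, b in negatives)
    return (pos_term + neg_term) / max(len(positives) + len(negatives), 1)


def binary_cross_entropy(probs, targets):
    """Mean binary cross-entropy over all labels (the classification task)."""
    eps = 1e-7
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def multitask_loss(probs, targets, emb, parent, weight=0.1):
    """Classification loss plus a weighted hierarchical contrastive term."""
    positives, negatives = sample_pairs(parent)
    contrastive = hierarchical_contrastive_loss(emb, positives, negatives)
    return binary_cross_entropy(probs, targets) + weight * contrastive


# Toy hierarchy: label 0 is the root, 1 and 2 are its children, 3 and 4 are children of 1.
toy_parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1}
toy_emb = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.8, -0.3], 3: [0.7, 0.4], 4: [0.9, 0.3]}
toy_loss = multitask_loss([0.9, 0.8, 0.1, 0.7, 0.2], [1, 1, 0, 1, 0], toy_emb, toy_parent)
```

In a real model the label embeddings would come from a trainable encoder over the label graph and the loss would be minimized by gradient descent; the plain-Python version here only makes the structure of the objective concrete.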
format Article
id doaj-art-e5c8c217ef9140369c5d5363bd79c153
institution DOAJ
issn 2045-2322
language English
publishDate 2025-04-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-e5c8c217ef9140369c5d5363bd79c153 | 2025-08-20T03:14:03Z | eng | Nature Portfolio | Scientific Reports | 2045-2322 | 2025-04-01 | 15(1): 1-12 | 10.1038/s41598-025-97597-w | Hierarchical contrastive learning for multi-label text classification | Wei Zhang, Yun Jiang, Yun Fang, Shuai Pan (all: Advanced Institution of Information Technology, Peking University) | https://doi.org/10.1038/s41598-025-97597-w | Contrastive learning; Hierarchical structure; Multi-task; Multi-label text classification
title Hierarchical contrastive learning for multi-label text classification
title_sort hierarchical contrastive learning for multi label text classification
topic Contrastive learning
Hierarchical structure
Multi-task
Multi-label text classification
url https://doi.org/10.1038/s41598-025-97597-w