Cross-domain topic transfer learning method based on multiple balance and feature fusion

Bibliographic Details
Main Authors: Zhenshun Xu, Zhenbiao Wang, Wenhao Zhang, Zengjin Tang
Format: Article
Language: English
Published: Elsevier 2025-05-01
Series: Heliyon
Online Access: http://www.sciencedirect.com/science/article/pii/S2405844024167941
Description
Summary: Traditional homogeneous transfer learning assumes similar data and feature distributions between the source and target domains and focuses primarily on parameter sharing to enhance model performance. In heterogeneous transfer learning for topic models, however, disparities in data and feature distributions lead to negative transfer, which diminishes the effectiveness of topic extraction. This research explores alternatives to traditional parameter sharing for heterogeneous transfer learning in topic models, seeking to maximize the use of source-domain knowledge and features while mitigating the interference of negative transfer. We propose a novel approach to cross-domain topic transfer learning that combines feature fusion with the balancing of data and labels. The approach applies dual-supervised techniques to handle label dependencies, employs function constraints and data enhancement to adjust data distributions, and uses feature fusion techniques to mitigate feature distribution disparities. Additionally, we introduce a topic knowledge distillation method that leverages topics from the source domain to guide and optimize topic generation in the target domain. In a practical application, the method enhances animal disease topic mining by integrating feature knowledge from the animal diseases dataset and the 20 Newsgroups dataset. The experiments verify the effectiveness of the method, which bridges gaps in topic generation and advances the application of topic modeling in complex and dynamic contexts.
ISSN: 2405-8440