A joint-training topic model for social media texts
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer Nature, 2025-03-01 |
| Series: | Humanities & Social Sciences Communications |
| Online Access: | https://doi.org/10.1057/s41599-025-04551-2 |
| Summary: | Abstract The importance of topic mining for social media text has grown with the proliferation of social media platforms. However, the brevity and discreteness of social media text pose significant challenges to conventional topic models, which often struggle to perform well on such data. To address this, the paper proposes a more precise Position-Sensitive Word-Embedding Topic Model (PS-WETM) to capture the intricate semantic and lexical relations within social media text. The model enriches the corpus and its semantic relations based on word-vector similarity, thereby yielding dense word-vector representations. Furthermore, it introduces a position-sensitive word-vector training model that distinguishes the relations between the pivot word and differently positioned context words by assigning distinct weight matrices to context words in asymmetric positions. Additionally, the model incorporates a self-attention mechanism to globally capture dependencies between the elements of the input word vectors and to calculate each word's contribution to topic matching. The experimental results show that the customized topic model outperforms existing short-text topic models such as PTM, SPTM, DMM, GPU-DMM, GLTM and WETM. Hence, PS-WETM identifies diverse topics in social media text, demonstrating outstanding performance on short texts with sparse words and discrete semantic relations. |
| ISSN: | 2662-9992 |
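The summary's core idea, assigning a different weight matrix to context words at different offsets from the pivot word, can be illustrated with a minimal sketch. This is not the paper's implementation: all names, dimensions, and the simple dot-product scoring below are assumptions chosen only to show how position-specific matrices make a word pair score differently depending on its relative position.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim, window = 50, 8, 2

# Input embeddings for pivot words
E = rng.normal(size=(vocab_size, dim))
# One weight matrix per context offset (-window..-1, 1..window),
# so left/right neighbors at each distance are treated asymmetrically
positions = [p for p in range(-window, window + 1) if p != 0]
W = {p: rng.normal(size=(dim, dim)) for p in positions}
# Output embeddings for context words
C = rng.normal(size=(vocab_size, dim))

def position_score(pivot_id, context_id, offset):
    """Score a (pivot, context) pair using the matrix for this offset."""
    h = E[pivot_id] @ W[offset]      # position-specific transform of the pivot
    return float(C[context_id] @ h)  # dot-product compatibility

# The same word pair receives different scores at different offsets,
# which is the positional asymmetry the model exploits.
s_left = position_score(3, 7, -1)
s_right = position_score(3, 7, +1)
print(s_left != s_right)
```

In a full training loop these scores would feed a softmax or negative-sampling objective, as in skip-gram-style embedding models; the sketch stops at the scoring step, which is where the position sensitivity lives.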