An Empirical Configuration Study of a Common Document Clustering Pipeline

Document clustering is frequently used in applications of natural language processing, e.g., to classify news articles or create topic models. In this paper, we study document clustering with the common clustering pipeline that includes vectorization with BERT or Doc2Vec, dimension reduction wi...

Bibliographic Details
Main Authors: Anton Eklund, Mona Forsman, Frank Drewes
Format: Article
Language: English
Published: Linköping University Electronic Press 2023-09-01
Series: Northern European Journal of Language Technology
Online Access: https://nejlt.ep.liu.se/article/view/4396