An Empirical Configuration Study of a Common Document Clustering Pipeline
Document clustering is frequently used in natural language processing applications, e.g., to classify news articles or to create topic models. In this paper, we study document clustering with the common clustering pipeline that includes vectorization with BERT or Doc2Vec, dimension reduction wi...
Main Authors: | Anton Eklund, Mona Forsman, Frank Drewes |
---|---|
Format: | Article |
Language: | English |
Published: | Linköping University Electronic Press, 2023-09-01 |
Series: | Northern European Journal of Language Technology |
Online Access: | https://nejlt.ep.liu.se/article/view/4396 |
Similar Items
- Keeping It Open: A TEI-based Publication Pipeline for Historical Documents
  by: Floriane Chiffoleau
  Published: (2024-11-01)
- Non-hierarchic document clustering
  by: Gareth Jones, et al.
  Published: (1995-01-01)
- Explainable Graph Spectral Clustering of text documents.
  by: Bartłomiej Starosta, et al.
  Published: (2025-01-01)
- Active Learning for Constrained Document Clustering with Uncertainty Region
  by: M. A. Balafar, et al.
  Published: (2020-01-01)
- Configurational effects of intergenerational support on older adults’ depression: an empirical study from CHARLS data
  by: Qin Bai, et al.
  Published: (2025-01-01)