SwePub

onr:"swepub:oai:DiVA.org:umu-220260"
 


ADCluster: Adaptive Deep Clustering for unsupervised learning from unlabeled documents

Hatefi, Arezoo, 1990- (author)
Umeå universitet, Institutionen för datavetenskap
Vu, Xuan-Son, 1988- (author)
Umeå universitet, Institutionen för datavetenskap
Bhuyan, Monowar H. (author)
Umeå universitet, Institutionen för datavetenskap
Drewes, Frank (author)
Umeå universitet, Institutionen för datavetenskap
Association for Computational Linguistics, 2023
English.
In: Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023). Association for Computational Linguistics, pp. 68-77
  • Conference paper (peer-reviewed)
Abstract
  • We introduce ADCluster, a deep document clustering approach based on language models that is trained to adapt to the clustering task. This adaptability is achieved through an iterative process where K-Means clustering is applied to the dataset, followed by iteratively training a deep classifier with generated pseudo-labels – an approach referred to as inner adaptation. The model is also able to adapt to changes in the data as new documents are added to the document collection. The latter type of adaptation, outer adaptation, is obtained by resuming the inner adaptation when a new chunk of documents has arrived. We explore two outer adaptation strategies, namely accumulative adaptation (training is resumed on the accumulated set of all documents) and non-accumulative adaptation (training is resumed using only the new chunk of data). We show that ADCluster outperforms established document clustering techniques on medium and long-text documents by a large margin. Additionally, our approach outperforms well-established baseline methods under both the accumulative and non-accumulative outer adaptation scenarios.
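The control flow described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the language model and deep classifier are replaced by a hypothetical `ToyModel` that uses the input vectors directly as embeddings and "trains" by storing one mean vector per pseudo-label. The function names `inner_adaptation` and `outer_adaptation`, the number of rounds, the farthest-first K-Means initialisation, and the 2-D example points are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(points: List[Vector], k: int, iters: int = 10) -> List[int]:
    """Plain Lloyd's K-Means; returns one cluster index per point."""
    # Farthest-first initialisation (an assumption) keeps the toy deterministic.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))

    def assign(p: Vector) -> int:
        return min(range(k), key=lambda j: sq_dist(p, centroids[j]))

    for _ in range(iters):
        labels = [assign(p) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = mean(members)
    return [assign(p) for p in points]


class ToyModel:
    """Hypothetical stand-in for the language model plus classifier."""

    def __init__(self) -> None:
        self.prototypes: Dict[int, List[float]] = {}

    def embed(self, doc: Vector) -> List[float]:
        # Assumption: documents are already feature vectors. In a deep model
        # this representation would change as training proceeds.
        return list(doc)

    def fit(self, feats: List[Vector], pseudo_labels: List[int]) -> None:
        # "Training" here just stores one mean vector per pseudo-label.
        self.prototypes = {
            lab: mean([f for f, l in zip(feats, pseudo_labels) if l == lab])
            for lab in set(pseudo_labels)
        }

    def predict(self, doc: Vector) -> int:
        x = self.embed(doc)
        return min(self.prototypes, key=lambda lab: sq_dist(x, self.prototypes[lab]))


def inner_adaptation(model: ToyModel, docs: List[Vector], k: int, rounds: int = 3) -> ToyModel:
    """Alternate K-Means pseudo-labelling and classifier training."""
    for _ in range(rounds):
        feats = [model.embed(d) for d in docs]
        pseudo = kmeans(feats, k)
        model.fit(feats, pseudo)
    return model


def outer_adaptation(model: ToyModel, seen: List[Vector], chunk: List[Vector],
                     k: int, accumulative: bool) -> ToyModel:
    """Resume inner adaptation when a new chunk of documents arrives."""
    # Accumulative: all documents so far; non-accumulative: the new chunk only.
    train = seen + chunk if accumulative else chunk
    return inner_adaptation(model, train, k)


initial = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 5.0), (5.1, 5.2)]
chunk = [(0.1, 0.1), (5.1, 5.1), (0.0, 0.2), (5.0, 5.2)]

model = inner_adaptation(ToyModel(), initial, k=2)
model = outer_adaptation(model, initial, chunk, k=2, accumulative=True)
print(model.predict((0.0, 0.0)) != model.predict((5.0, 5.0)))
```

Because `ToyModel.embed` is fixed, repeated rounds do not change the representation here, and `fit` rebuilds its state from scratch rather than continuing from learned weights. The sketch therefore shows only which data each adaptation strategy trains on, not the learning dynamics of the actual model.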

Subject headings

NATURAL SCIENCES -- Computer and Information Sciences -- Computer Sciences (hsv//eng)

Keywords

deep clustering
adaptive
deep learning
unsupervised
data stream
Computer Science
computational linguistics

Publication and Content Type

ref (peer-reviewed)
kon (conference paper)
