Field | Indicators | Metadata |
---|---|---|
000 | | 03203naa a2200421 4500 |
001 | | oai:DiVA.org:umu-220260 |
003 | | SwePub |
008 | | 240131s2023 \| \|\|\|\|\|\|\|\|\|\|\|000 \|\|eng\| |
024 | 7 | $a https://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-220260 $2 URI |
040 | | $a (SwePub)umu |
041 | | $a eng $b eng |
042 | | $9 SwePub |
072 | 7 | $a ref $2 swepub-contenttype |
072 | 7 | $a kon $2 swepub-publicationtype |
100 | 1 | $a Hatefi, Arezoo, $d 1990- $u Umeå universitet,Institutionen för datavetenskap $4 aut $0 (Swepub:umu)arha0050 |
245 | 1 0 | $a ADCluster: Adaptive Deep Clustering for unsupervised learning from unlabeled documents |
264 | 1 | $b Association for Computational Linguistics, $c 2023 |
338 | | $a electronic $2 rdacarrier |
520 | | $a We introduce ADCluster, a deep document clustering approach based on language models that is trained to adapt to the clustering task. This adaptability is achieved through an iterative process where K-Means clustering is applied to the dataset, followed by iteratively training a deep classifier with generated pseudo-labels – an approach referred to as inner adaptation. The model is also able to adapt to changes in the data as new documents are added to the document collection. The latter type of adaptation, outer adaptation, is obtained by resuming the inner adaptation when a new chunk of documents has arrived. We explore two outer adaptation strategies, namely accumulative adaptation (training is resumed on the accumulated set of all documents) and non-accumulative adaptation (training is resumed using only the new chunk of data). We show that ADCluster outperforms established document clustering techniques on medium and long-text documents by a large margin. Additionally, our approach outperforms well-established baseline methods under both the accumulative and non-accumulative outer adaptation scenarios. |
650 | 7 | $a NATURVETENSKAP $x Data- och informationsvetenskap $x Datavetenskap $0 (SwePub)10201 $2 hsv//swe |
650 | 7 | $a NATURAL SCIENCES $x Computer and Information Sciences $x Computer Sciences $0 (SwePub)10201 $2 hsv//eng |
653 | | $a deep clustering |
653 | | $a adaptive |
653 | | $a deep learning |
653 | | $a unsupervised |
653 | | $a data stream |
653 | | $a Computer Science |
653 | | $a datalogi |
653 | | $a computational linguistics |
653 | | $a datorlingvistik |
700 | 1 | $a Vu, Xuan-Son, $d 1988- $u Umeå universitet,Institutionen för datavetenskap $4 aut $0 (Swepub:umu)xuvu0001 |
700 | 1 | $a Bhuyan, Monowar H. $u Umeå universitet,Institutionen för datavetenskap $4 aut $0 (Swepub:umu)mobh0003 |
700 | 1 | $a Drewes, Frank $u Umeå universitet,Institutionen för datavetenskap $4 aut $0 (Swepub:umu)frdr0001 |
710 | 2 | $a Umeå universitet $b Institutionen för datavetenskap $4 org |
773 | 0 | $t Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023) $d : Association for Computational Linguistics $g , s. 68-77 $q <68-77 |
856 | 4 | $u https://aclanthology.org/2023.icnlsp-1.7 $y Publisher's full text |
856 | 4 | $u https://umu.diva-portal.org/smash/get/diva2:1833048/FULLTEXT01.pdf $x primary $x Raw object $y fulltext:print |
856 | 4 8 | $u https://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-220260 |
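
The two outer adaptation strategies named in the abstract (field 520) differ only in which documents training is resumed on when a new chunk arrives. The sketch below illustrates that selection step and nothing more. The function names `outer_adaptation_sets` and `run_outer_adaptation`, and the `inner_adaptation` callback, are hypothetical; the inner adaptation itself (K-Means plus pseudo-label classifier training) is left to the caller. This is not the authors' implementation.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

Document = str  # assumption: a document is represented as plain text


def outer_adaptation_sets(
    chunks: Iterable[Sequence[Document]], accumulative: bool
) -> Iterator[List[Document]]:
    """Yield the documents that training is resumed on after each new chunk."""
    seen: List[Document] = []
    for chunk in chunks:
        seen.extend(chunk)
        # Accumulative: every document seen so far.
        # Non-accumulative: only the chunk that just arrived.
        yield list(seen) if accumulative else list(chunk)


def run_outer_adaptation(
    chunks: Iterable[Sequence[Document]],
    accumulative: bool,
    inner_adaptation: Callable[[List[Document], int], None],
) -> None:
    """Call a caller-supplied (hypothetical) inner adaptation step per chunk."""
    for step, docs in enumerate(outer_adaptation_sets(chunks, accumulative)):
        inner_adaptation(docs, step)
```

With three chunks of sizes 2, 1 and 2, the accumulative variant resumes training on 2, 3 and 5 documents, while the non-accumulative variant resumes on 2, 1 and 2.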