Fine-Tuning BERT-based Language Models for Duplicate Trouble Report Retrieval
- Bosch, Nathan (author)
- KTH; Ericsson AB, GFTL GAIA, Stockholm, Sweden
- Shalmashi, Serveh (author)
- Ericsson AB, GFTL GAIA, Stockholm, Sweden
- Yaghoubi, Forough (author)
- Ericsson AB, GFTL GAIA, Stockholm, Sweden
- Holm, Henrik (author)
- Ericsson AB, GFTL GAIA, Stockholm, Sweden
- Gaim, Fitsum (author)
- Ericsson AB, GFTL GAIA, Stockholm, Sweden
- Payberah, Amir H., 1978- (author)
- KTH, Software and Computer Systems, SCS
- Institute of Electrical and Electronics Engineers (IEEE), 2022
- 2022
- English.
In: Proceedings. Institute of Electrical and Electronics Engineers (IEEE), pp. 4737-4745
- Related links:
https://urn.kb.se/re...
https://doi.org/10.1...
Abstract
- In large software-intensive organizations, trouble reports (TRs) are central to reporting, analyzing, and resolving faults. Due to the scale of modern organizations and products, multiple people often identify the same fault independently, leading to duplicate TRs. To reduce the manual effort needed to identify and resolve these duplicate TRs, prior work at Ericsson developed a two-stage BERT-based retrieval system that identifies similar TRs given a new fault observation. This approach, although powerful, struggled to generalize to out-of-domain TRs. In this paper, we evaluate several fine-tuning strategies that integrate further domain knowledge, notably telecommunications knowledge, into the BERT-based TR retrieval models to (i) attain better performance on duplicate TR retrieval and identification and (ii) improve model generalizability to out-of-domain TR data. We find that adding domain-specific data when fine-tuning the models improved both overall model performance and model generalizability.
Subject headings
- NATURAL SCIENCES -- Computer and Information Sciences -- Computer Sciences (hsv//eng)
Keywords
- information retrieval
- bug reports
- trouble reports
- neural ranking
- catastrophic forgetting
- natural language processing
- transfer learning
- telecommunications
Publication and content type
- ref (subject category)
- kon (subject category)