A distantly supervi...
Abstract
- This paper presents our submission to the first Shared Task on Multilingual Grammatical Error Detection (MultiGED-2023). Our method utilizes a transformer-based sequence-to-sequence model, which was trained on a synthetic dataset consisting of 3.2 billion words. We adopt a distantly supervised approach, with the training process relying exclusively on the distribution of language learners' errors extracted from the annotated corpus used to construct the training data. In the Swedish track, our model ranks fourth out of seven submissions in terms of the target F0.5 metric, while achieving the highest precision. These results suggest that our model is conservative yet remarkably precise in its predictions.
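The F0.5 metric named in the abstract weights precision more heavily than recall, which is why a conservative but precise system can still score well on it. A minimal sketch of the standard F-beta formula follows; the function name `f_beta` and the precision/recall values are hypothetical illustrations, not figures from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 weights precision more than recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is predicted correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical systems (illustrative numbers only, not results from the paper):
# one conservative and precise, one balanced.
conservative = f_beta(precision=0.9, recall=0.3)   # F0.5
balanced = f_beta(precision=0.6, recall=0.6)       # F0.5

# The same two systems under F1 (beta = 1), where recall counts equally.
conservative_f1 = f_beta(precision=0.9, recall=0.3, beta=1.0)
balanced_f1 = f_beta(precision=0.6, recall=0.6, beta=1.0)

print(f"F0.5: conservative={conservative:.3f}, balanced={balanced:.3f}")
print(f"F1:   conservative={conservative_f1:.3f}, balanced={balanced_f1:.3f}")
```

With these made-up numbers the conservative system comes out ahead under F0.5 but behind under F1, which illustrates how the choice of beta changes a ranking.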
Subject headings
- NATURVETENSKAP -- Data- och informationsvetenskap -- Språkteknologi (hsv//swe)
- NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)
- HUMANIORA -- Språk och litteratur -- Jämförande språkvetenskap och allmän lingvistik (hsv//swe)
- HUMANITIES -- Languages and Literature -- General Language Studies and Linguistics (hsv//eng)
Keywords
- gec
- grammatical error correction
- grammatical error detection
- language learning
- computer-assisted language learning
- Computational Linguistics
- datorlingvistik
Publication and content type
- ref (subject category)
- kon (subject category)