An Aligned Resource of Swedish Complex-Simple Sentence Pairs
- Rennes, Evelina, 1990- (author)
- Linköpings universitet, Interaktiva och kognitiva system, Tekniska fakulteten
- 2018
- English.
In: Proceedings of the Seventh Swedish Language Technology Conference (SLTC).
- Related link:
https://urn.kb.se/re...
Abstract
- We present a method for aligning comparable corpora of simple-complex articles at the sentence level. Three methods were tested: Average Alignment (AA), Maximum Alignment (MA), and Hungarian Alignment (HA). To evaluate the algorithms and find the optimal combination of parameters, a dataset of manually annotated sentences was constructed. The algorithms were evaluated against this manually annotated dataset, and the best-performing algorithm proved to be MA, which resulted in a corpus comprising 59,513 aligned sentence pairs, of which 17,653 were unique sentences.
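The abstract names the three measures without defining them. As a rough sketch only, assuming each measure scores a sentence pair from a matrix of pairwise word similarities (an assumption, not something stated in this record), they might look as follows in Python. The function names, the normalisation, and the toy matrix are hypothetical, and a brute-force search stands in for the Hungarian algorithm, which would find the same optimal one-to-one assignment more efficiently.

```python
from itertools import permutations


def average_alignment(sim):
    """Mean of all pairwise similarities in the matrix."""
    total = sum(sum(row) for row in sim)
    return total / (len(sim) * len(sim[0]))


def maximum_alignment(sim):
    """Average of each row's best match and each column's best match."""
    row_best = sum(max(row) for row in sim) / len(sim)
    col_best = sum(max(col) for col in zip(*sim)) / len(sim[0])
    return (row_best + col_best) / 2


def one_to_one_alignment(sim):
    """Score of the best one-to-one assignment, found by brute force.

    Stand-in for the Hungarian algorithm; only practical for tiny matrices.
    """
    rows, cols = len(sim), len(sim[0])
    if rows > cols:
        # Transpose so that every row can receive a distinct column.
        sim = [list(col) for col in zip(*sim)]
        rows, cols = cols, rows
    best = max(
        sum(sim[i][j] for i, j in enumerate(perm))
        for perm in permutations(range(cols), rows)
    )
    return best / rows


# Hypothetical similarities: 2 words in one sentence, 3 in the other.
toy_sim = [
    [0.9, 0.1, 0.2],
    [0.3, 0.8, 0.4],
]

if __name__ == "__main__":
    print("AA:", average_alignment(toy_sim))
    print("MA:", maximum_alignment(toy_sim))
    print("HA (brute force):", one_to_one_alignment(toy_sim))
```

In a sketch like this, a sentence pair would be kept as aligned when its score exceeds a tuned threshold; the parameters actually used for the resource are not given in this record.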
Subject terms
- NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)
Publication and content type
- vet (subject category)
- kon (subject category)