A statistical model for grammar mapping
- Basirat, Ali (author)
  - Uppsala universitet, Institutionen för lingvistik och filologi; University of Tehran
- Faili, Heshaam (author)
- Nivre, Joakim (author)
  - Uppsala universitet, Institutionen för lingvistik och filologi
- Cambridge University Press, 2016
- English.
- In: Natural Language Engineering. Cambridge University Press. ISSN 1351-3249, 1469-8110. 22:2, pp. 215-255
- Related links:
  - https://urn.kb.se/re...
  - https://doi.org/10.1...
Abstract
- The two main classes of grammars are (a) hand-crafted grammars, which are developed by language experts, and (b) data-driven grammars, which are extracted from annotated corpora. This paper introduces a statistical method for mapping the elementary structures of a data-driven grammar onto the elementary structures of a hand-crafted grammar in order to combine their advantages. The idea is employed in the context of Lexicalized Tree-Adjoining Grammars (LTAG) and tested on two LTAGs of English: the hand-crafted LTAG developed in the XTAG project, and the data-driven LTAG, which is automatically extracted from the Penn Treebank and used by the MICA parser. We propose a statistical model for mapping any elementary tree sequence of the MICA grammar onto a proper elementary tree sequence of the XTAG grammar. The model has been tested on three subsets of the WSJ corpus that have average lengths of 10, 16, and 18 words, respectively. The experimental results show that full-parse trees with average F1-scores of 72.49, 64.80, and 62.30 points could be built from 94.97%, 96.01%, and 90.25% of the XTAG elementary tree sequences assigned to the subsets, respectively. Moreover, by reducing the amount of syntactic lexical ambiguity of sentences, the proposed model significantly improves the efficiency of parsing in the XTAG system.
Subject headings
- NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)
Keywords
- Computer Science with specialization in Human-Computer Interaction
- Linguistics
Publication and content type
- ref (subject category)
- art (subject category)