SwePub

onr:"swepub:oai:DiVA.org:uu-121758"
 

The English-Swedish-Turkish Parallel Treebank

Megyesi, Beáta, 1971- (author)
Uppsala universitet, Institutionen för lingvistik och filologi, datorlingvistik
Dahlqvist, Bengt (author)
Uppsala universitet, Institutionen för lingvistik och filologi, datorlingvistik
Csató, Éva Ágnes, 1948- (author)
Uppsala universitet, Institutionen för lingvistik och filologi
Nivre, Joakim (author)
Uppsala universitet, Institutionen för lingvistik och filologi, datorlingvistik
2010
English.
In: Proceedings of Language Resources and Evaluation (LREC 2010).
  • Conference paper (peer-reviewed)
Abstract

  • We describe a syntactically annotated parallel corpus of three languages that differ in part typologically: English, Swedish and Turkish. The corpus consists of approximately 300 000 tokens in Swedish, 160 000 in Turkish and 150 000 in English, and contains both fiction and technical documents. We build the corpus using the Uplug toolkit for automatic structural markup, such as tokenization and sentence segmentation, as well as for sentence and word alignment. In addition, we use basic language resource kits for the linguistic analysis of the languages involved. The annotation is carried out on several layers, from morphological and part-of-speech analysis to dependency structures. The tools used for linguistic annotation, e.g. the HunPos tagger and MaltParser, are freely available data-driven resources, trained on existing corpora and treebanks for each language. The parallel treebank is used in teaching and linguistic research to study the relationship between the structurally different languages. To support the study of the treebank, several tools have been developed for visualizing the annotation and alignment and for searching for linguistic patterns.

Subject headings

NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)
HUMANITIES -- Languages and Literature -- Specific Languages (hsv//eng)

Keyword

treebank
parallel corpus
language resource
Computational linguistics
Language technology
Turkic languages

Publication and Content Type

ref (peer-reviewed)
kon (conference paper)
