Bootstrapping Language Description : The case of Mpiemo (Bantu A, Central African Republic)
- Hammarström, Harald (author)
- Gothenburg University, Department of Computer Science and Engineering, Computing Science (GU); Department of Computing Science, Chalmers University, Gothenburg
- Thornell, Christina, 1948 (author)
- Gothenburg University, Department of Oriental and African Languages; Department of African Languages, Gothenburg University, Gothenburg
- Petzell, Malin, 1972 (author)
- Gothenburg University, Department of Oriental and African Languages; Department of African Languages, Gothenburg University, Gothenburg
- Westerlund, Torbjörn, 1971- (author)
- Uppsala University, Department of Linguistics and Philology
- 2008
- English.
In: Proceedings of the 6th edition of the Language Resources and Evaluation Conference (LREC 2008), 28-30 May 2008, Marrakech, Morocco.
- Related links:
- http://www.lrec-conf...
- https://uu.diva-port... (primary) (Raw object)
- https://urn.kb.se/re...
- https://research.cha...
- https://gup.ub.gu.se...
Abstract
- Linguists have long been producing grammatical descriptions of yet undescribed languages. This is a time-consuming process, which has already adapted to improved technology for recording and storage. We present here a novel application of NLP techniques to bootstrap analysis of collected data and speed up manual selection work. To be more precise, we argue that unsupervised induction of morphology and part-of-speech analysis from raw text data is mature enough to produce useful results. Experiments with Latent Semantic Analysis were less fruitful. We exemplify this on Mpiemo, a so far essentially undescribed Bantu language of the Central African Republic, for which raw text data was available.
Subject headings
- HUMANITIES -- Languages and Literature -- Specific Languages (hsv//eng)
- NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)
- NATURAL SCIENCES -- Computer and Information Sciences -- Computer Sciences (hsv//eng)
Keywords
- Mpiemo
- Bantu A
- Central African Republic
- NLP
- Latent Semantic Analysis
- bootstrapping
- African languages
- Computational linguistics
- Acquisition
- Machine Learning
- Endangered languages
- Language modelling
Publication and content type
- ref (subject category)
- kon (subject category)