SwePub

Search: id:"swepub:oai:gup.ub.gu.se/246849"

  • Adouane, Wafia, 1985, et al. (author)
  • Romanized Berber and Romanized Arabic Automatic Language Identification Using Machine Learning
  • 2016
  • In: Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects; 53–61; December 12, 2016; Osaka, Japan. - Association for Computational Linguistics. - 0736-587X.
  • Conference paper (peer-reviewed). Abstract:
    • Identifying the language of text or speech input is the first step in any language-dependent natural language processing. The task is called Automatic Language Identification (ALI). The field has been well studied since the early 1960s, and various methods have been applied to many standard languages. Standard ALI methods require training datasets and use character- or word-based n-gram models. However, social media and new technologies have contributed to the rise of informal and minority languages on the Web, and state-of-the-art automatic language identifiers fail to properly identify many of them. Romanized Arabic (RA) and Romanized Berber (RB) are examples of such informal, under-resourced languages. The goal of this paper is twofold: to detect RA and RB, at the document level, as separate languages, and to distinguish between them, as they coexist in North Africa. We treat the task as a classification problem and use supervised machine learning to solve it. For both languages, character-based 5-grams combined with additional lexicons score best, with F-scores of 99.75% for RB and 97.77% for RA.
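The abstract describes document-level language identification as supervised classification over character 5-grams. The following is a minimal sketch of that general idea, not the authors' implementation: the abstract does not name the classifier, so a multinomial Naive Bayes model with add-one smoothing is assumed here, and the lexicon features are omitted. The names `char_ngrams` and `NgramNaiveBayes`, the labels `lang_a` and `lang_b`, and the toy strings are all hypothetical placeholders, not Romanized Arabic or Romanized Berber data.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=5):
    """Return overlapping character n-grams of a lowercased, space-padded string."""
    padded = " " + text.lower() + " "
    if len(padded) < n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over character n-gram counts (assumed classifier)."""

    def __init__(self, n=5):
        self.n = n
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.doc_counts = Counter()         # label -> number of training documents
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, docs, labels):
        for doc, label in zip(docs, labels):
            grams = char_ngrams(doc, self.n)
            self.counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, doc):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in sorted(self.counts):
            gram_counts = self.counts[label]
            total = sum(gram_counts.values())
            # Log prior plus add-one smoothed log likelihood of each n-gram.
            score = math.log(self.doc_counts[label] / total_docs)
            for gram in char_ngrams(doc, self.n):
                score += math.log((gram_counts[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy placeholder documents; a real experiment would use labeled corpora.
train_docs = ["abcabc abcabc cabcab", "xyzxyz zyxzyx xyzzyx"]
train_labels = ["lang_a", "lang_b"]
clf = NgramNaiveBayes(n=5).fit(train_docs, train_labels)
print(clf.predict("bcabca abc"))
print(clf.predict("zyx xyz"))
```

Padding each document with spaces lets the n-grams capture word boundaries, which is one common reason character n-grams work well on short, noisy text.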
Type of publication
conference paper (1)
Type of content
peer-reviewed (1)
Author/editor
Johansson, Richard, ... (1)
Adouane, Wafia, 1985 (1)
Semmar, Nasredine (1)
University
University of Gothenburg (1)
Language
English (1)
Research subject (UKÄ/SCB)
Natural sciences (1)
Humanities (1)