Identifying Sentiments in Algerian Code-switched User-generated Comments
- Adouane, Wafia, 1985 (author)
- Gothenburg University (Göteborgs universitet), Department of Philosophy, Linguistics and Theory of Science
- Touileb, Samia (author)
- Bernardy, Jean-Philippe, 1978 (author)
- Gothenburg University (Göteborgs universitet), Department of Philosophy, Linguistics and Theory of Science
- Paris : The European Language Resources Association, 2020
- 2020
- English.
- Published in: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, 11–16 May 2020. - Paris : The European Language Resources Association. - ISBN 9791095546344
- Related link: https://gup.ub.gu.se...
Abstract
- We present our work on Algerian, an under-resourced North African colloquial Arabic variety, for which we built a comparatively large corpus of more than 36,000 code-switched user-generated comments annotated for sentiment. We opted for this data domain because Algerian is a colloquial language with no existing freely available corpora. Moreover, we compiled sentiment lexicons of positive and negative unigrams and bigrams reflecting the code-switches present in the language. We compare the performance of four models on the task of identifying sentiments, and the results indicate that a CNN model trained end-to-end better fits our unedited, code-switched, and unbalanced data across the predefined sentiment classes. Additionally, injecting the lexicons as background knowledge into the model boosts its performance on the minority class, with a gain of 10.54 points in F-score. The results of our experiments can be used as a baseline for future research on Algerian sentiment analysis.
Subject terms
- HUMANITIES -- Languages and Literature -- General Language Studies and Linguistics (hsv//eng)
- NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)
Keywords
- Algerian Arabic
- code-switching
- user-generated data
- sentiment analysis
- under-resourced colloquial languages
Publication and content type
- ref (subject category)
- kon (subject category)