-
Ghafoor, Abdul (Sukkur IBA University, Pakistan)
(author)
The Impact of Translating Resource-Rich Datasets to Low-Resource Languages Through Multi-Lingual Text Processing
- Article/chapter, English, 2021
Publisher, publication year, extent ...
-
IEEE, 2021
-
electronic (rdacarrier)
Numbers
-
LIBRIS-ID:oai:DiVA.org:lnu-106986
-
URI: https://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-106986
-
DOI: https://doi.org/10.1109/ACCESS.2021.3110285
Supplementary language notes
-
Language:English
-
Summary in:English
Part of subdatabase
Classification
-
Subject category:ref swepub-contenttype
-
Subject category:art swepub-publicationtype
Notes
-
Urdu is still considered a low-resource language despite being ranked as the world’s 10th most spoken language, with nearly 230 million speakers. The scarcity of benchmark datasets in low-resource languages has led researchers to adopt more ingenious techniques to work around the issue. One widely adopted option is to use language translation services to replicate existing datasets from resource-rich languages, such as English, in low-resource languages, such as Urdu. For most natural language processing tasks, including polarity assessment, words translated from one language to another via Google translator often change in meaning. This results in a polarity shift that degrades the system’s performance, particularly for sentiment classification and emotion detection tasks. This study evaluates the effect of translation from a resource-rich language to a low-resource language on the sentiment classification task. It identifies the words causing polarity shift and lists them in five distinct categories. It further examines the correlation between languages with similar roots. Our study shows a performance degradation of 2-3 percentage points in sentiment classification due to polarity shift resulting from translation from resource-rich languages to low-resource languages.
Subject headings and genre
Added entries (persons, corporate bodies, meetings, titles ...)
-
Imran, Ali Shariq (Norwegian University of Science and Technology (NTNU), Norway)
(author)
-
Daudpota, Sher Muhammad (Sukkur IBA University, Pakistan)
(author)
-
Kastrati, Zenun, 1984- (Linnéuniversitetet, Institutionen för informatik (IK)) (Swepub:lnu)zekaaa
(author)
-
Soomro, Abdullah (Sukkur IBA University, Pakistan)
(author)
-
Batra, Rakhi (Sukkur IBA University, Pakistan)
(author)
-
Wani, Mudasir Ahmad (Norwegian University of Science and Technology, Norway)
(author)
-
Sukkur IBA University, Pakistan; Norwegian University of Science and Technology (NTNU), Norway
(creator_code:org_t)
Related titles
-
In: IEEE Access: IEEE, 9, pp. 124478-124490, ISSN 2169-3536