AfriWOZ: Corpus for Exploiting Cross-Lingual Transfer for Dialogue Generation in Low-Resource, African Languages
-
- Adewumi, Tosin (author)
- Luleå tekniska universitet,EISLAB,Masakhane
-
- Adeyemi, Mofetoluwa (author)
- Masakhane
-
- Anuoluwapo, Aremu (author)
- Masakhane
-
- Peters, Bukola (author)
- CIS
-
- Buzaaba, Happy (author)
- Masakhane
-
- Samuel, Oyerinde (author)
- Masakhane
-
- Rufai, Amina Mardiyyah (author)
- Masakhane
-
- Ajibade, Benjamin (author)
- Masakhane
-
- Gwadabe, Tajudeen (author)
- Masakhane
-
- Koulibaly Traore, Mory Moussou (author)
- Masakhane
-
- Ajayi, Tunde Oluwaseyi (author)
- Masakhane
-
- Muhammad, Shamsuddeen (author)
-
- Baruwa, Ahmed (author)
- Masakhane
-
- Owoicho, Paul (author)
- Masakhane
-
- Ogunremi, Tolulope (author)
- Masakhane
-
- Ngigi, Phylis (author)
- Jomo Kenyatta University of Agriculture and Technology
-
- Ahia, Orevaoghene (author)
- Masakhane
-
- Nasir, Ruqayya (author)
- Masakhane
-
- Liwicki, Foteini (author)
- Luleå tekniska universitet,EISLAB
-
- Liwicki, Marcus (author)
- Luleå tekniska universitet,EISLAB
- Institute of Electrical and Electronics Engineers Inc. 2023
- 2023
- English.
-
In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings. - Institute of Electrical and Electronics Engineers Inc. - ISBN 9781665488686, 9781665488679
- Related links:
-
https://urn.kb.se/re...
-
https://doi.org/10.1...
Abstract
- Dialogue generation is an important NLP task fraught with many challenges. The challenges become more daunting for low-resource African languages. To enable the creation of dialogue agents for African languages, we contribute the first high-quality dialogue datasets for 6 African languages: Swahili, Wolof, Hausa, Nigerian Pidgin English, Kinyarwanda and Yorùbá. There are a total of 9,000 turns, each language having 1,500 turns, which we translate from a portion of the English multi-domain MultiWOZ dataset. Subsequently, we benchmark by investigating and analyzing the effectiveness of modelling through transfer learning by utilizing state-of-the-art (SoTA) deep monolingual models: DialoGPT and BlenderBot. We compare the models with a simple seq2seq baseline using perplexity. Besides this, we conduct human evaluation of single-turn conversations by using majority votes and measure inter-annotator agreement (IAA). We find that the hypothesis that deep monolingual models learn some abstractions that generalize across languages holds. We observe human-like conversations, to different degrees, in 5 out of the 6 languages. The language with the most transferable properties is Nigerian Pidgin English, with a human-likeness score of 78.1%, of which 34.4% are unanimous. We freely provide the datasets and host the model checkpoints/demos on the HuggingFace hub for public access.
Subject headings
- NATURVETENSKAP -- Data- och informationsvetenskap -- Språkteknologi (hsv//swe)
- NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)
- NATURVETENSKAP -- Data- och informationsvetenskap -- Datavetenskap (hsv//swe)
- NATURAL SCIENCES -- Computer and Information Sciences -- Computer Sciences (hsv//eng)
Keyword
- crosslingual
- dialogue systems
- low-resource
- multilingual
- NLG
- Maskininlärning
- Machine Learning
Publication and Content Type
- ref (subject category)
- kon (subject category)