SwePub

Results list for search "WFRF:(Alabi Jesujoba O.)"

  • Results 1-2 of 2
1.
  • Abdulmumin, Idris, et al. (authors)
  • Separating Grains from the Chaff : Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages
  • 2022
  • In: Proceedings of the Seventh Conference on Machine Translation (WMT). Association for Computational Linguistics. ISBN 9781959429296, pp. 1001-1014
  • Conference paper (peer-reviewed). Abstract:
    • We participated in the WMT 2022 Large-Scale Machine Translation Evaluation for the African Languages Shared Task. This work describes our approach, which is based on filtering the given noisy data using a sentence-pair classifier that was built by fine-tuning a pre-trained language model. To train the classifier, we obtain positive samples (i.e., high-quality parallel sentences) from a gold-standard curated dataset and extract negative samples (i.e., low-quality parallel sentences) from automatically aligned parallel data by choosing sentences with low alignment scores. Our final machine translation model was then trained on filtered data, instead of the entire noisy dataset. We empirically validate our approach by evaluating on two common datasets and show that data filtering generally improves overall translation quality, in some cases even significantly.
2.
  • Adelani, David Ifeoluwa, et al. (authors)
  • MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition
  • 2022
  • In: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics (ACL), pp. 4488-4508
  • Conference paper (peer-reviewed). Abstract:
    • African languages are spoken by over a billion people, but are underrepresented in NLP research and development. The challenges impeding progress include the limited availability of annotated datasets, as well as a lack of understanding of the settings where current methods are effective. In this paper, we make progress towards solutions for these challenges, focusing on the task of named entity recognition (NER). We create the largest human-annotated NER dataset for 20 African languages, and we study the behavior of state-of-the-art cross-lingual transfer methods in an Africa-centric setting, demonstrating that the choice of source language significantly affects performance. We show that choosing the best transfer language improves zero-shot F1 scores by an average of 14 points across 20 languages compared to using English. Our results highlight the need for benchmark datasets and models that cover typologically-diverse African languages.