SwePub

Result list for search "WFRF:(Klakow Dietrich)"

  • Result 1-5 of 5
1.
  • Adelani, David, et al. (author)
  • A Few Thousand Translations Go A Long Way! Leveraging Pre-trained Models for African News Translation
  • 2022
  • In: NAACL 2022. - Stroudsburg : Association for Computational Linguistics. - ISBN 9781955917711, pp. 3053-3070
  • Conference paper (peer-reviewed). Abstract:
    • Recent advances in the pre-training of language models leverage large-scale datasets to create multilingual models. However, low-resource languages are mostly left out in these datasets. This is primarily because many widely spoken languages are not well represented on the web and therefore excluded from the large-scale crawls used to create datasets. Furthermore, downstream users of these models are restricted to the selection of languages originally chosen for pre-training. This work investigates how to optimally leverage existing pre-trained models to create low-resource translation systems for 16 African languages. We focus on two questions: 1) How can pre-trained models be used for languages not included in the initial pre-training? and 2) How can the resulting translation models effectively transfer to new domains? To answer these questions, we create a new African news corpus covering 16 languages, of which eight languages are not part of any existing evaluation dataset. We demonstrate that the most effective strategy for transferring both to additional languages and to additional domains is to fine-tune large pre-trained models on small quantities of high-quality translation data.
2.
  • Adelani, David Ifeoluwa, et al. (author)
  • MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition
  • 2022
  • In: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. - Association for Computational Linguistics (ACL), pp. 4488-4508
  • Conference paper (peer-reviewed). Abstract:
    • African languages are spoken by over a billion people, but are underrepresented in NLP research and development. The challenges impeding progress include the limited availability of annotated datasets, as well as a lack of understanding of the settings where current methods are effective. In this paper, we make progress towards solutions for these challenges, focusing on the task of named entity recognition (NER). We create the largest human-annotated NER dataset for 20 African languages, and we study the behavior of state-of-the-art cross-lingual transfer methods in an Africa-centric setting, demonstrating that the choice of source language significantly affects performance. We show that choosing the best transfer language improves zero-shot F1 scores by an average of 14 points across 20 languages compared to using English. Our results highlight the need for benchmark datasets and models that cover typologically-diverse African languages.
3.
  • De Cock, Martine, et al. (author)
  • Privacy enhancing technologies
  • 2023
  • In: Privacy in Speech and Language Technology. - Schloss Dagstuhl, Leibniz-Zentrum für Informatik, pp. 90-99
  • Book chapter (peer-reviewed). Abstract:
    • Privacy-enhancing technologies (PETs) provide technical building blocks for achieving privacy by design and can be defined as technologies that embody fundamental data protection goals [13], including the goals of unlinkability, intervenability, transparency, and the classical CIA (confidentiality, integrity, availability) security goals, by minimizing personal data collection and use, maximizing data security, and empowering individuals. The privacy by design principle of a positive sum for speech and language technologies should enable users to benefit from the rich functions of these technologies while protecting the users’ privacy at the same time. The fundamental question is how to achieve privacy by design for speech and language technology without hampering the services. Different PETs can be utilized to achieve this goal. Below, we first discuss what type of personal data are accessible via speech and text and should be the target of protection by PETs. Then, we provide an overview of PETs that can provide protection and discuss their limitations and the challenges that arise when they are used for speech and language technologies.
4.
  • Privacy in Speech and Language Technology : Dagstuhl Seminar 22342
  • 2023
  • Editorial collection (other academic/artistic). Abstract:
    • This report documents the outcomes of Dagstuhl Seminar 22342 “Privacy in Speech and Language Technology”. The seminar brought together 27 attendees from 9 countries (Australia, Belgium, France, Germany, the Netherlands, Norway, Portugal, Sweden, and the USA) and 6 distinct disciplines (Speech Processing, Natural Language Processing, Privacy Enhancing Technologies, Machine Learning, Human Factors, and Law) in order to achieve a common understanding of the privacy threats raised by speech and language technology, as well as the existing solutions and the remaining issues in each discipline, and to draft an interdisciplinary roadmap towards solving those issues in the short or medium term. To achieve these goals, the first day and the morning of the second day were devoted to 3-minute self-introductions by all participants intertwined with 6 tutorials to introduce the terminology, the problems faced, and the solutions brought in each of the 6 disciplines. We also made a list of use cases and identified 6 cross-disciplinary topics to be discussed. The remaining days involved working groups to discuss these 6 topics, collaborative writing sessions to report on the findings of the working groups, and wrap-up sessions to discuss these findings with each other. A hike was organized in the afternoon of the third day. The seminar was a success: all participants actively participated in the working groups and the discussions, and went home with new ideas and new collaborators. This report gathers the abstracts of the 6 tutorials and the reports of the working groups, which we consider as valuable contributions towards a full-fledged roadmap.
5.
  • Shore, Todd, et al. (author)
  • Knowledge-Based Word Lattice Rescoring in a Dynamic Context
  • 2012
  • In: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012. - Portland, OR, USA : International Speech Communication Association. - ISBN 9781622767595, pp. 1082-1085
  • Conference paper (peer-reviewed). Abstract:
    • Recent advances in automatic speech recognition (ASR) technology continue to be based heavily on data-driven methods, meaning that the full benefits of such research are often not enjoyed in domains for which there is little training data. Moreover, tractability is often an issue with these methods when conditioning for long-distance dependencies, entailing that many higher-level knowledge sources such as situational knowledge cannot be easily utilized in classification. This paper describes an effort to circumvent this problem by using dynamic contextual knowledge to rescore ASR lattice output using a dynamic weighted constraint satisfaction function. With this method, it was possible to achieve a roughly 80% reduction in WER for ASR in the context of an air traffic control scenario.