SwePub

Result list for search "L773:9798891760608"

  • Results 1-5 of 5
1.
  • Berdicevskis, Aleksandrs, 1983, et al. (authors)
  • Superlim: A Swedish Language Understanding Evaluation Benchmark
  • 2023
  • In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, December 6-10, 2023, Singapore / Houda Bouamor, Juan Pino, Kalika Bali (Editors). - Stroudsburg, PA : Association for Computational Linguistics. - 9798891760608
  • Conference paper (peer-reviewed)
2.
  • Bruton, Micaella, et al. (authors)
  • BERTie Bott's Every Flavor Labels : A Tasty Introduction to Semantic Role Labeling for Galician
  • 2023
  • In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. - Association for Computational Linguistics. - 9798891760608, pp. 10892-10902
  • Conference paper (peer-reviewed). Abstract:
    • In this paper, we leverage existing corpora, WordNet, and dependency parsing to build the first Galician dataset for training semantic role labeling systems in an effort to expand available NLP resources. Additionally, we introduce verb indexing, a new pre-processing method, which helps increase performance when semantically parsing highly complex sentences. We use transfer learning to test both the resource and the verb indexing method. Our results show that the effects of verb indexing were amplified in scenarios where the model was both pre-trained and fine-tuned on datasets utilizing the method, but improvements are also noticeable when it is used only during fine-tuning. The best-performing Galician SRL model achieved an F1 score of 0.74, introducing a baseline for future Galician SRL systems. We also tested our method on Spanish, where we achieved an F1 score of 0.83, outperforming the baseline set by the 2009 CoNLL Shared Task by 0.025 and showing the merits of our verb indexing method for pre-processing.
3.
  • Muhammad, Shamsuddeen, et al. (authors)
  • AfriSenti : A Twitter Sentiment Analysis Benchmark for African Languages
  • 2023
  • In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. - Association for Computational Linguistics. - 9798891760608, pp. 13968-13981
  • Conference paper (peer-reviewed). Abstract:
    • Africa is home to over 2,000 languages from over six language families and has the highest linguistic diversity among all continents. This includes 75 languages with at least one million speakers each. Yet, there is little NLP research conducted on African languages. Crucial in enabling such research is the availability of high-quality annotated datasets. In this paper, we introduce AfriSenti, a sentiment analysis benchmark that contains more than 110,000 tweets in 14 African languages (Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and Yoruba) from four language families. The tweets were annotated by native speakers and used in the AfriSenti-SemEval shared task (with over 200 participants, see website: https://afrisenti-semeval.github.io). We describe the data collection methodology, annotation process, and the challenges we dealt with when curating each dataset. We further report baseline experiments conducted on the AfriSenti datasets and discuss their usefulness.
4.
  • Noble, Bill, et al. (authors)
  • Describe Me an Auklet: Generating Grounded Perceptual Category Descriptions
  • 2023
  • In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, December 6-10, 2023, Singapore / Houda Bouamor, Juan Pino, Kalika Bali (Editors). - Association for Computational Linguistics. - 9798891760608
  • Conference paper (peer-reviewed). Abstract:
    • Human speakers can generate descriptions of perceptual concepts, abstracted from the instance level. Moreover, such descriptions can be used by other speakers to learn provisional representations of those concepts. Learning and using abstract perceptual concepts is under-investigated in the language-and-vision field. The problem is also highly relevant to the field of representation learning in multi-modal NLP. In this paper, we introduce a framework for testing category-level perceptual grounding in multi-modal language models. In particular, we train separate neural networks to generate and interpret descriptions of visual categories. We measure the communicative success of the two models with the zero-shot classification performance of the interpretation model, which we argue is an indicator of perceptual grounding. Using this framework, we compare the performance of prototype- and exemplar-based representations. Finally, we show that communicative success exposes performance issues in the generation model that are not captured by traditional intrinsic NLG evaluation metrics, and argue that these issues stem from a failure to properly ground language in vision at the category level.
5.
  • Wilkens, Rodrigo, et al. (authors)
  • TCFLE-8: a Corpus of Learner Written Productions for French as a Foreign Language and its Application to Automated Essay Scoring
  • 2023
  • In: EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings. - 9798891760608
  • Conference paper (peer-reviewed). Abstract:
    • Automated Essay Scoring (AES) aims to automatically assess the quality of essays. Automation enables large-scale assessment and improvements in consistency, reliability, and standardization. Those characteristics are of particular relevance in the context of language certification exams. However, a major bottleneck in the development of AES systems is the availability of corpora, which, unfortunately, are scarce, especially for languages other than English. In this paper, we aim to foster the development of AES for French by providing the TCFLE-8 corpus, a corpus of 6.5k essays collected in the context of the Test de Connaissance du Français (TCF - French Knowledge Test) certification exam. We report the strict quality procedure that led to the scoring of each essay by at least two raters according to the levels of the Common European Framework of Reference for Languages (CEFR) and to the creation of a balanced corpus. In addition, we describe how linguistic properties of the essays relate to the learners' proficiency in TCFLE-8. We also advance the state-of-the-art performance for the AES task in French by experimenting with two strong baselines (i.e., RoBERTa and feature-based). Finally, we discuss the challenges of AES using TCFLE-8.