SwePub

Result list for search "L773:1574 020X OR L773:1574 0218 OR L773:1572 8412 srt2:(2020-2022)"

  • Results 1-3 of 3
1.
  • Buljan, Maja, et al. (author)
  • A Tale of Four Parsers : Methodological Reflections on Diagnostic Evaluation and In-Depth Error Analysis for Meaning Representation Parsing
  • 2022
  • In: Language Resources and Evaluation. - : Springer Science and Business Media LLC. - 1574-020X .- 1574-0218. ; 56:4, pp. 1075-1102
  • Journal article (peer-reviewed). Abstract:
    • We discuss methodological choices in diagnostic evaluation and error analysis in meaning representation parsing (MRP), i.e. mapping from natural language utterances to graph-based encodings of semantic structure. We expand on a pilot quantitative study in contrastive diagnostic evaluation, inspired by earlier work in syntactic dependency parsing, and propose a novel methodology for qualitative error analysis. This two-pronged study is performed using a selection of submissions, data, and evaluation tools featured in the 2019 shared task on MRP. Our aim is to devise methods for identifying strengths and weaknesses in different broad families of parsing techniques, as well as investigating the relations between specific parsing approaches, different meaning representation frameworks, and individual linguistic phenomena—by identifying and comparing common error patterns. Our preliminary empirical results suggest that the proposed methodologies can be meaningfully applied to parsing into graph-structured target representations, as a side-effect uncovering hitherto unknown properties of the different systems that can inform future development and cross-fertilization across approaches.
2.
  • Lenci, Alessandro, et al. (author)
  • A comparative evaluation and analysis of three generations of Distributional Semantic Models
  • 2022
  • In: Language resources and evaluation. - : Springer Science and Business Media B.V.. - 1574-020X .- 1574-0218. ; 56, pp. 1219-
  • Journal article (peer-reviewed). Abstract:
    • Distributional semantics has deeply changed in the last decades. First, predict models stole the thunder from traditional count ones, and more recently both of them were replaced in many NLP applications by contextualized vectors produced by neural language models. Although an extensive body of research has been devoted to Distributional Semantic Model (DSM) evaluation, we still lack a thorough comparison with respect to tested models, semantic tasks, and benchmark datasets. Moreover, previous work has mostly focused on task-driven evaluation, instead of exploring the differences between the way models represent the lexical semantic space. In this paper, we perform a large-scale evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT. First of all, we investigate the performance of embeddings in several semantic tasks, carrying out an in-depth statistical analysis to identify the major factors influencing the behavior of DSMs. The results show that (i) the alleged superiority of predict based models is more apparent than real, and surely not ubiquitous and (ii) static DSMs surpass BERT representations in most out-of-context semantic tasks and datasets. Furthermore, we borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models. RSA reveals important differences related to the frequency and part-of-speech of lexical items. © 2022, The Author(s).
3.
  • Zeyrek, Deniz, et al. (author)
  • TED Multilingual Discourse Bank (TED-MDB) : a parallel corpus annotated in the PDTB style
  • 2020
  • In: Language resources and evaluation. - : Springer Science and Business Media LLC. - 1574-020X .- 1574-0218. ; 54, pp. 587-613
  • Journal article (peer-reviewed). Abstract:
    • TED-Multilingual Discourse Bank, or TED-MDB, is a multilingual resource where TED-talks are annotated at the discourse level in 6 languages (English, Polish, German, Russian, European Portuguese, and Turkish) following the aims and principles of PDTB. We explain the corpus design criteria, which has three main features: the linguistic characteristics of the languages involved, the interactive nature of TED talks—which led us to annotate Hypophora, and the decision to avoid projection. We report our annotation consistency, and post-annotation alignment experiments, and provide a cross-lingual comparison based on corpus statistics.