SwePub
Hit list for search "WFRF:(Stymne Sara 1977 )"

  • Results 1-10 of 69
1.
  • Adams, Allison, et al. (author)
  • Learning with learner corpora : Using the TLE for native language identification
  • 2017
  • In: Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition, pp. 1-7
  • Conference paper (peer-reviewed); abstract:
    • This study investigates the usefulness of the Treebank of Learner English (TLE) when applied to the task of Native Language Identification (NLI). The TLE is effectively a parallel corpus of Standard/Learner English, as there are two versions; one based on original learner essays, and the other an error-corrected version. We use the corpus to explore how useful a parser trained on ungrammatical relations is compared to a parser trained on grammatical relations, when used as features for a native language classification task. While parsing results are much better when trained on grammatical relations, native language classification is slightly better using a parser trained on the original treebank containing ungrammatical relations.
2.
  • Cerniavski, Rafal, et al. (author)
  • Multilingual Automatic Speech Recognition for Scandinavian Languages
  • 2023
  • In: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa). Tartu: University of Tartu. ISBN 9789916219997, pp. 460-466
  • Conference paper (peer-reviewed); abstract:
    • We investigate the effectiveness of multilingual automatic speech recognition models for Scandinavian languages by further fine-tuning a Swedish model on Swedish, Danish, and Norwegian. We first explore zero-shot models, which perform poorly across the three languages. However, we show that a multilingual model based on a strong Swedish model, further fine-tuned on all three languages, performs well for Norwegian and Danish, with a relatively low decrease in the performance for Swedish. With a language classification module, we improve the performance of the multilingual model even further.
3.
  • Černiavski, Rafal, et al. (author)
  • Uppsala University at SemEval-2022 Task 1 : Can Foreign Entries Enhance an English Reverse Dictionary?
  • 2022
  • In: Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022). Stroudsburg, PA, USA: Association for Computational Linguistics. ISBN 9781955917803, pp. 88-93
  • Conference paper (peer-reviewed); abstract:
    • We present the Uppsala University system for SemEval-2022 Task 1: Comparing Dictionaries and Word Embeddings (CODWOE). We explore the performance of multilingual reverse dictionaries as well as the possibility of utilizing annotated data in other languages to improve the quality of a reverse dictionary in the target language. We mainly focus on character-based embeddings. In our main experiment, we train multilingual models by combining the training data from multiple languages. In an additional experiment, using resources beyond the shared task, we use the training data in Russian and French to improve the English reverse dictionary using unsupervised embeddings alignment and machine translation. The results show that multilingual models occasionally but not consistently can outperform the monolingual baselines. In addition, we demonstrate an improvement of an English reverse dictionary using translated entries from the Russian training data set.
4.
  • Danilova, Vera, et al. (author)
  • UD-MULTIGENRE : a UD-Based Dataset Enriched with Instance-Level Genre Annotations
  • 2023
  • In: Proceedings of the 3rd Workshop on Multi-lingual Representation Learning (MRL). Association for Computational Linguistics. ISBN 9798891760561, pp. 253-267
  • Conference paper (peer-reviewed); abstract:
    • Prior research on the impact of genre on cross-lingual dependency parsing has suggested that genre is an important signal. However, these studies suffer from a scarcity of reliable data for multiple genres and languages. While Universal Dependencies (UD), the only available large-scale resource for cross-lingual dependency parsing, contains data from diverse genres, the documentation of genre labels is missing, and there are multiple inconsistencies. This makes studies of the impact of genres difficult to design. To address this, we present a new dataset, UD-MULTIGENRE, where 17 genres are defined and instance-level annotations of these are applied to a subset of UD data, covering 38 languages. It provides a rich ground for research related to text genre from a multilingual perspective. Utilizing this dataset, we can overcome the data shortage that hindered previous research and reproduce experiments from earlier studies with an improved setup. We revisit a previous study that used genre-based clusters and show that the clusters for most target genres provide a mix of genres. We compare training data selection based on clustering and gold genre labels and provide an analysis of the results. The dataset is publicly available. (https://github.com/UppsalaNLP/UD-MULTIGENRE)
5.
  • de Lhoneux, Miryam, 1990-, et al. (author)
  • Arc-Hybrid Non-Projective Dependency Parsing with a Static-Dynamic Oracle
  • 2017
  • In: IWPT 2017 15th International Conference on Parsing Technologies. Pisa, Italy: Association for Computational Linguistics. ISBN 9781945626739, pp. 99-104
  • Conference paper (peer-reviewed); abstract:
    • We extend the arc-hybrid transition system for dependency parsing with a SWAP transition that enables reordering of the words and construction of non-projective trees. Although this extension potentially breaks the arc-decomposability of the transition system, we show that the existing dynamic oracle can be modified and combined with a static oracle for the SWAP transition. Experiments on five languages with different degrees of non-projectivity show that the new system gives competitive accuracy and is significantly better than a system trained with a purely static oracle.
6.
  • de Lhoneux, Miryam, 1990-, et al. (author)
  • From raw text to Universal Dependencies : look, no tags!
  • 2017
  • In: Proceedings of the CoNLL 2017 Shared Task. Vancouver, Canada: Association for Computational Linguistics. ISBN 9781945626708, pp. 207-217
  • Conference paper (peer-reviewed); abstract:
    • We present the Uppsala submission to the CoNLL 2017 shared task on parsing from raw text to universal dependencies. Our system is a simple pipeline consisting of two components. The first performs joint word and sentence segmentation on raw text; the second predicts dependency trees from raw words. The parser bypasses the need for part-of-speech tagging, but uses word embeddings based on universal tag distributions. We achieved a macroaveraged LAS F1 of 65.11 in the official test run and obtained the 2nd best result for sentence segmentation with a score of 89.03. After fixing two bugs, we obtained an unofficial LAS F1 of 70.49.
7.
  • de Lhoneux, Miryam, 1990-, et al. (author)
  • What Should/Do/Can LSTMs Learn When Parsing Auxiliary Verb Constructions?
  • 2019
  • In: CoRR, abs/1907.07950
  • Journal article (other academic/artistic); abstract:
    • This article is a linguistic investigation of a neural parser. We look at transitivity and agreement information of auxiliary verb constructions (AVCs) in comparison to finite main verbs (FMVs). This comparison is motivated by theoretical work in dependency grammar and in particular the work of Tesnière (1959), where AVCs and FMVs are both instances of a nucleus, the basic unit of syntax. An AVC is a dissociated nucleus; it consists of at least two words, and an FMV is its non-dissociated counterpart, consisting of exactly one word. We suggest that the representation of AVCs and FMVs should capture similar information. We use diagnostic classifiers to probe agreement and transitivity information in vectors learned by a transition-based neural parser in four typologically different languages. We find that the parser learns different information about AVCs and FMVs if only sequential models (BiLSTMs) are used in the architecture but similar information when a recursive layer is used. We find explanations for why this is the case by looking closely at how information is learned in the network and looking at what happens with different dependency representations of AVCs.
8.
  • de Lhoneux, Miryam, 1990-, et al. (author)
  • What Should/Do/Can LSTMs Learn When Parsing Auxiliary Verb Constructions?
  • 2020
  • In: Computational linguistics - Association for Computational Linguistics (Print). MIT Press. ISSN 0891-2017, 1530-9312. 46:4, pp. 763-784
  • Journal article (peer-reviewed); abstract:
    • There is a growing interest in investigating what neural NLP models learn about language. A prominent open question is the question of whether or not it is necessary to model hierarchical structure. We present a linguistic investigation of a neural parser adding insights to this question. We look at transitivity and agreement information of auxiliary verb constructions (AVCs) in comparison to finite main verbs (FMVs). This comparison is motivated by theoretical work in dependency grammar and in particular the work of Tesnière (1959), where AVCs and FMVs are both instances of a nucleus, the basic unit of syntax. An AVC is a dissociated nucleus; it consists of at least two words, and an FMV is its non-dissociated counterpart, consisting of exactly one word. We suggest that the representation of AVCs and FMVs should capture similar information. We use diagnostic classifiers to probe agreement and transitivity information in vectors learned by a transition-based neural parser in four typologically different languages. We find that the parser learns different information about AVCs and FMVs if only sequential models (BiLSTMs) are used in the architecture but similar information when a recursive layer is used. We find explanations for why this is the case by looking closely at how information is learned in the network and looking at what happens with different dependency representations of AVCs. We conclude that there may be benefits to using a recursive layer in dependency parsing and that we have not yet found the best way to integrate it in our parsers.
9.
  • Della Corte, Giuseppe, et al. (author)
  • IESTAC : English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation
  • 2020
  • In: Proceedings of the First International Workshop on Natural Language Processing Beyond Text. Stroudsburg, PA, USA: Association for Computational Linguistics, pp. 41-50
  • Conference paper (peer-reviewed); abstract:
    • We discuss a set of methods for the creation of IESTAC: an English-Italian speech and text parallel corpus designed for the training of end-to-end speech-to-text machine translation models and publicly released as part of this work. We first mapped English LibriVox audiobooks and their corresponding English Gutenberg Project e-books to Italian e-books with a set of three complementary methods. Then we aligned the English and the Italian texts using both traditional Gale-Church based alignment methods and a recently proposed tool that performs bilingual sentence alignment by computing the cosine similarity of multilingual sentence embeddings. Finally, we forced the alignment between the English audiobooks and the English side of our textual parallel corpus with a text-to-speech and dynamic time warping based forced alignment tool. For each step, we provide the reader with a critical discussion based on detailed evaluation and comparison of the results of the different methods.
10.
  • Dürlich, Luise, et al. (author)
  • Cause and Effect in Governmental Reports: Two Data Sets for Causality Detection in Swedish
  • 2022
  • In: Proceedings of the First Workshop on Natural Language Processing for Political Sciences (PoliticalNLP), Marseille, France, 24 June 2022, pp. 46-55
  • Conference paper (peer-reviewed); abstract:
    • Causality detection is the task of extracting information about causal relations from text. It is an important task for different types of document analysis, including political impact assessment. We present two new data sets for causality detection in Swedish. The first data set is annotated with binary relevance judgments, indicating whether a sentence contains causality information or not. In the second data set, sentence pairs are ranked for relevance with respect to a causality query, containing a specific hypothesized cause and/or effect. Both data sets are carefully curated and mainly intended for use as test data. We describe the data sets and their annotation, including detailed annotation guidelines. In addition, we present pilot experiments on cross-lingual zero-shot and few-shot causality detection, using training data from English and German.
Type of publication
conference paper (62)
journal article (5)
doctoral thesis (1)
licentiate thesis (1)
Type of content
peer-reviewed (61)
other academic/artistic (8)
Author/editor
Stymne, Sara, 1977- (69)
Nivre, Joakim, 1962- (10)
Ahrenberg, Lars, 194 ... (8)
de Lhoneux, Miryam, ... (7)
Holmqvist, Maria, 19 ... (7)
Tiedemann, Jörg (5)
Hardmeier, Christian (5)
Östman, Carin, 1958- (5)
Ahrenberg, Lars (4)
Savary, Agata (3)
Nivre, Joakim (2)
Dürlich, Luise (2)
Cerniavski, Rafal (2)
Liebeskind, Chaya (2)
Nakov, Preslav (2)
Versley, Yannick (2)
Cettolo, Mauro (2)
Håkansson, David, Pr ... (2)
Adams, Allison (1)
Krek, Simon (1)
Merkel, Magnus (1)
Karlgren, Jussi (1)
Guillou, Liane (1)
Östling, Robert (1)
Megyesi, Beáta, 1971 ... (1)
Smith, Christian (1)
Gatt, Albert (1)
Basirat, Ali, 1982- (1)
Palmér, Anne, 1961- (1)
Loáiciga, Sharid (1)
Kovalevskaite, Jolan ... (1)
Bohnet, Bernd (1)
Ginter, Filip (1)
Svedjedal, Johan, 19 ... (1)
Cap, Fabienne (1)
Danilova, Vera (1)
Yan, Shao, 1990- (1)
Kiperwasser, Eliyahu (1)
Goldberg, Yoav (1)
Della Corte, Giusepp ... (1)
Riemann, Sebastian (1)
Finnveden, Gustav (1)
Nirve, Joakim (1)
Pettersson, Eva, 197 ... (1)
Kanerva, Jenna (1)
Webber, Bonnie (1)
Popescu-Belis, Andre ... (1)
Holmqvist, Maria (1)
Jody, Foo, 1979- (1)
Karamolegkou, Antoni ... (1)
Institution
Uppsala universitet (44)
Linköpings universitet (26)
RISE (3)
Kungliga Tekniska Högskolan (1)
Language
English (67)
Swedish (2)
Research subject (UKÄ/SCB)
Natural sciences (67)
Humanities (9)
