SwePub

Results list for search "WFRF:(Östling Robert 1986 ) srt2:(2015-2019)"


  • Results 1-10 of 17
1.
  • Bjerva, Johannes, et al. (author)
  • Cross-lingual Learning of Semantic Textual Similarity with Multilingual Word Representations
  • 2017
  • In: Proceedings of the 21st Nordic Conference on Computational Linguistics. - Linköping : Linköping University Electronic Press. - 9789176856017 ; pp. 211-215
  • Conference paper (peer-reviewed). Abstract:
    • Assessing the semantic similarity between sentences in different languages is challenging. We approach this problem by leveraging multilingual distributional word representations, where similar words in different languages are close to each other. The availability of parallel data allows us to train such representations on a large number of languages. This allows us to leverage semantic similarity data for languages for which no such data exists. We train and evaluate on five language pairs, including English, Spanish, and Arabic. We are able to train well-performing systems for several language pairs, without any labelled data for that language pair.
2.
  • Bjerva, Johannes, et al. (author)
  • What Do Language Representations Really Represent?
  • 2019
  • In: Computational linguistics - Association for Computational Linguistics (Print). - : MIT Press - Journals. - 0891-2017 .- 1530-9312. ; 45:2, pp. 381-389
  • Journal article (other academic/artistic). Abstract:
    • A neural language model trained on a text corpus can be used to induce distributed representations of words, such that similar words end up with similar representations. If the corpus is multilingual, the same model can be used to learn distributed representations of languages, such that similar languages end up with similar representations. We show that this holds even when the multilingual corpus has been translated into English, by picking up the faint signal left by the source languages. However, just as it is a thorny problem to separate semantic from syntactic similarity in word representations, it is not obvious what type of similarity is captured by language representations. We investigate correlations and causal relationships between language representations learned from translations on one hand, and genetic, geographical, and several levels of structural similarity between languages on the other. Of these, structural similarity is found to correlate most strongly with language representation similarity, whereas genetic relationships—a convenient benchmark used for evaluation in previous work—appears to be a confounding factor. Apart from implications about translation effects, we see this more generally as a case where NLP and linguistic typology can interact and benefit one another.
3.
  • Falkenjack, Johan, 1986- (author)
  • Towards a Model of General Text Complexity for Swedish
  • 2018
  • Licentiate thesis (other academic/artistic). Abstract:
    • In an increasingly networked world, where the amount of written information is growing at a rate never before seen, the ability to read and absorb written information is of utmost importance for anything but a superficial understanding of life's complexities. That is an example of a sentence which is not very easy to read. It can be said to have a relatively high degree of text complexity. Nevertheless, the sentence is also true. It is important to be able to read and understand written materials. While not everyone might have a job where they have to read a lot, access to written material is necessary in order to participate in modern society. Most information, from news reporting, to medical information, to governmental information, comes primarily in a written form.
      But what makes the sentence at the start of this abstract so complex? We can probably all agree that the length is part of it. But then what? Researchers in the field of readability and text complexity analysis have been studying this question for almost 100 years. That research has over time come to include many computational and data-driven methods within the field of computational linguistics.
      This thesis covers some of my contributions to this field of research, though with a main focus on Swedish rather than English text. It aims to explore two primary questions: (1) Which linguistic features are most important when assessing text complexity in Swedish? and (2) How can we deal with the problem of data sparsity with regard to complexity-annotated texts in Swedish?
      The first issue is tackled by exploring the task of identifying easy-to-read ("lättläst") text using classification with Support Vector Machines. A large set of linguistic features is evaluated with regard to predictive performance and is shown to separate easy-to-read texts from regular texts with a very high accuracy. Meanwhile, using a genetic algorithm for variable selection, we find that almost the same accuracy can be reached with only 8 features. This implies that this classification problem is not very hard and that results might not generalize to comparing less easy-to-read texts.
      This, in turn, brings us to the second question. Except for easy-to-read labeled texts, the data with text complexity annotations is very sparse. It consists of multiple small corpora using different scales to label documents. To deal with this problem, we propose a novel statistical model. The model belongs to the larger family of Probit models and is implemented in a Bayesian fashion and estimated using a Gibbs sampler based on extending a well-established Gibbs sampler for the Ordered Probit model. This model is evaluated using both simulated and real-world readability data with very promising results.
4.
  • Gärtner, Manja, 1986- (author)
  • Prosocial Behavior and Redistributive Preferences
  • 2015
  • Doctoral thesis (other academic/artistic). Abstract:
    • This Ph.D. thesis contains four independent essays. The essays are summarized as follows.
      Essay I: Status quos and the prosociality of intuitive decision making
      This study investigates how the prosociality of intuitive choices depends on the presence of a status quo. I present the results of a dictator game experiment with a non-student sample. The dictator game is a choice between a selfish option and a fair and efficient option, and has either no status quo, a selfish status quo or a fair status quo. Intuitive choices are elicited in two ways, by an exogenous variation in time pressure and by measuring response times. I find that time pressure decreases the share of fair choices in decisions without a status quo, but has no effect in the presence of a status quo. Fair and selfish choices have equal response times in a decision without a status quo, whereas the status quo option is always chosen faster, i.e. fast choices are fair under a fair status quo and selfish under a selfish status quo. This suggests that the decision context critically affects whether intuitive choices are prosocial or selfish.
      Essay II: Risk preferences and the demand for redistribution
      If individuals view redistributive policy as an insurance against future negative economic shocks, then the demand for redistribution increases in individual risk aversion. We provide a direct test of the correlation between the demand for redistribution and individual risk aversion in a customized survey and find that they are strongly and robustly positively correlated: more risk averse people demand more redistribution. We also replicate the results from previous literature and, on the one hand, find that the demand for redistribution is positively correlated with altruism, the belief that individual economic success is the result of luck rather than effort, a working-class parental background and downward mobility experience and expectations. On the other hand, preferences for redistribution are negatively correlated with income, a conservative political ideology and upward mobility experience and expectations. The magnitude of the correlation between risk aversion and the demand for redistribution is comparable to the magnitude of these previously identified, and here replicated, correlates.
      Essay III: Omission effects in trolley problems with economic outcomes
      This paper tests how ethical views and hypothetical choices in a trolley problem with economic outcomes depend on whether an outcome is the result of an action or an omission. In a vignette experiment, subjects read about a spectator that harms one person in order to save five others from harm either by taking an action or by omission, whereas the outcomes are either death or loss of property. The results show that the distinction between harmful actions and harmful omissions is significantly smaller in the economic domain, suggesting that omission effects in trolley problems are domain-specific. A comparison of moral views about harmful actions across outcome domains shows that this difference is driven by subjects being more outcome-focused when property rather than lives are at stake.
      Essay IV: Is there an omission effect in prosocial behavior?
      We investigate whether individuals are more prone to act selfishly if they can passively allow for an outcome to be implemented (omission) rather than having to make an active choice (commission). In most settings, active and passive choice alternatives differ in terms of factors such as the presence of a suggested option, costs of taking an action, and awareness. We isolate the omission effect from confounding factors in two experiments, and find no evidence that the distinction between active and passive choices has an independent effect on the propensity to implement selfish outcomes. This suggests that increased selfishness through omission, as observed in various economic choice situations, is driven by other factors than a preference for selfish omissions.
5.
  • Tjong Kim Sang, Erik, et al. (author)
  • The CLIN27 Shared Task: Translating Historical Text to Contemporary Language for Improving Automatic Linguistic Annotation
  • 2017
  • In: Computational Linguistics in the Netherlands Journal. - 2211-4009. ; 7, pp. 53-64
  • Journal article (peer-reviewed). Abstract:
    • The CLIN27 shared task evaluates the effect of translating historical text to modern text with the goal of improving the quality of the output of contemporary natural language processing tools applied to the text. We focus on improving part-of-speech tagging analysis of seventeenth-century Dutch. Eight teams took part in the shared task. The best results were obtained by teams employing character-based machine translation. The best system obtained an error reduction of 51% in comparison with the baseline of tagging unmodified text. This is close to the error reduction obtained by human translation (57%).
6.
  • Wirén, Mats, 1954-, et al. (author)
  • Modelling the Informativeness of Non-Verbal Cues in Parent–Child Interaction
  • 2017
  • In: Proceedings of Interspeech 2017. - : The International Speech Communication Association (ISCA). - 9781510848764 ; pp. 2203-2207
  • Conference paper (peer-reviewed). Abstract:
    • Non-verbal cues from speakers, such as eye gaze and hand positions, play an important role in word learning. This is consistent with the notion that for meaning to be reconstructed, acoustic patterns need to be linked to time-synchronous patterns from at least one other modality. In previous studies of a multimodally annotated corpus of parent–child interaction, we have shown that parents interacting with infants at the early word-learning stage (7–9 months) display a large amount of time-synchronous patterns, but that this behaviour tails off with increasing age of the children. Furthermore, we have attempted to quantify the informativeness of the different nonverbal cues, that is, to what extent they actually help to discriminate between different possible referents, and how critical the timing of the cues is. The purpose of this paper is to generalise our earlier model by quantifying informativeness resulting from non-verbal cues occurring both before and after their associated verbal references.
7.
  • Östling, Robert, 1986- (author)
  • A Bayesian model for joint word alignment and part-of-speech transfer
  • 2016
  • In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. - Osaka, Japan : Association for Computational Linguistics. - 9784879747020 ; pp. 620-629
  • Conference paper (peer-reviewed). Abstract:
    • Current methods for word alignment require considerable amounts of parallel text to deliver accurate results, a requirement which is met only for a small minority of the world’s approximately 7,000 languages. We show that by jointly performing word alignment and annotation transfer in a novel Bayesian model, alignment accuracy can be improved for language pairs where annotations are available for only one of the languages—a finding which could facilitate the study and processing of a vast number of low-resource languages. We also present an evaluation where our method is used to perform single-source and multi-source part-of-speech transfer with 22 translations of the same text in four different languages. This allows us to quantify the considerable variation in accuracy depending on the specific source text(s) used, even with different translations into the same language.
8.
  • Östling, Robert, 1986- (author)
  • Bayesian Models for Multilingual Word Alignment
  • 2015
  • Doctoral thesis (other academic/artistic). Abstract:
    • In this thesis I explore Bayesian models for word alignment, how they can be improved through joint annotation transfer, and how they can be extended to parallel texts in more than two languages. In addition to these general methodological developments, I apply the algorithms to problems from sign language research and linguistic typology.
      In the first part of the thesis, I show how Bayesian alignment models estimated with Gibbs sampling are more accurate than previous methods for a range of different languages, particularly for languages with few digital resources available, which is unfortunately the state of the vast majority of languages today. Furthermore, I explore how different variations to the models and learning algorithms affect alignment accuracy.
      Then, I show how part-of-speech annotation transfer can be performed jointly with word alignment to improve word alignment accuracy. I apply these models to help annotate the Swedish Sign Language Corpus (SSLC) with part-of-speech tags, and to investigate patterns of polysemy across the languages of the world.
      Finally, I present a model for multilingual word alignment which learns an intermediate representation of the text. This model is then used with a massively parallel corpus containing translations of the New Testament, to explore word order features in 1001 languages.
9.
  • Östling, Robert, 1986-, et al. (author)
  • Continuous multilinguality with language vectors
  • 2017
  • In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. - : Association for Computational Linguistics. - 9781945626357 ; pp. 644-649
  • Conference paper (peer-reviewed). Abstract:
    • Most existing models for multilingual natural language processing (NLP) treat language as a discrete category, and make predictions for either one language or the other. In contrast, we propose using continuous vector representations of language. We show that these can be learned efficiently with a character-based neural language model, and used to improve inference about language varieties not seen during training. In experiments with 1303 Bible translations into 990 different languages, we empirically explore the capacity of multilingual language models, and also show that the language vectors capture genetic relationships between languages.
10.
  • Östling, Robert, 1986-, et al. (author)
  • Enriching the Swedish Sign Language Corpus with Part of Speech Tags Using Joint Bayesian Word Alignment and Annotation Transfer
  • 2015
  • In: Proceedings of the 20th Nordic Conference of Computational Linguistics. - : Linköping University Electronic Press. - 9789175190983 ; pp. 263-268
  • Conference paper (peer-reviewed). Abstract:
    • We have used a novel Bayesian model of joint word alignment and part of speech (PoS) annotation transfer to enrich the Swedish Sign Language Corpus with PoS tags. The annotations were then hand-corrected in order to both improve annotation quality for the corpus, and allow the empirical evaluation presented herein.