SwePub

Result list for search "L4X0:1652 1366"

  • Results 1-10 of 28
1.
  • af Geijerstam, Åsa, docent, 1972- (author)
  • Att skriva i naturorienterande ämnen i skolan
  • 2006
  • Doctoral thesis (other academic/artistic). Abstract:
    • When children encounter new subjects in school, they are also faced with new ways of using language. Learning science thus means learning the language of science, and writing is one of the ways this is accomplished. The present study investigates writing in natural sciences in grades 5 and 8 in Swedish schools. Major theoretical influences for these investigations are found within the socio-cultural, dialogical and social semiotic perspectives on language use.
      The study is based on texts written by 97 students, interviews around these texts and observations from 16 different classroom practices. Writing is seen as a situated practice; therefore analysis is carried out of the activities surrounding the texts. The student texts are analysed in terms of genre and in relation to their abstraction, density and use of expansions. This analysis shows among other things that the texts show increasing abstraction and density with increasing age, whereas the text structure and the use of expansions do not increase.
      It is also argued that a central point in school writing must be the students’ way of talking about their texts. Analysis of interviews with the students is thus carried out in terms of text movability. The results from this analysis indicate that students find it difficult to talk about their texts. They find it hard to express the main content of the text, as well as to discuss its function and potential readers.
      Previous studies argue that writing constitutes a potential for learning. In the material studied in this thesis, this potential learning tool is not used to any large extent. To be able to participate in natural sciences at higher levels, students need to take part in practices where the specialized language of natural science is used in writing as well as in speech.
2.
  • Basirat, Ali, 1982- (author)
  • Principal Word Vectors
  • 2018
  • Doctoral thesis (other academic/artistic). Abstract:
    • Word embedding is a technique for associating the words of a language with real-valued vectors, enabling us to use algebraic methods to reason about their semantic and grammatical properties. This thesis introduces a word embedding method called principal word embedding, which makes use of principal component analysis (PCA) to train a set of word embeddings for words of a language. The principal word embedding method involves performing a PCA on a data matrix whose elements are the frequency of seeing words in different contexts. We address two challenges that arise in the application of PCA to create word embeddings. The first challenge is related to the size of the data matrix on which PCA is performed and affects the efficiency of the word embedding method. The data matrix is usually a large matrix that requires a very large amount of memory and CPU time to be processed. The second challenge is related to the distribution of word frequencies in the data matrix and affects the quality of the word embeddings. We provide an extensive study of the distribution of the elements of the data matrix and show that it is unsuitable for PCA in its unmodified form.
      We overcome the two challenges in principal word embedding by using a generalized PCA method. The problem with the size of the data matrix is mitigated by a randomized singular value decomposition (SVD) procedure, which improves the performance of PCA on the data matrix. The data distribution is reshaped by an adaptive transformation function, which makes it more suitable for PCA. These techniques, together with a weighting mechanism that generalizes many different weighting and transformation approaches used in the literature, enable the principal word embedding to train high quality word embeddings in an efficient way.
      We also provide a study on how principal word embedding is connected to other word embedding methods. We compare it to a number of word embedding methods and study how the two challenges in principal word embedding are addressed in those methods. We show that the other word embedding methods are closely related to principal word embedding and, in many instances, they can be seen as special cases of it.
      The principal word embeddings are evaluated in both intrinsic and extrinsic ways. The intrinsic evaluations are directed towards the study of the distribution of word vectors. The extrinsic evaluations measure the contribution of principal word embeddings to some standard NLP tasks. The experimental results confirm that the newly proposed features of principal word embedding (i.e., the randomized SVD algorithm, the adaptive transformation function, and the weighting mechanism) are beneficial to the method and lead to significant improvements in the results. A comparison between principal word embedding and other popular word embedding methods shows that, in many instances, the proposed method is able to generate word embeddings that are better than or as good as other word embeddings while being faster than several popular word embedding methods.
3.
  • Björk, Ingrid, 1961- (author)
  • Relativizing linguistic relativity : Investigating underlying assumptions about language in the neo-Whorfian literature
  • 2008
  • Doctoral thesis (other academic/artistic). Abstract:
    • This work concerns the linguistic relativity hypothesis, also known as the Sapir-Whorf hypothesis, which, in its most general form, claims that ‘language’ influences ‘thought’. Past studies into linguistic relativity have treated various aspects of both thought and language, but a growing body of literature has recently emerged, in this thesis referred to as neo-Whorfian, that empirically investigates thought and language from a cross-linguistic perspective and claims that the grammar or lexicon of a particular language influences the speakers’ non-linguistic thought.
      The present thesis examines the assumptions about language that underlie this claim and criticizes the neo-Whorfian arguments from the point of view that they are based on misleading notions of language. The critique focuses on the operationalization of thought, language, and culture as separate variables in the neo-Whorfian empirical investigations. The neo-Whorfian studies explore language primarily as ‘particular languages’ and investigate its role as a variable standing in a causal relation to the ‘thought’ variable. Thought is separately examined in non-linguistic tests and found to ‘correlate’ with language.
      As a contrast to the neo-Whorfian view of language, a few examples of other approaches to language, referred to in the thesis as sociocultural approaches, are reviewed. This perspective on language places emphasis on practice and communication rather than on particular languages, which are viewed as secondary representations. It is argued that from a sociocultural perspective, language as an integrated practice cannot be separated from thought and culture. The empirical findings in the neo-Whorfian studies need not be rejected, but they should be interpreted differently. The findings of linguistic and cognitive diversity reflect different communicational practices in which language cannot be separated from non-language.
4.
  • de Lhoneux, Miryam, 1990- (author)
  • Linguistically Informed Neural Dependency Parsing for Typologically Diverse Languages
  • 2019
  • Doctoral thesis (other academic/artistic). Abstract:
    • This thesis presents several studies in neural dependency parsing for typologically diverse languages, using treebanks from Universal Dependencies (UD). The focus is on informing models with linguistic knowledge. We first extend a parser to work well on typologically diverse languages, including morphologically complex languages and languages whose treebanks have a high ratio of non-projective sentences, a notorious difficulty in dependency parsing. We propose a general methodology where we sample a representative subset of UD treebanks for parser development and evaluation. Our parser uses recurrent neural networks which construct information sequentially, and we study the incorporation of a recursive neural network layer in our parser. This follows the intuition that language is hierarchical. This layer turns out to be superfluous in our parser and we study its interaction with other parts of the network. We subsequently study transitivity and agreement information learned by our parser for auxiliary verb constructions (AVCs). We suggest that a parser should learn similar information about AVCs as it learns for finite main verbs. This is motivated by work in theoretical dependency grammar. Our parser learns different information about these two if we do not augment it with a recursive layer, but similar information if we do, indicating that there may be benefits from using that layer and we may not yet have found the best way to incorporate it in our parser. We finally investigate polyglot parsing. Training one model for multiple related languages leads to substantial improvements in parsing accuracy over a monolingual baseline. We also study different parameter sharing strategies for related and unrelated languages. Sharing parameters that partially abstract away from word order appears to be beneficial in both cases but sharing parameters that represent words and characters is more beneficial for related than unrelated languages.
5.
  • Dubremetz, Marie, 1988- (author)
  • Detecting Rhetorical Figures Based on Repetition of Words: Chiasmus, Epanaphora, Epiphora
  • 2017
  • Doctoral thesis (other academic/artistic). Abstract:
    • This thesis deals with the detection of three rhetorical figures based on repetition of words: chiasmus (“Fair is foul, and foul is fair.”), epanaphora (“Poor old European Commission! Poor old European Council.”) and epiphora (“This house is mine. This car is mine. You are mine.”). For a computer, locating all repetitions of words is trivial, but locating just those repetitions that achieve a rhetorical effect is not. How can we make this distinction automatically? First, we propose a new definition of the problem. We observe that rhetorical figures are a graded phenomenon, with universally accepted prototypical cases, equally clear non-cases, and a broad range of borderline cases in between. This makes it natural to view the problem as a ranking task rather than a binary detection task. We therefore design a model for ranking candidate repetitions in terms of decreasing likelihood of having a rhetorical effect, which allows potential users to decide for themselves where to draw the line with respect to borderline cases. Second, we address the problem of collecting annotated data to train the ranking model. Thanks to a selective method of annotation, we can reduce by three orders of magnitude the annotation work for chiasmus, and by one order of magnitude the work for epanaphora and epiphora. In this way, we prove that it is feasible to develop a system for detecting the three figures without an insurmountable amount of human work. Finally, we propose an evaluation scheme and apply it to our models. The evaluation reveals that, even with a very incompletely annotated corpus, a system for repetitive figure detection can be trained to achieve reasonable accuracy. We investigate the impact of different linguistic features, including length, n-grams, part-of-speech tags, and syntactic roles, and find that different features are useful for different figures. We also apply the system to four different types of text: political discourse, fiction, titles of articles and novels, and quotations. Here the evaluation shows that the system is robust to shifts in genre and that the frequencies of the three rhetorical figures vary with genre.
6.
  • Edling, Agnes, 1974- (author)
  • Abstraction and authority in textbooks : The textual paths towards specialized language
  • 2006
  • Doctoral thesis (other academic/artistic). Abstract:
    • During a few hours of a school day, a student might read textbook texts which are highly diversified in terms of abstraction. Abstraction is a central feature of specialized language and the transition from everyday language to specialized language is one of the most important things formal education can offer students. That transition is the focus of this thesis.
      This study introduces a new three-graded classification of abstraction including the levels of specificity, generalization and abstraction, based on a discussion of the concept of abstraction. The investigations performed, based on this classification, show that texts from different subject areas display distinct patterns of abstraction. The Swedish literary texts had the lowest degree of abstraction, the social science texts had an intermediate degree and the natural science texts were the most generalized and abstract. The results also show that the degree of abstraction in the textbook texts increases in later grade levels.
      The thesis presents a new way of analyzing shifts between levels of abstraction and their functions. Interestingly, the texts with a medium degree of abstraction, the social science texts, are the ones with the greatest variety in shifts. The functions of the shifts differ with respect to cultural domains. The shifts in the Swedish literary texts in general belong to the everyday domain while the shifts in the natural science texts belong to a specialized domain. The shifts in the social science texts had features of both domains.
      A secondary aim of the thesis is to develop the understanding of the relationship between author and reader in the texts. The results from my investigation of modality in the Swedish textbook texts confirm the earlier findings from English and Spanish textbooks. In comparison to other text types, textbook texts present knowledge in a more authoritative and less modalized way.
      From time to time, abstraction is described as a feature that hinders students from accessing texts. Some researchers even suggest a removal of features of specialized language in textbook texts, in order to increase students’ understanding. However, in a society where specialized knowledge is necessary, the access to specialized texts is important. A democratic view of education and school mandates that children and adolescents have the opportunity to encounter and learn to encounter specialized language in school. In analyzing the texts special attention is paid to the relationship between the texts, the contexts of use and the student readers.
7.
  • Folkeryd, Jenny W., 1970- (author)
  • Writing with an Attitude : Appraisal and student texts in the school subject of Swedish
  • 2006
  • Doctoral thesis (other academic/artistic). Abstract:
    • Learning in school is in many respects done through language. However, it has been shown that the language of school assignments is seldom explicitly discussed in school. Writing tasks are furthermore assigned without clear guidelines for how certain lexical choices make one text more powerful than another. The present study is a contribution to a linguistic and pedagogical discussion of student writing. More specifically the focus is on the use of evaluative language in texts written by students in the school subject of Swedish in grades 5, 8 and 11. The major investigations of the study have been accommodated within the theoretical framework of Appraisal. An overview is given of the language resources in the student texts for constructing emotion, judging behavior in ethical terms and valuing objects aesthetically. Another question addressed is that of how attitudinal meaning is intensified, thus creating greater or lesser degrees of positivity or negativity associated with the feelings. The results show that manifestations of attitude are found in practically all texts in the study. However, variations are noted in relation to different genres, age, proficiency level, language background and gender. A contribution of the study in relation to the theoretical framework upon which it draws is an extension of the system of Attitude as well as an identification of different patterns in the use of attitudinal resources. These patterns are furthermore discussed in relation to how students talk about their own written production in terms of text movability. Results indicate that students with a high degree of text movability also use attitudinal resources to a large extent. It is argued that applying the linguistic tool of Appraisal can facilitate a discussion of how to make one aspect of the hidden curriculum more visible, namely, how to write with an Attitude.
8.
  • Haddad, Rima, 1986- (author)
  • Child bilingualism in Sweden and Lebanon : A study of Arabic-speaking 4-to-7-year-olds
  • 2022
  • Doctoral thesis (other academic/artistic). Abstract:
    • This dissertation investigates the vocabulary and narrative skills of 100 Arabic-Swedish-speaking children (aged 4–7 years) in Sweden cross-sectionally and the development of these skills (4 to 6) in a subgroup of 10 children longitudinally. Also, the vocabulary skills of 100 Arabic-speaking bilingual children (aged 4–7 years) in Lebanon are investigated cross-sectionally and compared to the Swedish cross-sectional study. Parental questionnaires were used to gather background information concerning language input and use inside and outside the home. The comprehension and production of vocabulary was assessed with the Cross-linguistic Lexical Task (CLT; Haman et al., 2015) and narrative macrostructure with the Multilingual Assessment Instrument for Narratives (MAIN; Gagarina et al., 2019). In Sweden, both Arabic and Swedish were investigated for vocabulary (language differences, age, socio-economic status (SES) and language input) and for narrative macrostructure (language differences, age and task effects). In Lebanon, Arabic vocabulary skills were explored in relation to age, SES and language input.
      Sweden: For both vocabulary and narrative macrostructure, development with age was not only evident in Swedish, but also in Arabic. Children scoring high on Arabic vocabulary comprehension and production were older and had parents speaking with them mostly in Arabic. Joint book reading in Arabic boosted the children’s Arabic expressive vocabulary whereas being exposed predominantly to Swedish had a negative effect. For Swedish, high scoring children were older and had an early age of onset of Swedish. Children who were mostly exposed to Arabic scored lower on Swedish vocabulary. Surprisingly, SES (parental education) did not predict any of the vocabulary scores. In line with international studies, narrative macrostructure production scores were generally low at this age for both languages, even for the oldest children, whereas narrative comprehension was generally well developed, even for the youngest children. The longitudinal study largely confirmed the results obtained in the cross-sectional study.
      Lebanon: Similarly to the Swedish sample, older children scored high on Arabic receptive and expressive vocabulary, children whose parents spoke with them mostly in Arabic scored high on expressive vocabulary, and no effects of SES were found. Compared to children in Sweden, children in Lebanon code-switched many more nouns.
9.
  • Hardmeier, Christian (author)
  • Discourse in Statistical Machine Translation
  • 2014
  • Doctoral thesis (other academic/artistic). Abstract:
    • This thesis addresses the technical and linguistic aspects of discourse-level processing in phrase-based statistical machine translation (SMT). Connected texts can have complex text-level linguistic dependencies across sentences that must be preserved in translation. However, the models and algorithms of SMT are pervaded by locality assumptions. In a standard SMT setup, no model has more complex dependencies than an n-gram model. The popular stack decoding algorithm exploits this fact to implement efficient search with a dynamic programming technique. This is a serious technical obstacle to discourse-level modelling in SMT.
      From a technical viewpoint, the main contribution of our work is the development of a document-level decoder based on stochastic local search that translates a complete document as a single unit. The decoder starts with an initial translation of the document, created randomly or by running a stack decoder, and refines it with a sequence of elementary operations. After each step, the current translation is scored by a set of feature models with access to the full document context and its translation. We demonstrate the viability of this decoding approach for different document-level models.
      From a linguistic viewpoint, we focus on the problem of translating pronominal anaphora. After investigating the properties and challenges of the pronoun translation task both theoretically and by studying corpus data, a neural network model for cross-lingual pronoun prediction is presented. This network jointly performs anaphora resolution and pronoun prediction and is trained on bilingual corpus data only, with no need for manual coreference annotations. The network is then integrated as a feature model in the document-level SMT decoder and tested in an English–French SMT system. We show that the pronoun prediction network model more adequately represents discourse-level dependencies for less frequent pronouns than a simpler maximum entropy baseline with separate coreference resolution.
      By creating a framework for experimenting with discourse-level features in SMT, this work contributes to a long-term perspective that strives for more thorough modelling of complex linguistic phenomena in translation. Our results on pronoun translation shed new light on a challenging, but essential problem in machine translation that is as yet unsolved.
10.
  • Kulmizev, Artur (author)
  • The Search for Syntax : Investigating the Syntactic Knowledge of Neural Language Models Through the Lens of Dependency Parsing
  • 2023
  • Doctoral thesis (other academic/artistic). Abstract:
    • Syntax, the study of the hierarchical structure of language, has long featured as a prominent research topic in the field of natural language processing (NLP). Traditionally, its role in NLP was confined to developing parsers: supervised algorithms tasked with predicting the structure of utterances (often for use in downstream applications). More recently, however, syntax (and syntactic theory) has factored much less into the development of NLP models, and much more into their analysis. This has been particularly true with the nascent relevance of language models: semi-supervised algorithms trained to predict (or infill) strings given a provided context. In this dissertation, I describe four separate studies that seek to explore the interplay between syntactic parsers and language models against the backdrop of dependency syntax. In the first study, I investigate the error profiles of neural transition-based and graph-based dependency parsers, showing that they are effectively homogenized when leveraging representations from pre-trained language models. Following this, I report the results of two additional studies which show that dependency tree structure can be partially decoded from the internal components of neural language models, specifically hidden state representations and self-attention distributions. I then expand on these findings by exploring a set of additional results, which serve to highlight the influence of experimental factors, such as the choice of annotation framework or learning objective, in decoding syntactic structure from model components. In the final study, I describe efforts to quantify the overall learnability of a large set of multilingual dependency treebanks (the data upon which the previous experiments were based) and how it may be affected by factors such as annotation quality or tokenization decisions. Finally, I conclude the thesis with a conceptual analysis that relates the aforementioned studies to a broader body of work concerning the syntactic knowledge of language models.
Publication type
doctoral thesis (26)
edited collection (editorship) (1)
proceedings (editorship) (1)
Content type
other academic/artistic (28)
Author/editor
Nivre, Joakim (6)
Viberg, Åke (4)
Nivre, Joakim, Profe ... (4)
Sågvall Hein, Anna (3)
Viberg, Åke, Profess ... (3)
Liberg, Caroline (2)
Nivre, Joakim, 1962- (2)
Bohnacker, Ute, Prof ... (2)
Tiedemann, Jörg, Pro ... (2)
Jahani, Carina, Prof ... (1)
Kulmizev, Artur (1)
Saxena, Anju (1)
af Geijerstam, Åsa, ... (1)
Evensen, Lars Sigfre ... (1)
Edling, Agnes, 1974- (1)
Folkeryd, Jenny W., ... (1)
Dunn, Michael (1)
Nilsson, Mattias (1)
Karlgren, Jussi (1)
Anward, Jan, Profess ... (1)
Megyesi, Beáta, 1971 ... (1)
Ygge, Jan (1)
Basirat, Ali, 1982- (1)
Tang, Marc (1)
de Lhoneux, Miryam, ... (1)
Schütze, Hinrich (1)
Hardmeier, Christian (1)
Lindgren, Josefin, 1 ... (1)
Liberg, Caroline, Pr ... (1)
Dahllöf, Mats, 1965- (1)
Björk, Ingrid, 1961- (1)
Segerdahl, Pär (1)
Taylor, Talbot J., P ... (1)
Öberg, Linnéa, 1987- (1)
Joakim, Nivre (1)
Stymne, Sara (1)
Bender, Emily, Profe ... (1)
Knight, Kevin (1)
Dubremetz, Marie, 19 ... (1)
Mats, Dahllöf, Docen ... (1)
Cori, Marcel, Profes ... (1)
Hirst, Graeme, Profe ... (1)
Hale, John (1)
Herzberg, Fröydis, P ... (1)
Pettersson, Eva, 197 ... (1)
Goldstein, Mikael (1)
Haddad, Rima, 1986- (1)
Gathercole, Virginia ... (1)
Viberg, Åke, 1945- (1)
Federico, Marcello, ... (1)
University
Uppsala universitet (28)
Högskolan i Gävle (1)
Linnéuniversitetet (1)
RISE (1)
Language
English (27)
Swedish (1)
Research subject (UKÄ/SCB)
Natural sciences (13)
Humanities (13)
Engineering and technology (1)