SwePub

Result list for search "L773:0736 587X"

  • Result 1-9 of 9
1.
  • Adouane, Wafia, 1985, et al. (author)
  • Romanized Berber and Romanized Arabic Automatic Language Identification Using Machine Learning
  • 2016
  • In: Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects; 53–61; December 12, 2016 ; Osaka, Japan. - : Association for Computational Linguistics. - 0736-587X.
  • Conference paper (peer-reviewed). Abstract:
    • The identification of the language of text/speech input is the first step to be able to properly do any language-dependent natural language processing. The task is called Automatic Language Identification (ALI). The field has been well studied since the early 1960s, and various methods have been applied to many standard languages. Standard ALI methods require datasets for training and use character/word-based n-gram models. However, social media and new technologies have contributed to the rise of informal and minority languages on the Web. The state-of-the-art automatic language identifiers fail to properly identify many of them. Romanized Arabic (RA) and Romanized Berber (RB) are cases of these informal languages which are under-resourced. The goal of this paper is twofold: detect RA and RB, at a document level, as separate languages and distinguish between them as they coexist in North Africa. We consider the task as a classification problem and use supervised machine learning to solve it. For both languages, character-based 5-grams combined with additional lexicons score best, with F-scores of 99.75% and 97.77% for RB and RA respectively.
2.
  • Bonafilia, Brian, et al. (author)
  • Sudden Semantic Shifts in Swedish NATO Discourse
  • 2023
  • In: Association for Computational Linguistics. Annual Meeting Conference Proceedings. - 0736-587X. ; 4, pp. 184-193
  • Conference paper (peer-reviewed). Abstract:
    • In this paper, we investigate a type of semantic shift that occurs when a sudden event radically changes public opinion on a topic. Looking at Sweden's decision to apply for NATO membership in 2022, we use word embeddings to study how the associations users on Twitter have regarding NATO evolve. We identify several changes that we successfully validate against real-world events. However, the low engagement of the public with the issue often made it challenging to distinguish true signals from noise. We thus find that domain knowledge and data selection are of prime importance when using word embeddings to study semantic shifts.
3.
  •  
4.
  • Doostmohammadi, Ehsan, 1993-, et al. (author)
  • Surface-Based Retrieval Reduces Perplexity of Retrieval-Augmented Language Models
  • 2023
  • In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Toronto, Canada. - : Association for Computational Linguistics. - 9781959429715 ; 2, pp. 521-529
  • Conference paper (peer-reviewed). Abstract:
    • Augmenting language models with a retrieval mechanism has been shown to significantly improve their performance while keeping the number of parameters low. Retrieval-augmented models commonly rely on a semantic retrieval mechanism based on the similarity between dense representations of the query chunk and potential neighbors. In this paper, we study the state-of-the-art Retro model and observe that its performance gain is better explained by surface-level similarities, such as token overlap. Inspired by this, we replace the semantic retrieval in Retro with a surface-level method based on BM25, obtaining a significant reduction in perplexity. As full BM25 retrieval can be computationally costly for large datasets, we also apply it in a re-ranking scenario, gaining part of the perplexity reduction with minimal computational overhead.
5.
  • Hong, Xudong, et al. (author)
  • Visual Coherence Loss for Coherent and Visually Grounded Story Generation
  • 2023
  • In: Proceedings of the Annual Meeting of the Association for Computational Linguistics. - 0736-587X. - 9781959429777
  • Conference paper (peer-reviewed). Abstract:
    • Local coherence is essential for text generation models. We identify two important aspects of local coherence within the visual storytelling task: (1) the model needs to represent re-occurrences of characters within the image sequence in order to mention them correctly in the story; (2) character representations should enable us to find instances of the same characters and distinguish different characters. In this paper, we propose a loss function inspired by a linguistic theory of coherence for learning image sequence representations. We further propose combining features from an object detector and a face detector to construct stronger character features. To evaluate visual grounding that current reference-based metrics do not measure, we propose a character matching metric to check whether the models generate referring expressions correctly for characters in input image sequences. Experiments on a visual story generation dataset show that our proposed features and loss function are effective for generating more coherent and visually grounded stories. Our code is available at https://github.com/vwprompt/vcl.
6.
  • Sigurd, Bengt (author)
  • Computer simulation of spontaneous speech
  • 1984
  • In: 10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics : Proceedings of Coling84: 2-6 July 1984, Stanford University, California. - 0736-587X. ; 22, pp. 79-83
  • Conference paper (other academic/artistic). Abstract:
    • This paper pinpoints some of the problems faced when a computer text production model (COMMENTATOR) is to produce spontaneous speech, in particular the problem of chunking the utterances in order to get natural prosodic units. The paper proposes a buffer model which allows the accumulation and delay of phonetic material until a chunk of the desired size has been built up. Several phonetic studies have suggested a similar temporary storage in order to explain intonation slopes, rhythmical patterns, speech errors and speech disorders. Small-scale simulations of the whole verbalization process from perception and thought to sounds, hesitation behaviour, pausing, speech errors, sound changes and speech disorders are presented.
7.
  • Skeppstedt, Maria, 1977-, et al. (author)
  • Unshared Task : (Dis)agreement in Online Debates
  • 2016
  • In: Proceedings of the 3rd Workshop on Argument Mining (ArgMining '16) at ACL '16. - : Association for Computational Linguistics. - 0736-587X. - 9781945626173 ; pp. 154-159
  • Conference paper (peer-reviewed). Abstract:
    • Topic-independent expressions for conveying agreement and disagreement were annotated in a corpus of web forum debates, in order to evaluate a classifier trained to detect these two categories. Among the 175 expressions annotated in the evaluation set, 163 were unique, which shows that there is large variation in expressions used. This variation might be one of the reasons why the task of automatically detecting the categories was difficult. F-scores of 0.44 and 0.37 were achieved by a classifier trained on 2,000 debate sentences for detecting sentence-level agreement and disagreement.
8.
  • Skeppstedt, Maria, 1977-, et al. (author)
  • Unshared Task : (Dis)agreement in Online Debates
  • 2016
  • In: The 54th Annual Meeting of the Association for Computational Linguistics : Proceedings of the 3rd Workshop on Argument Mining (ArgMining2016). - : Association for Computational Linguistics. - 0736-587X. - 9781945626173 ; pp. 154-159
  • Conference paper (peer-reviewed). Abstract:
    • Topic-independent expressions for conveying agreement and disagreement were annotated in a corpus of web forum debates, in order to evaluate a classifier trained to detect these two categories. Among the 175 expressions annotated in the evaluation set, 163 were unique, which shows that there is large variation in expressions used. This variation might be one of the reasons why the task of automatically detecting the categories was difficult. F-scores of 0.44 and 0.37 were achieved by a classifier trained on 2,000 debate sentences for detecting sentence-level agreement and disagreement.
9.
  • Suorra Hagstedt P, Jacob, 1992, et al. (author)
  • Assisting Discussion Forum Users using Deep Recurrent Neural Networks
  • 2016
  • In: Association for Computational Linguistics. Annual Meeting Conference Proceedings. - 0736-587X. ; 2016:2016, pp. 53-61
  • Conference paper (peer-reviewed). Abstract:
    • We present a discussion forum assistant based on deep recurrent neural networks (RNNs). The assistant is trained to perform three different tasks when faced with a question from a user: firstly, to recommend related posts; secondly, to recommend other users that might be able to help; thirdly, to recommend other channels in the forum where people may discuss related topics. Our recurrent forum assistant is evaluated experimentally by prediction accuracy for the end-to-end trainable parts, as well as by performing an end-user study. We conclude that the model generalizes well, and is helpful for the users.