SwePub

Result list for search "WFRF:(Sahlgren Magnus)"


  • Result 1-50 of 110
1.
  • Boman, Magnus, et al. (author)
  • Learning Machines
  • 2018
  • In: Learning, Inference and Control of Multi-Agent Systems. ; , s. 610-613
  • Conference paper (peer-reviewed)
  •  
2.
  • Boman, Magnus, et al. (author)
  • Learning machines in Internet-delivered psychological treatment
  • 2019
  • In: Progress in Artificial Intelligence. - : Springer Verlag. - 2192-6352 .- 2192-6360. ; 8:4, s. 475-485
  • Journal article (peer-reviewed)abstract
    • A learning machine, in the form of a gating network that governs a finite number of different machine learning methods, is described at the conceptual level with examples of concrete prediction subtasks. A historical data set with data from over 5000 patients in Internet-based psychological treatment will be used to equip healthcare staff with decision support for questions pertaining to ongoing and future cases in clinical care for depression, social anxiety, and panic disorder. The organizational knowledge graph is used to inform the weight adjustment of the gating network and for routing subtasks to the different methods employed locally for prediction. The result is an operational model for assisting therapists in their clinical work, about to be subjected to validation in a clinical trial.
  •  
3.
  •  
4.
  • Ghoorchian, Kambiz, 1981- (author)
  • Graph Algorithms for Large-Scale and Dynamic Natural Language Processing
  • 2019
  • Doctoral thesis (other academic/artistic)abstract
    • In Natural Language Processing, researchers design and develop algorithms to enable machines to understand and analyze human language. These algorithms benefit multiple downstream applications including sentiment analysis, automatic translation, automatic question answering, and text summarization. Topic modeling is one such algorithm that solves the problem of categorizing documents into multiple groups with the goal of maximizing the intra-group document similarity. However, the manifestation of short texts like tweets, snippets, comments, and forum posts as the dominant source of text in our daily interactions and communications, as well as being the main medium for news reporting and dissemination, increases the complexity of the problem due to scalability, sparsity, and dynamicity. Scalability refers to the volume of the messages being generated, sparsity is related to the length of the messages, and dynamicity is associated with the ratio of changes in the content and topical structure of the messages (e.g., the emergence of new phrases). We improve the scalability and accuracy of Natural Language Processing algorithms from three perspectives, by leveraging on innovative graph modeling and graph partitioning algorithms, incremental dimensionality reduction techniques, and rich language modeling methods. We begin by presenting a solution for multiple disambiguation on short messages, as opposed to traditional single disambiguation. The solution proposes a simple graph representation model to present topical structures in the form of dense partitions in that graph and applies disambiguation by extracting those topical structures using an innovative distributed graph partitioning algorithm. Next, we develop a scalable topic modeling algorithm using a novel dense graph representation and an efficient graph partitioning algorithm. 
Then, we analyze the effect of temporal dimension to understand the dynamicity in online social networks and present a solution for geo-localization of users in Twitter using a hierarchical model that combines partitioning of the underlying social network graph with temporal categorization of the tweets. The results show the effect of temporal dynamicity on users’ spatial behavior. This result leads to design and development of a dynamic topic modeling solution, involving an online graph partitioning algorithm and a significantly stronger language modeling approach based on the skip-gram technique. The algorithm shows strong improvement on scalability and accuracy compared to the state-of-the-art models. Finally, we describe a dynamic graph-based representation learning algorithm that modifies the partitioning algorithm to develop a generalization of our previous work. A strong representation learning algorithm is proposed that can be used for extracting high quality distributed and continuous representations out of any sequential data with local and hierarchical structural properties similar to natural language text.
  •  
5.
  • Gogoulou, Evangelia, et al. (author)
  • Predicting treatment outcome from patient texts : The case of internet-based cognitive behavioural therapy
  • 2021
  • In: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference. - : Association for Computational Linguistics (ACL). - 9781954085022 ; , s. 575-580
  • Conference paper (peer-reviewed)abstract
    • We investigate the feasibility of applying standard text categorisation methods to patient text in order to predict treatment outcome in Internet-based cognitive behavioural therapy. The data set is unique in its detail and size for regular care for depression, social anxiety, and panic disorder. Our results indicate that there is a signal in the depression data, albeit a weak one. We also perform terminological and sentiment analysis, which confirm those results. © 2021 Association for Computational Linguistics
  •  
6.
  • Alkathiri, Abdul Aziz, et al. (author)
  • Decentralized Word2Vec Using Gossip Learning
  • 2021
  • In: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021).
  • Conference paper (peer-reviewed)abstract
    • Advanced NLP models require huge amounts of data from various domains to produce high-quality representations. It is useful then for a few large public and private organizations to join their corpora during training. However, factors such as legislation and user emphasis on data privacy may prevent centralized orchestration and data sharing among these organizations. Therefore, for this specific scenario, we investigate how gossip learning, a massively-parallel, data-private, decentralized protocol, compares to a shared-dataset solution. We find that the application of Word2Vec in a gossip learning framework is viable. Without any tuning, the results are comparable to a traditional centralized setting, with a reduction in ground-truth similarity scores as low as 4.3%. Furthermore, the results are up to 54.8% better than independent local training.
  •  
7.
  • Argaw, Atelach Alemu, et al. (author)
  • Dictionary-based Amharic-French information retrieval
  • 2006
  • In: Accessing Multilingual Information Repositories. - Berlin, Heidelberg : Springer Berlin Heidelberg. - 354045697X ; , s. 83-92
  • Conference paper (peer-reviewed)abstract
    • We present four approaches to the Amharic - French bilingual track at CLEF 2005. All experiments use a dictionary-based approach to translate the Amharic queries into French bags-of-words, but while one approach uses word sense discrimination on the translated side of the queries, the other one includes all senses of a translated word in the query for searching. We used two search engines: the SICS experimental engine and Lucene, hence four runs with the two approaches. Non-content bearing words were removed both before and after the dictionary lookup. TF/IDF values supplemented by a heuristic function were used to remove the stop words from the Amharic queries and two French stopwords lists were used to remove them from the French translations. In our experiments, we found that the SICS search engine performs better than Lucene and that using the word sense discriminated keywords produces a slightly better result than the full set of non-discriminated keywords.
  •  
8.
  •  
9.
  • Axelsson, Sofia, 1987, et al. (author)
  • Expressing Happiness in Different Languages
  • 2016
  • In: EUENGAGE Political Text Analysis Workshop, 21 – 22 June 2016, Amsterdam, Netherlands.
  • Conference paper (other academic/artistic)
  •  
10.
  • Berdicevskis, Aleksandrs, 1983, et al. (author)
  • Superlim: A Swedish Language Understanding Evaluation Benchmark
  • 2023
  • In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, December 6-10, 2023, Singapore / Houda Bouamor, Juan Pino, Kalika Bali (Editors). - Stroudsburg, PA : Association for Computational Linguistics. - 9798891760608
  • Conference paper (peer-reviewed)
  •  
11.
  •  
12.
  • Carlsson, Fredrik, et al. (author)
  • Fine-Grained Controllable Text Generation Using Non-Residual Prompting
  • 2022
  • In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. - Stroudsburg, PA, USA : Association for Computational Linguistics. - 9781955917216 ; , s. 6837-6857
  • Conference paper (peer-reviewed)abstract
    • The introduction of immensely large Causal Language Models (CLMs) has rejuvenated the interest in open-ended text generation. However, controlling the generative process for these Transformer-based models is at large an unsolved problem. Earlier work has explored either plug-and-play decoding strategies, or more powerful but blunt approaches such as prompting. There hence currently exists a trade-off between fine-grained control, and the capability for more expressive high-level instructions. To alleviate this trade-off, we propose an encoder-decoder architecture that enables intermediate text prompts at arbitrary time steps. We propose a resource-efficient method for converting a pre-trained CLM into this architecture, and demonstrate its potential on various experiments, including the novel task of contextualized word inclusion. Our method provides strong results on multiple experimental settings, proving itself to be both expressive and versatile.
  •  
13.
  • Carlsson, Fredrik, et al. (author)
  • Semantic Re-tuning with Contrastive Tension
  • 2021
  • Conference paper (peer-reviewed)abstract
    • Extracting semantically useful natural language sentence representations from pre-trained deep neural networks such as Transformers remains a challenge. We first demonstrate that pre-training objectives impose a significant task bias onto the final layers of models, with a layer-wise survey of the Semantic Textual Similarity (STS) correlations for multiple common Transformer language models. We then propose a new self-supervised method called Contrastive Tension (CT) to counter such biases. CT frames the training objective as a noise-contrastive task between the final layer representations of two independent models, in turn making the final layer representations suitable for feature extraction. Results from multiple common unsupervised and supervised STS tasks indicate that CT outperforms previous State Of The Art (SOTA), and when combining CT with supervised data we improve upon previous SOTA results with large margins.
  •  
14.
  • Cuba Gyllensten, Amaru, et al. (author)
  • Navigating the Semantic Horizon using Relative Neighborhood Graphs
  • 2015
  • In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. - Stroudsburg, PA, USA : Association for Computational Linguistics (ACL).
  • Conference paper (peer-reviewed)abstract
    • This paper introduces a novel way to navigate neighborhoods in distributional semantic models. The approach is based on relative neighborhood graphs, which uncover the topological structure of local neighborhoods in semantic space. This has the potential to overcome both the problem with selecting a proper k in k-NN search, and the problem that a ranked list of neighbors may conflate several different senses. We provide both qualitative and quantitative results that support the viability of the proposed method.
  •  
15.
  • Cuba Gyllensten, Amaru, et al. (author)
  • R-grams: Unsupervised Learning of Semantic Units in Natural Language
  • 2019
  • In: Proceedings of the 13th International Conference on Computational Semantics - Student Papers.
  • Conference paper (peer-reviewed)abstract
    • This paper investigates data-driven segmentation using Re-Pair or Byte Pair Encoding-techniques. In contrast to previous work which has primarily been focused on subword units for machine translation, we are interested in the general properties of such segments above the word level. We call these segments r-grams, and discuss their properties and the effect they have on the token frequency distribution. The proposed approach is evaluated by demonstrating its viability in embedding techniques, both in monolingual and multilingual test settings. We also provide a number of qualitative examples of the proposed methodology, demonstrating its viability as a language-invariant segmentation procedure.
  •  
16.
  • Cuba Gyllensten, Amaru, 1989-, et al. (author)
  • Shallow Contextualized Word Embeddings
  • Other publication (other academic/artistic)abstract
    • This paper introduces a novel word embedding method that is able to learn contextualized representations using a shallow model based on factorization machines. We discuss the limits of log-linear models and demonstrate how our proposed model -- Continuous Bag of Pairs (CBoP) -- can overcome these limits. We also demonstrate contextualized word similarity queries with the CBoP model, i.e. queries of the kind "What words are similar to orange, given a context word juice?" We validate our model using standard word-based and sentence-based similarity benchmarks and observe that there is little difference between CBoP and a comparable CBoW model on word-based benchmarks, that CBoP outperforms CBoW on Semantic Textual Similarity benchmarks, yet is worse than CBoW on sentence classification tasks.
  •  
17.
  • Cöster, Rickard, et al. (author)
  • Selective compound splitting of Swedish queries for boolean combination of truncated terms
  • 2003. - 1
  • Conference paper (peer-reviewed)abstract
    • In compounding languages such as Swedish, it is often necessary to split compound words when indexing documents or queries. One of the problems is that it is difficult to find constituents that express a concept similar to that expressed by the compound. The approach taken here is to expand a query with the leading constituents of the compound words. Every query term is truncated so as to increase recall by hopefully finding other compounds with the leading constituent as prefix. This approach increases recall in a rather uncontrolled way, so we use a Boolean quorum-level type of search to rank documents according to both a tf-idf factor and the number of matching Boolean combinations. The Boolean combinations performed relatively well, taking into consideration that the queries were very short (maximum five search terms). Also included in this paper are the results of two other methods we are currently working on in our lab: one for re-ranking search results on the basis of stylistic analysis of documents, and one for dimensionality reduction using Random Indexing.
  •  
18.
  • Dahlberg, Stefan, et al. (author)
  • A Distributional Semantic Online Lexicon for Linguistic Explorations of Societies
  • 2023
  • In: Social Science Computer Review. - : SAGE Publications. - 0894-4393 .- 1552-8286. ; 41:2
  • Journal article (peer-reviewed)abstract
    • Linguistic Explorations of Societies (LES) is an interdisciplinary research project with scholars from the fields of political science, computer science, and computational linguistics. The overarching ambition of LES has been to contribute to the survey-based comparative scholarship by compiling and analyzing online text data within and between languages and countries. To this end, the project has developed an online semantic lexicon, which allows researchers to explore meanings and usages of words in online media across a substantial number of geo-coded languages. The lexicon covers data from approximately 140 language-country combinations and is, to our knowledge, the most extensive free research resource of its kind. Such a resource makes it possible to critically examine survey translations and identify discrepancies in order to modify and improve existing survey methodology, and its unique features further enable Internet researchers to study public debate online from a comparative perspective. In this article, we discuss the social scientific rationale for using online text data as a complement to survey data, and present the natural language processing-based methodology behind the lexicon including its underpinning theory and practical modeling. Finally, we engage in a critical reflection about the challenges of using online text data to gauge public opinion and political behavior across the world.
  •  
19.
  •  
20.
  • Dahlberg, Stefan, 1975, et al. (author)
  • The Meaning of Democracy
  • 2015
  • In: Internal Conference of the Quality of Government Institute.
  • Conference paper (other academic/artistic)
  •  
21.
  •  
22.
  •  
23.
  • Dwibedi, Chinmay, 1987, et al. (author)
  • Effect of self-managed lifestyle treatment on glycemic control in patients with type 2 diabetes
  • 2022
  • In: npj Digital Medicine. - : Nature Research. - 2398-6352. ; 5:1
  • Journal article (peer-reviewed)abstract
    • The lack of effective, scalable solutions for lifestyle treatment is a global clinical problem, causing severe morbidity and mortality. We developed a method for lifestyle treatment that promotes self-reflection and iterative behavioral change, provided as a digital tool, and evaluated its effect in 370 patients with type 2 diabetes (ClinicalTrials.gov identifier: NCT04691973). Users of the tool had reduced blood glucose, both compared with randomized and matched controls (involving 158 and 204 users, respectively), as well as improved systolic blood pressure, body weight and insulin resistance. The improvement was sustained during the entire follow-up (average 730 days). A pathophysiological subgroup of obese insulin-resistant individuals had a pronounced glycemic response, enabling identification of those who would benefit in particular from lifestyle treatment. Natural language processing showed that the metabolic improvement was coupled with the self-reflective element of the tool. The treatment is cost-saving because of improved risk factor control for cardiovascular complications. The findings open an avenue for self-managed lifestyle treatment with long-term metabolic efficacy that is cost-saving and can reach large numbers of people. © 2022, The Author(s).
  •  
24.
  • Ekgren, Ariel, et al. (author)
  • GPT-SW3 : An Autoregressive Language Model for the Scandinavian Languages
  • 2024
  • In: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings. - : European Language Resources Association (ELRA). ; , s. 7886-7900
  • Conference paper (peer-reviewed)abstract
    • This paper details the process of developing the first native large generative language model for the North Germanic languages, GPT-SW3. We cover all parts of the development process, from data collection and processing, training configuration and instruction finetuning, to evaluation, applications, and considerations for release strategies. We discuss pros and cons of developing large language models for smaller languages and in relatively peripheral regions of the globe, and we hope that this paper can serve as a guide and reference for other researchers that undertake the development of large generative models for smaller languages. 
  •  
25.
  • Espinoza, Fredrik, et al. (author)
  • Analysis of Open Answers to Survey Questions through Interactive Clustering and Theme Extraction
  • 2018
  • In: Proceedings of Conference on Human Information Interaction & Retrieval. - New York, New York, USA : ACM Digital Library. ; , s. 317-320
  • Conference paper (peer-reviewed)abstract
    • This paper describes design principles for and the implementation of Gavagai Explorer, a new application which builds on interactive text clustering to extract themes from topically coherent text sets such as open text answers to surveys or questionnaires. An automated system is quick, consistent, and has full coverage over the study material. A system allows an analyst to analyze more answers in a given time period; provides the same initial results regardless of who does the analysis, reducing the risks of inter-rater discrepancy; and does not risk missing responses due to fatigue or boredom. These factors reduce the cost and increase the reliability of the service. The most important feature, however, is relieving the human analyst from the frustrating aspects of the coding task, freeing the effort to the central challenge of understanding themes. Gavagai Explorer is available on-line at http://explorer.gavagai.se
  •  
26.
  • Gambäck, Björn, et al. (author)
  • A spoken Swedish e-mail interface
  • 2003. - 2
  • In: Proceedings of the 14th Nordic Conference of Computational Linguistics.
  • Conference paper (peer-reviewed)abstract
    • The paper describes the Swedish involvement in the EU project DUMAS (Dynamic Universal Mobility for Adaptive Speech Interfaces), a project which aims at developing multilingual speech-based applications, and more specifically, investigating adaptive multilingual interaction techniques to handle both spoken and text input and to provide coordinated linguistic responses to the user. The project has a clear focus on Northern Europe with two of the eight partners coming from Sweden and four from Finland; and the languages we aim at treating are English, Swedish and Finnish. We will construct an agent-based generic framework for multilingual speech applications, supporting adaptivity to both the individual user and the particular domain. Applications based on the general architecture will benefit from the advantages of fault-tolerant semantic analysis, which combined with the dialogue management routines will handle user interaction in a very robust manner. As an initial such application, we are building a mobile phone-based e-mail interface that will deal with multilingual issues in several forms and environments, and whose functionality can be adapted to different users, different situations and tasks. Such a system produces speech output only (in the form of spoken responses and read e-mails) to the user, but gets two types of input: user speech and textual e-mail messages. It must be able to distinguish between languages, both in e-mails and in the user utterances. The contents of a user's inbox must be continuously analysed in order to enable advanced search functions.
  •  
27.
  •  
28.
  • Garcia Bernal, Daniel, et al. (author)
  • Federated Word2Vec: Leveraging Federated Learning to Encourage Collaborative Representation Learning
  • Other publication (other academic/artistic)abstract
    • Large scale contextual representation models have significantly advanced NLP in recent years, understanding the semantics of text to a degree never seen before. However, they need to process large amounts of data to achieve high-quality results. Joining and accessing all these data from multiple sources can be extremely challenging due to privacy and regulatory reasons. Federated Learning can solve these limitations by training models in a distributed fashion, taking advantage of the hardware of the devices that generate the data. We show the viability of training NLP models, specifically Word2Vec, with the Federated Learning protocol. In particular, we focus on a scenario in which a small number of organizations each hold a relatively large corpus. The results show that neither the quality of the results nor the convergence time in Federated Word2Vec deteriorates as compared to centralised Word2Vec.
  •  
29.
  • Ghoorchian, Kambiz, 1981-, et al. (author)
  • GDTM: Graph-based Dynamic Topic Models
  • 2020
  • In: Progress in Artificial Intelligence. - : Springer Nature. - 2192-6352 .- 2192-6360. ; 9, s. 195-207
  • Journal article (peer-reviewed)abstract
    • Dynamic Topic Modeling (DTM) is the ultimate solution for extracting topics from short texts generated in Online Social Networks (OSNs) like Twitter. A DTM solution is required to be scalable and to be able to account for sparsity in short texts and dynamicity of topics. Current solutions combine probabilistic mixture models like Dirichlet Multinomial or Pitman-Yor Process with approximate inference approaches like Gibbs Sampling and Stochastic Variational Inference to, respectively, account for dynamicity and scalability in DTM. However, these solutions rely on weak probabilistic language models, which do not account for sparsity in short texts. In addition, their inference is based on iterative optimization algorithms, which have scalability issues when it comes to DTM. We present GDTM, a single-pass graph-based DTM algorithm, to solve the problem. GDTM combines a context-rich and incremental feature representation model, called Random Indexing (RI), with a novel online graph partitioning algorithm to address scalability and dynamicity. In addition, GDTM uses a rich language modeling approach based on the Skip-gram technique to account for sparsity. We run multiple experiments over a large-scale Twitter dataset to analyze the accuracy and scalability of GDTM and compare the results with four state-of-the-art approaches. The results show that GDTM outperforms the best approach by 11% on accuracy and runs an order of magnitude faster while creating 4 times better topic quality over standard evaluation metrics.
  •  
30.
  • Gogoulou, Evangelia, et al. (author)
  • Cross-lingual Transfer of Monolingual Models
  • 2022
  • In: 2022 Language Resources and Evaluation Conference, LREC 2022. - : European Language Resources Association (ELRA). - 9791095546726 ; , s. 948-955
  • Conference paper (peer-reviewed)abstract
    • Recent studies in cross-lingual learning using multilingual models have cast doubt on the previous hypothesis that shared vocabulary and joint pre-training are the keys to cross-lingual generalization. We introduce a method for transferring monolingual models to other languages through continuous pre-training and study the effects of such transfer from four different languages to English. Our experimental results on GLUE show that the transferred models outperform an English model trained from scratch, independently of the source language. After probing the model representations, we find that model knowledge from the source language enhances the learning of syntactic and semantic knowledge in English. ©  licensed under CC-BY-NC-4.0.
  •  
31.
  • Gyllensten, Amaru Cuba, et al. (author)
  • Distributional term set expansion
  • 2018
  • In: LREC 2018 - 11th International Conference on Language Resources and Evaluation. - 9791095546009 ; , s. 2554-2558
  • Conference paper (other academic/artistic)abstract
    • This paper is a short empirical study of the performance of centrality and classification based iterative term set expansion methods for distributional semantic models. Iterative term set expansion is an interactive process using distributional semantics models where a user labels terms as belonging to some sought-after term set, and a system uses this labeling to supply the user with new, candidate, terms to label, trying to maximize the number of positive examples found. While centrality based methods have a long history in term set expansion (Sarmento et al., 2007; Pantel et al., 2009), we compare them to classification methods based on the Simple Margin method, an Active Learning approach to classification using Support Vector Machines (Tong and Koller, 2002). Examining the performance of various centrality and classification based methods for a variety of distributional models over five different term sets, we can show that active learning based methods consistently outperform centrality based methods.
  •  
32.
  • Gyllensten, Amaru Cuba, et al. (author)
  • Measuring Issue Ownership using Word Embeddings
  • 2018
  • Other publication (other academic/artistic)abstract
    • Sentiment and topic analysis are common methods used for social media monitoring. Essentially, these methods answer questions such as, “what is being talked about, regarding X”, and “what do people feel, regarding X”. In this paper, we investigate another venue for social media monitoring, namely issue ownership and agenda setting, which are concepts from political science that have been used to explain voter choice and electoral outcomes. We argue that issue alignment and agenda setting can be seen as a kind of semantic source similarity of the kind “how similar is source A to issue owner P, when talking about issue X”, and as such can be measured using word/document embedding techniques. We present work in progress towards measuring that kind of conditioned similarity, and introduce a new notion of similarity for predictive embeddings. We then test this method by measuring the similarity between politically aligned media and political parties, conditioned on bloc-specific issues.
  •  
33.
  •  
34.
  • Hansen, Preben, et al. (author)
  • Cooperation, bookmarking, and thesaurus in interactive bilingual question answering
  • 2004. - 1
  • In: Multilingual Information Access for Text, Speech and Images (5th Workshop of the Cross-Language Evaluation Forum, CLEF 2004, Bath, UK, September 15-17, 2004, Revised Selected Papers). - Berlin, Heidelberg : Springer. - 9783540274209 ; , s. 343-347
  • Book chapter (peer-reviewed)abstract
    • The study presented involves several different contextual aspects and is the latest in a continuing series of exploratory experiments on information access behaviour in a multi-lingual context [1, 2]. This year’s interactive cross-lingual information access experiment was designed to measure three parameters we expected would affect the performance of users in cross-lingual tasks in languages in which the users are less than fluent. Firstly, introducing new technology, we measure the effect of topic-tailored term expansion on query formulation. Secondly, introducing a new component in the interactive interface, we investigate - without measuring by using a control group - the effect of a bookmark panel on user confidence in the reported result. Thirdly, we ran subjects pair-wise and allowed them to communicate verbally, to investigate how people may cooperate and collaborate with a partner during a search session performing a similar but non-identical search task.
  •  
35.
  • Helms, Karey, 1985-, et al. (author)
  • Design Methods to Investigate User Experiences of Artificial Intelligence
  • 2018
  • In: 2018 AAAI Spring Symposium Series. - : Association for the Advancement of Artificial Intelligence. ; , s. 394-398
  • Conference paper (peer-reviewed)abstract
    • This paper engages with the challenges of designing ‘implicit interaction’, systems (or system features) in which actions are not actively guided or chosen by users but instead come from inference driven system activity. We discuss the difficulty of designing for such systems and outline three Research through Design approaches we have engaged with - first, creating a design workbook for implicit interaction, second, a workshop on designing with data that subverted the usual relationship with data, and lastly, an exploration of how a computer science notion, “leaky abstraction”, could be in turn misinterpreted to imagine new system uses and activities. Together these design activities outline some inventive new ways of designing User Experiences of Artificial Intelligence.
  •  
36.
  • Holmlund, Jon, et al. (author)
  • Creating Bilingual Lexica Using Reference Wordlists for Alignment of Monolingual Semantic Vector Spaces
  • 2005. - 1
  • Conference paper (peer-reviewed)abstract
    • This paper proposes a novel method for automatically acquiring multi-lingual lexica from non-parallel data and reports some initial experiments to prove the viability of the approach. Using established techniques for building mono-lingual vector spaces, two independent semantic vector spaces are built from textual data. These vector spaces are related to each other using a small reference word list of manually chosen reference points taken from available bi-lingual dictionaries. Other words can then be related to these reference points first in the one language and then in the other. In the present experiments, we apply the proposed method to comparable but non-parallel English-German data. The resulting bi-lingual lexicon is evaluated using an online English-German lexicon as gold standard. The results clearly demonstrate the viability of the proposed methodology.
  •  
37.
  • Holst, Anders, et al. (author)
  • Dispersing the conceptual confusion
  • 2001. - 1
  • Conference paper (peer-reviewed)abstract
    • In few subjects it is as easy to talk past each other as when discussing consciousness. Not only is the subject elusive and everyone has their own opinion of what it is all about; different people also make quite different use of words and language when discussing consciousness. This contribution tries to exemplify some common misunderstandings between people with different starting points and different use of language. The suggestion is that 'the problem of consciousness' is after all quite similar to all of us, although this is muddled by the way we talk about it, and the way we have locked ourselves into our different slogans and world views.
  •  
38.
  • Kanerva, Pentti, et al. (author)
  • Computing with large random patterns
  • 2001. - 1
  • In: Foundations of Real-World Intelligence. - Stanford, California : CSLI Publications. ; , s. 251-311
  • Book chapter (peer-reviewed)abstract
    • We describe a style of computing that differs from traditional numeric and symbolic computing and is suited for modeling neural networks. We focus on one aspect of “neurocomputing,” namely, computing with large random patterns, or high-dimensional random vectors, and ask what kind of computing they perform and whether they can help us understand how the brain processes information and how the mind works. Rapidly developing hardware technology will soon be able to produce the massive circuits that this style of computing requires. This chapter develops a theory on which the computing could be based.
  •  
39.
  • Karlgren, Jussi, et al. (author)
  • Between Bags and Trees : Constructional Patterns in Text Used for Attitude Identification
  • 2010
  • In: ECIR 2010, 32nd European Conference on Information Retrieval.
  • Conference paper (peer-reviewed)abstract
    • This paper describes experiments to use non-terminological information to find attitudinal expressions in written English text. The experiments are based on an analysis of text with respect to not only the vocabulary of content terms present in it (which most other approaches use as a basis for analysis) but also with respect to presence of structural features of the text represented by constructional features (typically disregarded by most other analyses). In our analysis, following a construction grammar framework, structural features are treated as occurrences, similarly to the treatment of vocabulary features. The constructional features in play are chosen to potentially signify opinion but are not specific to negative or positive expressions. The framework is used to classify clauses, headlines, and sentences from three different shared collections of attitudinal data. We find that constructional features transfer well across different text collections and that the information couched in them integrates easily with a vocabulary based approach, yielding improvements in classification without complicating the application end of the processing framework.
  •  
40.
  • Karlgren, Jussi, et al. (author)
  • Dynamic lexica for query translation
  • 2005. - 1
  • In: Multilingual Information Access for Text, Speech and Images, Third Workshop of the Cross-Language Evaluation Forum (CLEF).
  • Conference paper (peer-reviewed)abstract
    • This experiment tests a simple, scalable, and effective approach to building a domain-specific translation lexicon using distributional statistics over parallelized bilingual corpora. A bilingual lexicon is extracted from aligned Swedish-French data and used to translate CLEF topics from Swedish to French; the resulting French queries are then in turn used to retrieve documents from the French language CLEF collection. The results give 34 of fifty queries on or above median for the “precision at 1000 documents” recall oriented score, with many of the errors possible to handle by the use of string-matching and cognate search. We conclude that the approach presented here is a simple and efficient component in an automatic query translation system.
  •  
41.
  • Karlgren, Jussi, et al. (author)
  • ELOQUENT CLEF Shared Tasks for Evaluation of Generative Language Model Quality
  • 2024
  • In: Lecture Notes in Computer Science. - : Springer Science and Business Media Deutschland GmbH. - 0302-9743 .- 1611-3349. ; 14612 LNCS, s. 459-465
  • Journal article (peer-reviewed)abstract
    • ELOQUENT is a set of shared tasks for evaluating the quality and usefulness of generative language models. ELOQUENT aims to bring together some high-level quality criteria, grounded in experiences from deploying models in real-life tasks, and to formulate tests for those criteria, preferably implemented to require minimal human assessment effort and in a multilingual setting. The selected tasks for this first year of ELOQUENT are (1) probing a language model for topical competence; (2) assessing the ability of models to generate and detect hallucinations; (3) assessing the robustness of a model output given variation in the input prompts; and (4) establishing the possibility to distinguish human-generated text from machine-generated text.
  •  
42.
  • Karlgren, Jussi, et al. (author)
  • Evaluating learning language representations
  • 2015
  • Conference paper (peer-reviewed)abstract
    • Machine learning offers significant benefits for systems that process and understand natural language: (a) lower maintenance and upkeep costs than when using manually-constructed resources, (b) easier portability to new domains, tasks, or languages, and (c) robust and timely adaptation to situation-specific settings. However, the behaviour of an adaptive system is less predictable than when using an edited, stable resource, which makes quality control a continuous issue. This paper proposes an evaluation benchmark for measuring the quality, coverage, and stability of a natural language system as it learns word meaning. Inspired by existing tests for human vocabulary learning, we outline measures for the quality of semantic word representations, such as when learning word embeddings or other distributed representations. These measures highlight differences between the types of underlying learning processes as systems ingest progressively more data.
  •  
43.
  • Karlgren, Jussi, et al. (author)
  • Filaments of Meaning in Word Space
  • 2008. - 1
  • Conference paper (peer-reviewed)abstract
    • Word space models, in the sense of vector space models built on distributional data taken from texts, are used to model semantic relations between words. We argue that the high dimensionality of typical vector space models leads to unintuitive effects on modeling likeness of meaning and that the local structure of word spaces is where interesting semantic relations reside. We show that the local structure of word spaces has substantially different dimensionality and character than the global space and that this structure shows potential to be exploited for further semantic analysis using methods for local analysis of vector space structure rather than globally scoped methods typically in use today such as singular value decomposition or principal component analysis.
  •  
44.
  • Karlgren, Jussi, et al. (author)
  • From Words to Understanding
  • 2001. - 1
  • In: Foundations of Real-World Intelligence. - Stanford, California : CSLI Publications. - 1575863383 ; , s. 294-308
  • Book chapter (peer-reviewed)
  •  
45.
  •  
46.
  • Karlgren, Jussi, et al. (author)
  • Usefulness of Sentiment Analysis
  • 2012
  • In: ECIR 2012, 34th European Conference on Information Retrieval. - Berlin, Heidelberg : Springer Berlin/Heidelberg. ; , s. 426-435
  • Conference paper (peer-reviewed)abstract
    • What can text sentiment analysis technology be used for, and does a more usage-informed view on sentiment analysis pose new requirements on technology development?
  •  
47.
  • Karlgren, Jussi, et al. (author)
  • Vector-based semantic analysis using random indexing and morphological analysis for cross-lingual information retrieval
  • 2002. - 1
  • In: Revised Papers from the Second Workshop of the Cross-Language Evaluation Forum on Evaluation of Cross-Language Information Retrieval Systems, Darmstadt, Germany, September 3 - 4, 2001. - : Springer-Verlag. - 3540440429 ; , s. 169-176
  • Book chapter (peer-reviewed)abstract
    • Meaning, the main object of study in information access, is most decidedly situation-dependent. While much of meaning appears to achieve consistency across usage situations -- a term will seem to mean much the same thing in many of its contexts -- most everything can be negotiated on the go. Human processing appears to be flexible in this respect, and oriented towards learning from prototypes rather than learning by definition: learning new words, and adding new meanings or shades of meaning to an existing word does not need a formal re-training process. We have built a query expansion and translation tool for information retrieval systems. When used in one single language it will expand the terms of a query using a thesaurus built for that purpose; when used across languages it will provide numerous translations and near translations for the source language terms. The underlying technology we are testing is that of vector-based semantic analysis, an analysis method related to latent semantic indexing based on stochastic pattern computing. This paper will briefly describe how we acquired training data, aligned it, analyzed it using morphological analysis tools, and finally built a thesaurus using the data, but will concentrate on an overview of vector-based semantic analysis and how stochastic pattern computing differs from latent semantic indexing in its current form.
  •  
48.
  • Karlgren, Jussi, et al. (author)
  • Weighting Query Terms Based on Distributional Statistics
  • 2006. - 1
  • In: Accessing Multilingual Information Repositories, 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005, Vienna, Austria, 21-23 September, 2005.
  • Conference paper (peer-reviewed)abstract
    • This year, the SICS team has concentrated on query processing and on the internal topical structure of the query, specifically compound translation. Compound translation is non-trivial due to dependencies between compound elements. This year, we have investigated topical dependencies between query terms: if a query term happens to be non-topical or noise, it should be discarded or given a low weight when ranking retrieved documents; if a query term shows high topicality its weight should be boosted. The two experiments described here are based on the analysis of the distributional character of query terms: one using similarity of occurrence context between query terms globally across the entire collection; the other using the likelihood of individual terms to appear topically in individual texts. Both -- complementary -- boosting schemes tested delivered improved results.
  •  
49.
  •  
50.
  • Koptjevskaja-Tamm, Maria, 1957-, et al. (author)
  • Temperature in the Word Space : Sense exploration of temperature expressions using word-space modeling.
  • 2014
  • In: Linguistic variation in text and speech, within and across languages. - Berlin/Boston : Walter de Gruyter. - 9783110317398 - 9783110317558 ; , s. 231-267
  • Book chapter (peer-reviewed)abstract
    • This chapter deals with a statistical technique for sense exploration based on distributional semantics known as word space modelling. Word space models rely on feature aggregation, in this case aggregation of co-occurrence events, to build an aggregated view on the distributional behaviour of words. Such models calculate meaning similarity among words on the basis of the contexts in which they occur and represent it as proximity in high-dimensional vector spaces. The main purpose of this study is to test to what extent word-space modelling is in principle suitable for lexical-typological work by taking a first little step in this direction and applying the method for the exploration of the seven central English temperature adjectives in three corpora representing different genres. In order to better capture and account for the potentially different senses of one and the same word we have suggested and applied a new variant of this general method, “syntagmatically labelled partitioning”.
  •  
Type of publication
conference paper (74)
journal article (18)
book chapter (6)
other publication (4)
doctoral thesis (4)
reports (3)
book (1)
Type of content
peer-reviewed (91)
other academic/artistic (18)
pop. science, debate, etc. (1)
Author/Editor
Sahlgren, Magnus (102)
Karlgren, Jussi (30)
Kerren, Andreas, 197 ... (16)
Paradis, Carita (15)
Olsson, Fredrik (8)
Cöster, Rickard (8)
Gyllensten, Amaru Cu ... (7)
Skeppstedt, Maria, 1 ... (6)
Sahlgren, Magnus, 19 ... (6)
Gogoulou, Evangelia (6)
Holst, Anders (5)
Carlsson, Fredrik (5)
Dahlberg, Stefan, 19 ... (5)
Axelsson, Sofia, 198 ... (5)
Holmberg, Sören, 194 ... (5)
Espinoza, Fredrik (5)
Ekgren, Ariel (5)
Hamfors, Ola (5)
Boman, Magnus (4)
Hansen, Preben (4)
Cuba Gyllensten, Ama ... (4)
Knutsson, Ola (4)
Cuba Gyllensten, Ama ... (4)
Kanerva, Pentti (4)
Asker, Lars (3)
Argaw, Atelach Alemu (3)
Öhman, Joey (3)
Isbister, Tim (3)
Ghoorchian, Kambiz, ... (3)
Girdzijauskas, Sarun ... (2)
Persson, Per (2)
Ben Abdesslem, Fehmi (2)
Koptjevskaja Tamm, M ... (2)
Eriksson, Gunnar (2)
Giaretta, Lodovico, ... (2)
Gambäck, Björn (2)
Brown, Barry (2)
Gillblad, Daniel (2)
Nivre, Joakim, 1962- (2)
Börjeson, Love (2)
Görnerup, Olof (2)
Isacsson, Nils (2)
Ylipää, Erik (2)
Lampinen, Airi (2)
Kerren, Andreas (2)
Kucher, Kostiantyn, ... (2)
Emruli, Blerim (2)
Helms, Karey, 1985- (2)
Täckström, Oscar (2)
Paradis, Carita, 195 ... (2)
University
RISE (64)
Royal Institute of Technology (26)
Linnaeus University (20)
Lund University (15)
Linköping University (12)
University of Gothenburg (8)
Stockholm University (6)
Uppsala University (2)
Luleå University of Technology (2)
Mid Sweden University (1)
Södertörn University (1)
Language
English (110)
Research subject (UKÄ/SCB)
Natural sciences (91)
Humanities (20)
Engineering and Technology (11)
Social Sciences (10)
Medical and Health Sciences (1)
