SwePub

Result list for the search "hsv:(NATURVETENSKAP) hsv:(Data och informationsvetenskap) ;pers:(Karlgren Jussi)"

  • Results 1-10 of 186
1.
  • Amundin, Mats, et al. (author)
  • A proposal to use distributional models to analyse dolphin vocalisation
  • 2017
  • In: Proceedings of the 1st International Workshop on Vocal Interactivity in-and-between Humans, Animals and Robots, VIHAR 2017. - 9782956202905 ; pp. 31-32
  • Conference paper (peer-reviewed)
    • This paper gives a brief introduction to the starting points of an experimental project to study dolphin communicative behaviour using distributional semantics, with methods implemented for the large scale study of human language.
2.
  • Täckström, Oscar, et al. (author)
  • Uncertainty Detection as Approximate Max-Margin Sequence Labelling
  • 2010
  • In: CoNLL 2010. - Association for Computational Linguistics ; pp. 84-91
  • Conference paper (peer-reviewed)
    • This paper reports experiments for the CoNLL 2010 shared task on learning to detect hedges and their scope in natural language text. We have addressed the experimental tasks as supervised linear maximum margin prediction problems. For sentence level hedge detection in the biological domain we use an L1-regularised binary support vector machine, while for sentence level weasel detection in the Wikipedia domain, we use an L2-regularised approach. We model the in-sentence uncertainty cue and scope detection task as an L2-regularised approximate maximum margin sequence labelling problem, using the BIO-encoding. In addition to surface level features, we use a variety of linguistic features based on a functional dependency analysis. A greedy forward selection strategy is used in exploring the large set of potential features. Our official results for Task 1 for the biological domain are 85.2 F1-score, for the Wikipedia set 55.4 F1-score. For Task 2, our official results are 2.1 for the entire task with a score of 62.5 for cue detection. After resolving errors and final bugs, our final results are for Task 1, biological: 86.0, Wikipedia: 58.2; Task 2, scopes: 39.6 and cues: 78.5.
3.
  • Argaw, Atelach Alemu, et al. (author)
  • Dictionary-based Amharic-French information retrieval
  • 2006
  • In: Accessing Multilingual Information Repositories. - Berlin, Heidelberg : Springer Berlin Heidelberg. - 354045697X ; pp. 83-92
  • Conference paper (peer-reviewed)
    • We present four approaches to the Amharic-French bilingual track at CLEF 2005. All experiments use a dictionary-based approach to translate the Amharic queries into French bags-of-words, but while one approach uses word sense discrimination on the translated side of the queries, the other includes all senses of a translated word in the query for searching. We used two search engines, the SICS experimental engine and Lucene, hence four runs with the two approaches. Non-content-bearing words were removed both before and after the dictionary lookup. TF/IDF values supplemented by a heuristic function were used to remove the stop words from the Amharic queries, and two French stopword lists were used to remove them from the French translations. In our experiments, we found that the SICS search engine performs better than Lucene and that using the word-sense-discriminated keywords produces a slightly better result than the full set of non-discriminated keywords.
4.
  • Gey, Frederic, et al. (author)
  • Information access in a multilingual world: transitioning from research to real-world applications
  • 2009. - 5
  • In: SIGIR Forum. - Kista, Sweden : Swedish Institute of Computer Science. - 0163-5840 .- 1558-0229 ; 43, pp. 24-28
  • Report (other academic/artistic)
    • This report constitutes the proceedings of the workshop on Information Access in a Multilingual World: Transitioning from Research to Real-World Applications, held at SIGIR 2009 in Boston, July 23, 2009. Multilingual Information Access (MLIA) is at a turning point wherein substantial real-world applications are being introduced after fifteen years of research into cross-language information retrieval, question answering, statistical machine translation and named entity recognition. Previous workshops on this topic have focused on research and small-scale applications. The focus of this workshop was on technology transfer from research to applications and on what future research needs to be done to facilitate MLIA in an increasingly connected multilingual world.
5.
  • Kamps, Jaap, et al. (author)
  • Report on the Third Workshop on Exploiting Semantic Annotations in Information Retrieval (ESAIR), Toronto, Canada
  • 2011
  • In: SIGIR Forum. - SIGIR. - 0163-5840 .- 1558-0229 ; 45, pp. 33-41
  • Journal article (peer-reviewed)
    • There is an increasing amount of structure on the Web as a result of modern Web languages, user tagging and annotation, and emerging robust NLP tools. These meaningful, semantic, annotations hold the promise to significantly enhance information access, by enhancing the depth of analysis of today's systems. Currently, we have only started exploring the possibilities and only begin to understand how these valuable semantic cues can be put to fruitful use. The workshop had an interactive format consisting of keynotes, boasters and posters, breakout groups and reports, and a final discussion, which was prolonged into the evening. There was a strong feeling that we made substantial progress. Specifically, each of the breakout groups contributed to our understanding of the way forward. First, annotations and use cases come in many different shapes and forms depending on the domain at hand, but at a higher level there are commonalities in annotation tools, indexing methods, user interfaces, and general methodology. Second, there is a framework emerging to view annotation as (1) a linking procedure, connecting (2) an analysis of information objects with (3) a semantic model of some sort, expressing relations that contribute to (4) a task of interest to end users. Third, we should look at complex tasks that cannot be comprehensively articulated in a few keywords, and embrace interaction both to incrementally refine the search request and to explore the results at various stages, guided by the semantic structure.
6.
  • Karlgren, Jussi, et al. (author)
  • Between Bags and Trees : Constructional Patterns in Text Used for Attitude Identification
  • 2010
  • In: ECIR 2010, 32nd European Conference on Information Retrieval.
  • Conference paper (peer-reviewed)
    • This paper describes experiments to use non-terminological information to find attitudinal expressions in written English text. The experiments are based on an analysis of text with respect to not only the vocabulary of content terms present in it (which most other approaches use as a basis for analysis) but also with respect to presence of structural features of the text represented by constructional features (typically disregarded by most other analyses). In our analysis, following a construction grammar framework, structural features are treated as occurrences, similarly to the treatment of vocabulary features. The constructional features in play are chosen to potentially signify opinion but are not specific to negative or positive expressions. The framework is used to classify clauses, headlines, and sentences from three different shared collections of attitudinal data. We find that constructional features transfer well across different text collections and that the information couched in them integrates easily with a vocabulary based approach, yielding improvements in classification without complicating the application end of the processing framework.
7.
  • Karlgren, Jussi, et al. (author)
  • Recognizing Text Genres with Simple Metrics Using Discriminant Analysis
  • 1994
  • In: Proceedings of the 15th International Conference on Computational Linguistics. - Morristown, NJ, USA : Association for Computational Linguistics ; pp. 1071-1075
  • Conference paper (peer-reviewed)
    • A simple method for categorizing texts into pre-determined text genre categories using the statistical standard technique of discriminant analysis is demonstrated with application to the Brown corpus. Discriminant analysis makes it possible to use a large number of parameters that may be specific for a certain corpus or information stream, and combine them into a small number of functions, with the parameters weighted on the basis of how useful they are for discriminating text genres. An application to information retrieval is discussed.
8.
  •  
9.
  • Sahlgren, Magnus, et al. (author)
  • Automatic Bilingual Lexicon Acquisition Using Random Indexing of Parallel Corpora
  • 2005
  • In: Natural Language Engineering. - 1351-3249 .- 1469-8110 ; 11:3, pp. 327-341
  • Journal article (peer-reviewed)
    • This paper presents a very simple and effective approach to using parallel corpora for automatic bilingual lexicon acquisition. The approach, which uses the Random Indexing vector space methodology, is based on finding correlations between terms based on their distributional characteristics. The approach requires a minimum of preprocessing and linguistic knowledge, and is efficient, fast and scalable. In this paper, we explain how our approach differs from traditional cooccurrence-based word alignment algorithms, and we demonstrate how to extract bilingual lexica using the Random Indexing approach applied to aligned parallel data. The acquired lexica are evaluated by comparing them to manually compiled gold standards, and we report overlap of around 60%. We also discuss methodological problems with evaluating lexical resources of this kind.
10.
  • Täckström, Oscar, 1979- (author)
  • Predicting Linguistic Structure with Incomplete and Cross-Lingual Supervision
  • 2013
  • Doctoral thesis (other academic/artistic)
    • Contemporary approaches to natural language processing are predominantly based on statistical machine learning from large amounts of text, which has been manually annotated with the linguistic structure of interest. However, such complete supervision is currently only available for the world's major languages, in a limited number of domains and for a limited range of tasks. As an alternative, this dissertation considers methods for linguistic structure prediction that can make use of incomplete and cross-lingual supervision, with the prospect of making linguistic processing tools more widely available at a lower cost. An overarching theme of this work is the use of structured discriminative latent variable models for learning with indirect and ambiguous supervision; as instantiated, these models admit rich model features while retaining efficient learning and inference properties. The first contribution to this end is a latent-variable model for fine-grained sentiment analysis with coarse-grained indirect supervision. The second is a model for cross-lingual word-cluster induction and the application thereof to cross-lingual model transfer. The third is a method for adapting multi-source discriminative cross-lingual transfer models to target languages, by means of typologically informed selective parameter sharing. The fourth is an ambiguity-aware self- and ensemble-training algorithm, which is applied to target language adaptation and relexicalization of delexicalized cross-lingual transfer parsers. The fifth is a set of sequence-labeling models that combine constraints at the level of tokens and types, and an instantiation of these models for part-of-speech tagging with incomplete cross-lingual and crowdsourced supervision.
In addition to these contributions, comprehensive overviews are provided of structured prediction with no or incomplete supervision, as well as of learning in the multilingual and cross-lingual settings. Through careful empirical evaluation, it is established that the proposed methods can be used to create substantially more accurate tools for linguistic processing, compared to both unsupervised methods and to recently proposed cross-lingual methods. The empirical support for this claim is particularly strong in the latter case; our models for syntactic dependency parsing and part-of-speech tagging achieve the hitherto best published results for a wide number of target languages, in the setting where no annotated training data is available in the target language.
Publication type
conference paper (99)
journal article (23)
book chapter (23)
other publication (15)
report (12)
book (5)
doctoral thesis (4)
proceedings (editorship) (2)
licentiate thesis (2)
collection (editorship) (1)
Content type
peer-reviewed (150)
other academic/artistic (34)
popular science, debate, etc. (2)
Author/editor
Sahlgren, Magnus (28)
Hansen, Preben (16)
Eriksson, Gunnar (11)
Olsson, Fredrik (10)
Kamps, Jaap (9)
Cöster, Rickard (8)
Gonzalo, Julio (7)
Ortgies, Robert (6)
Persson, Per (5)
Asker, Lars (5)
Gambäck, Björn (5)
Boujemaa, Nozha (5)
Compañó, Ramón (5)
Köhler, Joachim (5)
Hulth, Anette (4)
Sanderson, Mark (4)
Geurts, Joost (4)
King, Paul (4)
Rudström, Åsa (4)
Sebe, Nicu (4)
Pettersson, Paul (3)
Holst, Anders (3)
Clarke, Charles L.A. (3)
Ferro, Nicola (3)
Murdock, Vanessa (3)
Bylund, Markus (3)
Argamon, Shlomo (3)
Argaw, Atelach Alemu (3)
Gouraud, Henri (3)
Girdzijauskas, Sarun ... (2)
Jonsson, Lars (2)
Jonsson, Anna (2)
Nivre, Joakim (2)
Dalianis, Hercules (2)
Svensson, Martin (2)
Kando, Noriko (2)
Kanoulas, Evangelos (2)
Mizzaro, Stefano (2)
Hassel, Martin (2)
Boman, Magnus (2)
Alonso, Omar (2)
Laaksolahti, Jarmo (2)
Waern, Annika (2)
Shanahan, James G. (2)
Hanbury, Allan (2)
Kompatsiaris, Yianni ... (2)
Le Moine, Jean-Yves (2)
Point, Jean-Charles (2)
Rotenberg, Boris (2)
Institution
RISE (138)
Kungliga Tekniska Högskolan (54)
Stockholms universitet (10)
Uppsala universitet (6)
Linköpings universitet (2)
Linnéuniversitetet (1)
Institutet för språk och folkminnen (1)
Language
English (176)
Swedish (10)
Research subject (UKÄ/SCB)
Natural sciences (186)
Social sciences (3)
Engineering and technology (2)
Humanities (2)
