SwePub - search: WFRF:(Johansson Richard 1975)

No.	Reference
1.	Wöhri, Annemarie, 1976, et al. (author) A Lipidic-Sponge Phase Screen for Membrane Protein Crystallization 2008 In: Structure. - : Elsevier BV. - 0969-2126 .- 1878-4186. ; 16:7, pp. 1003-1009 Journal article (peer-reviewed) Abstract: A major current deficit in structural biology is the lack of high-resolution structures of eukaryotic membrane proteins, many of which are key drug targets for the treatment of disease. Numerous eukaryotic membrane proteins require specific lipids for their stability and activity, and efforts to crystallize and solve the structures of membrane proteins that do not address the issue of lipids frequently end in failure rather than success. To help address this problem, we have developed a sparse matrix crystallization screen consisting of 48 lipidic-sponge phase conditions. Sponge phases form liquid lipid bilayer environments which are suitable for conventional hanging- and sitting-drop crystallization experiments. Using the sponge phase screen, we obtained crystals of several different membrane proteins from bacterial and eukaryotic sources. We also demonstrate how the screen may be manipulated by incorporating specific lipids such as cholesterol; this modification led to crystals being recovered from a bacterial photosynthetic core complex.
2.	Adesam, Yvonne, 1975, et al. (author) Defining the Eukalyptus forest – the Koala treebank of Swedish 2015 In: Proceedings of the 20th Nordic Conference of Computational Linguistics, NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania. Edited by Beáta Megyesi. - 1650-3686 .- 1650-3740. - 9789175190983 ; , pp. 1-9 Conference paper (peer-reviewed) Abstract: This paper details the design of the lexical and syntactic layers of a new annotated corpus of Swedish contemporary texts. In order to make the corpus adaptable into a variety of representations, the annotation is of a hybrid type with head-marked constituents and function-labeled edges, and with a rich annotation of non-local dependencies. The source material has been taken from public sources, to allow the resulting corpus to be made freely available.
3.	Adesam, Yvonne, 1975, et al. (author) Koala – Korp’s Linguistic Annotations: Developing an infrastructure for text-based research with high-quality annotations 2014 In: Proceedings of the Fifth Swedish Language Technology Conference, Uppsala, 13-14 November 2014. Conference paper (other academic/artistic)
4.	Adesam, Yvonne, 1975, et al. (author) Multiwords, Word Senses and Multiword Senses in the Eukalyptus Treebank of Written Swedish 2015 In: Proceedings of the Fourteenth International Workshop on Treebanks and Linguistic Theories (TLT14), 11–12 December 2015 Warsaw, Poland. - 9788363159184 ; , pp. 3-12 Conference paper (peer-reviewed) Abstract: Multiwords reside at the intersection of the lexicon and syntax, and in an annotation project they will affect both levels. In the Eukalyptus treebank of written Swedish, we treat multiwords formally as syntactic objects, which are assigned a lexical type and sense. With the help of a simple dichotomy, analyzed vs unanalyzed multiwords, and the expressiveness of the syntactic annotation formalism employed, we are able to flexibly handle most multiword types and usages.
5.	Adesam, Yvonne, 1975, et al. (author) The Eukalyptus Treebank of Written Swedish 2018 In: Seventh Swedish Language Technology Conference (SLTC), Stockholm, 7–9 November 2018. Conference paper (other academic/artistic)
6.	Adesam, Yvonne, 1975, et al. (author) The Koala Part-of-Speech and Morphological Tagset for Swedish 2018 In: Seventh Swedish Language Technology Conference (SLTC), Stockholm, 7-9 November, 2018. Conference paper (other academic/artistic)
7.	Brändén, Gisela, 1975, et al. (author) Coherent diffractive imaging of microtubules using an X-ray laser. 2019 In: Nature communications. - : Springer Science and Business Media LLC. - 2041-1723. ; 10:1 Journal article (peer-reviewed) Abstract: X-ray free electron lasers (XFELs) create new possibilities for structural studies of biological objects that extend beyond what is possible with synchrotron radiation. Serial femtosecond crystallography has allowed high-resolution structures to be determined from micrometer-sized crystals, whereas single particle coherent X-ray imaging requires development to extend the resolution beyond a few tens of nanometers. Here we describe an intermediate approach: the XFEL imaging of biological assemblies with helical symmetry. We collected X-ray scattering images from samples of microtubules injected across an XFEL beam using a liquid microjet, sorted these images into class averages, merged these data into a diffraction pattern extending to 2 nm resolution, and reconstructed these data into a projection image of the microtubule. Details such as the 4 nm tubulin monomer became visible in this reconstruction. These results illustrate the potential of single-molecule X-ray imaging of biological assemblies with helical symmetry at room temperature.
8.	Dods, Robert, 1989, et al. (author) Ultrafast structural changes within a photosynthetic reaction centre. 2021 In: Nature. - : Springer Science and Business Media LLC. - 1476-4687 .- 0028-0836. ; 589:7841, pp. 310-314 Journal article (peer-reviewed) Abstract: Photosynthetic reaction centres harvest the energy content of sunlight by transporting electrons across an energy-transducing biological membrane. Here we use time-resolved serial femtosecond crystallography using an X-ray free-electron laser to observe light-induced structural changes in the photosynthetic reaction centre of Blastochloris viridis on a timescale of picoseconds. Structural perturbations first occur at the special pair of chlorophyll molecules of the photosynthetic reaction centre that are photo-oxidized by light. Electron transfer to the menaquinone acceptor on the opposite side of the membrane induces a movement of this cofactor together with lower amplitude protein rearrangements. These observations reveal how proteins use conformational dynamics to stabilize the charge-separation steps of electron-transfer reactions.
9.	Forsberg, Markus, 1974, et al. (author) From construction candidates to constructicon entries: An experiment using semi-automatic methods for identifying constructions in corpora 2014 In: Constructions and Frames. - : John Benjamins Publishing Company. - 1876-1933 .- 1876-1941. ; 6:1, 2014, pp. 114-135 Journal article (peer-reviewed) Abstract: We present an experiment where natural language processing tools are used to automatically identify potential constructions in a corpus. The experiment was conducted as part of the ongoing efforts to develop a Swedish constructicon. Using an automatic method to suggest constructions has advantages not only for efficiency but also methodologically: it forces the analyst to look more objectively at the constructions actually occurring in corpora, as opposed to focusing on “interesting” constructions only. As a heuristic for identifying potential constructions, the method has proved successful, yielding about 200 (out of 1,200) highly relevant construction candidates.
10.	Hagström, Lovisa, 1995, et al. (author) The Effect of Scaling, Retrieval Augmentation and Form on the Factual Consistency of Language Models 2023 In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 5457–5476, Singapore. - : Association for Computational Linguistics. Conference paper (peer-reviewed) Abstract: Large Language Models (LLMs) make natural interfaces to factual knowledge, but their usefulness is limited by their tendency to deliver inconsistent answers to semantically equivalent questions. For example, a model might supply the answer “Edinburgh” to “Anne Redpath passed away in X.” and “London” to “Anne Redpath’s life ended in X.” In this work, we identify potential causes of inconsistency and evaluate the effectiveness of two mitigation strategies: up-scaling and augmenting the LM with a passage retrieval database. Our results on the LLaMA and Atlas models show that both strategies reduce inconsistency but that retrieval augmentation is considerably more efficient. We further consider and disentangle the consistency contributions of different components of Atlas. For all LMs evaluated we find that syntactical form and task artifacts impact consistency. Taken together, our results provide a better understanding of the factors affecting the factual consistency of language models.
11.	Johansson, Linda C, 1983, et al. (author) Lipidic phase membrane protein serial femtosecond crystallography. 2012 In: Nature methods. - : Springer Science and Business Media LLC. - 1548-7105 .- 1548-7091. ; 9:3, pp. 263-265 Journal article (peer-reviewed) Abstract: X-ray free electron laser (X-FEL)-based serial femtosecond crystallography is an emerging method with potential to rapidly advance the challenging field of membrane protein structural biology. Here we recorded interpretable diffraction data from micrometer-sized lipidic sponge phase crystals of the Blastochloris viridis photosynthetic reaction center delivered into an X-FEL beam using a sponge phase micro-jet.
12.	Johansson, Linda C, 1983, et al. (author) Structure of a photosynthetic reaction centre determined by serial femtosecond crystallography. 2013 In: Nature communications. - : Springer Science and Business Media LLC. - 2041-1723. ; 4 Journal article (peer-reviewed) Abstract: Serial femtosecond crystallography is an X-ray free-electron-laser-based method with considerable potential to have an impact on challenging problems in structural biology. Here we present X-ray diffraction data recorded from microcrystals of the Blastochloris viridis photosynthetic reaction centre to 2.8 Å resolution and determine its serial femtosecond crystallography structure to 3.5 Å resolution. Although every microcrystal is exposed to a dose of 33 MGy, no signs of X-ray-induced radiation damage are visible in this integral membrane protein structure.
13.	Johansson, Richard, 1975, et al. (author) A Multi-domain Corpus of Swedish Word Sense Annotation 2016 In: 10th edition of the Language Resources and Evaluation Conference, 23-28 May 2016, Portorož (Slovenia). - : European Language Resources Association. - 9782951740891 Conference paper (peer-reviewed) Abstract: We describe the word sense annotation layer in Eukalyptus, a freely available five-domain corpus of contemporary Swedish with several annotation layers. The annotation uses the SALDO lexicon to define the sense inventory, and allows word sense annotation of compound segments and multiword units. We give an overview of the new annotation tool developed for this project, and finally present an analysis of the inter-annotator agreement between two annotators.
14.	Johansson, Richard, 1975, et al. (author) Training a Swedish Constituency Parser on Six Incompatible Treebanks 2020 In: Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020). - : European Language Resources Association (ELRA). Conference paper (peer-reviewed) Abstract: We investigate a transition-based parser that uses Eukalyptus, a function-tagged constituent treebank for Swedish which includes discontinuous constituents. In addition, we show that the accuracy of this parser can be improved by using a multitask learning architecture that makes it possible to train the parser on additional treebanks that use other annotation models.
15.	Kågebäck, Mikael, 1981, et al. (author) Neural context embeddings for automatic discovery of word senses 2015 In: Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing. Denver, United States. - 9781941643464 ; , pp. 25-32 Conference paper (peer-reviewed) Abstract: Word sense induction (WSI) is the problem of automatically building an inventory of senses for a set of target words using only a text corpus. We introduce a new method for embedding word instances and their context, for use in WSI. The method, Instance-context embedding (ICE), leverages neural word embeddings, and the correlation statistics they capture, to compute high quality embeddings of word contexts. In WSI, these context embeddings are clustered to find the word senses present in the text. ICE is based on a novel method for combining word embeddings using continuous Skip-gram, based on both semantic and temporal aspects of context words. ICE is evaluated both in a new system, and in an extension to a previous system for WSI. In both cases, we surpass the previous state of the art on the WSI task of SemEval-2013, which highlights the generality of ICE. Our proposed system achieves a 33% relative improvement.
16.	Redecke, Lars, et al. (author) Natively inhibited Trypanosoma brucei cathepsin B structure determined by using an X-ray laser. 2013 In: Science (New York, N.Y.). - : American Association for the Advancement of Science (AAAS). - 1095-9203 .- 0036-8075. ; 339:6116, pp. 227-30 Journal article (peer-reviewed) Abstract: The Trypanosoma brucei cysteine protease cathepsin B (TbCatB), which is involved in host protein degradation, is a promising target to develop new treatments against sleeping sickness, a fatal disease caused by this protozoan parasite. The structure of the mature, active form of TbCatB has so far not provided sufficient information for the design of a safe and specific drug against T. brucei. By combining two recent innovations, in vivo crystallization and serial femtosecond crystallography, we obtained the room-temperature 2.1 angstrom resolution structure of the fully glycosylated precursor complex of TbCatB. The structure reveals the mechanism of native TbCatB inhibition and demonstrates that new biomolecular information can be obtained by the "diffraction-before-destruction" approach of x-ray free-electron lasers from hundreds of thousands of individual microcrystals.
17.	Saynova, Denitsa, et al. (author) Class Explanations: the Role of Domain-Specific Content and Stop Words 2023 In: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 103–112, Tórshavn, Faroe Islands. - : University of Tartu Library. Conference paper (peer-reviewed) Abstract: We address two understudied areas related to explainability for neural text models. First, class explanations. What features are descriptive across a class, rather than explaining single input instances? Second, the type of features that are used for providing explanations. Does the explanation involve the statistical pattern of word usage or the presence of domain-specific content words? Here, we present a method to extract both class explanations and strategies to differentiate between two types of explanations – domain-specific signals or statistical variations in frequencies of common words. We demonstrate our method using a case study in which we analyse transcripts of political debates in the Swedish Riksdag.
18.	Tahmasebi, Nina, 1982, et al. (author) Visions and open challenges for a knowledge-based culturomics 2015 In: International Journal on Digital Libraries. - : Springer Science and Business Media LLC. - 1432-5012 .- 1432-1300. ; 15:2-4, pp. 169-187 Journal article (peer-reviewed) Abstract: The concept of culturomics was born out of the availability of massive amounts of textual data and the interest to make sense of cultural and language phenomena over time. Thus far, however, culturomics has only made use of, and shown the great potential of, statistical methods. In this paper, we present a vision for a knowledge-based culturomics that complements traditional culturomics. We discuss the possibilities and challenges of combining knowledge-based methods with statistical methods and address major challenges that arise due to the nature of the data: diversity of sources, changes in language over time as well as temporal dynamics of information in general. We address all layers needed for knowledge-based culturomics, from natural language processing and relations to summaries and opinions.
19.	Volodina, Elena, 1973, et al. (author) Semi-automatic selection of best corpus examples for Swedish: Initial algorithm evaluation. 2012 In: Proceedings of the SLTC 2012 workshop on NLP for CALL, Lund, 25th October, 2012. - 1650-3740. ; :080, pp. 59-70 Conference paper (peer-reviewed) Abstract: The study presented here describes the results of the initial evaluation of two sorting approaches to automatic ranking of corpus examples for Swedish. Representatives from two potential target user groups have been asked to rate the top three hits per approach for sixty search items from the point of view of the needs of their professional target groups, namely second/foreign language (L2) teachers and lexicographers. This evaluation has shown, on the one hand, which of the two approaches to example rating (called in the text below algorithms #1 and #2) performs better in terms of finding better examples for each target user group; and on the other hand, which features evaluators associate with good examples. It has also facilitated statistical analysis of the “good” versus “bad” examples with reference to the measurable features, such as sentence length, word length, lexical frequency profiles, PoS constitution, dependency structure, etc. with a potential to find out new reliable classifiers.
20.	Adouane, Wafia, 1985, et al. (author) Arabicized and Romanized Berber Automatic Identification 2016 In: Proceedings of TICAM 2016. - Morocco : IRCAM. Conference paper (peer-reviewed) Abstract: We present an automatic language identification tool for both Arabicized Berber (Berber written in the Arabic script) and Romanized Berber (Berber written in the Latin script). The focus is on short texts (social media content). We use a supervised machine learning method with character and word-based n-gram models as features. We also describe the corpora used in this paper. For both Arabicized and Romanized Berber, character-based 5-grams score the best giving an F-score of 99.50%.
21.	Adouane, Wafia, 1985, et al. (author) ASIREM Participation at the Discriminating Similar Languages Shared Task 2016 2016 In: Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects; 163–169; December 12; Osaka, Japan. Conference paper (peer-reviewed)
22.	Adouane, Wafia, 1985, et al. (author) Automatic Detection of Arabicized Berber and Arabic Varieties 2016 In: Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects; 63–72; December 12; Osaka, Japan. Conference paper (peer-reviewed) Abstract: Automatic Language Identification (ALI) is the detection of the natural language of an input text by a machine. It is the first necessary step to do any language-dependent natural language processing task. Various methods have been successfully applied to a wide range of languages, and the state-of-the-art automatic language identifiers are mainly based on character n-gram models trained on huge corpora. However, there are many languages which are not yet automatically processed, for instance minority and informal languages. Many of these languages are only spoken and do not exist in a written format. Social media platforms and new technologies have facilitated the emergence of written format for these spoken languages based on pronunciation. The latter are not well represented on the Web, commonly referred to as under-resourced languages, and the currently available ALI tools fail to properly recognize them. In this paper, we revisit the problem of ALI with the focus on Arabicized Berber and dialectal Arabic short texts. We introduce new resources and evaluate the existing methods. The results show that machine learning models combined with lexicons are well suited for detecting Arabicized Berber and different Arabic varieties and distinguishing between them, giving a macro-average F-score of 92.94%.
23.	Adouane, Wafia, 1985, et al. (author) Gulf Arabic Resource Building for Sentiment Analysis 2016 In: Proceedings of the Language Resources and Evaluation Conference (LREC), 23-28 May 2016, Portorož, Slovenia. - : European Language Resources Association. - 9782951740891 Conference paper (peer-reviewed) Abstract: This paper deals with building linguistic resources for Gulf Arabic, one of the Arabic variations, for the sentiment analysis task using machine learning. To our knowledge, no previous work has been done for Gulf Arabic sentiment analysis despite the fact that it is present in different online platforms. Hence, the first challenge is the absence of annotated data and sentiment lexicons. To fill this gap, we created these two main linguistic resources. Then we conducted different experiments: use a Naive Bayes classifier without any lexicon; add a sentiment lexicon designed basically for MSA; use only the compiled Gulf Arabic sentiment lexicon and finally use both MSA and Gulf Arabic sentiment lexicons. The Gulf Arabic lexicon gives a good improvement of the classifier accuracy (90.54%) over a baseline that does not use the lexicon (82.81%), while the MSA lexicon causes the accuracy to drop to (76.83%). Moreover, mixing MSA and Gulf Arabic lexicons causes the accuracy to drop to (84.94%) compared to using only Gulf Arabic lexicon. This indicates that it is useless to use MSA resources to deal with Gulf Arabic due to the considerable differences and conflicting structures between these two languages.
24.	Adouane, Wafia, 1985, et al. (author) Romanized Arabic and Berber Detection Using PPM and Dictionary Methods 2017 In: 13th ACS/IEEE International Conference on Computer Systems and Applications AICCSA 2016. - Morocco. - 2161-5322. - 9781509043200 Conference paper (peer-reviewed) Abstract: Arabic is one of the Semitic languages written in Arabic script in its standard form. However, the recent rise of social media and new technologies has contributed considerably to the emergence of a new form of Arabic, namely Arabic written in Latin scripts, often called Romanized Arabic or Arabizi. While Romanized Arabic is an informal language, Berber or Tamazight uses Latin script in its standard form with some orthography differences depending on the country it is used in. Both these languages are under-resourced and unknown to the state-of-the-art language identifiers. In this paper, we present an automatic language identifier for both Romanized Arabic and Romanized Berber. We also describe the built linguistic resources (large dataset and lexicons) including a wide range of Arabic dialects (Algerian, Egyptian, Gulf, Iraqi, Levantine, Moroccan and Tunisian dialects) as well as the most popular Berber varieties (Kabyle, Tashelhit, Tarifit, Tachawit and Tamzabit). We use the Prediction by Partial Matching (PPM) and dictionary-based methods. The methods reach a macro-average F-Measure of 98.74% and 97.60% respectively.
25.	Adouane, Wafia, 1985, et al. (author) Romanized Arabic and Berber Detection Using Prediction by Partial Matching and Dictionary Methods 2016 In: 2016 IEEE/ACS 13TH INTERNATIONAL CONFERENCE OF COMPUTER SYSTEMS AND APPLICATIONS (AICCSA). - 9781509043200 Conference paper (peer-reviewed) Abstract: Arabic is one of the Semitic languages written in Arabic script in its standard form. However, the recent rise of social media and new technologies has contributed considerably to the emergence of a new form of Arabic, namely Arabic written in Latin scripts, often called Romanized Arabic or Arabizi. While Romanized Arabic is an informal language, Berber or Tamazight uses Latin script in its standard form with some orthography differences depending on the country it is used in. Both these languages are under-resourced and unknown to the state-of-the-art language identifiers. In this paper, we present an automatic language identifier for both Romanized Arabic and Romanized Berber. We also describe the built linguistic resources (large dataset and lexicons) including a wide range of Arabic dialects (Algerian, Egyptian, Gulf, Iraqi, Levantine, Moroccan and Tunisian dialects) as well as the most popular Berber varieties (Kabyle, Tashelhit, Tarifit, Tachawit and Tamzabit). We use the Prediction by Partial Matching (PPM) and dictionary-based methods. The methods reach a macro-average F-Measure of 98.74% and 97.60% respectively.
26.	Adouane, Wafia, 1985, et al. (author) Romanized Berber and Romanized Arabic Automatic Language Identification Using Machine Learning 2016 In: Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects; 53–61; December 12, 2016 ; Osaka, Japan. - : Association for Computational Linguistics. - 0736-587X. Conference paper (peer-reviewed) Abstract: The identification of the language of text/speech input is the first step to be able to properly do any language-dependent natural language processing. The task is called Automatic Language Identification (ALI). Being a well-studied field since the early 1960s, various methods have been applied to many standard languages. The ALI standard methods require datasets for training and use character/word-based n-gram models. However, social media and new technologies have contributed to the rise of informal and minority languages on the Web. The state-of-the-art automatic language identifiers fail to properly identify many of them. Romanized Arabic (RA) and Romanized Berber (RB) are cases of these informal languages which are under-resourced. The goal of this paper is twofold: detect RA and RB, at a document level, as separate languages and distinguish between them as they coexist in North Africa. We consider the task as a classification problem and use supervised machine learning to solve it. For both languages, character-based 5-grams combined with additional lexicons score the best, F-score of 99.75% and 97.77% for RB and RA respectively.
27.	Ahlberg, Malin, 1986, et al. (author) Swedish FrameNet++ The Beginning of the End and the End of the Beginning 2014 In: Proceedings of the Fifth Swedish Language Technology Conference, Uppsala, 13-14 November 2014. Conference paper (other academic/artistic)
28.	Andersson, Magnus, et al. (author) Structural Dynamics of Light-Driven Proton Pumps 2009 In: Structure. - : Elsevier BV. - 0969-2126 .- 1878-4186. ; 17:9, pp. 1265-1275 Journal article (peer-reviewed) Abstract: Bacteriorhodopsin and proteorhodopsin are simple heptahelical proton pumps containing a retinal chromophore covalently bound to helix G via a protonated Schiff base. Following the absorption of a photon, all-trans retinal is isomerized to a 13-cis conformation, initiating a sequence of conformational changes driving vectorial proton transport. In this study we apply time-resolved wide-angle X-ray scattering to visualize in real time the helical motions associated with proton pumping by bacteriorhodopsin and proteorhodopsin. Our results establish that three conformational states are required to describe their photocycles. Significant motions of the cytoplasmic half of helix F and the extracellular half of helix C are observed prior to the primary proton transfer event, which increase in amplitude following proton transfer. These results both simplify the structural description to emerge from intermediate trapping studies of bacteriorhodopsin and reveal shared dynamical principles for proton pumping.
29.	Arnlund, David, et al. (author) Visualizing a protein quake with time-resolved X-ray scattering at a free-electron laser 2014 In: Nature Methods. - : Springer Science and Business Media LLC. - 1548-7091 .- 1548-7105. ; 11:9, pp. 923-926 Journal article (peer-reviewed) Abstract: We describe a method to measure ultrafast protein structural changes using time-resolved wide-angle X-ray scattering at an X-ray free-electron laser. We demonstrated this approach using multiphoton excitation of the Blastochloris viridis photosynthetic reaction center, observing an ultrafast global conformational change that arises within picoseconds and precedes the propagation of heat through the protein. This provides direct structural evidence for a 'protein quake': the hypothesis that proteins rapidly dissipate energy through quake-like structural motions.
30.	Bennaceur, Amel, et al. (author) Automatic Service Categorisation through Machine Learning in Emergent Middleware 2013 In: Lecture Notes in Computer Science. - Berlin, Heidelberg : Springer Berlin Heidelberg. - 0302-9743. ; 7542, pp. 133-149 Conference paper (peer-reviewed)
31.	Blodgett, Joanna M., et al. (author) Device-measured physical activity and cardiometabolic health : the Prospective Physical Activity, Sitting, and Sleep (ProPASS) consortium 2023 In: European Heart Journal. - 0195-668X .- 1522-9645. Journal article (peer-reviewed) Abstract: Background and Aims: Physical inactivity, sedentary behaviour (SB), and inadequate sleep are key behavioural risk factors of cardiometabolic diseases. Each behaviour is mainly considered in isolation, despite clear behavioural and biological interdependencies. The aim of this study was to investigate associations of five-part movement compositions with adiposity and cardiometabolic biomarkers. Methods: Cross-sectional data from six studies (n = 15 253 participants; five countries) from the Prospective Physical Activity, Sitting and Sleep consortium were analysed. Device-measured time spent in sleep, SB, standing, light-intensity physical activity (LIPA), and moderate-vigorous physical activity (MVPA) made up the composition. Outcomes included body mass index (BMI), waist circumference, HDL cholesterol, total:HDL cholesterol ratio, triglycerides, and glycated haemoglobin (HbA1c). Compositional linear regression examined associations between compositions and outcomes, including modelling time reallocation between behaviours. Results: The average daily composition of the sample (age: 53.7 ± 9.7 years; 54.7% female) was 7.7 h sleeping, 10.4 h sedentary, 3.1 h standing, 1.5 h LIPA, and 1.3 h MVPA. A greater MVPA proportion and smaller SB proportion were associated with better outcomes. Reallocating time from SB, standing, LIPA, or sleep into MVPA resulted in better scores across all outcomes. For example, replacing 30 min of SB, sleep, standing, or LIPA with MVPA was associated with −0.63 (95% confidence interval −0.48, −0.79), −0.43 (−0.25, −0.59), −0.40 (−0.25, −0.56), and −0.15 (0.05, −0.34) kg/m² lower BMI, respectively. Greater relative standing time was beneficial, whereas sleep had a detrimental association when replacing LIPA/MVPA and positive association when replacing SB. The minimal displacement of any behaviour into MVPA for improved cardiometabolic health ranged from 3.8 (HbA1c) to 12.7 (triglycerides) min/day. Conclusions: Compositional data analyses revealed a distinct hierarchy of behaviours. Moderate-vigorous physical activity demonstrated the strongest, most time-efficient protective associations with cardiometabolic outcomes. Theoretical benefits from reallocating SB into sleep, standing, or LIPA required substantial changes in daily activity.
32.	Borin, Lars, 1957, et al. (author) Here be dragons? The perils and promises of inter-resource lexical-semantic mapping 2015 In: Linköping Electronic Conference Proceedings. Semantic resources and semantic annotation for Natural Language Processing and the Digital Humanities. Workshop at NODALIDA , May 11, 13-18 2015, Vilnius. - 1650-3686 .- 1650-3740. - 9789175190495 ; 112, pp. 1-11 Conference paper (peer-reviewed) Abstract: Lexical-semantic knowledge sources are a stock item in the language technologist’s toolbox, having proved their practical worth in many and diverse natural language processing (NLP) applications. In linguistics, lexical semantics comes in many flavors, but in the NLP world, wordnets reign more or less supreme. There has been some promising work utilizing Roget-style thesauruses instead, but wider experimentation is hampered by the limited availability of such resources. The work presented here is a first step in the direction of creating a freely available Roget-style lexical resource for modern Swedish. Here, we explore methods for automatic disambiguation of inter-resource mappings with the longer-term goal of utilizing similar techniques for automatic enrichment of lexical-semantic resources.
33.	Borin, Lars, 1957, et al. (författare) Kulturomik: Att spana efter språkliga och kulturella förändringar i digitala textarkiv 2014 Ingår i: Historia i en digital värld. Tidskriftsartikel (övrigt vetenskapligt/konstnärligt)
34.	Borin, Lars, 1957, et al. (författare) Mining semantics for culturomics: towards a knowledge-based approach 2013 Ingår i: 2013 ACM International Workshop on Mining Unstructured Big Data Using Natural Language Processing, UnstructureNLP 2013, Held at 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013; San Francisco, CA; United States; 28 October 2013 through 28 October 2013. - New York, NY, USA : ACM. - 9781450324151 ; , s. 3-10 Konferensbidrag (refereegranskat)abstract The massive amounts of text data made available through the Google Books digitization project have inspired a new field of big-data textual research. Named culturomics, this field has attracted the attention of a growing number of scholars over recent years. However, initial studies based on these data have been criticized for not referring to relevant work in linguistics and language technology. This paper provides some ideas, thoughts and first steps towards a new culturomics initiative, based this time on Swedish data, which pursues a more knowledge-based approach than previous work in this emerging field. The amount of new Swedish text produced daily and older texts being digitized in cultural heritage projects grows at an accelerating rate. These volumes of text being available in digital form have grown far beyond the capacity of human readers, leaving automated semantic processing of the texts as the only realistic option for accessing and using the information contained in them. The aim of our recently initiated research program is to advance the state of the art in language technology resources and methods for semantic processing of Big Swedish text and focus on the theoretical and methodological advancement of the state of the art in extracting and correlating information from large volumes of Swedish text using a combination of knowledge-based and statistical methods.
35.	Borin, Lars, 1957, et al. (författare) Search Result Diversification Methods to Assist Lexicographers 2012 Ingår i: Proceedings of the 6th Linguistic Annotation Workshop. ; , s. 113-117 Konferensbidrag (refereegranskat)abstract We show how the lexicographic task of finding informative and diverse example sentences can be cast as a search result diversification problem, where an objective based on relevance and diversity is maximized. This problem has been studied intensively in the information retrieval community during recent years, and efficient algorithms have been devised. We finally show how the approach has been implemented in a lexicographic project, and describe the relevance and diversity functions used in that context.
36.	Borin, Lars, 1957, et al. (författare) Transferring Frames: Utilization of Linked Lexical Resources 2012 Ingår i: Proceedings of the Workshop on Inducing Linguistic Structure Submission (WILS). ; , s. 8-15 Konferensbidrag (refereegranskat)abstract In our experiment, we evaluate the transferability of frames from Swedish to Finnish in parallel corpora. We evaluate both the theoretical possibility of transferring frames and the possibility of performing it using available lexical resources. We add the frame information to an extract of the Swedish side of the Kotus and JRC-Acquis corpora using an automatic frame labeler and copy it to the Finnish side. We focus on evaluating the results to get an estimation on how often the parallel sentences can be said to express the same frame. This sheds light to the questions: Are the same situations in the two languages expressed using different frames, i.e. are the frames transferable even in theory? How well can the frame information of running text be transferred from language to another?
37.	Boutet, S., et al. (författare) High-Resolution Protein Structure Determination by Serial Femtosecond Crystallography 2012 Ingår i: Science. - : American Association for the Advancement of Science (AAAS). - 0036-8075 .- 1095-9203. ; 337:6092, s. 362-364 Tidskriftsartikel (refereegranskat)abstract Structure determination of proteins and other macromolecules has historically required the growth of high-quality crystals sufficiently large to diffract x-rays efficiently while withstanding radiation damage. We applied serial femtosecond crystallography (SFX) using an x-ray free-electron laser (XFEL) to obtain high-resolution structural information from microcrystals (less than 1 micrometer by 1 micrometer by 3 micrometers) of the well-characterized model protein lysozyme. The agreement with synchrotron data demonstrates the immediate relevance of SFX for analyzing the structure of the large group of difficult-to-crystallize molecules.
38.	Dannélls, Dana, 1976, et al. (författare) Transformer-based Swedish Semantic Role Labeling through Transfer Learning 2024 Ingår i: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 20-25 May, 2024, Torino, Italia. - Turin, Italy : ELRA and ICCL. - 2951-2093. - 9782493814104 Konferensbidrag (refereegranskat)abstract Semantic Role Labeling (SRL) is a task in natural language understanding where the goal is to extract semantic roles for a given sentence. English SRL has achieved state-of-the-art performance using Transformer techniques and supervised learning. However, this technique is not a viable choice for smaller languages like Swedish due to the limited amount of training data. In this paper, we present the first effort in building a Transformer-based SRL system for Swedish by exploring multilingual and cross-lingual transfer learning methods and leveraging the Swedish FrameNet resource. We demonstrate that multilingual transfer learning outperforms two different cross-lingual transfer models. We also found some differences between frames in FrameNet that can either hinder or enhance the model’s performance. The resulting end-to-end model is freely available and will be made accessible through Språkbanken Text’s research infrastructure.
39.	Daoud, Adel, 1981, et al. (författare) Conceptualizing Treatment Leakage in Text-based Causal Inference 2022 Ingår i: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 5638–5645, Seattle, United States. - : Association for Computational Linguistics. - 9781955917711 Konferensbidrag (refereegranskat)abstract Causal inference methods that control for text-based confounders are becoming increasingly important in the social sciences and other disciplines where text is readily available. However, these methods rely on a critical assumption that there is no treatment leakage: that is, the text only contains information about the confounder and no information about treatment assignment. When this assumption does not hold, methods that control for text to adjust for confounders face the problem of post-treatment (collider) bias. However, the assumption that there is no treatment leakage may be unrealistic in real-world situations involving text, as human language is rich and flexible. Language appearing in a public policy document or health records may refer to the future and the past simultaneously, and thereby reveal information about the treatment assignment. In this article, we define the treatment-leakage problem, and discuss the identification as well as the estimation challenges it raises. Second, we delineate the conditions under which leakage can be addressed by removing the treatment-related signal from the text in a pre-processing step we define as text distillation. Lastly, using simulation, we show how treatment leakage introduces a bias in estimates of the average treatment effect (ATE) and how text distillation can mitigate this bias.
40.	Dods, Robert, 1989, et al. (författare) From Macrocrystals to Microcrystals: A Strategy for Membrane Protein Serial Crystallography. 2017 Ingår i: Structure. - : Elsevier BV. - 1878-4186 .- 0969-2126. ; 25:9, s. 1461-1468 Tidskriftsartikel (refereegranskat)abstract Serial protein crystallography was developed at X-ray free-electron lasers (XFELs) and is now also being applied at storage ring facilities. Robust strategies for the growth and optimization of microcrystals are needed to advance the field. Here we illustrate a generic strategy for recovering high-density homogeneous samples of microcrystals starting from conditions known to yield large (macro) crystals of the photosynthetic reaction center of Blastochloris viridis (RCvir). We first crushed these crystals prior to multiple rounds of microseeding. Each cycle of microseeding facilitated improvements in the RCvir serial femtosecond crystallography (SFX) structure from 3.3-Å to 2.4-Å resolution. This approach may allow known crystallization conditions for other proteins to be adapted to exploit novel scientific opportunities created by serial crystallography.
41.	Doostmohammadi, Ehsan, 1993-, et al. (författare) Surface-Based Retrieval Reduces Perplexity of Retrieval-Augmented Language Models 2023 Ingår i: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 521–529, Toronto, Canada. - : Association for Computational Linguistics. - 9781959429715 ; 2, s. 521-529 Konferensbidrag (refereegranskat)abstract Augmenting language models with a retrieval mechanism has been shown to significantly improve their performance while keeping the number of parameters low. Retrieval-augmented models commonly rely on a semantic retrieval mechanism based on the similarity between dense representations of the query chunk and potential neighbors. In this paper, we study the state-of-the-art Retro model and observe that its performance gain is better explained by surface-level similarities, such as token overlap. Inspired by this, we replace the semantic retrieval in Retro with a surface-level method based on BM25, obtaining a significant reduction in perplexity. As full BM25 retrieval can be computationally costly for large datasets, we also apply it in a re-ranking scenario, gaining part of the perplexity reduction with minimal computational overhead.
42.	Ehrlemark, Anna, et al. (författare) Retrieving Occurrences of Grammatical Constructions 2016 Ingår i: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics : Technical Papers, December 11–17; Osaka, Japan. - 1525-2477. - 9784879747020 Konferensbidrag (refereegranskat)abstract Finding authentic examples of grammatical constructions is central in constructionist approaches to linguistics, language processing, and second language learning. In this paper, we address this problem as an information retrieval (IR) task. To facilitate research in this area, we built a benchmark collection by annotating the occurrences of six constructions in a Swedish corpus. Furthermore, we implemented a simple and flexible retrieval system for finding construction occurrences, in which the user specifies a ranking function using lexical-semantic similarities (lexicon-based or distributional). The system was evaluated using standard IR metrics on the new benchmark, and we saw that lexical-semantical rerankers improve significantly over a purely surface-oriented system, but must be carefully tailored for each individual construction.
43.	Farahani, Mehrdad, 1989, et al. (författare) An Empirical Study of Multitask Learning to Improve Open Domain Dialogue Systems 2023 Ingår i: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 347–357, Tórshavn, Faroe Islands. - : University of Tartu Library. - 1736-8197 .- 1736-6305. - 9789916219997 Konferensbidrag (refereegranskat)abstract Autoregressive models used to generate responses in open-domain dialogue systems often struggle to take long-term context into account and to maintain consistency over a dialogue. Previous research in open-domain dialogue generation has shown that the use of auxiliary tasks can introduce inductive biases that encourage the model to improve these qualities. However, most previous research has focused on encoder-only or encoder/decoder models, while the use of auxiliary tasks in decoder-only autoregressive models is under-explored. This paper describes an investigation where four different auxiliary tasks are added to small and medium-sized GPT-2 models fine-tuned on the PersonaChat and DailyDialog datasets. The results show that the introduction of the new auxiliary tasks leads to small but consistent improvement in evaluations of the investigated models.
44.	Fares, Murhaf, et al. (författare) The 2018 Shared Task on Extrinsic Parser Evaluation: On the Downstream Utility of English Universal Dependency Parsers 2018 Ingår i: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. - : Association for Computational Linguistics. Konferensbidrag (refereegranskat)abstract We summarize empirical results and tentative conclusions from the Second Extrinsic Parser Evaluation Initiative (EPE 2018). We review the basic task setup, downstream applications involved, and end-to-end results for seventeen participating parsers. Based on both quantitative and qualitative analysis, we correlate intrinsic evaluation results at different layers of morpho-syntactic analysis with observed downstream behavior.
45.	Ghanimifard, Mehdi, 1984, et al. (författare) Enriching Word-sense Embeddings with Translational Context 2015 Ingår i: Proceedings of Recent Advances in Natural Language Processing / edited by Galia Angelova, Kalina Bontcheva, Ruslan Mitkov. International Conference, Hissar, Bulgaria 7–9 September, 2015. - 1313-8502. ; , s. 208-215 Konferensbidrag (refereegranskat)abstract Vector-space models derived from corpora are an effective way to learn a representation of word meaning directly from data, and these models have many uses in practical applications. A number of unsupervised approaches have been proposed to automatically learn representations of word senses directly from corpora, but since these methods use no information but the words themselves, they sometimes miss distinctions that could be possible to make if more information were available. In this paper, we present a general framework that we call context enrichment that incorporates external information during the training of multi-sense vector-space models. Our approach is agnostic as to which external signal is used to enrich the context, but in this work we consider the use of translations as the source of enrichment. We evaluated the models trained using the translation-enriched context using several similarity benchmarks and a word analogy test set. In all our evaluations, the enriched model outperformed the purely word-based baseline soundly.
46.	Ghosh, Sucheta, et al. (författare) End-to-End Discourse Parser Evaluation 2011 Ingår i: Fifth IEEE International Conference on Semantic Computing (ICSC), 2011; September 18-21, 2011; Palo Alto, United States. - 9781457716485 Konferensbidrag (refereegranskat)abstract We are interested in the problem of discourse parsing of textual documents. We present a novel end-to-end discourse parser that, given a plain text document in input, identifies the discourse relations in the text, assigns them a semantic label and detects discourse arguments spans. The parsing architecture is based on a cascade of decisions supported by Conditional Random Fields (CRF). We train and evaluate three different parsers using the PDTB corpus. The three system versions are compared to evaluate their robustness with respect to deep/shallow and automatically extracted syntactic features.
47.	Ghosh, Sucheta, et al. (författare) Global Features for Shallow Discourse Parsing 2012 Ingår i: Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL). ; , s. 150-159 Konferensbidrag (refereegranskat)abstract A coherently related group of sentences may be referred to as a discourse. In this paper we address the problem of parsing coherence relations as defined in the Penn Discourse Tree Bank (PDTB). A good model for discourse structure analysis needs to account both for local dependencies at the token-level and for global dependencies and statistics. We present techniques on using inter-sentential or sentence-level (global), data-driven, non-grammatical features in the task of parsing discourse. The parser model follows up previous approach based on using token-level (local) features with conditional random fields for shallow discourse parsing, which is lacking in structural knowledge of discourse. The parser adopts a two-stage approach where first the local constraints are applied and then global constraints are used on a reduced weighted search space (n-best). In the latter stage we experiment with different rerankers trained on the first stage n-best parses, which are generated using lexico-syntactic local features. The two-stage parser yields significant improvements over the best performing model of discourse parser on the PDTB corpus.
48.	Ghosh, Sucheta, et al. (författare) Improving the Recall of a Discourse Parser by Constraint-based Postprocessing 2012 Ingår i: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12); Istanbul, Turkey; May 23-25. - 9782951740877 ; , s. 2791-2794 Konferensbidrag (refereegranskat)abstract We describe two constraint-based methods that can be used to improve the recall of a shallow discourse parser based on conditional random field chunking. These methods use a set of natural structural constraints as well as others that follow from the annotation guidelines of the Penn Discourse Treebank. We evaluated the resulting systems on the standard test set of the PDTB and achieved a rebalancing of precision and recall with improved F-measures across the board. This was especially notable when we used evaluation metrics taking partial matches into account; for these measures, we achieved F-measure improvements of several points.
49.	Ghosh, Sucheta, et al. (författare) Mining Fine-grained Opinion Expressions with Shallow Parsing 2013 Ingår i: Proceedings of the International Conference Recent Advances in Natural Language Processing. - 1313-8502. ; , s. 302-310 Konferensbidrag (refereegranskat)abstract Opinion analysis deals with public opinions and trends, but subjective language is highly ambiguous. In this paper, we follow a simple data-driven technique to learn fine-grained opinions. We select an intersection set of Wall Street Journal documents that is included both in the Penn Discourse Tree Bank (PDTB) and in the Multi-Perspective Question Answering (MPQA) corpus. This is done in order to explore the usefulness of discourse-level structure to facilitate the extraction of fine-grained opinion expressions. Here we perform shallow parsing of MPQA expressions with connective based discourse structure, and then also with Named Entities (NE) and some syntax features using conditional random fields; the latter feature set is basically a collection of NEs and a bundle of features that is proved to be useful in a shallow discourse parsing task. We found that both of the feature-sets are useful to improve our baseline at different levels of this fine-grained opinion expression mining task.
50.	Ghosh, Sucheta, et al. (författare) Shallow Discourse Parsing with Conditional Random Fields 2011 Ingår i: Proceedings of 5th International Joint Conference on Natural Language Processing; editors Haifeng Wang and David Yarowsky; Chiang Mai, Thailand; November 8-13, 2011. ; , s. 1071-1079 Konferensbidrag (refereegranskat)abstract Parsing discourse is a challenging natural language processing task. In this paper we take a data driven approach to identify arguments of explicit discourse connectives. In contrast to previous work we do not make any assumptions on the span of arguments and consider parsing as a token-level sequence labeling task. We design the argument segmentation task as a cascade of decisions based on conditional random fields (CRFs). We train the CRFs on lexical, syntactic and semantic features extracted from the Penn Discourse Treebank and evaluate feature combinations on the commonly used test split. We show that the best combination of features includes syntactic and semantic features. The comparative error analysis investigates the performance variability over connective types and argument positions.
