SwePub - search: WFRF:(Nivre joakim)

No.	Reference
1.	Ahlsén, Elisabeth, 1951, et al. (author). Feedback in different social activities. 2006. In: Current trends in Research on Spoken Language in the Nordic Countries ; pp. 26-44. Journal article (peer-reviewed).
2.	Bigert, Johnny, 1976- (author). Automatic and unsupervised methods in natural language processing. 2005. Doctoral thesis (other academic/artistic). Abstract: Natural language processing (NLP) means the computer-aided processing of language produced by a human. But human language is inherently irregular and the most reliable results are obtained when a human is involved in at least some part of the processing. However, manual work is time-consuming and expensive. This thesis focuses on what can be accomplished in NLP when manual work is kept to a minimum. We describe the construction of two tools that greatly simplify the implementation of automatic evaluation. They are used to implement several supervised, semi-supervised and unsupervised evaluations by introducing artificial spelling errors. We also describe the design of a rule-based shallow parser for Swedish called GTA and a detection algorithm for context-sensitive spelling errors based on semi-supervised learning, called ProbCheck. In the second part of the thesis, we first implement a supervised evaluation scheme that uses an error-free treebank to determine the robustness of a parser when faced with noisy input such as spelling errors. We evaluate the GTA parser and determine the robustness of the individual components of the parser as well as the robustness for different phrase types. Second, we create an unsupervised evaluation procedure for parser robustness. The procedure allows us to evaluate the robustness of parsers using different parser formalisms on the same text and compare their performance. Five parsers and one tagger are evaluated. For four of these, we have access to annotated material and can verify the estimations given by the unsupervised evaluation procedure. The results turned out to be very accurate with few exceptions, and thus we can reliably establish the robustness of an NLP system without any need for manual work. Third, we implement an unsupervised evaluation scheme for spell checkers. Using this, we perform a very detailed analysis of three spell checkers for Swedish. Last, we evaluate the ProbCheck algorithm. Two methods are included for comparison: a full parser and a method using tagger transition probabilities. The algorithm obtains results superior to the comparison methods. The algorithm is also evaluated on authentic data in combination with a grammar and spell checker.
3.	Borg, Markus, et al. (author). Time extraction from real-time generated football reports. 2007. In: [Host publication title missing]. - 9789985405130 ; pp. 37-43. Conference paper (peer-reviewed). Abstract: This paper describes a system to extract events and time information from football match reports generated through minute-by-minute reporting. We describe a method that uses regular expressions to find the events and divides them into different types to determine in which order they occurred. In addition, our system detects time expressions and we present a way to structure the collected data using XML.
4.	Calacean, Mihaela, et al. (author). A Data-Driven Dependency Parser for Romanian. 2009. In: Proceedings of the Seventh International Workshop on Treebanks and Linguistic Theories. - 9789078328773 ; pp. 65-76. Conference paper (peer-reviewed).
5.	Edvinsson, Marcus, 1978- (author). Towards a Framework for Static Analysis Based on Points-to Information. 2007. Licentiate thesis (other academic/artistic). Abstract: Static analysis on source code or binary code retrieves information about a software program. In object-oriented languages, static points-to analysis retrieves information about objects and how they refer to each other. The result of the points-to analysis is traditionally used to perform optimizations in compilers, such as static resolution of polymorphic calls, and dead-code elimination. More advanced optimizations have been suggested specifically for Java, such as synchronization removal and stack-allocation of objects. Recently, software engineering tools using points-to analysis have appeared, aiming to help the developer understand and debug software. Altogether, there is a great variety of tools that use or could use points-to analysis, both from academia and from industry. We aim to construct a framework that supports the development of new clients of points-to analysis results and the improvement of existing ones. We present two client analyses and investigate their similarities and differences. The client analyses are the escape analysis and the side-effects analysis. The similarities refer to data structures and basic algorithms that both depend on. The differences are found in the way the two analyses use the data structures and the basic algorithms. In order to reuse these in a framework, a specification language is needed to reflect the differences. The client analyses are implemented, with shared data structures and basic algorithms, but do not use a separate specification language. The framework is evaluated against three goal criteria: development speed, analysis precision, and analysis speed. The development speed is ranked as most important, and the two latter are considered equally important. Thereafter we present related work and discuss it with respect to the goal criteria. The evaluation of the framework is done in two separate experiments. The first experiment evaluates development speed and shows that the framework enables higher development speed compared to not using the framework. The second experiment evaluates the precision and the speed of the analyses and it shows that the different precisions in the points-to analysis are reflected in the precisions of the client analyses. It also shows that there is a trade-off between analysis precision and analysis speed to consider when choosing analysis precision. Finally, we discuss four alternative ways to continue the research towards a doctoral thesis.
6.	Eryigit, Gülsen, et al. (author). Dependency Parsing of Turkish. 2008. In: Computational Linguistics. - MIT Press, Cambridge, MA. - 0891-2017 ; 34:3, pp. 357-389. Journal article (peer-reviewed).
7.	Eryigit, Gülsen, et al. (author). Dependency Parsing of Turkish. 2008. In: Computational linguistics - Association for Computational Linguistics (Print). - 0891-2017, 1530-9312 ; 34:3, pp. 357-389. Journal article (peer-reviewed). Abstract: The suitability of different parsing methods for different languages is an important topic in syntactic parsing. Especially lesser-studied languages, typologically different from the languages for which methods have originally been developed, pose interesting challenges in this respect. This article presents an investigation of data-driven dependency parsing of Turkish, an agglutinative, free constituent order language that can be seen as the representative of a wider class of languages of similar type. Our investigations show that morphological structure plays an essential role in finding syntactic relations in such a language. In particular, we show that employing sublexical units called inflectional groups, rather than word forms, as the basic parsing units improves parsing accuracy. We test our claim on two different parsing methods, one based on a probabilistic model with beam search and the other based on discriminative classifiers and a deterministic parsing strategy, and show that the usefulness of sublexical units holds regardless of the parsing method. We examine the impact of morphological and lexical information in detail and show that, properly used, this kind of information can improve parsing accuracy substantially. Applying the techniques presented in this article, we achieve the highest reported accuracy for parsing the Turkish Treebank.
8.	Fishel, Mark, et al. (author). Voting and Stacking in Data-Driven Dependency Parsing. 2009. In: Proceedings of the 17th Nordic Conference on Computational Linguistics. - Tartu : Tartu University Library ; pp. 219-222. Conference paper (peer-reviewed).
9.	Granfeldt, Jonas, et al. (author). Evaluating stages of development in second language French: A machine-learning approach. 2007. In: NODALIDA 2007 PROCEEDINGS. - 9789985405147. Conference paper (peer-reviewed). Abstract: This paper describes a system to define and evaluate development stages in second language French. The identification of such stages can be formulated as determining the frequency of some lexical and grammatical features in the learners’ production and how they vary over time. The problems in this procedure are threefold: identify the relevant features, decide on cutoff points for the stages, and evaluate the degree of success of the model. The system addresses these three problems. It consists of a morphosyntactic analyzer called Direkt Profil and a machine-learning module connected to it. We first describe the usefulness and rationale behind its development. We then present the corpus we used to develop the analyzer. Finally, we present new and substantially improved results on training machine-learning classifiers compared to previous experiments (Granfeldt et al., 2006). We also introduce a method to select attributes in order to identify the most relevant grammatical features.
10.	Grönqvist, Leif, 1969- (author). Exploring Latent Semantic Vector Models Enriched With N-grams. 2006. Doctoral thesis (other academic/artistic). Abstract: This thesis deals with a kind of vector space model that I call the "Latent Semantic Vector Model", or LSVM, built with the technique "Latent Semantic Indexing". An LSVM has many applications, but I have primarily looked at one direct application: document retrieval. What an LSVM can add to document retrieval is the ability to search for content rather than specific search terms. Using an LSVM in a document retrieval system has been shown to improve the quality of the returned document lists, making it easier for users to find the information they are looking for. The problem addressed in this work is that an LSVM normally contains only single words, whereas the terms people search for are often multi-word expressions. I have tried training models configured in different ways with respect to parameters such as training data, vocabulary, matrix size, context size and, not least, different ways of getting multi-word expressions directly into the models. The aim has been to determine how the performance of an LSVM is affected when moving from a word-based model to one that contains both words and multi-word expressions. To measure the change, two evaluation methods were used: synonym tests and document retrieval. The synonym testing was done for Swedish and the document retrieval for Swedish and English. The results improve for synonym testing but deteriorate for document retrieval. For English document retrieval the change is not significant. The work has also resulted in two new resources that are very useful for evaluating several types of models: the evaluation set SweHP560, containing 560 Swedish synonym questions from the Swedish Scholastic Aptitude Test (Högskoleprovet), and the new measures RankEff and WRS for evaluating document retrieval systems, which handle the problem of incomplete relevance judgments in evaluation data better than existing measures such as MAP and bpref.
11.	Hajic, Jan, et al. (author). The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages. 2009. In: Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009). - Association for Computational Linguistics. - 9781932432299 ; pp. 1-18. Conference paper (peer-reviewed).
12.	Hall, Johan, et al. (author). A Dependency-Driven Parser for German Dependency and Constituency Representations. 2008. In: Proceedings of the ACL-08: HLT Workshop on Parsing German (PaGe-08). - Stroudsburg, PA : Association for Computational Linguistics (ACL), Stroudsburg. - 9781932432152 ; pp. 47-54. Conference paper (peer-reviewed). Abstract: We present a dependency-driven parser that parses both dependency structures and constituent structures. Constituency representations are automatically transformed into dependency representations with complex arc labels, which makes it possible to recover the constituent structure with both constituent labels and grammatical functions. We report a labeled attachment score close to 90% for dependency versions of the TIGER and TüBa-D/Z treebanks. Moreover, the parser is able to recover both constituent labels and grammatical functions with an F-Score over 75% for TüBa-D/Z and over 65% for TIGER.
13.	Hall, Johan, et al. (author). A generic architecture for data-driven dependency parsing. 2005. In: Proceedings of the 15th Nordic Conference of Computational Linguistics (NODALIDA). - University of Joensuu electronic publications in linguistics and language technology, Joensuu. - 9524587718 ; pp. 47-56. Conference paper (peer-reviewed). Abstract: We present a software architecture for data-driven dependency parsing of unrestricted natural language text, which achieves a strict modularization of parsing algorithm, feature model and learning method such that these parameters can be varied independently. The design has been realized in MaltParser, which supports several parsing algorithms and learning methods, for which complex feature models can be defined in a special description language.
14.	Hall, Johan, et al. (author). A Hybrid Constituency-Dependency Parser for Swedish. 2007. In: Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA) ; pp. 284–287. Conference paper (peer-reviewed). Abstract: We present a data-driven parser that derives both constituent structures and dependency structures, alone or in combination, in one and the same process. When trained and tested on data from the Swedish treebank Talbanken05, the parser achieves a labeled dependency accuracy of 82% and a labeled bracketing F-score of 75%.
15.	Hall, Johan, et al. (author). Discriminative Classifiers for Deterministic Dependency Parsing. 2006. In: Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics and 21st International Conference on Computational Linguistics (COLING-ACL 2006), July 17-21, 2006, Sydney, Australia. - Association for Computational Linguistics, Stroudsburg. - 1932432655 ; pp. 316-323. Conference paper (peer-reviewed). Abstract: Deterministic parsing guided by treebank-induced classifiers has emerged as a simple and efficient alternative to more complex models for data-driven parsing. We present a systematic comparison of memory-based learning (MBL) and support vector machines (SVM) for inducing classifiers for deterministic dependency parsing, using data from Chinese, English and Swedish, together with a variety of different feature models. The comparison shows that SVM gives higher accuracy for richly articulated feature models across all languages, albeit with considerably longer training times. The results also confirm that classifier-based deterministic parsing can achieve parsing accuracy very close to the best results reported for more complex parsing models.
16.	Hall, Johan, et al. (author). Discriminative learning for data-driven dependency parsing. 2006. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics. Conference paper (peer-reviewed).
17.	Hall, Johan, 1973- (author). MaltParser -- An Architecture for Inductive Labeled Dependency Parsing. 2006. Licentiate thesis (other academic/artistic). Abstract: This licentiate thesis presents a software architecture for inductive labeled dependency parsing of unrestricted natural language text, which achieves a strict modularization of parsing algorithm, feature model and learning method such that these parameters can be varied independently. The architecture is based on the theoretical framework of inductive dependency parsing by Nivre (2006) and has been realized in MaltParser, a system that supports several parsing algorithms and learning methods, for which complex feature models can be defined in a special description language. Special attention is given in this thesis to learning methods based on support vector machines (SVM). The implementation is validated in three sets of experiments using data from three languages (Chinese, English and Swedish). First, we check if the implementation realizes the underlying architecture. The experiments show that the MaltParser system outperforms the baseline and satisfies the basic constraints of well-formedness. Furthermore, the experiments show that it is possible to vary parsing algorithm, feature model and learning method independently. Secondly, we focus on the special properties of the SVM interface. It is possible to reduce the learning and parsing time without sacrificing accuracy by dividing the training data into smaller sets, according to the part-of-speech of the next token in the current parser configuration. Thirdly, the last set of experiments presents a broad empirical study that compares SVM to memory-based learning (MBL) with five different feature models, where all combinations have gone through parameter optimization for both learning methods. The study shows that SVM outperforms MBL for more complex and lexicalized feature models with respect to parsing accuracy. There are also indications that SVM, with a splitting strategy, can achieve faster parsing than MBL. The parsing accuracy achieved is the highest reported for the Swedish data set and very close to the state of the art for Chinese and English.
18.	Hall, Johan, et al. (author). Parsing Discontinuous Phrase Structure with Grammatical Functions. 2008. In: Advances in Natural Language Processing. - Berlin, Heidelberg : Springer Berlin / Heidelberg. - 9783540852865 ; pp. 169-180. Conference paper (peer-reviewed). Abstract: This paper presents a novel technique for parsing discontinuous phrase structure representations, labeled with both phrase labels and grammatical functions. Phrase structure representations are transformed into dependency representations with complex edge labels, which makes it possible to induce a dependency parser model that recovers the phrase structure with both phrase labels and grammatical functions. We perform an evaluation on the German TIGER treebank and the Swedish Talbanken05 treebank and report competitive results for both data sets.
19.	Hall, Johan, et al. (author). Parsing Discontinuous Phrase Structure with Grammatical Functions. 2008. In: Advances in Natural Language Processing. - Berlin / Heidelberg : Springer. - 9783540852865 ; pp. 169-180. Conference paper (peer-reviewed). Abstract: This paper presents a novel technique for parsing discontinuous phrase structure representations, labeled with both phrase labels and grammatical functions. Phrase structure representations are transformed into dependency representations with complex edge labels, which makes it possible to induce a dependency parser model that recovers the phrase structure with both phrase labels and grammatical functions. We perform an evaluation on the German TIGER treebank and the Swedish Talbanken05 treebank and report competitive results for both data sets.
20.	Hall, Johan, et al. (author). Single Malt or Blended? A Study in Multilingual Parser Optimization. 2007. In: Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007. - Association for Computational Linguistics ; pp. 933–939. Conference paper (peer-reviewed). Abstract: We describe a two-stage optimization of the MaltParser system for the ten languages in the multilingual track of the CoNLL 2007 shared task on dependency parsing. The first stage consists in tuning a single-parser system for each language by optimizing parameters of the parsing algorithm, the feature model, and the learning algorithm. The second stage consists in building an ensemble system that combines six different parsing strategies, extrapolating from the optimal parameter settings for each language. When evaluated on the official test sets, the ensemble system significantly outperforms the single-parser system and achieves the highest average labeled attachment score.
21.	Hall, Johan, 1973- (author). Transition-Based Natural Language Parsing with Dependency and Constituency Representations. 2008. Doctoral thesis (other academic/artistic). Abstract: This doctoral thesis investigates different aspects of automatic syntactic analysis of natural language text. A parser or syntactic analyzer, as we define it in this thesis, has the task of producing a syntactic analysis for every sentence in a natural language text. Our method is data-driven, which means that it relies on machine learning from annotated natural language data sets, known as corpora. Our method is also dependency-based, which means that parsing is a process that builds a dependency graph for each sentence, consisting of binary relations between words. In addition, the thesis introduces a new method for encoding phrase structures, another form of syntactic representation, as dependency graphs that can be decoded without losing any information in the phrase structure. This method makes it possible to use a dependency-based parser to parse phrase structures. The thesis is based on five papers, of which three explore different aspects of machine learning for data-driven dependency parsing and two investigate the method for dependency-based phrase structure parsing. The first paper presents our first large-scale empirical study of parsing natural language (in this case Swedish) with dependency representations. A transition-based deterministic parsing algorithm creates a dependency graph for each sentence by deriving a sequence of transitions, and memory-based learning (MBL) is used to predict the transition sequence. The second paper further investigates how machine learning can be used to guide a transition-based dependency parser. The empirical study compares two machine learning methods with five feature models for three languages (Chinese, English and Swedish), and shows that support vector machines (SVM) with lexicalized feature models are better suited than MBL for guiding a transition-based dependency parser. The third paper summarizes our experience of optimizing MaltParser, our implementation of transition-based dependency parsing, for a large number of languages. MaltParser has been used to parse over twenty different languages and was among the top systems in the CoNLL evaluations of 2006 and 2007. The fourth paper is our first investigation of dependency-based phrase structure parsing, with competitive results for parsing German. The fifth and final paper introduces an improved algorithm for transforming phrase structures into dependency graphs and back, which makes it possible to parse continuous and discontinuous phrase structures extended with grammatical functions.
22.	Hjelm, Hans, 1973- (author). Cross-language Ontology Learning : Incorporating and Exploiting Cross-language Data in the Ontology Learning Process. 2009. Doctoral thesis (other academic/artistic). Abstract: An ontology is a knowledge-representation structure, where words, terms or concepts are defined by their mutual hierarchical relations. Ontologies are becoming ever more prevalent in the world of natural language processing, where we currently see a tendency towards using semantics for solving a variety of tasks, particularly tasks related to information access. Ontologies, taxonomies and thesauri (all related notions) are also used in various variants by humans, to standardize business transactions or for finding conceptual relations between terms in, e.g., the medical domain. The acquisition of machine-readable, domain-specific semantic knowledge is time-consuming and prone to inconsistencies. The field of ontology learning therefore provides tools for automating the construction of domain ontologies (ontologies describing the entities and relations within a particular field of interest), by analyzing large quantities of domain-specific texts. This thesis studies three main topics within the field of ontology learning. First, we examine which sources of information are useful within an ontology learning system and how the information sources can be combined effectively. Secondly, we do this with a special focus on cross-language text collections, to see if we can learn more from studying several languages at once than we can from a single-language text collection. Finally, we investigate new approaches to formal and automatic evaluation of the quality of a learned ontology. We demonstrate how to combine information sources from different languages and use them to train automatic classifiers to recognize lexico-semantic relations. The cross-language data is shown to have a positive effect on the quality of the learned ontologies. We also give theoretical and experimental results, showing that our ontology evaluation method is a good complement to and in some aspects improves on the evaluation measures in use today.
23.	Johansson, Richard, et al. (author). Extended constituent-to-dependency conversion for English. 2007. In: NODALIDA 2007 Proceedings. - 9789985405147 ; pp. 105-112. Conference paper (peer-reviewed). Abstract: We describe a new method to convert English constituent trees using the Penn Treebank annotation style into dependency trees. The new format was inspired by annotation practices used in other dependency treebanks with the intention to produce a better interface to further semantic processing than existing methods. In particular, we used a richer set of edge labels and introduced links to handle long-distance phenomena such as wh-movement and topicalization. The resulting trees generally have a more complex dependency structure. For example, 6% of the trees contain at least one nonprojective link, which is difficult for many parsing algorithms. As can be expected, the more complex structure and the enriched set of edge labels make the trees more difficult to predict, and we observed a decrease in parsing accuracy when applying two dependency parsers to the new corpus. However, the richer information contained in the new trees resulted in a 23% error reduction in a baseline FrameNet semantic role labeler that relied on dependency arc labels only.
24.	Lavelli, Alberto, et al. (author). MaltParser at the EVALITA 2009 Dependency Parsing Task. 2009. In: Proceedings of EVALITA 2009. Conference paper (peer-reviewed).
25.	Lind, Mikael, et al. (author). The Role of Pragmatic Frameworks in Information Systems Research. 2007. In: Communication – Action - Meaning. A Festschrift to Jens Allwood. - Department of Linguistics, Göteborg University, Sweden ; pp. 173-190. Book chapter (other academic/artistic).
26.	Marinov, Svetoslav, et al. (author). A Data-Driven Dependency Parser for Bulgarian. 2005. In: Proceedings of the 4th Workshop on Treebanks and Linguistic Theories (TLT-05) ; pp. 89-100. Conference paper (peer-reviewed).
27.	Marinov, Svetoslav, et al. (author). A data-driven parser for Bulgarian. 2005. In: Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories. Conference paper (peer-reviewed).
28.	Megyesi, Beata, 1971-, et al. (author). Supporting Research Environment for Less Explored Languages : A Case Study of Swedish and Turkish. 2008. In: Resourceful Language Technology. - Uppsala : Acta Universitatis Upsaliensis, Uppsala. - 9789155472269 ; pp. 96-110. Book chapter (popular science, debate, etc.).
29.	Megyesi, Beata, et al. (author). Supporting Research Environment for Swedish and Turkish. 2008. Report (popular science, debate, etc.).
30.	Megyesi, Beáta, et al. (author). Swedish-Turkish Parallel Treebank. 2008. In: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC). - ELRA, Paris ; pp. 470-473. Conference paper (peer-reviewed).
31.	Megyesi, Beata, et al. (author). Swedish-Turkish Parallel Treebank. 2008. In: Proceedings of the Sixth International Language Resources and Evaluation (LREC'08). - Paris : European Language Resources Association (ELRA). Conference paper (peer-reviewed). Abstract: In this paper, we describe our work on building a parallel treebank for a less studied and typologically dissimilar language pair, namely Swedish and Turkish. The treebank is a balanced syntactically annotated corpus containing both fiction and technical documents. In total, it consists of approximately 160,000 tokens in Swedish and 145,000 in Turkish. The texts are linguistically annotated using different layers, from part-of-speech tags and morphological features to dependency annotation. Each layer is automatically processed by using basic language resources for the involved languages. The sentences and words are aligned, and partly manually corrected. We create the treebank by reusing and adjusting existing tools for the automatic annotation, alignment, and their correction and visualization. The treebank was developed within the project Supporting research environment for minor languages, which aims to create representative language resources for language pairs dissimilar in language structure. Therefore, effort is put into developing a general method for formatting and annotation, as well as using tools that can easily be applied to other language pairs.
32.	Nilsson, Jens, et al. (author). Dependency Parsing by Transformation and Combination. 2008. In: 6th International Conference on Natural Language Processing, GoTAL 2008. - Berlin / Heidelberg : Springer, Gothenburg, Sweden ; pp. 348–359. Conference paper (peer-reviewed). Abstract: This study presents new language- and treebank-independent graph transformations that improve accuracy in data-driven dependency parsing. We show that individual generic graph transformations can increase accuracy across treebanks, but especially when they are combined using established parser combination techniques. The combination experiments also indicate that the presumed best way to combine parsers, using the highest scoring parsers, is not necessarily the best approach.
33.	Nilsson, Jens, et al. (author). Generalizing Tree Transformations for Inductive Dependency Parsing. 2007. In: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. - Association for Computational Linguistics ; pp. 968–975. Conference paper (peer-reviewed). Abstract: Previous studies in data-driven dependency parsing have shown that tree transformations can improve parsing accuracy for specific parsers and data sets. We investigate to what extent this can be generalized across languages/treebanks and parsers, focusing on pseudo-projective parsing, as a way of capturing non-projective dependencies, and transformations used to facilitate parsing of coordinate structures and verb groups. The results indicate that the beneficial effect of pseudo-projective parsing is independent of parsing strategy but sensitive to language- or treebank-specific properties. By contrast, the construction-specific transformations appear to be more sensitive to parsing strategy but have a constant positive effect over several languages.
34.	Nilsson, Jens, et al. (author). Graph Transformations in Data-Driven Dependency Parsing. 2006. In: Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics and 21st International Conference on Computational Linguistics (COLING-ACL 2006), July 17-21, 2006, Sydney, Australia. - Association for Computational Linguistics, Stroudsburg. - 1932432655 ; pp. 257-264. Conference paper (peer-reviewed). Abstract: Transforming syntactic representations in order to improve parsing accuracy has been exploited successfully in statistical parsing systems using constituency-based representations. In this paper, we show that similar transformations can give substantial improvements also in data-driven dependency parsing. Experiments on the Prague Dependency Treebank show that systematic transformations of coordinate structures and verb groups result in a 10% error reduction for a deterministic data-driven dependency parser. Combining these transformations with previously proposed techniques for recovering nonprojective dependencies leads to state-of-the-art accuracy for the given data set.
35.	Nilsson, Jens, et al. (author) Graph transformations in data-driven dependency parsing 2006 In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics. Conference paper (peer-reviewed)
36.	Nilsson, Jens, et al. (author) MaltEval: An Evaluation and Visualization Tool for Dependency Parsing 2008 In: Proceedings of the Sixth International Language Resources and Evaluation (LREC'08). - Paris : Marrakech, Morocco. Conference paper (peer-reviewed) Abstract: This paper presents a freely available evaluation tool for dependency parsing, MaltEval (http://w3.msi.vxu.se/users/jni/malteval). It is flexible and extensible, and provides functionality for both quantitative evaluation and visualization of dependency structure. The quantitative evaluation is compatible with other standard evaluation software for dependency structure which does not produce visualization of dependency structure, and can output more details as well as new types of evaluation metrics. In addition, MaltEval has generic support for confusion matrices. It can also produce statistical significance tests when more than one parsed file is specified. The visualization module also has the ability to highlight discrepancies between the gold-standard files and the parsed files, and it comes with easy-to-use GUI functionality for searching the dependency structure of the input files.
37.	Nilsson, Jens, et al. (author) MAMBA Meets TIGER: Reconstructing a Swedish Treebank from Antiquity 2005 In: Proceedings from the special session on treebanks at NODALIDA 2005. - : Samfundslitteratur Press. - 8759312521 ; , pp. 119-132 Conference paper (peer-reviewed) Abstract: In this paper, we will give an overview of the reconstruction process of the Swedish treebank Talbanken, created in the first half of the 1970s. Talbanken contains both written and spoken material, both encoded in the MAMBA format. The goal has been to construct two new versions of the original data, one based on phrase structure and one on dependency structure. The outcome of the reconstruction, i.e. different versions of Talbanken, is available for non-commercial research and educational purposes.
38.	Nilsson, Jens, et al. (author) MAMBA meets TIGER: Reconstructing a Swedish treebank from antiquity 2005 In: Proceedings from the special session on treebanks at NODALIDA 2005. Conference paper (peer-reviewed)
39.	Nilsson, Jens, et al. (author) Parsing Formal Languages using Natural Language Parsing Techniques. 2009 In: Proceedings of the 11th International Conference on Parsing Technologies (IWPT). - : Association for Computational Linguistics. ; , pp. 49-60 Conference paper (peer-reviewed)
40.	Nilsson, Jens, 1979- (author) Transformation and Combination in Data-Driven Dependency Parsing 2009 Doctoral thesis (other academic/artistic) Abstract: This thesis deals with automatic syntactic analysis of natural language text, also known as parsing. The parsing approach is data-driven, which means that parsers are constructed by means of machine learning, looking at training data in the form of annotated natural language sentences. The syntactic framework used in the thesis is dependency-based. Robustness is one of the characteristics of the data-driven approaches investigated here. The overall aim of this thesis is to maintain robustness while increasing accuracy. The content of the thesis falls naturally into two tracks, a transformation track and a combination track. The first type of transformation investigated is called pseudo-projective, because it enables strictly projective dependency parsers to recover non-projective dependency relations. Informally, a non-projective dependency tree contains crossing binary directed relations, when drawn above the sentence. Experimental results show that pseudo-projective transformations can improve accuracy significantly for a range of languages. The second type of transformation aims to facilitate the processing of specific linguistic constructions such as coordination and verb groups. Experimental results again show a positive effect on parsing accuracy for several languages, often greater than for the pseudo-projective transformations. However, the improvement of the transformations depends on the internal structure of the base parser, which is not the case for the pseudo-projective transformations. The combination track compares various approaches for combining data-driven dependency parsers, again as a means of improving accuracy.
As different parsers have different strengths and weaknesses, making parsers collaborate in order to find one single syntactic analysis may result in higher accuracy than any of the syntactic analyzers can produce by itself. The experimental results show that accuracy improves across languages, given that appropriate parsers are combined. The thesis ends with an attempt to combine the two tracks, showing that combining parsers with different tree transformations also increases accuracy. Moreover, this experiment indicates that high diversity among a small set of parsers is much more important than a large number of parsers with low diversity.
41.	Nilsson, Jens, 1979- (author) Tree Transformations in Inductive Dependency Parsing 2007 Licentiate thesis (other academic/artistic) Abstract: This licentiate thesis deals with automatic syntactic analysis, or parsing, of natural languages. A parser constructs the syntactic analysis, which it learns by looking at correctly analyzed sentences, known as training data. The general topic concerns manipulations of the training data in order to improve the parsing accuracy. Several studies using constituency-based theories for natural languages in such automatic and data-driven syntactic parsing have shown that training data, annotated according to a linguistic theory, often needs to be adapted in various ways in order to achieve an adequate, automatic analysis. A linguistically sound constituent structure is not necessarily well-suited for learning and parsing using existing data-driven methods. Modifications to the constituency-based trees in the training data, and corresponding modifications to the parser output, have successfully been applied to increase the parser accuracy. The topic of this thesis is to investigate whether similar modifications in the form of tree transformations to training data, annotated with dependency-based structures, can improve accuracy for data-driven dependency parsers. In order to do this, two types of tree transformations are in focus in this thesis. The first one concerns non-projectivity. The full potential of dependency parsing can only be realized if non-projective constructions are allowed, which pose a problem for projective dependency parsers. On the other hand, non-projective parsers tend, among other things, to be slower. In order to maintain the benefits of projective parsing, a tree transformation technique to recover non-projectivity while using a projective parser is presented here.
The second type of transformation concerns linguistic phenomena that are possible but hard for a parser to learn, given a certain choice of dependency analysis. This study has concentrated on two such phenomena, coordination and verb groups, for which tree transformations are applied in order to improve parsing accuracy, in case the original structure does not coincide with a structure that is easy to learn. Empirical evaluations are performed using treebank data from various languages, and using more than one dependency parser. The results show that the benefit of these tree transformations used in preprocessing and postprocessing to a large extent is language, treebank and parser independent.
42.	Nilsson, Jens, 1979- (author) Tree Transformations in Inductive Dependency Parsing 2007 Licentiate thesis (other academic/artistic) Abstract: This licentiate thesis deals with automatic syntactic analysis, or parsing, of natural languages. A parser constructs the syntactic analysis, which it learns by looking at correctly analyzed sentences, known as training data. The general topic concerns manipulations of the training data in order to improve the parsing accuracy. Several studies using constituency-based theories for natural languages in such automatic and data-driven syntactic parsing have shown that training data, annotated according to a linguistic theory, often needs to be adapted in various ways in order to achieve an adequate, automatic analysis. A linguistically sound constituent structure is not necessarily well-suited for learning and parsing using existing data-driven methods. Modifications to the constituency-based trees in the training data, and corresponding modifications to the parser output, have successfully been applied to increase the parser accuracy. The topic of this thesis is to investigate whether similar modifications in the form of tree transformations to training data, annotated with dependency-based structures, can improve accuracy for data-driven dependency parsers. In order to do this, two types of tree transformations are in focus in this thesis. The first one concerns non-projectivity. The full potential of dependency parsing can only be realized if non-projective constructions are allowed, which pose a problem for projective dependency parsers. On the other hand, non-projective parsers tend, among other things, to be slower. In order to maintain the benefits of projective parsing, a tree transformation technique to recover non-projectivity while using a projective parser is presented here.
The second type of transformation concerns linguistic phenomena that are possible but hard for a parser to learn, given a certain choice of dependency analysis. This study has concentrated on two such phenomena, coordination and verb groups, for which tree transformations are applied in order to improve parsing accuracy, in case the original structure does not coincide with a structure that is easy to learn. Empirical evaluations are performed using treebank data from various languages, and using more than one dependency parser. The results show that the benefit of these tree transformations used in preprocessing and postprocessing to a large extent is language, treebank and parser independent.
43.	Nilsson, Mattias, et al. (author) Learning Where to Look : Modeling Eye Movements in Reading 2009 In: Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL). - : Association for Computational Linguistics. - 9781932432299 ; , pp. 93-101 Conference paper (peer-reviewed)
44.	Nivre, Joakim, et al. (author) A Dependency-Based Conversion of PropBank 2007 In: Proceedings of FRAME 2007: Building Frame Semantics Resources for Scandinavian and Baltic Languages. ; , pp. 19-25 Conference paper (peer-reviewed) Abstract: As a prerequisite for the investigation of dependency-based methods for semantic role labeling, this paper describes the creation of a dependency-based version of the widely used PropBank, DepPropBank, and discusses some of the issues involved in the integration of syntactic and semantic dependency structures.
45.	Nivre, Joakim, et al. (author) A Dependency-Based Conversion of PropBank 2007 In: Proceedings of FRAME 2007: Building Frame Semantics Resources for Scandinavian and Baltic Languages. ; , pp. 19-25 Conference paper (peer-reviewed)
46.	Nivre, Joakim, et al. (author) A generic architecture for data-driven dependency parsing 2005 In: Proceedings of the 15th NODALIDA Conference. Conference paper (peer-reviewed)
47.	Nivre, Joakim, et al. (author) A Hybrid Constituency-Dependency Parser for Swedish 2007 In: Proceedings of the 16th Nordic Conference of Computational Linguistics. ; , pp. 284-287 Conference paper (peer-reviewed)
48.	Nivre, Joakim (author) Algorithms for Deterministic Incremental Dependency Parsing 2008 In: Computational Linguistics. - : MIT Press, Cambridge, MA. - 0891-2017. ; 34:4, pp. 513-553 Journal article (peer-reviewed)
49.	Nivre, Joakim, 1962- (author) Algorithms for Deterministic Incremental Dependency Parsing 2008 In: Computational linguistics - Association for Computational Linguistics (Print). - 0891-2017 .- 1530-9312. ; 34:4, pp. 513-553 Journal article (peer-reviewed) Abstract: Parsing algorithms that process the input from left to right and construct a single derivation have often been considered inadequate for natural language parsing because of the massive ambiguity typically found in natural language grammars. Nevertheless, it has been shown that such algorithms, combined with treebank-induced classifiers, can be used to build highly accurate disambiguating parsers, in particular for dependency-based syntactic representations. In this article, we first present a general framework for describing and analyzing algorithms for deterministic incremental dependency parsing, formalized as transition systems. We then describe and analyze two families of such algorithms: stack-based and list-based algorithms. In the former family, which is restricted to projective dependency structures, we describe an arc-eager and an arc-standard variant; in the latter family, we present a projective and a non-projective variant. For each of the four algorithms, we give proofs of correctness and complexity. In addition, we perform an experimental evaluation of all algorithms in combination with SVM classifiers for predicting the next parsing action, using data from thirteen languages. We show that all four algorithms give competitive accuracy, although the non-projective list-based algorithm generally outperforms the projective algorithms for languages with a non-negligible proportion of non-projective constructions. However, the projective algorithms often produce comparable results when combined with the technique known as pseudo-projective parsing.
The linear time complexity of the stack-based algorithms gives them an advantage with respect to efficiency both in learning and in parsing, but the projective list-based algorithm turns out to be equally efficient in practice. Moreover, when the projective algorithms are used to implement pseudo-projective parsing, they sometimes become less efficient in parsing (but not in learning) than the non-projective list-based algorithm. Although most of the algorithms have been partially described in the literature before, this is the first comprehensive analysis and evaluation of the algorithms within a unified framework.
50.	Nivre, Joakim, 1962-, et al. (author) An Improved Oracle for Dependency Parsing with Online Reordering 2009 In: Proceedings of the 11th International Conference on Parsing Technologies (IWPT). - Stroudsburg, PA, USA : Association for Computational Linguistics. ; , pp. 73-76 Conference paper (peer-reviewed) Abstract: We present an improved training strategy for dependency parsers that use online reordering to handle non-projective trees. The new strategy improves both efficiency and accuracy by reducing the number of swap operations performed on non-projective trees by up to 80%. We present state-of-the-art results for five languages with the best ever reported results for Czech.
