SwePub
Search the SwePub database


Hit list for the search "WFRF:(Kulmizev Artur)"

Search: WFRF:(Kulmizev Artur)

  • Results 1-10 of 10
1.
  • Abdou, Mostafa, et al. (author)
  • Higher-order Comparisons of Sentence Encoder Representations
  • 2019
  • In: 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019). - : Association for Computational Linguistics (ACL). - 9781950737901 ; pp. 5838-5845
  • Conference paper (peer-reviewed). Abstract:
    • Representational Similarity Analysis (RSA) is a technique developed by neuroscientists for comparing activity patterns of different measurement modalities (e.g., fMRI, electrophysiology, behavior). As a framework, RSA has several advantages over existing approaches to interpretation of language encoders based on probing or diagnostic classification: namely, it does not require large training samples, is not prone to overfitting, and it enables a more transparent comparison between the representational geometries of different models and modalities. We demonstrate the utility of RSA by establishing a previously unknown correspondence between widely-employed pre-trained language encoders and human processing difficulty via eye-tracking data, showcasing its potential in the interpretability toolbox for neural models.
  •  
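The RSA comparison described in the abstract above can be sketched in a few lines. This is an illustrative toy, not the paper's code: the function names, the cosine-based dissimilarity, and the use of Pearson correlation (published RSA work often prefers rank correlations such as Spearman's) are all choices of this sketch.

```python
import numpy as np

def rsa_score(reps_a, reps_b):
    """Compare two sets of representations of the same n items.

    Builds an n x n representational dissimilarity matrix (RDM) for each
    set (1 - cosine similarity), then correlates the upper triangles of
    the two RDMs. High correlation = similar representational geometry.
    """
    def rdm(x):
        x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-length rows
        return 1.0 - x @ x.T                              # cosine dissimilarity
    iu = np.triu_indices(len(reps_a), k=1)                # off-diagonal upper triangle
    return float(np.corrcoef(rdm(reps_a)[iu], rdm(reps_b)[iu])[0, 1])

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 8))            # e.g. 20 sentences, 8-dim encodings
y = x + 0.1 * rng.normal(size=x.shape)  # a slightly perturbed "second model"
score_same = rsa_score(x, x)            # identical geometry -> 1.0
score_pert = rsa_score(x, y)            # similar geometry -> close to 1.0
print(score_same, score_pert)
```

Because only the item-by-item similarity structure is compared, the two representation sets may come from entirely different modalities (e.g. model activations vs. eye-tracking measures), which is what makes the framework applicable here.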
2.
  • Abdou, Mostafa, et al. (author)
  • Word Order Does Matter (And Shuffled Language Models Know It)
  • 2022
  • In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1. - : Association for Computational Linguistics. - 9781955917216 ; pp. 6907-6919
  • Conference paper (peer-reviewed). Abstract:
    • Recent studies have shown that language models pretrained and/or fine-tuned on randomly permuted sentences exhibit competitive performance on GLUE, putting into question the importance of word order information. Somewhat counter-intuitively, some of these studies also report that position embeddings appear to be crucial for models' good performance with shuffled text. We probe these language models for word order information and investigate what position embeddings learned from shuffled text encode, showing that these models retain information pertaining to the original, naturalistic word order. We show this is in part due to a subtlety in how shuffling is implemented in previous work - before rather than after subword segmentation. Surprisingly, we find that even language models trained on text shuffled after subword segmentation retain some semblance of information about word order because of the statistical dependencies between sentence length and unigram probabilities. Finally, we show that beyond GLUE, a variety of language understanding tasks do require word order information, often to an extent that cannot be learned through fine-tuning.
  •  
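The implementation subtlety the abstract highlights (shuffling before vs. after subword segmentation) can be illustrated with a toy segmenter. Everything here is hypothetical - the halving rule, the "@@" continuation marker, and the function names - whereas real experiments use a trained BPE/WordPiece vocabulary:

```python
import random

def segment(word):
    """Toy subword segmenter: split words longer than 4 characters in half,
    marking the first piece with '@@' (a BPE-style continuation marker)."""
    if len(word) <= 4:
        return [word]
    mid = len(word) // 2
    return [word[:mid] + "@@", word[mid:]]

def shuffle_before_segmentation(sentence, rng):
    # Shuffle whole words first, then segment: the subword pieces of each
    # word stay adjacent, leaking local word-order information.
    words = sentence.split()
    rng.shuffle(words)
    return [p for w in words for p in segment(w)]

def shuffle_after_segmentation(sentence, rng):
    # Segment first, then shuffle the pieces: word-internal adjacency
    # is destroyed along with word order.
    pieces = [p for w in sentence.split() for p in segment(w)]
    rng.shuffle(pieces)
    return pieces

sentence = "shuffled language models retain order information"
out_before = shuffle_before_segmentation(sentence, random.Random(0))
out_after = shuffle_after_segmentation(sentence, random.Random(0))
print(out_before)  # pieces like 'lang@@' and 'uage' remain adjacent
print(out_after)   # pieces are scattered
```

Both variants produce the same multiset of subword pieces; only the adjacency structure differs, which is exactly the signal the paper shows models can exploit.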
3.
  •  
4.
  • Hershcovich, Daniel, et al. (author)
  • Kopsala : Transition-Based Graph Parsing via Efficient Training and Effective Encoding
  • 2020
  • In: 16th International Conference on Parsing Technologies and IWPT 2020 Shared Task on Parsing Into Enhanced Universal Dependencies. - Stroudsburg, PA, USA : Association for Computational Linguistics. - 9781952148118 ; pp. 236-244
  • Conference paper (peer-reviewed). Abstract:
    • We present Kopsala, the Copenhagen-Uppsala system for the Enhanced Universal Dependencies Shared Task at IWPT 2020. Our system is a pipeline consisting of off-the-shelf models for everything but enhanced graph parsing, and for the latter, a transition-based graph parser adapted from Che et al. (2019). We train a single enhanced parser model per language, using gold sentence splitting and tokenization for training, and rely only on tokenized surface forms and multilingual BERT for encoding. While a bug introduced just before submission resulted in a severe drop in precision, its post-submission fix would bring us to 4th place in the official ranking, according to average ELAS. Our parser demonstrates that a unified pipeline is effective for both Meaning Representation Parsing and Enhanced Universal Dependencies.
  •  
5.
  • Kulmizev, Artur, et al. (author)
  • Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited
  • 2019
  • In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). ; pp. 2755-2768
  • Conference paper (peer-reviewed). Abstract:
    • Transition-based and graph-based dependency parsers have previously been shown to have complementary strengths and weaknesses: transition-based parsers exploit rich structural features but suffer from error propagation, while graph-based parsers benefit from global optimization but have restricted feature scope. In this paper, we show that, even though some details of the picture have changed after the switch to neural networks and continuous representations, the basic trade-off between rich features and global optimization remains essentially the same. Moreover, we show that deep contextualized word embeddings, which allow parsers to pack information about global sentence structure into local feature representations, benefit transition-based parsers more than graph-based parsers, making the two approaches virtually equivalent in terms of both accuracy and error profile. We argue that the reason is that these representations help prevent search errors and thereby allow transition-based parsers to better exploit their inherent strength of making accurate local decisions. We support this explanation with an error analysis of parsing experiments on 13 languages.
  •  
6.
  • Kulmizev, Artur, et al. (author)
  • Do Neural Language Models Show Preferences for Syntactic Formalisms?
  • 2020
  • In: 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020). - : Association for Computational Linguistics (ACL). - 9781952148255 ; pp. 4077-4091
  • Conference paper (peer-reviewed). Abstract:
    • Recent work on the interpretability of deep neural language models has concluded that many properties of natural language syntax are encoded in their representational spaces. However, such studies often suffer from limited scope by focusing on a single language and a single linguistic formalism. In this study, we aim to investigate the extent to which the semblance of syntactic structure captured by language models adheres to a surface-syntactic or deep syntactic style of analysis, and whether the patterns are consistent across different languages. We apply a probe for extracting directed dependency trees to BERT and ELMo models trained on 13 different languages, probing for two different syntactic annotation styles: Universal Dependencies (UD), prioritizing deep syntactic relations, and Surface-Syntactic Universal Dependencies (SUD), focusing on surface structure. We find that both models exhibit a preference for UD over SUD - with interesting variations across languages and layers - and that the strength of this preference is correlated with differences in tree shape.
  •  
7.
  • Kulmizev, Artur, 1989-, et al. (author)
  • Investigating UD Treebanks via Dataset Difficulty Measures
  • 2023
  • In: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. - Dubrovnik, Croatia : Association for Computational Linguistics. ; pp. 1076-1089
  • Conference paper (peer-reviewed). Abstract:
    • Treebanks annotated with Universal Dependencies (UD) are currently available for over 100 languages and are widely utilized by the community. However, their inherent characteristics are hard to measure and are only partially reflected in parser evaluations via accuracy metrics like LAS. In this study, we analyze a large subset of the UD treebanks using three recently proposed accuracy-free dataset analysis methods: dataset cartography, V-information, and minimum description length. Each method provides insights about UD treebanks that would remain undetected if only LAS was considered. Specifically, we identify a number of treebanks that, despite yielding high LAS, contain very little information that is usable by a parser to surpass what can be achieved by simple heuristics. Furthermore, we make note of several treebanks that score consistently low across numerous metrics, indicating a high degree of noise or annotation inconsistency present therein.
  •  
8.
  • Kulmizev, Artur, et al. (author)
  • Schrödinger's tree : On syntax and neural language models
  • 2022
  • In: Frontiers in Artificial Intelligence. - : Frontiers Media S.A. - 2624-8212. ; 5
  • Journal article (peer-reviewed). Abstract:
    • In the last half-decade, the field of natural language processing (NLP) has undergone two major transitions: the switch to neural networks as the primary modeling paradigm and the homogenization of the training regime (pre-train, then fine-tune). Amidst this process, language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities and proving to be an indispensable means of knowledge transfer downstream. Due to the otherwise opaque, black-box nature of such models, researchers have employed aspects of linguistic theory in order to characterize their behavior. Questions central to syntax - the study of the hierarchical structure of language - have factored heavily into such work, shedding invaluable insights about models' inherent biases and their ability to make human-like generalizations. In this paper, we attempt to take stock of this growing body of literature. In doing so, we observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form, as well as the conclusions they draw from their findings. To remedy this, we urge researchers to make careful considerations when investigating coding properties, selecting representations, and evaluating via downstream tasks. Furthermore, we outline the implications of the different types of research questions exhibited in studies on syntax, as well as the inherent pitfalls of aggregate metrics. Ultimately, we hope that our discussion adds nuance to the prospect of studying language models and paves the way for a less monolithic perspective on syntax in this context.
  •  
9.
  • Kulmizev, Artur (author)
  • The Search for Syntax : Investigating the Syntactic Knowledge of Neural Language Models Through the Lens of Dependency Parsing
  • 2023
  • Doctoral thesis (other academic/artistic). Abstract:
    • Syntax — the study of the hierarchical structure of language — has long featured as a prominent research topic in the field of natural language processing (NLP). Traditionally, its role in NLP was confined towards developing parsers: supervised algorithms tasked with predicting the structure of utterances (often for use in downstream applications). More recently, however, syntax (and syntactic theory) has factored much less into the development of NLP models, and much more into their analysis. This has been particularly true with the nascent relevance of language models: semi-supervised algorithms trained to predict (or infill) strings given a provided context. In this dissertation, I describe four separate studies that seek to explore the interplay between syntactic parsers and language models upon the backdrop of dependency syntax. In the first study, I investigate the error profiles of neural transition-based and graph-based dependency parsers, showing that they are effectively homogenized when leveraging representations from pre-trained language models. Following this, I report the results of two additional studies which show that dependency tree structure can be partially decoded from the internal components of neural language models — specifically, hidden state representations and self-attention distributions. I then expand on these findings by exploring a set of additional results, which serve to highlight the influence of experimental factors, such as the choice of annotation framework or learning objective, in decoding syntactic structure from model components. In the final study, I describe efforts to quantify the overall learnability of a large set of multilingual dependency treebanks — the data upon which the previous experiments were based — and how it may be affected by factors such as annotation quality or tokenization decisions. 
Finally, I conclude the thesis with a conceptual analysis that relates the aforementioned studies to a broader body of work concerning the syntactic knowledge of language models.
  •  
10.
  • Ravishankar, Vinit, et al. (author)
  • Attention Can Reflect Syntactic Structure (If You Let It)
  • 2021
  • In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics. - Stroudsburg, PA, USA : Association for Computational Linguistics. - 9781954085022 ; pp. 3031-3045
  • Conference paper (peer-reviewed). Abstract:
    • Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel multi-head attention mechanism. However, much of such work focused almost exclusively on English - a language with rigid word order and a lack of inflectional morphology. In this study, we present decoding experiments for multilingual BERT across 18 languages in order to test the generalizability of the claim that dependency syntax is reflected in attention patterns. We show that full trees can be decoded above baseline accuracy from single attention heads, and that individual relations are often tracked by the same heads across languages. Furthermore, in an attempt to address recent debates about the status of attention as an explanatory mechanism, we experiment with fine-tuning mBERT on a supervised parsing objective while freezing different series of parameters. Interestingly, in steering the objective to learn explicit linguistic structure, we find much of the same structure represented in the resulting attention patterns, with interesting differences with respect to which parameters are frozen.
  •  
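To make the idea of decoding structure from a single attention head concrete, here is a deliberately simplified sketch. The greedy per-token argmax rule and the toy matrix are assumptions of this illustration; probing work of this kind typically decodes a well-formed tree from the same scores with a maximum-spanning-tree algorithm (Chu-Liu/Edmonds) rather than independent per-token choices:

```python
import numpy as np

def greedy_heads(attn):
    """Decode head choices from one attention head's weight matrix.

    attn[i, j] is the attention weight from token i to token j; each
    token selects its most-attended-to *other* token as its candidate
    syntactic head. No tree constraint is enforced here.
    """
    heads = []
    for i in range(attn.shape[0]):
        scores = attn[i].astype(float).copy()
        scores[i] = -np.inf          # disallow self-attachment
        heads.append(int(np.argmax(scores)))
    return heads

# Hypothetical 3-token attention matrix (rows sum to 1)
attn = np.array([[0.1, 0.8, 0.1],
                 [0.6, 0.2, 0.2],
                 [0.2, 0.7, 0.1]])
heads = greedy_heads(attn)
print(heads)  # -> [1, 0, 1]: tokens 0 and 2 attach to token 1, token 1 to token 0
```

Comparing such decoded structures against gold dependency trees, per head and per language, is the kind of evaluation the abstract describes.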