SwePub

Results for search "WFRF:(Virk Shafqat 1979) srt2:(2010-2014)"

  • Results 1-8 of 8
1.
  • Virk, Shafqat, 1979, et al. (author)
  • An Open-Source Punjabi Resource Grammar
  • 2011
  • In: Proceedings of RANLP-2011, Recent Advances in Natural Language Processing, Hissar, Bulgaria, 12-14 September 2011, pp. 70-76
  • Conference paper (peer-reviewed)
2.
  • Virk, Shafqat, 1979, et al. (author)
  • An Open Source Urdu Resource Grammar
  • 2010
  • In: Proceedings of the 8th Workshop on Asian Language Resources (Coling 2010 workshop).
  • Conference paper (peer-reviewed)
3.
  • Caprotti, Olga, 1964, et al. (author)
  • High-quality translation: Molto tools and applications
  • 2012
  • In: The fourth Swedish Language Technology Conference (SLTC).
  • Conference paper (peer-reviewed)
    • Abstract: MOLTO (Multilingual On Line Translation, FP7-ICT-247914, www.molto-project.eu) is a European project focusing on translation on the web. MOLTO targets translation that has production quality, that is, usable for quick and reliable dissemination of information. MOLTO’s main focus is to increase the productivity of such translation systems, building on the technology of GF (Grammatical Framework) and its Resource Grammar Library. But MOLTO also develops hybrid methods which increase the quality of Statistical Machine Translation (SMT) by adding linguistic information, or bootstrap grammatical models from statistical models. This paper gives a brief overview of MOLTO’s latest achievements, many of which are more thoroughly described in separate papers and available as web-based demos and as open-source software.
4.
  • Prasad, K V S, 1952, et al. (author)
  • Computational evidence that Hindi and Urdu share a grammar but not the lexicon
  • 2012
  • In: 3rd Workshop on South and Southeast Asian Natural Language Processing (SANLP), collocated with COLING 12.
  • Conference paper (other academic/artistic)
    • Abstract: Hindi and Urdu share a grammar and a basic vocabulary, but are often mutually unintelligible because they use different words in higher registers and sometimes even in quite ordinary situations. We report computational translation evidence of this unusual relationship (it differs from the usual pattern, that related languages share the advanced vocabulary and differ in the basics). We took a GF resource grammar for Urdu and adapted it mechanically for Hindi, changing essentially only the script (Urdu is written in Perso-Arabic, and Hindi in Devanagari) and the lexicon where needed. In evaluation, the Urdu grammar and its Hindi twin either both correctly translated an English sentence, or failed in exactly the same grammatical way, thus confirming computationally that Hindi and Urdu share a grammar. But the evaluation also found that the Hindi and Urdu lexicons differed in 18% of the basic words, in 31% of tourist phrases, and in 92% of school mathematics terms.
5.
  • Virk, Shafqat, 1979, et al. (author)
  • An Open Source Persian Computational Grammar
  • 2012
  • In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12).
  • Conference paper (peer-reviewed)
    • Abstract: In this paper, we describe a multilingual open-source computational grammar of Persian, developed in Grammatical Framework (GF), a type-theoretical grammar formalism. We discuss in detail the structure of different syntactic categories of Persian (noun phrases, verb phrases, adjectival phrases, etc.). First, we show how to structure and construct these categories individually. Then we describe how they are glued together to make well-formed sentences in Persian, while maintaining grammatical features such as agreement and word order. We also show how some of the distinctive features of Persian, such as the ezafe construction, are implemented in GF. In order to evaluate the grammar’s correctness, and to demonstrate its usefulness, we have added support for Persian in a multilingual application grammar (the Tourist Phrasebook) using the reported resource grammar.
6.
  •  
7.
  • Virk, Shafqat, 1979, et al. (author)
  • Developing an interlingual translation lexicon using WordNets and Grammatical Framework
  • 2014
  • In: Proceedings of the Fifth Workshop on South and Southeast Asian Natural Language Processing. ISBN 9781873769416
  • Conference paper (peer-reviewed)
    • Abstract: The Grammatical Framework (GF) offers perfect translation between controlled subsets of natural languages. For example, an abstract syntax for a set of sentences in school mathematics is the interlingua between the corresponding sentences in English and Hindi. GF “resource grammars” specify how to say something in English or Hindi; these are reused with “application grammars” that specify what can be said (mathematics, tourist phrases, etc.). More recent robust parsing and parse-tree disambiguation allow GF to parse arbitrary English text. We report here an experiment to linearise the resulting tree directly to other languages (e.g. Hindi, German), i.e., we use a language-independent resource grammar as the interlingua. We focus particularly on the last part of the translation system, the interlingual lexicon and word sense disambiguation (WSD). We improved the quality of the wide-coverage interlingual translation lexicon by using the Princeton and Universal WordNet data. We then integrated an existing WSD tool and replaced the usual GF-style lexicons, which give one target word per source word, by the WordNet-based lexicons. These new lexicons and WSD improve the quality of translation in most cases, as we show by examples. Both WordNets and WSD in general are well known, but this is the first use of these tools with GF.
8.
  • Virk, Shafqat Mumtaz, 1979 (author)
  • Computational Linguistics Resources for Indo-Iranian Languages
  • 2013
  • Doctoral thesis (other academic/artistic)
    • Abstract: Can computers process human languages? During the last fifty years, two main approaches have been used to find an answer to this question: data-driven (i.e. statistics based) and knowledge-driven (i.e. grammar based). The former relies on the availability of a vast amount of electronic linguistic data and the processing capabilities of modern-age computers, while the latter builds on grammatical rules and classical linguistic theories of language. In this thesis, we use mainly the second approach and elucidate the development of computational ("resource") grammars for six Indo-Iranian languages: Urdu, Hindi, Punjabi, Persian, Sindhi, and Nepali. We explore different lexical and syntactical aspects of these languages and build their resource grammars using the Grammatical Framework (GF), a type-theoretical grammar formalism tool. We also provide computational evidence of the similarities and differences between Hindi and Urdu, and report a mechanical development of a Hindi resource grammar starting from an Urdu resource grammar. We use a functor-style implementation that makes it possible to share the commonalities between the two languages. Our analysis shows that this sharing is possible up to 94% at the syntax level, whereas at the lexical level Hindi and Urdu differed in 18% of the basic words, in 31% of tourist phrases, and in 92% of school mathematics terms. Next, we describe the development of wide-coverage morphological lexicons for some of the Indo-Iranian languages. We use existing linguistic data from different resources (i.e. dictionaries and WordNets) to build uni-sense and multi-sense lexicons. Finally, we demonstrate how we used the reported grammatical and lexical resources to add support for Indo-Iranian languages in a few existing GF application grammars. These include the Phrasebook, the mathematics grammar library, and the Attempto controlled English grammar. Further, we give the experimental results of developing a wide-coverage grammar-based arbitrary text translator using these resources. These applications show the importance of such linguistic resources, and open new doors for future research on these languages.