SwePub

Result list for search "WFRF:(Alemu Argaw Atelach) srt2:(2007)"

  • Results 1-4 of 4
1.
  • Alemu Argaw, Atelach, et al. (author)
  • General-Purpose Text Categorization Applied to the Medical Domain.
  • 2007
  • Report (other academic/artistic). Abstract:
    • This paper presents work where a general-purpose text categorization method was applied to categorize medical free-texts. The purpose of the experiments was to examine how such a method performs without any domain-specific knowledge, hand-crafting or tuning. Additionally, we compare the results from the general-purpose method with results from runs in which a medical thesaurus as well as automatically extracted keywords were used when building the classifiers. We show that standard text categorization techniques using stemmed unigrams as the basis for learning can be applied directly to categorize medical reports, yielding an F-measure of 83.9, and outperforming the more sophisticated methods.
2.
  • Argaw, Atelach Alemu, et al. (author)
  • Amharic-English information retrieval
  • 2007
  • In: Evaluation of Multilingual and Multi-modal Information Retrieval. ISBN 9783540749981, pp. 43-50
  • Conference paper (peer-reviewed). Abstract:
    • We describe Amharic-English cross-lingual information retrieval experiments in the ad hoc bilingual tracks of CLEF 2006. Query analysis is supported by morphological analysis and part-of-speech tagging, while two machine-readable dictionaries, supplemented by online dictionaries, were used for term lookup in the translation process. Out-of-dictionary terms were handled using fuzzy matching, and Lucene [4] was used for indexing and searching. Four experiments were conducted, differing in the topic-set fields used, fuzzy matching, and term weighting. The results obtained are reported and discussed.
3.
  • Argaw, Atelach Alemu (author)
  • Amharic-English information retrieval with pseudo relevance feedback
  • 2007
  • In: CLEF2007 Working Notes. CEUR-WS.
  • Conference paper (peer-reviewed). Abstract:
    • We describe cross-language retrieval experiments using Amharic queries and an English-language document collection from our participation in the bilingual ad hoc track at CLEF 2007. Two monolingual and eight bilingual runs were submitted. The bilingual experiments varied in their use of long and short queries, the presence of pseudo relevance feedback (PRF), and three approaches to word sense disambiguation (maximal expansion, first-translation-given, manual). We used an Amharic-English machine-readable dictionary (MRD) and an online Amharic-English dictionary for the lookup translation of query terms. In both resources, matching query term bigrams were always given precedence over unigrams. Out-of-dictionary Amharic query terms were taken to be possible named entities in the language, and further filtering was achieved through restricted fuzzy matching based on edit distance. The fuzzy matching was performed for each of these terms against automatically extracted English proper names. The Lemur toolkit for language modeling and information retrieval was used for indexing and retrieval. Although the experiments are too limited to draw conclusions from, the results indicate that longer queries tend to perform similarly to short ones, that PRF improves performance considerably, and that queries tend to fare better when we use the first translation given in the MRD rather than maximal expansion of terms, that is, taking all the translations given in the MRD.
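The lookup step described in the abstract of item 3 (bigrams before unigrams, then edit-distance matching of out-of-dictionary terms against proper names) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the distance threshold of 2, the use of maximal expansion for every dictionary hit, and the placeholder tokens standing in for Amharic query terms are all assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_match(term, proper_names, max_distance=2):
    """Return the closest proper name within max_distance, else None.

    The threshold is a hypothetical value; the abstract only says the
    matching was "restricted".
    """
    best, best_d = None, max_distance + 1
    for name in proper_names:
        d = edit_distance(term.lower(), name.lower())
        if d < best_d:
            best, best_d = name, d
    return best


def translate_query(terms, dictionary, proper_names, max_distance=2):
    """Translate query terms, giving dictionary bigrams precedence over unigrams.

    `dictionary` maps a source unigram or space-joined bigram to a list of
    translations; all translations are kept (maximal expansion).
    """
    out, i = [], 0
    while i < len(terms):
        bigram = " ".join(terms[i:i + 2])
        if i + 1 < len(terms) and bigram in dictionary:
            out.extend(dictionary[bigram])
            i += 2
            continue
        if terms[i] in dictionary:
            out.extend(dictionary[terms[i]])
        else:
            # Out-of-dictionary term: treat it as a possible named entity.
            match = fuzzy_match(terms[i], proper_names, max_distance)
            if match is not None:
                out.append(match)
        i += 1
    return out


# Toy data: "w1", "w2", "w3" are placeholders, not real Amharic words.
toy_dictionary = {"w1 w2": ["prime minister"], "w1": ["first"], "w3": ["house", "home"]}
toy_names = ["Ethiopia", "Nairobi"]
translated = translate_query(["w1", "w2", "w3", "nayrobi"], toy_dictionary, toy_names)
```

Swapping `out.extend(...)` for appending only the first list element would give the first-translation-given variant mentioned in the abstract.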
4.
  • Asker, Lars, et al. (author)
  • An Amharic Stemmer : Reducing Words to their Citation Forms
  • 2007
  • In: Computational Approaches to Semitic Languages: Common Issues and Resources.
  • Conference paper (other academic/artistic). Abstract:
    • Stemming is an important analysis step in a number of areas such as natural language processing (NLP), information retrieval (IR), machine translation (MT) and text classification. In this paper we present the development of a stemmer for Amharic that reduces words to their citation forms. Amharic is a Semitic language with rich and complex morphology. The stemmer is intended for dictionary-based cross-language IR, where the translation step requires looking up terms in a machine-readable dictionary (MRD). We apply a rule-based approach supplemented by occurrence statistics of words in an MRD and in a 3.1M-word news corpus. The main purpose of the statistical supplements is to resolve ambiguity between alternative segmentations. The stemmer is evaluated on Amharic text from two domains, news articles and a classic fiction text. It is shown to have an accuracy of 60% for the old-fashioned fiction text and 75% for the news articles.
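The abstract of item 4 says occurrence statistics are used to resolve ambiguity between alternative segmentations but does not say how they are combined. One possible reading, shown purely as an assumption, is to prefer candidate citation forms that appear in the MRD and to break ties by corpus frequency. The function name and the placeholder forms below are hypothetical.

```python
def choose_citation_form(candidates, mrd_entries, corpus_counts):
    """Pick one candidate citation form from alternative segmentations.

    Ranking (an assumption, not the published method): forms found in the
    machine-readable dictionary come first, then higher corpus frequency.
    Returns None when the rules produced no candidates.
    """
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda form: (form in mrd_entries, corpus_counts.get(form, 0)),
    )


# Toy data: placeholder strings, not real Amharic forms or real counts.
toy_candidates = ["formA", "formB", "formC"]
toy_mrd = {"formB", "formC"}
toy_counts = {"formA": 50, "formB": 3, "formC": 12}
chosen = choose_citation_form(toy_candidates, toy_mrd, toy_counts)
```

Here `formA` is the most frequent form but is absent from the dictionary, so the sketch picks `formC`, the more frequent of the two dictionary entries.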