SwePub
onr:"swepub:oai:research.chalmers.se:78020349-9ac9-469c-a8f7-b57f98a49f54"

Mining Web Logs to Improve User Experience in Web Search

Tansini, Libertad, 1973 (author)
Chalmers tekniska högskola (Chalmers University of Technology)
ISBN 9789173852005
2008
English.
  • Doctoral thesis (other academic/artistic)
Abstract
The World Wide Web continues to grow in size and diversity, which makes it increasingly hard for users to find valuable information: documents are heterogeneous in form and content, little is known about their reliability and prestige, and there is a great deal of redundancy.

Search engines usually look for documents that contain specific keywords or phrases stated by users as queries. There may be millions of pages containing those keywords, and they may relate to a variety of different topics. Traditional retrieval strategies yield increasingly poor results due to a dramatic increase in ballast in the results. Search engine users thus increasingly experience information overload.

With these difficulties in mind, there is a large ongoing research effort with the goal of delivering appropriate information to users; this is what is meant by improving users' Web search experience.

The aim of this thesis is to design, test and analyze different approaches to characterizing the search behavior of users and improving the search process in the context of Web search. There are three main aspects to focus on when tackling this problem through Web log mining: user recommendations to improve the search process, automatic detection of user information needs, and modeling of user information needs.

To improve Web search by user recommendations, a suite of simple and efficient algorithms tailored to mixture models is presented. Tests are carried out on a broad range of data generated according to a spectrum of subclasses of mixture models, and on real data collected from a Hungarian news portal log and from the Chilean TodoCl web search log; the resulting performance is shown to be of high quality. Other application areas where mixture models are used also benefit from these results, for example dating services, e-commerce, virtual collaborative communities, Internet Service Providers, and the analysis of gene expression data in bioinformatics.

The main contribution in user behavior characterization is a complete study of all major learning approaches applied to automatically detecting user intent in Web search. The three machine learning techniques analyzed for mining user intent are completely supervised, semi-supervised and unsupervised. In this context, the semi-supervised learning approach shows significant improvements over the supervised approach, previously considered the best one, for mining user intent and interests. This study is also of more general interest in exploring the true potential of all learning techniques in large-scale settings such as the Web, in terms of both scalability and accuracy.

In the user intent modeling context, the contribution is a new categorization of users' intentions from the point of view of facets, with the aim of improving on previous classification schemes. Initially, a set of queries was manually labeled with the new faceted classification scheme to find relationships between the facets, to aid the manual labelling process, and to understand users' intentions. The distribution of the queries within the facets shows that the facets are relevant, since each produces a division of the query space that will allow for a better understanding of user needs.

Subject headings

NATURVETENSKAP  -- Data- och informationsvetenskap -- Datavetenskap (hsv//swe)
NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Sciences (hsv//eng)

Keywords

Web log mining
web search
user intent
user recommendation
user modelling

Publication and content type

dok (subject category)
vet (subject category)
