Mining Web Logs to Improve User Experience in Web Search
- Tansini, Libertad, 1973 (author)
- Chalmers tekniska högskola, Chalmers University of Technology
- ISBN 9789173852005
- 2008
- English.
- Related link:
- https://research.cha...
Abstract
- The World Wide Web continues to grow in size and diversity, which makes it increasingly hard for users to find valuable information: documents are heterogeneous in form and content, little is known about their reliability and prestige, and there is a great deal of redundancy.

  Search engines usually look for documents that contain specific keywords or phrases stated by the users as queries. There may be millions of pages containing those keywords, and they may relate to a variety of different topics.

  Traditional retrieval strategies yield increasingly poor results because of a dramatic increase in ballast in the results. Search engine users thus increasingly experience information overload.

  With these difficulties in mind, there is a large ongoing research effort aimed at delivering appropriate information to users; this is what is meant by improving users' Web search experience.

  The aim of this thesis is to design, test and analyze different approaches to characterizing the search behavior of users and improving the search process in the context of Web search. There are three main aspects to focus on when tackling this problem through Web Log Mining: user recommendations to improve the search process, automatic detection of user information needs, and modeling of user information needs.

  To improve Web search by user recommendations, a suite of simple and efficient algorithms tailored to mixture models is presented. Tests are carried out on a broad range of data generated according to a spectrum of subclasses of mixture models, and on real data collected from a Hungarian news portal log and from the Chilean TodoCl web search log; the resulting performance is shown to be of high quality. Other application areas where mixture models are used also benefit from these results, for example dating services, e-commerce, virtual collaborative communities, Internet Service Providers, and the analysis of gene expression data in bioinformatics.

  The main contribution in user behavior characterization is a complete study of all major learning approaches applied to automatically detecting user intent in Web search. The three machine learning techniques analyzed for mining user intent are completely supervised, semi-supervised and unsupervised learning. In this context the semi-supervised approach shows significant improvements over the supervised approach, previously considered the best one, for mining user intent and interests. This study is also of more general interest in exploring the true potential of all learning techniques in large-scale settings such as the Web, in terms of both scalability and accuracy.

  In the user intent modeling context, the contribution is a new categorization of users' intentions from the point of view of facets, with the aim of improving on previous classification schemes. Initially, a set of queries was manually labeled with the new faceted classification scheme to find relationships between the facets, to aid the manual labelling process, and to understand users' intentions. The distribution of the queries within the facets shows that the facets are relevant, since each produces a division of the query space that allows for better understanding of user needs.
Subject headings
- NATURVETENSKAP -- Data- och informationsvetenskap -- Datavetenskap (hsv//swe)
- NATURAL SCIENCES -- Computer and Information Sciences -- Computer Sciences (hsv//eng)
Keywords
- Web log mining
- web search
- user intent
- user recommendation
- user modelling
Publication and content type
- dok (subject category)
- vet (subject category)