Quality versus efficiency in document scoring with learning-to-rank models



Capannini, Gabriele (author): Mälardalen University, Embedded Systems

Lucchese, C. (author): Istituto di Scienza e Tecnologie dell'Informazione (ISTI) of the National Research Council of Italy (CNR), Pisa, Italy

Nardini, F. M. (author): Istituto di Scienza e Tecnologie dell'Informazione (ISTI) of the National Research Council of Italy (CNR), Pisa, Italy

Orlando, S. (author): University Ca’ Foscari of Venice, Italy

Perego, R. (author): Istituto di Scienza e Tecnologie dell'Informazione (ISTI) of the National Research Council of Italy (CNR), Pisa, Italy

Tonellotto, N. (author): Istituto di Scienza e Tecnologie dell'Informazione (ISTI) of the National Research Council of Italy (CNR), Pisa, Italy


Elsevier BV, 2016
English.
In: Information Processing & Management. Elsevier BV. ISSN 0306-4573, 1873-5371. 52:6, pp. 1161-1177

Related links: https://arpi.unipi.i...; https://urn.kb.se/re...; https://doi.org/10.1...

Journal article (peer-reviewed)

Abstract

Learning-to-Rank (LtR) techniques leverage machine learning algorithms and large amounts of training data to induce high-quality ranking functions. Given a set of documents and a user query, these functions precisely predict a score for each document, and the scores are then used to rank the documents. Although the scoring efficiency of LtR models is critical in several applications (for example, it directly affects the response time and throughput of Web query processing), it has received relatively little attention so far. The goal of this work is to experimentally investigate the scoring efficiency of LtR models along with their ranking quality. Specifically, we show that machine-learned ranking models exhibit a quality versus efficiency trade-off. For example, each family of LtR algorithms has tuning parameters that influence both effectiveness and efficiency, and higher ranking quality is generally obtained with more complex and expensive models. Moreover, LtR algorithms that learn complex models, such as those based on forests of regression trees, are generally more expensive and more effective than algorithms that induce simpler models, such as linear combinations of features. We extensively analyze the quality versus efficiency trade-off of a wide spectrum of state-of-the-art LtR algorithms, and we propose a sound methodology to devise the most effective ranker given a time budget. To guarantee reproducibility, we used publicly available datasets, and we contribute an open source C++ framework providing optimized, multi-threaded implementations of the most effective tree-based learners: Gradient Boosted Regression Trees (GBRT), Lambda-MART (λ-MART), and the first public-domain implementation of Oblivious Lambda-MART (Ωλ-MART), an algorithm that induces forests of oblivious regression trees.
We investigate how the different training parameters affect the quality versus efficiency trade-off, and provide a thorough comparison of several algorithms in the quality-cost space. The experiments show that there is no overall best algorithm; rather, the optimal choice depends on the time budget.
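
The idea of choosing the most effective ranker under a time budget can be illustrated with a minimal Python sketch. This is not the authors' methodology or their C++ framework; it only assumes that each trained model has already been measured for ranking quality and per-document scoring cost. The `Candidate` class, the function names, and every number below are hypothetical placeholders, not results from the article.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """One trained ranker, described by two measurements (hypothetical fields)."""
    name: str        # illustrative label only
    quality: float   # ranking quality on held-out data, higher is better
    cost_us: float   # average scoring time per document, in microseconds


def pareto_frontier(candidates: List[Candidate]) -> List[Candidate]:
    """Return the models that no cheaper-or-equal model matches or beats in quality."""
    frontier = []
    best_quality = float("-inf")
    # Visit models from cheapest to most expensive; on equal cost, best quality first.
    for c in sorted(candidates, key=lambda c: (c.cost_us, -c.quality)):
        if c.quality > best_quality:
            frontier.append(c)
            best_quality = c.quality
    return frontier


def best_within_budget(candidates: List[Candidate], budget_us: float) -> Optional[Candidate]:
    """Pick the highest-quality model whose scoring cost fits the budget."""
    affordable = [c for c in candidates if c.cost_us <= budget_us]
    if not affordable:
        return None
    # Break quality ties in favour of the cheaper model.
    return max(affordable, key=lambda c: (c.quality, -c.cost_us))


# Made-up placeholder measurements, for illustration only.
candidates = [
    Candidate("linear", quality=0.40, cost_us=1.0),
    Candidate("small_forest", quality=0.45, cost_us=5.0),
    Candidate("slow_small", quality=0.44, cost_us=8.0),   # dominated by small_forest
    Candidate("large_forest", quality=0.50, cost_us=20.0),
]

frontier_names = [c.name for c in pareto_frontier(candidates)]
choice = best_within_budget(candidates, budget_us=10.0)
```

With these placeholder values, a 10-microsecond budget selects `small_forest`, while a larger budget would admit `large_forest`; this mirrors the abstract's point that the best choice depends on the budget rather than on a single overall winner.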

Subject headings

NATURAL SCIENCES -- Computer and Information Sciences (hsv//eng)

Keywords

Document scoring
Efficiency
Learning-to-rank
Artificial intelligence
Budget control
C++ (programming language)
Economic and social effects
Forestry
Learning algorithms
Learning systems
Parameter estimation
Regression analysis
Boosted regression trees
Effectiveness and efficiencies
Learning to rank
Linear combinations
Multi-threaded implementation
Training parameters
Tree-based learners
Algorithms

Publication and Content Type

ref (subject category)
art (subject category)
