SwePub

Cutty : Aggregate Sharing for User-Defined Windows

Carbone, Paris (author)
KTH, Software and Computer Systems, SCS
Traub, Jonas (author)
Katsifodimos, Asterios (author)
Haridi, Seif, 1953- (author)
KTH, Software and Computer Systems, SCS
Markl, Volker (author)
2016-10-24
2016
English.
In: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. - New York, NY, USA : Association for Computing Machinery (ACM). - 9781450340731 ; pp. 1201-1210
  • Conference paper (peer-reviewed)
Abstract
  • Aggregation queries on data streams are evaluated over evolving and often overlapping logical views called windows. While the aggregation of periodic windows was extensively studied in the past through the use of aggregate sharing techniques such as Panes and Pairs, little to no work has gone into optimizing the aggregation of very common, non-periodic windows. Typical examples of non-periodic windows are punctuations and sessions, which can implement complex business logic and are often expressed as user-defined operators on platforms such as Google Dataflow or Apache Storm. The aggregation of such non-periodic or user-defined windows either falls back to expensive, best-effort aggregate sharing methods, or is not optimized at all. In this paper we present a technique to perform efficient aggregate sharing for data stream windows that are declared as user-defined functions (UDFs) and can contain arbitrary business logic. To this end, we first introduce the concept of User-Defined Windows (UDWs), a simple, UDF-based programming abstraction that allows users to programmatically define custom windows. We then define semantics for UDWs, based on which we design Cutty, a low-cost aggregate sharing technique. Cutty improves on and outperforms the state of the art for aggregate sharing on single and multiple queries. Moreover, it enables aggregate sharing for a broad class of non-periodic UDWs. We implemented our techniques on Apache Flink, an open source stream processing system, and performed experiments demonstrating orders-of-magnitude reductions in aggregation costs compared to the state of the art.
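
To make the notion of aggregate sharing for periodic windows concrete, here is a minimal Python sketch in the spirit of the pane-based approach the abstract mentions. It is not Cutty and is not taken from the paper: the function names, the count-based windows, and the choice of sum as the aggregate are all assumptions made for illustration. The idea is that each input element is folded into exactly one partial aggregate (a "pane"), and every overlapping window is then assembled from panes instead of re-reading the raw elements.

```python
from math import gcd


def sliding_sums_naive(values, window_size, slide):
    """Baseline: every window re-aggregates all of its raw elements."""
    return [
        sum(values[i:i + window_size])
        for i in range(0, len(values) - window_size + 1, slide)
    ]


def sliding_sums_with_panes(values, window_size, slide):
    """Illustrative pane-style sharing for count-based sliding windows.

    Hypothetical helper, not the paper's algorithm: the stream is cut into
    panes of gcd(window_size, slide) elements, each pane is summed once, and
    each window is computed by combining the panes it covers.
    """
    pane = gcd(window_size, slide)
    # One partial sum per complete pane; each element is touched exactly once.
    pane_sums = [
        sum(values[i:i + pane])
        for i in range(0, len(values) - pane + 1, pane)
    ]
    panes_per_window = window_size // pane
    panes_per_slide = slide // pane
    # Each window result is built from shared pane sums only.
    return [
        sum(pane_sums[start:start + panes_per_window])
        for start in range(0, len(pane_sums) - panes_per_window + 1, panes_per_slide)
    ]


stream = list(range(1, 11))  # 1, 2, ..., 10
print(sliding_sums_with_panes(stream, window_size=4, slide=2))  # [10, 18, 26, 34]
```

This scheme depends on window boundaries being known in advance and regularly spaced. Sessions and punctuation-based windows have data-dependent boundaries, which is the non-periodic case the abstract says Cutty addresses.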

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Sciences (hsv//eng)

Keywords

Computer circuits
Computer programming
Data communication systems
Knowledge management
Open source software
Open systems
Semantics
Computer Science

Publication and content type

ref (subject category)
kon (subject category)
