SwePub

Rapid development of cloud-native intelligent data pipelines for scientific data streams using the HASTE Toolkit

Blamey, Ben (author)
Uppsala universitet,Avdelningen för beräkningsvetenskap,Tillämpad beräkningsvetenskap,Department of Information Technology, Uppsala University, Lägerhyddsvägen 2, 75237 Uppsala, Sweden
Toor, Salman (author)
Uppsala universitet,Tillämpad beräkningsvetenskap,Avdelningen för beräkningsvetenskap,Department of Information Technology, Uppsala University, Lägerhyddsvägen 2, 75237 Uppsala, Sweden
Dahlö, Martin (author)
Uppsala universitet,Science for Life Laboratory, SciLifeLab,Institutionen för farmaceutisk biovetenskap,Ola Spjuth,Department of Pharmaceutical Biosciences, Uppsala University, Husargatan 3, 75237, Uppsala, Sweden;Science for Life Laboratory, Uppsala University, Husargatan 3, 75237 Uppsala, Sweden
Wieslander, Håkan (author)
Uppsala universitet,Avdelningen för visuell information och interaktion,Bildanalys och människa-datorinteraktion,Department of Information Technology, Uppsala University, Lägerhyddsvägen 2, 75237 Uppsala, Sweden
Harrison, Philip J. (author)
Uppsala universitet,Institutionen för farmaceutisk biovetenskap,Science for Life Laboratory, SciLifeLab,Spjuth,Department of Pharmaceutical Biosciences, Uppsala University, Husargatan 3, 75237, Uppsala, Sweden;Science for Life Laboratory, Uppsala University, Husargatan 3, 75237 Uppsala, Sweden
Sintorn, Ida-Maria, 1976- (author)
Uppsala universitet,Bildanalys och människa-datorinteraktion,Avdelningen för visuell information och interaktion,Science for Life Laboratory, SciLifeLab,Department of Information Technology, Uppsala University, Lägerhyddsvägen 2, 75237 Uppsala, Sweden;Science for Life Laboratory, Uppsala University, Husargatan 3, 75237 Uppsala, Sweden;Vironova AB, Gävlegatan 22, 11330 Stockholm, Sweden
Sabirsh, Alan (author)
Advanced Drug Delivery, Pharmaceutical Sciences, R&D, AstraZeneca, Pepparedsleden 1, 43183 Mölndal, Sweden
Wählby, Carolina, professor, 1974- (author)
Uppsala universitet,Bildanalys och människa-datorinteraktion,Science for Life Laboratory, SciLifeLab,Avdelningen för visuell information och interaktion,Department of Information Technology, Uppsala University, Lägerhyddsvägen 2, 75237 Uppsala, Sweden;Science for Life Laboratory, Uppsala University, Husargatan 3, 75237 Uppsala, Sweden
Spjuth, Ola, Professor, 1977- (author)
Uppsala universitet,Institutionen för farmaceutisk biovetenskap,Science for Life Laboratory, SciLifeLab,Spjuth,Department of Pharmaceutical Biosciences, Uppsala University, Husargatan 3, 75237, Uppsala, Sweden;Science for Life Laboratory, Uppsala University, Husargatan 3, 75237 Uppsala, Sweden
Hellander, Andreas (author)
Uppsala universitet,Avdelningen för beräkningsvetenskap,Tillämpad beräkningsvetenskap,Department of Information Technology, Uppsala University, Lägerhyddsvägen 2, 75237 Uppsala, Sweden
2021-03-19
2021
English.
In: GigaScience. - : Oxford University Press. - 2047-217X. ; 10:3, pp. 1-14
  • Journal article (peer-reviewed)
Abstract

BACKGROUND: Large streamed datasets, characteristic of life science applications, are often resource-intensive to process, transport and store. We propose a pipeline model, a design pattern for scientific pipelines, where an incoming stream of scientific data is organized into a tiered or ordered "data hierarchy". We introduce the HASTE Toolkit, a proof-of-concept cloud-native software toolkit based on this pipeline model, to partition and prioritize data streams to optimize use of limited computing resources.

FINDINGS: In our pipeline model, an "interestingness function" assigns an interestingness score to data objects in the stream, inducing a data hierarchy. From this score, a "policy" guides decisions on how to prioritize computational resource use for a given object. The HASTE Toolkit is a collection of tools to adopt this approach. We evaluate with 2 microscopy imaging case studies. The first is a high content screening experiment, where images are analyzed in an on-premise container cloud to prioritize storage and subsequent computation. The second considers edge processing of images for upload into the public cloud for real-time control of a transmission electron microscope.

CONCLUSIONS: Through our evaluation, we created smart data pipelines capable of effective use of storage, compute, and network resources, enabling more efficient data-intensive experiments. We note a beneficial separation between scientific concerns of data priority, and the implementation of this behaviour for different resources in different deployment contexts. The toolkit allows intelligent prioritization to be "bolted on" to new and existing systems - and is intended for use with a range of technologies in different deployment scenarios.
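
The pipeline model in the abstract (an interestingness function scores each object, and a policy turns that score into a resource decision) can be illustrated with a minimal sketch. This is not the HASTE Toolkit's API: `DataObject`, `variance_interestingness`, `tier_policy`, `prioritize`, the tier names, and the thresholds are all hypothetical, and pixel variance is only a stand-in for whatever score a real experiment would use.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import Callable, Iterable, Iterator, Sequence, Tuple


@dataclass
class DataObject:
    """Hypothetical stream item: a named image with intensities in [0, 1]."""
    name: str
    pixels: Sequence[float]


def variance_interestingness(obj: DataObject) -> float:
    """Assumed interestingness function: pixel variance scaled into [0, 1].

    For values in [0, 1] the population variance is at most 0.25,
    so dividing by 0.25 normalizes the score.
    """
    if len(obj.pixels) == 0:
        return 0.0
    return min(1.0, pvariance(obj.pixels) / 0.25)


def tier_policy(score: float) -> str:
    """Assumed policy: map a score to a storage tier (thresholds are arbitrary)."""
    if score >= 0.5:
        return "fast-storage"
    if score >= 0.1:
        return "slow-storage"
    return "discard"


def prioritize(
    stream: Iterable[DataObject],
    interestingness: Callable[[DataObject], float],
    policy: Callable[[float], str],
) -> Iterator[Tuple[str, float, str]]:
    """Score each object as it arrives and let the policy pick its tier."""
    for obj in stream:
        score = interestingness(obj)
        yield obj.name, score, policy(score)


if __name__ == "__main__":
    stream = [
        DataObject("blank", [0.5, 0.5, 0.5, 0.5]),
        DataObject("faint", [0.3, 0.7, 0.3, 0.7]),
        DataObject("sharp", [0.0, 1.0, 0.0, 1.0]),
    ]
    for name, score, tier in prioritize(stream, variance_interestingness, tier_policy):
        print(f"{name}: score={score:.2f} tier={tier}")
```

Keeping the scoring function and the policy as separate callables mirrors the separation the abstract notes between the scientific concern of data priority and how that priority is acted on in a given deployment: either one can be swapped without touching the other.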

Subject headings

NATURAL SCIENCES -- Computer and Information Sciences -- Computer Sciences (hsv//eng)

Keyword

HASTE
image analysis
interestingness functions
stream processing
tiered storage
Computer Science

Publication and Content Type

ref (subject category)
art (subject category)
