SwePub

Approaches for Distributing Large Scale Bioinformatic Analyses

Dahlö, Martin (author)
Uppsala universitet, Institutionen för farmaceutisk biovetenskap, Pharmaceutical Bioinformatics
Spjuth, Ola, Professor, 1977- (praeses)
Uppsala universitet, Institutionen för farmaceutisk biovetenskap
Peterson, Hedi, Associate Professor (opponent)
University of Tartu, Faculty of Science and Technology, Institute of Computer Science
ISBN 9789151313016
Uppsala : Acta Universitatis Upsaliensis, 2021
English, 56 pp.
Series: Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy, 1651-6192 ; 302
  • Doctoral thesis (other academic/artistic)
Abstract
  • Ever since high-throughput DNA sequencing became economically feasible, the amount of biological data has grown exponentially. This has been one of the biggest drivers in introducing high-performance computing (HPC) to the field of biology. Unlike physics and mathematics, biology education has not had a strong focus on programming or algorithmic development. This has forced many biology researchers to learn a whole new skill set, and has introduced new challenges for those managing the HPC clusters. The aim of this thesis is to investigate the problems that arise when novice users use an HPC cluster for bioinformatics data analysis, and to explore approaches for mitigating them. In paper 1 we quantify and visualise these problems and contrast them with those of the more computer-experienced user groups already using the HPC cluster. In paper 2 we introduce a new workflow system (SciPipe), implemented as a Go library, as a way to organise and manage analysis steps. Paper 3 is aimed at cloud computing and how containerised tools can be used to run workflows without having to worry about software installations. In paper 4 we demonstrate a fully automated cloud-based system for image-based cell profiling. Starting with a robotic arm in a lab, it covers all the steps from cell culture and microscope to having the cell profiling results stored in a database and visualised in a web interface.

Subject headings

NATURVETENSKAP  -- Data- och informationsvetenskap -- Bioinformatik (hsv//swe)
NATURAL SCIENCES  -- Computer and Information Sciences -- Bioinformatics (hsv//eng)

Keywords

bioinformatics
cloud computing
HPC
high-performance computing
big data
kubernetes
spark
Bioinformatics

Publication and content type

vet (subject category)
dok (subject category)
