Approaches for Distributing Large Scale Bioinformatic Analyses
- Dahlö, Martin (author)
- Uppsala universitet, Institutionen för farmaceutisk biovetenskap, Pharmaceutical Bioinformatics
- Spjuth, Ola, Professor, 1977- (thesis advisor)
- Uppsala universitet,Institutionen för farmaceutisk biovetenskap
- Peterson, Hedi, Associate Professor (opponent)
- University of Tartu, Faculty of Science and Technology, Institute of Computer Science
- ISBN 9789151313016
- Uppsala : Acta Universitatis Upsaliensis, 2021
- English, 56 pp.
- Series: Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy, 1651-6192 ; 302
- Related links:
- https://uu.diva-port... (primary) (Raw object)
- https://uu.diva-port... (Preview)
- https://urn.kb.se/re...
Abstract
- Ever since high-throughput DNA sequencing became economically feasible, the amount of biological data has grown exponentially. This has been one of the biggest drivers in introducing high-performance computing (HPC) to the field of biology. Unlike physics and mathematics, biology education has not had a strong focus on programming or algorithmic development. This has forced many biology researchers to learn a whole new skill set, and has introduced new challenges for those managing the HPC clusters. The aim of this thesis is to investigate the problems that arise when novice users use an HPC cluster for bioinformatics data analysis, and to explore approaches for mitigating them. In paper 1 we quantify and visualise these problems and contrast them with those of the more computer-experienced user groups already using the HPC cluster. In paper 2 we introduce a new workflow system (SciPipe), implemented as a Go library, as a way to organise and manage analysis steps. Paper 3 is aimed at cloud computing and how containerised tools can be used to run workflows without having to worry about software installations. In paper 4 we demonstrate a fully automated cloud-based system for image-based cell profiling. Starting with a robotic arm in a lab, it covers all the steps from cell culture and microscope to having the cell profiling results stored in a database and visualised in a web interface.
Subject headings
- NATURVETENSKAP -- Data- och informationsvetenskap -- Bioinformatik (hsv//swe)
- NATURAL SCIENCES -- Computer and Information Sciences -- Bioinformatics (hsv//eng)
Keywords
- bioinformatics
- cloud computing
- HPC
- high-performance computing
- big data
- kubernetes
- spark
- Bioinformatics
- Bioinformatik
Publication and content type
- vet (subject category)
- dok (subject category)