SwePub

id:"swepub:oai:DiVA.org:kth-152583"
 


BESST - Efficient scaffolding of large fragmented assemblies

Sahlin, Kristoffer (author)
KTH, Computational Biology (CB), Science for Life Laboratory (SciLifeLab)
Vezzi, Francesco (author)
KTH, Computational Biology (CB), Science for Life Laboratory (SciLifeLab)
Nystedt, Björn (author)
Stockholm University, Department of Biochemistry and Biophysics, Science for Life Laboratory (SciLifeLab)
Lundeberg, Joakim (author)
KTH, Gene Technology, Science for Life Laboratory (SciLifeLab)
Arvestad, Lars (author)
Stockholm University, Numerical Analysis and Computer Science (NADA), Swedish e-Science Research Centre (SeRC), Sweden
2014-08-15
2014
English.
In: BMC Bioinformatics. - : Springer Science and Business Media LLC. - 1471-2105. ; 15:1, p. 281-
  • Journal article (peer-reviewed)
Abstract
  • Background: The use of short reads from High Throughput Sequencing (HTS) techniques is now commonplace in de novo assembly. Yet, obtaining contiguous assemblies from short reads is challenging, thus making scaffolding an important step in the assembly pipeline. Different algorithms have been proposed but many of them use the number of read pairs supporting a linking of two contigs as an indicator of reliability. This reasoning is intuitive, but fails to account for variation in link count due to contig features. We have also noted that published scaffolders are only evaluated on small datasets using output from only one assembler. Two issues arise from this. Firstly, some of the available tools are not well suited for complex genomes. Secondly, these evaluations provide little support for inferring a software's general performance.
  • Results: We propose a new algorithm, implemented in a tool called BESST, which can scaffold genomes of all sizes and complexities and was used to scaffold the genome of P. abies (20 Gbp). We performed a comprehensive comparison of BESST against the most popular stand-alone scaffolders on a large variety of datasets. Our results confirm that some of the popular scaffolders are not practical to run on complex datasets. Furthermore, no single stand-alone scaffolder outperforms the others on all datasets. However, BESST fares favorably to the other tested scaffolders on GAGE datasets and, moreover, outperforms the other methods when library insert size distribution is wide.
  • Conclusion: We conclude from our results that information sources other than the quantity of links, as is commonly used, can provide useful information about genome structure when scaffolding.
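
The abstract's remark that link counts vary with contig features can be illustrated with a small, simplified sketch. It is not BESST's method. It assumes a single fixed insert size and error-free placement, and the function name `spanning_positions` and all parameters are hypothetical. It counts how many fragment start positions could produce a read pair with one read on contig A and its mate on contig B.

```python
def spanning_positions(len_a, len_b, gap, insert_size, read_len):
    """Count fragment starts that place one read on contig A and its mate on contig B.

    Simplified model (an assumption, not taken from the paper): contig A covers
    [0, len_a), contig B covers [len_a + gap, len_a + gap + len_b), and every
    fragment has exactly the same length `insert_size`, with a read of
    `read_len` bases at each end.
    """
    # The right read starts at x + insert_size - read_len and must begin inside B.
    lowest_start = max(0, len_a + gap + read_len - insert_size)
    # The left read must end inside A, and the right read must end inside B.
    highest_start = min(len_a - read_len, len_a + gap + len_b - insert_size)
    return max(0, highest_start - lowest_start + 1)


if __name__ == "__main__":
    # Same gap and library, different length of contig A.
    for len_a in (1000, 150):
        n = spanning_positions(len_a, 1000, gap=100, insert_size=500, read_len=100)
        print(f"contig A of length {len_a}: {n} possible linking positions")
```

Under this toy model a short contig offers far fewer positions from which a linking pair can originate, so a low raw link count need not mean the link is unreliable.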

Subject headings

NATURAL SCIENCES  -- Biological Sciences -- Biochemistry and Molecular Biology (hsv//eng)
NATURAL SCIENCES  -- Computer and Information Sciences -- Bioinformatics (hsv//eng)

Keywords

Genome analysis
Genome assembly
Mate pair next-generation sequencing
Scaffolding

Publication and content type

ref (subject category)
art (subject category)
