SwePub

Scaling metagenome sequence assembly with probabilistic de Bruijn graphs

Pell, J. (author)
Hintze, Arend, Professor (author)
Michigan State University, East Lansing, United States
Canino-Koning, R. (author)
Howe, A. (author)
Tiedje, J. M. (author)
Brown, C. T. (author)
2012-07-30
2012
English.
In: Proceedings of the National Academy of Sciences of the United States of America. Publisher: Proceedings of the National Academy of Sciences. ISSN 0027-8424, 1091-6490. Vol. 109, no. 33, pp. 13272-13277
  • Journal article (peer-reviewed)
Abstract

Deep sequencing has enabled the investigation of a wide range of environmental microbial ecosystems, but the high memory requirements for de novo assembly of short-read shotgun sequencing data from these complex populations are an increasingly large practical barrier. Here we introduce a memory-efficient graph representation with which we can analyze the k-mer connectivity of metagenomic samples. The graph representation is based on a probabilistic data structure, a Bloom filter, that allows us to efficiently store assembly graphs in as little as 4 bits per k-mer, albeit inexactly. We show that this data structure accurately represents DNA assembly graphs in low memory. We apply this data structure to the problem of partitioning assembly graphs into components as a prelude to assembly, and show that this reduces the overall memory requirements for de novo assembly of metagenomes. On one soil metagenome assembly, this approach achieves a nearly 40-fold decrease in the maximum memory requirements for assembly. This probabilistic graph representation is a significant theoretical advance in storing assembly graphs and also yields immediate leverage on metagenomic assembly.
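
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `KmerBloomFilter`, the helper names, the choice of `hashlib.blake2b` for hashing, and the example reads are all assumptions made for illustration, and reverse complements are ignored for brevity. K-mers are stored in a plain Bloom filter, graph edges are never stored but are recovered by querying the filter for each possible neighboring k-mer, and a breadth-first walk collects one connected component, which is the partitioning step the abstract describes. The last function uses the standard Bloom filter approximation for the false-positive rate, which is what "inexactly" amounts to at a given number of bits per k-mer.

```python
import hashlib
import math
from collections import deque


class KmerBloomFilter:
    """A plain Bloom filter over k-mer strings (illustrative, hypothetical class)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive num_hashes bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        # May report a k-mer that was never added, but never misses one that was.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(kmer)
        )


def kmers(sequence, k):
    """Yield every substring of length k in the sequence."""
    return (sequence[i:i + k] for i in range(len(sequence) - k + 1))


def neighbors(bloom, kmer):
    """K-mers overlapping `kmer` by k-1 bases that the filter reports as present."""
    for base in "ACGT":
        for candidate in (kmer[1:] + base, base + kmer[:-1]):
            if candidate != kmer and candidate in bloom:
                yield candidate


def component(bloom, start):
    """Breadth-first walk collecting every k-mer reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors(bloom, current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def approx_false_positive_rate(bits_per_kmer, num_hashes):
    """Standard Bloom filter approximation: (1 - e^(-h/b))^h."""
    return (1.0 - math.exp(-num_hashes / bits_per_kmer)) ** num_hashes


# Two made-up reads that share no 4-base overlap, so they form separate components.
K = 5
read_a = "ACGTACGGTCA"
read_b = "TTGGCCAATTG"
bloom = KmerBloomFilter(num_bits=1 << 20, num_hashes=3)
for read in (read_a, read_b):
    for kmer in kmers(read, K):
        bloom.add(kmer)

part_a = component(bloom, read_a[:K])
print(len(part_a), "k-mers in the component containing", read_a[:K])
print("approximate false-positive rate at 4 bits per k-mer:",
      round(approx_false_positive_rate(4, 3), 3))
```

The demo filter is deliberately oversized so that false positives are very unlikely and the two reads stay in separate components. Shrinking `num_bits` toward 4 bits per stored k-mer raises the false-positive rate to roughly 15% under the approximation above, which adds spurious k-mers and edges to the graph while using far less memory.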

Subject headings

NATURVETENSKAP  -- Biologi -- Bioinformatik och systembiologi (hsv//swe)
NATURAL SCIENCES  -- Biological Sciences -- Bioinformatics and Systems Biology (hsv//eng)

Keyword

Compression
Metagenomics
article
gene sequence
mathematical analysis
metagenome
plots and curves
priority journal
probabilistic de Bruijn graph
Base Pairing
Chromosomes, Bacterial
Computational Biology
DNA, Circular
Escherichia coli
Genome, Bacterial
Information Theory
Nonlinear Dynamics
Sequence Analysis, DNA
Soil Microbiology

Publication and Content Type

ref (subject category)
art (subject category)
