SwePub

id:"swepub:oai:gup.ub.gu.se/232639"
 

Statistical evaluation of methods for identification of differentially abundant genes in comparative metagenomics

Jonsson, Viktor, 1987 (author)
University of Gothenburg, Department of Mathematical Sciences, Chalmers University of Technology
Österlund, Tobias, 1984 (author)
University of Gothenburg, Department of Mathematical Sciences, Mathematical Statistics, Chalmers University of Technology
Nerman, Olle, 1951 (author)
University of Gothenburg, Department of Mathematical Sciences, Mathematical Statistics, Chalmers University of Technology
Kristiansson, Erik, 1978 (author)
University of Gothenburg, Department of Mathematical Sciences, Mathematical Statistics, Chalmers University of Technology
2016-01-25
2016
English.
In: BMC Genomics. - : Springer Science and Business Media LLC. - 1471-2164. ; 17
  • Journal article (peer-reviewed)
Abstract
  • Background: Metagenomics is the study of microbial communities by sequencing of genetic material directly from environmental or clinical samples. The genes present in the metagenomes are quantified by annotating and counting the generated DNA fragments. Identification of differentially abundant genes between metagenomes can provide important information about differences in community structure, diversity and biological function. Metagenomic data are, however, high-dimensional, contain high levels of biological and technical noise, and typically have few biological replicates. The statistical analysis is therefore challenging, and many approaches have been suggested to date. Results: In this article, we perform a comprehensive evaluation of 14 methods for identification of differentially abundant genes between metagenomes. The methods are compared based on their power to detect differentially abundant genes and their ability to correctly estimate the type I error rate and the false discovery rate. We show that sample size, effect size, and gene abundance greatly affect the performance of all methods. Several of the methods also show non-optimal model assumptions and biased false discovery rate estimates, which can result in an excessive number of false positives. We also demonstrate that the performance of several of the methods differs substantially between metagenomic data sequenced by different technologies. Conclusions: Two methods primarily designed for the analysis of RNA sequencing data (edgeR and DESeq2), together with a generalized linear model based on an overdispersed Poisson distribution, were found to have the best overall performance. The results presented in this study may serve as a guide for selecting suitable statistical methods for identification of differentially abundant genes in metagenomes.

Subject headings

NATURAL SCIENCES  -- Mathematics (hsv//eng)

Keywords

Environmental sequencing
Next generation sequencing
Categorical data analysis
Differential
false discovery rate

Publication and content type

ref (subject category)
art (subject category)
