SwePub

id:"swepub:oai:DiVA.org:uu-360840"
 


Examining sequence alignments using a model-based approach

Bogusz, Marcin (author)
Uppsala University, Evolutionary Biology, Whelan Lab
Ali, Raja Hashim (author)
Uppsala University, Evolutionary Biology
Whelan, Simon (author)
Uppsala University, Evolutionary Biology
English.
  • Other publication (other academic/artistic)
Abstract
  • Multiple sequence alignment (MSA) is a commonly performed procedure required for a number of evolutionary and comparative analyses. The common two-step process of sequence alignment followed by statistical phylogenetic inference depends on MSA quality. MSA is computationally difficult and, as a result, sequence alignments in many cases contain regions of spurious homology. These alignment errors affect downstream results, so choosing an accurate MSA is critical. Researchers often face the problem of choosing an aligner from among many multiple sequence alignment methods (MSAMs). This choice is often based on benchmark results, with various popular methods claiming high accuracy scores. These methods compete to obtain the highest scores in the commonly used sum-of-pairs benchmark, which measures the fraction of true homologies recovered but ignores the fraction of false-positive homologies introduced. Furthermore, these benchmarks do not account for the fact that some homologies are more difficult to recover than others. We take a probabilistic model-based approach to examine the quality of pairwise homologies returned by four popular MSAMs. We use pair hidden Markov models to break down alignment columns into pairs and obtain distributions of pairwise posterior scores for these aligners. Basing our results on a structural benchmark and a simulation study, we find that MSAMs appear to return a sample from a confidence set defined by high posterior probabilities. Furthermore, we find that the reference alignment contains a portion of pairwise homologies with low pairwise posterior probability, which no MSAM can be expected to recover. Finally, we look at several possible test statistics, with and without the need for reference alignments, and ultimately suggest using positive predictive value (PPV) and mean posterior probability for MSA evaluation.
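
The difference between the sum-of-pairs score and PPV described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code; it assumes the usual definitions (sum-of-pairs as the fraction of reference homologies recovered, PPV as the fraction of returned homologies that are in the reference), and the function names `homologous_pairs` and `sum_of_pairs_and_ppv` are hypothetical.

```python
from itertools import combinations


def homologous_pairs(alignment):
    """Return the set of residue pairs placed in the same column.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    Each pair has the form ((seq_i, residue_index_i), (seq_j, residue_index_j)).
    """
    counters = [0] * len(alignment)  # next residue index for each sequence
    pairs = set()
    for column in zip(*alignment):
        residues = []
        for seq_id, char in enumerate(column):
            if char != "-":
                residues.append((seq_id, counters[seq_id]))
                counters[seq_id] += 1
        # Break the column into all pairwise homologies it implies.
        pairs.update(combinations(residues, 2))
    return pairs


def sum_of_pairs_and_ppv(test, reference):
    """Compare a test alignment with a reference alignment.

    Assumed definitions (not taken from the record itself):
      sum-of-pairs = true positives / pairs in the reference
      PPV          = true positives / pairs in the test alignment
    """
    test_pairs = homologous_pairs(test)
    ref_pairs = homologous_pairs(reference)
    true_positives = len(test_pairs & ref_pairs)
    sp = true_positives / len(ref_pairs) if ref_pairs else 0.0
    ppv = true_positives / len(test_pairs) if test_pairs else 0.0
    return sp, ppv


# Toy example: the gap-free test alignment pairs more residues than the reference.
sp, ppv = sum_of_pairs_and_ppv(["ACGT", "AGCT"], ["ACG-T", "A-GCT"])
print(f"sum-of-pairs = {sp:.3f}, PPV = {ppv:.3f}")
```

In this toy case the test alignment recovers two of the three reference homologies but also introduces two homologies absent from the reference, so its PPV is lower than its sum-of-pairs score. The sum-of-pairs score alone does not penalise these false positives.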

Subject terms

NATURVETENSKAP  -- Biologi -- Evolutionsbiologi (hsv//swe)
NATURAL SCIENCES  -- Biological Sciences -- Evolutionary Biology (hsv//eng)

Keywords

Sequence alignment
alignment accuracy
alignment uncertainty
pair hidden Markov models

Publication and content type

vet (subject category)
ovr (subject category)
