SwePub

onr:"swepub:oai:DiVA.org:uu-360840"
 


Examining sequence alignments using a model-based approach

Bogusz, Marcin (author)
Uppsala universitet, Evolutionsbiologi, Whelan Lab
Ali, Raja Hashim (author)
Uppsala universitet, Evolutionsbiologi
Whelan, Simon (author)
Uppsala universitet, Evolutionsbiologi
English.
  • Other publication (other academic/artistic)
Abstract
  • Multiple sequence alignment (MSA) is a commonly performed procedure required for a number of evolutionary and comparative analyses. The common two-step process of sequence alignment followed by statistical phylogenetic inference depends on MSA quality. MSA is computationally difficult, and as a result many sequence alignments contain regions of spurious homology. These alignment errors affect downstream results, so choosing an accurate MSA is critical. Researchers often face the problem of choosing an aligner from among many multiple sequence alignment methods (MSAMs). This choice is often based on benchmark results, with various popular methods claiming high accuracy scores. These methods compete to obtain the highest scores in the commonly used sum-of-pairs benchmark, which measures the fraction of true homologies recovered but ignores the fraction of false positive homologies introduced. Furthermore, these benchmarks do not account for the fact that some homologies are more difficult to recover than others. We take a probabilistic model-based approach to examine the quality of pairwise homologies returned by four popular MSAMs. We use pair hidden Markov models to break down alignment columns into pairs and obtain distributions of pairwise posterior scores for these aligners. Basing our results on a structural benchmark and a simulation study, we find that MSAMs appear to return a sample from a confidence set defined by high posterior probabilities. Furthermore, we find that the reference alignment contains a portion of pairwise homologies with low pairwise posterior probabilities, which cannot be expected to be recovered by any MSAM. Finally, we look at several possible test statistics, with and without the need for reference alignments, and ultimately suggest using positive predictive value (PPV) and mean posterior probability for MSA evaluation.
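
The statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the representation of an alignment as a list of gapped strings, and the `posterior` dictionary (assumed to come from a pair hidden Markov model computed elsewhere) are all assumptions made for illustration. The sum-of-pairs score is taken here as the fraction of reference homologies recovered, and PPV as the fraction of returned homologies that appear in the reference.

```python
from itertools import combinations


def homologous_pairs(msa):
    """Return the set of residue pairs that an alignment places in one column.

    `msa` is a list of equal-length gapped strings ('-' marks a gap).
    Each pair has the form ((seq_index, residue_index), (seq_index, residue_index)).
    """
    counters = [0] * len(msa)  # next ungapped residue index per sequence
    pairs = set()
    for col in range(len(msa[0])):
        residues = []
        for s, row in enumerate(msa):
            if row[col] != "-":
                residues.append((s, counters[s]))
                counters[s] += 1
        # Residues are collected in sequence order, so pairs are ordered consistently.
        pairs.update(combinations(residues, 2))
    return pairs


def sp_and_ppv(test_msa, reference_msa):
    """Sum-of-pairs score and positive predictive value against a reference."""
    test = homologous_pairs(test_msa)
    ref = homologous_pairs(reference_msa)
    shared = len(test & ref)
    sp = shared / len(ref) if ref else 0.0     # true homologies recovered
    ppv = shared / len(test) if test else 0.0  # returned homologies that are true
    return sp, ppv


def mean_posterior(test_msa, posterior):
    """Mean posterior probability of the pairs in an alignment (reference-free).

    `posterior` is a hypothetical dict mapping residue pairs to posterior
    probabilities; pairs missing from it are counted as 0.0.
    """
    pairs = homologous_pairs(test_msa)
    if not pairs:
        return 0.0
    return sum(posterior.get(p, 0.0) for p in pairs) / len(pairs)


# Toy example: sequences ACG and ACTG.
reference = ["AC-G", "ACTG"]
test = ["ACG--", "AC-TG"]  # conservative alignment: fewer pairs, all correct
sp, ppv = sp_and_ppv(test, reference)
```

In this toy case the test alignment recovers two of the three reference homologies and introduces no false ones, so the two statistics diverge: the sum-of-pairs score penalises the missing pair while PPV does not.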

Subject headings

NATURAL SCIENCES -- Biological Sciences -- Evolutionary Biology (hsv//eng)

Keywords

Sequence alignment
alignment accuracy
alignment uncertainty
pair hidden Markov models

Publication and Content Type

vet (subject category)
ovr (subject category)
