SwePub

id:"swepub:oai:gup.ub.gu.se/165284"
 


Exploration of multivariate analysis in microbial coding sequence modeling.

Mehmood, Tahir (author)
Bohlin, Jon (author)
Kristoffersen, Anja Bråthen (author)
Sæbø, Solve (author)
Warringer, Jonas, 1973 (author)
Gothenburg University, Göteborgs universitet, Institutionen för kemi och molekylärbiologi, Department of Chemistry and Molecular Biology
Snipen, Lars (author)
2012-05-14
2012
English.
In: BMC Bioinformatics. - Springer Science and Business Media LLC. - 1471-2105 ; 13
  • Journal article (peer-reviewed)
  • ABSTRACT: BACKGROUND: Gene finding is a complicated procedure that encapsulates algorithms for coding sequence modeling, identification of promoter regions, issues concerning overlapping genes and more. In the present study we focus on coding sequence modeling algorithms; that is, algorithms for identification and prediction of the actual coding sequences from genomic DNA. In this respect, we promote a novel multivariate method known as Canonical Powered Partial Least Squares (CPPLS) as an alternative to the commonly used Interpolated Markov model (IMM). Comparisons between the methods were performed on DNA, codon and protein sequences with highly conserved genes taken from several species with different genomic properties. RESULTS: The multivariate CPPLS approach classified coding sequence substantially better than the commonly used IMM on the same set of sequences. We also found that the use of CPPLS with codon representation gave significantly better classification results than both IMM with protein (p < 0.001) and with DNA (p < 0.001). Further, although the mean performance was similar, the variation of CPPLS performance on codon representation was significantly smaller than for IMM (p < 0.001). CONCLUSIONS: The performance of coding sequence modeling can be substantially improved by using an algorithm based on the multivariate CPPLS method applied to codon or DNA frequencies.
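The "codon representation" mentioned in the abstract can be illustrated with a minimal sketch, assuming it refers to the relative frequencies of the 64 codons read in a single frame. The function name `codon_frequencies` and the toy sequence are hypothetical and not taken from the article; the CPPLS and IMM models themselves are not shown.

```python
from collections import Counter
from itertools import product

# All 64 possible codons over the DNA alphabet.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
VALID = set(CODONS)


def codon_frequencies(seq):
    """Return the relative frequency of each codon in reading frame 0.

    Hypothetical helper for illustration only; triplets containing
    characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    triplets = (seq[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(t for t in triplets if t in VALID)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in CODONS}


# Toy sequence (made up): ATG GCC ATG TAA
freqs = codon_frequencies("ATGGCCATGTAA")
```

A vector like `freqs` (one value per codon) is the kind of fixed-length input a multivariate classifier could be trained on; DNA or protein representations would use different alphabets and word lengths.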

Subject headings

NATURAL SCIENCES  -- Biological Sciences (hsv//eng)

Publication and content type

ref (subject category)
art (subject category)
