SwePub

Accurate Domain Identification with Structure-Anchored Hidden Markov Models, saHMMs

Tångrot, Jeanette (author)
Umeå universitet, Institutionen för datavetenskap, Kemiska institutionen, Umeå centrum för molekylär patogenes (UCMP)
Kågström, Bo (author)
Umeå universitet, Institutionen för datavetenskap, Högpresterande beräkningscentrum norr (HPC2N), UMIT
Sauer, Uwe (author)
Umeå universitet, Umeå centrum för molekylär patogenes (UCMP) (Teknisk-naturvetenskaplig fakultet), Kemiska institutionen
Wiley, 2009
English.
In: Proteins. Wiley. ISSN 0887-3585, 1097-0134. 76:2, pp. 343-352
  • Journal article (peer-reviewed)
Abstract
The ever-increasing speed of DNA sequencing widens the discrepancy between the number of known gene products and the knowledge of their function and structure. Proper annotation of protein sequences is therefore crucial if the missing information is to be deduced from sequence-based similarity comparisons. These comparisons become exceedingly difficult as the pairwise identities drop to very low values. To improve the accuracy of domain identification, we exploit the fact that the three-dimensional structures of domains are much more conserved than their sequences. Based on structure-anchored multiple sequence alignments of low-identity homologues, we constructed 850 structure-anchored hidden Markov models (saHMMs), each representing one domain family. Since the saHMMs are highly family-specific, they can be used to assign a domain to its correct family and clearly distinguish it from domains belonging to other families, even within the same superfamily. This task is not trivial and becomes particularly difficult if the unknown domain is distantly related to the rest of the domain sequences within the family. In a search with full-length protein sequences, harbouring at least one domain as defined by the structural classification of proteins database (SCOP), version 1.71, versus the saHMM database based on SCOP version 1.69, we achieve an accuracy of 99.0%. All of the few hits outside the family fall within the correct superfamily. Compared to Pfam_ls HMMs, the saHMMs obtain about 11% higher coverage. A comparison with BLAST and PSI-BLAST demonstrates that the saHMMs have consistently fewer errors per query at a given coverage. Within our recommended E-value range, the same is true for a comparison with SUPERFAMILY. Furthermore, we are able to annotate 232 proteins with 530 nonoverlapping domains belonging to 102 different domain families among human proteins labelled unknown in the NCBI protein database.
Our results demonstrate that the saHMM database represents a versatile and reliable tool for identification of domains in protein sequences. With the aid of saHMMs, homology at the family level can be assigned, even for distantly related sequences. Due to the construction of the saHMMs, the hits they provide are always associated with high-quality crystal structures. The saHMM database can be accessed via the FISH server at http://babel.ucmp.umu.se/fish/.

Subject headings

NATURAL SCIENCES -- Biological Sciences -- Biochemistry and Molecular Biology (hsv//eng)
NATURAL SCIENCES -- Biological Sciences -- Biophysics (hsv//eng)

Keywords

protein domain
structure alignment
remote homologue
sequence annotation
protein family
protein superfamily

Publication and content type

ref (subject category)
art (subject category)
