SwePub

Context-specific independence mixture modeling for positional weight matrices.

Georgi, Benjamin (author)
Schliep, Alexander, 1967 (author)
Gothenburg University, Department of Computer Science and Engineering, Computing Science (GU)
2006-07-15
2006
English.
Part of: Bioinformatics (Oxford, England). - : Oxford University Press (OUP). - 1367-4811 .- 1367-4803. ; 22:14
  • Journal article (peer-reviewed)
Abstract
  • A positional weight matrix (PWM) is a statistical representation of the binding pattern of a transcription factor, estimated from known binding site sequences. Previous studies showed that for factors that bind to divergent binding sites, mixtures of multiple PWMs increase performance. However, estimating a conventional mixture distribution for each position will in many cases cause overfitting. We propose a context-specific independence (CSI) mixture model and a learning algorithm based on a Bayesian approach. The CSI model adjusts its complexity to fit the amount of variation observed at the sequence level in each position of a site. This not only yields a more parsimonious description of binding patterns, which improves parameter estimates, but also increases robustness, as the model automatically adapts the number of components to fit the data. Evaluation of the CSI model on simulated data showed favorable results compared to conventional mixtures. We demonstrate its adaptive properties in a classical model selection setup. The increased parsimony of the CSI model was shown for the transcription factor Leu3, where two binding-energy subgroups were distinguished as well as with a conventional mixture while requiring 30% fewer parameters. Analysis of the human-mouse conservation of predicted binding sites of 64 JASPAR TFs showed that CSI was as good as or better than a conventional mixture for 89% of the TFs and as good as or better than a single PWM model for 70%. http://algorithmics.molgen.mpg.de/mixture
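
As a rough illustration of the terms used in the abstract, the sketch below estimates a PWM from aligned binding sites and scores a sequence under a conventional mixture of PWMs. It is not the CSI model and not the authors' software; the function names, the pseudocount of 1.0, the equal mixture weights, and the toy sites are all assumptions made for this example.

```python
import math

ALPHABET = "ACGT"

def estimate_pwm(sites, pseudocount=1.0):
    """Per-position nucleotide frequencies from equal-length aligned sites.

    The pseudocount is an illustrative smoothing choice, not taken from the paper.
    """
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in ALPHABET})
    return pwm

def pwm_likelihood(pwm, seq):
    """Probability of seq under one PWM, treating positions as independent."""
    return math.prod(col[base] for col, base in zip(pwm, seq))

def mixture_likelihood(weights, pwms, seq):
    """Probability of seq under a conventional mixture of full PWMs."""
    return sum(w * pwm_likelihood(p, seq) for w, p in zip(weights, pwms))

# Hypothetical toy data: two subgroups of binding sites for one factor.
group_a = ["ACGT", "ACGT", "ACGA"]
group_b = ["TTGT", "TTGA", "TTGT"]
pwms = [estimate_pwm(group_a), estimate_pwm(group_b)]
weights = [0.5, 0.5]  # assumed equal weights, for illustration only

print(mixture_likelihood(weights, pwms, "ACGT"))
print(mixture_likelihood(weights, pwms, "GGCC"))
```

In this toy data both subgroups agree at the third position, yet the conventional mixture still keeps separate parameters for it in each component. That kind of redundancy is what the abstract says the CSI model is designed to reduce.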

Subject headings

NATURAL SCIENCES -- Computer and Information Sciences -- Bioinformatics (hsv//eng)

Keywords

Algorithms
Animals
Base Sequence
Binding Sites
Computer Simulation
DNA / genetics
Humans
Mice
Models, Genetic
Models, Statistical
Molecular Sequence Data
Protein Binding
Sequence Alignment / methods
Sequence Analysis, DNA / methods
Software
Transcription Factors / genetics

Publication and content type

ref (subject category)
art (subject category)
