SwePub
Search the SwePub database

Result list for search "L773:1557 8666"

  • Result 1-10 of 23
1.
  • Andersson, Samuel A., et al. (author)
  • Motif Yggdrasil : Sampling sequence motifs from a tree mixture model
  • 2007
  • In: Journal of Computational Biology. - 1066-5277 .- 1557-8666. ; 14:5, s. 682-697
  • Journal article (peer-reviewed)
    • In phylogenetic foot-printing, putative regulatory elements are found in upstream regions of orthologous genes by searching for common motifs. Motifs in different upstream sequences are subject to mutations along the edges of the corresponding phylogenetic tree; consequently, taking advantage of the tree in the motif search is an appealing idea. We describe the Motif Yggdrasil sampler, the first Gibbs sampler based on a general tree that uses unaligned sequences. Previous tree-based Gibbs samplers have assumed a star-shaped tree or partially aligned upstream regions. We give a probabilistic model (MY model) describing upstream sequences with regulatory elements and build a Gibbs sampler with respect to this model. The model allows toggling, i.e., the restriction of a position to a subset of nucleotides, but requires neither aligned sequences nor edge lengths, which may be difficult to come by. We apply the collapsing technique to eliminate the need to sample nuisance parameters, and give a derivation of the predictive update formula. We show that the MY model improves the modeling of difficult motif instances and that the use of the tree achieves a substantial increase in nucleotide-level correlation coefficient for both synthetic data and 37 bacterial lexA genes. We investigate the sensitivity to errors in the tree and show that, using random trees, the MY sampler still has a performance similar to the original version.
2.
  • Buongermino Pereira, Mariana, 1982, et al. (author)
  • HattCI: Fast and Accurate attC site Identification Using Hidden Markov Models
  • 2016
  • In: Journal of Computational Biology. - : Mary Ann Liebert Inc. - 1066-5277 .- 1557-8666. ; 23:11, s. 891-902
  • Journal article (peer-reviewed)
    • Integrons are genetic elements that facilitate horizontal gene transfer in bacteria and are known to harbor genes associated with antibiotic resistance. The gene mobility in the integrons is governed by the presence of attC sites, which are 55- to 141-nucleotide-long imperfect inverted repeats. Here we present HattCI, a new method for fast and accurate identification of attC sites in large DNA data sets. The method is based on a generalized hidden Markov model that describes each core component of an attC site individually. Using twofold cross-validation experiments on a manually curated reference data set of 231 attC sites from class 1 and 2 integrons, HattCI showed high sensitivities of up to 91.9% while maintaining satisfactory false-positive rates. When applied to a metagenomic data set of 35 microbial communities from different environments, HattCI found a substantially higher number of attC sites in the samples that are known to contain more horizontally transferred elements. HattCI will significantly increase the ability to identify attC sites and thus integron-mediated genes in genomic and metagenomic data. HattCI is implemented in C and is freely available at http://bioinformatics.math.chalmers.se/HattCI.
3.
  • Elias, Isaac, et al. (author)
  • Reconstruction of Ancestral Genomic Sequences Using Likelihood
  • 2007
  • In: Journal of Computational Biology. - : Mary Ann Liebert Inc. - 1066-5277 .- 1557-8666. ; 14:2, s. 216-237
  • Journal article (peer-reviewed)
    • A challenging task in computational biology is the reconstruction of genomic sequences of extinct ancestors, given the phylogenetic tree and the sequences at the leaves. This task is best solved by calculating the most likely estimate of the ancestral sequences, along with the most likely edge lengths. We deal with this problem and also the variant in which the phylogenetic tree, in addition to the ancestral sequences, needs to be estimated. The latter problem is known to be NP-hard, while the computational complexity of the former is unknown. Currently, all algorithms for solving these problems are heuristics without performance guarantees. The biological importance of these problems calls for developing better algorithms with guarantees of finding either optimal or approximate solutions. We develop approximation, fixed-parameter tractable (FPT), and fast heuristic algorithms for two variants of the problem: when the phylogenetic tree is known and when it is unknown. The approximation algorithm guarantees a solution with a log-likelihood ratio of 2 relative to the optimal solution. The FPT algorithm has a running time that is polynomial in the length of the sequences and exponential in the number of taxa. This makes it useful for calculating the optimal solution for small trees. Moreover, we combine the approximation algorithm and the FPT algorithm into an algorithm with an arbitrarily good approximation guarantee (PTAS). We tested our algorithms on both synthetic and biological data. In particular, we used the FPT algorithm for computing the most likely ancestral mitochondrial genomes of Hominidae (the great apes), thereby answering an interesting biological question. Moreover, we show how the approximation algorithms find good solutions for reconstructing the ancestral genomes for a set of lentiviruses (relatives of HIV). Supplementary material of this work is available at www.nada.kth.se/~isaac/publications/aml/aml.html.
4.
  • Elias, Isaac (author)
  • Settling the Intractability of Multiple Alignment
  • 2006
  • In: Journal of Computational Biology. - : Mary Ann Liebert Inc. - 1066-5277 .- 1557-8666. ; 13:7, s. 1323-1339
  • Journal article (peer-reviewed)
    • Multiple alignment is a core problem in computational biology that has received much attention over the years, both in the line of heuristics and hardness results. In most expositions of the problem it is referred to as NP-hard and references are given to one of the available hardness results. However, prior to this paper not even the most elementary variation of the problem, multiple alignment under the unit metric, had been proved hard. The aim of this paper is to settle the NP-hardness of the most common variations of multiple alignment. The following variations are shown to be NP-hard for all metrics over binary or larger alphabets: Multiple Alignment with SP-score, Star Alignment, and Tree Alignment (for a given phylogeny). In addition, NP-hardness results are provided for Consensus Patterns and Substring Parsimony.
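To make the objective in this abstract concrete, here is a minimal Python sketch of the sum-of-pairs (SP) score of a given alignment under the unit metric. It is not taken from the paper: the names `unit_metric` and `sp_score`, the use of `-` as the gap symbol, and the convention that a gap is treated as an ordinary symbol (gap against gap costs 0, gap against letter costs 1) are assumptions made for illustration.

```python
from itertools import combinations


def unit_metric(a: str, b: str) -> int:
    """Unit metric: 0 for identical symbols, 1 otherwise (assumed convention)."""
    return 0 if a == b else 1


def sp_score(alignment: list[str]) -> int:
    """Sum the column-wise distance over every unordered pair of rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have equal length")
    return sum(
        unit_metric(x, y)
        for row1, row2 in combinations(alignment, 2)
        for x, y in zip(row1, row2)
    )


# Hypothetical three-row alignment with '-' as the gap symbol.
example = ["AC-T", "ACGT", "A-GT"]
print(sp_score(example))
```

Scoring a given alignment, as above, is easy; the hardness result concerns finding an alignment that minimizes such a score.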
5.
  • Georgiev, Alexander (author)
  • Interpretable Numerical Descriptors of Amino Acid Space
  • 2009
  • In: Journal of Computational Biology. - : Mary Ann Liebert Inc. - 1066-5277 .- 1557-8666. ; 16:5, s. 703-723
  • Journal article (peer-reviewed)
    • Informative numerical representations of amino acid residues are essential for successful in silico modeling or establishing the structure-activity relationships of proteins. A straightforward approach is adopted here for representing more than 500 amino acid indices from the AAindex database by a set of uncorrelated scales, satisfying the VARIMAX criterion. Different measures are considered in order to demonstrate the improved interpretability of the current scales as compared to previously published ones. Performance is also addressed in a classification problem of G-protein coupled receptors, and is found to be similar to or higher than the performance achieved by six other scale sets. Finally, a unique correspondence between numerical indices and mutation matrices is derived and discussed in light of the evolutionary conservation of amino acid properties. Conclusions from this study highlight the discord between ease of interpretation of amino acid scales and their relevance to protein structure conservation, as well as general considerations for designing custom scale sets.
6.
  • Hjelm, M., et al. (author)
  • New Probabilistic network models and algorithms for oncogenesis
  • 2006
  • In: Journal of Computational Biology. - : Mary Ann Liebert Inc. - 1066-5277 .- 1557-8666. ; 13:4, s. 853-865
  • Journal article (peer-reviewed)
    • Chromosomal aberrations in solid tumors appear in complex patterns. It is important to understand how these patterns develop, the dynamics of the process, the temporal or even causal order between aberrations, and the involved pathways. Here we present network models for chromosomal aberrations and algorithms for training models based on observed data. Our models are generative probabilistic models that can be used to study dynamical aspects of chromosomal evolution in cancer cells. They are well suited for a graphical representation that conveys the pathways found in a dataset. By allowing only pairwise dependencies and partitioning aberrations into modules, in which all aberrations are restricted to have the same dependencies, we reduce the number of parameters so that dataset sizes relevant to cancer applications can be handled. We apply our framework to a dataset of colorectal cancer tumor karyotypes. The obtained model explains the data significantly better than a model where independence between the aberrations is assumed. In fact, the obtained model performs very well with respect to several measures of goodness of fit and is, with respect to repetition of the training, more or less unique.
7.
  • Jansson, Jesper, et al. (author)
  • Determining the consistency of resolved triplets and fan triplets
  • 2018
  • In: Journal of Computational Biology. - : Mary Ann Liebert Inc. - 1066-5277 .- 1557-8666. ; 25:7, s. 740-754
  • Journal article (peer-reviewed)
    • The R+-F+-Consistency problem takes as input two sets R+ and R- of resolved triplets and two sets F+ and F- of fan triplets, and asks for a distinctly leaf-labeled tree that contains all elements in R+ ∪ F+ and no elements in R- ∪ F- as embedded subtrees, if such a tree exists. This article presents a detailed characterization of how the computational complexity of the problem changes under various restrictions. Our main result is an efficient algorithm for dense inputs satisfying R- = ∅ whose running time is linear in the size of the input and therefore optimal.
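The consistency condition in this abstract can be illustrated with a small Python sketch that checks whether a given candidate tree contains every required triplet and none of the forbidden ones. This is only a verifier for a tree that is already in hand, not the article's algorithm, which must find such a tree or decide that none exists. The representation (leaves as strings, internal nodes as tuples of children) and the names `leaves`, `induced_triplet`, and `is_consistent` are assumptions made for illustration.

```python
def leaves(tree):
    """Return the set of leaf labels below a node (a leaf is a string)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def induced_triplet(tree, a, b, c):
    """Classify the subtree that `tree` induces on three distinct leaves.

    Returns ("resolved", frozenset({x, y}), z) for the resolved triplet xy|z,
    or ("fan", frozenset({a, b, c})) when all three branch off at one node.
    """
    wanted = {a, b, c}
    if len(wanted) != 3 or not wanted <= leaves(tree):
        raise ValueError("need three distinct leaves that occur in the tree")
    node = tree
    while True:
        below = [wanted & leaves(child) for child in node]
        # Descend while a single child still contains all three leaves.
        holder = next((ch for ch, s in zip(node, below) if s == wanted), None)
        if holder is None:
            break
        node = holder
    pairs = [s for s in below if len(s) == 2]
    if pairs:
        (z,) = wanted - pairs[0]
        return ("resolved", frozenset(pairs[0]), z)
    return ("fan", frozenset(wanted))


def is_consistent(tree, r_plus=(), r_minus=(), f_plus=(), f_minus=()):
    """Resolved triplets are given as ((x, y), z); fan triplets as (a, b, c)."""
    def has_resolved(t):
        (x, y), z = t
        return induced_triplet(tree, x, y, z) == ("resolved", frozenset({x, y}), z)

    def has_fan(t):
        return induced_triplet(tree, *t)[0] == "fan"

    return (all(map(has_resolved, r_plus)) and all(map(has_fan, f_plus))
            and not any(map(has_resolved, r_minus))
            and not any(map(has_fan, f_minus)))


# Hypothetical tree: a and b are siblings; that pair, c, and d hang off the root.
tree = (("a", "b"), "c", "d")
print(is_consistent(tree, r_plus=[(("a", "b"), "c")], f_plus=[("a", "c", "d")]))
```

Recomputing leaf sets at every step keeps the sketch short but makes each query slow on large trees; it is meant to clarify the definitions, not to be efficient.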
8.
  • Jonsson, Viktor, 1987, et al. (author)
  • Variability in Metagenomic Count Data and Its Influence on the Identification of Differentially Abundant Genes.
  • 2017
  • In: Journal of Computational Biology. - : Mary Ann Liebert Inc. - 1066-5277 .- 1557-8666. ; 24:4, s. 311-326
  • Journal article (peer-reviewed)
    • Metagenomics is the study of microorganisms in environmental and clinical samples using high-throughput sequencing of random fragments of their DNA. Since metagenomics does not require any prior culturing of isolates, entire microbial communities can be studied directly in their natural state. In metagenomics, the abundance of genes is quantified by sorting and counting the DNA fragments. The resulting count data are high-dimensional and affected by high levels of technical and biological noise that make the statistical analysis challenging. In this article, we introduce a hierarchical overdispersed Poisson model to explore the variability in metagenomic data. By analyzing three comprehensive data sets, we show that the gene-specific variability varies substantially between genes and is dependent on biological function. We also assess the power of identifying differentially abundant genes and show that incorrect assumptions about the gene-specific variability can lead to unacceptably high rates of false positives. Finally, we evaluate shrinkage approaches to improve the variance estimation and show that the prior choice significantly affects the statistical power. The results presented in this study further elucidate the complex variance structure of metagenomic data and provide suggestions for accurate and reliable identification of differentially abundant genes.
9.
  • Jönsson, Henrik, et al. (author)
  • An approximate maximum likelihood approach, applied to phylogenetic trees
  • 2003
  • In: Journal of Computational Biology. - 1557-8666. ; 10:5, s. 737-749
  • Journal article (peer-reviewed)
    • A novel type of approximation scheme to the maximum likelihood (ML) approach is presented and discussed in the context of phylogenetic tree reconstruction from aligned DNA sequences. It is based on a parameterized approximation to the conditional distribution of hidden variables (related, e.g., to the sequences of unobserved branch point ancestors) given the observed data. A modified likelihood, based on the extended data, is then maximized with respect to the parameters of the model as well as to those involved in the approximation. With a suitable form of the approximations the proposed method allows for simpler updating of the parameters, at the cost of an increased parameter count and a slight decrease in performance. The method is tested on phylogenetic tree reconstruction from artificially generated sequences, and its performance is compared to that of ML, showing that the approach is competitive for reasonably similar sequences. The method is also applied to real DNA sequences from primates, yielding a result consistent with those obtained by other standard algorithms.
10.
  •  
Type of publication
journal article (23)
Type of content
peer-reviewed (23)
Author/Editor
Kristiansson, Erik, ... (2)
Lagergren, Jens (2)
Holmgren, Sverker (2)
Schliep, Alexander, ... (2)
Elias, Isaac (2)
Sahlin, Kristoffer (2)
Mostad, Petter, 1964 (1)
Sonnhammer, ELL (1)
Sonnhammer, Erik L L (1)
Jönsson, Henrik (1)
Moore, Edward R.B. 1 ... (1)
Lingas, Andrzej (1)
Steel, M. (1)
Karlsson, Roger, 197 ... (1)
Salmela, Leena (1)
Makinen, Veli (1)
Carlborg, Örjan (1)
Höglund, Mattias (1)
Tjärnberg, Andreas (1)
Arvestad, Lars (1)
Moulton, Vincent (1)
Andersson, Björn, 19 ... (1)
Nerman, Olle, 1951 (1)
Söderberg, Bo (1)
Österlund, Tobias, 1 ... (1)
Andersson, Samuel A. (1)
Nettelblad, Carl (1)
Richter, Johan (1)
Bzhalava, D (1)
Axelson-Fisk, Marina ... (1)
Jonsson, Viktor, 198 ... (1)
Buongermino Pereira, ... (1)
Rudemo, Mats, 1937 (1)
Stenberg, Per, 1974- (1)
Tuller, Tamir (1)
Sung, Wing-Kin (1)
Frånberg, M. (1)
Wallroth, Mikael (1)
Bala, P (1)
Tomescu, Alexandru I ... (1)
Nandi, Soumyadeep (1)
Moscatelli, Ilana (1)
Henriksen, Kim (1)
Penny, D (1)
Jansson, Jesper (1)
Rothe, Michael (1)
Schambach, Axel (1)
Georgiev, Alexander (1)
Nowicki, M (1)
Hjelm, M (1)
University
Royal Institute of Technology (7)
University of Gothenburg (6)
Lund University (4)
Stockholm University (3)
Chalmers University of Technology (3)
Karolinska Institutet (3)
Uppsala University (2)
Umeå University (1)
Mid Sweden University (1)
Language
English (23)
Research subject (UKÄ/SCB)
Natural sciences (18)
Medical and Health Sciences (2)
