SwePub

Result list for search "WFRF:(Elofsson Arne) srt2:(2015-2019)"

  • Results 1-10 of 35
1.
  • Armenteros, Jose Juan Almagro, et al. (author)
  • Detecting sequence signals in targeting peptides using deep learning
  • 2019
  • In: Life Science Alliance. - : LIFE SCIENCE ALLIANCE LLC. - 2575-1077. ; 2:5
  • Journal article (peer-reviewed)
    • In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before.
2.
  • Basile, Walter, et al. (author)
  • High GC content causes orphan proteins to be intrinsically disordered
  • 2017
  • In: PLoS Computational Biology. - : Public Library of Science (PLoS). - 1553-734X .- 1553-7358. ; 13:3
  • Journal article (peer-reviewed)
    • De novo creation of protein coding genes involves the formation of short ORFs from noncoding regions; some of these ORFs might then become fixed in the population. These orphan proteins need to, at the bare minimum, not cause serious harm to the organism, meaning that they should for instance not aggregate. Therefore, although the creation of short ORFs could be truly random, the fixation should be subjected to some selective pressure. The selective forces acting on orphan proteins have been elusive, and contradictory results have been reported. In Drosophila young proteins are more disordered than ancient ones, while the opposite trend is present in yeast. To the best of our knowledge no valid explanation for this difference has been proposed. To solve this riddle we studied structural properties and age of proteins in 187 eukaryotic organisms. We find that, with the exception of length, there are only small differences in the properties between proteins of different ages. However, when we take the GC content into account we note that it could explain the opposite trends observed for orphans in yeast (low GC) and Drosophila (high GC). GC content is correlated with codons coding for disorder promoting amino acids. This leads us to propose that intrinsic disorder is not a strong determining factor for fixation of orphan proteins. Instead these proteins largely resemble random proteins given a particular GC level. During evolution the properties of a protein change faster than the GC level causing the relationship between disorder and GC to gradually weaken.
3.
  • Basile, Walter, 1980- (author)
  • Orphan Genes Bioinformatics : Identification and properties of de novo created genes
  • 2017
  • Doctoral thesis (other academic/artistic)
    • Even today, many genes are without any known homolog. These "orphans" are found in all species, from Viruses to Prokaryotes and Eukaryotes. For a portion of these genes, we might simply not have enough data to find homologs yet. Some of them are imported from taxonomically distant organisms via lateral transfer; others have homologs, but mutated beyond the point of recognition. However, a sizeable fraction of orphan genes is unambiguously created via "de novo" mechanisms. The study of such novel genes can contribute to our understanding of the emergence of functional novelty and the adaptation of species to new ecological niches. In this work, we first survey the field of orphan studies, and illustrate some of the common issues. Next, we analyze some of the intrinsic properties of orphan proteins, including secondary structure elements and Intrinsic Structural Disorder; specifically, we observe that in young proteins the relationship between these properties and the G+C content of their coding sequence is stronger than in older proteins. We then tackle some of the methodological problems often found in orphan studies. We find that using evolutionarily close species, and sensitive, state-of-the-art homology recognition methods is instrumental to the identification of a set of orphans enriched in de novo created ones. Finally, we compare how intrinsic disorder is distributed in bacteria versus eukaryota. Eukaryotic proteins are longer and more disordered; the difference is to be attributed primarily to eukaryotic-specific domains and linker regions. In these sections of the proteins, a higher frequency of the disorder-promoting amino acid Serine can be observed in Eukaryotes.
4.
  • Basile, Walter, 1980-, et al. (author)
  • The classification of orphans is improved by combining searches in both proteomes and genomes
  • 2017
  • Other publication (other academic/artistic)
    • The identification of de novo created genes is important as it provides a glimpse into the evolutionary processes of gene creation. Potential de novo created genes are identified by selecting genes that have no homologs outside a particular species, but for an accurate detection this identification needs to be correct. Genes without any homologs are often referred to as orphans; in addition to de novo created ones, fast evolving genes or genes lost in all related genomes might also be classified as orphans. The identification of orphans is dependent on: (i) a method to detect homologs and (ii) a database including genes from related genomes. Here, we set out to investigate how the detection of orphans is influenced by these two factors. Using Saccharomyces cerevisiae we find that the best strategy is to use a combination of searching annotated proteins and a six-frame translation of all ORFs from closely related genomes. Using this strategy we obtain a set of 54 orphans in Drosophila melanogaster and 38 in Drosophila pseudoobscura, significantly fewer than what is reported in some earlier studies.
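
The six-frame translation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`reverse_complement`, `translate`, `six_frame_translation`) are hypothetical, the standard genetic code is assumed, and ORF extraction and the homology search itself are left out.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate a DNA sequence in three forward and three reverse frames."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
print(frames["+1"])  # MA*
```

Searching the translated frames in addition to annotated proteins lets a homology search pick up coding regions that are missing from a related genome's annotation.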
5.
  • Basile, Walter, et al. (author)
  • Why do eukaryotic proteins contain more intrinsically disordered regions?
  • 2019
  • In: PLoS Computational Biology. - : Public Library of Science (PLoS). - 1553-734X .- 1553-7358. ; 15:7
  • Journal article (peer-reviewed)
    • Intrinsic disorder is more abundant in eukaryotic than prokaryotic proteins. Methods predicting intrinsic disorder are based on the amino acid sequence of a protein. Therefore, there must exist an underlying difference in the sequences between eukaryotic and prokaryotic proteins causing the (predicted) difference in intrinsic disorder. By comparing proteins from complete eukaryotic and prokaryotic proteomes, we show that the difference in intrinsic disorder emerges from the linker regions connecting Pfam domains. Eukaryotic proteins have more extended linker regions, and in addition, the eukaryotic linkers are significantly more disordered, 38% vs. 12-16% disordered residues. Next, we examined the underlying reason for the increase in disorder in eukaryotic linkers, and we found that the changes in abundance of only three amino acids cause the increase. Eukaryotic proteins contain 8.6% serine, while prokaryotic proteins have 6.5%; eukaryotic proteins also contain 5.4% proline and 5.3% isoleucine, compared with 4.0% proline and ≈ 7.5% isoleucine in the prokaryotes. All three of these differences contribute to the increased disorder in eukaryotic proteins. It is tempting to speculate that the increase in serine frequencies in eukaryotes is related to regulation by kinases, but direct evidence for this is lacking. The differences are observed in all phyla, protein families, structural regions and types of protein but are most pronounced in disordered and linker regions. The observation that differences in the abundance of three amino acids cause the difference in disorder between eukaryotic and prokaryotic proteins raises the question: Are amino acid frequencies different in eukaryotic linkers because the linkers are more disordered or do the differences cause the increased disorder?
6.
  • Cheng, Jianlin, et al. (author)
  • Estimation of model accuracy in CASP13
  • 2019
  • In: Proteins. - : Wiley. - 0887-3585 .- 1097-0134. ; 87:12, s. 1361-1377
  • Journal article (peer-reviewed)
    • Methods to reliably estimate the accuracy of 3D models of proteins are both a fundamental part of most protein folding pipelines and important for reliable identification of the best models when multiple pipelines are used. Here, we describe the progress made from CASP12 to CASP13 in the field of estimation of model accuracy (EMA) as seen from the progress of the most successful methods in CASP13. We show small but clear progress, that is, several methods perform better than the best methods from CASP12 when tested on CASP13 EMA targets. Some progress is driven by applying deep learning and residue‐residue contacts to model accuracy prediction. We show that the best EMA methods select better models than the best servers in CASP13, but that there exists a great potential to improve this further. Also, according to the evaluation criteria based on local similarities, such as lDDT and CAD, it is now clear that single model accuracy methods perform relatively better than consensus‐based methods.
7.
  • De Marothy, Minttu T., et al. (author)
  • Marginally hydrophobic transmembrane alpha-helices shaping membrane protein folding
  • 2015
  • In: Protein Science. - : Wiley. - 0961-8368 .- 1469-896X. ; 24:7, s. 1057-1074
  • Research review (peer-reviewed)
    • Cells have developed an incredible machinery to facilitate the insertion of membrane proteins into the membrane. While we have a fairly good understanding of the mechanism and determinants of membrane integration, more data is needed to understand the insertion of membrane proteins with more complex insertion and folding pathways. This review will focus on marginally hydrophobic transmembrane helices and their influence on membrane protein folding. These weakly hydrophobic transmembrane segments are by themselves not recognized by the translocon and therefore rely on local sequence context for membrane integration. How can such segments reside within the membrane? We will discuss this in the light of features found in the protein itself as well as the environment it resides in. Several characteristics in proteins have been described to influence the insertion of marginally hydrophobic helices. Additionally, the influence of biological membranes is significant. To begin with, the actual cost for having polar groups within the membrane may not be as high as expected; the presence of proteins in the membrane as well as characteristics of some amino acids may enable a transmembrane helix to harbor a charged residue. The lipid environment has also been shown to directly influence the topology as well as membrane boundaries of transmembrane helices, implying a dynamic relationship between membrane proteins and their environment.
8.
  • Dimou, Niki L., et al. (author)
  • GWAR : robust analysis and meta-analysis of genome-wide association studies
  • 2017
  • In: Bioinformatics. - : Oxford University Press (OUP). - 1367-4803 .- 1367-4811. ; 33:10, s. 1521-1527
  • Journal article (peer-reviewed)
    • Motivation: In genome-wide association studies (GWAS), a variety of statistical techniques can be used to conduct the analysis, but in most cases the underlying genetic model is unknown. Under these circumstances, the classical Cochran-Armitage trend test (CATT) is suboptimal. Robust procedures that maximize the power and preserve the nominal type I error rate are preferable. Moreover, performing a meta-analysis using robust procedures is of great interest and has never been addressed in the past. The primary goal of this work is to implement several robust methods for analysis and meta-analysis in the statistical package Stata and subsequently to make the software available to the scientific community. Results: The CATT under a recessive, additive and dominant model of inheritance as well as robust methods based on the Maximum Efficiency Robust Test statistic, the MAX statistic and the MIN2 were implemented in Stata. Concerning MAX and MIN2, we calculated their asymptotic null distributions relying on numerical integration resulting in a great gain in computational time without losing accuracy. All the aforementioned approaches were employed in a fixed or a random effects meta-analysis setting using summary data with weights equal to the reciprocal of the combined cases and controls. Overall, this is the first complete effort to implement procedures for analysis and meta-analysis in GWAS using Stata.
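
The Cochran-Armitage trend test named in this abstract can be sketched as follows. This is a plain-Python illustration of the common large-sample form of the statistic, not the Stata implementation the paper describes. The function name `catt`, the genotype ordering (aa, Aa, AA), and the score vectors are assumptions made for the example.

```python
import math

# Assumed genotype scores for (aa, Aa, AA) under each inheritance model.
MODEL_SCORES = {
    "recessive": (0, 0, 1),
    "additive": (0, 1, 2),
    "dominant": (0, 1, 1),
}


def catt(cases, controls, model="additive"):
    """Cochran-Armitage trend test on a 2 x 3 genotype table.

    cases, controls: counts per genotype in the order (aa, Aa, AA).
    Returns (z, p), where p is the two-sided normal-approximation p-value.
    """
    t = MODEL_SCORES[model]
    n = [r + s for r, s in zip(cases, controls)]  # column totals
    R, S = sum(cases), sum(controls)              # row totals
    N = R + S
    # Trend statistic: T = sum_i t_i * (r_i * S - s_i * R)
    T = sum(ti * (ri * S - si * R) for ti, ri, si in zip(t, cases, controls))
    # Large-sample null variance of T
    sum_t2n = sum(ti * ti * ni for ti, ni in zip(t, n))
    sum_tn = sum(ti * ni for ti, ni in zip(t, n))
    var = (R * S / N) * (N * sum_t2n - sum_tn ** 2)
    if var <= 0:
        raise ValueError("degenerate table: variance of the statistic is zero")
    z = T / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z, p = catt(cases=(10, 20, 30), controls=(30, 20, 10), model="additive")
```

Because the true model is usually unknown, a robust procedure such as the MAX statistic takes the largest absolute `z` over the three score vectors; its null distribution then needs separate treatment, which this sketch does not cover.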
9.
  • Elofsson, Arne, et al. (author)
  • Methods for estimation of model accuracy in CASP12
  • 2018
  • In: Proteins. - : Wiley. - 0887-3585 .- 1097-0134. ; 86:S1, s. 361-373
  • Journal article (peer-reviewed)
    • Methods to reliably estimate the quality of 3D models of proteins are essential drivers for the wide adoption and serious acceptance of protein structure predictions by life scientists. In this article, the most successful groups in CASP12 describe their latest methods for estimates of model accuracy (EMA). We show that pure single model accuracy estimation methods have shown clear progress since CASP11; the 3 top methods (MESHI, ProQ3, SVMQA) all perform better than the top method of CASP11 (ProQ2). Although the pure single model accuracy estimation methods outperform quasi-single (ModFOLD6 variations) and consensus methods (Pcons, ModFOLDclust2, Pcomb-domain, and Wallner) in model selection, they are still not as good as those methods in absolute model quality estimation and predictions of local quality. Finally, we show that when using contact-based model quality measures (CAD, lDDT) the single model quality methods perform relatively better.
10.
  •  
Publication type
journal article (26)
doctoral thesis (7)
other publication (1)
research review (1)
Content type
peer-reviewed (26)
other academic/artistic (9)
Author/editor
Elofsson, Arne (29)
Menéndez Hurtado (, ... (6)
Tsirigos, Konstantin ... (6)
Elofsson, Arne, Prof ... (6)
Salvatore, Marco (5)
Wallner, Björn (4)
Basile, Walter (3)
Bagos, Pantelis G. (2)
Basile, Walter, 1980 ... (2)
Bassot, Claudio (2)
Warholm, Per (2)
Hess, Berk (1)
Lindahl, Erik, 1972- (1)
van Der Spoel, David (1)
Kalmár, Lajos (1)
Daley, Daniel O. (1)
Emanuelsson, Olof (1)
Sander, Chris (1)
Uversky, Vladimir N. (1)
Winther, Ole (1)
von Heijne, Gunnar (1)
Nielsen, Henrik (1)
Li, Zhong (1)
Pilstål, Robert (1)
Stenmark, Pål (1)
Ott, Martin (1)
Armenteros, Jose Jua ... (1)
Carlström, Andreas (1)
Ekeberg, Magnus (1)
Mészáros, Attila (1)
Sachenkova, Oxana (1)
Light, Sara (1)
Tautz, Diethard, Pro ... (1)
Herrgård, Markus J. (1)
Nørholm, Morten H. H ... (1)
Käll, Lukas (1)
Davey, Norman E. (1)
Mirabello, Claudio (1)
Lindahl, Erik, Profe ... (1)
Dawitz, Hannah (1)
Qiu, Jian (1)
Endo, Toshiya (1)
Minervini, Giovanni (1)
Leonardi, Emanuela (1)
Tosatto, Silvio C.E. (1)
Västermark, Åke (1)
Pfanner, Nikolaus (1)
Cheng, Jianlin (1)
Choe, Myong‐Ho (1)
Han, Kun-Sop (1)
University
Stockholms universitet (34)
Kungliga Tekniska Högskolan (4)
Linköpings universitet (4)
Uppsala universitet (1)
Language
English (35)
Research subject (UKÄ/SCB)
Natural sciences (35)
Engineering and technology (3)
Medical and health sciences (2)
