SwePub

Result list for search "WFRF:(Elofsson Arne)"

  • Results 1-50 of 190
1.
  • Allison, Timothy M., et al. (author)
  • Complementing machine learning-based structure predictions with native mass spectrometry
  • 2022
  • In: Protein Science. - : John Wiley & Sons. - 0961-8368 .- 1469-896X. ; 31:6
  • Journal article (peer-reviewed). Abstract:
    • The advent of machine learning-based structure prediction algorithms such as AlphaFold2 (AF2) and RoseTTAFold has moved the generation of accurate structural models for the entire cellular protein machinery into the reach of the scientific community. However, structure predictions of protein complexes are based on user-provided input and may require experimental validation. Mass spectrometry (MS) is a versatile, time-effective tool that provides information on post-translational modifications, ligand interactions, conformational changes, and higher-order oligomerization. Using three protein systems, we show that native MS experiments can uncover structural features of ligand interactions, homology models, and point mutations that are undetectable by AF2 alone. We conclude that machine learning can be complemented with MS to yield more accurate structural models on a small and large scale.
2.
  •  
3.
  •  
4.
  • Armenteros, Jose Juan Almagro, et al. (author)
  • Detecting sequence signals in targeting peptides using deep learning
  • 2019
  • In: Life Science Alliance. - : LIFE SCIENCE ALLIANCE LLC. - 2575-1077. ; 2:5
  • Journal article (peer-reviewed). Abstract:
    • In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without a targeting peptide. The importance of this feature for predictions has not been highlighted before.
5.
  • Attwood, Misty M. (author)
  • Membrane-bound proteins : Characterization, evolution, and functional analysis
  • 2020
  • Doctoral thesis (other academic/artistic). Abstract:
    • Alpha-helical transmembrane proteins are important components of many essential cell processes including signal transduction, transport of molecules across membranes, protein and membrane trafficking, and structural and adhesion activities, amongst others. Their involvement in critical networks makes them the focus of interest in investigating disease pathways, as candidate drug targets, and in evolutionary analyses to identify homologous protein families and possible functional activities. Transmembrane (TM) proteins can be categorized into major groups based on the same gross structure, i.e., the number of transmembrane helices, which are often correlated with specific functional activities, for example as receptors or transporters. The focus of this thesis was to analyze the evolution of the membrane proteome from the last holozoan common ancestor (LHCA) through metazoans to garner insight into the fundamental functional clusters that underlie metazoan diversity and innovation. Twenty-four eukaryotic proteomes were analyzed, with results showing more than 70% of metazoan transmembrane protein families have a pre-metazoan origin. In concert with that, we characterized the previously unstudied groups of human proteins with three, four, and five membrane-spanning regions (3TM, 4TM, and 5TM) and analyzed their functional activities, involvement in disease pathways, and unique characteristics. Combined, we manually curated and classified nearly 11% of the human transmembrane proteome with these three studies. The 3TM data set included 152 proteins, with nearly 45% that localize specifically to the endoplasmic reticulum (ER), and are involved in membrane biosynthesis and lipid biogenesis, protein trafficking, catabolic processes, and signal transduction due to the large ionotropic glutamate receptor family. The 373 proteins identified in the 4TM data set are predominantly involved in transport activities, as well as cell communication and adhesion, and function as structural elements. The compact 5TM data set includes 58 proteins that engage in localization and transport activities, such as protein targeting, membrane trafficking, and vesicle transport. Notably, ~60% are identified as cancer prognostic markers that are associated with clinical outcomes of different tumour types. This thesis investigates the evolutionary origins of the human transmembrane proteome, characterizes formerly dark areas of the membrane proteome, and extends the fundamental knowledge of transmembrane proteins.
6.
  • Baldassarre, Federico, et al. (author)
  • GraphQA: Protein Model Quality Assessment using Graph Convolutional Networks
  • 2020
  • In: Bioinformatics. - : Oxford University Press. - 1367-4803 .- 1367-4811 .- 1460-2059. ; 37:3, pp. 360-366
  • Journal article (peer-reviewed). Abstract:
    • Motivation: Proteins are ubiquitous molecules whose function in biological processes is determined by their 3D structure. Experimental identification of a protein's structure can be time-consuming, prohibitively expensive, and not always possible. Alternatively, protein folding can be modeled using computational methods, which however are not guaranteed to always produce optimal results. GraphQA is a graph-based method to estimate the quality of protein models that possesses favorable properties such as representation learning, explicit modeling of both sequential and 3D structure, geometric invariance, and computational efficiency. Results: GraphQA performs similarly to state-of-the-art methods despite using a relatively low number of input features. In addition, the graph network structure provides an improvement over the architecture used in ProQ4 operating on the same input features. Finally, the individual contributions of GraphQA components are carefully evaluated. Availability and implementation: PyTorch implementation, datasets, experiments, and link to an evaluation server are available through this GitHub repository: github.com/baldassarreFe/graphqa. Supplementary information: Supplementary material is available at Bioinformatics online.
7.
  • Bano-Polo, Manuel, et al. (author)
  • Charge Pair Interactions in Transmembrane Helices and Turn Propensity of the Connecting Sequence Promote Helical Hairpin Insertion
  • 2013
  • In: Journal of Molecular Biology. - : Elsevier. - 0022-2836 .- 1089-8638. ; 425:4, pp. 830-840
  • Journal article (peer-reviewed). Abstract:
    • alpha-Helical hairpins, consisting of a pair of closely spaced transmembrane (TM) helices that are connected by a short interfacial turn, are the simplest structural motifs found in multi-spanning membrane proteins. In naturally occurring hairpins, the presence of polar residues is common and predicted to complicate membrane insertion. We postulate that the pre-packing process offsets any energetic cost of allocating polar and charged residues within the hydrophobic environment of biological membranes. Consistent with this idea, we provide here experimental evidence demonstrating that helical hairpin insertion into biological membranes can be driven by electrostatic interactions between closely separated, poorly hydrophobic sequences. Additionally, we observe that the integral hairpin can be stabilized by a short loop heavily populated by turn-promoting residues. We conclude that the combined effect of TM-TM electrostatic interactions and tight turns plays an important role in generating the functional architecture of membrane proteins and propose that helical hairpin motifs can be acquired within the context of the Sec61 translocon at the early stages of membrane protein biosynthesis. Taken together, these data further underline the potential complexities involved in accurately predicting TM domains from primary structures.
8.
  • Basile, Walter, 1980-, et al. (author)
  • Difference in disorder between eukaryotes and prokaryotes is largely due to Serine in linker regions
  • Other publication (other academic/artistic). Abstract:
    • In this study we ask which molecular properties make eukaryotic proteins more disordered than prokaryotic ones. First, we show that on average eukaryotic proteins contain more disorder-promoting amino acids. In particular, the fraction of Serine residues is close to 8% of all residues in eukaryotes and less than 6% in prokaryotes. Second, we show that domains unique to eukaryotes and linker regions in eukaryotes are both more disordered and more abundant than corresponding regions in prokaryotic proteins. Serine is an important residue for post-translational modification and regulatory mechanisms. Therefore, we conclude that it is not unlikely that both the need for regulation in a complex eukaryotic cell and the increased amount of longer multi-domain proteins contribute to the higher intrinsic structural disorder in eukaryotic proteins.
9.
  • Basile, Walter, et al. (author)
  • High GC content causes orphan proteins to be intrinsically disordered
  • 2017
  • In: PLoS Computational Biology. - : Public Library of Science (PLoS). - 1553-734X .- 1553-7358. ; 13:3
  • Journal article (peer-reviewed). Abstract:
    • De novo creation of protein coding genes involves the formation of short ORFs from noncoding regions; some of these ORFs might then become fixed in the population. These orphan proteins need to, at the bare minimum, not cause serious harm to the organism, meaning that they should for instance not aggregate. Therefore, although the creation of short ORFs could be truly random, the fixation should be subjected to some selective pressure. The selective forces acting on orphan proteins have been elusive, and contradictory results have been reported. In Drosophila young proteins are more disordered than ancient ones, while the opposite trend is present in yeast. To the best of our knowledge no valid explanation for this difference has been proposed. To solve this riddle we studied structural properties and age of proteins in 187 eukaryotic organisms. We find that, with the exception of length, there are only small differences in the properties between proteins of different ages. However, when we take the GC content into account we note that it could explain the opposite trends observed for orphans in yeast (low GC) and Drosophila (high GC). GC content is correlated with codons coding for disorder-promoting amino acids. This leads us to propose that intrinsic disorder is not a strong determining factor for fixation of orphan proteins. Instead these proteins largely resemble random proteins given a particular GC level. During evolution the properties of a protein change faster than the GC level, causing the relationship between disorder and GC to gradually weaken.
10.
  • Basile, Walter, 1980- (author)
  • Orphan Genes Bioinformatics : Identification and properties of de novo created genes
  • 2017
  • Doctoral thesis (other academic/artistic). Abstract:
    • Even today, many genes are without any known homolog. These "orphans" are found in all species, from Viruses to Prokaryotes and Eukaryotes. For a portion of these genes, we might simply not have enough data to find homologs yet. Some of them are imported from taxonomically distant organisms via lateral transfer; others have homologs, but mutated beyond the point of recognition. However, a sizeable fraction of orphan genes is unambiguously created via "de novo" mechanisms. The study of such novel genes can contribute to our understanding of the emergence of functional novelty and the adaptation of species to new ecological niches. In this work, we first survey the field of orphan studies, and illustrate some of the common issues. Next, we analyze some of the intrinsic properties of orphan proteins, including secondary structure elements and Intrinsic Structural Disorder; specifically, we observe that in young proteins the relationship between these properties and the G+C content of their coding sequence is stronger than in older proteins. We then tackle some of the methodological problems often found in orphan studies. We find that using evolutionarily close species, and sensitive, state-of-the-art homology recognition methods is instrumental to the identification of a set of orphans enriched in de novo created ones. Finally, we compare how intrinsic disorder is distributed in bacteria versus eukaryota. Eukaryotic proteins are longer and more disordered; the difference is to be attributed primarily to eukaryotic-specific domains and linker regions. In these sections of the proteins, a higher frequency of the disorder-promoting amino acid Serine can be observed in Eukaryotes.
11.
  • Basile, Walter, 1980-, et al. (author)
  • The classification of orphans is improved by combining searches in both proteomes and genomes
  • 2017
  • Other publication (other academic/artistic). Abstract:
    • The identification of de novo created genes is important as it provides a glimpse into the evolutionary processes of gene creation. Potential de novo created genes are identified by selecting genes that have no homologs outside a particular species, but for an accurate detection this identification needs to be correct. Genes without any homologs are often referred to as orphans; in addition to de novo created ones, fast evolving genes or genes lost in all related genomes might also be classified as orphans. The identification of orphans is dependent on: (i) a method to detect homologs and (ii) a database including genes from related genomes. Here, we set out to investigate how the detection of orphans is influenced by these two factors. Using Saccharomyces cerevisiae, we find that the best strategy is to use a combination of searching annotated proteins and a six-frame translation of all ORFs from closely related genomes. Using this strategy we obtain a set of 54 orphans in Drosophila melanogaster and 38 in Drosophila pseudoobscura, significantly fewer than reported in some earlier studies.
12.
  • Basile, Walter, et al. (author)
  • Why do eukaryotic proteins contain more intrinsically disordered regions?
  • 2019
  • In: PLoS Computational Biology. - : Public Library of Science (PLoS). - 1553-734X .- 1553-7358. ; 15:7
  • Journal article (peer-reviewed). Abstract:
    • Intrinsic disorder is more abundant in eukaryotic than prokaryotic proteins. Methods predicting intrinsic disorder are based on the amino acid sequence of a protein. Therefore, there must exist an underlying difference in the sequences between eukaryotic and prokaryotic proteins causing the (predicted) difference in intrinsic disorder. By comparing proteins from complete eukaryotic and prokaryotic proteomes, we show that the difference in intrinsic disorder emerges from the linker regions connecting Pfam domains. Eukaryotic proteins have more extended linker regions, and in addition, the eukaryotic linkers are significantly more disordered, 38% vs. 12-16% disordered residues. Next, we examined the underlying reason for the increase in disorder in eukaryotic linkers, and we found that the changes in abundance of only three amino acids cause the increase. Eukaryotic proteins contain 8.6% serine, while prokaryotic proteins have 6.5%; eukaryotic proteins also contain 5.4% proline and 5.3% isoleucine compared with 4.0% proline and ≈ 7.5% isoleucine in the prokaryotes. All three differences contribute to the increased disorder in eukaryotic proteins. It is tempting to speculate that the increase in serine frequencies in eukaryotes is related to regulation by kinases, but direct evidence for this is lacking. The differences are observed in all phyla, protein families, structural regions and types of protein but are most pronounced in disordered and linker regions. The observation that differences in the abundance of three amino acids cause the difference in disorder between eukaryotic and prokaryotic proteins raises the question: Are amino acid frequencies different in eukaryotic linkers because the linkers are more disordered, or do the differences cause the increased disorder?
13.
  • Basmarke-Wehelie, Rahma, et al. (author)
  • The complement regulator CD46 is bactericidal to Helicobacter pylori and blocks urease activity
  • 2011
  • In: Gastroenterology. - Baltimore : Elsevier BV. - 0016-5085 .- 1528-0012. ; 141:3, pp. 918-928
  • Journal article (peer-reviewed). Abstract:
    • BACKGROUND & AIMS: CD46 is a C3b/C4b binding complement regulator and a receptor for several human pathogens. We examined the interaction between CD46 and Helicobacter pylori (a bacterium that colonizes the human gastric mucosa and causes gastritis, peptic ulcers, and cancer). METHODS: Using gastric epithelial cells, we analyzed a set of H pylori strains and mutants for their ability to interact with CD46 and/or influence CD46 expression. Bacterial interaction with full-length CD46 and small CD46 peptides was evaluated by flow cytometry, fluorescence microscopy, enzyme-linked immunosorbent assay, and bacterial survival analyses. RESULTS: H pylori infection caused shedding of CD46 into the extracellular environment. A soluble form of CD46 bound to H pylori and inhibited growth, in a dose- and time-dependent manner, by interacting with urease and alkyl hydroperoxide reductase, which are essential bacterial pathogenicity-associated factors. Binding of CD46 or CD46-derived synthetic peptides blocked the urease activity and ability of bacteria to survive in acidic environments. Oral administration of one CD46 peptide eradicated H pylori from infected mice. CONCLUSIONS: CD46 is an antimicrobial agent that can eradicate H pylori. CD46 peptides might be developed to treat H pylori infection.
14.
  • Bassot, Claudio, et al. (author)
  • Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families
  • 2021
  • In: PLoS Computational Biology. - : Public Library of Science (PLoS). - 1553-734X .- 1553-7358. ; 17:4
  • Journal article (peer-reviewed). Abstract:
    • Repeat proteins are widespread among organisms and particularly abundant in eukaryotic proteomes. Their primary sequences contain amino acid repetitions that give rise to structures with repeated folds/domains. Although the repeated units can often be recognised from the sequence alone, structural information is often missing. Here, we used contact prediction to predict the structure of repeat proteins directly from their primary sequences. We benchmark the methods on a dataset comprising all the known repeated structures. We evaluate the contact predictions and the obtained models for different classes of repeat proteins. Further, we develop and benchmark a quality assessment (QA) method specific for repeat proteins. Finally, we used the prediction pipeline for all PFAM repeat families without resolved structures and found that forty-one of them could be modelled with high accuracy. Repeat proteins are abundant in eukaryotic proteomes. They are involved in many eukaryotic specific functions, including signalling. For many of these proteins, the structure is not known, as they are difficult to crystallise. Today, using direct coupling analysis and deep learning it is often possible to predict a protein's structure. However, the unique sequence features present in repeat proteins have made it challenging to use direct coupling analysis for predicting contacts. Here, we show that deep learning-based methods (trRosetta, DeepMetaPsicov (DMP) and PconsC4) overcome this problem and can predict intra- and inter-unit contacts in repeat proteins. In a benchmark dataset of 815 repeat proteins, about 90% can be correctly modelled. Further, among 48 PFAM families lacking a protein structure, we produce models of forty-one families with estimated high accuracy.
15.
  • Bendz, Maria, et al. (author)
  • Membrane protein shaving with thermolysin can be used to evaluate topology predictors
  • 2013
  • In: Proteomics. - : Wiley. - 1615-9853 .- 1615-9861. ; 13:9, pp. 1467-1480
  • Journal article (peer-reviewed). Abstract:
    • Topology analysis of membrane proteins can be obtained by enzymatic shaving in combination with MS identification of peptides. Ideally, such analysis could provide quite detailed information about the membrane spanning regions. Here, we examine the ability of some shaving enzymes to provide large-scale analysis of membrane proteome topologies. To compare different shaving enzymes, we first analyzed the detected peptides from two over-expressed proteins. Second, we analyzed the peptides from non-over-expressed Escherichia coli membrane proteins with known structure to evaluate the shaving methods. Finally, the identified peptides were used to test the accuracy of a number of topology predictors. We suggest that the usage of thermolysin, an enzyme working at the natural pH of the cell, for membrane shaving is superior because: (i) we detect a similar number of peptides and proteins using thermolysin and trypsin; (ii) thermolysin shaving can be run at a natural pH; (iii) the incubation time is quite short; and (iv) fewer detected peptides from thermolysin shaving originate from the transmembrane regions. Using thermolysin shaving we can also provide a clear separation between the best and the less accurate topology predictors, indicating that using data from shaving can provide valuable information when developing new topology predictors.
16.
  • Bernsel, Andreas, et al. (author)
  • Prediction of membrane-protein topology from first principles
  • 2008
  • In: Proceedings of the National Academy of Sciences of the United States of America. - : Proceedings of the National Academy of Sciences. - 0027-8424 .- 1091-6490. ; 105:20, pp. 7177-7181
  • Journal article (peer-reviewed). Abstract:
    • The current best membrane-protein topology-prediction methods are typically based on sequence statistics and contain hundreds of parameters that are optimized on known topologies of membrane proteins. However, because the insertion of transmembrane helices into the membrane is the outcome of molecular interactions among protein, lipids and water, it should be possible to predict topology by methods based directly on physical data, as proposed >20 years ago by Kyte and Doolittle. Here, we present two simple topology-prediction methods using a recently published experimental scale of position-specific amino acid contributions to the free energy of membrane insertion that perform on a par with the current best statistics-based topology predictors. This result suggests that prediction of membrane-protein topology and structure directly from first principles is an attainable goal, given the recently improved understanding of peptide recognition by the translocon.
17.
  •  
18.
  • Bernsel, Andreas, et al. (author)
  • TOPCONS : consensus prediction of membrane protein topology
  • 2009
  • In: Nucleic Acids Research. - : Oxford University Press (OUP). - 0305-1048 .- 1362-4962. ; 37:Suppl. 2, pp. W465-W468
  • Journal article (peer-reviewed). Abstract:
    • TOPCONS (http://topcons.net/) is a web server for consensus prediction of membrane protein topology. The underlying algorithm combines an arbitrary number of topology predictions into one consensus prediction and quantifies the reliability of the prediction based on the level of agreement between the underlying methods, both on the protein level and on the level of individual TM regions. Benchmarking the method shows that overall performance levels match the best available topology prediction methods, and for sequences with high reliability scores, performance is increased by approximately 10 percentage points. The web interface allows for constraining parts of the sequence to a known inside/outside location, and detailed results are displayed both graphically and in text format.
19.
  • Björklund, Asa K, et al. (author)
  • Quantitative assessment of the structural bias in protein-protein interaction assays.
  • 2008
  • In: Proteomics. - : Wiley. - 1615-9853 .- 1615-9861. ; 8:22, pp. 4657-4667
  • Journal article (peer-reviewed). Abstract:
    • With recent publications of several large-scale protein-protein interaction (PPI) studies, the realization of the full yeast interaction network is getting closer. Here, we have analysed several yeast protein interaction datasets to understand their strengths and weaknesses. In particular, we investigate the effect of experimental biases on some of the protein properties suggested to be enriched in highly connected proteins. Finally, we use support vector machines (SVM) to assess the contribution of these properties to protein interactivity. We find that protein abundance is the most important factor for detecting interactions in tandem affinity purifications (TAP), while it is of less importance for Yeast Two Hybrid (Y2H) screens. Consequently, sequence conservation and/or essentiality of hubs may be related to their high abundance. Further, proteins with disordered structure are over-represented in Y2H screens and in one, but not the other, large-scale TAP assay. Hence, disordered regions may be important both in transient interactions and interactions in complexes. Finally, a few domain families seem to be responsible for a large part of all interactions. Most importantly, we show that there are method-specific biases in PPI experiments. Thus, care should be taken before drawing strong conclusions based on a single dataset.
20.
  • Björklund, Åsa, 1976- (author)
  • Creation of new proteins - domain rearrangements and tandem duplications
  • 2010
  • Doctoral thesis (other academic/artistic). Abstract:
    • Proteins are modular entities with domains as their building blocks. The domains are recurrent protein fragments with a distinct structure, function and evolutionary history. During evolution, proteins with new functions have been invented through rearrangements as well as differentiation of domains. The focus of this thesis is to gain better understanding of the processes that govern domain rearrangements. In particular, the rearrangements that create long protein domain repeats have been investigated in detail. We estimate that about 65% of the eukaryotic and 40% of the prokaryotic proteins are of the multidomain type. Further, we find that the eukaryotic multidomain proteins are mainly created through insertion of a single domain at the N- or C-terminus. However, domain repeats differ from other domain rearrangements in the aspect that they are created from internal tandem duplications. We show that such duplications often involve several domains simultaneously, and that different repeated domain families show distinct evolutionary patterns. Finally, we have investigated how large repeat regions are created using a specific example: the Actin binding nebulin domain. The analysis reveals several tandem duplications of both single nebulin domains and super repeats of seven nebulins in a number of vertebrates. We see that the duplication breakpoints vary between the species and that multiple duplications of the same region are common.
21.
  • Björklund, Åsa K., et al. (author)
  • Domain Rearrangements in Protein Evolution
  • 2005
  • In: Journal of Molecular Biology. - : Elsevier BV. - 0022-2836 .- 1089-8638. ; 353:4, pp. 911-923
  • Journal article (peer-reviewed). Abstract:
    • Most eukaryotic proteins are multi-domain proteins that are created from fusions of genes, deletions and internal repetitions. An investigation of such evolutionary events requires a method to find the domain architecture from which each protein originates. Therefore, we defined a novel measure, domain distance, which is calculated as the number of domains that differ between two domain architectures. Using this measure the evolutionary events that distinguish a protein from its closest ancestor have been studied and it was found that indels are more common than internal repetition and that the exchange of a domain is rare. Indels and repetitions are common at both the N- and C-termini, while they are rare between domains. The evolution of the majority of multi-domain proteins can be explained by the stepwise insertions of single domains, with the exception of repeats, in which several domains are sometimes duplicated in tandem. We show that domain distances agree with sequence similarity and semantic similarity based on gene ontology annotations. In addition, we demonstrate the use of the domain distance measure to build evolutionary trees. Finally, the evolution of multi-domain proteins is exemplified by a closer study of the evolution of two protein families, non-receptor tyrosine kinases and RhoGEFs.
22.
  • Björklund, Åsa K., et al. (author)
  • Expansion of Protein Domain Repeats
  • 2006
  • In: PLoS Computational Biology. - : Public Library of Science (PLoS). - 1553-734X .- 1553-7358. ; 2:8, pp. 959-970
  • Journal article (peer-reviewed). Abstract:
    • Many proteins, especially in eukaryotes, contain tandem repeats of several domains from the same family. These repeats have a variety of binding properties and are involved in protein-protein interactions as well as binding to other ligands such as DNA and RNA. The rapid expansion of protein domain repeats is assumed to have evolved through internal tandem duplications. However, the exact mechanisms behind these tandem duplications are not well understood. Here, we have studied the evolution, function, protein structure, gene structure, and phylogenetic distribution of domain repeats. For this purpose we have assigned Pfam-A domain families to 24 proteomes with more sensitive domain assignments in the repeat regions. These assignments confirmed previous findings that eukaryotes, and in particular vertebrates, contain a much higher fraction of proteins with repeats compared with prokaryotes. The internal sequence similarity in each protein revealed that the domain repeats are often expanded through duplications of several domains at a time, while the duplication of one domain is less common. Many of the repeats appear to have been duplicated in the middle of the repeat region. This is in strong contrast to the evolution of other proteins that mainly works through additions of single domains at either terminus. Further, we found that some domain families show distinct duplication patterns, e.g., nebulin domains have mainly been expanded with a unit of seven domains at a time, while duplications of other domain families involve varying numbers of domains. Finally, no common mechanism for the expansion of all repeats could be detected. We found that the duplication patterns show no dependence on the size of the domains. Further, repeat expansion in some families can possibly be explained by shuffling of exons. However, exon shuffling could not have created all repeats.
23.
  • Björklund, Åsa K., et al. (author)
  • Nebulin : A Study of Protein Repeat Evolution
  • 2010
  • In: Journal of Molecular Biology. - : Elsevier BV. - 0022-2836 .- 1089-8638. ; 402:1, pp. 38-51
  • Journal article (peer-reviewed). Abstract:
    • Protein domain repeats are common in proteins that are central to the organization of a cell, in particular in eukaryotes. They are known to evolve through internal tandem duplications. However, the understanding of the underlying mechanisms is incomplete. To shed light on repeat expansion mechanisms, we have studied the evolution of the muscle protein Nebulin, a protein that contains a large number of actin-binding nebulin domains. Nebulin proteins have evolved from an invertebrate precursor containing two nebulin domains. Repeat regions have expanded through duplications of single domains, as well as duplications of a super repeat (SR) consisting of seven nebulins. We show that the SR has evolved independently into large regions in at least three instances: twice in the invertebrate Branchiostoma floridae and once in vertebrates. In-depth analysis reveals several recent tandem duplications in the Nebulin gene. The events involve both single-domain and multidomain SR units or several SR units. There are single events, but frequently the same unit is duplicated multiple times. For instance, an ancestor of human and chimpanzee underwent two tandem duplications. The duplication junction coincides with an Alu transposon, thus suggesting duplication through Alu-mediated homologous recombination. Duplications in the SR region consistently involve multiples of seven domains. However, the exact unit that is duplicated varies both between species and within species. Thus, multiple tandem duplications of the same motif did not create the large Nebulin protein. Finally, analysis of segmental duplications in the human genome reveals that duplications are more common in genes containing domain repeats than in those coding for nonrepeated proteins. In fact, segmental duplications are found three to six times more often in long repeated genes than expected by chance. 
  •  
24.
  • Bryant, Patrick, et al. (author)
  • Decomposing Structural Response Due to Sequence Changes in Protein Domains with Machine Learning
  • 2020
  • In: Journal of Molecular Biology. - : Elsevier BV. - 0022-2836 .- 1089-8638. ; 432:16, pp. 4435-4446
  • Journal article (peer-reviewed). Abstract:
    • How protein domain structure changes in response to mutations is not well understood. Some mutations change the structure drastically, while most only result in small changes. To gain an understanding of this, we decompose the relationship between changes in domain sequence and structure using machine learning. We select pairs of evolutionarily related domains with a broad range of evolutionary distances. In contrast to earlier studies, we do not find a strictly linear relationship between sequence and structural changes. We train a random forest regressor that predicts the structural similarity between pairs with an average accuracy of 0.029 lDDT (local Distance Difference Test) score, and a correlation coefficient of 0.92. Decomposing the feature importance shows that the domain length, or, analogously, size, is the most important feature. Our model enables assessing deviations in relative structural response, and thus prediction of evolutionary trajectories, in protein domains across evolution.
  •  
25.
  • Bryant, Patrick, et al. (author)
  • Estimating the impact of mobility patterns on COVID-19 infection rates in 11 European countries
  • 2020
  • In: PeerJ. - : PeerJ. - 2167-8359. ; 8
  • Journal article (peer-reviewed). Abstract:
    • Background: As governments across Europe have issued non-pharmaceutical interventions (NPIs) such as social distancing and school closing, the mobility patterns in these countries have changed. Most states have implemented similar NPIs at similar time points. However, it is likely that different countries and populations respond differently to the NPIs and that these differences cause mobility patterns, and thereby the epidemic development, to change. Methods: We build a Bayesian model that estimates the number of deaths on a given day dependent on changes in the basic reproductive number, R0, due to differences in mobility patterns. We utilise mobility data from Google mobility reports using five different categories: retail and recreation, grocery and pharmacy, transit stations, workplace and residential. The importance of each mobility category for predicting changes in R0 is estimated through the model. Findings: The changes in mobility have a considerable overlap with the introduction of governmental NPIs, highlighting the importance of government action for population behavioural change. The shift in mobility in all categories shows high correlations with the death rates 1 month later. Reduction of movement within the grocery and pharmacy sector is estimated to account for most of the decrease in R0. Interpretation: Our model predicts 3-week epidemic forecasts, using real-time observations of changes in mobility patterns, which can provide governments with direct feedback on the effects of their NPIs. The model predicts the changes in a majority of the countries accurately but overestimates the impact of NPIs in Sweden and Denmark and underestimates them in France and Belgium. We also note that the exponential nature of all epidemiological models based on the basic reproductive number, R0, causes small errors to have extensive effects on the predicted outcome.
  •  
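The closing remark of the abstract in entry 25, that small errors in the reproductive number have large effects on the forecast, can be illustrated with a minimal sketch. It assumes plain geometric growth, I_n = I_0 · R^n, which is a simplification chosen here for illustration and not the Bayesian model of the paper; the function names and example values are hypothetical.

```python
def projected_cases(initial_cases: float, r: float, generations: int) -> float:
    """Cases in generation n under simple geometric growth: I_n = I_0 * R**n."""
    return initial_cases * r ** generations


def relative_forecast_error(r_true: float, r_estimated: float, generations: int) -> float:
    """Relative forecast error when R is misestimated.

    Under geometric growth this equals (r_estimated / r_true)**generations - 1,
    so a fixed relative error in R compounds with every generation.
    """
    true_cases = projected_cases(1.0, r_true, generations)
    estimated_cases = projected_cases(1.0, r_estimated, generations)
    return estimated_cases / true_cases - 1.0


if __name__ == "__main__":
    # Illustrative values only: a 10% overestimate of R (2.2 instead of 2.0).
    for n in (1, 3, 5, 10):
        print(n, relative_forecast_error(2.0, 2.2, n))
```

Under this simplification, a 10% error in R grows to more than 150% of the forecast after ten generations, which is the kind of sensitivity the abstract refers to.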
26.
  • Bryant, Patrick, et al. (author)
  • Improved prediction of protein-protein interactions using AlphaFold2
  • 2022
  • In: Nature Communications. - : Springer Science and Business Media LLC. - 2041-1723. ; 13:1
  • Journal article (peer-reviewed). Abstract:
    • Predicting the structure of interacting protein chains is a fundamental step towards understanding protein function. Unfortunately, no computational method can produce accurate structures of protein complexes. AlphaFold2 has shown unprecedented levels of accuracy in modelling single-chain protein structures. Here, we apply AlphaFold2 for the prediction of heterodimeric protein complexes. We find that the AlphaFold2 protocol, together with optimised multiple sequence alignments, generates models with acceptable quality (DockQ ≥ 0.23) for 63% of the dimers. From the predicted interfaces we create a simple function to predict the DockQ score, which distinguishes acceptable from incorrect models as well as interacting from non-interacting proteins with state-of-the-art accuracy. We find that, using the predicted DockQ scores, we can identify 51% of all interacting pairs at 1% FPR. Predicting the structure of protein complexes is extremely difficult. Here, the authors apply AlphaFold2 with optimized multiple sequence alignments to model complexes of interacting proteins, enabling prediction of both if and how proteins interact with state-of-the-art accuracy.
  •  
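The figure quoted in entry 26, "51% of all interacting pairs at 1% FPR", is a true positive rate at a fixed false positive rate. A minimal sketch of how such a number can be computed from two lists of scores follows. The function name, the strict "score > threshold" convention and the toy scores are assumptions made for illustration; they are not taken from the paper.

```python
import math


def tpr_at_fpr(pos_scores, neg_scores, max_fpr=0.01):
    """Return (threshold, TPR) for the lowest threshold whose FPR <= max_fpr.

    A pair is called interacting when its score is strictly above the threshold.
    """
    neg_sorted = sorted(neg_scores, reverse=True)
    # Number of non-interacting pairs that may score above the threshold.
    allowed_fp = math.floor(max_fpr * len(neg_sorted) + 1e-9)
    if allowed_fp >= len(neg_sorted):
        threshold = float("-inf")
    else:
        # Only negatives ranked above this one can exceed the threshold.
        threshold = neg_sorted[allowed_fp]
    true_pos = sum(1 for s in pos_scores if s > threshold)
    return threshold, true_pos / len(pos_scores)


if __name__ == "__main__":
    # Toy scores, purely illustrative.
    negatives = [i / 100 for i in range(100)]
    positives = [0.995, 0.99, 0.985, 0.5]
    print(tpr_at_fpr(positives, negatives, max_fpr=0.01))
```

Choosing the threshold from the negative set alone keeps the false positive rate bounded regardless of how the positive scores are distributed.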
27.
  • Bryant, Patrick, 1993- (author)
  • Learning Protein Evolution and Structure
  • 2022
  • Doctoral thesis (other academic/artistic). Abstract:
    • By analysing the structure of a protein it is possible to draw conclusions about its function. Obtaining the structure of a protein experimentally is, however, a time-consuming and expensive process. By using evolution it is possible to infer the structure of a protein. AlphaFold2 (AF), the latest AI technology for protein structure prediction, uses evolutionary information to obtain protein structures in minutes instead of years at a fraction of the experimental cost. Here, we develop this technology further to predict the structure of interacting proteins. We create a confidence score, pDockQ, and show that this score rivals high-throughput experiments in distinguishing true and false protein-protein interactions (PPIs). Applying AF and the pDockQ score to a set of 65484 human PPIs we identify 1371 new high-confidence models. These models expand the structural knowledge of human protein complexes and can be used to e.g. develop new drugs or evaluate biological pathways. One limitation of AF is that the accuracy decreases with the number of proteins being predicted together and that the biggest protein complexes do not fit in the memory of the latest GPUs. To circumvent these issues, we predict subcomponents of protein complexes and assemble these together with Monte Carlo tree search (MCTS). MCTS enables assembling some of the largest protein complexes using only sequence information and stoichiometry. Out of 175 protein complexes with 10-30 chains, 91 can be completely assembled with a median TM-score of 0.51. A third of these (30 complexes) are highly accurate (TM-score ≥0.8). The use of highly accurate protein structure prediction is revolutionising many fields of biological research only one year after its realisation. Likely, this is only the beginning of a new era; the era of AI.
  •  
28.
  • Bryant, Patrick, et al. (author)
  • Peptide binder design with inverse folding and protein structure prediction
  • 2023
  • In: Communications Chemistry. - 2399-3669. ; 6:1
  • Journal article (peer-reviewed). Abstract:
    • The computational design of peptide binders towards a specific protein interface can aid diagnostic and therapeutic efforts. Here, we design peptide binders by combining the known structural space searched with Foldseek, the protein design method ESM-IF1, and AlphaFold2 (AF) in a joint framework. Foldseek generates backbone seeds for a modified version of ESM-IF1 adapted to protein complexes. The resulting sequences are evaluated with AF using an MSA representation for the receptor structure and a single sequence for the binder. We show that AF can accurately evaluate protein binders and that our bind score can select these (ROC AUC = 0.96 for the heterodimeric case). We find that designs created from seeds with more contacts per residue are more successful and tend to be short. There is a relationship between the sequence recovery in interface positions and the plDDT of the designs, where designs with ≥80% recovery have an average plDDT of 84 compared to 55 at 0%. Designed sequences have 60% higher median plDDT values towards intended receptors than non-intended ones. Successful binders (predicted interface RMSD ≤2 Å) are designed towards 185 (6.5%) heteromeric and 42 (3.6%) homomeric protein interfaces with ESM-IF1 compared with 18 (1.5%) using ProteinMPNN from 100 samples. Designing peptides that bind to specific protein targets is crucial for peptidic drug development; however, traditional computer-aided binder design is outperformed by AlphaFold2. Here, the authors develop a peptide binder design tool by combining Foldseek, ESM-IF1 and AlphaFold2 to increase the success rate.
  •  
29.
  • Bryant, Patrick, et al. (author)
  • Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search
  • 2022
  • In: Nature Communications. - : Springer Science and Business Media LLC. - 2041-1723. ; 13:1
  • Journal article (peer-reviewed). Abstract:
    • AlphaFold can predict the structure of single- and multiple-chain proteins with very high accuracy. However, the accuracy decreases with the number of chains, and the available GPU memory limits the size of protein complexes which can be predicted. Here we show that one can predict the structure of large complexes starting from predictions of subcomponents. We assemble 91 out of 175 complexes with 10–30 chains from predicted subcomponents using Monte Carlo tree search, with a median TM-score of 0.51. There are 30 highly accurate complexes (TM-score ≥0.8, 33% of complete assemblies). We create a scoring function, mpDockQ, that can distinguish if assemblies are complete and predict their accuracy. We find that complexes containing symmetry are accurately assembled, while asymmetrical complexes remain challenging. The method is freely available and accessible as a Colab notebook at https://colab.research.google.com/github/patrickbryant1/MoLPC/blob/master/MoLPC.ipynb.
  •  
30.
  • Bryant, Patrick, et al. (author)
  • The relationship between ageing and changes in the human blood and brain methylomes
  • 2022
  • In: NAR Genomics and Bioinformatics. - : Oxford University Press (OUP). - 2631-9268. ; 4:1
  • Journal article (peer-reviewed). Abstract:
    • Changes in DNA methylation have been found to be strongly correlated with age, enabling the creation of ‘epigenetic clocks’. Previously, studies on the relationship between ageing and DNA methylation have assumed a linear relationship. Here, we show that several markers show a non-linear behaviour. In particular, we observe a tendency for saturation with age, especially in the cerebellum. Further, we show that the relationships between significant methylation changes and ageing are different in different tissues. We suggest a straightforward method of assessing all methylation-age relationships and clustering them according to their relative fold change. Our fold change selection outperforms the most common epigenetic clocks in predicting age for the cerebellum, but not for blood or the frontal cortex. Further, we find that the saturation of methylation observed at older ages for the cerebellum explains why epigenetic clocks consistently underestimate the age there. The findings imply that assuming linear correlations might cause biologically important markers to be missed.
  •  
31.
  • Burke, David F., et al. (author)
  • Towards a structurally resolved human protein interaction network
  • 2023
  • In: Nature Structural & Molecular Biology. - : Springer Science and Business Media LLC. - 1545-9993 .- 1545-9985. ; 30:2, pp. 216-225
  • Journal article (peer-reviewed). Abstract:
    • Cellular functions are governed by molecular machines that assemble through protein-protein interactions. Their atomic details are critical to studying their molecular mechanisms. However, fewer than 5% of hundreds of thousands of human protein interactions have been structurally characterized. Here we test the potential and limitations of recent progress in deep-learning methods using AlphaFold2 to predict structures for 65,484 human protein interactions. We show that experiments can orthogonally confirm higher-confidence models. We identify 3,137 high-confidence models, of which 1,371 have no homology to a known structure. We identify interface residues harboring disease mutations, suggesting potential mechanisms for pathogenic variants. Groups of interface phosphorylation sites show patterns of co-regulation across conditions, suggestive of coordinated tuning of multiple protein interactions as signaling responses. Finally, we provide examples of how the predicted binary complexes can be used to build larger assemblies helping to expand our understanding of human cell biology.
  •  
32.
  • Cheng, Jianlin, et al. (author)
  • Estimation of model accuracy in CASP13
  • 2019
  • In: Proteins. - : Wiley. - 0887-3585 .- 1097-0134. ; 87:12, pp. 1361-1377
  • Journal article (peer-reviewed). Abstract:
    • Methods to reliably estimate the accuracy of 3D models of proteins are both a fundamental part of most protein folding pipelines and important for reliable identification of the best models when multiple pipelines are used. Here, we describe the progress made from CASP12 to CASP13 in the field of estimation of model accuracy (EMA) as seen from the progress of the most successful methods in CASP13. We show small but clear progress, that is, several methods perform better than the best methods from CASP12 when tested on CASP13 EMA targets. Some progress is driven by applying deep learning and residue‐residue contacts to model accuracy prediction. We show that the best EMA methods select better models than the best servers in CASP13, but that there exists a great potential to improve this further. Also, according to the evaluation criteria based on local similarities, such as lDDT and CAD, it is now clear that single model accuracy methods perform relatively better than consensus‐based methods.
  •  
33.
  • Contreras, F.-Xabier, et al. (author)
  • Molecular recognition of a single sphingolipid species by a protein's transmembrane domain
  • 2012
  • In: Nature. - : Springer Science and Business Media LLC. - 0028-0836 .- 1476-4687. ; 481:7382, pp. 525-529
  • Journal article (peer-reviewed). Abstract:
    • Functioning and processing of membrane proteins critically depend on the way their transmembrane segments are embedded in the membrane. Sphingolipids are structural components of membranes and can also act as intracellular second messengers. Not much is known of sphingolipids binding to transmembrane domains (TMDs) of proteins within the hydrophobic bilayer, and how this could affect protein function. Here we show a direct and highly specific interaction of exclusively one sphingomyelin species, SM 18, with the TMD of the COPI machinery protein p24 (ref. 2). Strikingly, the interaction depends on both the headgroup and the backbone of the sphingolipid, and on a signature sequence (VXXTLXXIY) within the TMD. Molecular dynamics simulations show a close interaction of SM 18 with the TMD. We suggest a role of SM 18 in regulating the equilibrium between an inactive monomeric and an active oligomeric state of the p24 protein, which in turn regulates COPI-dependent transport. Bioinformatic analyses predict that the signature sequence represents a conserved sphingolipid-binding cavity in a variety of mammalian membrane proteins. Thus, in addition to a function as second messengers, sphingolipids can act as cofactors to regulate the function of transmembrane proteins. Our discovery of an unprecedented specificity of interaction of a TMD with an individual sphingolipid species adds to our understanding of why biological membranes are assembled from such a large variety of different lipids.
  •  
34.
  • Dahl, Leo, 1995-, et al. (author)
  • Multiplexed selectivity screening of anti-GPCR antibodies
  • 2023
  • In: Science Advances. - : American Association for the Advancement of Science (AAAS). - 2375-2548. ; 9:18
  • Journal article (peer-reviewed). Abstract:
    • G protein-coupled receptors (GPCRs) control critical cellular signaling pathways. Therapeutic agents including anti-GPCR antibodies (Abs) are being developed to modulate GPCR function. However, validating the selectivity of anti-GPCR Abs is challenging because of sequence similarities among individual receptors within GPCR subfamilies. To address this challenge, we developed a multiplexed immunoassay to test >400 anti-GPCR Abs from the Human Protein Atlas targeting a customized library of 215 expressed and solubilized GPCRs representing all GPCR subfamilies. We found that ~61% of Abs tested were selective for their intended target, ~11% bound off-target, and ~28% did not bind to any GPCR. Antigens of on-target Abs were, on average, significantly longer, more disordered, and less likely to be buried in the interior of the GPCR protein than the other Abs. These results provide important insights into the immunogenicity of GPCR epitopes and form a basis for designing therapeutic Abs and for detecting pathological auto-Abs against GPCRs.
  •  
35.
  • De Marothy, Minttu T., et al. (author)
  • Marginally hydrophobic transmembrane alpha-helices shaping membrane protein folding
  • 2015
  • In: Protein Science. - : Wiley. - 0961-8368 .- 1469-896X. ; 24:7, pp. 1057-1074
  • Research review (peer-reviewed). Abstract:
    • Cells have developed an incredible machinery to facilitate the insertion of membrane proteins into the membrane. While we have a fairly good understanding of the mechanism and determinants of membrane integration, more data is needed to understand the insertion of membrane proteins with more complex insertion and folding pathways. This review will focus on marginally hydrophobic transmembrane helices and their influence on membrane protein folding. These weakly hydrophobic transmembrane segments are by themselves not recognized by the translocon and therefore rely on local sequence context for membrane integration. How can such segments reside within the membrane? We will discuss this in the light of features found in the protein itself as well as the environment it resides in. Several characteristics in proteins have been described to influence the insertion of marginally hydrophobic helices. Additionally, the influence of biological membranes is significant. To begin with, the actual cost for having polar groups within the membrane may not be as high as expected; the presence of proteins in the membrane as well as characteristics of some amino acids may enable a transmembrane helix to harbor a charged residue. The lipid environment has also been shown to directly influence the topology as well as membrane boundaries of transmembrane helices, implying a dynamic relationship between membrane proteins and their environment.
  •  
36.
  • de Marothy, Tuuli Minttu Virkki, 1984- (author)
  • Marginally hydrophobic transmembrane α-helices shaping membrane protein folding
  • 2014
  • Doctoral thesis (other academic/artistic). Abstract:
    • Most membrane proteins are inserted into the membrane co-translationally utilizing the translocon, which allows a sufficiently long and hydrophobic stretch of amino acids to partition into the membrane. However, X-ray structures of membrane proteins have revealed that some transmembrane helices (TMHs) are surprisingly hydrophilic. These marginally hydrophobic transmembrane helices (mTMHs) are not recognized as TMHs by the translocon in the absence of local sequence context. We have studied three native mTMHs, which were previously shown to depend on a subsequent TMH for membrane insertion. Their recognition was not due to specific interactions. Instead, the presence of basic amino acids in their cytoplasmic loop allowed membrane insertion of one of them. In the other two, basic residues are not sufficient unless followed by another, hydrophobic TMH. Post-insertional repositioning is another way to bring hydrophilic residues into the membrane. We show how four long TMHs with hydrophilic residues seen in X-ray structures are initially inserted as much shorter membrane-embedded segments. Tilting is thus induced after membrane insertion, probably through tertiary packing interactions within the protein. Aquaporin 1 illustrates how an mTMH can shape membrane protein folding and how repositioning can be important in post-insertional folding. It initially adopts a four-helical intermediate, where mTMH2 and TMH4 are not inserted into the membrane. Consequently, TMH3 is inserted in an inverted orientation. The final conformation with six TMHs is formed by TMH2 and 4 entering the membrane and TMH3 rotating 180°. Based on experimental and computational results, we propose a mechanism for the initial step in the folding of AQP1: A shift of TMH3 out from the membrane core allows the preceding regions to enter the membrane, which provides flexibility for TMH3 to re-insert in its correct orientation.
  •  
37.
  • Delucchi, Matteo, et al. (author)
  • A New Census of Protein Tandem Repeats and Their Relationship with Intrinsic Disorder
  • 2020
  • In: Genes. - : MDPI AG. - 2073-4425. ; 11:4
  • Journal article (peer-reviewed). Abstract:
    • Protein tandem repeats (TRs) are often associated with immunity-related functions and diseases. Since the last census of protein TRs in 1999, the number of curated proteins has increased more than seven-fold and new TR prediction methods have been published. TRs appear to be enriched with intrinsic disorder and vice versa. The significance and the biological reasons for this association are unknown. Here, we characterize protein TRs across all kingdoms of life and their overlap with intrinsic disorder in unprecedented detail. Using state-of-the-art prediction methods, we estimate that 50.9% of proteins contain at least one TR, often located at the sequence flanks. A positive linear correlation between the proportion of TRs and the protein length was observed universally, with Eukaryotes in general having more TRs, but when the difference in length is taken into account the difference is quite small. TRs were enriched with disorder-promoting amino acids and were inside intrinsically disordered regions. Many such TRs were homorepeats. Our results support that TRs mostly originate by duplication and are involved in essential functions such as transcription processes, structural organization, electron transport and iron-binding. In viruses, TRs are found in proteins essential for virulence.
  •  
38.
  • Dimou, Niki L., et al. (author)
  • GWAR : robust analysis and meta-analysis of genome-wide association studies
  • 2017
  • In: Bioinformatics. - : Oxford University Press (OUP). - 1367-4803 .- 1367-4811. ; 33:10, pp. 1521-1527
  • Journal article (peer-reviewed). Abstract:
    • Motivation: In the context of genome-wide association studies (GWAS), there is a variety of statistical techniques in order to conduct the analysis, but, in most cases, the underlying genetic model is usually unknown. Under these circumstances, the classical Cochran-Armitage trend test (CATT) is suboptimal. Robust procedures that maximize the power and preserve the nominal type I error rate are preferable. Moreover, performing a meta-analysis using robust procedures is of great interest and has never been addressed in the past. The primary goal of this work is to implement several robust methods for analysis and meta-analysis in the statistical package Stata and subsequently to make the software available to the scientific community. Results: The CATT under a recessive, additive and dominant model of inheritance as well as robust methods based on the Maximum Efficiency Robust Test statistic, the MAX statistic and the MIN2 were implemented in Stata. Concerning MAX and MIN2, we calculated their asymptotic null distributions relying on numerical integration resulting in a great gain in computational time without losing accuracy. All the aforementioned approaches were employed in a fixed or a random effects meta-analysis setting using summary data with weights equal to the reciprocal of the combined cases and controls. Overall, this is the first complete effort to implement procedures for analysis and meta-analysis in GWAS using Stata.
  •  
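The abstract of entry 38 names the Cochran-Armitage trend test (CATT) under recessive, additive and dominant models, and the MAX statistic built from them. A minimal Python sketch of the textbook form of these statistics follows. It is not the Stata implementation described in the paper; the function names, the genotype ordering (0, 1, 2 copies of the risk allele) and the toy counts are assumptions made for illustration. The MERT and MIN2 statistics and the null distribution of MAX are not shown.

```python
import math

# Genotype scores for counts ordered as (0, 1, 2 copies of the risk allele).
WEIGHTS = {
    "recessive": (0, 0, 1),
    "additive": (0, 1, 2),
    "dominant": (0, 1, 1),
}


def catt_z(cases, controls, weights):
    """Cochran-Armitage trend statistic Z for a 2 x 3 genotype table."""
    n_cases, n_controls = sum(cases), sum(controls)
    total = n_cases + n_controls
    column_totals = [r + s for r, s in zip(cases, controls)]
    # U = sum_i x_i * (S * r_i - R * s_i)
    u = sum(w * (n_controls * r - n_cases * s)
            for w, r, s in zip(weights, cases, controls))
    # Var(U) = (R * S / N) * (N * sum x_i^2 n_i - (sum x_i n_i)^2)
    sum_w2n = sum(w * w * n for w, n in zip(weights, column_totals))
    sum_wn = sum(w * n for w, n in zip(weights, column_totals))
    variance = (n_cases * n_controls / total) * (total * sum_w2n - sum_wn ** 2)
    return u / math.sqrt(variance)


def max_statistic(cases, controls):
    """MAX: largest absolute CATT Z over the three genetic models."""
    return max(abs(catt_z(cases, controls, w)) for w in WEIGHTS.values())


if __name__ == "__main__":
    # Toy genotype counts, purely illustrative.
    cases, controls = (30, 40, 30), (50, 40, 10)
    for model, w in WEIGHTS.items():
        print(model, catt_z(cases, controls, w))
    print("MAX", max_statistic(cases, controls))
```

Because MAX takes the maximum over correlated statistics, its p-value cannot be read from a standard normal table; the abstract states that the authors obtain the asymptotic null distribution by numerical integration.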
39.
  • Duart, Gerard, et al. (author)
  • Intra-helical salt bridge contribution to membrane protein insertion
  • 2024
  • Other publication (other academic/artistic). Abstract:
    • Salt bridges between negatively (D, E) and positively charged (K, R, H) amino acids play an important role in protein stabilization. This has a more prevalent effect in membrane proteins where polar amino acids are exposed to a very hydrophobic environment. In transmembrane (TM) helices the presence of charged residues can hinder the insertion of the helices into the membrane. This can sometimes be avoided by TM region rearrangements after insertion, but it is also possible that the formation of salt bridges could decrease the cost of membrane integration. However, the presence of intra-helical salt bridges in TM domains and their effect on insertion has not been properly studied yet. In this work, we use an analytical pipeline to study the prevalence of charged pairs of amino acid residues in TM α-helices, which shows that potentially salt-bridge forming pairs are statistically over-represented. We then selected some candidates to experimentally determine the contribution of these electrostatic interactions to the translocon-assisted membrane insertion process. Using both in vitro and in vivo systems, we confirm the presence of intra-helical salt bridges in TM segments during biogenesis and determined that they contribute between 0.5-0.7 kcal/mol to the apparent free energy of membrane insertion (ΔGapp). Our observations suggest that salt bridge interactions can be stabilized during translocon-mediated insertion and thus could be relevant to consider for the future development of membrane protein prediction software.
  •  
40.
  • Duart, Gerard, et al. (author)
  • Intra-Helical Salt Bridge Contribution to Membrane Protein Insertion
  • 2022
  • In: Journal of Molecular Biology. - : Elsevier BV. - 0022-2836 .- 1089-8638. ; 434:5
  • Journal article (peer-reviewed). Abstract:
    • Salt bridges between negatively (D, E) and positively charged (K, R, H) amino acids play an important role in protein stabilization. This has a more prevalent effect in membrane proteins where polar amino acids are exposed to a hydrophobic environment. In transmembrane (TM) helices the presence of charged residues can hinder the insertion of the helices into the membrane. It is possible that the formation of salt bridges could decrease the cost of membrane integration. However, the presence of intra-helical salt bridges in TM domains and their effect on insertion has not been properly studied yet. In this work, we show that potentially salt-bridge forming pairs are statistically over-represented in TM helices. We then selected some candidates to experimentally determine the contribution of these electrostatic interactions to the translocon-assisted membrane insertion process. Using both in vitro and whole cell systems, we confirm the presence of intra-helical salt bridges in TM segments during biogenesis and determined that they contribute ~0.5 kcal/mol to the apparent free energy of membrane insertion (ΔGapp). Our observations suggest that salt bridge interactions can be stabilized during translocon-mediated insertion and thus could be relevant to consider for the future development of membrane protein prediction software.
  •  
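Entries 39 and 40 describe scanning TM helices for pairs of oppositely charged residues that could form intra-helical salt bridges. A minimal sketch of such a scan follows. The charge classes (D, E negative; K, R, H positive) come from the abstracts; the i → i+3 and i → i+4 spacings, the function name and the example sequences are assumptions made here for illustration and are not the authors' pipeline. The statistical over-representation analysis is not shown.

```python
NEGATIVE = set("DE")   # negatively charged residues, as listed in the abstract
POSITIVE = set("KRH")  # positively charged residues, as listed in the abstract


def charged_pairs(helix, spacings=(3, 4)):
    """List (i, j, residues) for oppositely charged residues at the given spacings.

    The default spacings of 3 and 4 residues are an assumption: they place both
    side chains on roughly the same face of an alpha-helix.
    """
    pairs = []
    for i, first in enumerate(helix):
        for spacing in spacings:
            j = i + spacing
            if j >= len(helix):
                continue
            second = helix[j]
            opposite = (
                (first in NEGATIVE and second in POSITIVE)
                or (first in POSITIVE and second in NEGATIVE)
            )
            if opposite:
                pairs.append((i, j, first + second))
    return pairs


if __name__ == "__main__":
    # Hypothetical helix sequence, purely illustrative.
    print(charged_pairs("LLLDLLLKLLL"))
```

Positions are zero-based indices into the helix string, so they need an offset before being compared with residue numbers in a full-length protein.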
41.
  •  
42.
  • Ekman, Diana, 1977- (author)
  • Domain rearrangement and creation in protein evolution
  • 2008
  • Doctoral thesis (other academic/artistic). Abstract:
    • Proteins are composed of domains, recurrent protein fragments with distinct structure, function and evolutionary history. Some domains exist only as single-domain proteins; however, a majority of them are also combined with other domains. Domain rearrangements are important in the evolution of new proteins as new functionalities can arise in a single evolutionary event. In addition, the domain repertoire can be expanded through mutations of existing domains and de novo creation. The processes of domain rearrangement and creation have been the focus of this thesis. According to our estimates, about 65% of the eukaryotic and 40% of the prokaryotic proteins are of multidomain type. We found that insertion of a single domain at the N- or C-terminus was the most common event in the creation of novel multidomain architectures. However, domain repeats deviate from this pattern and are often expanded through duplications of several domains. Next, by mapping domain combinations onto an evolutionary tree we estimated that roughly one domain architecture has been created per million years, with the highest rates in metazoa. Much of this so-called explosion of new architectures in metazoa seems to be explained by a set of domains amenable to exon shuffling. In contrast to domain architectures, most known domain families evolved early. However, many proteins have incomplete domain coverage, and could hence contain de novo created domains. In Saccharomyces cerevisiae, however, species-specific sequences constitute only a minor fraction of the proteome, and are often short, disordered sequences located at the protein termini.
  •  
43.
  • Ekman, Diana, et al. (författare)
  • Identifying and Quantifying Orphan Protein Sequences in Fungi
  • 2010
  • In: Journal of Molecular Biology. - : Elsevier BV. - 0022-2836 .- 1089-8638. ; 396:2, pp. 396-405
  • Journal article (peer-reviewed)
    • For large regions of many proteins, and even entire proteins, no homology to known domains or proteins can be detected. These sequences are often referred to as orphans. Surprisingly, it has been reported that the large number of orphans is sustained in spite of a rapid increase of available genomic sequences. However, it is believed that de novo creation of coding sequences is rare in comparison to mechanisms such as domain shuffling and gene duplication; hence, most sequences should have homologs in other genomes. To investigate this, the sequences of 19 complete fungi genomes were compared. By using the phylogenetic relationship between these genomes, we could identify potentially de novo created orphans in Saccharomyces cerevisiae. We found that only a small fraction, <2%, of the S. cerevisiae proteome is orphan, which confirms that de novo creation of coding sequences is indeed rare. Furthermore, we found it necessary to compare the most closely related species to distinguish between de novo created sequences and rapidly evolving sequences where homologs are present but cannot be detected. Next, the orphan proteins (OPs) and orphan domains (ODs) were characterized. First, it was observed that both OPs and ODs are short. In addition, at least some of the OPs have been shown to be functional in experimental assays, showing that they are not pseudogenes. Furthermore, in contrast to what has been reported before and what is seen for older orphans, S. cerevisiae specific ODs and proteins are not more disordered than other proteins. This might indicate that many of the older, and earlier classified, orphans indeed are fast-evolving sequences. Finally, >90% of the detected ODs are located at the protein termini, which suggests that these orphans could have been created by mutations that have affected the start or stop codons.
  •  
44.
  • Ekman, Diana, et al. (authors)
  • Multi-domain Proteins in the Three Kingdoms of Life : Orphan Domains and Other Unassigned Regions
  • 2005
  • In: Journal of Molecular Biology. - : Elsevier BV. - 0022-2836 .- 1089-8638. ; 348:1, pp. 241-243
  • Journal article (peer-reviewed)
    • Comparative studies of the proteomes from different organisms have provided valuable information about protein domain distribution in the kingdoms of life. Earlier studies have been limited by the fact that only about 50% of the proteomes could be matched to a domain. Here, we have extended these studies by including less well-defined domain definitions, Pfam-B and clustered domains, MAS, in addition to Pfam-A and SCOP domains. It was found that a significant fraction of these domain families are homologous to Pfam-A or SCOP domains. Further, we show that all regions that do not match a Pfam-A or SCOP domain contain a significantly higher fraction of disordered structure. These unstructured regions may be contained within orphan domains or function as linkers between structured domains. Using several different definitions we have re-estimated the number of multi-domain proteins in different organisms and found that several methods all predict that eukaryotes have approximately 65% multi-domain proteins, while the prokaryotes consist of approximately 40% multi-domain proteins. However, these numbers are strongly dependent on the exact choice of cut-off for domains in unassigned regions. In conclusion, all eukaryotes have similar fractions of multidomain proteins and disorder, whereas a high fraction of repeating domain is distinguished only in multicellular eukaryotes. This implies a role for repeats in cell-cell contacts while the other two features are important for intracellular functions.
  •  
45.
  • Ekman, Diana, et al. (authors)
  • Quantification of the Elevated Rate of Domain Rearrangements in Metazoa
  • 2007
  • In: Journal of Molecular Biology. - : Elsevier BV. - 0022-2836 .- 1089-8638. ; 372:5, pp. 1337-1348
  • Journal article (peer-reviewed)
    • Most eukaryotic proteins consist of multiple domains created through gene fusions or internal duplications. The most frequent change of a domain architecture (DA) is insertion or deletion of a domain at the N or C terminus. Still, the mechanisms underlying the evolution of multidomain proteins are not very well studied. Here, we have studied the evolution of multidomain architectures (MDA), guided by evolutionary information in the form of a phylogenetic tree. Our results show that Pfam domain families and MDAs have been created with comparable rates (0.1–1 per million years (My)). The major changes in DA evolution have occurred in the process of multicellularization and within the metazoan lineage. In contrast, creation of domains seems to have been frequent already in the early evolution. Furthermore, most of the architectures have been created from older domains or architectures, whereas novel domains are mainly found in single-domain proteins. However, a particular group of exon-bordering domains may have contributed to the rapid evolution of novel multidomain proteins in metazoan organisms. Finally, MDAs have evolved predominantly through insertions of domains, whereas domain deletions are less common. In conclusion, the rate of creation of multidomain proteins has accelerated in the metazoan lineage, which may partly be explained by the frequent insertion of exon-bordering domains into new architectures. However, our results indicate that other factors have contributed as well.
  •  
46.
  • Elofsson, Arne, 1966-, et al. (authors)
  • Deep learning insights into the architecture of the mammalian egg-sperm fusion synapse
  • 2024
  • In: eLife. - 2050-084X. ; 13
  • Journal article (peer-reviewed)
    • A crucial event in sexual reproduction is when haploid sperm and egg fuse to form a new diploid organism at fertilization. In mammals, direct interaction between egg JUNO and sperm IZUMO1 mediates gamete membrane adhesion, yet their role in fusion remains enigmatic. We used AlphaFold to predict the structure of other extracellular proteins essential for fertilization to determine if they could form a complex that may mediate fusion. We first identified TMEM81, whose gene is expressed by mouse and human spermatids, as a protein having structural homologies with both IZUMO1 and another sperm molecule essential for gamete fusion, SPACA6. Using a set of proteins known to be important for fertilization and TMEM81, we then systematically searched for predicted binary interactions using an unguided approach and identified a pentameric complex involving sperm IZUMO1, SPACA6, TMEM81 and egg JUNO, CD9. This complex is structurally consistent with both the expected topology on opposing gamete membranes and the location of predicted N-glycans not modeled by AlphaFold-Multimer, suggesting that its components could organize into a synapse-like assembly at the point of fusion. Finally, the structural modeling approach described here could be more generally useful to gain insights into transient protein complexes difficult to detect experimentally.
  •  
48.
  • Elofsson, Arne, et al. (authors)
  • Methods for estimation of model accuracy in CASP12
  • 2018
  • In: Proteins. - : Wiley. - 0887-3585 .- 1097-0134. ; 86:S1, pp. 361-373
  • Journal article (peer-reviewed)
    • Methods to reliably estimate the quality of 3D models of proteins are essential drivers for the wide adoption and serious acceptance of protein structure predictions by life scientists. In this article, the most successful groups in CASP12 describe their latest methods for estimates of model accuracy (EMA). We show that pure single model accuracy estimation methods have shown clear progress since CASP11; the 3 top methods (MESHI, ProQ3, SVMQA) all perform better than the top method of CASP11 (ProQ2). Although the pure single model accuracy estimation methods outperform quasi-single (ModFOLD6 variations) and consensus methods (Pcons, ModFOLDclust2, Pcomb-domain, and Wallner) in model selection, they are still not as good as those methods in absolute model quality estimation and predictions of local quality. Finally, we show that when using contact-based model quality measures (CAD, lDDT) the single model quality methods perform relatively better.
  •  
49.
  • Elofsson, Arne, 1966- (author)
  • Progress at protein structure prediction, as seen in CASP15
  • 2023
  • In: Current Opinion in Structural Biology. - 0959-440X .- 1879-033X. ; 80
  • Research review (peer-reviewed)
    • In Dec 2020, the results of AlphaFold version 2 were presented at CASP14, sparking a revolution in the field of protein structure predictions. For the first time, a purely computational method could challenge experimental accuracy for structure prediction of single protein domains. The code of AlphaFold v2 was released in the summer of 2021, and since then, it has been shown that it can be used to accurately predict the structure of most ordered proteins and many protein–protein interactions. It has also sparked an explosion of development in the field, improving AI-based methods to predict protein complexes, disordered regions, and protein design. Here I will review some of the inventions sparked by the release of AlphaFold.
  •  