SwePub

Result list for search "WFRF:(Elofsson Arne Professor) srt2:(2020-2024)"

  • Results 1-8 of 8
1.
  • Attwood, Misty M. (author)
  • Membrane-bound proteins : Characterization, evolution, and functional analysis
  • 2020
  • Doctoral thesis (other academic/artistic). Abstract:
    • Alpha-helical transmembrane proteins are important components of many essential cell processes including signal transduction, transport of molecules across membranes, protein and membrane trafficking, and structural and adhesion activities, amongst others. Their involvement in critical networks makes them the focus of interest in investigating disease pathways, as candidate drug targets, and in evolutionary analyses to identify homologous protein families and possible functional activities. Transmembrane (TM) proteins can be categorized into major groups based on the same gross structure, i.e., the number of transmembrane helices, which is often correlated with specific functional activities, for example as receptors or transporters. The focus of this thesis was to analyze the evolution of the membrane proteome from the last holozoan common ancestor (LHCA) through metazoans to garner insight into the fundamental functional clusters that underlie metazoan diversity and innovation. Twenty-four eukaryotic proteomes were analyzed, with results showing that more than 70% of metazoan transmembrane protein families have a pre-metazoan origin. In concert with that, we characterized the previously unstudied groups of human proteins with three, four, and five membrane-spanning regions (3TM, 4TM, and 5TM) and analyzed their functional activities, involvement in disease pathways, and unique characteristics. Combined, we manually curated and classified nearly 11% of the human transmembrane proteome with these three studies. The 3TM data set included 152 proteins, nearly 45% of which localize specifically to the endoplasmic reticulum (ER); these proteins are involved in membrane biosynthesis and lipid biogenesis, protein trafficking, catabolic processes, and signal transduction due to the large ionotropic glutamate receptor family. 
The 373 proteins identified in the 4TM data set are predominantly involved in transport activities, as well as cell communication and adhesion, and function as structural elements. The compact 5TM data set includes 58 proteins that engage in localization and transport activities, such as protein targeting, membrane trafficking, and vesicle transport. Notably, ~60% are identified as cancer prognostic markers that are associated with clinical outcomes of different tumour types. This thesis investigates the evolutionary origins of the human transmembrane proteome, characterizes formerly dark areas of the membrane proteome, and extends the fundamental knowledge of transmembrane proteins.
2.
  • Bryant, Patrick, 1993- (author)
  • Learning Protein Evolution and Structure
  • 2022
  • Doctoral thesis (other academic/artistic). Abstract:
    • By analysing the structure of a protein it is possible to draw conclusions about its function. Obtaining the structure of a protein experimentally is, however, a time-consuming and expensive process. By using evolution it is possible to infer the structure of a protein. AlphaFold2 (AF), the latest AI technology for protein structure prediction, uses evolutionary information to obtain protein structures in minutes instead of years at a fraction of the experimental cost. Here, we develop this technology further to predict the structure of interacting proteins. We create a confidence score, pDockQ, and show that this score rivals high-throughput experiments in distinguishing true and false protein-protein interactions (PPIs). Applying AF and the pDockQ score to a set of 65484 human PPIs we identify 1371 new high-confidence models. These models expand the structural knowledge of human protein complexes and can be used to e.g. develop new drugs or evaluate biological pathways. One limitation of AF is that the accuracy decreases with the number of proteins being predicted together and that the biggest protein complexes do not fit in the memory of the latest GPUs. To circumvent these issues, we predict subcomponents of protein complexes and assemble these together with Monte Carlo Tree Search (MCTS). MCTS enables assembling some of the largest protein complexes using only sequence information and stoichiometry. Out of 175 protein complexes with 10-30 chains, 91 can be completely assembled with a median TM-score of 0.51. A third of these (30 complexes) are highly accurate (TM-score ≥0.8). The use of highly accurate protein structure prediction is revolutionising many fields of biological research only one year after its realisation. Likely, this is only the beginning of a new era; the era of AI.
3.
  • Hosseini Ashtiani, Saman, 1981- (author)
  • Omics Data Analysis of Complex Diseases and Traits
  • 2022
  • Doctoral thesis (other academic/artistic). Abstract:
    • Following the advent of the high-throughput techniques for producing massive omics data, new possibilities and challenges have also emerged in different fields of biology and medicine. Dealing with such data on different scales with different scopes such as genomics, transcriptomics, proteomics and metabolomics, demands appropriate data collection, preprocessing, statistical analysis, interpretation and visualization. The overall goal of this thesis was to conceive omics-related questions in the context of four research titles and to apply a rational choice of the mentioned methods to conduct the study plans to answer them. Paper I asks whether we could propose potentially implicated genes in psoriasis; and tries to answer it using microarray transcriptomics data of psoriasis. Initially, quality control was performed on the microarray dataset and then the Differentially Expressed Genes (DEGs) were chosen for mapping to a protein-protein interaction (PPI) database to create a subnetwork of the respective PPI. Using network analysis, genes with higher scores were proposed as potentially relevant to psoriasis and finally, we evaluated the results concerning a gene-disease association database. Paper II asks whether the knockout of two genes followed by a transformation in E. coli could lead to an increase in bacterial growth in two different media; and deals with it through in vitro experiments followed by an in silico analysis of E. coli RNA-seq data. Here, we calculated the pairwise correlations between each target (knockout) gene and the rest of the genes in the RNA-seq dataset. Then, the significantly anti-correlated genes were shown to mainly belong to protein biosynthesis pathways compared to all other background pathways, which might indicate an increase in protein biosynthesis-related genes' transcription levels when there is an absolute decrease (knockout) in each of the target genes. 
Paper III asks if an anti-bone-resorption drug called Denosumab significantly affects the abundance of the metabolites extracted from blood samples during a two-year longitudinal placebo-controlled clinical trial study; and tries to address this through running statistical hypothesis testing for each metabolite in the quantification data from Liquid Chromatography-Mass Spectrometry (LC-MS). Afterwards, the patterns of metabolites' variations concerning Denosumab administration and visit times were studied using Principal Component Analysis (PCA), association studies and hierarchical clustering. The results of this study proposed some identified metabolites for further clinical investigations. Based on our analyses, the patterns of abundance variations in some of the identified metabolites could be considered for improving the corresponding clinical studies and treatment with Denosumab. Paper IV proposes potentially relevant genes in lung adenocarcinoma by constructing a genome-scale co-expression network followed by clustering. The genes in each cluster were studied using the literature knowledge. One of the most frequently reported genes in lung adenocarcinoma was EGFR. We reported all the first-neighborhood genes connected to EGFR in its corresponding module as potentially relevant to lung adenocarcinoma. The repertoire of the above choices, workflows and evaluations could be applicable for further follow-up studies at different levels including omics data integration, personalized omics data analysis, studies on different scales such as cellular or tissue, using other methodologies for the same questions and running benchmarks. Although four different omics-related questions were posed in this thesis, they all involved the selection or preparation of the respective omics data, choosing preprocessing strategies, choosing statistical analyses and hypothesis testing methods and finally, performing the evaluation of the results and interpretations.
4.
  • Lamb, John, 1983- (author)
  • Transmembrane Proteins and Protein Structure Prediction : What we can learn from Computational Methods
  • 2021
  • Doctoral thesis (other academic/artistic). Abstract:
    • A protein’s 3D-structure is essential to understand how proteins function and interact and how biochemical processes proceed in organic life. Despite the advancement in experimental methods, it remains expensive and time-consuming to determine protein structure experimentally. There have been significant advances in machine learning and computational methods where, in many cases, models of protein structure can be determined to a high level of quality. Computational methods help predict protein 3D-structure and are often used as a complement to experimental methods to give better insight and understanding of biological processes. This thesis presents studies focusing on the simplicity and transparency of the 3D-structure pipeline. This is done with a new interactive database with full access to the pipeline’s data and code together with tools to analyse and compare models and structures. I present a new module for the last step in this pipeline, the final folding of the protein chain, which both simplifies the current pipeline and uses new input data based on the current research. This module predicts better models than its predecessor and produces models more than an order of magnitude faster than the current state-of-the-art tools. This module also contains a novel way of both folding and docking dimers in one single step. There are many examples of how machine learning models contain biases that originate in biased training data, translating into models that do not generalise well. I present a study where experts collaborate to create a high-quality database of Intrinsically Disordered Proteins. Through manual annotation and quality protocols, high-quality training data has been produced that is well suited for machine learning tasks and protein disorder analysis. In this thesis, I also present computational methods pertaining to transmembrane proteins and how they can increase our insight into membrane protein structure. 
In one study, we use computational methods together with experimental methods to investigate how differently charged residue pairs that form salt bridges inside the membrane of membrane proteins change the insertion potential. We show that amino acid pairs that form salt bridges in this setting contribute 0.5-0.7 kcal/mol to the apparent free energy of membrane insertion. This gives new insight and advances in how we calculate insertion and can lead to better membrane protein topology predictors. In the final study, we investigate the CPA/AT-transporter family of transmembrane proteins and create a new integrated topology annotation method and structural classification, resulting in new insight into how this family evolved through time. The entire pipeline is published as an interactive database with complete transparency for both the method and data used. The study shows how this family has evolved by duplicating internal regions and how this has caused a structural symmetry in the family. This thesis, therefore, contributes to a more accessible and more transparent path of using computational methods to give a more extensive insight into protein structure prediction and how these structures pertain to biochemical processes.
5.
  • Pozzati, Gabriele, 1989- (author)
  • Deep learning solutions to protein quaternary structure
  • 2023
  • Doctoral thesis (other academic/artistic). Abstract:
    • Interactions between proteins are directly involved in most biological processes and are essential for the correct functioning of every form of life. The nature of protein-protein interactions allows functional assemblies of hundreds of protein chains. Given the enormous complexity and the pivotal role of protein interactions in life’s mechanics, the necessity to obtain a complete comprehension of such mechanisms is just as big as the challenge to achieve such knowledge. In the last few decades, experimental procedures have constantly improved, dramatically increasing the available structural data for protein interactions. Unfortunately, experimental methods require a lot of time and resources and cannot always be applied with the same degree of success. Several computational methods have been developed in parallel with experimental procedures to overcome such limitations. Therefore, this thesis focused on screening existing computational methods and adopting them to improve the overall accuracy in solving structures of protein complexes. In the first paper, I propose a simple rigid-body docking framework to test several interface predictors and their ability to drive a protein-protein docking procedure. Next, in the second paper, I display a method to adapt the trRosetta deep neural network to predict inter-residue distances and dihedral angle constraints for full protein complexes. The same concept is then improved in the third paper with FoldDock, an adaptation of AlphaFold2 to work on multiple protein sequences and produce the corresponding complex. Finally, in the fourth paper, the FoldDock pipeline is applied to a large dataset of protein pairwise interactions derived from the hu.MAP and HuRI datasets, resulting in the characterization of more than 3000 high-confidence structural models.
6.
  • Shenoy, Aditi, 1995- (author)
  • Unlocking protein sequences : Advances in protein structure and ligand-binding site prediction
  • 2024
  • Doctoral thesis (other academic/artistic). Abstract:
    • The protein sequence determines how it will fold into its unique three-dimensional structure. Once folded, proteins perform their functions by interacting with other proteins or molecules called ligands within the cell. Experimental determination of protein structure and function is tedious. Computational approaches aim to accurately predict the properties of proteins to complement experimental efforts of understanding biochemical mechanisms within the cell. This thesis introduces computational techniques that predict the structure of protein complexes and identify protein residues involved in interactions with common biomolecules, such as metal ions and nucleic acids, based on sequence information. AlphaFold, a method that predicted protein structure using sequence information with almost experimental accuracy, was a critical breakthrough that shaped the field of protein structure prediction. Subsequently, approaches such as FoldDock adapted the AlphaFold pipeline for dimer complexes. Paper I applies the FoldDock protocol to understand toxin-antitoxin systems. These protein complexes are highly evolutionarily conserved, and high-confidence dimer predictions were generated. Paper II applies the FoldDock protocol to study protein-protein interactions in the human proteome. To verify the reliability of machine-learning-based computational methods, they must be tested on independent data different from the data used to train the method. Paper III involves generating and using a homology-reduced independent test set to benchmark the performance of protein complex structure predictors, including the recent AlphaFold release adapted for multi-chain proteins, AlphaFold-Multimer. A confidence score (pDockQ2) was proposed to estimate the quality of the interfaces within multimers. Paper I, Paper II and Paper III are associated with predicting and evaluating protein-protein interactions. 
Representation learning involves finding effective representations of input data to maximise available information, making it easier to understand and process them for downstream prediction tasks. A recent advance in protein representation learning is protein language models (pLMs), where large language models are trained on a massive corpus of protein sequences. Highly contextualised and informative vector representations contained in the last hidden layer of the model have been used to predict numerous properties, such as ligand binding sites, subcellular localisation, and post-translational modifications, among others. Paper IV uses residue-level embeddings to predict whether a protein binds to one or more of the ten most common ions. It also predicts residue-level binding probabilities for multiple ions simultaneously. Paper V expands this approach beyond metals. It explores the impact of structure-informed features alongside sequence embeddings to predict whether a residue binds to nucleic acids, small molecules or metals. Paper IV and Paper V are associated with developing machine learning methods to predict and evaluate protein-ligand interactions. In summary, the research conducted within this thesis offers valuable insights into three crucial levers to systematically harness the potential of machine learning for protein bioinformatics. These are (1) construction of homology-reduced non-redundant datasets, (2) finding optimal protein representations, and (3) rigorous evaluation and inference.
7.
  • Zhu, Wensi, 1993- (author)
  • Decipher protein complex structures from sequence
  • 2023
  • Doctoral thesis (other academic/artistic). Abstract:
    • Proteins are essential constituents of biological systems. A profound understanding of protein structure is significant for unraveling the intricate mechanisms of biological processes. The recent development of computational methods using AI technology is revolutionizing the structural biology field. Accurate predictions of three-dimensional protein structures can be generated from protein sequences, enabling rapid and accurate insights into protein interactions and functions. This thesis aims to investigate the applications of various cutting-edge methods in protein complex structure prediction. We first explore using trRosetta for dimeric protein complexes, and the study shows that the single-chain protein structure predictor is feasible for protein complexes. In light of the success of AlphaFold2, we use the pipeline FoldDock, which is an adaptation of AlphaFold2 to protein complexes, for protein-protein interactions (PPIs) of two human interactome datasets and construct a PPI network. Next, we conduct a benchmark study of AlphaFold-Multimer in multi-chain protein complexes with 2 to 6 chains and examine how different evaluation scores affect the prediction assessment. In the last paper, we predict large protein complexes starting from subcomponents using AlphaFold2 and a Monte Carlo Tree Search algorithm. The studies in this thesis show that deep learning approaches can yield reliable results in predicting protein complex structures, and there is ample potential for further improvement.
8.
  • Lundström, Oxana, 1989- (author)
  • Intrinsic disorder and tandem repeats - match made in evolution : Computational studies of molecular evolution
  • 2023
  • Doctoral thesis (other academic/artistic). Abstract:
    • Proteins are both the building blocks and workers of the cell, carrying out most of the important functions. For a long time, their structure has been regarded as the primary factor for their function, but intrinsically disordered proteins demonstrate an alternative to this paradigm. Disordered proteins can temporarily assume different forms based on their interactions with other molecules and play critical roles in several biological processes, including cell signaling and regulation of gene expression. Tandem repeats are repeated patterns in genetic sequence. The role of tandem repeats in many protein structures is well documented today, but their role in disordered proteins is not entirely clear. This thesis aims to shed light on the mechanisms by which protein disorder and tandem repeats are linked. Only 2.5% of residues in all known protein sequences are characterized by the overlap of tandem repeats and protein disorder as described in Paper III, but many of these proteins have crucial functions and are linked to human diseases. Short tandem repeats emerge in this study as most frequently occurring in disordered regions. Genetic variation in disordered proteins accounts for length differences in eukaryotic genes (Paper I) and many orphan, recently evolved proteins, are disordered due to high GC content (Paper II). A medical application of this research is illustrated in the thesis with examples of variations in short tandem repeats (STRs) and their role in human diseases. Paper IV presents a comprehensive resource of human STR variation and Paper V illustrates how it can be used to identify specific STRs of interest, such as in the case of colorectal cancer where variations in certain STRs lead to altered gene expression patterns in tumors.