SwePub

Result list for search "WFRF:(Shu Nanjiang)"

  • Results 1-10 of 20
1.
  • Govindarajan, Sudha, et al. (author)
  • The evolutionary history of topological variations in the CPA/AT superfamily
  • 2024
  • Other publication (other academic/artistic)
    • CPA/AT transporters consist of two structurally and evolutionarily related inverted repeat units, each of them with one core and one scaffold subdomain. During evolution, these families have undergone substantial changes in structure, topology and function. Central to the function of the transporters is the existence of two noncanonical helices that are involved in the transport process. In different families, two different types of these helices have been identified, reentrant and broken. Here, we use an integrated topology annotation method to identify novel topologies in the families. It combines topology prediction, similarity to families with known structure, and the difference in positively charged residues present in inside and outside loops in alternative topological models. We identified families with diverse topologies containing broken or reentrant helices. We classified all families into 3 distinct evolutionary groups, newly termed “Fold-types”, that each share a structurally similar C-terminal repeat unit. Using the evolutionary relationship between families, we propose topological transitions including a transition between broken and reentrant helices, a complete change of orientation, changes in the number of scaffold helices and even, in some rare cases, losses of core helices. The evolutionary history of the repeat units shows that gene duplication and repeat shuffling events resulted in these extensive topology variations. The novel structure-based classification, together with supporting structural models and other information, is presented in a searchable database, CPAfold (cpafold.bioinfo.se). Our comprehensive study of topology variations within the CPA superfamily provides better insight into their structure and evolution.
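The charge-difference criterion mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' annotation method. It is a toy version of the general positive-inside rule, assuming that successive loops alternate between the two sides of the membrane and that only lysine (K) and arginine (R) count as positively charged. The function names and example loops are hypothetical.

```python
POSITIVE = set("KR")  # assumption: only lysine and arginine are counted


def positive_count(loops):
    """Count K/R residues across a list of loop sequences."""
    return sum(1 for loop in loops for aa in loop if aa in POSITIVE)


def kr_bias(loops):
    """Charge difference between the two membrane sides.

    Loops are given in chain order and are assumed to alternate sides:
    even-indexed loops share the side of the N-terminus, odd-indexed
    loops lie on the opposite side.
    """
    return positive_count(loops[0::2]) - positive_count(loops[1::2])


def preferred_orientation(loops):
    """Pick the topological model that places more K/R residues inside."""
    bias = kr_bias(loops)
    if bias > 0:
        return "N-in"
    if bias < 0:
        return "N-out"
    return "undetermined"
```

For example, `preferred_orientation(["MKRK", "GSG", "RRA", "DE"])` favours the model with the N-terminus inside, because all five K/R residues sit in the even-indexed loops. A real annotation pipeline would combine such a score with topology prediction and structural similarity, as the abstract describes.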
2.
  • Hayat, Sikander, et al. (author)
  • Inclusion of dyad-repeat pattern improves topology prediction of transmembrane beta-barrel proteins
  • 2016
  • In: Bioinformatics. - : Oxford University Press (OUP). - 1367-4803 .- 1367-4811. ; 32:10, pp. 1571-1573
  • Journal article (peer-reviewed)
    • Accurate topology prediction of transmembrane beta-barrels is still an open question. Here, we present BOCTOPUS2, an improved topology prediction method for transmembrane beta-barrels that can also identify the barrel domain, predict the topology and identify the orientation of residues in transmembrane beta-strands. The major novelty of BOCTOPUS2 is the use of the dyad-repeat pattern of lipid and pore facing residues observed in transmembrane beta-barrels. In a cross-validation test on a benchmark set of 42 proteins, BOCTOPUS2 predicts the correct topology in 69% of the proteins, an improvement of more than 10% over the best earlier method (BOCTOPUS) and in addition, it produces significantly fewer erroneous predictions on non-transmembrane beta-barrel proteins.
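The dyad-repeat pattern described above (residues in a transmembrane beta-strand alternating between lipid-facing and pore-facing) can be sketched as follows. This is only an illustration of the alternation idea, not BOCTOPUS2 itself: the hydrophobic residue set, the function names and the tie-breaking rule are assumptions made for the example.

```python
HYDROPHOBIC = set("AVLIFWMY")  # illustrative choice, not taken from the paper


def dyad_phase(strand):
    """Return 0 if even positions look lipid-facing, 1 if odd positions do.

    The phase with more hydrophobic residues is taken as lipid-facing;
    ties are resolved in favour of phase 0 (an arbitrary assumption).
    """
    even = sum(aa in HYDROPHOBIC for aa in strand[0::2])
    odd = sum(aa in HYDROPHOBIC for aa in strand[1::2])
    return 0 if even >= odd else 1


def label_orientation(strand):
    """Label each residue as lipid-facing (L) or pore-facing (P)."""
    phase = dyad_phase(strand)
    return "".join("L" if i % 2 == phase else "P" for i in range(len(strand)))
```

For the hypothetical strand `"VSLTFNAG"`, the hydrophobic residues all fall on even positions, so `label_orientation` returns `"LPLPLPLP"`.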
3.
  • Pascarelli, Stefano, et al. (author)
  • PRODRES: Fast protein searches using a protein domain-reduced database
  • Other publication (other academic/artistic)
    • Motivation: Detection of homologous sequences is the basis for many bioinformatics applications. Position-Specific Scoring Matrices (PSSMs) or Hidden Markov Models (HMMs) are often created from the detected homologous sequences. These are then widely used in bioinformatics software to incorporate evolutionary information in the prediction process. However, due to the increase in the size of reference databases, there is a continuous decrease in the speed of homology detection, even with faster computers. Results: By using PRODRES, we save on average X percent of the search time. This pipeline has been exploited in our widely used topology prediction software, TOPCONS. In total, more than 5 million PSSMs have been generated, with an average running time of about 1 minute. This corresponds to an approximate 10 times speed-up of the whole process. Availability and implementation: A standalone version of PRODRES can be found in the GitHub repository https://github.com/ElofssonLab/PRODRES, while a web server implementing the method is available for academic users at http://PRODRES.bioinfo.se/
4.
  • Peters, Christoph, et al. (author)
  • Improved topology prediction using the terminal hydrophobic helices rule
  • 2016
  • In: Bioinformatics. - : Oxford University Press (OUP). - 1367-4803 .- 1367-4811. ; 32:8, pp. 1158-1162
  • Journal article (peer-reviewed)
    • Motivation: The translocon recognizes sufficiently hydrophobic regions of a protein and inserts them into the membrane. Computational methods try to determine what hydrophobic regions are recognized by the translocon. Although these predictions are quite accurate, many methods still fail to distinguish marginally hydrophobic transmembrane (TM) helices and equally hydrophobic regions in soluble protein domains. In vivo, this problem is most likely avoided by targeting of the TM-proteins, so that non-TM proteins never see the translocon. Proteins are targeted to the translocon by an N-terminal signal peptide. The targeting is also aided by the fact that the N-terminal helix is more hydrophobic than other TM-helices. In addition, we also recently found that the C-terminal helix is more hydrophobic than central helices. This information has not been used in earlier topology predictors. Results: Here, we use the fact that the N- and C-terminal helices are more hydrophobic to develop a new version of the first-principle-based topology predictor, SCAMPI. The new predictor has two main advantages: first, it can be used to efficiently separate membrane and non-membrane proteins directly without the use of an extra prefilter, and second, it shows improved performance for predicting the topology of membrane proteins that contain large non-membrane domains. Availability and implementation: The predictor, a web server and all datasets are available at http://scampi.bioinfo.se/.
5.
  • Salvatore, Marco, et al. (author)
  • SubCons : a new ensemble method for improved human subcellular localization predictions
  • 2017
  • In: Bioinformatics. - : Oxford University Press (OUP). - 1367-4803 .- 1367-4811. ; 33:16, pp. 2464-2470
  • Journal article (peer-reviewed)
    • Motivation: Knowledge of the correct protein subcellular localization is necessary for understanding the function of a protein. Unfortunately, large-scale experimental studies are limited in their accuracy. Therefore, the development of prediction methods has been limited by the amount of accurate experimental data. However, recent large-scale experimental studies have provided new data that can be used to evaluate the accuracy of subcellular predictions in human cells. Using these data, we examined the performance of state-of-the-art methods and developed SubCons, an ensemble method that combines four predictors using a Random Forest classifier. Results: SubCons outperforms earlier methods in a dataset of proteins where two independent methods confirm the subcellular localization. Given nine subcellular localizations, SubCons achieves an F1-Score of 0.79 compared to 0.70 for the second best method. Furthermore, at a false positive rate (FPR) of 1%, the true positive rate (TPR) is over 58% for SubCons compared to less than 50% for the best individual predictor.
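The evaluation measures quoted above (F1-score, TPR and FPR) follow standard definitions, which the sketch below computes from confusion-matrix counts for a single class. How the nine localizations were averaged into one F1-score is not stated in the abstract, so this sketch does not attempt it. The function names and the counts in the usage note are hypothetical.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for one class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def tpr_fpr(tp, fp, tn, fn):
    """True positive rate and false positive rate for one class."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr
```

With made-up counts of 58 true positives, 42 false negatives, 1 false positive and 99 true negatives, `tpr_fpr(58, 1, 99, 42)` gives a TPR of 0.58 at an FPR of 0.01, the kind of operating point the abstract reports.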
6.
  • Salvatore, Marco, et al. (author)
  • The SubCons webserver : A user friendly web interface for state-of-the-art subcellular localization prediction
  • 2018
  • In: Protein Science. - : Wiley. - 0961-8368 .- 1469-896X. ; 27:1, pp. 195-201
  • Journal article (peer-reviewed)
    • SubCons is a recently developed method that predicts the subcellular localization of a protein. It combines predictions from four predictors using a Random Forest classifier. Here, we present the user-friendly web-interface implementation of SubCons. Starting from a protein sequence, the server rapidly predicts the subcellular localizations of an individual protein. In addition, the server accepts the submission of sets of proteins, either by uploading files or programmatically by using command-line WSDL API scripts. This makes SubCons ideal for proteome-wide analyses, allowing the user to scan a whole proteome in a few days. From the web page, it is also possible to download precalculated predictions for several eukaryotic organisms. To evaluate the performance of SubCons, we present a benchmark of LocTree3 and SubCons using two recent mass-spectrometry based datasets of mouse and drosophila proteins. The server is available at http://subcons.bioinfo.se/
7.
  • Shu, Nanjiang, 1981-, et al. (author)
  • Describing and Comparing Protein Structures Using Shape Strings
  • 2008
  • In: Current protein and peptide science. - : Bentham Science Publishers. - 1389-2037 .- 1875-5550. ; 9:4, pp. 310-324
  • Journal article (peer-reviewed)
    • Different methods for describing and comparing the structures of the tens of thousands of proteins that have been determined by X-ray crystallography are reviewed. Such comparisons are important for understanding the structures and functions of proteins and facilitating structure prediction, as well as assessing structure prediction methods. We summarize methods in this field emphasizing ways of representing protein structures as one-dimensional geometrical strings. Such strings are based on the shape symbols of clustered regions of φ/Ψ dihedral angle pairs of the polypeptide backbones as described by the Ramachandran plot. These one-dimensional expressions are as compact as secondary structure description but contain more information in loop regions. They can be used for fast searching for similar structures in databases and for comparing similarities between proteins and between the predicted and native structures.
8.
  • Shu, Nanjiang, et al. (author)
  • KalignP : Improved multiple sequence alignments using position specific gap penalties in Kalign2
  • 2011
  • In: Bioinformatics. - : Oxford University Press (OUP). - 1367-4803 .- 1367-4811. ; 27:12, pp. 1702-1703
  • Journal article (peer-reviewed)
    • Kalign2 is one of the fastest and most accurate methods for multiple alignments. However, in contrast to other methods, Kalign2 does not allow externally supplied position-specific gap penalties. Here, we present a modification to Kalign2, KalignP, so that it accepts such penalties. Further, we show that KalignP, using position-specific gap penalties obtained from predicted secondary structures, yields a steady improvement over Kalign2 when tested on Balibase 3.0 as well as on a dataset derived from Pfam-A seed alignments.
9.
  • Shu, Nanjiang, 1981- (author)
  • Prediction of zinc-binding sites in proteins and efficient protein structure description and comparison
  • 2008
  • Licentiate thesis (other academic/artistic)
    • A large number of proteins require certain metals to stabilize their structures or to function properly. About one third of all proteins in the Protein Data Bank (PDB) contain metals and it is estimated that approximately the same proportion of all proteins are metalloproteins. Zinc, the second most abundant transition metal found in eukaryotic organisms, plays key roles, mainly structural and catalytic, in many biological functions. Predicting whether a protein binds zinc and even the accurate location of binding sites is important when investigating the function of an experimentally uncharacterized protein. Describing and comparing protein structures with both efficiency and accuracy are essential for systematic annotation of functional properties of proteins, be it on an individual or on a genome scale. Dozens of structure comparison methods have been developed in the past decades. In recent years, several research groups have endeavoured to develop methods for fast comparison of protein structures by representing the three-dimensional (3D) protein structures as one-dimensional (1D) geometrical strings based on the shape symbols of clustered regions of φ/ψ torsion angle pairs of the polypeptide backbones. These 1D geometrical strings, shape strings, are as compact as 1D secondary structures but carry more elaborate structural information in loop regions and thus are more suitable for fast structure database searching, classification of loop regions and evaluation of model structures. In this thesis, a new method for predicting zinc-binding sites in proteins from amino acid sequences is described. This method predicts zinc-binding Cys, His, Asp and Glu (the four most common zinc-binding residues) with 75% precision (86% for Cys and His only) at 50% recall according to a solid 5-fold cross-validation on a non-redundant set of the PDB chains containing 2727 unique chains, of which 235 bind to zinc. This method predicts zinc-binding Cys and His with about 10% higher precision at different recall levels compared to a previously published method. In addition, different methods for describing and comparing protein structures are reviewed. Some recently developed methods based on 1D geometrical representation of backbone structures are emphasized and analyzed in detail.
10.
  • Shu, Nanjiang, 1981-, et al. (author)
  • Prediction of zinc-binding sites in proteins from sequence
  • 2008
  • In: Bioinformatics. - : Oxford University Press (OUP). - 1367-4803 .- 1367-4811. ; 24:6, pp. 775-782
  • Journal article (peer-reviewed)
    • MOTIVATION: Motivated by the abundance, importance and unique functionality of zinc, both biologically and physiologically, we have developed an improved method for the prediction of zinc-binding sites in proteins from their amino acid sequences. RESULTS: By combining support vector machine (SVM) and homology-based predictions, our method predicts zinc-binding Cys, His, Asp and Glu with 75% precision (86% for Cys and His only) at 50% recall according to a 5-fold cross-validation on a non-redundant set of protein chains from the Protein Data Bank (PDB) (2727 chains, 235 of which bind zinc). Consequently, our method predicts zinc-binding Cys and His with 10% higher precision at different recall levels compared to a recently published method when tested on the same dataset. AVAILABILITY: The program is available for download at www.fos.su.se/~nanjiang/zincpred/download/
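Figures such as "75% precision at 50% recall" are read off a ranked list of predictions. The sketch below shows one common way to compute precision at a fixed recall level from per-residue scores. It is a generic illustration and not the authors' evaluation code: the function name, the handling of tied scores (input order is kept) and the example data are assumptions.

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision at the first score cutoff whose recall reaches target_recall.

    scores: predicted scores, higher means more likely zinc-binding.
    labels: 1 for a true binding residue, 0 otherwise.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank predictions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = 0
    for n_predicted, (_, label) in enumerate(ranked, start=1):
        tp += label
        if tp / total_pos >= target_recall:
            return tp / n_predicted
    return tp / len(ranked)  # only reached if target_recall exceeds 1
```

On the made-up input `scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]` and `labels = [1, 0, 1, 1, 0, 1]`, half of the four positives are recovered after three predictions, so the precision at 50% recall is 2/3.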