SwePub

Search results for "WFRF:(Sachenkova Oxana)"

  • Results 1-5 of 5
1.
  • Basile, Walter, et al. (author)
  • High GC content causes orphan proteins to be intrinsically disordered
  • 2017
  • In: PLoS Computational Biology. - : Public Library of Science (PLoS). - 1553-734X .- 1553-7358. ; 13:3
  • Journal article (peer-reviewed)
    • De novo creation of protein coding genes involves the formation of short ORFs from noncoding regions; some of these ORFs might then become fixed in the population. These orphan proteins need to, at the bare minimum, not cause serious harm to the organism, meaning that they should for instance not aggregate. Therefore, although the creation of short ORFs could be truly random, the fixation should be subjected to some selective pressure. The selective forces acting on orphan proteins have been elusive, and contradictory results have been reported. In Drosophila young proteins are more disordered than ancient ones, while the opposite trend is present in yeast. To the best of our knowledge no valid explanation for this difference has been proposed. To solve this riddle we studied structural properties and age of proteins in 187 eukaryotic organisms. We find that, with the exception of length, there are only small differences in the properties between proteins of different ages. However, when we take the GC content into account we noted that it could explain the opposite trends observed for orphans in yeast (low GC) and Drosophila (high GC). GC content is correlated with codons coding for disorder promoting amino acids. This leads us to propose that intrinsic disorder is not a strong determining factor for fixation of orphan proteins. Instead these proteins largely resemble random proteins given a particular GC level. During evolution the properties of a protein change faster than the GC level causing the relationship between disorder and GC to gradually weaken.
2.
  • Grapotte, M, et al. (author)
  • Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network
  • 2021
  • In: Nature Communications. - : Springer Science and Business Media LLC. - 2041-1723. ; 12:1, p. 3297-
  • Journal article (peer-reviewed)
    • Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.
3.
  • Hurst, Laurence D., et al. (author)
  • A simple metric of promoter architecture robustly predicts expression breadth of human genes suggesting that most transcription factors are positive regulators
  • 2014
  • In: Genome Biology. - : Springer Science and Business Media LLC. - 1465-6906 .- 1474-760X. ; 15:7, p. 413-
  • Journal article (peer-reviewed)
    • Background: Conventional wisdom holds that, owing to the dominance of features such as chromatin level control, the expression of a gene cannot be readily predicted from knowledge of promoter architecture. This is reflected, for example, in a weak or absent correlation between promoter divergence and expression divergence between paralogs. However, an inability to predict may reflect an inability to accurately measure or employment of the wrong parameters. Here we address this issue through integration of two exceptional resources: ENCODE data on transcription factor binding and the FANTOM5 high-resolution expression atlas. Results: Consistent with the notion that in eukaryotes most transcription factors are activating, the number of transcription factors binding a promoter is a strong predictor of expression breadth. In addition, evolutionarily young duplicates have fewer transcription factor binders and narrower expression. Nonetheless, we find several binders and cooperative sets that are disproportionately associated with broad expression, indicating that models more complex than simple correlations should hold more predictive power. Indeed, a machine learning approach improves fit to the data compared with a simple correlation. Machine learning could at best moderately predict tissue of expression of tissue specific genes. Conclusions: We find robust evidence that some expression parameters and paralog expression divergence are strongly predictable with knowledge of transcription factor binding repertoire. While some cooperative complexes can be identified, consistent with the notion that most eukaryotic transcription factors are activating, a simple predictor, the number of binding transcription factors found on a promoter, is a robust predictor of expression breadth.
4.
  • Light, Sara, et al. (author)
  • Protein Expansion Is Primarily due to Indels in Intrinsically Disordered Regions
  • 2013
  • In: Molecular Biology and Evolution. - : Oxford University Press (OUP). - 0737-4038 .- 1537-1719. ; 30:12, pp. 2645-2653
  • Journal article (peer-reviewed)
    • Proteins evolve not only through point mutations but also by insertion and deletion events, which affect the length of the protein. It is well known that such indel events most frequently occur in surface-exposed loops. However, detailed analysis of indel events in distantly related and fast-evolving proteins is hampered by the difficulty involved in correctly aligning such sequences. Here, we circumvent this problem by first only analyzing homologous proteins based on length variation rather than pairwise alignments. Using this approach, we find a surprisingly strong relationship between difference in length and difference in the number of intrinsically disordered residues, where up to three quarters of the length variation can be explained by changes in the number of intrinsically disordered residues. Further, we find that disorder is common in both insertions and deletions. A more detailed analysis reveals that indel events do not induce disorder but rather that already disordered regions accrue indels, suggesting that there is a lowered selective pressure for indels to occur within intrinsically disordered regions.
5.
  • Lundström, Oxana (Sachenkova), 1989-, et al. (author)
  • WebSTR : A Population-wide Database of Short Tandem Repeat Variation in Humans
  • 2023
  • In: Journal of Molecular Biology. - 0022-2836 .- 1089-8638. ; 435:20
  • Journal article (peer-reviewed)
    • Short tandem repeats (STRs) are consecutive repetitions of one to six nucleotide motifs. They are hypervariable due to the high prevalence of repeat unit insertions or deletions primarily caused by polymerase slippage during replication. Genetic variation at STRs has been shown to influence a range of traits in humans, including gene expression, cancer risk, and autism. Until recently STRs have been poorly studied since they pose significant challenges to bioinformatics analyses. Moreover, genome-wide analysis of STR variation in population-scale cohorts requires large amounts of data and computational resources. However, the recent advent of genome-wide analysis tools has resulted in multiple large genome-wide datasets of STR variation spanning nearly two million genomic loci in thousands of individuals from diverse populations. Here we present WebSTR, a database of genetic variation and other characteristics of genome-wide STRs across human populations. WebSTR is based on reference panels of more than 1.7 million human STRs created with state of the art repeat annotation methods and can easily be extended to include additional cohorts or species. It currently contains data based on STR genotypes for individuals from the 1000 Genomes Project, H3Africa, the Genotype-Tissue Expression (GTEx) Project and colorectal cancer patients from the TCGA dataset. WebSTR is implemented as a relational database with programmatic access available through an API and a web portal for browsing data. The web portal is publicly available at https://webstr.ucsd.edu.