SwePub
Search the SwePub database

  Advanced search

Result list for the search "WFRF:(Nettelblad Carl 1985 )"

Search: WFRF:(Nettelblad Carl 1985 )

  • Results 1-10 of 16
Sort/group the result list
   
Numbering | Reference | Cover image | Find
1.
  • Ekeberg, Tomas, 1983-, et al. (authors)
  • Observation of a single protein by ultrafast X-ray diffraction
  • 2024
  • In: Light. Springer Nature. ISSN 2095-5545, 2047-7538; 13:1
  • Journal article (peer-reviewed). Abstract:
    • The idea of using ultrashort X-ray pulses to obtain images of single proteins frozen in time has fascinated and inspired many. It was one of the arguments for building X-ray free-electron lasers. According to theory, the extremely intense pulses provide sufficient signal to dispense with using crystals as an amplifier, and the ultrashort pulse duration permits capturing the diffraction data before the sample inevitably explodes. This was first demonstrated on biological samples a decade ago on the giant mimivirus. Since then, a large collaboration has been pushing the limit of the smallest sample that can be imaged. The ability to capture snapshots on the timescale of atomic vibrations, while keeping the sample at room temperature, may allow probing the entire conformational phase space of macromolecules. Here we show the first observation of an X-ray diffraction pattern from a single protein, that of Escherichia coli GroEL which at 14 nm in diameter is the smallest biological sample ever imaged by X-rays, and demonstrate that the concept of diffraction before destruction extends to single proteins. From the pattern, it is possible to determine the approximate orientation of the protein. Our experiment demonstrates the feasibility of ultrafast imaging of single proteins, opening the way to single-molecule time-resolved studies on the femtosecond timescale.
  •  
2.
  • Akram, Adeel (author)
  • Towards a realistic hyperon reconstruction with PANDA at FAIR
  • 2021
  • Licentiate thesis (other academic/artistic). Abstract:
    • The goal of the PANDA (anti-Proton ANnihilation at DArmstadt) experiment at FAIR (Facility for Antiproton and Ion Research) is to study strong interactions in the confinement domain. In PANDA, a continuous beam of anti-protons will impinge on a fixed hydrogen (p) target inside the High Energy Storage Ring (HESR), a feature intended to attain high interaction rates for various physics studies, e.g. hyperon production. Two types of hydrogen targets are under development: a pellet target and a cluster-jet target, where either high-density pellets or clusters of cooled hydrogen gas will be injected at the interaction point. The residual gas from the target system is expected to dissipate along the beam pipe, resulting in a target that is effectively extended outside the designed interaction point. The realistic density profile of the target and residual gas has implications for physics studies, e.g. in the ability to select signals of interest while at the same time suppressing background. All hyperon simulations in PANDA until now have been performed under ideal conditions. In this work, I will for the first time implement more realistic conditions for the beam-target interaction and carry out simulations using the as benchmark channel. The impact of the different configurations of the vacuum system will be discussed in detail. In addition, I will present tests of some of PANDA's particle track finders that are not based on ideal pattern recognition approaches. The results will provide important guidance for future tracking developments within PANDA.
  •  
3.
  • Akram, Adeel (author)
  • Towards Realistic Hyperon Reconstruction in PANDA : From Tracking with Machine Learning to Interactions with Residual Gas
  • 2023
  • Doctoral thesis (other academic/artistic). Abstract:
    • The PANDA (anti-Proton ANnihilation at DArmstadt) experiment at FAIR (Facility for Antiproton and Ion Research) aims to study strong interactions in the confinement domain. In PANDA, a continuous beam of anti-protons will impinge on a fixed hydrogen target inside the High Energy Storage Ring (HESR), a feature intended to attain high interaction rates for various physics studies, e.g. hyperon production. This thesis addresses the challenges of running PANDA under realistic conditions. The focus is two-fold: developing deep learning methods to reconstruct particle trajectories, and reconstructing hyperons using realistic target profiles. Two approaches are used: (i) a standard deep learning model, such as a dense network, and (ii) a geometric deep learning model, such as an interaction graph neural network. The deep learning methods have given promising results, especially when it comes to (i) reconstruction of low-momentum particles that frequently occur in hadron physics experiments and (ii) reconstruction of tracks originating far from the interaction point. Both points are critical in many hyperon studies. However, further studies are needed to mitigate e.g. the high clone rate. For the realistic target profiles, these pioneering simulations address the effect of residual gas on hyperon reconstruction. The results show that the signal-to-background ratio becomes worse by about a factor of 2 compared to the ideal target; however, the background level is still sufficiently low for these studies to be feasible. Further improvements can be made on the target side, to achieve a better vacuum in the beam pipe, and on the analysis side, to improve the event selection. Finally, solutions are suggested to improve the results, especially for the geometric deep learning method in handling the low-momentum particles that contribute to the high clone rate. In addition, a better way to build the ground truth can improve the performance of our approach.
  •  
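A minimal sketch of the edge-classification idea behind the graph-based track finders mentioned in entry 3, written in Python/PyTorch. The hit features (3-D coordinates), layer sizes, and candidate-edge format are assumptions for illustration; this is a much-simplified stand-in, not the dense or interaction-network models evaluated in the thesis.

    import torch
    import torch.nn as nn

    class EdgeClassifier(nn.Module):
        """Scores candidate hit-pair edges as 'same track' vs. 'different tracks'."""

        def __init__(self, node_dim=3, hidden=32):
            super().__init__()
            # Embed each hit, then score each candidate edge from the two embeddings.
            self.node_mlp = nn.Sequential(nn.Linear(node_dim, hidden), nn.ReLU(),
                                          nn.Linear(hidden, hidden))
            self.edge_mlp = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                          nn.Linear(hidden, 1))

        def forward(self, hits, edge_index):
            # hits: (N, 3) hit coordinates; edge_index: (2, E) candidate hit pairs
            h = self.node_mlp(hits)
            src, dst = edge_index
            edge_features = torch.cat([h[src], h[dst]], dim=1)
            return torch.sigmoid(self.edge_mlp(edge_features)).squeeze(-1)

    # Toy usage: 5 hits and 4 candidate edges; real inputs would come from the
    # detector geometry, with ground-truth edge labels used for training.
    hits = torch.randn(5, 3)
    edge_index = torch.tensor([[0, 1, 2, 3],
                               [1, 2, 3, 4]])
    edge_scores = EdgeClassifier()(hits, edge_index)  # one probability per edge

Edges scored this way would then be thresholded and linked into track candidates, which is where effects such as the clone rate discussed above come into play.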
4.
  • Ausmees, Kristiina, et al. (authors)
  • A deep learning framework for characterization of genotype data
  • 2022
  • In: G3. Oxford University Press (OUP). ISSN 2160-1836; 12:3
  • Journal article (peer-reviewed). Abstract:
    • Dimensionality reduction is a data transformation technique widely used in various fields of genomics research. The application of dimensionality reduction to genotype data is known to capture genetic similarity between individuals, and is used for visualization of genetic variation, identification of population structure as well as ancestry mapping. Among frequently used methods are principal component analysis, which is a linear transform that often misses more fine-scale structures, and neighbor-graph based methods which focus on local relationships rather than large-scale patterns. Deep learning models are a type of nonlinear machine learning method in which the features used in data transformation are decided by the model in a data-driven manner, rather than by the researcher, and have been shown to present a promising alternative to traditional statistical methods for various applications in omics research. In this study, we propose a deep learning model based on a convolutional autoencoder architecture for dimensionality reduction of genotype data. Using a highly diverse cohort of human samples, we demonstrate that the model can identify population clusters and provide richer visual information in comparison to principal component analysis, while preserving global geometry to a higher extent than t-SNE and UMAP, yielding results that are comparable to an alternative deep learning approach based on variational autoencoders. We also discuss the use of the methodology for more general characterization of genotype data, showing that it preserves spatial properties in the form of decay of linkage disequilibrium with distance along the genome and demonstrating its use as a genetic clustering method, comparing results to the ADMIXTURE software frequently used in population genetic studies.
  •  
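A minimal sketch, in Python/PyTorch, of the convolutional-autoencoder idea for dimensionality reduction of genotype data described in entry 4. The marker count, layer sizes and the 2-D bottleneck are assumptions for illustration; this is not the model evaluated in the article.

    import torch
    import torch.nn as nn

    N_MARKERS = 1024  # assumed number of SNPs per individual

    class GenotypeAutoencoder(nn.Module):
        def __init__(self, n_markers=N_MARKERS, latent_dim=2):
            super().__init__()
            # Encoder: 1-D convolutions along the marker axis, ending in a
            # 2-dimensional bottleneck so individuals can be plotted directly.
            self.encoder = nn.Sequential(
                nn.Conv1d(1, 8, kernel_size=5, stride=2, padding=2), nn.ELU(),
                nn.Conv1d(8, 16, kernel_size=5, stride=2, padding=2), nn.ELU(),
                nn.Flatten(),
                nn.Linear(16 * (n_markers // 4), latent_dim),
            )
            # Decoder mirrors the encoder and reconstructs the genotype vector.
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 16 * (n_markers // 4)), nn.ELU(),
                nn.Unflatten(1, (16, n_markers // 4)),
                nn.ConvTranspose1d(16, 8, kernel_size=5, stride=2,
                                   padding=2, output_padding=1), nn.ELU(),
                nn.ConvTranspose1d(8, 1, kernel_size=5, stride=2,
                                   padding=2, output_padding=1),
            )

        def forward(self, x):
            z = self.encoder(x)            # (batch, 2) latent coordinates
            return self.decoder(z), z

    # Usage: genotypes coded as 0/1/2 alternate-allele counts, scaled to [0, 1].
    genotypes = torch.randint(0, 3, (32, 1, N_MARKERS)).float() / 2.0
    model = GenotypeAutoencoder()
    reconstruction, latent = model(genotypes)
    loss = nn.functional.mse_loss(reconstruction, genotypes)  # stand-in loss

The 2-D latent coordinates play the role of the first two principal components in a PCA plot, while the nonlinear encoder is what allows finer population structure to be captured.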
5.
  • Ausmees, Kristiina, et al. (authors)
  • Achieving improved accuracy for imputation of ancient DNA
  • 2023
  • In: Bioinformatics. Oxford University Press. ISSN 1367-4803, 1367-4811; 39:1
  • Journal article (peer-reviewed). Abstract:
    • Motivation: Genotype imputation has the potential to increase the amount of information that can be gained from the often limited biological material available in ancient samples. As many widely used tools have been developed with modern data in mind, their design is not necessarily reflective of the requirements in studies of ancient DNA. Here, we investigate if an imputation method based on the full probabilistic Li and Stephens model of haplotype frequencies might be beneficial for the particular challenges posed by ancient data. Results: We present an implementation called prophaser and compare imputation performance to two alternative pipelines that have been used in the ancient DNA community based on the Beagle software. Considering empirical ancient data downsampled to lower coverages as well as present-day samples with artificially thinned genotypes, we show that the proposed method is advantageous at lower coverages, where it yields improved accuracy and ability to capture rare variation. The software prophaser is optimized for running in a massively parallel manner and achieved reasonable runtimes on the experiments performed when executed on a GPU.
  •  
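The Li and Stephens model referenced in entry 5 treats a target haplotype as a mosaic of reference haplotypes, which gives a hidden Markov model whose hidden state is the reference haplotype being copied. A small Python/numpy sketch of the forward recursion follows; the uniform switch rate and fixed miscopy probability are simplifying assumptions, and this is not the prophaser implementation.

    import numpy as np

    def li_stephens_forward(target, ref, switch=0.01, miscopy=0.01):
        """Rescaled forward recursion for a haploid Li & Stephens HMM.

        target : (M,) observed alleles (0/1) of the target haplotype
        ref    : (K, M) reference haplotype panel
        switch : per-site probability of jumping to a random reference haplotype
        miscopy: probability that the observed allele differs from the copied one
        """
        K, M = ref.shape
        emit = np.where(ref[:, 0] == target[0], 1 - miscopy, miscopy)
        alpha = np.full(K, 1.0 / K) * emit          # uniform start, first emission
        alpha /= alpha.sum()
        for m in range(1, M):
            # Transition: stay on the same haplotype or switch uniformly at random.
            alpha = (1 - switch) * alpha + switch * alpha.sum() / K
            emit = np.where(ref[:, m] == target[m], 1 - miscopy, miscopy)
            alpha *= emit
            alpha /= alpha.sum()                    # rescale to avoid underflow
        return alpha        # weight of each reference haplotype at the last site

    # Toy panel: 4 reference haplotypes, 6 biallelic sites.
    ref = np.array([[0, 0, 1, 1, 0, 1],
                    [0, 1, 1, 0, 0, 1],
                    [1, 1, 0, 0, 1, 0],
                    [1, 0, 0, 1, 1, 0]])
    target = np.array([0, 0, 1, 1, 0, 1])
    print(li_stephens_forward(target, ref))

Imputing an unobserved site then amounts to combining forward and backward passes and averaging the reference alleles under the resulting state probabilities.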
6.
  • Ausmees, Kristiina (author)
  • Methodology and Infrastructure for Statistical Computing in Genomics : Applications for Ancient DNA
  • 2022
  • Doctoral thesis (other academic/artistic). Abstract:
    • This thesis concerns the development and evaluation of computational methods for analysis of genetic data. A particular focus is on ancient DNA recovered from archaeological finds, the analysis of which has contributed to novel insights into human evolutionary and demographic history, while also introducing new challenges and the demand for specialized methods. A main topic is that of imputation, or the inference of missing genotypes based on observed sequence data. We present results from a systematic evaluation of a common imputation pipeline on empirical ancient samples, and show that imputed data can constitute a realistic option for population-genetic analyses. We also develop a tool for genotype imputation that is based on the full probabilistic Li and Stephens model for haplotype frequencies and show that it can yield improved accuracy on particularly challenging data. Another central subject in genomics and population genetics is that of data characterization methods that allow for visualization and exploratory analysis of complex information. We discuss challenges associated with performing dimensionality reduction of genetic data, demonstrating how the use of principal component analysis is sensitive to incomplete information and performing an evaluation of methods to handle unobserved genotypes. We also discuss the use of deep learning models as an alternative to traditional methods of data characterization in genomics and propose a framework based on convolutional autoencoders that we exemplify on the applications of dimensionality reduction and genetic clustering. In genomics, as in other fields of research, increasing sizes of data sets are placing larger demands on efficient data management and compute infrastructures. The final part of this thesis addresses the use of cloud resources for facilitating data analysis in scientific applications. We present two different cloud-based solutions, and exemplify them on applications from genomics.
  •  
7.
  • Clouard, Camille (author)
  • A computational and statistical framework for cost-effective genotyping combining pooling and imputation
  • 2024
  • Doctoral thesis (other academic/artistic). Abstract:
    • The information conveyed by genetic markers, such as single nucleotide polymorphisms (SNPs), has been widely used in biomedical research to study human diseases and is increasingly valued in agriculture for genomic selection purposes. Specific markers can be identified as a genetic signature that correlates with certain characteristics in a living organism, e.g. a susceptibility to disease or high-yield traits. Capturing these signatures with sufficient statistical power often requires large volumes of data, with thousands of samples to be analysed and potentially millions of genetic markers to be screened. Relevant effects are particularly delicate to detect when the genetic variations involved occur at low frequencies. The cost of producing such marker genotype data is therefore a critical part of the analysis. Despite recent technological advances, production costs can still be prohibitive on a large scale, and genotype imputation strategies have been developed to address this issue. Genotype imputation methods have been extensively studied in human data and, to a lesser extent, in crop and animal species. A recognised weakness of imputation methods is their lower accuracy in predicting the genotypes of rare variants, whereas these can be highly informative in association studies and improve the accuracy of genomic selection. In this respect, pooling strategies can be well suited to complement imputation, as pooling is efficient at capturing the low-frequency items in a population. Pooling also reduces the number of genotyping tests required, making its use in combination with imputation a cost-effective compromise between accurate but expensive high-density genotyping of each sample individually and stand-alone imputation. However, due to the nature of genotype data and the limitations of genotype testing techniques, decoding pooled genotypes into unique data resolutions is challenging. In this work, we study the characteristics of decoded genotype data from pooled observations with a specific pooling scheme, using the examples of a human cohort and a population of inbred wheat lines. We propose different inference strategies to reconstruct the genotypes before providing them as input to imputation, and we reflect on how the reconstructed distributions affect the results of imputation methods such as tree-based haplotype clustering or coalescent models.
  •  
8.
  • Clouard, Camille, et al. (authors)
  • A joint use of pooling and imputation for genotyping SNPs
  • 2022
  • In: BMC Bioinformatics. Springer Nature. ISSN 1471-2105; 23
  • Journal article (peer-reviewed). Abstract:
    • Background: Despite continuing technological advances, the cost of large-scale genotyping of a high number of samples can be prohibitive. The purpose of this study is to design a cost-saving strategy for SNP genotyping. We suggest making use of pooling, a group testing technique, to reduce the number of SNP arrays needed. We believe that this will be of the greatest importance for non-model organisms with more limited resources in terms of cost-efficient large-scale chips and high-quality reference genomes, such as applications in wildlife monitoring and plant and animal breeding, but it is in essence species-agnostic. The proposed approach consists of grouping and mixing individual DNA samples into pools before testing these pools on bead-chips, such that the number of pools is less than the number of individual samples. We present a statistical estimation algorithm, based on the pooling outcomes, for inferring marker-wise the most likely genotype of every sample in each pool. Finally, we input these estimated genotypes into existing imputation algorithms. We compare the imputation performance on pooled data between the Beagle algorithm and a local likelihood-aware phasing algorithm, closely modeled on MaCH, that we implemented. Results: We conduct simulations based on human data from the 1000 Genomes Project, to aid comparison with other imputation studies. Based on the simulated data, we find that pooling impacts the genotype frequencies of the directly identifiable markers, even without imputation. We also demonstrate how a combinatorial estimation of the genotype probabilities from the pooling design can improve the prediction performance of imputation models. Our algorithm achieves 93% concordance in predicting unassayed markers from pooled data, thus outperforming the Beagle imputation model, which reaches 80% concordance. We observe that the pooling design gives higher concordance for the rare variants than the traditional low-density to high-density imputation commonly used for cost-effective genotyping of large cohorts. Conclusions: We present promising results for combining a pooling scheme for SNP genotyping with computational genotype imputation on human data. These results could find potential applications in any context where genotyping costs form a limiting factor on the study size, such as in marker-assisted selection in plant breeding.
  •  
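A toy Python/numpy illustration of the pooling-then-decoding step described in entry 8: 16 samples are arranged in a 4x4 block, only the 4 row pools and 4 column pools are genotyped, and samples are decoded wherever a homozygous pool pins them down. The block size, pool readout model and allele frequency are assumptions for illustration, not the design evaluated in the article.

    import numpy as np

    rng = np.random.default_rng(0)

    # One SNP for a 4x4 block of samples: 0/1/2 alt-allele counts drawn under
    # Hardy-Weinberg proportions with an assumed alt-allele frequency q.
    q = 0.15
    genotypes = rng.choice([0, 1, 2], size=(4, 4),
                           p=[(1 - q) ** 2, 2 * q * (1 - q), q ** 2])

    def pool_call(samples):
        """Pooled genotype call: which alleles are present in the pool."""
        has_ref = np.any(samples < 2)
        has_alt = np.any(samples > 0)
        if has_ref and has_alt:
            return 1            # the pool looks heterozygous
        return 0 if has_ref else 2

    row_pools = [pool_call(genotypes[i, :]) for i in range(4)]
    col_pools = [pool_call(genotypes[:, j]) for j in range(4)]

    # Decode: a homozygous pool fixes the genotype of every sample it contains;
    # the rest stays missing and is handed over to inference and imputation.
    decoded = np.full((4, 4), -1)       # -1 marks an undecoded genotype
    for i in range(4):
        for j in range(4):
            if row_pools[i] == 0 or col_pools[j] == 0:
                decoded[i, j] = 0
            elif row_pools[i] == 2 or col_pools[j] == 2:
                decoded[i, j] = 2

    print(genotypes)
    print(decoded)

With 8 pooled tests instead of 16 individual ones, many genotypes decode directly, while the ambiguous entries are exactly the ones the estimation and imputation steps of the article are designed to recover.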
9.
  • Clouard, Camille (author)
  • Computational statistical methods for genotyping biallelic DNA markers from pooled experiments
  • 2022
  • Licentiate thesis (other academic/artistic). Abstract:
    • The information conveyed by genetic markers such as Single Nucleotide Polymorphisms (SNPs) has been widely used in biomedical research for studying human diseases, but also increasingly in agriculture by plant and animal breeders for selection purposes. Specific identified markers can act as a genetic signature that is correlated with certain characteristics in a living organism, e.g. a susceptibility to disease or high-yield traits. Capturing these signatures with sufficient statistical power often requires large volumes of data, with thousands of samples to analyze and possibly millions of genetic markers to screen. Establishing statistical significance for effects from genetic variations is especially delicate when they occur at low frequencies. The production cost of such marker genotype data is therefore a critical part of the analysis. Despite recent technological advances, the production cost can still be prohibitive, and genotype imputation strategies have been developed to address this issue. Genotype imputation methods have been widely investigated on human data and to a smaller extent on crop and animal species. In cases where only few reference genomes are available for imputation purposes, such as for non-model organisms, the imputation results can be less accurate. Group testing strategies, also called pooling strategies, can be well suited to complement imputation in large populations and to decrease the number of genotyping tests required compared to testing every individual separately. Pooling is especially efficient for genotyping low-frequency variants. However, because of the particular nature of genotype data and the limitations inherent to genotype testing techniques, decoding pooled genotypes into unique data resolutions is a challenge. Overall, the decoding problem with pooled genotypes can be described as an inference problem in Missing Not At Random data with nonmonotone missingness patterns. Specific inference methods, such as variations of the Expectation-Maximization algorithm, can be used for resolving the pooled data into estimates of the genotype probabilities for every individual. However, the non-randomness of the undecoded data impacts the outcomes of the inference process. This impact propagates to imputation if the inferred genotype probabilities are used as input to classical imputation methods for genotypes. In this work, we propose a study of the specific characteristics of a pooling scheme on genotype data, as well as how it affects the results of imputation methods such as tree-based haplotype clustering or coalescent models.
  •  
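A minimal sketch of how an Expectation-Maximization iteration, as mentioned in entry 9, can turn partially decoded pooled genotypes into probability estimates. It treats the missingness as ignorable and only estimates the allele frequency under Hardy-Weinberg, so it deliberately ignores the Missing Not At Random aspect the abstract highlights; the candidate-set encoding is an assumption for illustration.

    import numpy as np

    def hwe_probs(q):
        """Genotype probabilities for 0, 1, 2 alt alleles under Hardy-Weinberg."""
        return np.array([(1 - q) ** 2, 2 * q * (1 - q), q ** 2])

    def em_allele_frequency(candidate_sets, n_iter=50):
        """EM estimate of the alt-allele frequency from partially decoded samples.

        candidate_sets: one set per sample holding the genotypes still compatible
        with its pooling outcome, e.g. {0}, {2}, {1, 2} or {0, 1, 2}.
        """
        q = 0.5                                     # arbitrary starting value
        genotype_values = np.arange(3)
        for _ in range(n_iter):
            expected_alt = 0.0
            for candidates in candidate_sets:
                prior = hwe_probs(q)
                mask = np.isin(genotype_values, list(candidates))
                posterior = prior * mask
                posterior /= posterior.sum()        # E-step: genotype posterior
                expected_alt += posterior @ genotype_values
            q = expected_alt / (2 * len(candidate_sets))   # M-step
        return q

    # Fully decoded samples contribute singleton sets; undecoded samples keep
    # whatever candidate genotypes the pooling step left open.
    data = [{0}] * 10 + [{2}] * 1 + [{1, 2}] * 3 + [{0, 1, 2}] * 2
    print(em_allele_frequency(data))

The per-sample posteriors computed in the E-step are the kind of genotype probability estimates that would then be passed on to an imputation method.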
10.
  • Clouard, Camille, et al. (authors)
  • Consistency Study of a Reconstructed Genotype Probability Distribution via Clustered Bootstrapping in NORB Pooling Blocks
  • 2022
  • Report (other academic/artistic). Abstract:
    • For applications with biallelic genetic markers, group testing techniques, synonymous with pooling techniques, are usually applied to decrease the cost of large-scale testing, e.g. when detecting carriers of rare genetic variants. In some configurations, the results of the grouped tests cannot be decoded and the pooled items are missing. Inference of these missing items can be performed with specific statistical methods, for example ones related to the Expectation-Maximization algorithm. Pooling has also been applied for determining the genotype of markers in large populations. The particularity of full genotype data for diploid organisms in the context of group testing is the ternary outcome (two homozygous genotypes and one heterozygous), as well as the distribution of these three outcomes in a population, which is often governed by the Hardy-Weinberg Equilibrium and in that case depends on the allele frequency. When using a nonoverlapping repeated block pooling design, the missing items are only observed in particular arrangements. Overall, a data set of pooled genotypes can be described as an inference problem in Missing Not At Random data with nonmonotone missingness patterns. This study presents a preliminary investigation of the consistency of various iterative methods estimating the most likely genotype probabilities of the missing items in pooled data. We use the Kullback-Leibler divergence and the L2 distance between the genotype distribution computed from our estimates and a simulated empirical distribution as measures of distributional consistency.
  •  
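The two consistency measures used in entry 10 are easy to state in code. A short Python/numpy sketch, with made-up genotype frequencies as input:

    import numpy as np

    def kl_divergence(p, q, eps=1e-12):
        """Kullback-Leibler divergence D(p || q) between two genotype distributions."""
        p = np.asarray(p, dtype=float) + eps
        q = np.asarray(q, dtype=float) + eps
        p, q = p / p.sum(), q / q.sum()
        return float(np.sum(p * np.log(p / q)))

    def l2_distance(p, q):
        """Euclidean (L2) distance between two genotype distributions."""
        return float(np.linalg.norm(np.asarray(p, dtype=float) - np.asarray(q, dtype=float)))

    # Example: reconstructed vs. simulated empirical genotype frequencies
    # (homozygous reference, heterozygous, homozygous alternate).
    reconstructed = [0.70, 0.26, 0.04]
    empirical = [0.72, 0.24, 0.04]
    print(kl_divergence(empirical, reconstructed), l2_distance(empirical, reconstructed))

Both measures are zero when the reconstructed distribution matches the empirical one exactly, and grow as the estimates drift away from it.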