SwePub

Results list for search "WFRF:(Schaal Wesley PhD) srt2:(2015-2019)"

Search: WFRF:(Schaal Wesley PhD) > (2015-2019)

  • Results 1-6 of 6
1.
  • Ahmed, Laeeq, et al. (authors)
  • Efficient iterative virtual screening with Apache Spark and conformal prediction
  • 2018
  • In: Journal of Cheminformatics. - : BioMed Central. - 1758-2946. ; 10
  • Journal article (peer-reviewed)
    • Background: Docking and scoring large libraries of ligands against target proteins forms the basis of structure-based virtual screening. The problem is trivially parallelizable, and calculations are generally carried out on computer clusters or on large workstations in a brute force manner, by docking and scoring all available ligands. Contribution: In this study we propose a strategy that is based on iteratively docking a set of ligands to form a training set, training a ligand-based model on this set, and predicting the remainder of the ligands to exclude those predicted as 'low-scoring' ligands. Then, another set of ligands is docked, the model is retrained and the process is repeated until a certain model efficiency level is reached. Thereafter, the remaining ligands are docked or excluded based on this model. We use SVM and conformal prediction to deliver valid prediction intervals for ranking the predicted ligands, and Apache Spark to parallelize both the docking and the modeling. Results: We show on 4 different targets that conformal prediction based virtual screening (CPVS) is able to reduce the number of docked molecules by 62.61% while retaining an accuracy for the top 30 hits of 94% on average and a speedup of 3.7. The implementation is available as open source via GitHub (https://github.com/laeeq80/spark-cpvs) and can be run on high-performance computers as well as on cloud resources.
2.
  • Dahlö, Martin, et al. (authors)
  • Tracking the NGS revolution : managing life science research on shared high-performance computing clusters
  • 2018
  • In: GigaScience. - : Oxford University Press. - 2047-217X. ; 7:5
  • Journal article (peer-reviewed)
    • Background: Next-generation sequencing (NGS) has transformed the life sciences, and many research groups are newly dependent upon computer clusters to store and analyze large datasets. This creates challenges for e-infrastructures accustomed to hosting computationally mature research in other sciences. Using data gathered from our own clusters at UPPMAX computing center at Uppsala University, Sweden, where core hour usage of ∼800 NGS and ∼200 non-NGS projects is now similar, we compare and contrast the growth, administrative burden, and cluster usage of NGS projects with projects from other sciences. Results: The number of NGS projects has grown rapidly since 2010, with growth driven by entry of new research groups. Storage used by NGS projects has grown more rapidly since 2013 and is now limited by disk capacity. NGS users submit nearly twice as many support tickets per user, and 11 more tools are installed each month for NGS projects than for non-NGS projects. We developed usage and efficiency metrics and show that computing jobs for NGS projects use more RAM than non-NGS projects, are more variable in core usage, and rarely span multiple nodes. NGS jobs use booked resources less efficiently for a variety of reasons. Active monitoring can improve this somewhat. Conclusions: Hosting NGS projects imposes a large administrative burden at UPPMAX due to large numbers of inexperienced users and diverse and rapidly evolving research areas. We provide a set of recommendations for e-infrastructures that host NGS research projects. We provide anonymized versions of our storage, job, and efficiency databases.
3.
  •  
4.
  • Kaarme, Johan, et al. (authors)
  • Rapid Increase in Carriage Rates of Enterobacteriaceae Producing Extended-Spectrum β-Lactamases in Healthy Preschool Children, Sweden
  • 2018
  • In: Emerging Infectious Diseases. - : Centers for Disease Control and Prevention (CDC). - 1080-6040, 1080-6059. ; 24:10, pp. 1874-1881
  • Journal article (peer-reviewed)
    • By collecting and analyzing diapers, we identified a >6-fold increase in carriage of extended-spectrum β-lactamase (ESBL)-producing Enterobacteriaceae for healthy preschool children in Sweden (p<0.0001). For 6 of the 50 participating preschools, the carriage rate was >40%. We analyzed samples from 334 children and found 56 containing ≥1 ESBL producer. The prevalence in the study population increased from 2.6% in 2010 to 16.8% in 2016 (p<0.0001). Furthermore, 58% of the ESBL producers were multidrug resistant, and transmission of ESBL-producing and non-ESBL-producing strains was observed at several of the preschools. Toddlers appear to be major carriers of ESBL producers in Sweden.
5.
  • Lapins, Maris, et al. (authors)
  • A confidence predictor for logD using conformal regression and a support-vector machine
  • 2018
  • In: Journal of Cheminformatics. - : Springer Science and Business Media LLC. - 1758-2946. ; 10:1
  • Journal article (peer-reviewed)
    • Lipophilicity is a major determinant of ADMET properties and overall suitability of drug candidates. We have developed large-scale models to predict the water-octanol distribution coefficient (logD) for chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated by a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of [Formula: see text], with the best-performing nonconformity measure having a median prediction interval of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence. The model is available as an online service via an OpenAPI interface and a web page with a molecular editor, and we also publish predictive values at 90% confidence level for 91 M PubChem structures in RDF format for download and as a URI resolver service.
6.
  •  