SwePub

Result list for search "WFRF:(Cloarec Olivier) ;hsvcat:1"

Search: WFRF:(Cloarec Olivier) > Natural sciences

  • Results 1-10 of 11
1.
  • Sjögren, Rickard, 1989- (author)
  • Synergies between Chemometrics and Machine Learning
  • 2021
  • Doctoral thesis (other academic/artistic). Abstract:
    • Thanks to digitization and automation, data in all shapes and forms are generated in ever-growing quantities throughout society, industry and science. Data-driven methods, such as machine learning algorithms, are already widely used to benefit from all these data in all kinds of applications, ranging from text suggestion in smartphones to process monitoring in industry. To ensure maximal benefit to society, we need workflows to generate, analyze and model data that are performant as well as robust and trustworthy. There are several scientific disciplines aiming to develop data-driven methodologies, two of which are machine learning and chemometrics. Machine learning is part of artificial intelligence and develops algorithms that learn from data. Chemometrics, on the other hand, is a subfield of chemistry aiming to generate and analyze complex chemical data in an optimal manner. There is already a certain overlap between the two fields where machine learning algorithms are used for predictive modelling within chemometrics. However, since both fields aim to increase the value of data and have disparate backgrounds, there are plenty of possible synergies to benefit both fields. Thanks to its wide applicability, there are many tools and lessons learned within machine learning that go beyond the predictive models that are used within chemometrics today. On the other hand, chemometrics has always been application-oriented and this pragmatism has made it widely used for quality assurance within regulated industries. This thesis serves to nuance the relationship between the two fields and show that knowledge in either field can be used to benefit the other. We explore how tools widely used in applied machine learning can help chemometrics break new ground in a case study of text analysis of patents in Paper I. We then draw inspiration from chemometrics and show how principles of experimental design can help us optimize large-scale data processing pipelines in Paper II and how a method common in chemometrics can be adapted to allow artificial neural networks to detect outlier observations in Paper III. We then show how experimental design principles can be used to ensure quality in the core of concurrent machine learning, namely generation of large-scale datasets in Paper IV. Lastly, we outline directions for future research and how state-of-the-art research in machine learning can benefit chemometric method development.
2.
  • Bruce, Stephen J, et al. (author)
  • Evaluation of a protocol for metabolic profiling studies on human blood plasma by combined ultra-performance liquid chromatography/mass spectrometry : From extraction to data analysis
  • 2009
  • In: Analytical Biochemistry. - : Elsevier. - 0003-2697 .- 1096-0309. ; 372:2, pp. 237-249
  • Journal article (peer-reviewed). Abstract:
    • The investigation presented here describes a protocol designed to perform high-throughput metabolic profiling analysis on human blood plasma by ultra-performance liquid chromatography/mass spectrometry (UPLC/MS). To address whether a previous extraction protocol for gas chromatography (GC)/MS-based metabolic profiling of plasma could be used for UPLC/MS-based analysis, the original protocol was compared with similar methods for extraction of low-molecular-weight compounds from plasma via protein precipitation. Differences between extraction methods could be observed, but the previously published extraction method was considered the best. UPLC columns with three different stationary phases (C8, C18, and phenyl) were used in identical experimental runs consisting of a total of 60 injections of extracted male and female plasma samples. The C8 column was determined to be the best for metabolic profiling analysis on plasma. The acquired UPLC/MS data of extracted male and female plasma samples was subjected to principal component analysis (PCA) and orthogonal projections to latent structures discriminant analysis (OPLS–DA). Furthermore, a strategy for compound identification was applied here, demonstrating the strength of high-mass-accuracy time-of-flight (TOF)/MS analysis in metabolic profiling.
3.
  • Bylesjö, Max, et al. (author)
  • OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification
  • 2006
  • In: Journal of Chemometrics. - : Wiley. - 0886-9383 .- 1099-128X. ; 20:8-10, pp. 341-351
  • Journal article (peer-reviewed). Abstract:
    • The characteristics of the OPLS method have been investigated for the purpose of discriminant analysis (OPLS-DA). We demonstrate how class-orthogonal variation can be exploited to augment classification performance in cases where the individual classes exhibit divergence in within-class variation, in analogy with soft independent modelling of class analogy (SIMCA) classification. The prediction results will be largely equivalent to traditional supervised classification using PLS-DA if no such variation is present in the classes. A discriminatory strategy is thus outlined, combining the strengths of PLS-DA and SIMCA classification within the framework of the OPLS-DA method. Furthermore, resampling methods have been employed to generate distributions of predicted classification results and subsequently assess classification belief. This enables utilisation of the class-orthogonal variation in a proper statistical context. The proposed decision rule is compared to common decision rules and is shown to produce comparable or less class-biased classification results.
4.
  • Cloarec, Olivier, et al. (author)
  • Evaluation of the Orthogonal Projection on Latent Structure Model Limitations Caused by Chemical Shift Variability and Improved Visualization of Biomarker Changes in 1H NMR Spectroscopic Metabonomic Studies
  • 2005
  • In: Analytical Chemistry. - : American Chemical Society (ACS). - 0003-2700 .- 1520-6882. ; 77:2, pp. 517-26
  • Journal article (peer-reviewed). Abstract:
    • In general, applications of metabonomics using biofluid NMR spectroscopic analysis for probing abnormal biochemical profiles in disease or due to toxicity have all relied on the use of chemometric techniques for sample classification. However, the well-known variability of some chemical shifts in 1H NMR spectra of biofluids due to environmental differences such as pH variation, when coupled with the large number of variables in such spectra, has led to the situation where it is necessary to reduce the size of the spectra or to attempt to align the shifting peaks, to get more robust and interpretable chemometric models. Here, a new approach that avoids this problem is demonstrated and shows that, moreover, inclusion of variable peak position data can be beneficial and can lead to useful biochemical information. The interpretation of chemometric models using combined back-scaled loading plots and variable weights demonstrates that this peak position variation can be handled successfully and also often provides additional information on the physicochemical variations in metabonomic data sets.
5.
  • Cloarec, Olivier, et al. (author)
  • Statistical Total Correlation Spectroscopy: An Exploratory Approach for Latent Biomarker Identification from Metabolic 1H NMR Data Sets
  • 2005
  • In: Analytical Chemistry. - : American Chemical Society (ACS). - 0003-2700 .- 1520-6882. ; 77:5, pp. 1282-89
  • Journal article (peer-reviewed). Abstract:
    • We describe here the implementation of the statistical total correlation spectroscopy (STOCSY) analysis method for aiding the identification of potential biomarker molecules in metabonomic studies based on NMR spectroscopic data. STOCSY takes advantage of the multicollinearity of the intensity variables in a set of spectra (in this case 1H NMR spectra) to generate a pseudo-two-dimensional NMR spectrum that displays the correlation among the intensities of the various peaks across the whole sample. This method is not limited to the usual connectivities that are deducible from more standard two-dimensional NMR spectroscopic methods, such as TOCSY. Moreover, two or more molecules involved in the same pathway can also present high intermolecular correlations because of biological covariance or can even be anticorrelated. This combination of STOCSY with supervised pattern recognition and particularly orthogonal projection on latent structure-discriminant analysis (O-PLS-DA) offers a new powerful framework for analysis of metabonomic data. In a first step O-PLS-DA extracts the part of NMR spectra related to discrimination. This information is then cross-combined with the STOCSY results to help identify the molecules responsible for the metabolic variation. To illustrate the applicability of the method, it has been applied to 1H NMR spectra of urine from a metabonomic study of a model of insulin resistance based on the administration of a carbohydrate diet to three different mice strains (C57BL/6Oxjr, BALB/cOxjr, and 129S6/SvEvOxjr) in which a series of metabolites of biological importance can be conclusively assigned and identified by use of the STOCSY approach.
6.
  • Rantalainen, Mattias, et al. (author)
  • Kernel-based orthogonal projections to latent structures (K-OPLS)
  • 2007
  • In: Journal of Chemometrics. - : Wiley. - 0886-9383 .- 1099-128X. ; 21:7-9, pp. 379-385
  • Journal article (peer-reviewed). Abstract:
    • The orthogonal projections to latent structures (OPLS) method has been successfully applied in various chemical and biological systems for modeling and interpretation of linear relationships between a descriptor matrix and response matrix. A kernel-based reformulation of the original OPLS algorithm is presented where the kernel Gram matrix is utilized as a replacement for the descriptor matrix. This enables usage of the kernel trick to efficiently transform the data into a higher-dimensional feature space where predictive and response-orthogonal components are calculated. This strategy has the capacity to improve predictive performance considerably in situations where strong non-linear relationships exist between descriptor and response variables while retaining the OPLS model framework. We put particular focus on describing properties of the rearranged algorithm in relation to the original OPLS algorithm. Four separate problems, two simulated and two real spectroscopic data sets, are employed to illustrate how the algorithm enables separate modeling of predictive and response-orthogonal variation in the feature space. This separation can be highly beneficial for model interpretation purposes while providing a flexible framework for supervised regression.
7.
  • Rantalainen, Mattias, et al. (author)
  • Piecewise multivariate modelling of sequential metabolic profiling data
  • 2008
  • In: BMC Bioinformatics. - : BioMed Central. - 1471-2105. ; 9
  • Journal article (peer-reviewed). Abstract:
    • Background: Modelling the time-related behaviour of biological systems is essential for understanding their dynamic responses to perturbations. In metabolic profiling studies, the sampling rate and number of sampling points are often restricted due to experimental and biological constraints. Results: A supervised multivariate modelling approach with the objective to model the time-related variation in the data for short and sparsely sampled time-series is described. A set of piecewise Orthogonal Projections to Latent Structures (OPLS) models are estimated, describing changes between successive time points. The individual OPLS models are linear, but the piecewise combination of several models accommodates modelling and prediction of changes which are non-linear with respect to the time course. We demonstrate the method on both simulated and metabolic profiling data, illustrating how time related changes are successfully modelled and predicted. Conclusion: The proposed method is effective for modelling and prediction of short and multivariate time series data. A key advantage of the method is model transparency, allowing easy interpretation of time-related variation in the data. The method provides a competitive complement to commonly applied multivariate methods such as OPLS and Principal Component Analysis (PCA) for modelling and analysis of short time-series data.
8.
  • Rantalainen, Mattias, et al. (author)
  • Statistically Integrated Metabonomic-Proteomic Studies on a Human Prostate Cancer Xenograft Model in Mice
  • 2006
  • In: Journal of Proteome Research. - : American Chemical Society (ACS). - 1535-3893 .- 1535-3907. ; 10, pp. 2642-55
  • Journal article (peer-reviewed). Abstract:
    • A novel statistically integrated proteometabonomic method has been developed and applied to a human tumor xenograft mouse model of prostate cancer. Parallel 2D-DIGE proteomic and 1H NMR metabolic profile data were collected on blood plasma from mice implanted with a prostate cancer (PC-3) xenograft and from matched control animals. To interpret the xenograft-induced differences in plasma profiles, multivariate statistical algorithms including orthogonal projection to latent structure (OPLS) were applied to generate models characterizing the disease profile. Two approaches to integrating metabonomic data matrices are presented based on OPLS algorithms to provide a framework for generating models relating to the specific and common sources of variation in the metabolite concentrations and protein abundances that can be directly related to the disease model. Multiple correlations between metabolites and proteins were found, including associations between serotransferrin precursor and both tyrosine and 3-D-hydroxybutyrate. Additionally, a correlation between decreased concentration of tyrosine and increased presence of gelsolin was also observed. This approach can provide enhanced recovery of combination candidate biomarkers across multi-omic platforms, thus enhancing understanding of in vivo model systems studied by multiple omic technologies.
9.
  • Alinaghi, Masoumeh, et al. (author)
  • Hierarchical time-series analysis of dynamic bioprocess systems
  • 2022
  • In: Biotechnology Journal. - : John Wiley & Sons. - 1860-6768 .- 1860-7314. ; 17:12
  • Journal article (peer-reviewed). Abstract:
    • Background: Monoclonal antibodies (mAbs) are leading types of ‘blockbuster’ biotherapeutics worldwide; they have been successfully used to treat various cancers and chronic inflammatory and autoimmune diseases. Biotherapeutics process development and manufacturing are complicated due to a lack of understanding of the factors that impact cell productivity and product quality attributes. Understanding complex interactions between cells, media, and process parameters on the molecular level is essential to bring biomanufacturing to the next level. This can be achieved by analyzing cell culture metabolic levels connected to vital process parameters like viable cell density (VCD). However, VCD and metabolic profiles are dynamic parameters and inherently correlated with time, leading to a significant correlation without actual causality. Many time-series methods deal with such issues. However, with metabolic profiling, the number of measured variables vastly exceeds the number of experiments, making most existing methods ill-suited and hard to interpret. Methods and Major Results: Here we propose an alternative workflow using hierarchical dimension reduction to visualize and interpret the relation between the evolution of metabolic profiles and dynamic process parameters. The first step of the proposed method is focused on finding a predictive relation between metabolic profiles and process parameters at all time points using OPLS regression. For each time point, the p(corr) obtained from the OPLS model is considered as a differential metabogram and is further assessed using principal components analysis (PCA). Conclusions: Compared to traditional batch modeling, applying the proposed methodology on metabolic data from Chinese Hamster Ovary (CHO) antibody production characterized the dynamic relation between metabolic profiles and critical process parameters.
10.
  • Asim, Muhammad Nabeel, et al. (author)
  • EL-RMLocNet : An explainable LSTM network for RNA-associated multi-compartment localization prediction
  • 2022
  • In: Computational and Structural Biotechnology Journal. - : Elsevier. - 2001-0370. ; 20, pp. 3986-4002
  • Journal article (peer-reviewed). Abstract:
    • Subcellular localization of Ribonucleic Acid (RNA) molecules provides significant insights into the functionality of RNAs and helps to explore their association with various diseases. The predominantly developed single-compartment localization predictors (SCLPs) fail to demystify RNA association with diverse biochemical and pathological processes, which mainly happen through RNA co-localization in multiple compartments. The limited number of multi-compartment localization predictors (MCLPs) manage to produce decent performance only for a target RNA class of a particular sub-type. Further, existing computational approaches have limited practical significance and potential to optimize therapeutics due to their poor degree of model explainability. The paper at hand presents an explainable Long Short-Term Memory (LSTM) network, “EL-RMLocNet”, the predictive performance and interpretability of which are optimized using a novel GeneticSeq2Vec statistical representation learning scheme and an attention mechanism for accurate multi-compartment localization prediction of different RNAs solely using raw RNA sequences. GeneticSeq2Vec generates optimized statistical vectors of raw RNA sequences by capturing short and long range relations of nucleotide k-mers. Using sequence vectors generated by the GeneticSeq2Vec scheme, Long Short-Term Memory layers extract the most informative features, which an attention layer then weights on the basis of their discriminative potential for accurate multi-compartment localization prediction. Through reverse engineering, weights of the statistical feature space are mapped to nucleotide k-mer patterns to make multi-compartment localization prediction decision making transparent and explainable for different RNA classes and species. Empirical evaluation indicates that EL-RMLocNet outperforms the state-of-the-art predictor for subcellular localization prediction of 4 different RNA classes by an average accuracy figure of 8% for the Homo sapiens species and 6% for the Mus musculus species. EL-RMLocNet is freely available as a web server at (https://sds_genetic_analysis.opendfki.de/subcellular_loc/).