SwePub

Result list for search "swepub ;lar1:(umu);conttype:(refereed);srt2:(2000-2004);pers:(Wold Svante)"

  • Results 1-10 of 26
1.
  • Andersson, Per M, et al. (author)
  • Comparison between physicochemical and calculated molecular descriptors
  • 2000
  • In: Journal of Chemometrics: Special Issue: Proceedings of the SSC6, August 1999, HiT/TF, Norway. Issue edited by Kim Esbensen; 14:5-6, pp. 629-42
  • Journal article (peer-reviewed). Abstract:
    • It has earlier been proven that measured physicochemical properties are useful in the selection of building blocks for combinatorial chemistry as well as for investigation of the scope and limitations of organic reactions. However, measured physicochemical properties are only available for small subsets of reagents, starting materials or building blocks; therefore it is necessary to use calculated descriptors, and it is essential that the descriptors are relevant. The objective was to investigate whether three different descriptor data sets contained similar information about the chemical structure, with the major aim of determining whether calculated descriptors contain information similar to that in experimental data. A total of 205 heterogeneous primary amines were characterized using three different data sets of molecular descriptor variables. The first set consisted of four physicochemical variables compiled from the literature and from chemical catalogues of commercially available chemicals. From these four descriptors together with molecular weight, three additional descriptors could be calculated, resulting in a total of eight descriptor variables in the first data set. The second data set consisted of 81 calculated molecular descriptor variables relating to size, connectivity, atom count, topology and electrotopology indices. The third data set consisted of 10 semi-empirical variables (AM1). All the calculated variables were generated using the software Tsar 3.11. The descriptor variable sets were compared using principal component analysis (PCA) and partial least squares projections to latent structures (PLS). The results show that the different descriptor sets do contain similar latent information and that the different types of calculated variables correlate well with the experimental data, making them suitable for use in, e.g., combinatorial library design.
2.
  • Andersson, Per M, et al. (author)
  • Strategies for subset selection of parts of an in-house chemical library
  • 2001
  • In: Journal of Chemometrics. - : Wiley. - 0886-9383. ; 15:4, pp. 353-69
  • Journal article (peer-reviewed). Abstract:
    • When a company decides to perform biological testing of its in-house library, i.e. compounds which have been synthesized or purchased over the years, it is usually not feasible or desirable to test all of them using e.g. high-throughput screening (HTS). The limitation is the usually high number of compounds to test (10⁴-10⁶), leading to practical limitations and high costs in terms of both material costs and disposal considerations. Therefore it is often desirable to make a selection of which compounds to include in the biological testing. A challenge is how to make this selection in order to cover the structural space of the in-house library as well as possible. Here we present and discuss different selection strategies based mainly on statistical molecular design (SMD). These methods require different prior information about the compounds under investigation, e.g. characterization of the chemical structure, affinity/biological activity data or neither of these. Which method to use is largely problem-dependent, i.e. the composition and origin of the library, and hence the structural space, are of great importance. Chemical and biological knowledge about the system under investigation should as far as possible be considered when making the final decision on which method to apply.
3.
  • Artursson, Tom, et al. (author)
  • Study of Preprocessing Methods for the Determination of Crystalline Phases in Binary Mixtures of Drug Substances by X-ray Powder Diffraction and Multivariate Calibration
  • 2000
  • In: Applied Spectroscopy. - : SAGE Publications. - 0003-7028 .- 1943-3530. ; 54:8, pp. 272A-301A
  • Journal article (peer-reviewed). Abstract:
    • In this paper, various preprocessing methods were tested on data generated by X-ray powder diffraction (XRPD) in order to enhance the partial least-squares (PLS) regression modeling performance. The preprocessing methods examined were 22 different discrete wavelet transforms, Fourier transform, Savitzky-Golay, orthogonal signal correction (OSC), and combinations of wavelet transform and OSC, and Fourier transform and OSC. Root mean square error of prediction (RMSEP) of an independent test set was used to measure the performance of the various preprocessing methods. The best PLS model was obtained with a wavelet transform (Symmlet 8), which at the same time compressed the data set by a factor of 9.5. With the use of wavelets and X-ray powder diffraction, concentrations of less than 10% of one crystal form could be detected in a binary mixture. The linear range was found to be 10-70% of the crystalline form of phenacetin, although semiquantitative work could be carried out down to a level of approximately 2%. Furthermore, the wavelet-pretreated models were able to handle admixtures and deliberately added noise.
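The abstract above ranks preprocessing methods by RMSEP on an independent test set. A minimal sketch of that metric follows, assuming the usual definition (square root of the mean squared prediction error); the function name `rmsep` and the example numbers are illustrative and are not taken from the paper.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over an independent test set."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical reference and predicted concentrations (%), not data from the paper.
reference = [10.0, 30.0, 50.0, 70.0]
predicted = [12.0, 28.0, 50.0, 74.0]
print(rmsep(reference, predicted))
```

A lower value means the calibration model predicts the held-out samples more closely, which is how competing preprocessing choices can be compared on equal terms.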
4.
  • Berglund, Anders, 1970-, et al. (author)
  • The GIFI approach to non-linear PLS modeling
  • 2001
  • In: Journal of Chemometrics. - : Wiley InterScience. - 0886-9383 .- 1099-128X. ; 15:4, pp. 321-36
  • Journal article (peer-reviewed). Abstract:
    • The GIFI approach to non-linear modeling involves the transformation of quantitative variables to a set of 1/0 dummies in a similar manner to the way qualitative variables are coded. This is followed by analyzing the sets of 1/0 dummies by principal component analysis, multiple regression or, as discussed here, PLS. The patterns of the resulting coefficients indicate the nature of the non-linearities in the data. Here the potential uses and limitations of PLS regression, in combination with four variants of GIFI coding, are investigated using both simulated and empirical data sets.
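The coding step described above can be sketched as follows. This is only one simple variant, assuming non-overlapping equal-width bins; the abstract mentions four variants of GIFI coding without specifying them here, and the function name `gifi_code` and the `n_bins` parameter are hypothetical.

```python
def gifi_code(values, n_bins=3):
    """Bin a quantitative variable into equal-width intervals and
    return one row of 1/0 dummies per observation (assumed variant)."""
    lo, hi = min(values), max(values)
    # Fall back to a unit width when all values are identical.
    width = (hi - lo) / n_bins or 1.0
    rows = []
    for v in values:
        # The maximum value would land in bin n_bins, so clamp it to the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        rows.append([1 if j == idx else 0 for j in range(n_bins)])
    return rows


# Hypothetical descriptor values, not data from the paper.
for row in gifi_code([0.0, 1.0, 2.5, 4.0, 6.0], n_bins=3):
    print(row)
```

Each original variable becomes a block of dummy columns, and a regression method such as PLS can then assign a separate coefficient to each bin, which is what lets the coefficient pattern reveal a non-linear shape.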
5.
  • Champagne, M, et al. (author)
  • The use of orthogonal signal correction to improve NIR readings of pulp fibre properties
  • 2001
  • In: Pulp & Paper-Canada. - 0316-4004. ; 102:4, pp. 41-3
  • Journal article (peer-reviewed). Abstract:
    • In 1999, Tembec Industries and the National Renewable Energy Laboratory worked together to develop a methodology for applying near-infrared (NIR) technology to the in-house pulp fibre quality properties Q99 and Q97. The initial results with dry pulp samples were encouraging, but the results for wet samples were initially disappointing when standard chemometric techniques were used. Svante Wold developed a new chemometric method called orthogonal signal correction (OSC), which was used to obtain a good correction of Q99 in the wet pulp samples.
6.
  • Dåbakk, Eigil, et al. (author)
  • Inferring lake water chemistry from filtered seston using NIR spectrometry
  • 2000
  • In: Water Research. ; 34:5, pp. 1666-72
  • Journal article (peer-reviewed). Abstract:
    • Near-infrared spectrometry (NIR) is a rapid, inexpensive and reagent-free technique, widely used in industry in areas such as quality control and process management. The technique has great potential for environmental monitoring of aqueous systems. This study assesses relationships, using PLS regression, between NIR spectra of seston collected on glass fibre filters and the following measured lake water parameters: total organic carbon (TOC), total phosphorus (TP), Abs420 and pH. Water samples were collected from 271 oligotrophic lakes during autumn 1995. The predictive model for TOC explained 68% of the variance (SEP=2.1 mg L⁻¹, range 14.9 mg L⁻¹), and that for colour 71% (SEP=0.04 A, range 0.36 A), while the explained variances for pH and TP were 72% (SEP=0.36, range 3.13) and 45% (SEP=4 μg L⁻¹, range 41 μg L⁻¹), respectively. A model correlating NIR spectra and the actual amount of phosphorus in the seston captured on filters explained 86% of the variance (SEP=0.044 μg/filter, range 0.47). Several pretreatments and regression techniques were used in an attempt to enhance modeling performance. However, straightforward PLS on raw data performed best in all cases.
7.
  • Eriksson, Lennart, et al. (author)
  • GIFI-PLS: Modeling of Non-Linearities and Discontinuities in QSAR
  • 2000
  • In: QSAR. ; 19:4, pp. 345-55
  • Journal article (peer-reviewed). Abstract:
    • This paper introduces to the QSAR community a novel method for modeling and understanding non-linear relationships between biological potency and chemical structure properties of molecules. The approach, GIFI-PLS, is based on "binning" of quantitative X-variables into categorical variables. Each categorical variable is then expanded into a set of linked 1/0 dummy variables, which enable modeling of non-linearity. By way of four QSAR data sets, it is demonstrated that GIFI-PLS is useful for modeling of non-linearity and discontinuity in QSAR, and that the predictive power of a QSAR model may improve.
8.
  • Eriksson, Lennart, et al. (author)
  • Megavariate analysis of hierarchical QSAR data
  • 2002
  • In: Journal of Computer-Aided Molecular Design. ; 16:10, pp. 711-26
  • Journal article (peer-reviewed). Abstract:
    • Multivariate PCA and PLS models involving many variables are often difficult to interpret, because plots and lists of loadings, coefficients, VIPs, etc. rapidly become messy and hard to survey. There may then be a strong temptation to eliminate variables to obtain a smaller data set. Such a reduction of variables, however, often removes information and makes the modelling efforts less reliable. Model interpretation may be misleading and predictive power may deteriorate. A better alternative is usually to partition the variables into blocks of logically related variables and apply hierarchical data analysis. Such blocked data may be analyzed by PCA and PLS. This modelling forms the base level of the hierarchical modelling set-up. On the base level, in-depth information is extracted for the different blocks. The score vectors formed on the base level, here called "super variables", may be linked together in new matrices on the top level. On the top level, superficial relationships between the X- and the Y-data are investigated. In this paper the basic principles of hierarchical modelling by means of PCA and PLS are reviewed. One objective of the paper is to disseminate this concept to a broader QSAR audience. The hierarchical methods are used to analyze a set of 10 haloalkanes for which K = 30 chemical descriptors and M = 255 biological responses have been gathered. Due to the complexity of the biological data, they are subdivided into four blocks. All the modelling steps on the base level and the top level are reported and the final QSAR model is interpreted thoroughly.
9.
  • Eriksson, Lennart, et al. (author)
  • On the selection of the training set in environmental QSAR analysis when compounds are clustered
  • 2000
  • In: Journal of Chemometrics. ; 14:5-6, pp. 599-616
  • Journal article (peer-reviewed). Abstract:
    • In QSAR analysis in environmental sciences, adverse effects of chemicals released to the environment are modelled and predicted as a function of the chemical properties of the pollutants. Usually the set of compounds under study contains several classes of substances, i.e. a more or less strongly clustered set. It is then necessary to ensure that the selected training set comprises compounds representing all those chemical classes. Multivariate design in the principal properties of the compound classes is usually appropriate for selecting a meaningful training set. However, with clustered data, often seen in environmental chemistry and toxicology, a single multivariate design may be suboptimal because of the risk of ignoring small classes with few members and only selecting training set compounds from the largest classes. We recently proposed a procedure for training set selection that recognizes clustering. In this approach, when non-selective biological or environmental responses are modelled, local multivariate designs are constructed within each cluster (class). The chosen compounds arising from the local designs are finally united in the overall training set, which thus will contain members from all clusters. The proposed strategy is here further tested and elaborated by applying it to a series of 351 chemical substances for which the soil sorption coefficient is available. These compounds are divided into 14 classes containing between 10 and 52 members. The training set selection is discussed, followed by multivariate QSAR modelling, model interpretation and predictions for the test set. Various types of statistical experimental designs are tested during the training set selection phase.
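The cluster-wise idea in the abstract above (select within each class, then unite the picks) can be sketched as below. Note the substitution: a greedy max-min distance selection stands in for the statistical experimental designs the paper tests, purely as an assumption for illustration. The names `maxmin_select` and `select_training_set`, the two-dimensional coordinates, and the compound labels are all hypothetical.

```python
import math


def maxmin_select(points, k):
    """Greedy max-min selection: start from the point nearest the centroid,
    then repeatedly add the point farthest from those already chosen."""
    k = min(k, len(points))
    if k <= 0:
        return []
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    chosen = [min(range(len(points)), key=lambda i: math.dist(points[i], centroid))]
    while len(chosen) < k:
        remaining = [i for i in range(len(points)) if i not in chosen]
        chosen.append(max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[j]) for j in chosen),
        ))
    return chosen


def select_training_set(clusters, per_cluster):
    """Run a local selection inside every cluster and unite the picks."""
    training = []
    for name, compounds in clusters.items():
        ids = list(compounds)
        coords = [compounds[c] for c in ids]
        training.extend((name, ids[i]) for i in maxmin_select(coords, per_cluster))
    return training


# Hypothetical principal-property coordinates for two classes of unequal size.
clusters = {
    "large": {"A1": (0.0, 0.0), "A2": (1.0, 0.0), "A3": (0.0, 1.0),
              "A4": (5.0, 5.0), "A5": (1.0, 1.0)},
    "small": {"B1": (10.0, 10.0), "B2": (11.0, 10.0)},
}
print(select_training_set(clusters, per_cluster=2))
```

Because the selection runs separately in each class, the small class contributes members even though a single global selection could be dominated by the large one.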
10.
  • Eriksson, Lennart, et al. (author)
  • Orthogonal signal correction, wavelet analysis, and multivariate calibration of complicated process fluorescence data
  • 2000
  • In: Analytica Chimica Acta. ; 420:2, pp. 181-95
  • Journal article (peer-reviewed). Abstract:
    • In this paper, multivariate calibration of complicated process fluorescence data is presented. Two data sets related to the production of white sugar are investigated. The first data set comprises 106 observations and 571 spectral variables, and the second data set 268 observations and 3997 spectral variables. In both applications, a single response, ash content, is modelled and predicted as a function of the spectral variables. Both data sets contain certain features making multivariate calibration efforts non-trivial. The objective is to show how principal component analysis (PCA) and partial least squares (PLS) regression can be used to overview the data sets and to establish predictively sound regression models. It is shown how a recently developed technique for signal filtering, orthogonal signal correction (OSC), can be applied in multivariate calibration to enhance predictive power. In addition, signal compression is tested on the larger data set using wavelet analysis. It is demonstrated that a compression down to 4% of the original matrix size - in the variable direction - is possible without loss of predictive power. It is concluded that the combination of OSC for pre-processing and wavelet analysis for compression of spectral data is promising for future use.
