Large-scale ligand-based predictive modelling using support vector machines
- Alvarsson, Jonathan (author)
- Uppsala universitet, Institutionen för farmaceutisk biovetenskap
- Lampa, Samuel (author)
- Uppsala universitet, Institutionen för farmaceutisk biovetenskap
- Schaal, Wesley (author)
- Uppsala universitet, Institutionen för farmaceutisk biovetenskap, Science for Life Laboratory, SciLifeLab
- Andersson, Claes (author)
- Uppsala universitet, Cancerfarmakologi och beräkningsmedicin
- Wikberg, Jarl E. S. (author)
- Uppsala universitet, Institutionen för farmaceutisk biovetenskap
- Spjuth, Ola (author)
- Uppsala universitet, Institutionen för farmaceutisk biovetenskap, Science for Life Laboratory, SciLifeLab
- 2016-08-10
- 2016
- English.
In: Journal of Cheminformatics. - Springer Science and Business Media LLC. - ISSN 1758-2946. - Vol. 8
- Related links:
- https://doi.org/10.1...
- https://uu.diva-port... (primary) (Raw object)
- https://jcheminf.bio...
- https://urn.kb.se/re...
- https://doi.org/10.1...
Abstract
- The increasing size of datasets in drug discovery makes it challenging to build robust and accurate predictive models within a reasonable amount of time. In order to investigate the effect of dataset sizes on predictive performance and modelling time, ligand-based regression models were trained on open datasets of varying sizes of up to 1.2 million chemical structures. For modelling, two implementations of support vector machines (SVM) were used. Chemical structures were described by the signatures molecular descriptor. Results showed that for the larger datasets, the LIBLINEAR SVM implementation performed on par with the well-established libsvm with a radial basis function kernel, but with dramatically less time for model building even on modest computer resources. Using a non-linear kernel proved to be infeasible for large data sizes, even with substantial computational resources on a computer cluster. To deploy the resulting models, we extended the Bioclipse decision support framework to support models from LIBLINEAR and made our models of logD and solubility available from within Bioclipse.
Subject headings
- MEDICAL AND HEALTH SCIENCES -- Basic Medicine -- Pharmaceutical Sciences (hsv//eng)
- NATURAL SCIENCES -- Computer and Information Sciences -- Bioinformatics (hsv//eng)
Keyword
- Predictive modelling; Support vector machine; Bioclipse; Molecular signatures; QSAR
- Bioinformatics
Publication and Content Type
- ref (subject category)
- art (subject category)