SwePub
Using Active Learning to Develop Machine Learning Models for Reaction Yield Prediction

Johansson, Simon, 1994 (author)
Chalmers University of Technology, AstraZeneca AB
Gummesson Svensson, Hampus, 1996 (author)
AstraZeneca AB, Chalmers University of Technology
Bjerrum, E. (author)
AstraZeneca AB
Schliep, Alexander, 1967 (author)
University of Gothenburg, Department of Computer Science and Engineering (GU)
Haghir Chehreghani, Morteza, 1982 (author)
Chalmers University of Technology
Tyrchan, C. (author)
AstraZeneca AB
Engkvist, Ola, 1967 (author)
Chalmers University of Technology, AstraZeneca AB
2022-07-14
2022
English.
In: Molecular Informatics. - : Wiley. - 1868-1743 .- 1868-1751. ; 41:12
  • Journal article (peer-reviewed)
Abstract
  • Computer aided synthesis planning, suggesting synthetic routes for molecules of interest, is a rapidly growing field. The machine learning methods used are often dependent on access to large datasets for training, but finite experimental budgets limit how much data can be obtained from experiments. This suggests the use of schemes for data collection such as active learning, which identifies the data points of highest impact for model accuracy, and which has been used in recent studies with success. However, little has been done to explore the robustness of the methods predicting reaction yield when used together with active learning to reduce the amount of experimental data needed for training. This study aims to investigate the influence of machine learning algorithms and the number of initial data points on reaction yield prediction for two public high-throughput experimentation datasets. Our results show that active learning based on output margin reached a pre-defined AUROC faster than random sampling on both datasets. Analysis of feature importance of the trained machine learning models suggests active learning had a larger influence on the model accuracy when only a few features were important for the model prediction.
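
The margin-based selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes one common reading of "output margin": the gap between the two highest predicted class probabilities, with the smallest gaps queried first. This is not the authors' implementation; the function names `output_margin` and `select_by_margin` and the probability values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def output_margin(class_probs: Sequence[float]) -> float:
    """Gap between the two most probable classes (needs at least two classes).

    A small margin means the model is uncertain about this data point.
    """
    top_two = sorted(class_probs, reverse=True)[:2]
    return top_two[0] - top_two[1]


def select_by_margin(pool_probs: Sequence[Sequence[float]], batch_size: int) -> List[int]:
    """Return indices of the `batch_size` pool items with the smallest margin."""
    ranked = sorted(range(len(pool_probs)), key=lambda i: output_margin(pool_probs[i]))
    return ranked[:batch_size]


# Hypothetical predicted probabilities (low yield, high yield) for five
# unlabeled reactions. The values are illustrative, not taken from the study.
pool_probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.20, 0.80],
    [0.48, 0.52],
    [0.70, 0.30],
]

# Pick the two reactions the model is least sure about.
query = select_by_margin(pool_probs, batch_size=2)
print(query)  # [3, 1]
```

In an active learning loop, the selected reactions would be run experimentally, added to the training set, and the model retrained before the next selection round. Random sampling, the baseline named in the abstract, would instead pick indices uniformly from the pool.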

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences (hsv//eng)
NATURAL SCIENCES  -- Computer and Information Sciences -- Language Technology (hsv//eng)
NATURAL SCIENCES  -- Computer and Information Sciences -- Bioinformatics (hsv//eng)
NATURAL SCIENCES  -- Biological Sciences -- Bioinformatics and Systems Biology (hsv//eng)

Keywords

Active Learning
Reaction Yield Prediction
Bayesian Matrix Factorization
Random Forest
Neural Networks
Pharmacology & Pharmacy
Computer Science
Mathematical & Computational Biology

Publication and content type

ref (subject category)
art (subject category)
