SwePub

onr:"swepub:oai:DiVA.org:uu-534289"
 


Reducing training data needs with minimal multilevel machine learning (M3L)

Heinen, Stefan (author)
Vector Inst Artificial Intelligence, Toronto, ON M5S 1M1, Canada.
Khan, Danish (author)
Vector Inst Artificial Intelligence, Toronto, ON M5S 1M1, Canada.;Univ Toronto, Dept Chem, St George Campus, Toronto, ON, Canada.
Falk von Rudorff, Guido (author)
Univ Kassel, Dept Chem, Heinrich-Plett-Str 40, D-34132 Kassel, Germany.;Ctr Interdisciplinary Nanostruct Sci & Technol CIN, Heinrich-Plett-Str 40, D-34132 Kassel, Germany.
Karandashev, Konstantin (author)
Univ Vienna, Fac Phys, Kolingasse 14-16, AT-1090 Vienna, Austria.
Arismendi Arrieta, Daniel Jose (author)
Uppsala University, Structural Chemistry
Price, Alastair J. A. (author)
Univ Toronto, Dept Chem, St George Campus, Toronto, ON, Canada.;Univ Toronto, Accelerat Consortium, 80 St George St, Toronto, ON M5S 3H6, Canada.
Nandi, Surajit (author)
DTU, Dept Energy Convers & Storage, Anker Engelunds Vej, DK-2800 Lyngby, Denmark.
Bhowmik, Arghya (author)
DTU, Dept Energy Convers & Storage, Anker Engelunds Vej, DK-2800 Lyngby, Denmark.
Hermansson, Kersti, Professor (author)
Uppsala University, Structural Chemistry
von Lilienfeld, O. Anatole (author)
Vector Inst Artificial Intelligence, Toronto, ON M5S 1M1, Canada.;Univ Toronto, Dept Chem, St George Campus, Toronto, ON, Canada.;Univ Toronto, Accelerat Consortium, 80 St George St, Toronto, ON M5S 3H6, Canada.;Univ Toronto, Dept Mat Sci & Engn, St George Campus, Toronto, ON, Canada.;Univ Toronto, Dept Phys, St George Campus, Toronto, ON, Canada.;Tech Univ Berlin, Machine Learning Grp, Berlin, Germany.;Berlin Inst Fdn Learning & Data, Berlin, Germany.
Institute of Physics Publishing (IOPP), 2024
English.
In: Machine Learning: Science and Technology. Institute of Physics Publishing (IOPP). ISSN 2632-2153. 5:2
  • Journal article (peer-reviewed)
Abstract
For many machine learning applications in science, data acquisition, not training, is the bottleneck, even when avoiding experiments and relying on computation and simulation. Correspondingly, and in order to reduce cost and carbon footprint, training data efficiency is key. We introduce minimal multilevel machine learning (M3L), which optimizes training data set sizes using a loss function at multiple levels of reference data in order to minimize a combination of prediction error and overall training data acquisition cost (as measured by computational wall-times). Numerical evidence has been obtained for calculated atomization energies and electron affinities of thousands of organic molecules at various levels of theory, including HF, MP2, DLPNO-CCSD(T), DFHFCABS, PNOMP2F12, and PNOCCSD(T)F12, treated with the basis sets TZ, cc-pVTZ, and AVTZ-F12. Our M3L benchmarks for reaching chemical accuracy in distinct chemical compound sub-spaces indicate computational cost reductions by factors of ∼1.01, 1.1, 3.8, 13.8, and 25.8 when compared to heuristic, sub-optimal multilevel machine learning (M2L) for the data sets QM7b, QM9^LCCSD(T), Electrolyte Genome Project, QM9^CCSD(T)_AE, and QM9^CCSD(T)_EA, respectively. Furthermore, we use M2L to investigate the performance of 76 density functionals when used within multilevel learning, building on the following levels drawn from the hierarchy of Jacob's Ladder: LDA, GGA, mGGA, and hybrid functionals. Within M2L and for the molecules considered, mGGAs do not provide any noticeable advantage over GGAs. Among the functionals considered, and in combination with LDA, the three GGA levels and the three hybrid levels that perform best on average for atomization energies on QM9 using M3L are PW91, KT2, and B97D, and τ-HCTH, B3LYP*(VWN5), and TPSSH, respectively.
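
The trade-off described in the abstract, choosing how many training points to acquire at each level of theory so that a target error is reached at the lowest acquisition cost, can be illustrated with a toy calculation. The sketch below is not the authors' loss function or optimizer. It assumes a hypothetical power-law error model, made-up per-sample costs, and a plain grid search; every function name and number in it is an assumption for illustration only.

```python
from itertools import product

def predicted_error(sizes, prefactors, exponents):
    # Hypothetical error model: each level contributes a power-law
    # learning-curve term a * N**(-b). This form is an assumption.
    return sum(a * n ** (-b) for n, a, b in zip(sizes, prefactors, exponents))

def total_cost(sizes, costs):
    # Acquisition cost: number of samples times per-sample wall-time, summed over levels.
    return sum(n * c for n, c in zip(sizes, costs))

def cheapest_sizes(costs, prefactors, exponents, target_error, candidates):
    # Exhaustive search over candidate training set sizes per level.
    # Returns (cost, sizes) for the cheapest combination meeting the target,
    # or None if no combination on the grid reaches it.
    best = None
    for sizes in product(candidates, repeat=len(costs)):
        if predicted_error(sizes, prefactors, exponents) > target_error:
            continue
        cost = total_cost(sizes, costs)
        if best is None or cost < best[0]:
            best = (cost, sizes)
    return best

# Made-up numbers for three levels, ordered from cheapest to most expensive.
costs = [1.0, 10.0, 100.0]        # wall-time per sample (arbitrary units)
prefactors = [20.0, 10.0, 5.0]    # learning-curve offsets (hypothetical)
exponents = [0.5, 0.5, 0.5]       # learning-curve slopes (hypothetical)
candidates = [2 ** k for k in range(4, 15)]  # 16, 32, ..., 16384

best = cheapest_sizes(costs, prefactors, exponents, 1.0, candidates)
print(best)
```

With these toy inputs the search places most training points at the cheapest level and few at the most expensive one, which is the qualitative behaviour a cost-aware choice of training set sizes is meant to produce.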

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Sciences (hsv//eng)

Keywords

quantum machine learning
computational cost
multilevel machine learning
delta learning
quantum chemistry
kernel ridge regression

Publication and content type

ref (subject category)
art (subject category)
