SwePub

Scalable Machine Learning through Approximation and Distributed Computing

Vasiloudis, Theodore (author)
KTH, Computational Science and Technology (CST), RISE
Holst, Anders, Adjunct Professor (supervisor)
KTH, Computational Science and Technology (CST)
Boström, Henrik (supervisor)
KTH, Software and Computer Systems, SCS
Haridi, Seif, 1953- (supervisor)
KTH, Software and Computer Systems, SCS
Gillblad, Daniel, PhD (supervisor)
Industrial Supervisor, RISE
Žliobaitė, Indre (opponent)
ISBN 9789178731817
KTH Royal Institute of Technology, 2019
English, 120 pp.
  • Doctoral thesis (other academic/artistic)
Abstract

Machine learning algorithms are now being deployed in practically all areas of our lives. Part of this success can be attributed to their ability to learn complex representations from massive datasets. However, growth in computational speed has not kept pace with the growth in the sizes of the data we want to learn from, naturally leading to algorithms that need to be resource-efficient and parallel. As the proliferation of machine learning continues, the ability of algorithms to adapt to a changing environment and deal with uncertainty becomes increasingly important.

In this thesis we develop scalable machine learning algorithms, with a focus on efficient, online, and distributed computation. We use approximations to dramatically reduce the computational cost of exact algorithms, and develop online learning algorithms to deal with a constantly changing environment under a tight computational budget. We design parallel and distributed algorithms to ensure that our methods can scale to massive datasets.

We first propose a scalable algorithm for graph vertex similarity calculation and concept discovery. We demonstrate its applicability to multiple domains, including text, music, and images, and show its scalability by training on one of the largest text corpora available. Then, motivated by a real-world use case of predicting session length in media streaming, we propose improvements to several aspects of learning with decision trees. We propose two algorithms to estimate the uncertainty in the predictions of online random forests, and show that our approach can achieve better accuracy than the state of the art while being an order of magnitude faster. We then propose a parallel and distributed online tree boosting algorithm that maintains the correctness guarantees of serial algorithms while providing an order-of-magnitude speedup on average. Finally, we propose an algorithm that allows gradient boosted tree training to be distributed across both the data point and feature dimensions. We show that we can achieve communication savings of several orders of magnitude for sparse datasets, compared to existing approaches that can only distribute the computation across the data point dimension and use dense communication.
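As a rough illustration of the first contribution above (vertex similarity on a co-occurrence graph), here is a minimal, hypothetical sketch: it scores two vertices by the cosine similarity of their neighborhood weight vectors. The toy graph and all names are invented for the example, and the exact pairwise measure shown is the kind of computation the thesis approximates to reach corpus scale, not the thesis's actual algorithm.

import math

# Toy weighted co-occurrence graph: term -> {neighbor: co-occurrence weight}.
# Invented for illustration; real inputs would be built from a large corpus.
graph = {
    "cat": {"pet": 8.0, "fur": 5.0, "meow": 7.0},
    "dog": {"pet": 9.0, "fur": 6.0, "bark": 7.0},
    "car": {"road": 9.0, "wheel": 8.0, "pet": 1.0},
}

def neighborhood_cosine(g, u, v):
    """Cosine similarity of the neighbor-weight vectors of vertices u and v.

    Vertices that co-occur with the same neighbors at similar weights score
    near 1; vertices with disjoint neighborhoods score 0.
    """
    nu, nv = g[u], g[v]
    dot = sum(w * nv[k] for k, w in nu.items() if k in nv)
    norm_u = math.sqrt(sum(w * w for w in nu.values()))
    norm_v = math.sqrt(sum(w * w for w in nv.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

print(neighborhood_cosine(graph, "cat", "dog"))  # high: shared neighborhood
print(neighborhood_cosine(graph, "cat", "car"))  # low: mostly disjoint neighborhoods

Computing this exactly for every vertex pair is quadratic in the vocabulary, which is why the thesis pursues approximation and distribution rather than the brute-force form shown here.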

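The uncertainty-estimation contribution is likewise not reproducible in a few lines (scikit-learn has no online random forest, and the thesis's two algorithms are not shown here), but a common batch baseline conveys the idea: take the spread of the individual trees' predictions as an interval around the ensemble mean. The data, parameters, and quantile levels below are arbitrary stand-ins.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=500)  # noisy toy target

forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

x_new = np.array([[1.0]])
# Each tree votes independently; the spread of the votes is a crude
# uncertainty signal around the ensemble's mean prediction.
per_tree = np.array([tree.predict(x_new)[0] for tree in forest.estimators_])
lo, hi = np.quantile(per_tree, [0.05, 0.95])
print(f"prediction {per_tree.mean():.2f}, 90% interval [{lo:.2f}, {hi:.2f}]")

This baseline only measures the ensemble's internal disagreement; the accuracy and speed claims in the abstract refer to the thesis's own online algorithms, not to this sketch.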
Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Sciences (hsv//eng)

Keywords

Online Learning
Distributed Computing
Graph Similarity
Decision Trees
Gradient Boosting

Publication and content type

vet (subject category)
dok (subject category)

