SwePub
Sök i LIBRIS databas

  Utökad sökning

id:"swepub:oai:research.chalmers.se:463c9a2f-4146-4f8b-be0f-6a92b93cbb5f"
 

Sökning: id:"swepub:oai:research.chalmers.se:463c9a2f-4146-4f8b-be0f-6a92b93cbb5f" > Adaptiveness and Lo...

Adaptiveness and Lock-free Synchronization in Parallel Stochastic Gradient Descent

Bäckström, Karl, 1994 (författare)
Chalmers tekniska högskola,Chalmers University of Technology
 (creator_code:org_t)
Gothenburg, 2021
Engelska.
  • Licentiatavhandling (övrigt vetenskapligt/konstnärligt)
Abstract Ämnesord
Stäng  
  • The emergence of big data in recent years due to the vast societal digitalization and large-scale sensor deployment has entailed significant interest in machine learning methods to enable automatic data analytics. In a majority of the learning algorithms used in industrial as well as academic settings, the first-order iterative optimization procedure Stochastic gradient descent (SGD), is the backbone. However, SGD is often time-consuming, as it typically requires several passes through the entire dataset in order to converge to a solution of sufficient quality. In order to cope with increasing data volumes, and to facilitate accelerated processing utilizing contemporary hardware, various parallel SGD variants have been proposed. In addition to traditional synchronous parallelization schemes, asynchronous ones have received particular interest in recent literature due to their improved ability to scale due to less coordination, and subsequently waiting time. However, asynchrony implies inherent challenges in understanding the execution of the algorithm and its convergence properties, due the presence of both stale and inconsistent views of the shared state. In this work, we aim to increase the understanding of the convergence properties of SGD for practical applications under asynchronous parallelism and develop tools and frameworks that facilitate improved convergence properties as well as further research and development. First, we focus on understanding the impact of staleness, and introduce models for capturing the dynamics of parallel execution of SGD. This enables (i) quantifying the statistical penalty on the convergence due to staleness and (ii) deriving an adaptation scheme, introducing a staleness-adaptive SGD variant MindTheStep-AsyncSGD, which provably reduces this penalty. Second, we aim at exploring the impact of synchronization mechanisms, in particular consistency-preserving ones, and the overall effect on the convergence properties. To this end, we propose LeashedSGD, an extensible algorithmic framework supporting various synchronization mechanisms for different degrees of consistency, enabling in particular a lock-free and consistency-preserving implementation. In addition, the algorithmic construction of Leashed-SGD enables dynamic memory allocation, claiming memory only when necessary, which reduces the overall memory footprint. We perform an extensive empirical study, benchmarking the proposed methods, together with established baselines, focusing on the prominent application of Deep Learning for image classification on the benchmark datasets MNIST and CIFAR, showing significant improvements in converge time for Leashed-SGD and MindTheStep-AsyncSGD.

Ämnesord

NATURVETENSKAP  -- Data- och informationsvetenskap -- Datorteknik (hsv//swe)
NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Engineering (hsv//eng)
NATURVETENSKAP  -- Data- och informationsvetenskap -- Datavetenskap (hsv//swe)
NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Sciences (hsv//eng)
TEKNIK OCH TEKNOLOGIER  -- Elektroteknik och elektronik -- Datorsystem (hsv//swe)
ENGINEERING AND TECHNOLOGY  -- Electrical Engineering, Electronic Engineering, Information Engineering -- Computer Systems (hsv//eng)

Nyckelord

parallelism
machine learning
Stochastic gradient descent

Publikations- och innehållstyp

lic (ämneskategori)
vet (ämneskategori)

Hitta via bibliotek

Till lärosätets databas

Hitta mer i SwePub

Av författaren/redakt...
Bäckström, Karl, ...
Om ämnet
NATURVETENSKAP
NATURVETENSKAP
och Data och informa ...
och Datorteknik
NATURVETENSKAP
NATURVETENSKAP
och Data och informa ...
och Datavetenskap
TEKNIK OCH TEKNOLOGIER
TEKNIK OCH TEKNO ...
och Elektroteknik oc ...
och Datorsystem
Av lärosätet
Chalmers tekniska högskola

Sök utanför SwePub

Kungliga biblioteket hanterar dina personuppgifter i enlighet med EU:s dataskyddsförordning (2018), GDPR. Läs mer om hur det funkar här.
Så här hanterar KB dina uppgifter vid användning av denna tjänst.

 
pil uppåt Stäng

Kopiera och spara länken för att återkomma till aktuell vy