SwePub
Search the SwePub database


Result list for the search "WFRF:(Alistarh Dan)"


  • Results 1-2 of 2
1.
  • Alistarh, Dan, et al. (author)
  • The Convergence of Sparsified Gradient Methods
  • 2018
  • In: Advances in Neural Information Processing Systems 31 (NIPS 2018). Neural Information Processing Systems (NIPS).
  • Conference paper (peer-reviewed). Abstract:
    • Stochastic Gradient Descent (SGD) has become the standard tool for distributed training of massive machine learning models, in particular deep neural networks. Several families of communication-reduction methods, such as quantization, large-batch methods, and gradient sparsification, have been proposed to reduce the overheads of distribution. To date, gradient sparsification methods (where each node sorts gradients by magnitude and only communicates a subset of the components, accumulating the rest locally) are known to yield some of the largest practical gains. Such methods can reduce the amount of communication per step by up to three orders of magnitude while preserving model accuracy. Yet, this family of methods currently has no theoretical justification. This is the question we address in this paper. We prove that, under analytic assumptions, sparsifying gradients by magnitude with local error correction provides convergence guarantees, for both convex and non-convex smooth objectives, for data-parallel SGD. The main insight is that sparsification methods implicitly maintain bounds on the maximum impact of stale updates, thanks to selection by magnitude. Our analysis also reveals that these methods do require analytical conditions to converge well, justifying and complementing existing heuristics. (A minimal illustrative sketch of this scheme follows the record.)
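The abstract above describes magnitude-based (top-k) gradient sparsification with local error accumulation. Below is a minimal single-process NumPy sketch of that idea; it is not the authors' implementation, and the function names, step size, sparsity level, and toy least-squares objective are assumptions made purely for illustration.

import numpy as np

def top_k_sparsify(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def sparsified_gd(grad_fn, x0, lr, k, steps):
    """Gradient descent where only the top-k gradient components are applied.

    residual accumulates the components that were not applied and adds them
    back before the next sparsification (local error correction).
    """
    x = x0.astype(float).copy()
    residual = np.zeros_like(x)
    for _ in range(steps):
        corrected = grad_fn(x) + residual      # re-inject previously dropped mass
        sparse = top_k_sparsify(corrected, k)  # the part a worker would communicate
        residual = corrected - sparse          # keep the rest locally
        x -= lr * sparse
    return x

if __name__ == "__main__":
    # Toy smooth convex objective f(x) = 0.5 * ||A x - b||^2 (assumed for the demo).
    rng = np.random.default_rng(0)
    A = rng.normal(size=(20, 10))
    b = rng.normal(size=20)
    grad = lambda x: A.T @ (A @ x - b)
    x_hat = sparsified_gd(grad, np.zeros(10), lr=0.01, k=2, steps=2000)
    print("final objective:", 0.5 * np.linalg.norm(A @ x_hat - b) ** 2)

In a real data-parallel deployment each worker would sparsify its own gradient and keep its own residual; the loop above only illustrates the sparsify-and-accumulate mechanism the abstract refers to.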
2.
  • Khirirat, Sarit, et al. (author)
  • Gradient compression for communication-limited convex optimization
  • 2018
  • In: 2018 IEEE Conference on Decision and Control (CDC). Institute of Electrical and Electronics Engineers (IEEE). ISBN 9781538613955, pp. 166-171.
  • Conference paper (peer-reviewed). Abstract:
    • Data-rich applications in machine learning and control have motivated intense research on large-scale optimization. Novel algorithms have been proposed and shown to have optimal convergence rates in terms of iteration counts. However, their practical performance is severely degraded by the cost of exchanging high-dimensional gradient vectors between computing nodes. Several gradient compression heuristics have recently been proposed to reduce communications, but few theoretical results exist that quantify how they impact algorithm convergence. This paper establishes and strengthens the convergence guarantees for gradient descent under a family of gradient compression techniques. For convex optimization problems, we derive admissible step sizes and quantify both the number of iterations and the number of bits that need to be exchanged to reach a target accuracy. Finally, we validate the performance of different gradient compression techniques in simulations. The numerical results highlight the properties of different gradient compression algorithms and confirm that fast convergence with limited information exchange is possible. (An illustrative sketch of compressed gradient descent follows the record.)
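The abstract above concerns gradient descent combined with gradient compression. Below is a minimal NumPy sketch of that setup using a scaled-sign compressor as one concrete example; the compressor, step size, and toy quadratic objective are assumptions for illustration and are not claimed to be the specific family analyzed in the paper.

import numpy as np

def scaled_sign(g):
    """Compress g to its sign pattern, rescaled to preserve its l1 mass."""
    return (np.linalg.norm(g, 1) / g.size) * np.sign(g)

def compressed_gradient_descent(grad_fn, x0, lr, steps):
    """Gradient descent in which every gradient is compressed before the update."""
    x = x0.astype(float).copy()
    for _ in range(steps):
        x -= lr * scaled_sign(grad_fn(x))
    return x

if __name__ == "__main__":
    # Toy strongly convex quadratic f(x) = 0.5 * x^T Q x - c^T x (assumed for the demo).
    rng = np.random.default_rng(1)
    M = rng.normal(size=(8, 8))
    Q = M @ M.T + 8.0 * np.eye(8)
    c = rng.normal(size=8)
    grad = lambda x: Q @ x - c
    x_hat = compressed_gradient_descent(grad, np.zeros(8), lr=0.02, steps=2000)
    print("gradient norm at x_hat:", np.linalg.norm(grad(x_hat)))

Each step here exchanges only the sign pattern of the gradient plus a single scaling factor, illustrating the kind of reduced information exchange whose effect on convergence the paper quantifies.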
Publication type
conference paper (2)
Content type
peer-reviewed (2)
Author/editor
Alistarh, Dan (2)
Johansson, Mikael (2)
Khirirat, Sarit (2)
Hoefler, Torsten (1)
Konstantinov, Nikola (1)
Renggli, Cedric (1)
University
Kungliga Tekniska Högskolan (2)
Language
English (2)
Research subject (UKÄ/SCB)
Natural sciences (1)
Engineering and Technology (1)