SwePub
Search the SwePub database


Result list for search "WFRF:(Aytekin Arda 1986 ) "

Search: WFRF:(Aytekin Arda 1986 )

  • Results 1-6 of 6
1.
  • Aytekin, Arda, 1986- (author)
  • Asynchronous Algorithms for Large-Scale Optimization : Analysis and Implementation
  • 2017
  • Licentiate thesis (other academic/artistic) abstract
    • This thesis proposes and analyzes several first-order methods for convex optimization, designed for parallel implementation in shared and distributed memory architectures. The theoretical focus is on designing algorithms that can run asynchronously, allowing computing nodes to execute their tasks with stale information without jeopardizing convergence to the optimal solution.
      The first part of the thesis focuses on shared memory architectures. We propose and analyze a family of algorithms to solve an unconstrained, smooth optimization problem consisting of a large number of component functions. Specifically, we investigate the effect of information delay, inherent in asynchronous implementations, on the convergence properties of the incremental prox-gradient descent method. Contrary to related proposals in the literature, we establish delay-insensitive convergence results: the proposed algorithms converge under any bounded information delay, and their constant step-size can be selected independently of the delay bound.
      Then, we shift focus to solving constrained, possibly non-smooth, optimization problems in a distributed memory architecture. This time, we propose and analyze two important families of gradient descent algorithms: asynchronous mini-batching and incremental aggregated gradient descent. In particular, for asynchronous mini-batching, we show that, by suitably choosing the algorithm parameters, one can recover the best-known convergence rates established for delay-free implementations, and expect a near-linear speedup with the number of computing nodes. Similarly, for incremental aggregated gradient descent, we establish global linear convergence rates for any bounded information delay.
      Extensive simulations and actual implementations of the algorithms in different platforms on representative real-world problems validate our theoretical results.
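A minimal sketch, in generic notation not taken from the thesis, of the kind of delayed incremental proximal-gradient update the abstract above refers to: at iteration k one component gradient is applied, evaluated at an iterate that may be tau_k iterations old,

    \[
      x_{k+1} = \operatorname{prox}_{\gamma h}\!\left( x_k - \gamma \nabla f_{i_k}\!\left( x_{k-\tau_k} \right) \right),
      \qquad 0 \le \tau_k \le \tau_{\max},
    \]

where f_{i_k} is the component selected at iteration k, h is the (possibly zero) non-smooth term and gamma is a constant step size. The delay-insensitivity claim in the abstract is that gamma can be chosen without knowing tau_max and convergence still holds for any bounded delay; the exact assumptions and constants are those of the thesis, not of this sketch.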
2.
  • Aytekin, Arda, 1986- (author)
  • Asynchronous First-Order Algorithms for Large-Scale Optimization : Analysis and Implementation
  • 2019
  • Doctoral thesis (other academic/artistic) abstract
    • Developments in communication and data storage technologies have made large-scale data collection more accessible than ever. The transformation of this data into insight or decisions typically involves solving numerical optimization problems. As the data volumes increase, the optimization problems grow so large that they can no longer be solved on a single computer. This has created a strong interest in developing optimization algorithms that can be executed efficiently on multiple computing nodes in parallel. One way to achieve efficiency in parallel computations is to allow for asynchrony among nodes, which corresponds to making the nodes spend less time coordinating with each other and more time computing, possibly based on delayed information. However, asynchrony in optimization algorithms runs the risk of rendering otherwise convergent algorithms divergent, and convergence analysis of asynchronous algorithms is generally harder. In this thesis, we develop theory and tools to help understand and implement asynchronous optimization algorithms under time-varying, bounded information delay.
      In the first part, we analyze the convergence of different asynchronous optimization algorithms. We first propose a new approach for minimizing the average of a large number of smooth component functions. The algorithm uses delayed partial gradient information, and it covers delayed incremental gradient and delayed coordinate descent algorithms as special cases. We show that when the total loss function is strongly convex and the component functions have Lipschitz-continuous gradients, the algorithm has a linear convergence rate. The step size of the algorithm can be selected without knowing the bound on the delay and still guarantees convergence to within a predefined level of suboptimality. Then, we analyze two different variants of incremental gradient descent algorithms for regularized optimization problems. In the first variant, asynchronous mini-batching, we consider solving regularized stochastic optimization problems with smooth loss functions. We show that the algorithm with time-varying step sizes achieves the best-known convergence rates established under synchronous operation when (i) the feasible set is compact, or (ii) the regularization function is strongly convex and the feasible set is closed and convex. This means that the delays have an asymptotically negligible effect on the convergence, and we can expect speedups when using asynchronous computations. In the second variant, proximal incremental aggregated gradient, we show that when the objective function is strongly convex, the algorithm with a constant step size that depends on the maximum delay bound and the problem parameters converges globally linearly to the true optimum.
      In the second part, we first present POLO, an open-source C++ library that focuses on algorithm development. We use the policy-based design approach to decompose the proximal gradient algorithm family into its essential policies. This helps us handle combinatorially increasing design choices with linearly many tools, and it generates highly efficient code with a small footprint. Together with its sister library in Julia, POLO.jl, our software framework helps optimization and machine-learning researchers quickly prototype their ideas, benchmark them against the state of the art, and ultimately deploy the algorithms on different computing platforms in just a few lines of code. Then, using the utilities of our software framework, we build a new, "serverless" executor for parallel Alternating Direction Method of Multipliers (ADMM) iterations. We use Amazon Web Services' Lambda functions as the computing nodes, and we observe speedups up to 256 workers and efficiencies above 70% up to 64 workers. These preliminary results suggest that serverless runtimes, together with their availability and elasticity, are promising candidates for scaling the performance of distributed optimization algorithms.
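The proximal incremental aggregated gradient variant mentioned in the abstract can be summarized, again in generic placeholder notation rather than the thesis' own, by an update that aggregates one (possibly outdated) gradient per component:

    \[
      g_k = \frac{1}{n} \sum_{i=1}^{n} \nabla f_i\!\left( x_{k-\tau_k^i} \right),
      \qquad
      x_{k+1} = \operatorname{prox}_{\gamma h}\!\left( x_k - \gamma\, g_k \right),
    \]

where tau_k^i is the age of the most recent gradient available for component i, bounded by the maximum delay. The result stated in the abstract for this family is global linear convergence to the optimum when the objective is strongly convex and the constant step size gamma is chosen as a function of the maximum delay bound and the problem parameters.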
3.
  • Aytekin, Arda, 1986-, et al. (author)
  • Asynchronous incremental block-coordinate descent
  • 2014
  • In: 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE conference proceedings, pp. 19-24
  • Conference paper (peer-reviewed) abstract
    • This paper studies a flexible algorithm for minimizing a sum of component functions, each of which depends on a large number of decision variables. Such formulations appear naturally in “big data” applications, where each function describes the loss estimated using the data available at a specific machine, and the number of features under consideration is huge. In our algorithm, a coordinator updates a global iterate based on delayed partial gradients of the individual objective functions with respect to blocks of coordinates. Delayed incremental gradient and delayed coordinate descent algorithms are obtained as special cases. Under the assumption of strong convexity and block coordinate-wise Lipschitz continuous partial gradients, we show that the algorithm converges linearly to a ball around the optimal value. Contrary to related proposals in the literature, our algorithm is delay-insensitive: it converges for any bounded information delay, and its step-size parameter can be chosen independently of the maximum delay bound.
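One way to write the coordinator update described in the abstract, using illustrative notation that is not necessarily the paper's, is as a partial-gradient step on a single coordinate block b_k with delayed information:

    \[
      [x_{k+1}]_{b_k} = [x_k]_{b_k} - \gamma\, \nabla_{b_k} f_{i_k}\!\left( x_{k-\tau_k} \right),
      \qquad
      [x_{k+1}]_{b} = [x_k]_{b} \quad \text{for } b \neq b_k,
    \]

so that delayed incremental gradient (update all blocks of one component) and delayed coordinate descent (update one block of the full objective) appear as special cases. Under the assumptions quoted in the abstract, the iterates converge linearly to a ball around the optimum for any bounded delay, with a step size chosen independently of the delay bound.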
4.
  • Aytekin, Arda, 1986-, et al. (author)
  • Exploiting serverless runtimes for large-scale optimization
  • 2019
  • In: 2019 IEEE 12th International Conference on Cloud Computing (CLOUD). IEEE Computer Society, pp. 499-501
  • Conference paper (peer-reviewed) abstract
    • Serverless runtimes provide efficient and cost-effective environments for scalable computations, thanks to their event-driven and elastic nature. So far, they have mostly been used for stateless, data parallel and sporadic computations. In this work, we propose exploiting serverless runtimes to solve generic, large-scale optimization problems. To this end, we implement a parallel optimization algorithm for solving a regularized logistic regression problem, and use AWS Lambda for the compute-intensive work. We show that relative speedups up to 256 workers and efficiencies above 70% up to 64 workers can be expected.
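The abstract describes using AWS Lambda for the compute-intensive part of a regularized logistic regression solve. Below is a minimal sketch of that fan-out/aggregate pattern, assuming a hypothetical Lambda function name, payload layout and worker-side logic; it is an illustration of the idea, not the paper's implementation.

    # Illustrative only: fan out per-partition gradient evaluations for a
    # regularized logistic regression problem to AWS Lambda workers and
    # average the results on the coordinator. The Lambda function name,
    # payload layout and worker-side logic are assumptions for this sketch.
    import json
    from concurrent.futures import ThreadPoolExecutor

    import boto3  # AWS SDK for Python

    lambda_client = boto3.client("lambda")

    def partial_gradient(worker_id, iterate, num_workers):
        """Invoke one Lambda worker synchronously; return its partial gradient."""
        payload = {"worker_id": worker_id, "num_workers": num_workers, "x": iterate}
        response = lambda_client.invoke(
            FunctionName="logreg-partial-gradient",   # hypothetical function name
            InvocationType="RequestResponse",
            Payload=json.dumps(payload),
        )
        return json.loads(response["Payload"].read())["gradient"]

    def full_gradient(iterate, num_workers=64):
        """Fan out to num_workers Lambdas in parallel and average the results."""
        with ThreadPoolExecutor(max_workers=num_workers) as pool:
            parts = list(pool.map(
                lambda w: partial_gradient(w, iterate, num_workers),
                range(num_workers),
            ))
        summed = [sum(vals) for vals in zip(*parts)]
        return [v / num_workers for v in summed]

The speedup and efficiency figures quoted in the abstract (up to 256 and 64 workers, respectively) refer to the authors' own implementation; the sketch only shows how a request/response fan-out to Lambda can look with boto3.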
5.
  • Biel, Martin, et al. (author)
  • POLO.jl : Policy-based optimization algorithms in Julia
  • 2019
  • In: Advances in Engineering Software. Elsevier. ISSN 0965-9978, E-ISSN 1873-5339. Vol. 136
  • Journal article (peer-reviewed) abstract
    • We present POLO.jl, a Julia package that helps algorithm developers and machine-learning practitioners design and use state-of-the-art parallel optimization algorithms in a flexible and efficient way. POLO.jl extends our C++ library POLO, which has been designed and implemented with the same intentions. POLO.jl not only wraps selected algorithms in POLO and provides an easy mechanism to use data manipulation facilities and loss function definitions in Julia together with the underlying compiled C++ library, but it also uses the policy-based design technique in a Julian way to help users prototype optimization algorithms from their own building blocks. In our experiments, we observe that there is little overhead when using the compiled C++ code directly within Julia. We also notice that the performance of algorithms implemented in pure Julia is comparable with that of their C++ counterparts. Both libraries are hosted on GitHub under the free MIT license, and can be used easily by pulling the pre-built 64-bit architecture Docker images.
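As a rough illustration of the policy-based decomposition the abstract describes, the sketch below assembles a proximal-gradient loop from interchangeable building blocks. The class, policy names and interfaces are invented for this example and are not POLO's or POLO.jl's actual API; the libraries themselves realize this with C++ templates and Julia's type system rather than plain Python.

    # Toy illustration of policy-based composition; not the POLO / POLO.jl API.
    class ProximalGradient:
        """Proximal-gradient template assembled from interchangeable policies."""

        def __init__(self, smoothing, step, prox, execution):
            self.smoothing = smoothing    # how raw gradients are combined (momentum, aggregation, ...)
            self.step = step              # step-size rule
            self.prox = prox              # proximal operator of the regularizer
            self.execution = execution    # serial, multi-threaded, distributed, ...

        def run(self, x0, gradient, num_iters):
            x = x0
            for k in range(num_iters):
                g = self.execution(gradient, x)       # evaluate (possibly in parallel)
                d = self.smoothing(g)                 # post-process the gradient
                x = self.prox(x - self.step(k) * d)   # proximal step
            return x

    # Example composition: plain serial gradient descent with a constant step,
    # minimizing f(x) = x^2 from x0 = 5.0 as a toy check.
    solver = ProximalGradient(
        smoothing=lambda g: g,
        step=lambda k: 0.1,
        prox=lambda z: z,                      # identity prox = no regularizer
        execution=lambda grad, x: grad(x),
    )
    x_star = solver.run(5.0, gradient=lambda x: 2.0 * x, num_iters=100)

Swapping a single policy, for instance an execution policy that queries remote workers, changes the algorithm's behavior without touching the rest of the loop, which is the "combinatorial design choices with linearly many tools" point made in the abstract.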
6.
  • Feyzmahdavian, Hamid Reza, et al. (author)
  • A delayed proximal gradient method with linear convergence rate
  • 2014
  • In: IEEE International Workshop on Machine Learning for Signal Processing (MLSP). IEEE conference proceedings, pp. 1-6
  • Conference paper (peer-reviewed) abstract
    • This paper presents a new incremental gradient algorithm for minimizing the average of a large number of smooth component functions based on delayed partial gradients. Even with a constant step size, which can be chosen independently of the maximum delay bound and the number of objective function components, the expected objective value is guaranteed to converge linearly to within some ball around the optimum. We derive an explicit expression that quantifies how the convergence rate depends on objective function properties and algorithm parameters such as step-size and the maximum delay. An associated upper bound on the asymptotic error reveals the trade-off between convergence speed and residual error. Numerical examples confirm the validity of our results.
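The type of guarantee summarized in the abstract, linear convergence of the expected objective to a ball around the optimum, has the generic shape below; the symbols are placeholders, and the paper's explicit rate and error expressions differ:

    \[
      \mathbb{E}\!\left[ f(x_k) \right] - f^\star \;\le\; \rho^{\,k} \left( f(x_0) - f^\star \right) + \epsilon,
      \qquad \rho \in (0, 1),
    \]

where both the contraction factor rho and the asymptotic error level epsilon depend on the step size gamma, the maximum delay tau_max, and the smoothness and convexity constants of the objective. The trade-off mentioned in the abstract is of the usual constant-step-size kind: choices of gamma that shrink the residual error epsilon also tend to push rho closer to one, slowing the linear phase.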