SwePub
Results list for search "L773:1530 2075 OR L773:0769519261"

  • Results 1-9 of 9
1.
  • Andersson, Björn, 1974, et al. (author)
  • Global Priority-Driven Aperiodic Scheduling on Multiprocessors
  • 2003
  • In: Proceedings. International Parallel and Distributed Processing Symposium, 2003. - 1530-2075. - 0769519261
  • Conference paper (peer-reviewed)
    • This paper studies multiprocessor scheduling for aperiodic tasks where future arrivals are unknown. A previously proposed priority-driven scheduling algorithm for periodic tasks with migration capability is extended to aperiodic scheduling and is shown to have a capacity bound of 0.5. This bound is close to the best achievable for a priority-driven scheduling algorithm. With an infinite number of processors, no priority-driven scheduling algorithm can perform better. We also propose a simple admission controller which guarantees that admitted tasks meet their deadlines and for many workloads, it admits tasks so that the utilization can be kept above the capacity bound.
2.
  • Andersson, Björn, 1974, et al. (author)
  • Partitioned Aperiodic Scheduling on Multiprocessors
  • 2003
  • In: Proceedings. International Parallel and Distributed Processing Symposium, 2003. - 1530-2075. - 0769519261
  • Conference paper (peer-reviewed)
    • This paper studies multiprocessor scheduling for aperiodic tasks where future arrivals are unknown. We propose an algorithm for tasks without migration capabilities and prove that it has a capacity bound of 0.31. No algorithm for tasks without migration capabilities can have a capacity bound greater than 0.50.
3.
  • Atalar, Aras, 1985, et al. (author)
  • Modeling Energy Consumption of Lock-Free Queue Implementations
  • 2015
  • In: 29th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015, Hyderabad, India, 25-29 May. - : IEEE Computer Society. - 1530-2075. - 9781479986484 ; pp. 229-238
  • Conference paper (peer-reviewed)
    • This paper considers the problem of modeling the energy behavior of lock-free concurrent queue data structures. Our main contribution is a way to model the energy behavior of lock-free queue implementations and parallel applications that use them. Focusing on steady state behavior we decompose energy behavior into throughput and power dissipation which can be modeled separately and later recombined into several useful metrics, such as energy per operation. Based on our models, instantiated from synthetic benchmark data, and using only a small amount of additional application specific information, energy and throughput predictions can be made for parallel applications that use the respective data structure implementation. To model throughput we propose a generic model for lock-free queue throughput behavior, based on a combination of the dequeuers' throughput and enqueuers' throughput. To model power dissipation we commonly split the contributions from the various computer components into static, activation and dynamic parts, where only the dynamic part depends on the actual instructions being executed. To instantiate the models a synthetic benchmark explores each queue implementation over the dimensions of processor frequency and number of threads. Finally, we show how to make predictions of application throughput and power dissipation for a parallel application using a lock-free queue requiring only a limited amount of information about the application work done between queue operations. Our case study on a Mandelbrot application shows convincing prediction results.
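
The recombination of throughput and power into energy per operation mentioned in the abstract can be illustrated with a minimal sketch. The function names and the numbers are hypothetical, and summing the static, activation and dynamic parts into a total power is an assumption made here for illustration, not the paper's actual model.

```python
def total_power(static_w: float, activation_w: float, dynamic_w: float) -> float:
    """Total power in watts, assuming (for illustration) the three parts simply add up."""
    return static_w + activation_w + dynamic_w


def energy_per_operation(power_w: float, throughput_ops_per_s: float) -> float:
    """Energy per operation in joules: watts divided by operations per second."""
    if throughput_ops_per_s <= 0:
        raise ValueError("throughput must be positive")
    return power_w / throughput_ops_per_s


# Hypothetical figures, not measurements from the paper.
power = total_power(static_w=10.0, activation_w=5.0, dynamic_w=25.0)
joules_per_op = energy_per_operation(power, throughput_ops_per_s=2_000_000)
print(f"{power} W at 2e6 ops/s is {joules_per_op:.2e} J per operation")
```

Because the steady-state throughput and power are modeled separately, either one can be re-estimated for a new frequency or thread count without touching the other.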
4.
  • Cederman, Daniel, 1981, et al. (author)
  • A Study of the Behavior of Synchronization Methods in Commonly Used Languages and Systems
  • 2013
  • In: Proceedings of the 27th IEEE International Parallel & Distributed Processing Symposium. - 1530-2075. - 9780769549712 ; pp. 1309-1320
  • Conference paper (peer-reviewed)
    • Synchronization is a central issue in concurrency and plays an important role in the behavior and performance of modern programs. Programming language and hardware designers are trying to provide synchronization constructs and primitives that can handle concurrency and synchronization issues efficiently. Programmers have to find a way to select the most appropriate constructs and primitives in order to gain the desired behavior and performance under concurrency. Several parameters and factors affect the choice, through complex interactions among (i) the language and the language constructs that it supports, (ii) the system architecture, (iii) possible run-time environments, virtual machine options and memory management support and (iv) applications. We present a systematic study of synchronization strategies, focusing on concurrent data structures. We have chosen concurrent data structures with different numbers of contention spots. We consider both coarse-grain and fine-grain locking strategies, as well as lock-free methods. We have investigated synchronization-aware implementations in C++, C# (.NET and Mono) and Java. Considering the machine architectures, we have studied the behavior of the implementations on both Intel's Nehalem and AMD's Bulldozer. The properties that we study are throughput and fairness under different workloads and multiprogramming execution environments. For NUMA architectures fairness is becoming as important as the typically considered throughput property. To the best of our knowledge this is the first systematic and comprehensive study of synchronization-aware implementations. This paper takes steps towards capturing a number of guiding principles and concerns for the selection of the programming environment and synchronization methods in connection to the application and the system characteristics.
5.
  • Fjälling, Tobias, et al. (author)
  • Performance Impact of Batching Web Application Requests using Hot-spot Processing on GPUs
  • 2015
  • In: 29th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015, Hyderabad, India, 25-29 May 2015. - 1530-2075. - 9781479986484 ; pp. 989-999
  • Conference paper (peer-reviewed)
    • Web applications are a good fit for many-core servers because of their inherent high degree of request-level parallelism. Yet, processing-intensive web-server requests can lead to low quality-of-service due to hot-spots, which calls for methods that can improve single-thread performance. This paper explores how to use off-chip GPUs to speed up web application hot-spots written in productivity-friendly environments (e.g. C#). First, we apply a number of straightforward optimizations through refactoring of commercial-strength web application code. This yields a speedup of 7.6 in a multi-threaded, multi-core CPU test. Second, we gather similar requests from different threads of the optimized code, by applying a technique called batching, to exploit the SIMD parallelism provided by GPUs. Surprisingly, there is ample parallelism to be exploited from the already optimized code, yielding a speedup of between 2x and 3x compared to the best optimized CPU version.
6.
  • Georgiadis, Georgios, 1981, et al. (author)
  • Overlays with preferences: Approximation algorithms for matching with preference lists
  • 2010
  • In: Proceedings of the 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2010). - 1530-2075. - 9781424464425
  • Conference paper (peer-reviewed)
    • A key property of overlay networks, one that is going to play an important part in future networking solutions, is the peers' ability to establish connections with other peers based on some suitability metric related to e.g. the node's distance, interests, recommendations, transaction history or available resources. Each node may individually choose an appropriate metric and try to connect or be matched with the available peers that it considers best. When there are no preference cycles among the peers, it has been proven that a stable matching exists, where peers have maximized the individual satisfaction gleaned from their choices. However, no such guarantees are currently given for the cases where cycles may exist, and known methods may not be able to resolve "oscillations" in preference-based connectivity and reach stability. In this work we present a simple yet powerful distributed algorithm that uses aggregate satisfaction as an optimization metric. The algorithm is a generalization of a known elegant one-to-one approximation matching algorithm to the many-to-many case. We prove that the total satisfaction achieved by our algorithm is a $\frac{1}{4}\left(1 + \frac{1}{b_{\max}}\right)$-approximation of the maximum total satisfaction in the network, where $b_{\max}$ is the maximum number of possible connections of a peer in the overlay.
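
To get a feel for the approximation ratio quoted above, it can be evaluated at its two extremes. This is a worked evaluation of the stated formula only; the symbol $\rho$ is introduced here for convenience and does not come from the abstract.

```latex
\[
\rho(b_{\max}) = \frac{1}{4}\left(1 + \frac{1}{b_{\max}}\right),
\qquad
\rho(1) = \frac{1}{4}(1 + 1) = \frac{1}{2},
\qquad
\lim_{b_{\max}\to\infty} \rho(b_{\max}) = \frac{1}{4}.
\]
```

So the guarantee is one half in the one-to-one case ($b_{\max} = 1$) and decreases towards one quarter as peers are allowed more connections.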
7.
  • Ha, Phuong, 1976, et al. (author)
  • Wait-free Programming for General Purpose Computations on Graphics Processors
  • 2008
  • In: the Proceedings of the 22th International Parallel and Distributed Symposium (IPDPS 2008). - 1530-2075. - 9781424416936 ; pp. 1-12
  • Conference paper (peer-reviewed)
    • The fact that graphics processors (GPUs) are today’s most powerful computational hardware for the dollar has motivated researchers to utilize the ubiquitous and powerful GPUs for general-purpose computing. Recent GPUs feature the single-program multiple-data (SPMD) multicore architecture instead of the single-instruction multiple-data (SIMD). However, unlike CPUs, GPUs devote their transistors mainly to data processing rather than data caching and flow control, and consequently most of the powerful GPUs with many cores do not support any synchronization mechanisms between their cores. This prevents GPUs from being deployed more widely for general-purpose computing. This paper aims at bridging the gap between the lack of synchronization mechanisms in recent GPU architectures and the need for synchronization mechanisms in parallel applications. Based on the intrinsic features of recent GPU architectures, we construct strong synchronization objects like wait-free and t-resilient read-modify-write objects for a general model of recent GPU architectures without strong hardware synchronization primitives like test-and-set and compare-and-swap. Accesses to the wait-free objects have time complexity O(N), where N is the number of processes. Our result demonstrates that it is possible to construct wait-free synchronization mechanisms for GPUs without the need for strong synchronization primitives in hardware and that wait-free programming is possible for GPUs.
8.
  • Klonowska, Kamilla, et al. (author)
  • Using Golomb Rulers for Optimal Recovery Schemes in Fault Tolerant Distributed Computing
  • 2003
  • In: Proceedings International Parallel and Distributed Processing Symposium. - : IEEE. - 0769519261
  • Conference paper (peer-reviewed)
    • Clusters and distributed systems offer fault tolerance and high performance through load sharing. When all computers are up and running, we would like the load to be evenly distributed among the computers. When one or more computers break down, the load on these computers must be redistributed to other computers in the cluster. The redistribution is determined by the recovery scheme. The recovery scheme should keep the load as evenly distributed as possible even when the most unfavorable combinations of computers break down, i.e. we want to optimize the worst-case behavior. In this paper we define recovery schemes which are optimal for a number of important cases. We also show that the problem of finding optimal recovery schemes corresponds to the mathematical problem called Golomb rulers. These provide optimal recovery schemes for up to 373 computers in the cluster.
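
For readers unfamiliar with the term, a Golomb ruler is a set of integer marks in which no two pairs of marks are the same distance apart. The sketch below only checks that defining property; the function name is hypothetical, and the code does not reproduce the paper's mapping from rulers to recovery schemes.

```python
from itertools import combinations


def is_golomb_ruler(marks: list[int]) -> bool:
    """Return True if the marks are distinct and all pairwise differences are distinct."""
    if len(set(marks)) != len(marks):
        return False
    diffs = [abs(a - b) for a, b in combinations(marks, 2)]
    return len(diffs) == len(set(diffs))


print(is_golomb_ruler([0, 1, 4, 6]))  # True: differences 1, 2, 3, 4, 5, 6 are all distinct
print(is_golomb_ruler([0, 1, 2, 4]))  # False: 1 - 0 and 2 - 1 are both 1
```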
9.
  • Nikolakopoulos, Ioannis, 1986, et al. (author)
  • A Consistency Framework for Iteration Operations in Concurrent Data Structures
  • 2015
  • In: 2015 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015, Hyderabad, India, May 25-29, 2015. - : IEEE Computer Society. - 1530-2075. - 9781479986484 ; pp. 239-248
  • Conference paper (peer-reviewed)
    • Concurrent data structures provide the means for multi-threaded applications to share data. Data structures come with a set of predefined operations, specified by the semantics of the data structure. In the literature and in several contemporary commonly used programming environments, the notion of iteration has been introduced for collection data structures, as a bulk operation enhancing the native set of operations. Iterations in several of these contexts have been treated as sequential in nature and may provide weak consistency guarantees when running concurrently with the native operations of the data structures. In this work we study iterations in concurrent data structures in the context of concurrency with the native operations and the guarantees that they provide. Besides linearizability, we propose a set of consistency specifications for such bulk operations, including also concurrency-aware properties by building on Lamport's systematic definitions for registers. Furthermore, by using queues and composite registers as case studies of underlying objects, we provide a set of constructions of iteration operations, satisfying the properties and showing containment relations. Besides the trade-off between consistency and throughput, we point out and study the trade-off between the overhead of the bulk operation and possible support (helping) by the native operations of the data structure.