SwePub

Result list for search "LAR1:bth srt2:(2000-2004);pers:(Lennerstad Håkan)"

  • Results 1-10 of 11
1.
  • Klonowska, Kamilla, et al. (authors)
  • Comparing the optimal performance of parallel architectures
  • 2004
  • In: Computer journal. - Oxford : Oxford University Press. - 0010-4620 .- 1460-2067. ; 47:5, pp. 527-544
  • Journal article (peer-reviewed). Abstract:
    • Consider a parallel program with n processes and a synchronization granularity z. Consider also two parallel architectures: an SMP with q processors and run-time reallocation of processes to processors, and a distributed system (or cluster) with k processors and no run-time reallocation. There is an inter-processor communication delay of t time units for the system with no run-time reallocation. In this paper we define a function H(n,k,q,t,z) such that the minimum completion time for all programs with n processes and a granularity z is at most H(n,k,q,t,z) times longer using the system with no reallocation and k processors compared to using the system with q processors and run-time reallocation. We assume optimal allocation and scheduling of processes to processors. The function H(n,k,q,t,z) is optimal in the sense that there is at least one program, with n processes and a granularity z, such that the ratio is exactly H(n,k,q,t,z). We also validate our results using measurements on distributed and multiprocessor Sun/Solaris environments. The function H(n,k,q,t,z) provides important insights regarding the performance implications of the fundamental design decision of whether to allow run-time reallocation of processes or not. These insights can be used when making cost/benefit trade-offs in the design of parallel execution platforms.
2.
  •  
3.
  • Klonowska, Kamilla, et al. (authors)
  • Using Modulo Rulers for Optimal Recovery Schemes in Distributed Computing
  • 2004
  • Conference paper (peer-reviewed). Abstract:
    • Clusters and distributed systems offer fault tolerance and high performance through load sharing. When all computers are up and running, we would like the load to be evenly distributed among the computers. When one or more computers break down, the load on these computers must be redistributed to other computers in the cluster. The redistribution is determined by the recovery scheme. The recovery scheme should keep the load as evenly distributed as possible even when the most unfavorable combinations of computers break down, i.e., we want to optimize the worst-case behavior. We define recovery schemes that are optimal for a larger number of computers down than in previous results. We also show that the problem of finding optimal recovery schemes for a cluster with n computers corresponds to the mathematical problem of finding the longest sequence of positive integers for which the sum of the sequence and the sums of all subsequences modulo n are unique.
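The uniqueness condition stated at the end of this abstract can be checked by brute force. The following is a minimal sketch, assuming that "subsequences" means contiguous runs of the sequence (the abstract does not say so explicitly). The function name `is_modulo_sequence` is hypothetical and not taken from the paper.

```python
def is_modulo_sequence(seq, n):
    """Return True if the sums of all contiguous runs of seq
    (including the whole sequence), taken modulo n, are pairwise distinct.

    Assumption: 'subsequence' is read as 'contiguous run'.
    """
    if any(x <= 0 for x in seq):
        raise ValueError("sequence must consist of positive integers")
    seen = set()
    for start in range(len(seq)):
        total = 0
        for x in seq[start:]:
            total = (total + x) % n  # running sum of the run seq[start:...]
            if total in seen:
                return False         # two runs share the same sum modulo n
            seen.add(total)
    return True
```

For example, `is_modulo_sequence([1, 3, 2], 7)` accepts the sequence, while the same sequence fails for n = 5 because the run sums 1 and 1 + 3 + 2 = 6 coincide modulo 5.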
4.
  • Lennerstad, Håkan, et al. (authors)
  • Optimal combinatorial functions comparing multiprocess allocation performance in multiprocessor systems
  • 2000
  • In: SIAM journal on computing (Print). - PHILADELPHIA : SIAM PUBLICATIONS. - 0097-5397 .- 1095-7111. ; , pp. 1816-1838
  • Journal article (peer-reviewed). Abstract:
    • For the execution of an arbitrary parallel program P, consisting of a set of processes with any executable interprocess dependency structure, we consider two alternative multiprocessors. The first multiprocessor has q processors and allocates parallel programs dynamically; i.e., processes may be reallocated from one processor to another. The second employs cluster allocation with k clusters and u processors in each cluster: here processes may be reallocated within a cluster only. Let $T_d(P, q)$ and $T_c(P, k, u)$ be execution times for the parallel program P with optimal allocations. We derive a formula for the program-independent performance function $G(k, u, q) = \sup_P T_c(P, k, u) / T_d(P, q)$. Hence, with optimal allocations, the execution of P can never take more than a factor G(k, u, q) longer time with the second multiprocessor than with the first, and there exist programs showing that the bound is sharp. The supremum is taken over all parallel programs consisting of any number of processes. Only the overhead for synchronization and reallocation is neglected. We further present a tight bound which exploits a priori knowledge of the class of parallel programs intended for the multiprocessors, thus resulting in a sharper bound. The function g(n, k, u, q) is the above maximum taken over all parallel programs consisting of n processes. The functions G and g can be used in various ways to obtain tight performance bounds, aiding in multiprocessor architecture decisions.
5.
  •  
6.
  • Lennerstad, Håkan, et al. (authors)
  • Optimal worst case formulas comparing cache memory associativity
  • 2000
  • In: SIAM journal on computing (Print). - PHILADELPHIA : SIAM PUBLICATIONS. - 0097-5397 .- 1095-7111. ; , pp. 872-905
  • Journal article (peer-reviewed). Abstract:
    • In this paper we derive a worst case formula comparing the number of cache hits for two different cache memories. From this, various other bounds for cache memory performance may be derived. Consider an arbitrary program P which is to be executed on a computer with two alternative cache memories. The first cache is set-associative or direct-mapped. It has k sets and u blocks in each set; this is called a (k, u)-cache. The other is a fully associative cache with q blocks, i.e., a (1, q)-cache. We derive an explicit formula for the ratio of the number of cache hits h(P, k, u) for a (k, u)-cache compared to a (1, q)-cache for a worst case program P. We assume that the mappings of the program variables to the cache blocks are optimal. The formula quantifies the ratio $\inf_P h(P, k, u) / h(P, 1, q)$, where the infimum is taken over all programs P with n variables. The formula is a function of the parameters n, k, u, and q only. Note that computing the quantity h(P, k, u) is NP-hard. We assume the commonly used LRU (least recently used) replacement policy, that each variable can be stored in one memory block, and that each variable is free to be mapped to any set. Since the bound is decreasing in the parameter n, it is an optimal bound for all programs with at most n variables. The formula for cache hits allows us to derive optimal bounds comparing the access times for cache memories. The formula also gives bounds (these are not optimal, however) for any other replacement policy, for direct-mapped versus set-associative caches, and for programs with variables larger than the cache memory blocks.
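To make the quantity h(P, k, u) concrete, the following minimal sketch counts LRU hits for a (k, u)-cache on a given access sequence. It assumes a fixed, caller-supplied mapping of variables to sets rather than the optimal mapping the abstract assumes (finding that mapping is the NP-hard part). The names `lru_hits` and `set_of` are hypothetical and not taken from the paper.

```python
from collections import OrderedDict

def lru_hits(accesses, set_of, k, u):
    """Count cache hits for a (k, u)-cache: k sets, u blocks per set,
    LRU replacement within each set.

    accesses: sequence of variable identifiers (one block per variable)
    set_of:   function mapping a variable to a set index (assumed given,
              not optimized)
    """
    sets = [OrderedDict() for _ in range(k)]
    hits = 0
    for var in accesses:
        s = sets[set_of(var) % k]
        if var in s:
            hits += 1
            s.move_to_end(var)         # mark as most recently used
        else:
            if len(s) >= u:
                s.popitem(last=False)  # evict the least recently used block
            s[var] = True
    return hits
```

A fully associative (1, q)-cache is the special case k = 1, so `lru_hits(trace, lambda v: 0, 1, q)` gives the denominator of the ratio for one particular program trace.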
7.
  • Lennerstad, Håkan, et al. (authors)
  • Serier och transformer
  • 2002
  • Book (other academic/artistic)
8.
  • Lundberg, Lars, et al. (authors)
  • Comparing the Optimal Performance of Multiprocessor Architectures
  • 2003
  • Conference paper (peer-reviewed). Abstract:
    • Consider a parallel program with n processes and a synchronization granularity z. Consider also two multiprocessors: a multiprocessor with q processors and run-time reallocation of processes to processors, and a multiprocessor with k processors and no run-time reallocation. There is an inter-processor communication delay of t time units for the system with no run-time reallocation. In this paper we define a function g(n,k,q,t,z) such that the minimum completion time for all programs with n processes and a granularity z is at most g(n,k,q,t,z) times longer using the system with no reallocation and k processors compared to using the system with q processors and run-time reallocation. We assume optimal allocation and scheduling of processes to processors. The function g(n,k,q,t,z) is optimal in the sense that there is at least one program, with n processes and a granularity z, such that the ratio is exactly g(n,k,q,t,z). We also validate our results using measurements on distributed and multiprocessor Sun/Solaris environments.
9.
  • Lundberg, Lars, et al. (authors)
  • Global multiprocessor scheduling of aperiodic tasks using time-independent priorities
  • 2003
  • Conference paper (peer-reviewed). Abstract:
    • We provide a constant time schedulability test for a multiprocessor server handling aperiodic tasks. Dhall's effect is avoided by dividing the tasks into two priority classes based on task utilization: heavy and light. We prove that if the load on the multiprocessor server stays below $U_{threshold} = 3 - \sqrt{7} \approx 35.425\%$, the server can accept incoming aperiodic tasks and guarantee that the deadlines of all accepted tasks will be met. 35.425% utilization is also the threshold for a task to be characterized as heavy. The bound $U_{threshold} = 3 - \sqrt{7} \approx 35.425\%$ is easy to use, but not sharp if we know the number of processors in the multiprocessor. For a server with m processors, we calculate a formula for the sharp bound $U_{threshold}(m)$, which converges to $U_{threshold}$ from above as $m \to \infty$. The results are based on a utilization function $u_m(x) = 2(1-x)/(2+\sqrt{2+2x}) + x/m$. By using this function, the performance of the multiprocessor can in some cases be improved beyond $U_{threshold}(m)$ by paying the extra overhead of monitoring the individual utilization of the current tasks.
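The numbers in this abstract can be checked numerically. The sketch below assumes the utilization function reads u_m(x) = 2(1-x)/(2+sqrt(2+2x)) + x/m, which is an interpretation of the flattened formula in the listing. Under that reading, the limit m → ∞ has x = 3 - sqrt(7) as a fixed point, which matches the quoted 35.425% threshold. The names `utilization` and `U_THRESHOLD` are hypothetical.

```python
import math

# Threshold quoted in the abstract: 3 - sqrt(7), roughly 35.425 %.
U_THRESHOLD = 3 - math.sqrt(7)

def utilization(x, m=None):
    """Assumed reading of u_m(x) = 2(1-x) / (2 + sqrt(2 + 2x)) + x/m.

    m=None stands for the limit m -> infinity, where the x/m term vanishes.
    """
    base = 2 * (1 - x) / (2 + math.sqrt(2 + 2 * x))
    return base if m is None else base + x / m

if __name__ == "__main__":
    print(f"U_threshold           = {U_THRESHOLD:.5f}")
    print(f"u_inf(U_threshold)    = {utilization(U_THRESHOLD):.5f}")
    print(f"u_4(U_threshold)      = {utilization(U_THRESHOLD, m=4):.5f}")
```

For any finite m the extra term x/m is positive, which is consistent with the abstract's statement that the sharp bound approaches the limit from above.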
10.
  • Lundberg, Lars, et al. (authors)
  • Using Golomb Rulers for Minimizing Collisions in Closed Hashing
  • 2004
  • Conference paper (peer-reviewed). Abstract:
    • We give conditions for hash table probing which minimize the expected number of collisions. A probing algorithm is determined by a sequence of numbers denoting jumps for an item during multiple collisions. In linear probing, this sequence consists of only ones: for each collision we jump to the next location. To minimize the collisions, it turns out that one should use the Golomb ruler conditions: consecutive partial sums of the jump sequence should be distinct. The commonly used quadratic probing scheme fulfills the Golomb conditions for some cases. We define a new probing scheme, Golomb probing, that fulfills the Golomb conditions for a much larger set of cases. Simulations show that Golomb probing is always better than quadratic and linear probing, and in some cases the collisions can be reduced by 25% compared to quadratic and by more than 50% compared to linear.
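The Golomb ruler condition on a jump sequence can be tested directly. This is a minimal sketch, assuming the condition means that the probe offsets (partial sums of the jumps, starting at 0) form a Golomb ruler, i.e. all pairwise differences are distinct. It ignores wrap-around modulo the table size, and the function names are hypothetical rather than taken from the paper.

```python
from itertools import accumulate

def probe_offsets(jumps):
    """Offsets from the home slot after 0, 1, 2, ... collisions."""
    return [0, *accumulate(jumps)]

def satisfies_golomb(jumps):
    """True if all pairwise differences of the probe offsets are distinct,
    i.e. the offsets form a Golomb ruler (table wrap-around not modeled)."""
    marks = probe_offsets(jumps)
    diffs = [b - a for i, a in enumerate(marks) for b in marks[i + 1:]]
    return len(diffs) == len(set(diffs))
```

Under this reading, linear probing (`[1, 1, 1]`) fails immediately, the quadratic offsets 0, 1, 4, 9, 16 (jumps `[1, 3, 5, 7]`) pass, and adding the next quadratic jump breaks the condition, which is in line with the abstract's remark that quadratic probing fulfills it only in some cases.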