SwePub

Result list for search "swepub ;srt2:(1990-1994);srt2:(1994);pers:(Johnsson Lennart)"


  • Results 1-10 of 16
1.
  • Edelman, Alan, et al. (author)
  • Index Transformation Algorithms in a Linear Algebra Framework
  • 1994
  • In: IEEE Transactions on Parallel and Distributed Systems. Institute of Electrical and Electronics Engineers (IEEE). ISSN 1045-9219, 1558-2183. 5:12, pp. 1302-1309
  • Journal article (peer-reviewed). Abstract:
    • We present a linear algebraic formulation for a class of index transformations such as Gray code encoding and decoding, matrix transpose, bit reversal, vector reversal, shuffles, and other index or dimension permutations. This formulation unifies, simplifies, and can be used to derive algorithms for hypercube multiprocessors. We show how all the widely known properties of Gray codes, and some not so well-known properties as well, can be derived using this framework. Using this framework, we relate hypercube communications algorithms to Gauss-Jordan elimination on a matrix of 0's and 1's.
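    A minimal sketch of the idea in this abstract, not the authors' code: binary-reflected Gray encoding written as a matrix-vector product over GF(2). The helper names (`to_bits`, `from_bits`, `matvec_gf2`, `gray_matrix`, `gray_inverse_matrix`) and the most-significant-bit-first ordering are assumptions made for illustration.

```python
def to_bits(x, n):
    """Return the n-bit representation of x as a list, most significant bit first."""
    return [(x >> (n - 1 - i)) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits: rebuild the integer from an MSB-first bit list."""
    x = 0
    for bit in bits:
        x = (x << 1) | bit
    return x


def matvec_gf2(matrix, vector):
    """Matrix-vector product with arithmetic modulo 2 (that is, over GF(2))."""
    return [sum(m * v for m, v in zip(row, vector)) % 2 for row in matrix]


def gray_matrix(n):
    """Lower bidiagonal 0/1 matrix G: output bit i is b_i XOR b_(i-1)."""
    return [[1 if j in (i, i - 1) else 0 for j in range(n)] for i in range(n)]


def gray_inverse_matrix(n):
    """Inverse of G over GF(2): lower triangular, all ones (a prefix XOR)."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    n = 4
    for x in range(1 << n):
        g = from_bits(matvec_gf2(gray_matrix(n), to_bits(x, n)))
        back = from_bits(matvec_gf2(gray_inverse_matrix(n), to_bits(g, n)))
        print(f"{x:2d} -> gray {g:0{n}b} -> {back:2d}")
```

    In this form, decoding is multiplication by the inverse matrix, which is one way to see why Gray-code properties can be read off from 0/1 matrix algebra.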
2.
  • George, William, et al. (author)
  • POLYSHIFT Communications Software for the Connection Machine System CM–200
  • 1994
  • In: Scientific Programming. ISSN 1058-9244, 1875-919X. 3:1, pp. 83-99
  • Journal article (peer-reviewed). Abstract:
    • We describe the use and implementation of a polyshift function PSHIFT for circular shifts and end-off shifts. Polyshift is useful in many scientific codes using regular grids, such as finite difference codes in several dimensions, multigrid codes, molecular dynamics computations, and lattice gauge physics computations such as quantum chromodynamics (QCD) calculations. Our implementation of the PSHIFT function on the Connection Machine systems CM-2 and CM-200 offers a speedup of up to a factor of 3-4 compared with CSHIFT when the local data motion within a node is small. The PSHIFT routine is included in the Connection Machine Scientific Software Library (CMSSL).
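    To illustrate the two shift types named in this abstract, here is a small sketch on a one-dimensional Python list. It is not the PSHIFT or CMSSL interface, which the abstract does not show; the function names `cshift`, `eoshift` and `laplacian_periodic` are hypothetical, and the sign convention (a positive shift moves elements toward lower indices) is an assumption borrowed from the Fortran 90 intrinsics of the same names.

```python
def cshift(a, shift):
    """Circular shift: elements that fall off one end re-enter at the other."""
    n = len(a)
    if n == 0:
        return []
    return [a[(i + shift) % n] for i in range(n)]


def eoshift(a, shift, boundary=0):
    """End-off shift: elements that fall off are dropped, gaps get `boundary`."""
    n = len(a)
    return [a[i + shift] if 0 <= i + shift < n else boundary for i in range(n)]


def laplacian_periodic(u):
    """Second difference on a periodic 1-D grid, built from two circular shifts."""
    right = cshift(u, 1)   # neighbor at index i + 1
    left = cshift(u, -1)   # neighbor at index i - 1
    return [r - 2 * c + l for r, c, l in zip(right, u, left)]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(cshift(data, 1))
    print(eoshift(data, 1))
    print(laplacian_periodic([0, 1, 4, 9]))
```

    A finite difference stencil needs several such shifts of the same array at once, which is the situation a combined polyshift operation targets.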
3.
  • Ho, Ching-Tien, et al. (author)
  • An Efficient Algorithm for Gray–to–Binary Permutation on Hypercubes
  • 1994
  • In: Journal of Parallel and Distributed Computing. ISSN 0743-7315, 1096-0848. 20:1, pp. 114-120
  • Journal article (peer-reviewed). Abstract:
    • Both Gray code and binary code are frequently used in mapping arrays into hypercube architectures. While the former is preferred when communication between adjacent array elements is needed, the latter is preferred for FFT-type communication. When different phases of computations have different types of communication patterns, the need arises to remap the data. We give a nearly optimal algorithm for permuting data from a Gray code mapping to a binary code mapping on a hypercube with communication restricted to one input and one output channel per node at a time. Our algorithm improves over the best previously known algorithm [6] by nearly a factor of two and is optimal to within a factor of n/(n - 1) with respect to data transfer time on an n-cube. The expected speedup is confirmed by measurements on an Intel iPSC/2 hypercube.
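    The remapping problem in this abstract can be stated as a permutation of node addresses. The sketch below only computes that permutation and the number of cube dimensions each block must cross; it does not reproduce the paper's routing schedule. The function names (`gray_encode`, `gray_decode`, `remap_destinations`, `hops`) are assumptions chosen for illustration.

```python
def gray_encode(i):
    """Binary-reflected Gray code of i."""
    return i ^ (i >> 1)


def gray_decode(g):
    """Inverse of gray_encode, computed as a running XOR of right shifts."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b


def remap_destinations(n):
    """For each node p of an n-cube, return the node that must receive p's data.

    Under the Gray code mapping, array element i lives on node gray_encode(i);
    under the binary mapping it lives on node i. So the data held by node p
    belongs to element gray_decode(p) and must move to node gray_decode(p).
    """
    return [gray_decode(p) for p in range(1 << n)]


def hops(n):
    """Hamming distance between each source node and its destination."""
    return [bin(p ^ d).count("1") for p, d in enumerate(remap_destinations(n))]


if __name__ == "__main__":
    print(remap_destinations(3))
    print(hops(3))
```

    Because the most significant bit is the same in both encodings, no block has to cross more than n - 1 dimensions.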
4.
  • Ho, Ching-Tien, et al. (author)
  • Embedding Hyper–pyramids in Hypercubes
  • 1994
  • In: IBM Journal of Research and Development. IBM. ISSN 0018-8646, 2151-8556. 38:1, pp. 31-45
  • Journal article (peer-reviewed)
5.
  • Johan, Zdenek, et al. (author)
  • An Efficient Communication Strategy for Finite Element Methods on the Connection Machine CM-5 System
  • 1994
  • In: Computer Methods in Applied Mechanics and Engineering. ISSN 0045-7825, 1879-2138. 113:3-4, pp. 363-387
  • Journal article (peer-reviewed). Abstract:
    • The objective of this paper is to propose communication procedures suitable for unstructured finite element solvers implemented on distributed-memory parallel computers such as the Connection Machine CM-5 system. First, a data-parallel implementation of the recursive spectral bisection (RSB) algorithm proposed by Pothen et al. is presented. The RSB algorithm is associated with a node renumbering scheme which improves data locality of reference. Two-step gather and scatter operations taking advantage of this data locality are then designed. These communication primitives make use of the indirect addressing capability of the CM-5 vector units to achieve high gather and scatter bandwidths. The performance of the proposed communication strategy is illustrated on large-scale three-dimensional fluid dynamics problems.
6.
  • Johan, Zdenek, et al. (author)
  • Scalability of Finite Element Applications on Distributed–Memory Parallel Computers
  • 1994
  • In: Computer Methods in Applied Mechanics and Engineering. ISSN 0045-7825, 1879-2138. 119:1-2, pp. 61-72
  • Journal article (peer-reviewed). Abstract:
    • This paper demonstrates that scalability and competitive efficiency can be achieved for unstructured grid finite element applications on distributed memory machines, such as the Connection Machine CM-5 system. The efficiency of finite element solvers is analyzed through two applications: an implicit computational aerodynamics application and an explicit solid mechanics application. Scalability of mesh decomposition and of data mapping strategies is also discussed. Numerical examples that support the claims are presented for problems with more than fourteen million variables.
7.
  • Johnsson, Lennart, et al. (author)
  • Boolean Cube Emulation of Butterfly Networks Encoded by Gray Code
  • 1994
  • In: Journal of Parallel and Distributed Computing. ISSN 0743-7315, 1096-0848. 20:3, pp. 261-179
  • Journal article (peer-reviewed). Abstract:
    • The authors present algorithms for butterfly emulation on binary-reflected Gray coded data that require the same number of element transfers in sequence in a Boolean cube network as for a binary encoding. The required code conversion is performed either in local memories or through concurrent exchanges that do not affect the number of element transfers in sequence. The emulation of a butterfly network with one or two elements per processor requires n communication cycles on an n-cube. For more than two elements per processor, one additional communication cycle is required for every pair of elements. The encoding on completion can be binary, binary-reflected Gray code, or any combination thereof, without affecting the communication complexity.
8.
  • Johnsson, Lennart (author)
  • CMSSL: A Scalable Scientific Software Library
  • 1994
  • Conference paper (peer-reviewed). Abstract:
    • Massively parallel processors introduce new demands on software systems with respect to performance, scalability, robustness and portability. The increased complexity of the memory systems and the increased range of problem sizes for which a given piece of software is used pose serious challenges for software developers. The Connection Machine Scientific Software Library, CMSSL, uses several novel techniques to meet these challenges. The CMSSL contains routines for managing the data distribution and provides data distribution independent functionality. High performance is achieved through careful scheduling of operations and data motion, and through the automatic selection of algorithms at run-time. We discuss some of the techniques used, and provide evidence that CMSSL has reached the goals of performance and scalability for an important set of applications.
9.
  • Johnsson, Lennart (author)
  • Data motion and high performance computing
  • 1994
  • In: Data Motion and High Performance Computing, pp. 1-18
  • Conference paper (peer-reviewed). Abstract:
    • Efficient data motion has been of critical importance in high performance computing almost since the first electronic computers were built. Providing sufficient memory bandwidth to balance the capacity of processors led to memory hierarchies and to banked and interleaved memories. With the rapid evolution of MOS technologies and of microprocessor and memory designs, it is realistic to build systems with thousands of processors and a sustained performance of a trillion operations per second or more. Such systems require tens of thousands of memory banks, even when locality of reference is exploited. Using conventional technologies, interconnecting several thousand processors with tens of thousands of memory banks is feasible only through some form of sparse interconnection network. Efficient use of locality of reference and network bandwidth is critical.
10.
  • Johnsson, Lennart, et al. (author)
  • High Performance, Scalable Scientific Software Libraries
  • 1994
  • In: Portability and Performance in Parallel Processing. John Wiley & Sons, pp. 159-208
  • Book chapter (peer-reviewed). Abstract:
    • Massively parallel processors introduce new demands on software systems with respect to performance, scalability, robustness and portability. The increased complexity of the memory systems and the increased range of problem sizes for which a given piece of software is used pose serious challenges to software developers. The Connection Machine Scientific Software Library, CMSSL, uses several novel techniques to meet these challenges. The CMSSL contains routines for managing the data distribution and provides data distribution independent functionality. High performance is achieved through careful scheduling of operations and data motion, and through the automatic selection of algorithms at run-time. We discuss some of the techniques used, and provide evidence that CMSSL has reached the goals of performance and scalability for an important set of applications.