SwePub

Result list for search "L773:0743 7315 OR L773:1096 0848 srt2:(1990-1994)"

Search: L773:0743 7315 OR L773:1096 0848 > (1990-1994)

  • Results 1-6 of 6
1.
  • Johnsson, Lennart, et al. (author)
  • Generalized Shuffle Permutations on Boolean Cubes
  • 1992
  • In: Journal of Parallel and Distributed Computing. - 0743-7315 .- 1096-0848. ; 16:1, pp. 1-14
  • Journal article (peer-reviewed)
    • In a generalized shuffle permutation an address $(a_{q-1} a_{q-2} \ldots a_0)$ receives its content from an address obtained through a cyclic shift on a subset of the q dimensions used for the encoding of the addresses. Bit-complementation may be combined with the shift. We give an algorithm that requires K/2 + 2 exchanges for K elements per processor, when storage dimensions are part of the permutation, and concurrent communication on all ports of every processor is possible. The number of element exchanges in sequence is independent of the number of processor dimensions $\omega_r$ in the permutation.
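The permutation described in this abstract can be illustrated with a minimal Python sketch. It only computes which source address feeds a given destination address; it is not the exchange algorithm from the paper. The function name, the parameter names, and the direction of the cyclic shift are assumptions made for illustration.

```python
def generalized_shuffle_source(addr, dims, shift=1, complement_mask=0):
    """Return the source address whose content moves to `addr`.

    dims            -- bit positions taking part in the cyclic shift
                       (a subset of the q address dimensions)
    shift           -- rotation distance within that subset (assumed direction)
    complement_mask -- bits to complement after the shift
    """
    k = len(dims)
    # Read the participating bits of the destination address.
    bits = [(addr >> d) & 1 for d in dims]
    src = addr
    for i, d in enumerate(dims):
        b = bits[(i + shift) % k]          # bit taken from the rotated position
        src = (src & ~(1 << d)) | (b << d)  # overwrite bit d of the source
    # Optional bit-complementation combined with the shift.
    return src ^ complement_mask
```

For example, with three address bits and all dimensions in the shift, `generalized_shuffle_source(0b001, [0, 1, 2])` returns `0b100`. Bits outside `dims` are left unchanged, so the mapping stays a permutation of the address space.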
2.
  • Johnsson, Lennart (author)
  • Performance Modeling of Distributed Memory Architectures
  • 1991
  • In: Journal of Parallel and Distributed Computing. - 0743-7315 .- 1096-0848. ; 12:4, pp. 300-312
  • Journal article (peer-reviewed)
    • We provide performance models for several primitive operations on data structures distributed over memory units interconnected by a Boolean cube network. In particular, we model single-source and multiple-source concurrent broadcasting or reduction, concurrent gather and scatter operations, shifts along several axes of multidimensional arrays, and emulation of butterfly networks. We also show how the processor configuration, the data aggregation, and the encoding of the address space affect the performance for two important basic computations: the multiplication of arbitrarily shaped matrices and the Fast Fourier Transform. We also give an example of the performance behavior for local matrix operations for a processor with a single path to local memory and a set of processor registers. The analytic models are verified by measurements on the Connection Machine Model CM-2.
3.
  • Nordström, Tomas, 1963-, et al. (author)
  • Using and designing massively parallel computers for artificial neural networks
  • 1992
  • In: Journal of Parallel and Distributed Computing. - Orlando : Academic Press. - 0743-7315 .- 1096-0848. ; 14:3, pp. 260-285
  • Journal article (peer-reviewed)
    • During the past 10 years the fields of artificial neural networks (ANNs) and massively parallel computing have been evolving rapidly. The authors study the attempts to make ANN algorithms run on massively parallel computers as well as designs of new parallel systems tuned for ANN computing. Following a brief survey of the most commonly used models, the different dimensions of parallelism in ANN computing are identified, and the possibilities for mapping onto the structures of different parallel architectures are analyzed. Different classes of parallel architectures used or designed for ANN are identified. Reported implementations are reviewed and discussed. It is concluded that the regularity of ANN computations suits SIMD architectures perfectly and that broadcast or ring communication can be very efficiently utilized. Bit-serial processing is very interesting for ANN, but hardware support for multiplication should be included. Future artificial neural systems for real-time applications will require flexible processing modules that can be put together to form MIMSIMD systems.
4.
  • Ho, Ching-Tien, et al. (author)
  • An Efficient Algorithm for Gray-to-Binary Permutation on Hypercubes
  • 1994
  • In: Journal of Parallel and Distributed Computing. - 0743-7315 .- 1096-0848. ; 20:1, pp. 114-120
  • Journal article (peer-reviewed)
    • Both Gray code and binary code are frequently used in mapping arrays into hypercube architectures. While the former is preferred when communication between adjacent array elements is needed, the latter is preferred for FFT-type communication. When different phases of computations have different types of communication patterns, the need arises to remap the data. We give a nearly optimal algorithm for permuting data from a Gray code mapping to a binary code mapping on a hypercube with communication restricted to one input and one output channel per node at a time. Our algorithm improves over the best previously known algorithm [6] by nearly a factor of two and is optimal to within a factor of n/(n - 1) with respect to data transfer time on an n-cube. The expected speedup is confirmed by measurements on an Intel iPSC/2 hypercube.
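The remapping this abstract refers to can be made concrete with a short Python sketch. It uses the standard binary-reflected Gray code and only computes which node each data item must travel to. It does not reproduce the paper's routing schedule, and the function names are illustrative assumptions.

```python
def binary_to_gray(x):
    """Binary-reflected Gray code of x."""
    return x ^ (x >> 1)


def gray_to_binary(g):
    """Inverse of binary_to_gray: XOR together all right shifts of g."""
    x = 0
    while g:
        x ^= g
        g >>= 1
    return x


def gray_to_binary_permutation(n):
    """Destination node for the data held by each node of an n-cube.

    Under the Gray code mapping, array element j lives on node
    binary_to_gray(j); under the binary mapping it lives on node j.
    Node p therefore holds element gray_to_binary(p) and must send
    it to the node with that same number.
    """
    return [gray_to_binary(p) for p in range(1 << n)]
```

For a 3-cube, `gray_to_binary_permutation(3)` returns `[0, 1, 3, 2, 7, 6, 4, 5]`: nodes 0 and 1 keep their data, while every other node must send its data elsewhere.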
5.
  • Johnsson, Lennart, et al. (author)
  • Boolean Cube Emulation of Butterfly Networks Encoded by Gray Code
  • 1994
  • In: Journal of Parallel and Distributed Computing. - 0743-7315 .- 1096-0848. ; 20:3, pp. 261-279
  • Journal article (peer-reviewed)
    • The authors present algorithms for butterfly emulation on binary-reflected Gray coded data that require the same number of element transfers in sequence in a Boolean cube network as for a binary encoding. The required code conversion is either performed in local memories, or through concurrent exchanges not affecting the number of element transfers in sequence. The emulation of a butterfly network with one or two elements per processor requires n communication cycles on an n-cube. For more than two elements per processor, one additional communication cycle is required for every pair of elements. The encoding on completion can be either binary, or binary-reflected Gray code, or any combination thereof, without affecting the communication complexity.
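A small Python sketch can show why Gray coded data complicates butterfly emulation. In stage i of a butterfly, element x is paired with element x XOR 2^i. With a binary encoding these two elements sit on neighboring cube nodes. With a binary-reflected Gray encoding they are two hops apart for every stage except i = 0. The helper names are assumptions, and the sketch measures distances only; it is not the emulation algorithm from the paper.

```python
def gray(x):
    """Binary-reflected Gray code of x."""
    return x ^ (x >> 1)


def hamming(a, b):
    """Number of differing bits, i.e. the distance between two cube nodes."""
    return bin(a ^ b).count("1")


def butterfly_partner_distances(n):
    """For each butterfly stage i, the set of cube distances between the
    nodes holding elements x and x ^ 2**i under Gray encoding."""
    return {
        i: {hamming(gray(x), gray(x ^ (1 << i))) for x in range(1 << n)}
        for i in range(n)
    }
```

For a 3-cube, `butterfly_partner_distances(3)` returns `{0: {1}, 1: {2}, 2: {2}}`.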
6.
  • Ho, Ching-Tien, et al. (author)
  • Embedding Meshes in Boolean Cubes by Graph Decomposition
  • 1990
  • In: Parallel Computing. - : Elsevier BV. - 0167-8191 .- 1872-7336. ; 8:4, pp. 325-339
  • Journal article (peer-reviewed)
    • This paper explores the embeddings of multidimensional meshes into minimal Boolean cubes by graph decomposition. The dilation and the congestion of the product graph (G1 × G2) → (H1 × H2) is the maximum of the dilation and congestion for the two embeddings G1 → H1 and G2 → H2. The graph decomposition technique can be used to improve the average dilation and average congestion. The graph decomposition technique combined with some particular two-dimensional embeddings allows for minimal-expansion, dilation-two, congestion-two embeddings of about 87% of all two-dimensional meshes, with a significantly lower average dilation and congestion than by modified line compression. For three-dimensional meshes we show that the graph decomposition technique, together with two three-dimensional mesh embeddings presented in this paper and modified line compression, yields dilation-two embeddings of more than 96% of all three-dimensional meshes contained in a 512 × 512 × 512 mesh. The graph decomposition technique is also used to generalize the embeddings to meshes with wraparound. The dilation increases by at most one compared to a mesh without wraparound. The expansion is preserved for the majority of meshes if a wraparound feature is added to the mesh.
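The product rule stated in this abstract can be checked on a simple case with the Python sketch below. It embeds a two-dimensional mesh by concatenating the Gray codes of the row and column indices. This is the standard Gray code product embedding, not the graph decomposition technique of the paper. Each axis has dilation one, so the product also has dilation one, but the cube used is not always minimal. All function names are illustrative assumptions.

```python
def gray(x):
    """Binary-reflected Gray code of x."""
    return x ^ (x >> 1)


def hamming(a, b):
    """Number of differing bits between two cube addresses."""
    return bin(a ^ b).count("1")


def bits_for(length):
    """Cube dimensions needed to hold `length` distinct nodes."""
    return (length - 1).bit_length()


def embed_mesh(rows, cols):
    """Map mesh node (r, c) to a cube address: Gray(r) followed by Gray(c)."""
    col_bits = bits_for(cols)
    return {(r, c): (gray(r) << col_bits) | gray(c)
            for r in range(rows) for c in range(cols)}


def dilation(rows, cols):
    """Largest cube distance between the images of adjacent mesh nodes."""
    emb = embed_mesh(rows, cols)
    worst = 0
    for (r, c), addr in emb.items():
        for nb in ((r + 1, c), (r, c + 1)):
            if nb in emb:
                worst = max(worst, hamming(addr, emb[nb]))
    return worst


def cube_dimensions(rows, cols):
    """(dimensions used by this embedding, dimensions of the minimal cube)."""
    return bits_for(rows) + bits_for(cols), bits_for(rows * cols)
```

For a 3 × 5 mesh, `dilation(3, 5)` is 1, but `cube_dimensions(3, 5)` returns `(5, 4)`: the Gray code product needs a 5-cube although the 15 nodes fit in a 4-cube. Closing that gap at a small cost in dilation is the kind of trade-off the abstract describes.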