SwePub

Result list for search "WFRF:(Dackland Krister)"

  • Results 1-7 of 7
1.
  • Dackland, Krister, et al. (author)
  • A ring-oriented approach for block matrix factorizations on shared and distributed memory architectures
  • 1993
  • In: Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing. - Norfolk : SIAM Publications. - 0898713153 ; pp. 330-338
  • Conference paper (peer-reviewed). Abstract:
    • A block (column) wrap-mapping approach to the design of parallel block matrix factorization algorithms that are (trans)portable over and between shared memory multiprocessors (SMM) and distributed memory multicomputers (DMM) is presented. By reorganizing the matrix on the SMM architecture, the same ring-oriented algorithms can be used on both SMM and DMM systems, with all machine dependencies confined to a small set of communication routines. The algorithms are described at a high level, with a focus on portability and scalability aspects. Implementation aspects of the LU, Cholesky, and QR factorizations and machine-specific communication routines for some SMM and DMM systems are discussed. Timing results show that our portable algorithms perform similarly to machine-specific implementations. 1 Introduction. With the introduction of advanced parallel computer architectures, a demand for efficient and portable algorithms has emerged. Several attempts to design algorithms and implementat.
  •  
2.
  • Dackland, Krister, et al. (author)
  • Blocked Algorithms and Software for Reduction of a Regular Matrix Pair to Generalized Schur Form
  • 1999
  • In: ACM Transactions on Mathematical Software ; 25:4, pp. 425-454
  • Journal article (peer-reviewed). Abstract:
    • A two-stage blocked algorithm for reduction of a regular matrix pair (A, B) to upper Hessenberg-triangular form is presented. In stage 1, (A, B) is reduced to block upper Hessenberg-triangular form using mainly level 3 (matrix-matrix) operations that permit data reuse in the higher levels of a memory hierarchy. In the second stage, all but one of the r subdiagonals of the block Hessenberg A-part are set to zero using Givens rotations. The algorithm proceeds in a sequence of supersweeps, each reducing m columns. The updates with respect to row and column rotations are organized to reference consecutive columns of A and B. To further improve data locality, all rotations produced in a supersweep are stored to enable a left-looking reference pattern, i.e., all updates are delayed until they are required for the continuation of the supersweep. Moreover, we present a blocked variant of the single-diagonal double-shift QZ method for computing the generalized Schur form of (A, B) in upper Hessenberg-triangular form. The blocking for improved data locality is done similarly, now by restructuring the reference pattern of the updates associated with the bulge chasing in the QZ iteration. Timing results show that our new blocked variants outperform the current LAPACK routines, including drivers for the generalized eigenvalue problem, by a factor of 2-5 for sufficiently large problems.
  •  
3.
  •  
4.
  • Dackland, Krister, et al. (author)
  • Design and performance modeling of parallel block matrix factorizations for distributed memory multicomputers
  • 1992
  • In: Proceedings of the Industrial Mathematics Week ; pp. 102-116
  • Conference paper (peer-reviewed). Abstract:
    • Efficient and scalable parallel block algorithms for the LU factorization with partial pivoting, the Cholesky, and QR factorizations in a distributed memory multicomputer environment are presented. The distributed system is viewed as a ring of processors, and the algorithms correspond to shared memory algorithms parallelized at the block level (explicit parallelism). The performance of the algorithms is analyzed theoretically and illustrated empirically by implementations on the Intel iPSC/2 hypercube. A model predicting performance and optimal block size is presented.
  •  
5.
  •  
6.
  •  
7.
  •  