Matrix Multiplication on Hypercubes Using Full Bandwidth and Constant Storage
- Article/chapter, English, 1991
Numbers
- LIBRIS-ID: oai:DiVA.org:kth-65263
- URI: https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-65263
Supplementary language notes
- Language: English
- Summary in: English
Classification
- Subject category: ref swepub-contenttype
- Subject category: kon swepub-publicationtype
Notes
- NR 20140805
- For matrix multiplication on hypercube multiprocessors with the product matrix accumulated in place, a processor must receive about $P^2/\sqrt{N}$ elements of each input operand, with operands of size $P \times P$ distributed evenly over $N$ processors. With concurrent communication on all ports, the number of element transfers in sequence can be reduced to $P^2/(\sqrt{N}\log N)$ for each input operand. We present a two-level partitioning of the matrices and an algorithm for the matrix multiplication with optimal data motion and constant storage. The algorithm has sequential arithmetic complexity $2P^3$ and parallel arithmetic complexity $2P^3/N$. The algorithm has been implemented on the Connection Machine model CM-2. On the 8K CM-2, we measured about 1.6 Gflops, which would scale up to about 13 Gflops for a full 64K machine.
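The quantities in the abstract can be evaluated numerically with a short sketch like the one below. It is illustrative only and not taken from the paper: the function names are hypothetical, the logarithm is assumed to be base 2 (the number of ports of an N-node hypercube), and the 8K-to-64K projection assumes performance scales linearly with machine size, which is consistent with the "about 13 Gflops" figure quoted above.

```python
import math


def elements_received_per_operand(P, N):
    """About P^2 / sqrt(N) elements of each input operand per processor."""
    return P**2 / math.sqrt(N)


def sequential_transfers_all_ports(P, N):
    """About P^2 / (sqrt(N) * log N) element transfers in sequence when all
    ports communicate concurrently. Base-2 logarithm is an assumption."""
    return P**2 / (math.sqrt(N) * math.log2(N))


def sequential_flops(P):
    """Sequential arithmetic complexity 2 * P^3."""
    return 2 * P**3


def parallel_flops(P, N):
    """Parallel arithmetic complexity 2 * P^3 / N."""
    return 2 * P**3 / N


def scaled_gflops(measured_gflops, measured_size, target_size):
    """Linear extrapolation of measured performance (an assumption)."""
    return measured_gflops * target_size / measured_size


if __name__ == "__main__":
    P, N = 1024, 64  # example sizes chosen for illustration only
    print(elements_received_per_operand(P, N))
    print(sequential_transfers_all_ports(P, N))
    print(sequential_flops(P), parallel_flops(P, N))
    # 8K = 8192 and 64K = 65536 processors
    print(scaled_gflops(1.6, 8192, 65536))
```

Under the linear-scaling assumption, the projection is simply 1.6 Gflops multiplied by 8, which the abstract rounds to about 13 Gflops.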
Added entries (persons, corporate bodies, meetings, titles ...)
- Johnsson, Lennart; KTH, Parallelldatorcentrum, PDC; (Swepub:kth)u1x9yl3z (author)
- KTH, Parallelldatorcentrum, PDC (creator_code:org_t)