Field name | Indicators | Metadata |
---|---|---|
000 | | 03479naa a2200445 4500 |
001 | | oai:DiVA.org:umu-145462 |
003 | | SwePub |
008 | 180305s2018 | |||||||||||000 ||eng| | |
024 | 7 | a https://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-145462 2 URI |
024 | 7 | a https://doi.org/10.1016/j.jpdc.2018.01.004 2 DOI |
040 | | a (SwePub)umu |
041 | | a eng b eng |
042 | | 9 SwePub |
072 | 7 | a ref 2 swepub-contenttype |
072 | 7 | a art 2 swepub-publicationtype |
100 | 1 | a Myllykoski, Mirko, d 1988- u Umeå universitet, Institutionen för datavetenskap, Department of Mathematical Information Technology, University of Jyväskylä 4 aut 0 (Swepub:umu)mimy0006 |
245 | 1 0 | a On solving separable block tridiagonal linear systems using a GPU implementation of radix-4 PSCR method |
264 | 1 | b Elsevier, c 2018
338 | | a electronic 2 rdacarrier |
520 | | a The partial solution variant of the cyclic reduction (PSCR) method is a direct solver that can be applied to certain types of separable block tridiagonal linear systems. Such linear systems arise, e.g., from the Poisson and the Helmholtz equations discretized with bilinear finite elements. Furthermore, the separability of the linear system entails that the discretization domain has to be rectangular and the discretization mesh orthogonal. A generalized graphics processing unit (GPU) implementation of the PSCR method is presented. The numerical results indicate up to 24-fold speedups when compared to an equivalent CPU implementation that utilizes a single CPU core. The attained floating-point performance is analyzed using the roofline performance analysis model, and the resulting models show that it is mainly limited by the off-chip memory bandwidth and the effectiveness of the tridiagonal solver used to solve the arising tridiagonal subproblems. The performance is accelerated using off-line autotuning techniques. |
650 | 7 | a NATURVETENSKAP x Data- och informationsvetenskap x Datavetenskap 0 (SwePub)10201 2 hsv//swe |
650 | 7 | a NATURAL SCIENCES x Computer and Information Sciences x Computer Sciences 0 (SwePub)10201 2 hsv//eng |
650 | 7 | a NATURVETENSKAP x Data- och informationsvetenskap x Programvaruteknik 0 (SwePub)10205 2 hsv//swe |
650 | 7 | a NATURAL SCIENCES x Computer and Information Sciences x Software Engineering 0 (SwePub)10205 2 hsv//eng |
653 | | a Fast direct solver |
653 | | a GPU computing |
653 | | a Partial solution technique |
653 | | a PSCR method |
653 | | a Roofline model |
653 | | a Separable block tridiagonal linear system |
653 | | a business data processing |
653 | | a administrativ databehandling |
700 | 1 | a Rossi, Tuomo u Department of Mathematical Information Technology, University of Jyväskylä 4 aut
700 | 1 | a Toivanen, Jari u Department of Mathematical Information Technology, University of Jyväskylä; Department of Aeronautics & Astronautics, Stanford University 4 aut
710 | 2 | a Umeå universitet b Institutionen för datavetenskap 4 org
773 | 0 | t Journal of Parallel and Distributed Computing d : Elsevier g 115, s. 56-66 q 115<56-66 x 0743-7315 x 1096-0848
856 | 4 | u https://umu.diva-portal.org/smash/get/diva2:1187714/FULLTEXT01.pdf x primary x Raw object y fulltext:postprint
856 | 4 | u https://jyx.jyu.fi/bitstream/123456789/57129/1/myllykoskirossitoivanenonsolving.pdf |
856 | 4 8 | u https://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-145462 |
856 | 4 8 | u https://doi.org/10.1016/j.jpdc.2018.01.004 |