SwePub

  • Larsen, P., Danmarks Tekniske Universitet, Technical University of Denmark (author)

Parallelizing more loops with compiler guided refactoring

  • Article/chapter, English, 2012

Publisher, publication year, extent ...

  • 2012

Numbers

  • LIBRIS-ID:oai:research.chalmers.se:5f5d976f-2651-4094-8bc6-ef6a8fce3deb
  • ISBN:9780769547961
  • DOI:https://doi.org/10.1109/ICPP.2012.48
  • URI:https://research.chalmers.se/publication/169410

Supplementary language notes

  • Language:English
  • Summary in:English

Classification

  • Publication type (swepub-publicationtype):kon (conference paper)
  • Content type (swepub-contenttype):ref (peer-reviewed)

Notes

  • The performance of many parallel applications relies not on instruction-level parallelism but on loop-level parallelism. Unfortunately, automatic parallelization of loops is a fragile process: many different obstacles affect or prevent it in practice. To address this predicament, we developed an interactive compilation feedback system that guides programmers in iteratively modifying their application source code. This helps leverage the compiler's ability to generate loop-parallel code. We employ our system to modify two sequential benchmarks dealing with image processing and edge detection, resulting in scalable parallelized code that runs up to 8.3 times faster on an eight-core Intel Xeon 5570 system and up to 12.5 times faster on a quad-core IBM POWER6 system. Benchmark performance varies significantly between the systems, which suggests that semi-automatic parallelization should be combined with target-specific optimizations. Furthermore, comparing the first benchmark to manually parallelized, hand-optimized pthreads and OpenMP versions, we find that code generated using our approach typically outperforms the pthreads code (within 93-339%) and performs competitively against the OpenMP code (within 75-111%). The second benchmark outperforms manually parallelized and optimized OpenMP code (within 109-242%).

Added entries (persons, corporate bodies, meetings, titles ...)

  • Ladelsky, R., IBM Haifa Labs (author)
  • Lidman, Jacob, 1985, Chalmers tekniska högskola, Chalmers University of Technology (Swepub:cth)lidman (author)
  • McKee, Sally A, 1963, Chalmers tekniska högskola, Chalmers University of Technology (Swepub:cth)mckee (author)
  • Karlsson, S., Danmarks Tekniske Universitet, Technical University of Denmark (author)
  • Zaks, A. (author)
  • Danmarks Tekniske Universitet; IBM Haifa Labs (creator_code:org_t)

Related titles

  • In: Proceedings of the International Conference on Parallel Processing. 41st International Conference on Parallel Processing, ICPP 2012, Pittsburgh, PA, 10-13 September 2012, pp. 410-419. ISSN 0190-3918. ISBN 9780769547961
