SwePub

Result list for search "WFRF:(Dongarra Jack)"


  • Results 1-9 of 9
1.
  •  
2.
  •  
3.
  • Applied parallel computing. State of the art in scientific computing : 8th International Workshop, PARA 2006; Umeå, Sweden, June 2006, Revised Selected Papers
  • 2007
  • Proceedings (editorship) (peer-reviewed). Abstract:
    • The Eighth International Workshop on Applied Parallel Computing (PARA 2006) was held in Umeå, Sweden, June 18–21, 2006. The workshop was organized by the High Performance Computing Center North (HPC2N) and the Department of Computing Science at Umeå University. The general theme for PARA 2006 was “State of the Art in Scientific and Parallel Computing.” Topics covered at PARA 2006 included basic algorithms and software for scientific, parallel and grid computing, tools and environments for developing high-performance computing applications, as well as a broad spectrum of applications from science and engineering. The workshop included 7 plenary keynote presentations, 15 invited minisymposia organized in 30 sessions, and 16 sessions of contributed talks. The minisymposia and the contributed talks were held in five to six parallel sessions. The main workshop program was preceded by two half-day tutorials. In total, 205 presentations were held at PARA 2006, by speakers representing 28 countries. Extended abstracts for all presentations were made available at the PARA 2006 Web site (www.hpc2n.umu.se/para06). The reviewing process was performed in two stages for evaluation of originality, appropriateness, and significance. In the first stage, extended abstracts were reviewed for selection of contributions to be presented at the workshop. In the second stage the full papers submitted after the workshop were reviewed. In total, 120 papers were selected for publication in this peer-reviewed post-conference proceedings. A number of people contributed in different regards to the organization and the accomplishment of PARA 2006. First of all the Local Organization Committee did a greatly appreciated and enthusiastic job. We also acknowledge the following people for the assistance and support during the workshop days: Yvonne Löwstedt and Anne-Lie Persson; Niklas Edmundsson, Roger Oscarsson, and Mattias Wadenstein. 
Special thanks go to the PARA 2006 secretary, Lena Hellman, to Anders Backman and Björn Torkelsson for designing and managing the PARA 2006 Web site including the electronic paper submission system, powered by Commence, and to Mats Nylén and Mikael Rännar for their professional assistance in compiling and editing the PARA 2006 program, the booklet of extended abstracts, and the final proceedings. PARA 2006 would not have been possible without the personal involvement of all these fine people. We also gratefully acknowledge all minisymposia organizers, the review coordinators and all the referees for their evaluations in the second review stage, which included several rounds and resulted in these professionally peer-reviewed post-workshop proceedings. Finally, we would also like to thank the sponsoring institutions for their generous financial support. Since 1996 the international PARA conferences have become biennial and are organized by one of the Nordic countries. The first three workshops, including PARA 1996, and the most recent one, PARA 2004, were held in Lyngby, Denmark. The other three, besides this one, were held in Umeå, Sweden (PARA 1998), in Bergen, Norway (PARA 2000), and in Espoo, Finland (PARA 2002). The PARA 2008 workshop will take place in Trondheim, Norway, May 13–16, 2008.
  •  
4.
  •  
5.
  • Eerola, Paula, et al. (author)
  • Roadmap for the ARC Grid Middleware
  • 2007
  • In: Lecture Notes in Computer Science. - Berlin, Heidelberg : Springer Berlin Heidelberg. - 9783540757542 ; 4699/2007, pp. 471-479
  • Conference paper (peer-reviewed). Abstract:
    • The Advanced Resource Connector (ARC) or the NorduGrid middleware is an open source software solution enabling production quality computational and data Grids, with special emphasis on scalability, stability, reliability and performance. Since its first release in May 2002, the middleware has been deployed and used in production environments. This paper presents the future development directions and plans for the ARC middleware by outlining its software development roadmap.
  •  
6.
  • Gustavson, Fred G., et al. (author)
  • Level-3 Cholesky Factorization Routines Improve Performance of Many Cholesky Algorithms
  • 2013
  • In: ACM Transactions on Mathematical Software. - : Association for Computing Machinery (ACM). - 0098-3500 .- 1557-7295. ; 39:2, pp. 9-
  • Journal article (peer-reviewed). Abstract:
    • Four routines called DPOTF3i, i = a, b, c, d, are presented. DPOTF3i are a novel type of level-3 BLAS for use by BPF (Blocked Packed Format) Cholesky factorization and LAPACK routine DPOTRF. Performance of routines DPOTF3i is still increasing when the performance of Level-2 routine DPOTF2 of LAPACK starts decreasing. This is our main result and it implies, due to the use of larger block size nb, that DGEMM, DSYRK, and DTRSM performance also increases! The four DPOTF3i routines use simple register blocking. Different platforms have different numbers of registers. Thus, our four routines have different register blocking sizes. BPF is introduced. LAPACK routines for POTRF and PPTRF using BPF instead of full and packed format are shown to be trivial modifications of LAPACK POTRF source codes. We call these codes BPTRF. There are two variants of BPF: lower and upper. Upper BPF is "identical" to Square Block Packed Format (SBPF). "LAPACK" implementations on multicore processors use SBPF. Lower BPF is less efficient than upper BPF. Vector inplace transposition converts lower BPF to upper BPF very efficiently. Corroborating performance results for DPOTF3i versus DPOTF2 on a variety of common platforms are given for n ≈ nb as well as results for large n comparing DBPTRF versus DPOTRF.
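The abstract above describes a blocked factorization in which a small diagonal-block kernel (the role played by DPOTF2 or DPOTF3i) is combined with triangular-solve and symmetric-update steps (the roles played by DTRSM and DSYRK/DGEMM). The following is only a minimal pure-Python sketch of a generic right-looking blocked Cholesky factorization under those assumptions. It is not the authors' DPOTF3i code and does not use BPF storage; the function names `cholesky_unblocked` and `cholesky_blocked` are hypothetical and chosen for illustration.

```python
import math


def cholesky_unblocked(a):
    """Factor a small dense SPD block; only the lower triangle of `a` is read.

    Plays the role of the diagonal-block kernel in a blocked algorithm.
    Returns a lower-triangular list-of-lists L with a = L * L^T.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l


def cholesky_blocked(a, nb):
    """Right-looking blocked Cholesky with block size nb (illustrative only)."""
    n = len(a)
    w = [row[:] for row in a]  # working copy; only the lower triangle is used
    for k in range(0, n, nb):
        e = min(k + nb, n)
        # Step 1: factor the diagonal block (diagonal-block kernel).
        diag = [[w[i][j] for j in range(k, e)] for i in range(k, e)]
        lkk = cholesky_unblocked(diag)
        for i in range(k, e):
            for j in range(k, e):
                w[i][j] = lkk[i - k][j - k]
        # Step 2: triangular solve for the panel below, L21 = A21 * L11^{-T}.
        for i in range(e, n):
            for j in range(k, e):
                s = w[i][j] - sum(w[i][p] * w[j][p] for p in range(k, j))
                w[i][j] = s / w[j][j]
        # Step 3: symmetric update of the trailing matrix, A22 -= L21 * L21^T.
        for i in range(e, n):
            for j in range(e, i + 1):
                w[i][j] -= sum(w[i][p] * w[j][p] for p in range(k, e))
    # Return only the lower triangle; the upper triangle is set to zero.
    return [[w[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]
```

In this sketch a larger `nb` shifts more of the work into the diagonal-block kernel and lets steps 2 and 3 operate on larger blocks, which is the trade-off the abstract refers to when it discusses block size.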
  •  
7.
  • Gustavson, Fred G., et al. (author)
  • Rectangular Full Packed Format for Cholesky's Algorithm : Factorization, Solution, and Inversion
  • 2010
  • In: ACM Transactions on Mathematical Software. - : Association for Computing Machinery (ACM). - 0098-3500 .- 1557-7295. ; 37:2, pp. 1-21
  • Journal article (peer-reviewed). Abstract:
    • We describe a new data format for storing triangular, symmetric, and Hermitian matrices called Rectangular Full Packed Format (RFPF). The standard two-dimensional arrays of Fortran and C (also known as full format) that are used to represent triangular and symmetric matrices waste nearly half of the storage space but provide high performance via the use of Level 3 BLAS. Standard packed format arrays fully utilize storage (array space) but provide low performance as there is no Level 3 packed BLAS. We combine the good features of packed and full storage using RFPF to obtain high performance via using Level 3 BLAS as RFPF is a standard full-format representation. Also, RFPF requires exactly the same minimal storage as the packed format. Each LAPACK full and/or packed triangular, symmetric, and Hermitian routine becomes a single new RFPF routine based on eight possible data layouts of RFPF. This new RFPF routine usually consists of two calls to the corresponding LAPACK full-format routine and two calls to Level 3 BLAS routines. This means no new software is required. As examples, we present LAPACK routines for Cholesky factorization, Cholesky solution, and Cholesky inverse computation in RFPF to illustrate this new work and to describe its performance on several commonly used computer platforms. Performance of LAPACK full routines using RFPF versus LAPACK full routines using the standard format for both serial and SMP parallel processing is about the same while using half the storage. Vendor LAPACK full routines with RFPF are roughly 1 to 43 times faster for serial processing, and 1 to 97 times faster for SMP parallel processing, than vendor and/or reference packed routines.
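The storage comparison in the abstract above can be made concrete with a small sketch. The code below counts elements for full and packed storage and packs the lower triangle of a symmetric matrix column by column, following the conventional LAPACK-style lower packed layout. It does not implement the RFPF layout itself, which the abstract only states uses the same n(n+1)/2 elements as packed format. All function names here are hypothetical.

```python
def full_storage(n):
    """Number of elements in a full n-by-n two-dimensional array."""
    return n * n


def packed_storage(n):
    """Number of elements needed for one triangle, including the diagonal."""
    return n * (n + 1) // 2


def packed_index_lower(i, j, n):
    """0-based position of A[i][j] (with i >= j) in column-major lower packed storage."""
    if not (0 <= j <= i < n):
        raise IndexError("expected 0 <= j <= i < n")
    return i + j * (2 * n - j - 1) // 2


def pack_lower(a):
    """Copy the lower triangle of square matrix `a` into a packed 1-D list."""
    n = len(a)
    ap = [0.0] * packed_storage(n)
    for j in range(n):
        for i in range(j, n):
            ap[packed_index_lower(i, j, n)] = a[i][j]
    return ap
```

For n = 1000, `full_storage` gives 1,000,000 elements while `packed_storage` gives 500,500, which illustrates the "nearly half" figure quoted in the abstract.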
  •  
8.
  • Kennedy, Ken, et al. (author)
  • Toward a Framework for Preparing and Executing Adaptive Grid Programs
  • 2002
  • Conference paper (peer-reviewed). Abstract:
    • This paper describes the program execution framework being developed by the Grid Application Development Software (GrADS) Project. The goal of this framework is to provide good resource allocation for Grid applications and to support adaptive reallocation if performance degrades because of changes in the availability of Grid resources. At the heart of this strategy is the notion of a configurable object program, which contains, in addition to application code, strategies for mapping the application to different collections of resources and a resource selection model that provides an estimate of the performance of the application on a specific collection of Grid resources. This model must be accurate enough to distinguish collections of resources that will deliver good performance from those that will not. The GrADS execution framework also provides a contract monitoring mechanism for interrupting and remapping an application execution when performance falls below acceptable levels.
  •  
9.
  •  