
Result list for search "WFRF:(Laure Erwin)"

Search: WFRF:(Laure Erwin)

Results 1-10 of 141


No.	Reference
1.	Aguilar, Xavier, et al. (authors). An On-Line Performance Introspection Framework for Task-Based Runtime Systems. 2019. In: 19th International Conference on Computational Science, ICCS 2019. - Cham : Springer Verlag. - 9783030227333 ; , pp. 238-252. Conference paper (peer-reviewed). Abstract: The expected high levels of parallelism together with the heterogeneity and complexity of new computing systems pose many challenges to current software. New programming approaches and runtime systems that can simplify the development of parallel applications are needed. Task-based runtime systems have emerged as a good solution to cope with high levels of parallelism, while providing software portability, and easing program development. However, these runtime systems require real-time information on the state of the system to properly orchestrate program execution and optimise resource utilisation. In this paper, we present a lightweight monitoring infrastructure developed within the AllScale Runtime System, a task-based runtime system for extreme scale. This monitoring component provides real-time introspection capabilities that help the runtime scheduler in its decision-making process and adaptation, while introducing minimum overhead. In addition, the monitoring component provides several post-mortem reports as well as real-time data visualisation that can be of great help in the task of performance debugging.
2.	Aguilar, Xavier, et al. (authors). Automatic On-Line Detection of MPI Application Structure with Event Flow Graphs. 2015. In: EURO-PAR 2015. - Berlin, Heidelberg : Springer Berlin/Heidelberg. - 9783662480960 - 9783662480953 ; , pp. 70-81. Conference paper (peer-reviewed). Abstract: The deployment of larger and larger HPC systems challenges the scalability of both applications and analysis tools. Performance analysis toolsets provide users with means to spot bottlenecks in their applications by either collecting aggregated statistics or generating loss-less time-stamped traces. While obtaining detailed trace information is the best method to examine the behavior of an application in detail, it is infeasible at extreme scales due to the huge volume of data generated. In this context, knowing the application structure, and particularly the nesting of loops in iterative applications, is of great importance as it allows, among other things, to reduce the amount of data collected by focusing on important sections of the code. In this paper we demonstrate how the loop nesting structure of an MPI application can be extracted on-line from its event flow graph without the need of any explicit source code instrumentation. We show how this knowledge on the application structure can be used to compute postmortem statistics as well as to reduce the amount of redundant data collected. To that end, we present a usage scenario where this structure information is utilized on-line (while the application runs) to intelligently collect fine-grained data for only a few iterations of an application, considerably reducing the amount of data gathered.
3.	Aguilar, Xavier, et al. (authors). MPI Trace Compression Using Event Flow Graphs. 2014. Conference paper (peer-reviewed). Abstract: Understanding how parallel applications behave is crucial for using high-performance computing (HPC) resources efficiently. However, the task of performance analysis is becoming increasingly difficult due to the growing complexity of scientific codes and the size of machines. Even though many tools have been developed over the past years to help in this task, current approaches either only offer an overview of the application discarding temporal information, or they generate huge trace files that are often difficult to handle. In this paper we propose the use of event flow graphs for monitoring MPI applications, a new and different approach that balances the low overhead of profiling tools with the abundance of information available from tracers. Event flow graphs are captured with very low overhead, require orders of magnitude less storage than standard trace files, and can still recover the full sequence of events in the application. We test this new approach with the NERSC-8/Trinity Benchmark suite and achieve compression ratios up to 119x.
4.	Aguilar, Xavier, et al. (authors). Online MPI trace compression using event flow graphs and wavelets. 2016. In: Procedia Computer Science. - : Elsevier. - 1877-0509. ; , pp. 1497-1506. Conference paper (peer-reviewed). Abstract: Performance analysis of scientific parallel applications is essential to use High Performance Computing (HPC) infrastructures efficiently. Nevertheless, collecting detailed data of large-scale parallel programs and long-running applications is infeasible due to the huge amount of performance information generated. Even though there are no technological constraints in storing Terabytes of performance data, the constant flushing of such data to disk introduces a massive overhead into the application that makes the performance measurements worthless. This paper explores the use of Event flow graphs together with wavelet analysis and EZW-encoding to provide MPI event traces that are orders of magnitude smaller while preserving accurate information on timestamped events. Our mechanism compresses the performance data online while the application runs, thus reducing the pressure put on the I/O system due to buffer flushing. As a result, we achieve lower application perturbation, reduced performance data output, and the possibility to monitor longer application runs.
5.	Aguilar, Xavier, et al. (authors). Online Performance Data Introspection with IPM. 2014. In: Proceedings of the 15th IEEE International Conference on High Performance Computing and Communications (HPCC 2013). - : IEEE Computer Society. - 9780769550886 ; , pp. 728-734. Conference paper (peer-reviewed). Abstract: Exascale systems will be heterogeneous architectures with multiple levels of concurrency and energy constraints. In such a complex scenario, performance monitoring and runtime systems play a major role to obtain good application performance and scalability. Furthermore, online access to performance data becomes a necessity to decide how to schedule resources and orchestrate computational elements: processes, threads, tasks, etc. We present the Performance Introspection API, an extension of the IPM tool that provides online runtime access to performance data from an application while it runs. We describe its design and implementation and show its overhead on several test benchmarks. We also present a real test case using the Performance Introspection API in conjunction with processor frequency scaling to reduce power consumption.
6.	Aguilar, Xavier (author). Performance Monitoring, Analysis, and Real-Time Introspection on Large-Scale Parallel Systems. 2020. Doctoral thesis (other academic/artistic). Abstract: High-Performance Computing (HPC) has become an important scientific driver. A wide variety of research ranging for example from drug design to climate modelling is nowadays performed in HPC systems. Furthermore, the tremendous computer power of such HPC systems allows scientists to simulate problems that were unimaginable a few years ago. However, the continuous increase in size and complexity of HPC systems is turning the development of efficient parallel software into a difficult task. Therefore, the use of performance monitoring and analysis is a must in order to unveil inefficiencies in parallel software. Nevertheless, performance tools also face challenges as a result of the size of HPC systems, for example, coping with huge amounts of performance data generated. In this thesis, we propose a new model for performance characterisation of MPI applications that tackles the challenge of big performance data sets. Our approach uses Event Flow Graphs to balance the scalability of profiling techniques (generating performance reports with aggregated metrics) with the richness of information of tracing methods (generating files with sequences of time-stamped events). In other words, graphs make it possible to encode ordered sequences of events without storing the whole sequence of such events, and therefore, they need much less memory and disk space, and are more scalable. We demonstrate in this thesis how our Event Flow Graph model can be used as a trace compression method. Furthermore, we propose a method to automatically detect the structure of MPI applications using our Event Flow Graphs. This knowledge can afterwards be used to collect performance data in a smarter way, reducing for example the amount of redundant data collected.
Finally, we demonstrate that our graphs can be used beyond trace compression and automatic analysis of performance data. We propose a new methodology to use Event Flow Graphs in the task of visual performance data exploration. In addition to the Event Flow Graph model, we also explore in this thesis the design and use of performance data introspection frameworks. Future HPC systems will be very dynamic environments providing extreme levels of parallelism, but with energy constraints, considerable resource sharing, and heterogeneous hardware. Thus, the use of real-time performance data to orchestrate program execution in such a complex and dynamic environment will be a necessity. This thesis presents two different performance data introspection frameworks that we have implemented. These introspection frameworks are easy to use, and provide performance data in real time with very low overhead. We demonstrate, among other things, how our approach can be used to reduce in real time the energy consumed by the system. The approaches proposed in this thesis have been validated in different HPC systems using multiple scientific kernels as well as real scientific applications. The experiments show that our approaches in performance characterisation and performance data introspection are not intrusive at all, and can be a valuable contribution to help in the performance monitoring of future HPC systems.
7.	Aguilar, Xavier, et al. (authors). Scalability analysis of Dalton, a molecular structure program. 2013. In: Future Generation Computer Systems. - : Elsevier BV. - 0167-739X .- 1872-7115. ; 29:8, pp. 2197-2204. Journal article (peer-reviewed). Abstract: Dalton is a molecular electronic structure program featuring common methods of computational chemistry that are based on pure quantum mechanics (QM) as well as hybrid quantum mechanics/molecular mechanics (QM/MM). It is specialized and has a leading position in calculation of molecular properties with a large world-wide user community (over 2000 licenses issued). In this paper, we present a performance characterization and optimization of Dalton. We also propose a solution to prevent the master/worker design of Dalton from becoming a performance bottleneck for larger process numbers. With these improvements we obtain speedups of 4x, increasing the parallel efficiency of the code and being able to run it on a much bigger number of cores.
8.	Aguilar, Xavier, et al. (authors). Scaling Dalton, a molecular electronic structure program. 2011. In: Seventh International Conference on e-Science, e-Science 2011, 5-8 December 2011, Stockholm, Sweden. - : IEEE conference proceedings. - 9781457721632 ; , pp. 256-262. Conference paper (peer-reviewed). Abstract: Dalton is a molecular electronic structure program featuring common methods of computational chemistry that are based on pure quantum mechanics (QM) as well as hybrid quantum mechanics/molecular mechanics (QM/MM). It is specialized and has a leading position in calculation of molecular properties with a large world-wide user community (over 2000 licenses issued). In this paper, we present a characterization and performance optimization of Dalton that increases the scalability and parallel efficiency of the application. We also propose a solution that helps prevent the master/worker design of Dalton from becoming a performance bottleneck for larger process numbers, and that increases the parallel efficiency.
9.	Aguilar, Xavier (author). Towards Scalable Performance Analysis of MPI Parallel Applications. 2015. Licentiate thesis (other academic/artistic). Abstract: A considerable fraction of scientific discovery nowadays relies on computer simulations. High Performance Computing (HPC) provides scientists with the means to simulate processes ranging from climate modeling to protein folding. However, achieving good application performance and making optimal use of HPC resources is a heroic task due to the complexity of parallel software. Therefore, performance tools and runtime systems that help users to execute applications in the most optimal way are of utmost importance in the landscape of HPC. In this thesis, we explore different techniques to tackle the challenges of collecting, storing, and using fine-grained performance data. First, we investigate the automatic use of real-time performance data in order to run applications in an optimal way. To that end, we present a prototype of an adaptive task-based runtime system that uses real-time performance data for task scheduling. This runtime system has a performance monitoring component that provides real-time access to the performance behavior of an application while it runs. The implementation of this monitoring component is presented and evaluated within this thesis. Secondly, we explore lossless compression approaches for MPI monitoring. One of the main problems that performance tools face is the huge amount of fine-grained data that can be generated from an instrumented application. Collecting fine-grained data from a program is the best method to uncover the root causes of performance bottlenecks; however, it is unfeasible with extremely parallel applications or applications with long execution times. On the other hand, collecting coarse-grained data is scalable but sometimes not enough to discern the root cause of a performance problem.
Thus, we propose a new method for performance monitoring of MPI programs using event flow graphs. Event flow graphs provide very low overhead in terms of execution time and storage size, and can be used to reconstruct fine-grained trace files of application events ordered in time.
10.	Aguilar, Xavier, et al. (authors). Visual MPI Performance Analysis using Event Flow Graphs. 2015. In: Procedia Computer Science. - : Elsevier. - 1877-0509. ; 51, pp. 1353-1362. Journal article (peer-reviewed). Abstract: Event flow graphs used in the context of performance monitoring combine the scalability and low overhead of profiling methods with lossless information recording of tracing tools. In other words, they capture statistics on the performance behavior of parallel applications while preserving the temporal ordering of events. Event flow graphs require significantly less storage than regular event traces and can still be used to recover the full ordered sequence of events performed by the application. In this paper we explore the usage of event flow graphs in the context of visual performance analysis. We show that graphs can be used to quickly spot performance problems, helping to better understand the behavior of an application. We demonstrate our performance analysis approach with MiniFE, a mini-application that mimics the key performance aspects of finite-element applications in High Performance Computing (HPC).


Refine results

Publication type: conference paper (76); journal article (44); doctoral thesis (9); book chapter (7); licentiate thesis (3); report (1); research review (1)

Content type: peer-reviewed (118); other academic/artistic (23)

Author/editor: Laure, Erwin (130); Markidis, Stefano (55); Peng, Ivy Bo (17); Aguilar, Xavier (15); Schlatter, Philipp (11); Schliephake, Michael (11); Iakymchuk, Roman (10); Edlund, Åke (9); Fischer, Paul (9); Laure, Erwin, Profes ... (8); Gong, Jing (8); Kestor, G. (7); Gioiosa, R. (7); Ahmed, Laeeq (6); Akhmetova, Dana (6); Gholami, Ali (6); Hart, Alistair (6); Chien, Wei Der (5); Lapenta, G. (5); Jones, B (4); Dowling, Jim (4); Rahn, Mirko (4); Kunszt, P. (4); Chien, Steven Wei De ... (4); Sishtla, Chaitanya P ... (4); Gioiosa, Roberto (4); Min, Misun (4); Vencels, Juris (4); Vinuesa, Ricardo (3); Fürlinger, Karl (3); Gimenez, Judit (3); Ahlin, Daniel (3); Vaivads, Andris (3); Litton, Jan-Eric (3); Spjuth, Ola, 1977- (3); Jansson, Niclas, 198 ... (3); Henri, P. (3); Stockinger, H (3); Peplinski, Adam (3); Stockinger, K. (3); Olshevsky, Vyachesla ... (3); Peng, I. B. (3); Narasimhamurthy, Sai (3); Henty, David (3); Jordan, Herbert (3); Machado, Rui (3); Bartsch, Valeria (3); Ju, Yi (3); Otero, Evelyn, 1983- (3); Hemmer, F (3)

University: Kungliga Tekniska Högskolan (130); Uppsala universitet (15); Linköpings universitet (5); Umeå universitet (4); Lunds universitet (2); Karolinska Institutet (2); Stockholms universitet (1); Sveriges Lantbruksuniversitet (1); Blekinge Tekniska Högskola (1)

Language: English (141)

Research subject (UKÄ/SCB): Natural sciences (114); Engineering and technology (35); Medical and health sciences (5); Social sciences (2)
