SwePub

Result list for search "WFRF:(Gioiosa R.)"

  • Results 1-9 of 9
1.
  • Akhmetova, D., et al. (author)
  • On the application task granularity and the interplay with the scheduling overhead in many-core shared memory systems
  • 2015
  • In: Proceedings - IEEE International Conference on Cluster Computing, ICCC. - : IEEE. - 9781467365987 ; , pp. 428-437
  • Conference paper (peer-reviewed)
    • Task-based programming models are considered one of the most promising programming model approaches for exascale supercomputers because of their ability to dynamically react to changing conditions and reassign work to processing elements. One question, however, remains unsolved: what should the task granularity of task-based applications be? Fine-grained tasks offer more opportunities to balance the system and generally result in higher system utilization. However, they also induce large scheduling overhead. The impact of scheduling overhead on coarse-grained tasks is lower, but large systems may end up imbalanced and underutilized. In this work we propose a methodology to analyze the interplay between application task granularity and scheduling overhead. Our methodology is based on three main points: 1) a novel task algorithm that analyzes an application directed acyclic graph (DAG) and aggregates tasks, 2) a fast and precise emulator to analyze the application behavior on systems with up to 1,024 cores, 3) a comprehensive sensitivity analysis of application performance and scheduling overhead breakdown. Our results show that there is an optimal task granularity between 1.2x10^4 and 10x10^4 cycles for the representative schedulers. Moreover, our analysis indicates that a suitable scheduler for exascale task-based applications should employ a best-effort local scheduler and a sophisticated remote scheduler to move tasks across worker threads.
2.
  • Heindel, Jerrold J., et al. (author)
  • Parma consensus statement on metabolic disruptors
  • 2015
  • In: Environmental Health. - : BioMed Central (BMC). - 1476-069X. ; 14
  • Journal article (peer-reviewed)
    • A multidisciplinary group of experts gathered in Parma, Italy, for a workshop hosted by the University of Parma, May 16-18, 2014, to address concerns about the potential relationship between environmental metabolic disrupting chemicals, obesity and related metabolic disorders. The objectives of the workshop were to: 1. Review findings related to the role of environmental chemicals, referred to as "metabolic disruptors", in obesity and metabolic syndrome with special attention to recent discoveries from animal model and epidemiology studies; 2. Identify conclusions that could be drawn with confidence from existing animal and human data; 3. Develop predictions based on current data; and 4. Identify critical knowledge gaps and areas of uncertainty. The consensus statements are intended to aid in expanding understanding of the role of metabolic disruptors in the obesity and metabolic disease epidemics, to move the field forward by assessing the current state of the science and to identify research needs on the role of environmental chemical exposures in these diseases. We propose broadening the definition of obesogens to that of metabolic disruptors, to encompass chemicals that play a role in altered susceptibility to obesity, diabetes and related metabolic disorders including metabolic syndrome.
3.
  • Markidis, Stefano, et al. (author)
  • A performance characterization of streaming computing on supercomputers
  • 2016
  • In: Procedia Computer Science. - : Elsevier. - 1877-0509. ; , pp. 98-107
  • Conference paper (peer-reviewed)
    • Streaming computing models allow for on-the-fly processing of large data sets. With the increased demand for processing large amounts of data in a reasonable period of time, streaming models are increasingly used on supercomputers to solve data-intensive problems. Because supercomputers have been mainly used for compute-intensive workloads, supercomputer performance metrics focus on the number of floating point operations in time and cannot fully characterize streaming application performance on supercomputers. We introduce the injection and processing rates as the main metrics to characterize the performance of streaming computing on supercomputers. We analyze the dynamics of these quantities in a modified STREAM benchmark developed atop an MPI streaming library in a series of different configurations. We show that after a brief transient the injection and processing rates converge to sustained rates. We also demonstrate that streaming computing performance strongly depends on the number of connections between data producers and consumers and on the processing task granularity.
4.
  • Peng, I. B., et al. (author)
  • Characterizing the performance benefit of hybrid memory system for HPC applications
  • 2018
  • In: Parallel Computing. - : Elsevier. - 0167-8191 .- 1872-7336. ; 76, pp. 57-69
  • Journal article (peer-reviewed)
    • Heterogeneous memory systems that consist of multiple memory technologies are becoming common in high-performance computing environments. Modern processors and accelerators, such as the Intel Knights Landing (KNL) CPU and NVIDIA Volta GPU, feature small-size high-bandwidth memory near the compute cores and large-size normal-bandwidth memory that is connected off-chip. Theoretically, HBM can provide about four times higher bandwidth than conventional DRAM. However, many factors impact the actual performance improvement that an application can achieve on such a system. In this paper, we focus on the Intel KNL system and identify the most important factors affecting application performance, including the application memory access pattern, the problem size, the threading level and the actual memory configuration. We use a set of representative applications from both scientific and data-analytics domains. Our results show that applications with regular memory access benefit from MCDRAM, achieving up to three times performance when compared to the performance obtained using only DRAM. On the contrary, applications with irregular memory access patterns are latency-bound and may suffer from performance degradation when using only MCDRAM. Also, we provide a memory-centric analysis of four applications, identify their major data objects, and correlate their characteristics to the performance improvement on the testbed.
5.
  • Peng, I. Bo, et al. (author)
  • Exploring Application Performance on Emerging Hybrid-Memory Supercomputers
  • 2017
  • In: Proceedings - 18th IEEE International Conference on High Performance Computing and Communications, 14th IEEE International Conference on Smart City and 2nd IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2016. - : Institute of Electrical and Electronics Engineers (IEEE). - 9781509042968 ; , pp. 473-480
  • Conference paper (peer-reviewed)
    • Next-generation supercomputers will feature more hierarchical and heterogeneous memory systems with different memory technologies working side-by-side. A critical question is whether at large scale existing HPC applications and emerging data-analytics workloads will have performance improvement or degradation on these systems. We propose a systematic and fair methodology to identify the trend of application performance on emerging hybrid-memory systems. We model the memory system of next-generation supercomputers as a combination of 'fast' and 'slow' memories. We then analyze performance and dynamic execution characteristics of a variety of workloads, from traditional scientific applications to emerging data analytics to compare traditional and hybrid-memory systems. Our results show that data analytics applications can clearly benefit from the new system design, especially at large scale. Moreover, hybrid-memory systems do not penalize traditional scientific applications, which may also show performance improvement.
6.
  • Peng, Ivy Bo, et al. (author)
  • Exploring the performance benefit of hybrid memory system on HPC environments
  • 2017
  • In: Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017. - : Institute of Electrical and Electronics Engineers (IEEE). - 9781538634080 ; , pp. 683-692
  • Conference paper (peer-reviewed)
    • Hardware accelerators have become a de-facto standard to achieve high performance on current supercomputers and there are indications that this trend will increase in the future. Modern accelerators feature high-bandwidth memory next to the computing cores. For example, the Intel Knights Landing (KNL) processor is equipped with 16 GB of high-bandwidth memory (HBM) that works together with conventional DRAM memory. Theoretically, HBM can provide ∼4× higher bandwidth than conventional DRAM. However, many factors impact the effective performance achieved by applications, including the application memory access pattern, the problem size, the threading level and the actual memory configuration. In this paper, we analyze the Intel KNL system and quantify the impact of the most important factors on the application performance by using a set of applications that are representative of scientific and data-analytics workloads. Our results show that applications with regular memory access benefit from MCDRAM, achieving up to 3× performance when compared to the performance obtained using only DRAM. On the contrary, applications with random memory access pattern are latency-bound and may suffer from performance degradation when using only MCDRAM. For those applications, the use of additional hardware threads may help hide latency and achieve higher aggregated bandwidth when using HBM.
7.
  • Peng, I. Bo, et al. (author)
  • Idle period propagation in message-passing applications
  • 2017
  • In: Proceedings - 18th IEEE International Conference on High Performance Computing and Communications, 14th IEEE International Conference on Smart City and 2nd IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2016. - : Institute of Electrical and Electronics Engineers (IEEE). - 9781509042968 ; , pp. 937-944
  • Conference paper (peer-reviewed)
    • Idle periods on different processes of Message Passing applications are unavoidable. While the origin of idle periods on a single process is well understood as the effect of system and architectural random delays, it is unclear how these idle periods propagate from one process to another. It is important to understand idle period propagation in Message Passing applications as it allows application developers to design communication patterns avoiding idle period propagation and the consequent performance degradation in their applications. To understand idle period propagation, we introduce a methodology to trace idle periods when a process is waiting for data from a remote delayed process in MPI applications. We apply this technique in an MPI application that solves the heat equation to study idle period propagation on three different systems. We confirm that idle periods move between processes in the form of waves and that there are different stages in idle period propagation. Our methodology enables us to identify a self-synchronization phenomenon that occurs on two systems where some processes run slower than the other processes.
8.
  • Rivas-Gomez, Sergio, et al. (author)
  • Extending message passing interface windows to storage
  • 2017
  • In: Proceedings - 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2017. - : Institute of Electrical and Electronics Engineers Inc.. - 9781509066100 ; , pp. 728-730
  • Conference paper (peer-reviewed)
    • This paper presents an extension to MPI supporting the one-sided communication model and window allocations in storage. Our design transparently integrates with the current MPI implementations, enabling applications to target MPI windows in storage, memory or both simultaneously, without major modifications. Initial performance results demonstrate that the presented MPI window extension could potentially be helpful for a wide range of use cases, with low overhead.
9.
  • Rivas-Gomez, Sergio, et al. (author)
  • MPI windows on storage for HPC applications
  • 2017
  • In: ACM International Conference Proceeding Series. - New York, NY, USA : Association for Computing Machinery (ACM).
  • Conference paper (peer-reviewed)
    • Upcoming HPC clusters will feature hybrid memories and storage devices per compute node. In this work, we propose to use the MPI one-sided communication model and MPI windows as a unique interface for programming memory and storage. We describe the design and implementation of MPI windows on storage, and present its benefits for out-of-core execution, parallel I/O and fault-tolerance. Using a modified STREAM micro-benchmark, we measure the sustained bandwidth of MPI windows on storage against MPI memory windows and observe that only a 10% performance penalty is incurred. When using parallel file systems such as Lustre, asymmetric performance is observed with a 10% performance penalty in reading operations and a 90% penalty in writing operations. Nonetheless, experimental results of a Distributed Hash Table and the HACC I/O kernel mini-application show that the overall penalty of MPI windows on storage can be negligible in most cases on real-world applications.