SwePub

Result list for search "L773:1573 0484 OR L773:0920 8542 srt2:(2020-2023)"

Search: L773:1573 0484 OR L773:0920 8542 > (2020-2023)

  • Results 1-6 of 6
1.
  • Atzori, Marco, 1992-, et al. (author)
  • In situ visualization of large-scale turbulence simulations in Nek5000 with ParaView Catalyst
  • 2022
  • In: Journal of Supercomputing. - : Springer. - 0920-8542 .- 1573-0484. ; 78:3, pp. 3605-3620
  • Journal article (peer-reviewed). Abstract:
    • In situ visualization on high-performance computing systems allows us to analyze simulation results whose analysis would otherwise be impossible, given the size of the simulation data sets and offline post-processing execution time. We develop an in situ adaptor for ParaView Catalyst and Nek5000, a massively parallel Fortran and C code for computational fluid dynamics. We perform a strong scalability test up to 2048 cores on KTH’s Beskow Cray XC40 supercomputer and assess in situ visualization’s impact on the Nek5000 performance. In our study case, a high-fidelity simulation of turbulent flow, we observe that in situ operations significantly limit the strong scalability of the code, reducing the relative parallel efficiency to only ≈ 21 % on 2048 cores (the relative efficiency of Nek5000 without in situ operations is ≈ 99 %). Through profiling with Arm MAP, we identified a bottleneck in the image composition step (that uses the Radix-kr algorithm) where a majority of the time is spent on MPI communication. We also identified an imbalance of in situ processing time between rank 0 and all other ranks. In our case, better scaling and load-balancing in the parallel image composition would considerably improve the performance of Nek5000 with in situ capabilities. In general, the result of this study highlights the technical challenges posed by the integration of high-performance simulation codes and data-analysis libraries and their practical use in complex cases, even when efficient algorithms already exist for a certain application scenario.
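The relative parallel efficiency quoted in the abstract above is a standard strong-scaling measure. The following is a minimal sketch of how such a figure is usually computed, assuming the common definition (baseline time × baseline cores) / (time × cores). The function name and all timings are hypothetical and are not taken from the paper, which does not state its baseline core count in the abstract.

```python
def relative_parallel_efficiency(t_ref: float, p_ref: int, t_p: float, p: int) -> float:
    """Strong-scaling efficiency of a run on p cores relative to a baseline on p_ref cores.

    A value of 1.0 means the larger run scaled ideally relative to the baseline.
    """
    if min(t_ref, p_ref, t_p, p) <= 0:
        raise ValueError("times and core counts must be positive")
    return (t_ref * p_ref) / (t_p * p)


# Hypothetical wall-clock times in seconds (illustration only, not measured data).
ideal = relative_parallel_efficiency(t_ref=800.0, p_ref=256, t_p=100.0, p=2048)
poor = relative_parallel_efficiency(t_ref=800.0, p_ref=256, t_p=400.0, p=2048)
print(f"ideal scaling: {ideal:.2f}, limited scaling: {poor:.2f}")
```

With this definition, an efficiency near 21 % on 2048 cores means the run takes several times longer than ideal scaling from the baseline would predict.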
2.
  • Liu, Felix, et al. (author)
  • A survey of HPC algorithms and frameworks for large-scale gradient-based nonlinear optimization
  • 2022
  • In: Journal of Supercomputing. - : Springer Nature. - 0920-8542 .- 1573-0484. ; 78:16, pp. 17513-17542
  • Journal article (peer-reviewed). Abstract:
    • Large-scale numerical optimization problems arise from many fields and have applications in both industrial and academic contexts. Finding solutions to such optimization problems efficiently requires algorithms that are able to leverage the increasing parallelism available in modern computing hardware. In this paper, we review previous work on parallelizing algorithms for nonlinear optimization. To introduce the topic, the paper starts by giving an accessible introduction to nonlinear optimization and high-performance computing. This is followed by a survey of previous work on parallelization and utilization of high-performance computing hardware for nonlinear optimization algorithms. Finally, we present a number of optimization software libraries and how they are able to utilize parallel computing today. This study can serve as an introduction point for researchers interested in nonlinear optimization or high-performance computing, as well as provide ideas and inspiration for future work combining these topics. 
3.
  • Megzari, Abdelmoujib, et al. (author)
  • Applications, challenges, and solutions to single- and multi-objective critical node detection problems : a survey
  • 2023
  • In: Journal of Supercomputing. - : Springer Nature. - 0920-8542 .- 1573-0484. ; 79:17, pp. 19770-19808
  • Journal article (peer-reviewed). Abstract:
    • Recognizing critical nodes in complex networks has emerged as a challenging task across several application areas. The critical node detection problem (CNDP) is an optimization challenge that entails determining the subset of nodes whose removal adversely affects network connectivity and performance based on certain predetermined criteria. The problem of recognizing critical nodes has received significant consideration since it is a vital challenge in a multitude of application areas. As a result, many variants have been proposed on the basis of numerous metrics. In this survey, we discuss different applications, challenges, and solutions to single- and multi-objective CNDP. We review and classify different recent advancements and obtained outcomes for each variant, proposed from 2017 to 2022. To the best of our knowledge, this is the first survey on the heuristic optimization-based solutions for CNDP that have been developed in recent years. This study also provides researchers with future insight into filling gaps in the critical nodes research field and identifying emerging research trends in this area.
4.
  • Min-Allah, Nasro, et al. (author)
  • Deployment of real-time systems in the cloud environment
  • 2021
  • In: Journal of Supercomputing. - : Springer Science and Business Media LLC. - 0920-8542 .- 1573-0484. ; 77:2, pp. 2069-2090
  • Journal article (peer-reviewed). Abstract:
    • Interest in real-time systems has grown considerably over recent years, primarily due to a significant increase in the use of smart technologies and latency-sensitive applications such as cloud gaming, audio/video streaming, and smart homes. Significant work has been done on resource mapping in the cloud environment, and a number of promising results have been established accordingly where the focus is mainly on resource provisioning. However, the applicability of cloud computing services for real-time systems generated from smart systems is still in its infancy and remains relatively unexplored. To address this gap, we propose a model for smart systems that periodically offload computational workload to the cloud environment, where virtual machines are allocated according to a rate-monotonic scheduling policy to ensure requests are processed within the associated deadlines. Deadlines of tasks have been relaxed to improve server utilization as well as maintain a level of confidence in the timing constraints. Experimental results are discussed to highlight the applicability of static priority assignment for the workload in the context of virtual machine allocation.
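Rate-monotonic scheduling, mentioned in the abstract above, assigns static priorities by period: the shorter the period, the higher the priority. The sketch below shows the classic textbook utilization test for such a task set, assuming each task's deadline equals its period. It is not the model proposed in the paper (which relaxes deadlines), and all names and task parameters are hypothetical.

```python
from typing import List, Tuple

# A periodic task as (worst-case execution time, period), in the same time unit.
Task = Tuple[float, float]


def rm_priority_order(tasks: List[Task]) -> List[Task]:
    """Return tasks from highest to lowest rate-monotonic priority (shortest period first)."""
    return sorted(tasks, key=lambda task: task[1])


def liu_layland_bound(n: int) -> float:
    """Utilization bound n * (2^(1/n) - 1) for n periodic tasks on one processor."""
    return n * (2 ** (1 / n) - 1)


def passes_utilization_test(tasks: List[Task]) -> bool:
    """Sufficient (not necessary) schedulability test: total utilization within the bound."""
    if not tasks:
        return True
    utilization = sum(exec_time / period for exec_time, period in tasks)
    return utilization <= liu_layland_bound(len(tasks))


# Hypothetical task set offloaded to a single virtual machine.
tasks = [(2.0, 10.0), (1.0, 4.0), (1.0, 5.0)]
print(rm_priority_order(tasks))
print(passes_utilization_test(tasks))
```

A task set that fails this test may still be schedulable; the bound is only a sufficient condition, so an exact response-time analysis would be needed to decide such cases.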
5.
  • Shimchenko, Marina, et al. (author)
  • Analysing software prefetching opportunities in hardware transactional memory
  • 2022
  • In: Journal of Supercomputing. - : Springer Nature. - 0920-8542 .- 1573-0484. ; 78:1, pp. 919-944
  • Journal article (peer-reviewed). Abstract:
    • Hardware transactional memory emerged to make parallel programming more accessible. However, the performance pitfall of this technique is squashing speculatively executed instructions and re-executing them in case of aborts, ultimately resorting to serialization in case of repeated conflicts. A significant fraction of aborts occurs due to conflicts (concurrent reads and writes to the same memory location performed by different threads). Our proposal aims to reduce conflict aborts by reducing the window of time during which transactional regions can suffer conflicts. We achieve this by using software prefetching instructions inserted automatically at compile-time. Through these prefetch instructions, we intend to bring the necessary data for each transaction from the main memory to the cache before the transaction itself starts to execute, thus converting the otherwise long latency cache misses into hits during the execution of the transaction. The obtained results show that our approach decreases the number of aborts by 30% on average and improves performance by up to 19% and 10% for two out of the eight evaluated benchmarks. We provide insights into when our technique is beneficial given certain characteristics of the transactional regions, the advantages and disadvantages of our approach, and finally, discuss potential solutions to overcome some of its limitations.
6.
  • Öhberg, Tomas, et al. (author)
  • Hybrid CPU-GPU execution support in the skeleton programming framework SkePU
  • 2020
  • In: Journal of Supercomputing. - : Springer. - 0920-8542 .- 1573-0484. ; 76:7, pp. 5038-5056
  • Journal article (peer-reviewed). Abstract:
    • In this paper, we present a hybrid execution backend for the skeleton programming framework SkePU. The backend is capable of automatically dividing the workload and simultaneously executing the computation on a multi-core CPU and any number of accelerators, such as GPUs. We show how to efficiently partition the workload of skeletons such as Map, MapReduce, and Scan to allow hybrid execution on heterogeneous computer systems. We also show a unified way of predicting how the workload should be partitioned based on performance modeling. With experiments on typical skeleton instances, we show the speedup for all skeletons when using the new hybrid backend. We also evaluate the performance on some real-world applications. Finally, we show that the new implementation gives higher and more reliable performance compared to an old hybrid execution implementation based on dynamic scheduling.
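The abstract above describes partitioning a skeleton's workload between a CPU and accelerators based on performance modeling. As a rough illustration of the general idea only, the sketch below splits a number of elements in proportion to each device's estimated throughput, so that all devices are predicted to finish at about the same time. This is not SkePU code (SkePU is a C++ framework) and not the paper's model; the function name and throughput figures are hypothetical.

```python
from typing import List


def split_workload(n_items: int, throughputs: List[float]) -> List[int]:
    """Split n_items across devices in proportion to estimated throughput (items per second)."""
    total = sum(throughputs)
    if n_items < 0 or total <= 0 or any(rate < 0 for rate in throughputs):
        raise ValueError("need a non-negative item count and non-negative rates with a positive sum")
    # Round each share down, then give the few leftover items to the fastest devices.
    shares = [int(n_items * rate / total) for rate in throughputs]
    leftover = n_items - sum(shares)
    fastest_first = sorted(range(len(throughputs)), key=lambda i: throughputs[i], reverse=True)
    for i in fastest_first[:leftover]:
        shares[i] += 1
    return shares


# Hypothetical model output: the GPU is estimated to be three times faster than the CPU.
cpu_share, gpu_share = split_workload(1000, [1.0, 3.0])
print(cpu_share, gpu_share)
```

A static split like this relies on the throughput estimates being accurate; the abstract contrasts such model-based partitioning with an older implementation based on dynamic scheduling.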