SwePub
Search the SwePub database

  Advanced search

Results list for the search "WFRF:(Marcus Jägemar 1972 )"

Search: WFRF:(Marcus Jägemar 1972 )

  • Results 1-10 of 16
1.
  • Danielsson, Jakob, et al. (author)
  • Automatic Quality of Service Control in Multi-core Systems using Cache Partitioning
  • 2021
  • Conference paper (peer-reviewed), abstract:
    • In this paper, we present a last-level cache partitioning controller for multi-core systems. Our objective is to control the Quality of Service (QoS) of applications in multi-core systems by monitoring run-time performance and continuously re-sizing cache partitions according to the applications' needs. We discuss two different use cases: one that promotes application fairness and another that prioritizes applications according to the system engineer's desired execution behavior. We show the performance drawbacks of maintaining a fair schedule for all system tasks and its implications for system applications. We therefore implement a second control algorithm that enforces cache partition assignments according to user-defined priorities rather than system fairness. Our experiments reveal that it is possible, with non-intrusive (0.3-0.7% CPU utilization) cache-controlling measures, to increase performance according to setpoints and maintain the QoS for specific applications in an over-saturated system.
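  A minimal sketch of the kind of feedback controller the abstract above describes: it nudges an application's last-level cache allocation toward a performance setpoint. The resctrl group path, the way count, and the read_ips callable are assumptions for illustration, not the authors' implementation.

      # Hedged sketch: grow/shrink an LLC partition toward a performance setpoint.
      # Paths and constants below are illustrative assumptions.
      import time

      RESCTRL_GROUP = "/sys/fs/resctrl/app0"   # assumed pre-created resctrl group
      MAX_WAYS = 11                            # assumed number of LLC ways

      def write_ways(n_ways):
          """Write a contiguous way mask of n_ways bits for cache domain 0 (assumed)."""
          mask = (1 << n_ways) - 1
          with open(f"{RESCTRL_GROUP}/schemata", "w") as f:
              f.write(f"L3:0={mask:x}\n")

      def control_loop(read_ips, setpoint_ips, period_s=1.0):
          """read_ips is a user-supplied callable returning the app's instructions/second."""
          ways = MAX_WAYS // 2
          while True:
              ips = read_ips()
              if ips < setpoint_ips and ways < MAX_WAYS:
                  ways += 1                    # below setpoint: widen the partition
              elif ips > 1.1 * setpoint_ips and ways > 1:
                  ways -= 1                    # comfortably above: give ways back
              write_ways(ways)
              time.sleep(period_s)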
2.
  • Danielsson, Jakob, et al. (author)
  • LLM-shark -- A Tool for Automatic Resource-boundness Analysis and Cache Partitioning Setup
  • 2021
  • In: 45th IEEE Annual Computers, Software, and Applications Conference, COMPSAC 2021. - 9781665424639 ; pp. 49-58
  • Conference paper (peer-reviewed), abstract:
    • We present LLM-shark, a tool for automatic hardware resource-boundness detection and cache partitioning. Our tool has three primary objectives: first, it determines the hardware resource-boundness of a given application; second, it estimates the initial cache partition size to ensure that the application's performance is conserved and not affected by other processes competing for cache utilization; third, it continuously monitors that the application's performance is maintained over time and, if necessary, changes the cache partition size. We demonstrate LLM-shark's functionality through a series of tests using six different applications, including a set of feature detection algorithms and two synthetic applications. Our tests reveal that it is possible to determine an application's resource-boundness using a Pearson-correlation scheme implemented in LLM-shark. We propose a scheme to size applications' cache partitions based on their correlation coefficients, i.e., on their resource-boundness.
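  A minimal sketch of a Pearson-based boundness test in the spirit of the abstract above; the per-window series and the 0.7 threshold are synthetic placeholders, not LLM-shark's actual data or criteria.

      import numpy as np

      def resource_boundness(exec_times, llc_misses, threshold=0.7):
          """Call an application LLC-bound if per-window execution time correlates
          strongly with per-window LLC misses (assumed criterion)."""
          r = np.corrcoef(exec_times, llc_misses)[0, 1]
          return ("LLC-bound" if abs(r) >= threshold else "not LLC-bound"), r

      # Synthetic example windows (placeholders, not measurements from the paper):
      t = np.array([10.1, 10.4, 11.0, 11.8, 12.5, 13.1])        # ms per window
      m = np.array([1.0e6, 1.1e6, 1.3e6, 1.6e6, 1.8e6, 2.0e6])  # LLC misses per window
      print(resource_boundness(t, m))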
3.
  • Danielsson, Jakob, et al. (author)
  • Measurement-based evaluation of data-parallelism for OpenCV feature-detection algorithms
  • 2018
  • In: Staying Smarter in a Smartening World, COMPSAC'18. - 9781538626665 ; pp. 701-710
  • Conference paper (peer-reviewed), abstract:
    • We investigate the effects on execution time, shared cache usage and speed-up gains when using data-partitioned parallelism for the feature detection algorithms available in the OpenCV library. We use a data set of three different images, scaled to six different sizes to exercise the different cache memories of our test architectures. Our measurements reveal that the algorithms, using the default settings of OpenCV, behave very differently under data-partitioned parallelism. Our investigation shows that the executions of the algorithms SURF, Dense and MSER correlate to L3-cache usage and that they are therefore not suitable for data-partitioned parallelism on multi-core CPUs. Other algorithms, BRISK, FAST, ORB, HARRIS, GFTT, SimpleBlob and SIFT, do not correlate to L3-cache usage to the same extent, and they are therefore more suitable for data-partitioned parallelism. Furthermore, the SIFT algorithm provides the most stable speed-up, resulting in execution between 3 and 3.5 times faster than the original execution time for all image sizes. We have also evaluated the hardware resource usage by measuring the algorithm execution time simultaneously with the L3-cache usage. We have used our measurements to conclude which algorithms are suitable for parallelization on hardware with shared resources.
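  A minimal sketch of data-partitioned feature detection, assuming OpenCV's Python bindings and the ORB detector named in the abstract above; the strip-wise split is illustrative, and keypoints near strip borders may differ from whole-image detection.

      import cv2
      import numpy as np
      from multiprocessing import Pool

      def detect_strip(strip):
          # Create the detector inside the worker; cv2 objects do not pickle well.
          orb = cv2.ORB_create()
          return len(orb.detect(strip, None))

      def parallel_detect(image_path, n_parts=4):
          img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
          strips = np.array_split(img, n_parts, axis=0)   # horizontal strips
          with Pool(n_parts) as pool:
              counts = pool.map(detect_strip, strips)
          return sum(counts)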
4.
  • Danielsson, Jakob, et al. (author)
  • Modelling Application Cache Behavior using Regression Models
  • 2021
  • Conference paper (peer-reviewed), abstract:
    • In this paper, we describe the creation of resource usage forecasts for applications with unknown execution characteristics by evaluating different regression processes, including autoregressive models, multivariate adaptive regression splines, exponential smoothing, etc. We utilize Performance Monitoring Units (PMU) and generate hardware resource usage models for the L2-cache and the L3-cache using nine different regression processes. The measurement strategy and regression methodology are general and applicable to any given hardware resource for which performance counters are available. We use three benchmark applications: the SIFT feature detection algorithm, a standard matrix multiplication, and a version of Bubblesort. Our evaluation shows that Multivariate Adaptive Regression Spline (MARS) models generate the best resource usage forecasts among the considered models, followed by Single Exponential Smoothing (SES) and Triple Exponential Smoothing (TES).
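  A minimal sketch of single exponential smoothing, one of the model families the abstract above compares, applied to a cache-usage series; the data and the smoothing factor are illustrative placeholders.

      import numpy as np

      def ses_forecast(series, alpha=0.5):
          """One-step-ahead forecast with s_t = alpha*x_t + (1-alpha)*s_{t-1}."""
          s = series[0]
          for x in series[1:]:
              s = alpha * x + (1 - alpha) * s
          return s

      l3_occupancy = np.array([3.1, 3.4, 3.2, 3.9, 4.1, 4.0, 4.3])  # MB, synthetic
      print(f"next-window forecast: {ses_forecast(l3_occupancy):.2f} MB")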
5.
  • Danielsson, Jakob, et al. (author)
  • Resource Dependency Analysis in Multi-Core Systems
  • 2020
  • In: Proceedings - 2020 IEEE 44th Annual Computers, Software, and Applications Conference, COMPSAC 2020. - Institute of Electrical and Electronics Engineers Inc. - 9781728173030 ; pp. 87-94
  • Conference paper (peer-reviewed), abstract:
    • In this paper, we evaluate different methods for statistical determination of application resource dependency in multi-core systems. We measure an application's performance counters during run-time and create a system resource usage profile. We then use the resource profile to evaluate the application's dependency on the specific resource. We discuss and evaluate two methods to process the data, including a moving-average filter and partitioning of the data into smaller segments, in order to prepare the data for correlation calculations. Our aim with this study is to evaluate and create a generalizable method for automatic determination of resource dependencies. The final outcome of the methods used in this study is the answer to the question: 'On which resources does this application depend?'. The tool's recommendation will be used in conjunction with our last-level cache partitioning controller (LLC-PC) to decide whether an application should receive last-level cache partition slices.
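  A minimal sketch of the two preprocessing ideas the abstract above mentions, moving-average filtering and segment-wise correlation of a resource counter against application performance; the window size and segment count are illustrative assumptions.

      import numpy as np

      def moving_average(x, w=5):
          return np.convolve(x, np.ones(w) / w, mode="valid")

      def segment_correlations(perf, counter, n_segments=4):
          """Pearson correlation per segment between smoothed performance and a counter."""
          p, c = moving_average(perf), moving_average(counter)
          return [np.corrcoef(ps, cs)[0, 1]
                  for ps, cs in zip(np.array_split(p, n_segments),
                                    np.array_split(c, n_segments))]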
6.
  •  
7.
  • Danielsson, Jakob, et al. (author)
  • Testing Performance-Isolation in Multi-Core Systems
  • 2019
  • Conference paper (peer-reviewed), abstract:
    • In this paper we present a methodology for quantifying the level of performance isolation in a multi-core system. We have devised a test that can be applied to detect breaches of isolation in different computing resources that may be shared between cores. We use this test to determine the level of isolation gained by using the Jailhouse hypervisor compared to a regular Linux system in terms of CPU isolation, cache isolation and memory bus isolation. Our measurements show that the Jailhouse hypervisor provides performance isolation of local computing resources such as the CPU. We have also evaluated whether any isolation could be gained for shared computing resources such as the system-wide cache and the memory bus controller. Our tests show no measurable difference in partitioning between a regular Linux system and a Jailhouse-partitioned system for shared resources. Using the Jailhouse hypervisor incurs only a small noticeable overhead when executing multiple shared-resource intensive tasks on multiple cores, which implies that running Jailhouse in a memory-saturated system will not be harmful. However, contention still exists in the memory bus and in the system-wide cache.
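  A minimal sketch of an isolation test in the spirit of the abstract above: time a memory-intensive victim alone and next to a co-running aggressor, and report the slowdown as a crude isolation metric; the workload sizes are illustrative.

      import time
      import numpy as np
      from multiprocessing import Process

      def memory_workload(n=2_000_000, iters=50):
          a = np.arange(n, dtype=np.float64)
          for _ in range(iters):
              a = a * 1.000001 + 1.0        # streams through memory each iteration

      def timed_run():
          t0 = time.perf_counter()
          memory_workload()
          return time.perf_counter() - t0

      if __name__ == "__main__":
          baseline = timed_run()
          aggressor = Process(target=memory_workload, args=(2_000_000, 500))
          aggressor.start()
          contended = timed_run()
          aggressor.terminate()
          aggressor.join()
          print(f"slowdown under contention: {contended / baseline:.2f}x")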
8.
  • Imtiaz, Shamoona, et al. (author)
  • Automatic Segmentation of Resource Utilization Data
  • 2022
  • In: 1st IEEE Industrial Electronics Society Annual On-Line Conference (ONCON) 2022. - 9798350398069
  • Conference paper (other academic/artistic), abstract:
    • Advancing industrial systems seek improvements to achieve the required level of quality of service and efficient performance management. This requires a better understanding of the resource utilization behaviour of applications in execution. Even expert engineers want to envision the dependencies and the impact of one computer resource on another. In such situations it is important to know the statistical relationship between data sets; for example, a process with high cache demand should not be scheduled together with another cache-hungry process at the same time on the same core. Performance monitoring data from hardware and software is voluminous, and grouping this time-series data based on similar behaviour can reveal distinguishable execution phases. For these reasons we choose a change point analysis method. Using this method, the study determined an optimal threshold that identifies largely the same segments across other executions of the same application and the same event. These segments are then validated with the help of test data. Finally, the study provides a segment-wise, local, compact statistical model with decent accuracy.
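  A minimal sketch of change-point segmentation of a resource-utilization series, using the ruptures library as a stand-in for the change point analysis the abstract above mentions (not the authors' tool); the three-phase signal is a synthetic placeholder.

      import numpy as np
      import ruptures as rpt

      rng = np.random.default_rng(0)
      # Three execution phases with different mean cache-miss rates (synthetic).
      signal = np.concatenate([rng.normal(10, 1, 100),
                               rng.normal(25, 1, 100),
                               rng.normal(15, 1, 100)])

      algo = rpt.Pelt(model="l2", min_size=20).fit(signal)
      breakpoints = algo.predict(pen=50)      # indices where detected segments end
      print("detected segment boundaries:", breakpoints)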
9.
  • Imtiaz, Shamoona, et al. (author)
  • Towards Automatic Application Fingerprinting Using Performance Monitoring Counters
  • 2021
  • In: ACM International Conference Proceeding Series. - New York, NY, USA : Association for Computing Machinery. - 9781450390576
  • Conference paper (peer-reviewed), abstract:
    • In this paper, we discuss a method for application fingerprinting using conventional hardware and software performance counters. Modern applications are complex and often utilize a broad spectrum of the available hardware resources, where multiple performance counters can be of significant interest. The number of performance counters that can be captured simultaneously is, however, small due to hardware limitations in most modern computers. We propose to mitigate the hardware limitations using an intelligent mechanism that pinpoints the performance counters most relevant to an application's performance. In our proposal, we utilize the Pearson correlation coefficient to rank the most relevant PMU events and filter out events of less relevance to an application's execution. Our ultimate goal is to establish a comparable application fingerprint model using performance counters that we can use to classify applications. The classification procedure can then be used to determine the type of application behind a fingerprint, such as malicious software.
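  A minimal sketch of ranking PMU events by the absolute Pearson correlation between each event series and a chosen performance signal, as the abstract above outlines; the event names and data are illustrative placeholders.

      import numpy as np

      def rank_events(perf, events, top_k=3):
          """events: dict mapping PMU event name -> sampled counter series."""
          scores = {name: abs(np.corrcoef(perf, series)[0, 1])
                    for name, series in events.items()}
          return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

      rng = np.random.default_rng(1)
      perf = rng.normal(size=200)
      events = {
          "LLC_MISSES": perf * 0.9 + rng.normal(scale=0.2, size=200),
          "BRANCH_MISSES": rng.normal(size=200),
          "DTLB_MISSES": perf * 0.4 + rng.normal(scale=0.8, size=200),
      }
      print(rank_events(perf, events))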
10.
  •  