SwePub

Result list for search "WFRF:(Brorsson Mats 1962 ) srt2:(2015-2019)"

  • Results 1-6 of 6
1.
  • Bhatti, Muhammad Khurram, et al. (authors)
  • Locality-aware task scheduling for homogeneous parallel computing systems
  • 2018
  • In: Computing. - : Springer Science and Business Media LLC. - 0010-485X .- 1436-5057. ; 100:6, pp. 557-595
  • Journal article (peer-reviewed)
    • In systems with complex many-core cache hierarchies, exploiting data locality can significantly reduce the execution time and energy consumption of parallel applications. Locality can be exploited at various hardware and software layers. For instance, by implementing private and shared caches in a multi-level fashion, recent hardware designs are already optimised for locality. However, this would all be useless if the software scheduling does not cast the execution in a manner that promotes the locality available in the programs themselves. Since programs for parallel systems consist of tasks executed simultaneously, task scheduling becomes crucial for performance in multi-level cache architectures. This paper presents a heuristic algorithm for homogeneous multi-core systems called locality-aware task scheduling (LeTS). The LeTS heuristic is a work-conserving algorithm that takes into account both locality and load balancing in order to reduce the execution time of target applications. The working principle of LeTS is based on two distinct phases: the working task group formation phase (WTG-FP) and the working task group ordering phase (WTG-OP). The WTG-FP forms groups of tasks in order to capture data reuse across tasks, while the WTG-OP determines an optimal order of execution for task groups that minimizes the reuse distance of shared data between tasks. We have performed experiments using randomly generated task graphs by varying three major performance parameters: (1) communication to computation ratio (CCR) between 0.1 and 1.0, (2) application size, i.e., task graphs comprising 50, 100, and 300 tasks per graph, and (3) number of cores, with 2-, 4-, 8-, and 16-core execution scenarios. We have also performed experiments using selected real-world applications. The LeTS heuristic reduces the overall execution time of applications by exploiting inter-task data locality. Results show that LeTS outperforms state-of-the-art algorithms in amortizing inter-task communication cost.
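The ordering objective mentioned in this abstract (choosing an execution order for task groups that keeps reuses of shared data close together) can be illustrated with a toy sketch. This is not the LeTS algorithm: the abstract does not give its details. The sketch assumes a simplified notion of reuse distance (the number of groups executed between two consecutive uses of the same data block), a hypothetical `footprint` mapping from group name to the data blocks it touches, and an exhaustive search that is only feasible for a handful of groups.

```python
from itertools import permutations


def total_reuse_distance(order, footprint):
    """Sum, over all data blocks, the number of groups that run between
    consecutive uses of the same block (simplified reuse distance)."""
    last_seen = {}
    total = 0
    for position, group in enumerate(order):
        for block in footprint[group]:
            if block in last_seen:
                total += position - last_seen[block] - 1
            last_seen[block] = position
    return total


def best_order(footprint):
    """Try every permutation of the groups and keep the cheapest one.
    Exhaustive search: illustrative only, factorial cost."""
    return min(
        permutations(sorted(footprint)),
        key=lambda order: total_reuse_distance(order, footprint),
    )


# Hypothetical task groups and the data blocks each one touches.
footprint = {
    "G0": {"a", "b"},
    "G1": {"c"},
    "G2": {"a"},
    "G3": {"c", "d"},
}

naive = ("G0", "G1", "G2", "G3")
chosen = best_order(footprint)
print(total_reuse_distance(naive, footprint), total_reuse_distance(chosen, footprint))
```

In this toy input, running the groups that share block `a` back to back, and likewise those sharing `c`, removes every gap between reuses.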
2.
  • Du, M., et al. (authors)
  • Improving real-time bidding using a constrained Markov decision process
  • 2017
  • In: 13th International Conference on Advanced Data Mining and Applications, ADMA 2017. - Cham : Springer. - 9783319691787 ; , pp. 711-726
  • Conference paper (peer-reviewed)
    • Online advertising is increasingly switching to real-time bidding on advertisement inventory, in which the ad slots are sold through real-time auctions upon users visiting websites or using mobile apps. To compete with unknown bidders in such a highly stochastic environment, each bidder is required to estimate the value of each impression and to set a competitive bid price. Previous bidding algorithms have done so without considering the constraint of budget limits, which we address in this paper. We model the bidding process as a Constrained Markov Decision Process based reinforcement learning framework. Our model uses the predicted click-through-rate as the state, bid price as the action, and ad clicks as the reward. We propose a bidding function, which outperforms the state-of-the-art bidding functions in terms of the number of clicks when the budget limit is low. We further simulate different bidding functions competing in the same environment and report the performances of the bidding strategies when required to adapt to a dynamic environment.
3.
  • Du, M., et al. (authors)
  • Time series modeling of market price in real-time bidding
  • 2019
  • In: ESANN 2019 - Proceedings, 27th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. - : ESANN. ; , pp. 643-648
  • Conference paper (peer-reviewed)
    • Real-time bidding (RTB) is one of the most popular online advertisement selling mechanisms. Modeling the highly dynamic bidding environment is crucial for making good bids. Market prices of auctions fluctuate heavily within short time spans. State-of-the-art methods neglect the temporal dependencies of bidders’ behaviors. In this paper, the bid requests are aggregated by time and the mean market price per aggregated segment is modeled as a time series. We show that the Long Short-Term Memory (LSTM) neural network outperforms the state-of-the-art univariate time series models by capturing the nonlinear temporal dependencies in the market price. We further improve the prediction performance by adding a summary of exogenous features from bid requests.
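The aggregation step described in this abstract (grouping bid requests by time and taking the mean market price per segment) can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the `(timestamp, price)` record layout, the 60-second window, and the sample values are all assumptions made up for the example, and the model that would consume the resulting series is not shown.

```python
from collections import defaultdict


def mean_price_series(bid_requests, window_seconds):
    """Bucket (timestamp_seconds, market_price) pairs into fixed-width time
    segments and return [(segment_index, mean_price), ...] in time order."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, price in bid_requests:
        segment = int(timestamp // window_seconds)
        sums[segment] += price
        counts[segment] += 1
    return [(segment, sums[segment] / counts[segment]) for segment in sorted(sums)]


# Synthetic example records: (seconds since start, market price).
bid_requests = [(0, 10.0), (30, 20.0), (65, 40.0), (119, 60.0), (250, 5.0)]
series = mean_price_series(bid_requests, window_seconds=60)
print(series)
```

Segments with no bid requests are simply absent from the output here; a real time-series model would usually need those gaps filled or otherwise handled.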
4.
  • Issa, Shady, 1989- (author)
  • Techniques for Enhancing the Efficiency of Transactional Memory Systems
  • 2018
  • Doctoral thesis (other academic/artistic)
    • Transactional Memory (TM) is an emerging programming paradigm that drastically simplifies the development of concurrent applications by relieving programmers from a major source of complexity: how to ensure correct, yet efficient, synchronization of concurrent accesses to shared memory. Despite the large body of research devoted to this area, existing TM systems still suffer from severe limitations that hamper both their performance and energy efficiency. This dissertation tackles the problem of how to build efficient implementations of the TM abstraction by introducing innovative techniques that address three crucial limitations of existing TM systems by: (i) extending the effective capacity of Hardware TM (HTM) implementations; (ii) reducing the synchronization overheads in Hybrid TM (HyTM) systems; (iii) enhancing the efficiency of TM applications via energy-aware contention management schemes. The first contribution of this dissertation, named POWER8-TM (P8TM), addresses what is arguably one of the most compelling limitations of existing HTM implementations: the inability to process transactions whose footprint exceeds the capacity of the processor's cache. By leveraging, in an innovative way, two hardware features provided by IBM POWER8 processors, namely Rollback-only Transactions and Suspend/Resume, P8TM can achieve up to 7x performance gains in workloads that stress the capacity limitations of HTM. The second contribution is Dynamic Memory Partitioning-TM (DMP-TM), a novel Hybrid TM (HyTM) that offloads the cost of detecting conflicts between HTM and Software TM (STM) to off-the-shelf operating system memory protection mechanisms. DMP-TM's design is agnostic to the STM algorithm and has the key advantage of allowing for integrating, in an efficient way, highly scalable STM implementations that would, otherwise, demand expensive instrumentation of the HTM path. This allows DMP-TM to achieve up to 20x speedups compared to state-of-the-art HyTM solutions in uncontended workloads. The third contribution, Green-CM, is an energy-aware Contention Manager (CM) that has two main innovative aspects: (i) a novel asymmetric design, which combines different back-off policies in order to take advantage of Dynamic Voltage and Frequency Scaling (DVFS) hardware capabilities, available in most modern processors; (ii) an energy-efficient implementation of a fundamental building block for many CM implementations, namely, the mechanism used to back off threads for a predefined amount of time. Thanks to its innovative design, Green-CM can reduce the Energy Delay Product by up to 2.35x with respect to state-of-the-art CMs. All the techniques proposed in this dissertation share an important common feature that is essential to preserve the ease of use of the TM abstraction: the reliance on on-line self-tuning mechanisms that ensure robust performance even in the presence of heterogeneous workloads, without requiring prior knowledge of the target workloads or architecture.
5.
  • Javed Awan, Ahsan, 1988-, et al. (authors)
  • Identifying the potential of Near Data Processing for Apache Spark
  • 2017
  • In: Proceedings of the International Symposium on Memory Systems, MEMSYS 2017. - New York, NY, USA : Association for Computing Machinery (ACM). ; , pp. 60-67
  • Conference paper (peer-reviewed)
    • While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics by being a unified framework for both batch and stream data processing. There is also a renewed interest in Near Data Processing (NDP) due to technological advancement in the last decade. However, it is not known if NDP architectures can improve the performance of big data processing frameworks such as Apache Spark. In this paper, we build the case for an NDP architecture comprising programmable-logic-based hybrid 2D integrated processing-in-memory and in-storage processing for Apache Spark, by extensive profiling of Apache Spark based workloads on an Ivy Bridge server.
6.
  • Oz, Isil, et al. (authors)
  • Regression-Based Prediction for Task-Based Program Performance
  • 2019
  • In: Journal of Circuits, Systems and Computers. - : WORLD SCIENTIFIC PUBL CO PTE LTD. - 0218-1266. ; 8:4
  • Journal article (peer-reviewed)
    • As multicore systems evolve by increasing the number of parallel execution units, parallel programming models have been released to exploit parallelism in applications. The task-based programming model uses task abstractions to specify parallel tasks and schedules tasks onto processors at runtime. In order to increase efficiency and get the highest performance, it is necessary to identify which runtime configuration is needed and how processor cores must be shared among tasks. Exploring the design space for all possible scheduling and runtime options, especially for large input data, becomes infeasible and requires statistical modeling. Regression-based modeling determines the effects of multiple factors on a response variable, and makes predictions based on statistical analysis. In this work, we propose a regression-based modeling approach to predict task-based program performance for different scheduling parameters with variable data size. We execute a set of task-based programs by varying the runtime parameters, and conduct a systematic measurement of influencing factors on execution time. Our approach uses executions with different configurations for a set of input data, and derives different regression models to predict execution time for larger input data. Our results show that regression models provide accurate predictions for validation inputs with a mean error rate as low as 6.3%, and 14% on average among four task-based programs.
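The general idea in this abstract (fit a regression model on runs with smaller inputs, then predict execution time for a larger input) can be sketched with a single-factor least-squares line. This is only an illustration under stated assumptions: the abstract does not specify the paper's regression models or factors, and the input sizes and times below are synthetic values chosen to lie exactly on a line, not measurements.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic training runs: input size (arbitrary units) and execution time (s).
sizes = [1, 2, 3, 4]
times = [2.5, 4.5, 6.5, 8.5]

slope, intercept = fit_line(sizes, times)

# Extrapolate to a larger input that was not part of the training runs.
predicted_time = slope * 8 + intercept
print(slope, intercept, predicted_time)
```

A model covering several runtime parameters at once would need multiple regression rather than this one-variable fit.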