
Result list for search "hsv:(TEKNIK OCH TEKNOLOGIER) hsv:(Elektroteknik och elektronik) hsv:(Datorsystem) ;pers:(Brorsson Mats)"


Results 1-10 of 66


No.	Reference
1.	Du, M., et al. (author) Time series modeling of market price in real-time bidding. 2019. In: ESANN 2019 - Proceedings, 27th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. ESANN, pp. 643-648. Conference paper (peer-reviewed). Abstract: Real-Time Bidding (RTB) is one of the most popular online advertisement selling mechanisms. Modeling the highly dynamic bidding environment is crucial for making good bids. Market prices of auctions fluctuate heavily within short time spans. State-of-the-art methods neglect the temporal dependencies of bidders' behaviors. In this paper, the bid requests are aggregated by time and the mean market price per aggregated segment is modeled as a time series. We show that the Long Short-Term Memory (LSTM) neural network outperforms the state-of-the-art univariate time series models by capturing the nonlinear temporal dependencies in the market price. We further improve the prediction performance by adding a summary of exogenous features from bid requests.
2.	Podobas, Artur, 1982-, et al. (author) Cool-Cores: Thermal-aware Task Scheduling for OpenMP. 2010. Conference paper (peer-reviewed). Abstract: Temperature remains a limiting factor in current many-core chips. This work focuses on evaluating the temperature behaviour of different user-mode task schedulers. We model a CMP with 16 cores connected through a mesh interconnect with directory-based cache coherence. This type of system closely resembles some of the manycore architectures on the market. The algorithms we have investigated are two common OpenMP scheduling strategies: Breadth-First and Cilk. We also implemented two temperature-aware schedulers based on the Breadth-First and Cilk schedulers. We show that by enabling temperature-awareness in schedulers, the MTTF can drastically improve with insignificant execution performance losses.
3.	Alexandru, Iordan, et al. (author) Investigating the Potential of Energy-savings Using a Fine-grained Task Based Programming Model on Multi-cores. 2011. Conference paper (peer-reviewed). Abstract: In this paper we study the relation between energy efficiency and parallel execution when implemented with a fine-grained task-centric programming model. Using a simulation framework comprising an architectural simulator and a power and area estimation tool, we have investigated the potential energy savings when employing parallelism on multi-core systems. In our experiments with 2 - 8 multi-core systems, we employed frequency and voltage scaling in order to keep the relative performance of the systems constant and measured the energy efficiency using the energy-delay product. We also compared the energy consumption of the parallel execution against the serial one. Our results show that through judicious choice of load balancing parameters, significant improvements of around 200 % in energy consumption can be achieved.
4.	Awan, Ahsan Javed, 1988-, et al. (author) Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study. Other publication (other academic/artistic). Abstract: While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We compare the micro-architectural performance of batch processing and stream processing workloads in Apache Spark using hardware performance counters on a dual socket server. In our evaluation experiments, we have found that batch processing and stream processing workloads have similar micro-architectural characteristics and are bounded by the latency of frequent data accesses to DRAM. For data accesses we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14% and (iii) multiple small executors can provide up to 36% speedup over a single large executor.
5.	Awan, Ahsan Javed, 1988-, et al. (author) How Data Volume Affects Spark Based Data Analytics on a Scale-up Server. 2015. In: Big Data Benchmarks, Performance Optimization, and Emerging Hardware. Cham: Springer. ISBN 9783319290058, pp. 81-92. Conference paper (peer-reviewed). Abstract: The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark is gaining popularity for exhibiting superior scale-out performance on commodity machines, the impact of data volume on the performance of Spark based data analytics in a scale-up configuration is not well understood. We present a deep-dive analysis of Spark based applications on a large scale-up server machine. Our analysis reveals that Spark based data analytics are DRAM bound and do not benefit from using more than 12 cores for an executor. When the input data size is enlarged, application performance degrades significantly due to a substantial increase in wait time during I/O operations and garbage collection, despite a 10 % better instruction retirement rate (due to lower L1 cache misses and higher core utilization). We match memory behaviour with the garbage collector to improve the performance of applications by 1.6x to 3x.
6.	Awan, Ahsan Javed, 1988-, et al. (author) Micro-architectural Characterization of Apache Spark on Batch and Stream Processing Workloads. 2016. Conference paper (peer-reviewed). Abstract: While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We compare the micro-architectural performance of batch processing and stream processing workloads in Apache Spark using hardware performance counters on a dual socket server. In our evaluation experiments, we have found that batch processing and stream processing have the same micro-architectural behavior in Spark if the only difference between the two implementations is micro-batching. If the input data rates are small, stream processing workloads are front-end bound. However, the front-end bound stalls are reduced at larger input data rates and instruction retirement is improved. Moreover, Spark workloads using DataFrames have improved instruction retirement over workloads using RDDs.
7.	Awan, Ahsan Javed, 1988-, et al. (author) Node architecture implications for in-memory data analytics on scale-in clusters. 2016. Conference paper (peer-reviewed). Abstract: While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics. Recent studies propose scale-in clusters with in-storage processing devices to process big data analytics with Spark. However, the proposal is based solely on the memory bandwidth characterization of in-memory data analytics and does not shed light on the specification of the host CPU and memory. Through empirical evaluation of in-memory data analytics with Apache Spark on an Ivy Bridge dual socket server, we have found that (i) simultaneous multi-threading is effective up to 6 cores, (ii) data locality on NUMA nodes can improve the performance by 10% on average, (iii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%, (iv) DDR3 operating at 1333 MT/s is sufficient and (v) multiple small executors can provide up to 36% speedup over a single large executor.
8.	Awan, Ahsan Javed, 1988- (author) Performance Characterization and Optimization of In-Memory Data Analytics on a Scale-up Server. 2017. Doctoral thesis (other academic/artistic). Abstract: The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) exhibiting superior scale-out performance on commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers. This thesis characterizes the performance of in-memory data analytics with Spark on scale-up servers. Through empirical evaluation of representative benchmark workloads on a dual socket server, we have found that in-memory data analytics with Spark exhibit poor multi-core scalability beyond 12 cores due to thread-level load imbalance and work-time inflation (the additional CPU time spent by threads in a multi-threaded computation beyond the CPU time required to perform the same work in a sequential computation). We have also found that workloads are bound by the latency of frequent data accesses to the memory. When the input data size is enlarged, application performance degrades significantly due to the substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). For data accesses, we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average and (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%.
For garbage collection impact, we match memory behavior with the garbage collector to improve the performance of applications by 1.6x to 3x and recommend using multiple small Spark executors, which can provide up to 36% reduction in execution time over a single large executor. Based on the characteristics of the workloads, the thesis envisions near-memory and near-storage hardware acceleration to improve the single-node performance of scale-out frameworks like Apache Spark. Using modeling techniques, it estimates a speed-up of 4x for Apache Spark on scale-up servers augmented with near-data accelerators.
9.	Awan, Ahsan Javed, 1988- (author) Performance Characterization of In-Memory Data Analytics on a Scale-up Server. 2016. Licentiate thesis (other academic/artistic). Abstract: The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) exhibiting superior scale-out performance on commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers. This thesis characterizes the performance of in-memory data analytics with Spark on scale-up servers. Through empirical evaluation of representative benchmark workloads on a dual socket server, we have found that in-memory data analytics with Spark exhibit poor multi-core scalability beyond 12 cores due to thread-level load imbalance and work-time inflation. We have also found that workloads are bound by the latency of frequent data accesses to DRAM. When the input data size is enlarged, application performance degrades significantly due to a substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). For data accesses we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average and (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%. For GC impact, we match memory behaviour with the garbage collector to improve the performance of applications by 1.6x to 3x and recommend using multiple small executors, which can provide up to 36% speedup over a single large executor.
10.	Ayguadé, Eduard, et al. (author) OpenMP Performance Analysis in the INTONE Project. 2001. Conference paper (peer-reviewed).


Refine results

Publication type: conference paper (41); journal article (10); report (6); doctoral thesis (3); licentiate thesis (3); book (1); other publication (1); book chapter (1)

Content type: peer-reviewed (50); other academic/artistic (16)

Author/editor: Brorsson, Mats (45); Brorsson, Mats, 1962 ... (16); Vlassov, Vladimir (10); Podobas, Artur, 1982 ... (9); Ayguade, Eduard (7); Awan, Ahsan Javed, 1 ... (6); Karlsson, S. (4); Brorsson, Mats, Prof ... (4); Stenstrom, Per (4); Muddukrishna, Ananya (3); Collin, Mikael (3); Du, M. (2); Podobas, Artur (2); Popov, Konstantin (2); Vlassov, Vladimir, A ... (2); Ayani, Rassul (2); Kral, Martin (2); Stenström, Per (2); Zhang, Z. (1); Lee, S. W. (1); Vlassov, Vladimir, 1 ... (1); Issa, Shady (1); Winkler, M (1); Jonsson, Peter (1); Hallberg, J (1); Alexandru, Iordan (1); Natvig, Lasse (1); Nilsson, Håkan (1); Sandberg, L. (1); Barriga, Luis (1); Eeckhout, Lieven (1); Ayguade, Eduard, Pro ... (1); Grot, Boris (1); Brunst, H. (1); Hoppe, H. -C (1); Martorell, X. (1); Nagel, W. E. (1); Schlimbach, F. (1); Utrera, G. (1); Bao, Yan (1); Barriga, L. (1); Bhatti, Muhammad Khu ... (1); Oz, Isil (1); Bhatti, M. K. (1); Oz, I. (1); Farooq, U. (1); Palm, T. (1); Kruzela, Ivan (1); Dahlgren, Fredrik (1); Kessler, Christoph, ... (1)

Institution: Kungliga Tekniska Högskolan (66); RISE (5); Blekinge Tekniska Högskola (1)

Language: English (66)

Research subject (UKÄ/SCB): Engineering and Technology (66); Natural Sciences (1); Social Sciences (1)
