
Result list for search "WFRF:(Ayguade Eduard)"

Search: WFRF:(Ayguade Eduard)

Result 1-10 of 14


Enumeration	Reference	Cover	Find
1.	Awan, Ahsan Javed, 1988-, et al. (author) Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study Other publication (other academic/artistic)abstract While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both, batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We compare micro-architectural performance of batch processing and stream processing workloads in Apache Spark using hardware performance counters on a dual socket server. In our evaluation experiments, we have found that batch processing and stream processing workloads have similar micro-architectural characteristics and are bounded by the latency of frequent data access to DRAM. For data accesses we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14% and (iii) multiple small executors can provide up to 36% speedup over single large executor.
2.	Awan, Ahsan Javed, 1988-, et al. (author) How Data Volume Affects Spark Based Data Analytics on a Scale-up Server 2015 In: Big Data Benchmarks, Performance Optimization, and Emerging Hardware. - Cham : Springer. - 9783319290058 ; , s. 81-92 Conference paper (peer-reviewed)abstract Sheer increase in volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark is gaining popularity for exhibiting superior scale-out performance on the commodity machines, the impact of data volume on the performance of Spark based data analytics in scale-up configuration is not well understood. We present a deep-dive analysis of Spark based applications on a large scale-up server machine. Our analysis reveals that Spark based data analytics are DRAM bound and do not benefit by using more than 12 cores for an executor. By enlarging input data size, application performance degrades significantly due to substantial increase in wait time during I/O operations and garbage collection, despite 10 % better instruction retirement rate (due to lower L1 cache misses and higher core utilization). We match memory behaviour with the garbage collector to improve performance of applications between 1.6x to 3x.
3.	Awan, Ahsan Javed, 1988-, et al. (author) Micro-architectural Characterization of Apache Spark on Batch and Stream Processing Workloads 2016 Conference paper (peer-reviewed)abstract While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both, batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We compare the micro-architectural performance of batch processing and stream processing workloads in Apache Spark using hardware performance counters on a dual socket server. In our evaluation experiments, we have found that batch processing and stream processing have the same micro-architectural behavior in Spark if the only difference between the two implementations is micro-batching. If the input data rates are small, stream processing workloads are front-end bound. However, the front-end bound stalls are reduced at larger input data rates and instruction retirement is improved. Moreover, Spark workloads using DataFrames have improved instruction retirement over workloads using RDDs.
4.	Awan, Ahsan Javed, 1988-, et al. (author) Node architecture implications for in-memory data analytics on scale-in clusters 2016 Conference paper (peer-reviewed)abstract While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics. Recent studies propose scale-in clusters with in-storage processing devices to process big data analytics with Spark. However, the proposal is based solely on the memory bandwidth characterization of in-memory data analytics and does not shed light on the specification of host CPU and memory. Through empirical evaluation of in-memory data analytics with Apache Spark on an Ivy Bridge dual socket server, we have found that (i) simultaneous multi-threading is effective up to 6 cores, (ii) data locality on NUMA nodes can improve the performance by 10% on average, (iii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%, (iv) DDR3 operating at 1333 MT/s is sufficient and (v) multiple small executors can provide up to 36% speedup over single large executor.
5.	Awan, Ahsan Javed, 1988- (author) Performance Characterization and Optimization of In-Memory Data Analytics on a Scale-up Server 2017 Doctoral thesis (other academic/artistic)abstract The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) for exhibiting superior scale-out performance on the commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers. This thesis characterizes the performance of in-memory data analytics with Spark on scale-up servers. Through empirical evaluation of representative benchmark workloads on a dual socket server, we have found that in-memory data analytics with Spark exhibit poor multi-core scalability beyond 12 cores due to thread level load imbalance and work-time inflation (the additional CPU time spent by threads in a multi-threaded computation beyond the CPU time required to perform the same work in a sequential computation). We have also found that workloads are bound by the latency of frequent data accesses to the memory. By enlarging input data size, application performance degrades significantly due to the substantial increase in wait time during I/O operations and garbage collection, despite 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). For data accesses, we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%. For garbage collection impact, we match memory behavior with the garbage collector to improve the performance of applications between 1.6x to 3x and recommend using multiple small Spark executors that can provide up to 36% reduction in execution time over single large executor. Based on the characteristics of workloads, the thesis envisions near-memory and near storage hardware acceleration to improve the single-node performance of scale-out frameworks like Apache Spark. Using modeling techniques, it estimates the speed-up of 4x for Apache Spark on scale-up servers augmented with near-data accelerators.
6.	Awan, Ahsan Javed, 1988- (author) Performance Characterization of In-Memory Data Analytics on a Scale-up Server 2016 Licentiate thesis (other academic/artistic)abstract The sheer increase in volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) for exhibiting superior scale-out performance on the commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers. This thesis characterizes the performance of in-memory data analytics with Spark on scale-up servers. Through empirical evaluation of representative benchmark workloads on a dual socket server, we have found that in-memory data analytics with Spark exhibit poor multi-core scalability beyond 12 cores due to thread level load imbalance and work-time inflation. We have also found that workloads are bound by the latency of frequent data accesses to DRAM. By enlarging input data size, application performance degrades significantly due to substantial increase in wait time during I/O operations and garbage collection, despite 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). For data accesses we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%. For GC impact, we match memory behaviour with the garbage collector to improve performance of applications between 1.6x to 3x and recommend using multiple small executors that can provide up to 36% speedup over single large executor.
7.	Ayguadé, Eduard, et al. (author) OpenMP Performance Analysis in the INTONE Project 2001 Conference paper (peer-reviewed)
8.	Grass, Thomas, et al. (author) Sampled Simulation of Task-Based Programs 2019 In: IEEE Transactions on Computers. - : IEEE COMPUTER SOC. - 0018-9340 .- 1557-9956. ; 68:2, s. 255-269 Journal article (peer-reviewed)abstract Sampled simulation is a mature technique for reducing simulation time of single-threaded programs. Nevertheless, current sampling techniques do not take advantage of other execution models, like task-based execution, to provide both more accurate and faster simulation. Recent multi-threaded sampling techniques assume that the workload assigned to each thread does not change across multiple executions of a program. This assumption does not hold for dynamically scheduled task-based programming models. Task-based programming models allow the programmer to specify program segments as tasks which are instantiated many times and scheduled dynamically to available threads. Due to variation in scheduling decisions, two consecutive executions on the same machine typically result in different instruction streams processed by each thread. In this paper, we propose TaskPoint, a sampled simulation technique for dynamically scheduled task-based programs. We leverage task instances as sampling units and simulate only a fraction of all task instances in detail. Between detailed simulation intervals, we employ a novel fast-forwarding mechanism for dynamically scheduled programs. We evaluate different automatic techniques for clustering task instances and show that DBSCAN clustering combined with analytical performance modeling provides the best trade-off of simulation speed and accuracy. TaskPoint is the first technique combining sampled simulation and analytical modeling and provides a new way to trade off simulation speed and accuracy. Compared to detailed simulation, TaskPoint accelerates architectural simulation with 8 simulated threads by an average factor of 220x at an average error of 0.5 percent and a maximum error of 7.9 percent.
9.	Javed Awan, Ahsan, 1988-, et al. (author) Identifying the potential of Near Data Processing for Apache Spark 2017 In: Proceedings of the International Symposium on Memory Systems, MEMSYS 2017. - New York, NY, USA : Association for Computing Machinery (ACM). ; , s. 60-67 Conference paper (peer-reviewed)abstract While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both, batch and stream data processing. There is also a renewed interest in Near Data Processing (NDP) due to technological advancement in the last decade. However, it is not known if NDP architectures can improve the performance of big data processing frameworks such as Apache Spark. In this paper, we build the case of NDP architecture comprising programmable logic based hybrid 2D integrated processing-in-memory and in-storage processing for Apache Spark, by extensive profiling of Apache Spark based workloads on Ivy Bridge Server.
10.	Javed Awan, Ahsan, et al. (author) Performance Characterization of In-Memory Data Analytics on a Modern Cloud Server 2015 In: Proceedings - 2015 IEEE 5th International Conference on Big Data and Cloud Computing, BDCloud 2015. - : IEEE Computer Society. - 9781467371827 ; , s. 1-8 Conference paper (peer-reviewed)abstract In the last decade, data analytics have rapidly progressed from traditional disk-based processing to modern in-memory processing. However, little effort has been devoted to enhancing performance at the micro-architecture level. This paper characterizes the performance of in-memory data analytics using the Apache Spark framework. We use a single node NUMA machine and identify the bottlenecks hampering the scalability of workloads. We also quantify the inefficiencies at the micro-architecture level for various data analysis workloads. Through empirical evaluation, we show that Spark workloads do not scale linearly beyond twelve threads, due to work time inflation and thread level load imbalance. Further, at the micro-architecture level, we observe memory bound latency to be the major cause of work time inflation.


Refine your search

Type of publication: conference paper (8); journal article (2); editorial proceedings (1); other publication (1); doctoral thesis (1); licentiate thesis (1)

Type of content: peer-reviewed (10); other academic/artistic (4)

Author/Editor: Ayguade, Eduard (13); Brorsson, Mats (6); Awan, Ahsan Javed, 1 ... (6); Vlassov, Vladimir (5); Pllana, Sabri (2); Vlassov, Vladimir, 1 ... (2); Brorsson, Mats, 1962 ... (2); Winkler, M (1); Pleiter, Dirk (1); Karlsson, S. (1); Carlson, Trevor E. (1); Stenström, Per, 1957 (1); Anzt, Hartwig (1); Vlassov, Vladimir, A ... (1); Eeckhout, Lieven (1); Brorsson, Mats, Prof ... (1); Ayguade, Eduard, Pro ... (1); Grot, Boris (1); Brunst, H. (1); Hoppe, H. -C (1); Martorell, X. (1); Nagel, W. E. (1); Schlimbach, F. (1); Utrera, G. (1); Grass, Thomas (1); Ceballos, Germán, 19 ... (1); Martorell, Xavier (1); Pnevmatikatos, Dioni ... (1); Lujan, Mikel (1); Rico, Alejandro (1); Casas, Marc (1); Moreto, Miquel (1); Kuzak, Mateusz (1); Javed Awan, Ahsan, 1 ... (1); Ohara, Moriyoshi (1); Ishizaki, Kazauki (1); Javed Awan, Ahsan (1); Melo, Alba (1); Carretero, Jesus (1); Ranka, Sanjay (1); Barhen, Jacob (1); Sclocco, Alessio (1); Méhaut, Jean-Françoi ... (1); Cornelius, Herbert (1); Eigenmann, Rudolf (1); Qasem, Apan (1); Cahill, Katharine (1); Canal, Ramon (1); Chan, Jany (1); Fosler-Lussier, Eric (1)

University: Royal Institute of Technology (10); Linnaeus University (2); Uppsala University (1); Chalmers University of Technology (1)

Language: English (14)

Research subject (UKÄ/SCB): Engineering and Technology (9); Natural sciences (5); Social Sciences (1)


Copyright © LIBRIS - National Library Systems
LIBRIS.kb.se
