SwePub

Result list for search "LAR1:cth ;pers:(Tsigas Philippas 1967)"

  • Results 21-30 of 232
21.
  • Cederman, Daniel, 1981, et al. (authors)
  • A Practical Quicksort Algorithm for Graphics Processors
  • 2008
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - Berlin, Heidelberg : Springer Berlin Heidelberg. - 1611-3349 .- 0302-9743 ; 5193, pp. 246-258
  • Conference paper (peer-reviewed)
    • In this paper we present GPU-Quicksort, an efficient Quicksort algorithm suitable for highly parallel multi-core graphics processors. Quicksort has previously been considered an inefficient sorting solution for graphics processors, but we show that GPU-Quicksort often performs better than the fastest known sorting implementations for graphics processors, such as radix and bitonic sort. Quicksort can thus be seen as a viable alternative for sorting large quantities of data on graphics processors.
22.
  • Cederman, Daniel, 1981, et al. (authors)
  • A Study of the Behavior of Synchronization Methods in Commonly Used Languages and Systems
  • 2013
  • In: Proceedings of the 27th IEEE International Parallel & Distributed Processing Symposium. - 1530-2075. - 9780769549712 ; pp. 1309-1320
  • Conference paper (peer-reviewed)
    • Synchronization is a central issue in concurrency and plays an important role in the behavior and performance of modern programs. Programming languages and hardware designers are trying to provide synchronization constructs and primitives that can handle concurrency and synchronization issues efficiently. Programmers have to find a way to select the most appropriate constructs and primitives in order to gain the desired behavior and performance under concurrency. Several parameters and factors affect the choice, through complex interactions among (i) the language and the language constructs that it supports, (ii) the system architecture, (iii) possible run-time environments, virtual machine options and memory management support and (iv) applications. We present a systematic study of synchronization strategies, focusing on concurrent data structures. We have chosen concurrent data structures with different numbers of contention spots. We consider both coarse-grain and fine-grain locking strategies, as well as lock-free methods. We have investigated synchronization-aware implementations in C++, C# (.NET and Mono) and Java. Considering the machine architectures, we have studied the behavior of the implementations on both Intel's Nehalem and AMD's Bulldozer. The properties that we study are throughput and fairness under different workloads and multiprogramming execution environments. For NUMA architectures fairness is becoming as important as the typically considered throughput property. To the best of our knowledge this is the first systematic and comprehensive study of synchronization-aware implementations. This paper takes steps towards capturing a number of guiding principles and concerns for the selection of the programming environment and synchronization methods in connection to the application and the system characteristics.
23.
  • Cederman, Daniel, 1981, et al. (authors)
  • Adapting Lock-Free Concurrent Data Objects to Support a Generic Move Operation
  • 2012
  • Report (other academic/artistic)
    • In the paper "Supporting Lock-Free Composition of Concurrent Data Objects" we introduced a methodology to compose the insert and remove operations of lock-free data structures. This allowed for the creation of move operations that can atomically transfer data from one data structure to another. In this report we apply the methodology to four different types of data structures: a queue, a stack, a hash-table and a skip-list. We first show that the data structures are compatible with the methodology. We then go through the changes needed to adapt them. Code listings are provided that present the algorithms before and after modification.
24.
  • Cederman, Daniel, 1981, et al. (authors)
  • Brief announcement: Concurrent data structures for efficient streaming aggregation
  • 2014
  • In: Annual ACM Symposium on Parallelism in Algorithms and Architectures. - New York, NY, USA : ACM. - 9781450328210 ; pp. 76-78
  • Conference paper (peer-reviewed)
    • We briefly describe our study on the problem of streaming multiway aggregation [5], where large data volumes are received from multiple input streams. Multiway aggregation is a fundamental computational component in data stream management systems, requiring low-latency and high-throughput solutions. We focus on the problem of designing concurrent data structures enabling low-latency and high-throughput multiway aggregation, an issue that has been overlooked in the literature. We propose two new concurrent data structures and their lock-free linearizable implementations, supporting both order-sensitive and order-insensitive aggregate functions. Results from an extensive evaluation show significant improvement in the aggregation performance, in terms of both processing throughput and latency, over the commonly used techniques based on queues.
25.
  • Cederman, Daniel, 1981, et al. (authors)
  • Concurrent Data Structures for Efficient Streaming Aggregation
  • 2013
  • Report (other academic/artistic)
    • In many data gathering applications, information arrives in the form of continuous streams rather than finite data sets. Efficient one-pass algorithms are required to cope with high input loads. Stream processing engines support continuous queries to process data in a real-time fashion and have evolved rapidly from centralized to distributed, parallel and elastic solutions. While much effort has been put into leveraging the processing capacity of clusters of machines, less work has focused on leveraging the parallelism enabled by multi-core architectures by means of concurrent and lock-free data structures, to support the pipeline. This paper explores this aspect focusing on multiway aggregation, where large data volumes are received from multiple input streams. Multiway aggregation is crucial in contexts such as sensor networks, social media or clickstream analysis applications. We provide three enhanced aggregate operators that rely on two new concurrent data structures and their lock-free implementations, supporting both order-sensitive and order-insensitive aggregation functions. We provide an extensive study of the properties of the proposed aggregate operators and the new data structures. We also show an extensive experimental evaluation of the proposed methods, giving empirical evidence of their superiority. In this evaluation we run a variety of aggregation queries on two large datasets, one with data extracted from SoundCloud, a music social network, and one with data from a smart grid metering network. In all the experiments, the new data structures improved the aggregation performance significantly, up to one order of magnitude, in terms of both processing throughput and latency.
26.
  • Cederman, Daniel, 1981, et al. (authors)
  • Dynamic Load Balancing using Work-Stealing
  • 2012
  • In: GPU Computing Gems Jade Edition ; pp. 485-499
  • Book chapter (other academic/artistic)
    • In this chapter, we present a methodology for efficient load balancing of computational problems that can be easily decomposed into multiple tasks, but where it is hard to predict the computation cost of each task, and where new tasks are created dynamically during runtime. We present this methodology and its exploitation and feasibility in the context of graphics processors. Work-stealing allows an idle core to acquire tasks from a core that is overloaded, causing the total work to be distributed evenly among cores, while minimizing the communication costs, as tasks are only redistributed when required. This will often lead to higher throughput than using static partitioning.
27.
  • Cederman, Daniel, 1981, et al. (authors)
  • GPU-Quicksort: A practical Quicksort algorithm for graphics processors
  • 2009
  • In: Journal of Experimental Algorithmics. - 1084-6654 ; 14:4
  • Journal article (peer-reviewed)
    • In this article, we describe GPU-Quicksort, an efficient Quicksort algorithm suitable for highly parallel multicore graphics processors. Quicksort has previously been considered an inefficient sorting solution for graphics processors, but we show that in CUDA, NVIDIA's programming platform for general-purpose computations on graphical processors, GPU-Quicksort performs better than the fastest-known sorting implementations for graphics processors, such as radix and bitonic sort. Quicksort can thus be seen as a viable alternative for sorting large quantities of data on graphics processors.
28.
  • Cederman, Daniel, 1981, et al. (authors)
  • Lock-Free Concurrent Data Structures
  • 2017
  • In: Programming multi-core and many-core computing systems. - Hoboken, NJ, USA : John Wiley & Sons, Inc. - 9781119332015 ; pp. 29-58
  • Book chapter (other academic/artistic)
    • Concurrent data structures are the data sharing side of parallel programming. An implementation of a data structure is called lock-free if it allows multiple processes/threads to access the data structure concurrently and also guarantees that at least one operation among those finishes in a finite number of its own steps regardless of the state of the other operations. This chapter provides a sufficient background and intuition to help the interested reader navigate the complex research area of lock-free data structures. It offers the programmer familiarity with the subject that allows using truly concurrent methods. The chapter discusses the fundamental synchronization primitives on which efficient lock-free data structures rely. It discusses the problem of managing dynamically allocated memory in lock-free concurrent data structures and general concurrent environments. It also covers the idiosyncratic architectural features of graphics processors that are important to consider when designing efficient lock-free concurrent data structures for this emerging area.
29.
  • Cederman, Daniel, 1981, et al. (authors)
  • Lock-free Concurrent Data Structures
  • 2013
  • Journal article (other academic/artistic)
    • Concurrent data structures are the data sharing side of parallel programming. Data structures give the program the means to store data, but also provide operations for the program to access and manipulate these data. These operations are implemented through algorithms that have to be efficient. In the sequential setting, data structures are crucially important for the performance of the respective computation. In the parallel programming setting, their importance becomes even greater because of the increased use of data and resource sharing for utilizing parallelism. The first and main goal of this chapter is to provide a sufficient background and intuition to help the interested reader navigate the complex research area of lock-free data structures. The second goal is to offer the programmer familiarity with the subject that will allow her to use truly concurrent methods.
30.
  • Cederman, Daniel, 1981, et al. (authors)
  • On Dynamic Load Balancing on Graphics Processors
  • 2008
  • In: Proceedings of the 23rd SIGGRAPH/Eurographics Conference on Graphics Hardware. - 1727-3471. - 9783905674095 ; 2008, pp. 57-64
  • Conference paper (peer-reviewed)
    • To get maximum performance on the many-core graphics processors it is important to have an even balance of the workload so that all processing units contribute equally to the task at hand. This can be hard to achieve when the cost of a task is not known beforehand and when new sub-tasks are created dynamically during execution. With the recent advent of scatter operations and atomic hardware primitives it is now possible to bring some of the more elaborate dynamic load balancing schemes from the conventional SMP systems domain to the graphics processor domain. We have compared four different dynamic load balancing methods to see which one is most suited to the highly parallel world of graphics processors. Three of these methods were lock-free and one was lock-based. We evaluated them on the task of creating an octree partitioning of a set of particles. The experiments showed that synchronization can be very expensive and that new methods that take more advantage of the graphics processors' features and capabilities might be required. They also showed that lock-free methods achieve better performance than blocking and that they can be made to scale with increased numbers of processing units.