SwePub
Tyck till om SwePub Sök här!
Sök i SwePub databas

  Utökad sökning

Träfflista för sökning "WFRF:(Chatterjee Bapi 1982) "

Sökning: WFRF:(Chatterjee Bapi 1982)

  • Resultat 1-10 av 12
Sortera/gruppera träfflistan
   
NumreringReferensOmslagsbildHitta
1.
  • Cederman, Daniel, 1981, et al. (författare)
  • A Study of the Behavior of Synchronization Methods in Commonly Used Languages and Systems
  • 2013
  • Ingår i: Proceedings of the 27th IEEE International Parallel & Distributed Processing Symposium. - 1530-2075. - 9780769549712 ; , s. 1309-1320
  • Konferensbidrag (refereegranskat)abstract
    • Synchronization is a central issue in concurrency and plays an important role in the behavior and performance of modern programmes. Programming languages and hardware designers are trying to provide synchronization constructs and primitives that can handle concurrency and synchronization issues efficiently. Programmers have to find a way to select the most appropriate constructs and primitives in order to gain the desired behavior and performance under concurrency. Several parameters and factors affect the choice,through complex interactions among (i) the language and the language constructs that itsupports, (ii) the system architecture, (iii) possible run-time environments,virtual machine options and memory management support and(iv) applications.We present a systematic study ofsynchronizationstrategies, focusing on concurrent data structures.We have chosen concurrent data structures with different number of contention spots.We consider both coarse-grain and fine-grain locking strategies, as well as lock-free methods.We have investigated synchronization-aware implementations in C++, C# (.NET and Mono) and Java.Considering the machine architectures, we have studied the behavior of theimplementations on both Intel's Nehalem and AMD's Bulldozer.The properties that we study are throughput and fairness under different workloads and multiprogramming execution environments.For NUMA architectures fairness is becoming as important as the typically considered throughput property.To the best of our knowledge this is the first systematic and comprehensive study of synchronization-aware implementations.This paper takes steps towards capturing a number of guidingprinciples and concerns for the selection of the programming environmentand synchronization methods in connection to the application and the system characteristics.
  •  
2.
  • Cederman, Daniel, 1981, et al. (författare)
  • Understanding the Performance of Concurrent Data Structures on Graphics Processors
  • 2012
  • Ingår i: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - Berlin, Heidelberg : Springer Berlin Heidelberg. - 1611-3349 .- 0302-9743. - 9783642328190 ; 7484/2012, s. 883-894
  • Konferensbidrag (refereegranskat)abstract
    • In this paper we revisit the design of concurrent data structures -- specifically queues -- and examine their performance portabilitywith regard to the move from conventional CPUs to graphics processors. We have looked at both lock-based and lock-free algorithmsand have, for comparison, implemented and optimized the same algorithms on both graphics processors and multi-core CPUs.Particular interest has been paid to study the difference between the old Tesla and the new Fermi and Kepler architecturesin this context.We provide a comprehensive evaluation and analysis of our implementations on all examined platforms.Our results indicate that the queues are in general performance portable, but that platform specific optimizations are possibleto increase performance. The Fermi and Kepler GPUs, with optimized atomic operations, are observed to provide excellent scalabilityfor both lock-based and lock-free queues.
  •  
3.
  • Chatterjee, Bapi, 1982, et al. (författare)
  • Concurrent Linearizable Nearest Neighbour Search in LockFree-kD-tree
  • 2015
  • Rapport (övrigt vetenskapligt/konstnärligt)abstract
    • The Nearest neighbour search (NNS) is an important problem in a large number of application domains dealing with multidimensional data. In concurrent settings, where dynamic modi?cations are allowed, a linearizable implementation of NNS is highly desirable to discover the latest nearest neighbour of a given target data-point. In this paper, we introduce the LockFree-kD-tree (LFkD-tree): a lock-free concurrent kD-tree, which implements an abstract data type (ADT) that provides the operations Add, Remove, Contains, and NNS. Our implementation is linearizable. The operations in the LFkD-tree use single-word read and compare-and-swap (CAS) atomic primitives, which are readily supported on commonly available multi-core processors. We experimentally evaluate the LFkD-tree using several benchmarks comprising real-world and synthetic datasets. The experiments show that the presented design is scalable and achieves significant speed-up compared to the implementations of an existing sequential kD-tree and a recently proposed multidimensional indexingstructure, PH-tree.
  •  
4.
  • Chatterjee, Bapi, 1982, et al. (författare)
  • Concurrent linearizable nearest neighbour search in lockfree-kd-Tree
  • 2018
  • Ingår i: ACM International Conference Proceeding Series. - New York, NY, USA : ACM. ; Part F133180
  • Konferensbidrag (refereegranskat)abstract
    • The Nearest neighbour search (NNS) is a fundamental problem in many application domains dealing with multidimensional data. In a concurrent setting, where dynamic modi-fications are allowed, a linearizable implementation of NNS is highly desirable. This paper introduces the LockFree-kD-Tree (LFkD-Tree): A lock-free concurrent kD-Tree, which implements an abstract data type (ADT) that provides the operations Add, Remove, Contains, and NNS. Our implementation is linearizable. The operations in the LFkD-Tree use single-word read and compare-And-swap (CAS) atomic primitives, which are readily supported on available multi-core processors. We experimentally evaluate the LFkD-Tree using several benchmarks comprising real-world and synthetic datasets. The experiments show that the presented design is scalable and achieves signi cant speed-up compared to the implementations of an existing sequential kD-Tree and a recently proposed multidimensional indexing structure, PH-Tree. © 2018 Copyright held by the owner/author(s).
  •  
5.
  • Chatterjee, Bapi, 1982, et al. (författare)
  • Concurrent linearizable nearest neighbour search in LockFree-kD-tree
  • 2021
  • Ingår i: Theoretical Computer Science. - : Elsevier BV. - 0304-3975. ; 886, s. 27-48
  • Tidskriftsartikel (refereegranskat)abstract
    • The Nearest neighbour search (NNS) is a fundamental problem in many application domains dealing with multidimensional data. In a concurrent setting, where dynamic modifications are allowed, a linearizable implementation of the NNS is highly desirable. This paper introduces the LockFree-kD-tree (LFkD-tree ): a lock-free concurrent kD-tree, which implements an abstract data type (ADT) that provides the operations Add, Remove, Contains, and NNS. Our implementation is linearizable. The operations in the LFkD-tree use single-word read and compare-and-swap ([Formula presented] ) atomic primitives, which are readily supported on available multi-core processors. We experimentally evaluate the LFkD-tree using several benchmarks comprising real-world and synthetic datasets. The experiments show that the presented design is scalable and achieves significant speed-up compared to the implementations of an existing sequential kD-tree and a recently proposed multidimensional indexing structure, PH-tree.
  •  
6.
  • Chatterjee, Bapi, 1982 (författare)
  • Efficient Implementation of Concurrent Data Structures on Multi-core and Many-core Architectures
  • 2015
  • Licentiatavhandling (övrigt vetenskapligt/konstnärligt)abstract
    • Synchronization of concurrent threads is the central problem in order to design efficient concurrent data-structures. The compute systems widely available in market are increasingly becoming heterogeneous involving multi-core Central Processing Units (CPUs) and many-core Graphics Processing Units (GPUs). This thesis contributes to the research of efficient synchronization in concurrent data-structures in more than one way. It is divided into two parts. In the first part, a novel design of a Set Abstract Data Type (ADT) based on an efficient lock-free Binary Search Tree (BST) with improved amortized bounds of the time complexity of set operations - Add, Remove and Contains, is presented. In the second part, a comprehensive evaluation of concurrent Queue implementations on multi-core CPUs as well as many-core GPUs are presented.Efficient Lock-free BST -To the best of our knowledge, the lock-free BST presented in this thesis is the first to achieve an amortized complexity of O(H(n)+c) for all Set operations where H(n) is the height of a BST on n nodes and c is the contention measure. Also, the presented lock-free algorithm of BST comes with an improved disjoint-access-parallelism compared to the previously existing concurrent BST algorithms. This algorithm uses single-word compare-and-swap (CAS) primitives. The presented algorithm is linearizable. We implemented the algorithm in Java and it shows good scalability.Evaluation of concurrent data-structures - We have evaluated the performance of a number of concurrent FIFO Queue algorithms on multi-core CPUs and many-core GPUs. We studied the portability of existing design of concurrent Queues from CPUs to GPUs which are inherently designed for SIMD programs. We observed that in general concurrent queues offer them to efficient implementation on GPUs with faster cache memory and better performance support for atomic synchronization primitives such as CAS. To the best of our knowledge, this is the first attempt to evaluate a concurrent data-structure on GPUs.
  •  
7.
  • Chatterjee, Bapi, 1982, et al. (författare)
  • Efficient lock-free binary search trees
  • 2014
  • Ingår i: 2014 ACM Symposium on Principles of Distributed Computing, PODC 2014; Paris; France; 15 July 2014 through 18 July 2014. - New York, NY, USA : ACM. - 9781450329446 ; , s. 322-331
  • Konferensbidrag (refereegranskat)abstract
    • In this paper we present a novel algorithm for concurrent lock-free internal binary search trees (BST) and implement a Set abstract data type (ADT) based on that. We show that in the presented lock-free BST algorithm the amortized step complexity of each set operation - ADD, REMOVE and CONTAINS - is O(H(n) + c), where H(n) is the height of the BST with n number of nodes and c is the contention during the execution. Our algorithm adapts to contention measures according to read-write load. If the situation is read-heavy, the operations avoid helping the concurrent REMOVE operations during traversal, and adapt to interval contention. However, for the write-heavy situations we let an operation help a concurrent REMOVE, even though it is not obstructed. In that case, an operation adapts to point contention. It uses single-word compare-and-swap (CAS) operations. We show that our algorithm has improved disjoint-access-parallelism compared to similar existing algorithms. We prove that the presented algorithm is linearizable. To the best of our knowledge, this is the first algorithm for any concurrent tree datastructure in which the modify operations are performed with an additive term of contention measure.
  •  
8.
  • Chatterjee, Bapi, 1982, et al. (författare)
  • Help-optimal and Language-portable Lock-free Concurrent Data Structures
  • 2016
  • Rapport (övrigt vetenskapligt/konstnärligt)abstract
    • Helping is the most common mechanism to guarantee lock-freedom in many concurrent data structures. An optimized helping strategy improves the overall performance of a lock-free algorithm. In this paper, we propose help-optimality, which essentially implies that no operationstep is accounted for exclusive helping in the lock-free synchronization of concurrent operations. To describe the concept, we revisit the designs of a lock-free linked-list and a lock-free binary search tree and present improved algorithms. Our algorithms employ atomic single-word compare-and-swap (CAS) primitives and are linearizable.Additionally, we do not use a language/platform speci?c mechanism to modulate helping, speci?cally, we use neither bit-stealing from a pointer nor runtime type introspection of objects, making the algorithms language-portable. Further, to optimize the amortized numberof steps per operation, if a CAS execution to modify a shared pointer fails, we obtain a fresh set of thread-local variables without restarting an operation from scratch.We use several micro-benchmarks in both C/C++ and Java to validate the e?ciency of our algorithms against existing state-of-the-art. The experiments show that the algorithms are scalable. Our implementations perform on a par with highly optimized ones and in manycases yield 10%-50% higher throughput.
  •  
9.
  • Chatterjee, Bapi, 1982, et al. (författare)
  • Help-Optimal and Language-Portable Lock-Free Concurrent Data Structures
  • 2016
  • Ingår i: 45th International Conference on Parallel Processing (ICPP), 2016. - 0190-3918. - 9781509028238 ; 2016 september, s. 360-369
  • Konferensbidrag (refereegranskat)abstract
    • Helping is a widely used technique to guarantee lock-freedom in many concurrent data structures. An optimized helping strategy improves the overall performance of a lock-free algorithm. In this paper, we propose help-optimality, which essentially implies that no operation step is accounted for exclusive helping in the lock-free synchronization of concurrent operations. To describe the concept, we revisit the designs of a lock-free linked-list and a lock-free binary search tree and present improved algorithms. Our algorithms employ atomic single-word compare-and-swap (CAS) primitives and are linearizable. We design the algorithms without using any language/platformspecific mechanism. Specifically, we use neither bit-stealing froma pointer nor runtime type introspection of objects. Thus, our algorithms are language-portable. Further, to optimize the amortized number of steps per operation, if a CAS execution tomodify a shared pointer fails, we obtain a fresh set of thread-local variables without restarting an operation from scratch. We use several micro-benchmarks in both C/C++ and Java to validate the efficiency of our algorithms against existing state-of-the-art. The experiments show that the algorithms are scalable. Our implementations perform on a par with highly optimizedones and in many cases yield 10%-50% higher throughput.
  •  
10.
  • Chatterjee, Bapi, 1982 (författare)
  • Lock-free Concurrent Search
  • 2017
  • Doktorsavhandling (övrigt vetenskapligt/konstnärligt)abstract
    • The contemporary computers typically consist of multiple computing cores with high compute power. Such computers make excellent concurrent asynchronous shared memory system. On the other hand, though many celebrated books on data structure and algorithm provide a comprehensive study of sequential search data structures, unfortunately, we do not have such a luxury if concurrency comes in the setting. The present dissertation aims to address this paucity. We describe novel lock-free algorithms for concurrent data structures that target a variety of search problems.(i) Point search (membership query, predecessor query, nearest neighbour query) for 1-dimensional data: Lock-free linked-list; lock-free internal and external binary search trees (BST).(ii) Range search for 1-dimensional data: A range search method for lock-free ordered set data structures - linked-list, skip-list and BST.(iii) Point search for multi-dimensional data: Lock-free kD-tree, specially, a generic method for nearest neighbour search.We prove that the presented algorithms are linearizable i.e. the concurrent data structure operations intuitively display their sequential behaviour to an observer of the concurrent system. The lock-freedom in the introduced algorithms guarantee overall progress in an asynchronous shared memory system. We present the amortized analysis of lock-free data structures to show their efficiency. Moreover, we provide sample implementations of the algorithms and test them over extensive micro-benchmarks. Our experiments demonstrate that the implementations are scalable and perform well when compared to related existing alternative implementations on common multi-core computers.Our focus is on propounding the generic methodologies for efficient lock-free concurrent search. In this direction, we present the notion of help-optimality, which captures the optimization of amortized step complexity of the operations. In addition to that, we explore the language-portable design of lock-free data structures that aims to simplify an implementation from programmer’s point of view. Finally, our techniques to implement lock-free linearizable range search and nearest neighbour search are independent of the underlying data structures and thus are adaptive to similar data structures.
  •  
Skapa referenser, mejla, bekava och länka
  • Resultat 1-10 av 12

Kungliga biblioteket hanterar dina personuppgifter i enlighet med EU:s dataskyddsförordning (2018), GDPR. Läs mer om hur det funkar här.
Så här hanterar KB dina uppgifter vid användning av denna tjänst.

 
pil uppåt Stäng

Kopiera och spara länken för att återkomma till aktuell vy