SwePub
Search the SwePub database


Result list for search "WFRF:(Haridi Seif 1953 )"


  • Results 1-10 of 58
1.
  • Niazi, Salman, 1982- (author)
  • Scaling Distributed Hierarchical File Systems Using NewSQL Databases
  • 2018
  • Doctoral thesis (other academic/artistic) abstract
    • For many years, researchers have investigated the use of database technology to manage file system metadata, with the goal of providing extensible typed metadata and support for fast, rich metadata search. However, earlier attempts failed mainly due to the reduced performance introduced by adding database operations to the file system's critical path. Recent improvements in the performance of distributed in-memory online transaction processing databases (NewSQL databases) led us to re-investigate the possibility of using a database to manage file system metadata, but this time for a distributed, hierarchical file system, the Hadoop Distributed File System (HDFS). The single-host metadata service of HDFS is a well-known bottleneck for both the size of HDFS clusters and their throughput. In this thesis, we detail the algorithms, techniques, and optimizations used to develop HopsFS, an open-source, next-generation distribution of HDFS that replaces the main scalability bottleneck in HDFS, the single-node in-memory metadata service, with a shared-nothing distributed system built on a NewSQL database. In particular, we discuss how we exploit recent high-performance features of NewSQL databases, such as application-defined partitioning, partition-pruned index scans, and distribution-aware transactions, as well as more traditional techniques such as batching and write-ahead caches, to enable a revolution in distributed hierarchical file system performance. HDFS's design is optimized for the storage of large files, that is, files ranging from megabytes to terabytes in size. However, in many production deployments of HDFS, it has been observed that almost 20% of the files in the system are less than 4 KB in size and as much as 42% of all file system operations are performed on files less than 16 KB in size. HopsFS introduces a tiered storage solution to store files of different sizes more efficiently.
The tiers range from the highest tier, where an in-memory NewSQL database stores very small files (<1 KB), to the next tier, where small files (<64 KB) are stored on solid-state drives (SSDs), also using a NewSQL database, to the largest tier, the existing Hadoop block storage layer for very large files. Our approach is based on extending HopsFS with an inode stuffing technique, where we embed the contents of small files with the metadata and use database transactions and database replication guarantees to ensure the availability, integrity, and consistency of the small files. HopsFS enables significantly larger cluster sizes, more than an order of magnitude higher throughput, and significantly lower client latencies for large clusters. Lastly, coordination is an integral part of distributed file system operation protocols. We present a novel leader election protocol for partially synchronous systems that uses NewSQL databases as shared memory. Our work enables HopsFS, which already uses a NewSQL database, to avoid the operational overhead of managing an additional third-party service for leader election, while delivering performance comparable to a leader election implementation using a state-of-the-art distributed coordination service, ZooKeeper.
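The inode stuffing technique described in the abstract above can be sketched in a few lines: store the contents of sufficiently small files inline in the metadata row itself, so a single metadata transaction serves the whole read, while large files keep only a pointer into the block storage layer. The schema, the 64 KB threshold, and the use of SQLite as a stand-in for a NewSQL database are all illustrative assumptions, not the actual HopsFS implementation.

```python
import sqlite3

SMALL_FILE_LIMIT = 64 * 1024  # hypothetical threshold, echoing the small-file tier above

# A toy metadata table: small file contents are "stuffed" into the inode row
# itself, so a read needs only one transactional metadata lookup; large files
# store a reference into an external block store instead.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inodes (
        path    TEXT PRIMARY KEY,
        size    INTEGER NOT NULL,
        data    BLOB,      -- inline contents for small files
        blockid TEXT       -- external block reference for large files
    )
""")

def create_file(path: str, contents: bytes) -> None:
    if len(contents) <= SMALL_FILE_LIMIT:
        # Small file: embed contents with the metadata in one transaction.
        conn.execute("INSERT INTO inodes VALUES (?, ?, ?, NULL)",
                     (path, len(contents), contents))
    else:
        # Large file: metadata only; contents would go to the block storage layer.
        blockid = f"blk-{abs(hash(path))}"
        conn.execute("INSERT INTO inodes VALUES (?, ?, NULL, ?)",
                     (path, len(contents), blockid))
    conn.commit()

def read_small_file(path: str) -> bytes:
    row = conn.execute("SELECT data FROM inodes WHERE path = ?", (path,)).fetchone()
    return row[0]

create_file("/logs/tiny.txt", b"hello")
print(read_small_file("/logs/tiny.txt"))  # b'hello'
```

The point of the sketch is that the small-file read path never touches the block storage layer: availability and consistency of inlined contents ride on the database's own transaction and replication guarantees.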
2.
  • Ismail, Mahmoud (author)
  • Distributed File System Metadata and its Applications
  • 2020
  • Doctoral thesis (other academic/artistic) abstract
    • Distributed hierarchical file systems typically decouple the storage and serving of the file metadata from the file contents (file system blocks) to enable the file system to scale to store more data and support higher throughput. We designed HopsFS to take the scalability of the file system one step further by also decoupling the storage and serving of the file system metadata. HopsFS is an open-source, next-generation distribution of the Apache Hadoop Distributed File System (HDFS) that replaces the main scalability bottleneck in HDFS, the single-node in-memory metadata service, with a distributed metadata service built on a NewSQL database (NDB). HopsFS stores the file system's metadata fully normalized in NDB and uses locking primitives and application-defined locks to ensure strongly consistent metadata. In this thesis, we leverage the consistent distributed hierarchical file system metadata provided by HopsFS to efficiently build new classes of applications that are tightly coupled with the file system, as well as to improve the internal file system operations. First, we introduce hbr, a new block reporting protocol for HopsFS that removes a scalability bottleneck that prevented HopsFS from scaling to tens of thousands of servers. Second, we introduce HopsFS-CL, a highly available cloud-native distribution of HopsFS that deploys the file system across availability zones in the cloud while maintaining the same file system semantics. Third, we introduce HopsFS-S3, a highly available cloud-native distribution of HopsFS that uses object stores as a backend for the block storage layer in the cloud, again while maintaining the same file system semantics. Fourth, we introduce ePipe, a databus that both creates a consistent change stream for HopsFS and eventually delivers the correctly ordered stream with low latency to downstream clients.
That is, ePipe extends HopsFS with a change-data-capture (CDC) API that provides not only efficient file system notifications but also enables polyglot storage for file system metadata. Polyglot storage enables us to offload metadata queries to a more appropriate engine: we use Elasticsearch to provide free-text search of the file system namespace to demonstrate this capability. Finally, we introduce Hopsworks, a scalable, project-based multi-tenant big data platform that provides support for collaborative development and operations for teams through extended metadata.
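The "correctly ordered stream" that the ePipe abstract promises can be illustrated with a toy reordering buffer: change events may arrive out of order, but are released to downstream clients only as a gap-free, id-ordered prefix. The transaction ids, event strings, and class name below are invented for the example and do not reflect ePipe's real interface.

```python
import heapq

class ChangeStream:
    """Toy change-data-capture buffer: events arrive tagged with a
    monotonically increasing transaction id, possibly out of order,
    and are delivered only in id order with no gaps."""

    def __init__(self):
        self._heap = []          # pending events, ordered by transaction id
        self._next_txid = 1      # next id we are allowed to release

    def ingest(self, txid: int, event: str) -> list:
        heapq.heappush(self._heap, (txid, event))
        delivered = []
        # Release the contiguous prefix of the stream that is now complete.
        while self._heap and self._heap[0][0] == self._next_txid:
            delivered.append(heapq.heappop(self._heap)[1])
            self._next_txid += 1
        return delivered

stream = ChangeStream()
print(stream.ingest(2, "mkdir /b"))   # [] - tx 1 not yet seen, so nothing is delivered
print(stream.ingest(1, "mkdir /a"))   # ['mkdir /a', 'mkdir /b']
```

Holding back event 2 until event 1 arrives is what makes the downstream view (e.g. an Elasticsearch index) consistent with the file system's own operation order.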
3.
  • Aberer, Karl, et al. (authors)
  • The Essence of P2P : A Reference Architecture for Overlay Networks
  • 2005
  • In: Fifth IEEE International Conference on Peer-to-Peer Computing, Proceedings. ISBN 0769523765. pp. 11-20
  • Conference paper (peer-reviewed) abstract
    • The success of the P2P idea has created a huge diversity of approaches, among which overlay networks, for example, Gnutella, Kazaa, Chord, Pastry, Tapestry, P-Grid, or DKS, have received specific attention from both developers and researchers. A wide variety of algorithms, data structures, and architectures have been proposed. The terminologies and abstractions used, however, have become quite inconsistent, since the P2P paradigm has attracted people from many different communities, e.g., networking, databases, distributed systems, graph theory, complexity theory, and biology. In this paper we propose a reference model for overlay networks that is capable of modeling different approaches in this domain in a generic manner. It is intended to allow researchers and users to assess the properties of concrete systems, to establish a common vocabulary for scientific discussion, to facilitate the qualitative comparison of the systems, and to serve as the basis for defining a standardized API to make overlay networks interoperable.
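The standardized API this paper argues for can be imagined as a tiny route/join/leave interface that every concrete overlay (Chord, Pastry, P-Grid, ...) would implement over its own topology. The naive sorted ring below is a deliberately simple stand-in under that assumption, not any real protocol's routing algorithm.

```python
class Overlay:
    """Minimal generic overlay-network interface: join/leave maintain
    membership, route(key) returns the node responsible for a key.
    The flat sorted identifier ring is purely illustrative."""

    def __init__(self):
        self.nodes = []  # sorted node identifiers

    def join(self, node_id: int) -> None:
        self.nodes.append(node_id)
        self.nodes.sort()

    def leave(self, node_id: int) -> None:
        self.nodes.remove(node_id)

    def route(self, key: int) -> int:
        # Consistent-hashing style: a key is served by the first node
        # clockwise from it on the identifier ring.
        for n in self.nodes:
            if n >= key:
                return n
        return self.nodes[0]  # wrap around the ring

ring = Overlay()
for n in (10, 40, 90):
    ring.join(n)
print(ring.route(42))  # 90
print(ring.route(95))  # 10 (wrap-around)
```

A shared signature like this is exactly what would let applications swap one overlay for another, which is the interoperability goal the reference model serves.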
4.
  • Alsayfi, Majed S., et al. (authors)
  • Big Data in Vehicular Cloud Computing : Review, Taxonomy, and Security Challenges
  • 2022
  • In: ELEKTRONIKA IR ELEKTROTECHNIKA. Kaunas University of Technology (KTU). ISSN 1392-1215, e-ISSN 2029-5731. 28:2, pp. 59-71
  • Research review (peer-reviewed) abstract
    • Modern vehicles equipped with various smart sensors have become not only a means of transportation but also a means of collecting, creating, computing, processing, and transferring data while traveling through modern and rural cities. A traditional vehicular ad hoc network (VANET) cannot handle the enormous and complex data collected by modern vehicle sensors (e.g., cameras, lidar, and global positioning systems (GPS)), because these data require rapid processing, analysis, management, storage, and uploading to trusted national authorities. Furthermore, integrating VANET with cloud computing yields a new concept, vehicular cloud computing (VCC), which overcomes the limitations of VANET, brings new services and applications to vehicular networks, and generates a massive amount of data compared to the data collected by individual vehicles alone. Therefore, this study explores the importance of big data in VCC. First, we provide an overview of traditional vehicular networks and their limitations. Then, we investigate the relationship between VCC and big data, fundamentally focusing on how VCC can generate, transmit, store, upload, and process big data to share it among vehicles on the road. Subsequently, a new taxonomy of big data in VCC is presented. Finally, the security challenges in big data-based VCC are discussed.
5.
  • Alsayfi, Majed S., et al. (authors)
  • Securing Real-Time Video Surveillance Data in Vehicular Cloud Computing : A Survey
  • 2022
  • In: IEEE Access. Institute of Electrical and Electronics Engineers (IEEE). ISSN 2169-3536. 10, pp. 51525-51547
  • Journal article (peer-reviewed) abstract
    • Vehicular ad hoc networks (VANETs) have received a great deal of interest, especially in wireless communications technology. In VANETs, vehicles are equipped with various intelligent sensors that can collect real-time data from inside the vehicle and from surrounding vehicles. These real-time data require powerful computation, processing, and storage. However, VANETs cannot manage these real-time data because of the limited storage capacity of the on-board unit (OBU). To address this limitation, a new concept has been proposed in which a VANET is integrated with cloud computing to form vehicular cloud computing (VCC) technology. VCC can manage real-time services, such as the real-time video surveillance data used for monitoring critical events on the road. These real-time video surveillance data include highly sensitive data that should be protected against intruders in the networks, because any manipulation, alteration, or sniffing of the data will affect a driver's life by causing improper decision-making. The security and privacy of real-time video surveillance data are major challenges in VCC. Therefore, this study reviews the importance of the security and privacy of real-time video data in VCC. First, we provide an overview of VANETs and their limitations. Second, we provide a state-of-the-art taxonomy for real-time video data in VCC. Then, the importance of real-time video surveillance data in both fifth-generation (5G) and sixth-generation (6G) networks is presented. Finally, the challenges and open issues of real-time video data in VCC are discussed.
6.
  • Basloom, Huda, et al. (authors)
  • A Parallel Hybrid Testing Technique for Tri-Programming Model-Based Software Systems
  • 2023
  • In: Computers, Materials and Continua. Tech Science Press. ISSN 1546-2218, e-ISSN 1546-2226. 74:2, pp. 4501-4530
  • Journal article (peer-reviewed) abstract
    • Recently, researchers have shown increasing interest in combining more than one programming model in systems running on high-performance computing (HPC) systems to achieve exascale performance by applying parallelism at multiple levels. Combining different programming paradigms, such as Message Passing Interface (MPI), Open Multi-Processing (OpenMP), and Open Accelerators (OpenACC), can increase computation speed and improve performance. During the integration of multiple models, the probability of runtime errors increases, and their detection is difficult, especially in the absence of testing techniques that can detect these errors. Numerous studies have been conducted to identify such errors, but no technique exists for detecting errors in three-level programming models. Despite the growing body of research that integrates the three programming models MPI, OpenMP, and OpenACC, no testing technique has been developed to detect the runtime errors, such as deadlocks and race conditions, that can arise from this integration. Therefore, this paper begins with a definition and explanation of the runtime errors that result from integrating the three programming models and that compilers cannot detect. For the first time, this paper presents a classification of the runtime errors that can result from the integration of the three models. The paper also proposes a parallel hybrid testing technique for detecting runtime errors in systems built in the C++ programming language using the triple programming models MPI, OpenMP, and OpenACC. This hybrid technique combines static and dynamic analysis, given that some errors can be detected statically, whereas others can only be detected at runtime. The hybrid technique can detect more errors because it combines two distinct approaches. The proposed static analysis detects a wide range of error types in less time, whereas the potential errors that may or may not occur depending on the operating environment are left to the dynamic analysis, which completes the validation.
7.
  • Basloom, Huda Saleh, et al. (authors)
  • Errors Classification and Static Detection Techniques for Dual-Programming Model (OpenMP and OpenACC)
  • 2022
  • In: IEEE Access. Institute of Electrical and Electronics Engineers (IEEE). ISSN 2169-3536. 10, pp. 117808-117826
  • Journal article (peer-reviewed) abstract
    • Recently, incorporating more than one programming model into a system designed for high-performance computing (HPC) has become a popular way to implement parallel systems. Since traditional programming languages, such as C, C++, and Fortran, do not support parallelism at the level of multi-core processors and accelerators, many programmers add one or more programming models to achieve parallelism and accelerate computation efficiently. These models include Open Accelerators (OpenACC) and Open Multi-Processing (OpenMP), which have recently been used with various other models, including Message Passing Interface (MPI) and Compute Unified Device Architecture (CUDA). Because the behavior of threads is difficult to predict, runtime errors cannot be anticipated before execution: the compiler cannot identify runtime errors such as data races, race conditions, deadlocks, or livelocks. Many studies have been conducted on the development of testing tools that detect runtime errors in programs combining programming models, such as OpenACC with MPI and OpenMP with MPI. Although more and more applications use OpenACC and OpenMP together, no testing tools have been developed to test such applications to date. This paper presents a tool for detecting runtime errors using a static testing technique. The tool can detect actual and potential runtime errors arising from the integration of the OpenACC and OpenMP models in systems developed in C++. The tool implements error dependency graphs, which are proposed in this paper. Additionally, a dependency graph of the errors is provided, along with a classification of the runtime errors that result from combining the two programming models.
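The error dependency graphs proposed in the two papers above can be pictured as a small directed graph over runtime-error classes: an edge from A to B means that detecting A implies B may also arise, so a static phase that flags definite errors can hand the reachable set of potential errors to the dynamic phase for validation. The node names, edges, and traversal below are hypothetical illustrations, not the authors' actual classification.

```python
# Hypothetical miniature of an error dependency graph: keys are detected
# error classes, values are error classes that may follow from them.
ERROR_DEPS = {
    "unmatched_mpi_send": {"deadlock"},
    "unsynchronized_shared_write": {"data_race", "race_condition"},
    "data_race": {"nondeterministic_output"},
}

def potential_errors(detected: set) -> set:
    """Transitive closure of the dependency graph from the statically
    detected errors: everything the dynamic phase still has to check."""
    frontier, reachable = set(detected), set()
    while frontier:
        err = frontier.pop()
        for dep in ERROR_DEPS.get(err, ()):
            if dep not in reachable:
                reachable.add(dep)
                frontier.add(dep)
    return reachable

print(sorted(potential_errors({"unsynchronized_shared_write"})))
# ['data_race', 'nondeterministic_output', 'race_condition']
```

The division of labor this models is the one both abstracts describe: cheap static detection first, with the environment-dependent remainder deferred to runtime checking.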
8.
  •  
9.
  • Carbone, Paris, et al. (authors)
  • Cutty : Aggregate Sharing for User-Defined Windows
  • 2016
  • In: Proceedings of the 25th ACM International Conference on Information and Knowledge Management. New York, NY, USA: Association for Computing Machinery (ACM). ISBN 9781450340731. pp. 1201-1210
  • Conference paper (peer-reviewed) abstract
    • Aggregation queries on data streams are evaluated over evolving and often overlapping logical views called windows. While the aggregation of periodic windows was extensively studied in the past through aggregate sharing techniques such as Panes and Pairs, little to no work has gone into optimizing the aggregation of very common, non-periodic windows. Typical examples of non-periodic windows are punctuations and sessions, which can implement complex business logic and are often expressed as user-defined operators on platforms such as Google Dataflow or Apache Storm. The aggregation of such non-periodic or user-defined windows either falls back to expensive, best-effort aggregate sharing methods, or is not optimized at all. In this paper we present a technique to perform efficient aggregate sharing for data stream windows that are declared as user-defined functions (UDFs) and can contain arbitrary business logic. To this end, we first introduce the concept of User-Defined Windows (UDWs), a simple, UDF-based programming abstraction that allows users to programmatically define custom windows. We then define semantics for UDWs, based on which we design Cutty, a low-cost aggregate sharing technique. Cutty improves on and outperforms the state of the art for aggregate sharing on single and multiple queries. Moreover, it enables aggregate sharing for a broad class of non-periodic UDWs. We implemented our techniques on Apache Flink, an open-source stream processing system, and performed experiments demonstrating orders-of-magnitude reductions in aggregation costs compared to the state of the art.
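The core idea behind aggregate sharing techniques in the Panes/Pairs/Cutty family can be sketched as slicing: cut the stream at every window boundary, pre-aggregate each slice exactly once, and answer every (possibly overlapping) window by combining a few slice partials instead of re-scanning raw events. The windows, values, and helper below are made up for the example, and real systems share this work incrementally rather than by batch recomputation.

```python
# Toy slice-based aggregate sharing over an event-time stream.
values = [3, 1, 4, 1, 5, 9, 2, 6]          # one value per event time 0..7
windows = [(0, 4), (2, 6), (4, 8)]         # half-open [start, end) windows

# 1. Slice boundaries = union of all window starts and ends.
cuts = sorted({b for w in windows for b in w})
slices = list(zip(cuts, cuts[1:]))          # [(0, 2), (2, 4), (4, 6), (6, 8)]

# 2. Aggregate each slice exactly once (this is the shared work).
partials = {(s, e): sum(values[s:e]) for s, e in slices}

# 3. Each window combines the partials of the slices it covers.
def window_sum(start: int, end: int) -> int:
    return sum(p for (s, e), p in partials.items() if start <= s and e <= end)

print([window_sum(s, e) for s, e in windows])  # [9, 19, 22]
```

With three overlapping windows, every raw value is still added into a partial only once; the overlap cost is reduced to combining a handful of per-slice sums.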
10.
  • Carbone, Paris, et al. (authors)
  • Large-scale data stream processing systems
  • 2017
  • In: Handbook of Big Data Technologies. Cham: Springer International Publishing. ISBN 9783319493404, 9783319493398. pp. 219-260
  • Book chapter (other academic/artistic) abstract
    • In our data-centric society, online services, decision making, and many other activities are becoming increasingly dependent on trends and patterns extracted from data. A broad class of societal-scale data management problems requires system support for processing unbounded data with low latency and high throughput. Large-scale data stream processing systems perceive data as infinite streams and are designed to satisfy such requirements. They have evolved substantially, both in terms of expressive programming-model support and in efficient, durable runtime execution on commodity clusters. Expressive programming models offer convenient ways to declare continuous data properties and the computations applied to them, while hiding details of how these data streams are physically processed and orchestrated in a distributed environment. Execution engines provide a runtime for such models, allowing for scalable yet durable execution of any declared computation. In this chapter we introduce the major design aspects of large-scale data stream processing systems, covering programming-model abstraction levels and runtime concerns. We then present a detailed case study on stateful stream processing with Apache Flink, an open-source stream processor that is used for a wide variety of processing tasks. Finally, we address the main challenges of the disruptive applications that large-scale data streaming enables, from a systemic point of view.
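The stateful stream processing that the chapter's Flink case study centers on can be reduced to one mechanism: state partitioned per key, with each incoming record reading and updating only its own key's state. The sketch below uses illustrative names, not Flink's actual API.

```python
from collections import defaultdict

class RunningCount:
    """Minimal keyed, stateful stream operator: per-key managed state,
    one state read/update per record, emitting a result downstream."""

    def __init__(self):
        self.state = defaultdict(int)   # per-key state, partitionable by key

    def process(self, key: str, value: int) -> tuple:
        self.state[key] += value        # update only this key's state
        return key, self.state[key]     # emit the updated running count

op = RunningCount()
events = [("clicks", 1), ("views", 1), ("clicks", 1)]
print([op.process(k, v) for k, v in events])
# [('clicks', 1), ('views', 1), ('clicks', 2)]
```

Because state is keyed, a runtime can shard the operator across a cluster by key and checkpoint each shard's state for durable, scalable execution, which is the design space the chapter surveys.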
Publication type
conference paper (35)
journal article (12)
doctoral thesis (7)
research review (2)
artistic work (1)
book (1)
book chapter (1)
Content type
peer-reviewed (48)
other academic/artistic (9)
popular science, debate, etc. (1)
Author/editor
Haridi, Seif, 1953- (56)
Dowling, Jim (11)
Carbone, Paris (8)
Vlassov, Vladimir, 1 ... (5)
Girdzijauskas, Sarun ... (5)
Rahimian, Fatemeh (5)
Ismail, Mahmoud (5)
Ghodsi, Ali, 1978- (4)
Brand, Per (4)
Popov, Konstantin (4)
Payberah, Amir H., 1 ... (4)
Van Roy, Peter (3)
Alghamdi, Ahmed Moha ... (3)
Vlassov, Vladimir (2)
Romano, P. (2)
Alima, Luc Onana (2)
Maguire Jr., Gerald ... (2)
El-Ansary, Sameh (2)
Alsayfi, Majed S. (2)
Dahab, Mohamed Y. (2)
Eassa, Fathy E. (2)
Salama, Reda (2)
Al-Ghamdi, Abdullah ... (2)
Arad, Cosmin (2)
Payberah, Amir H. (2)
Correia, Miguel (2)
Eassa, Fathy Elboura ... (2)
Niazi, Salman (2)
Ewen, Stephan (2)
Rodrigues, L. (1)
Richter, Stefan (1)
Soto, J. (1)
Boström, Henrik (1)
Kalavri, Vasiliki (1)
Girdzijauskas, Sarun ... (1)
Carbone, Paris, 1986 ... (1)
Hermann, G. (1)
Aberer, Karl (1)
Hauswirth, Manfred (1)
Ghodsi, Ali (1)
Berthou, Gautier (1)
Basloom, Huda (1)
Dahab, Mohamed (1)
Al-Ghamdi, Abdullah ... (1)
Eassa, Fathy (1)
Basloom, Huda Saleh (1)
Dahab, Mohamed Yehia (1)
Al-Ghamdi, Abdullah ... (1)
Rodrigues, Diogo (1)
Katsifodimos, Asteri ... (1)
Institution
Kungliga Tekniska Högskolan (56)
RISE (17)
Language
English (58)
Research subject (UKÄ/SCB)
Natural sciences (43)
Engineering and technology (17)

