SwePub

id:"swepub:oai:DiVA.org:kth-216289"
 


Scaling HDFS to more than 1 million operations per second with HopsFS

Ismail, Mahmoud (author)
KTH, Programvaruteknik och Datorsystem, SCS
Niazi, Salman, 1982- (author)
KTH, Programvaruteknik och Datorsystem, SCS
Ronstrom, M. (author)
Haridi, S. (author)
Dowling, Jim (author)
KTH, Programvaruteknik och Datorsystem, SCS
Institute of Electrical and Electronics Engineers Inc., 2017
English.
In: Proceedings - 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2017. Institute of Electrical and Electronics Engineers Inc. ISBN 9781509066100, pp. 683-688
  • Conference paper (peer-reviewed)
Abstract
  • HopsFS is an open-source, next-generation distribution of the Apache Hadoop Distributed File System (HDFS) that replaces the main scalability bottleneck in HDFS, the single-node in-memory metadata service, with a no-shared-state distributed system built on a NewSQL database. By removing the metadata bottleneck in Apache HDFS, HopsFS enables significantly larger cluster sizes, more than an order of magnitude higher throughput, and significantly lower client latencies for large clusters. In this paper, we detail the techniques and optimizations that enable HopsFS to surpass 1 million file system operations per second, at least 16 times higher throughput than HDFS. In particular, we discuss how we exploit recent high-performance features from NewSQL databases, such as application-defined partitioning, partition-pruned index scans, and distribution-aware transactions. Together with more traditional techniques, such as batching and write-ahead caches, we show how many incremental optimizations have enabled a revolution in distributed hierarchical file system performance.

Subject headings

NATURAL SCIENCES -- Computer and Information Sciences (hsv//eng)

Keywords

Distributed File System
File System Design
High-performance file systems
NewSQL
Cluster computing
Computer software
Distributed database systems
File organization
Grid computing
Metadata
Open systems
Distributed file systems
Distributed systems
File systems
Hadoop distributed file system (HDFS)
Incremental optimization
Metadata services
Traditional techniques
Distributed computer systems

Publication and content type

ref (subject category)
kon (subject category)
