SwePub

Characterizing Deep-Learning I/O Workloads in TensorFlow

Chien, Steven W. D. (author)
Markidis, Stefano (author)
KTH, Beräkningsvetenskap och beräkningsteknik (CST)
Sishtla, Chaitanya Prasad (author)
KTH, Beräkningsvetenskap och beräkningsteknik (CST)
Santos, Luis (author)
Herman, Pawel (author)
KTH, Beräkningsvetenskap och beräkningsteknik (CST)
Narasimhamurthy, Sai (author)
Laure, Erwin (author)
KTH, Parallelldatorcentrum, PDC
Institute of Electrical and Electronics Engineers (IEEE), 2018
English.
In: Proceedings of PDSW-DISCS 2018: 3rd Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems, Held in conjunction with SC 2018: The International Conference for High Performance Computing, Networking, Storage and Analysis. Institute of Electrical and Electronics Engineers (IEEE), pp. 54-63
  • Conference paper (peer-reviewed)
Abstract
  • The performance of Deep-Learning (DL) computing frameworks relies on the performance of data ingestion and checkpointing. In fact, during training, a considerably high number of relatively small files are first loaded and pre-processed on CPUs and then moved to accelerators for computation. In addition, checkpointing and restart operations are carried out to allow DL computing frameworks to restart quickly from a checkpoint. Because of this, I/O affects the performance of DL applications. In this work, we characterize the I/O performance and scaling of TensorFlow, an open-source programming framework developed by Google and specifically designed for solving DL problems. To measure TensorFlow I/O performance, we first design a micro-benchmark to measure TensorFlow reads, and then use a TensorFlow mini-application based on AlexNet to measure the performance cost of I/O and checkpointing in TensorFlow. To improve the checkpointing performance, we design and implement a burst buffer. We find that increasing the number of threads increases TensorFlow bandwidth by a maximum of 2.3x and 7.8x on our benchmark environments. The use of the TensorFlow prefetcher results in a complete overlap of computation on the accelerator and the input pipeline on the CPU, eliminating the effective cost of I/O on the overall performance. The use of a burst buffer to checkpoint to a fast, small-capacity storage and asynchronously copy the checkpoints to a slower, large-capacity storage resulted in a performance improvement of 2.6x with respect to checkpointing directly to slower storage on our benchmark environment.
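
The burst-buffer idea in the abstract (write each checkpoint to fast storage, then copy it to slower storage in the background) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `BurstBuffer`, its methods, and the directory arguments are hypothetical, and only the Python standard library is used rather than TensorFlow's own checkpoint API.

```python
import queue
import shutil
import threading
from pathlib import Path


class BurstBuffer:
    """Hypothetical sketch: checkpoints land on fast storage first and are
    copied to slow storage by a background thread."""

    def __init__(self, fast_dir, slow_dir):
        self.fast_dir = Path(fast_dir)
        self.slow_dir = Path(slow_dir)
        self.fast_dir.mkdir(parents=True, exist_ok=True)
        self.slow_dir.mkdir(parents=True, exist_ok=True)
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def save(self, name, data):
        """Write the checkpoint bytes to fast storage and return at once;
        the copy to slow storage is only queued here."""
        path = self.fast_dir / name
        path.write_bytes(data)
        self._pending.put(path)
        return path

    def _drain(self):
        # Background loop: copy queued checkpoints to the slow tier.
        while True:
            path = self._pending.get()
            try:
                if path is None:  # sentinel: stop the worker
                    return
                shutil.copy2(path, self.slow_dir / path.name)
            finally:
                self._pending.task_done()

    def flush(self):
        """Block until every queued checkpoint has reached slow storage."""
        self._pending.join()

    def close(self):
        """Flush outstanding copies, then stop the worker thread."""
        self.flush()
        self._pending.put(None)
        self._worker.join()
```

In this sketch the training loop would pay only for the write to the fast tier in `save`, while `flush` or `close` is called when the copies must be known to be complete, for example before the job exits.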

Subject headings

NATURAL SCIENCES -- Computer and Information Sciences -- Computer Engineering (hsv//eng)

Keywords

Parallel I/O
Input Pipeline
Deep Learning
TensorFlow

Publication and content type

ref (subject category)
kon (subject category)
