SwePub
onr:"swepub:oai:DiVA.org:kth-248377"
 


Characterizing Deep-Learning I/O Workloads in TensorFlow

Chien, Steven W. D. (author)
Markidis, Stefano (author)
KTH, Beräkningsvetenskap och beräkningsteknik (CST)
Sishtla, Chaitanya Prasad (author)
KTH, Beräkningsvetenskap och beräkningsteknik (CST)
Santos, Luis (author)
Herman, Pawel (author)
KTH, Beräkningsvetenskap och beräkningsteknik (CST)
Narasimhamurthy, Sai (author)
Laure, Erwin (author)
KTH, Parallelldatorcentrum, PDC
Institute of Electrical and Electronics Engineers (IEEE), 2018
English.
In: Proceedings of PDSW-DISCS 2018: 3rd Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems, held in conjunction with SC 2018: The International Conference for High Performance Computing, Networking, Storage and Analysis. Institute of Electrical and Electronics Engineers (IEEE), pp. 54-63
  • Conference paper (peer-reviewed)
Abstract
The performance of Deep-Learning (DL) computing frameworks relies on the performance of data ingestion and checkpointing. During training, a considerably high number of relatively small files are first loaded and pre-processed on CPUs and then moved to the accelerator for computation. In addition, checkpointing and restart operations are carried out to allow DL computing frameworks to restart quickly from a checkpoint. Because of this, I/O affects the performance of DL applications. In this work, we characterize the I/O performance and scaling of TensorFlow, an open-source programming framework developed by Google and specifically designed for solving DL problems. To measure TensorFlow I/O performance, we first design a micro-benchmark to measure TensorFlow reads, and then use a TensorFlow mini-application based on AlexNet to measure the performance cost of I/O and checkpointing in TensorFlow. To improve the checkpointing performance, we design and implement a burst buffer. We find that increasing the number of threads increases TensorFlow bandwidth by a maximum of 2.3x and 7.8x on our benchmark environments. The use of the TensorFlow prefetcher results in a complete overlap of computation on the accelerator and the input pipeline on the CPU, eliminating the effective cost of I/O on the overall performance. The use of a burst buffer to checkpoint to a fast, small-capacity storage and asynchronously copy the checkpoints to a slower, large-capacity storage resulted in a performance improvement of 2.6x with respect to checkpointing directly to the slower storage on our benchmark environment.
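The burst-buffer idea described in the abstract (write the checkpoint to a fast tier, then copy it asynchronously to a slower tier) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use TensorFlow; the class name `BurstBufferCheckpointer`, its methods, and the directory layout are hypothetical, and the checkpoint is assumed to be an opaque byte string.

```python
import shutil
import tempfile
import threading
from pathlib import Path


class BurstBufferCheckpointer:
    """Hypothetical sketch: save to fast storage, drain to slow storage in the background."""

    def __init__(self, fast_dir, slow_dir):
        self.fast_dir = Path(fast_dir)
        self.slow_dir = Path(slow_dir)
        self.fast_dir.mkdir(parents=True, exist_ok=True)
        self.slow_dir.mkdir(parents=True, exist_ok=True)
        self._pending = []  # background copy threads not yet joined

    def save(self, name, payload):
        """Write the checkpoint to the fast tier and return once that write is done."""
        fast_path = self.fast_dir / name
        fast_path.write_bytes(payload)  # the only blocking I/O seen by the caller
        # Copy to the slow tier in a background thread so training can continue.
        copier = threading.Thread(
            target=shutil.copy2, args=(fast_path, self.slow_dir / name)
        )
        copier.start()
        self._pending.append(copier)
        return fast_path

    def wait(self):
        """Block until all background copies to the slow tier have finished."""
        for copier in self._pending:
            copier.join()
        self._pending.clear()


def demo():
    """Save one fake checkpoint and confirm it reaches the slow tier intact."""
    payload = b"\x00" * 1024  # stand-in for serialized model state
    with tempfile.TemporaryDirectory() as root:
        ckpt = BurstBufferCheckpointer(Path(root) / "fast", Path(root) / "slow")
        ckpt.save("step-100.ckpt", payload)
        ckpt.wait()
        return (Path(root) / "slow" / "step-100.ckpt").read_bytes() == payload
```

In this sketch the caller only pays for the write to the fast tier; the copy to the slow tier overlaps with whatever the caller does next, and `wait()` is needed only before shutdown or before the fast-tier file is reclaimed.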

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Engineering (hsv//eng)

Keyword

Parallel I/O
Input Pipeline
Deep Learning
TensorFlow

Publication and Content Type

ref (subject category)
kon (subject category)
