SwePub
onr:"swepub:oai:DiVA.org:miun-43424"
 

ANNETTE : Accurate Neural Network Execution Time Estimation with Stacked Models

Wess, M. (author)
Ivanov, M. (author)
Unger, C. (author)
Nookala, A. (author)
Wendt, A. (author)
Jantsch, A. (author)
Institute of Electrical and Electronics Engineers Inc. 2021
English.
In: IEEE Access. Institute of Electrical and Electronics Engineers Inc. ISSN 2169-3536. Vol. 9, pp. 3545-3556
  • Journal article (peer-reviewed)
Abstract

With new accelerator hardware for Deep Neural Networks (DNNs), the computing power available to Artificial Intelligence (AI) applications has increased rapidly. However, as DNN algorithms become more complex and more closely optimized for specific applications, latency requirements remain challenging, and it is critical to find the optimal points in the design space. To decouple the architectural search from the target hardware, we propose a time estimation framework that models the inference latency of DNNs on hardware accelerators based on mapping and layer-wise estimation models. The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. For evaluation, we compare the estimation accuracy and fidelity of the generated mixed models and statistical models with those of the roofline model and a refined roofline model. We test the mixed models on the ZCU102 SoC board with the Xilinx Deep Neural Network Development Kit (DNNDK) and on the Intel Neural Compute Stick 2 (NCS2), using a set of 12 state-of-the-art neural networks. The mixed model shows an average estimation error of 3.47% for the DNNDK and 7.44% for the NCS2, outperforming the statistical and analytical layer models for almost all selected networks. For a randomly selected subset of 34 networks from the NASBench dataset, the mixed model reaches a fidelity of 0.988 as measured by Spearman's $\rho$ rank correlation coefficient. © 2013 IEEE.
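
The abstract's two evaluation ideas, a roofline-style layer-wise latency estimate and fidelity measured by Spearman's rank correlation, can be illustrated with a minimal Python sketch. This is not the ANNETTE implementation: the `Layer` fields, the function names, and the reading of "average estimation error" as a mean absolute percentage error are all assumptions made for illustration, and the stacked mapping models from the paper are not reproduced.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Layer:
    ops: float          # arithmetic operations in the layer (hypothetical field)
    bytes_moved: float  # bytes transferred to and from memory (hypothetical field)


def roofline_time(layer, peak_ops_per_s, peak_bytes_per_s):
    """Basic roofline estimate: the layer is bound by compute or by memory traffic."""
    return max(layer.ops / peak_ops_per_s, layer.bytes_moved / peak_bytes_per_s)


def network_time(layers, peak_ops_per_s, peak_bytes_per_s):
    """Layer-wise estimation: sum the per-layer estimates (ignores layer fusion)."""
    return sum(roofline_time(l, peak_ops_per_s, peak_bytes_per_s) for l in layers)


def mean_abs_pct_error(estimated, measured):
    """Assumed error metric: mean absolute percentage error over networks."""
    errors = [abs(e - m) / m for e, m in zip(estimated, measured)]
    return 100.0 * sum(errors) / len(errors)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)
```

A fidelity close to 1 means the estimator orders candidate networks by latency almost exactly as the hardware measurements do, which matters for architecture search even when absolute errors are nonzero.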

Keywords

Analytical models
Estimation
Neural network hardware
Deep neural networks
Mapping
System-on-chip
Estimation errors
Estimation models
Hardware accelerators
Rank correlation coefficient
Roofline models
State of the art
Target hardware
Time estimation
Neural networks

Publication and Content Type

ref (subject category)
art (subject category)

By the university
Mid Sweden University
