SwePub

Enabling Energy-Efficient Inference for Self-Attention Mechanisms in Neural Networks

Chen, Qinyu (author)
Univ Shanghai Sci & Technol, Inst Photon Chips, Shanghai, Peoples R China.
Sun, Congyi (author)
Nanjing Univ, Sch Elect Sci & Engn, Nanjing, Peoples R China.
Lu, Zhonghai (author)
KTH, Electronics and Embedded Systems
Gao, Chang (author)
Univ Zurich, Inst Neuroinformat, Zurich, Switzerland; Swiss Fed Inst Technol, Zurich, Switzerland.
Institute of Electrical and Electronics Engineers (IEEE), 2022
2022
English.
In: 2022 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS 2022). - : Institute of Electrical and Electronics Engineers (IEEE), pp. 25-28
  • Conference paper (peer-reviewed)
Abstract
The study of specialized accelerators tailored for neural networks has become a promising topic in recent years. Existing neural network accelerators are usually designed for convolutional neural networks (CNNs) or recurrent neural networks (RNNs); less attention has been paid to attention mechanisms, an emerging neural network primitive with the ability to identify the relations within input entities. Self-attention-oriented models such as the Transformer have achieved great performance on natural language processing, computer vision, and machine translation. However, the self-attention mechanism has intrinsically expensive computational workloads, which increase quadratically with the number of input entities. In this work, we therefore propose a software-hardware co-design solution for energy-efficient self-attention inference. A prediction-based approximate self-attention mechanism is introduced to substantially reduce the runtime as well as power consumption, and a specialized hardware architecture is then designed to further increase the speedup. The design is implemented on a Xilinx XC7Z035 FPGA, and the results show that energy efficiency is improved by 5.7x with less than 1% accuracy loss.
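To see why the workload grows quadratically with the number of input entities, consider a minimal NumPy sketch of standard scaled dot-product self-attention. This illustrates only the baseline computation whose n x n score matrix the paper's prediction-based approximation targets; it is not the authors' accelerator design, and all names here are illustrative.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over n input entities.

    The score matrix Q @ K.T has shape (n, n), so compute and memory
    both grow quadratically with the number of input entities n.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])            # (n, n) quadratic term
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
n, d = 8, 4                        # 8 entities, embedding size 4
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                   # (8, 4)
```

An approximation scheme like the one described in the abstract would predict which of the n x n score entries matter and skip the rest before the softmax, trading a small accuracy loss for large runtime and energy savings.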

Subject terms

ENGINEERING AND TECHNOLOGY  -- Electrical Engineering, Electronic Engineering, Information Engineering (hsv//eng)

Keywords

Self-attention
approximate computing
VLSI

Publication and content type

ref (subject category)
kon (subject category)


