SwePub
Search the SwePub database


Hit list for the search "WFRF:(Shen Hongbing) ;lar1:(kth)"

Search: WFRF:(Shen Hongbing) > Kungliga Tekniska Högskolan

  • Results 1-2 of 2
1.
  • Qin, Zidi, et al. (author)
  • A Novel Approximation Methodology and Its Efficient VLSI Implementation for the Sigmoid Function
  • 2020
  • In: IEEE Transactions on Circuits and Systems II: Express Briefs. - Institute of Electrical and Electronics Engineers (IEEE). - ISSN 1549-7747, e-ISSN 1558-3791. Vol. 67, no. 12, pp. 3422-3426
  • Journal article (peer-reviewed), abstract:
    • In this brief, a novel approximation method and its optimized hardware implementation are proposed for the sigmoid function used in Deep Neural Networks (DNNs). Based on piecewise approximation and truncated Taylor series expansion, the proposed method achieves a very good approximation with low complexity while exploiting a power-of-two data representation. In addition, by analyzing the gradients of the sigmoid function, a small trick is introduced to improve the approximation precision. Furthermore, to reduce the hardware complexity and shorten the critical path, sampled values of the function are generated with simple logical mapping. It is shown that the proposed approximation schemes can be implemented with purely combinational logic and that the sigmoid function can be computed in one clock cycle. The experimental results demonstrate that the mean absolute errors are on the order of 1 × 10^(-3). Compared with prior art, the new design obtains a significant improvement in critical path with comparable performance. (A code sketch of the power-of-two idea follows this record.)
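The abstract does not give the paper's exact segmentation or Taylor coefficients, so the sketch below only illustrates the hardware-friendly core idea it names: a piecewise approximation whose slopes are powers of two, turning every multiplication into a shift. It uses a classic PLAN-style piecewise-linear scheme rather than the authors' design; the function name and breakpoints are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_pow2(x):
    """Piecewise-linear sigmoid with power-of-two slopes (PLAN-style).

    Slopes 2^-2, 2^-3 and 2^-5 reduce to right shifts in hardware.
    The paper's own scheme (truncated Taylor series plus a
    gradient-based correction) is different and more accurate;
    this is only a sketch of the power-of-two idea.
    """
    x = np.asarray(x, dtype=np.float64)
    ax = np.abs(x)
    y = np.where(ax >= 5.0, 1.0,
        np.where(ax >= 2.375, 2**-5 * ax + 0.84375,
        np.where(ax >= 1.0,   2**-3 * ax + 0.625,
                              2**-2 * ax + 0.5)))
    # Exploit the symmetry sigmoid(-x) = 1 - sigmoid(x).
    return np.where(x >= 0.0, y, 1.0 - y)

xs = np.linspace(-8.0, 8.0, 1601)
print("mean absolute error:", np.mean(np.abs(sigmoid_pow2(xs) - sigmoid(xs))))
```

The intercepts (0.5, 0.625, 0.84375) are themselves short sums of powers of two, so the whole datapath needs only shifts and adders, which is what makes single-cycle, purely combinational evaluation plausible.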
2.
  • Qin, Zidi, et al. (author)
  • Accelerating Deep Neural Networks by Combining Block-Circulant Matrices and Low-Precision Weights
  • 2019
  • In: Electronics. - MDPI. - ISSN 2079-9292. Vol. 8, no. 1
  • Journal article (peer-reviewed), abstract:
    • As a key ingredient of deep neural networks (DNNs), fully-connected (FC) layers are widely used in various artificial intelligence applications. However, FC layers contain many parameters, so their efficient processing is restricted by memory bandwidth. In this paper, we propose a compression approach combining block-circulant matrix-based weight representation and power-of-two quantization. Applying block-circulant matrices in FC layers reduces the storage complexity from O(k^2) to O(k). By quantizing the weights into integer powers of two, the multiplications in inference can be replaced by shift and add operations. The memory usage of models for MNIST, CIFAR-10 and ImageNet can be compressed by 171x, 2731x and 128x with minimal accuracy loss, respectively. A configurable parallel hardware architecture is then proposed for processing the compressed FC layers efficiently. Without multipliers, a block matrix-vector multiplication module (B-MV) is used as the computing kernel. The architecture is flexible enough to support FC layers of various compression ratios with a small footprint. At the same time, memory accesses can be significantly reduced by the configurable architecture. Measurement results show that the accelerator has a processing power of 409.6 GOPS and achieves an energy efficiency of 5.3 TOPS/W at 800 MHz. (A code sketch of the block-circulant idea follows this record.)
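As a companion sketch, here is the block-circulant weight layout and the rotate-and-accumulate product that a kernel like the abstract's B-MV computes, together with the simplest signed power-of-two quantizer. The block size, function names, and quantization rule are assumptions for illustration; the paper's configurable architecture is not reproduced here.

```python
import numpy as np

def quantize_pow2(w):
    """Round each weight to the nearest signed power of two (zeros kept).

    Simplest possible rule; the paper's exact quantizer is not given in
    the abstract, so treat this as a placeholder.
    """
    w = np.asarray(w, dtype=np.float64)
    q = np.zeros_like(w)
    nz = w != 0
    q[nz] = np.sign(w[nz]) * 2.0 ** np.round(np.log2(np.abs(w[nz])))
    return q

def block_circulant_matvec(first_cols, x):
    """y = W @ x where W consists of b-by-b circulant blocks.

    first_cols[i, j] holds the defining (first) column of block (i, j),
    so an (mb*b) x (nb*b) matrix is stored with mb*nb*b numbers: the
    O(k^2) -> O(k) storage reduction described in the abstract.
    """
    mb, nb, b = first_cols.shape
    y = np.zeros(mb * b)
    for i in range(mb):
        for j in range(nb):
            c = first_cols[i, j]
            xj = x[j * b:(j + 1) * b]
            # Rotate-and-accumulate: with power-of-two weights, each
            # xj[s] * c product becomes shifts and adds in hardware.
            for s in range(b):
                y[i * b:(i + 1) * b] += xj[s] * np.roll(c, s)
    return y

# Tiny usage example; all sizes are made up for illustration.
rng = np.random.default_rng(0)
b, mb, nb = 4, 2, 3
first_cols = quantize_pow2(rng.normal(size=(mb, nb, b)))
x = rng.normal(size=nb * b)
y = block_circulant_matvec(first_cols, x)

# Cross-check against the dense matrix the blocks represent.
W = np.block([[np.stack([np.roll(first_cols[i, j], s) for s in range(b)], axis=1)
               for j in range(nb)] for i in range(mb)])
assert np.allclose(y, W @ x)
```

Replacing the inner loop with an FFT-based circular convolution gives the same result in O(b log b) per block, which is the other standard way to exploit the circulant structure.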
Type of publication
journal article (2)
Type of content
peer-reviewed (2)
Author/editor
Lu, Zhonghai (2)
Qin, Zidi (2)
Shen, Qinghong (2)
Pan, Hongbing (2)
Li, Li (1)
Chen, Xuan (1)
Zhu, Di (1)
Gao, Yang (1)
Qiu, Yuou (1)
Sun, Huaqing (1)
Wang, Zhongfeng (1)
Zhu, Xingwei (1)
Shi, Yinghuan (1)
University
Language
English (2)
Research subject (UKÄ/SCB)
Engineering and Technology (2)

Year
