SwePub

Search results for "WFRF:(Chu Haoming)"

  • Results 1-3 of 3
1.
  • Huang, Boming, et al. (author)
  • IECA : An In-Execution Configuration CNN Accelerator With 30.55 GOPS/mm(2) Area Efficiency
  • 2021
  • In: IEEE Transactions on Circuits and Systems Part 1. - : Institute of Electrical and Electronics Engineers (IEEE). - 1549-8328 .- 1558-0806. ; 68:11, pp. 4672-4685
  • Journal article (peer-reviewed)
    • It remains challenging for a Convolutional Neural Network (CNN) accelerator to maintain high hardware utilization and low processing latency with restricted on-chip memory. This paper presents an In-Execution Configuration Accelerator (IECA) that realizes an efficient control scheme, exploring architectural data reuse, unified in-execution control, and pipelined latency hiding to minimize configuration overhead outside the computation scope. The proposed IECA achieves row-wise convolution with tiny distributed buffers and reduces the size of total on-chip memory by removing 40% of redundant memory storage with shared delay chains. By exploiting a reconfigurable Sequence Mapping Table (SMT) and Finite State Machine (FSM) control, the chip realizes cycle-accurate Processing Element (PE) control, automatic loop tiling, and latency hiding without extra time slots for pre-configuration. Evaluated on AlexNet and VGG-16, the IECA retains over 97.3% PE utilization and over 95.6% memory access time hiding on average. The chip is designed and fabricated in a UMC 55-nm process, runs at a frequency of 250 MHz, and achieves an area efficiency of 30.55 GOPS/mm² and 0.244 GOPS/KGE (kilo-gate-equivalent), improvements of more than 2.0x and 2.1x, respectively, over previous related works. The implementation of the IEC control scheme uses only 0.55% of the area of the 2.75 mm² core.
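The area figures quoted in the IECA abstract can be checked with a few lines of arithmetic. The sketch below is only illustrative: the control-logic area follows directly from the stated 0.55% of a 2.75 mm² core, while the throughput figure rests on the assumption (not stated in the abstract) that the 30.55 GOPS/mm² efficiency is normalized to that same core area. The function names are hypothetical.

```python
# Figures quoted in the abstract.
AREA_EFFICIENCY_GOPS_PER_MM2 = 30.55
CORE_AREA_MM2 = 2.75
CONTROL_AREA_FRACTION = 0.0055  # 0.55% of the core


def control_area_mm2(core_area: float = CORE_AREA_MM2,
                     fraction: float = CONTROL_AREA_FRACTION) -> float:
    """Area occupied by the control scheme, from the stated fraction of the core."""
    return core_area * fraction


def implied_throughput_gops(efficiency: float = AREA_EFFICIENCY_GOPS_PER_MM2,
                            core_area: float = CORE_AREA_MM2) -> float:
    """Throughput implied by the area efficiency.

    ASSUMPTION: the efficiency is normalized to the core area; the abstract
    does not say which area was used.
    """
    return efficiency * core_area


print(f"control area: {control_area_mm2():.4f} mm^2")
print(f"implied throughput (assumption): {implied_throughput_gops():.2f} GOPS")
```

Under that assumption the implied throughput is roughly 84 GOPS; if the authors normalized to a different area, the figure would change accordingly.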
2.
  • Jin, Yi, et al. (author)
  • TMR Group Coding Method for Optimized SEU and MBU Tolerant Memory Design
  • 2018
  • In: 2018 IEEE International Symposium on Circuits and Systems (ISCAS). - : IEEE. - 9781538648810
  • Conference paper (peer-reviewed)
    • This work proposes a fault-tolerant memory design that uses Triple Module Redundancy (TMR) group coding to tolerate the influence of Single-Event Upsets (SEUs) and Multi-Bit Upsets (MBUs) on memory devices in the space environment. The group coding method uses different models to partition each word line in memory and code it with a Hamming code to achieve the best performance. The TMR group coding method further increases the capability to self-correct errors occurring in parity bits. The evaluation results show that the suggested approach improves the correctness of the memory output with an optimized tradeoff between reliability and cost. At a 5% error rate, the probability of correct output reaches 70.78% with a small cost increment. To achieve 90% reliability, the accuracy improvement is 31.9% compared to TMR, with 9% increased area. The proposed solution is evaluated on a memory-rich micro-coded processor, but can be extended to other memory-based processors that need high reliability against SEU and MBU influence in aerospace applications.
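The abstract above combines two standard building blocks, TMR voting and Hamming coding. The sketch below illustrates only these generic textbook primitives: a bitwise majority voter and a Hamming(7,4) single-error corrector. It is not the authors' group coding scheme, whose partitioning models are not described in the abstract, and all function names are hypothetical.

```python
from typing import List


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits into a 7-bit codeword laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code: List[int]) -> List[int]:
    """Return a copy of the codeword with any single-bit error corrected."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # 1-based index of the flipped bit, 0 if none
    if position:
        c[position - 1] ^= 1
    return c


# One corrupted copy out of three is outvoted by the other two.
print(bin(tmr_vote(0b1011, 0b1011, 0b0011)))

# A single flipped bit in a Hamming codeword is located and repaired.
word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1
print(hamming74_correct(damaged) == word)
```

Plain Hamming(7,4) corrects one flipped bit per codeword, including a flipped parity bit, but it cannot correct multi-bit upsets within the same codeword; that gap is the kind of limitation the combined TMR and grouping approach in the abstract is aimed at.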
3.
  • Xu, Jiawei, et al. (author)
  • Modeling Cycle-to-Cycle Variation in Memristors for In-Situ Unsupervised Trace-STDP Learning
  • 2024
  • In: IEEE Transactions on Circuits and Systems - II - Express Briefs. - : Institute of Electrical and Electronics Engineers (IEEE). - 1549-7747 .- 1558-3791. ; 71:2, pp. 627-631
  • Journal article (peer-reviewed)
    • Evaluating the computational accuracy of Spiking Neural Networks (SNNs) implemented with in-situ learning on large-scale memristor crossbars remains a challenge due to the lack of a versatile model for the variations in non-ideal memristors. This brief proposes a novel behavioral variation model along with a four-stage pipeline for physical memristors. The proposed variation model combines both absolute and relative variations, so it can better characterize different memristor cycle-to-cycle (C2C) variations in practice. The proposed variation model has been used to simulate the behavior of two physical memristors. Adopting the non-ideal memristor model, the trace-based spike-timing-dependent plasticity (STDP) unsupervised in-memristor learning system is simulated. Although the synaptic-level weight simulation shows a performance degradation of 7.99% and a 4.07% increase in the relative root mean square error (RRMSE), the network-level simulation results show no accuracy loss on the MNIST benchmark. Furthermore, the impacts of absolute and relative C2C variations on network performance are simulated and analyzed through two sets of univariate experiments.
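The idea of combining absolute and relative variation can be illustrated with a small sketch. The formula below is an assumption chosen for illustration (Gaussian noise, one term independent of the programmed value and one proportional to it, clipped to a conductance range); the abstract does not give the authors' actual model, and the function name, parameters, and default range are hypothetical.

```python
import random


def apply_c2c_variation(target: float, sigma_abs: float, sigma_rel: float,
                        rng: random.Random,
                        g_min: float = 0.0, g_max: float = 1.0) -> float:
    """Return a noisy conductance for one programming cycle.

    ASSUMED model: g = target + target * sigma_rel * e1 + sigma_abs * e2,
    with e1 and e2 drawn from a standard normal distribution, then clipped
    to [g_min, g_max].
    """
    relative = target * sigma_rel * rng.gauss(0.0, 1.0)  # scales with the value
    absolute = sigma_abs * rng.gauss(0.0, 1.0)           # independent of the value
    return min(g_max, max(g_min, target + relative + absolute))


rng = random.Random(0)  # fixed seed so repeated runs give the same samples
samples = [apply_c2c_variation(0.5, 0.01, 0.05, rng) for _ in range(5)]
print(samples)
```

Setting one of the two sigmas to zero while sweeping the other mirrors the kind of univariate experiment the abstract mentions, although the actual experimental setup may differ.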