SwePub

Result list for search "WFRF:(Huan Yuxiang)"

  • Results 1-10 of 21
1.
  • Chu, H., et al. (author)
  • An ASIC Design of Multi-Electrode Digital Basket Catheter Systems with Reconfigurable Compressed Sampling
  • 2019
  • In: International System on Chip Conference. - : IEEE Computer Society. - 9781538614907 ; , pp. 308-313
  • Conference paper (peer-reviewed)
    • This paper presents an Application Specific Integrated Circuit (ASIC) design with reconfigurable compressed sampling (CS) for multi-electrode basket catheter systems that acquire intracardiac electrograms (IEGMs). This work adopts a reconfigurable CS (ReCS) encoder for near-electrode processing to enable a sub-Nyquist sampling rate and thus improve system capacity. The ReCS encoder is designed to work with a reconfigurable compression cycle as well as a reconfigurable compression ratio, which makes it suitable for a wide range of signals. The digital ASIC chip is placed at the distal end of the catheter, close to the electrodes, so that all signals are digitized and encoded before being transmitted to an external receiver. This architecture ensures serial data transmission, reducing the number of traces, the size of the catheter, and the fabrication complexity. The evaluated area cost of the total digital circuitry is 0.046 mm² and the power consumption is 49.1 μW at a 4 MHz clock frequency in a 65 nm process.
2.
  • Ding, Chen, et al. (author)
  • An Ultra-Low Latency Multicast Router for Large-Scale Multi-Chip Neuromorphic Processing
  • 2021
  • In: 2021 IEEE 3rd international conference on artificial intelligence circuits and systems (AICASs). - : Institute of Electrical and Electronics Engineers (IEEE).
  • Conference paper (peer-reviewed)
    • Neuromorphic simulation is fundamental to the study of the information-processing mechanisms of the human brain and can further inspire application development of event-driven spiking neural networks. However, large-scale neuromorphic simulation requires massive parallelism across multiple chips and poses great challenges in handling data transmission latency and congestion between chips, especially when the number of simulated neurons reaches billions or even trillions. In this paper, we propose an ultra-low-latency on-chip router together with a multicast routing algorithm that focuses on reducing global loads and balancing loads between links. Additionally, we build a large-scale neuromorphic simulation platform consisting of 64 FPGA chips and evaluate the proposed design on it. The experimental results suggest that the design benefits from the proposed multicast routing algorithm in terms of global communication loads and simulation capacity. This work achieves a 4.1% to 5.2% reduction in global loads compared to previous work, a latency as low as 25 ns, and a maximum data throughput of 6.25 Gbps/chip.
3.
  • Huan, Yuxiang, et al. (author)
  • A 101.4 GOPS/W Reconfigurable and Scalable Control-Centric Embedded Processor for Domain-Specific Applications
  • 2016
  • In: IEEE Transactions on Circuits and Systems Part 1. - : IEEE. - 1549-8328 .- 1558-0806. ; 63:12, pp. 2245-2256
  • Journal article (peer-reviewed)
    • Adapting the processor to the target application is essential in the Internet-of-Things (IoT), and thus requires customizability in order to improve energy efficiency and scalability to provide sufficient performance. In this paper, a reconfigurable and scalable control-centric architecture is proposed, and a processor consisting of two cores and an on-chip multi-mode router is implemented. Reconfigurability is enabled by a programmable sequence mapping table (SMT) which reorganizes functional units in each cycle, thus increasing hardware utilization and reducing excessive data movement for high energy efficiency. The router facilitates both wormhole and circuit switching to construct intra- or inter-chip interconnections, providing scalable performance. Fabricated in a 65-nm process, the chip exhibits 101.4 GOPS/W energy efficiency with a die size of 3.5 mm². The processor carries out general-purpose processing with a code size 29% smaller than the ARM Cortex M4, and improves the performance of application-specific processing by over ten times when implementing AES and RSA using SMTs instead of general-purpose C. By utilizing the on-chip router, up to 256 nodes can be interconnected, with a single link bandwidth of 1.4 Gbps.
4.
  • Huan, Yuxiang, et al. (author)
  • A 3D Tiled Low Power Accelerator for Convolutional Neural Network
  • 2018
  • In: 2018 IEEE International Symposium on Circuits and Systems (ISCAS). - : IEEE. - 9781538648810
  • Conference paper (peer-reviewed)
    • It remains a challenge to run Deep Learning on devices with a stringent power budget in the Internet-of-Things. This paper presents a low-power accelerator for processing Convolutional Neural Networks on embedded devices. The power reduction is realized by exploiting data reuse in three different aspects: convolution, filters, and input features. A systolic-like data flow is proposed and applied to rows of Processing Elements (PEs), which facilitates reusing the data during convolution. Reuse of input features and filters is achieved by arranging the PE array in a 3D tiled architecture with dimensions 3 x 14 x 4. Local storage within PEs is therefore reduced and costs only 17.75 kB, which is 20% of the state-of-the-art. With dedicated delay chains in each PE, the accelerator is reconfigurable to suit various parameter settings of convolutional layers. Evaluated in a UMC 65 nm low-leakage process, the accelerator can reach a peak performance of 84 GOPS and consumes only 136 mW at 250 MHz.
5.
  • Huan, Yuxiang, et al. (author)
  • A 61 μA/MHz reconfigurable application-specific processor and system-on-chip for Internet-of-Things
  • 2016
  • In: International System on Chip Conference. - : IEEE Computer Society. - 9781467390941 ; , pp. 235-239
  • Conference paper (peer-reviewed)
    • This paper presents an SoC design that combines general-purpose control and application-specific acceleration within a reconfigurable ASIP core for Internet-of-Things applications. Sufficient processing capability and reconfigurability are provided by a highly customizable data path and an efficient sequence control loop. By fully utilizing the data path of the proposed architecture, the processor reduces code size by more than 4X and offers superior performance compared with the MSP430 and ATmega128 in FIR and Whetstone benchmarks. More than 10X speedup can be obtained in executing encryption algorithms by using optimized micro-instructions, without extra hardware accelerators. Fabricated in 0.18 μm CMOS, the SoC's energy efficiency beats that of most microcontrollers, with a value as low as 61 μA/MHz.
6.
  • Huang, Boming, et al. (author)
  • Automated trading systems statistical and machine learning methods and hardware implementation : a survey
  • 2019
  • In: Enterprise Information Systems. - : Taylor & Francis. - 1751-7575 .- 1751-7583. ; 13:1, pp. 132-144
  • Journal article (peer-reviewed)
    • Automated trading, which is also known as algorithmic trading, is a method of using a predesigned computer program to submit a large number of trading orders to an exchange. It is substantially a real-time decision-making system which is under the scope of Enterprise Information System (EIS). With the rapid development of telecommunication and computer technology, the mechanisms underlying automated trading systems have become increasingly diversified. Considerable effort has been exerted by both academia and trading firms towards mining potential factors that may generate significantly higher profits. In this paper, we review studies on trading systems built using various methods and empirically evaluate the methods by grouping them into three types: technical analyses, textual analyses and high-frequency trading. Then, we evaluate the advantages and disadvantages of each method and assess their future prospects.
7.
  • Huang, Boming, et al. (author)
  • IECA : An In-Execution Configuration CNN Accelerator With 30.55 GOPS/mm² Area Efficiency
  • 2021
  • In: IEEE Transactions on Circuits and Systems Part 1. - : Institute of Electrical and Electronics Engineers (IEEE). - 1549-8328 .- 1558-0806. ; 68:11, pp. 4672-4685
  • Journal article (peer-reviewed)
    • It remains challenging for a Convolutional Neural Network (CNN) accelerator to maintain high hardware utilization and low processing latency with restricted on-chip memory. This paper presents an In-Execution Configuration Accelerator (IECA) that realizes an efficient control scheme, exploring architectural data reuse, unified in-execution controlling, and pipelined latency hiding to minimize configuration overhead out of the computation scope. The proposed IECA achieves row-wise convolution with tiny distributed buffers and reduces the size of total on-chip memory by removing 40% of redundant memory storage with shared delay chains. By exploiting a reconfigurable Sequence Mapping Table (SMT) and Finite State Machine (FSM) control, the chip realizes cycle-accurate Processing Element (PE) control, automatic loop tiling and latency hiding without extra time slots for pre-configuration. Evaluated on AlexNet and VGG-16, the IECA retains over 97.3% PE utilization and over 95.6% memory access time hiding on average. The chip is designed and fabricated in a UMC 55-nm process running at a frequency of 250 MHz and achieves an area efficiency of 30.55 GOPS/mm² and 0.244 GOPS/KGE (kilo-gate-equivalent), improvements of over 2.0x and 2.1x, respectively, compared with previous related works. Implementation of the IEC control scheme uses only 0.55% of the area of the 2.75 mm² core.
8.
  • Jin, Yi, et al. (author)
  • Self-aware distributed deep learning framework for heterogeneous IoT edge devices
  • 2021
  • In: Future generations computer systems. - : Elsevier BV. - 0167-739X .- 1872-7115. ; 125, pp. 908-920
  • Journal article (peer-reviewed)
    • Implementing artificial intelligence (AI) in the Internet of Things (IoT) involves a move from the cloud to the heterogeneous and low-power edge, following an urgent demand for deploying complex training tasks in a distributed and reliable manner. This work proposes a self-aware distributed deep learning (DDL) framework for IoT applications, which is applicable to heterogeneous edge devices aiming to improve adaptivity and amortize the training cost. The self-aware design including the dynamic self-organizing approach and the self-healing method enhances the system reliability and resilience. Three typical edge devices are adopted with cross-platform Docker deployment: Personal Computers (PC) for general computing devices, Raspberry Pi 4Bs (Rpi) for resource-constrained edge devices, and Jetson Nanos (Jts) for AI-enabled edge devices. Benchmarked with ResNet-32 on CIFAR-10, the training efficiency of tested distributed clusters is increased by 8.44x compared to the standalone Rpi. The cluster with 11 heterogeneous edge devices achieves a training efficiency of 200.4 images/s and an accuracy of 92.45%. Results prove that the self-organizing approach functions well with dynamic changes like devices being removed or added. The self-healing method is evaluated with various stabilities, cluster scales, and breakdown cases, testifying that the reliability can be largely enhanced for extensively distributed deployments. The proposed DDL framework shows excellent performance for training implementation with heterogeneous edge devices in IoT applications with high-degree scalability and reliability.
9.
  • Jin, Yi, et al. (author)
  • TMR Group Coding Method for Optimized SEU and MBU Tolerant Memory Design
  • 2018
  • In: 2018 IEEE International Symposium on Circuits and Systems (ISCAS). - : IEEE. - 9781538648810
  • Conference paper (peer-reviewed)
    • This work proposes a fault-tolerant memory design using Triple Module Redundancy (TMR) group coding to tolerate the influence of Single-Event Upsets (SEUs) and Multi-Bit Upsets (MBUs) on memory devices in the space environment. The group coding method uses different models to partition and code each word line in memory with Hamming code to achieve the best performance. The TMR group coding method further increases the capability of self-correction for errors occurring in parity bits. The evaluation results show that the suggested approach can obtain improved correctness of the memory output with an optimized tradeoff between reliability and cost. At a 5% error rate, the probability of correct output reaches 70.78% with a small cost increment. To achieve 90% reliability, the accuracy improvement is 31.9% compared to TMR, with 9% increased area. The proposed solution is evaluated on a memory-rich micro-coded processor, but can be further extended to other memory-based processors that need high reliability against SEU and MBU influence in aerospace applications.
10.
  • Li, Sirui, et al. (author)
  • Glioma grading, molecular feature classification, and microstructural characterization using MR diffusional variance decomposition (DIVIDE) imaging
  • 2021
  • In: European Radiology. - : Springer Science and Business Media LLC. - 0938-7994 .- 1432-1084. ; 31:11, pp. 8197-8207
  • Journal article (peer-reviewed)
    • Objective: To evaluate the potential of diffusional variance decomposition (DIVIDE) for grading, molecular feature classification, and microstructural characterization of gliomas. Materials and methods: Participants with suspected gliomas underwent DIVIDE imaging, yielding parameter maps of fractional anisotropy (FA), mean diffusivity (MD), anisotropic mean kurtosis (MKA), isotropic mean kurtosis (MKI), total mean kurtosis (MKT), MKA/MKT, and microscopic fractional anisotropy (μFA). Tumor type and grade, isocitrate dehydrogenase (IDH) 1/2 mutant status, and the Ki-67 labeling index (Ki-67 LI) were determined after surgery. Statistical analysis included 33 high-grade gliomas (HGG) and 17 low-grade gliomas (LGG). Tumor diffusion metrics were compared between HGG and LGG, among grades, and between wild and mutated IDH types using appropriate tests according to normality assessment results. Receiver operating characteristic and Spearman correlation analysis were also used for statistical evaluations. Results: FA, MD, MKA, MKI, MKT, μFA, and MKA/MKT differed between HGG and LGG (FA: p = 0.047; MD: p = 0.037, others p < 0.001), and among glioma grade II, III, and IV (FA: p = 0.048; MD: p = 0.038, others p < 0.001). All diffusion metrics differed between wild-type and mutated IDH tumors (MKI: p = 0.003; others: p < 0.001). The metrics that best discriminated between HGG and LGG and between wild-type and mutated IDH tumors were MKT and FA respectively (area under the curve 0.866 and 0.881). All diffusion metrics except FA showed significant correlation with Ki-67 LI, and MKI had the highest correlation coefficient (rs = 0.618). Conclusion: DIVIDE is a promising technique for glioma characterization and diagnosis. Key Points: • The DIVIDE metric MKI is related to cell density heterogeneity, while MKA and μFA are related to cell eccentricity. • DIVIDE metrics can effectively differentiate LGG from HGG and IDH mutation from wild-type tumor, and showed significant correlation with the Ki-67 labeling index. • MKI was larger than MKA, which indicates predominant cell density heterogeneity in gliomas. • MKA and MKI increased with grade or degree of malignancy, however with a relatively larger increase in the cell eccentricity metric MKA in relation to the cell density heterogeneity metric MKI.