SwePub

Result list for search "WFRF:(Xu Jiawei) srt2:(2021)"

  • Results 1-6 of 6
1.
  • Jin, Yi, et al. (author)
  • Self-aware distributed deep learning framework for heterogeneous IoT edge devices
  • 2021
  • In: Future generations computer systems. - : Elsevier BV. - 0167-739X .- 1872-7115. ; 125, pp. 908-920
  • Journal article (peer-reviewed). Abstract:
    • Implementing artificial intelligence (AI) in the Internet of Things (IoT) involves a move from the cloud to the heterogeneous and low-power edge, following an urgent demand for deploying complex training tasks in a distributed and reliable manner. This work proposes a self-aware distributed deep learning (DDL) framework for IoT applications, which is applicable to heterogeneous edge devices aiming to improve adaptivity and amortize the training cost. The self-aware design including the dynamic self-organizing approach and the self-healing method enhances the system reliability and resilience. Three typical edge devices are adopted with cross-platform Docker deployment: Personal Computers (PC) for general computing devices, Raspberry Pi 4Bs (Rpi) for resource-constrained edge devices, and Jetson Nanos (Jts) for AI-enabled edge devices. Benchmarked with ResNet-32 on CIFAR-10, the training efficiency of tested distributed clusters is increased by 8.44x compared to the standalone Rpi. The cluster with 11 heterogeneous edge devices achieves a training efficiency of 200.4 images/s and an accuracy of 92.45%. Results prove that the self-organizing approach functions well with dynamic changes like devices being removed or added. The self-healing method is evaluated with various stabilities, cluster scales, and breakdown cases, testifying that the reliability can be largely enhanced for extensively distributed deployments. The proposed DDL framework shows excellent performance for training implementation with heterogeneous edge devices in IoT applications with high-degree scalability and reliability.
2.
  • Gehrmann, Sebastian, et al. (author)
  • The GEM Benchmark : Natural Language Generation, its Evaluation and Metrics
  • 2021
  • In: The 1st Workshop on Natural Language Generation, Evaluation, and Metrics. - Stroudsburg, PA, USA : Association for Computational Linguistics. ; pp. 96-120
  • Conference paper (peer-reviewed). Abstract:
    • We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. Due to this moving target, new models often still evaluate on divergent anglo-centric corpora with well-established, but flawed, metrics. This disconnect makes it challenging to identify the limitations of current models and opportunities for progress. Addressing this limitation, GEM provides an environment in which models can easily be applied to a wide set of tasks and in which evaluation strategies can be tested. Regular updates to the benchmark will help NLG research become more multilingual and evolve the challenge alongside models. This paper serves as the description of the data for the 2021 shared task at the associated GEM Workshop.
3.
  • Huang, Boming, et al. (author)
  • IECA : An In-Execution Configuration CNN Accelerator With 30.55 GOPS/mm² Area Efficiency
  • 2021
  • In: IEEE Transactions on Circuits and Systems Part 1. - : Institute of Electrical and Electronics Engineers (IEEE). - 1549-8328 .- 1558-0806. ; 68:11, pp. 4672-4685
  • Journal article (peer-reviewed). Abstract:
    • It remains challenging for a Convolutional Neural Network (CNN) accelerator to maintain high hardware utilization and low processing latency with restricted on-chip memory. This paper presents an In-Execution Configuration Accelerator (IECA) that realizes an efficient control scheme, exploring architectural data reuse, unified in-execution controlling, and pipelined latency hiding to minimize configuration overhead out of the computation scope. The proposed IECA achieves row-wise convolution with tiny distributed buffers and reduces the size of total on-chip memory by removing 40% of redundant memory storage with shared delay chains. By exploiting a reconfigurable Sequence Mapping Table (SMT) and Finite State Machine (FSM) control, the chip realizes cycle-accurate Processing Element (PE) control, automatic loop tiling, and latency hiding without extra time slots for pre-configuration. Evaluated on AlexNet and VGG-16, the IECA retains over 97.3% PE utilization and over 95.6% memory access time hiding on average. The chip is designed and fabricated in a UMC 55-nm process running at a frequency of 250 MHz and achieves an area efficiency of 30.55 GOPS/mm² and 0.244 GOPS/KGE (kilo-gate-equivalent), improvements of over 2.0x and 2.1x, respectively, compared with previous related works. The implementation of the IEC control scheme occupies only 0.55% of the 2.75 mm² core area.
4.
  • Vogelsang, Jan, et al. (author)
  • Coherent Excitation and Control of Plasmons on Gold Using Two-Dimensional Transition Metal Dichalcogenides
  • 2021
  • In: ACS Photonics. - : American Chemical Society (ACS). - 2330-4022. ; 8:6, pp. 1607-1615
  • Journal article (peer-reviewed). Abstract:
    • The hybrid combination of two-dimensional (2D) transition metal dichalcogenides (TMDs) and plasmonic materials opens up novel means of (ultrafast) optoelectronic applications and manipulation of nanoscale light-matter interaction. However, control of the plasmonic excitations by TMDs themselves has not been investigated. Here, we show that the ultrathin 2D WSe2 crystallites permit nanoscale spatially controlled coherent excitation of surface plasmon polaritons (SPPs) on smooth Au films. The resulting complex plasmonic interference patterns are recorded with nanoscale resolution in a photoemission electron microscope. Modeling shows good agreement with experiments and further indicates how SPPs can be tailored with high spatiotemporal precision using the shape of the 2D TMDs with thicknesses down to single molecular layers. We demonstrate the use of WSe2 nanocrystals as 2D optical elements for exploring the ultrafast dynamics of SPPs. Using few-femtosecond laser pulse pairs, we excite an SPP at the boundary of a WSe2 crystal and then have a WSe2 monolayer wedge act as a delay line inducing a spatially varying phase difference down to the attosecond time range. The observed effects are a natural yet unexplored consequence of high dielectric functional values of TMDs in the visible range that should be considered when designing metal-TMD hybrid devices. As the 2D TMD crystals are stable in air, can be defect free, can be synthesized in many shapes, and are reliably positioned on metal surfaces, using them to excite and steer SPPs adds an interesting alternative in designing hybrid structures for plasmonic control.
5.
  • Wang, Deyu, et al. (author)
  • Mapping the BCPNN Learning Rule to a Memristor Model
  • 2021
  • In: Frontiers in Neuroscience. - : Frontiers Media SA. - 1662-4548 .- 1662-453X. ; 15
  • Journal article (peer-reviewed). Abstract:
    • The Bayesian Confidence Propagation Neural Network (BCPNN) has been implemented in a way that allows mapping to neural and synaptic processes in the human cortex and has been used extensively in detailed spiking models of cortical associative memory function and recently also for machine learning applications. In conventional digital implementations of BCPNN, the von Neumann bottleneck is a major challenge with synaptic storage and access to it as the dominant cost. The memristor is a non-volatile device ideal for artificial synapses that fuses computation and storage and thus fundamentally overcomes the von Neumann bottleneck. While the implementation of other neural networks like Spiking Neural Network (SNN) and even Convolutional Neural Network (CNN) on memristor has been studied, the implementation of BCPNN has not. In this paper, the BCPNN learning rule is mapped to a memristor model and implemented with a memristor-based architecture. The implementation of the BCPNN learning rule is a mixed-signal design with the main computation and storage happening in the analog domain. In particular, the nonlinear dopant drift phenomenon of the memristor is exploited to simulate the exponential decay of the synaptic state variables in the BCPNN learning rule. The consistency between the memristor-based solution and the BCPNN learning rule is simulated and verified in Matlab, with a correlation coefficient as high as 0.99. The analog circuit is designed and implemented in the SPICE simulation environment, demonstrating a good emulation effect for the BCPNN learning rule with a correlation coefficient as high as 0.98. This work focuses on demonstrating the feasibility of mapping the BCPNN learning rule to in-circuit computation in memristor. The feasibility of the memristor-based implementation is evaluated and validated in the paper, to pave the way for a more efficient BCPNN implementation, toward a real-time brain emulation engine.
6.
  • Xu, Jiawei, et al. (author)
  • A Memristor Model with Concise Window Function for Spiking Brain-Inspired Computation
  • 2021
  • In: 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS. - : Institute of Electrical and Electronics Engineers (IEEE).
  • Conference paper (peer-reviewed). Abstract:
    • This paper proposes a concise window function to build a memristor model, simulating the widely-observed non-linear dopant drift phenomenon of the memristor. Exploiting the non-linearity, the memristor model is applied to the in-situ neuromorphic solution for a cortex-inspired spiking neural network (SNN), the spike-based Bayesian Confidence Propagation Neural Network (BCPNN). The improved memristor model utilizing the proposed window function is able to retain the boundary effect and resolve the boundary lock and inflexibility problems, while remaining simple in form, which facilitates large-scale neuromorphic model simulation. Compared with the state-of-the-art general memristor model, the proposed memristor model can achieve a 5.8x reduction of simulation time at a competitive fitting level in cortex-comparable large-scale software simulation. The evaluation results show an explicit similarity between the non-linear dopant drift phenomenon of the memristor and the BCPNN learning rule, and the memristor model is able to emulate the key traces of BCPNN with a correlation coefficient over 0.99.