SwePub

Result list for search "WFRF:(Zhang Chenxin)"


  • Results 1-10 of 18
1.
  • Andersson, Per, et al. (author)
  • Beyond von Neumann: weakly programmable processor arrays and their programming
  • 2011
  • In: [Host publication title missing].
  • Conference paper (peer-reviewed)
    • The age of parallelism is here. For sustainable software development on massively parallel architectures, the von Neumann model needs to be replaced by one with native support for parallelism. We suggest a data flow model for signal processing applications. This will make it possible to reuse software implementations across different targets and future platform generations. We also outline our development tool flow for compiling CAL, a data flow language, to parallel architectures, and present our processor array, which can be configured to handle massively parallel computations. We demonstrate its power by implementing part of a software radio receiver.
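
As a rough illustration of the data flow model mentioned in this abstract, the following minimal sketch shows actors that fire only when every input queue holds a token. It is written in Python, not CAL, and the names `Actor` and `run` as well as the two-stage example network are assumptions made for illustration; they are not taken from the paper or its tool flow.

```python
from collections import deque


class Actor:
    """A data flow actor: it may fire only when every input FIFO holds a token."""

    def __init__(self, func, inputs, output):
        self.func = func        # computation applied to one token per input
        self.inputs = inputs    # list of input FIFOs (deque objects)
        self.output = output    # output FIFO (deque object)

    def can_fire(self):
        return all(len(fifo) > 0 for fifo in self.inputs)

    def fire(self):
        tokens = [fifo.popleft() for fifo in self.inputs]
        self.output.append(self.func(*tokens))


def run(actors):
    """Fire ready actors in any order until no actor can fire."""
    progress = True
    while progress:
        progress = False
        for actor in actors:
            if actor.can_fire():
                actor.fire()
                progress = True


# Hypothetical two-stage network: multiply samples by coefficients, then add 1.
samples = deque([1, 2, 3])
coeffs = deque([10, 20, 30])
products = deque()
result = deque()
network = [
    Actor(lambda x, c: x * c, [samples, coeffs], products),
    Actor(lambda p: p + 1, [products], result),
]
run(network)
```

Because each actor depends only on the tokens in its own queues, the firing order does not change the result, which is the property that makes such a description portable across parallel targets.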
2.
  • Diaz, Isael, et al. (author)
  • A New Digital Front-End for Flexible Reception in Software Defined Radio
  • 2015
  • In: Microprocessors and Microsystems. - : Elsevier BV. - 0141-9331. ; 39:8, pp. 889-900
  • Journal article (peer-reviewed)
    • Future mobile terminals are expected to support an ever-increasing number of Radio Access Technologies (RAT) concurrently. This already poses a challenge to terminal designers today. Software Defined Radio (SDR) solutions are a compelling alternative to address this issue in the digital baseband, given their high flexibility and low Non-Recurring Engineering (NRE) cost. However, the challenge remains in the Digital Front-End (DFE), where many operations are too complex or energy hungry to be implemented as software instructions. Thus, new architectures are needed to feed the SDR digital baseband while keeping complexity and energy consumption at bay. In this article, the architecture of a Digital Front-End Receiver (DFE-Rx) for next-generation mobile terminals is presented. The flexibility needed for multi-standard support is demonstrated by detecting, synchronizing, and reporting the carrier-frequency offset of multiple concurrent radio standards. Moreover, the proposed architecture has been fabricated in a 65 nm CMOS low-power high-VT cell technology with a die size of 5 mm². The core module of the DFE-Rx, the synchronization engine, has been measured at 1.2 V and reports an average power consumption of 1.9 mW during Wireless Local Area Network (WLAN) reception and 1.6 mW during configuration, while running at 10 MHz.
3.
  • Diaz, Isael, et al. (author)
  • Next Generation Digital Front-End for Multi-Standard Concurrent Reception
  • 2013
  • In: [Host publication title missing].
  • Conference paper (peer-reviewed)
    • This article presents an architecture of a Digital Front-End Receiver (DFE-Rx) for next-generation mobile terminals. The main focus is on flexibility, scalability, and concurrency. The architecture is capable of detecting, synchronizing, and reporting the carrier-frequency offset of multiple concurrent radio standards. The proposed receiver is fabricated in a 65 nm CMOS low-power high-VT cell technology with a die size of 5 mm². The synchronization engine has been measured at 1.2 V and reports an average power consumption of 1.9 mW during IEEE 802.11 reception and 1.6 mW during configuration, while running at 10 MHz.
4.
  • Granlund, Stefan, et al. (author)
  • A Low-Latency High-Throughput Soft-Output Signal Detector for Spatial Multiplexing MIMO Systems
  • 2015
  • In: Microprocessors and Microsystems. - : Elsevier BV. - 0141-9331. ; 39:8, pp. 901-908
  • Journal article (peer-reviewed)
    • This paper presents a low-latency, high-throughput soft-output signal detector for a 4×4 64-QAM spatial-multiplexing MIMO system. To achieve high data-level parallelism and accurate soft information, the detector adopts a channel-adaptive node perturbation technique to generate a list of candidate vectors around an initial linear estimation. The detection algorithm provides a wide and convenient performance-complexity trade-off by adjusting the node perturbation parameter. A partial-parallel pipelined VLSI architecture is developed to implement the algorithm with high throughput and low processing latency, while offering the flexibility to support run-time performance tuning. Moreover, a fast and hardware-friendly node enumeration scheme is developed to further reduce the processing delay by exploiting the geometric property of the quadrature amplitude modulation (QAM) constellation. The detector was synthesized using Synopsys Design Compiler with a 65 nm CMOS standard cell library. The core area is 0.58 mm² with 290K gates. The peak throughput is 3 Gb/s at a 500 MHz clock frequency with a latency of 20 ns. Compared to other reported soft-output MIMO detectors, this is a latency reduction of 71%. The corresponding energy consumption is 33 pJ per detected bit.
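
The figures quoted in this abstract can be related to each other with a little arithmetic. The sketch below derives bits per cycle, cycles per received vector, latency in clock cycles, and an implied power figure. The derived values are not stated in the abstract; in particular, the power estimate assumes that the 33 pJ per bit figure applies at peak throughput.

```python
import math

# Figures quoted in the abstract.
ANTENNAS = 4              # 4x4 spatial multiplexing: 4 symbols per vector
QAM_ORDER = 64            # 64-QAM constellation
THROUGHPUT_GBPS = 3       # peak throughput in Gb/s
CLOCK_MHZ = 500           # clock frequency in MHz
LATENCY_NS = 20           # processing latency in ns
ENERGY_PJ_PER_BIT = 33    # energy per detected bit in pJ

# Derived quantities (not stated in the abstract).
bits_per_symbol = int(math.log2(QAM_ORDER))            # bits carried by one 64-QAM symbol
bits_per_vector = ANTENNAS * bits_per_symbol           # bits in one received 4x4 vector
bits_per_cycle = THROUGHPUT_GBPS * 1000 / CLOCK_MHZ    # Mb/s divided by MHz
cycles_per_vector = bits_per_vector / bits_per_cycle   # clock cycles between vectors
latency_cycles = LATENCY_NS * CLOCK_MHZ / 1000         # ns times MHz, scaled to cycles
# pJ/bit times Gb/s gives mW; assumes the energy figure holds at peak throughput.
power_mw = ENERGY_PJ_PER_BIT * THROUGHPUT_GBPS

print(bits_per_vector, bits_per_cycle, cycles_per_vector, latency_cycles, power_mw)
```

Under these assumptions the detector would complete one 24-bit vector every 4 clock cycles, with a latency of 10 cycles and a power draw of roughly 99 mW at peak rate.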
5.
  • Granlund, Stefan, et al. (author)
  • Implementation of a Highly-Parallel Soft-Output MIMO Detector with Fast Node Enumeration
  • 2013
  • In: [Host publication title missing].
  • Conference paper (peer-reviewed)
    • This paper presents a high-throughput, low-latency soft-output signal detector for a 4×4 64-QAM MIMO system. To achieve high data-level parallelism and accurate soft information, the detector adopts a node perturbation technique to generate a list of candidate vectors around the zero-forcing (ZF) result. Additionally, a fast and hardware-friendly node enumeration scheme is developed to significantly reduce processing delay. Implemented in a 65 nm CMOS technology, the detector occupies 0.58 mm² core area with 290K gates. The peak throughput is 3 Gb/s at a 500 MHz clock frequency with a latency of 20 ns. Energy consumption per detected bit is 33 pJ.
6.
  • Ren, Fengbo, et al. (author)
  • A Square-Root-Free Matrix Decomposition Method for Energy-Efficient Least Squares Computation on Embedded Systems
  • 2014
  • In: IEEE Embedded Systems Letters. - 1943-0663. ; 6:4, pp. 73-76
  • Journal article (peer-reviewed)
    • QR decomposition (QRD) is used to solve least squares (LS) problems for a wide range of applications. However, traditional QR decomposition methods, such as Gram-Schmidt (GS), require high computational complexity and non-linear operations to achieve high throughput, limiting their usage on resource-limited platforms. To enable efficient LS computation on embedded systems for real-time applications, this paper presents an alternative decomposition method, called QDRD, which relaxes system requirements while maintaining the same level of performance. Specifically, QDRD eliminates both the square-root operations in the normalization step and the divisions in the subsequent backward substitution. Simulation results show that the accuracy and reliability of factorization matrices can be significantly improved by QDRD, especially when executed on precision-limited platforms. Furthermore, benchmarking results on an embedded platform show that QDRD provides consistently better energy efficiency and higher throughput than GS-QRD in solving LS problems. Improvements of up to 4 times in energy efficiency and 6.5 times in throughput can be achieved for small-size problems.
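
To make the idea of a square-root-free decomposition concrete, here is a minimal sketch of a generic square-root-free Gram-Schmidt factorization, A = Q U, where Q has orthogonal but unnormalized columns and U is unit upper triangular. This is not necessarily the QDRD formulation of the paper; the function names and the exact placement of the divisions are assumptions made for illustration. The point it shows is that skipping normalization removes the square roots, and that a unit-diagonal triangular factor makes the backward substitution division-free.

```python
def sqrt_free_gram_schmidt(a):
    """Factor A (m x n, full column rank) as A = Q U without square roots.

    Returns (q, d, u): q has mutually orthogonal, unnormalized columns,
    d[j] is the squared norm of column j of q, and u is unit upper triangular.
    """
    m, n = len(a), len(a[0])
    q = [[float(v) for v in row] for row in a]   # working copy of A
    u = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Squared norm of column j; no square root is taken.
        d[j] = sum(q[i][j] ** 2 for i in range(m))
        for k in range(j + 1, n):
            # Projection coefficient of column k onto column j.
            u[j][k] = sum(q[i][j] * q[i][k] for i in range(m)) / d[j]
            for i in range(m):
                q[i][k] -= u[j][k] * q[i][j]
    return q, d, u


def solve_least_squares(a, b):
    """Solve min ||A x - b|| via U x = D^-1 Q^T b with division-free back substitution."""
    q, d, u = sqrt_free_gram_schmidt(a)
    m, n = len(a), len(a[0])
    y = [sum(q[i][j] * b[i] for i in range(m)) / d[j] for j in range(n)]
    x = y[:]
    # U has a unit diagonal, so this loop needs no divisions.
    for i in range(n - 2, -1, -1):
        x[i] -= sum(u[i][k] * x[k] for k in range(i + 1, n))
    return x


# Fit a line to three points: columns of A are the intercept and slope terms.
A = [[1, 1], [1, 2], [1, 3]]
b = [1, 2, 2]
x = solve_least_squares(A, b)
```

For this small example the solution is an intercept of 2/3 and a slope of 1/2, matching the result of the normal equations.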
7.
  • Zhang, Chenxin, et al. (author)
  • A coarse-grained dynamically reconfigurable architecture for digital signal processing
  • 2009
  • In: 9th Swedish System-On-Chip Conference.
  • Conference paper (other academic/artistic)
    • This paper presents the design and implementation of a coarse-grained reconfigurable architecture targeting digital signal processing applications. The proposed architecture is constructed from a mesh of resource cells containing separate processing and memory elements that communicate via a hybrid interconnect network. The parameterizable design of the resource cells enables flexible static mapping of arbitrary applications, and dynamic reconfigurability provides mapping possibilities during system run-time to adapt to the current operational and processing conditions. Functionality is demonstrated by mapping a radix-2² FFT processor reconfigurable between 32 and 1,024 points. Performance evaluation shows high reconfigurability and reduced execution time compared to an ordinary DSP solution.
8.
  • Zhang, Chenxin, et al. (author)
  • A Heterogeneous Reconfigurable Cell Array for MIMO Signal Processing
  • 2015
  • In: IEEE Transactions on Circuits and Systems Part 1: Regular Papers. - 1549-8328. ; 62:3, pp. 733-742
  • Journal article (peer-reviewed)
    • This paper presents a heterogeneous reconfigurable cell array, designed for high-throughput baseband processing of multiple-input multiple-output (MIMO) systems. To achieve high performance and energy efficiency while retaining high flexibility, the proposed architecture adopts heterogeneous and hierarchical resource deployments. Additionally, extensive vector computation enhancements and flexible memory access schemes are employed to better support MIMO signal processing. Implemented in a 65 nm CMOS technology, the cell array occupies 8.88 mm² core area and is capable of running at 500 MHz. For illustration, three computationally intensive blocks, namely channel estimation, channel matrix pre-processing, and hard-output data detection, of a 4×4 MIMO processing chain in a 20 MHz 64-QAM 3GPP long term evolution advanced (LTE-A) downlink are mapped and processed in real-time. Implementation results report a maximum throughput of 367.88 Mb/s with 1.49 nJ/b energy consumption. Compared to state-of-the-art designs, the proposed solution outperforms programmable platforms by several orders of magnitude in energy efficiency, and achieves a level of efficiency similar to that of ASICs.
9.
  • Zhang, Chenxin, et al. (author)
  • A Highly Parallelized MIMO Detector for Vector-Based Reconfigurable Architectures
  • 2013
  • In: [Host publication title missing]. ; , pp. 3844-3849
  • Conference paper (peer-reviewed)
    • This paper presents a highly parallelized MIMO signal detection algorithm targeting vector-based reconfigurable architectures. The detector achieves high data-level parallelism and near-ML performance by adopting a vector-architecture-friendly technique, parallel node perturbation. To further reduce the computational complexity, imbalanced node and successive partial node expansion schemes are applied in conjunction with sorted QR decomposition. The effectiveness of the proposed algorithm is evaluated by operation analysis and by simulations performed on a simplified 4×4 MIMO LTE-A testbed. Compared to the K-Best detector and the fixed-complexity sphere decoder (FSD), the number of visited nodes in the proposed algorithm is reduced by factors of 15 and 1.9, respectively, with less than 1 dB performance degradation. Benefiting from the fully deterministic non-iterative dataflow structure, the reconfiguration rate is 95% lower than that of the K-Best detector and 17% lower than that of the FSD.
10.
  • Zhang, Chenxin, et al. (author)
  • Design of coarse-grained dynamically reconfigurable architecture for DSP applications
  • 2009
  • In: International Conference on Reconfigurable Computing and FPGAs. - 9780769539171 ; , pp. 338-343
  • Conference paper (peer-reviewed)
    • This paper presents the design and implementation of a coarse-grained reconfigurable architecture targeting digital signal processing applications. The proposed architecture is constructed from a mesh of resource cells containing separate processing and memory elements that communicate via a hybrid interconnect network. The parameterizable design of the resource cells enables flexible mapping of arbitrary applications at system compile-time, and dynamic reconfigurability provides mapping possibilities during system run-time to adapt to the current operational and processing conditions. The functionality and flexibility of the proposed architecture are demonstrated by mapping a radix-2² FFT processor reconfigurable between 32 and 1024 points. Performance evaluation shows high reconfigurability and reduced execution time compared to a traditional DSP and ARM solution.
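
For readers unfamiliar with the FFT sizes mentioned in this abstract, the sketch below shows a plain recursive radix-2 decimation-in-time FFT that accepts any power-of-two length, including the 32 to 1024 point range. It is a software stand-in only: it is not the radix-2² decomposition used in the paper and says nothing about how the transform is mapped onto the resource cells. The names `fft_radix2`, `dft`, and `SUPPORTED_SIZES` are assumptions made for illustration.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # transform of even-indexed samples
    odd = fft_radix2(x[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd half (butterfly operation).
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft(x):
    """Direct O(n^2) DFT, used only as a reference for checking the FFT."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Transform lengths in the range quoted in the abstract: 32, 64, ..., 1024.
SUPPORTED_SIZES = [2 ** p for p in range(5, 11)]

signal = [complex(i % 7, i % 3) for i in range(SUPPORTED_SIZES[0])]
spectrum = fft_radix2(signal)
```

Changing the transform length here only changes the recursion depth (5 levels for 32 points, 10 for 1024), which gives a feel for why a run-time reconfigurable size is a natural fit for this algorithm family.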