SwePub

Search: WFRF:(Öwall Viktor) > (2015-2019)

  • Result 1-20 of 20
1.
  • Diaz, Isael, et al. (author)
  • A 350μW Sign-Bit architecture for multi-parameter estimation during OFDM acquisition in 65nm CMOS
  • 2015
  • In: 2015 IEEE International Symposium on Circuits and Systems (ISCAS). - 9781479983919
  • Conference paper (peer-reviewed)abstract
    • Correct estimation of symbol timing, Carrier Frequency Offset (CFO), and Signal-to-Noise Ratio (SNR) is crucial in Orthogonal Frequency Division Multiplexing (OFDM) communication. Typically, high estimation accuracy is desired, but it often comes with increased complexity, which has a direct repercussion on energy consumption. In this article, an architecture based on Sign-Bit estimation with low complexity, and hence low power dissipation, is presented. The architecture is capable of estimating the aforementioned parameters in virtually any OFDM standard. The proof of concept has been fabricated in 65 nm CMOS technology with low-power high-VT cells. Measurements performed with a supply voltage of 1.2 V resulted in a power dissipation of 350 μW, 6 times smaller than that of an equivalent 8-bit architecture, and the lowest power density reported in the literature.
  •  
2.
  •  
3.
  • Diaz, Isael, et al. (author)
  • A New Digital Front-End for Flexible Reception in Software Defined Radio
  • 2015
  • In: Microprocessors and Microsystems. - : Elsevier BV. - 0141-9331. ; 39:8, s. 889-900
  • Journal article (peer-reviewed)abstract
    • Future mobile terminals are expected to support an ever increasing number of Radio Access Technologies (RAT) concurrently. This imposes a challenge to terminal designers already today. Software Defined Radio (SDR) solutions are a compelling alternative to address this issue in the digital baseband, given their high flexibility and low Non-Recurring Engineering (NRE) cost. However, the challenge still remains in the Digital Front-End (DFE), where many operations are too complex or energy hungry to be implemented as software instructions. Thus, new architectures are needed to feed the SDR digital baseband while keeping complexity and energy consumption at bay. In this article, the architecture of a Digital Front-End Receiver (DFE-Rx) for next-generation mobile terminals is presented. The flexibility needed for multi-standard support is demonstrated by detecting, synchronizing, and reporting the carrier-frequency offset of multiple concurrent radio standards. Moreover, the proposed architecture has been fabricated in a 65 nm CMOS low-power high-VT cell technology with a die size of 5 mm². The core module of the DFE-Rx, the synchronization engine, has been measured at 1.2 V and reports an average power consumption of 1.9 mW during Wireless Local Area Network (WLAN) reception and 1.6 mW during configuration, while running at 10 MHz.
  •  
4.
  • Granlund, Stefan, et al. (author)
  • A Low-Latency High-Throughput Soft-Output Signal Detector for Spatial Multiplexing MIMO Systems
  • 2015
  • In: Microprocessors and Microsystems. - : Elsevier BV. - 0141-9331. ; 39:8, s. 901-908
  • Journal article (peer-reviewed)abstract
    • This paper presents a low-latency, high-throughput soft-output signal detector for a 4x4 64-QAM spatial-multiplexing MIMO system. To achieve high data-level parallelism and accurate soft information, the detector adopts a channel-adaptive node perturbation technique to generate a list of candidate vectors around an initial linear estimation. The detection algorithm provides a wide-ranging and convenient performance-complexity trade-off by adjusting the node perturbation parameter. A partial-parallel pipelined VLSI architecture is developed to implement the algorithm with high throughput and low processing latency, while offering the flexibility to support run-time performance tuning. Moreover, a fast and hardware-friendly node enumeration scheme is developed to further reduce the processing delay by exploiting the geometric property of the quadrature amplitude modulation (QAM) constellation. The detector was synthesized using Synopsys Design Compiler with a 65 nm CMOS standard cell library. The core area is 0.58 mm² with 290K gates. The peak throughput is 3 Gb/s at 500 MHz clock frequency with a latency of 20 ns. Compared to other reported soft-output MIMO detectors, this is a latency reduction of 71%. The corresponding energy consumption is 33 pJ per detected bit.
  •  
5.
  • Liu, Yangxurui, et al. (author)
  • Adaptive Resource Scheduling for Energy Efficient QRD Processor with DVFS
  • 2015
  • Conference paper (peer-reviewed)abstract
    • This paper presents an energy-efficient adaptive QR decomposition scheme for Long Term Evolution Advanced (LTE-A) downlink systems. The proposed scheme provides robust performance over fluctuating wireless channels while maintaining a lower workload on reconfigurable hardware. A statistics-based algorithm-switching strategy is employed in the scheme to achieve workload reduction and a stable computing resource requirement for QR decomposition. With run-time resource allocation, computing resources are assigned to the segments with the highest performance gain to reduce performance loss. By utilizing the dynamic voltage and frequency scaling (DVFS) technique, we further exploit the potential for power saving in various workload situations while maintaining fixed throughput. The proposed technique brings a power reduction of up to 57.8% in the EVA-5 scenario and 24.4% with a maximum SNR loss of 1 dB in the EVA-70 scenario, when mapped on a coarse-grain reconfigurable vector-based platform.
  •  
6.
  • Liu, Yangxurui, et al. (author)
  • An Area-Efficient On-Chip Memory System for Massive MIMO Using Channel Data Compression
  • 2018
  • In: IEEE Transactions on Circuits and Systems I: Regular Papers. - 1549-8328.
  • Journal article (peer-reviewed)abstract
    • Massive multiple-input-multiple-output has proven to deliver improvements in both spectral and transmitted energy efficiency. However, these improvements come at the cost of critical design challenges for the hardware implementation due to the huge amount of data that has to be processed immediately, especially the storage of large channel state information (CSI) matrices. This paper presents an on-chip memory system equipped with CSI which provides high area efficiency, while supporting flexible accesses and high bandwidths. Optimization across system-algorithm-hardware is used to develop hardware-friendly compression algorithms exploring propagation characteristics and large antenna-array features. More specifically, group-based and spatial-angular algorithms are implemented in a heterogeneous memory system, which consists of a unified memory for storing compressed CSI and a parallel memory for flexible access. Up to 75% memory can be saved for a 128-antenna system, at a performance loss of less than 0.8 dB. Implemented in ST 28 nm FD-SOI technology, the capacity of the designed system is 1.06 Mb, which is equivalent to 4 Mb uncompressed memory and can store 100 128x10 channel matrices. The area is 0.47 mm², demonstrating a 58% reduction compared with a memory system without CSI compression. With a supply voltage of 1.0 V, the memory system can run at 833 MHz, providing an 833 Gb/s access bandwidth.
  •  
7.
  • Liu, Yangxurui, et al. (author)
  • Architecture Design of a Memory Subsystem for Massive MIMO Baseband Processing
  • 2017
  • In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems. - 1063-8210. ; , s. 2976-2980
  • Journal article (peer-reviewed)abstract
    • This brief presents an on-chip memory subsystem for massive multiple-input-multiple-output (MIMO) baseband processing at the base station. In massive MIMO systems, the required memory bandwidth and capacity are orders of magnitude higher than those used in conventional wireless systems, due to the large number of serving antennas. These are further combined with design targets on low access latency and flexibility in data organization and access modes. This brief applies and improves the concept of parallel memories to achieve the challenging design target with low hardware overhead. As a case study, a memory subsystem for 128-antenna and 16-user massive MIMO systems is evaluated using ST 28-nm technology. According to postlayout simulation results, the proposed memory subsystem provides 512-Gb/s throughput and offers 1-Mb capacity with a cost of 0.30 mm².
  •  
8.
  • Liu, Yangxurui, et al. (author)
  • Reducing On-chip Memory for Massive MIMO Baseband Processing using Channel Compression
  • 2018
  • In: 2017 IEEE 86th Vehicular Technology Conference: VTC2017-Fall. ; , s. 1-5
  • Conference paper (peer-reviewed)abstract
    • Employing a large number of antennas at the base station, massive MIMO significantly improves spectral efficiency and transmit power efficiency. On the other hand, massive MIMO also introduces unprecedented implementation challenges, especially in terms of processing and storage of large-size channel state information (CSI) matrices. Since on-chip memory is generally very expensive and has limited storage capacity, this paper uses the concept of on-chip CSI data compression and decompression to reduce memory requirements during baseband processing. To achieve this, massive MIMO channel properties are explored using a hardware-friendly DFT-based compression algorithm. The proposed method is evaluated with measured channel data at 2.6 GHz using a 128-antenna linear array [1]. Simulation results show that aggressive CSI compression can be adopted without significant loss in communication performance, while the DFT-based compression can be conveniently integrated into the on-chip memory. This enables a large reduction of required on-chip memory, with negligible hardware overhead for compression/decompression.
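    A minimal sketch of the general idea behind DFT-based CSI compression, not the paper's actual algorithm (the abstract gives no implementation details). It assumes a synthetic channel vector across a 128-antenna linear array made of two plane waves that fall exactly on DFT bins, so the vector is exactly sparse in the angular (DFT) domain. Measured channels are only approximately sparse, so a real system would trade the number of kept coefficients against reconstruction error. All function names and parameters are hypothetical.

    ```python
    import cmath

    def dft(x):
        """Unnormalized DFT straight from the definition (O(N^2), fine for N=128)."""
        n = len(x)
        return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
                for k in range(n)]

    def idft(X):
        """Inverse of dft(), including the 1/N normalization."""
        n = len(X)
        return [sum(X[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n
                for m in range(n)]

    def compress(h, keep):
        """Transform to the angular domain and keep the `keep` strongest bins."""
        H = dft(h)
        strongest = sorted(range(len(H)), key=lambda k: abs(H[k]), reverse=True)[:keep]
        return [(k, H[k]) for k in sorted(strongest)]  # (bin index, coefficient)

    def decompress(coeffs, n):
        """Zero-fill the discarded bins and transform back to the antenna domain."""
        H = [0j] * n
        for k, value in coeffs:
            H[k] = value
        return idft(H)

    def relative_error(h, h_hat):
        """Root of (error energy / signal energy)."""
        num = sum(abs(a - b) ** 2 for a, b in zip(h, h_hat))
        den = sum(abs(a) ** 2 for a in h)
        return (num / den) ** 0.5

    N = 128  # antennas, as in the abstract
    # Assumed synthetic channel: two plane waves on DFT bins 5 and 40.
    h = [cmath.exp(2j * cmath.pi * 5 * m / N) + 0.5 * cmath.exp(2j * cmath.pi * 40 * m / N)
         for m in range(N)]

    coeffs = compress(h, keep=2)   # store 2 coefficients instead of 128 samples
    h_hat = decompress(coeffs, N)
    err = relative_error(h, h_hat)
    ```

    In this idealized case two stored coefficients reproduce all 128 antenna samples up to floating-point error; with off-grid or rich-scattering channels the energy leaks across bins and `keep` has to grow.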
  •  
9.
  • Mahdavi, Mojtaba, et al. (author)
  • A Low Complexity Massive MIMO Detection Scheme Using Angular-Domain Processing
  • 2018
  • In: IEEE Global Conference on Signal and Information Processing (GlobalSIP). - 9781728112961 - 9781728112954 ; , s. 181-185
  • Conference paper (peer-reviewed)abstract
    • Signal processing complexity and required memory become problematic in massive MIMO systems as the dimension of the channel state information (CSI) matrix grows significantly with the large number of antennas and users. To address these challenges, we propose the first angular-domain massive MIMO detection scheme, which is based on three concepts: transferring the baseband processing from the spatial domain to the angular domain; exploiting the sparsity of received beams to reduce the dimension of the CSI matrix; and performing the whole detection and precoding in the angular domain using the reduced CSI matrix. We have measured the massive MIMO channel at 2.6 GHz with a 128-antenna linear array communicating with 16 users to evaluate our scheme. Complexity analysis and simulations show that the proposed idea leads to a 40% to 70% reduction in processing complexity and memory without significant performance loss, which significantly outperforms the antenna-domain schemes.
  •  
10.
  • Mahdavi, Mojtaba, et al. (author)
  • A low latency and area efficient FFT processor for massive MIMO systems
  • 2017
  • In: IEEE International Symposium on Circuits and Systems (ISCAS), 2017 - Proceedings. - 9781509014279 - 9781467368537 ; , s. 1-4
  • Conference paper (peer-reviewed)abstract
    • A low-latency and area-efficient FFT/IFFT scheme is presented. The main idea is to utilize OFDM guard bands to reduce the operation count and processing time, which results in a 42% latency reduction compared to the reported pipelined schemes. To realize this idea, a modified pipelined architecture and an efficient data scheduling scheme are proposed. Furthermore, the proposed architecture is scalable to different FFT sizes and is also reconfigurable to support a wide range of applications. A 2048-point FFT/IFFT processor based on the proposed scheme has been designed, resulting in a latency of 1200 clock cycles, which can address the low-latency demand of massive MIMO systems. Synthesis results in a 28 nm CMOS technology show that the proposed design attains a throughput of 1 GS/s when clocked at 500 MHz.
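    As a rough reading aid, the figures quoted in this abstract can be combined as follows. This is a back-of-the-envelope sketch that assumes the 1200-cycle latency is counted at the same 500 MHz clock as the throughput figure, which the abstract does not state explicitly. Variable names are illustrative.

    ```python
    # Figures taken from the abstract above.
    clock_hz = 500e6        # synthesis clock frequency
    throughput_sps = 1e9    # 1 GS/s
    latency_cycles = 1200   # 2048-point FFT/IFFT latency in clock cycles

    # Samples processed per clock cycle implied by throughput and clock.
    samples_per_cycle = throughput_sps / clock_hz

    # Latency in microseconds, ASSUMING the 1200 cycles run at clock_hz.
    latency_us = latency_cycles / clock_hz * 1e6

    print(f"{samples_per_cycle:.0f} samples/cycle, {latency_us:.1f} us latency")
    ```

    Under that assumption the design handles two samples per cycle and the 2048-point transform completes in about 2.4 µs.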
  •  
11.
  • Mahdavi, Mojtaba, et al. (author)
  • A Low Latency FFT/IFFT Architecture for Massive MIMO Systems Utilizing OFDM Guard Bands
  • 2019
  • In: IEEE Transactions on Circuits and Systems I: Regular Papers. - 1558-0806. ; 66:7, s. 2763-2774
  • Journal article (peer-reviewed)abstract
    • A considerable part of latency in the baseband of massive multiple-input multiple-output (MIMO) systems is introduced by orthogonal frequency division multiplexing (OFDM) (de)modulation. To address the low-latency demand of massive MIMO systems, a fast Fourier transform (FFT) processor and corresponding reordering scheme are proposed, which reduce the processing latency and reordering latency of OFDM-based systems, respectively. The main idea is to utilize the OFDM guard bands to decrease the number of required computations and thus the processing time. In case of a 2048-point IFFT, the proposed scheme leads to 42% reduction in latency compared to the reported pipelined schemes at the cost of 4% additional memory, which is around 2.4% of the total chip area. To realize this idea, a modified pipelined architecture with a reorganized memory structure and also an efficient data scheduling mechanism for memories and butterflies are developed. Using the proposed scheme, a 2048-point FFT/IFFT processor has been implemented in a 28 nm complementary metal-oxide-semiconductor technology. The post-layout simulations show that our design achieves a throughput of 0.6 GS/s and 1200 clock cycles latency, the lowest latency reported to-date for single-input pipelined FFT/IFFT architectures.
  •  
12.
  • Malkowsky, Steffen, et al. (author)
  • A programmable 16-lane SIMD ASIP for massive MIMO
  • 2019
  • In: 2019 IEEE International Symposium on Circuits and Systems, ISCAS 2019 - Proceedings. - 9781728103976 ; 2019
  • Conference paper (peer-reviewed)abstract
    • This paper presents a 16-lane, 16-bit complex application-specific instruction processor (ASIP) for baseband processing in massive multiple-input multiple-output (MIMO). The architecture utilizes a 3/4-way very large instruction word (VLIW) with highly efficient pre- and post-processing units specifically trimmed for massive MIMO requirements. Architecture optimizations include features such as single-cycle vector dot product, vector indexing and broadcasting, hardware loops, and a full complex accumulator to provide high performance for various massive MIMO algorithms. Moreover, the ASIP is fully C-programmable, which is crucial for adapting to the evolving 5G standard. In our evaluation, a full massive MIMO up-link detection is executed in ≈11k clock cycles, while synthesis results in ST 28 nm FD-SOI suggest a clock frequency of 900 MHz, equating to a detection throughput of 330 Mb/s for a 128×16 massive MIMO system.
  •  
13.
  • Malkowsky, Steffen, et al. (author)
  • Building and operating a real-time massive MIMO testbed - Lessons learned
  • 2018
  • In: Conference Record of 51st Asilomar Conference on Signals, Systems and Computers, ACSSC 2017. - 9781538618233 ; 2017-October, s. 603-607
  • Conference paper (peer-reviewed)abstract
    • Massive multiple-input multiple-output (MIMO) is one of the key candidates for the upcoming 5G wireless generation. It offers a multitude of advantages over traditional techniques, such as reduced latency, reduced interference among user equipments (UEs), and increased spectrum and energy efficiencies. However, to verify the theoretically promised gains in real life, prototype systems are indispensable. This paper discusses the many experiences gathered during designing, building, and operating the Lund University Massive MIMO (LuMaMi) testbed. We discuss the six main lessons learned, including practical issues such as the mechanical setup and driver issues, as well as implementation challenges such as the increased operation count compared to traditional wireless systems, complicated data shuffling, and low processing latency, and detail their specific requirements.
  •  
14.
  • Malkowsky, Steffen, et al. (author)
  • Implementation of Low-latency Signal Processing and Data Shuffling for TDD massive MIMO Systems
  • 2017
  • In: ; , s. 260-265
  • Conference paper (peer-reviewed)abstract
    • Low latency signal processing and high throughput implementations are required in order to realize real-time TDD massive MIMO communications, especially in high mobility scenarios. One of the main challenges is that the up-link and down-link turnaround time has to be within the coherence time of the wireless channel to enable efficient use of reciprocity. This paper presents a hardware architecture and implementation of this critical signal processing path, including channel estimation, QRD-based MMSE decoder/precoder and distributed reciprocity calibration. Furthermore, we detail a switch-based router implementation to tackle the stringent throughput and latency requirements on the data shuffling network. The proposed architecture was verified on the LuMaMi testbed, based on the NI SDR platform. The implementation supports real-time TDD transmission in a 128x12 massive MIMO setup using 20 MHz channel bandwidth. The processing latency in the critical path is less than 0.15 ms, enabling reciprocity-based TDD massive MIMO for high-mobility scenarios.
  •  
15.
  • Malkowsky, Steffen, et al. (author)
  • The World's First Real-Time Testbed for Massive MIMO: Design, Implementation, and Validation
  • 2017
  • In: IEEE Access. - 2169-3536. ; , s. 9073-9088
  • Journal article (peer-reviewed)abstract
    • This paper sets up a framework for designing a massive multiple-input multiple-output (MIMO) testbed by investigating hardware and system-level requirements such as processing complexity, duplexing mode, and frame structure. Taking these into account, a generic system and processing partitioning is proposed which allows flexible scaling and processing distribution onto a multitude of physically separated devices. Based on the given hardware constraints, such as maximum number of links and maximum throughput for peer-to-peer interconnections, combined with processing capabilities, the framework makes it possible to evaluate available off-the-shelf hardware components. To verify our design approach, we present the Lund University Massive MIMO (LuMaMi) testbed, which constitutes the first reconfigurable real-time hardware platform for prototyping massive MIMO. Utilizing up to 100 base station antennas and more than 50 field-programmable gate arrays (FPGAs), up to 12 user equipments are served on the same time/frequency resource using an LTE-like OFDM TDD-based transmission scheme. Field trials with this system show that massive MIMO can spatially separate a multitude of users in a static indoor and static/dynamic outdoor environment.
  •  
16.
  •  
17.
  • Tang, Wei, et al. (author)
  • A 1.8Gb/s 70.6pJ/b 128×16 link-adaptive near-optimal massive MIMO detector in 28nm UTBB-FDSOI
  • 2018
  • In: 2018 IEEE International Solid-State Circuits Conference, ISSCC 2018. - 9781538622278 - 9781509049400 ; 61, s. 224-226
  • Conference paper (peer-reviewed)abstract
    • This work presents a 2.0 mm² 128×16 massive MIMO detector IC that provides 21 dB array gain and 16x multiplexing gain at the system level. The detector implements iterative expectation-propagation detection (EPD) for up to 256-QAM modulation. Tested with measured channel data [1], the detector achieves 4.3 dB processing gain over state-of-the-art massive MIMO detectors [2, 3], enabling 2.7x reduction in transmit power for battery-powered mobile terminals. The IC uses link-adaptive processing to meet a variety of practical channel conditions with scalable energy consumption. The design is realized in a condensed systolic array architecture and an approximate moment-matching circuitry to reach 1.8 Gb/s at 70.6 pJ/b. The performance and energy efficiency can be tuned over a wide range by UTBB-FDSOI body bias.
  •  
18.
  • Zhang, Chenxin, et al. (author)
  • A Heterogeneous Reconfigurable Cell Array for MIMO Signal Processing
  • 2015
  • In: IEEE Transactions on Circuits and Systems Part 1: Regular Papers. - 1549-8328. ; 62:3, s. 733-742
  • Journal article (peer-reviewed)abstract
    • This paper presents a heterogeneous reconfigurable cell array, designed for high-throughput baseband processing of multiple-input multiple-output (MIMO) systems. To achieve high performance and energy efficiency while retaining high flexibility, the proposed architecture adopts heterogeneous and hierarchical resource deployments. Additionally, extensive vector computation enhancements and flexible memory access schemes are employed to better support MIMO signal processing. Implemented in a 65 nm CMOS technology, the cell array occupies 8.88 mm² core area and is capable of running at 500 MHz. For illustration, three computationally intensive blocks, namely channel estimation, channel matrix pre-processing, and hard-output data detection, of a 4×4 MIMO processing chain in a 20 MHz 64-QAM 3GPP long term evolution advanced (LTE-A) downlink are mapped and processed in real-time. Implementation results report a maximum throughput of 367.88 Mb/s with 1.49 nJ/b energy consumption. Compared to state-of-the-art designs, the proposed solution outperforms programmable platforms by several orders of magnitude in energy efficiency, and achieves a similar level of efficiency to that of ASICs.
  •  
19.
  • Zhang, Chenxin, et al. (author)
  • Energy Efficient Group-Sort QRD Processor with On-line Update for MIMO Channel Pre-processing
  • 2015
  • In: IEEE Transactions on Circuits and Systems Part 1: Regular Papers. - 1549-8328. ; 62:5, s. 1220-1229
  • Journal article (peer-reviewed)abstract
    • This paper presents a Sorted QR-Decomposition (SQRD) processor for 3GPP LTE-A systems. It achieves energy efficiency by co-optimizing techniques such as heterogeneous processing, reconfigurable architecture, and dual-supply voltage operation. At the algorithm level, a low-complexity hybrid decomposition scheme is adopted, which switches, depending on the energy distribution of spatial channels, between the traditional brute-force SQRD and a proposed group-sort QR update strategy. A reconfigurable vector processor is accordingly developed to support the adaptive processing with high hardware efficiency. Furthermore, an on-chip power management technique is integrated to obtain real-time power saving by adapting the voltage supply based on the instantaneous workload. As a proof of concept, we implemented the processor using a 65 nm CMOS technology and conducted post-layout simulation. The proposed SQRD processor occupies 0.71 mm² core area and has a throughput of up to 69 MQRD/s. Compared to the brute-force approach, an energy reduction of 10% to 61.8% is achieved.
  •  
20.
  • Zhang, Chenxin, et al. (author)
  • Heterogeneous Reconfigurable Processors for Real-Time Baseband Processing: From Algorithm to Architecture
  • 2015
  • Book (other academic/artistic)abstract
    • This book focuses on domain-specific heterogeneous reconfigurable architectures, demonstrating for readers a computing platform which is flexible enough to support multiple standards, multiple modes, and multiple algorithms. The content is multi-disciplinary, covering areas of wireless communication, computing architecture, and circuit design. The platform described provides real-time processing capability with reasonable implementation cost, achieving balanced trade-offs among flexibility, performance, and hardware costs. The authors discuss efficient design methods for wireless communication processing platforms, from both an algorithm and architecture design perspective. Coverage also includes computing platforms for different wireless technologies and standards, including MIMO, OFDM, Massive MIMO, DVB, WLAN, LTE/LTE-A, and 5G.
      • Discusses reconfigurable architectures, including hardware building blocks such as processing elements, memory sub-systems, Network-on-Chip (NoC), and dynamic hardware reconfiguration;
      • Describes a unique design and optimization methodology, applied to different areas and levels, including communication theory, hardware implementation, and software support;
      • Demonstrates design trade-offs during different development phases and enables readers to apply similar techniques to various applications.
  •  