SwePub

Result list for search "WFRF:(Öwall Viktor)"

  • Results 1-50 of 184
1.
  • Akgun, OmerCan, et al. (author)
  • High-level energy estimation in the sub-VT domain: simulation and measurement of a cardiac event detector
  • 2012
  • In: IEEE Transactions on Biomedical Circuits and Systems. - 1932-4545. ; 6:1, pp. 15-27
  • Journal article (peer-reviewed)
    • This paper presents a flow suitable for estimating the energy dissipation of digital standard-cell based designs that are intended to operate in the sub-threshold regime. The flow is applicable to gate-level netlists, where back-annotated toggle information is used to find the minimum energy operation point, the corresponding maximum clock frequency, and the dissipated energy per clock cycle. The application of the model is demonstrated by exploring the energy efficiency of pipelining, retiming and register balancing. Simulation results, which are obtained in a fraction of SPICE simulation time, are validated by measurements on a wavelet based cardiac event detector that was fabricated in 65 nm low-leakage high-threshold technology. The mean of the absolute modeling error is calculated as 5.2%, with a standard deviation of 6.6% over the measurement points. The cardiac event detector dissipates 0.88 pJ/sample at a supply voltage of 320 mV.
  •  
2.
  • Al-Obaidi, Mohammed, et al. (author)
  • Hardware Acceleration of the Robust Header Compression (RoHC) Algorithm
  • 2013
  • In: 2013 IEEE International Symposium on Circuits and Systems (ISCAS). - 2158-1525 .- 0271-4310. - 9781467357609 - 9781467357623 ; , pp. 293-296
  • Conference paper (peer-reviewed)
    • In LTE base-stations, RoHC is a processing-intensive algorithm that may limit the system from serving a large number of users when it is used to compress the VoIP packets of mobile traffic. In this paper, a hardware-software and a full-hardware solution are proposed to accelerate the RoHC compression algorithm in LTE base-stations and enhance the system throughput and capacity. Results for both solutions are discussed and compared with respect to design metrics like throughput, capacity, power consumption, and hardware resources. This comparison is instrumental in making architectural-level trade-off decisions in order to meet present-day requirements and also be ready to support future evolution. In terms of throughput, a gain of 20% (6250 packets/sec) is achieved in the HW-SW implementation by accelerating the Cyclic Redundancy Check (CRC) and the Least Significant Bit (LSB) encoding in hardware. The full-HW implementation leads to a throughput 45 times that of the SW-only implementation (244000 packets/sec). The full-HW solution consumes more Adaptive Look-Up Tables (7477 ALUTs) than the HW-SW solution (2614 ALUTs) when synthesized on Altera’s Arria II GX FPGA.
  •  
3.
  • Anderson, John B, et al. (author)
  • Faster-Than-Nyquist Signaling
  • 2013
  • In: Proceedings of the IEEE. - 0018-9219. ; 101:8, pp. 1817-1830
  • Journal article (peer-reviewed)
    • In this paper, we survey Faster-than-Nyquist (FTN) signaling, an extension of ordinary linear modulation in which the usual data bearing pulses are simply sent faster, and consequently are no longer orthogonal. Far from a disadvantage, this innovation can transmit up to twice the bits as ordinary modulation at the same bit energy, spectrum, and error rate. The method is directly applicable to orthogonal frequency division multiplex (OFDM) and quadrature amplitude modulation (QAM) signaling. Performance results for a number of practical systems are presented. FTN signaling raises a number of basic issues in communication theory and practice. The Shannon capacity of the signals is considerably higher.
  •  
4.
  •  
5.
  •  
6.
  • Berkeman, Anders, et al. (author)
  • A configurable divider using digit recurrence
  • 2003
  • In: Proceedings - IEEE International Symposium on Circuits and Systems. - 2158-1525 .- 0271-4310. ; 5, pp. 333-336
  • Conference paper (peer-reviewed)
    • The division operation is essential in many digital signal processing algorithms. For a hardware implementation, the requirements and constraints on the divider circuit differ significantly between applications. Therefore, it is not possible to design one divider component having optimal performance and cost for all target applications. Instead, the presented divider has a modular architecture, based on instantiation of small efficient divider sub-blocks. The configuration of the divider architecture is set by a number of parameters controlling wordlength, number of quotient bits, number of clock cycles per operation, and fixed or floating point operation. Digit recurrence algorithms with carry save arithmetic and on-the-fly two's complement output quotient conversion are used to make the sub-blocks small, fast and power efficient. The modularity gives the designer freedom to vary different parameters to explore the design space. Two applications using the proposed divider are presented. Furthermore, an example divider circuit has been fabricated and performance measurements are included.
  •  
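As a rough illustration of the digit-recurrence principle named in the abstract above, the following Python sketch performs radix-2 restoring division, producing one quotient bit per iteration. It is a minimal sketch under stated assumptions: it does not model the carry save arithmetic, the on-the-fly quotient conversion, or the configurable parameters described in the paper, and the function name `restoring_divide` and its arguments are hypothetical.

```python
def restoring_divide(dividend: int, divisor: int, quotient_bits: int) -> tuple[int, int]:
    """Radix-2 restoring division for unsigned integers.

    Hypothetical helper (not from the paper): one quotient bit is produced
    per loop iteration, mirroring one step of a digit-recurrence divider.
    Requires 0 <= dividend < 2**quotient_bits and divisor > 0.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    if not 0 <= dividend < (1 << quotient_bits):
        raise ValueError("dividend does not fit in quotient_bits")

    remainder = 0
    quotient = 0
    for i in range(quotient_bits - 1, -1, -1):
        # Shift the partial remainder and bring in the next dividend bit (MSB first).
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        # Trial subtraction: accept it only if the result stays non-negative.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << i
    return quotient, remainder


if __name__ == "__main__":
    print(restoring_divide(100, 7, 8))
```

The number of loop iterations equals the number of quotient bits, which is why a parameter such as "number of quotient bits" trades precision directly against cycles per operation in this simplified model.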
7.
  •  
8.
  • Berkeman, Anders, et al. (author)
  • A low logic depth complex multiplier
  • 1998
  • In: ; , pp. 204-207
  • Conference paper (peer-reviewed)
    • A complex multiplier has been designed for use in a pipelined fast Fourier transform processor. The performance in terms of throughput of the processor is limited by the multiplication. Therefore, the multiplier is optimized to make the input to output delay as short as possible. A new architecture based on distributed arithmetic and Wallace-trees has been developed and is compared to a previous multiplier realized as a regular distributed arithmetic array. The simulated gain in speed for the presented multiplier is about 100%. For verification, the multiplier is fabricated in a three metal-layer 0.5 µm CMOS process using a standard cell library. The fabricated multiplier chip has been functionally verified.
  •  
9.
  • Berkeman, Anders, et al. (author)
  • A low logic depth complex multiplier using distributed arithmetic
  • 2000
  • In: IEEE Journal of Solid-State Circuits. - : Institute of Electrical and Electronics Engineers (IEEE). - 0018-9200 .- 1558-173X. ; 35:4, pp. 656-659
  • Journal article (peer-reviewed)
    • A combinatorial complex multiplier has been designed for use in a pipelined fast Fourier transform processor. The performance in terms of throughput of the processor is limited by the multiplication. Therefore, the multiplier is optimized to make the input-to-output delay as short as possible. A new architecture based on distributed arithmetic, Wallace-trees, and carry-lookahead adders has been developed. The multiplier has been fabricated using standard cells in a 0.5-μm process and verified for functionality, speed, and power consumption. Running at 40 MHz, a multiplier with input wordlengths of 16+16 times 10+10 bits consumes 54% less power than a distributed arithmetic array multiplier fabricated under equal conditions.
  •  
10.
  •  
11.
  •  
12.
  • Berkeman, Anders, et al. (author)
  • Co-optimization of FFT and FIR in a delayless acoustic echo canceller implementation
  • 2000
  • In: [Host publication title missing]. - 0780354826 ; 5, pp. 241-244
  • Conference paper (peer-reviewed)
    • In application specific implementation of digital signal processing algorithms, optimization is important for a low power solution, not only on block level but also between blocks. This paper presents a co-optimization of a fast Fourier transform and a finite impulse response filter in a silicon implementation of an acoustic echo canceller. The optimization gain can be measured in the number of operations and memory accesses performed per second, and therefore in processing power. The optimization can also be applied to other algorithms with a similar constellation of Fourier transforms and finite impulse response filters.
  •  
13.
  • Berkeman, Anders, et al. (author)
  • Implementation Issues for acoustic echo cancellers
  • 1999
  • In: [Host publication title missing]. - 0780354915 ; , pp. 97-100
  • Conference paper (peer-reviewed)
    • The high computational complexity of acoustic echo cancellation algorithms requires application specific implementations to sustain real time signal processing with affordable power consumption. This is especially true for systems where a delayless approach is considered important, e.g. wireless communication systems. The proposed paper presents architectural considerations to reach a feasible hardware solution.
  •  
14.
  • Bruce, H, et al. (author)
  • Power optimization of a reconfigurable FIR-filter
  • 2004
  • In: 2004 IEEE Workshop on Signal Processing Systems Design and Implementation. - 0780385047 ; , pp. 321-324
  • Conference paper (peer-reviewed)
    • This paper describes power optimization techniques applied to a reconfigurable digital finite impulse response (FIR) filter used in a Universal Mobile Telecommunications System (UMTS) mobile terminal. Various methods of optimization for implementation were combined to achieve low cost in terms of power consumption. Each optimization method is described in detail and is applied to the reconfigurable filter. The optimization methods have achieved a 78.8% reduction in complexity for the multipliers in the FIR structure. A comparison of synthesized RTL models of the original and the optimized architectures resulted in a 27% reduction in look-up tables when targeted for the Xilinx Virtex II Pro field programmable gate array (FPGA). An automated method for transformation of coefficient multipliers into bit-shifts is also presented.
  •  
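The idea of turning a coefficient multiplier into bit-shifts, mentioned at the end of the abstract above, can be sketched as follows. The abstract does not say which transformation the authors automate, so this example swaps in a common technique, canonical signed digit (CSD) recoding, purely as an assumption; the function names `csd_terms` and `shift_add_multiply` are hypothetical.

```python
def csd_terms(coefficient: int) -> list[tuple[int, int]]:
    """Recode an integer coefficient into signed power-of-two terms.

    Returns a list of (digit, shift) pairs with digit in {+1, -1} such that
    coefficient == sum(digit * 2**shift). This is canonical signed digit
    recoding, assumed here for illustration only.
    """
    terms = []
    shift = 0
    c = coefficient
    while c != 0:
        if c & 1:
            # Choose +1 when c % 4 == 1 and -1 when c % 4 == 3, so that the
            # next bit becomes zero and no two adjacent digits are non-zero.
            digit = 2 - (c & 3)
            terms.append((digit, shift))
            c -= digit
        c >>= 1
        shift += 1
    return terms


def shift_add_multiply(x: int, terms: list[tuple[int, int]]) -> int:
    """Multiply x by the recoded coefficient using only shifts and adds/subtracts."""
    return sum(digit * (x << shift) for digit, shift in terms)


if __name__ == "__main__":
    terms = csd_terms(7)          # 7 = 8 - 1
    print(terms, shift_add_multiply(5, terms))
```

In this sketch a coefficient such as 7 needs two shifted terms instead of three, which is the kind of saving that makes shift-based constant multiplication attractive in a fixed-coefficient filter tap.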
15.
  • Chakrabartty, Shantanu, et al. (author)
  • Guest Editorial
  • 2011
  • In: IEEE Transactions on Biomedical Circuits and Systems. - 1932-4545. ; 5:2, pp. 101-102
  • Journal article (other academic/artistic)
  •  
16.
  • Chandran, J, et al. (author)
  • Xilinx Virtex II Pro implementation of a reconfigurable UMTS digital channel filter
  • 2004
  • In: [Host publication title missing]. - 0769520812 ; , pp. 77-82
  • Conference paper (peer-reviewed)
    • A reconfigurable digital root raised cosine (RRC) filter for a UMTS terrestrial radio access (UTRA) mobile terminal receiver is implemented on a Xilinx Virtex II Pro Field Programmable Gate Array (FPGA). The filter employs a finite impulse response (FIR) structure, monitors in-band and out-of-band received signal powers, and calculates the appropriate filter length that meets the bit-energy to interference ratio (Eb/No) of the system. The results presented are for the time division duplex (TDD) mode of UTRA.
  •  
17.
  • Dasalukunte, Deepak, et al. (author)
  • A 0.8mm2 9.6mW Implementation of a Multicarrier Faster-Than-Nyquist Signaling Iterative Decoder in 65nm CMOS
  • 2012
  • In: [Host publication title missing]. - 1930-8833. - 9781467322126 ; , pp. 173-176
  • Conference paper (peer-reviewed)
    • This paper presents a decoder for multi-carrier modulated signals employing Faster-than-Nyquist (FTN) signaling. FTN signaling is a method of improving bandwidth efficiency at the expense of higher processing complexity in the transceiver. The decoder can switch between orthogonal and FTN signaling modes and exploits channel properties to improve bandwidth efficiency. The decoder is fabricated in a 65 nm CMOS process and occupies an area of 0.8 mm², with a power consumption of 9.6 mW at 1.2 V when clocked at 100 MHz. To the best of our knowledge, these measurement results are from the first-ever silicon implementation of a decoder for FTN signaling.
  •  
18.
  • Dasalukunte, Deepak, et al. (author)
  • A generic hardware MAC for wireless personal area network platforms
  • 2008
  • Conference paper (peer-reviewed)
    • This paper presents a generic hardware MAC for systems based on high rate (IEEE 802.15.3) and low rate (IEEE 802.15.4) Wireless Personal Area Networks. Functionality that is better run in hardware is moved over from the software part of the MAC layer. An easy-to-access, memory-like interface has been defined for data and control transfer between the software and hardware parts of the MAC layer. A key challenge in designing such a system was to arrive at a generic architecture without compromising either of the standards on which the two systems are based. Reuse of modules has been emphasized in order to avoid repetition of design and implementation effort, which in turn reduces the time required for testing. The design has been successfully tested on different FPGA platforms.
  •  
19.
  • Dasalukunte, Deepak, et al. (author)
  • A Transmitter Architecture for Faster-than-Nyquist Signaling Systems
  • 2009
  • In: Proceedings, IEEE International Symposium on Circuits and Systems, 2009. - 9781424438273 ; , pp. 1028-1031
  • Conference paper (peer-reviewed)
    • This paper presents the complexity analysis of a transmitter architecture for a faster-than-Nyquist (FTN) system. Complexity issues in terms of computations and memory requirements to achieve an FTN system are dealt with. An OFDM based multi-carrier system is considered as it is one of the most widely used in upcoming wireless standards. Retaining the modules within the OFDM transmitter helps in exploiting the already optimized and hardware efficient structures, the IFFT being one. From an implementation perspective the introduction of FTN introduces negligible overhead for the transmitter.
  •  
20.
  • Dasalukunte, Deepak, et al. (author)
  • An 0.8-mm² 9.6-mW Iterative Decoder for Faster-Than-Nyquist and Orthogonal Signaling Multicarrier Systems in 65-nm CMOS
  • 2013
  • In: IEEE Journal of Solid-State Circuits. - 0018-9200. ; 48:7, pp. 1680-1688
  • Journal article (peer-reviewed)
    • This paper presents an iterative decoder for faster-than-Nyquist (FTN) and orthogonal signaling multi-carrier systems. FTN signaling is a method of improving bandwidth efficiency at the expense of higher processing complexity in the transceiver. The decoder can switch between orthogonal and FTN signaling modes and exploits channel properties to improve bandwidth efficiency. The decoder is fabricated in a 65-nm CMOS process and occupies a total area of 0.8 mm², with the decoder core taking up 0.567 mm². The power consumption of the chip is 9.6 mW at 1.2 V when clocked at 100 MHz, providing a peak information throughput of 1 Mbps with an energy efficiency of 0.6 nJ per bit per iteration. To the best of our knowledge, these measurement results are from the first ever silicon implementation of a decoder for FTN signaling.
  •  
21.
  •  
22.
  •  
23.
  • Dasalukunte, Deepak, et al. (author)
  • Complexity analysis of IOTA filter architectures in Faster-than-Nyquist multicarrier systems
  • 2011
  • In: [Host publication title missing]. - 9781457705144
  • Conference paper (peer-reviewed)
    • This paper evaluates the overhead requirements for IOTA pulse shaping filters employed in faster-than-Nyquist multicarrier systems. Faster-than-Nyquist signaling has shown the promise of improving bandwidth efficiency, but comes at the cost of increased processing complexity in the transceiver. Since the IOTA filter is one of the blocks contributing to the processing overhead, different architectural options have been evaluated. A comparison is drawn between the architectures of the IOTA filter, and a suitable architecture with moderate hardware overhead is chosen for implementation.
  •  
24.
  • Dasalukunte, Deepak, et al. (author)
  • Design and Implementation of Iterative Decoder for Faster-than-Nyquist Signaling Multicarrier systems
  • 2011
  • In: [Host publication title missing]. - 2159-3477. ; , pp. 359-360
  • Conference paper (peer-reviewed)
    • Faster-than-Nyquist (FTN) signaling is a method of improving bandwidth efficiency by transmitting information beyond Nyquist's orthogonality limit for interference free transmission. Previous work has theoretically established that FTN can provide improved bandwidth efficiency. However, this comes at the cost of higher decoding complexity at the receiver. Our work has evaluated multicarrier FTN signaling for its implementation feasibility and complexity overhead compared to the gains in bandwidth efficiency. The work carried out in this research project includes a systems perspective evaluating performance, algorithm-hardware tradeoffs, and a hardware architecture leading to a silicon implementation of the decoder for FTN signaling. From the systems perspective, co-existence of FTN and OFDM based multicarrier systems has been evaluated. Since OFDM is part of many existing and upcoming broadband access technologies such as WLAN, LTE and DVB, this analogy is motivated. On the hardware side, the proposed architecture can accommodate both OFDM and FTN systems. The processing blocks in the transmitter and receiver were designed for reuse and carry out different functions in the transceiver. Furthermore, the hardware can be configured to operate at varying bandwidth efficiencies (by FTN signaling) to exploit the channel conditions. The decoder implementation also considered block sizes and data rates to comply with the 3GPP standard. The decoding is carried out in as few as 8 iterations, making it more practical for implementation in power constrained mobile devices. The decoder is implemented in a 65 nm CMOS process and occupies a total chip area of 0.8 mm².
  •  
25.
  •  
26.
  • Dasalukunte, Deepak, et al. (author)
  • Hardware implementation of mapper for Faster-than-Nyquist signaling transmitter
  • 2009
  • In: [Host publication title missing]. - 9781424443109
  • Conference paper (peer-reviewed)
    • This paper presents the implementation of the mapper block in a faster-than-Nyquist (FTN) signaling transmitter. The architecture is Look-Up Table (LUT) based and the complexity is reduced to a few adders and a buffer to store intermediate results. Two flavors of the architecture have been designed and evaluated in this article: one using a register based implementation for the buffer and the other using a Random Access Memory (RAM). The tradeoff between the two is throughput versus area. The register based implementation is fast, requiring only one clock cycle to complete the calculation (i.e., a read, calculate and write back) for every incoming FTN symbol. However, it becomes prohibitive when systems with a large number of sub-carriers (>64) are considered. The RAM based implementation provides a better solution in terms of area with slightly lower throughput. The mapper has been targeted for both FPGA (Xilinx Virtex-II Pro) and ASIC (130 nm standard cell CMOS) implementations. The design has been successfully tested on the FPGA and its output verified against the reference MATLAB model.
  •  
27.
  • Dasalukunte, Deepak, et al. (author)
  • Improved memory architecture for multicarrier faster-than-Nyquist iterative decoder
  • 2011
  • In: [Host publication title missing]. ; , pp. 296-300
  • Conference paper (peer-reviewed)
    • Architectural improvements for a multicarrier faster-than-Nyquist (FTN) decoder are presented in this work. A previously designed FTN decoder has been optimized during implementation, especially with respect to memory considerations to reduce area and power. The memory optimized architecture achieves 28.7% savings in overall chip area and provides 43.8% savings in the estimated power compared to the pre-optimized design. The BER performance tradeoff from one of the memory optimizations shows that the degradation is acceptable and can actually provide better performance for certain scenarios. The other memory optimization considers the minimal buffering required within the interference canceller, resulting in a memory reduction close to 50% of what was previously reported. The performance of the actual RTL implementation of the FTN decoder is also presented in comparison with the floating and fixed point benchmark performances.
  •  
28.
  • Dasalukunte, Deepak, et al. (author)
  • Multicarrier faster-than-Nyquist transceivers: hardware architecture and performance analysis
  • 2011
  • In: IEEE Transactions on Circuits and Systems Part 1: Regular Papers. - 1549-8328. ; 58:4, pp. 827-838
  • Journal article (peer-reviewed)
    • This paper evaluates the hardware aspects of multicarrier faster-than-Nyquist (FTN) signaling transceivers. The choice of time-frequency spacing of the symbols in an FTN system for improved bandwidth efficiency is targeted towards efficient hardware implementation. This work proposes a hardware architecture for the realization of iterative decoding of FTN multicarrier modulated signals. Compatibility with existing systems has been considered for smooth switching between the faster-than-Nyquist and orthogonal signaling schemes, one example being the use of FFTs for multicarrier modulation. The performance of the fixed point model is very close to that of the floating point representation. The impact of system parameters such as number of projection points, time-frequency spacing and finite wordlengths, and their design trade-offs for reduced complexity iterative decoders in FTN systems, have been investigated. The FTN decoder has been designed and synthesized in both 65 nm CMOS and FPGA. From the hardware resource usage numbers it can be concluded that FTN signaling can be used to achieve higher bandwidth efficiency with acceptable complexity overhead.
  •  
29.
  •  
30.
  • Diaz, Isael, et al. (author)
  • A 350μW Sign-Bit architecture for multi-parameter estimation during OFDM acquisition in 65nm CMOS
  • 2015
  • In: 2015 IEEE International Symposium on Circuits and Systems (ISCAS). - 9781479983919
  • Conference paper (peer-reviewed)
    • Correct estimation of symbol timing, Carrier Frequency Offset (CFO), and Signal-to-Noise Ratio (SNR) is crucial in Orthogonal Frequency Division Multiplexing (OFDM) communication. Typically, high estimation accuracy is desired, but it often comes with increased complexity, which has direct repercussions for energy consumption. In this article, an architecture based on Sign-Bit estimation with low complexity, and hence low power dissipation, is presented. The architecture is capable of estimating the aforementioned parameters in virtually any OFDM standard. The proof of concept has been fabricated in 65 nm CMOS technology with low-power high-VT cells. Measurements performed with a supply voltage of 1.2 V resulted in a power dissipation of 350 μW, 6 times smaller than that of an equivalent 8-bit architecture, and the lowest power density reported in the literature.
  •  
31.
  •  
32.
  • Diaz, Isael, et al. (author)
  • A New Digital Front-End for Flexible Reception in Software Defined Radio
  • 2015
  • In: Microprocessors and Microsystems. - : Elsevier BV. - 0141-9331. ; 39:8, pp. 889-900
  • Journal article (peer-reviewed)
    • Future mobile terminals are expected to support an ever increasing number of Radio Access Technologies (RAT) concurrently. This already poses a challenge to terminal designers today. Software Defined Radio (SDR) solutions are a compelling alternative to address this issue in the digital baseband, given their high flexibility and low Non-Recurring Engineering (NRE) cost. However, the challenge still remains in the Digital Front-End (DFE), where many operations are too complex or energy hungry to be implemented as software instructions. Thus, new architectures are needed to feed the SDR digital baseband while keeping complexity and energy consumption at bay. In this article the architecture of a Digital Front-End Receiver (DFE-Rx) for next-generation mobile terminals is presented. The flexibility needed for multi-standard support is demonstrated by detecting, synchronizing and reporting the carrier-frequency offset of multiple concurrent radio standards. Moreover, the proposed architecture has been fabricated in a 65 nm CMOS low power high-VT cell technology with a die size of 5 mm². The core module of the DFE-Rx, the synchronization engine, has been measured at 1.2 V and reports an average power consumption of 1.9 mW during Wireless Local Area Network (WLAN) reception and 1.6 mW during configuration, while running at 10 MHz.
  •  
33.
  • Diaz, Isael, et al. (author)
  • A sign-bit auto-correlation architecture for fractional frequency offset estimation in OFDM
  • 2010
  • In: ; , pp. 3765-3768
  • Conference paper (peer-reviewed)
    • This paper presents an architecture of an auto-correlator for Orthogonal Frequency Division Multiplexing systems. The received signal is quantized to only the sign-bit, which dramatically simplifies the frequency offset estimation. Hardware cost is reduced under the assumption that synchronization during acquisition does not have to be very accurate, only sufficient for coarse estimation. The architecture is synthesized towards a 65 nm low-leakage high threshold standard cell CMOS library. The proposed architecture results in an area reduction of 93% compared to a typical 8-bit implementation. The area occupied by the architecture is 0.063 mm². The architecture is evaluated for WLAN, LTE and DVB-H. Power simulations for DVB-H transmission show a power consumption of 4.8 µW per symbol.
  •  
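To make the sign-bit idea in the abstract above concrete, the following Python sketch quantizes complex samples to the signs of their real and imaginary parts and reads a coarse phase rotation from the autocorrelation at a fixed lag. It is only a sketch under assumptions: the synthetic signal (a random-phase block repeated once with a rotation standing in for a frequency offset), the function names and all parameters are hypothetical and are not taken from the paper. In this noiseless model the estimate is biased except at multiples of pi/4, which is consistent with the abstract's point that the result only needs to be good enough for coarse estimation.

```python
import cmath
import math
import random


def sign_bit(sample: complex) -> complex:
    """Keep only the signs of the real and imaginary parts (1 bit each)."""
    re = 1.0 if sample.real >= 0 else -1.0
    im = 1.0 if sample.imag >= 0 else -1.0
    return complex(re, im)


def estimate_rotation(samples: list[complex], lag: int) -> float:
    """Angle (radians) of the sign-bit autocorrelation at the given lag."""
    acc = 0j
    for n in range(len(samples) - lag):
        # Each product is one of 2, 2j, -2, -2j, so no real multiplier is needed.
        acc += sign_bit(samples[n + lag]) * sign_bit(samples[n]).conjugate()
    return cmath.phase(acc)


def repeated_block(lag: int, rotation: float, seed: int = 0) -> list[complex]:
    """Hypothetical test signal: a random-phase block followed by a copy of
    itself rotated by `rotation` radians, mimicking a frequency offset."""
    rng = random.Random(seed)
    block = [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(lag)]
    return block + [s * cmath.exp(1j * rotation) for s in block]


if __name__ == "__main__":
    signal = repeated_block(lag=4096, rotation=math.pi / 4)
    print(estimate_rotation(signal, lag=4096))
```

Dividing the estimated rotation by 2*pi times the lag (in symbol units) would give a fractional frequency offset; that scaling depends on the OFDM parameters of the standard in question and is left out here.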
34.
  • Diaz, Isael, et al. (author)
  • Next Generation Digital Front-End for Multi-Standard Concurrent Reception
  • 2013
  • In: [Host publication title missing].
  • Conference paper (peer-reviewed)
    • This article presents an architecture of a Digital Front-End Receiver (DFE-Rx) for next-generation mobile terminals. The main focus is placed on flexibility, scalability and concurrency. The architecture is capable of detecting, synchronizing and reporting the carrier-frequency offset of multiple concurrent radio standards. The proposed receiver is fabricated in a 65 nm CMOS low power high-VT cell technology with a die size of 5 mm². The synchronization engine has been measured at 1.2 V and reports an average power consumption of 1.9 mW during IEEE 802.11 reception and 1.6 mW during configuration, while running at 10 MHz.
  •  
35.
  • Diaz, Isael, et al. (author)
  • Selective Channelization on an SDR Platform for LTE-A Carrier Aggregation.
  • 2012
  • In: 19th IEEE International Conference on Electronics, Circuits and Systems (ICECS), 2012. - 9781467312615 ; , pp. 316-319
  • Conference paper (peer-reviewed)
    • The total transmission bandwidth and component carrier aggregation proposed by LTE-Advanced set a new challenge for the design of terminals. This article presents a way to ensure that terminals cope with the large bandwidth in an efficient manner. Various filtering methods are explored, showing that an SDR architecture such as ADRES (Architecture for Dynamically Reconfigurable Embedded Systems) is suitable for dynamic adaptation of filtering methods as a function of the aggregation scheme and the individual bandwidth assigned to each terminal. This method is able to reduce the processing load by 70% for LTE-A with legacy support, and possibly more when LTE legacy is not supported. Simulations indicate that the performance loss derived from the proposed method is marginal, with no negative repercussions for the subsequent baseband stages.
  •  
36.
  • Diaz, Isael, et al. (author)
  • Sign-Bit based architecture for OFDM acquisition for multiple-standards
  • 2009
  • In: [Host publication title missing]. - 9781424443109
  • Conference paper (peer-reviewed)
    • This paper presents a hardware mapping of an auto-correlator for Orthogonal Frequency Division Multiplexing stage for three radio standards: LTE, DVB-H, and IEEE 802.11n. Hardware cost is minimized by using only the sign bit in the autocorrelation function. The frequency offset estimation procedure is dramatically simplified by reducing the phase of the envelope to pi/2 resolution, which in turn reduces the need for specialized components. The architecture is synthesized towards a 65 nm low-leakage high threshold standard cell CMOS library. The 1-bit architecture reports an area reduction of 90% for memories and 56% for the logic, and a power dissipation reduction of 35%, when compared to an identical 8-bit implementation. The approximate area occupied by the architecture is 0.03 mm². Power simulations for an IEEE 802.11n packet report a power dissipation of 42 µW.
  •  
37.
  •  
38.
  • Edman, F, et al. (author)
  • A scalable pipelined complex valued matrix inversion architecture
  • 2005
  • In: IEEE International Symposium on Circuits and Systems (ISCAS). - 0780388348 ; , pp. 4489-4492
  • Conference paper (peer-reviewed)
    • This paper presents a fast, pipelined and scalable hardware architecture for inverting complex valued matrices. The matrix inversion algorithm involves a QR-factorization based on the squared Givens rotations algorithm, the application of a recurrence algorithm for inversion of an upper triangular matrix R, and a matrix multiplication of R^-1 with Q. We show that traditional triangular array architectures employing O(n^2) communicating processors can be mapped onto a scalable linear array architecture with only O(n) processors. The linear array architecture avoids drawbacks such as non-scalability, large area consumption and low throughput rate. The architecture is implemented using arithmetic operations with 12 bit fixed-point representation. The hardware implementation will be used as a core processor in a real-time smart antenna system.
  •  
39.
  •  
40.
  • Edman, Fredrik, et al. (author)
  • Fixed-point implementation of a robust complex valued divider architecture
  • 2005
  • In: [Host publication title missing]. - 0780390660 ; , pp. 143-146
  • Conference paper (peer-reviewed)
    • In this paper a fixed-point implementation of a robust complex valued divider architecture is presented. The architecture uses feedback loops and time multiplexing strategies, resulting in a fast and area conservative architecture. The architecture has good numerical properties and the result is accurate to less than one ulp. A combination of low latency and high throughput rate makes the architecture ideal for modern high speed signal processing applications. The complex valued divider architecture was implemented and tested on a Xilinx Virtex-II FPGA, clocked at 100 MHz, and can easily be ported to an ASIC. The FPGA implementation is used as a core component in a matrix inversion implementation.
  •  
41.
  •  
42.
  • Edman, Fredrik, et al. (author)
  • Implementation of a Highly Scalable Architecture for Fast Inversion of Triangular Matrices
  • 2003
  • In: ; , pp. 1137-1140
  • Conference paper (peer-reviewed)
    • In this paper, an FPGA implementation of a novel and highly scalable hardware architecture for fast inversion of triangular matrices is presented. An integral part of modern signal processing and communications applications involves manipulation of large matrices. Therefore, scalable and flexible hardware architectures are increasingly sought after. In this paper, the traditional triangular shaped array architecture with n(n+1)/2 communicating processors, with n being the number of inputs, is mapped to a linear structure with only n processors. The linear and the triangular shaped architectures are compared in terms of area consumption, latencies, and maximum clocking speed. This paper also shows that the linear array structure avoids drawbacks such as non-scalability, large area, and large power consumption. The implementation is based on a numerically stable recurrence algorithm, which has excellent properties for hardware implementation.
  •  
43.
  • Edman, Fredrik, et al. (authors)
  • Implementation of a Scalable Matrix Inversion Architecture for Triangular Matrices
  • 2003
  • In: [Host publication title missing]. ; 3, pp. 2558-2562
  • Conference paper (peer-reviewed). Abstract:
    • This paper presents an FPGA implementation of a novel and highly scalable hardware architecture for inversion of triangular matrices. An integral part of modern signal processing and communications applications involves manipulation of large matrices. Therefore, scalable and flexible hardware architectures are increasingly sought after. In this paper, the traditional triangular-shaped array architecture with n(n+1)/2 communicating processors, where n is the number of inputs, is mapped to a linear structure with only n processors. We show that the linear array structure avoids drawbacks such as non-scalability, large area, and large power consumption. The implementation is based on a numerically stable recurrence algorithm, which has excellent properties for hardware implementation. The implementation is the core processor in a smart antenna system.
  •  
44.
  •  
45.
  • Eilert, Johan (author)
  • ASIP for Wireless Communication and Media
  • 2010
  • Doctoral thesis (other academic/artistic). Abstract:
    • While general purpose processors reach both high performance and high application flexibility, this comes at a high cost in terms of silicon area and power consumption. In systems where high application flexibility is not required, it is possible to trade off flexibility for lower cost by tailoring the processor to the application to create an Application Specific Instruction set Processor (ASIP) with high performance yet low silicon cost. This thesis demonstrates how ASIPs with application specific data types can provide efficient solutions with lower cost. Two examples are presented: an audio decoder ASIP for audio and music processing and a matrix manipulation ASIP for MIMO radio baseband signal processing. The audio decoder ASIP uses a 16-bit floating point data type to reduce the size of the data memory to about 60% of other solutions that use a 32-bit data type. Since the data memory occupies a major part of the silicon area, this has a significant impact on the total silicon area, and thereby also the static and dynamic power consumption. The data width reduction can be done without any noticeable artifacts in the decoded audio due to the natural masking effect of the human ear. The matrix manipulation SIMD ASIP is designed to perform various matrix operations such as matrix inversion and QR decomposition of small complex-valued matrices. This type of processing is found in MIMO radio baseband signal processing and the matrices are typically not larger than 4x4. There have been solutions published that use arrays of fixed-function processing elements to perform these operations, but the proposed ASIP performs the computations in less time and with lower hardware cost.
      The matrix manipulation ASIP data path uses a floating point data type to avoid data scaling issues associated with fixed point computations, especially those related to division and reciprocal calculations. It also simplifies the program control flow since no special cases for certain inputs are needed, which is especially important for SIMD architectures. These two applications were chosen to show how ASIPs can be a suitable alternative and match the requirements for different types of applications, providing enough flexibility and performance to support different standards and algorithms with low hardware cost.
  •  
46.
  • Fu, Siyuan, et al. (authors)
  • Generalized lock-in amplifier for precision measurement of high frequency signals
  • 2013
  • In: Review of Scientific Instruments. - : AIP Publishing. - 1089-7623 .- 0034-6748. ; 84:11
  • Journal article (peer-reviewed). Abstract:
    • We herein formulate the concept of a generalized lock-in amplifier for the precision measurement of high-frequency signals based on digital cavities. Accurate measurement of signals higher than 200 MHz using the generalized lock-in is demonstrated. The technique is compared with a traditional lock-in and its advantages and limitations are discussed. We also briefly point out how the generalized lock-in can be used for precision measurement of gigahertz signals by using parallel processing of the digitized signals.
  •  
47.
  • Granlund, Stefan, et al. (authors)
  • A Low-Latency High-Throughput Soft-Output Signal Detector for Spatial Multiplexing MIMO Systems
  • 2015
  • In: Microprocessors and Microsystems. - : Elsevier BV. - 0141-9331. ; 39:8, pp. 901-908
  • Journal article (peer-reviewed). Abstract:
    • This paper presents a low-latency, high-throughput soft-output signal detector for a 4x4 64-QAM spatial-multiplexing MIMO system. To achieve high data-level parallelism and accurate soft information, the detector adopts a channel-adaptive node perturbation technique to generate a list of candidate vectors around an initial linear estimate. The detection algorithm provides a wide-ranging and convenient performance-complexity trade-off by adjusting the node perturbation parameter. A partial-parallel pipelined VLSI architecture is developed to implement the algorithm with high throughput and low processing latency, while offering the flexibility to support run-time performance tuning. Moreover, a fast and hardware-friendly node enumeration scheme is developed to further reduce the processing delay by exploiting the geometric property of the quadrature amplitude modulation (QAM) constellation. The detector was synthesized using Synopsys Design Compiler with a 65 nm CMOS standard cell library. The core area is 0.58 mm² with 290K gates. The peak throughput is 3 Gb/s at 500 MHz clock frequency with a latency of 20 ns. Compared to other reported soft-output MIMO detectors, this is a latency reduction of 71%. The corresponding energy consumption is 33 pJ per detected bit.
  •  
48.
  • Granlund, Stefan, et al. (authors)
  • Implementation of a Highly-Parallel Soft-Output MIMO Detector with Fast Node Enumeration
  • 2013
  • In: [Host publication title missing].
  • Conference paper (peer-reviewed). Abstract:
    • This paper presents a high-throughput, low-latency soft-output signal detector for a 4×4 64-QAM MIMO system. To achieve high data-level parallelism and accurate soft information, the detector adopts a node perturbation technique to generate a list of candidate vectors around the zero-forcing (ZF) result. Additionally, a fast and hardware-friendly node enumeration scheme is developed to significantly reduce processing delay. Implemented using a 65 nm CMOS technology, the detector occupies 0.58 mm² core area with 290K gates. The peak throughput is 3 Gb/s at 500 MHz clock frequency with a latency of 20 ns. Energy consumption per detected bit is 33 pJ.
  •  
49.
  •  
50.
  • Guo, Zhan, et al. (authors)
  • VLSI architecture of the soft-output sphere decoder for MIMO systems
  • 2005
  • In: 48th Midwest Symposium on Circuits and Systems, 2005.. - 0780391977 ; 2, pp. 1195-1198
  • Conference paper (peer-reviewed). Abstract:
    • Sphere decoders can approach the performance of maximum-likelihood (ML) decoders for MIMO systems with lower complexity. In this paper, a VLSI architecture of the modified K-best Schnorr-Euchner (MKSE) sphere decoder is proposed for soft-output MIMO decoding. The MKSE decoder can achieve near-ML performance with higher decoding throughput and lower computational complexity. The proposed VLSI architecture is implemented for a 4 × 4 16-QAM MIMO system in a 0.13-µm CMOS technology. The implemented soft-output MKSE chip can achieve a decoding throughput of more than 100 Mb/s with a 0.56 mm² core area and 97 K gates.
  •  