SwePub - sökning: WFRF:(Garrido Gálvez Mario)

Numrering	Referens	Omslagsbild	Hitta
1.	Bae, Cheolyong, et al. (författare) Improved Implementation Approaches for 512-tap 60 GSa/s Chromatic Dispersion FIR Filters 2018 Ingår i: 2018 CONFERENCE RECORD OF 52ND ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS. - : IEEE. - 9781538692189 ; , s. 213-217 Konferensbidrag (refereegranskat)abstract In optical communication the non-ideal properties of the fibers lead to pulse widening from chromatic dispersion. One way to compensate for this is through digital signal processing. In this work, two architectures for compensation are compared. Both are designed for 60 GSa/s and 512 filter taps and implemented in the frequency domain using FFTs. It is shown that the high-speed requirements introduce constraints on possible architectural choices. In this work, it is shown that it is not required to use two overlapping FFTs to obtain continuous filtering. In addition, efficient highly parallel implementation of FFTs is discussed and an unproved FFT compared to our earlier work is proposed. The results are compared to using an approach with a shorter FFT and FIR filters.
2.	Boopal, Padma Prasad, et al. (författare) A Reconfigurable FFT Architecture for Variable-Length and Multi-Streaming OFDM Standards 2013 Ingår i: IEEE International Symposium on Circuits and Systems (ISCAS), 2013. - : IEEE. - 9781467357609 ; , s. 2066-2070 Konferensbidrag (refereegranskat)abstract This paper presents a reconfigurable FFT architecture for variable-length and multi-streaming WiMax wireless standard. The architecture processes 1 stream of 2048-point FFT, up to 2 streams of 1024-point FFT or up to 4 streams of 512-point FFT. The architecture consists of a modified radix-2 single delay feedback (SDF) FFT. The sampling frequency of the system is varied in accordance with the FFT length. The latch-free clock gating technique is used to reduce power consumption. The proposed architecture has been synthesized for the Virtex-6 XCVLX760 FPGA. Experimental results show that the architecture achieves the throughput that is required by the WiMax standard and the design has additional features compared to the previous approaches. The design uses 1% of the total available FPGA resources and maximum clock frequency of 313.67 MHz is achieved. Furthermore, this architecture can be expanded to suit other wireless standards.
3.	Chen, Sau-Gee, et al. (författare) Continuous-flow Parallel Bit-Reversal Circuit for MDF and MDC FFT Architectures 2014 Ingår i: IEEE Transactions on Circuits and Systems Part 1. - : Institute of Electrical and Electronics Engineers (IEEE). - 1549-8328 .- 1558-0806. ; 61:10, s. 2869-2877 Tidskriftsartikel (refereegranskat)abstract This paper presents a bit reversal circuit for continuous-flow parallel pipelined FFT processors. In addition to two flexible commutators, the circuit consists of two memory groups, where each group has P memory banks. For the consideration of achieving both low delay time and area complexity, a novel write/read scheduling mechanism is devised, so that FFT outputs can be stored in those memory banks in an optimized way. The proposed scheduling mechanism can write the current successively generated FFT output data samples to the locations without any delay right after they are successively released by the previous symbol. Therefore, total memory space of only N data samples is enough for continuous-flow FFT operations. Since read operation is not overlapped with write operation during the entire period, only single-port memory is required, which leads to great area reduction. The proposed bit-reversal circuit architecture can generate natural-order FFT output and support variable power-of-2 FFT lengths.
4.	Garrido Gálvez, Mario, et al. (författare) A 4096-Point Radix-4 Memory-Based FFT Using DSP Slices 2017 Ingår i: IEEE Transactions on Very Large Scale Integration (vlsi) Systems. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1063-8210 .- 1557-9999. ; 25:1, s. 375-379 Tidskriftsartikel (refereegranskat)abstract This brief presents a novel 4096-point radix-4 memory-based fast Fourier transform (FFT). The proposed architecture follows a conflict-free strategy that only requires a total memory of size N and a few additional multiplexers. The control is also simple, as it is generated directly from the bits of a counter. Apart from the low complexity, the FFT has been implemented on a Virtex-5 field programmable gate array (FPGA) using DSP slices. The goal has been to reduce the use of distributed logic, which is scarce in the target FPGA. With this purpose, most of the hardware has been implemented in DSP48E. As a result, the proposed FPGA is efficient in terms of hardware resources, as is shown by the experimental results.
5.	Garrido Gálvez, Mario (författare) A New Representation of FFT Algorithms Using Triangular Matrices 2016 Ingår i: IEEE Transactions on Circuits and Systems Part 1. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1549-8328 .- 1558-0806. ; 63:10, s. 1737-1745 Tidskriftsartikel (refereegranskat)abstract In this paper we propose a new representation for FFT algorithms called the triangular matrix representation. This representation is more general than the binary tree representation and, therefore, it introduces new FFT algorithms that were not discovered before. Furthermore, the new representation has the advantage that it is simple and easy to understand, as each FFT algorithm only consists of a triangular matrix. Besides, the new representation allows for obtaining the exact twiddle factor values in the FFT flow graph easily. This facilitates the design of FFT hardware architectures. As a result, the triangular matrix representation is an excellent alternative to represent FFT algorithms and it opens new possibilities in the exploration and understanding of the FFT.
6.	Garrido Gálvez, Mario, et al. (författare) A Serial Commutator Fast Fourier Transform Architecture for Real-Valued Signals 2018 Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1549-7747 .- 1558-3791. ; 65:11, s. 1693-1697 Tidskriftsartikel (refereegranskat)abstract This brief presents a novel pipelined architecture to compute the fast Fourier transform of real input signals in a serial manner, i.e., one sample is processed per cycle. The proposed architecture, referred to as real-valued serial commutator, achieves full hardware utilization by mapping each stage of the fast Fourier transform (FFT) to a half-butterfly operation that operates on real input signals. Prior serial architectures to compute FFT of real signals only achieved 50% hardware utilization. Novel data-exchange and data-reordering circuits are also presented. The complete serial commutator architecture requires 2 log(2) N - 2 real adders, log(2) N - 2 real multipliers, and N + 9 log(2) N - 19 real delay elements, where N represents the size of the FFT.
7.	Garrido Gálvez, Mario, et al. (författare) Accurate Rotations Based on Coefficient Scaling 2011 Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs. - : Institute of Electrical and Electronics Engineers (IEEE). - 1549-7747 .- 1558-3791. ; 58:10, s. 662-666 Tidskriftsartikel (refereegranskat)abstract This brief presents a novel approach for improving the accuracy of rotations implemented by complex multipliers, based on scaling the complex coefficients that define these rotations. A method for obtaining the optimum coefficients that lead to the lowest error is proposed. This approach can be used to get more accurate rotations without increasing the coefficient word length and to reduce the word length without increasing the rotation error. This brief analyzes two different situations where the optimization method can be applied: rotations that can be optimized independently and sets of rotations that require the same scaling. These cases appear in important signal processing algorithms such as the discrete cosine transform and the fast Fourier transform (FFT). Experimental results show that the use of scaling for the coefficients clearly improves the accuracy of the algorithms. For instance, improvements of about 8 dB in the Frobenius norm of the FFT are achieved with respect to using non-scaled coefficients.
8.	Garrido Gálvez, Mario, et al. (författare) Continuous-flow variable-length memoryless linear regression architecture 2013 Ingår i: Electronics Letters. - : Institution of Engineering and Technology (IET). - 0013-5194 .- 1350-911X. ; 49:24, s. 1567-1568 Tidskriftsartikel (refereegranskat)abstract A pipelined circuit to calculate linear regression is presented. The proposed circuit has the advantages that it can process a continuous flow of data, it does not need memory to store the input samples and supports variable length that can be reconfigured in run time. The circuit is efficient in area, as it consists of a small number of adders, multipliers and dividers. These features make it very suitable for real-time applications, as well as for calculating the linear regression of a large number of samples.
9.	Garrido Gálvez, Mario, et al. (författare) CORDIC II: A New Improved CORDIC Algorithm 2016 Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1549-7747 .- 1558-3791. ; 63:2, s. 186-190 Tidskriftsartikel (refereegranskat)abstract In this brief, we present the CORDIC II algorithm. Like previous CORDIC algorithms, the CORDIC II calculates rotations by breaking down the rotation angle into a series of microrotations. However, the CORDIC II algorithm uses a novel angle set, different from the angles used in previous CORDIC algorithms. The new angle set provides a faster convergence that reduces the number of adders with respect to previous approaches.
10.	Garrido Gálvez, Mario, et al. (författare) Feedforward FFT Hardware Architectures Based on Rotator Allocation 2018 Ingår i: IEEE Transactions on Circuits and Systems Part 1. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1549-8328 .- 1558-0806. ; 65:2, s. 581-592 Tidskriftsartikel (refereegranskat)abstract In this paper, we present new feedforward FFT hardware architectures based on rotator allocation. The rotator allocation approach consists in distributing the rotations of the FFT in such a way that the number of edges in the FFT that need rotators and the complexity of the rotators are reduced. Radix-2 and radix-2(k) feedforward architectures based on rotator allocation are presented in this paper. Experimental results show that the proposed architectures reduce the hardware cost significantly with respect to previous FFT architectures.
11.	Garrido Gálvez, Mario, et al. (författare) Guest Editorial: Special Section on Fast Fourier Transform (FFT) Hardware Implementations 2018 Ingår i: Journal of Signal Processing Systems. - : SPRINGER. - 1939-8018 .- 1939-8115. ; 90:11, s. 1581-1582 Tidskriftsartikel (övrigt vetenskapligt/konstnärligt)abstract n/a
12.	Garrido Gálvez, Mario, et al. (författare) Low-Complexity Multiplierless Constant Rotators Based on Combined Coefficient Selection and Shift-and-Add Implementation (CCSSI) 2014 Ingår i: IEEE Transactions on Circuits and Systems Part 1. - : Institute of Electrical and Electronics Engineers (IEEE). - 1549-8328 .- 1558-0806. ; 61:7, s. 2002-2012 Tidskriftsartikel (refereegranskat)abstract This paper presents a new approach to design multiplierless constant rotators. The approach is based on a combined coefficient selection and shift-and-add implementation (CCSSI) for the design of the rotators. First, complete freedom is given to the selection of the coefficients, i.e., no constraints to the coefficients are set in advance and all the alternatives are taken into account. Second, the shift-and-add implementation uses advanced single constant multiplication (SCM) and multiple constant multiplication (MCM) techniques that lead to low-complexity multiplierless implementations. Third, the design of the rotators is done by a joint optimization of the coefficient selection and shift-and-add implementation. As a result, the CCSSI provides an extended design space that offers a larger number of alternatives with respect to previous works. Furthermore, the design space is explored in a simple and efficient way. The proposed approach has wide applications in numerous hardware scenarios. This includes rotations by single or multiple angles, rotators in single or multiple branches, and different scaling of the outputs. Experimental results for various scenarios are provided. In all of them, the proposed approach achieves significant improvements with respect to state of the art.
13.	Garrido Gálvez, Mario, et al. (författare) Multiplierless Unity-Gain SDF FFTs 2016 Ingår i: IEEE Transactions on Very Large Scale Integration (vlsi) Systems. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1063-8210 .- 1557-9999. ; 24:9, s. 3003-3007 Tidskriftsartikel (refereegranskat)abstract In this brief, we propose a novel approach to implement multiplierless unity-gain single-delay feedback fast Fourier transforms (FFTs). Previous methods achieve unity-gain FFTs by using either complex multipliers or nonunity-gain rotators with additional scaling compensation. Conversely, this brief proposes unity-gain FFTs without compensation circuits, even when using nonunity-gain rotators. This is achieved by a joint design of rotators, so that the entire FFT is scaled by a power of two, which is then shifted to unity. This reduces the amount of hardware resources of the FFT architecture, while having high accuracy in the calculations. The proposed approach can be applied to any FFT size, and various designs for different FFT sizes are presented.
14.	Garrido Gálvez, Mario, et al. (författare) Optimum Circuits for Bit Reversal 2011 Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs. - : Institute of Electrical and Electronics Engineers (IEEE). - 1549-7747 .- 1558-3791. ; 58:10, s. 657-661 Tidskriftsartikel (refereegranskat)abstract This brief presents novel circuits for calculating bit reversal on a series of data. The circuits are simple and consist of buffers and multiplexers connected in series. The circuits are optimum in two senses: they use the minimum number of registers that are necessary for calculating the bit reversal and have minimum latency. This makes them very suitable for calculating the bit reversal of the output frequencies in hardware fast Fourier transform (FFT) architectures. This brief also proposes optimum solutions for reordering the output frequencies of the FFT when different common radices are used, including radix-2, radix-2(k), radix-4, and radix-8.
15.	Garrido Gálvez, Mario, et al. (författare) Pipelined Radix-2(k) Feedforward FFT Architectures 2013 Ingår i: IEEE Transactions on Very Large Scale Integration (vlsi) Systems. - : Institute of Electrical and Electronics Engineers (IEEE). - 1063-8210 .- 1557-9999. ; 21:1, s. 23-32 Tidskriftsartikel (refereegranskat)abstract The appearance of radix-2(2) was a milestone in the design of pipelined FFT hardware architectures. Later, radix-2(2) was extended to radix-2(k). However, radix-2(k) was only proposed for single-path delay feedback (SDF) architectures, but not for feedforward ones, also called multi-path delay commutator (MDC). This paper presents the radix-2(k) feedforward (MDC) FFT architectures. In feedforward architectures radix-2(k) canbe used for any number of parallel samples which is a power of two. Furthermore, both decimation in frequency (DIF) and decimation in time (DIT) decompositions can be used. In addition to this, the designs can achieve very high throughputs, which makes them suitable for the most demanding applications. Indeed, the proposed radix-2(k) feedforward architectures require fewer hardware resources than parallel feedback ones, also called multi-path delay feedback (MDF), when several samples in parallel must be processed. As a result, the proposed radix-2(k) feedforward architectures not only offer an attractive solution for current applications, but also open up a new research line on feedforward structures.
16.	Garrido Gálvez, Mario (författare) The Feedforward Short-Time Fourier Transform 2016 Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1549-7747 .- 1558-3791. ; 63:9, s. 868-872 Tidskriftsartikel (refereegranskat)abstract This brief presents the feedforward short-time Fourier transform (STFT). This new approach is based on reusing the calculations of the STFT at consecutive time instants. This leads to significant savings in hardware components with respect to fast Fourier transform based STFTs. Furthermore, the feedforward STFT does not have the accumulative error of iterative STFT approaches. As a result, the proposed feedforward STFT presents an excellent tradeoff between hardware utilization and performance.
17.	Garrido Gálvez, Mario, et al. (författare) The Serial Commutator FFT 2016 Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1549-7747 .- 1558-3791. ; 63:10, s. 974-978 Tidskriftsartikel (refereegranskat)abstract This brief presents a new type of fast Fourier transform (FFT) hardware architectures called serial commutator (SC) FFT. The SC FFT is characterized by the use of circuits for bit-dimension permutation of serial data. The proposed architectures are based on the observation that, in the radix-2 FFT algorithm, only half of the samples at each stage must be rotated. This fact, together with a proper data management, makes it possible to allocate rotations only every other clock cycle. This allows for simplifying the rotator, halving the complexity with respect to conventional serial FFT architectures. Likewise, the proposed approach halves the number of adders in the butterflies with respect to previous architectures. As a result, the proposed architectures use the minimum number of adders, rotators, and memory that are necessary for a pipelined FFT of serial data, with 100% utilization ratio.
18.	Kanders, Hans, et al. (författare) A 1 Million-Point FFT on a Single FPGA 2019 Ingår i: IEEE Transactions on Circuits and Systems Part 1. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1549-8328 .- 1558-0806. ; 66:10, s. 3863-3873 Tidskriftsartikel (refereegranskat)abstract In this paper, we present the first implementation of a 1 million-point fast Fourier transform (FFT) completely integrated on a single field-programmable gate array (FPGA), without the need for external memory or multiple interconnected FPGAs. The proposed architecture is a pipelined single-delay feedback (SDF) FFT. The architecture includes a specifically designed 1 million-point rotator with high accuracy and a thorough study of the word length at the different FFT stages in order to increase the signal-to-quantization-noise ratio (SQNR) and keep the area low. This also results in low power consumption.
19.	Kumm, Martin, et al. (författare) Optimal Single Constant Multiplication Using Ternary Adders 2018 Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1549-7747 .- 1558-3791. ; 65:7, s. 928-932 Tidskriftsartikel (refereegranskat)abstract The single constant coefficient multiplication is a frequently used operation in many numeric algorithms. Extensive previous work is available on how to reduce constant multiplications to additions, subtractions, and bit shifts. However, on previous work, only common two-input adders were used. As modern field-programmable gate arrays (FPGAs) support efficient ternary adders, i.e., adders with three inputs, this brief investigates constant multiplications that are built from ternary adders in an optimal way. The results show that the multiplication with any constant up to 22 bits can be realized by only three ternary adders. Average adder reductions of more than 33% compared to optimal constant multiplication circuits using two-input adders are achieved for coefficient word sizes of more than five bits. Synthesis experiments show FPGA average slice reductions in the order of 25% and a similar or higher speed than their two-input adder counterparts.
20.	Källström, Petter, et al. (författare) Low-Complexity Rotators for the FFT Using Base-3 Signed Stages 2012 Ingår i: APCCAS 2012 : 2012 IEEE Asia Pacific Conference on Circuits and Systems. - Piscataway, N.J., USA : IEEE. - 9781457717284 ; , s. 519-522 Konferensbidrag (refereegranskat)abstract Rotations by angles that are fractions of the unit circle find applications in e.g. fast Fourier transform (FFT) architectures. In this work we propose a new rotator that consists of a series of stages. Each stage calculates a micro-rotation by an angle corresponding to a power-of-three fractional parts. Using a continuous powers-of-three range, it is possible to carry out all rotations required. In addition, the proposed rotators are compared to previous approaches, based of shift-and-add algorithms, showing improvements in accuracy and number of adders.
21.	Moeller, Konrad, et al. (författare) Optimal Shift Reassignment in Reconfigurable Constant Multiplication Circuits 2018 Ingår i: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 0278-0070 .- 1937-4151. ; 37:3, s. 710-714 Tidskriftsartikel (refereegranskat)abstract This paper presents a new method called optimal shift reassignment (OSR), used for reconfigurable multiplication circuits. These circuits consist of adders, subtractors, shifts, and multiplexers (MUXs). They calculate the multiplication of an input number by one out of several constants which can be selected dynamically during run-time. The OSR method is based on the idea that shifts can be placed at different positions along the circuit, while the calculated output constant stays the same. This differs from previous approaches, which were limited by the fact that all constants within the constant multiplier were forced to be odd. The OSR method subsequently releases this restriction. As a result, the number of required MUXs in the circuit can be reduced. This happens when the shift reassignment aligns the shift values of different inputs of an MUX. Experimental results show MUX savings of up to 50% and average savings between 11% and 16% using the OSR method compared to previous approaches.
22.	Mohammadi Sarband, Narges, et al. (författare) Using Transposition to Efficiently Solve Constant Matrix-Vector Multiplication and Sum of Product Problems 2020 Ingår i: Journal of Signal Processing Systems. - : SPRINGER. - 1939-8018 .- 1939-8115. ; 92:10, s. 1075-1089 Tidskriftsartikel (refereegranskat)abstract In this work, we present an approach to alleviate the potential benefit of adder graph algorithms by solving the transposed form of the problem and then transposing the solution. The key contribution is a systematic way to obtain the transposed realization with a minimum number of cascaded adders subject to the input realization. In this way, wide and low constant matrix multiplication problems, with sum of products as a special case, which are normally exceptionally time consuming to solve using adder graph algorithms, can be solved by first transposing the matrix and then transposing the solution. Examples show that while the relation between the adder depth of the solution to the transposed problem and the original problem is not straightforward, there are many cases where the reduction in adder cost will more than compensate for the potential increase in adder depth and result in implementations with reduced power consumption compared to using sub-expression sharing algorithms, which can both solve the original problem directly in reasonable time and guarantee a minimum adder depth.

Skapa referenser, mejla, bekava och länka

Länka till träfflistan

Träfflista för sökning "WFRF:(Garrido Gálvez Mario) "

Avgränsa träffmängd

År