1. |
|
|
2. |
|
|
3. |
- Caputa, Peter, et al.
(författare)
-
An On-Chip Delay- and Skew-Insensitive Multi-Cycle Comunication Scheme
- 2006
-
Ingår i: International Solid-State Circuits Conference 2006, San Fransisco, USA. - 1424400791
-
Konferensbidrag (övrigt vetenskapligt/konstnärligt)abstract
- A synchronous latency-insensitive design (SLID) method that mitigates unknown on-chip global wire delays and removes the need for controlling global clock skew is presented. An SLID-based 5.4mm-long on-chip global bus, fabricated in a standard 0.18mum CMOS process, supports 3Gb/s/wire and accepts plusmn2 clock cycles of data-clock skew. This paper focuses on data synchronization for large global on-chip signals, which has become a difficult issue in high-frequency processor designs.
|
|
4. |
- Caputa, Peter, 1973-
(författare)
-
Efficient high-speed on-chip global interconnects
- 2006
-
Doktorsavhandling (övrigt vetenskapligt/konstnärligt)abstract
- The continuous miniaturization of integrated circuits has opened the path towards System-on-Chip realizations. Process shrinking into the nanometer regime improves transistor performancewhile the delay of global interconnects, connecting circuit blocks separated by a long distance, significantly increases. In fact, global interconnects extending across a full chip can have a delay corresponding to multiple clock cycles. At the same time, global clock skew constraints, not only between blocks but also along the pipelined interconnects, become even tighter. On-chip interconnects have always been considered RC-like, that is exhibiting long RC-delays. This has motivated large efforts on alternatives such as on-chip optical interconnects, which have not yet been demonstrated, or complex schemes utilizing on-chip F-transmission or pulsed current-mode signaling.In this thesis, we show that well-designed electrical global interconnects, behaving as transmission lines, have the capacity of very high data rates (higher than can be delivered by the actual process) and support near velocity-of-light delay for single-ended voltage-mode signaling, thus mitigating the RC-problem. We critically explore key interconnect performance measures such as data delay, maximum data rate, crosstalk, edge rates and power dissipation. To experimentally demonstrate the feasibility and superior properties of on-chip transmission line interconnects, we have designed and fabricated a test chip carrying a 5 mm long global communication link. Measurements show that we can achieve 3 Gb/s/wire over the 5 mm long, repeaterless on-chip bus implemented in a standard 0.18 μm CMOS process, achieving a signal velocity of 1/3 of the velocity of light in vacuum.To manage the problems due to global wire delays, we describe and implement a Synchronous Latency Insensitive Design (SLID) scheme, based on source-synchronous data transfer between blocks and data re-timing at the receiving block. The SLIDtechnique not onlymitigates unknown globalwire delays, but also removes the need for controlling global clock skew. The high-performance and high robustness capability of the SLID-method is practically demonstrated through a successful implementation of a SLID-based, 5.4 mm long, on-chip global bus, supporting 3 Gb/s/wire and dynamically accepting ± 2 clock cycles of data-clock skew, in a standard 0.18 μm CMOS porcess.In the context of technology scaling, there is a tendency for interconnects to dominate chip power dissipation due to their large total capacitance. In this thesis we address the problem of interconnect power dissipation by proposing and analyzing a transition-energy cost model aimed for efficient power estimation of performancecritical buses. The model, which includes properties that closely capture effects present in high-performance VLSI buses, can be used to more accurately determine the energy benefits of e.g. transition coding of bus topologies. We further show a power optimization scheme based on appropriate choice of reduced voltage swing of the interconnect and scaling of receiver amplifier. Finally, the power saving impact of swing reduction in combination with a sense-amplifying flip-flop receiver is shown on a microprocessor cache bus architecture used in industry.
|
|
5. |
- Caputa, Peter, et al.
(författare)
-
Well-Behaved Global On-Chip Interconnect
- 2005
-
Ingår i: IEEE Transactions on Circuits and Systems I: Regular Papers. - 1057-7122. ; 52:2, s. 318-323
-
Tidskriftsartikel (refereegranskat)abstract
- Global interconnects have been identified as a serious limitation to chip scaling, due to their latency and power consumption. We demonstrate a scheme to overcome these limitations, based on the utilization of upper-level metals, combined with structured communication architecture. Microwave style transmission lines in upper-level metals allow close-to-velocity-of-light delays if properly dimensioned. As an example, we demonstrate a 480-μm-wide and 20-mm-long bus with a capacity of 320 Gb/s in a nearly standard 0.18-μm process. The process differs from a standard process only through a somewhat thicker outer metal layer. We further illustrate how "self pre-emphasis" at the launch of a data pulse can be used to double the maximum available data rate over a wire. The proposed techniques are scalable, given that higher level metals are properly dimensioned in future processes.
|
|
6. |
|
|
7. |
- Källsten, Rebecca, et al.
(författare)
-
Capacitive Crosstalk Effects on On-Chip Interconnect Latencies and Data-Rates
- 2005
-
Ingår i: Proceedings of the 23rd Norchip Conference, Oulu, Finland. ; , s. 281-284
-
Konferensbidrag (övrigt vetenskapligt/konstnärligt)abstract
- We investigate how crosstalk affects latency, data-rate, and power dissipation for on-chip global interconnects in a 6-layer 0.18μm CMOS process. A simplified analytical interconnect description is compared to circuit simulations of a field solver extracted wire model. We show how repeater insertion can be utilized to achieve wave pipelining, which pushes maximum data-rate beyond the classical limit. Compared to simulations, the analytical model is pessimistic by 10% for latency, 30% for maximum data-rate, and 35% for power dissipation, highlighting the importance of avoiding too simple wire representations.
|
|