1. 
 Grange, Matt, et al.
(author)

Modeling the Computational Efficiency of 2D and 3D Silicon Processors for Early-Chip Planning
 2011

In: 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).  9781457713989  9781457713996 ; pp. 310-317

Conference paper (peer-reviewed)
 Hierarchical models from the physical to the system level are proposed for architectural exploration of high-performance silicon systems to quantify the performance and cost trade-offs for 2D and 3D IC implementations. We show that 3D systems can reduce interconnect delay and energy by up to an order of magnitude over 2D, with an increase of 20-30% in performance-per-watt for every doubling of stack height. Contrary to previous analysis, the improved energy efficiency is achievable at a favorable cost. The models are packaged as a standalone tool and can provide fast estimation of coarse-grain performance and cost limitations for a variety of processing systems to be used at the early chip-planning phase of the design cycle.
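
The scaling claim in this abstract can be made concrete with a small calculation. The sketch below is illustrative only and is not the authors' tool: it assumes the 20-30% gain compounds multiplicatively with each doubling of stack height, which the abstract does not state explicitly, and the function name `perf_per_watt_gain` is hypothetical.

```python
import math


def perf_per_watt_gain(layers, gain_per_doubling):
    """Relative performance-per-watt of a stack versus a single layer.

    Assumption (not from the abstract): the per-doubling gain compounds,
    so the result is (1 + gain) ** log2(layers).
    """
    if layers < 1:
        raise ValueError("layers must be at least 1")
    doublings = math.log2(layers)
    return (1.0 + gain_per_doubling) ** doublings


if __name__ == "__main__":
    # Bracket the abstract's 20-30% range for a few stack heights.
    for layers in (1, 2, 4, 8, 16):
        low = perf_per_watt_gain(layers, 0.20)
        high = perf_per_watt_gain(layers, 0.30)
        print(f"{layers:2d} layers: {low:.2f}x to {high:.2f}x")
```

Under this compounding assumption, a 16-layer stack (four doublings) would land at roughly 2.1x to 2.9x the performance-per-watt of a single layer.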


2. 
 Grange, Matt, et al.
(author)

Modeling the Efficiency of Stacked Silicon Systems : Computational, Thermal and Electrical Performance
 2011

Conference paper (peer-reviewed)
 Technological advances in processor design have typically relied on scaling feature size and frequency. Recently, however, many new design choices have emerged, partly due to the slowing of scaling: many-core architectures are beginning to replace single-core ICs to circumvent 2D bottlenecks, and the number of I/Os is on the rise, so the cost of off-chip transactions is becoming heftier. Moreover, 3D integration may provide further performance benefits without investment in lower technology nodes. Understanding these trade-offs can provide guidelines to optimize the architecture of future systems under performance, thermal and cost constraints. We have constructed a model and tool that assesses computational efficiency under these criteria.


3. 
 Grange, Matt, et al.
(author)

Optimal Network Architectures for Minimizing Average Distance in k-ary n-dimensional Mesh Networks
 2011

In: NOCS 2011: The 5th ACM/IEEE International Symposium on Networks-on-Chip.  ACM Digital Library. ; pp. 57-64

Conference paper (peer-reviewed)
 A general expression for the average distance for meshes of any dimension and radix, including unequal radices in different dimensions, valid for any traffic pattern under zero-load conditions, is formulated rigorously to allow its calculation without network-level simulations. The average distance expression is solved analytically for uniform random traffic and for a set of local random traffic patterns. Hot spot traffic patterns are also considered, and the formula is empirically validated by cycle-true simulations for uniform random, local, and hot spot traffic. Moreover, a methodology to attain closed-form solutions for other traffic patterns is detailed. Furthermore, the model is applied to guide design decisions. Specifically, we show that the model can predict the optimal 3D topology for uniform and local traffic patterns. It can also predict the optimal placement of hot spots in the network. The fidelity of the approach in suggesting the correct design choices even for loaded and congested networks is surprising. For those cases we studied empirically it is 100%.
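
To illustrate the kind of quantity this abstract describes, the sketch below computes the zero-load average hop distance of a mesh under uniform random traffic. It is not taken from the paper: it uses the textbook result that a line of k nodes has mean distance (k² − 1)/(3k) over all ordered pairs, and it assumes a mesh without wraparound links and minimal (Manhattan) routing. The function names are hypothetical, and a brute-force enumeration is included as a cross-check.

```python
from fractions import Fraction
from itertools import product
from math import prod


def avg_distance_closed_form(radices, include_self=False):
    """Average Manhattan distance in a mesh with the given per-dimension radices.

    Assumes uniform random traffic, no wraparound links, minimal routing.
    With include_self=False, source == destination pairs are excluded.
    """
    n_nodes = prod(radices)
    if n_nodes < 2 and not include_self:
        raise ValueError("need at least two nodes when self-traffic is excluded")
    # Per-dimension mean of |i - j| over all ordered pairs, self-pairs included.
    avg = sum(Fraction(k * k - 1, 3 * k) for k in radices)
    if not include_self:
        # Self-pairs add zero distance, so only the pair count changes.
        avg *= Fraction(n_nodes, n_nodes - 1)
    return avg


def avg_distance_brute_force(radices, include_self=False):
    """Same quantity by enumerating every source/destination pair."""
    nodes = list(product(*(range(k) for k in radices)))
    total = sum(
        sum(abs(a - b) for a, b in zip(src, dst))
        for src in nodes
        for dst in nodes
    )
    n = len(nodes)
    pairs = n * n if include_self else n * (n - 1)
    return Fraction(total, pairs)


if __name__ == "__main__":
    # Compare two 64-node organisations: a flat 8x8 mesh and a 4x4x4 stack.
    for radices in ((8, 8), (4, 4, 4)):
        print(radices, float(avg_distance_closed_form(radices)))
```

For 64 nodes, the 8x8 mesh averages 16/3 hops and the 4x4x4 mesh averages 80/21 hops under these assumptions, which shows how such a formula can rank candidate topologies without simulation.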


4. 
 Grange, Matt, et al.
(author)

Physical mapping and performance study of a multi-clock 3-Dimensional Network-on-Chip mesh
 2009

In: 2009 IEEE INTERNATIONAL CONFERENCE ON 3D SYSTEMS INTEGRATION.  San Francisco : IEEE conference proceedings.  9781424445110 ; pp. 345-351

Conference paper (peer-reviewed)
 The physical performance of a 3-Dimensional Network-on-Chip (NoC) mesh architecture employing through-silicon vias (TSVs) for vertical connectivity is investigated with a cycle-accurate RTL simulator. The physical latency and area impact of TSVs, switches, and the on-chip interconnect is evaluated to extract the maximum signaling speeds through the switches and the horizontal and vertical network links. The relatively low parasitics of TSVs compared to the on-chip 2D interconnect allow for higher signaling speeds between chip layers. The system-level impact on overall network performance as a result of clocking vertical packets at a higher rate through the TSV interconnect is simulated and reported.


5. 
 Jantsch, Axel, et al.
(author)

The Promises and Limitations of 3D Integration
 2011

In: 3D Integration for NoC-based SoC Architectures.  Springer Publishing Company. ; pp. 27-44

Book chapter (other academic)
 The intrinsic computational efficiency (ICE) of silicon defines the upper limit of the amount of computation within a given technology and power envelope. The effective computational efficiency (ECE) and the effective computational density (ECD) of silicon, by taking computation, memory and communication into account, offer a more realistic upper bound for computation of a given technology. Among other factors, they consider how distributed the memory is, how much area is occupied by computation, memory and interconnect, and the geometric properties of 3D stacked technology with through-silicon vias (TSVs) as vertical links. We use ECE and ECD to study the limits of performance under different memory distribution constraints of various 2D and 3D topologies, in current and future technology nodes. Among other results, our model shows that in a 35 nm technology a 16-stack 3D system can, as a theoretical upper limit, obtain 3.4 times the performance of a 2D system (8.8 TOPS vs 2.6 TOPS) at 70% reduced frequency (2.1 vs 3.7 GHz) on 1/8 the total area (50 vs 400 mm²).


6. 
 Pamunuwa, Dinesh, et al.
(author)

3D Integration and the Limits of Silicon Computation
 2011

In: Proceedings of the International Conference on Very Large Scale Integration (VLSI-SoC). ; pp. 343-348

Conference paper (peer-reviewed)
 The intrinsic computational efficiency (ICE) of silicon defines the upper limit of the amount of computation within a given technology and power envelope. The effective computational efficiency (ECE) and the effective computational density (ECD) of silicon, by taking computation, memory and communication into account, offer a more realistic upper bound for computation of a given technology. Among other factors, they consider how distributed the memory is, how much area is occupied by computation, memory and interconnect, and the geometric properties of 3D stacked technology with through-silicon vias (TSVs) as vertical links. We use the ECE and ECD to study the limits of performance under different memory distribution, power, thermal and cost constraints for various 2D and 3D topologies, in current and future technology nodes.


7. 
 Pamunuwa, Dinesh, et al.
(author)

A study on the implementation of 2D mesh-based networks-on-chip in the nanometre regime
 2004

In: Integration.  0167-9260. ; 38:1, pp. 3-17

Journal article (peer-reviewed)
 On-chip packet-switched networks have been proposed for future gigascale integration in the nanometre regime. This paper examines likely architectures for such networks and considers trade-offs in the layout, performance, and power consumption based on full-swing, voltage-mode CMOS signalling. A study is carried out for a future technology with parameters as predicted by the International Technology Roadmap for Semiconductors to yield a quantitative comparison of the performance and power trade-off for the network. Important physical-level issues are discussed.


8. 
 Weldezion, Awet Yemane, et al.
(author)

Scalability of Network-on-Chip Communication Architecture for 3D Meshes
 2009

In: 2009 3RD ACM/IEEE INTERNATIONAL SYMPOSIUM ON NETWORKS-ON-CHIP.  NEW YORK : IEEE.  9781424441426 ; pp. 114-123

Conference paper (peer-reviewed)
 Design constraints imposed by global interconnect delays, as well as limitations in the integration of disparate technologies, make 3D chip stacks an enticing technology solution for massively integrated electronic systems. The scarcity of vertical interconnects, however, imposes special constraints on the design of the communication architecture. This article examines the performance and scalability of different communication topologies for 3D Networks-on-Chip (NoCs) using Through-Silicon Vias (TSVs) for inter-die connectivity. Cycle-accurate RTL-level simulations are conducted for two communication schemes based on a 7-port switch and a centrally arbitrated vertical bus using different traffic patterns. The scalability of the 3D NoC is examined under both communication architectures and compared to 2D NoC structures in terms of throughput and latency in order to quantify the variation of network performance with the number of nodes and derive key design guidelines.


9. 
 Weldezion, Awet Yemane, et al.
(author)

Zero-load Predictive Model for Performance Analysis in Deflection Routing NoCs
 2015

In: Microprocessors and microsystems.  Elsevier B.V.  0141-9331. ; 39:8, pp. 634-647

Journal article (peer-reviewed)
 We study a static model for 2D and 3D networks that accurately represents the average distance travelled by packets under deflection routing, which is a specific form of adaptive routing. The model captures static properties of the network topology and the spatial distribution of traffic, but does not take into account traffic loading and congestion. Even though this static model cannot accurately predict packet latency under high load, we contend that it is a perfect predictor of deflection routing networks’ relative performance under any load condition below saturation, and thus always correctly predicts the optimum network configuration. This is verified through cycle-accurate simulations of congested and uncongested networks with fully adaptive deflection routing for regular traffic patterns such as uniform random, localised, bursty, and others, as well as irregular patterns in both regular and irregular networks. As the networks with minimal average distance perform best even under high traffic load, the average distance model establishes a robust relation between a static network property, average distance, and network performance under load, providing new insight into network behaviour and an opportunity to identify the optimal network configuration without time-consuming simulations.

