SwePub

Result list for search "WFRF:(Öberg Johnny) srt2:(2005-2009)"

  • Results 1-10 of 16
1.
  • Krenz-Bååth, René, 1974- (author)
  • Dominator-based Algorithms in Logic Synthesis and Verification
  • 2007
  • Doctoral thesis (other academic/artistic). Abstract:
    • Today's EDA (Electronic Design Automation) industry faces enormous challenges. Their primary cause is the tremendous increase in the complexity of modern digital designs. Graph algorithms are widely applied to solve various EDA problems. In particular, graph dominators, which provide information about the origin and the end of reconverging paths in a circuit graph, have proved useful in various CAD (Computer Aided Design) applications such as equivalence checking, ATPG, technology mapping, and power optimization. This thesis provides a study on graph dominators in logic synthesis and verification. The thesis contributes a set of algorithms for computing dominators in circuit graphs. An algorithm is proposed for finding absolute dominators in circuit graphs. The achieved speedup of three orders of magnitude on several designs enables the computation of absolute dominators in large industrial designs in a few seconds. Moreover, the computation of single-vertex dominators in large multiple-output circuit graphs is considerably improved. The proposed algorithm reduces the overall runtime by efficiently recognizing and re-using isomorphic structures in dominator trees rooted at different outputs of the circuit graph. Finally, common multiple-vertex dominators are introduced. The algorithm to compute them is faster and finds more multiple-vertex dominators than previous approaches. The thesis also proposes new dominator-based algorithms in the area of decomposition and combinational equivalence checking. A structural decomposition technique is introduced, which finds all simple-disjoint decompositions of a Boolean function which are reflected in the circuit graph. The experimental results demonstrate that the proposed technique outperforms state-of-the-art functional decomposition techniques. Finally, an approach to check the equivalence of two Boolean functions probabilistically is investigated. The proposed algorithm partitions the equivalence check employing dominators in the circuit graph. The experimental results confirm that, in comparison to traditional BDD-based equivalence checking methods, the memory consumption is considerably reduced by using the proposed technique.
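As a rough illustration of the single-vertex dominators mentioned in the abstract above, the following minimal sketch uses the textbook iterative set-intersection method, not the algorithms proposed in the thesis. The function name `dominators`, the toy graph, and the convention that edges run from the circuit output towards its inputs are all assumptions made for this example.

```python
def dominators(succ, root):
    """Return {v: set of vertices dominating v} for all vertices reachable from root.

    A vertex d dominates v if every path from root to v passes through d.
    This is the classic iterative data-flow formulation (an assumption of
    this sketch), not the thesis's algorithm.
    """
    # Collect the vertices reachable from the root with a depth-first walk.
    reachable, seen, stack = [], {root}, [root]
    while stack:
        v = stack.pop()
        reachable.append(v)
        for w in succ.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)

    # Build predecessor lists restricted to reachable vertices.
    pred = {v: [] for v in reachable}
    for v in reachable:
        for w in succ.get(v, ()):
            pred[w].append(v)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {v: set(reachable) for v in reachable}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for v in reachable:
            if v == root:
                continue
            new = set.intersection(*(dom[p] for p in pred[v])) | {v}
            if new != dom[v]:
                dom[v] = new
                changed = True
    return dom


# Hypothetical toy circuit: output "out" is driven by gate "g1", whose two
# fanins "a" and "b" both depend on the same input "x" (reconverging paths).
circuit = {"out": ["g1"], "g1": ["a", "b"], "a": ["x"], "b": ["x"]}
dom = dominators(circuit, "out")
```

In this toy graph the two paths that start at `x` reconverge at `g1`, so `g1` dominates `x` while neither `a` nor `b` does.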
2.
  • Landernäs, Krister (author)
  • Implementation of digit-serial LDI/LDD allpass filters
  • 2006
  • Doctoral thesis (other academic/artistic). Abstract:
    • In this thesis, digit-serial implementation of recursive digital filters is considered. The theories presented can be applied to any recursive digital filter, and in this thesis we study the lossless discrete integrator (LDI) allpass filter. A brief introduction regarding suppression of limit cycles at finite wordlength conditions is given, and an extended stability region, where the second-order LDI allpass filter is free from quantization limit cycles, is presented. The realization of digit-serial processing elements, i.e., digit-serial adders and multipliers, is studied. A new digit-serial hybrid adder (DSHA) is presented. The adder can be pipelined to the bit level with a short arithmetic critical path, which makes it well suited when implementing high-throughput recursive digital filters. Two digit-serial multipliers which can be pipelined to the bit level are considered. It is concluded that a digit-serial/parallel multiplier based on shift-accumulation (DSAAM) is a good candidate when implementing recursive digital systems, mainly due to low latency. Furthermore, our study shows that low latency will lead to higher throughput and lower power consumption. Scheduling of recursive digit-serial algorithms is studied. It is concluded that implementation issues such as latency and arithmetic critical path are usually required before scheduling considerations can be made. Cyclic scheduling using digit-serial arithmetic is also considered. It is shown that digit-serial cyclic scheduling is very attractive for high-throughput implementations.
3.
  • Minhass, Wajid Hassan, et al. (author)
  • Design and implementation of a plesiochronous multi-core 4x4 network-on-chip FPGA platform with MPI HAL support
  • 2009
  • In: 6th FPGAworld Conference, Academic Proceedings 2009. - New York, NY, USA : ACM. - ISBN 9781605588797 ; pp. 52-57
  • Conference paper (peer-reviewed). Abstract:
    • The Multi-Core NoC is a 4 by 4 Mesh NoC targeted for Altera FPGAs. It implements a deflective routing policy and is used to connect sixteen NIOS II processors. Each NIOS II is connected to the NoC via an address-mapped Resource Network Interface (RNI). The Multi-Core NoC is implemented on four separate Altera Stratix II FPGA boards, each hosting a Quad-Core NoC, which operates on a local 50 MHz clock. It has an onboard throughput of 650 Mbps (12.5 MFlit/s), and uses 28% of the LUs, 18% of the ALUTs, 22% of the dedicated registers and 31% of the total memory blocks of a Stratix II FPGA. Asynchronous clock bridges, with a throughput of 50 Mbps (~1 MFlit/s), are used for the inter-board communication. Application programs use an MPI-compatible Hardware Abstraction Layer (HAL) to communicate with the Resource Network Interface of the NoC. The RNI sets up message transfers, with a maximum length of 512 bytes, and sends flits consisting of 32 bits of data plus 20 bits of header through the network. The MPI is the bottleneck of the system; it takes 46 µs (43.4 kPackets/s) to send a minimum-sized packet through the protocol stack to a near neighbour and bounce it back to the original application. The bounce-back time for a far neighbour is 56 µs.
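The relation between the bit rates and flit rates quoted in the abstract above can be checked with a short calculation. This is only a sketch: it assumes that the Mbps figures count all 52 bits of a flit (32 data bits plus 20 header bits), which is consistent with the quoted numbers but is not stated explicitly. The constant and function names are made up for the example.

```python
# Flit layout as given in the abstract: 32 data bits plus 20 header bits.
FLIT_DATA_BITS = 32
FLIT_HEADER_BITS = 20
FLIT_BITS = FLIT_DATA_BITS + FLIT_HEADER_BITS  # 52 bits per flit


def flit_rate_mflits(throughput_mbps):
    """Convert a raw link throughput in Mbps to a flit rate in MFlit/s.

    Assumes the throughput figure includes header bits (an assumption of
    this sketch).
    """
    return throughput_mbps / FLIT_BITS


onboard_rate = flit_rate_mflits(650)  # on-board NoC links
bridge_rate = flit_rate_mflits(50)    # asynchronous inter-board clock bridges

print(f"on-board: {onboard_rate:.2f} MFlit/s")
print(f"bridge:   {bridge_rate:.2f} MFlit/s")
```

Under that assumption, 650 Mbps corresponds to 12.5 MFlit/s and 50 Mbps to roughly 0.96 MFlit/s, matching the "12.5 MFlit/s" and "~1 MFlit/s" figures in the abstract.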
4.
  • Minhass, Wajid Hassan, et al. (author)
  • Implementation of a scalable, globally plesiochronous locally synchronous, off-chip NoC communication protocol
  • 2009
  • In: 2009 NORCHIP. - ISBN 9781424443109 ; pp. 1-5
  • Conference paper (peer-reviewed). Abstract:
    • Multiprocessor system-on-chip (MPSoC) design is becoming a regular feature of embedded systems. Shared-bus systems hold many advantages, but they do not scale. Network on chip (NoC) offers a promising solution to the scalability problem by enhancing the topology design. However, standard NoCs are only scalable within a chip. To be able to build infinitely scalable structures, a method to enhance the NoC grid off-chip is needed. In this paper, we present such a method. As a proof of concept, the protocol is implemented on a 4 by 4 Mesh NoC, with NIOS II CPU cores as nodes, partitioned across four separate Altera FPGA boards, each board hosting a Quad-Core (2x2) NoC, operating on a local 50 MHz clock. The inter-chip communication protocol uses asynchronous clock bridges, with a throughput of 50 Mbps (~1 MFlit/s), and is completely scalable. The NoC has an onboard throughput of 650 Mbps (12.5 MFlit/s). Each Quad-Core uses 28% of the LUs, 18% of the ALUTs, 22% of the dedicated registers and 31% of the total memory blocks of the Stratix II FPGAs. Application programs use an MPI-compatible Hardware Abstraction Layer (HAL) to communicate with each other over the NoC.
5.
  • Navas, Byron, et al. (author)
  • Camera and LCM IP-Cores for NIOS SOPC System
  • 2009
  • In: 6th FPGAworld Conference, Academic Proceedings 2009. - New York : ACM. - ISBN 9781605588797 ; pp. 18-23
  • Conference paper (peer-reviewed). Abstract:
    • This paper presents the development of IP-Cores to integrate the Terasic DC2 Camera and LCM (LCD Module) daughter boards into an Altera Nios System, so that the image can be further processed by embedded software or custom hardware instructions. Among the challenges overcome during this work are clock-domain crossing, synchronizing FIFO design, variable and pipelined burst control, multi-master contention for system memory and image frame buffer switching. In addition, we designed software device drivers and API functions intended for graphics, image processing and video control, which are part of the IP deliverables. In brief, this work describes some concepts and methodologies involved in the creation of IP-Cores for an Altera SOPC; it also presents the results of the designed CAM-IP and LCM-IP Cores working in an application demo, which constitutes a real solution and a reference design.
6.
  • Nilsson, Erland, 1977- (author)
  • Exploring trade-offs between Latency and Throughput in the Nostrum Network on Chip
  • 2006
  • Licentiate thesis (other academic/artistic). Abstract:
    • During the past years, the Nostrum Network on Chip (NoC) has been developed to become a competitive platform for network-based on-chip communication. The Nostrum NoC provides a versatile communication platform to connect a large number of intellectual properties (IP) on a single chip. The communication is based on a packet-switched network which aims for a small physical footprint while still providing a low communication overhead. To reduce the communication network size, a queue-less network has been the research focus. This network uses deflective hot-potato routing, which is implemented to perform routing decisions in a single clock cycle. Using a platform like this results in increased design reusability, validated signal integrity, and well-developed test strategies, in contrast to fully customised designs, which can have a more optimal communication structure but have a significantly longer development cycle to verify the new design accordingly. Several factors are considered when designing a communication platform. The goal is to create a platform which provides low communication latency, high throughput, low power consumption, small footprint, and low design, verification, and test overhead. Proximity Congestion Awareness is one technique that serves to reduce the network load. This reduces the latency, which also increases the network throughput. Another technique is to implement low-latency paths called Data Motorways, achieved through a clocking method called Globally Pseudochronous Locally Synchronous clocking. Furthermore, virtual circuits can be used to provide guarantees on latency and throughput. Such guarantees are difficult in hot-potato networks since network access has to be ensured. A technique that implements virtual circuits uses looped containers that circulate on a predefined circuit. Several overlapping virtual circuits are possible by allocating the virtual circuits in different Temporally Disjoint Networks. This thesis summarises the impact the presented techniques and methods have on the characteristics of the Nostrum model. It is possible to reduce the network load by a factor of 20, which reduces the communication latency. This is done by distributing load information between the Switches in the network. Data Motorways can reduce the communication latency by up to 50% in heavily loaded networks. Such latency reduction results in freed buffer space in the Switch registers, which allows the traffic rate to be increased by about 30%.
7.
  •  
8.
  • Nilsson, Erland, et al. (author)
  • Trading off Power versus Latency using GPLS Clocking in 2D-Mesh NoCs
  • 2005
  • In: ISSCS 2005: International Symposium on Signals, Circuits and Systems, Vols 1 and 2, Proceedings. - New York, USA : IEEE. - ISBN 0780390296 ; pp. 51-54
  • Conference paper (peer-reviewed). Abstract:
    • To handle the design complexity when the number of transistors on-chip reaches one billion, new ways of organizing chips will be needed. One solution to this problem is to organize computational resources in a grid, where all communication between the resources is performed using an interconnection network. These networks are commonly referred to as Networks-on-Chip, or NoCs. This paper focuses on the trade-off between power and latency while keeping the required interconnection bandwidth constant. The clock frequency can be lowered to reduce the power, with reduced bandwidth as a consequence, which in a synchronous system will increase the latency linearly. In a 2D-Mesh NoC structure, it is possible to choose regions with different clock phases and arrange them in such a way that the latency from sender to receiver along certain paths is nearly constant, and the total average latency is reduced by 50%. The reduction can then be exploited to trade off latency vs. power; the GPLS solution consumes 50% of the power compared to the fully synchronous solution, at the same latency and constant throughput.
9.
  •  
10.
  • Petersen, Kim, et al. (author)
  • Toward a scalable test methodology for 2D-mesh network-on-chips
  • 2007
  • In: 2007 Design, Automation & Test in Europe Conference & Exhibition. - ISBN 9783981080124 ; pp. 367-372
  • Conference paper (peer-reviewed). Abstract:
    • This paper presents a BIST strategy for testing the NoC interconnect network, and investigates whether the strategy is a suitable approach for the task. All switches and links in the NoC are tested with BIST running at full clock speed, and in a functional-like mode. The BIST is carried out as a go/no-go BIST operation at start-up, or on command. It is shown that the proposed methodology can be applied to different implementations of deflecting switches, and that the test time is limited to a few thousand clock cycles with fault coverage close to 100%.