SwePub
Sök i SwePub databas

  Utökad sökning

Träfflista för sökning "WFRF:(Öberg Johnny) srt2:(2010-2014)"

Sökning: WFRF:(Öberg Johnny) > (2010-2014)

  • Resultat 1-10 av 15
Sortera/gruppera träfflistan
   
NumreringReferensOmslagsbildHitta
1.
  • Collin, Mikael, et al. (författare)
  • A performance and energy exploration of dictionary code compression architectures
  • 2011
  • Ingår i: 2011 International  Green Computing Conference and Workshops (IGCC). - : IEEE conference proceedings. - 9781457712227 ; , s. 1-8
  • Konferensbidrag (refereegranskat)abstract
    • We have made a performance and energy exploration of a previously proposed dictionary code compression mechanism where frequently executed individual instructions and/or sequences are replaced in memory with short code words. Our simulated design shows a dramatically reduced instruction memory access frequency leading to a performance improvement for small instruction cache sizes and to significantly reduced energy consumption in the instruction fetch path. We have evaluated the performance and energy implications of three architectural parameters: branch prediction accuracy, instruction cache size and organization. To asses the complexity of the design we have implemented the critical stages in VHDL.
  •  
2.
  • Mand, Nowshad Painda, et al. (författare)
  • Artificial neural network emulation on NOC based multi-core FPGA platform
  • 2012
  • Ingår i: NORCHIP, 2012. - : IEEE. - 9781467322218 ; , s. 6403122-
  • Konferensbidrag (refereegranskat)abstract
    • With the emergence of Multi-Core platforms, brain emulation in the form of Artificial Neural Nets has been announced as one of the important key research area. However, due to large non-linear growth of inter-neuron connectivity, direct mapping of ANNs to silicon structures is very difficult due to communication bottleneck.
  •  
3.
  • Mand, N. P., et al. (författare)
  • Going for brain-scale integration - Using FPGAS, TSVs and NOC based artificial neural networks : A case study
  • 2014
  • Ingår i: 11th FPGAworld Conference - Academic Proceedings 2014, FPGAWorld 2014. - New York, NY, USA : ACM Digital Library. - 9781450331302
  • Konferensbidrag (refereegranskat)abstract
    • With better understanding of brain's massive parallel processing, brain-scale integration has been announced as one of the key research area in modern times and numerous efforts has been done to mimic such models. Multicore architectures, Network-On-Chip, 3D stacked ICs with TSVs, FPGA's growth beyond Moore's law and new design methodologies like high level synthesis will ultimately lead us toward single- and multi-chip solutions of Artificial Neural Net models comprising of millions or even more neurons per chip. Historically ANNs have been emulated as either software models, ASICs or a hybrid of both. Software models are very slow while ASICs based designs lacks plasticity. FPGA consumes a little more power but offer the flexibility of software and performance of ASICs along with basic requirement of plasticity in the form of reconfigurability. However, the traditional bottom up approach for building large ANN models is no more feasible and wiring along with memory becomes major bottlenecks when considering networks comprised of large number of neurons. The aim of this paper is to present a design space exploration of large-scale ANN models using a scalable NOC based architecture together with high level synthesis tools to explore the feasibility of implementing brain-scale ANNs on FPGAs using 3D stacked memory structures.
  •  
4.
  • Navas, Byron, et al. (författare)
  • On providing scalable self-healing adaptive fault-tolerance to RTR SoCs
  • 2014
  • Ingår i: Proceedings of ReConFigurable Computing and FPGAs (ReConFig), 2014 International Conference on. - 9781479959440 ; , s. 1-6
  • Konferensbidrag (refereegranskat)abstract
    • The dependability of heterogeneous many-core FPGA based systems are threatened by higher failure rates caused by disruptive scales of integration, increased design complexity, and radiation sensitivity. Triple-modular redundancy (TMR) and run-time reconfiguration (RTR) are traditional fault-tolerant (FT) techniques used to increase dependability. However, hardware redundancy is expensive and most approaches have poor scalability, flexibility, and programmability. Therefore, innovative solutions are needed to reduce the redundancy cost but still preserve acceptable levels of dependability. In this context, this paper presents the implementation of a self-healing adaptive fault-tolerant SoC that reuses RTR IP-cores in order to self-assemble different TMR schemes during run-time. The presented system demonstrates the feasibility of the Upset-Fault-Observer concept, which provides a run-time self-test and recovery strategy that delivers fault-tolerance over functions accelerated in RTR cores, at the same time reducing the redundancy scalability cost by running periodic reconfigurable TMR scan-cycles. In addition, this paper experimentally evaluates the trade-off of the implemented reconfigurable TMR schemes by characterizing important fault tolerant metrics i.e., recovery time (self-repair and self-replicate), detection latency, self-assembly latency, throughput reduction, and increase of physical resources.
  •  
5.
  • Navas, Byron, et al. (författare)
  • The RecoBlock SoC Platform : A Flexible Array of Reusable Run-Time-Reconfigurable IP-Blocks
  • 2013
  • Ingår i: Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013. - 9781467350716 ; , s. 833-838
  • Konferensbidrag (refereegranskat)abstract
    • Run-time reconfigurable (RTR) FPGAs combine the flexibility of software with the high efficiency of hardware. Still, their potential cannot be fully exploited due to increased complexity of the design process. Consequently, to enable an efficient design flow, we devise a set of prerequisites to increase the flexibility and reusability of current FPGA-based RTR architectures. We apply these principles to design and implement the RecoBlock SoC platform, which main characterization is (1) a RTR plug-and-play IP-Core whose functionality is configured at run-time; (2) flexible inter-block communication configured via software, and (3) built-in buffers to support data-driven streams and inter-process communications. We illustrate the potential of our platform by a tutorial case study using an adaptive streaming application to investigate different combinations of reconfigurable arrays and schedules. The experiments underline the benefits of the platform and shows resource utilization.
  •  
6.
  • Navas, Byron, et al. (författare)
  • The Upset-Fault-Observer : A Concept for Self-healing Adaptive Fault Tolerance
  • 2014
  • Ingår i: Proceedings of the 2014 NASA/ESA Conference on Adaptive Hardware and Systems, AHS 2014. - : IEEE Computer Society. - 9781479953561 ; , s. 89-96
  • Konferensbidrag (refereegranskat)abstract
    • Advancing integration reaching atomic-scales makes components highly defective and unstable during lifetime. This demands paradigm shifts in electronic systems design. FPGAs are particularly sensitive to cosmic and other kinds of radiations that produce single-event-upsets (SEU) in configuration and internal memories. Typical fault-tolerance (FT) techniques combine triple-modular-redundancy (TMR) schemes with run-time-reconfiguration (RTR). However, even the most successful approaches disregard the low suitability of fine-grain redundancy in nano-scale design, poor scalability and programmability of application specific architectures, small performance-consumption ratio of board-level designs, or scarce optimization capability of rigid redundancy structures. In that context, we introduce an innovative solution that exploits the flexibility, reusability, and scalability of a modular RTR SoC approach and reuse existing RTR IP-cores in order to assemble different TMR schemes during run-time. Thus, the system can adaptively trigger the adequate self-healing strategy according to execution environment metrics and user-defined goals. Specifically the paper presents: (a) the upset-fault-observer (UFO), an innovative run-time self-test and recovery strategy that delivers FT on request over several function cores but saves the redundancy scalability cost by running periodic reconfigurable TMR scan-cycles, (b) run-time reconfigurable TMR schemes and self-repair mechanisms, and (c) an adaptive software organization model to manage the proposed FT strategies.
  •  
7.
  • Navas, Byron, et al. (författare)
  • Towards the generic reconfigurable accelerator : Algorithm development, core design, and performance analysis
  • 2013
  • Konferensbidrag (refereegranskat)abstract
    • Adoption of reconfigurable computing is limited in part by the lack of simplified, economic, and reusable solutions. The significant speedup and energy saving can increase performance but also design complexity; in particular for heterogeneous SoCs blending several CPUs, GPUs, and FPGA-Accelerator Cores. On the other hand, implementing complex algorithms in hardware requires modeling and verification, not only HDL generation. Most approaches are too specific without looking for reusability. Therefore, we present a solution based on: (1) a design methodology to develop algorithms accelerated in reconfigurable/non-reconfigurable IP-Cores, using common access tools, and contemplating verification from model to embedded software stages; (2) a generic accelerator core design that enables relocation and reuse almost independently of the algorithm, and data-flow driven execution models; and (3) a performance analysis of the acceleration mechanisms included in our system (i.e., accelerator core, burst I/O transfers, and reconfiguration pre-fetch). In consequence, the implemented system accelerates algorithms (e.g., FIR and Kalman filters) with speedups up to 3 orders of magnitude, compared to processor implementations.
  •  
8.
  • Robino, Francesco, 1985- (författare)
  • A model-based design approach for heterogeneous NoC-based MPSoCs on FPGA
  • 2014
  • Licentiatavhandling (övrigt vetenskapligt/konstnärligt)abstract
    • Network-on-chip (NoC) based multi-processor systems-on-chip (MPSoCs) are promising candidates for future multi-processor embedded platforms, which are expected to be composed of hundreds of heterogeneous processing elements (PEs) to potentially provide high performances. However, together with the performances, the systems complexity will increase, and new high level design techniques will be needed to efficiently model, simulate, debug and synthesize them. System-level design (SLD) is considered to be the next frontier in electronic design automation (EDA). It enables the description of embedded systems in terms of abstract functions and interconnected blocks. A promising complementary approach to SLD is the use of models of computation (MoCs) to formally describe the execution semantics of functions and blocks through a set of rules. However, also when this formalization is used, there is no clear way to synthesize system-level models into software (SW) and hardware (HW) towards a NoC-based MPSoC implementation, i.e., there is a lack of system design automation (SDA) techniques to rapidly synthesize and prototype system-level models onto heterogeneous NoC-based MPSoCs. In addition, many of the proposed solutions require large overhead in terms of SW components and memory requirements, resulting in complex and customized multi-processor platforms. In order to tackle the problem, a novel model-based SDA flow has been developed as part of the thesis. It starts from a system-level specification, where functions execute according to the synchronous MoC, and then it can rapidly prototype the system onto an FPGA configured as an heterogeneous NoC-based MPSoC. In the first part of the thesis the HeartBeat model is proposed as a model-based technique which fills the abstraction gap between the abstract system-level representation and its implementation on the multiprocessor prototype. Then details are provided to describe how this technique is automated to rapidly prototype the modeled system on a flexible platform, permitting to adjust the system specification until the designer is satisfied with the results. Finally, the proposed SDA technique is improved defining a methodology to automatically explore possible design alternatives for the modeled system to be implemented on a heterogeneous NoC-based MPSoC. The goal of the exploration is to find an implementation satisfying the designer's requirements, which can be integrated in the proposed SDA flow. Through the proposed SDA flow, the designer is relieved from implementation details and the design time of systems targeting heterogeneous NoC-based MPSoCs on FPGA is significantly reduced. In addition, it reduces possible design errors proposing a completely automated technique for fast prototyping. Compared to other SDA flows, the proposed technique targets a bare-metal solution, avoiding the use of an operating system (OS). This reduces the memory requirements on the FPGA platform comparing to related work targeting MPSoC on FPGA. At the same time, the performance (throughput) of the modeled applications can be increased when the number of processors of the target platform is increased. This is shown through a wide set of case studies implemented on FPGA.
  •  
9.
  • Robino, Francesco, et al. (författare)
  • From Simulink to NoC-based MPSoC on FPGA
  • 2014
  • Ingår i: <em>Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014</em>. - : IEEE.
  • Konferensbidrag (refereegranskat)abstract
    • Network-on-chip (NoC) based multi-processor systems are promising candidates for future embedded system platforms. However, because of their complexity, new high level modeling techniques are needed to design, simulate and synthesize embedded systems targeting NoC-based MPSoC. Simulink is a popular modeling environment suitable to model at system level. However, there is no clear standard to synthesize Simulink models into SW and HW towards a NoC-based MPSoC implementation. In addition, many of the proposed solutions require large overhead in terms of SW components and memory requirements, resulting in complex and customized multi-processor platforms. In this paper we present a novel design flow to synthesize Simulink models onto a NoC-based MPSoC running on low-cost FPGAs. Our design flow constrains the MPSoC and the Simulink model to share a common semantics domain. This permits to reduce the need of resource consuming SW components, reducing the memory requirements on the platform. At the same time, performances (throughput) of dataflow applications can increase when the number of processors of the target platform is increased. This is shown through a case study on FPGA.
  •  
10.
  •  
Skapa referenser, mejla, bekava och länka
  • Resultat 1-10 av 15

Kungliga biblioteket hanterar dina personuppgifter i enlighet med EU:s dataskyddsförordning (2018), GDPR. Läs mer om hur det funkar här.
Så här hanterar KB dina uppgifter vid användning av denna tjänst.

 
pil uppåt Stäng

Kopiera och spara länken för att återkomma till aktuell vy