Sökning: onr:"swepub:oai:DiVA.org:kth-160878" >
On providing scalab...
On providing scalable self-healing adaptive fault-tolerance to RTR SoCs
-
- Navas, Byron (författare)
- KTH,Elektroniksystem,ESPE Universidad de Las Fuerzas Armadas, Ecuador
-
- Öberg, Johnny (författare)
- KTH,Elektroniksystem
-
- Sander, Ingo (författare)
- KTH,Elektroniksystem
-
(creator_code:org_t)
- 2014
- 2014
- Engelska.
-
Ingår i: Proceedings of ReConFigurable Computing and FPGAs (ReConFig), 2014 International Conference on. - 9781479959440 ; , s. 1-6
- Relaterad länk:
-
https://urn.kb.se/re...
-
visa fler...
-
https://doi.org/10.1...
-
visa färre...
Abstract
Ämnesord
Stäng
- The dependability of heterogeneous many-core FPGA based systems are threatened by higher failure rates caused by disruptive scales of integration, increased design complexity, and radiation sensitivity. Triple-modular redundancy (TMR) and run-time reconfiguration (RTR) are traditional fault-tolerant (FT) techniques used to increase dependability. However, hardware redundancy is expensive and most approaches have poor scalability, flexibility, and programmability. Therefore, innovative solutions are needed to reduce the redundancy cost but still preserve acceptable levels of dependability. In this context, this paper presents the implementation of a self-healing adaptive fault-tolerant SoC that reuses RTR IP-cores in order to self-assemble different TMR schemes during run-time. The presented system demonstrates the feasibility of the Upset-Fault-Observer concept, which provides a run-time self-test and recovery strategy that delivers fault-tolerance over functions accelerated in RTR cores, at the same time reducing the redundancy scalability cost by running periodic reconfigurable TMR scan-cycles. In addition, this paper experimentally evaluates the trade-off of the implemented reconfigurable TMR schemes by characterizing important fault tolerant metrics i.e., recovery time (self-repair and self-replicate), detection latency, self-assembly latency, throughput reduction, and increase of physical resources.
Ämnesord
- TEKNIK OCH TEKNOLOGIER -- Elektroteknik och elektronik -- Annan elektroteknik och elektronik (hsv//swe)
- ENGINEERING AND TECHNOLOGY -- Electrical Engineering, Electronic Engineering, Information Engineering -- Other Electrical Engineering, Electronic Engineering, Information Engineering (hsv//eng)
- TEKNIK OCH TEKNOLOGIER -- Elektroteknik och elektronik -- Inbäddad systemteknik (hsv//swe)
- ENGINEERING AND TECHNOLOGY -- Electrical Engineering, Electronic Engineering, Information Engineering -- Embedded Systems (hsv//eng)
- TEKNIK OCH TEKNOLOGIER -- Elektroteknik och elektronik -- Datorsystem (hsv//swe)
- ENGINEERING AND TECHNOLOGY -- Electrical Engineering, Electronic Engineering, Information Engineering -- Computer Systems (hsv//eng)
Nyckelord
- Fault tolerant systems
- Hardware
- Redundancy
- Software
- System-on-chip
- Self-healing
- Adaptive-computer-systems
- FPGA
- Partial and run-time reconfiguration
- Space applications
- Dependability
Publikations- och innehållstyp
- ref (ämneskategori)
- kon (ämneskategori)
Hitta via bibliotek
Till lärosätets databas