SwePub

onr:"swepub:oai:DiVA.org:kth-237109"
 


A Design of Autonomous Error-Tolerant Architectures for Massively Parallel Computing

Liu, Lizheng (author)
Fudan Univ, State Key Lab ASIC & Syst, Shanghai 200433, Peoples R China.
Jin, Yi (author)
Fudan Univ, State Key Lab ASIC & Syst, Shanghai 200433, Peoples R China.
Liu, Yi (author)
Fudan Univ, State Key Lab ASIC & Syst, Shanghai 200433, Peoples R China.
Ma, Ning (author)
KTH, School of Information and Communication Technology (ICT)
Huan, Yuxiang (author)
KTH, School of Information and Communication Technology (ICT)
Zou, Zhuo (author)
Fudan Univ, State Key Lab ASIC & Syst, Shanghai 200433, Peoples R China.
Zheng, Lirong (author)
Fudan Univ, State Key Lab ASIC & Syst, Shanghai 200433, Peoples R China; KTH Royal Inst Technol, Sch Informat & Commun Technol, S-16440 Kista, Sweden.
Institute of Electrical and Electronics Engineers (IEEE), 2018
English.
In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems. - : Institute of Electrical and Electronics Engineers (IEEE). - 1063-8210 .- 1557-9999. ; 26:10, pp. 2143-2154
  • Journal article (peer-reviewed)
Abstract
  • Massively parallel computing systems, in which many processors are connected on chip, are becoming increasingly complex and unreliable. This paper presents an error-tolerant design based on the autonomous error-tolerant (AET) architecture, which aims to provide a self-repairing capability. A nearby error-sensing mechanism is designed to discover faults, and an active evolution scheme is studied to handle unrecoverable errors. A circuit backup switching mechanism is proposed to bypass failed nodes. A board-level prototype is implemented using dual-core embedded processors. The analysis shows that the error-tolerant capability of the proposed architecture is better than that of a conventional multimodular redundant system when the failure rate of a single core is less than 0.7. The error-tolerant capability is verified in an AET test system consisting of 16 processors. The results show that the relative variation in the overall performance of the AET system is not affected by the system's high reliability requirements. Experimental comparison shows that, when the AET architecture and the triple modular redundancy (TMR) method provide essentially the same reliability, the AET architecture consumes less power, for both logical-level and physical-level error tolerance.
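
For context on the conventional baseline mentioned in the abstract, the following is a minimal sketch of the textbook reliability of an N-modular redundant system with majority voting. It does not model the AET architecture, whose reliability formula is not given in this record. It assumes independent core failures and a perfect voter, and it reads "failure rate" as a per-core failure probability; the function name `nmr_reliability` is hypothetical.

```python
from math import comb


def nmr_reliability(core_reliability: float, modules: int = 3) -> float:
    """Probability that a majority of `modules` independent cores work.

    Assumptions (not taken from the paper): failures are independent,
    the voter never fails, and `modules` is odd.
    """
    if modules < 1 or modules % 2 == 0:
        raise ValueError("modules must be a positive odd number")
    if not 0.0 <= core_reliability <= 1.0:
        raise ValueError("core_reliability must lie in [0, 1]")

    r = core_reliability
    majority = modules // 2 + 1
    # Sum the binomial probabilities of k working cores, for k >= majority.
    return sum(
        comb(modules, k) * r**k * (1.0 - r) ** (modules - k)
        for k in range(majority, modules + 1)
    )


if __name__ == "__main__":
    # Sweep a few per-core failure probabilities, including the 0.7
    # threshold quoted in the abstract.
    for failure_probability in (0.1, 0.3, 0.5, 0.7):
        tmr = nmr_reliability(1.0 - failure_probability, modules=3)
        print(f"failure probability {failure_probability:.1f}: TMR reliability {tmr:.3f}")
```

For three modules the sum reduces to the familiar 3R^2 - 2R^3, which falls below the reliability R of a single core once R drops under 0.5.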

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Engineering (hsv//eng)

Keywords

Error tolerant
nanosystem
self-reparation
sensing

Publication and content type

ref (subject category)
art (subject category)
