SwePub
Tyck till om SwePub Sök här!
Sök i SwePub databas

  Extended search

Träfflista för sökning "WFRF:(Singh Virendra) srt2:(2010-2014)"

Search: WFRF:(Singh Virendra) > (2010-2014)

  • Result 1-10 of 16
Sort/group result
   
EnumerationReferenceCoverFind
1.
  • Nikolov, Dimitar, et al. (author)
  • Estimating Error-Probability and Its Application for Optimizing Roll-back Recovery with Checkpointing
  • 2010
  • In: <em>Proceedings - 5th IEEE International Symposium on Electronic Design, Test and Applications, DELTA 2010</em>. - : IEEE. - 9781424460267 - 9780769539782 ; , s. 281-285
  • Conference paper (peer-reviewed)abstract
    • The probability for errors to occur in electronic systems is not known in advance, but depends on many factors including influence from the environment where the system operates. In this paper, it is demonstrated that inaccurate estimates of the error probability lead to loss of performance in a well known fault tolerance technique, Roll-back Recovery with checkpointing (RRC). To regain the lost performance, a method for estimating the error probability along with an adjustment technique are proposed. Using a simulator tool that has been developed to enable experimentation, the proposed method is evaluated and the results show that the proposed method provides useful estimates of the error probability leading to near-optimal performance of the RRC fault-tolerant technique.
  •  
2.
  • Nikolov, Dimitar, et al. (author)
  • Evaluation of Level of Confidence and Optimization of Roll-back Recovery with Checkpointing for Real-Time Systems
  • 2014
  • In: Microelectronics and Reliability. - : Elsevier BV. - 0026-2714. ; 54:5, s. 1022-1049
  • Journal article (peer-reviewed)abstract
    • Increasing soft error rates for semiconductor devices manufactured in later technologies enforce the usage of fault tolerant techniques such as Roll-back Recovery with Checkpointing (RRC). As RRC introduces time overhead that increases the completion (execution) time, time constraints (deadlines) might be violated. This is a drawback for a class of computer systems where the correct operation is defined not only by providing the correct outcome of an operation but also by ensuring that the deadlines are met. These computer systems are referred to as real-time systems (RTSs). In general RTSs are classified as soft and hard RTSs depending on the consequences of violating the deadlines. For soft RTSs, where consequences of violating the deadlines are not very severe, research have focused on optimizing RRC and shown that it is possible to find the optimal number of checkpoints such that the average execution time (AET) is minimal. While minimal AET is important for soft RTSs, it is more important to provide a high probability that deadlines are met for hard RTSs, where consequences of violating the deadlines may be catastrophic. Hence, there is a need of probabilistic guarantees that jobs employing RRC complete before a given deadline. Traditionally, AET analysis have been used for soft RTSs and worst case execution time (WCET) analysis along with schedule feasibility have been used for hard RTSs. In this paper we introduce a reliability metric, Level of Confidence (LoC), which is equally applicable to both soft and hard RTS. LoC is used as a metric to evaluate to what extent a deadline is met. The main contributions of this paper are as follows. First, we present a mathematical framework for the evaluation of LoC when RRC is employed. Second, we provide a proof to verify the correctness of the proposed expression. Third, in the context of hard RTSs, we provide a method to obtain the optimal number of checkpoints that maximizes the LoC. Fourth, in the context of soft RTSs where the maximal LoC may not be needed, but instead some LoC requirement is needed, we present an optimization method for RRC that finds the number of checkpoints that results in the minimal completion time while the minimal completion time satisfies a given LoC requirement. Fifth, we use the proposed framework to evaluate and compare probabilistic guarantees when RRC is optimized towards soft RTSs.
  •  
3.
  • Nikolov, Dimitar, et al. (author)
  • Level of Confidence Evaluation and Its Usage for Roll-back Recovery with Checkpointing Optimization
  • 2011
  • In: 5th Workshop on Dependable and Secure Nanocomputing (WSDN 2011), Hong Kong, June 27, 2011. - : IEEE. - 9781457703744
  • Conference paper (peer-reviewed)abstract
    • Increasing soft error rates for semiconductor devices manufactured in later technologies enforces the use of fault tolerant techniques such as Roll-back Recovery with Checkpointing (RRC). However, RRC introduces time overhead that increases the completion (execution) time. For non-real-time systems, research have focused on optimizing RRC and shown that it is possible to find the optimal number of checkpoints such that the average execution time is minimal. While minimal average execution time is important, it is for real-time systems important to provide a high probability that deadlines are met. Hence, there is a need of probabilistic guarantees that jobs employing RRC complete before a given deadline. First, we present a mathematical framework for the evaluation of level of confidence, the probability that a given deadline is met, when RRC is employed. Second, we present an optimization method for RRC that finds the number of checkpoints that results in the minimal completion time while the minimal completion time satisfies a given level of confidence requirement. Third, we use the proposed framework to evaluate probabilistic guarantees for RRC optimization in non-real-time systems.
  •  
4.
  • Nikolov, Dimitar, et al. (author)
  • Level of Confidence Study for Roll-back Recovery with Checkpointing
  • 2011
  • In: <em>The 11th Swedish System-on-Chip Conference, Varberg, Sweden, May 2-3, 2011</em>.
  • Conference paper (other academic/artistic)abstract
    • Increasing soft error rates for semiconductor devices manufactured in later technologies enforces the use of fault tolerant techniques such as Roll-back Recovery with Checkpointing (RRC). However, RRC introduces time overhead that increases the completion (execution) time. For non-real-time systems, research have focused on optimizing RRC and shown that it is possible to find the optimal number of checkpoints such that the average execution time is minimal. While minimal average execution time is important, it is for real-time systems important to provide a high probability of meeting given deadlines. Hence, there is a need of probabilistic guarantees that jobs employing RRC complete before a given deadline. Therefore, in this paper we present a mathematical framework for the evaluation of level of confidence, the probability that a given deadline is met, when RRC is employed.
  •  
5.
  •  
6.
  • Nikolov, Dimitar, et al. (author)
  • On-line Techniques to Adjust and Optimize Checkpointing Frequency
  • 2010
  • In: <em>IEEE International Workshop on Realiability Aware System Design and Test (RASDAT 2010), Bangalore, India, January 7-8, 2010</em>. ; , s. 29-33
  • Conference paper (peer-reviewed)abstract
    • Due to increased susceptibility to soft errors in recent semiconductor technologies, techniques for detecting and recovering from errors are required. Roll-back Recovery with Checkpointing (RRC) is one well known technique that copes with soft errors by taking and storing checkpoints during execution of a job. Employing this technique, increases the average execution time (AET), i.e. the expected time for a job to complete, and thus impacts performance. To minimize the AET, the checkpointing frequency is to be optimized. However, it has been shown that optimal checkpointing frequency depends highly on error probability. Since error probability cannot be known in advance and can change during time, the optimal checkpointing frequency cannot be known at design time. In this paper we present techniques that are adjusting the checkpointing frequency on-line (during operation) with the goal to reduce the AET of a job. A set of experiments have been performed to demonstrate the benefits of the proposed techniques. The results have shown that these techniques adjust the checkpointing frequency so well that the resulting AET is close to the theoretical optimum.
  •  
7.
  • Nikolov, Dimitar, et al. (author)
  • Optimizing Fault Tolerance for Multi-Processor System-on-Chip
  • 2010
  • In: Design and Test Technology for Dependable Systems-on-chip. - : Information Science Publishing. - 1609602129 - 9781609602123 ; , s. 578-
  • Book chapter (other academic/artistic)abstract
    • Designing reliable and dependable embedded systems has become increasingly important as the failure of these systems in an automotive, aerospace or nuclear application can have serious consequences.Design and Test Technology for Dependable Systems-on-Chip covers aspects of system design and efficient modelling, and also introduces various fault models and fault mechanisms associated with digital circuits integrated into System on Chip (SoC), Multi-Processor System-on Chip (MPSoC) or Network on Chip (NoC). This book provides insight into refined "classical" design and test topics and solutions for IC test technology and fault-tolerant systems.
  •  
8.
  • Nikolov, Dimitar, et al. (author)
  • Study on the Level of Confidence for Roll-back Recovery with Checkpointing
  • 2011
  • In: 1st Intl. Workshop on Dependability Issues in Deep-submicron Technologies (DDT 2011), Trondheim, Norway, May 26-27, 2011.
  • Conference paper (peer-reviewed)abstract
    • Increasing soft error rates for semiconductor devices manufactured in later technologies enforces the use of fault tolerant techniques such as Roll-back Recovery with Checkpointing (RRC). However, RRC introduces time overhead that increases the completion (execution) time. For non-real-time systems, research have focused on optimizing RRC and shown that it is possible to find the optimal number of checkpoints such that the average execution time is minimal. While minimal average execution time is important, it is for real-time systems important to provide a high probability of meeting given deadlines. Hence, there is a need of probabilistic guarantees that jobs employing RRC complete before a given deadline. Therefore, in this paper we present a mathematical framework for the evaluation of level of confidence, the probability that a given deadline is met, when RRC is employed.
  •  
9.
  • Subramanyan, Pramod, et al. (author)
  • Adaptive Execution Assistance for Multiplexed Fault-Tolerant Chip Multiprocessors
  • 2011
  • In: Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors. - 9781457719530 ; , s. 419-426
  • Conference paper (peer-reviewed)abstract
    • Relentless scaling of CMOS fabrication technology has made contemporary integrated circuits increasingly susceptible to transient faults, wearout-related permanent faults, intermittent faults and process variations. Therefore, mechanisms to mitigate the effects of decreased reliability are expected to become essential components of future general-purpose microprocessors. In this paper, we introduce a new throughput-efficient architecture for multiplexed fault-tolerant chip multiprocessors (CMPs). Our proposal relies on the new technique of adaptive execution assistance, which dynamically varies instruction outcomes forwarded from the leading core to the trailing core based on measures of trailing core performance. We identify policies and design low overhead hardware mechanisms to achieve this. Our work also introduces a new priority-based thread-scheduling algorithm for multiplexed architectures that improves multiplexed fault tolerant CMP throughput by prioritizing stalled threads. Through simulation-based evaluation, we And that our proposal delivers 17.2% higher throughput than perfect dual modular redundant (DMR) execution and outperforms previous proposals for throughput-efficient CMP architectures.
  •  
10.
  • Subramanyan, Pramod, et al. (author)
  • Energy-Efficient Fault Tolerance in Chip Multiprocessors Using Critical Value Forwarding
  • 2010
  • In: <em>The 40th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'10), Fairmont Chicago, Millennium Park, Chicago, Illinois, USA, June 28-July 1, 2010.</em>. - 9781424474998 - 9781424475001 ; , s. 121-130
  • Conference paper (peer-reviewed)abstract
    • Relentless CMOS scaling coupled with lower design tolerances is making ICs increasingly susceptible to wear-out related permanent faults and transient faults, necessitating on-chip fault tolerance in future chip microprocessors (CMPs). In this paper we introduce a new energy-efficient fault-tolerant CMP architecture known as Redundant Execution using Critical Value Forwarding (RECVF). RECVF is based on two observations: (i) forwarding critical instruction results from the leading to the trailing core enables the latter to execute faster, and (ii) this speedup can be exploited to reduce energy consumption by operating the trailing core at a lower voltage-frequency level. Our evaluation shows that RECVF consumes 37% less energy than conventional dual modular redundant (DMR) execution of a program. It consumes only 1.26 times the energy of a nonfault- tolerant baseline and has a performance overhead of just 1.2%.
  •  
Skapa referenser, mejla, bekava och länka
  • Result 1-10 of 16

Kungliga biblioteket hanterar dina personuppgifter i enlighet med EU:s dataskyddsförordning (2018), GDPR. Läs mer om hur det funkar här.
Så här hanterar KB dina uppgifter vid användning av denna tjänst.

 
pil uppåt Close

Copy and save the link in order to return to this view