Fault-tolerant real-time tasks scheduling with dynamic fault handling

  • Gang Chen
  • Nan Guan
  • Kai Huang*
  • Wang Yi

* Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

24 Scopus citations

Abstract

Predictable performance when coping with transient failures is of paramount importance in safety-critical real-time systems. Various software fault-tolerance techniques are employed towards this goal, among which check-pointing is a relatively cost-effective scheme. In this paper, we propose an efficient fault-tolerant scheduling framework with a run-time fault-handling protocol, in which criticality levels are inserted adaptively for fault handling according to the run-time fault workload. In contrast to prior work, which applies a task re-execution strategy, the proposed framework adaptively determines on-demand re-executions only of the faulty checkpoint segments rather than of the whole job. To this end, a unified overrun handling protocol is developed that handles fault recovery adaptively and avoids over-provisioning of resources. In addition, we develop an off-line schedulability analysis technique for the proposed scheduling algorithm. Simulation results show that our fault-tolerant scheduling framework achieves up to 81% improvement in supporting low-criticality service, without sacrificing mixed-criticality (MC) schedulability, compared with existing techniques.
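To illustrate why re-executing only faulty checkpoint segments can need less processor time than re-executing the whole job, the following is a minimal sketch under a simplified textbook model. It is not the analysis from the paper. The function names, the equal-length segments, the fixed per-segment checkpoint overhead, and the fault bound are all assumptions made for illustration.

```python
def whole_job_reexecution(wcet: float, faults: int) -> float:
    """Worst-case demand when every fault forces the whole job to rerun.

    Assumption: each fault is detected at the end of the job, so each of
    the `faults` faults costs one additional full execution.
    """
    return (faults + 1) * wcet


def segment_reexecution(wcet: float, faults: int,
                        segments: int, overhead: float) -> float:
    """Worst-case demand when only the faulty checkpoint segment reruns.

    Assumptions: the job is split into `segments` equal parts, each part
    pays a fixed checkpoint `overhead`, and each fault costs one extra
    execution of a single segment (including its checkpoint).
    """
    segment_cost = wcet / segments + overhead
    return segments * segment_cost + faults * segment_cost


if __name__ == "__main__":
    # Hypothetical job: WCET of 10 time units, at most 2 transient faults.
    print(whole_job_reexecution(10.0, 2))          # 30.0
    # Same job with 5 checkpoint segments and 0.5 units overhead each.
    print(segment_reexecution(10.0, 2, 5, 0.5))    # 17.5
```

Under these assumed numbers the segment-level scheme reserves 17.5 time units instead of 30. The gap depends on the number of segments and the checkpoint overhead, since more segments shrink the cost of each recovery but add overhead to every fault-free run.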

Original language: English
Article number: 101688
Journal: Journal of Systems Architecture
Volume: 102
State: Published - Jan 2020
Externally published: Yes

Keywords

  • Check-pointing
  • Fault-tolerant scheduling
  • Run-time fault handling
  • Safety-critical real-time system
