Item Details

Error Recovery in Critical Infrastructure Systems

Knight, John; Elder, Matthew; Du, Xing
Format
Report
Author
Knight, John
Elder, Matthew
Du, Xing
Abstract
Critical infrastructure applications provide services upon which society depends heavily; such applications require survivability in the face of faults that might cause a loss of service. These applications are themselves dependent on distributed information systems for all aspects of their operation and so survivability of the information systems is an important issue. Fault tolerance is a key mechanism by which survivability can be achieved in these information systems. Much of the literature on fault-tolerant distributed systems focuses on local error recovery by masking the effects of faults. We describe a direction for error recovery in the face of catastrophic faults, where the effects of the faults cannot be masked using available resources. The goal is to provide continued service that is either an alternate or degraded service by reconfiguring the system rather than masking faults. We outline the requirements for a reconfigurable system architecture and present an error recovery system that enables systematic structuring of error recovery specifications and implementations.
Language
English
Date Received
2012-10-29
Published
University of Virginia, Department of Computer Science, 1999
Published Date
1999
Collection
Libra Open Repository
In Copyright
