128 How Time-Related Failures Affect the Software System (PSAM-0200)
Published: 2006
Probabilistic Risk Assessment (PRA) is a well-established technique for assessing the probability of failure or success of a system. It integrates several reliability modeling techniques, such as fault trees, event trees, and failure mode and effects analysis (FMEA), to quantify risk. PRA has proven to be a key tool in safety management. However, classical PRA does not account for the risk of system failure due to software failure, even though software plays an increasingly important role in modern safety-critical systems. To integrate software into classical PRA, a software failure-mode taxonomy has been established. The taxonomy categorizes possible software failures into input failures, output failures, support failures, and functional failures, and each failure mode can be further divided into lower-level failure modes. The focus of this paper is on input failures.
Input failures are caused by incorrect inputs that reach the software function, causing it to fail. An input may be incorrect either in the value domain (e.g., incorrect value, range, type or amount) or in the time domain (e.g., incorrect arrival time, duration or rate). Software failures due to incorrect inputs in the time domain (i.e., time-related input failures) are very common in software-based systems, especially in real-time systems.
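The time-domain categories named above (arrival time, duration, rate) can be illustrated with a minimal Python sketch. The `TimingSpec` fields, the `classify_time_failures` function, and the label strings are hypothetical names chosen for illustration; they are not taken from the paper's taxonomy, which may subdivide these modes differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TimingSpec:
    """Hypothetical timing requirements for one input (seconds, or 1/s for rate)."""
    earliest_arrival: float
    latest_arrival: float
    min_duration: float
    max_duration: float
    max_rate: float  # maximum allowed arrivals per second


def classify_time_failures(spec: TimingSpec, arrival: float,
                           duration: float, rate: float) -> List[str]:
    """Return the time-domain input failure modes an observed input exhibits."""
    failures = []
    # Arrival time: the input shows up before or after its allowed window.
    if arrival < spec.earliest_arrival:
        failures.append("arrival: too early")
    elif arrival > spec.latest_arrival:
        failures.append("arrival: too late")
    # Duration: the input is held for too short or too long a time.
    if not (spec.min_duration <= duration <= spec.max_duration):
        failures.append("duration: out of bounds")
    # Rate: inputs arrive faster than the software is specified to accept.
    if rate > spec.max_rate:
        failures.append("rate: too high")
    return failures


spec = TimingSpec(earliest_arrival=0.0, latest_arrival=1.0,
                  min_duration=0.1, max_duration=0.5, max_rate=10.0)
print(classify_time_failures(spec, arrival=1.5, duration=0.2, rate=20.0))
```

An input can exhibit several time-domain failure modes at once, which is why the sketch returns a list rather than a single label.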
When a software fault (or failure) occurs, it may or may not cause a system failure. The process by which a fault travels from the location where it arises to the system output is called fault propagation. Fault propagation analysis is an important field in software engineering. The propagation probability is directly related to software reliability and, consequently, to system reliability. Propagation is an intrinsic characteristic of the function that represents the remaining computation from the location where the fault occurs to the output.
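One generic way to make the notion of propagation probability concrete is fault injection with Monte Carlo sampling: run the remaining computation on a correct input and on a perturbed copy, and count how often the outputs differ. The Python sketch below is an illustration under stated assumptions, not the set of methods proposed in this paper. The toy `remaining` function (a 100 ms deadline check), the input distribution, and the injected delay are all hypothetical.

```python
import random
from typing import Callable


def estimate_propagation_probability(
    remaining: Callable[[float], object],
    sample_correct: Callable[[random.Random], float],
    inject_fault: Callable[[float, random.Random], float],
    trials: int = 10_000,
    seed: int = 0,
) -> float:
    """Fraction of injected faults that change the output of `remaining`."""
    rng = random.Random(seed)
    propagated = 0
    for _ in range(trials):
        good = sample_correct(rng)      # fault-free input
        bad = inject_fault(good, rng)   # same input with a fault injected
        if remaining(bad) != remaining(good):
            propagated += 1             # the fault reached the output
    return propagated / trials


# Toy remaining computation (assumption): accept an input only if it
# arrives within a 100 ms deadline.
def remaining(arrival_delay: float) -> bool:
    return arrival_delay <= 0.100


# Assumed operational profile: correct delays are uniform on [0, 80] ms.
def sample_correct(rng: random.Random) -> float:
    return rng.uniform(0.0, 0.080)


# Assumed time-related fault: an extra delay uniform on [0, 50] ms.
def inject_fault(delay: float, rng: random.Random) -> float:
    return delay + rng.uniform(0.0, 0.050)


p = estimate_propagation_probability(remaining, sample_correct, inject_fault)
print(f"estimated propagation probability: {p:.3f}")
```

For this toy setup the exact value can be computed by hand as 0.1125, so the estimate should land close to that. Most injected delays are masked because the input still meets the deadline, which shows how the propagation probability depends on the function downstream of the fault rather than on the fault alone.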
In this paper, a set of methods is suggested to estimate the propagation probability of the different manifestations of the time-related input failure modes. This set of methods can be used to integrate software risks into PRA methodologies.