Software Safety Analysis (SSA) is a verification activity performed on the Software to show that it achieves all the Safety goals and provides the required Independence and freedom from interference.
We often see that Software teams are unsure how to perform this analysis and which parts of the Software to focus on. In many cases, the activity is also started at the tail end of the project, when the safety software development is nearly complete, and is treated as paperwork to be completed.
In this article, we present an 8-step method for performing a meaningful analysis.
Before applying the method, here are some aspects to keep in mind:
Basis and stage of the analysis
SSA must be performed during the Architecture stage, based on the static and dynamic aspects of the Architecture. This provides the opportunity to correct weaknesses at an early stage of the software development. Static aspects include interfaces, variables and their data types. Dynamic aspects include the behavior of the component, control flow, data processing sequence, state transition behavior, etc.
We also recommend that SSA is repeated at the unit design and software implementation stages to cover the additional details that are introduced in these stages. For example, potential faults in the detailed run-time behavior of a component, or other systematic faults such as a safety violation due to a compiler setting or macro behavior, can be detected at these stages. Please note that the standard does not expect an SSA at the unit design and coding stages; it is our recommendation, based on the benefits we have seen from doing this.
Scope of the analysis
Typically, SSA is performed only for the SW developed in-house and not for ASIL COTS. This is because the COTS supplier usually performs the SSA and provides a safety manual containing the resulting requirements that the integrator of the COTS must fulfill.
Let's now get to the method and its 8 steps:
Step 1: Identify "classes" of elements that can fail.
Classify various types of elements in software as "classes". These include:
- Inputs from QM or lower ASIL components. e.g., data received via an interface
- Shared data. e.g., global variables updated by both QM and ASIL SW (though it is best to completely avoid this in design)
- Outputs to QM or lower ASIL components
- Function Interfaces from/to QM components
- Internal Data. e.g., static variables and local variables
- Executables/Runnables of an ASIL component
- Shared resources. e.g., a library or a HW resource
Step 2: Identify "guide words" for each of these "classes" to derive fault models.
Guide words are simple words that are interpreted according to the context in which they are applied. For the SSA, guide words are applied to the "classes" that were identified in the previous step. Examples of guide words from the standard are "before/early", "after/late", "no/missing", "more", "less" and "other than".
Consider the following usage as an example:
Class: "Inputs from QM or lower ASIL components. e.g., data received via an interface"
- "before/early" -> data is received too early
- "after/late" -> data is received too late
- "no/missing" -> data is not received at all, or missing for a certain time
- "more" -> data received is above the permitted range
- "less" -> data received is below the permitted range
- "other than" -> data received is other than a certain expected value
Predetermine the guide words for each of the "classes". In some cases, a guide word may make sense for one class but not for another. For example, "early" may make sense for data received from another component, but may not make sense for an internal variable.
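The pairing of "classes" and guide words can be kept as a small lookup table so that the same fault models are applied consistently across the analysis. The following Python sketch is only an illustration: the names GUIDE_WORDS_BY_CLASS, FAULT_TEMPLATES and fault_models, the "wheel speed signal" element, and the choice of guide words for internal data are all assumptions made for this example, not something prescribed by the standard.

```python
# Hypothetical lookup table: which guide words apply to which "class".
# The selection per class is an illustrative assumption.
GUIDE_WORDS_BY_CLASS = {
    "input from QM or lower ASIL component": [
        "before/early", "after/late", "no/missing", "more", "less", "other than",
    ],
    # Timing guide words are left out here because they may not make
    # sense for an internal variable.
    "internal data": ["more", "less", "other than"],
}

# Phrase templates that turn a guide word into a fault-model description.
FAULT_TEMPLATES = {
    "before/early": "{element} is received too early",
    "after/late": "{element} is received too late",
    "no/missing": "{element} is not received at all, or is missing for a certain time",
    "more": "{element} is above the permitted range",
    "less": "{element} is below the permitted range",
    "other than": "{element} is other than the expected value",
}


def fault_models(element_class, element):
    """Return one fault-model description per guide word applicable to the class."""
    return [
        FAULT_TEMPLATES[word].format(element=element)
        for word in GUIDE_WORDS_BY_CLASS[element_class]
    ]


# Example: derive the fault models for one (hypothetical) input signal.
for description in fault_models(
    "input from QM or lower ASIL component", "wheel speed signal"
):
    print(description)
```

Keeping the table in one place makes it easy to review which guide words were deliberately excluded for a class, and why.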
Step 3: Lay out the complete model of the SW Safety Architecture.
The prerequisite for this step is the Architecture model/representation of how the ASIL software works.
Consider various representations to cover all static and dynamic aspects of the architecture such as:
- All the functional chains related to Safety (function flow from input to output)
- Component/Deployment diagram for Safety related components
- Processing chains related to, or implemented in, Safety components
Step 4: Perform an analysis on the Architecture based on the classes and guide words.
Go sequentially through the diagram, looking at each and every step. Think about which classes are applicable at each step and apply the relevant guide words for those classes. Repeat the analysis for every predetermined guide word. This will ensure that the analysis is systematic and complete.
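One way to keep the walk-through systematic is to generate the empty analysis worksheet up front, with one row per step, class and guide word, and then fill in each row during the analysis. The sketch below assumes a hypothetical three-step functional chain; the step names, the class labels and the guide words assigned to each class are illustrative assumptions only.

```python
# Hypothetical functional chain: each step lists the "classes" that apply to it.
CHAIN = [
    ("read sensor input", ["input"]),
    ("compute torque limit", ["internal data", "runnable"]),
    ("write actuator command", ["output"]),
]

# Predetermined guide words per class (illustrative selection).
GUIDE_WORDS = {
    "input": ["before/early", "after/late", "no/missing", "more", "less", "other than"],
    "internal data": ["more", "less", "other than"],
    "runnable": ["before/early", "after/late", "no/missing"],
    "output": ["before/early", "after/late", "no/missing", "more", "less", "other than"],
}


def build_worksheet(chain, guide_words):
    """Create one empty worksheet row per (step, class, guide word) combination."""
    rows = []
    for step, classes in chain:
        for element_class in classes:
            for word in guide_words[element_class]:
                rows.append({
                    "step": step,
                    "class": element_class,
                    "guide_word": word,
                    # Filled in by the analyst during the analysis.
                    "fault": "",
                    "error": "",
                    "failure": "",
                })
    return rows


worksheet = build_worksheet(CHAIN, GUIDE_WORDS)
print(len(worksheet), "rows to analyze")
```

Because the rows are generated rather than written by hand, a combination can only be missing if it was deliberately left out of the class or guide word tables.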
Step 5: Describe the fault, error and failure.
Describe exactly what the fault is and why it may occur. Describe the resulting error. Describe whether the error leads to a failure, that is, to the violation of a Safety requirement.
Step 6: Verify the quality of the predetermined Safety mechanisms/measures to protect against the failure.
If the fault leads to a safety failure, verify whether the safety mechanisms or development-time measures that were already designed at the system level or architecture level can detect or protect against this failure. If they cannot, we have identified a weakness.
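The coverage check can be expressed as a simple rule on each analysis record: a record is a weakness when the failure violates a safety requirement and no adequate mechanism or measure is assigned to it. The following sketch is a minimal illustration; the Finding record, its field names and the three example entries are hypothetical, and the judgment of whether a mechanism is adequate still has to be made by the analyst before it is listed.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """One analyzed row: fault, resulting error, and its safety impact."""
    fault: str
    error: str
    violates_safety_requirement: bool
    # Existing mechanisms/measures that the analyst judged adequate for this failure.
    mechanisms: list = field(default_factory=list)

    def is_weakness(self):
        """A safety-relevant failure with no adequate mechanism is a weakness."""
        return self.violates_safety_requirement and not self.mechanisms


# Hypothetical example entries.
findings = [
    Finding("input data received too late", "stale value used in calculation",
            True, ["timeout monitoring"]),
    Finding("internal variable above permitted range", "output exceeds limit",
            True),
    Finding("diagnostic text truncated", "log entry incomplete",
            False),
]

weaknesses = [f for f in findings if f.is_weakness()]
for finding in weaknesses:
    print("Weakness:", finding.fault)
```

In this example only the second entry is reported, because the first is covered by a mechanism and the third does not affect a safety requirement.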
Step 7: Identify and resolve weaknesses.
Identify what must be implemented to close the weakness. Think about where the weakness came from: was it due to a missing system level analysis, or a missing aspect in the SW architecture? Do we need a new mechanism to be implemented in the SW to cover this? Or is a development-time measure, such as a review or a special test, sufficient to cover this? If needed, cover the weakness by defining an additional requirement in the corresponding work product (system or software requirements) and link it all the way down to the implementation and testing stages.
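The "link it all the way down" part can be checked mechanically once the additional requirement exists. The sketch below assumes a hypothetical requirement record and a hypothetical set of downstream stages (REQUIRED_LINKS); the identifiers such as "SW-SAF-042" and "ARCH-12" are invented for the example, and a real project would take the stages and links from its own requirements management tool.

```python
from dataclasses import dataclass, field

# Assumed downstream stages that a derived safety requirement must be linked to.
REQUIRED_LINKS = ("architecture", "unit_design", "implementation", "test")


@dataclass
class DerivedRequirement:
    """A requirement added to close a weakness found by the SSA."""
    req_id: str
    text: str
    level: str  # "system" or "software"
    links: dict = field(default_factory=dict)  # stage name -> linked artifact


def missing_links(requirement):
    """Return the downstream stages that have no linked artifact yet."""
    return [stage for stage in REQUIRED_LINKS if not requirement.links.get(stage)]


# Hypothetical requirement derived from a weakness; IDs are invented.
requirement = DerivedRequirement(
    req_id="SW-SAF-042",
    text="The torque limit shall be range-checked before it is written to the output.",
    level="software",
    links={"architecture": "ARCH-12", "implementation": "torque_limit.c"},
)

print(missing_links(requirement))
```

A weakness would only be considered resolved once this list is empty for every requirement derived from it.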
Step 8: Repeat the analysis during the SW development.
Repeat the analysis for additional "classes" of faults that are identified in the unit design and unit implementation stages.
SSA requires additional planning and a significant investment of time during the Safety development. Cutting this activity short, or doing it at a very late stage, can prove to be extremely costly and could mean an "inadequate safety" quality for the product.