We recently conducted a training on Functional Safety Software. We started discussing Software Safety Requirements (SSRs), stating something like “SSRs are derived from the Technical Safety Requirements (TSRs). Start by looking at the TSRs that are assigned to SW.” Immediately, one of the trainees asked, “Oh, so all we need to do is filter out the TSRs for SW, put them into a new document, name this document SSR, and that’s it, we are done?” No marks for guessing that we shouted a loud and clear “NOOOO!!!”
This is the subject of this blog. Once you have the TSR, how should you derive the SSR? We will tell you the actionable steps that you can take and also give you an example of how we have derived SSRs from TSRs.
First, let’s look at the actionable steps in the process of deriving the SSRs:
- Read and understand the TSRs assigned to SW. What is the Software supposed to do? Is this clearly specified? Before starting on the SSRs, ensure that the SW requirements at TSR level are
- Achievable in SW and
- Discuss how each of you understands the requirement
- Discuss how to test the requirement, even if you won’t be the one testing it
- Break down each TSR until you get atomic SW requirements. Note: ISO 26262 does not state that requirements must be atomic, but in our opinion it is definitely a best practice.
- Translate/rewrite (if required) each TSR into clear, actionable SW requirements.
- Add the SW-specific details that are relevant and that the component needs for its implementation.
- Make sure again that the SSRs are
- Achievable and
Now, let’s take an example. For convenience (and ease of explanation!), we have picked a really simple one.
TSR 1: “Safety relevant RAM shall be protected by double buffering”
TSR 2: “If double buffering detects a corruption of Safety RAM, a reset shall be triggered”
TSR 3: “FTT time for tolerating the fault in Safety data shall be 500ms”
Now, if you are in the SW team that is going to implement these TSRs, what aspects would you think about?
Here are some questions you should be asking.
“There are several Safety components using RAM. Do we have to implement the double buffering in every Safety component separately, or should we have a single “centralized” Safety component that protects all the data?”
“When should double buffering detect a corruption? Upon every fetch of a RAM variable? Or should it check all buffers cyclically? If cyclically, how often should it check?”
“Should a reset be triggered as soon as 1 instance of corruption is detected, or should it have happened multiple times?”
“Should both buffers hold the data as it is, or should one of the buffers hold an inverted copy of the data? If it is inverted, should it be the 1s complement or the 2s complement?”
“How should the FTT of 500ms be achieved in Software by design?”
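The inversion question has a practical side that a few lines of C can make concrete. The sketch below is only an illustration, and all helper names (`mirror_ones`, `mirror_twos`, `pair_ok_ones`, `pair_ok_twos`, `complement_demo`) are hypothetical. It shows that the 2s complement of 0 is again 0, so a pair of all-zero bytes would pass a 2s complement check for the value 0, whereas a 1s complement check rejects that pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* 1s complement: flip every bit. */
uint8_t mirror_ones(uint8_t value)
{
    return (uint8_t)~value;
}

/* 2s complement: arithmetic negation modulo 256. */
uint8_t mirror_twos(uint8_t value)
{
    return (uint8_t)(0u - value);
}

/* True if 'mirror' is the 1s complement of 'value'. */
bool pair_ok_ones(uint8_t value, uint8_t mirror)
{
    return mirror == mirror_ones(value);
}

/* True if 'mirror' is the 2s complement of 'value'. */
bool pair_ok_twos(uint8_t value, uint8_t mirror)
{
    return mirror == mirror_twos(value);
}

void complement_demo(void)
{
    /* For a typical value the two schemes simply differ by one. */
    assert(mirror_ones(0x5Au) == 0xA5u);
    assert(mirror_twos(0x5Au) == 0xA6u);

    /* For zero they behave very differently. */
    assert(mirror_ones(0x00u) == 0xFFu);
    assert(mirror_twos(0x00u) == 0x00u);

    /* A pair of all-zero bytes: rejected by the 1s complement
     * check, accepted by the 2s complement check. */
    assert(!pair_ok_ones(0x00u, 0x00u));
    assert(pair_ok_twos(0x00u, 0x00u));
}
```

This is one argument the team might weigh when answering the question. It is not the only one, and the right choice depends on the fault model of the actual hardware.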
Now you brainstorm these questions with the SW team. Let’s assume you arrive at the following conclusions:
- There will be a single “centralized” safety component protecting all the Safety data.
- One buffer holds the original data, and the other buffer holds an inverted copy. This inverted copy is the 1s complement of the original data.
- The Safety component checks the integrity of the RAM data every time that data is accessed.
- Upon detecting even a single instance of data corruption, a reset has to be triggered.
- In order to achieve the 500ms FTT, the check of every Safety data item and the resulting reset must be completed within 500ms. This means that the time taken to check the Safety data, the time taken to process the reset, and the latency of the involved tasks must all be taken into account.
Now, based on these conclusions, you can write the SSR.
TSR 1: Safety relevant RAM shall be protected by double buffering
SSR 1: SW shall store the Safety RAM variables of all ASIL components at two different memory addresses.
SSR 2: SW shall store the original RAM data in one memory location and the inverted (1s complement) data in the other memory location.
SSR 3: Upon every fetch of any data in the Safety RAM, SW shall check the integrity of the data by comparing the two memory locations.
SSR 4: Upon every write to any data in the Safety RAM, SW shall calculate the inverted value and update it in the redundant memory location.
TSR 2: If double buffering detects a corruption of Safety RAM, a reset shall be triggered
SSR 5: If a mismatch between the original and inverted data is detected, SW shall trigger a reset.
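To see how SSR 1 to SSR 5 could look in code, here is a minimal sketch in C. It is not a production implementation. Everything in it is an assumption made for illustration: the names (`safety_primary`, `safety_mirror`, `safety_write`, `safety_read`, `safety_trigger_reset`), the 32-bit data width, and the number of variables. In particular, `safety_trigger_reset` only counts requests so that the sketch can run on a PC. On a real target it would invoke the reset mechanism of the microcontroller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SAFETY_VAR_COUNT 4u /* hypothetical number of protected variables */

/* SSR 1: every variable lives at two different addresses.
 * Assumption: a real project would likely place the two arrays in
 * separate memory regions; plain arrays are used here for simplicity. */
uint32_t safety_primary[SAFETY_VAR_COUNT];
uint32_t safety_mirror[SAFETY_VAR_COUNT];

/* Stand-in for the real reset: it only counts the requests. */
unsigned safety_reset_requests = 0u;

void safety_trigger_reset(void)
{
    safety_reset_requests++;
}

/* SSR 2 + SSR 4: store the value and its 1s complement on every write. */
void safety_write(size_t index, uint32_t value)
{
    assert(index < SAFETY_VAR_COUNT);
    safety_primary[index] = value;
    safety_mirror[index] = (uint32_t)~value;
}

/* Bring every pair into a consistent state (value 0, mirror all ones). */
void safety_init(void)
{
    for (size_t i = 0u; i < SAFETY_VAR_COUNT; i++) {
        safety_write(i, 0u);
    }
}

/* SSR 3 + SSR 5: compare both copies on every fetch, request a reset
 * on the first mismatch. A consistent pair XORs to all ones. */
uint32_t safety_read(size_t index)
{
    assert(index < SAFETY_VAR_COUNT);
    uint32_t value = safety_primary[index];
    if ((value ^ safety_mirror[index]) != UINT32_MAX) {
        safety_trigger_reset();
    }
    return value;
}
```

Because all accesses go through `safety_write` and `safety_read`, this also reflects the design decision of a single “centralized” safety component. A caller would use it roughly as `safety_init(); safety_write(1u, speed); ... speed = safety_read(1u);`.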
TSR 3: FTT time for tolerating the fault in Safety data shall be 500ms
SSR 6: SW shall ensure that the check of every Safety relevant data item and the resulting reset are completed within 500ms.
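SSR 6 is in the end a budget calculation: the contributors named in the conclusions (task latency, check time, reset time) have to add up to no more than 500ms. The following C sketch shows that arithmetic. The structure, the function names, and any numbers you feed into it are hypothetical. Real values would have to come from timing analysis or measurement on the target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FTT_BUDGET_MS 500u /* from TSR 3 */

/* Hypothetical worst-case contributors, all in milliseconds. */
typedef struct {
    uint32_t task_latency_ms; /* delay until the involved tasks run  */
    uint32_t check_time_ms;   /* time to check the Safety data       */
    uint32_t reset_time_ms;   /* time to process the resulting reset */
} ftt_contributors_t;

/* Worst-case time from fault to completed reset: the plain sum. */
uint32_t ftt_worst_case_ms(const ftt_contributors_t *c)
{
    assert(c != NULL);
    return c->task_latency_ms + c->check_time_ms + c->reset_time_ms;
}

/* True if the worst case fits into the 500ms budget. */
bool ftt_budget_met(const ftt_contributors_t *c)
{
    return ftt_worst_case_ms(c) <= FTT_BUDGET_MS;
}
```

With made-up values of 10ms latency, 2ms check time and 150ms reset time, the sum is 162ms and the budget is met. With 400ms latency, 20ms check time and 150ms reset time, the sum is 570ms and the design would have to change.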
That’s it! Now, as you may have observed, our SSRs are
- Action-oriented: they clearly specify what the SW should do and when
- Enriched with SW-specific details (e.g., the inverted copy, the 1s complement, the outcomes of design decisions) that are not known at TSR level
- Atomic, in the sense that no requirement can be partially implemented: each one is either implemented or not
The approach we have described is a generic one. You can use it not only to develop SSRs but to derive any low-level requirements document from a parent requirements document, irrespective of whether it is Safety relevant or SW relevant. If you have any comments or questions on this one, please feel free to share them with us!