Applying HAZOP to Software Engineering Models
PETER FENELON, High Integrity Systems Engineering Group, Department Of Computer Science, University Of York, Heslington, York Y01 5DD
BARRY HEBBRON, School of Computing and Mathematics, University of Teesside, Middlesbrough, Cleveland TS1 3BA, England
ABSTRACT HAZOP is a powerful hazard analysis technique which has a long history in process industries. As the use of programmable electronic systems becomes more common, it is clear that there is a need for a HAZOP method which can be used effectively with such systems. This paper describes several attempts to derive such a process, and identifies some requirements which must be met by any PES HAZOP procedure.
MODELLING THE HAZOP PROCESS The HAZOP study was initially developed to support the chemical process industries, and after nearly 25 years of successful application it is generally considered to be an effective yet simple hazard identification method. However, the apparent simplicity of the method belies the subtlety of the associated concepts, and as a precursor to use of HAZOP to support the software development process it is important to clarify the definitions and activities that contribute to HAZOP. HAZOP is a semi-formalised team based activity that systematically reviews a representation of a system and its operating procedures in order to identify potential hazards. It is based upon the principle that a problem can only arise when there is some deviation from the intent of the system as represented by the model under review. The procedure is to search the representation, element by element (traditionally this has been line by line for Process & Instrumentation diagram models) for every conceivable deviation from its normal operation
using a list of guidewords. These are carefully chosen to prompt open, free-ranging thought about all possible system abnormalities. As each deviation is derived, the team then discuss potential causes and consequences and recommend appropriate remedial action or identify emergent requirements. This paper provides three different models of the HAZOP study: a “formal” model expressed in Z, an algorithmic model and a causal model. These different perspectives will help us to draw out the apparent subtleties and will enable us to move towards a justified strategy for the application of HAZOP on Software Engineering Models. In particular, the formal model enables us to investigate the consistency of HAZOP studies, the causal model allows us to integrate HAZOP with causal safety techniques such as Failure Modes and Effects Analysis, and the algorithmic model provides us with a sound basis for the provision of methodological tool support.
A FORMAL MODEL The Z language is a set-based notation used for specifying the properties and behaviour of systems. Here we apply it to the structure of the HAZOP process. The Z schemas themselves are presented separately in the Appendix; this discussion describes some of the properties of HAZOP we have captured in set-theoretical terms. Our formal model of HAZOP consists of two components: a state schema named Hazop Issues and an operation schema Produce Deviations. The state schema Hazop Issues (see the Appendix) introduces two relationships: the concept that intent is expressed by property words and the notion that property words invoke guidewords. The predicate supplied within the schema ties together the two relationships and demonstrates that guidewords are associated with intent through selected property words.
Figure 1: Domain Restriction In HAZOP (INTENT is related to PROPERTY by expressed_by; PROPERTY is related to GUIDEWORDS by invokes)
The relations allowed by Hazop Issues are illustrated in figure 1. The predicate does not prevent the development of meaningless deviations, as there is still a possibility of an intent being associated with inappropriate property words and of guidewords being invoked by property words which produce meaningless statements. It is also possible for an intent to be reflected through property words for which there is no meaningful guideword, and for there to be property words relating to guidewords that do not express any known intent. The operation schema Produce_Deviation (see the Appendix) does address these issues. The declaration introduces two variables: an input variable “in” of type INTENT and an output variable, “deviations”, which is drawn from the relationship of Guideword and Property_Word. The following predicate tightens this description by only allowing members of the domain of expressed_by, that is the given set INTENT, to be considered. This removes the possibility of a deviation being produced for no known intention. Then, by restricting the domain of expressed_by and invokes, the schema restricts the possible deviation pairs to only meaningful deviations.
Figure 2: Property Words And Guide Words (the intent in lies within INTENT; sub-set x lies within y, which lies within z, in PROPERTY; sub-set v lies within w in GUIDEWORDS)
This illustration models the restriction: there is a sub-set V of guidewords W that can usefully be invoked by a sub-set Y of property words Z. In turn, only a sub-set X of these particular property words Y can be used to express a specific intent in.
Figure 3: Expression and Invocation (the intent of an item under investigation is expressed through associated property words, by expressed_by, which invoke guidewords, by invokes)
This can be stated less formally: for a specific intent there is a set of appropriate Property words that can be attributed to that intent. Once the Property words have been chosen, they in turn may only be associated with a selection of Guidewords that can produce meaningful deviations for that property word.
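The restriction described above can be sketched in Python, treating expressed_by and invokes as finite relations stored in dictionaries. This is only an illustration of the set-theoretic idea, not a transcription of the Z schemas in the Appendix; the intent, property words and guidewords in the data are hypothetical examples, and the function name produce_deviations is an assumption.

```python
from typing import Dict, Set, Tuple

# Hypothetical example data (not taken from the paper).
# expressed_by: INTENT -> property words that can express it.
expressed_by: Dict[str, Set[str]] = {
    "transfer reading to controller": {"flow", "value"},
}

# invokes: property word -> guidewords that give meaningful deviations.
invokes: Dict[str, Set[str]] = {
    "flow": {"NO", "MORE", "LESS"},
    "value": {"MORE", "LESS"},
    "temperature": {"MORE", "LESS"},  # expresses no known intent here
}


def produce_deviations(intent: str) -> Set[Tuple[str, str]]:
    """Return (guideword, property word) pairs for a known intent only."""
    if intent not in expressed_by:
        # Only members of the domain of expressed_by are considered.
        raise KeyError(f"unknown intent: {intent!r}")
    # Restrict invokes to the property words that express this intent.
    properties = expressed_by[intent]
    return {
        (guideword, prop)
        for prop in properties
        for guideword in invokes.get(prop, set())
    }
```

In this sketch the property word "temperature" never contributes a deviation, because no intent is expressed by it, and an unknown intent is rejected rather than silently producing an empty result.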
A PROCEDURAL MODEL Translating the flow diagram found within the CIA guide and synthesising this approach with suggestions from numerous sources such as [1, 2] we have been able to develop a general algorithmic model (fig. 4) that reflects the systematic agenda that forms the core of the HAZOP procedure.
INTERROGATE THE ( Component )
BEGIN
  For Each Component
    Clarify the intent through selection of property words
    For each Property word
      For each Guide word
        Formulate Deviation ( Guideword, Property word)
        Examine and evaluate possible causes
        Exit If no possible deviations and causes
        If the deviation is not visible to the operator then
          identify the required changes to provide the appropriate visibility and controllability
        EndIF
        Examine Consequences
        Detect Hazards
        Make a suitable Record
      End loop
    End loop
    Mark Component as having been examined
  End loop
END
Figure 4: An Algorithmic Approach To HAZOP
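The agenda of Figure 4 can be sketched as a small Python routine. This is a minimal sketch under stated assumptions: the team's judgements (causes, consequences, operator visibility) are stood in for by caller-supplied functions, the Record type and all parameter names are hypothetical, and "Exit If no possible deviations and causes" is interpreted here as moving on to the next guideword.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Record:
    """One line of the study record (hypothetical layout)."""
    component: str
    deviation: str
    causes: List[str]
    consequences: List[str]
    needs_visibility_change: bool


def interrogate(
    components: List[str],
    property_words: Dict[str, List[str]],
    guidewords: Dict[str, List[str]],
    find_causes: Callable[[str, str], List[str]],
    find_consequences: Callable[[str, str], List[str]],
    is_visible: Callable[[str, str], bool],
) -> Tuple[List[Record], Set[str]]:
    records: List[Record] = []
    examined: Set[str] = set()
    for component in components:
        # Property words are assumed to have been selected per component.
        for prop in property_words.get(component, []):
            for guide in guidewords.get(prop, []):
                deviation = f"{guide} {prop}"
                causes = find_causes(component, deviation)
                if not causes:
                    continue  # no credible cause: skip this deviation
                needs_change = not is_visible(component, deviation)
                consequences = find_consequences(component, deviation)
                records.append(
                    Record(component, deviation, causes,
                           consequences, needs_change)
                )
        examined.add(component)  # mark component as examined
    return records, examined
```

Passing the judgement steps in as functions keeps the loop structure separate from the engineering knowledge, which is the part a tool cannot supply.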
There is some debate [3,4] over how best to strike a balance between a rigorous agenda that will provide confidence in the coverage of the meeting and a less stringent approach that will allow freedom to investigate other issues when appropriate. The responsibility for this compromise is in the hands of the Hazop study team leader, yet until the software community have developed the appropriate experience to make these judgements we believe that it is prudent to err on the side of caution and adopt this more rigorous approach.
A CAUSAL MODEL This section places HAZOP in context with other failure analysis methods, and highlights the underlying semantics of failure. Figure 5 illustrates the bases of our ideas.
Figure 5: From Fault To Accident (a chain of states and events: normal operation, fault, erroneous state, failure, hazardous state, accident, catastrophe)
Each accident has an associated chain of events and states. One possible initiating event is a component failure or fault. This may be internal or external, random or systematic. Such faults may give rise to an erroneous internal state of the system. These errors, or symptoms, may then cause the system to no longer perform in accordance with its specification and hence produce a system-level failure. The mode of failure can place the system into a hazardous state. If the hazardous state is uncontrolled this may lead to an accident. Failure analysis mechanisms, including HAZOP, are required to relate a system representation to this underlying chain of causality. According to McDermid: “Failure analysis is primarily a synthetic exercise that determines ways in which the failure in the real world affects the behaviour of the system. In order to trace from faults to failures or vice versa, techniques such as Fault Tree Analysis (FTA) and Failure Modes and Effects Analysis (FMEA) require a representation of the dependencies between different components in the system so that it is possible to follow through the causal links between failure in a particular components to failure in the system as a whole.” [5] Later, Fenelon and McDermid suggest that “there is a more fundamental relationship: we believe that FTA and FMEA are both abstractions of the
same underlying causal model of the propagation of failure (cause and effect) through a system.” [6] The simplest view is to consider just cause and consequence.
Figure 6: Simple Causal Model (causes leading to consequences)
This simple model (figure 6) demonstrates the complementary approaches of FTA and FMEA. Specifically, FTA starts with a top event (system-level failure mode) and identifies a range of potential causes for that specific consequence. In contrast, FMEA is a bottom-up analysis starting from a given component-level failure and working forwards to evaluate its effects.
Figure 7: Causal Models Of FTA and FMEA (FTA: multiple causes, single consequence; FMEA: single cause, multiple consequences)
HAZOP does not appear to fit into this simple model. The starting point for a HAZOP study is the deviation from the design intent. Once the deviation is identified, the HAZOP then aims to identify its potential causes and consequences. Only when we extend our simple model to include the immediate effect of an event are we able to develop a clearer image of where the HAZOP study fits.
Figure 8: Causal Model Of HAZOP (multiple causes, effect, multiple consequences)
Now, if we take into account the causal links between Fault and Accident identified earlier we can distinguish FTA, FMEA and HAZOPS by their starting point, the immediate effect identified and the consequence they determine. Specifically:
• FTA starts with a top event, normally a systems-level failure, from which the fault tree develops by recursively identifying the causes of events until component-level failures are discovered.
• FMEA starts with the fault event for the component, then identifies the effects of that fault, the errors it causes, and eventually the consequences leading to potential system failure.
• HAZOP starts with the deviation from design intent, and identifies the potential causes (faults), and the consequences (system-level failure modes).
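The three starting points can be illustrated on a single cause-and-effect graph. The following Python sketch is an assumption-laden illustration, not a tool described in this paper: the events in EFFECTS are hypothetical, and each technique is reduced to a reachability search (backwards for FTA, forwards for FMEA, both directions from a deviation for HAZOP).

```python
from typing import Dict, List, Set, Tuple

# Hypothetical cause -> immediate effects edges for a small system.
EFFECTS: Dict[str, List[str]] = {
    "sensor fault": ["stale reading"],
    "wiring fault": ["stale reading"],
    "stale reading": ["valve held open"],
    "valve held open": ["tank overflow", "pump runs dry"],
}


def _invert(edges: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Build effect -> immediate causes from cause -> effects."""
    inverted: Dict[str, List[str]] = {}
    for cause, effects in edges.items():
        for effect in effects:
            inverted.setdefault(effect, []).append(cause)
    return inverted


CAUSES = _invert(EFFECTS)


def _reach(start: str, edges: Dict[str, List[str]]) -> Set[str]:
    """All nodes reachable from start, excluding start itself."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def fta(top_event: str) -> Set[str]:
    """Work backwards from a top event to everything that can cause it."""
    return _reach(top_event, CAUSES)


def fmea(fault: str) -> Set[str]:
    """Work forwards from a component fault to all of its effects."""
    return _reach(fault, EFFECTS)


def hazop(deviation: str) -> Tuple[Set[str], Set[str]]:
    """Start from a deviation and look both ways: (causes, consequences)."""
    return fta(deviation), fmea(deviation)
```

Because all three functions share one graph, a result found by one technique is already expressed in the terms the others use.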
CURRENT STATE OF SOFTWARE HAZOP We include a summary of the recent and current research into software HAZOP. This survey has enabled us to evolve some recommendations and draw together common threads of work.
DATA FLOW BASED APPROACHES Chudleigh [7] has recently published an account of the application of HAZOP to Data Flow Models. This confirmed the suggestion made by Earthy [8] that the use of DFD models during a HAZOP study was appropriate and could produce useful results. The DFDs provided the team with a systems view rather than a software view of the application under review. Chudleigh suggests that not only was the data flow model “readily understandable by all interested parties”, but also that the model provided the “most natural” representation to use for a HAZOP study. Classical CIA guidewords were not directly applicable to DFD models, so a new set of guidewords based around data flow and algorithmic functionality was devised. The resultant method required the appropriate guidewords to be applied to each of the input flows and transformations, down through the hierarchy of the model.
BURNS AND PITBLADO Burns and Pitblado [9] presented a modified 3 stage HAZOP approach:
1. Conventional HAZOP
2. Programmable Electronic Systems HAZOP
3. Human Factors HAZOP
The PES HAZOP focuses on the control aspects of the system. They claim that several traditional views, such as cause and effect charts and ladder logic, can be used as an appropriate system representation for a HAZOP study. Their PES HAZOP approach bears strong similarities to classical FMEA. The paper claimed that application to various case studies has identified “numerous safety and operability problems, and provided possible solutions for most of them”, although it is unclear how the PES HAZOP is related to an overall systems-level view of the system.
PUMFREY & McDERMID, UNIVERSITY OF YORK A modified form of HAZOP is being used at the University of York [10] at the software design stage (using the MASCOT notation) to characterise likely failure modes of software components. This HAZOP-like method is called SHARD (Software Hazard Analysis and Resolution in Design). The innovative aspect of this work is the derivation of deviations by applying fault classes to flow types. The fault classification used is based upon those of Bondavalli and Simoncini [11] and Shrivastava and Ezhilchelvan [12]. Currently the classification is based around the following fault taxonomy:
• SERVICE PROVISION: omission, commission
• SERVICE TIMING: early, late
• SERVICE VALUE: coarse incorrect; subtle incorrect
For MASCOT they have identified seven basic flow types: Stim (binary signal), Binary (boolean value), Timed Pulse, Value (scalar), Message, Complex and Compound. The integration of these concepts has been summarised in the table below.
Table 1: SHARD Guidewords for MASCOT
Flow                | Failure Categorisation
                    | Provision                    | Timing        | Value
Protocol | Type     | Omission  | Commission       | Early | Late     | Subtle             | Coarse
Pool     | Boolean  | No update | Unwanted Update  | N/A   | Old Data | Stuck at...        | N/A
Pool     | Value    | “         | “                | “     | “        | wrong in tolerance | out of tolerance
Pool     | Complex  | “         | “                | “     | “        | Incorrect          | Inconsistent
Channel  | Boolean  | No Data   | Extra Data       | Early | Late     | Stuck at...        | N/A
Channel  | Value    | “         | “                | “     | “        | wrong in tolerance | out of tolerance
Channel  | Complex  | “         | “                | “     | “        | incorrect          | inconsistent
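One reading of Table 1 is as a lookup from a flow (protocol and type) to one guideword per failure category. The Python sketch below transcribes that reading, with ditto marks expanded and N/A entries stored as None; the names SHARD, CATEGORIES and guidewords_for are hypothetical, and the row contents should be checked against the original SHARD publication [10] before being relied upon.

```python
from typing import Dict, Optional, Tuple

# Column order: provision (omission, commission), timing (early, late),
# value (subtle, coarse).
CATEGORIES = ("omission", "commission", "early", "late", "subtle", "coarse")

Row = Tuple[Optional[str], ...]

# Assumed transcription of Table 1; None stands for an N/A entry.
SHARD: Dict[Tuple[str, str], Row] = {
    ("pool", "boolean"): ("No update", "Unwanted Update", None,
                          "Old Data", "Stuck at...", None),
    ("pool", "value"): ("No update", "Unwanted Update", None,
                        "Old Data", "wrong in tolerance",
                        "out of tolerance"),
    ("pool", "complex"): ("No update", "Unwanted Update", None,
                          "Old Data", "Incorrect", "Inconsistent"),
    ("channel", "boolean"): ("No Data", "Extra Data", "Early",
                             "Late", "Stuck at...", None),
    ("channel", "value"): ("No Data", "Extra Data", "Early",
                           "Late", "wrong in tolerance",
                           "out of tolerance"),
    ("channel", "complex"): ("No Data", "Extra Data", "Early",
                             "Late", "Incorrect", "Inconsistent"),
}


def guidewords_for(protocol: str, flow_type: str) -> Dict[str, str]:
    """Return the applicable guideword for each failure category."""
    row = SHARD[(protocol.lower(), flow_type.lower())]
    # Categories marked N/A are dropped rather than returned as None.
    return {cat: word for cat, word in zip(CATEGORIES, row)
            if word is not None}
```

Dropping the N/A categories means a caller iterating over the result only ever sees deviations that are meaningful for that flow.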
CHAZOPS The CHAZOP (Computer Hazard and Operability) study is a technique for undertaking an assessment of a computer system by investigating the areas where potential plant Hazard and Operability Problems could arise. There is substantial variation in detail of implementations throughout different organisations. However, the HSE report by P Andow [13] appears to capture the essence of the overall approach, which is essentially a two-stage process combining a preliminary analysis early in the design stage with a post-implementation analysis.
SHAZOPS SHAZOPS [14] is basically a systematically applied checklist that can be used to review sequence flow charts, control schematics and high level program source (in this case RTL2). The approach can be divided into 2 stages: Stage 1 considers the whole process under review; specifically, discussion is prompted by considering design intent, historical data, compliance to standards and documentation. Stage 2 is concerned more with the implementation of the system, its failure behaviour and its interaction with the outside world.
EVOLUTION OF MoD PES HAZOP GUIDANCE Defence Standard 00-56 calls for Hazard and Operability Studies to be carried out on systems and sub-systems. Two feasibility studies have been undertaken with the intent of defining a HAZOP method for Programmable Electronic Systems. The first identifies six key areas [15]:
• Team Structure
• Life-Cycle Issues
• Design Representation
• Parameters
• The choice and interpretation of Guidewords
• Reporting and Recording.
The overall philosophy of these initial recommendations appears to be pragmatic and built upon experience within the field. Much of the discussion reinforces existing HAZOP practices, and the work is aimed towards the production of guidelines that will “extend the previous guides by catering for systems which include PES, but is applicable to all systems.” The second group [16] have addressed a similar range of issues to the first. A key concept in their approach is the development of a formal reference model based upon object oriented techniques. The model has two elements:
• Object based definition of system components
• Formal definition of hazardous conditions and unsafe behaviour of the system.
The reference model shares a similar philosophy with the modelling stage of our work.
Within this paper we concentrate upon the issues that are innovative or controversial and are relevant to our interests, particularly the representation. They indicate that the notation used for system representation should conform to a list of specific attributes:
• well defined
• have an ability to adopt abstraction
• make visible all important issues
• be able to deal with the idiosyncrasies of the domain under study
• expressive yet comprehensive
• verifiable against the system it is modelling.
It is clear that there is no such single notation and that representations will be domain and expertise specific. However, it is interesting to note that although this work is independent of ours the conclusions reached are basically similar.
OUR EMERGING RECOMMENDATIONS This section builds upon both the modelling process we have undertaken, and the experience of various active members of the HAZOP community. We have focused here on three issues, the most important being the systems representation used to review and interrogate the system. We will also briefly address the role of property words and guidewords.
THE SYSTEM REPRESENTATION The representation of the system placed before the HAZOP team is the key issue that will control the effectiveness of the HAZOP study process. It is a reflection of the developers’ understanding of the system under review. An incomplete, inconsistent, unclear or ambiguous systems model presented to the team often mirrors a similar level of apprehension and incomprehension that the developers have of the system they have created. It is the visibility and clarity of the intent that is of major concern. It is vital that the HAZOP team can perceive all aspects of the system under review. Omission or an opaque view of either the control, data or temporal aspects of the system behaviour will reduce the effectiveness of the review. For the team to be able to discuss the failure behaviour of a model they must be able to use these multiple perspectives in order to see through the notation used and perceive the underlying causal chain from a failure event to a potentially hazardous state. In the context of HAZOP, we must be aware of the fact that in isolation software is not dangerous; it is the system of which it is a part that has the ability to harm. Therefore, it is important that the model under review will enable the analysts to observe the interaction between the software component and its environment. Particularly when HAZOP is applied to software, the team will require some mechanism for managing the complexity of the model under review. Traditionally this has been through abstraction, using top-down hierarchically decomposed models. We believe this is problematic. A top-down model will always require the analyst to anticipate the required division of functionality; this may lead to arbitrary decisions being taken and as a consequence hazard visibility may be impaired. This approach will always require the analyst to consider the appropriate level of abstraction at which the clarity of intent has been achieved. We do not present a solution here; however, we do believe that there is a relationship between the levels of abstraction at which the intent and property words are formulated: it is possible to both over- and under-generalise property words such that they will not provide meaningful deviations. Work at the University of Teesside [19] suggests that one way forward would be to consider abstract representations based around event-response pairs. Interpretation of, and consistency checking between, various views of the system provided by (hopefully) integrated models of the system will require a modelling notation with a well-understood syntax and semantics. With traditional representations, such as P & I diagrams, the interpretations associated with the models can be readily visualised as concrete physical entities: pipes, pumps, and vessels. However, for the less tangible software components this is not so readily understood, and a clear, preferably formal, definition of the notation used is recommended.
INTEGRATION When we refer to integration in the context of safety assessment notations we are in fact describing a very subtle concept with broad ramifications. True integration is achieved when we have a broad-spectrum approach to safety analysis which integrates well with the rest of the development and assessment process. In order for a set of notations, methods and tools to be truly well-integrated we believe that three levels of integration must be achieved. As far as a user of a set of notations and supporting software tools is concerned, the most important of these is operational integration: a set of tools must be created which can interoperate with each other in a synergistic fashion, allowing information from diverse sources to be assembled into a coherent whole. This is dependent upon methodological integration — the methods implemented by the software tools must be fundamentally compatible and there must be meaningful procedures for translating results obtained from one notation into raw data for another. In turn, methodological integration is predicated upon semantic integration — the models underlying the notations must be semantically consistent so that data from one part of a diverse set of notations retains its meaning when exported to some other notation. Figure 9 illustrates our abstract model of integrated notations and tools.
Figure 9: A Model Of Integration (operational integration of tools requires methodological integration of methods, which requires semantic integration of models)
In order to achieve any degree of successful integration between notations used in software safety analysis we therefore need to identify some commonality between them. We have described the causal relationships between HAZOP and various other notations earlier in this paper; how does this help us to achieve integration? We have previously demonstrated that there is a causal link between FTA and FMEA, and between HAZOP and these notations. They are all dealing with the same underlying set of components, events and states, so it is therefore possible to derive a unified data model which we can use as a basis for semantic integration between the notations. Since FTA, FMEA and HAZOP are all based around the same causal processes it is easy to “export” parts of analyses from one notation to another (typically FTA and FMEA share events; here we propose that there is a role for HAZOP in causally-based assessment processes). In fact, one of the current authors has developed a Failure Propagation And Transformation Notation (FPTN) [17] which subsumes the causal models of FTA and FMEA. HAZOP has been proposed as a means of generating failure types (along the lines of Pumfrey’s work) inside this notation. We can therefore start to examine systems at various levels of abstraction, using the causally-based notation which is most appropriate given the current level of knowledge about the system’s structure and about its behaviour: HAZOP can be used very early in the design process, and initial speculation about likely failure modes and effects can be confirmed or refuted by subsequent use of techniques including FMEA and FTA.
PROPERTY WORDS Property words enable the team to express their understanding of the intent. If the team are unable to capture the intent in this way, they will not be able to interrogate the element’s performance within the context of its system. It is not only the visibility of the intent but also the classification of the intent that is important. With Software Engineering Models this suggests that the type or class of item under analysis will imply the property words required to express its intent. The ability to select the appropriate property words will depend upon how clearly we understand the semantics of the model under review.
GUIDEWORDS Guidewords are an expression of the failure modes associated with a specific intent. Various papers [17, 11, 12] have discussed how, in many models, these failure modes can be categorised into various domains. The specific guideword used will depend upon the level of abstraction required to generate a useful deviation and also the specific property word used to express the intent. For example, a simple stimulus does not have a failure mode within the value domain (it either occurs or does not occur; no value information is passed on), so guidewords such as MORE and LESS are inappropriate. The failure modes of this event can be captured at a high level of abstraction by using the guidewords INVALID or WRONG, or alternatively in greater detail by use of guidewords such as EARLY and LATE.
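The stimulus example can be sketched as a small selection function in Python. This is a hedged illustration only: the guidewords come from the example above, but the mapping of item kinds to failure domains (ITEM_DOMAINS), the "scalar value" entry and the function name select_guidewords are assumptions introduced for the sketch.

```python
from typing import Dict, Set

# Guidewords grouped by failure domain, as in the stimulus example.
DOMAIN_GUIDEWORDS: Dict[str, Set[str]] = {
    "value": {"MORE", "LESS"},
    "timing": {"EARLY", "LATE"},
}

# Guidewords usable at a high level of abstraction.
ABSTRACT_GUIDEWORDS: Set[str] = {"INVALID", "WRONG"}

# Assumed failure domains per kind of item. A stimulus carries no value
# information, so only the timing domain applies to it.
ITEM_DOMAINS: Dict[str, Set[str]] = {
    "stimulus": {"timing"},
    "scalar value": {"timing", "value"},  # hypothetical second item kind
}


def select_guidewords(item_kind: str, detailed: bool) -> Set[str]:
    """Pick guidewords for an item at the requested level of abstraction."""
    if not detailed:
        return set(ABSTRACT_GUIDEWORDS)
    words: Set[str] = set()
    for domain in ITEM_DOMAINS[item_kind]:
        words |= DOMAIN_GUIDEWORDS[domain]
    return words
```

With this data, a detailed study of a stimulus never offers MORE or LESS, which mirrors the argument that those guidewords are inappropriate for it.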
PARALLEL WORK Work at the University of Teesside [18] is investigating the relationship between Hazard and Operability Studies (HAZOPS), Ward and Mellor Essential Models and the Calculus of Communicating Systems (CCS). We believe that Ward and Mellor (W & M) models not only provide the required control flow and transformation extensions to capture the essence of control and protective systems, but also that the modelling philosophy underpinning the Ward and Mellor development method provides an appropriate model to which the HAZOPS of such systems can be successfully applied at the requirements stage. Specifically, we can demonstrate how the model can be effectively partitioned to provide the necessary visibility of the system’s requirements that will enable an efficient and effective HAZOP meeting. This work highlights, with appropriate property words, guidewords and interpretations, how meaningful deviations from the required intent can be developed. Associated with this work is the integration of Ward and Mellor models and CCS. This work, undertaken at the University of Teesside, provides translations between both models and has led us to develop a greater understanding of the Ward and Mellor syntax, semantics, heuristics and limitations [19, 20]. This provides us with the basis of a method for developing formal arguments to support the HAZOP findings by expressing deviations in an appropriate temporal logic. We then use tool support to demonstrate whether or not the formal deviations are displayed by the CCS model. Supporting this, it has been shown that State Transition Diagrams, as used by Ward and Mellor, can be translated into equivalent executable ladder logic programs [21]. This provides yet another integrated perspective of the system available to study. A particular concern has been that these techniques should be seen as an extension of current good practice within the relevant engineering disciplines.
Our approach is being evaluated by application to a small but realistic industrial case study.
FUTURE WORK Our intention is to build upon our existing abstract models of the HAZOP process by integrating the reference model under development for the MoD and Pumfrey’s failure mode classification to develop a generic framework for applying HAZOP to Software Engineering representations. We will also be investigating the potential links between Fenelon’s Failure Propagation and Transformation Notation [17] and HAZOP. A HAZOP tool is likely to be constructed as part of the Safety Argument Manager project (ASAM-II) currently in progress at York; this will exploit the ASAM-II integrated data model and will share data with other modelling and safety analysis tools. We believe that the way forward is through the use of integrated methods and notations and will be investigating further the coupling and cohesion required between notations that will produce an effective representation for a HAZOP study.
CONCLUSIONS Various independent studies appear to have reached a consensus on the issues that must be addressed if HAZOP is to successfully evolve towards its application on software engineering notations. In addition to the factors relevant to all varieties of HAZOP there are key software-specific problems associated with the identification and adaptation of an appropriate design representation notation, and how property words and guidewords can be developed to work with this. It is apparent that the interpretation of the failure classification of an interrogable intent is the crucial issue, from which the development of property words, guidewords and resulting deviations will emerge.
ACKNOWLEDGEMENTS The authors would like to thank their colleagues at Teesside (particularly Clive Fencott) and York (particularly John McDermid and David Pumfrey) for discussions and comments on this work.
REFERENCES
[1] A guide to Hazard and Operability Studies, Chemical Industries Association Limited 1987
[2] Kletz, T., HAZOP and HAZAN
[3] Lloyd’s Register Workshop on Failure Analysis Techniques, proceedings not available
[4] Minutes of Hazard Identification and Analysis Interest Group meetings. School of Computing and Maths, University of Teesside.
[5] McDermid, J. A., Issues And Trends In The Development Of Software for Safety Critical Systems, University of York technical report YCS138
[6] Fenelon, P. & McDermid, J. A., An Integrated Toolset For Software Safety Assessment, Journal of Systems And Software, July 1993
[7] Chudleigh, M., Hazard Analysis Using HAZOP: A Case Study, in Proceedings of SAFECOMP 93.
[8] Earthy, J. V., Hazard and Operability Studies As An Approach To Software Safety Assessment, in Proceedings of IEE Computing and Control Division Colloquium on Hazard Analysis, IEE, November 1992
[9] Burns, D. J. & Pitblado, R. M., A Modified Methodology For Safety Critical Systems Assessment, in F. Redmill & T. Anderson, eds, Directions in safety critical systems: SCS Symposium, Bristol, 1993 (Springer-Verlag)
[10] McDermid, J. A. & Pumfrey, D. J., A Development of Hazard Analysis To Aid Software Design, COMPASS ‘94
[11] Bondavalli, A. & Simoncini, L., Failure Classification With Respect To Detection, in First Year report, TASK B: Specification and Design for Dependability, Volume 2. ESPRIT BRA Project 3092: Predictably Dependable Computing Systems.
[12] Ezhilchelvan, P. D. & Shrivastava, S. K., A Classification Of Faults In Systems, University of Newcastle upon Tyne.
[13] Andow, P., Guidance on HAZOP Procedures for Computer Controlled Plants, KBC Process Technology, ISBN 0-7176-0367-9
[14] An Introduction to Software Hazard and Operability Procedures, ICI 1988.
[15] Feasibility study for MoD PES HAZOP, MoD Ref NSM 13C/1063, 1994
[16] Feasibility study for MoD PES HAZOP, MoD Ref NSM 13C/1062, 1994
[17] Fenelon, P., McDermid, J. A., Pumfrey, D. & Nicholson, M., Towards Integrated Safety Analysis And Design, ACM Applied Computing Review, August 1994
[18] Fencott, P. C. & Hebbron, B. D., The Application Of Hazop Studies To Integrated Requirements Models For Control Systems, to appear in proceedings of SAFECOMP94
[19] Fencott, P. C. et al., Formalising the semantics of Ward and Mellor SA/RT essential Models using a Process Algebra. University of Teesside
[20] Fencott, P. C. et al., Experiences With The Integration Of Structured And Formal Methods For Real Time Systems Specifications, University of Teesside
[21] Hunt, J. R., Designing Ladder Logic From State Transition Diagrams; and also, Polymer Handling: A Worked Example, ICI Eutech