The SW Problem++ - Personal webpages at NTNU

16 downloads 5289 Views 626KB Size Report
Oct 27, 2004 - Always a drive for bigger, better, faster, and cheaper system technologies. .... Vessel on DP in 830m, recovering lifting floats from pipeline, 2DGPS + HPR online ... change-over the operator selected the 'Motion Page' and B crashed. ..... various types of real-time communication based on hard-wiring, field ...
Computer-based systems on ships and offshore vessels:

The Software Problem ++

Marine Cybernetics

Det Norske Veritas

Statoil

Smedvig Kongsberg Maritime

Smedvig Offshore

Norsk Hydro

Stolt Offshore

Eidesvik Shipping

Solstad Offshore

Subsea 7

Ulstein

ABB

INTERNAL

Preface This document is prepared by Marine Cybernetics (Roger Skjetne and Asgeir Sørensen) and based on contributions from all companies in the sponsor reference group. The contents of the document are the sole responsibility of Marine Cybernetics, and the views, assessments, and conclusions presented are not necessarily representative for the other contributing parties in the project.

We acknowledge special contributions and comments from DNV (Aleks Karlsen and Stian Ruud), Smedvig Offshore (Harry Verhoeven), Kongsberg Maritime (Bjørn Gjelstad and Nils A. Jenssen), and Statoil (Ole J. Nordahl and Kjell Larsen).

Sponsors and reference group in the project: ƒ ƒ ƒ ƒ ƒ ƒ ƒ ƒ ƒ ƒ ƒ ƒ ƒ ƒ

Marine Cybernetics DNV Statoil Smedvig Offshore Kongsberg Maritime Norsk Hydro Stolt Offshore Eidesvik Subsea 7 Solstad Offshore Ulstein ABB Petroleum Safety Authority Norway Norwegian Maritime Directorate

Trondheim, 2004-10-27

Reference: Reference to part of this report which may lead to misinterpretation is not permissible. Copyright © 2004 by Marine Cybernetics AS

The Software Problem ++

Page 2

Disclaimers The information contained in this document is the proprietary and exclusive property of Marine Cybernetics and the companies indicated by name and logo on the title page, called hereafter the “contributing companies”. No part of this document, in whole or part, may be reproduced, stored, transmitted, or used without prior written permission of Marine Cybernetics. The information in this document is subject to change without notice. The information in this document is provided for informational purposes only. Any or all errors in writing or failures of any kind in the document or information given in the document can not in any matter be any ground for responsibility or liability whatsoever for Marine Cybernetics or any other contributor. The contributing companies specifically disclaim all warranties, express or limited, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose.

Non disclosure This document may contain information of a sensitive nature. This information should not be given to persons other than those involved in the project, or who will be involved during the document lifecycle. Breach of obligations of non disclosure or other infringements or violation of the contributing companies’ rights or privileges, included patents, involves liability.

Version history Revision chart Version

Author(s)

1.0

Marine Cybernetics with contributions from DNV, Smedvig Offshore, Statoil, and Kongsberg Maritime

The Software Problem ++

Description of version Completed in connection with sea trials in the pilot project CyberSea Simulator.

Date completed 2004-10-27

Page 3

Table of contents

PREFACE .....................................................................................................................................................................2 EXECUTIVE SUMMARY .............................................................................................................................................5 REFERENCES .............................................................................................................................................................6 1

INTRODUCTION ..................................................................................................................................................7

1.1

Objective ............................................................................................................................................................................ 7

1.2

Purpose and future application ...................................................................................................................................... 7

1.3

Trends for computer-based systems .............................................................................................................................. 8

2

DP INCIDENTS...................................................................................................................................................10

2.1

Examples of IMCA reported incidents........................................................................................................................ 11

2.2

Examples of contractor reported incidents................................................................................................................. 17

2.3

Examples of Statoil reported incidents........................................................................................................................ 18

2.4

Potential consequences related to DP incidents.......................................................................................................... 21

3

WHAT IS THE SOFTWARE PROBLEM ++? ..................................................................................................22

3.1

Computer-based systems on ships and offshore vessels............................................................................................ 22

3.2

Drivers for technology progress.................................................................................................................................... 26

3.3

Practice for quality assurance in maritime industry................................................................................................. 26

3.4

Observations and problems related to computer-based control and monitoring systems .................................. 33

CONCLUSION............................................................................................................................................................36

The Software Problem ++

Page 4

Executive Summary This report deals with the imprecise term denoted ‘software problems’ for control and monitoring systems onboard ships and offshore vessels. It has, however, turned out that many problems identified are not purely related to the software, but are related to design, fabrication, operation, and maintenance of the systems. Due to this fact we have denoted all these related problems as the ‘Software Problem ++’. In order to understand why software problems occur, this report gives a brief description of the main drivers for technology progress, leading to an increasing integration complexity and number of I/O for industrial control and monitoring systems for offshore vessels. Moreover, state of practice for quality assurance (QA) is briefly treated in order to evaluate the effectiveness of present QA procedures for capturing the described problems. Examples in the report are taken from incidents related to failures in computer-based systems onboard DP vessels. The main focus has therefore been problems related to DP systems, but it is anticipated that most of the problems reported will also be found in other computer-based systems. This report should form the basis for evaluation of possible solutions to the challenges identified. The next step should be to identify the most promising solutions and how these solutions actually will solve the various challenges identified in the 'Software problem++'.

The Software Problem ++

Page 5

References [1]

IMO, “Guidelines for Vessels with Dynamic Positioning Systems,” MSC/Circ.645, June 6, 1994.

[2]

IMCA, “Guidelines for The Design & Operation of Dynamically Positioned Vessels,” IMCA M 103, February 1999.

[3]

DNV Rules for classification of Ships / High Speed, Light Craft and Naval Surface Craft, Pt. 4, Ch. 8, “Electrical Installations,” January 2004.

[4]

DNV Rules for classification of Ships / High Speed, Light Craft and Naval Surface Craft, Pt. 4, Ch. 9, “Control and Monitoring Systems,” January 2004.

[5]

DNV Rules for classification of Ships, Pt. 6, Ch. 7, “Dynamic Positioning Systems,” January 2004.

[6]

DNV Rules for classification of Ships, Pt. 6, Ch. 8, “Nautical Safety,” January 2003.

[7]

Statoil, “DP Requirements for Drilling and Intervention Units,” Report TR 1029, September 1, 2002.

[8]

Scandpower Risk Management AS, “Safety of Dynamic Positioning Operation on Mobile Offshore Drilling Units,” Report no. 27.790.044/R1, March 24, 2004.

[9]

IEC, “Functional safety of electrical/electronic/programmable electronic safety-related systems,” IEC61508, 1998.

[10]

IMCA, “Dynamic Positioning Station Keeping Incidents: Incidents reported for 2000,” IMCA M 165, December 2001.

[11]

IMCA, “Station Keeping Incidents Reported for 2001,” IMCA M 169, February 2003.

[12]

Rensvik, E., A. J. Sørensen, and M. Rasmussen, “Maritime Industrial IT,” Proc. 9th Int. conf. Marine Eng. Systems (ICMES), Helsinki, Finland, May 19-21, 2003.

[13]

Sørensen, A. J., “Marine Cybernetics: Modelling and Control,” Lecture notes, Report UK-0476, Dept. Marine Technology, NTNU, Trondheim, Norway, 2004.

[14]

Norsk Hydro, “DP Unit requirements for drilling/well intervention units,” Internal report, 200306-21.

[15]

Norske Shell, “DP drilling unit requirements.”

[16]

IMCA, “Guidance on Failure Modes & Effects Analyses (FMEAs),” IMCA M 166, April 2002.

The Software Problem ++

Page 6

1 Introduction 1.1 Objective This document describes important aspects and problems related to safety-critical control and monitoring systems onboard ships and offshore vessels. By a computer-based system we mean a complex system [4] consisting of one or several integrated computers performing a number of specified autonomous function related to control and/or monitoring of vessel processes. Such systems perform their function as an interaction between the vessel, the sensors, the actuators, the computer hardware (HW), the software (SW), and the operators. The operation of the systems itself, or a failure during operation of the systems may potentially become harmful to life, property, or the environment. The focus in this document is on the SW and its interaction with HW and the human operator in computer-based systems onboard marine vessels. The main objective is to identify various types of SW problems. To do so we consider the increasing complexity of computer-based systems for marine vessels and the drivers for such technology development in order to understand why the problems arise. Then, the state of practice in maritime industry for quality assurance (QA) throughout the lifecycle of a computer-based system is described to understand why some general types of problems are very often not revealed. DP incident reports are used as examples in order to document actual problems.

1.2 Purpose and future application This document is a first description of challenges related to the so called “Software Problem”. The purpose of this document is to be applied as a basis for selection of measures for improving the quality assurance and the safety of software-based control and monitoring systems. In 2003-2004 Marine Cybernetics, in cooperation with DNV, oil companies, vendors, ship operators and contractors, and yards, developed a service offering Hardware-In-the-Loop (HIL) testing of marine control and monitoring systems. In 2004 DNV published new rules for Control and monitoring systems in ships including additional requirements to computers and software. Other companies are also proposing different services for improving the quality assurance of such systems. However, each of the above initiatives will address some of the challenges identified. The intention is that when the ‘Software Problem ++’ document has been established, it will be easier to evaluate proposed quality assurance measures available, and then identify an optimal set of future measures to be recommended. This version of the document should not be considered to cover all aspects of the “Software Problem”. The document is intended for use by system suppliers, manufacturers, yards, designers, ship owners and operators, suppliers of quality assurance services, consultants, and classification societies.

The Software Problem ++

Page 7

1.3 Trends for computer-based systems As the technology evolves, HW and SW driven systems are getting more and more advanced. The following characteristics and trends are observed for computer-based systems for ships and offshore vessels: Technology: ƒ ƒ ƒ ƒ ƒ ƒ

ƒ ƒ ƒ

The dependence and importance of computer-based systems are increasing. The share of SW in computer-based systems is increasing. The level of integration and system complexity is increasing. Low-cost Commercial Off The Shelf (COTS) technology, originally developed for the office and consumer market, is emerging into safety-critical applications. More “black-box” components and complex interactions between HW, SW, and the human operator. Quality of SW is variable. System design and SW manufacturing is often done by ITengineers, who may not have system, and/or safety engineering, and/or application understanding. Functional redundancy relies not only on redundant HW components; it depends strongly on the implementation of redundancy functions in the SW. The consequences of SW failures increase. Multiple failures and sequences of failures, involving a combination of HW, SW, and human interactions, result in incidents and accidents.

Cost: ƒ ƒ ƒ ƒ

Always a drive for bigger, better, faster, and cheaper system technologies. Control and monitoring systems are a significant and increasing part of costs in new vessels and offshore installations. Low-cost solutions are often given priority in decision criteria for contract awards. On the other hand; the consequences of failures of such computer-based systems can be significant in cost. For instance, typical day-rates for advanced offshore vessels are in the order of 1-3 MNOK, including operator and client cost. Hence, quality problems and cost in the North Sea related to failures in computer-based systems onboard marine vessels result annually in a direct cost of several 10xMNOK. The consequential cost related to off-hire and loss of production is substantially higher.

QA procedures and testing: ƒ ƒ ƒ ƒ

Adequate quality of functions implemented in SW depends to a great extent on sufficient functional testing. Marine testing and certification methodology have so far had its main focus on structures and machinery. Development of sufficient testing and verification methodologies for computer-based systems is a concern. Testing and certification has historically targeted single-point failures in system components. Partly as a result of this, incidents today are often caused by scenarios of multiple failures.

The Software Problem ++

Page 8

ƒ ƒ ƒ ƒ ƒ

Insufficient documentation quality. Insufficient requirement specification and analysis. Insufficient in-house QA procedures. Lack of QA for reconfiguration and reprogramming during installation, commissioning, testing, and after delivery. Insufficient revision control of upgrades. Lack of tools for QA, verification and validation of SW safety and performance; in particular, testing of the integrated HW and SW embedded in real environmental conditions.

The Software Problem ++

Page 9

2 DP incidents The success (or perhaps failure?) of present industry practices for testing safety-critical computerbased systems can be measured in the number of incidents for such systems. For Dynamic Positioning (DP) systems a collection of approximately 40-100 incidents are reported each year, 1990−2001, to IMCA on the basis that the vessel name, owner, and client are kept confidential. The reporting of such incidents is believed to be the tip of an iceberg, meaning that a large amount of incidents are not reported for various reasons. DP incident trends of IMCA from 1990−2001 [11] indicates an increasing number of incidents with primary cause in computer related problems, and several of these incidents have “insufficient testing/commissioning or QA” as a secondary cause. “No simple explanation is available other than the obvious, that generally vessels have more computers with integrated control systems” [11]. Another interesting point made by IMCA in [11] is that “Computer hardware problems cause downtime and generally not position loss (at least not alone) whereas computer software causes position loss.” It is further commented that “this is to be expected since hardware is redundant but the software is not.” The IMCA reported incidents [10,11] are classified in terms of consequences as follows: ƒ

Loss of position 1 – major consequences.

ƒ

Loss of position 2 – minor consequences.

ƒ

Lost time incidents – loss of redundancy or downtime, but no loss of position.

The incidents reported by IMCA in 2000 and 2001 were categorized as follows: Loss of position 1

Loss of position 2

Lost time incidents

2000

36 (33%)

41 (37%)

33 (30%)

2001

21 (21%)

34 (35%)

43 (44%)

Computers were according to IMCA [10,11] the primary cause in 19% of the incidents in 2000 and 33% in 2001 among Loss of position 1 incidents. The corresponding numbers for reference systems were 19% and 14%, thrusters caused 8% and 14%, while generators caused 14% and 10%, respectively. Figure 1 shows trends for the primary causes of incidents in the 12 year period 1990-2001. The numbers are in percentage out of total reported incidents (Loss of position 1 & 2, and Lost time incidents) the respective years. The two main causes are “Computers” and “Reference systems”, where computers seem to have the most increasing trend. Another trend is that operators are blamed less for incidents at the expense of technology or the environment. Perhaps an operator is not always capable of executing a preventive action, even if an alarm or warning has been issued. Consequently, the technology must do more, if possible, with the result of more software and thereby increased probability of SW-related failures. Nevertheless, the

The Software Problem ++

Page 10

importance of a well-designed human-machine interface (HMI), work station layout, and operational procedures are crucial to ensure proper failure handling.

Figure 1: Incident trends 1990-2001 for different primary cause failure groups. “Computers” and “Reference systems” are main causes. Operator errors seems to be decrasing.

2.1 Examples of IMCA reported incidents The following are typical SW-related incidents; see [10,11]. Note that when problems are said to be caused by the DP SW, this does not necessarily mean that a traditional SW failure (bug) is encountered. Rather it is often expected that the SW should handle the condition/situation. No. 1.1 [10, Case 0002]: Loss of position 1 Events: ƒ Vessel on DP, moving to other side of platform, 2 DGPS + TW online ƒ Position move executed ƒ Heading change input ƒ DP “current” increasing, heading overshoot ƒ Position out-of-limits ƒ Yellow alert, TW at limit ƒ Divers recovered Comment: The vessel was unable to make the heading change as expected and manual control was required to clear the platform. There was no fast current update facility, but there might have been thruster tuning problems. Main cause: Secondary cause: Computer (DP software) Insufficient testing, Commissioning/QA

The Software Problem ++

Page 11

No. 1.2 [10, Case 0003]: Loss of position 1 Events: ƒ Vessel on DP 1300m, drilling operations, 2 DGPS + 2 HPR online ƒ 3+3 Diesel generators online, bus tie open ƒ Insufficient thrust alarm ƒ All thrusters not ready, yellow alert, manual control selected ƒ PLC comms error alarm ƒ Standby boat called, 24m drift off Comment: This is the first stage in a series of problems over a period of several hours. DP was restored and the yellow alert cancelled. The root cause is unclear but the PLCs and the DP-control software were prime suspects for causing the thrusters to drop off the DP desk. Main cause: Computer (DP software)

Secondary cause: Computer (DP hardware)

No. 1.3 [10, Case 0004]: Loss of position 1 Events: ƒ Vessel on DP 1300m, drilling operations, 2 HPR online ƒ 3+3 Diesel generators online, bus tie closed, 4 thrusters online ƒ One thruster trips ƒ Drift off ƒ Thruster restored 6m but trips ƒ All thrusters lost to DP ƒ Manual mode, yellow alert, excursion increasing ƒ Red alert and Emergency disconnect Comment: The problem was initially with one azimuth thruster; it was lost and restored many times. When all thrusters were lost there was some hope of manual recovery but this was not successful and the red alert was given when the position loss reached 50m. Main cause: Computer (DP software)

The Software Problem ++

Secondary cause: Computer (DP hardware)

Page 12

No. 1.4 [10, Case 0007]: Loss of position 1 Events: ƒ Vessel on DP in 830m, recovering lifting floats from pipeline, 2DGPS + HPR online ƒ Over manifold and

Suggest Documents