Model Checking Safety Properties of Servo-Loop Control Systems

M. Edwin Johnson
ITT Industries, Advanced Engineering & Sciences
[email protected]

Abstract

This paper presents the experience of using a symbolic model checker to check the safety properties of a servo-loop control system. Symbolic model checking has been shown to be beneficial when the system under analysis can be modeled as a finite state machine. Servo-loop control systems are typically represented by differential equations (Laplace transforms), not as finite state machines. However, the control loop is only a part of the software system needed to operate the system properly and safely. This paper first validates the safety of the servo loop using control theory and simulation. Then, a simple state model of the servo loop is combined with the state model of the entire system. This model is entered into a model checker (SMV) along with safety predicates, and the model checker is used to validate the predicates. The paper shows via a concrete example that safety issues can be discovered and defined for control systems using a model checker. Furthermore, it demonstrates that effective hazard analysis may require multiple techniques.

1. Introduction

This paper describes the modeling and analysis of the software for an antenna tracking system. Real-time, event-driven systems are not obvious candidates for finite state modeling. At first glance, their continuous range of operation does not appear to map to a manageable set of states. However, by viewing the system from a single perspective, in this case safety, discrete states can be identified. The critical safety aspect is how fast the antenna is moving near a physical limit. This paper provides an example of how a model checker, combined with traditional methods, can analyze the safety of a complex system. The technique requires identifying states from the perspective of this aspect rather than from the system’s general operation.

2. Safety

The term safety as used in this paper follows Leveson’s [LEV95] definition: the freedom from accidents or losses. Accidents, according to Leveson, result from hazards. Leveson defines a hazard as a state or set of conditions of a system that, together with other conditions in the environment of the system, will lead inevitably to an accident. Tracking antenna systems use very powerful motors that can damage or destroy the gears, pedestal, motors, cabling, dish, and all other system components. The objective is to verify safety through the management of hazards. Antenna system domain experts know many common hazards. The design created during the specification process can create additional hazards.

3. Methodology

The technique employed is to describe the system to extract the hazards, architect a solution that addresses the hazards, test for them (by analysis), and refine the solution until the models show the desired response. Different hazards will require different models. In this paper, both a control model and a symbolic model are used. Hazard detection and prevention should map directly to a software artifact. Often this can be packaged in safety kernels. Rushby [RUS89] defines a safety kernel as a relatively small and simple component that guarantees safety.

4. Tracking Antenna Systems

Tracking antenna systems are used to communicate continuously with spacecraft that change position relative to the antenna. Antennas move in two axes: azimuth and elevation. These axes are analogous to the angles of a 3D polar coordinate system. The antenna is positioned by a motor and gear set for each axis. The motors are powered by a motor control system employing a negative feedback SCR amplifier. The motor control system translates an input voltage into a constant angular rate. Feedback is provided by a tachometer.

A significant hazard that this system possesses is called “open loop.” In the event that the feedback from the tachometer becomes disconnected or broken, the negative input of the amplifier will float. The result is that the motor will be driven at full speed until something catastrophic happens to stop it.

The range of motion of each axis is divided into four ranges: operating, limit, emergency limit, and stop (Figure 1). There is a set of hardware interlocks to guard against leaving the operating range. These interlocks are usually ineffective in preventing damage to the antenna in an open loop situation. For conditions other than open loop, they work well in preventing severe damage to the antenna. However, they do not mitigate all hazards.

0-7695-1597-5/02/$1700 (C)2002 IEEE

[Figure 1: diagram of one axis. Moving from either end toward the center, the regions are Stop, Emergency Limit, Limit, and the central Operating Range.]
Figure 1 Relationship of limit switches and range of movement for each axis.

The antenna utilizes anti-lash drive gears. These are two sets of gear plates connected with very strong springs along a common axis. The anti-lash gears prevent any play or slop in the drive train. These gears can be damaged over time (along with other components of the system) by oscillations in the reference voltage that result in the antenna “bouncing.”

The control implementation for motor positioning utilizes a servo loop. A servo loop computes the error between the desired position and the actual position and produces an output to eliminate the error. For the tracking system, the desired position is an azimuth and elevation pair. The actual position is read from the antenna. The servo loop computes the error and produces an appropriate reference voltage (Figure 3). This loop is implemented in software, and all indicators, controls, and status are available to the software.

The interlocks are two sets of switches at either extreme of each axis. The first switch is referred to as the limit switch. It prevents the motor control system from applying power to drive the motor further into limits (Figure 2), effectively applying an inductive brake. It also sets a software-accessible flag. Upon entering a limit, the antenna must be moved several degrees back into the operational range before the limit indicator is off. No damage to the system occurs as a result of being in limits. However, it is good practice to operate with sufficient safety margin by staying out of limits.

The next set of switches, the Emergency Limits, shuts off all power to both motors and the SCR amplifiers and causes brakes to be applied to stop the antenna. This situation is exited only by manual intervention. The brakes must be disabled manually and the antenna manually cranked out of limits. To prevent the antenna from becoming active once it is returned to the operational range, the maintenance worker can place the antenna in Safe mode via the Run/Safe toggle switch. If she/he fails to perform this step, the antenna can become immediately active, resulting in severe injury or death.
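The limit indicator behaviour described above, where the flag clears only after the antenna has backed several degrees out of the limit, is a hysteresis latch. A minimal sketch follows. The class name, the single upper-limit switch, and the two angles are illustrative assumptions; the paper gives no numeric values.

```python
from dataclasses import dataclass


@dataclass
class LimitIndicator:
    """Software-visible limit flag with hysteresis (upper end of one axis)."""

    limit_deg: float    # assumed angle at which the limit switch trips
    release_deg: float  # assumed angle the antenna must back off to
    active: bool = False

    def update(self, position_deg: float) -> bool:
        """Return the flag value after observing a new position sample."""
        if position_deg >= self.limit_deg:
            self.active = True
        elif position_deg <= self.release_deg:
            self.active = False
        # Between release_deg and limit_deg the flag keeps its previous value.
        return self.active


# Hypothetical values: trip at 85 degrees, clear at 80 degrees.
indicator = LimitIndicator(limit_deg=85.0, release_deg=80.0)
trace = [indicator.update(p) for p in (70, 84, 85, 83, 81, 80, 84)]
print(trace)
```

The band between the two angles is what later motivates separate "hist" and "edge" position states in the symbolic model: the same angle can correspond to a set or a cleared indicator depending on history.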

[Figure 3: block diagram. Labels recovered from the diagram: Tracking List Software; Shutdown Interface; Indicator Interface; On/Off Bit; Indicator Word; Digital Reference; Remote On/Off Signal; Limits, Emergency Limits, Run/Safe, Brakes; D/A; Reference Voltage; Antenna System; Synchro Encoded Position; S/D; Digital Position.]
Figure 3 System block diagram.

5. Hazards

From the description of the tracking antenna system, we can create the list of hazards shown in Table 1. As Leveson [LEV95] points out, safety should be analyzed at a system and environment level. To borrow a phrase from the environmentalist movement, we plan to “think globally and act locally.” Hazards will be analyzed at the highest levels, but protection will be placed at the lowest appropriate level.

[Figure 2: schematic of the limit switches for one axis. Labels: from amp, to motor, Down Limit, Up Limit.]
Figure 2 Limit switches operating positions.

Throwing the run/safe switch to safe removes all power from the motors, SCR amplifier, and brake solenoids. If the motor is moving at a high rate, it comes to an abrupt halt, significantly stressing the mechanical components. Furthermore, when the switch is returned to the run position, a high reference voltage on the SCR amplifier will cause the antenna to accelerate very rapidly, stressing its mechanical components.

6. Software Architecture

The static UML object model for the entire system is shown in Figure 4. Hardware objects are denoted with the stereotype “hardware”; the remaining objects are software. Each of these software objects will be modeled and analyzed for safety against the hazards list. This analysis will determine the specific functionality required in each object. The behavior of each object is also modeled as a finite state machine. Dynamic system behavior is modeled through use case and collaboration diagrams. Both the static and dynamic models are updated based on analysis.


Table 1 System hazards

ID   Description
1    Open Loop. The antenna can reach unsafe speeds if the feedback line is disconnected.
2    Stops. If the antenna strikes the antenna stops, severe damage may occur.
3    Emergency Limits. If the antenna enters emergency limits, the brakes force a sudden stop of the antenna and possible damage may result.
4    Limits. The operational safety margin of operating the system is depleted.
5    Run-to-safe transition. The operator turns off the antenna system, possibly causing a large stress on the mechanical components due to the sudden stop.
6    Safe-to-run transition. The operator turns on the antenna system, possibly causing a large stress on the mechanical components due to the sudden acceleration.
7    Emergency Limit Exit. Maintenance personnel remove an emergency limit situation from the system without properly engaging the run/safe switch. The potential rapid movement of the antenna could kill the maintenance personnel.
8    Oscillation. An oscillating antenna will eventually damage mechanical components.
9    Acceleration. Any condition that causes the rapid acceleration of the antenna.
10   Deceleration. Any condition that causes the rapid deceleration of the antenna.

Figure 4 System object model

7. Control Theory and Simulation

As described by Dorf [DOR81], dynamic control systems can be represented by a set of simultaneous differential equations. Through the use of the Laplace transform, the problem is reduced to a set of linear algebraic equations. This is done by using a block diagram of the system. Each block contains the transfer equation that relates its inputs to its outputs. The various pieces of the control circuit are assembled in a block diagram. Analysis is then performed on this model. The first block is the Antenna System. For a linearized torque-speed control system, the Laplace transform for the transfer function is

$$\frac{\omega(s)}{V_m(s)} = \frac{G_m}{\tau_m s + 1} \qquad \text{(Eq. 1)}$$

where
$\omega(s)$ := the angular velocity
$V_m(s)$ := the input voltage
$G_m$ := gain of the motor control
$\tau_m$ := time constant of the motor control

and

$$s = \frac{d}{dt} \qquad \text{(Eq. 2)}$$

$$\frac{1}{s} = \int_{0^+}^{t} dt \qquad \text{(Eq. 3)}$$

The antenna system’s output, as viewed from the servo loop, is actually position rather than rate. Applying the integrator $1/s$ (see Eq. 3) to Eq. 1 therefore gives

$$\frac{\theta(s)}{V_m(s)} = \frac{G_m}{s(\tau_m s + 1)} \qquad \text{(Eq. 4)}$$

where $\theta(s)$ := the antenna position.

Figure 5 shows a very simple model. The difference between the desired position $\theta_d(s)$ and the actual position $\theta(s)$ is computed as $E(s)$. $E(s)$ is then used to drive the antenna system. $E(s)$ is truncated to fall within the minimum and maximum input range.

It is time to revisit our hazard list, specifically items 4, 8, 9, and 10: limits, oscillation, acceleration, and deceleration, respectively. An overshoot near the limits could cause the antenna to enter limits and possibly emergency limits. The simple servo does not provide any control over acceleration and velocity.
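To see why the simple loop can overshoot, the following sketch closes the loop around Eq. 4. It assumes unity position feedback and that $E(s)$ stays inside its limits, so the limiter acts as a gain of one. The symbols $\omega_n$ and $\zeta$ are the usual second-order natural frequency and damping ratio; they are introduced here for illustration and are not used in the paper.

```latex
\[
\frac{\theta(s)}{\theta_d(s)}
  = \frac{\dfrac{G_m}{s(\tau_m s + 1)}}{1 + \dfrac{G_m}{s(\tau_m s + 1)}}
  = \frac{G_m}{\tau_m s^2 + s + G_m}
  = \frac{G_m/\tau_m}{s^2 + \dfrac{1}{\tau_m}\,s + \dfrac{G_m}{\tau_m}}
\]
% Match against the standard form s^2 + 2*zeta*omega_n*s + omega_n^2:
\[
\omega_n = \sqrt{\frac{G_m}{\tau_m}}, \qquad
\zeta = \frac{1}{2\sqrt{G_m \tau_m}}
\]
% The step response overshoots when the loop is underdamped:
\[
\zeta < 1 \iff G_m \tau_m > \tfrac{1}{4}
\]
```

Under these assumptions, raising the motor gain or having a slower motor time constant lowers the damping ratio, which is consistent with the concern that an overshoot near a limit could carry the antenna into it.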

[Figure 5: block diagram. The summing junction subtracts the fed-back position from θd(s) to form E(s), which is limited to EL < E(s) < EH and drives the antenna system, Gm/(τm s+1) followed by 1/s, to produce θ(s). Other labels: L(s), P(s).]
Figure 5 Simple servo block model

To regulate acceleration, we need to put a block between $E(s)$ and the antenna system in Figure 5. The typical control item used here is an integrator with a tunable time constant. The equation is

$$\frac{V_d(s)}{E(s)} = \frac{G_d}{\tau_d s + 1} \qquad \text{(Eq. 5)}$$

where
$G_d$ := drive gain (usually 1)
$\tau_d$ := drive time constant ($> 3\tau_m$)

To regulate overshoot, we need to add an additional feedback to slow the antenna as it arrives close to its final position. The magnitude of the feedback is a function of the rate of the antenna and the overall system time constant. By setting $\tau_d > 3\tau_m$, the effects of $\tau_m$ can be ignored. Differentiating the position and setting the gain to $\tau_d$ produces the overshoot feedback protection transfer function:

$$\frac{F(s)}{\theta(s)} = \tau_d s \qquad \text{(Eq. 6)}$$

To regulate velocity, we will add rate control. The rate control will monitor the position and compute the rate. If the rate exceeds a certain threshold, the offending value of $E(s)$ will be saved and $E(s)$ will not be allowed to exceed that value. In the event the rate is exceeded, $E(s)$ will be set to zero.

The final loop using control transfer blocks is shown in Figure 7. The simulation of this control loop, along with the simple loop, is shown in Figure 6. These demonstrate the effectiveness of the hazard mitigation. Furthermore, if the control loop equation is solved using typical values, there are only positive coefficients in the denominator; therefore, by the Routh-Hurwitz theorem, there are no positive roots. This means the system is stable.

8. Symbolic Model

This experiment encodes a finite state model into the model checker SMV [CLA00]. The paper follows the technique of Ammann [AMM01] and Chan [CAB+98]. This technique models continuous operation by partitioning it into discrete critical states. This partitioning is the art of applying this analysis. The critical states are determined by understanding the hazards. For instance, accidents can occur if the antenna is moving too fast near a physical stop. After the model is entered into SMV, it is analyzed by constructing a set of safety predicates. The model checker analyzes the state space to find counterexamples where the predicate is violated. For each violation, a new software artifact is designed and modeled to mitigate the hazard. The model checker is executed, and the process repeats.
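The partitioning step can be illustrated with a small sketch that maps a continuous axis angle onto the discrete position states this paper lists in Table 4. The state names and their integer values come from that table. The boundary angles are purely illustrative assumptions, since the paper does not give numeric thresholds.

```python
from bisect import bisect_right

# Position states in Table 4 order, values -5 .. 5.
POSITION_STATES = [
    "low_stop", "low_emlim", "low_lim", "low_edge", "low_hist",
    "operate",
    "up_hist", "up_edge", "up_lim", "up_emlim", "up_stop",
]

# Hypothetical boundaries in degrees between adjacent states (ascending).
BOUNDARIES_DEG = [-90.0, -88.0, -85.0, -80.0, -75.0,
                  75.0, 80.0, 85.0, 88.0, 90.0]


def position_state(angle_deg):
    """Return (value, name) of the discrete state containing angle_deg."""
    # The number of boundaries at or below the angle is the state's index.
    index = bisect_right(BOUNDARIES_DEG, angle_deg)
    return index - 5, POSITION_STATES[index]


for angle in (0.0, 82.0, -86.0):
    print(angle, position_state(angle))
```

The point of the sketch is that only the safety-relevant distinctions survive: every angle inside the operating range collapses to one state, while the narrow bands near the limits each get their own.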

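[Figure 6: plot of position (degrees, axis 0 to 25) against time (seconds, axis 0 to 8) with two curves, Uncompensated Loop and Compensated Loop.]
Figure 6 Servo loop simulated response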
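A response of the kind plotted in Figure 6 can be approximated with a short time-domain simulation of the Section 7 loops. This is a sketch under stated assumptions, not the author's simulation: the gains, time constants, limits, step size, and the placement of the limiter in the compensated loop are all hypothetical, and simple Euler integration is used.

```python
def simulate(compensated, theta_d=10.0, t_end=8.0, dt=0.001,
             g_m=10.0, tau_m=0.5, g_d=1.0, tau_d=2.0,
             e_limit=20.0, w_max=8.0):
    """Integrate a position step to theta_d; return the position trace.

    All default parameter values are assumptions chosen for illustration.
    """
    theta = omega = drive = 0.0
    trace = []
    for _ in range(round(t_end / dt)):
        if compensated:
            # Position error minus overshoot feedback tau_d * rate (Eq. 6).
            error = theta_d - theta - tau_d * omega
            # Rate control: zero the error while the rate is over threshold.
            if abs(omega) > w_max:
                error = 0.0
            error = max(-e_limit, min(e_limit, error))
            # Acceleration control: first-order lag G_d / (tau_d s + 1), Eq. 5.
            drive += dt * (g_d * error - drive) / tau_d
            v_m = drive
        else:
            # Simple loop: limited position error drives the antenna directly.
            v_m = max(-e_limit, min(e_limit, theta_d - theta))
        # Motor control G_m / (tau_m s + 1), Eq. 1, then the integrator 1/s.
        omega += dt * (g_m * v_m - omega) / tau_m
        theta += dt * omega
        trace.append(theta)
    return trace


uncompensated = simulate(compensated=False)
compensated = simulate(compensated=True)
print(f"peak, simple loop:      {max(uncompensated):.2f} degrees")
print(f"peak, compensated loop: {max(compensated):.2f} degrees")
```

With these assumed values the simple loop is lightly damped and overshoots the commanded position, while the compensated loop approaches it from below. The threshold `w_max` is set high enough that the rate gate is not expected to act on this 10-degree step; lowering it exercises the rate-control branch.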

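[Figure 7: block diagram. Blocks: a differentiator s producing the rate w(s) from the position; Overshoot Compensation (gain τd applied to w(s)); Rate Control (|w(s)| > Wm : 0, else : 1); Acceleration Control Gd/(τd s+1) with output D(s); Antenna System Gm/(τm s+1) followed by 1/s with output θ(s). Signals: θd(s), L(s), E(s), P(s).]
Figure 7 Control loop with acceleration, rate, and overshoot compensation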
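The stability argument made in Section 7 for this loop can be worked through explicitly for its linear part. The sketch below assumes the rate-control switch is passing the signal and the $E(s)$ limiter is not saturated (both are nonlinear and are ignored here), unity position feedback, and the overshoot feedback of Eq. 6 subtracted at the summing junction. The shorthand $K = G_d G_m$ is introduced only for this derivation.

```latex
% Forward path (Eq. 5 followed by Eq. 4) and feedback path (unity + Eq. 6):
\[
G(s) = \frac{G_d}{\tau_d s + 1}\cdot\frac{G_m}{s(\tau_m s + 1)},
\qquad
H(s) = 1 + \tau_d s
\]
% Closed loop, with K = G_d G_m:
\[
\frac{\theta(s)}{\theta_d(s)} = \frac{G(s)}{1 + G(s)H(s)}
  = \frac{K}{s(\tau_d s + 1)(\tau_m s + 1) + K(\tau_d s + 1)}
  = \frac{K}{(\tau_d s + 1)\,(\tau_m s^2 + s + K)}
\]
% Expanded denominator:
\[
\tau_d \tau_m s^3 + (\tau_d + \tau_m)\,s^2 + (1 + \tau_d K)\,s + K
\]
% Routh-Hurwitz for a cubic a_3 s^3 + a_2 s^2 + a_1 s + a_0:
% all a_i > 0 and a_2 a_1 > a_3 a_0.
\[
(\tau_d + \tau_m)(1 + \tau_d K)
  = \tau_d + \tau_m + \tau_d^2 K + \tau_d \tau_m K
  \;>\; \tau_d \tau_m K
\]
```

For a cubic, positive coefficients alone are not sufficient; the additional condition $a_2 a_1 > a_3 a_0$ is also required, and it holds here for any positive $\tau_d$, $\tau_m$, and $K$. The factored form shows the same result directly: one pole at $s = -1/\tau_d$ and two from a quadratic with positive coefficients, all in the left half plane.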


8.1. SMV Model

Space prohibits showing the details of every modeled component. Therefore, examples of different modeling and analysis techniques are shown.

Like the real system, the model has a clock. In the real system, the software periodically samples the position of the antenna and the status indicators and provides updates to the drive level. For SMV, the clock is a little different: it provides synchronization between the modeled components. Both for evaluating counterexamples and for modeling the antenna system, it is beneficial to have each component active on a different clock cycle. The clock is a state machine in a sequential loop of states getrate, move, and sample. The servo is allowed to change during getrate. The angular velocity is updated during move. The antenna position is updated during sample.

The servo is modeled as a random function. This allows the safety of the system to be analyzed for all servo outputs in any sequence. If the system is shown to be safe for a rogue servo, safety is shown for normal operations. This is an important and powerful modeling aspect. The safety of the servo itself has already been validated in Section 7. The output is defined to be an integer from -4 to +4. Each value is also assigned a name to make the SMV spec more readable. The SMV variable for the servo output is “Vi.” See Table 2.

Table 2 Servo output states

Value   Name
-4      negVhigh
-3      negHigh
-2      negMed
-1      negLow
0       zeroV
1       posLow
2       posMed
3       posHigh
4       posVhigh

The applicable SMV statements are shown below.

    VAR
      Vi : -4..4;
    ASSIGN
      -- any servo output is a possible initial value
      init(Vi) := -4..4;
      next(Vi) := case
        clock != move : Vi;      -- hold the value outside the move phase
        1             : -4..4;   -- otherwise choose any output
      esac;

The angular velocity is dependent on both the servo output and the previous state of the angular velocity. Changes to angular velocity are tied to physical constraints. The rate states are shown in Table 3. The rate transition rules captured in SMV (not shown) restrict the change in angular rate. For instance, the rate can only change in the direction of the servo output and change by a value of 2 in a clock cycle. In the event of an open loop condition, the value can change by 3 during a clock cycle.

Table 3 Angular rate values

Value   Variable
-6      neg_open_loop
-5      neg_way_fast
-4      neg_too_fast
-3      neg_fast
-2      neg_med
-1      neg_slow
0       zero_rate
1       pos_slow
2       pos_med
3       pos_fast
4       pos_too_fast
5       pos_way_fast
6       pos_open_loop

The position of the antenna is a function of its previous position and the angular rate. The range of positions is shown in Table 4 and includes the states of Figure 1. States are added to capture the critical condition of being close to limits (low_edge and up_edge) as well as the states needed to remove a limit condition (low_hist and up_hist).

Table 4 Antenna position

Value   Variable
-5      low_stop
-4      low_emlim
-3      low_lim
-2      low_edge
-1      low_hist
0       operate
1       up_hist
2       up_edge
3       up_lim
4       up_emlim
5       up_stop
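Because the rate transition rules are not shown in the paper, the following sketch gives only one possible reading of them, alongside the three-phase clock. It assumes the rate may move by up to 2 levels per cycle (3 in an open loop condition) in the direction given by the sign of the servo output, and stays within the Table 3 range of -6 to 6. The function names are hypothetical.

```python
CLOCK = ("getrate", "move", "sample")
RATE_MIN, RATE_MAX = -6, 6  # range of the Table 3 rate values


def next_clock(phase):
    """Advance the clock around its fixed getrate -> move -> sample loop."""
    return CLOCK[(CLOCK.index(phase) + 1) % len(CLOCK)]


def next_rates(rate, vi, open_loop=False):
    """Set of rates reachable in one clock cycle under the assumed rule."""
    max_step = 3 if open_loop else 2
    # Sign of the servo output: -1, 0, or +1.
    direction = (vi > 0) - (vi < 0)
    candidates = {rate + direction * step for step in range(max_step + 1)}
    return {r for r in candidates if RATE_MIN <= r <= RATE_MAX}


print(next_clock("sample"))
print(sorted(next_rates(0, 3)))
print(sorted(next_rates(5, 4, open_loop=True)))
```

Returning a set rather than a single value mirrors how a nondeterministic SMV assignment offers the model checker every allowed successor, so that all of them are explored.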

After the antenna system is modeled, the protection software is modeled. The protection software detects a hazardous condition and then takes appropriate corrective action. The SMV spec for the Open Loop & Off Detector is shown below.

    VAR
      remoteOnOff : { on, off };
    ASSIGN
      init(remoteOnOff) := on;
      next(remoteOnOff) := case
        clock != sample     : remoteOnOff;  -- only re-evaluate on the sample phase
        power = off         : off;
        rate < neg_too_fast : off;
        rate > pos_too_fast : off;
        position < low_lim  : off;
        position > up_lim   : off;
        1                   : remoteOnOff;  -- otherwise keep the current value
      esac;

This spec turns off the antenna drive system if the rate is too fast, if the position is beyond a limit, or if power is off. Similar specs are developed for each protection software object.
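To make the first-match semantics of the SMV case expression concrete, the detector's next-state rule can be mirrored as an ordinary function. This is a sketch, not part of the paper's model: the function name is hypothetical, and rate and position are passed as the integer values from Tables 3 and 4.

```python
NEG_TOO_FAST, POS_TOO_FAST = -4, 4  # Table 3 values
LOW_LIM, UP_LIM = -3, 3             # Table 4 values


def next_remote_on_off(clock, remote_on_off, power, rate, position):
    """Next value of remoteOnOff; branches are tried in order, first match wins."""
    if clock != "sample":
        return remote_on_off
    if power == "off":
        return "off"
    if rate < NEG_TOO_FAST or rate > POS_TOO_FAST:
        return "off"
    if position < LOW_LIM or position > UP_LIM:
        return "off"
    return remote_on_off


print(next_remote_on_off("sample", "on", "on", rate=5, position=0))
print(next_remote_on_off("move", "on", "on", rate=5, position=0))
```

Every branch returns either "off" or the current value, so once the detector has switched off it stays off. The comparisons are strict, so a rate of exactly pos_too_fast or a position of exactly up_lim does not trip it.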

8.2. Safety Predicates

A safety predicate is constructed for each hazard analyzed. The model is then executed to determine whether counterexamples exist that violate the predicate. The following specs check for the open loop condition. The first specifies that the rate should never hit 100%. The second specifies that if the rate is ever exceeded, the antenna is disabled.

    SPEC AG !( rate = neg_way_fast )
    SPEC AG ( ( rate > pos_too_fast | rate < neg_too_fast ) -> AF ( power = off ) )
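What checking an AG predicate involves can be sketched with an explicit-state search. SMV works symbolically with BDDs; the breadth-first search below is a swapped-in, much simpler technique that answers the same question on a toy model: is a violating state reachable, and if so by what path? The toy transition rules, the `trip_at` threshold, and all names are assumptions made for illustration, not the paper's model.

```python
from collections import deque


def check_invariant(initial_states, successors, invariant):
    """Return None if invariant holds in every reachable state,
    otherwise a shortest path (counterexample) to a violating state."""
    parents = {state: None for state in initial_states}
    queue = deque(initial_states)
    while queue:
        state = queue.popleft()
        if not invariant(state):
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parents:
                parents[nxt] = state
                queue.append(nxt)
    return None


def make_successors(trip_at, max_step=2):
    """Toy plant: state is (rate, power); a rogue servo may move the rate
    by up to max_step per cycle; protection reacts to the current rate."""
    def successors(state):
        rate, power = state
        if power == "off":
            # Drive removed: the rate decays one level toward zero.
            return [(rate - (rate > 0) + (rate < 0), "off")]
        next_power = "off" if abs(rate) >= trip_at else "on"
        return [(max(-6, min(6, rate + step)), next_power)
                for step in range(-max_step, max_step + 1)]
    return successors


def never_full_speed(state):
    return abs(state[0]) < 6


late = check_invariant([(0, "on")], make_successors(trip_at=5), never_full_speed)
early = check_invariant([(0, "on")], make_successors(trip_at=2), never_full_speed)
print("trip_at=5:", late)
print("trip_at=2:", early)
```

In this toy model a late trip threshold yields a counterexample path, while an earlier one lets the invariant hold. That is the workflow described above in miniature: a counterexample exposes a gap, the protection is redesigned, and the check is run again.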

9. Results

Each of the modeling techniques reveals safety concerns that are not captured by the other. The benefits of control theory and simulation modeling are well accepted for this type of system. The symbolic model checker proved to be extremely powerful in identifying the set of conditions that could lead to an accident. During analysis, the symbolic model checker identified the need for additional protection software models. For example, the rate limiter object was added to limit speed near a limit. Only a subset of the protection required is identified during a typical development activity. Only through rigorous models are the subtle safety issues identified. The counterexamples provide a set of requirements for the protection software.

For the antenna system, the SMV model executed in 260 seconds and required 521,211 Binary Decision Diagram (BDD) nodes and 10 MB of state space. The model consists of 300 statements and 11 safety predicates. However, as with many SMV models, seemingly small changes can result in a state space explosion. Therefore, the challenge is to find the minimum state mapping that captures the critical aspect of the system.

10. Conclusion

When analyzing the hazards of complex systems, several techniques may be required to adequately address all concerns. A system that seems too analog or too real-time may still yield worthwhile results under a symbolic model checker. This experiment shows that meaningful statements can be made about the implementation of a servo loop by symbolically model checking its environment. Without combining techniques, hidden or subtle hazards could go unnoticed.

11. Acknowledgements

This work was supported, in part, by the National Science Foundation under grant CCR-99-01030. I appreciate the guidance, review, and direction provided by Paul Ammann.

12. References

[AMM01] Paul Ammann, Wei Ding, and Daling Xu. Using a Model Checker to Test Safety Properties. In Proceedings ICECCS 2001: Seventh IEEE International Conference on Engineering of Complex Computer Systems, pages 212-221, Skovde, Sweden, June 2001.

[CAB+98] William Chan, Richard J. Anderson, Paul Beame, Steve Burns, Francesmary Modugno, and David Notkin. Model Checking Large Software Specifications. IEEE Transactions on Software Engineering, 24(7):498-520, July 1998.

[CLA00] E. M. Clarke, O. Grumberg, and D. Peled. Model Checking. MIT Press, Cambridge, Massachusetts, USA, 2000.

[DOR81] Richard C. Dorf. Modern Control Systems. Addison-Wesley, 1981.

[LEV95] Nancy G. Leveson. Safeware: System Safety and Computers. Addison-Wesley, 1995.

[RUS89] John Rushby. Kernels for safety? In Tom Anderson, editor, Safe and Secure Computing Systems, pages 210-220. Blackwell Scientific Publications, 1989. Proceedings of a Symposium held in Glasgow, UK, October 1986.
