Certifying Adaptive Flight Control Software
Vittorio Cortellessa, Bojan Cukic, Diego Del Gobbo, Ali Mili, Marcello Napolitano, Mark Shereshevsky, Harjinder Sandhu
Department of Computer Science and Electrical Engineering
Department of Mechanical and Aerospace Engineering
West Virginia University, Morgantown, WV 26506-6109
Abstract
As aircraft designs become more complex, automation has become an important factor in improving safety and reliability. Automated flight control systems can respond intelligently to faults when it is impractical for a human to take control quickly. In recent years, neural networks have been proposed for fault identification and accommodation within flight control schemes because they are well suited to non-linear, multi-variable systems. Because neural networks learn to associate various control actions with particular input data patterns, they avoid the need to explicitly program all the relevant fault situations. A major issue in the use of adaptive fault-tolerant flight control systems is certification. Current practice relies heavily on testing as a means of performing certification. This has two shortcomings. The first is high cost. The second is that certification by testing provides only a fairly limited guarantee. Certification based exclusively on testing is meaningless for new software technologies, such as adaptive controllers and neural networks that “learn” after deployment, since the system tested in the lab is not the system that is run in the field. In this paper, we formulate the research problems underlying the certification of adaptive systems. We also describe preliminary results of a certification case study, based on a methodology that combines formal methods and automated testing, and identify the areas of basic research needed to overcome the limitations observed in the proposed framework.
1 Introduction
As aircraft designs become more complex, automation has become an important factor in improving safety and reliability. The size and complexity of avionics software for commercial aircraft are growing rapidly. For example, the Boeing 777 and the Airbus A340 each required around 3 million lines of code. The Boeing 777 required six times more newly developed lines of code than older Boeing commercial aircraft [23]. As this trend continues, it presents twin challenges to the aerospace industry: how to improve the reliability and fault tolerance of automated avionics systems, and how to certify these systems as safe to fly. Certification is particularly important, as it has become a major cost factor.
The need for better fault tolerance increases as automation increases. Automated flight control systems need to be able to respond intelligently to faults when it is impractical for a human to take control quickly. For example, low-weight, unmanned aircraft can greatly increase survivability and reduce reliance on ground support through the use of fault-tolerant controllers. Similarly, for commercial and military aircraft, as avionics software becomes more complex, it becomes harder for a pilot to intervene successfully in the event of a failure of a flight control system component.
A particular challenge for fault-tolerant systems is to react correctly to sensor and actuator failures. In these cases, the failure lies outside the avionics system, and hence the controller needs to be able to accommodate the fault and adjust its control functions accordingly. The availability of such a feature for military and commercial aircraft can increase the survivability rate following sensor failures and/or loss of control authority on primary control surfaces. We distinguish two aspects of this problem:
1. Sensor failure detection, identification, and accommodation (SFDIA);
2. Actuator failure detection, identification, and accommodation (AFDIA).
Both schemes aim, in increasing order of importance, to reduce the handling-quality degradation associated with sensor/actuator failure, to lower the mission abort rate, and to lower the aircraft loss rate.
In recent years, neural networks have been proposed for fault identification and accommodation within flight control schemes because they are well suited to non-linear, multi-variable systems. Because neural networks learn to associate various control actions with particular input data patterns, they avoid the need to explicitly program all the relevant fault situations. In addition, neural networks offer some potential for parallel or distributed processing, and can be implemented in either hardware or software. Neural network-based SFDIA and AFDIA schemes have been investigated and successfully demonstrated via numerical simulations. The next step in this work is to demonstrate that such systems are stable and can be certified for flight readiness.
In addition to greater fault tolerance, flight certification is a major challenge for automated flight control systems. Current practices rely heavily on testing. For example, the FAA standard for avionics software uses the Modified Condition Decision Coverage (MC/DC) criterion, which is based on demonstrating, for each branch in the software, that each parameter can independently affect the result. While this is less demanding than full branch condition coverage, it presents a huge cost overhead for large avionics systems. Boeing estimates that 40% of the software development costs for the 777 were spent on testing (footnote 1). Hence, testing to this standard has become a major cost driver in the development of new aircraft. One consequence is that alternatives for flight software certification could enable a significant reduction in the cost of flight control software development.
At the same time, this form of testing has a number of important limitations. Firstly, it is accepted in software engineering that, for complex systems, testing can never demonstrate correctness; it can only be used to reveal errors (footnote 2).
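The independence requirement behind MC/DC, as described above, can be illustrated with a small sketch. It assumes that each "parameter" is a boolean condition in a decision. For each condition it searches for a pair of test vectors that differ only in that condition and produce different outcomes. The function name `independence_pairs` and the sample decision are hypothetical and serve only as an illustration.

```python
from itertools import product
from typing import Callable, Dict, Optional, Tuple

Vector = Tuple[bool, ...]


def independence_pairs(
    decision: Callable[..., bool], n: int
) -> Dict[int, Optional[Tuple[Vector, Vector]]]:
    """For each of the n conditions, find two vectors that differ only in
    that condition and yield different decision outcomes (None if no such
    pair exists, i.e. the condition can never affect the result)."""
    pairs: Dict[int, Optional[Tuple[Vector, Vector]]] = {}
    for i in range(n):
        pairs[i] = None
        for vec in product([False, True], repeat=n):
            if vec[i]:
                continue  # consider each unordered pair only once
            flipped = vec[:i] + (True,) + vec[i + 1:]
            if decision(*vec) != decision(*flipped):
                pairs[i] = (vec, flipped)
                break
    return pairs


def sample_decision(a: bool, b: bool, c: bool) -> bool:
    # Hypothetical branch condition, not taken from any avionics code.
    return a and (b or c)


pairs = independence_pairs(sample_decision, 3)
```

Even for this three-condition decision, every condition needs its own pair of test vectors, which hints at why the criterion becomes costly for large code bases.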
Secondly, criteria such as MC/DC are based on the structure of the code, and hence may not uncover problems associated with missing or incorrect requirements, nor can they uncover systemic problems arising from the interaction between components. Finally, these criteria are meaningless for new software technologies, including adaptive controllers and neural networks. For these types of system, alternative certification methodologies are needed, based on a combination of mathematical proof and testing to system and performance requirements.
Recently, we have embarked on a research project that has three main objectives:
- To develop neural networks with theoretically demonstrated stability and convergence of the on-line learning for use within flight control systems.
- To establish new certification methodologies for adaptive flight controllers.
Our intention is to establish the foundations for software certification of adaptive flight control systems.
In this paper, we describe the efforts undertaken towards establishing the foundations of a methodology for certification of adaptive flight control systems. Section 2 introduces the structure of a flight control system (FCS) and briefly discusses its analytical model. Section 3 outlines the shortcomings of the current software certification standard (RTCA/DO-178B) when the system to be certified is adaptive in nature. Section 4 describes the specific efforts undertaken so far: formalizing the requirements of an FCS using a formal methods framework, and the proposed automated testing environment. Section 5 provides a critique of the limitations of our current approach and outlines the near-term research directions. Avenues for further research are surveyed in Section 6.
Footnote 1: The total development cost for the B777 was $5 billion. Let us assume that half of this was software development cost; hence, roughly a billion dollars were spent on software testing.
Footnote 2: The quote is due to E. W. Dijkstra and can be found discussed in any good book on software engineering or testing. An example is [15].
2 System Description
Figure 1 shows the basic architecture of a Fly-By-Wire Flight Control System (FBW-FCS). In FBW technology, conventional mechanical controls are replaced by electronic devices coupled to a digital computer. The net result is a more efficient, easier to maneuver aircraft. Four subsystems form the core of such FCS's. The Measurement Subsystem (MS) consists of the Sensors and the Conditioning Electronics. It measures quantities that allow the state of the aircraft to be observed. The Actuator Subsystem (AS) consists of the Control Surfaces, the Power Control Units (PCU's), and the Engines. It produces the aerodynamic and thrust forces and moments by means of which the FCS controls the state of the aircraft. The Control Panel Subsystem contains all control devices and displays through which the pilot maneuvers the aircraft. The Flight Control Software subsystem (FCSw) includes all software components of the FCS. It interfaces to the hardware of the FCS through A/D and D/A cards (not shown in the figure). Current measurements, pilot inputs, and commands to the actuators are processed according to the Flight Control Law (FCL) to obtain the commands to the actuators at the next time step. Dashed blocks and arrows represent the system providing Analytical Redundancy based Fault Tolerant Capability (AR-FTC) to the FCS and will be described in the next section.
Components of an FCS can be grouped on the basis of the function that they perform with respect to the whole system. Such a classification cuts across the FCS and produces control systems known as Stability Augmentation
System (SAS), Control Augmentation System (CAS), and Autopilot. The SAS controls the rotational modes of the aircraft: short-period, roll, and Dutch roll. These modes are too fast for a pilot to control; hence, they need to be controlled automatically to achieve suitable aircraft responsiveness to pilot commands. The CAS provides additional control over the aircraft modes in order to obtain particular control functions, such as control over normal acceleration. The Autopilot provides pilot-relief functions such as automatic holding of altitude, velocity, pitch attitude, etc. From the perspective of safety, the SAS plays a more critical role than the CAS and Autopilot do. Hence, sensors belonging to the SAS have been labeled as primary sensors, as opposed to secondary sensors used within the CAS and Autopilot only.
Figure 1: Fly by Wire Flight Control System. [Block diagram not reproduced. Labeled blocks: Pilot, Cockpit control panel, FCSw/FTFCS (containing FCL and AR-FTC), PCU's, Control Surfaces, Engines, Airframe, Sensors (primary and secondary), Electronics. Labeled signals: input data, output data, motion cues.]
Any hardware or software fault within the FCS can compromise the safety of the aircraft. For this reason, FBW-FCS's must meet strict Fault Tolerance (FT) requirements. The standard solution adopted to achieve fault tolerance is physical redundancy. A typical multichannel architecture for the FCS consists of three virtually equivalent intercommunicating FCS's that are able to work independently. Intercommunication allows checking operative consistency among the channels, and hence detection and isolation of a faulty component. Independence allows the aircraft to operate safely with only one channel working properly. Besides additional cost and weight, a multichannel FCS adds another undesirable feature: complexity. A close look at the FCS of the FBW Boeing 777 [28] shows how complex the architecture of the FCS is. Physical separation and design diversity within redundancy do address the issue of common mode faults, but at the expense of excessive complexity. Complex systems seldom prove to be highly reliable, so it might be counterproductive to address fault tolerance by increasing the complexity of the system.
Furthermore, complexity is faced not only in the design phase but throughout the life cycle of the aircraft, including maintenance. Maintenance is de facto a source of faults and, in a multichannel FCS, it raises considerable concern. A recent study finds that maintenance-related accidents in the Air Force led to 79 fatalities and permanent injuries to nearly 200 people between 1972 and 1997. In addition to the increasing complexity of aircraft systems, there is an increasing number of incidents due to improper installation of parts.
In recent years, these factors have contributed to an increased interest in alternative approaches for enhancing FCS reliability. In fact, while redundancy is a must for fault tolerance, physical redundancy is not the only form of redundancy. In the past two decades, a variety of techniques based on Analytical Redundancy (AR) have been suggested for fault detection purposes in a number of applications [22]. The AR approach is based on the idea that the outputs of sensors measuring different but functionally related variables can be processed in order to detect a fault and identify the faulty component. Furthermore, preserved observability allows estimating the measurement of an isolated sensor, while preserved controllability allows controlling the system with an isolated actuator. As far as sensor and actuator faults are concerned, AR is a sufficient technique. Fault tolerance is achieved by means of software routines that process sensor outputs and actuator inputs to check for consistency with respect to the analytical model of the system. If an inconsistency is detected, the faulty component is isolated and the FCL is reconfigured accordingly. By introducing AR, it is possible to reduce the level of physical redundancy, thus cutting costs, weight and complexity. Physical redundancy would be required only where either post-failure system observability and controllability are not preserved or detection of the fault by means of AR is not feasible in the first place.
Once again, fault tolerance is achieved at the expense of an increased complexity of the FCS. However, the complexity has moved from the hardware to the software of the FCS. Increased complexity of the FCS software should not preclude adopting AR, since both the computational and the design complexity of fault detection and accommodation algorithms are comparable to those of the FCL itself. Furthermore, the scope of the added complexity is limited to the design phase.
Application of AR in FCS's is not new. The very same airplane used to conduct research on FBW technology was also used as a testbed for an AR based fault detection algorithm [25]. The algorithm showed desirable performance during flight tests. However, poor robustness to modeling errors and the degree of modeling required hampered further development. In the past few years, a number of results have been obtained in the area of robust fault detection through AR [21]. Unknown-input observers, robust parity relations, adaptive modeling, and H∞ optimization are a few examples. While research has gone forward, a design methodology involving feasibility analysis, requirements specification, and certification of AR based fault tolerant control systems is still missing. Exploring strengths, weaknesses, the related degree of reduction of physical redundancy, and overall reliability is a fundamental step in the engineering process of such systems.
In order to design an AR-FTFCS, one needs an analytical model of the system. The dynamics of many systems can be described analytically in terms of a set of relations among their inputs, outputs, states, and state derivatives. These relations are nothing more than constraints imposed by the laws of mechanics, electronics, and thermodynamics upon system inputs, outputs, and their derivatives up to a certain order n typical of the system. States are a convenient set of variables that allow such relationships to be formulated as a set of first-order differential equations. The AR-FTFCS approach is based on the fact that system inputs and outputs are functionally related variables that, when properly processed, allow the detection of faults in the system. A variety of solutions have been considered in the technical literature and many issues related to their application have been addressed. Nevertheless, the practical implementation of analytical redundancy based fault tolerance schemes is still very limited. Fault tolerance algorithms are typically based on Kalman filtering, neural networks, and fuzzy logic techniques. Note that some of these techniques are adaptive in nature. The skepticism with which analytical redundancy is sometimes received cannot be fully removed until a trustworthy certification procedure for all the AR fault detection schemes becomes available.
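The first-order formulation mentioned above can be written generically as follows. The symbols x (state), u (input), y (output), f, h, the state estimate \hat{x}, and the residual r are notation assumed here for illustration; the text does not give its own notation or a specific residual.

```latex
\begin{align}
\dot{x}(t) &= f\bigl(x(t), u(t)\bigr), \\
y(t) &= h\bigl(x(t), u(t)\bigr), \\
r(t) &= y(t) - h\bigl(\hat{x}(t), u(t)\bigr).
\end{align}
```

Under this assumed formulation, r(t) stays near zero while measurements and commands agree with the model, and a persistent non-zero r(t) signals an inconsistency of the kind AR schemes exploit.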
2.1 AR-FTFCS
Here follows a brief illustration of how an AR-FTFCS works. At each instant of time, new measurements originating as sensor readings are available as software data. The data are not independent. By processing sensor measurement and actuator command histories, it is possible to check whether the relations modeling the state of the aircraft are satisfied. If a fault within the hardware loop produces an inconsistency with respect to the analytical model, the system is said to hold AR properties allowing detection of the fault.
After detection of the fault, it is necessary to identify which component has failed. Each component of the FCS plays a different role. Hence, the distortion affecting these relations at the occurrence of a failure is characterized by the failed component and by the fault mode. By processing sensor measurements and actuator command histories, it is possible to locate the source of the distortion. If a fault within the hardware loop produces a distinct signature in terms of commands/measurements, the system is said to hold AR properties allowing identification of the fault.
Once the faulty component has been identified, the fault needs to be accommodated. Accommodation can be carried out at the software level because the flight control algorithm is not unique. Some control algorithms do not use all of the sensors and/or actuators available. Hence, if a hardware component of the FCS fails, safety can be preserved by switching to a control algorithm that does not employ that component. If such an algorithm exists, the system is said to hold AR allowing accommodation of the fault.
It is important to point out the orthogonality between fault tolerance as conceived in AR terms and as conceived in a multichannel solution. In a multichannel solution, the structure of the FCS is frozen and fault tolerance is achieved by guaranteeing, by means of physical redundancy, the availability of each component needed in the FCL. The structure of an AR-FTFCS, on the other hand, is flexible. The FCL is chosen as a function of the available hardware resources so as to meet safety requirements despite hardware deficiencies.
Another interesting observation relates to the type of faults that can be addressed by means of AR. We briefly explained how a system featuring AR properties can be made fault tolerant with respect to sensor and actuator faults at the software level. However, software, along with its supporting hardware (computers, data buses, etc.) and hardware systems other than sensors and actuators, can fail as well. AR cannot provide fault tolerance with respect to failures of such components, and a different approach must be adopted. Finally, it is worth noting that although AR is exploited at the software level, it is strongly tied to the specific hardware system.
Hence, the degree of AR provided by the system depends on the set of actuators, on the set of sensors, on the dynamics of the aircraft, and on the evolution of the aircraft state. Since the evolution of the aircraft state is governed by the FCL and the imposed maneuver, the FCL and the aircraft operational flight conditions also play a role in determining the system's degree of AR. On the other hand, this implies that a system can be designed, and eventually operated, to feature desirable AR properties.
This study considers the aerodynamic model of an F-16. A detailed non-linear model of the dynamics of this airplane is presented in [24]. In this phase of the study, we focus our attention on sensor faults only. Hence, we assume that none of the components of the other subsystems are subject to failure. More specifically, we require fault tolerance with respect to failure of the roll, pitch, and yaw rate gyros. The rationale is that such sensors are among the primary sensors used within the SAS and are necessary to satisfy the safety requirements. To simplify our analysis, we neglect multiple failures and transient failures. Fault modes have been derived from experimental data.
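The detection, identification, and accommodation steps described in this section can be sketched for the three rate gyros under the single-failure assumption. This is a minimal illustration, not the scheme used in the study. The analytical estimates are assumed to be supplied by some model-based or neural estimator, and the threshold value, the function name `sfdia_step`, and the `Validated` record are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional

THRESHOLD = 0.05  # rad/s; hypothetical detection threshold


@dataclass
class Validated:
    values: Dict[str, float]  # rates handed on to the flight control law
    failed: Optional[str]     # name of the isolated gyro, or None


def sfdia_step(measured: Dict[str, float],
               estimated: Dict[str, float],
               threshold: float = THRESHOLD) -> Validated:
    """One detection / identification / accommodation step."""
    # Detection: compare every measurement with its analytical estimate.
    residuals = {k: abs(measured[k] - estimated[k]) for k in measured}
    worst = max(residuals, key=residuals.get)
    if residuals[worst] <= threshold:
        return Validated(dict(measured), None)
    # Identification: with a single failure assumed, blame the largest residual.
    # Accommodation: replace the faulty reading by its estimate.
    validated = dict(measured)
    validated[worst] = estimated[worst]
    return Validated(validated, worst)


measured = {"roll": 0.10, "pitch": 0.02, "yaw": 0.90}
estimated = {"roll": 0.11, "pitch": 0.02, "yaw": 0.01}
result = sfdia_step(measured, estimated)
```

In this example the yaw gyro reading disagrees strongly with its estimate, so it is isolated and replaced, while the roll and pitch readings pass through unchanged.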
3 Certification Standards
Certification can be defined as an official assessment of equivalence between the specified and the actual service provided by the software and/or system. There are basically two ways of certifying an embedded software-based system: the indirect “process certification” and, in our opinion more suitable for adaptive systems, “product certification”. Process certification is an indirect form of certification. It works as follows:
- The certifier and the developer agree on a development methodology, the stringency of which is a function of the criticality of the software.
- This methodology is applied by the developer.
- The certifier checks that the methodology is effectively applied.
Process certification is the core of current standards for software certification. The Federal Aviation Administration (FAA) and other federal agencies, such as the Nuclear Regulatory Commission (NRC) and the Food and Drug Administration (FDA), have chosen to perform software certification using a technique similar to that used for certifying hardware. The basic message of the Radio Technical Commission for Aeronautics document RTCA/DO-178B [1], for example, is that “designers must take a disciplined approach to software: requirements definition, design, development, testing, configuration management, and documentation.” There are two observations leading to the conclusion that process certification is insufficiently rigorous for intelligent flight control systems:
1. General observation: Software engineering techniques for building and validating software for complex embedded systems, so that it adheres to stringent safety and reliability requirements, are the subject of permanent improvement and represent open research problems.
2. Specific observation: Intelligent flight control systems or, more generally, systems built using any soft computing paradigm, are adaptable; i.e., they change over time. While this is one of the basic reasons behind the technical appeal of soft computing platforms, it implies that process certification procedures are inadequate. In other words, the adequacy of the process does not imply the adequacy of the system, since the system will change over time.
For flight control systems, certification is a mandatory process that must be completed before commissioning the product. The benefits are evident. Our project has the goal of setting up the scientific foundations for regulatory evolution in the certification of intelligent flight control systems. The approach must be based on product certification. Product certification is defined as the direct assessment of the adequacy of the actual service versus the specified service. Software product certification comprises three steps:
1. Precise identification, at the specification level, of the functional characteristics of the software and of the non-functional attributes necessary for its intended use: reliability, maintainability, security, etc.
2. Measurement (quantitative) and examination (qualitative) of the same functional attributes at the specification, software (product) and service level.
3. Multi-dimensional assessment of the equivalence between the two sets of attributes.
For neural network based flight control schemes, such as SFDIA and AFDIA, theoretical foundations for product certification have yet to be established. The approach adopted in this paper is an early attempt to evaluate the limitations of a combination of techniques, including the formalization of requirements specifications, their rigorous validation, and techniques for proving the convergence and stability properties of neural networks. The long-term goal of this study is the definition of theoretically sound and practical procedures for demonstrating the equivalence between specification-level attributes and service-level attributes. The short-term goal, reported in this paper, is to investigate the applicability of different software verification methods to neural network based control systems. This research effort will, hopefully, lead towards the definition of credible and rigorous assessment and assurance methodologies for intelligent flight control systems.
3.1 Limitations of DO-178B
The basic principles governing the software life cycle processes for aircraft control systems, laid down in the RTCA/DO-178B document [1], were designed with the traditional (algorithmic) computing paradigm in mind. Adaptive (“non-programmed”) flight control software requires rethinking or modification of the certification procedures proposed in [1]. Here we briefly discuss certain aspects of adaptive FCS software with regard to which the DO-178B standard needs to be amended or modified.
Partitioning: Partitioning is a technique based on identifying and using the functional independence of software components to contain and isolate faults and potentially reduce the effort of the software verification process. For a neural network based control system, the applicability of this method is limited by the integral, inseparable nature of neural computation. A partitioned component can never be smaller than a constituent neural network; e.g., two variables cannot be separated by partitioning if there exists a neural network in the system whose input or output vector contains both these variables.
Requirement guidelines: Requirement feasibility issues need to be addressed, including fault detectability [5] and the minimal necessary learning time of the neural networks approximating the flight dynamics.
Software verification process: DO-178B states that the input to the software verification process must include requirements, architecture, traceability data, source code, etc. For a neural network based system, the mathematical model of the artificial neural network employed in the system design should also be required. This mathematical model should include statements and proofs of the stability and convergence properties of the neural network.
Testing objectives: Among the testing objectives listed in DO-178B, the crucial one is structural coverage, which aims to assure that the test cases exercise all the data structures and all the branches of the program. For NN-based software, this type of coverage is no longer applicable, since we deal with non-algorithmic, parallelized, data-defined evolutionary computation. Instead, one should be concerned with input space coverage and training data coverage.
Formal methods: Due to the limitations in testing efficiency for the NN-based FCS, the role of formal methods becomes ever more important. They complement testing by providing a mathematically rigorous definition of the requirements for the system's behavior. Verified and validated system requirements, used as a reliable test oracle, may enable statistical inference that provides a confidence level for the reliability of the software with respect to its requirements.
4 Certification Case Study
4.1 Requirements Engineering
Irrespective of what methods we choose, the certification of the adaptive system is contingent upon an independently developed, redundant description of its functionality. For the sake of practicality, it is imperative that the specification be written in a language that is supported by automated tools. In the validation/certification phase, the neural net and the executable specification can be executed independently to provide a basis for checking the former against the latter. We also propose to write the specification in such a way that it captures not only the features of the particular implementation of this project, but rather the general requirements of fault tolerant flight control systems. We have chosen to use SCR (Software Cost Reduction) as the specification vehicle, because it lends itself to this type of application: requirements/properties of the FTFCS. For example, tabular representations, which form the semantic foundations of SCR, were used in [8] to specify the requirements of the Navy's A7-E aircraft, and in [20] to specify a nuclear power plant's control systems. SCR was also used to specify an autopilot [6], to specify a variety of high assurance applications [10, 11], and to specify some functions of the space shuttle software [27].
The first issue we address is delimiting the boundaries of our specification. We have pondered two possible options, called option 1 and option 2 in the following. Whereas option 1 focuses on the inputs and outputs of the fault tolerant capability component (the AR-FTC in Figure 1), option 2 (the aggregate of the flight control system with the aircraft) considers the impact of the outputs of the AR-FTC on the aircraft's state. For a given situation, defined by a set of sensor readings, there are many sequences of actions that a flight control system can follow to achieve/maintain the maneuverability/stability of the aircraft. At any instant, these actions may be different, but their combined effect over time is identical. Hence, by virtue of abstraction (we do not wish to deal with the detailed mechanics of how the AR-FTC operates) and generality (writing specifications that apply across a wide range of possible implementations), option 2 is better than option 1. However, if we choose option 2, then we cannot judge the outputs of the AR-FTC directly; we have to observe their effect on the aircraft. This gives us much lower observability of the AR-FTC than option 1. Controllability is the same for both options. Ultimately, this decision amounts to choosing between observability (option 1), i.e. the ability to observe and monitor the exact values that are produced by the FCS implementation, and abstraction (option 2), i.e. the ability to give the implementer some latitude in how to maintain maneuverability. We have ruled in favor of abstraction.
By virtue of this choice, the input variables of the specification are the sensor readings of relevant flight parameters (altitude, speed, acceleration, angle of attack, rate gyros, aileron deflections, elevator deflections, rudder deflection) and the actuator input values; the output variables are the actual values (i.e., the validated vectors) of the same parameters and an error report. The space of sensor readings is therefore partitioned under fault hypotheses, and for each partition analytical relations among system variables are introduced, both to characterize the partition and to express constraints that must be satisfied to guarantee stability and maneuverability of the aircraft when a fault occurs while system conditions fall in that partition. The formulation of such relations benefits from the tabular representation of variable behavior in SCR: it is straightforward to introduce the expression whose result is the value that a variable must assume under given conditions. Functional dependency among tables is exploited to capture indirect relations. On the other hand, several representation/execution issues have been raised concerning the use of SCR for such a domain (e.g., modeling time).
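The tabular style of specification described above can be mimicked in a few lines: each row pairs a condition on the monitored variables with the expression giving the value the variable must assume. The sketch below is not SCR toolset syntax. The variable names `q_meas` and `q_est`, the tolerance, and the helper `evaluate_table` are hypothetical, and the check that exactly one row applies reflects the usual expectation that the rows of a condition table are disjoint and cover all cases.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Row = Tuple[Callable[[State], bool], Callable[[State], float]]

TOL = 0.05  # hypothetical tolerance on the pitch-rate residual


def evaluate_table(rows: List[Row], state: State) -> float:
    """Return the value given by the single row whose condition holds."""
    matches = [expr for cond, expr in rows if cond(state)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one applicable row, got {len(matches)}")
    return matches[0](state)


# Table for a validated pitch rate: one row per partition of the input space.
validated_pitch_rate: List[Row] = [
    # No-fault hypothesis: measurement agrees with the analytical estimate.
    (lambda s: abs(s["q_meas"] - s["q_est"]) <= TOL, lambda s: s["q_meas"]),
    # Fault hypothesis: measurement disagrees, so the estimate is used.
    (lambda s: abs(s["q_meas"] - s["q_est"]) > TOL, lambda s: s["q_est"]),
]

nominal = evaluate_table(validated_pitch_rate, {"q_meas": 0.02, "q_est": 0.03})
faulty = evaluate_table(validated_pitch_rate, {"q_meas": 0.50, "q_est": 0.03})
```

Because the table is executable, the same rows can later serve as a reference against which an implementation is checked.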
4.2 Test Environment
Our plan calls for using the target specification as an oracle in the test plan of the neural network. Specifically, the neural net feeds its inputs into a certified flight simulator, which plays the role of the aircraft components in the graph of Figure 1. This aggregate is placed side by side with the SCR specification, whereby the SCR is used as an oracle to test the neural net. Input data is submitted to both the system under test and the SCR oracle, to check for correctness. This input data is the aggregate of sensor readings and pilot controls, drawn from previously collected flight simulation data. The purpose of the testing plan is to make a ruling on the certifiability of the neural net as an implementation of the fault tolerant capability of the flight control system. The use of requirements specifications as a test oracle presented us with a dilemma, concerning the determinacy of the specification, while deriving a specification for the fault tolerant capability of the flight control system. We had two options:
Make the specification deterministic. This is more natural, from the standpoint of SCR (which revolves around the pattern of formulating controlled variables as a function of monitored variables), and yields generally simpler specifications. The main drawback of this solution, of course, is that it forces us to second guess the designer of the neural net, because we have to derive a specification for the exact function that the neural net is implementing. This, in turn, has two drawbacks: first, it imposes much coordination between the implementer team and the specifier team, and is counterproductive from a V&V viewpoint (V&V relies primarily on redundancy); second, it imposes early constraints on the designer, prohibiting him from altering design decisions that affect the specifier team.
Make the specification non deterministic. The position here is to let the specification focus on expressing the desired functional properties, without going as far as to uniquely specify which output will satisfy these desired properties. This solution is consistent with traditional guidelines for good specification, but causes some difficulty in SCR, because SCR does not handle non-determinacy naturally.
We felt very justified in choosing the second option, but have found that it raises an issue which may, with hindsight, cause us to reassess our choice. Under the first option (deterministic specification), the specification of the system does not have to capture the criteria under which differences of output between the specification and the implementation can be considered tolerable; this decision can be made during the verification and validation step by the V&V team, taking into consideration any special circumstances that may arise. By contrast, under the second option, the tolerance margins have to be hard-coded into the specification, and cannot be adjusted subsequently by the V&V team to account for special testing/operational conditions. Hence both options force us to make early decisions: the first option requires us to agree with the implementer on specific design decisions; the second option requires us to agree with the V&V team on specific tolerance margins.

The verification environment we derived for this purpose is presented in Figure 2. The fault reports of the neural net and the SCR specification are compared for logical equality, producing the result shown in the lower right corner of the figure. On the other hand, the actual state of the aircraft, produced by the flight simulator, is matched against the pilot controls (by virtue of a law that captures aircraft maneuverability), to return a boolean indicator of whether the aircraft maintains adequate maneuverability (despite the possible presence of faults).
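The two checks performed by the verification environment can be sketched as a small harness. Everything named here is hypothetical: the `Verdict` record and the callables standing in for the neural net, the SCR oracle, the flight simulator and the maneuverability law are placeholders, and the toy lambdas in the usage lines have no relation to the real components. The sketch only illustrates the comparison logic, in which the two fault reports are compared for equality and the simulated state is matched against the pilot command.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Reading = Sequence[float]


@dataclass(frozen=True)
class Verdict:
    fault_reports_agree: bool  # neural net report equals the SCR oracle report
    maneuverable: bool         # simulated state satisfies the maneuverability law


def run_test_case(
    sensors: Reading,
    pilot_command: float,
    nn_fault_report: Callable[[Reading], bool],
    scr_fault_report: Callable[[Reading], bool],
    simulate: Callable[[Reading, float], float],
    maneuverability_law: Callable[[float, float], bool],
) -> Verdict:
    """Submit one input to the system under test and to the oracle."""
    agree = nn_fault_report(sensors) == scr_fault_report(sensors)
    state = simulate(sensors, pilot_command)
    return Verdict(agree, maneuverability_law(state, pilot_command))


# Toy stand-ins, assumed purely for illustration.
verdict = run_test_case(
    sensors=[1.0, 20.0],
    pilot_command=0.5,
    nn_fault_report=lambda s: max(s) > 10.0,
    scr_fault_report=lambda s: max(s) > 15.0,
    simulate=lambda s, command: command + 0.1,
    maneuverability_law=lambda state, command: abs(state - command) <= 0.2,
)
```

Keeping the two indicators separate reflects the structure of the environment: agreement of fault reports is a pure equality check, whereas the maneuverability indicator depends on whatever tolerance the law encodes.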
4.3 Quantitative Assessment of Testing
The software assessment criterion prescribed in DO-178B emphasizes structural coverage, i.e., making sure the test cases sensitize all (or sufficiently many) branches of the algorithmic tree implemented by the program in question, to reveal (nearly) all the existing programming errors or demonstrate the absence thereof. For adaptive software, in particular NN-based software, which relies on training data rather than an algorithm, testing should instead aim at achieving adequate data coverage or, in other words, input space coverage.
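One simple way to put a number on input space coverage is to divide each bounded input parameter into bins and report the fraction of grid cells that contain at least one test (or training) point. The sketch below is an illustration only; it is not a measure prescribed by DO-178B or by the methodology discussed here, and the function name, the bounds and the bin count are assumptions.

```python
from typing import Sequence, Tuple


def grid_coverage(
    points: Sequence[Sequence[float]],
    bounds: Sequence[Tuple[float, float]],
    bins: int,
) -> float:
    """Fraction of grid cells hit by at least one point."""
    visited = set()
    for point in points:
        cell = []
        for value, (low, high) in zip(point, bounds):
            if not low <= value <= high:
                raise ValueError(f"value {value} outside [{low}, {high}]")
            # Map the value to a bin index; the upper bound falls in the last bin.
            index = min(int((value - low) / (high - low) * bins), bins - 1)
            cell.append(index)
        visited.add(tuple(cell))
    return len(visited) / bins ** len(bounds)


# Three points in the unit square with two bins per axis: two of four cells are hit.
coverage = grid_coverage(
    points=[(0.1, 0.1), (0.9, 0.9), (0.15, 0.2)],
    bounds=[(0.0, 1.0), (0.0, 1.0)],
    bins=2,
)
```

The number of cells grows exponentially with the number of parameters, so a uniform grid like this is only practical for low-dimensional parameterizations of the input space.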
[Diagram blocks: Accommodation Requirement; SCR specifications of Fault Tolerant Capability; Neural Network; Flight Simulator. Signal labels: actuators (U), sensors (Y), MR, pilot command, validated actuators (Û), validated sensors (Ŷ, Y’), fault report, aircraft state (X), comparison D = D’.]
Figure 2: A Test Environment for the Adaptive Flight Control System

Fault tolerant capability of the system, i.e., fault detection and quantification are tested against the executable version (SCR simulation) of the appropriate system requirements. Since the results of testing fault detection and identification are given as a boolean variable, the outcome of successive tests is quantified as the proportion of successes versus the total number of tests. A more interesting challenge for the quantification of test results is presented by testing the aircraft’s ability to accommodate identified (as well as non identified) sensor faults. For this purpose, we briefly discuss a formal test evaluation framework, which, we believe, can provide the basis for building a certification methodology for adaptive control software systems.

The test inputs are the sensor readings:

$$y(t) = (A_n(t), \alpha(t), q(t))^T, \quad 0 \le t \le T,$$

and the sensor faults:

$$f(t) = (f_{A_n}(t), f_{\alpha}(t), f_q(t))^T, \quad 0 \le t \le T,$$

where $T$ is the observation time interval. The space of all such functions is infinite-dimensional, and hence is far too rich to be covered by a reasonable number of tests. For the concrete implementation (like the one we are concerned with in this article), however, the set of all plausible (with respect to the requirements) inputs is confined to the set of functions $(y_p(t), f_s(t))$ parameterized by finitely many parameters $p = (p_1, \ldots, p_k)$, $s = (s_1, \ldots, s_m)$, representing the standard types of maneuvers ($p$) and faults ($s$). Besides, the values of the parameters above are limited (by virtue of the requirements) to certain finite intervals. Thus, the correctness of the system’s operation only needs to be verified over a compact finite-dimensional set of inputs

$$I = \{(y_p(t), f_s(t)) : p \in P \subset \mathbb{R}^k,\ s \in S \subset \mathbb{R}^m\}.$$

It should be noted that the specified fault dictionary must include the “zero fault” member in order to certify the system’s operation under the nominal conditions. The output of the NN-based control system under investigation is the aircraft trajectory $x(t) = (\alpha(t), q(t))^T$ over the observation time interval $0 \le t \le T$. For every input $\iota \in I$, the automated test environment runs the adaptive flight control software. At the end of the computational cycle, the output from the controller is compared with the specification oracle, thus providing a discrepancy measure:

$$\delta(\iota, t) = g_t(\| X_{nom}(t) - x_{actual}(t) \|), \quad 0 \le t \le T,$$

where, for every computational cycle $t$, $g_t$ is a nondecreasing function which takes the distance between the values $X_{nom}(t)$ and $x_{actual}(t)$ and equals 0 for zero distance, and 1 for distances above a safety critical, predefined threshold.

In order to assess reliability, we introduce a probability density $P(\iota)$ on the input space $I$, which reflects the frequency of the occurrence of the particular faults and sensor readings during the field use. Following [26], we define the estimated software reliability as

$$R(t) = 1 - \int_I P(\iota)\, \delta(\iota, t)\, d\iota.$$

To avoid the time dependence of the reliability estimate $R$, we may use the ”worst moment” estimation of reliability. In other words, $\max\{\delta(\iota, t) : 0 \le t \le T\}$ would replace $\delta(\iota, t)$. For the discretized input space it is easy to rewrite the expression above with summation instead of the integration. It is clear, however, that the entire input space $I$ is too large to test exhaustively, i.e., to compute the value $R$ directly using the formula given above.

We propose to approach the problem by augmenting testing with formal proofs based on the mathematical description of the neural network, in order to perform a generalization of the test results from a limited set of test cases to all of $I$. By exploiting the (piecewise) continuity of the function modeled by the neural network and, therefore, of the program transforming inputs into outputs, we can find for every test case $\iota \in I$ its neighborhood $U_\iota$ such that the local discrepancy measure $\delta(\iota', t)$, $\iota' \in U_\iota$, can be estimated based on its value $\delta(\iota, t)$ for the test input $\iota$. Then one can cover $I$ by such neighborhoods of a limited number of the test inputs, and thus obtain an estimate of the integral above and the reliability $R(t)$ without exhaustive testing.
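The discretized, worst-moment form of the reliability estimate can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the case study: the saturating linear ramp chosen for the nondecreasing function, the threshold value, the usage probabilities and all function names are invented for the example. Each test input contributes its worst discrepancy over the observation interval, weighted by the probability of that input occurring in field use.

```python
import math
from typing import Sequence

Trajectory = Sequence[Sequence[float]]  # one state vector per computational cycle


def ramp(distance: float, threshold: float) -> float:
    """Assumed shape of g: nondecreasing, 0 at zero distance, 1 from the threshold on."""
    return min(distance / threshold, 1.0)


def worst_moment_discrepancy(
    nominal: Trajectory, actual: Trajectory, threshold: float
) -> float:
    """Largest discrepancy between nominal and actual state over all cycles."""
    return max(
        ramp(math.dist(x_nom, x_act), threshold)
        for x_nom, x_act in zip(nominal, actual)
    )


def estimated_reliability(
    probabilities: Sequence[float], discrepancies: Sequence[float]
) -> float:
    """One minus the probability-weighted sum of discrepancies."""
    if len(probabilities) != len(discrepancies):
        raise ValueError("one probability is needed per test input")
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("usage probabilities must sum to 1")
    return 1.0 - sum(p * d for p, d in zip(probabilities, discrepancies))


# Two hypothetical test inputs: one tracked perfectly, one off by 2.0 at the last cycle.
nominal = [(0.0, 0.0), (1.0, 0.5)]
faulty = [(0.0, 0.0), (1.0, 2.5)]
discrepancies = [
    worst_moment_discrepancy(nominal, run, threshold=4.0) for run in (nominal, faulty)
]
reliability = estimated_reliability([0.75, 0.25], discrepancies)
```

With these numbers the second input has discrepancy 0.5 and the estimate is 0.875. The sum runs over tested inputs only, so on its own it says nothing about untested regions of the input space; that gap is what the continuity-based generalization is meant to close.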
5 Critical Assessment of the Methodology
Even though our procedure is consistent with the directives of the DO-178B standard, we feel that, in the form explained above, it is still inadequate to provide us with quality assurance, because of a crucial feature of adaptive systems. Adaptive systems evolve their behavior over time, hence the behavior that they exhibit during field usage does not duplicate their behavior under test. Yet a crucial hypothesis of all forms of certification testing (be it to establish correctness or to estimate reliability) is that the system will duplicate in field usage the behavior that it exhibits under test. For example, this idea is at the center of Cleanroom’s discipline of reliability assessment, which generates test data to mimic usage patterns, and predicts reliability in the field as an extrapolation of reliability growth under test [3, 13, 12].

Of course, one may argue that although adaptive systems evolve over time, they change for the better. But better in the sense of adaptive systems (stability, convergence) does not mean better in the traditional sense of testing (satisfies the oracle condition more often). Concretely, an adaptive system may very well satisfy the oracle condition in the testing phase, and fail to satisfy it in the field usage phase, even though it converges; see Figure 3. Note that this does not rule out testing completely; in principle, testing would still be useful if we could prove that the learning process which drives the evolution of the adaptive system is monotonic. What we mean by monotonic is this: in [17], we have shown that certification testing can be modeled by a property to the effect that the product under test refines a specification which is derived from the test oracle and the test data. An evolution is monotonic if and only if, as the adaptive system evolves, its function grows increasingly more refined.
Because the refinement ordering is transitive, monotonicity ensures that any behavior that is observed during the testing phase is duplicated or exceeded during field usage. One may argue that since testing does not work, we should turn to proving. Another feature of adaptive systems rules out this option as well: whereas traditional proving techniques rely on the assumption that the functional behavior of a software product is fully determined by its representation (e.g. source code), and focus on analyzing the representation to infer functional properties, the functional behavior of adaptive systems is heavily dependent upon the data on which the system was trained. Note that this observation does not rule out proving completely; in principle, proving is still useful, but it is weakened considerably by the premise that it can only deal with data independent functional features —which are typically very nondeterministic.
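A finite-sample proxy for monotonicity can be sketched as follows. It is not the relational refinement ordering of [17]: it only checks, on a fixed set of test inputs, that the set of inputs on which the system satisfies the oracle never shrinks from one snapshot of the adaptive system to the next. The function names, the oracle and the snapshot functions are assumptions made for the illustration.

```python
from typing import Callable, Hashable, Sequence, Set

System = Callable[[Hashable], float]
Oracle = Callable[[Hashable, float], bool]


def pass_set(system: System, oracle: Oracle, inputs: Sequence[Hashable]) -> Set[Hashable]:
    """Test inputs on which the system's output satisfies the oracle."""
    return {x for x in inputs if oracle(x, system(x))}


def is_monotonic_on(
    snapshots: Sequence[System], oracle: Oracle, inputs: Sequence[Hashable]
) -> bool:
    """True if each snapshot passes everywhere its predecessor passed."""
    sets = [pass_set(snapshot, oracle, inputs) for snapshot in snapshots]
    return all(earlier <= later for earlier, later in zip(sets, sets[1:]))


inputs = (0, 1, 2)


def oracle(x, y):
    # Hypothetical tolerance band around the target 2 * x.
    return abs(y - 2 * x) <= 0.5


# Snapshots that move into the tolerance band.
improving = [lambda x: 2 * x + 1.0, lambda x: 2 * x + 0.25]
# Snapshots that start inside the band and then leave it for x > 0.
drifting = [lambda x: 2.0 * x, lambda x: 3.0 * x]

improving_ok = is_monotonic_on(improving, oracle, inputs)
drifting_ok = is_monotonic_on(drifting, oracle, inputs)
```

The `drifting` pair illustrates the concern raised here: a system can satisfy the oracle under test and stop satisfying it later, so a passing test campaign on the first snapshot says little about the second.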
[Diagram labels: SCR Range; NN Range, under test; Behaviour under test; NN Range, in field; Behaviour in field.]
Figure 3: Convergence does Not Ensure Consistency
6 Venues for Adaptive System Certification
The foregoing discussion leads us to seek alternative venues for the certification of adaptive systems. We adopt the philosophy advocated in [4, 14, 17], which promotes the deployment of several alternative quality assurance methods. This philosophy embodies the Law of diminishing returns, and consists in deploying many methods in concert, with the expectation that each method will work best for some circumstances, and will be superseded by other methods as its effectiveness diminishes. We are mindful of an important difference between the application of this approach to traditional software products and its application to adaptive software products. In traditional software products, any one method (testing, proving, fault tolerance: [17]) is, in principle, adequate to provide an arbitrary level of quality assurance, although perhaps at a prohibitive cost. In the case of adaptive systems, none of the methods we propose is provably complete. Hence while this approach is something of a luxury for traditional systems (designed to deploy each method where it is most effective), it is more of a necessity for adaptive systems, since none of the methods we propose below is known to be complete.

Continuity-based equivalence classes partitioning. Partitions of the NN input space into equivalence classes could correspond to qualitatively different system behavior by exploiting continuity/stability properties rigorously proved for the mathematical model of the neural network. One would select test cases from each equivalence class and combine the test results with formally proved statements establishing correctness (or at least providing a statistical confidence level) of the system operation through the use of fuzzy functions or another similar methodology.

Semantic Approach. The purpose of the semantic approach is to combine the spirit of the traditional static techniques with the specifics of adaptive systems. The key idea of this approach is to recognize that even though the behavior of an adaptive system is not fully captured in its static representation, one can still:
1. Infer some functional properties of the system from its representation, independent of its input data;
2. Infer data-dependent properties of the system, subject to relatively weak hypotheses on the input data.
Work we have done in the past on program fault tolerance bears out our expectations that both of these hypotheses are fairly realistic.

Reduction of an intelligent to a regular control system. For many on-line neural networks (in particular, for RBF dynamical system identification ANN) the learning algorithm is described by a finite-order differential equation. One could expand the state space of the control system by adding the internal parameters of the neural network (weights and perhaps their derivatives) to the list of system state variables. Thus, the entire Intelligent Flight Control System (IFCS, Control System + Neural Network) could be represented as a merged dynamical system at the cost of increasing its state space dimension. This, if achieved, would make it possible to apply standard control system certification methodology to flight qualify the intelligent flight control system.

Logical Redundancy. It has been demonstrated that it is possible to distinguish two types of information in the state of a computation:
1. Critical information, which is indispensable to the failure-free completion of the computation;
2. Non-critical information, which may be lost during the computation (due to a fault) and recovered on-the-fly by forward recovery.
In the context of a flight control system, critical information would capture the aircraft survivability requirements, whereas non-critical information captures only instant maneuverability requirements. The logical redundancy approach consists of structuring the FCS as the aggregate of two components:
1. the neural net, responsible for preserving non-critical information;
2. a safety (logical) device, responsible for preserving critical information.
By virtue of this solution, all the NN-specific V&V methods can be applied to the neural network, while traditional refinement-based V&V methods can be applied to the safety device to ensure that safety is preserved as the neural net evolves.

Acknowledgments

The authors acknowledge the support provided by NASA Dryden Flight Research Center, administered by the Institute for Software Research.

References
[1] Software Considerations in Airborne Systems and Equipment Certification. RTCA, Washington DC, 1992.
[2] H. Ammar, B. Cukic, C. Fuhrman, and Mili. A comparative analysis of hardware and software fault tolerance: Impact on software reliability engineering. Annals of Software Engineering, June 2000 (to appear).
[3] S. A. Becker, J. A. Whittaker. Cleanroom Software Engineering Practice. IDEA Publishing, 1997.
[4] B. Cukic. Combining Testing and Correctness Verification in Software Reliability Assessment. Proceedings of the 2nd IEEE International Symposium on High-Assurance Systems Engineering (HASE’97), Washington, DC, Aug. 1997.
[5] D. Del Gobbo, B. Cukic, S. Easterbrook, M. Napolitano. Fault Detectability Analysis for Requirements Validation of Fault Tolerant Systems. Proc. 4th IEEE International Symposium on High-Assurance Systems Engineering, Washington, DC, Nov. 1999.
[6] R. Bharadwaj and C. Heitmeyer. Applying the SCR requirements method to a simple autopilot. In Proceedings, Fourth Langley Formal Methods Workshop, Hampton, VA, September 1997.
[7] M. Frappier, A. Mili, and J. Desharnais. Defining and detecting feature interactions. In IFIP TC2 Working Conference on Algorithmic Languages and Calculi. Chapman and Hall, 1997.
[8] K. L. Heninger, J. Kallander, D. L. Parnas, and J. E. Shore. Software requirements for the A-7E aircraft. Technical Report 3876, United States Naval Research Laboratory, Washington D. C., 1978.
[9] R. Janicki, D. L. Parnas, and J. Zucker. Tabular representations in relational documents. In C. Brink, W. Kahl, and G. Schmidt, editors, Relational Methods in Computer Science, chapter 12, pages 184–196. Springer Verlag, January 1997.
[10] J. K. Jr, M. Archer, and C. Heitmeyer. Applying formal methods to a high security device: An experience report. In Proceedings, IEEE International Symposium on High Assurance Systems Engineering, pages 81–88. IEEE Computer Society Press, November 1999.
[11] J. K. Jr, M. Archer, and C. Heitmeyer. Scr: A practical approach to building high assurance: Comsec system. In Proceedings, Annual Computer Security Applications Conference. IEEE Computer Society Press, December 1999.
[12] R. C. Linger, P. A. Hauser. Cleanroom Software Engineering. Proc. 25th Hawaii International Conference on System Sciences, Kauai, Hawaii, Jan. 1993.
[13] R. C. Linger. Cleanroom Software Engineering for Zero-Defect Software. Proc. 15th International Conference on Software Engineering, Baltimore, MD, May 1993.
[14] M. Lowry, M. Boyd, D. Kulkarni. Towards a Theory of Integration of Mathematical Verification and Empirical Testing. Proc. 13th IEEE International Conference on Automated Software Engineering, pp. 322-331, Honolulu, Hawaii, October 1998.
[15] G. J. Myers. Art of Software Testing. John Wiley & Sons, 1979.
[16] A. Mili. An Introduction to Program Fault Tolerance: A Structured Programming Approach. Prentice-Hall, Englewood Cliffs, NJ, 1990.
[17] A. Mili, B. Cukic, T. Xia, R. Ben Ayed. Combining Fault Avoidance, Fault Removal and Fault Tolerance: An Integrated Approach. Proc. 14th IEEE International Conference on Automated Software Engineering, pp. 137-146, Cocoa Beach, FL, October 1999.
[18] F. Nasuti and M. Napolitano. Sensor failure detection, identification and accomodation using radial basis function networks. Technical report, West Virginia University, Mechanical and Aerospace Engineering, Morgantown, WV, November 1999.
[19] D. L. Parnas. Tabular representation of relations. Technical Report 260, Communications Research Laboratory, Faculty of Engineering, McMaster University, Hamilton, Ontario, Canada, October 1992.
[20] D. L. Parnas, G. J. K. Asmis, and J. Madey. Assessment of safety-critical software in nuclear power plants. Nuclear Safety, 32(2):189–198, April-June 1991.
[21] R. J. Patton. Robust model-based fault diagnosis: The state of the art. In T. Ruokonen, editor, IFAC Symposium on Fault Detection, Supervision and Safety for Technical Processes: SAFEPROCESS ’94, volume 1, pages 1–24, Helsinki Univ. Technol, Espoo, Finland, June 1994. IFAC, Springer Verlag.
[22] R. J. Patton, P. Frank, and R. Clark. Fault Diagnosis in Dynamic Systems: Theory and Applications. Prentice Hall, 1989.
[23] R. J. Pehrson. Software Development for the Boeing 777. CrossTalk, The Journal of Defense Software Engineering, January 1996.
[24] B. L. Stevens and F. L. Lewis. Aircraft Control and Simulation. John Wiley and Sons, New York, N.Y., 1992.
[25] K. Szalai, R. Larson, and R. Glover. Flight experience with flight control redundancy management. In AGARD Lecture Series No. 109, Fault Tolerance Design and Redundancy Management Techniques, pages 8/1–27. AGARD, Neuilly-sur-Seine, France, October 1980.
[26] S. N. Weiss, E. Weyuker. An Extended Domain-Based Model of Software Reliability. IEEE Trans. on Software Engineering, Vol. SE-14, No. 10, pages 1,512-1,524, Oct. 1988.
[27] V. Wiels and S. Easterbrook. Formal modeling of space shuttle change request using scr. Technical Report NASAIVV-98-004, NASA IV&V Facility, Fairmont, WV 26554, http://www.ivv.nasa.gov/, 1998.
[28] Y. C. Yeh. Design considerations in boeing 777 fly-by-wire computers. In Proceedings Third IEEE International High-Assurance Systems Engineering Symposium, pages 64–72, Washington, DC, USA, 1998.