Modeling and Simulated Fault Injection for Time-Triggered Safety-Critical Embedded Systems
Iban Ayestaran, Carlos F. Nicolas, Jon Perez, Asier Larrucea
Embedded Systems Group, IK4-Ikerlan Research Center, Arrasate-Mondragón, Basque Country (Spain). Email: {iayestaran, cfnicolas, jmperez, alarrucea}@ikerlan.es
Peter Puschner
Institut für Technische Informatik, Technische Universität Wien, Wien, Austria. Email:
[email protected]
Abstract—The development and certification of safety critical embedded systems require the implementation of fault-tolerance mechanisms to ensure the safe operation of the system even in the presence of faults. These mechanisms need to be verified and validated by means of fault injection. Simulated fault injection enables an early dependability assessment that validates the correct implementation of fault-tolerance mechanisms and reduces the risk of late and expensive discovery of safety related pitfalls. This paper presents a novel modeling and simulation framework for time-triggered safety critical embedded systems. Our approach supports simulated fault injection at different abstraction levels (platform independent and platform specific models) and integrates a time-triggered automatic test executor for the early verification and validation of the systems. The feasibility of the proposed framework is illustrated with a case study where a simplified railway signaling system is modeled and simulated at different levels of abstraction.
I. INTRODUCTION
Safety-critical embedded systems are dependable systems that could lead to loss of life, significant property damages or damages to the environment in case of failure. These systems implement fault-tolerance and fault-control solutions that need to be validated by means of fault injection experiments. As the late fixing of the detected design flaws is expensive, it is advisable to validate the fault-tolerance mechanisms in early steps of the development. In fact, the IEC-61508 safety standard strongly recommends fault injection techniques in all stages of the development process of safety-critical systems [1]. Nevertheless, fault injection from early stages is feasible only if we can build an executable model of the system. Nowadays, the design of safety-critical embedded systems often follows the well known Y-chart development process [2], [3]. This approach specifies the platform model and the functional model of the system separately, and combines both by means of a mapping model to obtain the complete system model. Furthermore, Model Driven Development (MDD) approaches, such as Model Driven Architecture (MDA) [4], also follow the idea of a strict separation of behavioral and platform models. More specifically, it proposes the development of a Platform Specific Model (PSM) by applying transformations to a Platform Independent Model (PIM). This paper presents a novel modeling and simulation framework for safety-critical time-triggered hardware / software systems, based on the Y-chart approach and MDA models. The approach enables validation of fault-tolerance
mechanisms by fault injection from the early stages of the design and the re-usability of fault campaigns through the subsequent design stages. Platform independent models rely on the Logical Execution Time (LET) [5] model of computation, while platform specific models rely on the Executable Time-Triggered Model (E-TTM) [6]. The framework presented herein includes a time-triggered Automatic Test Executor (ATE) for assessing dependability by injecting faults at different abstraction levels.
The paper is structured as follows: Section II briefly describes related work. Section III introduces basic background concepts, and Section IV presents our approach to model safety-critical time-triggered systems at different abstraction levels and describes the simulation and automatic test execution framework. Section V evaluates the approach by means of a case study drawn from the European Train Control System (ETCS), and Section VI sums up the main conclusions.
II. RELATED WORK
Several design languages and frameworks have been proposed for modeling and validating fault-tolerant embedded systems, such as SCADE Suite [7], the Model Driven Architecture (MDA) [4] and its derivative languages (e.g. UML [8] and SysML [9]), the Architecture Analysis and Design Language (AADL) [10], the Fault-Tolerant Operating System (FTOS) [11], and the Executable Time-Triggered Model (E-TTM) [6]. SCADE Suite is a model-based development environment for safety-critical software based on the synchronous Model of Computation (MoC). The suite includes an automatic C code generator tool, called KCG. However, this tool only covers the application functionality of embedded systems, and nonfunctional aspects such as the modeling of hardware components and the distribution of software in hardware platforms are not covered. MDA suggests separating the specification of the operation of a system from the details of its deployment on a specific platform, in order to increase the portability, interoperability and re-usability of the systems. Therefore, the MDA decouples the system in two models: PIM and PSM. Different modeling languages, such as the Unified Modeling Language (UML) [8] and the Systems Modeling Language (SysML) [9], are compliant with the MDA.
UML is an international standard that defines a modeling language for software systems design. However, like SCADE, UML does not support architectural design, so it is not suitable for the modeling of HW/SW systems. SysML is the adaptation of UML for systems engineering. Namely, it can be used to represent systems that may include combinations of HW, SW, data, people, facilities, and natural objects. However, as SysML is a general purpose language, it is not restricted to a specific model of computation. Besides, most of the information contained in SysML models is not intended for the implementation of software.
AADL is a modeling language for the development and analysis of embedded systems with multiple critical operational properties, such as responsiveness, safety-criticality, security, and reliability. In contrast to MDA, AADL defines the system and all its components in a unique hierarchical model. AADL is not focused on a specific model of computation, and hence dynamic aspects are described by the selected operational model.
FTOS is a model-based development tool for the design of dependable automotive systems. Unlike other approaches such as MDA and AADL, FTOS decouples the system in four different models: the HW architecture model, the SW architecture model, the fault model and the fault-tolerance mechanisms. It relies on a specific model of computation, the Logical Execution Time.
In this context, SystemC [12], IEEE-1666, is an open-source language defined as a C++ library for the development and simulation of executable models of HW/SW systems at different levels of abstraction. Both HW and SW components can be modeled in SystemC, making it suitable for the development of platform specific models of safety-critical systems.
However, to the best of our knowledge, there is no modeling and simulation framework focused on time-triggered systems that enables HW/SW co-simulation and allows performing simulated fault injection in all stages of the development process. In order to fill this gap, this paper presents the Platform Independent Time-Triggered Model (PI-TTM) and the Platform Specific Time-Triggered Model (PS-TTM), two modeling and simulation environments for the design and dependability assessment of safety-critical time-triggered systems.
III. BACKGROUND
This section describes some basic background concepts that we use in this paper.
A. Time-Triggered Architecture (TTA)
The TTA [13] provides a computing infrastructure for the design and implementation of dependable and safety-critical embedded systems. TTA decomposes the systems into nearly autonomous clusters and nodes that share a fault-tolerant global time base of known precision. The existence of this global time in all the components of the system makes it possible to abstract the communication interfaces, guarantees the timeliness of the real-time applications, and eases prompt error detection in communications. Therefore, TTA is based on the time-triggered MoC [14], which relies on the sparse-time model of time. The TTA infrastructure guarantees the agreement between the time-stamps at each node. The interfaces and the predictable Time-Triggered Protocol (TTP) decouple the processing functions from communications among the distributed subsystems, simplifying the design of the internal application software of the nodes.
In the TTA, systems are composed of one or several clusters, which are composed of one or several nodes interconnected by a TTP network (Figure 1). Each node includes a time-triggered Communication Controller (CC), a Communication Network Interface (CNI) and a host processor with memory that executes the operating system and the application software.
Fig. 1: Structure of a TTA cluster with five nodes
B. Executable Time-Triggered Model (E-TTM)
The E-TTM [6] is a SystemC-based extension for the modeling of real-time embedded systems based on the TTA. It extends SystemC with the sparse notion of time, and decouples computation activities from communication, while they both share a global notion of time. The execution of computational components is instantaneous and restricted to the sparse-time macrotick. Although simultaneously triggering components are executed sequentially, the output messages are not delivered during the execution macrotick and the events generated during the computation do not trigger activation events until the next execution macrotick. This guarantees that the execution order chosen by the E-TTM scheduler for simultaneously triggering jobs does not have any impact on the outputs of the system.
The components communicate with each other by exchanging messages across ports. This message exchange occurs during the silence interval of the sparse-time model and is quasi-instantaneous. However, E-TTM enables the designer to delay the transmission of a message in order to represent the amount of time that computation or message exchange takes in the real world. The E-TTM simplifies the design of systems by applying hierarchy.
The developer can define three types of components in the model: Systems, Distributed Application Subsystems (DASes), and jobs. The former two are hierarchical whereas the latter is atomic.
C. Logical Execution Time (LET)
LET [5] was first introduced with the Giotto [15] time-triggered programming language. LET is a programming abstraction of the time-triggered MoC that specifies a logical duration for each computational job, regardless of its physical duration. Since it requires specifying a duration for each job, the LET paradigm is well suited for real-time systems that exhibit time-periodic behavior. The logical execution time of a given job specifies the duration between the instant when the inputs are read and the instant when the outputs are written. Since communication between jobs is instantaneous in LET models, the logical execution time represents the execution time of the job. However, if the program completes its execution before the deadline of the LET, the output writing is delayed until the deadline. In other words, from the physical point of view, the LET of a job is the upper bound of its execution time. From the logical perspective, however, the LET of a job is not only the upper bound but also the lower bound. Figure 2 shows the concept of logical execution time of a job which is physically executed in two time slots.
Fig. 2: LET: Logical Execution Time of a job
Due to the decoupling of physical and logical execution times in LET models, the use of faster machines does not result in a logically faster program, but only in decreased machine utilization. Thus, when the deployment of the LET model onto a hardware platform satisfies the LET specification, the input/output behavior does not vary depending on the platform. The conditions to satisfy the LET specification in the deployment are:
• The worst case (physical) execution time (WCET) of a job is not longer than the LET.
• The platform ensures that the inputs of a job are read at the beginning of the LET interval and outputs are written at the end of the LET interval.
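The two deployment conditions above can be illustrated with a minimal Python sketch. The names `LetJob`, `satisfies_let` and `output_instant` are hypothetical and are not part of any library mentioned in this paper; the sketch only shows that outputs become visible at the end of the LET interval, however early the computation physically finishes.

```python
from dataclasses import dataclass


@dataclass
class LetJob:
    """Hypothetical description of a LET job (illustration only)."""
    name: str
    let_ms: int   # logical execution time of the job
    wcet_ms: int  # worst-case physical execution time on the target platform


def satisfies_let(job: LetJob) -> bool:
    """First condition: the WCET must not be longer than the LET."""
    return job.wcet_ms <= job.let_ms


def output_instant(release_ms: int, physical_finish_ms: int, job: LetJob) -> int:
    """Second condition: outputs are written at the end of the LET interval.

    Inputs are assumed to be read at release_ms. If the computation finishes
    early, the outputs are still held back until the LET deadline.
    """
    deadline = release_ms + job.let_ms
    if physical_finish_ms > deadline:
        raise ValueError(f"{job.name}: LET specification violated")
    return deadline


job = LetJob(name="odometry", let_ms=10, wcet_ms=4)
early = output_instant(release_ms=100, physical_finish_ms=103, job=job)
late = output_instant(release_ms=100, physical_finish_ms=109, job=job)
```

Both calls yield the same output instant, which is the sense in which a faster machine does not produce a logically faster program.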
While the first condition depends to a great extent on the technology of the HW components, the second condition is inherent when the LET model is mapped onto architectures that provide time synchronization, such as the TTA.
D. Faults and Simulated Fault Injection
According to Benso et al. [16], a fault is a physical defect, imperfection or flaw that occurs within some HW or SW component. An error is the deviation from accuracy or correctness due to the occurrence of a fault, whereas a failure is the non-performance or incorrect performance of an expected action due to a behavior deviation. Then, as described in [17], failures are consequences of errors, and errors are consequences of faults. Thus, fault injection and the analysis of the response of the system to such faults is a straightforward technique to validate the implementation of fault-tolerance mechanisms and assess the dependability of systems.
Simulated Fault Injection (SFI) is a technique to mimic the effects of faults in simulated systems. Thanks to this technique,
developers can observe the behavior of the system in the presence of faults before assembling a prototype, enabling the early verification of the fault-tolerance mechanisms implemented in the system. The results obtained in the SFI experiments can be used to assess the suitability of the selected fault-tolerance mechanisms. SFI strategies and techniques have been widely analyzed in the past and several tools have been developed. Although most of them focus on VHDL models [18], [19], [20], SFI in SystemC models has attracted increasing interest in recent years [21], [22], [23], given that SystemC is nowadays the de facto standard in industrial HW/SW system co-design and simulation.
IV. PROPOSED MODELING APPROACH
This section discusses our proposed approach for the model-driven design and simulation of time-triggered safety-critical embedded systems. The modeling workflow is based on the Y-chart development process and the MDA. First, we describe the functionality of the system with a platform independent model. At the same time we can provide information about the candidate hardware architecture in a platform model. Finally, we create the platform specific model of the system by deploying the platform independent model onto the target platform. The decoupling between the design of functionality and the design of HW saves design time and cost during the development process, and enables the early validation of the dependability of the system. Furthermore, it eases the assessment of the emerging behavior of the system in several platform variants.
In order to ensure the integrability of the models described above, we require a coherent meta-model. This provides a modeling framework that guarantees the syntactical correctness of the models. Based on the meta-model we define a suitable modeling language for the platform independent and platform specific models.
The early validation typically consists of examinations of the expected system behavior by simulations, thus requiring the transformation of the design models into executable specifications. To that end, the meta-model must guarantee that the models are unambiguous. The model of computation of the meta-model is a central element for this purpose. Besides, the execution framework shall enable simulated fault injection to validate the correctness of the fault-tolerance mechanisms.
A. Platform Independent Time-Triggered Model (PI-TTM)
As mentioned in Section III, systems that rely on the LET MoC need to specify a logical execution time for each job, whereas communications are instantaneous. This enables the designer to abstract from communication issues, while still being able to deploy the system on a specific TTA platform.
Moreover, the LET MoC avoids synchronization points during the execution of jobs and triggers all the jobs synchronously. This reduces the number of failures that might occur in the system, since failures due to faulty synchronization points and different orders of execution are avoided. Thus, the semantics of the platform independent meta-model in this approach rely on the LET MoC.
This work presents a LET MoC execution engine, which has been built as an extension to the SystemC library. The LET-based library is called Platform Independent Time-Triggered Model (PI-TTM). The library is built as an extension that imposes LET MoC constraints on the E-TTM engine in SystemC. The PI-TTM library works as a bridge between the LET models and the E-TTM engine, abstracting the E-TTM details and providing developers with a clear interface for the design and simulation of LET systems.
Models in PI-TTM follow the architectural hierarchy described in [6], where computational components are specialized as systems, DASes and jobs. However, PI-TTM restricts the definition of these components to be time-triggered. The systems and DASes are hierarchical in PI-TTM, whereas the jobs are atomic. In other words, the systems and DASes may contain several DASes and jobs communicating among them through the communication infrastructure provided by the library. Figure 3 shows an example of a hierarchical PI-TTM model.
Fig. 3: Example of hierarchy in PI-TTM
As stated before, LET models are cyclic over time. Thus, models in PI-TTM have a period, which determines the duration of their periodic behavior. The period of a model is defined as an attribute in the system component. DASes and jobs, on the contrary, specify their frequency. The frequency of a job, together with the period of the hierarchical component where it is located, determines its logical execution time (e.g., if a system with a period of 100 ms contains a DAS with freq = 2, which in turn contains a job with freq = 5, the LET of the job will be 10 ms).
The PI-TTM library automatically handles the abstraction from LET components to E-TTM components to perform simulation. This is accomplished by transforming the LET model into an equivalent E-TTM model, with instantaneous jobs and message delays.
Fig. 4: PI-TTM: LET to E-TTM MoC transformation
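The rule that relates period, frequency and LET can be checked with a short calculation. The following Python sketch is only an illustration: the function name `logical_execution_time` is hypothetical, and the nesting of frequencies is represented as a plain list from the outermost DAS down to the job.

```python
from fractions import Fraction


def logical_execution_time(system_period_ms, frequencies):
    """Divide the system period by each nested frequency in turn.

    frequencies lists the freq attribute of every component on the path
    from the system down to the job (outermost DAS first, job last).
    """
    let = Fraction(system_period_ms)
    for freq in frequencies:
        if freq < 1:
            raise ValueError("frequencies must be positive integers")
        let /= freq
    return let


# Example from the text: 100 ms system period, DAS with freq = 2, job with freq = 5.
let_ms = logical_execution_time(100, [2, 5])
print(let_ms)  # -> 10
```

Using `Fraction` keeps the result exact when the period is not evenly divisible by the frequencies.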
Since messages are sent quasi-instantaneously in E-TTM, the PI-TTM library delays the delivery of messages until one delta-cycle before the next execution of the job that sends the message. The granularity of the simulation is also automatically determined by PI-TTM, and it is calculated by dividing the period by the Least Common Multiple (LCM) of all the jobs in the system (Equation 1).
$$\text{granularity} = \frac{\text{system period}}{\mathrm{LCM}(\text{LET of all jobs})} \tag{1}$$
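As a worked illustration of the granularity computation, the sketch below takes the LCM over the effective frequencies of the jobs. This reading is an assumption, chosen because it reproduces the two-job example of Figure 4 (frequencies 1 and 2 give a macrotick of half the period). The function names and the 100 ms period are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcm_all(values):
    """Least common multiple of a collection of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)


def simulation_granularity(period_ms, job_freqs):
    """Return (macrotick length, macroticks per period).

    Assumption: the LCM is taken over the effective job frequencies.
    """
    ticks_per_period = lcm_all(job_freqs.values())
    return Fraction(period_ms, ticks_per_period), ticks_per_period


def trigger_ticks(freq, ticks_per_period):
    """Macrotick indices within one period at which a job is triggered."""
    step = ticks_per_period // freq
    return list(range(0, ticks_per_period, step))


freqs = {"job1": 1, "job2": 2}  # the two jobs of the Figure 4 example
macrotick_ms, ticks = simulation_granularity(100, freqs)
schedule = {name: trigger_ticks(f, ticks) for name, f in freqs.items()}
```

With these inputs the period is split into two macroticks, job 1 is triggered on one of them and job 2 on both, which matches the behavior described for Figure 4.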
Figure 4 shows an example of how LET-based models are managed by the PI-TTM library. In the example, two LET jobs with different frequencies communicate with each other. Job 1 has frequency 1 and job 2 has frequency 2. Therefore, PI-TTM sets the granularity of the simulation to half of the period, and triggers job 1 every two macroticks and job 2 every macrotick. The PI-TTM transforms the LET model into an equivalent E-TTM model by automatically setting trigger instants for each job and delaying each message according to the logical execution time of the jobs. Note that the dotted sections in the time axis represent zero time delay, so execution of jobs in E-TTM models and communication in LET models are instantaneous.
Besides enabling design and simulation of LET models in SystemC, our approach eases the transformation of LET models into E-TTM models. Furthermore, it enables merging system descriptions that require both models of computation, providing a seamless connection between them.
B. Platform Specific Time-Triggered Model (PS-TTM)
According to our proposed modeling workflow, once the platform independent model is defined, it has to be deployed onto a specific hardware platform in order to build a platform specific model. Therefore, the PSM must be able to model and simulate HW components. Besides, the platform specific model may require further refinement of its timing properties, due to the increase in the level of detail of the system. Therefore, we developed a new modeling environment for platform specific models, called Platform Specific Time-Triggered Model (PS-TTM).
Given that the behavior of safety-critical systems must be time-deterministic, we constrain the MoC in the PS-TTM to the E-TTM. Therefore, we developed the PS-TTM simulation engine as an extension to the E-TTM. This engine enables modeling and simulating HW-related components, and provides mechanisms to perform fault injection during model simulation.
The HW components included in the PS-TTM library are: clusters, nodes, processors, cores, hypervisors and partitions. These hardware components are defined as SystemC wrappers. Hence, the platform is defined hierarchically, by concatenating components in cascade. The developer shall finally define the platform specific model by deploying the jobs defined in the PI-TTM into cores or partitions. Figure 5 shows an example of a hierarchical platform specific model in PS-TTM.
Fig. 5: Example of hierarchy in PS-TTM
Although the PS-TTM is a new extension to the E-TTM simulation engine, it is compliant with the PI-TTM library, so it enables the simulation of PI-TTM components relying on the LET MoC. Therefore, the PS-TTM library provides a seamless connection between platform independent and platform specific models. This enables the simulation of mixed abstraction systems, even combining descriptions of subsystems at PIM and PSM design stages, as shown in Figure 6.
Fig. 6: Mixed abstraction level simulation with PS-TTM engine
C. Simulated Fault Injection and Testing Framework
The framework presented in this paper includes the possibility to integrate a time-triggered ATE in the simulation, which is capable of reading/writing variables and signals from/to the System Under Test (SUT). The ATE is composed of three modules which share the global notion of time with the SUT, so that test experiments are reproducible. Figure 7 shows the ATE and its modules:
• Test Case Interpreter (TCI): An interpreter of Python scripts that enables defining the input values to the SUT in order to simulate the desired functional test case.
• Fault Injection Unit (FIU): An XML code interpreter that enables performing simulated fault injection by inserting different types of faults into the SUT.
• Test Point Manager (TPM): An XML code interpreter that enables the observation of internal variables of the SUT.
Fig. 7: Time-Triggered Automatic Test Executor (ATE)
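To make the role of the FIU more concrete, the following Python sketch shows one way an interception-based fault injector could decide, for each value that passes through the communication infrastructure, whether to corrupt it. It is a simplified illustration under stated assumptions, not the FIU of this framework: the class and function names are hypothetical, the fault list is built in code rather than parsed from XML, and a transient fault is assumed to be active in the half-open interval [start, start + duration).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Fault:
    """Hypothetical fault description (illustration only)."""
    target: str                       # name of the variable to corrupt
    effect: Callable[[float], float]  # maps the original value to the faulty one
    start: float                      # activation instant
    duration: Optional[float] = None  # None means a permanent fault

    def active(self, now: float) -> bool:
        if now < self.start:
            return False
        return self.duration is None or now < self.start + self.duration


def stuck_at(value):
    """Fault effect that replaces any value with a fixed one."""
    return lambda _original: value


def offset(delta):
    """Fault effect that adds a constant offset to the value."""
    return lambda original: original + delta


class FaultInjectionUnit:
    """Filters every value exchanged through the communication layer."""

    def __init__(self, faults):
        self.faults = list(faults)

    def filter(self, variable: str, value, now: float):
        for fault in self.faults:
            if fault.target == variable and fault.active(now):
                return fault.effect(value)  # corrupted value
        return value                        # forwarded unchanged


# A transient "stuck at 0" fault on a variable named encoder1.
fiu = FaultInjectionUnit([Fault("encoder1", stuck_at(0), start=24.0, duration=3.0)])
during = fiu.filter("encoder1", 12.5, now=25.0)
before = fiu.filter("encoder1", 12.5, now=23.9)
```

Because the corruption happens in the filter rather than in the component code, the model of the SUT stays untouched, which is the property that makes this style of injection non-intrusive.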
The Simulated Fault Injection is driven by the FIU of the ATE. To do so, the tester must define the faults to inject in an XML file. The selected XML schema complies with the international ASAM AE HIL [24] standard, which defines the de facto interface to perform error simulation in hardware-in-the-loop testing. This facilitates the forward reuse of the fault injection campaigns until the final prototyping phase. The FIU parses the file during the initialization process, and injects the faults in the SUT during simulation by modifying variable values in the SUT. The corruption of variables takes place when they are read and written, i.e., during the silence interval of the sparse-time model.
The PI-TTM and PS-TTM libraries are key components to perform the fault injection in the SUT. Since all communication in the SUT goes through the communication infrastructure provided by the libraries, every time a component reads or writes a variable, the communication interfaces defined in the PI-TTM and PS-TTM send the data to the FIU. The latter decides whether it has to corrupt the current variable at a given instant by consulting the fault configuration. If the fault configuration specifies that a fault has to be injected, the FIU modifies the value according to the selected fault effect and sends the corrupted value to the communication infrastructure. Otherwise, it sends back the received value unchanged. This SFI approach is non-intrusive, since the whole SFI campaign is managed by the PI-TTM/PS-TTM libraries and the FIU, and thus the SUT does not need to be modified.
The FIU includes a fault library with the implementation of the fault effects shown in Tables I and II, which can be specified in the fault-configuration files in order to simulate different faults in the system. The FIU supports two different fault modes: permanent and transient. Permanent faults remain active until the simulation ends, whereas transient faults are
temporary misbehaviors whose duration is defined in the fault-configuration file.
TABLE I: Fault library for PI-TTM models
Datatypes: Boolean, Integer/Floating
Fault Model            Configuration attributes
Invert                 -
Stuck At               stuck value
Stuck                  -
Stuck If               stuck value, condition
Open Circuit           -
Constant               constant value
Amplification          ampl value
Amplification Range    max amp value, min ampl value
Drift                  drift value
Offset                 offset value
Offset Range           max offset value, min offset value
Stuck                  -
Random                 min value, max value
TABLE II: Fault library for PS-TTM models
Error Model     Description
Corruption      The functionality is performed incorrectly. The information provided in the interface is corrupted.
No execution    The functionality is not executed. No information is provided as a result.
Out of time     Time bounds of the functionality are not respected. Information is provided later/earlier than expected.
Babbling        Information in the interface is erroneous both in terms of content and time.
V. CASE STUDY
This section briefly describes the modeling of the European Train Control System (ETCS) at different abstraction levels using the approach presented in Section IV. ETCS constitutes the on-board unit of the European Railway Traffic Management System (ERTMS), which is a European Union backed initiative for the definition of a unique train signaling standard throughout Europe [25].
A. European Train Control System (ETCS)
The ETCS prevents over-speeding in high-speed trains by supervising the traveled distance and speed and activating an emergency brake when the train exceeds the authorized values. The safety requirements for the ETCS state that it shall be designed as a safety-critical embedded system for safety integrity level 4 (SIL-4). The ETCS is composed of several subsystems connected to the central safety processing unit called European Vital Computer (EVC). We can summarize the functionality of the EVC in four different tasks:
• Estimation of the speed and position of the train: The EVC reads the information provided by an accelerometer and two encoders located in two different wheels. With this information, the odometry subsystem of the EVC makes an estimation of the current speed and position of the train.
• Selection of the operation mode: The driver can choose between two operation modes. In Standby mode, the emergency brake of the train is activated and the service brake and warning signal are deactivated. In Supervision mode, the EVC supervises the current speed and position of the train and activates the warning and brakes when the maximum permitted speed values are exceeded.
• Emergency brake control: This task controls the activation and deactivation of the emergency brake of the train. When the system is in Standby mode the brake is activated. On the other hand, when the train is in Supervision mode, the estimated position and speed are compared to a pre-defined braking curve that sets a maximum speed for each position on the track. The emergency brake is activated when the estimated speed exceeds the speed set by the braking curve. The brake is only released when the train is stopped and the driver sends a reset command through the Driver Machine Interface (DMI).
• Service brake control: This task controls the activation and deactivation of the service brake and the warning signal. If the system is in Standby mode, the service brake and the warning are deactivated. However, in Supervision mode the estimated position and speed are compared to the service-brake and warning braking curves. The warning signal is activated when the speed of the train reaches the warning activation speed set by its braking curve, and analogously, the service brake is activated when the speed exceeds the service brake activation speed. Both the warning and the service brake are deactivated when the speed of the train falls below the warning activation speed.
B. Platform Independent Model
We develop the platform independent model of the ETCS in SystemC using the PI-TTM library. As shown in Figure 8, we design the SUT hierarchically. It contains two DASes: the EVC and the DMI. Four jobs are mapped to the EVC, one for each of the tasks previously defined. The DMI contains a single job that enables the driver to select the desired operation mode and release the emergency brake. The time-triggered ATE is integrated in the system as another DAS, and contains the Python modules for test execution: the TCI, the FIU and the TPM. The components shaded in gray in Figure 8 are automatically generated by the PI-TTM library when the system is modeled.
Fig. 8: Case Study: Platform independent model of ETCS with PI-TTM
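The service brake control task described in Section V-A behaves like a small state machine with hysteresis. The Python sketch below is a hypothetical illustration of that behavior and is not the SCADE design used in the case study. It assumes that the two braking curves have already been evaluated at the current position, giving `warning_limit` and `service_limit` with `warning_limit <= service_limit`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceBrakeState:
    warning: bool = False
    service_brake: bool = False


def service_brake_step(state, mode, speed, warning_limit, service_limit):
    """One periodic step of a simplified service brake controller.

    mode is "standby" or "supervision". The limits are the speeds given by
    the warning and service-brake braking curves at the current position.
    """
    if mode == "standby":
        # Standby: service brake and warning are deactivated.
        return ServiceBrakeState(False, False)
    if speed < warning_limit:
        # Both outputs are released once the speed falls below the warning limit.
        return ServiceBrakeState(False, False)
    # The speed has reached the warning limit: the warning is active.
    # The service brake engages above its own limit and then stays engaged.
    brake = state.service_brake or speed > service_limit
    return ServiceBrakeState(True, brake)


state = ServiceBrakeState()
trace = []
for speed in [50, 62, 71, 65, 58]:
    state = service_brake_step(state, "supervision", speed,
                               warning_limit=60, service_limit=70)
    trace.append((state.warning, state.service_brake))
```

In this trace the service brake engages at 71 and remains engaged at 65, because release only happens once the speed drops below the warning limit.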
Core 2 job serv
Comm. channel
job emerg
D. Results
C. c.
job mode
Comm. channel
Processor Core 1 job odo
Processor
Comm. channel
Node
Cluster
Node EVC Node EVC Node voter Node voter System
Cluster ATE
Node DMI Comm. channel
Cluster SUT
Comm. channel
Node EVC
Fig. 9: Case Study: Platform specific model of ETCS with PS-TTM
In this example we set the period of the system to 250ms. All jobs in the SUT have f requency = 1, so the LET of the jobs is 250ms. We design the functionality of the jobs in SCADE and integrate the automatically generated code in the jobs. C. Platform Specific Model Once we define and verify the functional system with the PI-TTM library, the model is refined into a platform specific model. Following the recommendations from the IEC-61508 safety standard, we choose a Triple Module Redundant (TMR) configuration for the ETCS platform, in order to achieve the SIL-4 safety requirement. Redundancy increases the robustness of the system against random faults, such as hardware faults. The introduction of redundancy in the system requires the implementation of voters in order to handle the redundant output values. Figure 9 shows the design of the platform specific model of the system with the PS-TTM library. The components in gray are given by the PS-TTM library. The SUT is defined as a cluster containing 6 nodes. The redundant EVCs are hosted in three identical nodes, whereas a fourth node hosts the DMI and the latest two nodes host two voters. The architecture of the EVC nodes contains a dual-core processor. One of the cores is the host for the safety-critical jobs, i.e., the odometry job, the mode control job and the emergency brake control job, whereas the other core hosts the service brake control job. Since the nodes for the DMI and the voters execute a unique job, they contain a single-core processor. The PS-TTM also integrates an ATE with the TCI, FIU and TPM modules mentioned before, what enables verifying the functionality of the system and validate the fault-tolerance mechanisms. We again design the functionality of the jobs with SCADE suite, and integrate the automatically generated code in the system.
Figure 10 shows the results captured by the TPM module when the system is simulated against a predefined test case. In the first simulation over the PI-TTM library (Figure 10a), the TCI feeds the SUT with the test case, and the system estimates the speed and activates the service brake and warning signal accordingly. In the second simulation (Figure 10b) we repeat the same test case but simulate a blocked wheel by injecting a “stuck at 0” fault in encoder 1 between t = 24.0 s and t = 27.0 s with the FIU. As the figure shows, the odometry system masks the fault and estimates the speed correctly, so the brakes and warning are activated as expected.
Figures 10c and 10d show the results of the simulations of the PS-TTM model. In the first simulation (Figure 10c) we repeat the previous test case and obtain similar results, with a small delay due to the execution time of the voters. In the second simulation we execute the same test case and inject a permanent “no-execution” fault in core 1 of EVC node A at instant t = 14.5 s, which mimics the destruction of the core. As Figure 10d shows, the speed estimated by node A gets stuck at t = 14.5 s, whereas the other nodes still estimate the speed correctly. The output of the system is still correct, since the voters mask the faulty node.
VI.
CONCLUSION
This paper presented an approach to model time-triggered safety-critical embedded systems following the Y-chart paradigm over logical execution time and time-triggered models of computation. We introduced the PI-TTM, a LET simulation engine built as an extension to the E-TTM that enables the simulation of platform independent LET models in SystemC, and the PS-TTM, another extension to the E-TTM that enables the design of platform specific E-TTM models including HW components. The deployment of LET and E-TTM based models onto time-triggered architecture platforms becomes straightforward. Moreover, relying on time-triggered models of computation reduces the number of system failures that need to be considered, since failures due to faulty synchronization or altered orders of execution are prevented by construction.
Besides, implementing the LET MoC as an extension to the E-TTM engine enables merging subsystems described in both models of computation. This enables simulating mixed-abstraction-level subsystems with fault injection, seamlessly combining descriptions of subsystems at the PIM and PSM design stages.
Finally, the framework integrates a time-triggered Automatic Test Executor (ATE) that enables non-intrusive simulated fault injection (SFI) in the PI-TTM and PS-TTM models. The ATE is synchronized with the SUT in order to make the tests and SFI experiments reproducible. The selected SFI configuration scheme is compliant with the international ASAM AE HIL standard, easing forward reuse of the SFI campaigns up to the final TTA prototypes.
ACKNOWLEDGMENT
This research work has been supported in part by the Spanish INNPACTO project VALMOD under grant number IPT-2011-1149-370000.
[Figure 10 plots omitted. Time axis: 0 to 70 s. Panels: (a) PI-TTM test-case results; (b) PI-TTM test-case results with SFI in encoder 1; (c) PS-TTM test-case results; (d) PS-TTM test-case results with SFI on the core 1 of Node A. Signals in (a) and (b): Enc1, Enc2, Estimated_V, Serv_Brake, Warning. Signals in (c) and (d): Enc1, Enc2, V_NodeA, V_NodeB, V_NodeC, Serv_Brake_VotA, Warning_VotA.]
Fig. 10: Simulation results of the PS-TTM model
REFERENCES
[1] J. Perez, M. Azkarate-askasua, and A. Perez, “Codesign and Simulated Fault Injection of Safety-Critical Embedded Systems Using SystemC,” in European Dependable Computing Conference, 2010, p. 9.
[2] F. Balarin, M. Chiodo, P. Giusto, H. Hsieh, A. Jurecska, L. Lavagno, C. Passerone, A. Sangiovanni-Vincentelli, E. Sentovich, K. Suzuki, and B. Tabbara, Hardware-software co-design of embedded systems: the POLIS approach. Kluwer Academic Publishers, 1997.
[3] B. Kienhuis, E. Deprettere, K. Vissers, and P. van der Wolf, “An approach for quantitative analysis of application-specific dataflow architectures,” in Application-Specific Systems, Architectures and Processors, 1997. Proceedings., IEEE International Conference on, 1997, pp. 338–349.
[4] J. Miller and J. Mukerji, “MDA Guide Version 1.0.1,” p. 62, 2003/06/12.
[5] C. Kirsch and A. Sokolova, The Logical Execution Time Paradigm. Springer Berlin Heidelberg, 2012, ch. 5, pp. 103–120.
[6] J. Perez, C. F. Nicolas, R. Obermaisser, and C. El Salloum, “Modeling Time-Triggered Architecture Based Real-Time Systems Using SystemC,” in Forum on specification & Design Languages (FDL) 2010, T. J. Kazmierski and A. Morawiec, Eds. Springer, 2010.
[7] “SCADE Suite.” [Online]. Available: http://www.ansys.com/Products/Simulation+Technology/Systems+&+Multiphysics/Systems+&+Embedded+Software/SCADE+Suite
[8] Object Management Group (OMG), “OMG Unified Modeling Language (OMG UML) 2.4.1,” 2011.
[9] ——, “OMG Systems Modeling Language (OMG SysML) 1.3,” 2012.
[10] SAE Aerospace, “SAE Architecture Analysis and Design Language (AADL),” 2009.
[11] C. Buckl, D. Sojer, and A. Knoll, “FTOS: Model-driven development of fault-tolerant automation systems,” in Emerging Technologies and Factory Automation (ETFA), 2010 IEEE Conference on, 2010, pp. 1–8.
[12] IEEE, “IEEE Standard SystemC Language Reference Manual,” p. 441, 2005.
[13] H. Kopetz and G. Bauer, “The Time-Triggered Architecture,” in Proceedings of the IEEE, vol. 91, 2003, p. 15.
[14] H. Kopetz, “The Time-Triggered Model of Computation,” in Real Time Systems Symposium, IEEE Computer Society, p. 16, 1998.
[15] T. Henzinger, B. Horowitz, and C. Kirsch, “Giotto: a time-triggered language for embedded programming,” Proceedings of the IEEE, vol. 91, no. 1, pp. 84–99, 2003.
[16] A. Benso and P. Prinetto, Fault Injection Techniques and Tools for Embedded Systems Reliability Evaluation. Kluwer Academic Publishers, 2003.
[17] A. Avizienis, J.-C. Laprie, B. Randell, and C. Landwehr, “Basic Concepts and Taxonomy of Dependable and Secure Computing,” IEEE Trans. Dependable Secur. Comput., vol. 1, no. 1, pp. 11–33, 2004.
[18] E. Jenn, J. Arlat, M. Rimen, J. Ohlsson, and J. Karlsson, “Fault injection into VHDL models: the MEFISTO tool,” in Fault-Tolerant Computing, 1994. FTCS-24. Digest of Papers., Twenty-Fourth International Symposium on, 1994, pp. 66–75.
[19] J. Gracia, J. C. Baraza, D. Gil, and P. J. Gil, “Comparison and application of different VHDL-based fault injection techniques,” in Defect and Fault Tolerance in VLSI Systems, 2001. Proceedings. 2001 IEEE International Symposium on, 2001, pp. 233–241.
[20] J. C. Baraza, J. Gracia, S. Blanc, D. Gil, and P. J. Gil, “Enhancement of Fault Injection Techniques Based on the Modification of VHDL Code,” Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. 16, no. 6, pp. 693–706, 2008.
[21] S. Misera, H. T. Vierhaus, and A. Sieber, “Fault Injection Techniques and their Accelerated Simulation in SystemC,” in Digital System Design Architectures, Methods and Tools, 2007. DSD 2007. 10th Euromicro Conference on, 2007, pp. 587–595.
[22] C. Bolchini, A. Miele, and D. Sciuto, “Fault Models and Injection Strategies in SystemC Specifications,” in Digital System Design Architectures, Methods and Tools, 2008. DSD ’08. 11th EUROMICRO Conference on, 2008, pp. 88–95.
[23] S. Reiter, M. Pressler, A. Viehl, O. Bringmann, and W. Rosenstiel, “Reliability assessment of safety-relevant automotive systems in a model-based design flow,” in Design Automation Conference (ASPDAC), 2013 18th Asia and South Pacific, 2013, pp. 417–422.
[24] A. H. workgroup, “ASAM AE HIL Programmers Guide,” p. 162, 2009.
[25] P. Winter, B. Guiot, and International Union of Railways, Compendium on ERTMS: European Rail Traffic Management System. Eurail Press, 2009.