Characterizing Intransitive Non-Interference in Security Policies with Observability∗
Nejib Ben Hadj-Alouane†, Stéphane Lafrance‡, Feng Lin§, John Mullins¶, and Moez Yeddes‖
October 2, 2004
Abstract
This paper introduces a new algorithmic approach to checking the property of intransitive non-interference (INI) using tools and concepts from discrete event systems (DES). The INI property is widely used in the formal verification of security problems in computer systems and protocols. The approach consists of two phases. First, a new property called iP-observability (observability based on a purge function) is introduced to capture INI. We prove that a system satisfies INI if and only if it is iP-observable. Second, a relation between iP-observability and P-observability (observability as used in DES) is established by transforming the automaton modeling a system/protocol into an automaton where P-observability (and hence iP-observability) can be determined. This allows us to check INI by checking P-observability, which can be done efficiently. Our approach can be used for all systems/protocols with three levels, which is sufficient for most non-interference problems for cryptographic protocols and systems. We also give examples to illustrate the application of our approach to cryptographic protocols and systems.
Keywords: Interference, Intransitive Non-interference, Security Policies, Formal Verification, Observability, Cryptographic Protocols.
∗ This research is supported in part by NSF under grants ITR-0082784 and INT-0213651, by NASA under grant NAG2-1043, and by NIH under grant 1 R21 EB001529-01A1.
† Department of Applied Computer Sciences, National School of Information Sciences, University of Manouba, Tunisia.
‡ Department of Computer Engineering, École Polytechnique de Montréal, Montréal, Quebec, Canada.
§ Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI 48202, USA, and School of Electronics and Information Engineering, Tongji University, Shanghai, China.
¶ Department of Computer Engineering, École Polytechnique de Montréal, Montréal, Quebec, Canada.
‖ Department of Applied Computer Sciences, National School of Information Sciences, University of Manouba, Tunisia.
1
Introduction
Security is a crucial property of system behavior. The need for secure protocols has grown rapidly during the past decade due to, on the one hand, the development and widespread use of computer networks and distributed systems, and on the other hand, the globalization of various forms of electronic communication, collaboration, and trade. Within this global context, security policies and protocols are mainly seen as tools for providing secure information exchange. While the information-security community has not yet reached a consensus on the exact meaning of the terms “security” or “confidentiality”, it is quite clear that both require strict control over the information flowing between different agents manipulating objects and acting within systems with multiple security levels [1]. In this regard, several information-flow security properties have been proposed. Non-interference [6, 7, 12, 21], with its different generalizations and forms, is one such property. Intuitively, non-interference forbids any causal dependency from a high-level action h to a lower-level action l. By causal dependency we mean that the dependent action cannot occur without the occurrence of the preceding action. Such a dependency creates an insecure channel, called a covert channel, capable of transmitting high-level information concerning h to any low-level agent observing l. In practice, however, many security problems go beyond the scope of simple non-interference. One such problem is confidentiality in multi-level security systems where the relation over the set of security levels that captures the allowed information flows is not transitive. For this reason, intransitive non-interference (INI) has been proposed in the literature [20]. A paradigmatic example for INI is a three-level security system modeling a cryptosystem. The three levels are: a private or high level H, a public or low level L, and a downgrading level D.
Such a system must ensure non-interference from private data m and an encryption key k to the public level, unless the information has previously been downgraded through the cryptosystem. The general schema of information flow properties is based on a process-algebraic notion of observation-dependent congruence [2]. The (global) operational characterization of this schema states that in any hostile environment, no condition observable from a point of view H will be observable from a point of view L, unless this observation has been previously downgraded from H to L (through a downgrading level D). A detailed study of INI is given in [20]. Other recent works concerning applications of INI to the verification of cryptoprotocols can be found in [9, 17, 5]. A formulation of INI within the context of process algebra is given in [13], in the form of a property called admissible interference (AI), which is verified using trace equivalence. When trace equivalence is replaced by bisimulation,
a property called bisimulation-based admissible interference (BNAI) [14] is obtained. Despite the significant progress made in understanding INI, prior to our work no rigorous algorithm had been proposed for checking INI based on necessary and sufficient conditions. For example, only a sufficient condition is given in [20] as a basis for checking INI. This paper fills this important gap using a new approach based on notions and techniques borrowed from the observability theory for DES. The theory of supervisory control of discrete event systems (DES) was introduced over twenty years ago [15, 11]. Since then, the properties of controllability and observability have been used as tools to characterize and solve many problems in diverse application domains: control and supervision applied to manufacturing [3], verification of communication protocols [19], and database systems [10], to name just a few. Observability, introduced by Lin and Wonham [11], is a property of information flow. It has played an important role within the theory of DES. So far, the definition of observability has been based on a static projection P, capturing the fact that a fixed subset of events is observable and the rest are not. This type of information flow is not appropriate for security problems. An important aim of this paper is to capture information-flow properties using observability that is based on a purge function iP rather than on the projection. The purge function erases events from a given sequence (or trace) of events, based on an allowable information flow lattice defined over the security levels of a given system. Thus, we extend P-observability (observability as used in DES) to iP-observability (observability based on a purge function). We prove that a system satisfies INI if and only if it is iP-observable.
To check iP-observability, we establish a relation between iP-observability and P-observability by transforming the automaton modeling a system/protocol into an automaton where P-observability can be determined. We prove that the system modeled by the first automaton is iP-observable if and only if the system modeled by the second automaton is P-observable. This transformation allows us to check INI by checking P-observability. Our approach can be used for all systems/protocols with three levels, which is sufficient for most non-interference problems for cryptographic protocols and systems. We also give examples to illustrate the application of our approach to cryptographic protocols and systems. Another way to describe our approach is to view the projection as “linear” and the purge function as “nonlinear”. Hence, our approach can be described as a “linearization” of the INI verification problem, set within a DES context to benefit from existing tools and concepts.
The paper is organized as follows. Section 2 presents the necessary DES background on automata, projections and observability. Section 3 provides the necessary background on multi-level security and characterizes INI in terms of observability using a purge function. Section 4 constructs the tools, algorithms, and the necessary lemmas for the verification of INI. Section 5 gives examples for applying our method to cryptographic protocols and systems. The conclusion is provided in Section 6.
2
DES Background
This section provides an overview of DES concepts. For brevity, we discuss only the concepts needed to develop the specific problems addressed in this paper. A more thorough discussion can be found in the literature [15, 16, 11, 4]. Recall that our interest is to characterize security properties using the notion of observability in DES. A DES is typically modeled by an automaton or state machine G, commonly called a generator, G = (Σ, X, δ, x0), where Σ denotes a set of events, X denotes a state space, δ denotes a transition function and x0 denotes the initial state. The (partial) transition function δ : X × Σ → X describes the system dynamics: given states x, y ∈ X and event σ ∈ Σ, δ(x, σ) = y if the execution of σ from state x takes the system to state y. Note that δ(x, σ) is undefined whenever the event σ cannot be executed from the state x. We often use the extended version of δ, also denoted by δ (for simplicity), δ : X × Σ∗ → X where, given states x, y ∈ X and trace (or string) t ∈ Σ∗, δ(x, t) = y if, starting at x, it is possible to execute the sequence of events specified by the trace t and this execution moves the system to y. The state machine G characterizes a language L(G), which is the set of traces generated by G, defined by
\[ L(G) \stackrel{\mathrm{def}}{=} \{ s \in \Sigma^* \mid \delta(x_0, s) \text{ is defined} \}. \]
Note that L(G) is prefix-closed, i.e., it equals its prefix closure (the set of all the prefixes of traces in L(G)), commonly denoted by $\overline{L(G)}$.
Given two DES generators G1 = (Σ1, X1, δ1, x1,0) and G2 = (Σ2, X2, δ2, x2,0), the parallel composition of G1 and G2 is the DES generator
\[ G_1 \parallel G_2 = (\Sigma_1 \cup \Sigma_2,\; X_1 \times X_2,\; \delta,\; (x_{1,0}, x_{2,0})) \]
where
\[ \delta((x_1, x_2), \sigma) = \begin{cases} (\delta_1(x_1, \sigma), \delta_2(x_2, \sigma)) & \text{if } \sigma \in \Sigma_1 \cap \Sigma_2, \\ (\delta_1(x_1, \sigma), x_2) & \text{if } \sigma \in \Sigma_1 \setminus \Sigma_2, \\ (x_1, \delta_2(x_2, \sigma)) & \text{if } \sigma \in \Sigma_2 \setminus \Sigma_1. \end{cases} \]
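To make the generator model and the parallel composition concrete, the following Python sketch may help. It is only an illustration under stated assumptions: the class name `Generator`, the function `parallel`, and the two toy generators are hypothetical and do not come from the paper; traces are written as strings of one-character events; and, unlike the definition above, only the reachable part of X1 × X2 is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Generator:
    events: frozenset   # event set Σ
    delta: dict         # partial transition function: (state, event) -> state
    x0: object          # initial state

    def run(self, trace):
        """Extended δ from x0: the state reached by `trace`, or None if undefined."""
        x = self.x0
        for e in trace:
            if (x, e) not in self.delta:
                return None
            x = self.delta[(x, e)]
        return x

    def accepts(self, trace):
        """Membership test for L(G)."""
        return self.run(trace) is not None


def parallel(g1, g2):
    """Parallel composition G1 || G2, restricted to reachable states (assumption)."""
    events = g1.events | g2.events
    x0 = (g1.x0, g2.x0)
    delta, stack, seen = {}, [x0], {x0}
    while stack:
        x1, x2 = stack.pop()
        for e in events:
            # A shared event moves both components; a private event moves one.
            n1 = g1.delta.get((x1, e)) if e in g1.events else x1
            n2 = g2.delta.get((x2, e)) if e in g2.events else x2
            if n1 is None or n2 is None:
                continue  # δ is undefined for this event
            nxt = (n1, n2)
            delta[((x1, x2), e)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return Generator(frozenset(events), delta, x0)


# Hypothetical example: "a" and "b" are private events, "c" is shared.
g1 = Generator(frozenset("ac"), {(0, "a"): 1, (1, "c"): 2}, 0)
g2 = Generator(frozenset("bc"), {(0, "b"): 1, (1, "c"): 2}, 0)
g = parallel(g1, g2)
```

In this toy composition the private events interleave freely, while the shared event `c` can occur only once both components are ready for it.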
In the supervisory control theory of DES, the control objective is to restrict the system behavior within a legal language. The control objective is achieved by a supervisor who can observe a subset of observable events, denoted by Σo, and can disable a subset of controllable events, denoted by Σc. The existence of such a supervisor is characterized by observability and controllability conditions [15, 11]. For the discussion of intransitive non-interference, we only need to use observability. To define observability, we first consider the projection P : Σ∗ → Σo∗. Given a string s ∈ Σ∗, P(s) erases all events in s that are not observable. The following definition of observability (called P-observability in this paper, i.e., observability based on the projection P) was introduced by Lin and Wonham [11].
Definition 1 (P-Observability). Let M ⊆ Σ∗ and Σc ⊆ Σ. A prefix-closed sublanguage K ⊆ M is P-observable w.r.t. (M, Σc) if
\[ \forall s, s' \in K\ \ \forall \sigma \in \Sigma_c:\quad \bigl(P(s) = P(s') \wedge s\sigma \in K \wedge s'\sigma \in M\bigr) \implies s'\sigma \in K. \]
3
Intransitive Non-interference in Multi-Level Systems
The concept of non-interference was introduced by Goguen and Meseguer [7] as a basis for specifying and analyzing security issues in computer systems and protocols. The basic idea behind non-interference can be simply stated as follows: the behavior of a given entity is said not to interfere with the behavior of a second entity whenever no action performed by the first can influence subsequent outputs seen by the second. Rushby [20] gives a formalization of non-interference, in terms of input/output automata, and introduces intransitive non-interference (INI). INI is an information flow property which extends non-interference and enables the specification of a generalized class of security policies dealing with channel control mechanisms. Roughly speaking, channel control mechanisms [20] require the following type of specification: given a system with three channels H, L and D, information is allowed to flow from H to L only after passing through D, but never directly (intransitivity). Here, channel D is seen as a downgrading channel (for example, an encryption mechanism). In terms of non-interference, the event stream generated by H is allowed to interfere with the event stream generated by L only through D events.
Our treatment of interference is based on event automata, instead of Rushby’s input/output automata [20]. However, it is well-known that these two types of automata are essentially equivalent. We are given a set 𝒟 of security domains and a set of events Σ partitioned over these domains. The operator dom : Σ → 𝒟 is used to capture this partition: to every domain U ∈ 𝒟, the set ΣU = {σ ∈ Σ | dom(σ) = U} specifies the events associated with U. The domains are interpreted to represent the security channels for which we will define non-interference requirements. We also consider an interference relation ⇝ ⊆ 𝒟 × 𝒟 defined over 𝒟: given domains U, U′, the intended meaning of the relation ⇝ is that the domain U is allowed to interfere with the domain U′ whenever U ⇝ U′. We write U ⇝̸ U′ whenever (U, U′) ∉ ⇝. We assume that our system, i.e., the combined behavior associated with all the domains, is modeled by a language K ⊆ Σ∗. Moreover, K is generated by the finite automaton G = (Σ, X, δ, x0), that is, K = L(G) (in particular, K is prefix-closed). Intuitively, INI can be understood from the following example.
Example 1. Consider a system with two domains: one is a high security domain (classified), the other is a low security domain (unclassified); i.e., 𝒟 = {H, L}. The non-interference relation is such that ⇝ = {(L, H)}. Let h ∈ ΣH and l ∈ ΣL, and consider the two automata given in Figure 1 and Figure 2 as possible specifications for system behavior. In the case of automaton G1 (Figure 1), the behavior of the system poses a problem: at the initial state, the L domain cannot execute the event l; but in its second state, after the execution of h by the H domain, the L domain can now execute l. Thus, the behavior generated by the H domain interferes with the behavior of the L domain (classified information is being leaked to the unclassified levels). This is contrary to the specification of the domain structure given above (i.e., H ⇝̸ L).
Such a dependency does not exist in automaton G2 (Figure 2).
Figure 1: Automaton G1, not satisfying non-interference.
Figure 2: Automaton G2, satisfying non-interference.
Next, consider a three-domain system, 𝒟 = {H, D, L}, where D is a downgrading domain. The non-interference relation is such that ⇝ = {(H, D), (D, L), (D, H), (L, D), (L, H)}, i.e., only H ⇝ L is not allowed (H ⇝̸ L). We consider the automata given in Figure 3 and Figure 4, and assume h ∈ ΣH, l ∈ ΣL and d ∈ ΣD. Automaton G3 (Figure 3) poses a problem with intransitive non-interference, since the event l is not possible following the trace hd, but becomes possible following hdh.
Figure 3: Automaton G3, not satisfying intransitive non-interference.
However, the system specified by automaton G4 (Figure 4) satisfies intransitive non-interference. For instance, note that the fact that ll is possible following hd and is not possible following h does not constitute an intransitive non-interference problem. According to the domain structure, H is allowed to interfere with L through D events. In other words, we only have the explicit requirement to preserve confidentiality of H with respect to L in between D events, but not across them.
Figure 4: Automaton G4, satisfying intransitive non-interference.
To formally define INI, a function iP on traces called intransitive purge is introduced by Rushby [20]. The purge of a trace with respect to a given domain removes, from the trace, every event from domains that are not allowed to interfere directly or indirectly with the given domain. The iP function is defined using the function sources : Σ∗ × 𝒟 → 2^𝒟 given as follows:
\[ \mathit{sources}(\varepsilon, U) = \{U\}, \quad\text{and}\quad \mathit{sources}(\sigma s, U) = \begin{cases} \mathit{sources}(s, U) \cup \{\mathit{dom}(\sigma)\} & \text{if } (\exists V \in \mathit{sources}(s, U))\ \mathit{dom}(\sigma) \leadsto V, \\ \mathit{sources}(s, U) & \text{otherwise.} \end{cases} \]
Intuitively, sources captures the set of domains which are allowed to interfere throughout the execution of a trace. This set of domains is determined backwards (i.e., starting from the end of the trace). Moreover, the fact that a given domain V is in sources(s, U) means either that V = U or there is a subsequence σ1, σ2, ..., σn of the trace s such that dom(σ1) ⇝ dom(σ2) ⇝ ... ⇝ dom(σn) with V = dom(σ1) and dom(σn) ⇝ U. Using the function sources, we consider an intransitive purge function iP : Σ∗ × 𝒟 → Σ∗ defined as follows:
\[ iP(\varepsilon, U) = \varepsilon, \quad\text{and}\quad iP(\sigma s, U) = \begin{cases} \sigma\, iP(s, U) & \text{if } \mathit{dom}(\sigma) \in \mathit{sources}(\sigma s, U), \\ iP(s, U) & \text{otherwise.} \end{cases} \]
Informally, iP is a string reduction function such that iP(s, U) consists of the subsequence obtained from s by removing every event belonging to any domain that should not interfere with the domain U. Based on the above definitions of sources(., .) and iP(., .), intransitive non-interference is defined as follows; this is essentially the definition of Rushby [20].
Definition 2 (Intransitive Non-Interference). A language K satisfies INI if
\[ \forall U \in \mathcal{D}\ \ \forall s \in K\ \ \forall \sigma \in \Sigma_U:\quad s\sigma \in K \iff iP(s, U)\sigma \in K. \tag{1} \]
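The recursive definitions of sources and iP, and condition (1), can be exercised on small finite languages. The Python sketch below is illustrative only and rests on several assumptions: all function names are hypothetical, traces are strings of one-character events, K is given explicitly as a finite prefix-closed set, and the relation ⇝ is used exactly as listed (no reflexive closure is added). The languages `K1` and `K3` encode G1 and G3 as the chains described in the text; `K2` is a hypothetical language that adds the trace l to `K1`.

```python
def sources(trace, u, dom, interferes):
    """sources(s, U), computed backwards from the end of the trace."""
    src = {u}
    for e in reversed(trace):
        if any((dom[e], v) in interferes for v in src):
            src = src | {dom[e]}
    return src


def ipurge(trace, u, dom, interferes):
    """iP(s, U): keep an event iff its domain is in sources of the suffix it starts."""
    src, kept = {u}, []
    for e in reversed(trace):
        if any((dom[e], v) in interferes for v in src):
            src = src | {dom[e]}
        if dom[e] in src:          # src now equals sources(σs, U)
            kept.append(e)
    return "".join(reversed(kept))


def prefix_closure(words):
    """All prefixes of the given words (a finite prefix-closed language)."""
    return {w[:i] for w in words for i in range(len(w) + 1)}


def satisfies_ini(K, dom, interferes):
    """Condition (1), checked exhaustively on a finite prefix-closed K."""
    for u in set(dom.values()):
        for s in K:
            for e in (e for e in dom if dom[e] == u):
                if (s + e in K) != (ipurge(s, u, dom, interferes) + e in K):
                    return False
    return True


# Two-domain policy of Example 1.
dom2, rel2 = {"h": "H", "l": "L"}, {("L", "H")}
K1 = prefix_closure({"hl"})           # G1: l is possible only after h
K2 = prefix_closure({"hl", "l"})      # hypothetical: l is possible with or without h

# Three-domain policy with downgrading domain D.
dom3 = {"h": "H", "d": "D", "l": "L"}
rel3 = {("H", "D"), ("D", "L"), ("D", "H"), ("L", "D"), ("L", "H")}
K3 = prefix_closure({"hdhl"})         # G3: l is possible after hdh but not after hd
```

For instance, `ipurge("hdh", "L", dom3, rel3)` keeps the prefix `hd` and drops the trailing `h`, which is why `K3` violates condition (1) for the event l.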
The following result leads to a characterization of INI in terms of an observability property using the iP reduction.
Lemma 1. Language K satisfies INI if and only if
\[ \forall U \in \mathcal{D}\ \ \forall s_1, s_2 \in K\ \ \forall \sigma \in \Sigma_U:\quad \bigl(iP(s_1, U) = iP(s_2, U) \wedge s_1\sigma \in K\bigr) \implies s_2\sigma \in K. \tag{2} \]
Proof. Let us first prove that condition (1) implies condition (2). For all U ∈ 𝒟, for all s1, s2 ∈ K, and for all σ ∈ ΣU, by condition (1),
\[ \begin{aligned} & iP(s_1, U) = iP(s_2, U) \wedge s_1\sigma \in K \\ \implies\ & iP(s_1, U) = iP(s_2, U) \wedge iP(s_1, U)\sigma \in K \\ \implies\ & iP(s_1, U) = iP(s_2, U) \wedge iP(s_2, U)\sigma \in K \\ \implies\ & iP(s_1, U) = iP(s_2, U) \wedge s_2\sigma \in K \\ \implies\ & s_2\sigma \in K. \end{aligned} \]
Let us now prove that condition (2) implies condition (1). For all U ∈ 𝒟, for all s ∈ K, and for all σ ∈ ΣU, let s1 = s and s2 = iP(s, U). Clearly iP(s1, U) = iP(s2, U). By condition (2),
\[ s\sigma \in K \implies iP(s_1, U) = iP(s_2, U) \wedge s_1\sigma \in K \implies s_2\sigma \in K \implies iP(s, U)\sigma \in K. \]
On the other hand, let s1 = iP(s, U) and s2 = s. By condition (2),
\[ iP(s, U)\sigma \in K \implies iP(s_1, U) = iP(s_2, U) \wedge s_1\sigma \in K \implies s_2\sigma \in K \implies s\sigma \in K. \]
We now introduce the following observability property, which can be used to characterize intransitive non-interference. This property, called iP-observability, is inspired by the observability introduced by Lin and Wonham [11] as defined in Definition 1.
Definition 3 (iP-observability). Let M ⊆ Σ∗ and ΣU ⊆ Σ. A prefix-closed sublanguage K ⊆ M is iP-observable w.r.t. (M, ΣU) if
\[ \forall s, s' \in K\ \ \forall \sigma \in \Sigma_U:\quad \bigl(iP(s, U) = iP(s', U) \wedge s\sigma \in K \wedge s'\sigma \in M\bigr) \implies s'\sigma \in K. \]
Theorem 1. A language K satisfies INI if and only if K is iP-observable w.r.t. (Σ∗, ΣU), for all U ∈ 𝒟.
Proof. The proof is immediate from Definition 3 and Lemma 1.
This theorem states that the problem of verifying INI in a multi-level system can be reduced to the problem of checking iP-observability. We develop an algorithmic approach to check iP-observability in the next section.
4
P -Observability vs iP -Observability
In the remainder of this paper, we restrict our attention to systems and protocols with only three security domains H, L and D, governed by the following non-interference relation: ⇝ = {(H, D), (D, L), (D, H), (L, D), (L, H)}.
Within the context of a three-domain system, only domain L can pose an interference problem. Therefore, language K satisfies INI if and only if K is iP-observable w.r.t. (Σ∗, ΣL). Furthermore, we write iP(s) instead of iP(s, L) since we have to verify the iP-observability of K only w.r.t. (Σ∗, ΣL). In this section, we will show that by properly defining a new language KiP based on the original language K, we can characterize iP-observability of K in terms of the P-observability of KiP. In this way, the verification of the INI property amounts to checking P-observability, for which efficient algorithms already exist in the literature. The intuitive idea behind the transformation can be explained as follows. After transformation, observability shall be considered from the viewpoint of the low level, that is, events corresponding to ΣL are observable and events corresponding to ΣH are unobservable. Events in ΣD represent downgrading and shall be treated in a special way to reflect this. Our approach is to replace each trace in K that ends with an event in ΣD by a single event that is assumed to be observable. Because there may be infinitely many traces in K ending with an event in ΣD, we use the notion of a minimal trace to obtain a finite number of such new events, for a regular language K. It should be noted that the ΣD events are removed and replaced during the transformation. We first define the language KiP and show the equivalence between the iP-observability of K and the P-observability of KiP. We then construct an iP-quotient automaton GiP (generating KiP) from the automaton G (generating K). The automaton GiP is used in checking the P-observability of KiP.
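For this particular three-domain relation, unfolding the recursive definition of iP suggests a simple closed form for iP(s) = iP(s, L): every event up to and including the last ΣD event is kept, and after that event only ΣL events survive. The Python sketch below records this reading and uses it for a brute-force iP-observability check on a finite language. The closed form is a derivation made here rather than a statement from the paper, the function names are hypothetical, traces are strings of one-character events, and `K_ok` is a made-up example language.

```python
def ipurge_low(trace, dom):
    """iP(s) = iP(s, L) for the three-domain policy (closed form, an assumption).

    Everything up to and including the last D event is kept; in the remaining
    suffix, H events are erased and L events are kept.
    """
    last_d = max((i for i, e in enumerate(trace) if dom[e] == "D"), default=-1)
    head, tail = trace[:last_d + 1], trace[last_d + 1:]
    return head + "".join(e for e in tail if dom[e] == "L")


def prefix_closure(words):
    """All prefixes of the given words (a finite prefix-closed language)."""
    return {w[:i] for w in words for i in range(len(w) + 1)}


def is_ip_observable(K, dom):
    """iP-observability of a finite prefix-closed K w.r.t. (Σ*, Σ_L)."""
    low = [e for e in dom if dom[e] == "L"]
    for s1 in K:
        for s2 in K:
            if ipurge_low(s1, dom) != ipurge_low(s2, dom):
                continue
            for e in low:
                if s1 + e in K and s2 + e not in K:
                    return False
    return True


dom = {"h": "H", "d": "D", "l": "L"}
K3 = prefix_closure({"hdhl"})               # G3 from Section 3
K_ok = prefix_closure({"hl", "l", "hdl"})   # hypothetical language satisfying INI
```

On `K3`, the traces hd and hdh have the same purge but disagree on whether l can follow, so the check fails; on `K_ok` no such pair exists.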
4.1
iP -Quotient Language
We now construct the language KiP by transforming the given language K. We start by introducing the notion of a minimal subtrace for a given trace. Intuitively, a minimal subtrace, with respect to G, of a given trace in L(G), is obtained by removing any events involved in loops in G. In this manner, we can obtain a subtrace (still in L(G)) “representing the original trace” that does not go through any loop in G. Our goal is to use the set of minimal subtraces, which is finite, to classify a possibly infinite sublanguage of L(G). However, removing events from a given trace to obtain a minimal subtrace as specified above can be done in many different ways; i.e., the notion of minimal subtrace is not unique, as shown in the following example.
Example 2. Consider the state machine G of Figure 5. For s = αγβλ, both αλ and βλ are minimal subtraces.
Figure 5: Minimal subtrace is not unique.
We therefore adopt the following constructive definition, which associates with every given trace a unique minimal subtrace.
Definition 4. For every trace $s \in K = L(G)$, the minimal subtrace $\bar{s}$ (with respect to $G$) is obtained from $s$ by repeatedly, starting from the first event of the trace, removing any subsequence of contiguous events generated by a loop in $G$. Explicitly, if $s = \sigma_1 \ldots \sigma_{i-1}\sigma_i \ldots \sigma_j\sigma_{j+1} \ldots \sigma_n$ and if $\sigma_1 \ldots \sigma_j$ is the smallest prefix of $s$ that has a loop, $\sigma_i \ldots \sigma_j$, at the end, then the loop is removed to obtain the new trace $s' = \sigma_1 \ldots \sigma_{i-1}\sigma_{j+1} \ldots \sigma_n$. Note that $s' \in K = L(G)$. The above process is repeated until the minimal subtrace $\bar{s}$ is obtained.
The following lemma states an immediate property of minimal subtraces used in results that follow.
Lemma 2. $\forall s \in K\ \forall t \in \Sigma^*\ (st \in K \Leftrightarrow \bar{s}t \in K)$.
Proof. Follows from the fact that $s$ and $\bar{s}$ always end in the same state of $G$.
To find the set of all minimal subtraces of $K$ with respect to $G$, we construct an acyclic automaton $G'$ that is the depth-first expansion of $G$ as follows. Let $K^n$ be the set of traces in $K$ of length at most $n$, where $n$ is the cardinality of the state space of $G$ (that is, $n = |X|$). Then, $G'$ is defined as follows:
\[ G' \stackrel{\mathrm{def}}{=} (\Sigma, X', \delta', x'_0) = Ac(\Sigma,\; X \times K^n \times 2^X,\; \delta',\; (x_0, \varepsilon, \{x_0\})), \]
where the operator $Ac$ gives the accessible (reachable) part of $G'$; the transition function $\delta'$ is defined recursively, starting from the initial state, as follows. For the initial state $(x_0, \varepsilon, \{x_0\})$ and $\sigma \in \Sigma$,
\[ \delta'((x_0, \varepsilon, \{x_0\}), \sigma) = \begin{cases} (\delta(x_0, \sigma),\ \sigma,\ \{x_0, \delta(x_0, \sigma)\}) & \text{if } \delta(x_0, \sigma) \text{ is defined and } \delta(x_0, \sigma) \neq x_0, \\ \text{undefined} & \text{otherwise.} \end{cases} \]
For a state $(x, s, Y)$ already in the range of $\delta'$ and $\sigma \in \Sigma$,
\[ \delta'((x, s, Y), \sigma) = \begin{cases} (\delta(x, \sigma),\ s\sigma,\ Y \cup \{\delta(x, \sigma)\}) & \text{if } \delta(x, \sigma) \text{ is defined and } \delta(x, \sigma) \notin Y, \\ \text{undefined} & \text{otherwise.} \end{cases} \]
It is important to note that $G'$ has the following properties.
Proposition 1.
1. $L(G')$ is the set of all minimal subtraces of $K$ with respect to $G$, i.e., $L(G') = \{\bar{s} \mid s \in K\}$.
2. The cardinality of $L(G')$ is bounded by $|X||\Sigma|^{|X|}$, i.e., $|L(G')| \le |X||\Sigma|^{|X|}$, where $|\Sigma|$ and $|X|$ are, respectively, the cardinalities of the event set and state space of $G$.
3. $G'$ can be computed with a complexity of $O(|\Sigma|^{|X|})$.
Proof. 1. Note that a state $(x, s, Y)$ of $G'$ consists of the current state $x$, the current string $s$, and the set $Y$ of states visited by $s$. $\delta'((x, s, Y), \sigma)$ is defined if and only if $\delta(x, \sigma)$ is defined (i.e., $s\sigma \in K$) and $\delta(x, \sigma) \notin Y$ (i.e., $s\sigma$ does not form a loop). Therefore, every string generated by $G'$ is a minimal subtrace. On the other hand, in the construction of $G'$, all strings of length $\le n\ (= |X|)$ are examined. Any string of length $> n$ cannot be a minimal subtrace because it must form a loop. Therefore, $G'$ generates all minimal subtraces of $K$.
2. Since the length of any string in $L(G')$ is bounded by $|X|$, $L(G') \subseteq \Sigma^{\le |X|}$, where $\Sigma^{\le |X|}$ is the set of all possible traces of length at most $|X|$. Let
\[ L_{max}(G') = \{ s \in L(G') \mid (\forall t \in L(G'))\ s \neq t \Rightarrow s \text{ is not a prefix of } t \} \]
be the set of strings in $L(G')$ that are not a proper (or strict) prefix of another string in $L(G')$. Clearly, $L(G') \subseteq \Sigma^{\le |X|}$ implies $|L_{max}(G')| \le |\Sigma|^{|X|}$. For each $s \in L_{max}(G')$, there are $|s|$ prefixes. Therefore, $|L(G')| \le |X||\Sigma|^{|X|}$.
3. $G'$ can be computed with a complexity of $O(|\Sigma|^{|X|})$ because there are at most $|X||\Sigma|^{|X|}$ strings to be examined.
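The loop-removal procedure of Definition 4 and the depth-first expansion that yields the set of minimal subtraces can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, the transition function is a dictionary, and the automaton `fig5` is only one reading of Example 2 (α and β both lead from state 0 to state 1, γ leads back from 1 to 0, and λ leads from 1 to 2), since the figure itself is not reproduced here.

```python
def minimal_subtrace(delta, x0, trace):
    """Remove loops in the order they close, as in Definition 4."""
    states, events = [x0], []          # states[k] is reached after events[:k]
    for e in trace:
        nxt = delta[(states[-1], e)]   # raises KeyError if trace is not in L(G)
        if nxt in states:
            # The smallest prefix closing a loop ends here: cut the loop out.
            k = states.index(nxt)
            del states[k + 1:]
            del events[k:]
        else:
            states.append(nxt)
            events.append(e)
    return "".join(events)


def minimal_subtraces(delta, x0):
    """L(G'): traces of G that never revisit a state (depth-first expansion)."""
    result, stack = set(), [(x0, "", frozenset([x0]))]
    while stack:
        x, s, visited = stack.pop()    # a state (x, s, Y) of G'
        result.add(s)
        for (src, e), nxt in delta.items():
            if src == x and nxt not in visited:
                stack.append((nxt, s + e, visited | {nxt}))
    return result


# Assumed reading of the automaton in Example 2.
fig5 = {(0, "α"): 1, (1, "γ"): 0, (0, "β"): 1, (1, "λ"): 2}
```

Under this reading, the first loop closed by αγβλ is αγ, so the constructive definition selects βλ rather than αλ.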
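The event set $\Sigma_{iP}$ of the transformed language $K_{iP}$ is then given as follows, based on the above notion of minimal subtraces.
Definition 5. $\Sigma_{iP} = \Sigma_{L_{iP}} \cup \Sigma_{H_{iP}} \cup \Sigma_{D_{iP}}$ where
• $\Sigma_{L_{iP}} = \{[\sigma] \mid \sigma \in \Sigma_L\}$;
• $\Sigma_{H_{iP}} = \{[\sigma] \mid \sigma \in \Sigma_H\}$;
• $\Sigma_{D_{iP}} = \{[s'\sigma] \mid s' = \bar{s} \wedge s\sigma \in K \wedge \sigma \in \Sigma_D\}$.
Note that $[\,.\,]$ is used to denote the new events of $K_{iP}$ based on the symbols of $K$’s events. In the above definition, the event sets $\Sigma_{L_{iP}}$ and $\Sigma_{H_{iP}}$ are computed by renaming the corresponding $\Sigma_L$ and $\Sigma_H$ events. The event set $\Sigma_{D_{iP}}$ is computed by finding all traces in $G'$ that can be appended with a $\Sigma_D$ event:
\[ \Sigma_{D_{iP}} = \{[s'\sigma] \mid s' \in L(G') \wedge \sigma \in \Sigma_D \wedge s'\sigma \in L(G)\} = \{[s'\sigma] \mid s'\sigma \in L(G')\Sigma_D \cap L(G)\}. \]
Note that $\Sigma_{D_{iP}}$ can be computed based on $L(G')\Sigma_D \cap L(G)$ with a worst-case complexity of $O(|\Sigma|^{|X|})$.
Definition 6. Consider the operator $\langle\cdot\rangle : K \to \Sigma_{iP}^*$ defined inductively as follows:
\[ \langle\varepsilon\rangle = \varepsilon, \quad\text{and}\quad \langle s\sigma\rangle = \begin{cases} \langle s\rangle[\sigma] & \text{if } \sigma \in \Sigma_L \cup \Sigma_H, \\ [\bar{s}\sigma] & \text{if } \sigma \in \Sigma_D. \end{cases} \]
The operator $\langle\cdot\rangle$ transforms a trace $s = \sigma_1 \ldots \sigma_i\sigma_{i+1} \ldots \sigma_n$ from $K$, where $\sigma_i$ is the last $\Sigma_D$ event in $s$, into a new trace $\langle s\rangle = [\overline{\sigma_1 \ldots \sigma_{i-1}}\,\sigma_i][\sigma_{i+1}] \ldots [\sigma_n]$ in $\Sigma_{iP}^*$. Note that $[\overline{\sigma_1 \ldots \sigma_{i-1}}\,\sigma_i] \in \Sigma_{D_{iP}}$ and $[\sigma_{i+1}], \ldots, [\sigma_n] \in \Sigma_{H_{iP}} \cup \Sigma_{L_{iP}}$. If $s \notin K$, then $\langle s\rangle$ is undefined. We now define the iP-quotient language of $K$ as follows.
Definition 7. For $K \subseteq \Sigma^*$, consider the following language over $\Sigma_{iP}^*$:
\[ K_{iP} = \{\langle t\rangle \in \Sigma_{iP}^* \mid t \in K\}. \]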
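We sometimes also use $\langle K\rangle$ to denote this language, that is, $\langle K\rangle = K_{iP}$. The “inverse” of $\langle\cdot\rangle$, denoted by $\tilde{\cdot}$, is defined below.
Definition 8. Consider the operator $\tilde{\cdot} : \Sigma_{iP}^* \to \Sigma^*$ defined inductively as follows:
\[ \tilde{\varepsilon} = \varepsilon, \quad\text{and}\quad \widetilde{s\sigma} = \begin{cases} \tilde{s}\sigma' & \text{if } \sigma = [\sigma'] \in \Sigma_{L_{iP}} \cup \Sigma_{H_{iP}}, \\ s'\sigma' & \text{if } \sigma = [s'\sigma'] \in \Sigma_{D_{iP}}. \end{cases} \]
The operator $\tilde{\cdot}$ transforms a trace $s = [\sigma'_1] \ldots [\sigma'_m][s'\sigma][\sigma_1] \ldots [\sigma_n]$ from $\Sigma_{iP}^*$ into a trace $\tilde{s} = s'\sigma\sigma_1 \ldots \sigma_n$ in $\Sigma^*$. Note that we define $\tilde{\cdot}$ from $\Sigma_{iP}^*$ to $\Sigma^*$, not just from $K_{iP}$ to $\Sigma^*$. If $s \in K_{iP}$, then $s$ must have the form $s = [s'\sigma][\sigma_1] \ldots [\sigma_n]$ or $s = [\sigma_1] \ldots [\sigma_n]$, where $[s'\sigma] \in \Sigma_{D_{iP}}$ and $[\sigma_i] \in \Sigma_{L_{iP}} \cup \Sigma_{H_{iP}}$.
Definition 9. For $K_{iP} \subseteq \Sigma_{iP}^*$, consider the following language over $\Sigma^*$:
\[ \widetilde{K_{iP}} = \{\tilde{t} \in \Sigma^* \mid t \in K_{iP}\}. \]
The following lemma states the relation between $K$ and $\widetilde{K_{iP}} = \widetilde{\langle K\rangle}$.
Lemma 3. $\widetilde{K_{iP}} \subseteq K$.
Proof. If $s \in \widetilde{K_{iP}} = \widetilde{\langle K\rangle}$, then there exists $t \in K$ such that $s = \widetilde{\langle t\rangle}$. Consider two possibilities.
1. $t$ does not contain any event in $\Sigma_D$. In this case, $s = t \in K$.
2. $t$ contains at least one event in $\Sigma_D$. In this case, $t = t_1\sigma t_2$, where $\sigma$ is the last $\Sigma_D$ event in $t$. Hence $s = \bar{t_1}\sigma t_2$. By Lemma 2, $t = t_1\sigma t_2 \in K$ implies $s = \bar{t_1}\sigma t_2 \in K$.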
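The two operators of Definitions 6 and 8 can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, a new event [w] is represented by the string w, a trace over the new event set is a tuple of such strings, and the automaton is a made-up example with self-loops on h. The loop-removal helper is repeated so that the block runs on its own.

```python
def minimal_subtrace(delta, x0, trace):
    """Loop removal of Definition 4 (cut back whenever a state repeats)."""
    states, events = [x0], []
    for e in trace:
        nxt = delta[(states[-1], e)]
        if nxt in states:
            k = states.index(nxt)
            del states[k + 1:]
            del events[k:]
        else:
            states.append(nxt)
            events.append(e)
    return "".join(events)


def angle(delta, x0, dom, trace):
    """The operator of Definition 6; assumes `trace` is in K = L(G)."""
    last_d = max((i for i, e in enumerate(trace) if dom[e] == "D"), default=-1)
    out = []
    if last_d >= 0:
        # One new event: minimal subtrace of the prefix, then the last D event.
        out.append(minimal_subtrace(delta, x0, trace[:last_d]) + trace[last_d])
    out.extend(trace[last_d + 1:])     # every later event σ becomes [σ]
    return tuple(out)


def tilde(dom, itrace):
    """The operator of Definition 8: restart at each new event ending in a D event."""
    s = ""
    for label in itrace:
        s = label if dom[label[-1]] == "D" else s + label
    return s


def in_language(delta, x0, trace):
    """Membership test for L(G)."""
    x = x0
    for e in trace:
        if (x, e) not in delta:
            return False
        x = delta[(x, e)]
    return True


# Hypothetical automaton: h loops at states 0 and 1, d moves 0 -> 1, l moves 1 -> 2.
dom = {"h": "H", "d": "D", "l": "L"}
delta = {(0, "h"): 0, (0, "d"): 1, (1, "h"): 1, (1, "l"): 2}
image = angle(delta, 0, dom, "hhdhl")
```

Here the prefix hh before the last d consists only of loops, so it collapses to the empty trace and the whole prefix hhd is represented by the single new event [d]; applying `tilde` to the result gives dhl, which is again a trace of the automaton, in line with Lemma 3.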
We now consider the following projection $P : \Sigma_{iP}^* \to (\Sigma_{L_{iP}} \cup \Sigma_{D_{iP}})^*$. The intuitive idea is to replace the iP reduction by the projection $P$, so that the observability theory developed in the supervisory control of DES can be applied. For this purpose, we assume that the only observable events are the events in $\Sigma_{L_{iP}}$ (events obtained from the domain L) and the events in $\Sigma_{D_{iP}}$ (new events which are traces in $K$ ended by a downgrading event that can interfere with domain L). The following lemma translates the projection $P$ of a trace $s \in \Sigma_{iP}^*$ into an iP reduction of a trace $\tilde{s} \in \Sigma^*$.
Lemma 4. For any $s \in \Sigma_{iP}^*$, $\tilde{t} = iP(\tilde{s})$ whenever $t = P(s)$.
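Proof. We proceed by induction on the length of $s \in \Sigma_{iP}^*$. The case of $s = \varepsilon$ is immediate. For induction, let $s = s'\sigma$. By the induction hypothesis, we have $\tilde{t'} = iP(\tilde{s'})$ whenever $t' = P(s')$. Consider three possible cases.
Case 1. $\sigma = [\sigma'] \in \Sigma_{H_{iP}}$ (that is, $\sigma' \in \Sigma_H$): We need to prove that if $P(s'\sigma) = t$ then $iP(\widetilde{s'\sigma}) = \tilde{t}$. Since $\sigma = [\sigma'] \in \Sigma_{H_{iP}}$, by the definitions of $P$ and $iP$, $P(s'\sigma) = P(s') = t' = t$ and $iP(\widetilde{s'\sigma}) = iP(\tilde{s'}\sigma') = iP(\tilde{s'})$. By the induction hypothesis, $\tilde{t'} = iP(\tilde{s'})$. Hence, $iP(\widetilde{s'\sigma}) = iP(\tilde{s'}) = \tilde{t'} = \tilde{t}$.
Case 2. $\sigma = [\sigma'] \in \Sigma_{L_{iP}}$ (that is, $\sigma' \in \Sigma_L$): We need to prove that if $P(s'\sigma) = t$ then $iP(\widetilde{s'\sigma}) = \tilde{t}$. Since $\sigma = [\sigma'] \in \Sigma_{L_{iP}}$, by the definitions of $P$ and $iP$, $P(s'\sigma) = P(s')\sigma = t'\sigma = t$ and $iP(\widetilde{s'\sigma}) = iP(\tilde{s'}\sigma') = iP(\tilde{s'})\sigma'$. By the induction hypothesis, $\tilde{t'} = iP(\tilde{s'})$. Hence, $iP(\widetilde{s'\sigma}) = iP(\tilde{s'})\sigma' = \tilde{t'}\sigma' = \widetilde{t'\sigma} = \tilde{t}$.
Case 3. $\sigma = [s''\sigma'] \in \Sigma_{D_{iP}}$ (that is, $\sigma' \in \Sigma_D$): We need to prove that if $P(s'\sigma) = t$ then $iP(\widetilde{s'\sigma}) = \tilde{t}$. Since $\sigma = [s''\sigma'] \in \Sigma_{D_{iP}}$, by the definitions of $P$ and $iP$, $P(s'\sigma) = P(s'[s''\sigma']) = P(s')[s''\sigma'] = t$ and $iP(\widetilde{s'\sigma}) = iP(\widetilde{s'[s''\sigma']}) = iP(s''\sigma') = s''\sigma'$. By the definition of $\tilde{\cdot}$, $\widetilde{P(s')[s''\sigma']} = s''\sigma'$. Hence $iP(\widetilde{s'\sigma}) = s''\sigma' = \widetilde{P(s')[s''\sigma']} = \tilde{t}$.
The following lemma translates the iP reduction of a trace $s \in K$ into a $P$ projection of a trace $\langle s\rangle \in K_{iP}$.
Lemma 5. For any $s \in K$, $\langle t\rangle = P(\langle s\rangle)$ whenever $t = iP(s)$.
Proof. We proceed by induction on the length of $s \in K$. The case of $s = \varepsilon$ is immediate. For induction, let $s = s'\sigma$. By the prefix closure of $K$, we have $s' \in K$. By the induction hypothesis, we have $\langle t'\rangle = P(\langle s'\rangle)$ whenever $t' = iP(s')$. Consider three possible cases.
Case 1. $\sigma \in \Sigma_H$. We need to prove that if $iP(s'\sigma) = t$, then $P(\langle s'\sigma\rangle) = \langle t\rangle$. Since $\sigma \in \Sigma_H$, by the definitions of $P$ and $iP$, $iP(s'\sigma) = iP(s') = t' = t$ and $P(\langle s'\sigma\rangle) = P(\langle s'\rangle[\sigma]) = P(\langle s'\rangle)$. By the induction hypothesis, $\langle t'\rangle = P(\langle s'\rangle)$. Hence, $P(\langle s'\sigma\rangle) = P(\langle s'\rangle) = \langle t'\rangle = \langle t\rangle$.
Case 2. $\sigma \in \Sigma_L$. We need to prove that if $iP(s'\sigma) = t$, then $P(\langle s'\sigma\rangle) = \langle t\rangle$. Since $\sigma \in \Sigma_L$, by the definitions of $P$ and $iP$, $iP(s'\sigma) = iP(s')\sigma = t'\sigma = t$ and $P(\langle s'\sigma\rangle) = P(\langle s'\rangle[\sigma]) = P(\langle s'\rangle)[\sigma]$. By the induction hypothesis, $\langle t'\rangle = P(\langle s'\rangle)$. Hence, $P(\langle s'\sigma\rangle) = P(\langle s'\rangle)[\sigma] = \langle t'\rangle[\sigma] = \langle t'\sigma\rangle = \langle t\rangle$.
Case 3. $\sigma \in \Sigma_D$. We need to prove that if $iP(s'\sigma) = t$, then $P(\langle s'\sigma\rangle) = \langle t\rangle$. Since $\sigma \in \Sigma_D$, by the definitions of $P$ and $iP$, $P(\langle s'\sigma\rangle) = P([\bar{s'}\sigma]) = [\bar{s'}\sigma]$ and $iP(s'\sigma) = s'\sigma = t$. By the definition of $\langle\cdot\rangle$, $\langle s'\sigma\rangle = [\bar{s'}\sigma]$. Hence $P(\langle s'\sigma\rangle) = [\bar{s'}\sigma] = \langle s'\sigma\rangle = \langle t\rangle$.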
The following two lemmas state that the $P$ projection on $K_{iP}$ is equivalent to the iP reduction on $K$. These lemmas will be used in the proof of Theorem 2.
Lemma 6. For any $s_1, s_2 \in K_{iP}$, $iP(\tilde{s_1}) = iP(\tilde{s_2})$ whenever $P(s_1) = P(s_2)$.
Proof. Since $P(s_1) = P(s_2)$, by Lemma 4, we have $iP(\tilde{s_1}) = \widetilde{P(s_2)}$ and $iP(\tilde{s_2}) = \widetilde{P(s_1)}$ (we consider first $t = P(s_2)$ and then $t = P(s_1)$). Also, $P(s_1) = P(s_2)$ implies $\widetilde{P(s_1)} = \widetilde{P(s_2)}$. Hence, $iP(\tilde{s_1}) = iP(\tilde{s_2})$.
Lemma 7. For any $s_1, s_2 \in K$, $P(\langle s_1\rangle) = P(\langle s_2\rangle)$ whenever $iP(s_1) = iP(s_2)$.
Proof. Since $iP(s_1) = iP(s_2)$, by Lemma 5, we have $P(\langle s_1\rangle) = \langle iP(s_2)\rangle$ and $P(\langle s_2\rangle) = \langle iP(s_1)\rangle$ (we consider first $t = iP(s_2)$ and then $t = iP(s_1)$). Moreover, $iP(s_1) = iP(s_2)$ implies $\langle iP(s_1)\rangle = \langle iP(s_2)\rangle$. Hence, $P(\langle s_1\rangle) = P(\langle s_2\rangle)$.
Now we can present the main result of this section: iP-observability of $K$ is equivalent to P-observability of $K_{iP}$.
Theorem 2. $K$ is iP-observable w.r.t. $(\Sigma^*, \Sigma_L)$ if and only if $K_{iP}$ is P-observable w.r.t. $(\Sigma_{iP}^*, \Sigma_{L_{iP}})$.
Proof. Recall that $K$ is iP-observable w.r.t. $(\Sigma^*, \Sigma_L)$ if and only if
\[ \forall s, s' \in K\ \ \forall \sigma \in \Sigma_L:\quad \bigl(iP(s) = iP(s') \wedge s\sigma \in K\bigr) \implies s'\sigma \in K, \tag{3} \]
and $K_{iP}$ is P-observable w.r.t. $(\Sigma_{iP}^*, \Sigma_{L_{iP}})$ if and only if
\[ \forall t, t' \in K_{iP}\ \ \forall [\sigma] \in \Sigma_{L_{iP}}:\quad \bigl(P(t) = P(t') \wedge t[\sigma] \in K_{iP}\bigr) \implies t'[\sigma] \in K_{iP}. \tag{4} \]
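Let us first prove that (3) implies (4). Assume that (3) is true. For all $t, t' \in K_{iP}$ and $[\sigma] \in \Sigma_{LiP}$, we want to show that if $P(t) = P(t')$ and $t[\sigma] \in K_{iP}$, then $t'[\sigma] \in K_{iP}$. By Lemma 3,

$$t, t', t[\sigma] \in K_{iP} \implies \tilde{t}, \tilde{t'}, \widetilde{t[\sigma]} \in \widetilde{K_{iP}} \subseteq K.$$

Note that $\widetilde{t[\sigma]} = \tilde{t}\sigma$. By Lemma 6, $P(t) = P(t') \implies iP(\tilde{t}) = iP(\tilde{t'})$. Therefore,

$$t, t' \in K_{iP} \wedge [\sigma] \in \Sigma_{LiP} \wedge P(t) = P(t') \wedge t[\sigma] \in K_{iP} \implies \tilde{t}, \tilde{t'} \in K \wedge \sigma \in \Sigma_L \wedge iP(\tilde{t}) = iP(\tilde{t'}) \wedge \tilde{t}\sigma \in K.$$

By (3), this implies $\tilde{t'}\sigma \in K$, and hence $\langle \tilde{t'}\sigma \rangle \in K_{iP}$. If $t' \in (\Sigma_{LiP} \cup \Sigma_{HiP})^*$, then $\tilde{t'} \in (\Sigma_L \cup \Sigma_H)^*$ and $\langle \tilde{t'} \rangle = t'$. Hence, $t'[\sigma] = \langle \tilde{t'} \rangle [\sigma] = \langle \tilde{t'}\sigma \rangle \in K_{iP}$. Otherwise, in general, we can write $t' = w_1 [w\sigma'] w_2$ with $[w\sigma'] \in \Sigma_{DiP}$, $\widetilde{[w\sigma']} = w\sigma'$, and $w_2 \in (\Sigma_{LiP} \cup \Sigma_{HiP})^*$. Since $t' \in K_{iP}$, we conclude $w_1 = \epsilon$. Also, $\tilde{t'} = w\sigma'\tilde{w}_2$ with $\sigma' \in \Sigma_D$ and $\tilde{w}_2 \in (\Sigma_L \cup \Sigma_H)^*$. Hence, $t'[\sigma] = [w\sigma'] w_2 [\sigma] = \langle w\sigma'\tilde{w}_2\sigma \rangle = \langle \tilde{t'}\sigma \rangle \in K_{iP}$. In either case, we conclude $t'[\sigma] \in K_{iP}$, that is, $K_{iP}$ is $P$-observable w.r.t. $(\Sigma_{iP}^*, \Sigma_{LiP})$.

Next, we prove that (4) implies (3). Assume that (4) is true. For all $s, s' \in K$ and $\sigma \in \Sigma_L$, we want to show that if $iP(s) = iP(s')$ and $s\sigma \in K$, then $s'\sigma \in K$. By the definition of $K_{iP}$,

$$s, s', s\sigma \in K \implies \langle s \rangle, \langle s' \rangle, \langle s\sigma \rangle \in K_{iP}.$$

By Lemma 7, $iP(s) = iP(s') \implies P(\langle s \rangle) = P(\langle s' \rangle)$. Therefore,

$$s, s' \in K \wedge \sigma \in \Sigma_L \wedge iP(s) = iP(s') \wedge s\sigma \in K \implies \langle s \rangle, \langle s' \rangle \in K_{iP} \wedge [\sigma] \in \Sigma_{LiP} \wedge P(\langle s \rangle) = P(\langle s' \rangle) \wedge \langle s\sigma \rangle \in K_{iP}.$$

Since $\langle s \rangle [\sigma] = \langle s\sigma \rangle \in K_{iP}$, by (4), this implies $\langle s' \rangle [\sigma] \in K_{iP}$, that is, $\langle s'\sigma \rangle \in K_{iP}$. By Lemma 3,

$$\widetilde{\langle s'\sigma \rangle} \in \widetilde{K_{iP}} \subseteq K.$$

If $s' \in (\Sigma_L \cup \Sigma_H)^*$, then $\langle s'\sigma \rangle \in (\Sigma_{LiP} \cup \Sigma_{HiP})^*$ and $\widetilde{\langle s'\sigma \rangle} = s'\sigma$. Hence, $s'\sigma = \widetilde{\langle s'\sigma \rangle} \in K$. Otherwise, in general, we can write $s' = w_1\sigma' w_2$, where $\sigma'$ is the last $\Sigma_D$ event in $s'$. Then $\widetilde{\langle s'\sigma \rangle} = w_1\sigma' w_2\sigma$. By Lemma 2, $w_1\sigma' w_2\sigma = \widetilde{\langle s'\sigma \rangle} \in K$ implies $s'\sigma = w_1\sigma' w_2\sigma \in K$. In either case, we conclude $s'\sigma \in K$, that is, $K$ is $iP$-observable w.r.t. $(\Sigma^*, \Sigma_L)$.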
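To make condition (3) concrete, the following minimal Python sketch tests it by brute force on a finite, prefix-closed language. The closed form used for $iP$ (keep the trace up to its last $\Sigma_D$ event, then erase $\Sigma_H$ events from the rest) is read off the three cases in the proof of Lemma 5. The function names and the two toy languages are illustrative assumptions, not part of the paper.

```python
from itertools import product

def ip_purge(s, high, down):
    """iP(s): keep s up to and including its last downgrading event,
    then erase the high-level events from what follows."""
    last_d = max((i for i, e in enumerate(s) if e in down), default=-1)
    head, tail = s[:last_d + 1], s[last_d + 1:]
    return tuple(head) + tuple(e for e in tail if e not in high)

def prefix_closure(traces):
    """All prefixes (as tuples) of the given traces."""
    return {tuple(t[:i]) for t in traces for i in range(len(t) + 1)}

def is_ip_observable(K, low, high, down):
    """Brute-force test of condition (3) on a finite, prefix-closed K."""
    for s, s2 in product(K, repeat=2):
        if ip_purge(s, high, down) != ip_purge(s2, high, down):
            continue
        for sigma in low:
            if s + (sigma,) in K and s2 + (sigma,) not in K:
                return False
    return True

# Hypothetical toy alphabets and languages (not taken from the paper).
LOW, HIGH, DOWN = {"l"}, {"h"}, {"d"}
K_leaky = prefix_closure([("h", "l")])     # l is enabled only after h
K_ok = prefix_closure([("h", "d", "l")])   # h is downgraded by d before l
print(is_ip_observable(K_leaky, LOW, HIGH, DOWN))  # False
print(is_ip_observable(K_ok, LOW, HIGH, DOWN))     # True
```

The enumeration is quadratic in the number of traces and only applies to finite languages; it is meant as an executable reading of the definition, not as a replacement for the automaton-based test developed in the next subsections.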
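4.2 iP-Quotient Automaton

To verify the $P$-observability of $K_{iP}$, an automaton generating $K_{iP}$ needs to be constructed. This automaton is called the iP-quotient automaton and is defined as follows.

Definition 10. Let $G = (\Sigma, X, \delta, x_0)$ and $K = L(G)$. The iP-quotient automaton of $G$ is defined as

$$G_{iP} = (\Sigma_{iP}, X, \delta_{iP}, x_0)$$

where the transition function $\delta_{iP} : X \times \Sigma_{iP} \to X$ is given by

$$\delta_{iP}(x, \sigma) = \begin{cases} \delta(x, \sigma') & \text{if } \sigma = [\sigma'] \in \Sigma_{LiP} \cup \Sigma_{HiP}, \\ \delta(x_0, s\sigma') & \text{if } \sigma = [s\sigma'] \in \Sigma_{DiP} \text{ and } x = x_0, \\ \text{undefined} & \text{otherwise.} \end{cases}$$

Complexity Note: An algorithm to construct $G_{iP}$ can be easily obtained based on Definition 10. Its computational complexity is O(|Σ||X| ). This can be seen as follows. First, recall that $G_{iP}$ has the same state space $X$ as $G$. Furthermore, computing $\Sigma_{DiP}$, using the minimal trace computation described before, has complexity O(|Σ||X| ), as stated in Proposition 1. All other steps are just symbol transformations and do not modify the stated complexity.

The following theorem shows that $G_{iP}$ is the automaton generating the language $K_{iP}$, under the assumption that $G$ does not have self-loops of $\Sigma_D$ events at the initial state.

Theorem 3. If $G$ does not have self-loops of $\Sigma_D$ events at the initial state, then $K_{iP} = L(G_{iP})$.

Proof. Let us first prove that for all $s \in \Sigma^*$, $\delta(x_0, s) = x \Leftrightarrow \delta_{iP}(x_0, \langle s \rangle) = x$, by induction on the length of $s$. Since $\langle \epsilon \rangle = \epsilon$, $\delta(x_0, \epsilon) = x_0$ and $\delta_{iP}(x_0, \langle \epsilon \rangle) = x_0$. Therefore, $\delta(x_0, \epsilon) = x \Leftrightarrow \delta_{iP}(x_0, \langle \epsilon \rangle) = x$. For the induction step, let $s = s'\sigma$ with $\sigma \in \Sigma$. By the induction hypothesis, we have $\delta(x_0, s') = x' \Leftrightarrow \delta_{iP}(x_0, \langle s' \rangle) = x'$. Let us now consider two possible cases for $\sigma$.

Case 1. $\sigma \in \Sigma_L \cup \Sigma_H$. By the definition of $\delta_{iP}$,

$$\begin{aligned} & \delta(x_0, s'\sigma) \text{ is undefined} \\ \Leftrightarrow\; & \delta(x', \sigma) \text{ is undefined} \\ \Leftrightarrow\; & \delta_{iP}(x', [\sigma]) \text{ is undefined} \\ \Leftrightarrow\; & \delta_{iP}(x_0, \langle s' \rangle [\sigma]) \text{ is undefined} \\ \Leftrightarrow\; & \delta_{iP}(x_0, \langle s'\sigma \rangle) \text{ is undefined.} \end{aligned}$$

If $\delta(x_0, s'\sigma)$ is defined, then

$$\begin{aligned} & \delta(x_0, s'\sigma) = x \\ \Leftrightarrow\; & \delta(x', \sigma) = x \\ \Leftrightarrow\; & \delta_{iP}(x', [\sigma]) = x \\ \Leftrightarrow\; & \delta_{iP}(x_0, \langle s' \rangle [\sigma]) = x \\ \Leftrightarrow\; & \delta_{iP}(x_0, \langle s'\sigma \rangle) = x. \end{aligned}$$

Case 2. $\sigma \in \Sigma_D$. By the definition of $\delta_{iP}$,

$$\delta(x_0, s'\sigma) \text{ is undefined} \;\Leftrightarrow\; s'\sigma \notin K \;\Leftrightarrow\; \langle s'\sigma \rangle \text{ is undefined} \;\Leftrightarrow\; \delta_{iP}(x_0, \langle s'\sigma \rangle) \text{ is undefined.}$$

If $\delta(x_0, s'\sigma)$ is defined, then

$$\delta(x_0, s'\sigma) = x \;\Leftrightarrow\; \delta_{iP}(x_0, [s'\sigma]) = x \;\Leftrightarrow\; \delta_{iP}(x_0, \langle s'\sigma \rangle) = x.$$

This proves that for all $s \in \Sigma^*$, $\delta(x_0, s) = x \Leftrightarrow \delta_{iP}(x_0, \langle s \rangle) = x$.

We will now prove $K_{iP} \subseteq L(G_{iP})$ as follows.

$$\begin{aligned} & s \in K_{iP} \\ \Rightarrow\; & \exists t \in K \quad s = \langle t \rangle \\ \Rightarrow\; & \exists t \in \Sigma^* \quad \delta(x_0, t) \text{ is defined} \wedge s = \langle t \rangle \\ \Rightarrow\; & \exists t \in \Sigma^* \quad \delta_{iP}(x_0, \langle t \rangle) \text{ is defined} \wedge s = \langle t \rangle \\ \Rightarrow\; & \delta_{iP}(x_0, s) \text{ is defined} \\ \Rightarrow\; & s \in L(G_{iP}). \end{aligned}$$

To prove $L(G_{iP}) \subseteq K_{iP}$, we need to use the assumption that $G$ does not have self-loops of $\Sigma_D$ events at the initial state. From the definition of $G_{iP}$, $G_{iP}$ does not have self-loops of $\Sigma_{DiP}$ events at the initial state either. Furthermore, $\Sigma_{DiP}$ events are only defined at the initial state of $G_{iP}$. Therefore, if $s \in L(G_{iP})$, then $s$ can have at most one $\Sigma_{DiP}$ event and, if so, it will be the first event in $s$, that is, $s \in (\Sigma_{LiP} \cup \Sigma_{HiP})^*$ or $s \in \Sigma_{DiP}(\Sigma_{LiP} \cup \Sigma_{HiP})^*$.

Case 1. $s = [\sigma_1][\sigma_2] \ldots [\sigma_m] \in (\Sigma_{LiP} \cup \Sigma_{HiP})^*$. In this case, let $t = \sigma_1\sigma_2 \ldots \sigma_m$; then $s = \langle t \rangle$. Therefore,

$$\begin{aligned} & s \in L(G_{iP}) \\ \Rightarrow\; & \exists t \in \Sigma^* \quad \delta_{iP}(x_0, \langle t \rangle) \text{ is defined} \wedge s = \langle t \rangle \\ \Rightarrow\; & \exists t \in \Sigma^* \quad \delta(x_0, t) \text{ is defined} \wedge s = \langle t \rangle \\ \Rightarrow\; & \exists t \in K \quad s = \langle t \rangle \\ \Rightarrow\; & s \in K_{iP}. \end{aligned}$$

Case 2. $s = [w\sigma][\sigma_1][\sigma_2] \ldots [\sigma_m] \in \Sigma_{DiP}(\Sigma_{LiP} \cup \Sigma_{HiP})^*$. In this case, let $t = w\sigma\sigma_1\sigma_2 \ldots \sigma_m$; then $s = \langle t \rangle$. Therefore,

$$\begin{aligned} & s \in L(G_{iP}) \\ \Rightarrow\; & \exists t \in \Sigma^* \quad \delta_{iP}(x_0, \langle t \rangle) \text{ is defined} \wedge s = \langle t \rangle \\ \Rightarrow\; & \exists t \in \Sigma^* \quad \delta(x_0, t) \text{ is defined} \wedge s = \langle t \rangle \\ \Rightarrow\; & \exists t \in K \quad s = \langle t \rangle \\ \Rightarrow\; & s \in K_{iP}. \end{aligned}$$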
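The transition function of Definition 10 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the automaton is given as a dictionary from (state, event) pairs to states, bracketed events are represented as tuples of original events, and the list of traces generating the downgrading events of the quotient alphabet is supplied by the caller (the minimal-trace computation itself is not reproduced). The function name `build_quotient` and the toy automaton are hypothetical.

```python
def build_quotient(delta, x0, low, high, down_traces):
    """Sketch of delta_iP from Definition 10.

    delta: dict mapping (state, event) -> state for G.
    down_traces: tuples s + (sigma',) with sigma' a downgrading event,
        assumed to come from a separate minimal-trace computation.
    Returns a dict mapping (state, bracketed_event) -> state, where a
    bracketed event is a tuple of original events.
    """
    def run(x, trace):
        # Follow `trace` from state x; None means "undefined".
        for e in trace:
            x = delta.get((x, e))
            if x is None:
                return None
        return x

    delta_ip = {}
    # First clause: low and high events keep their transitions.
    for (x, e), y in delta.items():
        if e in low or e in high:
            delta_ip[(x, (e,))] = y
    # Second clause: a downgrading event [s sigma'] is defined only at x0
    # and leads to the state reached by s sigma' in G.
    for trace in down_traces:
        y = run(x0, trace)
        if y is not None:
            delta_ip[(x0, tuple(trace))] = y
    # Third clause: everything else stays undefined (absent from the dict).
    return delta_ip

# Hypothetical toy automaton: 0 -h-> 1 -d-> 2 -l-> 3.
delta = {(0, "h"): 1, (1, "d"): 2, (2, "l"): 3}
delta_ip = build_quotient(delta, 0, {"l"}, {"h"}, [("h", "d")])
print(delta_ip)
```

In the toy run, the transition on d out of state 1 disappears and is replaced by a single transition on the bracketed event (h, d) out of the initial state, which mirrors the observation in the proof that downgrading events of the quotient are only defined at the initial state.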
The assumption that $G$ does not have self-loops of $\Sigma_D$ events at the initial state is not restrictive as far as observability is concerned, as shown below.

Proposition 2. Let $F$ be obtained from $G$ by removing all self-loops of $\Sigma_D$ events at the initial state of $G$, and let $J = L(F)$. Then $K$ is $iP$-observable w.r.t. $(\Sigma^*, \Sigma_L)$ if and only if $J$ is $iP$-observable w.r.t. $(\Sigma^*, \Sigma_L)$.

Proof. Assume that $K \neq J$; otherwise there is nothing to prove. Recall that $K$ is $iP$-observable w.r.t. $(\Sigma^*, \Sigma_L)$ if and only if

$$\forall s, s' \in K \;\; \forall \sigma \in \Sigma_L \quad (iP(s) = iP(s') \wedge s\sigma \in K) \implies s'\sigma \in K, \tag{5}$$

and $J$ is $iP$-observable w.r.t. $(\Sigma^*, \Sigma_L)$ if and only if

$$\forall s, s' \in J \;\; \forall \sigma \in \Sigma_L \quad (iP(s) = iP(s') \wedge s\sigma \in J) \implies s'\sigma \in J. \tag{6}$$

We first prove that (5) implies (6) by contradiction. Assume (5) is true but (6) is not, that is,

$$\exists s, s' \in J \;\; \exists \sigma \in \Sigma_L \quad iP(s) = iP(s') \wedge s\sigma \in J \wedge s'\sigma \notin J.$$

Since $J \subseteq K$, we have $s, s', s\sigma \in K$. Since only the initial $\Sigma_D$ events are removed from $K$ and $\sigma \in \Sigma_L$, $s'\sigma \notin K$. Therefore,

$$\exists s, s' \in K \;\; \exists \sigma \in \Sigma_L \quad iP(s) = iP(s') \wedge s\sigma \in K \wedge s'\sigma \notin K,$$

which contradicts (5).

Next we prove that (6) implies (5) by contradiction. Assume (6) is true but (5) is not, that is,

$$\exists s, s' \in K \;\; \exists \sigma \in \Sigma_L \quad iP(s) = iP(s') \wedge s\sigma \in K \wedge s'\sigma \notin K.$$

Let $t$ ($t'$ respectively) be obtained from $s$ ($s'$ respectively) by removing the initial events corresponding to the self-loops of $\Sigma_D$ events at the initial state, if there are such events. Clearly $t, t' \in J$. Also, $s$ and $t$ end at the same state, and $s'$ and $t'$ end at the same state. Therefore, $t\sigma \in J \wedge t'\sigma \notin J$. Furthermore, since $t, t'$ are obtained by removing some initial $\Sigma_D$ events in $s, s'$, by the definition of the $iP$ function, $iP(s) = iP(s') \Rightarrow iP(t) = iP(t')$. Hence,

$$\exists t, t' \in J \;\; \exists \sigma \in \Sigma_L \quad iP(t) = iP(t') \wedge t\sigma \in J \wedge t'\sigma \notin J,$$

which contradicts (6).
4.3 Checking P-observability

$P$-observability was introduced in [11] for the supervisory control of discrete event systems. It plays a key role in the existence of supervisors. Because of its importance, observability has been studied extensively in the literature. In particular, the question of whether observability of a regular language $K_{iP}$ can be tested efficiently, which is not immediate from the definition of observability, has been answered. Results in the literature [18] show that observability can be tested with a worst-case complexity of $O(|X|^2 |\Sigma_{iP}|)$. We shall not present the details of this testing algorithm here. Instead, we will present a test that is less computationally efficient but more insightful. This test consists of four steps described by the following algorithm:

Algorithm 1.

1. Replace all unobservable events (i.e., events in $\Sigma_{HiP}$) by $\epsilon$ in $G_{iP}$ to obtain $G^{\epsilon}_{iP}$.

2. Convert the automaton with $\epsilon$-transitions into an automaton without $\epsilon$-transitions using standard procedures (e.g., see [8]): $\bar{G}_{iP} = (\Sigma_{iP}, \bar{X}, \bar{\delta}_{iP}, \bar{x}_0)$, where the states of $\bar{G}_{iP}$ are subsets of the state set of $G_{iP}$, that is, $\bar{x} \subseteq X$.

3. For each state $\bar{x}$, check its consistency:

$$\forall \sigma \in \Sigma_{LiP} \quad (\forall x \in \bar{x} \;\; \delta_{iP}(x, \sigma) \text{ is defined}) \vee (\forall x \in \bar{x} \;\; \delta_{iP}(x, \sigma) \text{ is undefined}),$$

that is, a state $\bar{x}$ is consistent if each low-level event is either defined at all $x \in \bar{x}$ or undefined at all $x \in \bar{x}$.

4. $K_{iP}$ is $P$-observable if and only if all (accessible) states of $\bar{G}_{iP}$ are consistent.

Example 3. In this example, we transform an automaton $G$ into its iP-quotient $G_{iP}$ and then check $P$-observability. Consider the automaton $G$ in Figure 6, with $h \in \Sigma_H$, $d \in \Sigma_D$, and $l \in \Sigma_L$.

[Diagram not reproduced: automaton $G$ with states $x_0$ to $x_8$ and transitions labelled $h$, $d$, and $l$.]
Figure 6: An automaton G with loops.
First let us compute the set of all minimal traces of $G$ by constructing $G'$, which is given in Figure 7.

[Diagram not reproduced. Each node of $G'$ is a triple (state reached, minimal trace, set of states visited), starting from $(x_0, \epsilon, \{x_0\})$. The minimal traces labelling the nodes are $\epsilon$, $h$, $l$, $ll$, $hd$, $hdh$, $hdhd$, $hdhdl$, $hdhl$, $hdhlh$, $hdhlhd$, $hdhlhdl$, $lld$, $lldh$, $lldhd$, $lldhdl$, $lldhl$, $lldhlh$, $lldhlhd$, and $lldhlhdl$.]
Figure 7: Acyclic automaton $G'$ generating minimal traces of $G$.
Given the automaton $G'$ and the formula

$$\Sigma_{DiP} = \{[s'\sigma] \mid s' \in L(G') \wedge \sigma \in \Sigma_D \wedge s'\sigma \in L(G)\}$$

we obtain

$$\Sigma_{DiP} = \{[hd], [hdhd], [hdhlhd], [hdhdld], [hdhlhdld], [lld], [lldhd], [lldhlhd], [lldhdld], [lldhlhdld]\}.$$

The $G_{iP}$ automaton is given in Figure 8.
[Diagram not reproduced: $G_{iP}$ has states $x_0$ to $x_8$, with low and high transitions relabelled $[l]$ and $[h]$. All $\Sigma_{DiP}$ transitions leave the initial state $x_0$: $[hd]$, $[lld]$, $[hdhdld]$, $[hdhlhdld]$, $[lldhdld]$, and $[lldhlhdld]$ lead to $x_3$, while $[hdhd]$, $[hdhlhd]$, $[lldhd]$, and $[lldhlhd]$ lead to $x_7$.]
Figure 8: GiP , the iP -quotient of G.
We now apply Algorithm 1 to check $P$-observability of $L(G_{iP})$. Step 1 of the algorithm transforms the automaton $G_{iP}$ into $G^{\epsilon}_{iP}$, as shown in Figure 9.

[Diagram not reproduced: the automaton of Figure 8 with every $[h]$ transition relabelled $\epsilon$.]
Figure 9: $G^{\epsilon}_{iP}$, unobservable events replaced by $\epsilon$-transitions.
The result of step 2 is the automaton $\bar{G}_{iP}$ given in Figure 10. We can verify that $K_{iP}$ is not $P$-observable: $\delta_{iP}(x_5, [l])$ is undefined whereas $\delta_{iP}(x_6, [l])$ is defined, and $x_5$ and $x_6$ belong to the same state of $\bar{G}_{iP}$, so that state is inconsistent.
[Diagram not reproduced: $\bar{G}_{iP}$ has states $\{x_0, x_1\}$, $\{x_1\}$, $\{x_2\}$, $\{x_3, x_4\}$, $\{x_4\}$, $\{x_5, x_6\}$, $\{x_7\}$, and $\{x_8\}$, with transitions labelled $[l]$ and by the events of $\Sigma_{DiP}$.]
Figure 10: Automaton $\bar{G}_{iP}$ without $\epsilon$-transitions.
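The four steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the efficient test of [18]: it assumes the "standard procedure" of step 2 is the usual subset construction with $\epsilon$-closures, it represents the automaton as a dictionary from (state, event) pairs to states, and it runs on a small hypothetical automaton rather than on the automaton of Example 3, whose full transition structure is not reproduced here.

```python
def eps_closure(states, delta, unobs):
    """States reachable from `states` through unobservable events only."""
    stack, seen = list(states), set(states)
    while stack:
        x = stack.pop()
        for (y, e), z in delta.items():
            if y == x and e in unobs and z not in seen:
                seen.add(z)
                stack.append(z)
    return frozenset(seen)

def is_p_observable(delta, x0, low, unobs, events):
    """Algorithm 1 on a deterministic automaton delta: (state, event) -> state."""
    # Steps 1 and 2: treat unobservable events as epsilon and explore the
    # accessible subset states, starting from the closure of x0.
    start = eps_closure({x0}, delta, unobs)
    todo, visited = [start], {start}
    while todo:
        xbar = todo.pop()
        # Step 3: each low event is defined at all members or at none.
        for sigma in low:
            defined = [(x, sigma) in delta for x in xbar]
            if any(defined) and not all(defined):
                return False
        for sigma in events - unobs:
            targets = {delta[(x, sigma)] for x in xbar if (x, sigma) in delta}
            if targets:
                nxt = eps_closure(targets, delta, unobs)
                if nxt not in visited:
                    visited.add(nxt)
                    todo.append(nxt)
    # Step 4: every accessible subset state was consistent.
    return True

# Hypothetical toy automaton: 0 -h-> 1 -l-> 2, with h unobservable.
EVENTS, LOW, UNOBS = {"h", "l"}, {"l"}, {"h"}
leaky = {(0, "h"): 1, (1, "l"): 2}
fixed = {(0, "h"): 1, (1, "l"): 2, (0, "l"): 2}
print(is_p_observable(leaky, 0, LOW, UNOBS, EVENTS))  # False
print(is_p_observable(fixed, 0, LOW, UNOBS, EVENTS))  # True
```

In the first toy automaton the subset state {0, 1} is inconsistent because l is defined at state 1 but not at state 0, which is the same kind of inconsistency as the one observed for $x_5$ and $x_6$ in Example 3.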
5 Applications to Cryptoprotocol

In this section, we give two applications of the property of intransitive non-interference to cryptoprotocol analysis.

Example 4. Consider the following one-step protocol where two principals $A$ and $B$ sharing a secret key $k_{AB}$ want to exchange a secret message $m \in \{0, 1\}^n$:

$$\text{Message:} \quad A \xrightarrow{\;\{m\}_{k_{AB}}\;} B.$$

Given a public channel $c$ used by the protocol, we consider generators $G_A$ (Figure 11) and $G_B$ (Figure 12) describing the behaviors of principals $A$ and $B$ respectively. For simplicity, we consider only two-bit messages $m \in \{0, 1\}^2$.

• The event $encrypt(k_{AB}, m)$ stands for the encryption of message $m$ with the key $k_{AB}$ (producing the ciphertext $\{m\}_{k_{AB}}$).

• The event $c(\{m\}_{k_{AB}})$ (respectively $c(0)$ and $c(1)$) stands for emission/reception over the public channel $c$ of message $\{m\}_{k_{AB}}$ (respectively messages 0 and 1).

• The event $decrypt(k_{AB}, \{m\}_{k_{AB}})$ stands for the decryption of $\{m\}_{k_{AB}}$ with the key $k_{AB}$.

Note that the specification $G_B$ may receive over the channel $c$ any possible ciphertext $\{m\}_{k_{AB}}$ (with $m \in \{0, 1\}^2$).
[Diagram not reproduced: $G_A$ executes $encrypt(k_{AB}, m)$ followed by $c(\{m\}_{k_{AB}})$.]
Figure 11: DES generator for principal A.
[Diagram not reproduced: from its initial state, $G_B$ has one branch for each $m \in \{00, 01, 10, 11\}$, consisting of $c(\{m\}_{k_{AB}})$, then $decrypt(k_{AB}, \{m\}_{k_{AB}})$, then $c(0)$ or $c(1)$.]
Figure 12: DES generator for principal B.
In Figure 11, principal $A$ encrypts message $m$ and then submits it over the public channel $c$. In Figure 12, principal $B$ receives an encrypted message, decrypts it, and outputs message 0 or 1, depending on the parity of the received message. The protocol is then viewed as the generator $G_A \parallel G_B$, which is illustrated in Figure 13 with $m = 00$. From this particular specification, we see intuitively that there is an obvious inadmissible confidentiality breach, since $G_A \parallel G_B$ leaks information about the content of $m$ (in this case its parity) without revealing $m$ entirely. We would like to see if our theory captures this.

[Diagram not reproduced: $G_A \parallel G_B$ for $m = 00$ executes $encrypt(k_{AB}, 00)$, $c(\{00\}_{k_{AB}})$, $decrypt(k_{AB}, \{00\}_{k_{AB}})$, and $c(0)$ in sequence.]
Figure 13: DES generator for the protocol.
In order to use the INI property to check the security, we consider the following assignment of the security domains:

$$\begin{aligned} \Sigma_L &= \{c(0), c(1)\} \cup \{c(\{m\}_{k_{AB}}) \mid m \in \{0, 1\}^n\} \\ \Sigma_H &= \{decrypt(k_{AB}, \{m\}_{k_{AB}}) \mid m \in \{0, 1\}^n\} \\ \Sigma_D &= \{encrypt(k_{AB}, m) \mid m \in \{0, 1\}^n\}. \end{aligned}$$

Thus, any event over the public channel is assumed to be low-level (public), since it can be intercepted, and decryption events are assumed to be high-level (private), since they can potentially reveal secret messages. Encryption events are assumed to be downgrading events since, in the context of perfect encryption, their output $\{m\}_{k_{AB}}$ does not reveal the secret content of either $m$ or $k_{AB}$. Under this assignment, the protocol does not satisfy INI; in other words, $K = L(G_A \parallel G_B)$ does not satisfy INI. Indeed, consider the following two traces $s_1$ and $s_2$ in $K$:

$$s_1 = encrypt(k_{AB}, m) \; c(\{m\}_{k_{AB}}) \; decrypt(k_{AB}, \{m\}_{k_{AB}})$$

and

$$s_2 = encrypt(k_{AB}, m) \; c(\{m\}_{k_{AB}}).$$

We can easily see that $iP(s_1) = iP(s_2) = s_2$. Furthermore, we have $s_1 c(0) \in K$ but $s_2 c(0) \notin K$ (where $c(0) \in \Sigma_L$). Hence, by Definition 3, $K$ is not $iP$-observable w.r.t. $(\Sigma^*, \Sigma_L)$; therefore, by Theorem 1, $K$ does not satisfy INI. Hence, the confidentiality breach caused by the specification $G_B$ is detected by the INI property.

Example 5 (Electronic Payment Protocol). In the following, we give an example of a simple electronic payment protocol which does not satisfy INI. This protocol implements a credit-card-based transaction between a buyer $A$ and a seller $B$ by using the existing financial network for clearing and authorization. After the buyer and seller agree upon the transaction, $B$ requests an authorization clearance for the payment from an acquirer $ACQ$ (for example a bank) by forwarding the encrypted information obtained from $A$.

$$\begin{aligned} \text{Message 1:} \quad & A \xrightarrow{\;\{n\}_k\;} B \\ \text{Message 2:} \quad & B \xrightarrow{\;A, \{n\}_k\;} ACQ \\ \text{Message 3:} \quad & ACQ \xrightarrow{\;resp\;} B \end{aligned}$$

where $n$ is $A$'s credit card number, $k$ is the acquirer's public encryption key, $resp$ is the acquirer's response (hence $resp = ok$ or $resp = notok$), and $\{n\}_k$ stands for $n$ encrypted by $k$. The buyer $A$, the seller $B$, and the acquirer $ACQ$ are specified (individually) in Figure 14. The protocol is specified as the concurrent principals $G = A \parallel B \parallel ACQ$ given in Figure 15. In this specification, $message1(\{n\}_k)$ and $message2(A, \{n\}_k)$ are downgrading events (since the secret value $n$ is encrypted), $clearing(A, n)$ is a high-level event (since $n$ appears within its parameters), and $message3(ok)$ and $message3(notok)$ are low-level events (since these messages are exchanged over a public channel and can be intercepted by any intruder). Hence:
• $\Sigma_L = \{message3(ok), message3(notok)\}$.

• $\Sigma_H = \{clearing(A, n)\}$.

• $\Sigma_D = \{message1(\{n\}_k), message2(A, \{n\}_k)\}$.

and $\Sigma = \Sigma_L \cup \Sigma_H \cup \Sigma_D$.

[Diagram not reproduced: three automata. $A$ executes $message1(\{n\}_k)$. $B$ executes $message1(\{n\}_k)$, then $message2(A, \{n\}_k)$, then $message3(ok)$ or $message3(notok)$. $ACQ$ executes $message2(A, \{n\}_k)$, then $clearing(A, n)$, then $message3(ok)$ or $message3(notok)$.]
Figure 14: The principals of the payment protocol.
[Diagram not reproduced: $G$ executes $message1(\{n\}_k)$, $message2(A, \{n\}_k)$, and $clearing(A, n)$ in sequence, followed by either $message3(ok)$ or $message3(notok)$.]
Figure 15: Payment protocol $G = A \parallel B \parallel ACQ$.
Now, construct the iP-quotient $G_{iP}$ from $G$, which is shown in Figure 16. Note that there are two minimal traces that end with a downgrading event: the first one is $message1(\{n\}_k)$, and the second one is $message1(\{n\}_k)\,message2(A, \{n\}_k)$. Consequently:

• $\Sigma_{LiP} = \{[message3(ok)], [message3(notok)]\}$

• $\Sigma_{HiP} = \{[clearing(A, n)]\}$

• $\Sigma_{DiP} = \{[message1(\{n\}_k)], [message1(\{n\}_k)\,message2(A, \{n\}_k)]\}$

[Diagram not reproduced: in $G_{iP}$, both $\Sigma_{DiP}$ events leave the initial state. $[message1(\{n\}_k)]$ leads to the state reached after message 1, and $[message1(\{n\}_k)\,message2(A, \{n\}_k)]$ leads to the state reached after message 2, from which $[clearing(A, n)]$ is followed by $[message3(ok)]$ or $[message3(notok)]$.]
Figure 16: iP-quotient $G_{iP} = (A \parallel B \parallel ACQ)_{iP}$.
It is easy to verify that $L(G_{iP})$ is not $P$-observable w.r.t. $(\Sigma_{iP}^*, \Sigma_{LiP})$. Indeed, consider the following traces:

$$t = [message1(\{n\}_k)\,message2(A, \{n\}_k)]\,[clearing(A, n)]$$

$$t' = [message1(\{n\}_k)\,message2(A, \{n\}_k)].$$

We have $P(t) = P(t')$ and $t\,[message3(ok)] \in L(G_{iP})$, but $t'\,[message3(ok)] \notin L(G_{iP})$ (where $[message3(ok)] \in \Sigma_{LiP}$). Consequently, from Theorem 2, $L(G)$ is not $iP$-observable w.r.t. $(\Sigma^*, \Sigma_L)$, and, from Theorem 1, we can deduce that the protocol does not satisfy INI. This particular information leak could be exploited by an intruder (or even the seller $B$) who sends several messages $message2(A, \{n\}_k)$ with different values of $n$ and eventually deduces, upon receiving $message3(ok)$ or $message3(notok)$, whether a given $n$ is $A$'s credit card number.
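The counterexample above can be replayed mechanically. The following Python sketch enumerates the finite language of the quotient, as read off Figure 16, and lists every triple that violates condition (4). Bracketed events are encoded as tuples of event names, $P$ is taken to be the projection that erases the high-level bracketed events, and all identifiers are illustrative assumptions rather than notation from the paper.

```python
from itertools import product

def project(t, high_ip):
    """P: erase the unobservable (high-level) bracketed events."""
    return tuple(e for e in t if e not in high_ip)

def prefix_closure(traces):
    """All prefixes (as tuples) of the given traces."""
    return {tuple(t[:i]) for t in traces for i in range(len(t) + 1)}

def violations(K, low_ip, high_ip):
    """Triples (t, t2, sigma) that falsify condition (4) on a finite K."""
    found = []
    for t, t2 in product(K, repeat=2):
        if project(t, high_ip) != project(t2, high_ip):
            continue
        for sigma in low_ip:
            if t + (sigma,) in K and t2 + (sigma,) not in K:
                found.append((t, t2, sigma))
    return found

# Bracketed events of Example 5, encoded as tuples of event names.
M1, M2 = "message1({n}k)", "message2(A,{n}k)"
D1, D12 = (M1,), (M1, M2)        # the two downgrading events of the quotient
CLEAR = ("clearing(A,n)",)       # high-level event
OK, NOTOK = ("message3(ok)",), ("message3(notok)",)

# Language of the quotient, assumed from Figure 16 and Definition 10.
K_ip = prefix_closure([(D1,), (D12, CLEAR, OK), (D12, CLEAR, NOTOK)])
bad = violations(K_ip, {OK, NOTOK}, {CLEAR})
for t, t2, sigma in bad:
    print(t, t2, sigma)
```

The pair of traces used in the text appears among the reported violations, once for each of the two low-level responses.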
6 Conclusion

In this paper, we used the observability theory of discrete event systems to formulate and provide an algorithmic approach to the problem of checking the property of intransitive non-interference of security policies. This property plays a key role in the area of security, because it is used to model information flow and the lack of information flow within multi-level security systems and protocols. We first defined the concept of $iP$-observability (observability defined using a purge function instead of a projection) and used it to capture intransitive non-interference. Second, we established an equivalence between $iP$-observability and $P$-observability (normal observability as defined in DES using a projection). Third, we derived an algorithm for transforming the original automaton, modeling the system or protocol, into a new automaton where $P$-observability can be checked. The new automaton uses the same state space as the original one, but adds to its events and transitions.

The work presented in this paper serves two important purposes. First, it provides an efficient solution to an important security problem. Second, it introduces an important application to the area of discrete event systems. In fact, a new area of research involving the application of the tools of DES theory to the solution of security problems can now be undertaken.
References

[1] D. Bell and L. LaPadula, "Secure computer system: Unified exposition and multics interpretation," Tech. Rep. MTR-2997, Mitre Corp., Bedford, Mass., USA, June 1976.

[2] G. Boudol, "Notes on algebraic calculi of processes," in Logic and Models of Concurrent Systems, NATO ASI Series F-13, pp. 261–303, Springer, 1985.

[3] B. A. Brandin, W. M. Wonham, and B. Benhabib, "Manufacturing cell supervisory control - a timed discrete event system approach," in Proceedings of the IEEE Conference on Robotics and Automation, Nice, France, May 1992.

[4] C. G. Cassandras and S. Lafortune, Introduction to Discrete Event Systems. Kluwer Academic Publishers, Boston, MA, 1999.

[5] R. Focardi, A. Ghelli, and R. Gorrieri, "Using non interference for the analysis of security protocol," Proc. of DIMACS Workshop on Design and Formal Verification of Security Protocols, 1997.

[6] R. Focardi and R. Gorrieri, "A classification of security properties for process algebras," Journal of Computer Security, vol. 3, no. 1, pp. 5–33, 1995.

[7] J. Goguen and J. Meseguer, "Security policies and security models," in Proceedings 1982 IEEE Symposium on Research in Security and Privacy, pp. 11–20, Apr. 1982.

[8] J. E. Hopcroft and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, MA, 1979.

[9] IEEE Computer Society, 1995 IEEE Symp. on Research in Security and Privacy, (Oakland, CA), May 1995.

[10] S. Lafortune, "Modeling and analysis of transaction execution in database systems," IEEE Transactions on Automatic Control, vol. 33, no. 5, pp. 439–447, May 1988.

[11] F. Lin and W. M. Wonham, "On observability of discrete-event systems," Information Sciences, vol. 44, pp. 173–198, 1988.

[12] J. McLean, "A general theory of composition for a class of possibilistic properties," IEEE Trans. on Software Engineering, vol. 22, no. 1, pp. 53–66, 1996.

[13] J. Mullins, "Nondeterministic admissible interference," Journal of Universal Computer Science, vol. 6, no. 11, pp. 1054–1070, 2000.

[14] J. Mullins and S. Lafrance, "Bisimulation-based non-deterministic admissible interference with applications to the analysis of cryptographic protocols," International Journal in Information and Software Technology, pp. 1–25, 2002.

[15] P. J. Ramadge and W. M. Wonham, "Supervisory control of a class of discrete-event processes," SIAM Journal of Control and Optimization, vol. 25, no. 1, pp. 206–230, 1987.

[16] P. J. G. Ramadge and W. M. Wonham, "The control of discrete event systems," Proceedings of the IEEE, vol. 77, no. 1, Jan. 1989.

[17] A. W. Roscoe and M. H. Goldsmith, "What is intransitive noninterference," in 12th IEEE Computer Security Foundations Workshop, 1999.

[18] K. Rudie and J. C. Willems, "The computational complexity of decentralized discrete-event control problems," IEEE Transactions on Automatic Control, vol. 40, no. 7, pp. 1313–1319, July 1995.

[19] K. Rudie and W. M. Wonham, "Protocol verification using discrete-event systems," in Proceedings of the 31st IEEE Conference on Decision and Control, pp. 3770–3777, Tucson, Arizona, Dec. 1992.

[20] J. Rushby, "Noninterference, transitivity and channel-control security policies," Tech. Rep. CSL-92-02, SRI International, Menlo Park CA, USA, Dec. 1992.

[21] P. Ryan and S. Schneider, "Process algebra and non-interference," in Proceedings of CSFW-12, (Mordano, Italy), IEEE, June 1999.