Diagnosis with a Logic of Partial Order Computations Shaofa Yang1 , Lo¨ıc H´elou¨et1 , Thomas Gazagnaire2 1: INRIA Rennes 2: ENS Cachan-Bretagne shaofa.yang, loic.helouet,
[email protected]
Abstract. This paper proposes a diagnosis framework based on a new partial order logic. The purpose of this logic-based diagnosis is to recover some information that was not observed from a set of formulae describing the expected behaviors of a distributed system and an incomplete observation. We first propose a local partial order logic as a way to describe distributed behaviors, and show that without restriction, satisfiability of a set of formulae, and hence diagnosis is not decidable. We show that we can restrict the class of distributed behaviors to models that ensure decidability of satisfyability and of the diagnosis problem. We then show the complexity of diagnosis and unobserved information inference.
1
Introduction
Diagnosis consists in inferring missing information (unobserved events or states, faulty behaviors,...) from a partial observation of a system. It is an important task in telecommunication networks, where the size and heterogeneity of the architectures forces partial observation. When a fault occurs, it is crucial for safety and economical reasons to discover the cause for the failure and to correct it as fast as possible. Model-based diagnosis brings automated solutions for this problem. Roughly speaking, it consists in comparing runs of a model with the observed behaviors, and infer the missing parts. One might be interested in discovering if an unobserved fault has occurred, as in [26], or in finding all possible runs that may have lead to the observation, as in [3]. Another possibility is to augment observations with useful information such as causal relationships among observed events, known local properties of the system along the run, and events of interest missed by the observation as in [15]. Within this setting, several models have been proposed to represent runs of distributed systems: automata [26], Petri nets [3], Message Sequence Charts (MSCs) [17]. However, diagnosis always needs an up to date model of the running system to be accurate. In nowaday’s architectures, new services are added and modified every day, which calls for a daily redesign of models. Clearly, this is not possible with automata or nets, where the smallest modification implies updating many transitions. We propose a new logic-based approach for diagnosis. The main idea is to collect users and designers knowledge with a logic that describes known interactions as partial orders, and temporal relations between these known pieces of behavior. Hence, a model for a running system is
2
S.Yang, L.H´elou¨et,T.Gazagnaire a conjunction of formulae, and one can expect that a modification of the system only implies redefining a limited number of formulae. The LPOC logic proposed in this paper is a CTL-like logic of partial order, with past and future modalities. It is a local logic, that defines the shape of causal chains in all executions of a system. We first show that satisfiability of this logic is undecidable. This is not really surprising, as similar undecidability results hold for m-LTL [23], a local temporal logic on Lamport diagrams and for template MSCs [16] a logical framework for MSCs. Hence, without restriction on the logic or on the shape of the explanations (i.e. the models of the formulae that are compatible with the observation), satisfiability of LPOC formulae is undecidable (and as a corollary, diagnosis is also an untractable problem). We then show a restriction on models allowed for LPOC formulae that makes diagnosis a decidable problem, and show how to extract explanations from a partial observation and a set of formulae. When the runs of the system meet a syntactical requirement called “K-influence”, then explanations can be computed as the intersection of an automaton representing all runs that are compatible with the observation and an automaton representing all runs that are compatible with the LPOC description of the system. This paper is organized as follows. The LPOC logic is described in 2. The negative satisfiability result is shown in Section 3. Section 4 shows that assuming that all executions are K-influencing, satisfiability and diagnosis become decidable problems, and establishes complexity results for diagnosis. Section 5 compares our results with some related works. Section 6 concludes this work.
2
Preliminaries
Through the rest of the paper, we fix a finite nonempty set of process names P and let p, q range over P. We also fix a finite nonempty set A of atomic propositions. Definition 1. A partially ordered computation (or computation for short) over (P, A) is a tuple (S, η, ≤, V ) where: – S is a finite set of (local) states. – η : S → P identifies the location of each state. For each p ∈ P, we define Sp = {s ∈ S | η(s) = p}. – ≤ ⊆ S × S is a partial order, called the causality relation. Furthermore, for each p, ≤ restricted to Sp × Sp is a total order. – V : S → 2A is a labeling function which assigns a set of atomic propositions to each state. We call V (s) the valuation of s. Intuitively, a computation (also called Lamport diagram in the literature [22, 23]) represents the causal ordering among local states in a distributed execution, in which states of each process are sequentially ordered. The valuation of local state s collects the atomic propositions that hold at s. Figure 1 shows a computation. States are designated by black dots with associated name s1 , . . . , s6 . Processes P, Q, R are represented by
Diagnosis with a Logic of Partial Order Computations vertical lines, and states located on a process line are ordered from top to bottom. Finally, valuations of states take value in {a, b, c}, and are represented between two brackets near the associated state. Note that we consider only finite computations, since diagnosis is performed at a given moment after a finite observation period. We will say that two computations (S, η, ≤, V ) and (S ′ , η ′ , ≤′ , V ′ ) are isomorphic iff there is a bijection f : S → S ′ such that η(s) = η ′ (f (s)) for any s ∈ S, s1 ≤ s2 iff f (s1 ) ≤′ f (s2 ) for any s1 , s2 ∈ S, and V (s) = V ′ (f (s)) for any s ∈ S. We identify isomorphic computations and write W ≡ W ′ if W and W ′ are isomorphic.
Q
P
s1
{a, b}
R
P
R {b, c} s5
{b, c} s5
{c} s2
{a} s4
{b} s3
{a} s2 {b} s6
Fig. 1. a) A computation W
{b} s3
b) the 1-view of s3 in W
Let (S, η, ≤, V ) be a computation, and s, s′ ∈ S. As usual, we write s < s′ when s ≤ s′ and s 6= s′ . For each p ∈ P, we let ≪p ⊆ S × S be given by: s ≪p s′ iff s, s′ ∈ Sp , s < s′ , and there does not exist s′′ ∈ Sp with s < s′′ < s′ . That is, ≪p is the “immediate” sequential ordering of states belonging to p. We let ⋖ ⊆ S × S be the least relation such that ≤ is the reflexive and transitive closure of ⋖. For each p, q ∈ P with p 6= q, let us define ≪pq ⊆ S × S as follows: sS≪pq s′ iff s ∈ Sp , s′ ∈ Sq , and s ⋖ s′ . We also define ≪= (∪p∈P ≪p ) (∪p,q∈P,p6=q ≪pq ). If s ≪ s′ , we say s′ is a (causal) successor of s, and call s a (causal) predecessor of s′ . We emphasize that ≪ is not equal to ⋖. A state s is minimal if it has no predecessor. We say s is maximal if it has no successor. A causal chain is a sequence s1 s2 . . . sn of states where s1 ≪ s2 ≪ . . . ≪ sn . A causal path is a causal chain s1 s2 . . . sn such that sn is a maximal state. We want to define a temporal logic of partial-order computations (called “LPOC” for short) to reason about distributed behaviors. It has two basic features. First, at a state s of a computation, atomic formulae which assert that a “pattern” occurs in a bounded past or future of s. Secondly, we consider a branching time framework with CTL-like operators and reason along sequences of causally ordered states. In the sequel, we precise these notions before defining the syntax and the semantics of the logic.
3
4
S.Yang, L.H´elou¨et,T.Gazagnaire Definition 2. Let (S, η, ≤, V ) be a computation, s ∈ S and m a natural number. The m-view of s, denoted ↓ m (s), is the collection of states s′ in S such that there exists a causal chain of length at most m starting from s′ and ending at s. More precisely, ↓ m (s) = {s′ | ∃s0 , . . . sn ∈ S, n ≤ m and s′ = s0 ≪ s1 ≪ . . . ≪ sn = s}. Similarly, the m-frontier of s, denoted by ↑ m (s) is the collection of states s′ in S such that there exists a causal chain of length at most m starting from s and ending at s′ . In particular, the 0-view and 0-frontier of s are both the singleton set {s}. Note that s has at most |P| successors, one belongingP to each Sp . i Thus, inductively, the m-view of s contains at most Nm = m i=0 |P| = 1−|P|m+1 1−|P|
states, and the same bound holds for the size of m-frontiers. In order to reason about the “pattern” of a computation, we also need a notion of projection.
Definition 3. Let W = (S, η, ≤, V ) be a computation over (P, A), and let A ⊆ A. The projection of W onto A is the computation W ′ = (S ′ , η ′ , ≤′ , V ′ ) where S ′ = {s ∈ S | V (s) ∩ A 6= ∅}, and η ′ , ≤′ , V ′ are the respective restrictions of η, ≤, V to S ′ . The restriction of W to a b Vb ) where ηb, subset of states S ′ ⊆ S is the computation W ′ = (S ′ , ηb, ≤, ′ b b ≤, V are the respective restrictions of η, ≤ and V to S . The atomic formulae of our logic will assert that the projection of the computation formed from the m-view or the m-frontier of a state is isomorphic to a given computation. We are now ready to define the logic LPOC.
Definition 4. The set of LPOC formulae over a set of processes P and a set of atomic propositions A, is denoted by LPOC (P, A), and is inductively defined as follows: – For each p ∈ P, the symbol loc p is a formula in LPOC (P, A). – If ϕ, ϕ′ ∈ LPOC (P, A), then ¬ϕ and ϕ∨ϕ′ are formulae in LPOC (P, A). – Let m be a natural number, A be a subset of A, and T = (S, η, ≤ , V ) be a computation such that V (s) ⊆ A for every s ∈ S. Then ↓ m,A (T ), ↑ m,A (T ) are formulae in LPOC (P, A). – If ϕ, ϕ′ are formulae in LPOC (P, A), then EXϕ, EU(ϕ, ϕ′ ) are formulae in LPOC (P, A). From now on, we shall refer to formulae in LPOC (P, A) simply as formulae. Their semantics is interpreted at local states of a computation. For a computation W and a given state s of W , we write W, s |= φ when W satisfies φ, which is defined inductively as follows: – W, s |= loc p iff the location of s is p (i.e. η(s) = p), – W, s |=↓ m,A (T ) iff the projection of ↓ m (s) onto A is isomorphic to T. – W, s |=↑ m,A (T ) iff the projection of ↑ m (s) onto A is isomorphic to T. – W, s |= EXϕ iff there exists a state s′ in W such that s′ is a causal successor of s and W, s′ |= ϕ. – W, s |= EU(ϕ, ϕ′ ) iff there exists a causal path s1 s2 . . . sn in W with s = s1 . Further, there exists an index i in {1, 2, . . . , n} with W, si |= ϕ′ , and W, sj |= ϕ for every j in {1, 2, . . . , i − 1}.
Diagnosis with a Logic of Partial Order Computations We define the standard boolean operators ∧, ⇒ as usual, and some derived CTL temporal operators as follows. First, we have EFϕ ≡ EU(true, ϕ) V ¬EX locp ). That is, EFϕ means that there exists and EGϕ ≡ EU(ϕ, ϕ∧ p∈P
a causal path ρ starting from the current state such that ϕ holds at some state of ρ. And EGϕ demands that there exists a causal path ρ starting from the current state such that ϕ holds at every state of ρ. Secondly, we define AXϕ ≡ ¬EX(¬ϕ); AFϕ ≡ ¬EG¬ϕ; and AGϕ ≡ ¬EF¬ϕ. We write EXp (ϕ) for the formula EX(loc p ∧ ϕ). The notations EFp , EGp , AXp , AFp , AGp are defined in the same way. We will say that two formulae ϕ and ϕ′ are logically equivalent iff for any computation W , any state s of W , we have W, s |= ϕ iff W, s |= ϕ′ . In LPOC we can also assert the truth of atomic propositions at¡ a state of a computation. Let a ∈ A. Consider the formula ϕa = ∨p∈P loc p ∧ ↓ ¢ 0,{a} (Tp,a ) , where each Tp,a is the computation containing a singleton state of location p and valuation {a}. It is clear that for any computation W and a state s, we have W, s |= ϕa iff the valuation of s contains a. Observations of a system do not necessarily record all causal ordering among observed states. More precisely, if in an observation O, a state s is causally earlier than s′ , then in any explanation W of O, the state in W corresponding to s must be causally earlier than the state in W corresponding to s′ . But the converse need not hold. For this reason, we need to introduce the formulae ⇓ m,A (T ) and ⇑ m,A (T ), where m is a natural number, A ⊆ A and T is a computation such that the valuation of any state s of T is a subset of A. Let W = (SW , ηW , ≤W , VW ) be a computation and s ∈ SW . Consider the formula ⇓ m,A (T ) where T = (ST , ηT , ≤T , VT ). We say W, s |=⇓ m,A (T ) iff there exists an injective mapping f : ST →↓ m (s) satisfying: – For every s ∈ ST , ηT (s) = ηW (f (s)) and VT (s) = VW (f (s)) ∩ A. – For every s, s′ ∈ ST , s ≤T s′ implies f (s) ≤W f (s′ ). But we do not demand the converse. The semantics of ⇑ m,A (T ) is defined in the same way, except the mapping f is from ST to ↑ m (s). Remark that ⇓ m,A (T ) and ⇑ m,A (T ) can be easily defined in LPOC. Recall that the m-view of a state s of a computation contains at most Nm states. Now it is easy to see that ⇓ m,A (T ) W ↓ m,A (X) where X is the collection of is equivalent to the formula X∈X
computations X = (SX , ηX , ≤X , VX ) containing at most Nm states and for which there exists an injective mapping g : ST → SX satisfying: – For every s ∈ ST , ηT (s) = ηX (g(s)) and VT (s) = VX (g(s)). – For every s, s′ ∈ ST , s ≤T s′ implies g(s) ≤X g(s′ ). Lastly, for a computation W and a LPOC formula ϕ, we say that W satisfies ϕ, written W |= ϕ, iff there exists some minimal state smin of W such that W, smin |= ϕ. We say that ϕ is satisfiable iff there exists a computation W such that W |= ϕ. For application to diagnosis, it is useful to define the notion of a computations satisfying a collection of formulae, one for each process. Formally, for a computation W = (S, η, ≤, V ) and a P-indexed family of formulae {ϕp }p∈P , we say W satisfies {ϕp }p∈P , written W |= {ϕp }p∈P by abuse of notation, iff the following condition holds: for each p ∈ P, Sp 6= ∅ and
5
6
S.Yang, L.H´elou¨et,T.Gazagnaire W, sp |= ϕp where sp is the minimum stateVin Sp (i.e. sp ≤ s for every EU(¬loc p , loc p ∧ ϕp ). s ∈ Sp ). Note that W |= {ϕp }p∈P iff W |= p∈P
3
Diagnosis
Our goal is to use LPOC for diagnosis, that is find explanations for an observation from a set of formulae described in LPOC, that represents a specification of the observed system. To this end, we fix Aob ⊆ A the subset of observable atomic propositions, and Aex ⊆ A the subset of explanatory atomic propositions. The sets Aob and Aex are assumed to be disjoint. An observable atomic proposition is one whose truth value may be “recorded” during system execution. On the other hand, atomic propositions in Aex are those that may explain the cause of what have been observed. Thus in typical applications, explanatory atomic propositions correspond to the faults or state information that cannot be directly accessed. On the other hand, observable atomic proposition indicate state information that can be directly recorded, for instance, alarms and abnormal behavior. The observable atomic propositions are in a sense manifestations of the explanatory atomic propositions. To formulate the diagnosis problem, we introduce the notions of observation and explanation. ob
Definition 5. An observation is a computation with valuations in 2A . That is, (S, η, ≤, V ) is an observation iff V (s) ⊆ Aob for every s ∈ S. Let O = (SO , ηO , ≤O , VO ) be an observation and W = (SW , ηW , ≤W , VW ) a computation. Then, W is an explanation for O iff there exists an injective mapping f : SO → SW such that: (i) ∀s ∈ SO , ηO (s) = ηW (f (s)) and VO (s) = VW (f (s)) ∩ Aob . (ii) ∀s, s′ ∈ SO , if s ≤O s′ , then f (s) ≤W f (s′ ). (iii) For every s in the image of f , VW (s) ∩ Aob 6= ∅. Furthermore, for any s′ ∈ SW , if ηW (s′ ) = ηW (s), s′ ≤W s and VW (s′ ) ∩ Aob 6= ∅, then s′ is also in the image of f . Intuitively, an observation describes everything that is “recorded” during an the execution of some prefix of a computation. We assume that the observation is produced by a monitoring mechanisms that observes each process and transmits “recorded” observable atomic propositions to a central diagnosis system. Only records of the same process are guaranteed to be received by the central diagnosis system in the order they were sent. We emphasize that the observation may have only recorded an initial portion of the explanation. Thus, the image of f may be a proper subset of the states of W whose valuation contains observable atomic propositions. However, we suppose the observer is faithful, that is, it does not create wrong states or causalities. Note that for given O and W , if the mapping f exists, then it is necessarily unique. Thus, for each state s ∈ SO , it makes sense to call VW (f (s)) the W -valuation of s. Condition (i) means that the location mapping should be respected by the observation mechanism, and that in explanations, only states at which
Diagnosis with a Logic of Partial Order Computations at least one observable atomic proposition holds may be “recorded”, and hence appear in the observation. Condition (ii) asserts that the recorded causality orderings must originate from an execution. However, we do not demand the converse. That is, some causality orderings in the explanation may not be “recorded”. Condition (iii) states that there is no loss during recording of states: if a state of process p is recorded, then any causally preceding states s′ of p must be recorded in case the valuation of s′ contains observable atomic propositions. Definition 6. Let O be an observation, and {Φspec }p∈P be a set of forp mulae. W is an {Φspec }p∈P -explanation for O iff W is an explanation p }p∈P . for O and W |= {Φspec p Now we can define the diagnosis problem associated with LPOC. Definition 7. Let P be a set of processes, Aob be a set of observable atomic propositions, and Aex be a set of atomic explanatory propositions. The LPOC-diagnosis problem is defined as follows: given an observation O = (SO , ηO , ≤O , VO ) and a P-indexed family of formulae {Φspec }p∈P p over (P, Aob ∪Aex ), determine whether there exists a {Φspec }p∈P -explanation p for O. In what follows, we will often omit the subscript p ∈ P. The formulae {Φspec } specify the knowledge about execution of the system The obp jective of diagnosis is to figure out whether there exists an explanation for what have been observed. In case such an explanation exists, we also want to obtain more detailed information about the possible truth values of propositions in Aex at each observed state. We define the (explanaex tory) summary of O under {Φspec }p∈P as the mapping g : SO → 2A p ex such that for every s ∈ SO , g(s) ⊆ A , and a ∈ g(s) iff there exists an explanation W for O with W |= {Φspec }p∈P and a is in the W -valuation p of s. Thus, when the answer to the diagnosis problem is positive, we would want further to compute the summary of O. Unfortunately, in the general case, the LPOC-diagnosis problem is undecidable (and so is the computation of summaries). Theorem 1. The LPOC-diagnosis problem is undecidable. proof This theorem is proved by reduction from the Post Correspondence Problem (PCP). The reduction is similar to that used in decision problems related to message sequence charts (see for instance [16]). Consider an instance of PCP with Σ a finite alphabet such that |Σ| > 1, and (g1 , h1 ), . . . , (gn , hn ) a finite sequence of pairs of words over Σ. A solution (if it exists) is a finite sequence of indices j1 , j2 , . . . , jt in {1, 2, . . . , n} such that gj1 gj2 . . . gjt = hj1 hj2 . . . hjt . The reduction is as follows. We pick P = {p, q, r}. Let Ap = {ap | a ∈ Σ} ∪ {ip | i ∈ {1, 2, . . . , n}}, Aq = {aq | a ∈ Σ} ∪ {iq | i ∈ {1, 2, . . . , n}}, and Ar = {ipr , iqr | i ∈ {1, 2, . . . , n}}. We take Aex = Ap ∪ Aq ∪ Ar , and Aob = ∅. Let O be the empty observation, that is, a computation with no states. Thus, any computation is an explanation for O.
7
8
S.Yang, L.H´elou¨et,T.Gazagnaire We encode solutions to the PCP instance by computations which have the form illustrated in Figure 2. The total ordering of states on each process is drawn as a vertical line with the minimum state at the top. The downward-sloping arrows represent pairs (s, s′ ) of states in the successor relation such that s, s′ are on different locations. The label of each state indicates its valuation. We can construct formulae Φp ,Φq ,Φr such that {Φp , Φq , Φr } is satisfiable iff there exists a computation W with W |= {Φp , Φq , Φr } and W represents a solution to the PCP instance. It will then follow that the PCP instance has a solution iff there exists an explanation for O satisfying {Φp , Φq , Φr }.
q
p {i p}
{i pr }
p gi,1
hqi,1
p gi,|g i|
{i′p} gip′,1 gip′,|g ′ | i
r
{iq}
{iqr }
hqi,|hi| {i′q} {i′pr }
hqi′,1
{i′qr }
hqi′,|h ′ | i
Fig. 2. Encoding solutions to PCP with computations
In the sequel, we outline the construction of Φp , Φq and Φr . For A ⊆ Ap , we write VAL(p, A) for the formula locp ∧ ↓ 1,Ap (T ) where T is the computation with a singleton state of location p and valuation A. In other words, the formula VAL(p, A) asserts that the current state has location p and valuation A. We define the notations VAL(q, A), VAL(r, A) in the same way. The formula Φp is the conjunction of the following formulae. W (P1) VAL(p, {ip }). It asserts that the valuation of the minimum state i∈{1...n}
of Sp is {ip }, for some i in {1 . . . n}. V (P2) EGp ( (VAL(p, {ip }) → ϕi )) where each ϕi is given as follows. Let i∈{1...n}
gi,j be the j-th letter of gi , for j = 1, . . . , |gi |. Then ϕi asserts that there exist m states s1 , s2 , . . ., s|gi | in Sp with s1 being a successor of the current state, and sj+1 a successor of sj for j = 1, . . . , |gi | − 1. p Furthermore, the valuation of sj is {gi,j } for j = 1, 2, . . . , |gi |, and either s|gi | has no successor in Sp , or s|gi | has a successor in Sp whose valuation is {ℓp } for some ℓ in {1, . . . , n}. More precisely, ϕi =↓ |gi |+1,Ap (T )∨(∨ℓ=1,2,...,n ↓ |gi |+1,Ap (Tℓ′ )), where T is the computation ({t1 , t2 , . . . , t|gi | }, ηT , ≤T , VT ) with ηT (tj ) = p
Diagnosis with a Logic of Partial Order Computations for every j = 1, 2, . . . , |gi |, t1 ≤T t2 ≤T . . . ≤T t|gi | , and VT (tj ) = p {gi,j } for every j = 1, 2, . . . , |gi |. Similarly, for each ℓ = 1, 2, . . . , n, Tℓ′ is the computation ({t1 , t2 , . . . , t|gi | , t|gi |+1 }, ηT ′ , ≤T ′ , VTℓ′ ) with ηTℓ′ (tj ) = p for every j = 1, 2, . . . , |gi | + 1, t1 ≤Tℓ′ t2 ≤T ′ . . . ≤T ′ p t|gi |+1 , VTℓ′ (tj ) = {gi,j } for every j = 1, 2, . . . , |gi |, and VTℓ′ (t|gi |+1 ) = {ℓp }. V (VAL(p, {ap }) → EXq VAL(q, {aq }))). Intuitively, this as(P3) EGp ( a∈Σ
serts that every state of valuation {ap } of p is “matched” by a state of q with valuation {aq }. We emphasize that the matchings are guaranteed to be one-to-one and order-preserving. This is due to the fact that for every pair p, q of processes