Cooperative Runtime Monitoring Sylvain Hallé Département d’informatique et de mathématiques Université du Québec à Chicoutimi, Canada
[email protected] Abstract Requirements on message-based interactions can be formalized as an interface contract that specifies constraints on the sequence of possible messages that can be exchanged by multiple parties. At runtime, each peer can monitor incoming messages and check that the contract is correctly being followed by their respective senders. We introduce cooperative runtime monitoring, where a recipient “delegates” its monitoring task to the sender, which is required to provide evidence that the message it sends complies with the contract. In turn, this evidence can be quickly checked by the recipient, which is then assured of the sender’s compliance with the contract without doing the monitoring computation by itself. A particular application of this concept is shown on web services, where service providers can monitor and enforce contract compliance of third-party clients at a small cost on the server side, without having to certify or digitally sign them.
1
Introduction
Message-based communications can be found in a wide variety of domains and applications. Business-to-business communication can be represented as messaging patterns and modelled with a suitable notation such as BPMN. Business artifacts [16], passed from station to station and modified by each agent, can also be viewed as a form of message-passing mechanism suitable for the modelling of business processes. Distributed systems’ remote procedure calls (RPC) can be conveniently modelled as pairs of request-response messages whose parameters map to each call’s arguments and return values. Recently, the rise of web services as a paradigm for decoupling IT systems into independent functional units further confirmed the appropriateness of messaging as an interaction model by its use of SOAP as its underlying protocol. The reach of message-based communications extends to domains where the “communication” does not involve remote entities. In the Singularity operating system, inter-process communication is allowed solely through the use of dedicated message channels. Even platforms such as the iPhone OS, which allows
third-party applications to consume system resources by calling appropriate methods, can be likened to a form of messaging. Although in each of these cases, a “message” is conveniently regarded as an atomic unit of data, in many practical contexts, this assumption does not hold. The semantics of the underlying operations impose that these messages be sent and received following some agreed-upon sequence or contract. Various techniques have been suggested to ensure that a “client” (in its most general meaning, be it a third-party module in an operating system, a web application accessing a web service, or a business process interacting with another) uses the “server’s” available methods according to a contract. For example, the extensive testing of a potential client, followed by the generation of a “digital signature” that is checked at execution time, can assure a server that only pre-authorized (and supposedly compliant) clients are allowed to interact. However, pre-certification is cumbersome, and in many situations, does not even prevent a pre-validated client from being compromised. This is especially true of Ajax web applications, which run on a remote machine and whose plain-text source code can easily be tampered with. The only safe alternative is for the server to check contract compliance dynamically, as the message exchange unfolds, through a process called runtime monitoring. Yet, traditional runtime monitoring lacks the appeal of pre-validation, in particular since the whole compliance checking must be done by the server itself at runtime. This paper introduces a novel approach to contract compliance checking called cooperative runtime monitoring. In this framework, the client performs the same computation the server would do on its side when receiving a new message, and sends the result.
To make sure that the client can be trusted, a proof that the message is a valid continuation of the current transaction is appended to the message. The server merely checks the proof to be assured of the client’s “good faith”. This approach presents numerous advantages. First, it shifts the monitoring load onto the clients, each of which only needs to monitor its particular instance of the contract. Second, it still assures the server that a client follows the contract, message by message, without requiring a rigid and cumbersome a priori certification process. The present paper provides empirical results on a proof-of-concept system that uses a first-order extension of Linear Temporal Logic as its specification language, and shows that the same compliance guarantee can be ensured on the server side for a fraction of the computational cost of classical monitoring. It also shows how one of the fragments of LTL tractable for cooperative runtime monitoring corresponds exactly to the so-called “simple subset” of an IEEE standard called the Property Specification Language (PSL). This result is obtained through different means, and for a completely different motivation, than its original definition. The paper concludes by providing an analysis of variants on the concept of cooperative monitoring, where the three factors of computational savings, completeness of the monitoring and expressiveness of the specification language can be modulated to accommodate different kinds of requirements.
2
Monitoring of Message-based Contracts
Although message-based communication scenarios abound, perhaps the most fertile is the context of web services and web-based applications. In a web service interaction, two independent entities expose their functionalities as a set of possible requests and responses, which can be sent and received through XML documents (“messages”) containing element names nested within each other. A web application is a particular case of this scenario where one of the partners is a web browser running a client-side Javascript application.
2.1
A Motivating Example
To illustrate the need for compliance checking of message-based contracts, we provide an example taken from an actual web service, the Amazon E-Commerce Service (ECS). The ECS is a free service that exposes Amazon’s product data, and provides a number of operations to search for items in Amazon’s catalogue, and to create and manipulate the contents of a shopping cart that can eventually be “checked out” for payment directly on Amazon’s web site. Any third-party developer can obtain a free Amazon account and create a web application that interacts in the background with the ECS, through SOAP request-response messages. Although each of these operations is intended as a simple request-response pattern of interaction, the semantics of each operation is such that their ordering is also important. For example, it does not make sense (and is actually an error, according to Amazon’s documentation) to add contents to a shopping cart if no shopping cart has first been created. Similarly, one cannot delete an item from a shopping cart before adding anything to it. We list a few constraints on message sequences derived directly from Amazon’s online documentation:
P1. The CartCreate message must precede any occurrence of CartAdd, CartModify or CartRemove.
P2. CartModify and CartRemove must occur after one CartAdd.
P3. If CartClear is invoked, no CartAdd, CartModify or CartRemove can occur before a new CartCreate.
P4. If CartRemove removes some item c, no CartModify can involve item c anymore.
The reader is referred to [29] for a detailed description of sequential constraints for the ECS. The Amazon ECS is not an exceptional example. Other contexts where the sequence of messages must be taken into account have been described [18, 35], including the PayPal web service [23] and the “channel contracts” in the Singularity operating system [30].
2.2
Compliance to Message-based Contracts
In addition to the pre-defined structure of each XML message, specified in a WSDL document that accompanies each web service, the conjunction of all sequential constraints on these messages describes an interface contract. However, while checking that each message has a valid structure is straightforward, making sure that each message is the continuation of a valid sequence is much harder. No standard exists at the moment to formalize sequential contracts in the way WSDL does for message structure. It is therefore left to each application developer to peruse the plain-text documentation of a given web service, and make sure that an Ajax application interacting with this service follows all the sequential constraints scattered across that documentation. Worse yet, even though a developer is required to have a valid Amazon user ID to use the Amazon web services, this ID does not provide any assurance that the application she develops follows the contract. This issue is not specific to Amazon, or even to web services in general. Any platform allowing third-party applications to use its resources is faced with the same problem. As soon as some form of “dialogue” is implied between a client and a server, making sure that this dialogue is followed by both parties is an open problem. It would be tempting to believe that web service orchestration languages, such as BPEL [2] or BPMN [37], solve that problem by dealing with so-called “behavioural properties”. This, however, is not the case. First, the scenario described in the previous section shows that both ends of the communication run their own code and are not coupled in anything more than the messages they exchange through an HTTP connection. One cannot therefore assume a higher-level, centralized execution engine that would coordinate message exchanges from both sides.
Hence the use of any language, either on the server or the client side, is irrelevant to the problem: the fact that a BPEL or BPMN engine is used on any end of that communication has nothing to do with the fact that client and server may send messages unexpected by the other. Web service choreography languages, such as WS-CDL [31], allow one to specify constraints on the global pattern of messages exchanged between multiple peers. A choreography specification leaves actual implementation decisions in the hands of each individual peer. However, nothing prevents two peers from agreeing on a global choreography, yet producing individual implementations that do not comply with it and create errors when combined in an actual interaction. A couple of remedies have been explored, which we describe below.
2.2.1
A priori Certification
A possible solution is for the server’s organization to certify each third-party client beforehand for contract compliance. Ways of doing so include thorough testing in a variety of possible use cases, or static analysis such as model checking. Once the organization deems the client good enough, it grants a digital signature, possibly computed from the client’s code, or some other form of authorization. The server can then require each client to provide a valid signature in order to start interacting. This is the approach followed, to different degrees, by many current application platforms, such as the AppStore for the Apple iPhone. However, this process requires resources (testing teams or powerful static analysis tools) dedicated to this certification. Moreover, numerous examples show that this barrier can be bypassed (e.g. “jail-breaking” of iPhones [46]). In the web realm, this approach is simply null and void, since the plain-text JavaScript code of Ajax applications can be subverted at runtime through techniques such as prototype hijacking [38]. Not surprisingly, the a priori certification of web applications has never been seriously considered.
Figure 1: In classical runtime monitoring, a message to be sent (or received) is first checked for compliance with the current state of the contract ϕ through the use of a function called δ. It can be done on the server side 1(a) or on the client side 1(b).
2.2.2
Server-side Runtime Monitoring
A possible way to escape this issue is to monitor the message exchange between a client and a server at runtime. In particular, the server, requiring compliance from its clients, follows the message exchange and constantly compares it to the contract specification. This can be integrated directly in the server’s application logic, or as an independent agent at the server’s interface that performs a first compliance check before relaying the message to the application proper, as explored in various works [27, 18, 35, 14, 4, 34]. No trust assumption is required on the clients, since any non-compliant behaviour is checked (and prevented) by the server itself anyway. This is shown in Figure 1(a). Although this approach ensures contract compliance from clients, its downside is to still let any message reach the server itself. The application is protected from erratic message exchanges, but the computational burden of this protection is carried solely by the server, without requiring anything from the clients.
Figure 2: Framework for cooperative runtime monitoring.
2.2.3
Client-side Runtime Monitoring
A sensible workaround to this issue would be to have each client monitor the message exchange with the server, as shown in Figure 1(b). This way, instead of having a centralized monitoring of every transaction on the server side, the same computational load is distributed among all clients, which each follow only one transaction (i.e. theirs). Moreover, no time is wasted on the server side to weed out bad sequences of messages, as each client censors itself through its monitoring “guard dog”. Client-side monitoring was studied by [22], where a runtime monitor was placed atop an Ajax web application and run by the user’s web browser. Studies on a real web application using the Amazon web services showed that such an approach does not put undue load on the client web application, while at the same time making sure that the contract is followed.
2.3
Cooperative Runtime Monitoring
There is, however, one caveat to this approach: how can a server be guaranteed that a client monitors the right contract (or any contract at all) when interacting with it? Clearly, a server cannot let all its guards down without “trusting” the other end of the exchange. We therefore introduce an approach that intends to be a middle-ground solution between not trusting the client (server-side monitoring), and completely trusting the client (client-side monitoring). In cooperative runtime monitoring, the task of asserting the compliance of a message to some sequential contract is shared between the client and the server. The process is shown in Figure 2. First, the sender computes (with a function γ) a “proof” that its message complies with the current state of the contract (dashed rectangle) and includes this proof along with the message to send. Upon reception of this message, the server checks that this proof is coherent with the message it accompanies (function µ), and then checks the proof itself (function ν). The message is relayed to the server if the proof is deemed correct with respect to the contract specification. This way, the burden of proving compliance is shifted to the sender, who has to compute and produce a “compelling” piece of evidence that the receiver can then simply verify.
For such a scheme to work, however, three basic requirements must be fulfilled:
1. The proof must be equivalent to the monitoring computation; that is, the proof should be judged correct if and only if the message that accompanies it follows the contract.
2. The proof must be unspoofable. This means that any arbitrary proof, aimed at fooling the receiver into accepting a bad message, should be detected as such.
3. Checking the proof should be tractable. It is generally accepted that a “tractable” algorithm runs in time polynomial with respect to the size of its input.
Remark that nothing is said about the hardness of building the proof; this part is delegated to the sender of a message. Obviously, the interest of this method is for the “verifying” part (µ and ν) to be easier than the “proving” part (γ). The tractability of checking the result of that computation corresponds to a familiar concept, that of the NP complexity class. Formally, a decision problem P is in the NP complexity class if, for a potential solution x, it can be verified that P(x) holds in time polynomial in the size of x [15]. From this observation, it follows directly that:
Requirement 1. Cooperative runtime monitoring requires that both µ and ν be in NP.
This principle might seem paradoxical, since any error in the client implementation resulting in an invalid continuation of the current message exchange should be intercepted by γ, which could not produce a proof of compliance and hence not send the message to the server. Actually, the inclusion of the proof acts as a deterrent: it coerces clients to check their messages and provides ground for the server to reject them in the presence of a faulty (or nonexistent) proof.
3
A Formal Model for Cooperative Runtime Monitoring
Based on the informal model described above, we devise a formal model for cooperative runtime monitoring based on an extension of Linear Temporal Logic (LTL), called LTL-FO+ . LTL itself is an extension of classical propositional logic that expresses properties over message traces. Many major model checking tools such as SPIN [28] verify temporal formulæ expressed in LTL. Other approaches for specifying contracts include star-free regular expressions, Hennessy-Milner Logic, µ-calculus, PSL [12], LTL-FO [10], LTL-FO+ [27], SWSpec [44, 48, 47], Logscope [21], Eagle and RuleR [5], which all subsume LTL. Two languages, DecSerFlow [42], and Let’s Dance [9], propose graphical renditions of sequential patterns of events. The reader is referred to [8] for a deeper coverage of LTL and other temporal logics.
3.1
Extended Linear Temporal Logic
A message trace m0 m1 . . . , noted m, represents a sequence of incoming and outgoing messages at a peer’s interface over a period of time. The basic building blocks of LTL-FO+ formulæ are propositional variables p, q, . . . , expressing Boolean conditions on particular messages of this trace. More precisely, in the present context, each propositional variable stands for a simple XPath expression denoting a particular path inside the XML document. The expression evaluates to true if the path can be found in the current message, and to false otherwise. For example, the expression /CartCreate/Cart/CartID/123 is true on a CartCreate message whose CartID element, nested inside a Cart element, contains the value “123”. The complete semantics of LTL-FO+ is given in Table 1. This table gives the formal conditions for a trace m to satisfy a given formula ϕ, written as m |= ϕ. These conditions are expressed recursively on the structure of the formula. On top of propositional variables, LTL-FO+ allows Boolean connectives ∨ (or), ∧ (and), ¬ (not), bearing their usual meaning and temporal operators to express constraints on the sequence of messages. The temporal operator G means “globally”; the formula G ϕ means that formula ϕ is true in every message of the trace, starting from the current message. The operator F means “eventually”; the formula F ϕ is true if ϕ holds for some future message of the trace. The operator X means “next”; it is true whenever ϕ holds in the next message of the trace. Finally, the U operator means “until”; the formula ϕ U ψ is true if ϕ holds for all messages until some message satisfies ψ. We also define ϕ V ψ as ¬(¬ϕ U ¬ψ) and ϕ W ψ as (ϕ U ψ) ∨ G ϕ. In LTL-FO+ , these operators are completed with a universal and existential quantifier, used to fetch values inside messages, store them into variables and refer to them at a later time. 
Intuitively, the expression ∃π x : ϕ means that there exists a possible value c such that formula ϕ holds when occurrences of x are replaced by c. The possible values are fetched from the current message m with the help of a domain function, Dom(π, m), where π is a filter expression specifying locations in the message where to get the values. In the following, π will represent a path in an XML message. Hence, if we let π = /message/a/b and m be the following XML message:
0 1 2 3
then Dom(π, m) = {0, 1}, which indeed designates all values at the end of path of the form /message/a/b. We will denote Dom(π) the union of Dom(π, m) for all possible messages m; we assume this set to be finite and known in advance.
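As an illustration of the domain function, here is a minimal Python sketch, assuming that π is an absolute slash-separated path whose first step names the root element. The function name `dom` and the sample message are hypothetical; the message was made up so that exactly the values 0 and 1 sit at the end of /message/a/b, with other values placed elsewhere.

```python
import xml.etree.ElementTree as ET

def dom(path, message_xml):
    """Return the set of text values found at the end of `path` in the message.

    Assumption: `path` is an absolute path such as "/message/a/b" whose
    first step must match the root element of the XML document.
    """
    root = ET.fromstring(message_xml)
    steps = [s for s in path.split("/") if s]
    if not steps or steps[0] != root.tag:
        return set()
    nodes = [root]
    for step in steps[1:]:
        # Descend one level, keeping only children with the requested name.
        nodes = [child for node in nodes for child in node if child.tag == step]
    return {(n.text or "").strip() for n in nodes}

# Hypothetical message: two values under /message/a/b, two under other paths.
example = "<message><a><b>0</b><b>1</b></a><c><b>2</b><d>3</d></c></message>"
values = dom("/message/a/b", example)
```

Values are returned as strings; a path that matches nothing simply yields the empty set, which makes a universal quantifier over it vacuously true and an existential one false.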
m |= ¬ϕ ≡ m ⊭ ϕ
m |= ϕ ∧ ψ ≡ m |= ϕ and m |= ψ
m |= ϕ ∨ ψ ≡ m |= ϕ or m |= ψ
m |= ϕ → ψ ≡ m ⊭ ϕ or m |= ψ
m |= X ϕ ≡ m^1 |= ϕ
m |= G ϕ ≡ m^0 |= ϕ and m^1 |= G ϕ
m |= F ϕ ≡ m^0 |= ϕ or m^1 |= F ϕ
m |= ϕ U ψ ≡ m^0 |= ψ or both m^0 |= ϕ and m^1 |= ϕ U ψ
m |= ∀π x : ϕ ≡ for all c ∈ Dom(π, m_0), m |= ϕ[x/c]
m |= ∃π x : ϕ ≡ for some c ∈ Dom(π, m_0), m |= ϕ[x/c]
Table 1: The semantics of LTL-FO+ operators, where ϕ and ψ are LTL-FO+ formulæ, m is a message trace m_0, m_1, m_2, . . . and m^i designates its suffix m_i, m_{i+1}, . . . .
Models of LTL-FO+ formulæ are infinite traces of messages m = m_0, m_1, . . . . In the context of runtime monitoring, decisions must be taken on finite prefixes of these traces. Therefore, we extend the definitions above for finite traces, by specifying that a finite trace m satisfies a formula ϕ if it can be extended into an infinite trace m′ that satisfies ϕ. Since most observed traces will be finite, it is possible that the observed trace m be such that neither m |= ϕ, nor m |= ¬ϕ. An example of this is the formula F ϕ for any formula ϕ. This formula can never become false on a finite prefix of any trace, as the next message can always fulfill ϕ. Upon reaching the end of the finite trace, since ϕ is neither confirmed nor violated, the result is undefined. In this case, strong satisfiability requires that ϕ be fulfilled at least once before the end of the finite trace, and regards the formula as false. Conversely, weak satisfiability evaluates the formula to true. A further discussion on the interpretation of LTL formulæ on finite prefixes can be found in [6, 13]; the discussion could be adapted to LTL-FO+ as well. The remaining discussion is independent of this semantical choice.
Using this language, the properties described in Section 2.1 can be formalized as LTL-FO+ formulæ. For example, property P1 becomes
¬(/CartModify) U /CartCreate
This formula states that, until some message contains a “CartCreate” element, a message cannot contain a “CartModify” element. This is indeed equivalent to the first part of property P1; a similar formula can be written to state the same condition with CartAdd and CartRemove messages. In the same way, translations of P2 and P3 are:
¬(/CartModify) U /CartAdd
G (¬/CartClear ∨ (¬/CartAdd U /CartCreate))
Finally, property P4 refers to data elements inside messages and requires the quantification mechanism specific to LTL-FO+ :
G (∀/CartRemove/Item c : G (∀/CartModify/Item c′ : c′ ≠ c))
This formula states that globally, whenever an Item element in a CartRemove message has some value c, then from now on, any Item element in a CartModify message has a value c′ different from c. This amounts to stating that an item removed from the cart cannot be modified afterwards. The reader is referred to [27, 22] for more formalizations of message properties in LTL-FO+ , and to [23] for further examples of temporal message properties in actual web services.
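To make the reading of the P4 formula concrete, the following Python sketch checks it directly on a finite trace. It is a hand-written check for this one property, not a general LTL-FO+ evaluator. It assumes, for illustration only, that each message is abstracted as a pair (operation name, list of Item values); the function name `p4_holds` is hypothetical.

```python
def p4_holds(trace):
    """Check P4 on a finite trace: a removed item is never modified afterwards.

    Assumption: each message is a pair (operation, items), where `items`
    lists the values of the Item elements carried by that message.
    """
    removed = set()  # every value c bound so far by the outer quantifier
    for operation, items in trace:
        if operation == "CartRemove":
            # Outer G and universal quantifier: remember each removed item c.
            removed |= set(items)
        elif operation == "CartModify" and removed & set(items):
            # Inner G: some modified item c' equals a previously removed c.
            return False
    return True

good = [("CartCreate", []), ("CartAdd", ["a", "b"]),
        ("CartRemove", ["a"]), ("CartModify", ["b"])]
bad = [("CartCreate", []), ("CartAdd", ["a"]),
       ("CartRemove", ["a"]), ("CartModify", ["a"])]
```

Here `p4_holds(good)` is true and `p4_holds(bad)` is false. The set `removed` plays the role of the values captured by the quantifier on CartRemove messages, which is why the property cannot be expressed with propositional variables alone.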
3.2
Classical Runtime Monitoring
The runtime monitoring of an LTL-FO+ formula over some message trace m consists in processing each message one at a time, and outputting an intermediate result regarding the validity of this trace relative to the formula. This should be contrasted to offline methods such as [24], which analyze a complete trace without producing intermediate results after each message. An “on-the-fly” runtime monitoring algorithm for LTL has been developed by [17], and implemented as a string-rewriting algorithm in the Maude language by [40]. The algorithm assumes that an LTL formula has first been converted into its equivalent Negation Normal Form (NNF), using simple equivalence rules. The repeated application of these rules has the effect of pushing any negations to the lowest level of the formula, next to the propositional variables. This transformation preserves the truth value of any formula.
The algorithm works as follows: the formula ϕ to monitor is first placed into the right-hand side of a node of the form ∅ ϕ. Intuitively, the left part of the node represents the formulæ that must be true in the current message, and the right part represents the formulæ that must be true in the next message to be read (the symbol is a simple separator with no additional semantics). Since no message has yet been processed, the specification goes into the right-hand side. Once a message must be processed, the right-hand side of the last generated node is shifted to the left-hand side, which is then decomposed using the rules described in Figure 3. These rules successively break down the formula on the left-hand side into a list of smaller formulæ. Some of these rules send some contents to the right-hand side in the process, such as the decomposition for the X operator; indeed, for X ϕ to be true in the current message, ϕ must be true in the next message. Similarly, if G ϕ is true in the current message, this means both that ϕ is true in the current message (ϕ therefore appears on the left-hand side), and G ϕ must still be true in the next message (and G ϕ appears on the right-hand side of the node). Finally, some decompositions introduce two children; hence F ϕ can hold if ϕ is true in the current message, or otherwise if F ϕ holds in the next. The decomposition rules for other operators, including LTL-FO+ ’s quantifiers, follow the same intuitive approach.
Figure 3: Decomposition rules for an LTL-FO+ formula.
No further decomposition can be applied when each formula has been broken down into individual propositional variables, or negations of variables. These “atoms” are then evaluated against the current message (they evaluate to true or false, depending on the XPath expression they stand for, and whether they are negated or not). An atom evaluating to false immediately spawns the node ⊥, while an atom evaluating to true is simply removed from the node. The current message is a valid continuation of the trace if this decomposition spawns at least one non-⊥ leaf. In such a case, the set of such non-⊥ leaves is then taken as the roots of new decomposition trees, awaiting the next message, and the process starts over. A more detailed description of this algorithm can be found in [27].
Figure 4 shows a sample decomposition for a message where p is true, q is true and s is false. One can see that this message is a valid continuation of the trace from state G (p ∧ (X q ∨ F s)), since the resulting decomposition spawns at least one leaf node which is not ⊥. Daisy-chaining applications of δ to each message sent and received, and refeeding the resulting state for the next evaluation allows one to perform runtime monitoring of ϕ.
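The decomposition described above can be sketched in Python as follows. This is an illustrative reading of the rules for the propositional and temporal operators only (quantifiers are left out), not the implementation of [27]. The tuple encoding of formulæ, the representation of a message as the set of propositional variables it makes true, and the names `decompose` and `delta` are all assumptions made for this sketch.

```python
def decompose(left, right, message):
    """Yield the right-hand side (next state) of every leaf that is not ⊥.

    left:    tuple of NNF formulæ that must hold on the current message
    right:   frozenset of formulæ that must hold on the next message
    message: set of the propositional variables true in the current message
    """
    if not left:
        yield frozenset(right)          # fully decomposed, non-⊥ leaf
        return
    f, rest = left[0], left[1:]
    op = f[0]
    if op == "var":                      # atom: false spawns ⊥ (nothing yielded)
        if f[1] in message:
            yield from decompose(rest, right, message)
    elif op == "not":                    # negation sits only on variables (NNF)
        if f[1][1] not in message:
            yield from decompose(rest, right, message)
    elif op == "and":
        yield from decompose((f[1], f[2]) + rest, right, message)
    elif op == "or":                     # two children
        yield from decompose((f[1],) + rest, right, message)
        yield from decompose((f[2],) + rest, right, message)
    elif op == "X":                      # operand must hold in the next message
        yield from decompose(rest, right | {f[1]}, message)
    elif op == "G":                      # operand now, G-formula again next
        yield from decompose((f[1],) + rest, right | {f}, message)
    elif op == "F":                      # operand now, or F-formula again next
        yield from decompose((f[1],) + rest, right, message)
        yield from decompose(rest, right | {f}, message)
    else:
        raise ValueError("unsupported operator: %r" % (op,))

def delta(states, message):
    """One monitoring step; an empty result means the message is a violation."""
    result = set()
    for state in states:
        result.update(decompose(tuple(state), frozenset(), message))
    return result

p, q, s = ("var", "p"), ("var", "q"), ("var", "s")
phi = ("G", ("and", p, ("or", ("X", q), ("F", s))))
leaves = delta({frozenset({phi})}, {"p", "q"})   # p and q true, s false
```

On the state G (p ∧ (X q ∨ F s)) and this message, `leaves` contains two next states, {q, G (p ∧ (X q ∨ F s))} and {F s, G (p ∧ (X q ∨ F s))}; the branch that tries to satisfy s immediately is dropped as ⊥. Feeding `leaves` back into `delta` with the next message corresponds to the daisy-chaining of δ.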
Figure 4: Decomposition from a start state, with a message where p and q are true and s is false.
3.3
Theoretical Consequences
The overall complexity of the previous algorithm has been established:
Theorem 1. LTL-FO+ runtime monitoring is PSPACE-hard.
Proof. The proof is done by reduction from the Quantified Boolean Formula (QBF) satisfiability problem, which is known to be PSPACE-complete. Take any quantified Boolean formula ϕ; without loss of generality, we can assume that ϕ is in prenex form and can hence be written Q1 x1 Q2 x2 . . . Qn xn ψ, where each Qi is either ∀ or ∃ and the xi are variables occurring in ψ. We can also assume that negations have been pushed directly to the variables. We build the following XML message m:
<x1>0</x1> <x1>1</x1>
<x2>0</x2> <x2>1</x2>
...
<xn>0</xn> <xn>1</xn>
We can transform ϕ into an LTL-FO+ formula ϕ′ by performing three simple changes: each quantifier ∀xi is replaced by ∀message/xi xi (ditto for ∃), each occurrence of xi in ψ is replaced by xi = 1, and each occurrence of ¬xi in ψ is replaced by xi = 0. Then, clearly, the trace made of the single message m satisfies ϕ′ if and only if ϕ is satisfiable. Since m and ϕ′ are linear in size with respect to the original formula, the reduction is polynomial and LTL-FO+ monitoring is PSPACE-hard.
The combination of this result with Requirement 1 leads to a result with important consequences:
Theorem 2. If LTL-FO+ can be used for cooperative runtime monitoring, then P = NP.
Proof. By Requirement 1, the cooperative monitoring scheme imposes that verifying that a trace of messages complies with an LTL-FO+ specification must be an NP problem. But by Theorem 1, LTL-FO+ monitoring is PSPACE-hard. Since P ⊆ NP ⊆ PSPACE, these two facts can only be reconciled if P = NP.
Since the conclusion of Theorem 2 is deemed very unlikely (it actually requires that at least three complexity classes collapse into one), we can safely consider the problem of cooperative LTL monitoring impossible. More precisely, any method of checking that some evidence provided by the sender follows the contract will be as complex for the server as if it simply monitored the whole conversation by itself. Hence, “outsourcing” the monitoring computation to the client offers no gain unless it can be trusted.
It is important to note that this corollary applies not only to LTL-FO+ , but to any specification language at least as expressive as LTL-FO+ , such as all the languages mentioned at the beginning of this section.
4
Cooperative Runtime Monitoring in LTL-FO+
To use LTL-FO+ for cooperative runtime monitoring, a compromise must therefore be made regarding the input language. In particular, it can be restricted to a subset of its possible formulæ. Such a fragment, if well chosen, could then ensure us that the decompositions produced by the on-the-fly algorithm can always be checked in polynomial time. The definition of the µ function is straightforward. It consists of checking that each propositional variable has the proper truth value with respect to the message sent. Since each propositional variable stands for a simple path in the current message, which is either present or not, we can safely assume that checking this property is polynomial with respect to message length.
4.1
Proofs in LTL-FO+
First, we must define what constitutes a proof that a message is a valid continuation of a trace, according to some LTL-FO+ formula. In Section 3.2, we have shown an algorithm which, from a given state and a set of propositional variables, produces a derivation tree. The message is a valid continuation if this tree contains at least one leaf that is not ⊥, as shown in Figure 4. Then, the sequence of decomposition rules, applied from the start state and leading to the non-⊥ leaves, can be seen as witnesses that the current message is valid. In Figure 4, there are two such witnesses: the paths from the root to the leftmost and rightmost leaves, respectively. These witnesses can be described in a shorthand notation, by simply giving the sequence of decomposition rules applied at each derivation step. For example, the transition from the root node to its immediate child is obtained through the decomposition rule for the G operator; similarly, the last transition leading to the rightmost leaf is obtained by taking the right-hand side of the decomposition rule for the F operator. Using this notation, one can succinctly describe the two witnesses in Figure 4 as follows:
G, ∧, p, ∨1 , X : {q, G (p ∧ (X q ∨ F s))}
G, ∧, p, ∨2 , F2 : {F s, G (p ∧ (X q ∨ F s))}
4.1.1
An Example
Let us take as an example a server whose current state is the LTL formula G (p ∧ (X q ∨ F s)), and which receives a message where p and q are true, and s is false, accompanied by these two witnesses. The server will first process the first witness and break down its current state according to the symbols contained in the proof.

1. The first symbol is G; this is consistent with the server’s current state, whose top-level operator is also G. The server pushes its whole state into the next-state set of formulæ, and peels off the operator from its current state, which becomes p ∧ (X q ∨ F s).
2. The second symbol in the witness is ∧, which again corresponds to the top-level connective in the server’s state. The connective is removed, and both its operands are kept side by side.
3. The third symbol in the witness is p. By convention, we assume binary connectives are decomposed starting with the left operand. The ground term p is indeed the left-hand side of the server’s current state. Since ground terms express assertions on the content of the message being sent, the server will check that p indeed holds for that message. In our example, we assumed p is true, so that the server’s check succeeds; p is hence removed from the current state, and the evaluation for the left-hand side of ∧ is over.
4. The penultimate symbol in the witness is ∨₁. This is in line with the top-level connective of the server’s current state. As we have seen, the “1” subscript indicates that the server should keep the left-hand operand of the connective, which is X q.
5. The last symbol in the witness is X; the server removes it from its current state, which becomes q. It also pushes q into the next-state set of formulæ.

At this point, all logical connectives have been removed from both the witness and the server’s state in the same sequence, and no logical connectives are left to remove from either. Moreover, all ground terms that occurred in the witness have been evaluated on the accompanying message and resulted in the expected value. Finally, the next-state set of formulæ computed by the server is identical to the one carried in the proof. Therefore the message is a valid continuation of the current trace, and the server processes it accordingly. (A similar treatment should be done for the second witness.)
One can see how this procedure fails when the proof or the message is modified. Suppose one wants the server to accept a message where p is false. A first possibility is to send that message with the same witnesses as above. The server’s verification of the proof will fail at step 3, when p is evaluated on the message and returns false instead of true as expected. A second possibility is to replace p by ¬p in the proof, hence trying to trick the server into validating a different contract in which the message would be valid. However, the verification will again fail at step 3, this time because the server’s state expects p while the witness contains ¬p. Finally, one might want to relax the condition by replacing ∧ by ∨ in the witness (thus allowing the possibility that p be false in the current message); the verification will fail again, this time at step 2, where the server expects ∧ and finds ∨₂ instead.
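The replay procedure of this example can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the paper: the tuple encoding of formulæ, the symbol names ("and", "or1", "or2", "F1", "F2") and the function name check_witness are assumptions introduced here, and only the propositional operators appearing in the example are handled.

```python
# Formulas are nested tuples (hypothetical encoding):
# ("atom", "p"), ("and", l, r), ("or", l, r), ("G", f), ("X", f), ("F", f).

def check_witness(state, symbols, claimed_next, message):
    """Replay a witness on `state`; True only if every step can be applied."""
    pending = [state]        # formulas still to decompose, leftmost first
    next_state = set()       # obligations carried over to the next message
    for sym in symbols:
        if not pending:
            return False     # the witness is longer than the formula allows
        f = pending.pop(0)
        op = f[0]
        if sym == "G" and op == "G":
            next_state.add(f)                 # G f must hold again next time
            pending.insert(0, f[1])
        elif sym == "and" and op == "and":
            pending[0:0] = [f[1], f[2]]       # keep both operands, left first
        elif sym in ("or1", "or2") and op == "or":
            pending.insert(0, f[1] if sym == "or1" else f[2])
        elif sym == "X" and op == "X":
            next_state.add(f[1])
        elif sym == "F1" and op == "F":
            pending.insert(0, f[1])           # f is claimed to hold now
        elif sym == "F2" and op == "F":
            next_state.add(f)                 # F f is postponed
        elif op == "atom" and sym == f[1]:
            if not message.get(sym, False):
                return False                  # ground term false in the message
        else:
            return False                      # symbol does not match the state
    return not pending and next_state == set(claimed_next)

p, q, s = ("atom", "p"), ("atom", "q"), ("atom", "s")
phi = ("G", ("and", p, ("or", ("X", q), ("F", s))))
msg = {"p": True, "q": True, "s": False}
witness1 = (["G", "and", "p", "or1", "X"], {q, phi})
witness2 = (["G", "and", "p", "or2", "F2"], {("F", s), phi})
```

Calling check_witness(phi, witness1[0], witness1[1], msg) replays the five steps above; the three tampering attempts just discussed (a message where p is false, a witness containing ¬p, or ∧ replaced by ∨₂) each make the function return False.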
4.1.2
Satisfaction of Requirements
We can now revisit each of the three requirements given in Section 2.3 and show they are achieved.

1. Equivalence to monitoring: this follows directly from the fact that each witness is a branch from the actual decomposition tree produced by the full-fledged runtime monitoring algorithm. A message is a valid continuation of the trace if at least one branch does not result in a dead end. Therefore, the presence of a valid witness entails a valid continuation of the trace.
2. Unspoofability: a witness is spoofed when its sequence of symbols is altered. Two cases must then be considered:
• The server rejects the tampered witness. Then the spoof has failed to have the server process an invalid message.
• The server accepts the proof and processes the corresponding message. Then by the previous point, the tampered witness happens to be another valid branch of the decomposition tree, which entails that the message was valid in the first place.
3. Tractability: each witness has a length linear in the size of the original formula; this is true since, at any derivation step, at least one symbol of the original formula is removed to produce one of the children nodes. Hence from a known start state, the receiver can easily take one of the witnesses, compute the sequence of derivation rules and check that the resulting end state corresponds to that sequence of derivations. One can devise a polynomial-time algorithm ν which, given a start state and a sequence of derivation rules, checks that the sequence of derivations can indeed be carried to the end.

This procedure has the advantage that “dead-end” branches need not be expanded by the receiver. It therefore presents the potential of making the verification of a given derivation simpler than its computation, in the case where proportionally few branches end up in non-⊥ states compared to the size of the whole derivation tree.
4.2
An NP Fragment of LTL-FO+
In some situations, the number of branches to develop might amount to a significant proportion of the derivation tree, and even to the whole tree in the case where none of the branches ends up with ⊥.¹ In such a case, checking the proof amounts to reproducing the whole derivation on the receiver side. This is consistent with Theorem 2, which predicts that a proof that a message complies with an LTL-FO+ formula can, in some cases, be as long as its computation. As a first workaround to this consequence, we are interested in restricting the input language to a subset of LTL-FO+ formulæ where a derivation is easier to check than to compute. More precisely, one wants to find fragments of LTL-FO+ that are in NP, in line with Requirement 1.

¹ Remark that all witnesses must be kept, since the validity of the next message requires that any one of them spawns a non-⊥ node in the next round of decomposition.

To this end, we introduce a new subset of LTL-FO+, called the non-branching fragment. We call a formula operator-free when it does not contain any temporal operator. We then define non-branching LTL-FO+ as follows:

Definition 1. An LTL-FO+ formula ϕ is non-branching if it follows these rules:
1. In ϕ ∨ ψ, one of ϕ or ψ is operator-free
2. In F ϕ, ϕ is operator-free
3. In ϕ U ψ and ϕ V ψ, both ϕ and ψ are operator-free
4. In ∃_π x : ϕ(x), ϕ(c) is operator-free for all but one value c ∈ Dom(π)
5. There is at most one universal quantifier.

By removing conditions 4 and 5 and restricting ourselves to LTL operators, one obtains a similar “non-branching” fragment for classical LTL. The next theorem shows that a non-branching formula always produces a proof that can be checked in polynomial time.

Theorem 3. Let ϕ be a non-branching LTL formula, and c be the proof obtained from the on-the-fly decomposition algorithm for some message m. The size of c is polynomial in the length of ϕ.

Proof. The proof is done in three steps. We first show that the derivation of ϕ produces a single branch for any message m. Then we show that the resulting leaf node contains only non-branching LTL-FO+ formulæ. Finally we show that the size of each node is itself linear in the length of the original formula.

Step 1: we proceed to show that the derivation of ϕ does not produce any branching. We first remark that the derivation of any operator-free formula ψ does not produce branching. Indeed, in such a case, ψ is a propositional formula whose truth value is completely determined by the contents of the current message.
Its derivation will produce branches that will all end either with the ⊥ node, or with a node of the form ∅ ∆, with an empty left side. In the first case, this branch is not a witness and does not need to be included in the proof. In the second case, ψ is true, and any one such branch can be chosen as the witness.

If the current operator to decompose is G, X, ∀ or ∧, then no branching ever happens. We study the remaining operators one by one:
• (∨): since ϕ is non-branching, then ϕ = ψ ∨ ψ′ and we can suppose, without loss of generality, that ψ is operator-free. By the previous remark, ψ does not produce any branching.
• (F ϕ): by definition, ϕ is operator-free. Therefore, the left-hand side of the decomposition rule for F is a propositional formula; it does not produce any branching. Only if all the nodes produced by this side of the decomposition end up with ⊥ can the right-hand side of the F rule be expanded. In such a case, the whole left side can be discarded, and again no branching is produced.
• (ϕ U ψ): similarly to the previous cases, one can see that the decomposition of U branches into two nodes with formulæ that are, by definition, operator-free. By the above remark, any one branch ending with a non-⊥ node is an appropriate witness, and all others can be discarded. A similar reasoning can be applied for V.
• (∃): one must remark that ∃_π x : ψ(x) is equivalent to ψ(c₁) ∨ ψ(c₂) ∨ ⋯ ∨ ψ(cₙ) for c₁, …, cₙ ∈ Dom(π). Since ϕ is non-branching, we can suppose, without loss of generality, that all ψ(cᵢ) are operator-free except ψ(c₁). By the initial remark, the remaining terms do not produce any branching.

Step 2: by simple inspection of all decomposition rules, one can see that if the topmost formula is non-branching, then all decomposed subformulæ are also non-branching.

Step 3: we observe from the decomposition rules that each node contains a subset of all the subformulæ that can be constructed from ϕ. For a formula where no free variable occurs, the total number of possible subformulæ is linear in the number of operators inside ϕ. Each occurrence of the existential quantifier ∃_π x : ψ(x) expands into a disjunction of formulæ ψ(c₁) ∨ ψ(c₂) ∨ ⋯ ∨ ψ(cₙ) for c₁, …, cₙ ∈ Dom(π), of which only one does not resolve immediately into ⊤ or ⊥. Finally, there is at most one occurrence of the universal quantifier ∀_π; hence the number of possible subformulæ is multiplied by at most |Dom(π)|, which is assumed to be a finite constant.
As an example, the formula G ¬(p → F q) is a non-branching formula; however, the formula G (p ∧ (X q ∨ F s)) is not a non-branching formula. Indeed, Figure 4 shows that some of its proofs produce more than one branch.
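The syntactic conditions of Definition 1 lend themselves to a simple recursive check. The sketch below covers rules 1 to 3 only (classical LTL, no quantifiers) and assumes the formula has already been put in negation normal form, so that negations apply to atoms only; the tuple encoding and the function names are hypothetical and are not taken from the paper's implementation.

```python
# Formulas are nested tuples (hypothetical encoding), in negation normal form:
# ("atom", "p"), ("not", atom), ("and", l, r), ("or", l, r),
# ("X", f), ("G", f), ("F", f), ("U", l, r), ("V", l, r).
TEMPORAL = {"X", "G", "F", "U", "V"}

def operands(f):
    """Immediate subformulas of f (atoms have none)."""
    return [g for g in f[1:] if isinstance(g, tuple)]

def operator_free(f):
    """True if f contains no temporal operator."""
    return f[0] not in TEMPORAL and all(operator_free(g) for g in operands(f))

def non_branching(f):
    """Check rules 1-3 of Definition 1 on every subformula of f."""
    op, args = f[0], operands(f)
    if op == "or" and not any(operator_free(g) for g in args):
        return False          # rule 1: one disjunct must be operator-free
    if op in ("F", "U", "V") and not all(operator_free(g) for g in args):
        return False          # rules 2 and 3: operands must be operator-free
    return all(non_branching(g) for g in args)

p, q, s = ("atom", "p"), ("atom", "q"), ("atom", "s")
# G ¬(p → F q) in negation normal form is G (p ∧ G ¬q)
first = ("G", ("and", p, ("G", ("not", q))))
# G (p ∧ (X q ∨ F s)): both disjuncts contain a temporal operator
second = ("G", ("and", p, ("or", ("X", q), ("F", s))))
```

On these two formulæ, non_branching(first) holds while non_branching(second) does not, in agreement with the example above.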
4.3
Relative Strength of Non-Branching LTL-FO+
The non-branching condition is very strong. It forces any derivation in a trace to produce at most one witness, regardless of the total size of the derivation tree. However, we can now prove that cooperative runtime monitoring is achieved in this setting.

Theorem 4. The non-branching fragment of LTL-FO+ can be used for cooperative runtime monitoring.
Proof. By a previous remark, since a witness is itself polynomial in the size of the original formula, polynomiality of the proof checking algorithm is guaranteed. Hence the non-branching fragment of LTL-FO+ is in NP, and satisfies Requirement 1. By a construction identical to Theorem 1, we see that a formula of the form ∃_{x₁} x₁ ∃_{x₂} x₂ … ∃_{xₙ} xₙ ϕ, with ϕ any operator-free formula, is in the non-branching fragment of LTL-FO+. However, monitoring this formula amounts to solving a Boolean satisfiability problem, which is itself NP-complete. Hence computational savings are indeed achieved between the (polynomial) checking of the proof and the (NP-complete) computation of that proof.

One should first realize that this fragment indeed restricts the expressiveness of interface contracts. For example, the assertion “eventually, some property p will hold forever” translates into the LTL formula F G p, which lies outside the non-branching fragment. In general, any specification requiring the monitor to “pick” a future state and evaluate some non-trivial condition on it will produce branching. Yet, it should be noted that all four properties for the Amazon E-Commerce service are non-branching LTL-FO+ formulæ. Moreover, as strong as such a condition may be, it should be contrasted with other fragments of LTL amenable to cooperative runtime monitoring.

4.3.1
Simple Subset of PSL
The Property Specification Language (PSL) is a formalism that has been developed from IBM’s Sugar language and made into an IEEE Standard [1]. It can be used to complement other specification languages such as VHDL with a “temporal layer” based on LTL. PSL is a rich language that extends LTL with a limited form of quantification, in addition to the use of regular expressions to specify traces of events. However, the richness of this language makes it prone to the development of complex and confusing expressions.

The simple subset of PSL is a fragment restricting the form of possible expressions [1, Section 4.4.4]. Intuitively, the simple subset imposes a linear flow of time in the evaluation of a temporal property. Therefore, if one needs to evaluate a subformula at some time T, then the value of anything at the right of this subformula need not be known before time T. It has been advocated that the properties of a system that need to be checked at runtime be in the simple subset of PSL. To achieve this result, restrictions are imposed on the use of Boolean and temporal operators, as formalized in Table 2. In this table, “Boolean” is equivalent to operator-free, and the sequence type is equivalent to a construct of the form ϕ₀ ∧ X (ϕ₁ ∧ X (⋯ ∧ ϕₙ)) where all the ϕᵢ are operator-free. The expression p before q is equivalent to the LTL expression ¬q W p.

The resemblance between non-branching LTL and the simple subset of PSL is striking. One can see that some of the conditions are identical to the non-branching fragment of LTL defined above; however, a few are even stronger. For example, the non-branching formula G ¬(p → X q) translates in PSL as never (p → next q), which is not in the simple subset of PSL.

PSL Operator        Restriction
!                   Operand must be Boolean
never               Operand must be Boolean or sequence
eventually!         Operand must be Boolean or sequence
∨                   One operand must be Boolean
→                   Left-hand side must be Boolean
until, until!       Right-hand side must be Boolean
until_, until!_     Both operands must be Boolean
before              Both operands must be Boolean
next                Operand must be Boolean

Table 2: The simple subset of PSL, as defined in [12]
5
Experimental Results
To demonstrate the interest of cooperative runtime monitoring, a proof-of-concept implementation of this principle has been developed as a pair of Java applications, and tested on the example described in Section 2.1. The goal of the experiment is to determine, under controlled and comparable conditions, the time required to verify a message vs. the time required to verify only its proof, using messages and properties representative of real-life use. This section summarizes initial findings.
5.1
Implementation
We implemented two components, intended to operate on the client and server side respectively, as shown in Figure 2. The first component is a runtime prover (RP) which, given an LTL-FO+ specification, updates its state at each message sent and received, computes a proof for each message sent, and attaches it to the message. This is the implementation of function γ in Figure 2. The prover is based on BeepBeep [25], a lightweight runtime monitor that integrates seamlessly into any Ajax web application and intercepts its incoming and outgoing SOAP messages. Each message is first routed through the runtime monitor, which makes sure that it is a valid continuation of the current message trace. If this is the case, a proof of this fact is generated, and added to the SOAP header in a Proof element encoded in XML format, as is shown in Figure 5. The CRM:proof tag is a custom element that will be ignored by any application other than our CRM tool. It encodes each branch of an LTL proof as a series of op tags, giving the chain of decomposition rules used to obtain the end state, represented as an LTL formula. If the input formula is non-branching, by Theorem 3 we are sure that only one branch element will be present in the header.
[Figure 5 listing; the XML markup did not survive extraction. Remaining text content, in order: G X /message/operation/CartAdd X (!(/message/operation/CartModify)) ... CartClear 1234 8F6C59AF00 859285]

Figure 5: A SOAP message, to which an “XML-ized” proof that it is a valid continuation of the current trace has been appended in the SOAP header.

The second component is a runtime checker (RC) which, given an LTL-FO+ specification, a message and a proof produced by the RP, checks that the proof is valid. It first makes sure that any atomic op element (a simple XPath expression) can actually be found in the SOAP-Body. This first step ensures that the proof indeed applies to the current message, and is not spoofed. Once this is done, the checker takes its current internal state, and checks that the sequence of decompositions given in the Proof element can indeed be applied to this state. It simply starts from the current state and updates it according to the sequence of derivations given in the proof. If the whole chain of decompositions can be carried to the end, the proof is deemed valid; the message is stripped of its proof, relayed to the application, and the contents of the state element are adopted as the new internal state. This corresponds to the implementation of functions µ and ν in Figure 2.

To simplify the implementation, the runtime prover and checker are actually the same application, which is instructed to run either as the client or the server side. It consists of roughly 50 kB of compiled Java code.
5.2
Results and Discussion
To test the approach, we used a sample Ajax application for the Amazon E-Commerce Service, and generated traces of request-response messages between the application and the web service.² The specification to be monitored consisted of the conjunction of all LTL-FO+ properties described in Section 3.1. The messages were successively processed by the runtime prover, which appended a proof of their validity with respect to this specification, and the runtime checker, which read and checked the proof as described previously.

² The source code of the runtime monitor, including proof validation methods, can be found at http://beepbeep.sourceforge.net.

Property      Classical (ms)   Cooperative (ms)   Reduction (%)
1             1.09             0.712              44%
2             1.09             0.712              44%
3             1.03             0.719              39%
4             0.946            0.737              29%
Cumulative    4.156            2.880              31%

Table 3: Average validation time per message for each of the four properties, both for the classical and cooperative runtime monitoring approaches, and relative overhead reduction.

Since we (obviously) did not have disk access to the Amazon server itself, we could not install the runtime checker on the server side; we instead simulated processing on the server side by running the runtime checker on the same computer as the client. To simplify the analysis, we excluded the serialization into XML that the prover would normally do, which would then be cancelled by a de-serialization of the XML by the runtime checker. The proof was instead transferred from prover to checker in its symbolic representation as a Java object.

We randomly generated 100 traces of 100 requests and responses from the Amazon ECS, including the various cart manipulation operations shown in our case study. Each trace was then run through both the classical monitor and the cooperative runtime monitor, for each of the four LTL-FO+ properties we described. We compared the amount of computing saved by the server through cooperative runtime monitoring vs. classical, server-side monitoring. The average overhead reduction for each property is given in Table 3.

From these results, one can draw the following conclusions. First, the server is assured of the client’s compliance to the contract. This is a direct consequence of the use of runtime monitoring, be it cooperative or not. Second, for the server, this compliance is enforced for a fraction of the computing cost of verifying it by itself. This fraction averages 69% for the traces we generated. Not surprisingly, the savings are greater for simpler properties (especially 1 and 2), which indicates that there is room for improvement in the construction of the proof when first-order quantification is used.
However, for all properties, the use of cooperative runtime monitoring is advantageous from the server’s point of view, even in its current version. The distribution of overhead reduction is plotted in Figure 6. It shows that there is a fair deal of variability in the savings actually obtained for each message in a trace. With Properties 1 and 2, for instance, nearly half of all messages exhibit an overhead reduction of 50% or more compared to classical monitoring.

[Figure 6: plot of the number of messages (vertical axis, 0 to 3000) against relative overhead reduction (horizontal axis, 0 to 1), with one series for each of Properties 1 to 4.]

Figure 6: Distribution of relative overhead savings of the cooperative runtime monitor with respect to the classical monitor.

We therefore examined whether the savings were affected by the length of the trace, that is, whether the cooperative runtime monitor “wears out” as the message exchange progresses. This could be caused by the accumulation of data about past messages (e.g. item IDs) that need to be carried in proofs of later messages. For each of the 100 traces, we plotted the evaluation time of each message relative to its position in the trace. The results for properties 1 and 4 are shown in Figure 7. As one can see, there is no increasing trend as the trace progresses, which confirms that, for the given monitor implementation, proof checking maintains its advantage over classical runtime monitoring as the message exchange progresses.

Finally, one can observe that the verification time for each message is one to two orders of magnitude lower than the typical network latency from major service providers (about 40 ms for North-American round-trips³). It is hence fair to conclude that proof validation does not impose an unreasonable delay when performed at runtime on web services by monitoring each incoming and outgoing message; the network accounts for much greater variations in responsiveness.
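As a quick arithmetic check, the cumulative row of Table 3 and the 69% fraction quoted in the discussion can be recomputed from the per-property averages. The short sketch below only uses the figures printed in the table; the variable names are ours.

```python
# Average validation time per message (ms), copied from Table 3.
classical = [1.09, 1.09, 1.03, 0.946]
cooperative = [0.712, 0.712, 0.719, 0.737]

total_classical = sum(classical)          # cumulative classical time
total_cooperative = sum(cooperative)      # cumulative cooperative time

# Share of the classical cost still paid by the server, and its complement.
fraction = total_cooperative / total_classical
reduction = 1 - fraction

print(f"cumulative: {total_classical:.3f} ms vs {total_cooperative:.3f} ms")
print(f"fraction of classical cost: {fraction:.1%}, reduction: {reduction:.1%}")
```

The totals agree with the cumulative row (4.156 ms and 2.880 ms), and the ratio rounds to the 69% fraction and 31% reduction reported above.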
6
A Multi-Dimensional Problem
As we have seen, the key point of cooperative runtime monitoring is to devise a way to verify in tractable (polynomial) time a given proof. One possibility, described in the previous sections, is to limit the number of branches produced by the evaluation of a formula on a given message: this leads to the “non-branching” fragment of LTL-FO+ . This, however, is far from being the only possible solution.
6.1
Modulation of Requirements
As Figure 8 shows, the task of monitoring message sequences is bounded by two extreme approaches. Server-side monitoring does not rely on trusting the client, and hence preserves complete guarantees on the result; this is done at the price of doing 100% of the work. On the contrary, client-side monitoring offloads the server of all the monitoring work, yet exposes it to total manipulation of the message sequence by the client, since no double-check is done.

[Figure 7: two plots of relative overhead reduction (vertical axis, −1 to 1) against message position in trace (horizontal axis, 0 to 100).]

Figure 7: Relative overhead reduction with cooperative runtime monitoring, plotted against the position of each message in the trace. The graphs shown are computed for Property 1 (top) and Property 4 (bottom).

[Figure 8: plot of guarantees (vertical axis, from None to Complete) against computational savings (horizontal axis, from 0 to 100%), with labels for server-side monitoring, cooperative monitoring and client-side monitoring.]

Figure 8: Cooperative runtime monitoring lies on the continuum between client-side and server-side monitoring.

One can therefore express monitoring as a function relating server-side guarantees to server-side computational load. Intuitively, an unknown line describes this trade-off, with server-side and client-side monitoring as its two endpoints. In this setting, cooperative runtime monitoring as described in this paper can be seen as the search for a middle ground solution where computational savings are made without sacrificing any guarantees. Clearly, this is only possible if the trade-off function exhibits a horizontal plateau at the top left of the graph; otherwise, even the smallest savings would cost some guarantees. Fortunately, we have seen in the previous sections that there is a way for the server to fully trust the computation made by a client by doing a fraction of the work required to perform complete monitoring. Hence the trade-off function resembles the line drawn in Figure 8, with the method described in this paper standing somewhere along the horizontal line. The exact length of that plateau and the shape of the decreasing segment of the function are not known.

As it turns out, the trade-off function is multi-dimensional: many parameters of this problem can be modulated and provide variants of cooperative runtime monitoring for many kinds of requirements. The present section surveys two such alternate methods.

³ http://www.verizonbusiness.com/about/network/latency/
6.2
Conservative Approximations
Another strategy consists in replacing the original specification by another one that has the property of yielding simpler proofs. Consider for example the formula F (a → X b), which states that eventually, a message satisfying a will be immediately followed by a message satisfying b. This formula is not part of non-branching LTL, as the argument of the F operator is not operator-free. However, one can instead settle on enforcing the property G (a → X b); this time, every message satisfying a must be followed by a message satisfying b. Unlike the initial specification, this one is part of the non-branching fragment of LTL, and hence can be monitored cooperatively. In addition, this property is a “stronger” version of the original, in that any trace that satisfies it also satisfies the original. Therefore, using it for monitoring will not miss any violation of the original specification, and it is considered a conservative approximation of the actual property to check.

However, while the previous example was straightforward to devise, we must now define a systematic way of producing conservative approximations of a formula. We call a subformula of some formula ϕ an operand of its top operator. For example, the formula ϕ defined as (p ∧ F q) → G r has → as its top operator; its operands are p ∧ F q and G r, and these are the two subformulæ of ϕ. The definition can be applied recursively; hence r is the only subformula of G r, and so on. One can see how in Figure 4, the decomposition of an LTL-FO+ formula produces nodes that contain progressively deeper (and smaller) subformulæ of the original expression.

We can then define the polarity of a formula ϕ as a function of the number of nested negations under which the formula stands. A formula ϕ can be positive (noted π(ϕ) = +) or negative (π(ϕ) = −); by convention, we define −(+) = − and −(−) = +.

Definition 2. Let ϕ be an LTL formula, and π(ϕ) be its polarity.
The polarity of its subformulæ is uniquely defined by the following set of equalities, depending on the structure of ϕ:
• If ϕ is of the form a, for a some atomic proposition, then π(a) = π(ϕ)
• If ϕ is of the form ¬ψ for ψ an arbitrary formula, then π(ψ) = −π(ϕ)
• If ϕ is of the form ψ ∧ ψ′, for ψ and ψ′ two arbitrary formulæ, then π(ψ) = π(ψ′) = π(ϕ)
• If ϕ is of the form ψ ∨ ψ′, for ψ and ψ′ two arbitrary formulæ, then π(ψ) = π(ψ′) = π(ϕ)
• If ϕ is of the form ψ → ψ′, for ψ and ψ′ two arbitrary formulæ, then −π(ψ) = π(ψ′) = π(ϕ)
• If ϕ is of the form G ψ for ψ an arbitrary formula, then π(ψ) = π(ϕ)
• If ϕ is of the form X ψ for ψ an arbitrary formula, then π(ψ) = π(ϕ)
• If ϕ is of the form F ψ for ψ an arbitrary formula, then π(ψ) = π(ϕ)
• If ϕ is of the form ψ U ψ′, for ψ and ψ′ two arbitrary formulæ, then π(ψ) = π(ψ′) = π(ϕ)
• If ϕ is of the form ψ V ψ′, for ψ and ψ′ two arbitrary formulæ, then π(ψ) = π(ψ′) = π(ϕ)

This definition gives a procedure to calculate the polarity of each subformula; π(ϕ) is computed “top-down”: once we know the polarity of ϕ, we can compute the polarity of its immediate subformulæ, and so on recursively. A top-level LTL formula is always taken as positive.

Take for example the formula (p ∧ G q) → G r. This top-level formula is given positive polarity. Its two immediate subformulæ are p ∧ G q on the left, and G r on the right. By Definition 2 (fifth case), the polarity of p ∧ G q is negative, and the polarity of G r is positive. Descending down the left operand, the top-level operator now becomes ∧. By Definition 2 (third case), since p ∧ G q is negative, then so are p and G q. One can see how each remaining subformula can be given a polarity in the same fashion.

Conservative approximations can then be obtained by substituting parts of the original formula in the following way.

Theorem 5. Let ϕ be an LTL formula and ψ be one of its subformulæ. Let ϕ′ be the LTL formula obtained by replacing ψ in ϕ by some other formula ψ′. We have that ϕ′ is a conservative approximation of ϕ if either of the following conditions applies:
1. π(ψ) = + and ψ′ → ψ
2. π(ψ) = − and ψ → ψ′

This theorem is a reformulation of a result demonstrated by [39]. Using this result, it is possible to build a conservative approximation of a formula ϕ by picking any subformula ψ, and, if its polarity is positive, replacing it with a stronger subformula ψ′; on the contrary, if its polarity is negative, a weaker subformula is a suitable replacement. For example, let ϕ be the formula (p ∧ G q) → G r, and let ψ be its subformula G q.
Finally let ψ′ be the formula F q; remark that G q implies F q no matter what q is. Since we know that the polarity of G q in ϕ is negative, by Theorem 5 we know that replacing G q by F q in ϕ will produce a conservative approximation of ϕ; that is, whenever (p ∧ G q) → G r is false, then so is the resulting formula (p ∧ F q) → G r. By choosing appropriate substitutions for subformulæ, one can transform an original specification into a conservative approximation that conforms to the non-branching conditions. It might, however, introduce false positives, that is, violations of the approximated formula that are not violations of the actual one.
There are multiple ways of dealing with these spurious errors. The first one is to accept the stronger specification as the actual specification, in which case these violations become genuine. A more refined way of proceeding consists of having the client and the server be aware of the original specification, while still using the stronger one for cooperative runtime monitoring. In the event that the client sends a message that creates a false violation, it could mark this message as an exception and ask the server to switch from cooperative to full runtime monitoring. This indicates that the message exchange has reached a corner case where the server is required to compute the full derivation tree for this message. In any case, if false positives are not regarded as detrimental to the quality of the monitoring process, the use of conservative approximations can significantly extend the reach of tractable specifications.
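The top-down polarity computation of Definition 2 can be sketched as a short recursive function. The tuple encoding of formulæ and the names flip and polarities are assumptions made for this illustration; the sketch treats negation and the antecedent of an implication as the only polarity-reversing positions, and every other operator (including F) as polarity-preserving.

```python
# Formulas are nested tuples (hypothetical encoding): ("atom", "p"),
# ("not", f), ("and", l, r), ("or", l, r), ("implies", l, r),
# ("G", f), ("X", f), ("F", f), ("U", l, r), ("V", l, r).

def flip(sign):
    """The convention -(+) = - and -(-) = +."""
    return "-" if sign == "+" else "+"

def polarities(f, sign="+", out=None):
    """Return a list of (subformula, polarity) pairs, computed top-down."""
    if out is None:
        out = []
    out.append((f, sign))
    op = f[0]
    if op == "not":
        polarities(f[1], flip(sign), out)
    elif op == "implies":
        polarities(f[1], flip(sign), out)   # the antecedent changes polarity
        polarities(f[2], sign, out)         # the consequent keeps it
    elif op != "atom":
        for g in f[1:]:                     # all other operators preserve it
            polarities(g, sign, out)
    return out

p, q, r = ("atom", "p"), ("atom", "q"), ("atom", "r")
phi = ("implies", ("and", p, ("G", q)), ("G", r))   # (p ∧ G q) → G r
table = dict(polarities(phi))
```

For the formula (p ∧ G q) → G r, the resulting table gives negative polarity to p ∧ G q, p and G q, and positive polarity to G r, as in the worked example of this section; a subformula such as G q, being negative, is therefore a candidate for replacement by a weaker formula under Theorem 5.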
6.3
Branching Restrictions
As we have seen, the non-branching condition for LTL is a strong one, since it imposes exactly one branch leading to a non-⊥ node at each message. This entails that there exists at most one possible continuation of the exchange, logically speaking. This continuation can be materialized by more than one trace of messages, but each such trace is evaluated against a single alternative. 6.3.1
Polynomial Branching
Actually, the condition for cooperative runtime monitoring is not so rigid: it simply requires that the number of nodes kept at all times by the monitor be a polynomial function in the length of the formula. Non-branching LTL (or “1-witness” LTL) is the extreme case where this function is the constant 1. But then consider the following formula: G (a → (X b ∨ X c)). It states that whenever a message satisfying a is received, the following message must either satisfy b or c. While this formula is not in the non-branching fragment of LTL, it is straightforward to see that the monitor never has to deal with more than two alternatives at any time. The constant 1 has been replaced with the constant 2, and therefore polynomial verification of a witness is still possible with this formula.

One must realize that the polynomial condition relates to the total number of nodes kept in memory by the monitor, and not to the total number of alternatives produced at each new message. A function that spawns two new alternatives from each existing one yields an exponential number of candidate solutions. For this reason, providing general syntactical conditions for the “2-witness” fragment of LTL, or more generally the “p(n)-witness” fragment (with p(n) a polynomial in the length of the formula) is much more involved than for the simpler, single-witness version we developed in this paper. The problem is closely related to the search for upper bounds for classes of deterministic Büchi automata, and is considered as future work.
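To make the “two alternatives” observation concrete, the sketch below hand-codes a monitor for the single formula G (a → (X b ∨ X c)) and records how many alternatives it holds after each message. It is not a general LTL monitor: the representation of an alternative as the name of the proposition required of the next message (or None when nothing is required), and the function name step, are assumptions made for this illustration.

```python
def step(alternatives, message):
    """One monitoring step for G (a -> (X b or X c)).

    `alternatives` is the set of obligations bearing on `message`: each is
    either None (no obligation) or the name of a proposition that must hold.
    Returns the set of obligations for the next message (empty on violation).
    """
    # Keep only the alternatives whose obligation the message satisfies.
    alive = [o for o in alternatives if o is None or message.get(o, False)]
    if not alive:
        return set()          # every branch is a dead end: contract violated
    # All surviving branches have the same successors, so they merge.
    return {"b", "c"} if message.get("a", False) else {None}

trace = [{"a": True}, {"a": True, "b": True}, {"c": True}, {"a": True}, {}]
state = {None}                # nothing is required of the first message
sizes = []
for message in trace:
    state = step(state, message)
    sizes.append(len(state))
```

On this hypothetical trace the monitor never holds more than two alternatives, and the last message, which follows a message satisfying a but satisfies neither b nor c, leaves it with none.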
6.3.2
Branch Deletion
Another, more drastic solution is to deliberately eliminate witnesses when computing a proof. For example, a client producing the two witnesses shown in Figure 4 could choose which one to send to the server and discard the other, in keeping with the single-witness restriction of non-branching LTL-FO+. We recall that each witness represents a set of conditions for future traces to satisfy the specification. Therefore, removing one witness also suppresses one possible way in which future messages could be valid continuations of the current exchange. In the formula G (a → (X b ∨ X c)), sending a message satisfying a normally produces one witness predicting that the next message must satisfy b, and a second one announcing it must satisfy c. By selecting the witness to send to the server, the client commits itself to produce a message sequence taken from a smaller set of possibilities than the contract actually permits.
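The effect of this commitment can be sketched on the same formula. The code below is an illustrative assumption rather than the protocol implemented in this paper: messages are modelled as sets of atomic propositions, a witness is reduced to the single proposition promised for the next message, and `client_commit`, `witness_is_valid` and `run` are hypothetical names.

```python
from typing import FrozenSet, Iterable, Optional

Message = FrozenSet[str]


def client_commit(message: Message, prefer: str) -> Optional[str]:
    """Client side: keep one witness (the preferred one), discard the other."""
    if "a" not in message:
        return None  # no obligation on the next message
    return prefer


def witness_is_valid(message: Message, commitment: Optional[str]) -> bool:
    """Server side: check that the commitment is a legitimate witness."""
    if "a" in message:
        return commitment in ("b", "c")
    return commitment is None


def run(trace: Iterable[Iterable[str]], prefer: str) -> bool:
    """Return True if every message is accepted by the server."""
    pending: Optional[str] = None
    for raw in trace:
        message = frozenset(raw)
        # The server only tracks the single alternative the client kept.
        if pending is not None and pending not in message:
            return False
        commitment = client_commit(message, prefer)
        if not witness_is_valid(message, commitment):
            return False
        pending = commitment
    return True
```

For instance, the trace a, c satisfies the contract, yet `run([{"a"}, {"c"}], prefer="b")` rejects it, because the client discarded the witness that allowed c; with `prefer="c"` the same trace is accepted.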
7
Related Work
A distinction must first be made between the cooperative runtime monitoring introduced in this paper and runtime monitoring of distributed systems such as CORBA [33]. The latter requires a centralized observer that is external to the agents involved in a communication and that receives all the information monitored from all parts of the application. In the same way, the notion of cooperative data management introduced by [45] also requires a trusted third party, which is used as a witness for a transaction between two peers. This trusted party records the sequences of events exchanged between the peers, which can then refer to it to settle disagreements. In contrast, in cooperative runtime monitoring, no external third party is required, nor is any trust assumed from any external agent. Moreover, the approach suggested in both papers records the messages, but is “neutral”: it does not actively interfere in the actual message exchange, even in the case of a violation, since no particular specification is given to it.
Zero-knowledge proofs involve two parties where one is able to prove to the other the possession of some secret information without revealing it [20]. It is possible, using a zero-knowledge proof system, to force a user to act according to some protocol; the proof it provides can only be valid (and verified by an external agent) if its behaviour is compliant, yet does not require the user to reveal its secret [19]. However, while zero-knowledge proofs aim at the protection of some secret information, cooperative runtime monitoring is motivated by computational savings. Moreover, zero-knowledge proofs are based on probability; the certainty in the proof is an increasing function that depends on the number of times the verification process is repeated. In CRM, protocol compliance is ensured when the proof is deemed valid, and this verification occurs only once for every message.
Cooperative runtime monitoring draws natural parallels with proof-carrying code (PCC) [36]. PCC suggests that compiled programs be accompanied by a “proof” of their correctness that an execution environment could then easily check before allowing them to run. A malicious or tampered-with program could then be detected. For example, a compiler can produce Java code, accompanied by a proof that the compiled program is memory safe. While this idea has been used to statically prove that a program follows a set of requirements (mostly memory safety) beforehand, our approach rather provides runtime proofs that individual messages produced by a program follow some contract. The program itself is not checked, signed or validated in any way. As far as we know, the present work is the first application of this idea to individual messages produced at runtime by a program.
CRM is actually closer to the classical idea of a “token” or “hash value” that is used to ensure the integrity of a message, which has become common practice in the field of computer security [11]. However, while traditional hashes ensure the integrity of a particular message individually, the token produced by CRM provides integrity assurance relative to some contract. A message can still be tampered with, but not in a way that would constitute a violation of the particular contract being monitored. In addition, this contract involves a sequence of messages, defined by some LTL-FO+ formula, and their particular position in that sequence; traditional hashing ensures the integrity of messages individually.
In the GridCop project, a potentially fraudulent host periodically sends beacons reporting to the submitter on the proper advancement of a computation task submitted by a client [49]. A specific kind of beacon called the “R-beacon” sends the input and output values for a specific region of the computation made by the host, which the submitter can then use to re-compute the output values on its side. However, in this case, the verification step mirrors the computation done on the host, and no computational savings are made, in contrast with cooperative runtime monitoring.
The parallelization of propositional LTL has already been studied by [32] as an alternative way of sharing the verification burden between multiple machines. However, the approach described in the paper assumes that the path is completely known when the analysis starts; different machines take care of different parts of that trace and combine their results. Therefore, it cannot be applied directly to runtime monitoring.
The idea that checking a proof should be “easy” has been similarly argued in other domains; for example, [43] suggested that proofs of unsatisfiability for propositional logic formulæ (a co-NP-complete problem) should be checkable by an algorithm in LOGSPACE, a very low complexity class. In that context, the goal is not so much to save space, as to use an algorithm simple enough that it can be trusted, or proved correct by hand.
The conservative approximation proposed in Section 6.2 is related to the notion of abstraction used to simplify the representation of problems in hope of making them tractable for model checking. For example, [39] develops an automatic abstraction method for µ-calculus, while [3] use Boolean programs as a simplification of C code; more recently, abstraction has been developed for model checking of web services [41]. However, most abstraction techniques rely on simplifying the model (i.e. the representation of the system), rather than the formula, as we propose here.
Finally, non-branching LTL-FO+ can be seen as closely related to another subset of LTL-FO+ called the forward-only fragment. This fragment was studied as a suitable language for leveraging the capabilities of streaming XML query processors to perform runtime monitoring [26]. The exact similarities between these two logics have yet to be determined.
8
Conclusion
In this paper, we introduced the notion of cooperative runtime monitoring. We showed how this concept can be used to shift the computing load of contract compliance from the server to the client of an application, while still maintaining the same compliance guarantees for the server, and yet without any trust assumptions about the client. Our initial experimental results show that having the sender of a message compute a proof of compliance that can easily be checked by the receiver can indeed reduce the work required on the server side by a large fraction.
Cooperative runtime monitoring is a multi-dimensional problem, where a number of factors can be modulated. A server can decide to surrender part of its guarantees, or to accept lesser computational savings, or trade these two aspects for a richer specification language; all three of these factors are interdependent. This paper presented in more detail one particular fragment of Linear Temporal Logic with first-order quantifiers for which CRM is tractable. Incidentally, we have also shown how seemingly unrelated specification languages, such as the simple subset of PSL, actually become natural consequences of CRM constraints over Linear Temporal Logic.
Preliminary results look very promising, and suggest that the approach could become a fruitful alternative for enforcing sequential patterns in message-based interactions such as web services. In particular, cooperative runtime monitoring is amenable to a number of interesting applications that could be studied in future work. We mention a few of them by way of conclusion.
• Traceability. By storing the sequence of witnesses in addition to the sequence of messages, one obtains an “annotated log” of some message exchange. Hence a process that analyzes messages in real time can annotate the record with proofs, so that a re-analysis of the log a posteriori can be done in polynomial time.
• Security. Cooperative runtime monitoring lessens the risk of message spoofing and “replay” types of attacks in systems such as web services. Indeed, each peer keeps a copy of the current state of the message exchange, yet this state is never shared on the wire. An attacker trying to replay a captured message must therefore guess what this state is, so as to produce a correct witness. Moreover, since this inserted message is likely to create a difference in the state maintained by the legitimate client and the server, correct proofs of genuine messages might start to be rejected, thus providing a way for the actual client to detect the intrusion.
References
[1] IEEE standard for property specification language (PSL). Technical Report IEEE Std 1850-2005, IEEE Computer Society, 2005.
[2] Tony Andrews, Francisco Curbera, Hitesh Dholakia, Yaron Goland, Johannes Klein, Frank Leymann, Kevin Liu, Dieter Roller, Doug Smith, Satish Thatte, Ivana Trickovic, and Sanjiva Weerawarana. Business process execution language for web services, version 1.1, 2003.
[3] Thomas Ball, Andreas Podelski, and Sriram K. Rajamani. Boolean and cartesian abstraction for model checking C programs. In Tiziana Margaria and Wang Yi, editors, TACAS, volume 2031 of Lecture Notes in Computer Science, pages 268–283. Springer, 2001.
[4] Fabio Barbon, Paolo Traverso, Marco Pistore, and Michele Trainotti. Runtime monitoring of instances and classes of web service compositions. In ICWS, pages 63–71. IEEE Computer Society, 2006.
[5] Howard Barringer, David Rydeheard, and Klaus Havelund. Rule systems for run-time monitoring: From Eagle to RuleR. Journal of Logic and Computation, 2008. Preprint.
[6] Andreas Bauer, Martin Leucker, and Christian Schallhart. The good, the bad, and the ugly, but how ugly is ugly? In Oleg Sokolsky and Serdar Tasiran, editors, RV, volume 4839 of Lecture Notes in Computer Science, pages 126–138. Springer, 2007.
[7] Mario Bravetti, Manuel Núñez, and Gianluigi Zavattaro, editors. Web Services and Formal Methods, Third International Workshop, WS-FM 2006 Vienna, Austria, September 8-9, 2006, Proceedings, volume 4184 of Lecture Notes in Computer Science. Springer, 2006.
[8] Edmund M. Clarke, Orna Grumberg, and Doron A. Peled. Model Checking. MIT Press, 2000.
[9] Gero Decker, Johannes Maria Zaha, and Marlon Dumas. Execution semantics for service choreographies. In Bravetti et al. [7], pages 163–177.
[10] Alin Deutsch, Liying Sui, Victor Vianu, and Dayou Zhou. Verification of communicating data-driven web services. In Stijn Vansummeren, editor, PODS, pages 90–99. ACM, 2006.
[11] Whitfield Diffie and Martin Hellman. New directions in cryptography. IEEE Trans. on Information Theory, 22(6):644–654, 1976.
[12] Cindy Eisner and Dana Fisman. A Practical Introduction to PSL. Springer, 2006.
[13] Cindy Eisner, Dana Fisman, John Havlicek, Yoad Lustig, Anthony McIsaac, and David Van Campenhout. Reasoning with temporal logic on truncated paths. In Warren A. Hunt Jr. and Fabio Somenzi, editors, CAV, volume 2725 of Lecture Notes in Computer Science, pages 27–39. Springer, 2003.
[14] Yuan Gan, Marsha Chechik, Shiva Nejati, Jon Bennett, Bill O’Farrell, and Julie Waterhouse. Runtime monitoring of web service conversations. In CASCON ’07: Proceedings of the 2007 conference of the center for advanced studies on Collaborative research, pages 42–57, New York, NY, USA, 2007. ACM.
[15] Michael R. Garey and David S. Johnson. Computers and intractability, a guide to the theory of NP-completeness. W. H. Freeman, 1979.
[16] Cagdas E. Gerede and Jianwen Su. Specification and verification of artifact behaviors in business process models. In Bernd J. Krämer, Kwei-Jay Lin, and Priya Narasimhan, editors, ICSOC, volume 4749 of Lecture Notes in Computer Science, pages 181–192. Springer, 2007.
[17] Rob Gerth, Doron Peled, Moshe Y. Vardi, and Pierre Wolper. Simple on-the-fly automatic verification of linear temporal logic. In Piotr Dembinski and Marek Sredniawa, editors, PSTV, volume 38 of IFIP Conference Proceedings, pages 3–18. Chapman & Hall, 1995.
[18] Carlo Ghezzi and Sam Guinea. Run-Time Monitoring in Service-Oriented Architectures, pages 237–264. Springer, 2007.
[19] Oded Goldreich, Silvio Micali, and Avi Wigderson. How to play any mental game or a completeness theorem for protocols with honest majority. In Alfred V. Aho, editor, STOC, pages 218–229. ACM, 1987.
[20] Shafi Goldwasser, Silvio Micali, and Charles Rackoff. The knowledge complexity of interactive proof systems. SIAM J. Comput., 18(1):186–208, 1989.
[21] Alex Groce, Klaus Havelund, and Margaret H. Smith. From scripts to specifications: the evolution of a flight software testing effort. In Jeff Kramer, Judith Bishop, Premkumar T. Devanbu, and Sebastián Uchitel, editors, ICSE (2), pages 129–138. ACM, 2010.
[22] Sylvain Hallé, Tevfik Bultan, Graham Hughes, Muath Alkhalaf, and Roger Villemaire. Runtime verification of web service interface contracts. IEEE Computer, 43(3):59–66, 2010.
[23] Sylvain Hallé, Graham Hughes, Tevfik Bultan, and Muath Alkhalaf. Generating interface grammars from WSDL for automated verification of web services. In Luciano Baresi, Chi-Hung Chi, and Jun Suzuki, editors, ICSOC-ServiceWave, volume 5900 of Lecture Notes in Computer Science, pages 516–530, 2009.
[24] Sylvain Hallé and Roger Villemaire. XML methods for validation of temporal properties on message traces with data. In Robert Meersman and Zahir Tari, editors, CoopIS, volume 5331 of Lecture Notes in Computer Science, pages 337–353. Springer, 2008.
[25] Sylvain Hallé and Roger Villemaire. Browser-based enforcement of interface contracts in web applications with BeepBeep. In Ahmed Bouajjani and Oded Maler, editors, CAV, volume 5643 of Lecture Notes in Computer Science, pages 648–653. Springer, 2009.
[26] Sylvain Hallé and Roger Villemaire. Runtime monitoring of web service choreographies using streaming XML. In Sung Y. Shin and Sascha Ossowski, editors, SAC, pages 2118–2125. ACM, 2009.
[27] Sylvain Hallé and Roger Villemaire. Runtime enforcement of web service message contracts with data. IEEE Trans. on Services Computing, 5(2):192–206, February 2012.
[28] Gerard J. Holzmann. The SPIN Model Checker: Primer and Reference Manual. Addison-Wesley Professional, 2003.
[29] Graham Hughes, Tevfik Bultan, and Muath Alkhalaf. Client and server verification for web services using interface grammars. In Tevfik Bultan and Tao Xie, editors, TAV-WEB, pages 40–46. ACM, 2008.
[30] Galen Hunt, James Larus, Martín Abadi, Mark Aiken, Manuel Fähndrich, Paul Barham, Chris Hawblitzel, Orion Hodson, Steven Levi, Bjarne Steensgaard, Nick Murphy, David Tarditi, Ted Wobber, and Brian Zill. An overview of the Singularity project. Technical Report MSR-TR-2005-135, Microsoft Research, 2005.
[31] Nickolas Kavantzas, David Burdett, Gregory Ritzinger, Tony Fletcher, Yves Lafon, and Charlton Barreto. Web services choreography description language version 1.0, 2005.
[32] Lars Kuhtz and Bernd Finkbeiner. LTL path checking is efficiently parallelizable. In Susanne Albers, Alberto Marchetti-Spaccamela, Yossi Matias, Sotiris E. Nikoletseas, and Wolfgang Thomas, editors, ICALP (2), volume 5556 of Lecture Notes in Computer Science, pages 235–246. Springer, 2009.
[33] Xavier Logean, Falk Dietrich, Hayk Karamyan, and Shawn Koppenhöfer. Run-time monitoring of distributed applications. In Middleware ’98: Proceedings of the IFIP International Conference on Distributed Systems Platforms and Open Distributed Processing, pages 459–474, London, UK, 1998. Springer-Verlag.
[34] Khaled Mahbub and George Spanoudakis. Run-time monitoring of requirements for systems composed of web-services: Initial implementation and evaluation experience. In ICWS, pages 257–265. IEEE Computer Society, 2005.
[35] Khaled Mahbub and George Spanoudakis. Monitoring WS-Agreements: An Event Calculus-Based Approach, chapter 10, pages 265–306. Springer, 2007.
[36] George C. Necula. Proof-carrying code. In POPL, pages 106–119, 1997.
[37] Object Management Group. Business process modeling notation, version 1.1, 2008.
[38] Stefano Di Paola and Giorgio Fedon. Subverting Ajax, 2006. http://events.ccc.de/congress/2006/Fahrplan/events/1602.en.html.
[39] Abelardo Pardo and Gary D. Hachtel. Automatic abstraction techniques for propositional µ-calculus model checking. In Orna Grumberg, editor, CAV, volume 1254 of Lecture Notes in Computer Science, pages 12–23. Springer, 1997.
[40] Grigore Roşu and Klaus Havelund. Rewriting-based techniques for runtime verification. Autom. Softw. Eng., 12(2):151–197, 2005.
[41] Natasha Sharygina and Daniel Kröning. Model Checking with Abstraction for Web Services, pages 121–148. Springer, 2007.
[42] Wil M.P. van der Aalst and Maja Pesic. DecSerFlow: Towards a truly declarative service flow language. In Bravetti et al. [7], pages 1–23.
[43] Allen Van Gelder. Extracting (easily) checkable proofs from a satisfiability solver that employs both preorder and postorder resolution. In AMAI, 2002.
[44] Wattana Viriyasitavat, Li Da Xu, and Andrew Martin. SWSpec: The requirements specification language in service workflow environments. IEEE Trans. on Industrial Informatics, 2012. Published online, Digital Object Identifier: 10.1109/TII.2011.2182519.
[45] Chen Wang, Surya Nepal, Shiping Chen, and John Zic. Cooperative data management services based on accountable contract. In Robert Meersman and Zahir Tari, editors, CoopIS, volume 5331 of Lecture Notes in Computer Science, pages 301–318. Springer, 2008.
[46] Ben Wilson. Jailbreak for iPhone 3G released: how to use, 2008. http://reviews.cnet.com/8301-19512_7-10115639-233.html.
[47] Li Da Xu, Huimin Liu, Song Wang, and Kanliang Wang. Modelling and analysis techniques for cross-organizational workflow systems. Syst. Res., (26):267–289, 2009.
[48] Li Da Xu, Wattana Viriyasitavat, Puripan Ruchikachorn, and Andrew Martin. Using propositional logic for requirements verification of service workflow. IEEE Trans. on Industrial Informatics, 2012.
[49] Shuo Yang, Ali Raza Butt, Y. Charlie Hu, and Samuel P. Midkiff. Trust but verify: monitoring remotely executing programs for progress and correctness. In Keshav Pingali, Katherine A. Yelick, and Andrew S. Grimshaw, editors, PPOPP, pages 196–205. ACM, 2005.