On the Logic of Information Flow

Jon Barwise    Dov Gabbay    Chrysafis Hartonas

December 20, 1994
Abstract
This paper is an investigation into the logic of information flow. The basic perspective is that information flows in virtue of constraints (as in [7]), and that constraints classify channels connecting particulars (as in [8]). In this paper we explore some logics intended to model reasoning in the case of idealized information flow, that is, where the constraints involved are exceptionless. We look at this as a step toward the far more challenging task of understanding the logic of imperfect information flow, that is, where the constraints admit of exceptional connections. This paper continues and amplifies work presented by the same authors in [10].
Over the past decade, information has emerged as an important topic in the study of logic, language, and computation. This paper contributes to this line of work. We formulate a notion of information network that covers many important examples from the literature of information-theoretic structures. Information networks have two kinds of items. Following Barwise [5] we call these items sites of information, and channels between sites. We present various logical calculi intended to model perfect reasoning about the flow of information through an information network.

To motivate the basic picture, let's briefly consider some real world examples of information flow, and then think about how we can model them in a uniform way. First, imagine a person going into a room and seeing some scene, which we construe as a site. Based on general information about how the world works, the person knows something about how things came to be as they are. Certain connections exist between what they see and other situations. The person's knowledge of these connections and how they work leads him to know other facts about other situations.¹ We think of this as information flow, from the information that is actually visually present in the scene, to the richer information derived by the person viewing it.

For a second example, borrowed from [9], imagine a simple circuit with components a battery, a switch, and a light bulb, hooked up in series. This "hooking up" establishes connections between these particulars. As a result, information about the bulb, that it is on, can carry information about the switch (that it is closed) and the battery (that it is charged).

Action and planning provides a third example of information flow. We can think of an action like flipping the light switch as connecting the situation before the action with the situation after the action. If the initial situation s0 is of one type, say A, and we carry out a certain action c, the resulting situation s1 will be of some other type, say B. We think of this as information flow from the initial situation s0 to the resulting situation s1 along the action c. In our formalism, an action c is of type A → B if it can connect a situation of type A only to one of type B. The whole point of planning is to choose actions which get us from where we are to a situation of some desired type.

Many other examples exist, but these three should allow us to get started. We are going to think of information flowing through some distributed system ([9]) of "information sites" connected together by "channels." Information will flow from one site to another in virtue of connections established by these channels. Hence reasoning about information will require us to reason about flow along channels. In the next section we define two successively stronger models of such a distributed system: information frame and information network.
1 Information flow

In this section we present the basic definitions used in the remainder of this paper, together with some examples.
1.1 Modeling information networks
We work up to the definition of an information network in two stages.
¹ Situation theory views these in terms of situations and constraints; LDS views them in terms of labels and labelled entailment. The common abstraction we present here is site and channel.
Definition 1.1 An information frame is a structure of the form ⟨S, C, ↝⟩ where S is a set of objects called sites, C is a set of objects called channels, and ↝ is a relation on S × C × S. The signaling relation s ↝_c t is read "c is a channel from source s to target t." A connection for the channel c is a pair ⟨s, t⟩ such that s ↝_c t.

An information network is a structure of the form N = ⟨S, C, ↝, ∘⟩ where ⟨S, C, ↝⟩ is an information frame and ∘ is an associative binary operation on C. The channel a ∘ b is called the composition of a and b. The signaling relation and composition operation are required to satisfy the following condition: for all channels a and b,

    ∀s, t [ s ↝_{a∘b} t  iff  ∃r ( s ↝_a r and r ↝_b t ) ].
The general set-up allows for the possibility that S and C could have elements in common, or even be the very same set. If C ⊆ S we call the network homogeneous. Otherwise it is called heterogeneous.

Any information frame ⟨S, C, ↝⟩ can be expanded to an information network in a canonical manner. The new channels are finite sequences of old channels. Given such a sequence c = (c1, …, cn), we define s ↝_c t iff there are sites s0, s1, …, sn such that s = s0, t = sn and s_i ↝_{c_{i+1}} s_{i+1} for each i < n. Composition of sequences is just concatenation. Each of the original channels c is identified with the sequence (c) of length one.

To keep things simple, we assume that the composition operation of a network is a total operation on channels. Partiality can usually be handled by adding a nil channel u (for "undefined") which does not connect anything to anything. In this case, we regard channels c and c′ as non-composable just in case c ∘ c′ = u. The nil channel acts as a zero element for the operation ∘, i.e. for every channel c we have c ∘ u = u = u ∘ c.²

In addition to the more concrete motivating examples given above, we present some examples of mathematical structures that can be construed as information networks or frames.

² We will return later in the paper to the general question of when a partial, binary operation should be considered associative. It turns out that there are such partial associative operations for which the trick of adding one new nil channel, with composition defined as above, does not work. However, it does work in all of the examples given above. For the proof of the Completeness Theorem for the calculi G₂ and G, we need to work with the more refined notion of partially associative operation.
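To make the canonical expansion concrete, here is a small Python sketch (all names and data are illustrative, not from the paper): channels of the expanded network are finite sequences of old channels, and the sketch checks the defining condition of an information network for sequence composition.

```python
from itertools import product

# A toy information frame: sites S, channels C, and a signaling relation
# given as a set of triples (s, c, t), meaning "s ~c~> t".
S = {"s0", "s1", "s2"}
C = {"a", "b"}
signal = {("s0", "a", "s1"), ("s1", "b", "s2")}

# Canonical expansion: new channels are finite sequences of old channels,
# composition is concatenation, and s ~(c1,...,cn)~> t iff there is a
# chain s = r0 ~c1~> r1 ~c2~> ... ~cn~> rn = t.
def signals_seq(s, seq, t):
    frontier = {s}
    for c in seq:
        frontier = {t2 for r in frontier
                    for (r2, c2, t2) in signal if r2 == r and c2 == c}
    return t in frontier

# The defining condition of an information network: for all a, b,
#   s ~(a*b)~> t  iff  there is some r with s ~a~> r and r ~b~> t.
for a, b in product(C, repeat=2):
    for s, t in product(S, repeat=2):
        lhs = signals_seq(s, (a, b), t)
        rhs = any(signals_seq(s, (a,), r) and signals_seq(r, (b,), t)
                  for r in S)
        assert lhs == rhs

print(signals_seq("s0", ("a", "b"), "s2"))  # the chain s0 ~a~> s1 ~b~> s2
```

Since concatenation merely strings chains together, the condition holds by construction; the brute-force check above is just a sanity test on this tiny frame.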
Example 1 Let w be "the world," let S = C = {w}, let w ↝_w w and let w ∘ w = w. With this network, our logic will just reduce to classical propositional logic for the connectives ∧ and →, except that we will have two copies of each.
Example 2 Let S consist of some set of "worlds" and let C = {r} where r is some accessibility relation on S, with s ↝_r t iff s r t.

Example 3 Generalizing Example 2, let S consist of any set, C the set of binary relations on S, with s ↝_c t iff ⟨s, t⟩ ∈ c. Let ∘ be composition of relations.
Example 4 Let S be the set of hereditarily finite sets on some set A and let C consist of those elements of S which are binary relations, i.e., finite sets of ordered pairs of elements of A ∪ S. Again define s ↝_c t iff ⟨s, t⟩ ∈ c. Notice that this example, unlike the previous two, has channels which are also sites.
Example 5 For another example where channels are also sites, let S and C both be the set of natural numbers, with s ↝_c t iff s is in the domain of the unary recursive function φ_c whose Gödel number is c and φ_c(s) = t.

The logic we are going to explore turns out to be closely related to the Lambek Calculus [25]. To see why, note that any associative operation ∘ on a set C gives rise to an information network in a natural way. Let S = C and define s ↝_c t iff s ∘ c = t. Then it is routine to see that N = ⟨S, C, ↝, ∘⟩ is an information network. We will call a network of this kind a Lambek network. In other words, a Lambek network is an information network where the signaling relation coincides with the composition operation thought of as a three-place relation. Here are two natural examples of Lambek networks.
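The Lambek-network construction can be checked mechanically. The sketch below (illustrative data, not from the paper) uses strings under concatenation, anticipating Example 6, and verifies the network condition on a small finite fragment of the free monoid.

```python
from itertools import product

# A Lambek network from an associative operation: take S = C and define
# s ~c~> t iff s * c = t. Here * is string concatenation.
alphabet = "ab"
# A finite fragment: all strings of length <= 2 over the alphabet.
strings = [""] + [x for x in alphabet] + \
          [x + y for x in alphabet for y in alphabet]

def signals(s, c, t):
    return s + c == t          # s ~c~> t  iff  s * c = t

# Network condition: s ~(a*b)~> t iff some r has s ~a~> r and r ~b~> t.
# (It holds with r = s + a; we check it brute-force on the fragment.)
for a, b in product(alphabet, repeat=2):       # length-1 channels
    for s in strings:
        for t in strings:
            lhs = signals(s, a + b, t)
            rhs = any(signals(s, a, r) and signals(r, b, t)
                      for r in strings)
            assert lhs == rhs

print(signals("John admires", " Mary", "John admires Mary"))
```

Read the final line against Example 6 below: the string " Mary" is a channel from the source "John admires" to the target "John admires Mary".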
Example 6 Let S = C consist of the finite strings on some alphabet Σ. Define ∘ to be concatenation of strings and consider the associated Lambek network. Thus the noun phrase Mary is the channel between the transitive verb admires and the verb phrase admires Mary. The latter is, in turn, the channel between the noun phrase John and the sentence John admires Mary.
Example 7 The connection with the Lambek Calculus can be further illuminated by considering the relational semantics for the Lambek Calculus suggested by van Benthem and shown to be complete in [1]. Here models are taken to be the ordered pairs in some transitive relation R. To construe such a relation as an information network, we take the sites and channels both to be the elements of R and define ⟨a, b⟩ ∘ ⟨b, c⟩ = ⟨a, c⟩. We add a new element u and define ⟨a, b⟩ ∘ ⟨c, d⟩ = u if b ≠ c. We also define x ∘ u = u ∘ x = u for all x. This makes ∘ an associative operation. If we consider the associated Lambek network we have ⟨a, b⟩ ↝_{⟨b,c⟩} ⟨a, c⟩ for all pairs ⟨a, b⟩ and ⟨b, c⟩ in the transitive relation R, and for no other pairs. That is, we think of a pair ⟨b, c⟩ in R as a channel which takes as source any pair in R whose second element is b, say ⟨a, b⟩, and connects it to a unique target, namely ⟨a, c⟩.
Example 8 This example points to the connection of this work with labelled deductive systems. Let us consider the LDS treatment of linear logic. Let La be some set of atoms and let S and C both consist of finite subsets of La. Define s ↝_c t to hold iff s ∩ c = ∅ and t = s ∪ c.

To see what this might have to do with labelled deductions and linear logic, let the atoms of La be labels for the premises of some proof. A site s will give us information about which premises an inference of a wff A depends on by being the set of labels of the premises that have been used to prove A. In order to apply a rule like modus ponens to A and A → B in linear logic, it is required that no premise have been used in the proofs of both of these (hence the condition that s ∩ c = ∅); we then add together the sets of labels on which A and A → B depend to tell us what B depends on (hence the condition that t = s ∪ c). Notice that this example, unlike the earlier examples, has the property that s ↝_c t iff c ↝_s t. This corresponds to the fact that we can think of the proof of either wff, A or A → B, as connecting the proof of the other wff to the proof of the conclusion B.

This is one of the simplest possible examples where we use sites to carry information about resources of theorems, and connections to regulate these resources. Instead of sets, we could use structures of various sorts. For instance, if we wanted to allow a single wff to be used more than once without being listed more than once in premises we might use multisets. The general point is that resource limitations can be modeled explicitly by using sites and channels connecting them.

As a final example, we amplify the remarks on action and plans from the introduction.³

³ This example was suggested by work of Bibel and associates [12] applying linear logic to planning, on the one hand, and the connection of the work here with linear logic on the other.
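Example 8 is easy to run as code. The sketch below (with made-up premise labels) checks the disjointness condition, the union condition, and the site/channel symmetry noted in the text.

```python
# Example 8 as code: sites and channels are finite sets of premise
# labels; s ~c~> t holds iff s and c are disjoint and t = s | c.
def signals(s, c, t):
    return s.isdisjoint(c) and t == s | c

A_labels = frozenset({"p1"})           # labels used to prove A
impl_labels = frozenset({"p2", "p3"})  # labels used to prove A -> B

# Modus ponens in linear logic: the two proofs may share no premise,
# and B inherits the union of their labels.
B_labels = A_labels | impl_labels
assert signals(A_labels, impl_labels, B_labels)

# The symmetry noted in the text: s ~c~> t iff c ~s~> t.
assert signals(impl_labels, A_labels, B_labels)

# Reusing a premise is blocked: overlapping label sets do not signal.
assert not signals(frozenset({"p1"}), frozenset({"p1", "p2"}),
                   frozenset({"p1", "p2"}))

print(sorted(B_labels))   # the resource set the conclusion depends on
```

Swapping to multisets (e.g. `collections.Counter`) would model the variant in which a premise may be used more than once, as the text suggests.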
Example 9 Suppose we have some first-order language L0 with relations and constants. For our sites, we take L0-structures over some fixed domain A. For channels, we take arbitrary relations between such structures, i.e., sets of pairs ⟨Ml, Mr⟩ of such structures. We can think of these pairs ⟨Ml, Mr⟩ as models for a language L1 which is just like L0 except that every predicate R of L0 has two versions Rl and Rr in L1.

To be a little more concrete, the language L0 might be used to describe blocks worlds, having predicates like Block(x), Table(x), Empty(x), and On(x,y). The channels we would be interested in are relations that capture actions like the action of moving a onto b, or simply moving a somewhere. Thus, for example, the channel MoveOn(a, b) would consist of those pairs ⟨Ml, Mr⟩ such that Ml and Mr are just alike except that a is moved onto b in Mr. (We assume that this can only succeed if b was empty in Ml. We also assume that the stack of blocks that were on a stays on it as it is moved onto b.) Thus, for example, the L1-sentence

    ∀x∀y [On_r(x, y) ⊃ (On_l(x, y) ∨ x = a)]

expresses a truth about the action of moving a. (We use "⊃" for the material conditional since we want to save "→" for the conditional of information flow.)
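Here is one concrete connection of the MoveOn(a, b) channel, coded up as a sketch (a simplified rendering of Example 9, not the paper's formalism: a structure is just a set of On facts, and we check only the facts that change):

```python
# A structure is a set of On(x, y) facts; Empty(x) means nothing is
# on x. The pair (Ml, Mr) below is one connection of MoveOn(a, b):
# before, c sits on a and b is empty; after, a (with c riding along)
# sits on b.
Ml = frozenset({("c", "a")})
Mr = frozenset({("c", "a"), ("a", "b")})

def empty(m, x):
    return not any(bottom == x for (_, bottom) in m)

def in_move_on_channel(a, b, ml, mr):
    # b empty beforehand, and the only change is the new fact On(a, b)
    return empty(ml, b) and mr == ml | {(a, b)} and (a, b) not in ml

assert in_move_on_channel("a", "b", Ml, Mr)

# The L1-sentence  forall x,y [ On_r(x,y) -> (On_l(x,y) or x = a) ]
# holds of this connection: every On fact after the move either held
# before the move or has a as its first argument.
assert all((x, y) in Ml or x == "a" for (x, y) in Mr)

print("connection of MoveOn(a, b):", sorted(Ml), "->", sorted(Mr))
```

The channel itself is the set of all such pairs; quantifying the final check over every pair in the channel is exactly what it means for the L1-sentence to classify the action.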
Notice that in Examples 5–8, channels have the property that each source is connected to at most one target. An information frame with this property is said to be deterministic. In the deterministic case, channels act like operations on sources, each source yielding at most one target. Thus we sometimes think of general channels as non-deterministic operations from sources to targets.

All but three of the examples of information frames given above have a natural total composition operation defined on the set of channels, so we can construe them as information networks. For instance, in Example 9, the composition operation is just composition of binary relations. It is interesting to note, though, that in this example the composition of first-order definable channels (actions) is not necessarily first-order definable. Expressing composition takes you a bit (Σ¹₁, for those who know the lingo) into second-order logic. This expressive defect of L1 will be remedied in the language to be constructed below by the new connective ⊗.
Let us consider the three exceptions: Example 2, Example 7, and Example 8. First, if the accessibility relation r in Example 2 is transitive, then we can define r ∘ r = r and this will give us an information network. Example 7 is a bit more interesting. It has a natural composition operation: namely ⟨a, b⟩ ∘ ⟨b, c⟩ = ⟨a, c⟩. This operation satisfies the condition on composition in the definition of information network. The only trouble is that it is a partial operation; it is only defined on those pairs that match up as shown. So here we need to add an extra nil channel u in order to make the example fit our framework. Similarly, in Example 8, we have a partial composition operation, namely c1 ∘ c2 = c1 ∪ c2, as long as c1 and c2 are disjoint. Again, to make this an example of an information network, we need to add a nil channel u and define c1 ∘ c2 = u where c1 and c2 are not disjoint.

We are primarily interested in information flow along chains of connections

    s ↝_{c1} t1 ↝_{c2} t2 ↝ ⋯ ↝_{cn} tn
If we have such a chain and we know various things about elements of the chain, what can we tell about other elements of the chain? The simplest case of this is a chain of length 1, s ↝_c t. If we know that s is of such-and-such a type and that c is of such-and-such a type, what can we tell about t?

There is a natural intuition that is captured in Dretske's famous "Xerox Principle" in [16]: if s being of type A carries the information that r is of type B, and r being of type B carries the information that t is of type C, then s being of type A carries the information that t is of type C. On our account, the reason for this intuition is that the connection s ↝_c r that allows the first bit of information flow, and the connection r ↝_d t that allows the second, can be "composed," giving one a channel c ∘ d which connects s to t. We have captured this idea in our definition of an information network.
1.2 The languages L and L2 of types
In order to have a theory of information, we need to have ways of classifying sites, so that we can say that if site s is of type A and if it is connected to some site t by a channel c that supports the inference A → B, then t is of type B. We can think of these "types" syntactically, as expressions in some language, or more semantically, as units of information. But in either case we need a calculus of types and an analysis of what it means for a site or channel to be of some type.
Our languages are going to have some basic types and four connectives for relating them: #, →, ⇐ and ⊗. These are read as follows, where we use A, B to range over types of sites, and C, D to range over types of channels.

    A # C is read "A and C."
    A → B is read "A to B."
    A ⇐ C is read "A given C."
    C ⊗ D is read "C and then D."

Both of these "and" connectives are non-commutative, but for different reasons, as we will see. To motivate these connectives, we return to our examples.

Example 10 For a first example, consider the network where (Gödel numbers of) computable functions connect natural numbers to natural numbers. For site types we might take sets of natural numbers, like Even, Prime, and Odd. If e is a Gödel number of the function 3x + 1, then e will be of type Even → Odd, read "even to odd," since the function takes any even number x to an odd number. It will also be of type Odd → Even. However, it will not be of type Prime → Even since it connects the prime number 2 to the number 7, which is not even. A number n will be of type Prime # (Odd → Even) if n is of the form f(m) where m is a prime and f is a recursive function that takes Odds to Evens. (Every number is of this type, of course, since we can always send 2 to n and everything else to some even number.)
Example 11 For another example, consider the network of strings under concatenation given in Example 6. For atomic types, we might take English syntactic categories like n, tv, vp and s. Then the English expression Mary would be both of type n and of type tv → vp (the type of expression that takes a tv to its left and gives back a vp).
Example 12 Consider the case of action and plans, as modeled in Example 9. Here it is natural to use the first-order sentences of L0 as basic types over the sites and first-order sentences of L1 as basic types over the channels (actions). Thus we would have basic types like MoveOn(a, b) or Move(a) to classify actions. The new connectives will allow us to form sentences like Empty(b) # MoveOn(a, b), which would hold at a model M iff this model can be gotten from a model where b was empty by the action of moving a onto b. An action will be of type ¬Empty(b) → Empty(b) if it takes one from situations where b has something on it to situations where b is empty. An action will be of type (¬Empty(b) → Empty(b)) ⊗ MoveOn(a, b) if it consists of some action that uncovers the covered block b, followed by moving a onto b. Notice that this is quite different from MoveOn(a, b) ⊗ (¬Empty(b) → Empty(b)), which holds of those actions which first move a onto b and then empty b. The first entails that the resulting situation is of type On(a, b) while the second entails just the opposite. As an example of the ⇐ connective, consider On(a, b) ⇐ Move(a), read "On(a, b) given Move(a)." This will classify those situations where a will be on b given any action of moving a. Thus it must be that the table is full and b is the only block that is empty, except for a or the block that is on the top of the stack above a. Notice that whereas A → B combines sentences A, B ∈ L0, A ⇐ B combines a sentence A of L0 with a sentence B of L1.

The historical neglect of connections and channels in logic no doubt reflects an intuition that channels are, somehow, of a different nature than the things they connect, what we call sites (sources and targets) in this paper. And in some of our examples, sites and channels are disjoint from one another. But some of the other examples, most notably those illustrating the Lambek Calculus, but also the one for recursion theory, are important cases where channels are themselves particulars with their own channels connecting them to other particulars. There are, then, two options to be explored, according as we impose sorting constraints in the language. One option reflects the first kind of example, where sites and channels are kept separate. The other is more appropriate when channels are themselves sites.
To explore both options, we introduce two distinct "languages" for classifying sites and channels: a single-sorted language L and a two-sorted language L2. The single-sorted language L has a collection AtExp of atomic types. An information model M for L consists of an information network N together with a function f assigning to each A ∈ AtExp some set of sites and channels. If x ∈ f(A) we will say that x is of type A in M.

The language L2 is similar, except that it has expressions Exp_s of sort s, used to classify sites, and expressions Exp_c of sort c, used to classify channels, including atomic expressions AtExp_s and AtExp_c of each sort. A model M = ⟨N, f⟩ for L2 is just like a model for L except that the function f is required to respect the sorting. That is, if A is of sort s, then f(A) is a set of sites, and if C is of sort c then f(C) is a set of channels. The expressions of L2 are given by the following context-free grammar⁴:

    Exp_s ::= AtExp_s | (Exp_s # Exp_c) | (Exp_s ⇐ Exp_c)
    Exp_c ::= AtExp_c | (Exp_s → Exp_s) | (Exp_c ⊗ Exp_c)

In words, this amounts to the following recursive definition of the set of expressions:

    Every atomic expression of either sort is an expression of that sort.
    If A ∈ Exp_s and C ∈ Exp_c, then (A # C) ∈ Exp_s and (A ⇐ C) ∈ Exp_s.
    If A ∈ Exp_s and B ∈ Exp_s, then (A → B) ∈ Exp_c.
    If A ∈ Exp_c and B ∈ Exp_c, then (A ⊗ B) ∈ Exp_c.

We call expressions of either sort types since we are thinking of them as classifying sites and channels. The basic intuition behind the c-expressions is that A → B classifies those channels between sites that take one from a site of type A to a site of type B. A channel is of type A ⊗ B if it can be decomposed into a channel of type A followed by one of type B. These intuitions are captured by the definition of satisfaction in a model given below.

The expressions of the single-sorted language L are generated in the same way, but without the sorting restrictions. That is:

    Exp ::= AtExp | (Exp # Exp) | (Exp ⇐ Exp) | (Exp → Exp) | (Exp ⊗ Exp)
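The sorting discipline of L2 is easily implemented. Below is a sketch of a sort-checker (expressions are nested tuples; the ASCII operators "#", "->", "<=", "*" stand in for the paper's connectives, and the atomic types are made up):

```python
# A sort-checker for the two-sorted grammar of L2.
ATOMIC_S = {"A", "B"}      # illustrative atomic site types
ATOMIC_C = {"C", "D"}      # illustrative atomic channel types

def sort_of(e):
    """Return 's' or 'c' if e is a well-sorted L2 expression, else None."""
    if isinstance(e, str):
        return "s" if e in ATOMIC_S else "c" if e in ATOMIC_C else None
    op, left, right = e
    ls, rs = sort_of(left), sort_of(right)
    if op == "#"  and (ls, rs) == ("s", "c"): return "s"  # A # C  : site
    if op == "<=" and (ls, rs) == ("s", "c"): return "s"  # A <= C : site
    if op == "->" and (ls, rs) == ("s", "s"): return "c"  # A -> B : channel
    if op == "*"  and (ls, rs) == ("c", "c"): return "c"  # C * D  : channel
    return None

assert sort_of(("->", "A", "B")) == "c"            # A -> B is a channel type
assert sort_of(("#", "A", ("->", "A", "B"))) == "s"
# Sorting violations of the kind discussed below are rejected: the
# succedent of -> must be of sort s.
assert sort_of(("->", "A", ("*", "C", "D"))) is None
assert sort_of(("->", "A", ("->", "A", "B"))) is None
print("sort checks passed")
```

Dropping the four sort conditions (returning "s" or "c" freely) yields exactly the single-sorted grammar of L, in which the rejected expressions above become well-formed.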
Notice that if we let AtExp = AtExp_s ∪ AtExp_c then L2 is a sublanguage of L, consisting of those expressions where the sorting restrictions are observed. L2 is a proper sublanguage of L. For example, the expressions A → (B ⊗ C)

⁴ Our choice of the symbol ⇐ for one of the operators makes it appear that there are two operators with signature s × c ↦ s. However, using for a moment the symbol ⇒ in place of ⇐, writing C ⇒ A for A ⇐ C, it becomes clear that the signature of ⇒ (hence of ⇐) is really c × s ↦ s. We conform here to notation used in the context of other logics with restricted structural rules, where ⇐ and → denote the residuals of a product operation other than conjunction ("times," in the linear logic community, or "fusion" and "intensional conjunction" in the relevance logic tradition).
and A → (B → C) of L cannot be well-formed expressions of L2, since the succedents are of sort c whereas → demands that both antecedent and succedent be of sort s in L2.
Example 13 The two-sorted language would be appropriate for Examples 2, 3, and 9. The single-sorted language would be appropriate for the other examples given earlier.

We now give the definition of what it means for a site or channel to be of some type. We use s and t to range over sources and targets and c, possibly with subscripts, to range over channels.
Definition 1.2 Given an information model M = ⟨N, f⟩, the of-type relation ⊨ is defined by the following clauses:

    1. For an atomic type A and x ∈ S ∪ C, x ⊨_M A iff x ∈ f(A).
    2. (to) c ⊨_M (A → B) iff ∀s, t (if s ⊨_M A and s ↝_c t, then t ⊨_M B).
    3. (and then) c ⊨_M (A ⊗ B) iff ∃c1, c2 (c1 ⊨_M A, c2 ⊨_M B and c = c1 ∘ c2).
    4. (and) t ⊨_M (A # C) iff ∃s, c (s ⊨_M A, c ⊨_M C and s ↝_c t).
    5. (given) s ⊨_M (A ⇐ C) iff ∀c, t (if c ⊨_M C and s ↝_c t, then t ⊨_M A).

Both # and ⊗ are forms of conjunction. There are three ways to see this.
First, we can just look at the definition and see that "and" appears in both. Second, we note that if the information network is the trivial "one world" network of Example 1, then both of these amount to ordinary conjunction. Finally, the claim is further substantiated by the inference rules for these connectives in the calculi below. By adding certain structural rules, the rules for # and ⊗ would degenerate into the rules for truth-functional "and." In a similar manner, we can see that → and ⇐ are generalizations of the material conditional.

Definition 1.2 makes sense for L but also for the sorted sublanguage L2, provided the model M is a model for L2, that is, it respects the sorting restrictions of L2. The following example is worth noting.
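The clauses of Definition 1.2 can be checked by brute force on a finite model. The sketch below does this for the one-world network of Example 1 (ASCII operators "->", "*", "#", "<=" stand in for the paper's connectives; the atomic types are made up):

```python
from itertools import product

# The one-world network: w ~w~> w and w o w = w; every basic type
# holds of w.
S = {"w"}
C = {"w"}
signal = {("w", "w", "w")}
compose = {("w", "w"): "w"}
f = {"A": {"w"}, "B": {"w"}}

def sat(x, e):
    """x |= e, computed directly from the clauses of Definition 1.2."""
    if isinstance(e, str):                      # clause 1: atomic type
        return x in f[e]
    op, l, r = e
    if op == "->":   # (to): every A-source signaled by x lands on B
        return all(sat(t, r) for (s, c, t) in signal
                   if c == x and sat(s, l))
    if op == "*":    # (and then): x decomposes as c1 o c2
        return any(compose.get((c1, c2)) == x and sat(c1, l) and sat(c2, r)
                   for c1, c2 in product(C, repeat=2))
    if op == "#":    # (and): some A-site reaches x by some r-channel
        return any(sat(s, l) and sat(c, r) for (s, c, t) in signal
                   if t == x)
    if op == "<=":   # (given): every r-channel from x leads to an A-site
        return all(sat(t, l) for (s, c, t) in signal
                   if s == x and sat(c, r))
    raise ValueError(op)

# With no negation in the language, every type holds of w.
for e in ["A", ("->", "A", "B"), ("*", "A", "B"),
          ("#", "A", "B"), ("<=", "A", "B")]:
    assert sat("w", e)
print("w satisfies every type checked")
```

The same checker runs unchanged on any finite network; only the data at the top (sites, channels, signaling triples, composition table, and f) need to be replaced.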
Example 14 Let N be the one-world network of Example 1, let f(A) = {w} for every basic type A, and let M = ⟨N, f⟩ be the resulting model. An easy induction shows that w ⊨_M A for every type A. This is because our language does not have any form of negation.

With our calculus of types at hand, there is another decision to be made in giving an analysis of information flow. We can think of the types themselves as units of information, leaving the sites and channels that support them implicit. Alternatively, we can think of the information units as being given by a pair [s : A] consisting of a site s (or channel) and a type A. In this case, the logic will traffic directly in such units. We explore both alternatives in this paper.

Here is an analogy that might be suggestive. Consider some shop, like a cheese shop or a butcher, where there are often lots of people to be served but only one clerk to serve them. One strategy for dealing with this is to have customers line up, and then use the structure of the line to determine who gets served when. The second strategy is to hand out numbers and use these numbers to determine when people get served. This strategy lets people wander off and do other things, since the number keeps track of things.

These two strategies are instances of something more general. Sometimes the structure of the environment imposes a structure on the information supported by different parts of it, and information processors can use that structure to keep things sorted out. This is like the line at the shop. Other times, it is more convenient to make clear which part of the environment gave us which piece of information, so that we can put them together in the right way outside that environment. This is like handing out numbers to the customers. The first of these strategies is reflected by taking types as information units. The second strategy is reflected by taking information units to be given by pairs [s : A] of a site or channel s and a type A.

Of these two strategies, the second is more in keeping with both the situation theory approach and the LDS approach to information. In situation theory, propositions are given by a situation together with information about it.
In LDS, information is given by a labelled wff. The first approach is more in line with other approaches, however, like the Lambek Calculus. We want to explore them both. Our feeling is that the second is more general and flexible, but we do not yet see how to derive either from the other. We will be presenting four different Gentzen-type systems in what follows. For lack of anything better, we call these systems G₂, G, G₂′ and G′.
The systems G₂ and G are systems where types are taken as information units, whereas the other two systems take propositions as information units.
2 Types as information units

In this section we explore the first possibility suggested above.
2.1 The two-sorted case
We begin with the two-sorted language, L2, since it is conceptually cleaner. With this to guide us, we then turn to the more general language L.
L2-sequents and validity

An L2 pre-sequent is a pair (Γ, B), where Γ is a finite, non-empty sequence⁵ of types and B is a type of L2. We write pre-sequents in Gentzen-style form as Γ ⊢ B. A pre-sequent is an L2-sequent if it is one of the following forms:

    1. B is of sort s and Γ is a sequence consisting of a type of sort s followed by a (possibly empty) sequence of types, each of sort c.
    2. B is of sort c and Γ is a sequence of types, each of sort c.

We call L2-sequents of the first kind site-sequents (s-sequents) and those of the second kind channel-sequents (c-sequents). As a notational convention, we sometimes use | to stress that a sequent is of the second kind, as in C1, C2, C3 | C, while using ⊢ for both kinds of sequents. Given the sorting restrictions of L2, however, a pre-sequent like A # B, Γ ⊢ C, for example, if a sequent at all, can only be a sequent of the first kind, since it starts with an expression of sort s.
Definition 2.1 An s-sequent A, C1, …, Cn ⊢ B is valid in a model M if and only if for every information chain

    s ↝_{c1} t1 ↝_{c2} t2 ↝ ⋯ ↝_{cn} tn,

if s ⊨ A and ci ⊨ Ci for each i, then tn ⊨ B.
⁵ As usual, we take concatenation of sequences to be an associative operation.
A c-sequent C1, …, Cn | C is valid in M if for every sequence c1, …, cn of channels, if c is the composition c1 ∘ ⋯ ∘ cn and ci ⊨ Ci for each i, then c ⊨ C. A sequent is said to be valid if it is valid in every model. As simple examples of valid sequents, we have
    A, A → B ⊢ B

and

    A → B, B → C | A → C
Notice, though, that if we permuted the premises in either of these, the results would not be valid. (The first would not even be a well-formed sequent.) Our first aim is to find a proof theory that is sound and complete for this notion of validity.
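Both claims can be verified by brute force in a small instance of Example 3 (an illustrative sketch: sites are {0, 1, 2}, channels are all binary relations on them, composition is composition of relations, and the types A, B, C are made-up singletons):

```python
from itertools import product

SITES = (0, 1, 2)
PAIRS = list(product(SITES, repeat=2))
# All 512 binary relations on the three sites, as candidate channels.
CHANNELS = [frozenset(p for p, keep in zip(PAIRS, bits) if keep)
            for bits in product((0, 1), repeat=len(PAIRS))]

A, B, C = {0}, {1}, {2}          # illustrative site types

def arrow(c, src, tgt):
    """c |= src -> tgt: every signaled src-site lands in tgt."""
    return all(t in tgt for (s, t) in c if s in src)

def comp(c1, c2):
    """c1 then c2, as composition of relations."""
    return frozenset((s, t) for (s, r1) in c1
                     for (r2, t) in c2 if r1 == r2)

# A -> B, B -> C | A -> C holds: every in-order composition works.
assert all(arrow(comp(c1, c2), A, C)
           for c1 in CHANNELS if arrow(c1, A, B)
           for c2 in CHANNELS if arrow(c2, B, C))

# Permuting the premises breaks validity; here is a countermodel.
d1, d2 = frozenset({(0, 0)}), frozenset({(0, 1)})
assert arrow(d1, B, C) and arrow(d2, A, B)
assert not arrow(comp(d1, d2), A, C)
print("B -> C, A -> B | A -> C has a countermodel")
```

In the countermodel, d1 satisfies B → C vacuously (it signals no B-site), yet composing it before d2 yields a channel taking the A-site 0 to the non-C-site 1.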
Examples 2.2 Let's look at a couple of valid sequents in Example 9, the one where channels are actions. In this case, an s-sequent A, C1, …, Cn ⊢ B holds in a model if whenever you start with a situation s of type A and carry out actions of types C1, …, Cn, in that order, then whatever situation t you get to will be of type B. For instance, in our blocks world model, we would have

    Empty(b), MoveOn(a, b) ⊢ On(a, b)

If we are given site types A and B and asked to devise a plan for getting from situations of type A to those of type B, what we want is an action type C such that A, C ⊢ B holds.⁶

A c-sequent C1, …, Cn | C asserts that if you compose any actions of types C1, …, Cn, in that order, then the resulting action will be of type C. Thus, for example, in our blocks world model, we would have

    MoveOn(a, b), MoveOn(a, c) | MoveOn(a, c)

Notice, however, that if we permute the two premises, the result is not valid. The problem of refining a plan C is the problem of finding ways to bring about an action of type C by composing actions of other types, C1, …, Cn,

⁶ This is a bit crude. A better definition would be to find a constraint A′ → B′ such that A ⊢ A′ and A, A′ → B′ ⊢ B are both valid. The first sequent insures that the initial situation s of type A is guaranteed to satisfy the preconditions of the action to be undertaken.
types that you can implement more directly. For instance, in going home, you leave your office, walk to your car, and drive home. Each of these is similarly refined until you get types of actions that you can actually carry out. Thus, the task of refining an action of type C can be modeled as the task of finding a valid c-sequent with C as succedent.

Besides axiomatizing the set of valid sequents, we also want to axiomatize the notion of logical consequence between sequents, which we now define in the natural way.
Definition 2.3 A theory T is a set of sequents. A sequent S is a logical consequence of a theory T if S is valid in every model M in which all the sequents in T are valid. We write this as ⊨_T S.

The Gentzen system G₂
As usual, the Gentzen system we present is based on a single axiom scheme together with two rules of inference for each connective, a "left" and a "right" rule. The left rules basically tell us how to use the wffs containing the connective, while the right rules tell us how to prove wffs containing the connective.
Definition 2.4 The set of (logical) theorems is the smallest set of sequents containing the identity axioms (see below) and closed under the rules listed below. Given a theory T, the set of theorems of T is the smallest set containing the sequents in T and the identity axioms and closed under the rules.

(identity)
    A ⊢ A

(cut)
    Γ ⊢ A    Δ, A, Θ ⊢ B
    ---------------------
    Δ, Γ, Θ ⊢ B

(and)
    (# L)  A, B, Γ ⊢ C          (# R)  Γ ⊢ A    Δ ⊢ C
           --------------              ----------------
           A # B, Γ ⊢ C                Γ, Δ ⊢ A # C

(to)
    (→ L)  Γ ⊢ A    B, Δ ⊢ C    (→ R)  A, Γ ⊢ C
           -------------------         ----------
           Γ, A → B, Δ ⊢ C             Γ ⊢ A → C

(given)
    (⇐ L)  Γ ⊢ C    A, Δ ⊢ B    (⇐ R)  Γ, A ⊢ B
           -------------------         ----------
           A ⇐ C, Γ, Δ ⊢ B             Γ ⊢ B ⇐ A

(and then)
    (⊗ L)  Γ, A, B, Δ ⊢ C       (⊗ R)  Γ ⊢ A    Δ ⊢ B
           ------------------          ----------------
           Γ, A ⊗ B, Δ ⊢ C             Γ, Δ ⊢ A ⊗ B
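As a sanity check on the rules, here are sketch derivations, in ordinary LaTeX proof-tree notation, of the two valid sequents noted earlier; the inference labels follow the table above.

```latex
% A, A -> B |- B: one application of the left rule for the conditional.
\frac{A \vdash A \qquad B \vdash B}
     {A,\ A \to B \ \vdash\ B}\ (\to L)

% A -> B, B -> C | A -> C: two left rules, then the right rule.
\frac{\dfrac{A \vdash A \qquad
             \dfrac{B \vdash B \qquad C \vdash C}
                   {B,\ B \to C \ \vdash\ C}\ (\to L)}
            {A,\ A \to B,\ B \to C \ \vdash\ C}\ (\to L)}
     {A \to B,\ B \to C \ \mid\ A \to C}\ (\to R)
```

Note how (→ R) converts the s-sequent A, A → B, B → C ⊢ C into the c-sequent A → B, B → C | A → C by discharging the leading site type, exactly as the sorting discipline requires.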
The reader familiar with Gentzen-style proof systems will have noticed that there are no structural rules included in this system. None of the usual structural rules (Weakening, Permutation, Contraction) are sound in this context.⁷ This said, notice that the rules for application (#) and composition (∘) are both like the usual conjunction rules, in that in the presence of the structural rules, either would amount to the usual conjunction rule, as claimed earlier. Similarly, in the presence of the structural rules, the rules for ⇐ and → are interchangeable; they are just the rules for the standard conditional. Thus, what we have here is a teasing apart of two different rules for ∧ and ⊃, one that starts from semantic considerations.

We stress that in the above rules we are using ⊢ ambiguously for either an s-sequent or a c-sequent and assume whichever reading is appropriate so that the premises and conclusions are, in fact, well-formed sequents made of well-formed expressions. Most rules can be read unambiguously simply by looking at the form of the right-hand expression. Consider, for example, the rule (# R). In order for A # C to be well-formed, A must be of sort s and C must be of sort c, so the two sequents in the premise of the (# R) rule are an s-sequent and a c-sequent respectively, and the conclusion is an s-sequent. Exceptions to unique readability are Identity, the Cut rule, and the rule (∘ L), which makes sense in two cases: that where both sequents are c-sequents, but also where both are s-sequents and Γ is not empty. Both versions of (∘ L) are needed in the proof of the Cut Elimination theorem.

If we were to blindly apply the rules without making sure that the sorting restrictions on expressions were observed, we could generate "sequents" like the following: A, (A → (B ∘ C)) ⊢ B ∘ C. The "derivation" is simply

    A ⊢ A    B ∘ C ⊢ B ∘ C
    ─────────────────────── (→ L)
    A, (A → (B ∘ C)) ⊢ B ∘ C

Note, however, that even though the premises are sequents, the consequent fails to be well-formed, since it contains A → (B ∘ C), which is not a well-formed expression.

⁷ But when we turn to the language L, these will be valid derivations. The only possible exception to this claim is the rule of Association, which falls out of our treatment of concatenation of sequences as associative. If we had allowed a non-associative concatenation operation, we would need an Association Rule.
With regard to the cut rule, note that there are three cases in its application. We make them explicit below by using the | convention, where ⊢ is used to indicate exclusively an s-sequent:

    Γ ⊢ A    A, Δ ⊢ B        Γ | A    Δ, A, Δ′ ⊢ B        Γ | A    Δ, A, Δ′ | B
    ──────────────────       ─────────────────────        ─────────────────────
        Γ, Δ ⊢ B                 Δ, Γ, Δ′ ⊢ B                 Δ, Γ, Δ′ | B
Cut Elimination

Our first result about this system is a cut-elimination argument for the case where there is no set T of non-logical axioms to worry about.

Theorem 2.5 The system presented above is equivalent to its cut-free fragment.

Proof. The proof of this result is a standard cut-elimination argument. It uses a double induction, first on the complexity of the formula being cut, then within that on the length of the derivation. As usual, there are many cases to consider, and within those many details, too many to present a complete proof in print. Instead, we present an outline. First, we distinguish between two cases: that where the formula being cut is active on both the left and right, and that in which it is not. In the second case, we simply push the cut back to the previous step, by the induction on length of proof, and then repeat the last step in the original proof, as usual. Suppose, for example, that the proof ends as follows:

    Γ, B, C ⊢ A
    ──────────── (∘ L)
    Γ, B ∘ C ⊢ A        A ⊢ D
    ────────────────────────── (Cut)
         Γ, B ∘ C ⊢ D

Using the induction on length, we obtain a cut-free derivation of Γ, B, C ⊢ D and then apply (∘ L) to get the desired conclusion. The more intricate case is where the cut formula is active in both the last steps of the proof. Suppose, for example, that the cut formula is A # C and that the proof ends as follows:

    Γ1 ⊢ A    B ⊢ C              A, C, Δ ⊢ E
    ──────────────── (# R)       ───────────── (# L)
    Γ1, B ⊢ A # C                A # C, Δ ⊢ E
    ──────────────────────────────────────── (Cut)
               Γ1, B, Δ ⊢ E

By induction on the complexity of the cut formula, we can get a cut-free proof of Γ1, C, Δ ⊢ E. Using induction on the complexity of the cut formula again, together with the fact that B ⊢ C is provable, we get a cut-free proof of Γ1, B, Δ ⊢ E, as desired. While there are a lot of cases to consider, they are all similar in spirit to this one. We leave the details to the interested reader. □
Subformula property and decidability

We remind the reader of these typical consequences of a propositional cut-elimination theorem. If a sequent Γ ⊢ A is provable, the existence of a cut-free proof p implies that every type occurring in a sequent in p is a subexpression⁸ of some type occurring in Γ ⊢ A (subformula property). Decidability follows by the following intuitive argument: given the sequent Γ ⊢ A, we unfold all possible (finitely many) derivation trees, by eliminating the occurring connectives. More than one possibility arises, depending on which type we consider as the active type at each step, but the procedure terminates in finitely many steps with top sequents S1, ..., Sk, which we can then check to see if they are initial sequents, i.e. axioms.
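This backward search is easy to prototype. The sketch below (a Python illustration of ours, not part of the paper) implements exhaustive cut-free proof search for a small fragment containing only identity, (→ L), (→ R), (∘ L), and (∘ R), with the sorting restrictions ignored. Each rule application strictly decreases the total number of connectives, so the search terminates.

```python
from functools import lru_cache

# Types: atoms are strings; ('->', A, B) is A -> B; ('o', A, B) is A o B.

def splits(seq):
    """All ways to cut a tuple into a left and a right part."""
    return [(seq[:i], seq[i:]) for i in range(len(seq) + 1)]

@lru_cache(maxsize=None)
def provable(antecedent, succedent):
    """Backward cut-free proof search (fragment: identity, ->, o only)."""
    # (identity)
    if antecedent == (succedent,):
        return True
    # (-> R): from  A, G |- C  infer  G |- A -> C
    if isinstance(succedent, tuple) and succedent[0] == '->':
        if provable((succedent[1],) + antecedent, succedent[2]):
            return True
    # (o R): from  G |- A  and  D |- B  infer  G, D |- A o B
    if isinstance(succedent, tuple) and succedent[0] == 'o':
        for g, d in splits(antecedent):
            if g and d and provable(g, succedent[1]) and provable(d, succedent[2]):
                return True
    for i, t in enumerate(antecedent):
        if isinstance(t, tuple) and t[0] == '->':
            # (-> L): from  G |- A  and  B, D |- C  infer  G, A -> B, D |- C
            g, d = antecedent[:i], antecedent[i + 1:]
            if g and provable(g, t[1]) and provable((t[2],) + d, succedent):
                return True
        if isinstance(t, tuple) and t[0] == 'o':
            # (o L): from  G, A, B, D |- C  infer  G, A o B, D |- C
            if provable(antecedent[:i] + (t[1], t[2]) + antecedent[i + 1:], succedent):
                return True
    return False

AB = ('->', 'A', 'B')
print(provable(('A', AB), 'B'))   # A, A -> B |- B : provable
print(provable((AB, 'A'), 'B'))   # permuted premises: not provable
```

Note how the second query fails: permuting the two premises of the first sequent destroys provability, in keeping with the absence of a Permutation rule.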
Some basic facts
We sometimes abuse notation and write Γ ⊢ B both for a sequent and for the claim that the sequent is a logical theorem. It should be clear from context which we have in mind. Similarly, we will sometimes write Γ ⊢T B to indicate that the sequent Γ ⊢ B is provable from the theory T. We write A ≡ B for "A ⊢ B and B ⊢ A," and similarly for A ≡T B. Also, if Γ is a sequence consisting of a site-type followed by channel-types, we define Γ# by A# = A and (Γ, B)# = (Γ# # B). Similarly, if Γ is a sequence of channel-types, we define Γ∘ by C∘ = C and (Γ, C)∘ = (Γ∘ ∘ C). We list below some basic facts of the system to which we will appeal at various points later. Since we have shown Cut Elimination, we either freely use Cut or proceed by a proof in the cut-free fragment.
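The two operations Γ# and Γ∘ are just left folds over the sequence. A minimal sketch (Python; the tuple representation is our own):

```python
from functools import reduce

# Types as nested tuples: ('#', A, C) is A # C and ('o', C, D) is C o D.

def fold_app(gamma):
    """Gamma_# : a site-type followed by channel-types, folded with #."""
    return reduce(lambda acc, c: ('#', acc, c), gamma[1:], gamma[0])

def fold_comp(gamma):
    """Gamma_o : a sequence of channel-types, folded with o."""
    return reduce(lambda acc, c: ('o', acc, c), gamma[1:], gamma[0])

print(fold_app(['A', 'C1', 'C2']))    # ('#', ('#', 'A', 'C1'), 'C2')
print(fold_comp(['C1', 'C2', 'C3']))  # ('o', ('o', 'C1', 'C2'), 'C3')
```

This left-nesting is exactly the grouping ((A # C1) # C2) used in the proof of Theorem 2.8 below.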
Proposition 2.6

1. A # (B ∘ C) ≡ (A # B) # C
2. A # (A → B) ⊢ B and (B ⇐ A) # A ⊢ B
3. B ⊢ A → C iff A # B ⊢ C iff A ⊢ C ⇐ B
4. (A → B) ∘ (B → C) ⊢ (A → C)
5. (a) If Γ ⊢ A is an s-sequent, then Γ ⊢ A iff Γ# ⊢ A
⁸ The subexpression relation is defined inductively, as usual, by: A is a subexpression of itself, and if A = B ∗ C, where ∗ is one of the type-forming operators, and D is a subexpression of either B or C, then D is a subexpression of A.
(b) If Γ | C is a c-sequent, then Γ | C iff Γ∘ | C
The first four follow directly from the rules of the system, since we have included Cut. Items 5(a) and 5(b) are shown by induction. □
Soundness and completeness
In this section we indicate that the Gentzen system G is sound and complete.
Proposition 2.7 Every sequent that is a logical theorem is valid in all models. Every sequent that is derivable from a theory T in the system is a logical consequence of T.

Proof. The proof is a routine induction. One just verifies that the axioms are valid and that the rules preserve validity. □

The completeness theorem for the set of logical validities is quite a bit simpler than the completeness theorem for the notion of logical consequence, so we state and sketch a proof of it as a way of motivating the complexities in the more general result.
Theorem 2.8 There is a single information model M such that any sequent which is not derivable in G is invalid in this model.

The proofs of this and the other completeness theorems are collected together in Section 4; the proof of this result is given in Section 4.1. It is a canonical model construction proof, where the model is built of equivalence classes of sentences. The validity of the proof hinges on the following observation:
Lemma 2.9 (Decomposition Lemma) C ⊢ A ∘ B is a logical theorem iff there are c-expressions CA, CB such that the following are all logical theorems: CA ⊢ A, CB ⊢ B, and C ≡ CA ∘ CB.
The proof is by induction on the length of a cut-free derivation. □

The Decomposition Lemma does not hold when we relativize to a theory T. (It does not hold even for the pure logic case when we turn to our single-sorted language.) For example, the theory might be just the single axiom C ⊢ C ∘ C for some atomic C. This asserts that every channel c of type C can be decomposed as c = c1 ∘ c2 where c1 and c2 are also channels of type C. It does not follow from this theory, however, that C is equivalent to some composition C1 ∘ C2. As a result, the construction of models as equivalence classes of expressions will not work in the general case. Still, completeness can be established.

Theorem 2.10 Given any theory T, every logical consequence of T is a theorem of T in the above calculus. Indeed, there is a single model M in which T is valid but which falsifies all sequents not provable from T.

The proof is given in Section 4.1.
2.2 The single-sorted Gentzen system G
In this section we develop a Gentzen calculus for the single-sorted language. We first want to know what it means for a sequent A1, ..., An ⊢ B to be valid. However, as it turns out, there are two reasonable notions, having to do with the two different functions a given object might assume: that of a site or that of a channel. These two functions give us two distinct notions of what it means for a sequent to be valid in a model. Consider, for example, the sequents:
    A, A → B ⊢ B
    A → B, B → C ⊢ A → C
While these both look quite reasonable at first sight, a second's thought shows that they are only reasonable under different interpretations of ⊢. The first is valid if what ⊢ means is that if s is a site of type A and c is a channel that connects s to t and c is of type A → B, then t is of type B. The second is valid if what it means is that if c and d are channels of type A → B and B → C respectively, then c ∘ d is of type A → C. This suggests that for each sequence Γ of types and each expression A, we distinguish two sequents, an "s-sequent" Γ ⊢ A and a "c-sequent" Γ | A. (An alternative system not making such a distinction is discussed in the Appendix.)
Definition 2.11 An s-sequent A, C1, ..., Cn ⊢ B is valid in a model M if and only if for every information chain

    s --c1--> t1 --c2--> t2 --c3--> ... --cn--> tn,

if s ⊨ A and ci ⊨ Ci for each i, then tn ⊨ B.
A c-sequent C1, ..., Cn | C is valid in M if for every sequence c1, ..., cn of channels, if ci ⊨ Ci for each i, then (c1 ∘ ... ∘ cn) ⊨ C. With this definition of validity, the first and fourth of the following are valid, whereas the middle two will be invalid.
    A, A → B ⊢ B
    A, A → B | B
    A → B, B → C ⊢ A → C
    A → B, B → C | A → C
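The difference between the two readings can be checked mechanically in a small model. The sketch below (Python; the network, the satisfaction clause for arrow types, and all names are our own illustrative assumptions, not the paper's) builds a toy single-sorted network whose composition is the constant map to a fresh object and whose signaling relation contains no two-step paths, so the compatibility condition between signaling and composition holds vacuously. In this one model the s-sequent readings above come out as stated, and the two middle sequents are falsified.

```python
from functools import reduce
from itertools import product

# A toy single-sorted information network (our construction):
# objects double as sites and channels; x o y = 'e' for all x, y (associative),
# and the signaling relation has no chainable two-step paths.
N = ['s', 't', 'u', 'v', 'c', 'e']
compose = lambda x, y: 'e'
signals = {('s', 'c', 't'), ('u', 't', 'v')}    # (source, channel, target)

atoms = {'A': {'u'}, 'B': set(), 'C': set()}

def sat(x, T):
    """x |= T, with the (assumed) clause for arrow types:
    x |= A -> B  iff  every u --x--> v with u |= A has v |= B."""
    if isinstance(T, str):
        return x in atoms[T]
    _, A, B = T                                  # T == ('->', A, B)
    return all(sat(v, B) for (u, ch, v) in signals if ch == x and sat(u, A))

def s_valid(types, succ):
    """Definition 2.11: check A, C1, ..., Cn |- B over all chains."""
    site, chans = types[0], types[1:]
    def chains(start, cs):
        if not cs:
            yield start
            return
        for (u, ch, v) in signals:
            if u == start and sat(ch, cs[0]):
                yield from chains(v, cs[1:])
    return all(sat(end, succ)
               for x in N if sat(x, site)
               for end in chains(x, chans))

def c_valid(types, succ):
    """The c-sequent reading: check C1, ..., Cn | C over channel sequences."""
    return all(sat(reduce(compose, cs), succ)
               for cs in product(N, repeat=len(types))
               if all(sat(ch, T) for ch, T in zip(cs, types)))

AB, BC, AC = ('->', 'A', 'B'), ('->', 'B', 'C'), ('->', 'A', 'C')
print(s_valid(['A', AB], 'B'))    # first sequent: holds in this model
print(c_valid(['A', AB], 'B'))    # second: falsified (take c1 = u, c2 = s)
print(s_valid([AB, BC], AC))      # third: falsified (the chain s --c--> t)
print(c_valid([AB, BC], AC))      # fourth: holds in this model
```

Of course a single model can only falsify sequents; that the first and fourth hold in all models is what the soundness theorem below guarantees.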
Besides axiomatizing the set of valid sequents, we also want to axiomatize the notion of logical consequence between sequents, which we now define in the natural way.
Definition 2.12 A theory T is a set of sequents. A sequent S is a logical consequence of a theory T if S is valid in every model M in which all the sequents in T are valid. We write this as ⊨T S. Notice that every theory in our language is consistent, in virtue of Example 14. Thus axiomatizing the notion of consequence cannot be reduced to the problem of consistency.
Example 15 Recall that a Lambek network is one where the signaling relation is the same as the composition operation thought of as a three-place relation. The following are valid in every Lambek network, for all expressions A and B:
    A # B ⊢ A ∘ B        A ∘ B ⊢ A # B
We call the set of all such sequents the Lambek theory. In a theory which includes the Lambek theory, the distinction between # and ∘ is lost. We are now ready to present the Gentzen system for our language. The system is a refinement of the Lambek calculus.
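Before turning to the proof system, the claim of Example 15 can be spot-checked in a concrete Lambek network. A sketch (Python; the network, the satisfaction clauses for # and ∘, and the type extensions are our own illustrative assumptions): we take the cyclic group Z4 with composition x ∘ y = (x + y) mod 4 and signaling x --y--> z iff x ∘ y = z, and verify that # and ∘ then classify exactly the same objects.

```python
# A tiny Lambek network (our example): Z4 under addition mod 4, with
# signaling identified with composition:  x --y--> z  iff  x o y = z.
N = range(4)
compose = lambda x, y: (x + y) % 4
signals = lambda x, y, z: compose(x, y) == z

atoms = {'A': {1}, 'B': {2}}

def ext(T):
    """Extension of a type, using the (assumed) clauses:
    t |= A # C  iff some s --c--> t has s |= A and c |= C;
    c |= A o B  iff c = c1 o c2 with c1 |= A and c2 |= B."""
    if isinstance(T, str):
        return atoms[T]
    op, A, B = T
    if op == '#':
        return {t for s in N for c in N for t in N
                if signals(s, c, t) and s in ext(A) and c in ext(B)}
    return {compose(c1, c2) for c1 in ext(A) for c2 in ext(B)}   # op == 'o'

for A in atoms:
    for B in atoms:
        assert ext(('#', A, B)) == ext(('o', A, B))
print(ext(('#', 'A', 'B')))   # the objects of type A # B (= A o B)
```

Since signaling just is composition here, the two defining conditions coincide, which is exactly why the distinction between # and ∘ collapses under the Lambek theory.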
(identity)
    A ⊢ A
(# L)  [and]                      (# R)
    A, B, Γ ⊢ C                       Γ ⊢ A    Δ | C
    ────────────                      ───────────────
    A # B, Γ ⊢ C                       Γ, Δ ⊢ A # C

(→ L)  [to]                       (→ R)
    Γ ⊢ A    B, Δ ⊢ C                 A, Γ ⊢ C
    ──────────────────                ─────────
    Γ, A → B, Δ ⊢ C                   Γ | A → C

(⇐ L)  [given]                    (⇐ R)
    Γ | C    A, Δ ⊢ B                 Γ, A ⊢ B
    ──────────────────                ─────────
    A ⇐ C, Γ, Δ ⊢ B                   Γ ⊢ B ⇐ A

(∘ L)c  [and then]                (∘ R)
    Γ, A, B, Δ | C                    Γ | A    Δ | B
    ───────────────                   ───────────────
    Γ, A ∘ B, Δ | C                   Γ, Δ | A ∘ B

(∘ L)s
    Γ, A, B, Δ ⊢ C
    ───────────────  (Γ ≠ ∅)
    Γ, A ∘ B, Δ ⊢ C
Notice that when the sequence Γ = ⟨A⟩ is of length one, then the sequents A ⊢ B and A | B are semantically equivalent, in that one holds in a model if and only if the other does, since they both say that everything of type A is also of type B. These are not derivable from the rules presented so far. Thus, to the above rules, we also need to add a rule that tells us that in the case where the sequence on the left of the sequent is a single type, these two notions coincide. That is, we need the rules
(Trivial)
    A ⊢ B        A | B
    ──────       ──────
    A | B        A ⊢ B
We also include Cut as a basic rule, in three forms:

    Γ ⊢ A    A, Δ ⊢ B        Γ | A    Δ, A, Δ′ ⊢ B        Γ | A    Δ, A, Δ′ | B
    ──────────────────       ─────────────────────        ─────────────────────
        Γ, Δ ⊢ B                 Δ, Γ, Δ′ ⊢ B                 Δ, Γ, Δ′ | B
Soundness, cut elimination and completeness

It is routine to check that the above rules are sound, in the sense that if the premises of a rule are valid in a model, so is the conclusion. Hence, any sequent that is provable from a theory is a logical consequence of that theory. In the case of the empty theory, the Cut rule is not needed for the proof of completeness given below, so the rule of Cut could, in this case, be eliminated. One can also prove a cut-elimination result directly. We now briefly discuss completeness of the system G.

Definition 2.13 A model M of a theory T is a characteristic model of T if every sequent that is valid in M is provable from T.
As an immediate consequence of the soundness of our system, we note that if M is a characteristic model of T then the sequents which are valid in M are exactly those sequents which are provable from T . The completeness of our Gentzen system is an immediate consequence of the following result:
Theorem 2.14 Every theory has a characteristic model.

Corollary 2.15 (Completeness) A sequent S is a logical consequence of a theory T iff it is provable from T in the above system.
The proof also shows the following.
Theorem 2.16 Every extension of the Lambek theory has a characteristic model whose network is a Lambek network.
The proof of Theorem 2.14 is similar to that given in Section 4.2 for the relativized two-sorted system. It was reported in some detail in [10]. An earlier version of our results was weaker in that we had to resort to networks where composition was multiple-valued. The recent proof of the Completeness Theorem for the Lambek Calculus, relative to the relational semantics, due to [1], inspired the proof of this stronger result.⁹
3 Propositions as information units

A logic of information flow should allow us to represent the validity of things like the following: if s is of type A and c is of type A → B and d is of type B → C and s --(c ∘ d)--> t, then t is of type C. In the previous calculi, this was captured at the level of types, in terms of the sequent A, A → B, B → C ⊢ C. The sites and channels were kept implicit. In this section we examine a logic where the units of information are items of the form [x : T], where x is a site or channel and T is a type. In this calculus, the above would be represented by means of the inference from the set (not sequence) of premises [s : A], [c : (A → B)], [d : (B → C)] to the conclusion [t : C], provided s --(c ∘ d)--> t.

⁹ Jerry Seligman has constructed a clever alternate proof of our result, one which derives it from the completeness of the system in [1]. This proof is not yet written down, however.
There are two more choices to make in setting up our calculus. One choice is whether to make it two-sorted or single-sorted. We opt for the single-sorted, as it is more expressive. The other choice is whether to consider the logic of a fixed information network N, or to allow for arbitrary networks.¹⁰ The situation is analogous to looking at the intended model of Peano arithmetic, using the ω-rule, or looking at arbitrary models, using just the standard finitary rules of inference. The former is more powerful, but the power is bought at the expense of infinitary rules of inference. We will see that the same applies here. We begin with the study of a fixed network simply because it is a little more elegant.
3.1 N-logic, for a fixed network N
Let N = ⟨N, C, --→, ∘⟩ be a fixed information network. For each type A of L and each site or channel s, we construe the pair [s : A] as a "proposition." Given any model M = ⟨N, f⟩ on N, we say that [s : A] is true in M, denoted by M ⊨ [s : A], if s ⊨M A in the sense defined earlier. One of the advantages of the propositions-as-information-units approach is that we can handle the ordinary connectives and quantifiers in a routine and familiar manner. Since we are going to need infinitary rules of proof in any case, we may as well admit infinite conjunctions and disjunctions as well. Thus we think of the propositions of the form [s : A] as atomic propositions and close the set of propositions under the following:

If p is a proposition, so is ¬p. ¬p is true in M if and only if p is false in M.

If Φ is a set of propositions, then ⋀Φ is a proposition. It is true if and only if each p ∈ Φ is true in M.

We call this class of propositional expressions L∞(N). We say that a proposition p of L∞(N) is valid if it is true in all models (which are, recall, of the form ⟨N, f⟩). By a theory in L∞(N) we mean a set of propositions. A proposition p is a logical consequence of the theory T if and only if p is true in every model in which all the sentences in T are true. It is this notion of logical consequence that we want to illustrate and capture.

¹⁰ The same choice should have been available to us in the previous section where we took types as information units. We could have tried to axiomatize the sequents valid in all models based on a fixed information network. We have not investigated this case at all and, at present, it is not clear to us how to do this.
We define ⋁Φ to be ¬⋀{¬p | p ∈ Φ}. We write p ⊃ q for ¬p ∨ q (i.e., for ⋁{¬p, q}) and p ≡ q for (p ⊃ q) ∧ (q ⊃ p). We use ∀x p(x) as an informal abbreviation for a conjunction over all sites and channels, ⋀{p(x/s) | s ∈ N}, and ∃x p(x) as the corresponding disjunction.
Example 16 Suppose we are interested in the validity of s-sequents and c-sequents of the single-sorted calculus language L of the previous section, but over the fixed network N, rather than over arbitrary networks. We can identify each such sequent with a single proposition in the current language. For example, the sequent A, C1, C2 ⊢ B is valid in an N-model M = ⟨N, f⟩ if and only if the following proposition is true in M:

    ⋀_{s --c1--> r --c2--> t}  ( ([s : A] ∧ [c1 : C1] ∧ [c2 : C2]) ⊃ [t : B] )
Notice that the conjunction is determined entirely by the network structure. We can similarly identify a c-sequent C1, C2, C3 | C with the proposition:

    ⋀_{c = c1 ∘ c2 ∘ c3}  ( ([c1 : C1] ∧ [c2 : C2] ∧ [c3 : C3]) ⊃ [c : C] )
Thus by obtaining a complete calculus for the framework, we will also be obtaining one for the earlier calculus with a xed network interpretation.
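For a finite network these conjunctions are finite, so the truth of the encoding proposition can be checked mechanically. A sketch (Python; the toy network, the tuple representation of propositions, and the one-channel simplification are our own assumptions):

```python
# Evaluating propositions of the fixed-network logic over a finite network.
# Propositions: ('atom', x, T) for [x : T]; ('not', p); ('and', [ps]).
N = ['s', 't', 'c']
signals = {('s', 'c', 't')}                      # the single chain s --c--> t
atoms = {'A': {'s'}, 'C1': {'c'}, 'B': {'t'}}

def sat(x, T):                                    # x |= T, atomic types only
    return x in atoms[T]

def true(p):
    """Truth of a proposition in the model <N, f>."""
    kind = p[0]
    if kind == 'atom':
        return sat(p[1], p[2])
    if kind == 'not':
        return not true(p[1])
    return all(true(q) for q in p[1])             # kind == 'and'

def imp(p, q):                                    # p > q, i.e. not(p and not q)
    return ('not', ('and', [p, ('not', q)]))

def encode_s_sequent(A, C1, B):
    """The conjunction, over all chains s --c--> t in the network, of
       ([s : A] and [c : C1]) > [t : B]  --  the coding of  A, C1 |- B."""
    return ('and', [imp(('and', [('atom', s, A), ('atom', c, C1)]),
                        ('atom', t, B))
                    for (s, c, t) in signals])

print(true(encode_s_sequent('A', 'C1', 'B')))     # the sequent holds here
```

As the text notes, the shape of the conjunction depends only on the network; the model ⟨N, f⟩ enters only through the truth of the atomic units.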
Example 17 Suppose our network N is the network of strings over some alphabet A of symbols, as in Example 6. We will show how we can formulate a grammatical theory in this language. Suppose our basic types are the symbols S, NP, VP, IV, TV, Det, N, Comp. A model is an assignment of basic expressions to each of these types. Suppose we wanted to express the context-free grammar

    S ::= NP VP
    VP ::= TV NP | IV
    NP ::= Det N
    N ::= N Comp S

We can identify each of these rules with a sentence in our language. For example, the first is captured by the following proposition:

    ⋀_{γ ∈ A*}  ( [γ : S]  ≡  ⋁_{γ = α·β} ([α : NP] ∧ [β : VP]) )
This sentence says that a string γ is of type S if and only if it is a concatenation of strings α and β which are of types NP and VP respectively. Similarly for the other production rules. If we take these as axioms, then the following will be logical consequences of our theory:

    NP ⊢ S ⇐ VP
    NP ⊢ TV → VP
    NP, TV, NP ⊢ S

Here we are using the coding of sequents mentioned earlier, so that the first of these is really

    ∀x ([x : NP] ⊃ [x : (S ⇐ VP)])

This says that every noun phrase is also something that combines with a VP on its right to give an S. The second says that every NP combines with a TV on its left to give a VP. The last says that every sequence consisting of an NP followed by a TV followed by another NP is an S.

By an L∞(N)-sequent, we mean a pair ⟨Γ, Δ⟩ of finite sets of propositions. As before, we write Γ ⊢ Δ for the sequent.¹¹ We say that a sequent is N-valid if for every model M = ⟨N, f⟩ on N, if every proposition in Γ is true, then at least one proposition in Δ is true. We say that a sequent is a consequence of a theory T if it is valid in every model of T. As usual, in displaying a sequent, we leave off the set parentheses on its antecedent and succedent. For example, if s --(c ∘ d)--> t then the sequent

    [s : A], [c : (A → B)], [d : (B → C)] ⊢ [t : C]

is valid. We write Γ, p for Γ ∪ {p}, and so forth.
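Returning to the grammar of Example 17: over the string network, membership in each type can be computed by a straightforward recursion on splittings of the string. A sketch (Python; the lexicon and the sample words are our own invented illustration):

```python
from functools import lru_cache

# A toy lexicon (our invention) assigning words to basic types.
lexicon = {
    'the': {'Det'}, 'dog': {'N'}, 'cat': {'N'},
    'barks': {'IV'}, 'chases': {'TV'}, 'that': {'Comp'},
}

# The production rules of Example 17: a string has the type on the left
# iff it decomposes as a concatenation matching some right-hand side.
rules = {
    'S':  [('NP', 'VP')],
    'VP': [('TV', 'NP'), ('IV',)],
    'NP': [('Det', 'N')],
    'N':  [('N', 'Comp', 'S')],
}

@lru_cache(maxsize=None)
def has_type(s, T):
    """s is a tuple of words; decides s |= T in the string network."""
    if len(s) == 1 and T in lexicon.get(s[0], set()):
        return True
    return any(matches(s, rhs) for rhs in rules.get(T, []))

def matches(s, rhs):
    if len(rhs) == 1:
        return has_type(s, rhs[0])
    # split off a nonempty proper prefix for the first type, recurse on the rest
    return any(has_type(s[:i], rhs[0]) and matches(s[i:], rhs[1:])
               for i in range(1, len(s)))

print(has_type(('the', 'dog', 'barks'), 'S'))                   # True
print(has_type(('the', 'dog', 'chases', 'the', 'cat'), 'S'))    # True
print(has_type(('dog', 'the'), 'NP'))                           # False
```

The third query illustrates the order-sensitivity of the string network: reversing a Det–N concatenation does not yield an NP, just as the sequent calculus lacks Permutation.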
The Gentzen system G for L∞(N)
The system we present is an elaboration of the standard Gentzen calculus for infinitary logic first presented in Lopez-Escobar [26].

¹¹ In the examples above, we were coding our earlier sequents as propositions in this language. We are now introducing a new kind of sequent, but using the same ⊢ notation. This could be confusing. From now on, the ⊢ symbol is used to indicate sequents in the sense just introduced.
Definition 3.1 A sequent Γ ⊢ Δ is derivable from T in G if it is in the smallest collection containing the logical and nonlogical axioms listed below and closed under the rules indicated below. We sometimes write Γ ⊢T Δ to indicate that the sequent Γ ⊢ Δ is derivable from T.

(Nonlogical axioms) If Γ ∩ T ≠ ∅, then Γ ⊢ Δ

(Identity) If Γ ∩ Δ ≠ ∅, then Γ ⊢ Δ

(Cut)
    Γ ⊢ Δ, p    Γ, p ⊢ Δ
    ─────────────────────
           Γ ⊢ Δ
The next set of rules is needed to handle the atomic propositions of our language, those of the form [s : A] for some type A. Since these types are themselves complex, we need rules of derivation that relate them.

(→ L)
    Γ ⊢ Δ, [s : A]    Γ, [t : B] ⊢ Δ
    ─────────────────────────────────   provided s --c--> t
         Γ, [c : (A → B)] ⊢ Δ

(→ R)
    Γ, [s : A] ⊢ Δ, [t : B]   for every s, t such that s --c--> t
    ─────────────────────────────────────────────────────────────
         Γ ⊢ Δ, [c : (A → B)]

(# L)
    Γ, [s : A], [c : C] ⊢ Δ   for every s, c such that s --c--> t
    ─────────────────────────────────────────────────────────────
         Γ, [t : (A # C)] ⊢ Δ

(# R)
    Γ ⊢ Δ, [s : A]    Γ ⊢ Δ, [c : C]
    ─────────────────────────────────   provided s --c--> t
         Γ ⊢ Δ, [t : (A # C)]

(⇐ L)
    Γ ⊢ Δ, [c : C]    Γ, [t : A] ⊢ Δ
    ─────────────────────────────────   provided s --c--> t
         Γ, [s : (A ⇐ C)] ⊢ Δ

(⇐ R)
    Γ, [c : C] ⊢ Δ, [t : A]   for every c, t such that s --c--> t
    ─────────────────────────────────────────────────────────────
         Γ ⊢ Δ, [s : (A ⇐ C)]

(∘ L)
    Γ, [d : A], [e : B] ⊢ Δ   for every d, e such that d ∘ e = c
    ────────────────────────────────────────────────────────────
         Γ, [c : (A ∘ B)] ⊢ Δ

(∘ R)
    Γ ⊢ Δ, [d : A]    Γ ⊢ Δ, [e : B]
    ─────────────────────────────────   provided d ∘ e = c
         Γ ⊢ Δ, [c : (A ∘ B)]
From the point of view of propositional logic, each of these rules involves atomic formulas. This is a rather novel feature of this system. Note that the quantifiers, for example in the right introduction rule for →, are in the metalanguage. In other words, to apply this rule one needs to prove

    Γ, [s : A] ⊢ Δ, [t : B]

for all s and t such that s --c--> t, so as to conclude that

    Γ ⊢ Δ, [c : (A → B)]

Finally, we come to the usual rules for infinitary logic.

(¬ L)                             (¬ R)
    Γ ⊢ Δ, p                          Γ, p ⊢ Δ
    ──────────                        ──────────
    Γ, ¬p ⊢ Δ                         Γ ⊢ Δ, ¬p

(⋀ L)
    Γ, p ⊢ Δ
    ──────────   provided p ∈ Φ
    Γ, ⋀Φ ⊢ Δ

(⋀ R)
    Γ ⊢ Δ, p   for each p ∈ Φ
    ─────────────────────────
    Γ ⊢ Δ, ⋀Φ
Theorem 3.2 (Soundness) The above rules are sound. That is, for any theory T, if Γ ⊢T Δ in G then the sequent Γ ⊢ Δ is a consequence of T.

Proof. The proof is a routine induction. □

In general, the rules are not complete. This follows from the well-known incompleteness of infinitary propositional logic. However, if we restrict ourselves to structures N that are finite or countable, and to propositions that are themselves hereditarily countable, then the system is complete.
Definition 3.3 A fragment of L∞(N) is a set LA(N) of formulas of L∞(N) containing all the basic units [s : A], closed under ¬ and finite conjunctions, and such that if q is a subformula of a formula p ∈ LA(N) then q ∈ LA(N). Notice that if LA(N) is a fragment of L∞(N) then the cardinality of the network N is less than or equal to that of LA(N), since every unit [s : A] ∈ LA(N). Hence, if LA(N) is a countable fragment, then the network N is finite or countably infinite.
Theorem 3.4 (Completeness of G) Let T be a theory formulated in a countable fragment LA(N) and let Γ ⊢ Δ be a sequent all of whose formulas are in LA(N). If Γ ⊢ Δ is a consequence of T then Γ ⊢ Δ is derivable in G, with a derivation all of whose formulas belong to the same fragment.
The proof will appear in Section 4.
3.2 A variable network logic G with propositions as units
If we want a deductive calculus with finitary rules of inference, then we must give up talking about a particular site structure. In this section we present a finitary language Lω,ω(Nets) for talking about information flow over arbitrary networks. Again we start with the language L of types from the previous section. We treat these types as atomic predicate symbols. We add an infinite supply of variables v0, v1, v2, ... and a binary function symbol ∘. Using these, we form the set of terms in the standard way. We also introduce a 3-place relation symbol --→ and the identity symbol =. The atomic formulas of Lω,ω(Nets) consist of expressions of the following forms:

    t1 --t2--> t3,    t1 = t2,    [t : A]
where t, t1, t2, t3 are arbitrary terms and A is a type expression of L. The set of formulas of Lω,ω(Nets) is defined to be the smallest set containing the above atomic expressions, closed under the following:

if p is in Lω,ω(Nets), then ¬p is in Lω,ω(Nets).

if Φ is a finite subset of Lω,ω(Nets) then ⋀Φ is in Lω,ω(Nets). If Φ = {p, q} we write p ∧ q for ⋀Φ.

if p is in Lω,ω(Nets), then for each i, ∀vi p is in Lω,ω(Nets).

In the propositional logic L∞(N), ∀x p(x) was really an abbreviation for an infinite conjunction. Here it is taken as a basic operator, ranging over whatever sites and channels there are in a given network. Disjunction is defined in terms of ∧ and ¬, while ∃ is defined in terms of ∀ and ¬. Again we define (p ⊃ q) to be ¬p ∨ q, and (p ≡ q) to be (p ⊃ q) ∧ (q ⊃ p).

We define what it means for a sentence to be true in a model in the standard manner. First, if M is a model for L and g is a function from variables into M, then we define tᵍ for each term t by: viᵍ = g(vi) and (t1 ∘ t2)ᵍ = t1ᵍ ∘ t2ᵍ. (Note that the ∘ on the left is that of the language, while the ∘ on the right is in our metalanguage.) Next, we say what it means for g to satisfy p in M, written M ⊨ p[g], as follows:

if p is t1 --t2--> t3, then M ⊨ p[g] iff t1ᵍ --t2ᵍ--> t3ᵍ.

if p is (t1 = t2), then M ⊨ p[g] iff t1ᵍ = t2ᵍ.

if p is [t : A], then M ⊨ p[g] iff tᵍ ⊨M A in the sense defined in the previous section.

the rules for ¬, ∧ and ∀ are as usual.

Example 18 In Example 16, we showed how s-sequents and c-sequents of the single-sorted calculus language L could be written as sentences of the language L∞(N), as long as we were interested only in models based on the network N. If we are interested in all networks, then we can code them as sentences in Lω,ω(Nets), in a similar manner. For example, the sequent A, C1, C2 ⊢ B can be expressed by the following sentence:

    ∀x, y, z, w [ (x --(y ∘ z)--> w ∧ [x : A] ∧ [y : C1] ∧ [z : C2]) ⊃ [w : B] ]

Thus we can consider the sequents of languages L and L2 as sentences of the present language. However, we do not see how to obtain the completeness of the systems G and G from the completeness of the system given below. Indeed, the completeness proof for the earlier systems seems considerably more subtle than that given for this system.

A sequent of Lω,ω(Nets) is a pair ⟨Γ, Δ⟩ of finite sets of formulas of Lω,ω(Nets). We write a sequent as Γ ⊢ Δ. Such a sequent holds in M for a given site-assignment g, denoted by ⊨M,g Γ ⊢ Δ, if whenever M ⊨ p[g] for every p ∈ Γ, then there is a q ∈ Δ such that M ⊨ q[g]. The sequent is valid in the model just in case it holds for every site-assignment g. A rule is sound in a model M if it leads from premises valid in M to conclusions also valid in M. As usual, we say that a sequent is valid, denoted by ⊨ Γ ⊢ Δ, if it is valid in every model. A theory T of Lω,ω(Nets) is just a set of sentences. A sequent is a consequence of T if it is valid in every model of T.
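The clauses for term evaluation and atomic satisfaction can be prototyped directly. A minimal sketch (Python; the stand-in network with integer addition as composition, and the tuple encodings, are our own assumptions):

```python
# Terms of the first-order language: ('var', i) or ('comp', t1, t2).
# Stand-in network (our choice): composition is integer addition and
# x --y--> z iff x o y = z.
compose = lambda x, y: x + y
signals = lambda x, y, z: compose(x, y) == z

atoms = {'A': {3}}                            # a sample atomic type extension

def evaluate(t, g):
    """t^g: v_i^g = g(i) and (t1 o t2)^g = t1^g o t2^g."""
    if t[0] == 'var':
        return g[t[1]]
    return compose(evaluate(t[1], g), evaluate(t[2], g))

def sat_atomic(p, g):
    """M |= p[g] for the three kinds of atomic formulas."""
    if p[0] == 'signals':                     # t1 --t2--> t3
        return signals(*(evaluate(t, g) for t in p[1:]))
    if p[0] == 'eq':                          # t1 = t2
        return evaluate(p[1], g) == evaluate(p[2], g)
    return evaluate(p[1], g) in atoms[p[2]]   # ('type', t, A), atomic A only

g = {0: 1, 1: 2, 2: 3}
t = ('comp', ('var', 0), ('var', 1))          # the term v0 o v1
print(evaluate(t, g))                                                   # 3
print(sat_atomic(('signals', ('var', 0), ('var', 1), ('var', 2)), g))   # True
print(sat_atomic(('type', t, 'A'), g))                                  # True
```

The point of the parenthetical remark in the text is visible in `evaluate`: the `'comp'` tag is the ∘ of the object language, while the Python `compose` is the ∘ of the metalanguage.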
The Gentzen system G

Definition 3.5 Given a theory T, a sequent is derivable from T in G if it is in the smallest set of sequents containing the axioms below and closed under the rules of inference given below.
(Logical Axioms)
    Γ, p ⊢ Δ, p

(Identity Axiom)
    Γ ⊢ Δ, (s = s)

(Associativity)
    Γ ⊢ Δ, a ∘ (b ∘ c) = (a ∘ b) ∘ c

(Nonlogical Axioms) If p ∈ T, then Γ ⊢ Δ, p

(Cut)
    Γ ⊢ Δ, u    Γ, u ⊢ Δ
    ─────────────────────
           Γ ⊢ Δ
The following are the rules for the atomic formulas. We use x, y, z to stand for arbitrary but distinct variables and s, t, c, d, e to stand for arbitrary terms. In the restrictions accompanying some of the rules, S stands for the conclusion of the rule and var(S) for the set of free variables of S.

(→ L)
    Γ ⊢ Δ, s --c--> t    Γ ⊢ Δ, [s : A]    Γ, [t : B] ⊢ Δ
    ──────────────────────────────────────────────────────
                  Γ, [c : A → B] ⊢ Δ

(→ R)
    Γ, x --c--> y, [x : A] ⊢ Δ, [y : B]
    ────────────────────────────────────   x, y ∉ var(S)
           Γ ⊢ Δ, [c : A → B]

(# L)
    Γ, x --y--> t, [x : A], [y : C] ⊢ Δ
    ────────────────────────────────────   x, y ∉ var(S)
           Γ, [t : (A # C)] ⊢ Δ

(# R)
    Γ ⊢ Δ, s --c--> t    Γ ⊢ Δ, [s : A]    Γ ⊢ Δ, [c : C]
    ──────────────────────────────────────────────────────
                  Γ ⊢ Δ, [t : (A # C)]

(⇐ L)
    Γ ⊢ Δ, s --c--> t    Γ ⊢ Δ, [c : C]    Γ, [t : A] ⊢ Δ
    ──────────────────────────────────────────────────────
                  Γ, [s : (A ⇐ C)] ⊢ Δ

(⇐ R)
    Γ, s --x--> y, [x : C] ⊢ Δ, [y : A]
    ────────────────────────────────────   x, y ∉ var(S)
           Γ ⊢ Δ, [s : (A ⇐ C)]

(∘ L)
    Γ, [x : A], [y : B], x ∘ y = c ⊢ Δ
    ───────────────────────────────────   x, y ∉ var(S)
           Γ, [c : (A ∘ B)] ⊢ Δ

(∘ R)
    Γ ⊢ Δ, [a : A]    Γ ⊢ Δ, [b : B]
    ─────────────────────────────────
        Γ ⊢ Δ, [(a ∘ b) : A ∘ B]

(∘ l)
    Γ, s --a--> x, x --b--> t ⊢ Δ
    ──────────────────────────────   x ∉ var(S)
      Γ, s --(a ∘ b)--> t ⊢ Δ

(∘ r)
    Γ ⊢ Δ, s --c--> r    Γ ⊢ Δ, r --d--> t
    ───────────────────────────────────────
           Γ ⊢ Δ, s --(c ∘ d)--> t

(Substitution)
    Γ, p(s) ⊢ Δ, q(s)
    ─────────────────────
    Γ, E, p(t) ⊢ Δ, q(t)

In the rule of Substitution, E is s = t or t = s, p(s), q(s) are expressions in which s occurs, and p(t), q(t) are obtained by replacing one or more occurrences of s by t. In addition, we need the usual Gentzen rules for the operators ¬, ∧, and ∀.
(¬ L)                             (¬ R)
    Γ ⊢ Δ, p                          Γ, p ⊢ Δ
    ──────────                        ──────────
    Γ, ¬p ⊢ Δ                         Γ ⊢ Δ, ¬p

(∧ L)
    Γ, p ⊢ Δ
    ──────────   provided p ∈ Φ
    Γ, ∧Φ ⊢ Δ

(∧ R)
    Γ ⊢ Δ, p   for each p ∈ Φ
    ─────────────────────────
    Γ ⊢ Δ, ∧Φ

(∀ L)                             (∀ R)
    Γ, p(t) ⊢ Δ                       Γ ⊢ Δ, p(x)
    ──────────────                    ──────────────   x ∉ var(S)
    Γ, ∀x p(x) ⊢ Δ                    Γ ⊢ Δ, ∀x p(x)
Theorem 3.6 (Soundness) The system G is sound for Lω,ω(Nets). That is, if Γ ⊢T Δ then the sequent Γ ⊢ Δ is a consequence of T.

This is shown by induction on proofs, which amounts to verifying that the axioms and rules are valid. □

For completeness, our argument is a variation of the argument for a fixed information network. Notice that as a corollary of this result, we see that the language Lω,ω(Nets) is compact: if T is inconsistent, then some finite subset of T is inconsistent, by the standard argument.
4 Completeness proofs

In this section we collect together the proofs of completeness of our four systems.
4.1 Completeness of the two-sorted calculus G
In this section we present the proof of completeness of the system G for the two-sorted language.
4.1.1 The pure case

We first sketch the proof of completeness in the pure case, Theorem 2.8, where there is no set T of nonlogical axioms to bother with. While this proof is superseded by the relativized version, it provides a useful warm-up for the other proofs to follow. We write A ≡ B if A and B are provably equivalent and [A] for the ≡-equivalence class of the type A. We define the model M as follows. For sites, we take the equivalence classes of well-formed types of sort s. For channels, we take the equivalence classes of well-formed types of sort c. To define the signaling relation, given A, B ∈ Exps and C ∈ Expc, define [A] --[C]--> [B] iff B ⊢ A # C. (It is easy to see that this is well-defined.) Composition is defined on equivalence classes in the natural way, that is, [A] ∘ [B] = [A ∘ B]. The interpretation function is defined on the atomic types by f(A) = {[B] | B ⊢ A}, where "B" ranges over well-formed types.
To see that this does indeed define an information network, we need to verify that composition behaves properly. Thus we need to prove the following:
Lemma 4.1 If [Z] = [X] ∘ [Y], then for all S, T:

    [S] --[Z]--> [T]   iff   there is an R such that [S] --[X]--> [R] --[Y]--> [T].

Proof. Suppose [S] --[Z]--> [T], that is, T ⊢ S # Z, and so T ⊢ S # (X ∘ Y). Now let R = (S # X). Then [S] --[X]--> [R]. Also, T ⊢ S # (X ∘ Y) ≡ (S # X) # Y ≡ R # Y, hence [R] --[Y]--> [T]. Conversely, if [S] --[X]--> [R] --[Y]--> [T], then R ⊢ S # X and T ⊢ R # Y. Hence T ⊢ (S # X) # Y ≡ S # (X ∘ Y) ≡ S # Z. □

To prove that this model invalidates any underivable sequent, we will first show that for all types A, B, [A] ⊨M B iff A ⊢ B. But we need a preliminary lemma in order to establish this. The reason we needed to prove the cut-elimination theorem was to be able to prove this lemma, which is a more general version of what we called the Decomposition Lemma earlier.
Lemma 4.2 Γ ⊢ A ∘ B iff there are A′, B′ such that A′ ⊢ A, B′ ⊢ B, and Γ∘ ≡ A′ ∘ B′.

Proof. The proof is a routine induction on cut-free derivations of theorems of the form Γ ⊢ A ∘ B. □

With this, we can prove the main lemma needed to establish completeness in the case where the set T of non-logical axioms is empty.
Lemma 4.3 For all types D and C, [D] |= C iff the sequent D ⊢ C is provable.
Proof. The proof is by induction on C. □

Proof of Theorem 2.8. The result follows easily from the above lemma. Suppose, for example, that A, C₁, ..., Cₙ ⊢ B is an s-sequent that is not derivable. Let s₀ = [A], s₁ = [A # C₁], ..., sₙ = [((A # C₁) ... # Cₙ)], let cᵢ = [Cᵢ], and let t = sₙ. This gives an information chain in the model M. By the lemma, s₀ |= A and cᵢ |= Cᵢ, but t ⊭ B. The case of c-sequents is similar. □
4.1.2 The relativized case
Let us fix a set T of sequents and prove completeness relative to T. We want to show that every unprovable sequent S can be falsified in a model M in which T is valid.¹²

The model we built in the previous section had two properties that bear comment. First, every site (and channel) s was definable: there was a sentence A such that for any sentence B, s |= B iff A ⊢ B is provable. The model M we construct in this section will also have that property. The model from the last section also had the property that for every sentence A there was a unique s (or c) defined by A. The model built here will not have this property, not by a long shot.

The model M will be of the form ⟨N, f⟩, where the network N is constructed as the limit of a sequence of "partial networks" Nₙ for n < ω. At each stage we will throw in at most one new site s (or channel c) and

¹² An earlier version of this paper had weaker completeness theorems for the systems G and G. In that version we had to resort to networks where composition was multiple-valued. The recent completeness theorem for the Lambek Calculus, relative to the relational semantics, due to H. Andréka and S. Mikulás, suggested that we should be able to improve our earlier results by getting models where composition was single-valued.
declare such an s to be labelled by some expression A. There will, in general, be many sites labelled by a given expression, not just one. Our aim, inspired by Lemma 4.3, is to make sure that for any type B, a site labelled by A is of type B if and only if A ⊢ B is provable from T. Thus, for example, if we label some site s by A # C, then we will make sure to throw in a site s′ labelled by A and a connection c labelled by C, and declare s′ ;c s. The principal obstacle to carrying out this construction involves composition. If we have labelled a channel c by some c-expression C and it happens that C ⊢_T A ∘ B is provable, then at some stage we need to throw in new channels c₀, c₁, label them by types A and B respectively, and define c₀ ∘ c₁ = c. The trick is to do this in a way that makes sure the final composition operation is associative. Toward this end, we need to think for a bit about partial binary operations, what it means for them to be associative, and how to make them total.
Partially associative operations. When we defined the notion of an information network, we required that the composition operation be total. To handle partiality, we recommended adding a nil channel u and defining c ∘ d = u when c ∘ d is undefined. In addition, we defined a ∘ u = u ∘ a = u for all a, including a = u. Just when does this trick work? That is, under what conditions on a partial binary operation ∘₀ can you add a new u with this multiplication and have the result be an associative operation?
Lemma 4.4 Let ∘₀ be a partial binary operation on some set A₀ and let A = A₀ ∪ {u}, where u is some element not in A₀. Extend ∘₀ to a total operation ∘ on A by the above definition. Then ∘ is associative if and only if ∘₀ satisfies the following condition:¹³ for all a, b, c ∈ A₀,

a ∘₀ b and (a ∘₀ b) ∘₀ c are defined iff both b ∘₀ c and a ∘₀ (b ∘₀ c) are defined, and (a ∘₀ b) ∘₀ c ≃ a ∘₀ (b ∘₀ c).

Proof. The necessity of the condition follows immediately from the fact that a ∘ u = u ∘ a = u for all a. The proof of sufficiency is straightforward, by examining the various cases that can arise in verifying (a ∘ b) ∘ c = a ∘ (b ∘ c). If (a ∘ b) ∘ c ≠ u, then the condition ensures that a ∘ (b ∘ c) is defined and equal to it. So

¹³ Here ≃ is Kleene equality: either both sides are undefined, or both are defined and identical.
let us suppose that (a ∘ b) ∘ c = u. We need to prove that a ∘ (b ∘ c) = u. But if a ∘ (b ∘ c) = d ≠ u, then the condition would give us a contradiction. □
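Lemma 4.4 is easy to check mechanically on small examples. The sketch below is illustrative Python, not from the paper; the helper names `u_extension` and `is_associative` are ours. It totalizes a partial operation by sending undefined products to a nil element u, then brute-forces associativity over all triples:

```python
from itertools import product

U = 'u'  # the nil channel

def u_extension(partial, atoms):
    """Totalize a partial operation: undefined products, and any
    product involving u, become u."""
    def op(x, y):
        if x == U or y == U:
            return U
        return partial.get((x, y), U)
    return op, atoms + [U]

def is_associative(op, dom):
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(dom, repeat=3))

# A partial operation satisfying the condition of Lemma 4.4:
good = {('a', 'a'): 'a'}
op1, dom1 = u_extension(good, ['a', 'b'])
print(is_associative(op1, dom1))   # True

# One violating it: a.b and (a.b).d are defined while b.d is not,
# so (a.b).d = e but a.(b.d) = u in the extension.
bad = {('a', 'b'): 'c', ('c', 'd'): 'e'}
op2, dom2 = u_extension(bad, ['a', 'b', 'c', 'd', 'e'])
print(is_associative(op2, dom2))   # False
```

The failing case shows concretely why the nil-channel trick alone is not enough, which is what motivates the weaker notion introduced next.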
Motivated by this lemma, one might be tempted to call a partial binary operation ∘₀ on a set A₀ associative if it satisfies the condition of Lemma 4.4. However, this notion is too strong. There is a weaker notion that is both more natural and is what we need for our proof of completeness.
Definition 4.5 A partial binary function ∘₀ on a set A₀ is said to be a partially associative operation on A₀ if there is a set A with A₀ ⊆ A and an associative operation ∘ on A such that ∘ is an extension of ∘₀.

Example 19 Here is an example of a partially associative operation that does not satisfy the condition of the previous lemma. Let A₀ = {a, b, c, d, e} and let ∘₀ be given by the two equations a ∘₀ b = d and d ∘₀ c = e. This is a partially associative operation. To see this, take A = A₀ and define ∘ to be the extension of ∘₀ satisfying x ∘ y = e for the pairs x, y for which ∘₀ is undefined. It is routine to check that this is an associative operation on A.
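The claims of Example 19 can be verified by brute force. In this illustrative Python sketch (our code, not the paper's), the dictionary `defined` is the partial operation ∘₀ and the default value e plays the role of the total extension:

```python
from itertools import product

A = ['a', 'b', 'c', 'd', 'e']
defined = {('a', 'b'): 'd', ('d', 'c'): 'e'}   # the partial operation of Example 19

def op(x, y):
    """The total extension: x.y = e whenever the partial operation is undefined."""
    return defined.get((x, y), 'e')

# The extension is associative, so the partial operation is partially associative.
assert all(op(op(x, y), z) == op(x, op(y, z)) for x, y, z in product(A, repeat=3))

# But the partial operation violates the condition of Lemma 4.4:
# a.b and (a.b).c are defined while b.c is not.
assert ('a', 'b') in defined and (defined[('a', 'b')], 'c') in defined
assert ('b', 'c') not in defined
print("Example 19 verified")
```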
Example 20 We also present an example of a binary operation that is not partially associative. Let A₀ = {a, b, c, d, e, f, g}, where these are all distinct, and let ∘₀ be given by the following equations:

b ∘₀ c = a    f ∘₀ g = c    b ∘₀ f = d    d ∘₀ g = e

This is not a partially associative operation; that is, there is no associative operation extending ∘₀. If there were, we could prove a = e:

a = b ∘ c = b ∘ (f ∘ g) = (b ∘ f) ∘ g = d ∘ g = e.

In order to prove the completeness theorem, we will be building up an information network in stages. At each stage, we need to make sure that the partial function which approximates our final composition operation
has not implicitly forced us to identify channels which are distinct. That is, we need to make sure that the operation is a partially associative operation. There is a standard construction in algebra that tells us when a partial binary function is partially associative. Given a partial operation ∘ on a set A, one takes the free semigroup on A (finite sequences over A under concatenation) and factors out by the smallest equivalence relation that identifies strings up to association and that identifies strings forced to be identical by ∘. Provided no elements of A are identified in this factorization, this construction gives one an associative extension of ∘. While this construction is quite standard, we need to go into it in a bit more detail, in order to define a notion we need in the main lemma for the completeness proof. So we now proceed to review the above construction in more detail.

Let A be a set with a partial binary operation ∘ defined on A. We want to characterize when ∘ is a partially associative operation on A. We write (xy) for the ordered pair ⟨x, y⟩. (We assume that no element of A is an ordered pair. If it were, we would use a different pairing function.) Let A* be the smallest set containing A and closed under ordered pairs. We use α, β, γ to range over A*. We define four relations on A*.

α expands to β if there are elements a, b, c ∈ A such that a ∘ b = c and β can be obtained from α by replacing one occurrence of c by the ordered pair (ab).

α contracts to β if there are elements a, b, c ∈ A such that a ∘ b = c and β can be obtained from α by replacing one occurrence of (ab) by c.

α regroups to β if there are elements α₁, α₂, α₃ ∈ A* such that β can be obtained from α by replacing one occurrence of one of the following by the other: ((α₁α₂)α₃), (α₁(α₂α₃)).

Finally, we say that α rewrites to β, written α ↝ β, iff α = β or there is a finite sequence α₀, ..., αₙ such that α = α₀, β = αₙ, and for each i < n, αᵢ₊₁ can be obtained from αᵢ by expansion, contraction, or regrouping.
Such a sequence is called a rewrite sequence. Notice that ↝ is symmetric on A*, since expansion and contraction are converses of one another and regrouping is symmetric. It is also reflexive and transitive, and hence an equivalence relation on A*.
Example 21 In Example 19, we have the following: (a(bc)) regroups to ((ab)c), which contracts to (dc), which contracts to e. Thus (a(bc)) ↝ e.

This shows us that in any extension of our operation to a total associative operation, we will have to have a ∘ (b ∘ c) = e.
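The rewrite relation ↝ can be implemented directly, which makes examples like these machine-checkable. Below is an illustrative Python sketch of our own (not the paper's): terms of A* are nested tuples, and a size bound keeps the search finite. It confirms that (a(bc)) ↝ e for the Example 19 operation, and that a ↝ e for the Example 20 operation, so any associative extension of the latter would force a = e:

```python
from collections import deque

def neighbors(t, op):
    """One-step rewrites of a term: expansion, contraction, or
    regrouping, applied at the root or inside a subterm."""
    if not isinstance(t, tuple):
        # expansion of an atom z to (x, y) whenever x.y = z
        return {(x, y) for (x, y), z in op.items() if z == t}
    l, r = t
    out = set()
    # contraction at the root: (x y) -> z when x.y = z
    if not isinstance(l, tuple) and not isinstance(r, tuple) and (l, r) in op:
        out.add(op[(l, r)])
    # regrouping at the root
    if isinstance(l, tuple):
        out.add((l[0], (l[1], r)))      # ((p q) r) -> (p (q r))
    if isinstance(r, tuple):
        out.add(((l, r[0]), r[1]))      # (p (q r)) -> ((p q) r)
    # rewrite inside a subterm
    out |= {(l2, r) for l2 in neighbors(l, op)}
    out |= {(l, r2) for r2 in neighbors(r, op)}
    return out

def size(t):
    return 1 if not isinstance(t, tuple) else size(t[0]) + size(t[1])

def rewrites_to(start, goal, op, max_size=8):
    """Bounded breadth-first search for a rewrite sequence start ~> goal."""
    seen, frontier = {start}, deque([start])
    while frontier:
        t = frontier.popleft()
        if t == goal:
            return True
        for n in neighbors(t, op):
            if n not in seen and size(n) <= max_size:
                seen.add(n)
                frontier.append(n)
    return False

# Example 19 operation; Example 21's rewrite (a(bc)) ~> e:
ex19 = {('a', 'b'): 'd', ('d', 'c'): 'e'}
print(rewrites_to(('a', ('b', 'c')), 'e', ex19))   # True

# Example 20 operation forces a ~> e:
ex20 = {('b', 'c'): 'a', ('f', 'g'): 'c', ('b', 'f'): 'd', ('d', 'g'): 'e'}
print(rewrites_to('a', 'e', ex20))                 # True
```

The size bound is safe for these examples because every step of the displayed rewrites stays within a handful of atoms; in general the bound makes the search incomplete but always terminating.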
Proposition 4.6 (Extension Lemma) For any structure ⟨A, ∘⟩, where ∘ is a partial binary operation on A, the following are equivalent:

1. For all a, b ∈ A, if a ↝ b then a = b.
2. ∘ is a partially associative operation on A.
3. There is an "initial" associative structure ⟨A′, ∘′⟩ extending ⟨A, ∘⟩. That is, ∘′ is an extension of ∘, it is total on A′ and associative, and for any other such ⟨A″, ∘″⟩ there is a unique homomorphism f from ⟨A′, ∘′⟩ into ⟨A″, ∘″⟩.

Proof. This proof is a rather standard argument. We include it for the sake of completeness. First, note that since (3) is a strengthening of (2), we need only prove that (2) implies (1) and (1) implies (3).

Let us first prove (2) implies (1). Thus, suppose that ⟨A′, ∘′⟩ is a total associative extension of ⟨A, ∘⟩. Define a function f : A* → A′ by induction as follows: f(a) = a for a ∈ A, and f((αβ)) = f(α) ∘′ f(β) for all α, β ∈ A*. It is clear that if α expands, contracts, or regroups to β, then f(α) = f(β). But then a routine induction on the length of rewrite sequences shows that if α ↝ β, then f(α) = f(β). Hence, if a, b ∈ A and a ↝ b, then a = f(a) = f(b) = b, as desired.

Now, let us prove (1) implies (3). Let us write [α] for the equivalence class of α with respect to the rewrite relation. We let
A′ = {[α] | α ∈ A*}.

We define an operation ∘′ on A′ by

[α] ∘′ [β] = [(αβ)].

To see that this is well defined, we need to show that if α rewrites to α′ and β rewrites to β′, then (αβ) rewrites to (α′β′). One shows this by first mimicking the rewriting of α to α′ within (αβ), and then mimicking the rewriting of β to β′.
We need to show that ∘′ is associative. This follows immediately from the regrouping aspect of rewriting:

([α] ∘′ [β]) ∘′ [γ] = [((αβ)γ)] = [(α(βγ))] = [α] ∘′ ([β] ∘′ [γ]).

We can identify each a ∈ A with [a] ∈ A′ if we can show that for all a, b ∈ A, [a] = [b] iff a = b. But this is just a restatement of (1). Hence we can consider ⟨A′, ∘′⟩ to be a total, associative extension of ⟨A, ∘⟩. The proof that it is initial among such extensions is similar to the proof of (2) implies (1). □
Example 22 If we apply this construction to the operation defined in Example 19, the resulting initial associative algebra is not the extension described there, but one with an infinite number of elements. Basically, it consists of all finite strings made out of a, b, c, d, and e, subject to the condition that ab is identified with d and dc with e. Thus, for example, all of a, aa, aaa, ... are distinct. The following is obvious but quite important in what follows.
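The initial algebra of Example 22 can be made concrete as strings under concatenation, normalized by the identifications ab = d and dc = e. The two rules have no overlaps, so repeatedly rewriting reaches a unique normal form. An illustrative Python sketch of ours (assuming this string presentation of the algebra):

```python
RULES = [('ab', 'd'), ('dc', 'e')]   # ab is identified with d, dc with e

def normalize(s):
    """Reduce a string over {a,b,c,d,e} to normal form.  The two
    rules have no overlaps, so the normal form is unique."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in RULES:
            if lhs in s:
                s = s.replace(lhs, rhs, 1)
                changed = True
    return s

def times(x, y):
    """Concatenate and normalize: the operation of the initial extension."""
    return normalize(x + y)

print(times('a', 'b'))    # 'd'
print(times('ab', 'c'))   # 'e'  (abc -> dc -> e)
# a, aa, aaa, ... are already in normal form, hence pairwise distinct:
print(normalize('a'), normalize('aa'), normalize('aaa'))
```

Since the two rules cannot overlap, concatenate-then-normalize is associative, and distinct normal forms such as a, aa, aaa, ... really are distinct elements of the algebra.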
Lemma 4.7 ∘ is a partially associative operation on A if and only if for all a, b, c ∈ A, if (ab) ↝ c and a ∘ b is defined, then a ∘ b = c.

Proof. The necessity of the condition is immediate from the previous Proposition. To prove sufficiency, assume the condition of the lemma and let us prove condition (1) of the Proposition. Thus suppose a ↝ b but a ≠ b. Then in any rewrite sequence the first step must be an expansion of a to some (cd), where c ∘ d = a. But then (cd) ↝ b, and so by the condition b = c ∘ d = a, contradicting a ≠ b. □

While the condition given in this lemma seems a bit more complicated than condition (1) of the Extension Lemma, it is actually more useful for our purposes. The reason is that it allows us to make the following definition, and is why we needed to review this construction in the first place.
Definition 4.8 Let ∘ be a partial function on A and let • be a subfunction of ∘. We say that • is an expansion basis for ∘ if for all a, b, c ∈ A, if (ab) ↝ c then there is a rewrite sequence from (ab) to c in which the expansion rule "expand z to (xy)" is used only if z = x • y. That is, in the rewriting, we need only expand z to some (xy) if the smaller function • warrants the expansion.
Example 23 Here is an instructive example, in connection with the proof we are about to give. Suppose we have six distinct elements a, b, c, d, e, and f, and a partially associative operation ∘ given by b ∘ c = a, d ∘ e = b, and e ∘ c = f. Let the subfunction • be defined by e • c = f. Then • is an expansion basis for ∘. To see this, notice that only e interacts with other elements on both its left (d) and its right (c). Using this, one can observe that the only rewrite of the form (xy) ↝ z that we have, other than the ones included in the table for ∘ itself, is
(df) ↝ (d(ec))   (expand, using f = e • c)
     ↝ ((de)c)   (regroup)
     ↝ (bc)      (contract, using d ∘ e = b)
     ↝ a         (contract, using b ∘ c = a)

For this rewrite, the one equation used in an expansion is included in •.
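The expansion-basis check of Example 23 can likewise be automated. In this illustrative Python sketch (our code; contractions and regroupings consult the full operation, while expansions consult only the given basis), the search finds the displayed rewrite of (df) to a, and finds nothing when no expansions are allowed:

```python
from collections import deque

OP    = {('b', 'c'): 'a', ('d', 'e'): 'b', ('e', 'c'): 'f'}  # the full operation
BASIS = {('e', 'c'): 'f'}                                    # the expansion basis

def steps(t, expand_op):
    """One-step rewrites of t: contraction and regrouping use OP,
    but expansion may only use the equations in expand_op."""
    if not isinstance(t, tuple):
        return {(x, y) for (x, y), z in expand_op.items() if z == t}
    l, r = t
    out = set()
    if not isinstance(l, tuple) and not isinstance(r, tuple) and (l, r) in OP:
        out.add(OP[(l, r)])               # contraction
    if isinstance(l, tuple):
        out.add((l[0], (l[1], r)))        # regrouping
    if isinstance(r, tuple):
        out.add(((l, r[0]), r[1]))
    out |= {(l2, r) for l2 in steps(l, expand_op)}
    out |= {(l, r2) for r2 in steps(r, expand_op)}
    return out

def reachable(start, goal, expand_op, max_size=6):
    def size(t):
        return 1 if not isinstance(t, tuple) else size(t[0]) + size(t[1])
    seen, queue = {start}, deque([start])
    while queue:
        t = queue.popleft()
        if t == goal:
            return True
        for n in steps(t, expand_op):
            if n not in seen and size(n) <= max_size:
                seen.add(n)
                queue.append(n)
    return False

# (df) ~> a goes through with basis expansions only, and fails
# when no expansions at all are permitted:
print(reachable(('d', 'f'), 'a', BASIS))   # True
print(reachable(('d', 'f'), 'a', {}))      # False
```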
The construction. We are now ready to begin constructing our information network N and then our model M = ⟨N, f⟩. Let Si and Ch be disjoint, countably infinite sets. We will draw our sites from Si and our channels from Ch. Whenever we add a site s (or channel c) to our model, we will label it by an expression A_s = ℓ(s) (or A_c = ℓ(c)) of the appropriate sort, with the intent described above. The network N will be the union of an increasing chain of structures Nₙ = ⟨Siₙ, Chₙ, ;ₙ, ∘ₙ⟩, for n < ω. The structures will not themselves be information networks, since the operation ∘ₙ will be partial. But the limit will be an information network. We now list various conditions that we will want to satisfy in building this sequence of structures. We identify each condition by an ordered tuple containing the key parameters in the condition.
Code / Condition

⟨s, A, B # C⟩: If A ⊢ B # C is provable from T and A is the label of s, then there are an s′ labelled by B and a c labelled by C such that s′ ;c s.

⟨s, c, A # B⟩: If A is the label of s and B is the label of c, then there is a t ∈ Si labelled by A # B such that s ;c t.

⟨c, A, B ∘ C⟩: If A ⊢ B ∘ C is provable from T and A is the label of c, then there are a c₀ labelled by B and a c₁ labelled by C such that c = c₀ ∘ c₁.

⟨c₀, c₁, B ∘ C⟩: If B is the label of c₀ and C is the label of c₁, then there is a c ∈ Ch such that c₀ ∘ c₁ = c and A_c ⊢ B ∘ C is provable from T.

⟨s, t, c, d, e⟩: If s, t are sites, c, d, e are channels, e = c ∘ d, and s ;e t, then there is a site r such that s ;c r, r ;d t, A_r ⊢_T A_s # A_c, and A_t ⊢_T A_r # A_d.

⟨s, r, t, c, d, e⟩: If s, r, t are sites, c, d, e are channels, e = c ∘ d, s ;c r, and r ;d t, then s ;e t.
Notice that we have packed enough into the tuples so that no tuple listed to the left of any one condition could be associated with any other condition. Thus we can determine from the tuple which condition it encodes. There are only countably many such tuples. Using standard techniques from cardinal arithmetic, order these tuples in a list of order type ω, say τ₀, τ₁, ..., τₙ, ..., so that each tuple occurs infinitely often. We will examine the condition associated with τₙ at stage n of our construction. By having each condition listed infinitely often, we make sure that if the antecedent of a condition ever becomes satisfied, we will later return to examine that condition again, and so fulfill it.
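The requirement that each tuple occur infinitely often in an ω-list is easy to meet concretely. One standard device, sketched in illustrative Python (the function name is ours, not the paper's): devote stage n to the k-th tuple, where 2^k is the largest power of two dividing n. Every index then recurs at infinitely many stages:

```python
def item_index(n):
    """Return k such that 2**k is the largest power of two dividing n
    (n >= 1).  Stage n of the construction handles tuple number k."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k

# Index k appears at stages 2**k, 3*2**k, 5*2**k, ... -- infinitely
# often -- and every index eventually appears.
print([item_index(n) for n in range(1, 17)])
# [0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4]
```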
Lemma 4.9 (Main Lemma) There is an increasing sequence of structures Nₙ = ⟨Siₙ, Chₙ, ;ₙ, ∘ₙ, •ₙ, ℓₙ⟩, for n < ω, satisfying the following conditions:
1. The composition operation ∘ₙ is a partially associative operation on Chₙ, with an expansion basis •ₙ.

2. ℓₙ is a function from Siₙ ∪ Chₙ into expressions. If s ∈ Siₙ, then ℓₙ(s) is an expression of sort s, called the label of s and written A_s. Similarly, if c ∈ Chₙ, then ℓₙ(c) is an expression of sort c, called the label of c and written A_c.
3. The condition associated with the tuple τᵢ is satisfied in Nₙ, for all i < n.

4. If s ;c t in Nₙ, then A_t ⊢_T A_s # A_c.

5. If c = c₀ ∘ₙ c₁, then A_c ⊢_T A_c₀ ∘ A_c₁.

6. If c = c₀ •ₙ c₁, then A_c₀ ∘ A_c₁ ⊢_T A_c.

Proof. For n = 0 we take, for each site expression A, some s ∈ Si and label it with A; for each channel expression C we take some c ∈ Ch and label it with C. We let Si₀ and Ch₀ be the sets of sites and channels so chosen. We do this in such a way that we leave an infinite number of elements of each of Si and Ch left over for later use. The connection relation ;₀ and the composition operation ∘₀ are both vacuous on N₀.

Suppose we have defined Nₙ so that Lemma 4.9 holds for n. We want to define Nₙ₊₁. We consider the condition associated with the tuple τₙ. If the constants appearing in τₙ are not in Nₙ, or if they are but the antecedent of the condition does not hold in Nₙ, or if the conclusion does hold, then we let Nₙ₊₁ = Nₙ. So suppose that all the constants of τₙ are in Nₙ and that the antecedent of the condition holds in Nₙ but the conclusion does not. Just what we do depends on the condition in question. We set about making τₙ hold in Nₙ₊₁, breaking into six cases according to the six possible forms of the code τₙ.

Case 1. τₙ is of the form ⟨s, A, B # C⟩. To form Nₙ₊₁, we pick s′ ∈ Si \ Siₙ and c ∈ Ch \ Chₙ. We let Siₙ₊₁ = Siₙ ∪ {s′} and Chₙ₊₁ = Chₙ ∪ {c}. We do not enlarge the composition operation or its basis. The signaling relation of the new structure is the same as that for the old, except that we also have s′ ;c s in the new structure. Finally, we label these nodes as required by the condition. To verify Lemma 4.9, we note that the only thing that requires checking in this case is part 4, and we have made sure this is true by our labeling.

Case 2. τₙ is of the form ⟨s, c, A # B⟩. This case is entirely analogous.

Case 3. τₙ is of the form ⟨c, A, B ∘ C⟩. We pick new c₀ and c₁ to throw in, label them with B and C respectively, and define c₀ ∘ c₁ = c. However, we do not alter the expansion basis. We need to check that the new composition operation is partially associative.
Clearly no new rewrites among the old channels arise, since Nₙ is partially associative and since the one new equation we have, c₀ ∘ c₁ = c, involves brand new channels,
channels which cannot interact with any of the old channels. And these new channels cannot rewrite to anything other than themselves, since they do not appear as values of ∘ₙ₊₁. We also need to see that •ₙ₊₁ (which is the same as •ₙ) is an expansion basis for ∘ₙ₊₁. The reason is that there is no use in expanding by the equation c₀ ∘ c₁ = c, since c₀, c₁ are new and so cannot interact with any of the old elements, or with each other.

Case 4. τₙ is of the form ⟨c₁, c₂, B ∘ C⟩. This is the trickiest stage of the construction, but we have prepared the stage for it with the Extension Lemma. First, we must decide whether c₁ ∘ c₂ should be an old element or a new element. To do this, we see if there is a c ∈ Chₙ such that (c₁c₂) ↝ c in Nₙ. If there is, there can be only one such, since Nₙ is a partially associative structure. Hence, if there is such a c, we define c₁ ∘ c₂ = c. We need to prove that A_c ⊢ A_c₁ ∘ A_c₂. This follows from the fact that •ₙ is an expansion basis for ∘ₙ, and proceeds by induction on the length of a rewrite sequence that only uses expansions allowed by •ₙ. Rather than give the inductive proof, we illustrate the main idea. Suppose that in Nₙ we have the following equations holding:

c₂ = c₂₁ •ₙ c₂₂
d = c₁ ∘ₙ c₂₁
c = d ∘ₙ c₂₂

Given these equations, the following is a rewrite reduction of (c₁c₂) to c, one that adheres to the basis •ₙ:

(c₁c₂) ↝ (c₁(c₂₁c₂₂))   since c₂ = c₂₁ •ₙ c₂₂
       ↝ ((c₁c₂₁)c₂₂)   by associativity
       ↝ (dc₂₂)         since d = c₁ ∘ₙ c₂₁
       ↝ c              since c = d ∘ₙ c₂₂

We want to see why it is that A_c ⊢ A_c₁ ∘ A_c₂. Since the Lemma holds for Nₙ, we have the following:

A_c₂₁ ∘ A_c₂₂ ⊢_T A_c₂   since c₂ = c₂₁ •ₙ c₂₂
A_d ⊢_T A_c₁ ∘ A_c₂₁     since d = c₁ ∘ₙ c₂₁
A_c ⊢_T A_d ∘ A_c₂₂      since c = d ∘ₙ c₂₂

From the second of these we obtain easily

A_d ∘ A_c₂₂ ⊢_T (A_c₁ ∘ A_c₂₁) ∘ A_c₂₂.

Hence we get

A_c ⊢_T (A_c₁ ∘ A_c₂₁) ∘ A_c₂₂ ⊢_T A_c₁ ∘ (A_c₂₁ ∘ A_c₂₂) ⊢_T A_c₁ ∘ A_c₂,
as desired.

If there is no c such that (c₁c₂) ↝ c, then we pick a new channel c and add it to the set of channels. We define c₁ ∘ₙ₊₁ c₂ = c and also c₁ •ₙ₊₁ c₂ = c. We label c with A_c₁ ∘ A_c₂.

Case 5. τₙ is of the form ⟨s, t, c, d, e⟩. In this case we throw in a new site r and label it with A_s # A_c. We need to check that this labeling satisfies A_t ⊢_T A_r # A_d. We know by induction that A_t ⊢_T A_s # A_e. But A_e ⊢_T A_c ∘ A_d. Hence A_t ⊢_T A_s # (A_c ∘ A_d). But A_s # (A_c ∘ A_d) is equivalent to (A_s # A_c) # A_d, which is A_r # A_d.

Case 6. τₙ is of the form ⟨s, r, t, c, d, e⟩. Here all we have to do is throw a new tuple into the signaling relation, namely s ;e t. We need to verify, though, that A_t ⊢_T A_s # A_e. This is routine, however, since we have A_t ⊢_T A_r # A_d, A_r ⊢_T A_s # A_c, and A_e ⊢_T A_c ∘ A_d. □
Lemma 4.10 Given any sequence as in Lemma 4.9, let N = ⋃ₙ<ω Nₙ. Then N is an information network: composition on N is total (and, being total and partially associative, associative), and whenever e = c ∘ d, we have s ;e t iff there is an r such that s ;c r and r ;d t.
Proof. We have made sure that each approximation Nₙ has a partially associative operation, and since any rewrite sequence is finite, so does the limit. We need to make sure it is total. Assume we have two channels c and d. At some stage n after c and d were added, we considered the condition identified by the tuple ⟨c, d, A_c ∘ A_d⟩. At that stage, we added a value for c ∘ d if one did not already exist.

Now we turn to the second claim. To prove the direction from left to right, assume that s ;e t. At some stage after s, t, c, d, and e = c ∘ d were added, we considered the condition associated with the tuple ⟨s, t, c, d, e⟩. At that stage we made sure there was an r of the desired sort. For the converse, suppose that s ;c r and r ;d t, and that e = c ∘ d. At some stage after s, r, t, c, d, and e = c ∘ d were added, we considered the condition associated with the tuple ⟨s, r, t, c, d, e⟩. At that stage we added the needed connection s ;e t. □

To turn our information network N into a model M = ⟨N, f⟩, define, for each atomic expression B of sort s,
f(B) = {s ∈ Si | A_s ⊢_T B},

where A_s is the label of the site s. Define f on atomic expressions of sort c
analogously.
Lemma 4.11 In the model M just constructed, each site and each channel is defined by its label. That is, s |= B iff A_s ⊢_T B, and similarly for channels.
Proof. The proof is by induction on formulas. We show that for every B, a site (or channel) makes B true iff its label proves B, that is, A_s ⊢_T B. The case of atomic formulas is immediate from the definition of f given above.

Consider the case of an expression of the form B # C. Suppose it is true at some site s. Then there are a site s′ and a channel c such that s′ ;c s, s′ |= B, and c |= C. By the induction hypothesis, A_s′ ⊢_T B and A_c ⊢_T C. By Lemma 4.9.4, A_s ⊢_T A_s′ # A_c. But then A_s ⊢_T B # C, as desired. For the converse, suppose that A_s ⊢_T B # C. Then at some stage after s was added to the network, the condition associated with ⟨s, A_s, B # C⟩ was considered. At that stage, we added a site s′ and a channel c, labelling them with B and C respectively, and declaring s′ ;c s. By the induction hypothesis, s′ |= B and c |= C. Hence s |= B # C, as desired. The case for expressions of the form B ∘ C is similar, except using Lemma 4.9.5 in one direction and the condition ⟨c, A_c, B ∘ C⟩ in the converse direction.

Consider the case of an expression of the form B ! C. Suppose first that c |= B ! C. We want to prove that A_c ⊢_T B ! C. At the 0th stage, we made sure to add a site s labelled with B. By the induction hypothesis, s |= B. By the condition ⟨s, c, B # A_c⟩, there is a site t labelled by B # A_c such that s ;c t. Hence t |= C, since c |= B ! C, s |= B, and s ;c t. By the induction hypothesis, A_t ⊢_T C, that is, B # A_c ⊢_T C. But then A_c ⊢_T B ! C, as desired. For the converse, suppose that A_c ⊢_T B ! C. We want to show that c |= B ! C. Toward this end, suppose that s ;c t, where s |= B. Then, by induction again, A_s ⊢_T B. By Lemma 4.9.4, A_t ⊢_T A_s # A_c. But A_s # A_c ⊢_T B # (B ! C) and B # (B ! C) ⊢_T C, so A_t ⊢_T C, and hence, by the induction hypothesis once more, t |= C. The case for expressions of the form A ⇑ C is similar. □

The last step in the proof of completeness is similar to the earlier proof. We need to show that each sequent in the theory T is valid in M and that an unprovable sequent is not valid in M. To begin, let us assume that S ∈ T.
We assume S is an s-sequent, the other case being similar. We may suppose that S is of the form A ⊢ B. Let s be any site in M such that s |= A. Then, by the lemma, A_s ⊢_T A. But then, by Cut, A_s ⊢_T B, and so, again by the lemma, s |= B. Now let us show that if S is valid in M, then A ⊢_T B. Let
s be any site labelled by A. Then s |= A, so if S is valid in M, then s |= B. But then A ⊢_T B by Lemma 4.11. □
4.2 Completeness of the single-sorted calculus G
The proof of completeness for the system G is basically the same as the proof we gave for the relativized completeness of G. However, even if we are working with the empty theory, the first proof does not work now. The reason is that there are sequents like

A, A ! (B ∘ C) ⊢ B ∘ C.

This sequent violates the Decomposition Lemma, and so blocks us from giving a canonical model construction of the usual sort. The difference is that we do not distinguish between sites and channels. We start by assigning each expression A some site s and labelling it by A. Everything else proceeds as in the earlier case.
4.3 Completeness of G for N-logic
Our proof is a variant of a standard model existence argument for infinitary logic, as given in Keisler [24] or Barwise [2]. We show that every nonprovable sequent Γ ⊬ Δ has an extension to an appropriate pair ⟨Γ′, Δ′⟩ which can be easily "invalidated" in a model M = ⟨N, f⟩, that is to say, a pair such that every unit in Γ′ holds in M while no unit in Δ′ does. We start with a rather obvious lemma:
Lemma 4.12 If Γ ⊢_T Δ, Γ ⊆ Γ′, and Δ ⊆ Δ′, then Γ′ ⊢_T Δ′.

Proof. The proof is by a routine induction. □

The next step is to generalize the usual notion of a consistency property for infinitary logic.
Definition 4.13 A consistency property for a fragment L_A(N) is a set S of pairs ⟨Γ, Δ⟩ such that each of Γ, Δ is a set of propositions of L_A(N) and such that the following conditions hold:

1. If X = ⟨Γ, Δ⟩ ∈ S, then Γ ∩ Δ = ∅.

2. For X = ⟨Γ, Δ⟩ ∈ S:
If [c : (A ! B)] ∈ Γ, then for all s, t such that s ;c t, ⟨Γ, Δ ∪ {[s : A]}⟩ ∈ S or ⟨Γ ∪ {[t : B]}, Δ⟩ ∈ S.
If [c : (A ! B)] ∈ Δ, then there are s, t such that s ;c t and ⟨Γ ∪ {[s : A]}, Δ ∪ {[t : B]}⟩ ∈ S.

3. For X = ⟨Γ, Δ⟩ ∈ S:
If [t : (A # C)] ∈ Γ, then there exist s, c such that s ;c t and ⟨Γ ∪ {[s : A], [c : C]}, Δ⟩ ∈ S.
If [t : (A # C)] ∈ Δ, then for all s, c such that s ;c t, ⟨Γ, Δ ∪ {[s : A]}⟩ ∈ S or ⟨Γ, Δ ∪ {[c : C]}⟩ ∈ S.

4. For X = ⟨Γ, Δ⟩ ∈ S:
If [s : (A ⇑ C)] ∈ Γ, then for all c, t such that s ;c t, ⟨Γ ∪ {[t : A]}, Δ⟩ ∈ S or ⟨Γ, Δ ∪ {[c : C]}⟩ ∈ S.
If [s : (A ⇑ C)] ∈ Δ, then there exist c, t such that s ;c t and ⟨Γ ∪ {[c : C]}, Δ ∪ {[t : A]}⟩ ∈ S.

5. For X = ⟨Γ, Δ⟩ ∈ S:
If [c : (A ∘ B)] ∈ Γ, then there exist a, b such that a ∘ b = c and ⟨Γ ∪ {[a : A], [b : B]}, Δ⟩ ∈ S.
If [c : (A ∘ B)] ∈ Δ, then for all a, b such that a ∘ b = c, ⟨Γ, Δ ∪ {[a : A]}⟩ ∈ S or ⟨Γ, Δ ∪ {[b : B]}⟩ ∈ S.

6. For X = ⟨Γ, Δ⟩ ∈ S:
If ¬p ∈ Γ, then ⟨Γ, Δ ∪ {p}⟩ ∈ S.
If ¬p ∈ Δ, then ⟨Γ ∪ {p}, Δ⟩ ∈ S.

7. For X = ⟨Γ, Δ⟩ ∈ S:
If ⋀Φ ∈ Γ and p ∈ Φ, then ⟨Γ ∪ {p}, Δ⟩ ∈ S.
If ⋀Φ ∈ Δ, then for some p ∈ Φ, ⟨Γ, Δ ∪ {p}⟩ ∈ S.

S is called a T-consistency property for L_A(N) if, in addition, for each pair ⟨Γ, Δ⟩ ∈ S and each p ∈ T, ⟨Γ ∪ {p}, Δ⟩ ∈ S.
Proposition 4.14 The set S of pairs ⟨Γ, Δ⟩ such that Γ, Δ ⊆ L_A(N) and Γ ⊬_T Δ is a T-consistency property for L_A(N).

Proof. Clearly the identity axiom implies that if Γ ⊬ Δ, then Γ ∩ Δ = ∅. Now suppose that Γ ⊬ Δ and [c : (A ! B)] ∈ Γ. If the relevant condition on consistency properties fails, let s, t be such that s ;c t and suppose that both sequents Γ ⊢ Δ, [s : A] and Γ, [t : B] ⊢ Δ are provable. Apply (! L) to obtain the provable sequent Γ, [c : (A ! B)] ⊢ Δ, i.e. Γ ⊢ Δ (since we assume [c : (A ! B)] ∈ Γ), to get a contradiction. Similarly if [c : (A ! B)] ∈ Δ. The conditions relating to ⇑ are symmetric to those for ! and hence a similar argument applies. For the # conditions, suppose, for example, that [t : (A # C)] ∈ Δ. If the relevant condition on consistency properties fails, let s, c be such that s ;c t and suppose both sequents Γ ⊢ Δ, [s : A] and Γ ⊢ Δ, [c : C] are provable. An application of (# R) again yields a contradiction. Similarly for the case where [t : (A # C)] ∈ Γ, now using (# L) in connection with the relevant condition for consistency properties. The case of composition, finally, is no different from the above. The cases for the conditions involving the propositional operators are similar and routine.

Finally, let us check the condition that makes S a T-consistency property. Suppose that Γ ⊬_T Δ, that p ∈ T, but that Γ, p ⊢_T Δ. We also have Γ ⊢_T Δ, p, since p ∈ T. Hence by Cut we have Γ ⊢_T Δ, a contradiction. Note that this is the only part of the proof that requires Cut. Thus, if T = ∅, this proof would not have needed the Cut rule. □

The proof of the completeness theorem now follows from the above together with the following:

Theorem 4.15 (Model Existence Theorem) Let L_A(N) be a countable fragment of L_∞(N). If S is a T-consistency property for L_A(N) and ⟨Γ, Δ⟩ ∈ S, then there is a model which makes all the propositions in Γ ∪ T true and all the propositions in Δ false.

The proof of this theorem follows from the next two lemmas.
Definition 4.16 A Hintikka pair is a pair ⟨Γ, Δ⟩ such that its singleton {⟨Γ, Δ⟩} is a consistency property.

Lemma 4.17 If X = ⟨Γ, Δ⟩ is a Hintikka pair, then there is a model that makes all the propositions in Γ true and all the propositions in Δ false.

Proof. Define an interpretation f on atomic types by

f(A) = {s | [s : A] ∈ Γ}.

Let M = ⟨N, f⟩. By induction on types, we show that if [s : A] ∈ Γ then s |= A, and if [s : A] ∈ Δ then s ⊭ A. The atomic case is taken care of by the disjointness condition on consistency properties and the way f was defined. Otherwise, suppose the unit is of the form [c : (A ! B)] and assume first that it is in Γ. To show that c |= A ! B, let s, t be arbitrary and suppose that s |= A and s ;c t. Then [s : A] ∉ Δ, by the inductive hypothesis. The first part of the second condition on consistency properties then implies that [t : B] ∈ Γ. The inductive hypothesis then implies that t |= B, and thereby c |= (A ! B). The case where [c : (A ! B)] ∈ Δ is handled similarly, using the second part of the second condition on consistency properties. The argument for units involving ⇑ is symmetric to the above. Now suppose the unit is [t : (A # C)] and that it is in Γ. By the definition of consistency properties and Hintikka pairs, let s, c be such that s ;c t and [s : A], [c : C] ∈ Γ. By the induction hypothesis, s |= A and c |= C, so t |= (A # C) follows. Finally, an induction on propositions shows that M |= p if p ∈ Γ and M |= ¬p if p ∈ Δ. □
Lemma 4.18 (Extension Lemma) Let S be a T-consistency property for a countable fragment L_A(N) and let X₀ ∈ S. There is an increasing chain Xₙ, n ∈ ω, of pairs Xₙ ∈ S, such that the union X = ⟨Γ, Δ⟩ = ⋃_{n∈ω} Xₙ is a Hintikka pair with T ⊆ Γ.

Proof. The proof is similar to, but simpler than, the proof of the completeness of the system G. Consider the following conditions on a pair ⟨Γ, Δ⟩:
Code / Condition

⟨s, c, t, A ! B⟩: If [c : (A ! B)] ∈ Γ and s ;c t, then [s : A] ∈ Δ or [t : B] ∈ Γ.

⟨c, A ! B⟩: If [c : (A ! B)] ∈ Δ, then there are s, t such that s ;c t, [s : A] ∈ Γ, and [t : B] ∈ Δ.

⟨t, A # C⟩: If [t : (A # C)] ∈ Γ, then there are s, c such that s ;c t, [s : A] ∈ Γ, and [c : C] ∈ Γ.

⟨s, c, t, A # C⟩: If [t : (A # C)] ∈ Δ and s ;c t, then [s : A] ∈ Δ or [c : C] ∈ Δ.

⟨s, c, t, A ⇑ C⟩: If [s : (A ⇑ C)] ∈ Γ and s ;c t, then [t : A] ∈ Γ or [c : C] ∈ Δ.

⟨s, A ⇑ C⟩: If [s : (A ⇑ C)] ∈ Δ, then there are c, t such that s ;c t, [c : C] ∈ Γ, and [t : A] ∈ Δ.

⟨c, A ∘ B⟩: If [c : (A ∘ B)] ∈ Γ, then there are a, b such that a ∘ b = c, [a : A] ∈ Γ, and [b : B] ∈ Γ.

⟨c, a, b, A ∘ B⟩: If [c : (A ∘ B)] ∈ Δ and a ∘ b = c, then [a : A] ∈ Δ or [b : B] ∈ Δ.

⟨¬p⟩: If ¬p ∈ Γ, then p ∈ Δ; if ¬p ∈ Δ, then p ∈ Γ.

⟨p, ⋀Φ⟩: If ⋀Φ ∈ Γ and p ∈ Φ, then p ∈ Γ.

⟨⋀Φ⟩: If ⋀Φ ∈ Δ, then for some p ∈ Φ, p ∈ Δ.

⟨p, T⟩: If p ∈ T, then p ∈ Γ.

Enumerate the tuples on the left in a sequence τ₁, ..., τₙ, ..., in such a way that each tuple occurs infinitely often. Let X₀ = ⟨Γ₀, Δ₀⟩ = ⟨Γ, Δ⟩ = X. We define Xₙ₊₁ = ⟨Γₙ₊₁, Δₙ₊₁⟩ by cases so that the condition τₙ is fulfilled and Xₙ₊₁ ∈ S. Then the limit is a Hintikka pair. Note, however, that it may well not be a member of S. □
Remark 4.19 (Cut Elimination) Notice that for the case where T = ∅, completeness was shown without the use of the Cut rule. Hence the fragment without this rule is, in fact, closed under Cut. We could also have shown Cut Elimination directly for this case.
4.4 Completeness of G for variable network logic
The proof of completeness for this system is very similar to the proof of completeness for the previous fragment, given the way consistency properties work for first-order logic. We omit the details for lack of space.
5 Comments and open questions

We conclude with some remarks and questions we have either not had an opportunity to explore or, if we have, have not been able to answer.

1. The results in this paper are clearly closely related to the 3-place relation semantics for relevance logic, introduced by Routley and Meyer [27], [28], and to related substructural logics. Indeed, the development of at least the first author's ideas about this paper was strongly influenced by the Kripke-style semantics for substructural logics as reported, in a more abstract setting, in Dunn [17]. However, there is a difference of methodology. In relevance and related logics, one starts with intuitions about inference, and comes up with the semantics that makes it sound and complete. Here we start with semantic intuitions about information flow and come up with the logic. The results are quite different in the kinds of conditions on the three-place relation that arise. It would be a worthwhile project to try to reconcile these two ways of looking at things.

2. It seems clear that the results here are related to results on relational algebras and also to van Benthem's arrow logic (see [11]). We have not yet investigated just what the relationships might be, but consider it an interesting project to try to work out.

3. There are quite a few natural additions we could make to our language, additions which would increase our expressive power. As a starter, we could add types True (False) which classified all (no) sites. It is pretty obvious how to extend our systems for these. If we do, then we can define the range of a channel type C by True # C. However, we have no way of defining the domain. But if we simply invert all channels, we get a network where everything is reversed. This suggests that we could equally well introduce connectives, say +, ⇒ and ⇐, which look at things from the other side. Thus, for example, s ⊨ C + A iff there is a channel c ⊨ C and a target t ⊨ A such that s →_c t.
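The range/domain observation in item 3 can be illustrated concretely. The sketch below assumes the same invented toy-network encoding used earlier (sites, channels, and triples s →_c t); True # C picks out the range of the channel type C, and the dual connective + makes the domain definable as C + True. All names here are illustrative assumptions.

```python
# Toy network (invented names); True/False and the dual connective +.
sites = {"s", "t"}
arrows = {("s", "c", "t")}                    # s -c-> t
atoms = {"A": {"t"}, "C": {"c"}}

def holds(x, ty):
    if ty == "True":
        return x in sites                     # classifies all sites
    if ty == "False":
        return False                          # classifies no sites
    if isinstance(ty, str):                   # atomic type
        return x in atoms.get(ty, set())
    op, left, right = ty
    if op == "#":   # t |= A # C: some s |= A and c |= C with s -c-> t
        return any(holds(s, left) and holds(c, right)
                   for (s, c, t) in arrows if t == x)
    if op == "+":   # s |= C + A: some c |= C and t |= A with s -c-> t
        return any(holds(c, left) and holds(t, right)
                   for (s, c, t) in arrows if s == x)
    raise ValueError(op)

rng = {x for x in sites if holds(x, ("#", "True", "C"))}  # range of C
dom = {x for x in sites if holds(x, ("+", "C", "True"))}  # domain of C
```

On this one-arrow network the range of C is the target {t} and the domain, definable only once + is added, is the source {s}.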
The arrows would be defined so as to make them the residuals of this connective. If we dealt only with these, then we could get a complete system by inverting things in our system. However, what happens if we have both sets of connectives? It is no longer clear how to formulate things.

4. One could also add a connective A ? B such that a channel c is of this type if and only if there are sites s ⊨ A and t ⊨ B such that s →_c t.
Thus, A ? B expresses a kind of consistency between the types. If we had a classical negation in our language, ¬(A → B) would be semantically equivalent to A ? ¬B. We have not thought at all about incorporating ? into the Gentzen calculi.

5. It is not difficult to see how to extend the results in Section 2 to a language where we allow classical boolean conjunction and disjunction of types. However, it is not clear how best to extend it to one with negation. There are several possibilities that need to be investigated.

6. The results of this paper are directly related to the proposals made in Barwise [5]. But there, in addition to sequential composition of channels, Barwise considered inversion on channels and parallel composition of channels. That is, besides enriching the language, we can enrich the information network to allow for inversion of channels or parallel composition of channels, or both. We have likewise not had a chance to explore this alternative.

7. In the case of our planning example, it would be nice to bring the quantifiers into the language in a full-fledged way, rather than just using the informational operators to apply to sentences with no free variables. And it would be nice to have a completeness result where the networks were in fact of the form given in Example 9.
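Item 4's connective ? admits an equally direct reading over a toy network. The following one-clause sketch (all names invented) checks whether a channel connects an A-site to a B-site:

```python
# c |= A ? B iff some sites s |= A and t |= B with s -c-> t (names invented).
arrows = {("s", "c", "t")}
atoms = {"A": {"s"}, "B": {"t"}}

def q_holds(c, a, b):
    """Does channel c connect some a-site to some b-site?"""
    return any(s in atoms[a] and t in atoms[b]
               for (s, ch, t) in arrows if ch == c)
```

Note the asymmetry: the single arrow makes the channel of type A ? B but not of type B ? A.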
Appendix: An Alternative Single-Sorted Proof System for L

The proof theory for the single-sorted language developed earlier suggests an interesting question. Can we develop a proof theory for s-sequents alone, or is the introduction of c-sequents essential? We suspect, but cannot prove, that the c-sequents are necessary if you want a strictly Gentzen system, one with only introduction and elimination axioms of the standard sort. What we can prove, though, is that if you give up on a strictly Gentzen-like calculus, then you can axiomatize the s-sequent fragment alone. In this section, then, we take sequents to be of the form Γ ⊢ B, where Γ is a nonempty sequence of types of L, and assume a definition of validity in the sense we previously called s-validity. We present a proof system that captures this notion of validity. Some asymmetry in the form of sequents appearing as premises of rules will be
noticed, which is in fact necessary in order to avoid confusing the operators # and ∘. We note also that associativity of the composition operator needs to be assumed as an axiom. Finally, there is the option of assuming (A # B) # C ⊢ A # (B ∘ C) as an axiom (the converse will follow from the rest of the rules), or of introducing a double-line rule for the left introduction of #. We will take the double-line rule option, but note that the alternative could serve us equally well.

(Identity)
A ⊢ A
(Association) A (B C ) a` (A B ) C
(Cut) From Γ ⊢ A and Δ, A, Δ′ ⊢ C, infer Δ, Γ, Δ′ ⊢ C.

(Application)
(# R) From Γ ⊢ A and Δ ⊢ C, infer Γ, Δ ⊢ A # C.
(# L) From A, B, Γ ⊢ C, infer A # B, Γ ⊢ C. (This is the double-line rule: the converse inference is also allowed.)

(Right Impl.)
(→ L) From Γ ⊢ A and B, Δ ⊢ C, infer Γ, A → B, Δ ⊢ C.
(→ R) From A, Γ ⊢ B, infer Γ ⊢ A → B.

(Left Impl.)
(← L) From Γ ⊢ B and A, Δ ⊢ D, infer A ← B, Γ, Δ ⊢ D.
(← R) From Γ, B ⊢ A, infer Γ ⊢ A ← B.

(Composition)
(∘ R) From Γ ⊢ A and Δ ⊢ B, infer Γ, Δ ⊢ A ∘ B.
(∘ L) From A, B, Γ ⊢ D, infer A ∘ B, Γ ⊢ D.

The two systems presented for the single-sorted case prove the same s-sequents, as we now sketch. First, it is easy to see that the rules of the present system are all sound. Since our earlier system is complete, all the theorems of this system must be provable in the earlier system. For the converse, we show how to code c-sequents from the first system in the present system in such a way that every theorem of the first system is a theorem of the present system, though its proof might be much longer. From the completeness of the first system, then, we get the completeness of the present system for s-sequents. Since we used ⊢ (and |) for the single-sorted system presented in Section 2.2, we use ⊢₁ for the system just presented. For the coding, translate the c-sequent Γ | B as the s-sequent Γ* ⊢₁ B, where Γ* is the result of composing the wffs in Γ, grouping parentheses to the left. Completeness of the present system follows from the completeness of the earlier system together with the following result.
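The coding of c-sequents at the end of this argument depends only on left-associated composition of a sequence of types. A sketch, with types encoded as invented nested tuples:

```python
# Left-associated composition of a sequence of types, as used in the
# translation of c-sequents; the tuple encoding here is invented.
from functools import reduce

def compose_left(types):
    """[A, B, C] -> (A o B) o C, i.e. ('o', ('o', 'A', 'B'), 'C')."""
    return reduce(lambda acc, t: ("o", acc, t), types)
```

A singleton sequence translates to its only member, matching the remark below that a sequent with a single type on the left may be read either way.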
Proposition 5.1 For any sequent S of the original system, if S is provable in that system, then its translation is provable in the present system.
The proof is by induction following the inductive definition of derivations. The case of axioms is immediate. Assume next that the sequent is an s-sequent. If S is A # B, Γ ⊢₂ C, obtained by (# L), then there is nothing to prove. Now suppose it is Γ, Δ ⊢ A # C, obtained from Γ ⊢ A and Δ | C. Then by induction, Γ ⊢ A and Δ* ⊢ C in the present system. An application of (# R) yields Γ, Δ* ⊢ A # C. By the elimination direction of (# L), and since introduction of # is reversible in both systems, this is, up to provability, the same as Γ, Δ ⊢ A # C. The cases of the rules for ← and → are treated similarly. The case of the s-sequent rule (∘ L) is immediate, since the rule "from Γ, B, C, Δ ⊢ D infer Γ, B ∘ C, Δ ⊢ D" is derivable. The case of the c-sequent rule (∘ L) is immediate. For (∘ R), from Γ ⊢ A and Δ ⊢ B we obtain Γ, Δ ⊢ A ∘ B. By the generalized associativity lemma stated below, the latter sequent is, up to provability, the same as the sequent (Γ ∪ Δ)* ⊢ A ∘ B. □

Given a sequence ⟨A₁, …, Aₙ⟩ of types, by ∘(A₁, …, Aₙ) we denote any type obtained from the sequence A₁, …, Aₙ by applications of the operator ∘ in any order. By ∘ℓ(A₁, …, Aₙ) we mean the type with association to the left and ∘r(A₁, …, Aₙ) is the type with association to the right, i.e.
∘ℓ(A₁, …, Aₙ) = (⋯((A₁ ∘ A₂) ∘ A₃) ∘ ⋯) ∘ Aₙ
∘r(A₁, …, Aₙ) = A₁ ∘ (⋯ ∘ (Aₙ₋₂ ∘ (Aₙ₋₁ ∘ Aₙ))⋯)
We then prove the following general associativity fact.

Lemma 5.2 For any sequence ⟨A₁, …, Aₙ⟩, any type ∘(A₁, …, Aₙ) is equivalent to each of the types ∘ℓ(A₁, …, Aₙ) and ∘r(A₁, …, Aₙ). Hence any two types obtained from a sequence ⟨A₁, …, Aₙ⟩ by applications of the composition operator are provably equivalent.

The proof is by induction on n ≥ 3. For n = 3, this is just the associativity axiom. Now let 1 ≤ m < n and assume ∘(A₁, …, Aₙ) = ∘₁(A₁, …, Aₘ) ∘ ∘₂(Aₘ₊₁, …, Aₙ). By the induction hypothesis, each of the types ∘ᵢ is equivalent to both ∘ᵢℓ and ∘ᵢr, for i = 1, 2 respectively. Hence we can write ∘ as
∘(A₁, …, Aₙ) = [(⋯((A₁ ∘ A₂) ∘ A₃) ∘ ⋯) ∘ Aₘ] ∘ [Aₘ₊₁ ∘ (⋯ ∘ (Aₙ₋₂ ∘ (Aₙ₋₁ ∘ Aₙ))⋯)]

By induction on m and using the associativity axiom we obtain ∘ ≡ ∘ℓ. By induction on n − m and the associativity axiom we obtain ∘ ≡ ∘r. □
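Lemma 5.2 amounts to the statement that any ∘-tree over a fixed leaf sequence can be normalized to the left- or right-associated form. That normalization is easy to make computational; the tuple encoding below is invented for illustration:

```python
# Normalizing o-trees (invented tuple encoding) to left/right association.
def leaves(ty):
    """Leaf sequence of an o-tree, in order."""
    if isinstance(ty, str):
        return [ty]
    _, l, r = ty
    return leaves(l) + leaves(r)

def assoc_left(ts):
    """Rebuild the sequence with parentheses grouped to the left."""
    out = ts[0]
    for t in ts[1:]:
        out = ("o", out, t)
    return out

def assoc_right(ts):
    """Rebuild the sequence with parentheses grouped to the right."""
    out = ts[-1]
    for t in reversed(ts[:-1]):
        out = ("o", t, out)
    return out

t1 = ("o", "A", ("o", "B", "C"))    # A o (B o C)
t2 = ("o", ("o", "A", "B"), "C")    # (A o B) o C
```

Both trees share the leaf sequence A, B, C, so they normalize to the same left- and right-associated forms, mirroring the lemma's conclusion.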
We can now prove equivalence of the two alternative systems we have presented for the single-sorted case. Let L1 be the system with only s-sequents and L2 the system with both s-sequents and c-sequents. We define maps from sequents of L1 to sequents of L2 and vice versa. From L1 to L2 we take the identity map, sending a sequent Γ ⊢₁ B to the s-sequent Γ ⊢ B. Notice that in the particular case where Γ ⊢₁ B is A ⊢₁ B (i.e. with a single type on the left), its translation in L2 may be regarded as either an s-sequent or a c-sequent.
References

[1] H. Andréka and S. Mikulás (1994), "Lambek Calculus and its Relational Semantics: Completeness and Incompleteness", Journal of Logic, Language, and Information, vol. 3, 1-37.
[2] J. Barwise (1975), Admissible Sets and Structures: An Approach to Definability Theory, Springer-Verlag, Perspectives in Mathematical Logic.
[3] J. Barwise (1989), The Situation in Logic, CSLI Lecture Notes 17, Stanford, California.
[4] J. Barwise (1992), "Information Links in Domain Theory", Proceedings of the Mathematical Foundations of Programming Semantics Conference (1991), ed. by S. Brookes et al., LNCS 598, Springer, 168-192.
[5] J. Barwise (1993), "Constraints, Channels, and the Flow of Information", Indiana University Logic Group IULG-93-23, to appear in Situation Theory and its Applications, Vol. 3.
[6] J. Barwise and J. Etchemendy (1990), "Information, Infons and Inference", in [13].
[7] J. Barwise and J. Perry (1983), Situations and Attitudes, Cambridge, Mass.: Bradford Books/MIT Press. Translated into German (1986) and Japanese (1992).
[8] J. Barwise and J. Seligman (1992), "The Rights and Wrongs of Natural Regularity", Indiana University Logic Group, Preprint Series IULG-92-17, Perspectives in Philosophy, vol. 8, ed. by James Tomberlin.
[9] J. Barwise and J. Seligman (1993), "Imperfect Information Flow", Proceedings of the 8th Annual IEEE Symposium on Logic in Computer Science, ed. by M. Vardi, 252-261. IEEE Computer Society Press, Los Alamitos, CA.
[10] J. Barwise, D. Gabbay and C. Hartonas (1994), "Information Flow and the Lambek Calculus", Logic, Language and Computation: The 1994 Moraga Proceedings, ed. by Jerry Seligman and Dag Westerståhl. Forthcoming from CSLI, Stanford.
[11] J. van Benthem (1992), "A Note on Dynamic Arrow Logic", Institute for Logic, Language and Computation, Prepublication Series for Logic, Semantics and Philosophy of Language LP-92-11, University of Amsterdam.
[12] W. Bibel, L. Fariñas del Cerro, B. Fronhöfer, and A. Herzig (1991), "Plan generation by linear proofs: on semantics". Preprint.
[13] R. Cooper, K. Mukai and J. Perry (eds) (1990), Situation Theory and its Applications, vol. I, CSLI Lecture Notes 22, Stanford, California.
[14] K. J. Devlin (1991), Logic and Information, Cambridge University Press, Cambridge, England.
[15] K. Došen and P. Schroeder-Heister (eds) (1993), Substructural Logics, Oxford University Press.
[16] F. Dretske (1981), Knowledge and the Flow of Information, Cambridge: Bradford Books, MIT Press.
[17] J. M. Dunn (1993), "Partial Gaggles applied to Logics with Restricted Structural Rules", Indiana University Logic Group, Preprint Series IULG-93-22. Forthcoming in [15].
[18] M. Fitting (1990), First-Order Logic and Automated Theorem Proving, Springer-Verlag, Texts and Monographs in Computer Science.
[19] D. M. Gabbay (1993), "Labelled Deductive Systems: A Position Paper", Logic Colloquium '90, J. Oikkonen and J. Väänänen (eds), Lecture Notes in Logic, vol. 2, Springer-Verlag, 66-88.
[20] D. M. Gabbay (1995), Labelled Deductive Systems: Principles and Applications, Vol. 1: Basic Principles, to appear, Oxford University Press.
(1st draft, manuscript, 1989; 2nd intermediate draft, University of Munich, CIS Bericht 90-22, 1990; 3rd intermediate draft, Max Planck Institute, Saarbrücken, Technical Report MPI-I-94-223, 1994.)
[21] D. M. Gabbay (1994), "Classical vs Nonclassical Logic", in Handbook of Logic in Artificial Intelligence and Logic Programming, vol. 2, D. M. Gabbay, C. Hogger and J. A. Robinson (eds), Oxford University Press, 349-489.
[22] D. M. Gabbay and R. J. G. B. De Queiroz (1992), "Extending the Curry-Howard Interpretation to Linear, Relevant and Other Resource Logics", The Journal of Symbolic Logic 57, no. 4, 1319-1365.
[23] J-Y. Girard, Y. Lafont and P. Taylor (1989), Proofs and Types, Cambridge University Press.
[24] H. J. Keisler (1971), Model Theory for Infinitary Logic, North-Holland, Studies in Logic and the Foundations of Mathematics.
[25] J. Lambek (1958), "The Mathematics of Sentence Structure", The American Mathematical Monthly 65, 154-170.
[26] E. Lopez-Escobar (1965), "An interpolation theorem for denumerably long sentences", Fundamenta Mathematicae LVII, 253-272.
[27] R. Routley and R. K. Meyer (1973), "The Semantics of Entailment, I", in H. Leblanc (ed), Truth, Syntax and Semantics, North-Holland, Amsterdam, 194-243.
[28] R. Routley and R. K. Meyer (1972), "The Semantics of Entailment, II and III", Journal of Philosophical Logic 1, 53-73 and 192-208.
[29] H. Wansing (1993), "Informational Interpretation of Substructural Logics", Journal of Logic, Language, and Information 2, 285-308.