MATH20302 Propositional Logic

Mike Prest
School of Mathematics
Alan Turing Building, Room 1.120
[email protected]

April 10, 2015

Contents

Part I  Propositional Logic

1 Propositional languages
  1.1 Propositional terms
  1.2 Valuations
  1.3 Beth trees
  1.4 Normal forms
  1.5 Adequate sets of connectives
  1.6 Interpolation

2 Deductive systems
  2.1 A Hilbert-style system for propositional logic
    2.1.1 Soundness
    2.1.2 Completeness
  2.2 A natural deduction system for propositional logic

Part II  Predicate Logic

3 A brief introduction to predicate logic: languages and structures
  3.1 Predicate languages
  3.2 The basic language
  3.3 Enriching the language
  3.4 L-structures
  3.5 Some basic examples
  3.6 Definable Sets

If you come across any typos or errors, here or in the examples/solutions, please let me know of them.


Introduction: The domain of logic

By logic I mean either propositional logic (the logic of combining statements) or first-order predicate logic (a logic which can be used for constructing statements). This course is mostly about the former; we will, however, spend some time on predicate logic in the later part of the course. In any case, propositional logic is a part of predicate logic so we must begin with it. Predicate Logic is dealt with thoroughly in the 3rd/4th-year course by that title; other natural follow-on courses from this one are Model Theory and Non-Standard Logics.

Propositional logic can be seen as expressing the most basic “laws of thought” which are used not just in mathematics but also in everyday discourse. Predicate logic, which can also be thought of as “the logic of quantifiers”, is strong enough to express essentially all formal mathematical argument.

Most of the examples that we will use are taken from mathematics but we do use natural language examples to illustrate some of the basic ideas. The natural language examples will be rather “bare”, reflecting the fact that these formal languages can capture only a small part of the meanings and nuances of ordinary language. There are logics which capture more of natural language (modality, uncertainty, etc.) though these have had little impact within mathematics itself (as opposed to within philosophy and computer science), because predicate logic is already enough for expressing the results of mathematical thinking.1

1 One should be clear on the distinction between the formal expression of mathematics (which is as precise and as formal as one wishes it to be) and the process of mathematical thinking and informal communication of mathematics (which uses mental imagery and all the usual devices of human communication).


Part I

Propositional Logic


Chapter 1

Propositional languages

1.1 Propositional terms

Propositional logic is the logic of combining already formed statements. It begins with careful and completely unambiguous descriptions of how to use the “propositional connectives”, which are “and”, “or”, “not”, “implies”. But first we should be clear on what is meant by a “statement” (the words “assertion” and “proposition” will be used interchangeably with “statement”). The distinguishing feature of a statement is that it is either true or false. “The moon is made of cheese” is a (false) statement and “1 + 1 = 2” is a (true, essentially by definition) statement. Fortunately, in order to deal with the logic of statements, we do not need to know whether a given statement is true or false: it might not be immediately obvious whether “113456 × 65421 = 880459536” is true or false but certainly it is a statement.

A more interesting example is “There are infinitely many prime pairs.”, where by a prime pair we mean a pair, p, p + 2, of numbers, two apart, where both are prime (for instance 3 and 5 form a prime pair, as do 17 and 19 but not 19 and 21). It is a remarkable fact that, to date, no-one has been able to determine whether this statement is true or false. Yet it is surely1 either false (after some large enough number there are no more prime pairs) or true (given any prime pair there is always a larger prime pair somewhere out there).

1 There are some issues there but they are more philosophical than mathematical.

On the other hand, the following are not statements.
“Is 7 a prime number?”
“Add 1 and 1.”
The first is a question, the second a command.

What about “x is a prime number.”: is this a statement? The answer is, “It depends.”: if the context is such that x already has been given a value then it will be a statement (since then either x is a prime number or it is not) but otherwise, if no value (or other sufficient information) has been assigned to x, then it is not a statement. Here’s a silly example (where we can’t tell whether something is a statement or not). Set x = 7 if there are infinitely many prime pairs but leave the value of

x unassigned if there are not. Is “x is a prime number” a statement? Answer: (to date) we can’t tell! But this example is silly (the context we have set up is highly artificial) and quite off the path of what we will be doing.

When we discuss mathematical properties of, for instance, numbers, we use variables, x, y to stand for these numbers. This allows us to make general assertions. So we can say “for all integers x, y we have x + y = y + x” instead of listing all the examples of this assertion: ..., “0 + 1 = 1 + 0”, “1 + 1 = 1 + 1”, ..., “2 + 5 = 5 + 2”, ... (not that we could list all the assertions covered by this general assertion, since there are infinitely many of them). In the same way, although we will use particular statements as examples, most of the time we use variables p, q, r, s, t to stand for statements in order that we may make general assertions.2

2 You might notice that in this paragraph I assigned different uses to the words “assertion” and “statement” (although earlier I said that I would use these interchangeably). This is because I was making statements about statements. That can be confusing, so I used “assertion” for the first (more general, “meta”, “higher”) type of use and “statement” for the second type of use. In logic we make statements about statements (and even statements about statements which are themselves statements about statements ...).

As indicated already, propositional logic is the logic of “and”, “or”, “not”, “implies” (as well as “iff” and other combinations of the connectives). The words in quotes are propositional connectives: they operate on propositions (and propositional variables) to give new propositions. Initially we define these connectives somewhat informally in order to emphasise their intuitive meaning. Then we give their exact definition after we have been more precise about the context and have introduced the idea of (truth) valuation.

First, notation: we write ∧ for “and”, ∨ for “or”, ¬ for “not”, → for “implies” and ↔ for “iff”. So if p is the proposition “the moon is made of cheese” and q is the proposition “mice like cheese” then p ∧ q, p ∨ q, ¬p, p → q, p ↔ q respectively may be read as “the moon is made of cheese and mice like cheese”, “the moon is made of cheese or mice like cheese”, “the moon is not made of cheese”, “if the moon is made of cheese then mice like cheese” and “the moon is made of cheese iff mice like cheese”.

A crucial observation is that the truth value (true or false) of a statement obtained by using these connectives depends only on the truth values of the “component propositions”. Check through the examples given to see if you agree (you might have some doubts about the last two examples: we could discuss these). For another example, you may not know whether or not the following are true statements: “the third homology group of the torus is trivial”, “every artinian unital ring is noetherian”, but you know that the combined statement “the third homology group of the torus is trivial and every artinian unital ring is noetherian” is true exactly if each of the separate statements is true. That is why it makes sense to apply these propositional connectives to propositional variables as well as to propositions.

So now the formal definition. We start with a collection, p, q, r, p0, p1, ... of symbols which we call propositional variables. Then we define, by induction, the propositional terms by the following clauses:

(0) every propositional variable is a propositional term;
(i) if s and t are propositional terms then so are: s ∧ t, s ∨ t, ¬s, s → t, s ↔ t;
(ii) that’s it (more formally, there are no propositional terms other than those which are so by virtue of the first two clauses).

The terms seen in (i) are respectively called the conjunction (s ∧ t) and disjunction (s ∨ t) of s and t, ¬s is the negation of s, s → t is an implication and s ↔ t a biimplication.

Remark: Following the usual convention in mathematics we will use symbols such as p, q, respectively s, t, not just for individual propositional variables, respectively propositional terms, but also as variables ranging over propositional variables, resp. propositional terms (as we did just above).

The definition above is an inductive one, with (0) being the base case and (i) the induction step(s), but it has a more complicated inductive structure than that which uses the natural numbers as “indexing structure”. For there are many base cases (any propositional variable), not just one (0 in ordinary induction), and there are (as given) five types of inductive step, not just one (“add 1” in ordinary induction).

Example 1.1.1. Start with propositional variables p, q, r; these are propositional terms by clause (0) and then, by clause (i), so are p ∧ p, p ∧ q, ¬q, q → p for instance. Then, by clause (i) again, (p ∧ p) ∧ p, (p ∧ q) → ¬r, (q → p) → (q → p) are further propositional terms. Further applications of clause (i) allow us to build up more and more complicated propositional terms. So you can see that these little clauses have large consequences.

The last clause (ii) simply says that every propositional term has to be built up in this way.

Notice how we have to use parentheses to write propositional terms. This is just like the use in arithmetic and algebra: without parentheses the expression (−3 + 5) × 4 would read −3 + 5 × 4 and the latter is ambiguous. At least, it would be if we had not become used to the hierarchy of arithmetical symbols by which (unary) − binds more closely than × and ÷, and those bind more closely than + and −. Of course parentheses are still needed but such a hierarchy reduces the number needed and leads to easier readability. A similar hierarchy is used for propositional terms, by which ¬ binds more closely than ∧ and ∨, which bind more closely than → and ↔ (at least those are my conventions, but they are not universal). Therefore ¬p ∧ q → r means ((¬p) ∧ q) → r rather than ¬(p ∧ q) → r or (¬p) ∧ (q → r) or ¬(p ∧ (q → r)) (at least it does to me; if in doubt, put in more parentheses).

You will recall that in order to prove results about things which are defined by induction (on the natural numbers) it is often necessary to use proof by induction. The same is true here: one deals with the base case (propositional variables) then the inductive steps. In this case there are five different types of inductive step but we’ll see later that some of the propositional connectives can be defined in terms of the others. For instance, using ∧ and ¬ (or using → and ¬) we can define all the others. Having made that observation, we then need only prove the inductive steps for ∧ and ¬ (or for → and ¬). Proofs of assertions about propositional terms which follow their inductive construction are often called “proofs by induction on complexity (of terms)”.

If we wish to be more precise about the set of propositional variables that we

are using then we will let L (“L” for “language”) denote the set of propositional variables. We also introduce notation for the set of propositional terms built up from these, namely: set S₀L = L and, having inductively (on n) defined the set SₙL, we define Sₙ₊₁L to be the set of all propositional terms which may be built from SₙL using a single propositional connective and, so as to make this process cumulative, we also include SₙL in Sₙ₊₁L. More formally:

Sₙ₊₁L = SₙL ∪ {(s ∧ t), (s ∨ t), (¬s), (s → t), (s ↔ t) : s, t ∈ SₙL}.

We also set SL = $\bigcup_{n \geq 0} S_n L$ to be the union of all these - the set of all propositional terms (sometimes called sentences, hence the “S” in “SL”) which can be built up from the chosen base set, L = S₀L, of propositional variables.3

3 A word about notation: I will tend to use p, q, r for propositional variables, s, t, u for propositional terms (which might or might not be propositional variables) and v for valuations (see later). That rather squeezes that part of the alphabet so I will sometimes use other parts and/or the Greek alphabet for propositional variables and terms.

Notice that we place parentheses around all the propositional terms we build; we discussed already that leaving these out could give rise to ambiguity in reading them: was “s ∧ t ∨ u” - a term in S₂L - built up by applying ∧ to s, t ∨ u ∈ S₁L or by applying ∨ to s ∧ t, u ∈ S₁L, that is, should it be read as s ∧ (t ∨ u) or as (s ∧ t) ∨ u? In practice we can omit some pairs of parentheses without losing unique readability but, formally, those pairs are there.

In fact, although intuitively it might at first seem obvious that if we look at a propositional term in SL, then we can figure out how it was constructed - that is, there is a unique way of reading it - a bit more thought reveals that there is an issue: how do we detect the “last” connective in its construction? Clearly, if we can do that then we can proceed inductively to reconstruct its “construction tree”. We have been precise in setting things up so we should be able to prove unique readability (if it is true - which it is, as we show now).

Theorem 1.1.2. Let s ∈ SL be any propositional term. Then exactly one of the following is the case:
(a) s is a propositional variable;
(b) s has the form (t ∧ u) for some t, u ∈ SL;
(c) s has the form (t ∨ u) for some t, u ∈ SL;
(d) s has the form (¬t) for some t ∈ SL;
(e) s has the form (t → u) for some t, u ∈ SL;
(f) s has the form (t ↔ u) for some t, u ∈ SL.

Proof. Every propositional term s does have at least one of the listed forms: because s ∈ SL it must be that s ∈ SₙL for some n and then, just by the definitions of S₀L and Sₙ₊₁L, s does have such a form. We have to show that it has a unique such form. For this we introduce two lemmas and the following definitions: if s ∈ SL then by l(s) we denote the number of left parentheses, “(”, occurring in s and by r(s) we denote the number of right parentheses, “)”, occurring in s (for purposes of this definition we count all the parentheses that should be there).

Lemma 1.1.3. For every propositional term s we have l(s) = r(s).

Proof. This is an example of a proof by induction on complexity/construction of terms.

If s ∈ S₀L then l(s) = 0 = r(s) so the result is true if s ∈ S₀L. For the induction step, suppose that for every s ∈ SₙL we have l(s) = r(s). Let s ∈ Sₙ₊₁L; then either there is t ∈ SₙL such that s = (¬t) or there are t, u ∈ SₙL such that s = (t ∧ u) or (t ∨ u) or (t → u) or (t ↔ u). Since t, u ∈ SₙL, we have l(t) = r(t) and l(u) = r(u) by the inductive assumption. In the first case, s = (¬t), it follows that l(s) = 1 + l(t) = 1 + r(t) = r(s), as required. In the second case, s = (t ∧ u), we have, on counting parentheses, l(s) = 1 + l(t) + l(u) and r(s) = r(t) + r(u) + 1, and so l(s) = r(s), as required. The other three cases are similar and so we see that in all cases, l(s) = r(s). Thus the inductive step is proved and so is the lemma. □

Digression on proof by induction on complexity. At the start of the proof of 1.1.3 above I said that the proof would be by induction on complexity of terms but you might have felt that the proof was shaped as a proof by induction on the natural numbers N = {0, 1, 2, . . . }. That’s true; we used the sets SₙL to structure the proof, and the proof by induction on complexity of terms was reflected in the various subcases that were considered when going from SₙL to Sₙ₊₁L. But the proof could have been given without reference to the sets SₙL. The argument - the various subcases - would be essentially the same; the hitherto missing ingredient is the statement of the appropriate Principle of Induction. Recall that, for N, it takes the form “Given a statement P(n), depending on n ∈ N, if P(0) is true and if from P(n) we can prove P(n + 1), then P(n) is true for every n ∈ N.”4 The corresponding statement for our “construction tree” for propositional terms is: “Given a statement P(s), depending on s ∈ SL, if P(p) is true for every propositional variable p and if, whenever P(s) and P(t) are true so are P(s ∧ t), P(s ∨ t), P(¬s), P(s → t) and P(s ↔ t), then P(s) is true for every s ∈ SL.”

4 I follow the convention that 0 is a natural number; not followed by everyone but standard in mathematical logic.

Before the next lemma, notice that every propositional term can be thought of simply as a string of symbols which, individually, are either: propositional variables (p, q etc.), connectives (∧, ∨, ¬, →, ↔), or parentheses (left, right). Then the statement that s, as a string, is, for instance, xyz will mean that x, y, z are strings and, if we place them next to each other in the given order, then we get s. For instance if s′ is ¬¬(s ∧ (t ∨ u)) then we could write s′ as xyz where x, y, z are the strings x = ¬, y = ¬(s, z = ∧(t ∨ u)); we could even write s′ = xyzw with x, y, z as before and w the empty string (which we allow). We define the length, lng(x), of any string x to be the number of occurrences of symbols in it. We extend the notations l(x) and r(x) to count the numbers of left parentheses, respectively right parentheses, in any string x. If the string x has the form yz then we say that y is a left subword of x, a proper subword if z ≠ ∅; similarly z is a right subword of x, proper if y is not the empty string. (We will use the terms “string” and “word” interchangeably.)

Proposition 1.1.4. For every propositional term s, either s is a propositional variable or there is just one way of writing s in either of the forms s = (¬t) for some propositional term t or s = (t ∗ u) for some propositional terms t, u where ∗ is one of the binary propositional connectives.

Proof. We can suppose that s is not a propositional variable. Note that if s has the form (¬t) then the leftmost symbols of s are (¬, whereas if s has the form (t ∗ u) then its leftmost symbols are (( or (p where p is a propositional variable, so we can treat these two cases entirely separately.

In the first case, s = (¬t), this is the only possible way of writing it in this form because t is determined by s. Therefore, since, as we observed above, it cannot be written in the form (t ∗ u), there is no other way of writing s as a propositional term.

In the second case, we argue by contradiction and suppose that we can write s = (t ∗ u) = (t′ ∗′ u′) where t, u, t′, u′ are propositional terms and ∗, ∗′ are propositional connectives and, for the contradiction, that these are not identical ways of writing s, hence that either t is a proper left subword of t′ or t′ is a proper left subword of t. A contradiction will follow immediately once we have proved the following lemma. □

Lemma 1.1.5. If s is a propositional term and if s′ is a proper left subword of s then either s′ = ∅ or l(s′) − r(s′) > 0; in particular s′ is not a propositional term. Similarly, if s′′ is a proper right subword of s then either s′′ = ∅ or r(s′′) − l(s′′) > 0, and s′′ is not a propositional term.

Proof. We know that s has the form (¬t) or (t ∗ u). In the first case, s′ has one of the forms ∅, ( or (¬t′ where t′ is a left subword (possibly empty) of t. By induction on lengths of propositional terms we can assume that t′ = ∅ or l(t′) − r(t′) ≥ 0 (“>” if t′ is a proper left subword of t, “=” by 1.1.3 in the case t′ = t) and so, in each case, it follows that l(s′) − r(s′) > 0. In the second case, s′ has one of the forms ∅, (, (t′ where t′ is a left subword of t, or (t ∗ u′ where u′ is a left subword of u. Again by induction on lengths of propositional terms we can assume that l(t′) − r(t′) ≥ 0 and l(u′) − r(u′) ≥ 0; checking each case, it follows that l(s′) − r(s′) > 0. By 1.1.3 we deduce that s′ is not a propositional term. Similarly for the assertion about right subwords. □
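For readers who like to see such definitions implemented, here is a small sketch in Python (not part of the notes; the nested-tuple representation and the function names are just one illustrative choice). It builds terms following clauses (0) and (i), renders them fully parenthesised, and checks the conclusion of Lemma 1.1.3, l(s) = r(s), on an example.

```python
# Terms as nested tuples, e.g. ("and", "p", ("not", "q")) stands for (p ∧ (¬q)).
# Representation and names are illustrative only.

BINARY = {"and": "∧", "or": "∨", "implies": "→", "iff": "↔"}

def show(term):
    """Render a term fully parenthesised, following clauses (0) and (i)."""
    if isinstance(term, str):              # clause (0): a propositional variable
        return term
    if term[0] == "not":                   # clause (i): (¬s)
        return "(¬" + show(term[1]) + ")"
    op, s, t = term                        # clause (i): (s ∧ t), (s ∨ t), ...
    return "(" + show(s) + " " + BINARY[op] + " " + show(t) + ")"

def left_right_parentheses(term):
    """Return (l(s), r(s)); Lemma 1.1.3 says these are always equal."""
    rendered = show(term)
    return rendered.count("("), rendered.count(")")

if __name__ == "__main__":
    t = ("implies", ("and", "p", "q"), ("not", "r"))   # ((p ∧ q) → (¬r))
    print(show(t))
    print(left_right_parentheses(t))                   # (3, 3)
```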

1.2 Valuations

Now for the key idea of a (truth) valuation. Fix some set L = S₀L of propositional variables, and hence the corresponding set SL of propositional terms. A valuation on the set of propositional terms is a function v : SL → {T, F} to the 2-element set5 {T, F} which satisfies the following conditions.6 For all propositional terms s, t we have:

v(s ∧ t) = T iff v(s) = T and v(t) = T;
v(s ∨ t) = T iff v(s) = T or v(t) = T;
v(¬s) = T iff v(s) = F;
v(s → t) = T iff v(s) = F or v(t) = T;
v(s ↔ t) = T iff the values of v(s) and v(t) are the same: v(s) = v(t).

5 Really, the two-element boolean algebra.
6 Of course, T represents “true” and F “false”. Often the 2-element set {1, 0} is used instead, normally with 1 representing “true” and 0 representing “false”.

There’s quite a lot to say about this definition. We start with a key point. Namely, because all propositional terms are built up from the propositional variables using the propositional connectives, any valuation is completely determined by its values on the propositional variables (this, see 1.2.1(b) below, is the formal statement of the point we made (the “crucial observation”) when discussing mice, cheese and homology groups). For instance if v(p) = v(q) = T and v(r) = F then we have, since v is a valuation, v(p ∨ r) = T and hence v(¬(p ∨ r)) = F. Similarly, for any propositional term, t, built from p, q and r, the value v(t) is determined by the above choices of v(p), v(q), v(r).

That does actually need proof. There is the, rather obvious and easily proved by induction, point that this process works (in the sense that it gives a value), but there’s also the more subtle point that if there were more than one way of building up a propositional term then, conceivably, one construction route might lead to the value T and the other to F. But we have seen already in 1.1.4 that this does not, in fact, happen: every propositional term has a unique “construction tree”. Therefore if v₀ is a function from the set, S₀L, of propositional variables to the set {F, T} then this extends to a unique valuation v on SL. In particular, if there are n propositional variables there will be 2ⁿ valuations on the propositional terms built from them. We state this formally.

Proposition 1.2.1. Let L be a set of propositional variables.
(a) If v₀ : L → {F, T} is any function then there is a valuation v : SL → {F, T} on propositional terms in L such that v(p) = v₀(p) for every p ∈ L.
(b) If v and w are valuations on SL and if v(p) = w(p) for all p ∈ L then v = w (so the valuation in part (a) is unique).
(c) If t is a propositional term and if v and w are valuations which agree on all propositional variables occurring in t then v(t) = w(t).

The proof of part (c), which is a slight strengthening of (b), is left as an exercise. In order to prove it we could prove the following statement first (by induction on complexity of terms): if L′ ⊆ L are sets of propositional variables then for every n, SₙL′ ⊆ SₙL; furthermore, if v′ is a valuation on SL′ and v is a valuation on SL such that v(p) = v′(p) for every p ∈ L′ then v(t) = v′(t) for every t ∈ SL′. From that, part (c) follows easily (take L′ to be the set of propositional variables actually occurring in t). (You might have noticed that I didn’t actually define what I mean by a propositional variable occurring in a propositional term; I hope the meaning is clear but it is easy to give a definition by, what else, induction on complexity of terms.)

Truth tables are tables showing evaluation of valuations on propositional terms. They can also be used to show the effect of the propositional connectives on truth values. Note that “or” is used in the inclusive sense (“one or the other or both”) rather than the exclusive sense (“one or the other but not both”).

p q   p ∨ q          p q   p ∧ q
T T     T            T T     T
T F     T            T F     F
F T     T            F T     F
F F     F            F F     F

p q   p → q          p q   p ↔ q          p   ¬p
T T     T            T T     T            T    F
T F     F            T F     F            F    T
F T     T            F T     F
F F     T            F F     T

You might feel that the truth table for → does not capture what you consider to be the meaning of “implies” but, if we are to regard it as a function on truth values (whatever the material connection or lack thereof between its “input” propositions) then the definition given is surely the right one. Or just regard p → q as an abbreviation for ¬p ∨ q, “(not-p) or q”, since they have the same truth tables. The following example might make the reading of p → q as meaning ¬p ∨ q reasonable: let p be “n = 1” and let q be “(n − 1)(n − 2) = 0”, so p → q reads “n = 1 implies (n − 1)(n − 2) = 0” or “If n = 1 then (n − 1)(n − 2) = 0”, and then consider setting n = 1, 2, 3, . . . in turn and think about the truth values of p, q and p → q.

You will have seen examples of truth tables in the first year Sets, Numbers and Functions course. Recall that they can be used to determine whether a propositional term t is a tautology, meaning that v(t) = T for every valuation v. The “opposite” notion is: if v(t) = F for every valuation v then we say that t is unsatisfiable (also called “a contradiction” though that’s not good terminology to use when we’ll be drawing the distinction between syntax and semantics). Notice that the use of truth tables implicitly assumes part (c) of 1.2.1.

We say that two propositional terms, s and t, are logically equivalent, and write s ≡ t, if v(s) = v(t) for every valuation v. It is equivalent to say that s ↔ t is a tautology. Let’s prove that. Suppose s ≡ t so, if v is any valuation, then v(s) = v(t) so, from the definition of valuation, v(s ↔ t) = T. This is so for every valuation so, by the definition of tautology, s ↔ t is a tautology. For the converse, suppose that s ↔ t is a tautology and let v be any valuation. Then v(s ↔ t) = T and so (again, by the definition of valuation) v(s) = v(t). Thus, by definition of equivalence, s and t are logically equivalent. We see that the proof was just an easy exercise from the definitions.

Now for the semantic notion of entailment; we contrast “semantics” (“meaning” or, at least, notions of being true and false) with “syntax” (construction and manipulation of strings of symbols). If S is a set of propositional terms and t is a propositional term then we write S |= t if for every valuation v with v(S) = T, by which we mean v(s) = T for every s ∈ S, we have v(t) = T: “whenever S is true so is t”.

Extending the above notions we say that a set S of propositional terms is tautologous if v(S) = T for every valuation v, and S is unsatisfiable if for every valuation v there is some s ∈ S with v(s) = F - in other words, if no valuation makes all the terms in S true. We also say that S is satisfiable if there is at least one valuation v with v(S) = T. So note: tautologous means every valuation makes all terms in S true; satisfiable means that some valuation makes all terms in S true; unsatisfiable means that no valuation makes all terms in S true.

Lemma 1.2.2. Let S be a set of propositional terms and let t, t′, u be propositional terms.

(a) S |= t iff S ∪ {¬t} is unsatisfiable.
(b) S ∪ {t} |= u iff S |= t → u.
(c) S ∪ {t, t′} |= u iff S ∪ {t ∧ t′} |= u.

Proof. These are all simple consequences of the definitions. Before we begin, we introduce a standard and slightly shorter notation: instead of writing S ∪ {t₁, . . . , tₖ} |= u we write S, t₁, . . . , tₖ |= u.

(a) S ∪ {¬t} is unsatisfiable
iff for all valuations v, we have v(s) = F for some s ∈ S or v(¬t) = F
iff for all valuations v, if v(s) = T for all s ∈ S then v(¬t) = F
iff for all valuations v, if v(s) = T for all s ∈ S then v(t) = T
iff S |= t.

(b) S ∪ {t} |= u
iff for every valuation v, if v(s) = T for all s ∈ S and v(t) = T then v(u) = T
iff for every valuation v with v(s) = T for all s ∈ S, if v(t) = T then v(u) = T
iff for every valuation v with v(s) = T for all s ∈ S, v(t → u) = T (by the truth table for “→”)
iff S |= t → u.

(c) S ∪ {t ∧ t′} |= u
iff for every valuation v with v(s) = T for all s ∈ S and v(t ∧ t′) = T we have v(u) = T
iff (by the truth table for ∧) for every valuation v with v(s) = T for all s ∈ S and v(t) = T and v(t′) = T, we have v(u) = T
iff S ∪ {t, t′} |= u. □

We can use truth tables to determine whether or not S |= u (assuming S is a finite (and, in practice, not very large) set) but this can take a long time: if there are n propositional variables appearing then we need to compute a truth table with 2ⁿ rows. The next section describes a method which sometimes is more efficient.
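The truth-table test for S |= t described above is completely mechanical; here is an illustrative Python sketch (not part of the notes, and all names are my own choices). A valuation is a dictionary on the propositional variables, it is extended to terms by recursion exactly as in the definition of valuation, and S |= t is checked by running through all 2ⁿ valuations on the variables involved.

```python
from itertools import product

# Terms as nested tuples, e.g. ("implies", ("and", "p", "q"), "r").

def variables(term):
    """The set of propositional variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(sub) for sub in term[1:]))

def ev(term, v):
    """Extend the valuation v (a dict variable -> bool) to the term."""
    if isinstance(term, str):
        return v[term]
    op = term[0]
    if op == "not":
        return not ev(term[1], v)
    a, b = ev(term[1], v), ev(term[2], v)
    return {"and": a and b,
            "or": a or b,
            "implies": (not a) or b,
            "iff": a == b}[op]

def entails(S, t):
    """Check S |= t by enumerating all 2**n valuations on the variables involved."""
    vs = sorted(set().union(variables(t), *(variables(s) for s in S)))
    for values in product([True, False], repeat=len(vs)):
        v = dict(zip(vs, values))
        if all(ev(s, v) for s in S) and not ev(t, v):
            return False
    return True

def is_tautology(t):
    return entails([], t)

if __name__ == "__main__":
    # The entailment discussed in Example 1.3.1 (next section):
    # ¬r → (q → p) does not follow from ¬p and (p ∧ q) → r.
    S = [("not", "p"), ("implies", ("and", "p", "q"), "r")]
    t = ("implies", ("not", "r"), ("implies", "q", "p"))
    print(entails(S, t))                                       # False
    print(is_tautology(("implies", ("and", "p", "q"), "p")))   # True
```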

1.3 Beth trees

Beth trees provide a method, often more efficient and perhaps more interesting than truth tables, of testing whether a collection of propositional terms is satisfiable or not (and, if it is satisfiable, of giving a valuation demonstrating this). Note that this includes testing whether a propositional term is a tautology, whether one term implies another, whether S |= t, et cetera.

The input to the method consists of two sets S, T of propositional terms; to distinguish between these we will write the typical input as S|T. The output will, if we carry the method to its conclusion (which for some purposes will be more than we need to do), be all valuations with v(S) = T and v(T) = F. So if the output is nonempty then we know that S ∪ {¬t : t ∈ T} is satisfiable. For instance, t is a tautology exactly if the output from the pair ∅|{t} is empty (which often will be easier than checking whether the output of {t}|∅ is all valuations).

The actual computation has the form of a tree (as usual in mathematics, trees grow downwards) and, at each node of the tree, there will be a pair of the form S′|T′. A node (of a fully or partially-computed Beth tree) is terminal if it has no node beneath it. A node is a leaf if all the propositional terms at it are propositional variables. Directly underneath each non-terminal node is either a branch segment with another node at its end, or two branch segments with a node at the end of each. A key feature of the tree is that if a node lies under another then the lower one contains fewer propositional connectives. That means that if the initial data contains k propositional connectives then no branch can contain more than k + 1 nodes. And that means that the computation of the tree will terminate. Before we describe how to compute such trees here, in order to anchor ideas, is an example.

Example 1.3.1. We determine whether or not ¬p, (p ∧ q) → r |= ¬r → (q → p). We will build a tree beginning with the input ¬p, (p ∧ q) → r | ¬r → (q → p), since there will be a valuation satisfying this condition exactly if ¬p, (p ∧ q) → r |= ¬r → (q → p) does not hold. The first few steps do not branch:

¬p, (p ∧ q) → r | ¬r → (q → p)
(p ∧ q) → r | p, ¬r → (q → p)
¬r, (p ∧ q) → r | p, q → p
¬r, q, (p ∧ q) → r | p, p
q, (p ∧ q) → r | r, p

The last of these nodes branches into the two nodes

q, r | r, p        and        q | r, p, p ∧ q

and the right-hand of these branches in turn into the two leaves

q | q, r, p        and        q | r, p.

The property (∗) below implies that a valuation v satisfies the input conditions (making both ¬p and (p ∧ q) → r true but making ¬r → (q → p) false) iff it satisfies at least one of the leaves. But we can see immediately that the only leaf satisfied by any valuation is q | r, p, which is satisfied by the valuation v with v(q) = T, v(r) = F, v(p) = F. So there is a valuation making both ¬p and (p ∧ q) → r true but making ¬r → (q → p) false. That is, ¬r → (q → p) does not follow from ¬p and (p ∧ q) → r.

We will list the allowable rules for generating the nodes directly under a given node. To make sense of these, we first explain the idea. The property that we want is the following:

(∗) If, at any stage of the construction of the tree with initial node S|T, the currently terminal nodes are S₁|T₁, ..., Sₖ|Tₖ then, for every valuation v, we have v(S) = T and v(T) = F iff v(Sᵢ) = T and v(Tᵢ) = F for some i.

For this section, when I write v(T) = F I mean v(t) = F for every t ∈ T. This is a convenient, but bad (because easily misinterpreted), notation.

In order for this property to hold it is enough to have the following two:

(∗₁) if a node S′|T′ is immediately followed by a single node S₁|T₁ then, for every valuation v we have v(S′) = T and v(T′) = F iff v(S₁) = T and v(T₁) = F;

(∗₂) if a node S′|T′ is immediately followed by the nodes S₁|T₁ and S₂|T₂ then, for every valuation v we have: v(S′) = T and v(T′) = F iff [v(S₁) = T and v(T₁) = F] or [v(S₂) = T and v(T₂) = F].

(The fact that these are enough can be proved by an inductive argument.)

In the pair S|T you can think of the left hand side as the “positive” statements and those on the right as the “negative” ones. Each rule involves either moving one term between the positive and negative sides (with appropriate change to the term) or splitting one pair into two. Here are the allowable rules.

(∧ on the left)    S, s ∧ t | T    is followed by the single node    S, s, t | T

(∧ on the right)   S | s ∧ t, T    is followed by the two nodes    S | s, T    and    S | t, T

(∨ on the left)    S, s ∨ t | T    is followed by the two nodes    S, s | T    and    S, t | T

(∨ on the right)   S | s ∨ t, T    is followed by the single node    S | s, t, T

(¬ on the left)    S, ¬t | T    is followed by the single node    S | t, T

(¬ on the right)   S | ¬t, T    is followed by the single node    S, t | T

(→ on the left)    S, s → t | T    is followed by the two nodes    S | s, T    and    S, t | T

(→ on the right)   S | s → t, T    is followed by the single node    S, s | t, T

In lectures we will explain a few of these but you should think through why each one is valid (that is, satisfies (∗1 ) or (∗2 ), as appropriate). You should also note that they cover all the cases - together they allow a single pair to be input and will output a tree where every terminal node is a leaf. When constructing a Beth tree there may well be some nodes where there is a choice as to which rule to apply but no choice of applicable rule is wrong (though some choices might lead to a shorter computation). Example 1.3.2. We use Beth trees to show that p ∧ q → p is a tautology. We already suggested that it might be easier to do the equivalent thing of showing that ¬(p ∧ q → p) is unsatisfiable; here’s the computation for that.


∅ | p ∧ q → p
p ∧ q | p
p, q | p

and clearly no valuation can make both p, q true but make p false; we conclude that p ∧ q → p is a tautology.

For comparison here is the direct check that p ∧ q → p is a tautology. Starting from p ∧ q → p | ∅, the rule for → on the left gives the two nodes p | ∅ and ∅ | p ∧ q, and the latter branches (by the rule for ∧ on the right) into the two leaves ∅ | p and ∅ | q. Now note that every valuation satisfies the condition expressed by at least one of the leaves, so p ∧ q → p is indeed a tautology.
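The rules above translate directly into a recursive procedure. The following Python sketch (not part of the notes; names and representation are illustrative, and ↔ is omitted since it is not among the listed rules) computes the leaves of the Beth tree for an input S|T and keeps only those leaves which some valuation satisfies.

```python
def beth(S, T):
    """Return all leaves S'|T' of the Beth tree with initial node S|T.

    S and T are collections of terms (nested tuples as before); a leaf is
    returned as a pair (variables forced true, variables forced false)."""
    S, T = set(S), set(T)
    compound = next((t for t in S | T if not isinstance(t, str)), None)
    if compound is None:                      # every term is a variable: a leaf
        return [(frozenset(S), frozenset(T))]
    op = compound[0]
    if compound in S:
        S = S - {compound}
        if op == "and":       # S, s∧t | T  becomes  S, s, t | T
            return beth(S | {compound[1], compound[2]}, T)
        if op == "or":        # S, s∨t | T  becomes  S, s | T  and  S, t | T
            return beth(S | {compound[1]}, T) + beth(S | {compound[2]}, T)
        if op == "not":       # S, ¬t | T  becomes  S | t, T
            return beth(S, T | {compound[1]})
        if op == "implies":   # S, s→t | T  becomes  S | s, T  and  S, t | T
            return beth(S, T | {compound[1]}) + beth(S | {compound[2]}, T)
    else:
        T = T - {compound}
        if op == "and":       # S | s∧t, T  becomes  S | s, T  and  S | t, T
            return beth(S, T | {compound[1]}) + beth(S, T | {compound[2]})
        if op == "or":        # S | s∨t, T  becomes  S | s, t, T
            return beth(S, T | {compound[1], compound[2]})
        if op == "not":       # S | ¬t, T  becomes  S, t | T
            return beth(S | {compound[1]}, T)
        if op == "implies":   # S | s→t, T  becomes  S, s | t, T
            return beth(S | {compound[1]}, T | {compound[2]})
    raise ValueError("connective not covered by the listed rules: " + str(op))

def satisfying_leaves(S, T):
    """Leaves with no variable required to be both true and false."""
    return [(pos, neg) for pos, neg in beth(S, T) if not pos & neg]

if __name__ == "__main__":
    # Example 1.3.1: a valuation making ¬p and (p ∧ q) → r true but
    # ¬r → (q → p) false does exist (q true; p, r false).
    S = [("not", "p"), ("implies", ("and", "p", "q"), "r")]
    T = [("implies", ("not", "r"), ("implies", "q", "p"))]
    print(satisfying_leaves(S, T))   # e.g. [(frozenset({'q'}), frozenset({'p', 'r'}))]
```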

1.4 Normal forms

First, we look at some more basic properties of logical equivalence where, recall, two propositional terms s, t are said to be logically equivalent, s ≡ t, if v(s) = v(t) for every valuation v (and by 1.2.1(c) it is enough to check for valuations on just the propositional variables actually occurring in s or t).

Lemma 1.4.1. If s, t are propositional terms then the following are equivalent:
(i) s ≡ t;
(ii) s |= t and t |= s;
(iii) |= s ↔ t;
(iv) s ↔ t is a tautology.

Proof. All this is immediate from the definitions. For instance, to prove (iv)⇒(i) let v be any valuation; then, assuming (iv), v(s ↔ t) = T and, by definition of valuation, we see this can happen only if v(s) = v(t), as required. □

Note that ≡ is an equivalence relation on SL; that is, it is reflexive (s ≡ s), symmetric (s ≡ t implies t ≡ s) and transitive (s ≡ t and t ≡ u together imply s ≡ u).

Here are some, easily checked, basic logical equivalences. For any propositional terms s, t, u:
s ∧ t ≡ t ∧ s;
s ∨ t ≡ t ∨ s;
¬(s ∧ t) ≡ ¬s ∨ ¬t;
¬(s ∨ t) ≡ ¬s ∧ ¬t;
¬¬s ≡ s;

s → t ≡ ¬s ∨ t;
(s ∧ t) ∧ u ≡ s ∧ (t ∧ u), so we can write s ∧ t ∧ u without ambiguity;
(s ∨ t) ∨ u ≡ s ∨ (t ∨ u), so we can write s ∨ t ∨ u without ambiguity;
(s ∧ t) ∨ u ≡ (s ∨ u) ∧ (t ∨ u);
(s ∨ t) ∧ u ≡ (s ∧ u) ∨ (t ∧ u);
s ∧ s ≡ s;
s ∨ s ≡ s.

Proposition 1.4.2. Suppose that s, s′, t, t′ are propositional terms with s ≡ s′ and t ≡ t′. Then:
(i) ¬s ≡ ¬s′;
(ii) s ∧ t ≡ s′ ∧ t′;
(iii) s ∨ t ≡ s′ ∨ t′;
(iv) s → t ≡ s′ → t′.

Proof. To prove (ii): suppose v(s ∧ t) = T. Then, by the truth table for ∧, both v(s) = T and v(t) = T; so v(s′) = T and v(t′) = T and hence v(s′ ∧ t′) = T. By symmetry (s′ ≡ s and t′ ≡ t) the converse implication holds as well. The other parts are equally easy. □

We introduce notations for multiple conjunctions and disjunctions; they are completely analogous to the use of $\sum$ for repeated +. Given propositional terms s₁, . . . , sₙ we define $\bigwedge_{i=1}^{n} s_i$ by induction: $\bigwedge_{i=1}^{1} s_i = s_1$ and $\bigwedge_{i=1}^{k+1} s_i = \bigwedge_{i=1}^{k} s_i \wedge s_{k+1}$. Similarly we define $\bigvee_{i=1}^{n} s_i$. Because of associativity and commutativity of ∧, respectively of ∨, if we permute the terms in such a repeated conjunction or disjunction, then we obtain an equivalent propositional term. Indeed, we have the following (the proofs of which are left as exercises).

Proposition 1.4.3. If s₁, . . . , sₙ are propositional terms and v is a valuation then:
(i) $v(\bigwedge_{i=1}^{n} s_i) = T$ iff v(sᵢ) = T for all i = 1, . . . , n;
(ii) $v(\bigvee_{i=1}^{n} s_i) = T$ iff v(sᵢ) = T for some i ∈ {1, . . . , n};
(iii) $\bigwedge_{i=1}^{n} s_i \equiv \neg \bigvee_{i=1}^{n} \neg s_i$;
(iv) $\bigvee_{i=1}^{n} s_i \equiv \neg \bigwedge_{i=1}^{n} \neg s_i$.

Proposition 1.4.4. Suppose that s₁, . . . , sₙ and t₁, . . . , tₘ are sequences of propositional terms such that {s₁, . . . , sₙ} = {t₁, . . . , tₘ} (thus the sequences differ only in the order of their terms and possible repetitions). Then $\bigvee_{i=1}^{n} s_i \equiv \bigvee_{j=1}^{m} t_j$ and $\bigwedge_{i=1}^{n} s_i \equiv \bigwedge_{j=1}^{m} t_j$.

If S = {s₁, . . . , sₙ} is a finite set of propositional terms then we write $\bigvee S$ for $\bigvee_{i=1}^{n} s_i$ and $\bigwedge S$ for $\bigwedge_{i=1}^{n} s_i$. What if S = ∅? Since, roughly, the more conjuncts there are in $\bigwedge S$ the harder it is to be true, it makes some sense to define $\bigwedge \emptyset$ to be any tautology (i.e. always true). Dually we define $\bigvee \emptyset$ to be any unsatisfiable term (so false under every valuation). (Because we are only interested in the truth values of $\bigwedge \emptyset$ and $\bigvee \emptyset$ it doesn’t matter which tautology and which contradiction are chosen.)

A little more terminology: given a set L of propositional variables, we refer to any propositional variable p, or any negation, ¬p, of a propositional variable as a literal.

We are going to show that every propositional term is equivalent to one which is in a special form (indeed, there are two special forms: disjunctive and conjunctive).

A propositional term is in disjunctive normal form if it has the form $\bigvee_{i=1}^{n} \bigwedge_{j=1}^{m_i} g_{ij}$ where each $g_{ij}$ is a literal.

Proposition 1.4.5. (Disjunctive Normal Form Theorem) If t ∈ SL then there is a propositional term s ∈ SL which is in disjunctive normal form and such that s ≡ t. If {p₁, . . . , pₖ} are the propositional variables appearing in t then we may suppose that s has the form $\bigvee_{i=1}^{n} \bigwedge_{j=1}^{m_i} g_{ij}$ with each mᵢ ≤ k and with n ≤ 2ᵏ.

Proof. Let v₁, . . . , vₙ be the distinct valuations v on {p₁, . . . , pₖ} such that v(t) = T. For each i = 1, . . . , n and j = 1, . . . , k, set $g_{ij} = p_j$ if $v_i(p_j) = T$ and $g_{ij} = \neg p_j$ if $v_i(p_j) = F$. Note that $v_i(\bigwedge_{j=1}^{k} g_{ij}) = T$ and that if v′ ≠ vᵢ is any other valuation on {p₁, . . . , pₖ} then $v'(\bigwedge_{j=1}^{k} g_{ij}) = F$. It follows that if w is any valuation on {p₁, . . . , pₖ} then $w(\bigvee_{i=1}^{n} \bigwedge_{j=1}^{k} g_{ij}) = T$ iff w is one of v₁, . . . , vₙ. Therefore for any valuation v, $v(\bigvee_{i=1}^{n} \bigwedge_{j=1}^{k} g_{ij}) = v(t)$, so t and $\bigvee_{i=1}^{n} \bigwedge_{j=1}^{k} g_{ij}$ are logically equivalent, as required. For the final statement, note that there are 2ᵏ distinct valuations on {p₁, . . . , pₖ}. □

The proof shows how to go about actually constructing an equivalent propositional term in disjunctive normal form, using either truth tables or, the proof slightly modified, Beth trees.

Example 1.4.6. Consider the propositional term t = (p ∧ q) → (¬p ∨ r). If we construct its truth table then we find 7 rows/valuations on {p, q, r} which make it true. For each of these we form the corresponding “gᵢⱼ”. For instance, the valuation v₁(p) = v₁(q) = v₁(r) = T is one of those making t true and the corresponding term is p ∧ q ∧ r. Another row where t is true is that where p is true and q and r are false, so the corresponding term is p ∧ ¬q ∧ ¬r. Et cetera, giving the disjunctive normal form term

(p ∧ q ∧ r) ∨ (p ∧ ¬q ∧ r) ∨ (p ∧ ¬q ∧ ¬r) ∨ (¬p ∧ q ∧ r) ∨ (¬p ∧ q ∧ ¬r) ∨ (¬p ∧ ¬q ∧ r) ∨ (¬p ∧ ¬q ∧ ¬r)

equivalent to t. Normal forms are, however, not unique and you might note that, for example, the last four disjuncts can be replaced by the logically equivalent term ¬p.

From this point of view, Beth trees are more efficient, as we can illustrate with this example. If we construct a Beth tree starting with p ∧ q | ¬p ∨ r then very quickly we reach the single leaf p, q | r, which corresponds to the single valuation making (p ∧ q) → (¬p ∨ r) false. That corresponds to the term p ∧ q ∧ ¬r, so t is equivalent to the negation of this, namely ¬(p ∧ q ∧ ¬r), which is equivalent to ¬p ∨ ¬q ∨ r - a much simpler disjunctive normal form.

You might instead construct a Beth tree starting from (p ∧ q) → (¬p ∨ r) | ∅. Taking an obvious sequence of steps leads to a completed tree with the leaves ∅ | p, ∅ | q and r | ∅. These correspond to the (conjunctions of) literals ¬p, ¬q, r respectively. Therefore this also leads to the form ¬p ∨ ¬q ∨ r.

The dual form is as follows: a propositional term is in conjunctive normal form if it has the form $\bigwedge_{i=1}^{n} \bigvee_{j=1}^{m_i} g_{ij}$ where each $g_{ij}$ is a literal.


Proposition 1.4.7. If t ∈ SL then there is a propositional term s ∈ SL which is in conjunctive normal form and such that s ≡ t. If {p₁, . . . , pₖ} are the propositional variables appearing in t then we may suppose that s has the form $\bigwedge_{i=1}^{n} \bigvee_{j=1}^{m_i} g_{ij}$ with each mᵢ ≤ k and with n ≤ 2ᵏ.

Proof. The term t is logically equivalent to ¬¬t and, by 1.4.5, ¬t is equivalent to some term of the form $\bigvee_{i=1}^{n} \bigwedge_{j=1}^{m_i} g_{ij}$. So t is equivalent to $\neg \bigvee_{i=1}^{n} \bigwedge_{j=1}^{m_i} g_{ij}$ which, using DeMorgan’s laws (the third and fourth on the list of identities after 1.4.1), is in turn equivalent to $\bigwedge_{i=1}^{n} \bigvee_{j=1}^{m_i} \neg g_{ij}$. Since each ¬gᵢⱼ is a literal (at least once we cancel double negations), the result follows. □
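The construction in the proof of 1.4.5 is easily mechanised. The following sketch (not from the notes; names and representation are illustrative) runs through the valuations of the variables occurring in a term and, for each one making the term true, records the corresponding conjunction of literals; the disjunction of these is a disjunctive normal form equivalent of the term.

```python
from itertools import product

# Terms as nested tuples, as in the earlier sketches.

def variables(term):
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(sub) for sub in term[1:]))

def ev(term, v):
    if isinstance(term, str):
        return v[term]
    op = term[0]
    if op == "not":
        return not ev(term[1], v)
    a, b = ev(term[1], v), ev(term[2], v)
    return {"and": a and b, "or": a or b,
            "implies": (not a) or b, "iff": a == b}[op]

def dnf(term):
    """A disjunctive normal form equivalent to `term`, following the proof of 1.4.5.

    Returned as a list of disjuncts, each disjunct a list of literals
    (a literal is a variable name or ("not", variable))."""
    vs = sorted(variables(term))
    disjuncts = []
    for values in product([True, False], repeat=len(vs)):
        v = dict(zip(vs, values))
        if ev(term, v):
            disjuncts.append([p if v[p] else ("not", p) for p in vs])
    return disjuncts

if __name__ == "__main__":
    t = ("implies", ("and", "p", "q"), ("or", ("not", "p"), "r"))
    for d in dnf(t):     # 7 disjuncts, as in Example 1.4.6
        print(d)
```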

1.5 Adequate sets of connectives

The proof of 1.4.5 actually shows that every truth table on a set, p₁, . . . , pₖ, of propositional variables can be generated from them by using the propositional connectives ∧, ∨, ¬. More precisely, every propositional term t in p₁, . . . , pₖ defines a function, evaluation-at-t, from the set Val_{p₁,...,pₖ} of valuations v on p₁, . . . , pₖ, to {T, F}. Conversely, given any function e : Val_{p₁,...,pₖ} → {T, F}, one may construct, using ∧, ∨ and ¬, a propositional term t such that e is just evaluation at t.

If we change the set of propositional connectives that we are “allowed to use” then we can ask the same question. For instance, using just ∧ and ∨ can we construct every truth table/build a term inducing any given evaluation e? What if we use ¬ and →? And other such questions (the five we have introduced are not the only possible connectives, indeed not even the only ones which occur in nature, or at least in Computer Science, where one also sees NAND = Sheffer stroke, NOR, XOR).7

We say that a set, S, of propositional connectives is adequate if for every propositional term t (in any number of propositional variables) there is a term t′ constructed using just the connectives in S such that t and t′ are “logically equivalent”.8 By “logically equivalent” we mean that they “have the same truth tables” or, a bit more precisely, they define the same function from Val_{p₁,...,pₙ} to {T, F}.

7 We won’t formulate the general question because then we would have to give a general definition of “(n-ary) propositional connective” and would be hard-pressed to distinguish these from propositional terms.
8 Notice that if S includes some “new” propositional connectives then we have to extend our definitions of “propositional term” etc. to allow these. That’s why I used quotation marks just then.

Example 1.5.1. The set {∧, ¬} is adequate. We have already commented that {∧, ∨, ¬} is adequate so we need only note that s ∨ t ≡ ¬(¬s ∧ ¬t).

Example 1.5.2. The NAND gate/operator or Sheffer stroke is a binary (i.e. has two inputs) propositional connective whose effect is as shown in the truth table below.

p q   p|q
T T    F
T F    T
F T    T
F F    T

You can see from this that p|q is logically equivalent to ¬(p ∧ q), hence the name “NAND”. If we take our set of connectives to be just S = {|} then we have to re-define “propositional term” by saying: every propositional variable is a propositional term; if s, t are propositional terms then so is s|t. We can refer to these as “(propositional) terms built using | (only)” and can write S|(L) for the set of such terms.

It is easy to show that {|} is adequate. All we have to do is to show that, given propositional variables p, q, we can find terms using | only which are equivalent to ¬p and p ∧ q - because we know that {¬, ∧} is an adequate set of connectives. Indeed, it is easy to check that p|p is equivalent to ¬p and hence that (p|q)|(p|q) is equivalent to p ∧ q.

Showing that a given set S of connectives is not adequate can take more thought: how can one show that some propositional terms do not have equivalents built only using connectives from S?

Example 1.5.3. One might feel that, intuitively, ∧ and ∨ together are not adequate since they are both “positive”. How can one turn that intuition into a proof? One would like to show, for instance, that no term built only using ∧ and ∨ can be logically equivalent to ¬p but, even using only the single propositional variable p, there are infinitely many propositional terms to check. That might suggest trying some sort of inductive proof. But a proof of what statement? What we can do is to prove by induction on complexity/length of a term that: if t is any term built only from ∧ and ∨ then for every valuation v such that all the propositional variables appearing in t are assigned the value T by v, we also have v(t) = T. Once that is done, we can deduce, in particular, that no term built only using ∧ and ∨ can be equivalent to ¬p.
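The two equivalences used for the Sheffer stroke above can be checked by brute force over the four relevant valuations; a tiny sketch (not from the notes, with illustrative names):

```python
from itertools import product

def nand(a, b):
    return not (a and b)

for p, q in product([True, False], repeat=2):
    assert nand(p, p) == (not p)                      # p|p behaves as ¬p
    assert nand(nand(p, q), nand(p, q)) == (p and q)  # (p|q)|(p|q) behaves as p ∧ q

print("Both equivalences hold on all four valuations.")
```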

1.6 Interpolation

Suppose that s and t are propositional terms and that s |= t, equivalently s → t is a tautology. This could be for the trivial reasons that either s is always false (unsatisfiable) or that t is always true (a tautology). But if that’s not the case then the interpolation theorem guarantees that there is some propositional term u which involves only the propositional variables appearing in both s and t such that s |= u and u |= t. Such a u is referred to as an interpolant between s and t.

Theorem 1.6.1. (Interpolation Theorem) Suppose that s ∈ S(L₁) and t ∈ S(L₂) are such that s |= t. Then either s is unsatisfiable or t is a tautology or there is u ∈ S(L₃), where L₃ = L₁ ∩ L₂, such that s |= u and u |= t.

Proof. We suppose that s is satisfiable and that t is not a tautology; we must produce a suitable u. Since s is satisfiable there is some valuation v₁ on L₁ such that v₁(s) = T and since t is not a tautology there is some valuation v₂ on L₂ such that v₂(t) = F.

First we show that L₃ ≠ ∅. If this were not so, that is, if L₁ and L₂ had no propositional variables in common, then we could define a valuation v₃ on S(L₁ ∪ L₂) by setting, for p ∈ L₁ ∪ L₂, v₃(p) = v₁(p) if p ∈ L₁ and v₃(p) = v₂(p) if p ∈ L₂. Then,

by 1.2.1(c), we would have v₃(s) = v₁(s) = T and v₃(t) = v₂(t) = F, which contradicts the assumption that s |= t.

Now choose, by 1.4.5, a formula of S(L₁) in disjunctive normal form which is equivalent to s, say $\bigvee_{i=1}^{n} (\bigwedge_{j=1}^{l_i} g_{ij} \wedge \bigwedge_{k=1}^{m_i} h_{ik})$, where we have separated out the literals into two groups: the $g_{ij}$ - those belonging to S(L₁) \ S(L₃); the $h_{ik}$ - those belonging to S(L₃). (We allow that some of these conjuncts might be empty.) We can assume that each disjunct is satisfiable (we can drop any which are not). Define u to be $\bigvee_{i=1}^{n} \bigwedge_{k=1}^{m_i} h_{ik}$. Clearly u ∈ S(L₃) and, if v is a valuation on S(L₁) then, if v(s) = T, it must be that $v(\bigwedge_{j=1}^{l_i} g_{ij} \wedge \bigwedge_{k=1}^{m_i} h_{ik}) = T$ for some i (by 1.4.3) and hence $v(\bigwedge_{k=1}^{m_i} h_{ik}) = T$ and hence v(u) = T.9 Thus s |= u and we have just seen that u is satisfiable.

It remains to prove that u |= t. So let v be a valuation on S(L₂) such that v(u) = T. Then there must be some i₀ such that $v(\bigwedge_{k=1}^{m_{i_0}} h_{i_0 k}) = T$. We define a valuation w on S(L₁ ∪ L₂) by setting, for p ∈ L₁ ∪ L₂:
w(p) = v(p) if p ∈ L₂;
w(p) = T if p = $g_{i_0 j}$ for some j;
w(p) = F if ¬p = $g_{i_0 j}$ for some j;
w(p) = T, say, if p ∈ L₁ \ L₂ and is not already assigned a value, that is, if p does not occur in the i₀th disjunct.
Note that $w(\bigwedge_{j=1}^{l_{i_0}} g_{i_0 j} \wedge \bigwedge_{k=1}^{m_{i_0}} h_{i_0 k}) = T$ by construction and hence w(s) = T. But we assumed that s |= t and so w(t) = T. But w and v agree on all propositional variables in L₂; hence v(t) = T. We conclude that u |= t, which was what had remained to be proved. □

The proof gives an effective procedure for computing interpolants.

Example 1.6.2. Given that (p → (¬r ∧ s)) ∧ (p ∨ (r ∧ ¬s)) |= ((s → r) → t) ∨ (¬t → (r ∧ ¬s)), how do we find an interpolant involving r and s only? (Note that, in the notation of the proof, L₁ = {p, r, s}, L₂ = {r, s, t}, L₃ = {r, s}.) We find a term in disjunctive normal form which is logically equivalent to (p → (¬r ∧ s)) ∧ (p ∨ (r ∧ ¬s)); one such is (p ∧ s ∧ ¬r) ∨ (¬p ∧ r ∧ ¬s). Following the procedure in the proof, we obtain the interpolant u which is (s ∧ ¬r) ∨ (r ∧ ¬s).

9 You might reasonably ask what happens if, for this value of i, there are no $h_{ik}$ conjuncts. That could happen, but there must be at least one such value of i (that is, with v making the i-th disjunct true) such that there is an $h_{ik}$. Otherwise, arguing as before, we could adjust the valuation v, keeping the same values on propositional variables in S(L₁) \ S(L₃) but adjusting it on those belonging to S(L₃), so as to make t false, while keeping s true, contradicting that s |= t.
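The interpolant-extraction procedure can also be mechanised. The sketch below (not from the notes; it uses the truth-table form of the disjunctive normal form construction rather than Beth trees, and all names are illustrative) records, for each valuation of the variables of s which makes s true, only its values on the shared variables, and returns the disjunction of the corresponding conjunctions of literals.

```python
from itertools import product

# Terms as nested tuples, as in the earlier sketches.

def variables(term):
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(sub) for sub in term[1:]))

def ev(term, v):
    if isinstance(term, str):
        return v[term]
    op = term[0]
    if op == "not":
        return not ev(term[1], v)
    a, b = ev(term[1], v), ev(term[2], v)
    return {"and": a and b, "or": a or b,
            "implies": (not a) or b, "iff": a == b}[op]

def interpolant(s, t):
    """An interpolant for s |= t (assuming s satisfiable and t not a tautology),
    returned as a list of disjuncts, each a list of literals over L1 ∩ L2."""
    shared = sorted(variables(s) & variables(t))
    vs = sorted(variables(s))
    disjuncts = []
    for values in product([True, False], repeat=len(vs)):
        v = dict(zip(vs, values))
        if ev(s, v):
            d = [p if v[p] else ("not", p) for p in shared]
            if d not in disjuncts:
                disjuncts.append(d)
    return disjuncts

if __name__ == "__main__":
    # Example 1.6.2 (the Python names s and t denote the two terms, not the
    # propositional variables "s" and "t" occurring inside them).
    s = ("and",
         ("implies", "p", ("and", ("not", "r"), "s")),
         ("or", "p", ("and", "r", ("not", "s"))))
    t = ("or",
         ("implies", ("implies", "s", "r"), "t"),
         ("implies", ("not", "t"), ("and", "r", ("not", "s"))))
    print(interpolant(s, t))   # two disjuncts: ¬r ∧ s and r ∧ ¬s, in some order
```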



Chapter 2

Deductive systems

“The design of the following treatise is to investigate the fundamental laws of those operations of the mind by which reasoning is performed; to give expression to them in the symbolical language of a Calculus, and upon this foundation to establish the science of Logic and construct its method; to make that method itself the basis of a general method for the application of the mathematical doctrine of Probabilities; and, finally, to collect from the various elements of truth brought to view in the course of these inquiries some probable intimations concerning the nature and constitution of the human mind.”

Thus begins Chapter 1 of George Boole’s “An Investigation of the Laws of Thought (on which are founded the Mathematical Theories of Logic and Probabilities)” (1854).

We have already seen the “symbolical language” (though not the way Boole wrote it) and what Boole meant by a Calculus (or Algebra). Now we discuss proof/deductive systems further.

Given a propositional term, we may test whether or not it is a tautology by, for example, constructing its truth table. This is regarded as a “semantic” test because it is in terms of valuations. The test is recursive in the sense that we have a procedure which, after a finite amount of time, is guaranteed to tell us whether or not the term is a tautology. More generally, suppose that S is a finite set of propositional terms and that t is a propositional term. Recall that we write S |= t to mean that every valuation which makes everything in S true also makes t true. Checking whether or not this is true is also a recursive procedure.

In the case of predicate logic, however, it turns out that there is no corresponding algorithm for determining whether or not a propositional term (“sentence” in that context) is a tautology or whether the truth of a finite set of propositions implies the truth of another proposition.1

1 In fact the set of tautologies of predicate logic is “recursively enumerable” but not recursive. Saying that the set is recursively enumerable means that there is an algorithm which will output only tautologies and such that any tautology eventually will be output; but we can’t predict when.

The best we can do is to produce a method of “generating” all tautologies or, more generally, of starting with a set, S, of sentences/statements/propositions which we treat as axioms and then generating all consequences of those axioms. Such a method of

generating consequences (and, of course, avoiding anything which is not a consequence) is a propositional or predicate calculus. In the following sections we will describe two such calculi for propositional logic. Of course, for classical propositional logic no such calculus is necessary because we have methods such as truth tables or Beth trees. But these calculi will serve as models of calculi for logics where there is no analogue of those recursive methods. It is also the case that these calculi do correspond to “Laws of Thought” in the sense that their axioms and rules of inference capture steps in reasoning that we use in practice. The calculi that we will see here are considerably simpler than those for predicate logic but the main concepts and issues ((in)consistency, soundness, completeness, compactness, how one might prove completeness) all are present already in this simpler context which provides, therefore, a good opportunity to understand these fundamental issues.

2.1 A Hilbert-style system for propositional logic

Our (Hilbert-style) calculus will consist of certain axioms and one rule of deduction (or rule of inference). There are infinitely many axioms, being all the propositional terms of one of the forms:
(i) s → (t → s)
(ii) (r → (s → t)) → ((r → s) → (r → t))
(iii) ¬¬s → s
(iv) (¬s → ¬t) → (t → s),
where r, s and t may be any propositional terms. Thus, for instance, the following is an axiom: (p ∧ ¬r) → ((s ∨ t) → (p ∧ ¬r)). We refer to (i)-(iv) as axiom schemas. The single rule of deduction, modus ponens, says that, from s and s → t we may deduce t.

Then we define the notion of entailment or logical implication, written ⊢, within this calculus. Let S be a set (not necessarily finite) of propositional terms and let s, t be propositional terms.
(i) If t is an axiom then S ⊢ t (“logical axiom” LA)
(ii) If s ∈ S then S ⊢ s (“non-logical axiom” NLA)
(iii) If S ⊢ s and S ⊢ s → t then S ⊢ t (“modus ponens” MP)
(iv) That’s it. (like the corresponding clause in the definition of propositional term)

We read S ⊢ t as “S entails t” or “S logically implies (within this particular calculus) t”. This definition is, like various definitions we have seen before, an inductive one: it allows chains of entailments. Here is an example, of a deduction of p → r from S = {p → q, q → r}.

1. S ⊢ q → r                                    NLA
2. S ⊢ (q → r) → (p → (q → r))                  LA(i)
3. S ⊢ (p → (q → r))                            MP1,2
4. S ⊢ (p → (q → r)) → ((p → q) → (p → r))      LA(ii)
5. S ⊢ ((p → q) → (p → r))                      MP3,4
6. S ⊢ p → q                                    NLA
7. S ⊢ p → r                                    MP5,6

(The line numbers and right-hand entries are there to help any reader follow/check the deduction.)

Note that if S ⊆ T and if S ⊢ s then T ⊢ s, because any deduction (such as that above) of s from S may be changed into a deduction of s from T simply by replacing every occurrence of “S” by “T”. A point about notation: if we write something like “S ⊢ t” in a mathematical assertion (as opposed to this being a line of a formal deduction) you should read this as saying “There is a deduction of t from S.”

Warning: it can be surprisingly difficult to find deductions, even of simple things, in this calculus. The (“Gentzen-style/Natural Deduction”) calculus that we will use later allows deductions to be found more easily. Our main point is, however, not in the details of the calculus but the fact that there is a calculus for which one can prove a completeness theorem (2.1.12).

For any such deductive calculus there are two central issues: soundness and completeness. We say that a deductive calculus is sound if we cannot deduce things that we should not be able to deduce using it, equivalently if we cannot deduce contradictions by using it. That is, if S ⊢ t then S |= t. And we say that a deductive calculus is complete if it is strong enough to deduce all consequences, that is, if S |= t implies S ⊢ t. So soundness is “If we can deduce t from S then, whenever S is true, t is true.” and completeness is “If t is true whenever S is true then there will be a deduction of t from S.” In the remainder of this section we will give a proof of soundness (this is the easier part) and completeness for the calculus above.
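Although finding deductions in this calculus can be hard, checking a purported deduction is entirely mechanical. Here is an illustrative sketch (not part of the notes; the representation and names are my own choices) which recognises instances of the four axiom schemas and verifies the example deduction of p → r from {p → q, q → r} line by line.

```python
def Imp(a, b):
    """Build the term a → b in the nested-tuple representation."""
    return ("implies", a, b)

def is_imp(f):
    return isinstance(f, tuple) and len(f) == 3 and f[0] == "implies"

def is_not(f):
    return isinstance(f, tuple) and len(f) == 2 and f[0] == "not"

def is_axiom(f):
    """Is f an instance of one of the axiom schemas (i)-(iv)?"""
    if not is_imp(f):
        return False
    a, b = f[1], f[2]
    # (i)  s → (t → s)
    if is_imp(b) and b[2] == a:
        return True
    # (ii) (r → (s → t)) → ((r → s) → (r → t))
    if (is_imp(a) and is_imp(a[2]) and is_imp(b) and is_imp(b[1]) and is_imp(b[2])
            and a[1] == b[1][1] == b[2][1]       # the three occurrences of r
            and a[2][1] == b[1][2]               # the two occurrences of s
            and a[2][2] == b[2][2]):             # the two occurrences of t
        return True
    # (iii) ¬¬s → s
    if is_not(a) and is_not(a[1]) and a[1][1] == b:
        return True
    # (iv) (¬s → ¬t) → (t → s)
    if (is_imp(a) and is_not(a[1]) and is_not(a[2]) and is_imp(b)
            and b[1] == a[2][1] and b[2] == a[1][1]):
        return True
    return False

def check_deduction(S, lines):
    """Check that every line is LA, NLA, or follows by MP from earlier lines."""
    proved = []
    for f in lines:
        ok = (is_axiom(f) or f in S
              or any(is_imp(g) and g[1] in proved and g[2] == f for g in proved))
        if not ok:
            return False
        proved.append(f)
    return True

if __name__ == "__main__":
    S = [Imp("p", "q"), Imp("q", "r")]
    deduction = [
        Imp("q", "r"),                                            # NLA
        Imp(Imp("q", "r"), Imp("p", Imp("q", "r"))),              # LA(i)
        Imp("p", Imp("q", "r")),                                  # MP 1,2
        Imp(Imp("p", Imp("q", "r")),
            Imp(Imp("p", "q"), Imp("p", "r"))),                   # LA(ii)
        Imp(Imp("p", "q"), Imp("p", "r")),                        # MP 3,4
        Imp("p", "q"),                                            # NLA
        Imp("p", "r"),                                            # MP 5,6
    ]
    print(check_deduction(S, deduction))   # True
```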

2.1.1 Soundness

Suppose that S is a set of propositional terms and that t is a propositional term. We have to show that if S ⊢ t is true then so is S ⊨ t. Suppose then that S ⊢ t and let v be a valuation with v(S) = T. We must show that v(t) = T.

In outline, the proof is this. The fact that S ⊢ t means that there is a deduction of t from S. Any such deduction is given by a sequence of (logical and non-logical) axioms and applications of modus ponens. If we show that v assigns "T" to every axiom and that modus ponens preserves "T" then every line of a deduction will be assigned "T".

More precisely, we argue as follows ("by induction on line number"). If r is a logical axiom then (go back and check that all those axioms are actually tautologies!) r is a tautology, so certainly v(r) = T. If r is a non-logical axiom then r ∈ S so, by assumption on v, we have v(r) = T. Suppose now that we have an application of MP in the deduction of t. That application has the form (perhaps with intervening lines and with the first two lines occurring in the opposite order)
S ⊢ r
S ⊢ r → r′
S ⊢ r′
for some propositional terms r, r′. We may assume inductively (inducting on the length of the deduction) that v(r) = T and that v(r → r′) = T. Then,


from the list of conditions for v to be a valuation, it follows that v(r′) = T, as required. On the very last line of the deduction we have S ⊢ t, so our argument shows that v(t) = T, and we conclude that the calculus is sound.
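Since the argument works by evaluating every line of a deduction under v, it may be worth recording how v(t) is computed in practice: by recursion on the structure of t, from v's values on the propositional variables. A small sketch (Python again, reusing the illustrative tuple representation of terms from the sketch in the previous section):

    # Illustration only: the truth value of a term under a valuation, where the
    # valuation is determined by a dictionary giving True/False on the
    # propositional variables.

    def value(v, f):
        if isinstance(f, str):            # a propositional variable
            return v[f]
        if f[0] == 'not':
            return not value(v, f[1])
        if f[0] == 'imp':                 # s → t is false only when s is T and t is F
            return (not value(v, f[1])) or value(v, f[2])
        raise ValueError("unknown connective")

    # e.g. under v(p) = T, v(q) = F the axiom instance p → (q → p) still gets value T:
    assert value({'p': True, 'q': False}, ('imp', 'p', ('imp', 'q', 'p')))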

2.1.2 Completeness

Our first step is to prove the Deduction Theorem, which allows us to move terms in and out of the set of non-logical axioms.

Theorem 2.1.1. (Deduction Theorem) Let S be a set of propositional terms and let s and t be propositional terms. Then S ⊢ (s → t) iff S ∪ {s} ⊢ t.

Proof. Both directions of the proof are really instructions on how to transform a deduction of one into a deduction of the other.

From a deduction showing that S ⊢ (s → t) we may obtain a deduction of t from S ∪ {s} by first replacing each occurrence of S (to the right of "⊢") by an occurrence of S ∪ {s}, noting that this is still a valid deduction, and then adding two more lines at the end, namely
S ∪ {s} ⊢ s    NLA
S ∪ {s} ⊢ t    MP (line above and line before that).
Note that this does give a deduction of t from S ∪ {s}.

For the converse, suppose that there is a deduction of t from S ∪ {s}. This deduction is a sequence of lines S ∪ {s} ⊢ t_i for i = 1, ..., n where t_n = t. We will replace each of these lines by some new lines.

If t_i is a logical axiom or a member of S then we replace the i-th line by
S ⊢ t_i    LA or NLA
S ⊢ t_i → (s → t_i)    LA(i)
S ⊢ s → t_i    MP

If t_i is s then we replace the i-th line by lines constituting a deduction of s → s from S (the proof of 2.1.2 below but with "S" to the left of each "⊢").

If the i-th line is obtained by an application of modus ponens then there are line numbers j, k < i such that t_k is t_j → t_i. In our transformed deduction there will be corresponding (also earlier) lines reading S ⊢ s → t_j and S ⊢ s → (t_j → t_i), so we replace the old i-th line by the lines
S ⊢ (s → (t_j → t_i)) → ((s → t_j) → (s → t_i))    Ax(ii)
S ⊢ (s → t_j) → (s → t_i)    MP (line above and one of the earlier ones)
S ⊢ s → t_i    MP (line above and one of the earlier ones).

What we end up with is a (valid - you should check that you see this) deduction with last line S ⊢ s → t_n, as required (recall that t_n is t). (It's worthwhile applying the process described to an example just to clarify how this works.) □

Next, some lemmas, the first of which was used in the proof above.
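The second half of this proof is really an algorithm, so it can be written as one. The sketch below is illustration only; it reuses the tuple representation, the imp helper and the line format of the checker in Section 2.1, which were themselves only illustrative. It turns a deduction witnessing S ∪ {s} ⊢ t into one witnessing S ⊢ s → t, line by line, exactly as in the proof.

    def deduction_theorem(S, s, lines):
        """lines: a valid deduction of t from S ∪ {s} (its last formula is t).
        Returns a deduction, from S alone, whose last formula is s → t."""
        new = []                        # the transformed deduction
        def emit(formula, just):
            new.append((formula, just))
            return len(new)
        pos = {}                        # old line number -> new line giving  s → t_i
        for i, (ti, just) in enumerate(lines, start=1):
            if ti == s:
                # insert the five lines of Lemma 2.1.2, giving  s → s
                ss = imp(s, s)
                l1 = emit(imp(imp(s, imp(ss, s)), imp(imp(s, ss), ss)), 'LA')
                l2 = emit(imp(s, imp(ss, s)), 'LA')
                l3 = emit(imp(imp(s, ss), ss), ('MP', l2, l1))
                l4 = emit(imp(s, ss), 'LA')
                pos[i] = emit(ss, ('MP', l4, l3))
            elif just in ('LA', 'NLA'):
                a = emit(ti, just)
                b = emit(imp(ti, imp(s, ti)), 'LA')              # schema (i)
                pos[i] = emit(imp(s, ti), ('MP', a, b))
            else:                                                # old line was MP on j, k
                _, j, k = just
                tj, tk = lines[j - 1][0], lines[k - 1][0]
                if tk != imp(tj, ti):        # allow the two references in either order
                    j, k = k, j
                    tj, tk = tk, tj
                a = emit(imp(imp(s, imp(tj, ti)),
                             imp(imp(s, tj), imp(s, ti))), 'LA')  # schema (ii)
                b = emit(imp(imp(s, tj), imp(s, ti)), ('MP', pos[k], a))
                pos[i] = emit(imp(s, ti), ('MP', pos[j], b))
        return new

Running check(S, deduction_theorem(S, s, lines)) on small examples is one way to convince yourself that the transformed deduction really is valid.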


Lemma 2.1.2. For every propositional term s there is a deduction (independent of s) with last line ⊢ s → s and hence, for every set S of propositional terms, there is a deduction with last line S ⊢ s → s.

Proof. Here's the deduction.
1. ⊢ (s → ((s → s) → s)) → ((s → (s → s)) → (s → s))    Ax(ii)
2. ⊢ s → ((s → s) → s)    Ax(i)
3. ⊢ (s → (s → s)) → (s → s)    MP(1,2)
4. ⊢ s → (s → s)    Ax(i)
5. ⊢ s → s    MP(3,4)
To obtain the second statement just put "S" to the left of each "⊢" and note that the deduction is still valid. □

We'll abbreviate the statement of the following lemmas as in the statement of the Deduction Theorem. Throughout, s and t are any propositional terms.

Lemma 2.1.3. ⊢ s → (¬s → t)

Proof. The first part of the proof is just to write down a deduction which takes us close to the end. Then there are two applications of the Deduction Theorem. We've actually incorporated those uses, labelled DT, into the deduction itself, as a derived rule of deduction. An alternative would be to stop the deduction at the line "7. {s, ¬s} ⊢ t MP(5,6)" and then say "Therefore {s, ¬s} ⊢ t. By the Deduction Theorem it follows that {s} ⊢ ¬s → t and then, by the Deduction Theorem again, ⊢ s → (¬s → t) follows."
1. {s, ¬s} ⊢ ¬s → (¬t → ¬s)    Ax(i)
2. {s, ¬s} ⊢ ¬s    NLA
3. {s, ¬s} ⊢ ¬t → ¬s    MP(1,2)
4. {s, ¬s} ⊢ (¬t → ¬s) → (s → t)    Ax(iv)
5. {s, ¬s} ⊢ s → t    MP(3,4)
6. {s, ¬s} ⊢ s    NLA
7. {s, ¬s} ⊢ t    MP(5,6)
8. {s} ⊢ ¬s → t    DT
9. ⊢ s → (¬s → t)    DT
□

In the next proof we use more derived rules of deduction.

Lemma 2.1.4. ⊢ (s → ¬s) → ¬s

Proof.
1. {s → ¬s} ⊢ ¬¬s → s    Ax(iii)
2. {s → ¬s, ¬¬s} ⊢ s    DT
3. {s → ¬s, ¬¬s} ⊢ s → ¬s    NLA
4. {s → ¬s, ¬¬s} ⊢ ¬s    MP(2,3)
5. {s → ¬s, ¬¬s} ⊢ s → (¬s → ¬(s → s))    Lemma 2.1.3
6. {s → ¬s, ¬¬s} ⊢ ¬s → ¬(s → s)    MP(2,5)
7. {s → ¬s, ¬¬s} ⊢ ¬(s → s)    MP(4,6)
8. {s → ¬s} ⊢ ¬¬s → ¬(s → s)    DT
9. {s → ¬s} ⊢ (¬¬s → ¬(s → s)) → ((s → s) → ¬s)    Ax(iv)
10. {s → ¬s} ⊢ (s → s) → ¬s    MP(8,9)
11. {s → ¬s} ⊢ s → s    Lemma 2.1.2
12. {s → ¬s} ⊢ ¬s    MP(10,11)
13. ⊢ (s → ¬s) → ¬s    DT
□

Lemma 2.1.5. ⊢ s → ¬¬s

Proof.
1. ⊢ ¬¬¬s → ¬s    Ax(iii)
2. ⊢ (¬¬¬s → ¬s) → (s → ¬¬s)    Ax(iv)
3. ⊢ s → ¬¬s    MP(1,2)
□

Lemma 2.1.6. ⊢ ¬s → (s → t)

Proof. Exercise! □

Lemma 2.1.7. ⊢ s → (¬t → ¬(s → t))

Proof. Exercise! □

Now, define a set S of (propositional) terms to be consistent if there is some term t such that there is no deduction of t from S. Accordingly, say that a set S is inconsistent if for every term t one has S ⊢ t. You might reasonably have expected the definition of S being consistent to be that no contradiction can be deduced from S. But the definition just given is marginally more useful and is equivalent to the definition just suggested (this follows once we have proved 2.1.12 but is already illustrated by the next lemma).

Lemma 2.1.8. The set S of terms is inconsistent iff for some term s we have S ⊢ ¬(s → s).

Proof. The direction "⇒" is immediate from the definition. For the other direction, we suppose that there is some term s such that S ⊢ ¬(s → s). It must be shown that for every term t we have S ⊢ t. Here is the proof.
1. S ⊢ s → s    Lemma 2.1.2
2. S ⊢ (s → s) → ¬¬(s → s)    Lemma 2.1.5
3. S ⊢ ¬¬(s → s)    MP(1,2)
4. S ⊢ ¬¬(s → s) → (¬t → ¬¬(s → s))    Ax(i)
5. S ⊢ ¬t → ¬¬(s → s)    MP(3,4)
6. S ⊢ (¬t → ¬¬(s → s)) → (¬(s → s) → t)    Ax(iv)
7. S ⊢ ¬(s → s) → t    MP(5,6)
8. S ⊢ ¬(s → s)    by assumption
9. S ⊢ t    MP(7,8)
□

Lemma 2.1.9. Let S be a set of terms and let s be a term. Then S ∪ {s} is inconsistent iff S ⊢ ¬s.

Proof. Suppose first that S ∪ {s} is inconsistent. Then, by definition, S ∪ {s} ⊢ ¬s. So, by the Deduction Theorem, we have S ⊢ s → ¬s. Since also ⊢ (s → ¬s) → ¬s (2.1.4) and hence S ⊢ (s → ¬s) → ¬s, we can apply modus ponens to obtain S ⊢ ¬s.
For the converse, suppose that S ⊢ ¬s and let t be any term. It must be shown that S ∪ {s} ⊢ t. We have S ∪ {s} ⊢ s and also, by 2.1.3, S ∪ {s} ⊢ s → (¬s → t). So, by modus ponens, S ∪ {s} ⊢ ¬s → t follows. Since S ⊢ ¬s also S ∪ {s} ⊢ ¬s, so another application of modus ponens gives S ∪ {s} ⊢ t. This shows that S ∪ {s} is inconsistent, as required. □

Lemma 2.1.10. Suppose that S is a set of terms and that s is a term. If both S ⊢ s and S ⊢ ¬s then S is inconsistent.

Proof. For every term t we have S ⊢ s → (¬s → t) (by 2.1.3). Since also S ⊢ s and S ⊢ ¬s, two applications of modus ponens give us S ⊢ t (for every t), so S is inconsistent. □

The next lemma is an expression of the finite character of the notion of deduction.

Lemma 2.1.11. Suppose that S is a set of terms and that s is a term such that S ⊢ s. Then there is a finite subset, S′, of S such that S′ ⊢ s.

Proof. Any deduction (of s from S) has only a finite number of lines and hence uses only a finite number of non-logical axioms. Let S′ be the (finite) set of all those actually used. Replace S by S′ throughout the deduction to obtain a valid deduction, showing that S′ ⊢ s. □

In the proof of the next theorem we make use of the observation that all the propositional connectives may be defined using just ¬ and → (that is, together, these two are adequate in the sense of Section 1.5) and so, in order to check that a function v from the set of propositional terms to {T, F} is a valuation, it is enough to check the defining clauses for ¬ and → only.

Theorem 2.1.12. (Completeness Theorem for Propositional Logic, version 1) Suppose that S is a consistent set of propositional terms. Then there is a valuation v such that v(S) = T.

Proof. Let Γ = {T : T is a consistent set of propositional terms and T ⊇ S} be the set of all sets of terms which contain S and are still consistent. We begin by showing, using Zorn's lemma (see 2.1.16 below for this; chances are you haven't seen it before - it is needed in the general case, but if we assume that L is countable then there is a simpler proof of the existence of a maximal element, and that's the one I'll give in the lectures), that Γ has a maximal element.
Let ∆ be a subset of Γ which is totally ordered by inclusion. Let T = ⋃∆ be the union of all the sets in ∆. It has to be shown that T ∈ Γ and the only possibly non-obvious point is that T is consistent. If it were not then, choosing any term s, there would be a deduction T ⊢ ¬(s → s). By 2.1.11 there would be a finite subset T′ of T with T′ ⊢ ¬(s → s). Since ∆ is totally ordered and since T′ is finite there would be some T_0 ∈ ∆ such that T_0 ⊇ T′. But then we would have T_0 ⊢ ¬(s → s). By 2.1.8 it would follow that T_0 is inconsistent, contradicting the fact that T_0 ∈ ∆ ⊆ Γ.
This shows that every totally ordered subset of Γ has an upper bound in Γ and so Zorn's Lemma gives the existence of a maximal element, T say, of Γ. That is, T is a maximal consistent set of terms containing S. What we will do,


and this is a key step in the proof, is define the valuation v by v(r) = T if r ∈ T and v(r) = F if r ∉ T, but various things have to be proved in order to show that this really does give a valuation.

First, we show that T is "deductively closed" in the sense that
(*1) if T ⊢ r then r ∈ T.
Suppose, for a contradiction, that we had T ⊢ r but r ∉ T. Then, by maximality of T, the set T ∪ {r} would have to be inconsistent and hence, by 2.1.9, T ⊢ ¬r. By 2.1.3, T ⊢ r → (¬r → t) for any term t, so two applications of modus ponens give T ⊢ t. Since t was arbitrary that shows inconsistency of T - contradiction. Therefore (*1) is proved.

Next we show that T is "complete" in the sense that
(*2) for every term t either t ∈ T or ¬t ∈ T.
For, suppose that t ∉ T. Then, by maximality of T, the set T ∪ {t} is inconsistent so, by 2.1.9, T ⊢ ¬t. Therefore, by (*1), ¬t ∈ T.

Next we show that
(*3) s → t ∈ T iff ¬s ∈ T or t ∈ T.
For the direction "⇐" suppose first that ¬s ∈ T. Then, by 2.1.6 and (*1), s → t ∈ T. On the other hand if t ∈ T then s → t ∈ T by Axiom (i) and (*1). For the converse, "⇒", if we have neither ¬s nor t in T then, by (*2), both s and ¬t are in T. Then, by 2.1.7 and (*1), we have ¬(s → t) ∈ T and so, by consistency of T, s → t ∉ T, as required.

Now define the (purported) valuation v by v(t) = T iff t ∈ T. Since S ⊆ T certainly v ⊨ S, so it remains to check that v really is a valuation. First, if v(t) = T then t ∈ T so (consistency of T) ¬t ∉ T so v(¬t) = F. Conversely, if v(t) = F then t ∉ T so ((*2)) ¬t ∈ T so v(¬t) = T. That deals with the ¬ clause in the definition of valuation. The → clause is direct from (*3) which, in terms of v, becomes v(s → t) = T iff v(¬s) = T or v(t) = T, that is (by what we just showed), iff v(s) = F or v(t) = T, as required. □

Theorem 2.1.13. (Completeness Theorem for Propositional Logic, version 2) Let S be a set of propositional terms and let t be a propositional term. Then S ⊢ t iff S ⊨ t.

Proof. The direction "⇒" is the Soundness Theorem. For the converse, suppose that S ⊬ t. Then, by Axiom (iii) and modus ponens, S ⊬ ¬¬t. It then follows from 2.1.9 that S ∪ {¬t} is consistent so, by the first version of the Completeness Theorem, there is a valuation v such that v(S) = T and v(¬t) = T, so certainly we cannot have v(t) = T. Therefore S ⊭ t, as required. □

Theorem 2.1.14. (Compactness Theorem for Propositional Logic, version 1) Let S be a set of propositional terms. There is a valuation v such that v(S) = T iff for every finite subset S′ of S there is a valuation v′ with v′(S′) = T.


Proof. One direction is immediate: if v(S) = T then certainly v(S′) = T for any (finite) subset S′ of S.
For the converse suppose, for a contradiction, that there is no v with v(S) = T. Then, by the Completeness Theorem (version 1), S is inconsistent. Choose any term s. Then, by definition of inconsistent, S ⊢ ¬(s → s). So, by 2.1.11, there is a finite subset, S′, of S with S′ ⊢ ¬(s → s). By 2.1.8, S′ is inconsistent. So by Soundness there is no valuation v′ with v′(S′) = T, as required. □

Theorem 2.1.15. (Compactness Theorem for Propositional Logic, version 2) Let S be a set of propositional terms and let t be a propositional term. Then S ⊨ t iff there is some finite subset S′ of S such that S′ ⊨ t.

Proof. Exercise.



Theorem 2.1.16. (Zorn's Lemma; included only for completeness of exposition) Suppose that (P, ≤) is a partially ordered set such that every chain has an upper bound; that is, if {a_i}_{i∈I} ⊆ P is totally ordered (for all i, j either a_i ≤ a_j or a_j ≤ a_i) then there is some a ∈ P with a ≥ a_i for all i ∈ I. Then there is at least one maximal element in P (i.e. an element with nothing in P strictly above it). This is a consequence of, in fact is equivalent to, the Axiom of Choice from set theory.
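For a finite set S the right-hand side of 2.1.13, S ⊨ t, can be decided mechanically: only the propositional variables occurring in S and t matter, so there are finitely many relevant valuations to try. A sketch (Python again, reusing the illustrative value function from the soundness section; entails and variables are names made up for this sketch, not standard notation):

    from itertools import product

    def variables(f, acc=None):
        """The set of propositional variables occurring in the term f."""
        acc = set() if acc is None else acc
        if isinstance(f, str):
            acc.add(f)
        else:
            for part in f[1:]:
                variables(part, acc)
        return acc

    def entails(S, t):
        """Decide S ⊨ t for a finite set S by checking every valuation of the
        propositional variables occurring in S ∪ {t}."""
        vs = sorted(set().union(*(variables(f) for f in S), variables(t)))
        for bits in product([True, False], repeat=len(vs)):
            v = dict(zip(vs, bits))
            if all(value(v, f) for f in S) and not value(v, t):
                return False
        return True

    # {p → q, q → r} ⊨ p → r, matching the deduction found earlier:
    assert entails({('imp', 'p', 'q'), ('imp', 'q', 'r')}, ('imp', 'p', 'r'))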

2.2 A natural deduction system for propositional logic

The calculus that we describe in this section has no logical axioms as such but it has many rules of deduction and it allows much more "natural" proofs. We define, by induction, a relation S ⊩ t where S is any set of propositional terms and t is any propositional term. It will turn out to be equivalent to the relation S ⊢ t because one can prove the Completeness Theorem also for this calculus.

A sequent is a line of the form S ⊩ t where S is a (finite) set of propositional terms and t is a propositional term. We write s_1, ..., s_n ⊩ t instead of {s_1, ..., s_n} ⊩ t and we can write ⊩ t if S is empty. Certain sequents are called theorems and they are defined inductively by the following rules.
(Ax) Every sequent of the form S, t ⊩ t is a theorem (these sequents play the role of non-logical axioms in the Hilbert-style calculus).
(→I) If S, s ⊩ t is a theorem then so is S ⊩ s → t.
(→E) If S ⊩ s → t and S ⊩ s are theorems then so is S ⊩ t.
(¬I) If S, s ⊩ t and S, s ⊩ ¬t are theorems then so is S ⊩ ¬s.
(¬¬) If S ⊩ ¬¬t is a theorem then so is S ⊩ t.
The theorems/rules of deduction in this calculus are usually written using a less linear notation, as follows.

(Ax)   S, t ⊩ t

       S, s ⊩ t
(→I)   ----------
       S ⊩ s → t


       S ⊩ s → t    S ⊩ s
(→E)   -------------------
       S ⊩ t

       S, s ⊩ t    S, s ⊩ ¬t
(¬I)   ----------------------
       S ⊩ ¬s

       S ⊩ ¬¬t
(¬¬)   ---------
       S ⊩ t

This is a minimal list, corresponding to writing every propositional term up to equivalence using only {→, ¬} (which, recall, is an adequate set of propositional connectives). Of course, there are also rules involving ∨ and ∧, as follows.

S, s, t ⊩ u
------------
S, s ∧ t ⊩ u

S ⊩ s    T ⊩ t
---------------
S ∪ T ⊩ s ∧ t

S ⊩ s ∧ t
----------
S ⊩ s

S ⊩ s ∧ t
----------
S ⊩ t

S ⊩ s
----------
S ⊩ s ∨ t

S ⊩ s
----------
S ⊩ t ∨ s

S, s ⊩ u    T, t ⊩ u
---------------------
S ∪ T, s ∨ t ⊩ u

S ⊢ t    S1 ⊇ S
----------------
S1 ⊢ t

As before one may introduce derived rules, for example Proof by Contradiction, which says: if S, ¬s ⊩ t and S, ¬s ⊩ ¬t are theorems then so is S ⊩ s. This may be expressed by

S, ¬s ⊩ t    S, ¬s ⊩ ¬t
------------------------
S ⊩ s

This rule can be derived from those above as follows: if S, ¬s ⊩ t and S, ¬s ⊩ ¬t are theorems then so is S ⊩ ¬¬s (by (¬I)) and hence so is S ⊩ s (by (¬¬)). Here's the same argument written using the 2-dimensional notation.

S, ¬s ⊩ t    S, ¬s ⊩ ¬t
------------------------  by (¬I)
S ⊩ ¬¬s
------------------------  by (¬¬)
S ⊩ s

As in the earlier-described calculus, a sequence of theorems is called a (valid) deduction. You should, of course, check that you agree that the above all are "valid rules of deduction".

If S is a (possibly infinite) set of propositional terms and t is any propositional term then we will write S ⊢ t if there is a proof of t from S in this calculus, more formally, if there is a finite subset S′ of S such that S′ ⊩ t is a theorem. You can read "S ⊢ t" as "there is a deduction of t from S". Of course, we already have such a notation and terminology from the previous section but ignore that earlier deductive system for the moment. Some further, easily derived, properties of the relation ⊢ are:

        S, ¬s ⊢ t    S, ¬s ⊢ ¬t
(PbC)   ------------------------
        S ⊢ s

        S ⊢ t
(Fin)   -------   for some finite subset S′ ⊆ S
        S′ ⊢ t

        S, φ ⊢ t    S ⊢ φ
(Cut)   ------------------
        S ⊢ t

One can prove soundness and completeness for this calculus. Recall what the issues are.

• Is the calculus sound? That is, does the calculus generate only tautologies; more generally, if S ⊩ t then is it true that S ⊨ t?

• Is the calculus complete? That is, does the calculus generate all tautologies; more generally, does S ⊨ t imply that there is a proof in this calculus of S ⊩ t?

The answer to each question is "yes". The proof of soundness involves checking that each rule of deduction preserves tautologies (compare the analogous point in the Hilbert-style calculus). The proof of completeness is entirely analogous to that for the Hilbert-style calculus (though notice that the Deduction Theorem is already built into this natural deduction calculus). In particular, one makes the same definition for a set of terms to be (in)consistent and the heart of the proof is: given a consistent set S of terms, build a valuation which gives all elements of S the value T.
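One way to convince yourself that derived rules such as Proof by Contradiction are harmless is to arrange things so that theorems can only be manufactured by the primitive rules; a derived rule is then just a composite of them. The sketch below does this for the minimal list of rules (Python, with a sequent represented as a pair consisting of a frozenset of premises and a conclusion, and terms as the tuples used earlier; all of this representation is my own choice for illustration, not part of the formal definitions).

    def ax(S, t):                      # (Ax)   S, t ⊩ t
        return (frozenset(S) | {t}, t)

    def imp_i(thm, s):                 # (→I)   from  S, s ⊩ t   to   S ⊩ s → t
        S, t = thm
        return (S - {s}, ('imp', s, t))

    def imp_e(thm1, thm2):             # (→E)   from  S ⊩ s → t  and  S ⊩ s   to   S ⊩ t
        (S1, st), (S2, s) = thm1, thm2
        assert S1 == S2 and isinstance(st, tuple) and st[0] == 'imp' and st[1] == s
        return (S1, st[2])

    def neg_i(thm1, thm2, s):          # (¬I)   from  S, s ⊩ t  and  S, s ⊩ ¬t   to   S ⊩ ¬s
        (S1, t), (S2, nt) = thm1, thm2
        assert S1 == S2 and nt == ('not', t)
        return (S1 - {s}, ('not', s))

    def neg_neg(thm):                  # (¬¬)   from  S ⊩ ¬¬t   to   S ⊩ t
        S, nnt = thm
        assert isinstance(nnt, tuple) and nnt[0] == 'not' and nnt[1][0] == 'not'
        return (S, nnt[1][1])

    # Proof by Contradiction as a derived rule, exactly as in the text:
    def pbc(thm1, thm2, s):            # from  S, ¬s ⊩ t  and  S, ¬s ⊩ ¬t   to   S ⊩ s
        return neg_neg(neg_i(thm1, thm2, ('not', s)))

    # e.g. the theorem  ⊩ ¬¬p → p :
    t1 = ax([], ('not', ('not', 'p')))            # ¬¬p ⊩ ¬¬p
    t2 = neg_neg(t1)                              # ¬¬p ⊩ p
    t3 = imp_i(t2, ('not', ('not', 'p')))         # ⊩ ¬¬p → p
    assert t3 == (frozenset(), ('imp', ('not', ('not', 'p')), 'p'))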


Part II

Predicate Logic


Chapter 3

A brief introduction to predicate logic: languages and structures

3.1 Predicate languages

As we said in the introduction, propositional logic is about combining already-formed statements into more complex ones, whereas predicate logic allows us to formulate mathematical (and other) statements. Predicate logic is founded on the standard view in pure mathematics that the main objects of study are sets-with-structure. The statements that we can form in predicate logic will be statements about sets-with-structure.

So first we need to be able to talk, in this logic, about elements of sets; that is reflected in predicate languages having variables, x, y, ..., which range over the elements of a given set. We also have the universal quantifier ∀ ("for all") and the existential quantifier ∃ ("there is") which prefix variables - so a formula in this language can begin ∀x∃y... ("for all x there is a y such that ..."). Of course, at "..." we want to be able to insert something about x and y and that's where the "structure" in "sets-with-structure" comes in. This will take a bit of explaining because the predicate language that we set up depends on the exact type of "structure" that we want to deal with. In brief, using a pick-and-mix approach, we set up a predicate language by choosing a certain collection of symbols which can stand for constants (specific and fixed elements of structures), for functions and for relations. Here are some examples.

Example 3.1.1. One piece of structure that is always there in a set-with-structure is equality - so we will (in this course) always have the relation = which expresses equality between elements of a set. This is a binary (= 2-ary) relation, meaning that it relates pairs of elements.

Example 3.1.2. Part of the structure on a set might be an ordering - for example the integers or reals with the usual ordering. If so then we would also include a binary relation symbol, different from equality, say ≤, in our language.

Example 3.1.3. Continuing with the examples Z, R, we might want to express

the arithmetic operations of addition, multiplication and taking-the-negative in our language: so we would add two binary function symbols (i.e. function symbols taking two arguments), + and ×, and also a unary (= 1-ary) function symbol − (that's meant to be used for the function a ↦ −a, not the binary function subtraction). We might also add symbols 0 and 1 as constants.

Example 3.1.4. Functions with more than two arguments are pretty common; for instance we might want to have some polynomial functions (or, if we were dealing with C, perhaps some analytic functions) built into our language, say a 3-ary function symbol f with which we could express the function given by F(x, y, z) = x² + y² + xz + 1.

Example 3.1.5. Relation symbols with more than two arguments are not so common but here's an example. Take the real line and define the relation B(x, y, z) to mean "y lies (strictly) between x and z".

To recap: in the case of propositional logic there was essentially just one language (at least once we had chosen a set of propositional variables); in the case of predicate logic there are many, in the sense that when defining any such language one has to make a choice from certain possible ingredients. There is, however, a basic language which contains none of these extra ingredients and we'll introduce that first. Actually even for the basic language there is a choice: whether or not to include a symbol for equality. The choice between inclusion or exclusion of equality rather depends on the types of application one has in mind but for talking about sets-with-structure it's certainly natural to include a symbol "=" for equality.

3.2 The basic language

The basic (first-order, finitary, with equality) language L0 has the following:
(i) all the propositional connectives ∧, ∨, ¬, →, ↔
(ii) countably many variables x, y, u, v, v_0, v_1, ...
(iii) the existential quantifier ∃
(iv) the universal quantifier ∀
(v) a symbol for equality =
Then we go on to define "terms" and "formulas". Both of these, in different ways, generalise the notion of "propositional term" so remember that the word "term" in predicate logic has a different meaning from that in propositional logic.

Formulas and free variables

A term of L0 is nothing other than a variable (you'll see what "term" really means when we discuss languages with constant or function symbols). The free variable of such a term x (say) is just the variable, x, itself: fv(x) = {x}. An atomic formula of L0 is an expression of the form s = t where s and t are terms. The set of free variables of the atomic formula s = t is given by fv(s = t) = fv(s) ∪ fv(t).

The following clauses define what it means to be a formula of L0 (and, alongside, we define what are the free variables of any formula):
(0) every atomic formula is a formula;
(i) if φ is a formula then so is ¬φ, fv(¬φ) = fv(φ);


(ii) if φ and ψ are formulas then so are φ ∧ ψ, φ ∨ ψ, φ → ψ and φ ↔ ψ, and fv(φ ∧ ψ) = fv(φ ∨ ψ) = fv(φ → ψ) = fv(φ ↔ ψ) = fv(φ) ∪ fv(ψ);
(iii) if φ is a formula and x is any variable then ∃xφ and ∀xφ are formulas, and fv(∃xφ) = fv(∀xφ) = fv(φ) \ {x}.
((iv) plus the usual "that's it" clause)
A sentence is a formula σ with no free variables (i.e. fv(σ) = ∅).

Just as with propositional logic we do not need all the above, because we may define some symbols in terms of the others. For instance, ∧ and ¬, alternatively → and ¬, suffice for the propositional connectives. Also each of the quantifiers may be defined in terms of the other using negation: ∀xφ is logically equivalent to ¬∃x¬φ (and ∃x is equivalent to ¬∀x¬) so we may (and in inductive proofs surely would, just to reduce the number of cases in the induction step) drop reference to ∀ in the last clause of the definition. We also remark that we follow natural usage in writing, for instance, x ≠ y rather than ¬(x = y).

If φ is a formula then it is so by virtue of the above definition, so it has a "construction tree" and we refer to any formula occurring in this tree as a subformula of φ. We also use this term to refer to a corresponding substring of φ. Remember that any formula is literally a string of symbols (usually we mean in the abstract rather than a particular physical realisation) and so we can also refer to an occurrence of a particular (abstract) symbol in a formula.

As well as defining the set of free variables of a formula we need to define the notion of a free occurrence of a variable. To do that, if x is a variable then:
(i) every occurrence of x in any atomic formula is free;
(ii) the free occurrences of x in ¬φ are just the free occurrences of x in its subformula φ;
(iii) the free occurrences of x in φ ∧ ψ are just the free occurrences of x in φ together with the free occurrences of x in ψ;
(iv) there are no free occurrences of x in ∃xφ.
In a formula of the form Qxφ we refer to φ as the scope of the quantifier Q (∃ or ∀). Any occurrence of x in Qxφ which is a free occurrence of x in φ (the latter regarded as a subformula of Qxφ) is said to be bound by that initial occurrence of the quantifier Qx. So a quantifier Qx binds the free occurrences of x within its scope.

A comment on use of variables when you are constructing formulas. Note that bound variables are "dummy variables": the formulas ∃x f(x) = y and ∃z f(z) = y are, intuitively, equivalent. A formula with nested occurrences of the same variable being bound can be confusing to read: ∃x(∀x(f(x) = x) → f(x) = x) could be written less confusingly as ∃x(∀y(f(y) = y) → f(x) = x). Of course these are not the same formula but one can prove that they are logically equivalent and the second is preferable. Another informal notation that we will sometimes use is to "collapse repeated quantifiers", for example to write ∀x, y (x = y → y = x) instead of ∀x∀y(x = y → y = x). Sometimes the abbreviations ∃!, ∃≤n, ∃=n are useful.
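The clauses above are exactly a recursive definition, so they translate directly into a short program. Purely as an illustration (the tuple encoding of formulas below is my own, not part of the formal development): variables are strings, ('eq', x, y) stands for the atomic formula x = y, and ('exists', x, φ), ('forall', x, φ) for ∃xφ, ∀xφ.

    def fv(phi):
        """The set of free variables of a formula of L0, following the clauses above."""
        op = phi[0]
        if op == 'eq':                          # atomic: fv(s = t) = fv(s) ∪ fv(t)
            return {phi[1], phi[2]}
        if op == 'not':
            return fv(phi[1])
        if op in ('and', 'or', 'imp', 'iff'):
            return fv(phi[1]) | fv(phi[2])
        if op in ('exists', 'forall'):          # fv(Qxφ) = fv(φ) \ {x}
            return fv(phi[2]) - {phi[1]}
        raise ValueError("not a formula")

    def is_sentence(phi):
        return fv(phi) == set()

    # ∃x (x = y → y = z) has free variables {y, z}:
    assert fv(('exists', 'x', ('imp', ('eq', 'x', 'y'), ('eq', 'y', 'z')))) == {'y', 'z'}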


3.3 Enriching the language

The language L0 described above has little expressive power: there's really not much that we can say using it; the following list just about exhausts the kinds of things that can be said.
∀x(x = x);
∀x∀y(x = y → y = x);
∀x∀y∀z(x = y ∧ y = z → x = z);
∃x∃y∃z(x ≠ y ∧ y ≠ z ∧ x ≠ z ∧ ∀w(w = x ∨ w = y ∨ w = z));
∃x(x ≠ x).

We are now going to give the formal definitions of the possible extra ingredients for a language but, since this is just a brief introduction to predicate logic, these definitions are included just so that you have precise definitions to refer to in case you have a question that is not answered by the perhaps less formal exposition that I will give in lectures. That exposition will focus on a limited class of examples and on actually making sense of the meanings of various formulas in specific examples. So what follows is just for reference.

As we discussed earlier, precisely what we should add to the language L0 depends on the type of structures whose properties we wish to capture within our formal language. We therefore suppose that we have, at our disposal, the following kinds of symbols with which we may enrich the language:
• n-ary function symbols such as f (= f(x_1, ..., x_n)); (since an operation is simply a function regarded in a slightly different way, we don't need to introduce operation symbols as well as function symbols, but we do use "operation notation" where appropriate, writing, for instance, x + y rather than +(x, y))
• n-ary relation symbols such as R (= R(x_1, ..., x_n)) (1-ary relation symbols, such as P (= P(x)), are also termed (1-ary) predicate symbols);
• constant symbols such as c.

Formulas of an enriched language

Suppose that L is the language L0 enriched by as many function, relation and constant symbols as we require (the signature of L is a term used when referring to these extra symbols). Exactly what is in L will depend on our purpose: in particular, L need not have function and relation and constant symbols, although I will, for the sake of a uniform treatment, write as if all kinds are represented. If S is the set of "extra" symbols we have added then we will write L = L0 ∨ S. (It is notationally convenient to regard L as being, formally, the set of all formulas of L, so then writing, for example, φ ∈ L makes literal sense. Thus the "∨" should be understood as some sort of "join", not union of sets.)

The terms of L, and their free variables, are defined inductively by:
(i) each variable x is a term, fv(x) = {x};
(ii) each constant symbol c is a term, fv(c) = ∅;
(iii) if f is an n-ary function symbol and if t_1, ..., t_n are terms, then f(t_1, ..., t_n) is a term, fv(f(t_1, ..., t_n)) = fv(t_1) ∪ ··· ∪ fv(t_n).

The atomic formulas of L (and their free variables) are defined as follows:


(i) if s, t are terms then s = t is an atomic formula, fv(s = t) = fv(s) ∪ fv(t);
(ii) if R is an n-ary relation symbol and if t_1, ..., t_n are terms, then R(t_1, ..., t_n) is an atomic formula, fv(R(t_1, ..., t_n)) = fv(t_1) ∪ ··· ∪ fv(t_n).

The formulas of L (and their free variables) are defined as follows:
(0) every atomic formula is a formula;
(i) if φ is a formula then so is ¬φ, fv(¬φ) = fv(φ);
(ii) if φ and ψ are formulas then so are φ ∧ ψ, φ ∨ ψ, φ → ψ and φ ↔ ψ, and fv(φ ∧ ψ) = fv(φ ∨ ψ) = fv(φ → ψ) = fv(φ ↔ ψ) = fv(φ) ∪ fv(ψ);
(iii) if φ is a formula and x is any variable then ∃xφ and ∀xφ are formulas, and fv(∃xφ) = fv(∀xφ) = fv(φ) \ {x}.

A sentence of L is a formula σ of L with no free variables (i.e. fv(σ) = ∅).

Since formulas were constructed by induction we prove things about them by induction ("on complexity") and, just as in the case of propositional terms, the issue of unique readability raises its head. Such inductive proofs will be valid only provided we know that there is basically just one way to construct any given formula (for two routes would give two paths through the induction and hence, conceivably, different answers). Unique readability does hold for formulas, and also for terms. Both proofs are done by induction (on complexity) and are not difficult.

3.4 L-structures

Suppose that L is a language of the sort discussed above. Formulas and sentences do not take on meaning until they are interpreted in a particular structure. Roughly, having fixed a language, a structure for that language provides: a set for the variables to range over (so, if M is the set then "∀x" will mean "for all x in M"); an element of that set for each constant symbol to name (so each constant symbol c of the language will name a particular, fixed element of M); for each function symbol of the language an actual function (of the correct arity) on that set; for each relation symbol of the language an actual relation (of the correct arity) on that set. Here's the precise definition.

An L-structure M (or structure for the language L) is a non-empty set M, called the domain or underlying set of M (we write M = |M|), together with an interpretation in M of each of the function, relation and constant symbols of L. By an interpretation of one of these symbols we mean the following (and we also insist that the symbol "=" for equality be interpreted as actual equality between elements of M):
(i) if f is an n-ary function symbol, then the interpretation of f in M, which is denoted f^M, must be a function from M^n to M;
(ii) if R is an n-ary relation symbol, then the interpretation of R in M, which is denoted R^M, must be a subset of M^n (in particular, the interpretation of a 1-ary predicate symbol is a subset of M);
(iii) if c is a constant symbol, then the interpretation of c in M, which is denoted c^M, must be an element of M.
If no confusion should arise from doing so, the superscript "M" may be dropped (thus the same symbol "f" is used for the function symbol and for the particular interpretation of this symbol in a given L-structure).
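For a finite domain an L-structure can be written down very concretely; the following sketch is only meant to make the definition vivid (the class and the field names are my own, not standard notation).

    class Structure:
        """An L-structure with a finite domain, given by explicit interpretations."""
        def __init__(self, domain, functions=None, relations=None, constants=None):
            self.domain    = set(domain)       # the underlying set M (non-empty)
            self.functions = functions or {}   # function symbol -> function M^n -> M
            self.relations = relations or {}   # relation symbol -> set of n-tuples from M
            self.constants = constants or {}   # constant symbol -> element of M

    # Example: the set {0, 1, 2, 3} with a binary relation symbol <= interpreted as the
    # usual order, and a constant symbol 0 naming the element 0.
    M = Structure(
        domain    = range(4),
        relations = {'<=': {(a, b) for a in range(4) for b in range(4) if a <= b}},
        constants = {'0': 0},
    )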

3.5 Some basic examples

The basic language

An L0-structure is simply a set, so L0-structures have rather limited value as illustrations of definitions and results. In lectures we will give a variety of examples, concentrating on languages L which contain just one extra binary relation symbol R.

Directed graphs

An L = L0 ∨ {R(−, −)}-structure M consists of a set M together with an interpretation of the binary relation symbol R as a particular subset, R^M, of M × M. That is, an L-structure consists of a set together with a specified binary relation on that set. Given such a structure, its directed graph, or digraph for short, has for its vertices the elements of M and has an arrow going from vertex a to vertex b iff (a, b) ∈ R^M. This gives an often useful graphical way of picturing or even defining a relation R^M (note that the digraph of a relation specifies the relation completely). Certain types of binary relation are of particular importance in that they occur frequently in mathematics (and elsewhere).

Posets

A partially ordered set (poset for short) consists of a set P and a binary relation on it, usually written ≤, which satisfies: for all a ∈ P, a ≤ a (≤ is reflexive); for all a, b, c ∈ P, a ≤ b and b ≤ c implies a ≤ c (≤ is transitive); for all a, b ∈ P, if a ≤ b and b ≤ a then a = b (≤ is weakly antisymmetric). The Hasse diagram of a poset is a diagrammatic means of representing a poset. It is obtained by connecting a point on the plane representing an element a of the poset to each of its immediate successors (if there are any) by a line which goes upwards from that point. We say that b is an immediate successor of a if a < b (i.e. a ≤ b and a ≠ b) and if a ≤ c ≤ b implies a = c or c = b; we also then say that a is an immediate predecessor of b.

Equivalence relations

An equivalence relation, ≡, on a set X is a binary relation which satisfies: for all a ∈ X, a ≡ a (≡ is reflexive); for all a, b ∈ X, a ≡ b implies b ≡ a (≡ is symmetric); for all a, b, c ∈ X, a ≡ b and b ≡ c implies a ≡ c (≡ is transitive). The (≡-)equivalence class of an element a ∈ X is denoted [a]_≡, a/≡ or similar, and is {b ∈ X : b ≡ a}. The key point is that equivalence classes are equal or disjoint: if a, b ∈ X then either [a] = [b] or [a] ∩ [b] = ∅. Thus the distinct ≡-equivalence classes partition X into disjoint subsets.
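The "equal or disjoint" fact can be checked mechanically on small examples; the following sketch (illustrative only, with the relation given as a set of pairs assumed to be reflexive, symmetric and transitive) computes the equivalence classes of a finite equivalence relation and observes that they partition the set.

    def classes(X, E):
        """The distinct equivalence classes of the relation E on the finite set X."""
        return {frozenset(b for b in X if (a, b) in E) for a in X}

    # congruence modulo 3 on {0, ..., 8}:
    X = set(range(9))
    E = {(a, b) for a in X for b in X if (a - b) % 3 == 0}
    assert classes(X, E) == {frozenset({0, 3, 6}),
                             frozenset({1, 4, 7}),
                             frozenset({2, 5, 8})}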


3.6 Definable Sets

If φ is a formula of a predicate language L and φ has just the one free variable, x say (in which case we write φ(x) to show the free variable explicitly), then we can look at the "solution set" of φ in any particular L-structure M. This solution set is written as φ(M) and it's a subset of the underlying set M of M, being the set of all elements a ∈ M such that, if each free occurrence of x in φ is replaced by a, then the result (a "formula with parameter a") is true in M. That is:
φ(M) = {a ∈ M : φ(a) is true},
where φ(a) means the expression we get when we substitute each free occurrence of x by a.

I will give examples in lectures but it's something that you'll already have seen in less formal mathematical contexts, as is illustrated by the following examples. Suppose that our structure is the real line R with its usual arithmetic (+, ×, 0, 1) and order (≤) structure (I'll use the same notation, R, for the structure and for the underlying set). Take the formula φ, or φ(x), to be 0 ≤ x ≤ 1. Then the solution set is φ(R) = {a ∈ R : 0 ≤ a ≤ 1} - the closed interval with endpoints 0 and 1.

Suppose, again with the reals R as the structure, that our formula, with free variable x, let's call it ψ this time, is x × x = 1 + 1. Then the solution set is ψ(R) = {a ∈ R : a × a = 1 + 1} = {a ∈ R : a² = 2} = {−√2, √2}.

For yet another example, again using the reals, take the formula, say θ, with free variable x (so we can write θ(x)), to be ∃y (y × y = x). Then the solution set is θ(R) = {a ∈ R : ∃b ∈ R b² = a} = R^{≥0} - the set of non-negative reals (since these are exactly the elements which are the square of some real number).

(The solution set for a formula with more than one free variable can be defined in a similar, and probably obvious, way, but we'll concentrate on examples with one free variable.)
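For a finite structure the solution set can be computed by simply trying every element of the domain, which may help make the definition concrete. The sketch below is illustrative only: it uses the Structure class and the example structure M from the sketch in Section 3.4, restricts terms to variables, and handles only the connectives and quantifier actually needed here.

    def holds(M, phi, asg):
        """Is phi true in the finite structure M under the assignment asg (a dict
        sending variables to elements of the domain)?"""
        op = phi[0]
        if op == 'eq':
            return asg[phi[1]] == asg[phi[2]]
        if op == 'rel':                           # ('rel', R, x1, ..., xn)
            return tuple(asg[x] for x in phi[2:]) in M.relations[phi[1]]
        if op == 'not':
            return not holds(M, phi[1], asg)
        if op == 'and':
            return holds(M, phi[1], asg) and holds(M, phi[2], asg)
        if op == 'exists':
            x, body = phi[1], phi[2]
            return any(holds(M, body, {**asg, x: a}) for a in M.domain)
        raise ValueError("connective not handled in this sketch")

    def solution_set(M, phi, x):
        """φ(M) for a formula φ with the one free variable x, M a finite Structure."""
        return {a for a in M.domain if holds(M, phi, {x: a})}

    # In the four-element ordered structure defined earlier, ∃y (y ≤ x ∧ ¬(y = x))
    # defines the set of non-minimal elements {1, 2, 3}:
    phi = ('exists', 'y', ('and', ('rel', '<=', 'y', 'x'),
                           ('not', ('eq', 'y', 'x'))))
    assert solution_set(M, phi, 'x') == {1, 2, 3}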
