Proof Theoretical Reasoning – Lectures 1 and 2
Revantha Ramanayake and Björn Lellmann, TU Wien
TRS Reasoning School 2015
Aims (Lectures 1 and 2)
- An introduction to proof theory (the formal study of proofs).
- Emphasis on structural proof theory: various proof systems for different logics and their structural properties.
- Which logics? We focus on nonclassical logics, in particular standard normal modal logics and substructural logics, as case studies.
- To provide a fuller introduction to these logics, we also introduce their semantics.
Lectures 3 and 4 (Björn Lellmann) will discuss more specialised topics and extensions of the proof theory presented here.
A (very) brief history of proof theory
- Proofs are the essence of mathematics: to establish a theorem, present a proof! Historically, however, proofs were not themselves the objects of mathematical investigation (unlike numbers, triangles, ...).
- The so-called foundational crisis of mathematics (early 1900s) arose from various challenges to the implicit assumption that consistency of the 'foundations of mathematics' could be shown within mathematics.
- For example, the discovery of Russell's paradox (1902) showed that naive set theory was an inadequate foundation.
- Certain arguments that were previously accepted as valid were shown to require further assumptions (e.g. the Axiom of Choice).
- The need for a precise development of the underlying logical systems became apparent.
Hilbert's Proof Theory
- Hilbert and his school viewed mathematics as strings of symbols, manipulated by formal inference rules that permit one to pass from one string of symbols to another (the formalist perspective). In this way proofs themselves become formal objects.
- Elementary (finitary, real) objects: typically the integers, but also elementary properties of elementary objects (think of the statements of Fermat's last theorem or the four-colour theorem).
- An elementary proof uses only elementary objects and elementary properties. The methods of proof are those that we do not doubt (such as simple combinatorial facts).
- In contrast, Hilbert's abstract objects do not actually exist (e.g. infinite sets, Hilbert spaces, ultrafilters), but are an elegant way of expressing things.
- (Hilbert's notion of 'elementary' is not completely clear.)
Hilbert's Program (around 1922)
- Abstract objects permit simple and elegant proofs, but are they absolutely essential? That is, do we really need abstract objects?
- Hilbert's program refers to the attempt to show that any elementary property that has a proof can also be proved using elementary methods alone (i.e. abstract objects are not required).
- The success of Hilbert's program would yield consistency proofs.
- E.g. if set theory were inconsistent then "0 = 1" would be provable. By Hilbert's program there would then be an elementary proof of "0 = 1", but this is impossible (the elementary methods are exactly those whose correctness we do not doubt).
- Before we follow the subsequent story of Hilbert's program, let us see what we mean by a formal proof...
Hilbert calculus: a formal proof system
To investigate provability formally we need a formal definition of proof.
Define the classical propositional language: a countable set of propositional variables p1, p2, ... and the logical connectives →, ¬, ∧, ∨, ⊥, ⊤. Formulae: A → B, ¬A, A ∧ B, A ∨ B, ⊥, ⊤ for formulae A and B.
A Hilbert calculus for propositional classical logic Cp (the usual boolean, truth-table logic). Axiom schemata (the schematic variables A, B, C stand for formulae):
Ax 1: A → (B → A)
Ax 2: (A → (B → C)) → ((A → B) → (A → C))
Ax 3: (¬A → ¬B) → ((¬A → B) → A)
and the rule of modus ponens: from A and A → B infer B.
Derivation of A → A
Definition. A formal proof (derivation) of B is a finite sequence C1, C2, ..., Cn ≡ B of formulae where each element Cj is an axiom instance or follows from two earlier elements by modus ponens.
Ax 1: A → (B → A)
Ax 2: (A → (B → C)) → ((A → B) → (A → C))
Ax 3: (¬A → ¬B) → ((¬A → B) → A)
1. (A → ((A → A) → A)) → ((A → (A → A)) → (A → A))   [Ax 2]
2. A → ((A → A) → A)                                  [Ax 1]
3. (A → (A → A)) → (A → A)                            [MP: 1 and 2]
4. A → (A → A)                                        [Ax 1]
5. A → A                                              [MP: 3 and 4]
(A small checker for Hilbert-style proofs of this kind is sketched below.)
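To make the definition of a formal proof concrete, here is a small proof checker: a minimal sketch in Haskell, with illustrative names (Formula, isAxiom, checkProof) that are not from the lectures. It recognises instances of Ax 1-3 structurally, checks the modus ponens condition against earlier lines, and accepts the five-step derivation of A → A above.

```haskell
-- Minimal checker for Hilbert-style proofs in the calculus {Ax 1-3, MP}.
-- Names (Formula, isAxiom, checkProof, ...) are illustrative.
data Formula = Var String | Bot | Neg Formula
             | Imp Formula Formula | And Formula Formula | Or Formula Formula
             deriving (Eq, Show)

-- Is the formula an instance of one of the three axiom schemata?
isAxiom :: Formula -> Bool
isAxiom (Imp a (Imp _ a')) | a == a' = True                      -- Ax 1
isAxiom (Imp (Imp a (Imp b c)) (Imp (Imp a' b') (Imp a'' c')))
  | a == a' && a == a'' && b == b' && c == c' = True             -- Ax 2
isAxiom (Imp (Imp (Neg a) (Neg b)) (Imp (Imp (Neg a') b') a''))
  | a == a' && a == a'' && b == b' = True                        -- Ax 3
isAxiom _ = False

-- A proof is a list C1,...,Cn: each Cj is an axiom instance or follows
-- from two earlier elements by modus ponens.
checkProof :: [Formula] -> Bool
checkProof fs = all ok (zip [0 ..] fs)
  where
    ok (j, c) = isAxiom c || any (\p -> Imp p c `elem` take j fs) (take j fs)

-- The five-step derivation of A -> A from the slides, with A a variable.
a :: Formula
a = Var "A"

derivAA :: [Formula]
derivAA =
  [ Imp (Imp a (Imp (Imp a a) a)) (Imp (Imp a (Imp a a)) (Imp a a))  -- Ax 2
  , Imp a (Imp (Imp a a) a)                                          -- Ax 1
  , Imp (Imp a (Imp a a)) (Imp a a)                                  -- MP: 1 and 2
  , Imp a (Imp a a)                                                  -- Ax 1
  , Imp a a ]                                                        -- MP: 3 and 4

main :: IO ()
main = print (checkProof derivAA)  -- True
```

Representing proofs as flat lists matches the Hilbert-style definition exactly; the sequent-calculus sketches later in these notes use trees instead.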
Gödel's incompleteness theorems (1931): the fall of Hilbert's program
Take consistency to mean that A ∧ ¬A is not provable.
- Gödel's first incompleteness theorem shows that in a consistent theory there is a formula which is elementary and true but not provable using elementary methods.
- This already shows that Hilbert's program cannot be realised.
- Gödel's second incompleteness theorem exhibits such a formula:
- the theorem shows that if the theory T is consistent, then the statement Con(T) asserting its consistency cannot be proved using elementary methods.
- So, assuming that elementary mathematics is consistent, the statement asserting its consistency cannot be proved using elementary mathematics.
So what next for proof theory?
- The application of proof theory to establish Hilbert's program hits a roadblock.
- But this is not the end of proof theory... (phew)
- Relative consistency proofs: Gentzen showed the consistency of arithmetic using transfinite induction up to a certain ordinal ε₀.
- Using proof theory to prove properties (other than consistency) of the underlying theories.
- Many logics of interest support the subformula property (something close to Hilbert's program for certain logics of interest other than arithmetic).
- To understand some of these applications, let us consider some concrete formal proof systems...
Drawback of the Hilbert calculus: a lack of discernible structure
Hilbert calculi are good for defining new logics in terms of old ones: add/remove/change the axioms. However, there is a drawback. Consider the derivation of A → A:
1. (A → ((A → A) → A)) → ((A → (A → A)) → (A → A))   [Ax 2]
2. A → ((A → A) → A)                                  [Ax 1]
3. (A → (A → A)) → (A → A)                            [MP: 1 and 2]
4. A → (A → A)                                        [Ax 1]
5. A → A                                              [MP: 3 and 4]
What is the relation of the proof to A → A? There is no obvious structure to the proof.
Natural deduction and the sequent calculus (structural proof theory)
- The Hilbert calculus is not convenient for studying the proofs themselves:
- lack of structure: recall the derivation of A → A.
- Gentzen introduced natural deduction, another proof formalism, aimed at formalising the way mathematicians reason.
- His normalisation theorem gives a normal form to natural deduction proofs ('proofs with nice structural properties').
- Gentzen used his proof theory to show the consistency of arithmetic in weak extensions of finitistic reasoning.
- Gentzen introduced another proof formalism with even more structure: the sequent calculus.
- The sequent calculus (and its later generalisations) is the standard tool in structural proof theory.
Another Hilbert calculus for classical logic
Here is a Hilbert calculus hCp for Cp, different from the previous one.
SLOGAN: There are typically many different proof systems for a given logic. Choose the calculus that best fits your need.
  p → (q → p)
  (p → (q → r)) → ((p → q) → (p → r))
  p ∧ q → p
  p ∧ q → q
  p → (q → (p ∧ q))
  p → p ∨ q
  q → p ∨ q
  (p → r) → ((q → r) → (p ∨ q → r))
  ⊥ → p
  p ∨ (p → ⊥)
  ¬p ↔ (p → ⊥)
  ⊤ ↔ ¬⊥
Inference rules: uniform substitution of arbitrary formulae for propositional variables in a formula, and modus ponens: from A and A → B infer B.
(A sketch of uniform substitution as a function on formulae appears below.)
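Since hCp uses concrete axioms over variables p, q, r together with the rule of uniform substitution, it may help to see that rule concretely. Below is a minimal sketch in Haskell; the names (Formula, substitute) are illustrative, not from the lectures.

```haskell
import qualified Data.Map as M

data Formula = Var String | Bot | Top | Neg Formula
             | Imp Formula Formula | And Formula Formula | Or Formula Formula
             deriving (Eq, Show)

-- Uniform substitution: replace every occurrence of each propositional
-- variable by the formula assigned to it (unmapped variables are kept).
substitute :: M.Map String Formula -> Formula -> Formula
substitute s (Var p)   = M.findWithDefault (Var p) p s
substitute _ Bot       = Bot
substitute _ Top       = Top
substitute s (Neg a)   = Neg (substitute s a)
substitute s (Imp a b) = Imp (substitute s a) (substitute s b)
substitute s (And a b) = And (substitute s a) (substitute s b)
substitute s (Or a b)  = Or  (substitute s a) (substitute s b)

-- Example: instantiate the axiom p -> (q -> p) with p := A ∧ B, q := ⊥.
main :: IO ()
main = print (substitute s (Imp (Var "p") (Imp (Var "q") (Var "p"))))
  where s = M.fromList [("p", And (Var "A") (Var "B")), ("q", Bot)]
```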
The sequent calculus sCp for classical logic Cp
init: p, X ⊢ Y, p
⊥l:   ⊥, X ⊢ Y
¬l:   from X ⊢ Y, A infer ¬A, X ⊢ Y
¬r:   from A, X ⊢ Y infer X ⊢ Y, ¬A
∧l:   from A, B, X ⊢ Y infer A ∧ B, X ⊢ Y
∧r:   from X ⊢ Y, A and X ⊢ Y, B infer X ⊢ Y, A ∧ B
∨l:   from A, X ⊢ Y and B, X ⊢ Y infer A ∨ B, X ⊢ Y
∨r:   from X ⊢ Y, A, B infer X ⊢ Y, A ∨ B
→l:   from X ⊢ Y, A and B, X ⊢ Y infer A → B, X ⊢ Y
→r:   from A, X ⊢ Y, B infer X ⊢ Y, A → B
- Here X, Y are multisets of formulae (possibly empty).
- X ⊢ Y is a sequent: X is the antecedent, Y the succedent.
- Aside: this differs slightly from the calculus Gentzen used (not important here).
The sequent calculus sCp for classical logic Cp (continued)
Recall the rules of sCp from the previous slide.
- A premise/conclusion is a sequent above/below the line of a rule.
- A 0-premise rule is called an initial sequent.
- A derivation in the sequent calculus is an initial sequent or a rule applied to derivations of the premise(s).
The sequent calculus sCp for classical logic Cp (continued)
Recall the rules of sCp from the previous slides.
- A principal formula is the formula containing the newly introduced logical connective.
- The auxiliary formula(e) are the formulae in the premises from which the principal formula is built.
- The multisets X and Y are the context.
A derivation in sCp
An attempted derivation of Ax 2, written as numbered steps; each sequent follows from the cited earlier steps:
1. B, A ⊢ C, B                               (leaf of the form A, X ⊢ Y, A)
2. C, B, A ⊢ C                               (leaf of the form A, X ⊢ Y, A)
3. B → C, B, A ⊢ C                           (→l: 1, 2)
4. B, A ⊢ C, A                               (leaf of the form A, X ⊢ Y, A)
5. B, A, A → (B → C) ⊢ C                     (→l: 4, 3)
6. A, A → (B → C) ⊢ C, A                     (leaf of the form A, X ⊢ Y, A)
7. A, A → B, A → (B → C) ⊢ C                 (→l: 6, 5)
8. A → B, A → (B → C) ⊢ A → C                (→r: 7)
9. A → (B → C) ⊢ (A → B) → (A → C)           (→r: 8)
10. ⊢ (A → (B → C)) → ((A → B) → (A → C))    (→r: 9)
- Actually, the above is not yet a derivation: the initial sequents have the form p, X ⊢ Y, p, not A, X ⊢ Y, A.
- The height of a derivation is the maximal number of sequents on a branch of the derivation.
- The size of a formula is the number of connectives in it plus one.
- Task: show that A, X ⊢ Y, A is derivable for every formula A. The argument is by induction on the size of A (a sketch of the construction appears below).
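As a sketch of the construction asked for in the task, the following Haskell function (illustrative names; multisets are represented as lists) builds a derivation of A, X ⊢ Y, A by recursion on A, using only the sCp rules. Each Deriv node records the rule used, its conclusion, and its subderivations.

```haskell
data Formula = Var String | Bot | Neg Formula
             | Imp Formula Formula | And Formula Formula | Or Formula Formula
             deriving (Eq, Show)

-- A sequent X |- Y, with the multisets X and Y represented as lists.
data Sequent = Seq [Formula] [Formula] deriving Show

-- A derivation: the rule name, its conclusion, and its subderivations.
data Deriv = Deriv String Sequent [Deriv] deriving Show

-- Build a derivation of  A, X |- Y, A  in sCp by induction on the size of A.
expand :: Formula -> [Formula] -> [Formula] -> Deriv
expand f xs ys = case f of
    Var _   -> Deriv "init" goal []
    Bot     -> Deriv "⊥l" goal []
    Neg a   -> Deriv "¬l" goal
                 [Deriv "¬r" (Seq xs (ys ++ [Neg a, a]))
                    [expand a xs ys]]
    And a b -> Deriv "∧r" goal
                 [ Deriv "∧l" (Seq (And a b : xs) (ys ++ [a]))
                     [expand a (b : xs) ys]
                 , Deriv "∧l" (Seq (And a b : xs) (ys ++ [b]))
                     [expand b (a : xs) ys] ]
    Or a b  -> Deriv "∨l" goal
                 [ Deriv "∨r" (Seq (a : xs) (ys ++ [Or a b]))
                     [expand a xs (ys ++ [b])]
                 , Deriv "∨r" (Seq (b : xs) (ys ++ [Or a b]))
                     [expand b xs (ys ++ [a])] ]
    Imp a b -> Deriv "→r" goal
                 [Deriv "→l" (Seq (a : Imp a b : xs) (ys ++ [b]))
                    [ expand a xs (ys ++ [b])
                    , expand b (a : xs) ys ]]
  where
    goal = Seq (f : xs) (ys ++ [f])

main :: IO ()
main = print (expand (Imp (Var "p") (Var "q")) [] [])
```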
The sequent calculus sCp for classical logic Cp (continued)
Recall the rules of sCp.
- We have traded many axioms and few rules in hCp for no axioms and many rules in sCp. So what's the point?
- Notice that each connective has a rule introducing it into the antecedent and a rule introducing it into the succedent of the conclusion.
- Every rule introduces exactly one logical connective into the conclusion.
- Subformula property: every formula in the premise(s) is a subformula of some formula in the conclusion.
Subformula property is the point
- Subformula property: every formula in the premise(s) is a subformula of some formula in the conclusion.
- A derivation in the sequent calculus is an initial sequent or a rule applied to derivations of the premise(s).
- Consider a derivation of the sequent X ⊢ Y. Every formula appearing in this derivation must be a subformula of some formula in X ⊢ Y!
- Unlike in the Hilbert calculus, the proof has a nice structure!
- But how do we know this is a sequent calculus for the logic Cp defined using the Hilbert calculus hCp?
- (What do we mean by classical logic anyhow? It can be shown (a nice exercise) that hCp generates exactly the logic given by the usual truth-table semantics. That is the logic we mean.)
(Height-preserving) admissible rules in sCp
We say that a rule ρ is admissible in sCp if the conclusion of ρ is derivable whenever the premises are derivable. The left and right weakening rules and the left and right contraction rules are admissible in sCp:
lw: from X ⊢ Y infer A, X ⊢ Y
rw: from X ⊢ Y infer X ⊢ Y, A
lc: from A, A, X ⊢ Y infer A, X ⊢ Y
rc: from X ⊢ Y, A, A infer X ⊢ Y, A
Admissibility of lw and rw is proved by induction on the height of the premise derivation. The rules lc and rc must be tackled together, i.e. prove that lc and rc are admissible simultaneously by induction on the height of the premise derivation. A rule is height-preserving admissible if the derivation of the conclusion has height no greater than the maximal premise height. All the above rules are height-preserving admissible in sCp. (Rules not containing logical connectives are called structural rules.)
Soundness and completeness of sCp for Cp
We need to prove that sCp is actually a sequent calculus for the logic Cp defined using the Hilbert calculus hCp.
Theorem. For every formula A we have: ⊢ A is derivable in sCp ⇔ A ∈ Cp.
The (⇒) direction is soundness: 'whatever we derive is actually a theorem of the logic'.
The (⇐) direction is completeness: 'whatever is a theorem of the logic is derivable'.
Proof of completeness
We need to show: A ∈ Cp ⇒ ⊢ A is derivable in sCp. Show that every axiom of Cp is derivable (easy; the derivation of Ax 2 given earlier is an example) and that modus ponens can be simulated in sCp (not easy; next slide).
How to simulate modus ponens
Gentzen's solution: to simulate modus ponens (from A and A → B infer B), first add a new rule, the cut rule, to sCp:
cut: from X ⊢ Y, A and A, U ⊢ V infer X, U ⊢ Y, V
Formula A is the cutformula. The following instance of the cut rule illustrates the simulation of modus ponens: from ⊢ A → B and ⊢ A derive ⊢ B. Starting from A ⊢ A and B ⊢ B, the rule →l gives A → B, A ⊢ B; cutting this against ⊢ A → B (cutformula A → B) yields A ⊢ B; cutting A ⊢ B against ⊢ A (cutformula A) yields ⊢ B.
So: A ∈ Cp ⇒ ⊢ A is derivable in sCp + cut!
Proof of soundness
We need to show: ⊢ A derivable in sCp + cut ⇒ A ∈ Cp. We interpret sCp + cut derivations in Cp. For a sequent S of the form
  A1, A2, ..., Am ⊢ B1, B2, ..., Bn
define the translation τ(S) as
  A1 ∧ A2 ∧ ... ∧ Am → B1 ∨ B2 ∨ ... ∨ Bn
Comma on the left is conjunction, comma on the right is disjunction. The translations of the initial sequents are theorems of Cp:
  p ∧ X → Y ∨ p        ⊥ ∧ X → Y
(reading X and Y here as the conjunction, resp. disjunction, of their members). Then show for each remaining rule ρ: if the translation of every premise is a theorem of Cp then so is the translation of the conclusion. For the →r rule (from A, X ⊢ B infer X ⊢ A → B) we need to show: if A ∧ X → B is a theorem then so is X → (A → B). (A translation function is sketched below.)
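The translation τ can be written down directly. A minimal sketch in Haskell with illustrative names; the empty antecedent is read as ⊤ and the empty succedent as ⊥, a standard convention that the slide leaves implicit.

```haskell
data Formula = Var String | Bot | Top | Neg Formula
             | Imp Formula Formula | And Formula Formula | Or Formula Formula
             deriving (Eq, Show)

data Sequent = Seq [Formula] [Formula] deriving Show

-- τ(A1,...,Am ⊢ B1,...,Bn) = A1 ∧ ... ∧ Am → B1 ∨ ... ∨ Bn.
-- Convention assumed here: an empty antecedent is read as ⊤,
-- an empty succedent as ⊥.
tau :: Sequent -> Formula
tau (Seq ants sucs) = Imp (conj ants) (disj sucs)
  where
    conj [] = Top
    conj fs = foldr1 And fs
    disj [] = Bot
    disj fs = foldr1 Or fs

main :: IO ()
main = print (tau (Seq [Var "p", Neg (Var "q")] [Var "r"]))
-- Imp (And (Var "p") (Neg (Var "q"))) (Var "r")
```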
The cut rule is undesirable in sCp + cut
We have shown:
Theorem. For every formula A we have: ⊢ A is derivable in sCp + cut ⇔ A ∈ Cp.
- The subformula property states that every formula in a premise appears as a subformula of a formula in the conclusion.
- If all the rules of the calculus satisfy this property, the calculus is analytic.
- Analyticity is crucial to using the calculus (for consistency, decidability, ...).
- sCp + cut is not analytic because of the cut rule: the cutformula A in the premises of
  cut: from X ⊢ Y, A and A, U ⊢ V infer X, U ⊢ Y, V
  need not be a subformula of any formula in the conclusion X, U ⊢ Y, V.
- We therefore want to show: ⊢ A is derivable in sCp ⇔ A ∈ Cp.
Gentzen's Hauptsatz (main theorem): cut-elimination
Theorem. Suppose that δ is a derivation of X ⊢ Y in sCp + cut. Then there is a transformation eliminating the instances of the cut rule from δ to obtain a derivation δ' of X ⊢ Y in sCp.
Suppose that we are given a derivation δ of X ⊢ Y containing a single occurrence of the cut rule, as the last rule of the derivation. Argue by principal induction on the size of the cutformula and secondary induction on the cutheight (the sum of the premise derivation heights) that there is a cutfree derivation of X ⊢ Y. Since ⊢ A is derivable in sCp + cut ⇔ A ∈ Cp:
Theorem. For every formula A we have: ⊢ A is derivable in sCp if and only if A ∈ Cp.
We have a sequent calculus with the subformula property for Cp!
Proof of Gentzen's Hauptsatz I
- (Base case) A derivation of minimal cutheight concluding in a cut rule must have the following form: both premises are initial sequents p, X ⊢ Y, p and p, U ⊢ V, p, with cutformula p, and the conclusion is p, X, U ⊢ Y, V, p. The conclusion is already an initial sequent, so we do not need the cut!
- (Inductive case) The cutformula A is not principal in at least one of the premises of the cut rule. E.g. (superscripts indicate derivation heights): the left premise X ⊢^(k+1) Y, C ∨ D, A is obtained by ∨r from X ⊢^k Y, C, D, A, the right premise is A, U ⊢^l V, C ∨ D, and the cut yields X, U ⊢ Y, V, C ∨ D. The cutheight is k + l + 1.
Proof of Gentzen's Hauptsatz II
Continuing the case in which the cutformula A is not principal in the left premise: lift the cut upwards, i.e. cut X ⊢^k Y, C, D, A against A, U ⊢^l V, C ∨ D to obtain X, U ⊢ Y, V, C, D, C ∨ D. This cut has reduced cutheight k + l (< k + l + 1), so apply the induction hypothesis to get a cutfree derivation of X, U ⊢ Y, V, C, D, C ∨ D. Apply the rule ∨r to get a cutfree derivation of X, U ⊢ Y, V, C ∨ D, C ∨ D. By admissibility of rc we get a cutfree derivation of X, U ⊢ Y, V, C ∨ D.
Proof of Gentzen's Hauptsatz III
- (Inductive case) The cutformula is principal in both premises, e.g. the cutformula is A → B: the left premise X ⊢^(k+1) Y, A → B is obtained by →r from A, X ⊢^k Y, B, the right premise A → B, U ⊢^(1+max(l,m)) V is obtained by →l from U ⊢^l V, A and B, U ⊢^m V, and the cut yields X, U ⊢ Y, V.
Lift the cut upwards: cut A, X ⊢ Y, B against B, U ⊢ V (cutformula B) to obtain A, X, U ⊢ Y, V. Since the size |B| of the cutformula is smaller than before (it was |A → B|), apply the induction hypothesis to get a cutfree derivation of A, X, U ⊢ Y, V.
Proof of Gentzen's Hauptsatz IV
From above: we have a cutfree derivation of A, X, U ⊢ Y, V. Now cut U ⊢ V, A against A, X, U ⊢ Y, V (cutformula A) to obtain X, U, U ⊢ Y, V, V. Since the size |A| of the cutformula is smaller than before (A → B), apply the induction hypothesis to obtain a cutfree derivation of X, U, U ⊢ Y, V, V. By admissibility of lc and rc we get X, U ⊢ Y, V as required.
- A cutfree proof is typically much longer than a proof with cuts (recall Hilbert's abstract objects).
- Cut-elimination amounts to eliminating lemmata from a mathematical proof.
- Reflect: the cut rule was introduced to simulate modus ponens, yet the former is much more general than the latter. Why can we not just add modus ponens (for completeness) and then eliminate it? (Hint: the induction hypothesis would be too weak.)
Applications: consistency of classical logic
Consistency of classical logic is the statement that A ∧ ¬A ∉ Cp.
Theorem. Classical logic is consistent.
Proof by contradiction. Suppose that A ∧ ¬A ∈ Cp. Then ⊢ A ∧ ¬A is derivable in sCp (completeness). Let us try to derive it, reading upwards from ⊢ A ∧ ¬A: by ∧r the premises are ⊢ A and ⊢ ¬A, and by ¬r the latter comes from A ⊢. So ⊢ A and A ⊢ are derivable. Thus the empty sequent ⊢ must be derivable in sCp + cut (use cut on A) and hence in sCp (by cut-elimination). This is impossible (why?). QED.
Applications: decidability of classical logic
Theorem. Cp is decidable.
- Starting from a given formula A, repeatedly apply the rules backwards to ⊢ A (choosing some formula as principal).
- Since each backward rule application reduces the complexity of the sequent (a logical connective is deleted), the backward proof search terminates under any choice of principal formulae.
- In fact there are only finitely many backward proof searches (only finitely many formulae can be made principal at each step).
- If one such backward proof search yields a derivation, then A ∈ Cp; otherwise A ∉ Cp. (A sketch of such a proof-search procedure appears below.)
- Note: the above argument fails in sCp + lc + rc (why?), even though lc and rc are admissible in sCp. Different applications require different proof systems.
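The backward proof search just described can be implemented directly. Below is a minimal sketch in Haskell (illustrative names; lists are read as multisets). It always decomposes the first compound formula it finds; because the sCp rules are invertible this single search already decides derivability, but one could equally enumerate all choices of principal formula as described above.

```haskell
data Formula = Var String | Bot | Neg Formula
             | Imp Formula Formula | And Formula Formula | Or Formula Formula
             deriving (Eq, Show)

-- Backward proof search for sCp.  provable xs ys decides whether the
-- sequent xs |- ys (lists read as multisets) is derivable.
provable :: [Formula] -> [Formula] -> Bool
provable xs ys
  | Bot `elem` xs                         = True            -- ⊥l
  | any (`elem` ys) [v | v@(Var _) <- xs] = True            -- init
  | otherwise = case (break compound xs, break compound ys) of
      -- decompose a compound formula in the antecedent
      ((ls, f : rs), _) -> case f of
        Neg a   -> provable (ls ++ rs) (a : ys)                        -- ¬l
        And a b -> provable (a : b : ls ++ rs) ys                      -- ∧l
        Or a b  -> provable (a : ls ++ rs) ys
                   && provable (b : ls ++ rs) ys                       -- ∨l
        Imp a b -> provable (ls ++ rs) (a : ys)
                   && provable (b : ls ++ rs) ys                       -- →l
        _       -> False  -- unreachable: f is compound
      -- otherwise decompose a compound formula in the succedent
      (_, (ls, f : rs)) -> case f of
        Neg a   -> provable (a : xs) (ls ++ rs)                        -- ¬r
        And a b -> provable xs (a : ls ++ rs)
                   && provable xs (b : ls ++ rs)                       -- ∧r
        Or a b  -> provable xs (a : b : ls ++ rs)                      -- ∨r
        Imp a b -> provable (a : xs) (b : ls ++ rs)                    -- →r
        _       -> False  -- unreachable
      -- only atoms left and no init/⊥l match: not derivable
      _ -> False
  where
    compound (Var _) = False
    compound Bot     = False
    compound _       = True

main :: IO ()
main = do
  let p = Var "p"; q = Var "q"
  print (provable [] [Imp p (Imp q p)])    -- True  (Ax 1)
  print (provable [] [Or p (Imp p Bot)])   -- True  (excluded middle)
  print (provable [] [Imp (Or p q) p])     -- False
```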
First-order language
- We want to extend the propositional language with quantifiers. Consider ∀x.(∀y.((x + y) = (y + x))).
- It contains the quantifier symbol ∀, variables x and y, a function symbol + (which could also be written +(x, y)), and the binary relation symbol =.
- These are the essential components of a first-order language: variable, function and relation symbols, and the propositional connectives augmented by the quantifiers ∀ ('for all') and ∃ ('there exists').
- A variable x is bound if it appears in the scope of ∀ or ∃; otherwise it is free.
- A first-order theory consists of two types of axioms: the proper axioms of the particular theory (e.g. number theory, group theory) and the predicate axioms that make the first-order logic work.
Hilbert calculus hCL for pure first-order logic CL
Axiom schemata (the schematic variables A, B, C stand for formulae):
Ax 1: A → (B → A)
Ax 2: (A → (B → C)) → ((A → B) → (A → C))
Ax 3: (¬A → ¬B) → ((¬A → B) → A)
Ax 4: ∀x.(A → B) → (∀x.A → ∀x.B)
Ax 5: A → ∀x.A, provided x is not free in A
Ax 6: ∀x.A → A[x/t], provided there is no variable capture
and the rules: modus ponens (mp: from A and A → B infer B) and universal generalisation (from A infer ∀x.A).
Sequent rules for pure first-order logic CL
Obtain the sequent calculus sCL by extending sCp with the following rules:
∀l:  from A[x/y], X ⊢ Y infer ∀x.A, X ⊢ Y
∀r*: from X ⊢ Y, A[x/y] infer X ⊢ Y, ∀x.A
∃l*: from A[x/y], X ⊢ Y infer ∃x.A, X ⊢ Y
∃r:  from X ⊢ Y, A[x/y] infer X ⊢ Y, ∃x.A
The starred rules ∀r and ∃l have the variable restriction (eigenvariable condition): the eigenvariable y is not free in the conclusion. (A substitution sketch respecting variable capture appears below.)
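The substitution A[x/y] and the 'no variable capture' proviso can be made precise with a small function. A minimal sketch in Haskell (illustrative names; terms are restricted to bare variables for brevity). The eigenvariable condition on ∀r and ∃l is a separate check on the conclusion and is not shown here.

```haskell
-- First-order formulae over a relational signature; terms are kept to
-- bare variables for brevity.  Names are illustrative.
data Fml = Rel String [String]      -- atomic formula R(x1,...,xk)
         | FImp Fml Fml
         | FNeg Fml
         | Forall String Fml
         deriving (Eq, Show)

freeVars :: Fml -> [String]
freeVars (Rel _ vs)   = vs
freeVars (FImp a b)   = freeVars a ++ freeVars b
freeVars (FNeg a)     = freeVars a
freeVars (Forall x a) = filter (/= x) (freeVars a)

-- A[x/y]: replace the free occurrences of x by y, returning Nothing when
-- y would be captured by a quantifier (the proviso on Ax 6).
subst :: String -> String -> Fml -> Maybe Fml
subst x y (Rel r vs)   = Just (Rel r [if v == x then y else v | v <- vs])
subst x y (FImp a b)   = FImp <$> subst x y a <*> subst x y b
subst x y (FNeg a)     = FNeg <$> subst x y a
subst x y f@(Forall z a)
  | z == x                        = Just f    -- x is bound here: nothing to replace
  | z == y && x `elem` freeVars a = Nothing   -- y would be captured
  | otherwise                     = Forall z <$> subst x y a

main :: IO ()
main = do
  print (subst "x" "y" (Forall "z" (Rel "R" ["x", "z"])))  -- Just (Forall "z" ...)
  print (subst "x" "y" (Forall "y" (Rel "R" ["x", "y"])))  -- Nothing (capture)
```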
Soundness and completeness of sCL wrt CL
- To check completeness of sCL for CL, it suffices to derive all the axioms and simulate the rules (once again we need to add a cut rule to simulate modus ponens).
- Cut-elimination holds; verify it. The interesting cases are when the cutformula has the form ∀x.A or ∃x.A and is principal in both premises of the cut rule.
- It is instructive to verify soundness of the rules wrt CL.
- Here is an example illustrating why the variable condition is required: from x = 2 ⊢ x + 1 = 3 we must not conclude x = 2 ⊢ ∀x.(x + 1 = 3); this rule application is not legal due to the variable restriction!
Read Gentzen's highly readable papers "Investigations into logical deduction" and "New version of the consistency proof for elementary number theory".
A Hilbert calculus for intuitionistic logic
Here is a Hilbert calculus hIp for Ip (the axioms of hCp without the excluded-middle axiom p ∨ (p → ⊥)):
  p → (q → p)
  (p → (q → r)) → ((p → q) → (p → r))
  p ∧ q → p
  p ∧ q → q
  p → (q → (p ∧ q))
  p → p ∨ q
  q → p ∨ q
  (p → r) → ((q → r) → (p ∨ q → r))
  ⊥ → p
  ¬p ↔ (p → ⊥)
  ⊤ ↔ ¬⊥
Inference rules: uniform substitution of arbitrary formulae for propositional variables in a formula, and modus ponens: from A and A → B infer B.
The Hilbert calculus hIL for first-order intuitionistic logic IL is obtained by adding the axioms and rules for the quantifiers.
Gentzen's sequent calculus LJ for intuitionistic logic
- What about a sequent calculus for intuitionistic logic?
- Gentzen showed that his sequent calculus LJ for IL can be obtained from his sequent calculus LK for CL by restricting the succedent of sequents to at most one formula, i.e. sequents have the form X ⊢ Φ where Φ is empty or a single formula.
- Structural properties of the sequent calculus can be varied to obtain calculi for new logics!
- The sequent calculus sCL we introduced for CL needs to be amended slightly to obtain a sequent calculus sIL for IL.
- (The reason we cannot reproduce Gentzen's result directly is that we 'fine-tuned' sCL to make the preceding exposition easier; choose the proof system that best fits your need.)
The sequent calculus sIL for IL
Here Φ is either empty or a single formula.
init: p, X ⊢ p
⊥l:   ⊥, X ⊢ Φ
¬l:   from X ⊢ A infer ¬A, X ⊢
¬r:   from A, X ⊢ infer X ⊢ ¬A
∧l:   from A, B, X ⊢ Φ infer A ∧ B, X ⊢ Φ
∧r:   from X ⊢ A and X ⊢ B infer X ⊢ A ∧ B
∨l:   from A, X ⊢ Φ and B, X ⊢ Φ infer A ∨ B, X ⊢ Φ
∨r:   from X ⊢ Ai infer X ⊢ A1 ∨ A2 (i = 1, 2)
→l:   from A → B, X ⊢ A and B, X ⊢ Φ infer A → B, X ⊢ Φ
→r:   from A, X ⊢ B infer X ⊢ A → B
∀l:   from A[x/y], X ⊢ Φ infer ∀x.A, X ⊢ Φ
∀r*:  from X ⊢ A[x/y] infer X ⊢ ∀x.A
∃l*:  from A[x/y], X ⊢ Φ infer ∃x.A, X ⊢ Φ
∃r:   from X ⊢ A[x/y] infer X ⊢ ∃x.A
The starred rules ∀r and ∃l have the variable restriction: the eigenvariable y is not free in the conclusion.
Some further observations on sIL
- Use sIL to show that intuitionistic logic has the disjunction property: if A ∨ B ∈ IL then either A ∈ IL or B ∈ IL. (Obviously classical logic does not have the disjunction property... recall the axiom p ∨ (p → ⊥).)
- Backward proof search is no longer finite, due to the →l rule: from A → B, X ⊢ A and B, X ⊢ Φ infer A → B, X ⊢ Φ (the principal formula A → B is copied into the left premise).
- If we remove the auxiliary copy of A → B from the left premise of this rule, we can no longer prove the intuitionistic theorem ¬(A ∨ ¬A) → ⊥ (try it!).
- Dyckhoff (1992) shows that a terminating calculus can be obtained by replacing →l with the following four rules:
→1l: from B, p, X ⊢ Φ infer p → B, p, X ⊢ Φ
→2l: from C → (D → B), X ⊢ Φ infer (C ∧ D) → B, X ⊢ Φ
→3l: from C → B, D → B, X ⊢ Φ infer (C ∨ D) → B, X ⊢ Φ
→4l: from D → B, X ⊢ C → D and B, X ⊢ Φ infer (C → D) → B, X ⊢ Φ
Substructural logics: deleting structural rules from LJ
Here is Gentzen's sequent calculus LJ (X, U are lists of formulae; Φ is empty or a single formula):
init: p ⊢ p
∧l:  from Ai, X ⊢ Φ infer A1 ∧ A2, X ⊢ Φ (i = 1, 2)
∧r:  from X ⊢ A and X ⊢ B infer X ⊢ A ∧ B
∨l:  from A, X ⊢ Φ and B, X ⊢ Φ infer A ∨ B, X ⊢ Φ
∨r:  from X ⊢ Ai infer X ⊢ A1 ∨ A2 (i = 1, 2)
→l:  from X ⊢ A and B, U ⊢ Φ infer A → B, X, U ⊢ Φ
→r:  from A, X ⊢ B infer X ⊢ A → B
lc:  from A, A, X ⊢ Φ infer A, X ⊢ Φ
ex:  from X, A, B, Y ⊢ Φ infer X, B, A, Y ⊢ Φ
lw:  from X ⊢ Φ infer A, X ⊢ Φ
rw:  from X ⊢ infer X ⊢ A
We have made the structural rules explicit.
Equivalent rules in the presence of weakening and contraction I
In the presence of the weakening and contraction rules, there is some freedom in the form of certain rules. In LJ above, the conjunction rules have the form:
∧l:  from Ai, X ⊢ Φ infer A1 ∧ A2, X ⊢ Φ (i = 1, 2)
∧r:  from X ⊢ A and X ⊢ B infer X ⊢ A ∧ B
Equivalent rules in the presence of weakening and contraction II
In the presence of the weakening and contraction rules, the conjunction rules may equivalently be given as:
∧l:  from A1, A2, X ⊢ Φ infer A1 ∧ A2, X ⊢ Φ
∧r:  from X ⊢ A and Y ⊢ B infer X, Y ⊢ A ∧ B
The remaining rules of LJ are unchanged.
Deleting weakening and contraction yields more rules
Without weakening and contraction the two forms of the conjunction rules are no longer interderivable, so both are kept, as rules for two distinct connectives ∧ and ⊗:
init: p ⊢ p
∧l:  from Ai, X ⊢ Φ infer A1 ∧ A2, X ⊢ Φ (i = 1, 2)
∧r:  from X ⊢ A and X ⊢ B infer X ⊢ A ∧ B
⊗l:  from A1, A2, X ⊢ Φ infer A1 ⊗ A2, X ⊢ Φ
⊗r:  from X ⊢ A and Y ⊢ B infer X, Y ⊢ A ⊗ B
∨l:  from A, X ⊢ Φ and B, X ⊢ Φ infer A ∨ B, X ⊢ Φ
∨r:  from X ⊢ Ai infer X ⊢ A1 ∨ A2 (i = 1, 2)
→l:  from X ⊢ A and B, U ⊢ Φ infer A → B, X, U ⊢ Φ
→r:  from A, X ⊢ B infer X ⊢ A → B
ex:  from X, A, B, Y ⊢ Φ infer X, B, A, Y ⊢ Φ
What happens if we delete the exchange rule?
→ and ← in the absence of exchange I
This calculus has no structural rules.
init: p ⊢ p
∧l:  from X, Ai, Y ⊢ Φ infer X, A1 ∧ A2, Y ⊢ Φ (i = 1, 2)
∧r:  from X ⊢ A and X ⊢ B infer X ⊢ A ∧ B
⊗l:  from X, A1, A2, Y ⊢ Φ infer X, A1 ⊗ A2, Y ⊢ Φ
⊗r:  from X ⊢ A and Y ⊢ B infer X, Y ⊢ A ⊗ B
∨l:  from X, A, Y ⊢ Φ and X, B, Y ⊢ Φ infer X, A ∨ B, Y ⊢ Φ
∨r:  from X ⊢ Ai infer X ⊢ A1 ∨ A2 (i = 1, 2)
→l:  from X ⊢ A and U, B, V ⊢ Φ infer U, X, A → B, V ⊢ Φ
→r:  from A, X ⊢ B infer X ⊢ A → B
Notice that the principal and auxiliary formulae must now occur between contexts.
→ and ← in the absence of exchange II
The same calculus, but with the other implication ← in place of →:
←l:  from X ⊢ A and U, B, V ⊢ Φ infer U, B ← A, X, V ⊢ Φ
←r:  from X, A ⊢ B infer X ⊢ B ← A
Notice again that the principal and auxiliary formulae must occur between contexts.
→ and ← in the absence of exchange III
In the absence of exchange we therefore obtain a calculus with no structural rules and two implications, containing both pairs of rules:
→l:  from X ⊢ A and U, B, V ⊢ Φ infer U, X, A → B, V ⊢ Φ
→r:  from A, X ⊢ B infer X ⊢ A → B
←l:  from X ⊢ A and U, B, V ⊢ Φ infer U, B ← A, X, V ⊢ Φ
←r:  from X, A ⊢ B infer X ⊢ B ← A
together with the init, ∧, ⊗ and ∨ rules from the previous slides.
Substructural logics
Logics weaker than classical logic.
- Obtained from the proof theory by deleting structural rules from LJ.
- The calculus with no structural rules is called the Full Lambek calculus FL.
- The calculus with exchange is called FLe (so there is just a single implication connective →).
- Substructural logics are useful for reasoning, e.g., about natural language, vagueness, resources, dynamic data structures, algebraic varieties, ... and include:
  - intuitionistic logic,
  - relevance logics,
  - linear logic without exponentials,
  - fuzzy logics.
- Aside: FLc (FL with contraction) was recently shown to be undecidable! (Chvalovský and Horčík, 2014)
A Hilbert calculus for FLe
Connectives: →, ∨, ∧, ⊗, ⊤, ⊥, 0, 1.
Axiom schemata:
  α → α
  (α → β) → ((β → γ) → (α → γ))
  (α → (β → γ)) → (β → (α → γ))
  ((α ⊗ β) → γ) ↔ (α → (β → γ))
  (α ∧ β) → α
  (α ∧ β) → β
  ((α → β) ∧ (α → γ)) → (α → (β ∧ γ))
  α → (α ∨ β)
  β → (α ∨ β)
  ((α → γ) ∧ (β → γ)) → ((α ∨ β) → γ)
  α → ⊤
  ⊥ → α
  α ↔ (1 → α)
  ¬α ↔ (α → 0)
Rules:
  adjunction (adj): from α and β infer α ∧ β
  modus ponens (MP): from α and α → β infer β
Soundness and completeness are shown as before (introduce a cut rule to simulate modus ponens and show cut-elimination to ensure the subformula property).
Algebraic semantics of FLe
A commutative residuated lattice (CRL) is an algebra P = ⟨P, ∧, ∨, ⊗, →, 1⟩ such that:
1. ⟨P, ∧, ∨⟩ is a lattice with greatest element ⊤ and least element ⊥;
2. ⟨P, ⊗, 1⟩ is a commutative monoid;
3. for any x, y, z ∈ P: x ⊗ y ≤ z ⇐⇒ y ≤ x → z (residuation).
An FLe-algebra expands a CRL by an extra constant 0. Notation: we write a ≤ b instead of a = a ∧ b (equivalently a ∨ b = b).
- CRLs and FLe-algebras form varieties (equationally defined classes, closed under subalgebras, homomorphic images and direct products). (A brute-force check of the residuation law on a small example appears below.)
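The residuation condition can be checked by brute force on a small example. The sketch below (Haskell) uses the three-element chain {0, 1/2, 1} with x ⊗ y = min(x, y) and the corresponding residual; this particular algebra is chosen only for illustration and is not taken from the slides.

```haskell
-- The three-element Goedel chain {0, 1/2, 1}: a commutative residuated
-- lattice with x ⊗ y = min x y, unit 1, and x → z = if x <= z then 1 else z.
-- Chosen only to illustrate the residuation law; not from the lectures.
carrier :: [Rational]
carrier = [0, 1/2, 1]

fuse :: Rational -> Rational -> Rational
fuse = min

imp :: Rational -> Rational -> Rational
imp x z = if x <= z then 1 else z

-- Residuation:  x ⊗ y <= z  <=>  y <= x → z,  for all x, y, z in the carrier.
residuationHolds :: Bool
residuationHolds =
  and [ (fuse x y <= z) == (y <= imp x z)
      | x <- carrier, y <- carrier, z <- carrier ]

main :: IO ()
main = print residuationHolds  -- True
```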
Subvarieties of FLe-algebras
- The class FLe of all FLe-algebras is a variety.
- The classes of all Heyting algebras and of all Boolean algebras are subvarieties of FLe.
- Each of the following equations determines an important subvariety of FLe:
  - square increasing: x ≤ x²
  - integrality: x ≤ 1
  - minimality of 0: 0 ≤ x
  - linearity: 1 ≤ ((x → y) ∧ 1) ∨ ((y → x) ∧ 1)
  - prelinearity: 1 ≤ (x → y) ∨ (y → x)
  - ...
Modal Logics
"Modal languages are simple yet expressive languages for talking about relational structures" (Modal Logic, Blackburn, Venema and de Rijke).
- Augment the usual boolean connectives (¬, ∧, ∨, →, ⊥, ⊤) with modal operators like (but not limited to) ♦ and □.
- There is no variable binding, so the language is simpler than first-order.
- A relational structure is a set together with a collection of relations on the set.
- Relational structures appear everywhere:
- e.g. to describe mathematical structures, in theoretical computer science (model program execution as a set of states, where binary relations model the behaviour of the program), knowledge representation, economics, computational linguistics.
- We can already imagine that first-order and second-order languages are well equipped to talk about relational structures.
- The point is that modal languages are among the simplest languages for describing relational structures.
Modal language
- Let V be a set of variables. The formulae of modal logic are given by: F ::= V | F ∧ F | F ∨ F | F → F | ¬F | □F, with ♦A abbreviating the formula ¬□¬A.
- Equivalently, we could take ♦ as primitive, with □A abbreviating ¬♦¬A.
- Alternatively we could include both ♦ and □ in the signature.
- So ♦A and □A are said to be duals of each other.
- Recall the analogous duality ∀x.A = ¬∃x.¬A.
Some standard interpretations of the modal operators
1. Read ♦A as 'it is possibly the case that A'. So □A reads 'it is not possible that not A', or simply 'it is necessarily the case that A'. So what can we say about statements like □A → ♦A and ♦A → □♦A? Are they linked by logical consequence or independent?
2. Epistemic logic. Read □A as 'the agent knows A'. Or have lots of modal operators and read □ᵢA as 'the i-th agent knows A'. Since we use the word knowledge, we would expect □A → A ('if the agent knows A then A'; contrast with belief). But is it the case that A → □A ('if A, then the agent knows it')? What about □A → □□A?
Some standard interpretations (cont.)
1. Provability. Read □A as 'it is provable in Peano arithmetic that A'. It may be shown that □(□A → A) → □A (the Löb formula) holds.
2. Temporal language. Read ♦A as 'A holds at some future time' and the past-time operator ⧫A as 'A was true at some past time'. (What do the corresponding box operators say?)
3. Propositional dynamic logic. Read ⟨π⟩A as 'some terminating execution of program π from the present state leads to a state bearing information A'. So [π]A is 'every execution of program π from the present state leads to a state bearing information A'.
Interpreting the modal language on a relational structure
A frame consists of a nonempty set W of worlds and a binary relation R ⊆ W × W. A model is a pair (F, V) where F = (W, R) is a frame and V is a valuation, a function mapping each propositional variable p to a subset V(p) ⊆ W. Truth (satisfaction) at a world w in a model M is defined by:
M, w ⊨ p       iff  w ∈ V(p)
M, w ⊨ A ∧ B   iff  M, w ⊨ A and M, w ⊨ B
M, w ⊨ A ∨ B   iff  M, w ⊨ A or M, w ⊨ B
M, w ⊨ A → B   iff  M, w ⊭ A or M, w ⊨ B
M, w ⊨ □A      iff  for all v ∈ W: Rwv implies M, v ⊨ A
M, w ⊨ ♦A      iff  there is v ∈ W with Rwv and M, v ⊨ A
(A small model checker implementing these clauses is sketched below.)
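The satisfaction clauses translate directly into a small model checker. A minimal sketch in Haskell with illustrative names: worlds are integers, and the relation and the valuation are given explicitly.

```haskell
data Fml = Var String | FNot Fml | FAnd Fml Fml | FOr Fml Fml
         | FImp Fml Fml | Box Fml | Dia Fml
         deriving (Eq, Show)

-- A model: worlds W, accessibility relation R, and valuation V.
data Model = Model { worlds :: [Int]
                   , rel    :: [(Int, Int)]
                   , val    :: String -> [Int] }

sat :: Model -> Int -> Fml -> Bool
sat m w (Var p)    = w `elem` val m p
sat m w (FNot a)   = not (sat m w a)
sat m w (FAnd a b) = sat m w a && sat m w b
sat m w (FOr a b)  = sat m w a || sat m w b
sat m w (FImp a b) = not (sat m w a) || sat m w b
sat m w (Box a)    = and [ sat m v a | (u, v) <- rel m, u == w ]
sat m w (Dia a)    = or  [ sat m v a | (u, v) <- rel m, u == w ]

-- Example: two worlds, 0 R 1; p holds only at world 1.
example :: Model
example = Model [0, 1] [(0, 1)] (\p -> if p == "p" then [1] else [])

main :: IO ()
main = do
  print (sat example 0 (Box (Var "p")))  -- True: every R-successor of 0 satisfies p
  print (sat example 1 (Box (Var "p")))  -- True: vacuously, 1 has no successors
  print (sat example 0 (Dia (Var "p")))  -- True
  print (sat example 1 (Dia (Var "p")))  -- False
```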
Validity I
- A frame is a formalisation of the phenomenon we wish to capture (e.g. time as a linearly ordered set).
- A model 'dresses up' the frame with contingent information (the program executes at t = 4).
- Since logic is concerned with reasoning (invariant under local information), we need to consider those things that hold under all possible models.
- A formula that holds on all possible models is said to be valid.
Validity II
Definition. Formula A is valid at a state w in a frame F (written F, w ⊨ A) if for all valuations V it is the case that (F, V), w ⊨ A. Formula A is valid on the frame F if it is valid at every state in F. Formula A is valid on a class F of frames if A is valid on every frame in F.
- Given a class F of frames, the set Λ_F of formulae valid on F is called the logic of F.
- The definition of validity involves second-order quantification: 'for all valuations V' (i.e. over all subsets of W). (For a finite frame, validity can nevertheless be checked by brute force; see the sketch below.)
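For a finite frame the quantification over valuations can be carried out by brute force: enumerate every assignment of subsets of W to the variables occurring in A. A sketch in Haskell with illustrative names; it re-declares the types of the previous sketch so as to stand alone.

```haskell
import Data.List (nub, subsequences)

-- Formulae, frames and satisfaction, as in the previous sketch.
data Fml = Var String | FNot Fml | FAnd Fml Fml | FOr Fml Fml
         | FImp Fml Fml | Box Fml | Dia Fml deriving (Eq, Show)

type Frame = ([Int], [(Int, Int)])          -- (worlds W, relation R)

sat :: Frame -> (String -> [Int]) -> Int -> Fml -> Bool
sat fr v w fml = case fml of
  Var p    -> w `elem` v p
  FNot a   -> not (sat fr v w a)
  FAnd a b -> sat fr v w a && sat fr v w b
  FOr a b  -> sat fr v w a || sat fr v w b
  FImp a b -> not (sat fr v w a) || sat fr v w b
  Box a    -> and [ sat fr v x a | (u, x) <- snd fr, u == w ]
  Dia a    -> or  [ sat fr v x a | (u, x) <- snd fr, u == w ]

variables :: Fml -> [String]
variables (Var p)    = [p]
variables (FNot a)   = variables a
variables (Box a)    = variables a
variables (Dia a)    = variables a
variables (FAnd a b) = variables a ++ variables b
variables (FOr a b)  = variables a ++ variables b
variables (FImp a b) = variables a ++ variables b

-- A is valid on a finite frame iff it holds at every world under every
-- valuation of the variables of A by subsets of W (brute-force enumeration).
validOnFrame :: Frame -> Fml -> Bool
validOnFrame fr@(ws, _) a =
  and [ sat fr v w a | v <- valuations, w <- ws ]
  where
    ps = nub (variables a)
    valuations = [ \p -> maybe [] id (lookup p assignment)
                 | assignment <- mapM (\p -> [ (p, s) | s <- subsequences ws ]) ps ]

main :: IO ()
main = do
  let reflexive = ([0, 1], [(0, 0), (1, 1)]) :: Frame
      irrefl    = ([0, 1], [(0, 1)])         :: Frame
      axiomT    = FImp (Box (Var "p")) (Var "p")
  print (validOnFrame reflexive axiomT)  -- True
  print (validOnFrame irrefl    axiomT)  -- False
```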
The logics of various frame classes
- The logic of all frames.
- The logic of transitive frames, i.e. Λ_F where F ∈ F ⇔ F ⊨ ∀xyz.(Rxy ∧ Ryz → Rxz).
- The logic of reflexive frames, i.e. Λ_F where F ∈ F ⇔ F ⊨ ∀x.Rxx.
- The logic of finite (irreflexive) transitive trees (this frame class cannot be described by a first-order formula!).
Syntactic definition of modal logics
- So far we have defined modal logics in terms of the structures the modal language is intended to talk about, i.e. relational structures.
- The valid formulae then represent the properties that are invariant under local information.
- When we are concerned solely with such valid formulae, it makes sense to abstract away the details of the relational structure.
- Recall that we have seen this before! Instead of talking about the theorems of classical logic as those formulae that are valued t under all truth-table valuations, we focussed on generating the set of theorems in a nice way (i.e. via proofs with nice structure).
- I.e. we want nice syntactic mechanisms for generating Λ_F.
A Hilbert calculus hK for the normal modal logic K
- Define the Hilbert calculus hK to be the extension of the Hilbert calculus hCp for classical propositional logic with the following axioms and rule:
  normality:     □(A → B) → (□A → □B)
  duality:       □A ↔ ¬♦¬A
  necessitation: from A infer □A
- The first axiom is called the normality axiom.
- Axiomatic extensions of hK are called normal modal logics.
- Non-normal modal logics are also interesting, but are not discussed here.
- Syntactically speaking, the normality axiom permits modus ponens under □; necessitation allows us to add boxes.
- (We need not yet ask 'what do these rules and axioms mean?')
Soundness and completeness wrt semantics
- Our claim is that K is the logic of all frames, i.e. K = Λ_F where F is the class of all frames.
- Whatever is derivable in hK is valid on all frames (soundness).
- A formula valid on all frames is derivable (completeness).
- Soundness of the axioms: let M be an arbitrary model and w some world in M; show that each axiom holds on M at w.
- The classical axioms hold at any world by the definition of the satisfaction relation (the boolean connectives are interpreted classically).
- Next show soundness of the rules: supposing that the premises are valid, show that the conclusion is also valid.
- Completeness entails showing that if A is valid on all frames, then A is a theorem of the Hilbert calculus. This is non-trivial and omitted here.
Some axiomatic extensions of hK
- Consider the following axioms:
  4: □p → □□p        (or perhaps more clearly ♦♦p → ♦p)
  T: □p → p          (or perhaps more clearly p → ♦p)
  L: □(□p → p) → □p  (Löb axiom)
- We claim that the addition of these axioms to hK yields the following logics:
  K4: the logic of transitive frames
  KT: the logic of reflexive frames
  KL: the logic of finite (irreflexive) transitive trees
- For historical reasons, axiom T is reflexivity (and not transitivity!).
- Check soundness. Completeness is non-trivial.
Obtaining a sequent calculus for K
Let us try to derive the normality axiom □(A → B) → (□A → □B). Reading upwards: from A ⊢ A and B ⊢ B the rule →l gives A → B, A ⊢ B; we then need some way of passing from A → B, A ⊢ B to □(A → B), □A ⊢ □B, after which →r twice gives ⊢ □(A → B) → (□A → □B). We might therefore 'guess' the following rule:
K: from X ⊢ A infer □X ⊢ □A
(where □X means that every formula in X is prefixed with □).
A sequent calculus sK for the modal logic K
- Add the K rule (from X ⊢ A infer □X ⊢ □A) to the sequent calculus for classical logic.
- We claim that sK is sound and complete for K.
- Soundness: show the contrapositive by arguing via the semantics:
- if the translation of the conclusion is not valid, then the translation of some premise is not valid either.
- Completeness: show that sK derives all the axioms of hK and simulates all the rules.
- Only modus ponens is nontrivial. For this we need to show (surprise...) cut-elimination. (A backward proof-search sketch for sK appears below.)
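A backward proof-search sketch for sK (Haskell, illustrative names). It saturates with the propositional rules and, once only atoms and boxed formulae remain, tries the K rule backwards on each boxed succedent formula, building in the admissible weakening needed to discard the rest of the context. This is a standard proof-search strategy for K, though it is not spelled out on the slides.

```haskell
data Formula = Var String | Bot | Neg Formula | Box Formula
             | Imp Formula Formula | And Formula Formula | Or Formula Formula
             deriving (Eq, Show)

-- Backward proof search for sK: propositional rules as for sCp, and the
-- K rule applied backwards (with the admissible weakening built in):
-- to prove X |- Y with □A in Y, it suffices to prove {B | □B in X} |- A.
provableK :: [Formula] -> [Formula] -> Bool
provableK xs ys
  | Bot `elem` xs                         = True            -- ⊥l
  | any (`elem` ys) [v | v@(Var _) <- xs] = True            -- init
  | otherwise = case (break compound xs, break compound ys) of
      ((ls, f : rs), _) -> case f of                        -- left rules
        Neg a   -> provableK (ls ++ rs) (a : ys)
        And a b -> provableK (a : b : ls ++ rs) ys
        Or a b  -> provableK (a : ls ++ rs) ys && provableK (b : ls ++ rs) ys
        Imp a b -> provableK (ls ++ rs) (a : ys) && provableK (b : ls ++ rs) ys
        _       -> False
      (_, (ls, f : rs)) -> case f of                        -- right rules
        Neg a   -> provableK (a : xs) (ls ++ rs)
        And a b -> provableK xs (a : ls ++ rs) && provableK xs (b : ls ++ rs)
        Or a b  -> provableK xs (a : b : ls ++ rs)
        Imp a b -> provableK (a : xs) (b : ls ++ rs)
        _       -> False
      _ -> or [ provableK [ b | Box b <- xs ] [a]           -- K rule (backwards)
              | Box a <- ys ]
  where
    compound (Var _) = False
    compound Bot     = False
    compound (Box _) = False      -- boxed formulae are left to the K rule
    compound _       = True

main :: IO ()
main = do
  let p = Var "p"; q = Var "q"
      normality = Imp (Box (Imp p q)) (Imp (Box p) (Box q))
  print (provableK [] [normality])      -- True
  print (provableK [] [Imp p (Box p)])  -- False
```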
Cut-elimination for sK
- Recall the Gentzen-style cut-elimination (primary induction on the size of the cutformula, secondary induction on the cutheight):
  1. Base case: consider when the cutheight is minimal.
  2. Inductive case: either the cutformula is principal in both premises or it is not principal in at least one premise.
- Let us consider the case of principal cuts (i.e. the cutformula □A is principal in both premises): the left premise □X ⊢ □A comes by K from X ⊢ A, the right premise □A, □Y ⊢ □C comes by K from A, Y ⊢ C, and the cut yields □X, □Y ⊢ □C. Lift the cut upwards: cut X ⊢ A against A, Y ⊢ C to get X, Y ⊢ C; the induction hypothesis yields a cutfree derivation of X, Y ⊢ C; finally reapply K to obtain □X, □Y ⊢ □C.
A sequent calculus sK4 for K4
- Recall: K4 is the logic of transitive frames (T is for reflexive, remember?).
- Here is the rule encountered in the literature:
  4: from □X, X ⊢ A infer □X ⊢ □A
- Soundness and completeness of sK4 wrt K4:
- check soundness of the 4 rule and derive the 4 axiom;
- simulating modus ponens leads us to introduce the cut rule...
- ... subformula property considerations motivate us to eliminate the cut rule...
- ... and so on, as before.
A Kripke semantics for intuitionistic logic
- Intuitionistic logic can be given the following relational semantics.
- An intuitionistic model is a reflexive, transitive and antisymmetric model such that x ∈ V(p) and Rxy implies y ∈ V(p) (valuations are upward closed).
- Truth (satisfaction) at a world w in a model M is defined as for modal logic, except:
  M, w ⊨ p       iff  w ∈ V(p)
  M, w ⊨ A → B   iff  for all v, Rwv implies M, v ⊭ A or M, v ⊨ B
- We can verify that the →r rule of sIL is sound for intuitionistic logic (using the fact that M, w ⊨ A and Rwv implies M, v ⊨ A).
- What happens if we try to prove soundness of the classical rule (on the right)?
  →r:           from A, X ⊢ B infer X ⊢ A → B
  →r-classical: from A, X ⊢ Y, B infer X ⊢ Y, A → B
The modal provability logic GL
- GL = K + □(□p ⊃ p) ⊃ □p (Löb's axiom)
- GL is characterised by Kripke frames satisfying transitivity and having no infinite R-chains (finite transitive trees).
- Interpreting □A as "A is provable in Peano arithmetic" (frequently written Bew(A)), GL is sound and complete wrt the formal provability interpretation in Peano arithmetic (Solovay, 1976).
- Hence the name provability logic.
- The logic is decidable (a benefit of studying a fragment of Peano arithmetic in this way).
A sequent calculus for GL
- K: from X ⇒ A infer □X ⇒ □A
- K4 (the 4 axiom is □A ⊃ □□A and corresponds to transitivity): from □X, X ⇒ A infer □X ⇒ □A
- GL (axiomatised by the addition of □(□A ⊃ A) ⊃ □A to K): from □X, X, □A ⇒ A infer □X ⇒ □A (the GLR rule, Sambin and Valentini, 1982). □A is called the diagonal formula. The rule is motivated by the 4 rule.
The sequent calculus sGL for GL
Initial sequents: A ⇒ A for each formula A.
Logical rules:
L¬: from X ⇒ Y, A infer ¬A, X ⇒ Y
R¬: from A, X ⇒ Y infer X ⇒ Y, ¬A
L∧: from Ai, X ⇒ Y infer A1 ∧ A2, X ⇒ Y (i = 1, 2)
R∧: from X ⇒ Y, A1 and X ⇒ Y, A2 infer X ⇒ Y, A1 ∧ A2
L∨: from A1, X ⇒ Y and A2, X ⇒ Y infer A1 ∨ A2, X ⇒ Y
R∨: from X ⇒ Y, Ai infer X ⇒ Y, A1 ∨ A2 (i = 1, 2)
→L: from X ⇒ Y, A and B, U ⇒ W infer A ⊃ B, X, U ⇒ Y, W
→R: from A, X ⇒ Y, B infer X ⇒ Y, A ⊃ B
Modal rule:
GLR: from □X, X, □A ⇒ A infer □X ⇒ □A
Structural rules:
LW: from X ⇒ Y infer A, X ⇒ Y
RW: from X ⇒ Y infer X ⇒ Y, A
Soundness and completeness of sGL wrt GL
- As before, soundness can be verified by taking the contrapositive of each rule and falsifying on a finite tree.
- Completeness: simulate modus ponens with cut; eliminate cut to obtain the subformula property.
- An alternative semantic proof of completeness: since A ∈ GL implies F_GL ⊨ A (where F_GL is the class of GL frames), it suffices to show that F_GL ⊨ A implies that sGL derives ⇒ A, i.e. contrapositively: if there is no derivation of ⇒ A in sGL then F_GL ⊭ A.
- Idea: suppose that there is no derivation of ⇒ A. Use this failed proof search to falsify A on a finite tree.
- Nonetheless, the proof of cut-elimination is interesting, so let us close by sketching it.
Syntactic cut-elimination for GL - a brief history
- Leivant (1981) suggests a syntactic proof; counter-example by Valentini (1982).
- A new proof of syntactic cut-elimination for GLSset is proposed by Valentini (1983), by induction on degree · ω² + width · ω + cutheight.
- Subsequently Borga (1983) and Sasaki (2001) present new proofs.
- Moen (2001) claimed that Valentini's proof has a gap when contractions are made explicit.
- Many other proofs were subsequently presented as alternatives (e.g. Mints, Negri).
- Goré and R. (2008) show that Moen's claim is incorrect, introducing new transformations to deal with contraction.
Sambin Normal Form
The interesting case is the Sambin Normal Form (SNF), where both premise derivations Π and Ω are cutfree:
- Π ends with GLR applied to □X, X, □B ⇒ B (height k), yielding □X ⇒ □B (height k + 1);
- Ω ends with GLR applied to □B, B, □U, U, □D ⇒ D (height l), yielding □B, □U ⇒ □D (height l + 1);
- the two conclusions are cut on □B to give □X, □U ⇒ □D.
The cutheight is (k + 1) + (l + 1); the degree of the cutformula is d(□B).
The principal case: a derivation in SNF
A derivation is in Sambin Normal Form when:
- the last rule is the cut rule with cutfree premises, and
- the cutformula is principal by GLR in both premises.
A naive transformation to eliminate the cut:
  cut1 (cutformula B): from □X, X, □B ⇒ B (height k) and □B, B, □U, U, □D ⇒ D (height l) obtain □X, X, □B, □B, □U, U, □D ⇒ D.
  cut2 (cutformula □B): from □X ⇒ □B (height k + 1) and the conclusion of cut1 (height [k, l] + 1) obtain □X, □X, X, □U, U, □D ⇒ D.
  Apply LC*(□X) to get □X, X, □U, U, □D ⇒ D, and then GLR to get □X, □U ⇒ □D.
The cutheight of cut1 is k + l, but cut2 has the original cutformula □B and cutheight (k + 1) + ([k, l] + 1), which is not smaller than before. Problem with cut2!
A successful transformation for SNF
Transform the derivation in SNF as follows, where Σ is some cutfree derivation of □X, X ⇒ B:
  cut1 (cutformula □B): from □X ⇒ □B (height k + 1) and □B, B, □U, U, □D ⇒ D (height l) obtain □X, B, □U, U, □D ⇒ D.
  cut2 (cutformula B): from Σ: □X, X ⇒ B and the conclusion of cut1 obtain □X, X, □X, □U, U, □D ⇒ D.
  Apply LC*(□X) to get □X, X, □U, U, □D ⇒ D, and then GLR to get □X, □U ⇒ □D.
- cut1 has cutheight (k + 1) + l, smaller than before;
- cut2 has a cutformula of smaller degree.
New task: obtain a cutfree derivation of □X, X ⇒ B from the (cutfree) derivation of □X, X, □B ⇒ B.
A sketch of the proof
The width is the number n of occurrences of the following schema in the given derivation, where no GLR rule occurrences appear between GLR2 and GLR1:
  GLR2: from □G, G, □B, B, □C ⇒ C infer □G, □B ⇒ □C
  (further down, with no GLR in between)
  GLR1: from □X, X, □B ⇒ B infer □X ⊢ □B
If n = 0 then each occurrence of □B in the antecedent of □X, X, □B ⇒ B has been introduced either by
1. LW(□B): in this case delete the LW(□B) rule instance; or
2. an initial sequent □B ⇒ □B: replace it with □X ⊢ □B (derivable by GLR1).
In this way (contracting any duplicated copies of □X) we obtain a derivation of □X, X ⊢ B.
If n = k + 1, each occurrence of the above schema is deleted as follows. Replace the GLR2 inference (below left) by weakenings from an initial sequent (below right):
  left:  from □G, G, □B, B, □C ⇒ C by GLR2 infer □G, □B ⇒ □C
  right: □C ⇒ □C, then by LW infer □G, □B, □C ⇒ □C
Continuing downwards (the extra □C is carried along) we obtain a derivation of □X, □C ⊢ □B of smaller width.
Now proceed with two cuts:
  cut (on B):  from □X, X, □B ⊢ B and □G, G, □B, B, □C ⇒ C obtain □X, X, □G, G, □B, □C ⊢ C.
  cut (on □B): from □X, □C ⊢ □B (the derivation of smaller width) and □X, X, □G, G, □B, □C ⊢ C obtain □X, □C, □X, X, □G, G, □C ⊢ C.
The second cut has lesser width than before! So we obtain a cutfree derivation of □X, X, □G, G, □C ⊢ C (after contractions). Now replace the GLR2 inference in the original derivation (below left) with the inferences below right:
  left:  from □G, G, □B, B, □C ⇒ C by GLR2 infer □G, □B ⇒ □C
  right: from □X, X, □G, G, □C ⊢ C by GLR infer □X, □G ⊢ □C, then by LW infer □X, □G, □B ⊢ □C
We thus obtain a derivation, of lesser width, ending with GLR1: from □X, X, □B ⇒ B infer □X ⊢ □B.
Finally, recall that we started with the SNF derivation: Π ends with GLR applied to □X, X, □B ⊢ B giving □X ⊢ □B, Ω ends with GLR applied to □B, B, □U, U, □D ⊢ D giving □B, □U ⊢ □D, and the two are cut on □B to give □X, □U ⇒ □D. Transform this as follows, using the new derivation of □X ⊢ □B of lesser width:
  cut (on B):  from □X, X, □B ⊢ B and □B, B, □U, U, □D ⊢ D obtain □X, X, □U, U, □B, □D ⊢ D.
  cut (on □B): from the new derivation of □X ⊢ □B and □X, X, □U, U, □B, □D ⊢ D obtain □X, X, □U, U, □X, □D ⊢ D.
Then apply lc and GLR to obtain □X, □U ⊢ □D.
GL, Grz and Go
  L:   □(□p ⊃ p) ⊃ □p              (Löb's axiom)
  Grz: □(□(p ⊃ □p) ⊃ p) ⊃ p
  Go:  □(□(p ⊃ □p) ⊃ p) ⊃ □p
  GL = K + L        Grz = K + Grz        Go = K + Go
A sequent calculus for Grz is obtained by adding the rules below left and centre. For Go add the rule below right.
  left:   from B, X ⇒ Y infer □B, X ⇒ Y
  centre: from □X, □(B ⊃ □B) ⇒ B infer □X ⇒ □B
  right:  from □X, X, □(B ⊃ □B) ⇒ B infer □X ⇒ □B
- sGrz has cut-elimination (Borga and Gentilini, 1986). The reflexivity rule (above left) simplifies the argument.
- Cut-elimination for sGo (Goré and R., 2013) extends Valentini's argument for sGL.