Contextual Natural Deduction

Bruno Woltzenlogel Paleo (1,2)

(1) INRIA, Nancy, France
(2) Theory and Logic Group, Vienna University of Technology, Vienna, Austria
[email protected]

Abstract. This paper defines the contextual natural deduction calculus NDc for the implicational fragment of intuitionistic logic. NDc extends the usual natural deduction calculus (here called ND) by allowing the implication introduction and elimination rules to operate on formulas that occur inside contexts. In analogy to the Curry-Howard isomorphism between ND and the simply-typed λ-calculus, an extension of the λ-calculus, here called λc-calculus, is defined in order to provide compact proof-terms for NDc-proofs. Soundness and completeness of NDc with respect to ND are proven by defining translations of proofs between these calculi. Furthermore, some NDc-proofs are shown to be quadratically smaller than the smallest ND-proofs of the same theorems.

1 Introduction

Natural deduction was introduced by Gentzen in [10]. One of its distinguishing features is that the meaning of a logical connective is determined by elimination and introduction rules, and not by axioms. As a result, formal natural deduction proofs are considered to be similar in structure to their informal counterparts and hence more natural. This subjective claim is corroborated by the observation that widely used proof assistants (footnote 3) follow a natural deduction style.

Footnote 3: e.g. Isabelle (www.cl.cam.ac.uk/research/hvg/Isabelle/) and Coq (http://coq.inria.fr).

However, as exemplified in Section 2.1, the inference rules of natural deduction style calculi can be inconvenient, lengthy and ultimately unnatural for formalizing reasoning steps that modify a deeply located subformula of a formula, such as skolemization, double negation elimination, quantifier shifting and prenexification. Because these deep reasoning steps are commonly used by automated deduction tools during preprocessing of the theorem to be proved, the resulting proofs may contain deep inferences [8]. Therefore, automatically replaying (i.e. reproving) these proofs in proof assistants (e.g. when an automated deduction tool is integrated within a proof assistant [2]) can be inefficient in terms of proving time and size of the generated shallow proof.

These issues serve as long-term practical and technological motivations for the contextual natural deduction calculus presented here, which is a simple extension of the usual natural deduction calculus allowing introduction and elimination rules to operate on formulas occurring inside contexts. The goals of the current paper, however, are purely theoretical. In particular, it is shown that the proposed calculus is sound and complete, that proofs can be normalized, and that some proofs can be quadratically smaller than corresponding proofs in the usual natural deduction calculus. For the sake of succinctness, only minimal logic (intuitionistic logic with only the implication connective) is considered, since it is straightforward to extend the techniques described here to natural deduction calculi containing inference rules for other connectives as well.

Related work: The related idea of deep inference has been intensively investigated in the last decade, especially for classical logic (e.g. [3, 6, 5, 12]). More recently, the techniques and calculi originally developed for classical logic have been adapted for intuitionistic logic. In Tiu's calculus [16], inferences are not only deep but also local (i.e. only a fixed amount of information needs to be checked when applying a rule). Brünnler and McKinley [4] proposed another calculus with an algorithmic interpretation inspired by the Curry-Howard isomorphism. And in Guenot's calculus [11], it is possible to encode lambda terms with explicit substitutions as formulas in such a way that computation corresponds to proof search. Although these calculi differ significantly from each other, they adhere to the original deep inference methodology of avoiding branching inference rules. The calculus presented here, on the other hand, does utilize branching, since the aim is a minimal modification of the usual branching natural deduction calculus.

Acknowledgements: I would like to especially thank: the deep inference community, for inspiring me to develop the calculus presented here; Bernhard Gramlich, for his help in finding useful term rewriting lemmas; Pascal Fontaine and David Deharbe, for discussions on proof production and proof checking for skolemization in SMT-solving, which served as a motivation for this work.

2 Deep Natural Deduction

Figure 1 presents the inference rules of a standard natural deduction calculus ND for the implicational fragment of intuitionistic logic. An ND-derivation is a tree of inferences (instances of the inference rules), and a derivation ψ is an ND-proof of a theorem T if and only if its leaves are axiom inferences and it ends in ⊢ t : T, for some term t of type T of the simply typed λ-calculus. This term t is called the proof-term of ψ and is denoted I(ψ). As indicated in the figure, the implication introduction rule corresponds to abstraction, while the implication elimination rule corresponds to application in the λ-calculus. I is thus an isomorphism (Curry-Howard [7]) between ND-proofs and simply typed λ-terms, and between (implicational intuitionistic) theorems and types.

\[
\dfrac{}{\Gamma, a : A \vdash a : A}
\qquad
\dfrac{\Gamma, a : A \vdash b : B}{\Gamma \vdash \lambda a^{A}.b : A \to B}\ {\to}_I
\qquad
\dfrac{\Gamma \vdash f : A \to B \qquad \Gamma \vdash a : A}{\Gamma \vdash (f\ a) : B}\ {\to}_E
\]

Fig. 1: The natural deduction calculus ND

Figure 2 shows the inference rules of NDc, the contextual natural deduction calculus, which aims at extending ND in a simple, minimal, straightforward and yet general way by allowing the inference rules to operate on subformulas located deeply inside the premises. The notation $C_{\pi}[F]$ indicates a formula that has the subformula F in position π. $C_{\pi}[\ ]$ is called the context of F in the formula $C_{\pi}[F]$. Concretely in this paper, a position π is encoded as a binary string indicating the path from the root of $C_{\pi}[F]$ to F in the tree structure of $C_{\pi}[F]$; thus, a subformula at position π of a formula P, denoted $\mathrm{At}_{\pi}(P)$, can be retrieved by traversing the formula according to the following inductive definition, where ε denotes the empty position:

\[
\mathrm{At}_{\varepsilon}(A) = A
\qquad
\mathrm{At}_{0\pi}(A \to B) = \mathrm{At}_{\pi}(B)
\qquad
\mathrm{At}_{1\pi}(A \to B) = \mathrm{At}_{\pi}(A)
\]

A position is said to be positive (negative) if and only if it contains an even (odd) number of digits 1. In other words, in the tree structure of a formula, a node and its left (right) child always occupy positions with opposite (same) polarities, and the root position is positive. Moreover, a position is strongly positive if and only if it does not contain any digit 1.

\[
\dfrac{}{\Gamma, a : A \vdash a : A}
\qquad
\dfrac{\Gamma, a : A \vdash b : C_{\pi}[B]}{\Gamma \vdash \lambda_{\pi} a^{A}.b : C_{\pi}[A \to B]}\ {\to}_I(\pi)
\]

\[
\dfrac{\Gamma \vdash f : C^{1}_{\pi_1}[A \to B] \qquad \Gamma \vdash a : C^{2}_{\pi_2}[A]}{\Gamma \vdash (f\ a)^{\rightharpoonup}_{(\pi_1;\pi_2)} : C^{1}_{\pi_1}[C^{2}_{\pi_2}[B]]}\ {\to}^{\rightharpoonup}_E(\pi_1;\pi_2)
\]

\[
\dfrac{\Gamma \vdash f : C^{1}_{\pi_1}[A \to B] \qquad \Gamma \vdash a : C^{2}_{\pi_2}[A]}{\Gamma \vdash (f\ a)^{\leftharpoonup}_{(\pi_1;\pi_2)} : C^{2}_{\pi_2}[C^{1}_{\pi_1}[B]]}\ {\to}^{\leftharpoonup}_E(\pi_1;\pi_2)
\]

Note: π, π1 and π2 must be positive positions.

Intuitionistic Contextual Soundness Condition: a is allowed to occur in b only if π is strongly positive.

Fig. 2: The contextual natural deduction calculus NDc

The contextual natural deduction calculus NDc has two implication elimination rules, because the implication elimination rule of ND is extendable in two different ways, depending on the order in which the contexts are combined.
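The position conventions of this section can be made concrete with a small sketch. The encoding below is an assumption made for illustration only (formulas as nested tuples, positions as Python strings of the digits 0 and 1, and the helper names `imp`, `at`, `replace_at`); it is not taken from the paper or from any implementation of NDc.

```python
from typing import Tuple, Union

# Assumed encoding: a formula is an atom (a string) or an implication
# ("->", antecedent, consequent). Positions are strings over {"0", "1"}.
Formula = Union[str, Tuple[str, "Formula", "Formula"]]


def imp(antecedent: Formula, consequent: Formula) -> Formula:
    """Build the implication antecedent -> consequent."""
    return ("->", antecedent, consequent)


def at(position: str, formula: Formula) -> Formula:
    """Subformula at `position`: digit 0 goes to the consequent, 1 to the antecedent."""
    for digit in position:
        if digit not in "01":
            raise ValueError(f"invalid digit {digit!r} in position")
        if isinstance(formula, str):
            raise ValueError("position descends below an atom")
        _, antecedent, consequent = formula
        formula = consequent if digit == "0" else antecedent
    return formula


def replace_at(position: str, formula: Formula, new: Formula) -> Formula:
    """Plug `new` into the context obtained by removing the subformula at `position`."""
    if position == "":
        return new
    if isinstance(formula, str):
        raise ValueError("position descends below an atom")
    _, antecedent, consequent = formula
    if position[0] == "0":
        return imp(antecedent, replace_at(position[1:], consequent, new))
    return imp(replace_at(position[1:], antecedent, new), consequent)


def is_positive(position: str) -> bool:
    """Positive positions contain an even number of digits 1."""
    return position.count("1") % 2 == 0


def is_strongly_positive(position: str) -> bool:
    """Strongly positive positions contain no digit 1 at all."""
    return "1" not in position


# (¬¬A -> B) -> C, with ¬¬A treated as an opaque atom for this illustration.
example = imp(imp("¬¬A", "B"), "C")
print(at("11", example))
print(replace_at("11", example, "A"))
print(is_positive("11"), is_strongly_positive("11"))
```

In this encoding, position 11 of ((A → B) → A) → A holds A → B, which is positive but not strongly positive; this is the situation the soundness condition of Figure 2 is concerned with.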

The proof-term of an NDc-proof ψ is denoted Ic(ψ). As shown in Figure 2, proof terms for NDc-proofs, here called λc-terms (contextual λ-terms), must be modified accordingly: the positions in which applications and abstractions are performed are indicated as subscripts, and the superscript arrows on applications indicate the order in which the contexts are combined. If at least one of the contexts is empty, the order does not matter, and hence the superscript arrow can be omitted. Likewise, if all positions are empty (ε), the subscript can be omitted. By these omission conventions, contextual applications and abstractions with empty positions can be written like ordinary ones.

No special parenthesis or precedence convention is used for λ- and λc-terms in this paper. As indicated in the construction rules in Figures 1 and 2, applications are always surrounded by parentheses, and parentheses are not used for abstractions. It follows that a term of the form (λx.t u) should be understood as the function λx.t applied to the argument u.

2.1 Comparing ND and NDc

Many preprocessing techniques change deeply occurring subformulas. In classical first-order resolution theorem proving, for example, the formula ¬¬∃x.P(x) ∨ ∀y.Q(y) would first be transformed into the following formula in clause form, P(c) ∨ Q(v), where c is a new Skolem constant and v is a free variable. All three steps used in this transformation (double negation elimination, skolemization, and the dropping of universal quantifiers) perform deep modifications in the formula. As the following examples show, formalizing these deep steps in ND is lengthy, inefficient and unnatural. The reason is that many introduction and elimination rules must be performed in order to decompose a formula until the subformula that has to be modified is reached. In contrast, in NDc a deep modification can be done with a single inference, independently of the depth at which the target subformula occurs.

Example 1. Skolemization can be simulated in ND and NDc by assuming the axiom schema

\[
\mathit{sk} : \exists x.F[x] \to F[f_{sk}(x_1, \ldots, x_n)]
\]

where $x_1, \ldots, x_n$ are the free variables of F and $f_{sk}$ must be a globally new (footnote 4) (Skolem) function symbol.

Footnote 4: The only leaf of the proof in which $f_{sk}$ should occur is the leaf where $f_{sk}$ is introduced by a skolem axiom instance.

Given the formula (A → B) → ∃x.P(x), Skolemization replaces the subformula ∃x.P(x) at position 0 in the given formula by P(c), for some Skolem constant c. In NDc, this can be formalized with a single implication elimination inference deriving (A → B) → P(c) from (A → B) → ∃x.P(x) and an instance of the Skolemization axiom schema:

\[
\dfrac{\vdash \mathit{sk} : \exists x.P(x) \to P(c) \qquad a : \ldots \vdash a : (A \to B) \to \exists x.P(x)}{a : (A \to B) \to \exists x.P(x) \vdash (\mathit{sk}\ a)^{\leftharpoonup}_{(\varepsilon;0)} : (A \to B) \to P(c)}\ {\to}^{\leftharpoonup}_E(\varepsilon;0)
\]

In ND, on the other hand, the proof is larger and arguably less natural:

\[
\dfrac{
  \dfrac{
    \vdash \mathit{sk} : \exists x.P(x) \to P(c)
    \qquad
    \dfrac{a : \ldots \vdash a : (A \to B) \to \exists x.P(x) \qquad c : \ldots \vdash c : A \to B}{a : (A \to B) \to \exists x.P(x),\ c : A \to B \vdash (a\ c) : \exists x.P(x)}\ {\to}_E
  }{c : A \to B,\ a : (A \to B) \to \exists x.P(x) \vdash (\mathit{sk}\ (a\ c)) : P(c)}\ {\to}_E
}{a : (A \to B) \to \exists x.P(x) \vdash \lambda c^{A \to B}.(\mathit{sk}\ (a\ c)) : (A \to B) \to P(c)}\ {\to}_I
\]

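Example 2. Double negation elimination can be simulated in ND and NDc by assuming the axiom schema

\[
\mathit{dne} : \neg\neg F \to F
\]

Given the formula (¬¬A → B) → C, double negation elimination replaces the subformula ¬¬A at position 11 in the given formula by A. In NDc, this can be formalized with a single implication elimination inference deriving (A → B) → C from the given formula and an instance of the double negation elimination axiom schema:

\[
\dfrac{\vdash \mathit{dne} : \neg\neg A \to A \qquad a : \ldots \vdash a : (\neg\neg A \to B) \to C}{a : (\neg\neg A \to B) \to C \vdash (\mathit{dne}\ a)^{\leftharpoonup}_{(\varepsilon;11)} : (A \to B) \to C}\ {\to}^{\leftharpoonup}_E(\varepsilon;11)
\]

In ND, on the other hand, the proof is again larger and less natural:

\[
\dfrac{
  \dfrac{
    a : \ldots \vdash a : (\neg\neg A \to B) \to C
    \qquad
    \dfrac{
      \dfrac{
        c : \ldots \vdash c : A \to B
        \qquad
        \dfrac{\vdash \mathit{dne} : \neg\neg A \to A \qquad d : \ldots \vdash d : \neg\neg A}{d : \ldots \vdash (\mathit{dne}\ d) : A}\ {\to}_E
      }{d : \ldots,\ c : \ldots \vdash (c\ (\mathit{dne}\ d)) : B}\ {\to}_E
    }{c : \ldots \vdash \lambda d^{\neg\neg A}.(c\ (\mathit{dne}\ d)) : \neg\neg A \to B}\ {\to}_I
  }{a : \ldots,\ c : \ldots \vdash (a\ \lambda d.(c\ (\mathit{dne}\ d))) : C}\ {\to}_E
}{a : (\neg\neg A \to B) \to C \vdash \lambda c^{A \to B}.(a\ \lambda d.(c\ (\mathit{dne}\ d))) : (A \to B) \to C}\ {\to}_I
\]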

3 Soundness and completeness

In order to prove that NDc is sound and complete with respect to ND, it must be shown that a formula is provable in NDc iff it is provable in ND. This can be done by defining a translation of proofs from one calculus into the other. In order to do so more compactly, the next two subsections show how to translate not the proofs themselves but rather their proof terms.

3.1 Translating λ-terms into λc-terms

The translation of λ-terms into λc-terms of the same type is trivial, since λ-terms can be regarded as λc-terms with empty positions.

Definition 1. The translation function ζ, from λ-terms to λc-terms of the same type, is defined inductively on the structure of λ-terms:
– $\zeta[v] = v$ (for a variable v).
– $\zeta[\lambda v^{T}.t] = \lambda_{\varepsilon} v^{T}.\zeta[t]$
– $\zeta[(m\ n)] = (\zeta[m]\ \zeta[n])_{(\varepsilon;\varepsilon)}$
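The three clauses of Definition 1 can be sketched as a recursive function. The tuple encoding of terms used here (`"var"`, `"lam"`, `"app"` for λ-terms, `"clam"` and `"capp"` for λc-terms) and the choice of storing an arrow direction even when both positions are empty are assumptions made for this illustration only.

```python
# Assumed encoding of lambda-terms:
#   ("var", name) | ("lam", var, type, body) | ("app", fun, arg)
# Assumed encoding of contextual lambda-terms:
#   ("var", name)
#   ("clam", position, var, type, body)
#   ("capp", arrow, position1, position2, fun, arg)   # arrow: "right" or "left"
# Positions are strings over {"0", "1"}; the empty string plays the role of ε.

def zeta(term):
    """Translate a lambda-term into a contextual term with empty positions only."""
    kind = term[0]
    if kind == "var":
        return term
    if kind == "lam":
        _, var, ty, body = term
        return ("clam", "", var, ty, zeta(body))
    if kind == "app":
        _, fun, arg = term
        # With two empty positions the arrow is irrelevant; "right" is an arbitrary pick.
        return ("capp", "right", "", "", zeta(fun), zeta(arg))
    raise ValueError(f"unknown term constructor {kind!r}")


# λf^(A->B).λa^A.(f a), with types written as plain strings for brevity.
apply_term = ("lam", "f", "A->B", ("lam", "a", "A", ("app", ("var", "f"), ("var", "a"))))
print(zeta(apply_term))
```

Since every position produced is empty, the result can be read back as an ordinary λ-term by the omission conventions of Section 2.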

3.2 Translating λc-terms into λ-terms

The translation of λc-terms into λ-terms of the same type can be done by means of the function ξ defined below.

Definition 2. The translation function ξ, which maps a λc-term t into a shallow λ-term of the same type, is defined inductively on the structure of t:

– If t is a variable, then ξ[t] = t.

– If t is an abstraction of the form $\lambda_{\pi} a^{A}.b$, the translation is defined by induction on the position π, according to the cases below:

  • If π = 0π′, then t matches $\lambda_{0\pi'} a^{A}.b^{C \to D}$, and
  \[
  \xi[t] = \lambda c^{C}.\xi[\lambda_{\pi'} a^{A}.(b\ c)]
  \]

  • If π = 1π′, then there is at least one occurrence of the digit 1 in π′, since π is positive and π′ is negative. Therefore, π is necessarily of the form 10…01π″ and t matches $\lambda_{10\ldots01\pi''} a^{A}.b^{(C_1 \to \ldots \to C_n \to (T_{\pi''}[B] \to D_1)) \to D_2}$. Then
  \[
  \xi[t] = \lambda k^{C_1 \to \ldots \to C_n \to (T_{\pi''}[A \to B] \to D_1)}.(b\ \lambda c_1^{C_1} \ldots c_n^{C_n}.\lambda h^{T_{\pi''}[B]}.(k\ c_1 \ldots c_n\ \xi[\lambda_{\pi''} a^{A}.h]))
  \]
  Note that, if a occurred in b, then this occurrence would become unbound after the translation. This undesirable effect, which would jeopardize soundness as shown in Example 3, is prevented by the intuitionistic contextual soundness condition described in Figure 2.

  • If π = ε, then t matches $\lambda_{\varepsilon} a.b$, and ξ[t] = λa.ξ[b].

– If t is a (⇀)-application of the form $(f\ a)^{\rightharpoonup}_{(\pi_1;\pi_2)}$, the translation is defined by two successive inductions, firstly on the position π1 and then (when π1 = ε) on π2, according to the cases below:

  • If π1 = 0π, then t matches $(f^{C \to D}\ a)^{\rightharpoonup}_{(0\pi;\pi_2)}$, and
  \[
  \xi[t] = \lambda c^{C}.\xi[((f\ c)\ a)^{\rightharpoonup}_{(\pi;\pi_2)}]
  \]

  • If π1 = 1π′, then there is at least one occurrence of the digit 1 in π′, since π1 is positive and π′ is negative. Therefore, π1 is necessarily of the form 10…01π and t matches $(f^{(C_1 \to \ldots \to C_n \to (T_{\pi}[A \to B] \to D_1)) \to D_2}\ a)^{\rightharpoonup}_{(10\ldots01\pi;\pi_2)}$. Then
  \[
  \xi[t] = \lambda k^{C_1 \to \ldots \to C_n \to (T_{\pi}[B] \to D_1)}.(f\ \lambda c_1^{C_1} \ldots c_n^{C_n}.\lambda h^{T_{\pi}[A \to B]}.(k\ c_1 \ldots c_n\ \xi[(h\ a)^{\rightharpoonup}_{(\pi;\pi_2)}]))
  \]

  • If π1 = ε and π2 = 0π, then t matches $(f\ a^{C \to D})^{\rightharpoonup}_{(\varepsilon;0\pi)}$, and
  \[
  \xi[t] = \lambda c^{C}.\xi[(f\ (a\ c))^{\rightharpoonup}_{(\varepsilon;\pi)}]
  \]

  • If π1 = ε and π2 = 1π′, then there is at least one occurrence of the digit 1 in π′, since π2 is positive and π′ is negative. Therefore, π2 is of the form 10…01π and t matches $(f^{A \to B}\ a^{(C_1 \to \ldots \to C_n \to (T_{\pi}[A] \to D_1)) \to D_2})^{\rightharpoonup}_{(\varepsilon;10\ldots01\pi)}$. Then
  \[
  \xi[t] = \lambda k^{C_1 \to \ldots \to C_n \to (T_{\pi}[B] \to D_1)}.(a\ \lambda c_1^{C_1} \ldots c_n^{C_n}.\lambda h^{T_{\pi}[A]}.(k\ c_1 \ldots c_n\ \xi[(f\ h)^{\rightharpoonup}_{(\varepsilon;\pi)}]))
  \]

  • If π1 = π2 = ε, then t matches $(f\ a)^{\rightharpoonup}_{(\varepsilon;\varepsilon)}$, and ξ[t] = (ξ[f] ξ[a]).

– If t is a (↼)-application of the form $(f\ a)^{\leftharpoonup}_{(\pi_1;\pi_2)}$, the translation is analogous to the previous case for $(f\ a)^{\rightharpoonup}_{(\pi_1;\pi_2)}$, but the induction is made firstly on the position π2 and only then (when π2 = ε) on π1.

3.3 Soundness and completeness of NDc relative to ND

With the translation functions defined in the previous subsections, soundness and completeness can be proved briefly, as shown in Theorems 1 and 2 below.

Theorem 1 (Completeness). If T is provable in ND, then T is provable in NDc.

Proof. Let ψ be an ND-proof of T. Then $I_c^{-1}(\zeta[I(\psi)])$ is an NDc-proof of the same theorem T.

Theorem 2 (Soundness). If T is provable in NDc, then T is provable in ND.

Proof. Let ψ be an NDc-proof of T. Then $I^{-1}(\xi[I_c(\psi)])$ is an ND-proof of the same theorem T.

Example 3. The importance of the intuitionistic contextual soundness condition can be illustrated with the following incorrect NDc-proof. Its lowermost inference violates the condition in order to derive Peirce's Law, which should not be intuitionistically derivable.

\[
\dfrac{
  \dfrac{g : (B \to A),\ a : A \vdash a : A}{a : A \vdash \lambda g^{(B \to A)}.a : (B \to A) \to A}\ {\to}_I(\varepsilon)
}{\vdash \lambda_{11} a^{A}.\lambda g^{(B \to A)}.a : ((A \to B) \to A) \to A}\ {\to}_I(11)
\]

Translating the proof-term using ξ results in the following λ-term, where a has become unbound:

\[
\lambda k^{((A \to B) \to A)}.((\lambda g^{(B \to A)}.a)\ (\lambda h^{B}.(k\ \lambda a^{A}.h)))
\]

As a is unbound, there is no closed ND-proof corresponding to this λ-term. The intuitionistic contextual soundness condition guarantees that there will be no variable a becoming unbound in this manner.
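The claim that a becomes unbound in the translated term of Example 3 can be checked mechanically by computing free variables. The sketch below assumes a tuple encoding of λ-terms with types written as plain strings; both the encoding and the function name `free_vars` are illustrative choices, not part of the paper.

```python
# Assumed encoding of lambda-terms:
#   ("var", name) | ("lam", var, type, body) | ("app", fun, arg)

def free_vars(term):
    """Return the set of variable names occurring free in a lambda-term."""
    kind = term[0]
    if kind == "var":
        return {term[1]}
    if kind == "lam":
        _, var, _ty, body = term
        return free_vars(body) - {var}
    if kind == "app":
        _, fun, arg = term
        return free_vars(fun) | free_vars(arg)
    raise ValueError(f"unknown term constructor {kind!r}")


# The translated term of Example 3:
#   λk^((A->B)->A).((λg^(B->A).a) (λh^B.(k λa^A.h)))
translated = (
    "lam", "k", "(A->B)->A",
    ("app",
     ("lam", "g", "B->A", ("var", "a")),
     ("lam", "h", "B", ("app", ("var", "k"), ("lam", "a", "A", ("var", "h"))))),
)

print(free_vars(translated))
```

The only binder for a in the translated term sits around h, so the occurrence of a under λg is left free and the term is not closed.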

4 Normalization

In λ-calculus, a term t of the form (λa^A.f t′) is a redex with respect to β-reduction, and is β-reducible to f[a\t′]. Correspondingly, the proof $I^{-1}(t)$ is said to contain a detour (cut). The normalization rules for eliminating detours correspond to β-reduction, and the ND-proof $I^{-1}(t)$ can be rewritten to $I^{-1}(f[a \backslash t'])$. The purpose of this section is to extend this notion of normalization to NDc and to the corresponding λc-calculus. As the relation between NDc-proofs and λc-terms is given by the bijection Ic, it suffices to consider the λc-calculus.

Unfortunately, the notion of β-redex is more complicated in λc-calculus. Consider, for example, the following λc-term:

\[
(\lambda_{0} a^{A}.f\ \ t')_{(0;\varepsilon)}
\]

Its form is very similar to the redex form described above, except for the fact that abstraction and application are now contextual. Therefore, it is intuitively clear (and it could be shown by using the translation function ξ) that this term could be reduced to f[a\t′], and hence ought to be considered a redex. Nevertheless, it would not be sufficient to extend β-reduction to λc-terms of this form. The λc-term below, for example, is not of this form, but it is also clear (again by using ξ) that it could be reduced to λb^B.(f[a\t′] b):

\[
(\lambda b^{B}.\lambda a^{A}.(f\ b)\ \ t')_{(0;\varepsilon)}
\]

As the term above exemplifies, the issue is that the abstraction (in this case, λa^A) that ought to be eliminated together with the outermost contextual application can occur arbitrarily deep inside the left subterm of the application (in this case, λb^B.λa^A.(f b)). Generalizing β-reduction so that it could directly handle all such potential redexes formed by pairs of contextual applications and deeply occurring abstractions would not be easy. Therefore, instead of generalizing the β-reduction rule, an alternative approach is considered in the next subsection.

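4.1 Unfolding

For redexes made of contextual abstractions and applications with empty positions (which, by the position omission convention, can be written like ordinary abstractions and applications), β-reduction can be applied normally. The problem lies only in the presence of contextual applications and abstractions with non-empty positions, which may block β-reduction. This problem can be solved by translating the λc-term only partially, in order to remove the blocking contextual applications and abstractions. To this aim, a term rewriting system based on the translation function ξ is defined in Figure 3. The rewriting of a term t to a term t′ using one of the rules of this term rewriting system is denoted t ⇝δ t′, and it is said that the term t unfolds to t′.

\[
\lambda_{0\pi} a^{A}.b^{C \to D}
\ \rightsquigarrow_{\delta}\
\lambda c^{C}.\lambda_{\pi} a^{A}.(b\ c)
\]
\[
\lambda_{10\ldots01\pi} a^{A}.f^{(C_1 \to \ldots \to C_n \to (T_{\pi}[B] \to D_1)) \to D_2}
\ \rightsquigarrow_{\delta}\
\lambda k^{C_1 \to \ldots \to C_n \to (T_{\pi}[A \to B] \to D_1)}.(f\ \lambda c_1^{C_1} \ldots c_n^{C_n}.\lambda h^{T_{\pi}[B]}.(k\ c_1 \ldots c_n\ \lambda_{\pi} a^{A}.h))
\]
\[
(f^{C \to D}\ a)^{\rightharpoonup}_{(0\pi;\pi_2)}
\ \rightsquigarrow_{\delta}\
\lambda c^{C}.((f\ c)\ a)^{\rightharpoonup}_{(\pi;\pi_2)}
\]
\[
(f\ a^{C \to D})^{\leftharpoonup}_{(\pi_1;0\pi)}
\ \rightsquigarrow_{\delta}\
\lambda c^{C}.(f\ (a\ c))^{\leftharpoonup}_{(\pi_1;\pi)}
\]
\[
(f^{(C_1 \to \ldots \to C_n \to (T_{\pi}[A \to B] \to D_1)) \to D_2}\ a)^{\rightharpoonup}_{(10\ldots01\pi;\pi_2)}
\ \rightsquigarrow_{\delta}\
\lambda k^{C_1 \to \ldots \to C_n \to (T_{\pi}[B] \to D_1)}.(f\ \lambda c_1^{C_1} \ldots c_n^{C_n}.\lambda h^{T_{\pi}[A \to B]}.(k\ c_1 \ldots c_n\ (h\ a)^{\rightharpoonup}_{(\pi;\pi_2)}))
\]
\[
(f\ a^{(C_1 \to \ldots \to C_n \to (T_{\pi}[A] \to D_1)) \to D_2})^{\leftharpoonup}_{(\pi_1;10\ldots01\pi)}
\ \rightsquigarrow_{\delta}\
\lambda k^{C_1 \to \ldots \to C_n \to (T_{\pi}[B] \to D_1)}.(a\ \lambda c_1^{C_1} \ldots c_n^{C_n}.\lambda h^{T_{\pi}[A]}.(k\ c_1 \ldots c_n\ (f\ h)^{\leftharpoonup}_{(\pi_1;\pi)}))
\]
\[
(f^{A \to B}\ a^{(C_1 \to \ldots \to C_n \to (T_{\pi}[A] \to D_1)) \to D_2})^{\rightharpoonup}_{(\varepsilon;10\ldots01\pi)}
\ \rightsquigarrow_{\delta}\
\lambda k^{C_1 \to \ldots \to C_n \to (T_{\pi}[B] \to D_1)}.(a\ \lambda c_1^{C_1} \ldots c_n^{C_n}.\lambda h^{T_{\pi}[A]}.(k\ c_1 \ldots c_n\ (f\ h)^{\rightharpoonup}_{(\varepsilon;\pi)}))
\]
\[
(f^{(C_1 \to \ldots \to C_n \to (T_{\pi}[A \to B] \to D_1)) \to D_2}\ a)^{\leftharpoonup}_{(10\ldots01\pi;\varepsilon)}
\ \rightsquigarrow_{\delta}\
\lambda k^{C_1 \to \ldots \to C_n \to (T_{\pi}[B] \to D_1)}.(f\ \lambda c_1^{C_1} \ldots c_n^{C_n}.\lambda h^{T_{\pi}[A \to B]}.(k\ c_1 \ldots c_n\ (h\ a)^{\leftharpoonup}_{(\pi;\varepsilon)}))
\]
\[
(f\ a^{C \to D})^{\rightharpoonup}_{(\varepsilon;0\pi)}
\ \rightsquigarrow_{\delta}\
\lambda c^{C}.(f\ (a\ c))^{\rightharpoonup}_{(\varepsilon;\pi)}
\]
\[
(f^{C \to D}\ a)^{\leftharpoonup}_{(0\pi;\varepsilon)}
\ \rightsquigarrow_{\delta}\
\lambda c^{C}.((f\ c)\ a)^{\leftharpoonup}_{(\pi;\varepsilon)}
\]

Fig. 3: Term Rewriting Rules for Unfolding

Example 4. Consider the term $(\lambda_{0} a^{A}.(\lambda_{0} b^{B}.h\ \ a)\ \ t')_{(0;\varepsilon)}$. Even though it is clear that this term is a deep redex, β-reduction alone is not capable of producing the desired result $(\lambda_{0} b^{B}.h\ \ t')$. As shown in the rewriting sequence below, the unfolding of the outermost contextual application and of the outermost contextual abstraction enables the use of β-reduction.

\[
\begin{array}{rl}
 & (\lambda_{0} a^{A}.(\lambda_{0} b^{B}.h\ \ a)\ \ t')_{(0;\varepsilon)} \\
\rightsquigarrow_{\delta} & (\lambda b_1^{B}.\lambda a^{A}.((\lambda_{0} b^{B}.h\ \ a)\ b_1)\ \ t')_{(0;\varepsilon)} \\
\rightsquigarrow_{\delta} & \lambda b_2^{B}.((\lambda b_1^{B}.\lambda a^{A}.((\lambda_{0} b^{B}.h\ \ a)\ b_1)\ \ b_2)\ t') \\
\rightsquigarrow_{\beta} & \lambda b_2^{B}.(\lambda a^{A}.((\lambda_{0} b^{B}.h\ \ a)\ b_2)\ \ t') \\
\rightsquigarrow_{\beta} & \lambda b_2^{B}.((\lambda_{0} b^{B}.h\ \ t')\ b_2) \\
=_{\eta} & (\lambda_{0} b^{B}.h\ \ t')
\end{array}
\]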

The following theorems show that unfolding is terminating and confluent.

Definition 3. The size s(π) of a position π is the number of digits in π.

Theorem 3. ⇝δ is strongly terminating.

Proof. Let µ be a function that counts the total size of all positions in a λc-term, inductively defined as follows:
– $\mu(v) = 0$ (for a variable v)
– $\mu(\lambda_{\pi} a.b) = s(\pi) + \mu(b)$
– $\mu((f\ a)^{\rightharpoonup}_{(\pi_1;\pi_2)}) = s(\pi_1) + s(\pi_2) + \mu(f) + \mu(a)$
– $\mu((f\ a)^{\leftharpoonup}_{(\pi_1;\pi_2)}) = s(\pi_1) + s(\pi_2) + \mu(f) + \mu(a)$

Note that, for all rules in Figure 3, if t ⇝δ t′, then µ(t′) < µ(t). Since there can be no infinite decreasing chain of natural numbers, there cannot be an infinite ⇝δ reduction sequence either. Therefore ⇝δ is strongly terminating.
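The measure µ used in the proof of Theorem 3 can be computed directly. The following sketch evaluates µ on the first three terms of the rewriting sequence of Example 4 and observes that it decreases with each unfolding step. The tuple encoding, the constructor helpers `var`, `lam`, `app`, and the omission of type annotations are assumptions made for this illustration.

```python
# Assumed encoding of contextual terms (types omitted, positions as strings):
#   ("var", name) | ("lam", position, var, body) | ("app", pos1, pos2, fun, arg)
# The arrow direction of applications does not affect µ and is left out.

def var(name):
    return ("var", name)

def lam(v, body, pos=""):
    return ("lam", pos, v, body)

def app(fun, arg, pos1="", pos2=""):
    return ("app", pos1, pos2, fun, arg)

def mu(term):
    """Total number of position digits occurring in a contextual term."""
    kind = term[0]
    if kind == "var":
        return 0
    if kind == "lam":
        _, pos, _v, body = term
        return len(pos) + mu(body)
    if kind == "app":
        _, pos1, pos2, fun, arg = term
        return len(pos1) + len(pos2) + mu(fun) + mu(arg)
    raise ValueError(f"unknown term constructor {kind!r}")

# Shared subterm (λ_0 b.h a) of Example 4.
inner = app(lam("b", var("h"), pos="0"), var("a"))

# (λ_0 a.(λ_0 b.h a) t')_(0;ε)
step0 = app(lam("a", inner, pos="0"), var("t'"), pos1="0")
# (λb1.λa.((λ_0 b.h a) b1) t')_(0;ε)
step1 = app(lam("b1", lam("a", app(inner, var("b1")))), var("t'"), pos1="0")
# λb2.((λb1.λa.((λ_0 b.h a) b1) b2) t')
step2 = lam("b2", app(app(lam("b1", lam("a", app(inner, var("b1")))), var("b2")), var("t'")))

print([mu(t) for t in (step0, step1, step2)])
```

Each unfolding step consumes exactly one position digit here, which is the behaviour the termination argument relies on.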

Theorem 4. ⇝δ is confluent.

Proof. (Sketch) Note that there are no critical pairs. Therefore, by the critical pair theorem (Theorem 6.2.4 in [1]), ⇝δ is locally confluent. By Newman's Lemma (Lemma 2.7.2 in [1]), a terminating and locally confluent term rewriting system must be confluent.

If the shallow β-reduction rule is added to the term rewriting rules shown in Figure 3, a new higher-order term rewriting system is obtained. The rewriting of a term t to a term t′ according to this term rewriting system is denoted t ⇝βδ t′, and it is said that the term t beta-unfolds to t′. Formally:

\[
\rightsquigarrow_{\beta\delta}\ =\ \rightsquigarrow_{\delta} \cup \rightsquigarrow_{\beta}
\]

The combined term rewriting system also enjoys termination and confluence.

Theorem 5. ⇝βδ is terminating.

Proof. (Sketch) The traditional approach of proving termination of combined terminating term rewriting systems via commutation lemmas [9] does not work here, because the conditions required by those lemmas do not hold for ⇝β and ⇝δ. Their combination is nevertheless terminating. For the sake of contradiction, assume that there could be an infinite ⇝βδ reduction sequence. Since ⇝β and ⇝δ are both terminating, this infinite sequence would need to contain alternations of ⇝β and ⇝δ steps in such a way that ⇝δ generates β-redexes and ⇝β generates δ-redexes. However, by inspection of the unfolding and β-reduction rules, this is not possible. β-reduction never creates new δ-redexes (although it can duplicate existing ones). Unfolding never duplicates existing β-redexes. And although it can create new β-redexes, reducing them does not cause any duplication, because the abstractions introduced by unfolding are all linear (i.e. the abstracted variables k, c, c1, …, cn occur only once in the respective term's body). Hence, in any ⇝βδ reduction sequence, duplications of δ-redexes are finite, limited by the non-linear abstractions (which may participate in duplicating β-reductions) already occurring in the initial term. Therefore, no infinite reduction sequence is possible and thus ⇝βδ is terminating.

Theorem 6. ⇝βδ is confluent.

Proof. The term rewriting system for unfolding is a first-order term rewriting system. β-reduction is a particular kind of higher-order rewriting rule known as a pattern rewriting rule [14]. Therefore, their combination is a pattern rewriting system (PRS). Since there are no critical pairs in the combined system, a generalization of the critical pair theorem for PRSs (Theorem 4.7 in [14]) can be used to conclude that ⇝βδ is locally confluent. Therefore, as ⇝βδ is terminating, Newman's Lemma allows us to conclude that ⇝βδ is confluent.

5 Sizes of Proofs in ND and NDc

As seen in Examples 1 and 2, NDc-proofs can be significantly smaller than ND-proofs. In order to measure how much smaller they can be, the notion of size must first be precisely defined. This is done below for types and terms, essentially by counting the number of symbols they contain. The size s(ψ) of an ND-proof ψ is simply defined as the size s(I(ψ)) of its corresponding proof-term I(ψ). Likewise, the size s(ψ) of an NDc-proof ψ is s(Ic(ψ)).

Definition 4. The size s(T) of a type T is inductively defined as follows:
– $s(A) = 1$ (if A is an atomic type)
– $s(T_1 \to T_2) = 1 + s(T_1) + s(T_2)$

Definition 5. The size s(t) of a λ-term t is inductively defined as follows:
– $s(v) = 1$ (if v is a variable)
– $s(\lambda v^{T}.t') = 1 + s(v) + s(T) + s(t') = 2 + s(T) + s(t')$
– $s((m\ n)) = 1 + s(m) + s(n)$

Definition 6. The size s(t) of a λc-term t is inductively defined as follows:
– $s(v) = 1$ (if v is a variable)
– $s(\lambda_{\pi} v^{T}.t') = 1 + s(\pi) + s(v) + s(T) + s(t') = 2 + s(\pi) + s(T) + s(t')$
– $s((m\ n)^{\rightharpoonup}_{(\pi_1;\pi_2)}) = 1 + s(m) + s(n) + s(\pi_1) + s(\pi_2)$
– $s((m\ n)^{\leftharpoonup}_{(\pi_1;\pi_2)}) = 1 + s(m) + s(n) + s(\pi_1) + s(\pi_2)$
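Definitions 4 to 6 translate directly into two recursive functions. The sketch below applies them to the two proof-terms of Example 1, with ∃x.P(x) and P(c) treated as atomic types. The tuple encoding and the function names are assumptions made for this illustration; λ-terms are represented as λc-terms whose positions are all empty, so that Definition 5 becomes a special case of Definition 6.

```python
# Assumed encodings:
#   types: an atom (string) or ("->", left, right)
#   terms: ("var", name)
#          ("lam", position, var, type, body)
#          ("app", arrow, pos1, pos2, fun, arg)
# Positions are strings over {"0", "1"}; their size is their length.

def type_size(ty):
    """Size of a type according to Definition 4."""
    if isinstance(ty, str):
        return 1
    _, left, right = ty
    return 1 + type_size(left) + type_size(right)

def term_size(term):
    """Size of a contextual term according to Definition 6."""
    kind = term[0]
    if kind == "var":
        return 1
    if kind == "lam":
        _, pos, _var, ty, body = term
        return 2 + len(pos) + type_size(ty) + term_size(body)
    if kind == "app":
        _, _arrow, pos1, pos2, fun, arg = term
        return 1 + term_size(fun) + term_size(arg) + len(pos1) + len(pos2)
    raise ValueError(f"unknown term constructor {kind!r}")

# NDc proof-term of Example 1: (sk a) with the left arrow and positions (ε;0).
ndc_term = ("app", "left", "", "0", ("var", "sk"), ("var", "a"))

# ND proof-term of Example 1: λc^(A->B).(sk (a c)).
nd_term = (
    "lam", "", "c", ("->", "A", "B"),
    ("app", "right", "", "", ("var", "sk"),
     ("app", "right", "", "", ("var", "a"), ("var", "c"))),
)

print(term_size(ndc_term), term_size(nd_term))
```

Under these definitions the contextual proof-term pays one unit per position digit, while the shallow proof-term pays for the extra abstraction, its type annotation and the extra application.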

The next theorem shows that, in the best cases, quadratic compression can in principle be achieved by translating ND-proofs to NDc-proofs.

Theorem 7. There is a sequence of theorems $F_n$ whose smallest ND-proofs $\psi_n$ are such that $s(\psi_n) \in \Omega(n^2)$, while there are NDc-proofs $\psi^c_n$ of $F_n$ such that $s(\psi^c_n) \in O(n)$.

Proof. Let $F_n = T^n(A \to B) \to (A \to T^n(B))$ where:

\[
T^0(F) = F
\qquad
T^n(F) = (T^{n-1}(F) \to D_{2n-1}) \to D_{2n}
\]

and $D_1, \ldots, D_{2n}$ are distinct atomic formulas. Let $\psi^c_n = I_c^{-1}(t^c_n)$ and $\psi_n = I^{-1}(t_n)$ where:

\[
t^c_n = \lambda f^{T^n(A \to B)}.\lambda a^{A}.(f\ a)_{(\underbrace{11 \ldots 1}_{2n};\varepsilon)}
\qquad
t_n = \xi(t^c_n)
\]

Any ND-proof of $F_k$ must decompose $F_k$ until the subformulas A → B and A are obtained and then apply A → B to A. $\psi_k$ does exactly this and nothing more. $\psi_k$ could not be further compressed by introducing detours, because $D_1, \ldots, D_{2k}$ are all distinct and hence $\psi_k$ does not contain repeated subproofs (footnote 5). Therefore, $\psi_k$ is a smallest ND-proof of $F_k$.

Footnote 5: For a term to be compressible by the introduction of detours, it must contain repetitions. An example is ((f t) t), having size 3 + 2s(t) and compressible (by a detour with a non-linear abstraction) to (λx.((f x) x) t), whose size (8 + s(t)) is smaller when s(t) > 5. On the other hand, a detour with a linear abstraction always just adds 4 to a term's size.

By definition, $s(\psi^c_n) = s(t^c_n)$, and as shown below, $s(t^c_n) \in O(n)$.

\[
s(t^c_n) = s(\lambda f^{T^n(A \to B)}.\lambda a^{A}.(f\ a)_{(\underbrace{11 \ldots 1}_{2n};\varepsilon)}) = 11 + 6n
\]

By definition, $s(\psi_n) = s(t_n)$, and $s(t_n)$ is computed below:

\[
\begin{array}{rcl}
s(t_n) & = & s(\xi(\lambda f^{T^n(A \to B)}.\lambda a^{A}.(f\ a)_{(\underbrace{11 \ldots 1}_{2n};\varepsilon)})) \\
 & = & s(\lambda f^{T^n(A \to B)}.\lambda a^{A}.\xi((f\ a)_{(11 \ldots 1;\varepsilon)})) \\
 & = & 8 + 4n + s(\xi((f\ a)_{(\underbrace{11 \ldots 1}_{2n};\varepsilon)}))
\end{array}
\]

Let $q(n) = s(\xi((f\ a)_{(\underbrace{11 \ldots 1}_{2n};\varepsilon)}))$. Then:

\[
\begin{array}{rcl}
q(0) & = & s(\xi((f\ a)_{(\varepsilon;\varepsilon)})) = 3 \\
q(n) & = & s(\xi((f\ a)_{(\underbrace{1111 \ldots 1}_{2n};\varepsilon)})) \\
 & = & s(\lambda k^{T^{n-1}(B) \to D_{2n-1}}.(f\ \lambda h^{T^{n-1}(A \to B)}.\xi((h\ a)_{(\underbrace{11 \ldots 1}_{2n-2};\varepsilon)}))) \\
 & = & 4 + 8n + s(\xi((h\ a)_{(\underbrace{11 \ldots 1}_{2n-2};\varepsilon)})) \\
 & = & 4 + 8n + q(n-1)
\end{array}
\]

Solving the recurrence relation above gives the following closed form for q:

\[
q(n) = 4n^2 + 8n + 3
\]

Therefore, $s(t_n) = 8 + 4n + q(n) = 4n^2 + 12n + 11$ and hence $s(\psi_n) \in \Omega(n^2)$.
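The closed form for q can be double-checked numerically. Unrolling the recurrence gives q(n) = 3 + Σ_{i=1..n} (4 + 8i) = 3 + 4n + 4n(n + 1) = 4n² + 8n + 3. The sketch below simply iterates the recurrence as stated in the proof and compares it with the closed forms; the function names are illustrative and the range of n tested is an arbitrary choice.

```python
def q(n):
    """Iterate q(0) = 3, q(i) = 4 + 8*i + q(i - 1) up to n."""
    value = 3
    for i in range(1, n + 1):
        value += 4 + 8 * i
    return value

def nd_size(n):
    """Size s(t_n) of the ND proof-term, as computed in the proof: 8 + 4n + q(n)."""
    return 8 + 4 * n + q(n)

def ndc_size(n):
    """Size s(t^c_n) of the NDc proof-term, as stated in the proof: 11 + 6n."""
    return 11 + 6 * n

for n in range(0, 200):
    assert q(n) == 4 * n * n + 8 * n + 3
    assert nd_size(n) == 4 * n * n + 12 * n + 11

# Print a few values side by side to make the quadratic gap visible.
for n in (1, 10, 100):
    print(n, ndc_size(n), nd_size(n))
```

The ratio between the two sizes grows linearly in n, which is what the quadratic-versus-linear statement of Theorem 7 amounts to.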

6 Proof compression

The fact that NDc-proofs can be quadratically smaller than the smallest ND-proofs suggests that an ND-proof ψ could be compressed by first translating it to the NDc-proof ζ[ψ] and then folding it (i.e. applying the rewriting rules for unfolding in the opposite direction). The purpose of this section is to point out a few obstacles that make the use of this idea non-trivial in practice.

The first obstacle is the fact that, even though $\rightsquigarrow_{\delta}^{-1}$ is clearly terminating (since the size of terms decreases), it is not confluent. Indeed, assuming f : A → B and a : (A → D) → E, the term $\lambda k^{B \to D}.(a\ \lambda h^{A}.(k\ (f\ h)))$ can be rewritten to both terms below:

\[
(f\ a)_{(\varepsilon;11)}
\qquad
\lambda k^{B \to D}.(a\ (k\ f)_{(\varepsilon;0)})
\]

and this critical pair is clearly not joinable, since both terms are already normal forms with respect to $\rightsquigarrow_{\delta}^{-1}$. As a consequence of the non-confluence, the choice of which rewriting rule to apply matters: bad choices may lead to compressed proof-terms that are not optimally small (e.g. the second term above).

The second, and more problematic, obstacle can be illustrated with the following term t, where $h_{ab}$ : A → B, $h_{bc}$ : B → C and $h_{ade}$ : (A → D) → E:

\[
\lambda h_{cd}^{C \to D}.(h_{ade}\ \lambda h_a^{A}.(h_{cd}\ (h_{bc}\ (h_{ab}\ h_a)))) : (C \to D) \to E
\]

It would be very desirable for a proof compression method to be able to obtain one of the smaller λc-terms (of the same type and also constructed using $h_{ab}$, $h_{bc}$ and $h_{ade}$) below:

\[
(h_{bc}\ (h_{ab}\ h_{ade})_{(\varepsilon;11)})_{(\varepsilon;11)}
\qquad
((h_{bc}\ h_{ab})_{(\varepsilon;0)}\ h_{ade})_{(\varepsilon;11)}
\]

However, these terms cannot be obtained by folding t, simply because t is already in normal form w.r.t. $\rightsquigarrow_{\delta}^{-1}$ (it is easy to verify that no rewriting rule from $\rightsquigarrow_{\delta}^{-1}$ is applicable to t). As shown below, in order to be able to obtain the terms above from t, not only folding but also β-expansion must be used:

\[
\begin{array}{rl}
t = & \lambda h_{cd}^{C \to D}.(h_{ade}\ \lambda h_a^{A}.(h_{cd}\ (h_{bc}\ (h_{ab}\ h_a)))) \\
\rightsquigarrow_{\beta}^{-1} & \lambda h_{cd}^{C \to D}.(h_{ade}\ \lambda h_a^{A}.(\lambda k_b^{B}.(h_{cd}\ (h_{bc}\ k_b))\ \ (h_{ab}\ h_a))) \\
\rightsquigarrow_{\beta}^{-1} & \lambda h_{cd}^{C \to D}.(\lambda k_{bd}^{B \to D}.(h_{ade}\ \lambda h_a^{A}.(k_{bd}\ (h_{ab}\ h_a)))\ \ \lambda k_b^{B}.(h_{cd}\ (h_{bc}\ k_b))) \\
\rightsquigarrow_{\delta}^{-1} & (h_{bc}\ \lambda k_{bd}^{B \to D}.(h_{ade}\ \lambda h_a^{A}.(k_{bd}\ (h_{ab}\ h_a))))_{(\varepsilon;11)} \\
\rightsquigarrow_{\delta}^{-1} & (h_{bc}\ (h_{ab}\ h_{ade})_{(\varepsilon;11)})_{(\varepsilon;11)}
\end{array}
\]

\[
\begin{array}{rl}
t = & \lambda h_{cd}^{C \to D}.(h_{ade}\ \lambda h_a^{A}.(h_{cd}\ (h_{bc}\ (h_{ab}\ h_a)))) \\
\rightsquigarrow_{\beta}^{-1} & \lambda h_{cd}^{C \to D}.(h_{ade}\ \lambda h_a^{A}.(h_{cd}\ (\lambda k_a^{A}.(h_{bc}\ (h_{ab}\ k_a))\ \ h_a))) \\
\rightsquigarrow_{\delta}^{-1} & (\lambda k_a^{A}.(h_{bc}\ (h_{ab}\ k_a))\ \ h_{ade})_{(\varepsilon;11)} \\
\rightsquigarrow_{\delta}^{-1} & ((h_{bc}\ h_{ab})_{(\varepsilon;0)}\ h_{ade})_{(\varepsilon;11)}
\end{array}
\]

The introduction of detours by β-expansion brings the term to a form to which folding rules can be applied. Unfortunately, it is generally unclear which detours need to be introduced and where they should be introduced; the possibilities are numerous, since β-expansion is non-terminating and can be applied to almost any sub-term, thus greatly enlarging the search space. Furthermore, as can be seen in the examples above, β-expansion can momentarily increase the size of the term, rendering strategies that greedily minimize the size ineffective. It is also interesting to note that the introduction of detours in natural deduction calculi is closely related to the introduction of cuts in sequent calculi, another notoriously difficult problem in proof compression [17, 13].

7 Conclusions

The main contribution of this paper was the development of NDc , a sound and complete simple extension of natural deduction that allows inference rules to operate on deeply occurring subformulas. Formalizations of intrinsically deep reasoning steps then become more convenient and more natural, and the resulting proofs can be significantly smaller. Indeed, it has been shown (in Theorem 7) that NDc -proofs can be quadratically smaller than ND-proofs in the best cases. Therefore, NDc might be especially useful in application scenarios where the size of proofs and the proof-checking time are important. NDc has been implemented in the Skeptik system (https://github.com/Paradoxika/Skeptik), and current research is focusing on designing an experiment to complement the best-case asymptotic analysis of Theorem 7 with empirical data about the average-case. Surprisingly, if the intuitionistic contextual soundness condition is dropped, a natural deduction calculus for classical logic is obtained. This is particularly interesting because classical principles (e.g. Peirce’s law) do not need to be assumed - they can be derived; and because the calculus uses single conclusion sequents, and thus is distinct from other assumption-free natural deduction calculi for classical logic, such as Parigot’s [15]. A deeper investigation of such a classical contextual natural deduction calculus will be pursued in the near future.

References

1. Franz Baader and Tobias Nipkow. Term rewriting and all that. Cambridge University Press, 1998.
2. Sascha Böhme and Tobias Nipkow. Sledgehammer: Judgement day. In Jürgen Giesl and Reiner Hähnle, editors, IJCAR, volume 6173 of Lecture Notes in Computer Science, pages 107–121. Springer, 2010.
3. Kai Brünnler. Atomic cut elimination for classical logic. In Matthias Baaz and Johann A. Makowsky, editors, CSL, volume 2803 of Lecture Notes in Computer Science, pages 86–97. Springer, 2003.
4. Kai Brünnler and Richard McKinley. An algorithmic interpretation of a deep inference system. In 15th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning, volume 5330 of LNCS, pages 482–496. Springer, 2008.
5. Paola Bruscoli and Alessio Guglielmi. On the proof complexity of deep inference. ACM Transactions on Computational Logic, 10:1–34, 2009.
6. Paola Bruscoli, Alessio Guglielmi, Tom Gundersen, and Michel Parigot. A quasipolynomial cut-elimination procedure in deep inference via atomic flows and threshold formulae. In Edmund M. Clarke and Andrei Voronkov, editors, LPAR (Dakar), volume 6355 of Lecture Notes in Computer Science, pages 136–153. Springer, 2010.
7. Philippe de Groote, editor. The Curry-Howard Isomorphism, volume 8 of Cahiers du Centre de Logique. Academia, Université Catholique de Louvain, 1995.
8. David Deharbe, Pascal Fontaine, and Bruno Woltzenlogel Paleo. Quantifier inference rules in the proof format of veriT. In 1st International Workshop on Proof Exchange for Theorem Proving, 2011.
9. Nachum Dershowitz. On lazy commutation. In Orna Grumberg, Michael Kaminski, Shmuel Katz, and Shuly Wintner, editors, Languages: From Formal to Natural, volume 5533 of Lecture Notes in Computer Science, pages 59–82. Springer, 2009.
10. G. Gentzen. Untersuchungen über das logische Schließen. Mathematische Zeitschrift, 39:176–210, 405–431, 1934–1935.
11. Nicolas Guenot. Nested proof search as reduction in the lambda-calculus. In Peter Schneider-Kamp and Michael Hanus, editors, PPDP, pages 183–194. ACM, 2011.
12. Alessio Guglielmi. A system of interaction and structure. CoRR, cs.LO/9910023, 1999.
13. Stefan Hetzl, Alexander Leitsch, and Daniel Weller. Towards algorithmic cut-introduction. In LPAR, 2012.
14. Richard Mayr and Tobias Nipkow. Higher-order rewrite systems and their confluence. Theor. Comput. Sci., 192(1):3–29, 1998.
15. Michel Parigot. Lambda-mu-calculus: An algorithmic interpretation of classical natural deduction. In Logic Programming and Automated Reasoning: International Conference LPAR, number 624 in Lecture Notes in Computer Science, pages 190–201. Springer-Verlag, 1992.
16. Alwen Tiu. A local system for intuitionistic logic. In Miki Hermann and Andrei Voronkov, editors, LPAR, volume 4246 of Lecture Notes in Computer Science, pages 242–256. Springer, 2006.
17. Bruno Woltzenlogel Paleo. Atomic cut introduction by resolution: Proof structuring and compression. In Edmund Clarke and Andrei Voronkov, editors, 16th International Conference on Logic for Programming Artificial Intelligence and Reasoning (LPAR), number 6355 in LNAI, pages 463–480. Springer, 2010.