The Reflection Theorem: A Study in Meta-Theoretic Reasoning Lawrence C. Paulson University of Cambridge, Computer Laboratory, JJ Thomson Avenue, Cambridge CB3 0FD, UK,
[email protected]
Abstract. The reflection theorem has been proved using Isabelle/ZF. This theorem cannot be expressed in ZF, and its proof requires reasoning at the meta-level. There is a particularly elegant proof that reduces the meta-level reasoning to a single induction over formulas. Each case of the induction has been proved with Isabelle/ZF, whose built-in tools can prove specific instances of the reflection theorem upon demand.
1
Introduction
A vast amount of mathematics has been verified using proof tools. The Mizar Mathematical Library1 is probably the largest single repository, but others exist, built using a variety of theorem-provers. An optimist might conclude that any theorem can be verified given enough effort. A sufficiently large and talented team could enter the whole of Wiles’s proof of Fermat’s Last Theorem [17] and its mathematical prerequisites into a theorem prover, which would duly assert the formula ∀nxyz (n > 2 → xn + y n 6= z n ). The flaw in this point of view is that mathematicians sometimes reason in ways that are hard to formalize. Typical is G¨odel’s proof [5] of the relative consistency of the axiom of choice (AC). G¨odel begins with a complicated set-theoretic construction. At a crucial stage, he introduces operations on syntax. He defines absoluteness in terms of the relativization φM of a first-order formula φ with respect to a class M. He proceeds to apply absoluteness to his entire construction. His proof is of course correct, but it mixes reasoning about sets with reasoning about the language of sets. Relativization [8, p. 112] replaces each subformula ∃x φ by ∃x (x ∈ M ∧ φ) and dually ∀x φ by ∀x (x ∈ M → φ), bounding all quantifiers by M. In ZermeloFraenkel (ZF) set theory, a class is simply a formula and x ∈ M denotes M(x). So relativization combines the two formulas, φ and M, to yield a third, φM . This suggests that we recursively define the set F of first-order formulas within ZF. Relativization for elements of F is trivial to formalize, but it is useless—we cannot relate the “formulas” in F to real formulas. More precisely, no formula χ expresses the truth of elements of F . If for each formula φ we write pφq for the corresponding element of F , then some formula ψ is handled incorrectly: 1
Available via http://mizar.org
ψ ↔ ¬χ(pψq) can be proved in ZF. This fact is Tarski’s theorem on the nondefinability of truth [8, p. 41]. G¨ odel introduced meta-level reasoning in order to make his consistency proof effective. He could have worked entirely with sets and demonstrated how to transform a model of set theory into a model of set theory that also satisfied AC. In the latter approach, if we found a contradiction from the axioms of set theory and AC, then we would know that there existed a contradiction in set theory, but we would have no idea how to find the contradiction. G¨odel’s methods let us transform the contradiction involving AC into a contradiction involving the axioms of set theory alone. One way to handle meta-level reasoning is to throw away our set theory provers and work formalistically. We could work in a weak logic, such as PRA, which has been proposed for the QED project for mechanizing mathematics [13]. In this logic, we would define the set of formulas (the set F ), an internalized inference system, and the ZF axioms. Instead of proving the theorem φ in a ZF prover, we would prove the theorem ZF ` pφq in PRA. Then we could easily express syntactic operations on formulas. However, the formalist approach is not easy. The formal language of set theory has no function symbols and its only relation symbols are = and ∈. An assertion such as hx, yi ∈ A ∪ B has to be expressed in purely relational form, say ∃p C [isPair(x, y, p) ∧ isUnion(A, B, C) ∧ p ∈ C]. Expressions such as {x ∈ A | S φ(x)} and x∈A B(x) require a treatment of variable binding. Theorems would be hard even to state, and their proofs would require reasoning about syntax when we would rather reason about sets. My earlier work with Gr¸abczewski [12] using Isabelle/ZF [10,11] demonstrates that large amounts of set theory can be formalized without taking such an extreme measure. It is worth trying to see what can be accomplished using a set theory prover, recognizing that we can never formalize arguments performed at the meta-level. As it happens, our result can be proved as a collection of separate theorems that Isabelle can use automatically to prove any desired instance of the reflection theorem. Overview. The paper introduces the reflection theorem (§2) and the proof eventually formalized (§3). Excerpts from the two Isabelle/ZF theories are presented, concerning normal functions (§4) and the reflection theorem (§5). An interactive Isabelle session demonstrates the reflection theorem being applied (§6), and the paper concludes (§7).
2
The Reflection Theorem
The reflection theorem is a simple result that illustrates the issues mentioned above. Let ON denote the class of ordinals. Suppose that {Mα }α∈ON is a family of sets that is increasing S (which means α < β implies Mα ⊆ Mβ ) and continuous (whichSmeans Mα = ξ∈α Mξ when α is a limit ordinal). Define the class M by M = α∈ON Mα , Then the reflection theorem states that if φ(x1 , . . . , xn ) is a formula in n variables and α is an ordinal, then for some β > α and all x1 ,. . . ,
xn ∈ Mβ we have (intuitively) M |= φ(x1 , . . . , xn ) ↔ Mβ |= φ(x1 , . . . , xn ). I say intuitively, because M could be V, the universal class; as remarked above, truth in ZF is not definable by a formula. A precise statement of the conclusion requires relativization: φM (x1 , . . . , xn ) ↔ φMβ (x1 , . . . , xn ). The reflection theorem reduces truth in the class M to truth in the set Mβ , where β can be made arbitrarily large. It is valuable because classes do not exist in ZF; they are merely notation. The theorem can be applied by letting M be V and letting MαSbe Vα , the cumulative hierarchy defined by V0 = 0, Vα+1 = P(Vα ) and Vα = ξ∈α Vξ when α is limit. The reflection theorem is also applied to L, the constructible universe [8, p. 169]; it is an essential part of modern treatments of G¨ odel’s consistency proof that are based on ZF set theory. Proving the reflection theorem is not difficult, if only we can formalize it. Bancerek [1] proved it in Mizar. Mizar’s native Tarski-Grothendieck properly extends ZF: classes really do exist, and we can define M |= pφ(x1 , . . . , xn )q when M is a class. This solves the problem concerning the definability of truth. It is ironic that the formalization problems can be solved by working either in the weaker logic PRA or in a stronger logic. The approach taken below is more in the spirit of set theory: a theorem follows from the axioms, while a meta-theorem is a mechanical procedure for yielding theorems. Most authors do not formalize the meta-theory. Results such as the following are not meta-theorems, but merely theorem schemes: a ∈ {x ∈ A | φ(x)} ↔ a ∈ A ∧ φ(a) [ a∈ B(x) ↔ ∃x (x ∈ A ∧ a ∈ B(x)) x∈A
Their proofs depend not at all on the structure of the formula φ or the expression B. Thanks to Isabelle’s higher-order syntax, each is a single Isabelle/ZF theorem, with a trivial proof. The reflection theorem is different: it is proved by reasoning about a formula’s structure.
3
Proof Overview
The first task in formalizing the reflection theorem is to find a proof with the least amount of meta-level reasoning. Kunen’s proof [8, p. 136] needs a lemma, also proved at the meta-level, about a subformula-closed list of formulas. The proof idea is related to Skolemization and involves finding all existentially quantified subformulas. Drake’s proof [4, p. 99] requires the formula to be presented in prenex form and involves a simultaneous construction for the whole quantifier string. In both proofs, the meta-level component is substantial.
Mostowski’s proof [9, p. 23], fortunately, is a simple structural induction. Reflection for atomic formulas is trivial. Reflection for ¬φ(x) and φ(x)∧φ0 (x) follows trivially from induction hypotheses for φ(x) and φ0 (x). Reflection for ∃y φ(x, y) follows from an induction hypothesis for φ(x, y). The main complication is that the case for ∃y φ(x, y) adds a variable to the induction hypothesis; we do not want the theorem statement to depend upon the number of free variables in the formula. By assuming that the class M is closed (in a suitable way) under ordered pairing, it suffices to derive reflection for ∃y φ(hx, yi) from reflection for φ(hx, yi), which trivially follows from reflection for φ(z). The proofs are nontrivial, but they take place entirely within ZF. The only meta-level reasoning is the structural induction itself: noting that it suffices to prove the cases for atomic formulas, ¬, ∧ and ∃. The simple structure of these lemmas makes it easy to apply reflection to individual formulas and yields an expression for the class of ordinals that reflect the formula. At the end of this paper, we shall see Isabelle doing this automatically. Mostowski’s proof owes its simplicity to the classic technique of strengthening the induction hypothesis. The required conclusion has the form ∀α ∃β > α . . .; in other words, the possible values of β form an unbounded class. In Mostowski’s proof, this class is closed as well as unbounded. A S class X of ordinals is S closed provided for every nonempty set Y , if Y ⊆ X then Y ∈ X. (The union Y is the supremum, or limit, of the set Y .) It turns out that if X and X0 are closed and unbounded, then so is X ∩ X0 . This fact is crucial; in particular, it gives an immediate proof for the conjunctive case of the reflection theorem: if X is the class of ordinals for φ(x) and X0 is the class of ordinals for φ0 (x) then X ∩ X0 is a closed, unbounded class of ordinals for φ(x) ∧ φ0 (x). The function F : ON → ON is normal provided it is increasing and continuous: F (α) < F (β) if α < β [ F (α) = F (ξ) if α is a limit ordinal ξ α} is empty.) The constructions below come from Kunen [8, p. 78]. A locale [7] lets us fix the class P and the index set A . It states assumptions that hold for the whole development, namely that P is closed and unbounded and that A is nonempty. It also contains definitions of functions next greater and sup greater, which are local to the proof. locale cub family = fixes P and A fixes next greater — the next ordinal satisfying class A fixes sup greater — sup of those ordinals over all A
assumes closed: "a ∈A =⇒ Closed(P(a))" and unbounded: "a ∈A =⇒ Unbounded(P(a))" and A non0: "A 6=0" defines "next greater(a,x) S ≡ µy. x