Complexity in the Paradox of Simplicity
Jonathan Vos Post, Computer Futures, Inc., Altadena, CA, http://magicdragon.com, [email protected]
Philip Vos Fellman, School of Business, Southern New Hampshire University, [email protected]
1.1 Introduction
As Alan Baker (2001), whose work guides this paper, puts it, “most philosophers believe that, other things being equal, simpler theories are better.” But what exactly does theoretical simplicity amount to? The concept of “simpler” is not unproblematic. In modern science, we tend to see “simplicity” as the inverse of “complexity”: for a given measure of complexity, whatever has greater complexity has lesser simplicity, and vice versa. This fails to reconcile the traditional philosophy of simplicity with the modern mathematics of complexity, and it begs the question of how to measure complexity, or how to apply such a measure to either theoretical or practical problems. Ontological simplicity, or parsimony, measures the number of kinds of entities postulated by a theory. Syntactical simplicity, or elegance, measures the number and conciseness of the theory’s basic principles. One issue concerns how these two forms of simplicity relate to one another. Other issues concern the justification of principles, such as Occam’s razor, which favor “simpler” theories. Much of the debate on this subject, however, is characterized by a great degree of circularity: when one tries to get closure on the subject or to arrive at a constructive proof, terms become elusive, and most authors’ arguments ultimately end up assuming the very thing they are trying to prove. Finally, it is rarely the case that all things are, in fact, equal. The moment two explanations fall out of one-to-one correspondence, complexity arises. In contradistinction to the ancient philosophical theory of Heraclitus, quantum mechanics teaches us that you can’t even set foot in the same stream once.
1.1.1 Simpler Than What?
Before turning to the kinds of advanced problems posed by theoretical physics (and there are several), let us begin with some of the simplest, grandest, and most ill-formed ideas about simplicity, which have long been popular in the philosophy of science. F.S.C. Northrop (1947) illuminates the ways in which the assumptions of contemporary philosophy frequently underlie the world-view, and hence the findings, of science. He begins with an example we all know from physics: Aristotle’s conflation of velocity and acceleration, a mistake which held back the progress of science by a good five hundred years. A slightly less well known example is the concept of phlogiston (which was thought to underlie the process of combustion), which only held back chemistry and biochemistry for a couple of hundred years. At the turn of the 20th century, most physicists still did not seriously believe in the atomic theory of matter and expected that the infamous “complete picture of nature” (itself a most unscientific concept) would emerge as a theory explaining the propagation of waves in the aether. A concept with little more to commend it mathematically than the aether, and one which has been in vogue nearly forever, is scientific parsimony (occasionally confused with the virtue of elegance, which has additional dimensions). Parsimony is a normative term, exhorting us to believe the simpler of two competing explanations. This, in and of itself, is not science. While “simplicity” sounds simple and perhaps even Quaker-like, in a technical sense it is a kind of meta-mathematics which is ultimately either paradoxical, useless, or simply a matter of taste. Simplicity was the topic of David Hilbert's (1917) long-lost “24th problem,” somehow omitted from the canonical set of Hilbert's 23 problems set as a challenge for 20th century mathematics (and mathematical physics). Hilbert wanted to clarify the notion that for every theorem there is a “simplest” proof. His grand program was demolished by Kurt Gödel, but his problems, upon solution, bestow instant success and even immortality on the solver. His notion of “simplest” raises key questions for the 21st century.
1.1.2 The Razor’s Edge
“The sharp edge of a razor is difficult to pass over; thus the wise say the path to Salvation is hard.” W. Somerset Maugham, The Razor’s Edge, 1944 (mistranslation from the Katha Upanishad)[1]
[1] For most readers, one of the most misleading aspects of Maugham’s masterpiece is the clever fiction Maugham inserts at the beginning of the novel, claiming to have invented nothing, which is frequently taken at face value and has given rise to innumerable magazine articles, essays, and blogs. As Critique magazine notes: “The book's theme is spiritual discovery, but this is well-trod ground in our age, and we can give Maugham more credit than that for the book's resonance. With backdrops of Paris, the Riviera and India, there's a faint whiff of the bohemian to Larry and to tragic poet Sophie, but the characters have no affectations. Larry doesn't want to paint or gain fame, he simply wants to know. The novel strikes a chord because it is a brilliant portrait of friends drifting together and apart, with all the changing appraisals, disillusionment and occasional forgiveness that we bring to our own relationships… (continued) … To achieve this, Maugham achieves unequalled suspension of disbelief. He injects himself into the narrative and presents his story as gossip. A narrator claiming these things actually happened is, of course, a familiar enough cliché of 19th century literature, and Maugham's own life straddled two centuries. ‘I have invented nothing,’ he tells us early on, and we follow him into this lie (he eventually admits to us that well, yes, he has invented a few things). But Maugham is writing in 1944, so he takes the old device and turns it inside out. He doesn't offer his tale in a linear way. He plays fly-on-the-wall with his characters, then wanders into his set as an extra. He leaves gaps. Years pass, and The Writer Somerset Maugham mentions how he's getting on with business before casually reporting second-hand news of Larry or Isabel.” http://www.critiquemagazine.com/article/maugham.html
Maugham’s poem is more than a little reminiscent of the problem of Occam’s razor. Like most conventional wisdom, it doesn’t come with a label attached telling one that it is conventional wisdom. In terms of mathematical logic, it is not, and cannot be, complete (i.e., self-defining or self-validating). Those of us who speak Sanskrit would certainly like Maugham’s poem to be an accurate rendition of the Katha Upanishad. Maugham’s poem is not only inspirational, but it is (at least by some measures) considerably better than the original. As an inspiration it is elegant. As a translation it is utter nonsense (as the Sanskrit scholar Christopher Isherwood, who helped him do the translation, informed him when the book was written). Occam’s razor suffers from some of the same problems. Baker (2001, 2003) summarizes this problem as follows: “There is a widespread philosophical presumption that simplicity is a theoretical virtue. This presumption that simpler theories are preferable appears in many guises. Often it remains implicit; sometimes it is invoked as a primitive, self-evident proposition; other times it is elevated to the status of a ‘Principle’ and labeled as such (for example, the ‘Principle of Parsimony’). However, it is perhaps best known by the name ‘Occam's (or Ockham's) Razor.’”
1.2 Simplicity Principles
As Baker continues, “Simplicity principles have been proposed in various forms by theologians, philosophers, and scientists, from ancient through medieval to modern times.” The core claim of this paper is that these different definitions of simplicity present a fundamentally and inescapably paradoxical problem with respect to which choice is actually the simplest. Aristotle put it this way in his Posterior Analytics: “We may assume the superiority ceteris paribus of the demonstration which derives from fewer postulates or hypotheses.” Unfortunately, Aristotle’s argument suggests an arithmetical process of counting postulates, or counting hypotheses, in order to choose the simplest demonstration.[2] However, a demonstration is not the same as a theory, or as a proof, or, perhaps most importantly, as a methodology of science. Since “proof” is a particularly important word in this paper, let us review a conventional definition, and then distinguish between the differing attitudes towards proof generally held in mathematics and in physics. “Proof: A rigorous mathematical argument which unequivocally demonstrates the truth of a given proposition. A mathematical statement which has been proven is called a theorem.” (Weisstein, MathWorld) According to G.H. Hardy (1967), “all physicists, and a good many quite respectable mathematicians, are contemptuous about proof. I have heard Professor (Arthur) Eddington, for example, maintain that proof, as pure mathematicians understand it, is really quite uninteresting and unimportant, and that no one who is really certain that he has found something good should waste his time looking for proof.... [This opinion], with which I am sure that almost all physicists agree at the bottom of their hearts, is one to which a mathematician ought to have some reply.” In support of Hardy's assertion, Nobel laureate physicist Richard Feynman is reported to have said, “A great deal more is known than has been proved” (Derbyshire, 2004).
[2] Aristotle’s influence upon all of Medieval philosophy can be seen in the work of St. Thomas Aquinas, when the latter writes: “If a thing can be done adequately by means of one, it is superfluous to do it by means of several; for we observe that nature does not employ two instruments where one suffices.” Unfortunately, Aristotle and Aquinas beg the questions of what an “instrument” is, of what it means for Nature to “do a thing” and how this is similar to a human being “doing a thing,” and of whether making a proof is an instantiation of “doing a thing.”
1.3 Proof by Computer: Shot by Guns? Or just another case of bee-honey?
At present, there is considerable debate in the mathematics community, triggered by the enormous increase in the use of computer software to do mathematical work, as to precisely what constitutes a proof. The four-color theorem is an example of this debate. The four-color theorem states that any map in a plane can be colored using four colors, in such a way that regions sharing a common boundary (other than a single point) do not share the same color. This problem is also called Guthrie’s problem, since F. Guthrie first conjectured the theorem in 1853, after which it was communicated to de Morgan and then more widely (Coxeter, 1959). Various apparent proofs have been published and then found invalid, and there have been several computer-proof hoaxes. Arthur Cayley wrote the first paper on the four-color conjecture, in 1878. This problem has become the nexus of debate on proof because the accepted proof relies on an exhaustive computer testing of many individual cases which cannot be, and have not been, verified “by hand” (Appel, 1976, 1977, 1986, 1989); a sketch of the brute-force flavor of such case-checking appears below. Nor has a more comprehensible human-generated proof been universally accepted.[3] For the present, a large number of mathematicians consider computer-assisted proofs as valid, but a conservative fraction of purists do not. There are many computer systems currently under development and in use for automated theorem proving, including for the verification of published proofs in papers and textbooks, and for the automated (and semi-automated) creation of new proofs. This includes new proofs of theorems already known to be true, as well as case-by-case implementation of Hilbert’s goal of finding “simplest” proofs [as discussed in Section XXXX], which brings us back to the problem of what constitutes “simplest.” In the eighteenth century, Aristotle’s version of Occam’s razor was applied to the philosophy of science by Immanuel Kant (1950), who argued that “rudiments or principles must not be unnecessarily multiplied (entia praeter necessitatem non esse multiplicanda),” thereby arguing that this is a regulative and normative idea of pure reason (not a descriptive notion observed as such) at the foundations of scientists' theorizing about nature (pp. 538-90). In part, Kant was restating Galileo’s and Newton’s versions of Occam’s razor. Newton includes a principle of parsimony as one of his three ‘Rules of Reasoning in Philosophy’ at the beginning of Book III of Principia Mathematica (1972, p. 398): “Rule I: We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.” This principle applies Occam’s razor to the causes in a cause-and-effect theory, and neatly evades the determination of what constitutes “truth” and “sufficiency.” Newton further observes, anthropomorphically, that “Nature is pleased with simplicity, and affects not the pomp of superfluous causes,” as if this observation, in and of itself, constituted a proof.
Newton, in his day, was attacked precisely for the lack of causation in his Universal Theory of Gravitation, as well as for its being a philosophically untenable form of action at a distance, when cause and effect were traditionally (following Aristotle) presumed to be mediated only by direct contact between bodies.[4] However, this approach to parsimony runs into problems because (a) there are effectively an infinite number of empirical facts; (b) not all facts deserve the same weight of evidence in a scientific theory; and (c) the mere counting of hypotheses or axioms (not the same thing in any case) is not a panacea.
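To make the four-color debate concrete, here is a minimal sketch of the brute-force, case-by-case checking that makes such proofs resistant to verification "by hand." The six-region map below is a hypothetical example invented for illustration; the Appel-Haken proof reduced the general problem to nearly 2,000 configurations requiring exactly this style of machine checking.

```python
from itertools import product

def four_colorable(n_regions, borders):
    """Brute-force search for a proper 4-coloring of a map, given as a
    list of pairs of bordering regions. The search is exponential in
    n_regions, which is why the general proof needed machine help."""
    for coloring in product(range(4), repeat=n_regions):
        if all(coloring[a] != coloring[b] for a, b in borders):
            return coloring
    return None

# A hypothetical 6-region map: region 0 borders every other region,
# and regions 1..5 form a ring around it.
borders = [(0, i) for i in range(1, 6)] + [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
print(four_colorable(6, borders))  # -> (0, 1, 2, 1, 2, 3)
```

Even this toy instance examines up to 4^6 = 4,096 cases; the human reader can trust the answer only by trusting the machine, which is precisely the epistemological point at issue.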
[3] See also Barnette, 1983; Birkhoff, 1913; Cahit, 2004; Chartrand, 1985; Devlin, 2005; Dharwadker, internet; Errera, 1921; Franklin, 1938, 194; Gardner, 1966, Apr 1975, July 1975; Harary, 1994; Heawood, 1890, 1898; Hutchinson, 1998; Kempe, 1879; Kittell, 1935; Knight, 2005; Kraitchik, 1942; Morganstern, 1991; Ore, 1967, 1969; Pappas, 1989; Ringel, 1968; Robertson, 1996; Saaty, 1986; Skiena, 1990; Steinhaus, 1999; Tait, 1880; Thomas, 1998; Wagon, 1998, 1999; and Wells, 1986, 1989.
[4] Galileo, comparing the Ptolemaic and Copernican theories of the solar system (Galileo, 1962, p. 397), concludes: “Nature does not multiply things unnecessarily; that she makes use of the easiest and simplest means for producing her effects; that she does nothing in vain, and the like.” Again, this anthropomorphizes Nature, avoids definitions of “easiest” and “simplest,” and features causes rather than theories of cause and effect. Similarly, Lavoisier maintained (1862, pp. 623-4): “If all of chemistry can be explained in a satisfactory manner without the help of phlogiston, that is enough to render it infinitely likely that the principle does not exist, that it is a hypothetical substance, a gratuitous supposition. It is, after all, a principle of logic not to multiply entities unnecessarily.” We can all agree that eliminating phlogiston was a good thing, marking the end of Alchemy and the beginning of modern Chemistry; this, however, sidesteps the question of when it is necessary to “multiply entities.” Albert Einstein joined the chorus (Einstein, quoted in Nash 1963, p. 173): “[T]he grand aim of all science…is to cover the greatest possible number of empirical facts by logical deductions from the smallest possible number of hypotheses or axioms.”
2.1 Simplicity is not simple
As Alan Baker argues, “the apparent familiarity of the notion of simplicity means that it is often left unanalyzed, while its vagueness and multiplicity of meanings contributes to the challenge of pinning the notion down precisely.” In this regard, Baker cites Poincaré, who noted that “simplicity is a vague notion” and that “everyone calls simple what he finds easy to understand, according to his habits” (Gauch, 2003). In attempting to sort out which simplicity various authors are describing, Baker recognizes two fundamentally distinct senses of simplicity: (1) ELEGANCE, or “syntactic simplicity” (roughly, the number and complexity of hypotheses), and (2) PARSIMONY, or “ontological simplicity” (roughly, the number and complexity of things postulated). This latter is also known as ‘semantic simplicity’ (Sober, 2001). Again, we caution the reader against the implicit assumption that a mere count will solve the underlying problem. Baker also warns that “the terms ‘parsimony’ and ‘simplicity’ are used virtually interchangeably in much of the philosophical literature.” Accepting this distinction for the purposes of argument, the body of literature representing these two competing notions of simplicity can be seen as seeking answers to three basic kinds of questions: (1) DEFINITIONAL (ontological): “How is simplicity to be defined?” (2) OPERATIONAL (teleological): “What is the role of simplicity principles in different areas of inquiry?” (3) INVESTIGATIVE (epistemological): “Is there a rational justification for such simplicity principles?” Baker intriguingly claims that DEFINITION “is more straightforward for parsimony than for elegance,” and that “conversely, more progress on… rational justification has been made for elegance than for parsimony.” He also argues that “it should also be noted that the above (three) questions can be raised for simplicity principles both within philosophy itself and in application to other areas of theorizing, especially empirical science.” A subdivision of the literature becomes necessary for OPERATIONAL matters, in that we must (to avoid confusion) distinguish between two styles of simplicity principles, both loosely identified with Occam’s razor: (1) EPISTEMIC: “if theory T is simpler than theory T*, then it is rational (other things being equal) to believe T rather than T*,” and (2) METHODOLOGICAL: “if T is simpler than T* then it is rational to adopt T as one's working theory for scientific purposes.” (We have here substituted the term “OPERATIONAL” for Baker’s term “USAGE.”) Making the same distinction, Baker also argues that the two differing conceptions of Occam’s razor require different types of justification. He notes that “in analyzing simplicity, it can be difficult to keep its two facets — elegance and parsimony — apart. Principles such as Occam’s razor are frequently stated in a way which is ambiguous between the two notions, for example, ‘Don't multiply postulations beyond necessity.’ Here it is unclear whether ‘postulation’ refers to the entities being postulated, or the hypotheses which are doing the postulating, or both.” The first reading corresponds to parsimony, the second to elegance. Examples of both sorts of simplicity principle can be found in the earlier quotations from Aristotle, Kant, Galileo, Newton, Lavoisier, and Einstein. Continuing in this vein, he explains an important and complex tradeoff: “while these two facets of simplicity are frequently conflated, it is important to treat them as distinct.
One reason for doing so is that considerations of parsimony and of elegance typically pull in different directions. Postulating extra entities may allow a theory to be formulated more simply, while reducing the ontology of a theory may only be possible at the price of making it syntactically more complex.” This tradeoff suggests some kind of philosophical cost/benefit analysis as an implicit part of any scientific method (see also Northrop on “concepts by intuition” and “concepts by postulation,” as well as on inductive vs. deductive scientific method and the special meaning of probability in quantum mechanics). Baker describes this cost/benefit analysis thus: “There is typically a trade-off between ontology and ideology — to use the terminology favored by Quine — in which contraction in one domain requires expansion in the other. This points to another way of characterizing the elegance/parsimony distinction, in terms of simplicity of theory versus simplicity of world respectively. This version of the distinction is reflected in some philosophers’ choice of terminology….” Gauch (2003) similarly writes ‘epistemological parsimony’ where we have written ELEGANCE, and ‘ontological parsimony’ where we have written PARSIMONY. We get a modern mathematical flavor in the Sober (2001) argument that, as Baker puts it, “both these facets of simplicity can be interpreted in terms of minimization. In the (atypical) case of theoretically idle entities, both forms of
minimization pull in the same direction; postulating the existence of such entities makes both our theories (of the world) and the world (as represented by our theories) less simple than they might be.” Minimization over nonlinear functions is a richer approach than mere arithmetic counting, but it raises deeper problems, in that minimization can be computationally expensive and is sometimes impossible; the sketch below illustrates the difficulty.
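As a minimal illustration of that last claim (our sketch, not part of Baker's argument), consider minimizing even a one-dimensional multimodal function: a standard local optimizer returns different "minima" depending on its starting point, and certifying the global minimum requires costly exhaustive search. The function f below is invented for illustration, and the sketch assumes numpy and scipy are available.

```python
import numpy as np
from scipy.optimize import minimize

# A multimodal "cost" function, full of local minima, so a local
# optimizer's answer depends entirely on where it starts.
f = lambda x: np.sin(5 * x[0]) + 0.1 * x[0] ** 2

for start in (0.5, 2.5):
    res = minimize(f, x0=[start])
    print(f"start={start}: x*={res.x[0]:.3f}, f(x*)={res.fun:.3f}")
# The two starting points land in different local minima; guaranteeing
# the global minimum would require exhaustive (and expensive) search.
```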
2.2 Ontological Parsimony
The ontological version of Occam’s razor is most frequently some paraphrase of Kant to the effect that “entities are not to be multiplied beyond necessity.” Baker notes that such recent treatments of Occam’s razor “are connected only very tenuously to the 14th-century figure William of Ockham. We are not here interested in the exegetical question of how Ockham intended his ‘Razor’ to function, nor in the uses to which it was put in the context of medieval metaphysics” (Thornburn, 1918; Stanford Encyclopedia of Philosophy, 2005). Baker further explains that contemporary philosophers often reinterpret Occam’s razor as a principle of theory choice: “Occam’s razor implies that — other things being equal — it is rational to prefer theories which commit us to smaller ontologies.” Hence we find typical modern paraphrasings of Occam’s razor such as the one Baker labels OR1: “Other things being equal, if T1 is more ontologically parsimonious than T2 then it is rational to prefer T1 to T2.” The question then becomes: how do we determine whether one theory is more ontologically parsimonious than another? Most frequently, Willard Quine's (1981) concept of ontological commitment is utilized, as follows: (1) A theory, T, is ontologically committed to Fs if and only if T entails that Fs exist. (2) If two theories, T1 and T2, have the same ontological commitments except that T2 is ontologically committed to Fs and T1 is not, then T1 is more parsimonious than T2. To use the apparatus of Set Theory, a sufficient condition for T1 being more parsimonious than T2 is for the ontological commitments of T1 to be a proper subset of those of T2 (a condition made concrete in a short sketch below). Baker also notes that OR1 is much weaker than the informal version of Occam’s razor, OR, with which we started. Why? Because “OR stipulates only that entities should not be multiplied beyond necessity. OR1, by contrast, states that entities should not be multiplied other things being equal, and this is compatible with parsimony being a comparatively weak theoretical virtue.” There is, of course, a trivial case to demonstrate this. The case in which OR1 can be straightforwardly applied is when a theory, T, postulates entities which are “explanatorily idle.” Eliminating these idle entities from T produces a second theory, T*, which has the same theoretical virtues as T but a smaller set of ontological commitments. Hence, according to OR1, it is rational to pick T* over T. However, Baker warns that “terminology such as ‘pick’ and ‘prefer’ is crucially ambiguous between the epistemological and teleological (methodological) versions of Occam’s razor. For the purposes of defining ontological parsimony, it is not necessary to resolve this ambiguity.” We might also add that, having used the elementary apparatus of Set Theory, we should be extremely careful, to the extent that “picking” can embroil us in deep problems with Zermelo’s Axiom of Choice (Zermelo, 1931/1985). Baker also raises “a more general worry concerning the narrowness of the applications of OR1.” First, how often does it actually happen that we have two (or more) competing theories for which ‘other things are equal’? In other words, these conditions will hardly ever apply if we are being precise (Holsinger, 1980). Baker also cautions: “how often are one candidate theory's ontological commitments a proper subset of another's? Much more common are situations where ontologies of competing theories overlap, but each theory has postulates which are not made by the other.
Straightforward comparisons of ontological parsimony are not possible in such cases.” A final distinction within the definitional question for ontological parsimony is that between qualitative parsimony (roughly, the number of types or kinds of thing postulated) and quantitative parsimony (roughly, the number of individual things postulated). To exemplify this distinction: the hypothesis that the assassination of John F. Kennedy was accomplished by 75 CIA agents in a secret cabal is less quantitatively parsimonious, but more qualitatively parsimonious, than the hypothesis that it was performed by one agent of Fidel Castro, one hit-man from the New Orleans Mafia, one rogue soldier afraid that Kennedy would order a retreat from Vietnam, one friend of Lyndon Baines Johnson, one representative of the major banks concerned
with his fiscal policy, one jealous husband, one time traveler, and one extraterrestrial.[5] In this regard, Baker’s study concluded that “the default reading of Occam’s razor in the bulk of the philosophical literature is as a principle of qualitative parsimony. Thus Cartesian dualism, for example, is less qualitatively parsimonious than materialism because it is committed to two broad kinds of entity (mental and physical) rather than one.” Baker notes that “interpreting Occam’s razor in terms of kinds of entity brings with it some extra philosophical baggage of its own. In particular, judgments of parsimony become dependent on how the world is sliced up into kinds. Nor is guidance from extra-philosophical usage — and in particular from science — always clear cut. For example, is a previously undiscovered subatomic particle made up of a novel rearrangement of already discovered sub-particles a new ‘kind’? What about a biological species, which presumably does not contain any novel basic constituents? Also, ought more weight to be given to broad and seemingly fundamental divisions of kind — for example between the mental and physical — than between more parochial divisions? Intuitively, the postulation of a new kind of matter would seem to require much more extensive and solid justification than the postulation of a new sub-species of spider.”
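The subset condition and the qualitative/quantitative distinction can be made concrete in a few lines. The sketch below (our illustration, not Baker's) encodes each hypothesis from the Kennedy example as a multiset of the kinds of agent it postulates; the labels are of course hypothetical.

```python
from collections import Counter

def more_parsimonious(t1, t2):
    """Quine-style sufficient condition: T1 beats T2 if T1's ontological
    commitments are a proper subset of T2's."""
    return set(t1) < set(t2)

# The paper's JFK example: each hypothesis as a multiset of agent KINDS.
cabal  = Counter({"CIA agent": 75})
motley = Counter({"Castro agent": 1, "Mafia hit-man": 1, "rogue soldier": 1,
                  "LBJ friend": 1, "banker": 1, "jealous husband": 1,
                  "time traveler": 1, "extraterrestrial": 1})

quantitative = lambda t: sum(t.values())  # individual things postulated
qualitative  = lambda t: len(t)           # kinds of thing postulated

print(quantitative(cabal), qualitative(cabal))    # 75 1
print(quantitative(motley), qualitative(motley))  # 8 8
# The cabal is less quantitatively but more qualitatively parsimonious;
# neither ontology is a subset of the other, so the subset test is silent:
print(more_parsimonious(cabal, motley))           # False
```

The final line exhibits Baker's "narrowness" worry exactly: when commitments merely overlap, the proper-subset criterion simply does not apply.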
2.2.1 A Priori Justifications of Simplicity
At the beginning of this paper we mentioned that the paradox of simplicity revolves, in part, around a fundamental circularity: if the argument is pursued long enough, one discovers that the author has in fact assumed the very thing he or she is trying to prove. While it is beyond the scope of this particular paper to undertake an exhaustive analysis of the a priori justification of simplicity, Baker has done so, recognizing four different fundamental categories of a priori simplicity: (1) theoretical virtue, (2) theological explanations, (3) metaphysical justifications, and (4) intrinsic value justifications.
1. Theoretical Virtue – This concept is so often presented, so commonly claimed to be fundamental, and so often implicit in general theories, that an army of philosophers, scientists, and theologians has found itself marching side by side against the common enemy of complexity. This army, temporarily avoiding internecine conflict, seeks justification for Occam’s razor and related techniques on wide and fundamental grounds. Rationalism itself seems married to the view that only by making a priori simplicity assumptions can one avoid what Baker calls “the underdetermination of theory by data.” That is, the same observations give rise to a plethora of explanatory and predictive theories; the data themselves do not provide a method to prefer one theory to another. “Until the second half of the 20th Century this was probably the predominant approach to the issue of simplicity.”
2. Theological Explanations – Baker notes that “the post-medieval period coincided with a gradual transition from theology to science as the predominant means of revealing the workings of nature. In many cases, espoused principles of parsimony continued to wear their theological origins on their sleeves, as with Leibniz's thesis that God has created the best and most complete of all possible worlds, and his linking of this thesis to simplifying principles such as light always taking the (time-wise) shortest path.”[6]
2.2.2 Metaphysical Justifications of Simplicity
Baker says that one “approach to justifying simplicity principles is to embed such principles in some more general metaphysical framework. Perhaps the clearest historical example of systematic metaphysics of this sort is the work of Leibniz. The leading contemporary example of this approach — and in one sense a direct descendent of Leibniz's methodology — is the possible worlds framework of David Lewis.” Lewis himself argues: “I subscribe to the general view that qualitative parsimony is good in a philosophical or empirical hypothesis” (Lewis, 1973). Baker says that “Lewis has been attacked for not saying more about what exactly he takes simplicity to be,” and cites Woodward (2003). “What is clear is that simplicity plays a key role in underpinning his metaphysical framework, and is also taken to be a prima facie theoretical virtue.”
[5] It is a well known fact that all members of the cabal were extraterrestrials (X-Files, 1998).
[6] For these scientists, and presumably for many of Stephen Hawking’s readers, “God” stands for “an hypothesized omnipotent, omniscient, incorporeal yet personal Creator; the traditional Mosaic God of Judaism, Christianity and Islam. This conception of God needs to be, but here is not, sharply distinguished from that of Einstein” (Flew, 1996), who was once asked, to settle an argument, whether he believed in God. He replied that he believed in Spinoza's God (Sommerfeld, 1959). Since for Spinoza the words “God” and “Nature” were essentially synonymous, Einstein was hedging his bets in public among atheism, Naturalism, Deism, and agnosticism when he protested against quantum theory in the phrase often translated as “The Lord God does not play dice.” Presumably, this same slippery context exists for his statement, now inscribed over a fireplace in Fine Hall at Princeton University: “God who creates and is nature is very difficult to understand, but he is not arbitrary or malicious.”
2.2.3 ‘Intrinsic Value’ Justifications of Simplicity
Does simplicity have intrinsic value as a theoretical goal? “Just as the question ‘why be rational?’ may have no non-circular answer, the same may be true of the question ‘why should simplicity be considered in evaluating the plausibility of hypotheses?’” (Sober, 2001). Baker distinguishes between (1) the notion that “such intrinsic value may be ‘primitive’ in some sense,” and (2) the notion that the intrinsic value of simplicity “may be analyzable as one aspect of some broader value.” Derkse (1992) is a book-length exegesis of this second idea, echoing Quine's remark, in defending Occam’s razor, that his tastes run to “clear skies” and “desert landscapes.” Baker speculates that, “in general, forging a connection between aesthetic virtue and simplicity principles seems better suited to defending methodological rather than epistemic principles.” As we stated earlier, aesthetic defenses are close to the ancient theological justifications of simplicity.
2.2.4 Justifications of Simplicity via Principles of Rationality
Another approach which Baker finds in common use “is to try to show how simplicity principles follow from other better established or better understood principles of rationality.”[7] He argues that some philosophers just stipulate that they will take ‘simplicity’ as shorthand for whatever package of theoretical virtues is (or ought to be) characteristic of rational inquiry. This approach is quite similar in its mechanics to the theological approach, only with an ad hominem substitution of secular philosophical authority for religious authority in the acceptance of revealed truth. Curiously, Ben-Ami Scharfstein (1989, 1993), in an interesting set of psychological analyses, examines the psychological foundations of the philosophical thought of the major philosophers of the 18th and 19th centuries and finds a host of pathologies, not the least of which was the inability of any of them to maintain a normal home or family life (illustrating the dangers of ad hominem arguments, as opposed to illustrating the dangers of philosophy). A more substantive alternative is to link simplicity to some particular theoretical goal, for example unification; see, for instance, Friedman (1983). The problem here is relevance. As Baker says, “while this approach might work for elegance, it is less clear how it can be maintained for ontological parsimony.” Conversely, a line of argument which seems better suited to defending parsimony than to defending elegance is to appeal to a principle of epistemological conservatism. Parsimony in a theory can be viewed as minimizing the number of ‘new’ kinds of entities and mechanisms which are postulated. This preference for old mechanisms may in turn be justified by a more general epistemological caution, or conservatism, which is characteristic of rational inquiry. Note, however, that such conservatism can cut both ways, since it implies resistance to any ontological revision, be it a contraction, an expansion, or a wholesale replacement. In summary, then, the weakness of a priori justifications of simplicity principles is the fundamental difficulty of distinguishing between an a priori defense and no defense (i.e., pleading nolo contendere and being sentenced). Baker adds: “Sometimes the theoretical virtue of simplicity is invoked as a primitive, self-evident proposition that cannot be further justified or elaborated upon….. (as in) the beginning of (Goodman & Quine, 1947), where they state that their refusal to admit abstract objects into their ontology is ‘based on a philosophical intuition that cannot be justified by appeal to anything more ultimate.’ Critically, it is unclear where leverage for persuading skeptics of the validity of such principles can come from, especially if the grounds provided are not themselves to beg further questions. Misgivings of this sort have led to a shift away from justifications rooted in ‘first philosophy’ towards approaches which engage to a greater degree with the details of actual practice, both scientific and statistical.” We are thus led to survey scientific and statistical justifications for inductive definitions of simplicity before turning to results based on modern Proof Theory and Complexity Theory.
[7] See Nolan (1999) for a rich parallel with the question of whether Quine’s notion of “fertility” is a theoretical virtue in its own right.
2.2.5 Naturalistic Justifications of Simplicity: Einstein
Within analytic philosophy, after roughly 1950, rationalism was displaced by scientific “naturalized epistemology.” Naturalism holds that philosophy is continuous with science, rather than independently privileged, different in scope, or distinct in methodology. Naturalism concludes that science cannot, and need not, have independent philosophical justification. This leads to a Naturalist epistemic justification of simplicity, based on ontological parsimony and Occam’s razor. The most commonly used example in this regard is Einstein’s theory of special relativity. However, the underlying reason special relativity replaced the earlier Lorentz-Poincaré theory had little to do with simplicity and was instead based largely on other theoretical problems (such as that of measurement).[8] In a technical sense, the claim that special relativity succeeded as a “simpler” theory is further clouded by the following details (most of which go beyond the scope of this paper): (1) Einstein himself admitted that SR did not describe the physical world, in that (for example) under SR a wheel cannot rotate; (2) Einstein was led to go beyond SR to GR (General Relativity); (3) Einstein was ultimately unsuccessful in connecting GR with quantum mechanics in a “unified field theory”; and (4) Einstein admitted, in confusing ways, that in GR and a unified theory space-time itself is the banished luminiferous ether.
2.3 Epistemological Problems of Naturalism
If we return to our earlier discussion of the problems of mere counting, we can now incorporate Baker’s caution on representational issues, namely that “how the world is sliced up into kinds [affects] the extent to which a given theory ‘multiplies’ kinds of entity.” The crux of his argument, as we have seen, is that justifying any particular set of categories becomes much more difficult once we abandon the various categories of a priori reasons for simplicity. Naturalism probably falls into its most serious trouble with a well-intentioned but fruitless “application of parsimony principles to abstract objects.” In a Wittgensteinian sense, this is probably the crux of the matter: simplicity carries a great deal of metaphysical baggage which we generally choose to ignore entirely (because we believe, for reasons other than the stated scientific reasons, that it must be right). As Baker puts it: “The scientific data is — in an important sense — ambiguous. Applications of Occam’s razor in science are always to concrete, causally efficacious entities, whether landbridges, unicorns, or the luminiferous ether. Perhaps scientists apply an unrestricted version of Occam’s razor to that portion of reality in which they are interested, namely the concrete, causal, spatiotemporal world. Or perhaps scientists apply a ‘concretized’ version of Occam’s razor unrestrictedly. Which is the case? The answer determines which general philosophical principle we end up with: ought we to avoid the multiplication of objects of whatever kind, or merely the multiplication of concrete objects? The distinction here is crucial for a number of central philosophical debates. Unrestricted Occam’s razor favors monism over dualism, and nominalism over Platonism. By contrast, ‘concretized’ Occam’s razor has no bearing on these debates, since the extra entities in each case are not concrete.”
[8] Baker frames this as “the replacement of an empirically adequate theory (the Lorentz-Poincaré theory) by a more ontologically parsimonious alternative (Special Relativity). Hence it is often taken to be an example of Occam’s razor in action. The problem with using this example as evidence for Occam’s razor is that Special Relativity (SR) has several other theoretical advantages over the Lorentz-Poincaré (LP) theory in addition to being more ontologically parsimonious. Firstly, SR is a simpler and more unified theory than LP, since in order to ‘save the phenomena’ a number of ad hoc and physically unmotivated patches had been added to LP. Secondly, LP raises doubts about the physical meaning of distance measurements. According to LP, a rod moving with velocity, v, contracts by a factor of (1 − v²/c²)^(1/2). Thus only distance measurements that are made in a frame at rest relative to the ether are valid without modification by a correction factor. However, LP also implies that motion relative to the ether is in principle undetectable. So how is distance to be measured? In other words, the issue here is complicated by the fact that — according to LP — the ether is not just an extra piece of ontology but an undetectable extra piece. Given these advantages of SR over LP, it seems clear that the ether example is not merely a case of ontological parsimony making up for an otherwise inferior theory.”
3.1 Statistical and Probabilistic Justifications of Simplicity
Both approaches for which we have annotated Baker’s survey, namely a priori rationalism and naturalized empiricism, are practical failures, and too lacking in mathematical rigor to be repaired. Baker says that they also fail because they “are both in some sense extreme. Simplicity principles are taken either to have no empirical grounding, or to have solely empirical grounding. Perhaps as a result, both these approaches yield vague answers to certain key questions about simplicity. In particular, neither seems equipped to answer how exactly simplicity ought to be balanced against empirical adequacy. Simple but wildly inaccurate theories are not hard to come up with. Nor are inaccurate theories which are highly complex.” Baker agrees that the issue of cost/benefit analysis has been sidestepped: “But how much accuracy should be sacrificed for a gain in simplicity? The black-and-white boundaries of the rationalism/empiricism divide may not provide appropriate tools for analyzing this question. In response, philosophers have recently turned to the mathematical framework of probability theory and statistics, hoping in the process to combine sensitivity to actual practice with the ‘trans-empirical’ strength of mathematics.” While a general historical survey of the philosophical development of probability theory is beyond our present scope, we may still derive a number of valid insights if we restrict ourselves to the philosophically influential early work on simplicity modeled by probability, as done both by Jeffreys and by Popper. Baker summarizes their work thus: “Jeffreys argued that ‘the simpler laws have the greater prior probability,’ and went on to provide an operational measure of simplicity, according to which the prior probability of a law is 2^(−k), where k = order + degree + absolute values of the coefficients, when the law is expressed as a differential equation (Jeffreys, 1961). A generalization of Jeffreys' approach is to look not at specific equations, but at families of equations. For example, one might compare the family, LIN, of linear equations (of the form y = a + bx) with the family, PAR, of parabolic equations (of the form y = a + bx + cx²). Since PAR is of higher degree than LIN, Jeffreys' proposal assigns higher probability to LIN. Laws of this form are intuitively simpler (in the sense of being more elegant).”[9]
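Jeffreys' measure is concrete enough to compute. The sketch below is our illustration, with invented example laws; it assumes integer coefficients (so that k is well defined) and takes the "order" term to be zero for purely algebraic laws.

```python
def jeffreys_prior(order, degree, coefficients):
    """Jeffreys' simplicity prior as quoted above: 2^-k, where
    k = order + degree + sum of the absolute coefficient values."""
    k = order + degree + sum(abs(c) for c in coefficients)
    return 2.0 ** -k

# y = 1 + 2x (a LIN member) vs. y = 1 + 2x + x^2 (a PAR member)
print(jeffreys_prior(0, 1, [1, 2]))     # 2^-4 = 0.0625
print(jeffreys_prior(0, 2, [1, 2, 1]))  # 2^-6 = 0.015625
# The lower-degree law receives the greater prior probability, exactly
# the ordering Jeffreys intended (and the one Popper objected to; see
# footnote [9] below).
```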
3.1.1 Probability and Statistics
Popular culture does not distinguish between probability and statistics, but they are, in fact, two different bodies of practice, with different literatures. Baker writes: “More recent work on the issue of simplicity has borrowed tools from statistics as well as from probability theory. It should be noted that the literature on this topic tends to use the terms ‘simplicity’ and ‘parsimony’ more-or-less interchangeably (see Sober, 2003). But, whichever term is preferred, there is general agreement among those working in this area that simplicity is to be cashed out in terms of the number of free (or ‘adjustable’) parameters of competing hypotheses. Thus the focus here is entirely at the level of theory. Philosophers who have made important contributions to this approach include Forster and Sober (1994), and Lange (1995).”
[9] Baker also notes that “Popper (1959) pointed out that Jeffreys' proposal, as it stands, contradicts the axioms of probability. Every member of LIN is also a member of PAR, where the coefficient, c, is set to 0. Hence ‘Law, L, is a member of LIN’ entails ‘Law, L, is a member of PAR.’ Jeffreys' approach assigns higher probability to the former than the latter. But it follows from the axioms of probability that when A entails B, the probability of B is greater than or equal to the probability of A. Popper argues, in contrast to Jeffreys, that LIN has lower prior probability than PAR. Hence LIN is — in Popper's sense — more falsifiable, and hence should be preferred as the default hypothesis. One response to Popper's objection is to amend Jeffreys' proposal and restrict members of PAR to equations where c ≠ 0.”
3.1.2 Curve Fitting
Reflecting the treatment of Gauch (2003), Baker now explains the philosophical and scientific centrality of the curve-fitting problem: “The standard case in the statistical literature on parsimony concerns curve-fitting. We imagine a situation in which we have a set of discrete data points and are looking for the curve (i.e. function) which has generated them. The issue of what family of curves the answer belongs in (e.g. in LIN or in PAR) is often referred to as model-selection. The basic idea is that there are two competing criteria for model selection — parsimony and goodness of fit. The possibility of measurement error and ‘noise’ in the data means that the correct curve may not go through every data point. Indeed, if goodness of fit were the only criterion then there would be a danger of ‘overfitting’ the model to accidental discrepancies unrepresentative of the broader regularity.”[10] Even if there is some clique of statistical philosophers with a consensus that “simplicity should be cashed out in terms of number of parameters,” we still have a debate on the fundamental objective of simplicity principles. Why? Baker believes that this “is partly because the goal is often not made explicit. … An analogous issue arises in the case of Occam’s razor. ‘Entities are not to be multiplied beyond necessity.’ But necessity for what, exactly?” Forster (2001) distinguishes two potential goals of model selection, namely probable truth and predictive accuracy, and claims that these are importantly distinct.
[10] (Jonathan Vos Post) I am reminded here of my mentor Richard Feynman’s telling me always to be skeptical of the right-most data point in an experimental physicist’s graphs. He maintained that the experimental apparatus probably gave absurd results for some further-right data point, which was thrown out for ad hoc reasons, and that the remaining right-most point had large error bars but could not be so easily rejected. Baker writes in a similar vein: “Parsimony acts as a counterbalance to such overfitting, since a curve passing through every data point is likely to be very convoluted and hence have many adjusted parameters.”
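Overfitting is easy to demonstrate. In the sketch below (our illustration, with synthetic data assumed to come from a genuinely linear law plus noise), a degree-9 polynomial passes almost exactly through ten noisy points yet strays badly from the generating line, while the humble straight line does not.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.3, x.size)  # truly linear + noise

x_new = np.linspace(0, 1, 100)                    # a finer held-out grid
y_true = 2.0 + 3.0 * x_new                        # the generating law

for degree in (1, 9):
    coeffs = np.polyfit(x, y, degree)
    fit_err  = np.mean((np.polyval(coeffs, x) - y) ** 2)
    true_err = np.mean((np.polyval(coeffs, x_new) - y_true) ** 2)
    print(f"degree {degree}: fit MSE {fit_err:.4f}, error vs truth {true_err:.4f}")
# The degree-9 curve fits the 10 points almost perfectly yet wanders far
# from the line that generated them: goodness of fit alone overfits.
```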
3.1.3 Accuracy
Forster argues that predictive accuracy tends to be what scientists care about most: “They care less about the probability of an hypothesis being exactly right than they do about it having a high degree of accuracy.” From experience with scientists in several different fields and at several different universities and corporations, we might draw the following conclusions: (1) there is a generally accepted and nuanced distinction between precision and accuracy which management and the public fail to comprehend; (2) being exactly right is not the scientist’s problem, since once the results are published, the scientific method will deal with this through independent verification and external third parties with more powerful equipment, or with the willingness to compute with larger data sets; (3) making an accurate prediction yields greater career rewards than making a vague and unfalsifiable prediction with a beautifully articulated model; and (4) the data are held, in some sense, to be the final arbiter, along with journal referees, as in Nobel laureate Joshua Lederberg’s slogan, above his office door: “Abandon Hope All Ye Who Enter Here…Without Good Data!”
3.2 Tradeoffs
To return to our earlier cost/benefit discussion, Baker argues that “one reason for investigating statistical approaches to simplicity is a dissatisfaction with the vagaries of the a priori and naturalistic approaches. Statisticians have come up with a variety of numerically specific proposals for the trade-off between simplicity and goodness of fit. However, these alternative proposals disagree about the ‘cost’ associated with more complex hypotheses.” We emphasize this disagreement here, because the core question of this paper is how to resolve the paradox of determining a definition of “simplest” that allows one to choose the simplest definition of “simplest.” Again, as Baker notes: “Two leading contenders in the recent literature on model selection are the Akaike Information Criterion [AIC] and the Bayesian Information Criterion [BIC]. AIC directs theorists to choose the model with the highest value of {log L(Θk)/n} − k/n, where Θk is the best-fitting member of the class of curves of polynomial degree k, log L is log-likelihood, and n is the sample size. By contrast, BIC maximizes the value of {log L(Θk)/n} − k·log[n]/2n. In effect, BIC gives an extra positive weighting to simplicity by a factor of log[n]/2 (where n is the size of the sample).” Forster (2001, pp. 106-7) more deeply compares and contrasts AIC and BIC. He summarizes the two extreme positions as follows: “Extreme answers to the trade-off problem seem to be obviously inadequate. Always picking the model with the best fit to the data, regardless of its complexity, faces the prospect (mentioned earlier) of ‘overfitting’ error and noise in the data. Always picking the simplest model, regardless of its fit to the data, cuts the model free from any link to observation or experiment.” Forster associates the ‘Always Complex’ and the ‘Always Simple’ rules with empiricism and rationalism respectively (ibid.). All the candidate rules that are seriously discussed by statisticians fall between these two extremes. Yet they differ in their answers over how much weight to give simplicity in its trade-off against goodness of fit. In addition to AIC and BIC, other rules include Neyman-Pearson hypothesis testing and the minimum description length (MDL) criterion.
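Both criteria, in Baker's formulation above, are easy to compute once a likelihood is chosen. The sketch below (our illustration, assuming Gaussian errors and synthetic data from a linear law) scores a LIN fit against a PAR fit; PAR always fits at least as well, and the criteria differ only in how heavily they charge it for the extra parameter.

```python
import numpy as np

def log_likelihood(y, y_hat):
    """Gaussian log-likelihood, using the maximum-likelihood noise variance."""
    n = y.size
    sigma2 = np.mean((y - y_hat) ** 2)
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 30)
y = 1.0 + 2.0 * x + rng.normal(0, 0.2, x.size)   # generated by a LIN law

n = y.size
for name, k in (("LIN", 1), ("PAR", 2)):
    y_hat = np.polyval(np.polyfit(x, y, k), x)    # best-fitting member, degree k
    logL = log_likelihood(y, y_hat)
    aic = logL / n - k / n                        # Baker's form of AIC
    bic = logL / n - k * np.log(n) / (2 * n)      # Baker's form of BIC
    print(f"{name}: AIC={aic:.4f}, BIC={bic:.4f}")
# PAR's extra parameter buys a slightly higher likelihood, but each
# criterion charges for it; BIC, weighting simplicity by log(n)/2 per
# parameter, charges the more.
```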
3.3 Comparative Approaches
A more detailed comparison among these approaches is beyond the scope of this paper, but we note that the existence of at least four of them indicates how acute the meta-problem is of choosing between different ways of defining, using, and justifying simplicity: (1) the Akaike Information Criterion (AIC); (2) the Bayesian Information Criterion (BIC); (3) Neyman-Pearson hypothesis testing (Neyman & Pearson, 1928, 1933a, 1933b); and (4) the minimum description length (MDL) criterion, originally proposed by Jorma Rissanen in 1978 as a computable approximation of Kolmogorov complexity (Rissanen, 1996; Kontkanen, 2000; Simon, 1972; Barron, 1998; Clarke, 1990, 1994).
4.1 Hypothesis Testing
Without going into a detailed explanation of Neyman-Pearson hypothesis testing, we can summarize it as follows. In a statistical test, the researcher selects between two mutually exclusive hypotheses: the null hypothesis and the alternate hypothesis. It is commonly assumed that (a) you don't believe the null hypothesis, and (b) you do believe the alternate hypothesis. This naïve view is incorrect. The idea of disbelieving the null hypothesis stems from the principle of falsification as introduced by Karl Popper (1902-1994). According to Popper (1959), we cannot conclusively affirm a hypothesis, but we can conclusively negate it. Hence the validity of knowledge is tied to the probability of falsification. The more specific a statement, the greater the possibility that it can be negated. Popper therefore defines the scientific method as “proposing bold hypotheses, and exposing them to the severest criticism, in order to detect where we have erred” (Popper, 1974). If the hypothesis survives “the trial of fire,” then we have confirmed its validity.
4.2 Popperian Falsification
Popperian falsification is embedded in much of contemporary statistical terminology. Structural Equation Modeling (SEM) holds that when equations fail to specify a unique solution, the model is called “untestable” or “unfalsifiable,” on the grounds that it is capable of perfectly fitting any data. The notion is that a model which is always right, and which there is no way to disprove, is useless. A good hypothesis or a good model demands a high degree of specification. Quantification by, for instance, asserting that “the mean of population A is the same as the mean of population B” is considered to be such a high degree of specification. Following Popper’s logic statistically, the goal of a researcher is thus to falsify a specific statement rather than to prove that it is right. Hence, attempting falsification alone leads correctly to disbelief of the null hypothesis. Those unfamiliar with statistical methods tend to ask, at this point, “Why do we only distrust and try to falsify the null hypothesis? Why don't we equally distrust and try to falsify the alternate hypothesis?” This is not so naïve as it seems, because historically the standard formal approach to hypothesis testing is a strange amalgam of the concept of the null hypothesis (R. A. Fisher, 1949) and the alternate hypothesis (Neyman and Pearson, 1928), with traces of other more recent approaches. One can easily specify the null hypothesis, but there is no a priori way to characterize the alternate hypothesis until we have performed the experiment and contemplated the data. We may a priori hope that there is a mean difference between two sample populations, but we cannot a priori estimate how far apart the means should be. We cannot even know, a priori, from which of the alternate populations the test statistic comes. At most we can state that the difference is nonzero. The logic of hypothesis testing has become: “Given that the null hypothesis is true, what is the probability that the observation of this particular experimental data set took place?” A p value of 0.001 means that, under the assumption of the null hypothesis, only 1 time out of 1,000 would the data surface as they did. But since we were restricted initially to the null hypothesis, this is not as close to Popper’s logic of falsification as students are usually told. David Lewis, in particular, disagrees radically with this, by appeal to an infinite number of possible universes.
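That conditional logic (the probability of the data given the null, not of the null given the data) can be made concrete by simulation. The sketch below, our illustration on synthetic samples, estimates a p value by permutation: it asks how often two samples drawn from a single pooled population would show a mean gap as large as the one observed.

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.normal(0.0, 1.0, 50)   # sample from population A
b = rng.normal(0.5, 1.0, 50)   # sample from population B (means differ)

observed = abs(a.mean() - b.mean())

# Null hypothesis: one common population. Simulate it by shuffling the
# pooled data and asking how often a mean gap this large arises by chance.
pooled = np.concatenate([a, b])
trials, count = 10_000, 0
for _ in range(trials):
    rng.shuffle(pooled)
    count += abs(pooled[:50].mean() - pooled[50:].mean()) >= observed
print(f"p = {count / trials:.4f}")
# A small p says only: such data would rarely surface IF the null were
# true. It does not, by itself, confirm any particular alternate hypothesis.
```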
4.3 Type I and Type II Errors
Neyman and Pearson (1933a) introduced the concepts of Type I and Type II errors. Ludbrook and Dudley (1998) argued that in biomedical research it is advisable to control Type I error. Lipsey (1990) gives a specific guideline as to how to balance the two types of error. Wang (1993) argues that Type I and Type II errors, with an accept-reject method, are useful only for the subset of engineers in the field of statistical quality control who need clear decision rules, and that science somehow differs. Hubbard and Bayarri (2003) argue that tradeoffs between Type I and Type II errors have “nothing to do with statistical theory, but are based instead on context-dependent pragmatic considerations where informed personal judgment plays a vital role” (see also Bayarri & Stevens, 1992, and Yu, 2006). In analyzing the tradeoff problem, Baker argues that there are fundamentally three possible responses. The first response, which is favored by Forster (1995) and by Sober (1994), maintains that there is no genuine conflict here because the different criteria have different aims. Thus AIC and BIC might both be optimal criteria, if AIC is aiming to maximize predictive accuracy whereas BIC is aiming to maximize probable truth. Another difference which may influence the choice of criterion is the degree to which the goal of the model is to extrapolate beyond given data, as compared with interpolating between known data points. The second response, typically favored by statisticians, is to argue that the conflict is genuine but that it has the potential to be resolved by analyzing (using both mathematical and empirical methods) which criterion performs best over the widest class of possible situations. The third, and most pessimistic, response is to argue that the conflict is genuine but irresolvable. Kuhn (1977) takes this view, claiming that “how much weight individual scientists give a particular theoretical virtue, such as simplicity, is solely a matter of taste, and is not open to rational resolution.”
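The tradeoff itself is easy to exhibit numerically. The sketch below, our illustration using a simple z-test on synthetic samples of known variance, estimates both error rates by simulation; tightening the rejection cutoff drives the Type I rate down and the Type II rate up, which is why the balance is, on Hubbard and Bayarri's view, a pragmatic judgment rather than a theorem.

```python
import numpy as np

rng = np.random.default_rng(3)
cutoff, n, trials = 1.96, 25, 20_000  # two-sided 5% cutoff for a z-test

def rejects(true_mean):
    """One experiment: test H0 (mean = 0) on a sample of size n, unit variance."""
    sample = rng.normal(true_mean, 1.0, n)
    z = sample.mean() / (1.0 / np.sqrt(n))
    return abs(z) > cutoff

type_i  = np.mean([rejects(0.0) for _ in range(trials)])      # H0 true, rejected
type_ii = np.mean([not rejects(0.5) for _ in range(trials)])  # H0 false, retained
print(f"Type I rate ~ {type_i:.3f}, Type II rate ~ {type_ii:.3f}")
# Raising the cutoff above 1.96 lowers the Type I rate and raises the
# Type II rate; no choice of cutoff minimizes both at once.
```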
4.4 Additional Problems
Baker also notes other problems with the statistical approach to simplicity. A major difficulty which he recognizes concerns language, particularly language relativity, noting that: “Crudely put, hypotheses which are syntactically very complex in one language may be syntactically very simple in another. The traditional philosophical illustration of this problem is Goodman's ‘grue’ challenge to induction. Are statistical approaches to the measurement of simplicity similarly language relative, and — if so — what justifies choosing one language over another? It turns out that the statistical approach has the resources to at least partially deflect the charge of language relativity. Borrowing techniques from information theory, it can be shown that certain syntactic measures of simplicity are asymptotically independent of choice of measurement language.” He also identifies a problem with numbers, where what at first may appear to be a natural preference is in fact mathematically quite arbitrary, arguing that: “a second problem for the statistical approach is whether it can account not only for our preference for small numbers over large numbers (when it comes to picking values for coefficients or exponents in model equations), but also our preference for whole numbers and simple fractions over other values. In Gregor Mendel's original experiments on the hybridization of garden peas, he crossed pea varieties with different specific traits, such as tall versus short or green seeds versus yellow seeds, and then self-pollinated the hybrids for one or more generations (Bennet, 1965). In each case one trait was present in all the first-generation hybrids, but both traits were present in subsequent generations. Across his experiments with seven different such traits, the ratio of dominant trait to recessive trait averaged 2.98 : 1. On this basis, Mendel hypothesized that the true ratio is 3 : 1. This ‘rounding’ was made prior to the formulation of any explanatory model, hence it cannot have been driven by any theory-specific consideration. This raises two related questions. First, in what sense is the 3 : 1 ratio hypothesis simpler than the 2.98 : 1 ratio hypothesis?” As an example of this type of reasoning, we can also reference the semi-facetious Frivolous Theorem of Arithmetic: “Almost all natural numbers are very, very, very large” (Weisstein, citing Steinbach, 1990). This is no joke, however. Newton’s Universal Law of Gravitation famously models attractive force as varying with the inverse square of the radius (distance) between two objects, that is, as proportional to r^(−2). Theoretical or measured deviations have been variously explained by Special Relativity (in the precession of the perihelion of Mercury), General Relativity, hypothetical “fifth forces,” theories of 4-dimensional gravity, and the like, beyond the scope of this paper. For our purposes, it suffices to ask whether r^(−2) is inherently “simpler” than r^(−2.00000000001) or r^(−1.999999999903), particularly as, centuries after Newton, such models have been proposed and experimentally tested.
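One way to sharpen Baker's Mendel question is to ask what the exact-fit hypothesis actually buys in likelihood. The sketch below (our illustration, using the oft-cited counts from Mendel's round-versus-wrinkled seed experiment) compares the binomial log-likelihood of the parameter-free 3 : 1 hypothesis against the best-fitting ratio.

```python
import math

def binom_loglik(p, k, n):
    """Binomial log-likelihood of k 'dominant' outcomes in n trials at ratio p."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

# Oft-cited counts from Mendel's round vs. wrinkled seed experiment.
dominant, recessive = 5474, 1850
n = dominant + recessive

p_simple = 3 / 4          # the parameter-free 3 : 1 hypothesis
p_fitted = dominant / n   # the best-fitting ratio (about 2.96 : 1)

print(binom_loglik(p_simple, dominant, n))
print(binom_loglik(p_fitted, dominant, n))
# The fitted ratio improves the log-likelihood by only a fraction of a
# unit, while the 3 : 1 hypothesis spends no adjustable parameter: one
# precise sense in which the round ratio is "simpler" at negligible cost.
```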
4.5 The Limits of Statistical Simplicity In this vein, Baker asks: “can this choice [between an integer and a real number near an integer] be justified within the framework of the statistical approach to simplicity? The more general worry lying behind these questions is whether the statistical approach, in defining simplicity in terms of number of adjustable parameters, is replacing the broad issue of simplicity with a more narrowly — and perhaps arbitrarily — defined set of issues.” This is precisely our earlier point about the arbitrariness of a number of mathematical approaches to simplicity. Baker likewise asks of the statistical approach “whether it can shed any light on the specific issue of ontological parsimony. At first glance, one might think that the postulation of extra entities can be attacked on probabilistic grounds. For example, quantum mechanics together with the postulation ‘There exist unicorns’ is less probable than quantum mechanics alone, since the former logically entails the latter. However, as Sober has pointed out, it is important here to distinguish between agnostic Occam's razor and atheistic Occam's razor.
Atheistic OR directs theorists to claim that unicorns do not exist, in the absence of any compelling evidence in their favor. And there is no relation of logical entailment between {QM + there exist unicorns} and {QM + there do not exist unicorns}. This also links back to the terminological issue. Models involving circular orbits are more parsimonious — in the statisticians' sense of ‘parsimonious’ — than models involving elliptical orbits (per our discussion of Kepler), but the latter models do not postulate the existence of any more things in the world.”
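The entailment point in this passage can be checked mechanically. In the toy calculation below (ours; the probability values are made up purely for illustration), conjoining an extra existence claim to a theory can only lower or preserve its probability, while the “agnostic” and “atheistic” augmentations stand in no entailment relation to one another and so are not ordered by logic alone.

from fractions import Fraction

p_qm = Fraction(9, 10)                   # assumed probability of the base theory
p_unicorns_given_qm = Fraction(1, 1000)  # assumed conditional probability, pure stipulation

p_qm_and_unicorns = p_qm * p_unicorns_given_qm
p_qm_and_no_unicorns = p_qm * (1 - p_unicorns_given_qm)

# Conjunction never raises probability: P(QM & U) <= P(QM) and P(QM & not-U) <= P(QM).
assert p_qm_and_unicorns <= p_qm
assert p_qm_and_no_unicorns <= p_qm
print(p_qm_and_unicorns, p_qm_and_no_unicorns)

Which of the two conjunctions comes out more probable depends entirely on the stipulated conditional probability, which echoes the distinction above: the probabilistic argument motivates withholding the extra postulate, not denying it.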
5.1 Quantitative Parsimony In examining the concept of quantitative parsimony, Baker explains that: “Theorists tend to be frugal in their postulation of new entities. When a trace [of the trajectory of an ionizing charged particle] is observed in a cloud-chamber [or a more modern subatomic particle detector such as a bubble chamber or spark chamber], physicists may seek to explain it in terms of the influence of a hitherto unobserved particle. But, if possible, they will postulate one such unobserved particle, not two, or twenty, or 207 of them. This desire to minimize the number of individual new entities postulated is often referred to as quantitative parsimony.” We have previously discussed the limitations of mere counting (or “cashing out”) of entities, since the definition of “entity” is itself vague. But even if we axiomatize “entity” in models of physics, we are not out of the woods, primarily because there is no universally accepted justification for quantitative parsimony. As David Lewis (1973) puts it, “I subscribe to the general view that qualitative parsimony is good in a philosophical or empirical hypothesis; but I recognize no presumption whatever in favour of quantitative parsimony.” Baker elaborates on this theme by asking “is the initial assumption that one particle is acting to cause the observed trace more rational than the assumption that 207 particles are so acting? Or is it merely the product of wishful thinking, aesthetic bias, or some other non-rational influence?” He cites Nolan (1997), who examined these questions in the context of the discovery of the neutrino (see also Bunge, 1963, and Schlesinger, 1963). Physicists in the 1930s were baffled by anomalies in experiments in which radioactive atoms emitted electrons through “beta decay”: the total spin (a quantum mechanical variable, not really the same as the spin of a commonplace macroscopic object) of the particles in the system before decay exceeds by 1/2 (in appropriate quantum units) the total spin of the observed emitted particles. By intentional analogy between macroscopic and quantum “spin,” physicists expressed a strong commitment to the Law of Conservation of Angular Momentum, namely, that spin is neither created nor destroyed, but merely transferred from one part of a system to another. Hence: “physicists' response was to posit a ‘new’ fundamental particle, the neutrino (not directly observed in detectors such as a cloud chamber, because it has no charge and hence does not ionize the air, and not indirectly observed in the recoil of the atom or electron, because the neutrino is posited to have no mass), with spin 1/2 (thereby preserving the Law of Conservation of Angular Momentum), and to hypothesize that exactly one neutrino is emitted by each electron during [one atom's] Beta decay.” Nolan and Baker note that there is an infinite set of very similar massless, uncharged neutrino theories which can also account for the missing spin: H1: 1 neutrino with a spin of 1/2 is emitted in each case of Beta decay. H2: 2 neutrinos, each with a spin of 1/4, are emitted in each case of Beta decay. And, more generally, for any positive integer n, Hn: n neutrinos, each with a spin of 1/(2n), are emitted in each case of Beta decay. As Baker states: “Each of these hypotheses adequately explains the observation of a missing 1/2-spin following Beta decay.
Yet the most quantitatively parsimonious hypothesis, H1, is the obvious default choice.” This argument is not the same as the one discussed earlier concerning why an integer should be preferred to a nearby nonintegral real number, as in our examples of Mendel and Newton; rather, it is directed at why the preferred choice is a small integer over a large integer in counting hypothesized subatomic particles. Baker shows that this is subtle: “One argument for preferring H1 focuses on explanatory idleness. At first blush it seems that the extra neutrinos postulated by H2, H3, and the other less parsimonious hypotheses are explanatorily idle. For example, H2 postulates that 2 neutrinos rather than 1 are emitted following each Beta decay. Doesn’t this
introduce an extra superfluous neutrino? However, this objection is too quick, as both Nolan and Barnes have pointed out (Nolan, op. cit., p. 339; Barnes, 2000, p. 355): “Within the context of the explanation provided by H2, neither of the neutrinos postulated is explanatorily idle; the 1/4-spin of each neutrino is required to explain the overall missing 1/2-spin.” Baker could also have pointed out that, to preserve the Law of Conservation of Charge, Feynman, Gell-Mann, and other physicists chose to hypothesize quarks as particles carrying previously unimaginable fractions (1/3 and 2/3) of the charge of an electron, even though the hypothesis also comes with theories of why no quark can be observed in isolation, namely the theory of Asymptotic Freedom in the Strong Interaction, which won the 2004 Nobel Prize in Physics for David J. Gross, H. David Politzer, and Frank Wilczek (Nobel Prize, 2004).
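The arithmetic behind the family H1, H2, ..., Hn is trivial to verify, which is precisely the problem: observational adequacy alone cannot separate the hypotheses. The small check below (our sketch) uses exact rational arithmetic to confirm that n neutrinos of spin 1/(2n) always account for the missing 1/2-spin.

from fractions import Fraction

for n in (1, 2, 10, 207):
    spin_per_neutrino = Fraction(1, 2 * n)     # each hypothesized neutrino's spin
    total = n * spin_per_neutrino              # spin recovered by hypothesis Hn
    assert total == Fraction(1, 2)             # every Hn balances the books exactly
    print(f"H{n}: {n} neutrino(s) x spin {spin_per_neutrino} = {total}")

Exact Fractions are used rather than floating point so that the equality test is literal, not approximate; the choice of n values, including Baker's playful 207, is arbitrary.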
6.1 Alternative Hypotheses Baker writes: “One promising approach is to focus on the relative explanatory power of the alternative hypotheses, H1, H2, … Hn. When neutrinos were first postulated in the 1930's, numerous experimental set-ups were being devised to explore the products of various kinds of particle decay. In none of these experiments had cases of ‘missing’ 1/4-spin, or 1/6-spin, or 1/20-spin been found. The absence of these smaller fractional spins was a phenomenon which competing neutrino hypotheses might potentially help to explain.” However, he avoids mentioning that Robert Andrews Millikan made a few outlier observations of apparently fractional charges in his classic oil-drop experiments (which won Millikan his Nobel Prize in 1923). That is, experimental physicists concede no a priori Occam's razor rejection of fractional charges or spins, but perform experiments to search for them (Halyo et al., 2000). Baker also considers “the following two competing neutrino hypotheses: H1: 1 neutrino with a spin of 1/2 is emitted in each case of Beta decay. H10: 10 neutrinos, each with a spin of 1/20, are emitted in each case of Beta decay. Why has no experimental set-up yielded a ‘missing’ spin-value of 1/20? H1 allows a better answer to this question than H10 does, for H1 is consistent with a simple and parsimonious explanation, namely that there exist no particles with spin 1/20 (or less). In the case of H10, this potential explanation is ruled out because H10 explicitly postulates particles with spin 1/20. Of course, H10 is consistent with other hypotheses which explain the non-occurrence of missing 1/20-spin. For example, one might conjoin to H10 the law that neutrinos are always emitted in groups of ten. However, this would make the overall explanation less syntactically simple, and hence less virtuous in other respects. In this case, quantitative parsimony brings greater explanatory power. Less quantitatively parsimonious hypotheses can match this power only by adding auxiliary claims which decrease their syntactic simplicity. Thus the preference for quantitatively parsimonious hypotheses emerges as one facet of a more general preference for hypotheses with greater explanatory power.”
7.1 Simplicity and Induction The problem of induction is closely linked to the issue of simplicity. One obvious link is between the curve-fitting problem and the inductive problem of predicting future outcomes from observed data. Less obviously, Schulte (1999) argues for a connection between induction and ontological parsimony. Schulte frames the problem of induction in information-theoretic terms: given a data-stream of observations of non-unicorns (for example), what general conclusion should be drawn? He argues for two constraints on potential rules. First, the rule should converge on the truth in the long run (so if no unicorns exist, then it should yield this conclusion). Second, the rule should minimize the maximum number of changes of hypothesis, given different possible future observations. Schulte argues that the ‘Occam Rule’ — conjecture that Ω does not exist until it has been detected in an experiment — is optimal relative to these constraints. An alternative rule — for example, conjecturing that Ω exists until 1 million negative results have been obtained — may result in two changes of hypothesis if, say, Ωs are not detected until the 2 millionth experiment. Occam's Rule leads to at most one change of hypothesis (when an Ω is first detected).
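Schulte's mind-change comparison can be simulated directly. In the sketch below (our construction; the detection thresholds are scaled down from the one million of the example above so the loop runs quickly), each rule outputs a conjecture at every stage and we simply count its revisions.

def mind_changes(rule, detection_at, horizon):
    """Count conjecture revisions; detection_at=None means Omega is never observed."""
    changes, previous = 0, None
    for t in range(1, horizon + 1):
        seen = detection_at is not None and t >= detection_at
        conjecture = rule(t, seen)             # True means "Omega exists"
        if previous is not None and conjecture != previous:
            changes += 1
        previous = conjecture
    return changes

def occam(t, seen):
    return seen                                # "exists" only once detected

def rival(t, seen):
    return seen or t <= 1000                   # "exists" until 1000 negative results

for detection in (None, 10, 2000):             # never / early / late detection
    print(f"detection at {detection}: occam makes {mind_changes(occam, detection, 3000)} "
          f"change(s), rival makes {mind_changes(rival, detection, 3000)}")

As the text predicts, the Occam Rule never revises more than once, while the rival rule revises twice whenever detection comes after its switching point.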
8.0 Conclusion: The Preference for Simplicity With respect to the justification question, arguments have been made in both directions. Scientists are often inclined to justify simplicity principles on broadly inductive grounds. According to this argument, scientists select new hypotheses based partly on criteria that have been generated inductively from previous cases of theory choice. Choosing the most parsimonious of the acceptable alternative hypotheses has tended to work in the past; hence scientists continue to use this rule of thumb, and are justified in so doing on inductive grounds. One might try to bolster this point of view by considering a counterfactual world in which all the fundamental constituents of the universe exist in pairs. In such a ‘pairwise’ world, scientists might well come to prefer pairwise hypotheses in general to their more parsimonious rivals. This line of argument has a couple of significant weaknesses. First, one might legitimately wonder just how successful the choice of parsimonious hypotheses has actually been; counterexamples from chemistry spring to mind, such as oxygen molecules containing two atoms rather than one. Second, and more importantly, there remains the issue of explaining why the preference for parsimonious hypotheses in science has been as successful as it has been. Making the justificatory argument in the reverse direction, from simplicity to induction, has a strong historical precedent in philosophical approaches to the problem of induction, from Hume onwards. Justifying the ‘straight rule’ of induction by appeal to some general Principle of Uniformity is an initially appealing response to the skeptical challenge. However, in the absence of a defense of the underlying principle itself (one which does not, on pain of circularity, depend inductively on past success), it is unclear how much progress this represents. As with our earlier work on the Nash Equilibrium (Fellman and Post, 2004a, 2004b, 2006), this paper is a preliminary exploration of a topic which we plan to develop over the course of several papers. In future papers we will discuss decimal Gödelization as well as formal models of complexity and the problem of computational complexity (and computational classes).
References Ackerman, M. (1970) Hilbert's Invariant Theory Papers, Math Sci Press, Brookline, MA, 1970. Appel, K. and Haken, W. (1976) Every planar map is four-colorable, Bull. Amer. Math. Soc. 82 (1976) 711-712. Appel, K. and Haken, W. "Every Planar Map is Four-Colorable, II: Reducibility." Illinois J. Math. 21, 491-567, 1977. Appel, K. and Haken, W. "The Solution of the Four-Color Map Problem." Sci. Amer. 237, 108-121, 1977. Appel, K. and Haken, W. "The Four Color Proof Suffices." Math. Intell. 8, 10-20 and 58, 1986. Appel, K. and Haken, W. Every Planar Map is Four-Colorable. Providence, RI: Amer. Math. Soc., 1989. Appel, K.; Haken, W.; and Koch, J. "Every Planar Map is Four Colorable. I: Discharging." Illinois J. Math. 21, 429-490, 1977. Aquinas, T. (1945) Basic Writings of St. Thomas Aquinas, trans. A.C. Pegis, New York: Random House, p. 129. Aristotle, Posterior Analytics, transl. McKeon [1963, p. 150]. Audi, R. (ed.) (1995) The Cambridge Dictionary of Philosophy, Cambridge: Cambridge University Press. Baker, A. "Simplicity," Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/entries/simplicity/ Baker, A. (2001) "Mathematics, Indispensability and Scientific Practice," Erkenntnis, 55, 85-116. Baker, A. (2003) "Quantitative Parsimony and Explanation," British Journal for the Philosophy of Science, 54, 245-259. Barnes, E. (2000) "Ockham's Razor and the Anti-Superfluity Principle," Erkenntnis, 53, 353-74. Barnette, D., Map Coloring, Polyhedra, and the Four-Color Problem, Providence, RI: Math. Assoc. Amer., 1983. Barron, A.R., Rissanen, J. and Yu, B. (1998). The minimum description length principle in coding and modeling. IEEE Trans. Inform. Theory, 44: 2743-2760. Bennett, J. (ed.) (1965) Experiments in Plant Hybridisation (by Gregor Mendel), London: Oliver & Boyd. Bernays, P. (1922) Über Hilberts Gedanken zur Grundlegung der Arithmetik, Jahresber. Deutsch. Math. Verein. 31 (1922) 65-85. Bilaniuk, O.-M. & Sudarshan, E. (1969) "Particles Beyond the Light Barrier," Physics Today, 22, 43-52. Birkhoff, G. D. "The Reducibility of Maps." Amer. Math. J. 35, 114-128, 1913. Börger, E., Grädel, E., and Gurevich, Y. (2001) The Classical Decision Problem, Springer-Verlag, Berlin, 1997; 2nd ed., 2001. Bos, H. (2001) Redefining Geometrical Exactness, Springer-Verlag, Berlin, 2001. Bourbaki, N. (1949) "Foundations of mathematics for the working mathematician," J. Symbolic Logic 14 (1949) 1-8. Browder, F.E., ed. (1976) Mathematical Developments Arising from Hilbert Problems, Proceedings of Symposia in Pure Mathematics, vol. 28 (two parts), American Mathematical Society, Providence, 1976. Bunge, M. (1963) The Myth of Simplicity: Problems in Scientific Philosophy, Englewood Cliffs: Prentice Hall. Burgess, J. (1998) "Occam's razor and Scientific Method," in Schirn (ed.) (1998), 195-214. Cahit, I., "Spiral Chains: A New Proof of the Four Color Theorem." 18 Aug 2004. http://arxiv.org/abs/math.CO/0408247 Cantor, G. Briefe, H. Meschkowski and N. Nilson, eds., Springer-Verlag, Berlin, 1991. Carathéodory, C. (1937) The beginning of research in the calculus of variations, Osiris 3 (1937) 224-240; also in Gesammelte mathematische Schriften, vol. 2, Beck, München, 1955, pp. 93-107. (We quote the last edition.) Chaitin, Gregory J. 1974a. Information-theoretic computational complexity. IEEE Transactions on Information Theory IT-20, pp. 10-15. Chaitin, Gregory J. 1974b. Information-theoretic limitations of formal systems. Journal of the ACM, vol. 21, pp. 403-424. Chartrand, G. "The Four Color Problem."
§9.3 in Introductory Graph Theory. New York: Dover, pp. 209-215, 1985. Church, A. (1936) A note on the Entscheidungsproblem, J. Symbolic Logic 1 (1936) 40-41; correction, 101-102. Clarke, B.S. and Barron, A.R. (1990). Information-theoretic asymptotics of Bayes methods. IEEE Trans. Inform. Theory, 36: 453-471.
Clarke, B.S. and Barron, A.R. (1994). Jeffreys' prior is asymptotically least favorable under entropy risk. J. Statistical Planning and Inference, 41: 37-60. The Clay Mathematics Institute, Millennium Prize Problems, announced May 24, 2000, Collège de France, Paris. http://www.claymath.org/prize-problems/html. Corry, L. (1996) Modern Algebra and the Rise of Mathematical Structures, Birkhäuser, Basel, 1996. Coxeter, H. S. M. "The Four-Color Map Problem, 1840-1890." Math. Teach. 52, 283-289, 1959. Curry, H. (1963) Foundations of Mathematical Logic, McGraw-Hill, New York, 1963. van Dalen, D. and Ebbinghaus, H.D. "Zermelo and the Skolem paradox," Bull. of Symbolic Logic 6 (2000) 145-161. Davis, M. (1973) "Hilbert's tenth problem is unsolvable," The American Mathematical Monthly, vol. 80, no. 3, pp. 233-269. Derbyshire, J. Prime Obsession: Bernhard Riemann and the Greatest Unsolved Problem in Mathematics. New York: Penguin, 2004. Derkse, W. (1992) On Simplicity and Elegance, Delft: Eburon. Devlin, K. "Devlin's Angle: Last Doubts Removed About the Proof of the Four Color Theorem." Jan. 2005. http://www.maa.org/devlin/devlin_01_05.html Dharwadker, A. "A New Proof of the Four Color Theorem." http://www.geocities.com/dharwadker/ Dirac, P. (1930) "The Proton," Nature, London, 126, 606. Errera, A. Du colorage de cartes et de quelques questions d'analysis situs. Paris: Gauthier-Villars, 1921. Ewald, W., ed., From Kant to Hilbert: A Source Book in the Foundations of Mathematics, 2 vols., Clarendon Press, Oxford, 1996. Fang, J. (1970) Hilbert. Towards a Philosophy of Modern Mathematics, II, Paideia, Hauppauge, 1970. Feferman, S. (1979) "What does logic have to tell us about mathematical proofs?", Math. Intelligencer 2 (1) (1979) 20-24. Fellman, P.V. and Post, J.V. (2004a) "The Nash Equilibrium Revisited: Chaos and Complexity Hidden in Simplicity," paper presented at the 5th International Conference on Complex Systems, Boston, MA, May 2004. Fellman, P.V. and Post, J.V. (2004b) "The Nash Equilibrium: Polytope Decision Spaces and Non-linear and Quantum Computational Architectures," Proceedings of the North American Association for Computation in the Social and Organizational Sciences, Carnegie Mellon University, June, 2004. Fellman, P.V. and Post, J.V. (2006) "Quantum Nash Equilibria," Proceedings of the North American Association for Computation in the Social and Organizational Sciences, Notre Dame University, June, 2006. Fetzer, J. (ed.) (1984) Principles of Philosophical Reasoning, Totowa, NJ: Rowman & Allanheld. Fichman, M. (1977) "Zoogeography and the Problem of Land Bridges," Journal of the History of Biology, 10(1), 45-63. Fisher, R. A. (1949). The Design of Experiments. London: Oliver and Boyd. Flew, A. (1996) Stephen Hawking and the Mind of God. Ford, K. (1963) "Magnetic Monopoles," Scientific American, 209, 122-31. Forster, M. (1995) "The Curve-Fitting Problem," in Audi (ed.) (1995). Forster, M. (2001) "The New Science of Simplicity," in Zellner et al. (eds.) (2001), 83-119. Forster, M. and Sober, E. (1994) "How to Tell when Simpler, More Unified, or Less Ad Hoc Theories will Provide More Accurate Predictions," British Journal for the Philosophy of Science, 45, 1-35. Franklin, P. "Note on the Four Color Problem." J. Math. Phys. 16, 172-184, 1937-1938. Franklin, P. The Four-Color Problem. New York: Scripta Mathematica, Yeshiva College, 1941. Friedman, M. (1983) Foundations of Space-Time Theories, Princeton: Princeton University Press.
Friedman, H. (1978) Classically and intuitionistically provably recursive functions. In Müller and Scott (eds.), Higher Set Theory, Lecture Notes in Mathematics 669, pp. 21-27. Springer-Verlag, Berlin. Freudenthal, H., "David Hilbert," in Biographical Dictionary of Mathematicians, C. Gillespie, ed., Scribner's, New York, 1991, pp. 1052-1058; published earlier in the Dictionary of Scientific Biography, Scribner's, New York, 1970. Galileo, G. Dialogue Concerning the Two Chief World Systems, transl. Drake (1962), Berkeley. Gardner, M. "Mathematical Games: The Celebrated Four-Color Map Problem of Topology." Sci. Amer. 203, 218-222, Sep. 1960.
Gardner, M. "The Four-Color Map Theorem." Ch. 10 in Martin Gardner's New Mathematical Diversions from Scientific American. New York: Simon and Schuster, pp. 113-123, 1966. Gardner, M. "Mathematical Games: Six Sensational Discoveries that Somehow or Another have Escaped Public Attention." Sci. Amer. 232, 127-131, Apr. 1975. Gardner, M. "Mathematical Games: On Tessellating the Plane with Convex Polygons." Sci. Amer. 232, 112-117, Jul. 1975. Garfinkle, R., Celestial Matters, Tor, 1996. Gauch, H. (2003) Scientific Method in Practice, Cambridge: Cambridge University Press. Gentzen, G. (1969) The Collected Papers of Gerhard Gentzen, M. Szabo, ed., North Holland, Amsterdam, 1969 (includes Die Widerspruchsfreiheit der reinen Zahlentheorie, Math. Annal. 112 (1936) 493-565). Gierer, A. (1970) Der physikalische Grundlegungsversuch in der Biologie und das psychophysische Problem, Ratio 1 (1970) 40-54; English version in the English edition of this journal: The physical foundations of biology and the psychophysic problem, Ratio 12 (1970) 47-64. Gierer, A. (1997) Gödel meets Carnap: a prototypical discourse on science and religion, Zygon 32 (1997) 207-217. Gödel, Kurt. 1933. Zur intuitionistischen Arithmetik und Zahlentheorie. In Ergebnisse eines mathematischen Kolloquiums, Heft 4, pp. 34-38. Gödel, K. Die Vollständigkeit der Axiome des logischen Funktionenkalküls, Monatshefte für Mathematik und Physik 37 (1930) 349-360 (Gödel's Dissertation, 1929); also with English transl. in Collected Works, S. Feferman et al., eds., vol. 1, Oxford University Press, Oxford, 1986, pp. 60-101. Gödel, K. Nachtrag zu der Diskussion zur Grundlegung der Mathematik am Sonntag, dem 7. Sept. 1930 [Supplement to the discussion [33] on the Königsberg Meeting, September 7, 1930], Erkenntnis 2 (1930) 149-151. Gödel, K. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I, Monatshefte für Mathematik und Physik 38 (1) (1931) 173-198; English translation, On formally undecidable propositions of Principia Mathematica and related systems, I, in The Undecidable, M. Davis, ed., Raven Press, Hewlett, NY, 1965, pp. 5-38; further English edition by R. B. Braithwaite, Dover, New York, 1992; with English translation in Collected Works, S. Feferman et al., eds., vol. 1, Oxford University Press, 1986, pp. 144-195. Gödel, K. Über die Länge von Beweisen, Erg. Math. Koll. 7 (1934) 23-24. Gödel, K. Letters to Constance Reid from March 22, 1966, and June 25, 1969; letters in the possession of Mrs. Reid, not yet included in Gödel's Collected Works, S. Feferman et al., eds., but partly quoted in Reid [87]. "We quote with the kind permission of the Institute for Advanced Study, Princeton, NJ." Gödel, K. Letter to Ernst Zermelo from October 12, 1931, University of Freiburg i. Br., Universitätsarchiv, C 129/36. Goodman, N. (1955) Fact, Fiction, and Forecast, Cambridge, MA: Harvard University Press. Goodman, N. & Quine, W. (1947) "Steps Toward a Constructive Nominalism," Journal of Symbolic Logic, 12, 105-122. Grattan-Guinness, I. (2000) "A sideways look at Hilbert's twenty-three problems of 1900," Notices Amer. Math. Soc. 47 (2000) 752-757; corrections by G. H. Moore and response of Grattan-Guinness in Letters to the Editor, Notices 48 (2001) 167. Gray, J. The Hilbert Challenge, Oxford University Press, Oxford, 2000. Griffith, P.A. (2000) "Mathematics at the turn of the millennium," Amer. Math. Monthly 107 (2000) 1-14. Groarke, L.
(1992) "Following in the Footsteps of Aristotle: the Chicago School, the Glue-stick and the Razor," Journal of Speculative Philosophy, 6(3), 190-205. Hahn, H. et al. (1930) Diskussion zur Grundlegung der Mathematik am Sonntag, dem 7. Sept. 1930 [Discussion at the second Meeting Erkenntnislehre in Königsberg, September 5-7, 1930], Erkenntnis 2 (1930) 135-149. Note both the Vorbemerkung of the editors on the page before the content and Gödel's supplement [25]. Halsted, G.B. (1900) The International Congress of Mathematicians, American Mathematical Monthly 7 (1900) 188-189. Halyo, V. et al., "Search for free fractional electric charge elementary particles using an automated Millikan oil drop technique," Physical Review Letters, v. 84, no. 12, pp. 2576-2579, 2000. Hardy, G.H. (1940) A Mathematician's Apology, Cambridge University Press, Cambridge, 1940; reprinted with a foreword by C. P. Snow, 1967. (We quote the enlarged 1967 edition.) Hawking, S. (1988) A Brief History of Time, New York: Bantam, 1988, p. 193. Heawood, P. J. "Map Colour Theorems." Quart. J. Math. 24, 332-338, 1890.
Heawood, P. J. "On the Four-Color Map Theorem." Quart. J. Pure Math. 29, 270-285, 1898. Heyting, A. (1934) Mathematische Grundlagenforschung. Intuitionismus. Beweistheorie, Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 3, Heft 4, Springer-Verlag, Berlin. Hilbert, D. Mathematische Notizbücher, 3 notebooks, Niedersächsische Staats- und Universitätsbibliothek Göttingen, Handschriftenabteilung, Cod. Ms. D. Hilbert 600:1-3. Hilbert, D. Invariantentheorie, lecture notes from Winter 1886 by Hilbert, Niedersächsische Staats- und Universitätsbibliothek Göttingen, Handschriftenabteilung, Cod. Ms. D. Hilbert 521. Hilbert, D. Über die Theorie der algebraischen Invarianten, submitted to the International Mathematical Congress in Chicago 1893, in Mathematical Papers Published by the American Mathematical Society, vol. 1, E. H. Moore et al., eds., Macmillan, New York, 1896, pp. 116-124. Hilbert, D. Theorie der algebraischen Invarianten nebst Anwendungen auf Geometrie, lecture notes from Summer 1897 prepared by Sophus Marxsen, Library of the Mathematical Institute of the University of Göttingen; English translation Theory of Algebraic Invariants, R. C. Laubenbacher and B. Sturmfels, eds. [using a different copy from the Mathematics Library of Cornell University], Cambridge University Press, Cambridge, 1993. (We quote Laubenbacher's translation.) Hilbert, D. Über den Zahlbegriff, Jahresber. Deutsch. Math. Verein. 8 (1900) 180-184; also in [57] from the 3rd to 7th editions, pp. 256-262 (3rd ed., 1909), pp. 237-242 (4th to 6th ed., 1913, 1922, and 1923), pp. 241-246 (7th ed., 1930); English translation (by W. Ewald) [17:2, pp. 1089-1095]. Hilbert, D. Über die Grundlagen der Logik und der Arithmetik, in Verhandlungen des 3. Internationalen Mathematiker-Kongresses in Heidelberg 1904, A. Krazer, ed., Teubner-Verlag, Leipzig, 1905, pp. 174-185; also in [57] from the 3rd to the 7th editions, pp. 263-279 (3rd ed., 1909), pp. 243-258 (4th to 6th ed., 1913, 1922, and 1923), pp. 247-261 (7th ed., 1930); English translations (by G. B. Halsted) The Monist 15 (1905) 338-352 and (by W. Woodward) in From Frege to Gödel. A Source Book in Mathematical Logic 1879-1931, J. van Heijenoort, ed., Harvard University Press, Cambridge, 1967, pp. 130-138; 2nd ed., 1971; French translation L'Enseignement Mathématique 7 (1905) 89-103. Hilbert, D. Logische Prinzipien des mathematischen Denkens, lecture notes from Summer 1905 prepared by E. Hellinger, Library of the Mathematical Institute of the University of Göttingen. Hilbert, D. Logische Prinzipien des mathematischen Denkens, lecture notes from Summer 1905 prepared by M. Born, Niedersächsische Staats- und Universitätsbibliothek Göttingen, Handschriftenabteilung, Cod. Ms. D. Hilbert 558a. Hilbert, D. Principien der Mathematik, lecture notes from Winter 1917 prepared by P. Bernays, Library of the Mathematical Institute of the University of Göttingen. Hilbert, D. Axiomatisches Denken, Math. Annal. 78 (1918) 405-415; also in [58:3, pp. 146-156]; English translations (by Ewald) [17:2, pp. 1105-1115] and (by Fang) [18, pp. 187-198]; French translation L'Enseignement Mathématique 20 (1918) 122-136. Hilbert, D. Neubegründung der Mathematik, Erste Mitteilung, Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg 1 (1922) 157-177; also in [58:3, pp. 157-177]; English translation (by W. Ewald) [17:2, pp. 1115-1134]. Hilbert, D. Wissen und mathematisches Denken, lecture notes from Winter 1922 prepared by W. Ackermann; revised reprint, C.-F.
Bödigheimer, ed., Mathematisches Institut Göttingen, 1988. Hilbert, D. Die logischen Grundlagen der Mathematik, Math. Annal. 88 (1923) 151-165; also in [58:3, pp. 178-191]; English translation (by W. Ewald) [17:2, pp. 1134-1148]. Hilbert, D. Über das Unendliche, Math. Annal. 95 (1926) 161-190; also in [57, 7th ed., pp. 262-288], shortened version in Jahresber. Deutsch. Math. Verein. 36 (1927) 201-215; English translation (by S. Bauer-Mengelberg) in From Frege to Gödel. A Source Book in Mathematical Logic 1879-1931, J. van Heijenoort, ed., Harvard University Press, Cambridge, 1967, pp. 367-392; 2nd ed., 1971; French translation (by A. Weil), Sur l'infini, Acta Math. 48 (1926) 91-122. Hilbert, D. Die Grundlagen der Mathematik, [Zweite Mitteilung], Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg 6 (1928) 65-85; followed by Diskussionsbemerkungen zu dem zweiten Hilbertschen Vortrag by H. Weyl, pp. 86-88, and Zusatz zu Hilberts Vortrag by P. Bernays, pp. 89-95; shortened version in [57, 7th ed., pp. 289-312]; English translation (by S. Bauer-Mengelberg and D. Follesdal) in From Frege to Gödel. A Source Book in Mathematical Logic 1879-1931, J. van Heijenoort, ed., Harvard University Press, Cambridge, 1967, pp. 464-479; 2nd ed., 1971.
Hilbert, D. Probleme der Grundlegung der Mathematik, in Atti del Congresso Internazionale dei Matematici, Bologna 1928, Tomo I, Zanichelli, Bologna, 1929, pp. 135-141; also in Math. Ann. 102 (1930) 1-9, [57, 7th ed., pp. 313-323], and Hilbert. Gedenkband, K. Reidemeister, ed., Springer-Verlag, Berlin, 1971, pp. 9-19. Hilbert, D. Naturerkennen und Logik, Naturwissenschaften (1930), 959-963; also in Gesammelte Abhandlungen [58:3, pp. 378-387]; English translation (by W. Ewald) in [17:2, pp. 1157-1165]; selected parts in Reid [87, chap. 12]; English translation of the corresponding radio broadcast in Vinnikov [102]. Hilbert, D. Die Grundlegung der elementaren Zahlenlehre, Math. Annal. 104 (1931) 485-494; shortened version in the Gesammelte Abhandlungen [58:3, pp. 192-195]; English translation (by W. Ewald) in [17:2, pp. 1148-1157]. Hilbert, D. Grundlagen der Geometrie, Teubner-Verlag, Leipzig, 1899; from the 8th amended ed. by P. Bernays with supplements, Stuttgart, 1956-1987; 14th ed., with contributions by H. Kiechle et al., M. Toepell, ed., Stuttgart, 1999; first English translation The Foundations of Geometry by E. J. Townsend, Open Court, Chicago, 1902; new translation by L. Unger, Open Court, La Salle, 1972. [The first German edition of the Grundlagen was a contribution to the Festschrift zur Enthüllung des Gauss-Weber Denkmals in Göttingen; later editions were published separately.] Hilbert, D. Gesammelte Abhandlungen, 3 vols., Springer-Verlag, Berlin, 1932-1935; 2nd ed., 1970. Hilbert, D. and Ackermann, W. (1928) Grundzüge der theoretischen Logik, Springer-Verlag, Berlin, 1928; 6th ed., 1972; English translation of the second edition Principles of Mathematical Logic, by L. M. Hammond et al., Chelsea, New York, 1950; Chinese translation Science Press, Beijing, 1950; Russian translation Nauka, Moscow, 1979. Hilbert, D. and Bernays, P. (1934) Grundlagen der Mathematik, 2 vols., Springer-Verlag, Berlin, 1934-1939; 2nd ed., 1968-70; Russian translation Nauka, Moscow, 1979-1982. Holsinger, K. (1981) "Comment: the Blunting of Occam's razor," Canadian Journal of Zoology, 59, 144-6. Hubbard, R. & Bayarri, M. J. (2003). Confusion over measures of evidence (p's) versus errors (alpha's) in classical statistical testing, American Statistician, 57, 171-178. Hungerford, T. Algebra, 8th ed., Springer-Verlag, New York, 1974. Hutchinson, J. P. and Wagon, S. "Kempe Revisited." Amer. Math. Monthly 105, 170-174, 1998. Jahnke, H.N. (1990) "Hilbert, Weyl und die Philosophie der Mathematik," Math. Semesterberichte 37 (1990) 157-179. Jeffreys, H. (1961) Theory of Probability, Oxford: Clarendon Press. Kant, I. The Critique of Pure Reason, transl. Kemp Smith (1950), London. Kemeny, J. (1959) A Philosopher Looks at Science, New York: Van Nostrand. Kempe, A. B. "On the Geographical Problem of the Four Colours." Amer. J. Math. 2, 193-200, 1879. Kittell, I. "A Group of Operations on a Partially Colored Map." Bull. Amer. Math. Soc. 41, 407-413, 1935. Knight, W. "Computer Generates Verifiable Mathematics Proof." New Scientist Breaking News, Apr. 19, 2005. http://www.newscientist.com/article.ns?id=dn7286 Koetsier, T. (2001) Hilberts 24ste probleem, Nieuw Archief voor Wiskunde 5 (2) (2001) 65-67. [The canceled 24th problem is reproduced in facsimile.] Koetsier, T. and van Mill, J. (1997) General topology, in particular dimension theory, in the Netherlands, in Handbook of the History of General Topology, vol. 1, C. E. Aull and R. Lowen, eds., Kluwer, Dordrecht, 1997, pp. 135-180.
Kontkanen, P., Myllymäki, P., Silander, T., Tirri, H., and Grünwald, P. (2000). On Predictive Distributions and Bayesian Networks. Statistics and Computing, 10, 39-54. Kragh, H. (1981) "The Concept of the Monopole: a Historical and Analytical Case-Study," Studies in History and Philosophy of Science, 12(2), 141-72. Kraitchik, M. §8.4.2 in Mathematical Recreations. New York: W. W. Norton, p. 211, 1942. Kreisel, G. and Krivine, J.-L. (1967) Éléments de logique mathématique. Théorie des modèles, Dunod, Paris, 1967; English translation Elements of Mathematical Logic (Model Theory), North Holland, Amsterdam, 1967. Kreisel, G. What have we learnt from Hilbert's second problem?, in Browder [8:1, pp. 93-130]. Lam, C. W. H. (1990) How reliable is a computer-based proof?, Math. Intelligencer 12 (1) (1990) 8-12. Kreisel, G. (1958) Mathematical significance of consistency proofs. Journal of Symbolic Logic 23, pp. 155-182.
Kuhn, T. (1977) "Objectivity, Value Judgment, and Theory Choice," in The Essential Tension, Chicago: University of Chicago Press, 320-39. Lange, M. (1995) "Spearman's Principle," British Journal for the Philosophy of Science, 46, 503-52. Lavoisier, A. (1862) "Réflexions sur le Phlogistique," in Oeuvres, vol. 2, 623-4, Paris: Imprimerie Impériale. Leivant, D. (1985) Syntactic translations and provably recursive functions. Journal of Symbolic Logic, vol. 50, pp. 682-688. Lemoine, E.M.H. (1902) La Géométrographie ou l'Art des Constructions Géométriques, Naud, Paris, 1902. Majer, U. (2002) Hilbert's program to axiomatize physics (in analogy to geometry), in History of Philosophy and Science, M. Heidelberger and F. Stadler, eds., Kluwer, Dordrecht, 2002, pp. 213-224. Lewis, D. (1973) Counterfactuals, Oxford: Basil Blackwell. Lipsey, M. W. (1990). Design Sensitivity: Statistical Power for Experimental Research. Newbury Park: Sage Publications. Ludbrook, J. & Dudley, H. (1998). Why permutation tests are superior to t and F tests in biomedical research, American Statistician, 52, 127-133. Matiyasevich, Yuri V. 1970. Diofantovost' perechislimykh mnozhestv. Doklady Akademii Nauk SSSR, vol. 191, no. 2, pp. 279-282 (Russian); English translation, Enumerable sets are Diophantine, Soviet Mathematics Doklady, vol. 11, no. 2, pp. 354-358. Matiyasevich, Yuri V. 1993. Hilbert's Tenth Problem. M.I.T. Press, Cambridge, Mass. Maurer, A. (1984) "Ockham's Razor and Chatton's Anti-Razor," Mediaeval Studies, 46, 463-75. May, K. O. "The Origin of the Four-Color Conjecture." Isis 56, 346-348, 1965. McKeon, R. (1941) The Basic Works of Aristotle, New York: Random House. Miller, J. K. & Knapp, T. R. (1978). The importance of statistical power in educational research (ERIC Document Reproduction Service No. ED 152 838). Minkowski, H. Briefe an David Hilbert, L. Rüdenberg, ed., Springer-Verlag, Berlin, 1973. [The 105 letters are preserved in the Niedersächsische Staats- und Universitätsbibliothek Göttingen, Handschriftenabteilung, Nachlass Hilbert, Cod. Ms. D. Hilbert 258.] Menzler-Trott, E. Gentzens Problem (including an essay on Gentzen's proof theory by J. von Plato), Birkhäuser Verlag, Basel, 2001. [A shortened English translation is in preparation.] Meschkowski, H. Richtigkeit und Wahrheit, Bibliographisches Institut, Mannheim, 1978. Meyer, F. Bericht über den gegenwärtigen Stand der Invariantentheorie, Jahresber. Deutsch. Math. Verein. 1 (1890-91) 81-292. Müller, G.H. (1987) Ω-Bibliography of Mathematical Logic, 6 vols., Springer-Verlag, Berlin, 1987. Morgenstern, C. and Shapiro, H. "Heuristics for Rapidly 4-Coloring Large Planar Graphs." Algorithmica 6, 869-891, 1991. Murawski, R. (1999) Recursive Functions and Metamathematics: Problems of Completeness and Decidability, Gödel's Theorems, Kluwer, Dordrecht, 1999. von Neumann, J. (1927) Zur Hilbertschen Beweistheorie, Math. Zeitschrift 26 (1927) 1-46. Nash, L. (1963) The Nature of the Natural Sciences, Boston: Little, Brown. Nelson, G. (1978) "From Candolle to Croizat: Comments on the History of Biogeography," Journal of the History of Biology, 11(2), 269-305. Newton, I. (1964) The Mathematical Principles of Natural Philosophy, New York: Citadel Press. Neyman, J. & Pearson, E. S. (1928). On the use and interpretation of certain test criteria for purposes of statistical inference.
Part I and II. Biometrika, 20, 175-240, 263-294. Neyman, J. & Pearson, E. S. (1933a). The testing of statistical hypotheses in relation to probabilities a priori, Proceedings of the Cambridge Philosophical Society, 29, 492-510. Neyman, J. & Pearson, E. S. (1933b). On the problem of the most efficient tests of statistical hypotheses, Philosophical Transactions of the Royal Society, Series A, 231, 289-337. Nolan, D. (1997) "Quantitative Parsimony," British Journal for the Philosophy of Science, 48, 329-43. Nolan, D. (1999) "Is Fertility Virtuous in its Own Right?", British Journal for the Philosophy of Science, 50, 265-82. Northrop, F.S.C. (1947) The Logic of the Sciences and the Humanities, New York: Macmillan, 1947. Norton, J.D. (2000) "'Nature is the realisation of the simplest conceivable mathematical ideas': Einstein and the canon of mathematical simplicity," Studies in the History of Modern Physics 31 (2000) 135-170. Ore, Ø. The Four-Color Problem. New York: Academic Press, 1967.
Ore, Ø. and Stemple, G. J. "Numerical Methods in the Four Color Problem." Recent Progress in Combinatorics (Ed. W. T. Tutte). New York: Academic Press, 1969. Pappas, T. "The Four-Color Map Problem: Topology Turns the Tables on Map Coloring." The Joy of Mathematics, San Carlos, CA: Wide World Publ./Tetra, pp. 152-153, 1989. Parshall, K.V.H. (1990) The one-hundredth anniversary of the death of invariant theory?, Math. Intelligencer 12 (4) (1990) 10-16. Peckhaus, V. (1990) Hilbertprogramm und Kritische Philosophie, Vandenhoeck & Ruprecht, Göttingen, 1990. Peckhaus, V. (1994) Hilbert's axiomatic programme and philosophy, in The History of Modern Mathematics, vol. 3, E. Knobloch and D. Rowe, eds., Academic Press, Boston, 1994, pp. 90-112. Pier, J.-P., ed., Development of Mathematics, 1900-1950 and 1950-2000, 2 vols., Birkhäuser, Basel, 1994 and 2000. Popper, K. (1959) The Logic of Scientific Discovery, London: Hutchinson. Pudlak, P. (1998) "The lengths of proofs," in Handbook of Proof Theory, S. R. Buss, ed., Elsevier, Amsterdam, 1998, pp. 547-637. Quine, W. (1966) "On Simple Theories of a Complex World," in The Ways of Paradox, New York: Random House. Quine, W. (1981) Theories and Things, Cambridge, MA: Harvard University Press. Raatikainen, Panu. 1998. On interpreting Chaitin's incompleteness theorem. Journal of Philosophical Logic. Forthcoming. Rebsdorf, S. and Kragh, H., eds., Edward Arthur Milne: The relations of mathematics to science, Studies in History and Philosophy of Modern Physics 33 (2002) 51-64. Redei, M. (1999) "Unsolved Problems in Mathematics": John von Neumann's address to the ICM, Amsterdam, 1954, Math. Intelligencer 21 (4) (1999) 7-12. Reich, K. (1993) "The American contribution to the theory of differential invariants, 1900-1916," in The Attraction of Gravitation, J. Earman et al., eds., Birkhäuser, Boston, 1993, pp. 225-247. Reid, C. (1970) Hilbert, Springer-Verlag, New York, 1970; reprinted Copernicus, New York, 1996. von Renteln, M. (1994) Zur Situation der Analysis um die Jahrhundertwende, in Vorlesungen zum Gedenken an Felix Hausdorff, E. Eichhorn and E. J. Thiele, eds., Heldermann, Berlin, 1994, pp. 107-130. Rescher, N. (1998) Complexity: a Philosophical Overview, New Brunswick, NJ: Transaction. Ringel, G. and Youngs, J. W. T. "Solution of the Heawood Map-Coloring Problem." Proc. Nat. Acad. Sci. USA 60, 438-445, 1968. Rissanen, J. (1996). Fisher information and stochastic complexity, IEEE Trans. Information Theory, 42, 40-47. Robertson, N.; Sanders, D. P.; Seymour, P. D.; and Thomas, R. "A New Proof of the Four Colour Theorem." Electron. Res. Announc. Amer. Math. Soc. 2, 17-25, 1996. Robertson, N.; Sanders, D. P.; and Thomas, R. "The Four-Color Theorem." http://www.math.gatech.edu/~thomas/FC/fourcolor.html Rota, G.C. (1999) Two turning points in invariant theory, Math. Intelligencer 21 (1) (1999) 20-27. Saaty, T. L. and Kainen, P. C., The Four-Color Problem: Assaults and Conquest. New York: Dover, 1986. Sarkar, S. & Pfeifer, J. (eds.) (2003) The Philosophy of Science: An Encyclopedia, London: Routledge. Scarpellini, B. (2000) Komplexitätstheorie, Wissenschaftsmagazin der Universität Basel "uni nova" 87 (6) (2000) 51-52. Scharfstein, B. (1989) The Philosophers: Their Lives and the Nature of their Thought, Oxford University Press. Scharfstein, B. (1993) Ineffability: The Failure of Words in Philosophy and Religion, State University of New York Press. Schirn, M. (ed.) (1998) The Philosophy of Mathematics Today, Oxford, New York: Clarendon Press. Schlesinger, G.
(1963) Method in the Physical Sciences, London: Routledge. Schulte, O. (1999) "Means-End Epistemology," British Journal for the Philosophy of Science, 50, 1-31. Scott, C.A. (1900) The International Congress of Mathematicians in Paris, Bull. Amer. Math. Soc. 7 (1900) 57-79. Serre, J.-P. (1965) Algèbre Locale, Multiplicités, Lecture Notes in Mathematics, no. 11, Springer-Verlag, Berlin, 1965. Sheehan, W., Kollerstrom, N. and Waff, C.B. "The Case of the Pilfered Planet," Scientific American, Dec. 2004. Sigmund, K. (1995) "Hans Hahn and the foundational debate," in The Foundational Debate, W. Depauli-Schimanovich et al., eds., Kluwer, Dordrecht, 1995, pp. 235-245.
Simon, H. A. (1972). Complexity and the representation of patterned sequences of symbols. Psychological Review, 79, 369-382. [Perceptual coding languages; empirical confirmation of the Invariance Theorem as later articulated in MDL] Simpson, S.G. (1988) Partial realizations of Hilbert's program, J. Symb. Logic 53 (1988) 349-363. Skiena, S., Implementing Discrete Mathematics: Combinatorics and Graph Theory with Mathematica. Reading, MA: Addison-Wesley, p. 210, 1990. Skolem, Thoralf. (1934) Über die Nicht-charakterisierbarkeit der Zahlenreihe mittels endlich oder abzählbar unendlich vieler Aussagen mit ausschließlich Zahlenvariablen. Fundamenta Mathematicae, vol. 23, pp. 150-161. Smart, J. (1984) "Ockham's Razor," in Principles of Philosophical Reasoning, ed. Fetzer, 118-28. Smorynski, Craig. 1991. Logical Number Theory I. Springer-Verlag, Berlin. Sober, E. (1981) "The Principle of Parsimony," British Journal for the Philosophy of Science, 32, 145-56. Sober, E. (1988) Reconstructing the Past: Parsimony, Evolution and Inference, Cambridge, MA: MIT Press. Sober, E. (1994) "Let's Razor Ockham's Razor," in From a Biological Point of View, Cambridge: Cambridge University Press, 136-57. Sober, E. (2001) "What is the Problem of Simplicity?" in Zellner et al. (eds.) (2001), 13-31. Sober, E. (2003) "Parsimony," in Sarkar & Pfeifer (eds.) (2003). Swinburne, R. (1997) Simplicity as Evidence for Truth, Milwaukee: Marquette University Press. Sobocinski, B. (1956) On well constructed axiom systems, in Polskie Towarzystwo Naukowe na Obczyznie [Polish Society of Arts and Sciences Abroad], Yearbook for 1955-56, London, 1956, pp. 1-12. Sommerfeld, A. (1959) "To Albert Einstein's Seventieth Birthday," in P.A. Schilpp (ed.) Albert Einstein: Philosopher-Scientist, Harper Torchbooks, Vol. 1, 1959, p. 103. Steinbach, P., Field Guide to Simple Graphs, Albuquerque, NM: Design Lab, 1990. Steinhaus, H., Mathematical Snapshots, 3rd edn., New York: Dover, pp. 274-275, 1999. Stevens, J. (1992). Applied Multivariate Statistics for the Social Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates. Stöltzner, M. (2002) "How metaphysical is 'Deepening the foundations'? Hahn and Frank on Hilbert's axiomatic method," in History of Philosophy of Science, F. Stadler, ed., Kluwer, Dordrecht, 2002, pp. 245-262. Tait, P. G. "Note on a Theorem in Geometry of Position." Trans. Roy. Soc. Edinburgh 29, 657-660, 1880. Taylor, R. and Wiles, A. Ring-theoretic properties of certain Hecke algebras, Ann. Math. 141 (1995) 553-572. Thiele, R. (2000) Hilbert and his twenty-four problems, in Mathematics at the Dawn of a Millennium, Proceedings of the Canadian Society for History and Philosophy of Mathematics, Hamilton, Ont., 2000, vol. 13, M. Kinyon, ed., pp. 1-22. [Some documents including the canceled twenty-fourth problem are reproduced in facsimile.] Thiele, R. (2001) David Hilbert and French Mathematics, in Proceedings of the XXXI International Congress of History of Science, Mexico City, Mexico, 2001 (to appear). Thiele, R. (2003) Von der Bernoullischen Brachistochrone zum Kalibratorkonzept. Untersuchungen zur Geschichte der Feldtheorie bei einfachen Variationsproblemen, Habilitationsschrift, Universität Hamburg, Fachbereich Mathematik, 2001; Brepols, Turnhout, 2003 (to appear). Thiele, R. (2003) "Hilbert's twenty-fourth problem," Amer. Math. Monthly, Jan. 2003. http://www.findarticles.com/p/articles/mi_qa3742/is_200301/ai_n9227477/print Thomas, R. (1998) "An Update on the Four-Color Theorem." Not. Amer. Math. Soc.
45, 848-859, 1998. Thorburn, W. (1918) "The Myth of Occam's Razor," Mind, 27, 345-53. Troelstra, Anne S. 1973. Metamathematical Investigations of Intuitionistic Arithmetic and Analysis. Springer-Verlag, Berlin. Troelstra, Anne S. and van Dalen, Dirk. 1988. Constructivism in Mathematics: An Introduction, vols. I-II. North Holland, Amsterdam. Turing, Alan M. 1936-1937. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, ser. 2, vol. 42, pp. 230-265; correction, ibid. 43, pp. 544-546. Turk, H.C. (1987) Ether Ore, Tor Books, 1987, ISBN 0-812-55635-6. Velleman, D. (1997) Fermat's last theorem and Hilbert's program, Math. Intelligencer 19 (1) (1997) 64-67. Vinnikov, V. (1999) We shall know: Hilbert's apology, Math. Intelligencer 21 (1) (1999) 42-46. Vitányi, P. & Li, M. (2001) "Simplicity, Information, Kolmogorov Complexity and Prediction," in Zellner et al. (eds.) (2001), 135-55.
Vollmer, G. Ungelöste und unlösbare Probleme, in Fortschritt und Gesellschaft, E.-L. Winnacker, ed., Hirzel-Verlag, Stuttgart, 1993, pp. 79-97; also in G. Vollmer, Wissenschaftstheorie im Einsatz, Hirzel-Verlag, Stuttgart, 1993, pp. 183-210. Vollmer, G. (1991) Denkzeuge, in Mannheimer Forum 90/91, E. P. Fischer, ed., Piper, München, 1991, pp. 15-78. Weil, A. The future of mathematics, in Great Currents of Mathematical Thought, vol. 1, F. Le Lionnais, ed. (translated from the French by A. Dresden), Dover, New York, 1971, pp. 320-336. Wagon, S. (1998) "An April Fool's Hoax." Mathematica in Educ. Res. 7, 46-52, 1998. Wagon, S., Mathematica in Action, 2nd edn., New York: Springer-Verlag, pp. 535-536, 1999. Walsh, D. (1979) "Occam's razor: A Principle of Intellectual Elegance," American Philosophical Quarterly, 16, 241-4. Wang, C. (1993). Sense and Nonsense of Statistical Inference: Controversy, Misuse, and Subtlety. New York: Marcel Dekker. Weinberg, S. (1993) Dreams of a Final Theory, Vintage, London, 1993. Weisstein, Eric W. "Proof." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/Proof.html Weisstein, Eric W. "Frivolous Theorem of Arithmetic." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/FrivolousTheoremofArithmetic.html Wells, D., The Penguin Dictionary of Curious and Interesting Numbers, Middlesex, England: Penguin Books, p. 57, 1986. Wells, D., The Penguin Dictionary of Curious and Interesting Geometry, London: Penguin, pp. 81-82, 1991. Weyl, H. (1924) Randbemerkungen zu Hauptproblemen der Mathematik, Math. Zeitschrift 20 (1924) 131-150. Weyl, H. (1944) David Hilbert and his mathematical work, Bull. Amer. Math. Soc. 50 (1944) 612-654. Weyl, H. (1951) A half-century of mathematics, Amer. Math. Monthly 58 (1951) 523-553; also in Gesammelte Abhandlungen, Springer-Verlag, Berlin, 1968, pp. 464-494. Weyl, H. (1946) The Classical Groups: Their Invariants and Representations, 2nd ed., Princeton University Press, Princeton, 1946. Wiles, A. (1995) "Modular elliptic curves and Fermat's last theorem," Ann. Math. 141 (1995) 443-551. Woodward, J. (2003) Making Things Happen: A Theory of Causal Explanation, Oxford: Oxford University Press. Wos, L. (1998) Automating the search for elegant proofs, J. Automated Reasoning 21 (1998) 135-175. Wos, L. and Thiele, R. Hilbert's new problem, University of Lodz, Bull. of the Section of Logic 30 (2001) 165-175. Yandell, B.H. (2002) The Honors Class: Hilbert's Problems and Their Solvers, A. K. Peters, Natick, MA, 2002. Yu, Chong Ho (Alex) "Don't believe in the Null Hypothesis?" http://seamonkey.ed.asu.edu/~alex/computer/sas/hypothesis.html Zermelo, E. (1931) Letters to K. Gödel from September 21, 1931, and October 29, 1931, University of Freiburg i. Br., Universitätsarchiv, C 129/36; reprinted in I. Grattan-Guinness, In memoriam Kurt Gödel: His 1931 correspondence with Zermelo on his incompletability theorem, Historia Math. 6 (1979) 294-304; see also the addition of J. W. Dawson, Completing the Gödel-Zermelo correspondence, Historia Math. 12 (1985) 66-70. Zellner, A., Keuzenkamp, H. & McAleer, M. (eds.) (2001) Simplicity, Inference and Modelling: Keeping It Sophisticatedly Simple, Cambridge: Cambridge University Press.