Oracles and Advice as Measurements

Edwin Beggs 1⋆, José Félix Costa 2,3, Bruno Loff 2,3, and John V Tucker 1

1 School of Physical Sciences, Swansea University, Singleton Park, Swansea, SA2 8PP, Wales, United Kingdom. [email protected], [email protected]
2 Department of Mathematics, Instituto Superior Técnico, Universidade Técnica de Lisboa, Lisboa, Portugal. [email protected], [email protected]
3 Centro de Matemática e Aplicações Fundamentais do Complexo Interdisciplinar, Universidade de Lisboa, Lisboa, Portugal

Abstract. In this paper we will try to understand how oracles and advice functions, which are mathematical abstractions in the theory of computability and complexity, can be seen as physical measurements in Classical Physics. First, we consider how physical measurements are a natural external source of information to an algorithmic computation. We argue that oracles and advice functions can help us to understand how the structure of space and time has information content that can be processed by Turing machines (after Cooper and Odifreddi [10] and Copeland and Proudfoot [11, 12]). We show that non-uniform complexity is an adequate framework for classifying feasible computations by Turing machines interacting with an oracle in Nature. By classifying the information content of such an oracle using Kolmogorov complexity, we obtain a hierarchical structure for advice classes.

1 Introduction

In computability theory, the basic operations of algorithmic models, such as register machines, may be extended with sets, or (partial) functions, called “oracles”. For example, in Turing’s original conception, any set S can be used as an oracle in an algorithm as follows: from time to time in the course of a computation, an algorithm produces a datum x and asks: is x ∈ S? The basic properties of universality, undecidability, etc., can be proved for these S-computable functions. Technically, there is nothing special about the operations chosen as basic in an algorithmic model, a fact characteristic of computability theories over abstract algebras [22]. Oracles are seen as abstract theoretical entities, technical devices whose use is to compare and classify sets by means of degree theories and hierarchies: see [23].

⋆ Corresponding author.

However, here we will argue that it is a useful, interesting, even beautiful, endeavour to develop a computability theory wherein oracles are natural phenomena, and to study the oracles that arise in Nature. More specifically, we will consider how physical measurements can be a natural external source of information for an algorithm, especially automata and Turing machines. First, in Section 2, we reflect on an example of an algorithm that has need of a physical oracle. Hoyle’s algorithm calculates eclipses using the ancient monument Stonehenge. Abstractly, it has the structure of an automaton with an oracle accessed by experimental observation. The discussion focusses on calculating solar eclipses and how the oracle is needed to make corrections. In examining Hoyle’s algorithm, our aim is to explore some of the essential features of digital computations dependent on analogue oracles in Nature, and to set the scene for the theory that follows. To study this conceptually complex type of computation properly, we choose a physical experiment that we have studied in some detail from the computational point of view. The Scatter Machine Experiment (SME) is an experimental procedure that measures the position of a vertex of a wedge to arbitrary accuracy [8]. Since the position may itself be arbitrary, it is possible to analyse the ways in which a simple experiment in Newtonian kinematics can measure or compute an arbitrary real in the interval [0, 1]. In [7], we examined three ways in which the SME can be used as an oracle for Turing machines and established the complexity classes of sets they defined. With this technical knowledge, we can go on to consider how physical measurements are a natural external source of information to an algorithmic computation. 
In Section 3, we summarise what we need about oracles and advice functions in order to understand how the structure of space and time may have information content that can be processed by Turing machines (after Cooper and Odifreddi [10] and Copeland and Proudfoot [11, 12]). In Section 4, we introduce reductions between advice functions and, in Section 5, concepts based on the Kolmogorov complexity measure are used to express the information content that can be processed by Turing machines. In Section 7, we apply these notions to the SME and, through the interactions between the oracle SME in Nature and the Turing machines, develop an inner structure of the advice class P/poly, similar to the one found in [3, 21].

2 Stonehenge and calculating with an oracle

2.1 Hoyle’s algorithm

Stonehenge is an arrangement of massive stones in Wiltshire. Its earliest form dates from 3100 BC and is called Stonehenge I. The astronomer Sir Fred Hoyle showed in [15] that Stonehenge can be used to predict the solar and the lunar eclipse cycles. Specifically, he gave a method, which we may call Hoyle’s algorithm, to make such calculations. For our purposes it does not really matter whether the Celts used Stonehenge I to predict the eclipse cycles, but it matters that, in our times, we can use Stonehenge I to make good predictions of celestial events, such as the azimuth of the rising Sun and of the rising Moon, or that we can use this Astronomical Observatory as a predictor of eclipses (see [18] for a short introduction).

Consider the prediction of eclipses, especially the solar eclipse. This is done by a process of counting time but also requires celestial checks and making corrections. The counting of days is a purely algorithmic process. The celestial correction is an experimental process, an observation, which we interpret as consulting a physical oracle. The important structure is the circle of Aubrey holes, made of 56 stones, buried until the XVII century, and discovered by John Aubrey (see Fig. 1).

Fig. 1. A schematic drawing of Stonehenge I.

Three stones are used as counters that will be moved around the circle of Aubrey holes. The first counter counts the days of the year along the Aubrey holes; the second counter counts the days of the lunar month; finally, the third counter takes care of the Metonic cycle, in which the same phases of the moon are repeated on the same date of the year to within an hour or so, after a period of nineteen years (discovered by Meton around 430 B.C., but believed to have been known earlier); in other words, the third stone counts along the cycle of the lunar node, one of the intersection points of the ecliptic with the Moon’s orbit.

The example of Stonehenge illustrates what is meant by an oracle that arises in Nature. From the point of view of the Earth both the Moon and the Sun follow approximately circular orbits, as shown in Fig. 2, which cross at the nodes N and N′. Suppose the moon is passing through N. Then a solar eclipse will occur if the sun is no further than 15° from N, and a lunar eclipse happens if the sun is within 10° of N′. If the moon is passing through N′ the situation is reversed. One can then wait for a solar eclipse, set the three tokens in the appropriate Aubrey hole, and use the following:

Fig. 2. The approximate orbits of the Moon and the Sun around the Earth.

Simplified Hoyle’s algorithm

1. The first token, a little stone for instance, is moved along the Aubrey holes to keep track of the 28-day lunar cycle. We move the first token two places every day, since 56/2 = 28.
2. The second token counts the days of the year. Since 56 × 13/2 = 364, we move the second token two places every thirteen days.
3. The third token represents one of the nodes, say N. The nodes N and N′ themselves rotate around the Earth, describing a full cycle (the Metonic cycle) every 18.61 years. So we move the third token three times every year, because 56/3 ≈ 18.67.
4. Eclipses occur when the three tokens become aligned with each other, up to one Aubrey hole to the right or to the left.

Ignoring the error for now, we conclude that simple modulo 56 arithmetic is enough to predict every eclipse with one single necessary input, namely the day of a solar eclipse, when one sets the tokens in the first Aubrey hole. Now we introduce the celestial corrections that constitute the call to an oracle. To the northeast of Stonehenge I there is a 5 metre tall stone, called the Heelstone. On the morning of the summer solstice the sun (our oracle) rises slightly to the north of the Heelstone. To know the exact day of the summer solstice we wait for the day when the sun rises behind the Heelstone. The sunrise should then proceed north for a few days, and then back south. We count the number of days between the first sunrise behind the Heelstone and the second. The summer solstice falls in the middle of these two events. With this information we can calibrate the second token to sufficient precision every year, so that Stonehenge I can predict eclipses indefinitely.⁴

2.2 Physical oracles
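Before asking in what sense the solstice observation is an oracle call, it may help to see the simplified algorithm of Section 2.1 as plain modulo 56 arithmetic. The Python sketch below is only illustrative and rests on assumptions that the text does not fix: all three tokens move in the same direction, the third token advances one hole at each third of a counted year, "aligned" is read as every pair of tokens being within one hole, and the true year is taken to be about 365.2422 days. The function names are hypothetical.

```python
HOLES = 56  # number of Aubrey holes


def token_positions(day):
    """Holes occupied by the (moon, sun, node) tokens after `day` days.

    All tokens start in hole 0 on the day of a solar eclipse.
    """
    moon = (2 * day) % HOLES            # two holes per day: 28-day cycle
    sun = (2 * (day // 13)) % HOLES     # two holes every 13 days: 364-day year
    node = ((3 * day) // 364) % HOLES   # three single-hole moves per counted year
    return moon, sun, node


def circular_gap(a, b):
    """Distance between two holes, measured the short way round the circle."""
    d = abs(a - b) % HOLES
    return min(d, HOLES - d)


def aligned(moon, sun, node, tolerance=1):
    """Assumed reading of step 4: every pair of tokens within `tolerance` holes."""
    pairs = [(moon, sun), (sun, node), (moon, node)]
    return all(circular_gap(a, b) <= tolerance for a, b in pairs)


def years_until_drift(true_year=365.2422, model_year=364, tolerance_holes=1):
    """Years after which the uncorrected sun token is off by more than the tolerance."""
    days_per_hole = model_year / HOLES          # 6.5 days for the sun token
    drift_per_year = true_year - model_year     # error accumulated each year
    years = 0
    while years * drift_per_year <= tolerance_holes * days_per_hole:
        years += 1
    return years
```

Under these assumptions `years_until_drift()` shows that the counted year of 364 days drifts past one Aubrey hole within a few years, which is why the purely algorithmic counting has to be recalibrated by observation.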

We have described an unusual form of computation, aided by an unusual oracle. Is the measurement or observation of the summer solstice in Hoyle’s algorithm a “call to an oracle”? In our discussion we could have replaced the structure Stonehenge I with a modern computer, and corrections could be made via a link with a satellite telescope, for example. While it seems natural to consider the Sun as an oracle in the Stonehenge I algorithm described above, calling this satellite link an “oracle” may feel awkward; could one call it “input”? However, let us point out that these two sources of information have the same nature. It is customary to consider input to be finitely bounded information that is given prior to the start of the computation, but the corrections are updates that over time give, in principle, an unbounded amount of data. Without such oracles both Stonehenge I and our modern computer would eventually be incapable of predicting eclipses, although the modern computer could keep providing accurate predictions for hundreds of years. In both cases, the observations of the sun act exactly as an oracle.

Hoyle’s algorithm is an example of an algorithm with a physical oracle. Said differently, the oracle notion extended to include a physical process is just what we need to best express Hoyle’s algorithm.

Hoyle’s algorithm is also a description of a physical process. The components of Stonehenge I referring to celestial objects make a simple model of solar system dynamics: in reality we have the sky and the big circle of Aubrey holes. The algorithm is embodied by the real world. Cooper and Odifreddi, in [10], comment on this type of phenomenon: the Turing model supports (in-)computability in Nature in the sense that the Turing model is embedded in Nature in one way or another. For these authors, incomputability sounds more like an intrinsic limitation of our knowledge about the Universe than a manifesto for hypercomputation. Do these incomputabilities come out of (i) unpredictable behaviour of the model (e.g., an uncertainty based upon mathematical limitations), or (ii) a real and essential incomputability in Nature (e.g., the hypercomputational character of some physical phenomenon)? Indeed, the following conjecture is extremely debatable.

⁴ The calibration procedure explained in [15] is slightly more complicated and detailed: we only illustrate it here. The remaining tokens can also be calibrated using other oracles: the phases of the moon give the adjustment of the first token, and the precise day on which a solar eclipse occurs allows for calibration of the third token.

Conjecture O (for ‘oracle’). The Universe has non-computable information which may be used as an oracle to build a hypercomputer.

The conjecture was popularised by Penrose’s search for (ii) in [19, 20], and much can be written about it. Cooper and Odifreddi [10] have suggested similarities between the structure of the Universe and the structure of the Turing universe. Calude [9] investigates to what extent quantum randomness can be considered algorithmically random. The search for a physical oracle was proposed by Copeland and Proudfoot [12]. Their article and subsequent work have been severely criticised [13, 14] for historical and technical errors. There is, however, an appealing aesthetic side to what Copeland and Proudfoot proposed. Consider a variation of the Church–Turing thesis: the physical world is simulable. This thesis leads us to conclude that one could, in principle, construct a Turing machine that could successfully predict eclipses forever, without the use of any oracle.⁵ Being able to predict eclipses indefinitely, however, would not imply that the physical world is simulable, unless the prediction of planet alignments is, in some sense, complete for the simulation problem.

Measuring the rise of the sun to the side of the Heelstone is a human activity very close to the abstract machine we are going to describe in the following sections: the Stonehenge apparatus measures a point in space and time, whereas the device we describe next measures a point in space. Both are real numbers in classical physics.

3 Some classical results on non-uniform complexity

In this paper Σ denotes an alphabet, and $\Sigma^*$ denotes the set of words over Σ (where λ stands for the empty word). A language (or just a set) is a subset of $\Sigma^*$. The census function of a set A is the function that, for each n ∈ N, gives the number of words in A of size less than or equal to n. Almost always we will adopt the binary alphabet {0, 1}.

Definition 3.1. Let the set of finite sequences over the alphabet Σ be ordered alphanumerically (i.e. first by size, then alphabetically). The characteristic function of a language $A \subseteq \Sigma^*$ is the unique infinite sequence $\chi_A : N \to \{0, 1\}$ such that, for all n, $\chi_A(n)$ is 1 if and only if the n-th word in that order is in A.

The pairing function is the well-known homomorphism $\langle -, - \rangle : \Sigma^* \times \Sigma^* \to \Sigma^*$, computable in linear time, that encodes two words in a single word over the same alphabet by duplicating bits and inserting a separation symbol “01”. By an advice we mean any map (total function) $f : N \to \Sigma^*$. We recall the definition of non-uniform complexity class.

Definition 3.2. If $\mathcal{F}$ is a class of advices and $\mathcal{A}$ is a class of sets, then we define the new class $\mathcal{A}/\mathcal{F}$ as the class of sets B such that there exists a set $A \in \mathcal{A}$ and an advice $f \in \mathcal{F}$ such that, for every word $x \in \Sigma^*$, x ∈ B if and only if $\langle x, f(|x|) \rangle \in A$.

⁵ In this discussion, we abstract from the cosmological “knowledge” that eclipses will not happen forever.

If we fix the class P of sets decidable by Turing machines in polynomial time, we still have one degree of freedom, namely the class of advices $\mathcal{F}$ that makes $P/\mathcal{F}$. We will work in this paper with subpolynomial advices such that $\mathcal{F}$ is a class of functions with sizes bounded by polynomials and computable in polynomial time. Note that the advices are not, in general, computable; but the corresponding class of bounds is computable. E.g., if the class is poly, then any advice $f : N \to \Sigma^*$, even if non-computable, is bounded by a computable polynomial p such that, for all n ∈ N, |f(n)| ≤ p(n).

Although the class $\mathcal{F}$ of functions is arbitrary, it is useless to use functions with growth rate greater than exponential. Let exp be the set of advice functions bounded in size by functions in the class $O(2^{O(n)})$. Then P/exp contains all sets. Given this fact, we wonder if either P/poly or P/log (subclasses of P/exp) exhibit some interesting internal structure. Three main results should be recalled from [1] (Chapter 5):

Proposition 3.1. There exist sets in EXPSPACE not in P/poly. [Here EXPSPACE utilises a Turing machine whose working tape, rather than being infinite, has length bounded by an exponential in the size of the input. There is no bound on time.]

This proposition is proved by a non-trivial diagonalization of P/poly. It is an important result, since it tells us that although exp suffices as a class of advices to decide all sets in polynomial time, there exist sets that are indeed outside of P/poly.

Proposition 3.2. The characteristic function of the Sparse Halting Set is in P/poly.

This result is fundamental in that it says that there are undecidable sets in P/poly. One such set is $K = \{0^n : \text{the Turing machine coded by } n \text{ halts on input } 0\}$.

Proposition 3.3. If SAT ∈ P/log, then P = NP.
In the last proposition SAT stands for the set of binary encodings of satisfiable Boolean formulæ. This result is also considered relevant, since it shows the importance of the advice class log. A set is said to be sparse if its census is bounded by a polynomial. We also need to recall the concept of tally set: a set is said to be tally if it is a language over an alphabet of a single letter (we take this alphabet to be {0}). Tally sets are sparse (but not vice versa). For each tally set T, $\chi_T$ is defined relative to a single letter alphabet, e.g., Σ = {0}. The halting set K above is tally. The most common characterization of P/poly is given by the following statement, where by P(S) we denote the class of sets decidable by deterministic Turing machines having access to the oracle set (or language) S:

Proposition 3.4.

$$P/poly \;=\; \bigcup_{S\ \mathrm{sparse}} P(S) \;=\; \bigcup_{S\ \mathrm{tally}} P(S)$$
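The pairing function and Definition 3.2 can be made concrete with a small Python sketch. It assumes one possible reading of the pairing described above (each bit of the first word is doubled, then the separator "01", then the second word unchanged), and it uses a finite toy set of lengths as a stand-in for an advice that would, in general, not be computable. The names `pair`, `unpair`, `advice`, `in_A` and `in_B` are hypothetical.

```python
def pair(x: str, y: str) -> str:
    """Encode two binary words as one: double each bit of x, add "01", then y."""
    return "".join(bit + bit for bit in x) + "01" + y


def unpair(w: str) -> tuple[str, str]:
    """Invert `pair` in one left-to-right pass (linear time)."""
    i, x_bits = 0, []
    while w[i:i + 2] != "01":
        if i + 2 > len(w) or w[i] != w[i + 1]:
            raise ValueError("not a valid pairing")
        x_bits.append(w[i])
        i += 2
    return "".join(x_bits), w[i + 2:]


# Toy stand-in for the advice: one bit per input length.
# In the non-uniform setting this table need not be computable.
TOY_LENGTHS = {2, 3, 5}


def advice(n: int) -> str:
    """f(n): a single advice bit shared by all inputs of size n."""
    return "1" if n in TOY_LENGTHS else "0"


def in_A(w: str) -> bool:
    """A polynomial-time set A: paired words whose advice part is "1"."""
    _, a = unpair(w)
    return a == "1"


def in_B(x: str) -> bool:
    """x is in B if and only if <x, f(|x|)> is in A, as in Definition 3.2."""
    return in_A(pair(x, advice(len(x))))
```

The decision procedure `in_A` is uniform and fast; everything that could be hard about B is stored in the advice, which depends only on the length of the input.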

Although in [4] the following statement (needed to prove Proposition 4.1) is offered as an exercise to the reader (Chapter 5, Exercise 9), we present a simple proof of it to make sure that it holds. The reader is reminded that a query to the oracle is written on a special query tape, and that the oracle answers YES or NO in one time step. Further, we note that adding extra tapes to a Turing machine will not affect our results. This is because a Turing machine with 1 working tape and 1 input tape can simulate a Turing machine with k working tapes and 1 input tape in time O(t × log(t)), where t is the time taken by the multi-tape machine.

Proposition 3.5. Tally oracle Turing machines and advice Turing machines are polynomial time equivalent.

Proof. Consider an oracle Turing machine M with input x of size n consulting the tally oracle T, and let $\chi_T{\upharpoonright}_n$ be the sequence of the first n bits of the infinite sequence $\chi_T$. In polynomial time p, the oracle Turing machine M can only write a number p(n) of 0s on the query tape. We replace the query tape by an arbitrary working tape and define an advice Turing machine M′ that performs the same task, also in polynomial time. Let $\langle x, \chi_T{\upharpoonright}_{p(|x|)} \rangle$ be the modified input, that is, x paired with the advice $f(|x|) = \chi_T{\upharpoonright}_{p(|x|)}$. The Turing machine M′ mimics the non-oracle steps of M. Whenever the oracle Turing machine M enters the query state, the advice Turing machine M′ counts the number m ≤ p(n) of 0s on the former query tape and at the same time sweeps the input tape up to position m of the advice $\chi_T{\upharpoonright}_{p(|x|)}$. If the corresponding bit is 1, then the advice machine enters the state YES, else it enters the state NO, both states now being regular states of the new machine. The simulation is achieved in polynomial time.

Conversely, let M be an advice Turing machine with advice function $\chi_T$ and consider the tally oracle $T = \{0^n : \chi_T(n) = 1\}$. We define an oracle Turing machine M′ that decides the same set as M and is polynomial time equivalent to M. The input x of size n is given to the new machine M′. Whenever the machine M consults the i-th bit of $\chi_T$, the new machine M′ writes a number i of 0s on the query tape and consults the tally oracle. The result is the same and the two machines are polynomial time equivalent. □

We will also need to treat prefix non-uniform complexity classes. For these classes we may only use prefix functions, i.e., functions f such that f(n) is always a prefix of f(n + 1). The idea behind prefix non-uniform complexity classes is that the advice given for inputs of size n may also be used to decide smaller inputs.

Definition 3.3. Let $\mathcal{B}$ be a class of sets and $\mathcal{F}$ a class of functions. The prefix advice class $\mathcal{B}/\mathcal{F}{*}$ is the class of sets A for which some $B \in \mathcal{B}$ and some prefix function $f \in \mathcal{F}$ are such that, for every length n and input w, with |w| ≤ n, w ∈ A if and only if $\langle w, f(n) \rangle \in B$.
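The first half of the proof of Proposition 3.5 can be imitated in a few lines of Python. This is a sketch under stated assumptions rather than a Turing machine construction: a "machine" is modelled as a hypothetical Python function that receives its input and a callback answering the query "is 0^m in T?", the tally set is given by a predicate on m, and the advice for length n stores the bits for 0^0 up to 0^p(n) (an indexing convention chosen here). With a non-decreasing bound p the advice produced is also a prefix function in the sense used for Definition 3.3.

```python
from typing import Callable

Machine = Callable[[str, Callable[[int], bool]], bool]


def make_advice(in_T: Callable[[int], bool],
                p: Callable[[int], int]) -> Callable[[int], str]:
    """Advice f(n): the bits of chi_T for the words 0^0, ..., 0^p(n)."""
    def f(n: int) -> str:
        return "".join("1" if in_T(m) else "0" for m in range(p(n) + 1))
    return f


def run_with_oracle(machine: Machine, x: str,
                    in_T: Callable[[int], bool]) -> bool:
    """Each query 0^m is answered by the tally oracle itself."""
    return machine(x, in_T)


def run_with_advice(machine: Machine, x: str,
                    f: Callable[[int], str]) -> bool:
    """Each query 0^m is answered by reading bit m of the advice f(|x|)."""
    a = f(len(x))

    def lookup(m: int) -> bool:
        return a[m] == "1"  # an IndexError would mean a query longer than p(|x|)

    return machine(x, lookup)


# Toy instances, chosen only for illustration.
def toy_in_T(m: int) -> bool:
    return m % 3 == 0          # computable stand-in for an arbitrary tally set


def toy_bound(n: int) -> int:
    return 2 * n               # longest query the toy machine can write


def toy_machine(x: str, ask: Callable[[int], bool]) -> bool:
    return ask(len(x)) or ask(2 * len(x))
```

For example, `run_with_oracle(toy_machine, "011", toy_in_T)` and `run_with_advice(toy_machine, "011", make_advice(toy_in_T, toy_bound))` give the same answer, since every query the machine can make falls inside the stored prefix.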

4 Structure within advice classes

If $f : N \to \Sigma^*$ is an advice function, then by |f| we denote its size, i.e., the function |f| : N → N such that, for every n ∈ N, |f|(n) = |f(n)|. Let $|\mathcal{F}| = \{|f| : f \in \mathcal{F}\}$. We have already seen that log and poly are classes of advices. Now consider the concept of reasonable advice class, which we adapt from [21] to our purpose.⁶

Definition 4.1. A class of reasonable advices is a class of advice functions $\mathcal{F}$ such that
(a) for every $f \in \mathcal{F}$, |f| is computable in polynomial time,
(b) for every $f \in \mathcal{F}$, |f| is bounded by a polynomial,
(c) $|\mathcal{F}|$ is closed under addition and multiplication by positive integers,
(d) for every polynomial p with positive integer coefficients and every $f \in \mathcal{F}$, there exists $g \in \mathcal{F}$ such that |f| ∘ p ≤ |g|.

Other definitions could have been used. (According to this definition, polynomially long advices themselves constitute a class of reasonable advices.) Herein, we preferred to use the same concept already used in [3, 21], for the purpose of classifying real numbers in different Kolmogorov complexity classes.

Definition 4.2. We define a relation between two total functions s and r by saying that s ≺ r if s ∈ o(r). This relation can be generalised to two classes of advices, $\mathcal{F}$ and $\mathcal{G}$, by saying that $\mathcal{F} \prec \mathcal{G}$ if there exists a function $g \in \mathcal{G}$ such that, for all functions $f \in \mathcal{F}$, |f| ≺ |g|.⁷

Since ordered reasonable advice functions in the context of P/poly are classes of sublinear functions, the most natural chain of advice function sizes is a descending chain of iterated logarithmic functions: define $\log^{(0)}(n) = n$ and $\log^{(k+1)}(n) = \log(\log^{(k)}(n))$. Note that $\log^{(k+1)} \prec \log^{(k)}$, for all k ≥ 0. Now we take the reasonable class of advices $\log^{(k)}$ given by closure of each bound under addition and multiplication by positive integers and under composition with polynomials (to the right).⁸

The class of advices poly is reasonable if we restrict it to functions of computable size: |poly| is closed under addition and multiplication by positive integers and under polynomial composition (to the right). The class of advices log is also reasonable if we restrict it to functions of computable size: |log| is closed under addition and multiplication by positive integers and under composition with polynomials (to the right, $\log \circ (\lambda n.\, n^k) = \lambda n.\, k \log n$). The same applies to the sublogarithmic advices, but not to the quasipolynomial advices: the class of advices induced by functions of sizes $\lambda n.\, 2^{\log(n)^k}$, for all k > 1, although it composes with polynomials (to the right), is not reasonable.⁹ Note that the class of exponential advices is not reasonable, since their sizes are not closed under composition with polynomials (to the right).

⁶ The concept of so-called reasonable advice bounds does not coincide with ours and is not coherently applied throughout the cited book. The main reason is that functions computable in polynomial time can grow faster than polynomials.
⁷ Note that a quite different definition could be considered: $\mathcal{F} \prec \mathcal{G}$ if for every function $f \in \mathcal{F}$ there exists a function $g \in \mathcal{G}$ such that |f| ≺ |g|.
⁸ In this case, the composition with polynomials does not add any further functions.
⁹ They are computable in polynomial time, but do not satisfy condition (b) in the definition.

Proposition 4.1. If $\mathcal{F}$ and $\mathcal{G}$ are two reasonable classes of sublinear advices such that $\mathcal{F} \prec \mathcal{G}$, then $P/\mathcal{F} \subset P/\mathcal{G}$ (strict inclusion).

Proof. Let linear be the set of advice functions of size linear in the size of the input and η.linear be the class of advice functions of size η times the size of the input, where η is a number such that 0 < η < 1. There is a tally set A whose characteristic function, $\chi_A$, is in P/linear but not in P/η.linear for some η sufficiently small.¹⁰ We prove that there is a $g \in \mathcal{G}$ (with |g| strictly sublinear) so that for all $f \in \mathcal{F}$ with |f| ∈ o(|g|), there is a set in P/g that does not belong to P/f. A new tally set T is defined in the following way: for each length n, if |g|(n) ≤ n, then the word $\beta_n = \chi_A{\upharpoonright}_{|g|(n)}\, 0^{\,n-|g|(n)}$ is the unique word of size n in T, else $0^n$ is the unique word of size n in T.¹¹ This tally set¹² belongs trivially to the class P/g, choosing as advice the function $\gamma(n) = \chi_A{\upharpoonright}_{|g|(n)}$. We prove that the same set does not belong to P/f. Suppose that some Turing machine with advice f, running in polynomial time, decides T. Since |f| ∈ o(|g|), for all but finitely many n, |f|(n) < η|g|(n), for arbitrarily small η, meaning that we can compute, for all but finitely many n, |g|(n) bits of $\chi_A$ using an advice of length η.|g|(n), contradicting the fact that $\chi_A$ is not in P/η.linear. The reconstruction of the binary sequence $\chi_A{\upharpoonright}_{|g|(n)}$ is provided by the following procedure:

M procedure:
    begin
      input n;
      x := λ;
      compute |g|(n);
      for i := 1 to |g|(n) do
        query 0^i to T using advice f(i);
        if “YES” then x := x1 else x := x0
      end for;
      output x
    end.

The function g itself should have a computable size |g|, due to the restriction of $\mathcal{G}$ being a class of reasonable advice functions. The computation of |g|(n) takes a polynomial number of steps in n. So does each query and the loop (herein, we are using Proposition 3.5). We end up with a polynomial number of steps in the size of the input. □

¹⁰ We can take for A the set of prefixes of Ω.
¹¹ This situation can only happen for a finite number of values of n.
¹² The set T can be seen as tally by performing the corresponding substitution of each word by the required words from 0.

The class P/poly restricted to the advice functions of polynomial size constitutes itself a reasonable advice class and cannot reveal any internal structure. If we consider the full class P/poly with advice functions of size less than or equal to polynomial, the same proof allows us to conclude that (since λn. n is in poly) P/poly is the supremum of all classes of sets induced by the relation between reasonable advice classes so far considered. To our previously defined advice classes $\log^{(k)}$ we add the limit advice class $\log^{(\omega)} = \bigcap_{k \ge 1} \log^{(k)}$. Then Proposition 4.1 allows us to take the infinite descending chain of advice function sizes

$$\log^{(\omega)} \prec \cdots \prec \log^{(3)} \prec \log^{(2)} \prec \log \prec poly$$

and turn it into a strictly descending chain of sets

$$P/\log^{(\omega)} \subset \cdots \subset P/\log^{(3)} \subset P/\log^{(2)} \subset P/\log \subset P/poly$$

To show that $\log^{(\omega)}$ is not trivial, we note that the function $\log^*$, defined by $\log^*(n) = \min\{k : \log^{(k)}(n) \le 1\}$,¹³ is in $\log^{(\omega)}$. Identifying this function allows us to continue the descending chain by defining $\log^{(\omega+k)}$, for k ≥ 1, to be the class generated by $\log^{(k)} \circ \log^*$. Again we take the limit $\log^{(2\omega)} = \bigcap_{k \ge 1} \log^{(\omega+k)}$, giving the descending chain

$$\log^{(2\omega)} \prec \cdots \prec \log^{(\omega+2)} \prec \log^{(\omega+1)} \prec \log^{(\omega)} \prec \cdots \prec \log^{(3)} \prec \log^{(2)} \prec \log \prec poly$$

Now the function $\log^{*(2)} = \log^* \circ \log^*$ is in $\log^{(2\omega)}$, so the class $\log^{(2\omega)}$ is not trivial. We can continue descending by setting $\log^{(2\omega+k)}$, for k ≥ 1, to be the class generated by $\log^{(k)} \circ \log^{*(2)}$. Of course this continues till we reach $\log^{(\omega^2)} = \bigcap_{k \ge 1} \log^{(k\omega)}$. To get beyond this would require finding $\log^{2*} \prec \log^{*(k)}$ for all k, and this continuation is left to the reader!
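The bounds at the top of this chain are easy to compute for small arguments. The following Python sketch evaluates the iterated logarithm and the function log* as defined above; the choice of base 2 is an assumption, since the text does not fix the base, and the function names are hypothetical.

```python
import math


def iterated_log(k: int, n: float) -> float:
    """log^(k)(n): apply log (base 2, by assumption) k times, with log^(0)(n) = n.

    The caller must ensure every intermediate value is positive.
    """
    value = float(n)
    for _ in range(k):
        value = math.log2(value)
    return value


def log_star(n: int) -> int:
    """log*(n) = min{k : log^(k)(n) <= 1}, with log*(0) = 0 by convention."""
    if n == 0:
        return 0
    k, value = 0, float(n)
    while value > 1:
        value = math.log2(value)
        k += 1
    return k
```

Even for n = 65536 the value of `log_star(n)` is only 4, which gives a feeling for how little advice the classes below log^(ω) allow.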

5 Kolmogorov complexity

From this section on, by P we denote the set of polynomials $P = \{\lambda n.\, n^k : k \in N\}$. We will work with one of the definitions of Kolmogorov complexity discussed by Balcázar, Gavaldà, and Hermo in [2]:

Definition 5.1. Let U be a universal Turing machine, let f : N → N be a total function and g : N → N be a time constructible function,¹⁴ and let $\alpha \in \{0, 1\}^\omega$. We say that α has Kolmogorov complexity K[f, g] if there exists $\beta \in \{0, 1\}^\omega$ such that, for all n, the universal machine U outputs $\alpha{\upharpoonright}_n$ in time g(n), when given n and $\beta{\upharpoonright}_{f(n)}$ as inputs.

¹³ $\log^*(0) = 0$ by convention.
¹⁴ A function f : N → N is said to be time constructible if there is a Turing machine M such that for all n ∈ N and all inputs of size n, M halts in exactly f(n) steps.

This definition can be restated as follows: the dyadic rational $\alpha{\upharpoonright}_n$ of size n is generated by a universal Turing machine given the dyadic rational $\beta{\upharpoonright}_{f(n)}$ as input. The reader should regard the input $\beta{\upharpoonright}_{f(n)}$ as a binary sequence (a dyadic rational without the leading zero) made of a prefix, which is the required program for the universal Turing machine, paired with the actual input. K[f, g] can also be seen as the set of all infinite binary sequences with Kolmogorov complexity K[f, g].

Definition 5.2. K[f] is the set of all infinite binary sequences with Kolmogorov complexity K[f, g], where g is an arbitrary time constructible function.

Definition 5.3. If $\mathcal{G}$ is a set of time constructible bounds, then $K[\mathcal{F}, \mathcal{G}]$ is the set of all infinite binary sequences taken from K[f, g], where $f \in \mathcal{F}$ and $g \in \mathcal{G}$, i.e., $K[\mathcal{F}, \mathcal{G}] = \bigcup_{f \in \mathcal{F},\, g \in \mathcal{G}} K[f, g]$.

Definition 5.4. $K[\mathcal{F}]$ is the set of all infinite binary sequences taken from K[f], where $f \in \mathcal{F}$.

Definition 5.5. A sequence is called a Kolmogorov random sequence if it belongs to K[(λn. n) − O(1)] and does not belong to any smaller class K[f].

Every sequence belongs to K[(λn. n) + O(1), P], since every sequence can be reproduced from itself in polynomial time, given in addition the constant amount of input that contains the program the universal Turing machine needs to make the copy. The class K[O(1)] contains all computable real numbers, in the sense of Turing (i.e. all the binary digits are computable). All the characteristic functions of recursively enumerable sets are in K[log]. This was proved by Kobayashi in 1981 [16] and by Loveland in 1969 [17] for a variant of the definition of Kolmogorov complexity. The Kolmogorov complexity of a real is provided by the following definition.

Definition 5.6. A real is in a given Kolmogorov complexity class if the task of finding the first n binary digits of the real is in that class.
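Definitions 5.1 to 5.4 can be collected into a single display. The LaTeX restatement below is only a notational sketch: the symbol $U(n, w)$ for the output of U on inputs n and w, and the phrase "within g(n) steps", are shorthand introduced here and are not notation taken from [2].

```latex
\begin{align*}
\alpha \in K[f,g] \iff{}& \exists \beta \in \{0,1\}^{\omega}\ \forall n \in \mathbb{N}:\
   U\bigl(n,\ \beta{\upharpoonright}_{f(n)}\bigr) = \alpha{\upharpoonright}_{n}
   \text{ within } g(n) \text{ steps},\\
K[f] ={}& \bigcup_{g \text{ time constructible}} K[f,g],\\
K[\mathcal{F},\mathcal{G}] ={}& \bigcup_{f \in \mathcal{F},\ g \in \mathcal{G}} K[f,g],\\
K[\mathcal{F}] ={}& \bigcup_{f \in \mathcal{F}} K[f].
\end{align*}
```

Read this way, the first argument bounds how much of β the machine may see, and the second bounds how long it may run.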

6 The analog–digital scatter machine as oracle or advice

Experiments with scatter machines are conducted exactly as described in [8], but, for convenience and to use them as oracles, we need to review and clarify some points. The scatter machine experiment (SME) is defined within Newtonian mechanics, comprising the following laws and assumptions: (a) point particles obey Newton’s laws of motion in the two dimensional plane, (b) straight line barriers have perfectly elastic reflection of particles, i.e., kinetic energy is conserved exactly in collisions, (c) barriers are completely rigid and do not deform on impact, (d) cannons, which can be moved in position, can project a particle with a given velocity in a given direction, (e) particle detectors are capable of telling if a particle has crossed a given region of the plane, and (f) a clock measures time.

[Figure 3: schematic of the scatter machine. Labels recoverable from the drawing: left and right collecting boxes; the wedge, whose point x has limit of traverse from 0 to 1; the cannon, at position z with limit of traverse from 0 to 1, firing at 10 m/s; a sample trajectory; dimensions of 5 m.]

Fig. 3. A schematic drawing of the scatter machine

The machine consists of a cannon for projecting a point particle, a reflecting barrier in the shape of a wedge and two collecting boxes, as in Figure 3. The wedge can be at any position, but we will assume it is fixed for the duration of all the experimental work. Under the control of a Turing machine, the cannon will be moved and fired repeatedly to find information about the position of the wedge. Specifically, the SME is used as an oracle in Turing machine computations as follows: a Turing machine sets a position for the cannon as a query and receives an observation about the result of firing the cannon as a response. For each input to the Turing machine, there will be finitely many runs of the experiment.

In Figure 3 the parts of the machine are shown in bold lines, with description and comments in narrow lines. The double-headed arrows give dimensions in meters, and the single-headed arrows show a sample trajectory of the particle after being fired by the cannon. The sides of the wedge are at 45° to the line of the cannon, and we take the collision to be perfectly elastic, so the particle is deflected at 90° to the line of the cannon, and hits either the left or right collecting box, depending on whether the cannon is to the left or right of the point of the wedge. Since the initial velocity is 10 m/s, the particle will enter one of the two boxes within 1 second of being fired. Any initial velocity v > 0 will work with a corresponding waiting time. The wedge is sufficiently wide so that the particle can only hit the 45° sloping sides, given the limit of traverse of the cannon. The wedge is sufficiently rigid so that the particle cannot move the wedge from its position. We make the further assumption, without loss of generality (see [5]), that the vertex of the wedge is not a dyadic rational.

Suppose that x is the arbitrarily chosen, but non-dyadic and fixed, position of the point of the wedge. For a given dyadic rational cannon position z, there are two outcomes of an experiment: (a) one second after firing, the particle is in the right box (conclusion: z > x), or (b) one second after firing, the particle is in the left box (conclusion: z < x). The SME was designed to find x to arbitrary accuracy by altering z, so in our machine 0 ≤ x ≤ 1 will be fixed, and we will perform observations at different values of 0 ≤ z ≤ 1.

Consider the precision of the experiment. When measuring the output state the situation is simple: either the ball is in one collecting box or in the other box. Errors in observation do not arise. There are different postulates for the precision of the cannon, and we list some in order of decreasing strength:

Definition 6.1. The SME is error–free if the cannon can be set exactly to any given dyadic rational number. The SME is error–prone with arbitrary precision if the cannon can be set only to within a non-zero, but arbitrarily small, dyadic precision. The SME is error–prone with fixed precision if there is a value ε > 0 such that the cannon can be set only to within a given precision ε.

The Turing machine is connected to the SME in the same way as it would be connected to an oracle: we replace the query state with a shooting state (qs), the “yes” state with a left state (ql), and the “no” state with a right state (qr). The resulting computational device is called the analog–digital scatter machine, and we refer to the vertex position of an analog–digital scatter machine when we mean to discuss the vertex position of the corresponding SME.
In order to carry out a scatter machine experiment, the analog–digital scatter machine will write a word z on the query tape and enter the shooting state. This word will either be “1”, or a binary word beginning with 0. We will use z indifferently to denote both a word $z_1 \dots z_n \in \{1\} \cup \{0s : s \in \{0,1\}^*\}$ and the corresponding dyadic rational $\sum_{i=1}^{n} 2^{-i+1} z_i \in [0,1]$. In this case, we write |z| to denote n, i.e., the size of $z_1 \dots z_n$, and say that the analog–digital scatter machine is aiming at z. The Turing machine computation will then be interrupted, and the SME will attempt to set the cannon at the position defined by the sequence of bits

$$z \equiv z_1 \cdot z_2 \cdots z_n$$

with precision $\varepsilon = 2^{-n+1}$. After setting the cannon, the SME will fire a projectile particle, wait one second and then check if the particle is in either box. If the particle is in the right collecting box, then the Turing machine computation will be resumed in the state qr. If the particle is in the left box, then the Turing machine computation will be resumed in the state ql.

Definition 6.2. An error–free analog–digital scatter machine is a Turing machine connected to an error–free SME. In a similar way, we define an error–prone analog–digital scatter machine with arbitrary precision, and an error–prone analog–digital scatter machine with fixed precision.
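As a concrete reading of this convention, the following Python sketch converts a query word into the dyadic rational it denotes and computes the associated precision $2^{-n+1}$. The function names (`is_query_word`, `dyadic_value`, `aiming_precision`) are illustrative and not taken from the paper, and the use of exact rational arithmetic is an assumption made here to avoid rounding.

```python
from fractions import Fraction


def is_query_word(word):
    """True if word is "1" or a non-empty binary word beginning with 0."""
    if word == "1":
        return True
    return word != "" and word[0] == "0" and set(word) <= {"0", "1"}


def dyadic_value(word):
    """Dyadic rational sum_{i=1..n} 2^(-i+1) * z_i denoted by the word z_1...z_n."""
    if not is_query_word(word):
        raise ValueError("not a valid query word: %r" % word)
    # enumerate() is 0-based, so the weight 2^(-i+1) for 1-based i is 2^(-k) here.
    return sum(Fraction(int(bit), 2 ** k) for k, bit in enumerate(word))


def aiming_precision(word):
    """Precision 2^(-n+1) associated with a query word of size n."""
    return Fraction(1, 2 ** (len(word) - 1))
```

For instance, under these assumptions the word 0101 denotes 0 + 1/2 + 0 + 1/8 = 5/8, and a query of that size carries precision 1/8.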

If an error–free analog–digital scatter machine, with vertex position x ∈ [0, 1], aims at a dyadic rational z ∈ [0, 1], we are certain that the computation will be resumed in the state ql if z < x, and that it will be resumed in the state qr when z > x. We define the following decision criterion.

Definition 6.3. Let A ⊆ Σ∗ be a set of words over Σ. We say that an error–free analog–digital scatter machine M decides A if, for every input w ∈ Σ∗, w is accepted if w ∈ A and rejected when w ∉ A. We say that M decides A in polynomial time, if M decides A, and there is a polynomial p such that, for every w ∈ Σ∗, the number of steps of the computation is bounded by p(|w|).

Gedankenexperiment: The position for firing the cannon is written as a dyadic rational on the query tape, and since it takes unit time to write a symbol on the tape, there is a limit to the accuracy of determining the wedge position that we can obtain within a given time. Conversely, using bisection, we can determine the wedge position to within a given accuracy, and if the wedge position is a good encoding, we can find the original sequence to any given length (see [6, 5]).

Theorem 6.1. An error–free analog–digital scatter machine can determine the first n binary places of the wedge position x in polynomial time in n.

Theorem 6.2. The class of sets decided by error–free analog–digital scatter machines in polynomial time is exactly P/poly.

We then conclude that, by measuring the position of a motionless point particle in Classical Physics using an infinite precision cannon¹⁵ in polynomial time, we are deciding a set in P/poly. Note that the class P/poly includes the Sparse Halting Set. In this paper we shall only consider error–free analog–digital scatter machines. The error–prone analog–digital scatter machines do not behave in a deterministic way, and in this paper we are not concerned with probabilistic classes.
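The bisection argument behind Theorem 6.1 can be illustrated with a minimal Python sketch. It is not the construction of [5, 6]: the error–free SME is modelled here by a hypothetical function `make_error_free_sme` that simply compares the cannon position with a hidden vertex, and the cost of writing each query on the tape is not modelled.

```python
from fractions import Fraction


def make_error_free_sme(x):
    """Hypothetical stand-in for an error-free SME with vertex at x."""
    def fire(z):
        if z == x:
            # The paper assumes the vertex is never a dyadic rational.
            raise ValueError("cannon aimed exactly at the vertex")
        # Left box means z < x, right box means z > x.
        return "left" if z < x else "right"
    return fire


def first_binary_places(fire, n):
    """First n binary places (after the point) of a vertex x in (0, 1)."""
    low, high = Fraction(0), Fraction(1)
    bits = []
    for _ in range(n):
        mid = (low + high) / 2          # always a dyadic rational
        if fire(mid) == "left":         # mid < x: the next bit of x is 1
            bits.append("1")
            low = mid
        else:                           # mid > x: the next bit of x is 0
            bits.append("0")
            high = mid
    return "".join(bits)
```

Each of the n rounds fires the cannon once at a dyadic rational with at most n binary places, which is consistent with a polynomial bound in n once the time to write the queries is accounted for.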
However, lest the reader think that the computational power of the analog–digital scatter machine depends on some “unphysical” assumption of zero error, it is shown in [6, 7] that the arbitrary precision machine can still compute P/poly (with suitable account of time taken to set up each experiment), and that the fixed precision machines can compute BPP//log∗. It seems that measurement in more reasonable conditions does not affect the computational power of a motionless point particle position taken as oracle. That is, making measurements in Classical Physics with incremental precision may decide languages above the Turing limit, namely the Sparse Halting Set.

Since the vertex of the wedge of the analog–digital scatter machine is placed at a position x ∈ [0, 1], a real number, we have to explain how a real number can be considered a tally oracle. A real number can be seen either as an infinite binary sequence or as the tally set containing exactly the words $0^n$ such that the n-th bit in the sequence is 1. The following procedure M reads from such a tally set T the sequence of bits of a real number up to the m-th bit in linear time:

M procedure:
begin
  input m;
  x := λ;
  for i := 1 to m do
    query 0^i to T;
    if “YES”, then x := x1, else x := x0
  end for
end.

¹⁵ Remember that by infinite precision we mean the cannon can be set at a dyadic rational point with infinite precision.
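The procedure M above translates almost line by line into Python. In this sketch the tally oracle is assumed to be a function from words to booleans, and `tally_oracle_from_bits` is a hypothetical helper that builds such an oracle from a finite prefix of the sequence (every bit beyond the prefix is treated as 0). The sketch does not model the tape, so it says nothing about the linear time bound.

```python
def tally_oracle_from_bits(bits):
    """Oracle for the tally set {0^n : the n-th bit of `bits` is 1} (n is 1-based)."""
    members = {"0" * (index + 1) for index, bit in enumerate(bits) if bit == "1"}
    return lambda word: word in members


def read_bits(tally_oracle, m):
    """Procedure M: recover the first m bits of the sequence encoded by the oracle."""
    x = ""                              # x := λ (the empty word)
    for i in range(1, m + 1):
        if tally_oracle("0" * i):       # query 0^i to T
            x += "1"
        else:
            x += "0"
    return x
```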

7 The complexity of the vertex position

In this section, we apply the methods developed in [4, 3, 21] for the study of neural networks with real weights to the analog–digital scatter machine. We shall apply a “good” coding of sequences of 0s and 1s into the binary digits of a real number that will allow a measurement of a given accuracy to determine the first n 0s and 1s (and that in addition will never produce a dyadic rational). For example, we can replace every 0 in the original sequence with 001 and every 1 with 100. Then the sequence 0110 . . . becomes the number 0·001100100001 . . . The set of “good” encodings will typically be some form of Cantor set in [0, 1]. See [6] for more details.

Proposition 7.1. Let S be a set of infinite binary “good” encodings and let $\mathcal{T}$ be the family of tally sets $\mathcal{T}$ = {T : χT ∈ S}. The computation times of the analog–digital scatter machines with vertex in S are polynomially related to the computation times of oracle Turing machines that consult oracles in $\mathcal{T}$.

Proof. We first prove that an analog–digital scatter machine M with vertex at x ∈ S can be simulated by an oracle Turing machine M′ that consults a tally oracle T ∈ $\mathcal{T}$. Let the characteristic of T be χT = x. Let t be the running time of M (possibly a non–constructible time bound).¹⁶ According to Theorem 6.1, p(t) bits of x are enough to get the desired result in time t. The oracle Turing machine M′ computes as follows:

M′ procedure:
begin
  input w;
  n := |w|;
  s := 1;
  loop
    for i = 1 to p(s) query 0^i to T to construct the sequence ξ := x_s;
    simulate M with vertex at ξ, step by step until time s;
    if M halts, then output the result;
    s := s + 1
  end loop
end.

To see that the output is correct, note that after the for step, M′ has the value of x with enough precision to correctly simulate t(n) steps of the computation. The simulation is polynomial in the time t(n).¹⁷

Conversely, we prove that an oracle Turing machine M that consults the oracle T ∈ $\mathcal{T}$ can be simulated by an analog–digital scatter machine with vertex exactly at χT. The query tape is substituted by a working tape and a new query tape is added to aim the cannon. The machine M′ reads one by one the number i of 0s written in the former query tape and calls the scatter machine procedure to find i bits of the vertex position using the new query tape. Each call can be executed in a time polynomial in i ([5, 7]). The overall time of the computation is polynomially related with the running time of the analog–digital scatter machine. □

The following theorem is the analogue of the corresponding theorem for neural networks with real weights, due to Balcázar, Gavaldà, and Siegelmann in [3, 21].

Proposition 7.2. If F is a class of reasonable sublinear advices,¹⁸ then the class P/F∗ is exactly the class of languages accepted by polynomial time analog–digital scatter machines with vertex in the subset of “good” encodings of K[|F|, P].

Proof. By Proposition 7.1, we only have to prove that P/F∗ coincides with the class of languages decidable by Turing machines in polynomial time consulting tally oracles T with characteristic χT ∈ K[|F|, P]. Assume that A ∈ P(T) and consider an infinite sequence β such that the first n bits of χT can be recovered from the first |f|(n) bits of β in polynomial time in n, with |f| ∈ F. Let p be the polynomial associated with an oracle Turing machine M that witnesses A ∈ P(T). The longest word queried to the tally oracle T along the computation of M has size not exceeding p(n). Let g ∈ |F| be a bound on |f| ◦ p.¹⁹ Then A is in P/F∗ by choosing as advice for length n the first |g|(n) bits of β, from which up to p(n) bits of χT can be written on a tape in polynomial time and then used to decide all words in A of size less than or equal to n.²⁰

¹⁶ Note that M halts only after t(n) steps on input x of size n, if t(n) is defined; otherwise, M does not halt.
¹⁷ If the time of M is constructible, then a single loop suffices to get the amount of bits of x needed to conclude the simulation. However, in general, t is not constructible or, even worse, t may be undefined for a given input.
¹⁸ I.e., a class of reasonable advices of sublinear sizes.
¹⁹ Such a function g exists since F is a class of reasonable advice functions.
²⁰ Note that, although A has potentially an exponential number of words of size less or equal to n, the number of words in T of size less or equal to n is quadratic in n since T is a tally set.

Conversely,²¹ let A ∈ P/F∗ be witnessed by the Turing machine M running in polynomial time with advice f ∈ F∗ with size bounded by a strictly increasing polynomial p. Let q(n) = (n + 1)p(n), so that, for all but finitely many values of n, p(n + 1) < q(n + 1) − q(n); it follows that |f|(n + 1) − |f|(n) < |f|(n + 1) ≤ p(n + 1) < q(n + 1) − q(n). Let the characteristic χT of the tally oracle T be defined as follows: the bits from 1 to |f|(1) of χT are f(1); the (|f|(1) + 1)-th bit of χT is 1, and the bits |f|(1) + 2 to q(1) of χT are 0; when 1 ≤ ℓ ≤ |f|(n + 1) − |f|(n), the bit q(n) + ℓ of χT is bit |f|(n) + ℓ of f(n + 1); bit number q(n) + |f|(n + 1) − |f|(n) + 1 of χT is 1, and when |f|(n + 1) − |f|(n) + 1 < ℓ ≤ q(n + 1) − q(n), bit q(n) + ℓ of χT is 0. As an infinite binary string, χT looks like this:

$$\chi_T = f(1)\,\underbrace{10\dots0}_{\text{padding}}\; f(2)\,\underbrace{10\dots0}_{\text{padding}}\; f(3)\,\underbrace{10\dots0}_{\text{padding}}\; \dots$$
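The padding scheme just defined can be made concrete with a short Python sketch. It follows the bit-by-bit definition: block n holds only the bits of f(n) that extend f(n − 1), followed by a marker 1 and zeros up to position q(n). The advice `example_advice` and the bound `example_bound` are assumptions chosen purely for illustration, as are all function names; the sketch presupposes that f(n) is a prefix of f(n + 1) and that |f|(n) ≤ p(n).

```python
def q(n, p):
    """q(n) = (n + 1) * p(n), the position at which block n ends."""
    return (n + 1) * p(n)


def build_chi_prefix(f, p, n):
    """First q(n) bits of chi_T for a prefix advice f bounded in size by p."""
    blocks, previous, start = [], "", 0
    for k in range(1, n + 1):
        current = f(k)
        if not current.startswith(previous):
            raise ValueError("f(k-1) must be a prefix of f(k)")
        new_bits = current[len(previous):]      # bits |f|(k-1)+1 .. |f|(k) of f(k)
        end = q(k, p)
        padding = end - start - len(new_bits) - 1
        if padding < 0:
            raise ValueError("block too short: size bound p is violated")
        blocks.append(new_bits + "1" + "0" * padding)
        previous, start = current, end
    return "".join(blocks)


def recover_advice(chi_prefix, p, n):
    """Recover f(n) from the first q(n) bits of chi_T by stripping the padding."""
    pieces, start = [], 0
    for k in range(1, n + 1):
        end = q(k, p)
        block = chi_prefix[start:end]
        pieces.append(block[:block.rindex("1")])  # drop the marker 1 and the zeros
        start = end
    return "".join(pieces)


# Illustrative (assumed) advice: prefixes of a fixed word, of logarithmic size.
SOURCE = "0110100110010110"


def example_advice(n):
    return SOURCE[:n.bit_length()]


def example_bound(n):
    return n
```

With these assumed inputs, `build_chi_prefix(example_advice, example_bound, 3)` yields `011100100000`, and `recover_advice` applied to that string returns `01`, which is `example_advice(3)`.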

Given n and q(n) ≥ n bits of χT, we can print out f(n) in polynomial time, by removing the extra bits with which we have padded f in χT. Therefore, in polynomial time, a Turing machine with oracle T can simulate M and decide A. Since with |f|(n) bits of f we can print q(n) ≥ n bits of χT, we also conclude that χT ∈ K[|F|, P]. □

²¹ Herein, we adapt the proof in [21].

The class of languages accepted by the analog–digital scatter machine with vertex in K[|poly|, P] is P/poly∗ = P/poly. The class of languages accepted by the analog–digital scatter machine with vertex in K[|log|, P] is P/log∗. Thus we can restate one of the main results of the Gedankenexperiment of Section 4 with the analog–digital scatter machine while measuring a distance from the origin as oracle. The result is the same for neural nets with real weights computing in polynomial time (see [21]).

Proposition 7.3. The analog–digital scatter machines decide in polynomial time exactly the class P/poly.

Proof. From Proposition 7.2, we know that the analog–digital scatter machines decide in polynomial time exactly the class P/poly∗ = P/poly. Take for F the class poly, restricted to advices of computable size. If an advice has non-computable size, but it is bounded in size by a polynomial p, then we can pad the advice of size m, for the input x of size n, with the word $10^{p(n)-m-1}$. Thus, for every advice in poly, there is always an equivalent advice of computable size that does not alter the complexity of the problem. □

We can then prove a hierarchy theorem. The statement can be found in [3, 21], but here the proof relies on the structure of advice classes.

Proposition 7.4. If F and G are two classes of reasonable advice functions such that F ≺ G, then K[|F|, P] ⊂ K[|G|, P] (strict inclusion).

Proof. If F ≺ G, then, by Proposition 4.1, P/F ⊂ P/G, from which it follows²² that P/F∗ ⊂ P/G∗ and, consequently, by Proposition 7.2, that K[|F|, P] ⊂ K[|G|, P] (all strict inclusions).

Proposition 7.5. If F and G are two classes of reasonable advice functions such that F ≺ G, then the class of languages decidable by analog–digital scatter machines with vertex in K[|F|, P] is strictly included in the class of languages decidable by analog–digital scatter machines with vertex in K[|G|, P].

In the limit of a descending chain of sizes of classes of reasonable advice functions we have O(1). The class K[O(1), P] is, as we know, the class of real numbers computable by Turing machines in polynomial time.
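The example of a “good” coding given at the start of this section (0 replaced by 001, 1 replaced by 100) can be checked with a small Python sketch. The names `encode`, `decode` and `binary_fraction` are illustrative only. Since every block contains both a 0 and a 1, an infinite encoded sequence is never eventually constant, which is why such a code cannot produce a dyadic rational.

```python
from fractions import Fraction

# Block code from the text: 0 -> 001, 1 -> 100.
CODE = {"0": "001", "1": "100"}
INVERSE = {block: bit for bit, block in CODE.items()}


def encode(bits):
    """Encode a finite binary sequence block by block."""
    return "".join(CODE[bit] for bit in bits)


def decode(encoded):
    """Decode all complete 3-bit blocks; an incomplete trailing block is ignored."""
    usable = len(encoded) - len(encoded) % 3
    decoded = []
    for start in range(0, usable, 3):
        block = encoded[start:start + 3]
        if block not in INVERSE:
            raise ValueError("not a code block: %r" % block)
        decoded.append(INVERSE[block])
    return "".join(decoded)


def binary_fraction(digits):
    """Exact value of the finite binary expansion 0.d1 d2 ... dk."""
    if not digits:
        return Fraction(0)
    return Fraction(int(digits, 2), 2 ** len(digits))
```

Applied to the sequence 0110 from the text, `encode` returns 001100100001, matching the expansion 0·001100100001 quoted there; knowing 3n binary places of the encoded number is enough for `decode` to return the first n symbols of the original sequence.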

8 Conclusion

We have reflected upon the way physical experiments, measuring some quantities, can arise in computation and be viewed as special kinds of oracles; Hoyle’s algorithm is an intriguing, yet simple, case study for this purpose. Next, we have analysed in some detail a case study based upon the scatter machine experiment SME, a computational Gedankenexperiment we have analysed earlier ([6, 7]). Using the SME, we have shown that non-uniform complexity is an adequate framework for classifying feasible computations by Turing machines interacting with an oracle in Nature. In particular, by classifying the information content of such an oracle using Kolmogorov complexity, we have obtained a hierarchical structure for advice classes.

In order to use the scatter machine experiment as an oracle, we need to assume that the wedge is sharp to the point and that the vertex is placed on a precise value x. Without these assumptions, the scatter machine becomes useless, since its computational properties arise exclusively from the value of x. The existence of an arbitrarily sharp wedge seems to contradict atomic theory, and for this reason the scatter machine is not a valid counterexample to the physical Church–Turing thesis. If this is the case, then what is the relevance of the analog–digital scatter machine as a model of computation?

The scatter machine is relevant when it is seen as a Gedankenexperiment. In our discussion, we could have replaced the barriers, particles, cannons and particle detectors with any other physical system with this behaviour. So the scatter machine becomes a tool to answer the more general question: if we have a physical system to measure an answer to the predicate y ≤ x, to what extent can we use this system in feasible computations? If we accept that “measuring a physical quantity” is, in essence, answering whether y ≤ x, then the scatter machine is just a metaphor for a measuring device.
Thus, our work studies the fundamental limitations of computation depending on the measurement of some physical constant. As current research, besides a few other aspects of the measurement apparatus that we did not cover in this paper, we are studying a point mass in motion, according to some physical law, such as Newtonian gravitation, and we will apply instrumentation to measure the position and velocity of such a point mass.

²² The proof of Proposition 4.1 is also a proof that P/F ⊂ P/G∗. Since P/F∗ ⊂ P/F, the statement follows.

References

1. José Luis Balcázar, Josep Díaz, and Joaquim Gabarró. Structural Complexity I. Springer-Verlag, 2nd edition, 1995.
2. José Luis Balcázar, Ricard Gavaldà, and Montserrat Hermo. Compressibility of infinite binary sequences. In Andrea Sorbi, editor, Complexity, logic, and recursion theory, volume 187 of Lecture notes in pure and applied mathematics, pages 1175–1183. Marcel Dekker, Inc., 1997.
3. José Luis Balcázar, Ricard Gavaldà, and Hava Siegelmann. Computational power of neural networks: a characterization in terms of Kolmogorov complexity. IEEE Transactions on Information Theory, 43(4):1175–1183, 1997.
4. José Luis Balcázar, Ricard Gavaldà, Hava Siegelmann, and Eduardo D. Sontag. Some structural complexity aspects of neural computation. In Proceedings of the Eighth IEEE Structure in Complexity Theory Conference, pages 253–265. IEEE Computer Society, 1993.
5. Edwin Beggs, José Félix Costa, Bruno Loff, and John Tucker. The computational complexity of the analog-digital scatter machine, 2008. Technical Report, University of Wales Swansea.
6. Edwin Beggs, José Félix Costa, Bruno Loff, and John Tucker. Computational complexity with experiments as oracles. Proceedings of the Royal Society, 2008.
7. Edwin Beggs, José Félix Costa, Bruno Loff, and John Tucker. On the complexity of measurement in classical physics. In Manindra Agrawal, Dingzhu Du, Zhenhua Duan, and Angsheng Li, editors, Theory and Applications of Models of Computation (TAMC 2008), volume 4978 of Lecture Notes in Computer Science, pages 20–30. Springer-Verlag, 2008.
8. Edwin Beggs and John Tucker. Experimental computation of real numbers by Newtonian machines. Proceedings of the Royal Society, 463(2082):1541–1561, 2007.
9. Cristian Calude. Algorithmic randomness, quantum physics, and incompleteness. In Maurice Margenstern, editor, Machines, Computations and Universality (MCU 2004), volume 3354 of Lecture Notes in Computer Science, pages 1–17. Springer-Verlag, 2004.
10. Barry Cooper and Piergiorgio Odifreddi. Incomputability in nature. In Barry Cooper and Sergei Goncharov, editors, Computability and Models, Perspectives East and West, University series in mathematics, pages 137–160. Springer, 2003.
11. Jack Copeland. The Church–Turing thesis. In Edward Zalta, editor, The Stanford Encyclopedia of Philosophy. Published online at http://plato.stanford.edu/archives/fall2002/entries/church-turing/, 2002.
12. Jack Copeland and Diane Proudfoot. Alan Turing’s forgotten ideas in computer science. Scientific American, 280:99–103, 1999.
13. Martin Davis. The myth of hypercomputation. In Christof Teuscher, editor, Alan Turing: the life and legacy of a great thinker, pages 195–212. Springer, 2006.
14. Andrew Hodges. The professors and the brainstorms. Published online at http://www.turing.org.uk/philosophy/sciam.html, 1999.
15. Fred Hoyle. From Stonehenge to Modern Cosmology. W. H. Freeman, 1972.
16. K. Kobayashi. On compressibility of infinite sequences. Technical Report C–34, Research Reports on Information Sciences, 1981.
17. D. W. Loveland. A variant of the Kolmogorov concept of complexity. Information and Control, 15:115–133, 1969.
18. C. A. Newham. The Astronomical Significance of Stonehenge. Coats and Parker Ltd, 2000. First published in 1972.
19. Roger Penrose. The Emperor’s New Mind. Oxford University Press, 1989.
20. Roger Penrose. Shadows of the Mind. Oxford University Press, 1994.
21. Hava T. Siegelmann. Neural Networks and Analog Computation: Beyond the Turing Limit. Birkhäuser, 1999.
22. John V. Tucker and Jeffery I. Zucker. Computable functions and semicomputable sets on many sorted algebras. In Samson Abramsky, Dov Gabbay, and Tom Maibaum, editors, Handbook of Logic for Computer Science, volume V of University Series in Mathematics, pages 317–523. Oxford University Press, 2000.
23. John V. Tucker and Jeffery I. Zucker. Abstract versus concrete computation on metric partial algebras. ACM Transactions on Computational Logic, 5:611–668, 2004.
