Learning Branches and Learning to Win Closed Games (Extended Abstract)
Martin Kummer and Matthias Ott
Institut für Logik, Komplexität und Deduktionssysteme, Universität Karlsruhe, D-76128 Karlsruhe, Germany
Email: [email protected]
Supported by the Deutsche Forschungsgemeinschaft, Graduiertenkolleg "Beherrschbarkeit komplexer Systeme" (GRK 209/296).
Abstract

We introduce two new notions of inductive inference: learning infinite recursive branches of recursive trees, and learning winning strategies for closed recursive games. Branch learning is a natural generalization of learning functions, and learning winning strategies is a new approach to constructively finding winning strategies for a special kind of Gale-Stewart games. These two independently motivated concepts turn out to be equivalent. In branch learning, new phenomena appear compared to function learning: for example, we show that learning and uniform computation are incomparable, whereas in the setting of learning functions uniform computation is trivial. Another example is that there are two distinct natural definitions of BC-style branch learning, and they yield different classes. By studying different learning criteria, enumeration and computation of branches, and the effect of using an oracle for the halting problem, we obtain a hierarchy of the corresponding classes. By the equivalence mentioned above, all results also hold for learning winning strategies. Additionally, we investigate the notion of counter strategy learning for closed recursive games.
1 Introduction

In the traditional model of inductive inference the learner has to find a program for a recursive function f from input/output examples of f. Learning branches of trees can be seen as a natural generalization of learning functions. A recursive function f can be identified with the infinite recursive tree T which consists only of the branch f(0)f(1)…. Thus, to infer f is the same as to learn a program for an infinite recursive branch of T from initial segments of T. We find it natural to ask whether it is possible to learn infinite recursive branches of more general recursive trees. Infinite branches of recursive trees are of general interest in recursion theory [15]. In [9] it is studied to which extent (in the sense of so-called k-selectors) infinite recursive branches of trees can be computed uniformly. This approach was combined with inductive inference in [4]. There the learner receives input/output examples of f and, as additional information, an index of a tree T such that f is a branch of T. In an abstract view, T can be seen as a problem specification and the infinite branches of T as the solutions. We ask whether it is possible to learn solutions for T from partial information about T. There are two main differences between our notion of learning branches and the traditional notion of learning functions: firstly, we are not given a particular infinite recursive branch (the object to be learned) which we could present to the learner as input; secondly, there may be many infinite recursive branches, and we only want to learn one of them.
Recently, there has been extensive theoretical research on the problem of navigating a robot in unknown environments. It has been studied whether it is possible to efficiently learn a complete model of the environment (e.g. [1, 18]) or a path to a target point in the environment (e.g. [2, 17]). Learning infinite branches of trees models the task of learning a path in unknown infinite environments: the target point is the point at infinity, and the environment is represented by a computable function. In other words, learning a branch of a tree can be construed as learning the way "out" of an infinite maze. The nodes of the tree represent the (simple) finite paths through the maze that do not hit an obstacle; thus the infinite branches of the tree correspond exactly to the paths out of the maze. But in contrast to the mentioned work on robot navigation in unknown finite environments, we do not look for efficient algorithms. Instead, our viewpoint is a recursion-theoretic one: what is learnable by machines with unbounded time and space resources? This is the reason why in our model there is no difference between active and passive learning: the robot can simply perform a breadth-first search of the environment to obtain the characteristic function of the associated tree in the limit. Thus, we can assume that there is an oracle which enumerates the characteristic function of the input tree.

The concept of branch learning gains additional significance from the fact that it is equivalent to learning winning strategies for closed recursive Gale-Stewart games. This concept is introduced in Section 5. Gale-Stewart games can be used to model reactive systems [21]; in this model, correct reactive programs correspond to computable winning strategies. The question whether one can uniformly compute winning strategies for infinite games has been studied intensively, for example in [3, 10, 11, 14, 21]. Learning winning strategies is an alternative approach to the construction of winning strategies for infinite games; it is justified by the fact that learning and uniform computation turn out to be incomparable. We also consider the concept of counter strategy learning for closed recursive games. The idea of counter strategy learning is taken from Fortnow and Whang [6] and Freund et al. [7], who have studied this concept in the framework of repeated matrix games.
2 Notation and Definitions

The natural numbers are denoted by ω. χ_A is the characteristic function of A ⊆ ω. We use an acceptable programming system φ_0, φ_1, …; the function computed by the e-th program within s steps is denoted by φ_{e,s}. REC is the set of all total recursive functions; REC_{0,1} is the set of all 0,1-valued total recursive functions. If f is a total 0,1-valued function we write ext(f) := {x : f(x) = 1} for the set with characteristic function f. Turing reducibility is denoted by ≤_T. If A is a set, then A′ is the halting problem relative to A, that is, {e : φ_e^A(e)↓}. The halting problem ∅′ is denoted by K.
For strings a, b ∈ {0,1}*, a ⊑ b means that a is an initial segment of b. We code the strings in {0,1}* according to the alphabetical ordering λ (the empty string), 0, 1, 00, 01, 10, 11, …; the coding is denoted by ⟨·⟩. Total functions f : ω → {0,1} are identified with the infinite string f(0)f(1)….
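For concreteness, the following small sketch (in Python; the helper names code and decode are ours) shows one way to realize this coding and its inverse. It is only an illustration of the ordering fixed above, not part of the formal development.

```python
# Sketch of the coding <.> of binary strings used throughout the paper:
# code("") = 0, code("0") = 1, code("1") = 2, code("00") = 3, ...

def code(s: str) -> int:
    """Position of the binary string s in the ordering "", 0, 1, 00, 01, ..."""
    # There are 2^len(s) - 1 strings shorter than s; within its length block,
    # s sits at position int(s, 2) (reading s as a binary numeral).
    return (2 ** len(s) - 1) + (int(s, 2) if s else 0)

def decode(n: int) -> str:
    """Inverse of code."""
    length = 0
    while n >= 2 ** length:          # skip the blocks of shorter strings
        n -= 2 ** length
        length += 1
    return format(n, "b").zfill(length) if length else ""

assert [decode(i) for i in range(7)] == ["", "0", "1", "00", "01", "10", "11"]
assert all(code(decode(i)) == i for i in range(100))
```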
T ⊆ {0,1}* is a (binary) tree if T is closed under initial segments. Elements of a tree are called nodes. A branch of T is a maximal linearly ordered subset of T. If for a σ ∈ {0,1}* ∪ {0,1}^ω the set A_σ := {a : a ⊑ σ} is a branch of T, we often write σ instead of A_σ. A tree T is recursive if its characteristic function χ_T is recursive. Except for Section 5.4 we are only interested in recursive trees which have an infinite recursive branch:
Definition 1 TREE := {T ⊆ {0,1}* : T is a recursive tree with at least one infinite recursive branch}.

Remaining recursion-theoretic notation is from [15]. EX, FIN, and BC denote the classes of sets S ⊆ REC which are identifiable by explanation, finitely identifiable by explanation, and behaviorally correctly identifiable, respectively. Definitions for the different learning criteria will be given shortly in the context of branch learning. For background from inductive inference see e.g. Appendix A in [16].

We now define the notion of branch learning. At first we give a definition which corresponds to that of EX-learning a class of recursive functions. As input we provide the initial segments of a tree T. We look for a recursive function which computes from this input, in the limit, a program for an infinite branch of T:

Definition 2 A class of trees B is branch learnable (B ∈ BRANCH) if there is a g ∈ REC such that for all T ∈ B:
e_T := lim_{n→∞} g(⟨χ_T(0)…χ_T(n)⟩) exists and φ_{e_T} is an infinite branch of T.

In inductive inference there exist many different learning criteria which can also be considered in the framework of branch learning.

Definition 3 A class of trees B is finitely branch learnable (B ∈ BRANCH_fin) if there is a recursive function g : ω → ω ∪ {?} such that for every T ∈ B there exists an n ∈ ω such that g(⟨χ_T(0)…χ_T(i)⟩) = ? for all i < n and g(⟨χ_T(0)…χ_T(n)⟩) is an index of an infinite branch of T.

While EX-style branch learning and finitary branch learning can be defined straightforwardly, it is not so obvious how to define BC-style branch learning. BC-style learning means that in the limit an infinite sequence of correct programs for the target concept is found. Our target concept is an infinite branch of the input tree. Should we require that eventually all programs compute the same infinite branch? Or should we allow that the programs may compute different branches, as long as they are infinite branches of the input tree? We define both versions, and in Section 3.3 we show that they yield different classes. This is a typical phenomenon of branch learning in which it differs from the classical setting of inductive inference, where functions are learned.

Definition 4 A class of trees B is (strongly) BC-branch learnable (B ∈ BRANCH_BC) if there is a g ∈ REC such that for every T ∈ B there exists an infinite branch f ∈ REC of T with

(∃n_0)(∀n ≥ n_0)[φ_{g(⟨χ_T(0)…χ_T(n)⟩)} = f].
A class of trees B is weakly BC-branch learnable (B ∈ BRANCH_WBC) if there is a g ∈ REC such that for every T ∈ B:

(∃n_0)(∀n ≥ n_0)[φ_{g(⟨χ_T(0)…χ_T(n)⟩)} is an infinite branch of T].

Example 5 As an easy example consider the trees T_f := f(0)f(1)… which consist of exactly one infinite branch f ∈ REC_{0,1}. Obviously, the classes S ⊆ REC_{0,1} and B_S := {T_f : f ∈ S} are related by the following equivalences:
B_S ∈ BRANCH ⟺ S ∈ EX,
B_S ∈ BRANCH_fin ⟺ S ∈ FIN,
B_S ∈ BRANCH_BC ⟺ B_S ∈ BRANCH_WBC ⟺ S ∈ BC.

The power of an inductive inference machine can be improved if we allow the machine to use an oracle. This has been studied in inductive inference, for example in [5]. We will also consider the situation where the branch learner has access to an oracle A ⊆ ω. To obtain the relativized definitions we replace g ∈ REC in Definitions 2, 3 and 4 by g ≤_T A. We write BRANCH^A, BRANCH^A_fin, etc. for the corresponding classes.
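To make the identification in Example 5 concrete, here is a small Python sketch (the data representation and function names are ours): the characteristic function of a single-branch tree T_f is queried node by node and the unique branch f is read off. Essentially the same search also underlies the uniform computation of branches for this class considered in Section 3.2.

```python
# Sketch of the identification in Example 5: a single-branch tree T_f is
# accessed through its characteristic function, and the unique branch f is
# recovered bit by bit.  chi_T is an ordinary Python predicate standing in
# for the (recursive) characteristic function of T_f.

def tree_of(f):
    """Characteristic function of T_f = set of initial segments of f."""
    def chi_T(node: str) -> int:
        return int(all(node[i] == str(f(i)) for i in range(len(node))))
    return chi_T

def branch_of(chi_T):
    """Recover the unique branch: exactly one of node+'0', node+'1' is in T_f."""
    def f(n: int) -> int:
        node = ""
        for _ in range(n + 1):
            bit = 0 if chi_T(node + "0") else 1
            node += str(bit)
        return int(node[-1])
    return f

g = branch_of(tree_of(lambda n: (n * n) % 2))   # f(n) = n^2 mod 2
assert [g(n) for n in range(6)] == [0, 1, 0, 1, 0, 1]
```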
3 Four Phenomena of Branch Learning

In this section we present four phenomena which show the character of branch learning. We investigate four types of problems which are serious problems in the framework of branch learning. But in the framework of learning functions, the classical setting of inductive inference, the corresponding problems do not appear, because there they are trivial. Example 5 shows that the problem of learning functions can be reduced to branch learning. The reduction yields exactly the kind of branch learning problems where the character of branch learning disappears: it yields the trees which consist of exactly one infinite branch, i.e. improper trees. The results we present in this section have been chosen and formulated with the aim of demonstrating the character of branch learning. Therefore, we do not state the results in their strongest form, and we do not pay attention to completeness. This will be done in Section 4.

3.1 Learning a Tree versus Learning a Branch

A class of trees B ⊆ TREE can be considered as a set of recursive functions, namely the set S_B := {χ_T : T ∈ B} which consists of the characteristic functions of the trees in B. What is the difference between learning a branch of a tree and learning the tree itself as a function? More formally, is there a relation between the questions S_B ∈ EX and B ∈ BRANCH?
The answer is no. For answering this question we need a special class of trees which we will use several times in this paper:

Theorem 6 There is a class {T_e : e ∈ ω} of trees such that

- χ_{T_e} can be computed uniformly from e,
- in every T_e there exist infinite branches β_0 and β_1, beginning with 0 and with 1 respectively, such that β_0 or β_1 is recursive,
- if f ∈ REC is an infinite branch of T_e, then f(0) = 1 iff φ_e is total.
Proof sketch: We use the following well-known fact:

Fact 7 There is an infinite recursive tree T̂ ⊆ {0,1}* without infinite recursive branches.
We construct T_e uniformly in e. The idea of the construction is that every tree T_e contains two subtrees T_e^0 and T_e^1; an infinite branch has to go through one of them, i.e., T_e has the form 0T_e^0 + 1T_e^1. Both trees T_e^0 and T_e^1 are infinite. In T_e^1 there is an infinite recursive branch iff φ_e is total, and in T_e^0 an infinite recursive branch exists iff φ_e is not total. This is achieved by the following construction. In both subtrees we try to construct the tree T̂, but the two constructions depend on a stepwise computation of φ_e(0), φ_e(1), … in different ways.

In T_e^0 we proceed with the construction of T̂ every time we detect that φ_e is defined on the next value. In all other steps we extend all branches of T_e^0 by 0. That is, T_e^0 will be a stretched initial segment of T̂. If φ_e is total then T̂ can be embedded into T_e^0; moreover, from every infinite recursive branch of T_e^0 we can uniformly compute an infinite recursive branch of T̂. Because T̂ has no infinite recursive branches, there are no infinite recursive branches in T_e^0 if φ_e is total. If φ_e is not total then a finite variant of 0^ω is a branch of T_e^0, so in this case T_e^0 has an infinite recursive branch.

In T_e^1 we proceed in the contrary fashion. Here the default step is to go on with the construction of T̂. In the other case, when we see that φ_e converges on the next value, we completely reset the construction of T̂: we let all branches end except the leftmost branch of maximal length, and at the end of this branch we start a new construction of T̂. So in T_e^1 we get a stuttered version of T̂. If φ_e is not total, T_e^1 ends up with a copy of T̂; it follows that every infinite branch of T_e^1 ends up in an infinite branch of T̂ and thus cannot be recursive. If φ_e is total then there is an infinite recursive branch: compute the next reset step and search for the leftmost branch of maximal length.
Corollary 8 {T_e : e ∈ ω} ∉ BRANCH.

Proof: Assume that {T_e : e ∈ ω} ∈ BRANCH via g. We set χ_e := χ_{T_e}. Then f := λe,n.[φ_{g(⟨χ_e(0)…χ_e(n)⟩),n}(0)] would be a recursive function such that the sequence f_n := λe.f(e,n) converges to the characteristic function of Tot := {e : φ_e total}. By the Limit Lemma it follows that Tot ≤_T K, which is a contradiction.

Theorem 9 There is a class B ∈ BRANCH such that S_B ∉ EX. There is a class C ⊆ TREE such that S_C ∈ EX, but C ∉ BRANCH.
Proof: Define T_f′ := T_f ∪ {0^ω}, where the T_f are the trees of Example 5. Then B := {T_f′ : f ∈ REC} is branch learnable via the function that always outputs a program for 0^ω. Since REC ∉ EX, the class S_B cannot be in EX either. For the second part we use a small modification of the class {T_e : e ∈ ω} of Theorem 6: we set T_e′ := 0^e 1 T_e and C := {T_e′ : e ∈ ω}. Because χ_{T_e} is uniformly computable from e, it follows that S_C ∈ FIN ⊆ EX. But C is not in BRANCH, by Corollary 8.
Example 5 shows that in the framework of learning functions, learning the tree and learning a branch of the tree are the same thing.

3.2 Learning versus Uniform Computation

If we want to constructively find an infinite recursive branch of a tree, learning such a branch is only one reasonable approach. A second one, and perhaps the more obvious one, is to uniformly compute an infinite branch. What corresponds to uniform computation in the classical framework of inductive inference? The task would be: find an algorithm which, for every f of a class S ⊆ REC, computes from an index of f an index of f. Obviously this is trivial, i.e. the question of uniform computation is not a serious criterion in this framework, and it is not possible to compare the power of learning with the power of computing. The situation changes if we consider the task of finding branches of trees:

Definition 10 Infinite recursive branches can be computed uniformly for a class B of recursive trees (B ∈ UNI) if

(∃g ∈ {φ_i : i ∈ ω})(∀e)(∀T ∈ B)[χ_T = φ_e ⟹ g(e)↓ ∧ φ_{g(e)} is an infinite branch of T].

Considering the class {T_f : f ∈ REC} of Example 5 we get:

Theorem 11 UNI − BRANCH ≠ ∅.
The trees of Theorem 6 are an example of a class of trees whose infinite recursive branches can neither be uniformly computed nor learned. There also exist classes such that infinite recursive branches can be learned but cannot be uniformly computed, showing that BRANCH and UNI are incomparable:

Theorem 12 BRANCH − UNI ≠ ∅.

Proof: We construct a class of trees {V_e : e ∈ ω}. V_e will diagonalize against φ_e as an algorithm for the uniform computation of infinite branches.

Construction of V_e: Let s_i := μs.[φ_{e,s}(i)↓ = j ∧ φ_{j,s}(0)↓]; s_i may be undefined. Define the tree U_i in the following way:

U_i := {0^{s_i}, 1^ω}   if s_i ∈ ω, φ_e(i) = j and φ_j(0) = 0,
U_i := {0^ω, 1^{s_i}}   if s_i ∈ ω, φ_e(i) = j and φ_j(0) = 1,
U_i := {0^ω, 1^ω}       otherwise.

Then there is an h ∈ REC such that φ_{h(i)} is the characteristic function of U_i. By the recursion theorem there exists a fixed point i_0 with φ_{i_0} = φ_{h(i_0)}. We set V_e := ext(φ_{i_0}).
The class of trees B := {V_e : e ∈ ω} is branch learnable with at most one mind change: on input χ_V(0), χ_V(1), … output a program for 0^ω until you detect an n with χ_V(⟨0^n⟩) = 0; then output a program for 1^ω. Now assume that B were in UNI via φ_e. Then in the construction of V_e the number s_{i_0} is finite, so V_e contains only one infinite branch. For the index i_0 of V_e the branch φ_j with j = φ_e(i_0) differs at the very first node from the only infinite branch of V_e, which is a contradiction.
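The one-mind-change learner used in this proof can be sketched as follows (the data representation is ours: the input is a list of pairs (node, χ_V(node)) in the coded order of Section 2, and the hypotheses are returned as labels standing for fixed programs).

```python
# Sketch of the one-mind-change learner from the proof above.  The labels
# "0^w" and "1^w" stand for fixed programs computing 0^omega and 1^omega.

def one_mind_change_learner(pairs):
    for node, value in pairs:
        if value == 0 and set(node) == {"0"}:   # some node 0^n has left the tree
            return "1^w"                        # the single mind change
    return "0^w"                                # default hypothesis

# Example: V = {0^2, 1^omega}, i.e. 0^n lies in V only for n <= 2.
def chi_V(node):
    return int(set(node) <= {"1"} or (set(node) <= {"0"} and len(node) <= 2))

nodes = ["", "0", "1", "00", "01", "10", "11", "000", "001", "111", "0000"]
pairs = [(u, chi_V(u)) for u in nodes]
assert one_mind_change_learner(pairs[:4]) == "0^w"   # no evidence against 0^omega yet
assert one_mind_change_learner(pairs) == "1^w"       # 000 is missing, so switch
```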
3.3 Strongly versus Weakly BC-Branch Learning

In Section 2 we defined two different versions of BC-style branch learning. In both versions we require that from a certain point on the learner always outputs programs for an infinite branch of the input tree. In the weak version these programs may compute different branches of the input tree, while in the strong version all these programs have to compute the same branch. In the framework of learning functions it is not possible to make this distinction, because there is only one target concept, namely the input function itself. One can show by diagonalization that strong and weak BC-branch learning are really different notions of learning:

Theorem 13 BRANCH_BC ⊊ BRANCH_WBC.

Proof sketch: Clearly, BRANCH_BC ⊆ BRANCH_WBC. We will construct a class of trees {T_e : e ∈ ω} ∈ BRANCH_WBC − BRANCH_BC by diagonalization. T_e will diagonalize against φ_e as a strong BC-branch learner.

At the beginning of T_e we code the index e, i.e., we set T_e := 0^e 1 T_e′. The index e will be used by the weak BC-learner for simulating the following construction.
Each T_e′ is constructed in stages. At every stage T_e′ has the form 0T_e^0 + 1T_e^1, and it has exactly three branches of maximal length: one in T_e^0 and two in T_e^1, or vice versa. The subtree T_e^i which has only one maximal branch is called the 1-branch subtree, the other the 2-branch subtree. We start with T_{e,0} := 0^e 1(0(0 + 1) + 10).

In stage s+1 we apply φ_e, as a potential strong BC-learner, to the tree constructed so far. Say that in stage s we know the characteristic function χ_e of T_e at the arguments 0, …, x_s. A marker m ≤ x_s is used to point to the last argument where we were active; initially m is 0. Now we compute the hypotheses h_n := φ_{e,s}(⟨χ_e(0)…χ_e(n)⟩) for n = m, …, x_s. For all hypotheses h_n which are defined we compute φ_{h_n,s}(i) for i = 0, …, x_s and search for an inconsistency, i.e. an argument where two different hypotheses are defined and have different values. In this case the learner has changed the branch in the period [m, x_s]. We become active, set m := x_s and T_{e,s+1} := T_{e,s}0, and go to the next stage.

If the set of all defined hypotheses is consistent, then they compute (in s steps) one maximal sequence a := a_0…a_k. If a is not an initial segment of a maximal branch of T_{e,s} then the learner has produced a wrong hypothesis, and we can proceed as in the case of an inconsistency. If a is an initial segment of a maximal branch of T_{e,s} but this branch is not unique, then we do not have enough information to become active; we leave m unchanged, set T_{e,s+1} := T_{e,s}0 and go to the next stage. If a lies on a unique maximal branch in the 2-branch subtree then we can become active. Assume that b_1 and b_2 are the two maximal branches of the 2-branch subtree of T_{e,s} and c is the maximal branch of the 1-branch subtree. If a ⊑ b_1 we ensure that a is a wrong hypothesis by setting T_{e,s+1} := T_{e,s} ∪ {b_2 0, b_2 1, c0}, i.e., we truncate the branch b_1; additionally, we split the branch b_2, securing our invariant that there are three maximal branches in T_{e,s+1}. If a ⊑ b_2 we proceed analogously. We set m := x_s and go to the next stage. If a ⊑ c then we cannot simply truncate c, since otherwise there would no longer be a maximal branch in the corresponding subtree T_e^i. In this case we split the branch c and truncate one of the branches b_1 or b_2, say b_1, i.e., we set T_{e,s+1} := T_{e,s} ∪ {b_2 0, c0, c1}, and we say that the 1-branch and 2-branch subtrees have been exchanged in stage s+1. We leave m unchanged, because we cannot ensure in this stage that a is a wrong hypothesis. Since a now lies above a split in T_{e,s+1}, we will pay attention to this requirement in a later stage, unless we find an inconsistency or the hypothesis leaves the tree.
We now show that T_e is not strongly BC-branch learnable via φ_e. If the marker m does not tend to infinity during the construction, then from a certain point on the learner always outputs non-total hypotheses and thus is no BC-learner. If m tends to infinity and we find infinitely many inconsistencies, then the learner changes the branch infinitely often and cannot be a strong BC-learner. Otherwise the learner produces an infinite subsequence of wrong hypotheses and again fails to be a BC-learner. So the class {T_e : e ∈ ω} is not strongly BC-branch learnable.

But it is weakly BC-branch learnable by the following algorithm. Wait until you can decode the index e. Then always output a program that chooses the 1-branch subtree of the current input tree and then simulates the above construction. During the simulation it chooses 0, unless the program comes to a split in the tree constructed simultaneously in the simulation; then it waits until one of the branches is truncated and follows the other one. Assume that during the construction the 1-branch and 2-branch subtrees are exchanged infinitely often. Then all hypotheses of the above learning algorithm are infinite branches of T_e, because at every split the hypothesized program only has to wait a finite number of steps. If the 1-branch and 2-branch subtrees are exchanged only finitely often, then from a certain point on all hypotheses follow the maximal path in the 1-branch subtree.
3.4 Learning versus Enumerating

In the learning procedures considered so far the learner has to find a program for a branch of the input tree. We now consider situations where the learner has to enumerate the nodes of an infinite branch during learning, instead of hypothesizing programs for such a branch. This process can be seen as a kind of on-line learning. Similar to uniform computation, in the framework of learning functions enumeration is trivial: simply repeat the input values.

As inductive input the learner gets the characteristic function of a tree. After each input value the learner can output the next node of his branch, or wait another step by outputting "?". The learner can wait an arbitrary but finite number of steps until he provides the next node. We identify an infinite sequence σ ∈ {0,1,?}^ω with the sequence obtained from σ by deleting all occurrences of "?".

Definition 14 A class of trees B is branch enumerable (B ∈ ENUM) if there is a recursive function g : ω → ω ∪ {?} such that for every T ∈ B the sequence g(⟨⟩), g(⟨χ_T(0)⟩), g(⟨χ_T(0), χ_T(1)⟩), … is an infinite recursive branch of T.
The following theorem clarifies the relation of ENUM to the classes BRANCH and UNI:

Theorem 15

1. BRANCH − ENUM ≠ ∅,
2. ENUM − BRANCH ≠ ∅,
3. ENUM ⊆ UNI.

Proof: (1) The proof is similar to that of Theorem 12. This time, in the construction of V_e, wait until φ_e enumerates the first move on input 0^s + 1^s. If φ_e enumerates 0 or 1, truncate the corresponding branch. Here we do not need the recursion theorem.

(2) Use the trees of Example 5: {T_f : f ∈ REC} ∈ ENUM − BRANCH.

(3) Assume that B ∈ ENUM via g. Given an index of χ_T for a T ∈ B, output the following program, which computes an infinite branch of T: on input u, simulate the enumeration procedure using g until a node of length greater than |u| is produced; if u lies on the branch enumerated so far, output 1, otherwise output 0.
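The conversion in the proof of (3) can be sketched as follows (in Python; the interface is ours, in particular the enumerator is modelled as a map from the finitely many known values of χ_T to the next enumerated node, with None standing for "?").

```python
from itertools import count, product

def branch_decider(enumerator, chi_T, all_nodes):
    """all_nodes() yields the binary strings in the coded order of Section 2."""
    def on_branch(u: str) -> int:
        known, longest = {}, ""
        for node in all_nodes():                 # feed chi_T value by value
            known[node] = chi_T(node)
            out = enumerator(dict(known))
            if out is not None and len(out) > len(longest):
                longest = out
            if len(longest) > len(u):            # enumerated beyond |u|: decide
                return int(longest.startswith(u))
    return on_branch

def chi_T(node):                                 # toy tree with single branch 0101...
    return int(all(node[i] == "01"[i % 2] for i in range(len(node))))

def enumerator(known):                           # output the longest node known to be in T
    in_tree = [v for v, bit in known.items() if bit == 1]
    return max(in_tree, key=len) if in_tree else None

def all_nodes():
    for n in count(0):
        for tup in product("01", repeat=n):
            yield "".join(tup)

decide = branch_decider(enumerator, chi_T, all_nodes)
assert decide("0101") == 1 and decide("11") == 0
```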
To show that UNI ⊄ ENUM we can diagonalize similarly as in (1). But to ensure that the class {V_e : e ∈ ω} is in UNI we cannot simply truncate one branch when φ_e has enumerated the first node. Again we let the branch chosen by φ_e be finite, but instead of simply truncating it, we add a finite subtree under the chosen branch 0^s or 1^s. This subtree begins with a split and is chosen in such a way that for all i ≤ s the functions φ_i are different from χ_{V_e}. Such a subtree exists, but note that we cannot find it uniformly in e. Now for every j with φ_j = χ_{V_e} we know that if 0^ω or 1^ω is not a branch of V_e, then after at most j nodes we can find a split in this branch. Thus, we can uniformly compute an infinite branch from j.

4 The Hierarchy of Branch Learning

The following hierarchy of learning classes can be obtained directly from well-known results of inductive inference:

Proposition 16 BRANCH_fin ⊊ BRANCH ⊊ BRANCH_BC and BRANCH_fin ⊊ BRANCH^K_fin ⊊ BRANCH^K.

We now clarify the exact relations between BRANCH_WBC, ENUM, UNI, UNI^K, TREE and the classes of Proposition 16. Hereby, UNI^K denotes the class of all sets of trees such that infinite branches can be computed uniformly in K. Note that TREE ∈ BRANCH^K. The considered classes can be arranged in a hierarchy, which is shown in Figure 1.

[Figure 1: The hierarchy of branch learning. The diagram relates the classes BRANCH_fin, BRANCH^K_fin, BRANCH, BRANCH_BC, BRANCH_WBC, BRANCH^K, ENUM, UNI, UNI^K and TREE; arrows indicate inclusions.]

Theorem 17 For all classes Q, R contained in Figure 1 it holds that Q ⊆ R iff there is an arrow from Q to R.

Theorem 17 follows from a number of propositions. Here we will only state and prove some of these propositions:

Proposition 18 BRANCH^K_fin − UNI ≠ ∅.

Proof: It suffices to modify slightly the proof of Theorem 12. There we constructed a class {V_e : e ∈ ω} to diagonalize against all algorithms for the uniform computation of branches. Now we code e and the fixed point i_0 into the beginning of the tree V_e′ and then proceed analogously to the construction of V_e. The class {V_e′ : e ∈ ω} is still not in UNI, but it is finitely branch learnable using a K-oracle: wait until you can decode e and i_0; using the K-oracle, decide whether s_{i_0} is defined and φ_{φ_e(i_0)}(0) = 0. If the answer is 'yes', output a program for 1^ω, otherwise output a program for 0^ω.

Proposition 19 BRANCH_BC ⊆ UNI^K.

Proof: Let B ∈ BRANCH_BC via g. We can compute infinite branches for B uniformly in K in the following way. On input e, simulate the BC-learning procedure g on the input φ_e(0), φ_e(1), …. If φ_e = χ_T for a T ∈ B, the simulation yields a sequence h_0, h_1, … of hypotheses which BC-converges to a branch of T.

Using the oracle K we can decide for every n_0 whether the set of hypotheses {h_n : n ≥ n_0} is consistent, i.e. whether (∀n, m ≥ n_0)(∀x)[φ_{h_n}(x)↓ ∧ φ_{h_m}(x)↓ ⟹ φ_{h_n}(x) = φ_{h_m}(x)]. Since the sequence of hypotheses BC-converges, such an n_0 exists.

Wait until such an n_0 is found and then output the following algorithm: on input x the algorithm again computes the sequence h_0, h_1, … by simulation and then searches by dovetailing for an n ≥ n_0 such that φ_{h_n}(x)↓. Since almost all hypotheses in the sequence compute one branch f of T, such an n is found, and because all hypotheses h_i with i ≥ n_0 are consistent, φ_{h_n}(x) = f(x). Again, the class {T_f : f ∈ REC} from Example 5 shows that the two classes are not equal.
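The dovetailing at the end of this proof can be sketched as follows (the representation of hypotheses as Python functions h(x, s), returning None when the computation has not converged within s steps, is ours; the oracle K is only needed for finding n_0 and does not appear here).

```python
from itertools import count

def dovetail_output(hypotheses, x):
    """hypotheses[n] plays the role of phi_{h_{n_0+n}}; the search terminates
    because almost all hypotheses compute the branch."""
    for s in count(1):                          # increasing step bound
        for n in range(min(s, len(hypotheses))):
            value = hypotheses[n](x, s)
            if value is not None:               # first convergent value; pairwise
                return value                    # consistency makes it unique

never = lambda x, s: None                       # a hypothesis that never halts
zeros = lambda x, s: 0 if s > x else None       # computes 0^omega, "using x steps"
print(dovetail_output([never, zeros, zeros], 5))   # -> 0
```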
UNI ⊆ UNI^K holds by definition. From Proposition 16 and Propositions 18 and 19 we get:

Corollary 20 UNI ⊊ UNI^K.
Proposition 21 TREE ∈ UNI^A ⟺ A ≥_T ∅″.

Proof: If TREE ∈ UNI^A then the class {T_e : e ∈ ω} from Theorem 6 is in UNI^A. Thus, by Theorem 6, we can decide Tot recursively in A. It follows that A ≥_T Tot ≡_T ∅″. Assume now A ≥_T ∅″. For a recursive tree T we have T ∈ TREE iff there is an e such that

1. φ_e ∈ REC_{0,1},
2. φ_e computes an α ∈ {0,1}^ω,
3. φ_e computes an infinite branch of T.

Moreover, for T ∈ TREE an infinite recursive branch can be found by searching for the least such e. Condition (1) can be decided recursively in ∅″, and under the assumption that (1) holds, an ∅′-oracle suffices to decide (2) and (3).

Proposition 22 BRANCH_WBC − UNI^K ≠ ∅.

Proof: We can reuse the trees B := {T_e : e ∈ ω} constructed in the proof of Theorem 13. We have proved that B is in BRANCH_WBC − BRANCH_BC. Assume that B ∈ UNI^K via φ_i^K. Since the construction of T_e was uniform in e, there is an h ∈ REC such that χ_{T_e} = φ_{h(e)}. Now the following algorithm branch learns B, which yields a contradiction since BRANCH ⊆ BRANCH_BC: on input χ_{T_e} wait until you can decode e, then output in stage n the hypothesis φ_{i,n}^{K_n}(h(e)).
5 Learning to Win Closed Infinite Games

Finite games like chess can be represented by a finite game tree. It is always possible to compute an optimal strategy for a finite game by an exhaustive search through the game tree. But for interesting games, like chess or Go, these game trees are usually so big that an exhaustive search is intractable. Thus, it would be nice if one could find a winning strategy by inspecting only some part of the game tree. We will consider the analogous question for closed infinite games, which can be represented by infinite recursive trees: is it possible to come up with a winning strategy by inspecting just a finite amount of the game tree?

5.1 Closed Recursive Games

We consider two-person games in the style of Gale-Stewart games (see e.g. [20]). The games are of infinite duration, i.e., the plays consist of ω many moves. The players are called player I and player II. The players move alternately by choosing an element from the alphabet Σ = {0,1} (Figure 2). Thus, a play is an ω-word over the alphabet Σ.

[Figure 2: An infinite play. Player I plays a_1, a_2, a_3, a_4, …; player II answers with b_1, b_2, b_3, b_4, ….]

A game is defined by a set of plays G ⊆ Σ^ω. Player I wins the play α = (α_i)_{i∈ω} if α ∈ G; otherwise player II wins the play α. With respect to the Borel hierarchy the most basic Gale-Stewart games are the closed games, where a game G is closed if for all α ∈ {0,1}^ω:

(∀u ⊑ α)(∃β ∈ G)[u ⊑ β] ⟹ α ∈ G.

In this paper we consider only closed games. The closed games correspond to the safety properties of reactive systems [22]. As an example we consider the following control problem. Assume that we want to find a controller which holds the temperature in a room between t_min and t_max. This requirement is a safety property. We can model this situation as a closed game where the controller is player I. To achieve the given requirement, player I can switch a heater on and off. In this example the adversary is the physical disturbance which influences the temperature in the room and the heating process. To make the example more concrete (but easy, too), assume that the controller can regulate the heater on a scale from 0 to 10, i.e. the moves a_n of player I are in the alphabet {0,…,10}, and the disturbance is measured on a scale from −4 to 2, i.e. player II chooses his moves b_n from the alphabet {−4,…,2}. Assume further that the temperature behaves according to the following equations:

t_0 := 30,
t_{n+1} := t_n + a_n − ((11 − a_n)/10)·(|b_n| − b_n) + ((1 + a_n)/10)·(|b_n| + b_n).

The winning condition (∀n)[25 ≤ t_n ≤ 35] now yields a closed game. Note that this game could also be coded over the alphabet {0,1}.
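A minimal simulation of this control game, following the update rule above, may make the example more tangible (the function names and the particular strategies are ours and purely illustrative).

```python
# Player I picks the heater level a_n in 0..10, player II the disturbance
# b_n in -4..2; the safety condition requires 25 <= t_n <= 35 for all n.

def step(t, a, b):
    """One round of t_{n+1} = t_n + a - (11-a)/10*(|b|-b) + (1+a)/10*(|b|+b)."""
    return t + a - (11 - a) / 10 * (abs(b) - b) + (1 + a) / 10 * (abs(b) + b)

def play(controller, disturbance, rounds=20):
    """True iff player I keeps the temperature in [25, 35] for all rounds."""
    t = 30.0
    for n in range(rounds):
        a = controller(t, n)          # player I's move
        b = disturbance(t, n)         # player II's move
        t = step(t, a, b)
        if not 25 <= t <= 35:
            return False
    return True

# A simple proportional controller against an adversary that always cools.
controller  = lambda t, n: max(0, min(10, round(30 - t) + 5))
disturbance = lambda t, n: -4
print(play(controller, disturbance))
```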
Closed games G can be identified with their characteristic function χ_G : ω → {0,1}, defined by

χ_G(⟨a⟩) := 1 if there is an α ∈ G with a ⊑ α, and χ_G(⟨a⟩) := 0 otherwise.

χ_G is the characteristic function of a tree, which we call the game tree. A closed game is recursive if χ_G is recursive. For closed games we can naturally interpret the task of the two players: player I wins if he can keep the play on the game tree, while player II has to lead the play out of the game tree in order to win the play.

5.2 Winning Strategies and Learning

A strategy for a player is a function which yields the next move for the player, given all the previous moves of the adversary. A strategy σ is a winning strategy for player I if player I wins every play in which he follows σ. We are especially interested in computable strategies. σ and τ denote strategies for players I and II, respectively. We write σ⋆τ for the resulting play when player I plays according to σ and player II plays according to τ. In the following we write I-GAMES for the set of all closed recursive games in which player I has a computable winning strategy:

Definition 23 I-GAMES := {G ⊆ {0,1}^ω : G is closed, χ_G ∈ REC and player I has a recursive winning strategy for G}.

We now define our notion of game learning. As input we provide the initial segments of the game tree. We look for a recursive function which computes from this input, in the limit, a program for a winning strategy:

Definition 24 Γ ⊆ I-GAMES is winning strategy learnable (Γ ∈ WS) if there is a total recursive function g such that for every G ∈ Γ:

e_G := lim_{n→∞} g(⟨χ_G(0)…χ_G(n)⟩) exists and φ_{e_G} is a winning strategy for player I in G.

Definition 24 defines EX-style winning strategy learning. As for branch learning we can also define notions of game learning which correspond to the other criteria of learning used in inductive inference; we omit the corresponding definitions.
5.3 Equivalence between Branches and Strategies

In this section we show that the task of finding a winning strategy for a closed recursive game is equivalent to the task of finding an infinite branch in a recursive tree. This yields a very useful method which enables us to translate all results about learning, computing and enumerating branches of trees into the framework of game learning. Playing an infinite branch of a tree can be seen as a one-person game, i.e. there is no adversary; this is the main reason why proofs get simpler and clearer when they are done in the framework of finding branches in trees.

Let T be a recursive tree. Then T can canonically be expanded to a closed recursive game G_T:

G_T := {a_1 b_1 a_2 b_2 … : (∀i)[a_1…a_i ∈ T]}
     ∪ {a_1 b_1 … a_n β : a_1…a_n ∈ T ∧ a_1…a_n 0, a_1…a_n 1 ∉ T ∧ β ∈ {0,1}^ω}.

Corollary 25 G_T is closed and recursive.

It is clear that T and G_T can be computed uniformly from each other. Moreover, it is not difficult to see that the input stream χ_T(0), χ_T(1), … can be effectively translated into an input stream for the characteristic function of G_T, and vice versa. But note that the two streams proceed at different rates; that is, we may have to await a finite number of input values until we can enumerate the next value of the target stream. The following lemma states that we can also compute branches and winning strategies uniformly from each other:

Lemma 26 If f = a_1 a_2 … is an infinite branch of T, then the strategy σ with σ(⟨b_1…b_{n−1}⟩) := a_n is a winning strategy for player I in G_T. If σ is a winning strategy in G_T, then (σ(⟨1^n⟩))_{n∈ω} is an infinite branch of T.
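Both directions of Lemma 26 are simple enough to sketch directly (in Python; the representation of strategies as functions of the list of the adversary's previous moves is ours).

```python
# Sketch of Lemma 26: a branch of T gives a strategy for G_T that ignores
# player II's moves, and from any strategy one reads off a sequence of
# moves by fixing player II's answers to 1.

def strategy_from_branch(branch):
    """branch(n) = a_{n+1}; the strategy ignores the history b_1...b_{n-1}."""
    return lambda history: branch(len(history))

def moves_against_ones(strategy, n):
    """The first n moves a_1 ... a_n of player I when player II always plays 1."""
    moves, history = [], []
    for _ in range(n):
        moves.append(strategy(list(history)))
        history.append(1)
    return moves

sigma = strategy_from_branch(lambda n: n % 2)        # the branch 0101...
assert moves_against_ones(sigma, 5) == [0, 1, 0, 1, 0]
```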
From the above arguments about the effective translations between trees T and games G_T, and from Lemma 26, we get:

Corollary 27 Let B ⊆ TREE. Then B ∈ BRANCH iff {G_T : T ∈ B} ∈ WS.

Analogously, we get similar statements for all other considered classes, like BRANCH_fin, BRANCH_BC, BRANCH_WBC, UNI and the relativized versions, e.g. BRANCH^A_fin, UNI^A, etc.

For the reduction of games to trees, let a game G ∈ I-GAMES be given. We now define a tree T_G such that the tasks of finding a winning strategy in G and finding a branch in T_G are of the same complexity. The idea is to define a recursive tree whose infinite branches are exactly the winning strategies for player I in G. Every s = s_0…s_n ∈ {0,1}* represents an initial segment of a strategy. The set of all initial plays according to s as a strategy for player I is

Init(s) := {a_1 b_1 … a_{k+1} b_{k+1} : ⟨b_1…b_k⟩ ≤ n ∧ a_{i+1} = s_{⟨b_1…b_i⟩} for i = 0,…,k}.
σ is a winning strategy for player I in G iff (∀s ⊑ σ)(∀u ∈ Init(s))[χ_G(u) = 1].

To ensure that T_G contains the same information as G, we additionally code the game G into T_G. T_G is defined inductively, starting with λ ∈ T_G:

c_0 s_0 c_1 s_1 … s_{n−1} c_n ∈ T_G :⟺ c_0 s_0 c_1 s_1 … s_{n−1} ∈ T_G ∧ c_n = χ_G(n),
c_0 s_0 c_1 s_1 … c_n s_n ∈ T_G :⟺ c_0 s_0 c_1 s_1 … c_n ∈ T_G ∧ Init(s_0…s_n) ⊆ {u : χ_G(u) = 1}.

Since player I has a winning strategy for G, the tree T_G has at least one infinite branch that contains χ_G completely as a subsequence. Again, we can effectively move between games G and the trees T_G, not only with respect to characteristic indices but also with respect to enumerations of the characteristic functions. Similar statements to Lemma 26 and Corollary 27 hold for the reduction of games to trees. We informally summarize our result in the following theorem:

Theorem 28 Computing and learning winning strategies for closed recursive games is equivalent to computing and learning infinite recursive branches of recursive trees, respectively. This is true for all considered learning criteria and also for the relativized versions of learning and computation.

The class ENUM which we considered in Section 3.4 also has a natural counterpart in game playing. Here the idea is that, instead of producing hypotheses for a winning strategy, the learner has to produce moves for a play against a fixed strategy of player II; that is, he has to learn and to play simultaneously. As inductive input the learner gets again the characteristic function of a game tree. Additionally, in each step he gets the next move of player II, or ? if it is player I's turn but player I has not yet chosen his next move; i.e., after each move of player II, player I can wait an arbitrary but finite number of steps until he provides the next move. Figure 3 shows this process:

[Figure 3: The process of play learning. The learner receives the stream χ_G(0), χ_G(1), χ_G(2), … together with player II's moves b_1, b_2, b_3, …, interleaving his own moves a_1, a_2, a_3, … and outputting ? while waiting.]

We say that Γ is play learnable if there is a recursive learner g which produces, for every G ∈ Γ and every b ∈ {0,1}^ω, a sequence of moves a ∈ {0,1}^ω such that a⋆b ∈ G. From the above reductions between games and trees we also get the following equivalence:

Theorem 29 Play learning for closed recursive games is equivalent to branch enumerating for trees.

5.4 Counter Strategy Learning

Fortnow and Whang [6] and Freund et al. [7] have studied a notion of counter strategy learning in repeated matrix games. In these games there are always optimal strategies for the two players. But when the computational resources of the adversary are limited, he may not be able to execute the optimal strategy. Is it possible to learn, for every bounded adversary τ, a counter strategy which wins against τ? In this section we investigate the corresponding question for closed recursive games.

Fix a game G. Player I has no recursive winning strategy in G iff (∀σ ∈ REC)(∃τ)[σ⋆τ ∉ G]; τ is called a counter strategy to σ. Analogously, player II has no recursive winning strategy in G iff (∀τ ∈ REC)(∃σ)[σ⋆τ ∈ G]; here σ is a counter strategy to τ. We now consider games in which neither player has a recursive winning strategy. Do there exist games of this kind such that the counter strategies are recursive? If so, how difficult is it to find such recursive counter strategies? Note that in this setting the game G is fixed, and the learning algorithm gets different strategies for the adversary as input.

It is well known that if player II wins a closed game, then he has a recursive winning strategy. Thus, because the closed games are determined, in the games we are looking for player I has a winning strategy, but no recursive one. If player I has no recursive winning strategy in a closed recursive game G, then it is easy for player II to finitely learn recursive counter strategies, given a fixed strategy σ for player I as inductive input:

Theorem 30 Let G be a closed recursive game such that player I has no recursive winning strategy. Then player II can finitely learn recursive counter strategies to σ ∈ REC from the inductive input σ(0), σ(1), ….

By reusing tree classes already presented in this paper we get the following result:

Theorem 31 There are closed recursive games in which no player has a recursive winning strategy, but both players always have recursive counter strategies, such that the counter strategies for player I

1. cannot be uniformly computed, but can be EX-learned,
2. cannot be uniformly computed in K, but can be weakly BC-learned.
Proof: (1) Using the trees {V_e′ : e ∈ ω} of Proposition 18 and the canonical transformation of Section 5.3 we get a class Γ := {G_e : e ∈ ω} ⊆ I-GAMES such that

- G_e can be constructed uniformly from e,
- e is coded into the beginning of G_e,
- the winning strategies for player I cannot be uniformly computed from G_e ∈ Γ,
- Γ is winning strategy learnable.

The idea of the following construction is to put the games G_e together in an effective way into one game G. We define the game G := {0^{2e+1} 1 G_e : e ∈ ω} ∪ {0^ω}. By playing 0^e 1 player II can choose one of the games G_e; if he plays 0^ω, i.e. if he does not choose a G_e, player II loses. Because the winning strategies for player I cannot be uniformly computed, player I has no recursive winning strategy in G. Thus, every recursive strategy σ of player I loses in some game G_e, and player II can win against σ by choosing such a G_e. Assume now that the recursive strategy τ for player II is given. If τ does not choose any G_e then the 0-strategy wins against τ. If τ chooses a G_e then player I can win against τ by playing according to a recursive winning strategy in G_e. Because Γ ∈ WS, the counter strategies can be EX-learned: output the 0-strategy until player II chooses a G_e, then apply the learning algorithm for winning strategies.

If the counter strategies for player I could be computed uniformly from τ, then we could also uniformly compute winning strategies for player I in Γ: on input G_e, decode e from G_e and compute a counter strategy against 0^e 1^ω. From this counter strategy we can extract an infinite branch of the underlying tree V_e′, which yields a winning strategy for G_e.

(2) Analogously to (1), but using the trees of Theorem 13 instead of {V_e′ : e ∈ ω}.

Finally, we provide a game with full symmetry with respect to counter strategy learning. The proof makes essential use of a result of Jockusch and Soare [8]:

Theorem 32 There are closed recursive games in which no player has a recursive winning strategy, but both players always have recursive counter strategies which can be finitely learned.

Proof: We use the following fact.

Fact 33 (Jockusch/Soare [8], Theorem 4.7) There is a recursive tree U with at least two infinite branches β_0 and β_1 such that all infinite branches of U are pairwise Turing incomparable.
We define a closed recursive game G using the tree U of Fact 33. The intention of the game is that both players have to play an infinite branch of U, where the first node of player I's branch has to be 0 and the first node of player II's branch has to be 1. If both players play an infinite branch then player I wins (thus G is closed); otherwise, the player who first leaves U loses. Player I has a winning strategy for G: regardless of the moves of player II, play an infinite branch β_0 of U beginning with 0.

Assume now that player I has a recursive winning strategy σ. We consider the play α in which player I plays according to σ and player II plays an infinite branch β_1 of U beginning with 1. Since σ is a winning strategy, player I produces an infinite branch β_0 of U in the play α. Thus, the branch β_0 of U is recursive relative to β_1, which contradicts Fact 33. It follows that player I does not have a recursive winning strategy for G, and by Theorem 30 the counter strategies for player II can be finitely learned.

We now show that player I can also finitely learn counter strategies, by a similar argument. Fix a recursive strategy τ for player II. We search for an a = a_1…a_n ∈ U such that a_1 = 0 and τ loses the finite play against a, i.e. τ(⟨⟩) ≠ 1 or τ(⟨⟩)τ(⟨a_1⟩)…τ(⟨a_1…a_n⟩) ∉ U.

We claim that such an a exists: otherwise, consider an infinite branch β_0 of U beginning with 0. If an a with the above properties does not exist, then τ produces an infinite branch β_1 of U when playing against β_0. Thus the branch β_1 of U is recursive relative to β_0, which contradicts Fact 33. The strategy which plays a0^ω is a recursive counter strategy for player I against τ, and it can be finitely learned from τ.
6 Conclusion

We have introduced the notion of learning infinite branches of recursive trees as a natural generalization of learning functions. We have considered different learning criteria and have compared their power with that of uniform computation and enumeration of branches. Additionally, we have studied the effect of using an oracle, especially the oracle K, for learning and computing infinite branches of trees. Clearly, access to the oracle K helped to learn and compute branches for larger classes of trees. In a subsequent paper we want to investigate the exact relationship between the information content of an oracle and the question how much the oracle helps to improve the power of learning, computing or enumerating infinite branches of trees.

Branch learning has been shown to be equivalent to learning winning strategies for closed recursive games. This result is of theoretical significance. The concept of branch learning models well the robot navigation problem in unknown infinite recursive environments, where the robot has to learn how to reach the point at infinity. The theorems we have obtained for branch learning directly apply to the navigation of a robot in infinite environments. For example, Theorem 9 states that for EX-style learning the questions whether we can learn a complete model of the environment or only a path out of the infinite maze are incomparable.

We believe that there are also other interesting applications of our models. For example, consider the problem of learning to control a physical system [13], like a temperature controller. As we have shown in Section 5.1, such control problems can be modeled as Gale-Stewart games. It has already been suggested and demonstrated by others that infinite games are useful for solving control problems [12, 19, 21], but there the approach was to uniformly compute winning strategies. Often one does not have an exact mathematical model of the system to control; in such situations learning winning strategies would be a better approach. If the desired behaviour of the system is a safety property, the corresponding game is closed, and by our equivalence theorems this situation is covered by the branch learning model. Nevertheless, it would be interesting to develop and study learning models for more general specification languages, e.g. Boolean combinations of Π^0_2-predicates.

Acknowledgements: We would like to thank Frank Stephan for proofreading and comments. We would also like to thank the anonymous referees for their comments and suggestions, which helped to improve the motivation of our models.
References

[1] B. Awerbuch, M. Betke, R. L. Rivest, and M. Singh. Piecemeal graph exploration by a mobile robot. In Proceedings of the Eighth Annual Conference on Computational Learning Theory, pages 321–328, 1995.

[2] A. Blum and P. Chalasani. An on-line algorithm for improving performance in navigation. In 34th Annual Symposium on Foundations of Computer Science, pages 2–11, Palo Alto, California, 3–5 Nov. 1993. IEEE.

[3] J. R. Büchi and L. H. Landweber. Solving sequential conditions by finite-state strategies. Transactions of the American Mathematical Society, 138:295–311, 1969.

[4] J. Case, S. Kaufmann, E. Kinber, and M. Kummer. Learning recursive functions from approximations. In EuroCOLT'95, volume 904 of LNCS, pages 140–153. Springer-Verlag, 1995.

[5] L. Fortnow, W. Gasarch, S. Jain, E. Kinber, M. Kummer, S. Kurtz, M. Pleszkoch, T. Slaman, R. Solovay, and F. Stephan. Extremes in the degrees of inferability. Annals of Pure and Applied Logic, 66:231–276, 1994.
[6] L. Fortnow and D. Whang. Optimality and domination in repeated games with bounded players. In ACM Symposium on Theory of Computing (STOC), pages 741–749, 1994.

[7] Y. Freund, M. Kearns, Y. Mansour, D. Ron, R. Rubinfeld, and R. E. Schapire. Efficient algorithms for learning to play repeated games against computationally bounded adversaries. In 36th Annual Symposium on Foundations of Computer Science, pages 332–341, Milwaukee, Wisconsin, 23–25 Nov. 1995. IEEE.

[8] C. G. Jockusch and R. I. Soare. Π^0_1 Classes and Degrees of Theories. Transactions of the American Mathematical Society, 173:33–56, 1972.

[9] S. Kaufmann and M. Kummer. On a quantitative notion of uniformity. In Mathematical Foundations of Computer Science, volume 969 of LNCS, pages 169–178. Springer-Verlag, 1995.

[10] M. Kummer and M. Ott. Effective strategies for enumeration games. Technical Report 41/95, Fakultät für Informatik, Universität Karlsruhe, 1995. To appear in Proceedings of Computer Science Logic, CSL '95.

[11] A. H. Lachlan. On some games which are relevant to the theory of recursively enumerable sets. Annals of Mathematics, 91(2):291–310, 1970.

[12] O. Maler, A. Pnueli, and J. Sifakis. On the synthesis of discrete controllers for timed systems. In STACS 95, volume 900 of LNCS, pages 229–242. Springer-Verlag, 1995.

[13] E. Martin, D. Luzeaux, and B. Zavidovique. Learning and control from a recursive viewpoint. In IEEE International Symposium on Intelligent Control, Glasgow, Scotland, 1992.

[14] R. McNaughton. Infinite games played on finite graphs. Annals of Pure and Applied Logic, 65:149–184, 1993.

[15] P. Odifreddi. Classical Recursion Theory. North-Holland, Amsterdam, 1989.

[16] P. Odifreddi. Classical Recursion Theory (Volume II). North-Holland, Amsterdam, to appear.

[17] C. Papadimitriou and M. Yannakakis. Shortest paths without a map. Theoretical Computer Science, 84:127–150, 1991.

[18] R. L. Rivest and R. E. Schapire. Inference of finite automata using homing sequences. Information and Computation, 103(2):299–347, Apr. 1993.

[19] J. G. Thistle and W. M. Wonham. Control of infinite behavior of finite automata. SIAM Journal on Control and Optimization, 32(4):1075–1097, 1994.
[20] W. Thomas. Automata on infinite objects. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, pages 133–191. Elsevier Science Publishers B. V., 1990. [21] W. Thomas. On the synthesis of strategies in infinite games. In STACS 95, volume 900 of LNCS, pages 1– 13. Springer-Verlag, 1995. [22] W. Thomas and H. Lescow. Logical specification of infinite computations. In J. W. de Bakker, W.P. de Roever, and G. Rozenberg, editors, A Decade of Concurrency: Reflections and Perspectives, volume 803 of LNCS, pages 583–621. Springer-Verlag, 1993.