Program Synthesis by Learning and Planning

Ute Schmid and Fritz Wysotzki

Department of Artificial Intelligence, Institute of Applied Computer Science
Technische Universität Berlin, Franklinstr. 28-29, D-10587 Berlin
[email protected], [email protected]

In artificial intelligence, automatic programming has been a topic of research since the 1960s (cf. [BGK84], [LD89]). Its main concern is to model programming expertise with the aim of automating or supporting the process of software engineering. The applicability of fully automated synthesis systems is very restricted. Nevertheless, research in this area can lead to valuable models of reasoning and problem solving in general. The idea that enriching program synthesis systems with a memory component may improve performance is not a new one (cf. [MW75], p. 200), but the notion of combining program synthesis with learning has scarcely been exploited. In our research project we are elaborating this idea. Two different approaches to learning have been investigated: learning by problem solving and learning by analogy. Furthermore, the idea of learning by problem solving can be applied not only to the domain of automatic programming: we investigated the acquisition of generalized rules from problem solving by state-space search (e.g. [MUB93]) as well as the acquisition of correct behavior in dynamic environments by reinforcement learning. We argue that learning of this kind can also be seen as program synthesis, because the acquired rules can be regarded as specific programs (cf. [Sim93], p. 30).

Inductive Synthesis of Functional Programs

In learning by problem solving we are working on an integration of hierarchical planning and inductive program synthesis. While in recent years research interest has focused on the synthesis of logic programs (e.g. [Mug92]), we investigate the synthesis of functional programs ([Sum77], [Wys83]). The basic idea of this approach is to find a generalization over a given initial problem solution. This initial solution is represented by a partial program capturing computation traces or input-output relations. For example, if the action sequence for clearing the bottom block of a three-block tower is given, a general action scheme (a recursive program scheme, cf. [Eng74]) for clearing the bottom block of an n-block tower is learned:

clear(b; s) = if cleartop(b) then s

              else puttable(topof(b); clear(topof(b); s))

This recursive program scheme (RPS) can be further generalized to an abstract scheme for structurally equivalent problems, for example list union:

union(l; m) = if empty(l) then m
              else cons(head(l); union(tail(l); m))

The general RPS for clear and union is:

G(x; y) = if boolop(x) then y
          else op(op(x); G(op(x); y))

The initial partial solution of a general problem can be constructed by two different problem solving methods: heuristic search over a problem space ([Nil71]) or hierarchical planning ([Sac77], [Wys87]). We are working on the hypothesis that the problem representations obtained by these approaches differ with respect to the ease of inferring a recursive generalization and with respect to its complexity. We have started to implement both approaches in Common Lisp, but some theoretical problems remain to be solved. The general approach to inductive program synthesis is reported in chapter 1, the hierarchical planning approach in chapter 2.
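To make the role of the abstract scheme concrete, the following Python sketch (not part of the original work; all names and data representations are assumptions for illustration) treats G as a higher-order function and instantiates it for both examples. Because union needs head in the argument position and tail in the recursive call, the sketch passes two separate selectors where the abstract scheme above writes a uniform op.

# Illustrative rendering of the general RPS G and its instances clear and union.
def G(boolop, combine, select, step, x, y):
    # G(x, y) = if boolop(x) then y else combine(select(x), G(step(x), y))
    if boolop(x):
        return y
    return combine(select(x), G(boolop, combine, select, step, step(x), y))

# union(l, m) = if empty(l) then m else cons(head(l), union(tail(l), m)),
# with lists modelled as Python lists.
def union(l, m):
    return G(lambda xs: len(xs) == 0,        # boolop = empty
             lambda h, rest: [h] + rest,     # combine = cons
             lambda xs: xs[0],               # select  = head
             lambda xs: xs[1:],              # step    = tail
             l, m)

# clear(b, s) = if cleartop(b) then s else puttable(topof(b), clear(topof(b), s)),
# with block b represented by the list of blocks stacked on it (lowest first)
# and the situation s as a list of actions planned so far.
def clear(blocks_above_b, s):
    return G(lambda above: above == [],                        # boolop = cleartop
             lambda blk, sit: sit + ["puttable(%s)" % blk],    # combine = puttable
             lambda above: above[0],                           # select  = topof
             lambda above: above[1:],                          # step    = topof
             blocks_above_b, s)

assert union([1, 2], [3]) == [1, 2, 3]
assert clear(["B", "C"], []) == ["puttable(C)", "puttable(B)"]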

Programming by Analogy

In the analogy approach a new problem is solved by adapting the solution of an already solved, structurally similar problem. Learning is described as generalization over the structures of the newly solved problem and the "analogous" example problem retrieved from memory (cf. [AT89], chap. 4, [NH91]). Schmid ([Sch94]) proposed a hierarchical memory structure with RPSs as basic cognitive units. This memory structure is acquired bottom-up: when a new problem is solved by adapting a stored example solution, both problems are represented as predecessors of the inferred generalized RPS. If there are at least two generalized RPSs in memory, a further generalization over these abstract schemes will be inferred, resulting in a still more abstract scheme. Thereby the memory will become more organized over the learning episodes: the set of unconnected RPSs will be transformed into a single tree with a general scheme for recursive functions as its root. At the same time, this kind of memory organization will make searching memory for a sufficiently similar example problem more efficient with growing learning experience, even though the number of stored RPSs increases. The idea is to construct a retrieval function which searches the memory structure top-down, descending the path of the tree with the highest similarity evaluation. Another approach to memory organization, typically applied in case-based reasoning, is to represent solutions as clusters using a given similarity metric. This approach should be applied if subsumption is not possible.

Until now we have not worked on the formalization of memory organization, but we have started some work on the problem of similarity mapping. We are investigating two approaches. The first is to present the new problem by its initial solution steps and to expand the stored program schemes accordingly (see footnote 1); similarity can then be determined by a tree metric ([Lu79]). The second approach uses the structure mapping engine (SME) of Falkenhainer, Forbus and Gentner ([FFG89]); here the new problem is presented as a partial program without the recursive relation. Both approaches are reported in chapter 3.

Footnote 1: The initial steps for calculating the factorial of a natural number are n=0 → 1, n=1 → 1*1, n=2 → 2*1*1, n=3 → 3*2*1*1. The expansion of a recursive program scheme is realized by interpreting the scheme with some fixed values for the input parameter.

An architecture for a learning program synthesis system

We will continue our work in the areas described above with the aim of building an integrated learning system which exploits learning by problem solving as well as learning by analogy (see figure 1). The system behavior is determined by the degree of matching between a given problem and the example solutions stored in memory. If no adequate example is available, the system will try to synthesize the program from scratch by a general problem solving and program synthesis strategy. If a sufficiently similar example is found in memory, this example will be retrieved and adapted to the new problem. In both cases the system will learn by storing the problem solutions as well as a common generalization in the hierarchical memory. The proposed system architecture can be used as a computational model of problem solving and learning. Furthermore, program synthesis systems could become more efficient in combination with analogical learning.
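As an informal illustration of this control regime only, the top-level behavior could be sketched in Python as follows; none of the class or function names come from the paper, and all components are placeholder stubs.

from dataclasses import dataclass, field

@dataclass
class Memory:
    """Hypothetical stand-in for the hierarchical program memory."""
    programs: list = field(default_factory=list)

    def add(self, rps):
        self.programs.append(rps)

def similarity(problem, rps):
    # Placeholder for a similarity evaluation, e.g. a tree metric over
    # expanded schemes ([Lu79]) or a structure mapping ([FFG89]).
    return 0.0

def adapt_by_analogy(problem, rps):
    # Placeholder: modify the retrieved scheme to fit the new problem.
    return ("adapted", rps, problem)

def inductive_synthesis(problem):
    # Placeholder: plan or search an initial solution, then fold it into an RPS.
    return ("rps-for", problem)

def generalize(p, q):
    # Placeholder: infer a common, more abstract scheme from two schemes.
    return ("generalized", p, q)

def solve(problem, memory, threshold=0.8):
    best = max(memory.programs, key=lambda r: similarity(problem, r), default=None)
    if best is not None and similarity(problem, best) >= threshold:
        program = adapt_by_analogy(problem, best)     # programming by analogy
        memory.add(generalize(program, best))         # learn a common generalization
    else:
        program = inductive_synthesis(problem)        # synthesis from scratch
    memory.add(program)                               # store the solved problem itself
    return program

memory = Memory()
print(solve("clear the bottom block of a 4-block tower", memory))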

Figure 1: Outline of the architecture for a learning program synthesis system (components: Problem Specification; Mapping with a "sufficient?" test, yes/no; Inductive Program Synthesis; Programming by Analogy; Hierarchical Program Memory with retrieval of an example program; Learning; Executable Program).

Learning by problem solving: A broader view

The notion of learning by problem solving can be applied not only to the domain of automatic programming. Generally, learning may occur during all interactions with the environment, when solving a problem like the Tower of Hanoi puzzle as well as when learning how to ride a bicycle. The knowledge acquired is commonly a set of condition-action rules which can be regarded as a "behavioral program".

Problems such as the Tower of Hanoi puzzle are classically solved by state-space search ([Nil71]). A method for combining state-space search with learning was proposed by Newell and Rosenbloom ([NR81]); in this approach, simple production rules are combined when they are frequently applied in sequence during problem solving. We propose a different approach which is based on the construction of decision trees ([UW81]). In a first step, solution paths for different initial states of a problem are obtained by state-space search. In a second step, generalized rules for problem solving are inferred by the following procedure: the descriptions of the problem states are represented as feature vectors, and each description is associated with the operator applied to it during problem solving. These pairs of feature vectors and operators constitute the training examples of a concept learning problem, with the operators as classes. A decision tree is constructed in which each path classifies a given state with respect to the operator at the leaf of the path; this operator is the (optimal) action to be applied to the state. Operator application produces a new state, which is again classified by the decision tree, and this procedure is repeated until the goal state is reached. That means the tree implicitly contains the optimal problem solution (operator sequence) for each initial state, provided the training set supplies optimal actions (found by problem solving) which are generalized by the learning process over all problem states.
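A minimal sketch of this two-step procedure for the three-disc Tower of Hanoi is given below, assuming scikit-learn's decision-tree learner as a stand-in for the tree-construction algorithm of [UW81]; the state encoding, the goal peg and all names are choices made only for this illustration.

from itertools import product
from collections import deque
from sklearn.tree import DecisionTreeClassifier

PEGS, DISCS = (0, 1, 2), (1, 2, 3)

def moves(state):
    """Yield ((from_peg, to_peg), successor_state) for every legal move."""
    for i, j in product(PEGS, PEGS):
        if i == j:
            continue
        on_i = [d for d in DISCS if state[d - 1] == i]
        on_j = [d for d in DISCS if state[d - 1] == j]
        if on_i and (not on_j or min(on_i) < min(on_j)):
            succ = list(state)
            succ[min(on_i) - 1] = j
            yield (i, j), tuple(succ)

# Step 1: state-space search.  A backward breadth-first search from the goal
# yields, for every state, an operator lying on a shortest solution path.
goal = (2, 2, 2)
best_action, frontier = {}, deque([goal])
while frontier:
    s = frontier.popleft()
    for (i, j), t in moves(s):
        if t != goal and t not in best_action:
            best_action[t] = (j, i)   # undoing the move s -> t heads toward the goal
            frontier.append(t)

# Step 2: concept learning.  Feature vectors (the peg of each disc) are
# classified with the applied operator as class label.
X = [list(s) for s in best_action]
y = ["%d->%d" % a for a in best_action.values()]
tree = DecisionTreeClassifier().fit(X, y)

# Step 3: problem solving without search.  The tree is applied repeatedly,
# classifying each state with the operator to execute, until the goal is reached.
state, plan = (0, 0, 0), []
while state != goal:
    i, j = map(int, tree.predict([list(state)])[0].split("->"))
    plan.append((i, j))
    state = dict(moves(state))[(i, j)]
print(plan)   # the optimal 7-move solution for the 3-disc puzzle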

By using techniques of tree optimization, only the relevant features of the problem states are extracted. In a new problem solving situation, search can be omitted: the decision tree corresponds to a conditional program which represents a generalization over the problem states that actually occurred during problem solving. Such a program, which represents a general solution for a concrete problem (e.g. the Tower of Hanoi puzzle with three discs), may be used as input to a program synthesis system as described above. With the help of program synthesis techniques, a further generalization might be produced, ideally a program which represents the solution for a whole problem class (e.g. the Tower of Hanoi puzzle with n discs). The decision tree approach to learning by problem solving is reported in chapter 4.

Problems such as riding a bicycle constitute dynamic and continuous problem spaces: the actual problem state is influenced not only by the actions executed but also by parameters of the environment (e.g. road conditions), and the attributes of the problem space are real valued (e.g. speed). Learning in dynamic and continuous problem spaces can be modelled with reinforcement learning techniques. Barto, Sutton and Anderson ([BSA83]) proposed a reinforcement learning method for problems where only two discrete operations are needed (e.g. driving to the right vs. to the left). We generalized this approach to problems with n continuous operations (e.g. varying degrees of acceleration). Our approach is reported in chapter 5.
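The paper gives no algorithmic detail at this point; purely as an illustration of what a continuous operation means here, the following sketch implements a generic actor-critic with a real-valued Gaussian action on a toy regulation task. It is a stand-in in the spirit of generalizing the two-action adaptive elements of [BSA83], not the authors' method; the task, features and step sizes are arbitrary assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Toy task: keep a real-valued state s near 0 by choosing a continuous
# "acceleration" a; dynamics s' = s + a + noise, reinforcement r = -s'^2.
# Actor: a ~ N(w*s, sigma^2); critic: V(s) = v . [s^2, 1], trained by the
# TD error (playing the role of an adaptive critic element).
w, v = 0.0, np.zeros(2)
alpha, beta, gamma, sigma = 0.002, 0.02, 0.95, 0.3

def value(s):
    return float(v @ np.array([s * s, 1.0]))

for episode in range(3000):
    s = rng.normal()
    for t in range(30):
        mu = w * s
        a = mu + sigma * rng.normal()                       # continuous action
        s_next = float(np.clip(s + a + 0.05 * rng.normal(), -2.0, 2.0))
        r = -s_next ** 2
        delta = r + gamma * value(s_next) - value(s)        # TD error
        v += beta * delta * np.array([s * s, 1.0])          # critic update
        w += alpha * delta * (a - mu) / sigma ** 2 * s      # actor (policy-gradient) update
        s = s_next

print(w)   # should drift toward a negative value: the learned action counteracts s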

Conclusion

Program synthesis, generalized hierarchical planning, learning by analogy, learning in state-space search, and reinforcement learning are all facets of learning from experience. Humans acquire a large repertoire of intelligent behavior by experience: they extract the relevant features of the problems they are confronted with and thereby infer knowledge which they can exploit when similar situations occur in the future. The larger the amount of experience in a domain, the higher the probability that a problem will be solved successfully. Our work tries to make this idea fruitful for artificial intelligence.

References

[AT89] J.R. Anderson and R. Thompson. Use of analogy in a production system architecture. In S. Vosniadou and A. Ortony, editors, Similarity and Analogical Reasoning, pages 267-297. Cambridge University Press, 1989.

[BGK84] A.W. Biermann, G. Guiho, and Y. Kodratoff, editors. Automatic Program Construction Techniques. Collier Macmillan, London, 1984.

[BSA83] A.G. Barto, R.S. Sutton, and C.W. Anderson. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man and Cybernetics, 13(5), 1983.

[Eng74] J. Engelfriet. Simple Program Schemes and Formal Languages. Springer, Berlin, 1974.

[FFG89] B. Falkenhainer, K.D. Forbus, and D. Gentner. The structure-mapping engine: Algorithm and examples. Artificial Intelligence, 41:1-63, 1989.

[LD89] M. Lowry and R. Duran. Knowledge-based software engineering. In A. Barr, P.R. Cohen, and E.A. Feigenbaum, editors, The Handbook of Artificial Intelligence, volume IV, pages 241-322. Addison-Wesley, Reading, Mass., 1989.

[Lu79] S. Lu. A tree-to-tree distance and its application to cluster analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-1(2):219-224, April 1979.

[MUB93] T.M. Mitchell, P.E. Utgoff, and R. Banerji. Learning by experimentation: Acquiring and refining problem-solving heuristics. In R.S. Michalski, J.G. Carbonell, and T.M. Mitchell, editors, Machine Learning: An Artificial Intelligence Approach, volume 1, chapter 6, pages 163-190. Springer, 1993.

[Mug92] S. Muggleton. Inductive logic programming. In Inductive Logic Programming. Academic Press, 1992.

[MW75] Z. Manna and R. Waldinger. Knowledge and reasoning in program synthesis. Artificial Intelligence, 6:175-208, 1975.

[NH91] L.R. Novick and K.J. Holyoak. Mathematical problem solving by analogy. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14:510-520, 1991.

[Nil71] N.J. Nilsson. Problem-Solving Methods in Artificial Intelligence. McGraw-Hill, 1971.

[NR81] A. Newell and P.S. Rosenbloom. Mechanisms of skill acquisition and the law of practice. In J.R. Anderson, editor, Cognitive Skills and Their Acquisition. Erlbaum, Hillsdale, N.J., 1981.

[Sac77] E.D. Sacerdoti. A Structure for Plans and Behavior. North-Holland, Amsterdam, 1977.

[Sch94] U. Schmid. Erwerb rekursiver Programmiertechniken als Induktion von Konzepten und Regeln (Acquisition of Recursive Programming Skills as Induction of Concepts and Rules), volume 70 of DISKI. infix, Sankt Augustin, 1994.

[Sim93] H.A. Simon. Why should machines learn? In R.S. Michalski, J.G. Carbonell, and T.M. Mitchell, editors, Machine Learning: An Artificial Intelligence Approach, volume 1, chapter 2, pages 25-38. Springer, 1993.

[Sum77] P.D. Summers. A methodology for LISP program construction from examples. Journal of the ACM, 24(1):162-175, 1977.

[UW81] S. Unger and F. Wysotzki. Lernfähige Klassifizierungssysteme. Akademie-Verlag, Berlin, 1981.

[Wys83] F. Wysotzki. Representation and induction of infinite concepts and recursive action sequences. In Proceedings of the 8th IJCAI, Karlsruhe, 1983.

[Wys87] F. Wysotzki. Program synthesis by hierarchical planning. In Artificial Intelligence: Methodology, Systems, Applications. Elsevier Science, 1987.
