IEICE TRANS. INF. & SYST., VOL.E83–D, NO.2 FEBRUARY 2000
200
PAPER
Evolving Autonomous Robot: From Controller to Morphology
Wei-Po LEE†, Nonmember
SUMMARY Building robots is generally considered difficult, because the designer not only has to predict the interactions between the robot and the environment, but also has to deal with the consequent problems. In recent years, evolutionary algorithms have been proposed to synthesize robot controllers. However, evolving the control system alone is not sufficient, because the performance of the control system depends on other hardware parameters — the robot body plan — which might include body size, wheel radius, motor time constant, etc. Therefore, the robot body plan itself should, ideally, also adapt to the task that the evolved robot is expected to accomplish. In this paper, a hybrid GP/GA framework is presented to evolve complete robot systems, including controllers and bodies, to achieve fitness-specified tasks. To assess the performance of the developed system, we first use it to evolve controllers for a variety of tasks on a fixed robot body plan, and then use it to evolve complete robot systems. Experimental results show the promise of our system.
key words: evolutionary robotics, learning robot, artificial life, evolutionary computing
1. Introduction
Manuscript received May 6, 1998. Manuscript revised March 26, 1999.
† The author is with the Department of Management Information Systems, National Pingtung University of Science and Technology, Pingtung, Taiwan.

Artificial life research complements traditional biological science, which concerns the analysis of living systems, by synthesizing artificial systems that can exhibit the behaviors of natural living systems. In this research field, evolutionary techniques have been widely applied to evolve systems that exhibit life-like behaviors. In recent years, artificial life techniques (i.e., evolutionary algorithms) have been proposed to synthesize control systems for robots [1], [2]. The central idea is that, by exploiting the self-adaptation of evolutionary algorithms, we can synthesize robot behaviors that come closer and closer to what we expect, over the generations and through fitness improvement. Based on the evolutionary technique employed, work on evolving robot controllers can be categorized into Genetic Algorithm (GA)-based work (e.g., [2]–[4]) and Genetic Programming (GP)-based work (e.g., [5], [6]). In general, GA-based work encodes the relevant parameters of a controller (such as the weights and thresholds of a neural network) into a linear string, and then uses a standard GA to breed values for these parameters. On the other hand, a GP-based work
uses a tree structure to represent a control program and GP techniques to breed control programs. According to the literature, using GP to evolve robot controllers has the innate advantage of operating on variable-size genotypes. This is an important feature in developing control systems because it provides complete freedom to evolve the morphology of the controllers. Nevertheless, GP-based work has been criticized for needing a relatively large population size, which in practice means more computational resources. This is an especially unfavorable feature when evolving robot controllers, which is already time-consuming. Therefore, designing a system that retains the advantages of GP while avoiding its disadvantages is crucial for evolving robots.

In addition to the controller, we observe that robot morphology, or the body plan, also plays an important role in generating robot behaviors. The body plan here means the physical structure of a robot and can be specified as a set of structural parameters. In the case of a mobile robot, for example, they may be the wheel radius, the motor time constant, the number and positions of sensors, and so on. Sometimes, a small change in structure means a huge amount of effort in the design of the control mechanism. Thus, ideally, the robot body plan itself should also adapt to the task we want the evolved robot to achieve. In fact, it is reasonably evident that in nature the structure and abilities of a creature's body are closely related to the kinds of behaviors it has; the physical patterns need to be operated according to certain skillful strategies in order to generate the special behaviors which are beneficial for the creature. Any modification of a creature must be profitable in some way to the modified form, and it is often influenced by changes of other correlated parts of the organization — changes of the physical parts or of the strategies operating those parts.
In short, the control system and the morphology evolve together over generations. In this paper, we present a framework in which the controller and morphology of a robot evolve together. Consequently, such a robot creature can improve its performance through changes to its control system or its body plan. In this system, the robot consists of two parts: a tree-like controller and a string of structural parameters. GP and GA subsystems are implemented to evolve the controller and the values of the parameters,
respectively. Because the developed genetic representation captures the important characteristics of robot control, our GP subsystem can evolve high-performance controllers using a relatively small population size. To demonstrate the efficiency of this GP subsystem, we first use it to evolve controllers for a robot with a fixed body to achieve a variety of tasks, including obstacle avoidance, box-pushing, and safe exploration. Then, we use the whole evolutionary system to evolve complete robot creatures to achieve an obstacle avoidance task; that is, we explore the issue of evolving both robot controller and morphology. We also conduct a series of experiments to verify that both the controllers and the body plans of the evolved robots have adapted to the specific task. The results show the promise of our system.

2. Background
Fig. 1 The general flow of a simulated evolution mechanism.
2.1 Genetic Algorithms and Genetic Programming

Genetic Algorithms (GAs) [7] are currently among the most popular models of the natural evolutionary process for solving problems of high complexity. In GAs, an individual is constituted by a string of genes, each of which is regarded as carrying a genetic feature. An individual with better fitness is said to have genetic features that are capable of solving the problem, and these features are expected to propagate to subsequent populations by means of duplicating or recombining the gene sequences of the parent population. In general, an individual is represented as a fixed-length string, often, but not necessarily, in the form of a bit string. Most importantly, the design of a representation must ensure that any potential solution in the search space of the problem to be solved can be expressed as a string of the developed representation. This requires considerable knowledge of and insight into the problem.

A GA operates as an iterative cycle which, after the initialization phase, includes a sequence of selecting parents and creating children. Figure 1 illustrates the general flow of an evolution mechanism. Selection involves probabilistically choosing individuals from the current population as parents to generate offspring that form a new population; recreation applies genetic operators, such as reproduction, crossover, and mutation, to the selected individuals to generate new ones. Among the different operators, crossover is the major one, creating most of the offspring; it is generally implemented as alternately copying gene sequences, separated by randomly chosen crossover points, from the selected parents. Figure 2 (a) shows a typical example of crossover in GA.

A variant of Genetic Algorithms, named Genetic Programming (GP), was recently invented by John Koza [8], and its popularity is now increasing in the evolutionary computation research community.
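The one-point crossover just described can be sketched as follows (a minimal illustration, not the paper's implementation; the bit-string encoding and function name are assumptions):

```python
import random

def one_point_crossover(parent_a, parent_b):
    """Swap the tails of two equal-length gene strings at a random point."""
    assert len(parent_a) == len(parent_b)
    point = random.randrange(1, len(parent_a))  # crossover point, never at the ends
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

# Example with two bit strings, as in Fig. 2 (a):
a, b = [0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]
c, d = one_point_crossover(a, b)
```

Since the tails are exchanged wholesale, the two children together carry exactly the genes of the two parents.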
Fig. 2 (a) An example of one-point crossover in GA; (b) an example of crossover in GP.

It is an extension of the traditional GAs, with the basic distinction that the individuals are dynamic tree structures rather than fixed-length vectors. GP aims to evolve dynamic, executable structures, often interpreted as computer programs, to solve problems without explicit programming. As in computer programming, a GP tree is constituted by a set of non-terminals as the internal nodes of the tree, and a set of terminals as the external nodes (leaves) of the tree. The construction of a tree is based on syntactical rules which extend a root node to appropriate symbols (non-terminals and/or terminals), and each non-terminal is extended again by
suitable rules accordingly, until all the branches of the tree end in terminals. Hence, the first step in applying GP to solve a problem is to define appropriate non-terminals, terminals, and the associated syntactical rules for program development. The search space in GP is the space of all possible tree structures composed of the non-terminals and terminals. As in GAs, the operators of reproduction, crossover, and mutation are used to create new tree individuals. Reproduction simply copies the original parent tree to the next generation; crossover randomly swaps subtrees between two parents to generate two new trees; and mutation randomly regenerates a subtree of the original parent to create a new individual. Because of this tree structure, when crossover is performed, all syntactic constraints must be satisfied to guarantee the correctness of the new trees. An example of GP crossover is illustrated in Fig. 2 (b).

2.2 Evolving Controllers

The process of evolving controllers is similar to traditional evolution-based work. It first specifies a fitness function which evaluates a robot behavior, and then begins the evolution process: generating an initial population of control systems on a simulated robot; using the fitness function to evaluate each control system in the population; and applying genetic operators to the current robot population to create a new population, according to the fitness. By the self-adaptation of the internal control structure, the robot is expected to generate behaviors closer and closer to what is desired. Evolutionary computation techniques have been employed to develop robot controllers: in [2], [4], a GA-based system was used to evolve neural networks as controllers; in [3] a GA was used to evolve classifier systems; and in [5], [6], GP was used to evolve tree-like programs for robot control.
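The subtree crossover just described can be sketched with a toy tree representation (nested lists; the helper names and the unconstrained node choice are illustrative simplifications, since the paper's version must also respect syntactic constraints):

```python
import copy
import random

# A GP individual is a nested list [op, child, child, ...]; a bare string is a terminal.

def all_paths(tree, path=()):
    """Enumerate the index path to every subtree, the root included."""
    paths = [path]
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            paths += all_paths(child, path + (i,))
    return paths

def get_at(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def set_at(tree, path, subtree):
    if not path:
        return subtree
    get_at(tree, path[:-1])[path[-1]] = subtree
    return tree

def subtree_crossover(a, b):
    """Swap one randomly chosen subtree between two parent trees."""
    a, b = copy.deepcopy(a), copy.deepcopy(b)
    pa, pb = random.choice(all_paths(a)), random.choice(all_paths(b))
    sa, sb = copy.deepcopy(get_at(a, pa)), copy.deepcopy(get_at(b, pb))
    return set_at(a, pa, sb), set_at(b, pb, sa)

parent1 = ["AND", "x", ["OR", "y", "z"]]
parent2 = ["NOT", "w"]
child1, child2 = subtree_crossover(parent1, parent2)
```

Because whole subtrees are exchanged, every leaf of the two parents ends up in exactly one of the two children.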
Though both GA and GP can be used to evolve controllers for different tasks, both have disadvantages, as described in Sect. 1. In Sect. 4, we will show how, by defining an appropriate representation for controllers, our GP system can use a relatively small population size to evolve reliable and robust control systems in a short time.

2.3 Evolving Morphology

When the brain and body evolve together, they adapt to each other — a brain does not do much without a body, and vice versa. Yet the fact that the body largely determines the performance of the brain (or the control mechanism) has previously been ignored by the research communities of evolutionary computation, artificial life, and robotics, in which contributions have mostly been made in developing control systems for agents with a fixed structure. Although some researchers occasionally mention evolving the morphology of the robot as well as the controller, their work is limited to evolving the placements or acceptance angles of sensors (e.g., [6], [9]). In nature, whole body plans evolve: biological data suggest that the body plans of all animals, from fruit flies to elephants, are controlled by the same kinds of genes, namely the Hox genes [10]. The body plans of the first multicellular animals are believed to be largely the work of a primitive set of Hox genes, and descendants of these genes have been sculpting the body plans of animals ever since. For example, genetic experiments with homeotic genes in mice have demonstrated that Hox genes are in part responsible for the specification of segmental identity along the anterior-posterior axis; it has been proposed that an axial Hox code determines the morphology of individual vertebrae [11], and Hoxa-1 and Hoxa-2 have been shown to play a critical role in head development in both mice and Drosophila. These biological facts about Hox genes suggest that the evolution of body plans can be modeled by having parts of a genetic string express the growth of different body plans. The idea of a developmental morphogenetic model has been investigated in simulation, e.g., [12], [13], but only to develop the neural structure, not the physical structure, of an agent. Sims used Lindenmayer's L-system (i.e., [14]) to evolve both the controller and the 3D rigid parts of a creature acting in a virtual poly-world [15]. However, impossible regions in the morphological space (described in [16]) are not considered in Sims' work, so his approach tends to be unworkable for real robots. In fact, his approach is more suitable for animation in computer graphics than for the design of a physical robot system. A more fruitful approach should integrate our knowledge of robots with that of genetics.
Funes and Pollack used a GA to evolve buildable objects such as LEGO bridges [17], but not complete robot creatures including controllers and physical structures. Our first approach toward evolving robot body plans uses a simple direct encoding from gene to physical expression, since our immediate goal is to show the validity of evolving hardware structures rather than the validity of a developmental model. The genotype of a robot comprises a tree structure, used as the controller, that determines the number of sensors and their positions, and a list of real numbers that determines the robot body plan (i.e., body size, wheel base, wheel radius, etc.). With evolutionary computation techniques, we can then evolve both the robot brain and body.

3. Evolving Complete Robot System
3.1 System Overview

Fig. 3 The structure of an individual defined in this work: in the tree structure, a node with N/T is a non-terminal/terminal node; in the string representation, Pi is a real number.

In our evolutionary system, an individual is a complete robot creature; that is, it consists of a control system and a body plan, represented as a pair (brain, body), in which the brain is a tree structure specifying a robot controller and the body is a string of real numbers quantitatively specifying the physical structure of the robot. Figure 3 shows the typical structure of an individual. To evolve a complete robot, we design a hybrid system of GP and GA; the GP part evolves control systems, and the GA part evolves body plans. In such a system, evaluating the performance of a robot individual means executing its brain controller on its own body plan for a period of time. Because of the difficulty of conducting on-line body reconfiguration for a robot, we use an off-line approach instead: the performance evaluation is done in simulation, on the assumption that we can build a robot according to the evolved result. This system can also be used to evolve control systems for a robot with a fixed body plan, by blocking the GA subsystem.

For the creation of new robot individuals, the genetic operators reproduction, crossover, and mutation are used. The reproduction operation simply copies the selected parent individuals into the next generation, without changing the controllers or body plans. The crossover or mutation operation is allowed to take place on the brain or the body at random. Because of the special structure of an individual here, however, crossover is constrained to occur on both brains or both bodies of the involved parents, to maintain the syntactical feasibility of the child individuals. Besides, as the representations of the brain part and the body part differ, this work requires separate crossover and mutation operations for the tree expressions and the linear strings. The related GP and GA techniques are thus applied independently to the brains and the bodies.
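The constrained recombination scheme above can be sketched as follows (a hypothetical encoding; the `Robot` class, the 50/50 operator choice, and the stand-in operators are illustrative assumptions, not the paper's implementation):

```python
import copy
import random

class Robot:
    """A complete individual: a tree-structured brain plus a real-valued body string."""
    def __init__(self, brain, body):
        self.brain = brain   # nested lists/tuples, e.g. ("PROG", ...)
        self.body = body     # e.g. [wheel_radius, wheel_base, time_const, size]

def crossover(p1, p2, brain_crossover, body_crossover):
    """Crossover acts on both brains OR both bodies, never across representations,
    so every child remains a syntactically feasible (brain, body) pair."""
    c1, c2 = copy.deepcopy(p1), copy.deepcopy(p2)
    if random.random() < 0.5:
        c1.brain, c2.brain = brain_crossover(c1.brain, c2.brain)
    else:
        c1.body, c2.body = body_crossover(c1.body, c2.body)
    return c1, c2

# Trivial stand-in operators for demonstration:
swap_whole = lambda a, b: (b, a)
r1 = Robot(("PROG", "IR0"), [2.0, 11.0, 1.0, 15.0])
r2 = Robot(("PROG", "IR3"), [1.5, 12.0, 0.8, 20.0])
c1, c2 = crossover(r1, r2, swap_whole, swap_whole)
```

Whichever branch is taken, exactly one of the two components (brain or body) is exchanged between the children.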
3.2 Evolving Brains with Sensors — The GP Part

A brain here means a behavior controller and is represented by a combinational/sequential circuit. In this
work, we concentrate on evolving reactive behavior controllers, which are represented by combinational circuits; the same approach can be used to evolve sequential ones. By duplicating and separating those components whose outputs serve as inputs of multiple components, and by introducing a dummy root node to connect the outputs of a circuit together, it is straightforward to convert a combinational circuit to a tree, called a circuit tree. In other circuit-based work [18], [19], an input variable to a circuit is normally a sensor response, specifying information such as distance or brightness, thresholded by a pre-defined value. But in real control work, the threshold is normally unknown and, as the robot continuously senses the environment to monitor variation, it not only checks whether certain sensors reach some threshold, but also observes the differences between sensors to determine its actions. Therefore, in our representation, we structure the perceptual information into sensor conditionals and connect them to the inputs of a logic circuit. A sensor conditional has a constrained syntactic structure; it takes the form X > Y, where X and Y can be any normalized sensor response or a threshold determined genetically. In our work, a robot is assumed to have a physically round body with sensors positioned around the body pointing radially outward. Depending on the characteristics of the specific tasks, different kinds of sensors are required. A sensor is defined to be associated with a value between 0 and 1 which indicates the angle between the direction in which the sensor points and the robot's heading. Thus, whenever a sensor is called in the control system, the normalized sensor response is returned, in the direction indicated by the value associated with that sensor.
For instance, an infra-red sensor with a value of 0.3 (written as "IR 0.3" in a representation) returns the normalized sensor response in the direction 0.3 of a revolution (108 degrees) anti-clockwise, relative to the robot's heading. In this way, the sensor positions/directions are also allowed to evolve if the sensors are adjustable [20]. For a robot with fixed sensors, the values associated with the sensors are constrained (and normally replaced by identification numbers, as we will see in Sect. 4), subject to the availability of the sensors. After organizing our genetic representation, we then define the non-terminals and terminals which constitute a circuit tree for our GP system. Three types of non-terminals are defined: the dummy root node, the comparator, and the logic components. The dummy root node collects the main outputs of a control system for convenient manipulation by the GP system; the comparator constructs the sensor conditionals; and the logic components constitute the main frame of the controller, mapping the structured sensor information into appropriate actuator commands. As mentioned above, because the elements in a sensor conditional can be normalized sensor responses or numerical thresholds, both of them are defined as terminals.

Fig. 4 Diagram of a typical controller.

Figure 4 illustrates an example of a robot's brain. In such a structure, the outputs of the subtrees of a circuit tree are interpreted as commands (i.e., mapped into a table of motor commands) to drive the actuators. To evolve instances that solve different control tasks, we have to define different sensor terminals, depending on the requirements of the specific tasks. For example, we may define infra-red (IR) sensors as sensor terminals to detect walls and objects for an obstacle avoidance task; we may also define ambient light sensors as terminals to sense light for a phototaxis task. The following experiments give a detailed account of how to evolve such controllers. As mentioned, three genetic operators — reproduction, crossover, and mutation — are used to create new controllers. Because some constrained syntactic structures are defined in this work, the crossover operation for the tree controllers must be restricted to ensure syntactic correctness. Apart from the swapping in crossover, an extra operation, averaging, is introduced to average the values associated with the sensor terminals, providing a way to shift sensors gradually.

3.3 Evolving Body Plans — The GA Part

In order to evolve a robot body plan, we need to analyze what constitutes a body plan and to extract the determining elements, which profoundly affect the behavior of a robot, from the physical structure of the robot. In mobile robot design, for instance, the determining elements include the wheel radius, the width of the wheel base, the time constant of the motion system, the body size (the diameter of the body, if we assume the robot body is round), and the positions (with orientations) of the sensors.
The wheel radius affects the speed of the robot and determines the maximum and minimum moving speeds for the specified motor commands; the width of the wheel base determines the turning rate of the robot; the time constant affects the response of the robot and determines its acceleration; the size of the robot body should be task-oriented: to avoid obstacles it may need to be smaller, but to push boxes it may need to be larger; and the positions and orientations of the sensors allow the robot to acquire the perceptual information it needs. To evolve a robot body plan, in fact, means to decide these determining structural parameters of a robot genetically. In our system, the structural parameters (except, as we have seen, the sensor positions/directions) are arranged as a linear string, in which each position is filled with a real number representing the value of the corresponding parameter. Due to hardware limitations and performance considerations, each structural parameter has its own lower and upper bounds; when we build a robot, the value of each structural parameter must lie within its bounds. Thus, a robot body can be expressed as:

P1 P2 . . . Pn,  where Min(Pi) ≤ Pi ≤ Max(Pi), 1 ≤ i ≤ n.
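The bounded parameter string, together with a swap-or-average two-point crossover of the kind used for the body strings, can be sketched as follows (parameter names and bounds loosely follow Sect. 5.1; the wheel-base coupling to body size is omitted for simplicity, and all names are illustrative):

```python
import random

# Assumed structural parameters and (lower, upper) bounds:
BOUNDS = {
    "time_constant": (0.5, 2.5),   # seconds
    "wheel_radius": (1.0, 3.0),    # cm
    "body_size": (10.0, 25.0),     # cm
}

def random_body():
    """A body plan is a list of real numbers, one per structural parameter."""
    return [random.uniform(lo, hi) for lo, hi in BOUNDS.values()]

def body_crossover(p1, p2):
    """Between two crossover points, each gene pair is either swapped or averaged;
    averaging provides gradual fine-tuning of the structural parameters."""
    c1, c2 = list(p1), list(p2)
    i, j = sorted(random.sample(range(len(p1) + 1), 2))
    for k in range(i, j):
        if random.random() < 0.5:
            c1[k], c2[k] = c2[k], c1[k]
        else:
            c1[k] = c2[k] = (p1[k] + p2[k]) / 2.0
    return c1, c2

b1, b2 = random_body(), random_body()
o1, o2 = body_crossover(b1, b2)
```

Both swapping and averaging keep every offspring gene inside its parameter's bounds, so no repair step is needed.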
For this linear string, a Genetic Algorithm is employed to determine the value of each structural parameter Pi within its own range. As in a standard GA, the operators of reproduction, crossover, and mutation are used to create new body strings. But the function of crossover differs slightly from the standard one: for each pair of Pi between the two chosen crossover points, it randomly performs either a swapping or an averaging operation. The use of averaging is to change the structural parameters gradually, for fine tuning.

3.4 Implementation Details

In our experiments, a fitness function is in fact a penalty function. This differs from traditional GAs, but is often used in GP-based work. In our evolutionary system, an island model is implemented for performance enhancement, as in [21]. The populations are configured as a binary n-cube, and the migration policy is that a certain number of the best individuals of each population (10% in this work) are sent to replace the same number of worst individuals in its immediate neighbor populations at a regular interval (ten generations in this work). In each experimental run, two populations of 50 individuals were used. During the run, each individual was evaluated in multiple trials, starting from different positions, to ensure the robustness and reliability of the results. The rates of reproduction, crossover, and mutation in the following experiments were 10%, 85%, and 5%, respectively. In addition, to prevent trees from growing without limit, the initial depth limit for a tree was 6 and the maximum depth after crossover and mutation was 13. As mentioned above, we adopt an off-line approach in which the performance evaluation of an individual is done in simulation. The simulator is independent
of the evolutionary system; it depends only on the robot used. In the experiments of this work, we use two kinds of simulators. The first is built by a look-up table method [22]. Before the simulation, it records the responses of each sensor to each object in the environment, through the robot's sensors themselves, and records the effects (such as displacement in distance or angle) of the actuators for each allowable command; it then accesses the sensor/actuator tables during the simulation. The other is built by a mathematical modeling method; it describes the responses of the sensors and the dynamics of the actuators by a set of equations. A sensor has its own visual distance and bearing, and the motor system converts commands (outputs of a circuit tree) into actual speeds (actual angular velocities). The robot is driven by two independent motors with separate speed commands. This enables the robot to move forward or backward and to turn right or left at any possible speed. In order to make the simulation more realistic, random noise (±5%, uniformly distributed) is injected into the perceptions and the motions.

4. Experiments in Evolving Controllers
Having described our evolutionary framework, in this section we use it, with the miniature robot Khepera, which has a non-reconfigurable body (see Fig. 5), to evolve controllers for three different types of tasks, to assess the generality and performance of our GP subsystem. These include obstacle avoidance, which has been widely used as a testbed for other systems; box-pushing, which involves two movable objects in the environment and can be regarded as the simplest case of multi-robot interaction; and an exploration task in which the fitness of a controller can only be assigned after a complete trial, rather than as a sum of fitness over each time step as in most other reactive tasks. Since the main goal of the experiments is to demonstrate the correctness and efficiency of our GP work, we present only simulation results. The performance of the simulator has been shown in our preliminary study:
Fig. 5 The Khepera miniature robot and its sensor arrangement. In the right figure, a sensor Si can function as an infra-red or an ambient light sensor.
the evolved controllers can be downloaded to the real robot without loss of performance [23].

4.1 Obstacle Avoidance

This task requires the robot to move forward as straight as possible without bumping into any obstacle. To achieve such a task, the robot needs its IR sensors to detect obstacles. Hence, two kinds of terminals, IRs and numerical thresholds, were defined. The second step is to define a fitness function to guide the evolution. For this task, we can formulate the fitness function, through a quantitative description of the expected robot behavior, as keeping the maximum response of the eight IRs low, to keep the robot away from obstacles; and keeping the forward speed high and the speed difference between the two motors low, to encourage the robot to move forward and to discourage it from rotating. Thus, the fitness function was defined as:

f = Σ_{t=1}^{T} {α × max_response(t) + β × [1 − v(t)] + γ × w(t)}

in which max_response(t) is the largest IR response at time step t; v(t) is the normalized forward speed; w(t) is the absolute value of the normalized rotation speed; and α, β, γ are the corresponding weights expressing the relative importance of the three criteria. To illustrate the performance of this GP system, the fitness curves over the ten runs are shown in Fig. 6 (a). They indicate that the GP system is considerably efficient: the performance improved quickly and stably. The typical obstacle avoidance behavior evolved is shown in Fig. 6 (b). As can be seen, the task was achieved successfully: the robot moved straight forward most of the time, but was able to move away from obstacles whenever it approached them.

Fig. 6 (a) Population average fitness and best individual fitness at each generation; values are averaged over ten runs. (b) The trajectory of the simulated robot (the thick circles) when it executed the evolved controller.

4.2 Box Pushing

In the box-pushing task, the robot must keep pushing a box forward as straight as possible. The robot needs its IR sensors to acquire perceptual cues for the location of the box. The terminals for this task were therefore defined as in the previous task: IRs and numerical thresholds. Again, a fitness function was needed; it was formulated as keeping the activation value of the robot's front IR sensors high, the robot moving fast forward, and the speed difference between the two motors low. The pressure from keeping the front IR sensors highly activated reinforces the robot to approach a box; the pressure from keeping the robot moving fast forward with a low speed difference between the two motors encourages the robot to move straight and prevents it from getting stuck in front of a box. The combination of these can lead to a pushing-forward behavior. Thus, the fitness function for evolving a box-pushing behavior controller was defined as:

f = Σ_{t=1}^{T} {α × [1 − s(t)] + β × [1 − v(t)] + γ × w(t)}

in which s(t) is the average of the normalized sensor activations of the front sensors IR2 and IR3; v(t) is the normalized forward speed; and w(t) is the normalized speed difference of the two motors at each time step t. Figure 7 (a) shows the fitness curves over the ten independent runs, and Fig. 7 (b) presents the evolved box-pushing behavior of the simulated robot, demonstrating that the task was achieved successfully. To prove its reliability and robustness, the evolved controller was tested several times, each time with the robot starting from an arbitrary position and heading around the box. During the tests, the robot always generated consistent behavior: it turned to face the box, approached it, and then pushed it; at some time steps, the robot produced prompt turns to recover from path deviations and head towards the box again.
Fig. 7 (a) The fitness curves averaged over ten runs for evolving box-pushing controllers; (b) the typical box-pushing behavior of the robot.
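The per-time-step penalty fitness for obstacle avoidance can be sketched as follows (a minimal sketch; the trace format, default weights, and function name are assumptions, and a lower f is better since the fitness is a penalty):

```python
def obstacle_avoidance_fitness(trace, alpha=1.0, beta=1.0, gamma=1.0):
    """Sum of per-step penalties: a high maximum IR response, a low forward
    speed, and rotation are all penalized. Each step is (ir_responses, v, w),
    with v, w, and the IR responses normalized to [0, 1]."""
    f = 0.0
    for ir_responses, v, w in trace:
        f += alpha * max(ir_responses) + beta * (1.0 - v) + gamma * abs(w)
    return f

# A trace far from obstacles, moving straight at full speed, scores zero penalty;
# a nearby obstacle, half speed, and rotation are all penalized.
good = [([0.0] * 8, 1.0, 0.0), ([0.0] * 8, 1.0, 0.0)]
bad = [([0.9] + [0.0] * 7, 0.5, 0.4)]
```

The box-pushing fitness has the same shape, with the obstacle-proximity term replaced by a penalty on low front-sensor activation.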
4.3 Safe Exploration

In this task, the robot is required to wander safely in an enclosure and visit as much of the enclosed space as possible. The task can be described quantitatively as follows: the space is divided into grid squares, and the robot must visit as many squares as possible during a fixed period of time. There are different ways to achieve this task. For instance, it can be achieved by using a map to provide location information to the robot. With this information, the robot knows its own location and the locations of the squares it has already visited, and can head towards the squares not yet visited according to the map's records. Alternatively, when no location information is available, a map-free strategy must be used. In such a strategy, the robot does not know where it is or which squares it has not yet visited. To carry out the exploration task, the robot then needs to determine its turning angle carefully whenever it senses the boundary of the enclosure. In our experiment, we intended to evolve a reactive controller for exploring a space without location information. Since IR sensors were the only mechanism for providing perception cues, the terminals for the exploration task were defined to include IRs and numerical thresholds, as in the two tasks above. Unlike the experiments presented above, the fitness for this task was not a sum of penalties over time steps; it could only be assigned after a complete trial. The main concern of the fitness was to minimize the number of squares that had not been visited, while an extra pressure on speed was added to encourage the robot to move forward when exploring. Thus the fitness function was defined as: f = α × (1 − P) + β × (1 − Avg), where P is the proportion of the space visited, i.e., #visited-grids/#total-grids, and Avg is the average speed of the robot during a complete trial.

Figure 8 (a) illustrates the fitness curves. It indicates the fast and smooth convergence of the evolutionary runs and again shows the good performance of our GP system. The typical exploration behavior produced is presented in Fig. 8 (b), which shows that the robot was able to visit most of the specified arena during a fixed period of time. We should note that how the robot moved when it sensed nothing is not important in itself; rather, the appropriate match between the turning angle (when the robot sensed the wall) and the way it moved (when it did not sense anything) is crucial for a reactive controller to perform exploration. As can be seen in Fig. 8 (b), a successful match was evolved and it enabled the robot to achieve the task.

Fig. 8 (a) The fitness curves averaged over ten independent runs in evolving exploration controllers; (b) the exploration behavior produced by the robot.
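The trial-level fitness above is simple enough to state directly in code. A minimal sketch follows; the weights α and β are not given in this excerpt, so the values below are placeholders:

```python
def exploration_fitness(visited_grids, total_grids, avg_speed,
                        alpha=0.7, beta=0.3):
    """Fitness for the safe-exploration task, assigned only after a
    complete trial (lower is better):
        f = alpha * (1 - P) + beta * (1 - Avg)
    where P is the proportion of grid squares visited and avg_speed is
    the normalized average speed Avg over the trial.  alpha and beta
    are placeholder weights, not values from the paper."""
    P = visited_grids / total_grids
    return alpha * (1.0 - P) + beta * (1.0 - avg_speed)
```

A robot that visits every square while moving at full normalized speed scores 0, the best possible value; standing still in one square scores close to α + β.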
5. Experiments in Evolving Brains and Bodies Together
5.1 Implementation and Results

After verifying that our GP subsystem can efficiently evolve robot controllers, we used the overall evolutionary system to evolve both robot controllers and body plans to achieve an obstacle avoidance task, in order to evaluate its performance. The structural parameters we intended to evolve in this work were the motor time constant, wheel base, wheel radius, and body size. As mentioned in Sect. 3.3, each structural parameter has its own limitations. In this experiment, the value of the motor time constant was restricted to lie between 0.5 and 2.5 seconds; the lower and upper bounds of the wheel radius were 1.0 cm and 3.0 cm; the body size was limited to the range 10 cm to 25 cm; and the wheel base was constrained to be no larger than the body size. These values were chosen to approximate the device constraints of LEGO robots, which allow body reconstruction. The fitness function was the same as the one in the previous obstacle avoidance task, except that the maximum response here meant the maximum of the responses from eight virtual IR sensors positioned around the robot with identical separation. The virtual sensors were for fitness assessment only; they were removed after an evolutionary run. All other experimental parameters were also the same as those in the previous obstacle avoidance experiment. Figure 9 (a) illustrates an example of how the evolutionary system converged when evolving robot controllers and body plans together. The robot system evolved in this example is indicated below, and its typical behaviors are shown in Fig. 9 (b). As can be observed, a complete robot system was evolved successfully.

(PROG (> IR 0.97 IR 0.52)
      (OR (OR (AND (> IR 0.97 IR 0.19) (> IR 0.97 IR 0.21))
              (> IR 0.97 IR 0.60))
          (> IR 0.21 IR 0.23))
      (> IR 0.83 IR 0.71))
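The parameter ranges above translate directly into bounds on the GA's real-valued genome. A sketch of sampling one random body plan within them (the gene names and the wheel base's zero lower bound are assumptions; the paper only caps the wheel base by the body size):

```python
import random

# Bounds approximating the device constraints of LEGO robots.
BOUNDS = {
    "motor_time_constant": (0.5, 2.5),   # seconds
    "wheel_radius": (1.0, 3.0),          # cm
    "body_size": (10.0, 25.0),           # cm
}

def random_body_plan():
    """Sample one set of structural parameters within the stated
    bounds.  The wheel base is only required to be no larger than the
    body size; its zero lower bound here is an assumption."""
    plan = {name: random.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}
    plan["wheel_base"] = random.uniform(0.0, plan["body_size"])
    return plan
```

The dependent bound on the wheel base means mutation and crossover operators must re-clamp it against the (possibly changed) body size after every variation step.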
Table 1 The results of testing the evolved controller on different robot body plans. The motor time constants of the left and right data sets are 0.68 and 1.35, respectively.
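Each entry of Table 1 counts successful trials out of 100 for one brain-body pairing. A minimal sketch of that test harness, where `simulate_trial` is a hypothetical stand-in for the simulator rather than an interface from the paper:

```python
def success_count(controller, body_plan, simulate_trial, n_trials=100):
    """Run the evolved controller on a given body plan for n_trials
    trials, each from a different starting position in a changed
    environment, and count the trials in which the robot never bumps
    into a wall.  simulate_trial(controller, body_plan) is assumed to
    return True for a bump-free trial."""
    return sum(1 for _ in range(n_trials)
               if simulate_trial(controller, body_plan))
```

The asterisked entry in Table 1 corresponds to a count of 100: the controller succeeds in every trial only on its correspondingly evolved body.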
Fig. 9 (a) An example of the behavior of the evolutionary system, when it was used to evolve robot brains and bodies to achieve an obstacle avoidance task; (b) the typical obstacle avoidance behavior.

time constant: 0.68 s; wheel base: 10.53 cm; wheel radius: 1.56 cm; body size: 10.53 cm

5.2 The Importance of Appropriate Brain-Body Coupling

We have shown that the controller and the robot body can be evolved together to achieve a fitness-specified task. In order to investigate how much the evolved controller relies on the correspondingly evolved body (i.e., to confirm they have been evolved together), we tested the evolved controller on different robot body plans, designed according to various combinations of structural parameters. Each body plan was tested with the above controller in 100 trials with different starting positions and changed environments. Table 1 shows the results. Each entry in Table 1 represents the number of successful test trials (the ones in which the robot did not bump into any wall during the complete trial) for a certain robot body plan. From Table 1, we find that the combination in which the controller was executed on the correspondingly evolved robot body (marked with an asterisk) has the highest success rate (actually 100%). The inappropriate brain-body couplings cannot achieve the task perfectly. This demonstrates that the brain and body of a robot have both participated in the evolutionary process and have adapted to the specific task.

5.3 Further Investigation
Some results in Table 1 attract our attention. In case a (indicated by the mark a at the corresponding number), the number of successes increases dramatically compared to case b. We can explore the reasons by observing the behaviors emerging from the two pairs of brains and bodies. In case b, the robot always bumped into the wall because of the inappropriate enlargement of the body size and wheel base. But when the wheel base was enlarged even more (as in case a), it became more difficult for the robot to rotate, especially with the relatively small wheels, which slowed down the motion of the robot. This caused the robot to get stuck easily: it oscillated forward and backward with a little turning at one particular position, and was therefore safe in most of the test cases. The typical behaviors of case a and case b are shown in Fig. 10 (a) and Fig. 10 (b), respectively. We also investigated case c, whose performance is much better than that of cases d and e. After examining their behaviors carefully, we found that there were actually two types of failure, caused by certain kinds of robots which bumped into the wall easily in cases c, d, and e. The first kind was a robot with a large body, small wheel base, and large wheels. A large body inevitably increased the bumping probability, and the small wheel base with large wheels made the behavior of the robot unstable: it tended to turn at high speed. Consequently, this often drove the robot to bump into the wall suddenly. Figure 10 (c) shows this situation. The second kind was a robot with a large body, wide wheel base, and large wheels. As in the first situation, a large body increased the probability of the
robot bumping into the wall, while the wide wheel base with large wheels drove the robot forward faster with a small turning rate. This resulted in the robot bumping into the wall in most of the test cases, although it could detect the wall efficiently and tried to move away from it. Figure 10 (d) shows this situation. According to our observations, as the wheel base of the robot became larger, failures of the first kind decreased but failures of the second kind increased. In case d, most of the failures were caused by the first type of situation, but in case e most of the failures belonged to the second. Considering both situations that cause bumping behavior, we found that case c happened to be the best case, with a small number of failures of both kinds; hence its performance is better than that of case d and case e.

Fig. 10 Some faults caused by inappropriate brain-body couplings (see text for explanation).

6.
Conclusions
In this paper, we have developed a hybrid system of Genetic Programming and Genetic Algorithm to evolve complete autonomous robots, including control systems and their corresponding robot body plans. In our system, a circuit tree has been defined to represent a behavior controller for the robot, and the crucial structural parameters of a physical robot are extracted and arranged as a linear string of real numbers to represent a robot body. The GP part and the GA part of the system evolve the behavior controllers and the strings of robot body parameters, respectively. To assess our evolutionary system, we first used the GP subsystem to evolve behavior controllers to achieve a variety of tasks for a robot with a fixed body plan. Because our circuit tree controllers can capture the important characteristics of a behavior controller, the system can efficiently evolve controllers that perform the desired behaviors within a relatively short time. This means that our system not only has the advantage of operating on variable-size genotypes, as other GP-based work on evolving controllers does, but also avoids the need for a large population size. In addition, our circuit controller is easy to evaluate (i.e., computationally cheap), and its low sensitivity to input noise makes the result robust in the face of the
unreliable sensors and motors in the real world. These are all very important features in evolving robot controllers. After verifying that our GP subsystem can evolve reliable behavior controllers, we used the overall system to evolve a complete robot to perform a fitness-specified task and evaluated its performance. The experimental results show the promise of our approach. In addition, we have analyzed the importance of appropriate brain-body coupling in designing a robot system. The evolved controller can achieve the task perfectly only when running on the correspondingly evolved robot body plan. This indicates that in the evolutionary process the robot body itself also plays an important role, because the controller and the robot body have both adapted to the environment to achieve the task. For simple tasks, a human designer may be able to design a robot body according to his prediction of the difficulty of the task first, and then design (or evolve) the controller. But for more complicated tasks, evolving controllers and morphologies together provides a potential alternative. The work presented here points to several prospects for future research. Because our experiments in evolving robots with controllers and body plans were done in simulation, more research into the realization of this approach in the real world can be undertaken. To achieve this, we can build a real LEGO robot according to the evolved body specification and download the co-evolved controller to it to evaluate its performance. By using the information obtained from observing the performance difference between the simulated and real robots, we can improve the simulator and so bring our approach of evolving robot brains and body plans together into the real world. Another direction is to use the developed system to achieve a greater variety of tasks, or more difficult tasks, to examine its generality.

References

[1] R.A. Brooks, "Artificial life and real robots," Proc. First European Conference on Artificial Life, pp.3–11, 1992.
[2] D. Cliff, I. Harvey, and P. Husbands, "Explorations in evolutionary robotics," Adaptive Behavior, vol.2, no.1, pp.73–110, 1993.
[3] M. Dorigo and M. Colombetti, "Robot shaping: Developing autonomous agents through learning," Artificial Intelligence, vol.71, no.2, pp.321–370, 1994.
[4] D. Floreano and F. Mondada, "Evolution of homing navigation in a real mobile robot," IEEE Trans. Syst. Man & Cybern., vol.26, no.3, pp.396–407, 1996.
[5] J.R. Koza and J.P. Rice, "Automatic programming of robots using genetic programming," Proc. AAAI-92, pp.194–201, 1992.
[6] C.W. Reynolds, "Evolution of corridor following behavior in a noisy world," From Animals to Animats 3: Proc. Third International Conference on Simulation of Adaptive Behavior, pp.402–410, 1994.
[7] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
[8] J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press, 1992.
[9] I. Harvey, P. Husbands, and D. Cliff, "Seeing the light: Artificial evolution, real vision," From Animals to Animats 3: Proc. Third International Conference on Simulation of Adaptive Behavior, pp.392–401, 1994.
[10] S. Day, "Invasion of the shapechangers," New Scientist, vol.2001, pp.30–35, 1995.
[11] M. Kessel and P. Gruss, "Murine developmental control genes," Science, vol.249, pp.347–379, 1990.
[12] A. Cangelosi, D. Parisi, and S. Nolfi, "Cell division and migration in a 'genotype' for neural networks," Network, vol.5, pp.497–515, 1997.
[13] F. Dellaert and R.D. Beer, "Toward an evolvable model of development for autonomous agent synthesis," Proc. Artificial Life IV, 1994.
[14] A. Lindenmayer, "Mathematical models for cellular interactions in development," J. Theoretical Biology, vol.18, pp.280–315, 1968.
[15] K. Sims, "Evolving 3D morphology and behavior by competition," Proc. Artificial Life IV, pp.28–39, 1994.
[16] R. Riedl, Order in Living Organisms: A Systems Analysis of Evolution, John Wiley & Sons, 1978.
[17] P. Funes and J. Pollack, "Computer evolution of buildable objects," Proc. Fourth European Conference on Artificial Life, 1997.
[18] P. Agre and D. Chapman, "Pengi: An implementation of a theory of activity," Proc. AAAI-87, pp.268–272, 1987.
[19] S.J. Rosenschein and L.P. Kaelbling, "A situated view of representation and control," Artificial Intelligence, vol.73, pp.149–174, 1995.
[20] W.-P. Lee, J. Hallam, and H.H. Lund, "A hybrid GP/GA approach for co-evolving controllers and robot bodies to achieve fitness-specified tasks," Proc. IEEE International Conference on Evolutionary Computation, pp.384–389, 1996.
[21] R. Tanese, "Distributed genetic algorithms," Proc. Third International Conference on Genetic Algorithms, pp.434–439, 1989.
[22] O. Miglino, H.H. Lund, and S. Nolfi, "Evolving mobile robots in simulated and real environments," Artificial Life, vol.2, no.4, pp.417–434, 1996.
[23] W.-P. Lee, J. Hallam, and H.H. Lund, "Applying genetic programming to evolve behavior primitives and arbitrators for mobile robots," Proc. IEEE International Conference on Evolutionary Computation, pp.501–506, 1997.
Wei-Po Lee received his Ph.D. in artificial intelligence from the University of Edinburgh, United Kingdom, in 1997. He is currently an assistant professor in the Department of Management Information Systems, National Pingtung University of Science and Technology, Taiwan. His research interests include intelligent autonomous agents, biologically inspired robotics, machine learning, and artificial life.