Journal of ICT, 4, pp: 117-133
EVOLVING NEURAL CONTROLLERS FOR TERRESTRIAL AND EXTRATERRESTRIAL LOCOMOTION IN AN ARTIFICIAL QUADRUPED
*J. Teo
*School of Engineering and Information Technology, Universiti Malaysia Sabah, 88999 Kota Kinabalu, Sabah, Malaysia.
E-mail: [email protected]
ABSTRACT
This study explores the use of a multi-objective evolutionary algorithm for the automatic synthesis of neural controllers for the quadrupedal locomotion of an artificial creature in a 3-dimensional, physics-based environment. The Pareto-frontier Differential Evolution (PDE) algorithm is used to generate a Pareto optimal set of artificial neural networks that optimize the conflicting objectives of maximizing locomotion behavior and minimizing neural network complexity. The focus of this artificial life experiment is first to evolve embodied locomotion controllers for a physically simulated quadrupedal creature under terrestrial conditions (i.e. simulating Earth's gravity) and then to investigate the performance of the best evolved controller under different extraterrestrial environments (i.e. simulating gravity on planets other than Earth). It was found that under all extraterrestrial conditions the artificial creature was still able to perform the required locomotion task; in the worst case, some minimal locomotion behavior was still achieved.
Key words: Artificial life, evolutionary robotics, embodied cognition, evolutionary multi-objective algorithms, evolutionary artificial neural networks.
1.0 INTRODUCTION
There has been a strong resurgence of research into the evolution of the morphology and controllers of physically simulated creatures. The pioneering and captivating work of Sims (1994) was not paralleled until very recently, as further work in this area was limited by the complexity of programming a realistic physics-based environment and the considerable computational resources required to run the artificial evolution. These physically realistic simulations of evolving artificial minds and bodies have become more accessible to the wider research community as a result of the recent maturation of physics-based simulation packages and the increase in raw computing power of personal computers (Taylor & Massey, 2001).
Research in this area generally falls into two categories: (1) the evolution of controllers for creatures with fixed (Bongard & Pfeifer, 2002; Ijspeert, 2000) or parameterized morphologies (Lee et al., 1996; Paul & Bongard, 2001), and (2) the evolution of both the creatures' morphologies and controllers simultaneously (Lipson & Pollack, 2000; Hornby & Pollack, 2001; Taylor & Massey, 2001). Some work has also been carried out in evolving morphology alone (Eggenberger, 1997) and evolving morphology with a fixed controller (Lichtensteiger & Eggenberger, 1999). Related work using mobile robots has also shown promising results in robustness and the ability to cope with changing environments by evolving plastic individuals that are able to adapt both through evolution and lifetime learning (Floreano & Urzelai, 2000).
However, comparatively little has been said about the role of controllers in the artificial evolution of such creatures. It has been noted that the potential of designing more complex artificial systems through the exploitation of sensory-motor coordination remains largely unexplored (Nolfi & Floreano, 2002). As such, there is currently a lack of understanding of how the evolution of controllers affects the evolution of morphologies and behaviors in physically simulated creatures. It remains unclear what properties of an artificial creature's controller allow it to exhibit the desired behavior. A better understanding of controller complexity and the dynamics of evolving controllers should pave the way towards the emergence of more complex artificial creatures with more complex morphologies and behaviors. Furthermore, there have been no extensive investigations into whether terrestrially evolved controllers are able to perform under varying extraterrestrial conditions.
In this paper, we investigate the use of a multi-objective approach to evolving controllers for a fixed-morphology artificial creature. The artificial creature investigated here has 8 joint angle sensors, 4 touch sensors and 8 actuators. First, we evolve controllers with different hidden layer complexities under terrestrial conditions. Second, the best evolved controller in terms of locomotion behavior is placed under differing extraterrestrial conditions to observe whether it can still perform in these varying environments.
2.0 METHOD
2.1 Evolving Artificial Neural Networks
Traditional learning algorithms for Artificial Neural Networks (ANNs), such as backpropagation (BP), usually suffer from an inability to escape from local minima due to their use of gradient information. Evolutionary approaches have been proposed as an alternative method for training ANNs. A thorough review of Evolutionary Artificial Neural Networks (EANNs) can be found in Yao (1999). Abbass et al. (2001) first introduced the Pareto-frontier Differential Evolution (PDE) algorithm, an adaptation for continuous multi-objective optimization of the Differential Evolution algorithm introduced by Storn and Price (1995). The Memetic Pareto Artificial Neural Network (MPANN) algorithm (Abbass, 2001) combined PDE with local search for evolving ANNs and was found to possess better generalization whilst incurring a much lower computational cost (Abbass, 2002). In this paper, PDE is used to simultaneously evolve the weights and architecture of the ANN.
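As a sketch of the underlying variation operator, the classic DE/rand/1/bin update of Storn and Price can be written as follows. The function name and the parameter defaults F = 0.5 and CR = 0.9 are illustrative, not from this paper; PDE adapts this single-objective scheme to the multi-objective setting.

```python
import random

def de_trial_vector(population, target_idx, F=0.5, CR=0.9):
    """Build one trial vector with the DE/rand/1/bin scheme: perturb one
    random parent by the scaled difference of two others, then binomially
    recombine the result with the target vector."""
    dim = len(population[target_idx])
    # three distinct parents, none of them the target
    a, b, c = random.sample(
        [i for i in range(len(population)) if i != target_idx], 3)
    xa, xb, xc = population[a], population[b], population[c]
    target = population[target_idx]
    j_rand = random.randrange(dim)  # guarantees at least one mutated gene
    trial = []
    for j in range(dim):
        if random.random() < CR or j == j_rand:
            trial.append(xa[j] + F * (xb[j] - xc[j]))  # differential mutation
        else:
            trial.append(target[j])  # inherited unchanged from the target
    return trial
```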
2.2 Representation of a Chromosome
Similar to Abbass (2001) and Abbass (2002), our chromosome is a class that contains one matrix, W, of real numbers representing the weights of the ANN, and one vector, r, of binary numbers (one value for each hidden unit) to indicate whether or not a hidden unit exists in the network. Thus, it works as a switch to turn a hidden unit on or off. The sum of all values in this vector represents the actual number of hidden units in a network. This representation allows simultaneous training of the weights in the network and selection of a subset of hidden units. The matrix W is of size N x M, where N equals the number of inputs plus the number of hidden units and M equals the number of hidden units plus the number of output units. The length of the vector r equals the number of hidden units. The morphogenesis of a chromosome into an ANN is shown in Fig. 1.
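The representation above can be sketched in a few lines of code. The class and attribute names here are our own, not from the paper; for the creature in this study there are 12 inputs (8 joint angle sensors plus 4 touch sensors) and 8 outputs (the 8 actuators).

```python
import random

class Chromosome:
    """Weight matrix W plus binary hidden-unit switch vector r,
    as described in Section 2.2."""

    def __init__(self, n_inputs, max_hidden, n_outputs):
        n = n_inputs + max_hidden   # rows of W: inputs + hidden units
        m = max_hidden + n_outputs  # columns of W: hidden units + outputs
        # weight matrix W of size N x M, initialised from N(0, 1)
        self.W = [[random.gauss(0.0, 1.0) for _ in range(m)]
                  for _ in range(n)]
        # binary switch vector r: one on/off entry per hidden unit
        self.r = [1 if random.random() < 0.5 else 0
                  for _ in range(max_hidden)]

    def num_hidden_units(self):
        # the sum of the switch vector is the actual number of hidden units
        return sum(self.r)
```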
2.3 The PDE Algorithm
We have a multi-objective problem with two objectives in this study: (1) to maximize the horizontal distance traveled by the creature from its initial starting position, and (2) to minimize the number of hidden units. The Pareto-frontier of the trade-off between the two objectives will contain a set of networks with different numbers of hidden units and different locomotion behaviors. An entire set of controllers is generated in each evolutionary run without requiring any further modification of parameters by the user. The PDE algorithm for evolving ANNs consists of the following steps:

1. Create a random initial population of potential chromosomes or solutions. The elements of the weight matrix W are assigned random values according to a Gaussian distribution N(0,1). The elements of the binary vector r are assigned the value 1 with probability 0.5, based on a randomly generated number from a uniform distribution over [0,1]; otherwise 0.
2. Repeat:
   a. Evaluate the individuals or solutions in the population and label those which are non-dominated according to the two objectives: (1) to maximize the horizontal distance traveled by the creature from its initial starting position, and (2) to minimize the number of hidden units.
   b. If the number of non-dominated individuals (a solution is non-dominated if no other solution is at least as good on both objectives and strictly better on at least one) is less than three, repeat the following until the number of non-dominated individuals is greater than or equal to three (since the Differential Evolution algorithm requires at least three parents to generate an offspring via crossover):
      i. Find a non-dominated solution among those which are not yet labelled.
      ii. Label the solution as non-dominated.
   c. Delete all dominated solutions from the population.
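The labelling in steps (a) to (c) rests on a standard Pareto-dominance test, which can be sketched as follows. The helper names are our own; each solution is scored as a pair of distance travelled (to be maximized) and hidden-unit count (to be minimized).

```python
def dominates(a, b):
    """True if solution a dominates b, where each solution is scored as
    (distance_travelled, hidden_units): a must be no worse on both
    objectives and strictly better on at least one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better

def non_dominated(population):
    """Return the solutions not dominated by any other member."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q is not p)]
```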
   d. Repeat:
      i. Select at random an individual as the main parent, α1, and two individuals, α2 and α3, as supporting parents.
      ii. With some uniform (0,1) crossover probability, do
where each weight in the main parent,