ScienceDirect Procedia Computer Science 103 (2017) 176 – 182
XIIth International Symposium «Intelligent Systems», INTELS’16, 5-7 October 2016, Moscow, Russia
Control system synthesis by means of Cartesian Genetic Programming
G.I. Balandina *
Peoples' Friendship University of Russia, 6 Miklukho-Maklaya str., Moscow 117198, Russia
Abstract
Cartesian Genetic Programming (CGP) is a form of genetic programming in which a program is represented as a directed graph. It also belongs to the methods of symbolic regression, which search for the optimal mathematical expression for a given problem; modern computers make such symbolic regression calculations practical. CGP was developed by Julian Miller in 1999-2000. It decodes a genotype (a string of integers) into a phenotype (a graph). The nodes of the graph contain references to functions from a function table, which may contain arithmetic operations, logical operations and/or user-defined functions. The inputs of each function are connected to the node inputs, each of which may in turn be connected to a node output or a graph input. As a result, several mathematical expressions can be constructed for the outputs and evaluated for the given inputs. The CGP implementation described here uses point mutation to form new mathematical expressions, and a steady-state genetic algorithm is chosen as the search engine. The solution of the control system synthesis problem is presented as a Pareto set of satisfactory control functions. A nonlinear Duffing oscillator is taken as the dynamic object.
© 2017 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the scientific committee of the XIIth International Symposium «Intelligent Systems».
Keywords: Optimal control synthesis, nonlinear control systems, Cartesian Genetic Programming, genetic programming
* Corresponding author. Tel.: +7-495-955-0792.
1877-0509 © 2017 The Authors. Published by Elsevier B.V. doi:10.1016/j.procs.2017.01.051
1. Introduction
Control system synthesis for a nonlinear dynamic object poses well-known difficulties. One of them is that defining the control function by analytical methods is complex and often impossible. However, with the rapid
advancement of computer technology and the development of new methods and algorithms, such problems are now solved successfully by numerical methods of symbolic regression. Cartesian genetic programming is one of these methods.
2. Problem statement
The synthesis problem is formulated as the search for a control function of the object state [1]. A mathematical model of the control object is given as a system of ordinary differential equations:
$$\dot{x} = f(x, u), \tag{1}$$
where $x = [x_1 \ldots x_n]^T$ is the state vector of the control object, $x \in \mathbb{R}^n$; $u = [u_1 \ldots u_m]^T$ is the control vector, $u \in U \subseteq \mathbb{R}^m$, $m \le n$; and $U$ is a closed bounded set.
Initial conditions:
$$x(0) = x^0 \in \mathbb{R}^n, \quad x^0 = [x^{0,1} \ldots x^{0,k}]^T. \tag{2}$$
The terminal conditions are given as an $(n - r)$-dimensional manifold:
$$\varphi_i(x(t_f)) = 0, \quad i = \overline{1, r}. \tag{3}$$
Quality functional:
$$J = \int_0^{t_f} F_0(x(t), u(t))\,dt \to \min, \tag{4}$$
where $t_f$ is the duration of the control process,
$$t_f = \begin{cases} t, & x(t) = x^f, \\ t^+, & x(t) \ne x^f, \end{cases} \tag{5}$$
and $t^+$ is a given upper limit on the acceptable control time.
It is necessary to synthesize a control system of the form
$$u = g(x, q), \quad g(x, q): \mathbb{R}^n \to \mathbb{R}^m, \tag{6}$$
where $q = [q_1 \ldots q_R]^T$ is a vector of control system parameters, $q \in Q \subseteq \mathbb{R}^R$, and $Q$ is a bounded set. Moreover, the resulting control system should provide a minimum of the functional (4) and satisfy the terminal conditions
$$x(t_f) = x^f, \quad t_f \le t^+, \tag{7}$$
and the control bounds of system (1):
$$u = g(x(t), q) \in U, \quad x(0) = x^0. \tag{8}$$
3. Cartesian genetic programming
CGP was developed by Julian Miller in 1999-2000. The method grew out of Miller's 1997 studies on the evolution of digital circuits. The term "Cartesian genetic programming" first appeared in his 1999 article, and in 2000 CGP was proposed as a general form of genetic programming, an approach that is widely used to solve many different types of problems. The method is called "Cartesian" because it uses a two-dimensional grid of nodes. A candidate solution is a fixed-length string of integers that can be decoded into a directed acyclic graph. CGP can effectively represent general computational structures, including mathematical expressions, computer programs and neural networks.
The mechanisms of genetic programming and of the closely related genetic algorithms are borrowed from biological evolution, so their terminology often follows that of genetics. A candidate solution, represented by a string of integers, is called a genotype; each of its integers is a gene; and the program obtained by decoding a genotype is a phenotype. In genetics, the genotype is the set of all genes that characterize a specific individual, and the phenotype is the full set of structural and behavioral traits of an organism.
A genotype in CGP has a fixed length. However, the size of a phenotype (in terms of the number of computational nodes) can be anything from zero to the total number of nodes defined in the genotype. The general form of a genotype is shown below. The string of integers encodes the elementary operations of the program and the connections between them:
$$F_0\, C_{0,0} \ldots C_{0,a}\; F_1\, C_{1,0} \ldots C_{1,a}\; \ldots\; F_{(c+1)r-1}\, C_{(c+1)r-1,0} \ldots C_{(c+1)r-1,a}\; O_0\, O_1 \ldots O_m \tag{9}$$
There are three types of genes in CGP: function, connection and output genes. The function genes $F_i$ store addresses of node functions in the function look-up table, and $a$ is the maximum arity of the functions in that table. $C_{i,j}$ are the connection genes, $O_k$ are the output (result) genes, and $m$ is the number of program outputs. Nodes take their inputs in a feed-forward manner, either from the outputs of nodes in previous columns or from program inputs. The corresponding overall structure of CGP is presented in Fig. 1 [1].
Fig. 1. General form of CGP.
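The genotype layout (9) can be illustrated with a short Python sketch that generates a random genotype for a grid of r rows and c columns. The helper names `allowed_sources` and `random_genotype`, the 0-based function indices, and the conventions that program inputs are always reachable and that output genes may point to any input or node are assumptions made for illustration, not details taken from the paper.

```python
import random

def allowed_sources(col, n, r, l):
    """Addresses a node in column `col` may read from: the n program inputs
    plus the nodes of the previous `l` columns (feed-forward, levels-back)."""
    first_col = max(0, col - l)
    nodes = range(n + first_col * r, n + col * r)
    return list(range(n)) + list(nodes)

def random_genotype(n, m, r, c, l, a, n_funcs, rng=random):
    """Build one genotype: (1 + a) genes per node, then m output genes."""
    genes = []
    for col in range(c):
        src = allowed_sources(col, n, r, l)
        for _ in range(r):
            genes.append(rng.randrange(n_funcs))              # function gene
            genes.extend(rng.choice(src) for _ in range(a))   # connection genes
    genes.extend(rng.randrange(n + r * c) for _ in range(m))  # output genes
    return genes

# Grid of 3 rows and 6 columns, 4 inputs, 2 outputs, arity 2, 12 functions.
genotype = random_genotype(n=4, m=2, r=3, c=6, l=3, a=2, n_funcs=12,
                           rng=random.Random(0))
```

With these values the genotype has 18 * (1 + 2) + 2 = 56 genes, whatever the random seed.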
When a genotype is decoded, some nodes may not take part in calculating the outputs; in other words, some nodes can be inactive. Such nodes and their genes are called "non-coding" or, by analogy with genetics, "dormant" genes. This is why a fixed-length genotype can yield a phenotype with anywhere from zero to the full number of nodes defined by the user.
CGP has three major user-defined parameters: the number of rows of the grid $r$, the number of columns $c$, and levels-back $l$. Levels-back constrains which columns a node can take its inputs from. The number of program inputs is $n$. The node functions used in CGP are defined by the user and must be numbered in the look-up table of primitive functions.
Every node of the directed feed-forward graph is encoded by a set of integers (genes). The first gene is the function gene, an identifier of a function in the look-up table. The remaining genes, the connection genes, give the addresses from which the node inputs take their data. A node can receive data from the program inputs or from nodes in previous columns; nodes in the same column cannot be connected to each other. The number of connection genes of a node equals the maximum number of inputs (the arity) of any function in the look-up table.
The program inputs are assigned absolute addresses from 0 to $n - 1$. Node outputs are numbered sequentially, column by column from left to right and from top to bottom within each column, starting from $n$ and finishing at $n + L_n - 1$, where $L_n$ is the user-defined upper limit on the number of nodes. The $m$ output genes are appended at the end of the genotype.
4. Mathematical equation representation by means of CGP
Consider the following vector (Fig. 2):
Fig. 2. The example of a genotype
This vector can be represented by different graphs, depending on the user-defined parameters. For example, let the number of grid rows be $r = 3$, the number of columns $c = 6$ and levels-back $l = 3$. With $n = 4$ inputs and $m = 2$ outputs, the total number of nodes in the structure equals $r \cdot c = 18$ (Fig. 3). The inputs are assigned specific values, which are then passed to the nodes according to the addresses stored in the connection genes of those nodes.
Fig. 3. The example of the genotype from figure 2 mapped into a phenotype
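The distinction between active and dormant nodes described above can be sketched in Python by walking backwards from the output genes. The function name `active_nodes`, the flat gene layout (one function gene followed by `a` connection genes per node) and the small example genotype are illustrative assumptions; the genotype of Fig. 2 is not reproduced here.

```python
def active_nodes(genes, n, m, n_nodes, a, arity):
    """Return the sorted addresses of nodes that influence at least one output.
    `arity[f]` is the number of inputs that function f actually uses."""
    stride = 1 + a
    outputs = genes[n_nodes * stride:n_nodes * stride + m]
    stack = [o for o in outputs if o >= n]   # addresses below n are program inputs
    active = set()
    while stack:
        addr = stack.pop()
        if addr in active:
            continue
        active.add(addr)
        base = (addr - n) * stride
        f = genes[base]
        # Only the first arity[f] connection genes matter; extra inputs are ignored.
        for src in genes[base + 1:base + 1 + arity[f]]:
            if src >= n:
                stack.append(src)
    return sorted(active)

# Two inputs (addresses 0, 1) and three nodes (addresses 2, 3, 4).
# Node 2: function 0 on inputs 0 and 1; node 3: function 1 on node 2;
# node 4: function 0 on input 1 twice (never used); the single output reads node 3.
example = [0, 0, 1,   1, 2, 0,   0, 1, 1,   3]
used = active_nodes(example, n=2, m=1, n_nodes=3, a=2, arity={0: 2, 1: 1})
```

Here `used` contains addresses 2 and 3, while node 4 is dormant because no output depends on it.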
Table 1. Functions lookup table.
#    Function name   Arity      #    Function name   Arity
1    @pluss          2          7    @cos            1
2    @minuss         2          8    @tan            1
3    @timess         2          9    @F10_1          1
4    @divv           2          10   @zz             1
5    @sqr            1          11   @sgnn           1
6    @sin            1          12   @exp_z          1
In each node, the operation defined by the function gene and the functions lookup table is applied to the node input values (Fig. 4). Inputs beyond the arity of the function are ignored. For example, the first program node takes its data from the 2nd and 4th inputs of the program and processes them with function #1 from the functions lookup table (a sum). As a result of that operation, the 5th output takes the value $-0.4 + 1 = 0.6$. The "dormant" nodes and extra inputs are shown dimmed (Fig. 2, 3). Decoding the genotype string yields results according to the following mathematical expressions:
ª y1 º « » ¬ y2 ¼
ª cos( x1 sin( x2 q2 )) º « q1 x1 q2 » cos( x22 x12 ) »¼ «¬ e
ª 0,8193º « » ¬1, 3499 ¼
(10)
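The node-by-node evaluation described above can be sketched as follows. This is a minimal illustration, not the author's implementation: the table `FUNCS` covers only those entries of Table 1 whose meaning is evident from their names, the input order (x1, x2, q1, q2), the values chosen for x1 and q1, and the three-node genotype are all hypothetical.

```python
import math

# Subset of Table 1, keyed by its 1-based ids: (callable, arity actually used).
FUNCS = {
    1: (lambda a, b: a + b, 2),   # @pluss
    2: (lambda a, b: a - b, 2),   # @minuss
    3: (lambda a, b: a * b, 2),   # @timess
    6: (lambda a: math.sin(a), 1),
    7: (lambda a: math.cos(a), 1),
}

def evaluate(genes, inputs, n_nodes, m, a=2):
    """Evaluate every node in address order (dormant ones included, which is
    harmless in a feed-forward graph) and return the m program outputs."""
    values = list(inputs)                 # addresses 0 .. n-1 hold the inputs
    stride = 1 + a
    for k in range(n_nodes):
        base = k * stride
        func, arity = FUNCS[genes[base]]
        args = [values[src] for src in genes[base + 1:base + 1 + arity]]
        values.append(func(*args))        # node k gets address n + k
    outs = genes[n_nodes * stride:n_nodes * stride + m]
    return [values[o] for o in outs]

# Node 4 = input 1 + input 3, node 5 = sin(node 4), node 6 = input 0 * node 5.
genes = [1, 1, 3,   6, 4, 0,   3, 0, 5,   6, 4]
y = evaluate(genes, inputs=[2.0, -0.4, 0.0, 1.0], n_nodes=3, m=2)
```

The second output reproduces the sum from the text, -0.4 + 1 = 0.6, and the first output equals 2 sin(0.6) for the hypothetical x1 = 2.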
5. Control system synthesis problem statement for the Duffing oscillator
The mathematical model of the nonlinear dynamical control system is given by [2]:
$$\begin{cases} \dot{x}_1 = x_2, \\ \dot{x}_2 = -x_1 - x_1^3 + u(x). \end{cases} \tag{11}$$
Constraints on the control:
$$-1 \le u(x) \le 1. \tag{12}$$
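Initial conditions:
$$x^{0,1} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad x^{0,2} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \quad x^{0,3} = \begin{bmatrix} -1 \\ -1 \end{bmatrix}, \quad x^{0,4} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}. \tag{13}$$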
Terminal conditions:
$$x^f = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{14}$$
Quality functionals:
$$J_1 = \sum_{i=1}^{4} \left( \sum_{j=1}^{2} \left( x_j(t_f) - x_j^f \right)^2 \right)_{x^0 = x^{0,i}} \to \min, \tag{15}$$
$$J_2 = \sum_{i=1}^{4} t_f^{(i)} \to \min. \tag{16}$$
Here in (15), the subscript $x^0 = x^{0,i}$ means that system (11) is integrated from the initial condition $x^{0,i}$.
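As a rough illustration of how the two functionals could be evaluated numerically for a candidate control law, the sketch below integrates the Duffing model with a fixed-step fourth-order Runge-Kutta scheme. Everything beyond the model itself is an assumption: the step `dt`, the tolerance `eps` used to decide that the terminal state has been reached, the initial states taken as the four sign combinations of (±1, ±1), and the linear feedback used as a test control are hypothetical choices, not values from the paper.

```python
import math

def duffing_rhs(x, u):
    # Assumed form of the model: x1' = x2, x2' = -x1 - x1^3 + u
    return (x[1], -x[0] - x[0] ** 3 + u)

def simulate(control, x0, t_plus=15.0, dt=0.01, eps=0.01):
    """Integrate with RK4 until the state is within eps of the origin or the
    time limit t_plus is reached. Returns (t_f, x(t_f))."""
    x, t = tuple(float(v) for v in x0), 0.0

    def f(state):
        u = max(-1.0, min(1.0, control(state)))   # keep the control in [-1, 1]
        return duffing_rhs(state, u)

    while t < t_plus and math.hypot(*x) > eps:
        k1 = f(x)
        k2 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k1)))
        k3 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k2)))
        k4 = f(tuple(xi + dt * ki for xi, ki in zip(x, k3)))
        x = tuple(xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
                  for xi, a, b, c, d in zip(x, k1, k2, k3, k4))
        t += dt
    return t, x

def functionals(control, initial_states):
    """Sum the squared terminal error and the finish time over all initial states."""
    j1 = j2 = 0.0
    for x0 in initial_states:
        t_f, x = simulate(control, x0)
        j1 += x[0] ** 2 + x[1] ** 2    # terminal error with x^f = 0
        j2 += t_f                      # finish time for this initial state
    return j1, j2

X0 = [(1, 1), (1, -1), (-1, -1), (-1, 1)]
j1, j2 = functionals(lambda x: -x[0] - 2.0 * x[1], X0)   # hypothetical control law
```

In a synthesis run, the lambda would be replaced by the expression decoded from a CGP genotype, and the pair (j1, j2) would serve as that chromosome's vector of functional values.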
The time $t_f$ is set according to (5); $t_f^{(i)}$ is the finish time of the modelling process for the $i$-th initial condition, and $t^+ = 15$ s is the greatest accepted time. The solution of the problem is a control function that satisfies the constraints on control (12) and, for any initial condition from (13), reaches the objective within the time limit $t^+$ with minimum values of the quality functionals (15) and (16). A set of Pareto-optimal solutions should be constructed to minimize both functionals. The result of the synthesis is a control system selected from a number of control systems (11), for each of which the solution at the optimal parameter values belongs to the Pareto set.
6. Algorithm of CGP method for control system synthesis
Stage 0. Preliminary stage. The parameters of the genetic algorithm are set (population size, number of generations, crossover probability, mutation probability), together with the parameters of Cartesian Genetic Programming: $n_i$, the program inputs; $n_o$, the program outputs; $n_n$, the node inputs (addresses); $F$, the function set; $n_r$, the number of nodes in a row; $n_c$, the number of nodes in a column; and $l$, the control depth, which determines how many previous columns a node can take its inputs from.
Stage I. A set of possible solutions, the initial (zero) population of $M$ chromosomes, is generated randomly, taking into account the constraints on the chromosome alleles. Each individual of the population $G$ represents one valid genotype, a set of integers comprising the chromosome. Each chromosome is divided into two parts. The first part determines the structure of the control system. The second part determines the values of the control system parameters and is a binary Gray code string.
Stage II. The value of the fitness function, which corresponds to the quality functionals (15), (16), is calculated for each chromosome. To determine the fitness of chromosomes, a conditional Pareto set is introduced. The conditional Pareto set $P_c$ is the set of possible solutions (chromosomes) for which the population contains no chromosome that is better in terms of the Pareto relation. Let $J^m = [J_1^m \ldots J_M^m]^T$ be the vector of functional values of chromosome $m$. The Pareto relation $J^{m_1} \ge J^{m_2}$ holds if the following conditions are satisfied:
$$J_i^{m_1} \ge J_i^{m_2}, \; i = \overline{1, M}, \quad \exists k \in \{1, \ldots, M\}: \; J_k^{m_1} > J_k^{m_2}.$$
To build the conditional Pareto set from a population of chromosomes, a characteristic is introduced, the distance from the current chromosome to the conditional Pareto set:
$$d_k = \sum_{m=1}^{M} \lambda_{mk}, \quad \text{where} \quad \lambda_{mk} = \begin{cases} 1, & \text{if } J^m \le J^k, \\ 0, & \text{otherwise}. \end{cases} \tag{17}$$
The distance to the conditional Pareto set is used as the fitness function for evaluating a solution. Chromosomes with zero distance belong to the conditional Pareto set.
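The distance to the conditional Pareto set can be sketched in a few lines of Python. The sketch assumes that the comparison in the distance formula is the Pareto relation defined just before it (no functional worse and at least one strictly better, for minimization), so that a chromosome does not count against itself; the function names and the sample functional values are illustrative.

```python
def dominates(ja, jb):
    """True if ja is no worse than jb in every functional and strictly better
    in at least one (all functionals are minimized)."""
    return (all(a <= b for a, b in zip(ja, jb))
            and any(a < b for a, b in zip(ja, jb)))

def pareto_distances(J):
    """d_k = number of chromosomes whose functional vector dominates J[k]."""
    return [sum(dominates(jm, jk) for jm in J) for jk in J]

# Hypothetical (J1, J2) values for a population of five chromosomes.
J = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0), (4.0, 6.0)]
d = pareto_distances(J)
pareto_set = [k for k, dk in enumerate(d) if dk == 0]
```

For this sample, chromosomes 0, 1 and 3 have zero distance and form the conditional Pareto set, chromosome 2 is dominated once, and chromosome 4 is dominated by all the others.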
Stage III. The crossover and mutation operations of Cartesian Genetic Programming are applied to the individuals of the current population. For reproduction, a pair of chromosomes $(\sigma^{m_1}, s^{m_1})$ and $(\sigma^{m_2}, s^{m_2})$ is selected at random from the population, where each chromosome consists of a structural part and a parametric part. Parts of the chromosomes are exchanged, and four new descendant chromosomes are created: two in which the structural parts of the parents are preserved, and two in which both the structural and the parametric parts are changed. Mutation is applied to the newly created chromosomes with a given probability $p_m$: elements of the structural and parametric parts are selected at random and replaced with new, randomly generated values.
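One possible reading of Stage III is sketched below. The text does not say how the parts are exchanged, so one-point crossover is assumed for both parts; the mutation of the parametric part is assumed to flip one bit of the Gray code string; and `random_gene` is a hypothetical callback that returns a valid new value for a given structural gene position.

```python
import random

def one_point(a, b, rng):
    """Exchange the tails of two equal-length lists at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def crossover(p1, p2, rng=random):
    """Each parent is (structural part, parametric part). Returns four children:
    two keep a parent's structure, two have both parts changed."""
    (sig1, s1), (sig2, s2) = p1, p2
    c1, c2 = one_point(s1, s2, rng)        # parametric parts exchanged
    g1, g2 = one_point(sig1, sig2, rng)    # structural parts exchanged
    return [(sig1, c1), (sig2, c2), (g1, c1), (g2, c2)]

def mutate(chrom, p_m, random_gene, rng=random):
    """With probability p_m, replace one structural gene and flip one bit."""
    sig, s = chrom
    if rng.random() < p_m:
        sig, s = list(sig), list(s)        # copy so the parent is not modified
        i = rng.randrange(len(sig))
        sig[i] = random_gene(i)            # new value for structural gene i
        j = rng.randrange(len(s))
        s[j] = 1 - s[j]                    # flip one bit of the parametric part
    return (sig, s)

rng = random.Random(1)
parent1 = ([0, 0, 0, 0], [0, 0, 0, 0])
parent2 = ([1, 1, 1, 1], [1, 1, 1, 1])
children = crossover(parent1, parent2, rng)
mutant = mutate(parent1, 1.0, lambda i: 7, rng)
```

Because both parents share the same grid geometry, a gene at a given position obeys the same constraints in either parent, so exchanging tails at any cut point keeps the structural part valid.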
Stage IV. The functional values of each new chromosome are calculated. Next, a check is performed to decide whether the new chromosome is included in the population. For this, the distance to the conditional Pareto set is calculated for the new chromosome. Then the chromosome with the greatest distance is found in the population; this is the worst chromosome. If the distance of the worst chromosome to the conditional Pareto set is greater than that of the new chromosome, the worst chromosome is replaced with the new one. The distance to the conditional Pareto set (17) is then recalculated for all chromosomes in the population, and the operations of parent selection, crossover and mutation are repeated.
Stage V. The stopping criterion is checked. The algorithm stops after a specified number of generations, or if the conditional Pareto set does not change over a given number of generations. The result of the algorithm is the conditional Pareto set built on the final population of chromosomes.
The solution of the problem is the vector function $g(x, q) = [g_1(x, q) \ldots g_m(x, q)]^T$, which, at the optimal values of the parameters $q = [q_1 \ldots q_R]^T$, provides an unambiguous mapping of the state space into a bounded subset of the control space. A specific solution is selected on the basis of additional criteria or an expert analysis of the solutions.
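The steady-state replacement rule of Stage IV can be sketched as follows. The sketch assumes minimized functionals and measures the distance of a chromosome as the number of population members that dominate it; the function names, the string placeholders for chromosomes and the sample functional values are hypothetical.

```python
def dominates(ja, jb):
    """Pareto dominance for minimization."""
    return (all(a <= b for a, b in zip(ja, jb))
            and any(a < b for a, b in zip(ja, jb)))

def distance(j, population_J):
    """Number of population members whose functional vector dominates j."""
    return sum(dominates(other, j) for other in population_J)

def try_insert(population, J, child, child_J):
    """Replace the worst chromosome if the child is closer to the Pareto set.
    Modifies `population` and `J` in place; returns True on replacement."""
    d = [distance(jk, J) for jk in J]
    worst = max(range(len(J)), key=d.__getitem__)
    if distance(child_J, J) < d[worst]:
        population[worst], J[worst] = child, child_J
        return True
    return False

population = ["a", "b", "c"]
J = [(1.0, 5.0), (2.0, 2.0), (4.0, 6.0)]
accepted = try_insert(population, J, "child1", (3.0, 3.0))   # replaces "c"
rejected = try_insert(population, J, "child2", (5.0, 5.0))   # population unchanged
```

Since distances are recomputed from the current functional values on every call, the recalculation step of Stage IV happens implicitly before the next comparison.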
References
1. Miller JF. Cartesian Genetic Programming. Berlin, Heidelberg: Springer; 2011.
2. Diveev AI, Kazaryan DE, Sofronova EA. Grammatical evolution and network operator methods for control system synthesis. 21st Mediterranean Conference on Control and Automation, Platanias, Chania, Crete, Greece. IEEE; 2013: 1148-55.
3. Diveev AI, Kazaryan DE, Sofronova EA. Symbolic Regression Methods for Control System Synthesis. Proceedings of the 22nd Mediterranean Conference on Control & Automation (MED'14), Palermo, Sicilia, Italy, June 16-19, 2014: 587-92.
4. Koza JR. Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge: MIT Press; 1992.
5. Diveev AI. A Numerical Method for Network Operator for Synthesis of a Control System with Uncertain Initial Values. Journal of Computer and Systems Sciences International, 2012; 51; 2: 228-43.
6. Gladkov LA, Kureichik VV, Kureichik VM. Genetic algorithms. Moscow: Fizmatlit; 2010.