New Firefly Optimization Algorithm based on Learning Automata

Azam Amin Abshouri 1, Babak Nasiri 2, and Mohammad Reza Meybodi 3

1 Department of Electronic, Computer and IT, Islamic Azad University, Qazvin, Iran
2 Department of Electronic, Computer and IT, Islamic Azad University, Qazvin, Iran
3 Department of Computer Engineering and IT, Amirkabir University, Tehran, Iran

{[email protected], [email protected], [email protected]}
Abstract. The firefly algorithm is an evolutionary computing model based on the behavior of fireflies in nature. Two of its parameters, the attractiveness coefficient and the randomization coefficient, adjust the algorithm's behavior and speed of convergence. Appropriate values for these parameters are usually found by trial and error. This article proposes a new model, Firefly-LA, based on the firefly algorithm, which uses learning automata to adjust the algorithm's behavior and control the attractiveness coefficient. Simulation results show that the proposed model performs better than both the standard firefly algorithm and PSO.

Keywords: Optimization, Firefly algorithm, Learning Automata.
1 Introduction

In general, optimization algorithms can be classified into two main categories: deterministic and stochastic. Deterministic algorithms such as hill-climbing will produce the same set of solutions if the iterations start with the same initial guess. On the other hand, stochastic algorithms often produce different solutions even with the same initial starting point. However, the final results, though slightly different, will usually converge to the same optimal solutions within a given accuracy [1]. Most stochastic algorithms can be considered metaheuristic, and good examples are genetic algorithms (GA) and particle swarm optimization (PSO). Many modern metaheuristic algorithms were developed based on swarm intelligence in nature, and new ones continue to appear and show their power and efficiency. For example, the Firefly Algorithm developed by Yang shows its superiority over some traditional algorithms [1]. Two parameters of the algorithm are the attractiveness coefficient and the randomization coefficient; their values are crucially important in determining the speed of convergence and the behavior of the FA. Learning automata are adaptive decision-making devices that operate in an unknown random environment and progressively improve their performance via a learning process. They have been used successfully in many applications such as call admission control in cellular networks [2, 3], capacity assignment problems [4], adaptation of back-propagation parameters [5], and determination of the number of hidden units for
three-layer neural networks [6]. In this paper, we propose a new approach for adaptively adjusting two parameters of the firefly algorithm: the absorption coefficient and the randomization coefficient. In the proposed approach, these parameters are set by means of two learning automata, one for each parameter. Tests on standard benchmark functions show that the algorithm works better than the standard firefly algorithm, PSO, and its improved variants [7]. The rest of the paper is organized as follows. Section 2 reviews the Firefly Algorithm. Section 3 gives a brief introduction to learning automata. The proposed algorithm is given in Section 4. Experiment settings and results are presented in Section 5. Section 6 concludes the paper.
2 Firefly Algorithm

The Firefly Algorithm was developed by Yang [1], and it is based on the idealized behavior of the flashing characteristics of fireflies. For simplicity, in describing the Firefly Algorithm (FA), we use the following three idealized rules: 1) All fireflies are unisex, so one firefly will be attracted to other fireflies regardless of their sex; 2) Attractiveness is proportional to brightness, thus for any two flashing fireflies, the less bright one will move towards the brighter one. The attractiveness is proportional to the brightness, and both decrease as the distance between fireflies increases. If no firefly is brighter than a particular firefly, it will move randomly; 3) The brightness of a firefly is affected or determined by the landscape of the objective function. For a maximization problem, the brightness can simply be proportional to the objective function. Other forms of brightness can be defined in a similar way to the fitness function in genetic algorithms or the bacterial foraging algorithm (BFA).

In the FA, there are two important issues: the variation of light intensity and the formulation of the attractiveness. For simplicity, we can always assume that the attractiveness of a firefly is determined by its brightness or light intensity, which in turn is associated with the encoded objective function. In the simplest case for maximization problems, the brightness I of a firefly at a particular location x can be chosen as I(x) ∝ f(x). However, the attractiveness β is relative; it should be seen in the eyes of the beholder or judged by the other fireflies. Thus, it should vary with the distance rij between firefly i and firefly j. As light intensity decreases with the distance from its source, and light is also absorbed in the media, we should allow the attractiveness to vary with the degree of absorption. In the simplest form, the light intensity I(r) varies with the distance r monotonically and exponentially. That is
I = I0 × e^(−γrij²)    (1)

where I0 is the original light intensity and γ is the light absorption coefficient. As a firefly's attractiveness is proportional to the light intensity seen by adjacent fireflies, we can now define the attractiveness of a firefly by
β = β0 × e^(−γrij²)    (2)
where β0 is the attractiveness at r = 0. It is worth pointing out that the exponent γr² can be replaced by other functions such as γr^m when m > 0. Schematically, the Firefly Algorithm (FA) can be summarized as the following pseudo code [1, 10].

Pseudo code 1 - Standard Firefly Algorithm

Objective function f(x), x = (x1, x2, ..., xd)^T
Initialize a population of fireflies xi (i = 1, 2, ..., n)
Light intensity Ii at xi is determined by f(xi)
Define light absorption coefficient γ
While (t < MaxGeneration)
  For i = 1 : n (all n fireflies)
    For j = 1 : n (all n fireflies)
      If (Ij > Ii)
        Move firefly i towards j in all d dimensions
      End if
      Attractiveness varies with distance r via exp[−γr]
      Evaluate new solutions and update light intensity
    End for j
  End for i
  Rank the fireflies and find the current best
End while
Post-process results and visualization
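As a quick numeric illustration of Eq. (2), the sketch below (the `attractiveness` helper is our own naming, not from the paper) shows how the absorption coefficient γ governs how fast attractiveness falls off with distance:

```python
import math

def attractiveness(r, beta0=1.0, gamma=1.0):
    """Eq. (2): attractiveness decays exponentially with the squared distance."""
    return beta0 * math.exp(-gamma * r ** 2)

# Larger gamma means light is absorbed faster, so a firefly is effectively
# attracted only by nearby neighbours; smaller gamma lets it "see" far ones.
for gamma in (0.1, 1.0, 10.0):
    print([round(attractiveness(r, gamma=gamma), 3) for r in (0.0, 0.5, 1.0, 2.0)])
```

At r = 0 the value is exactly β0, matching the definition in the text.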
The movement of a firefly i attracted to another, more attractive (brighter) firefly j is determined by

Xi = Xi + β0 e^(−γrij²) (Xj − Xi) + α (rand − 1/2)    (3)

where rand is a random number drawn uniformly from [0, 1].
For most cases in our implementation, we can take β0 = 1, α ∈ [0, 1], and γ = 1. In addition, if the scales vary significantly in different dimensions, such as −10^5 to 10^5 in one dimension while, say, −10^3 to 10^3 along others, it is a good idea to replace α by αSk, where the scaling parameters Sk (k = 1, ..., d) in the d dimensions should be determined by the actual scales of the problem of interest [1].
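Putting the pseudo code and Eq. (3) together, the following is a minimal Python sketch of the standard FA, written for minimization and tested on a sphere function; the function, bounds, and population settings are illustrative choices, not the paper's experimental setup:

```python
import math
import random

def firefly_optimize(f, dim, n=25, max_gen=100, beta0=1.0, gamma=1.0,
                     alpha=0.2, bounds=(-5.0, 5.0)):
    """Minimal standard-FA sketch (minimization): lower f(x) = brighter firefly."""
    lo, hi = bounds
    X = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    I = [f(x) for x in X]
    for _ in range(max_gen):
        for i in range(n):
            for j in range(n):
                if I[j] < I[i]:  # firefly j is brighter, so i moves towards j
                    r2 = sum((X[i][k] - X[j][k]) ** 2 for k in range(dim))
                    beta = beta0 * math.exp(-gamma * r2)   # Eq. (2)
                    for k in range(dim):
                        # Eq. (3): attraction term plus random perturbation
                        X[i][k] += (beta * (X[j][k] - X[i][k])
                                    + alpha * (random.random() - 0.5))
                        X[i][k] = min(max(X[i][k], lo), hi)  # keep in bounds
                    I[i] = f(X[i])
        # "Rank the fireflies and find the current best" is implicit in I
    best = min(range(n), key=lambda i: I[i])
    return X[best], I[best]

sphere = lambda x: sum(v * v for v in x)  # illustrative benchmark
random.seed(0)
best_x, best_f = firefly_optimize(sphere, dim=2)
```

Note that with fixed α and γ, as here, the jitter never decays; the Firefly-LA model proposed in this paper adapts these two parameters at run time instead.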
3 Learning Automata
Learning automata are adaptive decision-making devices operating in unknown random environments [11]. A learning automaton has a finite set of actions, and each action has a certain probability (unknown to the automaton) of being rewarded by the environment. The aim is to learn to choose the optimal action (i.e.
the action with the highest probability of being rewarded) through repeated interaction with the environment. If the learning algorithm is chosen properly, the iterative process of interacting with the environment can be made to select the optimal action. Figure 1 illustrates how a stochastic automaton works in feedback connection with a random environment. Learning automata can be classified into two main families: fixed structure learning automata and variable structure learning automata (VSLA) [11]. In the following, variable structure learning automata are described.

Fig 1 - The interaction between learning automata and environment: the automaton applies action α(n) to the environment, and the environment responds with reinforcement signal β(n).
Variable structure learning automata can be represented by a quadruple {α, β, p, T}, where α = {α1, α2, ..., αr} is the set of actions of the automaton, β = {β1, β2, ..., βm} is its set of inputs, p = {p1, ..., pr} is the probability vector for selection of each action, and p(n+1) = T[α(n), β(n), p(n)] is the learning algorithm. If β = {0, 1}, the environment is called a P-model. If β belongs to a finite set with more than two values between 0 and 1, the environment is called a Q-model, and if β is a continuous random variable in the range [0, 1], the environment is called an S-model. Let a VSLA operate in an S-model environment. A general linear schema for updating the action probabilities when action i is performed is given by:

Pi(n+1) = Pi(n) + a(1 − βi(n))(1 − Pi(n)) − b βi(n) Pi(n)
Pj(n+1) = Pj(n) − a(1 − βi(n)) Pj(n) + b βi(n)[1/(r − 1) − Pj(n)],  ∀ j, j ≠ i    (4)
where a and b are the reward and penalty parameters. When a = b, the automaton is called S-LR-P; if b = 0, it is called S-LR-I; and if 0 < b ≪ a, it is called S-LR-εP.
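The S-model update schema of Eq. (4) can be sketched as follows; the class name and the toy environment (action 0 always rewarded, action 1 always penalized) are our own illustrative choices, not part of the paper:

```python
import random

class SModelVSLA:
    """Variable-structure learning automaton with the S-model linear
    update of Eq. (4). a and b are the reward and penalty parameters;
    beta in [0, 1] is the environment's continuous penalty signal."""

    def __init__(self, r, a=0.1, b=0.1):
        self.p = [1.0 / r] * r  # start with uniform action probabilities
        self.r, self.a, self.b = r, a, b

    def select(self):
        # Sample an action index according to the probability vector p.
        u, acc = random.random(), 0.0
        for i, pi in enumerate(self.p):
            acc += pi
            if u <= acc:
                return i
        return self.r - 1

    def update(self, i, beta):
        a, b, r = self.a, self.b, self.r
        for j in range(r):
            if j == i:
                # Chosen action: rewarded part grows p_i, penalized part shrinks it.
                self.p[j] += a * (1 - beta) * (1 - self.p[j]) - b * beta * self.p[j]
            else:
                # Other actions: symmetric correction keeps sum(p) == 1.
                self.p[j] += (-a * (1 - beta) * self.p[j]
                              + b * beta * (1.0 / (r - 1) - self.p[j]))

# Demo on a hypothetical stationary environment: action 0 is always
# rewarded (beta = 0), action 1 always penalized (beta = 1), so the
# probability of action 0 converges towards 1.
la = SModelVSLA(r=2, a=0.1, b=0.05)
for _ in range(200):
    i = la.select()
    la.update(i, 0.0 if i == 0 else 1.0)
```

Note the design invariant: in both branches of Eq. (4) the probability mass removed from one action is exactly redistributed to the others, so p always remains a valid distribution; this is the property the Firefly-LA model relies on when it lets each automaton pick a parameter value for the firefly algorithm.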