2014 - Journal of Artificial Intelligence and Soft Computing Research


ISSN 2083–2567

Volume 4, Number 3

Polish Neural Network Society
University of Social Sciences

JOURNAL OF ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING RESEARCH (JAISCR) is a semi-annual periodical published by the University of Social Sciences in Lodz, Poland.

PUBLISHING AND EDITORIAL OFFICE:
University of Social Sciences (SAN)
Information Technology Institute (ITI)
Sienkiewicza 9, 90-113 Lodz
tel.: +48 42 6646654, fax: +48 42 6366251
e-mail: [email protected]
URL: http://jaiscr.eu

Print: Mazowieckie Centrum Poligrafii, ul. Duża 1, 05-270 Marki, www.c-p.com.pl, [email protected]

Copyright © 2014 University of Social Sciences (SAN), Lodz, Poland. All rights reserved.

AIMS AND SCOPE:

Journal of Artificial Intelligence and Soft Computing Research is a refereed international journal whose focus is on the latest scientific results and methods constituting soft computing. The areas of interest include, but are not limited to:

Artificial Intelligence in Modelling and Simulation
Artificial Intelligence in Scheduling and Optimization
Bioinformatics
Computer Vision
Data Mining
Distributed Intelligent Processing
Evolutionary Design
Expert Systems
Fuzzy Computing with Words
Fuzzy Control
Fuzzy Logic
Fuzzy Optimisation
Hardware Implementations
Intelligent Database Systems
Knowledge Engineering
Multi-agent Systems
Natural Language Processing
Neural Network Theory and Architectures
Robotics and Related Fields
Rough Sets Theory: Foundations and Applications
Speech Understanding
Supervised and Unsupervised Learning
Theory of Evolutionary Algorithms
Various Applications

Contents

Adriano S. Koshiyama, Marley M. B. R. Vellasco and Ricardo Tanscheit
GPFIS-CONTROL: A GENETIC FUZZY SYSTEM FOR CONTROL TASKS ... 167

Zhenyuan Wang and Li Zhang-Westmant
NEW RANKING METHOD FOR FUZZY NUMBERS BY THEIR EXPANSION CENTER ... 181

Simone A. Ludwig
REPULSIVE SELF-ADAPTIVE ACCELERATION PARTICLE SWARM OPTIMIZATION APPROACH ... 189

Rajesh Thiagarajan, Mustafizur Rahman, Don Gossink and Greg Calbert
A DATA MINING APPROACH TO IMPROVE MILITARY DEMAND FORECASTING ... 205

Edgar Camargo and Jose Aguilar
ADVANCED SUPERVISION OF OIL WELLS BASED ON SOFT COMPUTING TECHNIQUES ... 215

JAISCR, 2014, Vol. 4, No. 3, pp. 167–179
DOI: 10.1515/jaiscr-2015-0006

GPFIS-CONTROL: A GENETIC FUZZY SYSTEM FOR CONTROL TASKS Adriano S. Koshiyama, Marley M. B. R. Vellasco and Ricardo Tanscheit

Department of Electrical Engineering, Pontifical Catholic University of Rio de Janeiro
Rua Marquês de São Vicente, 225, Gávea – Rio de Janeiro, RJ, Brazil

Abstract

This work presents a Genetic Fuzzy Controller (GFC), called Genetic Programming Fuzzy Inference System for Control tasks (GPFIS-Control). It is based on Multi-Gene Genetic Programming, a variant of canonical Genetic Programming. The main characteristics and concepts of this approach are described, as well as its distinctions from other GFCs. Two benchmark applications of GPFIS-Control are considered: the cart-centering problem and the inverted pendulum. In both cases results demonstrate the superiority and potential of GPFIS-Control in relation to other GFCs found in the literature.

1 Introduction

Fuzzy Logic Controllers (FLCs) [1,2] have been extensively used as an alternative to manipulate and describe complex systems when traditional control methods do not provide viable solutions. FLCs have the capacity of modeling systems by using fuzzy "if-then" rules, normally provided by an expert. Classical fuzzy logic approaches employ either a Mamdani-type Fuzzy Inference System (FIS) [2-3] or a Takagi-Sugeno (TSK) FIS [4-5]; both have different parameters that must be tuned in order to obtain the best performance, such as the rule base, membership function parameters, etc. These parameters can be tuned manually by an expert or automatically by employing a learning approach. In this respect, this work considers Genetic Fuzzy Systems [3,6] or, more specifically, Genetic Fuzzy Controllers. In Genetic Fuzzy Controllers (GFCs) the automatic learning and tuning of parameters is based on a Genetic-Based Meta-Heuristic (GBMH). Some previous works have considered FLCs embedded with a Genetic Algorithm (GA) to tune membership function parameters [7-8] or to search for concise

fuzzy rule bases [9-10]. More recently, some works have explored Genetic Programming (GP) to build an FLC by using methodologies and concepts similar to those employed in a GA-based FLC [11-12]. In general, it is advantageous to use a GBMH exclusively to search for the best FLC configuration. In this perspective, the meta-heuristic is seen as a tool to build an FLC and not as a mechanism that may change reasoning. Still, in frameworks with a high level of hybridization, in which a genetic-based meta-heuristic has a higher participation, it may be possible to obtain better accuracy. Examples are Neuro-Fuzzy models [13,14], where Neural Networks play an important role in the hybrid architecture, enabling high accuracy and fast convergence. This work proposes a new GFC called Genetic Programming Fuzzy Inference System for Control tasks (GPFIS-Control). It makes use of Multi-Gene Genetic Programming [15-16] for extracting knowledge from the plant. The resulting architecture should: (1) automatically tune the FLC parameters; (2) make the plant output reach the setpoint as fast as possible; (3) provide linguistic comprehension

for each FLC action; and (4) be easy to implement.

This paper is organized as follows: Section 2 describes related works on GFCs and considers some applications involving GP. Section 3 describes Multi-Gene Genetic Programming and its basic differences from the standard Genetic Programming strategy. Section 4 presents the proposed GPFIS-Control in detail. Case studies are considered in Section 5, and Section 6 concludes the work.

2 Related Works

The first attempt to build an FLC by using GBMH algorithms was presented in [7], where a GA was used to tune membership function parameters of input and output variables. Subsequently, many other researchers have employed evolutionary algorithms, mostly GA, to tune FLC parameters and search for concise rule bases [17-19].

Several works can be found in the GFC area, such as [9], which presents an evolutionary procedure to modify rules, initially set by an expert, for a Mamdani-type FLC. In [20], membership functions, rule sets and consequent types (TSK or Mamdani) are tuned by a GA. Two other approaches are [8], which employs linguistic hedge operators, selected by a GA, to tune membership functions, and [10], where a hierarchical self-organized GA-based scheme is proposed. Recently, most works that make use of GA to tune FLCs focus on real applications [19, 21, 22]. Type-2 FLCs have also been tuned through GA [18]. Additionally, some non-GBMH works for tuning an FLC have considered Particle Swarm Optimization [23] and other bio-inspired algorithms [24].

Few attempts, however, have been made to build an FLC by using GP, despite its dynamic structure, which benefits rule base codification [6]. The first works in this sense were [25] and [26], which used a type-constrained GP to build a fuzzy rule-based system. In [27] an FLC based on GP for mobile robot path tracking is presented. More recently, [12] proposes the use of a GP variant to build a TSK FLC. All those approaches adapt the GP structure to formulate an FLC in a canonical way, similar to a common GA procedure. Some intrinsic advantages of GP are effectively used by these authors, but many possibilities arise, such as the use of combinations of different t-norms and t-conorms, of linguistic hedges and of different aggregation operators.

All approaches previously discussed focus on Pittsburgh-type GFCs, i.e., an individual of the population encodes a whole fuzzy rule set [3,6]. Methods that consider an individual as a fuzzy rule – Michigan, Genetic Cooperative-Competitive Learning and Iterative Rule Learning – have not been noticed in the literature [17].

GPFIS-Control is a novel GFC based on Multi-Gene Genetic Programming. This model builds a Pittsburgh-type Fuzzy Rule Based System, making use of a different reasoning method to learn fuzzy rules.

3 Multi-Gene Genetic Programming

Genetic Programming (GP) [28-29] belongs to the Evolutionary Computation field. Typically, it employs a population of individuals, each of them denoted by a tree structure that codifies a mathematical equation, which describes the relationship between the output Y and a set of input terminals Xj (j=1,...,J) (features, in the current work).

Multi-Gene Genetic Programming (MGGP) [15-16] denotes an individual as a structure of trees, also called genes, that receives Xj and tries to predict Y (Figure 1). Each individual is composed of D functions fd (d=1,...,D) that map the Xj variables to Y through user-defined mathematical operations. In GP terminology, the Xj input variables are included in the Terminal Set, while the mathematical operations (plus, minus, etc.) are inserted in the Function Set (or Mathematical Operations Set).
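As a toy illustration of this structure (a sketch with made-up genes and function names, not the authors' implementation), an individual with D = 3 genes can be held as a list of expression trees, all evaluated on the same input terminals:

```python
import math

# A multi-gene individual: D = 3 genes, each one a "tree" written here as a
# plain Python function of the input terminals x = [X1, X2].
genes = [
    lambda x: x[0] + x[1],         # gene 1: X1 + X2
    lambda x: math.sin(x[0]),      # gene 2: sin(X1)
    lambda x: x[0] * x[1] - x[1],  # gene 3: X1*X2 - X2
]

def evaluate_individual(genes, x):
    """Evaluate every gene (tree) of the individual on the same input."""
    return [g(x) for g in genes]

print(evaluate_individual(genes, [1.0, 2.0]))  # one partial output per gene
```

In symbolic regression the D partial outputs are usually combined, for instance by a weighted sum, to predict Y; in GPFIS-Control each gene will instead encode a fuzzy rule premise, as described in Section 4.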

Figure 1. Example of a multi-gene individual.

With respect to genetic operators, mutation in MGGP is similar to that in GP. As for crossover, the level at which the operation is performed must be specified: it is possible to apply crossover at high and low levels. The low level is the space where it is possible to manipulate the structures (terminals and functions) of the equations present in an individual. The high level, on the other hand, is the space where expressions can be manipulated in a macro way. In this case, mutation and low level crossover operations are similar to those performed in GP. Figure 2 presents a multi-gene individual with five equations (D=5). Figure 2a shows the mutation operation, while Figure 2b shows a low level crossover.

Figure 2. Example of Multi-Gene Genetic Programming recombination operators.

An example of high level crossover is displayed in Figure 2c. By observing the dashed lines, it can be seen that equations were switched from one individual to the other. The cutting point can be symmetric – the same number of equations is exchanged between individuals – or asymmetric. Intuitively, high level crossover has a deeper effect on the output than low level crossover or mutation. In the proposed GPFIS-Control model, asymmetric high level crossover is considered.

In general, the evolutionary process in MGGP differs from that of GP due to the addition of two parameters: the maximum number of trees per individual and the high level crossover rate. For the first parameter, a high value is always used in order to avoid creating obstacles in the evolutionary process (i.e., not allowing the MGGP individual to create more trees, which could be necessary to provide solutions for complex problems). On the other hand, the high level crossover rate, like other genetic operators' rates, needs to be adjusted, and its value is determined by performing different tests.

4 GPFIS-Control

The GPFIS-Control model is shown in Figure 3. The control signal yt is sent to the plant at time t (t=0,1,...,T). The plant outputs ztk (k=1,...,K) are compared with the setpoint values, so that the result of the difference between each plant output and its respective setpoint (the error signal xtk = ztk − Refk) is presented to the GPFIS-Control model. By using xtk it is possible to build a control signal yt that satisfies a performance criterion (fitness function g(xtk, t)).

Figure 3. Block diagram of the GPFIS-Control model.

The GPFIS-Control model is comprised of four modules: fuzzification, inference, defuzzification and evaluation. The inference process begins when each feedback error xtk is mapped onto fuzzy sets. Then, functions that map each linguistic state of xtk to a state of yt are synthesized based on MGGP principles. The crisp control signal is obtained through the defuzzification process. This solution is evaluated, and then selection and recombination operations are applied. These steps are repeated until a stopping criterion is met. The four modules are described in detail in the following subsections.

4.1 Fuzzification

Let xtk and yt admit J distinct linguistic terms, or fuzzy sets (j=1,...,J). These are defined by normalized and uniformly distributed membership functions [30]. Figure 4 presents an example of the fuzzification of the k-th plant output, with seven membership functions associated with the following linguistic terms: NB – Negative Big; NM – Negative Medium; NS – Negative Small; NZ – Near Zero; PS – Positive Small; PM – Positive Medium; and PB – Positive Big (in this case, j=1,2,...,7).

Figure 4. Example of membership functions.

After fuzzification of each input xtk, the GPFIS-Control inference process initiates.

4.2 Fuzzy Inference

The inference procedure consists of three stages: Formulation, Partitioning and Aggregation. In the Formulation stage, t-norm, t-conorm, linguistic hedge and negation operators are defined. In the Partitioning stage, the mechanism that connects each antecedent with a consequent is established. Finally, in the Aggregation stage, the operators used to combine all rules are defined.

4.2.1 Formulation

Through each µAjk(xtk) (the membership degree of xtk in a fuzzy set Ajk), GPFIS-Control evolves a controller whose output has several terms (B1 = Negative Big, ..., B7 = Positive Big, for example), with membership degrees given by:

µB1(yt) = g[ fd∈s1 ( µAj1(xt1), ..., µAjK(xtK) ) ]   (1)

µB2(yt) = g[ fd∈s2 ( µAj1(xt1), ..., µAjK(xtK) ) ]   (2)

...

µBJ(yt) = g[ fd∈sJ ( µAj1(xt1), ..., µAjK(xtK) ) ]   (3)

where fd∈sj ( µAj1(xt1), ..., µAjK(xtK) ) represents a set of functions, each of which combines all µAjk(xtk), k=1,...,K, by using a set of user-defined mathematical operations; sj (j=1,...,J) is an index set that describes which d-th function fd is related to the j-th consequent term (d ∈ sj). Methods to define sj are best described in the Partitioning stage. In order for each function fd associated to sj to behave as a fuzzy rule, it needs to employ t-norm, t-conorm, negation and linguistic hedge operators, with the aim of representing logic connectives for each linguistic term induced by µAjk(xtk). Finally, g aggregates the activation degrees of each rule set (represented by fd∈sj) into a final value. Therefore, if a set Ajk is activated, GPFIS-Control builds a rule set (function set) that combines all membership degrees µAjk(xtk) and produces an action.

In the Formulation stage, some parameters of GPFIS-Control must be defined. In MGGP, the initial parameters are called Terminals (input variables) and Mathematical Operations or Function Set (plus, times, etc.). In GPFIS-Control, on the other hand, the terminology will be Input Fuzzy Sets and Fuzzy Operators Set, respectively. Table 1 presents the initial user-defined parameters.

Table 1. Input Fuzzy Sets and Fuzzy Operators Set

Input Fuzzy Sets: µAj1(xt1), ..., µAjK(xtK)
Fuzzy Operators Set: t-norms, t-conorms, negation and linguistic hedge operators

Subsequently, by using the Fuzzy Operators Set, the µAjk(xtk) are combined in order to best describe the actions µBj(yt) taken by the controller. It is possible to enter a negated or modified ("hedged") fuzzy set in the Input Fuzzy Sets stage, instead of using negation and linguistic hedge operators in the Fuzzy Operators Set stage. This entails a larger search space, but can be of help in rule analysis. In this case, this procedure has been used to make the fuzzy rules simpler.

By using the operators and membership functions shown in Table 1, MGGP builds premises of fuzzy rules as follows:

"If X1 is Aj1 and ... and XK is AjK"

where negation and linguistic hedges can operate on each element of the antecedent term.

4.2.2 Partitioning

Let S = {s1, s2, ..., sJ} be the set of indicators sj, where each sj represents which fd (d = 1,...,D) is related to the j-th consequent Bj. The method that describes which d-th function is associated to sj is called Uniform Division. This partitioning method makes use of a simple heuristic, given by:

1. Compute: U = ⌊D/J⌋ (where ⌊.⌋ is the floor operator).

2. Partition: s1 = {1,...,U}, s2 = {U+1,...,2U}, ..., sJ = {U(J−1)+1,...,UJ}.

As an example, consider D (number of functions) = 10 and J (number of consequent terms) = 5. Thus U = 2, s1 = {1,2}, s2 = {3,4}, s3 = {5,6}, s4 = {7,8}, s5 = {9,10}. Figure 5 illustrates this process.

In summary, the fd are uniformly divided among the sj, so that each consequent has at least one rule associated to it. This method is similar to other GP-based GFSs, in that consequent and antecedent terms are both synthesized. After the definition of the rule set associated to each consequent (S = {s1, s2, ..., sJ}), the next step is to aggregate them, in order to generate a final degree of activation.

4.2.3 Aggregation

Many different aggregation operators may be found in the literature [31-32].
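The Uniform Division heuristic and two common aggregation operators (maximum and arithmetic mean) can be sketched as follows; this is illustrative code under assumed names, not the authors' implementation:

```python
def uniform_division(D, J):
    """Uniform Division: split rule indices {1,...,D} into J consecutive
    blocks s_1..s_J of size U = floor(D/J), one block per consequent."""
    U = D // J
    return [list(range(j * U + 1, (j + 1) * U + 1)) for j in range(J)]

def aggregate(activations, how="mean"):
    """Combine the activation degrees f_d of one rule set into mu_Bj(yt)."""
    if how == "max":
        return max(activations)
    return sum(activations) / len(activations)  # arithmetic mean

print(uniform_division(10, 5))        # [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
print(aggregate([0.2, 0.8], "max"))   # 0.8
print(aggregate([0.2, 0.8], "mean"))  # 0.5
```

The printed partition reproduces the D = 10, J = 5 example above.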


Figure 5. Uniform division procedure.

Some examples of g[ fd∈sj ( µAj1(xt1), ..., µAjK(xtK) ) ] are:

– g → max[ fd∈sj ( µAj1(xt1), ..., µAjK(xtK) ) ]: the max aggregation operator, the one most commonly used in Mamdani-type FIS;

– g → (1/card(sj)) ∑d∈sj [ fd ( µAj1(xt1), ..., µAjK(xtK) ) ]: the arithmetic mean operator, which intends to provide equal weights to each element of the rule set associated to the j-th consequent.

In [31], several aggregation operators are presented. It can be shown that t-norms and t-conorms are special cases of aggregation operators. In the experiments, both the arithmetic mean and maximum operators have been used. Once the aggregation operators have been defined, it is possible to compute the membership degrees of the different actions µBj(yt) taken by the controller. The defuzzified control signal yt is then computed.

4.3 Defuzzification

Basically, a defuzzification method (center of gravity, mean of maximum, etc.) produces a crisp value that is an interpretation of the information contained in the output fuzzy set resulting from the inference process. In GPFIS-Control the height method is used:

yt = ∑j=1..J bj µBj(yt) / ∑j=1..J µBj(yt)   (4)

where bj represents the center (location) parameter of each Bj. The maximum height method may be employed when the control signal assumes values in some finite set:

yt = ∑j=1..J φj bj µBj(yt) / ∑j=1..J φj µBj(yt)   (5)

where φj is an indicator function, such that φj = 1 when µBj(yt) > µBl(yt) for all l = 1,...,J, l ≠ j, and φj = 0 otherwise.

Figure 6 presents an illustration of the difference between these procedures. In that example, µNM(yt) = 0.8 and µNZ(yt) = 0.6, while µNB(yt) = µNS(yt) = µPS(yt) = µPM(yt) = µPB(yt) = 0. Then, according to Eq. (4):

yt = (−20 · 0.8 + 0 · 0.6) / (0.8 + 0.6) = −11.43

If the maximum height method were used instead, the response would be −20, due to the presence of a single maximum value. The height method provides a smoother transition between responses than the maximum height method.

4.4 Evaluation

The right definition of the fitness function is crucial for obtaining a good performance from the GPFIS-Control model. For optimal tracking of a trajectory, a possible fitness function is the Mean Squared Error (MSE):

MSE = (1/K) ∑k=1..K (xtk)²   (6)

When the MSE is minimized, the GPFIS-Control model successfully obtains a trajectory close to the setpoint. In minimum time problems, the fitness function may be the time t the output takes to reach an MSE < ε, where ε is a tolerance.
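Returning to the defuzzification example of Figure 6, the height and maximum height procedures can be checked numerically; a minimal sketch, with hypothetical function names:

```python
def height_defuzz(centers, degrees):
    """Height method, Eq. (4): average of the consequent centers b_j
    weighted by their membership degrees mu_Bj(yt)."""
    return sum(b * mu for b, mu in zip(centers, degrees)) / sum(degrees)

def max_height_defuzz(centers, degrees):
    """Maximum height method, Eq. (5): center of the consequent term
    with the single largest membership degree."""
    best = max(range(len(degrees)), key=lambda j: degrees[j])
    return centers[best]

# Figure 6 example: mu_NM = 0.8 (center -20) and near-zero degree 0.6 (center 0).
centers, degrees = [-20.0, 0.0], [0.8, 0.6]
print(round(height_defuzz(centers, degrees), 2))  # -11.43
print(max_height_defuzz(centers, degrees))        # -20.0
```

The two printed values reproduce the smoother height-method response (−11.43) versus the abrupt maximum-height response (−20).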


GPFIS-Control tries to reduce the size and complexity of the rule base by employing a simple heuristic called Lexicographic Parsimony Pressure [33]. This technique is only used in the selection phase: given two individuals with the same fitness, the better one is that with fewer nodes. Fewer nodes indicate rules with fewer antecedents, hedge and negation operators, as well as fewer functions (fd) and, therefore, a smaller rule set. After the evaluation procedure, a set of individuals is selected (through a tournament procedure) and recombined. Then, in a subset of the population, the mutation (Figure 7a), low-level crossover (Figure 7b) or high-level crossover (Figure 7c) operators are applied. Finally, a new population is generated. This process is repeated until a stopping criterion is met. At that moment, the final population is returned.
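The selection rule just described, a fitness tie broken by node count, can be sketched as follows (illustrative code, not the authors' implementation):

```python
import random

def tournament(population, k, rng=random):
    """Tournament selection with Lexicographic Parsimony Pressure:
    lowest fitness wins; a fitness tie is broken in favour of the
    individual with fewer nodes. Individuals: (fitness, n_nodes, genome)."""
    contestants = rng.sample(population, k)
    return min(contestants, key=lambda ind: (ind[0], ind[1]))

# Two individuals tie on fitness 3.2; the smaller one ("B") is preferred.
pop = [(3.2, 40, "A"), (3.2, 25, "B"), (5.0, 10, "C"), (4.1, 8, "D"), (3.5, 30, "E")]
print(tournament(pop, k=5))  # (3.2, 25, 'B')
```

With k equal to the population size the outcome is deterministic here; in practice k is the tournament size (5 in Table 3).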

5 Case Studies Two benchmark problems have been considered to evaluate the GPFIS-Control model: cartcentering [25,28] and inverted pendulum [9,12]. The cart-centering problem has been used to assess the performance of GPFIS-Control in comparison with other GFCs. The application to the inverted pendulum made use of the GPFIS-Control parameters obtained in the tuning for the cart-centering problem. Results were compared to those presented in [12].

5.1 Experimental Settings

5.1.1 Cart-centering Problem

The cart-centering problem consists of a cart with mass m moving on a frictionless rail; at some instant t its position is xt (m) and its velocity is vt (m/s). The cart must stop (vt = 0) at a user-defined setpoint ref. Tolerance values ε may be considered, so that |xt − ref| < ε and |vt − ref| < ε. The plant dynamics is given by Equations (7) and (8):

vt+τ = vt + τ (Ft / m)   (7)

xt+τ = xt + τ vt   (8)
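One Euler step of this plant can be sketched as follows (illustrative code; the function name is an assumption, and default values follow Table 2):

```python
def cart_step(x, v, F, m=2.0, tau=0.02):
    """One Euler step of Equations (7)-(8); the position update uses the
    velocity of the previous instant."""
    x_next = x + tau * v        # Eq. (8)
    v_next = v + tau * (F / m)  # Eq. (7)
    return x_next, v_next

# One step from x = 1 m, v = 0 m/s under full braking force F = -2.5 N:
x, v = cart_step(1.0, 0.0, -2.5)
print(x, v)  # 1.0 -0.025
```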

where τ is the sampling period and Ft is the force (N) applied by the controller to the cart at time t. The objective is to reach the setpoint in minimum time. The performance of GPFIS-Control has been compared to the GFC presented in [25]. Several configurations of GPFIS-Control (t-norms, aggregation operators, etc.) have been evaluated. To perform a fair comparison, configurations were the same as those in [25] for all variables and parameters (Table 2 displays these values). GPFIS-Control is required to move the cart until |xt − 0| < 0.5 and |vt − 0| < 0.5, given 16 initial values uniformly distributed on the xt domain. The fitness function has been defined as:

Fitness = tε + ∑t |xt|   (9)
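Equation (9) can be sketched as follows (illustrative; the trajectory representation as a list of (x, v) samples is an assumption):

```python
def fitness(trajectory, tau=0.02, eps=0.5):
    """Fitness of Eq. (9): time t_eps needed to satisfy the stopping
    criterion (|x| < eps and |v| < eps) plus the accumulated |x|.
    `trajectory` holds one (x, v) pair per sampling step."""
    t_eps = len(trajectory) * tau  # worst case: criterion never met
    for i, (x, v) in enumerate(trajectory):
        if abs(x) < eps and abs(v) < eps:
            t_eps = i * tau
            break
    return t_eps + sum(abs(x) for x, _ in trajectory)

traj = [(1.0, -0.5), (0.6, -0.5), (0.4, -0.2)]  # criterion first met at step 2
print(fitness(traj))  # 0.04 + 2.0 = 2.04
```

Both terms pull in the same direction: stopping sooner shrinks tε, and staying near the setpoint shrinks the accumulated |xt|.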

where tε is the time needed to satisfy the stopping criteria (|xt − 0| < 0.5 and |vt − 0| < 0.5). An individual in the GPFIS-Control population is considered unfeasible if it cannot stop the cart in 10 seconds (500 sampling steps).

Table 2. Configuration set for the cart-centering problem

Variable   Domain
Ft         [−2.5, 2.5] N
vt         [−2.5, 2.5] m/s
xt         [−2.5, 2.5] m

Parameter  Value
τ          0.02 s
ε          0.5
m          2.0 kg
ref        xt = vt = 0

After the best solution is found, it is applied to 1000 random initial positions in order to evaluate the time taken by GPFIS-Control to stop the cart. In order to perform a fair comparison with [25], the following procedure has been executed 10 times: (i) generate a GPFIS-Control model and (ii) apply it to 1000 random positions, so as to produce statistically relevant results. The best GPFIS-Control model is obtained, in each execution, after 25000 evaluations (population size = 50, number of generations = 500). Table 3 displays all the remaining parameters of the GPFIS-Control model.


Figure 6. Example of defuzzification procedures.

Table 3. Remaining GPFIS-Control parameters

  Parameter                      Value
  Tournament size                5
  Maximum tree depth             5
  Elitism rate                   1%
  Maximum rules per individual   50
  Low-level crossover rate       75%
  High-level crossover rate      50%
  Mutation rate                  20%
  Direct reproduction rate       5%
  Input fuzzy sets               7 fuzzy sets + classical negation* of each fuzzy set per variable
  Fuzzy operators set            t-norm: product; others: described for each experiment

  * used only when indicated

5.1.2 Inverted Pendulum Problem

The second experiment consisted of an application of GPFIS-Control to the inverted pendulum problem. Results were then compared to those of [12]. In this problem, a cart of mass M, with a pole of mass m and length λ attached to its center, moves on a frictionless rail. The controller must apply Ft in order to increase or decrease vt and, consequently, change the angular velocity ωt and the pendulum angle θt. The dynamic model [12,28] is described below:

  ϕt = [g sin θt + cos(θt) Ψt] / (λ [4/3 − m cos²θt / (M + m)])   (10)

  Ψt = (−Ft − m λ ωt² sin θt) / (M + m)   (11)

  ωt+1 = ωt + τ ϕt   (12)

  θt+1 = θt + τ ωt   (13)

  at = (Ft + m λ [ωt² sin θt − ϕt cos θt]) / (M + m)   (14)

  vt+1 = vt + τ at   (15)

  xt+1 = xt + τ vt   (16)

where ϕt is the angular acceleration and τ is the sampling step. In order to perform a fair comparison with [12], the feasible domain for each


Figure 7. Example of recombination operators applied in GPFIS-Control solutions.


variable was set as: ωt ∈ [−0.87, 0.87] rad/s, θt ∈ [−0.34, 0.34] rad, Ft ∈ [−25, 25] N, while xt and vt are unconstrained; M = 1 kg, m = 0.1 kg, λ = 0.5 m, g = 9.8 m/s², and τ = 0.01 s. Two initial conditions were considered: θ0 = {−0.18, 0.18} rad, with ω0 = 0 rad/s in both cases, and the setpoint is ref = 0 rad with ε = 0.01. The time allowed for the position |θt − 0| < 0.01 to be reached is at most 1 second (100 sampling steps).
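Equations (10)-(16), matching the standard cart-pole model as reconstructed here, translate directly into one simulation step; the constants follow the values just listed:

```python
import math

G, M, MP, LAM, TAU = 9.8, 1.0, 0.1, 0.5, 0.01  # g, cart mass, pole mass, pole length, step

def pendulum_step(state, F):
    """One explicit Euler step of the cart-pole dynamics, Eqs. (10)-(16).
    state = (x, v, theta, omega); F is the controller force in [-25, 25] N."""
    x, v, th, om = state
    psi = (-F - MP * LAM * om ** 2 * math.sin(th)) / (M + MP)        # Eq. (11)
    phi = (G * math.sin(th) + math.cos(th) * psi) / (
        LAM * (4.0 / 3.0 - MP * math.cos(th) ** 2 / (M + MP)))       # Eq. (10)
    a = (F + MP * LAM * (om ** 2 * math.sin(th)
                         - phi * math.cos(th))) / (M + MP)           # Eq. (14)
    # Eqs. (12)-(13) and (15)-(16): Euler updates of the four state variables
    return (x + TAU * v, v + TAU * a, th + TAU * om, om + TAU * phi)
```

As a sanity check, with F = 0 and a small positive θ the returned angular velocity is positive, i.e., the unforced pendulum falls further.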

The best configurations were obtained with the following parameters: maximum height method for defuzzification and average as the aggregation operator. Figure 8 presents the 16 initial and final positions when |xt − 0| < 0.5 and |vt − 0| < 0.5. Figure 9 exhibits the response surface for the best GPFIS-Control configuration with (a) the maximum height defuzzification method and (b) the height defuzzification method. It can be seen that the surface for (b) is smoother than that for (a), due to the broader set of values that Ft can assume when the height method is chosen.

As in [12], 100,000 evaluations (population size = 100 and number of generations = 1000) have been made. This whole procedure was repeated 10 times, in order to generate statistically relevant results. Table 3 exhibits the remaining parameters used. The fitness function is [12]:

  Fitness = ∑_{t=1}^{100} (θt − ref)²   (17)

The average best result for GPFIS-Control (135.8 steps) compares favorably with those of [25] (158 steps) and [35] (149 steps). The optimal solution is 129 steps.

Figure 9. Response surface for the best individual in cart-centering for (a) the maximum height and (b) the height defuzzification methods.

In both experiments seven fuzzy sets have been assigned to each variable (Ft , xt , vt , ωt , θt ), as shown in Figure 4. In some cases, the negation of a fuzzy set was entered in the Input Fuzzy Sets stage of the GPFIS-Control routine (as described in section 4.2.1). All experiments were performed in MATLAB R2010a [34].

5.2 Results and Discussions

5.2.1 Cart-Centering Problem

Figure 8. Initial and final positions for the best individual in an execution of GPFIS-Control, using Product+Root-Sq+MaxHeight and the average aggregation operator.


The main results obtained with the cart-centering problem are presented in Table 4. GPFIS-Control was tested with the linguistic hedge square root, the classical negation operator, different aggregation operators (max and average) and different defuzzification methods (height and maximum height). It can be seen that, for almost all configurations, the use of the average aggregation operator reduces the mean time taken by the controller to position the cart at |xt − 0| < 0.5 and |vt − 0| < 0.5 by about 39%. It may also be noted that the maximum height defuzzification reduces that time by 14% on average. However, the use of the negation operator

Table 4. Results of GPFIS-Control for the cart-centering problem (average aggregation operator)

  Attribute                   Product+Root-Sq+MaxHeight   Product+Root-Sq+Height
  Average Steps (0.02 s)      215.9                       243.6
  Std. Dev. Steps (0.02 s)    25.73                       94.09
  Average Time (s)            4.318                       4.872
  Average Rules               21                          24

5.2.2 Inverted Pendulum

Based on the best configuration previously established (Product + Root-Sq + Average + MaxHeight), GPFIS-Control has been applied to the inverted pendulum problem. Figure 10 shows the controller's behavior, generated by the best individual in 100,000 evaluations, given two initial conditions: θ0 = {−0.18, 0.18} rad, with ω0 = 0 rad/s. The average best result found for GPFIS-Control was 0.27 seconds to reach and stay at |θt − 0| < 0.01 during 1.0 second, on average.

me(t1) > 0 and me(t2) > 0. Let s = min[me(t1), me(t2)]. Then s > 0. By the convexity of fuzzy numbers, we have


Zhenyuan Wang and Li Zhang-Westman

For a trapezoidal fuzzy number, the expansion center is

  ce = al + √((1/2)(ab − al)(ad + ar − al − ab)),   if al + ad + ar ≤ 3ab,
  ce = ar − √((1/2)(ar − ad)(ad + ar − al − ab)),   if al + ab + ar ≥ 3ad,
  ce = (al + ab + ad + ar)/4,   otherwise.   (4)

For a triangular fuzzy number,

  ce = al + √((1/2)(a0 − al)(ar − al)),   if al + ar ≤ 2a0,
  ce = ar − √((1/2)(ar − a0)(ar − al)),   otherwise.   (5)

me(t) ≥ s for every t ∈ (t1, t2). Thus, ∫_{−∞}^{∞} me(t) dt ≥ ∫_{t1}^{t2} me(t) dt ≥ s(t2 − t1) > 0. This contradicts the condition ∫_{−∞}^{∞} me(t) dt = 0.



Lemma 3. If ∫_{−∞}^{b} me(t) dt > 0 and ∫_{d}^{∞} me(t) dt > 0 for some real numbers b and d with b < d, then ∫_{b}^{d} me(t) dt > 0.



Proof. From ∫_{−∞}^{b} me(t) dt > 0, we may find a point t1 < b such that me(t1) > 0. Similarly, we may find a point t2 > d such that me(t2) > 0. Denote min[me(t1), me(t2)] by s. Then s > 0. By the convexity of ẽ, we have me(t) ≥ s for any t between t1 and t2. Thus, ∫_{b}^{d} me(t) dt ≥ s(d − b) > 0.

Definition 4. Let ẽ be a fuzzy number but not a crisp real number. A real number ce is called the expansion center of ẽ iff ∫_{−∞}^{ce} me(t) dt = ∫_{ce}^{∞} me(t) dt. For any real number, its expansion center is just itself.

The role of the expansion center of a given fuzzy number in fuzzy mathematics is similar to that of the median (50th percentile) in statistics. The former is one of the numerical indexes of the fuzziness possessed by a fuzzy number, while the latter is used for describing the randomness of a random variable. For any crisp real number, its expansion center is defined as itself. For any symmetric fuzzy number, its expansion center coincides with the symmetry center.

Theorem 1. For any given fuzzy number, its expansion center exists and is unique.

Proof. We only need to prove the conclusion for fuzzy numbers that are not crisp real numbers. Denote a given non-crisp fuzzy number by ẽ. From Lemma 1, we know that the definite integrals ∫_{−∞}^{x} me(t) dt and ∫_{x}^{∞} me(t) dt exist and are finite for any real number x, and can be regarded as functions of x. Let r = ∫_{−∞}^{∞} me(t) dt and f(x) = ∫_{−∞}^{x} me(t) dt − ∫_{x}^{∞} me(t) dt, where x ∈ R. Function f is continuous. Since ẽ is not a


crisp real number and supp(ẽ) is bounded, we have 0 < r < ∞ and lim_{x→−∞} f(x) = −r as well as lim_{x→∞} f(x) = r. By the well-known Intermediate Value Theorem for continuous functions, we know that there exists a real number ce such that f(ce) = 0 and, therefore, ∫_{−∞}^{ce} me(t) dt = ∫_{ce}^{∞} me(t) dt, that is, ce is an expansion center of ẽ. Furthermore, we have ∫_{−∞}^{ce} me(t) dt = ∫_{ce}^{∞} me(t) dt = r/2. Now, let us prove the uniqueness of the expansion center. Assume that, for the given fuzzy number ẽ, there exists another expansion center c′e. Without any loss of generality, we may assume that c′e < ce. From both ∫_{−∞}^{ce} me(t) dt = ∫_{ce}^{∞} me(t) dt = r/2 and ∫_{−∞}^{c′e} me(t) dt = ∫_{c′e}^{∞} me(t) dt = r/2, we obtain ∫_{−∞}^{c′e} me(t) dt = ∫_{−∞}^{ce} me(t) dt. So, ∫_{c′e}^{ce} me(t) dt = 0. By using Lemma 3, we get ce = c′e. The proof is now complete.
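The existence argument is constructive: f is continuous and nondecreasing, so the expansion center can be located numerically by accumulating the area under the membership function until half of the total is reached. A minimal sketch (function and parameter names are ours, not the paper's):

```python
def expansion_center(m, lo, hi, n=100000):
    """Numerically locate the expansion center of Definition 4.
    m is the membership function; [lo, hi] must contain supp(e~)."""
    h = (hi - lo) / n
    vals = [m(lo + (i + 0.5) * h) for i in range(n)]   # midpoint Riemann sum
    half, run = sum(vals) / 2.0, 0.0
    for i, v in enumerate(vals):
        run += v
        if run >= half:                # half of the total area reached
            return lo + (i + 1) * h
    return hi

# Symmetric triangular number m(t) = max(0, 1 - |t - 2|): the expansion
# center coincides with the symmetry center t = 2, up to grid resolution.
c = expansion_center(lambda t: max(0.0, 1.0 - abs(t - 2.0)), 0.0, 4.0)
```

The accuracy is limited by the grid step h; a finer grid or a root finder on f(x) would sharpen it.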

Example 1. Let ẽ be a trapezoidal fuzzy number with membership function (3) shown in Section 2. Its expansion center ce can be expressed as (4). As a special case, when ẽ is a triangular fuzzy number with membership function (2), its expansion center is given by (5). As another special case, when ẽ is a rectangular fuzzy number with membership function (1), i.e., when ab − al = ar − ad = 0, expression (4) reduces to ce = ½(al + ar). This coincides with intuition. For a given fuzzy number, in general cases, a numerical method may be used to calculate the values of the involved definite integrals, and a soft computing technique, such as a genetic algorithm, can be adopted to optimize the location of its expansion center.
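Expression (4) transcribes directly into code. As a check, the trapezoid with al = ab = 0, ad = 1, ar = 5 (the number ẽ of Example 2 below) falls into the second case and yields ce = 5 − √12 ≈ 1.5359, while the rectangular case returns the interval midpoint:

```python
from math import sqrt

def trapezoid_center(al, ab, ad, ar):
    """Expansion center of a trapezoidal fuzzy number with support
    [al, ar] and core [ab, ad], following the case analysis of (4)."""
    if al + ad + ar <= 3 * ab:
        return al + sqrt(0.5 * (ab - al) * (ad + ar - al - ab))
    if al + ab + ar >= 3 * ad:
        return ar - sqrt(0.5 * (ar - ad) * (ad + ar - al - ab))
    return (al + ab + ad + ar) / 4.0

ce_trap = trapezoid_center(0.0, 0.0, 1.0, 5.0)   # 5 - sqrt(12), about 1.5359
ce_rect = trapezoid_center(0.0, 0.0, 4.0, 4.0)   # rectangle: (al + ar) / 2
```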


A NEW RANKING METHOD FOR . . .

5 Ranking Fuzzy Numbers by Their Expansion Center

By using the expansion center of fuzzy numbers, we may propose a ranking method as follows.

Definition 5. Let ẽ and f̃ be two fuzzy numbers. We say that fuzzy number ẽ precedes fuzzy number f̃, denoted by ẽ ≼ f̃, iff their expansion centers satisfy ce ≤ cf.

Relation ≼ is well defined on the set of all fuzzy numbers, NF. It does not satisfy antisymmetry, i.e., ẽ and f̃ may be different even when both ẽ ≼ f̃ and f̃ ≼ ẽ hold. So, ≼ is not a total ordering on NF. However, this relation can be used to rank fuzzy numbers.

Example 2. Let fuzzy numbers ẽ and f̃ have membership functions

  me(x) = 1, if x ∈ [0, 1];  (5 − x)/4, if x ∈ (1, 5];  0, otherwise

and

  mf(x) = x, if x ∈ [0, 1];  2 − x, if x ∈ (1, 1.5];  0.5, if x ∈ (1.5, 3.5];  0, otherwise,

shown in Figures 1 and 2 respectively. ẽ is a trapezoidal fuzzy number but f̃ is not. Their expansion centers are ce ≈ 1.5359 and cf = 1.5625 respectively. So, ẽ ≼ f̃.

Figure 1. The membership function of fuzzy number ẽ and its expansion center ce ≈ 1.536 in Example 2.

Figure 2. The membership function of fuzzy number f̃ and its expansion center cf = 1.5625 in Example 2.

6 Geometric Intuitivity of Ranking and Total Ordering

Intuitively, if the curve of the membership function of ẽ is totally on the left of (including coinciding with) the curve of the membership function of f̃, then the ranking method should issue ẽ ≼ f̃. This characterization guarantees that the ranking coincides with the natural ordering of the real numbers, i.e., regarding real numbers as special fuzzy numbers, their defined rank as fuzzy numbers is the same as their natural order.

Definition 6. For two fuzzy numbers ẽ and f̃, if the α-cuts satisfy ẽα ≤ f̃α, i.e., lα(ẽ) ≤ lα(f̃) and rα(ẽ) ≤ rα(f̃), for every α ∈ (0, 1], then ẽ ≼ f̃.

The geometric intuitivity can be defined by the geometric location of the non-zero parts of the membership functions corresponding to the given fuzzy numbers. In our case, the proposed ranking method obviously follows the above-mentioned definition. So, our ranking method adheres to the geometric intuitivity for rankings.

7 Equivalence Classes and a Total Ordering on the Quotient Space

By using the concept of expansion center, we may define a relation ∼ on the set of all fuzzy numbers, NF, as follows.

Definition 7. We say that two fuzzy numbers ẽ and f̃ are equivalent, denoted as ẽ ∼ f̃, iff they have the same expansion center, i.e., ce = cf.

Example 3. Let fuzzy numbers g̃ and h̃ have membership functions

  mg(x) = 2x/3, if x ∈ [0, 1.5);  1, if x = 1.5;  (6 − 2x)/3, if x ∈ (1.5, 3];  0, otherwise

and

  mh(x) = 1, if x ∈ [0, 2);  (4 − x)/2, if x ∈ (2, 4];  0, otherwise.


Acknowledgement

This work is partially supported by the National Natural Science Foundation of China, grants #70921061 and #71110107026, and by the CAS/SAFEA International Partnership Program for Creative Research Teams.

References

[1] S. Abbasbandy and B. Asady, Ranking of fuzzy numbers by sign distance, Information Sciences 176(16), 2405-2416, 2006.

Figure 3 shows the membership functions of fuzzy numbers g̃ and h̃ with their expansion centers. Their expansion centers cg and ch have the same value 1.5. So, g̃ ∼ h̃, that is, these two fuzzy numbers have the same rank when the ranking method shown in Section 5 is used.

[2] G. Bortolan and R. Degani, A review of some methods for ranking fuzzy numbers, Fuzzy Sets and Systems 15, 1-19, 1985.

[3] C. H. Cheng, A new approach for ranking fuzzy numbers by distance method, Fuzzy Sets and Systems 95, 307-317, 1998.

[4] T. C. Chu and C. T. Tsao, Ranking fuzzy numbers with an area between the centroid point and original point, Computers and Mathematics with Applications 43, 111-117, 2002.

[5] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications, Academic Press, New York, 1980.


[6] D. Dubois and H. Prade, Ranking fuzzy numbers in the setting of possibility theory, Information Sciences 30, 183-224, 1983.

Figure 3. The membership functions of fuzzy numbers g̃ and h̃ with their expansion centers cg = ch = 1.5 in Example 3.

[7] B. Farhadinia, Ranking fuzzy numbers on lexicographical ordering, International Journal of Applied Mathematics and Computer Sciences 5(4), 248-251, 2009.

Relation ∼ is reflexive, symmetric, and transitive. Hence, it is an equivalence relation on NF. The collection of all equivalence classes with respect to the equivalence relation ∼ forms the quotient space, denoted by Q∼. Regarding ≼ as a relation on Q∼, (Q∼, ≼) is a totally ordered set [14].

8 Conclusions

The newly introduced concept of expansion center for fuzzy numbers is effective in ranking fuzzy numbers. This new way of ranking fuzzy numbers is intuitive and practicable. It provides an alternative choice in decision making and data mining within a fuzzy environment.

[8] N. Furukawa, A parametric total order on fuzzy numbers and a fuzzy shortest route problem, Optimization 30, 367-377, 1994.

[9] G. J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice Hall, 1995.

[10] M. Kurano, M. Yasuda, J. Nakagami, and Y. Yoshida, Ordering of convex fuzzy sets − a brief survey and new results, Journal of the Operations Research Society of Japan 43(1), 138-148, 2000.

[11] T. S. Liou and M. J. Wang, Ranking fuzzy numbers with integral value, Fuzzy Sets and Systems 50, 247-255, 1992.

[12] S. H. Nasseri and M. Sohrabi, Hadi's method and its advantage in ranking fuzzy numbers, Australian Journal of Basic and Applied Sciences 4(10), 4630-4637, 2010.

[13] J. Ramík and J. Římánek, Inequality relation between fuzzy numbers and its use in fuzzy optimization, Fuzzy Sets and Systems 16, 123-138, 1985.

[14] K. H. Rosen, Discrete Mathematics and Its Applications (Seventh Edition), McGraw-Hill, 2011.

[15] W. Wang and Z. Wang, Total orderings defined on the set of all fuzzy numbers, Fuzzy Sets and Systems 234, 31-41, 2014.

[16] Y. J. Wang and H. S. Lee, The revised method of ranking fuzzy numbers with an area between the centroid point and original point, Computers and Mathematics with Applications 55, 2033-2042, 2008.

[17] Y.-M. Wang, J.-B. Yang, D.-L. Xu, and K. S. Chin, On the centroid of fuzzy numbers, Fuzzy Sets and Systems 157, 919-926, 2006.

[18] Z. Wang, R. Yang, and K. S. Leung, Nonlinear Integrals and Their Applications in Data Mining, World Scientific, 2010.

[19] J. S. Yao and K. Wu, Ranking fuzzy numbers based on decomposition principle and signed distance, Fuzzy Sets and Systems 116, 275-288, 2000.

JAISCR, 2014, Vol. 4, No. 3, pp. 189–204, DOI: 10.1515/jaiscr-2015-0008

REPULSIVE SELF-ADAPTIVE ACCELERATION PARTICLE SWARM OPTIMIZATION APPROACH

Simone A. Ludwig
Department of Computer Science, North Dakota State University, Fargo, ND, USA

Abstract

Adaptive Particle Swarm Optimization (PSO) variants have become popular in recent years. The main idea of these adaptive PSO variants is that they adaptively change their search behavior during the optimization process based on information gathered during the run. Adaptive PSO variants have been shown to solve a wide range of difficult optimization problems efficiently and effectively. In this paper we propose a Repulsive Self-adaptive Acceleration PSO (RSAPSO) variant that adaptively optimizes the velocity weights of every particle at every iteration. The velocity weights include the acceleration constants as well as the inertia weight, which are responsible for the balance between exploration and exploitation. Our proposed RSAPSO variant optimizes the velocity weights that are then used to search for the optimal solution of the problem (e.g., a benchmark function). We compare RSAPSO to four known adaptive PSO variants (decreasing weight PSO, time-varying acceleration coefficients PSO, guaranteed convergence PSO, and attractive and repulsive PSO) on twenty benchmark problems. The results show that RSAPSO achieves better results compared to the known PSO variants on difficult optimization problems that require large numbers of function evaluations.

1 Introduction

Particle Swarm Optimization (PSO) is one of the swarm intelligence methods [1]. The behavior of PSO is inspired by bird swarms searching for optimal food sources, where the direction in which a bird moves is influenced by its current movement, the best food source it ever experienced, and the best food source any bird in the swarm ever experienced. As for PSO, the movement of a particle is influenced by its inertia, its personal best position, and the global best position of the swarm. PSO has several particles, and every particle maintains its current objective value, its position, its velocity, its personal best value, that is the best objective value the particle ever experienced, and its personal best position, that is the position at which the personal best value has been found. In addition, PSO maintains a global best value, that is the best

objective value any particle has ever experienced, and a global best position, that is the position at which the global best value has been found. Basic PSO [1] uses the following equation to move the particles:

  x(i)(n + 1) = x(i)(n) + v(i)(n + 1),  n = 0, 1, 2, ..., N − 1,   (1a)

where x(i) is the position of particle i, n is the iteration number with n = 0 referring to the initialization, N is the total number of iterations, and v(i) is the velocity of particle i, i = 1, 2, ..., np, where np is the number of particles. Basic PSO uses the following equation to update the particle velocities:

  v(i)(n + 1) = w v(i)(n) + c1 r1(i)(n)[xp(i)(n) − x(i)(n)] + c2 r2(i)(n)[xg(n) − x(i)(n)],  n = 0, 1, 2, ..., N − 1,   (1b)
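Equations (1a)-(1b) for a single particle can be sketched as follows; w = 1 matches the setting stated in the text, while the acceleration constants 2.0 are common illustrative values, not settings taken from this paper:

```python
import random

def pso_step(x, v, p, g, w=1.0, c1=2.0, c2=2.0):
    """One basic-PSO update of Eqs. (1a)-(1b) for a single particle.
    x, v, p are the particle's position, velocity and personal best
    position; g is the global best position (all plain lists)."""
    r1 = [random.random() for _ in x]   # fresh random vector per update
    r2 = [random.random() for _ in x]
    v_new = [w * vi + c1 * a * (pi - xi) + c2 * b * (gi - xi)
             for vi, a, pi, b, gi, xi in zip(v, r1, p, r2, g, x)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]   # Eq. (1a)
    return x_new, v_new
```

When a particle sits exactly at both its personal and global best, the update reduces to pure inertia, which is a handy sanity check.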


where xp(i)(n) is the personal best position of particle i, xg(n) is the global best position, w is the inertia weight and is set to 1, and the acceleration constants are c1 and c2. Both r1(i) and r2(i) are vectors with components having random values uniformly distributed between 0 and 1. The notation r(i)(n) denotes that a new random vector is generated for every particle i at iteration n. PSO can focus on either population/particle diversity or convergence of particles at any iteration. Diversity favors particles that search a large area coarsely, whereas convergence favors particles that search a small area intensively. A promising strategy is to promote diversity of the swarm in early iterations and convergence in later iterations [2, 3], or to assign attributes to individual particles that promote diversity or convergence [4]. Despite PSO's simplicity, the success of PSO largely depends on the selection of optimal values for the control parameters w, c1, and c2. Non-optimal values of the control parameters lead to suboptimal solutions, premature convergence, stagnation, or divergent or cyclic behaviour [5, 6]. However, the optimal setting of the control parameters is dependent on the problem and might be different for the particles within the swarm. Since finding the optimal control parameters manually is very time-consuming, related work has addressed this with PSO variants that adaptively change all or some of the control parameters. For example, decreasing weight PSO decreases the inertia weight w(n) linearly over time [2], time-varying acceleration coefficients PSO changes not only the inertia weight but also c1(n) and c2(n) over time [2, 3], and guaranteed convergence PSO ensures that the global best particle searches within a dynamically adapted radius [7, 8]. Other variants include the linear reduction of the maximum velocity PSO [9], and non-linear adjusted inertia weight PSO [10].

PSO with dynamic adaptation [11] uses an evolutionary speed factor that measures personal best value changes and an aggregation degree that measures the relative position of particles in the objective space to calculate the inertia weight w. APSO in [12] adapts the inertia weight of every particle based on its objective value, the global best value, and the global worst value. APSO introduced in [13] changes its inertia weight based on swarm diversity to reduce premature convergence and hence to increase overall convergence; the swarm diversity is calculated as a function of the positions. Different variations of the self-tuning APSO are discussed in [14, 15, 16]. Self-tuning APSO as described in [15] grants every particle its own personal best weight c1(i) and global best weight c2(i). Self-tuning APSO initializes the personal best weights c1(i) and the global best weights c2(i) randomly for every particle, and moves the personal and global best weights towards the values of the particle that yielded the most updates of the global best position, where the size of the movement towards the personal best weight c1(i) and the global best weight c2(i) is based on the total number of iterations [14]. In an update of self-tuning APSO, the personal and global best weights are moved in ever smaller steps for increasing numbers of iterations [15]. Past research has shown that adapting the velocity weights improves the convergence speed of PSO compared to having fixed velocity weights. Therefore, our approach is inspired by other PSO variants that assign every particle its own velocity weights [15, 16]. These PSO variants usually adapt the velocity weights of a certain particle that is selected based on a measure of superior performance [16] and adopt these velocity weights for all other particles. This paper is an extension of the work published as a short paper in [17]. The organization of this paper is as follows: In Section 2, details of the four PSO variants against which we compare RSAPSO are given. Section 3 introduces and describes the proposed RSAPSO variant. In Section 4, the benchmark problems used to compare the variants with RSAPSO are outlined. Section 5 lists the conclusions reached from this study.

2 Related Work and PSO Variants Used for Comparison

We are interested in finding the global minimum of an objective function f (x) in a D-dimensional search space of the form [xmin , xmax ]D . In order to assess the performance of RSAPSO, we utilize four related adaptive PSO variants: decreasing weight


PSO, time-varying acceleration coefficients PSO, guaranteed convergence PSO, and attractive and repulsive PSO.

2.1 Decreasing Weight PSO (DWPSO)

DWPSO is similar to basic PSO, but the inertia weight w(n) is decreased linearly over time [2]. Thus, DWPSO promotes diversity in early iterations and convergence in late iterations. DWPSO uses Equation (1b) to determine the velocities of the particles, whereby the inertia weight w(n) is calculated using:

  w(n) = ws − (ws − we) n / (N − 1),   (2)

where ws is the inertia weight for the first iteration, and we is the inertia weight for the last iteration.

2.2 Time-Varying Acceleration Coefficients PSO (TVACPSO)

TVACPSO adapts the acceleration coefficients, i.e., the personal best weight c1(n) and the global best weight c2(n), over time, besides the inertia weight w(n) [2, 3]. The idea is to have high diversity during early iterations and high convergence during late iterations. The inertia weight w(n) is changed as in DWPSO using Equation (2). TVACPSO uses the following equation to determine the velocities:

  v(i)(n + 1) = w(n)v(i)(n) + c1(n)r1(i)(n)[xp(i)(n) − x(i)(n)] + c2(n)r2(i)(n)[xg(n) − x(i)(n)],  n = 0, 1, 2, ..., N − 1,   (3a)

where the personal best weight c1(n) and the global best weight c2(n) at iteration n are calculated using:

  c1(n) = c1s − (c1s − c1e) n / (N − 1),
  c2(n) = c2s − (c2s − c2e) n / (N − 1),   (3b)

where c1s is the personal best weight for the first iteration, c1e is the personal best weight for the last iteration, c2s is the global best weight for the first iteration, and c2e is the global best weight for the last iteration.
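Both schedules, Eq. (2) and Eq. (3b), are the same linear interpolation; a sketch, where the start/end values shown are common illustrative choices rather than parameters reported in this paper:

```python
def linear_schedule(start, end, n, n_total):
    """Linear interpolation used by Eq. (2) for w(n) and by Eq. (3b) for
    c1(n) and c2(n): value(n) = start - (start - end) * n / (n_total - 1)."""
    return start - (start - end) * n / (n_total - 1)

# Illustrative TVACPSO-style settings (not values from this paper):
# inertia 0.9 -> 0.4, personal weight 2.5 -> 0.5, global weight 0.5 -> 2.5
n_total = 1000
w_sched = [linear_schedule(0.9, 0.4, n, n_total) for n in range(n_total)]
c1_sched = [linear_schedule(2.5, 0.5, n, n_total) for n in range(n_total)]
c2_sched = [linear_schedule(0.5, 2.5, n, n_total) for n in range(n_total)]
```

Note that c2 increases while c1 decreases, shifting influence from the personal to the global best over the run.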

2.3 Guaranteed Convergence PSO (GCPSO)

GCPSO guarantees that the global best particle searches within a dynamically adapted radius [7, 8]. This addresses the problem of stagnation and increases local convergence by using the global best particle to randomly search within an adaptively changing radius at every iteration [8]. GCPSO, as described in [2], uses the following equation to update the position:

  x(ig)(n + 1) = xg(n) + w(n)v(ig)(n) + ρ(n)(1 − 2r3(n)),  n = 0, 1, 2, ..., N − 1.   (4a)

GCPSO uses Equation (1b) to determine the velocities v(i)(n). The personal best weight c1 and the global best weight c2 are held constant. GCPSO uses the following equation to update the velocity of the global best particle:

  v(ig)(n + 1) = −x(ig)(n) + xg(n) + w(n)v(ig)(n) + ρ(n)(1 − 2r3(n)),  n = 0, 1, 2, ..., N − 1,   (4b)

where ig is the index of the particle that updated the global best value most recently. The expression −x(ig)(n) + xg(n) is used to reset the position of particle ig to the global best position. r3(n) are random numbers uniformly distributed between 0 and 1. The search radius is controlled by the search radius parameter ρ(n), which is calculated using:

  ρ(n + 1) = 2ρ(n), if σ(n + 1) > σc;  (1/2)ρ(n), if φ(n + 1) > φc;  ρ(n), otherwise,   (4c)

where σc is the consecutive success threshold, and φc is the consecutive failure threshold defined below. Success means that using Equations (1) and (4b) to update the particle positions results in an improved global best value and position, and failure means it does not. The numbers of consecutive successes σ(n) and failures φ(n) are calculated us-


ing:

  σ(n + 1) = 0, if φ(n + 1) > φ(n);  σ(n) + 1, otherwise,   (4d)

  φ(n + 1) = 0, if σ(n + 1) > σ(n);  φ(n) + 1, otherwise.   (4e)

2.4 Attractive and Repulsive PSO (RPSO)

RPSO aims to overcome the problem of premature convergence [18]. It uses a diversity measure to control the swarm by alternating between phases of "attraction" and "repulsion". The attraction phase operates as basic PSO, with the particles attracting each other (see Equation (1b)). The repulsion phase is obtained by inverting the velocity-update equation of the particles as follows:

  v(i)(n + 1) = w(n)v(i)(n) − c1 r1(i)(n)[xp(i)(n) − x(i)(n)] − c2 r2(i)(n)[xg(n) − x(i)(n)],  n = 0, 1, 2, ..., N − 1.   (5a)

In the repulsion phase, the individual particle is no longer attracted to, but instead repelled by, the best known particle position and its own previous best position. In the attraction phase, the swarm is contracting, and therefore the diversity decreases. Once the diversity drops below a lower bound, dlow, the algorithm switches to the repulsion phase, so that the swarm expands according to Equation (5a). When a diversity of dhigh is reached, the attraction phase is switched on again. Therefore, there is an alternation between phases of exploitation and exploration (attraction and repulsion). Equation (5b) sets the sign-variable dir to either 1 or −1 depending on the diversity value given by Equation (5c):

  dir = −1, if diversity(S) < dlow;  1, if diversity(S) > dhigh,   (5b)

  diversity(S) = (1 / (|S| |L|)) ∑_{i=1}^{|S|} √( ∑_{j=1}^{D} (pij − p̄j)² ),   (5c)

where S is the swarm, |S| is the swarm size, |L| is the length of the longest diagonal in the search space, D is the dimensionality of the problem, pij is the jth value of the ith particle, and p̄j is the jth value of the average point p̄. Finally, the velocity update of Equation (1b) is modified into Equation (5d) by multiplying the personal and social components by the sign-variable dir, which decides whether the particles are attracted to, or repelled by, each other:

  v(i)(n + 1) = w(n)v(i)(n) + dir (c1 r1(i)(n)[xp(i)(n) − x(i)(n)] + c2 r2(i)(n)[xg(n) − x(i)(n)]),  n = 0, 1, 2, ..., N − 1.   (5d)

2.5

Dealing with Search Space Violations

If a particle attempts to leave the search space, our strategy is to return it along its proposed path through a series of correcting iterations. In particular, we use:

v(i) (n + 1) = w(n)v(i) (n) (i)

v(i) (n + 1) = w(n)v(i) (n)

j=1

(5c)

x˘(i) (nˇ + 1) = x˘(i) (n) ˇ − v˘(i) (nˇ + 1),

nˇ = 0, 1, ...,˘− 1, (6a)

where x˘(i) (nˇ + 1) is the corrected position, v˘(i) is the corrected velocity, nˇ is the count for the correcting iterations, and˘is the total number of correcting iterations. The initial corrected position x˘(i) (0) is set to the position x(i) (n + 1), which is outside the search space. The corrected velocities v˘(i) are calculated using: v˘(i) (nˇ + 1) = αv˘(i) (n), ˇ nˇ = 0, 1, ...,˘− 1, (6b) where α is the correction factor, and the initial corrected velocity v˘(i) (0) is set to the velocity v(i) (n + 1) that caused the particle to attempt to leave the search space. Equation (6a) is used until the corrected position x˘(i) (nˇ + 1) is in the search space or the limit on the total number of correcting iterations ˘ is reached. If ˘ is reached, the components of x˘(i) (˘) still outside the search space are clamped to the boundary of the search space. Based on good performance in empirical experiments, the values chosen are α = 0.54 and˘= 4.
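To make the mechanics above concrete, here is a minimal one-dimensional sketch in Python (our own illustration, not code from the paper; the function names are ours). It implements the sign-flipped velocity update of Equation (5d) and the boundary correction of Equations (6a) and (6b), using the paper's α = 0.54 and limit of 4 correcting iterations:

```python
import random

def rpso_velocity(v, x, x_p, x_g, w, c1, c2, direction):
    # Eq. (5d): direction = +1 attracts the particle to its personal
    # best x_p and the global best x_g; direction = -1 repels it.
    r1, r2 = random.random(), random.random()
    return w * v + direction * (c1 * r1 * (x_p - x) + c2 * r2 * (x_g - x))

def correct_position(x_out, v_out, lo, hi, alpha=0.54, max_iters=4):
    # Eqs. (6a)-(6b): walk the particle back along its proposed path
    # with a geometrically shrinking velocity; clamp if still outside.
    x, v = x_out, v_out
    for _ in range(max_iters):
        v = alpha * v            # Eq. (6b)
        x = x - v                # Eq. (6a)
        if lo <= x <= hi:
            return x
    return min(max(x, lo), hi)   # clamp to the search space boundary

x_back = correct_position(1.2, 0.5, -1.0, 1.0)  # -> 0.93
```

In the example the first correcting step already brings the particle back inside [−1, 1], since 1.2 − 0.54 · 0.5 = 0.93.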


REPULSIVE SELF-ADAPTIVE ACCELERATION . . .

3 Proposed Variant: Repulsive Self-adaptive Acceleration PSO (RSAPSO)

Our RSAPSO variant is inspired by other PSO variants that assign every particle its own velocity weights [15, 16]. These variants typically move the velocity weights of all particles toward the velocity weights of a certain particle that is selected based on a measure of superior performance [16]. For example, self-tuning APSO moves the velocity weights towards the settings of the particle that yielded the most updates of the global best position [15, 16]. Controlled APSO [19] adaptively changes the personal best weights c1^(i)(n) and the global best weights c2^(i)(n) based on the distance between the positions and the global best position. Inertia weight APSO [12] allows every particle its own inertia weight w^(i)(n) that is changed using a function of the objective values and the global best value. Optimized PSO [20] uses multiple PSO subswarms, each having their own parameter settings, in an inner iteration to solve the original optimization problem. The parameter settings are then optimized in an outer iteration of PSO for a fixed number of iterations.

Inspired by the optimized PSO variant [20], we treat the problem of finding good velocity weights as an optimization problem. In RSAPSO every particle has its own velocity weights, i.e., its inertia weight w^(i), personal best weight c1^(i), and global best weight c2^(i). A particular setting of the velocity weights is referred to as the position of the velocity weights. An objective function for the velocity weights is used to quantify how well the positions of the velocity weights perform for solving the overall optimization problem. Using the calculated objective values of the velocity weights, RSAPSO takes a step toward optimizing the velocity weights. The velocity weights are optimized in a fixed auxiliary search space.

Compared to optimized PSO [20], the RSAPSO approach of optimizing the velocity weights after every (outer) PSO iteration is more efficient, since only one additional PSO instance (for optimizing the velocity weights) is executed, and only for one (inner) iteration. An advantage of RSAPSO is that the velocity weights can adapt themselves to dynamic changes, e.g., different particle distributions at different iterations. RSAPSO uses the following equation, with the notation used in Equation (1b), to update the velocities of particles:

v^(i)(n + 1) = w^(i)(n) v^(i)(n)
             + c1^(i)(n) r1^(i)(n)[x_p^(i)(n) − x^(i)(n)]
             + c2^(i)(n) r2^(i)(n)[x_g^(i)(n) − x^(i)(n)],    n = 0, 1, 2, …,    (7a)
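The per-particle weights of Equation (7a) can be sketched as follows (our own toy one-dimensional illustration; the function name and the example weight values are ours, not from the paper):

```python
import random

def rsapso_update(x, v, x_p, x_g, weights):
    # Eq. (7a): every particle i carries its own velocity weights
    # (w, c1, c2), passed here as the tuple `weights`.
    w, c1, c2 = weights
    r1, r2 = random.random(), random.random()
    v_new = w * v + c1 * r1 * (x_p - x) + c2 * r2 * (x_g - x)
    return x + v_new, v_new   # position update as in Eq. (1a)

random.seed(42)
x1, v1 = rsapso_update(x=0.0, v=0.1, x_p=0.5, x_g=1.0, weights=(0.7, 1.5, 1.5))
```

The only difference from the standard update of Equation (1b) is that w, c1, and c2 are looked up per particle rather than shared by the swarm.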

An auxiliary objective function is used to quantify the success of particles as a function of their velocity weights. Several reliable, directly measurable quantities are available for this purpose. In particular, we use the improvement in the objective value of the particle [21], the number of updates of the global best position that the particle yielded [15, 16], and the number of updates of the personal best position that the particle yielded. We propose the following objective function for the velocity weights, selected based on good performance in empirical experiments:

f̃^(i)(n) = e^(i)(n) (1 + w_l u_l^(i)(n) + w_g u_g^(i)(n)),    n = 1, 2, …,    (7b)

where f̃^(i)(n) is the objective value of the velocity weights for particle i at iteration n, e^(i)(n) is the normalized improvement described below, u_l^(i) is the number of times particle i updated its personal best position, u_g^(i) is the number of times particle i updated the global best position, w_l is the local weight factor used to weigh the number of personal best updates u_l^(i), and w_g is the global weight factor used to weigh the number of global best updates u_g^(i). The value of w_g is usually set larger than the value of w_l because updates to the global best position are relatively more important. Equation (7b) is thus used to guide the evolution of the positions of the velocity weights towards optimal values. Alternative objective functions are possible, e.g., ones that use the normalized improvements e^(i)(n), or the local and global best update counters individually. The normalized improvements e^(i)(n) are calculated as follows, based on good performance in empirical experiments:

e^(i)(n) = δ^(i)(n) / σ(n),    (7c)
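A minimal sketch of Equations (7b) and (7c) (our own code; the default factors w_l = 1 and w_g = 6 are taken from the parameter settings in Section 4.2):

```python
def weight_objective(delta, sigma, u_local, u_global, w_l=1.0, w_g=6.0):
    # Eq. (7c): normalized improvement e = delta / sigma, with sigma > 0.
    # Eq. (7b): f~ = e * (1 + w_l * u_l + w_g * u_g); w_g > w_l because
    # global best updates are considered more important.
    e = delta / sigma
    return e * (1.0 + w_l * u_local + w_g * u_global)
```

For example, a particle that improved by δ = −2 under σ = 4 and produced one personal and one global best update scores (−0.5)(1 + 1 + 6) = −4.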


where σ(n) is the normalization sum (which has to be greater than zero), and δ^(i)(n) is the difference in the objective values, calculated using:

δ^(i)(n) = f^(i)(n) − f^(i)(n − 1),    (7d)

where f^(i) is the objective value of particle i. In practice, early iterations might yield large absolute values of δ^(i), whereas late iterations might only yield small absolute values of δ^(i). Therefore, we propose the following normalization to fairly account for the contribution of the velocity weights from late iterations:

σ(n) = { Σ_{i=1}^{n_p} −δ^(i)(n), for δ^(i)(n) < 0,
       { 1, otherwise.    (7e)

In other words, the normalization sum σ(n) makes objective values of the velocity weights comparable for different n. This normalization is chosen based on good performance in empirical experiments. The velocity weights are optimized using one step of PSO in an inner iteration, resulting in the following overall iteration to update the positions of the velocity weights:

x̃^(i)(n + 1) = x̃^(i)(n) + ṽ^(i)(n + 1),    n = 1, 2, …,    (7f)

ṽ^(i)(n + 1) = w̃(n) ṽ^(i)(n)
             + c̃1(n) r̃1^(i)(n)[x̃_p^(i)(n) − x̃^(i)(n)]
             + c̃2(n) r̃2^(i)(n)[x̃_g^(i)(n) − x̃^(i)(n)],    n = 1, 2, …,    (7g)
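Our reading of the normalization in Equation (7e) can be sketched as follows (our own code; we interpret the sum as running over the particles that improved, i.e., those with δ < 0):

```python
def normalization_sum(deltas):
    # Eq. (7e): sigma(n) accumulates -delta over particles whose
    # objective value improved (delta < 0); when no particle improved
    # it falls back to 1, keeping sigma(n) strictly positive as
    # required by Eq. (7c).
    s = sum(-d for d in deltas if d < 0)
    return s if s > 0 else 1.0
```

With this fallback, the normalized improvements e^(i)(n) = δ^(i)(n)/σ(n) are always well defined, even in iterations where the whole swarm stagnates.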

where x̃^(i)(n) is the position of the velocity weights, ṽ^(i)(n) is the velocity of the velocity weights, x̃_p^(i)(n) is the personal best position of the velocity weights, x̃_g^(i)(n) is the global best position of the velocity weights, w̃(n) is the inertia weight for optimizing the velocity weights, c̃1(n) is the personal best weight for optimizing the velocity weights, c̃2(n) is the global best weight for optimizing the velocity weights, and r̃1^(i)(n) and r̃2^(i)(n) are random vectors with components uniformly distributed between 0 and 1 for every particle i and iteration n.

Equations (7f) and (7g) are used after Equations (1a) and (7a) have been used to update the positions of the particles and the new objective values have been calculated. The first component of x̃^(i)(n) is used as the inertia weight w^(i)(n), the second component as the personal best weight c1^(i)(n), and the third component as the global best weight c2^(i)(n) in Equation (7a).

The proposed RSAPSO switches between phases based on the mean separation of particles. If RSAPSO is in the attractive phase and converges, it switches to the repulsive phase once it has reached a small enough mean separation value. This can counter trapping in a local optimum. If RSAPSO is in the repulsive phase, it switches to the attractive phase once it has reached a large enough mean separation. Similarly, four-state APSO uses the mean separation to decide which of the four states it is in, as given in [22]. The attractive-repulsive PSO [18] switches between phases based on a diversity factor that is calculated similarly to the mean separation. We propose the following objective function for the velocity weights that adapts itself to the current phase:

f̄^(i)(n) = { f̃^(i)(n), if a(n) = 1,
           { −s^(i)(n), if a(n) = 2,    (7h)

where f̄^(i)(n) is the objective value of the velocity weights, and a(n) is the phase indicator. If RSAPSO is in the attractive phase a(n) = 1, the objective value of the velocity weights f̄^(i)(n) is set to f̃^(i)(n) as calculated in Equation (7b). If RSAPSO is in the repulsive phase a(n) = 2, f̄^(i)(n) is set to the negation of the mean separation s^(i)(n). This objective function was selected for RSAPSO since good performance of the velocity weights is indicated by f̃^(i)(n) in the attractive phase and by −s^(i)(n) in the repulsive phase. In particular, in the attractive phase we focus on convergence by rewarding good objective values of the velocity weights f̃^(i)(n), and in the repulsive phase we focus on diversity by rewarding high mean separation values s^(i)(n).

The attractive-repulsive PSO [18] switches to the repulsive phase if its diversity factor falls below an absolute lower threshold value and switches to the attractive phase if its diversity factor rises above an absolute upper threshold value. We use the same mechanism but replace the diversity factor with the mean separation. Specifically, we use the following equation to switch between the phases:

a(n + 1) = { 1, if a(n) = 2 ∧ s(n) > s_u(n),
           { 2, if a(n) = 1 ∧ s(n) < s_l(n),    (7i)
           { a(n), otherwise,

where s_l(n) is the mean separation absolute lower threshold, and s_u(n) is the mean separation absolute upper threshold. RSAPSO starts in the attractive phase a(n) = 1. If the mean separation s(n) falls below the lower threshold s_l(n), RSAPSO changes from the attractive phase a(n) = 1 to the repulsive phase a(n + 1) = 2. If the mean separation s(n) rises above the upper threshold s_u(n), RSAPSO changes from the repulsive phase a(n) = 2 to the attractive phase a(n + 1) = 1.

To the best of our knowledge, the adaptive change of the mean separation absolute lower threshold s_l(n) and upper threshold s_u(n) is novel. This concept allows for increased accuracy and convergence as the algorithm proceeds. Furthermore, it can be used if good values for the two thresholds are not known. The thresholds s_l(n) and s_u(n) are adapted as follows:

s_l(n + 1) = { s_l(n) / s̆_l, if a(n) = 2 ∧ s(n) > s_u(n),
             { s_l(n), otherwise,    (7j)

s_u(n + 1) = { s_u(n) / s̆_u, if a(n) = 2 ∧ s(n) > s_u(n),
             { s_u(n), otherwise,    (7k)

where s̆_l is the mean separation absolute lower divisor and s̆_u is the mean separation absolute upper divisor. The lower threshold s_l(n) is divided by s̆_l, and the upper threshold s_u(n) is divided by s̆_u, if the algorithm switches from the repulsive phase to the attractive phase at iteration n. Both thresholds remain the same if the algorithm does not switch from the repulsive phase to the attractive phase; i.e., the mean separation absolute lower threshold s_l(n + 1) and upper threshold s_u(n + 1) are only changed after a full cycle through the attractive and repulsive states.

Algorithm 1 outlines our RSAPSO variant. RSAPSO calculates the mean separation after the local and global best positions are updated; it requires the mean separation to decide whether a phase switch is required. If so, the objective function for the velocity weights and the search space for the velocity weights are switched to their counterparts in the new phase. The search space for the velocity weights in the attractive phase must mainly yield positive velocity weights, and the search space for the velocity weights in the repulsive phase must mainly yield negative velocity weights. All velocity weights have to be reinitialized in the new search space for the velocity weights if a phase switch occurs. The personal best positions and values of the velocity weights and the global best position and value of the velocity weights are reset, since values discovered in the attractive phase cannot be used in the repulsive phase, and vice versa. If a switch from the repulsive to the attractive phase occurs, i.e., one phase cycle is finished, the mean separation absolute lower and upper thresholds are updated using Equations (7j) and (7k). If no phase switch occurs, RSAPSO follows the flow of optimizing the velocity weights; however, it uses Equation (7h) instead of Equation (7b) as the objective function for the velocity weights.

Algorithm 1 Description of RSAPSO

initialize positions and velocities
initialize positions of velocity weights
calculate objective values
update local and global best positions and values
repeat
    update positions
    calculate objective values
    update local and global best positions and values
    calculate mean distance
    if phase changed then
        update mean absolute lower/upper threshold if necessary
        switch objective function/search space for velocity weights
        reinitialize positions of velocity weights
        reset local/global best positions/values of velocity weights
    else
        calculate objective values of velocity weights
        update local/global best positions/values of velocity weights
        update positions of velocity weights
    end if
until maximum number of generations reached
report final results
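The phase-switching rule of Equations (7i), (7j), and (7k) can be sketched as follows (our own illustration; the function name and the divisor values div_l and div_u are ours, since concrete divisor settings are not spelled out alongside the equations):

```python
def switch_phase(a, s, s_l, s_u, div_l=2.0, div_u=2.0):
    # Eq. (7i): a = 1 is the attractive phase, a = 2 the repulsive
    # phase; s is the current mean separation of the particles.
    # Eqs. (7j)-(7k): on a repulsive -> attractive switch (one full
    # phase cycle) both thresholds are divided by their divisors,
    # tightening the bands as the run proceeds.
    if a == 2 and s > s_u:
        return 1, s_l / div_l, s_u / div_u
    if a == 1 and s < s_l:
        return 2, s_l, s_u
    return a, s_l, s_u
```

A run thus alternates: the attractive phase contracts the swarm until s drops below s_l, the repulsive phase expands it until s exceeds s_u, and each completed cycle shrinks both thresholds.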

4 Experiments and Results

4.1 Benchmark Problems

Twenty optimization benchmark problems are used to compare our RSAPSO algorithm with the chosen PSO variants. All the benchmark problems from the semi-continuous challenge [23] are used, including the Ackley, Alpine, Griewank, Parabola, Rosenbrock, and Tripod test problems. Some of the optimization problems from [24] have been selected based on their shapes to guarantee a diverse set of problems, including the Six-hump Camel Back, De Jong 5, Drop Wave, Easom, Goldstein–Price, Axis Parallel Hyper-ellipsoid, Michalewicz, and Shubert test problems [23]. We also use optimization problems from [25] to expand our benchmark set; these include the Generalized Penalized, Non-continuous Rastrigin, Sphere, Rastrigin, and Step test problems [25]. In addition, Schaffer's F6 test problem from [20] is used. For ease of comparison, we normalized the benchmark problems so that all have a global optimum of 0.0. Table 1 lists the benchmark functions, their properties, bounds on x, and the search space dimensions.

4.2 Parameter Settings

The parameters are set to the values described as follows:

– search space for velocity weights: [−0.5, 2.0];
– search space for personal best weights: [−1.0, 4.2];
– search space for global best weights: [−1.0, 4.2];
– w_l = 1;
– w_g = 6;
– w_s = 0.9;
– w_e = 0.4;
– c_1s = 2.5;
– c_1e = 0.5;
– c_2s = 0.5;
– c_2e = 2.5;
– percentage of particles selected for mutation of their velocity weights: 33;
– iterations before resetting best positions and velocity weights: 50;
– initialization space for inertia weights: [0.4, 0.9];
– initialization space for personal best weights: [0.5, 2.5];
– initialization space for global best weights: [0.5, 2.5];
– reinitialization space for inertia weights: [0.5, 0.8];
– reinitialization space for personal best weights: [0.6, 2.4];
– reinitialization space for global best weights: [0.6, 2.4];
– α̃ = 0.5;
– m̌ = 10;
– m̌_u = 2.5.

4.3 Experimental Setup

We compare the PSO variants in four experiments using four different numbers of function evaluations (FE), including initialization.

– The first experiment uses n_p = 10 particles and 100 iterations, resulting in a total of 1,000 FE.
– The second experiment uses n_p = 20 particles and 500 iterations, resulting in a total of 10,000 FE.
– The third experiment uses n_p = 40 particles and 2,500 iterations, resulting in a total of 100,000 FE.
– The fourth experiment uses n_p = 100 particles and 10,000 iterations, resulting in a total of 1,000,000 FE.

If the FE are the dominant expense, all the variants considered require approximately the same CPU time for a given number of FE. All calculations are performed in double precision. The results reported are the best, mean, and standard deviation over 30 runs.
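The four FE budgets follow directly from the particle and iteration counts; a quick check (our own snippet, with the evaluation of the initial swarm counted toward the budget as described above):

```python
# (particles, iterations) for the four experiments of Section 4.3;
# FE = particles * iterations.
experiments = [(10, 100), (20, 500), (40, 2500), (100, 10000)]
budgets = [n_p * iters for n_p, iters in experiments]
print(budgets)  # [1000, 10000, 100000, 1000000]
```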


Table 1. Description of Test Problems.

Function  Name                           [xmin, xmax]        D
F1        Ackley                         [-30, 30]           30
F2        Alpine                         [-10, 10]           10
F3        Six-hump Camel Back            [-2, 2]             2
F4        De Jong 5                      [-65.536, 65.536]   2
F5        Drop Wave                      [-5.12, 5.12]       2
F6        Easom                          [-100, 100]         2
F7        Generalized Penalized          [-50, 50]           30
F8        Griewank                       [-300, 300]         30
F9        Goldstein–Price                [-2, 2]             2
F10       Axis Parallel Hyper-ellipsoid  [-5.12, 5.12]       100
F11       Michalewicz                    [0, π]              10
F12       Non-continuous Rastrigin       [-5.12, 5.12]       30
F13       Parabola                       [-20, 20]           200
F14       Rastrigin                      [-10, 10]           30
F15       Rosenbrock                     [-10, 10]           30
F16       Schaffer's F6                  [-100, 100]         2
F17       Shubert                        [-10, 10]           2
F18       Sphere                         [-100, 100]         100
F19       Step                           [-100, 100]         30
F20       Tripod                         [-100, 100]         2

4.4 Results

Analyzing the results, shown in Tables 2 to 5, reveals that RSAPSO improves with increasing numbers of FE, scoring best compared to the other variants for 100,000 and 1,000,000 FE. Figure 1 shows the results counting the number of wins, i.e., the number of times an algorithm scored best in terms of best mean value on the benchmark functions.

For 1,000 FE (Table 2), DWPSO, GCPSO, and RSAPSO score best on 2 benchmark functions each, RPSO scores best on only 1 benchmark function, and TVACPSO outperforms the other algorithms, scoring best on 12 benchmark functions. For 10,000 FE (Table 3), TVACPSO and RPSO score best on 5 benchmark functions, DWPSO and GCPSO score best on 7 benchmark functions, and RSAPSO scores best on 8 benchmark functions. For 100,000 FE (Table 4), TVACPSO and RPSO score best on 7 benchmark functions, DWPSO and GCPSO score best on 8 benchmark functions, and RSAPSO scores best on 10 benchmark functions.

Figure 1. Number of wins versus number of function evaluations.

For 1,000,000 FE (Table 5), DWPSO, TVACPSO, and GCPSO have the best mean value for 9 benchmark functions, RPSO for 11 benchmark functions, and RSAPSO scores best on 14 benchmark functions.

For 1,000,000 FE, the optimum value of 0.0 was achieved by all PSO variants, measuring the best value on 7 benchmark functions; for 100,000 FE only on 6 benchmark functions; and for 10,000 FE only on 2 benchmark functions. This demonstrates that with increasing numbers of FE more benchmark functions are solved optimally.

In terms of the average values achieved, for 1,000,000 FE, 0.0 was achieved by the PSO variants 41 times, for 100,000 FE 28 times, and for 10,000 FE 16 times.

A Friedman ranking test [26] was applied to the average results for the four different FE. Table 6 shows the average ranks obtained by each PSO variant. The results for all four different FE are not statistically significant at the 5% significance level; the post hoc procedures of Bonferroni-Dunn and Hochberg confirmed this. However, as the previous discussion outlined, our approach has the best rank for 1,000,000 FE, even though the results are not statistically significant.

[Table 6. Average rankings of the algorithms (Friedman) for DWPSO, TVACPSO, GCPSO, RPSO, and RSAPSO.]

Figure 2 shows the function value for 1,000, 10,000, 100,000, and 1,000,000 FE for benchmark functions F9, F11, F12, and F14. It confirms once more that our proposed RSAPSO first performs poorly for 1,000 FE, but shows improved values with increasing numbers of FE, scoring best for 100,000 and 1,000,000 FE.

[Figure 2. Function value versus FE for different benchmark functions: (a) F9, (b) F11, (c) F12, (d) F14.]

Overall, the experiments have shown that for increasing FE our proposed RSAPSO variant scored better than the other PSO variants. Looking at the particular benchmark functions that are multimodal, which are F4, F5, F7, F12, F16, and F17, RSAPSO as expected scored best on these functions, with the exception of F17. As mentioned in the literature [18], RPSO has been shown to work particularly well on multimodal functions, which is most likely due to the switching between attractive and repulsive phases. This allows the algorithm to adopt good velocity values, and together with the repulsive and attractive phases it helps to move the particles towards better solutions. The results on the benchmark functions confirm this, with the implemented RPSO as well as our proposed RSAPSO variant showing the better results.

5 Conclusion

We proposed a repulsive and adaptive PSO variant, named RSAPSO, for which every particle has its own velocity weights, i.e., inertia weight, personal best weight, and global best weight. An objective function for the velocity weights is used to measure the suitability of the velocity weights for solving the overall optimization problem. Due to the calculated objective values of the velocity weights, RSAPSO is able to improve the optimization process. In particular, the RSAPSO variant adapts the velocity weights before it optimizes the solution of the problem (e.g., benchmark function). The advantage of RSAPSO is that the velocity weights adapt themselves to dynamic changes, e.g., different particle distributions at different iterations.

We evaluated our RSAPSO algorithm on twenty benchmark functions and compared it with four PSO variants, namely decreasing weight PSO, time-varying acceleration coefficient PSO, guaranteed convergence PSO, and attractive and repulsive PSO. Our RSAPSO variant achieves better results than the other variants for higher numbers of FE, in particular for 1,000,000 FE. A possible reason for RSAPSO's poorer performance for 1,000 and 10,000 FE is that the optimization of the velocity weights takes several iterations to have a beneficial effect, since more knowledge of the optimization problem is acquired by then. In addition, RSAPSO has been shown to work particularly well on multimodal functions due to the incorporated attractive and repulsive phases for the optimization of the velocity weights.

Since RSAPSO has longer running times depending on the difficulty and the dimensionality of the problem, future work will parallelize the algorithm using Hadoop's MapReduce methodology in order to speed up the optimization process. Furthermore, we would like to extend RSAPSO to integrate the idea

Table 2. Performance for benchmarks F1 to F20 for 1,000 FE (entries are Best / Mean / Std).

F1:  DWPSO 4.94e+00/5.98e+00/1.26e+00 | TVACPSO 5.56e+00/6.08e+00/3.11e-01 | GCPSO 4.51e+00/5.50e+00/7.32e-01 | RPSO 5.06e+00/5.35e+00/6.66e-02 | RSAPSO 5.72e+00/6.13e+00/2.37e-01
F2:  DWPSO 2.09e-01/5.50e-01/2.36e-01 | TVACPSO 1.29e-02/2.49e-01/4.49e-02 | GCPSO 3.55e-01/5.28e-01/7.70e-02 | RPSO 8.06e-02/7.97e-01/3.95e-01 | RSAPSO 8.06e-02/7.97e-01/3.95e-01
F3:  DWPSO 1.12e-12/4.71e-10/6.45e-19 | TVACPSO 2.13e-14/2.00e-13/3.86e-26 | GCPSO 3.55e-13/6.88e-10/1.18e-18 | RPSO 0.00e+00/4.62e-09/6.41e-17 | RSAPSO 6.56e-09/1.28e-08/3.33e-17
F4:  DWPSO 1.76e-07/1.66e+00/2.30e+00 | TVACPSO 9.94e-01/3.29e+00/6.11e+00 | GCPSO 5.88e-10/3.47e-01/3.62e-01 | RPSO 3.99e-13/3.22e-01/3.26e-01 | RSAPSO 3.94e-13/3.31e-01/3.29e-01
F5:  DWPSO 1.51e-05/2.13e-02/1.35e-03 | TVACPSO 6.37e-12/2.13e-02/1.35e-03 | GCPSO 7.92e-06/4.25e-02/1.35e-03 | RPSO 6.38e-02/6.38e-02/1.89e-25 | RSAPSO 6.38e-02/6.38e-02/4.04e-21
F6:  DWPSO 6.00e-05/1.45e-01/6.29e-02 | TVACPSO 2.51e-04/3.27e-01/3.18e-01 | GCPSO 1.79e-01/7.26e-01/2.25e-01 | RPSO 5.70e-09/3.33e-01/3.33e-01 | RSAPSO 5.70e-09/3.33e-01/3.33e-01
F7:  DWPSO 4.95e+04/1.00e+06/2.04e+12 | TVACPSO 6.10e+02/9.59e+03/9.84e+07 | GCPSO 1.02e+02/1.39e+04/3.17e+08 | RPSO 2.33e+03/1.82e+05/7.08e+10 | RSAPSO 2.33e+03/1.82e+05/7.08e+10
F8:  DWPSO 4.52e+00/5.74e+00/1.26e+00 | TVACPSO 2.99e+00/6.98e+00/1.24e+01 | GCPSO 5.29e+00/8.00e+00/2.02e+01 | RPSO 5.93e+00/9.29e+00/1.03e+01 | RSAPSO 5.93e+00/9.29e+00/1.03e+01
F9:  DWPSO 2.56e-10/1.41e-08/5.16e-16 | TVACPSO 2.44e-14/1.44e-11/5.35e-22 | GCPSO 7.51e-10/4.76e-08/2.93e-15 | RPSO 7.46e-14/6.72e-09/1.35e-16 | RSAPSO 6.47e-09/1.07e-07/1.26e-12
F10: DWPSO 4.82e+03/5.33e+03/5.82e+05 | TVACPSO 3.31e+03/4.00e+03/7.06e+05 | GCPSO 2.92e+03/4.13e+03/1.29e+06 | RPSO 4.98e+03/6.02e+03/1.42e+06 | RSAPSO 4.98e+03/6.02e+03/1.42e+06
F11: DWPSO 1.92e+00/2.50e+00/3.52e-01 | TVACPSO 1.21e+00/1.80e+00/3.49e-01 | GCPSO 1.77e+00/2.25e+00/2.67e-01 | RPSO 2.17e+00/2.80e+00/4.36e-01 | RSAPSO 2.18e+00/2.83e+00/4.63e-01
F12: DWPSO 1.23e+02/1.40e+02/8.93e+02 | TVACPSO 1.11e+02/1.28e+02/2.29e+02 | GCPSO 8.00e+01/1.08e+02/8.45e+02 | RPSO 1.12e+02/1.33e+02/5.33e+02 | RSAPSO 1.12e+02/1.33e+02/5.33e+02
F13: DWPSO 5.15e+03/5.77e+03/7.34e+05 | TVACPSO 4.17e+03/4.73e+03/4.53e+05 | GCPSO 5.13e+03/5.67e+03/2.51e+05 | RPSO 5.61e+03/6.28e+03/3.57e+05 | RSAPSO 5.63e+03/6.21e+03/3.55e+05
F14: DWPSO 2.52e+02/2.77e+02/8.67e+02 | TVACPSO 1.23e+02/1.69e+02/2.96e+03 | GCPSO 2.29e+02/2.99e+02/7.73e+03 | RPSO 2.29e+02/2.49e+02/2.28e+02 | RSAPSO 2.28e+02/2.46e+02/2.24e+02
F15: DWPSO 6.10e+03/1.03e+04/1.32e+07 | TVACPSO 1.47e+03/3.22e+03/2.86e+06 | GCPSO 5.75e+03/9.23e+03/3.29e+07 | RPSO 2.80e+03/5.67e+03/1.25e+07 | RSAPSO 2.81e+03/5.60e+03/1.27e+07
F16: DWPSO 9.36e-03/9.60e-03/4.16e-08 | TVACPSO 9.72e-03/9.72e-03/3.06e-19 | GCPSO 9.72e-03/9.72e-03/5.94e-27 | RPSO 3.15e-03/5.34e-03/1.48e-05 | RSAPSO 3.05e-03/5.31e-03/1.45e-05
F17: DWPSO 1.80e-07/2.55e-04/1.89e-07 | TVACPSO 1.27e-07/1.14e-06/3.07e-12 | GCPSO 1.12e-08/1.10e-07/1.92e-14 | RPSO 2.84e-13/1.36e-07/5.59e-14 | RSAPSO 4.09e-07/1.04e-05/1.93e-10
F18: DWPSO 4.76e+04/4.91e+04/3.10e+06 | TVACPSO 3.17e+04/3.78e+04/9.71e+07 | GCPSO 4.22e+04/5.06e+04/5.75e+07 | RPSO 3.84e+04/4.62e+04/4.87e+07 | RSAPSO 3.81e+04/4.60e+04/4.83e+07
F19: DWPSO 1.82e+03/3.39e+03/4.50e+06 | TVACPSO 1.10e+03/1.21e+03/8.86e+03 | GCPSO 1.27e+03/3.09e+03/2.57e+06 | RPSO 2.24e+03/3.38e+03/1.43e+06 | RSAPSO 2.24e+03/3.49e+03/1.89e+06
F20: DWPSO 3.37e-03/6.93e-01/1.28e+00 | TVACPSO 3.64e-06/1.33e+00/1.33e+00 | GCPSO 1.27e-04/1.00e+00/1.00e+00 | RPSO 1.46e-08/3.33e-01/3.33e-01 | RSAPSO 9.67e-08/3.33e-01/3.33e-01


Table 3. Performance for benchmarks F1 to F20 for 10,000 FE (entries are Best / Mean / Std).

F1:  DWPSO 1.17e-01/4.65e-01/1.02e-01 | TVACPSO 3.31e-02/9.63e-01/6.57e-01 | GCPSO 1.82e-01/8.44e-01/3.78e-01 | RPSO 2.42e+00/3.15e+00/4.51e-01 | RSAPSO 2.58e+00/3.20e+00/3.40e-01
F2:  DWPSO 2.32e-07/1.59e-06/5.53e-12 | TVACPSO 2.36e-10/2.43e-07/1.11e-13 | GCPSO 3.91e-08/6.70e-04/1.34e-06 | RPSO 1.11e-05/2.14e-05/2.37e-10 | RSAPSO 1.11e-05/2.14e-05/2.37e-10
F3:  all variants 0.00e+00/0.00e+00/0.00e+00
F4:  DWPSO 2.22e-16/2.22e-16/9.12e-64 | TVACPSO 2.22e-16/2.22e-16/9.12e-64 | GCPSO 2.22e-16/2.22e-16/9.12e-64 | RPSO 2.22e-16/2.22e-16/9.12e-64 | RSAPSO 2.22e-16/2.22e-16/9.12e-56
F5:  DWPSO 0.00e+00/0.00e+00/0.00e+00 | TVACPSO 0.00e+00/2.13e-02/1.35e-03 | GCPSO 0.00e+00/0.00e+00/0.00e+00 | RPSO 0.00e+00/0.00e+00/0.00e+00 | RSAPSO 0.00e+00/0.00e+00/0.00e+00
F6:  all variants 0.00e+00/0.00e+00/0.00e+00
F7:  DWPSO 1.21e+00/1.93e+00/6.24e-01 | TVACPSO 1.78e+00/3.79e+00/4.89e+00 | GCPSO 1.06e+00/1.97e+00/2.04e+00 | RPSO 2.60e+00/5.08e+00/1.09e+01 | RSAPSO 3.31e+00/6.11e+00/7.61e+00
F8:  DWPSO 4.85e-01/7.01e-01/3.75e-02 | TVACPSO 1.13e-02/2.73e-02/2.07e-04 | GCPSO 2.53e-01/5.45e-01/6.40e-02 | RPSO 1.65e-01/4.86e-01/3.09e-01 | RSAPSO 3.68e-01/7.37e-01/1.44e-01
F9:  DWPSO 7.77e-14/7.70e-14/1.64e-30 | TVACPSO 7.82e-14/7.49e-14/8.15e-30 | GCPSO 7.77e-14/7.61e-14/4.40e-30 | RPSO 7.68e-14/7.37e-14/1.99e-29 | RSAPSO 7.64e-14/7.30e-14/1.98e-29
F10: DWPSO 4.97e+02/6.81e+02/2.86e+04 | TVACPSO 4.42e+02/5.41e+02/9.61e+03 | GCPSO 1.34e+02/2.58e+02/1.27e+04 | RPSO 9.86e+02/1.09e+03/1.40e+04 | RSAPSO 9.86e+02/1.09e+03/1.40e+04
F11: DWPSO 9.04e-01/1.33e+00/2.79e-01 | TVACPSO 2.64e-01/7.61e-01/6.97e-01 | GCPSO 2.57e-01/5.62e-01/8.42e-02 | RPSO 4.50e-01/1.03e+00/4.32e-01 | RSAPSO 4.50e-01/1.03e+00/4.32e-01
F12: DWPSO 5.07e+01/8.72e+01/9.99e+02 | TVACPSO 5.60e+01/6.73e+01/1.21e+02 | GCPSO 5.65e+01/7.55e+01/3.56e+02 | RPSO 4.51e+01/7.18e+01/5.34e+02 | RSAPSO 4.51e+01/7.28e+01/5.77e+02
F13: DWPSO 1.75e+03/2.18e+03/1.68e+05 | TVACPSO 1.40e+03/1.63e+03/4.26e+04 | GCPSO 1.43e+03/1.52e+03/8.85e+03 | RPSO 2.08e+03/2.73e+03/3.44e+05 | RSAPSO 2.08e+03/2.47e+03/1.70e+05
F14: DWPSO 1.11e+01/2.97e+01/8.43e+02 | TVACPSO 7.01e+01/8.38e+01/1.62e+02 | GCPSO 7.39e+01/1.02e+02/1.53e+03 | RPSO 1.24e+02/1.38e+02/4.54e+02 | RSAPSO 4.22e+01/5.75e+01/1.76e+02
F15: DWPSO 1.11e+02/1.36e+02/9.29e+02 | TVACPSO 6.58e+01/1.15e+02/1.86e+03 | GCPSO 9.61e+01/1.12e+02/7.14e+02 | RPSO 1.65e+02/2.34e+02/5.40e+03 | RSAPSO 7.72e+01/2.05e+02/1.40e+04
F16: DWPSO 5.04e-08/5.63e-03/2.54e-05 | TVACPSO 0.00e+00/6.48e-03/3.15e-05 | GCPSO 0.00e+00/3.24e-03/3.15e-05 | RPSO 5.04e-08/5.63e-03/2.54e-05 | RSAPSO 0.00e+00/0.00e+00/0.00e+00
F17: DWPSO 2.84e-14/4.74e-14/2.69e-28 | TVACPSO 2.84e-14/5.68e-14/8.08e-28 | GCPSO 5.68e-14/6.63e-14/2.69e-28 | RPSO 5.68e-14/7.58e-14/2.69e-28 | RSAPSO 5.68e-14/7.58e-14/2.69e-28
F18: DWPSO 6.02e+03/8.47e+03/4.71e+06 | TVACPSO 3.51e+03/6.56e+03/7.60e+06 | GCPSO 6.46e+03/1.02e+04/1.06e+07 | RPSO 3.22e+03/3.80e+03/2.96e+05 | RSAPSO 6.46e+03/1.02e+04/1.06e+07
F19: DWPSO 2.00e+00/3.00e+00/1.00e+00 | TVACPSO 9.00e+00/2.33e+01/2.26e+02 | GCPSO 7.00e+00/1.10e+01/1.30e+01 | RPSO 1.90e+02/3.11e+02/1.10e+04 | RSAPSO 2.90e+01/2.00e+02/3.12e+04
F20: DWPSO 0.00e+00/3.33e-01/3.33e-01 | TVACPSO 0.00e+00/3.33e-01/3.33e-01 | GCPSO 0.00e+00/3.33e-01/3.33e-01 | RPSO 0.00e+00/8.59e-11/2.21e-20 | RSAPSO 0.00e+00/0.00e+00/0.00e+00

200 Simone A. Ludwig

Table 4. Performance for benchmark F1 to F20 for 100,000 FE. [Best, Mean, and Std values for the DWPSO, TVACPSO, GCPSO, RPSO, and RSAPSO algorithms; the numeric table body is not recoverable from the extraction.]

REPULSIVE SELF-ADAPTIVE ACCELERATION . . .


Table 5. Performance for benchmark F1 to F20 for 1,000,000 FE. [Best, Mean, and Std values for the DWPSO, TVACPSO, GCPSO, RPSO, and RSAPSO algorithms; the numeric table body is not recoverable from the extraction.]

Table 6. Average rankings of the algorithms (Friedman)

Algorithm      DWPSO    TVACPSO   GCPSO    RPSO     RSAPSO
1,000 FE       3.275    2         2.975    3.15     3.6
10,000 FE      2.85     2.7       2.7      3.375    3.375
100,000 FE     2.65     3.15      2.75     3.55     2.9
1,000,000 FE   3.275    3.475     2.8      2.8      2.65
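The average Friedman rankings in Table 6 are obtained by ranking the algorithms on each benchmark function and averaging the ranks [26]. The following sketch illustrates the computation; the error values and the four-algorithm setup are hypothetical, not the paper's data:

```python
# Hypothetical sketch of average Friedman ranking: rank the algorithms per
# benchmark function (rank 1 = best, i.e., lowest mean error), then average
# the ranks per algorithm across all functions.
import numpy as np

# rows: benchmark functions, columns: algorithms (made-up mean errors)
errors = np.array([
    [0.12, 0.10, 0.15, 0.11],
    [1.30, 1.10, 1.20, 1.40],
    [0.02, 0.05, 0.01, 0.03],
])

# double argsort converts each row of scores into ranks (assumes no ties)
ranks = errors.argsort(axis=1).argsort(axis=1) + 1
avg_ranks = ranks.mean(axis=0)
print(avg_ranks)  # lower average rank = better overall
```

A lower average rank indicates better overall performance, which is how the per-FE columns of Table 6 compare the five PSO variants.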


tage of RSAPSO is that the velocity weights adapt themselves to dynamic changes, e.g., different particle distributions at different iterations.

We evaluated our RSAPSO algorithm on twenty benchmark functions and compared it with four PSO variants, namely decreasing weight PSO, time-varying acceleration coefficient PSO, guaranteed convergence PSO, and attractive and repulsive PSO. Our RSAPSO variant achieves better results than the other variants for higher numbers of FE, in particular for 1,000,000 FE. A possible reason for RSAPSO's poorer performance for 1,000 and 10,000 FE is that the optimization of the velocity weights takes several iterations to have a beneficial effect, since more knowledge of the optimization problem is acquired by then. In addition, RSAPSO has been shown to work particularly well on multimodal functions due to the incorporated attractive and repulsive phases for the optimization of the velocity weights.

Since RSAPSO has longer running times depending on the difficulty and the dimensionality of the problem, future work will parallelize the algorithm using Hadoop's MapReduce methodology in order to speed up the optimization process. Furthermore, we would like to extend RSAPSO to integrate the idea of moving-bound behavior, which would allow expert knowledge about the search space for the velocity weights to be incorporated.

References

[1] J. Kennedy and R. Eberhart, Particle swarm optimization, Proceedings of IEEE International Conference on Neural Networks, 4:1942–1948, 1995.
[2] A. Engelbrecht, Computational Intelligence: An Introduction, 2nd Edition, Wiley, 2007.
[3] A. Ratnaweera, S. K. Halgamuge, and H. C. Watson, Self-organizing hierarchical particle swarm optimizer with time-varying acceleration coefficients, IEEE Transactions on Evolutionary Computation, 8(3):240–255, 2004.
[4] X. Li, H. Fu, and C. Zhang, A self-adaptive particle swarm optimization algorithm, Proceedings of the 2008 International Conference on Computer Science and Software Engineering, 186–189, 2008.
[5] I. C. Trelea, The Particle Swarm Optimization Algorithm: Convergence Analysis and Parameter Selection, Information Processing Letters, 85(6):317–325, 2003.
[6] F. Van den Bergh and A. P. Engelbrecht, A Study of Particle Swarm Optimization Particle Trajectories, Information Sciences, 176(8):937–971, 2006.
[7] C. K. Monson and K. D. Seppi, Exposing Origin-Seeking Bias in PSO, Proceedings of GECCO'05, pp. 241–248, 2005.
[8] F. Van den Bergh and A. P. Engelbrecht, A new locally convergent particle swarm optimiser, Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, 3:94–99, 2002.
[9] F. Schutte and A. A. Groenwold, A Study of Global Optimization using Particle Swarms, Journal of Global Optimization, 31:93–108, 2005.
[10] A. Chatterjee and P. Siarry, Nonlinear inertial weight variation for dynamic adaptation in particle swarm optimization, Computers & Operations Research, 33(3):859–871, 2004.
[11] X. Yang, J. Yuan, J. Yuan, and H. Mao, A modified particle swarm optimizer with dynamic adaptation, Applied Mathematics and Computation, 189(2):1205–1213, 2007.
[12] J. Zhu, J. Zhao, and X. Li, A new adaptive particle swarm optimization algorithm, International Workshop on Modelling, Simulation and Optimization, 456–458, 2008.
[13] Y. Bo, Z. Ding-Xue, and L. Rui-Quan, A modified particle swarm optimization algorithm with dynamic adaptive, Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2:346–349, 2007.
[14] T. Yamaguchi and K. Yasuda, Adaptive particle swarm optimization: self-coordinating mechanism with updating information, IEEE International Conference on Systems, Man and Cybernetics (SMC '06), 3:2303–2308, 2006.
[15] T. Yamaguchi, N. Iwasaki, and K. Yasuda, Adaptive particle swarm optimization using information about global best, IEEE Transactions on Electronics, Information and Systems, 126:270–276, 2006.
[16] K. Yasuda, K. Yazawa, and M. Motoki, Particle swarm optimization with parameter self-adjusting mechanism, IEEE Transactions on Electrical and Electronic Engineering, 5(2):256–257, 2010.
[17] S. A. Ludwig, Towards A Repulsive and Adaptive Particle Swarm Optimization Algorithm, Proceedings of the Genetic and Evolutionary Computation Conference (ACM GECCO), Amsterdam, Netherlands, July 2013 (short paper).
[18] J. Riget and J. S. Vesterstrøm, A diversity-guided particle swarm optimizer — The ARPSO, EVALife Technical Report no. 2002-2, 2002.
[19] A. Ide and K. Yasuda, A basic study of adaptive particle swarm optimization, Denki Gakkai Ronbunshi, Electrical Engineering in Japan, 151(3):41–49, 2005.
[20] M. Meissner, M. Schmuker, and G. Schneider, Optimized particle swarm optimization (OPSO) and its application to artificial neural network training, BMC Bioinformatics, 7(1):125, 2006.
[21] X. Cai, Z. Cui, J. Zeng, and Y. Tan, Self-learning particle swarm optimization based on environmental feedback, Innovative Computing, Information and Control (ICICIC '07), page 570, 2007.
[22] Z. H. Zhan and J. Zhang, Proceedings of the 6th International Conference on Ant Colony Optimization and Swarm Intelligence, pages 227–234, 2008.
[23] M. Clerc, Semi-continuous challenge, http://clerc.maurice.free.fr/pso/semicontinuous challenge/, April 2004.
[24] M. Molga and C. Smutnicki, Test functions for optimization needs, http://zsd.iiar.pwr.wroc.pl/, 2005.
[25] Z. H. Zhan, J. Zhang, Y. Li, and H. S. H. Chung, Adaptive particle swarm optimization, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 39(6):1362–1381, 2009.
[26] M. Friedman, The use of ranks to avoid the assumption of normality implicit in the analysis of variance, Journal of the American Statistical Association, 32(200):675–701, 1937.

JAISCR, 2014, Vol. 4, No. 3, pp. 205–214, DOI: 10.1515/jaiscr-2015-0009

A DATA MINING APPROACH TO IMPROVE MILITARY DEMAND FORECASTING

Rajesh Thiagarajan, Mustafizur Rahman, Don Gossink and Greg Calbert
Defence Science and Technology Organization, Edinburgh SA 5111, Australia.
[email protected]

Abstract

Accurately forecasting the demand of critical stocks is a vital step in the planning of a military operation. Demand prediction techniques, particularly autocorrelated models, have been adopted in the military planning process because a large number of stocks in the military inventory do not have consumption and usage rates per platform (e.g., ship). However, if an impending military operation is (significantly) different from prior campaigns then these prediction models may under- or over-estimate the demand of critical stocks, leading to undesired operational impacts. To address this, we propose an approach to improve the accuracy of demand predictions by combining autocorrelated predictions with cross-correlated demands of items having known per-platform usage rates. We adopt a data mining approach using sequence rule mining to automatically determine cross-correlated demands by assessing frequently co-occurring usage patterns. Our experiments using a military operational planning system indicate a considerable reduction in the prediction errors across several categories of military supplies.

1 Introduction

Identifying and accurately forecasting demand for supply items are vital steps in the planning of a military operation as they facilitate effective decision making. Such demand forecasts, referred to as Advanced Demand Information (ADI) models in the supply chain literature, have been shown to be beneficial in several aspects of supply chain management [2, 3, 4]. In a military context, ADI of supply items should be accurately modelled on the basis of their usage and Rate of Effort (ROE) by military platforms such as ships and aircraft. Consider the logistics planning of a week-long military operation O, which involves a ship with a ROE set to 5 hours of sailing per day. Assuming the ship consumes 300 litres of diesel an hour, the diesel ADI model ADIdiesel for this operation would consist of a daily demand for 1500 litres of diesel over 7 days, resulting in a total demand of 10500 litres of diesel.

Not all stock items in the military inventory have platform-based ROE usage models (referred to as ROEM henceforth) that can be used to derive accurate ADI models. A large number of stocks do not have ROEM, mainly because most stocks in the military inventory are managed through demands aggregated across a number of platforms. For example, demand for a lubricant at a military base over a time period is generally managed by aggregating its demand from a number of military platforms that require the lubricant during that period. While the aggregate demand for supply items at different military bases is captured as a part of the historic demand data, the per-platform usage details are not recorded. Therefore, automatic generation of ROEM is difficult.

(A preliminary version of this work appeared in ACIIDS 2014 [1].)
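The ROE-based ADI computation in the diesel example above can be sketched as follows; the function name and code are purely illustrative, not part of the planning system described in the paper:

```python
# Illustrative sketch: derive a simple ADI model (a daily demand series)
# from a platform's Rate of Effort (ROE) and a per-hour consumption rate.
# Numbers reproduce the worked example: 5 h/day of sailing at 300 L/h of
# diesel over a 7-day operation.

def adi_from_roe(hours_per_day, litres_per_hour, days):
    """Return the daily demand series (the ADI model) for one item."""
    daily = hours_per_day * litres_per_hour
    return [daily] * days

adi_diesel = adi_from_roe(hours_per_day=5, litres_per_hour=300, days=7)
print(adi_diesel[0])    # daily demand: 1500 litres
print(sum(adi_diesel))  # total demand: 10500 litres
```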

The lack of ROEM has led to the adoption of autocorrelated demand prediction techniques, such as Simple Exponential Smoothing (SES) [5] and ARIMA [6], in the military operational planning process [7]. In these techniques, the demand of an item for an upcoming operation (e.g., a training exercise) is autocorrelated with the item's usage in the past occurrences of the operation [7]. The demand of an item based on autocorrelation may be assumed to be its ADI model over the operational period. Consider the planning of operation O, which, as discussed above, consists of an ADI model of diesel for a ship, but lacks the ADI models of other critical stocks such as lubricants that are essential to the ship's operation. In this case, the demand for a lubricant ADIl is predicted based on its historic usage by adopting an ARIMA model, denoted as ADIl = ARIMAl. However, if an impending military operation is (significantly) different from prior campaigns then these prediction models may under- or over-estimate the demand of critical stocks, leading to undesired operational or cost impacts. For example, if the operation O is a high-intensity deployment in comparison with previous military campaigns then ADIl would be underestimated.

To address this, we propose an approach to improve the accuracy of demand predictions by combining autocorrelated predictions with cross-correlated demands of items having known per-platform usage rates. In our approach, the prediction of an ADI model for an item with no ROEM is based not only on the item's historic demand but also takes into account the item's correlation with other items having known ROEM. Our approach builds on the basic premise that the demands for certain supply items are correlated in the context of a military operation. For example, ADIl and ADIdiesel are correlated in the context of operation O because typically there would be a surge in lubricant usage as a part of the ship's maintenance routine after a surge in diesel consumption during the operation. In our approach, the ADI model of the lubricant ADIl required for the ship is predicted by combining ARIMAl with the lubricant's correlation with ADIdiesel.

A key concern of our demand modelling approach is the identification of correlation between demands for supply items. With over a million unique stock items in the military inventory, manually identifying correlation in demand between items is infeasible. Therefore, we adopt a data mining approach using sequence rule mining to determine frequently co-occurring usage between supply items from historic demand data. Our approach combines the results from sequence rule mining with time series regression analysis to derive correlated ADI models. We illustrate the effectiveness of our approach by predicting unknown ADI models in a military operational planning system. Our experimental evaluation indicates that incorporating demand correlations in the ADI forecasting process considerably reduces the prediction errors across several categories of military supplies, improving the accuracy of the demand planning process.

The rest of the paper is organised as follows. Section 2 provides an overview of the related work, followed by a brief discussion of the existing demand modelling process in Section 3. Our approach to extend the demand modelling process with sequence rule mining is presented in Section 4. The evaluation of our extended demand modelling process within a military operational planning system is discussed in Section 5. We share the results from our experimental evaluation in Section 6. Section 7 summarises our contribution and future work.
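As a minimal illustration of one of the autocorrelated techniques mentioned in this section, a Simple Exponential Smoothing forecast can be sketched as follows; the smoothing constant and the demand history are hypothetical, not values from the paper:

```python
# Minimal sketch of Simple Exponential Smoothing (SES): the forecast is an
# exponentially weighted average of past demands, updated one step at a time.
# alpha (the smoothing constant) and the demand history are hypothetical.

def ses_forecast(history, alpha=0.3):
    """One-step-ahead SES forecast from a demand history."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

demand = [120, 130, 125, 160, 155]  # hypothetical monthly demands
print(round(ses_forecast(demand), 2))  # smoothed one-step-ahead forecast
```

A larger alpha weights recent demand more heavily; an autocorrelated forecast of this kind reacts only to the item's own history, which is exactly the limitation the proposed cross-correlation approach addresses.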

2 Related work

A large body of work exists that deals with the advantages of ADI models in several aspects of supply chain management, such as [2, 3, 4]. The approaches in [2, 3] emphasise the utility of ADI models, which are established through market research, inputs from sales managers and advance (but imperfect) orders, to effectively realign manufacturing processes according to the ADI models. Existing works such as [4] have shown that military planning systems are a promising source of ADI models, particularly when ROEM exist. However, the problem of generating ADI models for items with no ROEM has not received as much attention. To the best of our knowledge, there are no studies that address the problem of missing ADI model generation in the military context. We address this gap by extending a military operational planning system that generates a few ADI models on the basis of their ROEM. The extension's goal is to leverage the generated models to improve the ADI prediction accuracy of a large number of items with no ROEM.

In the absence of ADI models, demand prediction techniques that use auto-correlation may be used to estimate the demand for critical military supplies for an impending military operation based on their past usage history [7]. However, even items with continuous non-intermittent demands may be under- or over-estimated if the impending military operation is different from its past occurrences. In our approach, we complement an existing autocorrelated demand prediction technique with ADI correlation and show the resultant reduction in the prediction errors across several categories of military supplies.

3 The military demand modelling process

Accurate ADI models are essential to undertake effective decision making at the planning stage of a military operation. The work in [4] shows that a military operational planning system can be used to estimate the ADI models for an operation. Figure 1 shows the military demand modelling process. The planning system allows a logistics planner to specify the force elements including platforms and personnel to be used in an operation, locations and routes in the operation, a schedule of activities including resource allocations, and the supply chains used to sustain the operation [4]. Apart from troop and equipment movements, the logistics plan of an operation consists of ADI models to effectively procure and distribute supplies to the operation. As shown in Figure 1, the planning system utilises pre-specified ROEM to generate the ADI models for Item1 and Item2. The ADI models generated by the planning system facilitate a variety of decision support tasks including assessing plan feasibility, logistics sustainability, and risks or weak points in the operational plan. However, due to the lack of ROEM, ADI models for many critical items (e.g., Item3 and Item4 in Figure 1) are predicted on the basis of their historic demand instead of being generated by the planning system. This inhibits effective and comprehensive military operational planning.

4 Demand modelling process extended with correlated ADI models

The ideal solution to the lack of ADI models would be to establish ROEM for all stock items. However, such an effort would be an enormous, data-intensive and time-consuming undertaking with high costs due to the size of the military inventory. As mentioned earlier, automatic generation of ROEM is difficult as the historic demand data does not record per-platform usage details. Therefore, an approach to improve the accuracy of ADI models available for analyses in the absence of ROEM is required. In response, we propose an extension to the existing demand modelling process where the accuracy of an item's autocorrelation-based ADI prediction is improved by combining it with the item's cross-correlations with known ADI models. Our approach builds on the basic premise that the demands for certain supply items are cross-correlated in the context of a military operation. The correlation in demands may arise from a range of factors including complementary relationships (e.g., a gear lubricant oil and an engine oil), part-of relationships (e.g., an engine and a fuel pump), dependence (e.g., a gun and ammunition), and operational circumstances (e.g., an operation in a tropical region requires both anti-malarial drugs and repellents). As a result, in some cases, when the demands of two supply items are correlated and only one item's ADI model is generated by the military operational planning system, their correlation may potentially be leveraged to improve the accuracy of the second item's predicted ADI model.

Our demand modelling approach presented in Figure 2 extends the basic process discussed in Section 3 with the following three additional steps:

1. Identification of cross-correlation in demands using the historic demand data
2. Modelling the cross-correlation in demands
3. Combining autocorrelation with cross-correlation to improve ADI prediction accuracy

The three steps in the extended demand modelling approach are detailed below.


Figure 1. Military demand modelling process

Figure 2. Demand modelling process extended with correlated ADI models


4.1 Sequence rules to identify correlated demands (Step 1)

The key challenge of our demand modelling approach is the identification of cross-correlation in demands. Specifically, the challenge is to determine frequently co-occurring usage between supply items from the historic demand data. We adopt a data mining approach to address this issue because the size of the military inventory makes manual discovery of correlations infeasible. Widely used data mining methods such as association rule mining [8] and all-pairs correlation techniques [9] are unsuitable for this problem, mainly because the historic demand data is not a market basket database. Although conversion to a market basket database using time period based sampling is possible (e.g., all demands on a day can be treated as one market basket transaction), previous works like [9] have noted that such methods are vulnerable to false-positive and false-negative correlations because they do not take into account possible lags between the demands of correlated items. For example, unless the whole duration of the operation O discussed in Section 1 is considered as a single market basket transaction, the lagged correlation between the demand for diesel and the lubricant for post-operational maintenance would be ignored.

To address this, we transform the identification of correlation in demands into a sequence rule mining problem, where the goal is to discover sequential rules from a database of sequences [10]. A sequence database is of the form S = {s1, . . . , sn}, where each sequence si consists of chronologically ordered itemsets {{i1^1, . . . , ix^1}, . . . , {i1^m, . . . , iy^m}}. The sequence rule mining process over S returns a set of sequence rules of the form X → Y, where X ⊂ si and Y ⊂ si are disjoint itemsets such that Y occurs after X in S with a certain support and confidence [10]. While users are allowed to set a minimum support (minsup) value to filter infrequent rules, in most application domains it is difficult to ascertain an optimal minsup before the mining process. Therefore, we adopt the rule mining algorithm in [10], which allows users to efficiently search for the top-k sequence rules with a certain minimum confidence. The algorithm works by a process called RuleGrowth, where small sequence rules are recursively expanded by adding frequent itemsets to the left and right side of the rules. The process continuously updates a top-k rules set when new rules with higher support are found.

Step 1 in Figure 2 shows the adoption of a sequence rule mining process in our extended demand modelling process to identify correlated demands. Firstly, a sequence database is generated from the historic demand data. Each sequence is formed by chronologically ordering the demand transactions at a military base. The rationale behind location-based sequence formation is that if the demands for two items co-occur frequently (with or without lag) across a number of military bases then it is likely that their demands are correlated. The demand sequences table in Figure 2 shows the location-based sequence database generated from the historic demand data. For example, the demand sequence at location 1 consists of items {1, 3, 2, 4, . . .} that occur at times {T1, T2, T5, T6, . . .}. The demand sequences are provided as input to the sequence rule mining process [10] to generate sequence rules that are above a user-specified confidence threshold across all locations. Figure 2 shows the generated sequence rules {{Item1} → {Item3}, {Item2} → {Item4}, {Item1, Item2} → {Item4}, . . .}. The rule {Item1} → {Item3} implies that if a military base demands Item1 then it is likely that this will be followed in the future by a demand for Item3, indicating a potential correlation in the demands of items 1 and 3. Note that, unlike all-pairs correlation [9], the sequence rules also capture cross-correlations between several items (e.g., {Item1, Item2} → {Item4}).
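The support and confidence of a candidate rule over location-based sequences can be illustrated with a much-simplified sketch. This is not the RuleGrowth algorithm of [10]; it only scores a single candidate rule {a} → {b}, counting b only when it occurs after a so that lagged co-occurrence is captured:

```python
# Simplified sketch of Step 1: estimate support and confidence for a
# candidate sequence rule {a} -> {b} over per-location demand sequences.
# b is counted only when it occurs strictly after a in the sequence.

def rule_stats(sequences, a, b):
    n = len(sequences)
    has_a = both = 0
    for seq in sequences:
        if a in seq:
            has_a += 1
            if b in seq[seq.index(a) + 1:]:  # b strictly after a
                both += 1
    support = both / n
    confidence = both / has_a if has_a else 0.0
    return support, confidence

# Hypothetical per-location sequences (items ordered by demand time)
sequences = [
    ["Item1", "Item3", "Item2", "Item4"],
    ["Item1", "Item2", "Item3"],
    ["Item2", "Item4"],
]
print(rule_stats(sequences, "Item1", "Item3"))  # support 2/3, confidence 1.0
```

A real miner such as RuleGrowth explores the space of itemset rules and keeps only the top-k by support above a confidence threshold, rather than scoring a single pre-chosen pair.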

In some data mining domains (e.g., retail) it is sufficient to just identify potential correlations between items as this may already be enough to effectively adapt marketing tactics. In the context of ADI model generation, however, modelling the characteristics of a correlation is equally critical because it is important to quantify how the demand for one or more supply items impact the demand of a correlated item. To this end, we adopt time series regression analysis to model the correlated demands.

4.2 Time series regression to model correlated demands (Step 2)

While a sequence rule indicates a potential correlation, in order to quantify the relationship between items it is important to consider the de-

210 mand quantities of the correlated items over time. Time period based sampling (e.g., daily demand) can be used to represent the chronologically ordered demands of an item as a time series. Time series regression analysis, a well known statistical method to model the relationship between time series variables, is one way to model the correlated demands. A linear regression model of the form Yt = β1 Xt1 + · · · + βn Xtn + ε models the relationship between a dependent variable Y at time t and independent variables {Xt1 , . . . , Xtn } at t using the regression coefficients β1 , . . . , βn , and an error term ε. The problem of selecting suitable independent variables X from a large pool of potential variables is a common issue in regression analysis. Methods such as stepwise regression [11], which sequentially add/remove variables to a regression model based on a scoring criteria, are suitable if the pool of independent variables is small but do not scale well to larger pools. In the context of ADI model generation, the sequence rules generated in step 1 can be used to substantially reduce the pool of independent variables available to conduct stepwise regression. All items in the antecedent part of a sequence rule are considered as a part of the pool of independent variables to predict the unknown ADI models of items in the subsequent part of the rule. Step 2 of Figure 2 shows the regression analysis process conducted, leveraging the sequence rules to generate linear (or non-linear) models of the correlated demands. For every item in a demand correlation identified by a sequence rule, time period based sampling is performed to generate the item’s demand time series. For example, the demands table in step 2 of Figure 2 shows the time series creation of items {1, 3}, {2, 4}, and {1, 2, 4}. The regression analysis results include a linear model Item3 = β1 Item1 + ε1 quantifying the demand for Item3 from the demand for Item1. 
A user-defined threshold for the coefficient of determination R2 [12] is used to filter poorly correlated models. Recalling the issue of false negatives and false positives with time period based sampling raised in Section 4.1, the problem was due to the inability to deal with lagged correlations. We address this issue in step 2 by allowing lagged variants of known ADI models to be part of the pool of independent variables available to the regression analysis. For example, if the monthly demand of a lubricant lt is dependent on a truck's ROE (i.e., diesel demand) over the last 6 months {dt−1, . . . , dt−6}, then it can be modelled as lt = β1 dt−1 + · · · + β6 dt−6 + ε. The weighted sum of the time lagged values has an effect similar to a Finite Impulse Response filter [13] in removing some frequency components. Due to the additional possible delays and weights there is potential for overfitting in the ADI prediction model. This can be ameliorated by applying regularisation techniques commonly applied in machine learning (regression).

Thiagarajan R., Rahman M., Gossink D. and Calbert G.
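The lagged-regression idea can be sketched as follows, using two lags instead of six for brevity and hypothetical demand values (the paper's actual models are fitted with dynlm in R):

```python
# Sketch of the lagged-correlation fix: regress lubricant demand l_t on
# lagged diesel demand {d_{t-1}, d_{t-2}} (two lags for brevity; the
# example in the text uses six). All demand values are hypothetical.

def lagged_design(d, n_lags):
    """Rows [d_{t-1}, ..., d_{t-n_lags}] for each usable time index t."""
    return [[d[t - k] for k in range(1, n_lags + 1)]
            for t in range(n_lags, len(d))]

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Cramer's rule for a 2x2 linear system."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det

# Hypothetical monthly diesel (d) demand; lubricant demand l_t is built
# to follow 0.5*d_{t-1} + 0.2*d_{t-2} exactly, for clarity.
d = [100, 120, 110, 130, 125, 140, 135]
X = lagged_design(d, 2)                      # rows: [d_{t-1}, d_{t-2}]
l = [0.5 * x1 + 0.2 * x2 for x1, x2 in X]

# Normal equations X'X beta = X'l for the two lag coefficients.
s11 = sum(x1 * x1 for x1, _ in X)
s12 = sum(x1 * x2 for x1, x2 in X)
s22 = sum(x2 * x2 for _, x2 in X)
b1 = sum(x1 * yi for (x1, _), yi in zip(X, l))
b2 = sum(x2 * yi for (_, x2), yi in zip(X, l))
beta1, beta2 = solve_2x2(s11, s12, s12, s22, b1, b2)
```

Because the series is noise-free, least squares recovers the generating coefficients β1 = 0.5 and β2 = 0.2 exactly; with real data the fit is approximate and regularisation curbs overfitting.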

4.3 Combining Auto-correlation and ADI correlation (Step 3)

Recall the ARIMA-based autocorrelated ADI prediction of Item3 and Item4 from Figure 1. In the final step (Step 3) of our extended demand modelling process in Figure 2, we incorporate the cross-correlated demand models within the ARIMA prediction technique to produce a combined prediction. Specifically, when predicting the ADI model of an item with no ROEM (e.g., Item3), we use the ADI models discovered in Step 1 and Step 2 of Figure 2 as external regressors in the ARIMA model. It is important to note that both Step 1 and Step 2 may be performed off-line, prior to the planning stage. In Step 3 of the extended demand modelling process, we consider two cases of incorporating cross-correlated ADI models in the ARIMA-based prediction process: single demand correlation and multiple demand correlation.

Single Demand Correlation (ARIMA SR)

Single demand correlation combines an ARIMA-based autocorrelated prediction technique with a single cross-correlated ADI model, referred to as ARIMA SR [1]. When an ARIMA SR model is used to predict an item's ADI model, the prediction is based not only on the item's past usage but also on its correlation with one other known ADI model. The ARIMA SR process is illustrated in Step 3.1 of Figure 2. The demand for Item3, for example, is predicted by combining its ARIMA prediction ARIMAItem3 with a single cross-correlated ADI model Item1. Candidates for cross-correlation are selected in Step 2 of the extended demand modelling process, as discussed in Section 4.2. The ARIMA SR prediction of Item3 is modelled as Item3 = ARIMAItem3 + β31 Item1 + ε3.

A DATA MINING APPROACH TO IMPROVE MILITARY . . .

Multiple Demand Correlation (ARIMA SRM)

The demand for some supply items may be cross-correlated with more than one ADI model. We refer to this case as multiple demand correlation (ARIMA SRM), shown as Step 3.2 in Figure 2. Let us consider the prediction of a gear lubricant's ADI model ADIl based on combining ARIMAl and a cross-correlated diesel ADI model ADIdiesel. The dependence that underlies the cross-correlation between diesel and lubricant usage is discussed in Section 1. However, the lubricant's demand may also be cross-correlated with other known ADI models, for example brake fluid ADIbf, because typically a vehicle's maintenance process includes a change of lubricants and brake fluid. Therefore, ADIl predictions may potentially be improved by also considering the known ADIbf. The ARIMA SRM process is illustrated in Step 3.2 of Figure 2. The demand for Item4, for example, is predicted by combining its ARIMA prediction ARIMAItem4 with multiple cross-correlated ADI models Item1 and Item2. The ARIMA SRM prediction of Item4 is modelled as Item4 = ARIMAItem4 + β41 Item1 + β42 Item2 + ε4.

5 Evaluation of the extended demand modelling process

To evaluate, we used the SES, ARIMA, ARIMA SR, and ARIMA SRM techniques to predict the demand of items required for military training exercises.

5.1 Implementation

To illustrate our extended demand modelling approach, we have implemented a software system that uses SES, ARIMA, ARIMA SR, and ARIMA SRM to infer ADI models for items with no ROEM. The system is implemented as an extension to a military operational planning system. Consider the planning of a military contingency operation. The mission's logistics plan generated using the military operational planning system would consist of ADI models of items with ROEM. Based on this initial plan, our implementation offers the planner a set of improved ADI predictions that are derived from a library of autocorrelated and cross-correlated ADI models. The library of correlated demand models used in our system is developed prior to the planning process. As discussed in Section 4.1, the identification of demand correlation is transformed into a sequence rule mining problem that is solved using the top-k rule algorithm [10] in the Sequential Rule Mining Framework (SPMF) [14]. The linear models quantifying the cross-correlated demands are generated using the dynlm package in the statistical system R [15]. The ARIMA models are generated using the forecast [16] package in R.

5.2 Inventory profile

Our initial analysis indicates that the military operation planning system can only generate ADI models for some frequently used military consumables in the following inventory profiles:

– fuels and lubricants (e.g., engine oil)
– food (e.g., combat ration packs)
– clothing (e.g., combat boots)
– cleaning products (e.g., engine cleaning brush)
– building products (e.g., aircraft sealant)
– packaging (e.g., fuel drums)

The relative proportion of the military inventory profile under consideration is presented in Figure 3.
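The additive Step 3 combination, e.g. Item4 = ARIMAItem4 + β41 Item1 + β42 Item2 + ε4, can be sketched directly; the baseline forecast and coefficients below are hypothetical stand-ins for a fitted ARIMA model and the Step 2 regression coefficients:

```python
# Sketch of Step 3: an ARIMA SRM-style prediction combines an
# autocorrelated baseline with weighted cross-correlated ADI models.
# All numeric values below are hypothetical stand-ins.

def srm_predict(arima_baseline, regressors, betas):
    """Combine a per-period baseline forecast with weighted regressors."""
    return [base + sum(b * reg[t] for b, reg in zip(betas, regressors))
            for t, base in enumerate(arima_baseline)]

arima_item4 = [50.0, 52.0, 51.0]   # stand-in ARIMA forecast for Item4
item1 = [10.0, 12.0, 11.0]         # known cross-correlated ADI models
item2 = [5.0, 4.0, 6.0]
beta41, beta42 = 0.8, 1.5          # stand-in Step 2 coefficients

item4_pred = srm_predict(arima_item4, [item1, item2], [beta41, beta42])
# First period: 50 + 0.8*10 + 1.5*5 = 65.5
```

In the paper's implementation this combination is handled by supplying the cross-correlated series as external regressors to the forecast package's ARIMA fitting in R; the function above only illustrates the additive structure.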

Figure 3. Inventory Profile (Clothing 32%, Packaging 27%, Building 23%, Fuels and lubricants 9%, Cleaning 8%, Food 1%)



Figure 4. Average Demand Prediction Errors for SES, ARIMA, ARIMA SR and ARIMA SRM

5.3 Experimental set-up

We used the SES, ARIMA, ARIMA SR, and ARIMA SRM techniques to predict the demand of items across the categories of military consumables mentioned in Section 4.3. We used the demand data from military training exercises conducted over the last 20 years in these experiments. The first 10 years of data from the training exercises were used to train the demand predictors, while the remaining data were used for testing the prediction accuracy. To quantify the accuracy of a predicted ADI model, we use the Normalised Root Mean Square Error (NRMSE) metric, defined as

NRMSE = RMSE / (max(ADIa) − min(ADIa)),

where RMSE is the standard Root Mean Square Error [17] and ADIa is the actual ADI model. Our evaluation included two sets of experiments. The first set of experiments was conducted to assess whether ARIMA SR predictions were more accurate than those predicted by SES and ARIMA. The second set of experiments evaluated whether adopting ARIMA SRM over ARIMA SR improved the accuracy of ADI predictions.

6 Results and discussion

We present the results from our experiments in this section.

6.1 ARIMA SR evaluation

The results from our experiments to evaluate ARIMA SR are presented in Figure 4. Each group of bars denotes the average NRMSE resulting from the adoption of SES, ARIMA, ARIMA SR and ARIMA SRM for ADI prediction in an inventory segment. Figure 4 shows that on average the ARIMA SR model's NRMSE is lower than that of SES and ARIMA across all inventory segments. When the single standard deviations shown as error bars in Figure 4 are taken into consideration, the ARIMA SR technique performs better than the SES and ARIMA techniques in terms of NRMSE in 3 out of the 6 categories of military consumables.

6.2 ARIMA SRM evaluation

Figure 4 also presents the results from our experiments to evaluate whether adopting ARIMA SRM instead of ARIMA SR improves the accuracy of the ADI predictions. From the results it is evident that on average the ARIMA SRM model's NRMSE is lower than that of ARIMA SR across all inventory segments. The single standard deviations indicate that ARIMA SRM is more accurate than SES and ARIMA in all categories of military consumables except 'Food'. We conducted further analysis to explain the results in the 'Food' inventory segment. From our analysis we were able to conclude that, as the 'Food' segment comprises only 1% of the items in the experiment's inventory profile, the results were more sensitive to variance in the NRMSE values.

Figure 5. Relative improvements in prediction accuracies

Furthermore, we also measured the relative improvement achieved by ARIMA over SES, ARIMA SR over ARIMA, and ARIMA SRM over ARIMA SR. The relative improvement, that is the reduction in NRMSE, of technique T over T′ is measured as

(NRMSE T′ − NRMSE T) / NRMSE T′ × 100.

The relative improvement results, shown as a stacked bar chart in Figure 5, indicate that by adopting ARIMA SRM the ADI prediction accuracy can be improved over ARIMA by at least 20% across all stock categories. Our experiments indicate that in the biggest inventory segment, 'Clothing' (32% of the inventory profile as shown in Figure 3), the relative improvement of ARIMA SRM over ARIMA is about 47%. From the results presented in Figures 4-5, it can be concluded that combining correlated ADI models discovered using sequence rule mining with autocorrelated demand prediction improves the accuracy of military demand predictions.
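The two evaluation quantities used above, NRMSE and the relative improvement of a technique T over T′, can be sketched as follows (the actual/predicted series are hypothetical, not the military data set):

```python
# Sketch of the evaluation metrics:
#   NRMSE = RMSE / (max(ADI_a) - min(ADI_a))
#   relative improvement = (NRMSE_T' - NRMSE_T) / NRMSE_T' * 100
# The series below are hypothetical illustrations.
import math

def nrmse(actual, predicted):
    """RMSE normalised by the range of the actual ADI series."""
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))
    return rmse / (max(actual) - min(actual))

def relative_improvement(nrmse_old, nrmse_new):
    """Reduction in NRMSE of the new technique over the old one, in percent."""
    return (nrmse_old - nrmse_new) / nrmse_old * 100.0

actual = [10.0, 14.0, 12.0, 18.0, 16.0]          # hypothetical actual demand
arima_pred = [11.0, 13.0, 13.0, 16.0, 17.0]      # hypothetical ARIMA forecast
arima_srm_pred = [10.5, 13.8, 12.2, 17.5, 16.1]  # hypothetical ARIMA SRM forecast

e_arima = nrmse(actual, arima_pred)
e_srm = nrmse(actual, arima_srm_pred)
gain = relative_improvement(e_arima, e_srm)      # positive => SRM is better
```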

7 Summary and future work

We presented an extended demand modelling approach for military operational planning, where the accuracy of ADI prediction for critical stocks is improved by combining autocorrelated predictions with cross-correlated ADI models of items with known ROEM. We highlighted the importance of accurate ADI models to conduct comprehensive and effective operational analysis and decision making. We presented an approach based on sequence rule mining to identify (possibly delayed) correlation in demands. We also showed how regression models are used to quantify the demand correlations. An experimental evaluation of our approach was conducted to show the improvement in the military demand modelling process. Conducting a full-scale user study to further validate ADI model generation is part of our current and future work. We also plan to explore the prospect of probabilistic cross-correlations with ADI models generated by the military operational planning system.


References

[1] R. Thiagarajan, M. Rahman, G. Calbert, and D. Gossink, "Improving military demand forecasting using sequence rules," in 6th Asian Conference on Intelligent Information and Database Systems (ACIIDS), pp. 475–484, 2014.

[2] S. Benjaafar, W. L. Cooper, and S. Mardan, "Production-inventory systems with imperfect advance demand information and updating," Naval Research Logistics (NRL), vol. 58, no. 2, pp. 88–106, 2011.

[3] F. Karaesmen, "Value of advance demand information in production and inventory systems with shared resources," in Handbook of Stochastic Models and Analysis of Manufacturing System Operations, vol. 192, pp. 139–165, 2013.

[4] R. Thiagarajan, M. A. Mekhtiev, G. Calbert, N. Jeremic, and D. Gossink, "Using military operational planning system data to drive reserve stocking decisions," in 29th IEEE International Conference on Data Engineering (ICDE) Workshops, pp. 153–162, 2013.

[5] E. Gardner, "Exponential smoothing: The state of the art Part II," International Journal of Forecasting, vol. 22, no. 4, pp. 637–666, 2006.

[6] G. Box, G. Jenkins, and G. Reinsel, Time Series Analysis: Forecasting and Control. 2008.

[7] M. Downing, M. Chipulu, U. Ojiako, and D. Kaparis, "Forecasting in airforce supply chains," International Journal of Logistics Management, vol. 22, no. 1, pp. 127–144, 2011.

[8] R. Agrawal, T. Imieliński, and A. Swami, "Mining association rules between sets of items in large databases," SIGMOD Rec., vol. 22, no. 2, pp. 207–216, 1993.

[9] H. Xiong, W. Zhou, M. Brodie, and S. Ma, "Top-k correlation computation," INFORMS Journal on Computing, vol. 20, no. 4, pp. 539–552, 2008.

[10] P. Fournier-Viger and V. S. Tseng, "TNS: mining top-k non-redundant sequential rules," in ACM Symposium on Applied Computing (SAC), pp. 164–166, 2013.

[11] L. Wilkinson, "Tests of significance in stepwise regression," Psychological Bulletin, vol. 86, no. 1, pp. 168–174, 1979.

[12] R. H. Myers, Classical and modern regression with applications, vol. 2. 1990.

[13] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing. Prentice–Hall, 1989.

[14] P. Fournier-Viger, A. Gomariz, A. Soltani, and T. Gueniche, "SPMF: Open-Source Data Mining Platform - http://www.philippe-fournier-viger.com/spmf/," 2013.

[15] A. Zeileis, dynlm: Dynamic Linear Regression, 2013. R package version 0.3-2.

[16] R. J. Hyndman, G. Athanasopoulos, S. Razbash, D. Schmidt, Z. Zhou, Y. Khan, and C. Bergmeir, forecast: Forecasting functions for time series and linear models, 2013. R package version 4.06.

[17] C. J. Willmott, "On the validation of models," Physical Geography, vol. 2, no. 2, pp. 184–194, 1981.

JAISCR, 2014, Vol. 4, No. 3, pp. 215–225, DOI: 10.1515/jaiscr-2015-0010

ADVANCED SUPERVISION OF OIL WELLS BASED ON SOFT COMPUTING TECHNIQUES

Edgar Camargo1 and Jose Aguilar2

1 PDVSA, Edificio El Menito, LSAI Lagunillas, Edo. Zulia, Venezuela

2 Universidad de Los Andes, CEMISID, Mérida, 5101, Venezuela; Prometeo Researcher, Universidad Técnica Particular de Loja, Ecuador

Abstract

This work presents a hybrid intelligent supervision model based on Evolutionary Computation and Fuzzy Systems to improve the performance of the oil industry, which is used for operational diagnosis in petroleum wells based on the gas lift (GL) method. The model is composed of two parts: a Multilayer Fuzzy System to identify the operational scenarios in an oil well, and a genetic algorithm to maximize the production of oil and minimize the flow of gas injection, based on the restrictions of the process and the operational cost of production. Additionally, the first layers of the Multilayer Fuzzy System have specific tasks: the detection of operational failures, and the identification of the rate of gas that the well requires for production. In this way, our hybrid intelligent model implements supervision and control tasks.

1 Introduction

The malfunctioning of industrial systems can cause financial and human losses, undesired environmental impacts, and other problems. This is true for a multitude of industrial domains: the aerospace industry, oil companies, etc. Many of these systems are highly automated. Automatic control frees them from manual human control, but does not immunize them against operational failures. Therefore, it is necessary to complement industrial automation systems with powerful and accurate supervision tools that indicate undesired or unpermitted performance states, and that take the proper actions to keep the system within optimal performance states. The utilization of Hybrid Intelligent Systems (HIS) for supervision tasks in production systems is becoming an area of great interest at the industrial level [1], [3], [5], [10]. In particular, HIS have

gained a large influence in the oil industry, because they allow approaching the problem of handling the complexity of hydrocarbon production systems [1], [2]. In particular, the use of computational intelligence techniques represents an attractive alternative for dealing with highly varying, complex, and confusing problems [7], [9]. Thus, in this work a HIS for supervising and optimizing oil production processes is proposed. The HIS is composed of a Multilayer Fuzzy Classifier System (MFCS) to detect faults, operational scenarios, and the rate of gas that the well requires for production; and a Genetic Algorithm (GA) to optimize the production. The MFCS consists of multiple fuzzy systems hierarchically distributed, one for each task, which have the advantage that the total number of rules of the knowledge base is smaller, and that they are simpler than a conventional fuzzy system. The GA defines a population of individuals, each of them representing a possible solution to the oil production optimization problem.

Camargo E. and Aguilar J.

The main goal of the MFCS is the identification of different operational scenarios in an oil well, in order to implement control tasks. The MFCS carries out other supervision tasks (detecting faults that affect the process or the equipment involved, in real time, in the production facilities at the level of the well and reservoir; determining the rate of gas that the well requires for production), and its output is the input to the GA. The system is initially tested in wells requiring artificial lift by gas (ALG). The main goal of the GA is the optimization of two objectives: the maximization of the production of hydrocarbons and the minimization of the gas injection. The GA must resolve a zone of negotiation among these criteria that allows finding the ideal production. The work herein presented reports field tests of the HIS used to optimize wells, based on the usage of axial load and several other variables (gas lift flow, pressures, etc.), with a special ingredient that has a tremendous impact on optimization: Artificial Intelligence and Automation. The mathematical principles are also presented, to encourage the usage of axial load in a multivariable mathematical model in order to complete a good optimization scheme for these wells. The benefits of operating a HIS using artificial lift by gas (ALG) are substantial, as described in this paper. The idea is to reduce downtime and workovers, and to improve the system operating response time and equipment useful life, while optimizing the well production. This paper is structured as follows: theoretical aspects of Fuzzy Classifier Systems and the production process of wells are presented in Section 2. The design of our HIS is presented in Section 3, and the experiments with our HIS are shown in Section 4. The paper ends with the conclusions.

2 Theoretical Framework

2.1 Industrial Automation

The Integral Automation Pyramid, proposed by the International Standards Organization (ISO), is associated with the hierarchical structure of

decision-making processes. The base of the pyramid corresponds to the level of measurement and control, followed by a second level of supervision, a third level of optimization, and finally, a last level of asset management. These levels of the automation pyramid can be grouped into three: the Operational Level for the control and measurement tasks, the Tactical Level for the supervision and optimization tasks, and finally, the Strategic Level. Our HIS corresponds to the Tactical Level, because it allows us to incorporate the following qualities into a system: autonomy in the decision-making process; adaptation, through the possibility of learning from the occurrence of events in the industrial system under supervision; and self-diagnosing and self-organizing capacities, anticipating the effect of the supervision tasks. Additionally, the HIS is distributed at the process level, in such a way that decisions are carried out locally, thus minimizing the response times of the supervision tasks. This approach extends the classical approach of Supervisory Control and Data Acquisition (SCADA) systems, which are limited to supervision and control tasks.

This new approach is based on a self-regulation process in the wells, from the information they handle (status-actions), which allows them to anticipate situations and have a proactive behavior, without losing the global vision of the business. Our HIS is oriented towards the provision of intelligence to the well by giving it onsite self-diagnosing characteristics, in order to determine the production method with the best performance and financial profitability. Wells with these characteristics have been called "conscious wells" in the literature, meaning by this term a well that, based on its profitability, regulates its production. For that, the well must have capabilities of self-diagnosis, control of its damages, supervision of the behavior of its subsoil/surface infrastructure, etc. [7]. This allows providing intelligence to the production process through onsite self-supervision and self-optimization. One of the most notable advantages of our approach is the architectural change towards a distributed intelligence model oriented to the field and the subsoil, eliminating the tactical level with the incorporation of the supervision and optimization tasks at the operational level (Figure 1).


ADVANCED SUPERVISION OF OIL WELLS . . .

Figure 1. Pyramidal Automation Model based on our HIS (from top to bottom: Business Planning and Asset Management; Optimization, Supervision and Advanced Control; Basic Measurement and Control — with Intelligent Supervision Systems incorporated at the operational level)

2.2 Fuzzy Classifier System

A Fuzzy Classifier System (FCS) is a system whose rules are based on the theory of Fuzzy Logic (FL), and which includes the same elements as a Classifier System (CS), but working in a fuzzy framework [3], [4]. In this way, the activation of a rule is achieved when some values of fuzzy variables from the environment are activated in the "antecedent". In standard fuzzy systems, for a problem with n input variables described by m linguistic labels, the maximum possible number of rules in the fuzzy system is m^n. In practice, this exponential growth causes the number of rules, for a high number of variables, to be so large that the interpretability of the system becomes impossible. This problem is not exclusive to fuzzy systems, and is known as the problem of dimensionality [4].

One way of reducing the number of rules, and thus increasing the interpretability, is to decompose the fuzzy system into simple modules; this is called a multilayer fuzzy system (MFS) [5]. A MFCS consists of a number of fuzzy systems hierarchically distributed, which have the advantage that the total number of rules of the knowledge base is smaller, and that they are simpler than a conventional fuzzy system. There are numerous design proposals for such systems [5]. The most traditional type of MFS is one in which each module is a complete fuzzy system (FS) that relates to a reduced set of variables, which can be input variables of the global system or internal variables generated as outputs of other modules [5]. There are other approaches, for example, some identify common sets of rules and define common modules for them [7], and in others each level corresponds to an increase in the granularity of the variables [6]. Our work uses the first approach, which can be seen in Figure 2 [10].

Figure 2. Classic Model of a Multilayer Fuzzy System
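A quick calculation illustrates the dimensionality problem and the saving a multilayer decomposition brings; the variable and label counts here are hypothetical, chosen only for illustration:

```python
# Sketch of the rule-explosion argument: a flat fuzzy system with n input
# variables and m linguistic labels has up to m**n rules, while a
# multilayer decomposition into modules of n_i inputs each needs only
# sum(m**n_i) rules. The numbers n=6, m=5 are illustrative.

def flat_rule_count(m, n):
    """Maximum rule count of a single flat fuzzy system."""
    return m ** n

def multilayer_rule_count(m, module_inputs):
    """Total rule count across hierarchically chained modules."""
    return sum(m ** n_i for n_i in module_inputs)

m, n = 5, 6
flat = flat_rule_count(m, n)                   # 5**6 = 15625 rules
layered = multilayer_rule_count(m, [2, 2, 2])  # 3 * 5**2 = 75 rules
reduction = flat / layered                     # > 200x fewer rules
```

The exact module sizes depend on how the designer partitions the variables; the point is only that the sum of small per-module rule bases grows far more slowly than the single exponential m^n.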

2.3 Production Process of Wells by the Gas Lift Method

The Gas Lift method consists of injecting gas at an established pressure into the lower part of the well pipe's fluid column, at different depths, with the purpose of decreasing its weight, thus helping the reservoir fluids rise from the bottom of the well to the surface. That way, in the wells exploited by the Gas Lift method the gas is continuously injected into the well in order to mix with the fluids of the well and reduce the density of the fluid column, which decreases the difference in pressures between the bottom-hole and the surface.

The production curve of a well that produces by the gas injection method (see Figure 3) indicates that when the Gas Lift Flow increases (GLF, expressed in "mpcdg", thousands of gas cubic feet per day), the production rate (Qprod, expressed in "BNPD", Net Daily Production Barrels) also increases, until reaching its highest value (Stable Region), such that additional increases in the injection will cause a decrease in the production (Unstable Region) [1], [2].

Figure 3. Artificial Gas Lift well behavior's model

The well's production curve is obtained by the characterization of the well using mass and energy balance techniques [1], [3]. The mechanical completion installed at the bottom and surface of the well allows the characterization of the physical


Camargo E. and Aguilar J.

properties of the fluid (Gravity of the oil, Water cut, Bottom-hole pressure, Gas-liquid ratio). This is necessary because the oil production behavior in wells injected with gas depends on variables both of the reservoir and of the mechanical design (valves, production pipes, among others) [1]. The implementation of this ALG method needs an instrumentation and control arrangement. For that, the measurement and control of the following variables are required: Gas Lift Flow (Qinj), Production Rate (Qprod), Gas Lift Pressure (Glp), Gas Lift Pressure Differential (Gldp), Casing Pressure (Pg,inj), Production Tubing Pressure (Pthp) and Bottom Pressure (Pwf).
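For illustration, the supervision variables just listed can be grouped in a small record. This is only a sketch; the field names and the helper method are our own, not part of the paper:

```python
from dataclasses import dataclass

@dataclass
class WellMeasurements:
    """Snapshot of the supervision variables of an ALG well.

    Names are illustrative; units follow the paper (flows in mpcndg/BNPD,
    pressures in psi)."""
    q_inj: float     # Gas Lift Flow (Qinj)
    q_prod: float    # Production Rate (Qprod)
    glp: float       # Gas Lift Pressure (Glp)
    gldp: float      # Gas Lift Pressure Differential (Gldp)
    p_casing: float  # Casing Pressure (Pg,inj)
    p_tubing: float  # Production Tubing Pressure (Pthp)
    p_bottom: float  # Bottom Pressure (Pwf)

    def tubing_pressure_drop(self) -> float:
        # Pwf - Thp: the intermediate quantity later used by the first MFCS layer
        return self.p_bottom - self.p_tubing
```

Keeping the bottom and surface pressures together in one record is what later allows the supervision layers to combine subsoil and surface information.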

So, a simple gas lift model is proposed [1]: the oil and gas "Inflow" of the reservoir is modeled with the use of the productivity index (the oil volume that the reservoir can provide) and the existing relation between the production rate (Qprod) and the differential between the reservoir pressure (Pws) and the flowing pressure at the bottom of the well (Pwf). Eq. (1) is used, which determines the contribution capacity of the oil reservoir. This equation represents an instant of such contribution capacity of the well, at a given time of its productive life. It is normal for such capacity to decrease over time, due to the reduction of permeability in the well surroundings and the increase of the viscosity of the oil. This equation is considered as the energy offered, or fluid affluence curve, that the reservoir yields to the well (Pwf vs Qprod):

Pwf = Pws · [ (1,266 − 1,25·Qprod/Qo)^0,5 − 0,125 ]    (1)

where Qo represents a base production rate, which is determined through reservoir core tests.

As for the "Outflow", gas is injected at a given depth to reduce the weight of the column and thereby the bottom pressure of the well, allowing the establishment of a production rate at which the capacity of fluid contribution from the reservoir equals the capacity of fluid extraction from the well. In this sense, in order to inject gas, it is assumed that the pressure at the level of the bottom injection valve located in the casing must be greater than the pressure in the production pipe at the injection point (Pg,inj > PT,inj), to ensure a displacement of the gas towards the production pipe. This is described by the following equation:

Qinj = c·sqrt( ρg·(Pg,inj − PT,inj) )  if Pg,inj > PT,inj;  Qinj = 0 otherwise    (2)

Where,
Pg,inj = Pressure of Injection of Gas to the Valve
PT,inj = Pressure of the Production Pipe at the Point of Injection
ρg = Gas Density
Qinj = Gas Injection Rate
c = Constant related to the characteristics of the valve

For the model, a node at the gas injection valve is assumed, in order to establish the production capacity of the lifting system [2], [3]. Thus, the production of the system responds to an energy balance, in the form of pressures, between the energy contribution capacity of the reservoir and the energy demand of the installation [2], which is expressed at the node as follows:

Node arrival pressure: Pvalve(inflow) = Pws − ΔPy
Node output pressure: Pvalve(outflow) = Pthp − ΔPp

Where:
ΔPy = Pws − Pwf (Pressure Drop in the Reservoir)
ΔPp = Pthp − PT,inj (Pressure Drop in the Well)

And now Qinj is defined as:

Qinj = c·sqrt( ρg,inj·(Pg,inj − Pthp + Pwf) )    (3)

From equations (1), (2) and (3), the mathematical model that describes the behavior of a gas lift well is (4):

Qprod = −(Qo/1,25) · [ ( (Pthp + Pg,inj − (Qinj/c)²/ρg)/Pws + 0,125 )² − 1,266 ]    (4)

3 Design of our HIS

3.1 Hybrid Intelligent Systems

Our HIS is composed of a MFCS and a GA [10]. The MFCS consists of a number of fuzzy systems hierarchically distributed, which allows identifying the different operational scenarios present in the oil production process. Additionally, the MFCS carries out other supervision tasks: it detects faults that affect the process or the equipment involved, and determines the rate of gas that the well requires for production. Once the operational scenario is identified, the GA simulates the process of natural evolution to optimize the oil production. Every individual of the population represents a potential solution of the oil production problem. The evolution is guided by a strategy of selection of the individuals, with the intention of improving their "fitness", a measure based on the restrictions contextualized in the operational scenario determined by the MFCS. That means the population of individuals will be specific to the operational scenario identified in the previous phase, so that the GA may optimize the production for that operational scenario.

3.2 MFCS Design

The proposed MFCS consists of 3 layers:

– The goal of the first layer is to detect the faults that affect the process or the equipment involved. For that, the first layer determines the pressure drop in the production tubing. To calculate this drop, the "bottom pressure" and the "tubing pressure" (pressures that are present in the production tubing) are used, which define the rules of the fuzzy system of Figure 4.A. In this way, the first fuzzy system determines the intermediate linguistic variable "Pwf_Thp" (see Figure 4.A). So, the HIS starts with the input variables bottom pressure and production tubing pressure to obtain "Pwf_Thp", which is the pressure drop between those pressures. This characterization of the pressure drop in the production tubing of the well is important because it determines operational failures that may affect well production. So, the first layer defines a detection system of operational failures, based on rules which define the relationship Pwf vs Thp.

– The goal of the second layer is to determine the rate of gas injection "Qinj" (see Figure 4.B), and with this value determine the operational stage of production of the well. With the operational stage of production of the well we can determine other operational faults, due to an under-injection or over-injection of gas. In this sense, it consists of a set of rules that combine the pressure drop obtained in the first layer with the input variable "Casing Pressure", to get the gas injection rate. These rules use such variables because the rate of the gas that is injected into the well to extract the oil to the surface depends on the pressure of the casing and the pressure drop in the tubing, according to [3], [5].

– Finally, the goal of the last layer is to identify the operational scenario. For that, it determines the production rate (see Figure 4.C), and according to this value it can identify the operational scenario. In this case the set of rules is defined by the bottom pressure (Pwf, fluid load capacity of the reservoir) and the gas injection rate (Qinj, energy needed to extract the oil), because these variables determine the production rate, according to [7], [8].

Figure 4. Our Multilayer Fuzzy Classifier System

With the output of the last level (Qprod, the production rate) the HIS determines the operational scenario of the well. Once the rate of production is known, the GA solves the oil production optimization problem.
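The layered inference described above can be illustrated with a minimal sketch of the first layer (FD-1), using triangular membership functions and min-max rule evaluation. The membership ranges and the rule table below are illustrative assumptions (loosely shaped like the well-1 ranges reported later in the experiments), not the paper's calibrated rule base:

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify(x, sets):
    """Degree of membership of x in each labelled fuzzy set."""
    return {label: tri(x, a, b, c) for label, (a, b, c) in sets.items()}

# Assumed low/medium/high sets for bottom pressure (Pwf) and tubing pressure (Thp), in psi.
PWF = {"low": (0, 150, 330), "med": (210, 430, 650), "high": (430, 760, 1100)}
THP = {"low": (140, 175, 210), "med": (190, 225, 260), "high": (230, 265, 300)}

# FD-1 rule table: (Pwf label, Thp label) -> diagnosed pressure drop.
RULES = {
    ("high", "low"): "high drop",    ("high", "med"): "high drop",
    ("high", "high"): "medium drop", ("med", "low"): "medium drop",
    ("med", "med"): "medium drop",   ("med", "high"): "low drop",
    ("low", "low"): "low drop",      ("low", "med"): "low drop",
    ("low", "high"): "low drop",
}

def fd1(pwf, thp):
    """First MFCS layer: classify the tubing pressure drop (min-max inference)."""
    mu_p, mu_t = fuzzify(pwf, PWF), fuzzify(thp, THP)
    scores = {}
    for (lp, lt), out in RULES.items():
        scores[out] = max(scores.get(out, 0.0), min(mu_p[lp], mu_t[lt]))
    return max(scores, key=scores.get)
```

A high bottom pressure with a low tubing pressure, e.g. `fd1(900, 160)`, is diagnosed as a high drop; the same scheme extends to FD-2 and FD-3 by feeding each layer's output, together with Chp or Qinj, into the next rule table.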

3.3 Optimization of the Production Process

The optimization problem of ALG wells consists of increasing the production of oil and minimizing the flow of injected gas, based on three variables: Qprod, Cost and Qinj. This optimization problem is described by the objective function:

f = (PVPOil − CostProdOil)·Qprod − CostGas·Qinj    (5)

Where,
PVPOil = Sell price of oil in terms of the daily barrel, $/bl,
CostProdOil = Production Cost,
CostGas = In $/Mpcn.

As for the restrictions of the process, we assume that: Pws is a constant, due to the slow dynamics of the reservoir; and Pwf is lower than the pressure of the reservoir, due to the fact that in a well the bottom pressure is lower than the reservoir pressure. Additionally, we establish the maximum production capacity that a reservoir can contribute as Qprod,max [3]. These restrictions are:

Pws = Constant
Pwf < Pws
Qprod ≤ Qprod,max

Finally, the specific values of the variables Qinj and Pwf depend on the scenario identified in the previous phase. That is, the identified scenario determines the values of Qinj,min, Qinj,max, Pwf,min, Pwf,max. With these values, we define the next restrictions:

Qinj,min ≤ Qinj ≤ Qinj,max
Pwf,min ≤ Pwf ≤ Pwf,max

The structure of the individuals is composed of two fields that represent the variables Casing pressure (Pg,inj) and Tubing pressure (Pthp). These variables are used because they are related to the gas behavior, and they can be manipulated at an operational level with an instrumentation arrangement. This is important because such pressures can be adjusted in terms of the optimum values recommended by the GA, and thus achieve the best performance of the producing well (see equations (2), (3) and (4) in section II.B, which describe the model of gas injection defined in [1], [2]). In this way, the optimum value of production and injection is established according to the current operational scenario, using equation (5), in a way that the set of values allowed for the variables Pthp and Pg,inj depends on the operational scenario identified in the previous phase.
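The optimization just described can be sketched as a small GA over (Pthp, Pg,inj) individuals. Everything numeric below (the model constants, the prices, the fixed Pwf assumed for eq. (3), the scenario bounds and the GA settings) is an illustrative assumption, not the paper's calibration:

```python
import math, random

# Illustrative well-model constants for equations (3)-(5).
C, RHO, PWS, QO = 16.0, 2.0, 2400.0, 250.0
PWF0 = 500.0                        # assumed fixed bottom pressure for eq. (3)
PVP_OIL, COST_PROD, COST_GAS = 90.0, 40.0, 2.0   # assumed prices

def q_inj(pthp, pg):                # eq. (3)
    return C * math.sqrt(max(RHO * (pg - pthp + PWF0), 0.0))

def q_prod(pthp, pg):               # eq. (4)
    x = (pthp + pg - (q_inj(pthp, pg) / C) ** 2 / RHO) / PWS + 0.125
    return QO / 1.25 * (1.266 - x * x)

def fitness(ind, qinj_lo=500.0, qinj_hi=900.0):
    pthp, pg = ind
    qi, qp = q_inj(pthp, pg), q_prod(pthp, pg)
    f = (PVP_OIL - COST_PROD) * qp - COST_GAS * qi   # objective (5)
    if not (qinj_lo <= qi <= qinj_hi):               # assumed scenario restriction
        f -= 1e4                                     # penalty for infeasibility
    return f

def run_ga(bounds=((160.0, 180.0), (1000.0, 1250.0)), pop_size=8,
           generations=25, p_cross=0.7, p_mut=0.03, seed=1):
    rnd = random.Random(seed)
    new = lambda: tuple(rnd.uniform(lo, hi) for lo, hi in bounds)
    pop = [new() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)          # selection: keep the fittest half
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rnd.sample(parents, 2)
            child = (a[0], b[1]) if rnd.random() < p_cross else a  # single-point cross
            if rnd.random() < p_mut:                               # random mutation
                child = new()
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

Running `run_ga()` returns the best (Pthp, Pg,inj) pair found within the scenario bounds; the paper's GA uses the same ingredients (single-point crossover with probability 0.7, random mutation with probability 0.03, 25 generations), as detailed in section 4.2.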

4 Experiments 4 Experiments

The was imThewell wellcharacteristics characteristicswhere wherethethesystem system was plemented are the following: The completation implemented are the following: Theof the producing vertical well is 3489 ft and well valvesis to completation of the producing vertical 3489ft,ft25 and valves 3184 ft,6% 25water API crude 3184 API crudetoGravity, Cut. It Gravity,gas 6%lift water Cut. receives gas lift from at receives from theItgas Manifold located the gas Manifold located at 508,53 ft far from is 508,53 ft far from it, and the Production Curve it, andinthe Production Curve is shown in Fig. 5. shown Figure 5. Curve of Production 300

Figure 5. Experimental Production Curve at a Reservoir Pressure of 2400 psi.

The behavior of the gas lift injection versus the production of these wells is the following: Well 1 operates at a gas injection rate between 550 and 950 mpcndg, with an associated production ranging between 190 and 250 BNPD. Well 2 operates at a gas injection rate between 2000 and 8000 mpcndg, with an associated production ranging between 4000 and 6000 BNPD. These values were obtained at the flow station. Table 1 and Figure 6 show the current real production rate curve at the flow station (the real 2400 psi curve), which has been determined using the gas injection and the oil obtained at the station and evaluated in equation (4), and the theoretical curve. Using equation (4), families of curves can be created for different reservoir pressures (2600 psi, 2800 psi and 3000 psi), which show higher levels of production with respect to the theoretical pressure of 2400 psi, with a similar behavior as would occur in a real way. This is interesting because it allows us to determine the possible production curves presented by the well at different reservoir pressures. The model defined with the equations given above is specific to each LAG well; a new LAG well requires a similar procedure to determine its production curve. In the same way, we can calculate the production rate curves of well 2 for a reservoir pressure of 4000 psi (see Figure 7, theoretical and real curves). It is important to note that the theoretical model follows the real curve; the value of average production is approximately 5753.53 B/D.

Table 1. Experimental Production Curve and Theoretical Curves for well 1

Qinj Theoretical (2400 psi) | Qprod Theoretical (2400 psi) | Qinj Real (2400 psi) | Qprod Real (2400 psi)
590,578 | 185,76  | 579,539 | 200,798
680,824 | 205,55  | 625,619 | 210,215
723,158 | 214,149 | 605,080 | 204,732
759,345 | 221,87  | 598,212 | 203,567
793,645 | 228,73  | 583,850 | 200,832
828,213 | 234,72  | 598,803 | 203,888
862,520 | 239,84  | 601,048 | 206,949

Figure 6. Experimental Production Curve and Theoretical Curves for different reservoir pressures for well 1.

Figure 7. Experimental Production Curve for well 2.

Importantly, well 2 has geometric and mechanical characteristics different from well 1. It is a well of greater oil flow, with gas injected into the production tubing through three valves, and the completion of the producing vertical well is 12000 ft, with a reservoir pressure of 4000 psi and 4% water cut (see Table 2). This well initially operated by natural flow, and did not need mechanical or electrical energy to raise production. Due to the systematic deterioration of the reservoir pressure, its production has decayed, which is why it now raises its production using the gas lift method, as indicated in the following table.

Table 2. Experimental Production Curve and Theoretical Curves for well 2

Qinj Theoretical (4000 psi) | Qprod Theoretical (4000 psi) | Qinj Real (4000 psi) | Qprod Real (4000 psi)
7590,571 | 5805,743 | 7624,532 | 5721,984
7240,331 | 5731,243 | 7329,642 | 5843,218
7943,491 | 6015,321 | 7862,218 | 5943,841
7010,555 | 5642,921 | 7128,311 | 5541,921

4.1 MFCS: Identification of the Operational Scenarios

With the curves of Figures 6 and 7 and the historical data of the bottom and surface variables, we can characterize the input variables of the MFCS as follows (see Table 3):
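The families of theoretical curves mentioned above can be generated directly from equation (4). In the sketch below all numeric constants (the valve constant c, the gas density, the base rate Qo, and the two surface pressures) are illustrative assumptions, not the calibrated values of wells 1 or 2:

```python
# Illustrative constants for equation (4); these are assumptions for the sketch.
C, RHO_G, QO = 16.0, 2.0, 250.0
PTHP, PG_INJ = 170.0, 1022.0

def q_prod(q_inj, p_ws):
    """Equation (4): production rate for a given gas injection rate and reservoir pressure."""
    x = (PTHP + PG_INJ - (q_inj / C) ** 2 / RHO_G) / p_ws + 0.125
    return QO / 1.25 * (1.266 - x * x)

# Family of theoretical curves for several reservoir pressures, as in Figure 6.
curves = {p_ws: [round(q_prod(q, p_ws), 1) for q in range(550, 951, 100)]
          for p_ws in (2400, 2600, 2800, 3000)}
```

With these assumed constants the 2400 psi curve first rises and then falls, reproducing the stable and unstable regions of Figure 3, and raising the reservoir pressure lifts the low-injection end of the curve.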


Table 3. Membership Functions for Input Variables for well 1

       | Chp (psi)  | Pwf (psi) | Thp (psi)
Low    | 1000-1120  | 1-320     | 150-200
Medium | 1100-1220  | 212-649   | 190-260
High   | 1190-1320  | 429-1093  | 230-300

Table 4. Membership Functions for Input Variables for well 2

       | Chp (psi)  | Pwf (psi)  | Thp (psi)
Low    | 6140-6290  | 1900-2150  | 430-490
Medium | 6200-6440  | 2100-2300  | 480-510
High   | 6430-6800  | 2250-2400  | 500-550

In the case of the output variable (Qprod) of the MFCS, its membership function is (Table 5):

Table 5. Membership Functions for the output variable of the wells

Operational Scenario | Well 1  | Well 2
Under-Injected       | 400-600 | 5500-6000
Normal               | 550-750 | 6000-6500
Over-Injected        | 700-900 | 6500-6800
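Because the bands of Table 5 overlap, classifying a value into a scenario needs a tie-breaking rule. A minimal sketch, assuming triangular memberships over the well-1 bands (the peak positions are our assumption, and the paper resolves overlaps with its MFCS rules rather than this shortcut):

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Well-1 bands of Table 5, given assumed triangular shapes.
BANDS_WELL1 = {
    "Under-Injected": (400.0, 500.0, 600.0),
    "Normal":         (550.0, 650.0, 750.0),
    "Over-Injected":  (700.0, 800.0, 900.0),
}

def scenario(q, bands=BANDS_WELL1):
    """Pick the scenario whose membership degree for q is highest."""
    return max(bands, key=lambda s: tri(q, *bands[s]))
```

For a value in an overlap, e.g. 580, the band with the higher degree wins (here Normal, with degree 0.3 against 0.2 for Under-Injected).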

The first layer characterizes the pressure drop in the production tubing of the well. This characterization is important because operational failures that may affect well production can be identified. So, to make that first detection of operational failures, we define a system of rules based on the relationship Pwf vs Thp, which gives their diagnosis. Table 6 shows the detection system of operational faults. It describes the different rules which define the operational diagnosis for different entries in our case studies: in each row, the first value of the first column is for well 1 and the second for well 2. Regarding the second layer (FD-2), it allows us to identify the rate of gas that the well requires for production. With this value, we can determine the operational state of production of the well, due to the gas injection rate derived from the pressure drop and the casing pressure. The operational state of production of the well defines new types of operational faults due to the gas injection rate: under-injection and over-injection.

Table 6. Rules for Detection of Operational Failures (FD-1)

Pressure Drop (Well 1 / Well 2) | Operational Diagnosis
3027,37 / 6100,12 (High Drop)   | High hydrostatic pressure in the tubing due to: Low Flow of Gas Injection, or High Flow of Gas Injection with presence of High Water Cut.
2890 / 5580 (High Drop)         | High hydrostatic pressure in the tubing due to: Low Flow of Gas Injection, or High Flow of Gas Injection with presence of High Water Cut.
2870 / 5020 (Medium Drop)       | Medium hydrostatic pressure in the tubing due to: Low Flow of Gas Injection with presence of High Water Cut and High Bottom Pressure, or High Flow of Gas Injection with a Gas Leak at the level of the well completion.
2229,7 / 4995 (Medium Drop)     | Medium hydrostatic pressure in the tubing due to: Low Flow of Gas Injection with High Water Cut and Low Bottom Pressure, or High Flow of Gas Injection with a Gas Leak at the level of the well completion.
2165 / 4870 (Low Drop)          | Normal Flow of Gas and Production.

Table 7 defines the rules of the detection system of these operational faults for our case studies: in each row, the first value is for well 1 and the second for well 2.


ADVANCED SUPERVISION OF OIL WELLS . . .

Table 7. Rules of detection of faults generated by the output of FD-2

Pressure Drop (Well 1 / Well 2) | Chp (Well 1 / Well 2) | Operational Scenario | Qinj MFCS (Well 1 / Well 2)
2164,78 / 4870 (Low)     | 1190 / 6430 (Medium) | UnderInj | 517,8 / 5250
2229,07 / 4995 (Medium)  | 1320 / 6800 (High)   | Normal   | 666,67 / 6345
2870,00 / 5020 (Medium)  | 1250 / 6555 (High)   | Normal   | 735,23 / 6420
2890,00 / 5580 (High)    | 1090 / 6300 (Medium) | OverInj  | 776,17 / 6555
3027,37 / 6100,92 (High) | 1020 / 6150 (Low)    | OverInj  | 816,67 / 6740

Finally, the last layer (FD-3) determines the well production using the estimated rates of gas injection of the previous layer. With this value, the MFCS identifies the current operational scenario using the rules system defined in Table 8.

Table 8. Different Operational Scenarios determined by the MFCS

Qprod MFCS (Well 1 / Well 2) | Operational Scenario | Qinj MFCS (Well 1 / Well 2)
166,66 / 5140 | UnderInj | 517,8 / 5250
200,38 / 5529 | Normal   | 666,67 / 6345
213,33 / 5674 | Normal   | 735,23 / 6420
243,09 / 5700 | OverInj  | 776,17 / 6555
252,09 / 6340 | OverInj  | 816,67 / 7100

4.2

Optimization using GA

The GA was applied for the operational scenario identified in the previous phase with the MFCS; for our case study, well 1 is analyzed for a Normal scenario and well 2 for an OverInj scenario. The optimization problem of LAG wells consists of increasing the oil production and minimizing the gas lift flow, based on the objective function and the operational restrictions described in section III.C (see equation (5)). In order to solve that problem, the GA used presents the following components: Number of individuals: random, between 2 and 10; Number of generations: 25; Objective function: equation (5), including its respective restrictions; Crossover operator: single-point crossover with 0.7 probability; Mutation operator: random with 0.03 probability.

The final population given by the GA for the normal operational scenario detected by the MFCS for well 1 is shown in Table 9. An individual gives the values of Pthp and Pg,inj, specified on a row of that table, whose objective function is the value of Profits. That is, the optimum values for the normal operational scenario for the variables Tubing Pressure (Pthp) and Casing Pressure (Pg,inj) are shown in Table 9. These values are used in the models of gas injection for wells [1], [2] (see section II.B) and in the objective function (eq. 5), giving the results of Qinj, Qprod and Profits shown in the same Table 9.

Table 9. Results Obtained

Pthp  | Pg,inj | Qinj  | Qprod | Profits
170   | 1022   | 596,6 | 232   | 7093
170,4 | 1109,8 | 619,1 | 230,2 | 7034
172,5 | 1226,3 | 689,1 | 233,7 | 7133

According to the results of Table 9, the production system presents an optimum behavior at a gas injection rate of about 596,6 mpcndg, with an associated production of 232,06 b/d, a casing pressure of 1022 psi and a production pipe pressure of 170 psi. On the other hand, for a gas flow of 619,1 mpcndg the production rate is 230,21 b/d, generating a smaller profit and a greater consumption of gas with respect to the case of 596,6 mpcndg. Regarding the gas flow of 689,1 mpcndg, a production of 233,71 b/d is expected, higher than the one of 596,6 mpcndg (by 1,64892 b/d), but more gas flow is required. In this


case, the profit differential is 39 $/d, which indicates that this case could be interesting (more optimal), because it better combines the two costs.

In the case of well 2, for an OverInj scenario, the system presents an optimum behavior at a gas injection rate of about 5200 mpcndg, with an associated production of 4934 b/d, a casing pressure of 6430 psi and a production pipe pressure of 490 psi (see Table 10).

Table 10. Results Obtained

Pthp | Pg,inj | Qinj | Qprod | Profits
490  | 6430   | 5200 | 4934  | 127752

5

Conclusion

Our HIS uses the MFCS and a GA to define a control and supervision system for oil production. The population of individuals in the GA corresponds to the operational scenario identified by the MFCS, and the GA generates the optimum production and gas-injection values for that scenario.

The MFCS for the analysis of wells classifies data from the well. It generates information from the reservoir variables (downhole pressure), the wellhead variables (casing pressure) and the gas flow. These variables are related to the gas injection process and its effect at the level of the reservoir. With this information, the operational scenario of a well can be determined more accurately from its operating conditions, since the bottom and surface variables are integrated in the same system, unlike the systems currently used in the industry, which rely on surface variables only. This allows self-diagnosis of the well, monitoring of its damage, and preservation of the performance of its underground/surface infrastructure. Specifically, our system estimates the production rate and the gas injection rate at which the well can improve its production level with less injected gas.

Production in AGL wells was optimized thanks to the integration of subsoil and surface information, which minimizes costs and guarantees the best distribution of the injected gas while maximizing oil production. The integrated subsoil-surface approach is innovative in the sense that it integrates the behavior of the reservoir/wellhead infrastructure. This is carried out through an objective function, with the respective process restrictions, which contextualizes the objective function in the operational scenario and reservoir conditions identified by the supervision scheme. The GA establishes the optimum production and gas-injection values for the identified operational scenario by relating the two costs of the productive process: reducing the production costs and optimizing the gas injection.

The MFCS introduces a stepwise mechanism (one per layer) to detect and diagnose operational failures at each level, and thus throughout the completion of the well. Each layer determines operational conditions from which the diagnosis of operational failures in the system can be established. From this information, an online monitoring system could be developed at different levels of the well (for example, at the wellhead using the output of FD-1), with the aim of generating corrective actions based on the detected operational failure.

Our HIS needs a customization phase for the specific case study where it will be used. The production curve of the well and the historical data of the bottom and surface variables characterize the input variables of the MFCS. These input variables are specific to each case study (see tables I and II). Additionally, it is necessary to define when a pressure is high, low, etc. for each case study (see tables III and IV), and to define its operational scenarios (see table V). Finally, the different layers (FDs) of the fuzzy system must be customized (tables VI, VII and VIII).
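The GA search described above can be pictured with a small sketch that maximizes a profit-like objective over the gas-injection rate. Everything below is a generic illustration under assumed parameters; the objective function, bounds and GA settings are not the authors' implementation:

```python
import random

# Illustrative stand-in objective, NOT the paper's model: production responds
# to injection with a concave curve, while injection cost grows linearly.
def profit(q_inj):
    production = 5000 * (1 - ((q_inj - 5200) / 5200) ** 2)
    return 30.0 * production - 0.2 * q_inj

def genetic_search(lo=1000.0, hi=9000.0, pop_size=30, generations=60, seed=0):
    """Minimal GA over a single decision variable (gas-injection rate)."""
    rng = random.Random(seed)
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=profit, reverse=True)
        parents = pop[: pop_size // 2]              # selection: keep the best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2.0                   # crossover: arithmetic mean
            child += rng.gauss(0, (hi - lo) * 0.01) # mutation: small Gaussian step
            children.append(min(hi, max(lo, child)))
        pop = parents + children                    # elitism: parents survive
    return max(pop, key=profit)

best = genetic_search()
```

In the paper's setting the fitness would be the constrained objective function contextualized by the operational scenario that the MFCS identifies; here a toy quadratic response stands in for that model.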

6 Acknowledgment

This work has been partially supported by the Postgraduate Cooperation Program (PCP) between Venezuela and France entitled "Supervision and maintenance task in a shared organizational environment", No. 2010000307, between the University of Los Andes, Venezuela, and LAAS-CNRS, France; and by the Prometeo Project of the Ministry of Higher Education, Science, Technology and Innovation of the Republic of Ecuador.
