Selection Strategy for XCS with Adaptive Action Mapping
Masaya Nakata
Pier Luca Lanzi
Keiki Takadama
Dept. of Informatics The University of Electro-Communications 1-5-1, Chofugaoka, Chofu-shi Tokyo, Japan
Dipartimento di Elettronica, Informazione e Bioinformatica Politecnico di Milano I-20133 Milano, Italy
Dept. of Informatics The University of Electro-Communications 1-5-1, Chofugaoka, Chofu-shi Tokyo, Japan
ABSTRACT
XCS with Adaptive Action Mapping (XCSAM) evolves solutions focused on classifiers that advocate the best action in every state. Accordingly, XCSAM usually evolves more compact solutions than XCS which, in contrast, works toward solutions representing complete state-action mappings. Experimental results have however shown that, in some problems, XCSAM may produce larger populations than XCS. In this paper, we extend XCSAM with a novel selection strategy to reduce, even further, the size of the solutions XCSAM produces. The proposed strategy selects the parent classifiers based both on their fitness values (like XCS) and on the effect they have on the adaptive map. We present experimental results showing that XCSAM with the new selection strategy can evolve solutions that are more compact than those of XCS and, at the same time, maximally general and maximally accurate.

Categories and Subject Descriptors
I.2.6 [Artificial Intelligence]: Learning—knowledge acquisition, concept learning

General Terms
Algorithms, Experimentation, Performance

Keywords
Learning Classifier System, XCS classifier system, XCS with Adaptive Action Mapping, Best action mapping, Genetic Algorithm, Selection strategy

1. INTRODUCTION
Learning Classifier Systems [11] combine reinforcement learning [20] and genetic algorithms [10] to solve classification [2], regression [9], and sequential decision making problems [7]. XCS [21] is probably the most popular classifier system model to date and has been successfully applied to a wide range of problem domains [2]. XCS evolves solutions that provide an evaluation of the expected payoff of all the available actions in every situation, i.e., they represent complete (state-action) mappings of the problem solutions. Because it evolves complete mappings, XCS can learn optimal solutions for very difficult problems; because of its intrinsic genetic pressure toward generalization, it generates solutions that are maximally accurate and maximally general. Complete mappings are, however, often considered redundant, since in most applications (e.g., classification, stock market prediction) only the actions with the highest return really count. Complete mappings are also very expensive both in terms of memory (since they allocate classifiers to non-optimal actions) and in terms of the time required to achieve optimal performance (since they require a more thorough exploration of the problem space). Accordingly, to avoid complete mappings, Bernadó-Mansilla et al. introduced the UCS classifier system [1] which, like XCS, exploits accuracy-based evolution to produce maximally accurate, maximally general solutions, but employs supervised learning (instead of reinforcement learning) to distribute the incoming feedback to the classifiers accountable for it. Thus, UCS only evolves classifiers that advocate the best action, but it can solve only classification problems (since, unlike XCS, it does not apply reinforcement learning), and its approach cannot be generalized to broader machine learning applications. To overcome the limitations of UCS, we introduced XCS with Adaptive Action Mapping (XCSAM) [19, 18], a classifier system that (i) can evolve mappings mainly focused on the best actions, with the highest payoffs, like UCS; at the same time, (ii) it can tackle both classification and sequential decision making problems using reinforcement learning, like XCS. XCSAM [19, 18] works mainly like XCS and selects parent classifiers based on their relative accuracy, encoded in the classifiers' fitness [21]. In XCS, plain accuracy-based fitness results in the evolution of complete mappings, but in XCSAM it can produce solutions with more classifiers than XCS, even though XCSAM tries to delete classifiers that do not represent the highest rewarded actions.
To reduce the size of the solutions evolved by XCSAM, in this paper we first analyze the effect of accuracy-based selection in XCSAM; then, we introduce a more effective selection strategy for XCSAM; finally, we compare XCS, the original XCSAM, and the improved XCSAM on the hidden parity and multiplexer problems. Our results show that the improved version of XCSAM can produce solutions that are more compact than those evolved by both XCS and the original XCSAM.

2. THE XCS CLASSIFIER SYSTEM
The XCS classifier system maintains a population of rules (the classifiers) which represents the solution to a reinforcement learning problem [13]. Classifiers consist of a condition, an action, and four main parameters [21, 6]: (i) the prediction p, which estimates the relative payoff that the system expects when the classifier is used; (ii) the prediction error ε, which estimates the error of the prediction p; (iii) the fitness F, which estimates the accuracy of the payoff prediction given by p; and (iv) the numerosity num, which indicates how many copies of classifiers with the same condition and the same action are present in the population. At time t, XCS builds a match set [M] containing the classifiers in the population [P] whose condition matches the current sensory input st; if [M] does not contain all the feasible actions, covering takes place and creates a set of classifiers that match st and cover all the missing actions. This process ensures that XCS can evolve a complete mapping, so that in any state it can predict the effect of every possible action in terms of expected return. For each possible action a in [M], XCS computes the system prediction P(st, a), which estimates the payoff that XCS expects if action a is performed in st. The system prediction P(st, a) is computed as the fitness-weighted average of the predictions of the classifiers in [M] that advocate action a:

P(st, a) = Σ_{clk ∈ [M](a)} pk × Fk / Σ_{cli ∈ [M](a)} Fi    (1)

where [M](a) represents the subset of classifiers of [M] with action a, pk identifies the prediction of classifier clk, and Fk identifies the fitness of classifier clk. Then, XCS selects an action to perform; the classifiers in [M] which advocate the selected action form the current action set [A]. The selected action at is performed, and a scalar reward rt+1 is returned to XCS together with a new input st+1. When the reward rt+1 is received, the estimated payoff P(t) is computed as follows:

P(t) = rt+1 + γ max_{a ∈ [M]} P(st+1, a)    (2)

where γ is the discount factor [20]. Next, the parameters of the classifiers in [A] are updated in the following order [6]: prediction, prediction error, and finally fitness. The prediction pk is updated with learning rate β (0 ≤ β ≤ 1):

pk ← pk + β(P(t) − pk)    (3)

Then, the prediction error and the classifier fitness are updated as

εk ← εk + β(|P(t) − pk| − εk)    (4)

Fk ← Fk + β(κk − Fk)    (5)

where κk is the raw accuracy of classifier clk computed from the classifier error εk [21]. On a regular basis (depending on the parameter θga), the genetic algorithm is applied to the classifiers in [A]. It selects two classifiers, copies them, with probability χ performs crossover on the copies, and with probability μ mutates each allele. The resulting offspring classifiers are inserted into the population and two classifiers are deleted to keep the population size constant.
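To make the update cycle above concrete, the following is a minimal Python sketch of Equations (1)-(5). The Classifier container and its field names are illustrative assumptions (not the authors' implementation), and the raw accuracy κ is assumed to be computed elsewhere from the error, as in standard XCS.

```python
# Illustrative sketch of Eqs. (1)-(5); the Classifier fields are assumptions.
from dataclasses import dataclass

@dataclass
class Classifier:
    action: int
    p: float       # prediction
    eps: float     # prediction error
    F: float       # fitness
    kappa: float   # raw accuracy, computed elsewhere from eps [21]

def system_prediction(match_set, action):
    """Eq. (1): fitness-weighted average prediction of the classifiers in [M](a)."""
    advocates = [cl for cl in match_set if cl.action == action]
    total_fitness = sum(cl.F for cl in advocates)
    if total_fitness == 0.0:
        return 0.0
    return sum(cl.p * cl.F for cl in advocates) / total_fitness

def estimated_payoff(reward, next_match_set, actions, gamma):
    """Eq. (2): P(t) = r_{t+1} + gamma * max_a P(s_{t+1}, a)."""
    return reward + gamma * max(system_prediction(next_match_set, a) for a in actions)

def update_action_set(action_set, P_t, beta):
    """Eqs. (3)-(5), applied in the order stated above: prediction, error, fitness."""
    for cl in action_set:
        cl.p += beta * (P_t - cl.p)                   # Eq. (3)
        cl.eps += beta * (abs(P_t - cl.p) - cl.eps)   # Eq. (4)
        cl.F += beta * (cl.kappa - cl.F)              # Eq. (5)
```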
3. XCS WITH ADAPTIVE ACTION MAPPING
The XCS classifier system [21] evolves solutions that map every possible situation (state) and action pair to its corresponding expected return. The maintenance of a complete mapping makes XCS robust to local optima and guarantees that, given an adequate amount of resources [8, 15], XCS can learn an optimal solution made of maximally accurate and maximally general classifiers. A complete mapping is however considered redundant in several domains (for instance, in classification we are not interested in the expected payoff of the wrong classes) and very expensive to maintain, both in terms of memory (since it keeps information on all available actions) and time (since it requires a thorough exploration of the space). Accordingly, Bernadó-Mansilla and colleagues introduced UCS [1], a classifier system specifically designed for classification tasks that applies supervised learning (instead of reinforcement learning, as in XCS) to evolve solutions containing only classifiers advocating the best actions. XCS with Adaptive Action Mapping (XCSAM) [19, 18] represents a trade-off between XCS and UCS. Like XCS, it can solve both reinforcement learning (multi-step) and supervised classification (one-step) problems; like UCS, XCSAM does not evolve a complete mapping but tries to focus on the best actions as learning proceeds. To accomplish this, XCSAM extends the original XCS by adding mechanisms (i) to adaptively identify redundant actions and (ii) to get rid of them while still exploring the space of viable optimal solutions.

3.1 Identifying Best Action Mappings
To focus on the classifiers with the highest expected return, XCSAM compares the payoff expected at the current state, maxP(st, a) (Equation 2), against the payoff expected at the previous state, maxP(st−1, a). Note that, in general, the former tends to be higher than the latter (because of the discount factor γ). Accordingly, since an action corresponding to a higher reward also corresponds to a shorter state sequence, the best actions will tend to have a value of maxP(st, a) larger than maxP(st−1, a). More precisely, maxP(st−1, a) converges to γ maxP(st, a) at the next state, while maxP(st, a) converges to maxP(st−1, a)/γ. Thus, in XCS the prediction of the accurate classifiers in [A] tends to converge to maxP(st−1, a)/γ. For this reason, XCSAM can identify the actions that are likely to be part of the best mapping by comparing maxP(st, a) against ζ × maxP(st−1, a)/γ (where ζ is a learning rate added to guarantee convergence). If maxP(st, a) is greater than the threshold ζ × maxP(st−1, a)/γ, then a is a good candidate and should be maintained. After having identified good candidate actions for the best action mapping, XCSAM needs to adaptively identify classifiers that may be good candidates for the final best mapping. For this purpose, a parameter eam (effect of adaptive mapping) is added to the classifiers of XCSAM and updated as

eami ← eami + β(1 − eami)      if maxP(st, a) ≥ ζ × maxP(st−1, a)/γ
eami ← eami + β(nma − eami)    otherwise

where nma represents the number of available actions. The value of eam of classifiers advocating the selected action converges to 1 if the classifier is a good candidate for the final best action mapping; otherwise, eam converges to nma. Therefore, classifiers with an eam close to one are good candidates to represent the final best action mapping, while classifiers with an eam close to nma are less likely to be maintained, as they are probably advocating actions with a lower expected return.
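The short Python sketch below illustrates the candidate-best-action test and the eam update described in this subsection; the function names, parameter names, and classifier attribute names are illustrative assumptions.

```python
# Illustrative sketch of the eam update of Section 3.1; names are assumptions.
def is_candidate_best_action(max_P_t, max_P_prev, zeta, gamma):
    """True when maxP(s_t, a) >= zeta * maxP(s_{t-1}, a) / gamma."""
    return max_P_t >= zeta * max_P_prev / gamma

def update_eam(action_set, candidate_best, beta, nma):
    """eam moves toward 1 for candidate best actions, toward nma otherwise."""
    target = 1.0 if candidate_best else float(nma)
    for cl in action_set:
        cl.eam += beta * (target - cl.eam)
```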
3.2 Focusing Evolution on the Best Actions
To focus evolution on the best actions, XCSAM acts on the covering operator to prevent the generation of classifiers that are not likely to be included in the final solution. In particular, XCSAM tunes the covering threshold θnma (the minimum number of actions that must be represented in [M]) using the actions' predicted reward and the eam parameters. Initially, θnma is set to the number of feasible actions (the same value used in XCS). When [M] is generated, XCSAM computes the prediction array before covering is applied (whereas XCS computes it only after covering). Then, XCSAM computes the current θnma as the average eam of the classifiers in [M], weighted by the expected future return maxP(st, a). If the number of different actions in [M] is smaller than the computed θnma, covering is called and the prediction array is computed again. After action selection is performed, XCSAM generates both the action set [A] (as XCS does) and the not-action set [Ā], consisting of the classifiers in [M] that do not advocate the selected action. When the executed action is considered a candidate best action, during the genetic algorithm (i) the parent classifiers are selected from [A], to promote the evolution of classifiers that are likely to be in the final best action mapping, and (ii) the deleted classifiers are selected from [Ā], to get rid of classifiers that are not likely to be part of the final solution. Otherwise, if there is not enough information about the executed action, or [Ā] is empty, XCSAM applies deletion in [P] as done in XCS. When the executed action is not identified as a candidate best action, the parents are selected from [Ā] to explore the solution space even further, and deletion is applied to the population as in XCS.
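One plausible reading of this mechanism is sketched below in Python. The exact weighting of eam by maxP(st, a) and the helper names are assumptions, since the text does not give an explicit formula; the sketch is not the authors' reference implementation.

```python
# Hedged sketch of XCSAM's adaptive covering threshold and of the choice of
# the GA selection/deletion pools; names and the exact weighting are assumed.
def adaptive_theta_nma(match_set, prediction_array, n_actions):
    """Average eam of the classifiers in [M], weighted by maxP(s_t, a).

    `prediction_array` is assumed to map each action to its system prediction.
    """
    weighted_eam, total_weight = 0.0, 0.0
    for cl in match_set:
        w = prediction_array.get(cl.action, 0.0)   # expected return of cl's action
        weighted_eam += w * cl.eam
        total_weight += w
    return weighted_eam / total_weight if total_weight > 0 else n_actions

def ga_pools(candidate_best, action_set, not_action_set, population):
    """Return (parent pool, deletion pool) following Section 3.2."""
    if candidate_best and not_action_set:
        return action_set, not_action_set     # evolve [A], delete from [A-bar]
    if not candidate_best and not_action_set:
        return not_action_set, population     # explore further, delete from [P]
    return action_set, population             # not enough information: as in XCS
```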
4. ANALYSIS OF SELECTION IN XCSAM
We analyzed the effect of pure accuracy-based selection in XCSAM by studying the characteristics of the selected parent classifiers. For this purpose, we applied XCS and XCSAM to the hidden parity problem HP20,5 and compared the characteristics of the parent classifiers in the two models.

4.1 Hidden Parity Problem
This class of Boolean functions was first used with XCS in [12] to relate the problem difficulty to the number of accurate, maximally general classifiers needed by XCS to solve the problem. They are defined over binary strings of length n in which only k bits are relevant; the hidden parity function (HPn,k) returns the value of the parity function applied to the k relevant bits, which are hidden among the n inputs. For instance, given the hidden parity function HP6,4 defined over inputs of six bits (n = 6), in which only the first four bits are relevant (k = 4), we have that HP6,4(110111) = 1 while HP6,4(000111) = 1. In this analysis, we applied the standard parameter settings for XCS [16]: N = 2000, ε0 = 1, μ = 0.04, P# = 1.0, Pexplr = 1.0, χ = 1.0, β = 0.2, α = 0.1, δ = 0.1, ν = 5, θGA = 25, θdel = 20, θsub = 20; GA subsumption is applied while action-set subsumption is not; in addition, we use tournament selection with τ = 0.4. For XCSAM, we applied the same parameters as XCS and, in addition, set ζ = 0.99.
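As an illustration, here is a minimal Python implementation of the hidden parity function described above, assuming (as in the example) that the relevant bits are the first k and that the parity of an odd number of ones is 1.

```python
def hidden_parity(bits: str, k: int) -> int:
    """HP_{n,k}: parity of the k relevant bits hidden among the n inputs."""
    relevant = bits[:k]                # as in the example, the first k bits are relevant
    return relevant.count('1') % 2     # 1 for an odd number of ones, 0 otherwise

# HP_{6,4}: hidden_parity("110111", 4) == 1 and hidden_parity("000111", 4) == 1,
# consistent with the example given above.
```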
4.2 Accuracy-Based Selection in XCS
Figure 1 shows the characteristics of the parent classifiers selected by accuracy-based fitness in XCS. In particular, Figures 1a and 1b show the prediction and the fitness of the selected parents plotted against the iteration at which each parent was selected. For instance, a parent with prediction 1000 selected at iteration 150000 is plotted at the upper right corner of Figure 1a. Figure 1c shows the relation between the prediction and the fitness of the classifiers in [P] after 150000 iterations, that is, the final solutions. In XCS, to ensure a complete mapping, selection is designed to focus on maximally accurate classifiers. Therefore, in classification problems with a 1000/0 reward scheme, the classifiers that should be selected as parents have a prediction of 1000 or 0. Figures 1a and 1b show that the genetic algorithm in XCS selects classifiers with a prediction between 100 and 900 until problem 60000; then, XCS gradually focuses on classifiers with a 0 or 1000 prediction. Similarly, the selected classifiers have a low fitness until iteration 70000; then, XCS gradually focuses selection on classifiers with higher fitness values (near 1). This behavior is coherent with what we should expect from XCS; however, the same strategy has a different effect in XCSAM.

4.3 Accuracy-Based Selection in XCSAM
In classification problems with a 1000/0 reward scheme, XCSAM should tend to focus mainly on accurate classifiers with high (near 1000) prediction values, which represent the best action mappings. Thus, XCSAM should behave differently from XCS and should not select accurate classifiers with low (near 0) prediction values, which represent redundant classifiers for XCSAM.
Figure 1: Characteristics of parents selected by accuracy-based selection in XCS (a & b) and characteristics of the final solutions (c). (a) Prediction of selected parents and the iteration at which each parent was selected; (b) fitness of selected parents and the iteration at which each parent was selected; (c) prediction and fitness of the classifiers in [P] as final solutions.
Figure 2: Characteristics of parents selected by accuracy-based selection in XCSAM (a & b) and characteristics of the final solutions (c). (a) Prediction of selected parents and the iteration at which each parent was selected; (b) fitness of selected parents and the iteration at which each parent was selected; (c) prediction and fitness of the classifiers in [P] as final solutions.
Figure 2 shows the characteristics of the parent classifiers selected in XCSAM using plain accuracy-based fitness. Figures 2a and 2b show that XCSAM selects the most promising parent classifiers for a best action mapping (in fact, it tends to select classifiers with high fitness values). At the same time, however, the genetic algorithm still selects inaccurate classifiers up to iteration 150000 (Figure 2b). Because XCSAM keeps selecting classifiers with low fitness, the final solutions it evolves contain a large number of inaccurate classifiers (Figure 2c). Thus, plain accuracy-based selection in XCSAM selects the proper classifiers but also keeps many inaccurate ones, and therefore produces solutions with more classifiers than needed (the system should still get rid of the inaccurate classifiers).
5. IMPROVED XCSAM
We propose a new selection strategy for XCSAM to ensure that XCSAM evolves compact solutions containing mainly classifiers advocating the actions with the highest payoff. As shown by the previous analysis, XCSAM with plain accuracy-based selection does not reliably focus selection on good parents that represent the best action mappings. To design an adequate selection strategy for XCSAM, we analyzed the eam parameter. Figure 3 shows the relation between the prediction and the effect of adaptive mapping (eam) of the classifiers in the final solutions. As can be noted, classifiers with a low (near zero) prediction tend to have a high eam (which converges to nma), while classifiers with a high (near 1000) prediction have a low eam (which converges to 1). This suggests that XCSAM should select parents with a small value of eam and a high fitness. Accordingly, we introduce a selection vote which is used to choose the parents: we compare the selection vote of the classifiers in [A] and modify tournament selection [4] by defining the selection vote as

selection vote = (cl.F / cl.num) × 1/(cl.eam − 1)    (6)

Note that cl.F/cl.num is the original selection pressure of tournament selection [3], i.e., we only add the factor 1/(cl.eam − 1) to the original selection vote. This means that XCSAM selects parents based not only on fitness but also on the eam value, so that XCSAM detects accurate parents that should be included in the best action mapping (because they have a low eam). As done in [4], we select the classifier with the maximum selection vote in a tournament of size τ.
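For concreteness, here is a small Python sketch of the modified tournament selection based on the selection vote of Equation (6). The sampling scheme and the guard against eam = 1 are our assumptions, not details specified in the text.

```python
import random

def selection_vote(cl, eps=1e-6):
    # Eq. (6): the usual tournament pressure cl.F / cl.num, scaled by
    # 1 / (cl.eam - 1); the small eps only avoids division by zero (assumption).
    return (cl.F / cl.num) * (1.0 / max(cl.eam - 1.0, eps))

def tournament_select(action_set, tau=0.4):
    """Pick the classifier with the maximum selection vote among a fraction tau of [A]."""
    size = max(1, int(round(tau * len(action_set))))
    competitors = random.sample(action_set, size)
    return max(competitors, key=selection_vote)
```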
5.1 Analysis of the New Selection Strategy
To analyze the characteristics of the parent classifiers selected by the proposed selection strategy in XCSAM, we applied XCSAM to the 20/5 hidden parity problem.
Figure 4: Characteristics of parents selected by the new selection strategy in XCSAM (a & b) and characteristics of the classifiers in the final solution (c). (a) Prediction of selected parents and the iteration at which each parent was selected; (b) fitness of selected parents and the iteration at which each parent was selected; (c) prediction and fitness of the classifiers in [P] as final solutions.
Figure 5: Effect of adaptive mapping (eam) of parents selected by the accuracy-based selection strategy and the new selection strategy in XCSAM. (a) XCSAM with XCS's selection strategy; (b) XCSAM with the proposed selection strategy.

Figure 4 compares the characteristics of the parents selected using accuracy-based selection and the new selection strategy. Figure 5 shows the relation between the iteration at which parents are selected and the eam parameter of the selected parents for both selection strategies in XCSAM. Figure 4a shows that, with the proposed selection strategy, XCSAM selects inaccurate classifiers with prediction between 100 and 900 until iteration 50000; then, it clearly focuses only on good candidates with high accuracy and a high (near 1000) prediction. Similarly, Figure 4b shows that XCSAM still selects maximally accurate classifiers. Therefore, as shown in Figure 4c, the final solutions contain very few inaccurate or redundant classifiers and thus represent the best mapping XCSAM should generate. Additionally, Figure 5 shows that the proposed selection strategy correctly focuses on classifiers with a low eam, whereas pure accuracy-based fitness cannot stably select such classifiers. Overall, these results suggest that the proposed selection strategy promotes accurate classifiers that mainly advocate the best actions in every situation and prevents the selection of inaccurate classifiers.
Figure 3: Relation between prediction and effect of adaptive mapping (eam) of classifiers in the final solution.
6. EXPERIMENTAL RESULTS
We applied XCS, the original XCSAM [19, 18], and XCSAM with the new selection strategy to the hidden parity problem and to the Boolean multiplexer. We compared the three models in terms of learning performance, population size, and evolved mapping.
Design of Experiments. Each experiment consists of a number of problems that the system must solve. Each problem is either a learning problem or a test problem.
During learning problems, the system selects actions randomly from those represented in the match set. During test problems, the system selects the action with the highest expected return. When the system performs the correct action, it receives a reward of 1000; otherwise it receives 0. The genetic algorithm is enabled only during learning problems and is turned off during test problems. The covering operator is always enabled, but operates only if needed. Learning problems and test problems alternate. The performance is reported as a moving average over the last 5000 test problems. All the plots are averages over 10 experiments.
Hidden Parity. In the first set of experiments, we applied XCS, XCSAM, and the improved XCSAM with the new selection strategy to HP20,5 (see Section 4) using the standard parameter settings used in [17, 14]. Figure 6 compares the performance and population size of XCS, XCSAM, and the improved XCSAM. As can be noted, XCS reaches optimal performance after 110000 problems (Figure 6a) and evolves solutions containing an average of 120 classifiers (Figure 6b). XCSAM learns faster and reaches optimality after 70000 problems (Figure 6a) but produces larger solutions than XCS (an average of 580 classifiers, Figure 6b). In contrast, the improved version of XCSAM, using the new selection strategy, reaches optimality a bit faster than the original XCSAM and needs fewer classifiers than XCS (an average of 79 classifiers).
Boolean Multiplexer. In the second set of experiments, we compared the three models on the Boolean multiplexer [21]. These functions are defined over binary strings of k + 2^k bits; the first k bits represent an address pointing to one of the remaining 2^k bits. For instance, the 6-multiplexer function (k = 2) applied to the input string 110001 returns 1, while applied to 110110 it returns 0.
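As an illustration, here is a minimal Python implementation of the k + 2^k Boolean multiplexer defined above.

```python
def multiplexer(bits: str, k: int) -> int:
    """Return the data bit addressed by the first k bits of the input string."""
    address = int(bits[:k], 2)        # the first k bits form the address
    return int(bits[k + address])     # value of the addressed data bit

# 6-multiplexer (k = 2): multiplexer("110001", 2) == 1, multiplexer("110110", 2) == 0,
# matching the examples in the text.
```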
We compared the three XCS models using the 20-multiplexer (k = 4) and the 37-multiplexer (k = 5) with the standard parameter settings [5]: N = 2000 (k = 4) and 5000 (k = 5), ε0 = 10, μ = 0.04, P# = 0.5 (k = 4) and 0.65 (k = 5), Pexplr = 1.0, χ = 0.8, β = 0.2, α = 0.1, δ = 0.1, ν = 5, θGA = 25, θdel = 20, θsub = 20; tournament selection is applied with τ = 0.4; GA subsumption is turned on while action-set subsumption is turned off; in XCSAM, we set ζ = 0.99. Figures 7 and 8 compare the performance and population size of XCS, the original XCSAM, and the improved XCSAM using the new selection strategy on the 20-multiplexer and the 37-multiplexer. As can be noted, in the 20-multiplexer XCS reaches optimal performance after 40000 problems (Figure 7a) and evolves solutions that on average contain around 340 classifiers (Figure 7b). XCSAM learns a little faster than XCS but produces larger solutions of about 630 classifiers (i.e., nearly twice the size of the solutions evolved by XCS). In contrast, the improved XCSAM reaches optimality a little faster than XCS, while producing very compact solutions containing an average of 180 classifiers (Figure 7b), i.e., almost half the size of the solutions evolved by XCS. In the 37-multiplexer, XCS needs more than 400000 problems to reach optimality and evolves solutions containing an average of 1200 classifiers. The original XCSAM [19, 18] learns much faster than XCS and reaches optimal performance after just 200000 problems; however, its final population contains about 2300 classifiers (almost twice what XCS needs). In contrast, our improved XCSAM is even faster than the original XCSAM, reaches optimality after 170000 problems, and produces solutions that are about half the size of what XCS needs (just 570 classifiers). Overall, our results show that by focusing only on the actions with the highest reward, the original XCSAM [19, 18] can learn faster than XCS but requires more classifiers than XCS. The improved XCSAM combines the best of both worlds and ensures fast learning and more compact solutions: it evolves solutions that are around half the size of those XCS produces (as should be expected).

7. CONCLUSIONS
XCSAM [19, 18] is an extension of XCS [21] that generates solutions mainly containing classifiers that advocate the actions with the highest returns. While XCS [21] learns the expected payoff of all the available actions in every possible situation, XCSAM concentrates only on the most promising actions and can therefore learn faster than XCS [19, 18]. However, XCSAM can often produce solutions that are larger than those evolved by XCS. Accordingly, in this paper we extended XCSAM with a novel selection strategy designed to reduce the size of the solutions it evolves. The proposed selection strategy enables XCSAM to select the parent classifiers based both on their fitness (as done in XCS) and on the effect they have on the mapping (encoded by the parameter eam). We applied XCS, the original XCSAM, and the improved XCSAM to the 20/5 hidden parity problem [17, 14], to the 20-multiplexer, and to the 37-multiplexer [21]. Our results show that the improved XCSAM can evolve solutions that are around half the size of the solutions produced by XCS while also reaching optimality faster than XCS. Thus, the improved XCSAM opens new opportunities to tackle more complex problems in an acceptable time.
Acknowledgment This work was supported by the JSPS Institutional Program for Young Researcher Overseas Visits.
8. REFERENCES
[1] E. Bernadó-Mansilla and J. M. Garrell-Guiu. Accuracy-based Learning Classifier Systems: Models, Analysis and Applications to Classification Tasks. Evolutionary Computation, 11:209–238, 2003.
[2] Larry Bull, Ester Bernadó-Mansilla, and John H. Holmes, editors. Learning Classifier Systems in Data Mining, volume 125 of Studies in Computational Intelligence. Springer, 2008.
[3] M. V. Butz. Rule-Based Evolutionary Online Learning Systems. Springer, 2006.
[4] M. V. Butz, D. E. Goldberg, and K. Tharakunnel. Analysis and Improvement of Fitness Exploitation in XCS: Bounding Models, Tournament Selection, and Bilateral Accuracy. Evolutionary Computation, 11(3):239–277, 2003.
[5] M. V. Butz, T. Kovacs, P. L. Lanzi, and S. W. Wilson. Toward a Theory of Generalization and Learning in XCS. IEEE Transactions on Evolutionary Computation, 8(1):28–46, February 2004.
Figure 6: Performance and population size of XCS, XCSAM, and the improved XCSAM on the 20/5 hidden parity problem. (a) Performance; (b) population size.
Figure 7: Performance and population size of XCS, XCSAM, and the improved XCSAM on the 20-multiplexer. (a) Performance; (b) population size.
Figure 8: Performance and population size of XCS, XCSAM, and the improved XCSAM on the 37-multiplexer. (a) Performance; (b) population size.

[6] M. V. Butz and S. W. Wilson. An algorithmic description of XCS. Journal of Soft Computing, 6(3–4):144–153, 2002.
[7] Martin V. Butz, David E. Goldberg, and Pier Luca Lanzi. Gradient descent methods in learning classifier systems: Improving XCS performance in multistep problems. IEEE Transactions on Evolutionary Computation, 9(5):452–473, October 2005.
[8] Martin V. Butz, Tim Kovacs, Pier Luca Lanzi, and Stewart W. Wilson. Toward a theory of generalization and learning in XCS. IEEE Transactions on Evolutionary Computation, 8(1):28–46, February 2004.
[9] Martin V. Butz, Pier Luca Lanzi, and Stewart W. Wilson. Function approximation with XCS: Hyperellipsoidal conditions, recursive least squares, and compaction. IEEE Transactions on Evolutionary Computation, 12(3):355–376, 2008.
[10] D. E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison Wesley, 1989.
[11] J. H. Holland. Escaping Brittleness: The Possibilities of General Purpose Learning Algorithms Applied to Parallel Rule-based Systems. Machine Learning, 2:593–623, 1986.
[12] Tim Kovacs and Manfred Kerber. What makes a problem hard for XCS? In Pier Luca Lanzi, Wolfgang Stolzmann, and Stewart W. Wilson, editors, IWLCS, volume 1996 of Lecture Notes in Computer Science, pages 80–102. Springer, 2000.
[13] Pier Luca Lanzi. Learning classifier systems from a reinforcement learning perspective. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 6(3):162–170, 2002.
[14] Pier Luca Lanzi, Daniele Loiacono, Stewart W. Wilson, and David E. Goldberg. XCS with computed prediction for the learning of Boolean functions. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2005), pages 588–595, Edinburgh, UK, September 2005. IEEE.
[15] Pier Luca Lanzi, Luigi Nichetti, Kumara Sastry, Davide Voltini, and David E. Goldberg. Real-coded extended compact genetic algorithm based on mixtures of models. In Ying-Ping Chen and Meng-Hiot Lim, editors, Linkage in Evolutionary Computation, volume 157 of Studies in Computational Intelligence, pages 335–358. Springer, 2008.
[16] P. L. Lanzi, D. Loiacono, S. W. Wilson, and D. E. Goldberg. XCS with computed prediction for the learning of Boolean functions. In Proceedings of the 2005 IEEE Congress on Evolutionary Computation, volume 1, pages 588–595, 2005.
[17] Martin V. Butz, David E. Goldberg, and K. Tharakunnel. Analysis and Improvement of Fitness Exploitation in XCS: Bounding Models, Tournament Selection, and Bilateral Accuracy. Evolutionary Computation, 11(4):239–277, 2003.
[18] Masaya Nakata, Pier Luca Lanzi, and Keiki Takadama. Enhancing learning capabilities by XCS with best action mapping. In Carlos A. Coello Coello, Vincenzo Cutello, Kalyanmoy Deb, Stephanie Forrest, Giuseppe Nicosia, and Mario Pavone, editors, PPSN (1), volume 7491 of Lecture Notes in Computer Science, pages 256–265. Springer, 2012.
[19] Masaya Nakata, Pier Luca Lanzi, and Keiki Takadama. XCS with adaptive action mapping. In Lam Thu Bui, Yew-Soon Ong, Nguyen Xuan Hoai, Hisao Ishibuchi, and Ponnuthurai Nagaratnam Suganthan, editors, SEAL, volume 7673 of Lecture Notes in Computer Science, pages 138–147. Springer, 2012.
[20] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
[21] S. W. Wilson. Classifier fitness based on accuracy. Evolutionary Computation, 3(2):149–175, June 1995.