A Meta-Learning System Based on Genetic Algorithms Eric Pellerin*a, Luc Pigeona, Sylvain Delisleb a: Defence R&D Canada (DRDC Valcartier), Information and Knowledge Management Section, Val-Bélair (Québec), Canada, G3J 1X5; b: Département de mathématiques et d’informatique, Université du Québec à Trois-Rivières, Québec, Canada, G9A 5H7;
[email protected];
[email protected];
[email protected] ABSTRACT The design of an efficient machine learning process through self-adaptation is a great challenge. The goal of metalearning is to build a self-adaptive learning system that is constantly adapting to its specific (and dynamic) environment. To that end, the meta-learning mechanism must improve its bias dynamically by updating the current learning strategy in accordance with its available experiences or meta-knowledge. We suggest using genetic algorithms as the basis of an adaptive system. In this work, we propose a meta-learning system based on a combination of the a priori and a posteriori concepts. A priori refers to input information and knowledge available at the beginning in order to built and evolve one or more sets of parameters by exploiting the context of the system’s information. The selflearning component is based on genetic algorithms and neural Darwinism. A posteriori refers to the implicit knowledge discovered by estimation of the future states of parameters and is also applied to the finding of optimal parameters values. The in-progress research presented here suggests a framework for the discovery of knowledge that can support human experts in their intelligence information assessment tasks. The conclusion presents avenues for further research in genetic algorithms and their capability to learn to learn. Keywords: Meta-learning, genetic algorithms, machine learning, learning system, knowledge, neural Darwinism, learning to learn.
1. INTRODUCTION

Meta-learning is certainly one of the most important areas in machine learning research. Designing an efficient machine learning mechanism through self-adaptation is a great challenge. One of the central questions that guided Schmidhuber's research on meta-learning is the following: can we construct meta-learning algorithms that can learn better learning algorithms? [1] This question is very important for the development of any learning system. The goal of meta-learning is to build a self-adaptive learning system that improves its bias dynamically through its experiences by accumulating meta-knowledge [2]. To obtain efficient, good-quality meta-knowledge, the learning system must be able to adapt itself to a specific environment. From that perspective, to help the development of meta-learning algorithms, we suggest using genetic algorithms (GAs) as the basis of adaptive systems. GAs are the clear preference of some researchers in the development of learning systems [3, 4], and several research works in the literature on learning systems and genetic algorithms relate to this issue [3, 4, 5, 6, 7].

Military information is classified according to the type of intelligence source it comes from: Signals Intelligence (SIGINT), which includes Communications Intelligence (COMINT) and Electronics Intelligence (ELINT); Imagery Intelligence (IMINT); Measurement and Signature Intelligence (MASINT); Open Source Intelligence (OSINT); and Human Intelligence (HUMINT). When human experts receive these various types of information, the source's reliability and the information's credibility must be evaluated. These evaluations indicate the degree of confidence one can have in the information, and they also impact data interpretation and fusion. However, the ability of human experts to reliably perform such evaluations can be seriously affected by stressful situations and information overload, leading to erroneous interpretations. We suggest that a system based on meta-learning can help human experts in their intelligence information assessment tasks. This system focuses on the optimization of weights associated with its information sensors (sources) in order to ensure that the information received, and later processed, is optimally reliable under the inevitable constraints affecting the system.

The purpose of this paper is to make suggestions as to how such a meta-learning system could be constructed. We propose a meta-learning system to illustrate GAs' capability to learn to learn. Section 2 reviews research on meta-learning and on meta-learning with genetic algorithms. Section 3 presents the proposed architecture of the meta-learning system. Section 4 explains the meta-knowledge obtained by the system. Finally, we discuss and conclude in Section 5.
2. BACKGROUND

2.1 Meta-learning

The goal of meta-learning is to build a self-adaptive learning system that is constantly adapting to its specific environment. To achieve this, the meta-learning process must be able to improve its bias dynamically by adapting the current learning strategy in accordance with its experiences or meta-knowledge [1, 8]. This is the general definition accepted by several authors in the literature; however, its application varies significantly from one researcher to another. Different views of meta-learning, as described by Vilalta and Drissi [2], are reported in the machine-learning literature.

• Building a meta-learner of base-learners. A meta-learner combines the predictions of base learners. Stacked generalization, bagging, and boosting are common approaches that combine individual generalizers to achieve greater predictive accuracy (a minimal sketch of such a meta-learner follows this list).

• Dynamic selection of bias. This type of meta-learning focuses on the evaluation and selection of bias to provide appropriate conditions to the learning algorithm. Dynamic selection of bias proceeds by modifying the hypothesis space to cover the largest possible number of tasks.

• Meta-rules matching task properties with algorithm performance. The goal here is to determine, under given conditions, which learning algorithm to apply in order to perform well on the tasks at hand. It is then desirable to take many of the tasks' criteria into account when attempting to identify an appropriate algorithm.

• Inductive transfer and learning to learn. Inductive transfer improves learning for one task by using the information contained in other related tasks. As experience accumulates, learning on each task can help the others perform even better. Various studies have considered how to improve learning by detecting, extracting, and exploiting patterns that exist across domains.

• Learning classifier systems. Learning classifier systems combine genetic algorithms and reinforcement learning. The system learns to identify condition-action rules by interacting with an environment and a reward system; it also learns how to create better rules and improve existing ones. The goal of such a system is to increase the amount of reward received from the environment.

• Case-based reasoning. A case-based classifier system uses an approach similar to one sometimes used by humans: it tries to solve new cases (problems) by reusing previously solved ones. The system is able to explain its reasoning and justify its solutions.
• Learning agents. Meta-learning is applied to learning agents to help these agents collaborate with each other through a reinforcement process or, alternatively, to use meta-level problem-solving information and knowledge to support cooperation in a multi-agent system. In this way, the interaction among different kinds of learning agents can be seen as a meta-heuristic and may help to solve NP-hard problems, yielding satisfying solutions that cannot be obtained with exact algorithms.
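As an illustration of the first view above (building a meta-learner of base-learners), the following is a minimal sketch of stacked generalization, assuming scikit-learn is available; the dataset, the choice of base learners, and all settings are our own illustrative assumptions, not taken from the works cited.

```python
# A minimal sketch of stacked generalization: out-of-fold predictions of
# base learners become the input features of a meta-learner.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

base_learners = [DecisionTreeClassifier(max_depth=3, random_state=0),
                 KNeighborsClassifier(n_neighbors=5)]

# Out-of-fold probabilities avoid leaking each base learner's training fit
# into the meta-learner's training data.
meta_features = np.column_stack([
    cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
    for clf in base_learners])

meta_learner = LogisticRegression().fit(meta_features, y)
```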
Following this brief description, we now present our own view of meta-learning in Figure 1 below.

Figure 1. A Systemic Perspective of Meta-Learning. [Diagram components: environment, training set, performance module with performance evaluator and performance effectors, hypothesis, learning module, knowledge, knowledge assessment, adaptive knowledge, meta-knowledge.]
Figure 1 shows the components that, taken together, form the whole system. Each major component is described below:

Performance module: takes the training set and returns a hypothesis or a series of actions (via its performance effectors) to the environment.
Learning module: improves the system by modifying the performance module, using adaptive knowledge and feedback.
Knowledge assessment: evaluates the quality of the knowledge according to its degree of contribution to the resolution of the problem.
Adaptive knowledge: explores new hypotheses.

We suggest that, in order to build a useful, high-performing learning system, several learning techniques should be combined. Different learning techniques have been used to develop meta-learning systems (a minimal sketch of the Figure 1 component interfaces follows this list):
• inductive logic programming;
• genetic algorithms;
• neural networks;
• decision trees;
• clustering;
• reinforcement learning;
• fuzzy logic;
• etc.
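To make the decomposition of Figure 1 concrete, here is a minimal sketch of the component interfaces as one might implement them; the class and method names are hypothetical choices of our own, not part of the system described above.

```python
# A hedged sketch of the Figure 1 components as abstract interfaces;
# all names and signatures are hypothetical illustrations.
from abc import ABC, abstractmethod
from typing import Any, List


class PerformanceModule(ABC):
    @abstractmethod
    def act(self, training_set: List[Any]) -> Any:
        """Return a hypothesis (or a series of actions) for the environment."""


class KnowledgeAssessment(ABC):
    @abstractmethod
    def evaluate(self, knowledge: Any, outcome: Any) -> float:
        """Score knowledge by its contribution to solving the problem."""


class LearningModule(ABC):
    @abstractmethod
    def improve(self, performer: PerformanceModule, feedback: float,
                adaptive_knowledge: Any) -> None:
        """Modify the performance module using feedback and new hypotheses."""
```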
All these techniques are limited in their learning capabilities, but one of them, GAs, appears especially promising for supporting the design of a meta-learning system. Some researchers favour GAs because of their ability to obtain a good-quality solution quickly and relatively easily compared to other techniques [3, 4]. Given the interest GAs hold for us in light of our goals, we next briefly review some developments of GAs within the realm of machine learning.

2.2 Meta-learning with genetic algorithms

Meta-learning with genetic algorithms has been under consideration for the past fifteen years. De Jong [9] reports that GAs are better suited than other learning techniques for the design of learning systems. Investigations of genetic algorithm applications to many traditional machine learning problems are presented in [10]. Several authors in the field of machine learning have used GAs for the acquisition of classification knowledge [5, 6, 11]. A good review of this area is [7], in which new models are discussed and several interesting applications of learning classifier systems are presented. Some research in multi-agent systems proposes to use GAs inside individual learning agents in order to make these agents completely autonomous from a genetic standpoint [3]; such genetic autonomy may lead to meta-learning that determines a set of good heuristics in multi-agent systems. Most of the research mentioned here informs our approach, especially work on classification knowledge with evolving action rules. In this paper, however, we propose an architecture that does not rely on a set of rules: the system is designed around a set of parameters, and evolution takes place through the adaptation of these parameters. More details on the proposed architecture are presented in Section 3.
3. ARCHITECTURE OF A META-LEARNING SYSTEM

We now introduce our own meta-learning system, based on a combination of the a priori and a posteriori concepts. A priori refers to input information and knowledge available at the beginning, used to define and evolve one or more sets of parameters. These parameters are specific to the system, and the resulting sets of parameters constitute meta-knowledge. A posteriori refers to the refinement of the sets of parameters initially obtained (see Figure 2). The same combination has been used to design a hybrid, multi-objective optimization system [12].

3.1 A priori

The notion of a priori consists in starting from initial input information and knowledge to construct and adapt the sets of parameters, or meta-knowledge. We show that the system is able to extract meta-knowledge from knowledge. These capabilities are obtained by a combination of learning techniques.

3.1.1 Darwin brain

The module called "Darwin brain" is a major component of our meta-learning system. Its goal is to build and evolve one or more sets of parameters by exploiting the knowledge, the context of the information, and the optimal parameter settings obtained after simulation runs. This self-learning component is based on the principle of Darwinian natural selection, and the system uses a combination of learning techniques based on genetic algorithms and neural Darwinism.

Genetic algorithms are optimization algorithms based on Charles Darwin's theory of the evolution of species [13]. John Holland's initial work dates back to the 1960s, although his best-known contribution is the 1975 publication of Adaptation in Natural and Artificial Systems [14]. However, it is Goldberg's seminal work that largely contributed to the development of genetic algorithms as we know them today [15]. A GA's behaviour depends on the choice of several key parameters; the algorithm evolves a population of candidate "solutions" (individuals) for a given problem in order to find a satisfying solution (which, depending on the fitness function and other considerations, is not necessarily a strict global optimum).

Neural Darwinism, also called the theory of neuronal group selection, was proposed by Gerald Edelman [16] and describes the brain as a selective system. The various areas that constitute the human brain are in permanent communication with one another, through re-entrant connections within its neural networks. At every moment, the brain selects data flows, consolidating some and erasing others; certain groups of neurons get reinforced while others get weakened, according to the stimuli in the environment. The brain thus rebuilds its memory continuously and, by this fact, builds a new possible world at every moment. This theory, called neuronal Darwinism, can be useful when applied to learning systems. In fact, learning is a form of selection and consists in a great number of neuronal connections behaving according to an evolutionary scheme. In our system, the evolution of neuronal-connection schemas is based on genetic algorithms, and the principle of selection is implemented more specifically at the level of groups of neurons. Each group of neurons is connected in layers that are interconnected in order to constitute categories, but also to allow updating these categories whenever necessary. The purpose of the categories is to help the discovery of implicit knowledge in the sets of parameters. The functioning of this module is presented in Section 4.

3.1.2 Memory

The "Memory" module stores all knowledge that will be used by the "Darwin brain" module to learn parameter settings. Its structure is based on the physical-world representation of space (x, y, z), time (t), and possible worlds (w) defined by Pigeon [17, 18].
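As a concrete illustration of this representation, below is a minimal sketch of one possible record structure for the "Memory" module; the field types and the example values are our own assumptions, not taken from [17, 18].

```python
# A minimal sketch of a "Memory" record following the space (x, y, z),
# time (t), and possible-world (w) representation; field types and the
# example values are illustrative assumptions.
from dataclasses import dataclass
from typing import Any


@dataclass
class MemoryRecord:
    x: float           # spatial coordinates of the observation
    y: float
    z: float
    t: float           # time stamp
    w: int             # identifier of the possible world (hypothesis)
    knowledge: Any     # the stored knowledge item itself


# Example: a temperature report attached to possible world 0.
record = MemoryRecord(x=46.8, y=-71.4, z=0.0, t=1234.5, w=0,
                      knowledge={"temperature": 27.0, "source": "A"})
```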
Figure 2. Meta-Learning System and A Priori and A Posteriori Knowledge. [Diagram labels: a priori, optimal parameter settings, adapted sets of parameters, a posteriori.]
3.2 A posteriori

This section considers the simulation and choice of parameter settings among all possible sets of parameters. Our system must learn how to carry out this task without a priori knowledge and without external user assistance. This capability will enable the system to react to unforeseen situations.
3.2.1 Sim worlds

In this module, unsupervised learning is applied to the discovery of implicit knowledge present within the parameters, by detecting similarities or differences between them. No feedback is required for learning (by the very definition of unsupervised learning), only simulation or estimation of the future states of parameters. The "Sim worlds" module uses the adapted sets of parameters output by the "Darwin brain" module to produce results. This phase prepares the performance evaluation of a set of parameters: it consists in predicting the behaviour our system would have with each candidate set. To that end, the parameters can be applied to the real system, or to an approximate system simulating the real one. This phase must be as quick as possible so as to explore a maximum of different simulations and thus maximize our chances of detecting optimal parameter settings [12].

3.2.2 Choice

The goal of this module is to carry out a performance evaluation of a set of parameters. It focuses on estimating the quality of the simulations obtained from the "Sim worlds" module. Supervised learning is used to evaluate the quality of the sets of parameters; in this way, optimization algorithms can be applied to find optimal parameter settings. We suggest that a simple genetic algorithm is sufficient to obtain a good optimization of the parameters (a minimal sketch follows). These results are then used by the "Darwin brain" module to bring about improvements to the learning system.
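To make the "Choice" step concrete, the following is a minimal sketch of the kind of simple genetic algorithm we have in mind for optimizing a parameter vector; the fitness function stands in for a score that would normally come from the "Sim worlds" simulations, and all settings and values are illustrative assumptions.

```python
# A minimal sketch of a simple GA over real-valued parameter vectors,
# with tournament selection, one-point crossover, and Gaussian mutation.
import numpy as np

rng = np.random.default_rng(0)
POP, DIM, GENS, MUT = 40, 8, 100, 0.1

def fitness(params):
    # Placeholder: reward parameter vectors close to a hypothetical optimum;
    # in the real system this score would come from simulation results.
    target = np.linspace(0.0, 1.0, DIM)
    return -np.sum((params - target) ** 2)

pop = rng.random((POP, DIM))
for _ in range(GENS):
    scores = np.array([fitness(p) for p in pop])
    # Tournament selection: keep the better of two random individuals.
    idx = rng.integers(0, POP, (POP, 2))
    parents = pop[np.where(scores[idx[:, 0]] > scores[idx[:, 1]],
                           idx[:, 0], idx[:, 1])]
    # One-point crossover between consecutive parents.
    children = parents.copy()
    for i in range(0, POP - 1, 2):
        c = rng.integers(1, DIM)
        children[i, c:], children[i + 1, c:] = (parents[i + 1, c:].copy(),
                                                parents[i, c:].copy())
    # Gaussian mutation with probability MUT per gene.
    mask = rng.random(children.shape) < MUT
    children[mask] += rng.normal(0.0, 0.1, mask.sum())
    pop = np.clip(children, 0.0, 1.0)

best = pop[np.argmax([fitness(p) for p in pop])]
```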
4. META-KNOWLEDGE

This section describes how the meta-learning system functions to obtain meta-knowledge as output. The information provided by human (military) experts is the system's input. The information received by the human (military) expert at the other end, the human information consumer, is evaluated on two aspects: the reliability of the source and the credibility of the information. These evaluations indicate the degree of confidence associated with particular items of information. Table 1 shows the evaluation ratings for source reliability and information credibility.
Source reliability                    Information credibility
A  Completely reliable                1  Confirmed by other sources
B  Usually reliable                   2  Probably true
C  Fairly reliable                    3  Possibly true
D  Not usually reliable               4  Doubtful
E  Unreliable                         5  Improbable
F  Reliability cannot be judged       6  Truth cannot be judged
Table 1. Evaluation Rating of Information

To each input of the system corresponds a vector of weights (i.e., degrees of confidence). The goal of the meta-learning system is to learn the best weights to associate with incoming information, so that the human information consumer knows to what extent he or she can rely on the available information. The system performs learning not only on its knowledge, but also on itself and on its own knowledge structures.
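As an illustration, one possible numeric encoding of Table 1's ratings into such a weight is sketched below; the particular numeric scales, and the choice of a neutral value for the "cannot be judged" ratings, are our own assumptions.

```python
# A hedged sketch of turning Table 1 ratings into numeric confidence
# weights; the scales are illustrative assumptions, with the "cannot be
# judged" ratings (F and 6) mapped to a neutral 0.5.
RELIABILITY = {"A": 1.0, "B": 0.8, "C": 0.6, "D": 0.4, "E": 0.2, "F": 0.5}
CREDIBILITY = {1: 1.0, 2: 0.8, 3: 0.6, 4: 0.4, 5: 0.2, 6: 0.5}

def confidence_weight(source_rating: str, info_rating: int) -> float:
    """Combine source reliability and information credibility into a weight."""
    return RELIABILITY[source_rating] * CREDIBILITY[info_rating]

# Example: a "B2" item (usually reliable source, probably true information).
weights = [confidence_weight("B", 2), confidence_weight("F", 3)]
```

In the full system, such initial weights would only seed the vectors that the learning cycle then adapts.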
We now define a learning cycle.

4.1 Learning

Learning results from the theory of neuronal group selection (neuronal Darwinism), which is based on three postulates:
1) Neurons are initially connected randomly, then more and more systematically, in response to very general developmental constraints.
2) The most frequently used connections are reinforced, while the others disappear.
3) Selection occurs when an external or internal stimulus arises: millions of neurons interconnected in layers are then activated in parallel, and these layers communicate with one another to constitute categories.

It is from these categories, and from this selective mechanism, that learning takes place. Such learning is a form of selection and consists in a great number of neuronal-connection schemas evolving over time.

4.2 Selection (GA)

The evolution of neuronal-connection schemas is based on genetic algorithms, and the principle of selection operates more specifically at the level of groups of neurons. The selection of neuronal groups depends on the context (world).

4.3 Context

When the human brain learns, different areas are used depending on the context or world; activity levels are not the same throughout the brain. Some neuronal-connection schemas are selected more frequently than others and are thus strongly dependent on the physical world. Learning can therefore be seen as the aptitude to organize the surrounding world into categories.

4.4 Categories

The purpose of the categories is to help discover the implicit knowledge present in sets of parameters, by detecting the similarities or differences that exist between these parameters. Each group of neurons is connected in layers that are interconnected in order to constitute categories. In practice, categorization provides effective help in the learning of knowledge; a sketch of one possible category-formation step follows.
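Below is a minimal sketch of what such category formation could look like over parameter vectors; the greedy grouping rule, the distance threshold, and the data are illustrative assumptions of our own.

```python
# A hedged sketch of category formation: group input parameter vectors
# by similarity, so each category can later host its own GA population.
import numpy as np

def categorize(vectors, threshold=0.5):
    """Greedy grouping: a vector joins the first category whose
    representative (first member) lies within `threshold`, else it
    starts a new category."""
    categories = []
    for v in vectors:
        for cat in categories:
            if np.linalg.norm(v - cat[0]) < threshold:
                cat.append(v)
                break
        else:
            categories.append([v])
    return categories

inputs = np.array([[0.10, 0.20], [0.15, 0.25], [0.90, 0.80], [0.88, 0.85]])
cats = categorize(inputs)  # two categories under the assumed threshold
```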
Figure 3. (a) Forecast information on input, with temperature and source as categories; (b) genetic algorithms applied to each category; (c) interaction between categories; (d) meta-knowledge representation.
The learning cycle is applied to the input, and the weight vectors are associated with the information. The context of the information is extracted to define the categories. The connections between categories and inputs are created and constitute the neuronal-connection schema; there are as many neuronal groups as categories. For example, a military operation needs to know the weather conditions at a specific moment in time. The system has received, as input, all forecast information with associated vectors of (degrees of) confidence provided by different sources (see Figure 3a above). In this example, the context of the information presents two salient categories: temperature and source. The connections between categories and inputs emerge in the system and produce a topology that acts as a neuronal-connection schema.

Selection is carried out by two genetic algorithms on two separate levels. The first-level GA is applied to each category and attempts to find the best combination of input weights; at the end, each category has a population of individual biases. Figure 3b shows a hypothetical schema of selection produced by the GAs. For each category, we find a population of best individuals, and the fitness is given by the knowledge already present in the system. For example, in Figure 3b, the temperature category uses a value of 27°C as its fitness reference; this value was determined by statistical computations over the forecast data already present in the database of the "Memory" module.

A second GA is used to optimize the population of individual biases in order to obtain a population of sets of parameters (meta-knowledge). Each set consists of the best weights together with their associated categories and information. See Figure 3c for an example of the application of the system's second GA. Interaction between categories is the most important process in the system: thanks to this process, the system can discover meta-knowledge. This meta-knowledge is represented by a structure that contains the categories and their associated fitness, the input information, and a vector of weights (see the example in Figure 3d). At the end of a run, we are able to figure out which input information elements are most important for predicting a good answer. In the example of Figure 3d, the hypothetical result suggests that a good forecast is provided by the first input information element, alone or coupled with the others. A sketch of this two-level selection appears below.

In this section we described the functioning of the "Darwin brain" module for obtaining meta-knowledge. The other modules in the system use this meta-knowledge to enhance the system's learning. Indeed, the "Sim worlds" module enhances the system's performance by simulating meta-knowledge and helps to learn more about such meta-knowledge.
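Below is a hedged sketch of this two-level selection; the forecast data, the 27-degree fitness reference, the way categories interact (an element-wise product of weight vectors), and all GA settings are illustrative assumptions of our own.

```python
# A hedged sketch of the two-level GA selection described above.
import numpy as np

rng = np.random.default_rng(1)

def evolve(pop, fit, gens=50, mut=0.1):
    """Tiny GA loop: tournament selection plus Gaussian mutation."""
    for _ in range(gens):
        scores = np.array([fit(ind) for ind in pop])
        i, j = rng.integers(0, len(pop), (2, len(pop)))
        pop = pop[np.where(scores[i] > scores[j], i, j)].copy()
        mask = rng.random(pop.shape) < mut
        pop[mask] += rng.normal(0.0, 0.05, mask.sum())
        pop = np.clip(pop, 0.0, 1.0)
    return pop

# Level 1: one GA per category finds good combinations of input weights.
# For the temperature category, weighted forecasts should approach the
# 27 degree reference computed from the "Memory" module's data.
forecasts = np.array([26.0, 29.0, 31.0, 24.0])    # input information
reliability = np.array([1.0, 0.8, 0.6, 0.4])      # per-source weights

def temp_fit(w):
    return -abs(forecasts @ w / (w.sum() + 1e-9) - 27.0)

def src_fit(w):
    return float(w @ reliability)

temp_pop = evolve(rng.random((30, 4)), temp_fit)
src_pop = evolve(rng.random((30, 4)), src_fit)

# Level 2: a second GA evolves pairs of indices that pick one individual
# per category; categories interact through an element-wise product.
def combo_fit(pair):
    w = temp_pop[int(pair[0] * 29)] * src_pop[int(pair[1] * 29)]
    return temp_fit(w) + src_fit(w)

combos = evolve(rng.random((30, 2)), combo_fit)
best = combos[np.argmax([combo_fit(c) for c in combos])]
```

Inspecting the best combined weight vector would then indicate which input information elements matter most for a good forecast, which is the kind of meta-knowledge shown in Figure 3d.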
5. DISCUSSION AND CONCLUSION

In this paper, we proposed a meta-learning, adaptive system based on genetic algorithms. A combination of the a priori and a posteriori concepts is used to sketch an architecture. A priori consists in starting from input information and knowledge to build and evolve one or more sets of parameters by exploiting the context of the information and the optimal parameter settings obtained after simulation. The self-learning component is a combination of learning techniques based on genetic algorithms and neural Darwinism. The results at the end of this learning process consist of meta-knowledge. A posteriori is applied to the discovery of implicit knowledge by estimating the future states of meta-knowledge; in this case, the system learns how to carry out its task without external user assistance. This property helps the system deal with unforeseen situations.

This paper has focused on the description of the "Darwin brain" module and the meta-knowledge obtainable from the system's results. The hypothetical results shown above constitute a framework for the discovery of knowledge that can help human experts in intelligence information assessment tasks. Obviously, this work is still in progress, and the refinement of the "Darwin brain" module will be treated as a priority. Further work is also required on the "Sim worlds" and "Choice" modules, which play a central role in the improvement of the learning system. The system's ability to learn to learn is provided by a combination of genetic algorithms and other learning techniques. The robustness of the meta-learning system requires enhancing the performance of the learning techniques; future work on such enhancement should be based on self-adaptive parameters in GAs [19].
REFERENCES

1. Schmidhuber J., Evolutionary Principles in Self-Referential Learning, or on Learning How to Learn: The Meta-Meta-... Hook, Diploma thesis, Institut für Informatik, Technische Universität München, 1987.
2. Vilalta R., Drissi Y., "A Perspective View and Survey of Meta-Learning", Artificial Intelligence Review, 18(2): 77-95, 2002.
3. Cadon A., Galinho T., Vacher J.-P., "Genetic Algorithm using Multi-Objective in a Multi-Agent System", Robotics and Autonomous Systems, Elsevier, 33(2-3): 179-190, 2000.
4. Pigeon L., Inglada J., Solaiman B., "Genetic Algorithms for Multi-Agent Fusion System Learning", Proceedings of SPIE, Sensor Fusion: Architectures, Algorithms, and Applications V, International Symposium on Aerospace/Defense Sensing, Simulation, and Controls, Orlando, 4385: 87-95, 2001.
5. Knight L., Sen S., "PLEASE: A Prototype Learning System Using Genetic Algorithms", Proceedings of the International Conference on Genetic Algorithms (ICGA), 429-435, 1995.
6. Fàbrega X.L., Guiu J.M.G.I., "GENIFER: A Nearest Neighbour Based Classifier System Using GA", Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 99), 1999.
7. Holmes J.H., Lanzi P.L., Stolzmann W., Wilson S.W., "Learning Classifier Systems: New Models, Successful Applications", Information Processing Letters, 82(1): 23-30, 2002.
8. Schmidhuber J., Zhao J., Wiering M., "Simple Principles of Metalearning", Technical Report IDSIA-69-96, IDSIA, 1996.
9. De Jong K., "Learning with Genetic Algorithms: An Overview", Machine Learning, 3(2): 121-138, 1988.
10. Grefenstette J.J., "Genetic Algorithms and Machine Learning", invited talk, Proceedings of the Sixth Annual ACM Conference on Computational Learning Theory (COLT 93), ACM, 1993.
11. Spears W.M., De Jong K., "Using Genetic Algorithms for Supervised Concept Learning", Proceedings of the Second IEEE International Conference on Tools for Artificial Intelligence, 335-341, 1990.
12. Lesert A., Aide au paramétrage d'un système réactif, DEA report (Intelligence Artificielle & Optimisation Combinatoire), Université Paris 8, 2003.
13. Darwin C., L'origine des espèces, Petite collection Maspero, Paris, 1980.
14. Holland J.H., Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, MIT Press, Cambridge, Mass., 1992.
15. Goldberg D.E., Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
16. Edelman G.M., Neural Darwinism: The Theory of Neuronal Group Selection, Basic Books, New York, 1987.
17. Pigeon L., "An Advanced C2 Concept for Urban Operations", Proceedings of the 7th International Command and Control Research and Technology Symposium, Quebec City, September 2002.
18. Pigeon L., "A Conceptual Approach for Military Data Fusion", Proceedings of the 7th International Command and Control Research and Technology Symposium, Quebec City, September 2002.
19. Pellerin E., Pigeon L., Delisle S., "Self-adaptive Parameters in Genetic Algorithm", SPIE Defence & Security Symposium 2004 -- Proceedings of the Conference on Data Mining and Knowledge Discovery: Theory, Tools, and Technology VI, Orlando (Florida, USA), 12-16 April 2004, to appear.