Sequential Learning for SOM Associative Memory with Map Reconstruction*

Motonobu Hattori, Hiroya Arisumi and Hiroshi Ito
Yamanashi University, 4-3-11, Takeda, Kofu, 400-8511, JAPAN
[email protected]

* This work was supported by the Telecommunications Advancement Organization of Japan. The authors would like to thank Masayuki Morisawa for valuable discussions.
Abstract. In this paper, we propose a sequential learning algorithm for an associative memory based on the Self-Organizing Map (SOM). In order to store new information without retraining the weights for previously learned information, weights fixed neurons and weights semi-fixed neurons are used in the proposed algorithm. In addition, when a new input is applied to the associative memory, a part of the map is reconstructed by using a small buffer. Owing to this remapping, a topology preserving map is constructed and the associative memory becomes structurally robust. Moreover, it has a much better noise reduction effect than the conventional associative memory.
1 Introduction
It is well known that when a neural network is trained on one set of patterns and then attempts to learn new patterns, catastrophic interference, that is, the complete loss of all previously learned information, may result. This type of radical forgetting is unacceptable both for a model of the human brain and for practical applications. French has pointed out that catastrophic forgetting is a direct consequence of the overlap of distributed representations and can be reduced by reducing this overlap [7],[8]. Since most conventional associative memories employ mutually connected structures, information is stored in the weights in a completely distributed fashion [1]-[4]. Therefore, catastrophic forgetting is inevitable and a serious problem for them.

Yamada et al. have proposed a learning algorithm for an associative memory based on the Self-Organizing Map (SOM) [6]. In order to prevent catastrophic forgetting and enable sequential learning with only new information, weights fixed neurons and weights semi-fixed neurons are used in the learning algorithm [12]. That is, the weights of well-trained neurons on the map are fixed, and those of neurons surrounding the weights fixed neurons are semi-fixed. Information is stored by the weights fixed neurons and their weights are not changed once fixed. Hence, in the SOM associative memory, it is not necessary for the old information to be continually relearned by the network. In the SOM associative memory, however, it becomes very difficult to preserve the topology from the space of inputs to the plane of the output units as the number of additional data to be learned increases.

In this paper, we propose a novel sequential learning algorithm for the SOM associative memory. In the proposed learning algorithm, when a new training datum is applied to the SOM associative memory, it is stored in a small buffer together with some previously learned data which are extracted from the weights of fixed neurons. Then the data stored in the buffer are learned by the SOM algorithm [6] within a small circular area of the map. Hence, the topological relationships of the data stored in the buffer are preserved within this area. Since this remapping within a small area of the map occurs every time a new datum is applied to the SOM associative memory, the memory can gradually construct a topology preserving map for all inputs. The topology-preserving ability contributes to the structural robustness of the SOM associative memory. Moreover, we can expect that the SOM associative memory can be applied to various intelligent tasks such as categorization, concept formation, knowledge discovery and so on. A number of computer simulation results show the effectiveness of the proposed learning algorithm.
2 SOM Associative Memory

2.1 Structure of SOM Associative Memory
Ichiki et al. have proposed a supervised learning algorithm for the Self-Organizing Map (SOM) and have shown that the SOM can behave as an associative memory [10]. The SOM associative memory regards the input vector as consisting of several different components. Figure 1 shows the structure of a SOM associative memory when the input layer is divided into two parts. The learning algorithm for the SOM associative memory is based on the conventional SOM algorithm [6].
Fig. 1. Structure of SOM associative memory [10].
In the SOM associative memory, since each training datum is stored by a small number of neurons on the map, we can retrieve the stored data by examining the weights of the winner neuron on the map.
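As an illustration, retrieval from the winner neuron could look like the following Python sketch; the split of the weight vector into an F part followed by an R part, and the name recall_from_winner, are our own assumptions for illustration and are not taken from [10].

import numpy as np

def recall_from_winner(W, x_query, M, given="F"):
    # W: (n_neurons, M+N) weight matrix of the map; x_query: (M+N,) input in which
    # only the known component (F or R) is filled in and the other part is zero.
    winner = np.argmin(np.linalg.norm(W - x_query, axis=1))  # nearest neuron on the map
    # Read the stored counterpart out of the winner neuron's weights:
    # the first M weights correspond to F, the remaining N to R (assumed layout).
    return W[winner, M:] if given == "F" else W[winner, :M]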
2.2 Conventional Sequential Learning for SOM Associative Memory
Yamada et al. have proposed a sequential learning algorithm for the SOM associative memory and have shown that new information can be stored without retraining previously learned information [12]. In this learning algorithm, the weights of well-trained neurons of the SOM associative memory are fixed [9], and the weights of neurons surrounding a weights fixed neuron become harder to learn in inverse proportion to the distance from the nearest fixed neuron. Here, we briefly review the conventional sequential learning algorithm [12].

Let us consider that the learning vector X^{(k)} consists of two components:

X^{(k)} = \begin{pmatrix} F^{(k)} \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ R^{(k)} \end{pmatrix}    (1)

where X^{(k)} \in \{-1,1\}^{M+N}, F^{(k)} \in \{-1,1\}^{M} and R^{(k)} \in \{-1,1\}^{N}. For efficiency of the process and mathematical convenience, the norm of X^{(k)} is normalized to u:

X^{(k)*} = \frac{u}{\sqrt{M+N}} X^{(k)} = \frac{u}{\sqrt{M+N}} \begin{pmatrix} F^{(k)} \\ 0 \end{pmatrix} + \frac{u}{\sqrt{M+N}} \begin{pmatrix} 0 \\ R^{(k)} \end{pmatrix}.    (2)

Let W_i be the connection weights between the input layer and the map:

W_i = \begin{pmatrix} W_i^F \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ W_i^R \end{pmatrix}    (3)
where i is a position vector indicating the position of a neuron on the map. The connection weights W_i are then learned by the following procedure:

1. Choose the initial values of the weights randomly.
2. Calculate the Euclidean distance between X^{(k)*} and W_i, d(X^{(k)*}, W_i).
3. Find the neuron which satisfies \min_i [d(X^{(k)*}, W_i)] and let r be the position vector of this neuron.
4. If d(X^{(k)*}, W_r) < d_f, the weights of the neuron r are fixed. Weights other than those of the fixed neurons are learned by the following rule:

W_i(t+1) = W_i(t) + H(d) \cdot \alpha(t) \cdot h_{ri} \cdot (X^{(k)*} - W_i(t))    (4)
where t denotes discrete time and h_{ri} is the neighborhood function:

h_{ri} = \exp\!\left( \frac{-\|r - i\|^2}{2\sigma(t)^2} \right).    (5)

\alpha(t) and \sigma(t) are the following decreasing functions:

\alpha(t) = \frac{-\alpha_0 (t - T)}{T}    (6)

\sigma(t) = \frac{t(\sigma_f - \sigma_i)}{T} + \sigma_i    (7)

where T denotes the upper limit of learning iterations, \alpha_0 is the initial value of \alpha(t), and \sigma(t) varies from \sigma_i to \sigma_f (\sigma_i > \sigma_f). H(d) expresses the effect of the nearest fixed neuron:

H(d) = \frac{1 - \exp(-d \cdot k)}{1 + \exp(-d \cdot k)}    (8)

where d denotes the Euclidean distance between neuron i and the nearest weights fixed neuron on the map, and k defines the slope of H(d). Owing to H(d), the weights of neurons close to the fixed neurons are semi-fixed, that is, they become hard to learn.

5. Steps 2-4 are iterated until d(X^{(k)*}, W_r) < d_f is satisfied.
6. Steps 2-5 are iterated for all k.

Information is stored by the weights fixed neurons and their weights are not changed once fixed. Hence, the SOM associative memory can store new information without retraining previously learned information. In the recall process, an output is obtained by the following rule:

O = \begin{cases} W_r^R \cdot \frac{\sqrt{F^2 + R^2}}{u} & \text{if } F \text{ is input} \\[4pt] W_r^F \cdot \frac{\sqrt{F^2 + R^2}}{u} & \text{if } R \text{ is input.} \end{cases}    (9)

Owing to this process, the SOM associative memory can behave as a bidirectional associative memory (BAM) [3]. Figure 2 shows the structure of the SOM associative memory using weights fixed and semi-fixed neurons.
Fig. 2. Structure of SOM associative memory using weights fixed neurons (f) and weights semi-fixed neurons (s) [12].
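For readers who prefer code, the following is a minimal numpy sketch of one iteration of steps 2-4 (Eqs. (4)-(8)); the function names semi_fix and update_step, and the array layout of the weights and map coordinates, are our own illustrative assumptions rather than part of the original algorithm description.

import numpy as np

def semi_fix(d, k=1.0):
    # H(d) of Eq. (8): 0 at a fixed neuron, approaching 1 far away from it.
    return (1.0 - np.exp(-d * k)) / (1.0 + np.exp(-d * k))

def update_step(W, x, pos, fixed, t, T=3000, alpha0=0.1, sigma_i=3.0, sigma_f=0.5, k=1.0):
    """One iteration of steps 2-4 for a normalized input x = X^(k)*.
    W: (n, M+N) weights, pos: (n, 2) map coordinates, fixed: (n,) boolean mask."""
    r = np.argmin(np.linalg.norm(W - x, axis=1))                  # winner neuron (step 3)
    alpha = -alpha0 * (t - T) / T                                 # Eq. (6)
    sigma = t * (sigma_f - sigma_i) / T + sigma_i                 # Eq. (7)
    h = np.exp(-np.sum((pos - pos[r]) ** 2, axis=1) / (2.0 * sigma ** 2))   # Eq. (5)
    if fixed.any():
        # Euclidean distance from every neuron to its nearest weights fixed neuron.
        d = np.min(np.linalg.norm(pos[:, None, :] - pos[fixed][None, :, :], axis=2), axis=1)
    else:
        d = np.full(len(W), np.inf)                               # no fixed neurons yet: H(d) = 1
    coef = semi_fix(d, k) * alpha * h                             # Eq. (4)
    coef[fixed] = 0.0                                             # fixed neurons never change
    return W + coef[:, None] * (x - W), r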
3 Sequential Learning with Map Reconstruction
In the conventional sequential learning algorithm [12], it becomes very difficult to preserve the topology from the space of inputs to the plane of the output units as the number of additionally learned data increases, because each datum is learned by the SOM associative memory one by one. If the SOM associative memory can construct a topology preserving map, its structural robustness can be much improved. Moreover, the SOM associative memory can then be applied to categorization of memory items, concept formation, knowledge discovery [11] and so on. Namely, owing to the inherent features of the SOM, we can expect that the SOM associative memory can handle various intelligent tasks which are difficult for the conventional associative memories [1]-[4].

In order to preserve the topological relationships of the input data, we use a buffer of small capacity in the proposed sequential learning algorithm. When a new datum to be learned is applied to the SOM associative memory, some previously learned data are extracted from the weights of the fixed neurons which exist near the winner neuron for the new datum, and they are stored in the buffer together with the new datum. Then, all of the data stored in the buffer are learned by the SOM algorithm [6] within the radius of the farthest of these fixed neurons from the winner neuron on the map. Since this remapping within a small area of the map occurs whenever a new training datum is applied to the SOM associative memory, it gradually constructs a topology preserving map for all inputs.

In the proposed learning algorithm, the weights are learned as follows (a code sketch is given after this list):

1. Choose the initial values of the weights randomly.
2. Calculate d(X^{(k)*}, W_i).
3. Find the winner neuron which satisfies \min_i [d(X^{(k)*}, W_i)] and let r be the position vector of this neuron.
4. Store X^{(k)*} in the buffer, which has a capacity of B.
5. Find the (B-1) fixed neurons nearest to r, store their weights in the buffer in order of nearness to r, and release the fixation of their weights. Let E be the Euclidean distance from r to the farthest of these fixed neurons.
6. Define a circular area C on the map, the center and radius of which are r and E, respectively.
7. Calculate d(X^{(k_B)*}, W_{i_C}), where X^{(k_B)*} is a datum stored in the buffer and W_{i_C} denotes the weights of neuron i_C in C.
8. Find the winner neuron in C which satisfies \min_{i_C} [d(X^{(k_B)*}, W_{i_C})] and let r_C be the position vector of this neuron.
9. The weights of all neurons in C are changed by

W_{i_C}(t+1) = W_{i_C}(t) + H(d) \cdot \alpha(t) \cdot h_{r_C i_C} \cdot (X^{(k_B)*} - W_{i_C}(t))    (10)

where d denotes the Euclidean distance from i_C to the nearest fixed neuron outside of C.
10. Steps 7-9 are iterated until d(X^{(k_B)*}, W_{r_C}) < d_f is satisfied for all k_B = 1, ..., B.
11. Fix the weights of the winner neuron for each X^{(k_B)*}.
12. Steps 2-11 are iterated whenever a new input to be learned is applied to the SOM associative memory.
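The remapping of steps 2-11 can be outlined in Python as follows; this is only a sketch under our own assumptions: remap_with_buffer and the helper train_in_area (assumed to run the learning of Sec. 2.2 restricted to a given set of neurons until every buffered datum satisfies d < d_f) are illustrative names, not taken from the paper.

import numpy as np

def remap_with_buffer(W, x_new, pos, fixed, B, train_in_area):
    """One pass of the proposed algorithm (steps 2-11) for a normalized new input x_new.
    train_in_area(W, data, idx) returns the updated weights and the indices of the
    newly fixed winner neurons (assumed helper, see lead-in)."""
    r = np.argmin(np.linalg.norm(W - x_new, axis=1))            # steps 2-3: winner for the new datum
    buf = [x_new]                                               # step 4: new datum into the buffer
    fixed_idx = np.where(fixed)[0]
    order = np.argsort(np.linalg.norm(pos[fixed_idx] - pos[r], axis=1))
    chosen = fixed_idx[order[:B - 1]]                           # step 5: (B-1) nearest fixed neurons
    buf += [W[i].copy() for i in chosen]                        # recover previously learned data
    fixed[chosen] = False                                       # release their fixation
    E = np.linalg.norm(pos[chosen] - pos[r], axis=1).max() if len(chosen) else 0.0
    area = np.where(np.linalg.norm(pos - pos[r], axis=1) <= E)[0]   # step 6: circular area C
    W, new_fixed = train_in_area(W, buf, area)                  # steps 7-10: relearn inside C
    fixed[new_fixed] = True                                     # step 11: fix the new winners
    return W, fixed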
4 Computer Simulation Results
In the computer simulations, we used the following parameters: \alpha_0 = 0.1, \sigma_i = 3.0, \sigma_f = 0.5, T = 3000, d_f = 10^{-3}, k = 1.0. The size of the map was set to 9 x 9.
4.1 Comparison of Topological Mapping
In this experiment, we examined the topology-preserving ability of the SOM associative memory learned by the proposed learning algorithm, using the 25 two-dimensional training data shown in Fig. 3. The capacity of the buffer was set to B = 9. Figures 4 and 5 show the maps obtained by the conventional and the proposed learning algorithm, respectively; the numbers on each map indicate the winner neurons for the inputs.
Fig. 3. Training data.
Fig. 4. A mapping by conventional algorithm.
Fig. 5. A mapping by proposed algorithm.
As shown in these figures, the topology-preserving ability of the SOM associative memory becomes much better with the proposed algorithm. The average of the correlation coefficients between the input data and the position vectors of the winner neurons was 0.310 for the conventional algorithm and 0.945 for the proposed algorithm, based on 10 trials. Even when B = 5, the proposed algorithm showed much better performance than the conventional algorithm: the average correlation coefficient was 0.675.
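As an aside, the topology measure quoted above could be computed along the following lines; this reflects our reading of "correlation coefficients between input data and the position vectors of winner neurons" and is not a definitive reconstruction of the authors' evaluation code.

import numpy as np

def topology_score(data_xy, winner_xy):
    # Average of the per-dimension correlation coefficients between the 2-D training
    # data and the map coordinates of their winner neurons (assumed interpretation;
    # alignment of the map axes with the data axes is taken care of beforehand).
    r = [abs(np.corrcoef(data_xy[:, j], winner_xy[:, j])[0, 1]) for j in range(2)]
    return float(np.mean(r))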
4.2 Structural Robustness
In this experiment, we compared the structural robustness of the SOM associative memory learned by the proposed algorithm with that of the memory learned by the conventional algorithm. The capacity of the buffer was set to B = 5. The 20 pairs of letters shown in Fig. 6 were stored by both sequential learning algorithms, and then some neurons on the maps were made to fail randomly.
Fig. 6. Alphabetical patterns.
Figure 7 shows the result of this experiment based on 50 trials. In this figure, the failure rate means the rate of neurons on the map which lose their functions, that is, the rate of dead neurons on the map. As shown in this figure, the SOM associative memory learned by the proposed algorithm greatly improves the robustness, because the neurons surrounding each winner neuron store information similar to that of the winner neuron owing to the topology-preserving ability. Therefore, even if a winner neuron loses its function, its neighbor can play the role of the winner neuron.
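For illustration, such a failure test could be simulated roughly as below; the recall-by-nearest-surviving-neuron scheme and the names perfect_recall_rate and M are our own assumptions, not details given in the paper.

import numpy as np

def perfect_recall_rate(W, queries, targets, M, failure_rate, rng):
    # Disable a random fraction of neurons and recall with the nearest surviving one;
    # count how often the retrieved pattern matches the stored target exactly.
    alive = rng.random(len(W)) >= failure_rate
    hits = 0
    for x, target in zip(queries, targets):
        dists = np.linalg.norm(W - x, axis=1)
        dists[~alive] = np.inf                     # dead neurons can never win
        winner = np.argmin(dists)
        recalled = np.sign(W[winner, M:])          # recover the R part from the winner's weights
        hits += np.array_equal(recalled, target)
    return hits / len(queries)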
4.3 Noise Reduction Effect
In this experiment, we examined the sensitivity of the SOM associative memories to noise. B = 5 was used for the proposed algorithm. Figure 8 shows the result when the 20 training pairs shown in Fig. 6 were stored in each network. For learning the BAM [3], the pseudo-relaxation learning algorithm [5] was used. We can see that the SOM associative memories greatly improve the noise reduction effect in comparison with the BAM. In the SOM associative memory learned by the conventional algorithm, almost only the winner neurons store the training data. In contrast, with the proposed algorithm the training data are stored not only by the winner neurons but also by the neurons surrounding them. However, some neurons surrounding the winner neurons are likely to store information slightly different from the training data because of the topology-preserving ability, and such neurons can respond to noisy inputs. Therefore, the SOM associative memory learned by the proposed algorithm shows slightly worse performance than that learned by the conventional algorithm. That is, a tradeoff exists between the structural robustness and the noise reduction effect in the SOM associative memory.
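One simple way to generate such noisy inputs is to flip a randomly chosen fraction of the +-1 elements; the paper only reports the noise level, so this bit-flip model is our assumption.

import numpy as np

def add_noise(pattern, noise_level, rng):
    # Flip each +-1 element independently with probability noise_level.
    flip = rng.random(pattern.shape) < noise_level
    return np.where(flip, -pattern, pattern)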
Fig. 7. Structural robustness of SOM associative memory (perfect recall [%] versus failure rate [%] for the BAM, the conventional algorithm and the proposed algorithm).
Fig. 8. Sensitivity to noise (perfect recall [%] versus noise level [%]).

5 Conclusions
We have proposed a sequential learning algorithm for an associative memory based on the Self-Organizing Map (SOM). In the proposed algorithm, when a new datum to be learned is applied, it is learned within a small area of the map together with some previously stored data which are extracted from the weights of fixed neurons. By repeating this partial remapping every time a new training datum is applied, the SOM associative memory can preserve the topological relationships of the input data. This contributes to its structural robustness. Moreover, computer simulation results showed that the noise reduction effect of the SOM associative memory is much better than that of the conventional BAM. In addition, we have confirmed that the SOM associative memory can easily improve its storage capacity by increasing the size of the map and can deal with one-to-many associations [12]. Since information once stored in the SOM associative memory learned by the proposed algorithm is not destroyed by newly learned information, we can regard this model as a long term memory. In future research, we will apply our method to more intelligent tasks such as concept formation, categorization of memory items and so on.
References

1. Nakano, K.: Associatron - a model of associative memory. IEEE Trans. Syst. Man Cybern. SMC-2 1 (1972) 380-388
2. Hopfield, J.J.: Neural Networks and Physical Systems with Emergent Collective Computational Abilities. Proc. of National Academy of Sciences USA 79 (1982) 2554-2558
3. Kosko, B.: Bidirectional Associative Memories. IEEE Trans. Systems, Man, and Cybernetics 18 1 (1988) 49-60
4. Hagiwara, M.: Multidirectional Associative Memory. Proc. of IEEE and INNS International Joint Conference on Neural Networks 1 (1990) 3-6
5. Oh, H., Kothari, S.C.: Adaptation of the Relaxation Method for Learning in Bidirectional Associative Memory. IEEE Trans. Neural Networks 5 4 (1994) 576-583
6. Kohonen, T.: Self-Organized Formation of Topologically Correct Feature Maps. Biological Cybernetics 43 (1982) 59-69
7. French, R.M.: Using Semi-Distributed Representations to Overcome Catastrophic Forgetting in Connectionist Networks. Proc. of the 13th Annual Cognitive Science Society Conference (1991) 173-178
8. French, R.M.: Dynamically Constraining Connectionist Networks to Produce Distributed, Orthogonal Representations to Reduce Catastrophic Interference. Proc. of the 16th Annual Cognitive Science Society Conference (1994) 335-340
9. Kondo, S., Futami, R., Hoshimiya, N.: Pattern recognition with feature map and the improvement on sequential learning ability. IEICE Japan Technical Report NC95-161 (1996) 55-60
10. Ichiki, H., Hagiwara, M., Nakagawa, M.: Kohonen feature maps as a supervised learning machine. Proc. of IEEE International Conference on Neural Networks (1993) 1944-1948
11. Simula, O., Vesanto, J., Vasara, P.: Analysis of Industrial Systems Using the Self-Organizing Map. Proc. of Second International Conference on Knowledge-Based Intelligent Electronic Systems 1 (1998) 61-68
12. Yamada, T., Hattori, M., Morisawa, M., Ito, H.: Sequential Learning for Associative Memory using Kohonen Feature Map. Proc. of IEEE and INNS International Joint Conference on Neural Networks (1999) paper no. 555