Dynamics of LVQ Algorithms

Michael Biehl, Anarta Ghosh and Aree Witoelar
Institute for Mathematics and Computing Science, University of Groningen,
PO Box 800, 9700 AV Groningen, The Netherlands
(biehl,anarta,aree)@cs.rug.nl
1. Learning Vector Quantization

Learning Vector Quantization (LVQ) is a class of algorithms used to construct a set of prototype vectors representing different classes of data. The resulting prototypes can be used for nearest-prototype classification in a potentially high-dimensional space. Successful applications of LVQ include problems as diverse as medical image and data analysis (Figure 1), fault detection in technical systems, and the classification of satellite spectral data, to mention only a few.
Figure 1: Samples of healthy (leftmost) and damaged (second from left) boar sperm cells used to train Kohonen's LVQ1 [3, 2], together with the resulting prototypes representing the healthy (third from left) and damaged (rightmost) classes.
2. Model and Algorithms

With the help of the successful theory of on-line learning [1], the dynamics of LVQ algorithms can be analyzed in the following model scenario: $(\vec{\xi}^{\,\mu}, \sigma^{\mu})$ denotes the training example at time step $\mu$, with class label $\sigma^{\mu} \in \{\pm 1\}$; $\vec{w}_{l}$, $l \in \{\pm 1\}$, are the prototype vectors; the training data are distributed as a mixture of Gaussians, $\vec{\xi} \sim \sum_{\sigma = \pm 1} p_{\sigma}\, \mathcal{N}(\lambda \vec{B}_{\sigma},\, v_{\sigma} I_{N \times N})$, where $N$ is the dimensionality of the system, all aforementioned vectors belong to $\mathbb{R}^{N}$, and $p_{\sigma}$ are the class prior probabilities.

Figure 2: Sample training data generated according to the mixture of Gaussians. The data are separable when projected onto the plane spanned by the mean vectors (right panel) but overlap almost completely when projected onto a plane spanned by a randomly chosen pair of orthogonal vectors (left panel). (Here $N = 200$.)

The analysis of the following three LVQ algorithms is presented [3], where $\Theta(\cdot)$ is the Heaviside function and $d_{l} = (\vec{\xi}^{\,\mu} - \vec{w}_{l}^{\mu-1})^{2}$:

LVQ1: $\vec{w}_{l}^{\mu} = \vec{w}_{l}^{\mu-1} + \frac{\eta}{N}\, l \sigma^{\mu}\, \Theta(d_{-l} - d_{l})\, (\vec{\xi}^{\,\mu} - \vec{w}_{l}^{\mu-1})$

LVQ2.1: $\vec{w}_{l}^{\mu} = \vec{w}_{l}^{\mu-1} + \frac{\eta}{N}\, (l \sigma^{\mu})\, (\vec{\xi}^{\,\mu} - \vec{w}_{l}^{\mu-1})$

LFM: $\vec{w}_{l}^{\mu} = \vec{w}_{l}^{\mu-1} + \frac{\eta}{N}\, (l \sigma^{\mu})\, (\vec{\xi}^{\,\mu} - \vec{w}_{l}^{\mu-1})\, \Theta(d_{+\sigma^{\mu}} - d_{-\sigma^{\mu}})$
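To make the model and the three update rules concrete, here is a minimal NumPy sketch. It is our own illustration, not the authors' implementation; the function names (`sample`, `lvq1_step`, `lvq21_step`, `lfm_step`) and the parameter values for $\lambda$, $\eta$, $p_{\sigma}$, $v_{\sigma}$ are assumptions chosen for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Model parameters (illustrative values, not taken from the poster)
N = 200                      # input dimensionality
lam = 1.0                    # cluster separation lambda
p = {+1: 0.5, -1: 0.5}       # class priors p_sigma
v = {+1: 1.0, -1: 1.0}       # class variances v_sigma
eta = 0.5                    # learning rate

# Orthonormal cluster mean directions B_{+1}, B_{-1}
B = {+1: np.eye(N)[0], -1: np.eye(N)[1]}

def sample(rng):
    """Draw one example (xi, sigma) from the mixture of Gaussians."""
    sigma = +1 if rng.random() < p[+1] else -1
    xi = lam * B[sigma] + np.sqrt(v[sigma]) * rng.standard_normal(N)
    return xi, sigma

def distances(w, xi):
    """Squared Euclidean distances d_l to both prototypes."""
    return {l: np.sum((xi - w[l]) ** 2) for l in (+1, -1)}

def lvq1_step(w, xi, sigma):
    """LVQ1: only the winner moves, toward xi if the labels agree,
    away from xi otherwise (the Theta(d_{-l} - d_l) factor)."""
    d = distances(w, xi)
    l = +1 if d[+1] < d[-1] else -1
    w[l] += (eta / N) * l * sigma * (xi - w[l])

def lvq21_step(w, xi, sigma):
    """LVQ2.1 (unrestricted form): both prototypes move at every step,
    with sign l * sigma."""
    for l in (+1, -1):
        w[l] += (eta / N) * (l * sigma) * (xi - w[l])

def lfm_step(w, xi, sigma):
    """Learning From Mistakes: LVQ2.1-type step, executed only when the
    example is currently misclassified (Theta(d_{+sigma} - d_{-sigma}))."""
    d = distances(w, xi)
    if d[sigma] > d[-sigma]:
        for l in (+1, -1):
            w[l] += (eta / N) * (l * sigma) * (xi - w[l])
```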
3. The Analysis

The dynamical analysis comprises the following key steps:
(1) Defining the order parameters $R_{lm} = \vec{w}_{l} \cdot \vec{B}_{m}$ and $Q_{lm} = \vec{w}_{l} \cdot \vec{w}_{m}$.
(2) Constructing a system of ordinary differential equations (ODE) in terms of $R_{lm}$, $Q_{lm}$ in the limit $N \to \infty$, by computing averages over the random data and using the recurrence relations derived from the learning algorithms.
(3) Solving the system of ODE, which yields the dynamics and in turn enables a performance evaluation of the LVQ algorithms in terms of learning curves, convergence, stability, generalization ability ($\epsilon_{g}$), and geometric properties.
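As an illustration of steps (1) and (3), the following sketch (again our own, not from [3]; it reuses `sample`, `lvq1_step`, `B`, `N`, and `rng` from the previous snippet) tracks the order parameters and a Monte Carlo estimate of $\epsilon_{g}$ during a finite-$N$ LVQ1 simulation. The analytic ODE system itself is derived in [3] and not reproduced here.

```python
def order_parameters(w):
    """R_lm = w_l . B_m and Q_lm = w_l . w_m for l, m in {+1, -1}."""
    R = {(l, m): np.dot(w[l], B[m]) for l in (+1, -1) for m in (+1, -1)}
    Q = {(l, m): np.dot(w[l], w[m]) for l in (+1, -1) for m in (+1, -1)}
    return R, Q

def generalization_error(w, rng, n_test=10_000):
    """Monte Carlo estimate of eps_g: error rate of the nearest-prototype
    classifier on fresh data drawn from the mixture."""
    errors = 0
    for _ in range(n_test):
        xi, sigma = sample(rng)
        d = {l: np.sum((xi - w[l]) ** 2) for l in (+1, -1)}
        errors += ((+1 if d[+1] < d[-1] else -1) != sigma)
    return errors / n_test

# On the rescaled time alpha = mu / N, the noisy finite-N trajectories of
# R_lm and Q_lm concentrate (for large N) around the ODE solution.
w = {+1: np.zeros(N), -1: np.zeros(N)}
for mu in range(1, 50 * N + 1):
    xi, sigma = sample(rng)
    lvq1_step(w, xi, sigma)
    if mu % (10 * N) == 0:
        R, Q = order_parameters(w)
        print(f"alpha={mu / N:.0f}  R(+1,+1)={R[(+1, +1)]:.3f}  "
              f"Q(+1,+1)={Q[(+1, +1)]:.3f}  "
              f"eps_g={generalization_error(w, rng):.3f}")
```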
4. Results

Figure 3: LVQ1. Left panel: the learning curve, i.e. the evolution of $\epsilon_{g}$ in the course of training for $\eta = 0.5, 1, 2$; note that the asymptotic $\epsilon_{g}$ increases monotonically with $\eta$. Right panel: trajectory of the prototypes in the plane spanned by $\{\vec{B}_{\pm 1}\}$, plotted as $R_{S-}$ vs. $R_{S+}$.

Figure 4: The achieved generalization error of the algorithms with respect to $p_{+1}$ for the equal (left panel) and unequal (right panel) class variance cases. Dotted: best linear decision boundary; dashed: LVQ2.1 with early stopping; chain: LFM; solid: LVQ1. LVQ1 outperforms all other algorithms; only in the unequal class variance case, for $p_{+1} \approx p_{-1}$, is LVQ2.1 with early stopping more favorable.
5. Outlook and Perspective
• Learning rate schedules, variational optimization of algorithms.
• Adaptive metrics, relevance learning.
• Multi-prototype, multi-class problems.
• Neural gas and self-organizing maps.

Figure 5: Unsupervised Neural Gas algorithm in the model scenario with two clusters of data and three prototype vectors: trajectories of the prototypes in the plane spanned by $\{\vec{B}_{\pm 1}\}$.
Selected Publications

[1] M. Biehl and N. Caticha. The statistical mechanics of on-line learning and generalization. In The Handbook of Brain Theory and Neural Networks, pages 1095-1098. MIT Press, Cambridge, MA, 2003.
[2] M. Biehl, A. Ghosh, and B. Hammer. Learning Vector Quantization: The dynamics of Winner-Takes-All algorithms. Neurocomputing, in press.
[3] A. Ghosh, M. Biehl, and B. Hammer. Dynamical analysis of LVQ type learning rules. In Proceedings of the Workshop on the Self-Organizing-Map. Univ. de Paris I Pantheon-Sorbonne, 2005.
SIREN 2005, Scientific ICT Research Event, 6 October 2005, Eindhoven, the Netherlands