Proc. of Artificial Intelligence Control and Systems (AICS 99), Cork, Ireland, September 1999

A Continuous Self-Organizing Map using Spline Technique for Function Approximation

Michaël AUPETIT, Pierre MASSOTTE, Pierre COUTURIER

LGI2P - Site EERIE - EMA, Parc Scientifique Georges Besse, F 30000 Nîmes, France
e-mail: {aupetit,massotte,pcouturi}@site-eerie.ema.fr

ABSTRACT

We propose a new method called C-SOM for function approximation. C-SOM extends the standard Self-Organizing Map (SOM) with a combination of Local Linear Mapping (LLM) and cubic spline interpolation techniques to improve the generalization capabilities of standard SOMs. C-SOM uses the gradient information provided by the LLM technique to compute a cubic spline interpolation in the input space between neighbouring neurons of the map, bringing first-order continuity at the border hyperplanes of their respective Voronoï regions. We present the case of a one-dimensional map and show that C-SOM performs better than SOM and LLM in an approximation test.

1. INTRODUCTION

Standard Self-Organizing Maps (SOMs) have very interesting properties: they realize a projection from a high-dimensional continuous input space onto the low-dimensional discrete space of the map, preserving the topology of the input space and the density distribution of the data [Kohonen-88]. An output vector (or output weight) can be associated with each neuron so as to realize an associative memory [Ritter&Al-92]. However, this vector quantization is responsible for the poor performance of SOMs in function approximation, because of its discrete representation of the data. SOMs are competitive networks which partition the input space into clusters, each corresponding to a neuron of the map. Each of these neurons is located in the input space by a kernel weight. Any point of a cluster (i.e. a Voronoï region) is projected onto the «closest» neuron of the map according to the Euclidean distance. In that way, SOMs make no difference between points of a cluster «close» to or «far» from the cluster's kernel, and assign the same output vector to both. We propose to interpolate between neighbouring neurons of the map to obtain a continuous variation of output values and thus perform a more accurate function approximation (Figure 1).

[Figure 1: three panels (Standard SOM, Standard LLM, C-SOM) showing, over a 1D input space with neurons N1, N2, N3 and their Voronoï edges, the function to approximate and the corresponding output: the SOM output is piecewise constant, the LLM output uses the gradient information but keeps discontinuities at the Voronoï edges, and the C-SOM output uses a cubic spline interpolation based on the LLM gradient information.]

Figure 1: Comparison between the function approximation methods of SOM, LLM and C-SOM.
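As a minimal sketch (not the authors' code) of the discrete SOM response just described: every input falling in a neuron's Voronoï region receives that neuron's output weight, so the approximation is piecewise constant. The array names (kernel_w, output_w) are illustrative assumptions.

```python
import numpy as np

def som_output(x, kernel_w, output_w):
    """x: input vector; kernel_w: (n_neurons, d_in) kernel weights;
    output_w: (n_neurons, d_out) output weights."""
    winner = np.argmin(np.linalg.norm(kernel_w - x, axis=1))  # closest kernel in the input space
    return output_w[winner]                                   # same output for the whole Voronoï region

# Example: three neurons on a 1D input space, as in Figure 1.
kernel_w = np.array([[0.0], [1.0], [2.0]])   # positions of N1, N2, N3
output_w = np.array([[0.1], [0.8], [0.3]])   # their stored output values
print(som_output(np.array([0.4]), kernel_w, output_w))  # -> [0.1], constant up to the Voronoï edge
```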


A blurring technique has been proposed in [Kohonen-97] to obtain continuous values from the discrete output values of a SOM: a symmetric Gaussian kernel is used in the map space to produce a smoothly decreasing expansion around the output value of each neuron. A combination of RBF neural networks and SOMs is also found in [Renders-95], where each kernel of an RBF network is connected to its neighbours to form a SOM. The former method cannot ensure first-order continuity in the input space between neighbouring neurons of the map, because of the broken-line shape of the map in the input space. The latter method inherits the drawbacks of RBF networks: the learning process can get stuck in local minima of its global cost function, and the symmetrical shape of the Gaussian kernels influences all neighbouring kernels of the input space equally, without any consideration for the real topology of the map.

A Local Linear Mapping (LLM) technique has been proposed in [Ritter&Al-92] to improve the approximation abilities of SOMs. In LLM, each neuron of the SOM stores an input weight Ws(in) with its corresponding output weight Ws(out), together with the local gradient As of this input-output pair calculated during the learning phase. This gradient information is used to produce a first-order expansion in the output space around the representative output weight Ws(out) (Figure 1), leading to the approximation ynet associated with the real input data x:

ynet = Ws(out) + As (x - Ws(in))
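As a hedged illustration (not the authors' implementation) of this LLM first-order expansion, the following sketch assumes each neuron stores its input weight, output weight and local gradient matrix; names and shapes are assumptions for illustration only.

```python
import numpy as np

def llm_output(x, Ws_in, Ws_out, A):
    """x: (d_in,) input; Ws_in: (n, d_in) input weights; Ws_out: (n, d_out) output weights;
    A: (n, d_out, d_in) local gradients learned per neuron."""
    s = np.argmin(np.linalg.norm(Ws_in - x, axis=1))   # winning neuron
    return Ws_out[s] + A[s] @ (x - Ws_in[s])           # first-order expansion around Ws_out

# One neuron approximating f(x) = 2x + 1 locally around x = 0.5:
Ws_in = np.array([[0.5]]); Ws_out = np.array([[2.0]]); A = np.array([[[2.0]]])
print(llm_output(np.array([0.6]), Ws_in, Ws_out, A))   # -> [2.2]
```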

The LLM method has been used to control a robot arm in [Ritter&Al-92] and for function approximation and system identification in [Moshou&Al-97][Martinetz&Al-93][Walter&Al-90]. However, with a small number of neurons, this kind of linear approximation may be inadequate for highly non-linear functions because of the persistent discontinuities along the edges of the Voronoï regions. We notice that LLM extrapolates from the winning neuron's output weight to compute the output value without any consideration for the outputs of the neighbouring neurons of the map. We think these neighbourhood relationships could be very useful for improving the relevance of the output values associated with the input data. We propose to use the neighbourhood relationships to improve the approximation capabilities of SOMs, by interpolating the output information stored by the winning neuron and its neighbours. We use the LLM method to provide gradient support information for a cubic spline interpolation. In that way, we ensure first-order continuity along the Voronoï edges of neighbouring neurons of the map and preserve the useful gradient information provided by LLM, to finally perform a more accurate, non-linear function approximation than LLM's. In the following, a term AB denotes the Euclidean distance between A and B.


2. THE C-SOM METHOD

Let the SOM be a one-dimensional map unfolded in a two-dimensional input space, and let M be a point of that space. The C-SOM method can be decomposed into four phases.

1) C-SOM first finds the «winning» segment [NiNi+1] of the SOM closest to M, where Ni and Ni+1 are neighbouring neurons of the map. Since bisecting lines are defined as the locus of points equidistant from two straight lines, C-SOM uses the bisecting lines (Bj) of each pair of consecutive segments [Nj-1Nj] and [NjNj+1] of the SOM to delimit the «winning» segment [NiNi+1] (Figure 2a).

2) It projects M onto (Bi) and (Bi+1) in the direction of [NiNi+1] and obtains Fi and Fi+1 (Figure 2b).

3) C-SOM then calculates the interpolation parameter Ip as FiM / FiFi+1, and the output values outFi and outFi+1 associated with Fi and Fi+1 are evaluated using the LLM gradient information provided by Ni and Ni+1.

4) Finally, C-SOM computes the output outM associated with M using a cubic spline interpolation between Fi and Fi+1, knowing the interpolation parameter Ip, outFi, outFi+1 and the derivatives at Fi and Fi+1 along [FiFi+1] (Figure 2c).

C-SOM thus ensures first-order continuity of the output both along the Voronoï edges and along the bisecting lines of neighbouring neurons. C-SOM can be compared to LLM in that it extrapolates in the input space in all directions orthogonal to the map, but it differs from LLM in that it uses the neighbourhood information to interpolate in the local principal direction of the map, overcoming LLM's discontinuities in this direction. Hereafter, we extend the C-SOM method to the case of an n-dimensional input space.

[Figure 2, panels (a)-(c): a two-dimensional input space (In1, In2) showing the neurons Ni-2 to Ni+2 of the map with their bisecting lines (Bi-2) to (Bi+2), the projections Fi and Fi+1 of the point M onto (Bi) and (Bi+1), and the interpolated output OutM in the output space.]

Figure 2: The different phases of the C-SOM method. See description in text.

3. GENERALIZATION OF C-SOM METHOD TO N-DIMENSIONAL INPUT SPACES

In this section, we consider a one-dimensional SOM unfolded in an n-dimensional input space. The map can then be described as a broken line in the input space, the discrete equivalent of a Principal Curve [Hastie&Al-89][Mulier&Al-95][Kégl&Al-97]. Our aim is to project any point of the input space onto this map, so as to obtain continuous rather than discrete output information. This continuous projection has to provide the relevant information (the interpolation parameter Ip) for the subsequent interpolation phase. As in the two-dimensional case (Section 2), where we used the bisecting lines to delimit the regions containing all the points of the input space closest to one segment of the map, we define «bisector hyperplanes» as the border hyperplanes of such a region in an n-dimensional space. The bisector hyperplane (Hi) (resp. (Hj)) contains all the input points that project orthogonally onto the plane (Pi) (resp. (Pj)) along the bisecting line (Bi) (resp. (Bj)), both defined by the two neighbouring segments [Ni-1Ni] and [NiNj] (resp. [NiNj] and [NjNj+1]) of the map. C-SOM then follows the four phases of Section 2.

1) To find the closest «winning» segment of the map, we have shown that:

( [NiNj] is the closest segment of the map to M )  ⇔  ( MNi² − MNj² + NiNj² ) / NiNj  =  Maxk [ ( MNi² − MNk² + NiNk² ) / NiNk ]
where a term AB denotes the Euclidean distance between A and B, Ni is the closest neuron of the map to M according to the Euclidean distance, and Ni, Nj and Nk are neighbouring neurons of the map.

2) We can then easily calculate the Euclidean distances MFi and MFj between M and its projections Fi and Fj onto (Hi) and (Hj) in the direction of the «winning» segment [NiNj], as MFi = M'F'i and MFj = M'F'j, where M', F'i and F'j are the orthogonal projections of M, Fi and Fj onto (Pi) and (Pj) respectively. Note that for the extreme neurons N1 and Np of the map, we simply define the hyperplanes (H1) and (Hp) orthogonal to the extreme segments [N1N2] and [Np-1Np] as the border hyperplanes, so that MF1 = M'N1 and MFp = M'Np, and any input point located outside (H1) or (Hp) is associated with the extrapolated value provided by the LLM method.

3) We then calculate F'i and F'j and their respective Euclidean distances to M', and we infer the value of the interpolation parameter as Ip = FiM / FiFj.

4) Our aim is now to improve the LLM approximation by interpolating between two neighbouring neurons of the map, that is, to overcome the inherent discontinuity appearing at the border of the Voronoï regions with standard LLM. Having calculated the interpolation parameter Ip, it is now straightforward to interpolate as presented in [Bézier-87] (Algorithm 1).

Algorithm 1: Cubic spline interpolation

1 - Using the gradients Asi and Asj provided by Ni and Nj, calculate the derivatives along [FiFj] at the extremities Fi and Fj:
    dFi = Asi · (Fj − Fi)
    dFj = Asj · (Fj − Fi)

2 - Calculate the output values at Fi and Fj extrapolated from the gradients of Ni and Nj:
    ynetFi = WsNi(out) + Asi · (Fi − Ni)
    ynetFj = WsNj(out) + Asj · (Fj − Nj)

3 - Calculate the parameters of the cubic spline:
    a = ynetFi
    b = dFi
    c = 3 · (ynetFj − ynetFi) − 2 · dFi − dFj
    d = 2 · (ynetFi − ynetFj) + dFi + dFj

4 - Calculate the interpolated output value associated with the input vector M using the interpolation parameter Ip:
    ynetM = a + b · Ip + c · Ip² + d · Ip³
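As a hedged sketch of Algorithm 1 (not the authors' code), the following Python function takes the quantities produced by phases 1-3 — the projections Fi and Fj, the winning neurons' input and output weights, their LLM gradients, and the interpolation parameter Ip — and returns the interpolated output. All names and array shapes are illustrative assumptions.

```python
import numpy as np

def csom_spline_output(Ip, Fi, Fj, Ni_in, Nj_in, Wi_out, Wj_out, Ai, Aj):
    """Cubic spline interpolation of Algorithm 1 (sketch).

    Ip     : interpolation parameter FiM / FiFj, in [0, 1]
    Fi, Fj : projections of M onto the bisector hyperplanes (phase 2), shape (d_in,)
    Ni_in, Nj_in   : input weights of the winning neurons Ni and Nj, shape (d_in,)
    Wi_out, Wj_out : their output weights, shape (d_out,)
    Ai, Aj : their LLM gradients, shape (d_out, d_in)
    """
    seg = Fj - Fi                               # direction [FiFj]
    # 1 - derivatives along [FiFj] at Fi and Fj
    dFi = Ai @ seg
    dFj = Aj @ seg
    # 2 - LLM extrapolation of the output at Fi and Fj
    ynetFi = Wi_out + Ai @ (Fi - Ni_in)
    ynetFj = Wj_out + Aj @ (Fj - Nj_in)
    # 3 - cubic spline coefficients (Hermite form on [0, 1])
    a = ynetFi
    b = dFi
    c = 3.0 * (ynetFj - ynetFi) - 2.0 * dFi - dFj
    d = 2.0 * (ynetFi - ynetFj) + dFi + dFj
    # 4 - interpolated output at Ip
    return a + b * Ip + c * Ip**2 + d * Ip**3
```

By construction, the value and derivative of this polynomial at Ip = 0 and Ip = 1 match the LLM extrapolations from Ni and Nj, which is what gives the first-order continuity at the bisector hyperplanes.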


Even though there may exist cases where LLM performs better than C-SOM, as shown in Figure 3, it should be possible to improve the C-SOM method by taking into account the curvature of the function to approximate. We call these extensions of LLM and C-SOM: LQM and C-QSOM (Q for «Quadratic») (Section 4).


Figure 3: A particular case where LLM could perform better than C-SOM. In this case, the C-SOM method could be improved using second-order derivatives (C-QSOM).

4. APPLICATION TO FUNCTION APPROXIMATION

In this section, we compare our continuous C-SOM with the standard LLM and the standard SOM. All the maps have a one-dimensional topology. We did not perform a learning phase, because it would have no influence on the inherent approximation capabilities of each map; instead we defined a nine-neuron map organization in the input space (Figure 4). We also defined a two-dimensional sinusoidal function to approximate: S(x,y) = sin(2x+0.5) + sin(2y+0.5).

[Figure 4: positions of the neurons in the input space: N1 (1,1), N2 (0,1), N3 (-1,1), N4 (-1,0), N5 (-1,-1), N6 (0,-1), N7 (1,-1), N8 (1,0), N9 (0,0).]

Figure 4: The 9-neuron map used for the experiment.

The results of the approximation of S using the 9-neuron map with the standard SOM, standard LLM and C-SOM methods are presented in Table 1 as the Mean Square Error (MSE), calculated over the interval [-1.5, 1.5] with a step of 0.05 for both x and y. We also performed the test with the LQM and C-QSOM methods (Section 3). In this test, the LQM curvature is calculated directly from the sinusoidal function, as is the LLM gradient, and C-QSOM uses a fifth-degree polynomial interpolation. Figure 5 shows the differences between the SOM, LLM, C-SOM, LQM and C-QSOM methods in the approximation of the sinusoidal function.
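A minimal sketch of this evaluation protocol (not the authors' code): it computes the MSE of any approximator over the [-1.5, 1.5] grid with a 0.05 step against the target S(x,y) defined above. The approximator passed in is the caller's responsibility; here a trivial placeholder stands in for the SOM, LLM, C-SOM, LQM or C-QSOM output functions.

```python
import numpy as np

def S(x, y):
    """Target function of the experiment."""
    return np.sin(2 * x + 0.5) + np.sin(2 * y + 0.5)

def grid_mse(approximator, lo=-1.5, hi=1.5, step=0.05):
    """MSE of approximator(x, y) against S on the evaluation grid."""
    xs = np.arange(lo, hi + step / 2, step)
    errs = [(approximator(x, y) - S(x, y)) ** 2 for x in xs for y in xs]
    return float(np.mean(errs))

# Example with a trivial constant "approximator" (placeholder only):
print(grid_mse(lambda x, y: 0.0))
```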

Method          Standard SOM   Standard LLM   C-SOM    LQM      C-QSOM
9-neuron map    0.3130         0.0554         0.0446   0.0148   0.0096

Table 1: Mean Square Error of the two-dimensional sinusoidal function approximation.

We see that C-SOM performs better than the standard LLM and the standard SOM in this function approximation test. These results confirm what is observed in Figure 5: C-SOM provides a better approximation of the sinusoidal function because it overcomes the discontinuities between neighbouring neurons of the map present in both the standard LLM and standard SOM approximations.


Table 1 and Figure 5 also show that when LLM is extended to take the local curvature into account, as in the LQM and C-QSOM methods (Section 3), C-QSOM performs about thirty times better than the standard SOM in this approximation test.
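This factor follows directly from the MSE values in Table 1: 0.3130 / 0.0096 ≈ 32.6.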

[Figure 5, panels: the sinusoidal function to approximate, and its approximations by the standard SOM, standard LLM, C-SOM, LQM and C-QSOM methods.]

Figure 5: Sinusoidal function approximation test with 9-neuron maps and different methods. C-SOM and C-QSOM interpolate between neighbouring neurons of the map.

5. CONCLUSION

We have presented a new method called C-SOM for function approximation, based on the standard SOM architecture and the LLM method. We have proposed a continuous projection technique to overcome the drawbacks of the discrete representation of data in SOMs, and we have shown that C-SOM and its second-order extension C-QSOM (which takes the curvature into account) perform better than LLM and SOM in a function approximation test. We are now working on the extension of the C-SOM method to a more useful two-dimensional map. As C-SOM should be able to approximate functions with sufficient accuracy, we intend to apply C-SOM to improve neural control systems based on reinforcement learning models [Sehad&Al-95].

REFERENCES

[Bézier-87] P. Bézier - Courbes et Surfaces - Mathématiques et CAO, vol. 4 - Hermès 1987

[Hastie&Al-89] T. Hastie and W. Stuetzle - Principal Curves - Journal of the American Statistical Association, vol. 84, no. 406, pp. 502-516 - 1989

[Kégl&Al-97] B. Kégl, A. Krzyzak, T. Linder and K. Zeger - A Polygonal Line Algorithm for Constructing Principal Curves - NIPS98, Denver, Colorado - December 1998

[Kohonen-88] T. Kohonen - Self-Organization and Associative Memory - Springer-Verlag 1988

[Kohonen-97] T. Kohonen - Exploration of Very Large Databases by Self-Organizing Maps - International Conference on Neural Networks 1997, vol. 1/4, pp. PL1-PL6 - IEEE 1997

[Martinetz&Al-93] T. Martinetz, S. Berkovich and K. Schulten - «Neural-Gas» Network for Vector Quantization and its Application to Time-Series Prediction - IEEE Transactions on Neural Networks, vol. 4, no. 4 - July 1993

[Moshou&Al-97] D. Moshou and H. Ramon - Extended Self-Organizing Maps with Local Linear Mappings for Function Approximation and System Identification - WSOM97, http://www.cis.hut.fi/wsom97/progabstracts/31.html - 1997

[Mulier&Al-95] F. Mulier and V. Cherkassky - Self-Organization as an Iterative Kernel Smoothing Process - Neural Computation 7, pp. 1165-1177 - MIT 1995

[Renders-95] J.-M. Renders - Algorithmes Génétiques et Réseaux de Neurones - Applications à la commande de processus - Hermès, Paris 1995

[Ritter&Al-92] H. Ritter, T. Martinetz and K. Schulten - Neural Computation and Self-Organizing Maps - Addison-Wesley 1992

[Sehad&Al-95] S. Sehad and C. Touzet - Neural Reinforcement Path Planning for the Miniature Robot Khepera - World Congress on Neural Networks, vol. 2, pp. 350-354, Washington, D.C., USA, July 1995 - INNS Press 1995

[Walter&Al-90] J. Walter, H. Ritter and K. Schulten - Non-Linear Prediction with Self-Organizing Maps - IJCNN 1990, vol. 1, pp. 589-594 - 1990
