Application of Potts-model Perceptron for Binary Patterns Identification

Vladimir Kryzhanovsky, Boris Kryzhanovsky and Anatoly Fonarev¹

Center of Optical Neural Technologies of Scientific Research Institute for System Analysis of Russian Academy of Sciences, 44/2 Vavilov Street, 119333 Moscow, Russian Federation
[email protected], [email protected]

¹ CUNY City University of New York, Department of Engineering and Science, 2800 Victory Blvd. SI, NY 10314

Abstract. We suggest an effective algorithm based on the q-state Potts model that provides an exponential growth of the network storage capacity M ∼ N^(2S+1), where N is the dimension of the binary patterns and S is a free parameter of the task. The algorithm allows us to identify a large number of highly distorted similar patterns. The negative influence of correlations between the patterns is suppressed by choosing a sufficiently large value of the parameter S. We demonstrate the efficiency of the algorithm by the example of a perceptron identifier, but it can also be used to increase the storage capacity of fully connected associative memory systems. Restrictions on S are discussed.

Key words: identification, Potts model, q-state perceptron, storage capacity.

1 Introduction

The storage capacity of the Hopfield model is rather small: it allows one to store only M ∼ N/(2 ln N) patterns. This estimate is correct for randomized patterns, whose binary coordinates are independent random variables. If there are correlations between patterns, the recognizing ability of the Hopfield model decreases drastically. For a long time it was believed that the only way to overcome these difficulties was sparse coding [1],[2]. This technique consists in randomly diluting the informative coordinates with a great number of spurious coordinates. Then, with the aid of a special choice of the threshold and of the level of pattern activity, the storage capacity can be increased up to the maximal value M ∼ (N/(2 ln N))². The last estimate is of theoretical interest only, since it assumes no distortions of the patterns. In the presence of distortions, sparse coding allows one to increase the storage capacity by a factor of 1.5-2 only.

Another way to increase the binary storage capacity, based on q-state models of neural networks, was suggested in [3],[4]. Q-state models were investigated in a lot of papers [5]-[15]. Among them the most well-known is the Potts spin-glass model [5]. The analysis of q-state models showed that they have an extremely large storage capacity M ∼ Nq²/(4 ln Nq), which is a factor of q² greater than that of the Hopfield model. In addition, q-state models have extremely high noise immunity and the ability to recognize patterns in the presence of very large distortions. At present, q-state models of associative memory are the best with regard both to storage capacity and to noise immunity. At the same time, these high parameters of q-state models were practically not exploited until now. The situation changed when the algorithm of mapping binary patterns into q-valued ones was proposed [3]. It was shown in [3] that such a mapping allows one to use q-state neural networks for storing and processing signals of any type and any dimension. Moreover, the mapping eliminates the main difficulty of all associative memory systems, namely the negative influence of correlations between the patterns.

Fig. 1. Two-stage scheme of binary pattern identification: the binary input X is mapped (preprocessing, R^N → R^n) into the q-ary internal image Y, which is fed to the q-state Potts model (perceptron) producing the q-ary output Z (identification).

In the present paper we use the mapping algorithm [3] to create an identifier of binary patterns based on the Potts spin-glass model. The pattern identification includes two stages, as shown schematically in Fig. 1. At the first stage a preprocessing of the input binary signal is done. The preprocessing consists in mapping a binary pattern X belonging to an N-dimensional configuration space into some internal q-nary pattern Y from a space of a different dimension (R^N → R^n). In other words, we bring the black-white image X with N pixels into one-to-one correspondence with the colored image Y with a smaller number of pixels (n = N/r) but a greater number of colors (q = 2^r). The number r > 1 is called the mapping parameter. The preprocessing achieves two goals at once. First, the mapping X → Y eliminates correlations between the input patterns. Second, for identification of the patterns we can use a q-state neural network, whose storage capacity and noise immunity are much higher than those of scalar models of the Hopfield type. For the purposes of illustration we use the simplest variant of the identification system, based on the Potts-model perceptron.


The properties of the q-state model are described in Section 2. The mapping algorithm is described in Section 3. The identification of binary patterns is described in Section 4. Restrictions on the possibility of binary patterns identification are discussed in Section 5.

2 Identification of the q-nary pattern

Let us have a set of n-dimensional q-nary patterns {Y_µ}:

    Y_µ = (y_1^(µ), y_2^(µ), ..., y_n^(µ))                                (1)

where y_i^(µ) ∈ {e_k}_q and {e_k}_q is the set of basis vectors of the q-dimensional space R^q (µ = 1, ..., M; i = 1, ..., n; k = 1, ..., q). It is supposed that each pattern Y_µ is in one-to-one correspondence with the identifier Z_µ:

    Z_µ = (z_1^(µ), z_2^(µ), ..., z_m^(µ))                                (2)

where z_j^(µ) ∈ {e_k}_q, j = 1, ..., m. Usually the identifier Z_µ encodes: i) a directive produced as a result of the input Y_µ; ii) the number µ of the input vector, which allows one to reconstruct the pattern Y_µ itself afterwards; iii) any other information related to Y_µ. We examine the simplest case, when the number µ of the pattern Y_µ is encoded by the vector Z_µ: the sequence of the numbers of the basis vectors along which the unit vectors z_1^(µ), z_2^(µ), ..., z_m^(µ) are directed is just the number µ in the q-nary representation. By k_j = 1, ..., q we denote the number of the basis vector along which the vector z_j^(µ) is directed. Then the number µ is defined by the expression:

    µ = 1 + Σ_{j=1}^{m} (k_j − 1) q^(j−1)                                 (3)
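To make Eq. (3) concrete, here is a minimal Python sketch (ours, not from the paper; the function names are hypothetical) of the round trip between a pattern number µ and the basis-vector indices k_1, ..., k_m stored in the identifier:

```python
def number_to_indices(mu, q, m):
    """Return (k_1, ..., k_m) with k_j in 1..q encoding mu as in Eq. (3)."""
    digits = []
    t = mu - 1                    # shift to 0-based before base-q expansion
    for _ in range(m):
        digits.append(t % q + 1)  # j-th base-q digit, shifted to 1..q
        t //= q
    return digits

def indices_to_number(ks, q):
    """Eq. (3) itself: mu = 1 + sum_j (k_j - 1) * q**(j-1)."""
    return 1 + sum((k - 1) * q**j for j, k in enumerate(ks))

# round-trip check for q = 16, m = 3 (toy values)
assert all(indices_to_number(number_to_indices(mu, 16, 3), 16) == mu
           for mu in range(1, 16**3 + 1))
```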

The problem is to create a network that is able to reconstruct the identifier Z_µ when a distorted pattern Y_µ is presented. The Potts-model perceptron solving this identification problem is shown in Fig. 2. It consists of two layers of q-state neurons (n input and m output neurons). Each neuron of the input layer is connected with all the neurons of the output layer. The number of output neurons (m = 1 + log_q M) is sufficient for coding the given set of patterns {Y_µ}. The interconnection matrix elements are given by the Hebb rule:

    J_ji = Σ_{µ=1}^{M} z_j^(µ) y_i^(µ)+                                   (4)

where y_i^(µ)+ is the row vector transposed to y_i^(µ), 1 ≤ i ≤ n, 1 ≤ j ≤ m. Each J_ji is thus a q × q matrix.
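Since each J_ji is a sum of outer products of unit basis vectors, it can be stored as a q × q block of counts. A small numpy sketch of the learning rule under this representation (our illustration; one-hot vectors are represented by their indices 0..q−1):

```python
import numpy as np

def hebb_tensor(Y_idx, Z_idx, q):
    """Eq. (4): J[j, i] = sum over mu of z_j^mu (y_i^mu)^T, as q x q blocks.

    Y_idx: (M, n) int array, basis-vector indices of the stored images Y_mu
    Z_idx: (M, m) int array, basis-vector indices of the identifiers Z_mu
    """
    M, n = Y_idx.shape
    _, m = Z_idx.shape
    J = np.zeros((m, n, q, q))
    for mu in range(M):
        for j in range(m):
            for i in range(n):
                # outer product of two one-hot vectors adds a single unit entry
                J[j, i, Z_idx[mu, j], Y_idx[mu, i]] += 1.0
    return J
```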


Let an input image Y = (y_1, y_2, ..., y_n) be a distorted copy of the l-th pattern Y_l. The local field created by all the neurons of the input layer and acting on the j-th output neuron is calculated as:

    h_j = h_j0 + Σ_{i=1}^{n} J_ji y_i,    h_j0 = −(n/q) Σ_{µ=1}^{M} z_j^(µ)    (5)

Under the action of the local field h_j the j-th output neuron becomes aligned along the basis vector whose direction is closest to that of the local field. The calculation algorithm is as follows: i) the projections of the local field vector h_j onto all the basis vectors of the q-dimensional space are calculated; ii) the maximal projection is found (let it be the projection onto a basis vector e_max); iii) the output neuron is set to the value z_j = e_max. As was shown in [5]-[7], under this dynamics all the components of the encoding output vector Z_l are retrieved reliably.

Fig. 2. The scheme of the q-state Potts perceptron: n input neurons y_1, ..., y_n, each connected to all m output neurons z_1, ..., z_m.
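The retrieval step thus reduces to an argmax over the q projections of each local field. A sketch continuing the hebb_tensor example above (again ours; the end-to-end sizes are toy values, not taken from the paper):

```python
import numpy as np

def identify(J, Z_idx, y_idx, q):
    """Eq. (5) plus the argmax dynamics: return output indices k_j in 0..q-1.

    J:     (m, n, q, q) tensor built by hebb_tensor
    Z_idx: (M, m) identifier indices, needed for the bias term h_j0
    y_idx: (n,) indices of the (possibly distorted) input image
    """
    m, n, _, _ = J.shape
    out = np.empty(m, dtype=int)
    for j in range(m):
        # bias h_j0 = -(n/q) * sum_mu z_j^mu, accumulated per basis direction
        h = -(n / q) * np.bincount(Z_idx[:, j], minlength=q).astype(float)
        for i in range(n):
            # one-hot y_i selects column y_idx[i] of the q x q block J[j, i]
            h += J[j, i, :, y_idx[i]]
        out[j] = int(np.argmax(h))  # basis vector with the maximal projection
    return out

# toy end-to-end check
rng = np.random.default_rng(0)
q, n, m, M = 16, 50, 3, 200
Y_idx = rng.integers(0, q, size=(M, n))
Z_idx = rng.integers(0, q, size=(M, m))
J = hebb_tensor(Y_idx, Z_idx, q)
y = Y_idx[7].copy()
y[:10] = rng.integers(0, q, size=10)        # distort ~20% of the components
print(identify(J, Z_idx, y, q), Z_idx[7])   # coincide with high probability
```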

The reliability of the perceptron identifier, i.e. the probability P that the number of the input pattern is determined correctly, is given by the expression [14],[15]:

    P = 1 − (mq / (γ √(2π))) exp(−γ²/2)                                   (6)

where

    γ² = (nq(q − 1) / 2M) (1 − b)²                                        (7)

Here γ is the so-called signal-to-noise ratio and b is the level of distortions in the input pattern (nb is the number of distorted pixels). It follows from (6) that P → 1 if γ² > 2 ln mq and n >> 1. So the maximal number of patterns that can be reliably identified by the perceptron may be determined from the equality γ² = 2 ln mq in the form:

    M ∼ (nq(q − 1) / (4 ln mq)) (1 − b)²                                  (8)
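Equations (6)-(8) are easy to evaluate numerically. A short sketch (our illustration with assumed parameter values):

```python
import math

def snr2(n, q, M, b):
    """Eq. (7): squared signal-to-noise ratio."""
    return n * q * (q - 1) * (1 - b) ** 2 / (2 * M)

def reliability(n, q, m, M, b):
    """Eq. (6): probability of correct identification."""
    g = math.sqrt(snr2(n, q, M, b))
    return 1 - m * q / (g * math.sqrt(2 * math.pi)) * math.exp(-g * g / 2)

# e.g. n = 250 input neurons, q = 16, m = 3 output neurons,
# M = 1000 stored patterns, 30% of the input components distorted
print(reliability(n=250, q=16, m=3, M=1000, b=0.3))   # ~0.997
```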


Expression (8) allows us to estimate the number of output neurons necessary to enumerate the patterns written into the perceptron memory. Taking into account the equality m = 1 + log_q M, it is easy to obtain the desired estimate in the form:

    m = 2 + ln n / ln q                                                   (9)

We see that the number of output neurons is much less than the number of input neurons. If q is sufficiently large, this value is, as a rule, of the order of 3-4 or slightly larger.

3 Mapping algorithm

Here we describe the algorithm of mapping binary patterns into q-nary ones. The q-nary patterns obtained are then used for the construction of the q-state Potts model. Let X = (x_1, x_2, ..., x_N) be an N-dimensional binary vector, x_i ∈ {0, 1}. We mentally divide it into n fragments containing r elements each (n = N/r). Each i-th fragment (i = 1, ..., n) can be considered as an integer k_i − 1 written in the binary code (1 ≤ k_i ≤ q, q = 2^r). This fragment is associated with the vector y_i = e_{k_i}, where e_{k_i} ∈ {e_k}_q is the k_i-th basis vector of the space R^q. Thus, the vector X as a whole is put in one-to-one correspondence with a set of q-dimensional vectors, i.e. with an internal image Y = (y_1, y_2, ..., y_n). For example, the binary vector X = (01000000) can be split into two fragments of four elements each: (0100) and (0000). The first fragment ("4" in the binary code, k_1 = 5) is associated with the vector y_1 = e_5 in the space of dimension q = 16, and the second fragment ("0" in the binary code, k_2 = 1) is associated with the vector y_2 = e_1. The resulting mapping takes the form X = (01000000) → Y = (e_5, e_1). It is important that the mapping is biunique, so the binary vector X can be restored uniquely from its internal image Y. Moreover, the mapping eliminates correlations between internal images. For example, suppose we have two binary vectors X_1 = (10000001) and X_2 = (10011001) overlapping by 75%. If we apply the mapping procedure with the parameter r = 2 (n = 4, q = 4), we obtain two images Y_1 = (e_3, e_1, e_1, e_2) and Y_2 = (e_3, e_2, e_3, e_2), which overlap by only 50%. Using the same procedure with the mapping parameter r = 4 (n = 2, q = 16), we obtain two images Y_1 = (e_9, e_2) and Y_2 = (e_10, e_10) that do not overlap at all. This means that our q-state perceptron can be used effectively for identification of correlated (similar) binary vectors.
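The mapping and its inverse take only a few lines of Python (a sketch of the procedure just described; the function names are ours):

```python
def binary_to_qary(x, r):
    """Split a binary vector x (len(x) divisible by r) into r-bit fragments
    and return the basis-vector indices k_1..k_n (k_i in 1..q, q = 2**r)."""
    assert len(x) % r == 0
    return [int(''.join(map(str, x[i:i + r])), 2) + 1
            for i in range(0, len(x), r)]

def qary_to_binary(ks, r):
    """Inverse mapping: restore the binary vector from the indices."""
    return [int(bit) for k in ks for bit in format(k - 1, f'0{r}b')]

x = [0, 1, 0, 0, 0, 0, 0, 0]
print(binary_to_qary(x, 4))                    # [5, 1], i.e. Y = (e_5, e_1)
assert qary_to_binary(binary_to_qary(x, 4), 4) == x   # the mapping is biunique
```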

4 Binary pattern identification

Here we describe the work of our model as a whole, i.e. the mapping of the original binary patterns into internal images and the identification of these images with the aid of the q-state perceptron. For a given mapping parameter r we apply the procedure of Sect. 3 to the set of binary patterns {X_µ} ⊂ R^N, µ = 1, ..., M. As a result we obtain a set of n-dimensional q-nary internal images {Y_µ}, where n = N/r and q = 2^r. These images can be considered as randomized ones. With the aid of these images we build the Potts-model perceptron as described in Sect. 2.

Now let us estimate the number of patterns that can be recognized by our two-stage model (see Fig. 1). Let the probability of distortion of a coordinate of the input binary pattern X_µ be p, 0 ≤ p < 1/2. Mapping this vector into the q-nary representation, we obtain the distorted internal image Y_µ with the level of distortion defined as b = 1 − (1 − p)^r. The recognizing properties of the Potts-model perceptron are given by the expressions (6)-(8), in which n = N/r, q = 2^r and 1 − b = (1 − p)^r have to be substituted. In particular, the maximal number of images that can be identified by the perceptron constructed in such a way is

    M ∼ (N/(4r²)) [2(1 − p)]^(2r)                                         (10)

The presence of the factor (1 − p)^r is due to the fact that even small distortions of the components of the binary vector X lead to very large distortions of the components of its q-nary image Y. As a result, the number of patterns that can be recognized under such distortions also decreases. However, it follows from (10) that the value of M increases exponentially when the mapping parameter r increases. Indeed, let us write the mapping parameter in the normed form

    r = S log₂ N                                                          (11)

For simplicity we set p = 0. Then the expression (10) takes the form

    M ∼ N^(2S+1) / (2S log₂ N)²                                           (12)
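The exponential growth predicted by (10) is easy to see numerically. A minimal sketch (with assumed values N = 1000 and p = 0.1):

```python
import math

def capacity(N, r, p):
    """Eq. (10): M ~ N * (2 * (1 - p))**(2 * r) / (4 * r * r)."""
    return N * (2 * (1 - p)) ** (2 * r) / (4 * r * r)

for r in (1, 5, 10, 15):
    print(r, f'{capacity(1000, r, 0.1):.3g}')
# r = 1 gives ~8.1e2 patterns; r = 15 already gives ~5e7
```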

We see that the storage capacity of the described system increases exponentially as the parameter S increases. Already for S ≥ 0.5 it becomes much greater than the storage capacity of any known one-level neural network. When r = 1 (q = 2), the expression (10) describes the functioning of a perceptron based on the scalar Hopfield model. In this case the analysis of (10) shows that even if correlations are absent, the storage capacity does not exceed the relatively small value M_0 ∼ N/(2 ln N). If even small correlations are introduced, the number of recognized patterns becomes smaller still, and in practice the network fails to realize the functions of an associative memory. When the mapping parameter r increases, the picture changes drastically. The network begins to operate as a q-nary model, i.e. its storage capacity increases significantly and the influence of the correlations goes down (see Fig. 3, where the case of biased binary patterns is presented). In Fig. 4 we show the decrease of the recognition error as the parameter r increases, even though both the bias of the binary patterns (a = 0.4) and the distortions of the images to be recognized (p = 0.3) are rather large. We see that for r ≤ 5 the network fails to realize the functions of an associative memory: it recognizes not a single pattern.

5 Restrictions on binary patterns identification

A distorted pattern can be identified only while it remains inside the basin of attraction of its internal image, which requires

    (1 − p)^r > r/N                                                       (15)

Fig. 4. The dependence of the recognition error (1 − P) on the value of the parameter r: the loading parameter M/N = 0.5, 5, 50; p = 0.3; a = 0.4.

As follows from Eq. (15), in the presence of noise (p ≠ 0) the value of the mapping parameter r cannot exceed the critical value

    r_c = ln N / |ln(1 − p)|                                              (16)

Correspondingly, the storage capacity cannot exceed the critical value

    M_cr ∼ N^(2S_c − 1) / (2S_c log₂ N)²                                  (17)

where

    S_c = |ln(1 − p)|⁻¹                                                   (18)

When r increases inside the region 1 ≤ r ≤ r_c, the storage capacity M increases exponentially according to Eq. (12). However, when r > r_c, the network fails to recognize patterns: the basin of attraction becomes so small that a distorted pattern falls outside its boundaries. On the other hand, we see that near the critical value r_c a rather large storage capacity can be obtained.
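The critical values (16)-(18) can be tabulated directly; a short sketch with assumed values N = 1000 and p = 0.1:

```python
import math

def r_critical(N, p):
    """Eq. (16): r_c = ln N / |ln(1 - p)|."""
    return math.log(N) / abs(math.log(1 - p))

def m_critical(N, p):
    """Eqs. (17)-(18): M_cr ~ N**(2*S_c - 1) / (2 * S_c * log2(N))**2."""
    s_c = 1 / abs(math.log(1 - p))
    return N ** (2 * s_c - 1) / (2 * s_c * math.log2(N)) ** 2

print(r_critical(1000, 0.1))           # ~65.6: r must stay below this value
print(f'{m_critical(1000, 0.1):.3g}')  # the corresponding capacity ceiling
```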

6 Discussion of the results and conclusions

The algorithm described above can be used for identification both of binary and of q-nary patterns (in the latter case the preprocessing and the mapping are not necessary). Comparing the expression (8) with the results of the works [11]-[15], we see that for the perceptron algorithm the probability of recognition error is n times less than in the case of fully connected neural networks. In practice this means that the q-state perceptron is able to identify the input vector reliably and give out the correct directive even when fully connected neural networks a priori give out incorrect output signals.

Summarizing the presented results, we can say that the proposed algorithm allows us to create associative memory and identification systems with a storage capacity that is exponentially large with regard to the mapping parameter. These systems are able to work with sets of correlated binary patterns. Applied to binary pattern recognition, the described neural network model is able to store M ∼ N^(2S) patterns, where S can be chosen so that S >> 1. For example, it is possible to store and recognize a large number of binary patterns: M ∼ N^6 if p = 0.1; M ∼ N^3 if p = 0.2; M ∼ N^(3/2) if p = 0.3.

Acknowledgements. The work was supported in part by the Russian Foundation for Basic Research (grant #06-01-00109).

References

1. Perez-Vicente, C.J., Amit, D.J.: Optimized network for sparsely coded patterns. Journal of Physics A, vol. 22, pp. 559-569 (1989)
2. Palm, G., Sommer, F.T.: Information capacity in recurrent McCulloch-Pitts networks with sparsely coded memory states. Network, vol. 3, pp. 1-10 (1992)
3. Kryzhanovsky, B.V., Mikaelian, A.L.: An associative memory capable of recognizing strongly correlated patterns. Doklady Mathematics, vol. 67, No. 3, pp. 455-459 (2003)
4. Kryzhanovsky, B.V., Mikaelian, A.L., Fonarev, A.B.: Vector neural network identifying many strongly distorted and correlated patterns. Int. Conf. on Information Optics and Photonics Technology, Photonics Asia-2004, Beijing, 2004. Proc. of SPIE, vol. 5642, pp. 124-133 (2004)
5. Kanter, I.: Potts-glass models of neural networks. Physical Review A, vol. 37(7), pp. 2739-2742 (1988)
6. Cook, J.: The mean-field theory of a Q-state neural network model. Journal of Physics A, vol. 22, pp. 2000-2012 (1989)
7. Vogt, H., Zippelius, A.: Invariant recognition in Potts glass neural networks. Journal of Physics A, vol. 25, pp. 2209-2226 (1992)
8. Bolle, D., Dupont, P., Huyghebaert, J.: Thermodynamic properties of the q-state Potts-glass neural network. Phys. Rev. A, vol. 45, pp. 4194-4197 (1992)
9. Wu, F.Y.: The Potts model. Reviews of Modern Physics, vol. 54, pp. 235-268 (1982)
10. Nakamura, Y., Torii, K., Munaka, T.: Neural-network model composed of multidimensional spin neurons. Phys. Rev. E, vol. 51, No. 2, pp. 1538-1546 (1995)
11. Kryzhanovsky, B.V., Mikaelyan, A.L.: On the recognition ability of a neural network on neurons with parametric transformation of frequencies. Doklady Mathematics, vol. 65, No. 2, pp. 286-288 (2002)
12. Kryzhanovsky, B.V., Kryzhanovsky, V.M., Mikaelian, A.L., Fonarev, A.: Parametric dynamic neural network recognition power. Optical Memory & Neural Networks, vol. 10, No. 4, pp. 211-218 (2001)
13. Kryzhanovsky, B.V., Litinskii, L.B., Fonarev, A.: Parametrical neural network based on the four-wave mixing process. Nuclear Instruments and Methods in Physics Research A, vol. 502, No. 2-3, pp. 517-519 (2003)


14. Kryzhanovsky, B.V., Litinskii, L.B., Mikaelian, A.L.: Vector-neuron models of associative memory. Proc. of Int. Joint Conference on Neural Networks IJCNN-04, Budapest, 2004, pp. 909-1004 (2004)
15. Kryzhanovsky, B.V., Kryzhanovsky, V.M., Fonarev, A.B.: Decorrelating parametrical neural network. Proc. of IJCNN Montreal-2005, pp. 1023-1026 (2005)
