Auto-Associative Memory Based on a New Hybrid Model of SFNN and GRNN: Performance Comparison with NDRAM, ART2 and MLP

Hamed Davande, Mahmood Amiri, Alireza Sadeghian, Sylvain Chartier

H. Davande is with the Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran (e-mail: davande@aut.ac.ir). M. Amiri is with the Department of Electrical and Computer Engineering, University of Tehran, Tehran, Iran; to whom correspondence should be addressed (e-mail: [email protected]). A. Sadeghian is with the Department of Computer Science, Ryerson University, Toronto, Ontario, Canada (e-mail: [email protected]). S. Chartier is with the School of Psychology, University of Ottawa, Ottawa, Canada (e-mail: [email protected]).

Abstract— Associative neural networks (AsNNs) are currently among the most extensively studied and best understood neural paradigms. In this paper, we use a hybrid neural network model for associative recall of analog and digital patterns. The hybrid model, which consists of a self-feedback neural network (SFNN) structure in parallel with a generalized regression neural network (GRNN), was first proposed by the authors of this paper. First, patterns are stored as the asymptotically stable fixed points of the SFNN. In the retrieval process, each new pattern is applied to the GRNN, which produces the corresponding initial conditions for that pattern; these initial conditions then initialize the dynamical equations of the SFNN. In this way, the stored patterns, as well as noisy versions of them, are retrieved. Several simulations show that the performance of the hybrid model is better than that of a recurrent associative memory and a feed-forward multilayer perceptron, and is comparable with the performance of hard-competitive models.

I. INTRODUCTION

In general, desired memory patterns are represented by binary or real-valued (analog) vectors [1]. Hopfield presented continuous-time feedback neural networks, which provided a way of storing analog patterns [2]. The aim of such networks is to retrieve a previously learned pattern from an example that is similar to, or a noisy version of, one of the previously presented patterns. This property of associative neural networks makes them well suited to a variety of applications such as image segmentation [3] and recognition of chemical substances [4]. Adaptive Resonance Theory (ART) is an unsupervised paradigm based on competitive learning that is capable of automatically finding categories and creating new ones when they are needed. ART1 was developed to perform clustering on binary-valued patterns; later ART structures, extensions of ART1, were developed to handle real-valued patterns [5]. Recently, a nonlinear dynamic recurrent associative memory (NDRAM) for storing analog and digital patterns was presented by Chartier et al. [6]. It is based on an unsupervised attractor neural network and is able to develop analog and bipolar attractors. Moreover, the model develops fewer spurious attractors and has better recall performance under random noise than any other Hopfield-type neural network.

Self-feedback neural networks (SFNN) are simple recurrent networks that require shorter training for associative recall of memory patterns. An SFNN is a two-layer network whose output layer contains self-feedback units. The self-feedback connections ensure that the output of the SFNN contains the complete past information of the system, even if the inputs of the SFNN are only the present states and inputs of the system [7].

The generalized regression neural network (GRNN) was introduced by Nadaraya [8] and Watson [9] and rediscovered by Specht [10]. This model is a generalization of both radial basis function networks (RBFNs) and probabilistic neural networks (PNNs) and can perform linear and nonlinear regression [11]. These networks are basis-function architectures that approximate an arbitrary function between input and output vectors directly from training samples and can be used for multi-dimensional interpolation. GRNNs are not as commonly used as RBFNs or MLP networks. They have been applied to a variety of problems including prediction, control, plant process modeling, and general mapping problems [12].

In this paper, we use the hybrid model of SFNN and GRNN proposed in [13] for associative recall of analog and digital patterns. First, patterns are stored as the asymptotically stable fixed points of the SFNN by a new training algorithm developed by the authors of this paper in [14]. Next, we use these input patterns as the input vectors of the GRNN and the desired initial conditions of the SFNN as the desired outputs of the GRNN. These desired initial conditions are obtained by selecting an arbitrary point in the attraction domain of each stable equilibrium point. In the recognition stage, each new pattern is first applied to the GRNN to produce the corresponding initial conditions for that pattern; these initial conditions are then used to initialize the dynamical equations of the SFNN (a minimal sketch of this pipeline is given at the end of this section).

The rest of this paper is organized as follows. In Section II, the SFNN and GRNN models are described, followed by the hybrid model and its working mechanism. Section III presents the experimental results and comparisons with different classes of neural networks: auto-associative (NDRAM, [6]), competitive (ART2, [15]) and feed-forward (MLP, [16]). Finally, Section IV concludes the paper.
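As a high-level illustration of this two-stage procedure, the following sketch simply wires the pieces together; train_sfnn, train_grnn, and recall_sfnn are hypothetical stand-ins for the operations detailed in Section II, not functions defined by the authors.

import numpy as np

def build_memory(patterns, initial_conditions, train_sfnn, train_grnn):
    """Storage stage: patterns become stable fixed points of the SFNN, and the
    GRNN learns to map each pattern to a point inside its basin of attraction."""
    sfnn = train_sfnn(patterns)                      # fixed-point storage
    grnn = train_grnn(patterns, initial_conditions)  # pattern -> initial condition
    return sfnn, grnn

def recall(probe, sfnn, grnn, recall_sfnn):
    """Recognition stage: the GRNN supplies initial conditions for the probe,
    and the SFNN dynamics converge to the corresponding stored pattern."""
    x0 = grnn(probe)
    return recall_sfnn(sfnn, x0)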


II. METHODS

A. Self-Feedback Neural Network Model

The architecture of the SFNN model is depicted in Fig. 1. Its mathematical description is as follows [7]:

$$S_j(k) = \sum_{i=1}^{n} W_{ij}^{I}\, u_i(k) + W_j^{D}\, X_j(k-1) \qquad (1)$$

$$X_j(k) = f\!\left(S_j(k)\right) \qquad (2)$$

where $u_i(k)$ ($i = 1, \ldots, n$) denote the external inputs, and $S_j(k)$ and $X_j(k)$ ($j = 1, \ldots, m$) are the state variable and the output of the $j$th neuron of the output layer, respectively. $f(\lambda)$ is the activation function, defined by $f(\lambda) = 1/(1 + e^{-\lambda})$. $W_{ij}^{I}$ and $W_{j}^{D}$ are the connection weights from the input layer to the output layer and the self-feedback weights within the output layer, respectively.

Fig. 1. The structure of the SFNN model.
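As a companion to (1)-(2), the following is a minimal sketch of one update step of the SFNN dynamics, assuming the input weights W_in (the matrix W^I) and the self-feedback coefficients w_d (the W_j^D) are already available; the function and parameter names are ours, not the paper's. How these weights are obtained is the subject of the storage algorithm below.

import numpy as np

def sigmoid(s):
    """Activation f(lambda) = 1 / (1 + exp(-lambda)), as in the SFNN model."""
    return 1.0 / (1.0 + np.exp(-s))

def sfnn_step(u, x_prev, W_in, w_d):
    """One iteration of (1)-(2).

    u      : external input vector, shape (n,)
    x_prev : previous outputs X_j(k-1), shape (m,)
    W_in   : input-to-output weights W^I, shape (n, m)
    w_d    : self-feedback coefficients W_j^D, shape (m,)
    """
    s = W_in.T @ u + w_d * x_prev   # state variables S_j(k), eq. (1)
    return sigmoid(s)               # outputs X_j(k), eq. (2)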

The authors of this paper fully investigated and presented in [14] a simple and efficient algorithm for storing the desired number of stable equilibrium points in the SFNN. The algorithm can be briefly summarized as follows (a sketch in code follows the list):

1. Choose an arbitrary value greater than 4 for the self-feedback coefficient of each self-feedback unit ($w_j^D > 4$).

2. For the selected value of $w_j^D$, calculate the following terms for each self-feedback unit:

$$b_{j1} = \ln\!\left(\frac{w_j^D - \sqrt{(w_j^D)^2 - 4 w_j^D}}{w_j^D + \sqrt{(w_j^D)^2 - 4 w_j^D}}\right) - \frac{w_j^D - \sqrt{(w_j^D)^2 - 4 w_j^D}}{2}$$

$$b_{j2} = \ln\!\left(\frac{w_j^D + \sqrt{(w_j^D)^2 - 4 w_j^D}}{w_j^D - \sqrt{(w_j^D)^2 - 4 w_j^D}}\right) - \frac{w_j^D + \sqrt{(w_j^D)^2 - 4 w_j^D}}{2}, \qquad j = 1, \ldots, m$$

3. To use the maximum capacity of the SFNN, adjust the value of the parameter $b_j$ to $b_{j1}$.

4. Use (3), which determines each input weight $w_{ij}^{I}$ ($i = 1, \ldots, n$; $j = 1, \ldots, m$) from the chosen bias $b_j$, the self-feedback coefficient $w_j^D$, and the sum of the input components $\sum_{l=1}^{n} u_l$, to update the input weight matrix ($W^I$) for each input vector. This ends the training process of the SFNN model.

5. In the recovering stage, (2) and (3) are used to calculate the output of the SFNN model. Note that, for each input vector, this recursive equation should be computed until it converges to its stable state, i.e., until a predefined threshold is reached.
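The following is a minimal sketch of this storage and recovery procedure for a single stored pattern. It assumes that the bias b_j is placed at the midpoint of the bistable range [b_j2, b_j1] and that the input weights of each unit are scaled so that the stored pattern delivers exactly b_j; both are simplifying assumptions for illustration, not necessarily the exact rule of (3).

import numpy as np

def bistable_bias_bounds(w_d):
    """Bounds b_j1, b_j2 of the bias range in which a self-feedback unit
    x = f(w_d * x + b) with w_d > 4 has two stable fixed points."""
    r = np.sqrt(w_d * (w_d - 4.0))
    x_lo, x_hi = (w_d - r) / (2 * w_d), (w_d + r) / (2 * w_d)  # tangency points
    b1 = np.log(x_lo / (1 - x_lo)) - w_d * x_lo
    b2 = np.log(x_hi / (1 - x_hi)) - w_d * x_hi
    return b1, b2

def train_sfnn(u, w_d=6.0):
    """Illustrative storage rule: pick b_j inside the bistable range and scale
    the input weights of unit j so that pattern u delivers exactly b_j."""
    n = u.shape[0]
    b1, b2 = bistable_bias_bounds(w_d)
    b = 0.5 * (b1 + b2)                  # assumed choice of b_j (midpoint)
    W_in = np.full((n, n), b / u.sum())  # one output unit per component (m = n)
    return W_in, np.full(n, w_d)         # input weights and self-feedback vector

def recall_sfnn(u, x0, W_in, w_d, n_iter=200):
    """Iterate the SFNN dynamics (1)-(2) from the GRNN-supplied initial state x0."""
    x = x0.copy()
    for _ in range(n_iter):
        x = 1.0 / (1.0 + np.exp(-(W_in.T @ u + w_d * x)))
    return x

In practice the iteration would be stopped once successive outputs differ by less than a predefined threshold, as described in step 5.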

B. Generalized Regression Neural Network

The main function of a GRNN is to estimate a linear or nonlinear regression surface on the independent variables (input vectors) X, given the dependent variables (desired output vectors) Y. That is, the network computes the most probable value of an output, ŷ, given only the training vectors X. Specifically, the network computes the joint probability density function of X and Y. The expected value of Y given X is then expressed as [11]:

$$E[Y \mid X] = \frac{\int_{-\infty}^{\infty} Y\, f(X, Y)\, dY}{\int_{-\infty}^{\infty} f(X, Y)\, dY} \qquad (4)$$

An important advantage of the GRNN is its simplicity and fast approximation procedure. Another attractive feature is that, unlike back-propagation-based neural networks (BPNN), the GRNN does not converge to local minima [10].

Fig. 2. The GRNN architecture.

The topology of a GRNN is described in Fig. 2. It consists of: 1) the input layer, which is fully connected to the pattern layer; 2) the pattern layer, which has one unit for each pattern and computes the Gaussian pattern function

$$h_i = \exp\!\left(-\frac{(x - X_i)^T (x - X_i)}{2\sigma^2}\right) \qquad (5)$$

where $\sigma$ denotes the smoothing parameter, $x$ is the input presented to the network, and $X_i$ is each of the training vectors; 3) the summation layer, which has two units, N and S: the first unit computes the weighted sum of the hidden-layer outputs, and the second unit has weights equal to "1" and is consequently the summation of the exponential terms $h_i$ alone; 4) finally, the output unit, which divides N by S to provide the prediction result.
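As a companion to (4)-(5), here is a minimal sketch of a GRNN prediction (a Nadaraya-Watson estimator with a Gaussian kernel); grnn_predict, X_train, Y_train, and sigma are our placeholder names for the stored data and bandwidth, not identifiers from the paper.

import numpy as np

def grnn_predict(x, X_train, Y_train, sigma=0.5):
    """GRNN output for a single probe x.

    x       : probe vector, shape (d,)
    X_train : stored input vectors, shape (p, d)  -- one pattern unit each
    Y_train : desired outputs,      shape (p, q)
    sigma   : smoothing parameter of the Gaussian pattern units, eq. (5)
    """
    d2 = np.sum((X_train - x) ** 2, axis=1)   # (x - X_i)^T (x - X_i)
    h = np.exp(-d2 / (2.0 * sigma ** 2))      # pattern-layer outputs h_i
    N = h @ Y_train                           # weighted-sum unit
    S = h.sum()                               # plain-sum unit
    return N / S                              # output unit: N / S

In the hybrid model, X_train would hold the stored patterns and Y_train the desired initial conditions of the SFNN, so that this mapping sends a (possibly noisy) probe to an initial state for the SFNN dynamics.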
