NEURAL NETWORKS FOR QUATERNION-VALUED FUNCTION APPROXIMATION
P. Arena, L. Fortuna, L. Occhipinti, M.G. Xibilia
Dipartimento Elettrico Elettronico e Sistemistico, Universita' degli Studi di Catania, Viale A. Doria 6, 95100 Catania, Italy, (+39) 95-339535
[email protected]
ABSTRACT
In the paper a new structure of Multi-Layer Perceptron, able to deal with quaternion-valued signals, is proposed. A learning algorithm for the proposed Quaternion MLP (QMLP) is also derived. Such a neural network allows functions of a quaternion variable to be interpolated with a smaller number of connections with respect to the corresponding real-valued MLP.
INTRODUCTION
In the last few years, neural network models have been used in a wide variety of applications, due to their capabilities of classification and interpolation of real-valued functions. In particular, the capabilities of the multi-layer perceptron (MLP) structure have been widely investigated, proving the suitability of MLPs with only one hidden layer for the approximation of any continuous real-valued function with an arbitrary degree of accuracy [1]. More recently, complex MLP structures have been proposed [2], [3] and suitable learning algorithms have been derived, in order to deal with complex-valued signals with lower computational complexity and fewer parameters with respect to real MLPs. Two different structures of complex MLPs have been proposed in the literature, differing in the type of activation function which is embedded into the neurons. In [4] it has been proved that the complex MLP proposed in [2], which adopts a bounded, non-analytic activation function, can approximate with an arbitrary degree of accuracy any continuous complex-valued function, while the network structure proposed in [3] can approximate only analytic complex-valued functions. A comparison of the complex and real structures in communications applications is reported in [5], showing that the complex MLP structure can achieve the same performance as the real MLP with a lower number of parameters and real multiplications in the feed-forward phase. These results lead to the conclusion that, when the intrinsic characteristics of the signals are embedded into the neural structure, the computational complexity decreases.
Starting from this observation, the idea arises to extend the MLP structure to the hypercomplex domain. In 1843, the Irish mathematician W.R. Hamilton invented quaternions in order to extend three-dimensional vector algebra to include multiplication and division [6]. Quaternions are generalized complex numbers and represent rotations in space as ordinary complex numbers represent rotations in a plane. Using quaternions to describe finite rotations brings attention to their capability to specify arbitrary rotations in space without degenerating to a singularity [7]. The application of quaternionic analysis to the problems of mathematical physics also enables a unified approach to be formulated for solving all questions arising in the consideration of boundary value problems [8]. Moreover, in the context of molecular dynamics simulations, they have been rediscovered for the integration of the rotational equations of motion of rigid molecules [9]. In the paper a new multi-layer perceptron structure able to deal with quaternions is proposed. In particular, a suitable activation function for a quaternion neuron is considered and a recursive learning algorithm for updating the weights of the quaternion MLP (QMLP) is derived. In Section I, the fundamental properties of quaternions are illustrated. In Section II the notation adopted for the quaternion MLP is described in order to derive the learning algorithm, which is reported in Section III. A numerical example is also reported in Section IV in order to evaluate the suitability of the proposed approach, and a comparison between the QMLP and the real MLP is shown.
QUATERNION ALGEBRA
A quaternion $\tilde q$ is defined as a generalized complex number:
$$\tilde q = q_0 + q_1\vec i + q_2\vec j + q_3\vec k = q_0 + \vec q$$
formed from four different units $(1, \vec i, \vec j, \vec k)$ by means of the real parameters $q_i$ ($i=0,\dots,3$), where $\vec i, \vec j, \vec k$ are the three orthogonal spatial unit vectors. It is convenient to represent $\tilde q$ in the matrix form:
$$\tilde q = [\,q_0,\; q_1,\; q_2,\; q_3\,]^T$$
The conjugate of a quaternion is denoted by $\tilde q^*$ and is defined by:
$$\tilde q^* = q_0 - \vec q$$
Addition and subtraction of two quaternions $\tilde q$ and $\tilde p$ are defined, in the matrix form, as:
$$\tilde q \pm \tilde p = [\,q_0 \pm p_0,\; q_1 \pm p_1,\; q_2 \pm p_2,\; q_3 \pm p_3\,]^T$$
Quaternion multiplication is defined as:
$$\tilde q \otimes \tilde p = \begin{bmatrix} q_0 & -\vec q^{\,T} \\ \vec q & q_0 I + \tilde Q \end{bmatrix}\begin{bmatrix} p_0 \\ \vec p \end{bmatrix}$$
where $I$ is the $3\times 3$ identity matrix and:
$$\tilde Q = \begin{bmatrix} 0 & -q_3 & q_2 \\ q_3 & 0 & -q_1 \\ -q_2 & q_1 & 0 \end{bmatrix}$$
This relation can be obtained by multiplying $\tilde q$ and $\tilde p$ directly, taking into account the products among the units $\vec i, \vec j, \vec k$. It has to be remarked that quaternion multiplication is not commutative, so that the space of quaternions has the algebraic structure of a ring. Unlike spatial vectors, the set of quaternions forms a division algebra, since for each non-zero quaternion there is an inverse such that $\tilde q \otimes \tilde q^{-1} = \tilde q^{-1} \otimes \tilde q = 1$. The inverse is given by:
$$\tilde q^{-1} = \frac{\tilde q^*}{N(\tilde q)}$$
where:
$$\tilde q^* \otimes \tilde q = |\tilde q|^2 = N(\tilde q)$$
is the norm of $\tilde q$. If $N(\tilde q) = 1$, $\tilde q$ is called a unit quaternion.
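To make the operations above concrete, the following is a minimal Python sketch of quaternion arithmetic under the conventions used here; the class and method names are illustrative and are not part of the paper.

```python
import numpy as np

class Quaternion:
    """A quaternion q~ = q0 + q1*i + q2*j + q3*k stored as [q0, q1, q2, q3]."""
    def __init__(self, q0, q1, q2, q3):
        self.q = np.array([q0, q1, q2, q3], dtype=float)

    def __add__(self, other):
        return Quaternion(*(self.q + other.q))    # componentwise addition

    def __sub__(self, other):
        return Quaternion(*(self.q - other.q))    # componentwise subtraction

    def conjugate(self):
        q0, q1, q2, q3 = self.q
        return Quaternion(q0, -q1, -q2, -q3)      # q* = q0 - q_vec

    def __mul__(self, other):                     # quaternion (Hamilton) product
        a0, a1, a2, a3 = self.q
        b0, b1, b2, b3 = other.q
        return Quaternion(a0*b0 - a1*b1 - a2*b2 - a3*b3,
                          a0*b1 + a1*b0 + a2*b3 - a3*b2,
                          a0*b2 - a1*b3 + a2*b0 + a3*b1,
                          a0*b3 + a1*b2 - a2*b1 + a3*b0)

    def norm(self):
        return float(np.dot(self.q, self.q))      # N(q) = q* (x) q = |q|^2

    def inverse(self):
        return Quaternion(*(self.conjugate().q / self.norm()))

q = Quaternion(1.0, 2.0, -1.0, 0.5)
p = Quaternion(0.0, 1.0, 1.0, -2.0)
print((q * q.inverse()).q)    # ~ [1, 0, 0, 0]: q (x) q^-1 = 1
print((q * p).q, (p * q).q)   # different results: multiplication is not commutative
```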
QUATERNION MLP: BACKGROUND OF NOTATION
Let us define a quaternion MLP (QMLP) as an MLP in which both the connection weights and the biases are quaternions, as well as the input and output signals. For this neural structure, the following notation will be adopted:
M: number of layers in the network;
l: layer index (l=0 and l=M denote the input and the output layer, respectively);
$N_l$: number of neurons of the l-th layer;
n: neuron index;
$\tilde x_n^{(l)} = x_{0n}^{(l)} + \vec i\, x_{1n}^{(l)} + \vec j\, x_{2n}^{(l)} + \vec k\, x_{3n}^{(l)}$: quaternion output of the n-th neuron of the l-th layer (in particular $\tilde x_n^{(0)}$, $n=1,\dots,N_0$, is the input signal and $\tilde x_n^{(M)}$, $n=1,\dots,N_M$, is the output); $\tilde x_0^{(l)} = 1$ represents the bias inputs;
$\tilde w_{nm}^{(l)} = w_{0nm}^{(l)} + \vec i\, w_{1nm}^{(l)} + \vec j\, w_{2nm}^{(l)} + \vec k\, w_{3nm}^{(l)}$: quaternion synaptic weight of the n-th neuron of the l-th layer, relative to the m-th output of the (l-1)-th layer;
$\tilde s_n^{(l)} = s_{0n}^{(l)} + \vec i\, s_{1n}^{(l)} + \vec j\, s_{2n}^{(l)} + \vec k\, s_{3n}^{(l)}$: 'net function' relative to the n-th neuron of the l-th layer;
$\tilde b_n^{(l)} = b_{0n}^{(l)} + \vec i\, b_{1n}^{(l)} + \vec j\, b_{2n}^{(l)} + \vec k\, b_{3n}^{(l)}$: bias of the n-th neuron of the l-th layer;
$\tilde t_n = t_{0n} + \vec i\, t_{1n} + \vec j\, t_{2n} + \vec k\, t_{3n}$, $n=1,\dots,N_M$: target of the n-th output of the network.
QMLP LEARNING ALGORITHM
The forward phase of the QMLP is described by the following equations.
Forward phase: for $l=1,\dots,M$ and $n=1,\dots,N_l$:
$$\tilde s_n^{(l)} = \sum_{m=0}^{N_{l-1}} \tilde w_{nm}^{(l)} \otimes \tilde x_m^{(l-1)} \qquad (1)$$
$$\tilde x_n^{(l)} = \tilde\sigma\!\left(\tilde s_n^{(l)}\right) \qquad (2)$$
where the quaternion activation function $\tilde\sigma(\tilde q)$ is defined by:
$$\tilde\sigma(\tilde q) = \sigma(q_0) + \vec i\,\sigma(q_1) + \vec j\,\sigma(q_2) + \vec k\,\sigma(q_3) \qquad (3)$$
and $\sigma(\cdot)$ is the classical real-valued sigmoidal function.
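As an illustration of equations (1)-(3), a minimal sketch of the forward phase is given below. It reuses the Quaternion class sketched earlier and handles the bias as an explicit quaternion term rather than through the fixed input $\tilde x_0^{(l)} = 1$; all function names are illustrative rather than taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def quat_sigmoid(q):
    """Equation (3): the real sigmoid applied to each of the four components."""
    return Quaternion(*sigmoid(q.q))

def qmlp_forward(x, weights, biases):
    """Forward phase, equations (1)-(2).

    x       : list of Quaternion inputs (outputs of layer 0)
    weights : weights[l][n][m] is the Quaternion weight of neuron n (layer l+1)
              relative to output m of the previous layer
    biases  : biases[l][n] is the Quaternion bias of neuron n (layer l+1)
    """
    for l in range(len(weights)):
        next_x = []
        for n in range(len(weights[l])):
            # net function: s_n = bias + sum_m w_nm (x) x_m  (quaternion products)
            s = biases[l][n]
            for m, x_m in enumerate(x):
                s = s + weights[l][n][m] * x_m
            next_x.append(quat_sigmoid(s))    # x_n = sigma~(s_n)
        x = next_x
    return x                                  # outputs of the last layer
```

For a 1-2-1 QMLP as used in the numerical example below, weights would consist of a 2x1 and a 1x2 array of Quaternion objects, plus one Quaternion bias per neuron.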
As in the real back-propagation algorithm [10], the learning procedure involves the presentation of a set of pairs of samples in the quaternion space, which represent the inputs and outputs of the network. During the forward phase the network computes its output and compares it with the target. An error function is then computed as:
$$E = \frac{1}{2}\sum_p \sum_{n=1}^{N_M} N\!\left(\tilde t_n(p) - \tilde x_n^{(M)}(p)\right) \qquad (4)$$
where $p$ indicates the $p$-th pattern.
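A sketch of how the error measure (4) can be evaluated with the pieces above (again with illustrative names; the quaternion norm plays the role of the squared modulus of the output error):

```python
def qmlp_error(patterns, weights, biases):
    """Equation (4): E = 1/2 * sum over patterns and outputs of N(t_n - x_n^(M))."""
    E = 0.0
    for inputs, targets in patterns:                  # each pattern: (inputs, targets)
        outputs = qmlp_forward(inputs, weights, biases)
        for t_n, y_n in zip(targets, outputs):
            E += 0.5 * (t_n - y_n).norm()             # N(.) = squared quaternion modulus
    return E
```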
The weights are updated at time $k+1$ by back-propagating the error in such a way as to perform a steepest descent on a surface in the weight space whose height at any point is equal to the error measure, that is:
$$\tilde w_{nm}^{(l)}(k+1) \propto \tilde w_{nm}^{(l)}(k) - \frac{\partial E}{\partial \tilde w_{nm}^{(l)}} \qquad (5)$$
where [11]:
$$\frac{\partial E}{\partial \tilde w_{nm}^{(l)}} = \frac{\partial E}{\partial w_{0nm}^{(l)}} + \vec i\,\frac{\partial E}{\partial w_{1nm}^{(l)}} + \vec j\,\frac{\partial E}{\partial w_{2nm}^{(l)}} + \vec k\,\frac{\partial E}{\partial w_{3nm}^{(l)}} \qquad (6)$$
Each term of the above relation is computed by applying the chain rule. In the following, the obtained algorithm is reported.
Learning phase: for $l=1,\dots,M$ and $n=1,\dots,N_l$:
where:
Finally, the weights of each connection and the biases are updated with:
$$\tilde w_{nm}^{(l)}(k+1) = \tilde w_{nm}^{(l)}(k) - \varepsilon\,\frac{\partial E}{\partial \tilde w_{nm}^{(l)}}, \qquad \tilde b_n^{(l)}(k+1) = \tilde b_n^{(l)}(k) - \varepsilon\,\frac{\partial E}{\partial \tilde b_n^{(l)}} \qquad (9)$$
where $\varepsilon$ is the learning rate.
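Equations (5), (6) and (9) state that the descent acts independently on the four real components of every quaternion weight and bias. The sketch below illustrates this componentwise update using a finite-difference gradient in place of the closed-form chain-rule terms of the learning phase, so it is only a check of the update rule, not the algorithm derived in the paper; names and the learning-rate value are illustrative.

```python
def quat_gradient(loss, w, eps=1e-6):
    """Equation (6): assemble dE/dw~ from the four real partials dE/dw_c."""
    grad = np.zeros(4)
    for c in range(4):                               # components w0, w1, w2, w3
        w_plus, w_minus = Quaternion(*w.q), Quaternion(*w.q)
        w_plus.q[c] += eps
        w_minus.q[c] -= eps
        grad[c] = (loss(w_plus) - loss(w_minus)) / (2.0 * eps)
    return Quaternion(*grad)

def update(loss, w, lr):
    """Equation (9): w(k+1) = w(k) - eps * dE/dw~, applied componentwise."""
    return Quaternion(*(w.q - lr * quat_gradient(loss, w).q))

# Example: fit a single quaternion weight so that w (x) x approximates a target t
x = Quaternion(0.2, -0.5, 0.1, 0.3)
t = Quaternion(1.0, 0.0, 0.0, 0.0)
w = Quaternion(0.1, 0.1, 0.1, 0.1)
for _ in range(200):
    w = update(lambda wq: 0.5 * ((wq * x) - t).norm(), w, lr=0.5)
```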
NUMERICAL EXAMPLE
In order to evaluate the suitability of the introduced algorithm to approximate quaternionic functions, a numerical example is reported in this section. For the sake of completeness, a real MLP has also been trained. The results reported below show that the QMLP represents a powerful technique to interpolate quaternionic functions, in that it requires a lower number of real parameters to reach the same performance as a real MLP. The function to be interpolated is:
where:
$$\tilde h = [\,1.5,\; -1.3,\; 1.2,\; 0.5\,]^T$$
$$\tilde \gamma = [\,0.5,\; 0.3,\; 0.2,\; -1\,]^T$$
To interpolate the proposed function a set of 500 patterns has been built: 100 have been employed to carry out the learning phase, while the last 400 have been used to evaluate the network generalization capability. A QMLP with one input, one output and 2 hidden neurons has been found to reach good performance during both the learning and the testing phase. The results obtained in the testing phase with the last 50 samples for the real part of the quaternionic function are reported in Fig. 1.
Fig. 1 Comparison between the outputs of the 1-2-1 QMLP and the target on 50 testing samples for the real part.
In order to evaluate the suitability of the QMLP as a quaternionic function interpolator, a real MLP has also been trained using the same 100 learning patterns used for the QMLP. It has been found that a topology with 4 inputs, 4 outputs and 6 hidden neurons has to be adopted in order to reach the same testing performance as the QMLP. The results obtained in the testing phase for the same component of the function are reported in Fig. 2.
Fig. 2 Comparison between the outputs of the 4-6-4 real MLP and the target on 50 testing samples for the real part.
In Fig. 3 and Fig. 4 the comparison between the results obtained from the simulation of the first imaginary component is depicted. Similar results have been obtained for the other imaginary parts of the considered function. As can be observed, the real MLP and QMLP performance are quite similar, but the numbers of real parameters used in the function interpolation are strongly different. In fact, while the real MLP employs 58 real parameters, the QMLP uses 7 quaternionic connections, which correspond to 28 real parameters. Under these considerations, the results obtained acquire much interest.
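These parameter counts can be verified directly; the following is a worked check, assuming one bias per hidden and output neuron, rather than text from the paper:
$$\text{real 4-6-4 MLP: } (4\cdot 6 + 6) + (6\cdot 4 + 4) = 30 + 28 = 58 \text{ real parameters}$$
$$\text{QMLP 1-2-1: } (1\cdot 2 + 2) + (2\cdot 1 + 1) = 7 \text{ quaternion parameters} = 7\cdot 4 = 28 \text{ real parameters}$$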
Fig. 3 Comparison between the outputs of the 1-2-1 QMLP and the target on 50 testing samples for the first imaginary component.
Fig. 4 Comparison between the outputs of the 1-2-1 QMLP and the target on 50 testing samples for the first imaginary component.
CONCLUSIONS
A learning algorithm, based on the quaternion algebra, has been developed for training supervised feed-forward neural networks in order to interpolate quaternionic functions. The suitability of the proposed strategy has been shown via a numerical example, and a comparison between the QMLP and a real MLP has been carried out. The results obtained have shown the suitability of the quaternionic MLP as a powerful strategy to interpolate quaternionic functions with a smaller number of real parameters.
REFERENCES
[1] G. Cybenko, "Approximation by superpositions of a sigmoidal function", Math. Control Signals Systems, vol. 2, pp. 303-314, 1989.
[2] N. Benvenuto, F. Piazza, "On the complex backpropagation algorithm", IEEE Trans. on Signal Processing, vol. 40, no. 4, April 1992.
[3] H. Leung and S. Haykin, "The complex backpropagation algorithm", IEEE Trans. on Signal Processing, vol. 39, no. 9, September 1991.
[4] P. Arena, L. Fortuna, R. Re, M.G. Xibilia, "On the capability of neural networks with complex neurons in complex valued functions approximation", Proc. ISCAS 1993.
[5] N. Benvenuto, M. Marchese, F. Piazza, A. Uncini, "A comparison between real and complex valued neural networks in communication applications", Proc. 1991 Intern. Conf. on Artificial Neural Networks, Espoo, Finland.
[6] W.R. Hamilton, "Elements of Quaternions", Chelsea Pub., New York, 1969.
[7] J.K. Chou, "Quaternion kinematic and dynamic differential equations", IEEE Trans. on Robotics and Automation, vol. 8, no. 1, February 1992.
[8] K. Guerlebeck, W. Sprossig, "Quaternionic Analysis and Elliptic Boundary Value Problems", Int. Series of Numerical Mathematics, vol. 89, Birkhauser.
[9] G.R. Kneller, "Quaternions as a tool for the analysis of molecular systems", J. Chim. Phys., pp. 2709-2715, Elsevier, 1991.
[10] D.E. Rumelhart, J.L. McClelland, "Learning internal representations by error propagation", in Parallel Distributed Processing, vol. 1, MIT Press, Cambridge, USA, 1986.
[11] A. Sudbery, "Quaternionic analysis", Math. Proc. Camb. Phil. Soc., no. 85, pp. 199-225, 1979.