Using Three Layer Neural Network to Compute Multi-valued Functions

Nan Jiang1, Yixian Yang2, Xiaomin Ma3, and Zhaozhi Zhang4

1 College of Computer Science, Beijing University of Technology, Beijing 100022, China
[email protected]
2 Information Security Center, Beijing University of Posts and Telecommunications, Beijing 100876, China
3 Engineering and Physics Department, Oral Roberts University, Tulsa, OK 74171, USA
4 Institute of Systems Science, Academia Sinica, Beijing 100080, China
[email protected]

Abstract. This paper concerns how to compute multi-valued functions using three-layer feedforward neural networks with one hidden layer. First, we define strongly and weakly symmetric functions. We then give a network that computes a specific strongly symmetric function, specifying the number of hidden neurons and using only the weights 1 and -1. Algorithm 1 modifies the weights to real numbers so that arbitrary strongly symmetric functions can be computed, and Theorem 3 extends the results to arbitrary multi-valued functions. Finally, we compare the complexity of our network with that of a binary network; our network needs fewer hidden neurons.

1 Introduction

Multi-valued multi-threshold neural networks can be used to solve many kinds of problems, such as multi-valued logic, machine learning, data mining, and pattern recognition [1-2]. Diep [3] gives the number of functions that can be computed by a multi-threshold neuron and a lower bound on the number of weights required to implement a universal network. Žunić [4] derives an upper bound on the number of linear multi-valued threshold functions defined on certain subsets of {0,1}^n. Ojha [5] enumerates the equivalence classes of linear threshold functions from a geometric lattice perspective in weight space. Ngom [6] points out that one strip (the set of points located between two parallel hyperplanes) corresponds to one hidden neuron and constructs two neural networks based on such hidden neurons to compute a given but arbitrary multi-valued function. Young [7] deals with classification tasks for real-valued inputs. Anthony [1] presents two consistent hypothesis finders to learn multi-threshold functions. Obradović and Parberry [8, 9] view multi-threshold networks as analog networks of limited precision. In [9], the authors give two learning algorithms: one realizes any weakly symmetric q-valued function and the other realizes arbitrary q-valued functions, but they use seven-layer and five-layer neural networks, respectively. In [8], the authors use a simple three-layer network with q-1 hidden


neurons to realize n-variable q-valued functions, but they need to enlarge the input to n+q-1 variables. Our method is different from the above algorithms: we extend the technique of Siu [10] (Chapter 3) from the binary-valued case to the multi-valued case.

2 Preliminaries



R(q)={0,1,…,q-1} is a residue class ring. In this paper, we assume q ≥ 3 and use q in place of 0. A q-valued function f can be denoted by R^n(q)→R(q) or {1,2,…,q}^n→{1,2,…,q}. Assume that X=(x_1,…,x_n)^T is the input vector of a neuron, where x_i∈{q,1,…,q-1}, and W=(w_1,…,w_n)^T is the weight vector, where w_i∈R (R is the real field). Denote $W^T X = \sum_{i=1}^{n} w_i x_i$.

A q-valued (or (q-1)-threshold) neuron Y is defined as

$$
Y(t_1,t_2,\ldots,t_{q-1}) =
\begin{cases}
q, & W^T X \in (-\infty, t_1) \\
1, & W^T X \in [t_1, t_2) \\
\vdots & \vdots \\
q-1, & W^T X \in [t_{q-1}, +\infty)
\end{cases}
$$

where t_1 < t_2 < … < t_{q-1} are the thresholds.
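For illustration, a (q-1)-threshold neuron can be evaluated directly from this definition. The following sketch is ours, not part of the paper; the function name and the bisect-based threshold lookup are our own choices.

```python
from bisect import bisect_right

def multi_threshold_neuron(x, w, thresholds, q):
    """A (q-1)-threshold neuron: output q if W^T X < t_1, output k if
    t_k <= W^T X < t_{k+1} (1 <= k <= q-2), and output q-1 if W^T X >= t_{q-1}."""
    assert len(thresholds) == q - 1 and list(thresholds) == sorted(thresholds)
    s = sum(wi * xi for wi, xi in zip(w, x))   # weighted sum W^T X
    k = bisect_right(thresholds, s)            # number of thresholds <= s
    return q if k == 0 else k                  # below t_1, the output is q (i.e., 0)

# Example: q = 3, two thresholds; inputs take values in {1, 2, 3} (3 stands for 0).
print(multi_threshold_neuron([1, 3, 2], [1, 1, 1], thresholds=[5, 8], q=3))
```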


Case 2: there exists a j such that $\sum_{i=1}^{n} x_i = k_j^{(0)}+1$. Then

$$
Y_{1j}=1, \quad Y_{1l}=\begin{cases} q-1, & l<j \\ q, & l>j \end{cases}
$$
$$
\vdots
$$
$$
Y_{dj}=q, \quad Y_{dl}=\begin{cases} q-1, & l<j \\ q, & l>j \end{cases}
$$
$$
Y_{(d+1)j}=q-1, \quad Y_{(d+1)l}=\begin{cases} q, & l<j \\ q-1, & l>j \end{cases}
$$
$$
\vdots
$$
$$
Y_{(2d)j}=q-2, \quad Y_{(2d)l}=\begin{cases} q, & l<j \\ q-1, & l>j \end{cases}
$$

So, ∑(0)+1 = 2dq-q-d+(2dq-d)(s-1). Computing the other cases similarly, we can get Eq. (2).

$$
\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij} =
\begin{cases}
2dq-(i+1)q-d+(i+1)+(2dq-d)(s-1), & \sum_{i=1}^{n} x_i = k_j^{(i)},\ 0 \le i \le d-2 \\
2dq-(i+1)q-d+(2dq-d)(s-1), & \sum_{i=1}^{n} x_i = k_j^{(i)}+1,\ 0 \le i \le d-2 \\
dq+(2dq-d)(s-1), & \sum_{i=1}^{n} x_i = k_j^{(i)},\ d-1 \le i \le q-3 \\
dq-d+(2dq-d)(s-1), & \sum_{i=1}^{n} x_i = k_j^{(i)}+1,\ d-1 \le i \le q-3 \\
dq+(i-q+2)(q-1)+(2dq-d)(s-1), & \sum_{i=1}^{n} x_i = k_j^{(i)},\ q-2 \le i \le d+q-4 \\
dq+(i-q+3)q-d+(2dq-d)(s-1), & \sum_{i=1}^{n} x_i = k_j^{(i)}+1,\ q-2 \le i \le d+q-4
\end{cases}
\tag{2}
$$

In order to realize q-valued functions, $\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij}$ must take at least q different values. From Eq. (2) we see that this is related to d. Theorem 1 gives the value of d.

Theorem 1. In order to make $\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij}$ take at least q different values, d=q-1 is necessary and sufficient.

Proof. From Eq. (2), we know that $\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij}$ always includes the term (2dq-d)(s-1). Since we work in a residue class ring, this term has no effect on the number of distinct values, so in this proof we omit it.

1. Sufficiency: d=q-1 ⇒ $\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij}$ takes at least q different values.

(1) When 0≤i≤d-2=q-3, ∑(i)=2dq-(i+1)q-d+(i+1)=1-q+i+1=i+2. Since 2≤i+2≤q-1, ∑(i) can take the values 2,3,…,q-1.
(2) When 0≤i≤d-2=q-3, ∑(i)+1=2dq-(i+1)q-d=1-q=1.
(3) When q-2≤i≤d+q-4=2q-5, ∑(i)=dq+(i-q+2)(q-1)=q-i-2. Since 3-q≤q-i-2≤q, ∑(i) can take the values 3,4,…,q.
(4) When q-2≤i≤d+q-4=2q-5, ∑(i)+1=dq+(i-q+3)q-d=1-q=1.

From (1)-(4), we can see that $\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij}$ can take the values 1,2,…,q.

2. Necessity: d≤q-2 ⇒ $\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij}$ cannot take q different values.


(1) When 0≤i≤d-2, ∑(i)=2dq-(i+1)q-d+(i+1)=i+1-d. Because 1-d≤i+1-d≤q-1 and d≤q-2, we can get that ∑(i)≥3.
(2) When 0≤i≤d-2, ∑(i)+1=2dq-(i+1)q-d=q-d≥2.
(3) When d-1≤i≤q-3, ∑(i)=dq=q.
(4) When d-1≤i≤q-3, ∑(i)+1=dq-d=q-d≥2.
(5) When q-2≤i≤d+q-4, ∑(i)=dq+(i-q+2)(q-1)=q-i-2. Because 2-d≤q-i-2≤q and d≤q-2, we can get that ∑(i)≥4.
(6) When q-2≤i≤d+q-4, ∑(i)+1=dq+(i-q+3)q-d=q-d≥2.

From (1)-(6), we can see that $\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij}$ cannot take the value 1. □
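As an informal numerical check (ours, not in the paper), the following sketch enumerates the case values of Eq. (2) modulo q, dropping the common term (2dq-d)(s-1) as in the proof, and confirms for small q that all q values occur exactly when d=q-1.

```python
def eq2_values(q, d):
    """Distinct values (mod q, with 0 written as q) taken by the case terms of
    Eq. (2), after dropping the common summand (2dq-d)(s-1)."""
    vals = set()
    for i in range(0, d - 1):                      # 0 <= i <= d-2
        vals.add((2*d*q - (i + 1)*q - d + (i + 1)) % q)
        vals.add((2*d*q - (i + 1)*q - d) % q)
    for i in range(d - 1, q - 2):                  # d-1 <= i <= q-3
        vals.add((d*q) % q)
        vals.add((d*q - d) % q)
    for i in range(q - 2, d + q - 3):              # q-2 <= i <= d+q-4
        vals.add((d*q + (i - q + 2)*(q - 1)) % q)
        vals.add((d*q + (i - q + 3)*q - d) % q)
    return {q if v == 0 else v for v in vals}

for q in range(3, 8):
    assert eq2_values(q, q - 1) == set(range(1, q + 1))  # d = q-1: all q values occur
    for d in range(1, q - 1):                             # d <= q-2: the value 1 never occurs
        assert 1 not in eq2_values(q, d)
```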

Remark 1. According to Theorem 1, the input of neuron z is $\left(\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij}\right) \bmod q$ instead of $\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij}$ directly. Otherwise, if the input of z were $\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij}$, d=q-1 would only be a sufficient condition; that is to say, d≤q-1.

The inputs and weights of z have been determined. Now we give the q-1 thresholds of z. From the sufficiency proof of Theorem 1, we easily get ∑(0)+1=1 and ∑(i)=i+2 for 0≤i≤q-3, so these can be taken as the q-1 thresholds of z. Then we obtain a three layer feedforward neural network which realizes a specific strongly symmetric function:

$$
f(x_1,\ldots,x_n) = z(\Sigma(0){+}1, \Sigma(0), \Sigma(1), \ldots, \Sigma(q-3)) =
\begin{cases}
q, & \sum_{i=1}^{n} x_i = k_j^{(q-2)} \\
1, & \sum_{i=1}^{n} x_i \in \{k_j^{(0)}+1, \ldots, k_j^{(2q-5)}+1\} \\
2, & \sum_{i=1}^{n} x_i = k_j^{(0)} \\
\vdots & \vdots \\
q-1, & \sum_{i=1}^{n} x_i \in \{k_j^{(q-3)}, k_j^{(q-1)}\}
\end{cases}
\qquad j=1,2,\ldots,s.
$$

Since s=((q-1)n+1)/(2(q+d-3)), there are

$$
2ds = \frac{qn}{2} + \frac{q+n-1}{2(q-2)}
$$

neurons in the hidden layer for q ≥ 3. This is the complexity of the neural network.
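As a quick check of this count (ours; exact rational arithmetic is used and the integrality of s is ignored), 2ds with d=q-1 and s=((q-1)n+1)/(2(q+d-3)) indeed simplifies to qn/2+(q+n-1)/(2(q-2)):

```python
from fractions import Fraction

for q in range(3, 10):
    for n in range(1, 20):
        d = q - 1
        s = Fraction((q - 1) * n + 1, 2 * (q + d - 3))   # s as an exact fraction
        assert 2 * d * s == Fraction(q * n, 2) + Fraction(q + n - 1, 2 * (q - 2))
```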

3.3 Computing Arbitrary Strongly Symmetric Functions

If we allow the Y_ij to have integer and rational weights, we can prove the following theorem.

Theorem 2. Let f(x_1,…,x_n) be a q-valued strongly symmetric function. Then f can be computed using a three layer feedforward neural network with

$$
2ds = \frac{qn}{2} + \frac{q+n-1}{2(q-2)}
$$

(q-1)-threshold neurons in the hidden layer, which have integer and rational weights chosen from the set S={±1,…,±q, ±1/2,…,±q/2, ±1/3,…,±q/3, …, ±1/q,…,±(q-1)/q}.

Proof. Since f only depends on the sum of its inputs $\sum_{i=1}^{n} x_i$, we may assume that f(x_1,…,x_n)=a_m∈{1,2,…,q}, where $m=\sum_{i=1}^{n} x_i \in [n, qn]$. To prove the theorem, we only need to show that we can choose a set of integer and rational weights from the given set S for the neurons Y_ij, i=1,2,…,2d, j=1,…,s, such that the constructed network


outputs a_m. For every m, we choose fixed inputs x_1,…,x_n such that $\sum_{i=1}^{n} x_i = m$; without loss of generality, let (x_1,…,x_n)=(1,1,…,1), (2,1,…,1), (2,2,…,1),…,(q,q,…,q). Then we choose weights w_1^(ij),…,w_n^(ij) for Y_ij, where w_l^(ij)∈S, l=1,2,…,n, i=1,2,…,d. The weights of the neurons Y_(d+1)j,…,Y_(2d)j are set to -w_1^(dj),…,-w_n^(dj),…,-w_1^(1j),…,-w_n^(1j), respectively. We require that the outputs of the neurons satisfy $\sum_{j=1}^{s}\sum_{i=1}^{2d} Y_{ij} = (2dq-d)(s-1)+a_m$ for all m (n≤m≤qn). This can be realized by using the following algorithm.

Algorithm 1. Initially, set the weights equal to 1 for neurons Y_1j,…,Y_dj and to -1 for neurons Y_(d+1)j,…,Y_(2d)j, j=1,…,s, set t=1, and set the remnant set of m to {n, n+1,…,qn}.
Step 1. Compute b_m=z(∑(0)+1,∑(0),…,∑(q-3)) for each m in the remnant set.
Step 2. Compare a_m and b_m, and delete those m such that a_m=b_m.
Step 3. If the remnant set of m is not empty, let m_t be the smallest m in the remnant set. Otherwise, stop.
Step 4. If m_t∈[k_j^(0), k_j^(2q-4)), then set the weights w_i^(1j)=…=w_i^(dj), i=1,…,n, such that

$$
\sum_{i=1}^{n} w_i^{(1j)} x_i =
\begin{cases}
k_j^{(q-2)}, & \text{if } a_{m_t}=q \\
k_j^{(0)}+1, & \text{if } a_{m_t}=1 \\
k_j^{(0)}, & \text{if } a_{m_t}=2 \\
\vdots & \vdots \\
k_j^{(q-3)}, & \text{if } a_{m_t}=q-1
\end{cases}
$$

where $\sum_{i=1}^{n} x_i = m_t$. Set w_i^((d+1)j)=…=w_i^((2d)j)=-w_i^(1j), i=1,…,n. Set t=t+1 and go to Step 1.
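The control flow of Algorithm 1 can be sketched as follows. This is our own schematic, not the authors' code: network_output stands for the evaluation b_m of the current network on an input with sum m, and adjust_weights_for stands for the weight assignment of Step 4; both are placeholder callables.

```python
def algorithm_1(a, n, q, network_output, adjust_weights_for):
    """Schematic of Algorithm 1: a[m] is the target value for inputs with sum m.
    network_output(m) plays the role of b_m; adjust_weights_for(m_t, a[m_t])
    plays the role of Step 4.  Both callables are placeholders."""
    remnant = set(range(n, q * n + 1))   # remnant set of m = {n, n+1, ..., qn}
    t = 1
    while True:
        # Steps 1-2: evaluate b_m and discard every m already computed correctly.
        remnant = {m for m in remnant if network_output(m) != a[m]}
        # Step 3: stop when the remnant set is empty.
        if not remnant:
            return t - 1                 # number of completed cycles
        m_t = min(remnant)
        # Step 4: pick new weights for the strip containing m_t.
        adjust_weights_for(m_t, a[m_t])
        t += 1
```

With a correct Step 4, each pass makes b_{m_t} equal a_{m_t}, which mirrors the termination argument that follows.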

Evidently, each cycle of the algorithm removes at least one m from the remnant set. Hence the algorithm must stop after executing at most (q-1)n+1 cycles, and the function f is computed. □

Corollary 1. If f(x_1,…,x_n) is a q-valued strongly symmetric function, then f can be computed using a three layer feedforward neural network with a fixed architecture. There are T, with

$$
\frac{qn}{2} + \frac{q+n-1}{2(q-2)} \le T \le 2(q-1)((q-1)n+1),
$$

(q-1)-threshold neurons in the hidden layer. More precisely, $T=\sum_{j=1}^{s} t_j$, where there are t_j, 1≤t_j≤2q, identical neurons Y_ij, i=1,2,…,2d. The weights of the neurons Y_ij are chosen from the set S.

Proof. The proof of Corollary 1 is a variant of that of Theorem 2. In fact, if the algorithm is executed for t_j (t_j>1) cycles for m_t∈[k_j^(0), k_j^(2q-4)), then we need t_j identical neurons Y_ij, i=1,2,…,2d, in the hidden layer and choose different weights for them as in Step 4 of the algorithm. In this case, the weights of the output neuron z must be adapted: for each m∈[k_j^(0), k_j^(2q-4)), only the 2d weights of the neuron z corresponding to its inputs $\sum_{i=1}^{2d} Y_{ij}$ equal 1, so that the output of the neuron z equals a_m; the other weights of z equal 0.

3.4 Computing Arbitrary q-Valued Functions

As a consequence of Theorem 2, we have the following theorem.



Using Three Layer Neural Network to Compute Multi-valued Functions

7

Theorem 3. Any q-valued function f(x_1,…,x_n) can be computed by a three layer neural network with

$$
\frac{q-1}{2(q-2)} q^n
$$

(q-1)-threshold neurons in the hidden layer.

Proof. The sum $\sum_{j=1}^{n} q^{j-1} x_j$ takes distinct integer values for different inputs (x_1,…,x_n)∈{1,2,…,q}^n. Thus any q-valued function can be regarded as a function of a weighted sum of its inputs, or equivalently as a strongly symmetric function of (q^n-1)/(q-1) variables. The result follows from Theorem 2. □
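A small check (ours) of the encoding used in this proof: the weighted sum with weights 1, q, …, q^(n-1) is injective on {1,2,…,q}^n, and the total weight (q^n-1)/(q-1) is an integer.

```python
from itertools import product

def encode(x, q):
    """Weighted sum sum_j q^(j-1) * x_j from the proof of Theorem 3."""
    return sum(q**j * xj for j, xj in enumerate(x))

q, n = 3, 4
codes = {encode(x, q) for x in product(range(1, q + 1), repeat=n)}
assert len(codes) == q**n          # the encoding is injective on {1,...,q}^n
assert (q**n - 1) % (q - 1) == 0   # number of unit-weight variables is an integer
```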

4 Binary Neuron vs. Multi-valued Neuron

Siu [10] has proved that any m-variable Boolean function can be computed by a binary neural network with ⌈(2^m-1)/2⌉ neurons in the hidden layer. If we use binary neural networks to compute an arbitrary n-variable q-valued function f(x_1,…,x_n), we must combine at least q-1 such networks: each time, one value is separated from the remaining values. Furthermore, the variables (x_1,…,x_n) must be converted to Boolean variables (y_1,…,y_m), i.e., an n-variable q-valued function is represented as a set of q-1 m-variable binary functions. Since m=n·log_2 q, an n-variable q-valued function can be computed by a three layer binary neural network with ⌈(2^m-1)/2⌉=⌈(q^n-1)/2⌉ hidden neurons, so the total number of hidden neurons is (q-1)⌈(q^n-1)/2⌉. Theorem 3 asserts that if we use a three layer network with (q-1)-threshold neurons, the total number of hidden neurons is $\frac{q-1}{2(q-2)}q^n$. Since

$$
\frac{q-1}{2(q-2)} q^n < (q-1)\left\lceil \frac{q^n-1}{2} \right\rceil,
$$

the complexity of neural networks with (q-1)-threshold neurons is lower than that of binary neural networks.
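For a rough sense of the gap, a short computation of both counts for a few sample values of q and n (our own illustration; the integrality of the multi-valued count is ignored):

```python
import math

for q in (4, 5, 8):
    for n in (2, 4, 6):
        multi_valued = (q - 1) * q**n / (2 * (q - 2))           # count from Theorem 3
        binary = (q - 1) * math.ceil((q**n - 1) / 2)             # binary-network count
        print(f"q={q}, n={n}: (q-1)-threshold ~ {multi_valued:.0f}, binary = {binary}")
```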

5 Conclusion

This paper proposes a method to compute arbitrary q-valued functions using three layer feedforward neural networks with one hidden layer, and gives the number of hidden neurons required. Estimating the number of hidden neurons is a fundamental problem concerning the complexity of neural networks; through the constructive proof, we give an upper bound.

Acknowledgment. This work is supported by the Excellent Person Development Foundation of Beijing under Grant No. 20061D0501500191.

References

1. Anthony, M.: Learning Multivalued Multithreshold Functions. CDMA Research Report No. LSE-CDMA-2003-03, London School of Economics (2003)
2. Wang, S.: Multi-valued Neuron and Multi-thresholded Neuron: Their Combinations and Applications. Acta Electronica Sinica (1996) 1-6


3. Diep, T.A.: Capacity of Multilevel Threshold Devices. IEEE Transactions on Information Theory (1998) 241-255
4. Žunić, J.: On Encoding and Enumerating Threshold Functions. IEEE Transactions on Neural Networks (2004) 261-267
5. Ojha, P.C.: Enumeration of Linear Threshold Functions from the Lattice of Hyperplane Intersections. IEEE Transactions on Neural Networks (2000) 839-850
6. Ngom, A., Stojmenović, I., Milutinović, V.: STRIP --- a Strip-based Neural-Network Growth Algorithm for Learning Multiple-valued Functions. IEEE Transactions on Neural Networks (2001) 212-227
7. Young, S., Downs, T.: CARVE --- a Constructive Algorithm for Real-valued Examples. IEEE Transactions on Neural Networks (1998) 1180-1190
8. Obradović, Z., Parberry, I.: Learning with Discrete Multi-valued Neurons. Journal of Computer and System Sciences (1994) 375-390
9. Obradović, Z., Parberry, I.: Computing with Discrete Multi-valued Neurons. Journal of Computer and System Sciences (1992) 471-492
10. Siu, K.Y., Roychowdhury, V., Kailath, T.: Discrete Neural Computation: A Theoretical Foundation. Prentice-Hall, Englewood Cliffs, NJ (1995)
