Bounding the Errors in Constructive Function Approximation (EURASIP)

BOUNDING THE ERRORS IN CONSTRUCTIVE FUNCTION APPROXIMATION
D. Docampo and C.T. Abdallah
ETSIT, Universidad de Vigo, Campus Universitario, Vigo 36207
EECE Department, University of New Mexico, USA
Tel: +3486 812134; Fax: +3486 812116; e-mail:

[email protected]

ABSTRACT

In this paper we study the theoretical limits of finite constructive convex approximations of a given function in a Hilbert space using elements taken from a reduced subset. We also investigate the trade-off between the global error and the partial error during the iterations of the solution. The results obtained constitute a refinement of well-established convergence analysis for constructive iterative sequences in Hilbert spaces, with applications in projection pursuit regression and neural network training.

1 Introduction

Continuous functions on compact subsets of $R^d$ can be uniformly approximated by linear combinations of sigmoidal functions [1], [2]. The issue of how the error in the approximation is related to the number of sigmoidals used is one of paramount importance from the point of view of applications; it can be phrased in a more general way as the problem of approximating a given element (function) $f$ in a Hilbert space $H$ by means of an iterative sequence $f_n$, and it has an enormous impact in establishing convergence results for projection pursuit algorithms [3], neural network training [4] and classification [5]. It has been shown that this problem can be given a constructive solution where the iterations taking place involve computations in a reduced subset $G$ of $H$. In this paper we formulate the problem in such a way that we can study the bounds for the error in the approximation, obtain the best possible trade-off between global and partial errors, and derive bounds for the global error when a prespecified partial error is fixed.

This work has been partially supported by AFOSR and BNR.

The rest of the paper is organized as follows: Section 2 will state the problem and highlight its practical implications. Section 3 will review Barron's and Dingankar's solutions to the problem, while Section 4 will provide the framework under which those solutions can be derived. Section 5 will analyze the limits of the global error and its relation to the partial errors at each step of the iterative process. Finally, Section 6 will close the paper with the conclusions and further work.

2 The Problem

Let $G$ be a subset of a real or complex Hilbert space $H$, with norm $\|\cdot\|$, such that its elements, $g$, are bounded in norm by some positive constant $b$. Let $co(G)$ denote the convex closure of $G$ (i.e. the closure of the convex hull of $G$ in $H$). The first global bound result, attributed to Maurey [4], concerning the error in approximating an element of $co(G)$ using convex combinations of points in $G$, is the following:

Lemma 2.1 Let $f$ be an element of $co(G)$ and $c$ a constant such that $c > b^2 - \|f\|^2 = b_f^2$. Then, for each positive integer $n$ there is a point $f_n$ in the convex hull of some $n$ points of $G$ such that $\|f - f_n\|^2 \le c/n$.
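Lemma 2.1 is purely existential, but its $c/n$ rate is easy to observe numerically. The following sketch is our own illustration, not part of the paper: the dictionary $G$ (random unit vectors, so $b = 1$), the target $f$, and the greedy step-size rule $1/k$ are all arbitrary choices made only to exercise the bound.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite dictionary G: m unit vectors in R^d (so b = 1),
# and a target f chosen inside the convex hull of G.
d, m = 20, 50
G = rng.normal(size=(m, d))
G /= np.linalg.norm(G, axis=1, keepdims=True)
f = rng.dirichlet(np.ones(m)) @ G          # f in co(G)

def greedy_convex(f, G, n):
    """n-term convex approximation: mix the iterate with the best g in G,
    using step size 1/k at step k (a simple constructive choice)."""
    fk = G[np.argmin(np.linalg.norm(G - f, axis=1))]
    for k in range(2, n + 1):
        cand = (1 - 1/k) * fk + (1/k) * G  # one candidate per element of G
        fk = cand[np.argmin(np.linalg.norm(cand - f, axis=1))]
    return fk

c = 1.0 - f @ f                            # b^2 - ||f||^2
errs = {n: np.linalg.norm(f - greedy_convex(f, G, n))**2 for n in (1, 5, 25)}
print(errs, c)                             # errors shrink as n grows
```

Here `errs[1] <= c` is guaranteed, since averaging over the convex weights gives $\sum_i w_i\|f-g_i\|^2 = b^2 - \|f\|^2$; the later errors then decay roughly like $c/n$.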

The first constructive proof of this lemma was given by Jones [3] and refined by Barron [4]; it includes an algorithm to iterate the solution. The result is the following:

Theorem 2.1 For each element $f$ in $co(G)$, let us define the parameter $\gamma$ as follows:

$$\gamma = \inf_{h \in H} \sup_{g \in G} \left( \|g - h\|^2 - \|f - h\|^2 \right)$$

Let $\delta$ be a constant such that $\delta > \gamma$. Then, we can construct an iterative sequence $f_n$, with $f_n$ chosen as a convex combination of the previous iterate $f_{n-1}$ and a $g_n \in G$, $\lambda \in [0,1]$, $f_n = (1-\lambda)f_{n-1} + \lambda g_n$, such that $\|f - f_n\|^2 \le \delta/n$.

This new parameter, $\gamma$, is related to Maurey's $b_f^2$: just make $h = 0$ in the definition of $\gamma$ to realize that $\gamma \le b_f^2$. The relation between this problem and the universal approximation property of sigmoidal networks was clearly established in [3], [4]; specifically, it has been proven that, under certain mild restrictions, continuous functions on compact subsets of $R^d$ belong to the convex hull of the set of sigmoidal functions that one-hidden-layer neural networks can generate.
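To connect the theorem with the sigmoidal-hull remark, here is a small sketch of ours (the target function and the grid of tanh units are arbitrary assumptions, not from the paper) that runs the update $f_n = (1-\lambda)f_{n-1} + \lambda g_n$ with the simple choice $\lambda = 1/n$ over a sampled dictionary:

```python
import numpy as np

# Target function sampled on a grid; dictionary G of sigmoidal units
# tanh(a*x + c). Both the target and the (a, c) grids are arbitrary
# illustrative choices.
x = np.linspace(-1.0, 1.0, 200)
target = 0.5 * np.sin(np.pi * x)

G = np.array([np.tanh(a * x + c)
              for a in np.linspace(-8, 8, 33)
              for c in np.linspace(-4, 4, 17)])

fn = np.zeros_like(x)
errs = []
for n in range(1, 31):
    lam = 1.0 / n                      # lam = 1 at step 1, then 1/n
    cand = (1 - lam) * fn + lam * G    # f_n = (1-lam) f_{n-1} + lam g, all g
    fn = cand[np.argmin(((cand - target)**2).mean(axis=1))]
    errs.append(((fn - target)**2).mean())

print(errs[0], errs[-1])               # mean-squared error after 1 and 30 terms
```

Exactly minimizing over $g$ at each step corresponds to taking $\epsilon_n = 0$ in the constructions reviewed below; the quasi-optimization with a slack $\epsilon_n > 0$ only requires getting within $\epsilon_n$ of the infimum.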





3 Constructive solutions

The proof of Theorem 2.1 given in [4] and [5] is based on the following lemma:

Lemma 3.1 Given $f \in co(G)$, for each element $h$ of $co(G)$ and $\lambda \in [0,1]$:

$$\inf_{g \in G} \|f - (1-\lambda)h - \lambda g\|^2 \le (1-\lambda)^2 \|f - h\|^2 + \lambda^2 \gamma \quad (1)$$

The main result is derived using an inductive argument. At step 1, find $g_1$ and $\epsilon_1$ so that

$$\|f - g_1\|^2 \le \inf_{g \in G} \|f - g\|^2 + \epsilon_1 \le \delta,$$

and set $f_1 = g_1$. This is guaranteed by (1), for $\lambda = 1$. Let $f_n$ be our iterative sequence in $co(G)$; assume that for $n \ge 2$, $\|f - f_{n-1}\|^2 \le \delta/(n-1)$. It is then possible to choose among different values of $\lambda$ and $\epsilon_n$ so that:

$$(1-\lambda)^2 \|f - f_{n-1}\|^2 + \lambda^2 \gamma + \epsilon_n \le \delta/n \quad (2)$$

At step $n$, select $g_n$ such that

$$\|f - (1-\lambda)f_{n-1} - \lambda g_n\|^2 \le \inf_{g \in G} \|f - (1-\lambda)f_{n-1} - \lambda g\|^2 + \epsilon_n.$$

The values of $\lambda$ and $\epsilon_n$ in [4] and [5] are related to the parameter $\alpha$, $\alpha = \delta/\gamma - 1$, in the following way:

$$[4]:\ \lambda = \frac{\|f - f_{n-1}\|^2}{\gamma + \|f - f_{n-1}\|^2},\quad \epsilon_n = \frac{\delta}{n} - \frac{\gamma \|f - f_{n-1}\|^2}{\gamma + \|f - f_{n-1}\|^2};\qquad [5]:\ \lambda = \frac{1}{n},\quad \epsilon_n = \frac{\alpha\gamma}{n^2} \quad (3)$$

Hence, using (1), (2), and (3) we get $\|f - f_n\|^2 \le \delta/n$, and that completes the proof of Theorem 2.1.
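As a sanity check of the algebra (our own, using exact rational arithmetic), the choice of [5], $\lambda = 1/n$ and $\epsilon_n = \alpha\gamma/n^2$, turns inequality (2) into an equality when the induction hypothesis is tight, i.e. $\|f - f_{n-1}\|^2 = \delta/(n-1)$:

```python
from fractions import Fraction as F

# Our own verification: with lambda = 1/n and eps_n = alpha*gamma/n^2,
# the left side of (2) equals delta/n exactly when ||f - f_{n-1}||^2
# equals delta/(n-1).
gamma = F(1)
for alpha in (F(1, 2), F(1), F(3)):
    delta = (1 + alpha) * gamma
    for n in range(2, 10):
        lam = F(1, n)
        eps = alpha * gamma / n**2
        lhs = (1 - lam)**2 * delta / (n - 1) + lam**2 * gamma + eps
        assert lhs == delta / n
print("choice of [5] saturates inequality (2)")
```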

4 Discussion

To evaluate the possible choices for the bound $\epsilon_n$ we use the induction hypothesis in inequality (2); values of $\lambda$ should now satisfy

$$(1-\lambda)^2 \frac{\delta}{n-1} + \lambda^2 \gamma \le \frac{\delta}{n}.$$

It can be shown [6] that admissible values of $\lambda$ satisfying inequality (2) for positive values of $\epsilon_n$ fall in the following interval:

$$\left[\ \frac{1+\alpha - \sqrt{\alpha(1+\alpha)(n-1)/n}}{n+\alpha}\ ,\ \frac{1+\alpha + \sqrt{\alpha(1+\alpha)(n-1)/n}}{n+\alpha}\ \right]$$

Figure 1 shows the bounds of the interval for $\lambda$ as a function of $n$, for $\alpha = 1$. Bounds are plotted using solid lines, the center of the interval using a dotted line, and the value chosen in [5] using a dash-dotted line. Note how this last value approaches the limits of the interval, resulting in a poorer value for $\epsilon_n$, as will be shown later.

Figure 1: admissible values of $\lambda$ as a function of $n$, for $\alpha = 1$

The bound $\epsilon_n$ is maximum at the center of the interval, $\lambda = (1+\alpha)/(n+\alpha)$, where it takes the value

$$\epsilon_n = \frac{\alpha\,\delta}{n\,(n+\alpha)}.$$
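The interval endpoints and the optimal center can be cross-checked numerically (our own verification, not from the paper): reading the slack $\epsilon_n$ off from (2) at the tight induction hypothesis $\|f - f_{n-1}\|^2 = \delta/(n-1)$, the quoted endpoints are exactly the zeros of that slack, and the center yields $\epsilon_n = \alpha\delta/(n(n+\alpha))$:

```python
import math

# Our own numerical cross-check of the admissible lambda interval and the
# maximal eps_n.  gamma = 1 without loss of generality; delta = (1+alpha)*gamma.
gamma = 1.0
for alpha in (0.5, 1.0, 3.0):
    delta = (1 + alpha) * gamma
    for n in range(2, 20):
        # slack allowed by (2) at the tight induction hypothesis
        eps = lambda lam: delta/n - (1 - lam)**2 * delta/(n - 1) - lam**2 * gamma
        r = math.sqrt(alpha * (1 + alpha) * (n - 1) / n)
        lo, hi = (1 + alpha - r) / (n + alpha), (1 + alpha + r) / (n + alpha)
        assert abs(eps(lo)) < 1e-9 and abs(eps(hi)) < 1e-9   # endpoints: eps = 0
        centre = (1 + alpha) / (n + alpha)
        assert abs(eps(centre) - alpha * delta / (n * (n + alpha))) < 1e-12
print("interval endpoints and maximal eps_n confirmed")
```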



We now formulate the following questions:

1. Is it possible to achieve a further reduction in the error using convex combinations of $n$ elements from $G$? What is the minimum bound for the global error, assuming $\epsilon_n = 0$ for all $n$?

2. What is the optimal choice of $\lambda$ for a given bound, so that $\epsilon_n$ is maximum, making the quasi-optimization problem at each step easier to solve?

3. For a prespecified partial error, $\epsilon_n$, what is the bound for the global approximation problem?

Based on the assumptions made and on Lemma 3.1, let us formulate the problem again in a more general way: we look for a constructive approximation so that the overall error using $n$ elements from $G$ satisfies

$$\|f - f_n\|^2 \le \gamma\, b(n) \quad (4)$$

$b(n)$ being a function of the parameter $n$ which indicates the order of our approximation (i.e. $b(n) = (1+\alpha)/n$ both in [3] and [5]), and $\gamma$ the parameter related to $f$ as defined before. Let $f_n = (1 - \lambda_n) f_{n-1} + \lambda_n g_n$; we want to find $\lambda_n$, $\epsilon_n$, and the function $b(n)$ so that:

$$\inf_{g \in G} \|f - (1 - \lambda_n) f_{n-1} - \lambda_n g\|^2 + \epsilon_n \le \gamma\, b(n) \quad (5)$$

provided that the induction hypothesis (4) is satisfied for $k < n$. We have introduced the notation $\lambda_n$ to stress the variation of this parameter along the iterative process. Writing $dQ(\lambda_n)/d\lambda_n = 0$, where $Q(\lambda_n) = (1 - \lambda_n)^2 \gamma\, b(n-1) + \lambda_n^2 \gamma$ is the bound provided by Lemma 3.1 under hypothesis (4), we get

$$\lambda_n = \frac{b(n-1)}{1 + b(n-1)}\ ,\qquad 1 - \lambda_n = \frac{1}{1 + b(n-1)} \quad (6)$$
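A quick numeric check of ours that the stationary point in (6) indeed minimizes $Q(\lambda_n) = (1 - \lambda_n)^2 \gamma\, b(n-1) + \lambda_n^2 \gamma$, and that the minimum value is $\gamma\, b(n-1)/(1 + b(n-1))$:

```python
# Our own check of (6): lambda_n = b/(1+b) minimizes
# Q(lam) = (1-lam)^2 * gamma * b + lam^2 * gamma, with minimum gamma*b/(1+b).
gamma = 1.0
for b in (0.1, 0.5, 2.0):               # b plays the role of b(n-1)
    lam_star = b / (1 + b)
    Q = lambda lam: (1 - lam)**2 * gamma * b + lam**2 * gamma
    assert abs(Q(lam_star) - gamma * b / (1 + b)) < 1e-12
    # the stationary point beats nearby values
    assert Q(lam_star) <= Q(lam_star - 1e-3) and Q(lam_star) <= Q(lam_star + 1e-3)
print("lambda_n = b(n-1)/(1+b(n-1)) minimizes Q")
```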