Invited paper at NOLTA'95, Las Vegas, Nevada, December 10-14, 1995


Network Approximation of Dynamical Systems

Ajit T. Dingankar* and Irwin W. Sandberg†

* IBM Corporation, Austin, TX 78758, U.S.A., [email protected]
† The University of Texas at Austin, TX 78712, U.S.A., [email protected]

Abstract

We consider the problem of approximating any member of a large class of input-output operators of time-varying nonlinear dynamical systems. We introduce a family of "tensor product" dynamical neural networks, and show that a certain continuity condition is necessary and sufficient for the existence of arbitrarily good approximations using this family.

I. Introduction

In this paper we consider the problem of approximating input-output operators $G$ of time-varying nonlinear dynamical systems that take a subset $C$ of a normed linear space $X$ into another normed linear space $E$. Suppose that $E$ is complete and assume initially that it has a basis $\{e_1, e_2, \ldots\}$, so that every $e \in E$ can be represented as $e = \sum_{j=1}^{\infty} g_j(e) e_j$ where the $g_j$ are unique functionals. Then the $g_j$ are continuous [1, p.135], and we have

$$G(x) = \sum_{j=1}^{\infty} g_j[G(x)]\, e_j, \quad x \in C. \tag{1}$$

The right side of (1) is of the form

$$\sum_{j=1}^{\infty} a_j(x)\, v_j$$

in which the $a_j$ are continuous functionals and the $v_j$ are elements of the output space $E$. A natural question that arises is whether arbitrarily good uniform approximations of $G$ of the form

$$\sum_{j=1}^{\ell} a_j(\cdot)\, v_j \tag{2}$$

can be obtained with $\ell$ finite. By "uniform" we mean uniform in the system inputs. One of the principal results of this paper is a result in a certain setting that provides a criterion under which finite sums of the form

$$\sum_{i=1}^{\ell} \sum_{j=1}^{k(i)} c_{ij}\, u_{ij}[y_{ij}(\cdot)]\, v_i \tag{3}$$

achieve such an approximation. In (3) the $c_{ij}$ are real constants, the $u_{ij}$ are certain continuous real-valued functions of the reals, and the $y_{ij}$ are continuous real functionals. Of course, (3) is of the form (2). We know of no closely related criterion in the literature. However, sufficient conditions for a specialized approximation of the general type we consider are given in [2].

II. Preliminaries

Throughout the paper $m$ and $n$ are arbitrary positive integers, and $D$ denotes the domain $\mathbb{R}^m_+$ of our input functions. The Euclidean norm on $D$ is denoted by $|\cdot|$. We consider nonlinear dynamical systems with inputs $x$ belonging to a subset of the set of all $\mathbb{R}^n$-valued continuous functions defined on $D$. We assume that the outputs belong to a real Banach space $E$ with norm $\|\cdot\|_E$, but now it is not assumed that $E$ has a basis.1 An important example of an $E$ of interest to us that does not have a basis is the set of real-valued bounded Lebesgue measurable functions defined on $D$.

In the development of our results a central role is played by certain weighted norm spaces.2 To describe these spaces, let $w: D \to (0, 1]$ be a continuous map with $\lim_{|\alpha| \to \infty} w(\alpha) = 0$. Let $X_w$ be the set of all $\mathbb{R}^n$-valued continuous functions $x$ defined on $D$ for which

$$\|x\|_w := \sup_{\alpha \in D} \|x(\alpha)\, w(\alpha)\| < \infty.$$

1 The case in which $E$ does have a basis leads to a much simpler problem that can be addressed using the characterization in [1, p.136] of compact subsets of a Banach space with a basis.

2 Related ideas (in a general sense) involving weighted norms on function spaces can be found in, for example, proofs of the boundedness of solutions of integral equations [3], [4, pp. 880-1], and in a large number of other publications (see, for instance, [5], [6], [7], [8]).

Here $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^n$. We use $X$ to denote the space of $\mathbb{R}^n$-valued bounded continuous functions on $D$, with the usual norm $\sup_{\alpha \in D} \|\cdot\|$. We view $X_w$ as a normed linear space with norm $\|\cdot\|_w$. Later we will use the fact that $X_w$ is a Banach space (see Appendix A).

For each $a \in \mathbb{R}_+$, let $c_a$ denote $[0, a]^m$. Let $\sigma$ be a function from $\mathbb{R}_+$ to $\mathbb{R}_+$, and let $r: D \to \mathbb{R}_+$ be a continuous function that has restricted growth rate at infinity and is nondecreasing, in the sense that

$$\lim_{|\alpha| \to \infty} r(\alpha)\, w(\alpha) = 0 \tag{4}$$

and

$$\max_{1 \le i \le m} \alpha_i \le \max_{1 \le i \le m} \beta_i \;\Longrightarrow\; r(\alpha) \le r(\beta), \quad \alpha, \beta \in D, \tag{5}$$

where $\alpha_i$ and $\beta_i$ denote the $i$th components of $\alpha$ and $\beta$, respectively. Let $B$ be a closed nonempty subset of $X_w$ such that:

1. $\|x(\alpha)\| \le r(\alpha)$ for $x \in B$ and $\alpha \in D$.
2. $\|x(\alpha) - x(\beta)\| \le \sigma(a)\, |\alpha - \beta|$ for $a \in \mathbb{R}_+$, $x \in B$, and $\alpha, \beta \in c_a$.

Of course $r$ can be taken to be any bounded function satisfying (5). It is not difficult to check that the set of all $x \in X_w$ that satisfy the above two conditions with $r$ and $\sigma$ any two constant functions is a closed subset of $X_w$.3 Thus $B$ can be taken to be any such set of "uniformly bounded uniformly Lipschitz" functions.

We denote by $X_w^*$ the set of bounded linear functionals on $X_w$ (i.e., the set of bounded linear maps from $X_w$ to the reals $\mathbb{R}$). Let $Y$ be any set of continuous maps from $X_w$ to $\mathbb{R}$ that is dense in $X_w^*$ on $B$, in the sense that for each $\phi \in X_w^*$ and any $\epsilon > 0$ there is a $y \in Y$ such that $|\phi(x) - y(x)| < \epsilon$, $x \in B$. Also, let $U$ be any set of continuous maps $u: \mathbb{R} \to \mathbb{R}$ such that given $\epsilon > 0$ and any bounded interval $(\beta_1, \beta_2) \subset \mathbb{R}$ there exists a finite number of elements $u_1, \ldots, u_q$ of $U$ for which $|\exp(\beta) - \sum_j u_j(\beta)| < \epsilon$ for $\beta \in (\beta_1, \beta_2)$.4

Let $\mathcal{T}$ denote the set of maps from $B$ to $E$ of the form

$$\sum_{i=1}^{\ell} \sum_{j=1}^{k(i)} c_{ij}\, u_{ij}[y_{ij}(\cdot)]\, v_i \tag{6}$$

where $\ell \in \mathbb{N}$ (the positive integers), and $k(i) \in \mathbb{N}$, $v_i \in E$ for $1 \le i \le \ell$, and where $c_{ij} \in \mathbb{R}$, $u_{ij} \in U$, and $y_{ij} \in Y$ for $1 \le i \le \ell$, $1 \le j \le k(i)$. Since a sum of the form $\sum_{i=1}^{\ell} a_i(\cdot)\, v_i$ is an element of the so-called tensor product (e.g., see [9]), a general element $M$ of $\mathcal{T}$ can be realized by what may naturally be called the tensor product neural network shown in Figure 1.

3 To show this it is helpful to notice that convergence with respect to the norm in $X_w$ implies uniform convergence on any $c_a$.

4 Of course we can take $U$ to be the set whose only element is $\exp(\cdot)$, or the set $\{u : u(\beta) = \beta^n / n!,\ n \in \{0, 1, \ldots\}\}$.
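As a rough illustration of the structure in (6), the following Python sketch evaluates a small tensor product network. Several things here are assumptions made only for the example and do not come from the paper: the output space $E$ is replaced by $\mathbb{R}^2$, inputs are scalar functions of one variable ($m = n = 1$), the functionals $y_{ij}$ are hypothetical weighted point evaluations (which are bounded linear functionals on $X_w$), and the helper names `taylor_term`, `point_functional`, and `tensor_network` are invented. The functions $u_{ij}$ are taken from the monomial set $\{\beta^n/n!\}$ mentioned in footnote 4.

```python
import math

def taylor_term(n):
    """u(b) = b**n / n!, one element of the monomial set U in footnote 4."""
    return lambda b: b ** n / math.factorial(n)

def point_functional(weights, points):
    """Hypothetical functional y(x) = sum_k weights[k] * x(points[k]), for m = n = 1."""
    return lambda x: sum(wk * x(p) for wk, p in zip(weights, points))

def tensor_network(branches):
    """Build M from branches [(v_i, [(c_ij, u_ij, y_ij), ...]), ...], following (6)."""
    def M(x):
        out = [0.0] * len(branches[0][0])
        for v, terms in branches:
            a_i = sum(c * u(y(x)) for c, u, y in terms)    # inner sum over j
            out = [o + a_i * vk for o, vk in zip(out, v)]  # accumulate a_i(x) * v_i
        return out
    return M

# Illustrative choices (assumptions, not taken from the paper).
y1 = point_functional([0.5, 0.25], [1.0, 2.0])
y2 = point_functional([1.0], [3.0])
branches = [
    # First branch: 12 Taylor terms, so a_1(x) approximates exp(y1(x)).
    ([1.0, 0.0], [(1.0, taylor_term(n), y1) for n in range(12)]),
    # Second branch: a single linear term, so a_2(x) = 2 * y2(x).
    ([0.0, 1.0], [(2.0, taylor_term(1), y2)]),
]
M = tensor_network(branches)
x = math.sin  # a sample bounded, Lipschitz input
output = M(x)
```

Each branch computes one scalar $a_i(x)$ and scales the fixed vector $v_i$ by it, which mirrors the two panels of Figure 1; the choice of twelve Taylor terms is arbitrary and only controls how closely the first branch tracks the exponential.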

III. Characterization of Continuity of Input-Output Maps

Let $\mathcal{G}$ denote the linear space of all maps $G: B \to E$ such that $\sup_{x \in B} \|Gx\|_E < \infty$. The set $\mathcal{G}$ is a large set in that (as will become clear) it contains all continuous maps from $B$ to $E$.

A. The approximation theorem

Theorem 1 Let $G \in \mathcal{G}$ be given. Then the following two conditions are equivalent.

1. $G$ is continuous on $B$ with respect to the norm $\|\cdot\|_w$.
2. For any $\epsilon > 0$, there exists an $M \in \mathcal{T}$ such that $\|Gx - Mx\|_E < \epsilon$, $x \in B$.

Proof: (1) $\Rightarrow$ (2). We first prove that $B$ is relatively compact in the norm $\|\cdot\|_w$. We will use the following result.5

Lemma 1 ([11]) Let $S$ be a subset of a complete metric space $A$ with metric $\rho$, and let $T_1, T_2, \ldots$ be maps of $A$ into itself such that (i) $T_k(S)$ is a relatively compact subset of $A$ for each $k$, and (ii) $\rho(s, T_k s) \to 0$ as $k \to \infty$ uniformly for $s \in S$. Then $S$ is a relatively compact subset of $A$.

Continuing with the proof of the theorem, note that $c_k$ is closed and convex for every positive integer $k$. Hence [12, p. 98] for any $x \in D$ and any $k$, there exists a unique $\hat{x} \in c_k$ such that $|x - \hat{x}| = \inf_{y \in c_k} |x - y|$. For $k = 1, 2, \ldots$ define $T_k: X_w \to X_w$ by

$$(T_k s)(x) = s(\hat{x}). \tag{7}$$

Referring to the lemma, set $A = X_w$, set $S = B$, and take $\rho$ to be the usual metric induced by the norm $\|\cdot\|_w$.

Consider Hypothesis (i). Let $k$ be an arbitrary positive integer. Note that $X$ is a subset of $X_w$ and that the map $X \to X_w$ given by $x \mapsto x$ is continuous. Hence, to show that $T_k(B)$ is relatively compact it suffices to prove that it is relatively compact in $X$. Let $C(c_k)$ denote the space of continuous $\mathbb{R}^n$-valued functions on $c_k$, with the usual norm. Using the two conditions in the definition of $B$ together with an extension [13, p. 76] of the classical Ascoli-Arzela theorem, it follows directly that the restriction $T_k(B)|_{c_k}$ of $T_k(B)$ to $c_k$ is relatively compact in $C(c_k)$. Consider the map $Q$ from $C(c_k)$ to $X$ given by $Qf = g$, where $g(\alpha) = f(\hat{\alpha})$, $\alpha \in D$, with $\hat{\alpha}$ the nearest point of $c_k$ to $\alpha$ as in the definition of $T_k$. Since $Q$ is continuous, $T_k(B)$ is relatively compact in $X$.

Now consider Hypothesis (ii). Note that here

$$\rho(s, T_k s) = \sup_{\alpha \in D \setminus c_k} \|[s(\alpha) - s(\hat{\alpha})]\, w(\alpha)\| \le \sup_{\alpha \in D \setminus c_k} r(\alpha)\, w(\alpha) + \sup_{\alpha \in D \setminus c_k} \Big( \max_{\beta \in c_k} r(\beta) \Big) w(\alpha).$$

By (4) the first term tends to 0 as $k \to \infty$. By (5) we have $\max_{\beta \in c_k} r(\beta) \le r(\alpha)$ for $\alpha \in D \setminus c_k$ and any $k$, so the second term is bounded by the first. Hence $\rho(s, T_k s) \to 0$ uniformly for $s \in B$, as required. Thus $B$, which is assumed to be closed, is relatively compact, and thus compact.6

5 In [10] a similar application is made of Lemma 1.

Figure 1: Tensor product neural network. (a) Network representation of the tensor product $M(x) = \sum_{i=1}^{\ell} a_i(x)\, v_i$. (b) Network representation of the $i$th input functional $a_i(x) = \sum_{j=1}^{k(i)} c_{ij}\, u_{ij}[y_{ij}(x)]$.

Recall that we are assuming that $G$ is continuous in the norm $\|\cdot\|_w$. We will use Lemma 2, stated below.
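The estimate used for Hypothesis (ii) can be checked numerically in a simple case. The sketch below is only an illustration under assumptions that are not in the paper: $m = n = 1$, weight $w(\alpha) = e^{-\alpha}$, growth bound $r(\alpha) = 1 + \alpha$, a sample input $s(\alpha) = \alpha \cos \alpha$ (which satisfies $|s(\alpha)| \le r(\alpha)$), and a finite grid on $[0, 40]$ standing in for the supremum over $D$. For this choice the bound from (4) and (5) becomes $\rho(s, T_k s) \le 2(1 + k)e^{-k}$.

```python
import math

def w(alpha):
    """Assumed weight: continuous, values in (0, 1], tends to 0 at infinity."""
    return math.exp(-alpha)

def r(alpha):
    """Assumed growth bound: nondecreasing, with r * w tending to 0."""
    return 1.0 + alpha

def s(alpha):
    """Sample input with |s(alpha)| <= r(alpha), Lipschitz on each [0, a]."""
    return alpha * math.cos(alpha)

def project(alpha, k):
    """Nearest point of c_k = [0, k] to alpha >= 0 (the case m = 1)."""
    return min(alpha, float(k))

def T(k, f):
    """The map T_k from (7): evaluate f at the projected argument."""
    return lambda alpha: f(project(alpha, k))

def weighted_distance(f, g, upper=40.0, step=0.01):
    """Grid estimate of sup |f - g| * w over [0, upper]; a lower estimate of the true sup."""
    n = int(round(upper / step))
    return max(abs(f(i * step) - g(i * step)) * w(i * step) for i in range(n + 1))

distances = {k: weighted_distance(s, T(k, s)) for k in (1, 2, 4, 8)}
bounds = {k: 2.0 * r(k) * w(k) for k in distances}
```

Comparing `distances[k]` with `bounds[k]` shows the weighted distance between $s$ and $T_k s$ staying below a quantity that depends only on $r$, $w$, and $k$, which is the uniformity in $s$ that Lemma 1 requires.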

Lemma 2 Let $H$ be a compact metric space, and let $\mathcal{P}$ be the set of all maps of the form $\sum_{i=1}^{\ell} a_i(\cdot)\, v_i$ where the $a_i$ are continuous functionals on $H$, $v_i \in E$, and $\ell < \infty$. Then a map $G: H \to E$ is continuous if and only if for any $\epsilon > 0$, there exists an $M \in \mathcal{P}$ such that $\|Gx - Mx\|_E < \epsilon$, $x \in H$.

Proof of the Lemma: (Necessity) Assume that $G$ is continuous. Let the space of all continuous functionals on $H$ be denoted by $C(H)$, and let the set of continuous maps from $H$ to $E$ be denoted by $C(H, E)$. The so-called tensor product $C(H) \otimes E$ of $C(H)$ and $E$ is the linear manifold of $C(H, E)$ consisting of finite sums of the form $\sum_{i=1}^{\ell} a_i(\cdot)\, v_i$, where $a_i \in C(H)$ and $v_i \in E$ for each $i$. The following proposition establishes necessity.

Proposition: $C(H) \otimes E$ is dense in $C(H, E)$.

Proof of the Proposition: Let $\mathcal{M} = C(H) \otimes E$, and for each $x \in H$ let $\mathcal{M}(x) = \{f(x) : f \in \mathcal{M}\}$. Since $C(H)$ contains the constant function taking the value unity, $\mathcal{M}(x) = E$ for every $x$. The proposition thus follows from [14, Corollary 1].

(Sufficiency) Suppose that for any $\epsilon > 0$, there exists an $M \in \mathcal{P}$ such that $\|Gx - Mx\|_E < \epsilon$, $x \in H$. Thus we can find a sequence $\{G_p\}$ of elements of $\mathcal{P}$ that is uniformly convergent to $G$. By the continuity of the $G_p$ and a well-known result [15, p.121] concerning uniform convergence, it follows that $G$ is continuous.

Continuing with the proof of the theorem, let $\epsilon > 0$ be given. By Lemma 2, there exist $\ell$ and continuous functionals $a_i$ and elements $v_i$ of $E$ for $i = 1, \ldots, \ell$ such that

$$\Big\| Gx - \sum_{i=1}^{\ell} a_i(x)\, v_i \Big\|_E < \epsilon/2, \quad x \in B.$$

The case in which $\|v_i\|_E = 0$ for each $i$ is trivial. So assume otherwise. By Theorem 1 of [16] (see Appendix B), there exist $k(i) \in \mathbb{N}$, $c_{ij} \in \mathbb{R}$, $u_{ij} \in U$, and $y_{ij} \in Y$, $1 \le j \le k(i)$, such that

$$\Big| a_i(x) - \sum_{j=1}^{k(i)} c_{ij}\, u_{ij}[y_{ij}(x)] \Big| < \epsilon/(2 \ell \gamma), \quad x \in B, \; 1 \le i \le \ell,$$

where $\gamma = \max_{1 \le i \le \ell} \|v_i\|_E$. By combining the above estimates using the triangle inequality, we see that (1) $\Rightarrow$ (2).

(2) $\Rightarrow$ (1). This follows from the compactness of $B$ and Lemma 2.

Comment. With regard to Condition 1 of Theorem 1, it is easy to give examples of $G$'s in $\mathcal{G}$ that are not continuous. For instance, suppose that $B$, viewed as a metric space with its metric derived in the usual way from the norm $\|\cdot\|_w$, has a limit point $x_0$. Suppose also that $G$ is defined on $B$ so that $Gx$ is an element of $E$ with nonzero norm only for $x = x_0$. Then $G$ is obviously not continuous.

IV. Conclusion

We have introduced a family $\mathcal{T}$ of maps of "tensor product" dynamical neural network structures and, given the input-output map $G$ of any member of a certain large class of time-varying nonlinear dynamical systems, we have shown that continuity of $G$ with respect to the norm $\|\cdot\|_w$ is necessary and sufficient for $\mathcal{T}$ to contain an arbitrarily good approximation to $G$. Inputs and outputs are not restricted to be defined on finite intervals, nor need they be functions of only one variable. In this paper we have not considered the important problem of actually determining the elements of the approximating structures.

Related results can be found in [17], [18] for cases in which $D$ is replaced with $\mathbb{Z}^m_+$, or $\mathbb{R}^m$ or $\mathbb{Z}^m$, and for classes of continuous-time input functions that need not be continuous.

6 See [5] for a related result for $m = 1$. Also, this shows that $\mathcal{G}$, as mentioned earlier, contains all continuous maps from $B$ to $E$.

7 Similar results are given in [18] and [10].

Appendix A

Proof of the Completeness of $X_w$

To see that $X_w$ is complete,7 let $\{x_p\}$ be a Cauchy sequence in $X_w$, and notice that for each positive $\epsilon$ there is a positive $N$ such that

$$\|w x_p - w x_q\|_0 < \epsilon, \quad p, q > N,$$

where $\|\cdot\|_0$ is the norm on $X$ of Section II. By the completeness of $X$, there is a $y \in X$ such that $\{w x_p\}$ converges to $y$ in $X$. Thus, for an arbitrary positive $\epsilon$ there exists a positive $N'$ such that

$$\|y - w x_p\|_0 < \epsilon, \quad p > N'.$$

Since the weighting function $w$ never vanishes, the function $x = y/w$ can be seen to be the limit in $X_w$ of $\{x_p\}$.

Appendix B

In Section III reference is made to Theorem 1 of [16], a part of which is used in the proof of our theorem. Since the proof of the result in [16] contains some typos etc., for the sake of completeness we state below as a theorem the result we use and include a proof.

Let $K$ be a nonempty compact subset of a real normed linear space $Z$, and let $Z^*$ be the set of bounded linear functionals on $Z$ (i.e., the set of bounded linear maps from $Z$ to the reals $\mathbb{R}$). Let $W$ be any set of continuous maps from $Z$ to $\mathbb{R}$ that is dense in $Z^*$ on $K$, in the sense that for each $\phi \in Z^*$ and any $\epsilon > 0$ there is a $y \in W$ such that $|\phi(x) - y(x)| < \epsilon$, $x \in K$. Let $U$ be as described in Section II.

Theorem. Let $f$ be a continuous map of $K$ into $\mathbb{R}$. Then given $\epsilon > 0$ there are a positive integer $k$, real numbers $c_1, \ldots, c_k$, elements $u_1, \ldots, u_k$ of $U$, and elements $y_1, \ldots, y_k$ of $W$ such that $|f(x) - \sum_j c_j u_j[y_j(x)]| < \epsilon$ for $x \in K$.

Proof

Let $f$ be given, and notice that the set $V$ of all functions $v: K \to \mathbb{R}$ of the form $v(x) = \sum_j a_j \exp[\phi_j(x)]$, in which the sum is finite and the $a_j$ and the $\phi_j$ belong to $\mathbb{R}$ and $Z^*$, respectively, is an algebra under the natural definition of addition and multiplication. By a consequence [19, p.198] of the Hahn-Banach theorem, given distinct $x_a$ and $x_b$ in $K$ there is a $\phi$ in $Z^*$ such that $\exp[\phi(x_a)] \ne \exp[\phi(x_b)]$, showing that $V$ separates the points of $K$. It is clear that $v(x) \ne 0$ for some $v \in V$ for each $x$. Thus, by a version of the Stone-Weierstrass Theorem [20, p.162], given $\epsilon > 0$ there are a positive integer $p$, real numbers $d_1, \ldots, d_p$, and elements $w_1, \ldots, w_p$ of $Z^*$ such that

$$\Big| f(x) - \sum_j d_j \exp[w_j(x)] \Big| < \epsilon/3$$

for $x \in K$.8

We may assume that $\sum_j |d_j| \ne 0$. Choose $\gamma > 0$ such that $\gamma \sum_j |d_j| < \epsilon/3$. Let $[a', b']$ be an interval in $\mathbb{R}$ that contains all of the sets $w_j(K)$, and let real $a$ and $b$ be such that $a < a'$, $b > b'$. Select $\delta > 0$ such that $|\exp(\beta_1) - \exp(\beta_2)| < \gamma$ for $\beta_1, \beta_2 \in [a, b]$ with $|\beta_1 - \beta_2| < \delta$. With $\eta = \min(\delta, a' - a, b - b')$, choose $y_j \in W$ such that $|w_j(x) - y_j(x)| < \eta$, $x \in K$ for all $j$. This gives $|\exp[w_j(x)] - \exp[y_j(x)]| < \gamma$, $x \in K$ for each $j$ (because we have $y_j(K) \subset [a, b]$ and $|w_j(x) - y_j(x)| < \delta$ for each $j$ and $x$), and thus

$$\Big| f(x) - \sum_j d_j \exp[y_j(x)] \Big| \le \Big| f(x) - \sum_j d_j \exp[w_j(x)] \Big| + \Big| \sum_j d_j \exp[w_j(x)] - \sum_j d_j \exp[y_j(x)] \Big|$$

$$\le \epsilon/3 + \sum_j |d_j| \cdot \big| \exp[w_j(x)] - \exp[y_j(x)] \big| \le (2\epsilon)/3, \quad x \in K.$$

Now let $[c, d] \subset \mathbb{R}$ be such that $y_j(K) \subset [c, d]$ for each $j$. Pick $u_1, \ldots, u_\ell \in U$ so that $|\exp(\beta) - \sum_i u_i(\beta)| \le \epsilon_1$, $\beta \in [c, d]$, where $\epsilon_1 \sum_j |d_j| < \epsilon/3$. Then

$$\Big| f(x) - \sum_j \sum_i d_j u_i[y_j(x)] \Big| \le \Big| f(x) - \sum_j d_j \exp[y_j(x)] \Big| + \Big| \sum_j d_j \exp[y_j(x)] - \sum_j \sum_i d_j u_i[y_j(x)] \Big|$$

$$\le (2\epsilon)/3 + \sum_j |d_j| \cdot \Big| \exp[y_j(x)] - \sum_i u_i[y_j(x)] \Big| \le (2\epsilon)/3 + \epsilon_1 \sum_j |d_j| < \epsilon.$$

Since $\sum_j \sum_i d_j u_i[y_j(x)]$ can be written in the form $\sum_j c_j u_j[y_j(x)]$, with the $c_j$, $u_j$, and $y_j$ in $\mathbb{R}$, $U$, and $W$, respectively, we have proved the theorem.

8 Here we view $K$ as a metric space with the metric derived in the usual way from the norm in $Z$.

References

[1] L. A. Liusternik and V. J. Sobolev, Elements of Functional Analysis. New York: Frederick Ungar Publishing Co., 1961.
[2] T. Chen and H. Chen, "Approximation capability to functions of several variables, nonlinear functionals, and operators by radial basis function neural networks," IEEE Transactions on Neural Networks, vol. 6, pp. 904-910, July 1995.
[3] I. W. Sandberg, "On the boundedness of solutions of nonlinear integral equations," Bell System Technical Journal, vol. 44, pp. 439-453, March 1965.
[4] I. W. Sandberg, "Some results on the theory of physical systems governed by nonlinear functional equations," Bell System Technical Journal, vol. 44, pp. 871-898, May-June 1965.
[5] S. Boyd, Volterra Series: Engineering Fundamentals. PhD thesis, The University of California at Berkeley, Berkeley, California, 1985.
[6] B. D. Coleman and W. Noll, "An approximation theorem for functionals, with applications in continuum mechanics," Arch. Rational Mech. Anal., vol. 6, pp. 355-370, 1960.
[7] B. D. Coleman and V. J. Mizel, "Norms and semigroups in the theory of fading memory," Arch. Rational Mech. Anal., vol. 23, pp. 87-123, 1966.
[8] B. D. Coleman and V. J. Mizel, "On the general theory of fading memory," Arch. Rational Mech. Anal., vol. 29, pp. 18-31, 1968.
[9] E. W. Cheney, Multivariate Approximation Theory: Selected Topics, vol. 51 of The Regional Conference Series in Applied Mathematics. Philadelphia, Pennsylvania: SIAM, 1986.
[10] I. W. Sandberg and L. Xu, "(in preparation)," 1996.
[11] I. W. Sandberg, "Weighted norms and network approximation of functionals," in Proceedings of the special sessions on The Mathematical Theory of Electrical Networks presented at the MTNS'96 Conference, College of Engineering Technical Report no. 717, State University of New York at Stony Brook, (to appear) 1996.
[12] A. A. Goldstein, Constructive Real Analysis. Harper's Series in Modern Mathematics, New York: Harper & Row, 1967.
[13] A. Mukherjea and K. Pothoven, Real and Functional Analysis. New York: Plenum Press, 1978.
[14] R. C. Buck, "Approximation properties of vector valued functions," Pacific Journal of Mathematics, vol. 53, no. 1, pp. 85-94, 1974.
[15] W. A. Sutherland, Introduction to Metric and Topological Spaces. Oxford: Clarendon Press, 1975.
[16] I. W. Sandberg, "General structures for classification," IEEE Transactions on Circuits and Systems-I: Fundamental Theory and Applications, vol. 41, pp. 372-376, May 1994.
[17] A. T. Dingankar and I. W. Sandberg, "Tensor Product Neural Networks and Approximation of Dynamical Systems," in Proceedings of the International Symposium on Circuits and Systems, (Atlanta, Georgia), May 12-15 (to appear) 1996.
[18] A. T. Dingankar, On Applications of Approximation Theory to Identification, Control and Classification. PhD thesis, The University of Texas at Austin, Austin, Texas, 1995.
[19] G. Bachman and L. Narici, Functional Analysis. New York: Academic Press, Inc., 1966.
[20] W. Rudin, Principles of Mathematical Analysis. International Series in Pure and Applied Mathematics, New York: McGraw-Hill, Inc., Third ed., 1976.
