NOTE
Communicated by Irwin Sandberg
Universal Approximation of Multiple Nonlinear Operators by Neural Networks

Andrew D. Back
[email protected]
Windale Technologies, Brisbane, QLD 4075, Australia

Tianping Chen
[email protected]
Department of Mathematics, Fudan University, Shanghai 200433, China
Recently, there has been interest in the observed ability of some classes of neural networks with fixed weights to model multiple nonlinear dynamical systems. While this property has been observed in simulations, open questions remain as to how it can arise. In this article, we propose a theory that provides a possible mechanism by which this multiple modeling phenomenon can occur.

1 Introduction

Understanding the mechanisms by which neural networks can approximate functions, functionals, and operators has generated significant interest in recent years. The property of universal function approximation has been proven for various feedforward neural network models (Hornik, Stinchcombe, & White, 1989). In dynamic networks with embedded time delays, these results have been extended to the case of universal functional approximation (Chen & Chen, 1993) and universal operator approximation (Sandberg, 1991; Chen & Chen, 1995).¹ Universal approximation properties for recurrent neural networks over a finite time interval have been proven by Sontag (1992).

¹ We refer to neural networks with time-delay or recurrent connections as dynamic neural network models.

Related work on continuously or discretely time-varying dynamical systems exists in the literature under various names, for example, multiple controllers (Narendra, Balakrishnan, & Ciliz, 1995), modular neural networks (Jacobs, Jordan, Nowlan, & Hinton, 1991; Schmidhuber, 1992), and hybrid systems (Branicky, Borkar, & Mitter, 1994; Brockett, 1993; Branicky, 1996; Lemmon & Antsaklis, 1995). Recently, several researchers have shown independently that some types of neural network are capable of learning to model several different dynamical systems within one structure (Feldkamp, Puskorius, & Moore, 1997; Younger, Conwell, & Cotter, 1999; Hochreiter & Schmidhuber, 1997; Hochreiter, Younger, & Conwell, 2001).

In this article, we provide a theoretical explanation of a mechanism by which neural network models can approximate dynamical systems that change continuously or switch between some set of characteristic behaviors. We refer to such systems as multiple nonlinear operator (MNO) models. In the next section, we present results on the universal approximation of MNOs by neural networks, followed by a brief example.

2 Universal Approximation of Multiple Nonlinear Operators

2.1 Functional and Operator Approximation. It is well known that nonlinear dynamical systems can be treated as functionals and operators (Chen & Chen, 1995; Sandberg, 1991; Stiles, Sandberg, & Ghosh, 1997). The time delay neural network (TDNN) functional (Chen & Chen, 1993) is defined on a compact set in C[a, b] (the space of all continuous functions on [a, b]) and is given by
    G(u) = \sum_{i=1}^{N} c_i \, g\left( \sum_{j=1}^{m} \xi_{ij} u(t_j) + \theta_i \right),          (2.1)

where u(t) is a continuous, real-valued function evaluated at time t.
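As an illustration only (ours, not part of the original note), the following minimal sketch shows how the functional in equation 2.1 might be evaluated numerically. The choice g = tanh, the sample points t_j, and the random parameters are all assumptions made purely for illustration.

import numpy as np

def tdnn_functional(u, t, xi, theta, c, g=np.tanh):
    # Equation 2.1: G(u) = sum_i c_i * g( sum_j xi[i, j] * u(t[j]) + theta[i] )
    # u: callable input function; t: (m,) sample points in [a, b];
    # xi: (N, m) weights; theta: (N,) biases; c: (N,) coefficients;
    # g: bounded, nonpolynomial activation (tanh assumed here).
    u_samples = np.array([u(tj) for tj in t])   # u(t_j), shape (m,)
    hidden = g(xi @ u_samples + theta)          # hidden layer, shape (N,)
    return float(c @ hidden)                    # scalar value G(u)

# Example with u(t) = sin(t) and arbitrary random parameters:
rng = np.random.default_rng(0)
N, m = 8, 5
t = np.linspace(0.0, 1.0, m)
value = tdnn_functional(np.sin, t, rng.normal(size=(N, m)),
                        rng.normal(size=N), rng.normal(size=N))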
A more general class of models, which has significant implications for modeling dynamic systems, comprises those capable of approximating arbitrary nonlinear operators. For dynamic models, this means mapping from one time-varying sequence (a function of time) to another. Chen and Chen (1995) have proved universal approximation for the following operator model:

    G(u)(y) = \sum_{k=1}^{N} \sum_{i=1}^{M} c_i^k \, g\left( \sum_{j=1}^{m} \xi_{ij}^k u(x_j) + \theta_i^k \right) g(v_k \cdot y + \zeta_k),          (2.2)

where G(u)(y) is an arbitrary nonlinear operator with inputs u and y.
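Equation 2.2 can be read as two subnetworks whose outputs are multiplied and summed: one samples the function u, and the other processes the point y. The sketch below is our own minimal rendering under the same illustrative assumptions as before (g = tanh, arbitrary parameter shapes); it is not taken from the original article.

import numpy as np

def operator_network(u, y, x, xi, theta, c, v, zeta, g=np.tanh):
    # Equation 2.2:
    # G(u)(y) = sum_k sum_i c[k, i]
    #           * g( sum_j xi[k, i, j] * u(x[j]) + theta[k, i] )
    #           * g( v[k] . y + zeta[k] )
    # u: callable; y: (n,) point in K2; x: (m,) sample points in K1;
    # xi: (N, M, m); theta, c: (N, M); v: (N, n); zeta: (N,).
    u_samples = np.array([u(xj) for xj in x])                  # (m,)
    inner = g(np.einsum('kim,m->ki', xi, u_samples) + theta)   # (N, M)
    outer = g(v @ y + zeta)                                    # (N,)
    return float(np.einsum('ki,ki,k->', c, inner, outer))

Replacing each fixed coefficient c[k, i] by a network that computes it from a further input is the step taken next for multiple nonlinear operators.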
The Chen network can be regarded as a feedforward network with two weight layers, in which each output weight has been replaced by a two-layer functional approximating network. The idea of such parameter replacement networks has also been proposed independently elsewhere (Priestley, 1980; Back & Chen, 1998; Schmidhuber, 1992); in that previous work, however, universal operator approximation properties were not proven.

2.2 Multiple Nonlinear Operator Approximation. Consider a multiple nonlinear operator H, defined broadly as a set of mappings of the form
H: F_i → F_{i+1}, where i = 1, 2, ... is the index of the discrete functionals in the discrete multiple model case, or H: F_a → F_b in the continuous multiple model case. Such a model encapsulates the basic framework of the multiple model capabilities observed in various neural networks (Feldkamp et al., 1997). Following an approach similar to that of Chen and Chen (1995), it is possible to construct a network that universally approximates a multiple nonlinear operator. We obtain the following theorem:

Theorem 1. Suppose that g is a nonpolynomial, continuous, bounded function, X is a Banach space, K_1 ⊆ X is a compact set in X, K_2, K_3, K_4 ⊆ R^n are compact sets in R^n, V is a compact set in C(K_1), Z is a compact set in C(K_3), and G is a nonlinear continuous operator that maps V into C(K_2) × C(K_3). Then for any ε > 0, there are positive integers M, N, P, m, p; constants c_i^{kh}, Q_{il}^{kh}, r_i^{kh}, ζ_k, ξ_{ij}^k, θ_i^k ∈ R; and points v_k ∈ R^n, x_j ∈ K_1, x_l ∈ K_3, for i = 1, ..., M, k = 1, ..., N, h = 1, ..., P, j = 1, ..., m, l = 1, ..., p, such that

    \left| G(u)(y)(z) - \sum_{k=1}^{N} \sum_{i=1}^{M} \left[ \sum_{h=1}^{P} c_i^{kh} \, g\left( \sum_{l=1}^{p} Q_{il}^{kh} z(x_l) + r_i^{kh} \right) \right] g\left( \sum_{j=1}^{m} \xi_{ij}^k u(x_j) + \theta_i^k \right) g(v_k \cdot y + \zeta_k) \right| < \varepsilon          (2.3)

holds for all u ∈ V, y ∈ K_2, and z ∈ Z.

Proof. Since G is a nonlinear continuous operator, it follows from theorem 5 of Chen and Chen (1995) that for any ε > 0 there exist positive integers M, N; real numbers c_i^k(G(z)), θ_i^k, ζ_k, ξ_{ij}^k ∈ R; and vectors v_k ∈ R^n, i = 1, ..., M, j = 1, ..., m, k = 1, ..., N, such that

    \left| G(u)(y)(z) - \sum_{k=1}^{N} \sum_{i=1}^{M} c_i^k(G(z)) \, g\left( \sum_{j=1}^{m} \xi_{ij}^k u(x_j) + \theta_i^k \right) g(v_k \cdot y + \zeta_k) \right| < \frac{\varepsilon}{2}          (2.4)

holds for all y ∈ K_2, u ∈ V, and z ∈ Z. As before, we note that since G is a continuous operator, for each k = 1, ..., N, c_i^k(G(z)) is a continuous functional defined on Z. Hence, by applying theorem 4 of Chen and Chen (1995) for each i = 1, ..., M, k = 1, ..., N, it is possible to determine positive integers N_k, m_k and constants c_i^{kh}, Q_{ij}^{kh}, r_i^{kh} ∈ R,
and points x_j ∈ K_3, h = 1, ..., N_k, j = 1, ..., m_k, such that

    \left| c_i^k(G(z)) - \sum_{h=1}^{N_k} c_i^{kh} \, g\left( \sum_{j=1}^{m_k} Q_{ij}^{kh} z(x_j) + r_i^{kh} \right) \right| < \frac{\varepsilon}{2L}          (2.5)
holds for all i = 1, ..., M, k = 1, ..., N, and z ∈ Z, where

    L = \sum_{k=1}^{N} \sup_{y \in K_2} \left| g(v_k \cdot y + \zeta_k) \right|.
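Since g is bounded, L is finite (for g = tanh, L ≤ N). For intuition only, here is a small sketch of ours estimating L by a grid search over a one-dimensional K_2 = [0, 1]; the parameters are pure placeholders.

import numpy as np

# Estimate L = sum_k sup_{y in K2} |g(v_k . y + zeta_k)| on a grid over K2 = [0, 1].
g = np.tanh
rng = np.random.default_rng(1)
N = 8
v = rng.normal(size=(N, 1))                         # v_k, here with n = 1
zeta = rng.normal(size=N)                           # zeta_k
grid = np.linspace(0.0, 1.0, 201).reshape(-1, 1)    # grid points y in K2
L = np.abs(g(grid @ v.T + zeta)).max(axis=0).sum()  # sum of per-unit suprema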
Therefore, substituting equation 2.5 into equation 2.4, we have

    \left| G(u)(y)(z) - \sum_{k=1}^{N} \sum_{i=1}^{M} \sum_{h=1}^{N_k} c_i^{kh} \, g\left( \sum_{j=1}^{m_k} Q_{ij}^{kh} z(x_j) + r_i^{kh} \right) g\left( \sum_{j=1}^{m} \xi_{ij}^k u(x_j) + \theta_i^k \right) g(v_k \cdot y + \zeta_k) \right|
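The construction in theorem 1 is explicit enough to sketch numerically, even though only part of the proof is reproduced above. The following is our illustration rather than the authors' implementation: g = tanh, all array shapes are assumptions, and for simplicity every coefficient subnetwork shares one size (N_k, m_k) = (P, p), as in the theorem statement.

import numpy as np

def mno_network(u, y, z, xu, xz, c, Q, r, xi, theta, v, zeta, g=np.tanh):
    # Equation 2.3: each output coefficient of the operator network of
    # equation 2.2 is generated by a functional network of z:
    # G(u)(y)(z) ~ sum_k sum_i [ sum_h c[k,i,h] * g( sum_l Q[k,i,h,l]*z(xz[l]) + r[k,i,h] ) ]
    #              * g( sum_j xi[k,i,j]*u(xu[j]) + theta[k,i] ) * g( v[k] . y + zeta[k] )
    # Shapes: c, r: (N, M, P); Q: (N, M, P, p); xi: (N, M, m);
    # theta: (N, M); v: (N, n); zeta: (N,); xu: (m,); xz: (p,).
    u_s = np.array([u(t) for t in xu])                 # u(x_j), (m,)
    z_s = np.array([z(t) for t in xz])                 # z(x_l), (p,)
    # Coefficient subnetworks (the role played by equation 2.5):
    coeff = np.einsum('kih,kih->ki', c,
                      g(np.einsum('kihl,l->kih', Q, z_s) + r))   # (N, M)
    inner = g(np.einsum('kij,j->ki', xi, u_s) + theta)           # (N, M)
    outer = g(v @ y + zeta)                                      # (N,)
    return float(np.einsum('ki,ki,k->', coeff, inner, outer))

With the parameters held fixed, varying only the function z changes the effective coefficients c_i^k(G(z)) and hence the operator applied to u; this is the fixed-weight multiple-model behavior that the theorem formalizes.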