A NONPARAMETRIC TEST OF THE NON-CONVEXITY OF REGRESSION

by Cheikh A.T. Diack* and Christine Thomas-Agnan**

* Université Paul Sabatier, Laboratoire de Statistique et Probabilités, 118 route de Narbonne, 31062 Toulouse, France.
** GREMAQ, Manufacture des Tabacs, 21 Allées de Brienne, 31000 Toulouse, France.

Abstract

This paper proposes a nonparametric test of the non-convexity of a smooth regression function based on least squares or hybrid splines. By a simple formulation of the convexity hypothesis in the class of all cubic polynomial splines, we build a test whose asymptotic size equals the nominal level. The test is shown to be consistent and robust to nonnormality. Its behavior under local alternatives is also studied.

Key words: least squares estimator, test of convexity, B-splines, modulus of continuity.

1 INTRODUCTION

This paper proposes a test of non-convexity of an unknown regression function in a nonparametric regression model. Interest in this problem, which has been addressed for example in Schlee (1980) and Yatchew (1992), arises in particular from econometric models. Economic theory predicts the convexity of functions such as cost functions, production functions, or Engel curves. Such tests provide a way of confronting theory with real data sets. We consider the following regression model, where we are given $n$ realizations $y_{ij}$ of real random variables:

$$y_{ij} = f(x_i) + \varepsilon_{ij}, \qquad i = 1,\dots,r, \; j = 1,\dots,n_i,$$

with $x_i \in (0,1)$, $i = 1,\dots,r$. At each deterministic design point $x_i$ ($i = 1,\dots,r$), $n_i$ measurements are taken. The probability measure assigning mass $\lambda_i = n_i/n$ to the point $x_i$ (with $\sum_i \lambda_i = 1$) is referred to as the design and will be denoted by $\xi_n$. We assume that the random errors $\varepsilon_{ij}$ are uncorrelated and identically distributed with mean zero. Their variance $\sigma^2$ will be assumed known. Finally, $f$ is an unknown smooth regression function.

We use a cubic spline to approximate the function $f$, not only because of the common use of cubic splines in approximation problems, but also because they allow the convexity hypothesis to be formulated in a very simple way. Indeed, if $g$ is a cubic spline, then its second derivative is a linear function between any pair of adjacent knots $\xi_i$ and $\xi_{i+1}$, and it follows that $g$ is convex on the interval $\xi_i \le x \le \xi_{i+1}$ if and only if $g''(\xi_i)$ and $g''(\xi_{i+1})$ are both non-negative. This property was used by Dierckx (1980) to define a convex estimator.

After some preliminaries, we briefly describe the cubic spline estimator in the next section. In Section 3, we define our test of non-convexity of the regression function; the rest of that section is devoted to a discussion and demonstration of its properties. Section 4 presents an application to a real data set.

2 NOTATIONS AND PRELIMINARIES

2.1 The Cubic Spline Estimator

Let $p$ be a positive continuous density on $(0,1)$. We assume that

$$\min_{0 \le x \le 1} p(x) > 0.$$

Let $0 = \xi_0 < \xi_1 < \dots < \xi_{k+1} = 1$ be a subdivision of the interval $(0,1)$ by $k$ distinct points defined by

$$\int_0^{\xi_i} p(x)\,dx = \frac{i}{k+1}, \qquad i = 0,\dots,k+1. \qquad (2.1)$$

Let $\Delta k = \max_{0 \le i \le k} (\xi_{i+1} - \xi_i)$; we see that

$$\frac{\Delta k}{\min_{0 \le i \le k} (\xi_{i+1} - \xi_i)} \le \frac{\max_x p(x)}{\min_x p(x)}. \qquad (2.2)$$
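Since (2.1) places the knot $\xi_i$ at the $i/(k+1)$-quantile of the design density $p$, the knots are straightforward to compute numerically. The following minimal sketch (our illustration, not part of the paper; the function name `knots_from_density` is ours) inverts a gridded CDF of $p$:

```python
import numpy as np

def knots_from_density(p, k, grid_size=10_000):
    """Solve  int_0^{xi_i} p(x) dx = i/(k+1)  (equation (2.1)) for the
    interior knots xi_1 < ... < xi_k by inverting a gridded CDF of p."""
    x = np.linspace(0.0, 1.0, grid_size)
    cdf = np.cumsum(p(x))
    cdf /= cdf[-1]                        # normalized cumulative distribution of p
    levels = np.arange(1, k + 1) / (k + 1)
    return np.interp(levels, cdf, x)      # the quantiles of p are the knots

# A uniform density yields (approximately) equidistant knots:
xi = knots_from_density(lambda x: np.ones_like(x), k=5)
```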

For each fixed set of knots of the form (2.1), we define S(k,d) as the collection of all polynomial splines of order $d$ (degree $d-1$) having for knots $\xi_1 < \dots < \xi_k$. The class S(k,d) of such splines is a linear space of functions of dimension $k+d$. A basis for this linear space is provided by the B-splines (see Schumaker (1981)). Let $\{N_1,\dots,N_{k+d}\}$ be the set of normalized B-splines associated with the following nondecreasing sequence $\{t_1,\dots,t_{k+2d}\}$:

$$t_1 = t_2 = \dots = t_d = 0; \qquad t_{d+l} = \xi_l, \; l = 1,\dots,k; \qquad t_{d+k+1} = \dots = t_{2d+k} = 1.$$

The reader is referred to Schumaker (1981) for a discussion of these B-splines. In what follows, we shall only work with the class of cubic splines, S(k,4). A cubic spline on the interval $(0,1)$ is said to be a natural cubic spline if its second and third derivatives are zero at 0 and 1. These conditions imply that a natural cubic spline is linear on the two extreme intervals $(0,\xi_1)$ and $(\xi_k,1)$. It will be convenient to introduce the following notations:

$$N(x) = (N_1(x),\dots,N_{k+4}(x))' \in \mathbb{R}^{k+4}; \qquad F = (N(x_1),\dots,N(x_r)), \; \text{a } (k+4)\times r \text{ matrix}.$$

We will denote by $\hat f_n$ the least squares spline estimator of $f$:

$$\hat f_n(x) = \sum_{p=1}^{k+4} \hat\theta_p N_p(x), \qquad (2.3)$$

where

$$\hat\theta = (\hat\theta_1,\dots,\hat\theta_{k+4})' = \arg\min_{\theta \in \mathbb{R}^{k+4}} \sum_{i=1}^{r} \sum_{j=1}^{n_i} \Big( y_{ij} - \sum_{p=1}^{k+4} \theta_p N_p(x_i) \Big)^2.$$

Let

$$\bar y_i = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij}, \qquad \bar\varepsilon_i = \frac{1}{n_i} \sum_{j=1}^{n_i} \varepsilon_{ij}, \qquad (2.4)$$

$$\bar Y = (\bar y_1,\dots,\bar y_r)', \qquad \bar\varepsilon = (\bar\varepsilon_1,\dots,\bar\varepsilon_r)', \qquad f = (f(x_1),\dots,f(x_r))'.$$

Let $D(\xi_n)$ be the $r \times r$ diagonal matrix with diagonal elements $\lambda_1,\dots,\lambda_r$; then basic least squares arguments prove that

$$\hat\theta = M^{-1}(\xi_n)\, F D(\xi_n)\, \bar Y \qquad \text{with} \qquad M(\xi_n) = \sum_{i=1}^{r} N(x_i) N(x_i)' \lambda_i = F D(\xi_n) F'.$$
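To make the estimator concrete, here is a short sketch (ours, not the authors' code) that evaluates the normalized B-spline basis through scipy's `BSpline` with unit coefficient vectors and then computes $\hat\theta = M^{-1}(\xi_n) F D(\xi_n) \bar Y$; `bspline_basis` and `least_squares_spline` are hypothetical helper names:

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, xi, d=4):
    """Evaluate the k+d normalized B-splines of order d (degree d-1) built on
    the extended knot sequence t_1 <= ... <= t_{k+2d} of Section 2.1."""
    t = np.r_[np.zeros(d), xi, np.ones(d)]       # boundary knots repeated d times
    n_basis = len(xi) + d
    B = np.empty((len(x), n_basis))
    for p in range(n_basis):
        c = np.zeros(n_basis); c[p] = 1.0
        B[:, p] = BSpline(t, c, d - 1)(x)        # N_p evaluated at all x
    return B

def least_squares_spline(x, ybar, lam, xi):
    """theta_hat = M^{-1} F D Ybar  with  M = F D F'  (weighted least squares
    on the grouped means ybar_i, with design weights lam_i = n_i / n)."""
    F = bspline_basis(x, xi).T                   # (k+4) x r, columns N(x_i)
    D = np.diag(lam)
    M = F @ D @ F.T
    return np.linalg.solve(M, F @ D @ ybar)
```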

Asymptotic properties of this estimator have been established in Agarwal and Studden (1980). We will also use the hybrid cubic spline estimator of $f$, denoted by $\tilde f_n$, as in Kelly and Rice (1990). It is defined to be the cubic spline minimizing the penalized least squares criterion, i.e.

$$\tilde f_n(x) = \sum_{p=1}^{k+4} \tilde\theta_p N_p(x), \qquad (2.5)$$

where

$$\tilde\theta = \arg\min_{\theta \in \mathbb{R}^{k+4}} \sum_{i=1}^{r} \sum_{j=1}^{n_i} \Big( y_{ij} - \sum_{p=1}^{k+4} \theta_p N_p(x_i) \Big)^2 + \lambda \int_0^1 \Big( \sum_{p=1}^{k+4} \theta_p N_p''(x) \Big)^2 dx, \qquad (2.6)$$

with a smoothing parameter $\lambda > 0$. This estimator is therefore a hybrid of smoothing splines and least squares splines, in the sense that the amount of smoothing is controlled both by a penalty term, through the choice of $\lambda$ as in smoothing splines, and by the number of knots $k$, as in least squares splines. As above with $\hat\theta$, we can give a closed form for $\tilde\theta$. For this we need to introduce the matrix $W$ defined by

$$W_{ij} = \int_0^1 N_i''(x)\, N_j''(x)\, dx, \qquad i,j = 1,\dots,k+4.$$

$W$ is a band matrix. Indeed, $N_i(x) = 0$ if $x \notin (t_i, t_{i+4})$ (see Agarwal and Studden (1980)); hence $W_{ij} = 0$ if $|i - j| \ge 4$. It is easy to see that if $g = \sum_{p=1}^{k+4} \theta_p N_p$, then

$$\int_0^1 (g''(x))^2\, dx = \theta' W \theta. \qquad (2.7)$$

Now we are ready for the main result of this section.

Lemma 1 If $\tilde\theta$ satisfies (2.6), then $\tilde\theta = \Gamma^{-1}(\xi_n)\, F D(\xi_n)\, \bar Y$ with $\Gamma(\xi_n) = M(\xi_n) + \lambda W$.

Proof
Let $g = \sum_{p=1}^{k+4} \theta_p N_p$ be a cubic spline and let

$$L(g) = \sum_{i=1}^{r} \sum_{j=1}^{n_i} \Big( y_{ij} - \sum_{p=1}^{k+4} \theta_p N_p(x_i) \Big)^2 + \lambda \int_0^1 \Big( \sum_{p=1}^{k+4} \theta_p N_p''(x) \Big)^2 dx.$$

Decomposing $y_{ij} = \bar y_i + (y_{ij} - \bar y_i)$, the residual sum of squares about $g$ can be rewritten as

$$\sum_{i=1}^{r} \sum_{j=1}^{n_i} (y_{ij} - g(x_i))^2 = n\, (\bar Y - F'\theta)' D(\xi_n) (\bar Y - F'\theta) + \sum_{i=1}^{r} \sum_{j=1}^{n_i} (y_{ij} - \bar y_i)^2,$$

where the last term does not depend on $g$. Using (2.7), and absorbing the factor $n$ into the smoothing parameter, minimizing $L(g)$ thus amounts to minimizing

$$(\bar Y - F'\theta)' D(\xi_n)(\bar Y - F'\theta) + \lambda\, \theta' W \theta. \qquad (2.8)$$

Since $W$ is non-negative definite, the matrix $M(\xi_n) + \lambda W$ is strictly positive definite. It therefore follows that (2.8) has a unique minimum, obtained by setting $g = \tilde f_n$ with $\tilde\theta = \Gamma^{-1}(\xi_n)\, F D(\xi_n)\, \bar Y$. □
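Lemma 1 reduces the hybrid estimator to a single linear solve. Below is a sketch of $\tilde\theta = \Gamma^{-1}(\xi_n) F D(\xi_n) \bar Y$ (ours; `bspline_basis` is the helper from the previous sketch). Since $N_i''$ is piecewise linear for cubic splines, the integrand of $W_{ij}$ is piecewise quadratic, so two-point Gauss-Legendre quadrature on each inter-knot interval computes $W$ exactly:

```python
import numpy as np
from scipy.interpolate import BSpline

def penalty_matrix(xi, d=4):
    """W_ij = int_0^1 N_i''(x) N_j''(x) dx, assembled interval by interval."""
    t = np.r_[np.zeros(d), xi, np.ones(d)]
    m = len(xi) + d
    d2 = []                                       # second derivatives of the basis
    for p in range(m):
        c = np.zeros(m); c[p] = 1.0
        d2.append(BSpline(t, c, d - 1).derivative(2))
    g, w = np.polynomial.legendre.leggauss(2)     # exact for polynomials of degree <= 3
    W = np.zeros((m, m))
    breaks = np.r_[0.0, xi, 1.0]
    for a, b in zip(breaks[:-1], breaks[1:]):
        nodes = 0.5 * (b - a) * g + 0.5 * (a + b)
        wts = 0.5 * (b - a) * w
        vals = np.array([s(nodes) for s in d2])   # shape (m, 2)
        W += (vals * wts) @ vals.T                # sum_q wts_q N_i''(x_q) N_j''(x_q)
    return W

def hybrid_spline(x, ybar, lam, smooth, xi):
    """theta_tilde = (M + smooth * W)^{-1} F D Ybar  (Lemma 1)."""
    F = bspline_basis(x, xi).T
    D = np.diag(lam)
    M = F @ D @ F.T
    return np.linalg.solve(M + smooth * penalty_matrix(xi), F @ D @ ybar)
```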

2.2 Smoothness Conditions

We will assume some regularity conditions on $f$. For $m \in \mathbb{N}$ and $M, L > 0$, let

$$\mathcal{F}_{m,M} = \Big\{ f \in C^{m+1}(0,1) : \sup_{0 \le x \le 1} | f^{((m+1)\wedge 4)}(x) | \le M \Big\},$$

$$\mathcal{G}_{m,M,L} = \Big\{ f \in \mathcal{F}_{m,M} : \sup_{0 \le x \le 1} | f | \le L \Big\}.$$

The following lemma is an immediate consequence of Theorem 2.59 of Schumaker (1981).

Lemma 2 For $f \in C^m$,

$$\omega_m(f,t) \le t^m \sup_{0 \le x \le 1} | f^{(m)}(x) | \qquad \text{as } t \to 0,$$

where $\omega_m(f,\cdot)$ denotes the $m$th modulus of smoothness of $f$ in $L^\infty$.

When $m = 1$, the $m$th modulus of smoothness is traditionally referred to as the modulus of continuity.

3 TEST OF NON-CONVEXITY

3.1 Convexity and Strict Convexity in S(k,4)

Let us first characterize convexity in the class S(k,4). For a function $g$ in the class S(k,4), we can write

$$g(x) = \sum_{p=1}^{k+4} \theta_p N_p(x) \qquad \text{with} \qquad \theta = (\theta_1,\dots,\theta_{k+4})' \in \mathbb{R}^{k+4}.$$

Then

$$g''(\xi_l) = \sum_{p=1}^{k+4} \theta_p N_p''(\xi_l) = \sum_{p=1}^{k+4} \theta_p d_{pl},$$

where the coefficients $d_{pl}$ are easily calculated from the knots (see Dierckx (1980)):

$$d_{pl} = 0 \text{ if } p \le l \text{ or } p \ge l+4, \qquad d_{l+1,l} = \frac{6}{(t_{l+5}-t_{l+2})(t_{l+5}-t_{l+3})},$$
$$d_{l+3,l} = \frac{6}{(t_{l+6}-t_{l+3})(t_{l+5}-t_{l+3})}, \qquad d_{l+2,l} = -(d_{l+1,l} + d_{l+3,l}), \qquad l = 0,\dots,k+1.$$

Let $B_l = (0,\dots,0,-d_{l+1,l},-d_{l+2,l},-d_{l+3,l},0,\dots,0)' \in \mathbb{R}^{k+4}$ and $\theta = (\theta_1,\dots,\theta_{k+4})'$; then

$$g''(\xi_l) = -B_l'\theta.$$

We have already mentioned in the introduction that $g$ is a convex function if and only if $g''(\xi_l)$ is non-negative for all $l = 0,\dots,k+1$, or equivalently $-B_l'\theta \ge 0$ for all $l = 0,\dots,k+1$. Characterizing strictly convex functions in a similar way does not seem possible, but the following lemma gives a necessary condition.

Lemma 3 Let $g(x) = \sum_{p=1}^{k+4} \theta_p N_p(x)$ and let $C_l = B_l + B_{l+1}$, $l = 0,\dots,k$. If $g$ is strictly convex, then for all $l = 0,\dots,k$, $-C_l'\theta > 0$.

Proof

Let us assume that $g$ is strictly convex. Since $g \in$ S(k,4), $g''$ is a linear function between any pair of adjacent knots $\xi_i$ and $\xi_{i+1}$. Therefore we can write

$$g''(x) = a_i x + b_i \qquad \text{for all } x \in (\xi_i, \xi_{i+1}).$$

Moreover $g''(x) \ge 0$, or equivalently $a_i x + b_i \ge 0$, $i = 0,\dots,k$. Hence $g$ is strictly convex if and only if

$$\begin{cases} \text{if } a_i > 0, & \text{then } a_i\xi_i + b_i = g''(\xi_i) \ge 0 \text{ and } g''(\xi_{i+1}) > 0; \\ \text{if } a_i < 0, & \text{then } a_i\xi_{i+1} + b_i = g''(\xi_{i+1}) \ge 0 \text{ and } g''(\xi_i) > 0; \\ \text{if } a_i = 0, & \text{then } b_i > 0. \end{cases}$$

But $a_i = \dfrac{g''(\xi_{i+1}) - g''(\xi_i)}{\xi_{i+1} - \xi_i}$ and $b_i = g''(\xi_i) - \dfrac{g''(\xi_{i+1}) - g''(\xi_i)}{\xi_{i+1} - \xi_i}\, \xi_i$; therefore $g$ is strictly convex if and only if

$$\begin{cases} \text{if } g''(\xi_{i+1}) > g''(\xi_i), & \text{then } g''(\xi_i) \ge 0 \text{ and } g''(\xi_{i+1}) > 0; \\ \text{if } g''(\xi_{i+1}) < g''(\xi_i), & \text{then } g''(\xi_{i+1}) \ge 0 \text{ and } g''(\xi_i) > 0; \\ \text{if } g''(\xi_{i+1}) = g''(\xi_i), & \text{then } g''(\xi_i) > 0. \end{cases}$$

In every case $g''(\xi_i) + g''(\xi_{i+1}) > 0$. Hence, if $g$ is strictly convex, then $-(B_l'\theta + B_{l+1}'\theta) > 0$. □

It is easy to see that there are no redundant vectors in $\{C_0,\dots,C_k\}$, i.e. there is no $C_j$ such that

$$\{\theta : -C_l'\theta > 0,\; l = 0,\dots,k\} = \{\theta : -C_l'\theta > 0,\; l = 0,\dots,k,\; l \ne j\}. \qquad (3.1)$$
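The coefficients $d_{pl}$, and hence the vectors $B_l$ and $C_l$, can be assembled directly from the extended knot sequence. A sketch (ours) following Dierckx's formulas quoted above; indices are shifted by one because Python arrays are 0-based:

```python
import numpy as np

def convexity_vectors(xi, d=4):
    """Return the matrix whose rows are C_l = B_l + B_{l+1}, l = 0..k, where
    -B_l' theta = g''(xi_l) for g = sum_p theta_p N_p (Section 3.1)."""
    k = len(xi)
    t = np.r_[np.zeros(d), xi, np.ones(d)]   # t_1..t_{k+2d}, stored 0-based
    B = np.zeros((k + 2, k + d))             # row l holds B_l, l = 0..k+1
    for l in range(k + 2):
        # d_{l+1,l} and d_{l+3,l} from Dierckx (1980); t_j (1-based) = t[j-1]
        d1 = 6.0 / ((t[l + 4] - t[l + 1]) * (t[l + 4] - t[l + 2]))
        d3 = 6.0 / ((t[l + 5] - t[l + 2]) * (t[l + 4] - t[l + 2]))
        B[l, l] = -d1                        # -d_{l+1,l}
        B[l, l + 1] = d1 + d3                # -d_{l+2,l} = d_{l+1,l} + d_{l+3,l}
        B[l, l + 2] = -d3                    # -d_{l+3,l}
    return B[:-1] + B[1:]                    # C_0, ..., C_k
```

A spline with coefficient vector $\theta$ is then convex exactly when $-B_l'\theta \ge 0$ for every $l$, and Lemma 3 says that strict convexity forces $-C_l'\theta > 0$ for every $l$.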

3.2 Definition of the Test

We intend to construct a test of $H_0$: "$f$ is not strictly convex" against $H_1$: "$f$ is strictly convex." Lemma 3 does not characterize strict convexity; however, we have the following property. Define $\mathcal{C}_k$ to be the class of functions $g \in C^2(0,1)$ satisfying the set of conditions $g''(\xi_l) + g''(\xi_{l+1}) > 0$, $l = 0,\dots,k$, associated with $k$ knot points. If $k$ tends to infinity, the union of all knot points defined by (2.1) is dense in $(0,1)$. It follows that a function $g$ in $C^2(0,1)$ belongs to $\bigcap_{k=1}^{\infty} \mathcal{C}_k$ if and only if $g$ is strictly convex. It is therefore natural to weaken $H_1$, replacing it by $H_1^k$: $f \in \mathcal{C}_k$, for some large $k$. It is then possible to construct a test of $H_0$ against $H_1^k$ using a likelihood ratio test (LRT) procedure studied by Berger (1989). Berger showed that if $X$ is a $p$-variate (in our case $p = k+4$) normal random vector with unknown mean $\beta$ and nonsingular covariance matrix $\Sigma$, and if $C_0,\dots,C_k$ satisfy (3.1), then the size-$\alpha$ LRT of the null hypothesis "$-C_l'\beta \le 0$ for some $l = 0,\dots,k$" versus the alternative hypothesis "$-C_l'\beta > 0$ for all $l = 0,\dots,k$" is the test which rejects the null hypothesis if

$$\frac{-C_l' X}{(C_l' \Sigma C_l)^{1/2}} \ge q_\alpha \qquad \text{for all } l = 0,\dots,k,$$

where $q_\alpha$ is the upper $100\alpha$ percentile of the standard normal distribution. On the other hand, Beatson (1982) shows that for a smooth convex function $f \in C^m(0,1)$ ($0 \le m \le 3$), the uniform distance between $f$ and the set $S^*(k,4)$ of convex functions of S(k,4) tends to zero when the mesh size $\Delta k$ tends to zero (see Lemma 6 below). Therefore, for $f \in C^m(0,1)$ ($0 \le m \le 3$), in order to test $H_0$ against $H_1$, and since $\frac{\sigma^2}{n} M^{-1}(\xi_n)$ and $\frac{\sigma^2}{n} \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n)$ are respectively the covariance matrices of $\hat\theta$ and $\tilde\theta$, it is natural to define a test (T) by rejecting $H_0$ when

$$\frac{-\sqrt{n}\, C_l' \hat\theta}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} \ge q_\alpha \qquad \text{for all } l = 0,\dots,k,$$

for the least squares estimator of $f$. For the hybrid cubic spline estimator of $f$, we define a test (T') by rejecting $H_0$ when

$$\frac{-\sqrt{n}\, C_l' \tilde\theta}{\sigma\, (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2}} \ge q_\alpha \qquad \text{for all } l = 0,\dots,k.$$
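Operationally, (T) rejects $H_0$ exactly when the smallest of the $k+1$ standardized statistics exceeds $q_\alpha$. A sketch (ours), with $\sigma$ assumed known as in the paper and `C` the matrix of rows $C_l$ built in the previous sketch:

```python
import numpy as np
from scipy.stats import norm

def convexity_test(theta_hat, M, C, n, sigma, alpha=0.05):
    """Test (T): reject H0 'f is not strictly convex' when
    -sqrt(n) C_l' theta_hat / (sigma (C_l' M^{-1} C_l)^{1/2}) >= q_alpha
    for all l = 0..k."""
    q_alpha = norm.ppf(1.0 - alpha)           # upper 100*alpha percentile
    Minv_C = np.linalg.solve(M, C.T)          # M^{-1} C_l for all l at once
    scale = sigma * np.sqrt(np.einsum('lj,jl->l', C, Minv_C))
    stats = -np.sqrt(n) * (C @ theta_hat) / scale
    return bool(stats.min() >= q_alpha), stats
```

The test (T') is obtained by the same call with `theta_hat` replaced by the hybrid coefficients $\tilde\theta$ and `M` replaced by $\Gamma(\xi_n) M^{-1}(\xi_n) \Gamma(\xi_n)$, whose inverse is the matrix $\Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n)$ appearing in the denominator of (T').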

3.3 Asymptotic Properties

To derive asymptotic properties of (T) and (T'), we assume from now on that $\xi_n$ converges to a design measure $\xi$, where $\xi$ is an absolutely continuous measure. We denote by $H_n$ and $H$ the cumulative distribution functions of $\xi_n$ and $\xi$ respectively. The number of knots and the smoothing parameter will be functions of the sample size: $k = k_n$ and $\lambda = \lambda_n$. The critical regions of (T) and (T') are respectively

$$\Omega_n = \Big\{ \frac{-\sqrt{n}\, C_l' \hat\theta}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} \ge q_\alpha \; \text{ for all } l = 0,\dots,k_n \Big\}$$

and

$$\Omega_n' = \Big\{ \frac{-\sqrt{n}\, C_l' \tilde\theta}{\sigma\, (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2}} \ge q_\alpha \; \text{ for all } l = 0,\dots,k_n \Big\}.$$

In the following theorems, we give results about the size and the power of (T), and we approximate the distribution of the statistic by a Gaussian distribution whose mean converges to infinity. A measure of the distance between two distributions is given by the following modification of the Mallows distance (this measure was used by Härdle and Mammen (1993)):

$$d(\mu,\nu) = \inf_{X,Y} \big\{ \mathbb{E}\big( \| X - Y \|^2 \wedge 1 \big) : \mathcal{L}(X) = \mu, \; \mathcal{L}(Y) = \nu \big\}.$$

Convergence in this metric is equivalent to weak convergence (see Härdle and Mammen (1993)). Consider the following additional notations:

$$\bar f'' = \Big( \frac{f''(\xi_0) + f''(\xi_1)}{(C_0' M^{-1}(\xi_n) C_0)^{1/2}},\; \dots,\; \frac{f''(\xi_{k_n}) + f''(\xi_{k_n+1})}{(C_{k_n}' M^{-1}(\xi_n) C_{k_n})^{1/2}} \Big)'$$

and

$$\tilde f'' = \Big( \frac{f''(\xi_0) + f''(\xi_1)}{(C_0' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_0)^{1/2}},\; \dots,\; \frac{f''(\xi_{k_n}) + f''(\xi_{k_n+1})}{(C_{k_n}' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_{k_n})^{1/2}} \Big)'.$$

Let

$$T_n^l(f) = \frac{-\sqrt{n}\, C_l' \hat\theta}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}}, \qquad l = 0,\dots,k_n,$$

and

$$T_n(f) = (T_n^0(f),\dots,T_n^{k_n}(f))'.$$

$T_n'^{\,l}(f)$ and $T_n'(f)$ are defined similarly, with $\hat\theta$ replaced by $\tilde\theta$ and $M^{-1}(\xi_n)$ by $\Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n)$.

Theorem 1 Let $f \in \mathcal{F}_{m,M}$ with $m \ge 2$. Consider the hypotheses $H_0$: "$f$ is not strictly convex" and $H_1$: "$f$ is strictly convex." Then, under the following assumptions:

(i) the $\varepsilon_{ij}$, $i = 1,\dots,r$, $j = 1,\dots,n_i$, are i.i.d. with mean zero and variance $\sigma^2 < +\infty$;

(ii) $n_i \to +\infty$, $i = 1,\dots,r$;

(iii) $\sup_{0 \le x \le 1} | H_n(x) - H(x) | = o(k_n^{-1})$ as $k_n \to +\infty$;

(iv) $\lim_{n \to +\infty} r^{1/2}\, n^{1/2}\, k_n^{-(m+1)\wedge 4}\, \big( \sup_{1 \le i \le r} \lambda_i \big)^{1/2} = 0$;

the test (T), which rejects $H_0$ if

$$\frac{-\sqrt{n}\, C_l' \hat\theta}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} \ge q_\alpha \qquad \text{for all } l = 0,\dots,k_n,$$

has asymptotic size $\alpha$. More precisely,

$$\limsup_{n \to +\infty} \sup_{f \in H_0} P_f(\Omega_n) = \alpha.$$

We now give a result on the power of the test (T).

Theorem 2 Let us assume that $m \ge 2$. Under the assumptions of Theorem 1, if

(v) $\lim_{n \to +\infty} n^{1/2}\, k_n^{-5/2} = +\infty$,

then for $f \in H_1$ with $f'' > 0$,

$$\lim_{n \to +\infty} P_f(\Omega_n) = 1.$$

It is also interesting to know how the test behaves under local alternatives. We consider a sequence of local alternatives

$$f_n(x) = g(x) + h_n\, \eta(x), \qquad (*)$$

where the known function $\eta(\cdot)$ lies in $\mathcal{F}_{m,M}$ and $g$ lies in the null hypothesis. The following theorem gives the power of our test against the local misspecification (*).

Theorem 3 Let us assume that $m \ge 3$. Under assumptions (i), (iii) and (v), if $\varepsilon_{ij} \sim N(0,\sigma^2)$,

$$\lim_{n \to +\infty} r^{1/2}\, n^{1/2}\, k_n^{-3}\, \big( \sup_{1 \le i \le r} \lambda_i \big)^{1/2} = 0$$

and

$$h_n = (n\, k_n^{-5})^{-1/2},$$

then

$$d\big( \mathcal{L}(T_n(g + h_n \eta)),\; N(\sqrt{n}\, \bar g'' + k_n^{5/2}\, \bar\eta'',\; V(\xi_n)) \big) \to 0,$$

where

$$V(\xi_n) = \bar C'\, M^{-1}(\xi_n)\, \bar C$$

and

$$\bar C = \Big( \frac{C_0}{(C_0' M^{-1}(\xi_n) C_0)^{1/2}},\; \dots,\; \frac{C_{k_n}}{(C_{k_n}' M^{-1}(\xi_n) C_{k_n})^{1/2}} \Big).$$

We have similar theorems for the test (T') in the smooth class of functions $\mathcal{G}_{m,M,L}$.

Theorem 4 Let $f \in \mathcal{G}_{m,M,L}$ with $m \ge 2$. Under the assumptions of Theorem 1, and if

(iv') $\lim_{n \to +\infty} r^{1/2}\, n^{1/2}\, k_n^{7/2}\, \lambda_n\, \big( \sup_{1 \le i \le r} \lambda_i \big)^{1/2} = 0$,

then the test (T') has asymptotic size smaller than $\alpha$.

Theorem 5 Under the assumptions of Theorems 2 and 4, we have, for all $f$ in $H_1$ such that $f'' > 0$,

$$\lim_{n \to +\infty} P_f(\Omega_n') = 1.$$

Theorem 6 Under the assumptions of Theorems 3 and 4, if

$$\lim_{n \to +\infty} r^{1/2}\, n^{1/2}\, k_n^{13/2}\, \lambda_n\, \big( \sup_{1 \le i \le r} \lambda_i \big)^{1/2} = 0,$$

then

$$d\big( \mathcal{L}(T_n'(g + h_n \eta)),\; N(\sqrt{n}\, \tilde g'' + k_n^{5/2}\, \tilde\eta'',\; V_1(\xi_n)) \big) \to 0,$$

where

$$V_1(\xi_n) = \bar B'\, \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n)\, \bar B$$

and

$$\bar B = \Big( \frac{C_0}{(C_0' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_0)^{1/2}},\; \dots,\; \frac{C_{k_n}}{(C_{k_n}' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_{k_n})^{1/2}} \Big).$$

Remarks
Assumption (iii) is the same as in Agarwal and Studden (1980). It is easy to see that assumption (iv) implies that

$$\lim_{n \to +\infty} k_n = +\infty.$$

For a uniform design, i.e. $\lambda_i = 1/r$, $i = 1,\dots,r$, (iv) becomes

$$\lim_{n \to +\infty} n^{1/2}\, k_n^{-(m+1)\wedge 4} = 0.$$

This is the case, for example, when there are $n$ distinct points for $n$ measurements ($r = n$). If in particular $\lambda_n$ satisfies

$$\lim_{n \to +\infty} \lambda_n\, k_n^{7/2 + (m+1)\wedge 4} = 0,$$

then (iv') is satisfied. Finally, note that the test procedure requires knowledge of the variance $\sigma^2$. However, the results in this article remain approximately true if $\sigma$ is unknown; in order to compute the statistic, we then need a consistent estimate of $\sigma$. In the case of the least squares estimator, this can be obtained using

$$\hat\sigma_n^2 = \frac{1}{n - (k_n+4)} \sum_{i=1}^{r} n_i\, (\bar y_i - \hat f_n(x_i))^2,$$

where $\hat f_n$ and $\bar y_i$ are defined by (2.3) and (2.4); $\tilde\sigma_{\lambda_n}$ (for a fixed $\lambda_n$) is defined similarly. Alternatively, any consistent estimator based on nonparametric regression techniques can be used.
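A sketch of this residual-based variance estimate (our reconstruction of the formula above; the normalization $n - (k_n+4)$ is the residual degrees of freedom of the least squares spline):

```python
import numpy as np

def sigma_hat(ybar, fhat_vals, n_i, num_basis):
    """sigma_hat^2 = (1/(n - (k+4))) sum_i n_i (ybar_i - fhat(x_i))^2,
    where num_basis = k+4 is the dimension of S(k,4)."""
    n = int(np.sum(n_i))
    rss = float(np.sum(n_i * (ybar - fhat_vals) ** 2))
    return np.sqrt(rss / (n - num_basis))
```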

For the proofs of Theorems 1 to 6, we need some preliminary lemmas. Lemma 4, from Schumaker (1981), gives a sup-norm error bound when approximating a function in $C^m(0,1)$ ($0 \le m \le 3$) by a cubic spline.

Lemma 4 Let $0 \le m \le 3$. There is a constant $c$ such that for all $f \in C^m(0,1)$ there is a function $S$ in $S(k_n,4)$ such that

$$\sup_{0 \le x \le 1} | f^{(r)}(x) - S^{(r)}(x) | \le c\, \Delta k_n^{\,m-r}\, \omega_{4-m}(f^{(m)}, \Delta k_n) \qquad \text{for all } r = 0,\dots,m.$$

Using this property and Lemma 2, we easily get the following.

Lemma 5 Let $m \ge 0$. There is a constant $c$ such that for all $f \in \mathcal{F}_{m,M}$ there is a function $S$ in $S(k_n,4)$ such that

$$\sup_{0 \le x \le 1} | f^{(r)}(x) - S^{(r)}(x) | \le c\, \Delta k_n^{\,((m+1)\wedge 4) - r} \qquad \text{for all } r = 0,\dots,m \wedge 3.$$

Lemma 6, from Beatson (1982), gives a similar sup-norm error bound when approximating a convex function of $C^m(0,1)$ ($0 \le m \le 3$) by a convex cubic spline.

Lemma 6 There is a constant $c$ such that if $f \in C^m(0,1)$ ($0 \le m \le 3$) is convex, then there is a convex function $S$ in $S(k_n,4)$ such that

$$\sup_{0 \le x \le 1} | f(x) - S(x) | \le c\, \Delta k_n^{\,m}\, \omega(f^{(m)}, \Delta k_n).$$

The following lemma is obtained by a straightforward manipulation of results of Agarwal and Studden (1980).

Lemma 7 Under the assumptions of Theorem 1,

1) $\| M^{-1}(\xi_n) \| = \| (F D(\xi_n) F')^{-1} \| = O(k_n)$ as $n \to +\infty$;
2) $\| M(\xi_n) \| = O(k_n^{-1})$ as $n \to +\infty$;

where $\| A \| = \max_x (\| Ax \| / \| x \|)$ with $\| x \| = (x'x)^{1/2}$.

Proof: Equality 1) follows from Lemma 6.6 of Agarwal and Studden (1980). Now, recall that

$$M(\xi_n) = \sum_{i=1}^{r} N(x_i) N(x_i)' \lambda_i.$$

We can write

$$M(\xi_n) = \int N(x) N(x)'\, d\xi_n(x).$$

Let

$$M(\xi) = \int N(x) N(x)'\, d\xi(x).$$

Agarwal and Studden (1980) prove in their Lemma 6.3 that $\| M(\xi) \| = O(k_n^{-1})$ as $n \to +\infty$. As in equality 6.22 of Agarwal and Studden (1980), p. 1319, we can write $M^{-1}(\xi_n) = [I - U_n]^{-1} M^{-1}(\xi)$, where $I$ is the $(k+4) \times (k+4)$ identity matrix and $U_n$ is a matrix such that $\| U_n \| = o(1)$. Hence

$$\| M(\xi_n) \| \le \| M(\xi) \| \, \| I - U_n \|.$$

Now, since $\| M(\xi) \| = O(k_n^{-1})$ as $n \to +\infty$, we have the result. □

The following important result will be used in the proofs of Theorems 4 to 6.

Lemma 8 Under the assumptions of Theorem 3,

$$\| M^{-1}(\xi_n)\, W \| = O(k_n^4) \qquad \text{as } n \to +\infty.$$

Proof: We have

$$\| M^{-1}(\xi_n)\, W \| \le \| M^{-1}(\xi_n) \| \, \| W \|.$$

Using a classical inequality (see Ciarlet and Thomas (1982)), we obtain $\| W \| \le \| W \|_\infty$, where

$$\| A \|_\infty = \max_i \sum_j | a_{ij} |.$$

Hence, since $\| M^{-1}(\xi_n) \| = O(k_n)$, the lemma follows if we prove that $\| W \|_\infty = O(k_n^3)$. In fact, because $W$ is a band matrix, it suffices to prove that $| W_{ij} | = O(k_n^3)$. Using formula (6.19) (page 1318) of Agarwal and Studden (1980), we see that for all $i = 1,\dots,k_n+4$ and for $j = 1,2$,

$$\sup_{x \in (0,1)} | N_i^{(j)}(x) | = O(k_n^j),$$

and since the sequence of knots is quasi-uniform, we obtain

$$\sup_{x \in (0,1)} | N_i^{(j)}(x) | = O(\Delta k_n^{-j}). \qquad (**)$$

On the other hand, integrating by parts,

$$W_{ij} = \int_0^1 N_i''(x) N_j''(x)\, dx = N_i'(1) N_j''(1) - N_i'(0) N_j''(0) - \int_0^1 N_i'(x) N_j'''(x)\, dx.$$

Hence, using (**), we see that $| N_i'(1) N_j''(1) - N_i'(0) N_j''(0) | = O(\Delta k_n^{-3})$. Now we can write

$$\int_0^1 N_i'(x) N_j'''(x)\, dx = \sum_{p=0}^{k} \int_{\xi_p}^{\xi_{p+1}} N_i'(x) N_j'''(x)\, dx,$$

and, using the proof of Lemma 3 (since, for $x \in (\xi_p, \xi_{p+1})$, $N_j'''(x) = \dfrac{N_j''(\xi_{p+1}) - N_j''(\xi_p)}{\xi_{p+1} - \xi_p}$), we obtain

$$\Big| \int_{\xi_p}^{\xi_{p+1}} N_i'(x) N_j'''(x)\, dx \Big| \le \sup_x | N_i'(x) | \; | N_j''(\xi_{p+1}) - N_j''(\xi_p) |.$$

Using the notation of Section 3.1, we can write $N_j''(\xi_{p+1}) = d_{j,p+1}$ and $N_j''(\xi_p) = d_{jp}$. Hence the lemma follows from (**) and the fact that $d_{jp} = 0$ if $| j - p | \ge 4$ and $d_{jp} = O(\Delta k_n^{-2})$ otherwise. □

Proof of Theorem 1

Recall that

$$\hat\theta = M^{-1}(\xi_n) F D(\xi_n) \bar Y = M^{-1}(\xi_n) F D(\xi_n)(f + \bar\varepsilon);$$

hence $\hat\theta$ has the same asymptotic distribution as

$$N\Big( M^{-1}(\xi_n) F D(\xi_n) f,\; \frac{\sigma^2}{n} M^{-1}(\xi_n) \Big).$$

In what follows, we denote this by

$$\hat\theta \sim N_{As}\Big( M^{-1}(\xi_n) F D(\xi_n) f,\; \frac{\sigma^2}{n} M^{-1}(\xi_n) \Big).$$

Let $f \in \mathcal{F}_{m,M}$ ($m \ge 1$). By Lemma 5, there is an $S$ in $S(k_n,4)$ such that

$$\sup_{0 \le x \le 1} | f^{(r)}(x) - S^{(r)}(x) | \le c\, \Delta k_n^{\,((m+1)\wedge 4)-r} \qquad \text{for all } r = 0,\dots,m \wedge 3. \qquad (3.2)$$

Since $S \in S(k_n,4)$, there exists $\theta_0 \in \mathbb{R}^{k+4}$ such that $S(x) = N(x)'\theta_0$. Let $S = (S(x_1),\dots,S(x_r))' = (N(x_1)'\theta_0,\dots,N(x_r)'\theta_0)' = F'\theta_0$. Then

$$M^{-1}(\xi_n) F D(\xi_n) S = M^{-1}(\xi_n) F D(\xi_n) F' \theta_0 = \theta_0.$$

Hence

$$\mathbb{E}\hat\theta = \theta_0 + M^{-1}(\xi_n) F D(\xi_n)(f - S) = \theta_0 + b_n, \; \text{say}.$$

Now, let us assume that $f$ is not strictly convex. Then there exists an interval $I \subset (0,1)$ such that $f''(x) \le 0$ for all $x \in I$ ($f$ is concave on $I$). Therefore there exists $l_n$ with $0 \le l_n \le k_n$ such that

$$C_{l_n}'\theta_0 = -(S''(\xi_{l_n}) + S''(\xi_{l_n+1})) \ge 0. \qquad (3.3)$$

Indeed, if $f''(x) < 0$ for all $x \in I$, then using (3.2) we see that, for $n$ sufficiently large, there exists an interval $J \subset I$ such that $S''(x) \le 0$ for all $x \in J$, and we can take $\xi_{l_n}, \xi_{l_n+1} \in J$. If instead $f''(x) = 0$ for all $x \in I$, then, using formula 6.44 of Theorem 6.20 in Schumaker (1981), we see that there is an interval $J \subset I$ such that $S''(x) = 0$ for all $x \in J$.

Now let

$$R_l = \Big\{ \frac{-\sqrt{n}\, C_l'\hat\theta}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} \ge q_\alpha \Big\};$$

then $\Omega_n = \bigcap_{l=0}^{k_n} R_l$ and therefore

$$P_f(\Omega_n) \le \inf_{0 \le l \le k_n} P_f(R_l) \le P_f(R_{l_n}).$$

Hence

$$P_f(\Omega_n) \le P_f\Big( \frac{-\sqrt{n}\, C_{l_n}'\hat\theta}{\sigma\, (C_{l_n}' M^{-1}(\xi_n) C_{l_n})^{1/2}} \ge q_\alpha \Big) \le P_f\Big( N_{As}(0,1) \ge q_\alpha + \frac{\sqrt{n}\, C_{l_n}' b_n}{\sigma\, (C_{l_n}' M^{-1}(\xi_n) C_{l_n})^{1/2}} + \frac{\sqrt{n}\, C_{l_n}' \theta_0}{\sigma\, (C_{l_n}' M^{-1}(\xi_n) C_{l_n})^{1/2}} \Big).$$

Then, by (3.3), we have

$$P_f(\Omega_n) \le P_f\Big( N_{As}(0,1) \ge q_\alpha + \frac{\sqrt{n}\, C_{l_n}' b_n}{\sigma\, (C_{l_n}' M^{-1}(\xi_n) C_{l_n})^{1/2}} \Big).$$

The proof consists of bounding the absolute value of $\dfrac{\sqrt{n}\, C_{l_n}' b_n}{\sigma\, (C_{l_n}' M^{-1}(\xi_n) C_{l_n})^{1/2}}$. We have $| C_{l_n}' b_n | \le \| C_{l_n} \| \, \| b_n \|$ and

$$\| b_n \| \le \| M^{-1}(\xi_n) \| \, \| F D(\xi_n) \| \, \| f - S \|.$$

Since $\| F D^{1/2}(\xi_n) \|^2 = \| F D(\xi_n) F' \|$, we have

$$\| F D(\xi_n) \| \le \| F D^{1/2}(\xi_n) \| \, \| D^{1/2}(\xi_n) \| \le \| M(\xi_n) \|^{1/2} \Big( \sup_{1 \le i \le r} \lambda_i \Big)^{1/2}, \qquad (3.4)$$

$$\| f - S \| = \Big( \sum_{i=1}^{r} (f(x_i) - S(x_i))^2 \Big)^{1/2} \le r^{1/2} \sup_{0 \le x \le 1} | f(x) - S(x) | \le r^{1/2} c\, \Delta k_n^{\,(m+1)\wedge 4}.$$

Now,

$$\frac{C_{l_n}' M^{-1}(\xi_n) C_{l_n}}{\| C_{l_n} \|^2} \ge \min_X \frac{X' M^{-1}(\xi_n) X}{\| X \|^2} = \Big( \frac{1}{\| M^{1/2}(\xi_n) \|} \Big)^2 \ge \frac{1}{\| M(\xi_n) \|}.$$

Hence $\| C_{l_n} \| / (C_{l_n}' M^{-1}(\xi_n) C_{l_n})^{1/2} \le \| M(\xi_n) \|^{1/2}$. Therefore

$$\| C_{l_n} \| \, \| b_n \| / (C_{l_n}' M^{-1}(\xi_n) C_{l_n})^{1/2} \le \| b_n \| \, \| M(\xi_n) \|^{1/2}.$$

Hence

$$\sup_{f \in H_0} \frac{\sqrt{n}\, | C_{l_n}' b_n |}{\sigma\, (C_{l_n}' M^{-1}(\xi_n) C_{l_n})^{1/2}} \le \sqrt{n}\, r^{1/2} c_0\, \Delta k_n^{\,(m+1)\wedge 4} \Big( \sup_{1 \le i \le r} \lambda_i \Big)^{1/2} \qquad (3.5)$$

for some constant $c_0$. Therefore, using (3.5), we have

$$\sup_{f \in H_0} P_f(\Omega_n) \le P\Big( N_{As}(0,1) \ge q_\alpha - \sqrt{n}\, r^{1/2} c_0\, \Delta k_n^{\,(m+1)\wedge 4} \Big( \sup_{1 \le i \le r} \lambda_i \Big)^{1/2} \Big).$$

Hence, by assumption (iv) (note that, by (2.2) and (iii), $\Delta k_n = O(k_n^{-1})$),

$$\limsup_{n \to +\infty} \sup_{f \in H_0} P_f(\Omega_n) \le \alpha.$$

On the other hand,

$$\sup_{f \in H_0} P_f(\Omega_n) \ge \sup_{f \in H_0 \cap S(k_n,4)} P_f(\Omega_n).$$

But if $f \in H_0 \cap S(k_n,4)$, there is $\theta \in \mathbb{R}^{k_n+4}$ such that $f(x) = \theta' N(x)$. Hence $f \in H_0 \cap S(k_n,4)$ if and only if $\theta \in K$, where

$$K = \{ x \in \mathbb{R}^{k_n+4} : \exists\, l, \; 0 \le l \le k_n : -C_l'x \le 0 \}.$$

Berger (1989) proved that

$$\sup_{\mathbb{E}\hat\theta \in K} P(\Omega_n) = \alpha;$$

therefore

$$\limsup_{n \to +\infty} \sup_{f \in H_0} P_f(\Omega_n) = \alpha. \; \square$$

Proof of Theorem 2

We want to show that, for all $f$ in $H_1$ with $f'' > 0$,

$$\lim_{n \to +\infty} P_f(\Omega_n) = 1.$$

Let $f \in \mathcal{F}_{m,M}$ ($m \ge 2$) and assume that $f$ is strictly convex. Then, by Lemma 6, there is a convex $S$ in $S(k_n,4)$ such that

$$\sup_{0 \le x \le 1} | f(x) - S(x) | \le c\, \Delta k_n^{\,(m+1)\wedge 4}. \qquad (3.6)$$

As above in (3.5), it is easy to see that

$$\frac{\sqrt{n}\, | C_l' M^{-1}(\xi_n) F D(\xi_n)(f - S) |}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} \le \sqrt{n}\, r^{1/2} c_1\, \Delta k_n^{\,(m+1)\wedge 4} \Big( \sup_{1 \le i \le r} \lambda_i \Big)^{1/2}.$$

Therefore, under assumption (iv), we see that

$$\lim_{n \to +\infty} \sup_l \frac{\sqrt{n}\, C_l' M^{-1}(\xi_n) F D(\xi_n)(f - S)}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} = 0. \qquad (3.7)$$

On the other hand, for all $l = 0,\dots,k_n$,

$$\frac{1}{\| M(\xi_n) \|} \le \frac{C_l' M^{-1}(\xi_n) C_l}{\| C_l \|^2} \le \| M^{-1}(\xi_n) \|.$$

Hence

$$\frac{1}{\| C_l \| \, \| M^{-1}(\xi_n) \|^{1/2}} \le \frac{1}{(C_l' M^{-1}(\xi_n) C_l)^{1/2}} \le \frac{\| M(\xi_n) \|^{1/2}}{\| C_l \|}. \qquad (3.8)$$

Now, since the sequence of knots is quasi-uniform (see (2.2)), it is easy to see that there are constants $a_1 > 0$ and $a_2 > 0$ such that $a_1 k_n^2 \le \| C_l \| \le a_2 k_n^2$. Hence

$$\frac{-\sqrt{n}\, C_l'\theta_0}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} = \frac{\sqrt{n}\, (S''(\xi_l) + S''(\xi_{l+1}))}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} \ge a\, \sqrt{n}\, k_n^{-5/2}\, (S''(\xi_l) + S''(\xi_{l+1})) \qquad (3.9)$$

for some constant $a > 0$. Now

$$P_f(\Omega_n) = P_f\Big( \bigcap_{0 \le l \le k_n} R_l \Big).$$

If $f$ is strictly convex, we see by (3.2) that $S$ is convex for $n$ sufficiently large; hence $C_l'\theta_0 \le 0$ for all $l = 0,\dots,k_n$. Therefore, using (3.9), we see that

$$\frac{-\sqrt{n}\, C_l'\theta_0}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} \ge a\, \sqrt{n}\, k_n^{-5/2} \inf_{0 \le x \le 1} f''(x).$$

Hence, for all $l = 0,\dots,k_n$,

$$\Big\{ N_{As}(0,1) \ge q_\alpha - c_3\, \sqrt{n}\, k_n^{-5/2} + \sqrt{n}\, r^{1/2} c_0\, \Delta k_n^{\,(m+1)\wedge 4} \Big( \sup_{1 \le i \le r} \lambda_i \Big)^{1/2} \Big\} \subset R_l,$$

where $c_3 = a \inf_{0 \le x \le 1} f''(x)$. Moreover, under assumptions (iv) and (v), we see that

$$\lim_{n \to +\infty} P_f\Big( N_{As}(0,1) \ge q_\alpha - c_3\, \sqrt{n}\, k_n^{-5/2} + \sqrt{n}\, r^{1/2} c_0\, \Delta k_n^{\,(m+1)\wedge 4} \Big( \sup_{1 \le i \le r} \lambda_i \Big)^{1/2} \Big) = 1.$$

Hence, for all $f$ in $H_1$ such that $f'' > 0$, $\lim_{n \to +\infty} P_f(\Omega_n) = 1$. □

Proof of Theorem 3

We have $f_n = g + h_n \eta$; therefore we can write

$$\hat\theta = M^{-1}(\xi_n) F D(\xi_n)(g + \bar\varepsilon) + h_n\, M^{-1}(\xi_n) F D(\xi_n) \eta.$$

Now we proceed as in the proof of Theorem 1. For $l = 0,\dots,k_n$, we can write

$$\mathbb{E} T_n^l(g + h_n \eta) = \frac{-\sqrt{n}\, C_l' b_n}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} + \frac{\sqrt{n}\, (S''(\xi_l) + S''(\xi_{l+1}))}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} - \frac{h_n \sqrt{n}\, C_l' M^{-1}(\xi_n) F D(\xi_n) \eta}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}}.$$

Arguing as above in (3.5), and using (3.2) and (3.8), we can see that

$$\Big| \mathbb{E} T_n^l(g + h_n \eta) - \frac{\sqrt{n}\, (g''(\xi_l) + g''(\xi_{l+1}))}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} - \frac{h_n \sqrt{n}\, (\eta''(\xi_l) + \eta''(\xi_{l+1}))}{\sigma\, (C_l' M^{-1}(\xi_n) C_l)^{1/2}} \Big|$$
$$= O\big( \sqrt{n}\, \Delta k_n^{\,(m+3/2)\wedge 9/2} \big) + O\Big( \sqrt{n}\, r^{1/2}\, \Delta k_n^{\,(m+1)\wedge 4} \Big( \sup_{1 \le i \le r} \lambda_i \Big)^{1/2} \Big) + O\Big( r^{1/2} k_n^{3/2} \Big( \sup_{1 \le i \le r} \lambda_i \Big)^{1/2} + k_n^{-2} \Big).$$

Now let

$$Y_n = T_n(g + h_n \eta) - \mathbb{E} T_n(g + h_n \eta) + \sqrt{n}\, \bar g'' + k_n^{5/2}\, \bar\eta''.$$

Under the assumptions of Theorem 3, we see that

$$Y_n \sim N\big( \sqrt{n}\, \bar g'' + k_n^{5/2}\, \bar\eta'',\; V(\xi_n) \big)$$

and

$$\mathbb{E} \| Y_n - T_n(g + h_n \eta) \|^2 \to 0;$$

hence

$$d\big( \mathcal{L}(T_n(g + h_n \eta)),\; N(\sqrt{n}\, \bar g'' + k_n^{5/2}\, \bar\eta'',\; V(\xi_n)) \big) \to 0. \; \square$$

Proof of Theorems 4 to 6

Let us recall that

$$\tilde\theta = \Gamma^{-1}(\xi_n) F D(\xi_n) \bar Y = (M(\xi_n) + \lambda_n W)^{-1} F D(\xi_n) \bar Y.$$

Therefore

$$\mathbb{E}\tilde\theta = (M(\xi_n) + \lambda_n W)^{-1} F D(\xi_n) f = \big\{ (M(\xi_n) + \lambda_n W)^{-1} F D(\xi_n) f - M^{-1}(\xi_n) F D(\xi_n) f \big\} + \mathbb{E}\hat\theta.$$

Let

$$a_n = (M(\xi_n) + \lambda_n W)^{-1} F D(\xi_n) f - M^{-1}(\xi_n) F D(\xi_n) f.$$

Hence, as above in the proof of Theorem 1, we can write $\mathbb{E}\tilde\theta = a_n + b_n + \theta_0$. Then, for all $f \in \mathcal{G}_{m,M,L}$ and each $l = 0,\dots,k_n$, we have

$$P_f\Big( \frac{-\sqrt{n}\, C_l'\tilde\theta}{\sigma\, (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2}} \ge q_\alpha \Big)$$
$$= P_f\Big( N(0,1) \ge q_\alpha + \frac{\sqrt{n}\, C_l'\theta_0}{\sigma\, (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2}} + \frac{\sqrt{n}\, C_l' b_n}{\sigma\, (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2}} + \frac{\sqrt{n}\, C_l' a_n}{\sigma\, (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2}} \Big).$$

Therefore, as in the proofs of Theorems 1 to 3, Theorems 4, 5 and 6 will be established once we show that

$$\limsup_{n \to +\infty} \sup_{f \in H_0} \sup_l \frac{\sqrt{n}\, | C_l' b_n |}{\sigma\, (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2}} = 0 \qquad (3.10)$$

and

$$\limsup_{n \to +\infty} \sup_{f \in H_0} \sup_l \frac{\sqrt{n}\, | C_l' a_n |}{\sigma\, (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2}} = 0. \qquad (3.11)$$

As above in (3.4), it can be shown that

$$\| C_l \| \, \| b_n \| / (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2} \le \| b_n \| \, \| \Gamma(\xi_n) \| \, \| M^{-1}(\xi_n) \|^{1/2} \qquad (3.12)$$

and

$$\| C_l \| \, \| a_n \| / (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2} \le \| a_n \| \, \| \Gamma(\xi_n) \| \, \| M^{-1}(\xi_n) \|^{1/2}. \qquad (3.13)$$

Therefore the proof consists of bounding $\| \Gamma(\xi_n) \|$ and $\| a_n \|$. Under the assumptions of Theorem 4, using Lemma 8 we see that $\| \lambda_n M^{-1}(\xi_n) W \| < 1$ for $n$ sufficiently large. We can then invert $I + \lambda_n M^{-1}(\xi_n) W$ using a power series expansion:

$$(I + \lambda_n M^{-1}(\xi_n) W)^{-1} = I - \lambda_n M^{-1}(\xi_n) W + (\lambda_n M^{-1}(\xi_n) W)^2 - \dots = I + U, \; \text{say}.$$

Using the matrix norm properties, we can show that

$$\| U \| \le \| \lambda_n M^{-1}(\xi_n) W \| / (1 - \| \lambda_n M^{-1}(\xi_n) W \|) = O(\lambda_n k_n^4) \quad \text{by Lemma 8}.$$

We have

$$\Gamma^{-1}(\xi_n) = (I + U)\, M^{-1}(\xi_n)$$

and therefore

$$\Gamma(\xi_n) = M(\xi_n)(I + U)^{-1}.$$

Standard arguments (e.g. Ciarlet and Thomas (1982)) show that

$$\| (I + U)^{-1} \| \le 1 / (1 - \| U \|).$$

Hence

$$\| \Gamma(\xi_n) \| \le c\, \| M(\xi_n) \| \qquad (3.14)$$

for some constant $c$. Now, using (3.12), we see that there is a constant $c_1$ such that

$$\| C_l \| \, \| b_n \| / (C_l' \Gamma^{-1}(\xi_n) M(\xi_n) \Gamma^{-1}(\xi_n) C_l)^{1/2} \le c_1\, \| b_n \| \, \| M(\xi_n) \|^{1/2},$$

and (3.10) follows from assumption (iv) of Theorem 1. Now, since $\Gamma^{-1}(\xi_n) = (I + U) M^{-1}(\xi_n)$, we see that $a_n = U M^{-1}(\xi_n) F D(\xi_n) f$. Hence it can be shown, as in the proof of Theorem 1, that

$$\| a_n \| \le \| U \| \, \| M^{-1}(\xi_n) \| \, \| F D(\xi_n) \| \, \| f \| = O\Big( \lambda_n k_n^4\, r^{1/2} L \Big( \sup_{1 \le i \le r} \lambda_i \Big)^{1/2} \Big).$$

Now (3.11) follows from (3.13), (3.14) and assumption (iv') of Theorem 4. □

4 APPLICATION TO A REAL DATA SET

We applied our method to a real data set obtained from the survey "Formation Qualification Professionnelle" carried out by INSEE in 1993. Simioni (1995) used this data set in a nonparametric regression model to estimate the logarithm of the wage as a function of a set of characteristic variables such as seniority in the firm, experience, and so on. Human capital theory predicts the concavity of the logarithm of the wage as a function of experience (see Berndt (1991)). In this data set, however, this curve displays a low curvature, and it is therefore interesting to test its concavity. The sample size, i.e. the number of wage earners, is very large: n = 7948. The deterministic points $x_i$, $i = 1,\dots,r$, are the values of experience (the number of years since the first job). Table 4.1, from Simioni (1995), gives some characteristics of the empirical distribution of the variables wage and experience.

Least squares and hybrid cubic spline estimators are fitted to the data with $k_n$ equidistant knots; $\lambda_n = 0$ corresponds to the least squares estimator. The number $k_n$ takes the values 3, 9, 15, 21, 27 and $k_n^{gcv}(\lambda_n)$, where $k_n^{gcv}(\lambda_n)$ is the number of knots selected by cross-validation (with $k_n$ varying in $\{3,4,\dots,27\}$) for fixed $\lambda_n$. Computations were done for a smoothing parameter $\lambda_n$ equal to $0$, $5 \cdot 10^{-3}$, $5 \cdot 10^{-3.5}$, $5 \cdot 10^{-4}$, $5 \cdot 10^{-4.5}$ and $5 \cdot 10^{-5}$. Figure 1 shows the data set with the estimate corresponding to $\lambda_n = 5 \cdot 10^{-4}$ and $k_n = k_n^{gcv}(\lambda_n) = 4$. To compute the test statistics of (T) and (T'), we used the consistent estimates $\hat\sigma_n$ and $\tilde\sigma_{\lambda_n}$ of $\sigma$, as in Section 3.3. Table 4.2 displays a "1" when the test rejects $H_0$.

The results of this table show that the test confirms economic theory for choices of the smoothing parameters $k_n$ and $\lambda_n$ corresponding to smooth estimates (i.e. for $\lambda_n$ small enough and $k_n$ not too large). This is not surprising, since rough estimates produce artificial changes in curvature; a good choice of these parameters is therefore essential. In practice, one could choose the two parameters $k_n$ and $\lambda_n$ simultaneously.
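The selection of $k_n$ by cross-validation can be sketched as follows (our illustration; the paper does not spell out the criterion, so ordinary leave-one-out cross-validation on the grouped means is an assumption here, and `hybrid_spline` and `bspline_basis` are the helpers sketched in Section 2):

```python
import numpy as np

def select_k_by_cv(x, ybar, lam, smooth, candidates=range(3, 28)):
    """Pick k_n from `candidates` by leave-one-out CV over the r design points,
    refitting the hybrid spline of Lemma 1 with k equidistant interior knots."""
    r = len(x)
    best_k, best_score = None, np.inf
    for k in candidates:
        xi = np.linspace(0.0, 1.0, k + 2)[1:-1]   # k equidistant interior knots
        score = 0.0
        for i in range(r):
            keep = np.arange(r) != i
            theta = hybrid_spline(x[keep], ybar[keep], lam[keep], smooth, xi)
            pred = bspline_basis(x[i:i + 1], xi) @ theta
            score += (ybar[i] - pred[0]) ** 2
        if score < best_score:
            best_k, best_score = k, score
    return best_k
```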

5 CONCLUSION

A small simulation study in Diack (1996) shows that the finite sample behavior of the test is quite satisfactory. It also reinforces our belief that the hybrid spline test is more reliable. A test of non-monotonicity can readily be constructed, paralleling the above non-convexity test, with quadratic splines instead of cubic splines. Indeed, Beatson (1982) also shows that for a monotone function $f \in C^m(0,1)$ ($1 \le m \le 2$), the uniform distance between $f$ and the set $S^*(k,3)$ of monotone functions of S(k,3) tends to zero when the mesh size $\Delta k_n$ tends to zero.

It would be nice to extend this framework to multivariate regression models. It seems, however, that some new techniques would be needed, since there are important differences between the one- and higher-dimensional cases. For example, the extension of piecewise nonparametric modeling to higher dimensions is difficult in practice. This additional step is still under study.

Acknowledgments The authors wish to thank M. Simioni for very helpful discussions and suggestions.

References

[1] G.G. Agarwal and W.J. Studden. Asymptotic integrated mean square error using least squares and bias minimizing splines. The Annals of Statistics, 8(6):1307-1325, 1980.
[2] R.K. Beatson. Monotone and convex approximation by splines: error estimates and a curve fitting algorithm. SIAM Journal of Mathematical Analysis, 19(4), 1982.
[3] R.L. Berger. Uniformly more powerful tests for hypotheses concerning linear inequalities and normal means. Journal of the American Statistical Association, 84(405):192-199, 1989.
[4] E.R. Berndt. The Practice of Econometrics. Addison-Wesley, 1991.
[5] P.G. Ciarlet and J.M. Thomas. Exercices d'analyse matricielle et d'optimisation. Collection mathématiques appliquées pour la maîtrise. Masson, 1982.
[6] C.A.T. Diack. Testing convexity. Compstat '96: Proceedings. Physica-Verlag, 1996.
[7] P. Dierckx. An algorithm for cubic spline fitting with convexity constraints. Computing, 24:349-371, 1980.
[8] W. Härdle and E. Mammen. Comparing nonparametric versus parametric regression fits. The Annals of Statistics, 21(4):1926-1947, 1993.
[9] C. Kelly and J. Rice. Monotone smoothing with application to dose-response curves and the assessment of synergism. Biometrics, 46:1071-1085, 1990.
[10] W. Schlee. Nonparametric test of the monotony and convexity of regression. Nonparametric Statistical Inference, 2:823-836, 1980.
[11] L. Schumaker. Spline Functions: Basic Theory. John Wiley, New York, 1981.
[12] M. Simioni. Fonctions de gains: une approche non paramétrique. To appear in INSEE Méthodes, 1995.
[13] A.J. Yatchew. Nonparametric regression tests based on least squares. Econometric Theory, 8:435-451, 1992.

Table 4.1

Variable      Mean      Standard deviation   Minimum   Maximum
Wage          9437.97   5458.56              3766.67   77583.33
Experience    19.80     11.10                0.00      49.00

Table 4.2

λ_n            k_n = 3   k_n = 9   k_n = 15   k_n = 21   k_n = 27   k_n^{gcv}(λ_n)
0              1         0         0          0          0          0
5·10^{-3}      0         1         1          0          0          0
5·10^{-3.5}    1         1         1          1          0          1
5·10^{-4}      1         1         1          1          1          1
5·10^{-4.5}    1         1         1          1          0          1
5·10^{-5}      1         1         0          0          0          0

[Figure 1: hybrid cubic spline estimator fitted to the data, with λ_n = 5·10^{-4} and k_n = 4. The vertical axis is the log wage (roughly 8 to 11.5) and the horizontal axis is experience, rescaled to (0,1).]
