A Consistent Nonparametric Test of the Convexity of Regression Based on Least Squares Splines

by Cheikh A.T. DIACK
Laboratoire de Statistique et Probabilités, UMR CNRS C5583, Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse cedex 4, France.
e-mail:
[email protected]
Abstract

This paper provides a test of convexity of a regression function, based on least squares splines. The test statistic is shown to be asymptotically of size equal to the nominal level, while diverging to infinity if the convexity assumption is misspecified. The test is therefore consistent against all deviations from the null hypothesis.
1 INTRODUCTION
Testing the convexity of a regression function is one of the most important problems in econometrics. Indeed, The General Theory of Employment, Interest, and Money emphasized the central importance of the consumption function and explicitly argued that the consumption function is concave (Carroll & Kimball 1996). Economic theory also predicts the convexity of functions such as the Bernoulli utility function, cost functions, production functions, and Engel curves. Moreover, human capital theory argues that the relationship between the logarithm of wages and experience is concave. On the other hand, psychologists have worried for over a century about whether subjective reports about physical magnitudes such as length, weight, area and luminance have a convex or a concave relationship to the corresponding measurements. This convexity problem is also very closely connected to the order-restricted hypothesis testing problems described in references such as Robertson et al. (1988). Several papers in the statistics literature deal with nonparametric tests of convexity of the regression function. The work along this line includes Schlee (1980), Yatchew (1992), Diack (1996), Diack (1997), Diack & Thomas (1998) and Diack (1998). Schlee (1980), in a nonparametric regression model with random design, used an estimator of the second derivative of the regression function. His test statistic requires computing the distribution of the supremum of this normalized estimator
Key words: least squares estimator, test of convexity, likelihood ratio test, convex cone.
over an interval. This method raises some theoretical difficulties. To overcome them, Schlee takes a sequence of points from the interval and uses the theory of maximal deviation to obtain the distribution of the test statistic under the null hypothesis. However, this work discusses neither asymptotic results nor practical implementation. Yatchew's test (in a semi-parametric model) is based on comparing the nonparametric sum of squared residuals under convexity constraints with the unconstrained nonparametric sum of squared residuals. Yatchew's approach relies on sample splitting, which entails a loss of efficiency, and he gives only a heuristic proof of the consistency of the test. Diack (1998) adapts Schlee's and Yatchew's ideas, respectively, to a nonparametric model with fixed design in order to construct two other tests of convexity, for which he gives new asymptotic convergence results. Diack and Thomas (1998) use a least squares spline estimator to develop, in a nonparametric model with deterministic design, a non-convexity test which is consistent against some alternative hypotheses. A small simulation study in Diack (1996) shows that the finite-sample behavior of this test is quite satisfactory. In this paper, we propose a new test of convexity of a regression function in a nonparametric model. Our test uses, as does Diack and Thomas's test, a cubic spline estimator, which allows us to formulate the convexity hypothesis in a very simple way. Our problem thereby becomes, roughly, that of testing a multivariate normal mean with composite hypotheses determined by linear inequalities. The remainder of this paper is organized as follows. In Section 2, we introduce the nonparametric regression model and the hypotheses to be tested, and then recall some properties of the cubic spline estimator. Section 3 describes our test of convexity of the regression function, and Section 4 is devoted to a discussion and to the proofs of some properties of the test.
2 PRELIMINARIES

2.1 The model and the hypotheses

Consider the nonparametric regression model:

$$y_{ij} = f(x_i) + \varepsilon_{ij}, \qquad i = 1,\dots,r, \; j = 1,\dots,n_i,$$

with $x_i \in (0,1)$, $i = 1,\dots,r$. At each deterministic design point $x_i$ $(i = 1,\dots,r)$, $n_i$ measurements are taken. The probability measure assigning mass $\lambda_i = n_i/n$ to the point $x_i$ (with $\sum_i \lambda_i = 1$) is referred to as the design and will be denoted by $\xi_n$. We assume that the random errors $\varepsilon_{ij}$ are uncorrelated and identically distributed with mean zero. Their variance $\sigma^2$ will be assumed unknown. Finally, $f$ is an unknown smooth regression function.
In what follows, we will assume some regularity conditions on $f$. The following class of functions was used by Diack and Thomas (1998) to construct a test of non-convexity. For $l \in \mathbb{N}$ and $M > 0$, let

$$\mathcal{F}_{l,M} = \Big\{ f \in C^{l+1}(0,1) : \sup_{0 \le x \le 1} \big| f^{((l+1)\wedge 4)}(x) \big| \le M \Big\}.$$
We intend to construct a test of convexity of the regression function $f$. Thus the null hypothesis to be tested is that $f$ is convex:

$$H_0: \text{``}f \text{ is convex''},$$

while, without a specific alternative hypothesis, the alternative to be tested is simply that the null is false:

$$H_1: \text{``}f \text{ is non-convex''}.$$

Thus the alternative encompasses all possible departures from the null hypothesis. In what follows, a testing problem with null hypothesis $H_0$ and alternative $H_1$ is denoted by $[H_0, H_1]$. We will use a cubic spline estimator and characterize convexity within the collection of all cubic polynomial splines; in this way, our problem is transformed into that of testing a multivariate normal mean with composite hypotheses determined by linear inequalities.

2.2 The Cubic Spline Estimator

Let $p$ be a positive continuous density on $(0,1)$. We assume that $\min_{0 \le x \le 1} p(x) > 0$.
Let $0 = \xi_0 < \xi_1 < \dots < \xi_{k+1} = 1$ be a subdivision of the interval $(0,1)$ by $k$ distinct points defined by

$$\int_0^{\xi_i} p(x)\,dx = \frac{i}{k+1}, \qquad i = 0,\dots,k+1. \tag{2.20}$$

Let $\Delta_k = \max_{0 \le i \le k} (\xi_{i+1} - \xi_i)$; we see that

$$\frac{\Delta_k}{\min_i (\xi_{i+1} - \xi_i)} \le \frac{\max_x p(x)}{\min_x p(x)}. \tag{2.21}$$
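As a concrete illustration of the knot rule (2.20), the following sketch computes the knots as the $i/(k+1)$ quantiles of the design density $p$ by numerically inverting its distribution function; the function name and the example density are ours, not the paper's.

```python
# A minimal sketch of the knot rule (2.20): the knots are the i/(k+1)
# quantiles of a positive design density p on (0,1).
import numpy as np

def knots_from_density(p, k, grid_size=10_000):
    """Return 0 = xi_0 < xi_1 < ... < xi_{k+1} = 1 satisfying (2.20) numerically."""
    x = np.linspace(0.0, 1.0, grid_size)
    cdf = np.cumsum(p(x))
    cdf /= cdf[-1]                       # normalized distribution function of p
    levels = np.arange(k + 2) / (k + 1)  # i/(k+1), i = 0, ..., k+1
    return np.interp(levels, cdf, x)     # invert the distribution function

# Example: knots come out denser where p is larger.
p = lambda x: 1.0 + 0.5 * np.sin(np.pi * x)
print(knots_from_density(p, k=5))
```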
For each fixed set of knots of the form (2.20), we define S(k,d) as the collection of all polynomial splines of order $d$ (degree $d-1$) having the points $\xi_1 < \dots < \xi_k$ for knots. The class S(k,d) of such splines is a linear space of functions of dimension $k+d$. A basis for this linear space is provided by the B-splines (see Schumaker (1981)). Let $\{N_1,\dots,N_{k+d}\}$ be the set of normalized B-splines associated with the following nondecreasing sequence $\{t_1,\dots,t_{k+2d}\}$:

$$\begin{cases} t_1 \le t_2 \le \dots \le t_d = 0, \\ t_{d+l} = \xi_l, & l = 1,\dots,k, \\ 1 = t_{d+k+1} \le \dots \le t_{2d+k}. \end{cases}$$

The reader is referred to Schumaker (1981) for a discussion of these B-splines. In what follows, we shall only work with the class of cubic splines, S(k,4). It will be convenient to introduce the following notation:

$$N(x) = (N_1(x),\dots,N_{k+4}(x))' \in \mathbb{R}^{k+4}; \qquad F = (N(x_1),\dots,N(x_r)), \ \text{a } (k+4)\times r \text{ matrix}.$$

We will denote by $\hat f_n$ the least squares spline estimator of $f$:
$$\hat f_n(x) = \sum_{p=1}^{k+4} \hat\theta_p N_p(x), \tag{2.22}$$

where

$$\hat\theta = (\hat\theta_1,\dots,\hat\theta_{k+4})' = \arg\min_{\theta \in \mathbb{R}^{k+4}} \sum_{i=1}^{r} \sum_{j=1}^{n_i} \Big( y_{ij} - \sum_{p=1}^{k+4} \theta_p N_p(x_i) \Big)^2. \tag{2.23}$$

Let

$$\bar y_i = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij}, \qquad \bar\varepsilon_i = \frac{1}{n_i} \sum_{j=1}^{n_i} \varepsilon_{ij},$$

$$Y = (\bar y_1,\dots,\bar y_r)', \qquad \bar\varepsilon = (\bar\varepsilon_1,\dots,\bar\varepsilon_r)', \qquad \mathbf{f} = (f(x_1),\dots,f(x_r))'.$$

Let $D(\xi_n)$ be the $r \times r$ diagonal matrix with diagonal elements $\lambda_1,\dots,\lambda_r$; then basic least squares arguments show that

$$\hat\theta = M^{-1}(\xi_n) F D(\xi_n) Y \qquad \text{with} \qquad M(\xi_n) = \sum_{i=1}^{r} N(x_i) N'(x_i)\, \lambda_i = F D(\xi_n) F'.$$
Asymptotic properties of this estimator have been established in Agarwal and Studden (1980). Note that the first moment of $\hat f_n$ is given by $\mathbb{E}\hat f_n(x) = N(x)' M^{-1}(\xi_n) F D(\xi_n) \mathbf{f}$. Thus, if $f$ is a cubic spline function (that is to say, if there is $\theta$ such that $\mathbf{f} = F'\theta$), then $\hat f_n$ is unbiased and $\mathbb{E}\hat\theta = \theta$. We will use this property below to construct our test.
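A minimal numerical sketch of the estimator (2.22)–(2.23), assuming SciPy's B-spline routines: it forms the matrices $F$, $D(\xi_n)$ and $M(\xi_n)$ exactly as above and returns $\hat\theta = M^{-1}(\xi_n) F D(\xi_n) Y$. The helper names are ours, not the paper's.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(knots_interior, x):
    """Normalized cubic B-splines N_1,...,N_{k+4} evaluated at x (order d = 4)."""
    t = np.r_[np.zeros(4), knots_interior, np.ones(4)]  # knot sequence t_1,...,t_{k+8}
    n_basis = len(t) - 4                                # k + 4 basis functions
    eye = np.eye(n_basis)
    # column p holds N_{p+1}(x): BSpline with unit coefficient vector e_p gives N_{p+1}
    return np.column_stack([BSpline(t, eye[p], 3)(x) for p in range(n_basis)])

def fit_ls_spline(x_design, y_bar, lam, knots_interior):
    """Weighted least squares (2.23): theta_hat = M^{-1}(xi_n) F D(xi_n) Y."""
    N = bspline_basis(knots_interior, x_design)  # r x (k+4); row i is N(x_i)'
    F = N.T                                      # (k+4) x r, as in Section 2.2
    D = np.diag(lam)                             # design weights lambda_i = n_i / n
    M = F @ D @ F.T                              # M(xi_n) = F D(xi_n) F'
    return np.linalg.solve(M, F @ D @ y_bar)
```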
3 TEST STATISTIC

3.1 Convexity in S(k,4)

First of all, we characterize convexity in the class S(k,4). Note that if a function $g$ is a cubic spline, then its second derivative is linear between any pair of adjacent knots $\xi_i$ and $\xi_{i+1}$; it follows that $g$ is convex on the interval $\xi_i \le x \le \xi_{i+1}$ if and only if $g''(\xi_i)$ and $g''(\xi_{i+1})$ are both nonnegative (this property was used by Dierckx (1980) to define a convex estimator). For a function $g$ in the class S(k,4), we can write:
$$g(x) = \sum_{p=1}^{k+4} \theta_p N_p(x) \qquad \text{with } \theta = (\theta_1,\dots,\theta_{k+4})' \in \mathbb{R}^{k+4}.$$

Then:

$$g''(\xi_l) = \sum_{p=1}^{k+4} \theta_p N''_p(\xi_l) = \sum_{p=1}^{k+4} \theta_p\, d_{p,l},$$
where the coefficients $d_{p,l}$ are easily calculated from the knots (see Dierckx (1980)):

$$\begin{cases} d_{p,l} = 0 & \text{if } p \le l \text{ or } p \ge l+4, \\[2pt] d_{l+1,l} = \dfrac{6}{(t_{l+5}-t_{l+2})(t_{l+5}-t_{l+3})}, \\[2pt] d_{l+3,l} = \dfrac{6}{(t_{l+6}-t_{l+3})(t_{l+5}-t_{l+3})}, \\[2pt] d_{l+2,l} = -(d_{l+3,l} + d_{l+1,l}), \end{cases} \qquad l = 0,\dots,k+1.$$

Let $b_l = (0,\dots,0,\, -d_{l+1,l},\, -d_{l+2,l},\, -d_{l+3,l},\, 0,\dots,0)' \in \mathbb{R}^{k+4}$ and $\theta = (\theta_1,\dots,\theta_{k+4})'$; then
$$g''(\xi_l) = -b_l'\theta.$$

Hence, we see that a cubic spline $g$ is a convex function if and only if $b_l'\theta \le 0$ for all $l = 0,\dots,k+1$. The idea of our test follows from this property. Indeed, we have already mentioned in Section 2 that if $f$ is a cubic spline then $\mathbb{E}\hat\theta = \theta$. Therefore, building a test of convexity in the case where the regression function $f$ is a cubic spline amounts to building a test for hypotheses concerning linear inequalities on the mean of a random vector. More precisely, the test $[H_0, H_1]$ is in this case equivalent to the following test with null hypothesis

$$H_0': b_l'\mu \le 0 \ \text{ for all } l = 0,\dots,k+1,$$

against the alternative

$$H_1': \exists\, l : b_l'\mu > 0,$$

where $\mu$ is the mean of the random vector $\hat\theta$.
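The characterization above is straightforward to implement. The sketch below builds the vectors $b_l$ from Dierckx's coefficients $d_{p,l}$ and tests $b_l'\theta \le 0$ for every $l$; the knot sequence `t` is stored 0-indexed, so the paper's $t_{l+5}$ is `t[l+4]`. Function names are ours.

```python
import numpy as np

def convexity_vectors(t, k):
    """Rows b_0',...,b_{k+1}' of the (k+2) x (k+4) constraint matrix."""
    B = np.zeros((k + 2, k + 4))
    for l in range(k + 2):
        d1 = 6.0 / ((t[l + 4] - t[l + 1]) * (t[l + 4] - t[l + 2]))  # d_{l+1,l}
        d3 = 6.0 / ((t[l + 5] - t[l + 2]) * (t[l + 4] - t[l + 2]))  # d_{l+3,l}
        d2 = -(d1 + d3)                                             # d_{l+2,l}
        # b_l has entries -d_{l+1,l}, -d_{l+2,l}, -d_{l+3,l} in positions l+1, l+2, l+3
        B[l, l:l + 3] = [-d1, -d2, -d3]
    return B

def is_convex(theta, t, k, tol=1e-12):
    """g''(xi_l) = -b_l' theta >= 0 for all l  <=>  B theta <= 0."""
    return bool(np.all(convexity_vectors(t, k) @ theta <= tol))
```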
On the other hand, Beatson (1982) shows that for a smooth and convex function $f \in C^m(0,1)$ $(0 \le m \le 3)$, the uniform distance between $f$ and the set $S^*(k,4)$ of convex functions of S(k,4) tends to zero when the mesh size $\Delta_k$ tends to zero (see Lemma 3 below). A testing problem of the form $[H_0', H_1']$ is related to the one-sided testing problem in multivariate analysis and has been studied by several authors (Bartholomew 1961, Kudô 1963, Nüesch 1966, Kudô and Choi 1975, Shapiro 1985, and more recently Raubertas et al. 1986 and Robertson et al. 1988).

3.2 One-sided Test

Let $Y$ be a random vector distributed as $N_q(\mu, \Sigma_q)$ $(q \in \mathbb{N}, q > 0)$, where $\Sigma_q$ is a known nonsingular matrix. We consider the following testing problem:
null hypothesis $H_0': b_l'\mu \le 0$ $(l = 0,\dots,k+1)$; alternative $H_1': \exists\, l : b_l'\mu > 0$. In this paper we identify a hypothesis with the corresponding set of parameters; for example, we write $H_0' = \{\mu \in \mathbb{R}^q : b_l'\mu \le 0\}$. The likelihood function is $L = c_0 \exp\!\big(-\tfrac{1}{2} \| Y - \mu \|^2_{\Sigma_q^{-1}}\big)$, where $c_0$ is a positive constant independent of $\mu$. Thus, the likelihood ratio for the problem $[H_0', H_1']$ is given by:

$$\frac{\sup_{H_0'} L}{\sup_{H_0' \cup H_1'} L} = \exp\Big( -\tfrac{1}{2} \inf_{x \in H_0'} \| Y - x \|^2_{\Sigma_q^{-1}} \Big).$$
So, to determine the test statistic under the null hypothesis, we need to solve the following nonlinear programming problem: $\inf_{x \in H_0'} \| Y - x \|^2_{\Sigma_q^{-1}}$. Note that $H_0'$ is a polyhedral convex cone. Hence, for a given $Y$, this infimum is attained at a unique point, denoted by $\Pi_{H_0'}(Y)$; the infimum itself is the squared distance from $Y$ to $H_0'$. Thus, the likelihood ratio test (LRT) rejects $H_0'$ for large values of

$$\bar\chi^2 = \inf_{x \in H_0'} \| Y - x \|^2_{\Sigma_q^{-1}}.$$

Shapiro (1985) showed, in a study of the distribution of a minimum discrepancy statistic, that if $H_0'$ is any convex cone and if $\mu = 0$, then the distribution of the $\bar\chi^2$ statistic, called the chi-bar-squared statistic, is a mixture of chi-squared distributions. Raubertas et al. (1986) generalize the one-sided testing problem to allow hypotheses involving homogeneous linear inequality restrictions; this framework includes the hypotheses of monotonicity, nonnegativity, and convexity. Here we give an immediate consequence of Theorem 3.1 of Shapiro (1985). For that purpose we shall use some geometrical properties of polyhedral cones.
3.2.1 Polyhedral convex cones

Let $\{a_1,\dots,a_p\}$ be a set of vectors in $\mathbb{R}^q$ (with $p \le q$) and let $\mathcal{C}$ be the convex polyhedral cone generated by $\{a_1,\dots,a_p\}$, that is,

$$\mathcal{C} = \Big\{ x \in \mathbb{R}^q : x = \sum_{i=1}^{p} \alpha_i a_i, \ \alpha_i \ge 0, \ i = 1,\dots,p \Big\}.$$
The polar cone $\mathcal{C}^\circ$ of a cone $\mathcal{C}$ is defined by $\mathcal{C}^\circ = \{x \in \mathbb{R}^q : x'y \le 0, \ \forall y \in \mathcal{C}\}$. It is easy to see that

$$\mathcal{C}^\circ = \{x \in \mathbb{R}^q : a_i'x \le 0, \ i = 1,\dots,p\}.$$

Note that $\mathcal{C}^\circ$ is also a polyhedral convex cone and $(\mathcal{C}^\circ)^\circ = \mathcal{C}$. The boundary of $\mathcal{C}^\circ$ is given by

$$\partial\, \mathcal{C}^\circ = \{x \in \mathbb{R}^q : a_i'x \le^* 0, \ i = 1,\dots,p\},$$

where the notation $\le^*$ means (as in Sasabuchi 1980) that at least one equality holds in this sequence of inequalities. The polar cone $\mathcal{C}^\circ$ has $p$ faces $F_j$:

$$F_j = \{x \in \mathbb{R}^q : a_j'x = 0, \ a_i'x \le 0, \ i = 1,\dots,p\}, \qquad j = 1,\dots,p.$$

The dimension of each face is at most $p-1$. For the definition and basic properties of faces and polar cones, the reader is referred to Rockafellar (1970). For a face $F_j$, we denote by $\mathbb{P}_{F_j}$ the symmetric idempotent matrix giving the orthogonal projection onto the space generated by $F_j$. As above, for any $x$, $\Pi_{\mathcal{C}^\circ}(x)$ represents the projection of $x$ onto $\mathcal{C}^\circ$. We shall need the following result, which in various forms has been used by several authors (Kudô (1963), Shapiro (1985)).

Lemma 1 For all $x \in \mathbb{R}^q$, we have: $\Pi_{\mathcal{C}^\circ}(x) \in F_j$ if and only if $\mathbb{P}_{F_j}(x) \in \mathcal{C}^\circ$ and $x - \mathbb{P}_{F_j}(x) \in \mathcal{C}$, and in this case:

$$\Pi_{\mathcal{C}^\circ}(x) = \mathbb{P}_{F_j}(x) \qquad \text{and} \qquad x - \mathbb{P}_{F_j}(x) = \frac{a_j'x}{\|a_j\|^2}\, a_j.$$
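As a small numeric check of Lemma 1 in the Euclidean case, one can compute $\Pi_{\mathcal{C}^\circ}(x)$ through Moreau's decomposition $x = \Pi_{\mathcal{C}}(x) + \Pi_{\mathcal{C}^\circ}(x)$, where $\Pi_{\mathcal{C}}(x)$ solves a nonnegative least squares problem; the example vectors are ours.

```python
import numpy as np
from scipy.optimize import nnls

a1, a2 = np.array([1.0, 0.0]), np.array([1.0, 1.0])
x = np.array([0.5, -2.0])
# Projection onto the polar cone {z : a1'z <= 0, a2'z <= 0} via C = cone(a1, a2):
A = np.column_stack([a1, a2])
coef, _ = nnls(A, x)                 # proj_C(x) = A @ coef with coef >= 0
proj_polar = x - A @ coef            # Moreau: proj onto polar = x - proj_C(x)
# Here proj_polar lies on the face F_1 = {z : a1'z = 0, a2'z <= 0}; Lemma 1 then
# says it equals P_{F_1}(x), with residual x - P_{F_1}(x) = (a1'x / ||a1||^2) a1:
print(proj_polar, x - (a1 @ x) / (a1 @ a1) * a1)   # both print [ 0. -2.]
```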
In some practical situations, all the faces $F_j$ may be exactly $(p-1)$-dimensional; whether this holds depends on the configuration of the vectors $a_1,\dots,a_p$. For these cases, we give some definitions and a theorem, which restate Definitions 2.1 and 2.2 and Theorem 2.1 of Sasabuchi (1980) in this setting.
Definition 1 A vector $a_j$ is said to be redundant in $\{a_1,\dots,a_p\}$ if

$$\{x : a_i'x \le 0, \ i = 1,\dots,p\} = \{x : a_i'x \le 0, \ i = 1,\dots,p, \ i \ne j\},$$

or, equivalently, if there does not exist a vector $x$ such that $a_i'x \le 0$ $(i \ne j)$ and $a_j'x > 0$.

Definition 2 A set of vectors $\{a_1,\dots,a_p\}$ is said to be with positive relations if there exist nonnegative numbers $\alpha_1,\dots,\alpha_p$, not all simultaneously zero, such that $\sum_i \alpha_i a_i = 0$; otherwise the set is said to be without positive relations.
Remark 1 If $\{a_1,\dots,a_p\}$ is with positive relations, it is linearly dependent; equivalently, if $\{a_1,\dots,a_p\}$ is linearly independent, it is without positive relations.

Theorem 1 Suppose that $\{a_1,\dots,a_p\}$ is without positive relations and has no redundant vector in it. Then all the faces $F_j$ $(j = 1,\dots,p)$ are exactly $(p-1)$-dimensional.

In our case, $\mathcal{C}^\circ = H_0'$ with $p = k+2$, and it is easy to see that $\{b_0,\dots,b_{k+1}\}$ is without positive relations and has no redundant vector in it.

3.2.2 The distribution of $\bar\chi^2$
The following result is an immediate consequence of Theorem 3.1 of Shapiro (1985) and Theorem 1 above.

Theorem 2 Let $Y$ be a random vector distributed as $N_q(\mu, \Sigma_q)$, where $\mu \in \mathbb{R}^q$ and $\Sigma_q$ is a known nonsingular matrix. Suppose that $\{a_1,\dots,a_p\}$ is without positive relations and has no redundant vector in it. If $\mu = 0$, then the random variable $\bar\chi^2$ is distributed as a mixture of chi-squared distributions, namely

$$P(\bar\chi^2 \ge s^2) = \omega_0 P(\chi_0^2 \ge s^2) + \Big( \sum_{j=1}^{p} \omega_j \Big) P(\chi_1^2 \ge s^2) + \omega_q P(\chi_q^2 \ge s^2), \tag{3.20}$$

with

$$\omega_0 = P(Y \in H_0') = P(a_i'Y \le 0, \ i = 1,\dots,p),$$

$$\omega_j = P(a_j'Y \ge 0)\; P\Big( a_i'Y - \frac{a_i'\Sigma_q a_j}{a_j'\Sigma_q a_j}\, a_j'Y \le 0, \ i = 1,\dots,p \Big), \qquad j = 1,\dots,p,$$

and

$$\omega_q = P\big(Y \in (H_0')^\circ\big).$$

Moreover,

$$\sum_{j=0}^{p} \omega_j + \omega_q = 1.$$
See Diack (1997) for a proof of this theorem. Theorem 2 shows that the distribution of $\bar\chi^2$ when $\mu = 0$ is a mixture of chi-squared distributions. So, to calculate the probabilities on the right-hand side of (3.20), the values of $\omega_j$ $(j = 0,\dots,p)$ are needed. However, even for moderate $q$ $(q > 3)$, good closed-form expressions for these level probabilities have not been found, so approximations are of interest; for this, one can use a Monte Carlo method (see Diack (1997)). Note that the coefficients $\omega_j$ depend on the vectors $a_j$ and on the matrix $\Sigma_q$. Hence, in what follows, we denote $\bar\chi^2$ by $\bar\chi^2_q(p)$. Questions concerning the determination of the distribution of $\bar\chi^2_q(p)$ at an arbitrary point of the null hypothesis remain unresolved. However, the following lemma, which can be found in Robertson et al. (1988), shows that Theorem 2 suffices to construct a test of size $\alpha$.
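The Monte Carlo method just mentioned is easy to set up: each weight of Theorem 2 is the probability of an explicit event for $Y \sim N_q(0, \Sigma_q)$, so empirical frequencies give consistent estimates. In the sketch below, `A` is the $p \times q$ matrix with rows $a_j'$; the function name and defaults are ours.

```python
import numpy as np

def chi_bar_weights(A, Sigma, n_sim=100_000, rng=None):
    """Monte Carlo estimates of (omega_0, (omega_j)_{j=1..p}, omega_q) of Theorem 2."""
    rng = np.random.default_rng(rng)
    p, q = A.shape
    Y = rng.multivariate_normal(np.zeros(q), Sigma, size=n_sim)  # Y ~ N_q(0, Sigma_q)
    AY = Y @ A.T                                                 # column j holds a_j'Y
    w0 = np.mean(np.all(AY <= 0, axis=1))                        # P(Y in H0')
    w = np.empty(p)
    for j in range(p):
        SAj = Sigma @ A[j]
        coef = (A @ SAj) / (A[j] @ SAj)                          # a_i'S a_j / a_j'S a_j
        resid = AY - np.outer(AY[:, j], coef)                    # a_i'Y - coef_i * a_j'Y
        w[j] = np.mean(AY[:, j] >= 0) * np.mean(np.all(resid <= 0, axis=1))
    return w0, w, 1.0 - w0 - w.sum()    # omega_q from sum(omega) + omega_q = 1
```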
Lemma 2 Let $K$ be a closed convex cone. For $x$ in $\mathbb{R}^q$, let $x_K$ denote its projection onto $K$ with respect to any inner product $\langle \cdot,\cdot \rangle$, and let $\|\cdot\|$ be the associated norm. If $z \in K$, then

$$\| x + z - (x+z)_K \| \le \| x - x_K \|.$$

Remark 2 Recall that $Y \sim N_q(\mu, \Sigma_q)$. Let $P_{\mu_0}$ be the law of $Y$ under the hypothesis $\mu = \mu_0$. Then, for all $\mu_0 \in K$: under the hypothesis $\mu = \mu_0$, $Y \sim N_q(\mu_0, \Sigma_q)$; under the hypothesis $\mu = 0$, $Y + \mu_0 \sim N_q(\mu_0, \Sigma_q)$; hence, under the hypothesis $\mu = \mu_0$, the law of $\| Y - Y_K \|^2_{\Sigma_q^{-1}}$ is the same as the law of $\| Y + \mu_0 - (Y+\mu_0)_K \|^2_{\Sigma_q^{-1}}$ under the hypothesis $\mu = 0$.

Using now Lemma 2, we see that

$$P_{\mu_0}\big(\| Y - Y_K \|^2_{\Sigma_q^{-1}} \ge s\big) = P_0\big(\| Y + \mu_0 - (Y+\mu_0)_K \|^2_{\Sigma_q^{-1}} \ge s\big) \le P_0\big(\| Y - Y_K \|^2_{\Sigma_q^{-1}} \ge s\big).$$

The last inequality means that if $K$ is the set of parameters corresponding to the null hypothesis of a test with statistic $\| Y - Y_K \|^2_{\Sigma_q^{-1}}$, then the maximal level of this test is attained at $\mu = 0$; in this case, $(\mu = 0)$ is the least favorable sub-hypothesis of $K$. Lemma 2 thus has the following consequence: the size-$\alpha$ likelihood ratio test with null hypothesis $H_0'$ versus the alternative hypothesis $H_1'$ is the test which rejects the null hypothesis if

$$\bar\chi^2_q(p) \ge s^2_{\alpha,p},$$

where $s^2_{\alpha,p}$ is defined by

$$\Big( \sum_{j=1}^{p} \omega_j \Big) P(\chi_1^2 \ge s^2_{\alpha,p}) + \Big( 1 - \omega_0 - \sum_{j=1}^{p} \omega_j \Big) P(\chi_q^2 \ge s^2_{\alpha,p}) = \alpha. \tag{3.21}$$
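Given the weights, the critical value $s^2_{\alpha,p}$ solving (3.21) can be found with a scalar root finder, since the left-hand side of (3.21) is decreasing in $s^2_{\alpha,p}$; a minimal sketch (names ours):

```python
import numpy as np
from scipy.stats import chi2
from scipy.optimize import brentq

def critical_value(alpha, w0, w, q):
    """Solve (sum_j w_j) P(chi2_1 >= s2) + (1 - w0 - sum_j w_j) P(chi2_q >= s2) = alpha."""
    w1 = w.sum()
    wq = 1.0 - w0 - w1
    tail = lambda s2: w1 * chi2.sf(s2, 1) + wq * chi2.sf(s2, q) - alpha
    # assumes the tail at s2 = 0 exceeds alpha, so a sign change exists on [0, 1e3]
    return brentq(tail, 0.0, 1e3)
```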
Hence $s^2_{\alpha,p}$ is a function of the weights $\omega_j$. The following result gives a sufficient condition for the convergence of the power of the test.

Theorem 3 Under the assumptions of Theorem 2, the power function of $\bar\chi^2_q(p)$ converges uniformly to one as $\min_{x \in H_0'} \| \mu - x \|^2_{\Sigma_q^{-1}} \longrightarrow +\infty$.
Proof

Let $T$ be a $q \times q$ nonsingular matrix such that $T \Sigma_q T' = I_q$, that is, $\Sigma_q = T^{-1}(T^{-1})'$, and make the transformation

$$X = TY, \qquad U = T\mu.$$

Then $X$ is a random vector distributed as $N_q(U, I_q)$. Define the set of vectors $\{c_1,\dots,c_p\}$ by $c_j' = a_j' T^{-1}$ $(j = 1,\dots,p)$. It is easy to see that the set $\{c_1,\dots,c_p\}$ is without positive relations and has no redundant vector in it. We have $a_j'\mu = c_j'U$ $(j = 1,\dots,p)$, and hence the problem $[H_0', H_1']$ is transformed into the following problem $[H_0'', H_1'']$:

$$H_0'': c_j'U \le 0 \ (j = 1,\dots,p); \qquad H_1'': \exists\, j : c_j'U > 0.$$

We can write

$$\bar\chi^2_q(p) = \min_{x \in H_0''} \| X - x \|^2 = \| X - \Pi_{H_0''}(X) \|^2.$$

On the other hand,

$$\| U - \Pi_{H_0''}(U) \| \le \| U - \Pi_{H_0''}(X) \| \le \| X - U \| + \| X - \Pi_{H_0''}(X) \|,$$

so the result follows from the assumption $\| U - \Pi_{H_0''}(U) \| \longrightarrow +\infty$. $\Box$

The test statistic requires computing the projection $\Pi_{H_0'}(Y)$ of $Y$. However, no good closed-form solution is available, so obtaining it requires numerical work. We propose an algorithm based on successive projections, introduced by Dykstra (1983) (see also Boyle and Dykstra (1985)). This algorithm determines the projection of a point $X$ of any real Hilbert space onto the intersection $K$ of convex sets $K_j$ $(j = 1,\dots,p)$, and
it is meant for applications where the projections onto the individual $K_j$'s can be computed relatively easily. Let $K$ be a closed convex cone in $\mathbb{R}^q$. We suppose that $K$ can be written as $\bigcap_{j=1}^{p} K_j$, where each $K_j$ is also a closed convex cone. For $X \in \mathbb{R}^q$, we denote by $X_K^\Gamma$ the $\Gamma$-projection of $X$ onto $K$, where $\Gamma$ is a positive definite matrix. The algorithm consists of repeated cycles, and every cycle contains $p$ stages. Let $X_{mi}^\Gamma$ be the approximation of $X_K^\Gamma$ given by Dykstra's algorithm at the $i$th stage of the $m$th cycle. The following result (see Boyle & Dykstra (1985)) shows that the algorithm converges correctly.

Theorem 4 For any $i$ $(1 \le i \le p)$, the sequence $\{X_{mi}^\Gamma\}$ converges strongly to $X_K^\Gamma$, i.e. $\| X_{mi}^\Gamma - X_K^\Gamma \| \to 0$ as $m \to +\infty$.

Application: Let $K = H_0'$ be the null hypothesis of the problem $[H_0', H_1']$, and let $K_i = \{x \in \mathbb{R}^q : b_i'x \le 0\}$. Let $\Gamma = \Sigma_q^{-1}$, where $\Sigma_q$ is the covariance matrix of $Y$. For all $m \in \mathbb{N}$, $m > 0$, we define $\bar\chi^2_q(p,m)$ by

$$\bar\chi^2_q(p,m) = \| Y - Y_{mp}^{\Sigma_q^{-1}} \|^2_{\Sigma_q^{-1}},$$
where $Y_{mp}^{\Sigma_q^{-1}}$ is given by the $p$th stage of the $m$th cycle of Dykstra's algorithm (a sketch of this projection scheme in the present setting is given below). We then have the following equality:

$$\bar\chi^2_q(p) = \| Y - Y_{H_0'}^{\Sigma_q^{-1}} \|^2_{\Sigma_q^{-1}} = \| Y - Y_{mp}^{\Sigma_q^{-1}} \|^2_{\Sigma_q^{-1}} + \| Y_{mp}^{\Sigma_q^{-1}} - Y_{H_0'}^{\Sigma_q^{-1}} \|^2_{\Sigma_q^{-1}} + 2 \big\langle Y - Y_{mp}^{\Sigma_q^{-1}},\, Y_{mp}^{\Sigma_q^{-1}} - Y_{H_0'}^{\Sigma_q^{-1}} \big\rangle_{\Sigma_q^{-1}},$$

where $\langle \cdot,\cdot \rangle_{\Sigma_q^{-1}}$ is the inner product on $\mathbb{R}^q$ defined by $\Sigma_q^{-1}$. Using now Theorem 4, we see that

$$\| Y_{mp}^{\Sigma_q^{-1}} - Y_{H_0'}^{\Sigma_q^{-1}} \|^2_{\Sigma_q^{-1}} \to 0 \qquad \text{a.s. as } m \to +\infty.$$

In the same way, it can be shown that for fixed $p$ and $q$,

$$\big\langle Y - Y_{mp}^{\Sigma_q^{-1}},\, Y_{mp}^{\Sigma_q^{-1}} - Y_{H_0'}^{\Sigma_q^{-1}} \big\rangle_{\Sigma_q^{-1}} \to 0 \qquad \text{a.s. as } m \to +\infty.$$
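A minimal sketch of Dykstra's scheme for this application: cyclic $\Gamma$-projections onto the halfspaces $K_i = \{x : b_i'x \le 0\}$, where $\langle u,v\rangle = u'\Gamma v$. Each halfspace projection has the closed form used below, which is precisely the situation the algorithm is meant for. Names are ours.

```python
import numpy as np

def dykstra_halfspaces(y, B, Gamma, n_cycles=200):
    """Gamma-projection of y onto K = {x : B x <= 0} (rows of B are the b_i')."""
    Ginv = np.linalg.inv(Gamma)
    x = y.astype(float).copy()
    incr = np.zeros((B.shape[0], len(y)))          # one correction per halfspace
    for _ in range(n_cycles):                      # cycle m
        for i, b in enumerate(B):                  # stage i of cycle m
            z = x + incr[i]                        # add back the previous correction
            viol = b @ z
            if viol > 0:                           # Gamma-projection onto {x : b'x <= 0}
                x = z - viol * (Ginv @ b) / (b @ Ginv @ b)
            else:
                x = z
            incr[i] = z - x                        # store the new correction
    return x
```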
Combining the last two limits, we see that $\bar\chi^2_q(p,m)$ converges almost surely to $\bar\chi^2_q(p)$ as $m$ tends to infinity. Therefore, to implement the test, we will use $\bar\chi^2_q(p,m)$ instead of $\bar\chi^2_q(p)$. We can now define our convexity test.

3.3 Definition of the test

Consider the problem $[H, H_1]$, where $H$ means "the regression function $f$ is convex" and $H_1$ is the unrestricted alternative to be tested. Let $\hat\theta$ be the solution of the quadratic programming problem (2.23). Define $\hat\theta_{m,k_n+2}$ by $\hat\theta_{m,k_n+2} = \hat\theta_{m,k_n+2}^{\Gamma_n}$, with $\Gamma_n^{-1} = \frac{\sigma^2}{n} M^{-1}(\xi_n)$, where $\hat\theta_{m,k_n+2}^{\Gamma_n}$ is given by the $(k_n+2)$th stage of the $m$th cycle of Dykstra's algorithm. We then define our test of convexity by rejecting $H$ when

$$\bar\chi^2_{\frac{\sigma^2}{n} M^{-1}(\xi_n)}(k_n+2,\, m) = \frac{n}{\sigma^2} \| \hat\theta - \hat\theta_{m,k_n+2} \|^2_{M(\xi_n)} \ge s^2_{\alpha,k_n+2}, \tag{3.31}$$

where $s^2_{\alpha,k_n+2}$ is defined by (3.21).
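Assembling the earlier sketches gives the whole test pipeline: fit the least squares spline, project $\hat\theta$ onto the convexity cone $K$ in the $\frac{n}{\sigma^2}M(\xi_n)$ metric with Dykstra's algorithm, and compare the statistic (3.31) with $s^2_{\alpha,k_n+2}$. This is a sketch under assumptions, not the paper's implementation: it uses the hypothetical helpers defined above (`bspline_basis`, `fit_ls_spline`, `convexity_vectors`, `chi_bar_weights`, `critical_value`, `dykstra_halfspaces`) and a known $\sigma^2$.

```python
import numpy as np

def convexity_test(x_design, y_bar, lam, knots_interior, sigma2, n, alpha=0.05):
    k = len(knots_interior)
    t = np.r_[np.zeros(4), knots_interior, np.ones(4)]   # clamped cubic knot sequence
    theta_hat = fit_ls_spline(x_design, y_bar, lam, knots_interior)
    N = bspline_basis(knots_interior, x_design)
    M = N.T @ np.diag(lam) @ N                           # M(xi_n) = F D(xi_n) F'
    B = convexity_vectors(t, k)                          # rows b_l', l = 0,...,k+1
    Gamma = (n / sigma2) * M                             # inverse covariance of theta_hat
    theta_proj = dykstra_halfspaces(theta_hat, B, Gamma) # projection onto K
    diff = theta_hat - theta_proj
    stat = diff @ Gamma @ diff                           # statistic (3.31)
    Sigma = (sigma2 / n) * np.linalg.inv(M)              # Sigma_q, covariance of theta_hat
    w0, w, _ = chi_bar_weights(B, Sigma)                 # level probabilities (Theorem 2)
    s2 = critical_value(alpha, w0, w, q=k + 4)           # solve (3.21)
    return stat, s2, bool(stat >= s2)                    # True: reject convexity
```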
4 ASYMPTOTIC PROPERTIES
Note that the test procedure requires knowledge of the variance $\sigma^2$. However, the following results can be considered approximately true if $\sigma$ is unknown, provided that a consistent estimate of $\sigma$ is used in the computation of the statistic. In the case of the least squares estimator, such an estimate can be obtained from

$$\hat\sigma_n^2 = \frac{1}{n-(k+4)} \sum_{i=1}^{r} \big( \bar y_i - \hat f_n(x_i) \big)^2,$$

or, alternatively, from any consistent estimator based on nonparametric regression techniques. In what follows, we assume that $\xi_n$ converges to a design measure $\xi$, where $\xi$ is an absolutely continuous measure. We denote by $H_n$ and $H$ the cumulative distribution functions of $\xi_n$ and $\xi$, respectively. The number of knots will be a function of the sample size: $k = k_n$. The critical region of the test is

$$\Lambda_{n,m} = \Big\{ \bar\chi^2_{\frac{\hat\sigma_n^2}{n} M^{-1}(\xi_n)}(k_n+2,\, m) = \frac{n}{\hat\sigma_n^2} \| \hat\theta - \hat\theta_{m,k_n+2} \|^2_{M(\xi_n)} \ge s^2_{\alpha,k_n+2} \Big\}.$$

We are now ready for the main result of this section.

Theorem 5 Let $f \in \mathcal{F}_{l,M}$ with $l \ge 3$, and consider the problem $[H, H_1]$. Under the following assumptions:

(i) the $\varepsilon_{ij}$ are i.i.d. $N(0, \sigma^2)$;

(ii) $\sup_{0 \le x \le 1} | H_n(x) - H(x) | = o(k_n^{-1})$ as $k_n \longrightarrow +\infty$;

(iii) $\lim_{n \to +\infty} r^{1/2} n^{1/2} k_n^{-3} \big( \sup_{1 \le i \le r} \lambda_i \big)^{1/2} = 0$;

the test is asymptotically of size $\alpha$. More precisely,

$$\lim_{m \to +\infty} \limsup_{n \to +\infty} \sup_{f \in H_0} P_f(\Lambda_{n,m}) = \alpha.$$

Moreover, if

(iv) $\lim_{n \to +\infty} n^{1/2} k_n^{-5/2} = +\infty$,

then the test statistic diverges to infinity under the alternative, and the test is consistent, i.e.

$$\lim_{n \to +\infty} \lim_{m \to +\infty} P_f(\Lambda_{n,m}) = 1.$$
The proof of Theorem 5 is given in the Appendix.
Remarks Assumption (ii) is the same as in Agarwal and Studden (1980). It is easy to see that assumption (iii) implies that $\lim_{n \to +\infty} k_n = +\infty$. For a uniform design, i.e. $\lambda_i = 1/r$, $i = 1,\dots,r$, assumption (iii) is equivalent to

$$\lim_{n \to +\infty} n^{1/2} k_n^{-3} = 0.$$

This is the case, for example, when there are $n$ distinct design points for $n$ measurements $(r = n)$.

Discussion: We have proposed a consistent test of convexity of a regression function in a nonparametric model. While it appears difficult to impose properties such as concavity on nonparametric local averaging estimators, this restriction is readily introduced by using a cubic spline estimator. The idea of the test exploits the close connection between the convexity problem and hypothesis testing problems concerning linear inequalities and normal means. A simulation study in Diack (1997) shows the usefulness of this method and that the finite-sample behavior of the test is quite satisfactory. It would be worthwhile to extend this framework to the case where the errors are not necessarily Gaussian, and to study the behavior of the test under local alternatives. A test of monotonicity can be readily constructed paralleling the above convexity test, with quadratic splines instead of cubic splines; this additional step is still under study.
Appendix

For the proof of Theorem 5, we need Lemma 3, which is obtained by a straightforward manipulation of results of Schumaker (1981). It gives a sup-norm error bound when approximating a smooth and convex function by a convex cubic spline (see also Beatson (1982) and Diack (1997)).

Lemma 3 Let $l \ge 3$. There is a constant $c$ such that for every function $f \in \mathcal{F}_{l,M}$, there is an $S$ in S($k_n$,4) such that:

$$\sup_{0 \le x \le 1} | f^{(j)}(x) - S^{(j)}(x) | \le c\, k_n^{-(4-j)}, \qquad j = 0,\dots,3.$$

Moreover, if $f$ is convex then $S$ is also convex.
Proof of Theorem 5:

Let us recall that

$$\hat\theta = M^{-1}(\xi_n) F D(\xi_n) Y.$$

From Lemma 3, there is a function $S$ in S($k_n$,4) such that

$$\sup_{0 \le x \le 1} | f^{(j)}(x) - S^{(j)}(x) | \le c\, k_n^{-(4-j)} \qquad \text{for all } j = 0,\dots,3.$$

Since $S \in$ S($k_n$,4), there exists $\beta \in \mathbb{R}^{k_n+4}$ such that $S(x) = N'(x)\beta$. Let $\mathbf{S} = (S(x_1),\dots,S(x_r))' = (N'(x_1)\beta,\dots,N'(x_r)\beta)' = F'\beta$. Then

$$M^{-1}(\xi_n) F D(\xi_n) \mathbf{S} = M^{-1}(\xi_n) F D(\xi_n) F'\beta = \beta.$$

Let

$$\hat\theta_s = M^{-1}(\xi_n) F D(\xi_n) (\mathbf{S} + \bar\varepsilon).$$

Then $\mathbb{E}\hat\theta_s = \beta$ and $\hat\theta_s \sim N\big(\beta,\, \frac{\sigma^2}{n} M^{-1}(\xi_n)\big)$. Therefore, we can write $\hat\theta = \hat\theta_s + B_n$ and $\mathbb{E}\hat\theta = \beta + B_n$, with

$$B_n = M^{-1}(\xi_n) F D(\xi_n) (\mathbf{f} - \mathbf{S}).$$

Now recall that the test statistic is given by

$$\bar\chi^2_{\frac{\sigma^2}{n} M^{-1}(\xi_n)}(k_n+2,\, m) = \frac{n}{\sigma^2} \| \hat\theta - \hat\theta_{m,k_n+2} \|^2_{M(\xi_n)}.$$

For $m$ sufficiently large and $n$ fixed (see Section 3.2), this statistic converges in probability to

$$\bar\chi^2_{\frac{\sigma^2}{n} M^{-1}(\xi_n)}(k_n+2) = \frac{n}{\sigma^2} \| \hat\theta - \Pi(\hat\theta) \|^2_{M(\xi_n)},$$

where $\Pi(\hat\theta)$ is the $M(\xi_n)$-projection of $\hat\theta$ onto the polyhedral closed convex cone

$$K = \{ x \in \mathbb{R}^{k_n+4} : b_l'x \le 0, \ l = 0,\dots,k_n+1 \}.$$

Otherwise stated,

$$\bar\chi^2_{\frac{\sigma^2}{n} M^{-1}(\xi_n)}(k_n+2) = \frac{n}{\sigma^2} \big\| (\hat\theta_s - \Pi(\hat\theta_s)) + (\hat\theta - \hat\theta_s) - \Pi(\hat\theta) + \Pi(\hat\theta_s) \big\|^2_{M(\xi_n)}.$$
We can rewrite this in the following form:

$$\begin{aligned} \bar\chi^2_{\frac{\sigma^2}{n} M^{-1}(\xi_n)}(k_n+2) ={}& \frac{n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|^2_{M(\xi_n)} + \frac{n}{\sigma^2} \| B_n \|^2_{M(\xi_n)} + \frac{n}{\sigma^2} \| \Pi(\hat\theta_s) - \Pi(\hat\theta_s + B_n) \|^2_{M(\xi_n)} \\ &+ \frac{2n}{\sigma^2} \langle \hat\theta_s - \Pi(\hat\theta_s),\, B_n \rangle_{M(\xi_n)} + \frac{2n}{\sigma^2} \langle \hat\theta_s - \Pi(\hat\theta_s),\, \Pi(\hat\theta_s) - \Pi(\hat\theta_s + B_n) \rangle_{M(\xi_n)} \\ &+ \frac{2n}{\sigma^2} \langle B_n,\, \Pi(\hat\theta_s) - \Pi(\hat\theta_s + B_n) \rangle_{M(\xi_n)}, \end{aligned}$$

where $\langle \cdot,\cdot \rangle_{M(\xi_n)}$ is the scalar product defined by the metric $M(\xi_n)$. Let us first show that, under the null hypothesis,

$$\sup_{f \in \mathcal{F}_{l,M}} \Big( \bar\chi^2_{\frac{\sigma^2}{n} M^{-1}(\xi_n)}(k_n+2) - \frac{n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|^2_{M(\xi_n)} \Big)$$

converges in probability to zero. Indeed, the following inequality holds:

$$\sup_{f \in \mathcal{F}_{l,M}} \frac{rn}{\sigma^2} \| B_n \|^2_{M(\xi_n)} \le \sup_{f \in \mathcal{F}_{l,M}} \frac{rn}{\sigma^2} \| B_n \|^2\, \| M(\xi_n) \|.$$

Now, it is easily seen, as in Diack & Thomas (1998) (see their formula (3.5)), that the right-hand side of this inequality satisfies

$$\sup_{f \in \mathcal{F}_{l,M}} \frac{rn}{\sigma^2} \| B_n \|^2\, \| M(\xi_n) \| = O\Big( r^{1/2} n^{1/2} k_n^{-4} \big( \sup_{1 \le i \le r} \lambda_i \big)^{1/2} \Big).$$

Hence

$$\sup_{f \in \mathcal{F}_{l,M}} \frac{rn}{\sigma^2} \| B_n \|^2_{M(\xi_n)} = O\Big( r^{1/2} n^{1/2} k_n^{-4} \big( \sup_{1 \le i \le r} \lambda_i \big)^{1/2} \Big),$$

and therefore, using (iii) of Theorem 5, we see that

$$\lim_{n \to +\infty} \sup_{f \in \mathcal{F}_{l,M}} \frac{rn}{\sigma^2} \| B_n \|^2_{M(\xi_n)} = 0. \tag{A.1}$$

Using now the fact that projections onto closed convex cones are contracting maps, as are projections onto linear subspaces, we obtain

$$\frac{n}{\sigma^2} \| \Pi(\hat\theta_s) - \Pi(\hat\theta_s + B_n) \|^2_{M(\xi_n)} \le \frac{n}{\sigma^2} \| B_n \|^2_{M(\xi_n)}.$$

Then

$$\lim_{n \to +\infty} \sup_{f \in \mathcal{F}_{l,M}} \frac{n}{\sigma^2} \| \Pi(\hat\theta_s) - \Pi(\hat\theta_s + B_n) \|^2_{M(\xi_n)} = 0 \quad \text{a.s.},$$

and it follows that

$$\lim_{n \to +\infty} \sup_{f \in \mathcal{F}_{l,M}} \frac{2n}{\sigma^2} \langle B_n,\, \Pi(\hat\theta_s) - \Pi(\hat\theta_s + B_n) \rangle_{M(\xi_n)} = 0 \quad \text{a.s.}$$
On the other hand,

$$\Big| \frac{2n}{\sigma^2} \langle \hat\theta_s - \Pi(\hat\theta_s),\, B_n \rangle_{M(\xi_n)} \Big| \le \frac{2n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \| B_n \|_{M(\xi_n)}$$

$$\le \frac{2n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \| B_n \|_{M(\xi_n)}\, \mathbb{1}_{(\sqrt{n} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \le 1)} + \frac{2n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \| B_n \|_{M(\xi_n)}\, \mathbb{1}_{(\sqrt{n} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} > 1)},$$

where $\mathbb{1}_A$ is defined by: $\mathbb{1}_A(x) = 1$ if $x \in A$ and $\mathbb{1}_A(x) = 0$ otherwise. By (A.1), we have

$$\lim_{n \to +\infty} \sup_{f \in \mathcal{F}_{l,M}} \frac{2n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \| B_n \|_{M(\xi_n)}\, \mathbb{1}_{(\sqrt{n} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \le 1)} = 0.$$

Otherwise,

$$\frac{2n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \| B_n \|_{M(\xi_n)}\, \mathbb{1}_{(\sqrt{n} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} > 1)} \le \frac{2n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|^2_{M(\xi_n)}\, \sqrt{n}\, \| B_n \|_{M(\xi_n)}\, \mathbb{1}_{(\sqrt{n} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} > 1)}.$$

Now, under the null hypothesis and for $n$ sufficiently large, $S$ is convex, that is to say $\beta \in K$. Then, using Lemma 2, we see that

$$\| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \le \| \hat\theta_s - \beta - \Pi(\hat\theta_s - \beta) \|_{M(\xi_n)}.$$

Moreover,

$$\hat\theta_s - \beta \sim N\Big( 0,\, \frac{\sigma^2}{n} M^{-1}(\xi_n) \Big).$$

Applying now Theorem 2, we obtain

$$P\Big( \frac{n}{\sigma^2} \| \hat\theta_s - \beta - \Pi(\hat\theta_s - \beta) \|^2_{M(\xi_n)} \ge s^2 \Big) = \omega_0 P(\chi_0^2 \ge s^2) + \Big( \sum_{j=0}^{k_n+1} \omega_j \Big) P(\chi_1^2 \ge s^2) + \omega_{k_n+4} P(\chi_{k_n+4}^2 \ge s^2).$$

Then, under the null hypothesis,

$$P\Big( \frac{n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|^2_{M(\xi_n)} \ge s^2 \Big) \le P(\chi_{k_n+4}^2 \ge s^2).$$
Hence, for all $\lambda \in \mathbb{R}^+$, we have

$$P\Big( \sup_{f \in \mathcal{F}_{l,M}} \frac{2n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \| B_n \|_{M(\xi_n)}\, \mathbb{1}_{(\sqrt{n} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} > 1)} > \lambda \Big) \le P\Big( \sup_{f \in \mathcal{F}_{l,M}} \chi^2_{k_n+4}\, \frac{2\sqrt{n}}{\sigma^2}\, \| B_n \|_{M(\xi_n)} > \lambda \Big).$$

Applying now the Bienaymé–Chebyshev inequality, we deduce that

$$P\Big( \sup_{f \in \mathcal{F}_{l,M}} \frac{2n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \| B_n \|_{M(\xi_n)}\, \mathbb{1}_{(\sqrt{n} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} > 1)} > \lambda \Big) \le \frac{n \| B_n \|^2_{M(\xi_n)}}{\lambda^2}\, O\big( \mathbb{E}(\chi^2_{k_n+4}) \big).$$

We have $\mathbb{E}(\chi^2_{k_n+4}) = k_n + 4 = O(k_n)$. Then

$$P\Big( \sup_{f \in \mathcal{F}_{l,M}} \frac{2n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \| B_n \|_{M(\xi_n)}\, \mathbb{1}_{(\sqrt{n} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} > 1)} > \lambda \Big) = O\Big( r^{1/2} n^{1/2} k_n^{-3} \big( \sup_{1 \le i \le r} \lambda_i \big)^{1/2} \Big).$$

Hence, under (iii) of Theorem 5, we obtain

$$\lim_{n \to +\infty} \sup_{f \in \mathcal{F}_{l,M}} \Big| \frac{2n}{\sigma^2} \langle \hat\theta_s - \Pi(\hat\theta_s),\, B_n \rangle_{M(\xi_n)} \Big| = 0,$$

with convergence in probability. Now,

$$\Big| \frac{2n}{\sigma^2} \langle \hat\theta_s - \Pi(\hat\theta_s),\, \Pi(\hat\theta_s) - \Pi(\hat\theta_s + B_n) \rangle_{M(\xi_n)} \Big| \le \frac{2n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|_{M(\xi_n)} \| B_n \|_{M(\xi_n)},$$

and we see as above that $\sup_{f \in \mathcal{F}_{l,M}} \big| \frac{2n}{\sigma^2} \langle \hat\theta_s - \Pi(\hat\theta_s),\, \Pi(\hat\theta_s) - \Pi(\hat\theta_s + B_n) \rangle_{M(\xi_n)} \big|$ converges in probability to zero. Hence

$$\sup_{f \in \mathcal{F}_{l,M}} \Big( \bar\chi^2_{\frac{\sigma^2}{n} M^{-1}(\xi_n)}(k_n+2) - \frac{n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|^2_{M(\xi_n)} \Big)$$

converges in probability to zero. Therefore, we can write

$$\lim_{m \to +\infty} \limsup_{n \to +\infty} \sup_{f \in H_0} P_f(\Lambda_{n,m}) = \limsup_{n \to +\infty} \sup_{f \in H_0} P_f\Big( \bar\chi^2_{\frac{\sigma^2}{n} M^{-1}(\xi_n)}(k_n+2) \ge s^2_{\alpha,k_n+2} \Big)$$

$$= \limsup_{n \to +\infty} \sup_{f \in H_0} P_f\Big( \frac{n}{\sigma^2} \| \hat\theta_s - \Pi(\hat\theta_s) \|^2_{M(\xi_n)} \ge s^2_{\alpha,k_n+2} \Big) \le \limsup_{n \to +\infty} \sup_{f \in H_0} P_f\Big( \frac{n}{\sigma^2} \| \hat\theta_s - \beta - \Pi(\hat\theta_s - \beta) \|^2_{M(\xi_n)} \ge s^2_{\alpha,k_n+2} \Big) = \alpha.$$

On the other hand, $f \equiv 0$ lies in $H_0$, and at this point we have

$$\lim_{m \to +\infty} \limsup_{n \to +\infty} P_f(\Lambda_{n,m}) = \alpha.$$

Hence the test is asymptotically of size $\alpha$. It remains to prove that the test is consistent. From Theorem 3, it suffices to show that

$$\frac{n}{\sigma^2} \| \mathbb{E}\hat\theta - \Pi(\mathbb{E}\hat\theta) \|^2_{M(\xi_n)} \longrightarrow +\infty.$$

As above, we have

$$\frac{n}{\sigma^2} \| \mathbb{E}\hat\theta - \Pi(\mathbb{E}\hat\theta) \|^2_{M(\xi_n)} = \frac{n}{\sigma^2} \| B_n + \beta - \Pi(B_n + \beta) \|^2_{M(\xi_n)}.$$

In other words,

$$\begin{aligned} \frac{n}{\sigma^2} \| \mathbb{E}\hat\theta - \Pi(\mathbb{E}\hat\theta) \|^2_{M(\xi_n)} ={}& \frac{n}{\sigma^2} \| \beta - \Pi(\beta) \|^2_{M(\xi_n)} + \frac{n}{\sigma^2} \| B_n \|^2_{M(\xi_n)} + \frac{n}{\sigma^2} \| \Pi(\beta) - \Pi(\beta + B_n) \|^2_{M(\xi_n)} \\ &+ \frac{2n}{\sigma^2} \langle \beta - \Pi(\beta),\, B_n \rangle_{M(\xi_n)} + \frac{2n}{\sigma^2} \langle \beta - \Pi(\beta),\, \Pi(\beta) - \Pi(\beta + B_n) \rangle_{M(\xi_n)} + \frac{2n}{\sigma^2} \langle B_n,\, \Pi(\beta) - \Pi(\beta + B_n) \rangle_{M(\xi_n)}. \end{aligned}$$

In the same way as above, we see that

$$\frac{n}{\sigma^2} \| B_n \|^2_{M(\xi_n)} \to 0, \qquad \frac{n}{\sigma^2} \| \Pi(\beta) - \Pi(\beta + B_n) \|^2_{M(\xi_n)} \to 0.$$

Hence

$$\frac{2n}{\sigma^2} \big| \langle \beta - \Pi(\beta),\, B_n \rangle_{M(\xi_n)} \big| = \epsilon_n \cdot \frac{n}{\sigma^2} \| \beta - \Pi(\beta) \|^2_{M(\xi_n)}$$

and

$$\frac{2n}{\sigma^2} \big| \langle \beta - \Pi(\beta),\, \Pi(\beta) - \Pi(\beta + B_n) \rangle_{M(\xi_n)} \big| + \frac{2n}{\sigma^2} \big| \langle B_n,\, \Pi(\beta) - \Pi(\beta + B_n) \rangle_{M(\xi_n)} \big| = \epsilon_n' \cdot \frac{n}{\sigma^2} \| \beta - \Pi(\beta) \|^2_{M(\xi_n)},$$

where $\epsilon_n$ and $\epsilon_n'$ are nonnegative reals converging to zero. Therefore, the consistency of the test will be established once we show that

$$\frac{n}{\sigma^2} \| \beta - \Pi(\beta) \|^2_{M(\xi_n)} \longrightarrow +\infty.$$
If $f$ is non-convex, then for $n$ sufficiently large $S$ is also non-convex, and therefore $\beta \notin K$. Hence, there is a face $F_j$ of $K$ such that $\Pi(\beta)$ lies in $F_j$, with $F_j$ defined by

$$F_j = \{ x \in \mathbb{R}^{k_n+4} : b_j'x = 0, \ b_l'x \le 0, \ l = 0,\dots,k_n+1 \}.$$

Hence, from Lemma 1, $\Pi(\beta)$ is also the orthogonal projection of $\beta$ onto the subspace generated by $F_j$, and $\beta - \Pi(\beta)$ lies in the polar cone of $K$. That is to say,

$$\frac{n}{\sigma^2} \| \beta - \Pi(\beta) \|^2_{M(\xi_n)} = \frac{n}{\sigma^2}\, \frac{(b_j'\beta)^2}{b_j' M^{-1}(\xi_n) b_j},$$

and

$$\begin{cases} b_j'\beta \ge 0, \\[2pt] b_i'\beta - \dfrac{b_i' M^{-1}(\xi_n) b_j}{b_j' M^{-1}(\xi_n) b_j}\, b_j'\beta \le 0, & i = 0,\dots,k_n+1. \end{cases}$$

Otherwise,

$$\frac{1}{b_j' M^{-1}(\xi_n) b_j} \ge \frac{1}{\| b_j \|^2\, \| M^{-1}(\xi_n) \|}.$$

Hence,

$$\frac{n}{\sigma^2} \| \beta - \Pi(\beta) \|^2_{M(\xi_n)} \ge \frac{n}{\sigma^2}\, \frac{(b_j'\beta)^2}{\| b_j \|^2\, \| M^{-1}(\xi_n) \|}.$$

Therefore, for $n$ sufficiently large,

$$\frac{n}{\sigma^2} \| \beta - \Pi(\beta) \|^2_{M(\xi_n)} \ge \frac{n}{2\sigma^2}\, \frac{(f''(\xi_j))^2}{k_n^5}.$$

Under assumption (iv) we have $n k_n^{-5} \to +\infty$; hence we will be finished once we show that, for $n$ sufficiently large, $(f''(\xi_j))^2 > \delta$, where $\delta$ is a positive real. Let

$$a_{ij} = \frac{b_i' M^{-1}(\xi_n) b_j}{b_j' M^{-1}(\xi_n) b_j}.$$

Because the sequence of knots is quasi-uniform, it is readily seen that

$$\sup_{0 \le j \le k_n+1} \| b_j \| = O\Big( \inf_{0 \le j \le k_n+1} \| b_j \| \Big).$$

Since

$$b_j' M^{-1}(\xi_n) b_j \ge \frac{\| b_j \|^2}{\| M(\xi_n) \|} \ge \frac{\inf_{0 \le j \le k_n+1} \| b_j \|^2}{\| M(\xi_n) \|},$$

we see, using the facts that $\| M(\xi_n) \| = O(k_n^{-1})$ and $\| M^{-1}(\xi_n) \| = O(k_n)$ (see Diack & Thomas (1998)), that

$$| a_{ij} | \le \frac{\sup_{0 \le j \le k_n+1} \| b_j \|^2\, \| M^{-1}(\xi_n) \|\, \| M(\xi_n) \|}{\inf_{0 \le j \le k_n+1} \| b_j \|^2} = O(1).$$

Now, let us suppose that $b_i'\beta \ge 0$. Since $b_j'\beta \ge 0$, we have $a_{ij} \ge 0$, and then

$$0 \le b_i'\beta \le a_{ij}\, b_j'\beta \le O(1)\, b_j'\beta.$$

Hence,

$$(b_i'\beta)^2 \le O(1)\, (b_j'\beta)^2.$$

But for $n$ sufficiently large,

$$b_i'\beta = -f''(\xi_i) + o(1).$$

Because the knots are dense in $(0,1)$, one can deduce that

$$(b_j'\beta)^2 = \big( f''(\xi_j) + o(1) \big)^2 \ge c \sup_{x \in (0,1)} \big[ (f''(x))^2\, \mathbb{1}_{(f''(x) \le 0)} \big] > \delta,$$

where $c$ and $\delta$ are positive constants. We thus have

$$\frac{n}{\sigma^2} \| \beta - \Pi(\beta) \|^2_{M(\xi_n)} \to +\infty,$$

and the consistency of the test follows. $\Box$
References

[Agarwal and Studden, 1980] Agarwal, G. and Studden, W. (1980). Asymptotic integrated mean square error using least squares and bias minimizing splines. The Annals of Statistics, 8(6):1307–1325.

[Barlow et al., 1972] Barlow, R. E., Bartholomew, D., Bremner, J., and Brunk, H. (1972). Statistical Inference under Order Restrictions. John Wiley & Sons.

[Bartholomew, 1961] Bartholomew, D. (1961). Ordered tests in the analysis of variance. Biometrika, 48:325–332.

[Beatson, 1982] Beatson, R. (1982). Monotone and convex approximation by splines: error estimates and a curve fitting algorithm. SIAM J. of Math. Analysis, 19(4).

[Carroll and Kimball, 1996] Carroll, C. and Kimball, M. (1996). Notes and comments on the concavity of the consumption function. Econometrica, 64(4):981–992.

[Diack, 1996] Diack, C. (1996). Testing convexity. Proceedings in Computational Statistics, Physica Verlag.

[Diack, 1997] Diack, C. (1997). Tests de convexité pour une fonction de régression. Thèse de Doctorat, Université Paul Sabatier, Toulouse, France.

[Diack, 1998] Diack, C. (1998). Sur la convergence des tests de Schlee et de Yatchew. Submitted.

[Diack and Thomas-Agnan, 1998] Diack, C. and Thomas-Agnan, C. (1998). A nonparametric test of the non-convexity of regression. To appear in Journal of Nonparametric Statistics.

[Dierckx, 1980] Dierckx, P. (1980). An algorithm for cubic spline fitting with convexity constraints. Computing, 24:349–371.

[Dykstra, 1983] Dykstra, R. (1983). An algorithm for restricted least squares regression. Journal of the American Statistical Association, 78(384).

[Kudô, 1963] Kudô, A. (1963). A multivariate analogue of the one-sided test. Biometrika, 50(3–4):403–418.

[Kudô and Choi, 1975] Kudô, A. and Choi, J. (1975). A generalized multivariate analogue of the one-sided test. Mem. Fac. Sci., Kyushu Univ. A, 29:303–328.

[Nüesch, 1966] Nüesch, P. (1966). On the problem of testing location in multivariate populations for restricted alternatives. Ann. Math. Statist., 37:113–119.

[Raubertas et al., 1986] Raubertas, R., Lee, C., and Nordheim, E. (1986). Hypothesis tests for normal means constrained by linear inequalities. Comm. Statist. Theor. Meth., 15:2809–2833.

[Robertson et al., 1988] Robertson, T., Wright, F., and Dykstra, R. (1988). Order Restricted Statistical Inference. Wiley, New York.

[Rockafellar, 1970] Rockafellar, R. (1970). Convex Analysis. Princeton University Press, Princeton, New Jersey.

[Sasabuchi, 1980] Sasabuchi, S. (1980). A test of a multivariate normal mean with composite hypotheses determined by linear inequalities. Biometrika, 67(2):429–439.

[Schlee, 1980] Schlee, W. (1980). Nonparametric test of the monotony and convexity of regression. Nonparametric Statistical Inference, 2:823–836.

[Schumaker, 1981] Schumaker, L. (1981). Spline Functions: Basic Theory. John Wiley, New York.

[Shapiro, 1985] Shapiro, A. (1985). Asymptotic distribution of test statistics in the analysis of moment structures under inequality constraints. Biometrika, 72(1):133–144.

[Yatchew, 1992] Yatchew, A. (1992). Nonparametric regression tests based on least squares. Econometric Theory, 8:435–451.