A note on the maximization of matrix valued Hankel determinants with applications

Holger Dette
Ruhr-Universität Bochum
Fakultät für Mathematik
44780 Bochum, Germany
email: holger.dette@ruhr-uni-bochum.de
FAX: +49 2 34 70 94 559

W. J. Studden
Dept. of Statistics
Purdue University
West Lafayette, IN 47907-1399, USA
email: [email protected]

June 4, 2003

Abstract

In this note we consider the problem of maximizing the determinant of moment matrices of matrix measures. The maximizing matrix measure can be characterized explicitly: it has equal (matrix valued) weights at the zeros of classical (one-dimensional) orthogonal polynomials. The results generalize classical work of Schoenberg (1959) to the case of matrix measures. As a statistical application we consider several optimal design problems in linear models which generalize the classical weighing design problems.

AMS Subject Classification: 42C05, 33C45, 62K05
Keywords and Phrases: matrix measures, Hankel matrix, orthogonal polynomials, approximate optimal designs, spring balance weighing designs

1 Introduction

In recent years considerable interest has been shown in the area of matrix measures and matrix polynomials [see Rodman (1990), Sinap and Van Assche (1994), Duran and Van Assche (1995), Duran (1995, 1996, 1999), Duran and Lopez-Rodriguez (1996, 1997), Zygmunt (2001, 2002), among many others]. A matrix measure µ is a p × p matrix µ = {µ_ij} of finite signed measures µ_ij on the Borel field of the real line R or of an appropriate subset. It will be assumed here that for each Borel set A ⊂ R the matrix µ(A) = {µ_ij(A)} is symmetric and nonnegative definite, i.e. µ(A) ≥ 0. The moments of the matrix measure µ are given by the p × p matrices

$$S_k = S_k(\mu) = \int t^k \, d\mu(t), \qquad k = 0, 1, \dots \tag{1.1}$$

If it is not stated otherwise, the integrals will be over the interval [0, 1]. In the present note we are interested in the maximum of the determinant of the Hankel matrix

$$H_{2m} = \begin{pmatrix} S_0 & \cdots & S_m \\ \vdots & & \vdots \\ S_m & \cdots & S_{2m} \end{pmatrix}, \tag{1.2}$$

where the maximum is taken over a certain subset

$$N^* = \{(S_0, \dots, S_{2m}) \in M_{2m+1} \mid S_0 \in N\} \tag{1.3}$$

of the moment space

$$M_{2m+1} = \{(S_0(\mu), \dots, S_{2m}(\mu)) \mid \mu \text{ matrix measure on } [0,1]\}, \tag{1.4}$$

where N is some subset of the symmetric positive semi-definite matrices. For the one-dimensional case p = 1, problems of this type have been studied by Schoenberg (1959), and the solutions are discrete measures supported on the roots of classical orthogonal polynomials [see Karlin and Studden (1966)]. Our interest in these problems stems from both a mathematical and a practical viewpoint. On the one hand we are interested in generalizations of Schoenberg's results to the matrix case. On the other hand the optimization of the determinant of the Hankel matrix in (1.2) appears naturally in some areas of mathematical statistics, where optimal designs for linear regression experiments have to be determined.

In Section 2 we present some recent facts on matrix measures, matrix orthogonal polynomials and matrix valued canonical moments [see Dette and Studden (2002a,b)]. This methodology simplifies the optimization problem substantially, and it can be shown that the matrix measure maximizing the Hankel determinant |H_{2m}| is a uniform distribution on m + 1 points. These points are the points 0 and 1 together with the roots of the derivative of the (one-dimensional) mth Legendre polynomial, and they do not depend on the particular choice of the set N in the definition of N*. The common weight matrix at these points is the solution of the optimization problem

$$A^* = \arg\max\{|S_0| \mid S_0 \in M_1 \cap N\} \tag{1.5}$$

[here and throughout this paper we assume that the maximum in (1.5) exists and is unique]. Finally, some statistical applications of the general results are discussed in Section 3.

2 Canonical moments and the maximum of the Hankel determinant

Let N denote a subset of the nonnegative definite matrices such that the maximum in (1.5) exists and is unique. In order to find

$$\max\{|H_{2m}(\mu)| \mid \mu \text{ matrix measure on } [0,1],\ S_0(\mu) \in N\} \tag{2.1}$$

we need some basic facts on matrix measures and orthogonal matrix polynomials, which have recently been established by Dette and Studden (2002a,b) and will be briefly summarized here for the sake of completeness. For a matrix measure µ on the interval [0, 1] with moments $S_j = \int_0^1 t^j \, d\mu(t)$ define the "Hankel" matrices

$$\underline{H}_{2m} = \begin{pmatrix} S_0 & \cdots & S_m \\ \vdots & & \vdots \\ S_m & \cdots & S_{2m} \end{pmatrix}, \qquad \overline{H}_{2m} = \begin{pmatrix} S_1 - S_2 & \cdots & S_m - S_{m+1} \\ \vdots & & \vdots \\ S_m - S_{m+1} & \cdots & S_{2m-1} - S_{2m} \end{pmatrix} \tag{2.2}$$

and

$$\underline{H}_{2m+1} = \begin{pmatrix} S_1 & \cdots & S_{m+1} \\ \vdots & & \vdots \\ S_{m+1} & \cdots & S_{2m+1} \end{pmatrix}, \qquad \overline{H}_{2m+1} = \begin{pmatrix} S_0 - S_1 & \cdots & S_m - S_{m+1} \\ \vdots & & \vdots \\ S_m - S_{m+1} & \cdots & S_{2m} - S_{2m+1} \end{pmatrix}. \tag{2.3}$$

Dette and Studden (2002a) showed that a point (S_0, ..., S_n) is in the (interior of the) moment space M_{n+1} generated by the matrix measures on the interval [0, 1] if and only if the matrices $\underline{H}_n$ and $\overline{H}_n$ are nonnegative (positive) definite. The nonnegativity of the matrices $\underline{H}_n$ and $\overline{H}_n$ imposes bounds on the moments S_k, as in the one-dimensional case [see Dette and Studden (1997), Chapter 1]. To be precise, let

$$h_{2m}^T = (S_{m+1}, \dots, S_{2m}), \qquad h_{2m-1}^T = (S_m, \dots, S_{2m-1}),$$
$$\bar h_{2m}^T = (S_m - S_{m+1}, \dots, S_{2m-1} - S_{2m}), \qquad \bar h_{2m-1}^T = (S_m - S_{m+1}, \dots, S_{2m-2} - S_{2m-1}),$$

and define

$$S_1^- = 0, \qquad S_{n+1}^- = h_n^T \underline{H}_{n-1}^{-1} h_n, \quad n \ge 1, \tag{2.4}$$

and

$$S_1^+ = S_0, \quad S_2^+ = S_1, \qquad S_{n+1}^+ = S_n - \bar h_n^T \overline{H}_{n-1}^{-1} \bar h_n, \quad n \ge 2, \tag{2.5}$$

whenever the inverses of the Hankel matrices exist. It should be noted and stressed that $S_n^-$ and $S_n^+$ depend on $(S_0, S_1, \dots, S_{n-1})$, although this is not indicated explicitly. It follows from a straightforward calculation with partitioned matrices that $(S_0, \dots, S_{n-1})$ is in the interior of the moment space $M_n$ if and only if $S_n^- < S_n^+$ in the sense of the Loewner ordering (note that a matrix is positive definite if and only if its main subblock and the corresponding Schur complement are positive definite). Moreover, for $(S_0, \dots, S_n) \in M_{n+1}$ we have

$$S_n^- \le S_n \le S_n^+ \tag{2.6}$$

in the sense of the Loewner ordering. If $(S_0, S_1, \dots, S_{n-1})$ is in the interior of the moment space $M_n$, then we define the kth matrix canonical moment as the matrix

$$U_k = D_k^{-1}(S_k - S_k^-), \qquad 1 \le k \le n, \tag{2.7}$$

where

$$D_k = S_k^+ - S_k^-. \tag{2.8}$$

These quantities are the analogs of the classical canonical moments p_k in the scalar case [see Dette and Studden (1997)]. We will also make use of the quantities

$$V_k = I_p - U_k = D_k^{-1}(S_k^+ - S_k), \qquad 1 \le k \le n. \tag{2.9}$$

The main results in Dette and Studden (2002a) are the following two theorems. The first represents the width D_{n+1} of the moment space M_{n+1} in terms of the matrix canonical moments U_k and V_k, and the second shows how canonical moments appear naturally in the three term recurrence relation for the monic matrix orthogonal polynomials.

Theorem 2.1. If the point (S_0, ..., S_n) is in the interior of the moment space M_{n+1} generated by the matrix measures on the interval [0, 1], then

$$D_{n+1} = S_{n+1}^+ - S_{n+1}^- = S_0 U_1 V_1 U_2 V_2 \cdots U_n V_n. \tag{2.10}$$

A p × p matrix polynomial is a p × p matrix with polynomial entries. It is of degree n if all the entries are polynomials of degree less than or equal to n, and it is usually written in the form

$$P(t) = \sum_{i=0}^{n} A_i t^i, \tag{2.11}$$

where the A_i are real p × p matrices. The matrix polynomial P(t) is called monic if the highest coefficient satisfies A_n = I_p, where I_p denotes the p × p identity matrix. The (pseudo) inner product of two matrix polynomials is defined by

$$\langle P, Q \rangle = \int P^T(t)\, \mu(dt)\, Q(t). \tag{2.12}$$

Sinap and Van Assche (1996) call this the 'right' inner product; the left inner product would place the transpose on the polynomial Q instead. The orthogonal polynomials are defined by orthogonalizing the sequence I_p, tI_p, t²I_p, ... with respect to the above inner product. It is easy to see that matrix orthogonal polynomials satisfy a three term recurrence relationship. The following result expresses the coefficients in this recurrence relation for the monic orthogonal polynomials in terms of canonical moments.

Theorem 2.2. Let µ denote a matrix measure on the interval [0, 1] with matrix canonical moments U_n, n ∈ ℕ.

1) The sequence of monic orthogonal polynomials $\{\underline{P}_k(x)\}_{k \ge 0}$ with respect to the matrix measure µ satisfies the recurrence formula $\underline{P}_0(x) = I_p$, $\underline{P}_{-1}(x) = 0$ and, for m ≥ 0,

$$x \underline{P}_m(x) = \underline{P}_{m+1}(x) + \underline{P}_m(x)(\zeta_{2m+1} + \zeta_{2m}) + \underline{P}_{m-1}(x)\, \zeta_{2m-1} \zeta_{2m}, \tag{2.13}$$

where the quantities ζ_j ∈ R^{p×p} are defined by ζ_0 = 0, ζ_1 = U_1, ζ_j = V_{j-1} U_j if j ≥ 2, and the sequences {U_j} and {V_j} are given in (2.7) and (2.9).

2) The sequence of monic orthogonal polynomials {Q_k(x)}_{k≥0} with respect to the matrix measure x(1 − x)dµ(x) satisfies the recurrence formula Q_0(x) = I_p, Q_{-1}(x) = 0 and, for m ≥ 0,

$$x Q_m(x) = Q_{m+1}(x) + Q_m(x)(\gamma_{2m+2} + \gamma_{2m+3}) + Q_{m-1}(x)\, \gamma_{2m+1} \gamma_{2m+2}, \tag{2.14}$$

where the quantities γ_j ∈ R^{p×p} are defined by γ_1 = V_1, γ_j = U_{j-1} V_j if j ≥ 2, and the sequences {U_j} and {V_j} are given in (2.7) and (2.9).
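For orientation, part 1) of Theorem 2.2 can be checked numerically in the scalar case p = 1. The sketch below takes the arcsine measure on [0, 1], whose canonical moments are all equal to 1/2 in the classical scalar theory (an assumption taken from that theory, not from this paper), so that ζ_1 = 1/2 and ζ_j = 1/4 for j ≥ 2; the recurrence (2.13) then generates monic polynomials that should be orthogonal with respect to the arcsine weight. The quadrature rule and tolerances are our own choices.

```python
import numpy as np
from numpy.polynomial import Polynomial as Poly

# Scalar sketch (p = 1): for the arcsine measure on [0, 1] the canonical
# moments are all 1/2 (classical scalar theory), so zeta_1 = 1/2 and
# zeta_j = (1 - p_{j-1}) p_j = 1/4 for j >= 2.
def zeta(j):
    if j <= 0:
        return 0.0
    return 0.5 if j == 1 else 0.25

x = Poly([0.0, 1.0])
p_prev, p_cur = Poly([0.0]), Poly([1.0])      # P_{-1} = 0, P_0 = 1
polys = [p_cur]
for m in range(5):
    # x P_m = P_{m+1} + P_m (zeta_{2m+1} + zeta_{2m}) + P_{m-1} zeta_{2m-1} zeta_{2m}
    p_next = (x * p_cur
              - (zeta(2 * m + 1) + zeta(2 * m)) * p_cur
              - zeta(2 * m - 1) * zeta(2 * m) * p_prev)
    p_prev, p_cur = p_cur, p_next
    polys.append(p_cur)

# Gauss-Chebyshev nodes mapped to [0, 1] integrate polynomials of this degree
# exactly against the arcsine density 1/(pi sqrt(x(1 - x))).
n = 200
nodes = (1.0 + np.cos((2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n))) / 2.0
gram = np.array([[np.mean(f(nodes) * g(nodes)) for g in polys] for f in polys])
max_offdiag = np.max(np.abs(gram - np.diag(np.diag(gram))))
```

The polynomials produced are the monic Chebyshev polynomials shifted to [0, 1] (for instance P_2(x) = x² − x + 1/8), and `max_offdiag` stays at rounding-error level, confirming orthogonality.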

In the following discussion moment points with S_n = S_n^+ for some n ∈ ℕ will be of importance. Note that these points satisfy U_n = I_p for some n ∈ ℕ.

Theorem 2.3. Assume that (S_0, ..., S_{2m−1}) ∈ Int(M_{2m}); then the probability measure corresponding to the point (S_0, ..., S_{2m−1}, S_{2m}^+) is uniquely determined. Its support points x_1, ..., x_k are the distinct zeros of the polynomial x(1 − x) det Q_{m−1}(x), where Q_{m−1}(x) is the (m − 1)th monic orthogonal polynomial with respect to the matrix measure x(1 − x)dµ(x). The weights are determined as the unique solution of the system

$$\begin{pmatrix} I_p & I_p & \cdots & I_p & I_p \\ 0 & x_2 I_p & \cdots & x_{k-1} I_p & I_p \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & x_2^m I_p & \cdots & x_{k-1}^m I_p & I_p \\ 0 & Q_{m-1}(x_2) & \cdots & 0 & 0 \\ \vdots & & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & Q_{m-1}(x_{k-1}) & 0 \end{pmatrix} \begin{pmatrix} \Lambda_1 \\ \Lambda_2 \\ \vdots \\ \Lambda_{k-1} \\ \Lambda_k \end{pmatrix} = \begin{pmatrix} S_0 \\ S_1 \\ \vdots \\ S_m \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \tag{2.15}$$

where x_1 = 0, x_k = 1, rank(Λ_i) = p if x_i ∈ {0, 1}, and rank(Λ_i) = ℓ_i if det Q_{m−1}(x_i) = 0, where ℓ_i is the multiplicity of x_i as a root of the polynomial det Q_{m−1}(x).

For a proof of this result see Dette and Studden (2002b). We now present some new properties of matrix valued canonical moments, which turn out to be useful for the maximization of the determinant of the Hankel matrix.

Lemma 2.4. The eigenvalues of the matrix valued canonical moments U_1, U_2, ... are all real and located in the interval [0, 1], whenever the canonical moments are defined.

Proof. Recall the definition of the canonical moments in (2.7) and consider the transformation

$$\tilde U_k = (S_k^+ - S_k^-)^{1/2}\, U_k\, (S_k^+ - S_k^-)^{-1/2} = (S_k^+ - S_k^-)^{-1/2} (S_k - S_k^-) (S_k^+ - S_k^-)^{-1/2}.$$

Obviously $\tilde U_k$ is nonnegative definite and symmetric. On the other hand,

$$I_p - \tilde U_k = (S_k^+ - S_k^-)^{-1/2} (S_k^+ - S_k) (S_k^+ - S_k^-)^{-1/2}$$

is also nonnegative definite and symmetric. Therefore $\tilde U_k$ has real eigenvalues located in the interval [0, 1], and the assertion follows because U_k and $\tilde U_k$ are similar. □


Lemma 2.5. Let µ denote a matrix measure on the interval [0, 1] with canonical moments U_1, U_2, ...; then the determinant of the Hankel matrix can be represented as

$$|H_{2m}(\mu)| = |S_0(\mu)|^{m+1} \cdot \prod_{j=1}^{m} \bigl\{ |V_{2j-2}| \cdot |U_{2j-1}| \cdot |V_{2j-1}| \cdot |U_{2j}| \bigr\}^{m-j+1}, \tag{2.16}$$

where |V_0| = 1.

Proof. From (2.7) and Theorem 2.1 we have $S_k - S_k^- = S_0 \prod_{j=1}^{k} V_{j-1} U_j$ with V_0 = I_p. Now a well-known result on partitioned matrices [see e.g. Muirhead (1982), pp. 581-582] and the definition of $S_{2m}^-$ in (2.4) show that

$$|H_{2m}(\mu)| = |S_{2m} - S_{2m}^-| \cdot |H_{2m-2}(\mu)| = \prod_{j=0}^{m} |S_{2j} - S_{2j}^-| = |S_0|^{m+1} \prod_{j=1}^{m} \bigl\{ |V_{2j-2}| \cdot |U_{2j-1}| \cdot |V_{2j-1}| \cdot |U_{2j}| \bigr\}^{m-j+1}. \qquad \Box$$

Theorem 2.6. The determinant of the Hankel matrix defined in (1.2) attains its maximum in the set N* defined in (1.3) if and only if the corresponding matrix measure µ has equal weight $\frac{1}{m+1} A^*$ at the roots of the polynomial

$$x(1 - x)P_m'(x), \tag{2.17}$$

where P_m(x) denotes the mth (univariate) Legendre polynomial on the interval [0, 1] and the matrix A* is defined in (1.5).

Proof. We determine the solution of the maximization problem in two steps. In a first step we use Lemma 2.5 to find the canonical moments of the maximizing measure and the corresponding moment S_0^*. In the second step we show that the corresponding moment point (S_0^*, S_1^*, ..., S_{2m}^*) of the maximizing measure is in fact an element of the set N* defined in (1.3).

By Lemma 2.5 the determinant can be maximized using the matrix A* in (1.5) for S_0 and the canonical moments

$$U_{2k-1}^* = \frac{1}{2} I_p, \qquad U_{2k}^* = \frac{m-k+1}{2(m-k)+1}\, I_p, \qquad k = 1, \dots, m. \tag{2.18}$$

The first assertion is obvious, while the second follows by maximizing the terms

$$|V_{2j-1}| \cdot |U_{2j-1}|, \qquad |V_{2j}|^{m-j} |U_{2j}|^{m-j+1}$$

in (2.16) separately, using Lemma 2.4. For example, the eigenvalues of U_{2j−1}, say λ_1, ..., λ_p, are real and located in the interval [0, 1], and this gives

$$|V_{2j-1}| \cdot |U_{2j-1}| = |(I_p - U_{2j-1}) U_{2j-1}| = \prod_{i=1}^{p} \lambda_i (1 - \lambda_i).$$

The product is maximal for λ_1 = ... = λ_p = 1/2, which yields $U_{2j-1}^* = \frac{1}{2} I_p$, and the other cases are treated similarly.

Now we obtain from (2.18) $U_{2m}^* = I_p$, which shows that the corresponding moment point (S_0^*, ..., S_{2m}^*) satisfies $S_{2m}^* = S_{2m}^{*+}$. By Theorem 2.3 it therefore follows that the maximizing measure µ* is supported at the roots of the polynomial x(1 − x) det Q_{m−1}(x), where Q_{m−1}(x) is the (m − 1)th monic orthogonal polynomial with respect to the matrix measure x(1 − x)µ(dx). By Theorem 2.2 this polynomial can be obtained by the recursion Q_{−1}(x) = 0 ∈ R^{p×p}, Q_0(x) = I_p,

$$Q_{j+1}(x) = \Bigl(x - \frac{1}{2}\Bigr) Q_j(x) - \frac{1}{4}\, \frac{(m-j-1)(m-j+1)}{(2(m-j)-1)(2(m-j)+1)}\, Q_{j-1}(x).$$

Consequently, all matrix polynomials are diagonal. In particular $Q_{m-1}(x) = I_p \cdot \tilde Q_{m-1}(x)$, where $x(1-x)\tilde Q_{m-1}(x)$ is the supporting polynomial of the (one-dimensional) sequence of canonical moments

$$\frac{1}{2},\ \frac{m}{2m-1},\ \frac{1}{2},\ \frac{m-1}{2m-3},\ \frac{1}{2},\ \dots,\ \frac{2}{3},\ \frac{1}{2},\ 1. \tag{2.19}$$

The results in Dette and Studden (1997), Chap. 4, show that $\tilde Q_{m-1}(x)$ is proportional to $P_m'(x)$. Similarly, observing the definition of canonical moments, it is easy to see that the corresponding moments satisfy $S_k^* = c_k A^*$, k = 0, 1, 2, ..., where c_0, c_1, c_2, ... are the one-dimensional moments corresponding to the sequence of one-dimensional canonical moments defined in (2.19). Therefore the system of equations in (2.15) reduces to

$$\left[\begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ 0 & x_1 & \cdots & x_{m-1} & 1 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & x_1^m & \cdots & x_{m-1}^m & 1 \end{pmatrix} \otimes I_p \right] \begin{pmatrix} \Lambda_0 \\ \vdots \\ \Lambda_m \end{pmatrix} = \begin{pmatrix} c_0 \\ \vdots \\ c_m \end{pmatrix} \otimes A^*, \tag{2.20}$$

where x_1, ..., x_{m−1} are the roots of the polynomial $P_m'(x)$ and throughout this paper ⊗ denotes the Kronecker product. Because the D-optimal design for the univariate polynomial regression model has equal weights at the roots of the polynomial $x(1-x)P_m'(x)$, it follows that

$$\begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ 0 & x_1 & \cdots & x_{m-1} & 1 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & x_1^m & \cdots & x_{m-1}^m & 1 \end{pmatrix}^{-1} \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_m \end{pmatrix} = \frac{1}{m+1} \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix},$$

which implies for the unique solution of (2.20)

$$\Lambda_j = \frac{1}{m+1} A^*, \qquad j = 0, \dots, m,$$

where the matrix A* is defined by (1.5). Because the optimal canonical moments in (2.18) satisfy $0 < U_k^* < I_p$ (k = 1, ..., 2m − 1) and $U_{2m}^* = I_p$, the corresponding moment point (S_0^*, ..., S_{2m}^*) is obviously an element of the set N* defined in (1.3), which completes the proof of the theorem. □
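In the scalar case p = 1 (with N = {1}, so A* = 1), Theorem 2.6 reduces to Schoenberg's classical result: equal weights 1/(m+1) at 0, 1 and the roots of $P_m'$ maximize |H_{2m}|. The following is our own quick numerical sanity check against randomly generated competing probability measures:

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(1)
m = 3

def hankel_det(points, weights):
    # |H_2m| for a scalar measure: H_2m = V^T diag(w) V, where V is the
    # Vandermonde matrix of powers 0..m at the support points
    V = np.vander(points, m + 1, increasing=True)
    return np.linalg.det(V.T @ (weights[:, None] * V))

# Support of the claimed maximizing measure: 0, 1 and the roots of P_m',
# with P_m the Legendre polynomial on [-1, 1] mapped to [0, 1]
c = np.zeros(m + 1)
c[m] = 1.0
interior = (legendre.legroots(legendre.legder(c)) + 1.0) / 2.0
opt_pts = np.concatenate(([0.0], interior, [1.0]))
opt_w = np.full(m + 1, 1.0 / (m + 1))
best = hankel_det(opt_pts, opt_w)

# Random competitors: probability measures on m + 1 random points in [0, 1]
worse = []
for _ in range(500):
    pts = rng.uniform(0.0, 1.0, size=m + 1)
    w = rng.dirichlet(np.ones(m + 1))
    worse.append(hankel_det(pts, w))
```

Every randomly drawn design yields a strictly smaller Hankel determinant than the Schoenberg design, as expected.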

The following result is proved in a similar manner, and its proof is therefore omitted.

Theorem 2.7. The quantity

$$\frac{|H_{2m}(\mu)|}{|H_{2m-2}(\mu)|}$$

attains its maximum in the set N* defined in (1.3) if and only if the corresponding matrix measure µ puts weights

$$\frac{1}{2m} A^*,\ \frac{1}{m} A^*,\ \dots,\ \frac{1}{m} A^*,\ \frac{1}{2m} A^*$$

at the roots of the polynomial $x(1-x)\tilde U_{m-1}(x)$, where $\tilde U_{m-1}(x)$ denotes the (m − 1)th (one-dimensional) Chebyshev polynomial of the second kind on the interval [0, 1] and the matrix A* is defined in (1.5).
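As a small worked example (ours, not from the paper), consider m = 1. Both theorems then prescribe the weight $\tfrac12 A^*$ at the two points 0 and 1, and the resulting Hankel determinant can be evaluated directly:

```latex
% Worked example for m = 1: weight A^*/2 at x = 0 and x = 1 gives
S_0 = A^*, \quad S_1 = S_2 = \tfrac{1}{2}A^*, \qquad
H_2 = \begin{pmatrix} A^* & \tfrac{1}{2}A^* \\[2pt] \tfrac{1}{2}A^* & \tfrac{1}{2}A^* \end{pmatrix},
% and, by the Schur complement determinant formula,
|H_2| = |A^*| \cdot \bigl|\tfrac{1}{2}A^* - \tfrac{1}{4}A^*\bigr|
      = |A^*|^2 \bigl(\tfrac{1}{4}\bigr)^p .
```

This agrees with (2.16): here $U_1 = \tfrac12 I_p$, $V_1 = \tfrac12 I_p$ and $U_2 = I_p$, so $|H_2| = |S_0|^2 (\tfrac12)^p (\tfrac12)^p \cdot 1$.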

3 A statistical application

Consider a linear regression model of the form

$$Y_{\ell k} = \sum_{i=1}^{p} z_{\ell k i} \Bigl( \sum_{j=0}^{m} \beta_{ji}\, x_\ell^j \Bigr) + \varepsilon_{\ell k}, \qquad k = 1, \dots, n_\ell;\ \ell = 1, \dots, s, \tag{3.1}$$

where $\varepsilon_{11}, \dots, \varepsilon_{s n_s}$ are i.i.d. random variables with zero mean and variance σ² > 0, $x_\ell$ varies in the interval [0, 1] (ℓ = 1, ..., s), and the $z_{\ell k i}$ (k = 1, ..., n_ℓ; i = 1, ..., p) are chosen from either the set {0, 1} or the set {−1, 0, 1}. The quantities $z_{\ell k i}$ and $x_\ell$ can be controlled by the experimenter, but the parameters β_{ji} are unknown and have to be estimated from the data. The model (3.1) is a generalization of the classical weighing design problem, which corresponds to the case m = 0 [see Banerjee (1975) or Shah and Sinha (1989)]. For m = 0 there are p objects with unknown weights β_{01}, ..., β_{0p} which are to be determined using either a spring balance or a chemical balance. A total of n weighings or observations are allowed. For the spring balance any number of the objects can be placed on the single pan, and the measurement represents the total weight of the objects. This corresponds to the case where the $z_{\ell k i}$ are chosen from {0, 1}. For m = 0 the ℓ subscript does not appear, and $z_{ki}$ is one or zero depending on whether the ith object is included in the kth weighing or not. If the $z_{\ell k i}$ are chosen from {−1, 0, 1}, then one has a chemical balance with two pans, and one can observe the difference in weight of any two subsets of the objects. The model (3.1) generalizes the classical weighing model to the case where the weight of each object depends on a variable x through a polynomial trend. Note that the degree of the polynomial is the same for each of the objects.

In general, the model (3.1) can be conveniently written as

$$Y = X\beta + \varepsilon, \tag{3.2}$$

where $Y = (Y_{11}, \dots, Y_{s n_s})^T$ denotes the vector of observations, ε is the vector of (unobservable) errors, and $\beta = (\beta_{01}, \dots, \beta_{0p}, \dots, \beta_{m1}, \dots, \beta_{mp})^T$ is the vector of unknown parameters. The matrix X is called the design matrix and is given by

$$X = \begin{pmatrix} (1, x_1, \dots, x_1^m) \otimes Z_1 \\ (1, x_2, \dots, x_2^m) \otimes Z_2 \\ \vdots \\ (1, x_s, \dots, x_s^m) \otimes Z_s \end{pmatrix}, \tag{3.3}$$

where for ℓ = 1, ..., s the matrix $Z_\ell = (z_{\ell k i})_{k=1,\dots,n_\ell;\ i=1,\dots,p}$ has entries 0 or 1 in the spring balance case. The least squares estimate of β is chosen to minimize the n-dimensional Euclidean norm |Y − Xβ|. This estimate is given by $\hat\beta = (X^T X)^{-1} X^T Y$, and its covariance matrix can be calculated as

$$\operatorname{Cov}(\hat\beta) = \sigma^2 (X^T X)^{-1} = \sigma^2 \begin{pmatrix} \sum_{\ell=1}^{s} Z_\ell^T Z_\ell & \cdots & \sum_{\ell=1}^{s} x_\ell^m Z_\ell^T Z_\ell \\ \vdots & & \vdots \\ \sum_{\ell=1}^{s} x_\ell^m Z_\ell^T Z_\ell & \cdots & \sum_{\ell=1}^{s} x_\ell^{2m} Z_\ell^T Z_\ell \end{pmatrix}^{-1} = \sigma^2 H_{2m}^{-1}(\xi),$$

where H_{2m}(ξ) denotes the "Hankel" matrix of order 2m corresponding to the matrix measure ξ with weights $\Lambda_\ell = Z_\ell^T Z_\ell$ at the points $x_\ell$ (ℓ = 1, ..., s). Note that the matrix $\Lambda_\ell$ is nonnegative definite (by its construction) and has nonnegative integer valued entries. Our purpose here is to maximize the determinant of a normalized version of the matrix H_{2m}(ξ) in an approximate sense, which will be explained in the next paragraph, since neither the weighing design problem (m = 0) nor the ordinary polynomial regression problem (p = 1) is completely solved in the situation above [see e.g. Banerjee (1975), Shah and Sinha (1989), Imhof, Krafft and Schaefer (2000)]. For m = 0, in the weighing design, a complete solution for the approximate case is given in Huda and Mukerjee (1988). The ordinary polynomial D-optimal design in the approximate sense is well known and given, for example, in Dette and Studden (1997). It will be shown that the two separate solutions can be combined in a simple manner. We give a rather detailed discussion of the spring balance design; the case of the chemical balance is very similar and left to the reader.

Let Ω ⊂ R^p denote the set of all p-dimensional vectors with components in {0, 1}, which has t = 2^p elements. A probability measure ξ with finite support on the set

$$\mathcal{X} := [0, 1] \times \Omega \tag{3.4}$$

is called a design on $\mathcal{X}$ [see Pukelsheim (1993)]. If n is the total number of observations and ξ puts mass $\xi_{\ell k}$ at the point $(x_\ell, \omega_k) \in \mathcal{X}$, then the experimenter takes approximately $n\xi_{\ell k}$ independent observations under experimental condition $x_\ell$, using the polynomial regression models corresponding to the nonvanishing components of the vector $\omega_k$. For $(x, \omega) \in \mathcal{X}$ let $f^T(x, \omega) = (1, x, \dots, x^m) \otimes \omega^T$ denote the vector of regression functions of length p(m + 1); then the information matrix is given by

by   M(ξ) =

X

T

f (x, ω)f (x, ω)dξ(x, w) =

  ξk    k=0

s t =0

1 x .. . xm 

    (1, x , . . . , xm ) ⊗ ωk ω T .  k  

Without loss of generality we assume that x0 , . . . xr are all distinct and define nonnegative matrix weights t Λ = (3.5) ξk ωk ωkT  = 0, . . . , s. k=0

If µ is the matrix measure with mass Λj at xj (j = 0, . . . , s), then it follows that M(ξ) = H 2m (µ), and a D-optimal (approximate) design can be obtained by maximizing H 2m (µ) over the set of matrix measures s µ= (3.6) Λj δxj j=1

with weights of the form (3.5). With reference to the previous paragraph we have that $\Lambda_\ell$ is approximately equal to $Z_\ell^T Z_\ell / n$, where n is the total number of observations. Throughout this section we let

$$\Xi^{**} = \{\mu \mid \mu \in \Xi \text{ is of the form (3.5) and (3.6)}\}, \tag{3.7}$$

where Ξ is the set of all matrix measures on the interval [0, 1]. Obviously we have $\Xi^{**} \subset \Xi^*$, where

$$\Xi^* = \{\mu \mid \mu \in \Xi \text{ with } S_0(\mu) = S_0(\nu) \text{ for some measure } \nu \in \Xi^{**}\}. \tag{3.8}$$
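The reduction from designs ξ on [0, 1] × Ω to matrix measures via (3.5) is easy to verify numerically. The following sketch checks, for an arbitrary random design (p, m and the support points are our own choices), that the information matrix M(ξ) coincides with the block Hankel matrix H_{2m}(µ) built from the weights Λ_ℓ:

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
p, m = 2, 1
xs = [0.0, 0.3, 0.8, 1.0]   # support points on [0, 1] (test data)
omegas = [np.array(v, dtype=float)
          for v in itertools.product([0, 1], repeat=p)]

# A random design xi on [0, 1] x Omega
xi = rng.dirichlet(np.ones(len(xs) * len(omegas))).reshape(len(xs), len(omegas))

# Information matrix M(xi) built from f(x, w) = (1, x, ..., x^m) (x) w
M = np.zeros((p * (m + 1), p * (m + 1)))
for l, x in enumerate(xs):
    for k, w in enumerate(omegas):
        fvec = np.kron(np.array([x ** j for j in range(m + 1)]), w)
        M += xi[l, k] * np.outer(fvec, fvec)

# Matrix-measure weights (3.5) and the block Hankel matrix H_2m(mu)
Lam = [sum(xi[l, k] * np.outer(w, w) for k, w in enumerate(omegas))
       for l in range(len(xs))]
S = [sum(L * x ** j for x, L in zip(xs, Lam)) for j in range(2 * m + 1)]
H = np.block([[S[i + j] for j in range(m + 1)] for i in range(m + 1)])
assert np.allclose(M, H)
```

The assertion holds for any design ξ, since the (i, j) block of M(ξ) is exactly $\sum_\ell x_\ell^{i+j} \Lambda_\ell = S_{i+j}$.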

Note that the moments (S_0, ..., S_{2m}) of the matrix measures in Ξ* define a set of the form (1.3).

Theorem 3.1. Let S_j ⊂ {0, 1}^p denote the set of all vectors with exactly j components equal to one (j = 0, ..., p). A D-optimal (approximate) spring balance design for the regression model (3.1) is the uniform distribution on the set

$$\{(x, \omega) \mid x(1-x)P_m'(x) = 0,\ \omega \in S_{\lceil p/2 \rceil} \cup S_{\lceil (p+1)/2 \rceil}\},$$

where P_m denotes the mth Legendre polynomial on the interval [0, 1].

Proof. Consider first the maximization of the Hankel determinant |H_{2m}(µ)| over the set Ξ* defined in (3.8). According to Theorem 2.6 the solution of this problem is given by the matrix measure

$$\mu^* = \frac{1}{m+1} A^* \sum_{j=0}^{m} \delta_{x_j}, \tag{3.9}$$

where x_0, ..., x_m are the roots of the polynomial $x(1-x)P_m'(x)$ and

$$A^* = \arg\max\Bigl\{ |S_0| \;\Big|\; S_0 = \int_0^1 \mu(dx);\ \mu \in \Xi^{**} \Bigr\}. \tag{3.10}$$

Note that for the case m = 0 we do not have to distinguish different explanatory variables, and consequently the representation (3.5) does not depend on ℓ in this case. Therefore the optimization in (3.10) simplifies to

$$A^* = \arg\max\Bigl\{ |S_0| \;\Big|\; S_0 = \sum_{k=0}^{t} \xi_k\, \omega_k \omega_k^T;\ \sum_{k=0}^{t} \xi_k = 1,\ \xi_k \ge 0,\ \omega_k \in \Omega \Bigr\}. \tag{3.11}$$

This is the problem of finding the D-optimal information matrix for the classical spring balance weighing design (in the approximate case), which has been solved by Huda and Mukerjee (1988) using the classical equivalence theory for approximate designs. The D-optimal approximate spring balance weighing design $\xi_0^*$ puts equal weight at the elements of the set $S_{\lceil p/2 \rceil} \cup S_{\lceil (p+1)/2 \rceil}$, and the maximizing information matrix is given by

$$A^* = \frac{\lfloor p/2 \rfloor + 1}{2(2\lfloor p/2 \rfloor + 1)} (I_p + J_p) = \binom{2\lfloor p/2 \rfloor + 1}{\lfloor p/2 \rfloor}^{-1} \sum_{\omega \in S_{\lceil p/2 \rceil} \cup S_{\lceil (p+1)/2 \rceil}} \omega \omega^T, \tag{3.12}$$

where $J_p \in \mathbb{R}^{p \times p}$ denotes the matrix with all entries equal to one. Therefore the measure maximizing $|H_{2m}(\mu)|$ in the class Ξ* is of the form (3.9) with A* given by (3.12). From the second equality in (3.12) it follows that the matrix measure µ* is also in the class Ξ** defined in (3.7) and therefore corresponds to an approximate design ξ. It is now easy to see that the design $\xi_m^*$ specified in Theorem 3.1 corresponds to µ* by the relation (3.5), that is,

$$|M(\xi_m^*)| = |H_{2m}(\mu^*)| = \max\{|H_{2m}(\mu)| \mid \mu \in \Xi^*\} = \max\{|H_{2m}(\mu)| \mid \mu \in \Xi^{**}\} = \max\{|M(\xi)| \mid \xi \text{ design on } \mathcal{X}\}.$$

This proves the D-optimality of the design $\xi_m^*$ specified in Theorem 3.1. □
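The two representations of A* in (3.12), and its D-optimality within (3.11), can be verified numerically for small p. The sketch below (our own check) enumerates the sets S_j, compares the closed form with the normalized sum, and then tests A* against random competing information matrices:

```python
import itertools
from math import ceil, comb, floor

import numpy as np

rng = np.random.default_rng(3)

def S_set(p, j):
    # all 0/1 vectors of length p with exactly j ones
    vecs = []
    for idx in itertools.combinations(range(p), j):
        w = np.zeros(p)
        w[list(idx)] = 1.0
        vecs.append(w)
    return vecs

for p in (2, 3, 4, 5):
    f = floor(p / 2)
    # closed form in (3.12)
    closed = (f + 1) / (2 * (2 * f + 1)) * (np.eye(p) + np.ones((p, p)))
    # normalized sum over S_{ceil(p/2)} and S_{ceil((p+1)/2)}
    js = sorted({ceil(p / 2), ceil((p + 1) / 2)})
    omegas = [w for j in js for w in S_set(p, j)]
    summed = sum(np.outer(w, w) for w in omegas) / comb(2 * f + 1, f)
    assert len(omegas) == comb(2 * f + 1, f)
    assert np.allclose(closed, summed)

    # D-optimality of A*: no random information matrix (3.11) beats it
    all_w = [np.array(v, dtype=float)
             for v in itertools.product([0, 1], repeat=p)][1:]
    best = np.linalg.det(closed)
    for _ in range(200):
        xi = rng.dirichlet(np.ones(len(all_w)))
        S0 = sum(a * np.outer(w, w) for a, w in zip(xi, all_w))
        assert np.linalg.det(S0) <= best + 1e-9
```

For instance, p = 2 gives A* = (I₂ + J₂)/3, the uniform mixture over the three weighings e₁, e₂ and (1, 1).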

Remark 3.2. It is worthwhile to mention that, once the design in Theorem 3.1 has been identified, its D-optimality can also be established by an application of the classical equivalence theorem for the D-optimality criterion [see Pukelsheim (1993), p. 180].

One can use the results of Section 2 to solve other optimization problems in statistics. As an example we consider the case where the main interest of the experimenter is to discriminate between a regression of degree m and one of degree m − 1 in the model (3.1); here a design which maximizes

$$\frac{|H_{2m}(\xi)|}{|H_{2m-2}(\xi)|} \tag{3.13}$$

might be appropriate [see e.g. Studden (1980) or Pukelsheim (1993)]. These designs are called D_1-optimal and can be obtained by arguments similar to those given in the proof of Theorem 3.1, using Theorem 2.7.

Theorem 3.3. A D_1-optimal design $\xi_1^*$ in the regression model (3.1) maximizing the ratio in (3.13) is supported at the set

$$T_1 = \{(x, \omega) \mid x(1-x)\tilde U_{m-1}(x) = 0,\ \omega \in S_{\lceil p/2 \rceil} \cup S_{\lceil (p+1)/2 \rceil}\},$$

where $\tilde U_{m-1}(x)$ denotes the (m − 1)th (one-dimensional) Chebyshev polynomial of the second kind on the interval [0, 1], orthogonal with respect to the measure $\sqrt{x(1-x)}\,dx$. This design has equal masses

$$\frac{1}{m} \binom{2\lfloor p/2 \rfloor + 1}{\lfloor p/2 \rfloor}^{-1}$$

at the points $\{(x, \omega) \in T_1 \mid x \in (0, 1)\}$, while the masses at the points $\{(x, \omega) \in T_1 \mid x \in \{0, 1\}\}$ are given by

$$\frac{1}{2m} \binom{2\lfloor p/2 \rfloor + 1}{\lfloor p/2 \rfloor}^{-1}.$$
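As a quick consistency check (ours, not from the paper), the masses in Theorem 3.3 sum to one: writing $B = \binom{2\lfloor p/2\rfloor+1}{\lfloor p/2\rfloor}$ for the number of admissible vectors ω, the polynomial $\tilde U_{m-1}$ has m − 1 roots, so T₁ contains (m − 1)B interior points and 2B boundary points, and the total mass is

```latex
% total mass of the D_1-optimal design of Theorem 3.3
(m-1)B \cdot \frac{1}{mB} \;+\; 2B \cdot \frac{1}{2mB}
  \;=\; \frac{m-1}{m} + \frac{1}{m} \;=\; 1 .
```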

Acknowledgements. This work was done while H. Dette was visiting the Department of Statistics, Purdue University, and this author would like to thank the department for its hospitality. The work of H. Dette was supported by the Deutsche Forschungsgemeinschaft (De 502/14-1, SFB 475, Komplexitätsreduktion in multivariaten Datenstrukturen). The authors would like to thank Isolde Gottschlich, who typed this paper with considerable technical expertise.

References

K.S. Banerjee (1975). Weighing Designs. Marcel Dekker, N.Y.

H. Dette, W.J. Studden (1997). The Theory of Canonical Moments with Applications in Statistics, Probability, and Analysis. Wiley, N.Y.

H. Dette, W.J. Studden (2002a). Matrix measures, moment spaces and Favard's theorem on the interval [0, 1] and [0, ∞). Linear Algebra Appl. 345, 163-193.

H. Dette, W.J. Studden (2002b). Quadrature formulas for matrix measures - a geometric approach. Linear Algebra Appl. 364, 34-65.

A.J. Duran (1995). On orthogonal polynomials with respect to a positive definite matrix of measures. Can. J. Math. 47, 88-112.

A.J. Duran (1996). Markov's theorem for orthogonal matrix polynomials. Can. J. Math. 48, 1180-1195.

A.J. Duran (1999). Ratio asymptotics for orthogonal matrix polynomials. Journal of Approximation Theory 100, 304-344.

A.J. Duran, P. Lopez-Rodriguez (1996). Orthogonal matrix polynomials: zeros and Blumenthal's theorem. Journal of Approximation Theory 84, 96-118.

A.J. Duran, P. Lopez-Rodriguez (1997). Density questions for the truncated matrix moment problem. Can. J. Math. 49, 708-721.

A.J. Duran, W. Van Assche (1995). Orthogonal matrix polynomials and higher-order recurrence relations. Linear Algebra Appl. 219, 261-280.

S. Huda, R. Mukerjee (1988). Optimal weighing designs: approximate theory. Statistics 19, 513-517.

L. Imhof, O. Krafft, M. Schaefer (2000). D-optimal exact designs for parameter estimation in a quadratic model. Sankhya, Ser. B 62, 266-275.

S. Karlin, W.J. Studden (1966). Tchebycheff Systems: with Applications in Analysis and Statistics. Wiley, New York.

R.J. Muirhead (1982). Aspects of Multivariate Statistical Theory. Wiley, N.Y.

F. Pukelsheim (1993). Optimal Design of Experiments. Wiley, N.Y.

L. Rodman (1990). Orthogonal matrix polynomials. In: P. Nevai (ed.), Orthogonal Polynomials: Theory and Practice. NATO ASI Series C, Vol. 295; Kluwer, Dordrecht.

I.J. Schoenberg (1959). On the maxima of certain Hankel determinants and the zeros of classical orthogonal polynomials. Indag. Math. 62, 282-290.

K.R. Shah, B.K. Sinha (1989). Theory of Optimal Designs. Lecture Notes in Statistics, Springer, N.Y.

A. Sinap, W. Van Assche (1994). Polynomial interpolation and Gaussian quadrature for matrix-valued functions. Linear Algebra Appl. 207, 71-114.

A. Sinap, W. Van Assche (1996). Orthogonal matrix polynomials and applications. Jour. Computational and Applied Math. 66, 27-52.

W.J. Studden (1980). D_s-optimal designs for polynomial regression using continued fractions. Ann. Stat. 8, 1132-1141.

M.J. Zygmunt (2001). Generalized Chebyshev polynomials and discrete Schrödinger operators. J. Phys. A: Math. Gen. 34, 10613-10618.

M.J. Zygmunt (2002). Matrix Chebyshev polynomials and continued fractions. Linear Algebra Appl. 227, 155-188.
