Communications in Statistics - Theory and Methods
ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20
The Generalized Pascal Triangle and the Matrix Variate Jensen-Logistic Distribution Francisco J. Caro-Lopera, Graciela González-Farías & N. Balakrishnan To cite this article: Francisco J. Caro-Lopera, Graciela González-Farías & N. Balakrishnan (2015) The Generalized Pascal Triangle and the Matrix Variate Jensen-Logistic Distribution, Communications in Statistics - Theory and Methods, 44:13, 2738-2752, DOI: 10.1080/03610926.2013.791374 To link to this article: http://dx.doi.org/10.1080/03610926.2013.791374
Published online: 24 Jul 2015.
Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=lsta20 Download by: [Centro de Investigaciones en Matemáticas, A.C,]
Date: 07 March 2016, At: 08:27
Downloaded by [Centro de Investigaciones en Matemáticas, A.C,] at 08:27 07 March 2016
Communications in Statistics—Theory and Methods, 44: 2738–2752, 2015 Copyright © Taylor & Francis Group, LLC ISSN: 0361-0926 print / 1532-415X online DOI: 10.1080/03610926.2013.791374
The Generalized Pascal Triangle and the Matrix Variate Jensen-Logistic Distribution

FRANCISCO J. CARO-LOPERA,1 GRACIELA GONZÁLEZ-FARÍAS,2 AND N. BALAKRISHNAN3

1 Departamento de Ciencias Básicas, Universidad de Medellín, Medellín, Colombia
2 Centro de Investigación en Matemáticas, Monterrey, México
3 Department of Mathematics and Statistics, McMaster University, Hamilton, Canada

This article defines the so-called Generalized Matrix Variate Jensen-Logistic distribution. The relevant applications of this class of distributions in Configuration Shape Theory consist of a more efficient computation, supported by the corresponding inference. This demands the solution of two important problems: (1) the development of analytical and efficient formulae for their k-th derivatives and (2) the use of the derivatives to transform the configuration density into a polynomial density under some special matrix Kummer relation, indexed in this case by the Jensen-Logistic kernel. In this article, we solve these problems by deriving a simple formula for the k-th derivative of the density function, avoiding the usual partition theory framework and using a generalization of Pascal triangles. Then we apply the results by obtaining the associated Jensen-Logistic Kummer relations and the configuration polynomial density in the setting of Statistical Shape Theory.

Keywords Jensen-Logistic distribution; Pascal triangle; Statistical shape theory; Zonal polynomials; Generalized Kummer relations.

Mathematics Subject Classification 05A99; 33E99; 02E15.
Received October 22, 2012; Accepted March 25, 2013. Address correspondence to Graciela González-Farías, Centro de Investigación en Matemáticas, Monterrey, México; E-mail: [email protected]

1. Introduction

Matrix-variate distribution theory has played a key role in the development of Statistics for the last sixty years. The main contributions may be grouped into two types: (a) improvements in the computation of Jacobians and in the integration theory through zonal polynomials and invariant polynomials of matrix arguments (see Muirhead, 1982; Davis, 1981; and the references therein), which are necessary to establish densities and are evident in numerous theoretical papers describing families of distributions in those terms; and (b) efficient computation of
zonal polynomials, recently implemented by Koev and Edelman (2006) as an initial step for building inference associated with a matrix-variate distribution. These advances have left the field with the following open problems.

(1) Although the computation of zonal polynomials is feasible, the matrix-variate densities of interest in applications are represented by infinite series of polynomials, and as Koev and Edelman (2006) observe, there is as yet no solution to the problem of convergence and truncation of the series of zonal polynomials, and consequently a robust and trustworthy inference cannot be constructed. Many theoretical papers written using zonal polynomials are awaiting application. The few articles that have documented their inferential applications have done so through approximations and asymptotic distributions, requiring at their very foundations idealized assumptions, often hard to sustain when modeling reality (see Muirhead, 1982).

(2) The problem becomes harder when more complex distributions are expanded in terms of invariant polynomials (a generalization of the zonal polynomials; see Davis, 1981), since there is to date no efficient computational method for them beyond the fifth order, and so densities represented by products of infinite series of these polynomials cannot be computed (see examples of such series in Davis, 1979; Chikuse and Davis, 1986, and the references therein).

(3) Finally, a somewhat specific but still non-trivial problem is that of obtaining explicit formulae for the k-th derivative of the generating function of a distributional model under consideration, a necessity for finding its exact distribution. Although a general expression for the k-th derivative of the generator exists and can be written in terms of partition theory (Caro-Lopera et al., 2009a), its computational implementation is costly, and so in the case of many densities a simplification has been attempted.
Among the few cases that allow for such a simplification are the Kotz type, Pearson, and Jensen-Logistic distributions; the first two have been studied and applied (see Caro-Lopera et al., 2009a; Díaz-García and Caro-Lopera, 2010, 2012, 2013, 2014), but the last one is still awaiting its study with a non-partitional approach.

It seems that the second problem is still an impossible one, and more than 30 years after Davis proposed the invariant polynomials, an efficient computational construction of them is unavailable. Solutions to the first problem may be attempted through numerical and analytical approaches. A numerical solution, like that proposed by Koev and Edelman (2006), is computationally quite involved and faces the typical trade-off between the convergence and truncation of the series. On the other hand, the analytical route has successfully solved the problem in specific contexts, using polynomial densities via generalized Kummer matrix relations. Recent studies on Shape Theory, such as the one in Caro-Lopera and Díaz-García (2012), have shown that the densities in infinite series are highly unstable, and do not reach the required global maximum as they exceed the capacity of the algorithms, which cannot handle the required number of terms in the truncated series (over 120), while the use of polynomial densities allows the exact estimation of these parameters. Other treatments of these ideas have been discussed in Caro-Lopera et al. (2008) and Díaz-García and Caro-Lopera (2008). Finally, the third problem depends exclusively on the function to be differentiated, and cannot be generalized to any function by a theory other than that of partitions, already implemented.
The introduction of a new matrix distribution should generally address three issues: (1) to solve the problem of integration in the domain of the matrix variate; (2) to find a polynomial density under some transformational approach and conditions for the parameters, to ensure the feasibility of inference over exact distributions; and (3) to find an easily computable formula for the k−th derivative. Accordingly, the rest of this article is organized as follows. In Sec. 2, we derive a new family of elliptical distributions that generalizes the Jensen-Logistic density, and
obtain its general k−th derivative in the simplest possible way using a mathematical technique based on the Pascal triangle. Then, in Sec. 3, the corresponding Kummer relation is obtained through the use of the general derivative, setting the distribution in the domain of polynomial densities, and avoiding the open problem regarding their computation via a series of zonal polynomials. Finally, in Sec. 4, the above-mentioned Kummer relation is applied in the context of Shape Theory by deriving the configuration density under a Generalized Jensen-Logistic sample. It turns out to be a simple polynomial expression, a very uncommon situation in the complicated set of distributions expressed in terms of zonal polynomials that facilitates the handling of inferential problems in the future.
2. The Generalized Matrix Variate Jensen-Logistic distribution and its derivatives

In the last two decades, the matrix variate distribution theory based on non-Gaussian distributions has been studied by a number of authors, in an attempt to extend the well-known Gaussian procedures to the classes of elliptical distributions; see, for example, Fang and Anderson (1990), Fang and Zhang (1990), and Gupta and Varga (1993). Constantine (1963), five decades ago, described the matrix variate distribution based on the normal in the setting of zonal polynomials of A.T. James. The extensions for elliptical models were not immediate, because the generalization requires the integration of nonexponential kernels and such procedures usually cannot be inherited from the Gaussian case. The best compilation of the matrix variate distribution theory based on normality is in Muirhead (1982). It is easy to see that most of the results depend critically on the integral

$$\int_{\mathbf{X}>0} \operatorname{etr}(-\mathbf{X}\mathbf{Z})\,|\mathbf{X}|^{a-(m+1)/2}\, C_\kappa(\mathbf{X}\mathbf{Y})\,(d\mathbf{X}) = (a)_\kappa\, \Gamma_m(a)\, |\mathbf{Z}|^{-a}\, C_\kappa(\mathbf{Y}\mathbf{Z}^{-1}), \tag{1}$$
where $\mathbf{Z}$ is a complex symmetric $m \times m$ matrix with $\operatorname{Re}(\mathbf{Z}) > 0$ and $\mathbf{Y}$ is an $m \times m$ symmetric matrix. If we want to translate the procedures of Muirhead (1982), written in terms of zonal polynomials, to distributions of the elliptical class, we see that the classical literature on elliptical densities is not expressed in terms of such polynomials. The first issue we have to consider is to solve the problems relating to the integral in (1). As in the Gaussian case, the generalizations that we pursue can be found in the general integral derived by Caro-Lopera et al. (2009a):

$$\int_{\mathbf{X}>0} h(\operatorname{tr} \mathbf{X}\mathbf{Z})\,|\mathbf{X}|^{a-(m+1)/2}\, C_\kappa(\mathbf{X}\mathbf{Y})\,(d\mathbf{X}) = \frac{\Gamma_m(a)\,|\mathbf{Z}|^{-a}\,(a)_\kappa\, C_\kappa(\mathbf{Y}\mathbf{Z}^{-1})}{\Gamma(ma+k)} \int_0^\infty h(w)\, w^{ma+k-1}\, dw. \tag{2}$$
Note that the result in (1) is trivially obtained from (2) by taking $h(y) = e^{-y}$. This integral can now be used in different and very general contexts such as the Statistical Shape Theory under elliptical models (see, for example, Caro-Lopera et al., 2009a, 2009b; Díaz-García and Caro-Lopera, 2010, 2013, 2014; and Caro-Lopera and Díaz-García, 2012), and new generalizations of relations based on elliptical kernels, such as the Kummer relations (see Díaz-García and Caro-Lopera, 2008).
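The reduction of (2) to (1) mentioned above can be written out in one line. The following is only a worked check of that remark, using the definition of the gamma function; no notation beyond that of (1) and (2) is assumed.

```latex
% With h(w) = e^{-w}, the scalar integral in (2) is a gamma function:
\int_0^\infty e^{-w}\, w^{ma+k-1}\, dw = \Gamma(ma+k),
% so the factor 1/\Gamma(ma+k) in (2) cancels and, since h(\operatorname{tr}\mathbf{XZ}) = \operatorname{etr}(-\mathbf{XZ}),
\int_{\mathbf{X}>0} \operatorname{etr}(-\mathbf{XZ})\,|\mathbf{X}|^{a-(m+1)/2}\,C_\kappa(\mathbf{XY})\,(d\mathbf{X})
  = \Gamma_m(a)\,|\mathbf{Z}|^{-a}\,(a)_\kappa\,C_\kappa(\mathbf{YZ}^{-1}),
% which is exactly (1).
```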
From the structure of (2), new approaches can be developed based on Taylor expansions of the generator functions of the elliptical models. Therefore, it is important to provide closed expressions for the k-th derivatives of the generator function. For the main elliptical distributions, such as Kotz, Pearson, Bessel, and Jensen-Logistic, this is not a trivial problem. In fact, the general derivative of the generator function stays in the context of partition functions. Explicit formulae for the above-mentioned elliptical families of distributions can be seen in Caro-Lopera et al. (2009a). However, only a few families, such as some classes of Kotz distributions, have a simple form for the general derivative, thus facilitating inferential procedures.

First, we propose the class of distributions that will be the baseline for the paper. The $p \times n$ random matrix $\mathbf{X}$ is said to have a Generalized Matrix Variate Symmetric Jensen-Logistic distribution with parameters $\mathbf{M} : p \times n$, $\boldsymbol{\Sigma} > 0 : p \times p$, $\boldsymbol{\Theta} > 0 : n \times n$, $m \in \mathbb{N}$, if its pdf is

$$f(\mathbf{X}) = \frac{c}{|\boldsymbol{\Sigma}|^{n/2}\,|\boldsymbol{\Theta}|^{p/2}} \cdot \frac{\operatorname{etr}\left[-(\mathbf{X}-\mathbf{M})'\boldsymbol{\Sigma}^{-1}(\mathbf{X}-\mathbf{M})\boldsymbol{\Theta}^{-1}\right]}{\left\{1 + \operatorname{etr}\left[-(\mathbf{X}-\mathbf{M})'\boldsymbol{\Sigma}^{-1}(\mathbf{X}-\mathbf{M})\boldsymbol{\Theta}^{-1}\right]\right\}^{m}}, \tag{3}$$

where

$$c^{-1} = \frac{\pi^{pn/2}}{\Gamma(pn/2)} \int_0^\infty z^{\frac{pn}{2}-1}\, \frac{e^{-z}}{(1+e^{-z})^{m}}\, dz < \infty.$$

For $m = 2$, we have the matrix variate symmetric Jensen-Logistic distribution. This family belongs to the class of elliptically contoured distributions, with generator function $h(y)$, which has recently been studied under a general approach involving a series of zonal polynomials (see Muirhead, 1982) and the derivatives $h^{(k)}(y)$; see, for example, Caro-Lopera et al. (2009a). Thus, in this context and in order to facilitate some applications, we will propose a new distribution and study its derivatives. In general, this is not a trivial problem, since the derivations are set in terms of partition theory, a rather cumbersome approach. The required derivatives have a closed-form expression, given in the following result due to Caro-Lopera et al. (2009a).
Lemma 2.1 Let $f(t) = s(t)\, r(g(t))$, where $s(\cdot)$, $r(\cdot)$ and $g(\cdot)$ have derivatives of all orders. If $w^{(k)}$ denotes $\frac{d^k w}{dt^k}$, then

$$f^{(k)} = \sum_{m=0}^{k} \binom{k}{m}\, s^{(m)}\, [r(g(t))]^{(k-m)}, \tag{4}$$

where

$$[r(g(t))]^{(k)} = \sum_{\kappa = (k^{\nu_k}, (k-1)^{\nu_{k-1}}, \cdots, 3^{\nu_3}, 2^{\nu_2}, 1^{\nu_1})} \frac{k!}{\prod_{i=1}^{k} \nu_i!\,(i!)^{\nu_i}}\; r^{\left(\sum_{i=1}^{k} \nu_i\right)} \prod_{i=1}^{k} \left(g^{(i)}\right)^{\nu_i}, \tag{5}$$

where the summation runs over all the partitions $\kappa = (k^{\nu_k}, (k-1)^{\nu_{k-1}}, \cdots, 3^{\nu_3}, 2^{\nu_2}, 1^{\nu_1})$ of $k$.

Note that the function $f$ admits a Taylor expansion. Therefore, the above expressions always exist for all $k$.
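To give a feeling for the cost of the partitional formula (5), the following minimal Python sketch enumerates the partitions of $k$ in the multiplicity form $(\nu_1, \dots, \nu_k)$ and evaluates the combinatorial factor $k!/\prod_i \nu_i!\,(i!)^{\nu_i}$ of each term. The function names `partitions` and `faa_di_bruno_coefficient` are illustrative choices, not part of the article; the sketch only counts and weights the terms, it does not evaluate the derivatives themselves.

```python
from math import factorial


def partitions(k, largest=None):
    """Yield each partition of k once, as a dict {part i: multiplicity nu_i}."""
    if largest is None:
        largest = k
    if k == 0:
        yield {}
        return
    # Choose the largest part first, then partition the remainder with parts <= it.
    for part in range(min(k, largest), 0, -1):
        for rest in partitions(k - part, part):
            nu = dict(rest)
            nu[part] = nu.get(part, 0) + 1
            yield nu


def faa_di_bruno_coefficient(k, nu):
    """Combinatorial factor k! / prod_i (nu_i! * (i!)**nu_i) appearing in (5)."""
    denom = 1
    for i, nu_i in nu.items():
        denom *= factorial(nu_i) * factorial(i) ** nu_i
    return factorial(k) // denom


if __name__ == "__main__":
    for k in (4, 10, 20):
        terms = list(partitions(k))
        print(k, len(terms))  # number of terms in the sum (5)
    for nu in partitions(4):
        print(nu, faa_di_bruno_coefficient(4, nu))
```

The number of terms grows with the partition function of $k$, and in (4) this sum is nested inside a further sum over $m$, which is the computational burden the Pascal-triangle approach below is meant to avoid.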
In this section, we study the derivatives of the generator function of a Generalized Jensen-Logistic distribution avoiding the above mentioned partitional approach, and setting the result in a simple way by using a generalization of the Pascal triangles. We start with the classical matrix variate symmetric Jensen-Logistic distribution known in the literature, the case of m = 2, and show that the general results can be obtained by extending it.
2.1 The classical case, m = 2

i. The solution via partition theory. Consider the generator function of (3) when $m = 2$. The required derivative was obtained by Caro-Lopera et al. (2009a), taking

$$h(y) = \exp(-y)\,(1 + \exp(-y))^{-2}, \tag{6}$$

and computing the k-th derivative by using Lemma 2.1, as

$$h^{(k)} = \sum_{m=0}^{k} \binom{k}{m} \sum_{\kappa \in P_{k-m}} \frac{(k-m)!\,(-1)^{m + \sum_{i=1}^{k-m}(1+i)\nu_i} \left(\sum_{i=1}^{k-m} \nu_i + 1\right)!\; e^{-\left(1 + \sum_{i=1}^{k-m} \nu_i\right) y}}{\prod_{i=1}^{k-m} \nu_i!\,(i!)^{\nu_i}\; (1 + e^{-y})^{2 + \sum_{i=1}^{k-m} \nu_i}}. \tag{7}$$
Here, the summation is over all the ordered partitions $\kappa$ of $k - m$. To simplify the notation we use $h^{(k)}$ instead of $h^{(k)}(y)$.

ii. The new approach. A natural way to avoid the partitional formula in (7) is as follows. Write the generator function in (6) as

$$h(y) = H(y)\,(1 - H(y)), \tag{8}$$

where

$$H(y) = \frac{1}{1 + e^{-y}}. \tag{9}$$

Then, the k-th derivative of $h(y)$ can be written as

$$h^{(k)} = h \cdot \left[1 + \sum_{i=1}^{k} b_i H^i\right], \tag{10}$$

for certain coefficients $b_i$. The formula for the coefficients $b_i$ needs to be established. For example, the first five cases give

$$\begin{aligned}
h^{(1)} &= h \cdot (1 - 2H),\\
h^{(2)} &= h \cdot \left(1 - 6H + 6H^2\right),\\
h^{(3)} &= h \cdot \left(1 - 14H + 36H^2 - 24H^3\right),\\
h^{(4)} &= h \cdot \left(1 - 30H + 150H^2 - 240H^3 + 120H^4\right),\\
h^{(5)} &= h \cdot \left(1 - 62H + 540H^2 - 1560H^3 + 1800H^4 - 720H^5\right).
\end{aligned}$$
The coefficient of $H^k$ is $(-1)^k (k+1)!$, but the remaining ones do not seem to have a simple recursion formula.

iii. An alternate solution. Based on an idea similar to the one above, and recalling that the objective of the above-mentioned procedure is to avoid the partitional formula given by (7) and simplify the computation, we propose the following procedure. First, write (6) as

$$h(y) = e^{-y}\left(1 + e^{-y}\right)^{-2} = e^{y}\left(1 + e^{y}\right)^{-2}.$$

Instead of (9), consider

$$H(y) = e^{y}, \tag{11}$$

then

$$h(y) = H(y)\,(1 + H(y))^{-2}. \tag{12}$$

Consider the $c_i$'s such that

$$h^{(k)} = (1 + H)^{-k}\, h \left[1 + \sum_{i=1}^{k} c_i H^i\right]. \tag{13}$$

Computing the $c_i$'s for some specific cases we obtain:

$$\begin{aligned}
h^{(0)} &= (1 + H)^{0}\, h\, [1],\\
h^{(1)} &= (1 + H)^{-1}\, h\, [1 - H],\\
h^{(2)} &= (1 + H)^{-2}\, h \left[1 - 4H + H^2\right],\\
h^{(3)} &= (1 + H)^{-3}\, h \left[1 - 11H + 11H^2 - H^3\right],\\
h^{(4)} &= (1 + H)^{-4}\, h \left[1 - 26H + 66H^2 - 26H^3 + H^4\right],\\
h^{(5)} &= (1 + H)^{-5}\, h \left[1 - 57H + 302H^2 - 302H^3 + 57H^4 - H^5\right].
\end{aligned}$$

The symmetry in the coefficients of the polynomial inside the brackets suggests collecting them in the following Pascal-type triangle (with alternating signs, starting with +). For example, the rows for $k = 0, 1, \dots, 9$ are as follows:

$$\begin{array}{c}
1\\
1 \quad 1\\
1 \quad 4 \quad 1\\
1 \quad 11 \quad 11 \quad 1\\
1 \quad 26 \quad 66 \quad 26 \quad 1\\
1 \quad 57 \quad 302 \quad 302 \quad 57 \quad 1\\
1 \quad 120 \quad 1191 \quad 2416 \quad 1191 \quad 120 \quad 1\\
1 \quad 247 \quad 4293 \quad 15619 \quad 15619 \quad 4293 \quad 247 \quad 1\\
1 \quad 502 \quad 14608 \quad 88234 \quad 156190 \quad 88234 \quad 14608 \quad 502 \quad 1\\
1 \quad 1013 \quad 47840 \quad 455192 \quad 1310354 \quad 1310354 \quad 455192 \quad 47840 \quad 1013 \quad 1
\end{array}$$
Now, the question is whether we are able to build the triangle as in the classical Pascal array. Fortunately, the answer is yes, and this can be explained by looking at the binomial array type of (12). For example, every element of the 10th line can be obtained from the two coefficients above in the 9th line, namely,

$$\begin{aligned}
1013 &= 9 \cdot 1 + 2 \cdot 502\\
47840 &= 8 \cdot 502 + 3 \cdot 14608\\
455192 &= 7 \cdot 14608 + 4 \cdot 88234\\
1310354 &= 6 \cdot 88234 + 5 \cdot 156190\\
1310354 &= 5 \cdot 156190 + 6 \cdot 88234\\
455192 &= 4 \cdot 88234 + 7 \cdot 14608\\
47840 &= 3 \cdot 14608 + 8 \cdot 502\\
1013 &= 2 \cdot 502 + 9 \cdot 1
\end{aligned}$$
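The construction rule illustrated by these identities can be sketched in a few lines of Python. This is only an illustration under the reading that entry $j$ of row $k$ (unsigned, with $j = 0, \dots, k$) equals $(k+1-j)$ times the left parent plus $(j+1)$ times the right parent; the names `next_row` and `triangle` are hypothetical.

```python
def next_row(prev):
    """Build row k of the Pascal-type triangle from row k-1.

    Entry j of row k is (k + 1 - j) * prev[j - 1] + (j + 1) * prev[j],
    with missing parents outside the row treated as 0.
    """
    k = len(prev)                 # row k-1 has k entries, so this is the new row index
    padded = [0] + prev + [0]     # padded[j + 1] == prev[j]
    return [(k + 1 - j) * padded[j] + (j + 1) * padded[j + 1] for j in range(k + 1)]


def triangle(kmax):
    """Rows 0..kmax of the unsigned triangle; the signed c_j is (-1)**j times entry j."""
    rows = [[1]]
    for _ in range(kmax):
        rows.append(next_row(rows[-1]))
    return rows


if __name__ == "__main__":
    for row in triangle(9):
        print(row)
```

Each row costs a number of multiplications proportional to its length, so the coefficients of $h^{(k)}$ are obtained in $O(k^2)$ integer operations, with no enumeration of partitions.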
So, by using the same classical induction proof as in Pascal's triangles, we can establish the following result.

Lemma 2.2 If $a^{i}_{j}$ is the coefficient $c_j$ of $H^j$ in the i-th derivative of $h$ according to (13), then:

$$h^{(k)} = (1 + H)^{-k}\, h \left[1 + \sum_{j=1}^{k} \left(-a^{k-1}_{j-1}(k + 1 - j) + a^{k-1}_{j}(j + 1)\right) H^{j}\right], \tag{14}$$

with $a^{r}_{-1} = a^{r}_{k+1} = 0$ and $a^{r}_{0} = a^{r}_{k} = 1$, for every $r = 1, 2, \dots$.

Proof. We already saw that the result holds for $k = 1, 2$. Now, let us assume that

$$h^{(k)} = (1 + H)^{-k}\, h \left[1 + \sum_{j=1}^{k} \left(-a^{k-1}_{j-1}(k + 1 - j) + a^{k-1}_{j}(j + 1)\right) H^{j}\right] = (1 + H)^{-k}\, h \left[1 + \sum_{j=1}^{k} c_j H^{j}\right].$$

Then, recalling that $c_0 = 1$ and $c_{k+1} = 0$, we have

$$\begin{aligned}
h^{(k+1)} &= \frac{d}{dy}\left\{(1 + H)^{-k-2}\, H \left[1 + \sum_{j=1}^{k} c_j H^{j}\right]\right\}\\
&= (-k - 2)(1 + H)^{-k-3}\, H \left(H + \sum_{j=1}^{k} c_j H^{j+1}\right) + (1 + H)^{-k-2} \left(H + \sum_{j=1}^{k} (j + 1) c_j H^{j+1}\right)\\
&= (1 + H)^{-k-1}\, h \left[(-k - 2)\left(H + \sum_{j=1}^{k} c_j H^{j+1}\right) + (1 + H)\left(1 + \sum_{j=1}^{k} (j + 1) c_j H^{j}\right)\right]\\
&= (1 + H)^{-k-1}\, h \left[(-k - 2)H + \sum_{j=1}^{k} (-k - 2) c_j H^{j+1} + 1 + \sum_{j=1}^{k} (j + 1) c_j H^{j} + H + \sum_{j=1}^{k} (j + 1) c_j H^{j+1}\right]\\
&= (1 + H)^{-k-1}\, h \left[(-k - 2)H + \sum_{r=2}^{k+1} (-k - 2) c_{r-1} H^{r} + 1 + \sum_{r=1}^{k+1} (r + 1) c_r H^{r} + H + \sum_{r=2}^{k+1} r c_{r-1} H^{r}\right]\\
&= (1 + H)^{-k-1}\, h \left[1 + \sum_{r=1}^{k+1} (r + 1) c_r H^{r} + \sum_{r=1}^{k+1} r c_{r-1} H^{r} + \sum_{r=1}^{k+1} (-k - 2) c_{r-1} H^{r}\right]\\
&= (1 + H)^{-k-1}\, h \left[1 + \sum_{r=1}^{k+1} \left(-(k + 2 - r) c_{r-1} + (r + 1) c_r\right) H^{r}\right].
\end{aligned}$$
Finally, by inspection of the recurrences we can obtain the corresponding formula for the binomial-type coefficient in terms of $k$ and $j$, as follows.

Theorem 2.1 The k-th derivative of $h(y) = \exp(-y)\,(1 + \exp(-y))^{-2}$ is given by

$$h^{(k)} = (1 + e^{y})^{-k}\, h \cdot \left[1 + \sum_{j=1}^{k} (-1)^{j} \left(\sum_{n=1}^{j+1} (-1)^{j+1-n} \binom{k+2}{j+1-n}\, n^{1+k}\right) e^{jy}\right]. \tag{15}$$

Proof. The proof follows from (14), i.e., just by verifying that

$$c^{k}_{j} = -(k + 1 - j)\, c^{k-1}_{j-1} + (j + 1)\, c^{k-1}_{j},$$

where $c^{k}_{j}$ is the coefficient of $H^{j}$ in (15).
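The verification asked for in the proof can also be carried out numerically for small $k$. The sketch below is an illustrative check, not the authors' code: `c_closed` evaluates the bracketed coefficient of $e^{jy}$ in (15), and `c_recursive` builds the same signed coefficients from the recurrence $c^{k}_{j} = -(k+1-j)c^{k-1}_{j-1} + (j+1)c^{k-1}_{j}$; both function names are assumptions made for this example.

```python
from math import comb


def c_closed(k, j):
    """Signed coefficient of H**j (H = e**y) in the bracket of (15)."""
    inner = sum(
        (-1) ** (j + 1 - n) * comb(k + 2, j + 1 - n) * n ** (k + 1)
        for n in range(1, j + 2)
    )
    return (-1) ** j * inner


def c_recursive(k):
    """Signed coefficients c_0..c_k of the k-th derivative via the recurrence."""
    row = [1]                                   # k = 0: h itself
    for r in range(1, k + 1):
        prev = [0] + row + [0]                  # prev[j + 1] == c_j of order r - 1
        row = [-(r + 1 - j) * prev[j] + (j + 1) * prev[j + 1] for j in range(r + 1)]
    return row


if __name__ == "__main__":
    for k in range(0, 10):
        closed = [c_closed(k, j) for j in range(k + 1)]
        assert closed == c_recursive(k)
        print(k, closed)
```

For $k = 5$ both routes return the coefficients $1, -57, 302, -302, 57, -1$ listed earlier for $h^{(5)}$.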
2.2 The General Case

The extension for any $m \in \mathbb{N}$ is straightforward: we just need to apply the same procedure as described for the case $m = 2$. Explicitly, we want to obtain the k-th derivative of the generator function of (3) given by

$$h(y) = e^{-y}\left(1 + e^{-y}\right)^{-m} = e^{(m-1)y}\left(1 + e^{y}\right)^{-m}.$$

Consider as before

$$H(y) = e^{y}, \tag{16}$$

then

$$h(y) = (H(y))^{m-1}\,(1 + H(y))^{-m}. \tag{17}$$

Consider the $c_i$'s (functions of $m$) such that

$$h^{(k)} = (1 + H)^{-k}\, h \left[(m - 1)^{k} + \sum_{i=1}^{k} c_i H^{i}\right]. \tag{18}$$

Computing the $c_i$'s for some particular cases, we obtain

$$\begin{aligned}
h^{(0)} &= (1 + H)^{0}\, h \cdot [1]\\
h^{(1)} &= (1 + H)^{-1}\, h \cdot [(m - 1) - H]\\
h^{(2)} &= (1 + H)^{-2}\, h \cdot \left[(m - 1)^2 - (3m - 2)H + H^2\right]\\
h^{(3)} &= (1 + H)^{-3}\, h \cdot \left[(m - 1)^3 - (6m^2 - 8m + 3)H + (7m - 3)H^2 - H^3\right]\\
h^{(4)} &= (1 + H)^{-4}\, h \cdot \left[(m - 1)^4 - (10m^3 - 20m^2 + 15m - 4)H + (25m^2 - 20m + 6)H^2 - (15m - 4)H^3 + H^4\right]\\
h^{(5)} &= (1 + H)^{-5}\, h \cdot \left[(m - 1)^5 - (15m^4 - 40m^3 + 45m^2 - 24m + 5)H + (65m^3 - 75m^2 + 46m - 10)H^2\right.\\
&\qquad \left. {} - (90m^2 - 34m + 10)H^3 + (31m - 5)H^4 - H^5\right].
\end{aligned}$$

Motivated by the symmetry of the case $m = 2$ (classical Jensen-Logistic), and their arrangement into a Pascal-type triangle, we try to collect the polynomial coefficients of the brackets under the same class of arrays.

$$\begin{array}{c}
1\\
(m - 1) \quad 1\\
(m - 1)^2 \quad 3m - 2 \quad 1\\
(m - 1)^3 \quad 6m^2 - 8m + 3 \quad 7m - 3 \quad 1\\
(m - 1)^4 \quad 10m^3 - 20m^2 + 15m - 4 \quad 25m^2 - 20m + 6 \quad 15m - 4 \quad 1\\
(m - 1)^5 \quad 15m^4 - 40m^3 + 45m^2 - 24m + 5 \quad 65m^3 - 75m^2 + 46m - 10 \quad 90m^2 - 34m + 10 \quad 31m - 5 \quad 1
\end{array}$$
In the general case, we have the same rule for constructing the coefficients of a given line in terms of the preceding one: for example, every element of the 6th line (corresponding to the 5th derivative) can be obtained from the two coefficients above, in the 5th line. Note that the first and last terms of any row $i > 1$ are clearly $(m - 1)^i$ and 1, respectively (which can be proved very easily by mathematical induction). Thus, we have

$$\begin{aligned}
15m^4 - 40m^3 + 45m^2 - 24m + 5 &= 5 \cdot (m - 1)^4 + m \cdot (10m^3 - 20m^2 + 15m - 4)\\
65m^3 - 75m^2 + 46m - 10 &= 4 \cdot (10m^3 - 20m^2 + 15m - 4) + (m + 1) \cdot (25m^2 - 20m + 6)\\
90m^2 - 34m + 10 &= 3 \cdot (25m^2 - 20m + 6) + (m + 2) \cdot (15m - 4)\\
31m - 5 &= 2 \cdot (15m - 4) + (m + 3) \cdot (1).
\end{aligned}$$

So again, by using the same classical induction proof as in Pascal's triangles, we can prove, analogously to Lemma 2.2, the following result.
Lemma 2.3 If $a^{i}_{j}$ is the coefficient $c_j$ of $H^j$ in the i-th derivative of $h(y) = e^{-y}\left(1 + e^{-y}\right)^{-m}$ according to (18), then

$$h^{(k)} = (1 + H)^{-k} \cdot h \cdot \left[(m - 1)^{k} + \sum_{j=1}^{k} \left(-a^{k-1}_{j-1}(k + 1 - j) + a^{k-1}_{j}(m + j - 1)\right) H^{j}\right], \tag{19}$$

with $a^{r}_{-1} = a^{r}_{k+1} = 0$, $a^{r}_{0} = (m - 1)^{r}$ and $a^{r}_{k} = 1$, for every $r = 1, 2, \dots$.
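For a given integer $m$, the recurrence in (19) can be run directly. The following Python sketch assumes the signed reading $c^{k}_{j} = -(k+1-j)\,c^{k-1}_{j-1} + (m+j-1)\,c^{k-1}_{j}$, with $c^{k}_{0} = (m-1)^{k}$ arising from the same rule; the function name `general_rows` is illustrative.

```python
def general_rows(m, kmax):
    """Signed bracket coefficients of h^(k), k = 0..kmax, for h = e^{-y}(1+e^{-y})^{-m}.

    rows[k][j] multiplies H**j (H = e**y) in the bracket of (18); rows[k][0] = (m-1)**k.
    """
    rows = [[1]]
    for k in range(1, kmax + 1):
        prev = [0] + rows[-1] + [0]             # prev[j + 1] == c_j of order k - 1
        rows.append(
            [-(k + 1 - j) * prev[j] + (m + j - 1) * prev[j + 1] for j in range(k + 1)]
        )
    return rows


if __name__ == "__main__":
    m = 3
    for k, row in enumerate(general_rows(m, 5)):
        print(k, row)
    # Compare with the tabulated polynomials evaluated at m = 3, e.g. for k = 5:
    print((m - 1) ** 5, -(15 * m**4 - 40 * m**3 + 45 * m**2 - 24 * m + 5))
```

Setting `m = 2` reproduces the rows of the classical triangle with alternating signs, which is a quick consistency check between Sections 2.1 and 2.2.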
Finally, we can obtain, as in Theorem 2.1, the corresponding formula for the binomial-type coefficient in terms of $k$ and $m$, as follows.

Theorem 2.2 The k-th derivative of the generator function $h(y) = e^{-y}\left(1 + e^{-y}\right)^{-m}$ of the Generalized Matrix Variate Jensen-Logistic distribution is

$$h^{(k)} = (1 + e^{y})^{-k} \cdot h \cdot \left[1 + \sum_{j=1}^{k} (-1)^{j} \left(\sum_{m=1}^{j+1} (-1)^{j+1-m} \binom{k+2}{j+1-m}\, m^{1+k}\right) e^{jy}\right]. \tag{20}$$
The above procedures motivated us to study Pascal triangle arrays as a possible source of solutions in generating functions of certain polynomials. However, the kernels from other classical elliptical distributions (see Gupta and Varga, 1993) are not suitable for simplification via Pascal triangles, and some, like the general Kotz distribution, must definitely be expressed in terms of partition theory. Only some specific functions allow for simplification, such as those of the Kotz-type I and the Pearson distributions (see Caro-Lopera et al., 2009a and Díaz-García and Caro-Lopera, 2012, 2013, 2014). An interesting problem that can be addressed in the future is to determine what new structures for Pascal triangles could be proposed to obtain the generating function by integration; this would be useful in widening the classes of matrix distributions with desirable properties.
3. Generalized Jensen-Logistic Kummer Relation

The Kummer matrix relation, first derived by Herz (1955) and studied in the context of zonal polynomials by Constantine (1963), has been extremely useful in statistical applications, since in some special distributions, involving infinite series of zonal polynomials, it avoids the problem in the computation of the series (Koev and Edelman, 2006) and turns them into simple low-degree polynomials (see Caro-Lopera and Díaz-García, 2012). This famous relation states that

$${}_1F_1(a; c; \mathbf{X}) = \operatorname{etr}(\mathbf{X})\; {}_1F_1(c - a; c; -\mathbf{X}); \tag{21}$$
see Muirhead (1982, Eq. (6), p. 265). For the Gaussian kernel, as expected, a general relation applies to any elliptical model. Recently, Díaz-García and Caro-Lopera (2008) obtained a general relation, which is summarized next.

Lemma 3.1 If the generator function $f(y)$ of a given matrix variate elliptical model admits a Taylor expansion, then the generalized Kummer relation is given by

$${}_1P_1\left(f^{(t)}(0) : a; c; \mathbf{X}\right) = {}_1P_1\left(f^{(t)}(\operatorname{tr}(\mathbf{X})) : c - a; c; -\mathbf{X}\right), \tag{22}$$

where $\mathbf{X} > 0$, $\operatorname{Re}(c) > (m - 1)/2$ and $a$ is arbitrary (or at least $\operatorname{Re}(a) > (m - 1)/2$, if the integral representation of ${}_1F_0$ is used; see Herz (1955, p. 485) or Muirhead (1982, Corollary 7.3.5)).

The result is expressed in terms of the hypergeometric generalized function ${}_1P_1$ of matrix argument, which was defined by Caro-Lopera et al. (2009a) as

$${}_1P_1\left(f(t, \operatorname{tr}(\mathbf{X})) : a; c; \mathbf{X}\right) = \sum_{t=0}^{\infty} \frac{f(t, \operatorname{tr}(\mathbf{X}))}{t!} \sum_{\tau} \frac{(a)_\tau}{(c)_\tau}\, C_\tau(\mathbf{X}), \tag{23}$$

where $\sum_\tau$ denotes the summation over all partitions $\tau$, $\tau = (t_1, \cdots, t_m)$, $t_1 \geq t_2 \geq \cdots \geq t_m > 0$, of $t$, $C_\tau(\mathbf{X})$ is the zonal polynomial of $\mathbf{X}$ corresponding to $\tau$, the function $f(t, \operatorname{tr}(\mathbf{X}))$ is independent of $\tau$, and the generalized hypergeometric coefficient $(\beta)_\tau$ is given by

$$(\beta)_\tau = \prod_{i=1}^{m} \left(\beta - \tfrac{1}{2}(i - 1)\right)_{t_i},$$

where

$$(b)_t = b(b + 1) \cdots (b + t - 1), \qquad (b)_0 = 1.$$
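The coefficient $(\beta)_\tau$ just defined is easy to compute exactly, and doing so makes visible the mechanism that later turns series into polynomials: when $\beta$ is a negative integer $-n$, the factor $(\beta)_{t_1}$ vanishes as soon as $t_1 > n$. The sketch below uses exact rational arithmetic; the function names `pochhammer` and `generalized_coefficient` are illustrative, and the partition is passed as a tuple of its parts.

```python
from fractions import Fraction


def pochhammer(b, t):
    """Rising factorial (b)_t = b (b + 1) ... (b + t - 1), with (b)_0 = 1."""
    result = Fraction(1)
    for s in range(t):
        result *= b + s
    return result


def generalized_coefficient(beta, tau):
    """Generalized hypergeometric coefficient (beta)_tau = prod_i (beta - (i - 1)/2)_{t_i}."""
    result = Fraction(1)
    for i, t_i in enumerate(tau, start=1):
        result *= pochhammer(Fraction(beta) - Fraction(i - 1, 2), t_i)
    return result


if __name__ == "__main__":
    # beta = -2: partitions whose largest part exceeds 2 contribute nothing.
    for tau in [(1,), (2,), (2, 1), (3,), (3, 1), (4, 2)]:
        print(tau, generalized_coefficient(-2, tau))
```

Only partitions with largest part at most $n$ survive, so with $\beta = c - a = -n$ the sum over $t$ in (23) has finitely many non-zero terms.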
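Then, Díaz-García and Caro-Lopera (2008) applied Lemma 3.1 to the classical Jensen-Logistic distribution ($m = 2$), obtaining the Jensen-Logistic Kummer relation by setting $h(y) = \exp(-y)\,(1 + \exp(-y))^{-2}$ and computing the k-th derivative using (7). Then $f^{(t)}(\operatorname{tr}(\mathbf{X}))$ is given by

$$\sum_{m=0}^{t} \binom{t}{m} \sum_{\kappa \in P_{t-m}} \frac{(t-m)!\,(-1)^{m + \sum_{i=1}^{t-m}(1+i)\nu_i} \left(\sum_{i=1}^{t-m} \nu_i + 1\right)!\, \exp\left(-\left(1 + \sum_{i=1}^{t-m} \nu_i\right) \operatorname{tr}(\mathbf{X})\right)}{\prod_{i=1}^{t-m} \nu_i!\,(i!)^{\nu_i}\; [1 + \operatorname{etr}(-\mathbf{X})]^{2 + \sum_{i=1}^{t-m} \nu_i}},$$

and so

$${}_1P_1\left(\sum_{m=0}^{t} \binom{t}{m} \sum_{\kappa \in P_{t-m}} \frac{(t-m)!\,(-1)^{m + \sum_{i=1}^{t-m}(1+i)\nu_i} \left(\sum_{i=1}^{t-m} \nu_i + 1\right)!}{\prod_{i=1}^{t-m} \nu_i!\,(i!)^{\nu_i}\; 2^{2 + \sum_{i=1}^{t-m} \nu_i}} : a; c; \mathbf{X}\right) = {}_1P_1\left(f^{(t)}(\operatorname{tr} \mathbf{X}) : c - a; c; -\mathbf{X}\right).$$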
Even though the above expression is a polynomial under certain special values of $a$ and $c$, the computation of the coefficients is not easy, given the indexation with lexicographically ordered partitions involved in the summation. This is where the simplification of the derivative of the generator obtained in the previous section becomes useful. We can now improve and generalize that result by using the simple derivative given in Theorem 2.2, as follows.

Theorem 3.2 The Generalized Kummer relation associated with the Jensen-Logistic generator function $h(y) = e^{-y}\left(1 + e^{-y}\right)^{-m}$ is

$${}_1P_1\left(h^{(k)}(0) : a; c; \mathbf{X}\right) = {}_1P_1\left(h^{(k)}(\operatorname{tr}(\mathbf{X})) : c - a; c; -\mathbf{X}\right), \tag{24}$$

where $\mathbf{X} > 0$, $\operatorname{Re}(c) > (m - 1)/2$ and $a$ is arbitrary (or at least $\operatorname{Re}(a) > (m - 1)/2$, if the integral representation of ${}_1F_0$ is used; see Herz, 1955, p. 485; Muirhead, 1982, Corollary 7.3.5), where

$$h^{(k)}(0) = 2^{-k-m} \cdot \left[1 + \sum_{j=1}^{k} (-1)^{j} \left(\sum_{m=1}^{j+1} (-1)^{j+1-m} \binom{k+2}{j+1-m}\, m^{1+k}\right)\right]$$

and

$$h^{(k)}(\operatorname{tr}(\mathbf{X})) = (1 + \operatorname{etr}(\mathbf{X}))^{-k} \cdot \operatorname{etr}(-\mathbf{X})\,(1 + \operatorname{etr}(-\mathbf{X}))^{-m} \cdot \left[1 + \sum_{j=1}^{k} (-1)^{j} \left(\sum_{m=1}^{j+1} (-1)^{j+1-m} \binom{k+2}{j+1-m}\, m^{1+k}\right) \operatorname{etr}(j\mathbf{X})\right].$$

Such an easily computable expression, which can be implemented in efficient algorithms, allows for inference with exact polynomial distributions.
4. An Application in Statistical Configuration Theory

The main advantage of the generalized Kummer relations lies in the fact that if $c - a$ is a negative integer, the series on the right-hand side of (24) terminates and is a polynomial. Thus, if we have a density written in terms of a series of zonal polynomials as on the left-hand side of (24), which has important implications for its computability (see Koev and Edelman, 2006), then it can be turned into a polynomial density. This allows one to develop inference on exact distributions which can now be computed efficiently. In fact, in the context of statistical shape theory of planar images, there are formulas available for those polynomials; see Caro-Lopera et al. (2007). See also Caro-Lopera et al. (2009a) for the main aspects of statistical shape theory under elliptical models.

As a summary, we start from the fact that two figures $\mathbf{X} : N \times K$ and $\mathbf{X}_1 : N \times K$ have the same configuration if $\mathbf{X}_1 = \mathbf{X}\mathbf{E} + \mathbf{1}_N \mathbf{e}'$, for some translation $\mathbf{e} : K \times 1$ and a nonsingular $\mathbf{E} : K \times K$. Therefore, the configuration or shape of an image summarized in a matrix $\mathbf{X} : N \times K$ of $N$ landmarks in $K$ dimensions is all the geometric information left after filtering out effects such as rotation, scaling, translation and uniform shear.
The final configuration matrix $\mathbf{U} : (N - 1) \times K$, which contains the configuration coordinates of the original $N \times K$ landmark matrix $\mathbf{X}$, can be obtained through the following transformational process:

$$\mathbf{L}\mathbf{X} = \mathbf{Y} = \mathbf{U}\mathbf{E}, \tag{25}$$
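where $\mathbf{L}$ is an $(N - 1) \times N$ Helmert submatrix (see Dryden and Mardia, 1998), and $\mathbf{Y} = (\mathbf{Y}_1' \mid \mathbf{Y}_2')'$; here, $\mathbf{Y}_1$ is a $K \times K$ nonsingular matrix and $\mathbf{Y}_2$ a $q \times K$ matrix, with $q = N - K - 1 \geq 1$. Observe that $\mathbf{E} = \mathbf{Y}_1$ and $\mathbf{U}$ has the form $\mathbf{U} = (\mathbf{I} \mid \mathbf{V}')'$, where $\mathbf{V} = \mathbf{Y}_2 \mathbf{Y}_1^{-1}$.

The first work in statistical shape theory, by Goodall and Mardia (1993), considered that $\mathbf{X}$ has an isotropic matrix Gaussian distribution with mean $\boldsymbol{\mu}_X$, $\mathbf{X} \sim N_{N \times K}(\boldsymbol{\mu}_X, \sigma^2 \mathbf{I}_N \otimes \mathbf{I}_K)$, and so, under (25), the distribution of the configuration matrix $\mathbf{U}$ can be obtained. Caro-Lopera et al. (2009a,b) generalized this approach by assuming that $\mathbf{X}$ has a general elliptically contoured distribution, that is, $\mathbf{X} \sim E_{N \times K}(\boldsymbol{\mu}_X, \boldsymbol{\Sigma}_X \otimes \boldsymbol{\Theta}_X, h)$, and then obtained an expression for the configuration or affine distribution in terms of an infinite series of zonal polynomials. Unfortunately, this family of distributions inherits all the problems about the relationship between convergence and truncation of a series when trying to estimate the distributional parameters of form and dispersion (see Koev and Edelman, 2006). However, when the generalized Kummer relation in (22) is applied, the computation is solved analytically and the following polynomial configuration density is obtained, as derived in Caro-Lopera et al. (2008).

Lemma 4.1 If $\mathbf{Y} \sim E_{N-1 \times K}(\boldsymbol{\mu}_{N-1 \times K}, \boldsymbol{\Sigma}_{N-1 \times N-1} \otimes \mathbf{I}_K, h)$, $\boldsymbol{\Sigma} > 0$, $K$ is even (odd) and $N$ is odd (even), respectively, then the polynomial expression of the non-isotropic noncentral configuration density is given by

$$A \times {}_1P_1\left(f^{(t)}(\operatorname{tr}(\mathbf{X})) : -\left(\frac{N - 1}{2} - \frac{K}{2}\right); \frac{K}{2}; -\mathbf{X}\right), \tag{26}$$

and it is a polynomial of degree $K\left(\frac{N-1}{2} - \frac{K}{2}\right)$ in the latent roots of $\mathbf{X} = \mathbf{U}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}\boldsymbol{\mu}'\boldsymbol{\Sigma}^{-1}\mathbf{U}\,(\mathbf{U}'\boldsymbol{\Sigma}^{-1}\mathbf{U})^{-1}$, where

$$A = \frac{\pi^{K^2/2}\, \Gamma_K\left(\frac{N-1}{2}\right)}{|\boldsymbol{\Sigma}|^{\frac{K}{2}}\, |\mathbf{U}'\boldsymbol{\Sigma}^{-1}\mathbf{U}|^{\frac{N-1}{2}}\, \Gamma_K\left(\frac{K}{2}\right)},$$

$f(y)$ admits a Taylor expansion and it is uniquely defined by

$$f^{(t)}(0) = g(t, \mathbf{X}), \tag{27}$$

via

$$g(t, \mathbf{X}) = \sum_{r=0}^{\infty} \frac{\left[\operatorname{tr}\left(\boldsymbol{\mu}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}\right)\right]^{r}}{r!\; \Gamma\left(\frac{K(N-1)}{2} + t\right)} \int_0^\infty h^{(2t+r)}(y)\, y^{\frac{K(N-1)}{2} + t - 1}\, dy. \tag{28}$$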
The particular result for the Generalized Jensen-Logistic distribution is easily obtained by applying Theorem 2.2 in $h^{(2t+r)}(y)$. This density has a remarkable characteristic: if $K$ is even (odd) and $N$ is odd (even), then the Jensen-Logistic configuration density is a polynomial of degree $K\left(\frac{N-1}{2} - \frac{K}{2}\right)$ in the latent roots of $\mathbf{X}$. Thus, the inference is extremely simple to develop, as can be seen in other models like the Kotz-type; see Caro-Lopera et al. (2009b).

Lastly, we stress that according to comparative studies performed recently by Díaz-García and Caro-Lopera (2012), the optimal estimation of the configuration and its dispersion can only be obtained through polynomial densities, since no analytical relation among convergence, truncation and the optimum exists; as a matter of fact, these studies show that the algorithms available to compute series of zonal polynomials do not tolerate the truncation necessary to approach the solution obtained by polynomials. As an example, for the digit 3 database, with 13 landmarks (Dryden and Mardia, 1998), the configuration density is simply a polynomial of degree 10, while the corresponding distribution in series requires a polynomial of degree much greater than 120 to begin to approximate the result obtained with the degree 10 polynomial; unfortunately, the maximum is not reached because the algorithms do not support larger truncations. It is important to note that the computational cost with series is very high, since it is necessary to calculate all the zonal polynomials indexed by two-part partitions of all integers from 0 to 120 or more, demanding modifications to the complex algorithms of Koev and Edelman (2006), while the density proposed here is a simple polynomial of degree 10 that can be obtained through direct application of the formulae in Caro-Lopera et al. (2007), given for the case of images in two dimensions.
Acknowledgments
The authors wish to express their gratitude to the anonymous referees for their helpful comments and suggestions.
Funding
This research work was supported by the University of Medellin (Medellin, Colombia) and the Center for Mathematical Research (Guanajuato-Monterrey, Mexico), joint grant No. 157. The authors also thank grant 105657 of CONACYT, México.
References
Caro-Lopera, F. J., Díaz-García, J. A., González-Farías, G. (2007). A formula for Jack polynomials of the second order. Appl. Math. (Warsaw) 34: 113–119.
Caro-Lopera, F. J., Díaz-García, J. A., González-Farías, G. (2008). Finite elliptical configuration distributions: inference and applications. Technical report, 25.09.2008, I-08-15 (PE). CIMAT, Mexico. http://www.cimat.mx/index.php?m=279. Also submitted.
Caro-Lopera, F. J., Díaz-García, J. A., González-Farías, G. (2009a). Noncentral elliptical configuration density. J. Multivariate Anal. 101(1): 32–43.
Caro-Lopera, F. J., Díaz-García, J. A., González-Farías, G. (2009b). Inference in statistical shape theory: elliptical configuration densities. J. Statist. Res. 43(1): 1–19.
Caro-Lopera, F. J., Díaz-García, J. A. (2012). Matrix Kummer-Pearson VII relation and polynomial Pearson VII configuration density. J. Iran. Statist. Soc. 11(2): 217–230.
Chikuse, Y., Davis, A. (1986). Some properties of invariant polynomials with matrix arguments and their applications in econometrics. Ann. Inst. Statist. Math. 38: 109–122.
Constantine, A. G. (1963). Noncentral distribution problems in multivariate analysis. Ann. Math. Statist. 34: 1270–1285.
Davis, A. W. (1979). Invariant polynomials with two matrix arguments extending the zonal polynomials: applications to multivariate distribution theory. Ann. Inst. Statist. Math. 31: 465–485.
Davis, A. W. (1981). On the construction of a class of invariant polynomials in several matrices, extending the zonal polynomials. Ann. Inst. Statist. Math. 33: 297–313.
Díaz-García, J. A., Caro-Lopera, F. J. (2008). Matrix Kummer General relation. Technical report, 25.09.2008, I-08-16 (PE). CIMAT, Mexico. http://www.cimat.mx/index.php?m=279.
Díaz-García, J. A., Caro-Lopera, F. J. (2010). Generalized shape theory via SV decomposition I. Metrika 75: 541–565.
Díaz-García, J. A., Caro-Lopera, F. J. (2012). Statistical theory of shape under elliptical models and singular value decompositions. J. Multivariate Anal. 103: 77–92.
Díaz-García, J. A., Caro-Lopera, F. J. (2013). Generalised shape theory via pseudo-Wishart distribution. Sankhya A 75: 253–276.
Díaz-García, J. A., Caro-Lopera, F. J. (2014). Statistical theory of shape under elliptical models via QR decomposition. Statistics 48: 456–472.
Dryden, I. L., Mardia, K. V. (1998). Statistical Shape Analysis. Chichester: John Wiley & Sons.
Goodall, C. R., Mardia, K. V. (1993). Multivariate aspects of shape theory. Ann. Statist. 21: 848–866.
Gupta, A. K., Varga, T. (1993). Elliptically Contoured Models in Statistics. Dordrecht: Kluwer Academic Publishers.
Fang, K. T., Anderson, T. W., Eds. (1990). Statistical Inference in Elliptically Contoured and Related Distributions. New York: Allerton Press.
Fang, K. T., Zhang, Y. T. (1990). Generalized Multivariate Analysis. Beijing: Science Press, Springer-Verlag.
Herz, C. S. (1955). Bessel functions of matrix argument. Ann. Math. 61: 474–523.
Koev, P., Edelman, A. (2006). The efficient evaluation of the hypergeometric function of a matrix argument. Math. Comp. 75: 833–846.
Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory. New York: John Wiley & Sons.