Modelling multivariate count data - Extras Springer

Modelling multivariate count data

Aristidis K. Nikoloulopoulos and Dimitris Karlis

Department of Statistics, Athens University of Economics, 76 Patission Str., 10434, Athens, GREECE
{akn,karlis}@aueb.gr

Summary. Multivariate count data occur in several different disciplines. Few models exist for such data, mainly because of the computational complexity of applying them. We propose models based on the currently fashionable idea of copulas, choosing copulas that are appropriate for count data while taking computational limitations into account. A real data application is provided.

1 Introduction

Multivariate count data occur in several disciplines, such as epidemiology, marketing, criminology, sports statistics and industrial statistics. However, flexible models for such data are not widely available and are usually hard to fit to real data. Copulas are currently fashionable for modelling dependent data because they separate the estimation of the marginal properties from that of the dependence structure. While there are plenty of publications that treat continuous data, only a few treat count data. The purpose of the present paper is to propose a model based on copulas for multivariate count data.

Definition 1. ([Nel99]) A multivariate copula is a function $C$ from $I^n$ to $I$, where $I = [0, 1]$, with the following properties: (1) For every $u$ in $I^n$, $C(u) = 0$ if at least one coordinate of $u$ is 0, and if all coordinates of $u$ are 1 except $u_k$, then $C(u) = u_k$. (2) For every $a$ and $b$ in $I^n$ such that $a \le b$, $V_C([a, b]) \ge 0$.

There are few papers on the use of copulas with discrete data. [MM94, TDB99, Lee94, CLTZ04] use the Frank copula, see [Fra79], to model discrete bivariate data (i.e., count and binary data). The multivariate extension of the Frank copula has two disadvantages: it is restricted to positive dependence, and it has a single dependence parameter for all the bivariate margins. The multivariate normal copula overcomes these drawbacks. [Lee01, vO99] use the bivariate normal copula to model count data, while [LV02, Son00] used the normal copula to model multivariate non-normal longitudinal continuous data. For discrete data, generalization to the multivariate case is not easy, since the joint probability function requires evaluating the copula at several different points, and hence multivariate numerical integration is needed, leading to severe computational problems.
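To illustrate why the joint probability function of discrete data requires several copula evaluations, the following minimal sketch computes a joint probability as a signed sum of the copula over the $2^d$ vertices of a box. It is an illustration under stated assumptions, not the fitting procedure of this paper: the Poisson margins, the one-parameter multivariate Frank copula with $\theta > 0$, the parameter values, and the function names `poisson_cdf`, `frank_copula` and `joint_pmf` are all chosen here for the example.

```python
import math
from itertools import product


def poisson_cdf(y, mu):
    """P(Y <= y) for Y ~ Poisson(mu); returns 0 for y < 0."""
    if y < 0:
        return 0.0
    term = math.exp(-mu)  # P(Y = 0)
    total = term
    for k in range(1, y + 1):
        term *= mu / k  # recursion P(Y = k) = P(Y = k - 1) * mu / k
        total += term
    return min(total, 1.0)


def frank_copula(u, theta):
    """d-variate Frank copula with one parameter theta > 0 (assumed family)."""
    d = len(u)
    denom = math.expm1(-theta)  # exp(-theta) - 1, a negative number
    prod = 1.0
    for ui in u:
        prod *= math.expm1(-theta * ui)
    return -math.log1p(prod / denom ** (d - 1)) / theta


def joint_pmf(y, mus, theta):
    """P(Y_1 = y_1, ..., Y_d = y_d) as a signed sum over 2^d box vertices."""
    d = len(y)
    total = 0.0
    for shifts in product((0, 1), repeat=d):
        # Each vertex uses F_i(y_i) or F_i(y_i - 1) in coordinate i.
        u = [poisson_cdf(yi - s, mu) for yi, s, mu in zip(y, shifts, mus)]
        sign = -1.0 if sum(shifts) % 2 else 1.0
        total += sign * frank_copula(u, theta)
    return total
```

A call such as `joint_pmf((1, 2), mus=(1.0, 2.0), theta=3.0)` evaluates the copula four times, and a trivariate probability needs eight evaluations. The Frank copula has a closed form, so each evaluation is cheap. With a normal copula each of the $2^d$ evaluations would itself be a multivariate normal probability, which is where the numerical integration burden comes from.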


We propose the use of mixtures of max-infinitely divisible bivariate copulas, see [Joe96], to derive flexible positive dependence between the random variables. We also fit a multivariate normal copula and discuss the computational problems that arise.

2 Multivariate parametric families of copulas

2.1 Copulas via mixtures of max-infinitely divisible bivariate copulas

Let $\Lambda$ be a univariate cumulative distribution function (cdf) of a positive random variable ($\Lambda(0) = 0$), and let $\phi$ be the Laplace transform (LT) of $\Lambda$,

$$\phi(t) = \int_0^\infty e^{-ts}\, d\Lambda(s), \quad t \ge 0.$$

Mixtures of max-infinitely divisible (maxid) copulas have the form

$$C(u) = \phi\Big( -\sum_{i<j} \log C'_{ij}\big(e^{-p_i \phi^{-1}(u_i)},\, e^{-p_j \phi^{-1}(u_j)}\big) + \sum_{i \cdots}^{m} \cdots$$