Solving problems in parameter redundancy using computer algebra

Journal of Applied Statistics, Vol. 29, Nos. 1-4, 2002, 625-636

E. A. CATCHPOLE¹, B. J. T. MORGAN² & A. VIALLEFONT*³, ¹Australian Defence Force Academy, Australia, ²University of Kent at Canterbury, UK and ³Laboratoire Sabres, Vannes, France

Abstract

A model, involving a particular set of parameters, is said to be parameter redundant when the likelihood can be expressed in terms of a smaller set of parameters. In many important cases, the parameter redundancy of a model can be checked by evaluating the symbolic rank of a derivative matrix. We describe the main results, and show how to construct this matrix using the symbolic algebra package Maple. We apply the theory to examples from the mark-recapture field. General code is given which can be applied to other models.

Correspondence: E. A. Catchpole, School of Mathematics and Statistics, University of New South Wales at ADFA, Canberra ACT 2600, Australia. E-mail: e.catchpole@adfa.edu.au
* Present address: Institut Universitaire de Technologie, Lyon, France.
ISSN 0266-4763 print; 1360-0532 online/02/010625-12 DOI: 10.1080/02664760120108601
© 2002 Taylor & Francis Ltd

1 Introduction

1.1 Background

Let us suppose that a probability model has been proposed for a set of data, and that we intend to fit the model to the data using maximum likelihood. It is often the case that the likelihood surface is maximized on a completely flat ridge or plane, due to a redundancy in the parameter set. As we shall see from the examples later, it can be difficult to gauge whether or not all the parameters can, in principle, be estimated from the data. Areas in which this occurs include compartment modelling (Seber & Wild, 1989, chapter 8), Kalman Filter methodology (Harvey, 1989, p. 205), ion-channel modelling (Chen et al., 1997), directed networks (Geiger et al., 1996; Whiley, 1999), econometrics (Rothenberg, 1971), latent structure models (Goodman, 1974), and in models for the analysis of recovery/recapture data resulting from observations on marked animals (Freeman & Morgan, 1992; Lebreton et al., 1992). If data are missing it may no longer be possible to estimate all of the parameters in a model that is not parameter redundant (Catchpole & Morgan, 2001).

Recent research has shown how, for a wide class of models, established methods of computer algebra may be used to detect which models are parameter redundant (Catchpole & Morgan, 1997), and to determine which parameter combinations are estimable, i.e. have unique maximum-likelihood estimates (Catchpole et al., 1998).

After summarizing existing results for detecting parameter redundancy, we show how the basic problems may be solved in a simple and straightforward manner using the symbolic computation package, Maple. In order to do this, we focus on a single field, namely models for mark-recapture data. However, the procedures given apply equally well to other fields, using straightforward modifications to the Maple code provided here. Evidently, alternative symbolic algebra computer packages may be used.

Consider a data vector $y = (y_1, \ldots, y_n)$ from an exponential family distribution. The models we consider specify the distribution and provide an expression for the mean vector $\mu = E[y]$ in terms of a parameter vector $\theta = (\theta_1, \ldots, \theta_q)$, say. A model is parameter redundant if $\mu$ can be expressed in terms of a parameter vector $\beta = (\beta_1, \ldots, \beta_r)$, with $r < q$. Otherwise it is said to be full rank. The test for parameter redundancy of Catchpole & Morgan (1997) requires the formation of the derivative matrix,

$$A = \left\{ \frac{\partial \mu}{\partial \theta_i} \right\}, \qquad (1)$$

> m := 2: k := 3:
> phi := vector(k): p := vector(k):
> Phi := matrix(m,k,0): P := matrix(m,k,0):
> for i from 1 to m do
>   for j from i to k do
>     Phi[i,j] := phi[j]: P[i,j] := p[j]:
>   od;
> od;

A simpler method of denoting this pure time dependence is to use the parameter index matrices (PIM)

$$\Phi_{pi} = P_{pi} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 3 \end{bmatrix}.$$

This method is used in the MARK package (White & Burnham, 1999) for the analysis of mark-recapture and recovery data. After constructing the matrix Ppi as above, all that is then required is

> Phi := Index2Mat(Ppi,phi): P := Index2Mat(Ppi,p):

where the code for Index2Mat is given in the Appendix. A simpler method still does not use PIM, but instead implements special procedures for particular models. For purely time-dependent parameters, as above, we could use

> Phi := Time(phi): P := Time(p):

using the procedure Time given in the Appendix. The probability matrix $\Omega$ is then constructed in two stages. First, the procedure CumSurviv (cumulative survival) transforms the matrix $\Phi$ into the matrix

$$\mathrm{CumSurviv}(\Phi) = \begin{bmatrix} \phi_1 & \phi_1\phi_2 & \phi_1\phi_2\phi_3 \\ 0 & \phi_2 & \phi_2\phi_3 \end{bmatrix}$$

and CumRecap transforms $P$ into

$$\mathrm{CumRecap}(P) = \begin{bmatrix} p_1 & (1-p_1)p_2 & (1-p_1)(1-p_2)p_3 \\ 0 & p_2 & (1-p_2)p_3 \end{bmatrix}.$$

Secondly, these two matrices are multiplied together, elementwise using pmult, to form $\Omega$. The Maple code for this operation is simply

> Omega := pmult(CumSurviv(Phi), CumRecap(P));

The code for CumSurviv, CumRecap and pmult is also given in the Appendix. It should be clear now how other standard models, incorporating age-dependence, for example, may be similarly programmed, as well as complex models tailored to particular data. The beauty of this use of Maple for constructing $\Omega$ is that it reduces the chance of human error at this stage.
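The two-stage construction described above can be sketched in Maple as follows. This is a minimal illustration only, not the authors' Appendix code: the procedure names Index2MatSketch, CumSurvivSketch, CumRecapSketch and pmultSketch are hypothetical stand-ins, written under the assumption that the Appendix procedures behave as described in the text, and they use the same linalg package as the rest of the paper.

```maple
with(linalg):

# Hypothetical stand-in for Index2Mat: entry (i,j) is v[idx[i,j]],
# or 0 wherever the parameter index matrix holds 0.
Index2MatSketch := proc(idx, v)
  local m, k, M, i, j;
  m := rowdim(idx); k := coldim(idx);
  M := matrix(m, k, 0);
  for i from 1 to m do
    for j from 1 to k do
      if idx[i,j] <> 0 then M[i,j] := v[idx[i,j]]; fi;
    od;
  od;
  eval(M);
end:

# Hypothetical stand-in for CumSurviv: running product along each row,
# starting from the diagonal entry (assumes rowdim <= coldim).
CumSurvivSketch := proc(Phi)
  local m, k, C, i, j;
  m := rowdim(Phi); k := coldim(Phi);
  C := matrix(m, k, 0);
  for i from 1 to m do
    C[i,i] := Phi[i,i];
    for j from i+1 to k do
      C[i,j] := C[i,j-1] * Phi[i,j];
    od;
  od;
  eval(C);
end:

# Hypothetical stand-in for CumRecap: probability of being missed on every
# occasion from i up to j-1, then recaptured on occasion j.
CumRecapSketch := proc(P)
  local m, k, C, i, j, miss;
  m := rowdim(P); k := coldim(P);
  C := matrix(m, k, 0);
  for i from 1 to m do
    miss := 1;
    for j from i to k do
      C[i,j] := miss * P[i,j];
      miss := miss * (1 - P[i,j]);
    od;
  od;
  eval(C);
end:

# Hypothetical stand-in for pmult: elementwise product of two matrices
# of the same dimensions.
pmultSketch := proc(A, B)
  local m, k, C, i, j;
  m := rowdim(A); k := coldim(A);
  C := matrix(m, k, 0);
  for i from 1 to m do
    for j from 1 to k do
      C[i,j] := A[i,j] * B[i,j];
    od;
  od;
  eval(C);
end:

# Usage with the PIM of Example 1 (phi and p left as unassigned names).
Ppi := matrix(2, 3, [1, 2, 3, 0, 2, 3]):
Phi := Index2MatSketch(Ppi, phi):
P := Index2MatSketch(Ppi, p):
Omega := pmultSketch(CumSurvivSketch(Phi), CumRecapSketch(P)):
print(Omega);
```

If the assumptions hold, the printed matrix should agree entry by entry with the elementwise product of CumSurviv(Φ) and CumRecap(P) shown above. Keeping each stage in its own small procedure mirrors the paper's design, in which a new model only requires changing how Phi and P are built.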


3 The derivative matrix

3.1 The rank of A

In order to form the derivative matrix A, we collect the non-zero elements of the probability matrix $\Omega$ into a single vector $\omega$, take logarithms of the elements of $\omega$, and then calculate the symbolic derivatives with respect to the parameters. The rank of the resulting matrix then gives the number of parameters that can be estimated by maximum likelihood. This is done by means of the following commands, with the results shown for the model of Example 1.

> q := vectdim(phi) + vectdim(p);

q := 6

> A := Dmat(Omega,phi,p);

$$A = \begin{bmatrix}
\frac{1}{\phi_1} & \frac{1}{\phi_1} & \frac{1}{\phi_1} & 0 & 0 \\
0 & \frac{1}{\phi_2} & \frac{1}{\phi_2} & \frac{1}{\phi_2} & \frac{1}{\phi_2} \\
0 & 0 & \frac{1}{\phi_3} & 0 & \frac{1}{\phi_3} \\
\frac{1}{p_1} & -\frac{1}{1-p_1} & -\frac{1}{1-p_1} & 0 & 0 \\
0 & \frac{1}{p_2} & -\frac{1}{1-p_2} & \frac{1}{p_2} & -\frac{1}{1-p_2} \\
0 & 0 & \frac{1}{p_3} & 0 & \frac{1}{p_3}
\end{bmatrix}$$

> with(linalg): rank(A);

4

Note that in this small example we have illustrated the derivative matrix A. This will not normally be shown. The code for Dmat is given in the Appendix. Since rank(A) = 4 is less than the number of parameters q = 6, the model is parameter redundant, from Catchpole & Morgan (1997). Since the deficiency, defined as q − rank(A), is 2 rather than 1, it is not just the elements of the combination $\phi_3 p_3$ that are non-estimable: the rank shows that there are only four independent theoretically estimable combinations of the six parameters $\phi_1, \phi_2, \phi_3, p_1, p_2, p_3$, from Catchpole et al. (1998).
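The rank test of this section can be sketched end to end as follows. The procedure DmatSketch is a hypothetical stand-in for Dmat, not the authors' Appendix code: it assumes the parameters are passed as a single list, that rows of the result correspond to parameters, and that columns correspond to the non-zero entries of the probability matrix taken row by row. The probability matrix of Example 1 is typed in directly so that the block stands alone.

```maple
with(linalg):

# Hypothetical stand-in for Dmat: differentiate the log of each non-zero
# entry of Omega with respect to each parameter in the list 'pars'.
DmatSketch := proc(Omega, pars)
  local terms, i, j, A;
  terms := [];
  for i from 1 to rowdim(Omega) do
    for j from 1 to coldim(Omega) do
      if Omega[i,j] <> 0 then
        terms := [op(terms), ln(Omega[i,j])];
      fi;
    od;
  od;
  A := matrix(nops(pars), nops(terms), 0);
  for i from 1 to nops(pars) do
    for j from 1 to nops(terms) do
      A[i,j] := diff(terms[j], pars[i]);
    od;
  od;
  eval(A);
end:

# Probability matrix of Example 1, with phi and p as unassigned names.
Omega := matrix(2, 3, [
  phi[1]*p[1],
  phi[1]*phi[2]*(1-p[1])*p[2],
  phi[1]*phi[2]*phi[3]*(1-p[1])*(1-p[2])*p[3],
  0,
  phi[2]*p[2],
  phi[2]*phi[3]*(1-p[2])*p[3] ]):

pars := [phi[1], phi[2], phi[3], p[1], p[2], p[3]]:
A := DmatSketch(Omega, pars):
r := rank(A);            # symbolic rank of the derivative matrix
d := nops(pars) - r;     # deficiency, q - rank(A)
```

Under these assumptions the sketch should reproduce the rank of 4 and the deficiency of 2 reported above for Example 1. Passing the parameters as one list is a design choice made here for brevity; the paper's Dmat instead takes the parameter vectors as separate arguments.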

3.2 Theoretically estimable parameters

When a model is parameter redundant, it is important to know which, if any, of the parameters are theoretically estimable. To discover this, we need to consider the general solution vector $\alpha$ to (2), as explained in Catchpole et al. (1998). This is accomplished by the instructions below.

> zero := vector(coldim(A),0):
> alpha := linsolve(transpose(A),zero,'r',t);

$$\alpha := \left[\, 0,\ \frac{\phi_2\,(-t_1 p_3 + t_1 p_3 p_2 - \phi_3 t_2 + t_2 \phi_3 p_2)}{p_3 \phi_3},\ t_1,\ 0,\ -\frac{p_2\,(-t_1 p_3 + t_1 p_3 p_2 - \phi_3 t_2 + t_2 \phi_3 p_2)}{p_3 \phi_3},\ t_2 \,\right]$$

This form of the linsolve command assigns rank(A) to the variable r and uses t for any unknown constants. The vector $\alpha$ has two arbitrary constants, $t_1$ and $t_2$, since the model has deficiency 2. However $\alpha$ has zero entries in positions corresponding to the rows in A resulting from taking derivatives with respect to the parameters $\phi_1$ and $p_1$. By Catchpole et al. (1998), therefore, $\phi_1$ and $p_1$ are theoretically estimable, but none of the other parameters are.

There must exist two more independent theoretically estimable combinations of the parameters, since rank(A) = 4. These can be found, as explained in Catchpole et al. (1998), by solving the set of linear first-order partial differential equations

$$\sum_{s=1}^{q} \alpha_{s,j}\, \frac{\partial f}{\partial \theta_s} = 0, \qquad j = 1, \ldots, d \qquad (3)$$

where in this case there are q = 6 parameters and the deficiency is d = 2. Here we are denoting by $\alpha_1$ and $\alpha_2$ the independent solutions of (2) formed by taking $t_2 = 0$ and $t_1 = 0$ respectively, and letting $\alpha_{s,j}$ be the sth component of $\alpha_j$. The pair of equations (3) can be solved using Maple, although we do not show the code here (see Gimenez, 2001).

In the example above, we can in fact see by inspection that the matrix $\Omega$ can be written in terms of $\phi_1$, $p_1$, $\phi_2 p_2$ and $\phi_2\phi_3(1-p_2)p_3$. Since these are clearly independent parameter combinations, in the sense that no one can be obtained from the others, they must be the four independent theoretically estimable parameter combinations. For more complex examples, identification of theoretically estimable parameter combinations by inspection is likely to be more difficult than here.
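As a short check on the inspection argument, each of the four combinations can be verified against (3) by hand. The working below is a sketch added for illustration; the subscripted notation $\alpha_{\phi_2}$, $\alpha_{p_2}$, and so on, for the component of $\alpha$ belonging to each parameter, is introduced here and is not the paper's. It uses only the fact that $\alpha$ solves the system passed to linsolve, so that $\alpha$ is orthogonal to every column of A, in particular to the columns arising from $\Omega_{2,2} = \phi_2 p_2$ and $\Omega_{2,3} = \phi_2\phi_3(1-p_2)p_3$.

```latex
% Constraints on alpha from the columns of A for Omega_{2,2} and Omega_{2,3}:
\frac{\alpha_{\phi_2}}{\phi_2} + \frac{\alpha_{p_2}}{p_2} = 0,
\qquad
\frac{\alpha_{\phi_2}}{\phi_2} + \frac{\alpha_{\phi_3}}{\phi_3}
  - \frac{\alpha_{p_2}}{1-p_2} + \frac{\alpha_{p_3}}{p_3} = 0 .

% Check of (3) for f = \phi_2 p_2:
\sum_{s} \alpha_s \frac{\partial f}{\partial \theta_s}
  = \alpha_{\phi_2}\, p_2 + \alpha_{p_2}\, \phi_2
  = \phi_2 p_2 \left( \frac{\alpha_{\phi_2}}{\phi_2} + \frac{\alpha_{p_2}}{p_2} \right)
  = 0 .

% Check of (3) for f = \phi_2 \phi_3 (1-p_2) p_3:
\sum_{s} \alpha_s \frac{\partial f}{\partial \theta_s}
  = f \left( \frac{\alpha_{\phi_2}}{\phi_2} + \frac{\alpha_{\phi_3}}{\phi_3}
      - \frac{\alpha_{p_2}}{1-p_2} + \frac{\alpha_{p_3}}{p_3} \right)
  = 0 .

% For f = \phi_1 and f = p_1 the sum reduces to \alpha_{\phi_1} and \alpha_{p_1},
% both of which are zero.
```

Because these identities hold for every solution $\alpha$, they hold in particular for $\alpha_1$ and $\alpha_2$, which is what (3) requires.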

3.3 Missing data

Although a model may be full rank, missing data may render certain parameters inestimable in practice in any particular application (Catchpole & Morgan, 2001). Suppose for example, in Example 1, no animals were recaptured from the first cohort of marked animals in year 3 of the study. This gap in the m-array results in the element $\Omega_{1,3}$ not appearing in the likelihood, and the derivative matrix must be amended so that this element is omitted. No other changes are required in the code, except that it now becomes essential to add an extra column to $\Omega$ to incorporate probabilities for animals not recaptured at all during the study, so that each row becomes a full multinomial distribution (see Catchpole & Morgan, 2001). The Maple code for this illustration is shown below.

> X := matrix(m,1):
> for i to m do X[i,1] := 1 - sum(Omega[i,j],j=1..k) od:
> Omega_X := augment(Omega,X):
> Omega_X[1,3] := 0: print(Omega_X);

$$\begin{bmatrix}
\phi_1 p_1 & \phi_1\phi_2(1-p_1)p_2 & 0 & 1 - \phi_1 p_1 - \phi_1\phi_2(1-p_1)p_2 - \phi_1\phi_2\phi_3(1-p_1)(1-p_2)p_3 \\
0 & \phi_2 p_2 & \phi_2\phi_3(1-p_2)p_3 & 1 - \phi_2 p_2 - \phi_2\phi_3(1-p_2)p_3
\end{bmatrix}$$

> A_X := Dmat(Omega_X,phi,p):
> rank(A_X);

4

The rank of A is unchanged, so that in this case the missing data do not affect the parameter redundancy.

4 Example 2: two groups of animals

Consider now a situation in which there are two groups of animals (e.g. two sexes), with time-dependent survival and recapture probabilities, as in Example 1, but with the survival and recapture probabilities of group 2 being constant multiples of those for group 1. Thus, the survival probabilities for groups 1 and 2 can be written as $\Phi_1$ and $\Phi_2$, where $\Phi_1$ is as in Example 1 and $\Phi_2 = a\Phi_1$, for some constant a; and similarly for the recapture probabilities we have $P_2 = bP_1$, for some constant b. In the notation of Lebreton et al. (1992), the models for both survival and recapture are of the form 'time + group', since the group effect is additive on a logarithmic scale. In examples such as this, it is convenient to have a separate probability matrix $\Omega$ for each group. To construct the overall derivative matrix, it is then sufficient to arrange these probability matrices side-by-side, using the Maple augment command. The code required in this illustration is:

> m := 2: k := 3: phi := vector(k): p := vector(k): a := 'a': b := 'b':
> Phi1 := Time(phi); # survival for group 1

$$\Phi_1 := \begin{bmatrix} \phi_1 & \phi_2 & \phi_3 \\ 0 & \phi_2 & \phi_3 \end{bmatrix}$$

> Phi2 := evalm(a * Phi1); # survival for group 2

$$\Phi_2 := \begin{bmatrix} a\phi_1 & a\phi_2 & a\phi_3 \\ 0 & a\phi_2 & a\phi_3 \end{bmatrix}$$

> P1 := Time(p): P2 := evalm(b * P1):
> Omega1 := pmult(CumSurviv(Phi1), CumRecap(P1)):
> Omega2 := pmult(CumSurviv(Phi2), CumRecap(P2)):
> Omega := augment(Omega1,Omega2):
> q := vectdim(phi) + vectdim(p) + 2;

q := 8

> A := Dmat(Omega,phi,p,a,b): rank(A);

7

> zero := vector(coldim(A),0):
> alpha := linsolve(transpose(A), zero, 'r',t);

$$\alpha := \left[\, 0,\ 0,\ -\frac{t_1 \phi_3}{p_3},\ 0,\ 0,\ t_1,\ 0,\ 0 \,\right]$$

The model is therefore parameter redundant, since rank(A) = 7 is less than the number of parameters, q = 8. Unlike Example 1, the deficiency is now only 1. Furthermore, since $\alpha$ has zeros in every position except those corresponding to $\phi_3$ and $p_3$, all parameters except these two are estimable. It can be seen by inspection that the other estimable parameter combination is the product $\phi_3 p_3$.

An interesting aspect of this model is that if the group effect is additive on any other scale, such as the logistic for example, then the model is not parameter redundant (Viallefont, 1995, ch. 3). We hypothesize that in such a case the model will be 'near-redundant', using the terminology of Catchpole et al. (2001), and may result in certain parameters being estimated with low precision. Near-redundant models are not parameter redundant. However, they may provide poor estimates of some model parameters, as a result of having small eigenvalues of the information matrix. Catchpole et al. (2001) suggest that the numerical procedure of Viallefont et al. (1998) is then needed. Choquet (2001) and Gimenez (2001) provide applications to multi-state mark-recapture models.
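The statement in Example 2 that $\phi_3 p_3$ is estimable can be checked directly against (3), using the solution vector returned by linsolve. The working below is a sketch added for illustration, with the deficiency d = 1 so that there is a single equation to verify; only the $\phi_3$ and $p_3$ components of $\alpha$ are non-zero.

```latex
% alpha = [0, 0, -t_1 \phi_3 / p_3, 0, 0, t_1, 0, 0], with parameters ordered
% (\phi_1, \phi_2, \phi_3, p_1, p_2, p_3, a, b), and f = \phi_3 p_3:
\sum_{s=1}^{8} \alpha_s \frac{\partial f}{\partial \theta_s}
  = \left( -\frac{t_1 \phi_3}{p_3} \right) \frac{\partial (\phi_3 p_3)}{\partial \phi_3}
    + t_1 \, \frac{\partial (\phi_3 p_3)}{\partial p_3}
  = -\frac{t_1 \phi_3}{p_3}\, p_3 + t_1 \phi_3
  = 0 .
```

By contrast, taking $f = \phi_3$ alone gives $-t_1\phi_3/p_3 \neq 0$, which is consistent with $\phi_3$ not being estimable on its own.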

5 Discussion

In the examples we have given, we have used very small studies, with m = 2 years of marking and k = 3 years of recaptures. We have done this purely for illustration. There are, in principle, no problems in using Maple on much larger problems. Focusing on a derivative matrix, rather than an expected Hessian matrix, greatly enhances the speed of the Maple procedures. Extension theorems, mentioned below, quite often result in Maple only being needed for 'small' examples of models of a given structure. Furthermore we have also, for pedagogical reasons, kept the models considered to be fairly simple. However, the methods given can very easily be applied to test quite complicated models, including for example all those considered by Lebreton et al. (1992).

Any results obtained by the methods described above are for a fixed size of study only. It would clearly be beneficial to be able to draw general conclusions about all studies where the survival and recapture models are of a particular type (e.g. both purely time-dependent). In other words, we would like to extrapolate from the results for a particular m and k to general m and k. Catchpole & Morgan (1997) give such an extension theorem for the full rank case; that is, they show that, under suitable conditions, if a model is full rank for a small study then it will remain full rank for a larger study. The corresponding theory for the parameter-redundant case, showing when the deficiency is preserved, is given in Catchpole & Morgan (2001). An illustration is provided by Catchpole & Morgan (2001), in which it is supposed that in the Cormack-Jolly-Seber model there is an immediate effect of capture, resulting in a change in the probability of capture that extends to the following year only. It is shown that when m = k > 4, the model has deficiency 1. Additional examples are provided by Kgosi (2000). Currently, little is known for certain when simple models are extended to account for age and/or capture effects.


The procedures of this paper can be used with confidence to provide definite statements regarding parameter-redundancy, and for particular data sets to investigate whether missing data values may modify the parameter-redundancy.

The use of Maple to calculate the symbolic rank of the derivative matrix relies on this matrix being a rational function of the parameters. Although this covers many examples of practical interest, it does not cover all cases. One important exception would be where parameters appear in a non-linear way, for example in describing the dependence of survival on a covariate. The possibility of using computer algebra in such situations is under current investigation. For the moment, it is necessary in such cases to revert to numerical methods, as in Viallefont et al. (1998).

REFERENCES

Burnham, K. P. & Anderson, D. R. (1998) Model Selection and Inference. A Practical Information-theoretic Approach (New York, Springer).
Burnham, K. P., Anderson, D. R., White, G. C., Brownie, C. & Pollock, K. H. (1987) Design and analysis methods for fish survival experiments based on release-recapture. American Fisheries Society, Monograph 5, Bethesda, Maryland.
Catchpole, E. A., Freeman, S. N. & Morgan, B. J. T. (1996) Steps to parameter redundancy in age-dependent recovery models, Journal of the Royal Statistical Society B, 58, pp. 763-774.
Catchpole, E. A., Kgosi, P. M. & Morgan, B. J. T. (2001) On the near-singularity of models for animal recovery data, Biometrics, 57, pp. 720-726.
Catchpole, E. A. & Morgan, B. J. T. (1997) Detecting parameter redundancy, Biometrika, 84, pp. 187-196.
Catchpole, E. A. & Morgan, B. J. T. (2001) Deficiency of parameter-redundant models, Biometrika, 88, pp. 593-598.
Catchpole, E. A., Morgan, B. J. T. & Freeman, S. N. (1998) Estimation in parameter redundant models, Biometrika, 85, pp. 462-468.
Chen, D., Lear, J. & Eisenberg, B. (1997) Permeation through an open channel: Poisson-Nernst-Planck theory of a synthetic ionic channel, Biophysical Journal, 72, pp. 97-116.
Choquet, R. (2001) Computing easily the rank of multi-state capture-recapture models.
Cormack, R. M. (1964) Estimates of survival from the sighting of marked animals, Biometrika, 51, pp. 429-438.
Freeman, S. N. & Morgan, B. J. T. (1992) A modelling strategy for recovery data from birds ringed as nestlings, Biometrics, 48, pp. 217-236.
Geiger, D., Heckerman, D. & Meek, C. (1996) Asymptotic model selection for directed networks with hidden variables. Technical Report MSR-TR-96-07, Microsoft Research, Redmond WA 98052, USA.
Gimenez, O. (2001) Parameter redundancy for multistate capture-recapture models.
Goodman, L. A. (1974) Exploratory latent structure analysis using both identifiable and unidentifiable models, Biometrika, 61, pp. 215-231.
Harvey, A. C. (1989) Forecasting, Structural Time Series and the Kalman Filter (Cambridge University Press).
Jolly, G. M. (1965) Explicit estimates from capture-recapture data with both death and immigration-stochastic models, Biometrika, 52, pp. 225-247.
Kgosi, P. M. (2000) Models for avian demography. PhD thesis, University of Kent at Canterbury. Unpublished.
Lebreton, J.-D., Burnham, K. P., Clobert, J. & Anderson, D. R. (1992) Modeling survival and testing biological hypotheses using marked animals: a unified approach with case studies, Ecological Monographs, 62, pp. 67-118.
Rothenberg, T. J. (1971) Identification in parametric models, Econometrica, 39, pp. 577-591.
Seber, G. A. F. (1965) A note on the multiple recapture census, Biometrika, 52, pp. 249-259.
Seber, G. A. F. & Wild, C. J. (1989) Nonlinear Regression (New York, Wiley).
Viallefont, A. (1995) Robustesse et flexibilité des analyses démographiques par capture-recapture: de l'estimation de la survie à la détection de compromis évolutifs. PhD thesis, University of Montpellier II. Unpublished.
Viallefont, A., Lebreton, J.-D., Reboulet, A.-M. & Gory, G. (1998) Parametric identifiability and model selection in capture-recapture models: a numerical approach, Biometrical Journal, 40, pp. 313-325.
Whiley, M. (1999) Aspects of the interface between statistics and neural networks. PhD thesis. University of Glasgow. Unpublished.
White, G. C. & Burnham, K. P. (1999) Program MARK: survival estimation from populations of marked animals, Bird Study, 46 (suppl.), pp. 120-139.

Appendix: Maple procedures

The following code is available from www.ma.adfa.edu.au/~eac/Redundancy/Maple.

A1 Index2Mat

A2 Time

A3 CumSurviv

A4 CumRecap

A5 pmult

A6 Dmat
