Modelling asymmetrically distributed circular data using the wrapped skew-normal distribution

ARTHUR PEWSEY
Departamento de Matemáticas, Escuela Politécnica, Universidad de Extremadura, 10071 Cáceres, Spain
E-mail: [email protected]

We consider problems of inference for the wrapped skew-normal distribution on the circle. A centered parametrization of the distribution is introduced, and simulation used to compare the performance of method of moments and maximum likelihood estimation for its parameters. Maximum likelihood estimation is shown, in general, to be superior. The operating characteristics of two moment based tests, for wrapped normal and wrapped half-normal parent populations, respectively, are also explored. The former test is easy to apply, maintains the nominal significance level well and is generally highly powerful. The latter test does not hold the nominal significance level so well, although it is very powerful against negatively skew alternatives. Likelihood based tests for the two distributions are also discussed. A real data set from the ornithological literature is used to illustrate the application of the developed methodology and its extension to finite mixture modelling.

Keywords: bird migration headings, finite mixture modelling, point estimation, tests of hypothesis, wrapped normal distribution, wrapped half-normal distribution


1. Introduction

Circular data arise in scientific disciplines as diverse as biology, geology, medicine, meteorology and psychology, with studies into the directions of movement, or orientation, of biological organisms providing a particularly rich source. The texts of Mardia (1972), Batschelet (1981), Fisher (1993), Mardia and Jupp (1999) and Jammalamadaka and SenGupta (2001) provide introductions to the statistical methodology available for analyzing circular data which reflect the varying requirements of these disparate disciplines. Curiously, although symmetrically distributed circular data are comparatively rare (Mardia, 1972, p. 10), fewer than ten pages amongst the total of over 1400 pages making up these five principal texts are devoted to asymmetric models.

In a move to redress this disjuncture between established theory and the implicit statistical demands raised by real data, Pewsey (2000a) proposed the wrapped skew-normal distribution on the circle (henceforth, the WSNC distribution) as a potential model for asymmetrically distributed circular data. The emphasis there was on moment based methods of inference. In this paper we consider those methods in greater detail and contrast their performance with that of likelihood based techniques.

The remainder of the paper is organized as follows. In Section 2 we review the genesis of the WSNC distribution and discuss its ‘direct’ and ‘centered’ parametrizations. The method of moments and maximum likelihood approaches to estimation are discussed and contrasted in Section 3. In Section 4 we consider the operating characteristics of the large-sample tests of Pewsey (2000a) for parent wrapped normal and wrapped half-normal distributions. We also discuss likelihood based alternatives to the two tests. In Section 5 we illustrate the use of the developed methodology, as well as its extension to finite mixture modelling, by analyzing data on the directions of flight of migrating birds. The conclusions drawn from this work are discussed in Section 6.

2. Parametrizations

2.1 Direct parametrization

Let $X \sim SN(\lambda)$ denote that the linear random variable $X$ is distributed according to the standard skew-normal distribution with skewness parameter $\lambda$ (Azzalini, 1985). Then $X$ has density

$$f(x; \lambda) = 2\phi(x)\Phi(\lambda x), \qquad -\infty < x < \infty,\ -\infty < \lambda < \infty,$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ denote the standard normal density and distribution functions, respectively. Introducing location and scale parameters, $Y_D = \xi + \eta X$ is a skew-normal random variable with “direct” parameters $\xi$, $\eta$ and $\lambda$ and density

$$f(y; \xi, \eta, \lambda) = \frac{2}{\eta}\,\phi\!\left(\frac{y-\xi}{\eta}\right)\Phi\!\left\{\lambda\left(\frac{y-\xi}{\eta}\right)\right\}, \tag{1}$$

$-\infty < y < \infty$, $-\infty < \xi < \infty$, $\eta > 0$, $-\infty < \lambda < \infty$. Following the notation used in Pewsey (2000b), we denote this distributional relationship as $Y_D \sim SN_D(\xi, \eta, \lambda)$. Throughout the paper, a “D” subscript as employed here refers to the use of the direct parametrization. From (1), the circular random variable $\Theta_D = Y_D \pmod{2\pi}$, corresponding to wrapping $Y_D$ onto the unit circle, has density

$$f(\theta; \xi, \eta, \lambda) = \frac{2}{\eta}\sum_{r=-\infty}^{\infty}\phi\!\left(\frac{\theta + 2\pi r - \xi}{\eta}\right)\Phi\!\left\{\lambda\left(\frac{\theta + 2\pi r - \xi}{\eta}\right)\right\}, \tag{2}$$

$0 \le \theta < 2\pi$. Using the terminology of Pewsey (2000a), we will say that $\Theta_D$ has a WSNC distribution with direct parameters $(\xi, \eta, \lambda)$, denoted $\Theta_D \sim WSNC_D(\xi, \eta, \lambda)$. The fundamental properties of $\Theta_D$ are derived in Pewsey (2000a).

2.2 Centered parametrization

Returning to a consideration of the linear random variable $Y_D$, it is known that the direct parametrization is parameter redundant for the important case of the normal distribution. This case corresponds to a $\lambda$-value of 0. The implications of parameter redundancy are discussed in Azzalini (1985), Azzalini and Capitanio (1999) and Pewsey (2000b). Most significantly, there is no unique solution to the likelihood equations when the parent population is normal. As a means of resolving this problem, Azzalini (1985) introduced the “centered” parametrization of the distribution. Under this alternative parametrization,

$$Y_C = \mu + \sigma\left\{\frac{X - E(X)}{\sqrt{\operatorname{var}(X)}}\right\}, \qquad -\infty < \mu < \infty,\ \sigma > 0,$$

is a skew-normal random variable with mean $\mu$, standard deviation $\sigma$ and coefficient of skewness $\gamma_1$, where $-0.99527 < \gamma_1 < 0.99527$. We will denote this relation by $Y_C \sim SN_C(\mu, \sigma, \gamma_1)$, the “C” subscript indicating here, and in subsequent uses of it, that the centered parametrization is being referred to. The direct parameters are related to the centered ones according to

$$\xi = \mu - c\gamma_1^{1/3}\sigma, \qquad \eta = \sigma\sqrt{1 + c^2\gamma_1^{2/3}}, \qquad \lambda = \frac{c\gamma_1^{1/3}}{\sqrt{b^2 + c^2(b^2 - 1)\gamma_1^{2/3}}}, \tag{3}$$

where $b = \sqrt{2/\pi}$ and $c = \{2/(4 - \pi)\}^{1/3}$. Using these relations in (1), the density of $Y_C$ is given by

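$$f(y; \mu, \sigma, \gamma_1) = \frac{2}{\sigma\sqrt{1 + c^2\gamma_1^{2/3}}}\;\phi\!\left[\frac{1}{\sqrt{1 + c^2\gamma_1^{2/3}}}\left(\frac{y-\mu}{\sigma} + c\gamma_1^{1/3}\right)\right] \times \Phi\!\left[\frac{c\gamma_1^{1/3}}{\sqrt{\{b^2 + c^2(b^2-1)\gamma_1^{2/3}\}(1 + c^2\gamma_1^{2/3})}}\left(\frac{y-\mu}{\sigma} + c\gamma_1^{1/3}\right)\right].$$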

Similar problems to those described above for the linear random variable $Y_D$ occur for the circular random variable $\Theta_D$ when the parent population is wrapped normal (corresponding to setting $\lambda$ equal to 0 in (2)). To avert such problems, it is again advisable to make use of the centered parametrization. Wrapping $Y_C$ onto the unit circle, $\Theta_C$ is a circular random variable distributed according to the WSNC distribution with centered parameters and density

$$f(\theta; \mu, \sigma, \gamma_1) = \frac{2}{\sigma\sqrt{1 + c^2\gamma_1^{2/3}}}\sum_{r=-\infty}^{\infty}\phi\!\left[\frac{1}{\sqrt{1 + c^2\gamma_1^{2/3}}}\left(\frac{\theta + 2\pi r-\mu}{\sigma} + c\gamma_1^{1/3}\right)\right] \times \Phi\!\left[\frac{c\gamma_1^{1/3}}{\sqrt{\{b^2 + c^2(b^2-1)\gamma_1^{2/3}\}(1 + c^2\gamma_1^{2/3})}}\left(\frac{\theta + 2\pi r-\mu}{\sigma} + c\gamma_1^{1/3}\right)\right], \tag{4}$$

for $0 \le \theta < 2\pi$. We will denote this distributional relation as $\Theta_C \sim WSNC_C(\mu, \sigma, \gamma_1)$.

There are five important limiting cases of the WSNC distribution. As $\eta \to 0$ ($\sigma \to 0$), the distribution tends to a point distribution whereas, as $\eta \to \infty$ ($\sigma \to \infty$), the limiting distribution is the circular uniform. A point distribution is easily identified, and many procedures are available for testing for the latter; see, for example, Mardia and Jupp (1999, Section 6.3). As noted previously, the wrapped normal distribution is obtained when $\lambda = 0$ ($\gamma_1 = 0$). As $\lambda \to \pm\infty$ ($\gamma_1 \to \pm 0.99527$), the distribution tends to the (negative) wrapped half-normal distribution. Tests for wrapped normal and wrapped half-normal distributions within the WSNC class are discussed in Section 4.
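The conversion (3) and the wrapped densities (2) and (4) can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's FORTRAN code: the function names are hypothetical, and the infinite sum over r is assumed to be truncated to the terms r = -k, ..., k, with k chosen by the user.

```python
import math

B = math.sqrt(2.0 / math.pi)                 # b = sqrt(2/pi)
C = (2.0 / (4.0 - math.pi)) ** (1.0 / 3.0)   # c = {2/(4 - pi)}^(1/3)


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def centered_to_direct(mu, sigma, gamma1):
    """Map centered (mu, sigma, gamma1) to direct (xi, eta, lam) using (3).

    Requires |gamma1| < 0.99527, otherwise the square root for lam fails.
    """
    # t = c * gamma1^(1/3), using a real cube root for negative gamma1
    t = C * math.copysign(abs(gamma1) ** (1.0 / 3.0), gamma1)
    xi = mu - t * sigma
    eta = sigma * math.sqrt(1.0 + t * t)
    lam = t / math.sqrt(B * B + (B * B - 1.0) * t * t)
    return xi, eta, lam


def wsnc_pdf_direct(theta, xi, eta, lam, k=3):
    """Density (2), with the sum truncated to r = -k, ..., k (an assumption)."""
    total = 0.0
    for r in range(-k, k + 1):
        z = (theta + 2.0 * math.pi * r - xi) / eta
        total += phi(z) * Phi(lam * z)
    return 2.0 * total / eta


def wsnc_pdf_centered(theta, mu, sigma, gamma1, k=3):
    """Density (4), evaluated by converting to the direct parameters."""
    xi, eta, lam = centered_to_direct(mu, sigma, gamma1)
    return wsnc_pdf_direct(theta, xi, eta, lam, k)
```

Setting `gamma1 = 0` gives `lam = 0`, `xi = mu` and `eta = sigma`, so the sketch reduces to the wrapped normal density, in line with the limiting case described above.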

3. Point estimation

In this section we consider maximum likelihood (ML) estimation for the parameters of the WSNC distribution and contrast the results produced by it with those for method of moments (MM) estimation. As we shall show, although ML point estimation is computationally more demanding, its properties are generally superior to those for MM estimation. Likelihood based inference also has the appeal that it can readily be extended to address problems such as hypothesis testing, confidence set construction and finite mixture modelling, as we will demonstrate in Sections 4.2 and 5.

3.1 Method of moments

Method of moments estimation for the WSNC distribution is discussed in Pewsey (2000a). In order to overcome problems of instability encountered when solving the moment equations, Pewsey (2000a) makes use of the “circular” parametrization. This third parametrization of the distribution is a circular analogue of the centered parametrization and is based on the use of the mean direction, mean resultant length and second central sine moment. These are standard measures used to represent the central location, concentration and skewness of a circular distribution. Moment based point estimation of the circular parameters is trivial and the conversion from the circular to the direct parameters, although requiring numerical integration and non-linear equation solving techniques, is otherwise not computationally involved. We refer the reader to Pewsey (2000a) for the details. If required, estimates for the centered parameters can be calculated from those for the direct parameters using the relations

$$\mu \pmod{2\pi} = \xi \pmod{2\pi} + b\eta\delta, \qquad \sigma = \eta\sqrt{1 - b^2\delta^2}, \qquad \gamma_1 = \frac{b\delta^3(2b^2 - 1)}{(1 - b^2\delta^2)^{3/2}}, \tag{5}$$

where $\delta = \lambda/\sqrt{1 + \lambda^2} \in (-1, 1)$. The MM estimates of the circular parameters of the WSNC distribution are always ‘admissible’. By this we mean that they always lie within the appropriate ranges associated with the parameters. However, if these estimates are transformed to obtain estimates of the direct or centered parameters, it is possible that the estimates obtained for the skewness parameters are inadmissible. Specifically, estimates of $|\delta|$ ($|\gamma_1|$) greater than 1 (0.99527) can occur.

3.2 Maximum likelihood

From (4), the log-likelihood for a random sample of size $n$, $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_n)$, from the WSNC distribution with centered parameters $\mu$, $\sigma$ and $\gamma_1$ is given by

$$\begin{aligned} l(\mu, \sigma, \gamma_1; \boldsymbol{\theta}) = {} & n\log 2 - n\log\sigma - \frac{n}{2}\log(1 + c^2\gamma_1^{2/3}) \\ & + \sum_{i=1}^{n}\log\sum_{r=-\infty}^{\infty}\phi\!\left[\frac{1}{\sqrt{1 + c^2\gamma_1^{2/3}}}\left(\frac{\theta_i + 2\pi r - \mu}{\sigma} + c\gamma_1^{1/3}\right)\right] \\ & \qquad \times \Phi\!\left[\frac{c\gamma_1^{1/3}}{\sqrt{\{b^2 + c^2(b^2-1)\gamma_1^{2/3}\}(1 + c^2\gamma_1^{2/3})}}\left(\frac{\theta_i + 2\pi r - \mu}{\sigma} + c\gamma_1^{1/3}\right)\right]. \end{aligned} \tag{6}$$
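As an illustration of how (6) might be evaluated in practice, the following Python sketch computes the log-likelihood under the assumption that the inner sum is truncated to the central terms r = -k, ..., k (k = 1 gives three terms, k = 3 gives seven). The function name and its arguments are hypothetical; the negative of this function could be handed to any simplex optimizer, which is not shown here.

```python
import math

B = math.sqrt(2.0 / math.pi)                 # b = sqrt(2/pi)
C = (2.0 / (4.0 - math.pi)) ** (1.0 / 3.0)   # c = {2/(4 - pi)}^(1/3)


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def wsnc_loglik(thetas, mu, sigma, gamma1, k=1):
    """Log-likelihood (6) with the sum over r truncated to -k..k.

    Assumes sigma > 0 and |gamma1| < 0.99527 (interior of the parameter space).
    """
    t = C * math.copysign(abs(gamma1) ** (1.0 / 3.0), gamma1)  # c*gamma1^(1/3)
    s = math.sqrt(1.0 + t * t)                                 # sqrt(1 + c^2 gamma1^(2/3))
    lam = t / math.sqrt(B * B + (B * B - 1.0) * t * t)         # lambda from (3)
    n = len(thetas)
    ll = n * math.log(2.0) - n * math.log(sigma) - 0.5 * n * math.log(1.0 + t * t)
    for th in thetas:
        inner = 0.0
        for r in range(-k, k + 1):
            z = ((th + 2.0 * math.pi * r - mu) / sigma + t) / s
            inner += phi(z) * Phi(lam * z)
        ll += math.log(inner)
    return ll
```

Comparing the values returned for `k=1` and `k=3` on a given sample is a simple way to check whether three central terms are enough for that sample.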

Numerical methods of optimization must be employed in order to maximize this log-likelihood. In practice, the infinite summation in (6) can be reduced to include just a small number of its central terms. The three central terms suffice for most applications, whilst the first seven central terms are more than adequate for data sampled from even the most dispersed of parent populations. The score equations and observed and expected information matrices can be derived from (6), although the results obtained provide little theoretical insight.

In order to maximize (6) we recommend the use of the Nelder-Mead simplex method (Nelder and Mead, 1965) combined with a grid of starting values. The use of a grid of initial values rather than, say, the estimates obtained from method of moments estimation as the sole starting values, is advisable as, firstly, the MM estimates of the centered parameters are not necessarily good starting values and, secondly, because multiple maxima can occur on the log-likelihood surface. During optimization it is necessary to check that the maximum of the log-likelihood function does not correspond to a wrapped half-normal distribution, i.e., that the maximum likelihood solution is not a point on the boundary of the parameter space. The occurrence of boundary ML estimates for the skewness parameter of the skew-normal distribution is well-documented; see Azzalini (1985), Azzalini and Capitanio (1999) and Pewsey (2000b). By definition, the problem of inadmissible ML estimates does not arise.

Having obtained ML estimates of the centered parameters, they can be transformed to obtain estimates of the direct or circular parameters, as required. Estimates of the direct parameters, $\xi \pmod{2\pi}$, $\eta$ and $\lambda$, can be calculated from the centered ones, $\mu \pmod{2\pi}$, $\sigma$ and $\gamma_1$, using the relations in (3). If required, estimates for the circular parameters can be derived from the estimates of the direct parameters using the relations in Pewsey (2000a).

3.3 Monte Carlo comparison of MM and ML estimation

In order to compare the small sample performance of MM and ML estimation, we performed a Monte Carlo experiment for a range of $(n, \rho, \lambda)$ combinations. For each such combination, 3000 random samples of size $n$ were simulated from the WSNC distribution with $\xi = 0$, mean resultant length $\rho$ and skewness parameter $\lambda$. The relation between $\rho$ and the direct parameters $\eta$ and $\lambda$ is given in Pewsey (2000a). We considered sample sizes of 20, 50 and 100; $\rho$-values of 0.3(0.2)0.9 and $\lambda$-values of 0, 2, 5 and 20. A $\lambda$-value of 0 gives the symmetric wrapped normal distribution, whereas the other three $\lambda$-values correspond to relatively symmetric, relatively asymmetric and highly asymmetric cases of the WSNC class, respectively. The pseudo-random variates from the WSNC distribution were simulated using the obvious adaptation of the algorithm of Henze (1986) for simulating skew-normal variates.
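One plausible reading of that adaptation is sketched below in Python. It assumes the standard representation attributed to Henze (1986), namely that δ|U₀| + √(1 − δ²)U₁ is SN(λ) distributed when U₀ and U₁ are independent standard normal variates and δ = λ/√(1 + λ²), followed by location-scale transformation and reduction modulo 2π. The function names are hypothetical, Python's built-in generator stands in for the Wichmann and Hill routine, and only finite λ is handled.

```python
import math
import random


def rskewnormal(lam, rng):
    """Draw one SN(lam) variate using the Henze-type representation."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    u0 = rng.gauss(0.0, 1.0)
    u1 = rng.gauss(0.0, 1.0)
    return delta * abs(u0) + math.sqrt(1.0 - delta * delta) * u1


def wrap(angle):
    """Reduce an angle to [0, 2*pi), guarding against rounding up to 2*pi."""
    two_pi = 2.0 * math.pi
    t = angle % two_pi
    return 0.0 if t >= two_pi else t


def rwsnc(n, xi, eta, lam, seed=None):
    """Simulate n variates from WSNC_D(xi, eta, lam) by wrapping xi + eta*X."""
    rng = random.Random(seed)
    return [wrap(xi + eta * rskewnormal(lam, rng)) for _ in range(n)]
```

For example, `rwsnc(50, 0.0, 0.5, 5.0, seed=1)` returns a reproducible sample of 50 angles in radians; converting a target mean resultant length ρ into η requires the relation in Pewsey (2000a), which is not reproduced here.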
The pseudo-random number generator routine employed was a double precision version of the algorithm of Wichmann and Hill (1982) incorporating the amendment of McLeod (1985). In order to avoid the complication of infinite estimates for the direct parameter $\lambda$, we based our comparison on the estimates of the centered parameters. Numerical optimization of (6) was performed as described in Section 3.2 using an amended version of algorithm AS47 of O’Neill (1971) available from the HENSA archives. The MM estimates of the centered parameters were obtained by first estimating the circular parameters. Those estimates were then transformed to obtain estimates of the direct parameters as described in Pewsey (2000a). Finally, the estimates of the centered parameters were calculated from the direct parameter estimates using (5). Programming was carried out in FORTRAN.

The performance measures we calculated were the estimated bias and mean square error (MSE). For comparative purposes, the measures for the MM estimates of $\gamma_1$ were calculated after truncating any inadmissible estimates to $\pm 0.99527$, as appropriate. For the estimates of $\gamma_1$, we also recorded the percentages of inadmissible MM estimates and boundary ML estimates.

* * * Table 1 about here * * *

In Table 1 we present the results obtained for the estimates of $\gamma_1$. These results are representative of the comparative performance of MM and ML estimation for all three of the centered parameters. Considering, first, the results for the bias, both types of estimate tend to be negatively biased (for positively skew cases of the class), the bias being particularly pronounced for highly dispersed populations. The results for the MSE indicate that MM estimation is the more precise when the underlying distribution is close to symmetric and relatively dispersed. However, as the asymmetry and concentration increase, ML estimation dominates. The proportion of ML estimates on the boundary of the parameter space rises with the degree of asymmetry of the parent population, the increase being most dramatic for highly concentrated distributions.
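The performance measures described above are simple to compute once the replicate estimates are available. The Python sketch below shows one way of doing so for a list of estimates of γ₁; the function name and the returned dictionary keys are hypothetical, and the input values in the usage note are made up purely for illustration.

```python
GAMMA1_MAX = 0.99527  # bound on |gamma1| for the skew-normal class


def summarize_gamma1(estimates, true_gamma1, bound=GAMMA1_MAX):
    """Estimated bias and MSE of gamma1 estimates after truncation to +/-bound.

    Also reports the percentage of estimates lying outside [-bound, bound].
    """
    n = len(estimates)
    n_inadmissible = sum(1 for g in estimates if abs(g) > bound)
    # Truncate inadmissible estimates to the nearest admissible bound
    truncated = [max(-bound, min(bound, g)) for g in estimates]
    bias = sum(truncated) / n - true_gamma1
    mse = sum((g - true_gamma1) ** 2 for g in truncated) / n
    return {
        "bias": bias,
        "mse": mse,
        "pct_inadmissible": 100.0 * n_inadmissible / n,
    }
```

Calling `summarize_gamma1([0.2, 0.4, 1.3, -1.2], 0.3)` on these four made-up values truncates the last two to ±0.99527 before the bias and MSE are formed, and reports that half of the estimates were inadmissible.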

4. Tests for wrapped normal and wrapped half-normal cases

4.1 Operating characteristics of moment based tests

As was pointed out in Section 2.2, the wrapped normal and wrapped half-normal distributions are important limiting cases of the WSNC class. Pewsey (2000a) introduced large-sample moment based tests for these two cases within the WSNC class. Both are based on $b_2$, the second central sine moment about the sample mean direction. A major appeal of the procedures is their simplicity, this being particularly true of the test for a parent wrapped normal distribution. Here we consider the two tests’ abilities to maintain the nominal significance level and their power, these being the two most important operating characteristics of any test.

The large-sample asymptotic distribution of $b_2$ is derived in Pewsey (2003). There it is shown that, under quite general conditions, $b_2$ is normally distributed with, to order $n^{-1}$, mean

$$E(b_2) = \bar\beta_2 - \frac{1}{n\rho}\left\{\bar\beta_3 + \frac{\bar\beta_2}{\rho}\left(1 - \frac{2\alpha_2}{\rho^2}\right)\right\} \tag{7}$$

and variance

$$\operatorname{var}(b_2) = \frac{1}{n}\left[\frac{1 - \alpha_4}{2} - 2\alpha_2 - \bar\beta_2^{\,2} + \frac{2\alpha_2}{\rho}\left\{\alpha_3 + \frac{\alpha_2(1 - \alpha_2)}{\rho}\right\}\right], \tag{8}$$

where $\alpha_p$ and $\bar\beta_p$, $p = 1, 2, \ldots$, denote the $p$th cosine and sine moments about the mean direction, respectively. Expressions for these trigonometric moments for the WSNC class are given in Pewsey (2000a).

4.1.1 Test for a wrapped normal distribution

Under the null hypothesis of a wrapped normal, i.e., $WSNC_D(\xi, \eta, 0)$, distribution, the mean and variance of $b_2$ are $E_{WN}(b_2) = 0$ and

$$\operatorname{var}_{WN}(b_2) = \frac{1}{2n}\left\{(1 - \rho^{16}) + 4\rho^4(\rho^8 - \rho^6 + \rho^2 - 1)\right\},$$

where $\rho = e^{-\eta^2/2}$. Using these results in conjunction with (7) and (8) it is a simple matter to obtain the asymptotic power of the test of Pewsey (2000a) for an underlying wrapped normal distribution within the WSNC class. For a significance level of $100\alpha\%$, that asymptotic power is given by

$$1 - \Phi\!\left\{\frac{z_{\alpha/2}\sqrt{\operatorname{var}_{WN}(b_2)} - E_1(b_2)}{\sqrt{\operatorname{var}_1(b_2)}}\right\} + \Phi\!\left\{\frac{-z_{\alpha/2}\sqrt{\operatorname{var}_{WN}(b_2)} - E_1(b_2)}{\sqrt{\operatorname{var}_1(b_2)}}\right\}, \tag{9}$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution and $E_1(b_2)$ and $\operatorname{var}_1(b_2)$ are the values of (7) and (8) calculated under the alternative hypothesis of a parent $WSNC_D(\xi, \eta, \lambda)$ distribution (with $\lambda \neq 0$).

In order to investigate empirically the operating characteristics of the test we conducted an in-depth simulation study in which samples of size 20, 30, 50, 100, 200 and 500 were simulated from WSNC distributions with $\lambda$-values of 0, 2, 5 and $\infty$, and $\rho$-values of 0.1(0.05)0.95. For each $(n, \rho, \lambda)$ combination we simulated 5000 samples of size $n$ from the WSNC distribution with mean direction 0, mean resultant length $\rho$ and skewness parameter $\lambda$. The size of the test, or its power, was then estimated by performing the test for each of the 5000 samples and noting the proportion of samples for which the null hypothesis was rejected. The nominal significance levels investigated were 10%, 5% and 1%, although here we consider just the results for the representative 5% level.

* * * Figure 1 about here * * *

For the size of the test, our results (not presented here) showed that the test is able to hold the nominal significance level very well indeed, even for $n = 20$. In Fig. 1 we present the results obtained for the power of the test. For comparative purposes, we have also included in the three plots making up this figure the corresponding theoretical asymptotic power functions calculated using (9). As can be seen from this figure, the agreement between the estimated and theoretical results is, in general, very good. The largest differences between the two sets of results occur for highly concentrated, and highly skew and dispersed, cases of the WSNC distribution. As one might expect, such disparities tend to be most pronounced for small $n$. Clearly, for moderate to large sized samples from all but the most dispersed or concentrated cases of the WSNC distribution, the procedure is a powerful test against asymmetry.

4.1.2 Test for a wrapped half-normal distribution

The results in (7) and (8) can also be used to obtain the asymptotic power of the test of Pewsey (2000a) for a parent wrapped half-normal distribution. For a significance level of $100\alpha\%$, this is easily shown to be

$$1 - \Phi\!\left\{\frac{z_{\alpha}\sqrt{\operatorname{var}_{WHN}(b_2)} - E_1(b_2) + E_{WHN}(b_2)}{\sqrt{\operatorname{var}_1(b_2)}}\right\}, \tag{10}$$

where $E_{WHN}(b_2)$ and $\operatorname{var}_{WHN}(b_2)$, and $E_1(b_2)$ and $\operatorname{var}_1(b_2)$, are the values of (7) and (8) calculated under the null hypothesis of a parent wrapped half-normal, i.e., $WSNC_D(\xi, \eta, \infty)$, distribution and under the alternative hypothesis of a $WSNC_D(\xi, \eta, \lambda)$ distribution with $\lambda < \infty$, respectively.

To investigate empirically the operating characteristics of the test we again used Monte Carlo methods. The results (not presented here) of a simulation study conducted along similar lines to the one described in Section 4.1.1 showed the test to be non-conservative for $\rho < 0.6$. For $\rho$-values in this range, the ability of the test to maintain the nominal size improves with increasing $n$. For $\rho > 0.7$, the test is conservative, its conservatism increasing with $\rho$ and $n$.

With regard to the test’s power, it was found that (10) tends to underestimate the true power of the test. As is to be expected, the agreement between the theoretical (asymptotic) and true power of the test improves with increasing $n$. The power of the test generally increases with increasing $\rho$ before reaching a maximum for a $\rho$-value of around 0.7 and subsequently decreasing as $\rho$ increases still further. This drop-off in power as $\rho \to 1$ is consistent with the increasing conservatism of the test noted previously. The power increases spectacularly as $\lambda \to -\infty$. The test is reasonably powerful for moderate to large sized samples when the parent population is only moderately skew and is neither extremely concentrated nor dispersed. For a wrapped normal (i.e., symmetric) population or one which is negatively skew, the test is very powerful (even for sample sizes as small as 20 if $\lambda < -2$); again, so long as the parent population is not excessively concentrated or dispersed.

4.2 Generalized likelihood ratio test

The following generalized likelihood ratio test provides an alternative to the test considered above for a parent wrapped normal distribution. Appealing to standard likelihood theory, the large-sample asymptotic distribution of $-2\log\Lambda$ is $\chi^2_1$, where $\Lambda$ denotes the likelihood ratio and

$$\log\Lambda = \max l(\mu, \sigma, 0; \boldsymbol{\theta}) - \max l(\mu, \sigma, \gamma_1; \boldsymbol{\theta}),$$

$\max l(\mu, \sigma, \gamma_1; \boldsymbol{\theta})$ being the maximum value of the log-likelihood function (6) obtained when maximizing over $\mu$, $\sigma$ and $\gamma_1$, and $\max l(\mu, \sigma, 0; \boldsymbol{\theta})$ the corresponding maximum value when $\gamma_1$ is set equal to 0 and maximization is with respect to $\mu$ and $\sigma$ only. Large values of $-2\log\Lambda$ in comparison with the quantiles of the $\chi^2_1$ distribution lead to the rejection of an underlying wrapped normal distribution in favor of some skew member of the WSNC class. Despite its evident simplicity, this test is computationally more demanding than the test based on $b_2$.

A wrapped half-normal distribution corresponds to a point on the boundary of the parameter space of the WSNC class. Consequently, standard likelihood theory cannot be used to derive a large-sample test for such a distribution.
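The asymptotic power expressions (9) and (10) of Section 4.1 are straightforward to evaluate once the relevant means and variances of b₂ are known. The Python sketch below is a hedged illustration only: the function names are hypothetical, and the means and variances under the null and alternative hypotheses are taken as inputs, since computing them from (7) and (8) needs the trigonometric moments given in Pewsey (2000a).

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, from the Python standard library


def var_wn(rho, n):
    """Variance of b2 under a wrapped normal parent with mean resultant length rho."""
    return ((1.0 - rho ** 16)
            + 4.0 * rho ** 4 * (rho ** 8 - rho ** 6 + rho ** 2 - 1.0)) / (2.0 * n)


def power_wn_test(e1, var1, var0, alpha=0.05):
    """Asymptotic power (9) of the two-sided test for a wrapped normal parent.

    e1, var1: mean and variance of b2 under the alternative;
    var0: variance of b2 under the wrapped normal null.
    """
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)   # upper alpha/2 quantile
    s0, s1 = math.sqrt(var0), math.sqrt(var1)
    return (1.0 - _STD_NORMAL.cdf((z * s0 - e1) / s1)
            + _STD_NORMAL.cdf((-z * s0 - e1) / s1))


def power_whn_test(e0, var0, e1, var1, alpha=0.05):
    """Asymptotic power (10) of the one-sided test for a wrapped half-normal parent.

    e0, var0: mean and variance of b2 under the wrapped half-normal null;
    e1, var1: mean and variance of b2 under the alternative.
    """
    z = _STD_NORMAL.inv_cdf(1.0 - alpha)         # upper alpha quantile
    return 1.0 - _STD_NORMAL.cdf((z * math.sqrt(var0) - e1 + e0) / math.sqrt(var1))
```

As a sanity check on the sketch, supplying the null values as the alternative (for example `e1 = 0` and `var1 = var0` in `power_wn_test`) returns the nominal level `alpha`, as it should for any test whose size equals its significance level.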

5. Application to bird migration headings

In order to illustrate the methodology discussed in the previous sections and its extension to finite mixture modelling, we analyze a data set introduced to the ornithological literature by Bruderer and Jenni (1990). The data consist of the ‘headings’ of 1827 migrating birds recorded at an observational post near Stuttgart during the autumnal migration period of 1987. Here, the term ‘heading’ refers to the direction, measured in a clockwise direction from North, of a bird’s body during flight. A subset of 1655 of the observations was analyzed in Pewsey (2000a). A linear histogram for all 1827 data values appears in Fig. 2. The data are available in ASCII format from the author upon request.

The large-sample tests of Section 4 emphatically rejected the wrapped normal and wrapped half-normal distributions as potential models for the underlying distribution (p-values ≈ 0). Given the form of the histogram in Fig. 2 we first explored the fit of the WSNC distribution as a model for the data. The MM solution, obtained by estimating the circular parameters of the WSNC distribution and then transforming to the direct parameters, was $WSNC_D(4.66, 1.10, -1.79)$. The ML fit was $WSNC_D(4.70, 1.21, -2.22)$, with a log-likelihood value of −2202.06. The densities for these two fits are overlaid upon the histogram in Fig. 2. Visually, they are very similar and it is clear that neither provides a good fit to the data. Specifically, neither is capable of simultaneously modelling the peakedness and long ‘tails’ of the histogram. Chi-squared tests for the goodness-of-fit of these distributions, based on the class intervals used in Fig. 2 but combining them where necessary to have expected values greater than 5, emphatically rejected both models (p-values ≈ 0).
Given the dominant direction of migration and the angular spread of the data, it would appear improbable that the directions making up the distribution’s tails correspond to birds moving in the general direction of migration. It seems more likely that these headings are those of birds pursuing other needs of an avian existence, such as tracking down the next meal or pursuing a potential mate. Assuming that the directions followed by such birds are not restricted exclusively to the tails of the empirical distribution, a circular uniform distribution suggests itself as a possible model for the headings of this background of presumably non-migrating birds. As, overall, the empirical distribution is asymmetric, a skew member of the WSNC class would appear to be a reasonable model for the data making up the remainder of the empirical distribution. Viewed in this way, the form of the empirical distribution is due to the mixing of data from two distinct populations. As there is no additional information available which might be used to classify the individuals as belonging to one or other of the two component distributions, we explored the fit of the model with density

$$f(\theta; p, \xi, \eta, \lambda) = \frac{p}{2\pi} + (1 - p)\,\frac{2}{\eta}\sum_{r=-\infty}^{\infty}\phi\!\left(\frac{\theta + 2\pi r - \xi}{\eta}\right)\Phi\!\left\{\lambda\left(\frac{\theta + 2\pi r - \xi}{\eta}\right)\right\}, \tag{11}$$

corresponding to a mixture distribution with circular uniform and WSNC components. We denote this model as $UWSNC_D(p, \xi, \eta, \lambda)$, $p$ being the mixing probability associated with the two components.

* * * Figure 2 about here * * *

The ML solution obtained using the log-likelihood derived from (11) was $UWSNC_D(0.10, 4.56, 0.92, -2.07)$, with a log-likelihood value of −2128.03. The density for this ML solution is also displayed in Fig. 2 and clearly corresponds to a major improvement in fit. A generalized likelihood ratio test also supported this impression. Comparing $-2(-2202.06 + 2128.03) = 148.06$ with the $\chi^2_1$ distribution, the improvement in fit over the best fitting WSNC solution is highly significant (p-value ≈ 0). Combining the class intervals of the histogram in Fig. 2 as previously described, the chi-squared goodness-of-fit statistic for this solution was calculated to be 34.03 on 28 degrees of freedom. The p-value of around 0.2 for this result confirms the visual impression regarding the excellent fit of the mixture model.

According to the fitted density, the circular uniform background was made up of 10% of the birds and the remaining 90% pursued headings distributed according to the $WSNC_D(4.56, 0.92, -2.07)$ distribution. The mean direction of this WSNC distribution, calculated using Equation 3.2 of Pewsey (2000a), was found to be 3.81 (radians). According to this estimate, the migrating birds flew roughly in a mean south-westerly direction. We also used profile likelihood methods to obtain approximate 95% confidence intervals for $\lambda$ and $p$. The intervals for these two parameters were found to be $(-2.63, -1.56)$ and $(0.075, 0.124)$, respectively. The first of these intervals provides yet more evidence, if it were needed, that the underlying distribution was skew. The second provides more insight as to just how large the percentage of birds making up the uniform background might have been.

6. Discussion

This paper has presented solutions to various inferential problems associated with the application of the WSNC distribution in the modelling of skew distributed circular data. The centered parametrization was proposed here as a means of circumventing problems experienced when performing likelihood based inference using the direct parametrization of the distribution. Empirical evidence also suggests that the likelihood surface is better behaved for this alternative parametrization.

As shown in Pewsey (2000a), method of moments estimation of the distribution’s circular parameters is trivial. However, when transformed to obtain estimates of the direct or centered parameters, inadmissible estimates of the skewness parameter can occur. ML estimation is, in comparison, computationally more demanding as it requires the application of numerical methods of optimization. Moreover, although the ML estimates are always within range, boundary estimates of the skewness parameter can arise, particularly for small sized samples from highly dispersed or highly skew parent populations. Further inference is hampered by this as standard likelihood theory does not apply on the boundary of the parameter space. Should the necessary computer power be available, computer intensive methods provide a potential solution to this inferential impasse. Simulation results revealed that both approaches to point estimation tend to underestimate the skewness of positively skew parent populations. They also showed MM estimation to be more precise for cases of the WSNC class that are relatively dispersed and close to being symmetric. For all other cases, ML estimation tends to be superior.

We have also investigated the two principal operating characteristics of the two tests of Pewsey (2000a) for parent wrapped normal and wrapped half-normal distributions within the WSNC class. The results from a simulation study showed that the former test maintains the nominal significance level very well indeed, even for samples of just 20 observations. Equally, the formula derived for the theoretical asymptotic power of the test was found generally to describe very accurately the true power of the test. Overall, the test is extremely powerful for moderate to large sized samples drawn from all but the most dispersed or concentrated cases of the WSNC class.

In contrast, the test for a wrapped half-normal distribution does not hold the nominal significance level well; it is non-conservative when the mean resultant length of the parent wrapped half-normal distribution is less than 0.6. For mean resultant lengths greater than 0.7 the test is conservative, its conservatism increasing with ρ and n. The expression derived for the theoretical asymptotic power of this test tends to underestimate its true power, although the agreement between the two improves with increasing sample size. Generally, the test is very powerful against symmetric and negatively skew alternatives; again, providing the parent population is not extremely concentrated or dispersed. As an alternative to the moment based test for a parent wrapped normal distribution we have proposed a generalized likelihood ratio test. The computational demands of this test are greater than those for the moment based test, as numerical methods must be employed to maximize the log-likelihood function for the full WSNC class as well as that for the wrapped normal sub-model. Although we have not investigated the operating characteristics of this test, it is clear that its power cannot possibly exceed that of the moment based test for large sized samples drawn from moderately to highly skew cases of the WSNC class. Our application illustrates that the WSNC distribution provides a flexible model for asymmetrically distributed circular data, particularly when used as a component in finite mixture modelling. Of the two approaches to inference considered here, likelihood based methods adapt themselves more readily to the types of inferential problems that arise for that form of modelling. For the reader interested in applying the various forms of inference considered in this paper, a suite of FORTRAN programs is available from the author upon request.

References

Azzalini, A. (1985) A class of distributions which includes the normal ones. Scandinavian Journal of Statistics, 12, 171–8.
Azzalini, A. and Capitanio, A. (1999) Statistical applications of the multivariate skew-normal distribution. Journal of the Royal Statistical Society, Series B, 61, 579–602.
Batschelet, E. (1981) Circular Statistics in Biology, Academic Press, London.
Bruderer, B. and Jenni, L. (1990) Migration across the Alps. In Bird Migration: Physiology and Ecophysiology, E. Gwinner (ed.), Springer-Verlag, Berlin, pp. 60–77.
Fisher, N.I. (1993) Statistical Analysis of Circular Data, Cambridge University Press, Cambridge.
Henze, N. (1986) A probabilistic representation of the ‘skew-normal’ distribution. Scandinavian Journal of Statistics, 13, 271–5.
Jammalamadaka, S.R. and SenGupta, A. (2001) Topics in Circular Statistics, World Scientific, Singapore.
Mardia, K.V. (1972) Statistics of Directional Data, Academic Press, London.
Mardia, K.V. and Jupp, P.E. (1999) Directional Statistics, Wiley, Chichester.
McLeod, A.I. (1985) A remark on AS 183. An efficient and portable pseudo-random number generator. Applied Statistics, 34, 198–200.
Nelder, J.A. and Mead, R. (1965) A simplex method for function minimization. Computer Journal, 7, 308–13.
O’Neill, R. (1971) Algorithm AS 47: function minimization using a simplex procedure. Applied Statistics, 20, 338–45.
Pewsey, A. (2000a) The wrapped skew-normal distribution on the circle. Communications in Statistics – Theory and Methods, 29, 2459–72.
Pewsey, A. (2000b) Problems of inference for Azzalini’s skew-normal distribution. Journal of Applied Statistics, 27, 859–70.
Pewsey, A. (2003) The large-sample joint distribution of key circular statistics. Metrika. In press.
Wichmann, B.A. and Hill, I.D. (1982) Algorithm AS 183: an efficient and portable pseudo-random number generator. Applied Statistics, 31, 188–90.

Biographical sketch

Arthur Pewsey is a lecturer of Statistics at the University of Extremadura, with nearly twenty years’ experience in statistical research, consultancy and teaching. In recent years his research has focused on developing statistical methodology for the analysis of skew distributed data on the line or circle, with areas of application including biology, medicine, meteorology, physics, psychology and the earth sciences. He is a Fellow and Chartered Statistician of the Royal Statistical Society.


FIGURE LEGENDS

Figure 1. Theoretical asymptotic power (dotted) and empirical power (solid) of the test for an underlying wrapped normal distribution when the parent population is WSNC with: a) λ = 2, b) λ = 5, c) λ = ∞. The six curves of each type correspond to sample sizes of 20, 30, 50, 100, 200 and 500, the power increasing with sample size. The dashed horizontal line delimits the nominal significance level of α = 0.05.

Figure 2. Histogram of the 1827 bird-flight headings together with fitted densities: MM solution WSNC_D(4.66, 1.10, −1.79) (short dash); ML solution WSNC_D(4.70, 1.21, −2.22) (long dash); ML solution UWSNC_D(0.10, 4.56, 0.92, −2.07) (solid).


Table 1. Performance measures for the MM and ML estimates of γ1. The measures are: bias; (mean squared error); {percentage of inadmissible or boundary estimates, respectively}. For each (n, ρ, λ) combination the measures were calculated using 3000 simulated samples of size n from the WSNC distribution with ξ = 0 and ρ and λ as specified.

| λ | n | ρ = 0.3, MM | ρ = 0.3, ML | ρ = 0.5, MM | ρ = 0.5, ML | ρ = 0.7, MM | ρ = 0.7, ML | ρ = 0.9, MM | ρ = 0.9, ML |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 20 | −0.0036 (0.6786) {42.57} | −0.0016 (0.9463) {94.23} | 0.0028 (0.5425) {27.67} | 0.0083 (0.8320) {78.90} | −0.0008 (0.4442) {19.67} | −0.0087 (0.6143) {52.63} | −0.0083 (0.4327) {22.87} | −0.0028 (0.5763) {48.10} |
| 0 | 50 | −0.0054 (0.6130) {28.20} | 0.0162 (0.8759) {81.57} | 0.0046 (0.3661) {8.83} | −0.0025 (0.4877) {30.87} | 0.0032 (0.2404) {2.47} | −0.0028 (0.2147) {7.40} | −0.0267 (0.2514) {6.63} | −0.0143 (0.1854) {5.07} |
| 0 | 100 | −0.0181 (0.4967) {13.10} | −0.0143 (0.7015) {52.60} | −0.0099 (0.2289) {1.40} | −0.0101 (0.2308) {6.00} | 0.0021 (0.1368) {0.40} | 0.0034 (0.0865) {0.23} | 0.0045 (0.1412) {0.90} | 0.0038 (0.0727) {0.20} |
| 2 | 20 | −0.4022 (0.6194) {45.37} | −0.4165 (0.6200) {95.60} | −0.2106 (0.4652) {29.03} | −0.2054 (0.4971) {79.47} | −0.0716 (0.3284) {26.47} | −0.0716 (0.3679) {59.23} | −0.0231 (0.3133) {35.27} | −0.0323 (0.3423) {55.27} |
| 2 | 50 | −0.3219 (0.5886) {27.33} | −0.3312 (0.5820) {80.93} | −0.1204 (0.3057) {11.73} | −0.1100 (0.3423) {36.87} | 0.0192 (0.1783) {11.73} | −0.0167 (0.1568) {12.50} | 0.1003 (0.1906) {25.40} | −0.0101 (0.1328) {9.27} |
| 2 | 100 | −0.2771 (0.5206) {13.60} | −0.2940 (0.5255) {55.33} | −0.0575 (0.1905) {4.97} | −0.0695 (0.1792) {8.33} | 0.0561 (0.0991) {4.67} | −0.0177 (0.0659) {0.87} | 0.1833 (0.1336) {21.47} | −0.0010 (0.0546) {0.73} |
| 5 | 20 | −0.6541 (0.7047) {48.43} | −0.6801 (0.4377) {95.77} | −0.3201 (0.3979) {41.07} | −0.3016 (0.2541) {85.27} | −0.1354 (0.1929) {48.50} | −0.0397 (0.1496) {66.27} | −0.0948 (0.1704) {59.77} | −0.0688 (0.1240) {77.53} |
| 5 | 50 | −0.4131 (0.5190) {37.73} | −0.4196 (0.3736) {81.53} | −0.1351 (0.1601) {36.07} | −0.0751 (0.1585) {44.77} | −0.0154 (0.0599) {45.90} | −0.0122 (0.0427) {35.77} | 0.0569 (0.0419) {73.23} | −0.0007 (0.0385) {36.17} |
| 5 | 100 | −0.2546 (0.3400) {29.93} | −0.2429 (0.2574) {56.07} | −0.0486 (0.0640) {34.83} | −0.0300 (0.0415) {20.20} | 0.0426 (0.0248) {46.63} | −0.0042 (0.0159) {9.97} | 0.1067 (0.0231) {81.20} | 0.0055 (0.0140) {8.03} |
| 20 | 20 | −0.6463 (0.6539) {52.60} | −0.5991 (0.3186) {96.77} | −0.3379 (0.3458) {48.97} | −0.2041 (0.1284) {93.60} | −0.1698 (0.1384) {61.00} | −0.0411 (0.0500) {92.07} | −0.1570 (0.1265) {67.73} | −0.0631 (0.0492) {94.40} |
| 20 | 50 | −0.3609 (0.4119) {49.50} | −0.2571 (0.1940) {90.30} | −0.1563 (0.1093) {52.30} | −0.0377 (0.0472) {80.97} | −0.0703 (0.0344) {65.63} | −0.0050 (0.0065) {87.50} | −0.0279 (0.0163) {85.83} | −0.0034 (0.0052) {87.73} |
| 20 | 100 | −0.1741 (0.1666) {53.17} | −0.0790 (0.0557) {82.20} | −0.0825 (0.0356) {56.23} | −0.0106 (0.0063) {63.13} | −0.0260 (0.0074) {72.83} | 0.0001 (0.0007) {65.53} | −0.0009 (0.0025) {93.77} | −0.0003 (0.0007) {64.13} |
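As a hedged illustration of how samples of the kind summarized in Table 1 might be simulated, the sketch below draws wrapped skew-normal angles using the representation δ|Z1| + sqrt(1 − δ²) Z2, with δ = λ/sqrt(1 + λ²) and Z1, Z2 independent standard normals, due to Henze (1986). The function names, the use of a scale parameter η in place of ρ, and Python's built-in generator (rather than the Wichmann and Hill generator listed in the references) are all assumptions of this sketch, not details taken from the simulation study itself.

```python
import math
import random


def rwsn(n, xi, eta, lam, rng):
    """Draw n wrapped skew-normal angles in [0, 2*pi).

    xi, eta and lam are the location, scale and shape of the underlying
    skew-normal variable; rng is a random.Random instance.
    """
    delta = lam / math.sqrt(1.0 + lam * lam)
    root = math.sqrt(1.0 - delta * delta)
    angles = []
    for _ in range(n):
        half_normal = abs(rng.gauss(0.0, 1.0))   # |Z1|
        normal = rng.gauss(0.0, 1.0)             # Z2
        x = xi + eta * (delta * half_normal + root * normal)
        angles.append(x % (2.0 * math.pi))       # wrap onto the circle
    return angles


def mean_direction_and_length(angles):
    """Sample mean direction (radians) and mean resultant length."""
    n = len(angles)
    c = sum(math.cos(t) for t in angles) / n
    s = sum(math.sin(t) for t in angles) / n
    return math.atan2(s, c) % (2.0 * math.pi), math.hypot(c, s)


# One simulated sample of size 50 with xi = 0, eta = 1 and lam = 2.
rng = random.Random(2024)
sample = rwsn(50, 0.0, 1.0, 2.0, rng)
print(mean_direction_and_length(sample))
```

Repeating the draw 3000 times for each parameter combination and applying an estimator to every sample would give bias and mean squared error figures of the type tabulated above.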

Fig. 1 [Panels a), b) and c): Power (vertical axis) plotted against Rho (horizontal axis).]

Fig. 2 [Density (vertical axis) plotted against Heading (radians) (horizontal axis).]