Environ Ecol Stat (2007) 14:101–111 DOI 10.1007/s10651-007-0010-7 ORIGINAL ARTICLE
Test for independence between marks and points of marked point processes: a subsampling approach Yongtao Guan · David R. Afshartous
Received: 28 July 2004 / Revised: 1 March 2005 / Published online: 21 March 2007 © Springer Science+Business Media, LLC 2007
Abstract Ecological data often involve measurements taken at irregularly spaced locations (e.g., the heights of trees in a forest). A useful approach for modeling such data is via a marked point process, where the marks (i.e., measurements) and points (i.e., locations) are often assumed to be independent. Although this is a convenient assumption, it may not hold in practice. Schlather et al. (Journal of the Royal Statistical Society, Series B, 66, 79–93, 2004) proposed a simulation-based approach to test this assumption. This paper presents a new method for testing the assumption of independence between the marks and the points. Instead of considering a simulation approach, we derive analytical results that allow the test to be implemented via a conventional $\chi^2$ statistic. We illustrate the use of our approach by applying it to an example involving desert plant data.
Keywords Conditional expectation of marks · Subsampling
1 Introduction Ecological data often exist in a format readily amenable to methods used in spatial statistics. For example, for a species of desert plants, we might possess data on the locations of the plants, in addition to measurements on the canopy volumes of such plants. Ecologists are often interested in studying this type of data in order to understand the nature and strength of possible interactions between plant locations and canopy volume. A useful approach for modeling such data is via a marked point
Y. Guan (✉) Division of Biostatistics, Yale School of Public Health, Yale University, New Haven, CT 06520-8034, USA
D. R. Afshartous Department of Management Science, University of Miami, Coral Gables, FL 33124-6544, USA
process, where the marks (i.e., measurements) and points (i.e., locations) are often assumed to be independent. However, it is often necessary to distinguish between data arising from a random field versus data arising from a marked point process. Random fields represent random functions in $\mathbb{R}^2$, with values in $\mathbb{R}$, and are often called regionalized variables. Although the values of a regionalized variable can be ascertained across a continuous range of points, for a marked point process, the quantity of interest might be observable only at predefined locations. For example, the diameter of a tree can, of course, be obtained only where a tree exists. Regardless, geostatistical techniques, such as variogram analysis and kriging, which were initially developed to model random field data, are often applied to spatial marked point process data. The validity of such applications, however, depends on the correctness of the assumption that the point locations are independent of the values of the variable of interest (Diggle et al. 2002), which appears to be unreasonable for many examples arising from ecological studies, where a dependence between the values and locations could be a result of the nature of the data. For example, the diameters of trees in a forest might depend on their relative position, in addition to soil quality and sunlight. As a consequence, such applications can often be inappropriate and misleading. For example, a variogram estimated for marked point process data via geostatistical software does not necessarily represent the variogram of a random field (Wälder and Stoyan 1996). Instead, these situations are more appropriately modeled via tools of marked point processes (Diggle 2000; Diggle et al. 2002). Schlather et al.
(2004) introduce two characteristics of marked point processes, both based on the inter-point distance, t, in order to assess whether the values of the marks can be modeled by a random field that is independent of the unmarked point process (i.e., the locations). Specifically, E(t) and V(t) represent the conditional expectation and conditional variance of a mark, respectively, given that there exists another point of the process at a distance t. They employ Monte Carlo (MC) tests based on the deviance of estimates of E and V from a constant function, as under the null hypothesis E and V should be constant. Deviance is measured with respect to various weighted $L_p$-norms in order to select weight-norm combinations that perform well with respect to the size and power of the corresponding tests. The performance of these two characteristics is evaluated separately with respect to three theoretical models that violate the independence assumption in different ways. Depending on the model, this violation manifests itself in nonconstant E, nonconstant V, or both. The tests for V(t) = constant do not perform as well as the tests for E(t) = constant. The results indicate that size and power both vary greatly, according to the selected weight or norm used to assess deviance of the test statistics from a constant. This method also requires any real or simulated data set to be marginally transformed to normal variables, via the empirical distribution function, since these methods are based on the Gaussianity of the random field model. As noted therein, this assumption is questionable in the case of discrete marks and it is not clear how this test would work for such a case.

This paper presents a new method for testing the marked point independence assumption. As in Schlather et al. (2004), we assume stationarity and isotropy of the marked point process and base our test on the conditional expectation of a mark.
However, instead of a simulation approach, we derive analytical results that allow the test to be implemented via a conventional test statistic method. Furthermore,
our method does not require the marginal distribution of the marks to be normally distributed. The outline of the paper is as follows. In Sect. 2, we derive preliminary asymptotic results that form the basis of the independence test presented in Sect. 3. Our test statistic has a limiting $\chi^2$ distribution, and also includes a covariance term that is often unknown and must be estimated. The estimation of this covariance matrix is discussed in Sect. 3, where we describe a subsampling approach that yields an $L_2$-consistent estimator of the desired covariance matrix. In Sect. 4, we investigate the behavior of our test statistic on a desert plant data set. Finally, we conclude in Sect. 5 with a discussion of the main results and the potential benefits of employing our methodology toward real spatial data sets.

2 Preliminary asymptotic results

2.1 Notation and setup

Consider a stationary and isotropic marked point process, $\Phi = [x, m(x)]$, where x denotes a point location and m(x) denotes a real-valued mark measured at x. Let N denote the point process generating {x}, D denote the region where observations are made, |D| denote the area of D, and N(D) denote the number of points from N in D. Let dx be an infinitesimal region containing x. Define the second-order intensity function as

$$\Lambda(t) \equiv \lim_{|d0|,\,|dx| \to 0} \left\{ \frac{E[N(d0) \times N(dx)]}{|d0| \times |dx|} \right\}, \quad \text{where } \|x\| = t.$$

Following Schlather et al. (2004), we consider the conditional expectation of a mark

$$\mu(t) \equiv E[m(0) \mid 0, x \in N, \|x\| = t].$$

Throughout this article, let h be a positive constant and w(·) be a bounded, non-negative, symmetric density function that takes positive values on only a finite support. The constant h is called the bandwidth and controls the amount of smoothing in estimation. A kernel-type estimator of $\mu(t)$ can then be defined as

$$\hat{\mu}(t) = \frac{\sum\sum w[(t - \|x_2 - x_1\|)/h] \times m(x_1)}{\sum\sum w[(t - \|x_2 - x_1\|)/h]}, \tag{1}$$

where the sums in the numerator and the denominator are both over all pairwise distinct $x_1$ and $x_2$. Clearly, (1) can be rewritten as

$$\hat{\mu}(t) = \frac{\sum\sum w[(t - \|x_2 - x_1\|)/h] \times m(x_1)/(2\pi t |D| h)}{\sum\sum w[(t - \|x_2 - x_1\|)/h]/(2\pi t |D| h)}. \tag{2}$$

Let $\hat{M}(t)$ and $\hat{\Lambda}(t)$ denote the numerator and the denominator terms, respectively, in (2). Clearly $\hat{\Lambda}(t)$ is an estimate of the second-order intensity function of N. In the following sections, we study the asymptotic properties of $\hat{M}(t)$ and $\hat{\Lambda}(t)$ separately, under the assumption of independence between the marks and the points, and then combine them to obtain the asymptotic properties of $\hat{\mu}(t)$. Note that the assumption of independence here simply implies that there is no interaction between the marks and the points, but that it does not warrant an independence among the marks nor
one among the points. The results can easily be extended to the conditional variance case under suitable conditions, and thus we omit the discussion.

2.2 Asymptotic consistency

To demonstrate the asymptotic consistency of $\hat{\mu}(t)$, we need to account for the shape of the region, D, and the choice of bandwidth, h. Let |D|, ∂D, and |∂D| denote the volume of D, the boundary of D, and the length of ∂D, respectively. Consider a sequence of increasing regions, $D_n$, and a sequence of constants, $h_n$. We assume that

$$|D_n| = O(n^2), \quad |\partial D_n| = O(n), \quad \text{and} \quad h_n = O(n^{-\beta}) \ \text{for some } \beta \in (0, 1). \tag{3}$$
Practically, condition (3) requires that $D_n$ grow in all directions and that $h_n$ decrease to 0 at a rate slow enough to ensure sufficient averaging for each $\hat{\mu}_n(t)$, where $\hat{\mu}_n(t)$ is $\hat{\mu}(t)$ calculated on $D_n$. In addition to (3), we also need to assume conditions on the point process N and the process generating the marks (referred to as the mark process henceforth). For the point process, we assume the existence of the kth-order cumulant density function of N, which is defined as

$$C^{(k)}(x_2 - x_1, \ldots, x_k - x_1) \equiv \lim_{|dx_1|, \ldots, |dx_k| \to 0} \left\{ \frac{\mathrm{Cum}[N(dx_1), \ldots, N(dx_k)]}{|dx_1| \times \cdots \times |dx_k|} \right\}.$$

In the foregoing definition, $\mathrm{Cum}(Y_1, \ldots, Y_k)$ stands for the cumulant given by the coefficient of $i^k t_1 \cdots t_k$ in the Taylor series expansion of $\log\{E[\exp(i \sum_{j=1}^{k} Y_j t_j)]\}$ about the origin (see Brillinger 1975). Following Guan (2003), we assume that $C^{(2)}(\cdot)$ is finite and continuous, $C^{(3)}(\cdot, \cdot)$ is finite,

$$\int_{\mathbb{R}^2} |C^{(2)}(u)|\,du < \infty, \quad \int_{\mathbb{R}^2} |C^{(3)}(u_1, u_2)|\,du_1 < \infty, \quad \int_{\mathbb{R}^2} |C^{(3)}(u_1, u_1 + u_2)|\,du_1 < \infty, \quad \text{and} \quad \int_{\mathbb{R}^2} |C^{(4)}(u_1, u_2, u_2 + u_3)|\,du_2 < \infty. \tag{4}$$

For the mark process, let $\mu \equiv E[m(x)]$ and $\mu_2 \equiv E[m(x)^2]$. We assume that

$$\mu_2 < \infty \quad \text{and} \quad \int |R(t)|\,dt < \infty, \tag{5}$$

where $R(t) \equiv \mathrm{Cov}[m(0), m(x)]$ for $\|x\| = t$. Loosely speaking, conditions (4) and (5) require the point process and the mark process to be weakly dependent. For examples of processes satisfying these conditions, see Guan (2003).

The following theorem states that under the above conditions, $\hat{M}_n(t)$ and $\hat{\Lambda}_n(t)$ are consistent estimators for $M(t)\ (\equiv \mu \Lambda(t))$ and $\Lambda(t)$, where $\hat{M}_n(t)$ and $\hat{\Lambda}_n(t)$ are $\hat{M}(t)$ and $\hat{\Lambda}(t)$, respectively, calculated on $D_n$.

Theorem 1 Assume that conditions (3)–(5) hold. Then
1. $E[\hat{\Lambda}_n(t)] = \int w(x) \Lambda(t - h_n x)\,dx \to \Lambda(t)$,
2. $E[\hat{M}_n(t)] = \mu \int w(x) \Lambda(t - h_n x)\,dx \to \mu \times \Lambda(t)$,
3. $|D_n| h_n \mathrm{Var}[\hat{\Lambda}_n(t)] \to \int w^2(x)\,dx \times \Lambda(t)/(\pi t)$,
4. $|D_n| h_n \mathrm{Var}[\hat{M}_n(t)] \to \int w^2(x)\,dx \times [2\mu_2 + R(t)] \times \Lambda(t)/(2\pi t)$.
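To make the quantities in Theorem 1 concrete, the following minimal sketch computes the numerator and denominator of (2) and their ratio (1) at a single lag. It is an illustration only, not the authors' implementation: the function names are hypothetical, the uniform kernel on [-1, 1] is one admissible choice of w(·), and no edge correction is applied.

```python
import math


def uniform_kernel(u):
    """Uniform density on [-1, 1]: bounded, symmetric, finite support."""
    return 0.5 if abs(u) <= 1.0 else 0.0


def mark_mean_at_lag(points, marks, t, h, area, kernel=uniform_kernel):
    """Return (M_hat, Lambda_hat, mu_hat) at lag t, following (1) and (2).

    points: list of (x, y) tuples; marks: list of floats; area: |D|.
    The double sum runs over all ordered pairs of distinct points, and the
    mark of the first point of each pair enters the numerator.
    """
    norm = 2.0 * math.pi * t * area * h
    num = 0.0
    den = 0.0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            dist = math.hypot(xj - xi, yj - yi)
            weight = kernel((t - dist) / h)
            num += weight * marks[i]
            den += weight
    mu_hat = num / den if den > 0 else float("nan")
    return num / norm, den / norm, mu_hat
```

For example, `mark_mean_at_lag([(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)], [2.0, 4.0, 10.0], t=1.0, h=0.25, area=16.0)` uses only the two points that are one unit apart, so the third returned value is the average of their marks. The double loop costs O(n²) and is meant for clarity rather than speed.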
Proof The proof is similar to that of Theorem III.1 in Guan (2003).

As $\hat{\mu}_n(t)$ is a ratio of consistent estimators for $M(t)$ and $\Lambda(t)$, it follows from Slutsky's theorem that $\hat{\mu}_n(t)$ is also a consistent estimator for $\mu$. Furthermore, we see that $\hat{\mu}_n(t)$ is asymptotically unbiased for $\mu$. This holds regardless of the value of t.

2.3 Asymptotic normality

To prove the asymptotic normality, we need to further quantify the strength of dependence in the marks and in the points. Following Rosenblatt (1956), we use the following strong mixing coefficients:

$$\alpha_M(p; d) \equiv \sup\{|P(A_1 \cap A_2) - P(A_1) P(A_2)| : A_1 \in \mathcal{F}_M(E_1),\ A_2 \in \mathcal{F}_M(E_2),\ E_2 = E_1 + x,\ |E_1| = |E_2| \le p,\ d(E_1, E_2) \ge d\},$$

and

$$\alpha_N(p; d) \equiv \sup\{|P(A_1 \cap A_2) - P(A_1) P(A_2)| : A_1 \in \mathcal{F}_N(E_1),\ A_2 \in \mathcal{F}_N(E_2),\ E_2 = E_1 + x,\ |E_1| = |E_2| \le p,\ d(E_1, E_2) \ge d\},$$

where the suprema are taken over all compact and convex subsets $E_1 \subseteq \mathbb{R}^2$, and over all $x \in \mathbb{R}^2$ such that $d(E_1, E_2) \ge d$. In the foregoing, $\mathcal{F}_M(E)$ and $\mathcal{F}_N(E)$ denote the σ-algebras generated by the random variables $\{m(x) : x \in E\}$ and $\{x : x \in N \cap E\}$, respectively. Following Guan et al. (2004), we assume that

$$\sup_p \frac{\alpha_M(p; d)}{p} = O(d^{-\epsilon}) \quad \text{for some } \epsilon > 2, \tag{6}$$

$$\sup_p \frac{\alpha_N(p; d)}{p} = O(d^{-\epsilon}) \quad \text{for some } \epsilon > 2. \tag{7}$$
Conditions (6) and (7) require that the dependence of both the mark and the point processes decreases at a polynomial rate in the inter-distance d for any fixed p. This condition is generally weaker than the commonly used strong mixing condition (see, e.g., Doukhan 1994), in that we allow the dependence to depend on the volume of regions p. For examples of processes satisfying these conditions, see Politis and Sherman (2001) and Guan et al. (2004). We also require the following mild moment conditions that are only slightly stronger than the existence of the (standardized) asymptotic variances of $\hat{M}_n(t)$ and $\hat{\Lambda}_n(t)$:

$$\sup_n E\left\{ \left| \sqrt{|D_n| \times h_n} \times \{\hat{M}_n(t) - E[\hat{M}_n(t)]\} \right|^{2+\delta} \right\} \le C_\delta, \tag{8}$$

$$\sup_n E\left\{ \left| \sqrt{|D_n| \times h_n} \times \{\hat{\Lambda}_n(t) - E[\hat{\Lambda}_n(t)]\} \right|^{2+\delta} \right\} \le C_\delta, \tag{9}$$

for some $\delta > 0$ and $C_\delta < \infty$. The following theorem states that under the foregoing conditions, the standardized $\hat{\mu}_n(t)$ is asymptotically normal.

Theorem 2 Consider a set of fixed, predetermined $t_1, t_2, \ldots, t_k$. Assume that conditions (3)–(9) hold. Then

$$\sqrt{|D_n| \times h_n} \times [\hat{\mu}_n(t_1) - \mu, \ldots, \hat{\mu}_n(t_k) - \mu] \xrightarrow{D} N(0, \Sigma),$$
where the ith diagonal element of $\Sigma$ is

$$\int w(x)^2\,dx \times \left\{ \frac{2\mu_2 + R(t_i)}{2\pi t_i \Lambda(t_i)} - \frac{2\mu^2}{\pi t_i \Lambda(t_i)^2} + \frac{\mu^2}{\pi t_i \Lambda(t_i)^3} \right\} \tag{10}$$

and the off-diagonal elements are all zero.

Proof See Appendix.

3 Test for independence
Define a lag t as the Euclidean distance between a pair of observations. For any two arbitrary lags, $t_i$ and $t_j$, under the null hypothesis of independence $E[\hat{\mu}_n(t_i)] = E[\hat{\mu}_n(t_j)]$. Thus, a test for independence can be performed by comparing $\hat{\mu}_n(t)$ obtained at different t values. Specifically, consider a set of pre-chosen lags T. We rewrite the null hypothesis, in terms of $\hat{\mu}_n(t)$ for $t \in T$, as follows:

$$H_0 : E[\hat{\mu}_n(t_i)] = E[\hat{\mu}_n(t_j)] \quad \text{for all } t_i, t_j \in T.$$

Let $\hat{G}_n \equiv \{\hat{\mu}_n(t) : t \in T\}$. Following Lu (1994), we form a set of contrasts based on the above equations under $H_0$ such that $A E(\hat{G}_n) = 0$ for some full row rank matrix A. For example, if $T = \{t_1, t_2, t_3, t_4\}$, then A may be defined as

$$A = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 1 & 0 & 0 & -1 \end{pmatrix}.$$

We assess the assumption of independence by testing the hypothesis $H_0 : A E(\hat{G}_n) = 0$. In light of the asymptotic normality of $\hat{\mu}_n(t)$, we define the following test statistic:

$$TS_n \equiv |D_n| \times h_n \times (A \hat{G}_n)' (A \hat{\Sigma}_n A')^{-1} (A \hat{G}_n), \tag{11}$$

where $\hat{\Sigma}_n$ is a consistent estimator of $\Sigma$. The test statistic converges to a $\chi^2$ distribution with degrees of freedom given by the row rank of the matrix A as a result of the multivariate Slutsky's theorem (see, e.g., Ferguson 1996).

To obtain an estimate of $\Sigma$, we apply a subsampling approach (see, e.g., Sherman and Carlstein 1994). The idea of subsampling is to divide the studied region into many subregions of the same size and shape. The subregions will then be treated as if they were independent and will be used to estimate the standardized covariance (say, $\Sigma'$) of the conditional expectations [i.e., $\hat{\mu}(t)$] obtained on any such subregion. If the subregions are small compared to the original region, such that enough replicates are obtained, then the estimated covariance matrix will be close to $\Sigma'$. If, furthermore, the subregions are large enough, then $\Sigma'$ will be close to $\Sigma$. Thus, the resulting covariance estimator can serve as a good approximation to the target matrix $\Sigma$. The main advantage of using subsampling is that it does not require any explicit knowledge of the dependence or distribution of the underlying process. The use of subsampling to estimate the variance of a general statistic has been justified by Politis and Sherman (2001) in the marked point process setting. Specifically, they have shown that the subsampling estimator converges to the target parameter in the $L_2$ sense.

Following the notation in Guan (2003), let $D_{l(n)}$ be the subregion with the same shape as $D_n$ but rescaled, where $l(n) = c n^{\alpha}$ for some $\alpha \in (0, 1)$ and some $c > 0$. Define
$D_{l(n)}(y) \equiv \{x + y : x \in D_{l(n)}\}$ and $D_n^{1-c} \equiv \{y : y \in D_n,\ D_{l(n)}(y) \subset D_n\}$. Let $\hat{G}_{l(n)}(y)$ represent $\hat{G}_n$ obtained on $D_{l(n)}(y)$ and let $h_{l(n)}$ be the bandwidth. Define the following estimator (denoted as $\hat{\Sigma}_n$):

$$\hat{\Sigma}_n \equiv \frac{1}{|D_n^{1-c}| f_n} \times \int_{D_n^{1-c}} |D_{l(n)}| \times h_{l(n)} \times [\hat{G}_{l(n)}(y) - \bar{G}_n] [\hat{G}_{l(n)}(y) - \bar{G}_n]'\,dy, \tag{12}$$

where $\bar{G}_n \equiv \int_{D_n^{1-c}} \hat{G}_{l(n)}(y)\,dy / |D_n^{1-c}|$ and $f_n = 1 - |D_{l(n)}|/|D_n|$ is a finite-sample bias correction due to Guan et al. (2004). The asymptotic consistency of (12) can be established similarly as in Politis and Sherman (2001) and Guan et al. (2004) under suitable conditions. Thus our test statistic (11) converges in distribution to a $\chi^2$ random variable, where the degrees of freedom are given by the row rank of A.

4 Application

In this section, we apply the proposed testing method to the Ambrosia Dumosa data studied by Miriti et al. (1998). The data consist of locations and estimated plant canopy volumes (logarithmically transformed) of 4,358 Ambrosia Dumosa plants from a 1984 census within a square hectare (100 × 100 m²) area in the Colorado Desert (Fig. 1). Ambrosia Dumosa is an extremely abundant, long-lived, and drought-resistant shrub that can reach a height of 2 feet (60 cm) and a spread of 3 feet (90 cm). Here, we treat the locations of Ambrosia Dumosa plants as points and the logarithmically transformed estimated plant canopy volumes as marks.

Ecologists are interested in understanding the nature and strength of interactions, if any, among neighboring Ambrosia Dumosa plants. For example, Miriti et al. (1998) studied the effect of adult neighbors on mortality rates of juvenile plants and concluded no apparent evidence of interactions. Schenk and Mahall (2002), on the other hand, considered the effects of neighbors on canopy sizes of Ambrosia Dumosa and detected a positive relationship between the sizes of nearest Ambrosia Dumosa neighbors and the distances between them. They found specifically that Ambrosia Dumosa plants increased in biomass by about 60% for each meter distance to the nearest conspecific (i.e., of the same species) neighbor for a distance of up to 3 m.

Fig. 1 Locations of Ambrosia Dumosa plants

The histogram of the marks clearly exhibits a bimodal nature (Fig. 2). As a result, the approach of Schlather et al. (2004) is not appropriate for this example because it requires the marginal distribution of marks to be normal. Our approach, on the contrary, does not require such a condition and thus can be applied. To do so, we use seven values for t, from 0.25 to 3.25 m, with an increment of 0.5 m between adjacent lags. We do not consider any larger lags, as competition among plants beyond this range is thought to vanish (Schenk et al. 2003). We use a uniform kernel for w(·) and h = 0.249 to compute $\hat{\mu}(t)$ and set the A matrix such that $\hat{\mu}(0.25)$ is compared to each of the remaining $\hat{\mu}(\cdot)$ values. The resulting test statistic is equal to 19.7449, with an approximate P-value of 0.0031, when compared to a $\chi^2$ distribution with six degrees of freedom (the row rank of A, which has one row for each of the six comparisons). Thus, we conclude there is strong evidence against the assumption of independence.

Fig. 2 Histogram plot of Ambrosia Dumosa canopy volumes (horizontal axis: log2 cm³)

Before proceeding to explore the nature and strength of the dependence between the canopy volumes and locations of the Ambrosia Dumosa plants, we note that the assumption of isotropy for the plant locations, which we assumed while deriving the test statistic, might be questionable for this data set (e.g., Rosenberg 2004). From the proof of Theorem 2, we see that the asymptotic normality of $\hat{\mu}_n$, upon which our test statistic is based, depends on conditions (3)–(9) and the existence of a limiting covariance $\Sigma$. The main benefit of isotropy is to obtain a clean form of $\Sigma$. A limiting covariance $\Sigma$ still exists, even under anisotropy, due to conditions (3)–(5), but with a more complicated expression.
As a result, the main effect of anisotropy here is to alter the form of the covariance matrix $\Sigma$ given in Theorem 2, but not to invalidate the asymptotic normality of $\hat{\mu}(t)$. In addition, the subsampling approach that we propose requires the existence of $\Sigma$, but not its explicit form. Thus our conclusion remains valid.

To understand the dependence structure, we examine the plot of the calculated $\hat{\mu}(\cdot)$ values (Fig. 3). The plot indicates that the overall interactions among neighboring plants are negative and that the interaction appears to diminish after 2 m. The surprising peak at 0.5 m implies certain attraction or positive growth relationship among neighboring plants separated by this distance. This interesting phenomenon persists under several different choices of w(·) and h with which we have experimented. The reasons for this phenomenon are unknown to the authors and deserve further investigation.

Fig. 3 Empirical conditional expectation plot for Ambrosia Dumosa. The t-axis represents the inter-point distances; the y-axis represents the empirical conditional expectations of the canopy volumes

5 Discussion

It is very important to have a test for the null hypothesis of independence between marks and locations in a marked point process. By falsely assuming independence in the presence of a relationship between the values of the marks and the actual locations, one is essentially assuming that the marked point process fits the random field model (i.e., the marks represent the values of the underlying random field at the pre-defined locations). Although data for a regionalized variable are suitable for analysis via standard geostatistical methods, the application of geostatistical methods for marked point processes that violate the independence assumption is questionable, leading to unexpected variograms, for instance (Wälder and Stoyan 1996). We have introduced a new method of testing the null hypothesis of independence between the marks and locations of a marked point process. As in Schlather et al. (2004), our test assumes that the marked process is stationary and is based on the estimated conditional expectation of a mark, $\hat{\mu}(t)$. However, instead of following a simulation-based approach, we derive the asymptotic properties, viz., consistency and normality, of $\hat{\mu}(t)$. We make use of the intuitive fact that the mark conditional expectation is equal at different lags under the null hypothesis, allowing the formation via contrasts of a test statistic that is asymptotically $\chi^2$ distributed.
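The contrast construction and the statistic (11) can be sketched in a few lines. The code below is a hedged illustration under our own assumptions rather than the authors' software: all function names are hypothetical, the linear solve is a plain Gaussian elimination, and the tail probability uses the closed form that is available only when the degrees of freedom are even (a general implementation would need the incomplete gamma function).

```python
import math


def first_vs_rest_contrasts(k):
    """Contrast matrix comparing the first of k lags with each other lag."""
    return [[1.0 if c == 0 else (-1.0 if c == r + 1 else 0.0)
             for c in range(k)] for r in range(k - 1)]


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def contrast_statistic(g_hat, sigma_hat, contrasts, area, h):
    """|D| * h * (A G)' (A Sigma A')^{-1} (A G), the form of (11)."""
    k = len(g_hat)
    ag = [sum(row[c] * g_hat[c] for c in range(k)) for row in contrasts]
    a_sigma = [[sum(row[c] * sigma_hat[c][d] for c in range(k))
                for d in range(k)] for row in contrasts]
    middle = [[sum(a_sigma[r][d] * other[d] for d in range(k))
               for other in contrasts] for r in range(len(contrasts))]
    z = solve_linear(middle, ag)
    return area * h * sum(a * b for a, b in zip(ag, z))


def chi2_upper_tail_even_df(x, df):
    """P(chi2_df > x) for even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))
```

With seven lags, `first_vs_rest_contrasts(7)` has six rows, so the reference distribution has six degrees of freedom; `chi2_upper_tail_even_df(19.7449, 6)` rounds to 0.0031, the P-value reported in Sect. 4.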
Thus, the reasoning behind our test is easy to understand for practitioners, and the resulting P-values quantify the strength of evidence against the null hypothesis. Furthermore, the covariance matrix within our test statistic is easily estimated via a subsampling method as derived in Guan (2003). The main advantage of using subsampling is that it is completely nonparametric and thus can yield a consistent estimator for the covariance matrix under a variety of settings.
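As an illustration of the subsampling idea, the sketch below approximates the integral in (12) by an average over a regular grid of shifted square subregions. It rests on several assumptions of ours that are not in the paper: the study region is the square [0, side]², the shifts lie on a grid with spacing `step`, and the user-supplied `statistic` callback (a hypothetical name) returns a list of finite values, for instance the kernel estimates at the chosen lags computed on one subregion.

```python
def subsample_covariance(points, marks, side, sub_side, step, h_sub, statistic):
    """Grid approximation to the subsampling estimator (12) on [0, side]^2.

    statistic(sub_points, sub_marks, sub_area) must return a list of finite
    floats computed from the points falling in one subregion.
    """
    n_shifts = int((side - sub_side) / step + 1e-9) + 1
    origins = [i * step for i in range(n_shifts)]
    replicates = []
    for ox in origins:
        for oy in origins:
            # Points inside the half-open square [ox, ox+s) x [oy, oy+s).
            inside = [k for k, (x, y) in enumerate(points)
                      if ox <= x < ox + sub_side and oy <= y < oy + sub_side]
            replicates.append(statistic([points[k] for k in inside],
                                        [marks[k] for k in inside],
                                        sub_side ** 2))
    dim = len(replicates[0])
    count = len(replicates)
    mean = [sum(r[a] for r in replicates) / count for a in range(dim)]
    scale = sub_side ** 2 * h_sub           # |D_l(n)| * h_l(n)
    f_n = 1.0 - sub_side ** 2 / side ** 2   # finite-sample correction factor
    return [[scale * sum((r[a] - mean[a]) * (r[b] - mean[b])
                         for r in replicates) / (count * f_n)
             for b in range(dim)] for a in range(dim)]
```

A smaller `step` gives more heavily overlapping subregions and a finer approximation to the integral, at a proportionally higher cost. Subregions that contain too few pairs at a given lag would make a ratio statistic undefined, so a practical version would need a rule for handling them.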
Our test is flexible, in the sense that different sets of lags can form the basis of the test; thus, the test is easily adaptable to different applications where the scales of the lags are different and researchers are interested in testing specific lag combinations. Although we present our method with respect to the conditional expectation, the analytical results for the conditional variance may be similarly derived. In Schlather et al. (2004), the simulation test based on the conditional variance was inferior to that based on the conditional expectation, largely because of the greater degree of variability usually associated with the conditional variance. By restricting our test to the conditional expectation and employing a $\chi^2$ test, both the underlying logic and the implementation remain simple. Finally, our method is applicable to a wider range of applications because it does not require the marginal distribution of the marks to be normal. For example, in the Ambrosia Dumosa application considered in this paper, our method rejected the null hypothesis of independence (P = 0.0031). Based on the histogram of marks illustrated in Fig. 2, it is clear that the marks cannot be transformed to normality and thus the method in Schlather et al. (2004) cannot be applied. Although we considered a set of lags ranging from 0.25 to 3.25 m for these data, this could be altered for other applications where the lag scale is different. In future research, we will apply our method to different applications and investigate the sensitivity of our method with respect to variations in the chosen lag combinations. Once the null hypothesis of independence has been rejected, a key interest is to model the dependence structure between marks and points. Wälder and Stoyan (1996) and Stoyan and Wälder (2000) proposed useful models that can be applied to ecological applications. See also Schlather et al. (2004) and Schoenberg (2004).
Generally speaking, however, the literature on modeling dependent marked point processes is still limited and worth further investigation. Acknowledgements The authors thank Maria Miriti for providing the Ambrosia Dumosa data and Scott Urquhart and two anonymous referees for helpful comments on an earlier version of the manuscript.
Appendix A: Proof of Theorem 2

We consider the univariate case, that is, $\sqrt{|D_n| \times h_n} \times [\hat{\mu}_n(t) - \mu] \xrightarrow{D} N(0, \sigma^2)$, where $\sigma^2$ is given by (10). It is sufficient to show that

$$\sqrt{|D_n| \times h_n} \times \{\hat{M}_n(t) - E[\hat{M}_n(t)],\ \hat{\Lambda}_n(t) - E[\hat{\Lambda}_n(t)]\} \xrightarrow{D} N(0, \Sigma_2), \tag{13}$$

where $\Sigma_2$ is a 2 × 2 matrix. The diagonal elements of $\Sigma_2$ are given by the results in Theorem 1; the off-diagonal elements can be proven to be equal to $\int w^2(x)\,dx \times \Lambda(t) \times \mu/(\pi t)$. To prove (13), let $V \equiv (a, b)'$, for arbitrary scalars a, b. We need only to show that

$$\sqrt{|D_n| \times h_n} \times \left( a \{\hat{M}_n(t) - E[\hat{M}_n(t)]\} + b \{\hat{\Lambda}_n(t) - E[\hat{\Lambda}_n(t)]\} \right)$$

converges in distribution to $N(0, V' \Sigma_2 V)$. The proof follows similarly as that of Theorem 2 in Guan et al. (2004), by utilizing a blocking technique in conjunction with the asymptotic results of Theorem 1, the mixing conditions (6) and (7), and the moment conditions (8) and (9).
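The reason the joint limit (13) suffices is a standard ratio (delta-method) argument, which the proof leaves implicit. The following is a sketch in our own notation: $\sigma_{MM}$, $\sigma_{M\Lambda}$ and $\sigma_{\Lambda\Lambda}$ are labels we introduce for the entries of $\Sigma_2$ and do not appear in the original. By items 1 and 2 of Theorem 1, $E[\hat{M}_n(t)]/E[\hat{\Lambda}_n(t)] = \mu$, so a first-order expansion of the ratio gives

$$
\begin{aligned}
\hat{\mu}_n(t) - \mu &\approx \frac{1}{\Lambda(t)} \{\hat{M}_n(t) - E[\hat{M}_n(t)]\} - \frac{M(t)}{\Lambda(t)^2} \{\hat{\Lambda}_n(t) - E[\hat{\Lambda}_n(t)]\}, \\
\sigma^2 &= g' \Sigma_2\, g = \frac{\sigma_{MM}}{\Lambda(t)^2} - \frac{2 M(t)\, \sigma_{M\Lambda}}{\Lambda(t)^3} + \frac{M(t)^2\, \sigma_{\Lambda\Lambda}}{\Lambda(t)^4}, \qquad g = \left( \frac{1}{\Lambda(t)},\ -\frac{M(t)}{\Lambda(t)^2} \right)'.
\end{aligned}
$$

The asymptotic variance of $\hat{\mu}_n(t)$ is then obtained by substituting the entries of $\Sigma_2$ stated above.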
References

Brillinger DR (1975) Time series: data analysis and theory. Holt, Rinehart & Winston, Austin, TX
Diggle P (2000) Discussion paper to A. Baddeley and R. Turner on Maximum likelihood for spatial point patterns. Aust N Z J Stat 42:934–936
Diggle P, Ribeiro P Jr, Christensen O (2002) An introduction to model-based geostatistics. In: Hansen M, Moller J (eds) Spatial statistics and computational methods. Springer, New York
Doukhan P (1994) Mixing: properties and examples. Springer, New York
Ferguson T (1996) A course in large sample theory. Chapman and Hall, Boca Raton, FL
Guan Y (2003) Nonparametric methods of assessing spatial isotropy. Unpublished Ph.D. dissertation, Department of Statistics, Texas A&M University
Guan Y, Sherman M, Calvin JA (2004) A nonparametric test for spatial isotropy using subsampling. J Am Stat Assoc Theory Methods 99:810–821
Lu H (1994) On the distributions of the sample covariogram and semivariogram and their use in testing for isotropy. Unpublished Ph.D. dissertation, Department of Statistics and Actuarial Science, University of Iowa
Miriti MN, Howe HF, Wright SF (1998) Spatial patterns of mortality in a desert perennial plant community. Plant Ecol 136:41–51
Politis DN, Sherman M (2001) Moment estimation for statistics from marked point processes. J R Stat Soc B 63:261–275
Rosenberg MS (2004) Wavelet analysis for detecting anisotropy in point patterns. J Veg Sci 15:277–284
Rosenblatt M (1956) A central limit theorem and a strong mixing condition. Proc Natl Acad Sci USA 42:43–47
Schenk HJ, Mahall BE (2002) Positive and negative interactions contribute to a north-south-patterned association between two desert shrub species. Oecologia 132:402–410
Schenk HJ, Holzapfel C, Hamilton JG, Mahall BE (2003) Spatial ecology of a small desert shrub on adjacent geological substrates. J Ecol 91(3):383–395
Schlather M, Ribeiro PJ Jr, Diggle PJ (2004) Detecting dependence between marks and locations of marked point processes. J R Stat Soc B 66:79–93
Schoenberg FP (2004) Testing separability in spatial-temporal marked point processes. Biometrics 60:471–481
Sherman M, Carlstein E (1994) Nonparametric estimation of the moment of a general statistic computed from spatial data. J Am Stat Assoc 89:496–500
Stoyan D, Wälder O (2000) On variogram in point process statistics, II: models of markings and ecological interpretation. Biom J 42:171–187
Wälder O, Stoyan D (1996) On variogram in point process statistics. Biom J 38:895–905
Biographical sketches

Yongtao Guan is Assistant Professor in the Division of Biostatistics, Yale School of Public Health, Yale University, New Haven, CT 06520-8034, USA. His main research interest involves analyzing spatial point and marked point processes that arise from ecological studies. His recent projects include detecting spatial anisotropy of plant locations and interactions between intra- and inter-specific plants and linking these phenomena to the ecological factors that have driven them.

David Afshartous is Assistant Professor in the Department of Management Science, School of Business Administration, University of Miami, Coral Gables, FL, USA. His research interests span several areas, ranging from multilevel models, to high-speed wireless data protocol performance, to spatial statistics. He is currently involved in a project for the US Coast Guard that entails performing spatial modeling of a marked point process in the presence of geographical barriers and using this model to solve an optimization problem. His collaborative work often finds him in contact with researchers in environmental science and operations research.