Provided for non-commercial research and educational use only. Not for reproduction, distribution or commercial use. This chapter was originally published in the book Developments in Environmental Science, Vol. 12, published by Elsevier, and the attached copy is provided by Elsevier for the author's benefit and for the benefit of the author's institution, for noncommercial research and educational use including without limitation use in instruction at your institution, sending it to specific colleagues who know you, and providing a copy to your institution’s administrator.
All other uses, reproduction and distribution, including without limitation commercial reprints, selling or licensing copies or access, or posting on open internet sites, your personal or institution’s website or repository, are prohibited. For exceptions, permission may be sought for such use through Elsevier's permissions site at: http://www.elsevier.com/locate/permissionusematerial From: Davide Travaglini, Gherardo Chirici, Francesca Bottalico, Marco Ferretti, Piermaria Corona, Anna Barbati and Lorenzo Fattorini, Large-Scale Pan-European Forest Monitoring Network: A Statistical Perspective for Designing and Combining Country Estimates. Example for Defoliation. In Marco Ferretti and Richard Fischer, editors: Developments in Environmental Science, Vol. 12, Oxford, UK, 2013, pp. 105-135. ISBN: 978-0-08-098222-9 © Copyright 2013 Elsevier Ltd. Elsevier
Author's personal copy
Chapter 7
Large-Scale Pan-European Forest Monitoring Network: A Statistical Perspective for Designing and Combining Country Estimates. Example for Defoliation
Davide Travaglini*,1, Gherardo Chirici†, Francesca Bottalico*, Marco Ferretti‡, Piermaria Corona§, Anna Barbati§ and Lorenzo Fattorini¶
* Dipartimento di Economia, Ingegneria, Scienze e Tecnologie Agrarie e Forestali, Università degli Studi di Firenze, Firenze, Italy
† Dipartimento di Bioscienze e Territorio, Università degli Studi del Molise, Contrada Fonte Lappone s.n.c., Pesche, Isernia, Italy
‡ TerraData Environmetrics, Monterotondo Marittimo (GR), Italy
§ Dipartimento per l'Innovazione dei sistemi Biologici, Agroalimentari e Forestali, Università degli Studi della Tuscia, Viterbo, Italy
¶ Dipartimento di Economia Politica e Statistica, Università degli Studi di Siena, Siena, Italy
1 Corresponding author: e-mail:
[email protected]
Chapter Outline
7.1. Introduction
7.2. Sampling Designs in Large-Scale Forest Monitoring in Europe
7.3. Relationship Between FCM and NFI Networks
7.4. Design-Based European Monitoring System of Forest Condition
    7.4.1. The Importance of Clear Objectives
    7.4.2. Defining Parameters of Concern
    7.4.3. Defining Accuracy Measures for Status Assessment
    7.4.4. Defining Accuracy Measures for Change Assessment
7.5. Sampling Strategies at the Country Level
    7.5.1. Uniform Random Sampling
    7.5.2. URS Versus Systematic and Stratified Sampling
    7.5.3. Sampling Effort: A Preliminary Test
7.6. Aggregating Country Estimates at the European Level
    7.6.1. Combining FCM Estimates
    7.6.2. Coupling FCM and NFI Estimates Across Europe
7.7. Conclusions
References

Developments in Environmental Science, Vol. 12. http://dx.doi.org/10.1016/B978-0-08-098222-9.00007-8 © 2013 Elsevier Ltd. All rights reserved.

SECTION II Designing Forest Monitoring
7.1 INTRODUCTION
National Forest Inventories (NFIs) and Forest Condition Monitoring (FCM) networks are primary data sources for large-area assessment of forest resources. NFIs have traditionally been designed to provide country-based estimates of the kind, amount, and condition of timber and nontimber forest resources (Corona et al., 2011); over time, new variables have been included to meet evolving demands for forest information related to international conventions and policy processes (Vidal et al., 2008). FCM was established in the 1980s in response to concern about the alleged progressive deterioration of forest condition due to atmospheric pollution (Innes, 1993). In general terms, NFIs and FCM share the same approach: data from sample surveys are used to estimate population parameters for the attribute(s) of concern and to estimate changes over time. The International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects on Forests (ICP Forests) large-scale (Level I, see Chapters 2 and 6) monitoring network has a European-wide dimension and the potential to provide forest information to fulfill reporting obligations under several international agreements (e.g., Ministerial Conference on the Protection of Forests in Europe (MCPFE), currently Forest Europe; the Montreal Process; forest certification; climate negotiations). Yet, several authors have questioned the quality of the data collected by the ICP Forests surveys (e.g., Ferretti, 1997, 2004; Innes and Materna, 1992; Innes et al., 1993; Neumann and Stowasser, 1986; Percy and Ferretti, 2004). The main criticisms concerned the quality of defoliation assessments, while very few authors addressed sampling-related issues (e.g., Ferretti, 1997; Ferretti and Chiarucci, 2003; Innes, 1988; Köhl and Kaufmann, 1993; Percy and Ferretti, 2004).
This latter point deserves careful attention, as the sampling approaches adopted by individual countries, which are responsible for the implementation of Level I, have a significant impact on survey results.
As a matter of fact, the ICP Forests Level I network is a composite of national networks based on different sampling schemes (Cozzi et al., 2002). Nevertheless, all national sampling schemes are required to follow a probabilistic sampling design, ensuring a nonzero probability of selection for each element of the population. The density of sampling units within the ICP Forests Level I network provides the basis for Europe-wide analysis rather than for national assessments, which are partly based on denser national grids (Ferretti et al., 2010a). Although several countries adopt systematic sampling to select sampling sites, the target population is not homogeneous within the monitoring network, as different forest definitions are adopted by individual countries. Moreover, where applied, the so-called cross-cluster plot is surveyed based on a fixed number of nearest trees (UNECE, 1998). As a consequence, it may be very difficult to achieve statistically sound estimates of forest condition parameters (e.g., mean defoliation, frequency of trees in certain defoliation classes) and of their accuracy from the current structure of ICP Forests surveys. Indeed, differences between countries in the definition of the target population preclude reliable estimates at the European level. Likewise, the selection of a fixed number of nearest trees around points precludes estimation (at least from a design-based perspective) even at the country level, because of the difficulty of determining the inclusion probabilities of trees, as pointed out by Kleinn and Vilčko (2006). In this chapter, we first introduce the main sampling designs that may be applied in the context of large-scale forest monitoring; second, present the relationships between FCM and NFI networks; and finally, explore and suggest formal solutions to overcome the problems and drawbacks outlined above.
In this respect, we work within the framework of design-based inference: accordingly, we propose (i) a set of requirements for status and change assessment and (ii) a harmonized sampling strategy able to provide unbiased and consistent estimators of forest condition parameters and their changes at both country and European levels.
7.2 SAMPLING DESIGNS IN LARGE-SCALE FOREST MONITORING IN EUROPE
Data collection on forests over large areas can hardly cover all stands and trees, as complete enumeration (population census) is too time consuming and costly. Thus, both forest inventories and forest monitoring systems are based on data gathered from design-based surveys. Sampling consists of making observations on parts of the investigated population (the forest and its characteristics) to obtain estimates that are representative of the parameters of the population (such as volume per hectare or percent crown defoliation) and to assess the accuracy of the estimates. Observations are carried out on sample units whose distribution in the field is determined according to a sampling design. Multiphase sampling
strategies (e.g., Gregoire and Valentine, 2008; Mandallaz, 2008) are common in many types of forest inventories (for a typology of forest inventories, see Köhl et al., 2006). However, all design-based inventories over large areas share a common methodological feature: sample units are objectively selected by probabilistic rules as a means of guaranteeing the credibility of estimates (Olsen and Schreuder, 1997). The histories of NFIs show a progressive evolution toward statistical sampling techniques, with the majority of countries now using design-based sampling schemes (Lawrence et al., 2010). Similarly, several countries participating in the ICP Forests Level I network reported adopting design-based sampling (Cozzi et al., 2002), and the recently updated monitoring manual also refers to this point (Ferretti et al., 2010a). Traditionally, forest inventory data are analyzed in the framework of design-based inference, in which population values are regarded as fixed constants and the randomization distribution resulting from the sampling design is the basis of inference. In this framework, the bias and variance of an estimator of a population parameter are determined from the set of all possible samples (the sample space) and from the probability associated with each sample. Särndal et al. (1992), Gregoire (1998), and Fattorini (2001) provide extensive discussions of design-based inference and contrast it with model-based inference. Usually, forest inventories adopt sampling schemes in which a set of points is randomly selected from the study region in accordance with a spatial sampling design. The main sampling designs that may be applied in the context of large-scale forest monitoring are Uniform Random Sampling (URS), Pure Systematic Sampling (PSS), and Tessellation Stratified Sampling (TSS). Under URS, a set of points is randomly and independently selected in the study area.
URS is the fundamental selection method, and all other sampling procedures are modifications of URS. PSS, which is based on a regular grid of points with a random start, represents the scheme most commonly adopted by NFIs, and it has been used by ICP Forests to systematically select Level I plots (Table 7.1). TSS is a random systematic scheme based on a regular tessellation of the study area and the random placement of a point in each tessellation unit. Once the sampling design has been chosen, plots of adequate size are established at the selected points, and forest attributes are recorded for the trees within the plots (Corona, 2010). The shape of plots may be square, rectangular, or circular, although transects and angle counts are used too (Figure 7.1). Circular plots are used by most European NFIs, and many countries use cluster sampling, in which multiple plots (often four or more per cluster) are established in close spatial proximity (Lawrence et al., 2010). On many Level I plots of the ICP Forests network, a fixed number of nearest trees was customarily selected: for each point falling into a forest, the so-called cross-cluster plot was established, in which four further points are located along the N–S and E–W directions at a distance of 25 m from the central point, and at each point the six nearest trees are selected (UNECE, 1998). Recently, a shift to fixed-area plots has been suggested (Ferretti et al., 2010a).
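The three point-selection schemes described above can be illustrated with a minimal Python sketch. It assumes a rectangular study region whose sides are multiples of the grid spacing; the function names, the 64 km × 64 km region, and the 16 km spacing are hypothetical choices made for illustration only and are not taken from any national protocol.

```python
import math
import random


def urs_points(n, width, height, rng):
    """URS: n points placed independently and uniformly over the region."""
    return [(rng.random() * width, rng.random() * height) for _ in range(n)]


def pss_points(spacing, width, height, rng):
    """PSS: a regular square grid whose origin is shifted by one random start."""
    x0 = rng.random() * spacing
    y0 = rng.random() * spacing
    nx = math.ceil((width - x0) / spacing)
    ny = math.ceil((height - y0) / spacing)
    return [(x0 + i * spacing, y0 + j * spacing)
            for i in range(nx) for j in range(ny)]


def tss_points(spacing, width, height, rng):
    """TSS: one point placed uniformly at random inside each grid cell.

    Assumes width and height are multiples of spacing (partial cells ignored).
    """
    nx = int(width // spacing)
    ny = int(height // spacing)
    return [((i + rng.random()) * spacing, (j + rng.random()) * spacing)
            for i in range(nx) for j in range(ny)]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
urs = urs_points(16, 64.0, 64.0, rng)   # hypothetical 64 km x 64 km region
pss = pss_points(16.0, 64.0, 64.0, rng)  # hypothetical 16 km x 16 km grid
tss = tss_points(16.0, 64.0, 64.0, rng)
```

With these settings each scheme yields 16 points. They differ only in how the randomization is constrained: not at all (URS), through a single shared offset (PSS), or independently within each cell (TSS).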
TABLE 7.1 Systematic Grid Spacing and Number of Plots Adopted by European NFIs and ICP Forests Level I

Country | NFI: systematic grid spacing (km × km) | NFI: number of plots | ICP Forests Level I: systematic grid spacing (km × km) | ICP Forests Level I: number of plots
Albania | – | – | No survey in 2010 |
Andorra | – | – | 16 × 16 | 3
Austria | 3.889 × 3.889 | 22,236 | 16 × 16 | 135
Belarus | – | – | 16 × 16 | 410
Belgium (Walloon Region) | 1 × 0.5 | Approximately 11,000 | 4 × 4/8 × 8 | 119
Bulgaria | – | – | 4 × 4/8 × 8/16 × 16 | 159
Croatia | – | – | 16 × 16 | 84
Cyprus (a) | – | 1970 | 16 × 16 | 15
Czech Rep. (a) | 2 × 2 | Approximately 39,000 | 8 × 8/16 × 16 | 132
Denmark | 2 × 2 | 42,793 | 7 × 7/16 × 16 | 25
Estonia | 5 × 5 | 4500 | 16 × 16 | 97
Finland | 3 × 3 to 10 × 10 | 69,388 | 16 × 16/24 × 32 | 932
France (a) | 1.41 × 1.41 | 275,000 | 16 × 16 | 532
Germany | 2 × 2 to 4 × 4 | 54,009 | 4 × 4/16 × 16 | 415
Greece | – | 95,220 | – | 90
Hungary | – | – | 16 × 16 | 77
Ireland (a) | 2 × 2 | 17,423 | 16 × 16 | 36
Italy | 1 × 1 | 301,000 | 16 × 16 | 253
Latvia | 2 × 2 to 4 × 4 | – | 8 × 8 | 325
Liechtenstein | – | – | No survey in 2010 |
Lithuania | 4 × 4 | 7500 | 4 × 4/16 × 16 | 1065
Luxembourg | 1 × 0.5 | Approximately 1800 | No survey in 2010 |
FYR of Macedonia | – | – | No survey in 2010 |
Rep. of Moldova | – | – | 2 × 2 | 622
The Netherlands (a) | 1 × 1 | 3622 | 16 × 16 | 11
Norway | 3 × 3 | 16,522 | 3 × 3/9 × 9 | 1651
Poland | 4 × 4 | – | 16 × 16 | 1957
Portugal | 2 × 2 | 355,737 | No survey in 2010 |
Romania | 2 × 2 to 4 × 4 | 29,000 | 16 × 16 | 239
Russian Fed. | – | Approximately 150,000 | 32 × 32 | 288
Serbia | – | – | 4 × 4/16 × 16 | 130
Slovak Rep. | 4 × 4 | 12,268 | 16 × 16 | 108
Slovenia | 4 × 4 | 778 | 16 × 16 | 44
Spain | 1 × 1 | 95,327 | 16 × 16 | 620
Sweden | Varying | – | Varying | 3149
Switzerland | 1.41 × 1.41 | 165,000 | 16 × 16 | 48
Turkey | – | – | 16 × 16 | 555
Ukraine | – | – | 16 × 16 | 1505
United Kingdom (a) | – | Approximately 15,000 | 16 × 16 | 80

ICP Forests data refer to countries participating in the 2010 survey (Fischer and Lorenz, 2011). NFI data are taken from Lawrence et al. (2010).
(a) Denotes countries using a random component in NFI plot location.
7.3 RELATIONSHIP BETWEEN FCM AND NFI NETWORKS
Harmonized FCM and NFI networks, or perhaps a single network of field plots supporting both NFI and FCM information needs, would offer considerable advantages, enhancing the value of both NFIs and FCM in Europe (Ferretti, 2010).
At present, the analysis of the relationship between the ICP Forests Level I network and NFIs shows that, in some countries, the ICP Forests and NFI networks coincide while, in others, the two grids differ for several reasons. The most common are as follows:
• NFI and ICP Forests developed separately because they are under the responsibility of different administrations (e.g., Spain);
• The ICP Forests plots were initially selected as a subsample of the NFI grid; the NFI grid subsequently changed while the ICP Forests plots remained unchanged (e.g., Italy).
Table 7.2 gives an overview of the status of integration between the ICP Forests Level I and NFI networks. The results are based on data provided by the ICP Forests database integrated with an enquiry conducted in 2009–2010. In most of the countries where integration is under study or where it has been accomplished, the integration approach is based on the selection of the ICP Forests plots as a subsample of the NFI network.
7.4 DESIGN-BASED EUROPEAN MONITORING SYSTEM OF FOREST CONDITION
7.4.1 The Importance of Clear Objectives
Although the recent revision of the ICP Forests Manual (Ferretti et al., 2010b) emphasizes the need for a formal and operational definition of objectives, this has never been done for tree condition assessment (e.g., Eichhorn et al., 2010), and the desired accuracy of the estimates has never been addressed. This limits a proper assessment of the effectiveness of the monitoring program in terms of achievement of objectives and of cost-benefit. In addition, neglecting the formal definition of objectives has the undesired consequence of neglecting the field procedures necessary to achieve them. Quantitative assessment of forest condition parameters and of their changes rests on the scheme adopted to select the observation sites and the trees around selected sites. In theory, a design-based European forest monitoring system should allow (a) quantitative estimates of the condition attribute of interest (e.g., proportion of trees with defoliation >25%) for the target statistical population (e.g., the whole population of forest trees) and for defined subgroups (e.g., the beech trees) at a specified level of accuracy for each country and at the European level; and (b) quantitative estimates of change in the condition attribute of interest at both country and European levels, with a subsequent statistical test of the null hypothesis of no change. These generic objectives are rather common in management-oriented environmental monitoring (e.g., Urquhart et al., 1998) and will be developed hereafter by defining parameters and precision requirements for status and change detection.

FIGURE 7.1 Example of sample plot configuration for tree selection. Trees are depicted as small white circles and sampled trees as small black circles. (A) Configuration of a circular plot with radius r. (B) Configuration of a cluster with four circular plots of radius r. (C) ICP Forests Level I plot configuration: the plot has four subplots assembled in a cross-cluster, oriented along the main compass directions and at a distance L = 25 m from the plot center; on each subplot, the six nearest trees to the subplot center are selected as sample trees, resulting in a total of 24 sample trees per plot. (D) Configuration of concentric circular plots with radius r1, r2, and r3, respectively.

TABLE 7.2 Status of Integration Between NFIs and ICP Forests Level I Networks

Status of integration | Country | Integration approach
None | Andorra | –
None | Belgium/Wallonia | –
None | Bulgaria | –
None | Croatia | –
None | Czech Rep. | –
None | France | –
None | Germany (most regions) | –
None | Lithuania | –
None | Montenegro | –
None | Russian Fed. | –
None | Serbia | –
None | Slovak Rep. | –
None | Spain | –
None | The Netherlands | –
None | United Kingdom | –
Under study | Belgium/Flanders | Plot design NFI in ICP Forests
Under study | Denmark | ICP Forests subsample of NFI
Under study | Estonia | ICP Forests subsample of NFI
Under study | Germany/Baden-Württemberg | A slightly modified version of the NFI was assessed on ICP Forests for the first time in 2006
Under study | Ireland | ICP Forests and NFI network run in parallel until a time series exists which allows for the interpretation of trends
Under study | Italy | Hypothesis of new ICP Forests grid as subsample of NFI. For 1 or 2 years, the old ICP grid stays active to maintain the time series
Under study | Latvia | ICP Forests subsample of NFI
Under study | Norway | –
Accomplished | Austria | ICP Forests subsample of NFI
Accomplished | Belarus | –
Accomplished | Finland | ICP Forests subsample of NFI
Accomplished | Germany/Bavaria | ICP Forests subsample of NFI
Accomplished | Hungary | ICP Forests subsample of Growth Monitoring
Accomplished | Poland | ICP Forests subsample of NFI
Accomplished | Romania | New NFI plots on ICP Forests
Accomplished | Slovenia | ICP Forests subsample of NFI
Accomplished | Sweden | ICP Forests subsample of NFI
Accomplished | Switzerland | –
Accomplished | Turkey | –
7.4.2 Defining Parameters of Concern
Consider a population $U$ of $N$ trees over a delineated study area (e.g., all forest trees or a defined subgroup of forest trees in a country) and denote by $y_j$ ($j \in U$) the value of the defoliation level $Y$ for the $j$-th tree in the population. Usually, the defoliation level is defined as needle/leaf loss in the assessable crown as compared with a reference tree (Eichhorn et al., 2010; see Chapter 8), ranging from 0 to 100% and assessed in 5% classes.
Accordingly, $Y$ constitutes a discrete variable with range 0, 5, ..., 95, 100. The average defoliation value

$$\bar{Y} = \frac{1}{N} \sum_{j \in U} y_j \tag{7.1}$$

together with the fraction of trees with defoliation greater than 25%, say $F_{25}$, usually constitute the target parameters under estimation. Denote by $N_k$ the abundance of trees in the population whose defoliation level equals $k$%, with $k = 0(5)100$, and by $P_k$ the relative abundance, that is, the proportion of trees with the same defoliation level. Denote by $\mathbf{N} = [N_0, \ldots, N_{100}]^T$ the abundance vector of the 21 defoliation classes and by $\mathbf{P} = [P_0, \ldots, P_{100}]^T$ the relative abundance vector. Accordingly, the average defoliation value can be rewritten in terms of $\mathbf{P}$ as

$$\bar{Y} = \sum_{k=0}^{100} k P_k \tag{7.2}$$

while $F_{25}$ can be rewritten as

$$F_{25} = \sum_{k=30}^{100} P_k \tag{7.3}$$

Practically speaking, the main parameters of interest, $\bar{Y}$ and $F_{25}$, are linear combinations of the components of $\mathbf{P}$ of type

$$C = \mathbf{c}^T \mathbf{P} = \sum_{k=0}^{100} c_k P_k \tag{7.4}$$

where $\mathbf{c} = [0, 5, \ldots, 95, 100]^T$ in the case of $\bar{Y}$ and $\mathbf{c} = [0, 0, 0, 0, 0, 0, 1, \ldots, 1]^T$ in the case of $F_{25}$ (zeros for the six classes $k = 0, 5, \ldots, 25$ and ones from $k = 30$ onward, in agreement with Equation (7.3)). Henceforth, $\bar{Y}$ and $F_{25}$ will be viewed as particular cases of parameters of type (7.4), which will be referred to as C-parameters. Obviously, the estimation of C-parameters rests on the estimation of $\mathbf{P}$, which, in turn, rests on the estimation of $\mathbf{N}$. Moreover, it is also worth noting that C-parameters constitute percentages and as such they are not affected by standards, as opposed to other physical attributes of trees (e.g., bole volume, basal area, living biomass, necromass). As to change, denote by $\mathbf{N}_1$ and $\mathbf{N}_2$ the abundance vectors at periods 1 and 2, in such a way that $\mathbf{P}_1$ and $\mathbf{P}_2$ are the corresponding relative abundance vectors and $C_1$ and $C_2$ are the C-parameter values at periods 1 and 2. Change in a C-parameter is defined as

$$\Delta = C_2 - C_1 = \mathbf{c}^T (\mathbf{P}_2 - \mathbf{P}_1) \tag{7.5}$$

Positive values of $\Delta$ denote (at least for nonnegative $c_k$s) increases in defoliation and hence are considered undesirable. However, for the subsequent inference on changes, it is important to determine when a positive difference is small enough to be considered biologically irrelevant (see, e.g., Elzinga et al., 2001, p. 179). For the purposes of this proposal, it seems suitable to consider positive changes lower than 5% as biologically irrelevant, so that 5 is taken as the minimum value for a biologically significant change (BSC). The defoliation parameters considered for each country can also be defined at the European level, provided that the same definition of forest is adopted across European countries. In this case, suppose the presence of $L$ countries and denote by $\mathbf{N}_l$ the abundance vector of the 21 defoliation classes for the $l$-th country ($l = 1, \ldots, L$). Hence, $\mathbf{N}_E = \mathbf{N}_1 + \cdots + \mathbf{N}_L$ denotes the abundance vector for the whole of Europe, while $\mathbf{P}_E$ denotes the relative abundance vector. Once the vector $\mathbf{P}_E = [P_{0,E}, \ldots, P_{100,E}]^T$ is achieved, the C-parameter at the European level is given by

$$C_E = \sum_{k=0}^{100} c_k P_{k,E} \tag{7.6}$$

It is worth noting that the estimation of C-parameters at the European level ultimately rests on the estimation of the $\mathbf{N}_l$s in each country. Finally, denote by $\mathbf{N}_{l,1}$ and $\mathbf{N}_{l,2}$ the abundance vectors at periods 1 and 2 for the $l$-th country, in such a way that $\mathbf{N}_{E,1} = \mathbf{N}_{1,1} + \cdots + \mathbf{N}_{L,1}$ and $\mathbf{N}_{E,2} = \mathbf{N}_{1,2} + \cdots + \mathbf{N}_{L,2}$, respectively, denote the abundance vectors for the whole of Europe at periods 1 and 2 and $\mathbf{P}_{E,1}$ and $\mathbf{P}_{E,2}$ are the corresponding relative abundance vectors. Then, $C_{E,1}$ and $C_{E,2}$ are the values of the C-parameter at periods 1 and 2 at the European level and the change turns out to be

$$\Delta_E = C_{E,2} - C_{E,1} = \mathbf{c}^T (\mathbf{P}_{E,2} - \mathbf{P}_{E,1}) \tag{7.7}$$

As for any single country, a positive change of 5 percentage points is considered the minimum BSC.
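The definitions in this section reduce to a few lines of arithmetic on abundance vectors. The following Python sketch is only an illustration: the function names and the tree counts for the two countries are hypothetical, and the abundance vectors are treated as known population values rather than estimates.

```python
CLASSES = list(range(0, 101, 5))  # the 21 defoliation classes, k = 0(5)100

# Coefficient vectors c for the two C-parameters discussed in the text.
C_MEAN = [float(k) for k in CLASSES]               # average defoliation
C_F25 = [1.0 if k > 25 else 0.0 for k in CLASSES]  # fraction above 25%


def relative_abundance(abundance):
    """Turn an abundance vector N into the relative abundance vector P."""
    total = sum(abundance)
    return [n_k / total for n_k in abundance]


def c_parameter(c, abundance):
    """C = c^T P for a given coefficient vector c and abundance vector N."""
    p = relative_abundance(abundance)
    return sum(c_k * p_k for c_k, p_k in zip(c, p))


def change(c, abundance_1, abundance_2):
    """Delta = C_2 - C_1 between two periods."""
    return c_parameter(c, abundance_2) - c_parameter(c, abundance_1)


def pooled_abundance(country_vectors):
    """N_E = N_1 + ... + N_L, summed class by class."""
    return [sum(column) for column in zip(*country_vectors)]


def vector(counts_by_class):
    """Helper: build a 21-element abundance vector from {class: count}."""
    return [counts_by_class.get(k, 0) for k in CLASSES]


# Hypothetical populations, invented purely for illustration.
country_a = vector({0: 50, 40: 50})
country_b = vector({10: 300})
europe = pooled_abundance([country_a, country_b])

mean_a = c_parameter(C_MEAN, country_a)
f25_a = c_parameter(C_F25, country_a)
mean_europe = c_parameter(C_MEAN, europe)
delta_a = change(C_MEAN, country_a, vector({0: 40, 40: 60}))
```

In this toy case the pooled European mean differs from the simple average of the two country means, because pooling abundance vectors weights each country by its number of trees. The change of 4 percentage points in the hypothetical country A would fall below the minimum BSC of 5.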
7.4.3 Defining Accuracy Measures for Status Assessment
A first need in planning a monitoring program is to fix the required accuracy level for status and change estimates. Denote by $S$ a sample of trees selected from the population $U$ according to a design-based sampling scheme. Once a sample $S$ is selected, the defoliation level is quantified for each tree in the sample, thus obtaining the sample data, say $\{y_j; j \in S\}$, from which an estimate of $C$, say $\hat{C}$, is achieved. Usually, statisticians tend to avoid biased estimators and prefer sampling strategies providing unbiased or, at least, nearly unbiased estimators. Indeed, the accuracy of an unbiased estimator is straightforwardly determined by its variance, say $\mathrm{Var}(\hat{C})$. Being a squared quantity, the sampling variance is usually more difficult to interpret than its positive square root, say $\mathrm{SE}(\hat{C})$, referred to as the standard error, or the ratio $\mathrm{RSE}(\hat{C}) = \mathrm{SE}(\hat{C})/C$, referred to as the relative standard error, or $\mathrm{PSE}(\hat{C}) = 100\,\mathrm{RSE}(\hat{C})\%$, which gives the percentage standard error. All these indexes can be used interchangeably and are adopted by statisticians to evaluate the accuracy of unbiased sampling strategies. Unfortunately, the sampling variance of any estimator is actually unknown and must be estimated from sample data. Once an estimate $V_C^2$ is obtained for the sampling variance, the corresponding estimators of the standard error, relative standard error, and percentage standard error are $V_C$, $V_C/\hat{C}$, and $100(V_C/\hat{C})\%$, respectively. Usually, statisticians seek unbiased estimators of the sampling variances. However, when unbiasedness cannot be ensured, conservative estimators are preferred, that is, estimators which, on average, overestimate the sampling variance, thus avoiding false and optimistic evaluations of accuracy. As far as the current status is concerned and for the purposes of this proposal, an estimated percentage standard error of 5% seems to be a suitable target for the accuracy of the estimators of C-parameters. Finally, in the class of unbiased estimators, the normality of the sampling distributions also constitutes a very attractive characteristic. Indeed, unbiasedness and normality allow for the construction of 0.95 confidence intervals, which are simply obtained as $\hat{C} \pm 2V_C$.
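The accuracy indexes above can be computed directly from an estimate and its estimated variance. The sketch below is a minimal illustration in Python; the function name, the dictionary keys, and the input values (an estimated mean defoliation of 20% with estimated variance 0.64) are hypothetical.

```python
import math


def accuracy_measures(c_hat, variance_estimate):
    """Estimated SE, RSE, PSE and approximate 0.95 interval for C-hat.

    variance_estimate plays the role of V_C^2 in the text.
    """
    v_c = math.sqrt(variance_estimate)  # estimated standard error V_C
    rse = v_c / c_hat                   # estimated relative standard error
    return {
        "se": v_c,
        "rse": rse,
        "pse": 100.0 * rse,             # percentage standard error
        "ci95": (c_hat - 2.0 * v_c, c_hat + 2.0 * v_c),
    }


# Hypothetical inputs, chosen only to exercise the formulas.
result = accuracy_measures(c_hat=20.0, variance_estimate=0.64)
meets_target = result["pse"] <= 5.0  # the 5% target suggested in the text
```

Here the percentage standard error is about 4%, so the hypothetical estimate would meet the 5% target.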
7.4.4 Defining Accuracy Measures for Change Assessment
Change detection is one of the most important outcomes of a monitoring program. The reports on European forest condition always present statistics about annual changes and graphs on long-term trends. In practice, the objective is to determine whether there has been a change in C-parameters. Suppose that two estimates of $C$, say $\hat{C}_1$ and $\hat{C}_2$, are obtained from the same sample $S$ at periods 1 and 2, respectively, in such a way that the estimate of change $\Delta$ is given by $\hat{\Delta} = \hat{C}_2 - \hat{C}_1$. Thus, a test for statistical significance must be conducted to determine whether a true change has occurred or whether the difference is simply due to sampling error. If the estimators $\hat{C}_1$ and $\hat{C}_2$ are (approximately) unbiased, their difference $\hat{\Delta}$ is (approximately) normal, and an unbiased (or conservative) estimator for the variance of $\hat{\Delta}$ is available, the p-value of the test is given by

$$p = 1 - \Phi\left(\frac{\hat{\Delta}}{V_\Delta}\right) \tag{7.8}$$

where $\Phi$ denotes the standard normal distribution function and $V_\Delta$ is an estimate of the standard error of $\hat{\Delta}$. If $p$ is smaller than a threshold value, say $\alpha$, the hypothesis of no change is rejected at significance level $\alpha$. For the purposes of this proposal, a suitable value for $\alpha$ is 0.05. As Elzinga et al. (2001, p. 179) point out, if the test yields a nonsignificant result, it is important to evaluate the probability of rejecting the hypothesis of no change when a BSC of size $\Delta$ has actually occurred (power of the test). For a given value of $\alpha$, under the same assumptions previously adopted to achieve the p-value, the power of the test turns out to be

$$1 - \beta = 1 - \Phi\left(z_{1-\alpha} - \frac{\Delta}{V_\Delta}\right) \tag{7.9}$$

where $z_q$ denotes the $q$-quantile of the standard normal distribution function. If the resulting power is low, a change may have taken place even though the hypothesis of no change was not rejected. Since $1 - \beta$ is an increasing function of $\Delta$, power need only be computed for $\Delta$ equal to the minimum BSC in order to obtain the lower bound for the power of detecting BSCs. In this study, for $\alpha = 0.05$ and a minimum BSC equal to 5 percentage points, the lower bound turns out to be

$$1 - \beta = 1 - \Phi\left(1.64 - \frac{5}{V_\Delta}\right) \tag{7.10}$$

Alternatively, if a value of $1 - \beta$ is fixed and Equation (7.9) is solved for $\Delta$, the so-called minimum detectable change (MDC)

$$\mathrm{MDC}_{1-\beta} = (z_{1-\alpha} - z_\beta)\, V_\Delta \tag{7.11}$$

is achieved, that is, the minimum change that can be detected with probability $1 - \beta$. For this proposal, a suitable value for $1 - \beta$ is 0.9, in such a way that $\mathrm{MDC}_{0.9} = 2.92 V_\Delta$. It is worth noting that these definitions have never been suggested on a formal basis for FCM at the European level.
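Equations (7.8), (7.9), and (7.11) can be evaluated with the standard library alone. The sketch below uses Python's statistics.NormalDist for the standard normal distribution function and its quantiles; the function names and the example standard error of 1.5 percentage points are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def p_value(delta_hat, v_delta):
    """One-sided p-value of Equation (7.8)."""
    return 1.0 - STD_NORMAL.cdf(delta_hat / v_delta)


def power(delta, v_delta, alpha=0.05):
    """Power of Equation (7.9) against a true change of size delta."""
    z = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(z - delta / v_delta)


def minimum_detectable_change(v_delta, alpha=0.05, target_power=0.9):
    """MDC of Equation (7.11): (z_{1-alpha} - z_beta) * V_Delta."""
    beta = 1.0 - target_power
    return (STD_NORMAL.inv_cdf(1.0 - alpha) - STD_NORMAL.inv_cdf(beta)) * v_delta


# Hypothetical estimated standard error of the change, in percentage points.
v_delta = 1.5
power_at_bsc = power(delta=5.0, v_delta=v_delta)  # lower bound, Equation (7.10)
mdc = minimum_detectable_change(v_delta)
```

With unrounded quantiles the MDC multiplier is about 2.93; the value 2.92 quoted in the text corresponds to the rounded quantiles 1.64 and 1.28.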
7.5 SAMPLING STRATEGIES AT THE COUNTRY LEVEL
Plot sampling represents a unifying scheme for sampling trees which, at the same time, accommodates the likely differences among the overall sampling designs adopted by each European country. Indeed, plot sampling simply involves the selection of a prefixed number of sites in accordance with a spatial sampling scheme and the subsequent selection of all the trees lying within plots of prefixed size centered at the sites. Thus, each country may vary the scheme used to select sites and its intensity (number of sites per 100 ha) as well as the size and shape of plots. For the purposes of this proposal, the sampling schemes introduced in Section 7.2 are considered to select sites: URS, PSS, and TSS.
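The tree-selection step of plot sampling described above can be sketched as follows for a circular plot. This is an illustration under stated assumptions: the stand coordinates, the 13 m radius, and the function names are hypothetical, and a tree is taken to be inside the plot when its stem position lies within the radius.

```python
import math

CLASSES = range(0, 101, 5)  # the 21 defoliation classes, in percent


def trees_in_plot(site, radius, trees):
    """Return every tree whose stem lies within `radius` of the plot centre."""
    sx, sy = site
    return [t for t in trees if math.hypot(t[0] - sx, t[1] - sy) <= radius]


def class_counts(sampled):
    """Tally the sampled trees by defoliation class."""
    counts = {k: 0 for k in CLASSES}
    for _x, _y, defoliation in sampled:
        counts[defoliation] += 1
    return counts


# Hypothetical stand: (x in m, y in m, defoliation class in %).
trees = [(1.0, 1.0, 10), (3.0, 4.0, 30), (12.0, 0.0, 30), (20.0, 20.0, 55)]
sampled = trees_in_plot(site=(0.0, 0.0), radius=13.0, trees=trees)
counts = class_counts(sampled)
plot_area = math.pi * 13.0 ** 2  # plot size, in square metres
```

Because every tree within the fixed-area plot is taken, each tree has the same chance of entering the sample, which is not the case when a fixed number of nearest trees is selected.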
7.5.1 Uniform Random Sampling
Consider an area $\mathcal{G}$ of size $G$ covering the study area in such a way as to eliminate any edge effect (Gregoire and Valentine, 2008). Then, a point (site) is randomly selected onto $\mathcal{G}$, and the sampled trees $S$ are those lying within the plot of prefixed shape and prefixed size $b$ centered at the random site. By construction, the probability of any tree entering the sample (inclusion probability) is invariably equal to $b/G$, from which the Horvitz–Thompson (HT) estimator (Särndal et al., 1992, Section 2.8) of $N_k$ turns out to be

$$\hat{N}_k = \frac{G}{b}\, n_k, \quad k = 0(5)100 \tag{7.12}$$
where nk denotes the number of sampled trees whose defoliation level equal k. Accordingly, the HT estimate of the whole abundance vector N can be written as ^ ¼ Gn N b
(7.13)
where n ¼ [n0, . . . , n100]T is the vector of the counts of sampled trees belonging to the 21 defoliation classes. From the well-known result on plot sampling (e.g., ^ is an unbiased estimator of N with a variance– Gregoire and Valentine, 2008), N ^ covariance matrix, say VarURS ðNÞ, where henceforth VURS will denote variances and covariances arising from URS, that is, the complete random placement of sites onto G. The variances of the N^k s strictly depend on the spatial distribution of trees within the study area: a distribution of trees evenly scattered throughout the study area generally provides more accurate estimator than a clumped one. As no country can be adequately sampled by means of one site, R sites are randomly and independently selected (Figure 7.2). Hence, the replication
N W
E S
FIGURE 7.2 Uniform random sampling (URS) with circular plots. The study area (in light gray) is covered by an enlarged area G to eliminate edge effects and R plots are randomly and independently located onto G. The dark gray zone represents the forest area, black triangles represent the selected sites, gray points represent the forest trees, and white points represent the selected trees.
Author's personal copy Chapter
7
Large-Scale Pan-European Forest Monitoring Network
121
procedure gives rise to $R$ independent samples, say $S_1, \ldots, S_R$, which in turn give rise to $R$ estimates, say $\hat{\mathbf{N}}_1, \ldots, \hat{\mathbf{N}}_R$, which constitute $R$ independent realizations of the HT estimator $\hat{\mathbf{N}}$. Accordingly, on the basis of the very standard results on independently and identically distributed random vectors (e.g., Mardia et al., 1979, Section 2.8 and Theorem 2.9.1), the arithmetic mean vector

$$\hat{\bar{\mathbf{N}}} = \frac{1}{R}\sum_{i=1}^{R} \hat{\mathbf{N}}_i \tag{7.14}$$
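provides an estimator for $\mathbf{N}$ which is unbiased, consistent, and asymptotically ($R \to \infty$) normal with a variance–covariance matrix which is unbiasedly and consistently estimated by $\mathbf{V}_N = \mathbf{S}/R$, where

$$\mathbf{S} = \frac{1}{R-1}\sum_{i=1}^{R} \left(\hat{\mathbf{N}}_i - \hat{\bar{\mathbf{N}}}\right)\left(\hat{\mathbf{N}}_i - \hat{\bar{\mathbf{N}}}\right)^T \tag{7.15}$$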
is the empirical variance–covariance matrix of the $\hat{\mathbf{N}}_i$s.

In accordance with these results, an obvious estimator for $\mathbf{P}$ is given by $\hat{\mathbf{P}} = \hat{\bar{\mathbf{N}}}/(\mathbf{1}^T\hat{\bar{\mathbf{N}}})$. After a little algebra, it can be proven that the $k$-th component of $\hat{\mathbf{P}}$ can be simply rewritten as $\hat{P}_k = T_k/T$, where $T$ denotes the total number of trees sampled by the $R$ sites and $T_k$ denotes the number of these trees belonging to the $k$-th defoliation class. From the most familiar version of the Delta method (see, e.g., Mardia et al., 1979, Theorem 2.9.2), it follows that $\hat{\mathbf{P}}$ constitutes a consistent and asymptotically normal estimator for $\mathbf{P}$ with a variance–covariance matrix which is consistently estimated by $\mathbf{V}_P = (\mathbf{I} - \hat{\mathbf{P}}\mathbf{1}^T)\mathbf{V}_N(\mathbf{I} - \mathbf{1}\hat{\mathbf{P}}^T)$, where $\mathbf{I}$ denotes the identity matrix of appropriate order. Finally, any C-parameter can be simply estimated by $\hat{C} = \mathbf{c}^T\hat{\mathbf{P}}$. Thus, from the Delta method, $\hat{C}$ constitutes a consistent and asymptotically normal estimator for $C$ with variance which is consistently estimated by $V_C^2 = \mathbf{c}^T\mathbf{V}_P\mathbf{c}$. Accordingly, the estimator of the percentage standard error for $\hat{C}$ is given by $100(V_C/\hat{C})\%$, while the confidence interval with an asymptotic coverage of about 0.95 is given by $\hat{C} \pm 2V_C$.

As to the inference on change, denote by $\hat{\mathbf{N}}_t$ the HT estimator of $\mathbf{N}_t$ based on a unique plot randomly selected onto $\mathcal{G}$ and then visited at period $t$ ($t = 1, 2$). From the previous considerations on HT estimators, $\hat{\mathbf{N}}_t$ is unbiased with a variance–covariance matrix $\mathrm{Var}_{URS}(\hat{\mathbf{N}}_t)$. Moreover, denote by $\mathrm{Cov}_{URS}(\hat{\mathbf{N}}_1, \hat{\mathbf{N}}_2)$ the covariance matrix between the two estimators. As $R$ sites are randomly and independently thrown onto $\mathcal{G}$, the replication procedure gives rise to $R$ pairs of estimates $(\hat{\mathbf{N}}_{1,1}, \hat{\mathbf{N}}_{1,2}), \ldots, (\hat{\mathbf{N}}_{R,1}, \hat{\mathbf{N}}_{R,2})$ which constitute $R$ independent realizations of the pair $(\hat{\mathbf{N}}_1, \hat{\mathbf{N}}_2)$. From the above-mentioned results on independently and identically distributed random vectors, the arithmetic mean vector of the $\hat{\mathbf{N}}_{i,t}$s, say $\hat{\bar{\mathbf{N}}}_t$, is an unbiased, consistent, and asymptotically normal estimator of $\mathbf{N}_t$ with a variance–covariance matrix which is unbiasedly and consistently estimated by $\mathbf{V}_{N,t} = \mathbf{S}_t/R$, where $\mathbf{S}_t$ is the empirical
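variance–covariance matrix of the $\hat{\mathbf{N}}_{i,t}$s, while the covariance matrix is unbiasedly and consistently estimated by $\mathbf{C}_N = \mathbf{S}_{1,2}/R$, where

$$\mathbf{S}_{1,2} = \frac{1}{R-1}\sum_{i=1}^{R} \left(\hat{\mathbf{N}}_{i,1} - \hat{\bar{\mathbf{N}}}_1\right)\left(\hat{\mathbf{N}}_{i,2} - \hat{\bar{\mathbf{N}}}_2\right)^T \tag{7.16}$$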
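is the empirical covariance matrix of the $\hat{\mathbf{N}}_{i,1}$s and $\hat{\mathbf{N}}_{i,2}$s. From the Delta method, $\hat{\mathbf{P}}_t = \hat{\bar{\mathbf{N}}}_t/(\mathbf{1}^T\hat{\bar{\mathbf{N}}}_t)$ is a consistent and asymptotically normal estimator of $\mathbf{P}_t$ with a variance–covariance matrix which is consistently estimated by $\mathbf{V}_{P,t} = (\mathbf{I} - \hat{\mathbf{P}}_t\mathbf{1}^T)\mathbf{V}_{N,t}(\mathbf{I} - \mathbf{1}\hat{\mathbf{P}}_t^T)$ and covariance matrix which is consistently estimated by $\mathbf{C}_P = (\mathbf{I} - \hat{\mathbf{P}}_1\mathbf{1}^T)\mathbf{C}_N(\mathbf{I} - \mathbf{1}\hat{\mathbf{P}}_2^T)$. From these last results, the difference $\hat{\mathbf{P}}_2 - \hat{\mathbf{P}}_1$ turns out to be a consistent and asymptotically normal estimator of $\mathbf{P}_2 - \mathbf{P}_1$, with a variance–covariance matrix which is consistently estimated by $\mathbf{V}_{P,1} + \mathbf{V}_{P,2} - \mathbf{C}_P - \mathbf{C}_P^T$. Hence, the difference estimator $\hat{D} = \hat{C}_2 - \hat{C}_1 = \mathbf{c}^T(\hat{\mathbf{P}}_2 - \hat{\mathbf{P}}_1)$ is a consistent and asymptotically normal estimator of $D$ with variance which is consistently estimated by $V_D^2 = \mathbf{c}^T(\mathbf{V}_{P,1} + \mathbf{V}_{P,2} - \mathbf{C}_P - \mathbf{C}_P^T)\mathbf{c}$. Owing to the asymptotic unbiasedness and normality of $\hat{D}$ as well as the consistency of $V_D^2$, the p-value, the power, and the MDC adopted for inference on change can be computed via expressions (7.8), (7.9), and (7.11), respectively.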
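The chain from site-level counts to the status and change estimates can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the simulated counts, and the values of `G`, `b`, and the indicator vector `c` (here flagging classes above 25%) are all hypothetical. One further assumption deserves attention: the sketch applies the full Jacobian of the ratio N/(1ᵀN), which carries a factor 1/(1ᵀN) that the projection-type expressions for V_P and C_P do not display explicitly.

```python
import numpy as np

def ratio_and_jacobian(N_bar):
    """Return P = N_bar / (1^T N_bar) and the Jacobian of that map at N_bar."""
    total = N_bar.sum()
    P = N_bar / total
    K = P.size
    # Assumption of this sketch: the 1/total scale factor is kept explicitly
    J = (np.eye(K) - np.outer(P, np.ones(K))) / total
    return P, J

def status_and_change(counts1, counts2, G, b, c):
    """counts_t: (R, K) tree counts per site and defoliation class at period t."""
    N1 = (G / b) * np.asarray(counts1, dtype=float)   # HT estimates, one row per site
    N2 = (G / b) * np.asarray(counts2, dtype=float)
    R = N1.shape[0]
    d1 = N1 - N1.mean(axis=0)
    d2 = N2 - N2.mean(axis=0)
    V_N1 = d1.T @ d1 / (R - 1) / R      # S_1 / R
    V_N2 = d2.T @ d2 / (R - 1) / R      # S_2 / R
    C_N = d1.T @ d2 / (R - 1) / R       # S_{1,2} / R
    P1, J1 = ratio_and_jacobian(N1.mean(axis=0))
    P2, J2 = ratio_and_jacobian(N2.mean(axis=0))
    V_P1, V_P2 = J1 @ V_N1 @ J1.T, J2 @ V_N2 @ J2.T
    C_P = J1 @ C_N @ J2.T
    C1, C2 = c @ P1, c @ P2
    return {
        "P1": P1, "P2": P2, "C1": C1, "C2": C2,
        "se_C1": np.sqrt(c @ V_P1 @ c),
        "D": C2 - C1,
        "var_D": c @ (V_P1 + V_P2 - C_P - C_P.T) @ c,
    }

# Hypothetical data: 30 sites, 21 defoliation classes (0, 5, ..., 100)
rng = np.random.default_rng(1)
levels = np.arange(0, 101, 5)
c = (levels > 25).astype(float)          # illustrative C-parameter: share above 25%
counts1 = rng.poisson(1.0, size=(30, levels.size))
counts2 = counts1 + rng.poisson(0.2, size=counts1.shape)
res = status_and_change(counts1, counts2, G=1.0e6, b=400.0, c=c)
```

In this sketch `res["P1"]` coincides with the pooled proportions T_k/T, as stated in the text, and the interval `res["C1"] ± 2 * res["se_C1"]` gives the approximate 0.95 coverage interval for the status estimate.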
7.5.2 URS Versus Systematic and Stratified Sampling

Gregoire and Valentine (2008) provide an excellent introductory chapter on the issue of sampling discrete objects (trees in the present case) scattered over a region by means of plots, focusing on the problem of how to effectively select these plots. Despite its theoretical simplicity, URS may lead to an uneven coverage of the study area (Cordy and Thompson, 1995; Stevens, 2006). To avoid this shortcoming, systematic schemes can be adopted. However, PSS based on a regular grid of plots with a random start (commonly adopted in large-scale forest inventories; Figure 7.3) may be unsuitable in the presence of some spatial regularity, leading to substantial losses of efficiency with respect to URS. Accordingly, random systematic schemes based on a regular tessellation of the study area and the random placement of a plot in each tessellation unit have been theoretically preferred by statisticians. One such scheme, usually referred to as TSS by Cordy and Thompson (1995) and Stevens (1997), involves enlarging the study area to a region $\mathcal{G}$ constituted by $R$ nonoverlapping polygons of equal size, each of which contains at least a portion of the study area, and then selecting a plot in each of these polygons (Figure 7.4). The scheme has a long-standing tradition in the statistical literature (see, e.g., Overton and Stehman, 1993). If the $R$ sites/plots are thrown onto the same reference region $\mathcal{G}$, TSS invariably outperforms URS, in the sense that under TSS, $\hat{\bar{\mathbf{N}}}$ is unbiased with a variance–covariance matrix such that $\mathrm{Var}_{URS}(\hat{\bar{\mathbf{N}}}) \geq \mathrm{Var}_{TSS}(\hat{\bar{\mathbf{N}}})$
FIGURE 7.3 Pure systematic sampling (PSS) with circular plots. The study area (in light gray) is covered by an enlarged area G partitioned into R regular polygons; a plot is randomly located in one polygon and then repeated in the remaining R − 1. The dark gray zone represents the forest area, black triangles represent the selected sites, gray points represent the forest trees, and white points represent the selected trees.
(e.g., Barabesi and Franceschi, 2011), where henceforth $\mathrm{E}_{TSS}$ and $\mathrm{Var}_{TSS}$ will denote expectations, variances, and covariances arising from the TSS scheme. Interestingly, TSS displays variances and covariances decreasing with $R^{-3/2}$ (Barabesi and Franceschi, 2011) while URS displays variances and covariances decreasing with $R^{-1}$. Accordingly, for large $R$, TSS gives rise to relevant gains in precision with respect to URS. Moreover, under weak assumptions, the asymptotic normality of $\hat{\bar{\mathbf{N}}}$ is preserved in the case of TSS (Barabesi and Franceschi, 2011). Hence, from an enlarged version of the Delta method (e.g., Shao and Tu, 1995, p. 448), under TSS the estimators $\hat{\mathbf{P}}$, $\hat{C}$, and $\hat{D}$ derived from $\hat{\bar{\mathbf{N}}}$ are consistent and asymptotically normal with variances and covariances decreasing with $R^{-3/2}$. Finally, under TSS, $\mathbf{V}_N$ constitutes a conservative estimator for $\mathrm{Var}_{TSS}(\hat{\bar{\mathbf{N}}})$ in the sense that $\mathrm{E}_{TSS}(\mathbf{V}_N) = \mathrm{Var}_{TSS}(\hat{\bar{\mathbf{N}}}) + \mathbf{H}$, where $\mathbf{H}$ is a positive definite matrix (the proof of this result is simply based on the independence of the $\hat{\mathbf{N}}_i$s), while $\mathbf{V}_P$, $V_C^2$, and $V_D^2$ are asymptotically conservative, in the
FIGURE 7.4 Tessellation stratified sampling (TSS) with circular plots. The study area (in light gray) is covered by an enlarged area G partitioned into R regular polygons and a plot is randomly located in each polygon. The dark gray zone represents the forest area, black triangles represent the selected sites, gray points represent the forest trees, and white points represent the selected trees.
sense that they are asymptotically equivalent to conservative estimators for $\mathrm{Var}_{TSS}(\hat{\mathbf{P}})$, $\mathrm{Var}_{TSS}(\hat{C})$, and $\mathrm{Var}_{TSS}(\hat{D})$.

These theoretical results cannot be proved under PSS (in the presence of some spatial regularity, PSS may be even worse than URS). However, apart from anomalous situations which should not occur over large areas, the performance of PSS is likely to be very similar (sometimes superior) to that of TSS. Moreover, the use of systematic schemes is suitable in forest studies, as they can be straightforwardly executed by a random shift of a grid superimposed onto a map of the study area (e.g., Gregoire and Valentine, 2008, p. 119), taking the nodes as sample sites and locating the sites in the terrain by GPS. Accordingly, under PSS the estimator $\hat{\bar{\mathbf{N}}}$ as well as the subsequent estimators $\hat{\mathbf{P}}$, $\hat{C}$, and $\hat{D}$ are henceforth supposed to share the theoretical properties arising from TSS.
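The precision gain of TSS over URS can be illustrated with a small Monte Carlo experiment. The sketch below is only a toy under stated assumptions: the tree population (a unit square with density increasing along x), the square plot shape, the 4 × 4 tessellation, and all names are hypothetical, and edge effects are removed by wrap-around distances rather than by an enlarged region.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tree population on the unit square (G = 1), denser towards x = 1
n_trees = 1000
trees = np.column_stack([rng.random(n_trees) ** (1 / 3), rng.random(n_trees)])

side = 0.05          # square plot side, so b = side**2
grid = 4             # 4 x 4 tessellation, hence R = 16 sites
R = grid * grid

def ht_total(sites):
    """Mean over the sites of the HT estimates (G / b) * n_i, with G = 1."""
    d = np.abs(sites[:, None, :] - trees[None, :, :])
    d = np.minimum(d, 1.0 - d)               # wrap-around: every tree has inclusion prob. b / G
    counts = np.all(d <= side / 2, axis=2).sum(axis=1)
    return counts.mean() / side**2

def urs_sites():
    """R sites placed uniformly and independently over the whole square."""
    return rng.random((R, 2))

def tss_sites():
    """One site placed uniformly within each cell of the regular tessellation."""
    ix, iy = np.divmod(np.arange(R), grid)
    corners = np.column_stack([ix, iy]) / grid
    return corners + rng.random((R, 2)) / grid

reps = 1000
urs = np.array([ht_total(urs_sites()) for _ in range(reps)])
tss = np.array([ht_total(tss_sites()) for _ in range(reps)])

print("URS: mean %.0f, variance %.0f" % (urs.mean(), urs.var()))
print("TSS: mean %.0f, variance %.0f" % (tss.mean(), tss.var()))
```

Both schemes should centre on the true total of 1000 trees, while the simulated variance under TSS should be visibly smaller than under URS, because the tessellation removes the between-cell component of the spatial trend.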
7.5.3 Sampling Effort: A Preliminary Test
As reported in Section 7.5.1, $T$ denotes the total number of trees sampled by the $R$ sites selected at the country level and $T_k$ denotes the number of these trees belonging to the $k$-th defoliation class. These statistics are needed for the estimation of the relative abundance vector $\hat{\mathbf{P}}$, which in turn allows for the computation of any C-parameter estimate. Descriptive statistics from the ICP Forests large-scale monitoring usually report the number of sites assessed in each country, the total number of selected trees, and the total number of selected trees belonging to the 21 defoliation classes. These statistics might be used as a reference to identify, for each country, the number of sites needed to obtain an estimate of the parameter of concern, say the proportion of trees with defoliation greater than 25% ($\hat{F}_{25}$). Thus, a preliminary test has been conducted to assess the theoretical number of sampling sites that should be selected at the country level (Travaglini et al., 2012). To do this, the sampling effort of the ICP Forests network in terms of number of plots ($R$) and number of trees ($T$) has been compared with the theoretical sampling effort ($R_{0.05}$, $T_{0.05}$) required for estimating the proportion of trees with defoliation greater than 25% ($\hat{F}_{25}$) with a percentage standard error ($e$) of 5%. For each country, data related to $R$, $T$, and $\hat{F}_{25}$ have been taken from the 2008 survey (Lorenz et al., 2009). The theoretical sampling effort in terms of trees has been computed by

$$T_{0.05} = \frac{1 - \hat{F}_{25}}{e^2\,\hat{F}_{25}} \tag{7.17}$$
and on the basis of the following assumptions: a common definition of forest is applied across European countries; trees are selected from the population by simple random sampling with replacement. $R_{0.05}$ has been derived by dividing $T_{0.05}$ by the average number of trees per plot ($T/R$) observed in the 2008 surveys. The results are shown in Table 7.3. It is worth noting that, as these sampling efforts are determined presuming a rough with-replacement random selection of trees from the population, which should be less accurate than the systematic grids of plots adopted in most European countries, the reported efforts are highly conservative and likely provide standard errors smaller than 5%. It is worth noting, however, that the number of plots reported in Table 7.3 may not be appropriate for small countries and/or low frequency of defoliated trees, and/or individual species (Köhl et al., 1994).
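Expression (7.17) follows from the with-replacement variance of a proportion, F(1 − F)/T: setting the relative standard error sqrt((1 − F)/(T F)) equal to e and solving for T gives the formula. A minimal Python sketch of the computation is given below; the function names are hypothetical, and the inputs are the Belarus figures reported in Table 7.3 (F̂25 = 8%, T/R = 23.65).

```python
def required_trees(f25, e=0.05):
    """Trees needed for a relative standard error e, as in (7.17).

    f25 is the proportion (not the percentage) of trees above 25% defoliation.
    """
    if not 0.0 < f25 < 1.0:
        raise ValueError("f25 must lie strictly between 0 and 1")
    return (1.0 - f25) / (e**2 * f25)

def required_plots(f25, trees_per_plot, e=0.05):
    """Plots needed: required trees divided by the observed trees per plot (T / R)."""
    return required_trees(f25, e) / trees_per_plot

# Belarus row of Table 7.3: F25 = 8 %, T / R = 23.65
t_belarus = required_trees(0.08)
r_belarus = required_plots(0.08, 23.65)
print(round(t_belarus), round(r_belarus))
```

For these inputs the sketch gives about 4600 trees and 195 plots, in line with the tabulated values. Small differences for other countries are to be expected, since the tabulated F̂25 values are rounded.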
7.6 AGGREGATING COUNTRY ESTIMATES AT THE EUROPEAN LEVEL

Two statistical strategies are proposed when combining independent country estimates for the assessment of forest condition at the European level: the first one is solely based on information acquired from FCM networks (Travaglini et al., 2012); the second one takes into consideration the potential outcome from FCM and NFIs.
TABLE 7.3 Sampling Effort of ICP Forests Network in Terms of Plots (R) and Trees (T) Performed in 2008 Surveys Compared with Sampling Effort (R0.05, T0.05) Required for Estimating the Proportion of Defoliated Trees Greater than 25% (F̂25) with a Percentage Standard Error of 5%

| Country | T | R | T/R | F̂25 (%) | T0.05 | R0.05 |
|---|---|---|---|---|---|---|
| Andorra^a | 72 | 3 | 24.00 | 15.3 | 2215 | 92 |
| Belarus | 9460 | 400 | 23.65 | 8 | 4600 | 195 |
| Belgium | 2860 | 121 | 23.64 | 14.5 | 2359 | 100 |
| Bulgaria | 4531 | 136 | 33.32 | 31.9 | 854 | 24 |
| Croatia | 2039 | 85 | 23.99 | 23.9 | 1273 | 53 |
| Cyprus^a | 360 | 15 | 24.00 | 46.9 | 452 | 19 |
| Czech Rep. | 5477 | 136 | 40.27 | 56.7 | 306 | 8 |
| Denmark^a | 452 | 19 | 23.79 | 9.1 | 3996 | 168 |
| Estonia^a | 2196 | 92 | 23.87 | 9 | 4045 | 170 |
| Finland | 8819 | 475 | 18.57 | 10.2 | 3522 | 190 |
| France | 10,138 | 508 | 19.96 | 32.4 | 835 | 42 |
| Germany | 10,347 | 423 | 24.46 | 25.7 | 1157 | 43 |
| Ireland^a | 679 | 31 | 21.90 | 10 | 3600 | 165 |
| Italy | 6579 | 236 | 27.88 | 32.8 | 820 | 30 |
| Latvia | 8090 | 342 | 23.65 | 15.3 | 2215 | 94 |
| Lithuania | 7539 | 1342 | 5.62 | 19.6 | 1641 | 292 |
| Rep. Moldova | 9841 | 528 | 18.64 | 33.6 | 791 | 43 |
| Norway | 9495 | 1720 | 5.52 | 22.7 | 1363 | 247 |
| Poland | 39,320 | 1916 | 20.52 | 18 | 1823 | 89 |
| Serbia^a | 2789 | 130 | 21.45 | 11.5 | 3079 | 144 |
| Slovak Rep. | 4083 | 108 | 37.81 | 29.3 | 966 | 26 |
| Slovenia | 1056 | 44 | 24.00 | 37 | 742 | 31 |
| Spain | 14,880 | 620 | 24.00 | 15.6 | 2165 | 90 |
| Sweden | 6890 | 3464 | 1.99 | 17.3 | 1913 | 961 |
| Switzerland^a | 1008 | 48 | 21.00 | 19 | 1706 | 82 |
| Turkey | 8978 | 398 | 22.56 | 24.6 | 1227 | 55 |
| Ukraine | 33,986 | 1465 | 23.20 | 8.2 | 4479 | 194 |

^a Denotes countries that would need to increase plot numbers. Modified from Travaglini et al. (2012).
7.6.1 Combining FCM Estimates
Suppose a homogeneous definition of forest among the $L$ countries participating in the forest monitoring network and denote by $\hat{\mathbf{N}}_1, \ldots, \hat{\mathbf{N}}_L$ the estimates of their abundance vectors achieved by means of separate, independent surveys performed in each country by means of plot sampling with sites selected in accordance with PSS. Since each $\hat{\mathbf{N}}_l$ constitutes an unbiased, consistent, and asymptotically normal estimator of $\mathbf{N}_l$ with a variance–covariance matrix which can be conservatively estimated by $\mathbf{V}_{N,l} = \mathbf{S}_l/R_l$, where $\mathbf{S}_l$ is the empirical variance–covariance matrix of the $R_l$ estimates for the $l$-th country and $R_l$ is the number of sites adopted in the country, then the sum $\hat{\mathbf{N}}_E = \hat{\mathbf{N}}_1 + \cdots + \hat{\mathbf{N}}_L$ is an unbiased and consistent ($R_1, \ldots, R_L \to \infty$) estimator of $\mathbf{N}_E$ with a variance–covariance matrix which (owing to the independence of the $L$ estimates) is conservatively estimated by

$$\mathbf{V}_{N,E} = \frac{\mathbf{V}_{N,1}}{R_1} + \cdots + \frac{\mathbf{V}_{N,L}}{R_L} \tag{7.18}$$
Moreover, if the $R_l$s are supposed to increase with constant ratios $R_l/R_h$, then as $R_1, \ldots, R_L \to \infty$, $\hat{\mathbf{N}}_E$ is an asymptotically normal estimator of $\mathbf{N}_E$.

The properties of $\hat{\mathbf{N}}_E$ (unbiasedness, consistency, and asymptotic normality) once again allow for the application of the enlarged version of the Delta method (Shao and Tu, 1995, p. 448). Thus, $\hat{\mathbf{P}}_E = \hat{\mathbf{N}}_E/(\mathbf{1}^T\hat{\mathbf{N}}_E)$ constitutes a consistent and asymptotically normal estimator for $\mathbf{P}_E$, while $\mathbf{V}_{P,E} = (\mathbf{I} - \hat{\mathbf{P}}_E\mathbf{1}^T)\mathbf{V}_{N,E}(\mathbf{I} - \mathbf{1}\hat{\mathbf{P}}_E^T)$ constitutes an asymptotically conservative estimator of the variance–covariance matrix. Finally, as to C-parameters at the European level, $\hat{C}_E = \mathbf{c}^T\hat{\mathbf{P}}_E$ constitutes a consistent and asymptotically normal estimator for $C_E$, while $V_{C,E}^2 = \mathbf{c}^T\mathbf{V}_{P,E}\mathbf{c}$ constitutes an asymptotically conservative estimator for the variance. The estimate of the percentage standard error for $\hat{C}_E$ is given by $100(V_{C,E}/\hat{C}_E)\%$, while the confidence interval with asymptotic coverage of 0.95 is given by $\hat{C}_E \pm 2V_{C,E}$.

As to the inference on change, denote by $\hat{\mathbf{N}}_{l,t}$ the plot sampling estimators of $\mathbf{N}_{l,t}$ ($t = 1, 2$). Hence, $\hat{\mathbf{N}}_{l,t}$ is unbiased, consistent, and asymptotically normal, while $\mathbf{V}_{N,l,t} = \mathbf{S}_{l,t}/R_l$ is a conservative estimator of the variance–covariance matrix of $\hat{\mathbf{N}}_{l,t}$ and $\mathbf{C}_l = \mathbf{S}_{l,1,2}/R_l$ is the estimator of the covariance matrix of $\hat{\mathbf{N}}_{l,1}$ and $\hat{\mathbf{N}}_{l,2}$, where $\mathbf{S}_{l,t}$ is the empirical variance–covariance matrix at period $t$ and $\mathbf{S}_{l,1,2}$ is the empirical covariance matrix between periods 1 and 2. Accordingly, from the previous results of this section, $\hat{\mathbf{N}}_{E,t} = \hat{\mathbf{N}}_{1,t} + \cdots + \hat{\mathbf{N}}_{L,t}$ is an unbiased, consistent, and asymptotically normal estimator of $\mathbf{N}_{E,t}$ with a variance–covariance matrix which can be conservatively estimated by

$$\mathbf{V}_{N,E,t} = \frac{\mathbf{V}_{N,1,t}}{R_1} + \cdots + \frac{\mathbf{V}_{N,L,t}}{R_L} \tag{7.19}$$
Moreover, since correlation exists only among estimators achieved in the same country at different times, the covariance matrix of $\hat{\mathbf{N}}_{E,1}$ and $\hat{\mathbf{N}}_{E,2}$ can be estimated by

$$\mathbf{C}_{N,E} = \frac{\mathbf{C}_{N,1}}{R_1} + \cdots + \frac{\mathbf{C}_{N,L}}{R_L} \tag{7.20}$$
From the enlarged version of the Delta method, the relative abundance vector estimator $\hat{\mathbf{P}}_{E,t} = \hat{\mathbf{N}}_{E,t}/(\mathbf{1}^T\hat{\mathbf{N}}_{E,t})$ is a consistent and asymptotically normal estimator of $\mathbf{P}_{E,t}$. Moreover, $\mathbf{V}_{P,E,t} = (\mathbf{I} - \hat{\mathbf{P}}_{E,t}\mathbf{1}^T)\mathbf{V}_{N,E,t}(\mathbf{I} - \mathbf{1}\hat{\mathbf{P}}_{E,t}^T)$ is an asymptotically conservative estimator of the variance–covariance matrix of $\hat{\mathbf{P}}_{E,t}$, while $\mathbf{C}_{P,E} = (\mathbf{I} - \hat{\mathbf{P}}_{E,1}\mathbf{1}^T)\mathbf{C}_{N,E}(\mathbf{I} - \mathbf{1}\hat{\mathbf{P}}_{E,2}^T)$ is an estimator for the covariance matrix of $\hat{\mathbf{P}}_{E,1}$ and $\hat{\mathbf{P}}_{E,2}$. From these last results, the difference $\hat{\mathbf{P}}_{E,2} - \hat{\mathbf{P}}_{E,1}$ turns out to be a consistent and asymptotically normal estimator of $\mathbf{P}_{E,2} - \mathbf{P}_{E,1}$, while $\mathbf{V}_{P,E,1} + \mathbf{V}_{P,E,2} - \mathbf{C}_{P,E} - \mathbf{C}_{P,E}^T$ is an asymptotically conservative estimator of the variance–covariance matrix. Hence, the difference estimator $\hat{D}_E = \hat{C}_{E,2} - \hat{C}_{E,1} = \mathbf{c}^T(\hat{\mathbf{P}}_{E,2} - \hat{\mathbf{P}}_{E,1})$ is a consistent and asymptotically normal estimator of $D_E$ with variance which can be conservatively estimated by

$$V_{D,E}^2 = \mathbf{c}^T\left(\mathbf{V}_{P,E,1} + \mathbf{V}_{P,E,2} - \mathbf{C}_{P,E} - \mathbf{C}_{P,E}^T\right)\mathbf{c} \tag{7.21}$$

Once again, owing to the asymptotic unbiasedness and normality of $\hat{D}_E$, as well as the conservative nature of $V_{D,E}^2$, the p-value, the power, and the MDC adopted for inference on change can be computed via expressions (7.8), (7.9), and (7.11).

As to the number of sites to be selected within each country, it is worth noting that the accuracy of estimates concerning small countries and regions, infrequent tree species, and their combination may be strongly impacted if a unique density of sampling sites is adopted all over Europe. This problem has been investigated in Switzerland by Köhl and Kaufmann (1993) for the estimation of mean defoliation (or transparency) and by Köhl et al. (1994) for the estimation of proportions. Köhl et al. (1994, p. 217) conclude that “the results clearly indicate the decreasing reliability of the results as the grid density is decreased from 4 × 4 to 16 × 16 km. However, from a practical point of view, the results obtained from the 4 × 4 and 8 × 8-km grid for the whole Switzerland are similar. However, further reductions result in a sharp increase in variability between grids, suggesting that neither the 12 × 12-km nor the 16 × 16-km grids would provide reliable data for Switzerland.” Thus, when designing a European network, it is important to be aware that a sampling grid able to provide precise estimates at the European level and for the most frequent tree species may not be suited for individual countries and/or less frequent tree species.
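The combination of independent country estimates into a European status estimate can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name, the two-country toy inputs with three defoliation classes, and the vector `c` are hypothetical. Two assumptions are made explicit in the comments: the per-country inputs `V_Ns` are taken to be variance estimates of the country estimators (that is, already of the form S_l/R_l) and are simply added, as the independence of the surveys implies; and the Jacobian of the ratio keeps its 1/(1ᵀN) scale factor.

```python
import numpy as np

def combine_countries(N_hats, V_Ns, c):
    """Combine independent country estimates into a European C-parameter.

    N_hats: list of length-K abundance estimates, one per country.
    V_Ns:   list of K x K variance estimates of those estimators
            (assumed here to be already divided by the number of sites).
    """
    N_E = np.sum(N_hats, axis=0)          # sum of the country abundance vectors
    V_NE = np.sum(V_Ns, axis=0)           # independence: variances add up
    total = N_E.sum()
    P_E = N_E / total
    K = P_E.size
    # Assumption of this sketch: Jacobian of N / (1^T N), scale factor included
    J = (np.eye(K) - np.outer(P_E, np.ones(K))) / total
    V_PE = J @ V_NE @ J.T
    C_E = c @ P_E
    se = np.sqrt(c @ V_PE @ c)
    return P_E, C_E, se, (C_E - 2 * se, C_E + 2 * se)

# Hypothetical two-country example with three defoliation classes
N_hats = [np.array([600.0, 300.0, 100.0]), np.array([1500.0, 400.0, 100.0])]
V_Ns = [np.diag([400.0, 225.0, 100.0]), np.diag([900.0, 400.0, 100.0])]
c = np.array([0.0, 0.0, 1.0])             # illustrative: share of the last class

P_E, C_E, se_E, ci_E = combine_countries(N_hats, V_Ns, c)
pct_se = 100 * se_E / C_E                 # percentage standard error
```

For inference on change, the same function could be applied at each period, with the covariance term built analogously from the per-country covariance estimates.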
7.6.2 Coupling FCM and NFI Estimates Across Europe

The aggregation at the European level of the outcome from FCM and NFIs can provide an alternative estimation of C-parameters at the European level with respect to the methodology proposed in the previous section, which is instead completely based on the information acquired from FCM. Indeed, the relative abundance vector at the European level $\mathbf{P}_E = \mathbf{N}_E/(\mathbf{1}^T\mathbf{N}_E)$ can be rewritten as

$$\mathbf{P}_E = \sum_{l=1}^{L} w_l \mathbf{P}_l \tag{7.22}$$
where $w_l = N_l/N_E$ denotes the proportion of forest trees in the $l$-th country and $N_E = N_1 + \cdots + N_L$ denotes the total number of forest trees in Europe. Accordingly, while each $\mathbf{P}_l$ can be estimated by $\hat{\mathbf{P}}_l = \hat{\mathbf{N}}_l/(\mathbf{1}^T\hat{\mathbf{N}}_l)$ from FCM surveys, the $w_l$ weights can be estimated by using the information arising from NFIs. As NFIs are usually performed by intensive surveys, the resulting estimators of the $N_l$s are likely to be more accurate than those arising from FCM surveys. Thus, if $\tilde{N}_1, \ldots, \tilde{N}_L$ denote the NFI estimates of $N_1, \ldots, N_L$ in such a way that the $w_l$s can be trivially estimated by $\tilde{w}_l = \tilde{N}_l/\tilde{N}_E$ where $\tilde{N}_E = \tilde{N}_1 + \cdots + \tilde{N}_L$, then $\mathbf{P}_E$ can be estimated by

$$\tilde{\mathbf{P}}_E = \sum_{l=1}^{L} \tilde{w}_l \hat{\mathbf{P}}_l \tag{7.23}$$
In order to derive the statistical properties of $\tilde{\mathbf{P}}_E$, the statistical properties of each $\tilde{N}_l$ are needed. Usually, forest inventories are multiphase surveys adopting unbiased (or approximately unbiased) estimators of the interest parameters as well as unbiased or conservative estimators of the sampling variances. Accordingly, suppose $\mathrm{E}_l(\tilde{N}_l) \approx N_l$ with variance $\mathrm{Var}_l(\tilde{N}_l)$ which can be unbiasedly or conservatively estimated by $\tilde{V}_l^2$, where $\mathrm{E}_l$ and $\mathrm{Var}_l$ denote expectation and variance with respect to the sampling scheme adopted in the NFI of the $l$-th country. Moreover, since the $L$ estimates $\tilde{N}_1, \ldots, \tilde{N}_L$ are obtained by means of separate surveys, they are independent of each other as well as independent of $\hat{\mathbf{P}}_1, \ldots, \hat{\mathbf{P}}_L$.

In accordance with these considerations, the weight vector estimator $\tilde{\mathbf{w}} = [\tilde{w}_1, \ldots, \tilde{w}_L]^T$ is approximately unbiased with a variance–covariance matrix which can be approximated up to the first term by $\mathrm{Var}(\tilde{\mathbf{w}}) \approx (\mathbf{I} - \mathbf{w}\mathbf{1}^T)\mathbf{D}(\mathbf{I} - \mathbf{1}\mathbf{w}^T)$, where $\mathbf{w} = [w_1, \ldots, w_L]^T$ is the vector of true weights and $\mathbf{D} = \mathrm{diag}\{\mathrm{Var}_1(\tilde{N}_1), \ldots, \mathrm{Var}_L(\tilde{N}_L)\}$ is the diagonal matrix having the variances of the $\tilde{N}_l$s as diagonal elements. Thus, an obvious estimator for $\mathrm{Var}(\tilde{\mathbf{w}})$ is given by $\tilde{\mathbf{V}}_w = (\mathbf{I} - \tilde{\mathbf{w}}\mathbf{1}^T)\tilde{\mathbf{D}}(\mathbf{I} - \mathbf{1}\tilde{\mathbf{w}}^T)$, where $\tilde{\mathbf{D}} = \mathrm{diag}(\tilde{V}_1^2, \ldots, \tilde{V}_L^2)$. Moreover, $\tilde{\mathbf{P}}_E$ is approximately unbiased, while generalizing the result of Goodman (1960) on the variance of products of independent random variables to the variance–covariance matrix and the covariance matrix of scalar products of random variables with independent random vectors, the variance–covariance matrix of $\tilde{\mathbf{P}}_E$ can be approximated by

$$\mathrm{Var}(\tilde{\mathbf{P}}_E) \approx \sum_{l=1}^{L} \mathrm{Var}(\tilde{w}_l)\,\mathrm{Var}_{PSS}(\hat{\mathbf{P}}_l) + \sum_{l=1}^{L} \mathrm{Var}(\tilde{w}_l)\,\mathbf{P}_l\mathbf{P}_l^T + \sum_{l=1}^{L} w_l^2\,\mathrm{Var}_{PSS}(\hat{\mathbf{P}}_l) + \sum_{h>l=1}^{L} \mathrm{Cov}(\tilde{w}_l, \tilde{w}_h)\left(\mathbf{P}_l\mathbf{P}_h^T + \mathbf{P}_h\mathbf{P}_l^T\right) \tag{7.24}$$
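where $\mathrm{Var}_{PSS}$ denotes variances and covariances with respect to PSS selection of sites onto the $l$-th country. Thus, $\mathrm{Var}(\tilde{\mathbf{P}}_E)$ can be estimated by

$$\tilde{\mathbf{V}}_{P,E} = \sum_{l=1}^{L} \tilde{V}_{w,l}^2\,\mathbf{V}_{P,l} + \sum_{l=1}^{L} \tilde{V}_{w,l}^2\,\hat{\mathbf{P}}_l\hat{\mathbf{P}}_l^T + \sum_{l=1}^{L} \tilde{w}_l^2\,\mathbf{V}_{P,l} + \sum_{h>l=1}^{L} \tilde{V}_{w,l,h}\left(\hat{\mathbf{P}}_l\hat{\mathbf{P}}_h^T + \hat{\mathbf{P}}_h\hat{\mathbf{P}}_l^T\right) \tag{7.25}$$

where $\mathbf{V}_{P,l} = (\mathbf{I} - \hat{\mathbf{P}}_l\mathbf{1}^T)\mathbf{V}_{N,l}(\mathbf{I} - \mathbf{1}\hat{\mathbf{P}}_l^T)$ and $\tilde{V}_{w,l}^2$ and $\tilde{V}_{w,l,h}$ are the $l,l$ and $l,h$ elements of $\tilde{\mathbf{V}}_w$. Finally, a C-parameter at the European level can be estimated by $\tilde{C}_E = \mathbf{c}^T\tilde{\mathbf{P}}_E$, where $\tilde{C}_E$ constitutes an approximately unbiased estimator for $C_E$ with variance which can be estimated by $\tilde{V}_{C,E}^2 = \mathbf{c}^T\tilde{\mathbf{V}}_{P,E}\mathbf{c}$. The estimate of the percentage standard error for $\tilde{C}_E$ is given by $100(\tilde{V}_{C,E}/\tilde{C}_E)\%$. Since nothing ensures that the multiphase estimators arising from NFIs are normally distributed, nothing ensures the normality of $\tilde{\mathbf{P}}_E$ as well as the subsequent normality of $\tilde{C}_E$. If normality is (as customary) assumed, then the confidence interval with approximate coverage of 0.95 is given by $\tilde{C}_E \pm 2\tilde{V}_{C,E}$.

As to the inference on change, by obvious notation we can write $\mathbf{P}_{E,t} = \sum_{l=1}^{L} w_{l,t}\mathbf{P}_{l,t}$ in such a way that $\tilde{\mathbf{P}}_{E,t} = \sum_{l=1}^{L} \tilde{w}_{l,t}\hat{\mathbf{P}}_{l,t}$ constitutes the estimator of $\mathbf{P}_{E,t}$ ($t = 1, 2$). Accordingly, the variance–covariance matrix of $\tilde{\mathbf{P}}_{E,t}$ is estimated by

$$\tilde{\mathbf{V}}_{P,E,t} = \sum_{l=1}^{L} \tilde{V}_{w,l,t}^2\,\mathbf{V}_{P,l,t} + \sum_{l=1}^{L} \tilde{V}_{w,l,t}^2\,\hat{\mathbf{P}}_{l,t}\hat{\mathbf{P}}_{l,t}^T + \sum_{l=1}^{L} \tilde{w}_{l,t}^2\,\mathbf{V}_{P,l,t} + \sum_{h>l=1}^{L} \tilde{V}_{w,l,h,t}\left(\hat{\mathbf{P}}_{l,t}\hat{\mathbf{P}}_{h,t}^T + \hat{\mathbf{P}}_{h,t}\hat{\mathbf{P}}_{l,t}^T\right) \tag{7.26}$$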
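where $\mathbf{V}_{P,l,t} = (\mathbf{I} - \hat{\mathbf{P}}_{l,t}\mathbf{1}^T)\mathbf{V}_{N,l,t}(\mathbf{I} - \mathbf{1}\hat{\mathbf{P}}_{l,t}^T)$ and $\tilde{V}_{w,l,t}^2$ and $\tilde{V}_{w,l,h,t}$ are the $l,l$ and $l,h$ elements of $\tilde{\mathbf{V}}_{w,t} = (\mathbf{I} - \tilde{\mathbf{w}}_t\mathbf{1}^T)\tilde{\mathbf{D}}_t(\mathbf{I} - \mathbf{1}\tilde{\mathbf{w}}_t^T)$. Moreover, generalizing once again the results of Goodman (1960)

$$\begin{aligned}
\mathrm{Cov}(\tilde{\mathbf{P}}_{E,1}, \tilde{\mathbf{P}}_{E,2}) &= \sum_{l=1}^{L} \mathrm{Cov}(\tilde{w}_{l,1}\hat{\mathbf{P}}_{l,1}, \tilde{w}_{l,2}\hat{\mathbf{P}}_{l,2}) + \sum_{h \neq l=1}^{L} \mathrm{Cov}(\tilde{w}_{l,1}\hat{\mathbf{P}}_{l,1}, \tilde{w}_{h,2}\hat{\mathbf{P}}_{h,2}) \\
&\approx \sum_{l=1}^{L} \mathrm{Cov}(\tilde{w}_{l,1}, \tilde{w}_{l,2})\,\mathrm{Cov}_{PSS}(\hat{\mathbf{P}}_{l,1}, \hat{\mathbf{P}}_{l,2}) + \sum_{l=1}^{L} \mathrm{Cov}(\tilde{w}_{l,1}, \tilde{w}_{l,2})\,\mathbf{P}_{l,1}\mathbf{P}_{l,2}^T \\
&\quad + \sum_{l=1}^{L} w_{l,1}w_{l,2}\,\mathrm{Cov}_{PSS}(\hat{\mathbf{P}}_{l,1}, \hat{\mathbf{P}}_{l,2}) + \sum_{h \neq l=1}^{L} \mathrm{Cov}(\tilde{w}_{l,1}, \tilde{w}_{h,2})\,\mathbf{P}_{l,1}\mathbf{P}_{h,2}^T
\end{aligned} \tag{7.27}$$

where $\mathrm{Cov}_{PSS}$ denotes covariances with respect to PSS performed in the $l$-th country, $\mathrm{Cov}(\tilde{\mathbf{w}}_1, \tilde{\mathbf{w}}_2) = (\mathbf{I} - \mathbf{w}_1\mathbf{1}^T)\boldsymbol{\Gamma}(\mathbf{I} - \mathbf{1}\mathbf{w}_2^T)$ is the covariance matrix between $\tilde{\mathbf{w}}_1$ and $\tilde{\mathbf{w}}_2$, and $\boldsymbol{\Gamma} = \mathrm{diag}\{\mathrm{Cov}_1(\tilde{N}_{1,1}, \tilde{N}_{1,2}), \ldots, \mathrm{Cov}_L(\tilde{N}_{L,1}, \tilde{N}_{L,2})\}$ is the diagonal matrix having the covariances of the estimators $\tilde{N}_{l,1}$ and $\tilde{N}_{l,2}$ as diagonal elements. If estimates of these covariances, say $\tilde{C}_l$, are available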
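from each NFI, then $\mathrm{Cov}(\tilde{\mathbf{w}}_1, \tilde{\mathbf{w}}_2)$ can be estimated by $\tilde{\mathbf{C}}_w = (\mathbf{I} - \tilde{\mathbf{w}}_1\mathbf{1}^T)\tilde{\boldsymbol{\Gamma}}(\mathbf{I} - \mathbf{1}\tilde{\mathbf{w}}_2^T)$, where $\tilde{\boldsymbol{\Gamma}} = \mathrm{diag}(\tilde{C}_1, \ldots, \tilde{C}_L)$, in such a way that

$$\tilde{\mathbf{C}}_{P,E} = \sum_{l=1}^{L} \tilde{C}_{w,l}\,\mathbf{C}_{P,l} + \sum_{l=1}^{L} \tilde{C}_{w,l}\,\hat{\mathbf{P}}_{l,1}\hat{\mathbf{P}}_{l,2}^T + \sum_{l=1}^{L} \tilde{w}_{l,1}\tilde{w}_{l,2}\,\mathbf{C}_{P,l} + \sum_{h \neq l=1}^{L} \tilde{C}_{w,l,h}\,\hat{\mathbf{P}}_{l,1}\hat{\mathbf{P}}_{h,2}^T \tag{7.28}$$

constitutes an estimate of $\mathrm{Cov}(\tilde{\mathbf{P}}_{E,1}, \tilde{\mathbf{P}}_{E,2})$, where $\mathbf{C}_{P,l}$ is the estimate of $\mathrm{Cov}_{PSS}(\hat{\mathbf{P}}_{l,1}, \hat{\mathbf{P}}_{l,2})$ while $\tilde{C}_{w,l}$ and $\tilde{C}_{w,l,h}$ are the $l,l$ and $l,h$ elements of $\tilde{\mathbf{C}}_w$. From these last results, the difference $\tilde{\mathbf{P}}_{E,2} - \tilde{\mathbf{P}}_{E,1}$ turns out to be an approximately unbiased estimator of $\mathbf{P}_{E,2} - \mathbf{P}_{E,1}$, with a variance–covariance matrix which can be estimated by $\tilde{\mathbf{V}}_{P,E,1} + \tilde{\mathbf{V}}_{P,E,2} - \tilde{\mathbf{C}}_{P,E} - \tilde{\mathbf{C}}_{P,E}^T$. Hence, the difference estimator $\tilde{D}_E = \tilde{C}_{E,2} - \tilde{C}_{E,1} = \mathbf{c}^T(\tilde{\mathbf{P}}_{E,2} - \tilde{\mathbf{P}}_{E,1})$ is an approximately unbiased estimator of $D_E$ with variance which can be estimated by $\tilde{V}_{D,E}^2 = \mathbf{c}^T(\tilde{\mathbf{V}}_{P,E,1} + \tilde{\mathbf{V}}_{P,E,2} - \tilde{\mathbf{C}}_{P,E} - \tilde{\mathbf{C}}_{P,E}^T)\mathbf{c}$. If the normality of $\tilde{D}_E$ is presumed, the p-value, the power, and the MDC adopted for inference on change can be computed via expressions (7.8), (7.9), and (7.11).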
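The status part of the coupled strategy, that is, the NFI-based weights, the weighted relative abundance vector, and a variance estimate with the structure of (7.25), can be sketched as below. The sketch rests on several assumptions that are not taken from the chapter: the function names, the two-country inputs, the multinomial-type matrices used as stand-ins for the V_{P,l}, and the vector `c` are all hypothetical, and the Jacobian of the weights keeps its 1/Ñ_E scale factor explicitly.

```python
import numpy as np

def weight_estimates(N_nfi, var_nfi):
    """NFI-based weights w_l = N_l / N_E and a first-order variance matrix."""
    N_nfi = np.asarray(N_nfi, dtype=float)
    total = N_nfi.sum()
    w = N_nfi / total
    L = w.size
    # Assumption of this sketch: Jacobian of N / (1^T N), scale factor included
    J = (np.eye(L) - np.outer(w, np.ones(L))) / total
    V_w = J @ np.diag(np.asarray(var_nfi, dtype=float)) @ J.T
    return w, V_w

def coupled_estimate(w, V_w, P_hats, V_Ps, c):
    """Weighted estimate of P_E and a Goodman-type variance for c^T P_E."""
    P_hats = np.asarray(P_hats, dtype=float)     # shape (L, K)
    L, K = P_hats.shape
    P_E = w @ P_hats
    V = np.zeros((K, K))
    for l in range(L):
        PPt = np.outer(P_hats[l], P_hats[l])
        V += V_w[l, l] * V_Ps[l] + V_w[l, l] * PPt + w[l] ** 2 * V_Ps[l]
        for h in range(l + 1, L):
            cross = np.outer(P_hats[l], P_hats[h])
            V += V_w[l, h] * (cross + cross.T)
    return P_E, c @ P_E, c @ V @ c

# Hypothetical inputs: two countries, three defoliation classes
N_nfi = [3.0e9, 1.0e9]                    # NFI estimates of total tree numbers
var_nfi = [(3.0e7) ** 2, (2.0e7) ** 2]    # their estimated sampling variances
P_hats = np.array([[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]])
# Stand-in variance matrices for the FCM proportions (multinomial form, 500 trees)
V_Ps = [(np.diag(p) - np.outer(p, p)) / 500.0 for p in P_hats]
c = np.array([0.0, 0.0, 1.0])

w, V_w = weight_estimates(N_nfi, var_nfi)
P_E, C_E, var_C = coupled_estimate(w, V_w, P_hats, V_Ps, c)
```

When the NFI variances are set to zero the weight variance matrix vanishes, and the variance estimate reduces to the sum of the squared weights times the V_{P,l}, which gives a quick sanity check of the implementation.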
7.7 CONCLUSIONS
The work presented here gives an overview of the current status of forest condition assessments in Europe from a statistical point of view and takes into account the implications of different sampling designs. It presents an approach to quantify and improve the accuracy of defoliation assessments and aggregated evaluations. The proposal aims to (i) promote plot-based sampling as a unifying sampling framework for forest condition assessments, (ii) introduce concrete sampling objectives in terms of status and change detection, and (iii) improve forest condition assessment at the European scale by using country estimates. The approach relies on the assumption that a common definition of forest is applied across all European countries. The proposal adopts a probabilistic sampling scheme based on fixed-area plots selected over the target region by means of systematic or stratified schemes. Statistical estimators at the European level are based on two alternative strategies: the combination of FCM estimates or the aggregation of FCM and NFI estimates (Figure 7.5). Aggregation of FCM and NFI estimates may improve the results by using the larger number of NFI plots as a basis to upscale the defoliation results from a smaller number of FCM plots, which are assessed with a much higher temporal frequency. Under this framework, some operative guidelines can be provided to adapt the current structure of the ICP Forests monitoring system to the proposed sampling strategy:

• the shift from a fixed number of trees selected on a site to a fixed-area plot centered at the site forms the basis for the approach presented;
Assumptions: Common definition of forest among L countries

Objectives: (1) Quantitative estimate of the parameter of concern for the target statistical population at specified level of accuracy for each country and at European level; (2) Quantitative estimate of change of the parameter of concern at both country and European levels, with statistical assessment of the hypothesis of no change

Defining parameters of concern: For example, defoliation level, ranging from 0 to 100%

Defining accuracy standards:
For status assessment: for example percent error