Web-based Supplementary Materials for
Halton Iterative Partitioning: spatially balanced sampling via partitioning

by B. L. Robertson¹,*, T. McDonald², C. J. Price¹ and J. A. Brown¹

¹ School of Mathematics and Statistics, University of Canterbury, Private Bag 4800, Christchurch, New Zealand
² Western EcoSystems Technology, Inc., Cheyenne, Wyoming 82001, USA
* email: [email protected]
Table 1: The number of boxes, $B = 2^{J_1} 3^{J_2}$, that were considered to partition a point resource under HIP sampling.

B         6     12     24     36     72
J1        1      2      3      2      3
J2        1      1      1      2      2

B       144    216    288    432    864
J1        4      3      5      4      5
J2        2      3      2      3      3

B      1728   2592   3456   5184   7776
J1        6      5      7      6      5
J2        3      4      3      4      5

B     10368  20736  31104  62208
J1        7      8      7      8
J2        4      4      5      5
Permutation of Halton Indices
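The Halton index $k$ (mod $B = 2^{J_1} 3^{J_2}$) for a particular Halton box

\[ [\ell_1, u_1) \times [\ell_2, u_2) \tag{1} \]

in $[0, 1)^2$ is obtained by solving the system of congruences

\[ k \equiv a_1 \pmod{2^{J_1}}, \qquad k \equiv a_2 \pmod{3^{J_2}}, \tag{2} \]

where, with bases $b_1 = 2$ and $b_2 = 3$,

\[ a_i = \sum_{j=1}^{J_i} \left\{ \lfloor \ell_i b_i^{j} \rfloor \bmod b_i \right\} b_i^{j-1}. \]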
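As a minimal illustration of (2), the Python sketch below computes the digit sums $a_i$ and solves the two congruences with the Chinese remainder theorem. The names `digit_sum` and `halton_index` are hypothetical, the lower box corners are passed as exact fractions so that the floor is not affected by floating-point rounding, and `pow(m1, -1, m2)` assumes Python 3.8 or later.

```python
from fractions import Fraction
from math import floor


def digit_sum(lower, base, J):
    """a_i = sum_{j=1}^{J} (floor(lower * base**j) mod base) * base**(j-1)."""
    return sum((floor(lower * base**j) % base) * base**(j - 1)
               for j in range(1, J + 1))


def halton_index(l1, l2, J1, J2):
    """Halton index k (mod 2**J1 * 3**J2) of the box with lower corner (l1, l2)."""
    m1, m2 = 2**J1, 3**J2
    a1 = digit_sum(l1, 2, J1)
    a2 = digit_sum(l2, 3, J2)
    # Chinese remainder theorem: write k = a1 + m1 * t and choose t so that
    # k is congruent to a2 modulo m2 (m1 and m2 are coprime).
    t = ((a2 - a1) * pow(m1, -1, m2)) % m2
    return a1 + m1 * t


# Box [1/4, 1/2) x [1/3, 4/9) with B = 36 (J1 = J2 = 2)
print(halton_index(Fraction(1, 4), Fraction(1, 3), 2, 2))  # 10
```

Looping over all 36 lower corners in this way should assign each of the indices 0 to 35 to exactly one box.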
For example, the mod 36 index for Halton box $[1/4, 1/2) \times [1/3, 4/9)$ is 10, because $a_1 = 2$ and $a_2 = 1$ for this box (see the first entry of Table 2 for all the mod 36 indices).

The Halton indices are permuted by changing the $a_i \in \{0, 1, \ldots, b_i^{J_i} - 1\}$ values in (2). First, the values $\{0, 1, \ldots, b_i^{J_i} - 1\}$ are permuted for each base using Algorithm 1 to give $P_1$ and $P_2$. The $a_i$ values for box (1) are replaced with

\[ a_1 = P_1^{(2^{J_1} \ell_1 + 1)} \quad \text{and} \quad a_2 = P_2^{(3^{J_2} \ell_2 + 1)}, \]
where $P^{(i)}$ denotes the $i$th element in $P$. The system of congruences (2) is then solved to give the permuted Halton index $k$ (mod $B$) for box (1). Consider, for example, $B = 36$ and the mod $B$ Halton box $[1/4, 1/2) \times [1/3, 4/9)$. If

\[ P_1 = \{0, 2, 1, 3\} \quad \text{and} \quad P_2 = \{7, 4, 1, 2, 8, 5, 6, 0, 3\}, \tag{3} \]

the box's permuted Halton index $k$ is a solution to

\[ k \equiv 2 \pmod{4}, \qquad k \equiv 2 \pmod{9}. \]

Hence, the permuted index for this box is $k = 2$ (mod 36) (the original index was 10). Two different permutations of this form are given in rows two and three of Table 2, where the first permutation uses (3) and the second uses

\[ P_1 = \{1, 3, 2, 0\} \quad \text{and} \quad P_2 = \{0, 3, 6, 5, 2, 8, 7, 1, 4\}. \]

Algorithm 1: Permutation of $\{0, 1, \ldots, b_i^{J_i} - 1\}$, where $J_i > 1$. Here $\sigma_b$ is a row vector containing a random permutation of $\{0, 1, \ldots, b_i - 1\}$ and $I_j$ is a row vector containing $b_i$ copies of the $j$th element in $I$.

    $I = \sigma_b$
    for $k = 1$ to $J_i - 1$ do
        $v = (\,)$
        for $j = 1$ to $b_i^k$ do
            $v \leftarrow (v,\; I_j + b_i^k \sigma_b)$
        end for
        $I \leftarrow v$
    end for
    Output $P_i = I$.
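Algorithm 1 and the replacement rule for $a_1$ and $a_2$ can be sketched in Python as follows. This is an illustrative reading, not the authors' code: `permute_levels` and `permuted_index` are hypothetical names, lists are 0-based (so $P^{(i)}$ is `P[i - 1]`), and $\sigma_b$ is assumed to be redrawn each time it is used. That assumption is consistent with the example permutations above; for instance, $P_1 = \{1, 3, 2, 0\}$ cannot be produced from a single fixed draw of $\sigma_b$.

```python
import random
from fractions import Fraction


def permute_levels(base, J, rng):
    """Sketch of Algorithm 1: nested permutation of {0, ..., base**J - 1}."""
    I = rng.sample(range(base), base)             # I = sigma_b
    for k in range(1, J):                         # k = 1, ..., J - 1
        v = []
        for j in range(base**k):                  # j-th element of I (0-based)
            sigma = rng.sample(range(base), base) # assumption: fresh sigma_b
            v.extend(I[j] + base**k * s for s in sigma)
        I = v
    return I


def permuted_index(l1, l2, J1, J2, P1, P2):
    """Permuted Halton index of the box with lower corner (l1, l2)."""
    m1, m2 = 2**J1, 3**J2
    a1 = P1[int(m1 * l1)]  # P1^{(2^J1 * l1 + 1)} in the 1-based notation
    a2 = P2[int(m2 * l2)]  # P2^{(3^J2 * l2 + 1)}
    # Solve k = a1 (mod m1), k = a2 (mod m2); needs Python 3.8+ for pow(.., -1, ..)
    t = ((a2 - a1) * pow(m1, -1, m2)) % m2
    return a1 + m1 * t


# Box [1/4, 1/2) x [1/3, 4/9) with the permutations in (3)
P1 = [0, 2, 1, 3]
P2 = [7, 4, 1, 2, 8, 5, 6, 0, 3]
print(permuted_index(Fraction(1, 4), Fraction(1, 3), 2, 2, P1, P2))  # 2

# A fresh random permutation for base 3 with J = 2
print(permute_levels(3, 2, random.Random(0)))
```

Because each block of $b_i$ consecutive outputs shares the same value of the previous level, the permutation keeps the nested structure of the Halton boxes while reordering them.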
Table 2: Halton indices for 36 Halton boxes and their corresponding mod $B$ nested structure, where row $i$ column $j$ is the Halton index for Halton box $[(j - 1)/4, j/4) \times [(9 - i)/9, (10 - i)/9)$. Two permutations of the indices are shown.

                    mod 36          mod 12        mod 6      mod 2
Original          8 26 17 35      8  2  5 11     2 2 5 5    0 0 1 1
                 32 14  5 23      8  2  5 11     2 2 5 5    0 0 1 1
                 20  2 29 11      8  2  5 11     2 2 5 5    0 0 1 1
                 16 34 25  7      4 10  1  7     4 4 1 1    0 0 1 1
                  4 22 13 31      4 10  1  7     4 4 1 1    0 0 1 1
                 28 10  1 19      4 10  1  7     4 4 1 1    0 0 1 1
                 24  6 33 15      0  6  9  3     0 0 3 3    0 0 1 1
                 12 30 21  3      0  6  9  3     0 0 3 3    0 0 1 1
                  0 18  9 27      0  6  9  3     0 0 3 3    0 0 1 1

Permutation 1    12 30 21  3      0  6  9  3     0 0 3 3    0 0 1 1
                  0 18  9 27      0  6  9  3     0 0 3 3    0 0 1 1
                 24  6 33 15      0  6  9  3     0 0 3 3    0 0 1 1
                 32 14  5 23      8  2  5 11     2 2 5 5    0 0 1 1
                  8 26 17 35      8  2  5 11     2 2 5 5    0 0 1 1
                 20  2 29 11      8  2  5 11     2 2 5 5    0 0 1 1
                 28 10  1 19      4 10  1  7     4 4 1 1    0 0 1 1
                  4 22 13 31      4 10  1  7     4 4 1 1    0 0 1 1
                 16 34 25  7      4 10  1  7     4 4 1 1    0 0 1 1

Permutation 2    13 31 22  4      1  7 10  4     1 1 4 4    1 1 0 0
                  1 19 10 28      1  7 10  4     1 1 4 4    1 1 0 0
                 25  7 34 16      1  7 10  4     1 1 4 4    1 1 0 0
                 17 35 26  8      5 11  2  8     5 5 2 2    1 1 0 0
                 29 11  2 20      5 11  2  8     5 5 2 2    1 1 0 0
                  5 23 14 32      5 11  2  8     5 5 2 2    1 1 0 0
                 33 15  6 24      9  3  6  0     3 3 0 0    1 1 0 0
                 21  3 30 12      9  3  6  0     3 3 0 0    1 1 0 0
                  9 27 18  0      9  3  6  0     3 3 0 0    1 1 0 0
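The unpermuted mod 36 block of Table 2 can also be rebuilt directly from the Halton sequence, because the $k$th Halton point, $k = 0, \ldots, B - 1$, falls in the box whose index is $k$. The sketch below assumes the standard radical-inverse construction with bases 2 and 3; `radical_inverse` and `original_block` are hypothetical helper names, and exact fractions are used so that box membership is decided without rounding error.

```python
from fractions import Fraction


def radical_inverse(k, base):
    """Reflect the base-`base` digits of k about the radix point."""
    x, f = Fraction(0), Fraction(1, base)
    while k > 0:
        k, d = divmod(k, base)
        x += d * f
        f /= base
    return x


def original_block(J1=2, J2=2):
    """Grid of Halton indices laid out as in Table 2 (row 1 is the top strip)."""
    m1, m2 = 2**J1, 3**J2
    grid = [[None] * m1 for _ in range(m2)]
    for k in range(m1 * m2):
        col = int(radical_inverse(k, 2) * m1)            # column j - 1
        row = m2 - 1 - int(radical_inverse(k, 3) * m2)   # row i - 1
        grid[row][col] = k
    return grid


grid = original_block()
print(grid[0])                       # [8, 26, 17, 35]
print([k % 12 for k in grid[0]])     # [8, 2, 5, 11]
```

Reducing each entry modulo 12, 6 or 2 in this way gives the coarser columns of the table, which is the nested structure the caption refers to.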
Table 3: Results for populations 1, 2 and 3 with $N = 10{,}000$ for different sample sizes, $n$. The reported values are averages using 1000 different samples, where $V_{SIM}$ is the simulated MSE, $\hat{V}_{NBH}$ is the local mean variance estimator and $\hat{V}_{LPM}$ is LPM's variance estimator. Exact values are shown for SRS, $V_{SRS}$. The lowest simulated variance for each problem is shown in bold.

             SRS       GRTS                GRTS (2n)           LPM                           HIP
Pop    n     V_SRS     V_SIM     V̂_NBH     V_SIM     V̂_NBH     V_SIM     V̂_NBH     V̂_LPM     V_SIM     V̂_NBH     V̂_LPM
1      20    0.1044    0.0210    0.0282    0.0247    0.0279    0.0135    0.0306    0.0176    0.0132    0.0298    0.0160
       50    0.0416    0.0041    0.0053    0.0094    0.0051    0.0025    0.0057    0.0028    0.0023    0.0055    0.0026
       100   0.0207    0.0012    0.0014    0.0041    0.0014    0.0007    0.0015    0.0007    0.0006    0.0015    0.0007
       150   0.0137    0.0006    0.0006    0.0019    0.0006    0.0003    0.0007    0.0003    0.0002    0.0007    0.0003
       200   0.0103    0.0004    0.0004    0.0008    0.0003    0.0002    0.0004    0.0002    0.0002    0.0004    0.0002
2      20    0.1812    0.0801    0.1164    0.0875    0.1122    0.0630    0.1200    0.1219    0.0699    0.1162    0.1110
       50    0.0722    0.0171    0.0321    0.0285    0.0303    0.0143    0.0331    0.0238    0.0131    0.0331    0.0218
       100   0.0359    0.0071    0.0103    0.0121    0.0100    0.0038    0.0111    0.0063    0.0041    0.0108    0.0057
       150   0.0238    0.0036    0.0051    0.0069    0.0049    0.0021    0.0055    0.0029    0.0019    0.0053    0.0026
       200   0.0178    0.0018    0.0031    0.0036    0.0029    0.0012    0.0033    0.0016    0.0010    0.0031    0.0015
3      20    74.614    44.805    45.405    43.332    45.491    30.875    48.238    48.086    31.667    48.464    45.947
       50    29.756    9.016     13.304    12.275    12.913    6.556     13.760    10.557    7.937     13.590    9.843
       100   14.803    2.873     4.431     5.480     4.316     2.118     4.723     2.967     1.686     4.660     2.759
       150   9.819     1.448     2.215     3.096     2.099     1.031     2.376     1.389     0.990     2.316     1.262
       200   7.327     0.811     1.348     1.562     1.278     0.602     1.433     0.792     0.671     1.368     0.714
Populations used in Section 5
The sampling frame was a rectangular grid of $N = Z^2$ points in $[0, 1)^2$:

\[ \left\{ \frac{1}{Z} \sum_{i=1}^{2} z_i e_i : z_i \in \{0, 1, \ldots, Z - 1\} \right\}, \]

where $e_i$ is the $i$th row of the two-dimensional identity matrix. Two $Z$ values were considered: $Z = 30$ and $Z = 100$, giving point resources with 900 and 10,000 points, respectively. The response value for each point on the grid, $x = (x_1, x_2)$, was defined as the integral of $f(x)$ over the box $[x_1, x_1 + 1/Z) \times [x_2, x_2 + 1/Z)$. Three functions were used to define response values, listed below.

• Population 1 (Robertson et al. 2013; Grafström, Lundström and Schelin 2012):

\[ f(x) = 3(x_1 + x_2) + \sin(6(x_1 + x_2)), \]

with population total $\tau = 2.9994$ (see Figure 1(a)).

• Population 2 (Peak function):

\[ \begin{aligned} f(x) = {} & 3(4 - 6x_1)^2 \exp\!\left(-(6x_1 - 3)^2 - (6x_2 - 2)^2\right) \\ & - 10\left(0.2(6x_1 - 3) - (6x_1 - 3)^3 - (6x_2 - 3)^5\right) \exp\!\left(-(6x_1 - 3)^2 - (6x_2 - 3)^2\right) \\ & - \tfrac{1}{3} \exp\!\left(-(6x_1 - 2)^2 - (6x_2 - 3)^2\right), \end{aligned} \]

with population total $\tau = 0.3627$ (see Figure 1(b)).

• Population 3 (Bird function):

\[ \begin{aligned} f(x) = {} & (12x_1 - 12x_2)^2 + \exp\!\left[(1 - \sin(12x_1 - 6))^2\right] \cos(12x_2 - 6) \\ & + \exp\!\left[(1 - \cos(12x_2 - 6))^2\right] \sin(12x_1 - 6), \end{aligned} \]

with population total $\tau = 23.3982$ (see Figure 1(c)).
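To make the construction concrete, the following Python sketch builds the response values for a $Z \times Z$ frame by integrating $f$ over each box and sums them to recover $\tau$. The 3-point Gauss-Legendre rule is an assumption made here for illustration (the text does not say how the integrals were evaluated), and all function names are hypothetical. The last function evaluates the textbook simple random sampling variance of the expansion estimator of the total, $N^2 (1 - n/N) S^2 / n$, on the assumption that this is the quantity behind the $V_{SRS}$ column of Table 3.

```python
from math import sin, cos, exp, sqrt


def pop1(x1, x2):
    return 3 * (x1 + x2) + sin(6 * (x1 + x2))


def pop2(x1, x2):  # Peak function
    a, b = 6 * x1 - 3, 6 * x2 - 3
    return (3 * (4 - 6 * x1)**2 * exp(-a**2 - (6 * x2 - 2)**2)
            - 10 * (0.2 * a - a**3 - b**5) * exp(-a**2 - b**2)
            - exp(-(6 * x1 - 2)**2 - b**2) / 3)


def pop3(x1, x2):  # Bird function
    u, v = 12 * x1 - 6, 12 * x2 - 6
    return ((12 * x1 - 12 * x2)**2 + exp((1 - sin(u))**2) * cos(v)
            + exp((1 - cos(v))**2) * sin(u))


# 3-point Gauss-Legendre nodes and weights mapped to [0, 1] (an assumption)
NODES = [0.5 - sqrt(0.6) / 2, 0.5, 0.5 + sqrt(0.6) / 2]
WEIGHTS = [5 / 18, 8 / 18, 5 / 18]


def responses(f, Z):
    """Integral of f over each box [x1, x1 + 1/Z) x [x2, x2 + 1/Z) of the frame."""
    h = 1.0 / Z
    y = []
    for z1 in range(Z):
        for z2 in range(Z):
            total = 0.0
            for s, ws in zip(NODES, WEIGHTS):
                for t, wt in zip(NODES, WEIGHTS):
                    total += ws * wt * f((z1 + s) * h, (z2 + t) * h)
            y.append(total * h * h)
    return y


def srs_variance(y, n):
    """Variance of the expansion estimator of the total under SRS without replacement."""
    N = len(y)
    mean = sum(y) / N
    S2 = sum((yi - mean)**2 for yi in y) / (N - 1)
    return N**2 * (1 - n / N) * S2 / n


y = responses(pop1, 100)
print(round(sum(y), 4), round(srs_variance(y, 20), 4))
```

With $Z = 100$, the Population 1 total from this sketch should be close to the stated $\tau = 2.9994$ and `srs_variance(y, 20)` close to the 0.1044 reported in Table 3, although small differences from the quadrature rule are possible.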
Figure 1: Functions used to define response values, plotted as $f(x)$ against $x_1$ and $x_2$: (a) Population 1, (b) Population 2, (c) Population 3.
References

[1] Grafström, A., Lundström, N. L. P. and Schelin, L. (2012). Spatially balanced sampling through the pivotal method. Biometrics 68, 514–520.
[2] Robertson, B. L., Brown, J. A., McDonald, T. and Jaksons, P. (2013). BAS: Balanced acceptance sampling of natural resources. Biometrics 69, 776–784.