Peter A. Rogerson, Eric Delmelle, Rajan Batta, Mohan Akella, Alan Blatt, Glenn Wilson
Optimal Sampling Design for Variables with Varying Spatial Importance

It is often desirable to sample in those locations where uncertainty associated with a variable is highest. However, the importance of knowing the variable's value may vary across space. We are interested in the spatial distribution of Received Signal Strength Indicator (RSSI), a measure of the signal strength from a cell tower received at a particular location. It is crucial to estimate RSSI values accurately in order to evaluate the effectiveness of mayday systems designed for rapid emergency notification following vehicle crashes. RSSI estimation is less important for locations where the probability of a crash is low and where the likelihood of call completion is either close to zero or one. We develop a method for augmenting an initial spatial sample of RSSI values to achieve a high-precision estimate of the probability of call completion following a crash. We illustrate the approach using data on RSSI and vehicle crashes in Erie County, NY.

1. INTRODUCTION
The optimal selection of spatial sampling locations has been the subject of much previous study (see, e.g., Berry and Baker 1968; Ripley 1981; Haining 1990; and Cressie 1993 for various summaries). A common objective is to design a sampling scheme that minimizes the variance associated with an estimate of the variable of interest. In this regard, the optimal location of a fixed number of sampling locations depends in part on the spatial structure of the variable of interest. For example, if portions of the study area exhibit strong spatial dependence, there is little point in sampling at locations of close proximity. Unfortunately, the spatial structure associated with the variable is often unknown, although it can be estimated if the results of some initial phase of sampling or from a pilot study are available.

In this paper, we focus on a specific type of second-phase sampling problem: one that calls for supplementing an initial sample with additional observations. This problem arises in the context of mapping cell phone signal strength, for the purpose of evaluating estimates of the probability that a cell phone call will be completed following a vehicle crash. The essential question is where to take additional samples of signal strength, given available information on the spatial pattern of signal strength, crash locations, and the relationship between signal strength and call completion.

In section 2, we describe the context of the problem that led to consideration of an optimal sampling scheme. In section 3, we review Cressie's (1993) discussion on spatial sampling and indicate how it can be modified for use in our application. The fourth section presents the results of an application of the method to sampling signal strength in a subregion of Erie County, NY. In the final section, we provide a summary.

This paper was presented at the annual meeting of the North American Regional Science Association, San Juan, Puerto Rico, November 2002. This research was carried out with the assistance of a grant from the Center for Transportation Injury Research, Veridian Engineering, Buffalo, NY. Ikuho Yamada provided programming assistance.

Peter A. Rogerson is in the Department of Geography, University at Buffalo and National Center for Geographic Information and Analysis ([email protected]). Eric Delmelle is in the Department of Geography, University at Buffalo ([email protected]). Rajan Batta is in the National Center for Geographic Information and Analysis and Department of Industrial Engineering ([email protected]). Mohan Akella is in the Department of Industrial Engineering, University at Buffalo ([email protected]). Alan Blatt is at Veridian Engineering, Buffalo, NY ([email protected]). Glenn Wilson is at Veridian Engineering, Buffalo, NY.

Geographical Analysis, Vol. 36, No. 2 (April 2004). The Ohio State University. Submitted: January 1, 2003. Revised version accepted: September 24, 2003.

2. RECEIVED SIGNAL STRENGTH INDICATORS (RSSI) AND THE AUTOMATED COLLISION NOTIFICATION (ACN)
To reduce vehicle crash-related fatalities, the National Highway Traffic Safety Administration (NHTSA) sponsored the Automated Collision Notification (ACN) Field Operational Test Program from 1995 to 2000. ACN explored the ability of in-vehicle equipment to reliably sense and characterize crashes and automatically transmit crash location and crash-severity data to the proper public safety agencies (Tilman and Knox 1997). In addition to ACN, several commercial automated collision notification systems (mayday systems) have been introduced (Clark and Cushing 2002). In general, a mayday system consists of a positional location/communication system that links an in-vehicle emergency notification device to emergency response service providers. Mayday systems such as OnStar use airbag deployment as the event to trigger an automated emergency message. More advanced systems such as ACN employ three accelerometers to detect vehicle decelerations that exceed predetermined crash thresholds (Evanco 1999). As a minimum, the emergency message transmitted by the vehicle provides the geographic location of the crash (i.e., latitude and longitude) and an indication of the type and severity of the crash. Once the data are received by the Public Safety Answering Point (PSAP), they are used to dispatch and route emergency services (i.e., fire, police, and emergency medical services) to the scene. The wireless call must transmit at least a relatively short burst of data containing the vehicle location (Evanco 1999). More robust systems provide additional data that can be used to characterize the crash severity and estimate potential occupant injuries.

Robust cellular communications are essential for mayday system effectiveness, and concerns arise regarding signal strength issues, especially in rural areas. Received Signal Strength Indicator (RSSI) is a measure of the bond between the customer's cellular phone and the cellular service provider's tower or cell site (Akella et al. 2003).
The strength of this bond determines the likelihood of completing a phone call within that specific cell. Many factors, including distance to cell tower, foliage cover, topographical features, and loading of the cellular system, contribute to the integrity of a cellular connection (Delmelle et al. 2003). In most urban and suburban areas, signal strength is more than adequate for cellular communications. In rural, hilly, or forested areas, the RSSI may vary significantly at a scale of a few hundred feet. Sometimes the RSSI may drop low enough to be inadequate for establishing a cellular connection, or it may be low enough to cause a delay in establishing a call (Walton and Meyer 2000). In the case of mayday systems, the RSSI may vary enough during the phone call to allow only partial transmission of crash data. Unfortunately, the areas of the country in which mayday systems are most needed (e.g., rural areas where crashes can otherwise go unreported for long periods of time) are also the same areas that have the most limited cellular phone coverage.

Previous research summarized the relationship between RSSI and call completion probability, the initial RSSI sampling effort, and the subsequent prediction of RSSI values throughout Erie County, NY (Akella et al. 2003). The beta-test program results for the ACN system are also reported in Akella et al. (2003). In addition, we have investigated in more detail the relationship between RSSI and its covariates, including foliage cover, slope, elevation, and distance from the nearest cell tower (Delmelle et al. 2003). Results indicate that RSSI values decrease exponentially with the log distance to the cell tower. Increasing slope and altitude values induce a decrease in the received signal strength. Finally, RSSI values drop significantly (> 5 dB) when measured behind densely vegetated areas.

Given an initial set of RSSI values taken at sampling locations, our objective here is to choose an additional set of locations at which RSSI will be sampled. Instead of simply gathering more information on the spatial variation in RSSI, our underlying goal is to gather information in those strategic locations where (a) call completion probabilities are particularly sensitive to the level of RSSI, and (b) crash densities are relatively high (if the probability of a crash in a particular location is low, the need for a precise estimate of call completion probability is also relatively low).
More specifically then, we wish to choose the second-phase sample to reduce as much as possible the variance in the estimate of the call-completion probability following a vehicle crash. In the next section, we outline the general approach; in the following section we demonstrate how that approach is applied to the spatial sampling of RSSI values.

3. OPTIMAL SPATIAL SAMPLING DESIGN FOR VARIABLES OF VARYING SPATIAL IMPORTANCE
Given a set of values at sampled locations, kriging (see, e.g., Cressie 1993; Bailey and Gatrell 1995) is based on a variogram that summarizes how the variance of values at points separated by a particular distance changes with distance. The empirical variogram may be estimated by:

$$\hat{\gamma}(h) = \frac{1}{2\,n(h)} \sum_{(s_i, s_j)} \left[ z(s_i) - z(s_j) \right]^2 \tag{1}$$

where the sum is over all pairs of points $(s_i, s_j)$ separated by a distance $h$, $z(s_i)$ is the value observed at location $s_i$, and the number of such pairs is $n(h)$. In practice, distance ranges are constructed, since there will be few pairs separated by a specific distance. In modeling the variogram, it is common to invoke the assumption of isotropy, where the distance effect is the same in all directions. The function $\gamma(h)$ is a function of distance ($h$) and is termed the semivariogram; it is typical to use the data to fit a parametric model for $\gamma(h)$. An example of such a model is the exponential (Bailey and Gatrell 1995):

$$\gamma(h) = \sigma^2 \left( 1 - e^{-3h/r} \right) \tag{2}$$

where $\sigma^2$ is termed the "sill" and is equal to the variance of point values observed at large distances, and $r$ is the range, i.e., the distance at which the semivariogram levels out at a height of $\sigma^2$.

Although the value of $\gamma(h)$ in equation (2) is equal to zero at $h = 0$, often a constant term, called the nugget effect, is added to equation (2), because in practice values separated by small distances can be dissimilar. The modified form of equation (2) is:

$$\gamma(h) = a + (\sigma^2 - a) \left( 1 - e^{-3h/r} \right) \tag{3}$$

where $a$ is the magnitude of the nugget effect. The corresponding covariogram model is:

$$C(h) = \sigma^2 - \gamma(h) = (\sigma^2 - a)\, e^{-3h/r} \tag{4}$$

where $C(h)$ is the covariance of two points separated by a distance $h$.

Kriging is a method of interpolation, where the estimated value at any particular location is a weighted sum of the values observed at surrounding locations and where the weights are a function of the variogram. Kriging yields (a) a set of spatial predictions for values of the variable at unsampled locations, and (b) an associated kriging variance that measures the uncertainty in the predictions made at specific geographic locations. As Cressie notes, the kriging variance depends on the number of points, the sample locations, and the parameters of the variogram, but not on the specific values at the locations. Following Bailey and Gatrell's notation, the ordinary kriging variance at a location, $s$, is:

$$\sigma_k^2(s) = \sigma^2 - c_+(s)'\, C_+^{-1}\, c_+(s) \tag{5}$$
where the subscript "k" is used to denote the kriging variance. In addition, $C$ is the $m \times m$ matrix of covariances among the $m$ observations, and $c(s)$ is an $m \times 1$ vector of covariances between the value at point $s$ and each of the $m$ observations. The "+" indicates that these vectors and matrices have been augmented. In the case of $c$, the vector is augmented with a "1" in position $m + 1$, and $C$ is augmented with a row and a column consisting of ones in the first $m$ positions, and a zero in position $m + 1$. With a fitted exponential model, the parameter values $\sigma^2$, $a$, and $r$ may be used first in equation (3), and then in equations (4) and (5), to determine the kriging variance at a particular location.

Given an initial set of $m$ sample locations, our first objective is to sample $n$ new locations to reduce the kriging variance by as much as possible over the area of interest. Alternatively stated, we wish the new kriging variance based on the enhanced set of sample locations to be as small as possible. This is similar to the objective of van Groenigen, Siderius, and Stein (1999) and van Groenigen (2000). The objective may be expressed as:

$$\min_{s_{m+1}, \ldots, s_{m+n}} \int_D \sigma_k^2(s)\, ds \tag{6}$$

where the integral is taken over the domain of the study area, $D$. This integral is typically evaluated by finding the mean kriging variance over a lattice consisting of $g$ grid points:

$$\frac{1}{g} \sum_{i=1}^{g} \sigma_k^2(s_i) \tag{7}$$

In addition, we will have weights associated with each location that reflect the "importance" of that location. Our objective is then to choose the $n$ new sample locations to maximize the decrease in the (weighted) kriging variance:

$$\max_{s_{m+1}, \ldots, s_{m+n}} \frac{1}{g} \sum_{i=1}^{g} w_i\, \Delta\sigma_k^2(s_i) \tag{8}$$

where $w_i$ is the (nonnegative) weight associated with grid point location $i$, $g$ is the number of grid points, and $\Delta$ represents the change in the kriging variance following the selection of the new set of points.

Let the region be divided into $N$ possible sampling sites. There are $\binom{N}{n}$ possible sampling plans; each of these may be evaluated to determine the resultant reduction in the weighted kriging variance. Possible approaches to avoid enumeration include integer programming with branch and bound (Garside 1971) or heuristics such as discretized partial gradient (Fedorov 1972), simulated annealing (Sacks and Schiller 1988), and genetic algorithms (Goldberg 1989). In the next section, we illustrate how sampling plans may be constructed for gathering information on cell phone signal strength and call-completion probabilities.
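The kriging-variance calculation described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the sample and target coordinates, and the sill of 15 dB² are assumptions made for the example. The range of 3473.7 m and the zero nugget are the values that appear in the figure legends later in the paper. The code builds the exponential model with nugget, the covariogram $C(h) = \sigma^2 - \gamma(h)$, and the augmented system of the ordinary kriging variance, then averages the variance over a set of target points as in the lattice mean.

```python
import numpy as np

def exp_semivariogram(h, sill, nugget, range_m):
    """Exponential semivariogram with nugget; defined as zero at h = 0."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / range_m))
    return np.where(h > 0, gamma, 0.0)

def covariogram(h, sill, nugget, range_m):
    """Covariance at separation h: C(h) = sill - gamma(h)."""
    return sill - exp_semivariogram(h, sill, nugget, range_m)

def kriging_variance(samples, targets, sill, nugget, range_m):
    """Ordinary kriging variance at each target; depends only on locations."""
    samples = np.asarray(samples, dtype=float)
    targets = np.asarray(targets, dtype=float)
    m = len(samples)
    # Pairwise distances: sample-to-sample and sample-to-target.
    d_ss = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=2)
    d_st = np.linalg.norm(samples[:, None, :] - targets[None, :, :], axis=2)
    # Augmented covariance matrix: extra row/column of ones, zero in the corner.
    C_plus = np.ones((m + 1, m + 1))
    C_plus[:m, :m] = covariogram(d_ss, sill, nugget, range_m)
    C_plus[m, m] = 0.0
    # Augmented covariance vectors (one column per target), with 1 in position m + 1.
    c_plus = np.ones((m + 1, len(targets)))
    c_plus[:m, :] = covariogram(d_st, sill, nugget, range_m)
    # sigma_k^2(s) = sill - c_+(s)' C_+^{-1} c_+(s), evaluated column by column.
    return sill - np.sum(c_plus * np.linalg.solve(C_plus, c_plus), axis=0)

# Hypothetical coordinates in meters; the sill value is an assumption.
samples = np.array([[0.0, 0.0], [1000.0, 0.0], [0.0, 1000.0]])
targets = np.array([[0.0, 0.0], [500.0, 500.0], [8000.0, 8000.0]])
kv = kriging_variance(samples, targets, sill=15.0, nugget=0.0, range_m=3473.7)
mean_kv = kv.mean()  # lattice mean of the kriging variance over the targets
```

In this toy configuration the variance is essentially zero at a target that coincides with a sample and grows with distance from the sampled points, which is the behavior the sampling design exploits.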
4. APPLICATION TO RSSI SAMPLING IN ERIE COUNTY, NY
We begin by modeling the probability of call completion ($p$) as a function of RSSI. RSSI values for locations are determined using kriging with the original sample values. It would be desirable to include additional variables such as the location of cell towers, but since we do not have complete information on this, we leave this as a future extension. We model the relationship between call-completion probability and RSSI using logistic regression:

$$y = \ln\left( \frac{p}{1-p} \right) = a + b\,r + \varepsilon \tag{9}$$

where $a$ and $b$ are parameters to be estimated, and $r$ is the value of RSSI (these symbols are distinct from the nugget and range parameters of section 3). Because of uncertainty in the value of $r$, the variance of the estimated log-odds of call completion is:

$$V[y] = V\left[ \ln\left( \frac{p}{1-p} \right) \right] = b^2 \sigma_r^2 + \sigma_\varepsilon^2 \tag{10}$$

where $\sigma_\varepsilon^2$ is the variance of the residual, and $\sigma_r^2$ represents the uncertainty associated with the measured value of RSSI:

$$\sigma_r^2 = \sigma_k^2 + 1.25^2 \tag{11}$$

The term $\sigma_k^2$ is the kriging variance and the factor of 1.25 accounts for measurement imprecision; the "black box" is only able to measure RSSI within +/- 2.5 dB (so that, approximately, one standard deviation of measurement error is equal to 1.25 dB). Let the probability of call completion at location $s$ be denoted $p_s$. Then the expected probability of call completion in the study area, following a crash, is:

$$E[p] = \sum_{s \in D} f_s\, p_s \tag{12}$$

where $f_s$ is the probability that, when a crash occurs, it occurs at location $s$. The average variance of $p$ throughout the study region ($D$), when weighted by crash density, is:

$$\sum_{s \in D} f_s\, V[p_s] \tag{13}$$

We wish to maximize the change in this quantity (where the "change" will be a decrease in the variance of $p$), over possible sampling plans. Equation (13) may be implemented using its relationship to equation (10) (see, e.g., Alonso 1968):

$$V[p_s] \approx \frac{e^{2y_s}}{(1 + e^{y_s})^4}\, V[y_s] = \frac{e^{2y_s}}{(1 + e^{y_s})^4} \left[ b^2 \left( \sigma_{k,s}^2 + 1.25^2 \right) + \sigma_\varepsilon^2 \right] \tag{14}$$

where the dependence of $y$ and the kriging variance on location has been made explicit using the subscript $s$, and where $\sigma_\varepsilon^2$ is assumed constant. Substituting equation (14) in equation (13), and then focusing on the change (reduction) in the variance yields:

$$\Delta \sum_{s \in D} f_s\, V[p_s] = b^2 \sum_{s \in D} f_s\, \frac{e^{2y_s}}{(1 + e^{y_s})^4}\, \Delta\sigma_{k,s}^2 \tag{15}$$

Again, it is this quantity that we are interested in maximizing. Equation (15) reveals that we wish to choose the new sampling locations in areas that (a) will have relatively high crash densities ($f_s$), (b) reduce the kriging variance, $\sigma_k^2$, and (c) have RSSI values that are in the range where small changes will influence the estimated call-completion probability significantly (captured by the term $e^{2y}/(1 + e^{y})^4$). Note that in the context of equation (8), the weight at a location may be defined as:

$$w_s = f_s\, \frac{e^{2y_s}}{(1 + e^{y_s})^4} \tag{16}$$

where, as a reminder, $y_s = \ln\left( \frac{p_s}{1 - p_s} \right)$ is location-specific. The term $b^2$ is a constant throughout the study region; because it is not location-specific, it does not enter the weight defined in equation (16).
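As a rough illustration of how the location weights behave, the sketch below evaluates the sensitivity term $e^{2y}/(1+e^{y})^4$ and the resulting weight for a few RSSI values. The function names and the logistic coefficients (`a_hat`, `b_hat`), the residual variance, and the crash densities are hypothetical values chosen for the example, not estimates from the paper. The sensitivity term is computed as $[p(1-p)]^2$, which is algebraically identical and numerically better behaved for large $|y|$.

```python
import numpy as np

def sensitivity(y):
    """(dp/dy)^2 for the logistic curve: e^{2y} / (1 + e^y)^4 = [p (1 - p)]^2."""
    p = 1.0 / (1.0 + np.exp(-np.asarray(y, dtype=float)))
    return (p * (1.0 - p)) ** 2

def variance_of_p(y, kriging_var, b, resid_var, meas_sd=1.25):
    """Approximate variance of p: sensitivity times the variance of the log-odds."""
    var_y = b ** 2 * (kriging_var + meas_sd ** 2) + resid_var
    return sensitivity(y) * var_y

def location_weight(y, crash_density):
    """Weight of a location: crash density times the sensitivity term."""
    return crash_density * sensitivity(y)

# Hypothetical logistic coefficients: p = 0.5 at an RSSI of -100 dB.
a_hat, b_hat = 20.0, 0.2
rssi = np.array([-115.0, -100.0, -70.0])         # weak, marginal, strong signal
crash_density = np.array([0.01, 0.01, 0.01])     # equal densities, for comparison
y = a_hat + b_hat * rssi                          # estimated log-odds at each site
weights = location_weight(y, crash_density)
var_p = variance_of_p(y, kriging_var=13.5, b=b_hat, resid_var=0.5)
```

With equal crash densities, the weight peaks where the log-odds are near zero (call completion probability near one half) and is close to zero where the signal is strong enough or weak enough that the outcome is nearly certain.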
Data and Analysis

Figure 1 shows the location of m = 34 sampled RSSI values within a subregion of Erie County. The shading within each circle corresponds to the RSSI value, with darker shading representing weaker signals. Figure 2 depicts the RSSI surface that results from kriging, with darker values again corresponding to areas of weak signal strength.

FIG. 1. Original RSSI Sample Points
FIG. 2. Interpolation from Original RSSI Sample Points

To illustrate some potential sampling plans, we placed seven sampling points at various locations on the network (see Figure 3). The contours of Figure 3 correspond to lines of equal kriging variance. We chose the potential sites in areas where the kriging variance was relatively high, and consequently, the number of previously sampled points in the vicinity of these sites was small.

FIG. 3. Location of Potential Sample Points

Now suppose that we wanted to choose two of the seven potential sites for additional sampling. There are $\binom{7}{2}$, or 21, possible combinations. In Table 1, the "percentage improvement" column shows the reduction in the mean kriging variance that may be achieved with the 21 subsets of two sample locations. Sampling at points D and F leads to the greatest reduction in the kriging variance (1.63%). Figure 3 reveals that the kriging variance at the locations of these points is relatively high (compared with the value of the kriging variance at the other potential sites). Note that these points are both relatively far removed from existing sample points, and the not-surprising conclusion is that we should sample where we currently have relatively little information.

TABLE 1
Columns: (1) mean kriging variance; (2) improvement (%); (3) mean reduction in weighted kriging variance (dB²); (4) mean reduction in weighted kriging variance including crash density (dB²).

Sample                                   (1)       (2)      (3)       (4)
Original sample                          13.5554   0
Original sample + all potential points   13.0980   3.4921
AB                                       13.4649   0.6721   0.00000   0.00000
AC                                       13.4550   0.7462   0.00140   0.00016
AD                                       13.4459   0.8144   0.00258   0.00027
AE                                       13.4499   0.7844   0.00000   0.00000
AF                                       13.3629   1.4406   0.00091   0.00009
AG                                       13.4756   0.5922   0.00000   0.00000
BC                                       13.4498   0.7851   0.00140   0.00016
BD                                       13.4408   0.8526   0.00258   0.00027
BE                                       13.4447   0.8234   0.00000   0.00000
BF                                       13.3577   1.4800   0.00091   0.00009
BG                                       13.4497   0.7859   0.00028   0.00005
CD                                       13.4333   0.9089   0.00387   0.00042
CE                                       13.4347   0.8984   0.00140   0.00016
CF                                       13.3476   1.5568   0.00231   0.00025
CG                                       13.4604   0.7058   0.00141   0.00016
DE                                       13.4256   0.9668   0.00257   0.00027
DF                                       13.3386   1.6254   0.00348   0.00036
DG                                       13.4588   0.7177   0.00261   0.00029
EF                                       13.3425   1.5957   0.00090   0.00009
EG                                       13.4552   0.7447   0.00001   0.00000
FG                                       13.3683   1.3996   0.00091   0.00009
HG                                       13.4303   0.9315   0.00196   0.00021
HF                                       13.3726   1.3670   0.00227   0.00025
HE                                       13.4046   1.1250   0.00195   0.00021
HD                                       13.4005   1.1559   0.00452   0.00048
HC                                       13.4100   1.0843   0.00335   0.00037
HB                                       13.4197   1.0112   0.00195   0.00021
HA                                       13.4249   0.9721   0.00196   0.00021

Next, we add the criterion that we wish to sample not necessarily at those locations where the variance of the RSSI estimate will be reduced, but rather at those locations where further information on RSSI will contribute substantially to reducing the uncertainty regarding call completion. There is little point in collecting further information on RSSI in locations where we believe the probability of call completion is close to zero or one, even if the density of previously sampled points is low. We now choose two points among the seven, with the objective of maximizing the reduction in the uncertainty of call completion (i.e., the reduction in the weighted kriging variance, as defined in equations (8) and (15), omitting for now the effect of crash density). Column (3) of Table 1 gives the mean value of the reduction in kriging variance, weighted by the sensitivity of call completion to RSSI (captured by the term $e^{2y}/(1+e^{y})^4$). Entries in the column show that the optimal subset (shown in bold) would now be to choose points C and D. These points are all at locations on the network where estimated RSSI is relatively weak (see Figure 2), and more importantly, where the sensitivity of call completion to changes in RSSI is particularly high (i.e., the dark areas in Figure 4). Figure 5 shows where each of the potential sample points lies on the call completion curve; note that points C, D, and F are all on relatively steep portions of this curve. Figure 6 depicts the spatial distribution of the weighted change in kriging variance for the 21 potential pairs of sampling points. The figure allows visual comparison of the benefits of alternative sampling plans.

On the basis of the previous results, we added one more potential sample point: point H in Figure 4. This point was chosen because the estimated probability of call completion at that point is particularly sensitive to RSSI value, and thus further information on RSSI could potentially reduce uncertainty in the probability of call completion significantly. Table 1 reveals that with the addition of this point, the optimal pair of points to sample is now H and D, since this pair reduces the weighted kriging variance by the greatest amount. Figure 7 displays how the weighted kriging variance would be reduced for each additional pair of points.

FIG. 4. Sensitivity of Call Completion to Changes in RSSI

Finally, we add crash density to our criteria. There is little point in improving estimates of call completion if there is only a very remote possibility of a crash at particular points in the network. Rather, we wish to improve our estimates at locations where the crash density is relatively high. We used the 206 crash locations observed in Erie County during 1995 together with kernel density methods (Bailey and Gatrell 1995) to estimate the probability density associated with crashes. A refinement would be to develop a more comprehensive model of crash density; for example, crashes are often more likely at intersections. Results in column (4) of Table 1 show that among the original seven potential sampling points, the pair C and D yields the greatest decline in the weighted kriging variance (where the weights now include crash density). When the set of eight points is considered, the pair H and D achieves the greatest reduction.
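The exhaustive comparison of pairs can be sketched as a simple enumeration. The snippet below is an illustrative sketch rather than the authors' code: the helper name `best_subset` is hypothetical, and the scores are the percentage improvements from column (2) of Table 1 for the 21 pairs drawn from points A to G. It also shows how quickly the number of plans grows, which is what motivates the heuristics considered for larger problems.

```python
from itertools import combinations
from math import comb

# Column (2) of Table 1: percentage improvement in mean kriging variance,
# listed in the order AB, AC, ..., AG, BC, ..., FG.
values = [0.6721, 0.7462, 0.8144, 0.7844, 1.4406, 0.5922,
          0.7851, 0.8526, 0.8234, 1.4800, 0.7859,
          0.9089, 0.8984, 1.5568, 0.7058,
          0.9668, 1.6254, 0.7177,
          1.5957, 0.7447,
          1.3996]
pairs = list(combinations("ABCDEFG", 2))  # same order as the table rows
improvement = dict(zip(pairs, values))

def best_subset(candidates, k, score):
    """Evaluate every k-subset of the candidates and return the best one."""
    return max(combinations(candidates, k), key=score)

best_pair = best_subset("ABCDEFG", 2, lambda pair: improvement[pair])

n_small = comb(7, 2)     # 21 plans: enumeration is trivial
n_large = comb(433, 20)  # choosing 20 of 433 sites: far too many to enumerate
```

Here `best_pair` is the pair D and F, in agreement with the unweighted result reported above; replacing the score with column (3) or column (4) of Table 1 would reproduce the weighted comparisons in the same way.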
FIG. 5. Position of Potential Locations Mapped on the Call Completion Probability Curve (horizontal axis: RSSI; vertical axis: call completion probability, 0.0 to 1.0)
FIG. 6. Change in Weighted Kriging Variance for the 21 Potential Points (one map panel per pair of points; kriging parameters: exponential model, 10 lags of 350 m each, nugget effect: 0 dB)

FIG. 7. (one map panel for each of the pairs HA, HB, HC, HD, HE, HF, and HG; kriging parameters: exponential model, 10 lags of 350 m each, nugget effect: 0 dB, range: 3473.7 m)

Larger Problems

The results to this point have demonstrated the nature of the problem, but it is more realistic to be concerned about larger samples (n) from larger potential sets of sampling points (N). Although RSSI as an indicator of cell tower strength is a variable that is continuous in space, it is less time-consuming and more relevant to collect measurements on a network. In addition, there are practical constraints to sampling on a network; collecting RSSI values every five meters along a road would yield very accurate results, but it would be very time-consuming to collect such information. On the other hand, sampling every two kilometers may be too large an interval to capture relevant spatial characteristics. In our study area, a potential sampling interval of every 100 meters strikes a balance between these two extremes.

To generate these potential sample points on a network, two scripts were written in Avenue, the ESRI ArcView programming language. The first script reports the length of each segment on the network, characterized by a begin-node and an end-node. The second script allocates points on these segments, 100 meters apart from each other. This continues until the distance separating the last generated point from the end-node is less than 100 meters. When a road segment is shorter than 100 meters, the algorithm stops after the first step; only one point will be generated, corresponding with the end-node of that segment. A table is created while the script is running; it contains the new sample point ID, the corresponding road segment ID, and the distance separating that point from the end-node of that segment. Finally, Network Analyst, an ArcView extension, was used to display the points on the network (see Figure 8).

FIG. 8. Points generated based on a 100-meter interval

From the set of 433 potential sampling points, we decided to select n = 20 points to sample. We implemented a dynamic greedy algorithm (Rardin 1998) by first selecting the single point that would yield the greatest reduction in weighted kriging variance. With that point added to the set of sampled points, we then chose another single point that would achieve the greatest additional reduction in weighted kriging variance. The results are displayed in Figure 9, which shows the reduction in weighted kriging variance with the addition of sample points. Over a 10% reduction is achieved through the addition of the first 15 points. Figure 10 compares this myopic, greedy strategy with the more naïve strategy of random selection. At each stage, a sampling point could be chosen randomly from the set of potential sample points. The average of five trials of this strategy is shown in the figure; clearly the myopic, greedy strategy improves upon the naïve, random strategy. One approach would be to compare the myopic, greedy strategy with the best of, say, M repetitions of the random strategy. If M is large enough, the random strategy would produce the actual optimum; in practice, it is unlikely that values of M that are reasonable from a computational point of view would yield results as good as the myopic, greedy strategy. Our future work will focus on comparing the myopic, greedy approach with optimal strategies (an optimal selection of 20 points would have resulted in a weighted kriging variance less than or equal to that represented by the solid line in Figure 10).

Finally, Figure 11 shows the change in weighted kriging variance (before and after the selection of the 20 points). The 20 selected points are primarily in the western portion of the study region, and it is here where reductions in the weighted kriging variance are achieved.
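The one-step greedy strategy and the random baseline can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the randomly generated sample, candidate, and grid coordinates, the uniform random weights, and the sill of 15 dB² are all hypothetical; only the exponential model, the zero nugget, and the 3473.7 m range come from the figure legends. Candidates are assumed not to coincide with existing samples, since a duplicate location would make the kriging system singular.

```python
import numpy as np

def kriging_variance(samples, targets, sill, range_m):
    """Ordinary kriging variance with an exponential covariogram and zero nugget."""
    m = len(samples)

    def cov(a, b):
        dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
        return sill * np.exp(-3.0 * dist / range_m)

    C_plus = np.ones((m + 1, m + 1))
    C_plus[:m, :m] = cov(samples, samples)
    C_plus[m, m] = 0.0
    c_plus = np.ones((m + 1, len(targets)))
    c_plus[:m, :] = cov(samples, targets)
    return sill - np.sum(c_plus * np.linalg.solve(C_plus, c_plus), axis=0)

def weighted_mean_kv(samples, grid, weights, sill, range_m):
    """Mean over the grid of weight times kriging variance."""
    return float(np.mean(weights * kriging_variance(samples, grid, sill, range_m)))

def sequential_select(samples, candidates, grid, weights, n_new, sill, range_m, pick):
    """Add n_new candidates one at a time; pick(scores) returns the position to take."""
    current = samples.copy()
    remaining = list(range(len(candidates)))
    chosen = []
    history = [weighted_mean_kv(current, grid, weights, sill, range_m)]
    for _ in range(n_new):
        # Weighted mean kriging variance that would result from adding each candidate.
        scores = [weighted_mean_kv(np.vstack([current, candidates[j]]),
                                   grid, weights, sill, range_m) for j in remaining]
        k = pick(scores)
        j = remaining.pop(k)
        chosen.append(j)
        current = np.vstack([current, candidates[j]])
        history.append(scores[k])
    return chosen, history

# Hypothetical study area of 5 km by 5 km with made-up locations and weights.
gen = np.random.default_rng(0)
samples = gen.uniform(0.0, 5000.0, size=(10, 2))
candidates = gen.uniform(0.0, 5000.0, size=(30, 2))
gx, gy = np.meshgrid(np.linspace(0.0, 5000.0, 10), np.linspace(0.0, 5000.0, 10))
grid = np.column_stack([gx.ravel(), gy.ravel()])
weights = gen.uniform(0.0, 1.0, size=len(grid))
args = (samples, candidates, grid, weights, 5, 15.0, 3473.7)

# Greedy: take the candidate giving the lowest weighted mean kriging variance.
greedy_chosen, greedy_hist = sequential_select(*args, pick=lambda s: int(np.argmin(s)))
# Random baseline: take any remaining candidate with equal probability.
random_chosen, random_hist = sequential_select(*args, pick=lambda s: int(gen.integers(len(s))))
```

The `history` lists trace the weighted mean kriging variance as points are added, which is the quantity plotted against the number of points in the comparison of the two strategies. The greedy choice is optimal only one step at a time; it carries no guarantee of matching the best set of n points.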
FIG. 9. Mean of the weighted kriging variance (horizontal axis: points added, 0 to 20)

FIG. 10. Mean of the weighted kriging variance (horizontal axis: points added, 0 to 20)
5. SUMMARY
We have illustrated some of the tradeoffs that one may face in constructing an optimal spatial sampling plan. In general, it is important to sample in those locations where relatively little information is available, but only if the information sampled at those locations can contribute to reducing the uncertainty in the underlying question of interest. In this application, we were interested in sampling RSSI values at additional locations, but only if the sampled values were likely to contribute to estimation of call completion probabilities in areas of relatively high crash densities.

FIG. 11. Change in weighted kriging variance with 20 points added

An alternative approach would be to consider adaptive sampling (see, e.g., Thompson and Seber 1996; Cox 1999). In adaptive sampling, the selection of subsequent sample units (in this case geographic point locations) depends upon the outcome values observed at previously selected units. This type of sampling is particularly appropriate for situations where a spatial variable of interest is rare and clustered. In our particular context, this type of sampling could be useful in delineating the relatively rare subregions where signal strength is weak enough to cause the call completion probability to fall below one.

The design of optimal sampling schemes can be improved with cokriging. RSSI values are strongly correlated with variables such as distance to cell towers and terrain characteristics. This information can be used to reshape the sampling scheme and in turn reduce the total uncertainty over the study area.
Some of the analysis has been carried out in the plane rather than on a network (in particular, the estimation of the variogram and the kriging of RSSI values). Kriging on a network would lead to different variogram parameters, as well as different predicted RSSI values and variances.

Clearly one is interested in a larger number of potential points and sampled points, and there is interest in finding good heuristics to avoid enumeration of all possibilities. We have begun to investigate this using implementation of a greedy heuristic, constructed so that one first assumes that just one additional point is to be sampled; once this point is chosen, n - 1 additional locations are chosen in a similar, sequential fashion. Alternative strategies include two-step look ahead (in comparison to the one-step approach just described), and using simulated annealing to modify the solutions found with either the one-step or two-step look ahead (Rardin 1998). van Groenigen and Stein (1998) have looked at this possibility in the context of soil sampling. Finally, it is of interest to compare these strategies with the optimal solution. This can be done directly by using enumeration to find the optimum in small problems. For larger problems, we hope to be able to find bounds on the optimal solution in our future work.

LITERATURE CITED

Akella, M., Bang, C., Beutner, R., Delmelle, E., Wilson, G., Blatt, A., Batta, R., and Rogerson, P. (2003). Evaluating the Reliability of Automated Collision Notification Systems. Accident Analysis and Prevention 35, 349-60.
Alonso, W. (1968). Predicting best with imperfect data. Journal of the American Institute of Planners, 248-55.
Bailey, T., and Gatrell, A.C. (1995). Interactive spatial data analysis. Essex: Addison Wesley Longman Limited.
Berry, B.J.L., and Baker, A.M. (1968). Geographic sampling. In Spatial analysis, 91-100, edited by B.J.L. Berry and D.F. Marble. Englewood Cliffs: Prentice-Hall.
Clark, D.E., and Cushing, B.M. (2002). Predicted effect of automatic crash notification on traffic mortality. Accident Analysis and Prevention 34, 507-13.
Cox, L.A. (1999). Adaptive spatial sampling of contaminated soil. Risk Analysis 19, 1059-69.
Cressie, N.A.C. (1993). Statistics for spatial data. New York: Wiley.
Delmelle, E., Rogerson, P., Akella, M., Batta, R., Blatt, A., and Wilson, G. (2003). A Spatial Model of Remote Signal Strength Indicator Values. Submitted for publication.
Evanco, W.M. (1999). The potential of rural mayday systems on vehicular crash fatalities. Accident Analysis and Prevention 31, 495-62.
Fedorov, V.V. (1972). Theory of optimal experiments. New York: Academic Press.
Garside, M. (1971). Some computational procedures for the best subset problem. Applied Statistics 20, 8-15.
Goldberg, D.E. (1989). Genetic algorithms in search, optimization, and machine learning. Reading, MA: Addison-Wesley.
Haining, R. (1990). Spatial data analysis in the social and environmental sciences. Cambridge: Cambridge University Press.
Rardin, R. (1998). Optimization in Operations Research. New York: Prentice Hall.
Ripley, B.D. (1981). Spatial statistics. New York: Wiley.
Sacks, J., and Schiller, S. (1988). Spatial designs. In Statistical decision theory and related topics IV, Vol. 2, 2385-2399, edited by S.S. Gupta and J.O. Berger. New York: Springer.
Thompson, S.K., and Seber, G.A.F. (1996). Adaptive sampling. New York: Wiley.
Tilman, Jeffrey R., and Knox, T. (1997). New technologies from NHTSA. Annals of Emergency Medicine 29 (6), 812-1J.
van Groenigen, J.W., and Stein, A. (1998). Constrained optimization of spatial sampling using continuous simulated annealing. Journal of Environmental Quality 27, 1078-86.
van Groenigen, J.W., Siderius, W., and Stein, A. (1999). Constrained optimization of soil sampling for minimization of the kriging variance. Geoderma 87, 239-59.
van Groenigen, J.W. (2000). The influence of variogram parameters on optimal sampling schemes for mapping by kriging. Geoderma 97, 223-36.
Walton, S., and Meyer, E. (2000). Proceeding of RATTS conference on interpreting cellular coverage data. Brandon, Missouri, 13-16 August 2000.