Effects of 'local' clutter on human target detection

Spatial Vision, Vol. 19, No. 1, pp. 37–60 (2006) © VSP 2006.

Also available online - www.vsppub.com

GARY J. EWING 1,∗, CHRISTOPHER J. WOODRUFF 2 and DOUGLAS VICKERS 3,†

1 Command and Control Division, Defence Science and Technology Organisation, Edinburgh SA, Australia
2 Land Operations Division, Defence Science and Technology Organisation, Edinburgh SA, Australia
3 Department of Psychology, University of Adelaide, Adelaide SA, Australia

Received 14 September 2004; accepted 23 June 2005

Abstract: In theory, properties of clutter can be defined globally or locally. However, in the literature, the distinction between local and global clutter is arbitrary: the standard approach, when applying local clutter metrics, of setting the local domain to twice the expected target size is adopted without justification. This paper addresses this problem and considers the implications for the application of clutter metrics. It was found that the size of the local clutter region around a target has a strong effect on the probability of detection of that target and that this is affected by regions much larger than twice the target size. It was also discovered that this effect was much stronger for targets subtending less than 0.8 degrees of visual angle than for larger targets. In the case of the former, the fall-off in human visual performance with clutter region size was approximately quadratic, compared to a slight linear fall-off for larger targets. A simple model is presented explaining these phenomena, indicating that the auto-covariance function characterising the clutter is the main determinant of the size of the region of local clutter.

Keywords: Clutter metric; visual detection; visual clutter simulation.

1. INTRODUCTION

This paper considers the effects of the extent of the clutter around a target on a human observer’s ability to detect that target. Here, clutter is defined as any structure in the image that masks the target or confuses the observer as to the location and/or class of the target. In theory, properties of clutter can be defined globally or locally (Rotman et al., 1994; Aviram and Rotman, 2000a) and it has been shown that the human visual system (HVS) processes information at both the global and local levels (Caelli and Julesz, 1979; Burr et al., 1986). Whether the HVS uses mainly local or mainly global pre-attentive cues to focus attention depends upon the properties of the particular image being viewed. It appears that the HVS can integrate global features, such as the statistical properties of a texture (Caelli and Julesz, 1979), or underlying spatial spectral distributions (Burr et al., 1986), which in turn tune the HVS to local properties. In the context of detection, these properties in visual processing must depend on the types of clutter and target.

This study does not claim to explore the full range of possibilities, but is intended as a preliminary investigation of the effect of clutter localisation on human target detection and of its implications for the application of local clutter metrics. This is important, since the conventional wisdom of setting the local domain to twice the expected target size lacks empirical justification. It is also important because it is likely that, in the context of detection, rather than search, local clutter has a greater effect than global clutter on visual performance (Overington, 1976a; Doll et al., 1993; Aviram and Rotman, 2000b; Witus and Ellis, 2003; Chen and Tokuda, 2003). Accordingly, the following experiment was carried out to provide information on the extent and functional form of the effects of local clutter on human target detection.

∗ To whom correspondence should be addressed at DSTO/C2D, Building 205 Laboratories, PO Box 1500, Edinburgh SA 5111, Australia. E-mail: gary.ewing@dsto.defence.gov.au
† Sadly, Professor Douglas (Doug) Vickers passed away during the final preparation of this paper. Doug was formerly the first author’s PhD supervisor and will be greatly missed by colleagues and students.

2. EXPERIMENTAL METHODS

Experimental subjects were presented with visual stimuli that consisted of circular regions of simulated background clutter, at the centre of which was a circular region (target), with an incremental increase in luminance¹ over the rest of the clutter region. The surround consisted of a uniform luminance, equal to the average simulated clutter luminance. Example stimuli are shown in Fig. 1. Details of the experimental procedure are given in Section 2.3.

Figure 1. Illustrations of stimuli seen by the experimental subjects. The clutter properties are controlled by the clutter parameter, ρ.

Circular targets and clutter regions were used in this study because many studies in the visual detection literature have used circular stimuli. However, previous studies have used uniform luminance targets and/or uniform luminance backgrounds or surrounds. A classic example is that of Blackwell (1946), who defined human visual contrast thresholds using uniform luminance disc targets on uniform luminance backgrounds. Most studies do not address the interaction between size of the background and visual performance, although Overington (1976a) does cite some studies which addressed this issue with uniform luminance stimuli. On the basis of these studies, Overington (1976b) stated that size of the background has little effect if it is less than 1 to 2 orders of magnitude less bright than the target (positive target contrast) and the local background is greater than about 6 milliradians (≈0.3°). However, if the background is much brighter than the target (negative target contrast), the detection threshold drops markedly. For local target backgrounds of a size range of about 1 to 2 milliradians, a decrease in detection threshold has been reported (Overington, 1976c). Since visual performance differs for targets with positive contrast compared to targets with negative contrast, and the literature just cited yields more data on positive contrast stimuli, only positive target contrasts are considered in this experiment.

These results from the literature provided a baseline for this study, and show that any effects related to the background (clutter) radius are related to the clutter effects and are not purely luminance-background size interaction effects.

2.1. Experimental design

We designed a full factorial experiment to analyse the effects of four salient factors within the constraints of available time and resources. The experiment was a fixed effects, repeated measures (within-subjects) design. The four factors were clutter ‘clumpiness’, ρ; clutter background radius, $r_{\mathrm{clut}}$; target radius, $r_t$; and target contrast, c. There were 1152 treatments, presented to each of ten subjects in a different random order. These consisted of all the combinations of the four factor levels; i.e. 4 levels of ρ × 12 levels of $r_{\mathrm{clut}}$ × 6 levels of $r_t$ × 4 levels of c = 1152. The specific levels for each of the four factors are shown in Table 1. There was also an implicit fifth factor, target presence, with levels 0 or 1; i.e. for each stimulus containing a target, there was a stimulus with identical clutter properties and clutter radius, but no target. Thus the total number of stimuli presented in the experiments was 10 × 1152 × 2 = 23040. The performance measures intended for use in the analysis were response time ($t_r$) and the probability of a correct decision, which we call hit-rate², $p_{d_i}$, which was calculated by use of equation (1):

$$p_{d_i} = \frac{1}{N} \sum_{j} W_{ij}. \tag{1}$$

Table 1. Stimulus factors and levels

Factor                     Levels
Clutter radius (degrees)   1.4, 2.1, 2.8, 3.5, 4.2, 4.9, 5.6, 6.4, 7.0, 7.7, 8.4, 9.2
Target radius (degrees)    0.3, 0.5, 0.7, 0.9, 1.1, 1.3
Contrast                   2.0, 3.3, 4.7, 6.0
Clutter parameter (ρ)      −0.050, −0.300, −0.550, −0.925

Level values for the independent factors, which are the stimulus variables. The radii are given in degrees.
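The design arithmetic and the across-subject hit-rate of equation (1) can be checked with a short sketch. The factor levels are those of Table 1; the function name `hit_rate` and the example responses are illustrative assumptions and are not taken from the original experimental software.

```python
from itertools import product

# Factor levels taken from Table 1 (radii in degrees).
CLUTTER_RADII = [1.4, 2.1, 2.8, 3.5, 4.2, 4.9, 5.6, 6.4, 7.0, 7.7, 8.4, 9.2]
TARGET_RADII = [0.3, 0.5, 0.7, 0.9, 1.1, 1.3]
CONTRASTS = [2.0, 3.3, 4.7, 6.0]
RHO = [-0.050, -0.300, -0.550, -0.925]

N_SUBJECTS = 10

# Full factorial design: every combination of the four factors is one treatment.
treatments = list(product(RHO, CLUTTER_RADII, TARGET_RADII, CONTRASTS))
n_treatments = len(treatments)  # 4 * 12 * 6 * 4

# Each treatment has a matching target-absent stimulus (implicit fifth factor).
n_stimuli_per_subject = 2 * n_treatments
n_stimuli_total = N_SUBJECTS * n_stimuli_per_subject


def hit_rate(decisions):
    """Equation (1): mean of the binary W_ij over subjects for one treatment."""
    if not decisions:
        raise ValueError("need at least one subject response")
    if any(w not in (0, 1) for w in decisions):
        raise ValueError("responses must be coded 0 (incorrect) or 1 (correct)")
    return sum(decisions) / len(decisions)


if __name__ == "__main__":
    print(n_treatments, n_stimuli_per_subject, n_stimuli_total)
    # Hypothetical responses of the ten subjects to a single treatment.
    print(hit_rate([1, 1, 0, 1, 1, 1, 0, 1, 1, 1]))
```

Because each treatment yields only one binary response per subject, the hit-rate for a treatment can take only the values 0, 0.1, ..., 1.0 with ten subjects.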

In equation (1), $W_{ij}$ is 0 for an incorrect decision and 1 for a correct decision for treatment i and subject j, and N is the number of subjects. Therefore, the hit-rate measure was determined across subjects (based on the binary response of each subject per treatment), which gave a dependent measure suitable for ANOVA to compare the effects of the different factors. Though we collected confidence rating data (see Section 2.3), we did not perform Receiver Operating Characteristic analyses (this is addressed in Section 3.4).

2.2. The generation of the image stimuli

When the image presented contained a target, the visual stimulus reflected a unique treatment (or combination of factors). However, when the presented image did not contain a target, there was uniqueness only in the factor combinations of $r_{\mathrm{clut}}$ and ρ, since c and $r_t$ do not apply in these cases. The (Weber) contrast was defined as

$$c = \frac{\Delta L_t}{\mu_b} = \frac{\mu_t - \mu_b}{\mu_b}, \qquad \mu_t > \mu_b, \tag{2}$$

where $\mu_t$ is the mean target luminance, $\mu_b$ is the mean clutter background luminance and $\Delta L_t$ is the incremental increase in luminance of the target over the clutter background. The surround luminance was set to be equal to the average simulated background clutter luminance. The average luminance of each clutter background was set to 1.0 candela per square metre (cd/m²), which is about midway (mesopic vision) within the visual dynamic range of luminance (Hood and Finkelstein, 1986). (In order to set target contrast, the mean luminance of the clutter background was calculated ‘on the fly’ at each stimulus presentation.) This level of average luminance was chosen because:
(i) Pilot studies indicated that high levels of contrast (Weber contrast up to about 6) would be required. This placed limits on the maximum luminance available due to the constraint on the dynamic range of the display.
(ii) Targets ranged in size from intra-foveal to extra-foveal, therefore needing both rod and cone stimulation.
(iii) Similar luminance levels are used in practice in applied domains.

The images as shown in the figures are not photometrically correct, as they indicate grey-levels and not luminance. The clutter types are represented by the parameter ρ, which in effect determines the granularity or ‘clumpiness’ of the images.

2.2.1. Simulation of clutter

A compromise was required in constructing the visual stimuli. On the one hand, it was necessary that the background clutter should represent real clutter. On the other, it was necessary to have control over the image statistics. A resolution of these conflicting demands was achieved by
simulating natural vegetation based on the statistics of natural imagery, in particular by ensuring that the simulated and real images had a similar auto-covariance function (Bertilone et al., 1997, 1998). Bertilone et al. showed that visible imagery of Australian natural terrain possesses a Gaussian distribution of grey-levels over an ensemble of images. In light of work by Newsam and Woodruff (1991), a modification to the work of Bertilone et al. was made to produce a simulation of clutter imagery which had random Gaussian statistics, but was fractal in nature. The background clutter in the stimuli was derived from a fractal statistical process. This was done for two reasons:
(i) Natural scenes can be described well with a fractal model (Pentland, 1984; Knill et al., 1990; Rotman et al., 1994; van der Schaaf and van Hateren, 1996);
(ii) This procedure allowed control of the process by a single parameter. This in turn minimised the size of the experimental design.

2.2.2. Fractal simulated clutter algorithm

The following provides a brief background to the generation of the fractal image stimuli and the meaning of the ρ parameter. This also provides an introduction to the full mathematical derivation for the fractal image generation algorithm, which is given in Appendix A.

Consider an image from a family consisting of N images (an ensemble), which are isotropic³, stationary⁴, Gaussian random fields (GRF) (Yaglom, 1987), with $L_i(\tilde{x})$ denoting the luminance function of the ith image (i.e. $i \in \{1, 2, \ldots, N\}$) in the spatial region $R \equiv \{\tilde{x} = (x, y) : 0 \le x \le X,\ 0 \le y \le Y\}$. This family of images will form a realisation of a Gaussian random field if the N-sized set of luminance values $L_i(\tilde{x})$ for each point $\tilde{x}$ in the field is a sample from a Gaussian distribution with mean $\mu(\tilde{x})$ and variance $\sigma^2(\tilde{x})$. With this condition met, every pair of luminance values $L_i(\tilde{x})$ and $L_i(\tilde{y})$ at two different spatial locations $\tilde{x}$ and $\tilde{y}$ forms a bivariate Gaussian distribution⁵. A Gaussian random field image $L(\tilde{x})$ is completely characterised by its mean and its covariance function $C(\tilde{x}, \tilde{y})$, which is defined as

$$C(\tilde{x}, \tilde{y}) = \langle (L(\tilde{x}) - \mu(\tilde{x}))(L(\tilde{y}) - \mu(\tilde{y})) \rangle, \tag{3}$$

where $\langle \cdot \rangle$ is the expectation operator. Since the GRF is isotropic and stationary, the covariance depends only on the separation $r = |\tilde{x} - \tilde{y}|$; i.e. $C(\tilde{x}, \tilde{y}) = C(r)$. For the GRF to be fractal, i.e. invariant under scaling, it is required that

$$C(r) = k r^{2\rho}, \tag{4}$$

where $\rho = \ln t / \ln s$, with $s \neq 1$ a scaling factor on the region size, and t a scaling factor on the luminance values within the region (Newsam and Woodruff, 1991). Note that ρ must lie in the range −1 < ρ < 0, and is the single parameter which controls the ‘roughness’ of the texture of the fractal image. The fractal GRF thus defined, which has µ = 0 and infinite variance, can exist only notionally, but could be realised by convolving (i.e. smoothing) it with an appropriate function, such as
the system function of a display device. Therefore, once a fractal image is displayed, correlation lengths are imposed on the displayed image. The parameter ρ has a range of −1 to 0, with ρ near −1 indicating low correlation between pixels (i.e. the clutter approaches white noise in appearance), while a value for ρ near 0 indicates a high correlation over a longer range between the pixels (see Fig. 3 for a graphical illustration of this effect). See Appendix B for a discussion on the appropriate use of fractal images in visual experiments.

2.2.3. Pilot study on the perceptual scaling of simulated clutter

In order to test how the clutter parameter, ρ, is related to the subjective perception of clutter properties, a pilot study was carried out prior to the main experiment. A series of hard-copy images of the simulated clutter was produced, with ρ incremented from −0.5 to −0.95, in steps of −0.05. These were presented to seven subjects⁶, who were asked to order the images and then place them on a bench, such that the physical distance between the images represented the perceived distance between the ‘clumpiness’ or roughness of their texture. Each subject participated in isolation from the others, and the image stimuli set was shuffled anew for each subject. Of the seven subjects, two failed to order the images perfectly with respect to ρ. However, even these two subjects ordered the images in a manner that was correct overall, with ‘mistakes’ taking the form of the reversal of pairs of adjacent images (with respect to ρ).

Figure 2 depicts the scatter-plot of the normalised mean subjective (interval) rating versus ρ, with the linear regression line overlaid, for all the subjects. It is evident from Fig. 2 that the parameter ρ corresponds to a subjective linear mapping of the simulated clutter property of clumpiness, which seems to have been interpreted similarly by all subjects. This indicated that ρ was appropriate as a factor in the experimental design, particularly for analysis-of-variance, which is based on a linear model.

Figure 2. The subjective rating of clutter ‘clumpiness’ (R² = 0.996).
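To make the role of ρ in Section 2.2.2 concrete, the following is a minimal sketch of one common way to synthesise a Gaussian random field with power-law statistics. It is not the algorithm of Appendix A. It assumes that a two-dimensional covariance of the form of equation (4) corresponds to a power spectrum proportional to $f^{-(2+2\rho)}$, so that white Gaussian noise is shaped by an amplitude spectrum proportional to $f^{-(1+\rho)}$. The function name, the `rms_contrast` parameter and its default value are hypothetical; the finite image grid plays the part of the smoothing that keeps the variance finite.

```python
import numpy as np


def fractal_clutter(size, rho, mean_luminance=1.0, rms_contrast=0.3, seed=None):
    """Return a size x size Gaussian random field with a power-law spectrum.

    Assumption: the covariance C(r) ~ r**(2*rho), -1 < rho < 0, is
    approximated by filtering white Gaussian noise with an amplitude
    spectrum ~ f**-(1 + rho). rms_contrast is an illustrative parameter,
    not a value reported in the paper.
    """
    if not -1.0 < rho < 0.0:
        raise ValueError("rho must lie in the open interval (-1, 0)")
    rng = np.random.default_rng(seed)
    white = rng.standard_normal((size, size))

    # Radial spatial frequency of every FFT bin (cycles per pixel).
    fx = np.fft.fftfreq(size)
    f = np.hypot(fx[:, None], fx[None, :])
    f[0, 0] = 1.0                      # placeholder, avoids dividing by zero
    amplitude = f ** -(1.0 + rho)
    amplitude[0, 0] = 0.0              # drop the DC term: zero-mean field

    # Shape the noise spectrum; the filter is real and symmetric, so the
    # imaginary part of the result is only rounding error.
    field = np.fft.ifft2(np.fft.fft2(white) * amplitude).real
    field /= field.std()               # normalise to unit variance
    return mean_luminance * (1.0 + rms_contrast * field)


def lag1_correlation(image):
    """Correlation between horizontally adjacent pixels."""
    return np.corrcoef(image[:, :-1].ravel(), image[:, 1:].ravel())[0, 1]


if __name__ == "__main__":
    for rho in (-0.050, -0.300, -0.550, -0.925):
        img = fractal_clutter(128, rho, seed=0)
        print(rho, round(float(lag1_correlation(img)), 3))
```

Under these assumptions the adjacent-pixel correlation falls as ρ approaches −1, in line with the description of ρ near −1 as approaching white noise and ρ near 0 as giving long-range correlation.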

2.3. Experimental procedure

The subjects were all volunteers from within the age group 28–50 years, and had at least 6/6 vision. They were instructed as to how to perform the task by the following means: (i) written instructions detailing the use of the software and the experimental procedure; (ii) verbal instructions, conveying the same information and providing an opportunity for clarifying any unclear points; and (iii) a demonstration and trial period with a training version of the software that gave feedback, after each trial, on target presence and size.

The software presented the stimuli and logged response times in conjunction with the observer’s confidence rating for each trial. The observers were required to determine the presence or absence of targets that had been placed centrally into imagery; i.e. this was a detection task without a search component. Observers were then prompted to enter their confidence rating according to a 5-point scale. Each session was arranged for the same time every day for each observer and was limited to a maximum duration of one half-hour, preceded by at least 20 minutes of dark adaptation time. After an initial training session, each observer typically sat through 4–6 experimental sessions.

2.4. Apparatus

A personal computer ran experimental software specially written (in Borland C) to control the experiment. Images were presented on a photometrically calibrated Electrohome 1719X high quality monochrome television monitor from a Matrox PIP-1024 image digitising and display card. A hood was placed over the screen to constrain the viewing distance to 500 mm and block out ambient light, which was held to a constant low level as the experiment was performed in a light controlled room.
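Because the viewing distance was fixed at 500 mm, the angular radii of Table 1 map directly onto physical sizes on the screen. The sketch below shows this geometry, together with the milliradian-to-degree conversion used when quoting Overington's background sizes. The function names are illustrative assumptions; the paper does not report the on-screen sizes in millimetres.

```python
import math

VIEWING_DISTANCE_MM = 500.0  # fixed by the hood over the monitor


def angle_to_screen_mm(angle_deg, distance_mm=VIEWING_DISTANCE_MM):
    """On-screen radius that subtends angle_deg from the line of sight."""
    return distance_mm * math.tan(math.radians(angle_deg))


def mrad_to_deg(mrad):
    """Convert an angle in milliradians to degrees."""
    return math.degrees(mrad / 1000.0)


if __name__ == "__main__":
    for radius_deg in (0.3, 0.5, 0.7, 0.9, 1.1, 1.3):
        print(radius_deg, round(angle_to_screen_mm(radius_deg), 2))
    print(round(mrad_to_deg(6.0), 3))
```

For example, the smallest target radius of 0.3 degrees corresponds to roughly 2.6 mm on the screen at this viewing distance.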

3. RESULTS

The data were explored by graphical means, and analysed by a within-subjects ANOVA to ascertain statistical significance of the treatment effects. In some instances specific regression analyses were performed. The main result of interest was the effect of the clutter background radius, $r_{\mathrm{clut}}$, on the subject’s ability to detect targets. However, there were also other interesting and relevant effects that will be discussed and explored in differing degrees of detail.

Figure 3. This figure shows the autocorrelation function for each of the four values of ρ. The abscissa indicates the lag in pixels, while the ordinate is the amplitude of the autocorrelation function, which has dimensionless values between 0 and 1.

3.1. Main effects

The only performance measure used for the analysis was the hit-rate, $p_{d_i}$, which is defined by equation (1), since response time (time to detect), $t_r$, is not a particularly
valid measure of performance in this experiment. This is the case when the hit-rate varies significantly, as is the case here, since “Search time [detection time] may be a misleading performance measure to use for comparing conditions in a situation where a significant proportion of searches [detection attempts] does not result in target acquisition. If the number of valid responses under the various conditions differs, then mean search [detection] times for condition A and condition B will have been based on samples from two conditions that probably differ systematically in dimensions other than the treatment condition” (Woodruff and Newsam, 1994; bracketed comments added).

All the main effects for hit-rate were highly significant. There existed a strong relationship between all factors and hit-rate in all subjects, and this is summarised in Fig. 4.

Figure 4. The effects that the independent variables have directly on the hit-rate. (a) Effect of background radius, p < 0.001. (b) Effect of target radius, p < 0.001. (c) Effect of clutter parameter, p < 0.001. (d) Effect of target contrast, p < 0.001.

Figure 4 provides a graphical representation, including standard error bars, of the main effects for the independent variables. Figure 4a shows the main effect for hit-rate versus clutter radius, F(11, 99) = 15.21, p < 0.001. This curve is complex, but has a slight downward trend, with a possible levelling off after a clutter radius of about 5 degrees. Figure 4b shows the plot of hit-rate versus target radius. This graph indicates a strong effect, F(5, 45) = 98.74, p < 0.001, of target radius on hit-rate for targets with a radius less than about 0.8 degrees of angle, where the effect for these small targets appears to be linear. The effect of the clutter parameter on the hit-rate, F(3, 27) = 102.26, p < 0.001, is presented in Fig. 4c, with the four values of the clutter parameter (ρ). Perusal of
Fig. 4c shows that hit-rate drops for the intermediate values of ρ, particularly at the value −0.3. This is discussed in Section 3.2. Figure 4d shows the effect of target luminance contrast on hit-rate, F(3, 27) = 235.65, p < 0.001. As expected, there is an increase in hit-rate with increase in contrast, with the functional form of the relationship being almost linear. This indicates that, as intended, we are operating on, or near, the linear portion of the psychometric function.

A full analysis, using hit-rate as the dependent variable, was performed. All the effects, including main effects and interactions, were found to be highly significant statistically (p < 0.001). This implies that the situation is complex, and shows
that the four main factors all contribute to visual task performance; i.e. none of the factors is redundant.

3.2. Interactions

As this was a four factor full-factorial experiment, there were possible interactions up to 3rd order. The ANOVA showed that higher order interactions were indeed statistically significant, and these were explored graphically. However, interactions above 1st order were found to be too complex to interpret; i.e. the graphs were extremely convoluted, and required up to 5 dimensions to represent.

3.2.1. Target size and background size

Since Fig. 4b indicates a dichotomy of effects between small and large targets, further analyses-of-variance were performed for both small and large targets, with the threshold between them set at 0.8 degrees. These analyses are concerned with the interaction between target radius (small and large) and clutter radius, and included polynomial contrasts for trend analysis (Keppel, 1991). A graph of clutter radius versus hit-rate was produced, with separate plots for the targets that were classified as either small or large, and is shown in Fig. 5a.

Comparing the two analyses, we find that the main effect for target radius is significant for both small, F(2, 18) = 60.24, p < 0.001, and large, F(2, 18) = 5.00, p = 0.019, targets, but much less so for large targets. The linear trend of clutter radius is also significant for both small, F(1, 99) = 75.38, p < 0.001, and large, F(1, 99) = 11.58, p < 0.001, targets. However, a quadratic trend is evident for small targets, but not for large targets (F(1, 99) = 43.81, p < 0.001 and F(1, 99) = 0.02, p = 0.889, respectively). Cubic trends are not evident for either case, with F(1, 99) = 0.23, p = 0.631 and F(1, 99) = 0.01, p = 0.905 for small and large targets, respectively. However, the contribution to the mean square (variance) for ‘Deviations’ is significant: small, F(7, 99) = 8.41, p < 0.001; large, F(8, 99) = 10.18, p < 0.001. Therefore, although neither linear nor quadratic functions fully describe the plots, the trends are adequately described by these functions.

Figure 5 shows graphical representations of the interactions between clutter and target size. Because of considerable fluctuation in the data, as seen in the clutter radius main effect (Fig. 4a), the data plotted in Fig. 5 have been smoothed by an 11-point moving average filter⁷. The curves for the individual target sizes are included to show qualitative effects. However, as there were insufficient data in each target size category, these curves are not supported by statistical analysis. As can be seen in Fig. 5a, there is no trend in the hit-rate for large targets, while there is a definite smooth trend downward with clutter radius for small targets (which is supported by the ANOVA just mentioned). This trend appears to level off at about 3.5° to 5.5° of clutter radius. Though Fig. 5b can be discussed in a qualitative way only, it indicates that the effect of clutter radius on hit-rate depends on target size. The gradient of the curve and the spatial extent of the clutter which affects hit-rate (and thus the amplitude) appear to depend on target size. This dependence corresponds to the significant interaction found between target radius and clutter radius.

Figure 5. Interaction of clutter size and target size. (a) Small and large targets. (b) Individual small target sizes.

3.2.2. Target size and clutter parameter

It was shown earlier in this section that the clutter parameter ρ exhibited a main effect, as shown graphically in Fig. 4c. Since clutter must be referenced to a target, the interaction between the clutter parameter and target radius would be expected to show some interesting properties. These interactions are shown graphically in Fig. 6a. The curves in Fig. 6a exhibit the same shape as the ρ main effect curve in Fig. 4c, except for the smallest target radius of 0.3°. For this smallest target size tested, the range of ρ which produces a depressing effect on hit-rate seems to have been extended.

Figure 6. Clutter and contrast, and clutter and target size, interaction effects on hit-rate. (a) $r_t$ × ρ interaction line plot. (b) c × ρ interaction line plot.

3.2.3. Contrast and clutter parameter

The interactions of contrast and the clutter parameter are represented in Fig. 6b. The shapes of these curves are very similar to those for the target size × clutter parameter interaction, discussed in the last subsection. The implications of this are discussed in Section 4.

3.2.4. Contrast and target size

It is well known that the effects of contrast and target size on detection performance are inter-related; namely, for small targets,

$$c \cdot r_t^2 = k, \tag{5}$$

where k is a constant. This is a form of Ricco’s Law (Barlow, 1958), which is often stated as

$$\frac{\Delta L}{L} \propto a^{-1}, \tag{6}$$

where $\Delta L$ is the incremental increase in luminance at threshold of a disc of area a over the background luminance L. Since $c = \Delta L / L$ and $a = \pi r_t^2$ for a disc, equation (6) is equivalent to equation (5). Therefore, a significant interaction is quite expected. Figure 7 graphically illustrates these interaction effects. The line plots in Fig. 7a represent the effects of contrast on hit-rate for each target size. As expected, the hit-rate increases for increases in both contrast and target size, with target size ‘biasing’ the hit-rate-versus-contrast curves to different levels (on the psychometric function). Although the curve for the smallest target radius of 0.3° looks linear, these curves become more non-linear with increases in target size. These effects are discussed in Section 4. Figure 7b depicts a contour plot of hit-rate versus target radius and contrast. The plot is divided nicely into distinct constant hit-rate regions, bounded by hyperbola-like curves, which gives an excellent illustration of Ricco’s Law.

Figure 7. Target radius and contrast interaction effect on hit-rate. (a) Interaction expressed as line graphs. Each curve is a plot of hit-rate versus contrast for a given target angular radius. (b) Interaction expressed as a contour plot. Hit-rate (value shown on contours) is plotted simultaneously against contrast and target angular radius.

3.3. Confidence rating and performance

As indicated in Section 2, all subjects were required to record their confidence, $c_f$, in their decision as to whether they thought the stimulus in each trial contained a target. We thought it of interest to consider how the subject’s confidence in their decision as to the existence of a target related to the objective reality, as indicated by the hit-rate $p_d$. This relationship is plotted in Fig. 8, with the data represented as points. The straight line is the line of best fit (linear regression), and provides an excellent fit (R² = 0.955). The regression equation is $p_d = 0.24 c_f - 0.16$, or equivalently $c_f = 4.2 p_d + 0.67$. This indicated that, in this experiment, the subjective rating was a very good predictor of actual performance, although, as indicated by the regression equation, the subjects tended to under-estimate slightly their ability to detect targets.

3.4. Post-hoc signal detection analysis

This study was not designed to be analysed using signal detection theoretical (SDT) methods, such as ROC analysis (Green and Swets, 1966; Wickens, 2001), even though ratings were recorded. Nevertheless, we performed a post-hoc SDT analysis to check that observer decision bias had not unduly distorted our findings; i.e. that the observed effects were mainly due to observer perception and not observer bias. Ideally, when using an SDT approach, observers see the same number of stimuli with targets (signal) as without targets (noise), and many repetitions of each
stimulus. In our case each observer saw 1152 unique stimuli containing a target. However, the 1152 non-target stimuli contained 48 (4 levels of ρ × 12 levels of $r_{\mathrm{clut}}$) groups of 24 identical stimuli. This provided a basis to estimate the false alarm rate (FAR). The false alarm rates corresponding to these groups of 24 identical stimuli were averaged over participants, then combined to calculate the ‘guessing correction’ and $d' = Z(HR) - Z(FAR)$, with the (corresponding) hit-rate (HR) averaged over participants in the 1152 target conditions, and where $d'$ is a measure of the detectability (separation of signal and noise distributions) of targets in the stimuli.

Figure 8. Regression of actual performance (hit-rate) on subjective confidence rating (R² = 0.955).

The hit-rate was determined to be 0.738 and the false alarm rate was found to be 0.13. Assuming the Equal Variance Gaussian Model, we found $d' = 1.77$ and the criterion $\lambda = 1.14$. An estimate of observer bias can be obtained by finding the position of λ relative to $\lambda_c$, where $\lambda_c$ is the halfway point between the signal and noise distributions and is found from $\lambda_c = -\tfrac{1}{2}(Z[HR] + Z[FAR]) = 0.26$. The bias is $\lambda - \lambda_c = 0.88$. This indicates that observers tended to say ‘no’; i.e. they were conservative. To determine the effect of the false alarm rate on the hit-rate, we corrected the hit-rate by applying a guessing correction. Under an assumption of the High Threshold Model (Wickens, 2001), this is found from $(HR - FAR)/(1 - FAR)$ and was 0.69 (compared to 0.73). This analysis indicates that observers’ decision bias did not unduly corrupt our analysis using only the hit-rate as a measure of performance.
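The quantities used in this post-hoc analysis can be reproduced approximately from the rounded rates quoted above. The sketch below uses the standard equal-variance Gaussian relations and the high-threshold guessing correction; the function names are illustrative assumptions. Because the inputs are rounded, the outputs are close to, but not necessarily identical with, the reported values.

```python
from statistics import NormalDist

# Z(.) is the inverse of the standard normal cumulative distribution.
_Z = NormalDist().inv_cdf


def equal_variance_sdt(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    The criterion is measured from the mean of the noise distribution.
    """
    d_prime = _Z(hit_rate) - _Z(false_alarm_rate)
    criterion = -_Z(false_alarm_rate)
    return d_prime, criterion


def guessing_corrected_hit_rate(hit_rate, false_alarm_rate):
    """High-threshold correction: (HR - FAR) / (1 - FAR)."""
    return (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)


if __name__ == "__main__":
    hr, far = 0.738, 0.13  # rounded rates quoted in the text
    d_prime, criterion = equal_variance_sdt(hr, far)
    print(round(d_prime, 2), round(criterion, 2))
    print(round(guessing_corrected_hit_rate(hr, far), 2))
```

Both rates must lie strictly between 0 and 1, since the inverse normal transform is undefined at the end points.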

4. DISCUSSION

The graphs shown in Fig. 6a are most interesting. Why does the hit-rate drop for all target sizes for ρ = −0.3 to −0.55, with the maximum drop in hit-rate for a target radius of 0.3◦ (where the hit-rate is equally the minimum for ρ = −0.3 and ρ = −0.55)? It might be thought that the correlation length, imposed on the stimuli by a value for ρ = −0.3, was probably most similar to the range of target sizes used, thereby producing a confounding effect due to clutter granularity being about the same size as the targets. But consider Fig. 3, which shows the 2D profiles of the circularly symmetric correlation function (equation (4)) for the displayed images. If we compare the range of target radii of 4–18 pixels (which is the range of target sizes used in the experiment), with the full-width-half-maximum (FWHM) value for the correlation length, we would expect that the hit-rate would decrease for increasing (i.e. more positive) ρ, as the mean target radius approached the correlation length.

As is shown in Fig. 6a, this decrease is not observed, and therefore does not offer an explanation for the experimental curves. It seems more likely that, since the c × ρ interaction curves (Fig. 6b) produced a very similar set of curves to those for the rt × ρ interaction, this phenomenon must be related to a generalised perceived contrast or signal-to-noise ratio. This approach is explored in the remainder of this section. Consider an image (see Note 9), L(x, y), which is a GRF, as defined in Section 2.2, with covariance function

$$C(\alpha, \beta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [L(x + \alpha) - \mu(x)][L(y + \beta) - \mu(y)]\,dx\,dy, \tag{7}$$

where µ is a mean, and L(x, y) is assumed to be stationary and ergodic (Yaglom, 1987). This can be considered, without loss of generality, in one dimension, especially in our case with isotropic image statistics. If we also assume zero mean, then (7) becomes

$$C(\tau) = \int_{-\infty}^{\infty} L(x + \tau)L(x)\,dx. \tag{8}$$

This is in fact the auto-covariance function, since this describes the correlation between points within a single image. If L(x) is even (i.e. L(x) = L(−x)), then equation (8) can be written as:

$$f(\tau) = \int_{-\infty}^{\infty} L(\tau - x)L(x)\,dx, \tag{9}$$

which is the convolution integral (Gonzalez and Wintz, 1987). In general, convolution is defined as

$$f(x) * g(x) = \int_{-\infty}^{\infty} f(\tau - x)g(x)\,dx, \tag{10}$$

where ∗ is the convolution operator. This is often used in the engineering analysis of linear shift-invariant systems (Karbowiak, 1969a), as equation (10) represents a filtering operation. It will also be used in a model of the HVS to be discussed shortly. To facilitate further discussion, a simplified diagram, representing a one-dimensional mapping of the displayed image on the viewer’s retina, is shown in Fig. 9, where x is a unit-less distance variable, and the origin is at the centre of the visual field. For vision out to about 30° of periphery, the HVS has been modelled as a linear, shift-invariant system (Laming, 1986; Thibos, 1989), which is a useful model for development here. Thibos (1989) showed that the output from a foveal or near-foveal receptive field (Overington, 1982) is given by

$$r(u) = w(x) * L_r(x) = \int_{rf} w(u - x)L_r(x)\,dx, \tag{11}$$

Figure 9. Mapping of viewed image to retina in 1D, with x as a dimensionless measure of distance perpendicular to the incoming light.

where w(x) is a weighting function over the spatial response of the receptive field and Lr(x) = p(x) ∗ L(x) is the luminance function at the retina, where p(x) is the point-spread-function of the eye. Now consider the present experiment, with a displayed image consisting of a background clutter image L(x), characterised in general terms by the co-variance function defined in equation (7), and in particular by equation (4), which expresses the dependence of the co-variance function on the value of ρ, the clutter parameter. The target Lt(x) was a disc of radius rt produced by adding a luminance increment ΔLt(x) to the background; i.e. the input image to the eye was

$$L''(x) = L'(x) * f_\rho(x) + L_t(x), \quad \text{where} \quad L_t(x) = \begin{cases} 0 & \text{if } |x| > r_t, \\ \Delta L_t(x) & \text{if } |x| \le r_t, \end{cases} \tag{12}$$

where L′(x) ∗ fρ(x) = L(x) and fρ(x) is the filtering function defined by (9) and is applied to a conceptual, uncorrelated image L′(x), to produce the actual observed background image L(x). Combine (11) and (12) to obtain

$$r(u) = w(x) * p(x) * \left(L'(x) * f_\rho(x) + L_t(x)\right), \tag{13}$$

$$\phantom{r(u)} = w(x) * p(x) * L'(x) * f_\rho(x) + w(x) * p(x) * L_t(x), \tag{14}$$

since convolution is a linear operation. For a focussed eye and foveal viewing, the effects of fρ(x) should ‘swamp’ the effects of w(x) ∗ p(x). In any case, w(x) ∗ p(x) is relatively fixed, since viewing geometry and conditions are constant. Therefore,

$$r(u) \approx L(x) * f'_\rho(x) + L_t(x), \tag{15}$$

where f′ρ(x) is the equivalent function which, when convolved with a given L(x), yields the same effect as L′(x) ∗ fρ(x). Now, this is related to the effects represented in Fig. 6a. In the case where the clutter parameter ρ approaches −1, the clutter image becomes less correlated; i.e. approaches white noise. Then the auto-correlation function of the image approaches a delta-function (δ) (Karbowiak, 1969b), where the convolution of a δ-function with another function produces this same function; i.e.

$$\int_{-\infty}^{\infty} g(x)\delta(x - \tau)\,dx = g(\tau). \tag{16}$$

Figure 10. Perceived contrast at retina (1D). Here Lr(x) is the luminance function of the stimulus as ‘seen’ by the retina, and x is a dimensionless measure of distance perpendicular to the incoming light (see Fig. 9).
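The filtering argument above can be illustrated with a small discrete sketch. It assumes a sampled one-dimensional signal, a unit impulse as the discrete counterpart of the δ-function in equation (16), and a 9-tap box kernel as a purely hypothetical stand-in for a broad filtering function; none of these choices comes from the experiment itself.

```python
import random


def convolve(signal, kernel):
    """Full discrete convolution, the sampled analogue of equation (10)."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def autocovariance(signal, lag):
    """Sample auto-covariance at the given lag, after removing the mean."""
    n = len(signal)
    mean = sum(signal) / n
    return sum((signal[i] - mean) * (signal[i + lag] - mean)
               for i in range(n - lag)) / (n - lag)


def normalised_lag1(signal):
    """Lag-1 auto-covariance divided by the variance (lag 0)."""
    return autocovariance(signal, 1) / autocovariance(signal, 0)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
white = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Convolving with a unit impulse leaves the signal unchanged (cf. eq. (16)).
impulse_out = convolve(white, [1.0])

# Hypothetical broad kernel: neighbouring samples become strongly correlated.
box = [1.0 / 9.0] * 9
smooth = convolve(white, box)

print(impulse_out == white)
print(normalised_lag1(white), normalised_lag1(smooth))
```

In this sketch the white-noise input has a lag-1 correlation near zero, while the box-filtered output has a lag-1 correlation close to one, which is the sense in which the filtering function sets the correlation length of the background.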

Therefore, for the clutter parameter ρ = −0.925, (15) becomes

$$r(u) \approx L(x) + L_t(x), \tag{17}$$

thereby maintaining the input image signal-to-noise ratio or contrast; i.e. the perceived contrast is approximately the physical contrast. This situation is diagrammatically represented in Fig. 10a. At the other end of the scale, for ρ → 0, the function f′ρ(x) spreads out the energy in L(x), also producing a relatively high perceived contrast (Fig. 10c). However, for intermediate values of ρ, f′ρ(x) ∗ L(x) is less spread out and of higher amplitude, thereby reducing the perceived contrast between target and background (Fig. 10b). The region for full spatial integration of the retinal luminance function Lr(x) extends out (off axis) to about 0.5° of visual angle according to Barlow (1958) and Laming (1986); i.e. the region for which Ricco’s Law (6) is obeyed. Apparently, outside this region, a form of square law summation occurs (Laming, 1986), where, for stimuli persisting longer than 0.93 seconds, Ricco’s Law is modified to

$$\frac{\Delta L}{L} \propto a^{-1/4}. \tag{18}$$

This would explain the flattening of the curve (in Fig. 6a) for 0.3° radius targets, since up to 0.5°, the background luminance would contribute equally to r(u) for equivalent levels of the target luminance. However, targets larger than 0.5° radius have the background luminance contribution falling off according to equation (18). Therefore, the 0.3° target has a relatively lower perceived contrast and an associated loss in sensitivity to ρ value.

The model just discussed can be viewed from a spatial frequency point of view, since it is known that the convolution of two functions, say f(x) ∗ g(x), in the x domain is equivalent to F(ν) · G(ν) in the frequency (ν) domain, where F(ν) and G(ν) are the Fourier transforms of f(x) and g(x) respectively (Yaglom, 1987; Karbowiak, 1969). Also, the Fourier transform of the auto-covariance function C(x) is the spectral density S(ν), which is the equivalent spectral characterisation of the function. These apply equally in the general 2D case for images (Gonzalez and Wintz, 1987). This is an appropriate way to analyse vision, since it is well known that the HVS incorporates spatial frequency channels (Wilson, 1995). Now, the phenomena discussed are re-presented in the spectral context, although only briefly and qualitatively. If we consider Fig. 6a, as ρ goes from −1 to 0, the image is effectively low-pass filtered, with its content shifting towards lower frequencies. Consider now the clutter image with ρ = −0.925, where the image is dominated by high frequencies, due to sharp transitions between the relatively uncorrelated pixels. The insertion of a target introduces lower frequency components, cueing the HVS to the target area, though local high frequency (edge) effects probably localise the target (Burr et al., 1986), so that here the perceived contrast is high. At the other end of the range, where ρ is near 0, the background image is highly correlated, and therefore dominated by low frequencies. The insertion of a target, which is small compared to the correlation length, introduces relatively high spatial frequencies, again producing high perceived contrast.
However, at intermediate values of ρ, the frequency content of the background image must overlap the frequency range introduced by the targets, causing lower perceived contrast and resulting in lower hit-rates. Under this hypothesis, when applied to the main effect of clutter radius on hit-rate, we would expect to see little or no fall-off for ρ near −1 and the largest fall-off in hit-rate with ρ near 0. To test this, another plot was produced of hit-rate versus clutter radius, but with separate plots for each ρ value. This is shown in Fig. 11, where the data have been moving-average-filtered as before. This graph shows hit-rate to be independent of clutter radius for ρ = −0.925 and the greatest fall-off with ρ = −0.05, as expected. At the intermediate values of ρ, the situation is slightly more complex, but an intermediate fall-off in hit-rate with clutter radius is evident.

Figure 11. Clutter radius interaction with ρ value.

A plausible explanation can now be given as to why the hit-rate for larger targets is less affected by clutter radius than that for small targets. It seems that the size of the region of stimulus integration is determined by the correlation function of the clutter. However, given this, the perceived contrast is then determined by the target size. For targets smaller than the area for full integration (about 0.5°), part of the clutter stimulus also falls within this region. Therefore, the rate of fall-off in hit-rate for small targets, with increasing clutter radius, is greater compared to larger targets, which force the clutter background luminance into a region that is integrated at a lower rate. This, in turn, reduces the sensitivity of subjective hit-rate to clutter radius with larger targets.
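The two spectral facts used in this section (convolution in x corresponds to multiplication in ν, and the Fourier transform of the auto-covariance is the spectral density) can be checked numerically. The sketch below assumes short periodic sequences, so that circular convolution and the discrete Fourier transform apply, and zero-mean handling is ignored; the sequences `f` and `g` are arbitrary illustrative values, not experimental data.

```python
import cmath


def dft(x):
    """Direct O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def circular_convolve(f, g):
    """Periodic (circular) convolution of two equal-length sequences."""
    n = len(f)
    return [sum(f[(t - s) % n] * g[s] for s in range(n)) for t in range(n)]


def circular_autocorrelation(f):
    """Periodic analogue of equation (8): sum over t of f[t + lag] * f[t]."""
    n = len(f)
    return [sum(f[(t + lag) % n] * f[t] for t in range(n)) for lag in range(n)]


# Arbitrary illustrative sequences (hypothetical values)
f = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 0.0, 3.0]
g = [0.25, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]

F, G = dft(f), dft(g)

# Convolution theorem: DFT(f * g) equals F(nu) . G(nu)
lhs = dft(circular_convolve(f, g))
rhs = [a * b for a, b in zip(F, G)]
convolution_error = max(abs(a - b) for a, b in zip(lhs, rhs))

# Spectral density: DFT of the autocorrelation equals |F(nu)|^2 for real f
spectrum = dft(circular_autocorrelation(f))
spectrum_error = max(abs(s - abs(a) ** 2) for s, a in zip(spectrum, F))

print(convolution_error, spectrum_error)
```

Both errors are at the level of floating-point rounding, which is all the sketch is intended to show.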

5. CONCLUSIONS

It was found that the size of the local clutter region around a target has a strong effect on the probability-of-detection of that target and that this is affected by regions much larger than twice the target size, as routinely used in the literature for setting clutter metric regions of support. It was also discovered that this effect was much stronger for targets subtending less than 0.8 degrees of visual angle than for larger targets. In the case of the former, the fall-off in human visual performance with clutter region size was approximately quadratic, compared to a slight linear fall-off for larger targets.

The Weber contrast ratios required for subjects to detect the circularly symmetric, but complexly structured, targets and backgrounds described in this paper are considerably higher than those required for constant luminance circularly symmetric targets and backgrounds. For example, the highest contrast ratio used in this experiment was 6.0. This yielded an average hit-rate of about 90% over all treatments and subjects; i.e. saturation of the psychometric function was not achieved (on average). This contrast is high compared to the contrasts required for the detection of constant luminance (plain) circular targets and backgrounds, which were found typically to be less than 1.0 by Blackwell in his famous studies of 1946.

A simple model was presented to explain these phenomena. This model implies that the auto-covariance function characterising the clutter is the main determinant of the size of the region of local clutter, although its influence is reduced for larger targets. The large regions for stimulus integration assumed in this model are much larger than the areas for single receptive fields, but have been shown to exist by other researchers, as discussed in Section 4.

The work reported here considered only a narrow class of clutter and a simple type of target, and did not elucidate detailed stimulus interactions across multiple receptive fields. Further work needs to be done in order to understand the mechanisms involved in more general situations.

NOTES

1. Each pixel in the target region had a constant luminance increment applied to it.
2. Note, in signal detection theory hit-rate usually refers to proportion of targets detected in stimuli only containing targets.
3. The statistics are independent of direction; i.e. invariant under rotation.
4. The statistics are the same at all regions within the image; i.e. invariant under translation.
5. Where ỹ = (x, y), = x̃.
6. Only two of these subjects participated in the main experiment.
7. An N point moving average filter replaces the datum at its current output with the average of its N inputs. It then moves forward one place in the input data etc.
8. This hit-rate was calculated using only the 1152 target conditions, whereas an overall hit-rate of 0.80, found by using equation (1), included all correct decisions in the full data set of 2304 target and non-target stimuli.
9. The image is considered in the continuous domain, even though it is at this stage digital.
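The moving average filter described in Note 7 can be sketched as follows. This is a minimal illustration that assumes a sliding window of N consecutive samples with no padding at the ends of the data, so the output is shorter than the input by N − 1 points; the paper does not state how the ends were treated, and the function name `moving_average` is illustrative.

```python
def moving_average(data, n):
    """N-point moving average over consecutive windows of the input.

    Assumes no edge padding: the result has len(data) - n + 1 points.
    """
    if n < 1 or n > len(data):
        raise ValueError("window length must be between 1 and len(data)")
    window_sum = sum(data[:n])
    out = [window_sum / n]
    for i in range(n, len(data)):
        # Slide the window forward one place: add the new sample, drop the oldest.
        window_sum += data[i] - data[i - n]
        out.append(window_sum / n)
    return out


smoothed = moving_average([1, 2, 3, 4, 5], 3)
print(smoothed)  # [2.0, 3.0, 4.0]
```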

REFERENCES

Aviram, G. and Rotman, S. (2000a). Evaluating human detection performance of targets and false alarms, using a statistical texture image metric, Optical Engineering 39, 2285–2295.
Aviram, G. and Rotman, S. (2000b). Evaluation of human detection performance of targets embedded in natural and enhanced infrared images using image metrics, Optical Engineering 39, 885–896.
Barlow, H. (1958). Temporal and spatial summation in human vision at different background intensities, J. Physiol. 141, 337–350.
Bertilone, D. C., Caprari, R. S., Angeli, S. and Newsam, G. N. (1997). Spatial statistics of natural terrain imagery. I. Non-Gaussian IR backgrounds and long-range correlation, Applied Optics 36, 9167–9176.
Bertilone, D. C., Caprari, R. S., Chapple, P. B. and Angeli, S. (1998). Spatial statistics of natural terrain imagery. II. Oblique visible backgrounds and stochastic simulation, Optical Communication 150, 71–76.
Blackwell, R. H. (1946). Contrast thresholds of the human eye, J. Opt. Soc. Amer. 36, 624–643.
Burr, D. C., Morrone, M. C. and Ross, J. (1986). Local and global visual processing, Vision Research 26, 749–757.
Caelli, T. and Julesz, B. (1979). Psychophysical evidence for global processing in visual texture discrimination, J. Opt. Soc. Amer. 69, 675–678.
Chapple, P. (1997). Personal Communication.
Chen, L. and Tokuda, N. (2003). Robustness of regional matching scheme over global matching scheme, Artificial Intelligence 144, 213–232.
Doll, T. J., McWhorter, S. W. and Schmeider, D. E. (1993). Target and background characterization based on a simulation of human perception, in: Proceedings of the SPIE on Characterization, Propagation and Simulation of Sources and Backgrounds III, Vol. 1967, Watkins, W. R. (Ed.), pp. 432–454.
Ewing, G. J. and Woodruff, C. J. (1996). Comparison of JPEG and fractal based image compression on target acquisition by human observers, Optical Engineering 35, 284–288.
Gonzalez, R. and Wintz, P. (1987). In: Digital Image Processing, Chapter 7, pp. 336–340. Addison-Wesley.
Green, D. M. and Swets, J. A. (1966). Signal Detection Theory and Psychophysics. Peninsula Publishing, Los Altos, California, USA.
Hood, D. C. and Finkelstein, M. A. (1986). Sensitivity to light, in: Handbook of Perception and Human Performance, Vol. 1, 1st edn, Boff, L. K. and Thomas, J. (Eds.). John Wiley, New York.
Karbowiak, A. (1969a). In: Theory of Communication. Oliver and Boyd, Edinburgh.
Karbowiak, A. (1969b). In: Theory of Communication, p. 44. Oliver and Boyd, Edinburgh.
Keppel, G. (1991). In: Design and Analysis: A Researcher’s Handbook, Chapter 7, Analysis of trend. Prentice Hall.
Knill, D. C., Field, D. and Kersten, D. (1990). Human discrimination of fractal images, J. Opt. Soc. Amer. A 7, 1113–1123.
Laming, D. (1986). Sensory Analysis. Academic Press.
Newsam, G. and Woodruff, C. (1991). Fractal random fields and their use in vision experiments, in: DICTA-91 Digital Image Computing: Techniques and Applications, pp. 365–372. Australian Pattern Recognition Society.
Overington, I. (1976a). In: Vision and Acquisition, Chapter 8, Rudimentary search modelling, pp. 164–174. Pentech Press, London.
Overington, I. (1976b). In: Vision and Acquisition, Chapter 4, Basic thresholds for detection, p. 54. Pentech Press, London.
Overington, I. (1976c). In: Vision and Acquisition, Chapter 13, Background Structure, p. 236. Pentech Press, London.
Overington, I. (1982). Towards a complete model of photopic visual threshold, Optical Engineering 21, 2–13.
Pentland, A. P. (1984). Fractal-based image description of natural scenes, IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-6, 661–673.
Rotman, S., Tidhar, G. and Kowalczyk, M. (1994). Clutter metrics for target detection systems, IEEE Transactions on Aerospace and Electronic Systems 30, 81–90.
Thibos, L. (1989). Image processing by the human eye, in: Proceedings of SPIE on Visual Communications and Image Processing IV, Vol. 1199, Pearlman, W. A. (Ed.), pp. 1148–1153.
van der Schaaf, A. and van Hateren, J. (1996). Modelling the power spectra of natural images: Statistics and information, Vision Research 36, 2759–2770.
Wickens, T. D. (2001). Elementary Signal Detection Theory. Oxford University Press.
Wilson, H. R. (1995). Quantitative models for pattern detection and discrimination, in: Vision Models for Target Detection and Recognition, Peli, E. (Ed.), pp. 3–15. World Scientific.
Witus and Ellis (2003). Computational modeling of foveal target detection, Human Factors 45, 47–60.
Woodruff, C. J. and Newsam, G. N. (1994). Displaying undersampled imagery, Optical Engineering 33, 579–585.
Yaglom, A. (1987). Correlation Theory of Stationary and Related Random Functions, Springer Series in Statistics. Springer-Verlag, New York.

APPENDIX A. DERIVATION OF THE FRACTAL IMAGE SIMULATION ALGORITHM.

Consider a fractal Gaussian random field (as defined in Section 2.2.2) within a square domain h, defined by vertices at [(0,0),(0,1),(1,1),(1,0)], which is our pixel size. Since fractal GRF statistics are scale invariant, this does not lose any generality. We are given that

$$C(r) = kr^{2\delta}, \tag{19}$$

while in general

$$C_h(\mu, \nu) = \left\langle \int_0^1\!\!\int_0^1 L(x, y)\,dx\,dy \int_0^1\!\!\int_0^1 L(x' + \mu, y' + \nu)\,dx'\,dy' \right\rangle, \tag{20}$$

where $r = \sqrt{\mu^2 + \nu^2}$, $\langle \cdot \rangle$ is the expectation operator, $x' = x + \delta_x$ and $y' = y + \delta_y$. Substitute (19) into (20):

$$\Rightarrow C_h(\mu, \nu) = \int_0^1\!\!\int_0^1\!\!\int_0^1\!\!\int_0^1 k\left([x' + \mu - x]^2 + [y' + \nu - y]^2\right)^{\delta} dx\,dx'\,dy\,dy', \tag{21}$$

$$\Rightarrow C_h(\mu, \nu) = k\int_0^1\!\!\int_0^1 \left([r_x - \mu]^2 + [r_y - \nu]^2\right)^{\delta}(1 - |r_x|)(1 - |r_y|)\,dr_x\,dr_y, \tag{22}$$

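since

$$\int_0^1\!\!\int_0^1 f(x - x')\,dx\,dx' = \int_0^1 f(r_x)(1 - |r_x|)\,dr_x \quad \text{(Chapple, 1997)}. \tag{23}$$

It was shown by Chapple that (21) becomes

$$\begin{aligned} C_h(\mu, \nu) = {} & k\int_{-1}^{1}\left[\ln\left|r_x - \mu + \sqrt{(r_x - \mu)^2 + (r_y - \nu)^2}\right|\right]_{r_x=-1}^{1} dr_y \\ & - \frac{k/2}{1 + \delta}\int_{-1}^{1}\left[\left([r_x - \mu]^2 + [r_y - \nu]^2\right)^{1+\delta}\right]_{r_x=0}^{1}(1 - |r_y|)\,dr_y \\ & + \frac{k/2}{1 + \delta}\int_{-1}^{1}\left[\left([r_x - \mu]^2 + [r_y - \nu]^2\right)^{1+\delta}\right]_{r_x=-1}^{0}(1 - |r_y|)\,dr_y, \end{aligned} \tag{24}$$

and that the integrals can be evaluated from −1/2 to +1/2 since

$$\int_{-1/2}^{1/2}\!\!\int_{-1/2}^{1/2} f(x' - x)\,dx\,dx' = \int_{-1}^{1} f(u)(1 - |u|)\,du. \tag{25}$$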
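The change of variables in equation (25) can be checked numerically. The sketch below uses a simple midpoint rule and two test integrands: u², for which both sides equal 1/6 analytically, and a smooth power-law function that is a hypothetical stand-in rather than the covariance kernel of equation (19). The function names and grid sizes are illustrative assumptions.

```python
def pixel_double_integral(f, n=200):
    """Midpoint-rule estimate of the left-hand side of (25)."""
    h = 1.0 / n
    pts = [-0.5 + (i + 0.5) * h for i in range(n)]
    return sum(f(xp - x) for x in pts for xp in pts) * h * h


def weighted_single_integral(f, n=2000):
    """Midpoint-rule estimate of the right-hand side of (25)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * h
        total += f(u) * (1.0 - abs(u))  # triangular weight (1 - |u|)
    return total * h


def quadratic(u):
    return u * u


def power_law(u):
    # Hypothetical smooth integrand, chosen only for illustration.
    return (u * u + 0.3) ** 0.4


lhs_quadratic = pixel_double_integral(quadratic)
rhs_quadratic = weighted_single_integral(quadratic)
lhs_power = pixel_double_integral(power_law)
rhs_power = weighted_single_integral(power_law)

print(lhs_quadratic, rhs_quadratic)
print(lhs_power, rhs_power)
```

For both integrands the two estimates agree to within the discretisation error of the midpoint rule, which is consistent with reducing the double integral over the pixel to a single weighted integral.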

APPENDIX B. IMPROPER USE OF ‘FRACTAL’ STIMULI IN VISUAL EXPERIMENTS.

It has been suggested that the use of fractal Gaussian random fields (GRFs) (Yaglom, 1987) as stimuli, rather than standard GRFs, produces a simplification in the experimental setup (Newsam and Woodruff, 1991). Newsam and Woodruff argue that, for fractal imagery, the perceived image statistics should be independent of the viewing distance and thus pixel size; i.e. the perceived image should be scale invariant. In contrast, they show that, for a standard (non-fractal) GRF image, the perceived correlation between the pixels depends on the perceived pixel size (viewing distance). Unfortunately, as Newsam and Woodruff point out, a GRF characterised by equation (4) cannot be physically realised. Therefore, an algorithm was used to generate the fractal clutter images by integrating the conceptual fractal GRF over a defined region for each pixel of the resultant image (see Appendix A). It is to this region, which becomes a pixel in the produced digital image, that Newsam and Woodruff’s arguments apply, in the sense that, whatever the size of this region of integration, the resultant image statistics will remain constant. However, once the ‘fractal’ digital image has been displayed on a monitor, with its fixed pixel size, a correlation length is imposed and the image is no longer scale invariant. Therefore, a fractal GRF image, when used as a stimulus in a visual experiment, is scale invariant only at the stage where the realisation of a digital image is formed (by the fractal image generation algorithm), not at the display of that image. Nevertheless, in practice, the scale invariance problem was overcome by using a viewing hood to fix the viewing geometry of the observers.
