Proceedings of the 34th Chinese Control Conference July 28-30, 2015, Hangzhou, China
A Novel Target Detection Method Based on Visual Attention with CFAR

Yaojun Li1, Lizhen Wang2, Lei Yang1, Yong Wang1, Geng Wang3
1. Xi'an Electronic Engineering Research Institute, Xi'an, Shaanxi, 710100, China
2. Xi'an Leitong Technology Co., Ltd., Xi'an, Shaanxi, 710100, China
3. Research Institute 365, Northwestern Polytechnical University, Xi'an, Shaanxi, 710072, China
E-mail: [email protected]

Abstract: Based on visual attention theory and the local probability density function statistical feature, a novel target detection method with constant false alarm rate (CFAR) is proposed in this paper. The visual attention model mimics the effective and efficient visual system of primates in dealing with complex scenarios. The proposed target detection algorithm inherits the advantages of both the visual attention model and CFAR, and is applied to target detection in complex circumstances. The saliency map is calculated from the phase of the Fourier transform by applying adaptive Gaussian filters. In order to extract ground targets rapidly from CFAR detection images, the gradient feature is extracted to detect the visually salient area. The segmentation image for target detection is then obtained by the watershed transform method. Experimental results show that the adaptive Gaussian filter not only de-noises images effectively, but also preserves as much of the original information as possible. The proposed method is shown to be capable of detecting ground targets in complex scenarios. In addition, the calculation procedure of the proposed method is simple, which makes it suitable for engineering application.

Key Words: Visual Attention, Saliency Map, Target Detection, CFAR
1 INTRODUCTION
Radar is an effective instrument for obtaining remote sensing information and is widely used in civil remote sensing and military reconnaissance. Maneuvering targets, mainly vehicles, ships and airplanes, are the targets of interest in both domains. At the present stage, target detection is one of the hotspots, and one of the difficulties, of SAR image interpretation, and plays an important role in civil and military applications. Humans can effortlessly judge the importance of image regions and adaptively focus attention on the parts perceived as interesting, known as salient regions. Visual attention originates from visual uniqueness or unpredictability, and is often attributed to variations in image attributes such as color, gradient, edges, and boundaries. Being closely related to how we perceive and process visual stimuli, visual attention is investigated by multiple disciplines, including cognitive psychology [1, 2], neurobiology [3, 4], and computer vision [5, 6]. Theories of human attention hypothesize that the human vision system processes only parts of an image in detail while leaving the rest nearly unprocessed; other scholars have built on these results from visual psychology to compute and detect significant regions of images [7-12], as this allows preferential allocation of computational resources in subsequent image analysis and synthesis. The value of salient object detection methods lies in their wide range of applications, including object-of-interest image segmentation [8, 9], object recognition [10], adaptive image compression [11], content-aware image resizing [12-14], image retrieval [15], and so on. The constant false alarm rate (CFAR) algorithm [15-17] is a classical target detection algorithm for SAR images. CFAR is designed based on the fact that the radar cross
section (RCS) of the target is larger than that of the clutter. The CFAR algorithm performs efficiently against a single-clutter background; however, it produces many false alarms when the background is complex, with buildings, trees and shadows. It is important to note that since the detector is the first stage of SAR-ATR, its efficiency directly impacts the succeeding stages of the SAR-ATR processing chain. Detection algorithms for SAR images are generally categorized into three classes: single-feature-based, multi-feature-based and expert-system-oriented [18]. The last is the most sophisticated and uses a multistage artificial intelligence approach, while multi-feature-based methods use two or more features extracted from the input image. The first is the most common and most widely used in the literature, and CFAR is the most popular algorithm in this class. It bases the search for regions of interest (ROIs) on radar cross sections alone: it assumes that the background clutter can be roughly modeled by a certain probability distribution, and CFAR detection is performed after estimating the distribution parameters. The early one-parameter CFAR algorithm uses a single parameter to characterize the distribution model. The more realistic two-parameter CFAR uses two-parameter distribution models to characterize clutter, such as the Weibull distribution [19] and the K-distribution [20]. It is assumed that target pixels obey a certain distribution, and pixels in the reference window are used to estimate the parameters of the distribution model. The drawback of CFAR is obvious: as the size of the image and of the reference window increases, the execution time increases dramatically. This conflicts with the key requirement of a SAR-ATR system that its detector be computationally simple enough to operate in real time or near real time [18]. This paper presents a bottom-up visual attention model for ground target detection.
By building a scale-adaptive Gaussian filter saliency map, the salient region can be extracted and segmented, so that targets of interest are detected automatically. Experimental results verify the feasibility and effectiveness of the proposed visual attention model for ground target detection in real-world applications.
Based on a bottom-up model of visual attention, we investigate the multi-scale parameters in the frequency domain for extracting the visual salient map, propose an adaptive Gaussian filter model of visual attention, and apply it to ground target detection to improve the detection probability of CFAR.
2 VISUAL ATTENTION MODEL
2.1 Visual attention model

In the field of visual perception research, findings in cognitive psychology [1, 2] and neurobiology [3, 4] show that the human visual process combines two processes, bottom-up and top-down. A top-down process depends on image content and human cognition, while a bottom-up process is independent of image content and is driven instead by the visual contrast between elements in the image: the greater the contrast, the more likely the vision system is to be attracted. The set of such contrast elements makes up a special region, called the visually salient region. In order to extract the visually salient region, the saliency-region-based model of visual attention has been widely used in object attention detection [16].
2.2 Saliency map generation

In this paper, the adaptive-Gaussian-filter-based saliency map [21] is generated by calculating the phase of the Fourier transform (FT). The basic principle is as follows. A given image f(x, y) is first transformed to the frequency domain by the FT: f(x, y) → F(f)(u, v). Its amplitude spectrum is A(u, v) = |F(f)|, and its phase spectrum is P(u, v) = angle(F(f)). The log amplitude spectrum is then L(u, v) = log(A(u, v)), and the spectral residual is defined as

R(u, v) = L(u, v) − h_n * L(u, v)    (1)

Then, by the inverse Fourier transform, the saliency map is extracted as

S(x, y) = F^{−1}[exp(R(u, v) + i·P(u, v))]    (2)

In order to get a better visual saliency map, the ultimate expression of the visual saliency map is defined as

S(x, y) = g * |F^{−1}[exp(R(u, v) + i·P(u, v))]|²    (3)

where f(x, y) is the input image, A(u, v) is the amplitude spectrum obtained by the Fourier transform, P(u, v) is the Fourier transform phase spectrum, h_n and g are low-pass filters,
Fig. 1 Architecture of the visual attention model
Visual attention models usually consist of feature extraction, salient map creation and threshold segmentation; the entire process is shown in Fig. 1. First, the input image is re-sampled and low-level image features are extracted; these features are used to build a Gaussian pyramid, and local visual contrast is calculated by a center-surround operator. A high-contrast region represents a salient region that is likely to attract visual attention, and the salient regions at different scales are then normalized and fused. The fused image is the comprehensive visual saliency map. In target detection applications, the target of interest is generally captured by the salient map of visual attention; the salient map can therefore effectively narrow the search and improve real-time detection. The human visual attention model is the root of acquiring information about targets of interest from the outside world, and is consistent in principle with an automatic visual target detection system.
i is the imaginary unit, F^{−1} is the inverse Fourier transform, and S(x, y) is the saliency map function [23] obtained from the phase of the Fourier transform. Under the FT, the amplitude spectrum expresses the image as a sum of sinusoidal components, and the phase spectrum expresses the locations of these components. Restoring the image from the phase signal effectively filters out repeated image patterns, leaving the higher-contrast edges and irregular texture areas. These are the areas most likely to attract human visual attention, namely the salient regions; therefore, restoring the image from its phase information simulates the human visual saliency map. This selection mechanism can accurately focus on the salient regions of an image. Saliency maps extracted from the Fourier-transform representation are shown in Fig. 2; the salient regions of the image are exactly what draws our attention, just as in human vision.

Mean-level CFAR is suitable for a homogeneous clutter environment. The most popular ML-CFAR is cell-averaging CFAR (CA-CFAR), whose process diagram is shown in Fig. 3.

[Fig. 3 Process diagram of CA-CFAR: reference cells X1, ..., XN and Y1, ..., YN on either side of the detected cell, separated by protected cells; the two sums X and Y give Z = X + Y, the threshold S = T·Z, and a comparing unit outputs hypothesis H1 (target) or H0 (no target).]

In order to retain adequate image information, a scale-adaptive Gaussian filter g(x, y) is used to smooth image noise, where (x, y) are the spatial coordinates and σ is the scale parameter. The larger the value of σ, the stronger the smoothing effect and the better the de-noising, but edge-shift artifacts begin to appear; when σ is smaller, the band of the Gaussian filter becomes narrower and image edges are located with high accuracy, but the smoothing effect, and with it the de-noising capability, is weaker. The scale-adaptive Gaussian filter is defined as

g(x, y) = (1 / (2πσ²(x, y))) exp(−(x² + y²) / (2σ²(x, y))), with σ(x, y) = f(x, y) / M(x, y)    (4)

where M(x, y) is the local gray mean, f(x, y) is the gray value at the center of the current filter window, and the window size is m × n pixels. M(x, y) is calculated as the local gray mean according to equation (5); here the filter window size is selected as 3 × 3 pixels.

M(x, y) = (1 / (m·n)) · Σ_{i=x−(m−1)/2}^{x+(m−1)/2} Σ_{j=y−(n−1)/2}^{y+(n−1)/2} f(i, j)    (5)
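As a concrete illustration, the local mean of Eq. (5) and the per-pixel scale σ(x, y) = f(x, y)/M(x, y) of Eq. (4) can be sketched as below. This is only a sketch: SciPy's `uniform_filter` stands in for the m × n averaging window, and the division guard is an assumption added here, not part of the paper's formulation.

```python
# Local gray mean M(x, y) of Eq. (5) and the adaptive scale sigma(x, y)
# of Eq. (4). The 3x3 window follows the text; the epsilon guard against
# division by zero is an illustrative assumption.
import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_scale(f, m=3):
    M = uniform_filter(f.astype(float), size=m)  # Eq. (5): mean over m x m window
    return f / np.maximum(M, 1e-6)               # sigma(x, y) = f(x, y) / M(x, y)

f = np.full((5, 5), 4.0)
f[2, 2] = 8.0                  # one bright center pixel on a flat background
sigma = adaptive_scale(f)      # sigma > 1 at the bright pixel, 1 on flat areas
```

The bright pixel gets a larger σ (stronger smoothing where it stands out from its neighborhood), while flat regions keep σ = 1.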
The calculation process for scale parameter optimization is shown in Fig. 2: Fig. 2b shows the salient map generated with the adaptive-scale Gaussian filter, while Fig. 2c shows the salient map generated with a non-adaptive Gaussian filter. As the comparison in Fig. 2c shows, the scale-adaptive Gaussian filter generates valid extreme points while reducing their total number, which significantly improves the efficiency of the extraction algorithm and simultaneously enhances its robustness.
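The spectral-residual construction of Eqs. (1)-(3) can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the box filter standing in for h_n, the fixed smoothing scale used for g, and the filter sizes are all assumptions.

```python
# Spectral-residual saliency map, Eqs. (1)-(3) (after Hou & Zhang [23]).
# h_n is approximated by an n x n box filter on the log spectrum, and g by
# a fixed-scale Gaussian; both choices are illustrative assumptions.
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def saliency_map(img, n=3, sigma=2.5):
    F = np.fft.fft2(img.astype(float))
    A = np.abs(F)                          # amplitude spectrum A(u, v)
    P = np.angle(F)                        # phase spectrum P(u, v)
    L = np.log(A + 1e-12)                  # log amplitude L(u, v)
    R = L - uniform_filter(L, n)           # Eq. (1): R = L - h_n * L
    S = np.fft.ifft2(np.exp(R + 1j * P))   # Eq. (2): inverse FT of exp(R + iP)
    return gaussian_filter(np.abs(S) ** 2, sigma)  # Eq. (3): g * |.|^2

img = np.zeros((64, 64))
img[30:34, 30:34] = 1.0        # small bright "target" on a flat background
S = saliency_map(img)          # saliency concentrates around the target
```

The repeated (flat) background contributes little to the residual, so the saliency energy concentrates around the small irregular region, mirroring the behavior described above.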
CA-CFAR can be used along the range or Doppler dimension. When selecting reference cells, protected cells are needed to prevent the target itself from influencing the reference cells. With the detected cell at the center, the estimate Z of the clutter-plus-noise level around the target in the main lobe is obtained by averaging the levels of the reference cells on the two sides of the detected cell. Once Z is obtained, the threshold is determined by multiplying Z by T, where T is a parameter related to the desired detection performance. Comparing the level of the detected cell with the threshold, a target is declared if the cell level is higher. Using measured data, Fig. 4 shows the detection results of CA-CFAR in a homogeneous clutter environment.
[Fig. 2 The optimal map selected from the spectrum scale space: (a) CFAR detection image; (b) optimal saliency map; (c) saliency maps for different scale parameters]

3 CONSTANT FALSE ALARM RATE FOR TARGET DETECTION

Constant false alarm rate (CFAR) detection is used for target detection in a clutter environment. The algorithm keeps the false alarm rate constant by adaptively setting the detection threshold according to the clutter level of the reference cells.

[Fig. 4 CA-CFAR detection results, amplitude versus sample sequence with the CFAR threshold overlaid: (a) one target; (b) two targets]
The number of protected cells is 2, the number of reference cells is 32, and T is chosen to be 3. From Fig. 4a, we can see that CA-CFAR detects the target well with few false alarms.
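With these parameters, a CA-CFAR detector over a one-dimensional sample sequence can be sketched as follows. This is a sketch under stated assumptions: the even split of the 32 reference cells across the two sides, the guard-cell placement, and the exponential clutter model are all illustrative, not taken from the paper.

```python
# CA-CFAR over a 1-D sample sequence, using the parameters from the text:
# 32 reference cells, 2 protected (guard) cells per side (an assumed split),
# and T = 3. Z is the average of the reference cells on both sides.
import numpy as np

def ca_cfar(x, n_ref=32, n_guard=2, T=3.0):
    half = n_ref // 2
    det = np.zeros_like(x, dtype=bool)
    for i in range(half + n_guard, len(x) - half - n_guard):
        lead = x[i - n_guard - half : i - n_guard]          # X: leading cells
        lag = x[i + n_guard + 1 : i + n_guard + 1 + half]   # Y: lagging cells
        Z = (lead.sum() + lag.sum()) / n_ref                # average of X + Y
        det[i] = x[i] > T * Z                               # compare with T * Z
    return det

rng = np.random.default_rng(0)
x = rng.exponential(1.0, 400)   # homogeneous clutter (assumed exponential)
x[200] = 60.0                   # a strong target
hits = ca_cfar(x)               # the target cell exceeds T * Z and is detected
```

Because Z tracks the local clutter level, the threshold adapts automatically, which is what keeps the false alarm rate constant in homogeneous clutter.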
Fig. 4b shows the situation where two different targets appear in different range cells. Although the amplitude of target 2 is much lower than that of target 1, both targets are detected.
4 VISUAL SALIENCY MAP BASED IMAGE SEGMENTATION WITH CFAR
After obtaining the salient maps by the amplitude and phase of the Fourier transform and the scale-adaptive Gaussian filter, in order to extract ground targets we need to segment the image region to assist target detection. Feature extraction by calculating the gradient map is used to detect the salient region [24]. After extraction of the salient region, the watershed transformation method is used to segment the image for ground target detection. The gradient G(x, y) describes the change of the target gray value, namely

G(x, y) = sqrt( (f(x, y) * Gx)² + (f(x, y) * Gy)² )    (6)

where f(x, y) is the gray value and Gx, Gy are the Sobel edge operators used as gradient masks, with Gy = Gx'. First, the image is divided into several regions by the gradient map; then the watershed marking matrix Lrgb is generated from the gradient map. Finally, targets are detected by selecting a threshold for segmentation.

Fig. 5 shows the results of target detection based on the visual attention model with CFAR. Fig. 5a shows target detection based on CFAR in a complicated environment; some false targets are still detected by CFAR. Fig. 5b shows target detection based on the visual attention model with CFAR; as shown, most of the false targets have been removed. Fig. 5c shows the final detection results after image segmentation of the visual saliency map; the two real targets that we want to detect are reliably detected.

[Fig. 5 Target detection based on visual attention model with CFAR: (a) target detection based on CFAR; (b) target detection based on visual attention model with CFAR; (c) detection results after image segmentation from visual saliency map]
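The gradient map of Eq. (6) can be sketched with Sobel masks as below; the watershed marking and threshold-selection steps that follow it are not reproduced here.

```python
# Gradient magnitude of Eq. (6) with Sobel masks, Gy = Gx' (transpose).
# A sketch only: the mask normalization and boundary mode are assumptions.
import numpy as np
from scipy.ndimage import convolve

Gx = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
Gy = Gx.T                                   # Gy = Gx'

def gradient_map(f):
    fx = convolve(f.astype(float), Gx)      # f(x, y) * Gx
    fy = convolve(f.astype(float), Gy)      # f(x, y) * Gy
    return np.sqrt(fx ** 2 + fy ** 2)       # Eq. (6)

f = np.zeros((8, 8))
f[:, 4:] = 1.0                              # a vertical step edge
G = gradient_map(f)                         # large along the edge, zero elsewhere
```

The resulting G(x, y) ridges along region boundaries are what the watershed transform then uses to partition the saliency map into candidate target regions.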
[Fig. 6 Results of test one for ground target detection (Group 1#): (a) CFAR detection images; (b) saliency maps; (c) target segmentation maps]

[Fig. 7 Results of test two for ground target detection (Group 2#): (a) CFAR detection images; (b) saliency maps; (c) target segmentation maps]
5 EXPERIMENTAL RESULTS AND DISCUSSIONS
The experimental data come from a test record lasting 600 seconds, from which consecutive frames were extracted for the experiments; for convenience, these frames are referred to as the "test sequence". The experiments were carried out in MATLAB on a personal computer with a 2.0 GHz CPU and 1 GB of RAM running Windows 7. To verify the validity of the scale-adaptive-Gaussian-filter-based model of visual attention for ground target detection, two experiments were designed on aviation image sequences: the first under a relatively simple background, and the second under a somewhat more complicated background. Together, the two experiments verify the reliability and adaptability of the proposed algorithm for detecting ground targets in both cases.

The results of the first group of experiments, with a simple background, are shown in Fig. 6. As can be seen from Fig. 6b, cars on the road surface are detected by the visual attention model. The scale-adaptive Gaussian filter is used to calculate the salient map, removing the regular white reflective zone and leaving the high-contrast image edges caused by dramatic changes and irregular texture. Then, based on the visual map, the watershed method is used to segment the visual saliency map and finally extract the car regions, as shown in Fig. 6c.

The results of the second group of experiments, with a more complicated background, are shown in Fig. 7. As can be seen from Fig. 7b, cars in the forest are detected by the visual attention model. Again, the scale-adaptive Gaussian filter is used to calculate the salient map, removing the repeated trees and grass and leaving the high-contrast image edges caused by dramatic changes and irregular texture, even where the cars are partially covered by trees. Then, based on the visual map, the watershed method is used to segment the visual saliency map and finally extract the car regions, as shown in Fig. 7c.

In summary, the proposed method can effectively calculate the salient regions from a CFAR detection image without prior knowledge of a template or of the target's characteristics. Moreover, the method needs neither supervised nor unsupervised learning. It detects ground targets automatically, which meets the flexibility requirements of real-world platforms.
6 CONCLUSIONS
In this paper, a novel target detection algorithm with CFAR for radar systems is proposed by combining visual attention theory with target detection theory. The algorithm is appropriate for target detection in radar systems with complex backgrounds, and addresses the problem that a small target in strong clutter tends to be rejected as a false alarm by the traditional CFAR method. The algorithm integrates the visual attention model and the watershed segmentation method to detect regions of interest, without requiring a template or any training or learning on the targets, and thus overcomes the weaknesses of traditional methods based on pattern recognition. The target detection method based on the visual attention model mimics the biological visual system. The calculation of the proposed method is simple, which makes it suitable for engineering applications.
REFERENCES
[1] H. Teuber. Physiological psychology. Annual Review of Psychology, 1955, 6(1): 267-296.
[2] J. M. Wolfe and T. S. Horowitz. What attributes guide the deployment of visual attention and how do they do it? Nature Reviews Neuroscience, 2004, 5: 1-7.
[3] R. Desimone and J. Duncan. Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 1995, 18(1): 193-222.
[4] S. K. Mannan, C. Kennard, and M. Husain. The role of visual salience in directing eye movements in visual object agnosia. Current Biology, 2009, 19(6): 247-248.
[5] L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual attention for rapid scene analysis. IEEE TPAMI, 1998, 20(11): 1254-1259.
[6] R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk. Frequency-tuned salient region detection. In CVPR, 2009: 1597-1604.
[7] M.-M. Cheng, G.-X. Zhang, N. J. Mitra, X. Huang, and S.-M. Hu. Global contrast based salient region detection. IEEE CVPR, 2011: 409-416.
[8] J. Han, K. Ngan, M. Li, and H. Zhang. Unsupervised extraction of visual attention objects in color images. IEEE TCSVT, 2006, 16(1): 141-145.
[9] B. Ko and J. Nam. Object-of-interest image segmentation based on human attention and semantic region clustering. J Opt Soc Am A, 2006, 23(10): 2462.
[10] U. Rutishauser, D. Walther, C. Koch, and P. Perona. Is bottom-up attention useful for object recognition? CVPR, 2004, 2: 37-44.
[11] C. Christopoulos, A. Skodras, and T. Ebrahimi. The JPEG2000 still image coding system: an overview. IEEE Trans. on Consumer Electronics, 2002, 46(4): 1103-1127.
[12] Y.-F. Zhang, S.-M. Hu, and R. R. Martin. Shrinkability maps for content-aware video resizing. Comput. Graph. Forum, 2008, 27(7): 1797-1804.
[13] Y.-S. Wang, C.-L. Tai, and T.-Y. Lee. Optimized scale-and-stretch for image resizing. ACM Trans. Graph., 2008, 27(5): 1-8.
[14] G.-X. Zhang, M.-M. Cheng, S.-M. Hu, and R. R. Martin. A shape-preserving approach to image resizing. Comput. Graph. Forum, 2009, 28(7): 1897-1906.
[15] M. Di Bisceglie and C. Galdi. CFAR detection of extended objects in high-resolution SAR images. IEEE Trans. on Geoscience and Remote Sensing, 2005, 43(4): 833-843.
[16] C. J. Morgan, L. R. Moyer, and R. S. Wilson. Optimal radar threshold determination in Weibull clutter and Gaussian noise. IEEE Aerospace and Electronic Systems Magazine, 1996, 11(3): 41-43.
[17] G. Gao. A parzen-window-kernel-based CFAR algorithm for ship detection in SAR images. IEEE Geoscience and Remote Sensing Letters, 2011, 8(3): 557-561.
[18] K. El-Darymli, P. McGuire, D. Power, and C. Moloney. Target detection in synthetic aperture radar imagery: a state-of-the-art survey. J. Appl. Remote Sens., 2013, 7(1): 071598.
[19] M. Di Bisceglie and C. Galdi. CFAR detection of extended objects in high-resolution SAR images. IEEE Trans. Geosci. Remote Sens., 2005, 43(4): 833-843.
[20] S. Kuttikkad and R. Chellappa. Non-Gaussian CFAR techniques for target detection in high resolution SAR images. Proceedings of ICIP-94, IEEE International Conference on Image Processing, 1994.
[21] Ye Congying and Li Cuihua. Application of HIS based on visual attention model in ship detection. Journal of Xiamen University, 2005, 44(4): 484-488.
[22] Jin Wei, Zhang Jianqi, and Zhang Xiang. Method for IR target detection based on visual attention model. Infrared Technology, 2007, 12(12): 720-723.
[23] X. Hou and L. Zhang. Saliency detection: a spectral residual approach. IEEE CVPR, 2007: 1-8.
[24] Chen Shuo and Wu Chengdong. Rapid scene registration method based on visual saliency. Journal of Image and Graphics, 2011, 16(7): 1241-1247.