Airport runway detection in satellite images by Adaboost learning

Ugur Zongur*a,b, Ugur Halici**a, Orsan Aytekina, Ilkay Ulusoya

a Computer Vision and Intelligent Systems Research Laboratory, Electrical and Electronics Eng. Dept., Middle East Technical University, 06531, Ankara, Turkey
b Aselsan Inc., Radar, Electronic Warfare and Intelligence Systems Division, P.O. Box: 1, 06172, Yenimahalle, Ankara, Turkey

ABSTRACT

Advances in hardware and pattern recognition techniques, along with the widespread use of remote sensing satellites, have spurred the development of automatic target detection systems for satellite images. Automatic detection of airports is particularly important due to the strategic value of these targets. In this paper, a detection method using a segmentation process based on textural properties is proposed for airport runways, the most distinctive element of an airport. Several local textural features are extracted, including not only low-level features such as the mean and standard deviation of image intensity and gradient, but also Zernike moments, Circular-Mellin features, Haralick features, and features involving Gabor filters, wavelets and Fourier power spectrum analysis. Since the subset of these features that discriminates airport runways from other structures and landforms cannot be predicted trivially, the Adaboost learning algorithm is employed both for classification and, owing to its feature-selecting nature, for determining the feature subset. By means of the features chosen in this way, a coarse representation of possible runway locations is obtained. Promising experimental results are achieved and reported.

Keywords: Airport runway detection, satellite images, automatic target detection, textural features, Adaboost algorithm
1. INTRODUCTION

Airports are important structures from both economic and military perspectives. Economically, as fundamental cargo and passenger transportation hubs, airports serve to establish and maintain financial relationships with national and global businesses. Military airports, i.e. airbases, are also critical strategic targets, considering the importance of the aviation branch of a nation's defense forces. For these reasons, automatic detection of airports can provide vital data for various civil and military applications.

There exist a number of prior studies on the problem of airport detection, some of which employ a classification stage based on texture1-4 while others do not5-6. In Ref. 1, a method is proposed that first applies a texture-based pre-screening step to obtain Regions of Interest (ROIs) and subsequently performs elongated-rectangle detection on those ROIs to find runways; Kernel Matching Pursuits is used for classification in the segmentation step, and the method is intended to find airport runways in large optical images (approximately 6,500x7,500 pixels). In Ref. 2, a search for the elongated rectangles pertaining to runways is carried out first to generate runway hypotheses; then, contrary to Ref. 1, textural properties are fed to a Support Vector Machine classifier for hypothesis verification rather than for obtaining ROIs. In Ref. 3, a primitive segmentation based on image intensity thresholding is carried out before shape analysis. In Ref. 4, basic texture-related properties, such as homogeneity and average intensity, are used for hypothesis verification. Refs. 5 and 6 also aim to detect airports but employ no process involving textural features. Except for Ref. 4, no numerical experimental
*[email protected]; phone +90 312 592 2258; +90 312 354 5205; aselsan.com.tr
**[email protected]; phone +90 312 210 2333; fax +90 312 210 2304; metu.edu.tr
performance evaluation is provided in the studies mentioned up to this point; results are instead presented only for visual examination.

Since runways can be considered road segments, runway detection bears some relevance to road detection. The road detection algorithms encountered in the literature are either semi-automatic7,8, needing a seed provided through human interaction (i.e. a part of a road and generally the initial direction), or automatic9,10, operating without such interaction. Since runways lack the complex network structure of roads, applying semi-automatic algorithms to runway detection would obviously be infeasible. However, concepts involving textures and segmentation can be utilized. There are studies that use Haralick features9,11,12, which are textural features also employed in this study. Refs. 13 and 14 provide a good bibliography of previous work on road detection.

In this study, a method for airport runway detection using the Adaboost learning algorithm applied to textural features is proposed. Adaboost is a boosting algorithm that aims to improve the performance of a set of weak learning algorithms15. It selects the desired number of most beneficial weak learners from a provided weak learner set. When each weak learner depends on a single feature, selecting the most beneficial weak learners corresponds, in a sense, to selecting the most beneficial features. Since determining the features that can represent the genuine characteristics of runway texture is not a straightforward task, a strategy exploiting this property of Adaboost is adopted: as many features as could plausibly serve to discriminate runway textures are collected, and the Adaboost algorithm judges and decides which features to use based on their effectiveness.

As for the textural features employed in this study, basic features such as the mean and standard deviation of image intensity and gradient, Zernike moments16 and Circular-Mellin features17 have been used in airport detection previously1,2. Haralick features18, also utilized here, have been commonly used in road detection as mentioned before. Other prevalent textural features involving the Fourier power spectrum19-21, wavelets20,21 and Gabor filters20-23, which are employed in this study as well, had not been used for the airport detection problem before. Additionally, a set of features extracted from the HSV color space is employed.

The rest of the paper is organized as follows. In Section 2, the features employed in this study are explained in detail. In Section 3, the Adaboost learning algorithm used for detecting image blocks containing airport runways is explained. In Section 4, the data set, the performance criteria and the results obtained are given. Finally, Section 5 concludes the study.
2. FEATURES

In order to extract the features, the satellite images, Nx by Ny pixels in size, are first divided into non-overlapping image blocks of N by N pixels. In this study N is chosen to be 32, which is found appropriate for 1-meter-resolution images. Throughout the segmentation process, these blocks are considered the basic elements to be classified, and all feature extraction and classification operations are executed over these blocks. Each such block is represented as f(x,y) and sent to the feature extractor, where the feature vectors are obtained. The features used in this study are explained below.

2.1 Basic Features

The first four features, F1, F2, .., F4, are chosen from basic features:

Mean of Intensity Image:

F_1 = \frac{1}{N^2} \sum_x \sum_y f(x,y)    (1)

Standard Deviation of Intensity Image:

F_2 = \frac{1}{N^2} \sum_x \sum_y (f(x,y) - F_1)^2    (2)

Mean of Gradient Magnitude:

F_3 = \frac{1}{N^2} \sum_x \sum_y |\nabla f(x,y)|    (3)

Standard Deviation of Gradient Magnitude:

F_4 = \frac{1}{N^2} \sum_x \sum_y (|\nabla f(x,y)| - F_3)^2    (4)

where |\nabla f(x,y)| is the gradient magnitude estimate, calculated from

\nabla f(x,y) = f(x,y) - f(x-1,y) + j\,(f(x,y) - f(x,y-1))    (5)

The resulting feature vector of basic features is

F_{basic} = [\,F_1 \; F_2 \; F_3 \; F_4\,]    (6)
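As an illustration, the following numpy sketch computes F1-F4 for a single block. It is a minimal reading of Equations (1)-(6); the function name is hypothetical and periodic borders are assumed for the backward differences, since the paper does not specify border handling.

```python
import numpy as np

def basic_features(block):
    """F1-F4 of Eqs. (1)-(6); `basic_features` is a hypothetical helper name."""
    f = block.astype(float)
    F1 = f.mean()                      # Eq. (1): mean of intensity
    F2 = f.var()                       # Eq. (2): spread of intensity, as printed
    # Eq. (5): complex gradient from backward differences; np.roll wraps the
    # borders, a simplification not specified in the paper
    gx = f - np.roll(f, 1, axis=1)     # f(x,y) - f(x-1,y)
    gy = f - np.roll(f, 1, axis=0)     # f(x,y) - f(x,y-1)
    mag = np.abs(gx + 1j * gy)         # |grad f(x,y)|
    F3 = mag.mean()                    # Eq. (3)
    F4 = mag.var()                     # Eq. (4)
    return np.array([F1, F2, F3, F4])  # Eq. (6)
```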
2.2 Zernike Moments

Zernike moments are image moments used in rotation-invariant recognition of images. For the calculation of Zernike moments, the basis functions, the Zernike polynomials, are defined as16

V_{n,m}(x,y) = V_{n,m}(r,\theta) = R_{n,m}(r)\, e^{jm\theta}    (7)

where j denotes the imaginary unit, n is a non-negative integer, m is an integer such that |m| \le n and n - |m| is even, and finally r and θ are the magnitude and the angle of the vector from the origin to the point (x,y). In the same equation, R_{n,m}(r) denotes the radial polynomial, defined as

R_{n,m}(r) = \sum_{s=0}^{(n-|m|)/2} (-1)^s \, \frac{(n-s)!}{s!\left(\frac{n+|m|}{2}-s\right)!\left(\frac{n-|m|}{2}-s\right)!} \, r^{n-2s}    (8)

Given these polynomials, the Zernike moments in the discrete domain are expressed as

Z_{n,m}(f) = \frac{n+1}{\pi} \sum_x \sum_y f(x,y)\, V^{*}_{n,m}(r,\theta)    (9)

Here, * denotes the complex conjugate. Since f(x,y) is a real signal and R_{n,m}(r) = R_{n,-m}(r), the complex conjugate of Z_{n,m}(f) is equal to Z_{n,-m}(f). For the computation, the origin is taken to be the center of the image f, and pixel coordinates are mapped into the range of the unit circle; the pixels outside this range are omitted. Zernike moments are complex numbers: rotating the image f shifts the phase of these numbers while their magnitude remains the same, as shown in Ref. 16. Therefore, rotation-invariant features can be obtained by utilizing the magnitudes of the Zernike moments. Zernike moments of order 0 to 4, namely Z_{0,0}, Z_{1,1}, Z_{2,0}, Z_{2,2}, Z_{3,1}, Z_{3,3}, Z_{4,0}, Z_{4,2} and Z_{4,4}, are considered in this study:

F_{zernike} = [\,F_5 \; F_6 \; .. \; F_{13}\,] = [\,Z_{0,0}(f) \; Z_{1,1}(f) \; .. \; Z_{4,4}(f)\,]    (10)
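The moments of Equations (7)-(9) can be sketched as follows; `radial_poly` and `zernike_moment` are hypothetical helper names, and the unit-circle mapping follows the description above.

```python
import numpy as np
from math import factorial

def radial_poly(n, m, r):
    """R_{n,m}(r) of Eq. (8)."""
    m = abs(m)
    R = np.zeros_like(r)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s) * factorial((n + m) // 2 - s)
                * factorial((n - m) // 2 - s)))
        R += c * r ** (n - 2 * s)
    return R

def zernike_moment(block, n, m):
    """|Z_{n,m}(f)| of Eq. (9), magnitude taken for rotation invariance."""
    N = block.shape[0]
    # map pixel coordinates into the unit circle centered on the block
    xs = (2 * np.arange(N) - N + 1) / (N - 1)
    X, Y = np.meshgrid(xs, xs)
    r = np.hypot(X, Y)
    theta = np.arctan2(Y, X)
    mask = r <= 1.0                    # omit pixels outside the unit circle
    V = radial_poly(n, m, r) * np.exp(1j * m * theta)
    Z = (n + 1) / np.pi * np.sum(block.astype(float)[mask] * np.conj(V[mask]))
    return np.abs(Z)
```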
2.3 Circular-Mellin Features

Circular-Mellin features are orthogonal texture features that are orientation and scale invariant17. Rotation and scale invariance is obtained by the polar-log coordinate transformation. The Cartesian to polar-log coordinate conversion is as follows, where (x,y) are the coordinates of a point in the Cartesian coordinate system and (λ,θ) in the polar-log coordinate system:

e^{\lambda} = r = \sqrt{x^2 + y^2}    (11)

\theta = \arctan(y, x)    (12)

The correlation response of an image f(λ,θ) with a filter h(λ,θ) is given as

\int_{\lambda} \int_{\theta} f(\lambda,\theta)\, h^{*}(\lambda,\theta)\, e^{2\lambda} \, d\lambda \, d\theta    (13)

If the filter h is in the form given in Equation (14), it is shown in Ref. 17 that the magnitude of the correlator output is rotation invariant and the ratios between magnitudes are scale invariant:

h_{p,q}(\lambda,\theta) = e^{-\lambda}\, e^{j2\pi p\lambda}\, e^{jq\theta}    (14)

With the above filter, the correlator output becomes

C_{p,q}(f) = \int_{\lambda} \int_{\theta} f(\lambda,\theta)\, h^{*}_{p,q}(\lambda,\theta)\, e^{2\lambda} \, d\lambda \, d\theta    (15)

Since all the images we worked on are at approximately the same resolution, only rotation invariance is considered and the magnitudes of these correlator outputs, |C_{p,q}(f)|, are used as features:

F_{circular-mellin} = [\,F_{14} \; F_{15} \; .. \; F_{23}\,] = [\,|C_{1,1}(f)| \; |C_{1,2}(f)| \; .. \; |C_{2,5}(f)|\,]    (16)
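A rough numerical reading of Equations (13)-(15) is sketched below: resample the block onto a polar-log grid and evaluate the correlator as a discrete sum. The grid sizes `n_lam` and `n_theta` and the bilinear resampling are assumptions not given in the paper.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def circular_mellin(block, p, q, n_lam=32, n_theta=64):
    """|C_{p,q}(f)| of Eq. (15), approximated on a polar-log grid."""
    N = block.shape[0]
    cy = cx = (N - 1) / 2.0
    r_max = (N - 1) / 2.0
    # polar-log grid: lambda = log r (starting at r = 1 pixel to avoid log 0)
    lam = np.linspace(0.0, np.log(r_max), n_lam)
    theta = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    L, T = np.meshgrid(lam, theta, indexing='ij')
    r = np.exp(L)
    ys = cy + r * np.sin(T)
    xs = cx + r * np.cos(T)
    f = map_coordinates(block.astype(float), [ys, xs], order=1)  # bilinear
    # Eq. (14): h_{p,q} = e^{-lambda} e^{j 2 pi p lambda} e^{j q theta}
    h = np.exp(-L) * np.exp(2j * np.pi * p * L) * np.exp(1j * q * T)
    dlam = lam[1] - lam[0]
    dth = theta[1] - theta[0]
    C = np.sum(f * np.conj(h) * np.exp(2 * L)) * dlam * dth      # Eq. (15)
    return np.abs(C)
```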
2.4 Fourier Power Spectrum

Fourier analysis provides the mathematical background for frequency-based analysis of signals. Let f(x,y) be the signal representation of an image of size NxN. The discrete Fourier transform of f(x,y) is calculated as

F(u,v) = \frac{1}{N^2} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-2\pi j \left(\frac{ux}{N} + \frac{vy}{N}\right)}    (17)

The Fourier power spectrum is calculated as |F(u,v)|^2 = F(u,v)\,F^{*}(u,v), where * denotes the complex conjugate. It should be noted that coarse textures have high power spectrum values near the origin, whereas finer textures have a more spread-out power spectrum24. Analyzing the power spectrum in ring-shaped regions yields rotational invariance19,20:

\phi_{r_1,r_2}(f) = \int_{0}^{2\pi} \int_{r_1}^{r_2} |F(r,\theta)|^2 \, dr \, d\theta    (18)

The ring-shaped regions are approximated by their discrete counterparts, and six ring-shaped regions with equally spaced radii over the frequency domain are used as features F24, F25, .. F29. Additionally, the four statistical features given in Equations (19)-(22) are used as features F30, F31, .., F33.

Maximum Magnitude:

F_{30} = \max\{\, |F(u,v)| : (u,v) \ne (0,0) \,\}    (19)

Average Magnitude:

F_{31} = \mu_F = \frac{1}{N_F} \sum_u \sum_v |F(u,v)|    (20)

Energy of Magnitude:

F_{32} = \sum_u \sum_v \left( |F(u,v)| \right)^2    (21)

Variance of Magnitude:

F_{33} = \frac{1}{N_F} \sum_u \sum_v \left( |F(u,v)| - \mu_F \right)^2    (22)

where N_F is the number of frequency components. The resulting feature vector related to the Fourier power spectrum is

F_{fourier} = [\,F_{24} \; F_{25} \; .. \; F_{33}\,]    (23)
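A sketch of the ring and statistical features of Equations (17)-(22) follows. The exact ring radii used in the paper are not specified, so equally spaced radii up to N/2 are assumed.

```python
import numpy as np

def fourier_features(block, n_rings=6):
    """Ring energies F24-F29 (Eq. (18)) and the statistics F30-F33 (Eqs. (19)-(22))."""
    N = block.shape[0]
    F = np.fft.fftshift(np.fft.fft2(block) / N**2)   # Eq. (17), DC moved to center
    mag = np.abs(F)
    power = mag ** 2
    u = np.arange(N) - N // 2
    U, V = np.meshgrid(u, u, indexing='ij')
    r = np.hypot(U, V)
    edges = np.linspace(0, N // 2, n_rings + 1)      # assumed equally spaced radii
    rings = [power[(r >= edges[i]) & (r < edges[i + 1])].sum()
             for i in range(n_rings)]                # discrete version of Eq. (18)
    dc = (U == 0) & (V == 0)
    F30 = mag[~dc].max()                             # Eq. (19): max, excluding DC
    F31 = mag.mean()                                 # Eq. (20): average magnitude
    F32 = (mag ** 2).sum()                           # Eq. (21): energy of magnitude
    F33 = mag.var()                                  # Eq. (22): variance of magnitude
    return np.array(rings + [F30, F31, F32, F33])
```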
2.5 Gabor Filters

Gabor filters are linear filters composed of one harmonic and one Gaussian function. These filters have optimal localization in both the spatial and the frequency domain, minimizing the joint uncertainty in the two domains, which is an attractive property22. An interesting fact about the relationship between Gabor filters and human perception is that the characteristics of cortical cells in the mammalian visual cortex can be approximated by Gabor filters23. A two-dimensional Gabor function is given below:

g(x,y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left[ -\frac{1}{2}\left( \frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} \right) + 2\pi jWx \right]    (24)

The Fourier transform of the same function is

G(u,v) = \exp\left[ -\frac{1}{2}\left( \frac{(u-W)^2}{\sigma_u^2} + \frac{v^2}{\sigma_v^2} \right) \right]    (25)

In these equations, W controls the modulation, and the variances satisfy \sigma_x = \frac{1}{2\pi\sigma_u} and \sigma_y = \frac{1}{2\pi\sigma_v}.

Gabor functions form a complete but nonorthogonal set of basis functions. In order to reduce the redundancy caused by nonorthogonality, these basis functions are scaled and rotated through the generating function given in Equation (26) to obtain a Gabor filter dictionary:

g_{k,s}(x,y) = a^{-s}\, g(x', y')    (26)

where x' = a^{-s}(x\cos\theta + y\sin\theta), y' = a^{-s}(-x\sin\theta + y\cos\theta) and \theta = \frac{k\pi}{K}. In these equations 0 \le s \le S-1 and 0 \le k \le K-1, where S and K are the total numbers of scales and orientations respectively. The real part of the generating function, i.e. Re\{g_{k,s}(x,y)\}, is used as the filter. The filter dictionary is designed with the typical parameters K=6 orientations and S=4 scales. The other parameters are chosen according to Ref. 23 and given in Equations (27)-(29), where the lower and upper center frequencies are selected as Ul=0.06 and Uh=0.5:

W = U_h    (27)

a = \left( \frac{U_h}{U_l} \right)^{\frac{1}{S-1}}, \quad \sigma_u = \frac{(a-1)\,U_h}{(a+1)\sqrt{2\ln 2}}    (28)

\sigma_v = \tan\left( \frac{\pi}{2K} \right) \left[ U_h - \frac{\sigma_u^2}{U_h}\, 2\ln 2 \right] \Bigg/ \sqrt{2\ln 2 - \frac{(2\ln 2)^2 \sigma_u^2}{U_h^2}}    (29)

Gabor filters characteristically possess directionality, which is not a desired property for the runway detection problem. To overcome this difficulty and make the Gabor filter outputs approximately rotation invariant, a simple method proposed in Ref. 21 is adopted. According to this method, the vector given in Equation (30) is circularly shifted so that the scale-orientation pair having the maximum mean is located at the beginning of the vector. This method is shown to be effective by the experimental results in the same study.

F_{gabor} = [\,F_{34} \; F_{35} \; .. \; F_{81}\,] = CircShift\left( [\, \mu_{G_{1,1}(f)} \; \sigma_{G_{1,1}(f)} \; \mu_{G_{1,2}(f)} \; \sigma_{G_{1,2}(f)} \; .. \; \mu_{G_{6,4}(f)} \; \sigma_{G_{6,4}(f)} \,] \right)    (30)

where G_{k,s}(f) denotes the signal f filtered by the Gabor filter g_{k,s}, and \mu_G and \sigma_G denote the mean and standard deviation of G respectively.
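The filter bank of Equations (24)-(29) and the circular shift of Equation (30) might be assembled as below. The spatial filter support and the use of response magnitudes for μG and σG are assumptions; the paper states neither.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_bank(K=6, S=4, Ul=0.06, Uh=0.5, size=31):
    """Real parts of the dictionary g_{k,s} of Eq. (26), parameters per Eqs. (27)-(29)."""
    W = Uh                                              # Eq. (27)
    a = (Uh / Ul) ** (1.0 / (S - 1))                    # Eq. (28)
    su = (a - 1) * Uh / ((a + 1) * np.sqrt(2 * np.log(2)))
    sv = (np.tan(np.pi / (2 * K)) * (Uh - (su**2 / Uh) * 2 * np.log(2))
          / np.sqrt(2 * np.log(2) - (2 * np.log(2))**2 * su**2 / Uh**2))  # Eq. (29)
    sx, sy = 1 / (2 * np.pi * su), 1 / (2 * np.pi * sv)
    x = np.arange(-(size // 2), size // 2 + 1)
    X, Y = np.meshgrid(x, x)
    filters = []
    for k in range(K):                                  # orientation-major, as Eq. (30)
        th = k * np.pi / K
        for s in range(S):
            Xr = a**-s * (X * np.cos(th) + Y * np.sin(th))
            Yr = a**-s * (-X * np.sin(th) + Y * np.cos(th))
            g = (a**-s / (2 * np.pi * sx * sy)
                 * np.exp(-0.5 * (Xr**2 / sx**2 + Yr**2 / sy**2))
                 * np.cos(2 * np.pi * W * Xr))          # Re{g_{k,s}}, Eq. (24)
            filters.append(g)
    return filters

def gabor_features(block, filters):
    """Mean/std response pairs, circularly shifted as in Eq. (30)."""
    stats = []
    for g in filters:
        # magnitude of the response: one common convention, assumed here
        resp = np.abs(fftconvolve(block.astype(float), g, mode='same'))
        stats.append((resp.mean(), resp.std()))
    means = [m for m, _ in stats]
    shift = int(np.argmax(means))                       # pair with maximum mean first
    order = list(range(shift, len(stats))) + list(range(shift))
    return np.array([v for i in order for v in stats[i]])
```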
2.6 Haralick Features

The features explained in this section were proposed in Ref. 18. These features are very popular for the textural analysis of remotely sensed images and are derived from the Gray-Level Co-occurrence Matrix (GLCM), which indicates the distribution of co-occurring gray values for a given offset. The GLCM is Ng by Ng in size, where Ng is the number of gray levels, and is calculated as follows:

GLCM_{\Delta x,\Delta y}(\alpha,\beta) = \sum_x \sum_y \begin{cases} 1 & \text{if } f(x,y)=\alpha \text{ and } f(x+\Delta x, y+\Delta y)=\beta \\ 0 & \text{otherwise} \end{cases}    (31)

In this equation, f(x,y) is the image function, (Δx,Δy) is the offset vector, and α and β are the row and column indices. The GLCM is not very useful in its raw form; additional operations are required to extract the information contained in this matrix. These operations are called Haralick features, and typically a subset of them is employed after obtaining the GLCM matrices. In this study, energy, contrast, homogeneity and correlation are used:

Energy (Angular Second Moment):

\zeta_{eng,\Delta x,\Delta y} = \sum_{\alpha} \sum_{\beta} \left\{ GLCM_{\Delta x,\Delta y}(\alpha,\beta) \right\}^2    (32)

Contrast:

\zeta_{con,\Delta x,\Delta y} = \sum_{\alpha} \sum_{\beta} |\alpha-\beta|^2 \, GLCM_{\Delta x,\Delta y}(\alpha,\beta)    (33)

Homogeneity (Inverse Difference Moment):

\zeta_{homo,\Delta x,\Delta y} = \sum_{\alpha} \sum_{\beta} \frac{GLCM_{\Delta x,\Delta y}(\alpha,\beta)}{1 + |\alpha-\beta|}    (34)

Correlation25:

\zeta_{corr,\Delta x,\Delta y} = \sum_{\alpha} \sum_{\beta} \frac{(\alpha-\mu_\alpha)(\beta-\mu_\beta)\, GLCM_{\Delta x,\Delta y}(\alpha,\beta)}{\sigma_\alpha \sigma_\beta}    (35)

where \mu_\alpha, \mu_\beta and \sigma_\alpha, \sigma_\beta are the means and standard deviations of the marginal-probability matrices MPM_{\Delta x,\Delta y,\alpha}(\alpha) = \frac{1}{N_g} \sum_{\beta} GLCM_{\Delta x,\Delta y}(\alpha,\beta) and MPM_{\Delta x,\Delta y,\beta}(\beta) = \frac{1}{N_g} \sum_{\alpha} GLCM_{\Delta x,\Delta y}(\alpha,\beta) respectively.

The performance of GLCM features has been demonstrated to be noteworthy, considering their relative simplicity, low extraction cost and compact feature vectors20. When no a priori information is available, it is common to use the offset vectors (Δx,Δy) = (1,0), (1,-1), (0,-1) and (-1,-1), which correspond to adjacent pixels at 0°, 45°, 90° and 135° respectively. These offsets are therefore utilized in this study, yielding the feature vector

F_{haralick} = [\,F_{82} \; F_{83} \; .. \; F_{97}\,] = [\,\zeta_{eng,1,0} \; \zeta_{con,1,0} \; \zeta_{homo,1,0} \; \zeta_{corr,1,0} \; .. \; \zeta_{homo,-1,-1} \; \zeta_{corr,-1,-1}\,]    (36)
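A direct implementation of Equations (31)-(35) for one offset is sketched below. 8-bit input quantized to `levels` gray values is assumed, and standard probability marginals are used in place of the paper's 1/Ng-scaled MPM matrices.

```python
import numpy as np

def glcm_features(block, dx, dy, levels=32):
    """Eq. (31) co-occurrence counts and the statistics of Eqs. (32)-(35)."""
    q = block.astype(int) * levels // 256          # assumes 8-bit input
    H, W = q.shape
    glcm = np.zeros((levels, levels))
    for y in range(max(0, -dy), min(H, H - dy)):   # valid pixel pairs only
        for x in range(max(0, -dx), min(W, W - dx)):
            glcm[q[y, x], q[y + dy, x + dx]] += 1  # Eq. (31)
    glcm /= glcm.sum()                             # normalize to probabilities
    a = np.arange(levels)
    A, B = np.meshgrid(a, a, indexing='ij')
    energy = (glcm ** 2).sum()                     # Eq. (32)
    contrast = (np.abs(A - B) ** 2 * glcm).sum()   # Eq. (33)
    homogeneity = (glcm / (1 + np.abs(A - B))).sum()   # Eq. (34)
    # standard marginals; the paper scales its MPM matrices by 1/Ng instead
    pa, pb = glcm.sum(axis=1), glcm.sum(axis=0)
    mua, mub = (a * pa).sum(), (a * pb).sum()
    sa = np.sqrt(((a - mua) ** 2 * pa).sum())
    sb = np.sqrt(((a - mub) ** 2 * pb).sum())
    corr = ((A - mua) * (B - mub) * glcm).sum() / (sa * sb)  # Eq. (35)
    return energy, contrast, homogeneity, corr
```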
2.7 Wavelet Features

Wavelet features are extracted using the Discrete Wavelet Transform (DWT). The DWT involves a filtering and a downsampling operation. A multiresolution decomposition is obtained by applying the single-level wavelet decomposition recursively to the low-frequency component. Multiresolution decomposition provides a basic hierarchical framework for interpreting both frequency- and location-based information contained within the image. Since the wavelet functions utilized are orthogonal, each stage of the decomposition captures different periodic aspects of the image. The wavelet features are acquired with the Daubechies-4 wavelet. Like the Fourier power spectrum, these features are expected to provide a quantitative description of frequency-related textural properties; unlike the Fourier transform, however, wavelet analysis also provides localization in the spatial domain. A three-level decomposition structure is employed, and the energies and standard deviations of the four components (Low-Low, Low-High, High-Low, High-High) at the three levels are used as features, making 24 features in total. The resulting feature vector is given in Equation (37), where eng_{c,s} and \sigma_{c,s} are the energy and standard deviation of the wavelet-filtered signal of component c at stage s:

F_{wavelet} = [\,F_{98} \; F_{99} \; .. \; F_{121}\,] = [\,eng_{LL,1} \; \sigma_{LL,1} \; eng_{LH,1} \; \sigma_{LH,1} \; eng_{HL,1} \; \sigma_{HL,1} \; .. \; eng_{HH,3} \; \sigma_{HH,3}\,]    (37)
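With PyWavelets, the recursive decomposition reads as below; `pywt.dwt2` performs one single-level decomposition, and the recursion is applied to the LL band as described.

```python
import numpy as np
import pywt

def wavelet_features(block):
    """Energies/stds of LL, LH, HL, HH over three db4 levels, as in Eq. (37)."""
    feats = []
    ll = block.astype(float)
    for stage in range(3):
        # one single-level decomposition; LL feeds the next stage
        ll, (lh, hl, hh) = pywt.dwt2(ll, 'db4')
        for band in (ll, lh, hl, hh):
            feats.append((band ** 2).sum())   # energy of the subband
            feats.append(band.std())          # standard deviation of the subband
    return np.array(feats)                    # 24 features in total
```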
2.8 Features in HSV color space

The HSV color space represents color in a way that closely parallels human perception. Since humans are successful at discriminating elements such as runway textures, and color is certainly one of the cues they evaluate, using HSV is advantageous for this problem. For instance, saturation provides valuable information about a runway, bearing in mind that runways tend to be in gray tones and colorfulness is essentially what saturation measures. In order to obtain these features, images that are inherently in the RGB color space must be converted to the HSV color space. This is performed using Equations (38)-(40), where max = max{r(x,y), g(x,y), b(x,y)}, min = min{r(x,y), g(x,y), b(x,y)}, and r(x,y), g(x,y), b(x,y) and h(x,y), s(x,y), v(x,y) denote the channels of RGB and HSV respectively:

h(x,y) = \begin{cases} 0 & \text{if } max = min \\ \left( \frac{g(x,y)-b(x,y)}{max-min}\, 60 + 360 \right) \bmod 360 & \text{if } max = r(x,y) \\ \frac{b(x,y)-r(x,y)}{max-min}\, 60 + 120 & \text{if } max = g(x,y) \\ \frac{r(x,y)-g(x,y)}{max-min}\, 60 + 240 & \text{if } max = b(x,y) \end{cases}    (38)

s(x,y) = \begin{cases} 0 & \text{if } max = 0 \\ 1 - \frac{min}{max} & \text{otherwise} \end{cases}    (39)

v(x,y) = max    (40)

The mean, variance, mean of gradient magnitude and variance of gradient magnitude, as well as the Zernike moment of order one and the Circular-Mellin feature (p=1, q=1), are employed for both the Saturation (F126, F127 .. F131 respectively) and Value (F132, F133 .. F137 respectively) components. Since these two components are linear data, the common mean and variance formulas given in Section 2.1 still apply. Hue, on the other hand, is angular data, and directional statistics are involved in the mean and variance calculations given in Equations (41)-(44), where ∠ denotes the angle of a complex number. Since the Zernike and Circular-Mellin features inherently require magnitudes rather than angles, the Hue component is not utilized for these features.

F_{122} = \angle\left( \sum_x \sum_y e^{2\pi j\, h(x,y)} \right)    (41)

F_{123} = 1 - \frac{1}{N^2} \left| \sum_x \sum_y e^{2\pi j\, h(x,y)} \right|    (42)

F_{124} = \frac{1}{N^2} \sum_x \sum_y |\nabla h(x,y)|    (43)

F_{125} = \frac{1}{N^2} \sum_x \sum_y \left( |\nabla h(x,y)| - F_{124} \right)^2    (44)

where |\nabla h(x,y)| = \sqrt{ d_{h,1}(x,y)^2 + d_{h,2}(x,y)^2 }, in which d_{h,1}(x,y) = (h(x,y) - h(x-1,y) + 360) \bmod 360 and d_{h,2}(x,y) = (h(x,y) - h(x,y-1) + 360) \bmod 360. The resulting feature vector F_hsv is

F_{hsv} = [\,F_{122} \; F_{123} \; .. \; F_{137}\,]    (45)
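The directional statistics of Equations (41)-(44) in numpy: hue normalized to [0,1) is assumed for the circular mean/variance (consistent with the 2π factor in the exponent), and hue in degrees for the gradient features; border wrap-around via np.roll is again a simplification.

```python
import numpy as np

def hue_mean_var(h01):
    """Eqs. (41)-(42); assumes hue normalized to [0,1)."""
    z = np.exp(2j * np.pi * h01)             # hue angles as unit vectors
    F122 = np.angle(z.sum())                 # Eq. (41): directional mean
    F123 = 1 - np.abs(z.sum()) / h01.size    # Eq. (42): directional variance
    return F122, F123

def hue_gradient_stats(h_deg):
    """Eqs. (43)-(44); assumes hue in degrees."""
    d1 = np.mod(h_deg - np.roll(h_deg, 1, axis=1) + 360, 360)  # d_h,1
    d2 = np.mod(h_deg - np.roll(h_deg, 1, axis=0) + 360, 360)  # d_h,2
    mag = np.hypot(d1, d2)                   # |grad h(x,y)|
    return mag.mean(), mag.var()             # F124, F125
```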
3. THE ADABOOST ALGORITHM

Boosting is a general method for improving the performance of a learning algorithm15. Adaboost (short for Adaptive Boosting) is a boosting algorithm that takes a set of weak learners and, over a number of iterations, constructs a linear combination of them to produce a strong classifier. A weak learner is a classifier that gives weak hypotheses, insufficient to solve the whole problem alone. These weak learners are often chosen as threshold classifiers26, which decide the output by comparing the input with a threshold. Such a threshold classifier, h_j(x_j), is given in Equation (46):

h_j(x_j) = \begin{cases} +1 & \text{if } p_j x_j < p_j \theta_j \\ -1 & \text{otherwise} \end{cases}    (46)

In this equation x_j is the feature, θ_j is the threshold, p_j is the parity deciding the direction of the inequality, and 1 ≤ j ≤ K, where K is the number of features. Every weak learner makes its decision by examining only one feature, so every classifier corresponds to a feature. Training this weak learner, given in Equation (47), means determining the θ_j and p_j values that minimize the classification error at iteration t:

(\theta_j, p_j) = \arg\min_{(\theta_j, p_j)} \{ \varepsilon_{t,j} \}    (47)

This operation can be achieved simply by a search over the intervals min(x) ≤ θ_j ≤ max(x) and p_j ∈ {-1,+1}. The definition of ε_{t,j} is given below, where y_i is the desired output:

\varepsilon_{t,j} = \sum_{i:\, h_j(x_i) \ne y_i} D_t(i)    (48)

In this equation, D_t(i) is the distribution function over the training samples at the t-th iteration. This distribution is utilized to emphasize the misclassified samples, forcing the algorithm to focus on the hard examples in the training set. D_t(i) is initialized to be uniform, and on every iteration it is updated so that the correctly classified samples' values are reduced and the misclassified samples' values are increased. The complete Adaboost algorithm is given below:

Input: Training data (x_1,y_1), (x_2,y_2), .., (x_m,y_m), where the input feature vectors are x_i ∈ X_1 × X_2 × .. × X_K, the j-th element of x_i is represented as x_{i,j} ∈ X_j, and the training labels are y_i ∈ Y = {-1,+1}, for 1 ≤ i ≤ m.

Initialize: D_1(i) = 1/m for 1 ≤ i ≤ m.

Algorithm: For t = 1, .., T do the following steps:
1) Train the weak classifiers, i.e. find the (θ_j, p_j) pairs according to Equation (47), for j = 1..K.
2) Get the weak hypotheses h_j(x_{i,j}): X_j → {-1,+1}, for j = 1..K.
3) Find the errors ε_{t,j} for j = 1..K using Equation (48) and select the classifier j*(t) with the minimum error, that is, j*(t) = argmin_j {ε_{t,j}}.
4) Set \alpha_t = \frac{1}{2} \ln\left( \frac{1-\varepsilon_t^*}{\varepsilon_t^*} \right), where \varepsilon_t^* = \varepsilon_{t,j^*(t)} is the error calculated for j*(t).
5) Update D_{t+1}(i) as

D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} e^{-\alpha_t} & \text{if } h_t^*(x_i) = y_i \\ e^{\alpha_t} & \text{if } h_t^*(x_i) \ne y_i \end{cases} = \frac{D_t(i)\, e^{-\alpha_t y_i h_t^*(x_i)}}{Z_t}

where h_t^*(x_i) = h_{j^*(t)}(x_{i,j^*(t)}) and Z_t = \sum_i D_t(i)\, e^{-\alpha_t y_i h_t^*(x_i)}.

Output: The strong classifier H(x) = \text{sign}\left( \sum_{t=1}^{T} \alpha_t h_t^*(x) \right).

Adaboost has many advantages: it is fast, simple and easy to program. It is also a nice property that it requires no parameters to be tuned except the iteration count T. Adaboost provides a general and provably effective method for producing an accurate prediction rule by combining rough and moderately inaccurate weak learners. Since it is not known in advance which features are better for discriminating runways from other landforms, a considerable number of features were collected as described in the previous section, and Adaboost is employed for its feature-selecting property: a weak (threshold) classifier is defined for every feature, and Adaboost finds the most helpful weighted combination of T weak classifiers. After training, a new sample x, possibly unseen during training, is classified as positive or negative according to whether H(x) = +1 or H(x) = -1.
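A compact sketch of the listed algorithm with one-feature threshold stumps follows. The exhaustive threshold search mirrors Equation (47) but is unoptimized; sorting each feature once would speed it up considerably. The function names are hypothetical.

```python
import numpy as np

def adaboost_train(X, y, T):
    """Adaboost with threshold stumps (Eq. (46)); X: m-by-K features, y in {-1,+1}."""
    m, K = X.shape
    D = np.full(m, 1.0 / m)                  # uniform initial distribution D_1
    model = []
    for _ in range(T):
        best = None
        for j in range(K):                   # train each weak learner, Eq. (47)
            for theta in np.unique(X[:, j]):
                for p in (-1, +1):
                    h = np.where(p * X[:, j] < p * theta, 1, -1)
                    err = D[h != y].sum()    # weighted error, Eq. (48)
                    if best is None or err < best[0]:
                        best = (err, j, theta, p, h)
        err, j, theta, p, h = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))  # step 4
        D *= np.exp(-alpha * y * h)          # step 5: emphasize mistakes
        D /= D.sum()                         # normalization by Z_t
        model.append((alpha, j, theta, p))
    return model

def adaboost_predict(model, x):
    """Strong classifier H(x) = sign(sum_t alpha_t h_t(x))."""
    s = sum(a * (1 if p * x[j] < p * th else -1) for a, j, th, p in model)
    return 1 if s >= 0 else -1
```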
4. EXPERIMENTAL RESULTS

4.1 The Data Set

The experiments are carried out on a dataset consisting of 55 satellite images gathered from GoogleEarth™, with sizes of 14,000x11,000 pixels on average and a resolution of 1 meter. 28 of these images are randomly selected for training Adaboost, and the remaining 27 are reserved for testing. Each image is divided into blocks of size 32x32; in this way 4,205,796 blocks are obtained for training. Each block of the training images is labeled as runway (positive) if more than half of its pixels belong to the main runway of an airport, and as non-runway (negative) otherwise. This labeling of the training images resulted in 5,315 runway blocks and 4,200,481 non-runway blocks. For the composition of the training set, 10% of the non-runway blocks are randomly selected because of memory constraints, while the full set of runway blocks is utilized. Ground truth data are constituted manually by marking only the main runways. A similar procedure is applied to the images reserved for testing: each of the 27 test images is divided into blocks of size 32x32, giving 4,186,148 blocks for testing. The ground truth data utilized for the evaluation of the classification results contains all of the 6,502 runway blocks and all of the 4,179,646 non-runway blocks.

4.2 The Performance Criteria

The performance evaluation for segmentation is carried out over the image blocks. For this purpose, the ground truth data are compared with the decision of the Adaboost algorithm for each image block. In order to define the performance equations, the following ground truth labeling is used:

\Omega(a,b) = \begin{cases} 1 & \text{if the block having coordinates } (a,b) \text{ is truly a runway} \\ 0 & \text{otherwise} \end{cases}    (49)

where "truly a runway" means that at least half of the pixels contained within the image block belong to a runway. A similar labeling is used to represent the decision of the Adaboost algorithm:

\Lambda(a,b) = \begin{cases} 1 & \text{if the block having coordinates } (a,b) \text{ is classified as a runway} \\ 0 & \text{otherwise} \end{cases}    (50)
The True Positive Rate (Sensitivity), the ratio of correctly classified positive-labeled samples to truly positive samples (its complement indicates how often the algorithm misses an existing runway block), can be expressed as

\text{True Positive Rate} = \frac{ \sum_{(a,b):\, \Lambda(a,b)=\Omega(a,b)} \Omega(a,b) }{ \sum_{(a,b)} \Omega(a,b) }    (51)

Likewise, the True Negative Rate (Specificity), the ratio of correctly classified negative-labeled samples to truly negative samples (its complement indicates how often the algorithm labels a truly non-runway block as a runway block), can be written as

\text{True Negative Rate} = \frac{ \sum_{(a,b):\, \Lambda(a,b)=\Omega(a,b)} \left( 1 - \Omega(a,b) \right) }{ \sum_{(a,b)} \left( 1 - \Omega(a,b) \right) }    (52)
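Given the 0/1 label grids Ω (ground truth) and Λ (decision), Equations (51)-(52) reduce to a few lines; the function name is hypothetical.

```python
import numpy as np

def rates(omega, lam):
    """TPR/TNR of Eqs. (51)-(52) from 0/1 block-label grids."""
    tp = np.logical_and(omega == 1, lam == 1).sum()  # correctly found runway blocks
    tn = np.logical_and(omega == 0, lam == 0).sum()  # correctly rejected blocks
    tpr = tp / omega.sum()                           # Eq. (51)
    tnr = tn / (omega == 0).sum()                    # Eq. (52)
    return tpr, tnr
```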
4.3 Performance Evaluation

As mentioned before, no parameters need tuning for Adaboost except the iteration count. Since the calculations at one iteration do not affect the results obtained at previous iterations, this parameter merely determines how many classifiers are taken into account, and it is always possible to perform the classification with fewer weak learners if the results for that number of classifiers are satisfactory. Hence, tuning this parameter has no critical consequences, and it can be selected according to computational constraints. Figure 1.a shows the performance results at every iteration for the training set. The x-axis indicates the number of features (weak learners) utilized, while the y-axis shows the corresponding performance (True Positive and True Negative Rates) for that number.
Figure 1. Performance vs. number of features: (a) training set, (b) test set.
In order to obtain the performance-versus-feature-number graph for the test set, 10% of the 4,179,646 negative blocks are randomly selected for faster calculation, while the full set of positive blocks is utilized. The obtained classification results are compared with the ground truth data; the resulting performance for the test set is given in Figure 1.b. While the True Negative Rates are very close to those achieved for the training set, the True Positive Rate is slightly lower. The outcomes for both sets show that around 16 iterations give results comparable to 40. That is to say, after approximately 16 features, adding more features to the feature vector does not provide significant performance improvement; whether to include the remaining features beyond this number is a question of computational capacity.

The features chosen by the Adaboost algorithm at each iteration are given in Table 1. The table lists, in selection order, the names and parameters of the selected features, their indices in the feature vector, and the Adaboost weak classifier rules. The threshold values of the classifier rules are normalized according to the minimum and maximum values of the corresponding feature in the training feature set in order to provide a more explanatory expression. While it is not easy to state the selection reasoning for every selected feature, some are explicit. For instance, the first feature in the table, the mean of Saturation, denotes how colorful the image block is on average. Observing the classifier rule, the weak learner output is positive if the input is less colorful than 16% of the saturation range. Since airport runways are not colorful structures and tend to be in grayscale, this outcome is consistent with common sense. Likewise, the second selected feature, the variance of the intensity gradient magnitude, which is selected multiple times, denotes
how variable the rate of intensity-level change is within an image block. Higher values mean the block contains abruptly changing neighboring pixels as well as uniform areas at the same time, which is the case when a runway marking or runway edge falls on a block. Selection of duplicate features is possible in Adaboost. At first glance this may seem redundant, since no new information is provided; however, more sophisticated classification rules can be established this way, because the thresholds and/or parities change at every iteration due to the retraining of the weak classifiers.

Table 1. Features selected by Adaboost, in selection order

Order | Index | Name and Parameter                 | Classifier Rule
1     | 126   | Mean of Saturation                 | < 0.16
2     | 4     | Var. of Intensity Grad. Mag.       | > 0.16
3     | 93    | Haralick - Correlation (at 90°)    | > 0.92
4     | 88    | Haralick - Homogeneity (at 45°)    | > 0.77
5     | 4     | Var. of Intensity Grad. Mag.       | > 0.22
6     | 129   | Var. of Saturation Grad. Mag.      | < 0.08
7     | 1     | Mean of Intensity                  | > 0.58
8     | 131   | Circ.-Mellin of Sat. (p=1, q=1)    | > 0.09
9     | 86    | Haralick - Energy (at 45°)         | > 0.16
10    | 135   | Var. of Value Grad. Mag.           | > 0.33
11    | 59    | Var. of Gabor Filt. Output (13th)  | < 0.07
12    | 4     | Var. of Intensity Grad. Mag.       | > 0.11
13    | 35    | Var. of Gabor Filt. Output (1st)   | < 0.05
14    | 4     | Var. of Intensity Grad. Mag.       | > 0.26
15    | 128   | Mean of Saturation Grad. Mag.      | < 0.03
16    | 4     | Var. of Intensity Grad. Mag.       | > 0.09
17    | 77    | Var. of Gabor Filt. Output (22nd)  | < 0.06
18    | 89    | Haralick - Correlation (at 45°)    | > 0.93
19    | 31    | Average of DFT magnitude           | > 0.18
20    | 53    | Var. of Gabor Filt. Output (10th)  | < 0.05
21    | 4     | Var. of Intensity Grad. Mag.       | > 0.42
22    | 115   | Var. of Wavelet Output (St.3, LL)  | > 0.06
23    | 124   | Mean of Hue Grad. Mag.             | < 0.11
24    | 126   | Mean of Saturation                 | < 0.05
25    | 4     | Var. of Intensity Grad. Mag.       | > 0.14
26    | 59    | Var. of Gabor Filt. Output (13th)  | < 0.05
27    | 125   | Var. of Hue Grad. Mag.             | > 0.03
28    | 113   | Var. of Wavelet Output (St.2, HH)  | > 0.24
29    | 129   | Var. of Saturation Grad. Mag.      | < 0.10
30    | 131   | Circ.-Mellin of Sat. (p=1, q=1)    | > 0.19
31    | 31    | Average of DFT magnitude           | < 0.46
32    | 4     | Var. of Intensity Grad. Mag.       | > 0.48
33    | 122   | Mean of Hue                        | > 0.11
34    | 108   | Mean of Wavelet Output (St.2, LH)  | < 0.01
35    | 4     | Var. of Intensity Grad. Mag.       | > 0.2
36    | 103   | Var. of Wavelet Output (St.1, HL)  | < 0.07
37    | 105   | Var. of Wavelet Output (St.1, HH)  | > 0.09
38    | 75    | Var. of Gabor Filt. Output (21st)  | < 0.08
39    | 115   | Var. of Wavelet Output (St.3, LL)  | > 0.13
40    | 123   | Variance of Hue                    | < 0.03
Using all 40 features, the performance of the algorithm is calculated over all of the blocks in the training and test sets. For the training set, 4,765 of the 5,315 truly runway blocks are correctly classified as runway, corresponding to a 90% True Positive Rate, while 3,854,585 of the 4,200,481 truly non-runway blocks are classified as non-runway, corresponding to a 92% True Negative Rate. For the entire test set, 5,414 of the 6,502 truly runway blocks are correctly classified as runway, corresponding to an 83% True Positive Rate, while 3,815,208 of the 4,179,646 truly non-runway blocks are classified as non-runway, corresponding to a 91% True Negative Rate.
5. CONCLUSIONS

A method for the texture-based discrimination of airport runways from other structures and landforms is proposed. The method adopts a generic approach for finding runways in large images using Adaboost and a set of textural features. It can be used either for finding Regions of Interest prior to shape detection or for verifying runway hypotheses after shape analysis. Performance evaluation experiments are carried out on a considerable amount of data and demonstrate promising results for the proposed method. The method is tunable: the number of features to be used is up to the user and can be reduced in a trade-off between computational cost and performance, as shown in Section 4.3. As future work, shape analysis will be utilized for further performance improvement.
REFERENCES

[1] Liu, D., He, L., and Carin, L., "Airport detection in large aerial optical imagery," Acoustics Speech and Signal Processing 2004 Proceedings (ICASSP '04) IEEE International Conference on 5, 761–764 (2004).
[2] Qu, Y., Li, C., and Zheng, N., "Airport detection base on support vector machine from a single image," Information Communications and Signal Processing 2005 Fifth International Conference on, 546–549 (2005).
[3] Gupta, P. and Agrawal, A., "Airport detection in high-resolution panchromatic satellite images," Journal of Institution of Engineers (India) 88(5), 3–9 (2007).
[4] Han, J., Guo, L., and Bao, Y., "A method of automatic finding airport runways in aerial images," Signal Processing Proceedings 2002 6th International Conference on 1, 731–734 (2002).
[5] Pi, Y., Fan, L., and Yang, X., "Airport detection and runway recognition in SAR images," Geoscience and Remote Sensing Symposium IGARSS '03 Proceedings 2003 IEEE International 6, 4007–4009 (2003).
[6] Wang, Q.-T. and Bin, Y., "A new recognizing and understanding method of military airfield image based on geometric structure," Wavelet Analysis and Pattern Recognition 2007 ICWAPR '07 Proceedings International Conference on 3, 1241–1246 (2007).
[7] Niu, X., "A semi-automatic framework for highway extraction and vehicle detection based on a geometric deformable model," International Journal of Photogrammetry and Remote Sensing 61, 170–186 (2006).
[8] Udomhunsakul, S., "Semi-automatic road detection from satellite imagery," Image Processing 2004 ICIP '04 2004 Proceedings International Conference on 3, 1723–1726 (2004).
[9] Mena, J. B. and Malpica, J. A., "An automatic method for road extraction in rural and semi-urban areas starting from high resolution satellite imagery," Pattern Recognition Letters 26(9), 1201–1220 (2005).
[10] Mokhtarzade, M., Zoej, M., and Ebadi, H., "Automatic road extraction from high resolution satellite images using neural networks, texture analysis, fuzzy clustering and genetic algorithms," The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2008 Proceedings ISPRS Congress Beijing, B3b, 549 (2008).
[11] Fernandez-Maloigne, C., Laugier, D., and Bekkhoucha, A., "Texture analysis for road detection," Intelligent Vehicles '92 Symposium Proceedings of the, 219–224 (1992).
[12] Popescu, D., Dobrescu, R., and Merezeanu, D., "Road analysis based on texture similarity evaluation," SIP'08: Proceedings of the 7th WSEAS International Conference on Signal Processing, 47–51 (2008).
[13] Mena, J. B., "State of the art on automatic road extraction for GIS update: a novel classification," Pattern Recognition Letters 24(16), 3037–3058 (2003).
[14] Mayer, H., Hinz, S., Bacher, U., and Baltsavias, E., "A test of automatic road extraction approaches," International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 209–214 (2006).
[15] Freund, Y. and Schapire, R. E., "A short introduction to boosting," Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 1401–1406 (1999).
[16] Khotanzad, A. and Hong, Y., "Invariant image recognition by Zernike moments," Pattern Analysis and Machine Intelligence IEEE Transactions on 12, 489–497 (1990).
[17] Ravichandran, G. and Trivedi, M., "Circular-Mellin features for texture segmentation," Image Processing IEEE Transactions on 4, 1629–1640 (1995).
[18] Haralick, R. M., Shanmugam, K., and Dinstein, I., "Textural features for image classification," Systems Man and Cybernetics IEEE Transactions on 3, 610–621 (1973).
[19] Augusteijn, M., Clemens, L., and Shaw, K., "Performance evaluation of texture measures for ground cover identification in satellite images by means of a neural network classifier," Geoscience and Remote Sensing IEEE Transactions on 33, 616–626 (1995).
[20] Newsam, S. D. and Kamath, C., "Retrieval using texture features in high resolution multi-spectral satellite imagery," SPIE Defense and Security Symposium Data Mining and Knowledge Discovery: Theory Tools and Technology VI, (2004).
[21] Newsam, S. D. and Kamath, C., "Comparing shape and texture features for pattern recognition in simulation data," Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series 5672, 106–117 (2005).
[22] Manjunath, B. and Ma, W., "Texture features for browsing and retrieval of image data," Pattern Analysis and Machine Intelligence IEEE Transactions on 18, 837–842 (1996).
[23] Rangayyan, R., Ferrari, R., Desautels, J., and Frere, A., "Directional analysis of images with Gabor wavelets," Computer Graphics and Image Processing 2000 Proceedings XIII Brazilian Symposium on, 170–177 (2000).
[24] Weszka, J. S. and Rosenfeld, A., "A comparative study of texture measures for terrain classification," NASA STI/Recon Technical Report N 76, (1975).
[25] Bevk, M. and Kononenko, I., "A statistical approach to texture description of medical images: A preliminary study," Computer-Based Medical Systems Proceedings IEEE Symposium on, 239 (2002).
[26] Viola, P. and Jones, M., "Robust real-time object detection," International Journal of Computer Vision, (2001).