Progressive Line Processing of Kernel RX Anomaly Detection ... - MDPI

sensors Article

Progressive Line Processing of Kernel RX Anomaly Detection Algorithm for Hyperspectral Imagery Chunhui Zhao *, Weiwei Deng, Yiming Yan and Xifeng Yao

ID

College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, China; [email protected] (W.D.); [email protected] (Y.Y.); [email protected] (X.Y.) * Correspondence: [email protected]; Tel.: +86-451-8258-9810 Received: 30 June 2017; Accepted: 4 August 2017; Published: 7 August 2017

Abstract: The Kernel-RX detector (KRXD) has attracted widespread interest in hyperspectral image processing with the utilization of nonlinear information. However, the kernelization of hyperspectral data leads to poor execution efficiency in KRXD. This paper presents an approach to the progressive line processing of KRXD (PLP-KRXD) that can perform KRXD line by line (the main data acquisition pattern). Parallel causal sliding windows are defined to ensure the causality of PLP-KRXD. Then, with the employment of the Woodbury matrix identity and the matrix inversion lemma, PLP-KRXD has the capacity to recursively update the kernel matrices, thereby avoiding a great many repetitive calculations of complex matrices, and greatly reducing the algorithm’s complexity. To substantiate the usefulness and effectiveness of PLP-KRXD, three groups of hyperspectral datasets are used to conduct experiments. Keywords: hyperspectral imagery; KRX anomaly detection; real-time algorithm; progressive line processing; the causal sliding window

1. Introduction Because of abundant spectral information, hyperspectral imagery (HSI) has the potential to discover the subtle differences of ground materials that cannot be visually inspected in a multispectral image [1]. As such, it has gained widespread attention in the fields of geology, the military, the mining industry, and medical imaging, among others [2–4]. Anomaly detection, which is one of the most popular branches in hyperspectral image processing, is capable of uncovering many masked targets of interest without a priori spectral knowledge, and as such, it generally conforms to practical conditions, and has gradually been considered very effective and useful in HSI [5–8]. The Reed–Xiaoli detector (RXD) [9] is a widely used anomaly detection algorithm, and is regarded as a benchmark algorithm in HSI [10–12]. Derived from the generalized likelihood ratio test (CFAR), it finds abnormal points by calculating the Mahalanobis distance between the background and the pixel under test (PUT) in a multivariate Gaussian background. However, the RXD algorithm merely utilizes the low-order linear statistics of hyperspectral data, thus leading to inferior detection accuracy in complex and changeable ground distributions. To address this issue, kernel-RXD (KRXD), a kernel version of the RXD algorithm, is presented by Kwon and Nasrabadi [13]. KRXD exhaustively mines the high-order correlation between spectral bands via a kernel function. It can vastly improve the detection accuracy as compared to the RXD when original data are mixed in a non-linear model, which is always the case. Unfortunately, the data kernelization in KRXD produces lots of multiplications and the inversion of matrices, thereby consuming a lot of processing time. Recently, real-time anomaly detection has gained wide attention in HSI [14–21]. It is particularly significant, as some moving objects are highly desirable to be detected on a timely basis. Furthermore, with the huge pressure of big data storage, it becomes the inevitable trend to perform anomaly detection

Sensors 2017, 17, 1815; doi:10.3390/s17081815

www.mdpi.com/journal/sensors

Sensors 2017, 17, 1815

2 of 16

in real time. It is worth noting that the realization of real-time processing is mainly dependent on the three data acquisition formats [17,18]. The first such format is the band interleaved pixel (BIP) format, which performs anomaly detection pixel by pixel. Two real-time algorithms based on the RXD, the global real-time causal RX detector (GRTC-RXD), and the local real-time causal RX detector (LRTC-RXD), are discussed in [19,20] in terms of BIP, respectively. Both real-time algorithms solve the problem of causality (the one requirement of real-time processing), and only the data samples already visited can be used to detect PUT; no future data samples should be involved in the data processing [21]. They also enhance the execution efficiency by recursively updating the covariance matrices via the Woodbury matrix identity [22]. While the GRTC-RXD utilizes all processed data before the PUT as the background, LRTC-RXD regards the data contained in sliding causal array windows as the background to perform real-time detection. As these algorithms usually have an undesirable detection output by using the low-order statistics of the HSI dataset, a real-time algorithm based on the KRXD to yield better detection accuracy with much shorter processing time has been recently proposed [23]. The second such format is the band sequential (BSQ) format, which processes data samples band by band. According to the BSQ format, progressive band processing anomaly detection (PBP-AD) could be implemented band by band progressively while the data acquisition is still ongoing [24]. The third and final such format is the band interleaved line (BIL) format, which collects data line by line. Since acquiring data with a push broom scanner has gradually become the mainstream for hyperspectral imaging sensors [25,26], the literature [26] presents a real-time anomaly detector based on the BIL format. By using a linear algebra-based strategy, it effectively updates the inverse covariance, thus avoiding complex computation. The renewal process of correlation matrices only separately considers the data going in and out of the dual window. Yet, it does not re-derive the covariance matrix by treating these data and the duplicate data as a whole, leading to the loss of some information in recursion. Additionally, the low detection accuracy is still a challenge, as it is based on the RXD. Therefore, to tackle the aforementioned issues, this paper presents an approach to the progressive line processing of KRXD (PLP-KRXD) via BIL. To ensure the causality of the detection system, a new type of local window (parallel causal sliding window) is defined. In this case, hyperspectral data can be collected line by line progressively during the process of line acquisition. As the proposed method is updated recursively using the Woodbury matrix identity and the matrix inversion lemma [27], it avoids having to recalculate the previously processed data lines, thereby significantly reducing computational complexity. Experimental results demonstrate that the PLP-KRXD keeps the detection accuracy unchanged and speeds up the execution efficiency compared to the KRXD. 2. Methods 2.1. KRX Anomaly Detector In this section, we review the KRXD, a nonlinear version of the RXD, which ideally can separate the background and PUT by sufficiently utilizing high-order statistics. Suppose that each data sample T vector consists of J spectral bands, denoted by xi = xi1 , xi2 , · · · xiJ . The KRXD maps the original data XB = [x1 , x2 , · · · , x M ], where M is the number of the data sample vectors, to the high-dimensional feature space with the nonlinear function as follows Φ(XB ) = Φ(XB ) = [Φ(x1 ), Φ(x2 ), · · · Φ(x M )]

(1)

where Φ(·) is the kernel function. With the same assumption as for the RXD, the KRXD now consists of two Gaussian distributions, thereby modeling the two hypotheses as (

H0Φ : Φ(x) = nΦ H1Φ : Φ(x) = aΦ Φ(s) + nΦ

(Target absent) (Target present)

(2)

Sensors 2017, 17, 1815

3 of 16

where aΦ = 0 under H0Φ and aΦ > 0 under H1Φ , respectively. Φ(s) and nΦ separately represent the spectral signal and background clutter in the feature space. Then, the corresponding representative form of the KRXD is compactly given by −1

ˆ BΦ (Φ(r) − µˆ BΦ ) δKRXD (Φ(r)) = (Φ(r) − µˆ BΦ ) T K

(3)

ˆ BΦ are the estimated mean and covariance matrix of the background where r is the PUT, and µˆ BΦ and K data in the feature space, respectively. According to the specific kernelization and derivation, we have krT

kµTˆ

= k(r, XB ) −

1 M k ( xi , X B ) M i∑ =1

=

! 1 M k (r, xi ) 11× M M i∑ =1

!

−

! 1 M M ∑ k xi , xj 11× M M2 i∑ =1 j =1

(4)

(5)

where 11× M is an 1 × M vector with all values equal to 1. The Gram matrix is expressed by KB = k(XB , XB ). So, the final KRXD algorithm can be simplified as T 1 δKRXD (Φ(r)) = krT − kµT K− krT − kµT . B

(6)

Choosing the proper kernel function could exert a crucial influence on the performance of the kernel-based learning algorithm. The polynomial kernel function implements the dot product in feature space with an effective kernel trick, expressed as follows: d k(x, xi ) = xT xi , d ∈ N

(7)

where d is a positive constant. 2.2. Proposed Progressive Line Processing of the KRX Anomaly Detector Though the KRXD has desirable detection accuracy, it is excessively time-consuming due to the computation of complex matrices. Fortunately, the implementation of the KRXD can be improved to perform in real-time. Interestingly, the issue of performing KRXD with current hyperspectral imaging sensors working in a push-broom fashion has not received much attention in the past. Therefore, the idea is presented that the PLP-KRXD arises from a need to implement the KRXD in real time using a new type of a causal sliding window. 2.2.1. Push Broom Scanner A push broom scanner or a so-called along-track scanner is a device for obtaining images with spectroscopic sensors. Figure 1 shows the imaging principle of a push broom scanner. It employs a flat panel detector to simultaneously collect all data as spectral information in a line-by-line fashion. Meanwhile, spatial imaging is attained as the platform moves along with the flight direction. A push broom scanner can achieve high system sensitivity and a hight signal-to-noise ratio (SNR), because it gazes at a particular area for a long time.

Sensors 2017, 17, 1815 Sensors 2017, 17, 1815 Sensors 2017, 17, 1815

4 of 16 4 of 16 4 of 16

Band Band

Flag panel detector Flag panel detector

Figure 1. The imaging principle of push broom scanner. Figure 1. 1. The scanner. Figure The imaging imagingprinciple principleofofpush pushbroom broom scanner.

2.2.2. Parallel Causal Sliding Windows 2.2.2. Parallel Causal Sliding Windows 2.2.2. Parallel Causal Sliding Windows Hyperspectral anomaly detection adopts the traditional double-window model to detect targets Hyperspectral anomaly detection adopts the traditional double-window model to detect targets Hyperspectral anomaly detection adoptswhich the traditional double-window model to detect targets of in most most cases. cases.However, However, causality, which is the prerequisite of real-time detection, cannot of interest interest in causality, is the prerequisite of real-time detection, cannot ofbe interest in most cases. However, causality, which is the prerequisite of real-time detection, cannot attained. This is because the dual window must process the future data sample vectors after be attained. This is because the dual window must process the future data sample vectors after the the bePUT, attained. isto because the dual window must process theTo future data sample vectors after which leads to thenon-causality non-causality the processing system. To address this issue, a new type PUT, whichThis leads the ofofthe processing system. address this issue, a new type oftheof PUT, which leads to thewindow non-causality ofprogressive the processing system. Toisaddress this issue, a new type parallel causal sliding window basedonon progressive line processing is defined, as shown in Figure parallel causal sliding based line processing defined, as shown in Figure 2. of2. parallel causal sliding window based on progressive line processing defined, assliding shown in Figure 2. As from Figure Figure 2a,the theimplementation implementation attained a is series of causal sliding rectangle As is is evident from 2a, is is attained by by a series of causal rectangle As is evident from Figure 2a, the implementation is attained by a series of causal sliding rectangle windows in a parallel alignment. In this manner, each data line can be divided into several parts and windows parallel alignment. In this manner, each data line can be divided into several parts and windows in a parallel alignment. In data this manner, each data line processed can be divided intoline. parts and simultaneously detected, andthe the data canbebeprogressively progressively processed byseveral line. To further simultaneously detected, and can lineline by To further illustrate the processing of Figure 2b2b shows aprocessed representation line-by-line real-time simultaneously detected, and thealgorithm, data can be progressively line byofline. To further illustrate illustrate processing ofthis this algorithm, Figure shows a representation of line-by-line real-time th m processing in of the cross-track direction. thethenth nthdata line in the causal LmnLmndenote the processing algorithm, Figure Let 2bLet shows adenote representation ofprocessing line-by-line real-time processing in mth causal processing thethis cross-track direction. data processing line in the m th th the cross-track direction. Letspecified L the nlength data processing linebin. bthe causal sliding rectangle sliding rectangle window byby itsits width It. should be noted that the the a aand n denote sliding rectangle window specified length and width Itm should be noted that window specified by its length a and width b. It should be noted that the currently processed is currently processed line is not contained in the causal sliding rectangle window. When the causal currently processed line is not contained in the causal sliding rectangle window. When theline causal m m m m (the line window) to to (the(the dashed line sliding rectangle slides from L not contained in thewindow causal sliding rectangle window. When the causal sliding rectangle window slides n+1 L (thedotted dotted line window) dashed line sliding rectangle window slides from LnL n+1 n m from Lm (the dotted line window) to L (the dashed line window), most of the data line vectors window), most of the data line vectors collected do not vary, except for the contribution of the data n 1 window), most of the data line vectorsn+collected do not vary, except for the contribution of the data collected do not vary, except for the the causal contribution of the data line vectors leaving and entering the line vectors leaving and entering sliding rectangle window ( Lmn− b mand Lmn+1 ). m line vectors leaving and entering the causal msliding rectangle window ( Ln − b and Ln+1 ). causal sliding rectangle window (Lm n−b and Ln+1 ). in − flight

in − flight

NL

NL

J

J

NS

NS



 Lmn−b

Lmn Lmn+1 m n −b

L (a)

(b)

Lmn Lmn+1

(a) Figure 2. Schematic representation of data acquisition and processing: (a) The(b) process of the parallel causal sliding windows; (b) The causal sliding rectangle window at Lmn and Lmn+1 . Figure2.2.Schematic Schematicrepresentation representationofofdata dataacquisition acquisitionand andprocessing: processing:(a) (a)The Theprocess processofofthe theparallel parallel Figure m m and . causal sliding windows; (b) The causal sliding rectangle window at L L m m n causal sliding windows; (b) The causal sliding rectangle window at Ln and Ln+1 .n+1

Sensors 2017, 17, 1815

5 of 16

2.2.3. Causal KRXD Algorithm n −1 m Now we assume that L(n) makes up of all the data lines up to Lm n−1 , i.e., L( n ) = ∪i =n−b Li , where b is equal to the number of lines contained in the current causal sliding rectangle window, and Lim is the ith data line. As a result, the KRXD can be causally designed as follows:

T T T T T 1 δKRXD (Φ(Lm K− L n k − k L n ( ( )) ( ( )) m µ n )) = kLm − kµ (L( n )) L B n

(8)

n

where KB (L(n)) = k(XB (L(n)), XB (L(n))) is the causal Gram matrix, and XB (L(n)) is the causal background matrix formed by all of the data line vectors in the current causal sliding rectangle window. In order to satisfy the causality, kµT (L(n)) and kLTm must be specified by: n

kµT (L(n)) =

1 b 1 b b m m k (Lm ∑ n , X B (L( n ))) − 2 ∑ ∑ k Li , L j 1b×w b i =1 b i =1 j =1

kTLmn (L(n)) = k (Lm n , X B (L( n ))) −

(9)

1 b m k (Lm n , Li ) 1 b × w b i∑ =1

(10)

where w is the total number of entire data sample vectors in a causal sliding rectangle window. 2.2.4. Progressive Line Processing of the KRX Anomaly Detector Theoretically, according to Equation (8), the causality of KRXD is ensured so that it can be 1 implemented in real-time. However, recalculating K− B (L( n )) is excessively time consuming as long as the causal sliding window moves. This processing time scarcely meets the time restraint for real-time 1 implementation. Therefore, in what follows, we derive the recursive update equation of K− B (L( n )) to solve this issue. m m Let XB (L(n)) = Lm n−b , Ln−b+1 , · · · Ln−1 be the background data sample vectors in the current causal sliding rectangle window at Lm n . Using the polynomial kernel function in Equation (7), K B (L( n )) can be given by d KB (L(n)) = k(XB (L(n)), XB (L(n))) = XB (L(n)) T XB (L(n))  m m m k Lm k Lm k Lm n − b , Ln − b n − b , Ln − b +1 · · · n − b , Ln −1  m m m k Lm · · · k Lm  k Lm n − b +1 , Ln − b n − b +1 , Ln − b +1 n − b +1 , Ln −1  = .. .. .. .. . . . .  m m m m m k Ln −1 , Ln − b k Ln −1 , Ln − b +1 · · · k Ln −1 , Lm n −1

  "  m  = ρ Ln T  α  Lnn

αLmn Kbw

#

(11)

m m m m m where αLmn and ρLmn are separately defined by αLmn = k Lm n−b , Ln−b+1 , k Ln−b , Ln−b+2 , · · · k Ln−b , Ln−1 , −1 m and ρLmn = k Lm n−b , Ln−b . Taking the advantage of the matrix inversion lemma, K B (L( n )) can be expressed as: " 1 K− B (L( n ))

=

ρLmn αLTn n

κ=

αLmn Kbw

# −1

"

=

ρL−m1 + ρL−m1 αLmn κ−1 αLTn ρ−1 n

n

−κ−1 αLTn ρL−m1 n

Kbw − αLTn ρL−m1 αLmn n n

n

n

−ρL−m1 αLmn κ−1 n κ −1

# .

(12)

1 Note that κ−1 can be acquired from K− B (L( n )). As follows from Equation (12), we can obtain

−1 1 T −1 m K− = κ + α ρ α . n m Ln L L bw n

n

(13)

Sensors 2017, 17, 1815

6 of 16

Sensors 2017, 17, 1815

6 of 16

−1 For the purpose of finding a recursive formula of K bw to reduce the computational complexity, −1 : we evaluate the Woodbury matrix identity expressing K−bw For the purpose of finding a recursive formula of Kbw1 to reduce the computational complexity, −1 we evaluate the Woodbury matrix identity−1 expressing K−1 : −1 K bw = κ + αLTn ρL−m1 αLm = κ −1 − κ −1αLTn bwρLm + αLm κ −1αLTn αLm κ −1 . (14) n n n n n n n n −1 −1 1 T K−1 = κ + αLTn ρL−m1 αLmn = κ−1 − κ−th1 αLTnn ρprocessing + αLmn κ−line α Ln α m κth−1 . (14) Lm n n nin theLnm Similarly, bw the kernel Gram causal sliding n matrix at the ( n + 1) data

(

)

(

)

rectangle window at Lmn+1 can be calculated as Similarly, the kernel Gram matrix at the (n + 1)th data processing line in the mth causal sliding rectangle window at Lm K ( Lbe ( n + 1calculated ) ) = k ( X ( L ( n + 1as ) ) , X ( L ( n + 1) ) ) = ( X ( L ( n + 1) ) X ( L ( n + 1) ) ) n+1 can d

T

B

B

 k ( Lmn −b +1 , Lmn −b +1 )  m ( Lmn −n +1 ) b + 2 , Ln − b1 n 1 k X=B k L   m  m m Lm n−b+1 , Ln−b+1k ( Ln , Ln −b +1 ) k

B

B

B

 k ( Lmn −b +1 , Lmn −1 )

k ( Lmn −b +1 , Lmn )   m m  fw βLTn   k (L , XB L n, L ) 1k ( Ln −b + 2 , Ln )X=B K L n n  1 T XB   β Lmn γ Lmn      m m m m m Lm  Lmn1, ,LmnL n−1 k ( Ln ,kLn )Ln−b+1 , Ln n−kb(+ −1 ) "

d (15) KB (L( + )) = ( ( ( + )) ( ( + ))) = ( ( + )) (L(n + 1))  k ··· #   m m T (15) k L , L · · · k Lm , Lm k Lm , Lm   K β n n n − 1 n − b + 2 n − b + 1 n − b + 2 n − b + 2 f w Ln =  m m = . . . . γ = k L , L × ( w − a β) mmatrix, where Lm n ..n is a matrix . . of size .. a × a , and β Lmn ..is an a  γLmn defined by n  Ln m , Lm m , Lm βLm =  k ( Lmn , Lkmn −L , kb+ thek(L matrix ( L1mn , Lmn−2·)·,·k ( Lmn ,kLmLn−mn1 ),L.mn−Then, n n ) inversion lemma is repeatedly 1 n− 1 ) , b +n n

(

)

m n −b + 2

m n −1

−1 exploited calculate + 1) ) : of size a × a, and β m is an a × (w − a) matrix, defined by where γ mto = k (Lm , LmK)B is( La( nmatrix Ln Ln m m n n m m , Lm β Lmn = k Ln , Ln−b+1 , · · · , k Ln , Lm , k L . Then, the matrix inversion lemma is repeatedly n n −2 n −1 −1  K −1 + λ γ + β λ −1 λT λ γ + β λ −1  T −1   exploited to calculate KB (L(nK+ β:Lm Lmn+1 Lmn +1 Lmn+1 Lmn +1 fw 1))  fw  n +1  = K B−1  L ( n + 1)  =   − 1 − 1   β Lm #−1γ Lm  (16) " γ Lm m + β Lm λm λT−1 T γLm m+ β Lm λ m .−1 n +1  T n+1 −1 K + λ γ + β λ λ λ γ + λ n +1 n +1L n +L n +β 1 1 L K β   m L f w   fw Ln +1 −1 n +1 n +1 n +1 n +1 KB [L(n + 1)] = −1 T = −1 −1  . (16) λ = − K fwββLLmnm+1 γLmn+1 γLm + β Lm λ λT γLm + β Lm λ

(

n +1

(

n +1

1 T λ = −K− f w β Lm

)

)

( (

n +1

) )

n +1

n +1

−1 n +1 Taking advantage of Equation (14), K bw is obtained by κ −1 , and K B−1 ( L ( n + 1) ) is derived by

−1Equations (11) and 1 −1 (15), it−can K −fw1 Taking through Equationof(16). According be found that K = K −1. advantage Equation (14), Kto bw is obtained by κ , and K B (L( n + 1)) is derivedbwby K ffww 1 −1 K f w . Above through Equation (16).inAccording it can beupdated found that KbwK= K B−Equations canand be (15), recursively from Above all, as shown Figure 3, to ( L ( n + 1) )(11) B ( L ( n ) ) , and −1 −1 all, as shown in Figure 3, KB (L(n + 1)) can be recursively updated from KB (L(n)), and thus the thus the calculation of K B ( L ( n + 1−) )1 and K B−1 ( L ( n + 1) ) is avoided. calculation of KB (L(n + 1)) and KB (L(n + 1)) is avoided.

K

−1 B

( L ( n))

(12 )

κ −1

(14 )

−1 K bw

(11)

(15)

K −fw1

(16 )

K B−1 ( L ( n + 1) )

Figure 3. Recursive update process of the kernel Gram matrix inversion.

3. Description of Hyperspectral Datasets 3.1. Synthetic Dataset In Figure 4a, 4a, the the real real Cuprite Cuprite image, image, which which was was collected collected by by the the Airborne Airborne Visible/Infrared Visible/Infrared Imaging Spectrometer sensor (AVIRIS) from Nevada in 1997, is used to design the synthetic synthetic dataset. dataset. It is a 224-band 224-band image image with with aa size size of of 350 350 × × 350 pixels. After removing the water absorption and low SNR bands, bands areare used for the The five spectralspectral signatures, namely bands,only only189 189 bands used for experiments. the experiments. Thetargeted five targeted signatures, alunite buddingtonite (B), calcite (C), kaolinite (K), and (M),(M), marked by by circles in namely (A), alunite (A), buddingtonite (B), calcite (C), kaolinite (K), muscovite and muscovite marked circles Figure 4b,4b, areare used to simulate 25 anomaly targets, withwith five panels in each row simulated by theby same in Figure used to simulate 25 anomaly targets, five panels in each row simulated the targeted spectralspectral signature, and eachand column superposed by a different background same targeted signature, each iscolumn is superposed by proportion a different of proportion of interference. The size of The the panels the left column tocolumn the right 4 × 4, is 2× 1, background interference. size of from the panels from the left to column the rightiscolumn 4 ×2,4,12×× 2, and 1 × 1, respectively. These 25 panels are then inserted in a synthetic background image with the 1 × 1, and 1 × 1, respectively. These 25 panels are then inserted in a synthetic background image with size of 200 × 200 pixels, as as shown inin Figure the size of 200 × 200 pixels, shown Figure5a, 5a,which whichisissimulated simulatedby bythe thesample sample mean mean of of the real image scene scene that thatisiscorrupted corruptedby byGaussian Gaussian noise achieve SNR of 20:1 is defined in [28]. noise to to achieve thethe SNR of 20:1 thatthat is defined in [28]. The The ground truth of the synthetic data is presented in Figure 5b. ground truth of the synthetic data is presented in Figure 5b.

Sensors 2017, 17, 1815 Sensors Sensors 2017, 2017, 17, 17, 1815 1815 Sensors 2017, 17, 1815

7 of 16 77 of of 16 16 7 of 16

(a) (a) (a)

(b) (b) (b)

Figure Cuprite Airborne Visible/Infrared Imaging Spectrometer sensor (AVIRIS) image; (b) Figure 4. 4. (a) (a) Cuprite Airborne Visible/Infrared Imaging Spectrometer sensor (AVIRIS) image; (b) Figure 4. (a)Cuprite CupriteAirborne AirborneVisible/Infrared Visible/Infrared Imaging Spectrometer sensor (AVIRIS) image; Figure 4. (a) Imaging Spectrometer sensor (AVIRIS) image; (b) Spatial positions of A, B, C, K, and M. Spatial positions of A, B, C, K, and M. (b) Spatial positions of B, A,C, B,K, C,and K, and Spatial positions of A, M. M.

(a) (a) (a)

(b) (b) (b)

Figure (a) Synthetic Synthetic image; Figure 5. 5. Synthetic Synthetic dataset. dataset. (a) image; (b) (b) Ground Ground truth truth map. map. Figure 5. Synthetic dataset. (a) Synthetic image; (b) Ground truth map.

3.2. 3.2. Pavia Pavia Center Center Dataset Dataset 3.2. Pavia Center Dataset 3.2. Pavia Center Dataset A A Pavia Pavia center center hyperspectral hyperspectral image image scene scene has has been been collected collected by by the the Reflective Reflective Optics Optics System System A Pavia center hyperspectral image scene has been collected by the Reflective Optics System A Pavia center hyperspectral image scene has been collected by the Reflective Optics System Imaging Spectrometer sensor (ROSIS) from the Pavia city center in northern Italy. As shown in Figure 6, Imaging Spectrometer sensor (ROSIS) from the Pavia city center in northern Italy. As shown in Figure Imaging Spectrometer sensor (ROSIS) from the Pavia city center in northern Italy. As shown in Figure 6, 6, Imaging Spectrometer sensor (ROSIS) from the Pavia city center in from northern Italy. As μ shown in aFigure 6, m it consists of 115 bands at about 4 nm apart in the spectral region 0.43 to 0.86 with spatial μµm m with it consists consists of with aa spatial spatial it of 115 115 bands bands at at about about 44 nm nmapart apartin inthe thespectral spectralregion regionfrom from0.43 0.43toto0.86 0.86 μ it consists of 115 m. bands at about 4 nm715 apart in the spectral region from 0.43As to 0.86 m with a spatial resolution pixels in the entire image scene. in 7a, of 1.3 1.3 m. There are 1096 715 pixels in in thethe entire image scene. As shown shown in Figure Figure 7a, we we resolution of m.There Thereare are1096 1096××× 715 pixels entire image scene. As shown in Figure 7a, resolution of 1.3of m.itThere are size 1096of× 115 715 pixels in the to entire image scene. As shown inofFigure 7a, we select a subarea with the × 115 pixels conduct experiments. A total 102 spectral select a subarea of itof with the size 115of×115 115 × pixels to conduct experiments. A total of we select a subarea it with the of size 115 pixels to conduct experiments. A 102 totalspectral of 102 select aare subarea of itanalysis with theafter size of 115 × 115 pixels to conduct experiments. A total of 102 spectral bands used for the water bands and The bands are usedare forused analysis after removing removing the atmospheric atmospheric water water bandsbands and low low SNR bands. The spectral bands for analysis after removing the atmospheric andSNR low bands. SNR bands. bands are used for analysis after removing the atmospheric water bands and low SNR bands. The ground-truth map is displayed in Figure 7b. ground-truth mapmap is displayed in Figure 7b. 7b. The ground-truth is displayed in Figure ground-truth map is displayed in Figure 7b.

Figure Figure 6. 6. Pavia Pavia center center hyperspectral hyperspectral image image scene. scene. Figure 6. 6. Pavia Pavia center center hyperspectral hyperspectral image image scene. scene. Figure

Sensors 2017, 17, 1815 Sensors 2017, 17, 1815 Sensors 2017, 17, 1815 Sensors 2017, 17, 1815

8 of 16 8 of 16 8 of 16 8 of 16

(a) (a) (a)

(b) (b) (b)

Figure Figure 7. 7. (a) (a) Pavia Pavia Center Center dataset; dataset; (b) (b) Ground Ground truth truth map. map. Figure 7. (a) Pavia Center dataset; (b) Ground truth map.

3.3. San Diego Airport Dataset 3.3. San San Diego Diego Airport Airport Dataset Dataset 3.3. Figure 8 shows a real hyperspectral image, which was collected by the AVIRIS from San Diego, Figure 88 shows shows a real real hyperspectral hyperspectral image, was by AVIRIS San Diego, hyperspectral image, image, which which was was collected collected by by the the AVIRIS AVIRIS from from San San Diego, Diego, Figure CA, USA. There are aa total of 224 bands rangingwhich from 0.4 tocollected 1.8 μm withthe a size of 400from × 400 pixels. In CA, USA. There are a total of 224 bands ranging from 0.4 to 1.8 μm with a size of 400 × 400 pixels. In USA. There are a total of 224 bands ranging from 0.4 to 1.8 µm with a size of 400 × 400 pixels. CA, study, USA. There are a total of 224 bands ranging from 0.4 towere 1.8 μm with a size 400the × 400 pixels. In this atmospheric water bands and low SNR bands removed, and of only surplus 126 this study, atmospheric water bands andand lowlow SNR bands were removed, andand onlyonly the surplus surplus 126 In this study, atmospheric water bands SNR bands were removed, the surplus this study, atmospheric water bands and low SNR bands were removed, and only the 126 bands are selected for the experiments. The San Diego airport dataset shown in Figure 9a is one bands are selected for the experiments. The San Diego airport dataset shown in Figure 9a is one 126 bands are selected for the experiments. The San Diego airport dataset shown in Figure 9a is bands are for hyperspectral the experiments. TheItSan Diegoofairport shown Figure bands 9a is one subarea of selected the AVIRIS image. consists 60 × 60dataset pixels with 126inspectral for subarea of of the the AVIRIS AVIRIS hyperspectral hyperspectral image. image. It It consists consists of of 60 60 × × 60 60 pixels with 126 126 spectral spectral bands bands for for 60 pixels pixels with subarea AVIRIS hyperspectral image. consists of 60 × experiment. Figure 9b displays the ground truth map of it. experiment. Figure Figure 9b 9b displays displays the the ground ground truth experiment. truth map map of of it. it.

Figure 8. AVIRIS Hyperspectral Image. Figure 8. AVIRIS Hyperspectral Image. Figure 8. 8. AVIRIS AVIRIS Hyperspectral Hyperspectral Image. Figure Image.

(a) (a) (a)

(b) (b) (b)

Figure 9. (a) San Diego airport dataset; (b) Ground truth map. Figure 9. 9. (a) San San Diego airport airport dataset; (b) (b) Ground truth truth map. Figure Figure 9. (a) (a) San Diego Diego airport dataset; dataset; (b) Ground Ground truth map. map.

4. Experimental Results and Analysis 4. Experimental Experimental Results Results and and Analysis Analysis 4. 4. Experimental Results and Analysis In this section, three groups of HSI datasets, including one synthetic image and two real images, In this section, three groups of HSI HSI datasets, datasets, including including one one synthetic synthetic image image and and two two real real images, images, In this section, three groups of are utilized for experiments to demonstrate the performance of synthetic the PLP-KRXD. In this section, three groups of HSI datasets, including one image and two real images, are utilized for experiments to demonstrate the performance of the PLP-KRXD. are utilized for experiments to demonstrate the performance of the PLP-KRXD. are utilized for experiments to demonstrate the performance of the PLP-KRXD. 4.1. Optimum Kernel Parameter on the PLP-KRXD 4.1. Optimum Optimum Kernel Kernel Parameter Parameter on on the the PLP-KRXD PLP-KRXD 4.1. This group of experiments is conducted to find the optimum kernel parameter d on the PLPd on This group group of of experiments experiments is is conducted conducted to to find the the optimum optimum kernel kernel parameter parameter d on the the PLPPLPThis KRXD. According to cross-validation, the size find of the causal sliding rectangle windows on the KRXD. According According to to cross-validation, cross-validation, the the size size of of the the causal causal sliding sliding rectangle rectangle windows windows on on the the KRXD.

Sensors 2017, 17, 1815

9 of 16

Sensors 2017, 17, 1815

9 of 16

4.1. Optimum Kernel Parameter on the PLP-KRXD synthetic dataset, the Pavia Center dataset, and the San Diego Airport dataset are chosen as (18,5 ) , This group of experiments is conducted to find the optimum kernel parameter d on the PLP-KRXD. (15,5) , and (12,7 ) , respectively. According to cross-validation, the size of the causal sliding rectangle windows on the synthetic The receiver operating characteristic (ROC) [29] curves employed as one of dataset, the Pavia Center dataset, and the San Diego Airportare dataset are chosen asthe 5), (15, 5), (18,frequently criteria in anomaly detection. The detection probability ( Pd ) and false alarm rate andused (12, 7evaluation ), respectively. operating Pf ) receiver can be defined as characteristic (ROC) [29] curves are employed as one of the frequently ( The used evaluation criteria in anomaly detection. The detection probability ( Pd ) and false alarm rate Pf Number of target pixels detected  can be defined as ,   Pd = Number target detected of of total realpixels targets pixels,  Pd = Number  Number of total real targets pixels (17)  (17) of background pixels classified as target  P = Numbere Numbere of background pixels classified as target  . .  fPf = Numberofoftot total pixels ininthe Number al pixels theimage image

TheThe area under thethe curve (AUC) criterion totoreasonably reasonably area under curve (AUC)isisemployed employedas asanother another evaluation evaluation criterion estimate the accuracy of anomaly detection with respect to the ROC analysis. In detail, a larger AUC estimate the accuracy of anomaly detection with respect to the ROC analysis. In detail, a larger AUC value represents a higher accuracy ofof anomaly thedifferent differentAUCs AUCs value represents a higher accuracy anomalydetection. detection. Figure Figure 10 10 illustrates illustrates the withwith diverse polynomial kernel parameter d d = 1, 2, 3, 4, 5 for the synthetic dataset, the Pavia Center ( ) diverse polynomial kernel parameter d ( d = 1, 2,3, 4,5 ) for the synthetic dataset, the Pavia dataset, and the San Diego Airport dataset, respectively. As can be seen in this figure, the AUC of all Center dataset, and the San Diego Airport dataset, respectively. As can be seen in this figure, the AUC three datasets a maximum whenvalue d is equal then go into a downward trend. d and of HSI all three HSIreaches datasets reaches a value maximum whento 2, is equal to 2, and then go into a For downward the synthetic dataset and the Pavia Center dataset, the AUC of them has a modest decline, butait trend. For the synthetic dataset and the Pavia Center dataset, the AUC of them has declines sharply after thesharply maximum the San Airport Hence,Airport the polynomial modest decline, butreaching it declines after for reaching theDiego maximum fordataset. the San Diego dataset. kernel parameter d is relatively other words, Hence, the polynomial kernel sensitive parametertodthe is complex relatively hyperspectral sensitive to thedataset. complexInhyperspectral when the distribution of the when data features is simple,ofthe d has aisweaker on the separation d has a dataset. In other words, the distribution theselected data features simple,effect the selected of the background weaker effect onand the objects. separation of the background and objects. 1

Area Under Curve (AUC)

0.98 0.96

Synthetic dataset Pavia Center dataset San Diego Airport datast

0.94 0.92 0.9 0.88 0.86 0.84 0.82 0.8

1

2

3

4

Polynomial kernel parameter d

5

Figure Areaunder under the curve thethe progressive line line processing Kernel-RX detectordetector (PLPFigure 10. 10.Area curve(AUC) (AUC)of of progressive processing Kernel-RX d KRXD) with a different polynomial kernel parameter on three datasets: Synthetic dataset, Pavia (PLP-KRXD) with a different polynomial kernel parameter d on three datasets: Synthetic dataset, Center dataset, and San Diego Airport dataset. Pavia Center dataset, and San Diego Airport dataset.

Effects from Changing Window Sizeononthe thePLP-KRXD PLP-KRXD 4.2. 4.2. Effects from thethe Changing Window Size group of experiments is to investigate that the detection effect of the PLP-KRXD is related ThisThis group of experiments is to investigate that the detection effect of the PLP-KRXD is related to to the size of the causal sliding rectangle windows. In these experiments, a and b of the causal the size of the causal sliding rectangle windows. In these experiments, a and b of the causal sliding sliding rectangle window are separately chosen in a range from 12 to 18 with a step of 3 and 5 to 7 rectangle window are separately chosen in a range from 12 to 18 with a step of 3 and 5 to 7 with a step with a step of 1 on the three HSI datasets. By means of cross-validation, the polynomial kernel of 1 on the three HSI datasets. By means of cross-validation, the polynomial kernel parameter d of the parameter d of the synthetic dataset, the Pavia Center dataset and the San Diego Airport dataset synthetic dataset, the Pavia Center dataset and the San Diego Airport dataset are all manually set to 2. are all manually set to 2. Table 1 reveals the output AUCAUC valuesvalues of theof PLP-KRXD for thefor three datasets with a changing Table 1 reveals the output the PLP-KRXD theHSI three HSI datasets with a causal sliding rectangle window size. From Table 1, the maximum values obtained in the synthetic changing causal sliding rectangle window size. From Table 1, the maximum values obtained in the dataset, the Pavia Center dataset, and the San Diego Airport dataset are respectively 0.9966, 0.9979, synthetic dataset, the Pavia Center dataset, and the San Diego Airport dataset are respectively 0.9966, and0.9979, 0.9458and at the detection windows of , , and . The residuals between the best 18, 5 15, 5 12, 7 ( ) ( ) ( ) 0.9458 at the detection windows of (18,5 ) , (15,5 ) , and (12,7 ) . The residuals between AUC and the worst AUC of the PLP-KRXD for the three HSI datasets are given in Table 3. Compared the best AUC and the worst AUC of the PLP-KRXD for the three HSI datasets are given in Table 3. to the San Diego Airport dataset, the residuals of the synthetic and Pavia Center datasets are smaller, Compared to the San Diego Airport dataset, the residuals of the synthetic and Pavia Center datasets

Sensors 2017, 17, 1815

10 of 16

Sensors 2017, 17, 1815

10 of 16

which means they can weaken the influence resulting from the causal sliding rectangle window size. For the Diegowhich Airport dataset, dueweaken to the distribution the complex background, its rectangle robustness is areSan smaller, means they can the influence of resulting from the causal sliding relatively weaker the other two HSI datasets. window size. than For the San Diego Airport dataset, due to the distribution of the complex background, its robustness is relatively weaker than the other two HSI datasets. Table 1. AUC performance of the PLP-KRXD with a changing window size for the three hyperspectral Table 1. AUC performance of thedataset, PLP-KRXD with a changing window size for the three hyperspectral imagery (HSI) datasets: Synthetic Pavia Center dataset, and San Diego Airport dataset. imagery (HSI) datasets: Synthetic dataset, Pavia Center dataset, and San Diego Airport dataset.

Windows

Synthetic Dataset

Windows

Synthetic Dataset

(12,5) (12,5) (15,5) (15,5) (18,5) (18,5) (12,6) (12,6) (15,6) (18,6) (15,6) (12,7) (18,6) (15,7) (12,7) (18,7) (15,7) Residual

Pavia Center Dataset

Pavia Center Dataset

0.9955 0.9955 0.9956 0.9956 0.9966 0.9966 0.9956 0.9956 0.9886 0.9964 0.9886 0.9955 0.9964 0.9964 0.9955 0.9965 0.9964 0.0080

(18,7) Residual

San Diego Airport

San Diego Airport

0.9945 0.9945 0.9979 0.9979 0.9891 0.9891 0.9946 0.9946 0.9943 0.9892 0.9943 0.9945 0.9892 0.9946 0.9945 0.9896 0.9946 0.0088

0.9965 0.0080

0.9412 0.9412 0.9363 0.9363 0.9283 0.9283 0.9402 0.9402 0.9231 0.9321 0.9231 0.9458 0.9321 0.9426 0.9458 0.9413 0.9426 0.0277

0.9896 0.0088

4.3. Detection Performance of PLP-KRXD

0.9413 0.0277

4.3.comparison Detection Performance PLP-KRXD In with theofKRXD, several experiments are conducted on the three HSI datasets to demonstrate the effectiveness ofKRXD, PLP-KRXD. these experiments, by cross-validation, polynomial In comparison with the severalIn experiments are conducted on the three HSIthe datasets to of dataset, PLP-KRXD. In these experiments, the kerneldemonstrate parameter the d foreffectiveness the synthetic the Pavia Center dataset, by andcross-validation, the San Diego Airport d for polynomial kernel the the synthetic dataset, thecausal Pavia sliding Center dataset, andwindow the San size dataset are all set to 2.parameter At the same time, corresponding rectangle Diego Airport dataset are all set to 2. At the same time, the corresponding causal sliding rectangle chosen for these three HSI datasets are set to (18, 5),(15, 5), and (12, 7), respectively. For the KRXD, window size chosen for these three are set to (18,5 (15,5) , and (12,75/11, ) , respectively. the polynomial kernel parameter d andHSI thedatasets dual window size are) ,chosen as 2 and respectively. d and For the 11 KRXD, the polynomial parameter the dual for window size are chosen as 2the andPavia Figure illustrates the ROCkernel curves of the two detectors the synthetic dataset, 5 /11 , respectively. Center dataset, and the San Diego Airport dataset, respectively. The ROC curves of the KRXD and Figure 11 illustrates the ROC curves of the two detectors for the synthetic dataset, the Pavia PLP-KRXD correspond to their best AUC. As can be observed for the synthetic dataset in Figure 11a, Center dataset, and the San Diego Airport dataset, respectively. The ROC curves of the KRXD and the detection precision of the PLP-KRXD is basically the same as that of the KRXD, and it has a higher PLP-KRXD correspond to their best AUC. As can be observed for the synthetic dataset in Figure 11a, probability of detection under lower false alarm the probability. Forofthe dataset shown the detection precision of thethe PLP-KRXD is basically same as that the Pavia KRXD,Center and it has a higher in Figure 11b, the ROC results of the KRXD and the PLP-KRXD are comparable, even if the KRXD probability of detection under the lower false alarm probability. For the Pavia Center dataset shown is slightly better the PLP-KRXD. ofand them achieveare a probability detection of 1 with a in Figure 11b,than the ROC results of theBoth KRXD thecan PLP-KRXD comparable,of even if the KRXD is better rate. than the Both Airport of them can achieve a probability detection of 1 with a and lowerslightly false alarm ForPLP-KRXD. the San Diego dataset shown in Figureof11c, the PLP-KRXD lower are falsealmost alarm rate. For the Diego Airport dataset and the KRXD identical in San detection accuracy. Thisshown result in is Figure due to11c, the the factPLP-KRXD that the recursive the KRXD are almost identical in detection accuracy. This result is due to the fact that the recursive process of the PLP-KRXD is derived from the KRXD by applying identity without information leaks. the PLP-KRXD is derived from KRXD by applying Someprocess subtle of distinctions occur between thethe PLP-KRXD and theidentity KRXDwithout becauseinformation the KRXDleaks. needs to Some subtle distinctions occur between the PLP-KRXD and the KRXD because the KRXD needs to implement the pseudoinverse of the Gram matrix for every pixel, while the PLP-KRXD cancels this implement the pseudoinverse of the Gram matrix for every pixel, while the PLP-KRXD cancels this operation via the causal sliding rectangle window. 1

1

0.9

0.8

Probability of detection


operation via the causal sliding rectangle window.

0.8

0.7

KRXD PLPKRXD

0.6

0.5

0

0.02

0.04

0.06

False alarm rate

0.08

0.6

0.4

KRXD PLPKRXD

0.2

0.1

0

0

(a)

0.01

0.02

0.03

0.04

(b)

Figure 11. Cont.

0.05

False alarm rate

0.06

0.07

0.08

Sensors 2017, 17, 1815 Sensors 2017, 17, 1815

11 of 16 11 of 16

Sensors 2017, 17, 1815

11 of 16 1 1

0.8



0.8

0.6

0.6

0.4 0.4

KRXD KRXD PLPKRXD PLPKRXD

0.2 0.2

0

0

0

0

0.2

0.2

0.4

0.4

0.6

0.6

False alarm rate

False alarm rate (c)

0.8

0.8

1

1

(c)

Figure 11. Receiver operation characteristic (ROC) curves obtained by the Kernel-RX detector (KRXD) Figure 11. 11. Receiver Receiver operation operation characteristic characteristic (ROC) (ROC) curves curves obtained obtained by by the the Kernel-RX Kernel-RX detector detector (KRXD) (KRXD) Figure and the PLP-KRXD on three HSI datasets. (a) Synthetic dataset; (b) Pavia Center dataset; (c) San Diego and the the PLP-KRXD PLP-KRXD on on three three HSI HSI datasets. datasets. (a) (a) Synthetic Synthetic dataset; dataset; (b) (b) Pavia Pavia Center Center dataset; dataset; (c) (c) San San Diego Diego and Airport dataset.

Airport Airport dataset. dataset. Figure 12 presents the detection maps of the KRXD and the PLP-KRXD on the three HSI datasets, respectively. Through thereofisthe no KRXD appreciable visual differenceon between the HSI results of Figure 12 presents the comparison, detection maps and the PLP-KRXD the datasets, KRXD the PLP-KRXD on the three three HSI datasets, the two Through anomaly detectors for the synthetic dataset and the Pavia Center dataset. For the San Diego respectively. comparison, there is no appreciable visual difference between the results of Through comparison, there Airport dataset, the detection result obtained by the PLP-KRXD achieves better visual inspection than the two anomaly detectors for the synthetic dataset and the Pavia Center dataset. For the San Diego does the KRXD with regard to the detection maps. In order to make more visualized descriptions for Airport dataset, the detection result obtained by the PLP-KRXD achieves better visual inspection than the data and expose more detailed information, Figure 13 illustrates the three-dimensional (3D) plots does the KRXD with regard to the detection maps. In order to make more visualized descriptions for of the KRXD and the PLP-KRXD on the three HSI datasets. It can clearly be seen that the 3D results the and expose more detailed information, Figure 13 illustrates the three-dimensional (3D) the data data exposeand more detailed information, theare three-dimensional (3D) plots plots of and the KRXD the PLP-KRXD for each Figure group 13 of illustrates HSI datasets extremely comparable; of the KRXD and the PLP-KRXD on the three HSI datasets. It can clearly be seen that the 3D results of the particularly, KRXD and the PLP-KRXDhas onbetter the three HSI datasets. It canand clearly be seen that the theanomalies 3D results of the PLP-KRXD background suppression accurately detects of and the PLP-KRXD forgroup eachofgroup of HSIare datasets are comparable; extremely comparable; the the KRXD PLP-KRXD for each HSI datasets extremely particularly, onKRXD theand Santhe Diego Airport dataset.

particularly, thehas PLP-KRXD has better background anddetects accurately detects the on anomalies the PLP-KRXD better background suppressionsuppression and accurately the anomalies the San on the San Diego Airport dataset. (a) Synthetic dataset (b) Pavia Center dataset (c) San Diego Airport dataset Diego Airport dataset. (a) Synthetic dataset

KRXD

KRXD

PLPKRXD

(b) Pavia Center dataset

KRXD

KRXD

PLPKRXD

(c) San Diego Airport dataset

KRXD

KRXD

PLPKRXD

Figure 12. The detection maps of the KRXD and the PLP-KRXD on the three HSI datasets. (a) Synthetic dataset; (b) Pavia Center dataset; (c) San Diego Airport dataset.

PLPKRXD

PLPKRXD

PLPKRXD

Figure Figure 12. 12. The The detection detection maps maps of of the the KRXD KRXD and and the the PLP-KRXD PLP-KRXD on on the the three three HSI HSI datasets. datasets. (a) (a) Synthetic Synthetic dataset; (b) Pavia Center dataset; (c) San Diego Airport dataset. dataset; (b) Pavia Center dataset; (c) San Diego Airport dataset.

Sensors 2017, 17, 1815 Sensors 2017, 17, 1815

12 of 16 12 of 16

(a) Synthetic dataset

(b) Pavia Center dataset

(c) San Diego Airport dataset

KRXD

KRXD

KRXD

PLPKRXD

PLPKRXD

PLPKRXD

Figure Figure 13. 13. The Thethree-dimensional three-dimensional(3D) (3D)plots plotsof ofthe thedetection detectionresults resultsobtained obtainedby bythe theKRXD KRXDand andthe the PLP-KRXD on three HSI datasets. (a) Synthetic dataset; (b) Pavia Center dataset; (c) San Diego Airport PLP-KRXD on three HSI datasets. (a) Synthetic dataset; (b) Pavia Center dataset; (c) San Diego dataset. Airport dataset.

Figure 14 provides the line-varying detection maps produced by the PLP-KRXD as it was Figure 14 provides the line-varying detection maps produced by the PLP-KRXD as it was implemented in real-time on the synthetic dataset, the Pavia Center dataset and the San Diego Airport implemented in real-time on the synthetic dataset, the Pavia Center dataset and the San Diego Airport dataset, respectively. As time progresses, these progressive detection maps are beneficial to visual dataset, respectively. As time progresses, these progressive detection maps are beneficial to visual inspection at varying levels of background suppression. For example, in the Pavia Center dataset, the inspection at varying levels of background suppression. For example, in the Pavia Center dataset, first three weak anomalies are detected in a better visual assessment. From Figure 14j, k, l, however, the first three weak anomalies are detected in a better visual assessment. From Figure 14j, k, l, however, as the detection process progresses, these weak anomalies are easily overwhelmed by the subsequent as the detection process progresses, these weak anomalies are easily overwhelmed by the subsequent strong anomalies and become dimmer than before. Obviously, this phenomenon cannot be observed strong anomalies and become dimmer than before. Obviously, this phenomenon cannot be observed using the KRXD that obtains its result in the light of the final detection maps. using the KRXD that obtains its result in the light of the final detection maps.

Sensors 2017, 17, 1815

13 of 16

Sensors 2017, 17, 1815

13 of 16

(a) n = 20

(b) n = 50

(c) n = 85

(d) n = 120

(e) n = 155

(f) n = 200

(g) n = 20

(h) n = 40

(i) n = 60

(j) n = 80

(k) n = 100

(l) n = 150

(m) n = 10

(n) n = 20

(o) n = 30

(p) n = 40

(q) n = 50

(r) n = 60

Figure 14. Real-time progressive procedures of PLPKRXD on the three HSI datasets. (a–f) Synthetic Figure 14. Real-time progressive procedures of PLPKRXD on the three HSI datasets. (a–f) Synthetic dataset; (g–l) Pavia Center dataset; (m–r) San Diego Airport dataset. dataset; (g–l) Pavia Center dataset; (m–r) San Diego Airport dataset.

4.4. Computational Analysis of the PLP-KRXD 4.4. Computational Analysis of the PLP-KRXD Computational analysis is an essential process in real-time processing. Theoretically speaking, Computational analysis is an essential process in real-time processing. Theoretically speaking, the computational complexity is mainly related to the number of multiplication operations. In this the computational complexity is mainly related to the number of multiplication operations. In this paper, it is worth noting that the computational complexity of PLP-KRXD is mainly determined by paper, it is−1worth noting that the computational complexity of PLP-KRXD is mainly determined in order to intuitively analyze the time efficiency of PLP-KRXD, Table 2 presents K and K . Hence, 1 byB KB andB K− B . Hence, in order to intuitively analyze the time efficiency of PLP-KRXD, Table 2 the computational complexity of the kernel matrix its inversion in both the KRXD the presents the computational complexity of theGram kernel Gramand matrix and its inversion in both theand KRXD PLP-KRXD. As can be inseen Table by virtue of recursive processing in the the and the PLP-KRXD. As seen can be in 2, Table 2, by virtue of recursive processing in PLP-KRXD, the PLP-KRXD, −1 2 − 1 2 and in the KRXD is reduced to and from computational complexity of K K O ( a ω ⋅ J ) O ( a ω ) B the computational complexity ofB KB and K B in the KRXD is reduced to O ( aω · J )and O ( aω ) from 3 andOO , respectively, where the number of background data vectors sample O((ω ω 22 ⋅· JJ )) and ω represents O (ω(ω3 ),) respectively, where ωrepresents the number of background data sample in the processing window.window. Note thatNote a must than ω; hence, complexity vectors in the processing thatbeasmaller must be smaller thanthe hence, the computational ω ;computational of the PLP-KRXD cheaper than that of the KRXD. In the addition, is evident that theevident computational complexity of the isPLP-KRXD is cheaper than that of KRXD.it In addition, it is that the complexity is not directly linked processing time This is mainly because themainly PLP-KRXD, using computational complexity is nottodirectly linked to [8]. processing time [8]. This is because the BIL, detects all of the pixels of aall line one-shot while theoperation KRXD, using BIP, only PLP-KRXD, using BIL, detects of in thea pixels of aoperation line in a one-shot while theprocesses KRXD, using one each time. BIP,pixel processes only one pixel each time. Table 2. Table 2. Computational Computational complexity complexity of of KRXD KRXD and and PLP-KRXD. PLP-KRXD. KRXDKRXD Multiply Operation Multiply Operation

KB K B−1

KB 1 K− B

2 ω 2 ⋅ J ω 3· J

ω

ω3

PLP-KRXD PLP-KRXD

Complexity 2

O (ωO2(ω ⋅ J )3· J ) O(ω )

O (ω 3 )

Multiply Operation Multiply Operation

(ω(ω−−aa ) )· ⋅aJaJ 2aω 2 − 2a2 ω + a3

2aω 2 − 2a2ω + a3

Complexity Complexity O( aω O (·aJω) ⋅ J ) O( aω 2 )

O (aω 2 )

In order to further perform the concrete comparative analysis and better show the efficiency of the PLP-KRXD, we discuss the total processing time in seconds required by running the KRXD and the In order to further perform the concrete comparative analysis and better show the efficiency of PLP-KRXD on the synthetic dataset, the Pavia Center dataset, and the San Diego Airport dataset by the PLP-KRXD, we discuss the total processing time in seconds required by running the KRXD and averaging five runs, respectively. The computer environments used for experiments are all performed the PLP-KRXD on the synthetic dataset, the Pavia Center dataset, and the San Diego Airport dataset on a 64-bit operating system running on an Intel i7-4770k with a CPU of 3.5 Hz and 16 GB of RAM. by averaging five runs, respectively. The computer environments used for experiments are all As shown in Table 3, compared with the KRXD, the PLP-KRXD using recursive update equations performed on a 64-bit operating system running on an Intel i7-4770k with a CPU of 3.5 Hz and 16 GB of RAM. As shown in Table 3, compared with the KRXD, the PLP-KRXD using recursive update

Sensors 2017, 17, 1815

14 of 16

requires far less computing time on the three HSI datasets. It is approximately 40 times faster than the KRXD on the synthetic dataset, and on the San Diego Airport dataset, it is at least 58 times faster. Table 3. Total processing time for three HSI datasets. Total Computing Time (seconds)

Synthetic Dataset Pavia Center Dataset San Diego Airport Dataset

KRXD

PLPKRXD

667.671 242.698 58.132

16.607 4.998 1.131

Speedup 40.204 48.559 58.187

5. Discussion At the theoretical analysis stage, it is clear that the computation of the current kernel Gram matrix 1 −1 inversion K− B (L( n + 1)) needs previous processed information K B (L( n )) by virtue of Figure 2; −1 therefore, it is necessary to set an initial kernel Gram matrix Kinitial . It should be noted that the initial kernel Gram matrix is related to the size of the parallel causal sliding windows. The larger the window size is, the more data sample vectors the initial kernel Gram matrix has. However, a large window size may exceed the time constrains of real-time processing, while small window size easily leads to a singularity in the kernel Gram matrix. It offers a reference for us to determine a proper window size by cross-validation. In the experiment, the pseudoinverse of the low-rank kernel matrix was used to guarantee the stability of the PLP-KRXD. A common alternative approach for inverting singular and near-singular matrices is to regularize them first [30]. Therefore, in place of the kernel Gram matrix KB , we employ a regularization operation K0B = KB + λI for some small λ to ensure that all eigenvalues can avoid approaching zero. According to the analysis of the experiments and simulations, the proposed PLP-KRXD algorithm can maintain the detection accuracy unaltered and accelerate the execution efficiency compared to the KRXD. In each step of deriving the update equations, no information is leaked out, thereby retaining the same detection performance as KRXD while performing real-time processing. The PLP-KRXD recursively updates the kernel Gram matrix and its inversion via a Kalman-like recursive equation, thus avoiding the computation of complex matrices. Besides, parallel causal sliding windows, which only include data samples currently being processed without re-processing previous data samples, are adopted to ensure the causality of real-time processing. Although the PLP-KRXD achieves an over 58-fold speedup, it is sometimes still limited for practical applications. Digital Signal Processors (DSPs) and Graphics Processing Units (GPUs) can be taken into account to speed up real-time processing. 6. Conclusions In this paper, a new real-time anomaly detection of the KRXD algorithm in a BIL fashion is presented. Compared with the KRXD, there are several major advantages that the PLP-KRXD offers. First, the concept of parallel causal sliding windows is developed to ensure the causality of the PLP-KRXD, so that it could be implemented progressively by line in real-time. Second, a new real-time KRXD using BIL via a causal line-varying kernel Gram matrix to detect the anomalies line by line is proposed. Third, weak anomalies can be easily extracted before they are overwhelmed and dominated by the subsequent strong anomaly points, thereby forming data-line varying detection maps. Finally, aiming at the complex computation of the KRXD, this paper takes advantage of the Woodbury matrix identity and the matrix inversion lemma to recursively update the kernel Gram matrix and its inversion to meet the requirement of rapid processing. Through the complexity analysis, for the PLP-KRXD, the computational complexity is significantly reduced and operation efficiency is greatly improved as compared to the KRXD.

Sensors 2017, 17, 1815

15 of 16

Several experiments were conducted on the three HSI datasets to validate the detection performance of the PLP-KRXD algorithm. The experimental results indicate that the PLP-KRXD holds a comparable detection accuracy with the original KRXD and remarkably improves the algorithm’s execution efficiency at the same time. Acknowledgments: This work is partially supported by National Natural Science Foundation of China under Grant (Grant No. 61405041 and 61571145), the Key Program of Heilongjiang Natural Science Foundation (Grant No. ZD201216), the Program Excellent Academic Leaders of Harbin (Grant No. RC2013XK009003), and the Fundamental Research Funds for the Central Universities (Grant No. GK2080260167). Author Contributions: All authors participated in analyzing and writing the paper. Chunhui Zhao conceived and proposed the method. Weiwei Deng conducted and analyzed the experiments and wrote the manuscript. Yiming Yan and Xifeng Yao revised the paper in terms of grammar and writing. All authors reviewed and approved the final manuscript. Conflicts of Interest: The authors declare no conflict of interest.

References 1. 2. 3. 4. 5. 6. 7. 8.

9. 10. 11. 12. 13. 14. 15.

16. 17.

Wang, Q.; Lin, J.; Yuan, Y. Salient Band Selection for Hyperspectral Image Classification via Manifold Ranking. IEEE Trans. Neural Netw. Learn. Syst. 2016, 27, 1–11. [CrossRef] [PubMed] Bajorski, P. Target detection under misspecified models in hyperspectral images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2012, 5, 470–477. [CrossRef] Chang, C. Hyperspectral Data Processing: Algorithm Design and Analysis; John Wiley & Son, Inc.: Hoboken, NJ, USA, 2013. Khazai, S.; Mojaradi, B. A modified kernel-RX algorithm for anomaly detection in hyperspectral images. Arab. J. Geosci. 2015, 8, 1487–1495. [CrossRef] Chang, C.I.; Chiang, S.S. Anomaly detection and classification for hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2002, 40, 1314–1325. [CrossRef] Liu, W.; Chang, C.I. Multiple window anomaly detection for hyperspectral imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2013, 6, 644–658. [CrossRef] Yuan, Y.; Wang, Q.; Zhu, G. Fast Hyperspectral Anomaly Detection via High-Order 2-D Crossing Filter. IEEE Trans. Geosci. Remote Sens. 2015, 53, 620–630. [CrossRef] Matteoli, S.; Diani, M.; Theiler, J. An Overview of Background Modeling for Detection of Targets and Anomalies in Hyperspectral Remotely Sensed Imager. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 2317–2336. [CrossRef] Reed, I.S.; Yu, X. Adaptive multiple-band CFAR detection of an optical pattern with unknown spectral distribution. IEEE Trans. Signal. Process. 1990, 38, 1760–1770. [CrossRef] Bajorski, P. Practical evaluation of max-type detectors for hyperspectral images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2012, 5, 462–469. [CrossRef] Kwon, H.; Der, S.Z.; Nasrabadi, N.M. Adaptive anomaly detection using subspace separation for hypersprctral imagery. Opt. Eng. 2003, 42, 3342–3351. [CrossRef] Gurram, P.; Han, T.; Kwon, H. Hyperspectral anomaly detection using sparse kernel-based ensemble learning. Proc. SPIE 2011, 8048, 289–293. Kwon, H.; Nasrabadi, N.M. Kernel RX-algorithm: A nonlinear anomaly detector for hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2005, 43, 388–397. [CrossRef] Stellman, C.M.; Hazel, G.G.; Bucholtz, F.; Michalowicz, J.V.; Stocker, A.D.; Schaaf, W. Real-time hyperspectral detection and cuing. Opt. Eng. 2000, 39, 1928–1935. [CrossRef] Zhou, J.; Ayhan, B.; Kwan, C.; Eismann, M. New and Fast algorithms for Anomaly and Change Detection in Hyperspectral images. In Proceedings of the International Symposium on Spectral Sensing Research, Springfield, MO, USA, 24 July 2010. Zhou, J.; Kwan, C.; Ayhan, B.; Eismann, M.T. A Novel Cluster Kernel RX Algorithm for Anomaly and Change Detection Using Hyperspectral Images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6497–6504. [CrossRef] Schowengerdt, R. Remote Sensing, Models and Methods for Image Processing; Academic Press: New York, NY, USA, 1997.

Sensors 2017, 17, 1815

18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30.

16 of 16

Jensen, J. Introductory Digital Image Processing: A Remote Sensing Perspective; Prentice-Hall: Englewood Cliffs, NJ, USA, 2004. Chen, S.Y.; Wang, Y.; Wu, C.C.; Liu, C.; Chang, C.I. Real-time causal processing of anomaly detection for hyperspectral imagery. IEEE Trans. Aerosp. Electron. Syst. 2014, 50, 1511–1534. [CrossRef] Chang, C.I.; Wang, Y.L.; Chen, S.Y. Anomaly detection using causal sliding windows. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 3260–3270. [CrossRef] Zhao, C.; Wang, Y.; Qi, B.; Wang, J. Global and local real-time anomaly detectors for hyperspectral remote sensing imagery. Remote Sens. 2015, 7, 3966–3985. [CrossRef] Kailath, T. Linear Systems; Prentice-Hall: Englewood Cliffs, NJ, USA, 1980. Zhao, C.; Yao, X.; Huang, B. Real-time Anomaly Detection Based on a Fast Recursive Kernel RX Algorithm. Remote Sens. 2016, 8, 1011. [CrossRef] Chang, C.I.; Li, Y.; Hobbs, M.C.; Schultz, R.C.; Liu, W. Progressive band processing of anomaly detection in hyperspectral imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 3558–3571. [CrossRef] Chang, C.I.; Li, H.C.; Song, M.P.; Liu, C.H.; Zhang, L.F. Real-Time Constrained Energy Minimization for Subpixel Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 2545–2559. [CrossRef] Rossi, A.; Acito, N.; Diani, M. RX architectures for real-time anomaly detection in hyperspectral images. J. Real-Time Image Process. 2014, 9, 503–517. [CrossRef] Chang, C.I. Hyperspectral Data Processing: Algorithm Design and Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2013. Harsanyi, J.C.; Chang, C.I. Hyperspectral image classification and dimensionality reduction: An orthogonal subspace projection approach. IEEE Trans. Geosci. Remote Sens. 1994, 32, 779–785. [CrossRef] Molero, J.M.; Garzon, E.M.; Garcia, I. Anomaly detection based on a parallel kernel RX algorithm for multicore platforms. J. Appl. Remote Sens. 2012, 6, 623–632. Theiler, J.; Grosklos, G. Problematic projection to the in-sample subspace for a kernelized anomaly detector. IEEE Geosci. Remote Sens. Lett. 2016, 13, 485–489. [CrossRef] © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Progressive Line Processing of Kernel RX Anomaly Detection ... - MDPI

Progressive Line Processing of Kernel RX Anomaly Detection ... - MDPI

Suggest Documents

A Novel Cluster Kernel RX Algorithm for Anomaly and ...

Kernel Methods for Anomaly Detection and Noise

Kernel-Based Anomaly Detection in Hyperspectral

Signal Processing-based Anomaly Detection Techniques - CiteSeerX

Improving the RX Anomaly Detection Algorithm for ...

GNSS Trajectory Anomaly Detection Using Similarity ... - MDPI

Anomaly Detection in Paleoclimate Records Using ... - MDPI

Network Anomaly Detection via Clustering and Custom Kernel in MSVM

Robust Metric based Anomaly Detection in Kernel Feature Space

Recursive Local Summation of RX Detection for Hyperspectral ... - MDPI

On-line Anomaly Detection of Deployed Software: A ... - CiteSeerX

RX-V775WABL - Hifi on Line

On-line sentence processing in Primary Progressive Aphasia

On-line sentence processing in Primary Progressive Aphasia Elena ...

VARIATIONAL INFERENCE FOR ON-LINE ANOMALY DETECTION IN ...

Acoustic Signal Processing for Anomaly Detection in ... - IBM Research

Real-time big data processing for anomaly detection

An image processing approach to traffic anomaly detection

Difficulties and Challenges of Anomaly Detection in Smart Cities - MDPI

Advancements of Data Anomaly Detection Research in ... - MDPI

Potential Benefits of Combining Anomaly Detection with View ... - MDPI

Acoustic Signal Processing for Anomaly Detection in Machine Room ...

UNSUPERVISED ANOMALY DETECTION IN

ANOMALY DETECTION - arXiv