International Journal of Geo-Information
Article

Detection of Moving Ships in Sequences of Remote Sensing Images

Shun Yao 1, Xueli Chang 2,3,*, Yufeng Cheng 1, Shuying Jin 1 and Deshan Zuo 4

1 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; [email protected] (S.Y.); [email protected] (Y.C.); [email protected] (S.J.)
2 Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430079, China
3 School of Resource and Environmental Sciences, Wuhan University, Wuhan 430079, China
4 Beijing Institute of Remote Sensing Information, Beijing 100020, China; [email protected]
* Correspondence: [email protected]

Received: 25 August 2017; Accepted: 30 October 2017; Published: 1 November 2017

Abstract: High-speed agile remote sensing satellites have the ability to capture multiple sequences of images. However, the frame rate is lower and the baseline between images is much longer than in normal image sequences. As a result, the edges and shadows in each image of the sequence vary considerably, which places more demands on the target detection algorithm. Aiming at the characteristics of multi-view image sequences, we propose an approach to detect moving ships on the water surface. Based on marker-controlled watershed segmentation, we use the extracted foreground and background images to segment moving ships, and we obtain the complete shape and texture information of the ships. The inter-frame difference algorithm is applied to extract the foreground object information, while Otsu's algorithm is used to extract the image background. The foreground and background information is fused to solve the problem of interference with object detection caused by the long imaging baseline. The experimental results show that the proposed method is effective for moving ship detection.

Keywords: movement detection; sequences of remote sensing images; watershed; inter-frame differential

1. Introduction

With the development of economic globalization, sea safety and economic conflicts between countries on the seas are becoming more and more prominent. Through highly mobile remote sensing satellites, continuous observations can therefore be carried out over a certain area within a certain range. They can provide real-time, fast and accurate access to dynamic marine information, supporting timely and effective decisions as well as the rapid settlement of marine emergencies. Meanwhile, agile satellites have the ability of rapid maneuvering and multi-view imaging over the same place. Therefore, using multi-image data, we can observe and track ships as targets to obtain their orientation, speed and other dynamic information. This not only provides the basis for decision-making and guidance, but also reflects an important aspect of remote sensing satellite applications [1].

However, long imaging intervals and large base-height ratios exist in the multi-view sequence images captured by agile satellites. Therefore, the appearance of the same surface feature changes greatly between images of the sequence. Figure 1 shows the method of capturing sequence images by agile satellites. Displacement of building shadows, different viewing angles of buildings, and changes in illumination can all exist in one group of sequence images, and they strongly impact the target detection method. At present, mainstream target detection methods are divided into two classes: background difference methods and inter-frame difference methods [2].

The background difference method uses various algorithms to obtain the background image, and then subtracts the background image from the current frame to locate the moving target in the current image. There are many methods for building the background model. For example, considering the fact that the background is always observed in image sequences, the background can be extracted based on partial differential equations [3,4]; using the non-parametric kernel of the Gaussian distribution to estimate the probability of tie-points, an estimation model can be established with a non-parametric density kernel, which can detect objects against a slightly shaking background [5,6]; based on a mixture of k Gaussian distributions, the characteristics of the background distribution can be obtained [7–9]. However, the background model established by these algorithms is highly correlated with the background of the images, which means the background model cannot be established accurately when the scene in every image differs greatly.

Figure 1. The method of capturing image sequences by agile satellites.

The inter-frame difference method is another method of target detection, which uses the difference between two or more frames to obtain the shape, position and other information about the moving object [9–11]. Based on the continuous difference between two or more frames to update the background information, an entire background model is obtained to extract moving objects [10]. Moreover, the moving objects in image sequences obey high-order statistical distributions. Therefore, the moving objects can subsequently be obtained by a high-order statistical operator that filters the differential images in certain areas [11]. Canny feature points can also be extracted from the image sequences and the feature images differentiated, so that the moving targets are extracted from the characteristics of the feature points [12]. Although these algorithms are more adaptable to real environments than the background difference algorithms, they are influenced by the displacement of the moving target. Sometimes they cannot detect the complete target, and only part of the information about the moving target can be obtained. Therefore, this paper proposes a method combining the background difference algorithm and the inter-frame difference algorithm, to effectively solve the problem that large base-height ratios and large ground feature changes can interfere with target detection in multi-view image sequences. The accuracy of target detection is greatly improved, which provides technical support for the rapid discovery and tracking of key targets using agile satellites.


2. Methods

In this paper, the target detection algorithm for moving ships in multi-view image sequences is divided into three parts: foreground extraction, background extraction, and target segmentation. The flow chart is shown in Figure 2. First, histogram matching is performed using three frames of a multi-view image sequence, so the gray level difference between the images is eliminated. Then, computing the difference between these three frames, we obtain partial information about the moving targets. Next, the differential results are filtered using a multi-structuring element operator and binarized, which eliminates the interference caused by movement between frames. After that step, we generate the mask images, which provide partial information on the moving targets as foreground images, and the original images are thresholded using Otsu's method and binarized such that we can extract the background information from the image sequences. Finally, combining the foreground and background images, we use marker-based watershed segmentation to segment the multi-view image sequences, to quickly and completely detect the moving targets.

Figure 2. The algorithm flow: the captured sequence images are histogram matched; inter-frame differencing followed by a multi-structural morphological filter extracts the foreground; Otsu thresholding and binarization extract the background; marker-based watershed segmentation then yields the information on the moving objects.

2.1. Methods of Foreground Extraction

2.1.1. Inter-Frame Difference Algorithm

The inter-frame difference algorithm can quickly obtain dynamic targets in the foreground [13]. The basic principle of the inter-frame difference is that two or more frames of an image sequence are subtracted, so that difference images containing part of the moving target area are obtained. We then use a threshold to binarize these images. Assume that three consecutive frames of multi-view images are $f_{(k-1)}(x,y)$, $f_{(k)}(x,y)$ and $f_{(k+1)}(x,y)$. The first two images are processed as follows:

$$d_{(k-1,k)}(x,y) = f_{(k)}(x,y) - f_{(k-1)}(x,y) \qquad (1)$$

where $d_{(k-1,k)}(x,y)$ is the result of the subtraction. Because of the large changes in the appearance of ground features due to the high mobility of agile satellites, the result of this single subtraction contains a large amount of noise. This interferes greatly with the target information in the difference images, so the third frame is subtracted with the result:

$$d_{(k-1,k,k+1)}(x,y) = f_{(k+1)}(x,y) - f_{(k)}(x,y) - f_{(k-1)}(x,y) \qquad (2)$$

where $d_{(k-1,k,k+1)}(x,y)$ is the resulting image after the two differences. Then, we transform the result $d_{(k-1,k,k+1)}(x,y)$ into a binary image $T(i,j)$:

$$T(i,j) = \begin{cases} 0 & d_{(k-1,k,k+1)}(x,y) \le Th \\ 1 & d_{(k-1,k,k+1)}(x,y) > Th \end{cases} \qquad (3)$$

where $Th$ is the binarization threshold, and 0 and 1 represent the non-target area and the target area, respectively. Because of the following filtering step, at this stage we need as much information on the differential images as possible, so we simply take $Th$ as zero.
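To make the procedure concrete, here is a minimal sketch of Equations (1)–(3) in Python with OpenCV. It is our own illustrative reading, not the authors' code: absolute differences are used so that the sign of the change does not matter, and the second difference is taken between the two pairwise differences.

```python
import cv2
import numpy as np

def three_frame_difference(f_prev, f_curr, f_next, th=0):
    """Three-frame difference of Eqs. (1)-(3) on uint8 grayscale frames."""
    d1 = cv2.absdiff(f_curr, f_prev)   # Eq. (1): first pairwise difference
    d2 = cv2.absdiff(f_next, f_curr)
    d = cv2.absdiff(d2, d1)            # Eq. (2): second difference (our reading)
    # Eq. (3): with Th = 0, every nonzero residual is kept for later filtering
    _, binary = cv2.threshold(d, th, 1, cv2.THRESH_BINARY)
    return binary
```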


2.1.2. Multi-Structuring Element Morphological Filtering

Because of the noise generated by the dynamic change of ground features in the binary images after computing the inter-frame difference, the binary images cannot directly provide information on the moving targets. It is necessary to filter the difference images using morphological filtering operators. Meanwhile, in order to avoid the pitfall of morphological filtering with a single structuring element [13], we perform morphological filtering using several structuring elements. Knowing that building corners, shadows and other features occur mostly in linear and angular combinations, we successively use linear structuring elements at 0°, 45°, 90° and 135° and of different dimensions to perform opening and closing operations, and finally combine the results of these morphological filters, enhancing the target while suppressing the background. The structuring elements are shown in Figure 3. Using different scales and different structuring elements, positive and negative noise signals can be simultaneously suppressed [14]. This maximizes the filtering out of signals not related to target information. The formula of the algorithm is as follows:

$$r = \sum_i \sum_j \omega_i \big( (f \circ b_{ij}) \cdot b_{ij} \big) \qquad (4)$$

where $i$ and $j$ index the different structuring elements and the different scales, $\omega$ represents the weight, $f$ represents the original image, $b$ is the structuring element, the symbol "$\circ$" represents the opening operation, and "$\cdot$" represents closing.

Figure 3. Structuring elements at 0°, 45°, 90° and 135°.
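As an illustration of Equation (4), the sketch below applies opening followed by closing with linear kernels at the four orientations and two assumed scales, then forms a weighted sum. The kernel lengths and the equal weights are our own choices, not values from the paper.

```python
import cv2
import numpy as np

def line_kernel(length, angle):
    """Binary linear structuring element at 0, 45, 90 or 135 degrees."""
    if angle == 0:
        return np.ones((1, length), np.uint8)
    if angle == 90:
        return np.ones((length, 1), np.uint8)
    if angle == 45:
        return np.flipud(np.eye(length, dtype=np.uint8))  # anti-diagonal
    return np.eye(length, dtype=np.uint8)                 # 135 degrees

def multi_structuring_filter(img, lengths=(3, 5), weights=None):
    """Equation (4): weighted sum of open-then-close over all kernels."""
    kernels = [line_kernel(n, a) for a in (0, 45, 90, 135) for n in lengths]
    weights = weights or [1.0 / len(kernels)] * len(kernels)
    acc = np.zeros(img.shape, np.float32)
    for w, b in zip(weights, kernels):
        opened = cv2.morphologyEx(img, cv2.MORPH_OPEN, b)       # f ∘ b
        closed = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, b)   # (f ∘ b) · b
        acc += w * closed.astype(np.float32)
    return acc
```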

2.2. Methods of Background Extraction and Segmentation

2.2.1. Otsu's Method

Otsu's method is simple, efficient and adaptable for finding the optimal threshold between the background and the targets in the foreground [15]. The larger the grayscale difference (the range between minimum and maximum grey values) between the background and the targets, the more accurate the threshold [15,16]. In optical images, the grayscale value of the water surface is always less than the grayscale value of the target ships. Therefore, Otsu's method can effectively segment the target ship from the background.

Otsu's method assumes that the image can statistically be divided into two parts, background and foreground; that is, the histogram of the image is bimodal. The main purpose of Otsu's method is to find the threshold that minimizes the internal variance of both the background and the foreground. Define the weighted sum of the variances of the two classes as follows:

$$\sigma_\omega^2(t) = \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t) \qquad (5)$$

where the weights $\omega_{0,1}$ represent the probabilities of the two classes (background and foreground) divided by the threshold $t$, and $\sigma_{0,1}^2$ are the variances of the two classes. The probabilities of the classes $\omega_{0,1}$ are calculated as follows:

$$\omega_0(t) = \sum_{i=0}^{t-1} p(i) \qquad (6)$$

$$\omega_1(t) = \sum_{i=t}^{L-1} p(i) \qquad (7)$$

where $L$ is the grayscale level of the image, and $p(i)$ is the probability of gray level value $i$. Otsu has shown that minimizing the intra-class variance is equivalent to maximizing the inter-class variance [15], that is:

$$\sigma_b^2(t) = \sigma^2 - \sigma_\omega^2(t) = \omega_0(\mu_0 - \mu_T)^2 + \omega_1(\mu_1 - \mu_T)^2 = \omega_0(t)\,\omega_1(t)\,[\mu_0(t) - \mu_1(t)]^2 \qquad (8)$$

where $\mu_{0,1,T}(t)$ are the class means, calculated as follows:

$$\mu_0(t) = \sum_{i=0}^{t-1} i\,\frac{p(i)}{\omega_0} \qquad (9)$$

$$\mu_1(t) = \sum_{i=t}^{L-1} i\,\frac{p(i)}{\omega_1} \qquad (10)$$

$$\mu_T = \sum_{i=0}^{L-1} i\,p(i) \qquad (11)$$

It can be seen that the probability $\omega$ and the mean value $\mu$ can be calculated iteratively for each threshold $t$, as well as the inter-class variance $\sigma_b^2(t)$. The threshold $t$ that maximizes $\sigma_b^2(t)$ is the optimal threshold for image segmentation.
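For reference, Equations (6)–(11) can be evaluated for all candidate thresholds at once with cumulative sums. The following NumPy sketch is our own vectorized rendering (OpenCV's built-in cv2.threshold with the THRESH_OTSU flag computes the same threshold):

```python
import numpy as np

def otsu_threshold(img, levels=256):
    """Threshold t maximizing the inter-class variance of Eq. (8)."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()                    # p(i): gray-level probabilities
    i = np.arange(levels)
    omega0 = np.cumsum(p)                    # Eq. (6), inclusive convention for t
    omega1 = 1.0 - omega0                    # Eq. (7)
    mu_cum = np.cumsum(i * p)
    mu_T = mu_cum[-1]                        # Eq. (11): global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        mu0 = mu_cum / omega0                # Eq. (9)
        mu1 = (mu_T - mu_cum) / omega1       # Eq. (10)
        sigma_b2 = omega0 * omega1 * (mu0 - mu1) ** 2   # Eq. (8)
    return int(np.argmax(np.nan_to_num(sigma_b2)))
```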

2.2.2. Marker-Based Watershed Segmentation Algorithm

The watershed algorithm is a regional image segmentation method based on mathematical morphology, first proposed and applied by Beucher and Lantuejoul in the late 1970s [17–20]. The watershed algorithm based on the immersion principle [18] is one of its forms of expression. The algorithm simulates a topographic map of the image, inserts a small hole at each local minimum, and then submerges the whole topographic map in water. As the water level rises, "dams" are built at the lowest points of confluence, and ultimately these "dams" form the final division borders, that is, the "watershed". However, the traditional watershed algorithm has several defects [19]: (1) when the noise or the texture of the image is very noticeable, it easily produces over-segmentation; (2) because of its weak response to low contrast images, the traditional algorithm cannot obtain a good segmentation effect. Because remote sensing images have high resolution and clarity, as well as relatively complex texture, the final segmentation results of the traditional watershed algorithm exhibit a significant over-segmentation phenomenon.

The marker-based watershed algorithm can resolve these defects of the traditional watershed algorithm [20,21]. It uses the known segmented areas as local minima, that is, the lowest points in the topographic map. The immersion then begins from the known lowest points, and ultimately obtains the final segment boundary [20]. Because of the a priori knowledge involved in the division, this method can effectively avoid the problem of over-segmentation caused by texture or noise, greatly improving segmentation accuracy.

The process of the marker-based watershed algorithm is as follows (a sketch of the procedure appears after the list):

(1) Different markers are given different labels, and the pixels of the markers are the start of the immersion.
(2) Corresponding to the gradient magnitude of the pixels neighboring the markers, we insert the neighboring pixels of the markers into a priority queue. The gradient magnitude of the pixels is calculated as follows:

$$\mathrm{grad}(i,j) = \sqrt{\frac{[f(i,j) - f(i+1,j)]^2 + [f(i,j) - f(i,j+1)]^2}{2}} \qquad (12)$$

where $\mathrm{grad}(i,j)$ is the gradient of the pixel located at $(i,j)$, and $f(i,j)$ is the pixel value.
(3) The pixel with the lowest priority level is extracted from the priority queue. If the neighbors of the extracted pixel that have already been labeled all have the same label, then the pixel is labeled with their label. All non-marked neighbors that are not yet in the priority queue are put into the priority queue.
(4) Redo step 3 until the priority queue is empty.
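The sketch below is a didactic, unoptimized rendering of steps (1)–(4), using Python's heapq as the priority queue and Equation (12) as the priority; production code would normally call a library routine such as cv2.watershed instead.

```python
import heapq
import numpy as np

def gradient(f, i, j):
    """Equation (12): gradient magnitude at pixel (i, j), clamped at the border."""
    di = f[i, j] - f[min(i + 1, f.shape[0] - 1), j]
    dj = f[i, j] - f[i, min(j + 1, f.shape[1] - 1)]
    return np.sqrt((di ** 2 + dj ** 2) / 2.0)

def neighbors(shape, i, j):
    for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
        if 0 <= ni < shape[0] and 0 <= nj < shape[1]:
            yield ni, nj

def marker_watershed(f, markers):
    """Flood the labeled markers over image f (0 in markers = unlabeled)."""
    f = f.astype(np.float64)
    labels = markers.astype(np.int32).copy()
    heap, queued = [], np.zeros(f.shape, bool)

    def push(i, j):                          # step (2): priority from Eq. (12)
        queued[i, j] = True
        heapq.heappush(heap, (gradient(f, i, j), i, j))

    for i, j in zip(*np.nonzero(labels)):    # step (1): seed from the markers
        for ni, nj in neighbors(f.shape, i, j):
            if labels[ni, nj] == 0 and not queued[ni, nj]:
                push(ni, nj)

    while heap:                              # steps (3) and (4)
        _, i, j = heapq.heappop(heap)
        nbr = {labels[ni, nj] for ni, nj in neighbors(f.shape, i, j)
               if labels[ni, nj] > 0}
        if len(nbr) == 1:                    # all labeled neighbors agree
            labels[i, j] = nbr.pop()
        for ni, nj in neighbors(f.shape, i, j):
            if labels[ni, nj] == 0 and not queued[ni, nj]:
                push(ni, nj)
    return labels
```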

In this paper, the foreground information on the moving targets obtained using the inter-frame difference algorithm, as well as the background information obtained using Otsu's method, are the regions of interest and provide the a priori knowledge, that is, the markers, for the watershed algorithm. We then divide the whole image using the marker-based watershed algorithm to capture the targets' shape information. By considering the known regions as a priori knowledge, over-segmentation can be greatly reduced in the final segmentation result.

3. Experiments

3.1. Data

In this paper, we used three continuous frames of multi-view images captured by a Chinese agile remote sensing satellite. The images' size is 3440 × 2492, and they are panchromatic images, which cover the whole visible spectrum and have higher resolution than the corresponding multispectral images, as shown in Figure 4. From the figure, we can see that the number of moving ships is three; for convenient explanation, we zoomed in on the three areas containing the targets and numbered the three targets. The movement of these ships is also evident. The speed of Ship 1 and Ship 3 is relatively slow, so there are overlapping parts of each ship in the images. Ship 2 is relatively faster than the other two ships, so there is almost no overlapping part of Ship 2 in the three frames. Meanwhile, all three ships have very clear trails. Table 1 shows the interval time and the base-height ratio between two adjacent images. We can see that because of the time the agile satellite needs to maneuver between capturing two frames, the interval time and base-height ratio between two adjacent frames are large. As a result, there are large changes to the background features, which also raises difficulties for post-processing.

Figure 4. Original multi-view sequence images.


Table 1. Interval time and base-height ratio of sequence images.

Frame    Interval Time    Base-Height Ratio
1–2      22 s             0.02643
2–3      23 s             0.02655

3.2. Results of Foreground Extraction

3.2.1. Histogram Matching

To compensate for changes in the overall radiance of the images, which are caused by the angle changes between the sensor and the sun when the satellite maneuvers, we first adjust the histograms of the last two images to match that of the first frame. The result is shown in Figure 5. We can intuitively see that after histogram adjustment, the brightness of the second and third frames is basically the same as that of the first. Meanwhile, comparing the histograms before and after adjustment, we can see that the gray level distributions after adjustment are more consistent with the gray level distribution of the first frame's histogram than before. Therefore, this method solves the problem that the difference processing in the next step cannot obtain a good result because of offsets between the frames' overall radiance.

Figure 5. The experimental results of histogram matching: (a) Sequence images before histogram matching; (b) Sequence images after histogram matching.
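The paper does not give its histogram matching implementation; a standard grayscale approach maps the source gray levels through the two cumulative distribution functions, as in this NumPy sketch of our own:

```python
import numpy as np

def match_histogram(source, reference, levels=256):
    """Remap the gray levels of `source` so its CDF matches `reference`."""
    src_hist, _ = np.histogram(source, bins=levels, range=(0, levels))
    ref_hist, _ = np.histogram(reference, bins=levels, range=(0, levels))
    src_cdf = np.cumsum(src_hist) / source.size
    ref_cdf = np.cumsum(ref_hist) / reference.size
    # For each source level, find the reference level with the closest CDF value
    lut = np.interp(src_cdf, ref_cdf, np.arange(levels))
    return lut[source].astype(source.dtype)
```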

At the same time, in order to measure the improvement between before and after histogram adjustment, we evaluate the correlation coefficient, the intersection test coefficient, the chi-square test coefficient and the Bhattacharyya distance between the histograms. It can be seen from Table 2 that after the histogram matching, the correlation coefficients between Frame 1 and Frame 2 and between Frame 1 and Frame 3 are obviously closer to 1, the chi-square coefficients are greatly reduced, and the intersection test coefficients are obviously increased. At the same time, the Bhattacharyya distance is also reduced. The changes in these factors show that after histogram matching, the correlation of the second frame and third frame to the first frame is stronger. The gray level distribution among these frames in Figure 6 is also improved, which lays a solid foundation for the improvement of the following differential precision.

Figure 6. Histograms before and after histogram matching: (a) Before histogram matching; (b) After histogram matching.

Table 2. Correlation test before and after histogram matching.

Frame                     1–2                        1–3
                          Before        After        Before        After
Correlation               0.4330        0.8130       0.1281        0.7050
Chi-Square                256,590.2673  29.4254      79,456.7418   64.9714
Intersection              77.0942       106.2336     48.8889       67.6905
Bhattacharyya Distance    0.4009        0.3118       0.5704        0.4224
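The four measures in Table 2 are standard histogram comparison metrics, all available in OpenCV. The snippet below shows how such values can be computed; the authors' exact normalization is not stated, so absolute numbers may differ.

```python
import cv2

def histogram_similarity(img_a, img_b):
    """Correlation, chi-square, intersection and Bhattacharyya distance."""
    h_a = cv2.calcHist([img_a], [0], None, [256], [0, 256])
    h_b = cv2.calcHist([img_b], [0], None, [256], [0, 256])
    return {
        "correlation":   cv2.compareHist(h_a, h_b, cv2.HISTCMP_CORREL),
        "chi-square":    cv2.compareHist(h_a, h_b, cv2.HISTCMP_CHISQR),
        "intersection":  cv2.compareHist(h_a, h_b, cv2.HISTCMP_INTERSECT),
        "bhattacharyya": cv2.compareHist(h_a, h_b, cv2.HISTCMP_BHATTACHARYYA),
    }
```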

3.2.2. Inter-Frame Difference

Through the inter-frame difference algorithm, we can quickly detect information on the moving targets. The results are shown in Figure 7. From Figure 7a, we see that because of the large changes of attitude and the long imaging interval when the agile satellite captures the multi-view sequence images, the ground features change significantly across the sequence. The error or noise in the final result obtained by the traditional difference method exceeds the information we want to extract about the moving target. Therefore, based on the traditional difference method, we combine the third frame in a second differential processing step, and the result is shown in Figure 7b. We can see that after computing the second difference, the information on the moving targets is preserved, along with only a small part of the error information caused by the changes of building contours, shadows or light. The result is greatly improved over the traditional method.

Figure 7. Inter-frame difference results: (a) the first difference; (b) the second difference.

3.2.3. Multi-Structuring Element Morphological Filter

From the results shown above, although using the difference procedure twice removes most of the noise caused by the dynamic changes of background features, a small part of the noise remains in the results. We need a morphological filtering operator to filter the images and remove the remaining errors.

However, as shown in Figure 8a, the noise in the difference results is largely due to changes of feature edges or shadows caused by the different viewing angles of agile satellites when they capture the images. Therefore, the noise in these images has a generally linear aspect, which cannot be eliminated with a filter using a single structuring element. Figure 8b shows that after processing with a top-hat morphological filter using a single structuring element, a lot of noise still remains. To solve this problem, we perform morphological filtering using multiple structuring elements. The results are shown in Figure 8c. We can see that after multi-structuring element morphological filtering, the linear noise which remained in the original image is mostly eliminated. At the same time, the part of the moving targets' information which we want to extract is preserved.

Figure 8. Comparison between top-hat filtering and multi-structuring element filtering: (a) Original data; (b) Top-hat filtering result; (c) Multi-structural filtering result.

3.3. Results of Background Extraction and Segmentation

3.3.1. Otsu's Method

Generally, in remote sensing images, the number of pixels belonging to moving ship targets is much smaller than the number of pixels belonging to the water surface, and the overall grayscale values of the water surface are smaller than the grayscale values of the ship targets. Therefore, Otsu's method is used to extract the background of the moving ship targets, that is, the water surface. Meanwhile, in order to effectively suppress the interference of noise, we also carry out multi-structuring element morphological filtering and binarize the results. Figure 9b shows the result of background extraction, where the highlighted positions in the figure are the background we extracted. We can see that because of the huge contrast between the water surface and the moving ship targets, the background is completely extracted, which provides a good foundation for the next segmentation step.

Figure 9. The result of background extraction: (a) Original data; (b) Otsu's method result.

3.3.2. Marker-Based Watershed Segmentation

The texture of high resolution remote sensing images is very rich. From these three images, we can see that the ground features on the riverbank and the trails the ships leave behind are very clear, which causes an over-segmentation phenomenon when using the traditional watershed algorithm and greatly reduces the accuracy of detection for the moving ship targets [22], as shown in Figure 10a. Therefore, the marker-based watershed algorithm is used to solve this problem. First, the foreground and background images we obtained in the aforementioned steps are superimposed. Next, using the foreground and background regions as the known local minima to rectify the original images, we carry out the marker-based watershed segmentation to obtain the final result, which is shown in Figure 10b. It can be seen that the marker-based watershed algorithm can effectively split the moving ship targets, avoid over-segmentation, and greatly improve the accuracy of segmentation.
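As a sketch of this fusion step, the foreground and background masks can be combined into a marker image and passed to OpenCV's watershed; the label assignments and the connected-component split of the foreground are our own assumptions, not the authors' code.

```python
import cv2
import numpy as np

def segment_with_markers(gray, foreground, background):
    """Run marker-based watershed seeded by the extracted regions.

    `foreground` and `background` are binary masks (1 = known region).
    Unknown pixels stay 0 and are decided by the flooding.
    """
    markers = np.zeros(gray.shape, np.int32)
    markers[background > 0] = 1          # label 1: water surface
    # Label each connected foreground blob 2, 3, ... so ships stay separate
    _, blobs = cv2.connectedComponents(foreground.astype(np.uint8))
    markers[blobs > 0] = blobs[blobs > 0] + 1
    color = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)   # watershed needs 3 channels
    return cv2.watershed(color, markers)             # -1 marks the boundaries
```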

Figure 10. The comparison between traditional and marker-based watershed segmentation: (a) Traditional watershed result; (b) Marker-based watershed result.

Figure 11 shows the detection results overlaid on the original image. We can see that the detection of the shapes of Ship 1 and Ship 2 is relatively complete, effectively separating the ship's body and its trail. Because of the special shape of Ship 3, there is a "fault" in gray values between the latter half and the first half of the ship, with the latter half very close to the grayscale of the water surface. As a result, the watershed segmentation cannot detect the trailing part of the ship, and ultimately separates it from the first half of Ship 3.

Figure 11. The final results: (a) Contour line of Ship 1; (b) Contour line of Ship 2; (c) Contour line of Ship 3.

Overall, the target detection algorithm we propose in this paper can effectively detect moving ship targets in multi-view image sequences, and the accuracy of this detection is high. The shape information of the moving ship targets is obtained more completely.

4. Conclusions

To solve the problems caused by the long baseline and the large optical parallax of agile satellites when they capture a sequence of images, we propose a moving target detection algorithm based on marker-based watershed segmentation, with foreground extraction using an inter-frame difference algorithm and background extraction using Otsu's method. Using these methods, we take advantage of the relevance and continuity of the moving ship targets in a multi-view sequence of images, and overcome the disadvantages of traditional object detection methods, which are sensitive to changes in background features and have low reliability when extracting target information. Therefore, our method effectively improves moving target detection. The method shows good results for the detection of moving ships on the water surface, and is capable of rapidly providing information on the positions, shapes and textures of targets, which builds the foundation for the observation and tracking of ships [23], provides timely and effective guidance for decision-making on the ground, and has broad prospects in many applications.

Acknowledgments: This work was substantially supported by the Fundamental Research Funds for the Central Universities (2042017kf0042), the Open Fund of Twenty First Century Aerospace Technology Co., Ltd. (Grant No. 21AT-2016-02), and the National Natural Science Foundation of China (91438111).

Author Contributions: Yao Shun and Chang Xueli carried out the experiment and wrote this paper. Cheng Yufeng and Jin Shuying participated in the experiment. Zuo Deshan was responsible for the data acquisition. All authors reviewed the manuscript.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Zhao, S.; Yin, D.; Dou, X. Moving target information extraction based on single satellite image. Acta Geod. Cartogr. Sin. 2015, 44, 316–322.
2. Zhang, J.; Mao, X.; Chen, T. Survey of moving object tracking algorithm. Appl. Res. Comput. 2009, 12, 4407–4410.
3. Kornprobst, P.; Deriche, R.; Aubert, G. Image sequence analysis via partial differential equations. J. Math. Imaging Vis. 1999, 11, 5–26.
4. Aubert, G.; Kornprobst, P. Mathematical Problems in Image Processing: Partial Differential Equations and the Calculus of Variations; Springer Science & Business Media: Berlin, Germany, 2006; Volume 147.
5. Elgammal, A.; Duraiswami, R.; Harwood, D.; Davis, L.S. Background and foreground modeling using nonparametric kernel density estimation for visual surveillance. Proc. IEEE 2002, 90, 1151–1163.
6. Elgammal, A.; Harwood, D.; Davis, L. Non-parametric model for background subtraction. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2000; pp. 751–767.
7. Stauffer, C.; Grimson, W.E.L. Adaptive background mixture models for real-time tracking. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Fort Collins, CO, USA, 23–25 June 1999; IEEE: New York, NY, USA, 1999; pp. 246–252.
8. Stauffer, C.; Grimson, W.E.L. Learning patterns of activity using real-time tracking. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 747–757.
9. Grimson, W.E.L.; Stauffer, C.; Romano, R.; Lee, L. Using adaptive tracking to classify and monitor activities in a site. In Proceedings of the 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Santa Barbara, CA, USA, 25 June 1998; IEEE: New York, NY, USA, 1998; pp. 22–29.
10. Migliore, D.A.; Matteucci, M.; Naccari, M. A revaluation of frame difference in fast and robust motion detection. In Proceedings of the 4th ACM International Workshop on Video Surveillance and Sensor Networks, Santa Barbara, CA, USA, 27 October 2006; ACM: New York, NY, USA, 2006; pp. 215–218.
11. Neri, A.; Colonnese, S.; Russo, G.; Talone, P. Automatic moving object and background separation. Signal Process. 1998, 66, 219–232.
12. Zhan, C.; Duan, X.; Xu, S.; Song, Z.; Luo, M. An improved moving object detection algorithm based on frame difference and edge detection. In Proceedings of the Fourth International Conference on Image and Graphics (ICIG), Sichuan, China, 22–24 August 2007; IEEE: New York, NY, USA, 2007; pp. 519–523.
13. Ma, W.; Zhao, Y.; Zhang, G.; Jie, F.; Pan, Q.; Li, G.; Liu, Y. Infrared dim target detection based on multi-structural element morphological filter combined with adaptive threshold segmentation. Acta Photonica Sin. 2011, 7, 1020–1024.
14. Aragón-Calvo, M.A.; Jones, B.J.T.; Van De Weygaert, R.; Van Der Hulst, J.M. The multiscale morphology filter: Identifying and extracting spatial patterns in the galaxy distribution. Astron. Astrophys. 2007, 474, 315–338.
15. Otsu, N. A threshold selection method from gray-level histograms. Automatica 1975, 11, 23–27.
16. Fan, J.; Zhao, F. Two-dimensional Otsu's curve thresholding segmentation method for gray-level images. Acta Electron. Sin. 2007, 35, 751–755.
17. Vincent, L.; Soille, P. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Trans. Pattern Anal. Mach. Intell. 1991, 13, 583–598.
18. Lotufo, R.; Silva, W. Minimal set of markers for the watershed transform. In Proceedings of the ISMM, Berlin, Germany, 20–21 June 2002; ACM: New York, NY, USA, 2002; pp. 359–368.
19. Strahler, A.N. Quantitative analysis of watershed geomorphology. Eos Trans. Am. Geophys. Union 1957, 38, 913–920.
20. Xian, G.; Crane, M. Assessments of urban growth in the Tampa Bay watershed using remote sensing data. Remote Sens. Environ. 2005, 97, 203–215.
21. Fu, Q.; Celenk, M. Marked watershed and image morphology based motion detection and performance analysis. In Proceedings of the 2013 8th International Symposium on Image and Signal Processing and Analysis (ISPA), Trieste, Italy, 4–6 September 2013; IEEE: New York, NY, USA, 2013; pp. 159–164.
22. Wei, J.; Li, P.; Yang, J.; Zhang, J. Removing the effects of azimuth ambiguities on ship detection based on polarimetric SAR data. Acta Geod. Cartogr. Sin. 2013, 42, 530–539.
23. Wang, W.; Li, Q. A vehicle tracking algorithm with the Monte Carlo method. Acta Geod. Cartogr. Sin. 2011, 2, 200–203.

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).