Keypoints Based Copy-Move Forgery Detection of Digital Images

Swapan Debbarma
Department of CSE, NIT Agartala, India
[email protected]

Angom Buboo Singh
Department of CSE, NIT Agartala, India
[email protected]

Kh. Manglem Singh
Department of CSE, NIT Manipur, India
[email protected]

Abstract— One of the most researched areas in digital image forensics is copy-move forgery detection, where part of an image is copied and pasted within the same image. This is done either to highlight a particular object or to hide or remove an object from the image. Keypoint-based approaches use descriptors extracted from specific areas of the image, unlike block-based methods where features are extracted from every block. This leads to a smaller number of descriptors, and hence a faster detection algorithm can be devised using keypoints. A number of keypoint detection algorithms are available, such as SIFT, SURF and GLOH; in this paper we use SIFT and SURF for feature extraction and forgery detection. We also compare the performance of the two approaches in terms of speed and accuracy. Multiple cloning of a region is also handled by using an iterative matching method.

Index Terms—Digital forensics, copy-move, SIFT, SURF

I. INTRODUCTION

Tampering of digital images for vested interests is on the rise, mainly due to the easy availability of cheap digital cameras along with sophisticated image editing software such as Adobe Photoshop and GIMP. Image forgery is seen in all areas of life: from fabricated insurance claims to scientific fraud, from magazine covers to false propaganda, from defamation to newspapers, images no longer necessarily display the truth. Among the various types of image forgery [1][2], cloning a part of the image is the most common method. It is done by copying a part of the image and pasting it into another area of the same image; sometimes the pasting is done multiple times. The main motive is either to highlight a particular object in the image, such as a crowd, or to hide or remove an object, for instance to conceal a person. When this is done carefully using retouching tools, the cloning can be very difficult to detect. Moreover, since the copied region comes from the same image, some of its components (e.g., noise and color) will be compatible with the rest of the image, and it will therefore not be detectable by methods that look for incompatibilities in statistical measures across different parts of the image. Detection becomes even more difficult if the copied area is geometrically transformed before pasting. An example is the Iranian missile test shown in Fig. 1, where part of the image (the second flying missile) was copied and pasted onto the unsuccessful missile to make the launch look like a successful test.

Fig. 1. Example of image tampering that appeared in the press in July 2008 [11]. The lower image, showing four successful missiles, is a fake; the real image is shown above.

Detection of such copy-move forgeries [3] is broadly categorized as block-based or keypoint-based, depending on the approach used. In this paper we apply two algorithms for keypoint detection, the Scale Invariant Feature Transform (SIFT) [4][5][6] and Speeded Up Robust Features (SURF) [7], separately, and a comparative analysis of the two approaches is carried out in terms of speed and accuracy. The rest of the paper is organized as follows: Section II presents a review of the two algorithms; Section III describes the proposed method for forgery detection; the results of the analysis are presented in Section IV; and Section V concludes the paper.

II. REVIEW OF THE ALGORITHMS

A. Scale Invariant Feature Transform (SIFT)

This method can be summarized in the following four steps: 1) scale-space extrema detection; 2) keypoint localization; 3) assignment of one (or more) canonical orientations; 4) generation of keypoint descriptors. In other words, given an input image, SIFT features are detected at different scales using a scale-space representation implemented as an image pyramid. The pyramid levels are obtained by Gaussian smoothing and subsampling of the image resolution, while interest points are selected as local extrema (minima/maxima) in the scale-space. These keypoints, referred to as xi in the following, are extracted by applying a computable approximation of the Laplacian of Gaussian called the Difference of Gaussians (DoG). Specifically, a DoG image is given by

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ),    (1)
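As a minimal sketch of Eq. (1), the DoG response at a single scale can be obtained by subtracting two Gaussian-smoothed versions of the image; the scale factor k = √2 and the use of scipy.ndimage are illustrative assumptions, not part of the description above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_response(image, sigma, k=np.sqrt(2)):
    """Difference-of-Gaussians D(x, y, sigma) of Eq. (1): the image smoothed
    at scale k*sigma minus the image smoothed at scale sigma."""
    img = image.astype(float)
    L_k = gaussian_filter(img, k * sigma)   # L(x, y, k*sigma)
    L_s = gaussian_filter(img, sigma)       # L(x, y, sigma)
    return L_k - L_s
```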

where L(x, y, kσ) is the convolution of the original image I(x, y) with the Gaussian blur G(x, y, kσ) at scale kσ. In order to guarantee invariance to rotation, the algorithm assigns a canonical orientation to each keypoint. To determine this orientation, a gradient orientation histogram is computed in the neighborhood of the keypoint. Specifically, for an image sample L(x, y, σ) at scale σ (the scale at which the keypoint was detected), the gradient magnitude m(x, y) and orientation θ(x, y) are precomputed using pixel differences:

m(x, y) = sqrt( (L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))² ),    (2)

θ(x, y) = tan⁻¹( (L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y)) ).    (3)
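A short NumPy sketch of Eqs. (2)-(3), assuming the smoothed image L is given as a 2-D array; np.arctan2 is used in place of a plain tan⁻¹ so that zero horizontal differences are handled safely.

```python
import numpy as np

def gradient_magnitude_orientation(L):
    """Pixel-difference gradient magnitude m(x, y) and orientation theta(x, y)
    of Eqs. (2)-(3), computed on the interior of the smoothed image L."""
    L = L.astype(float)
    dx = np.zeros_like(L)
    dy = np.zeros_like(L)
    dx[:, 1:-1] = L[:, 2:] - L[:, :-2]   # L(x+1, y) - L(x-1, y)
    dy[1:-1, :] = L[2:, :] - L[:-2, :]   # L(x, y+1) - L(x, y-1)
    m = np.sqrt(dx ** 2 + dy ** 2)
    theta = np.arctan2(dy, dx)           # radians in (-pi, pi]
    return m, theta
```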

An orientation histogram with 36 bins is formed, with each bin covering 10°. Each sample in the neighboring window added to a histogram bin is weighted by its gradient magnitude and by a Gaussian-weighted circular window with σ equal to 1.5 times the scale of the keypoint. The peaks in this histogram correspond to dominant orientations. Once the keypoints are detected and canonical orientations are assigned, SIFT descriptors are computed at their locations in both the image plane and scale-space. Each feature descriptor consists of a histogram of 128 elements, obtained from a 16 × 16 pixel area around the corresponding keypoint. This area is selected using the coordinates (x, y) of the keypoint as the center and its canonical orientation as the origin axis. The contribution of each pixel is obtained by accumulating the image gradient magnitude m(x, y) and orientation θ(x, y) in scale-space, and the histogram is computed as the local statistics of gradient orientations (using eight bins) in 4 × 4 subpatches. Summarizing the above, given an image I, this procedure ends with a list of N keypoints, each of which is completely described by the following information: xi = {x, y, σ, o, f}, where (x, y) are the coordinates in the image plane, σ is the scale of the keypoint (related to the level of the image pyramid used to compute the descriptor), o is the canonical orientation (used to achieve rotation invariance), and f is the final SIFT descriptor.
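For completeness, a keypoint list of the form xi = {x, y, σ, o, f} can be obtained in practice with an off-the-shelf SIFT implementation. The sketch below assumes OpenCV ≥ 4.4 (where SIFT is available in the main module); it is not the authors' own implementation.

```python
import cv2

def extract_sift(gray_image):
    """Detect SIFT keypoints and their 128-element descriptors.
    gray_image: single-channel uint8 array."""
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray_image, None)
    # Each cv2.KeyPoint exposes roughly the x_i fields described above:
    #   kp.pt    -> (x, y) image-plane coordinates
    #   kp.size  -> detection scale (related to sigma)
    #   kp.angle -> canonical orientation in degrees
    return keypoints, descriptors   # descriptors: N x 128 float32 array
```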

Fig. 2. Example of SIFT keypoints

B. Speeded Up Robust Features (SURF)

The SURF detector is based on the integral image and a Hessian matrix approximation [9]. The performance of the SURF algorithm is largely attributed to the use of an intermediate image representation known as the "integral image". The entry of an integral image IΣ(x) at a location x = (x, y) represents the sum of all pixels of the input image I within the rectangular region formed by the origin and x:

IΣ(x) = Σ(i=0 to i≤x) Σ(j=0 to j≤y) I(i, j).    (4)

Once the integral image has been computed, it takes only three additions to calculate the sum of the intensities over any upright rectangular area. Hence, the calculation time is independent of its size.

Fig. 3. Integral image calculation over a rectangular region.
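A minimal NumPy sketch of Eq. (4) and of the constant-time box sum it enables; the inclusive-rectangle convention and the function names are illustrative assumptions.

```python
import numpy as np

def integral_image(img):
    """I_Sigma of Eq. (4): cumulative sum over rows and then columns."""
    return img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, x0, y0, x1, y1):
    """Sum of the input pixels inside the rectangle [x0, x1] x [y0, y1]
    (inclusive), using three additions/subtractions on the integral image."""
    a = ii[y0 - 1, x0 - 1] if (x0 > 0 and y0 > 0) else 0
    b = ii[y0 - 1, x1] if y0 > 0 else 0
    c = ii[y1, x0 - 1] if x0 > 0 else 0
    d = ii[y1, x1]
    return d - b - c + a
```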

Given a point x = (x, y) in an image I, the Hessian matrix H(x, σ) at x and scale σ is defined as

H(x, σ) = | Lxx(x, σ)   Lxy(x, σ) |
          | Lxy(x, σ)   Lyy(x, σ) |,    (5)

where Lxx(x, σ) is the convolution of the second-order Gaussian derivative ∂²g(σ)/∂x² with the image I at the point x, and similarly for Lxy(x, σ) and Lyy(x, σ). In SURF these Gaussian second-order derivatives are approximated with box filters, yielding responses Dxx, Dyy and Dxy, and the approximate determinant of the Hessian matrix is calculated as

det(Happrox) = Dxx Dyy − (0.9 Dxy)².    (6)
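Eq. (6) itself is a one-line computation; the sketch below assumes the box-filter responses Dxx, Dyy and Dxy have already been obtained (for example with box sums over the integral image above).

```python
def hessian_determinant(Dxx, Dyy, Dxy, weight=0.9):
    """Approximate Hessian determinant of Eq. (6); the 0.9 weight balances
    the box-filter approximation of the Gaussian second derivatives."""
    return Dxx * Dyy - (weight * Dxy) ** 2
```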

Orientation Assignment: First, a circular area is constructed around each keypoint. Then, Haar wavelets are used for the orientation assignment; this increases robustness and decreases the computational cost. Haar wavelets are filters that detect gradients in the x and y directions. In order to make the descriptor rotation invariant, a reproducible orientation for the interest point is identified: a circle segment of 60° is rotated around the interest point, and the maximum summed response is chosen as the dominant orientation for that point.

Feature Descriptor Generation: To generate the descriptor, a square region is first constructed around the interest point, with the interest point as its center. This square area is then divided into 4 × 4 smaller subregions. For each of these cells, Haar wavelet responses are calculated, where dx is the horizontal response and dy the vertical response. For each subregion, four values are collected:

v_subregion = [ Σ dx, Σ dy, Σ |dx|, Σ |dy| ].    (7)

Since each subregion contributes 4 values, the descriptor length is 4 × 4 × 4 = 64.
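A sketch of the descriptor assembly of Eq. (7), assuming the Haar wavelet responses dx and dy have already been computed on a 20 × 20 sample grid around the interest point (5 × 5 samples per subregion); the grid size and the final unit-normalisation are assumptions not stated in the text above.

```python
import numpy as np

def surf_descriptor(haar_dx, haar_dy):
    """Build the 64-element SURF descriptor: 4 x 4 subregions, each
    contributing [sum(dx), sum(dy), sum(|dx|), sum(|dy|)] as in Eq. (7)."""
    parts = []
    for i in range(4):
        for j in range(4):
            dx = haar_dx[5 * i:5 * i + 5, 5 * j:5 * j + 5]
            dy = haar_dy[5 * i:5 * i + 5, 5 * j:5 * j + 5]
            parts.append([dx.sum(), dy.sum(),
                          np.abs(dx).sum(), np.abs(dy).sum()])
    v = np.array(parts, dtype=float).ravel()      # 16 x 4 = 64 values
    return v / (np.linalg.norm(v) + 1e-12)        # unit length (assumed)
```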

Fig. 4. Example of SURF keypoints

III. PROPOSED METHOD

In this section we describe in detail the proposed method to detect duplicated and distorted areas in an image. The method is the same for both SIFT and SURF except for the keypoint detection and feature extraction step, since the two algorithms produce feature vectors of different dimensions: 128 for SIFT and 64 for SURF. The remaining steps, namely matching, filtering and clustering, are identical. The input image is first converted to gray scale; then keypoints and their feature vectors are extracted using the SIFT or SURF algorithm. The proposed approach is illustrated in Fig. 5.

Fig. 5. Overview of the forgery detection system: input image → conversion to gray scale → SIFT/SURF feature extraction → matching (single/multiple copy) → post-processing (clustering and filtering) → forgery detection.

The extracted feature vectors are matched differently for single clone detection and for multiple clone detection. For detecting a single clone, let X = {x1, ..., xn} be the set of keypoints with corresponding descriptors {f1, ..., fn}. A matching operation is performed among the fi vectors: the best candidate match for each keypoint xi is found by identifying its nearest neighbor among the other (n − 1) keypoints of the image, i.e., the keypoint at minimum Euclidean distance. An effective criterion is the ratio between the distance to the closest neighbor and the distance to the second-closest one, compared with a threshold T (fixed to 0.6 in our experiments). That is, a keypoint is matched only if

d1 / d2 < T,    (8)

where d1 and d2 are the two smallest of the sorted Euclidean distances to the other descriptors and T ∈ (0, 1). For the detection of multiple clones, instead of checking only the first two distances, we iterate over all adjacent pairs of sorted distances, i.e., the constraint becomes

di / di+1 < T,    (9)

where the di are the sorted Euclidean distances to the other descriptors and T ∈ (0, 1).

The image may contain very similar textures which can yield false matches. These false matches can be reduced or eliminated to a great extent by post-processing the matched output. An effective way to reduce false matches is to filter the output using agglomerative hierarchical clustering [8]. The algorithm starts by assigning each keypoint to its own cluster; it then computes all reciprocal spatial distances among clusters, finds the closest pair of clusters, and merges them into a single cluster. This is repeated iteratively until a final merging state is reached, determined by the linkage method adopted and by the threshold used to stop cluster grouping. In our experiments the threshold is based on the distance criterion with a cutoff value of 30. Single linkage is used for merging and creating the hierarchical tree, and is given by

d(A, B) = min{ dist(xAi, xBj) : xAi ∈ A, xBj ∈ B },    (10)

which is simply the smallest Euclidean distance between objects in the two clusters.
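The matching of Eqs. (8)-(9) and the single-linkage filtering with a cutoff of 30 can be sketched as follows. SciPy's hierarchical-clustering routines are assumed as a stand-in for whatever implementation the authors used; the function names are illustrative, and keypoint_xy is assumed to be an N × 2 array of (x, y) coordinates.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

def match_keypoints(descriptors, T=0.6):
    """Ratio-test matching of Eqs. (8)-(9): for each keypoint, walk the
    sorted distances and accept neighbours while d_i / d_{i+1} < T,
    which also handles multiple clones. Returns (i, j) index pairs."""
    dist = squareform(pdist(descriptors, metric='euclidean'))
    matches = []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i])[1:]          # skip the keypoint itself
        d = dist[i, order]
        for k in range(len(d) - 1):
            if d[k + 1] == 0 or d[k] / d[k + 1] >= T:
                break
            matches.append((i, int(order[k])))
    return matches

def cluster_matches(keypoint_xy, matches, cutoff=30.0):
    """Single-linkage agglomerative clustering of the matched keypoints'
    spatial coordinates, cut with the distance criterion at 30."""
    idx = sorted({i for pair in matches for i in pair})
    coords = keypoint_xy[idx]                    # (x, y) per matched keypoint
    tree = linkage(coords, method='single', metric='euclidean')
    labels = fcluster(tree, t=cutoff, criterion='distance')
    return dict(zip(idx, labels))                # keypoint index -> cluster id
```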

Fig. 9(a) SIFT    Fig. 9(b) SURF

IV. EXPERIMENTAL RESULTS

In this section, forgery detection experiments are performed on the MICC-F220 dataset [9]. The dataset contains 220 images, half of which are forged using combinations of different attacks such as scaling, rotation and blurring. We start by identifying the keypoints of the images using both SIFT and SURF.

Fig. 10. Running profile of the SURF method, which took 236 seconds to process 220 images.

Fig. 6. Keypoints detected by SIFT and SURF

We can see from Fig. 6 that SIFT identifies roughly three times as many keypoints as SURF. Hence SIFT is able to detect the forged area more accurately than SURF. However, since its feature vector (128 elements) is twice as long as SURF's (64 elements), it is considerably slower. Figs. 7-9 display comparative outputs of the SIFT- and SURF-based methods.

Fig. 7(a) SIFT    Fig. 7(b) SURF

Fig. 8(a) SIFT    Fig. 8(b) SURF

Detection performance was measured in terms of the true positive rate (TPR), also known as sensitivity, and the false positive rate (FPR), where TPR is the fraction of tampered images correctly identified as such and FPR is the fraction of original images incorrectly identified as forged. Another performance parameter is the specificity (SPC), or true negative rate, calculated as (1 − FPR):

TPR = (# images detected as forged that are forged) / (# forged images)

FPR = (# images detected as forged that are original) / (# original images)

SPC = 1 − FPR
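A small sketch of the rate computations above; the counts in the usage example are hypothetical, chosen so that they roughly reproduce the "Our SIFT" row of Table I on the 110 forged / 110 original MICC-F220 split.

```python
def detection_rates(forged_flagged, original_flagged, n_forged, n_original):
    """TPR, FPR and SPC as defined above."""
    tpr = forged_flagged / n_forged        # forged images detected as forged
    fpr = original_flagged / n_original    # original images detected as forged
    spc = 1.0 - fpr                        # specificity / true negative rate
    return tpr, fpr, spc

# Hypothetical counts on MICC-F220 (110 forged, 110 original images):
tpr, fpr, spc = detection_rates(100, 7, 110, 110)
print(f"TPR={tpr:.2%}  FPR={fpr:.2%}  SPC={spc:.2%}")
```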

TABLE I: FPR, SPC, TPR values (%) and processing time (average per image) for each method [10].

Method               FPR%    SPC%    TPR%    Time (s)
Fridrich et al.      84      16      89      294.69
Popescu and Farid    86      14      87      70.97
Amerini et al.       8       92      100     4.94
Our SIFT             6.36    93.64   91      6.36
Our SURF             5.45    94.55   88.18   1.02

The first three rows of Table I are taken from [10] as a benchmark; the fourth row reports the values obtained with our SIFT-based method and the fifth row with our SURF-based method. We see that the keypoint-based methods, SIFT and SURF, are not only extremely fast but also have a high specificity value compared to the rest, which is desirable. On the other hand, the TPR of SURF is lower than that of the other methods; the main reason is the trade-off between the feature vector dimension and speed.

Fig. 11. Performance of the methods represented as a graph.

V. CONCLUSION

In this paper, keypoint-based forgery detection is analyzed using the SIFT and SURF algorithms. The larger number of detected keypoints makes matching with SIFT more accurate. On the other hand, SIFT's larger feature vector (128 elements) makes it run slower than SURF, whose feature vector has only half that dimension (64). Moreover, the integral image used in SURF reduces the time complexity even further. Experimental results confirm that both methods are invariant to different combinations of scaling and rotation. They also give good results in the presence of JPEG compression, Gaussian noise addition, etc. Their main drawback is the lack of keypoint detection in smooth or plain areas of the image.

REFERENCES

[1] H. Farid, "A survey of image forgery detection," IEEE Signal Process. Mag., vol. 26, no. 2, pp. 16-25, 2009.
[2] S. Lyu and H. Farid, "How realistic is photorealistic?," IEEE Trans. Signal Process., vol. 53, no. 2, pt. 2, pp. 845-850, Feb. 2005.
[3] J. Fridrich, D. Soukal, and J. Lukas, "Detection of copy-move forgery in digital images," in Proc. Digital Forensic Research Workshop, 2003.
[4] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vision, vol. 60, no. 2, pp. 91-110, 2004.
[5] V. Christlein, C. Riess, and E. Angelopoulou, "A study on features for the detection of copy-move forgeries," in Proc. Information Security Solutions Europe, Berlin, Germany, 2010.
[6] I. Amerini, L. Ballan, R. Caldelli, A. Del Bimbo, and G. Serra, "Geometric tampering estimation by means of a SIFT-based forensic analysis," in Proc. IEEE ICASSP, Dallas, TX, 2010.
[7] Xu Bo, Wang Junwen, Liu Guangjie, and Dai Yuewei, "Image copy-move forgery detection based on SURF," in Proc. Int. Conf. Multimedia Information Networking and Security (MINES), pp. 889-892, 2010.
[8] T. Hastie, R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning. New York: Springer, 2003.
[9] http://www.micc.unifi.it/ballan/research/image-forensics/
[10] I. Amerini, L. Ballan, R. Caldelli, A. Del Bimbo, and G. Serra, "A SIFT-based forensic method for copy-move attack detection and transformation recovery," IEEE Trans. Inf. Forensics Security, vol. 6, no. 3, pp. 1099-1110, 2011.
[11] http://www.npr.org/templates/story/story.php?storyId=92442928