2016 3rd International Conference on Signal Processing and Integrated Networks (SPIN)
DyWT based Copy-Move Forgery Detection with Improved Detection Accuracy

Rahul Dixit and Ruchira Naskar
National Institute of Technology, Rourkela, Odisha – 769008, INDIA
{514CS1001,naskarr}@nitrkl.ac.in
Abstract— Copy–move forgery, or region duplication, is a form of digital image forgery in which some region(s) of an image are copied and pasted onto the image itself, with the aim of obscuring important image objects. The last decade has seen rapid growth of research interest in copy–move forgery detection in digital images. Since in copy–move forgery the duplicated regions come from the same image, this form of forgery is difficult to detect with conventional image forgery detection techniques, which look for statistical inconsistencies among different image parts. In this paper we propose a region–duplication detection technique which utilizes the undecimated Dyadic Wavelet Transform (DyWT) for its operation. In the proposed approach, we divide an image into pixel sub–matrices or blocks and aim to find matches among different image blocks, so as to detect image region duplications. Similarity between blocks, with respect to features extracted using DyWT, is computed using the Canberra distance. Subsequently, the detection accuracy of the proposed approach is optimized by reducing the number of false block matches. We evaluate and compare our proposed method through a comprehensive set of experiments. Our experimental results suggest that the proposed method attains considerably higher detection accuracy than the state of the art.

Keywords— Copy–move forgery, Detection accuracy, Digital forensics, Dyadic Wavelet Transform, Image forensics, Region duplication.

I. INTRODUCTION

Digital forensics is a branch of computer science which deals with the investigation, verification and recovery of digital data and evidence found in digital devices such as computers, digital cameras or video recorders. Digital images and videos act as major sources of evidence for an event or crime, specifically in the legal industry as well as the media and broadcast industries, which are the major application domains of digital image forensics. The wide availability today of a large number of low–cost multimedia processing tools and software, with numerous advanced features, has made editing and manipulation of multimedia data extremely easy. Any unauthorized modification or tampering of digital multimedia data poses a threat to its integrity. With the vast availability of digital image processing tools and techniques, and the alarming number of digital images used in television, newspapers, posters, magazines and websites, the integrity of legal evidence used in courts of law, as well as in the media and broadcast industries, is indeed at stake.

With the steep rise in the cyber–crime rate today, protecting the integrity and authenticity of digital multimedia data is extremely crucial. Security and protection of digital multimedia data require digital devices to be equipped with special software and hardware facilities, for example, software modules or hardware chips for watermark embedding. However, such techniques increase the cost of the devices manifold, hence limiting their usage. In this paper, we concentrate on blind methods of cyber–crime detection. Blind detection techniques do not require any a priori information to be embedded into the digital multimedia data, and hence eliminate the need to equip digital devices with special hardware or software components. The forensic approach to cyber–crime detection has gained a lot of research interest in recent years. Research in digital forensics is majorly focused on identifying and detecting clues about cyber attacks on digital data. For digital images, copy–move forgery [1] is one of the most widely prevalent, as well as most well–researched, forms of attack. The basic principle behind this form of attack is to copy a portion of an image and paste it onto the same image, with the aim of deceiving the viewer by duplicating or obscuring significant object(s) in the image. For example, Fig. 1(a) shows the scenery of a sea–beach with three flying sea–gulls. A copy–move forgery has been carried out on Fig. 1(a), where the image of a sea–gull has been copied from the original scene and pasted back onto it; the result is shown in Fig. 1(b), in which it appears as if there are four sea–gulls flying over the sea–beach. Similarly, regions of an image may be copy–moved so as to obscure the presence of one or more objects. Images having regular patterns or textures, such as sea–beaches, foliage or sky, are particularly prone to copy–move attacks. The past decade has witnessed a lot of research interest and development towards identifying region duplication or copy–move forgery in digital images; notable works in this direction include [4], [5], [6], [8], [11], all of which use blind mechanisms for identifying image region duplication. Our contribution in this paper is the development of a region–duplication detection technique with improved detection accuracy as compared to the state of the art. The basic operation of the proposed technique relies on the Dyadic Wavelet Transform (DyWT). Subsequently, we propose a technique to reduce the number of false positives while searching for duplicate regions in an image, and hence to further improve the efficiency of the proposed scheme.
Fig. 1. Typical example of copy–move forgery by an image editing tool: (a) original image, (b) forged image.

Fig. 2. Operational flowchart of the proposed method.
The rest of the paper is organized as follows. In Section II, we present an overview of existing literature in digital image forensics, specifically copy–move forgery detection in digital images. In Section III, we present the proposed DyWT based copy–move forgery detection technique. In Section IV, we present and discuss two parameters used to evaluate the performance of the proposed technique. Our experimental results are presented and discussed in Section V. Finally, we conclude with some future research directions in Section VI.

II. RELATED WORK

In recent years, researchers have developed a number of techniques [3], [5], [6], [7] for detecting region duplication or copy–move forgery in digital images. Fridrich et al. [7] proposed three different approaches for efficient copy–move forgery detection, based respectively on exhaustive search for duplicate image blocks, autocorrelation between image blocks, and exact matching of image blocks. Another notable contribution was made by Farid and Popescu [6], which exploits Principal Component Analysis (PCA) of images, broadly used to reduce image dimensionality; however, this technique has very low sensitivity to additive noise. Cao et al. [4] proposed a region–duplication identification method based on the Discrete Cosine Transform (DCT), where the image is first divided into fixed sized blocks, DCT is applied to each block, and the DCT coefficients are quantized; duplicate regions are then detected using lexicographic sorting. This scheme is robust to multiple copy–move forgeries and has considerably low computational complexity. Zhang et al. [5] developed an approach based on the Discrete Wavelet Transform (DWT): DWT is used for feature extraction and the similarity between image blocks is computed using phase correlation. This algorithm has considerably low computational complexity compared to other related schemes. Li et al. [11] proposed a method based on DWT and Singular Value Decomposition (SVD). Initially, DWT is applied for feature extraction and SVD is then used for dimensionality reduction; the singular value vectors are arranged in lexicographical order and, with the help of a threshold value, the authors find the matching pairs. This technique works efficiently even for highly compressed or edge–processed images. Bayram et al. [8] proposed the use of the Fourier–Mellin Transform (FMT) for copy–move forgery detection; their scheme works for highly compressed, rotated (by less than 10 degrees) and scaled images. Lin et al. [9] proposed a double–quantization DCT based copy–move forgery detection method, which works only for the JPEG (Joint Photographic Experts Group) format. Muhammad et al. [2] proposed a region–duplication identification method based on the undecimated dyadic wavelet transform (DyWT), in which a forged image is decomposed into sub–bands and the sub–bands are divided into overlapping blocks; similarity between blocks is measured using the Euclidean distance in order to find a match. This method is robust against JPEG compression. Although the technique proposed here also uses DyWT for region–duplication identification, the use of the Canberra distance measure, as well as the additional proposed technique for reducing the false block matching rate, helps us achieve considerably improved detection accuracy. In the next section, we describe the proposed method of region–duplication detection in digital images.
III. PROPOSED DyWT BASED COPY–MOVE FORGERY DETECTION TECHNIQUE

In the proposed method, in order to detect duplicate regions in an image, we first transform the possibly forged image to its frequency domain using the undecimated Dyadic Wavelet Transform (DyWT), decomposing the image into its low and high frequency components. The frequency components are individually divided into fixed sized overlapping blocks. The blocks are sorted with respect to their similarities, computed using the Canberra distance measure [13]. Blocks whose similarity exceeds a particular threshold value are decided to be duplicates of each other. In the proposed method, the low frequency components are majorly used to signify the similarity between blocks, whereas the high frequency components signify their dissimilarities and are the major cause of false block matches in the proposed technique.
The operation of the proposed technique can be broadly divided into five key steps: (a) pre–processing of the image to prepare it for feature extraction, (b) image feature extraction using DyWT, (c) computing the similarity among image blocks, (d) sorting of image blocks, and finally (e) finding matches between image blocks from the sorted list. The proposed scheme also includes an efficient technique to reduce the rate of false block matches, or false positives, encountered during duplicate region detection; the false positive rate is reduced by investigating the neighbourhood of duplicate block pairs. A block diagram representing the operation of the proposed method is shown in Fig. 2. In the following sub–sections we present each of the above steps in detail.

A. Preprocessing

The aim of preprocessing a possibly forged image is to prepare it for feature extraction. Pre–processing consists of converting the forged color image into its grayscale representation, using the following formula [12]:

Grayscale intensity = 0.299 × R + 0.587 × G + 0.114 × B    (1)

where R, G and B represent the red, green and blue color channel intensities of the image.

B. Feature extraction using DyWT

The input to this step is the grayscale image obtained as a result of pre–processing. We apply DyWT to the image and hence obtain four different frequency sub–bands, viz. LL (approximation), LH (horizontal), HL (vertical) and HH (diagonal). In Fig. 3 we show a one–level decomposition of an image into its sub–bands using DyWT. At scale one, LL1 and HH1 capture the similarity and the dissimilarity between copied and moved blocks, respectively. LL1 represents the approximation, i.e. the coarse level or scaling coefficients, and HH1 represents the finest level or wavelet coefficients.

Fig. 3. One–level DyWT decomposition: (a) forged image, (b) scaling coefficients (LL1), (c) wavelet coefficients (HH1).
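A minimal sketch of the pre–processing and feature extraction steps is given below, in Python rather than the authors' MATLAB implementation. It uses PyWavelets' stationary wavelet transform (pywt.swt2) as a stand–in for the undecimated DyWT; the choice of the Haar wavelet and the helper names are our own illustrative assumptions.

```python
# Sketch of Sections III-A and III-B: grayscale conversion (Eq. 1) and a one-level
# undecimated decomposition. pywt.swt2 (stationary wavelet transform) is used here
# as a stand-in for the undecimated DyWT; the 'haar' wavelet is an arbitrary choice.
import numpy as np
import pywt

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB image to grayscale using Eq. (1)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def dywt_subbands(gray):
    """Return (LL1, HH1); both sub-bands keep the input size (no decimation)."""
    (ll1, (lh1, hl1, hh1)), = pywt.swt2(gray, wavelet='haar', level=1)
    return ll1, hh1

# Example on a synthetic 256 x 256 color image:
# img = np.random.rand(256, 256, 3)
# ll1, hh1 = dywt_subbands(to_grayscale(img))
```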
C. Calculation of similarity using Canberra distance

Initially, we take LL1 and divide it into fixed sized overlapping blocks. Let the entire image have w × h pixels and let each block have b × b pixels. We call each such block a unit detection block for identifying image region duplication. The total number of unit detection blocks is given by (w − b + 1) × (h − b + 1). Next, in order to measure the similarity between individual unit block pairs, we calculate the distance between each pair of blocks using the following Canberra metric [13]:

D(x, y) = Σ_{i=1..n} |x_i − y_i| / (|x_i| + |y_i|)    (2)

where D(x, y) is the distance between blocks x and y, x_i and y_i are the corresponding pixel gray level values in blocks x and y respectively, and n is the total number of pixels in a block. The value of D(x, y) lies in the range [0, 1]. Similarly, we divide HH1 into overlapping blocks and subsequently compute the distance between consecutive pairs of pixel blocks.
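Below is a minimal sketch of the unit–detection–block extraction and the distance of Eq. (2), again with names of our own choosing. Each term of the Canberra sum lies in [0, 1]; the sketch normalises the sum by the block size n so that D(x, y) falls in [0, 1] as stated above, which is our assumption about how that range is obtained, and 0/0 terms are taken as 0.

```python
# Sketch of Section III-C: overlapping b x b unit detection blocks and the Canberra
# distance of Eq. (2). The normalisation by the number of pixels (so that D lies in
# [0, 1]) and the 0/0-term convention are assumptions, not taken from the paper.
import numpy as np

def extract_blocks(subband, b):
    """All overlapping b x b blocks of `subband` as flattened vectors, plus their
    top-left coordinates. Their count is (w - b + 1) * (h - b + 1)."""
    h, w = subband.shape
    blocks, coords = [], []
    for i in range(h - b + 1):
        for j in range(w - b + 1):
            blocks.append(subband[i:i + b, j:j + b].ravel())
            coords.append((i, j))
    return np.array(blocks), coords

def canberra(x, y):
    """Canberra distance between two flattened blocks, normalised to [0, 1]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    num = np.abs(x - y)
    den = np.abs(x) + np.abs(y)
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return terms.sum() / terms.size
```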
D. Sorting of image blocks

The distances for LL1, computed using Eq. (2), are arranged in ascending order in List 1. Similarly, the distances for HH1 are arranged in descending order in List 2. Hence, highly similar pairs of blocks appear at the top of List 1, and highly dissimilar pairs of blocks appear at the top of List 2.

E. Block matching using threshold values

In the proposed algorithm, for block matching using the similarities and dissimilarities between blocks, we require two threshold parameters, TH1 and TH2. The values of the threshold parameters are selected empirically, from numerous trials carried out to obtain optimal results. Next we present the block matching technique using the above two thresholds. Pairs of blocks (x, y) in List 1 having D(x, y) > TH1 are removed from List 1; hence TH1 is the threshold value applied to LL1, and for our experiments we take TH1 = 0.8. List 1 now contains the block pairs satisfying the threshold constraint, arranged in increasing order. A similar process is carried out on HH1, with the threshold constraint D(x, y) < TH2, where TH2 is the threshold value for HH1; in our work we select TH2 = 0.2. List 2 now contains the block pairs satisfying the TH2 threshold constraint, arranged in decreasing order. The corresponding pairs of residual blocks remaining in Lists 1 and 2 after application of the threshold constraints represent duplications in the image. If a block pair is found at the m-th position of List 1 and between the (m − i)-th and (m + i)-th positions in List 2, we conclude that there are some false matches. Another factor leading to false block matches is the possibility of an image originally containing similar objects; since block matching in the proposed technique is based on finding similarities in the LL1 sub–band, this scenario may lead to false matches. In Fig. 4, we show the detection of duplicate regions in an image by the proposed method, as well as the results after reducing false positives using the technique discussed below. Note that in Fig. 4, the forgeries have been induced manually and their size is expressed as a percentage of the image pixels; we have varied the size of the forgery from 10% to 40% to show the variation in the duplicates detected.
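A sketch of the sorting and thresholding of Sections III-D and III-E follows. The paper does not spell out how candidate block pairs are enumerated, so pairing each block with its lexicographic neighbour is purely our illustrative assumption; the thresholds TH1 = 0.8 and TH2 = 0.2 are the values quoted above, and `dist` would be the block distance of Eq. (2), e.g. the `canberra` helper sketched earlier.

```python
# Sketch of Sections III-D and III-E: build List 1 (similar pairs in LL1, ascending)
# and List 2 (dissimilar pairs in HH1, descending) and apply the thresholds.
# Pairing each block with its lexicographic neighbour is an illustrative assumption.
import numpy as np

TH1, TH2 = 0.8, 0.2

def candidate_pairs(blocks):
    """Pair every block with its successor in lexicographic order."""
    order = np.lexsort(blocks.T[::-1])          # block vectors sorted lexicographically
    return [(order[k], order[k + 1]) for k in range(len(order) - 1)]

def build_lists(ll_blocks, hh_blocks, pairs, dist):
    """Return (List 1, List 2) as lists of (distance, (i, j)) tuples."""
    list1 = sorted((dist(ll_blocks[i], ll_blocks[j]), (i, j)) for i, j in pairs)
    list1 = [(d, p) for d, p in list1 if d <= TH1]           # keep similar pairs
    list2 = sorted(((dist(hh_blocks[i], hh_blocks[j]), (i, j)) for i, j in pairs),
                   reverse=True)
    list2 = [(d, p) for d, p in list2 if d >= TH2]           # keep dissimilar pairs
    return list1, list2
```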
Fig. 4. Detection of copy–move forgeries of varying sizes, before and after the reduction of false matches. Each row shows, from left to right: the forged image, its grayscale version, the output image with false matches, and the output image without false matches. Forgery sizes: (a)–(d) 10%, (e)–(h) 20%, (i)–(l) 30%, (m)–(p) 40%. Detected duplicate regions are highlighted in white; false positives are shown with red circles.
F. Reduction of false positives
As discussed previously, a block pair at the m-th position in List 1 may match any one of the (m − i)-th to (m + i)-th block pairs in List 2, and this causes false positives to be generated by the proposed method, which effectively reduces its detection accuracy. We therefore propose an extension of the above technique to improve its accuracy by reducing the false positives. We select block pairs P and Q from Lists 1 and 2 respectively, and find the neighbors of each of the blocks P and Q. We consider eight neighbors of each block: two horizontal, two vertical and four diagonal. For each block (P or Q), out of all its eight neighbors, the number of neighbors that reside in List 1 and List 2 combined is recorded as N_nbr-test. Only those block pairs which have N_nbr-test ≥ 4 are valid contenders for duplicate detection; if N_nbr-test < 4, we discard P and Q from both lists. After elimination of all such block pairs, we look for duplicates in the residual Lists 1 and 2. This technique helps the proposed method attain much improved detection accuracy, due to the reduction in the false positive block match rate. An example of neighborhood selection for two blocks P and Q (selected from Lists 1 and 2 respectively) is shown in Fig. 5, where N1^1, ..., N8^1 are the neighbors of P and N1^2, ..., N8^2
are the neighbors of Q. At least four out of N1^1, ..., N8^1, as well as four out of N1^2, ..., N8^2, must reside in List 1 and List 2 combined; otherwise both P and Q are discarded.

Fig. 5. Selected neighbouring blocks of P and Q.
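A minimal sketch of this neighbourhood test is given below. Representing each unit detection block by its top–left coordinate, and looking up List membership through sets of coordinates, are our illustrative assumptions.

```python
# Sketch of Section III-F: keep a pair (P, Q) only if at least four of the eight
# neighbours of P, and at least four of the eight neighbours of Q, appear in
# List 1 or List 2 (combined). Blocks are identified by their top-left coordinates.
def eight_neighbours(coord):
    """Top-left coordinates of the eight neighbouring unit blocks."""
    i, j = coord
    return [(i + di, j + dj)
            for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)]

def passes_neighbour_test(p_coord, q_coord, list1_coords, list2_coords, min_nbrs=4):
    """N_nbr-test >= min_nbrs for both P and Q, otherwise the pair is discarded."""
    members = set(list1_coords) | set(list2_coords)
    def n_nbr_test(coord):
        return sum(1 for nbr in eight_neighbours(coord) if nbr in members)
    return n_nbr_test(p_coord) >= min_nbrs and n_nbr_test(q_coord) >= min_nbrs
```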
IV. DETECTION ACCURACY AND FALSE POSITIVE RATE

In this section we propose two parameters for the performance evaluation of the proposed copy–move forgery detection algorithm, viz. Detection Accuracy and False Positive Rate. These parameters are also useful for comparing the proposed scheme with the state of the art. Next, we formally define the two parameters one by one; the performance of the proposed scheme has been evaluated in terms of them, and the results of this evaluation are presented in Section V.

Detection Accuracy (DA) is defined as the percentage of forged pixels in an image correctly detected by a particular detection algorithm. DA is the ratio of the total number of correctly identified copy–moved pixels to the total number of actually copy–moved pixels; an efficient detection technique results in high detection accuracy:

DA = (CCMP / ACM) × 100%

where CCMP denotes the total number of correctly detected copy–moved pixels and ACM denotes the total number of actually copy–moved pixels.

False positives during block matching by the proposed mechanism arise due to the presence of regular patterns or flat regions in the image. A false positive match is the phenomenon of a non–duplicated image region being (falsely) identified as duplicated by the region–duplication detection algorithm. The False Positive Rate (FPR) represents the proportion of absent events that produce positive test outcomes, i.e., the conditional probability of a positive test result given an absent event. In our proposed algorithm, FPR is used to measure the area of the image regions which are not copy–moved, yet are detected as duplicated regions by the forgery detection algorithm. FPR is the ratio of the number of image pixels falsely detected as copy–moved to the number of pixels actually copy–moved:

FPR = IFP / ACM

where IFP denotes the number of pixels falsely detected to have been copy–moved, and ACM denotes the number of actually copy–moved pixels.
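The two parameters reduce to simple pixel counts over a ground–truth copy–move mask and a detected mask; a minimal sketch, assuming boolean masks and helper names of our own, is given below.

```python
# Sketch of Section IV: Detection Accuracy and False Positive Rate computed from a
# ground-truth copy-move mask and the detector's output mask (boolean pixel arrays).
import numpy as np

def detection_accuracy(detected, ground_truth):
    """DA = CCMP / ACM * 100%."""
    acm = ground_truth.sum()                              # actually copy-moved pixels
    ccmp = np.logical_and(detected, ground_truth).sum()   # correctly detected pixels
    return 100.0 * ccmp / acm

def false_positive_rate(detected, ground_truth):
    """FPR = IFP / ACM."""
    acm = ground_truth.sum()
    ifp = np.logical_and(detected, ~ground_truth).sum()   # falsely detected pixels
    return ifp / acm
```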
V. RESULTS AND DISCUSSION

In this section we present our experimental results. As discussed in Section IV, detection accuracy and false positive rate are the parameters used to evaluate the performance of the proposed scheme. All implementations were done in MATLAB 2014a, installed on a machine with an Intel Core i7 processor running at 3.40 GHz and 4 GB of RAM. Our test dataset consists of 50 standard color test images of size 256 × 256 each, collected from the CVG UGR Image Database [14] and the USC SIPI Image Database [15]. Regions of the test images have been manually duplicated for the sake of experimentation; the size of the duplicated image regions is varied as 10%, 20%, 30% and 40%, and the unit detection block size (as discussed in Section III-C) is varied from 6 × 6 to 34 × 34. The detection accuracy and false positive rate results are presented in Table I.

TABLE I
DETECTION ACCURACY AND FALSE POSITIVE RATE

Forged Image Region   Unit Block Size   DAC1      FPR1      DAC2      FPR2
10%                   6 × 6             98.5347    8.1782   99.7378   2.0382
10%                   9 × 9             98.2272    8.6724   99.7304   2.0612
10%                   13 × 13           97.9256    9.8495   99.7389   2.0387
10%                   17 × 17           98.1013    8.8218   99.7398   1.9974
10%                   22 × 22           97.9419   10.4917   99.7411   1.9914
10%                   25 × 25           97.8212   12.5370   99.7501   1.9462
10%                   29 × 29           97.2347   14.5372   99.7588   1.9272
10%                   34 × 34           96.9732   15.2155   99.7601   1.9012
20%                   6 × 6             98.8472    6.8797   99.7398   2.0112
20%                   9 × 9             98.1173    7.8780   99.7362   1.8122
20%                   13 × 13           98.9878    6.7346   99.7382   1.8083
20%                   17 × 17           97.9757    7.2302   99.7422   1.8013
20%                   22 × 22           97.9850    9.8669   99.7511   1.7619
20%                   25 × 25           97.8218   12.7585   99.7528   1.7111
20%                   29 × 29           97.9712   10.9244   99.7616   1.6732
20%                   34 × 34           97.8493   13.6799   99.7724   1.6317
30%                   6 × 6             98.9053    6.8807   99.7421   1.9081
30%                   9 × 9             98.2174    5.6808   99.7460   1.6085
30%                   13 × 13           98.9076    6.7295   99.7503   1.3938
30%                   17 × 17           98.7368    6.9202   99.7530   1.3920
30%                   22 × 22           98.4889    7.8969   99.7620   1.3237
30%                   25 × 25           98.4038    8.7085   99.7698   1.3482
30%                   29 × 29           98.3410    8.9746   99.7706   1.3012
30%                   34 × 34           98.5143    8.2689   99.7788   1.2511
40%                   6 × 6             99.2173    3.7807   99.7429   1.4081
40%                   9 × 9             98.9245    4.9846   99.7483   1.3821
40%                   13 × 13           98.8901    5.1560   99.7513   1.3748
40%                   17 × 17           99.3564    3.0402   99.7580   1.3522
40%                   22 × 22           99.1433    4.1689   99.7680   1.3018
40%                   25 × 25           98.8798    6.5871   99.7737   1.2882
40%                   29 × 29           98.8354    6.1031   99.7787   1.2191
40%                   34 × 34           98.9823    5.6921   99.7827   1.2024

In Table I, the columns labelled DAC1 and FPR1 indicate the detection accuracy and false positive rate achieved by the DyWT based region–duplication detection algorithm proposed in Section III, while the columns DAC2 and FPR2 indicate the detection accuracy and false positive rate achieved by the proposed method after application of the false positive reduction technique proposed in Section III-F. The results in Table I show that the detection accuracy varies between 96.9732% and 99.2173%, and the false positive rate varies between 3.0402 and 15.2155. After application of the false positive reduction technique, the detection accuracy and false positive rate vary from 99.7304% to 99.7827% and from 1.2024 to 2.0612, respectively. Fig. 6 shows the variation of detection accuracy and false positive rate with unit block size as well as with the percentage of
image forgery, both before and after false positive reduction.

Fig. 6. Variation of detection accuracy and false positive rate with unit block size: (a) detection accuracy vs. unit block size with false matches, (b) detection accuracy vs. unit block size after false positive reduction, (c) false positive rate vs. unit block size with false matches, (d) false positive rate vs. unit block size after false positive reduction.

Table II shows a performance comparison of the proposed method with three recent state–of–the–art copy–move forgery detection algorithms [2], [10], [11]. The results presented in Table II show that the proposed algorithm achieves considerably higher detection accuracy and a lower false positive rate than the state of the art.

TABLE II
COMPARISON OF DETECTION ACCURACY AND FALSE POSITIVE RATE WITH OTHER METHODS

Method                   Detection Accuracy   False Positive Rate
Muhammad et al. [2]      95.90                 4.54
Li et al. [11]           91.03                 9.65
Mahdian and Saic [10]    81.18                10.03
Proposed Method          99.73–99.78†          1.2–2.3†

† Varies with unit block size.
VI. CONCLUSION AND FUTURE WORK

In this paper, we have proposed a technique for region–duplication identification in digital images. The performance of the proposed technique has been optimized in terms of detection accuracy and false positive rate, and we have presented a two–way parameterization to evaluate and compare copy–move forgery detection schemes. A future direction of this work is the identification of geometrically transformed duplicated image regions, where the geometric transformations may include rescaling, rotation, reflection, or a combination of two or more of these. The shift–invariant properties of the DyWT used in the proposed method would be exploited to achieve detection of such geometrically transformed image regions.

REFERENCES

[1] A. J. Fridrich, B. D. Soukal, and A. J. Luk, Detection of copy-move forgery in digital images, in Proceedings of the Digital Forensic Research Workshop, 2003.
[2] G. Muhammad, M. Hussain, and G. Bebis, Passive copy move image forgery detection using undecimated dyadic wavelet transform, Digital Investigation, vol. 9, no. 1, pp. 49–57, 2012.
[3] Y. Huang, W. Lu, W. Sun, and D. Long, Improved DCT-based detection of copy-move forgery in images, Forensic Science International, vol. 206, no. 1, pp. 178–184, Aug. 2011.
[4] Y. Cao, T. Gao, L. Fan, and Q. Yang, A robust detection algorithm for copy-move forgery in digital images, Forensic Science International, vol. 214, no. 1, pp. 33–43, 2012.
[5] J. Zhang, Z. Feng, and Y. Su, A new approach for detecting copy-move forgery in digital images, in Communication Systems, 2008 (ICCS 2008), 11th IEEE Singapore International Conference on, pp. 362–366, 2008.
[6] A. P. Farid and A. C. Popescu, Exposing digital forgeries by detecting duplicated image regions, Technical Report, Department of Computer Science, Dartmouth College, Hanover, USA, 2004.
[7] A. J. Fridrich, B. D. Soukal, and A. J. Luk, Detection of copy-move forgery in digital images, in Proceedings of the Digital Forensic Research Workshop, 2003.
[8] S. Bayram, H. T. Sencar, and T. N. Memon, An efficient and robust method for detecting copy-move forgery, in Acoustics, Speech and Signal Processing, 2009 (ICASSP 2009), IEEE International Conference on, pp. 1053–1056, 2009.
[9] Z. Lin, J. He, X. Tang, and C. K. Tang, Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis, Pattern Recognition, vol. 42, no. 11, pp. 2492–2501, 2009.
[10] B. Mahdian and S. Saic, Using noise inconsistencies for blind image forensics, Image and Vision Computing, vol. 27, no. 10, pp. 1497–1503, 2009.
[11] G. Li, Q. Wu, D. Tu, and S. Sun, A sorted neighborhood approach for detecting duplicated regions in image forgeries based on DWT and SVD, in Multimedia and Expo, 2007 IEEE International Conference on, pp. 1750–1753, 2007.
[12] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Pearson Education India, 2009.
[13] M. Kokare, P. K. Biswas, and B. N. Chatterji, Texture image retrieval using rotated wavelet filters, Pattern Recognition Letters, vol. 28, no. 10, pp. 1240–1249, 2007.
[14] CVG UGR Image Database, http://decsai.ugr.es/cvg/dbimagenes/c256.php
[15] USC SIPI Image Database, http://sipi.usc.edu/database/database.php?volume=misc