2009 International Conference on Availability, Reliability and Security
Detecting Image Tampering Using Feature Fusion

Pin Zhang, Xiangwei Kong
School of Electronic and Information, Dalian University of Technology, Dalian 116023, China
[email protected], [email protected]
Abstract—Along with the development of sophisticated image processing software, it is becoming easier to forge a digital image but harder to detect the forgery. Distinguishing tampered photos from authentic ones is already a real problem. In this paper, we propose an approach based on feature fusion to detect digital image tampering. First, we extract feature statistics that represent the property of a camera from images taken by that camera. These feature statistics are used to train a one-class classifier, which yields the feature pattern of the given camera. Then we apply sliding segmentation to the testing images. Finally, the feature statistics extracted from the image blocks are fed into the trained one-class classifier and matched against the feature pattern of the given camera. Images with a low percentage of matched blocks are classified as tampered. Our method achieves a high accuracy in detecting tampered images that have undergone post-processing such as JPEG compression, re-sampling and retouching.

Keywords: digital forensics, feature fusion, image forensics, image tampering, tampering detection
I. INTRODUCTION
Nowadays, digital cameras play an important role in our daily life as well as in forensic affairs. However, sophisticated digital image processing software such as Photoshop and Freehand has made image tampering much easier to perform but harder to detect. Digital images are therefore no longer as trustworthy as before, which urges us to find ways to distinguish tampered images from authentic ones. Researchers have proposed many effective methods from different perspectives. The first was the active forensic approach: by embedding a watermark into digital images at capture time, tampering can be detected effectively. However, it can hardly be used widely because it requires additional technology inside the camera. Passive-blind forensic methods, which need no pre-processing in the camera, have therefore been widely developed. Passive-blind forensic methods fall into two categories [1]. First, from the hardware side, Fridrich [2] proposed a method using a feature extracted from photos called sensor pattern noise, which is caused by hardware defects in cameras; in their experiment, the method identified 83% of forged cases. Second, from the software side, S.-F. Chang [3], [4] used photo features produced by the software inside cameras to detect image splicing. Using features based on the camera response function (CRF), this approach achieves a detection accuracy of 87%, but its false acceptance rate (FAR) is also at least 15.58%. Based on a natural image model that is effective in detecting changes of correlation between image pixels, Shi, Y. Q. [5] proposed a new approach for image forgery detection that achieves an accuracy of 82%. Min Wu [6] introduced a new set of higher-order statistical features to determine whether a digital image has been tampered; using SVM classification, the mean accuracy obtained is 71.48%. Bicoherence features were applied to image splicing detection by S.-F. Chang [7]; this method works by detecting abrupt discontinuities of the features and obtains an accuracy of 80%. A technique detecting the different CFA interpolation algorithms within an image was proposed by A. C. Popescu and H. Farid [8]; it obtains an accuracy of 95.71% when using a 5x5 interpolation kernel for two different cameras, but this drops to 83.33% when three cameras are used.

Our blind approach to image tampering detection is based on the principle that the tampering process and the post-processing operations inevitably disturb the statistical features of the original image [9]. With a classification accuracy of 90%, the technique in this paper outperforms the prior work. When the authenticity of a photo is in doubt in forensic affairs, we first ask the photographer to turn in the camera with which the photo was taken. Then, using photos taken by this camera, we extract feature statistics that represent the property of the given camera. We detect image tampering by matching feature statistics against the camera feature pattern (the result of training a one-class classifier on the feature statistics of the given camera).

In the second section of this paper, we introduce the imaging process in digital cameras and the features used in our method. Section three describes the theory of the feature statistics. Section four illustrates our detection system. Experiments and their results are shown in Section five to demonstrate the reliability of our approach. The last section concludes the paper.
II. IMAGING PIPELINE IN DIGITAL CAMERAS
The basic imaging pipeline structure is shown in Fig. 1. First, the natural light passing through the optical system is "captured" by a sensor (a CCD, for example). Generally, only a single CCD detector is used at every pixel in a consumer digital camera, and the surface of the detector is partitioned with different color filters, called the Color Filter Array (CFA). The missing pixels in each color plane are then filled in by CFA interpolation. Finally, operations such as demosaicing, enhancement and gamma correction are carried out in the camera. After the imaging process, the digital image is usually converted to a user-defined format, such as RAW, TIFF or JPEG, and stored in memory.
Figure 1. Example imaging pipeline in digital cameras.

Although the basic structures within cameras are similar, there are still differences between any two cameras, introduced at each imaging step. For example, the differences may be caused by hardware defects during the capturing process, or by the specific algorithms used in adjustment steps such as white balancing and gamma correction. These differences endow images with a set of features that can represent the property of the imaging process within a camera. Features such as image color features [9], wavelet statistical correlation features [10], image quality features [11], and bicoherence amplitude and phase features [12],[13],[14] vary greatly according to the mechanisms within cameras, such as the CFA interpolation, the post-processing algorithms, and the properties of the hardware, so these features are effective in identifying cameras. We use these features in the form of their statistics, because statistics are a quantitative representation of the features: the camera feature pattern can be obtained by training a one-class classifier with these statistics, and in the matching step the statistics are fed into the trained one-class classifier to decide whether each part of a testing image comes directly from the given camera or has been tampered.

III. FEATURE EXTRACTION AND ANALYSIS

In this section, we introduce the features we use in detecting image tampering and analyze the effectiveness of these features in our approach.

A. Color Features [9]

Cameras of different models use various CFA interpolation algorithms and color transformation processes. Color features reflect these differences in the color properties of the whole image [9]. The color features include: the mean pixel value, the correlations of the RGB color pairs, the neighbor distribution of the centroid, and the energy ratios of the RGB colors. The energy ratios are defined as:

$$E_1 = \frac{|G|^2}{|B|^2}, \qquad E_2 = \frac{|G|^2}{|R|^2}, \qquad E_3 = \frac{|B|^2}{|R|^2}$$
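For illustration, the energy ratios and channel-pair correlations can be computed as in the following sketch. This is our own illustration rather than the authors' code; channel energy is taken here as the sum of squared pixel values in each color plane.

```python
# Sketch of the color-feature extraction of [9]: RGB energy ratios
# and channel-pair correlation coefficients.
import numpy as np

def color_energy_ratios(img):
    """img: HxWx3 uint8 RGB array. Returns (E1, E2, E3)."""
    r, g, b = (img[..., i].astype(np.float64) for i in range(3))
    er, eg, eb = (np.sum(c ** 2) for c in (r, g, b))
    e1 = eg / eb   # E1 = |G|^2 / |B|^2
    e2 = eg / er   # E2 = |G|^2 / |R|^2
    e3 = eb / er   # E3 = |B|^2 / |R|^2
    return e1, e2, e3

def rgb_pair_correlations(img):
    """Correlation coefficients of the RGB channel pairs (also used in [9])."""
    flat = img.reshape(-1, 3).astype(np.float64).T
    c = np.corrcoef(flat)                  # 3x3 correlation matrix
    return c[0, 1], c[0, 2], c[1, 2]       # (R,G), (R,B), (G,B)
```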
B. Image Quality Features [11]

Different cameras have different image quality features due to differences in hardware and software. For example, when photographing the same scene, the images from different cameras show dissimilar brightness or texture definition. Quality features therefore help distinguish the images taken by different cameras [11]. The image quality features used in this paper are shown in Table I.

TABLE I. IMAGE QUALITY FEATURES

Features based on pixel difference: mean square error (MSE): Q1; mean absolute error (MAE): Q2; corrected infinite norm: Q3; image fidelity: Q4
Features based on correlation: normed cross correlation: Q5; Czekanowski correlation: Q6; mean angle similarity: Q7
Features based on frequency spectrum: spectrum amplitude error: Q8; spectrum phase error: Q9; spectrum phase-amplitude error: Q10; image block spectrum amplitude error: Q11; image block spectrum phase error: Q12; image block spectrum phase-amplitude error: Q13
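As a hedged illustration of two Table I entries, the sketch below follows Avcibas's convention [11] of scoring an image against a filtered version of itself; the choice of a Gaussian low-pass filter (and its sigma) is our assumption, not a value fixed by the paper.

```python
# Minimal sketch of Q1 (MSE) and Q5 (normed cross correlation), each
# comparing the image against a low-pass filtered version of itself.
import numpy as np
from scipy.ndimage import gaussian_filter

def mse_q1(img):
    """Q1: mean square error between image and its blurred version."""
    f = img.astype(np.float64)
    g = gaussian_filter(f, sigma=1.0)
    return np.mean((f - g) ** 2)

def normed_cross_correlation_q5(img):
    """Q5: normed cross correlation between image and its blurred version."""
    f = img.astype(np.float64)
    g = gaussian_filter(f, sigma=1.0)
    return np.sum(f * g) / np.sum(f ** 2)
```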
C. Wavelet Features

Aside from features in the spatial domain, digital images can also be described in more detail with features from a transform domain. Wavelet coefficients describe an image well in terms of its time-frequency characteristics. Moreover, the various imaging and post-processing steps in different cameras have distinct influences on the correlations in the wavelet domain [10]. Therefore, we apply a quadrature mirror filter (QMF) decomposition to each color channel of true-color images, and then evaluate the mean, variance, skewness and kurtosis of the sub-band coefficients, as well as the optimal linear prediction error of the coefficient amplitudes, at each level and orientation (horizontal, vertical and diagonal). The products are the wavelet features used in our method. In particular, we use the statistics of the wavelet coefficients' prediction error: since the prediction error is the residual after subtracting the content-related portion from the original image, the correlation between the prediction-error statistics and the image content is reduced, which improves the accuracy in classifying random image samples.
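The sub-band statistics can be sketched as follows. PyWavelets' 'db4' filter stands in for the QMF bank of [10] (an assumption on our part), and the linear-prediction-error features are omitted for brevity.

```python
# Sketch: multilevel wavelet decomposition of one color channel, then
# mean/variance/skewness/kurtosis of every detail sub-band.
import numpy as np
import pywt
from scipy.stats import skew, kurtosis

def wavelet_stats(channel, levels=3, wavelet="db4"):
    """channel: 2-D array (one color plane). Returns the feature vector."""
    feats = []
    coeffs = pywt.wavedec2(channel.astype(np.float64), wavelet, level=levels)
    for detail in coeffs[1:]:            # skip the approximation band
        for band in detail:              # horizontal, vertical, diagonal
            v = band.ravel()
            feats += [v.mean(), v.var(), skew(v), kurtosis(v)]
    return np.array(feats)               # levels * 3 orientations * 4 stats
```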
D. Bicoherence Features

Post-processing operations such as enhancement and gamma correction introduce a nonlinear distortion between image brightness and pixel value, as shown in Fig. 2, and this nonlinear distortion leads to higher-order dependencies in the frequency domain. The bicoherence features can therefore be used to estimate the geometric distortion and brightness distortion in digital images [12],[13],[14]. Moreover, because the bicoherence features extract information that has no direct relation to the image content, they reduce the dependence on image content. In our approach, we introduce two bicoherence features: the mean bicoherence amplitude and the bicoherence phase entropy. Their sensitivity to nonlinear distortion makes them suitable features for identifying cameras.
Figure 2. Mean of 201 camera response functions from the DoRF database [19],[20].

In order to make the bicoherence estimate at each frequency pair $(\omega_1, \omega_2)$ independent of $P(\omega_1)$, $P(\omega_2)$ and $P(\omega_1 + \omega_2)$, we use the normalized bicoherence. For a one-dimensional signal $f(x)$, it is defined as:

$$\hat{b}(\omega_1,\omega_2) = \frac{\frac{1}{N}\sum_k F_k(\omega_1) F_k(\omega_2) F_k^*(\omega_1+\omega_2)}{\sqrt{\frac{1}{N}\sum_k \left|F_k(\omega_1) F_k(\omega_2)\right|^2}\,\sqrt{\frac{1}{N}\sum_k \left|F_k(\omega_1+\omega_2)\right|^2}} = \left|\hat{b}(\omega_1,\omega_2)\right| e^{j\phi\left(\hat{b}(\omega_1,\omega_2)\right)} \qquad (1)$$

where $F_k(\cdot)$ is the Fourier transform of the $k$-th segment of the signal; to limit the estimation error, the Fourier transform is computed segment by segment.

We take the mean bicoherence amplitude and the bicoherence phase entropy over the frequency pairs as our features:

$$M = |W|^{-1} \sum_{W} \left|\hat{b}(\omega_1,\omega_2)\right|, \quad W = \left\{(\omega_1,\omega_2) \,\middle|\, \omega_1 = \tfrac{2\pi n_1}{N},\ \omega_2 = \tfrac{2\pi n_2}{N},\ n_1,n_2 = 0,\dots,N-1\right\} \qquad (2)$$

$$P = \sum_n p(\varphi_n) \log p(\varphi_n), \quad p(\varphi_n) = |W|^{-1} \sum_{W} I\!\left(\phi\left(\hat{b}(\omega_1,\omega_2)\right) \in \varphi_n\right) \qquad (3)$$

$$\varphi_n = \left\{\phi \,\middle|\, -\pi + \tfrac{2\pi n}{N} \le \phi < -\pi + \tfrac{2\pi (n+1)}{N}\right\}, \quad n = 0,\dots,N-1$$

where $I(\cdot)$ is the indicator function.
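A rough sketch of how Eqs. (1)-(3) could be estimated for a one-dimensional signal is given below; the segment length, the 50% overlap and the number of phase bins are our assumptions, not values fixed by the paper.

```python
# Sketch of the bicoherence features: split the signal into overlapping
# segments, accumulate the normalized bicoherence (Eq. 1), then take the
# mean amplitude M (Eq. 2) and the phase entropy P (Eq. 3).
import numpy as np

def bicoherence_features(x, seg_len=64, n_bins=32):
    segs = [x[i:i + seg_len]
            for i in range(0, len(x) - seg_len + 1, seg_len // 2)]
    F = np.fft.fft(np.array(segs), axis=1)          # one row per segment
    w1, w2 = np.meshgrid(np.arange(seg_len), np.arange(seg_len),
                         indexing="ij")
    w12 = (w1 + w2) % seg_len                       # frequency w1 + w2
    num = np.mean(F[:, w1] * F[:, w2] * np.conj(F[:, w12]), axis=0)
    den = np.sqrt(np.mean(np.abs(F[:, w1] * F[:, w2]) ** 2, axis=0)
                  * np.mean(np.abs(F[:, w12]) ** 2, axis=0))
    b = num / (den + 1e-12)                         # Eq. (1)
    M = np.mean(np.abs(b))                          # Eq. (2)
    hist, _ = np.histogram(np.angle(b), bins=n_bins, range=(-np.pi, np.pi))
    p = hist / hist.sum()
    P = np.sum(p[p > 0] * np.log(p[p > 0]))         # Eq. (3)
    return M, P
```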
IV. IMAGE TAMPERING DETECTION BASED ON FEATURE FUSION

In this section, we introduce the feature selection method and describe the one-class classifier. We then present our detection system for image tampering.
A. Feature Selection Using the Fisher Classifier

In order to improve the performance of our detection system and reduce the feature dimensionality, we select the features that are most effective for detection. In this paper, the Fisher classifier, a linear classifier [15], is used for feature selection. If the samples can be classified well by a single feature under a linear classifier, the samples are not interlaced in that feature's space; that is, the feature is highly efficient in classifying these samples. The Fisher classifier therefore makes it easy to find the most efficient features for the detection system and, consequently, to obtain the set of most effective features. The features are ranked according to the Fisher classifier's result, and we then add one feature at a time in that order. Fig. 4 shows the relation between the number of selected feature dimensions and the corresponding classification accuracy. After the feature set exceeds 87 dimensions, the accuracy improves little as more features are added, so we take the first 87 features for our detection system.
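A minimal sketch of this ranking-and-selection procedure follows, using the standard Fisher criterion (between-class over within-class scatter); the exact scoring used in the paper may differ.

```python
# Sketch: score each feature with the Fisher criterion, sort, keep the
# first k features (k = 87 in the paper).
import numpy as np

def fisher_scores(X, y):
    """X: (n_samples, n_features); y: binary class labels (0/1)."""
    X0, X1 = X[y == 0], X[y == 1]
    between = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    within = X0.var(axis=0) + X1.var(axis=0)
    return between / (within + 1e-12)

def select_top_features(X, y, k=87):
    order = np.argsort(fisher_scores(X, y))[::-1]   # best feature first
    return order[:k]                                # indices to keep
```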
B. One-Class Classifier [16]

As the spliced parts of a tampered image may have been captured by any camera, we have no prior characteristics of the spliced parts. Moreover, the characteristics of parts that have been blurred or retouched cannot be assigned to any camera at all. So we cannot use multi-class classifiers to classify the image blocks by their features. In our method, a one-class classifier is used to solve this problem. We train the one-class classifier with the selected features to get the feature pattern of the given camera at a fixed block size. The one-class classifier is then used to make binary decisions for the blocks of testing images; the percentage of matched blocks shows whether the testing image has been tampered or not.

Training the one-class classifier aims at finding the camera pattern scope in the feature space under a certain false alarm rate (FAR). When we classify the testing samples, if the projection of a sample falls into the pattern scope of our given camera, we say this sample matches the given camera; otherwise, it does not. For example, in a three-dimensional feature space, the output of training the one-class classifier is a three-dimensional ball under a certain FAR. The space within this ball is the pattern space of the given camera. In the testing step, if the projection of the sample features falls into this ball, the sample is taken to come from the given camera; otherwise, it is not.
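The following sketch uses scikit-learn's OneClassSVM as a stand-in for the one-class classifier of [16]; nu = 0.01 approximates the 1% false-rejection tolerance described later, and the paper's parameters c and g map only loosely onto nu and gamma here.

```python
# Sketch: train a one-class SVM (RBF kernel) on block features from the
# given camera, then measure what fraction of test blocks it accepts.
import numpy as np
from sklearn.svm import OneClassSVM

def train_camera_pattern(block_features):
    """block_features: (n_blocks, n_features) array from the given camera."""
    clf = OneClassSVM(kernel="rbf", nu=0.01, gamma="scale")
    return clf.fit(block_features)

def matched_fraction(clf, test_block_features):
    """Fraction of test blocks falling inside the camera pattern scope."""
    return np.mean(clf.predict(test_block_features) == 1)
```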
C. Image Tampering Detection System

From the principles of our features, it can be seen that the calculation of the feature statistics in our method is independent of the size of the images and is homogeneous across different locations of an image, so the features are fit for training and classifying image blocks.

Figure 3. Image tampering detection system based on feature fusion.

The detection system is shown in Fig. 3. In the training step, the training images are segmented with a sliding window into blocks of the same size. In our experiment, the block size is 128x128 and the sliding step is 50 pixels; if the block size is smaller than 128x128, the accuracy of our method falls as the block size decreases, because the feature statistics cannot represent the property of a given camera precisely when there are not enough pixels to yield representative features. The next step is feature selection. After that, the one-class classifier is trained to get the feature pattern of the given camera at a fixed block size. The radial basis function (RBF) is chosen as the kernel function of the one-class classifier, and after optimum seeking in the training process, cross validation is used to determine the parameters c and g [17]. Finally, the same set of feature statistics extracted from the testing blocks is fed into the trained one-class classifier to get the matching result.

The principle in training and matching is as follows. Because the input to the one-class classifier consists of statistics, which fluctuate, we set a false rejection ratio of 1% for the one-class classifier; that is, we allow the classifier to reject one sample in error out of every 100 samples. This sets a tolerance for the matching step to adapt to the fluctuation of the statistics. In the matching step, if the percentage of matched blocks is lower than 99% of all blocks in a testing image, the image is regarded as tampered. For most tampering operations, the goal is to tamper with the key objects, whose areas are normally much bigger than 1% of the whole image, so the matching result of a tampered image is always far below 99%, and the tampered image can be distinguished clearly in the detection system.

Different cameras have dissimilar properties due to the different hardware and software used in them. The features we use in our method represent these properties in various aspects. The tampering operation, or any post-processing operation such as blurring, retouching or gamma correction, inevitably changes these features [9], and these changes make the tampered blocks hardly match the original pattern features. This is what makes our method practicable for tampered image detection.
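Putting the pieces together, the sketch below shows the block-level decision rule with the paper's settings (128x128 blocks, 50-pixel sliding step, 99% matched-block threshold); extract_features is a placeholder for the fused feature vector built from the sections above.

```python
# Sketch: slide a 128x128 window over the image, classify every block
# with the trained one-class classifier, flag the image as tampered when
# fewer than 99% of the blocks match the camera pattern.
import numpy as np

def sliding_blocks(img, size=128, step=50):
    h, w = img.shape[:2]
    for y in range(0, h - size + 1, step):
        for x in range(0, w - size + 1, step):
            yield img[y:y + size, x:x + size]

def is_tampered(img, clf, extract_features, threshold=0.99):
    feats = np.array([extract_features(b) for b in sliding_blocks(img)])
    matched = np.mean(clf.predict(feats) == 1)   # fraction of matched blocks
    return matched < threshold                   # below 99% -> tampered
```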
V. EXPERIMENT AND ANALYSIS
A. Experiment Environment and Parameters

In the experiment, after training, we test 600 images from our laboratory image dataset. 300 of them were taken by Nikon E5700, Sony F828 and Canon Pro1 cameras, and the other 300 are images spliced from two authentic images taken by these three types of cameras. The spliced images are grouped as Canon-Nikon, Canon-Sony and Nikon-Sony. The size of these images ranges from 640x480 to 2048x1536. They have all undergone the post-processing operations of blurring and JPEG compression with a quality factor of 75%.
B. Experiment

The relation between the number of selected feature dimensions and the classification accuracy is shown in Fig. 4. From the figure, we can see that the accuracy of our detection system using all 247 feature dimensions is nearly the same as that using the first 87 features, so in our experiment we use these 87 features for tampering detection.
Figure 4. Relation between selected feature dimensions and accuracy of classification
TABLE II. TESTING RESULT OF OUR IMAGE DATABASE

Camera Feature Pattern | Images      | Accuracy
Canon                  | Canon       | 88%
                       | Canon-Nikon | 85%
                       | Canon-Sony  | 90%
Nikon                  | Canon-Nikon | 88%
                       | Nikon       | 86%
                       | Nikon-Sony  | 92%
Sony                   | Canon-Sony  | 91%
                       | Nikon-Sony  | 88%
                       | Sony        | 86%
Mean accuracy in tampering detection: 89%
Mean accuracy in authentic image detection: 86.67%
Table II shows the testing results for our image dataset. The first column gives the names of the three cameras whose feature patterns were trained. The second column shows the different groups of images: for example, 'Canon' means the images in this group are authentic images taken by the Canon camera, and 'Canon-Nikon' means the images in this group are spliced from images taken by the Canon and Nikon cameras. From the results we can see that in the detection of authentic images, our method achieves a mean accuracy of 86.67%, and in the detection of tampered images, which were retouched and JPEG compressed with the common quality factor of 75%, our method achieves a mean accuracy of 89%. Since retouching and JPEG compression are common post-processing steps in tampering an image, this shows the practicability of our approach in image tampering detection.
TABLE III. TESTING RESULT OF COLUMBIA IMAGE SPLICING DETECTION EVALUATION DATASET

Image Types | Cameras                 | Image Number | Detected Number | Accuracy
Authentic   | CanonG3                 | 117          | 101             | 86.33%
Authentic   | Nikon D70               | 36           | 25              | 69.45%
Authentic   | Canon EOS 350D Rebel XT | 27           | 16              | 59.27%
Spliced     |                         | 180          | 163             | 90.56%

We do not test the three images taken by the Kodak DCS330 in the Columbia dataset [18] because the number of images available for training is too small to obtain a proper camera feature pattern. In the test of tampered images in the Columbia Image Splicing Detection Evaluation Dataset, our approach achieves an accuracy higher than 90%. For the authentic images, the accuracy is relatively low, but the table shows that this is because the number of images available for training is too small: there are only 36 images taken by the Nikon D70, and even fewer, 27, taken by the Canon EOS 350D Rebel XT. The deficient number of training images is the main reason for the low accuracy. When a larger number of images is available for training, such as the 117 images from the CanonG3 camera, our method achieves an accuracy of 86.33%, almost the same as the result in Table II. So our method maintains a high accuracy when a proper number of training images is available.

From the experimental results, it can be seen that our approach is strongly sensitive to tampered images: it achieves a high accuracy even without a large number of training images, because different camera patterns can hardly match each other. For the test of authentic images, however, a large number of training images is needed, because the one-class classifier must be trained under various conditions, such as different image brightness, textures and qualities, in order to obtain an accurate camera pattern.
VI. CONCLUSIONS AND FUTURE WORK

In this paper, we proposed an approach for image tampering detection. In our method, a set of feature statistics is calculated first. Then the feature pattern of the given camera is trained with a one-class classifier using these statistics. After that, the testing images are segmented with a sliding window into blocks of the same size, and the same feature statistics are calculated for each block. Finally, we use the one-class classifier to determine whether each testing image matches the pattern of the given camera; we consider an image tampered if it does not match the pattern of the camera that took it. The experimental results show the efficiency and practicability of our approach in image tampering detection.

There is still a weakness in our method. Our method works by evaluating whether the percentage of unmatched blocks exceeds the false rejection ratio. If a tampered image is made by directly splicing parts of two images taken by the same camera, without any post-processing, the feature statistics of every part match the camera feature pattern, and our method cannot distinguish the spliced parts. Although few tampered images undergo no post-processing at all, this weakness is still a limitation of our method, and we will address it in future work.
ACKNOWLEDGMENT

The work on this paper was supported by the fund of the 863 Project, under research grant number 2008AA01Z418.
REFERENCES

[1] Tran Van Lanh, Kai-Sen Chong, Sabu Emmanuel, and Mohan S. Kankanhalli, "A Survey on Digital Camera Image Forensic Methods," 2007, IEEE.
[2] J. Lukáš, J. Fridrich, and M. Goljan, "Digital Camera Identification from Sensor Pattern Noise," IEEE Transactions on Information Forensics and Security, 2005.
[3] Y.-F. Hsu and S.-F. Chang, "Detecting Image Splicing Using Geometry Invariants and Camera Characteristics Consistency," in ICME, Toronto, Canada, July 2006.
[4] T.-T. Ng, S.-F. Chang, C.-Y. Lin, and Q. Sun, "Passive-blind Image Forensics," in Multimedia Security Technologies for Digital Rights, W. Zeng, H. Yu, and C.-Y. Lin (eds.), Elsevier, 2006.
[5] W. Chen, Y. Q. Shi, and W. Su, "Image Splicing Detection Using 2-D Phase Congruency and Statistical Moments of Characteristic Function," in E. J. Delp and P. W. Wong (eds.), Security, Steganography and Watermarking of Multimedia Contents IX, Proc. of SPIE, San Jose, CA, USA, 2007.
[6] H. Gou, A. Swaminathan, and M. Wu, "Noise Features for Image Tampering Detection and Steganalysis," Proc. IEEE Int. Conf. on Image Processing (ICIP'07), San Antonio, TX, Sept. 2007.
[7] T.-T. Ng and S.-F. Chang, "A Model for Image Splicing," IEEE International Conference on Image Processing, Singapore, Oct. 24-27, 2004, pp. 1169-1172.
[8] A. C. Popescu and H. Farid, "Exposing Digital Forgeries in Color Filter Array Interpolated Images," IEEE Trans. on Signal Processing, vol. 53, no. 10, part 2, pp. 3948-3959, October 2005.
[9] M. Kharrazi, H. T. Sencar, and N. Memon, "Blind Source Camera Identification," Proc. ICIP'04, Singapore, October 24-27, 2004.
[10] S. Lyu and H. Farid, "How Realistic is Photorealistic?" IEEE Transactions on Signal Processing, February 2005.
[11] I. Avcibas, "Image Quality Statistics and Their Use in Steganalysis and Compression," Doctoral Dissertation, 2001.
[12] H. Farid, "Blind Inverse Gamma Correction," IEEE Transactions on Image Processing, 2001.
[13] H. Farid and A. C. Popescu, "Blind Removal of Lens Distortion," Journal of the Optical Society of America A, 2001.
[14] H. Farid and A. C. Popescu, "Blind Removal of Image Non-linearities," Proc. ICCV 2001, vol. 1, pp. 76-81, 2001.
[15] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, NY, 1995.
[16] B. Schölkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson, "Estimating the Support of a High-Dimensional Distribution," Technical Report MSR-TR-99-87, Microsoft Research.
[17] C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[18] Columbia DVMM Research Lab, Columbia Image Splicing Detection Evaluation Dataset, 2004. http://www.ee.columbia.edu/ln/dvmm/downloads/AuthSplicedDataSet/AuthSplicedDataSet.htm
[19] M. Grossberg and S. Nayar, "What is the Space of Camera Response Functions?" IEEE Computer Vision and Pattern Recognition, 2003, pp. 602-609.
[20] T.-T. Ng and S.-F. Chang, "Camera Response Function Estimation from a Single-channel Image Using Difference Invariants," Technical Report, March 2006. http://www.ee.columbia.edu/dvmm/publications/06/ng-tech-report-crf-estimation-06.pdf