IEEE SIGNAL PROCESSING LETTERS, VOL. 20, NO. 5, MAY 2013


Detection of Double Compression in MPEG-4 Videos Based on Markov Statistics

Xinghao Jiang, Wan Wang, Tanfeng Sun, Yun Q. Shi, Fellow, IEEE, and Shilin Wang

Abstract—With the spread of powerful and easy-to-use video editing software, digital videos are exposed to various forms of tampering. Nowadays, a considerable proportion of surveillance systems and video cameras have a built-in MPEG-4 codec. Therefore, detecting double compression in MPEG-4 videos is a significant first step in video forensics research. In this paper, Markov-based features are adopted to detect double compression artifacts, which imply that the original video may have been interpolated. The advantages and limitations of double MPEG-4 compression detection are analyzed. Experimental results demonstrate that our scheme outperforms most existing methods.

Index Terms—Digital forensics, double compression, Markov statistics, MPEG-4.

I. INTRODUCTION

NOWADAYS, the usage of surveillance systems and video cameras is growing rapidly. To authenticate digital contents, a variety of active techniques have been proposed, such as watermarking [1] and digital signatures [2]. Recently, many researchers have begun to focus on passive techniques, e.g., [3]. The passive approaches are more practical since only the intrinsic characteristics of the media, rather than pre-embedded data, are needed. In a video tampering process, the software should first decode the compressed video and then work in the uncompressed domain. The tampered video should be re-encoded and saved in compressed format after interpolation. Therefore, double compression artifacts, as intrinsic characteristics, may reveal the occurrence of tampering.

There are many encouraging results in the field of double JPEG compression detection, e.g., [4], [5]. However, less work has been done to detect video double compression. In [6], double compression detection is accomplished by examining the periodic artifacts in Discrete Cosine Transform (DCT) histograms of frames. In [7], the disturbance in the probability distribution of the first digits of non-zero quantized AC coefficients is used as evidence of double compression. In [8], only first digit distributions from the intra-coded frames are selected to further enhance the detection accuracy. All of the above works focus on the MPEG-1/2 coding standards. In [9], double H.264 compression is detected based on the probability distribution of quantized non-zero AC coefficients.

Since a considerable proportion of today’s surveillance systems and video cameras use a built-in MPEG-4 codec, there is an urgent need to detect double MPEG-4 compression. It has been shown in [5] that the Markov transition probability matrix (TPM) can accurately identify double-compressed JPEG images. Since the intra coding principle in MPEG-4 uses a JPEG-like compression scheme, we are inspired to investigate the effectiveness of Markov statistics in MPEG-4 videos.

The contributions of this paper are three-fold. First, the detection of double MPEG-4 compression is accomplished. To the best of our knowledge, this is the first effective approach to detect double MPEG-4 compression. Second, by analyzing the quantization and de-quantization methodologies in MPEG-4, which differ from those of JPEG, the limitations of current detection schemes are analyzed. Third, we give a comprehensive comparison of the detection performance of Markov statistics with that of other features used in video, including the first digit distribution [7], [8] and the DCT histogram [9].

Manuscript received December 24, 2012; revised February 25, 2013; accepted February 27, 2013. Date of publication March 07, 2013; date of current version March 13, 2012. This work was supported by the National Science Foundation of China under Grants 61071153, 61272439, and 61272249, the RFDP under Grant 20120073110053, and by the Projects of International Cooperation and Exchanges, Shanghai, under Grant 12510708500. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Constantine L. Kotropoulos.
X.-H. Jiang, W. Wang, T.-F. Sun, and S.-L. Wang are with School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China (e-mail: [email protected], [email protected], [email protected], [email protected]).
Y.-Q. Shi is with the Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102 USA (e-mail: shi@njit.edu).
Digital Object Identifier 10.1109/LSP.2013.2251632

II. MPEG-4 QUANTIZATION AND DE-QUANTIZATION

Different from MPEG-1/2, the heart of MPEG-4 is the object-based representation model. An object-based scene is built using individual objects that have relationships in space and time, e.g., the foreground objects or the background of the scene. Each object is represented by blocks, which are encoded by using DCT, quantization, and entropy coding. Although the MPEG-4 standard stipulates some deterministic methods, the codecs are free to design detailed implementations.
Xvid [10] will be taken as an example to illustrate the quantization and de-quantization processes of intra mode. The quantization process basically involves division of the DCT coefficient by a corresponding quantization matrix value based on spatial frequency. Please note that each value in the matrix is pre-scaled by multiplying it by the quantizer scale (Q_Scale). Specifically, (1)–(3) relate each block-DCT coefficient to its quantized AC coefficient: the quantized level is given by (1), with the auxiliary quantities defined in (2) and (3). In these expressions, the shift operators denote left-shift and right-shift by a given number of bits on binary numbers, respectively; the intra quantization matrix supplies the divisor; and one parameter is equal to 17 by default. For the de-quantization process, the reconstructed AC coefficients are generated according to (4).

It is noted that there is only one intra quantization matrix by default, and Q_Scale is the determining factor for the quality of the output video. In a variable bitrate (VBR) environment, Q_Scale, which ranges from 1 to 31, remains unchanged during encoding. In the JPEG coding scheme, however, different quality factors (QFs) map to different quantization matrices, which control the image quality directly. This difference between JPEG and MPEG-4 explains the limitations of our algorithm in Section III.

III. DOUBLE COMPRESSION DETECTION

A. Feature Generation

Markov statistics have been shown to distinguish single from double compression in JPEG images [5]. Since blocks in MPEG-4 are encoded using a JPEG-like scheme, it is reasonable to deduce that Markov statistics could be effective in double MPEG-4 compression detection. Fig. 1 shows the procedure to extract the Markov feature in MPEG-4 videos, which is illustrated as follows:
— First, extract the quantized DCT coefficients during the decoding process. We only consider the magnitudes of coefficients in the Y component here.
— Second, compute the difference array along the horizontal, vertical, major diagonal, and minor diagonal directions. All direction-specific quantities will be denoted by a superscript (h, v, d, or m). For the horizontal direction, with $X(u,v)$ denoting the coefficient magnitude at horizontal index $u$ and vertical index $v$,

$$D^{h}(u,v) = X(u,v) - X(u+1,v) \tag{5}$$

The differential operation could expose the double compression artifacts.
— Third, truncate the difference array by a thresholding operation. If a value is larger than $T$ or smaller than $-T$, it will be represented by $T$ or $-T$, respectively. $T$ is set to be 4 since 93% of elements in difference arrays fall in the interval $[-4, 4]$ according to our statistical analysis on MPEG-4 videos.
— Fourth, a first-order Markov random process is modeled on each difference array along the same direction. For the horizontal direction, with $D^{h}(u,v)$ denoting the truncated horizontal difference array, the TPM $P^{h}$ can be calculated as

$$P^{h}(i,j) = \frac{\sum_{u,v} \delta\big(D^{h}(u,v)=i,\; D^{h}(u+1,v)=j\big)}{\sum_{u,v} \delta\big(D^{h}(u,v)=i\big)} \tag{6}$$

where $i, j \in \{-4, \dots, 4\}$ and $\delta(\cdot)$ equals 1 when its conditions hold and 0 otherwise. In this way, we have obtained a 9 × 9 TPM on each direction.
— Finally, we employ the approach in [11] to reduce the feature dimensionality. Specifically, we separately average the horizontal and vertical matrices and then the two diagonal matrices to form the feature sets $F_1$ and $F_2$ as

$$F_1 = \tfrac{1}{2}\big(P^{h} + P^{v}\big), \qquad F_2 = \tfrac{1}{2}\big(P^{d} + P^{m}\big) \tag{7}$$

The final feature is the concatenation of $F_1$ and $F_2$, which is 162-D.

Fig. 1. Markov feature extraction procedure.

The rounding errors caused by the double quantization process will leave statistical artifacts among elements of the difference array. According to the theory of random processes, the one-step Markov transition probability matrices can characterize those difference arrays. Hence, double MPEG-4 compression can be detected by using a machine learning architecture.

B. Limitations in MPEG-4 Videos

The limitations are mainly due to the difference between the quantization methodologies in JPEG and MPEG-4, which has been discussed in Section II. Take a generic discrete 1-D signal $x$ for a theoretical analysis, and simplify the quantization process as rounding $x/q$ to the nearest integer, where $q$ is the quantization step. Consider that this signal is consecutively quantized by steps $q_1$ and $q_2$. If $q_2/q_1$ is an integer, the result will be the same as if $x$ is singly quantized with step $q_2$ [12]. In JPEG, different QFs map to different quantization matrices. Please note that not all the elements from one matrix are divisors of the elements from another matrix in the corresponding modes. That is, the so-called distinguishable mode [13] exists, where quantization step $q_2$ is not divisible by $q_1$. However, in MPEG-4, the quantization matrix is fixed, and all the elements are pre-scaled by multiplying them by Q_Scale. If the first Q_Scale $Q_1$ is a divisor of the second Q_Scale $Q_2$, all modes are undistinguishable.

The actual quantization process in MPEG-4, which has been illustrated in Section II, is more complicated. It can be shown that the parity of $Q_2/Q_1$ has a large impact on the detection performance. Consider videos which are single-compressed by $Q_2$, or double-compressed by $Q_1$ and $Q_2$ successively. If $Q_2$ is an odd multiple of $Q_1$, more than 99% of quantized DCT coefficients in singly and doubly compressed videos are exactly the same, leaving almost no trace for detection. But if $Q_2/Q_1$ is even, a considerable number of quantized coefficients are different due to the rounding process. In this case, the Markov feature has enough discriminative power to identify double compression. The detailed formula derivation to explain this phenomenon is given in the Appendix.
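The feature generation steps of Section III-A can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the quantized DCT coefficients of a frame are already available as a 2-D list of integers, and the names `difference_array`, `transition_matrix`, and `markov_feature` are hypothetical. The threshold of 4, the four directions, the 9 × 9 matrices, and the 162-D output follow the text.

```python
T = 4  # truncation threshold stated in the text

# Offset (row, col) from an element to its neighbour in each direction:
# horizontal, vertical, major diagonal, minor diagonal.
DIRECTIONS = {"h": (0, 1), "v": (1, 0), "d": (1, 1), "m": (1, -1)}


def difference_array(mag, step):
    """Map (row, col) to the truncated difference with the neighbour at `step`."""
    dr, dc = step
    rows, cols = len(mag), len(mag[0])
    diff = {}
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                d = mag[r][c] - mag[r2][c2]
                diff[(r, c)] = max(-T, min(T, d))  # clip to [-T, T]
    return diff


def transition_matrix(diff, step):
    """One-step transition probabilities between neighbouring differences."""
    dr, dc = step
    size = 2 * T + 1
    counts = [[0] * size for _ in range(size)]
    for (r, c), cur in diff.items():
        nxt = diff.get((r + dr, c + dc))
        if nxt is not None:
            counts[cur + T][nxt + T] += 1
    tpm = []
    for row in counts:
        total = sum(row)
        # Rows for states that never occur are left as zeros.
        tpm.append([n / total if total else 0.0 for n in row])
    return tpm


def markov_feature(coeffs):
    """Return the 162-D feature for a 2-D list of quantized coefficients."""
    mag = [[abs(v) for v in row] for row in coeffs]
    tpm = {}
    for name, step in DIRECTIONS.items():
        tpm[name] = transition_matrix(difference_array(mag, step), step)

    def average(a, b):
        return [(x + y) / 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]

    return average(tpm["h"], tpm["v"]) + average(tpm["d"], tpm["m"])
```

Calling `markov_feature` on a frame-sized coefficient array yields a flat list of 162 values, which could then be passed to a classifier such as the FLD used in Section IV. How block coefficients are arranged into the 2-D array is left open here and would need to match the decoder being used.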


TABLE I DETECTION ACCURACIES ACHIEVED BY USING MARKOV FEATURES AND FIRST DIGIT DISTRIBUTION

IV. EXPERIMENTS

A. Video Preparation

Thirty widely known YUV sequences in Common Intermediate Format (CIF) [14] are selected as source sequences in this work. The resolution is defined by CIF as 352 × 288. The Xvid codec [10] is used to encode and decode videos. The group of pictures (GOP) structure is set to be IPPPPP. All of the YUV sequences are first encoded to MPEG-4 VBR videos with the first Q_Scale $Q_1$ ranging from 2 to 10. To simulate double compression, each video is decoded and re-encoded using a different Q_Scale $Q_2$. In order to increase sample quantity, we split each video into clips, and each video clip contains 30 frames, i.e., 5 GOPs, generating 5,040 clips in total.

B. Experimental Setup

Unlike for images, the decision process of the classifier is defined as follows. Each GOP is treated as a detection unit, which will be classified as singly or doubly compressed by Fisher’s linear discriminant (FLD) analysis [15]. Then the final decision is based on a voting mechanism. If the percentage of double-compressed GOPs in a video exceeds a threshold, this video will be labeled as “double compression”. Please note that the threshold is adaptive according to the demand of TPR (true positive rate) and TNR (true negative rate). A fixed value of this threshold is used in our scheme.

We equally split the original YUV sequences into a training part and a testing part. The video clips encoded from the sequences in the training part constitute the training set, and those generated from the sequences in the testing part are the testing set. In this way, each YUV sequence only contributes to one set. We define the single-compressed clips with $Q_2$ as the negative samples, and the double-compressed clips with $Q_1$ followed by $Q_2$ as the positive samples. The classification accuracy, which is the mean value of TPR and TNR, is averaged over 20 experiments by randomly selecting the training and testing parts in the YUV sequences. Please note that, based on the previous literature, it is not considered as double compression when $Q_1 = Q_2$.

C. Detection Result

Table I gives the classification accuracies achieved by using Markov features. Conclusions can be drawn as follows:

Fig. 2. Accuracies achieved by different sets of features when the fixed Q_Scale equals 8.

— First, the Markov feature performs well in double MPEG-4 compression detection. Almost all entries in the lower-left corner are 100%. Even though it has been shown that the detection becomes harder when the second Q_Scale $Q_2$ is larger than the first Q_Scale $Q_1$ [7], [8], most of the entries in the upper-right corner still reach 95%.
— Second, as analyzed in Section III, when $Q_2$ is an odd multiple of $Q_1$, nearly all quantized DCT coefficients in singly or doubly compressed videos are equal. Hence the detection performance at these three entries (shown in framed space) degrades to random guessing.
— Third, when $Q_2$ is an even multiple of $Q_1$, though there are only subtle differences between quantized coefficients, Markov statistics can enhance the double compression artifacts by the differential operation and gain a strong discriminative power. Corresponding entries (shown with underline) are above 90%.

D. Comparison With Other Features

We employed the first digit distribution feature used in [7], [8] and the DCT histogram feature in [9] for comparison. Here, for a fair comparison, we extract the first digit distribution only from I frames, following the method in [8]. Specifically, the first digit distribution feature is a 12-D feature combining the probability distribution and three goodness-of-fit statistics (SSE, RMSE, and R-square). The DCT histogram feature involves the distribution of all quantized nonzero AC coefficients ranging from −10 to 10. Fig. 2 shows the accuracies for one fixed Q_Scale setting. The overall results achieved by using first digit distribution features are given in Table I. It is observed that when $Q_1$ is a divisor of $Q_2$, no matter whether $Q_2/Q_1$ is even or odd, the detection using either the first digit distribution or the DCT histogram fails (marked in the table). This is because of the weak discriminative power of these first-order statistics when there are only subtle differences between quantized coefficients. In this case, the Markov feature, which is a second-order statistic, shows its advantage.
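The GOP-level voting rule and the accuracy measure of Section IV-B can be sketched as follows. This is only an illustration: the function names are hypothetical, the per-GOP decisions are assumed to come from an already trained classifier, and the threshold is left as a parameter because its value is not recoverable from the text.

```python
def label_video(gop_flags, threshold):
    """Label a video as double-compressed by voting over its GOPs.

    gop_flags: per-GOP decisions, True meaning "classified as double-compressed".
    threshold: fraction of flagged GOPs that must be exceeded (assumed parameter).
    """
    if not gop_flags:
        raise ValueError("a video needs at least one GOP")
    fraction = sum(gop_flags) / len(gop_flags)
    return fraction > threshold


def mean_tpr_tnr(predicted, actual):
    """Accuracy as defined in the text: the mean of TPR and TNR."""
    positives = sum(1 for a in actual if a)
    negatives = len(actual) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    true_neg = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    return (true_pos / positives + true_neg / negatives) / 2
```

For a 30-frame clip with five GOPs, `label_video([True, True, True, False, False], 0.5)` labels the clip as double-compressed because three of five GOPs are flagged. Raising the threshold trades TPR for TNR, which is the adaptivity the text refers to.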

V. CONCLUSION

We have proposed an effective method to detect double compression in MPEG-4 videos. Double quantization with different parameters will inevitably introduce rounding errors, leaving detectable artifacts. A Markov random process can capture these artifacts and detect double-compressed videos. Besides, we have explored the limitations of double MPEG-4 compression detection algorithms by delving into the quantization methodology. The prior art, including the first digit distribution and the DCT histogram, will fail when the first Q_Scale $Q_1$ is a divisor of the second Q_Scale $Q_2$. However, the detection accuracies achieved by using Markov features can reach 90% when $Q_2$ is an even multiple of $Q_1$, because the Markov feature is a second-order statistic and can expose the subtle artifacts.

APPENDIX

This section explains why the parity of $Q_2/Q_1$ has a large impact on the detection performance (described in Section III). The quantization process defined in (1)–(3) can be derived as (8), written with four shorthand symbols for the quantities in (1)–(3), the last of which denotes Q_Scale. Since the middle term of (8) is much smaller than the first and third terms, we omit it and rewrite (8) as (9).

For a single-compressed video with $Q_2$, the quantized coefficient would be (10). For a double-compressed video, the coefficient quantized with $Q_1$ needs to be de-quantized according to (4), giving (11). Then, the reconstructed coefficient should be re-quantized by $Q_2$. The final quantized coefficient would be (12). Only the subtractors are different between (10) and (12); their relation is noted in (13).

If $Q_2/Q_1$ is odd, the doubly quantized coefficient will always be equal to the singly quantized one. Consider $Q_2 = (2k+1)Q_1$ for some integer $k$: the decimal part nearest to 1/2 lies strictly on one side of 1/2, and neither case will cross 1/2 and then change the rounded value, since the two error terms are bounded. If $Q_2/Q_1$ is even, however, the disturbance is evident: the decimal part can be exactly 1/2, and different signs of the two error terms will render the singly and doubly quantized coefficients unequal.
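The parity effect discussed in the Appendix can be checked numerically on a simplified model. The sketch below is not the Xvid quantizer of (1)–(4): it assumes a plain round-to-nearest quantizer (halves rounded up) on non-negative integers, and the names `quantize` and `count_mismatches` are hypothetical. It compares single quantization with step q2 against quantization with q1, reconstruction, and re-quantization with q2, where q2 is an integer multiple of q1.

```python
def quantize(x, q):
    """Round x / q to the nearest integer (halves rounded up), for x >= 0."""
    return (2 * x + q) // (2 * q)


def count_mismatches(q1, multiple, values):
    """Count inputs whose single and double quantization results differ.

    Single: quantize with q2 = q1 * multiple.
    Double: quantize with q1, reconstruct, then quantize with q2.
    """
    q2 = q1 * multiple
    mismatches = 0
    for x in values:
        single = quantize(x, q2)
        reconstructed = quantize(x, q1) * q1
        double = quantize(reconstructed, q2)
        if single != double:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    for multiple in (2, 3, 4, 5):
        print(multiple, count_mismatches(3, multiple, range(1000)))
```

In this simplified model, odd multiples produce no mismatches at all, while even multiples leave a nonzero number of differing coefficients, because the intermediate value divided by the multiple can land exactly on a rounding boundary. This mirrors the behaviour described for the framed and underlined entries of Table I, although the exact percentages for a real codec depend on its rounding rules.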

REFERENCES

[1] X.-P. Zhang and S.-Z. Wang, “Statistical fragile watermarking capable of locating individual tampered pixels,” IEEE Signal Process. Lett., vol. 14, no. 10, pp. 727–730, Oct. 2007.
[2] H.-J. Yang and A. Kot, “Binary image authentication with tampering localization by embedding cryptographic signature and block identifier,” IEEE Signal Process. Lett., vol. 13, no. 12, pp. 741–744, Dec. 2006.
[3] H. Yao, S.-Z. Wang, Y. Zhao, and X.-P. Zhang, “Detecting image forgery using perspective constraints,” IEEE Signal Process. Lett., vol. 19, no. 3, pp. 123–126, 2012.
[4] T. Pevny and J. Fridrich, “Detection of double-compression in JPEG images for applications in steganography,” IEEE Trans. Inf. Forensics Secur., vol. 3, no. 2, pp. 247–258, Jun. 2008.
[5] C.-H. Chen, Y.-Q. Shi, and W. Su, “A machine learning based scheme for double JPEG compression detection,” in Proc. Int. Conf. Pattern Recognit. (ICPR), Dec. 2008, pp. 1814–1817.
[6] W.-H. Wang and H. Farid, “Exposing digital forgeries in video by detecting MPEG compression,” in Proc. Multimedia Secur. Workshop (MM and Sec), Sep. 2006, pp. 37–47.
[7] W. Chen and Y.-Q. Shi, “Detection of double MPEG compression based on first digit statistics,” Lect. Notes Comput. Sci. (IWDW 2008), vol. 5450, pp. 16–30, 2009.
[8] T.-F. Sun, W. Wang, and X.-H. Jiang, “Exposing video forgeries by detecting MPEG double compression,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), Mar. 2012, pp. 1389–1392.
[9] D.-D. Liao, R. Yang, H.-M. Liu, J. Li, and J.-W. Huang, “Double H.264/AVC compression detection using quantized nonzero AC coefficients,” in Proc. SPIE Int. Soc. Opt. Eng. (Media Watermarking, Security, and Forensics), 2011, vol. 7880.
[10] Xvid Codec, [Online]. Available: http://www.xvid.org/
[11] T. Pevny, P. Bas, and J. Fridrich, “Steganalysis by subtractive pixel adjacency matrix,” IEEE Trans. Inf. Forensics Secur., vol. 5, no. 2, pp. 215–224, Jun. 2010.
[12] A.-C. Popescu, “Statistical Tools for Digital Image Forensics,” Ph.D. dissertation, Dartmouth College, Hanover, NH, USA, 2004.
[13] B. Li, Y.-Q. Shi, and J.-W. Huang, “Detecting doubly compressed JPEG images by using mode based first digit features,” in Proc. IEEE Workshop Multimedia Signal Process. (MMSP), Oct. 2008, pp. 730–735.
[14] [Online]. Available: http://media.xiph.org/video/derf/
[15] R.-A. Fisher, “The use of multiple measurements in taxonomic problems,” Ann. Eugenics, vol. 7, no. 2, pp. 179–188, 1936.