
TRANSPARENT AND PERCEPTUALLY ENHANCED JPEG IMAGE ENCRYPTION

1,2Bian Yang, 2Chong-Qing Zhou, 1Christoph Busch, 2Xia-Mu Niu

1 Norwegian Information Security Laboratory at Gjøvik University College, Norway
2 Department of Computer Science and Technology, Harbin Institute of Technology, China

ABSTRACT

In many applications, encryption and decryption of compressed images or videos should be transparent to the compression decoder so as to maintain the file format, the file size and content-relevant functionalities. In this paper we propose a transparent encryption mechanism for JPEG-encoded image files that meets the requirements of format compliance and file-size preservation. The algorithm is based on a special cross-block Varied-Length Codes (VLC) shuffling method, perceptually enhanced by reversible histogram spreading, which smoothly redistributes VLCs among spatial blocks in a reversible way. The algorithm performs visually better than existing shuffling-based schemes against content leakage attacks such as DC value removal.

Index Terms - transparent encryption, perceptual enhancement, JPEG encryption, reversible histogram spreading, VLC shuffling

1. INTRODUCTION

Although existing standard data encryption schemes (DES, AES, RC4, etc.) [1] and authentication schemes (MAC, digital signature) [1] can be used directly to secure multimedia data by regarding image and video files as binary data, they destroy the file syntax of the original image or video and many functionalities possessed by the original media files, such as file indexing and retrievability, classifiability, and transmission rate adaptability. To preserve these functionalities, many "transparent" encryption schemes [2-9] have been proposed for compressed media data to achieve format compliance with a general media compression decoder. For media data authentication, watermarking-based schemes [10-12] have been proposed for images and videos to avoid the file management problems caused by separating the authentication information from the media data. However, for format-compliant encryption schemes, the requirements of file-size preservation, good perceptual encryption effect and security of encryption sometimes contradict each other, and an appropriately designed mechanism is required for encryption and for secret key generation and management.

978-1-4244-3298-1/09/$25.00 ©2009 IEEE

In this paper we discuss the transparency requirements, analyze the relationship among these requirements and give a brief introduction to existing solutions for media data encryption in Section 2. To achieve transparency, good perceptual encryption effect and high security at the same time, a reversible histogram spreading based Varied-Length Codes (VLC) shuffling scheme is proposed for VLC-coded JPEG images in Section 3. Experimental results in Section 4 demonstrate the effectiveness of the proposed scheme. Section 5 concludes this paper.

2. TRANSPARENCY REQUIREMENTS AND SECURITY CONCERNS

For a JPEG image encryption scheme, format compliance is a basic requirement that enables smooth decoding and display. File-size preservation after encryption is desired in many cases for storage and file management reasons. Perceptual scalability is usually required for visual quality control applications such as Pay-TV. Besides these transparency requirements, security against known/chosen plain-text attacks and content leakage attacks is usually required as well.

2.1 Transparency requirements and existing solutions

(1) Format compliance ensures preservation of the original file syntax, so that a general decoder handles an encrypted file no differently from an unencrypted one. Many schemes [2-11] have been proposed to achieve this basic transparency requirement for images and videos.

(2) File-size preservation benefits cases where storage or transmission bandwidth is limited (transparency to allotted resources). The schemes proposed in [3,9] encrypt FLCs (Fixed-Length Codewords) for lightweight perceptual encryption applications (not secure enough in the sense of content leakage), and [4,8] achieve size preservation by VLC shuffling and spatial block shuffling.

2.2 Security concerns with existing encryption schemes

Most existing encryption schemes have security concerns in different aspects.

(1) Known/chosen plain-text attacks

DSP 2009

For FLC encryption, block ciphers provide security against known/chosen plain-text attacks. With FLC encryption alone, however, it is hard to achieve a deep encryption effect, and FLC encryption is therefore usually designed for perceptual encryption [3,9]. To achieve both a deep encryption effect and file-size preservation, format-compliant encryption schemes usually employ a shuffling operation on encoded units such as VLCs [4] or data blocks [8]. However, shuffling-only image ciphers are insecure [13] against known/chosen plain-text attacks, and therefore stream ciphers [4,8,9] are usually used to generate an ever-changing key stream to resist such attacks. This continuous key stream generation mode is efficient for media data transmission applications. However, in off-line encryption/decryption cases, such as decryption of an arbitrary encrypted image taken from a local database or retrieved via the Internet, the decryptor would need to know exactly which segment of the used key stream applies, which makes decryption practically infeasible. To address this, local information [4,9] can be incorporated into the generation of the initialization vector for a stream cipher. But this rests on two assumptions: a. the local information is left unencrypted and is thus available for decryption; and b. the local information is unique enough to distinguish one file from another. Assumption a. sometimes impairs the encryption effect, e.g., when the DCT signs serve as local information [4], although the signs themselves might need to be encrypted for a better encryption effect. The uniqueness of local information in assumption b. cannot be perfectly guaranteed either, especially when facing a large volume of media files; in some cases, e.g. JPEG images, a unique ID [9] such as a vendor ID or a time stamp might not be available, or might even be unintentionally identical for different files.
It is easily seen that both assumptions could be impractical in complicated application scenarios, and it is therefore hard to guarantee the security of the shuffling operation against known/chosen plain-text attacks in some applications.

(2) Content leakage attack

The content leakage attack (the error-concealment-based attack in [4]) can be attributed in principle to the format-compliance requirement: maintaining a media file's format structure gives attackers a structure they can exploit to "conceal" the encrypted data. A common content leakage attack on JPEG is to set the DC components to an arbitrary constant value and thus reveal the content information represented by the AC components, such as edges and contours. To preserve the file length, a stream cipher can be used to encrypt the signs of the AC coefficients' VLCs [4,9] or the amplitude indices of non-zero AC coefficients, but these operations work only for perceptual encryption applications [9] and clearly do not provide enough encryption effect for applications requiring high security against content information leakage.
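The DC-removal attack described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each block is available as a list of 64 quantized coefficients with the DC term at index 0; the function name `remove_dc` and the toy values are hypothetical and not taken from the paper.

```python
from typing import List

# Assumption: a block is 64 quantized DCT coefficients, index 0 being the DC term.
Block = List[int]


def remove_dc(blocks: List[Block], dc_constant: int = 0) -> List[Block]:
    """Return copies of the blocks with every DC coefficient set to one constant.

    The AC coefficients (indices 1..63) are left untouched, so whatever edge
    and contour information they carry remains visible after decoding.
    """
    return [[dc_constant] + list(block[1:]) for block in blocks]


# Toy data: two blocks whose DC terms are unknown to the attacker (e.g. encrypted).
blocks = [[50, 3, -1] + [0] * 61, [-20, 0, 7] + [0] * 61]
attacked = remove_dc(blocks)
```

The attacker needs no key for this step, which is why encrypting the DC components alone does not prevent content leakage.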

Fig. 1. Content leakage example: (a) original JPEG-encoded fingerprint image; (b) intra-AC-block VLC shuffling result with the DC value set to 128; (c) intra-AC-block VLC shuffling and 8x8 image block shuffling result with the DC value set to 128.

Shuffling [4,8] is an effective way to address this problem by scrambling the content information either in a block-wise mode [8] or in a VLC-wise mode [4]. Formally, there are three types of VLC shuffling: a. intra-block shuffling, which shuffles all VLC codewords contained in the same AC coefficient block; b. intra-Run-Length-set shuffling, which shuffles all VLC codewords with the same "Run-Length" (number of AC coefficients encoded by one VLC codeword) across all blocks; c. cross-Run-Length-set shuffling, which shuffles all VLC codewords across all blocks and all Run-Length sets. However, problems regarding content information leakage remain in these shuffling schemes. Under intra-block shuffling and intra-Run-Length-set shuffling, the AC-blocks' VLC-quantity histogram, which counts for each VLC codeword quantity the JPEG AC coefficient blocks containing that many codewords, is not changed, and these schemes are thus insecure in terms of content leakage. A special case sensitive to content leakage is the quantity of blocks without any VLC codeword, which represent complete smoothness in the spatial domain. This quantity, together with the whole AC-block VLC-quantity histogram, could be used directly for content estimation of the plain-text image, because neither intra-block shuffling nor intra-Run-Length-set shuffling modifies the VLC codeword quantity inside a block. Although the total quantity of VLC codewords in an image is hard to modify if the file size must be preserved, some applications still need to hide the original spatial luminance distribution indicated by the percentage of smooth area.
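The invariance of the VLC-quantity histogram under intra-block shuffling can be checked with a small sketch. Blocks are modeled here, as an assumption for illustration only, as lists of opaque codeword labels; `random.Random` stands in for a keyed shuffling function and is not the paper's key generator.

```python
import random
from collections import Counter
from typing import List

# Assumption: a block is modeled as a list of opaque VLC codeword labels.
Block = List[str]


def vlc_quantity_histogram(blocks: List[Block]) -> Counter:
    """Map each codeword quantity to the number of blocks holding that many codewords."""
    return Counter(len(block) for block in blocks)


def intra_block_shuffle(blocks: List[Block], rng: random.Random) -> List[Block]:
    """Shuffle the codewords inside each block; no codeword leaves its block."""
    shuffled = []
    for block in blocks:
        copy = list(block)
        rng.shuffle(copy)
        shuffled.append(copy)
    return shuffled


# Toy image: three smooth blocks (no codewords) and three textured blocks.
blocks = [[], [], ["a", "b", "c"], ["d"], [], ["e", "f"]]
rng = random.Random(2009)  # stand-in for a keyed PRNG
encrypted = intra_block_shuffle(blocks, rng)
before = vlc_quantity_histogram(blocks)
after = vlc_quantity_histogram(encrypted)
```

Here `before` and `after` are identical, and the count of empty blocks is directly readable from the encrypted data, which is the leakage discussed above.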
Fig. 1 gives a JPEG-encoded fingerprint image as a special example. It shows how DC value removal (DC set to gray level 128) leaves the content perceptible in the intra-AC-block VLC shuffling result (Fig. 1(b)), and also leaves the combined result of intra-AC-block VLC shuffling and 8x8 image block shuffling with plenty of blocks without VLC codewords (the large blank areas in Fig. 1(c)), which indicates many smooth blocks in the original image. For cross-Run-Length-set shuffling, it is easy to imagine that the visual encryption effect will be better because both VLC


Reversible Histogram Spreading (RHS) can be described as follows.

Block Shuffling:
Step 1: Shuffle all JPEG encoded blocks $B_i$ ($i = 1, 2, \ldots, N \times M$, assuming the original JPEG image consists of $N \times M$ blocks) to obtain a new $N \times M$ block matrix with blocks $SB_i$, where $SB_i = SF(B_i)$ and $SF$ is the shuffling function.

VLC Fixed Permutation (VLCFP):
Step 2: From left to right and top to bottom, group each two neighboring blocks $SB_{2l-1}$ and $SB_{2l}$ ($l = 1, 2, \ldots, \lfloor N \times M / 2 \rfloor$) as a pair $P_l$, with $SB_{2l-1}$ consisting of $j$ VLC codewords $\{V_{11}, V_{12}, \ldots, V_{1j}\}$ and $SB_{2l}$ consisting of $k$ VLC codewords $\{V_{21}, V_{22}, \ldots, V_{2k}\}$.
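Steps 1 and 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the shuffling function $SF$ is modeled as a permutation derived from an integer key via Python's `random.Random`, which is a stand-in and not the key-stream generator used in the paper; all function names are hypothetical.

```python
import random
from typing import List, Tuple, TypeVar

T = TypeVar("T")


def keyed_permutation(n: int, key: int) -> List[int]:
    """Derive a permutation of range(n) from an integer key (stand-in for SF's key)."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order


def shuffle_blocks(blocks: List[T], key: int) -> List[T]:
    """Step 1: position p of the result holds the original block order[p]."""
    order = keyed_permutation(len(blocks), key)
    return [blocks[src] for src in order]


def unshuffle_blocks(shuffled: List[T], key: int) -> List[T]:
    """Inverse of shuffle_blocks under the same key, showing Step 1 is reversible."""
    order = keyed_permutation(len(shuffled), key)
    original: list = [None] * len(shuffled)
    for pos, src in enumerate(order):
        original[src] = shuffled[pos]
    return original


def pair_blocks(shuffled: List[T]) -> List[Tuple[T, T]]:
    """Step 2: group neighbors in scan order; pair l is (SB_{2l-1}, SB_{2l}) in 1-based terms."""
    return [(shuffled[2 * l], shuffled[2 * l + 1]) for l in range(len(shuffled) // 2)]


blocks = ["B1", "B2", "B3", "B4", "B5", "B6", "B7"]
sb = shuffle_blocks(blocks, key=42)
pairs = pair_blocks(sb)
restored = unshuffle_blocks(sb, key=42)
```

With an odd number of blocks, the last block stays unpaired, which matches the floor in the upper bound on $l$.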

Fig. 2. Proposed reversible histogram spreading based JPEG image encryption algorithm (panels (a) and (b)).

content and VLC quantity in a block are modified in this case. However, modifying the VLC quantity can make the number of AC coefficients inside one block overflow (>63), because one VLC covers a varying number (from 1 to 16) of AC coefficients [14]. The authors of [4] suggested using cross-Run-Length-set shuffling for a better visual encryption effect but did not address this practically critical problem.

3. PROPOSED REVERSIBLE HISTOGRAM SPREADING BASED ENCRYPTION ALGORITHM

To realize cross-Run-Length-set shuffling and increase security against content leakage, we propose a Reversible Histogram Spreading (RHS) algorithm that modifies the original AC-blocks' VLC-quantity histogram into a smooth and flattened shape, which is perceptually enhanced and less discriminative in terms of image content. Note that ordinary histogram equalization could also produce such a shape, but it is irreversible and therefore useless for encryption. RHS can be realized by inserting a VLC fixed permutation (VLCFP) into JPEG image block shuffling iterations. The whole perceptually enhanced JPEG image encryption/decryption algorithm is illustrated by the diagrams in Fig. 2, where $\{B_i\}$, $\{EB_i\}$, $\{WEB_i\}$ and $\{DB_i\}$ denote the original, the RHS-encrypted, the watermarked and the decrypted image blocks respectively. The initialization vector for the Pseudo-Random Number Generator (PRNG) can be losslessly watermarked into the encrypted VLC data using the algorithm in [15]. A user key is used together with the initialization vector to calculate a key stream. For each image, a unique initialization vector is generated to produce the key stream for the RHS iterations. The uniqueness of the initialization vector increases the algorithm's security against known/chosen plain-text attacks. In Fig. 2, RHS comprises Steps 1 to 4: Steps 1 and 2 (block shuffling and block pairing) are given above, and Steps 3 and 4 follow.

Step 3: Let $\mathrm{CNum}(SB_i)$ be the quantity of AC coefficients decoded from the VLC codewords inside block $SB_i$. For $l$ from 1 to $\lfloor N \times M / 2 \rfloor$, scan each block pair $P_l$ and do the following. If $j > 0$: (1) move the first codeword $V_{11}$ of $SB_{2l-1}$ to $SB_{2l}$, so that the two blocks become $\{V_{12}, \ldots, V_{1j}\}$ and $\{V_{21}, V_{22}, \ldots, V_{2k}, V_{11}\}$; and (2) swap the contents of $SB_{2l-1}$ and $SB_{2l}$ to obtain two new blocks $SB'_{2l-1} = \{V_{21}, V_{22}, \ldots, V_{2k}, V_{11}\}$ and $SB'_{2l} = \{V_{12}, \ldots, V_{1j}\}$. Then check whether $\mathrm{CNum}(SB'_{2l-1}) > 63$. If so, cancel operations (1) and (2) and restore the original contents of the two blocks, i.e., set $SB'_{2l-1} = SB_{2l-1}$ and $SB'_{2l} = SB_{2l}$; otherwise (no overflow occurs), keep the result of moving and swapping. If $j = 0$, set $SB'_{2l-1} = SB_{2l-1}$ and $SB'_{2l} = SB_{2l}$ directly.

Step 4: Set $B_i = SB'_i$.

The RHS algorithm proposed above can be iterated from Step 1 to Step 4 to gradually modify the AC-blocks' VLC-quantity histogram mentioned in Section 2.2(2) and, in most cases, to reduce the quantity of zero blocks (blocks without any VLC codeword), while strictly preserving the total AC stream size of the whole JPEG file. Thanks to the swapping operation, the VLC moving operation will not cause any ambiguity between content-changed blocks (from the case that $\mathrm{CNum}(SB_{2l-1}) >$ [...] and $\mathrm{CNum}(SB'_{2l-1})$ [...] 0), 50 percent will change to value $(i-1)$ and the other 50 percent to value $(i+1)$. Correspondingly, half of the original blocks with VLC codeword quantity $(i-1)$ and half of the original blocks with VLC codeword quantity $(i+1)$ will change to quantity value $i$. Let $H(i)$ be the amplitude of the histogram's $i$th bin (the quantity of blocks whose VLC codeword quantity equals $i$). The new amplitude $H'(i)$ can then be approximated as

$$
\begin{aligned}
H'(i) &= 0.5 \cdot H(i-1) + 0.5 \cdot H(i+1), \qquad 0 < i < L_{max} \\
H'(0) &= 0.5 \cdot H(0) + 0.5 \cdot H(1) \\
H'(L_{max}) &= 0.5 \cdot H(L_{max}-1) + 0.5 \cdot H(L_{max})
\end{aligned}
\tag{1}
$$

where $L_{max}$ is the maximum VLC codeword quantity over all blocks of each image such that the total decoded AC coefficients' quantity will not exceed 63. Obviously, the above histogram spreading model exhibits a low-pass filtering effect on the original histogram. According to Eq. (1), it is easy to imagine that the amplitude of the histogram $H$ will gradually spread to a straight horizontal line leveled at

$$
\left( \sum_{i=0}^{L_{max}} H(i) \right) / (L_{max} + 1).
$$
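The low-pass behaviour of the idealized model in Eq. (1) can be checked numerically. The sketch below iterates the three update rules on a toy histogram; the starting values and the choice $L_{max} = 8$ are assumptions made for illustration, and the code models only Eq. (1), not the actual RHS operation on VLC data.

```python
from typing import List


def spread_once(h: List[float]) -> List[float]:
    """Apply one step of the Eq. (1) model to a histogram with bins 0..L_max."""
    l_max = len(h) - 1  # requires at least two bins
    new_h = [0.0] * len(h)
    for i in range(1, l_max):
        new_h[i] = 0.5 * h[i - 1] + 0.5 * h[i + 1]   # interior bins
    new_h[0] = 0.5 * h[0] + 0.5 * h[1]               # lower boundary
    new_h[l_max] = 0.5 * h[l_max - 1] + 0.5 * h[l_max]  # upper boundary
    return new_h


def spread(h: List[float], iterations: int) -> List[float]:
    """Iterate the model a given number of times."""
    for _ in range(iterations):
        h = spread_once(h)
    return h


# Toy histogram dominated by zero blocks (assumed values, L_max = 8).
h0 = [90.0, 5.0, 3.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
h_final = spread(h0, 500)
level = sum(h0) / len(h0)  # sum of H(i) divided by (L_max + 1)
```

In this model the total number of blocks is conserved at every step, and the bins approach the common value `level`, in line with the horizontal line stated above.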

However, because RHS preserves the total VLC codeword quantity of the whole JPEG file (it is easily seen that the moving and switching operations in Step 3 do not change the total VLC codeword quantity), the final convergence of the RHS iterations may not be a straight line. Note that in Eq. (1) the histograms $H$ and $H'$ are formed from the distribution of the block quantity over the inside-block VLC codeword quantity, from 0 to $L_{max}$. We can roughly suppose that this histogram smoothing and flattening trend also exists when $H$ and $H'$ are formed from the distribution of the block quantity over the inside-block VLC-decoded AC coefficient quantity, from 0 to 63, provided the block shuffling used is good enough to generate a block distribution close to an even distribution. The experimental results in the next section will demonstrate that this conjecture is reasonable. Another issue worth noting is that RHS might affect the lossless watermarking capacity, because RHS modifies the VLC distribution inside and also across blocks. To prevent this potential influence, the VLC categories used for watermarking in the algorithm of [12] can be excluded from shuffling by RHS. As long as these VLC categories account for only a small portion of all VLCs, this will not influence the security against the content leakage attack.
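One forward RHS iteration (Steps 1 to 4) can be sketched as below. This is an illustrative model under assumptions, not the authors' implementation: each VLC codeword is represented as a pair (number of AC coefficients it covers, identifier), `random.Random` stands in for the keyed shuffling, and only the encryption direction is shown.

```python
import random
from typing import List, Tuple

# Assumption: a codeword is (AC coefficients covered, 1..16; unique id).
Codeword = Tuple[int, int]
Block = List[Codeword]
MAX_AC = 63  # maximum AC coefficients in one 8x8 block


def cnum(block: Block) -> int:
    """CNum: quantity of AC coefficients decoded from the codewords in a block."""
    return sum(n for n, _ in block)


def rhs_iteration(blocks: List[Block], rng: random.Random) -> List[Block]:
    """Run Steps 1 to 4 once and return the new block list."""
    # Step 1: block shuffling.
    order = list(range(len(blocks)))
    rng.shuffle(order)
    sb = [list(blocks[i]) for i in order]
    # Steps 2 and 3: pair neighbors, then move-and-swap unless it overflows.
    for l in range(len(sb) // 2):
        first, second = sb[2 * l], sb[2 * l + 1]
        if not first:                      # j = 0: leave the pair unchanged
            continue
        new_first = second + [first[0]]    # second block plus moved codeword V11
        new_second = first[1:]             # remainder of the first block
        if cnum(new_first) > MAX_AC:       # overflow: cancel move and swap
            continue
        sb[2 * l], sb[2 * l + 1] = new_first, new_second
    # Step 4: the result becomes the input of the next iteration.
    return sb


# Toy image: 48 zero blocks and 16 blocks with 16 two-coefficient codewords each.
rng = random.Random(1)
blocks = [[] for _ in range(48)] + [[(2, 16 * b + c) for c in range(16)] for b in range(16)]
encrypted = blocks
for _ in range(20):
    encrypted = rhs_iteration(encrypted, rng)
```

Every iteration keeps the multiset of codewords and never lets a block exceed 63 AC coefficients, so the AC stream size is preserved while the codewords spread over more blocks.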

4. EXPERIMENTAL RESULTS

In our experiments, an M-sequence generator is employed as the PRNG with a random initialization vector of length L = 32 bits, and a counter is incremented by 1 to generate a new initialization vector for each different image. With this PRNG, a key stream was generated and the shuffling keys for each RHS iteration were truncated from it. A Q-coder [16] was implemented, without probability estimation optimization, for lossless compression of the AC VLC codes in the same VLC category, to save space for watermarking according to the lossless VLC watermarking algorithm in [12]. Gray-level JPEG images of size 512x512 were tested in the experiments. Here we present results for the classic image Barbara (Fig. 3(a)) and a special image, Fingerprint (Fig. 3(f)), after setting DC to a constant (gray level 128) and applying RHS iterations based on block shuffling and VLC fixed permutation.

Fig. 3. Encryption experimental results for Barbara and Fingerprint with all JPEG files' sizes preserved. (a) Barbara; (b) Barbara after intra-block shuffling and intra-Run-Length-set shuffling of the AC VLC codewords (DC=128); (c) Barbara after block shuffling (DC=128); (d) Barbara after 20 RHS iterations (DC=128); (e) Barbara after 200 RHS iterations (DC=128); (f) Fingerprint; (g) Fingerprint after intra-block shuffling and intra-Run-Length-set shuffling of the AC VLC codewords (DC=128); (h) Fingerprint after block shuffling (DC=128); (i) Fingerprint after 20 RHS iterations for perceptual enhancement of the encryption effect (DC=128); (j) Fingerprint after 200 RHS iterations for perceptual enhancement of the encryption effect (DC=128).

The experimental results indicate that the large numbers of AC zero blocks (blocks without any VLC codeword) present in Fig. 3(g) and (h) (shuffling schemes other than the RHS algorithm) decrease as the number of RHS iterations, based on block shuffling and VLC fixed permutation, increases. From our experiments we note that after enough RHS iterations, special images such as Fingerprint, whose content leaks easily after ordinary shuffling, exhibit encryption effects as good as those of natural images such as Barbara. This can be attributed to the fact that reversible histogram spreading iterations redistribute a JPEG image's VLC codewords over a wider range of AC coefficient quantities within [0,63], while strictly preserving the image's total VLC codeword quantity and thus the JPEG file's size. This explanation is supported by Fig. 4, which presents the two images' original histograms (for Fig. 3(a)-(c) and (f)-(h)) and the histograms after 500 RHS iterations. In both Fig. 4(a) and (b), the histograms after RHS (solid line) are smoothed and flattened compared to the original histograms (dotted line), and this is especially pronounced for the Fingerprint image (Fig. 4(b)). Although the histograms after spreading are not horizontally leveled, as explained in Section 3, they are less discriminative than the originals and thus offer higher security against content leakage or analysis attacks on the whole image.

Fig. 4. Histogram spreading results for (a) Barbara and (b) Fingerprint: original histogram (dotted line) and the one after reversible histogram spreading (solid line).

To counter the inherent security weakness of shuffling, the image-unique initialization vector for the PRNG was used together with the user key to generate the key stream, which makes the RHS-encrypted results more secure against known/chosen plain-text attacks than ordinary shuffling schemes.
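A key stream derived from a user key and a per-image 32-bit initialization vector can be sketched with a linear feedback shift register. Everything specific here is an assumption for illustration: the paper does not state the feedback polynomial or how the user key and the initialization vector are combined, so the taps (32, 22, 2, 1, a commonly tabulated choice for 32-bit registers) and the XOR mixing are hypothetical, as are all names.

```python
from typing import List

MASK32 = 0xFFFFFFFF


def lfsr32_bits(seed: int, nbits: int) -> List[int]:
    """Generate bits from a 32-bit Fibonacci LFSR (assumed taps 32, 22, 2, 1)."""
    state = seed & MASK32
    if state == 0:
        state = 1  # the all-zero state would lock the register
    bits = []
    for _ in range(nbits):
        bits.append(state & 1)
        # Feedback is the XOR of the tapped bits (shift = 32 - tap position).
        feedback = (state ^ (state >> 10) ^ (state >> 30) ^ (state >> 31)) & 1
        state = (state >> 1) | (feedback << 31)
    return bits


def keystream_for_image(user_key: int, iv: int, nbits: int) -> List[int]:
    """Assumed mixing: seed the register with user_key XOR iv."""
    return lfsr32_bits((user_key ^ iv) & MASK32, nbits)


user_key = 0x1234ABCD          # hypothetical user key
iv_first = 0x00000001          # initialization vector of the first image
ks_a = keystream_for_image(user_key, iv_first, 64)
ks_b = keystream_for_image(user_key, iv_first + 1, 64)  # counter incremented by 1
```

Because each image gets a different initialization vector, two images encrypted under the same user key are shuffled with different key streams, which is the property the paragraph above relies on.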

5. DISCUSSION

To be secure against known/chosen plain-text attacks and at the same time feasible for off-line encryption/decryption applications, we employed file-size-preserving lossless watermarking to store the unique initialization vector inside the JPEG file data itself. This design is, however, fragile to transmission errors: once the initialization vector cannot be correctly extracted because of errors, the whole image cannot be correctly decrypted. This problem is shared by all block shuffling based schemes [4,8]. Possible ways of gaining robustness against transmission errors are to use an external key management system instead of storing the initialization vector as a watermark, or to apply an error-correction code to the initialization vector. Fortunately, as long as the image is large enough, such as the 512x512 examples in our experiments, the lossless watermarking algorithm can provide enough capacity (around 600 and 200 bits for Barbara and Fingerprint respectively in the experiments) for error-corrected initialization vector codes.

6. CONCLUSIONS AND FUTURE WORK

We proposed in this paper a reversible histogram spreading algorithm to visually enhance the encryption effect on JPEG VLC codewords. Via reversible histogram spreading, the total VLC codeword quantity of the whole image can be smoothly spread over both the spatial domain (by shuffling) and the histogram domain (by VLC fixed permutation), which extends the range of the visual scrambling effect and enhances security against content leakage. Moreover, the reversible histogram spreading process does not modify the total VLC codeword quantity and strictly preserves the encrypted JPEG image's decoding transparency in terms of file size and format. Experiments demonstrated the effectiveness and advantage of our algorithm. Compared to other existing JPEG encryption schemes and VLC shuffling schemes, the proposed algorithm realizes reversible cross-Run-Length-set and cross-block VLC shuffling under the limit on the number of AC coefficients, and obtains a distinctly better perceptual encryption effect than other existing schemes.
Future work will focus on three aspects: (1) mathematical modelling of the perceptual encryption advantage of the proposed algorithm; (2) improving the efficiency of RHS on VLC data; and (3) applications to other VLC-encoding-based compression formats such as JPEG2000 and MPEGx.

6. ACKNOWLEDGEMENT

This work was partially supported by the National Natural Science Foundation of China (60703011) and the Research Fund for the Doctoral Program of China Ministry of Education (RFDP: 20070213047). We thank Dr. Chik How Tan for suggesting the ambiguity-free fixed-permutation algorithm.

7. REFERENCES

[1] A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone, Handbook of Applied Cryptography. Boca Raton, FL: CRC, 1996.
[2] L. Tang, "Methods for encrypting and decrypting MPEG video data efficiently," ACM Multimedia '96, pp. 219-229, 1996.
[3] C. Shi and B. Bhargava, "An efficient MPEG video encryption algorithm," Proc. IEEE Sympo. Reliable Distributed Systems, 1998.
[4] J. Wen, M. Severa, W. Zeng, M. H. Luttrell, W. Jin, "A format-compliant configurable encryption framework for access control of video," IEEE Trans. Circuits and Systems for Video Technology, vol. 12, no. 6, pp. 545-557, 2002.
[5] B. B. Zhu, M. D. Swanson, and S. Li, "Encryption and authentication for scalable multimedia: Current state of the art and challenges," Proc. SPIE Int. Sympo. Inf. Technology & Commun., vol. 5601, pp. 157-170, Philadelphia, PA, US, Oct. 2004.
[6] S. Lian, J. Sun, and Z. Wang, "A novel image encryption scheme based-on JPEG encoding," Proc. Int'l Conf. on Information Visualization, pp. 217-220, 2004.
[7] Y. Mao and M. Wu, "A Joint Signal Processing and Cryptographic Approach to Multimedia Encryption," IEEE Transactions on Image Processing, vol. 15, no. 7, pp. 2061-2075, July 2006.
[8] Y. Yao, Z. Xu, and W. Li, "A compressed video encryption approach based on spatial shuffling," Proc. Int'l Conf. on Signal Processing, 2006.
[9] S. Li, G. Chen, A. Cheung, B. Bhargava, K. Lo, "On the design of perceptual MPEG-Video encryption algorithms," IEEE Trans. Circuits Syst. Video Techn., vol. 17, no. 2, pp. 214-223, 2007.
[10] J. Dittmann, A. Steinmetz, R. Steinmetz, "Content-based digital signature for motion pictures authentication and content-fragile watermarking," IEEE Int'l Conf. Multimedia Comput. and Sys., 1999.
[11] J. Wang, Y. Dai, S. Thiemert, and Z. Wang, "A feature-watermarking scheme for JPEG image authentication," Proc. International Workshop on Digital Watermarking, 2003.
[12] J. Fridrich, M. Goljan, Q. Chen, and V. Pathak, "Lossless data embedding with file size preservation," Proc. SPIE Electronic Imaging, San Jose, Jan. 2004.
[13] S. Li, C. Li, G. Chen, D. Zhang, N. G. Bourbakis, and K. Lo, "A general quantitative cryptanalysis of permutation-only multimedia ciphers against plaintext attacks," Signal Processing: Image Communication, vol. 23, no. 3, pp. 212-223, 2008.
[14] G. K. Wallace, "The JPEG still picture compression standard," IEEE Transactions on Consumer Electronics, vol. 38, no. 1, 1992.
[15] J. Fridrich, "Image encryption based on chaotic maps," IEEE International Conference on Computational Cybernetics and Simulation, vol. 2, pp. 1105-1110, 1997.
[16] W. Pennebaker, J. Mitchell, G. Langdon, Jr., R. Arps, "An overview of the basic principles of the Q-Coder adaptive binary arithmetic coder," IBM Journal of Research and Development, vol. 32, no. 6, pp. 717-726, 1988.