Volume 2, Issue 3, March 2012
ISSN: 2277 128X
International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com
A Novel Compressed Domain Technique of Reversible Steganography
Mahmud Hasan1, Kamruddin Md. Nur2, Tanzeem Bin Noor3
1 Lecturer, 2 Assistant Professor, 3 Lecturer, Dept. of Computer Science & Engineering, Stamford University Bangladesh.
[email protected] m1
Monotosh Roy4, Software Engineer, Syngenta Bangladesh
Abstract— Steganography has become a common mode of secret communication in recent years. Although steganography can be carried out in image, audio or video signals, digital images are the prevalent carrier because they offer a higher degree of redundant data and are easily accessible in transmission media. Generally, hiding information inside a digital image degrades some of the visual quality of the image. However, recent steganographic schemes can disguise information within a digital image in such a way that the cover image can be restored to its original state after the hidden information is extracted. These techniques are commonly referred to as Lossless Steganography or Reversible Steganography. Although recent research has successfully carried out reversible steganography in the spatial domain, a reliable and robust lossless steganographic technique in the compressed domain is still a challenge. In this paper, we propose a novel reversible steganographic approach that can hide high-capacity secret information in the location based compressed domain. The proposed method can completely reconstruct the cover (host) image and provide the best possible perceptual quality. It also provides its users with a key for hiding and extracting information, so that robustness is achieved to a great extent.
Keywords— Lossless Steganography, Compressed Domain Data Hiding, Empty Space Using, Peak Signal to Noise Ratio, Vernam Cipher
I. INTRODUCTION
Secured data transmission via digital media took its initial step through cryptography. Cryptography is a form of secured data transmission where the contents of a message are kept secret. Steganography, on the other hand, is a secured data transmission technique where the existence of the message is itself secret [1,2]. Steganography is the art of hiding one piece of secret information within another so that the existence of the secret information is not suspected [3]. However, as far as security is concerned, neither cryptography nor steganography is perfect alone [1]. Today's advanced cryptanalysis algorithms may well decrypt a message if its existence is known. Likewise, an intruder who presumes the existence of a hidden message can easily extract, read, alter or destroy it. Strong security is achieved only when steganography is combined with cryptography [1].
Although steganography can be applied to any sort of digital data, it is best suited to cases where a higher degree of redundancy is available. Therefore, digital images are considered to be an excellent medium for steganography [1, 4]. Since altering the Least Significant Bit (LSB) introduces at most one gray-level distortion in a given image pixel, the distortion is beyond the visual perception of humans [5]. Hence, image steganography using LSB has been a common approach [6, 7, 8]. However, changing the LSBs of pixels changes the Peak Signal to Noise Ratio (PSNR) of the image, since noise is introduced. Thus, the quality of the cover image is degraded or, in other words, the steganographic process is lossy [1]. On the other hand, lossless or reversible image steganography is defined as a data hiding technique in which the image can be restored to its original state after the embedded data is extracted [9]. Lossless or reversible steganography can take place in either the spatial or the compressed domain. Although spatial domain reversible steganography has seen considerable development in recent years, compressed domain reversible steganography is still considered a great challenge, since most of the effective
image compression algorithms are lossy in nature, and the embedded data may be lost while compressing or decompressing the image [1, 9]. Although a few studies [10, 11] have addressed compressed domain lossless steganography, efficient compressed domain reversible steganography is still lacking. In this paper, we suggest a novel reversible steganographic approach for the compressed domain. We combine a recent image compression technique developed by Hasan & Nur [12] with the proposed method. We chose their compression technique [12] because it is straightforward, lossless and computationally easy to carry out. The experimental results show that our proposed method can successfully reconstruct the cover image in a lossless manner with high data embedding capacity.
The rest of the paper is organized as follows: section II discusses related studies, followed by the image compression technique used in section III. The proposed encryption and decryption techniques are explained in section IV. Section V presents the experimental results. Further research is discussed in section VI and a conclusion is drawn in section VII.
II. RELATED STUDY
As mentioned in section I, lossless or reversible steganography has been studied extensively during the past few years, yet very few studies have carried out reversible data hiding in the compressed domain. In this section, we discuss some of those compressed domain reversible steganographic studies. Compressed domain lossless image steganography can be roughly categorized into two branches: one deals with JPEG compressed images, while the other uses AMBTC compressed images for hiding data.
Lossless steganography using JPEG compressed images was motivated by the fact that, even though JPEG is a lossy compression technique, it is lossless for the DC and AC coefficients that are passed to the Huffman encoding step. Therefore, using either LSB or other approaches on those coefficients can be worked out efficiently without any loss of the hidden information [1]. Similarly, the two quantization levels and one bit plane of AMBTC compression appeared advantageous for hiding data, since a relative trio always holds true for this compression technique [9]. Wien Hong et al. [9] presented an approach for lossless image steganography in the AMBTC compressed domain. They took advantage of interchanging the two quantization levels of the AMBTC compression technique [13] for embedding data. Recently, Zhenfei Zhao et al. [14] proposed another technique for the same compressed domain, in which they considered a multilevel histogram shifting mechanism for hiding high-capacity data. The studies in [10,11] explained different approaches to and aspects of hiding information in the popular DCT compressed domain of the JPEG compression technique. C. Velasco et al. took a different route to hiding data in JPEG compressed images [15]. They used convolutional codes and a bit synchronization technique to select the suitable blocks of the image for hiding data, and embedded
© 2012, IJARCSS E All Rights Reserved
data within only those blocks. Hsien-Wen Tseng et al. [16] considered the error at the JPEG quantization step and two scaling factors for data embedding. A few other studies [17,18,19] on JPEG stego images can also be taken into consideration.
Since compressed domain image steganography is presently confined to two specific image compression techniques, we extend this field to another compressed domain and show that the proposed method can achieve higher embedding capacity along with lower computational complexity.
III. LOCATION BASED IMAGE COMPRESSION TECHNIQUE
The location based image compression mechanism suggested by Hasan & Nur [12] is a recent advancement in the image compression arena. The compression method takes an input image and divides it into a number of 4×4 non-overlapping blocks. For the first block, it checks which gray value is the most frequent. Since each pixel is usually represented by 8 bits, the Most Frequent Pixel (MFP) accounts for the largest number of bits in the block. The method simply deletes all occurrences of the most frequent pixel from the block and represents all other pixels in an array-like data structure. The block-to-array conversion is performed in a left-to-right, top-to-bottom manner. Next, the Second Most Frequent Pixel (SMFP) and its frequency are found. For a block, it is guaranteed that the frequency of any pixel other than the MFP and the SMFP is less than or equal to the frequency of the SMFP. Thus, if k bits are required to denote the SMFP's frequency, any pixel frequency of the block can be denoted by k bits. The encoded bit stream is then organized as follows.
a. 8 bits for MFP.
b. 4 bits for frequency of MFP.
c. 4 bits for denoting Individual Pixel Frequency. These bits represent a number k.
d. 8 bits for SMFP.
e. k bits for denoting SMFP's frequency.
f. 4 bits for encoding each SMFP location.
g. Steps d to f for each distinguished pixel of the block.
Finally, these steps are performed for each block of the image. The encoding and decoding steps are described more precisely in the following subsections.
A. Encoding Mechanism
Step 1: An image is divided into a number of 4×4 non-overlapping blocks.
Step 2: The block is converted to an array of 16 elements. A left-to-right, top-to-bottom approach is followed for the conversion.
Step 3: Find the MFP in the array and delete all of its occurrences. The array now contains fewer than 16 elements.
Step 4: Write the MFP in the first 8 bits of the encoded bit stream. Then write 4 bits that represent the MFP's frequency.
Step 5: Find the frequency of the SMFP. If k bits are required to represent this frequency, write k in the next 4 bits.
Step 6: Write the next pixel value P of the block in 8 bits.
Step 7: Write the frequency of P in k bits and each position of P's occurrence in 4 bits.
Step 8: Repeat Step 6 to Step 7 for each distinguished pixel of the block.
Step 9: Repeat Step 2 to Step 8 to cover the whole image.
If the image is a color image, the color planes are separated first, and the method is then applied to each color plane individually.
B. Decoding Mechanism
Step 1: Read the first 8 bits, which represent the MFP of the block. Read the next 4 bits, which give the frequency of the MFP.
Step 2: Read the value of k from the next 4 bits.
Step 3: Read the next 8 bits for a pixel and a further k bits for its frequency.
Step 4: Read the next 4 bits to find a location of the pixel of Step 3. Repeat Step 4 n times, where n is the number given by the k bits.
Step 5: Repeat Step 3 to Step 4 until all distinguished pixels are processed. Then simply fill the blank spaces with the MFP.
In Fig. 1, an example block is processed according to the algorithm proposed by Hasan & Nur [12].
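The encoding and decoding steps of this section can be sketched for a single block as follows. This is a minimal illustration in Python, not the implementation of [12]: the function names are hypothetical, the bit stream is modelled as a string of '0' and '1' characters, and three details that the text leaves open are filled in by assumption. The MFP frequency is stored as frequency minus one so that a count of 16 fits in 4 bits, the k field is omitted when the whole block is a single gray value (the 12-bit best case), and the remaining pixel values are written in order of first appearance.

```python
from collections import Counter

def encode_block(pixels):
    """Encode 16 gray values (a 4x4 block in row-major order) as a bit string."""
    assert len(pixels) == 16
    # Most frequent pixel; ties are resolved by Counter.most_common.
    mfp, mfp_freq = Counter(pixels).most_common(1)[0]
    # Assumption: store frequency - 1 so that a frequency of 16 fits in 4 bits.
    bits = format(mfp, "08b") + format(mfp_freq - 1, "04b")
    others = [(i, p) for i, p in enumerate(pixels) if p != mfp]
    if not others:
        return bits  # uniform block: 12 bits in total
    other_counts = Counter(p for _, p in others)
    k = max(other_counts.values()).bit_length()  # bits needed for SMFP frequency
    bits += format(k, "04b")
    for value in other_counts:  # iteration follows first appearance in the block
        positions = [i for i, p in others if p == value]
        bits += format(value, "08b") + format(len(positions), f"0{k}b")
        bits += "".join(format(i, "04b") for i in positions)
    return bits

def decode_block(bits, pos=0):
    """Decode one block starting at bit offset pos; return (pixels, next_pos)."""
    mfp = int(bits[pos:pos + 8], 2); pos += 8
    mfp_freq = int(bits[pos:pos + 4], 2) + 1; pos += 4
    block = [mfp] * 16            # blank spaces are pre-filled with the MFP
    remaining = 16 - mfp_freq     # pixels that still have to be placed
    if remaining:
        k = int(bits[pos:pos + 4], 2); pos += 4
        while remaining:
            value = int(bits[pos:pos + 8], 2); pos += 8
            freq = int(bits[pos:pos + k], 2); pos += k
            for _ in range(freq):
                block[int(bits[pos:pos + 4], 2)] = value; pos += 4
            remaining -= freq
    return block, pos
```

Because the decoder tracks how many pixels are still unplaced, it knows where a block ends without any explicit terminator, which is the property the data hiding scheme relies on.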
IV. PROPOSED METHOD
From the discussion in section III, it is clear that the compression mechanism proposed by Hasan & Nur [12] lets us take advantage of block processing. Moreover, since each block is compressed individually and does not depend on any other block for decompression, we can use each block for embedding the hidden text. The number of distinguished pixels a block possesses is clearly identifiable by the decompression mechanism of Hasan & Nur [12]; it is known well before the decompression of the block is finished. Again, since a 4×4 block is expected to contain many identical pixels, let us assume that a block of 128 bits can be represented by 64 bits. Then 64 bits are empty for this particular block and can be used for hiding data. The best case of the location based compression algorithm uses only 12 bits for a block, so the number of empty bits in any block is at most 128 − 12 = 116 and can be represented by 7 bits. Hence, after the block is compressed, an additional 7-bit value tells us how many bits remain empty. One character of the hidden text is then embedded using 8 bits of that space, after being encrypted with a user-defined encryption key in order to ensure the security of the hidden text. If a total of 16 bits are empty, excluding the 7 bits used for denoting the empty-bit count, two characters can be embedded within that block. However, if 15 bits are empty, only one character is embedded using the first 8 empty bits, and the remaining 7 bits are discarded for the block. The decoding phase begins with the extraction of the hidden characters and the deletion of the extra bits used for data embedding in the compressed domain. Then only those
Fig. 1 Example Block Processing Using Location Based Image Compression [12]
Fig. 2 Data Hiding Example using Proposed Method.
bits remain that were essentially preserved by the location based compression algorithm. The following subsections clarify the proposed technique in detail. Fig. 2 illustrates an example to aid understanding of the proposed method.
A. Embedding Procedure
Step 1: An image is divided into a number of 4×4 non-overlapping blocks.
Step 2: The block is converted to an array of 16 elements. A left-to-right, top-to-bottom approach is followed for the conversion.
Step 3: Find the MFP in the array and delete all of its occurrences. The array now contains fewer than 16 elements.
Step 4: Write the MFP in the first 8 bits of the encoded bit stream. Then write 4 bits that represent the MFP's frequency.
Step 5: Find the frequency of the SMFP. If k bits are required to represent this frequency, write k in the next 4 bits.
Step 6: Write the next pixel value P of the block in 8 bits.
Step 7: Write the frequency of P in k bits and each position of P's occurrence in 4 bits.
Step 8: Repeat Step 6 to Step 7 for each distinguished pixel of the block.
Step 9: Find how many bits are free for the block. This is computed as 128 minus the total number of bits after compression.
Step 10: Represent the number of free bits by a 7-bit value m. The bits that remain after the compressed bits and the 7-bit m are grouped 8 bits per group. If fewer than 8 bits are left to form a group, that group is discarded.
Step 11: Each character to be hidden is embedded in 8 bits using the key supplied by the user.
Step 12: Steps 1 to 11 are repeated until the cover image is processed or the hidden text is finished.
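Steps 9 to 11 above can be sketched for one block as follows. This is an illustrative Python sketch under stated assumptions rather than the authors' code: `embed_block` is a hypothetical name, the compressed block is passed in as a string of '0' and '1' characters, characters are assumed to fit in 8 bits, and the character is combined with an 8-bit key by XOR (the Vernam cipher named in the keywords). The 7-bit field m is taken here to record the number of payload bits actually written, so that an extractor knows where to stop; the text describes m as the free-bit count and does not say what happens when fewer than 7 bits are free, so both points are assumptions of the sketch.

```python
BLOCK_BITS = 128  # 16 pixels x 8 bits in the uncompressed block
COUNT_BITS = 7    # width of the field m

def embed_block(compressed_bits, text, key):
    """Append m and as many encrypted characters as fit into one block.

    Returns (stego_bits, number_of_characters_embedded).
    """
    # Free space left once the compressed bits and the 7-bit field are counted.
    free = BLOCK_BITS - len(compressed_bits) - COUNT_BITS
    capacity = max(free, 0) // 8          # incomplete 8-bit groups are discarded
    chunk = text[:capacity]
    # Vernam (XOR) encryption of each 8-bit character with the 8-bit key.
    payload = "".join(format(ord(c) ^ key, "08b") for c in chunk)
    m = len(payload)                      # assumption: m counts payload bits
    return compressed_bits + format(m, "07b") + payload, len(chunk)
```

With 16 free bits beyond the 7-bit field the sketch embeds two characters, and with 15 it embeds one, matching the two cases described earlier in this section. A caller would loop over the blocks, passing the not yet embedded remainder of the text each time.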
B. Extracting Procedure
Step 1: The first block of the compressed image is decompressed using the decoding steps given in section III-B.
Step 2: The next 7 bits are read. They tell whether any embedded characters are present in this block. If an embedded character is found, it is decoded using the key.
Step 3: Step 2 is repeated the number of times denoted by the 7 bits extracted.
Step 4: Steps 1 to 3 are repeated until the image is completely decompressed or the whole hidden text is extracted.
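The extraction steps above might look like the following for a single block. Again this is only a sketch with hypothetical names: it assumes the block decoder has already reported how many bits the compressed block occupies (`compressed_len`), that the 7-bit field holds the number of payload bits that follow, and that decryption is an XOR with the same 8-bit key used for embedding.

```python
def extract_block(stego_bits, compressed_len, key):
    """Split one stego block into (compressed_bits, hidden_text, bits_consumed)."""
    pos = compressed_len
    # The 7-bit field that follows the compressed block (assumed: payload bit count).
    m = int(stego_bits[pos:pos + 7], 2)
    pos += 7
    chars = []
    for _ in range(m // 8):
        byte = int(stego_bits[pos:pos + 8], 2)
        chars.append(chr(byte ^ key))     # XOR with the key undoes the Vernam step
        pos += 8
    return stego_bits[:compressed_len], "".join(chars), pos
```

A decoder without the correct key still recovers the compressed block unchanged, because the hidden bits sit entirely after it; only the recovered text differs.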
The security key used in the proposed method can be supplied by the user. For testing purposes, we used an 8-bit private key and Vernam-ciphered it with the actual
character to be hidden. However, a one-time pad can also be applied, depending on the length of the hidden text. Although the proposed method reduces the compression ratio of an image, it can easily embed a large number of characters in a robust manner. The robustness is achieved by providing the user with a security key for data encryption and decryption. Again, if the proper key is not given, the decoder of the proposed method simply decompresses the image, discarding the extra bits used for data embedding. Hence, the scheme resists intruder attacks in the middle of the transmission process. Since the encoding and decoding technique is recorded in the header part of the image file and the decoder can easily discard the extra bits after successfully retrieving the pixels of a block, an intruder's decoder behaves in the same way without raising any suspicion. Thus, double-layered security is achieved by the proposed scheme. As a digital image generally contains a huge number of 4×4 blocks, the number of characters of the text to be hidden should almost always be lower than the number of blocks. Therefore, the image is still compressed and contains fewer bits than the original image. The experimental results of this paper examine in detail how many characters can practicably be hidden using the proposed technique.
V. EXPERIMENTAL RESULTS
100 test images of 512×512 dimension and 100 test images of 1024×1024 dimension were used for testing the proposed method. The number of characters that can be hidden was determined for those images. For 512×512 images, 3,72,532 characters could be hidden inside the image on average, while for 1024×1024 images the number is 10,99,564. A portion of this study is illustrated in Table I. However, if we also want to preserve the achievable compression ratio, the number of characters that can be embedded is lower.
The compression ratio compared against the possible number of embedded characters is presented in Table II. Fig. 3 shows some of the test images and their corresponding PSNRs after the hidden text is extracted.
VI. DISCUSSION & FURTHER RESEARCH
The proposed algorithm has so far been tested only on monochrome images. Further studies can be conducted on true-color images. Images of different dimensions can be taken into consideration, and an optimum number of hidden characters can be suggested that does not much affect the compression ratio. A straightforward cryptographic approach could also be applied to the bit stream produced by the location based compression algorithm, hiding characters without adding any extra bits, so that the final compression ratio remains the same as that of the underlying compression algorithm.
TABLE I
PORTION OF STUDY TO FIND POSSIBLE NUMBER OF CHARACTERS TO BE HIDDEN
Image Name    Dimension    Possible Number of Characters to be Hidden
Lena          512×512      2,11,232
Baboon        512×512      3,45,113
Iris          512×512      3,78,587
Bridge        512×512      1,99,769
Tank          1024×1024    08,49,265
Actress       1024×1024    09,55,275
Cameraman     1024×1024    10,87,378
Peppers       1024×1024    10,58,977
TABLE II
COMPARATIVE COMPRESSION RATIO AGAINST POSSIBLE NUMBER OF EMBEDDING CHARACTERS
Image Name    Compression Ratio by Location Based Approach    Compression Ratio after Data Hiding    Number of Characters Hidden
Lena          8.319325     8.305914     50
Baboon        10.66667     10.631679    80
Iris          10.40342     10.370133    80
Bridge        6.209833     6.200885     60
Tank          9.187302     9.164552     70
Actress       11.29374     11.244903    100
Cameraman     9.001889     8.980032     70
Peppers       9.118364     9.095945     70
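The trade-off between compression ratio and hidden characters can be made concrete with a small calculation. The sketch below is an assumption-laden illustration, not a description of how the reported figures were produced: it takes the compression ratio to be original bits divided by compressed bits, charges 8 bits per hidden character, and adds a single 7-bit field (`field_bits`, a hypothetical parameter; charging one field per block would raise the overhead).

```python
def ratio_after_hiding(width, height, ratio_before, n_chars, field_bits=7):
    """Estimate the compression ratio once n_chars characters are embedded."""
    original_bits = width * height * 8            # 8-bit grayscale image
    compressed_bits = original_bits / ratio_before
    overhead = 8 * n_chars + field_bits           # assumed cost of the hidden text
    return original_bits / (compressed_bits + overhead)
```

For a 512×512 image with a ratio of 8.319325 and 50 hidden characters, this expression evaluates to about 8.3059, which is close to the value listed for Lena in Table II.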
VII. CONCLUSION
In this paper, we presented a novel reversible image steganography technique for the compressed domain. The proposed technique works efficiently and independently of image size, quality and dimension. The experimental results presented in this paper show that the data hiding capacity of the proposed method is extremely high if the relative compression ratio is compromised. The double layer of security for the embedded text protects the data against intruder attacks. In addition, the user is given the freedom of choosing the key and using it in any way s/he likes. Further research on the proposed method has also been discussed so that studies in this direction can be continued smoothly.
REFERENCES
[1] T. Morkel, J.H.P. Eloff and M.S. Olivier, “An Overview of Image Steganography”, 5th Annual Information Security South Africa Conference, July 2005.
[2] H. Wang and S. Wang, “Cyber Warfare: Steganography vs Steganalysis”, Communications of the ACM, Vol. 47, Issue 10, October 2004.
[3] D. Artz, “Digital Steganography: Hiding Data within Data”, IEEE Internet Computing Magazine, Vol. 5, Issue 3, pp. 75-80, August 2002.
[4] R. J. Anderson and F.A.P. Petitcolas, “On the limits of steganography”, IEEE Journal of Selected Areas in Communications, pp. 474-481, May 1998.
[5] C. K. Chan and L.M. Cheng, “Hiding Data in Images by Simple LSB Substitution”, Pattern Recognition Letters, Vol. 37, pp. 469-474, 2004.
[6] V. Vijayalakshmi, G. Zayaraz, and V. Nagaraj, “A modulo based LSB steganography method”, IEEE Conference on Control, Automation, Communication and Energy Conservation, pp. 1-4, August 2009.
[7] D. Neeta, K. Snehal, and D. Jacobs, “Implementation of LSB Steganography and Its Evaluation for Various Bits”, IEEE International Conference on Digital Information Management, pp. 173-178, June 2007.
[8] S. M. M. Karim, M. S. Rahman, and M.I. Hossain, “A New Approach for LSB Based Image Steganography using Secret Key”, Appeared in 14th International Conference on Computer and Information Technology (ICCIT), pp. 286-291, March 2012, DOI: 10.1109/ICCITechn.2011.6164800.
[9] W. Hong, T. Chen, and C. Shiu, “Lossless Steganography for AMBTC-Compressed Images”, IEEE Congress on Image and Signal Processing, Vol. 2, pp. 13-17, July 2008, DOI: 10.1109/CISP.2008.638.
[10] A.M. Fard, M.-R. Akbarzadeh-T, and F. Varasteh-A, “A New Genetic Algorithm Approach for Secure JPEG Steganography”, IEEE International Conference on Engineering of Intelligent Systems, pp. 16, September 2006, DOI: 10.1109/ICEIS.2006.1703168.
[11] M. Ishaque, and S.A. Sattar, “Quality Based JPEG Steganography Using Balanced Embedding Technique”, IEEE International Conference on Emerging Trends in Engineering and Technology, pp. 215-221, January 2010, DOI: 10.1109/ICETET.2009.188.
[12] M. Hasan and K. M. Nur, “A Lossless Image Compression Technique using Location Based Approach”, International Journal of Scientific and Technology Research (IJSTR), vol. 1, issue. 2, March 2012.
[13] M. D. Lema and O. R. Mitchel, “Absolute Moment Block Truncation Coding and its Application to Color Images”, IEEE Transactions on Communications, Vol. 32, Number 10, pp. 1148-1157, 1984, DOI: 10.1109/TCOM.1984.1095973.
[14] Z. Zhao and L. Tang, “High Capacity Reversible Data Hiding in AMBTC-Compressed Images”, International Journal of Digital Content Technology and its Applications, Vol. 6, Number 2, February 2012.
[15] C. Velasco, M. Nakano, H. Perez, R. Martinez, and K. Yamaguchi, “Adaptive JPEG Steganography using Convolutional Codes and Synchronization Bits in DCT Domain”, IEEE International Midwest Symposium on Circuits and Systems, pp. 842-847, September 2009, DOI: 10.1109/MWSCAS.2009.5235899.
[16] H. Tseng and C. Chang, “Steganography using JPEG-Compressed Images”, IEEE International Conference on Computer and Information Technology, pp. 12-17, November 2004, DOI: 10.1109/CIT.2004.1357167.
[17] D. R. D. Brabin, and V. Sadasivam, “QET Based Steganography Technique for JPEG Images”, IEEE Conference on Control, Automation, Communication and Energy Conservation, pp. 1-5, August 2009.
[18] Y.M. Behbahani, P. Ghayour, and A. H. Farzaneh, “Eigenvalue Steganography based on Eigen Characteristics of Quantized DCT Matrices”, IEEE International Conference on Information Technology and Multimedia, pp. 1-4, January 2012, DOI: 10.1109/ICIMU.2011.6122769.
[19] S. Sachdeva and A. Kumar, “Colour Image Steganography Based on Modified Quantization Table”, IEEE International Conference on Advanced Computing and Communication Technologies, pp. 309-313, March 2012, DOI: 10.1109/ACCT.2012.37.
Fig. 3 PSNR of Test Images Before and After Data Embedding and Extraction.
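For reference, the PSNR reported in Fig. 3 can be computed with the standard definition, PSNR = 10 log10(peak² / MSE). The Python sketch below assumes 8-bit grayscale images flattened into equal-length sequences; the function name is illustrative and not taken from the paper.

```python
import math

def psnr(original, reconstructed, peak=255):
    """Peak Signal to Noise Ratio in dB between two equal-length pixel sequences."""
    assert len(original) == len(reconstructed) and len(original) > 0
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf   # identical images: no noise at all
    return 10 * math.log10(peak ** 2 / mse)
```

A reversible scheme that restores every pixel exactly gives an MSE of zero, so the PSNR between the original and the reconstructed cover image is unbounded, whereas changing every pixel by one gray level, as LSB substitution can, gives roughly 48.13 dB.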