Reversible Visible Watermarking and Lossless Recovery of Original ...

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 16, NO. 11, NOVEMBER 2006


Transactions Letters

Reversible Visible Watermarking and Lossless Recovery of Original Images

Yongjian Hu, Member, IEEE, and Byeungwoo Jeon, Senior Member, IEEE

Abstract—In this paper, we propose a reversible visible watermarking algorithm to satisfy a new application scenario where the visible watermark serves as a tag or ownership identifier, but can be completely removed to restore the original image data. The algorithm includes two procedures: data hiding and visible watermark embedding. In order to losslessly recover both the watermark-covered and nonwatermark-covered image contents at the receiver end, the payload consists of two reconstruction data packets, one for recovering the watermark-covered region, and the other for the nonwatermark-covered region. The data hiding technique reversibly hides the payload in the image region not covered by the visible watermark. To satisfy the requirements of large capacity and high image quality, our hiding technique is based on data compression and uses a payload-adaptive scheme. It further adopts error diffusion to improve subjective image quality and arithmetic compression using a character-based model to increase computational efficiency. The visible watermark is securely embedded based on a user-key-controlled embedding mechanism. The data hiding and the visible watermark embedding procedures are integrated into a secure watermarking system by a specially designed user key.

Index Terms—Data compression, data hiding, lossless watermark, reversible watermarking, visible watermark.

I. INTRODUCTION

TRADITIONAL watermarking techniques often sacrifice an imperceptible amount of host data for robustness. However, in some applications such as medical imagery, remote sensing, and law enforcement, any permanent distortion introduced by watermarking is not acceptable. The embedded watermark is required to be reversible (i.e., invertible or erasable) so that the original image can be losslessly restored. So far, reversible watermarking has mainly been proposed for authentication or data integrity verification (e.g., [1]–[11]). In this study, however, we extend its application to visible watermarking. A reversible watermark offers much more than content authentication. It has the additional advantage that, when watermarked content has been detected to be authentic, one can remove the watermark to retrieve the original, unwatermarked content [4].

Manuscript received March 9, 2006; revised June 16, 2006. This work was supported in part by the NRL Program 2006-0397-000 of MOST, the National Science Foundation of China under Grant 60572140, and the Natural Science Foundation of Guangdong under Grant 04020004. This paper was recommended by Associate Editor Q. Sun. Y. Hu is with the School of Information and Communication Engineering, Sungkyunkwan University, Suwon 440-746, Korea, and also with the College of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China (e-mail: [email protected]). B. Jeon is with the Department of Electronic and Electrical Engineering, School of Information and Communication Engineering, Sungkyunkwan University, Suwon 440-746, Korea (e-mail: [email protected]). Digital Object Identifier 10.1109/TCSVT.2006.884011

Honsinger et al. [1] proposed a reversible embedding method using addition modulo 256. Such an operation flips pixels to implement reversible embedding. It can avoid both overflow and underflow, but would cause annoying salt and pepper noise when pixel values are near the upper or lower bound. Fridrich et al. [2] proposed an invertible authentication method using lossless bit-plane compression. Data hiding is implemented by directly replacing the one-bit pixels on the key bit plane with the authentication bits. Later, they [3] proposed another method called the RS method. Using a discrimination function and a flipping function, all of the pixel groups in an image are classified into three types: R (regular groups), S (singular groups), and U (unusable groups). By turning an R group into an S group or vice versa, a binary message bit is embedded into the image. Embedding only takes place in R and S groups. U groups are not used. The use of R and S groups avoids overflow and underflow. In [4], Tian proposed a different method based on modifying the difference between a pair of pixel values. He used the "expandable" pairs to circumvent overflow and underflow. His later work [5] further increases the capacity and image quality. Other methods include the extended patchwork method [6], the generalized-LSB embedding method [8], [11], the improved version of Tian's difference expansion [9], and the histogram modification method [10].

In this paper, we propose a novel reversible visible watermarking algorithm. Generally, a visible watermark is translucently laid on the host image and designed to be irreversible so as to resist unintentional modifications or malicious attacks (see, for example, [12]). However, in some potential applications, a visible watermark is first used as a tag or ownership identifier and then needs to be removable [13]. We give the following two examples.
1) Efficient maintenance of patients' images is crucial in medical departments. Such a job strictly forbids misinterpreting patients' images. Furthermore, some secret data in the image may need to be protected from disclosure for individual privacy. In this case, we desire a visible watermark constructed from the patient's name, ID number, or other useful information. It translucently appears on the region with the sensitive data and serves as a tag while shading the sensitive area. Only an authorized user can remove it and losslessly recover the original image data.

2) In remote sensing or military imagery, users are highly concerned with image quality when other data are embedded into the image. The reason is that the images are difficult to obtain and each pixel contributes to the final judgement. In addition, the owners are often not willing to expose all of the image details to the public. An invisible watermark can be used to verify ownership, but it cannot prevent the most valuable data, for example, the coordinates (e.g., longitude and latitude) of the target of interest, from disclosure. Thus, we need a reversible visible watermark as mentioned above.

This paper aims at designing this type of reversible visible watermark. The proposed algorithm contains two procedures: data hiding and visible watermark embedding. In order to emphasize their distinction, we denote data hiding as the procedure of preserving information for lossless recovery and visible watermark embedding as the procedure of inserting the watermark into the predefined image region. Due to the requirements of large capacity and high image quality, and some special requirements of visible watermarking (see Section IV), the reversible watermarking methods available in the literature are not suitable for our work.

The remainder of this paper is organized as follows. Section II presents the framework of the proposed algorithm and then describes in detail the implementation of data hiding and watermark embedding, respectively. Section III briefly describes watermark removal and image data reconstruction. Section IV gives some experiments. We draw the conclusion in Section V.

1051-8215/$20.00 © 2006 IEEE

II. DATA HIDING AND VISIBLE WATERMARK EMBEDDING

Fig. 1 illustrates the framework of the proposed algorithm. The visible watermark is a binary logo, and R denotes the image region to be protected (covered). The location of R is determined by image providers. The watermark has the same size as R. The visibly watermarked image is obtained when we translucently lay the watermark on R. To achieve lossless recovery of the image, we must preserve information of R in the nonwatermarked image area before embedding the watermark into R. Thus, the watermarked image visibly bears the watermark and, meanwhile, secretly accommodates two reconstruction data packets, one for losslessly reconstructing R, and the other for the nonwatermarked area.

Fig. 1. Framework of visible watermark embedding and data hiding.

A. Data Hiding

Our data hiding technique is based on data compression. Like other compression-based methods (e.g., [2]–[5], [8], [9]), we need to hide both the message bits and the compressed bit stream of the original host data. Here, the message bits represent the information of R. The data hiding procedure consists of six stages. We explain each stage below.

1) Preprocessing: As shown in the above applications, R often has a much smaller size than the image. Even so, hiding the information of the entire R is beyond the capability of most current data hiding techniques. For example, if the image is of 512 × 512 × 8 bits and R has the size of 64 × 128, we need as much as 8192 bytes of spare space to store R. It is hard to increase hiding capacity by increasing the lossless compression ratio. Therefore, this work proposes to preprocess R to alleviate the burden of data hiding. According to the property of visible watermarking, the watermark is translucently laid on R and most information of R is kept unchanged in the final image. This means that preserving the altered bits of R is sufficient. As will be seen later, this work proposes a bit-plane-based alteration scheme for watermark embedding, so we only need to store one bit plane. Suppose one bit plane of R is to be altered by the watermark. In Stage 1, all one-bit pixels on this bit plane constitute a pixel set. Thus, hiding this pixel set instead of R greatly reduces the burden.

Bit-plane data usually have a statistical structure, so we can further compress the pixel set with JBIG, which is a popular arithmetic coder for compressing binary images. In Stage 2, we choose the open C code of JBIG-KIT [14]. This compression proves to be very efficient. The JBIG-compressed pixel set has a much smaller size than the original one. For example, in Lena, if we choose the most significant bit (MSB) plane of the above R as the bit plane to be altered, the JBIG-compressed data take up only a fraction of the size of the uncompressed bit plane.

2) Choosing and Compressing Pixels on a Key Bit Plane: Although the JBIG-compressed data have a far smaller size than R, they are still large for data hiding. In the above example, they amount to several hundred bytes. To tackle such a large amount of data, we propose a new compression-based data hiding technique. Our method is motivated by [2]. We use the same formula to calculate spare space:

$$\text{Spare space} = \text{Number of pixels} - \text{Compressed data size} \tag{1}$$
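As an illustration of Stage 1 and of (1), the following minimal sketch extracts one bit plane from a list of 8-bit pixels and estimates its spare space. It is only a sketch under stated assumptions: `zlib` stands in for the arithmetic coder of [15] and for JBIG, and the helper names `bit_plane`, `pack_bits`, and `spare_space_bits` are hypothetical.

```python
import zlib

def bit_plane(pixels, level):
    """Return the one-bit pixels of bit plane `level` (0 = LSB, 7 = MSB)."""
    return [(p >> level) & 1 for p in pixels]

def pack_bits(bits):
    """Pack one-bit pixels into bytes, eight per byte, zero-padding the tail."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chunk = chunk + [0] * (8 - len(chunk))
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

def spare_space_bits(bits):
    """Equation (1): number of pixels minus compressed data size, in bits.

    Assumption: zlib replaces the coder used in the paper, so the numbers
    only illustrate the trend, not the reported results.
    """
    compressed = zlib.compress(pack_bits(bits), 9)
    return len(bits) - 8 * len(compressed)

# Toy 64 x 64 gradient image in raster order (values stay below 64).
pixels = [(x + y) // 2 for y in range(64) for x in range(64)]
for level in range(8):
    print(level, spare_space_bits(bit_plane(pixels, level)))
```

On smooth content the higher bit planes tend to compress better and therefore offer more spare space, whereas a noisy plane can even yield a negative value, which is why the bit planes have to be tested one by one.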

As shown in Fig. 1, we hide the JBIG-compressed data in the nonwatermarked image area, i.e., the part of the image not covered by R. The data are hidden in the key bit plane. We define the key bit plane as the one that has enough spare space to accommodate the data but has the least impact on image quality. The latter constraint requires the key bit plane to be the lowest possible bit plane. We test each bit plane of the nonwatermarked area, from the LSB to the MSB planes, for the key bit plane. We compress the whole bit plane to estimate the spare space. The compression tool we use is the lossless arithmetic coder in [15], which is the improved version of the CACM arithmetic coding in [16].

In [2], the whole key bit plane is changed while embedding regardless of the message length. Such a strategy would greatly degrade image quality. Instead of changing the whole bit plane, we propose to alter only the necessary pixels on the key bit plane. Consider a pixel sequence (set) composed of one-bit pixels on the key bit plane, together with its compressed version. We choose as few pixels as possible to constitute this sequence, so that it provides just enough spare space for storing the JBIG-compressed data. As will be seen, this scheme is necessary not only for minimizing distortion but also for blind data extraction.

Our data hiding method depends on this scheme to control the embedding capacity. Since we do not tell the end users where and how many pixels have been changed, the only way of getting the sequence back is to retrieve it through iterative tests based on the constraint that the length of the sequence equals the length of its compressed version plus the length of the JBIG-compressed data. Here, length refers to the length of a sequence or the cardinality of a set. So the constitution of the sequence is one of the key problems in the proposed data hiding technique.

In Stages 3 and 4, we propose a payload-adaptive scheme for constituting the sequence. The number of pixels in the sequence solely depends on the size of the JBIG-compressed data. We choose pixels one by one in a raster scan order on the key bit plane. To describe the dynamic searching process, we define a temporary pixel set and its compressed version. The constitution of the temporary set starts at the beginning of the raster scan. Each time, one pixel is added into it. Before each addition, we compress the temporary set and check whether the constraint is satisfied. If it is, the current temporary set is the sequence we want. Otherwise, we increase the temporary set by one pixel and repeat the test process. Here, we assume that there is always a sequence which satisfies the constraint. In order not to break the flow of narration, we discuss this assumption further in Section II-C.

3) Constructing Payload and Performing Data Hiding: Once the sequence is found, we construct the payload in Stage 5 by concatenating the bit stream of the compressed sequence with the bit stream of the JBIG-compressed data. The compressed sequence is the reconstruction data for the nonwatermarked area. In Stage 6, we directly replace the one-bit pixel sequence with the payload bit stream to create the marked nonwatermarked area. We summarize the above preprocessing and data hiding processes in the following steps.

Step 1) For an image, determine the sensitive region R. Select the bit plane of R to be altered by the watermark. All of the one-bit pixels on this bit plane constitute the pixel set, which is the data for reconstructing R.
Step 2) Compress the pixel set by JBIG. The JBIG-compressed data are the data to be hidden in the nonwatermarked area.
Step 3) Choose the improved version of the CACM implementation as the compression tool. Test each bit plane of the nonwatermarked area, from the LSB to the MSB planes, to determine the key bit plane.
Step 4) Find the pixel sequence via the temporary sequence. The temporary sequence is composed of one-bit pixels on the key bit plane. The pixel selection follows the raster scan. The constitution of the temporary sequence starts at the beginning of the scan.
Step 5) One pixel is added into the temporary sequence each time. Before each addition, compress the temporary sequence and test whether the constraint is satisfied. If true, the sequence is found; otherwise, repeat step 5).
Step 6) Construct the payload bit stream by concatenating the compressed sequence with the JBIG-compressed data. Replace the pixel sequence with the payload to create the marked nonwatermarked area.

B. Improved Data Hiding

We analyze the above data hiding technique from the perspectives of image quality and computational complexity. We first examine image quality. In practice, it is rare that the spare space provided by the whole key bit plane coincides with the size of the data to be hidden. Usually, the constitution of the sequence needs only part of the pixels on the key bit plane. As a result, after the sequence is replaced with the payload, the image is visually divided into two parts. The pixels originally belonging to the sequence vary, while the other pixels remain unchanged. Actually, any compression-based data hiding technique using a consecutive alteration scheme (e.g., [8]) would face the same problem.
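The payload-adaptive search of steps 4)–6), together with the blind extraction by iterative tests that it enables, can be sketched as follows. This is a simplified illustration, not the authors' implementation: `zlib` replaces the arithmetic coder of [15], the bit plane is handled as bytes (eight one-bit pixels per byte), and a zero-padding count `pad` is introduced so that the payload fills the replaced sequence exactly, in the spirit of the extra variable discussed in Section II-C. The names `hide` and `recover` are hypothetical.

```python
import zlib

def hide(plane: bytes, data: bytes):
    """Grow a prefix of `plane` until its spare space can hold `data`,
    then replace the prefix with compress(prefix) + data + padding.
    Returns (marked_plane, pad)."""
    for length in range(1, len(plane) + 1):
        comp = zlib.compress(plane[:length], 9)
        pad = length - len(comp) - len(data)   # spare space left over
        if pad >= 0:
            payload = comp + data + bytes(pad)
            return payload + plane[length:], pad
    raise ValueError("not enough spare space on this bit plane")

def recover(marked: bytes, data_len: int, pad: int):
    """Blind extraction: test growing prefixes until one decompresses to a
    sequence whose length satisfies the constraint used in `hide`.
    Returns (original_plane, data)."""
    for c in range(1, len(marked) + 1):
        try:
            seq = zlib.decompress(marked[:c])
        except zlib.error:
            continue                            # truncated stream, keep testing
        if len(seq) == c + data_len + pad:
            data = marked[c:c + data_len]
            return seq + marked[len(seq):], data
    raise ValueError("payload not found")

# Toy example: a highly regular bit plane and 40 bytes of hidden data.
plane = (bytes(15) + b"\xff") * 128
data = bytes(range(40))
marked, pad = hide(plane, data)
restored, extracted = recover(marked, len(data), pad)
print(restored == plane, extracted == data)
```

The receiver needs only the size of the hidden data and the padding count, not the position or number of changed pixels, which mirrors the role of the length constraint in the text.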


On the other hand, step 5) shows that the computational load of searching for the sequence is heavy, especially when the sequence is long. We use an example to explain the situation. Given R of 64 × 128 bits, if the JBIG-compressed data is 269 bytes, the number of iterations before reaching the sequence is several times 269 × 8 when considering the compression ratio of arithmetic coding. Therefore, how to decrease the computing cost is the second problem.

1) Error Diffusion: The first problem is obviously caused by the improper way of constituting the sequence. If we select pixels at intervals rather than consecutively, the errors from data hiding will be dispersed across the image and attract less attention. So error diffusion is the straightforward solution. Below we introduce our error diffusion method. We first give a simple fixed-interval selection method and then develop it into a more sophisticated one.

The selection interval is also called the selection step. Suppose Cap represents the maximum hiding capacity. According to (1), Cap is the difference between the size of the key bit plane and that of the compressed key bit plane. Thus, the largest step is determined with the following formula:

$$\text{Largest step} = \left\lfloor \frac{\text{Cap}}{\text{Size of the data to be hidden}} \right\rfloor \tag{2}$$

where $\lfloor \cdot \rfloor$ represents the floor function. Equation (2) indicates that we can hide that many copies of the data in the image. Thus, we can use the largest step as the maximum selection interval for selecting pixels to constitute the sequence. Although the errors are maximally diffused using the largest step, it causes a new problem. The noise from data hiding looks like vertical stripes in the image.

Masking useful information with pseudorandom noise is a common technique originally derived from image processing but widely used in watermarking. Such a scheme motivates us to contrive a random pixel selector (RPS) to control the constitution of the sequence. The goal is to choose pixels as randomly as possible and avoid the mentioned noise. We design the RPS as a pseudorandom noise generator. It generates binary elements that follow a uniform distribution. When constituting the sequence, we still visit pixels on the key bit plane in a raster scan order, but whether a pixel is added into the sequence is determined by the RPS. If the RPS generates 1, the current pixel is added into the sequence; otherwise, we skip the pixel and go to the next. The process proceeds until the sequence is found. Obviously, unlike the use of a fixed step, this method can disperse the altered pixels randomly. Under the assumption of uniform distribution, the RPS generates almost the same number of 1s and 0s. This implies that, before reaching the sequence, the RPS selection method probably passes through two times as many pixels as the fixed-interval selection method using a step of one. Thus, the RPS selection method can be regarded as the method using a random step. The level of the key bit plane is chosen as the seed of the RPS in this paper.

Although the RPS selection method is better than the fixed-interval one, there is still room for improvement. We note that, when using the largest step, the fixed-interval method can maximally spread data hiding error across the image. Hence, we propose a hybrid selection method that possesses the good properties of the above two methods. The hybrid selection method initially chooses pixels at a fixed interval, but the RPS determines whether the current pixel is used for building the sequence. Obviously, if the step is one, the hybrid selection method reduces to the RPS selection method. When using the largest possible step, the hybrid selection method can diffuse the errors randomly and as sparsely as possible. Note that, as long as the output of the RPS follows a uniform distribution, it is impossible for the hybrid selection method to use a step as large as the largest step given by (2). However, we search for the largest possible step starting at that value. The process can be described as follows. If the current step is too large, the hybrid selection method is unable to find enough pixels for building the sequence. Then, we decrease the current step by 1 and repeat the search process. The process proceeds until the hybrid selection method just finds enough pixels to build the sequence. Then, the current step is the largest possible step. In the worst case, the hybrid selection method fails under any step, even a step of one. In that case, we use the fixed-interval method instead. Our error diffusion method can be added into the previous step 4).

2) Evaluating the Effect of Error Diffusion: We define a ratio to evaluate the effect of error diffusion, namely the ratio of the error-spreading image area to the whole image area. It measures the degree of spreading of the noise raised by data hiding. Its numerator is the number of pixels which the selection method passes through before reaching the sequence, and its denominator is the total number of pixels on the key bit plane. Apparently, the larger the ratio is, the better the effect of error diffusion. For the sake of comparison, we also define a second ratio, that of the image area for consecutively selected pixels to the whole image area. It measures the effect of data hiding error when the error diffusion method is not used. Usually, the first ratio is several times the second. It means that our error diffusion method spreads the noise into a far wider area. Note that error diffusion improves subjective but not objective image quality.

3) Character-Based Model for Arithmetic Coding: Now we discuss how to decrease the computational load. So far, our discussion on constituting the sequence has been based on bit-wise operations. Each pixel represents one bit on the bit plane. This constitution is not a good choice for the employed arithmetic coder. According to [16], the CACM implementation adopts the model-based paradigm for coding, where, from an input string (stream) of symbols and a model for representing the input data, an encoded string is produced that is (usually) a compressed version of the input. The improved version of the CACM implementation [15] has three models: word-based, character-based, and bit-based models. The word-based model obtains good compression at reasonable speed; the simple character-based model obtains moderate compression; and the bit-based model is slow and obtains compression midway between the word-based and the character-based models. In this study, we only discuss the last two models in terms of their effect on computational load and data hiding error.

In bit-wise operations, each one-bit pixel is regarded as an input symbol. The input stream is a binary sequence. Therefore, the data hiding error consists of bit-wise artifacts on the key bit plane. However, when using the character-based model, a character (byte) composed of 8 bits is the input symbol and the input stream is a character sequence. In this case, the data hiding error consists of 8-bit-wise artifacts, and this noise seems more annoying. However, the length of the input stream in the former case is eight times that in the latter case. Thus, the computational load is much heavier using the bit-based model. Since our error diffusion method can effectively disperse data hiding error, we choose the zero-order character-based model for arithmetic coding.

We assemble every eight one-bit pixels on the bit plane to form a one-byte-long character. There are two schemes to form a byte: a one-row scheme, in which we choose eight pixels consecutively in a row, and a two-row scheme, denoted "::::", in which we choose four consecutive pixels in one row and another four pixels in the row immediately below. Clearly, the second scheme can take more advantage of the correlation among pixels in that it uses correlation information from both row and column. Since the correlation between characters affects the compression ratio, the use of the second scheme benefits compression efficiency. We give an example to show the difference. Suppose image Lena and R have the sizes described before. The length of the character-form stream in the nonwatermarked area is then 31 744 bytes. If we adopt the first scheme, the coder outputs a compressed sequence of 30 453 bytes on the fourth bit plane (from the LSB to the MSB planes). However, the same coder outputs a compressed sequence of 29 587 bytes using the second scheme. Apparently, the second scheme yields more efficient compression. Thus, we adopt it in this work. It is worth mentioning that the order of one-bit pixels in a byte has no effect on compression efficiency. The consideration for arithmetic compression using the character-based model can be added into the previous steps 3)–5).

C. Discussion on the Constraint for Capacity Control and Blind Data Extraction

In Section II-A2, we assume there is always a sequence that satisfies the constraint. As can be seen, this constraint is important for our data hiding method to adapt the capacity to the need of the message bits. Actually, the constraint was originally proposed by [2] for blind data extraction. However, it is theoretically impossible to prove that the arithmetic coder can produce a numerically continuous difference between the size of the sequence and that of its compressed version. From common experiments, the difference is often numerically continuous. But this conclusion is affected by both the coder and the test images. We note that [2] and even [9] did not mention this problem. In this study, however, we would rather take a stricter measure to ensure the equality.

We use an example to explain our solution. Suppose there is an occasion where the size of the data to be hidden is 200 and the difference skips this value as the sequence grows. Apparently, no sequence satisfies the equality. There are only cases in which the difference exceeds the size of the data when the sequence is long enough. The equality can be regained provided that a variable n records the number of skip points. Thus, we can solve the problem by introducing an extra variable n. Since there are many sequences which satisfy the relaxed constraint, we choose the first, i.e., the shortest sequence, and n is set accordingly. Obviously, the transmission of n needs overhead information. We store n in our specially designed user key (refer to the next subsection). Our consideration for the existence of the equation can be added into the previous step 5).
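The hybrid selection method and the step search of Section II-B1, together with the error-spreading ratio of Section II-B2, can be sketched as follows. The sketch makes simplifying assumptions: the number of pixels to select (`needed`) is taken as known in advance, whereas in the paper it follows from the compression test, and Python's `random.Random` stands in for the RPS. All function names are hypothetical.

```python
import random

def hybrid_positions(total, needed, step, seed):
    """Visit every `step`-th pixel in raster order; the RPS bit decides
    whether the visited pixel is selected. Returns the selected positions,
    or None if the scan ends before `needed` pixels are found."""
    rps = random.Random(seed)            # stand-in for the paper's RPS
    chosen = []
    for pos in range(0, total, step):
        if rps.getrandbits(1):
            chosen.append(pos)
            if len(chosen) == needed:
                return chosen
    return None

def select_pixels(total, needed, max_step, seed):
    """Search downward from `max_step` for the largest step that still
    yields enough pixels; fall back to fixed-interval selection if even a
    step of one fails."""
    if max_step < 1:
        raise ValueError("max_step must be at least 1")
    for step in range(max_step, 0, -1):
        chosen = hybrid_positions(total, needed, step, seed)
        if chosen is not None:
            return "hybrid", step, chosen
    return "fixed", max_step, list(range(0, total, max_step))[:needed]

def spread_ratio(chosen, total):
    """Share of the bit plane passed through before the last selected pixel."""
    return (chosen[-1] + 1) / total

mode, step, chosen = select_pixels(total=10000, needed=100, max_step=100, seed=3)
print(mode, step, spread_ratio(chosen, 10000), 100 / 10000)
```

The last two printed values compare the spreading with error diffusion to the share that 100 consecutively selected pixels would occupy, which is the comparison made by the two ratios in the text.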


D. User Key Construction and Visible Watermark Embedding

The visible watermark embedding procedure consists only of Stage 7. Before addressing that, we introduce our unique user key.

1) User Key Construction: The simple replacement of the bit plane of R with the watermark does not satisfy the requirement of security. According to the widely accepted viewpoint, watermark security should depend upon the user key rather than the secrecy of the embedding algorithm. Thus, we propose a user-key-controlled watermark embedding scheme. We deliberately construct the user key in step 7) and let it not only ensure watermark security but also record the parameters used in data hiding.

Step 7) User key (80 bits) = Watermark size (8 bits + 8 bits) + Origin position of the protected region (16 bits + 16 bits) + JBIG-compressed data size (16 bits) + Key bit plane level (3 bits) + Hybrid/fixed way of selection (1 bit) + Selection step (4 bits) + n (8 bits).

Here, Origin position of the protected region refers to the coordinates of the upper-left corner of R; JBIG-compressed data size refers to the size of the JBIG-compressed bit plane of R. In the user key, both Origin position of the protected region and Watermark size are determined by the image providers, as described in Section I, whereas the other five parameters are dependent upon the original image and the compression tools. In real-world applications, we may increase the user key size to enlarge the key space for security reasons. In order to counter new powerful key-breaking algorithms, we suggest that the user key be at least 128 bits long so as to satisfy new encryption standards such as the advanced encryption standard (AES). One way to meet the requirement is to mix the user key with additional information of the image owner (e.g., a 128-bit hash value). Further discussion on this topic is beyond the scope of this paper.

2) Visible Watermark Embedding: We propose to insert the watermark interleavingly into two bit planes of R: the selected bit plane and its lower neighbor. The watermark and the two bit planes are synchronously read in a raster scan order. The user-key-controlled embedding mechanism is built in the following way. We first introduce a pseudorandom number generator seeded with the user key, which we call the UKCE. The UKCE produces a sequence that follows a uniform distribution. Then, we decide how to embed each pixel of the watermark according to the value of the UKCE. The embedding process is described in step 8).

Step 8) Insert the watermark into one of the two bit planes under the control of the UKCE. If the UKCE generates 1, the current watermark pixel replaces the pixel in the corresponding position on one of the two bit planes; if the UKCE generates 0, the current pixel of one bit plane is first preserved in the corresponding position on the other bit plane, and the watermark pixel is then placed in the position it came from. After embedding, the watermarked region is created.

III. WATERMARK REMOVAL AND IMAGE RECONSTRUCTION

With the user key, an authorized user can completely remove the embedded watermark from the visibly watermarked image and losslessly recover the original image data. The process is performed by reversing the flow of operations shown in Fig. 1.
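The 80-bit user key layout of step 7) can be illustrated with a small packing routine. This is a sketch under assumptions: the field order follows the listing in step 7), but the field names, the order of the two values inside each pair, and the big-endian byte layout are hypothetical choices made for the example.

```python
from collections import namedtuple

# (field name, width in bits), in the order listed in step 7).
FIELDS = [
    ("wm_width", 8), ("wm_height", 8),
    ("origin_x", 16), ("origin_y", 16),
    ("jbig_size", 16),
    ("key_plane_level", 3),
    ("hybrid", 1),
    ("step", 4),
    ("n", 8),
]
UserKey = namedtuple("UserKey", [name for name, _ in FIELDS])

def pack_key(key):
    """Pack the fields into 10 bytes (80 bits), first field most significant."""
    value = 0
    for (name, width), field in zip(FIELDS, key):
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name}={field} does not fit in {width} bits")
        value = (value << width) | field
    return value.to_bytes(10, "big")

def unpack_key(raw):
    """Inverse of pack_key."""
    value = int.from_bytes(raw, "big")
    fields = []
    for _, width in reversed(FIELDS):
        fields.append(value & ((1 << width) - 1))
        value >>= width
    return UserKey(*reversed(fields))

key = UserKey(wm_width=64, wm_height=128, origin_x=100, origin_y=200,
              jbig_size=269, key_plane_level=3, hybrid=1, step=5, n=0)
raw = pack_key(key)
print(raw.hex(), unpack_key(raw) == key)
```

The field widths sum to 80 bits, which matches the key size stated in step 7); the receiver can recover every data hiding parameter from the key alone.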


Here, we only emphasize the process of retrieving the two parts of the payload and the reconstruction of the original image. Similar to the process of data hiding, we reconstitute the hidden sequence through iterative tests. Starting from the beginning of the raster scan, a test sequence is increased by one pixel each time. We perform arithmetic decoding on every test sequence. If the decompressed sequence satisfies the length constraint used in data hiding, we stop testing. This means that the current test sequence is the first part of the payload, i.e., the compressed sequence, and the decompressed data are the recovered original sequence. Then, the pixels following the current test sequence on the key bit plane belong to the second part of the payload, i.e., the JBIG-compressed data. The decompression of these pixels with the JBIG decoder yields the recovered original bit plane of R. After replacing the first pixels of the marked key bit plane with the recovered bits of the sequence, we regain the nonwatermarked area. Then, using the recovered bit plane to reconstruct the altered bit plane of R, we restore the original R. We combine R and the nonwatermarked area to regain the original image.

IV. EXPERIMENT AND DISCUSSION

We have tested the proposed algorithm on a number of images. The test images range from fairly smooth images like Girl to highly textured images like Baboon. All of them are of 512 × 512 × 8 bits. The visible watermarks are binary logos with sizes of 64 × 128, 128 × 128, and 128 × 256, respectively.

Fig. 2 gives the visibly watermarked images using the MSB plane and the second MSB plane of R, respectively, as the bit plane to be altered. The apparent difference in these images is watermark transparency. The watermark is heavy in the first case but light in the second case. Another difference is the payload size. Table I shows that the data to be hidden are much larger in the second case than in the first case. However, in the second case, the image quality only changes slightly and is still good. This observation implies that the proposed data hiding technique has good resilience to different payload sizes.

In practice, the image quality gradually degrades with increasing watermark size. Table I shows that, when the watermark size changes from 64 × 128 to 128 × 128, and to 128 × 256, the data to be hidden increase greatly, but the PSNR value only decreases by several decibels. In essence, the change of either watermark transparency or watermark size affects the payload size. The above experiments demonstrate that our data hiding technique has both large capacity and good payload resilience. These good properties enable our method to satisfy different applications.

Table I also demonstrates that our error diffusion method is very effective. In most cases, the hybrid selection method is chosen. It can be seen that the error-spreading ratio is several times the ratio obtained with consecutive selection. Fig. 2 verifies that subjective image quality is good. Generally, our algorithm can produce good watermarked images, no matter whether the original image is smooth or textured. The PSNR value is not very high in textured images (e.g., Baboon), but the strong masking effect still yields acceptable visual quality.

We have to point out that our data hiding technique does not have the same goal as common data hiding techniques in the literature. Common data hiding techniques purely serve secret data transmission and pursue image quality as high as possible, whereas our technique is proposed for copyright protection. Small artifacts from data hiding do not affect the performance of our watermarking system. From the content protection point of view, we even benefit from small impermanent artifacts because they can prevent the watermarked images from being commercially used.

Fig. 2. Visibly watermarked images with the MSB plane of R as the altered bit plane (upper row) and the second MSB plane of R as the altered bit plane (lower row), respectively. The watermarks have the sizes of 64 × 128 and 128 × 128, respectively.

TABLE I. PERFORMANCE EVALUATION. The sizes of D (two columns), Cap, and the size of S are in bytes. The two N values denote the bit plane level of the altered bit plane of R and of the key bit plane, respectively. The bit plane is numbered from the LSB plane (0) to the MSB plane (7). H/F = 1/0 represents the hybrid/fixed-interval selection method. The PSNR (dB) is calculated without R.
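Since the PSNR in Table I is calculated without R, the measurement can be sketched as below. This is a minimal illustration using the standard PSNR definition for 8-bit images; the function name, the flat raster-order pixel lists, and the `(x0, y0, w, h)` region convention are assumptions made for the example.

```python
import math

def psnr_outside_region(original, marked, width, region):
    """PSNR in dB over all pixels outside `region` = (x0, y0, w, h).

    `original` and `marked` are flat lists of 8-bit pixels in raster order.
    """
    x0, y0, w, h = region
    sq_err, count = 0, 0
    for idx, (a, b) in enumerate(zip(original, marked)):
        x, y = idx % width, idx // width
        if x0 <= x < x0 + w and y0 <= y < y0 + h:
            continue                     # skip the watermark-covered region
        sq_err += (a - b) ** 2
        count += 1
    if sq_err == 0:
        return math.inf                  # identical outside the region
    return 10 * math.log10(255 ** 2 * count / sq_err)

# Toy 4 x 4 image: the region is heavily altered, one outside pixel changes by 1.
original = [100] * 16
marked = list(original)
marked[0] = marked[1] = marked[4] = marked[5] = 0   # inside region (0, 0, 2, 2)
marked[15] = 101                                    # outside the region
print(psnr_outside_region(original, marked, 4, (0, 0, 2, 2)))
```

Excluding R keeps the visible watermark itself from dominating the figure, so the value reflects only the distortion caused by data hiding.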

Since the watermark embedding process is completely controlled by the UKCE, the binary watermark is securely embedded into the image. It is difficult to remove the watermark without the correct user key.

V. CONCLUSION

To the best of the authors' knowledge, the proposed algorithm is the first work that completely implements a reversible visible watermarking system. It includes two procedures: data hiding and visible watermark embedding. To achieve large and resilient hiding capacity, our data hiding technique lays its foundation on data compression. We also introduce error diffusion, the character-based model for arithmetic coding, and a user-key-controlled embedding mechanism to further improve the performance of the algorithm. The specially designed user key guarantees the security of our algorithm.

This work is also applicable to other application scenarios. Although we use the standard JBIG compression and the improved version of the CACM implementation, one can replace them with other compression tools to serve specific purposes.

REFERENCES

[1] C. W. Honsinger, P. Jones, M. Rabbani, and J. C. Stoffel, "Lossless Recovery of an Original Image Containing Embedded Data," U.S. Patent 6 278 791 B1, Aug. 21, 2001.
[2] J. Fridrich, M. Goljan, and R. Du, "Invertible authentication," in Proc. SPIE Security and Watermarking of Multimedia Contents III, P. W. Wong and E. J. Delp, Eds., vol. 4314, pp. 197–208.
[3] ——, "Distortion-free data embedding for images," in Proc. 4th Inf. Hiding Workshop, Pittsburgh, PA, Apr. 25–27, 2001, pp. 27–41.
[4] J. Tian, "Wavelet-based reversible watermarking for authentication," in Proc. SPIE Security and Watermarking of Multimedia Contents III, P. W. Wong and E. J. Delp, Eds., vol. 4675, pp. 679–690.
[5] ——, "Reversible data embedding using a difference expansion," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 8, pp. 890–896, Aug. 2003.
[6] C. De Vleeschouwer, J.-F. Delaigle, and B. Macq, "Circular interpretation of bijective transformations in lossless watermarking for media asset management," IEEE Trans. Multimedia, vol. 5, no. 1, pp. 97–105, Mar. 2003.
[7] M. Awrangjeb and M. S. Kankanhalli, "Lossless watermarking considering the human visual system," Lecture Notes in Computer Science, vol. 2939, pp. 581–592, 2004.
[8] M. U. Celik, G. Sharma, A. M. Tekalp, and E. Saber, "Lossless generalized-LSB data embedding," IEEE Trans. Image Process., vol. 12, no. 2, pp. 157–160, Feb. 2005.
[9] L. Kamstra and H. J. A. M. Heijmans, "Reversible data embedding into images using wavelet techniques and sorting," IEEE Trans. Image Process., vol. 14, no. 12, pp. 2082–2090, Dec. 2005.
[10] Z. Ni, Y. Q. Shi, N. Ansari, and W. Su, "Reversible data hiding," IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 3, pp. 354–362, Mar. 2006.
[11] M. U. Celik, G. Sharma, and A. M. Tekalp, "Lossless watermarking for image authentication: A new framework and an implementation," IEEE Trans. Image Process., vol. 15, no. 4, pp. 1042–1049, Apr. 2006.
[12] C.-H. Huang and J.-L. Wu, "Attacking visible watermarking schemes," IEEE Trans. Multimedia, vol. 6, no. 1, pp. 16–30, Feb. 2004.
[13] Y. J. Hu, S. Kwong, and J. Huang, "An algorithm for removable visible watermarking," IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 1, pp. 129–133, Jan. 2006.
[14] [Online]. Available: http://www.cl.cam.ac.uk/~mgk25/jbigkit/
[15] A. Moffat, R. M. Neal, and I. H. Witten, "Arithmetic coding revisited," ACM Trans. Inf. Syst., vol. 16, pp. 256–294, Jul. 1998.
[16] I. H. Witten, R. M. Neal, and J. G. Cleary, "Arithmetic coding for data compression," Commun. ACM, vol. 30, pp. 520–540, Jun. 1987.