Signal Processing 81 (2001) 663–671
Short communication
Digital watermarking based on neural networks for color images

Pao-Ta Yu*, Hung-Hsu Tsai, Jyh-Shyan Lin

Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi 62107, Taiwan, ROC

Received 6 December 1999; received in revised form 15 June 2000; accepted 8 October 2000
Abstract

In this paper, we propose a novel digital watermarking technique based on neural networks for color images. Our technique hides an invisible watermark in a color image and then employs neural networks to learn the characteristics of the embedded watermark in relation to the watermarked image. Because neural networks possess learning and adaptive capabilities, the trained networks can almost exactly recover the watermark from the watermarked image even after image-processing attacks. Extensive experimental results illustrate that our technique is significantly robust against such attacks. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Digital watermarking; Data hiding; Neural networks; Color images
1. Introduction

Because illegal duplicates of multimedia products can readily proliferate over the Internet, copyright protection has emerged as a crucial issue in the development of multimedia systems [2,5]. Recently, digital watermarking, a very important and popular technique, has been effectively applied
This work was partially supported by the National Science Council, ROC, under Grant NSC89-2213-E-194-019.
* Corresponding author. Tel.: +886-5-272-0411 ext. 6015; fax: +886-5-272-0859.
E-mail addresses: [email protected] (P.-T. Yu), [email protected] (H.-H. Tsai), [email protected] (J.-S. Lin).
Presently with the Department of Information Management, National Huwei Institute of Technology, Taiwan, ROC.
Presently with the Department of Management Information Systems, Ging-Chung Business College, Hualau, Taiwan, ROC.
to protect the copyrights of multimedia content, with prominent results. A significant merit of digital watermarking over traditional protection methods (cryptography) is that it provides a seamless interface: by embedding an invisible digital signature (watermark) into multimedia data (audio, images, video), users can still utilize the protected multimedia transparently [2,5]. In this paper, we develop watermarking techniques, integrating color image processing and cryptography, to achieve content protection and authentication for color images [6]. Our techniques are based on neural networks and further improve the performance of Kutter's technique for color images [4]. Because neural networks can learn from given training patterns, our method can memorize the relations between a watermark and the corresponding watermarked image. Note that our approach can pave the way for developing
the watermarking techniques for multimedia data, since color images are ubiquitous in contemporary multimedia systems and are also the primary components of MPEG video [2,5]. In Section 2, basic concepts and notation for color images are introduced, and Kutter's watermarking for color images is briefly reviewed. In Section 3, the embedding and recovery algorithms of our method are described. In Section 4, experimental results are exhibited. Finally, a conclusion is given in Section 5.
2. A review of Kutter's algorithm

An RGB color image can be defined by O = [o_{ij}] with size L × K, where o_{ij} = (R_{ij}, G_{ij}, B_{ij}) and R_{ij}, G_{ij}, B_{ij} ∈ {0, 1, 2, ..., 255}; that is, o_{ij} is a 3 × 1 column vector, called the vector-valued pixel at position (i, j) of O, with i ∈ {0, 1, ..., L−1} and j ∈ {0, 1, ..., K−1}. Further, let a watermark W be represented by a bit sequence

W = H ⊕ S = h_0 h_1 s_0 s_1 ... s_{m−1} = w_0 w_1 w_2 ... w_{m+1},   (1)

where H = h_0 h_1 and S = s_0 s_1 ... s_{m−1} are a 2-bit and an m-bit binary sequence, respectively; ⊕ denotes the concatenation operation and s_0, ..., s_{m−1} ∈ {0, 1}. Note that H is extra information, known at the beginning of watermark recovery, that enhances the correctness of extracting S, and S is a digital signature of the owner's image. In addition, the assumption h_0 = 0 and h_1 = 1 is mandatory in Kutter's algorithm, and h_0 = w_0, h_1 = w_1, s_0 = w_2, ..., s_{m−1} = w_{m+1}. Kutter's embedding algorithm is summarized as follows:

Step 1: Generate a pseudo-random position p_t = (i_t, j_t) over O for each w_t, according to the secret key k provided by the owner of the original image O (the details of the pseudo-random positions are given in Section 3.1). Let o_t be the corresponding vector-valued pixel at position p_t.

Step 2: Compute the luminance L_t of o_t by L_t = 0.299 R_t + 0.587 G_t + 0.114 B_t.

Step 3: Embed w_t into o_t by modifying the blue component B_t of o_t, i.e.,

B_t ← B_t + (2 w_t − 1) L_t α,   (2)

where α is a positive constant that determines the watermark strength. A larger α offers better robustness but degrades the visual quality of the watermarked image.

Step 4: Repeat Steps 1–3 until all bits in W are embedded (i.e., the algorithm quits when t > m+1).

After the embedding procedure, we obtain the corresponding watermarked image, denoted Ō = [ō_{ij}] with size L × K, where ō_{ij} = (R̄_{ij}, Ḡ_{ij}, B̄_{ij}) and R̄_{ij}, Ḡ_{ij}, B̄_{ij} ∈ {0, 1, 2, ..., 255}; that is, ō_{ij} is a 3 × 1 column vector. In Kutter's extraction algorithm, a watermark W̄ is extracted from Ō and referred to as the recovered watermark. Similar to (1), W̄ is represented as W̄ = h̄_0 h̄_1 s̄_0 ... s̄_{m−1} = w̄_0 w̄_1 ... w̄_{m+1}, where h̄_0 = 0, h̄_1 = 1, s̄_0, ..., s̄_{m−1} ∈ {0, 1}, h̄_0 = w̄_0, h̄_1 = w̄_1, s̄_0 = w̄_2, ..., s̄_{m−1} = w̄_{m+1}. Note that the values of h̄_0 and h̄_1 have to be known while extracting the watermark. The extraction algorithm is summarized as follows:

Step 1: Compute p_t and ō_t, t = 0, 1, ..., m+1, according to the same secret key k used in the embedding algorithm.

Step 2: While the center of a sliding window with symmetric cross shape, shown in Fig. 1, covers each ō_t in turn, an estimate B̂_{i_t j_t} of the blue component is calculated by

B̂_{i_t j_t} = (1 / 4c) ( Σ_{r=−c}^{c} B̄_{i_t+r, j_t} + Σ_{r=−c}^{c} B̄_{i_t, j_t+r} − 2 B̄_{i_t j_t} ),   (3)

where t = 0, 1, ..., m+1 and c is the number of pixels the window extends along the vertical or the horizontal direction.

Step 3: Compute δ_t = B̄_{i_t j_t} − B̂_{i_t j_t}, t = 0, 1, ..., m+1, and determine the adaptive threshold τ = (δ_0 + δ_1)/2.

Step 4: Each bit w̄_t, t = 2, ..., m+1, is determined by

w̄_t = 1 if δ_t > τ, and w̄_t = 0 otherwise.
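The embedding and extraction steps above can be sketched as follows. This is a minimal NumPy sketch: the function names and the fixed test positions are our own, and the pseudo-random position generation of Step 1 (driven by the secret key) is passed in explicitly rather than derived from a key.

```python
import numpy as np

def embed_kutter(img, bits, positions, alpha=0.1):
    """Embed each bit by modulating the blue channel at a secret position.

    img: H x W x 3 array in RGB order; bits: sequence of 0/1;
    positions: list of (i, j) coordinates (normally pseudo-random,
    derived from the owner's secret key).
    """
    out = img.astype(float).copy()
    for w, (i, j) in zip(bits, positions):
        r, g, b = out[i, j]
        lum = 0.299 * r + 0.587 * g + 0.114 * b       # Step 2: luminance
        out[i, j, 2] = b + (2 * w - 1) * lum * alpha  # Step 3: +/- alpha * L
    return np.clip(out, 0, 255)

def extract_kutter(img, positions, c=2):
    """Recover bits via the cross-shaped predictor and adaptive threshold."""
    b = img[:, :, 2].astype(float)
    deltas = []
    for i, j in positions:
        # Eq. (3): average of the 4c cross neighbours, centre excluded
        pred = (b[i - c:i + c + 1, j].sum() + b[i, j - c:j + c + 1].sum()
                - 2 * b[i, j]) / (4 * c)
        deltas.append(b[i, j] - pred)
    tau = (deltas[0] + deltas[1]) / 2   # h_0 = 0 and h_1 = 1 are known
    return [1 if d > tau else 0 for d in deltas[2:]]
```

On a flat test image the recovered payload matches the embedded signature bits exactly, since the cross-shaped predictor reproduces the unmodified blue value.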
Fig. 2. The structure of our watermark-embedding algorithm.
Fig. 1. The symmetric cross-shaped window for c = 2.
Consequently, the algorithm outputs W̄ and quits when t > m+1. A fatal weakness of Kutter's algorithm lies in the assumption that the first two bits are known (i.e., w_0 = 0 and w_1 = 1) while extracting S. The main purpose of this assumption is to determine an adaptive threshold (a constant decision function) such that the remaining bits of the watermark can be specified exactly according to that threshold. Our experimental results show, however, that the constant decision function does not work accurately when the watermarked images are attacked by geometrical transformations or image processing. This motivates us to propose a more adaptive watermarking technique that reduces false recovery for color images under common attacks. That is, the goal of this paper is to develop a more adaptive threshold, realized by neural networks, such that false recovery can be greatly reduced.
3. Our watermarking technique

3.1. Watermark embedding

In this subsection, we first specify the watermark structure utilized in this paper. We then describe a scheme that randomly selects a subset of pixels from the original image and embeds a watermark into that subset. Finally, we introduce the training process by which a neural network memorizes the characteristics of the relations between W and Ō.
The watermark used in our technique is denoted as

W = H_{pq} ⊕ S = σ_{11} σ_{12} ... σ_{pq} s_0 s_1 ... s_{m−1}   (4)
  = w_0 w_1 ... w_{pq+m−1},   (5)

where H_{pq} = σ_{11} σ_{12} ... σ_{pq} is a (p × q)-bit binary sequence and σ_{ij} ∈ {0, 1}, 1 ≤ i ≤ p, 1 ≤ j ≤ q. Note that H_{pq} resembles H in (1), and S is defined as in (1). Here H_{pq} can be of arbitrary length, allowing a more accurate signature in contrast to the preceding 2-bit H. For convenience, W can be written as in (5), where w_0 = σ_{11}, ..., w_{pq+m−1} = s_{m−1}. The main purpose of creating H_{pq} is to construct the training patterns with which a neural network effectively memorizes the characteristics of the relation between W and Ō. In this paper, the trained neural network is employed to extract the watermark W̄ from Ō during the recovery process. Recall that, like H, H_{pq} is also given at the start of watermark recovery; therefore, the training process for the neural network precedes the process of extracting S.

The structure of our watermark-embedding algorithm is shown in Fig. 2. Many pseudo-random number generators (PRNGs) have been proposed and applied in cryptography. Here we adopt the Blum–Blum–Shub generator because of its simple implementation [1]. Because a location in an image has an x-coordinate and a y-coordinate, two distinct sequences of random numbers are required to select a subset of pixels. Two different keys, k_1 and k_2, are given by the owner of the digital intellectual property before the embedding procedure. Our embedding algorithm can be briefly described as follows. First, by applying k_1 and k_2 to the PRNG, a sequence of random positions {p_t} over O is obtained. Then the blue-channel intensity of each o_t is modified by (2), i.e., w_t is embedded into o_t.
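The position-selection step can be illustrated with a toy Blum–Blum–Shub stream. The tiny primes, the demo keys, and the 16-bit coordinate assembly below are illustrative choices of ours, not parameters from the paper; a real system uses large secret primes.

```python
def bbs_stream(seed, n=11 * 23):
    """Blum-Blum-Shub PRNG: x_{t+1} = x_t^2 mod n; yields the low bit of
    each state. n must be a product of two primes congruent to 3 mod 4
    (tiny demo primes here), and the seed must be coprime to n."""
    x = seed % n
    while True:
        x = (x * x) % n
        yield x & 1

def random_positions(key_x, key_y, count, height, width):
    """Draw `count` (i, j) positions from two independent BBS streams,
    one per coordinate, as in the embedding algorithm."""
    gx, gy = bbs_stream(key_x), bbs_stream(key_y)
    bits_needed = 16  # enough bits per coordinate for demo image sizes
    positions = []
    for _ in range(count):
        i = sum(next(gx) << b for b in range(bits_needed)) % height
        j = sum(next(gy) << b for b in range(bits_needed)) % width
        positions.append((i, j))
    return positions
```

Because the stream is fully determined by the keys, the extractor regenerates exactly the same positions during recovery.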
3.2. Watermark recovery

The watermark-recovery process is diagrammed in Fig. 3. First, a sequence of random positions {p_t} is generated by the same PRNG as in the embedding algorithm; each vector-valued pixel ō_t thus hides one message bit of the watermark. Next, we use the extra information H_{pq} to train a neural network so that the trained network memorizes the characteristics of the relations between W and Ō. Finally, the signature S̄ is retrieved by using the adaptive capability of the trained neural network.

3.2.1. Neural network training

For the sliding window with c = 2, the relation between W and Ō can be represented by a set of patterns P, defined by

P = {(δ_{i_t−2, j_t}, δ_{i_t−1, j_t}, δ_{i_t, j_t}, δ_{i_t+1, j_t}, δ_{i_t+2, j_t}, δ_{i_t, j_t−2}, δ_{i_t, j_t−1}, δ_{i_t, j_t+1}, δ_{i_t, j_t+2}, d_t)}_{t=0}^{pq+m−1},   (6)

where δ_{uv} stands for the difference between the blue-component intensity of the central vector-valued pixel ō_{uv} and its estimate from the other pixels within the window, and is defined by

δ_{uv} = B̄_{uv} − B̂_{uv} = B̄_{uv} − (1 / 4c) ( Σ_{r=−c}^{c} B̄_{u+r, v} + Σ_{r=−c}^{c} B̄_{u, v+r} − 2 B̄_{uv} ),   (7)
and d_t is called the desired output of the neural network for the t-th pattern in P, defined by

d_t = 1 if w_t = 1, and d_t = −1 if w_t = 0.   (8)
A subset of P is selected as the set of training patterns, T ⊂ P. In this paper, we simply select the first (p × q) patterns of P as the training patterns:

T = {(δ_{i_t−2, j_t}, δ_{i_t−1, j_t}, δ_{i_t, j_t}, δ_{i_t+1, j_t}, δ_{i_t+2, j_t}, δ_{i_t, j_t−2}, δ_{i_t, j_t−1}, δ_{i_t, j_t+1}, δ_{i_t, j_t+2}, d_t)}_{t=0}^{pq−1}.   (9)

No pattern in P − T should be selected as a training pattern, because each pattern in P − T includes a component derived from the digital signature (i.e., d_t) that we are attempting to recover. In other words, we rely on the adaptive capability of the neural network to extract digital-signature bits that were never trained in advance. For convenience, (9) can be rewritten as

T = {(x_{t,1}, x_{t,2}, ..., x_{t,9}, d_t)}_{t=0}^{pq−1},   (10)

where x_{t,1} = δ_{i_t−2, j_t}, x_{t,2} = δ_{i_t−1, j_t}, ..., x_{t,9} = δ_{i_t, j_t+2}. The structure of the neural network is depicted in Fig. 4; it is a 9-5-1 multilayer perceptron (MLP) with an input layer of 9 nodes, a hidden layer of 5 nodes, and an output layer with a single node. z^l_{ij} represents the synaptic weight connecting the j-th node at layer (l−1) to the i-th node at layer l; therefore, a set of synaptic weights Z = {z^l_{ij}} characterizes the behavior of the neural network. In this paper, the back-propagation algorithm is adopted to obtain a nearly optimal Z for T [3]. In addition, d̂_t ∈ [−1, 1] is the estimated output of the neural network.

3.2.2. Signature extraction based on the trained neural network

The trained neural network realizes a highly adaptive nonlinear decision function. Based on its output d̂_t, the estimated signature S̄ = s̄_0 s̄_1 ... s̄_{m−1} is obtained by
s̄_t = w̄_{pq+t} = 1 if d̂_t ≥ 0, and s̄_t = w̄_{pq+t} = 0 otherwise,   (11)

where t = 0, 1, ..., m−1.

Fig. 3. The diagram illustrates our watermark-recovery algorithm.

Once the watermark-recovery process terminates, S̄ is output and then utilized to identify the copyright of the owner's intellectual property by comparing S with S̄.
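A 9-5-1 MLP trained by plain back-propagation, as described above, can be sketched as follows. The tanh activations, the weight initialization, and the toy training data are our assumptions; the paper trains on the δ-patterns of Eq. (10) with the ±1 targets of Eq. (8).

```python
import numpy as np

rng = np.random.default_rng(0)

class MLP951:
    """9-5-1 multilayer perceptron with tanh units; the output lies in
    [-1, 1] and serves as the estimated d_t."""
    def __init__(self):
        self.W1 = rng.normal(0, 0.5, (5, 9))
        self.b1 = np.zeros(5)
        self.W2 = rng.normal(0, 0.5, (1, 5))
        self.b2 = np.zeros(1)

    def forward(self, x):
        self.h = np.tanh(self.W1 @ x + self.b1)
        self.y = np.tanh(self.W2 @ self.h + self.b2)
        return self.y[0]

    def train(self, patterns, targets, eta=0.1, epochs=300):
        """Stochastic back-propagation on squared error."""
        for _ in range(epochs):
            for x, d in zip(patterns, targets):
                y = self.forward(x)
                gy = (y - d) * (1 - y * y)              # output-layer delta
                gh = (self.W2[0] * gy) * (1 - self.h * self.h)  # hidden deltas
                self.W2 -= eta * gy * self.h[None, :]
                self.b2 -= eta * gy
                self.W1 -= eta * np.outer(gh, x)
                self.b1 -= eta * gh

def extract_bit(net, x):
    """Eq. (11): the sign of the network output decides the bit."""
    return 1 if net.forward(x) >= 0 else 0
```

On a toy separable set of δ-like vectors, the trained network recovers the bit sign for inputs it never saw during training, which is exactly the adaptive capability the recovery process relies on.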
Fig. 4. The structure of the neural network used in our watermarking techniques.

3.3. Multiple embedding

To further improve the performance of our watermarking algorithm against attacks caused by common signal processing or other types of distortion, the algorithm is enhanced by a multiple-embedding scheme [4]. More specifically, each bit w_t is embedded at several different positions within O. This scheme offers two advantages: the adaptive decision function performed by the neural network becomes more reliable, and the method becomes more robust against different types of attacks. That is, the performance of the neural network can be further improved by increasing the size of the extra information, i.e., the number of training patterns. Moreover, the probability of incorrect recovery is reduced because each bit of the signature is randomly spread over O. Assume that each w_t is repeated and hidden at λ distinct positions over O. The embedded watermark can then be rewritten as

W = w_{0,1} ... w_{0,λ} ... w_{u,v} ... w_{pq+m−1,1} ... w_{pq+m−1,λ},   (12)

where w_{u,v} stands for the v-th repetition of w_u, and w_{u,1}, ..., w_{u,v}, ..., w_{u,λ} are identical for each u. Accordingly, the signature S in W is denoted by

S = w_{pq,1} ... w_{pq,λ} ... w_{pq+m−1,1} ... w_{pq+m−1,λ}.   (13)

The corresponding output of the neural network for w_{pq+r,v} is denoted d̂_{pq+r,v} for each r and v. Hence, the estimated signature S̄ is obtained by

s̄_r = w̄_{pq+r,1} = ... = w̄_{pq+r,λ} = 1 if Σ_{v=1}^{λ} d̂_{pq+r,v} ≥ 0, and 0 otherwise,   (14)

for r = 0, 1, ..., m−1.
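The repetition decision amounts to summing the λ network outputs for each signature bit and thresholding at zero. A sketch, assuming `d_hat` is ordered so that the λ copies of each bit are adjacent:

```python
def decide_signature(d_hat, m, lam):
    """Sum-and-threshold decision for repeated bits: each signature bit is
    1 iff the summed network outputs over its lam embeddings are
    non-negative (d_hat holds lam * m outputs, copies adjacent)."""
    return [1 if sum(d_hat[r * lam:(r + 1) * lam]) >= 0 else 0
            for r in range(m)]
```

Summing before thresholding lets a strong correct output outvote a weak erroneous one, which is why repetition reduces the probability of incorrect recovery.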
4. Experimental results

In our experiments, we take the trademark shown in Fig. 5(a) as the signature S of the watermark W; the trademark is a binary image of size 32 × 32. The watermark W is formed as the bit sequence H_{pq} ⊕ S, where S is transformed into a bit sequence in row-major fashion and concatenated to the end of H_{pq}. Note that H_{pq} plays the same role as H, and the number of bits of W is 1184. To evaluate the performance of watermarking techniques, the mean square error (MSE) and the peak signal-to-noise ratio (PSNR) are two common quantitative indices for watermarked color images, and the mean absolute error (MAE) is a quantitative index for signatures. These three indices are, respectively, defined by

MSE = (1 / (3 K L)) Σ_i Σ_j [ (R_{ij} − R̄_{ij})² + (G_{ij} − Ḡ_{ij})² + (B_{ij} − B̄_{ij})² ],   (15)

PSNR = 10 log₁₀ (255² / MSE),   (16)

and

MAE = (1/m) Σ_{t=0}^{m−1} | s_t − s̄_t |,   (17)

where K × L and m denote the size of a color image and the length of the signature, respectively. Note that the quantitative index MAE is exploited to
Fig. 5. (a) The trademark taken as the signature, depicted as a binary image of size 32 × 32 (enlarged); (b) the original Lenna image, size 480 × 512; (c) the original Baboon image, size 480 × 512; (d) and (e) two recovered signatures extracted from the watermarked Lenna by our method (50 epochs) and Kutter's method, respectively; (f) and (g) two recovered signatures extracted from the watermarked Baboon by our method (50 epochs) and Kutter's method, respectively.
measure the similarity between two binary images, i.e., the original signature and the extracted signature. The parameters used in our watermarking algorithm are watermark strength α = 0.1, cross-shaped-window length c = 2, repetition count λ = 2, and learning rate η = 0.1 for the neural network (a 9-5-1 MLP).

Attack-free: Figs. 5(b) and (c) show the original Lenna and Baboon images (RGB color images), respectively. Table 1 compares our method and Kutter's method in terms of the three quantitative indices. The MSE and PSNR for the watermarked version of Fig. 5(b) are 1.597 and 46.097 dB, respectively. The two signatures recovered from the watermarked version of Fig. 5(b) by our method and Kutter's method are exhibited in Figs. 5(d) and (e); their MAEs are 0.00195 and 0.00391, respectively. Moreover, Figs. 5(f) and (g) show the two signatures recovered from the watermarked version of Fig. 5(c) by our method and Kutter's method, respectively. Therefore, in the attack-free case, the performance of our scheme is superior to that of Kutter's method both perceptually and quantitatively. In the following subsections, our method is further compared with Kutter's method for robustness against several different types of attacks.

A single attack: Several types of single attack are simulated to evaluate robustness: blurring, filtering, sharpening, JPEG compression, scaling, and rotation. Table 1 shows the quantitative comparisons between our method and Kutter's method under these attacks. Because of space limits, we exhibit only the blurring and rotation attacks for visual comparison. For the blurring attack, Fig. 6(a) shows a slightly blurred version of the watermarked Lenna image. The two signatures recovered by our method and Kutter's method are exhibited in Figs. 6(b) and (c), respectively; Fig. 6(b) is conspicuously more similar to Fig. 5(a) (the original signature) than Fig. 6(c). As a further illustration, we plot the decision values of 32 bits randomly selected from S̄. According to the threshold function and the decision value of each bit, the bit is determined to be either 1 or 0. Figs. 7(a) and (b) show the two discriminations for Figs. 6(b) and (c), respectively, in which a dot stands for a bit whose value should be 1 (above the dashed line) and a cross stands for a bit whose value should be 0 (below the dashed line).
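The three indices defined by Eqs. (15)–(17) can be sketched directly in NumPy (the function names are ours):

```python
import numpy as np

def mse_color(orig, marked):
    """Eq. (15): squared error averaged over all 3 * K * L channel values."""
    o, w = orig.astype(float), marked.astype(float)
    return ((o - w) ** 2).mean()

def psnr(orig, marked):
    """Eq. (16): 10 * log10(255^2 / MSE), in dB."""
    return 10 * np.log10(255.0 ** 2 / mse_color(orig, marked))

def mae_bits(s, s_bar):
    """Eq. (17): fraction of signature bits recovered incorrectly."""
    return np.abs(np.asarray(s) - np.asarray(s_bar)).mean()
```

MAE is simply the bit-error rate of the recovered signature, so the values reported in Table 1 can be read as error fractions.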
Table 1
Comparisons in terms of MSE, PSNR, and MAE

                                                       MAE for recovered signatures
Attacks              Watermarked    MSE      PSNR (dB)   Our method   Kutter's method
                     image
Attack-free          Lenna           1.597    46.097     0.00195      0.00391
                     Baboon          1.667    45.924     0.02344      0.07617
Blurring             Lenna          28.249    33.621     0.00293      0.08691
                     Baboon         84.845    28.845     0.05469      0.13965
Filtering            Lenna          38.714    32.252     0.0205       0.29395
                     Baboon        345.778    22.743     0.16211      0.38477
Sharpening           Lenna          38.193    32.311     0.00098      0.00293
                     Baboon        327.954    22.973     0.03418      0.05957
JPEG                 Lenna          21.103    34.887     0.08887      0.15918
                     Baboon         62.631    30.163     0.23535      0.27637
Scaling (50%)        Lenna          32.8      32.972     0.06934      0.20214
                     Baboon        314.812    23.15      0.25977      0.33887
Rotate 30°           Lenna          22.269    34.654     0.00684      0.04199
                     Baboon        161.71     26.043     0.1123       0.20898
Rotate 75°           Lenna         153.64     26.266     0.00293      0.40137
                     Baboon        519.28     20.977     0.06934      0.46875
Multiple attacks:    Lenna          39.263    32.191     0.16699      0.30566
JPEG + mean          Baboon        346.542    22.733     0.25879      0.37891
filtering + JPEG
Fig. 6. (a) The watermarked Lenna, slightly blurred; (b) and (c) two signatures recovered from (a) by our method and Kutter's method, respectively; (d) the watermarked Baboon, slightly blurred; (e) and (f) signatures recovered from (d) by our method and Kutter's method, respectively.
No erroneous estimation can be found in Fig. 7(a), but one erroneous estimation appears in Fig. 7(b): the estimated value corresponding to the first bit in Fig. 7(b) should lie below the threshold. In addition, Fig. 6(d) shows a slightly blurred version of the watermarked Baboon image. From Figs. 7(c) and (d), the numbers of erroneous estimations are 5 and 4, respectively. In the case of the rotation (75°)
Fig. 7. (a) and (b) show two discriminations for 32 bits randomly selected from the recovered signatures in Figs. 6(b) and (c), respectively; (c) and (d) show two discriminations for 32 bits randomly selected from the recovered signatures in Figs. 6(e) and (f), respectively.
Fig. 8. (a) The watermarked Lenna, rotated 75° to the left; (b) and (c) two signatures recovered from (a) by our method (300 epochs) and Kutter's method, respectively; (d) the watermarked Baboon, rotated 75° to the left; (e) and (f) signatures recovered from (d) by our method (300 epochs) and Kutter's method, respectively.
Fig. 9. (a) and (b) show two discriminations for 32 bits randomly selected from the recovered signatures in Figs. 8(b) and (c), respectively; (c) and (d) show two discriminations for 32 bits randomly selected from the recovered signatures in Figs. 8(e) and (f), respectively.
attack, shown in Figs. 8(a) and (d), the watermarked Lenna and Baboon images are rotated 75° to the left. Figs. 8(b) and (e) are clearly distinguishable, in contrast to Figs. 8(c) and (f), respectively. Furthermore, the numbers of erroneous estimations in Figs. 9(a) and (b) are 0 and 9, respectively, and those in Figs. 9(c) and (d) are 1 and 12, respectively. From the quantitative measures and visual perception (Table 1 and Figs. 6–9), we claim that our method is significantly superior to Kutter's method in resisting these common attacks.

Multiple attacks: In real environments, a watermarked image is very likely to suffer multiple image-processing attacks. For example, watermarked images are compressed and then distributed through the Internet; subsequently, the compressed images can be decompressed to be the
watermarked images, and then the watermarked images are distorted by image processing. Hence, we propose a new type of multiple attack, described as follows: the watermarked images are compressed in JPEG format, then decompressed and filtered by a mean filter, and finally compressed and decompressed in JPEG format once more. From Table 1, our method is more capable of resisting this kind of multiple attack than Kutter's method.
5. Conclusions

A novel watermarking technique based on neural networks has been proposed in this paper for copyright protection of color images. We have successfully fused neural networks with watermarking to enhance the performance of conventional watermarking techniques. The neural network can learn the characteristics of the embedded bits from the differences between the embedded pixels and the other pixels in the symmetric cross-shaped window. When a watermarked image is distorted by attacks such as rotation and filtering, the watermark information in the distorted-and-watermarked image may become tenuous. Owing to the flexibility and adaptability of the neural network, our watermarking technique outperforms conventional techniques against several attacks. Extensive experimental results are also included to illustrate that our method is immune against many different types of attacks.

References

[1] L. Blum, M. Blum, M. Shub, A simple unpredictable pseudo-random number generator, SIAM J. Comput. 15 (2) (1986) 364–383.
[2] I.J. Cox, J. Kilian, F.T. Leighton, T. Shamoon, Secure spread spectrum watermarking for multimedia, IEEE Trans. Image Process. 6 (12) (December 1997) 1673–1687.
[3] S. Haykin, Neural Networks, Macmillan College Publishing Company, New York, 1995.
[4] M. Kutter, F. Jordan, F. Bossen, Digital watermarking of color images using amplitude modulation, J. Electron. Imaging 7 (2) (April 1998) 326–332.
[5] M.D. Swanson, M. Kobayashi, A.H. Tewfik, Multimedia data-embedding and watermarking technologies, Proc. IEEE 86 (6) (June 1998) 1064–1087.
[6] H.-H. Tsai, P.-T. Yu, Adaptive fuzzy hybrid multichannel filters for removal of impulsive noise from color images, Signal Processing 7 (2) (April 1999) 127–151.