Watermarking resisting to translation, rotation, and scaling

M. Kutter
Signal Processing Laboratory, Swiss Federal Institute of Technology
Ecublens, 1015 Lausanne, Switzerland
ABSTRACT
In this paper we propose a new watermarking scheme for digital images that allows watermark recovery even if the image has been subjected to generalized geometrical transforms. The watermark is given by a binary number and every watermark bit is represented by a two-dimensional function. The functions are weighted, using a mask that is proportional to the luminance, and then modulated onto the blue component of the image. To recover an embedded bit, the embedded watermark is estimated using a prediction filter. The sign of the correlation between the estimated watermark and the original function determines the embedded bit. In order to allow recovery even after affine transforms, the function of each bit is embedded several times at horizontally and vertically shifted locations. In the watermark recovery process we first compute a prediction of the embedded watermark. Then the autocorrelation function is computed for this prediction. The multiple embedding of the watermark results in additional autocorrelation peaks. By comparing the configuration of the extracted peaks with their expected configuration we can determine the affine distortion applied to the image. The distortion can then be inverted and the watermark recovered in a standard way.

Keywords: digital watermarking, copyright protection, labeling, digital signature, robustness, geometrical transformation
1. INTRODUCTION
Digital watermarking, the art of hiding information in multimedia data in a robust and invisible manner, has gained great interest over the past few years. One reason for this interest is the high commercial potential of watermarking in applications such as copyright protection, authentication, labeling and monitoring. For a review of watermarking methods for different types of media, the interested reader is referred to the paper by Swanson et al. [9]

Besides the general robustness requirement for watermarking schemes towards "additive" noise, e.g. lossy compression, one of the main problems is how to resist geometrical transformations such as scaling, cropping, rotation, shearing and change of aspect ratio. Kutter et al. [5] suggested a full search approach for watermark recovery after geometrical transformations. If the transformation consists only of scaling, translation or rotation, this approach allows watermark recovery in decent time. However, if we have to cope with generalized geometrical transformations, this approach, although functional, is time consuming.

Gruhl and Bender [2] suggested a scheme in which multiple cross shapes are embedded into the image, for example by LSB plane manipulation. Any geometrical transformation applied to the image will be reflected in the shape and position of the embedded crosses. This information can be used to determine the affine transformation. Bender and Gruhl call this scheme a helper scheme. The drawback of this scheme is its low robustness towards noise, such as compression noise.

A somewhat similar approach was suggested by Fleet and Heeger [1]. In their scheme, the watermark is represented by a sum of sinusoidal signals. These sinusoids act as a grid, providing a coordinate frame for the image. The sinusoids appear as peaks in the frequency domain and can then be used to determine the geometrical distortion. As with many schemes working in the frequency domain, one problem is that geometrical distortions, such as cropping, heavily distort the frequency domain representation of an image. Another method based on a calibration signal in the Fourier domain has been patented by Digimarc Corporation [6].

A very different approach based on the Fourier-Mellin transformation was suggested by Ó Ruanaidh and Pun [7]. The idea here was not to invert the geometrical transformation before watermark extraction, but to perform the watermarking process in a transformation invariant domain. In a first step, the Fourier transformation of the image is computed. By keeping only the amplitude, the image representation is translation invariant. To achieve scale and rotation invariance, the amplitude of the Fourier transformation is then log-polar mapped, which transforms all scale changes and rotations into horizontal and vertical shifts. Then the Fourier transformation is computed again. The watermark is then embedded into the amplitude using spread spectrum modulation. Although this is a very elegant approach, it has several drawbacks. First of all, it works only if the image is not cropped and only uniformly scaled. This means that the watermark may not be recovered after a change of the aspect ratio. Furthermore, the overall robustness is not very good, since the watermark is embedded only in the amplitude of the Fourier transformation, and it is well known that most information in images is contained in the phase.

In this publication we present a new approach, in which the watermark itself is used as calibration signal by embedding it several times at different, horizontally and vertically shifted, locations. Watermark embedding is performed in the blue image component, and the watermark has the form of a weighted spread spectrum signal. The introduced method allows watermark recovery after translation, cropping, scaling, change of aspect ratio, rotation and even shearing. Hence it can accommodate, within certain limits, any generalized geometric transformation.

This paper is organized as follows. In Section 2, the basic watermarking scheme is introduced. In Section 3 the extension to this scheme, providing robustness against generalized geometrical transformations, is presented. Section 4 specifies the implementation used for testing. The new scheme is then tested in Section 5, before we conclude the paper in Section 6.

Other author information: M. Kutter: E-mail: martin.kutter@epfl.ch
2. BASIC WATERMARKING SCHEME
We start the technical description of the proposed method by first introducing the underlying digital watermarking scheme. This scheme is based on previous work [5] and can be viewed as a form of spread spectrum watermarking, with the difference that an additional step, which predicts the embedded watermark, is introduced in the watermark recovery process to increase detector performance.
2-D Amplitude Modulation
Our target is to embed an N-bit binary signature $B = \{b_0, b_1, \ldots, b_{N-1}\}$ in an image $I$. The watermarking process takes place in the watermarking channel, which is a transformation of the original image, $C = \Gamma(I)$. After watermark embedding in the watermarking channel, resulting in $\hat{C}$, the watermarked image $\hat{I}$ can be computed by applying the inverse transformation function $\Gamma^{-1}$, i.e. $\hat{I} = \Gamma^{-1}(\hat{C})$. Watermark embedding is defined as the linear combination of the original image in the watermarking channel and a set of N orthogonal two-dimensional functions $p_i$ which define the watermark $w$:

$$w = \sum_{i=1}^{N} p_i(x, y). \tag{1}$$
The watermark embedding process is given by:

$$\hat{C}(x, y) = C(x, y) + w, \tag{2}$$

and the two-dimensional orthogonal functions are defined as:

$$p_i(x, y) = b_i \, \alpha(x, y) \, \phi_i(x, y) \tag{3}$$
where $b_i$ is the bit value mapped from $\{0, 1\}$ to $\{-1, 1\}$, $\alpha(x, y)$ is a local weighting factor and $\phi_i(x, y)$ a two-dimensional modulation function. The weighting factor has two functions: (1) it adapts the modulation function to the human visual system in order to minimize artefacts, and (2) it provides control over the watermark embedding strength. The modulation functions are orthogonal:

$$\langle \phi_i, \phi_j \rangle = \delta_{ij} \, \|\phi_i\|^2 \tag{4}$$
where $\langle \cdot, \cdot \rangle$ is the inner product operator, $\delta_{ij}$ the Kronecker delta function, and $\|\cdot\|^2$ represents the energy of the modulation function. Numerous ways exist to design orthogonal modulation functions. In digital watermarking applications, having control over the visual artefacts is an important issue. Hence it is a common approach to design the functions such that the positions of non-zero values do not intersect. This way the functions do not interfere with each other and the resulting artefacts can be estimated by looking at each function individually. The modulation functions can be generated using a set of sparsely distributed pseudo-random sequences $S_i$. The pseudo-random sequences depend on a secret key $k$ and are not overlapping, i.e. $S_i \cap S_j = \emptyset, \; \forall i \neq j$. The modulation functions are then defined as:

$$\phi_i(x, y) = \begin{cases} s_i(x, y) & \text{if } (x, y) \in S_i \\ 0 & \text{otherwise} \end{cases} \tag{5}$$
In typical watermarking applications, the spatial distribution of the sets $S_i$ is uniform and the assignment of a given position to any set $S_i$ is equiprobable with probability $1/N$. However, it should be noted that in some cases the use of non-uniform spatial distributions may be advantageous. In this paper we will use a uniform spatial distribution for the sets $S_i$. In order to control the watermark embedding density, we introduce the density $D$, which is defined as the quotient of the number of all positions used for the watermark embedding process and the total number of positions, denoted here by the set $S$:

$$D = \frac{\left| \left\{ \bigcup_{i=1}^{N} S_i \right\} \right|}{\left| \{ S \} \right|} \tag{6}$$
where $|\{\cdot\}|$ refers to the set cardinality. The probability for a pixel to be assigned to $S_i$ is then given by $D/N$. As already mentioned by several authors [8], this modulation scheme can be considered as spread spectrum modulation. The statistical distribution of the $s_i(x, y)$ has zero mean and a variance of 1. For maximal embedding strength a bipolar distribution is advantageous; however, if uncertainty is of importance, a Gaussian distribution would be optimal [3]. Since in our case robustness is of interest, we decided to use the bipolar distribution.
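The embedding of equations 1 to 6 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `make_modulation` and `embed` are hypothetical, the secret key is simply used as the seed of NumPy's generator (the paper does not specify a generator), and the weighting factor is passed in as a ready-made array.

```python
import numpy as np

def make_modulation(shape, n_bits, density, key):
    """Build N non-overlapping modulation functions phi_i (eq. 5)."""
    rng = np.random.default_rng(key)            # assumption: the key seeds a PRNG
    # owner[x, y] = i if the position belongs to S_i, -1 if it is unused.
    owner = rng.integers(0, n_bits, size=shape)
    owner[rng.random(shape) >= density] = -1    # keep roughly a fraction D of positions
    s = rng.choice([-1.0, 1.0], size=shape)     # bipolar values: zero mean, variance 1
    return np.stack([np.where(owner == i, s, 0.0) for i in range(n_bits)])

def embed(channel, bits, phi, alpha):
    """C_hat = C + sum_i b_i * alpha * phi_i (eqs. 1 to 3)."""
    b = 2.0 * np.asarray(bits, dtype=float) - 1.0   # map {0, 1} to {-1, +1}
    w = alpha * np.tensordot(b, phi, axes=1)
    return channel + w

phi = make_modulation((64, 64), n_bits=8, density=0.5, key=1234)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
channel = np.full((64, 64), 128.0)                  # flat toy channel
marked = embed(channel, bits, phi, alpha=np.full((64, 64), 2.0))
```

Because every position belongs to at most one set, the functions are orthogonal by construction and the change at any pixel is a single weighted value.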
Demodulation
The introduced watermarking scheme can be seen as a modulation channel in which the image acts as additive noise. If the statistics of the image were Gaussian, common models could be used to design efficient detectors. However, it is well known that the statistics of images are in general very difficult to model. Nevertheless, it is a common method in digital watermarking to use the projection of the watermarked image onto the modulation functions as detection statistics:

$$r_i = \left\langle \Psi\{\hat{C}\}, \phi_i \right\rangle, \tag{7}$$
where $\Psi\{\cdot\}$ is a preprocessing transformation. The purpose of this transformation is to increase detector performance by computing a prediction $\tilde{w}$ of the embedded watermark. If properly chosen, the image is transformed such that the resulting statistics are easier to model and favorable for the subsequent watermark detection process. Figure 1 illustrates this idea on the Lena image. The plot on the left shows the probability distribution of the pixel values of the blue image component. As mentioned, the statistics are difficult to model since they do not resemble any commonly used statistical model. The plot on the right shows the distribution after linear filtering with the prediction filter proposed in equation 23. It can be seen clearly that the resulting distribution is more uniform, has zero mean and is symmetric. Furthermore, it can also be seen that the variance decreased. This distribution can easily be modeled, for example using a Generalized Gaussian or a Generalized Cauchy probability density function. If we assume that the preprocessing transformation can be implemented by linear filtering with the filter $h$, the detector statistic can be rewritten as:

$$r_i = \langle \hat{C} * h, \phi_i \rangle = \langle (C + w) * h, \phi_i \rangle = \langle w * h, \phi_i \rangle + \langle C * h, \phi_i \rangle \tag{8}$$
The first term of the final expression projects the pre-filtered watermark onto the modulation functions. In the second term, the image in the watermarking domain is pre-filtered and projected onto the modulation functions. Since the pre-filtering process generates a distribution with zero mean and decreased variance, the overall detector performance increases. If the additive noise were of Gaussian nature, the optimal detector scheme would be based on a matched filter [4] approach. Since the noise in our case is non-Gaussian, and the embedded signal is not weak, we do not correlate with the weighted functions $\alpha(x, y)\phi_i(x, y)$. The proposed detector statistics can be seen as a sign detector, often used for signal detection in non-Gaussian noise [4]. An embedded bit $\tilde{b}_i$ is extracted by looking at the sign of $r_i$:

$$\tilde{b}_i = \begin{cases} 1 & \text{if } r_i \geq 0 \\ 0 & \text{if } r_i < 0 \end{cases} \tag{9}$$
Since the main purpose of this paper is geometrical robustness, and not detector performance analysis and evaluation, we leave the probabilistic analysis of the proposed statistics for another publication and continue with the extension of this scheme to geometrical robustness.

[Figure 1: two histograms. Left: Probability versus Pixel Value. Right: Probability versus Filtered Pixel Value.]

Figure 1. Probability distribution of pixel values before and after filtering with the prediction filter defined in equation 23.
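The detector of equations 7 to 9 can be sketched as follows. The names `circular_filter` and `detect_bits` are hypothetical, the filtering uses circular boundaries via the FFT for brevity, and the small 3x3 kernel in the demo is only a stand-in predictor, not the filter of equation 23. The toy image is deliberately smooth so that the prediction error is dominated by the watermark.

```python
import numpy as np

def circular_filter(img, h):
    """Filter img with the small kernel h (circular boundary, via FFT)."""
    kernel = np.zeros_like(img, dtype=float)
    kh, kw = h.shape
    kernel[:kh, :kw] = h
    # Move the kernel centre to the origin so the output is not shifted.
    kernel = np.roll(kernel, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel)))

def detect_bits(marked, phi, h):
    """r_i = <marked * h, phi_i> (eq. 8), followed by the sign rule of eq. 9."""
    w_est = circular_filter(marked, h)
    r = np.tensordot(phi, w_est, axes=([1, 2], [0, 1]))
    return (r >= 0).astype(int), r

# Toy data: non-overlapping bipolar modulation functions and a smooth image.
rng = np.random.default_rng(7)
shape, n_bits = (64, 64), 8
owner = rng.integers(0, n_bits, size=shape)
owner[rng.random(shape) >= 0.5] = -1
s = rng.choice([-1.0, 1.0], size=shape)
phi = np.stack([np.where(owner == i, s, 0.0) for i in range(n_bits)])

bits = np.array([1, 0, 1, 1, 0, 0, 1, 0])
yy, xx = np.mgrid[0:64, 0:64]
image = 100.0 + 50.0 * np.sin(2 * np.pi * xx / 64) * np.cos(2 * np.pi * yy / 64)
marked = image + 3.0 * np.tensordot(2.0 * bits - 1.0, phi, axes=1)

h = np.array([[0, -0.25, 0], [-0.25, 1, -0.25], [0, -0.25, 0]])  # stand-in predictor
decoded, r = detect_bits(marked, phi, h)
```

On natural images the image term of equation 8 is much larger than in this toy case, which is why the choice of prediction filter matters.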
3. EXTENSION TO GEOMETRICAL ROBUSTNESS
Translation and Cropping
Resilience to translation and cropping can easily be achieved by embedding a reference watermark and performing a full correlation search for this watermark [5, 3]. Embedding a reference watermark can be seen as extending an N-bit signature by one or more bits and pre-setting the additional bits to known values. If we assume that the first P bits are pre-set, the search is defined as:

$$\max_{p_x, p_y} \sum_{x, y} \tilde{w}(x + p_x, y + p_y) \sum_{i=1}^{P} b_i \, \phi_i(x, y) \tag{10}$$

where $p_x$ and $p_y$ define the translation. This two-dimensional correlation can easily be implemented by multiplication in the frequency domain.
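The frequency-domain implementation of the search in equation 10 can be sketched as below. The function name `find_translation` is hypothetical, the correlation is circular (so cropping is not modelled), and the random ternary pattern merely stands in for the reference watermark built from the pre-set bits. The first array axis plays the role of x.

```python
import numpy as np

def find_translation(w_est, reference):
    """Return (p_x, p_y) maximising the correlation of eq. 10 (circular version)."""
    # Cross-correlation theorem: corr = IFFT( FFT(w_est) * conj(FFT(reference)) ).
    spectrum = np.fft.fft2(w_est) * np.conj(np.fft.fft2(reference))
    corr = np.real(np.fft.ifft2(spectrum))
    px, py = np.unravel_index(np.argmax(corr), corr.shape)
    return int(px), int(py)

rng = np.random.default_rng(0)
reference = rng.choice([-1.0, 0.0, 1.0], size=(64, 64))   # stands in for sum_i b_i * phi_i
# Estimated watermark: the reference translated by (5, 12) plus noise.
w_est = np.roll(reference, (5, 12), axis=(0, 1)) + 0.5 * rng.standard_normal((64, 64))
shift = find_translation(w_est, reference)
```

Evaluating all translations this way costs a few FFTs instead of one full correlation sum per candidate shift.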
Geometrical distortions
In the previous paragraph we have described a method that allows watermark recovery after cropping and translation. As proposed by Kutter et al. [5], this method may also be extended to any geometrical transformation. Although this approach is mathematically feasible, it is computationally far too expensive and hence not realistic with today's computational resources. To bypass this problem, a scheme is required that allows detection of the geometrical transformation. A generalized geometric transform is described by six parameters. Taking into account that we do not consider translation, four parameters are sufficient:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{11}$$

where $a$, $b$, $c$, and $d$ are the transformation coefficients. In order to compute the inverse transform, and hence solve for the coefficients, we need at least two sets of corresponding positions before and after transformation. The inverse transformation is given by:

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \left( \begin{bmatrix} x_1' & y_1' \\ x_2' & y_2' \end{bmatrix}^{-1} \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \end{bmatrix} \right)^{T} \tag{12}$$
where $T$ denotes the transpose of the matrix, $(x_i, y_i)$ the position before transformation and $(x_i', y_i')$ the position after transformation. A closed form solution for the inverse transformation matrix can easily be found. If we are to recover the geometrical transformation applied to a watermarked image, we need at least the original and transformed locations of two reference points. Instead of working with reference points, we propose the use of the watermark as reference. This is accomplished by embedding the same watermark multiple times at shifted locations. Let us redefine the modulation functions introduced in the previous section:

$$\phi_i^k(x, y) = \begin{cases} s_i(x, y) & \text{if } (x - \delta_x, y - \delta_y) \in S_i^0, \; \forall (\delta_x, \delta_y) \in \Delta^k \\ 0 & \text{otherwise} \end{cases} \tag{13}$$

where $\Delta^k$ is the set of pairs of horizontal and vertical shifts, $\Delta^k = \{\delta_x, \delta_y\}$. Similarly, the functions $p_i^k$ are now defined as:

$$p_i^k(x, y) = b_i \, \alpha(x, y) \, \phi_i^k(x, y) \tag{14}$$

The resulting watermark embedding process is now given by:

$$\hat{C}(x, y) = C(x, y) + \sum_{k} \sum_{i=1}^{N} p_i^k(x, y). \tag{15}$$
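The closed-form solution of equation 12 amounts to a single 2x2 linear solve. The sketch below assumes each point is stored as one row (x, y); the function name `inverse_affine` and the example distortion matrix are hypothetical.

```python
import numpy as np

def inverse_affine(before, after):
    """Inverse 2x2 transform from two point pairs (eq. 12).

    before, after: arrays of shape (2, 2), one point (x, y) per row.
    """
    X = np.asarray(before, dtype=float)
    Xp = np.asarray(after, dtype=float)
    # (Xp^-1 X)^T, computed with a linear solve instead of an explicit inverse.
    return np.linalg.solve(Xp, X).T

A = np.array([[1.1, 0.3], [-0.2, 0.9]])          # hypothetical distortion (a, b; c, d)
before = np.array([[43.0, 0.0], [0.0, 43.0]])    # two reference positions
after = before @ A.T                             # each row is A applied to one point
A_inv = inverse_affine(before, after)
```

The two reference points must not be collinear with the origin, otherwise the matrix of transformed positions is singular and the solve fails.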
As mentioned, this scheme embeds the modulation function of each bit multiple times at different locations in the image by shifting the function horizontally and vertically. In order to eliminate artefacts due to the superposition of different modulation functions, we can add the constraint that all functions have to be non-overlapping, i.e. $S_i^k \cap S_j^l = \emptyset, \; \forall (i \neq j, k \neq l)$. To determine the geometrical transformation in the watermark recovery process, we start by computing an estimate $\tilde{w}$ of the embedded watermark $w$:

$$\tilde{w} = \Psi\{\hat{C}\} \tag{16}$$

where $\Psi\{\cdot\}$ again describes a transformation to predict the embedded watermark. Then we compute the autocorrelation function $R_{\tilde{w},\tilde{w}}$ of the estimated watermark $\tilde{w}$:

$$R_{\tilde{w},\tilde{w}}(u, v) = \sum_{x} \sum_{y} \tilde{w}(x, y) \, \tilde{w}(x + u, y + v). \tag{17}$$

Since the watermark has been embedded multiple times, the autocorrelation of its estimate has multiple peaks. The plot in the middle of Figure 2 illustrates this by showing the autocorrelation function of an estimated watermark which has been embedded four times. If the image has undergone a geometrical transformation, the peaks in the autocorrelation function will reflect the same transformation, and hence provide the point pairs required for the computation of the geometrical transformation parameters in equation 12. In other words, the autocorrelation function has peaks at the center and at the transformed positions $\tilde{\Delta}^k$ of the multiply embedded watermarks:

$$\tilde{\Delta}^k = T(\Delta^k) \tag{18}$$
where $T$ is the geometrical transformation applied to the image. Peak detection is implemented by first computing the gradient of the autocorrelation function. The locations of gradient maxima are then used in the autocorrelation function to define square regions in which the maxima correspond to peaks. Once a peak has been extracted, the neighboring areas in the gradient function and the autocorrelation function are set to zero. This process is iterated until the required number of peaks has been extracted. After applying the inverse transformation to the image and searching for the watermark location, the embedded watermark is extracted as explained in Section 2. It should be noted that the pseudo-random sequences and the corresponding modulation functions are given by:

$$S_i = \bigcup_{k} S_i^k \tag{19}$$

$$\phi_i(x, y) = \sum_{k} \phi_i^k(x, y) \tag{20}$$
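The autocorrelation of equation 17 and an iterative peak extraction can be sketched as follows. This is a simplified variant under stated assumptions: it picks maxima directly in the autocorrelation function and zeroes a square neighbourhood after each pick, omitting the gradient step described above. The names `autocorrelation` and `extract_peaks`, the neighbourhood radius, and the toy pattern are all hypothetical.

```python
import numpy as np

def autocorrelation(w_est):
    """Circular autocorrelation (eq. 17) with zero lag moved to the array centre."""
    spectrum = np.fft.fft2(w_est)
    acf = np.real(np.fft.ifft2(spectrum * np.conj(spectrum)))
    return np.fft.fftshift(acf)

def extract_peaks(acf, n_peaks, radius=3):
    """Pick the n_peaks largest maxima, zeroing a square region around each one."""
    acf = acf.copy()
    centre = np.array(acf.shape) // 2
    peaks = []
    for _ in range(n_peaks):
        u, v = np.unravel_index(np.argmax(acf), acf.shape)
        peaks.append((int(u - centre[0]), int(v - centre[1])))   # lag relative to centre
        acf[max(u - radius, 0):u + radius + 1, max(v - radius, 0):v + radius + 1] = 0.0
    return peaks

# Toy estimate: one sparse bipolar pattern embedded at four shifted locations.
rng = np.random.default_rng(3)
base = np.zeros((128, 128))
base[::2, ::2] = rng.choice([-1.0, 1.0], size=(64, 64)) * (rng.random((64, 64)) < 0.5)
shifts = [(0, 0), (43, 0), (0, 43), (43, 43)]
w_est = sum(np.roll(base, s, axis=(0, 1)) for s in shifts)
w_est = w_est + 0.5 * rng.standard_normal(w_est.shape)
peaks = extract_peaks(autocorrelation(w_est), n_peaks=9)
```

In this undistorted toy case the nine strongest lags are the centre, the four purely horizontal or vertical shifts, and the four diagonal combinations; after a geometrical distortion the same configuration appears transformed.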
4. IMPLEMENTATION
In order to keep the proposed method as generic as possible, we did not specify the watermarking channel, weighting function, preprocessing transformation and the shifting parameters in the description above. To test the proposed scheme, we will now define these variables. However, it should be noted that other combinations and approaches are possible, based on the same scheme. The watermarking channel is given by the blue image component, for which the transformation $\Gamma$ is defined as:

$$C(x, y) = B(x, y) \tag{21}$$

where $B(x, y)$ is the value of the blue image component at location $(x, y)$. The inverse transformation $\Gamma^{-1}$ reassigns the possibly modified value to the blue image component at the same location. The weighting function $\alpha(x, y)$ is defined as a fraction of the image luminance $L(x, y) = 0.299 R(x, y) + 0.587 G(x, y) + 0.114 B(x, y)$, where $R$, $G$, and $B$ are the values of the red, green, and blue image components in the image:

$$\alpha(x, y) = \frac{L(x, y)}{\beta} \tag{22}$$

where $\beta$ determines the watermarking strength. For the experiments in the next section, $\beta$ was set to 5. A challenge is how to create the non-intersecting sets $S_i^k$. One approach is to first define N non-intersecting sets by assigning $d\% = D \cdot 100$ of all locations to the N sets with equal probability. For the tests the density $D$ was set to 0.5. The sets are then up-sampled by 2 to generate $S_i^0$, with $\Delta^0 = (0, 0)$. These sets have the advantage that only even rows and columns are used, leaving the odd rows and columns available for the other sets. If we now constrain the horizontal and vertical shifts to be odd, the requirement of non-intersecting sets is easily achieved.
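The construction of non-intersecting shifted sets can be sketched as below. The sketch assumes an image with even dimensions, circular shifts, and a key that seeds NumPy's generator; the names `base_sets` and `shifted_sets` and the example shift value are hypothetical.

```python
import numpy as np

def base_sets(shape, n_bits, density, key):
    """Label map on the even grid: label i marks S_i^0, -1 marks unused positions."""
    rng = np.random.default_rng(key)              # assumption: the key seeds a PRNG
    half = (shape[0] // 2, shape[1] // 2)         # assumes even image dimensions
    small = rng.integers(0, n_bits, size=half)
    small[rng.random(half) >= density] = -1
    labels = np.full(shape, -1)
    labels[::2, ::2] = small                      # up-sampling by 2: even rows and columns only
    return labels

def shifted_sets(labels, shift):
    """Label map of the sets S_i^k obtained by a circular shift (d_x, d_y)."""
    return np.roll(labels, shift, axis=(0, 1))

labels0 = base_sets((64, 64), n_bits=8, density=0.5, key=42)
d = 43                                            # odd shift, hypothetical value
shifts = [(0, 0), (d, 0), (0, d), (d, d)]
masks = [shifted_sets(labels0, s) >= 0 for s in shifts]
overlap = sum(m.astype(int) for m in masks)       # number of sets covering each position
```

An odd shift moves the even grid onto odd rows or columns, so the four copies occupy the four parity classes of the pixel grid and `overlap` never exceeds 1.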
[Figure 2: three panels. Left: Watermarked image. Center: Autocorrelation function R(u,v) over the lags u and v. Right: Extracted peaks in the (u, v) plane.]

Figure 2. Watermarking example.
As mentioned in the introduction of this section, in order to recover a generalized geometrical transformation, excluding shifts, we need at least the transformation description of two points. This can be done by defining another two sets of shifts, $\Delta^1 = (\delta_h, 0)$ and $\Delta^2 = (0, \delta_v)$. In order to fully exploit the watermarking channel, a fourth set $\Delta^3 = (\delta_h, \delta_v)$ is introduced. This fourth set has another advantage: the peak values in the autocorrelation function are doubled for horizontal and vertical shifts. For example, for the horizontal shift not only the copies at $\Delta^0$ and $\Delta^1$ correlate, but also those at $\Delta^2$ and $\Delta^3$. This may be especially helpful if the image has not only been subjected to geometrical transformations, but also to other attacks, such as lossy compression. For testing we set $\delta_h = \delta_v = 43$. The prediction filter $h$ used in the watermark recovery process is shown in equation 23.

$$h = \frac{1}{12} \begin{bmatrix} 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ -1 & -1 & -1 & 12 & -1 & -1 & -1 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 \end{bmatrix} \tag{23}$$
The filter computes the prediction error, which can be seen as an estimate of the embedded watermark, for the center pixel through a linear interpolation using pixels in a cross-shaped neighborhood. Other schemes, such as spline interpolation or Wiener filtering, are also possible. However, tests showed that the proposed filter offers a good tradeoff between complexity and performance. Furthermore, more complex prediction schemes did not noticeably increase the overall performance of the proposed watermarking scheme.
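The kernel of equation 23 is easy to build programmatically. In the sketch below the function name `cross_prediction_filter` and its `arm` parameter are hypothetical; `arm=3` reproduces the 7x7 cross, and the linear ramp is a toy input used to show that a locally linear image leaves no prediction error.

```python
import numpy as np

def cross_prediction_filter(arm=3):
    """Cross-shaped prediction-error kernel; arm=3 gives the 7x7 filter of eq. 23."""
    size = 2 * arm + 1
    h = np.zeros((size, size))
    h[arm, :] = -1.0                 # horizontal arm
    h[:, arm] = -1.0                 # vertical arm
    h[arm, arm] = 4.0 * arm          # centre weight equals the number of neighbours
    return h / (4.0 * arm)

h = cross_prediction_filter()

# Prediction error at the centre of a linear ramp: the symmetric neighbours
# average exactly to the centre value, so the error vanishes.
ramp = np.add.outer(np.arange(9.0), 2.0 * np.arange(9.0))
error = float((h * ramp[1:8, 1:8]).sum())
```

The coefficients sum to zero, which is what removes the local mean of the image and leaves mainly the high-frequency watermark estimate.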
5. RESULTS
To test the proposed method, the 512 × 512 color version of Lena was watermarked with a 34-bit watermark, where two bits were pre-set with known values. These two bits were used to recover translations after the inverse geometrical transformation. Figure 2 shows the watermarked image on the left. The plot in the center shows the autocorrelation function of the estimated watermark, and the plot on the right depicts the extracted peaks. The locations of the peaks clearly illustrate the multiple embedding of the watermark at horizontally, vertically, and both horizontally and vertically shifted locations. The watermark was recovered without problems in this case.

To test the algorithm under conditions including generalized geometrical transformations as well as the introduction of noise, we decided to test the method with a print-scan procedure. The original signed image was printed with an HP-Laserjet color printer, with a resolution of 300 dpi on standard paper. The image was then scanned using an HP scanner with standard settings and a resolution of 300 dpi. The resulting image was rescaled to a size of 512 by 512 pixels in order to speed up the computation of the cross-correlation by using Fast Fourier Transforms. The image on the left in Figure 3 shows Lena after the print-scan procedure. The plot in the center shows the autocorrelation function of the estimated watermark, and the plot on the right shows the extracted peak locations. The extracted peaks clearly illustrate that the image has been rotated and scaled. The inverse transformation was then applied to the image, and the embedded watermark was successfully recovered.

In a last test, the watermarked image was first scaled by 120% in the vertical direction (change of aspect ratio), and then sheared to the right. The resulting image is shown on the left in Figure 4. Again, the plot in the center shows the autocorrelation function of the estimated watermark, and the plot on the right shows the extracted peak locations. The peak locations again reflect the geometrical transformation. After inverting the geometrical transformation, the watermark was successfully recovered.

[Figure 3: three panels. Left: Watermarked image. Center: Autocorrelation function R(u,v) over the lags u and v. Right: Extracted peaks in the (u, v) plane.]

Figure 3. Print-scan example. The watermarked image was printed on a 300 dpi laser printer and scanned at 300 dpi.

[Figure 4: three panels. Left: Watermarked image. Center: Autocorrelation function R(u,v) over the lags u and v. Right: Extracted peaks in the (u, v) plane.]

Figure 4. Change of aspect ratio and shearing example. The watermarked image was first scaled by 120% in the vertical direction, and then sheared to the right.
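Inverting the estimated distortion means resampling the image. The sketch below uses nearest-neighbour resampling about the image centre, which is an assumption made for brevity (the paper does not state the interpolation used); the function name `warp_nearest` is hypothetical. The matrix argument maps output coordinates to input coordinates, so an image distorted by A is restored by passing A itself.

```python
import numpy as np

def warp_nearest(image, matrix):
    """Resample so that output(p) = image(matrix @ (p - c) + c), c = image centre."""
    h, w = image.shape
    centre = np.array([(h - 1) / 2.0, (w - 1) / 2.0])
    rows, cols = np.mgrid[0:h, 0:w]
    coords = np.stack([rows.ravel(), cols.ravel()]).astype(float)    # 2 x (h*w)
    src = matrix @ (coords - centre[:, None]) + centre[:, None]
    src = np.rint(src).astype(int)                                   # nearest neighbour
    inside = (src[0] >= 0) & (src[0] < h) & (src[1] >= 0) & (src[1] < w)
    out = np.zeros(h * w, dtype=image.dtype)                         # outside pixels stay 0
    out[inside] = image[src[0, inside], src[1, inside]]
    return out.reshape(h, w)

rng = np.random.default_rng(5)
original = rng.integers(0, 256, size=(64, 64))
A = np.array([[0.0, -1.0], [1.0, 0.0]])     # 90 degree rotation, exact on the pixel grid
distorted = warp_nearest(original, np.linalg.inv(A))   # content at p moves to A p
restored = warp_nearest(distorted, A)                  # undo the distortion
```

The rotation is used here only because it round-trips exactly on the pixel grid; for scaling or shearing, nearest-neighbour resampling loses information and a smoother interpolation would normally be preferred.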
6. CONCLUSION
In this paper we propose a new digital watermarking method that allows watermark recovery even if the watermarked image was subjected to a generalized geometric transformation. The novel idea of the method is to embed the same watermark multiple times at shifted locations in the image. With the presented method, the applied geometrical transformation can be estimated and then inverted before watermark recovery.

In the first part of the paper we have developed the framework for a watermarking method based on multiple watermark embedding. In this framework we did not specify parameters such as the watermarking channel or the weighting functions. The idea is that the proposed strategy may be applicable to a variety of watermarking channels, and may make use of diverse weighting functions and different prediction functions in the recovery process. It might even be possible to combine different watermarking channels in which the watermarks are embedded. In such a case, the watermarks would be predicted in each watermarking channel, and correlated to determine the geometrical transformation.

To test the feasibility of the proposed approach, we embedded the watermark in the blue image component and used the luminance to weight the watermark before embedding. Tests verified that the proposed method allows watermark recovery after generalized geometric transformations, and even after a print-scan process. Towards lossy JPEG compression the technique shows decent robustness. In general, a 32-bit watermark can be recovered after compression with a quality factor as low as 90%. One limitation of the proposed method is that rotations larger than 45° create a registration problem, since the correspondence of the extracted peaks is not unique. This problem can be bypassed by performing several watermark recovery attempts for rotated versions of the image at multiples of 45°.

The complexity of the proposed method is acceptable. The most time consuming part is the computation of six 2-D Fourier transformations used for the computation of the autocorrelation and the correlation in the translation recovery process. However, if speed is of importance, optimization is possible in several ways.

In the proposed method, the inverse geometric transformation is applied to the image before watermark recovery. However, it is also possible to apply the geometric transformation to the modulation functions and then recover the watermark. First tests indicate that this modification improves the overall robustness. Furthermore, to increase the robustness against "noise" attacks, modulation functions in the lower frequency range may be used. To my knowledge the proposed method is the first method that allows watermark recovery even after change of aspect ratio and shearing.
REFERENCES
1. D. J. Fleet and D. J. Heeger. Embedding invisible information in color images. In Proceedings of the International Conference on Image Processing (ICIP), volume 1, page 532, Santa Barbara, CA, USA, 1997. IEEE.
2. D. Gruhl and W. Bender. Affine invariance. http://nif.www.media.mit.edu/DataHiding/affine/affine.html, 1995.
3. J. J. Hernandez, F. Perez-Gonzalez, J. M. Rodriguez, and G. Nieto. Performance analysis of a 2-D multipulse amplitude modulation scheme for data hiding and watermarking still images. IEEE Journal on Selected Areas in Communications (JSAC), 16(4):510-524, 1998.
4. Saleem A. Kassam. Signal Detection in Non-Gaussian Noise. Springer-Verlag, 1998. Chapter 2.
5. Martin Kutter, F. Jordan, and Frank Bossen. Digital watermarking of color images using amplitude modulation. Journal of Electronic Imaging, 7(2):326-332, April 1998.
6. G. B. Rhoads. Steganography methods employing embedded calibration data. US Patent, 1995.
7. J. J. K. Ó Ruanaidh and T. Pun. Rotation, scale and translation invariant spread spectrum digital image watermarking. Signal Processing, 66(3), 1998.
8. J. R. Smith and B. O. Comiskey. Modulation and information hiding in images. In Lecture Notes in Computer Science: Information Hiding, number 1174, pages 207-226. Springer, May/June 1996.
9. M. D. Swanson, M. Kobayashi, and A. H. Tewfik. Multimedia data embedding and watermarking techniques. Proceedings of the IEEE, 86(6):1064-1087, June 1998.