
Coding and Synchronization: A Boost and a Bottleneck for the Development of Image Watermarking

J. R. Hernández, F. Pérez-González and J. M. Rodríguez
Dept. Tecnologías de las Comunicaciones, ETSI Telecom., Universidad de Vigo, 36200 Vigo, Spain

Abstract: This paper presents a theoretical analysis of the performance of a family of watermarking schemes formulated in the spatial domain. A general model for watermarking based on perceptual masking that boils down to an equivalent Gaussian channel is presented and characterized. Starting from the Gaussian equivalent, the introduction of coding as a means of improving the performance of the watermarking system is fully justified and illustrative examples are presented. Some important classes of attacks can be readily analyzed with our results, both for the coded and uncoded cases. Finally, the synchronization problem is discussed through some examples, with disappointing results that lead to a final discussion.

[Figure 1: block diagram. Labeled blocks: Key, Hidden Source, Source Encoder, Modulator, Host Source, Channel, Jammer, Demodulator, Watermark Detection, Watermark Decoder, Source Decoder, Destination.]

Fig. 1 General model of a watermarking system

I. Introduction

This decade has witnessed a stunning proliferation of techniques for the representation, storage and distribution of digital multimedia information. Parallel to these developments has been the unauthorized copying, distribution and manipulation of data, mostly images. Preventing the duplication of digital images requires specialized and costly hardware, which dramatically reduces marketing possibilities. This is the cryptographic approach taken by pay-TV channels, and it is not foreseeable for scenarios such as the Internet. There, watermarking techniques can at least ensure that ownership information is embedded into the image, thus preventing or deterring users from illegal uses.

Watermarking schemes have mushroomed over the past few years, with an impressive research effort devoted to them. Many companies (e.g., NEC, Sony, Hitachi, Kodak) have commercial watermarking products available or in preparation. In spite of this, results to date are quite discouraging. There exist freely available programs (e.g., unZign, Stirmark) that have been extensively shown to wash the watermark out for practically all existing commercial methods, with little impact on perceptual quality. Recently, a very popular non-technical magazine stopped using watermarks on its Internet server because they did not give the desired level of protection, and it is currently looking for a better embedding method.

Lacking in all previous research on digital watermarking is a theoretical approach that allows the assessment of the actual limits in performance. As a consequence, there are no testbeds for the comparison of the various proposed methods. It is our belief that such a theoretical approach is the only way to turn electronic copyright protection into a mature discipline, at a level comparable to other branches of Signal Processing and Communications; otherwise, watermarking will be just a fad.

Our first attempt at giving it a sounder foundation was made in [1], where the performance of a spatial watermarking scheme was analyzed. In this paper, we discuss new aspects of this method, specifically, how channel coding and synchronization affect its behavior. Although these problems are closely tied to the spatial character of the watermark, some of the discussion and ideas can be extrapolated to other domains, such as the DCT or subband expansions.

The technique described in [1] is summarized in Figure 1 and is based on [2], [3]. As can be seen, the original image is not needed for detection. Some existing methods do require the original image, but this would be unmanageable when huge quantities of images need to be compared by intelligent agents searching the net for unauthorized copies. Also, in order to generalize previous approaches, the watermark is divided into two parts: a fixed (but key-dependent) part and an information-driven part. The latter allows the owner to embed information of a different nature, e.g., timestamps, copy numbers, etc. Note that this information would be vital should a watermarking scheme be commercially implemented.

Corresponding to the two parts into which the key is split, we have proposed the use of two well-known sets of quality parameters that have been used in different applications. We define the probability of decoding error $P_b$ as the probability that an information bit in the watermark is wrongly decoded. The probability of detection $P_D$ is the probability of correctly deciding that the watermark is present, while the probability of false alarm $P_F$ is the probability of erroneously deciding that the watermark is present when it is not. Note that $P_b \neq 0$ and $P_F \neq 0$, mainly due to two facts: 1) the errors made when trying to restore the original image from its watermarked version and 2) the manipulations, intentional or unintentional, that the image may have suffered.

The paper is organized as follows: in Section II we present a general model for spatial watermarking methods that allows us to construct an equivalent Gaussian channel from which the performance of coding, analyzed in Section III, can be derived. Section IV is devoted to describing some kinds of attacks that fit in our model, while in Section V we present comparisons between different codes and show how coding may be advantageous over the uncoded case. The problems associated with scaling attacks that affect watermark synchronization are dealt with in Section VI. Finally, the conclusions and future lines of research that can be extracted from our work are presented in Section VII.

II. A General Model for Spatial Watermarking

In this section we briefly describe the watermarking method and summarize some important results that have been presented elsewhere [1], [4]. The watermark is expressed as

$$w[m,n] = \sum_{i=1}^{L} b_i\, p_i[m,n], \tag{1}$$

where the $b_i$ are the watermark bits (information or fixed) and $p_i[m,n]$ is a 2-D pulse defined as

$$p_i[m,n] = \begin{cases} \alpha[m,n]\, s[m,n] & \text{if } (m,n) \in S_i \\ 0 & \text{otherwise.} \end{cases}$$

Both $s[m,n]$ and the sets $\{S_i\}$ are generated as a function of a secret key $K$ to provide cryptographic security. $s[m,n]$ is a zero-mean i.i.d. sequence with unit variance, i.e., $E[s^2] = 1$. Refer to [1] for further details on the properties that $s[m,n]$ should have. The pulses are chosen to be non-overlapping, i.e., $S_i \cap S_j = \emptyset \ \forall i \neq j$, so they will always be orthogonal. Finally, the sequence $\alpha^2[m,n]$ indicates the maximum allowable variance for the pulses to be invisible. Since the watermark has a random white appearance, this modulation technique resembles a direct-sequence spread spectrum (SS) scheme, although important differences exist [5].

The spatial sequence $\alpha^2[m,n]$ is obtained via a perceptual analysis of the image to be watermarked. This analysis, and the functions defined here, follow [6], [7]. The masking function $M[m,n]$ defined there is adapted to become

$$M[m,n] = \sum_{k_1=m-P}^{m+P}\ \sum_{k_2=n-P}^{n+P} 0.35^{\sqrt{(k_1-m)^2+(k_2-n)^2}} \cdot \Big( \big|x[k_1+1,k_2] - x[k_1-1,k_2]\big| + \big|x[k_1,k_2+1] - x[k_1,k_2-1]\big| \Big) \tag{2}$$

where $x[m,n]$ is the original image and the masking level is computed in a $(2P+1) \times (2P+1)$ neighborhood of $[m,n]$. From here, it is possible to determine the value of $\alpha^2[m,n]$ as

$$\alpha^2[m,n] = n_{visib} \exp(a M[m,n]) \tag{3}$$

where $n_{visib}$ controls the noise visibility level and $a$ is a settable real parameter.

The watermarked image $y[m,n]$ is obtained simply by adding the watermark $w[m,n]$ to the original image $x[m,n]$. As was mentioned in the Introduction, we assume that the original image $x[m,n]$ is not available in the detection process and that the watermarked image is filtered with a space-variant linear filter $h_{k,l}[m,n]$, which models a filtering attack or a preprocessing step before detection (e.g., to estimate the original image), obtaining a signal $z[m,n]$ as a result. Due to the lack of good statistical models for images, we reduce the observation space to the projection of $z[m,n]$ onto the pulses $\{p_i\}$, that is, to the statistics $r_i$, $i = 0, \cdots, L-1$, grouped in a vector $\mathbf{r}$. The rationale for this choice of decision variables is that they would be a sufficient set of statistics had the original image had a Gaussian i.i.d. distribution in every pixel.
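The construction in Eqs. (1)-(3) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the implementation used in [1]: the function names, the binary choice $s[m,n] = \pm 1$, the random assignment of pixels to the sets $S_i$, the edge padding at the image borders and the normalization of the projection by the pulse size are all hypothetical choices made for the example.

```python
import numpy as np

def masking_function(x, P=2):
    """Eq. (2): distance-weighted sum of gradient magnitudes in a (2P+1)x(2P+1) window."""
    x = x.astype(float)
    xp = np.pad(x, 1, mode="edge")  # assumption: replicate the border pixels
    # |x[k1+1,k2] - x[k1-1,k2]| + |x[k1,k2+1] - x[k1,k2-1]| at every pixel
    grad = np.abs(xp[2:, 1:-1] - xp[:-2, 1:-1]) + np.abs(xp[1:-1, 2:] - xp[1:-1, :-2])
    gp = np.pad(grad, P, mode="edge")
    rows, cols = x.shape
    M = np.zeros_like(x)
    for d1 in range(-P, P + 1):
        for d2 in range(-P, P + 1):
            weight = 0.35 ** np.hypot(d1, d2)
            M += weight * gp[P + d1:P + d1 + rows, P + d2:P + d2 + cols]
    return M

def make_watermark(x, bits, key, n_visib=1.0, a=0.0028, P=2):
    """Eqs. (1) and (3): key-dependent watermark carrying the antipodal bits b_i."""
    rng = np.random.default_rng(key)                # the key seeds s[m,n] and the sets S_i
    alpha2 = n_visib * np.exp(a * masking_function(x, P))
    s = rng.choice([-1.0, 1.0], size=x.shape)       # zero mean, E[s^2] = 1 (assumed binary)
    # labels[m,n] = i means (m,n) belongs to S_i; the sets are disjoint by construction
    labels = (rng.permutation(x.size) % len(bits)).reshape(x.shape)
    b = np.asarray(bits, dtype=float)[labels]
    w = b * np.sqrt(alpha2) * s
    return w, labels, alpha2, s

def project(z, labels, alpha2, s, n_bits):
    """Statistics r_i: projection of z onto each pulse, normalized by the pulse size."""
    p = np.sqrt(alpha2) * s
    return np.array([np.mean(z[labels == i] * p[labels == i]) for i in range(n_bits)])
```

With a watermarked image `y = x + w`, the vector `project(y, labels, alpha2, s, len(bits))` contains a signal term proportional to each bit plus a host-image term, which is the interference that the preprocessing filter is meant to reduce.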

Taking into account the unavailability of statistical models and recognizing that different images will lead to considerably different results, it is clear that any theoretical analysis of performance should depend on the image $x[m,n]$ to be watermarked. On the other hand, since no key $K$ should be favored a priori (otherwise security would be compromised), the key should be treated as the only random variable in the model. With this in mind, it is possible to express the vector of observations $\mathbf{r}$ [1], [8] as

$$\mathbf{r} = \mathbf{A}\mathbf{b} + \mathbf{n}$$

where $\mathbf{b}$ contains the information bits, $\mathbf{A}$ is a deterministic diagonal matrix with elements

$$a = \frac{1}{L} \sum_{m,n} h_{0,0}[m,n]\, \alpha^2[m,n] \tag{4}$$

and $\mathbf{n}$ is a zero-mean uncorrelated Gaussian random vector. Furthermore, for $L$ moderately large, the noise vector can be approximated as Gaussian with a diagonal covariance matrix with diagonal term

$$\begin{aligned}
\gamma ={}& \frac{1}{L} \sum_{m,n} \alpha^2[m,n]\, x_f^2[m,n] + b_i^2\, \frac{1}{L} \sum_{m,n} h_{0,0}^2[m,n]\, \alpha^4[m,n]\, (E[s^4] - 1) \\
&+ b_i^2 \cdot \frac{1}{L} \sum_{(k,l) \neq (0,0)}\ \sum_{m,n} h_{k,l}^2[m,n]\, \alpha^2[m,n] \cdot \alpha^2[m-k,n-l] \\
&+ b_i^2\, \frac{L-1}{L^2} \sum_{m,n} h_{0,0}^2[m,n]\, \alpha^4[m,n]
\end{aligned} \tag{5}$$

where $x_f[m,n]$ is the image filtered by $h_{k,l}[m,n]$.

III. Channel Coding

One immediate way of improving the performance of a data hiding system is through the introduction of coding. Here, we treat the case in which codes are used at bit level, although other possibilities (e.g., at pixel level, at symbol level, etc.) exist. Under the uncorrelated Gaussian assumption, it is clear that the bit error probability $P_b$ is just

$$P_b = Q(a / \sqrt{\gamma}) \tag{6}$$

Unfortunately, an improvement over the uncoded performance is not so obvious, since it is well known that coding may perform badly at low SNRs (such as the current case), so diversity (i.e., $L$ sufficiently large) is necessary. On the other hand, with $L$ large, the amount of information that can be hidden is reduced, so good codes are mandatory in order to reconcile these two requirements. In any case, the overall performance will be determined by two factors: the minimum distance and the redundancy of the code.

With hard decoding, we have a binary symmetric channel (BSC) with transition probability $P_b$, so it is possible to approximate the bit error probability for the coded case $P_c$ as [9]

$$P_c \approx \frac{1}{n} \sum_{m=t+1}^{n} m \binom{n}{m} P_b^m (1 - P_b)^{n-m} \tag{7}$$

where an $(n, k)$ block code with a $t$-error correcting capability has been assumed. Using (7) it is possible to determine approximately the minimum number of pixels per pulse $L$ for which $P_c = P_b$ or, in other words, the value of $L$ that renders coding profitable. First, note that with $P_c = P_b$ in (7), a polynomial equation results. The solution $P_b^*$ to this equation is used to invert (6), thus resulting in a value of $a/\sqrt{\gamma}$ from which $L$ can be derived. To consider a particularly simple yet illustrative case, let the pulses $s[m,n]$ be designed so as to yield $E[s^4] = 1$ and let the filter be $h_{k,l}[m,n] = h_{0,0}[m,n]\, \delta_{k,l}$. Then,

$$\frac{a}{\sqrt{\gamma}} \approx \frac{L^{-1/2} \sum_{m,n} h_{0,0}[m,n]\, \alpha^2[m,n]}{\left( \sum_{m,n} \alpha^2[m,n]\, x_f^2[m,n] + h_{0,0}^2[m,n]\, \alpha^4[m,n] \right)^{1/2}}$$

and the value of $L$ that solves the equation is readily obtained. It is important to remark that this value of $L$ represents the minimum number of pixels per coded bit for which coding is advantageous. A fair comparison between codes results if the number of pixels per information bit is considered instead; this number is just $L \cdot n/k$.

IV. Attacks

The watermarked image may suffer different attacks aimed at deleting or corrupting the watermark. The objective of the attacker is to transform the watermarked image into a perceptually equivalent image with a high probability of yielding a negative result in the watermark detection test and/or a high probability of error in the information decoding process. In both cases the attacker will try to degrade the vector $\mathbf{r}$ so as to decrease the probability of detection $P_D$ and increase the probability of error $P_b$.

The model given in Section II can be used to analyze the impact that some attacks will have on the performance of the system. For instance, a linear filtering attack can be readily modeled by considering a space-variant linear filter $h_{k,l}[m,n]$. An example of application would be the use of a Wiener filter to estimate the original image (and, equivalently, the watermark). Wiener filtering could also be used by an attacker in order to estimate the watermark, which in turn can be subtracted from the watermarked image. See Section V.

The effect of cropping can also be studied by considering it as equivalent to a reduction in pulse size, assuming that cropping affects all the pulses in a similar fashion, provided that they are spread over the entire image. In fact, this is an important justification for the use of spread pulses.

The case of adding a perceptually invisible zero-mean white noise can also be included in the model. If the noise variance at pixel $(m, n)$ is $\sigma_n^2[m,n]$, then the result would be the same as in (6) with the term $\left( \sum_{m,n} \alpha^2[m,n]\, \sigma_n^2[m,n] \right) / L$ added to the variance $\gamma$.

To conclude this section, two comments are in order. First, although JPEG compression is probably one of the most frequent unintentional attacks, given its transform-domain nature it is not easily embraced by our model. Work is in progress to deal with this type of attack through a combination of block filtering and quantization noise, but only rough approximations can be expected.
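The effect of the additive-noise term on Eq. (6), together with the coded approximation of Eq. (7), can be explored numerically with the Python standard library. The sketch below rests on assumptions that are not in the paper: it supposes that $a/\sqrt{\gamma}$ grows as $c\sqrt{L}$ for a hypothetical per-pixel constant `c` (which is what the simplified expression for $a/\sqrt{\gamma}$ suggests when the sums grow linearly with $L$), and it compares coded and uncoded schemes at the same number of pixels per information bit. The value $t = 5$ used for BCH(63,36) is the standard error-correcting capability of that code, not a figure quoted in the text.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pb_with_noise(a, gamma, noise_term=0.0):
    """Eq. (6), with the additive-noise attack term of Section IV added to gamma."""
    return q_function(a / math.sqrt(gamma + noise_term))

def coded_ber(pb, n, t):
    """Eq. (7): approximate bit error rate of an (n, k) block code correcting t errors."""
    total = sum(m * math.comb(n, m) * pb ** m * (1.0 - pb) ** (n - m)
                for m in range(t + 1, n + 1))
    return total / n

def ber_at_budget(c, pixels_per_info_bit, n=None, k=None, t=None):
    """BER for a fixed pixel budget per information bit (assumes a/sqrt(gamma) = c*sqrt(L))."""
    if n is None:                                   # uncoded: all pixels go to one bit
        return q_function(c * math.sqrt(pixels_per_info_bit))
    L = pixels_per_info_bit * k / n                 # pixels left for each coded bit
    return coded_ber(q_function(c * math.sqrt(L)), n, t)

if __name__ == "__main__":
    for budget in (200, 1000, 4000):
        print(budget, ber_at_budget(0.05, budget), ber_at_budget(0.05, budget, 63, 36, 5))
```

Under these assumptions the coded scheme loses to the uncoded one at small pixel budgets and wins at large ones, which is the qualitative behavior the text describes; the crossover location depends entirely on the assumed constant `c`.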
A second type of simple attack is a transformation (e.g., scaling, rotation) of the watermarked image that breaks the synchronization between the original and watermarked images. The problems associated with synchronization will be discussed in Section VI.

V. Experimental Results on Coding

In Figure 2 we plot the bit error rate (BER) as a function of the number of pixels per information bit when Wiener filtering is performed prior to detection to eliminate part of the noise due to the original image. In this case, the watermarked image has not suffered any attack. The image "Lena" with 256x256 pixels has been used. The value of $n_{visib}$ in (3) was 1 and $a$ was set to 0.0028. Four cases have been considered: uncoded (i.e., binary antipodal) and the codes BCH(63,10), BCH(63,36) and BCH(63,57). All the empirical curves have been obtained by averaging over 100 keys. As can be seen in the figure, the best results are obtained with the BCH(63,36) code, which is somewhat in between the other two. This illustrates the fact that the best codes are not necessarily those which achieve higher redundancy. Indeed, it is known [10] that the best rates for BCH codes are those between 1/3 and 3/4, as is accomplished by the BCH(63,36) code.

Fig. 2 Bit error rate versus modulation pulse size for the uncoded and BCH coded cases when Wiener preprocessing and no attack is performed.

Figure 3, obtained under the same experimental conditions, is devoted to showing the goodness of the approximation given in (7). To this end, the theoretical BER has been computed using (7) when $P_b$ is calculated through the empirical value that the uncoded case provides. This confirms the validity of the Gaussian and BSC assumptions. Thus, the discrepancy observed in Figure 2 is only due to small errors in the approximations given in (4) and (5). In particular, note that the coefficients used in the Wiener preprocessing have to be computed with the watermarked image, so they are also dependent on $s[m,n]$. Thus, the true values for the mean and variance of the decision variables are different from those given in (4) and (5), because it is assumed there that the filter $h_{k,l}[m,n]$ is independent of the watermark. Similar experiments, not shown here, have been performed with fixed (e.g., lowpass) preprocessing filters, producing more accurate results.

Fig. 3 Bit error rate versus modulation pulse size for the uncoded and BCH coded cases when Wiener preprocessing and no attack is performed. The theoretical curves are computed with the $P_b$ obtained for the uncoded case.

In order to illustrate the degradation in performance suffered by an attacked image, we show in Fig. 4 the results obtained when the image Lena is attacked with additive noise. This noise has been perceptually scaled using $\alpha^2[m,n]$ as in (3) but with a visible level, $n_{visib} = 8$. Again, the BCH(63,36) code showed the best performance. In fact, of the many codes tested by the authors, this one achieved superior results.

Fig. 4 Bit error rate versus modulation pulse size for the uncoded and BCH coded cases when the image is attacked with worst case additive noise and preprocessed by Wiener filtering.

As can be noted in all the examples presented above, there is a certain number of pixels per information bit above which coding is worthwhile. Whether this value is achievable depends on the size of the image and the number of information bits that one wishes to hide.

VI. Synchronization: The True Bottleneck

As was stated previously, although synchronization attacks are fairly simple, they hit directly at the core of watermarking systems. For instance, with image editing and drawing programs anybody would be able to scale images just by mouse dragging. Thus, scaling should be recognized as a very frequent (and mostly unintentional) attack. Since in theory scaling implies a linear resampling operation, it becomes clear that the watermark will suffer the same transformation. Then, it is not difficult to see that the correlation between the (possibly preprocessed) scaled and marked image and the watermark will no longer exhibit a peak that would result in a positive detection.

Of course, with the original image being available, it would be reasonable to estimate the scale factor from the two images [11], but we have already discarded this mode of operation as unrealistic. Obviously, another way of determining the scaling factor would be through exhaustive search, but this would be not only computationally prohibitive but also unlikely to work if white pulses are used because, with a very narrow autocorrelation function, it is highly probable that the true scaling factor is missed.

Figure 5 shows the central portion of the autocorrelation function corresponding to one of the watermarks that has been used for the experiments in the previous section. In Figure 6 we show the cross-correlation between the Wiener processed image and the watermark. Note that although the peak level has decreased and the noise variance has risen, it is evident that the watermark presence would be correctly detected. However, both peaks are likely to be missed if the watermarked image is (even slightly) scaled. As a possible solution, wider pulses could be used but, as the following experiments illustrate, with no better results.

Fig. 5 Watermark autocorrelation function. [Surface plot of the central portion of the autocorrelation.]

Fig. 6 Cross correlation between watermarked image and watermark. [Surface plot; panel title: cross correlation watermark-preprocessed image.]

We have watermarked a 128x128-pixel portion of the Lena image and the resulting image has been scaled with factors ranging from 0.78 to 1.40. Figure 7 shows the peak value of the cross-correlation function averaged over 50 different keys for three types of pulses $s[m,n]$: 1) pulses for which the sets $S_i$ consist of individually addressable pixels (this is the case we have studied so far); 2) pulses for which the sets $S_i$ consist of blocks of 3x3 pixels and $s[m,n]$ assumes the same value along every block; and 3) pulses with the same structure as in 2) but with blocks of 6x6 pixels. Unfortunately, the use of coarser pulses worsens the performance (peak values are lowered), with the added cost of a reduction in the capacity for information hiding. Moreover, when a single realization of the three functions above (i.e., a single key) is considered, which would be the usual case in detection, even worse results are obtained.

Fig. 7 Maxima of the cross-correlation for different scaling factors. [Plot: maximum of the cross-correlation vs. scaling factor; curves for individual pulses (solid), 3x3 blocks (dashed) and 6x6 blocks (dash-dot).]

A similar experiment was repeated with rotations of the watermarked image ranging from -0.1 to 0.1 radians. The results, averaged over 50 keys, are presented in Fig. 8, where again the use of coarser pulses would not be advisable since the peak values tend to decrease.

Fig. 8 Maxima of the cross-correlation for different rotation factors. [Plot: maximum of the cross-correlation vs. rotation angle (rads); curves for individual pulses (solid), 3x3 blocks (dashed) and 6x6 blocks (dash-dot).]

Another interesting idea that might help to solve the synchronization problem is the creation of spectral lines in the watermark that could be used to determine the scaling factor. Needless to say, this procedure substantially reduces the robustness of the watermark to attacks, but even disregarding this important fact, the results are not any better. The implementation of the spectral line method is as follows: first a watermark is created with 5x5 supporting blocks. Of these blocks, only the 3x3 upper-left zone takes non-zero values. Thus, the watermark is forced to zero at periodic positions, so when its 2-D DFT is computed, spectral lines at the fundamental frequency and harmonics are found. This is clearly shown in Figure 9, where one of the dimensions of the 2-D DFT is considered (with the other set to zero). Unfortunately, when the watermark is embedded into an image, the estimate is so poor that spectral lines are no longer found (Fig. 10). This impairment is attributable to the low SNR per pixel that appears when the watermark is estimated. Note that the reason why the detection/decoding method works much better is the existence of diversity brought about by the cross-correlation computation. Here, the watermark has to be estimated on a pixel-by-pixel basis.
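The loss of the correlation peak under scaling, discussed earlier in this section, can be illustrated with a deliberately simplified NumPy sketch. It is not a reproduction of the experiments behind Fig. 7: there is no host image and no Wiener preprocessing, the watermark is a plain white $\pm 1$ pattern, nearest-neighbor index mapping stands in for the linear resampling mentioned in the text, and the correlation is circular. The function names are hypothetical.

```python
import numpy as np

def rescale_nearest(img, factor):
    """Scale img by `factor` with nearest-neighbor lookup, keeping the original size."""
    rows, cols = img.shape
    # Output pixel m is read from input pixel floor(m / factor), clipped to the image
    r_idx = np.minimum((np.arange(rows) / factor).astype(int), rows - 1)
    c_idx = np.minimum((np.arange(cols) / factor).astype(int), cols - 1)
    return img[np.ix_(r_idx, c_idx)]

def correlation_peak(w, z):
    """Maximum of the circular cross-correlation of z with w, normalized by the energy of w."""
    xc = np.fft.ifft2(np.fft.fft2(z) * np.conj(np.fft.fft2(w))).real
    return xc.max() / np.sum(w * w)

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    w = rng.choice([-1.0, 1.0], size=(64, 64))      # white watermark, one pixel per chip
    for factor in (0.9, 1.0, 1.05, 1.1):
        print(factor, correlation_peak(w, rescale_nearest(w, factor)))
```

Even in this idealized setting the normalized peak drops sharply as soon as the factor departs from 1, because only a small run of pixels stays aligned with the reference pattern at any single displacement.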

Fig. 9 1-D DFT of the watermark showing the spectral lines. [Plot: spectrum of the watermark (w1 = 0), logarithmic vertical axis, versus frequency w2 (rads).]

Fig. 10 1-D DFT of the estimated watermark showing no apparent spectral lines. [Plot: spectrum of the Wiener estimate (w1 = 0), logarithmic vertical axis, versus frequency w2 (rads).]

VII. Discussion and Conclusions

In this paper we have shown how, through accurate theoretical modeling of the watermarking problem, it is possible to provide coding schemes that result in a substantial improvement in the performance of the system. Moreover, with a theoretical analysis such as the one presented, it is possible to know beforehand not only the expected probability of error but also whether a particular coding scheme is worthwhile given the size of the image and the amount of information that the user wants to hide.

The performance analysis is carried out by considering the keys as the only random variables. Previous works do not embed watermarking in a statistical framework. With this approach it is possible to define absolute quality measures that could be used in future testbeds. In fact, the research community should explore the possibility of defining such testbeds (which should also include objective masking constraints) that would allow a comparison among the many existing watermarking schemes.

The performance measure that has been extensively used throughout the paper is the probability of bit error $P_b$. We refer the interested reader to [4], where the improvement that coding produces in the watermark detection process is quantified theoretically. Also, the results presented here apply with some modifications to the case of convolutional codes. This is also analyzed in some detail in [4].

Last, we have barely scratched the surface of the synchronization problem. This difficulty is often overlooked or underestimated but, in the authors' opinion, it presently constitutes the major problem in image watermarking. A simple cropping or rescaling (even unintentional) will dramatically affect the performance of the system as measured by $P_b$ and $(P_F, P_D)$. Through different examples, we have shown the underlying difficulties in the synchronization problem: conventional synchronization algorithms are unlikely to work in this case, so further work is necessary in this direction.

Acknowledgments

The authors acknowledge Gustavo Nieto for the programming of the examples that appear in the section on synchronization.

References

[1] J. R. Hernández, F. Pérez-González, J. M. Rodríguez, and G. Nieto, "Performance analysis of a 2D-multipulse amplitude modulation scheme for data hiding and watermarking of still images," to be published in IEEE J. Select. Areas Commun., April 1998.

[2] B. Zhu, M. D. Swanson, and A. H. Tewfik, "A transparent robust authentication and distortion measurement technique for images," in Proc. IEEE Digital Signal Processing Workshop, (Loen, Norway), pp. 45-48, September 1996.

[3] M. D. Swanson, B. Zhu, and A. H. Tewfik, "Transparent robust image watermarking," in Proc. IEEE Int. Conf. on Image Processing, vol. III, (Lausanne, Switzerland), pp. 211-214, September 1996.

[4] J. R. Hernández, J. M. Rodríguez, and F. Pérez-González, "Improving the performance of spatial watermarking of images using channel codes," submitted to Signal Processing, March 1998.

[5] J. R. Hernández and F. Pérez-González, "Throwing more light on image watermarks," in Workshop on Information Hiding, (Portland, U.S.A.), April 1998.

[6] J. S. Lim, Two-Dimensional Signal and Image Processing. New Jersey: Prentice-Hall, 1990.

[7] A. Netravali and B. Haskell, Digital Pictures. Representation, Compression and Standards. New York: Plenum Press, 1995.

[8] J. R. Hernández, F. Pérez-González, J. M. Rodríguez, and G. Nieto, "Data hiding for copyright protection of still images," in Emerging Techniques for Communication Terminals, (Toulouse, France), pp. 285-289, ENSEEIHT, July 1997.

[9] B. Sklar, Digital Communications. Fundamentals and Applications. New Jersey: Prentice-Hall, 1988.

[10] J. Wozencraft and I. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965.

[11] I. J. Cox, J. Kilian, T. Leighton, and T. Shamoon, "Secure spread spectrum watermarking for multimedia," Tech. Rep. 9510, NEC Research Institute, Princeton, NJ, USA, 1995.