Progressive Quantized Projection Approach to Data Hiding

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 15, NO. 2, FEBRUARY 2006


Masoud Alghoniemy and Ahmed H. Tewfik, Fellow, IEEE

Abstract—A new image data-hiding technique is proposed. The proposed approach modifies blocks of the image after projecting them onto certain directions. By quantizing the projected blocks to even and odd values, one can represent the hidden information. The proposed algorithm performs the modification progressively to ensure successful data extraction without the need for the original image at the receiver side. Two techniques are also presented for correcting scaling and rotation attacks. The first approach is exhaustive in nature and is based on a training sequence inserted as part of the hidden information. The second approach uses wavelet maxima as image semantics for rotation and scaling estimation. Both algorithms prove effective in correcting rotation and scaling distortion.

Index Terms—Image watermarking and data hiding, multiscale edge detection, synchronization recovery.

I. INTRODUCTION

THE development of the Internet and other digital transmission channels has created the need for methods that protect transmitted information as well as provide new services. In the digital domain, perfect copying, modification, and redistribution of information are possible, so it became necessary to find ways to protect transmitted information from being copied or tampered with. In this paper, images are used as an example of such digital information. Data hiding provides a solution to this problem: it is the process of inserting invisible information into the image in order to perform certain tasks. These tasks include proof of authenticity, copyright protection, detection and correction of possible tampering, prevention of unlawful copying, and covert communication. Several excellent data-hiding tutorials have been published in the past few years; the interested reader is referred to [1]–[3]. Hiding information in an image is performed either in the spatial domain or in a transform domain; the idea is to modify the image in an invisible manner. Transform-domain embedding is performed by modifying coefficients in the frequency domain [4], [5], the DCT domain [6], [7], or the wavelet domain [8], [9].

Manuscript received July 21, 2003; revised December 22, 2004. This work was supported by the AFRL under Grant AF/F30602-98-C-0176. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Gaurav Sharma. M. Alghoniemy is with the Department of Electrical Engineering, University of Alexandria, Alexandria 21544, Egypt (e-mail: [email protected]). A. H. Tewfik is with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/TIP.2005.860318

In the spatial domain, embedding is performed by modifying the statistics of the image [10], [11], by spread-spectrum techniques [10], [12]–[14], or by a simple additive method [15]. Quantized-projection embedding techniques for data hiding have also been exploited in the literature in different contexts [16]–[18]. The simplest form is called low-bit(s) modulation (LBM), in which the least significant bit(s) in the quantization of the host signal are replaced by a binary representation of the embedded signal [10], [19]. A class of quantized projection approaches is called quantization index modulation (QIM), which includes generalized LBM, spread LBM, spread-transform dither modulation (STDM), and distortion-compensated QIM (DC-QIM) [20], [21]. Transparency of the embedded data is guaranteed by using a visual masking model in either the spatial or the transform domain [22], [23]. In watermarking applications, the robustness of the embedded data to intentional and unintentional attacks is of great importance [24], [25]. Attacks range from simple signal processing operations, such as filtering and image coding like JPEG, to severe intentional ones such as rotation, scaling, cropping, and noise addition; the watermark should be able to survive these attacks. Rotation and scaling attacks have the effect of desynchronizing the hidden data: changing the image size and/or its orientation, even by a slight amount, can dramatically reduce the receiver's ability to retrieve the watermark. This can be compared to losing synchronization in a communication channel. Several attempts have been devoted to the synchronization recovery problem. These efforts include identifying geometric attacks by periodic insertion of the mark, as in Kalker et al. [26], who designed a periodic insertion of the mark to cope with the important problem of image shifting in video sequences. Delanay and Macq proposed a method to generate two-dimensional (2-D) patterns having cyclic properties to address image shifting problems [27]. Hartung et al. applied periodic insertion to perform synchronization and counter StirMark attacks [28]. Kutter proposed an insertion method for watermark detection after geometric affine transforms, in which a periodic mark is embedded in the luminance values of the image and a cross-correlation function of the image allows the localization of the different peaks generated by the periodic mark [29]. Template insertion was also used as a possible solution to regaining synchronization, from which the receiver can perform an exhaustive search to identify and reverse the attacks. Pereira and Pun proposed embedding templates in image components to identify the geometric transformation and enable synchronization of the mark [30]. Fleet and Heger [31] proposed
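To make the LBM baseline above concrete, the following minimal Python sketch replaces host LSBs with message bits. It illustrates the generic LBM idea of [10], [19], not the algorithm proposed in this paper; the helper names are our own.

```python
import numpy as np

def lbm_embed(host: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Replace the least significant bit of the first len(bits) host samples."""
    marked = host.copy()
    marked[: len(bits)] = (marked[: len(bits)] & 0xFE) | bits  # clear LSB, then set it
    return marked

def lbm_extract(marked: np.ndarray, n_bits: int) -> np.ndarray:
    """Read the hidden bits back from the least significant bits."""
    return marked[:n_bits] & 1

pixels = np.array([12, 200, 37, 90], dtype=np.uint8)
message = np.array([1, 0, 1, 1], dtype=np.uint8)
assert np.array_equal(lbm_extract(lbm_embed(pixels, message), 4), message)
```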


an approach in which the watermark is composed of a sum of sinusoids that appear as peaks in the frequency domain, which can be used to determine the geometric distortion. Invariant watermarks have also been explored in the literature, in which the watermark can be recovered even after geometric manipulation of the image. In particular, Ó Ruanaidh and Pun [4] used the DFT and the Fourier–Mellin transform to design a watermark that is invariant to rotation, scaling, and translation (RST). In their approach, the DFT of the image is computed, the Fourier–Mellin transform is performed on its magnitude, and the watermark is embedded in the magnitude of the resulting transform; the watermarked image is reconstructed by performing the inverse transforms after restoring the original phase. Image semantics have also been used as features for implementing feature-based invariant watermarks, where feature points extracted from the image contents serve as references linked with the watermark. Edges and corners have been used as salient points; a Delaunay tessellation is performed on the points, and the mark is embedded inside each triangle of the tessellation [32]. Segmentation-based image features have also been used to define feature points for mark embedding. Sun et al. developed a watermarking scheme based on image features extracted from the original and the marked image; the synchronization scheme performs a matching between the two sets of points and identifies the transformation [33]. Dittmann et al. designed a content-based watermarking method that does not require the original image and uses self-spanning patterns [34]. In this paper, the proposed embedding technique exploits an adaptive successive quantized projection strategy. The successive quantized projection method performs two operations, namely, projection and quantization. The image is divided into subblocks; here, we use blocks of size 8 × 8. Each block hides one bit by projecting it onto certain direction(s) and quantizing the projection to satisfy certain constraints. The eigenspace of the image is used as the projection space, and the even and odd spaces are the corresponding constraint: projections located in the even space correspond to one symbol, and those in the odd space correspond to the other symbol. The proposed approach is adaptive in nature. In particular, the subspaces that we project onto are image dependent, which means that the receiver extracts these directions from the image once it has been received; this leads to no storage requirement at all at the receiver end. Moreover, given any distortion of the image, both the projection directions and the image blocks undergo the same amount of distortion, which implies relative stability. This is different from the nonadaptive case, in which the directions do not depend on the image itself but are rather random [1], [20]. The proposed approach differs from other quantized projection techniques, such as QIM and the related algorithms (e.g., generalized LBM, STDM, and DC-QIM), in that the spreading vectors, or "projection spaces," therein are random, which prevents those schemes from being self-contained. In particular, the receiver needs to know the projection spaces beforehand in order to complete the detection process, which adds to the complexity of the receiver. Unlike the proposed embedding algorithm, schemes


that use the low-frequency DCT vectors as the projection spaces lack the flexibility of being adaptive [6]. Two different algorithms are proposed for synchronization recovery. By synchronization recovery, we mean recovering the original size and/or orientation in order for the detection algorithm to work properly. The proposed algorithms do not assume knowledge of the original image at the receiver side, which allows for blind detection; this has the advantage of reducing the receiver complexity and adding practicality to the designed system. The first method depends on an exhaustive search strategy: using a training sequence inserted in the image as part of the hidden information, the receiver performs an exhaustive search to lock onto the correct size and orientation. In the second approach, the scaling factor and the rotation angle are estimated from edge information, as salient points, computed from the wavelet maxima of the image. The scaling factor is approximated by the edges standard deviation ratio (ESDR), which measures the deviation of the edges from a reference point before and after scaling, while the rotation angle is approximated by the average edges angles difference (AEAD), which measures the average difference of the angles of the edge locations, in a predetermined region, before and after rotation. The choice of edges as features in our technique is motivated by the fact that they reflect both the size and orientation changes that occur as a result of geometric manipulations. A brief overview of the multiscale edge detection technique and the wavelet maxima computation used in this paper is presented in Section II. The proposed embedding algorithm is discussed in Section III. Synchronization recovery techniques are explained in Section IV. Finally, simulation results are presented in Section V.

II. MULTISCALE EDGE DETECTION

Multiscale edge detection was first proposed by Mallat [35], [36] and then used by Xu et al. for noise reduction in medical images [37]. Edges are detected by spatially correlating the estimated edges across the various scales after decomposing the image with the wavelet transform. In edge detection, the nonorthogonal wavelet transform is used, as opposed to its orthogonal counterpart, because the transformed signal in the orthogonal case is uncorrelated across scales, which makes detecting edges difficult. The nonorthogonal wavelet transform is also called the dyadic wavelet transform in the literature [38]. The use of the dyadic wavelet transform for signal decomposition yields an overcomplete representation: the wavelet functions are sampled across scale only, not in time, so at each decomposition stage the resulting signal has the same length as the original signal, which is advantageous from the edge-detection point of view. The dyadic wavelet transform of a function f(x) is

$$W_{2^j} f(b) = \frac{1}{2^j}\int_{-\infty}^{\infty} f(x)\,\psi\!\left(\frac{x-b}{2^j}\right)dx \qquad (1)$$


Fig. 1. Analysis filter bank for dyadic wavelet transform.

where $\psi$ is the corresponding nonorthogonal wavelet and $b$ is the spatial/delay parameter. The wavelet coefficients in (1) can be computed using a fast filter-bank algorithm [38]. For simplicity, we consider only the one-dimensional case, as it is easily generalized to the 2-D case using separable bases. Let $h$ be the discrete-time filter corresponding to the scaling function, $g$ the discrete-time filter corresponding to the wavelet function, and $a_0 = f$ the discrete-time signal at resolution $2^0$. Then the dyadic wavelet transform of $f$ can be represented as a convolutional process with a cascade filter bank. The coarse approximation produced by the decomposition stage at resolution $2^{j+1}$ is

$$a_{j+1}[n] = (a_j * h_j)[n] \qquad (2)$$

Fig. 2. Modulus maxima of the first four levels.

and the details are

$$d_{j+1}[n] = (a_j * g_j)[n] \qquad (3)$$

where $*$ denotes convolution and $h_j$ (respectively, $g_j$) is the filter obtained by inserting $2^j - 1$ zeros between each sample of $h$ (respectively, $g$) [38]. The analysis filter bank is illustrated in Fig. 1. Note that there is no downsampling after each decomposition stage, which accounts for the redundancy introduced by the dyadic wavelet transform. Multiscale edges are computed by spatially correlating the edges (maxima) across scales. Let $|W_{2^j} f(b)|$ be the modulus of the wavelet transform of a function $f$ at scale $2^j$, where $b$ is the spatial/delay parameter. Then a point $b_0$ is a modulus maximum if

$$|W_{2^j} f(b)| \le |W_{2^j} f(b_0)| \quad \text{for } b \text{ in a neighborhood of } b_0 \qquad (4)$$

For each scale, the points satisfying (4) are connected together, defining the wavelet maxima contour for that scale [38]. Fig. 2 shows the first four levels of the modulus maxima for the F16 image, computed using XWAVE [39]. Note that low levels preserve high-frequency information, while higher levels tend to preserve coarse information; this is well understood from the filter-bank formulation (2), (3). Since high-frequency edge information is not robust to the low-pass filtering that may be applied to the embedded image, the global wavelet maxima are computed by spatially correlating the corresponding high-level wavelet maxima to increase the robustness to low-pass filtering. Fig. 3 shows the global wavelet maxima constructed from levels 3 and 4 of the F16 image.
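For concreteness, the following Python sketch implements the filter-bank recursion (2)–(3) via zero insertion (the "algorithme à trous"). The two-tap Haar-like filter pair at the end is an arbitrary stand-in for illustration, not the quadratic-spline filters of [38].

```python
import numpy as np

def dilate(filt: np.ndarray, j: int) -> np.ndarray:
    """Insert 2**j - 1 zeros between the taps of filt (zero-insertion scheme)."""
    out = np.zeros((len(filt) - 1) * 2**j + 1)
    out[:: 2**j] = filt
    return out

def dyadic_wavelet(signal, h, g, levels):
    """Undecimated dyadic wavelet transform per Eqs. (2)-(3): no downsampling,
    so every approximation and detail sequence keeps the input length."""
    a = np.asarray(signal, dtype=float)
    details = []
    for j in range(levels):
        details.append(np.convolve(a, dilate(g, j), mode="same"))  # Eq. (3)
        a = np.convolve(a, dilate(h, j), mode="same")               # Eq. (2)
    return a, details

# Toy filter pair for illustration only.
h = np.array([0.5, 0.5])
g = np.array([0.5, -0.5])
approx, dets = dyadic_wavelet(np.sin(np.linspace(0, 4 * np.pi, 64)), h, g, 4)
assert all(len(d) == 64 for d in dets)  # same length at every level
```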

Fig. 3. Global wavelet maxima.

III. EMBEDDING ALGORITHM

The proposed approach is a projection followed by a quantization operation under certain constraints. In particular, consider $x$ to be a vector that corresponds to a certain 8 × 8 block of the image, and suppose we want this vector to represent a binary bit "1" or "0." The key idea is to project $x$ onto a subspace $S$ extracted from the image itself; thus, the space is image dependent. By dividing the corresponding space into two sets, $S_e$ and $S_o$, appropriately, one can hide data in each vector. This is accomplished by forcing the projection to be in $S_o$ if we want $x$ to represent "1," or in $S_e$ if we want $x$ to represent "0." $S_o$ is the set of all $x$ such that the projection of $x$ onto $S$ is odd, and $S_e$ consists of even projections. This can be done by quantizing the projection of $x$ to an even value to represent "0" and to an odd value to represent "1." In particular, let $P_S$ be the projection operator onto $S$; then the even and odd spaces $S_e$ and $S_o$ are defined as

$$S_e = \{x : P_S x \text{ is even}\}, \qquad S_o = \{x : P_S x \text{ is odd}\} \qquad (5)$$

Fig. 4. Iteration i.

Fig. 5. Singular values of the projection spaces for F16.

The reconstructed vector $\hat{x}$ should differ from $x$ by an invisible amount to ensure perceptual transparency. Since $S$ is image dependent, it should be constructed from the modified vectors $\hat{x}$ to guarantee perfect reconstruction. To be more specific, if $S$ were constructed from the unmodified vectors $x$ at the transmitter, the receiver would reconstruct it from the modified vectors $\hat{x}$, and this reconstructed space would be different from $S$; thus, perfect reconstruction would not be guaranteed. To overcome this problem, we use a successive data embedding scheme, as explained below.

A. Successive Projection

The main point in the projection process is to define the direction(s)/spaces onto which the projection is performed. The authors found that the space spanned by the first eigenvector of the correlation matrix of a certain subimage is robust to JPEG coding, rotation, and scaling. As mentioned before, a successive projection strategy is used; i.e., during the current iteration, the modified blocks are used together with previously modified ones as the new subimage for the next iteration. In particular, Fig. 4 describes the $i$th iteration for blocks of size 8 × 8. The principal subimage is the region from which the subspace $S_i$ is constructed; $S_i$ is used for all blocks located in the strip $B_i$, as illustrated in Fig. 4. In the $(i+1)$th step, the blocks to be modified are those in the strip $B_{i+1}$, and $S_{i+1}$ is constructed from the principal subimage formed by concatenating the previous principal subimage with the strip $B_i$. To describe how the subspace $S_i$ is constructed, let $R_i$ be the empirical correlation matrix estimated from the principal subimage in step $i$; then $S_i$ is the space spanned by the first $r$ columns of the matrix $U$, where

$$R_i = U \Sigma V^T \qquad (6)$$

is the singular value decomposition of the correlation matrix $R_i$. Let $P$ be the projection matrix onto the subspace $S_i$

, where $P = U_r U_r^T$ and $U_r$ denotes the first $r$ columns of $U$ [40]. Once the projection is performed, the quantization should be done in such a way that the block correctly represents the hidden bit: each block is projected onto $S_i$ and then quantized appropriately. Fig. 5 shows the 64 singular values of the corresponding spaces for the F16 image. As is clear from the figure, the energy is concentrated in the first eigenvector while the remaining vectors carry little energy, which justifies using only the first eigenvector as the projection space for the following two reasons. First, robustness increases due to the high energy concentration in the first eigenvector. Second, due to the ill-conditioned nature of the subspaces for $r > 1$, blocking artifacts are emphasized if spaces of higher dimensionality are used. Finally, from Fig. 5, we can also conclude that the proposed algorithm is very similar to block-DC modification, except that it is done in a progressive manner.

Choosing the Initial Block: The choice of the upper left block as the initial block in the iteration is arbitrary; in principle, it can be any block in the image. However, the drawback of choosing the upper left block is that it is vulnerable to a cropping attack: by cropping the borders of the image, the eigenspace formed at the receiver would differ from what was used at the transmitter, and the error would then propagate to the rest of the image. The effect of a cropping attack is that it shifts the coordinates, so choosing the initial block to be coordinate independent helps reduce this effect. Our choice is to use the wavelet maxima locations mean (MLM) as the reference point: the location of the initial block is determined by the MLM. The embedding process starts at the 8 × 8 block containing the MLM as the initial subspace, then projects and quantizes the surrounding blocks onto this initial subspace. The modified blocks are used as the new subspace for the outer, unmodified surrounding blocks, as before. Fig. 6(a) shows the MLM of the original Cameraman image, marked by "*." The receiver should be able to extract the embedded data after the image has been cropped by a forger. To do so, the receiver first estimates the MLM from the cropped image, uses it as its initial block, and then performs the detection process in a progressive manner, as described for the embedding process. Fig. 6(b) shows the corresponding wavelet maxima and the MLM, marked by "*," for the cropped Cameraman image. The original image size was 256 × 256, while the cropped image size is 211 × 211, which amounts to 15% cropping. Comparing Fig. 6(a) and (b), it is clear that the MLM has not been altered although the image has been cropped by 15%. It should be noted that cropping the image by more than this amount would change the location of the MLM.

Fig. 6. MLM for original and cropped Cameraman, marked by "*." (a) Original. (b) Cropped.
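As a minimal sketch of the subspace construction in (6), one iteration's dominant direction can be computed as follows. The function name and the (n, 64) layout of vectorized 8 × 8 blocks are our own assumptions.

```python
import numpy as np

def subspace_direction(strip_blocks: np.ndarray) -> np.ndarray:
    """strip_blocks: (n, 64) matrix whose rows are vectorized 8x8 blocks of the
    current principal subimage. Returns the dominant eigenvector of the
    empirical correlation matrix R_i = U S V^T of Eq. (6), i.e., the r = 1 case."""
    R = strip_blocks.T @ strip_blocks / len(strip_blocks)  # 64 x 64 correlation
    U, _, _ = np.linalg.svd(R)
    return U[:, 0]  # first eigenvector; the projection matrix is P = u u^T

# Example with random blocks standing in for a subimage strip.
rng = np.random.default_rng(0)
u = subspace_direction(rng.standard_normal((100, 64)))
assert np.isclose(np.linalg.norm(u), 1.0)
```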


Fig. 7. Quantization step.

B. Quantization

Even and odd quantization are used to represent "0" and "1," respectively. In other words, if the projection is quantized to an even number, the block represents "0"; if it is quantized to an odd number, it represents "1." The receiver performs the same process to recover the hidden information: each block is projected onto the corresponding subspace, and a rounding operation is performed in order to recover the hidden bit. Let $\epsilon$ be the rounding error; then, in order to correctly retrieve the hidden data, one must have $|\epsilon| < 1/2$. To achieve this bound, first rescale the projection by a scale factor $\Delta$ and then quantize it accordingly. Let $Q_e$ and $Q_o$ be the even and odd scalar quantizers, defined as

$$Q_e(x) = 2\,\mathrm{round}\!\left(\frac{x}{2}\right), \qquad Q_o(x) = 2\,\mathrm{round}\!\left(\frac{x-1}{2}\right) + 1 \qquad (7)$$
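A direct reading of (7) in Python; the tie-breaking behavior (round half to even) is an implementation choice not fixed by the paper.

```python
import numpy as np

def q_even(x: float) -> float:
    """Quantize to the nearest even integer."""
    return 2.0 * np.rint(x / 2.0)

def q_odd(x: float) -> float:
    """Quantize to the nearest odd integer."""
    return 2.0 * np.rint((x - 1.0) / 2.0) + 1.0

assert q_even(3.2) == 4.0 and q_odd(3.9) == 3.0
```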

Fig. 7 illustrates the role of $\Delta$ in reducing the quantization error, where $x$ is the projected value and the corresponding quantized levels are as shown; it is assumed that $Q_e$ quantizes to the upper level while $Q_o$ quantizes to the lower level.

C. Embedding

Let $\hat{x}$ be the embedded vector; then, to embed a binary bit $b$, we use $Q_b$ (with $Q_0 = Q_e$ and $Q_1 = Q_o$) in the following manner:

$$\hat{x} = x + \Delta\left[Q_b\!\left(\frac{\langle x, u\rangle}{\Delta}\right) - \frac{\langle x, u\rangle}{\Delta}\right]u \qquad (8)$$

where $u$ is the unit vector spanning the projection space.

Here, masking is implicitly introduced by $\Delta$. In particular, there is a tradeoff between the robustness of the embedded data and its transparency: as the value of $\Delta$ increases, the robustness increases but the transparency decreases, and vice versa. In principle, $\Delta$ may take any integer value; typical values range from 1 to 20, and up to 30 under severe attacks. The value of $\Delta$ is chosen based on the anticipated severity of attacks and the required transparency. For example, in applications where the hidden information cannot be compromised, such as military situations, a high $\Delta$ is recommended, while in situations where attacks are performed unintentionally, such as transmission over noisy channels, low values can be chosen. To compute the embedded data rate, suppose that the image size is $N \times N$ and each 8 × 8 block hides one bit of information. Then the number of bits that can be hidden is $(N/8)^2 - 1$, since the initial block does not convey any information; it is the initial subspace. Thus, 1023 bits can be hidden in a 256 × 256 image and 4095 bits in a 512 × 512 image. Fig. 8(a) shows the embedded Pot image using $\Delta = 10$, while Fig. 8(b) illustrates a heavily watermarked image with $\Delta = 40$. The corresponding PSNR values w.r.t. the original image are 45 and 32.9 dB, respectively; in the latter case, blocking artifacts are visible due to the quantization effect.
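Putting the pieces together, one block's embed/extract step might look as follows. This is a sketch under our reconstructed reading of (7)–(8); the random direction and block below merely stand in for the eigenvector and an actual image block.

```python
import numpy as np

def q_even(x): return 2.0 * np.rint(x / 2.0)             # from the sketch after Eq. (7)
def q_odd(x):  return 2.0 * np.rint((x - 1.0) / 2.0) + 1.0

def embed_bit(x, u, bit, delta):
    """Move block x along the unit direction u so its rescaled projection lands
    on an even (bit 0) or odd (bit 1) integer, per Eqs. (7)-(8)."""
    p = float(x @ u) / delta                 # rescaled projection
    q = q_odd(p) if bit else q_even(p)
    return x + (q - p) * delta * u           # new projection is exactly q * delta

def extract_bit(x, u, delta):
    """Round the rescaled projection; its parity is the hidden bit."""
    return int(np.rint(float(x @ u) / delta)) % 2

rng = np.random.default_rng(1)
u = rng.standard_normal(64); u /= np.linalg.norm(u)      # stand-in for the eigenvector
block = 128 + 16 * rng.standard_normal(64)               # stand-in for an 8x8 block
assert extract_bit(embed_bit(block, u, 1, delta=10.0), u, 10.0) == 1
```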


Fig. 8. Demonstrating the effect of increasing Δ. (a) Embedded using Δ = 10. (b) Embedded using Δ = 40.

Fig. 9. Training sequence.

IV. SYNCHRONIZATION RECOVERY

In this section, we propose two techniques for correcting rotation and scaling attacks performed on the embedded image. The first approach is an exhaustive search technique, while the second estimates both the scaling factor and the rotation angle with the aid of wavelet maxima.

A. Exhaustive Search

The proposed embedding approach is robust to scaling and rotation provided that the distortion can be corrected using the developed synchronization recovery techniques. The proposed method is based on using a training sequence to recover synchronization. The receiver performs brute-force scaling and rotation operations on the attacked image; each time, the training sequence is extracted from its known location and


compared with the original one, which is known to the receiver. The correlation coefficient between the original and the extracted sequence is used as a measure of similarity between the corrected and the original image; ideally, synchronization is achieved when the correlation coefficient is one. Fig. 9 shows the training sequence and its corresponding location in the image. It is worth mentioning that the training sequence should be distributed all over the image in order to maximize the chance of recovery: if the training sequence occupied only a small part of the image, a localized attack, such as localized blurring, could destroy the whole sequence located in that portion of the image. However, this distribution comes at the expense of a lower embedding rate and a higher decoding time. It should be noted that the image size will be different before and after rotation because of edge deformation. For example, for an original image of size 256 × 256, after rotation by an angle and application of the inverse rotation, the final image size will be 240 × 240, which corresponds to (240/8)² − 1 = 899 total hidden bits. Thus, rotation decreases the data rate by construction. On the other hand, correcting scaling does not change the total number of bits, since edges are preserved. Fig. 10 shows the percentage bit-error rate (BER) of the retrieved data after angle correction and the corresponding correlation coefficient. In particular, after exhaustively searching for the correction angle for the originally rotated Lena image, the curve corresponding to the Lena image in Fig. 10 shows that the error is minimized when the attacked image is rotated back by the correction angle. As expected, the correlation coefficient equals one only in this case, which indicates correct attack identification. Similarly, the curve in Fig. 10 corresponding to the rotated Mandrill image has a minimum BER and a unity correlation coefficient when the image is rotated back by the correction angle. The training sequence used in the performed experiments is a sequence of all ones, located in the strip corresponding to a fixed iteration; in general, the length of the training sequence equals the number of blocks in that strip. Since this is a correlation-based synchronization recovery scheme, the sequence should be long enough to achieve an acceptable correlation value; the length used in the performed experiments gave acceptable performance. Exhaustive search is also used to correct scaling distortion.
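A sketch of the brute-force search loop in Python. Here `extract_training` is a hypothetical callback standing in for the paper's strip extractor, and we score with a normalized correlation (with a small guard term), since the all-ones training sequence makes the sample correlation coefficient degenerate.

```python
import numpy as np
from scipy.ndimage import rotate

def search_correction_angle(attacked, extract_training, reference, candidates):
    """For each trial angle, rotate back, extract the training bits from their
    known strip, and keep the angle whose extracted sequence best matches."""
    best_angle, best_score = None, -1.0
    for theta in candidates:
        trial = rotate(attacked, theta, reshape=False)
        seq = np.asarray(extract_training(trial), dtype=float)
        score = float(seq @ reference) / (
            np.linalg.norm(seq) * np.linalg.norm(reference) + 1e-12
        )  # normalized correlation as the similarity measure
        if score > best_score:
            best_angle, best_score = theta, score
    return best_angle, best_score
```

The same loop, with `rotate` replaced by a resizing operation, performs the scale search; as noted above, only the rotation search benefits from a BER trend that hints at the correct direction.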

Fig. 10. BER versus correction angle, Δ = 20.

Fig. 11. BER versus scaling factor correction, Δ = 20.

Fig. 12. ESD computation for the Cameraman image; (x̄, ȳ) is the MLM.

Fig. 11 illustrates the percentage BER and the corresponding correlation coefficient versus several scaling factors. For a 50% scaled Lena image, the BER is minimized when the image is scaled by a factor of 2, and the corresponding correlation coefficient equals one. Similarly, the F16 image, which was scaled by a factor of 2/3, is corrected when the scaling factor is 1.5. Examining Figs. 10 and 11 reveals an interesting observation. In the rotation correction case, the BER tends to decrease as the angle of rotation becomes closer to the correction angle; for example, the BER in Fig. 10 decreases as the search proceeds in the correct direction. This can be understood from the fact that the extracted blocks after a small rotation are effectively a cropped version of the original blocks occupying the same location. Hence, as the image is rotated in the correct direction, the extracted blocks become more correlated with the original ones, which means lower error values. This is not the case in scaling correction, where the error is not correlated with the search direction, because the extracted blocks after scaling have little to do with the original blocks occupying the same location. This gives rotation correction an advantage over scaling correction: the error gives a clue about the right search direction, which reduces the search space. Comparing the two curves in Fig. 10, one can see a difference in the rate at which the error decreases. This can be explained by the fact that the BER depends on the correlation between the rotated and the original blocks occupying the same location. Hence, if the training sequence is located in a region in which rotation affects this correlation, such as around edges and corners, one would expect different convergence error rates. In particular, the edge distributions of the Lena and Mandrill images are not the same.

B. Rotation and Scaling Parameters Estimation

In this section, we explain how wavelet maxima can be used to estimate the scaling factor and the angle of rotation of attacked images in order to regain synchronization.

1) Scaling Factor Estimation: ESDR: The scaling factor is estimated as the ESDR, which is computed by comparing the deviation of the maxima from a reference point, the MLM in our case, before and after scaling. Choosing the reference point to be the MLM makes the estimate more robust to cropping, as explained in Section III-A. The estimation is performed in two steps: a coarse estimate followed by a fine-tuning step.

Step 1: Coarse Estimate: In this step, the scaling factor is coarsely estimated near the true value by the ESDR. Let the wavelet maxima locations be $(x_i, y_i)$, $i = 1, \ldots, N$; then the edge standard deviation can be formulated as

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[(x_i - \bar{x})^2 + (y_i - \bar{y})^2\right]} \qquad (9)$$

where

$$(\bar{x}, \bar{y}) = \frac{1}{N}\sum_{i=1}^{N}(x_i, y_i) \qquad (10)$$

is the coordinate of the maxima locations mean and $N$ is the total number of wavelet maxima, as shown in Fig. 12. $\sigma$ is a measure of how much the edges deviate from the

MLM. The estimated scaling factor is then approximated by the ESDR

$$\hat{\lambda} = \frac{\sigma_s}{\sigma_o} \qquad (11)$$

where $\sigma_s$ and $\sigma_o$ are the edge standard deviations of the scaled and the original image, respectively. Note that, for practical implementation, the wavelet maxima are first normalized such that the maximum value is unity, and the wavelet maxima locations are determined according to a threshold. To be more specific, $(x, y)$ is declared a wavelet maximum location if

$$\bar{M}(x, y) \ge T \qquad (12)$$

where $\bar{M}(x, y)$ is the normalized wavelet maximum strength at $(x, y)$ and $T$ is a predetermined threshold chosen to avoid considering spurious edges. Fig. 13(a) and (b) shows the estimated scale factor for different scales for the F16 and the Cameraman images, respectively. The final scaling factor estimate is the average of the estimates for different thresholds. It is clear that the computed estimate gives an accurate indication of how much the image has been scaled. This estimate is then refined in the next step to find the exact scale.

Step 2: Fine Tuning: Scaling factors estimated in the previous step are coarse estimates of the true values. In order to find the exact scale, one needs to perform an exhaustive search around these estimates; the method explained in Section IV-A is adopted for this fine-tuning step. As an example, consider the Cameraman image, which was scaled by 75%; the estimated scale factor from Fig. 13(b) is 74.45%. We performed an exhaustive search around the coarse estimate until locking onto the true value. Table I displays the fine scales and the corresponding percentage BER as well as the correlation factor. It is clear from Table I that the BER decreases as the tuning moves toward the right scale factor, 75%, which gives an indication of the right direction.

Robustness of the scale estimate to cropping: The choice of the MLM as the reference point makes it coordinate independent, which has the advantage of being invariant to cropping of up to about 10%. To test the robustness of the scale estimate to cropping, the Cameraman, Mandrill, Barbara, and Lena images were cropped nonsymmetrically using the patterns shown in Fig. 14. The numbers on pattern B indicate blocks of size 8 × 8, which means that we cropped 24 pixels from the right, eight pixels from the top, and 16 pixels from the left of the image. The original image size was 256 × 256; hence, the size of the cropped image is 248 × 216. Pattern A was used for Cameraman and Mandrill; pattern B was used for Lena and Barbara. As shown in Table II, the ESDR is almost one, which demonstrates the robustness of the proposed scale estimate to cropping. It should be noted that the choice of cropping patterns as blocks of multiples of eight pixels is rather arbitrary; the estimate works as well even if the cropping is not a multiple of eight pixels, as long as the overall cropping is within 10%. This is due to the fact that wavelet maxima are image semantics and have nothing to do with the size of the cropped blocks.
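The coarse estimate of (9)–(11) is a few lines of numpy; the (N, 2) array of thresholded maxima locations per (12) is assumed as input.

```python
import numpy as np

def edge_std(maxima: np.ndarray) -> float:
    """Eq. (9): RMS distance of wavelet-maxima locations (rows of an (N, 2)
    array) from their mean, the MLM of Eq. (10)."""
    center = maxima.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum((maxima - center) ** 2, axis=1))))

def esdr(maxima_scaled: np.ndarray, maxima_original: np.ndarray) -> float:
    """Eq. (11): the edge-standard-deviation ratio estimates the scaling factor."""
    return edge_std(maxima_scaled) / edge_std(maxima_original)

pts = np.array([[10.0, 12.0], [40.0, 7.0], [25.0, 30.0], [5.0, 22.0]])
assert np.isclose(esdr(1.5 * pts, pts), 1.5)  # uniform scaling scales the ESD
```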

Fig. 13. Estimated scaling factor versus maxima threshold. (a) F16. (b) Cameraman.

TABLE I FINE SCALE TUNING AROUND 74%

2) Angle Estimation: AEAD: The same principle can be applied to estimate the angle of rotation, which is approximated by the AEAD. By comparing the angles of the wavelet maxima locations, say in the first quadrant, before and after rotation, the difference should equal the angle by which the image has been rotated. Note that in the angle computation, the center of the image is used as the reference point instead of the MLM, because the MLM is less robust to rotation than the center of the image. In the case of rectangular images, this also holds for the small rotation angles used in

Fig. 14. Cropping patterns.

TABLE II ESDR FOR CROPPED IMAGES

Fig. 16. Estimated rotation angle versus maxima threshold. (a) F16. (b) Cameraman.

Fig. 15. Angle computation for the Cameraman image; (x_c, y_c) is the center of the image.

the paper. Fig. 15 demonstrates the angle computation for the Cameraman image. The estimated angle of rotation is set equal to the average difference between the angles of the wavelet maxima in the first quadrant $Q_1$, before and after rotation (the AEAD)

$$\hat{\theta} = \frac{1}{N_1}\sum_{i=1}^{N_1}\left(\theta'_i - \theta_i\right) \qquad (13)$$

where $N_1$ is the total number of wavelet maxima in $Q_1$, $\theta_i$ is the angle of the wavelet maximum at location $(x_i, y_i)$ in the original image, and $\theta'_i$ is the corresponding angle for the rotated image. The receiver only needs to know the original angles $\theta_i$ to be able to estimate $\hat{\theta}$. As in the scaling factor estimation case, a wavelet maximum location is declared according to (12). The final angle estimate is the average of all estimated angles over a wide threshold range. Fig. 16(a) and (b) shows the estimated angles of rotation for the F16 and the Cameraman images, respectively.
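Equation (13) in numpy; we additionally assume the $i$th maxima before and after rotation have been put into correspondence, which the paper leaves implicit.

```python
import numpy as np

def aead(orig, rotated, center):
    """Eq. (13): average difference of the maxima angles about the image center,
    restricted to maxima whose original angle lies in the first quadrant."""
    t0 = np.arctan2(orig[:, 1] - center[1], orig[:, 0] - center[0])
    t1 = np.arctan2(rotated[:, 1] - center[1], rotated[:, 0] - center[0])
    q1 = (t0 >= 0) & (t0 <= np.pi / 2)            # first quadrant only
    return float(np.degrees(np.mean(t1[q1] - t0[q1])))

def rot(points, center, deg):
    """Helper: rotate points about center by deg degrees (counterclockwise)."""
    a = np.radians(deg)
    R = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    return (points - center) @ R.T + center

c = np.array([128.0, 128.0])
pts = np.array([[180.0, 150.0], [200.0, 170.0], [150.0, 200.0]])
assert np.isclose(aead(pts, rot(pts, c, 7.0), c), 7.0)  # recovers a 7-degree rotation
```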

Fig. 17. Image normalization.

We may notice that the estimate of the angle of rotation is not as accurate as the scale estimate; this can be explained by the sensitivity of the reference point to rotations of the maxima.

3) Normalization: It was assumed that the receiver knows the original edge standard deviation and the original maxima angles. This may not be practical, as the receiver may not have access to this information. A simple yet effective trick to overcome this difficulty is to perform a scale–angle normalization step before encoding and decoding; this idea is illustrated in Fig. 17. For example, for image normalization with the parameters of Table III, the required scale and angle corrections and the corresponding BER are listed in the table for different images.

TABLE III
NORMALIZING AT A FIXED REFERENCE SCALE AND ANGLE

Fig. 18. Robustness of eigenspace to JPEG. (a) F16. (b) Tea pot.

Fig. 19. Robustness of eigenspace to scaling. (a) Barbara. (b) Cameraman.

It should be noted that the relatively high BER is due to the inherent interpolation and data loss that accompany the scaling and rotation involved in the normalization process. Some remarks need to be emphasized at this point.
• It is well known that the complexity of the exhaustive search technique grows exponentially with the number of attacks. However, this can be reduced dramatically by combining the two techniques presented here, namely, geometric parameter estimation and exhaustive search. In particular, in order to regain synchronization, a coarse estimate of the scaling and rotation parameters is first obtained using wavelet maxima, as explained in Section IV-B; the exact parameters can then be obtained using the exhaustive search strategy illustrated

in Section IV-A. Doing so reduces the overall complexity of regaining synchronization.
• The use of wavelet maxima as feature points, as opposed to edges and corners generated by ordinary detectors, is motivated by the multiresolution nature of the wavelet transform, as illustrated in Section II. This adds to the stability of the feature-point generation, even after signal processing operations such as JPEG coding.
• Since scaling and rotation can be modeled as linear operators, a combination of rotation and scaling attacks is also linear. Moreover, since the wavelet transform is a linear transform, a combined scaling and rotation attack can be estimated and reversed appropriately.
• Although the proposed synchronization recovery algorithms work for both square and rectangular images, an aspect-ratio change attack defeats the parameter estimation process. This is because the parameter estimation algorithm treats both axes in the same manner; changing the scale of either axis alone makes the estimation fail.
• It should be noted that the use of the eigenspace as the projection space makes the algorithm less secure than techniques that use random directions as projection spaces. This is the price paid for having an image-dependent space.


Fig. 20. Robustness of eigenspace to rotation. (a) F16. (b) Cameraman.

V. PERFORMANCE

A. Robustness of the Eigenspace

Since the proposed embedding algorithm projects onto the eigenspace $S$ spanned by the first eigenvector(s) of the matrix $U$ in (6), it is natural to examine the robustness of the eigenspace to different attacks. In particular, we explore the robustness of $S$ to JPEG coding, rotation, and scaling. Let $S$ and $S'$ be the eigenspaces spanned by the first eigenvectors before and after performing the attacks, respectively. Then the angle $\theta$ between $S$ and $S'$, defined below, is used as a measure of the robustness of the eigenspaces under attacks [40]:

$$\theta = \arccos\frac{|u^T u'|}{\|u\|\,\|u'\|} \qquad (14)$$

where $u$ and $u'$ are arbitrary eigenvectors in $S$ and $S'$, respectively. The motivation for using the angle as a measure of robustness is that it indicates how much the space has been rotated by the performed attacks. This is meaningful because a rotated space means different projection values and, hence, more errors in the embedding results.
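A one-line check of (14) for the $r = 1$ case:

```python
import numpy as np

def eigenspace_angle_deg(u: np.ndarray, u_prime: np.ndarray) -> float:
    """Eq. (14): angle between the dominant eigenvectors before (u) and after
    (u') an attack; small angles mean a stable projection space."""
    c = abs(float(u @ u_prime)) / (np.linalg.norm(u) * np.linalg.norm(u_prime))
    return float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0))))

u = np.array([1.0, 0.0])
u_attacked = np.array([np.cos(0.05), np.sin(0.05)])  # slightly rotated direction
assert np.isclose(eigenspace_angle_deg(u, u_attacked), np.degrees(0.05))
```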

Fig. 21. BER versus Δ for different PSNR. (a) F16. (b) Tea pot.

Figs. 18–20 show $\theta$ for different attack severities: JPEG coding with different Q factors, scaling, and rotation. It is clear that the first eigenvector is highly robust to the performed attacks, which justifies its use as the projection space in the proposed embedding algorithm. Unfortunately, the eigenspace is not robust to blurring, and hence the embedding algorithm does not survive blurring attacks.

B. Robustness to Attacks

The robustness of the proposed embedding algorithm to AWGN, JPEG coding, and combined scaling and rotation has been tested. Fig. 21(a) and (b) illustrates the percentage BER versus $\Delta$ for different PSNR values for the F16 and Pot images, respectively. As can be seen from the figures, the BER decreases as $\Delta$ and the PSNR increase, as expected. Fig. 22(a) and (b) illustrates the performance of the embedding algorithm under JPEG coding for different Q factors for the F16 and Mandrill images, respectively. Robustness to combined scaling and rotation attacks is illustrated in Fig. 23(a) and (b) for the Cameraman and Lena images with $\Delta = 20$ and $\Delta = 30$, respectively. The figures show the percentage BER as a function of different scales and rotation angles after regaining synchronization. As is clear from the figures, the BER increases dramatically when the image is downscaled. This can be understood from the inherent data loss accompanying the downsampling

Fig. 22. BER versus Δ for different JPEG-Q. (a) F16. (b) Mandrill.

Fig. 23. BER after combined scaling and rotation attacks, after correction. (a) Cameraman, Δ = 20. (b) Lena, Δ = 30.

process. On the other hand, enlarging the image has little effect on the error, due to the redundancy introduced by the interpolation process. The performance of the algorithm as a function of the angle and scaling factor estimation errors is illustrated in Fig. 24. It is clear from the figure that the algorithm is sensitive to estimation error, which can be understood from the desynchronization produced by such error; hence, the fine-tuning step introduced in Section IV-B is important for maintaining a low BER. Finally, Fig. 25 shows the contour plot of the 2-D autocorrelation function of the original Pot image and the corresponding embedded watermark. It is clear that the watermark power spectrum¹ follows that of the original image, i.e., the power-spectrum condition is satisfied [41].

VI. CONCLUSION

A new data-hiding technique has been presented in this paper. Data is embedded by quantizing the projections of the 8 × 8 blocks onto eigensubspaces extracted from the image. The proposed data embedding algorithm assumes blind detection, where no overhead is required for detection. Two techniques are

¹The power spectrum (the Fourier transform of the autocorrelation) is not shown here because we found the autocorrelation function to be more informative than the power spectrum.

Fig. 24. BER versus angle and scaling factor estimation error for Cameraman with Δ = 30.

proposed for synchronization recovery: exhaustive search and scaling–rotation parameter estimation. The exhaustive search technique uses a pre-embedded training sequence to lock onto the right scale and orientation, while the scaling factors and rotation angles are estimated with the aid of wavelet maxima. Performance analysis and robustness tests were also presented.


Fig. 25. Contour plot of the autocorrelation of the watermark and the original Pot image.

ACKNOWLEDGMENT

The authors would like to thank Dr. G. Sharma and the anonymous reviewers for their help in improving the quality of this paper.

REFERENCES

[1] M. Swanson, M. Kobayashi, and A. H. Tewfik, "Multimedia data embedding and watermarking technologies," Proc. IEEE, vol. 86, no. 6, pp. 1064–1087, Jun. 1998.
[2] I. J. Cox, M. L. Miller, and A. L. McKellips, "Watermarking as communications with side information," Proc. IEEE, vol. 87, no. 7, pp. 1127–1141, Jul. 1999.
[3] F. A. P. Petitcolas, "Watermarking schemes evaluation," IEEE Signal Process. Mag., vol. 17, no. 5, pp. 58–64, Sep. 2000.
[4] J. O'Ruanaidh and T. Pun, "Rotation, scale and translation invariant digital image watermarking," in Proc. IEEE Int. Conf. Image Processing, vol. I, 1997, pp. 536–539.
[5] M. Ramkumar, A. N. Akansu, and A. A. Alatan, "A robust data hiding scheme for images using DFT," in Proc. IEEE Int. Conf. Image Processing, vol. 2, 1999, pp. 211–215.
[6] F. Alturki and F. Mersereau, "An oblivious robust digital watermarking technique for still images using the DCT phase modulation," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. 6, 2000, pp. 1975–1978.
[7] J. Kim and Y. Moon, "A robust wavelet-based digital watermarking using level adaptive thresholding," presented at the IEEE Int. Conf. Image Processing, 1999.
[8] D. Kundur and D. Hatzinakos, "A robust digital image watermarking method using wavelet-based fusion," in Proc. IEEE Int. Conf. Image Processing, vol. 1, 1997, pp. 544–547.
[9] X. Xia, C. Boncelet, and G. Arce, "A multiresolution watermark for digital images," presented at the IEEE Int. Conf. Image Processing, 1997.
[10] W. Bender, D. Gruhl, and N. Morimoto, "Techniques for data hiding," Tech. Rep., Media Lab., Mass. Inst. Technol., Cambridge, 1994.
[11] N. Nikolaidis and I. Pitas, "Copyright protection of images using robust digital signatures," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. 4, 1996, pp. 2168–2171.
[12] I. J. Cox, J. Killian, F. T. Leighton, and T. Shamoon, "Secure spread spectrum watermarking for multimedia," IEEE Trans. Image Process., vol. 6, no. 12, pp. 1673–1687, Dec. 1997.
[13] J. R. Smith and B. O. Comiskey, "Modulation and information hiding in images," in Proc. 1st Int. Workshop Information Hiding, Jun. 1996, pp. 207–226.
[14] J. R. Hernandez, F. Perez-Gonzalez, J. M. Rodriguez, and G. Nieto, "Performance analysis of a 2-D-multipulse amplitude modulation scheme for data hiding and watermarking of still images," IEEE J. Sel. Areas Commun., no. 5, pp. 510–524, May 1998.

[15] M. Kutter, F. Jordan, and F. Bossen, "Digital signature of color images using amplitude modulation," Proc. SPIE, pp. 518–526, 1997.
[16] M. D. Swanson, B. Zhu, and A. H. Tewfik, "Data hiding for video in video," in Proc. IEEE Int. Conf. Image Processing, vol. II, Piscataway, NJ, 1997, pp. 676–679.
[17] J. M. Barton, "Method and apparatus for embedding authentication information within digital data," U.S. Patent, Jul. 1997.
[18] K. Tanaka, Y. Nakamura, and K. Matsui, "Embedding secret information into a dithered multi-level image," in Proc. IEEE Military Communications Conf., 1990, pp. 216–220.
[19] A. Tirkel, G. Rankin, R. Schyndel, W. Ho, N. Mee, and C. Osborne, "Electronic watermark," in Proc. DICTA, Dec. 1993, pp. 666–672.
[20] B. Chen and G. W. Wornell, "Quantization index modulation: a class of provably good methods for digital watermarking and information embedding," IEEE Trans. Inf. Theory, vol. 47, no. 4, pp. 1423–1443, May 2001.
[21] B. Chen and G. W. Wornell, "An information-theoretic approach to the design of robust digital watermarking systems," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. 4, 1999, pp. 2061–2064.
[22] G. Legge and J. Foley, "Contrast masking in human vision," J. Opt. Soc. Amer., vol. 70, no. 12, pp. 1458–1471, 1990.
[23] M. Swanson, B. Zhu, and A. Tewfik, "Transparent robust image watermarking," in Proc. IEEE Int. Conf. Image Processing, vol. 3, 1996, pp. 211–214.
[24] F. Petitcolas, R. Anderson, and M. Kuhn, "Attacks on copyright marking systems," presented at the 2nd Workshop on Information Hiding, 1998.
[25] [Online]. Available: http://www.cl.cam.ac.uk/~fapp2/watermarking/stirmark
[26] T. Kalker, G. Depovere, J. Haitsma, and M. Maes, "A video watermarking system for broadcast monitoring," in Proc. SPIE, Jan. 1999, pp. 103–112.
[27] D. Delanay and B. Macq, "Generalized 2-D cyclic patterns for secret watermark generation," in Proc. IEEE Int. Conf. Image Processing, vol. II, Sep. 2000, pp. 77–80.
[28] F. Hartung, J. K. Su, and B. Girod, "Spread spectrum watermarking: malicious attacks and counter-attacks," in Proc. SPIE Security and Watermarking of Multimedia Contents II, vol. 3657, San Jose, CA, Jan. 1999, pp. 147–158.
[29] M. Kutter, "Watermarking resisting to translation, rotation and scaling," in Proc. SPIE: Multimedia Systems and Applications, vol. 3528, Boston, MA, Nov. 1998, pp. 423–431.
[30] S. Pereira and T. Pun, "Fast robust template matching for affine resistant image watermarking," in Proc. Int. Workshop on Information Hiding (Lecture Notes in Computer Science, vol. 1768), Dresden, Germany, Sep. 1999, pp. 200–210.
[31] D. Fleet and D. Heger, "Embedding invisible information in color images," in Proc. Int. Conf. Image Processing, vol. 1, 1997, pp. 532–535.
[32] P. Bas, J.-M. Chassery, and B. Macq, "Geometrically invariant watermarking using feature points," IEEE Trans. Image Process., vol. 11, no. 9, pp. 1014–1028, Sep. 2002.
[33] Q. Sun, J. Wu, and R. Deng, "Recovering modified watermarked image with reference to original image," Proc. SPIE, pp. 415–424, Jan. 1999.
[34] J. Dittmann, T. Fiebig, and R. Steinmetz, "New approach for transformation-invariant image and video watermarking in the spatial domain: self-spanning patterns (SSP)," in Proc. SPIE Electronic Imaging 2001: Security and Watermarking of Multimedia Content III, San Jose, CA, Jan. 2001, pp. 176–186.
[35] S. Mallat and S. Zhong, "Characterization of signals from multiscale edges," IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, no. 7, pp. 710–732, Jul. 1992.
[36] S. Mallat and W. L. Hwang, "Singularity detection and processing with wavelets," IEEE Trans. Inf. Theory, vol. 38, no. 2, pt. 2, pp. 617–643, Mar. 1992.
[37] Y. Xu, J. B. Weaver, D. M. Healy, and L. Jian, "Wavelet transform domain filters: a spatially selective noise filtration technique," IEEE Trans. Image Process., vol. 3, no. 6, pp. 747–758, Nov. 1994.
[38] S. Mallat, A Wavelet Tour of Signal Processing. Chestnut Hill, MA: Academic, 1998.
[39] [Online]. Available: ftp://cs.nyu.edu/pub/wave/software/
[40] G. Golub and C. Van Loan, Matrix Computations. Baltimore, MD: Johns Hopkins Univ. Press, 1989.
[41] J. K. Su and B. Girod, "Power-spectrum condition for energy-efficient watermarking," in Proc. IEEE Int. Conf. Image Processing, vol. 1, 1999, pp. 301–305.


Masoud Alghoniemy received the B.Sc. and M.Sc. degrees from the University of Alexandria, Alexandria, Egypt, and the Ph.D. degree from the University of Minnesota, Minneapolis, in 1993, 1996, and 2001, respectively. He was a Senior Audio Coding Engineer at iBiquity Digital Corporation, Warren, NJ, from April 2001 to March 2002. He was with Intel Corporation, Santa Clara, CA, as an intern in the summer of 2001. Since April 2002, he has been an Assistant Professor with the Electrical Engineering Department, University of Alexandria. His research interests include adaptive signal representation, multimedia signal processing (including watermarking and data hiding, and content-based retrieval), and source coding. Dr. Alghoniemy was the recipient of the Young Scientist Award from the General Assembly of the International Union of Radio Science (URSI), Lille, France, in 1996. He is also listed in the 23rd edition of the Marquis Who’s Who in the World.


Ahmed H. Tewfik (F’96) was born in Cairo, Egypt, on October 21, 1960. He received the B.Sc. degree from Cairo University in 1982 and the M.Sc., E.E., and D.Sc. degrees from the Massachusetts Institute of Technology, Cambridge, in 1984, 1985, and 1987, respectively. He was with Alphatech, Inc., Burlington, MA, in 1987. He is currently the E. F. Johnson Professor of Electronic Communications with the Department of Electrical Engineering, University of Minnesota, Minneapolis. He served as a Consultant to MTS Systems, Inc., Eden Prairie, MN, and Rosemount, Inc., Eden Prairie, and worked with Texas Instruments and Computing Devices International. From August 1997 to August 2001, he was the President and CEO of Cognicity, Inc., an entertainment marketing software tools publisher that he co-founded, on partial leave of absence from the University of Minnesota. His current research interests are in signal processing for pervasive datanomic computing, multimedia, and high-performance wireless networks. Prof. Tewfik was a Distinguished Lecturer of the IEEE Signal Processing Society from 1997 to 1999. He received the IEEE Millennium award in 2000. He was invited to be a Principal Lecturer at the 1995 IEEE EMBS summer school. He was awarded the E. F. Johnson Professorship of Electronic Communications in 1993, a Taylor Faculty Development Award from the Taylor foundation in 1992, and a National Science Foundation Research Initiation Award in 1990. He delivered plenary lectures at several IEEE and non-IEEE meetings, including the 1994 IEEE International Conference on Acoustics, Speech, and Signal Processing; the 1999 IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing; the 1999 IEEE Turkish Signal Processing Conference; the 1st IEEE International Symposium on Signal Processing and Information Theory in 2001; the SSGRR2002w International Conference on Advances in Infrastructure for Electronic Business, Science, and Education on the Internet; the 2003 European Union COST meeting; and the 10th IEEE International Conference on Electronics, Circuits, and Systems. He gave invited tutorials on ultrawideband communications at the 2003 Fall IEEE Vehicular Technology Conference, on watermarking at the 1998 IEEE International Conference on Image Processing, and on wavelets at the 1994 IEEE Workshop on Time-Frequency and Time-Scale Analysis. He was selected to be the first Editor-in-Chief of the IEEE SIGNAL PROCESSING LETTERS from 1993 to 1999. He is a past Associate Editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING, was a Guest Editor of three special issues of that journal on wavelets and their applications and watermarking, and a Guest Editor of a special issue of the IEEE TRANSACTIONS ON MULTIMEDIA on multimedia databases.