
Tracing Watermarking for Multimedia Communication Quality Assessment

Patrizio Campisi, Member, IEEE, Marco Carli, Gaetano Giunta, Member, IEEE, Alessandro Neri, Member, IEEE

Università degli Studi “Roma Tre”, Dipartimento di Ingegneria Elettronica, Via della Vasca Navale 84, I-00146 Roma, Italy. Tel: +39.06.5517.7064, Fax: +39.06.5579078

Abstract- Multimedia data hiding by digital watermarking is usually employed for copyright protection purposes. In this contribution, a new application of watermarking is presented. Specifically, watermarking is here employed as a technique for testing the quality of service in multimedia mobile communications. A fragile known watermark is embedded in an MPEG-like host video transport stream using a spread-spectrum technique to avoid visual interference. Like a tracing signal, a (known) tracing watermark tracks the (unknown) information stream that follows the same communication link. The detection of the tracing watermark allows dynamic evaluation of the effective quality of the provided video services, which depends on the whole physical layer (including the employed image coder/decoder). The proposed method is based on the mean-square error between the estimated and actual watermarks. The devised technique has been applied to typical scenarios of mobile wireless multimedia communication systems, in the presence of multi-path channels and interfering users.

Key words- Multimedia communications, Quality of Service, UMTS services, Video streaming, Watermarking.

I. INTRODUCTION In the past few years, there has been an explosion in the use and distribution of digital multimedia data, essentially driven by the diffusion of the Internet. Nowadays, electronic commerce applications, on-line services such as banking or booking and, more generally, services involving multimedia data are rapidly increasing. In the near future, third generation mobile communication systems (IMT2000/UMTS) will provide a large number of innovative and interlaced services. Service providers therefore need a simple and effective billing system related to the quality of the service supplied, and it is crucial to devise a quality assessment system that does not increase the transmission bit-rate. In this scenario, we develop an innovative quality assessment system embedded into the data themselves. This aim is achieved by an unusual use of fragile watermarking, in order to estimate the quality of the transmission system. Watermarking techniques have been devised to answer the ever-growing need to protect the intellectual property (copyright) of digital still images, video sequences or audio from piracy in a networked environment like the World Wide Web. This allows a controlled distribution of multimedia data.

Fig. 1. Watermark embedding process.

Although copyright protection was the very first application of watermarking, different uses have recently been proposed in the literature. Fingerprinting, broadcast monitoring, data authentication, multimedia indexing, content-based retrieval, and medical imaging [1] are only a few of the new applications where watermarking can be usefully employed. The general watermark embedding procedure is sketched in Fig. 1. It consists of embedding a sequence, which is usually binary (the watermark), into host data by means of a key. In the detection phase, the key is used to verify the presence of the embedded sequence. With regard to the domain where the watermark embedding occurs, we can distinguish methods operating in the spatial domain [2], in the DCT domain [3], [4], [5], in the Fourier Transform domain [6], and in the Wavelet Transform domain [7], [8], [9], [10], [11]. A watermarking scheme must satisfy different requirements according to its specific application. One requirement is the perceptual invisibility of the “mark” superimposed on the host data. This implies that the alterations caused by the watermark superimposition on the data should not degrade their perceptual quality. Moreover, when these techniques are used to preserve the copyright ownership and to avoid unauthorized data duplications, the embedded watermark should be detectable even if malicious attacks or non-deliberate modifications (e.g., filtering, compression) affect the watermarked data. This requirement is known as watermark robustness. Conversely, when unwanted modifications of the watermarked data severely affect the extracted watermark, the embedding scheme is known as fragile. Fragile watermarking [12], [13], [14] can be used to obtain information about the tampering process: in fact, it

0-7803-7400-2/02/$17.00 © 2002 IEEE


indicates whether the data have been altered, and supplies localization information as to where the data were altered. In our approach, we present an unconventional use of a fragile watermark to evaluate the quality of service (QoS) in multimedia mobile communications. Specifically, a known watermark is superimposed on the host data. This implies that, when the watermarked data are transmitted over a channel, the mark undergoes the same alterations suffered by the data. At the receiving side, the watermark is estimated and compared with its original counterpart. Since the alterations endured by the watermark are likely to be suffered also by the data, as they follow the same communication link, the watermark degradations can be used to estimate the overall alterations endured by the data. The method we present serves a twofold purpose. In a communication scheme where the data are first coded and then transmitted, our approach can be used to estimate both the degradations introduced by the coder-channel cascade and those introduced by the channel alone. In the first case, the watermark is embedded in the data before coding, whereas in the second case it is embedded in the coded bit stream by taking advantage of the coder characteristics. In this paper we deal with host data that are video sequences encoded by means of the MPEG-2 [15] compression standard. The reason for this choice is that, as in a digital television-broadcasting scenario, MPEG-2 compressed videos are transmitted over wireless telecommunication systems. The paper is organized as follows. In Section II a brief description of the MPEG-2 video compression standard is provided. The principle of the quality assessment procedure is summarized in Section III, and the procedure is described in detail in Section IV. Some experimental results are provided in Section V. The conclusions are finally drawn in Section VI.

II. THE MPEG2 VIDEO STANDARD The purpose of this Section is to give a very brief overview of the MPEG-2 video coding scheme (a detailed description can be found in [15]). The MPEG-2 video bit stream has a layered syntax. In a top-down hierarchical structure, the video sequence is subdivided into multiple Groups of Pictures (GOPs), representing sets of video frames which are contiguous in display order. The next layer is constituted by a single frame, which is composed of one or more slices. Each slice contains one or more macroblocks, consisting of four luminance (Y) and two chrominance (U, V) blocks. Finally, the block is the basic coding unit, of dimension 8 by 8 pixels. In order to obtain a high compression ratio, both spatial and temporal redundancies are exploited. In particular, the spatial redundancy is reduced by subsampling the chrominance components (U, V) in accordance with the sensitivity of the human visual system; then the Discrete Cosine Transform (DCT) is performed on the blocks of the Y and U, V components. The DCT coefficients are then quantized and finally encoded by using variable length coding. The temporal redundancy is reduced by temporally predicting some frames from other motion-compensated frames. The prediction error is then encoded. Three types of frames are used in the MPEG standard: (I) Intra-frames, which are coded without any reference to other frames, (P) Predicted frames, which are coded with reference to previous I- or P-frames, and (B) Bi-directionally interpolated frames, which are coded with reference to both previous and next frames. An encoded GOP always starts with an I-frame, to allow random access to the video stream.

Fig. 2. Principle scheme of tracing watermarking for coder-channel quality assessment in multimedia communications.

III. TRACING WATERMARKING PROCEDURE

A principle scheme of the tracing watermarking for coder-channel quality assessment is reported in Fig. 2. The watermark embedding is performed by resorting to a spread-spectrum technique [4], [16]. Roughly speaking, the watermark (a narrow-band, low-energy signal) is spread over the image (a larger-bandwidth signal) so that the watermark energy contribution to each host frequency bin is negligible, which makes the watermark imperceptible. In more detail, each of a set of uncorrelated pseudo-random noise (PN) matrices (one per frame and known to the receiver) is multiplied by the reference watermark (one for the whole transmission session and known to the receiver). The use of different spreading PN matrices ensures that the spatial localization of the mark differs from frame to frame, so that the watermark visual persistency is negligible. After generating the marks, the embedding of the tracing mark is performed in the Discrete Cosine Transform (DCT) domain [5], [6]. The two-dimensional watermark, randomised by the PN matrices, is embedded in the DCT middle-band frequencies of the whole image by adding it to the DCT of each frame. The inverse DCT transform is


performed, and the whole sequence is MPEG-2 coded and finally transmitted. The receiver implements video decoding. Then, a matched filter extracts the spread watermark from the DCT of each received frame of the sequence. After despreading it with the known PN matrix, the estimated watermark is compared with the reference one. In particular, the mean-square error (MSE) between the two is evaluated as an index of the effective degradation of the provided QoS. As outlined in the Introduction, we deal with MPEG-2 compressed sequences. It is worth noting that we choose to embed the watermark in the I-frames only, and not in the P- and B-frames. In fact, unlike the P- and B-frames, the I-frames do not suffer from any interpolation error. In this fashion the watermark is affected only by the channel’s errors and, at the receiver side, the estimation of the degradations affecting the received mark can be used to provide quality assessment of the channel.

IV. THE QOS ASSESSMENT PROCEDURE USING TRACING WATERMARKING The watermarking procedure used in our method is based on embedding the mark $w[k_1,k_2]$ in the DCT middle-band frequencies of each video sequence frame $f_i[n_1,n_2]$ by resorting to the spread-spectrum techniques described in [4], [16]. Specifically, the watermark $w[k_1,k_2]$ is first spread by using a set of uncorrelated pseudo-random noise matrices $p_i[k_1,k_2]$, thus obtaining:

$$w_i^{(s)}[k_1,k_2] = w[k_1,k_2] \cdot p_i[k_1,k_2]. \qquad (1)$$
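The spreading step of (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PN matrices are taken to be ±1-valued and generated from a seed that stands in for the shared key, the watermark is a small ±1 matrix, and the names `pn_matrix` and `spread` are hypothetical helpers, none of which is specified in the paper.

```python
import random

def pn_matrix(rows, cols, seed):
    """Pseudo-random +/-1 matrix; the seed plays the role of the shared key (assumption)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(cols)] for _ in range(rows)]

def spread(w, p):
    """Element-wise product w[k1][k2] * p[k1][k2], as in (1)."""
    return [[wv * pv for wv, pv in zip(w_row, p_row)]
            for w_row, p_row in zip(w, p)]

# Hypothetical 4x4 binary (+/-1) reference watermark, identical for every frame.
w = [[1, -1, 1, 1],
     [-1, -1, 1, -1],
     [1, 1, -1, 1],
     [-1, 1, 1, -1]]

# One PN matrix per frame; deriving the seed from the frame index is an assumption.
pn = [pn_matrix(4, 4, seed=1000 + i) for i in range(3)]
spread_marks = [spread(w, p) for p in pn]

# Because each p_i is +/-1, multiplying by p_i a second time undoes the spreading.
recovered = [spread(ws, p) for ws, p in zip(spread_marks, pn)]
```

With ±1 PN values the spreading is its own inverse, which is why a receiver that knows the PN matrices can despread by a second multiplication.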

It is worth noting that both the PN matrices and the watermark are known at the receiving side. Moreover, the embedded watermark is the same for each video sequence frame, whereas the PN matrices are different for each frame. This ensures that the spatial localization of the mark differs from frame to frame, thus avoiding visual summation. Then, the watermark is embedded in the mid-frequency of the DCT domain according to the following rule

$$F_i^{(w)}[k_1,k_2] = F_i[k_1,k_2] + \alpha\, w_i^{(s)}[k_1,k_2], \qquad (2)$$

where $F_i[k_1,k_2]$ is the DCT of the i-th frame, $w_i^{(s)}[k_1,k_2]$ is the spread watermark of (1), and α is a scaling factor that determines the watermark strength. The final marked frame is obtained by performing the inverse DCT as

$$f_i^{(w)}[n_1,n_2] = \mathrm{IDCT}\left\{F_i^{(w)}[k_1,k_2]\right\}. \qquad (3)$$
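As a rough illustration of (2) and (3), the following Python sketch embeds an already-spread mark in the DCT of a toy 8 by 8 frame and transforms back. It assumes an additive rule, an orthonormal DCT-II, and a hypothetical middle-band mask (`in_mid_band`); the frame contents, the mask limits and the value of α are invented for the example and are not taken from the paper.

```python
import math

N = 8  # toy frame size (a single 8x8 block); real frames are much larger

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that F = C f C^T and f = C^T F C."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

C = dct_matrix(N)
CT = transpose(C)

def dct2(frame):
    return matmul(matmul(C, frame), CT)

def idct2(coeffs):
    return matmul(matmul(CT, coeffs), C)

def in_mid_band(k1, k2, lo=3, hi=8):
    """Hypothetical mid-band mask: anti-diagonals with lo <= k1 + k2 < hi."""
    return lo <= k1 + k2 < hi

def embed(frame, w_spread, alpha):
    """Additive rule (2) on the masked coefficients, then the inverse DCT of (3)."""
    F = dct2(frame)
    Fw = [[F[k1][k2] + (alpha * w_spread[k1][k2] if in_mid_band(k1, k2) else 0.0)
           for k2 in range(N)] for k1 in range(N)]
    return idct2(Fw)

# Toy host frame and an already-spread +/-1 mark (both invented for illustration).
frame = [[float((3 * r + 5 * c) % 256) for c in range(N)] for r in range(N)]
w_spread = [[1 if (r * 3 + c) % 2 == 0 else -1 for c in range(N)] for r in range(N)]
marked = embed(frame, w_spread, alpha=2.0)

# The DCT-domain difference is alpha * w_spread inside the mask and 0 outside.
diff = [[a - b for a, b in zip(ra, rb)]
        for ra, rb in zip(dct2(marked), dct2(frame))]
```

The sketch keeps pixel values as floats; a real encoder would round and clip them, which already perturbs the embedded mark slightly before any coding or channel error occurs.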

This procedure, here described for the i-th frame, is repeated for each frame. Finally, MPEG-2 coding is performed, followed by transmission over a noisy channel, simulated by a Poisson generator of random transmission errors. At the receiving side, an estimate $\hat{w}_i^{(s)}[k_1,k_2]$ of the spread version of the watermark embedded in the i-th frame is evaluated as follows:

$$\hat{w}_i^{(s)}[k_1,k_2] = \hat{F}_i^{(w)}[k_1,k_2] \cdot w[k_1,k_2], \qquad (4)$$

having indicated with $\hat{F}_i^{(w)}[k_1,k_2]$ the received DCT of the marked i-th frame. Then, the watermark is estimated by averaging the despread watermarks, obtained by multiplying $\hat{w}_i^{(s)}[k_1,k_2]$ by the corresponding PN matrix $p_i[k_1,k_2]$, over the M frames:

$$\hat{w}[k_1,k_2] = \frac{1}{M}\sum_{i=1}^{M} \hat{w}_i^{(s)}[k_1,k_2] \cdot p_i[k_1,k_2]. \qquad (5)$$
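The despreading and averaging structure of (5) can be illustrated with a toy Python simulation. Several things here are assumptions made only for the sketch: the PN matrices are ±1-valued, the host DCT coefficients are modelled as zero-mean Gaussian samples, and the per-frame spread estimate is simply taken as the received coefficient divided by α, which is a deliberate simplification rather than the matched filter of (4). The function name `estimate_watermark` is hypothetical.

```python
import random

def estimate_watermark(received, pn, alpha):
    """Average of the despread per-frame estimates, following the structure of (5)."""
    m = len(received)
    rows, cols = len(pn[0]), len(pn[0][0])
    w_hat = [[0.0] * cols for _ in range(rows)]
    for F_hat, p in zip(received, pn):
        for k1 in range(rows):
            for k2 in range(cols):
                # Simplified stand-in for (4): received coefficient scaled by 1/alpha.
                w_s_hat = F_hat[k1][k2] / alpha
                # Despread with the frame's PN value and accumulate the average.
                w_hat[k1][k2] += w_s_hat * p[k1][k2] / m
    return w_hat

rng = random.Random(7)
K, M, ALPHA = 4, 400, 1.0

# Invented +/-1 watermark and one +/-1 PN matrix per frame.
w = [[rng.choice((-1, 1)) for _ in range(K)] for _ in range(K)]
pn = [[[rng.choice((-1, 1)) for _ in range(K)] for _ in range(K)] for _ in range(M)]

# Toy "received" mid-band coefficients: zero-mean host term plus the spread mark.
received = [[[rng.gauss(0.0, 1.0) + ALPHA * w[a][b] * p[a][b] for b in range(K)]
             for a in range(K)] for p in pn]

w_hat = estimate_watermark(received, pn, ALPHA)
max_err = max(abs(w_hat[a][b] - w[a][b]) for a in range(K) for b in range(K))
```

Under these assumptions the host term is uncorrelated with the PN matrices, so its contribution to the average shrinks roughly as one over the square root of M, while the watermark term adds coherently.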

The extracted watermark is compared with the original one, thus providing a QoS assessment for the transmitted video, since we assume that the sequence and the embedded watermark suffer the same quality degradation through the coding/transmission procedure. The metric employed for quality assessment is the mean-square error (MSE) between the estimated watermark and the original one. Specifically, the MSE is first evaluated for the i-th frame as follows:

$$\mathrm{MSE}_i = \frac{1}{K_1 K_2}\sum_{k_1=1}^{K_1}\sum_{k_2=1}^{K_2}\left(\hat{w}_i[k_1,k_2] - w[k_1,k_2]\right)^2, \qquad (6)$$

where $\hat{w}_i[k_1,k_2]$ denotes the watermark estimated from the i-th frame. Then, it is averaged over the M frames under analysis, thus obtaining

$$\mathrm{MSE} = \frac{1}{M}\sum_{i=1}^{M}\mathrm{MSE}_i, \qquad (7)$$

which represents the employed metric.

V. EXPERIMENTAL RESULTS In this Section, some experimental results characterizing the performance of the proposed method are provided. Specifically, simulations have been performed by introducing two different sources of video quality degradation: a noisy channel and the codec (coder/decoder). In particular, the latter can induce variations in the video quality by changing: 1. the quantization matrix; 2. the bit-rate; 3. the number of intra-coded frames. To evaluate the sensitivity of the method to the degradations due only to the coder, tests have been performed with an ideal channel and different coder quality settings. The MSE (normalized by $10^5$) of the extracted watermark, evaluated according to (6) and (7), and of the video sequence versus different quantization matrices is depicted in Fig. 3, where higher matrix indices correspond to coarser quantization matrices. Specifically, the original quantization matrix has been multiplied by the matrix indices 1, 2, 4, 8, 16, and 31. The results shown in Fig. 3 validate the initial hypothesis that the watermark degradations behave in the same way as the degradations suffered by the video sequence. In Fig. 4, the watermark MSE versus the bit error rate is reported for several quantization matrices. The original quantization matrix MQ is considered and is multiplied by 1, 2, 4, 8, 16, and 31, thus obtaining coarser quantization matrices. The experimental results highlight that coarser quantization matrices give higher watermark MSE values. This is in accordance with the visual degradation suffered by the coded video sequence when coarser quantization matrices are used. In Fig. 5, the watermark MSE versus the bit error rate is reported for an increasing number of intra-coded frames. The perfect quality situation refers to videos with no predicted frames (B and P), while a different intra-frame rate has been implemented in the other cases.

Fig. 3. Watermark MSE (normalized by $10^5$) versus several quantization matrix indices. [Plot: "WaterMark Error Energy" and "Sequence Error Energy" (MSE) versus "Quantization Matrix Index", 1 to 31.]

Fig. 4. Watermark MSE (normalized by $10^5$) versus the bit error rate, parameterized with respect to different quantization matrix indices. [Plot: MSE versus "Bit Error Rate", 10^-4.5 to 10^-2.5, with one curve each for MQ*1, MQ*2, MQ*4, MQ*8, MQ*16 and MQ*31.]
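The metric plotted on the vertical axes, i.e. the per-frame MSE followed by the average over the frames, can be written compactly in Python. The helper names `frame_mse` and `sequence_mse` and the tiny 2 by 2 matrices are hypothetical, and the normalization by 10^5 used in the figures is left out.

```python
def frame_mse(w_hat_i, w):
    """Per-frame MSE between an estimated watermark and the reference one."""
    k1n, k2n = len(w), len(w[0])
    total = sum((w_hat_i[a][b] - w[a][b]) ** 2
                for a in range(k1n) for b in range(k2n))
    return total / (k1n * k2n)

def sequence_mse(estimates, w):
    """Average of the per-frame MSE values over all analysed frames."""
    return sum(frame_mse(e, w) for e in estimates) / len(estimates)

# Invented reference watermark and two per-frame estimates.
w = [[1, -1],
     [-1, 1]]
estimates = [
    [[1, -1], [-1, 1]],   # error-free frame: MSE_1 = 0
    [[1, 1], [-1, 1]],    # one flipped coefficient: MSE_2 = (1 - (-1))^2 / 4 = 1
]

print(sequence_mse(estimates, w))  # prints 0.5
```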

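The bit error rates on the horizontal axes of Figs. 4 and 5 come from a Poisson generator of random transmission errors. A minimal sketch of such a generator is given below, assuming that the gaps between consecutive errors are exponentially distributed with mean 1/BER bits and rounded to whole bit positions; the function name `inject_bit_errors`, the seed handling and the all-zero payload are hypothetical, and the authors' actual simulator is not described in that detail.

```python
import random

def inject_bit_errors(data, ber, seed=0):
    """Return (corrupted bytes, number of flipped bits) for a target bit error rate.

    Error positions follow a Poisson process: gaps between errors are
    exponential with mean 1/ber bits, rounded to whole bits (at least 1).
    """
    if ber <= 0:
        return bytes(data), 0
    rng = random.Random(seed)
    out = bytearray(data)
    total_bits = 8 * len(out)
    flipped = 0
    pos = -1
    while True:
        pos += max(1, round(rng.expovariate(ber)))
        if pos >= total_bits:
            break
        out[pos // 8] ^= 0x80 >> (pos % 8)  # flip one bit, most significant bit first
        flipped += 1
    return bytes(out), flipped

# 160000 zero bits as a stand-in for a coded bit stream (invented payload).
payload = bytes(20000)
corrupted, n_flipped = inject_bit_errors(payload, ber=1e-3, seed=42)
observed_ber = n_flipped / (8 * len(payload))
```

Sweeping `ber` over a logarithmic grid and measuring the watermark MSE after decoding at each point would give curves of the kind shown in the figures.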

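Fig. 5. Watermark MSE (normalized by $10^5$) versus the bit error rate, parameterized with respect to the number of intra-coded frames. [Plot: MSE versus "Bit Error Rate", 10^-4.5 to 10^-2.5, with curves for perfect quality, high quality, medium quality and low quality.]

VI. CONCLUDING REMARKS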

We have proposed an unconventional method of tracing watermarking as a hidden technique suited for testing the quality of service in multimedia mobile communications. In our approach, a fragile (known) watermark is hidden in an MPEG-2 host video transport stream. Spread-spectrum


pseudo-noise matrices, characterized by random keys, are also used to avoid visual persistency. Experiments show that the error affecting the watermark is sensitive to the employed quantization matrix, to the bit rate, and to the number of intra-coded frames. Therefore, our approach allows blind estimation of the quality of a coding/transmission system without affecting the quality of the video communication. The proposed QoS assessment procedure can be usefully employed for different purposes in wireless multimedia communication networks, such as: 1. control feedback to the sending user on the effective quality of the link; 2. detailed information to the operator for billing purposes; 3. diagnostic information to the operator on the effective status of the link.

REFERENCES

[1] A. Hanjalic, G. C. Langelaar, P. M. B. van Roosmalen, J. Biemond, R. L. Lagendijk, Image and Video Databases: Restoration, Watermarking and Retrieval, Elsevier, 2000.
[2] N. Nikolaidis and I. Pitas, “Robust image watermarking in the spatial domain”, Signal Processing, vol. 66, no. 3, pp. 385-403, May 1998.
[3] M. Barni, F. Bartolini, V. Cappellini, A. Piva, “A DCT-domain system for robust image watermarking”, Signal Processing, vol. 66, no. 3, pp. 357-372, May 1998.
[4] I. Cox, J. Kilian, F. Leighton, T. Shamoon, “Secure Spread Spectrum Watermarking for Multimedia”, IEEE Trans. Image Processing, vol. 6, no. 12, December 1997.
[5] M. D. Swanson, B. Zhu, A. H. Tewfik, “Transparent robust image watermarking”, Proc. IEEE Int. Conf. on Image Processing 1996, Lausanne, Switzerland, pp. 211-214, 16-19 September 1996.
[6] R. B. Wolfgang and E. J. Delp, “A watermark for digital images”, Proc. IEEE Int. Conf. on Image Processing 1996, Lausanne, Switzerland, pp. 219-222, 16-19 September 1996.
[7] R. Dugad, K. Ratakonda and N. Ahuja, “A new wavelet-based scheme for watermarking images”, Proc. IEEE Int. Conf. on Image Processing 1998, Chicago, IL, pp. 419-423, 4-7 October 1998.
[8] D. Kundur, D. Hatzinakos, “Digital watermarking using multiresolution wavelet decomposition”, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, vol. 5, pp. 2969-2972, 1998.
[9] H. Inoue, A. Miyazaki, T. Katsura, “An image watermarking method based on the wavelet transform”, Proc. IEEE Int. Conf. on Image Processing 1999, Kobe, Japan, pp. 296-300, 25-28 October 1999.
[10] H. J. M. Wang, P. C. Su, and C. C. J. Kuo, “Wavelet-based blind watermark retrieval technique”, Proc. SPIE, vol. 3528, Conf. on Multimedia Systems and Applications, Boston, MA, November 1998.
[11] P. Campisi, A. Neri, M. Visconti, “A Wavelet Based Method for High Frequency Subbands Watermark Embedding”, SPIE Multimedia Systems and Applications III, Boston, MA, November 2000.
[12] M. M. Yeung, F. Mintzer, “An invisible watermarking technique for image verification”, Proc. IEEE Int. Conf. on Image Processing 1997, Santa Barbara, CA, pp. 680-683, 1997.
[13] D. Kundur, D. Hatzinakos, “Towards a telltale watermarking technique for tamper-proofing”, Proc. IEEE Int. Conf. on Image Processing 1998, Chicago, IL, pp. 409-413, 4-7 October 1998.
[14] R. B. Wolfgang, E. J. Delp, “Fragile Watermarking using the VW2D Watermark”, Proc. SPIE, Security and Watermarking of Multimedia Contents, vol. 3657, San Jose, CA, USA, January 1999.
[15] ISO, “Information Technology – Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s – Part 2: Video”, ISO/IEC 11172-2, 1993.
[16] F. Hartung, J. K. Su, B. Girod, “Spread spectrum watermarking: malicious attacks and counterattacks”, Proc. SPIE, Security and Watermarking of Multimedia Contents, vol. 3657, San Jose, CA, USA, January 1999.
