Proceedings SPIE Conference on Security, Steganography and Watermarking of Multimedia Contents, San Jose, CA, January 18-22, 2004
Reversible Watermarking using Two-way Decodable Codes

Bijan G. Mobasseri and Domenick Cinalli
Department of Electrical and Computer Engineering, Villanova University, Villanova, PA 19085

ABSTRACT

Traditional variable length codes (VLC) used in compressed media are brittle and suffer synchronization loss following bit errors. To counter this situation, resynchronizing VLCs (RVLC) have been proposed to help identify, limit and possibly reverse channel errors. In this work we observe that watermark bits are in effect forced bit errors and are amenable to the application of error-resilient techniques. We have developed a watermarking algorithm around a two-way decodable RVLC. The inherent error control property of the code is now exploited to implement reversible watermarking in the compressed domain. A new decoding algorithm is developed to reestablish synchronization that is lost as a result of watermarking. Resynchronization is achieved by disambiguating among many potential markers that are abundantly emulated in data. The algorithm is successfully applied to several MPEG-2 streams.

Keywords: Error-resilient coding, reversible watermarking, MPEG-2
1. INTRODUCTION

Digital watermarking of compressed media is performed at one of three entry points: pre-compression [1,2], post-compression [3,4] or following partial decompression [5]. Embedding the watermark directly in the compressed bitstream has several distinct advantages. Since the watermark enters post-compression, the impact of compression on the watermark is bypassed. From a practical standpoint, watermarking in the compressed domain is desirable because almost all media are first made available in compressed form. Watermarking the raw signal therefore requires costly decompression and precludes or greatly complicates real-time implementation.

The problem with existing compressed domain watermarking algorithms is that they are lossy and impact quality. For example, in [3] watermark bits are placed in what are called label-carrying variable length codes of the MPEG-2 stream. The embedded watermark keeps the VLCs in the same run category but may shift levels by one. Although the visual quality of the I-frames may not be affected, motion prediction will suffer as a result of error propagation unless corrective actions are taken. In other applications such as medical imaging, surveillance and security, it is desirable to have a mathematically lossless, as contrasted with perceptually lossless, watermark.

In this work, we propose a lossless algorithm for watermarking of compressed streams using the error recovery property of a proposed bidirectional code. The basic idea is to take algorithms designed to counter channel errors and use them to carry hidden bits. The decoder can then recover the hidden information using its built-in error recovery capability and restore the cover data to its original state.
2. WATERMARK AS INTENTIONAL BIT ERRORS

Communicating multimedia signals over low bandwidth channels requires error resiliency. The standard Huffman VLCs used in JPEG and MPEG, however, have no inherent error protection. The occurrence of bit errors has two possible outcomes. Bit errors may go undetected if the affected VLCs remain valid code words. In that case, however, codeword boundaries are in error, resulting in data corruption. If bit errors are detected, they are most likely caught at positions away from the actual error locations. It is usually not possible to recover from such errors until the next resynchronization marker.

To address this problem, recent works have proposed resynchronizing or reversible VLCs (RVLC) that exhibit error resiliency [6]. Such codes are symmetric, hence two-way decodable, and provide limited error recovery. Upon encountering an invalid VLC, reverse decoding is initiated. All unaffected VLCs are correctly decoded on the reverse path. Although not all bit errors can be corrected, it is possible to localize the error to segments much smaller than the length between resynchronization markers. MPEG-4, for example, can optionally draw from a table of bidirectional codes for limited error recovery. RVLCs in general are not as efficient as the traditional Huffman codes. In [7], an algorithm is proposed for the construction of a class of RVLCs that are built upon ordinary VLCs.

In this work we observe that bits used to watermark conventional VLCs can be thought of, in effect, as channel errors. Since RVLCs were originally designed to recover from channel errors, it is possible to build a watermarking algorithm around a modification of the concept proposed above. If the number of inserted watermark bits is kept below the error recovery capability of the bidirectional code, then a lossless watermarking capability is achieved. An early version of this idea appeared in [8]. However, that work, along with [7], is not applicable to real-world compressed bitstreams such as JPEG and MPEG. The issue is regaining lost synchronization in the forward decoding phase, a crucial problem that is solved in this work.
3. WATERMARKING OF BIDIRECTIONAL CODES

We first briefly review the construction of RVLCs from ordinary VLCs [7].

3.1. Forming two-way decodable packets

A packet of N consecutive VLCs is defined as follows
$$P = \{vlc_1, vlc_2, \ldots, vlc_N\}, \qquad vlc_k = \{b_k^1, b_k^2, \ldots, b_k^{l_k}\}, \quad b \in \{0,1\}, \qquad l_k = \mathrm{length}(vlc_k) \tag{1}$$

The definition of a packet may be arbitrary or tied to synchronization markers such as end-of-block (EOB) in JPEG or frame, slice or macroblock headers in MPEG. The reverse of the kth VLC, $vlc_k$, is defined by $vlc'_k = \mathrm{fliplr}(vlc_k) = \{b_k^{l_k}, b_k^{l_k-1}, \ldots, b_k^1\}$. Similarly, $P' = \{vlc'_1, vlc'_2, \ldots, vlc'_N\}$. A compound stream C is defined by XORing P and P' in the following manner:

$$C = [\, vlc_1, vlc_2, \ldots, vlc_N \mid \mathrm{zeros}(1,L) \,] \oplus [\, \mathrm{zeros}(1,L) \mid vlc'_1, vlc'_2, \ldots, vlc'_N \,], \qquad L \ge \max_k \{l_k\} \tag{2}$$

where $\mathrm{zeros}(1,L)$ is a vector of L zeros.

For decoding, C is XOR'ed with an L-zero vector first. Since L is at least equal to the length of the longest VLC, the first VLC is correctly decoded. This VLC is reversed, appended to the L-zero pad and then XOR'ed with the incoming C bits. Bit errors encountered in C will have different impacts on decoding. If a bit error immediately creates an invalid VLC, reverse decoding can recover the error (subject to burst error length). Bit errors that result in valid VLCs may not be sensed instantly or may even go entirely undetected. However, an error-free C stream must end with L zeros once decoded. Violation of this property indicates corruption in the packet.
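The construction in (2) and the forward decoding procedure just described can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: bits are held as strings of '0' and '1', the four-codeword table is a toy prefix code chosen for this example, and the function names `encode_bidirectional` and `decode_forward` are hypothetical.

```python
def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def encode_bidirectional(codewords, L):
    """Build C = [vlc_1..vlc_N | zeros(L)] XOR [zeros(L) | vlc'_1..vlc'_N], as in (2)."""
    assert L >= max(len(w) for w in codewords)
    forward = "".join(codewords) + "0" * L
    reverse = "0" * L + "".join(w[::-1] for w in codewords)
    return xor_bits(forward, reverse)


def decode_forward(c, L, code_set):
    """Forward-decode C; raise ValueError on an invalid VLC or a non-zero tail."""
    payload = len(c) - L          # number of bits occupied by the VLCs
    reverse = "0" * L             # known part of the reversed stream
    forward = ""                  # recovered part of the forward stream
    words, pos = [], 0
    while pos < payload:
        # Reveal forward bits wherever the reversed stream is already known.
        known = min(len(reverse), len(c))
        forward += xor_bits(c[len(forward):known], reverse[len(forward):known])
        for end in range(pos + 1, min(pos + L, payload) + 1):
            if forward[pos:end] in code_set:
                break
        else:
            raise ValueError(f"invalid VLC at bit {pos}")
        word = forward[pos:end]
        words.append(word)
        reverse += word[::-1]     # each decoded VLC extends the reversed stream
        pos = end
    forward += xor_bits(c[len(forward):], reverse[len(forward):len(c)])
    if forward[payload:] != "0" * L:
        raise ValueError("stream does not end with L zeros")
    return words


# Toy prefix code (an assumption for illustration, not an MPEG table).
CODE = {"0", "10", "110", "111"}
packet = ["110", "0", "10", "111"]
stream = encode_bidirectional(packet, L=3)
print(stream)                             # 110001110111
print(decode_forward(stream, 3, CODE))    # ['110', '0', '10', '111']
```

Flipping the last bit of `stream` leaves every VLC decodable but breaks the trailing L-zeros property, which is the corruption check mentioned above.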
3.2. Watermark embedding in the bidirectional code

Watermarking of the bidirectional stream is accomplished by introducing watermark bits as a burst error into the C stream and then using constraints on bidirectional decoding to recover the watermark. The watermark is defined by a binary array $W = \{w_1, w_2, \ldots, w_{l_w}\}$ of length $l_w$. $l_w$ then defines the payload and can be arbitrarily increased at the cost of increasing L. This topic is examined further in Section 5. W is then encrypted by a user-defined key $\kappa$ to produce the encrypted watermark. For notational simplicity we continue to use W to refer to the encrypted watermark.

Let $vlc_k$ be the longest VLC in the packet. For this choice, $L = l_k$ and $l_w \le l_k$. In composing C, $vlc_k$ is XOR'ed with n complete VLCs plus another partial VLC. The number of VLCs overlapping $vlc_k$ is the smallest n for which $L \le \sum_{j=1}^{n} l_{k-j}$. The watermarked portion of the C stream consists of three terms shown in (3), where the starred term denotes the partially overlapping VLC:

$$C_w(m : m + l_k - 1) = vlc_k \oplus \{ vlc'^{*}_{k-n-1}, vlc'_{k-n}, \ldots, vlc'_{k-1} \} \oplus W, \qquad m = \sum_{j=1}^{k-1} l_j \tag{3}$$
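As a rough illustration of (3), the sketch below builds the compound stream for a toy packet and XORs the watermark into the bit positions occupied by the longest VLC. The codewords, the 0-based index `k` and the helper names are assumptions made for this example; the key-based encryption of W is omitted.

```python
def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def compound_stream(codewords, L):
    """C from equation (2): forward VLCs padded with L zeros, XOR the delayed reversed VLCs."""
    forward = "".join(codewords) + "0" * L
    reverse = "0" * L + "".join(w[::-1] for w in codewords)
    return xor_bits(forward, reverse)


def embed(codewords, k, watermark):
    """XOR the watermark into C starting at the first bit of vlc_k (0-based index k)."""
    L = max(len(w) for w in codewords)
    assert len(codewords[k]) == L, "vlc_k is taken to be the longest VLC, so L = l_k"
    assert len(watermark) <= L, "payload must satisfy l_w <= l_k"
    m = sum(len(w) for w in codewords[:k])      # bit offset of vlc_k in the packet
    c = compound_stream(codewords, L)
    cw = c[:m] + xor_bits(c[m:m + len(watermark)], watermark) + c[m + len(watermark):]
    return c, cw, m


packet = ["0", "10", "110", "0"]   # toy packet; vlc_k = "110" is the longest
W = "101"
c, cw, m = embed(packet, k=2, watermark=W)
# The two streams differ only where the watermark was inserted.
print(xor_bits(c, cw))
```

Because the embedding is a plain XOR, applying the same operation a second time restores the original stream, which is what makes the scheme reversible once W is known.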
3.3. Concept of flag
For successful recovery of the watermark, the bidirectional decoder must fail immediately at $vlc_k$. This failure then triggers reverse decoding. Such a failure will not ordinarily happen because the watermarked $vlc_k$, or portions thereof, will likely remain valid VLCs despite watermarking. Therefore, the presence of the watermark may be detected at a location away from $vlc_k$ or not at all. To meet the objective of lossless watermarking, we need to sense the beginning of the watermarked portion of the stream at the point of insertion. For forward detection to fail at this instant, the decoded $C_w$ must begin with a sequence of flag bits of length $l_f$. This flag must be neither a valid VLC nor a prefix to a valid VLC. Such flag(s) may be determined from offline analysis of VLC tables.

To address this issue, we form an augmented watermark pattern $W^* = [p, W]$ and use it in (3). p is the prefix needed to ensure that a flag is generated upon encountering the watermarked VLC. Note that p itself is not the flag. It is a special sequence that produces the flag when $C_w$ passes through the decoder. Therefore, p is not directly observable in the bidirectionally coded stream. Straightforward analysis of $C_w$ at the output of the decoder reveals that if

$$p = \mathrm{flag} \oplus vlc_k(1 : l_f)$$

then the first $l_f$ bits of the forward decoded stream will be the sought-after flag.

Using $W^*$ in (3) guarantees instant failure on forward decoding. Similarly, reverse decoding also needs a stop mechanism, but the stop logic is different for fixed length packets and variable length packets. The two cases are described in Section 4 below. The new $W_\kappa^*$ is given by $W_\kappa^* = [p, W_\kappa, q]$ where q is now a suffix flag, not necessarily the same as p, designed to stop decoding on the reverse path.
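The offline flag search and the prefix computation $p = \mathrm{flag} \oplus vlc_k(1 : l_f)$ can be sketched as follows. The code table is a deliberately incomplete toy prefix code, not MPEG-2 table B-14, and the function names are hypothetical. One extra condition is added here as an assumption: a candidate is also rejected if it begins with a complete codeword, since a greedy decoder would consume that codeword before failing.

```python
from itertools import product


def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def find_flags(codewords, max_len):
    """List bit strings up to max_len that are neither a VLC nor a prefix of one."""
    codes = set(codewords)
    # Every non-empty prefix of every codeword (this includes the codewords themselves).
    prefixes = {w[:i] for w in codes for i in range(1, len(w) + 1)}
    flags = []
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if s in prefixes:
                continue                      # a valid VLC or a prefix of one
            if any(s.startswith(c) for c in codes):
                continue                      # assumption: must not start with a VLC
            flags.append(s)
    return flags


def flag_prefix(flag, vlc_k):
    """p = flag XOR vlc_k(1 : l_f); requires len(vlc_k) >= len(flag)."""
    assert len(vlc_k) >= len(flag)
    return xor_bits(flag, vlc_k[:len(flag)])


TOY_TABLE = ["00", "01", "10", "1100"]       # incomplete on purpose, so flags exist
flags = find_flags(TOY_TABLE, max_len=3)
print(flags)                                  # ['111']
p = flag_prefix(flags[0], "1100")
print(p)                                      # 001
```

For a complete Huffman code no such strings exist, which is why the search only succeeds on tables that leave part of the code space unused.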
4. WATERMARK DETECTION

Once an error is detected, reverse decoding begins from the end-of-packet. However, identifying the end-of-packet after losing synchronization is a major issue. This task is generally accomplished by hunting for resynchronization markers. The problem is that once codeword boundaries are lost, it is difficult to identify where to begin reverse decoding. The problem is particularly acute when synchronization markers are short and are easily emulated within data.

We address two separate cases. The first case is where packets consist of a fixed, known number of VLCs. In the second scenario, covering JPEG and MPEG-2, packets are defined to be coincident with the existing markers in the bitstream. Such packets, however, carry a variable number of VLCs that is unknown to the decoder. In this case there is no way for the decoder to know the location of the end-of-packet by counting. Looking for a resynchronization marker fails because markers cannot be readily distinguished from data. For example, end-of-block markers in JPEG and MPEG are at most 4 bits long. There are numerous occurrences of those 4 bits within the bitstream itself. Once synchronization is lost, distinguishing the correct end-of-block marker from emulations becomes a non-trivial task. We have developed detection algorithms for both cases.

4.1. Fixed length packets

In this section bidirectionally encoded packets consist of a known number of VLCs and unique end-of-packet markers. A forward flag is all that is needed in this case. Upon encountering the flag on forward decoding, the decoder initiates reverse decoding by looking for the resynchronization marker. Reverse decoding begins with $vlc'_1$. This time detection should fail at $vlc'_{k-1}$. However, failure may not be detected since there are no flags present on the reverse path.

The logic of decoding is as follows. On forward decoding, $\{vlc_1, \ldots, vlc_{k-1}\}$ are correctly decoded. On reverse decoding, decoding may not fail at $vlc_k$. To determine where to stop decoding, we recognize that the reversely decoded sequence is $\{vlc_N, \ldots, vlc_x\}$ with x unknown at this point. Since there are N VLCs in the packet and k-1 of them were correctly decoded on the forward path, reverse decoding must stop at $vlc_x$ when $x = N - (k-1)$. Here is the stopping rule: the last VLC recovered on reverse decoding, i.e. $vlc_{N-(k-1)}$, is the same VLC, i.e. $vlc_k$, that failed detection on forward decoding. In addition, when $vlc'_k$ is identified, $vlc'_{k-1}$ and all others preceding it are also identified from forward decoding. Watermark bits can now be recovered by the following operation:

$$W^* = C_w(m : m + l_k - 1) \oplus vlc_k \oplus \{ vlc'^{*}_{k-n-1}, vlc'_{k-n}, \ldots, vlc'_{k-1} \} \tag{4}$$
All three terms on the right are available to the decoder. The first term is the received stream. The second term is found on reverse decoding and the third term is found on forward decoding. The watermark bits are found by removing the flag bits from $W^*$. Note that at the end of this operation two goals are met: watermark bits are extracted and the stream is restored to its original state. See Figure 1 for an illustration of the algorithm where the packet consists of 5 VLCs {C, A, B, D, E}. C and A are correctly decoded on forward decode. Decoding fails in decoding B because of the flag. Reverse decoding recovers E, D, and B. Note that B is decoded twice, once watermarked and once without. This fact alone is responsible for recovering the watermark.

[Figure 1: encoder and decoder bit-level diagram for the packet {C, A, B, D, E}]

Figure 1- Watermark detection for fixed length packets. XORing of the first 3 lines generates the watermarked stream. Backward decoding can be stopped at the right place by knowing the packet length. The information recovered on reverse decoding is combined with that recovered on forward decoding to find the watermark bit(s).
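The recovery step in (4) can be checked numerically on a toy packet. In the sketch below the codewords, the flag "111" and the one-bit watermark are assumptions chosen for illustration, and $vlc_k$ is simply taken as known, standing in for the copy that the decoder would obtain on the reverse path.

```python
def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def recover_watermark(cw_segment, vlc_k, reverse_overlap, flag_len):
    """Equation (4): W* = C_w segment XOR vlc_k XOR overlapping reversed VLCs."""
    w_star = xor_bits(xor_bits(cw_segment, vlc_k), reverse_overlap)
    return w_star, w_star[flag_len:]          # strip the prefix p to get W


# Toy packet: vlc_k = "1100" is the longest VLC, so L = 4.
packet, k, L = ["00", "10", "1100", "01"], 2, 4
flag, W = "111", "1"
p = xor_bits(flag, packet[k][:len(flag)])     # p = flag XOR vlc_k(1 : l_f)
w_star_in = p + W                             # augmented watermark W* = [p, W]

forward = "".join(packet) + "0" * L
reverse = "0" * L + "".join(w[::-1] for w in packet)
c = xor_bits(forward, reverse)
m = sum(len(w) for w in packet[:k])
cw = c[:m] + xor_bits(c[m:m + L], w_star_in) + c[m + L:]

# Forward decoding reveals C_w XOR (known reversed VLCs); it starts with the flag.
revealed = xor_bits(cw[m:m + L], reverse[m:m + L])
print(revealed[:len(flag)])                   # 111

w_star_out, w_out = recover_watermark(cw[m:m + L], packet[k], reverse[m:m + L], len(flag))
print(w_star_out, w_out)                      # 0011 1

# XORing W* back out restores the original compound stream.
restored = cw[:m] + xor_bits(cw[m:m + L], w_star_out) + cw[m + L:]
print(restored == c)                          # True
```

The final comparison illustrates the two goals stated above: the watermark is extracted and the stream returns to its original state.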
Figure 2- Illustrating two-way decoding when neither packet length nor unique end-of-packet markers are available. Forward decoding is done as in Figure 1. Reverse decoding then stops after encountering the suffix flag $\{q_1, q_2, q_3\}$. Again, the VLCs corrupted by the watermark on forward decoding are recovered intact on reverse decoding. The two pieces of information are put together to recover the watermark bit(s). Note that $L \ge \mathrm{length}\{W_\kappa^*\}$ is required for correct watermark recovery.
4.2. Variable length packets
Implementation of the algorithm on compressed bitstreams raises a number of practical questions. JPEG and MPEG-2 have precise hierarchical structures with resynchronization markers dispersed at specific locations. From an implementation standpoint, it makes sense to define watermarked packets to be congruent with packets already defined in the bitstream. Not doing so generates considerable overhead in housekeeping of two packet structures. It is also more efficient to use existing resynchronization markers rather than define new ones. Choosing existing markers gives rise to packets of variable lengths, both in bits and in VLCs. Variable length packets coupled with short resynchronization markers raise a number of issues:

• End-of-packet marker is emulated in data at numerous locations
• Number of VLCs per packet is unknown to the decoder

By far the most serious issue is locating the true end-of-packet marker. For example, if packets are defined to be the same as a block in MPEG, the decoder must move forward and hunt for the first end-of-block marker. The marker can be as simple as '10' in MPEG-2. Clearly, '10' is emulated in data at numerous locations and a method must be developed to disambiguate the true end-of-packet from those occurring naturally in the bidirectionally coded stream.

Variable length packets complicate decoding in another way even after the end-of-packet is correctly identified. Since N is not known, the decoder has no way of knowing when to stop backward decoding. This problem is solved by the insertion of a reverse flag. This flag has all the properties of the forward flag in that it causes detection failure on reverse decoding. Figure 2 illustrates an example where the packet carries one watermark bit. Reverse decoding begins once $\{f_1, f_2, f_3\}$ is encountered on forward decode. On reverse decoding, three full VLCs are correctly decoded before encountering the reverse flag. It is now known that the watermark bit(s) are between the two flags. The watermark bit enters the stream as $w \oplus b_4 \oplus c_2$. When decoding fails on forward decoding, $c_2$ is already recovered. Because of watermarking, $\{b_1, b_2, b_3, b_4\}$ is corrupted on the forward path but is fully recovered on the reverse path. The last VLC decoded on reverse is always the same VLC that has received the watermark. Putting these two pieces of information together reveals the watermark bit.
The above algorithm can be successfully implemented only if the end-of-packet can be identified. Since the XORing function in effect scrambles the VLCs carried within it, the decoder cannot quickly "jump" to the correct end-of-block because end-of-block markers are now indistinguishable from data. This problem can be solved by exploiting the constraints inherent in bidirectional codes together with flag and VLC properties. The decoder traverses the bitstream forward, from the point where forward decode halts, and pauses at every end-of-block emulation it finds within the bitstream. We call them potential end-of-blocks (PEOBs). From every PEOB, reverse decoding begins as if it were the true marker. Three cases may occur on reverse decoding:

Case 1: an invalid VLC or prefix is encountered before encountering the reverse flag
Case 2: no flag is found, decoded VLCs are legal but not necessarily correct. Decoding goes past where forward decoding stopped
Case 3: valid VLCs are decoded and the expected flag is encountered at the expected location

Both cases 1 and 2 indicate that the encountered end-of-blocks are bogus and have simply been emulated by data. The reason is that neither of the two cases can possibly occur if the encountered end-of-block was correct. When the flag is detected during forward decode, the bit position of the watermark bit within the packet is saved. During backward decode, if the extracted watermark bit does not have the same bit position as from the forward decode, it is another indication that the PEOB is bogus. Case 3 arises under two circumstances. The first, which accounts for the overwhelming majority, is when the encountered end-of-block is indeed the correct one. The second is by chance: for case 3 to arise by chance, all decoded VLCs must be valid, the exact flag must be encountered and it must be encountered at the precise location found on forward decoding.

This alignment of chance is unlikely but it does happen. If the decoder stumbles across a bogus PEOB and decides that it is authentic, not only is an incorrect watermark pulled, but synchronization is also lost from that point on, possibly indefinitely. There is a solution to this. In the encoding phase, it is possible to run the decoder on watermarked blocks before sending them out. Those blocks that result in misidentification of correct end-of-blocks are simply not bidirectionally coded. This action raises another question: how does the decoder know that it is looking at an unwatermarked block? When an uncoded block is put through a bidirectional decoder, the output is in reality a bidirectionally coded stream without flags or watermark. Parsing through the packet should not extract a flag. Thus the packet is declared unwatermarked if the end-of-block is reached without encountering a flag. There is still the possibility of the flag being randomly emulated in an uncoded block. This event can also be detected at the encoder before the stream is transmitted.
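The PEOB search can be outlined as a small control-flow sketch. It only covers the enumeration of marker emulations and the accept/reject decision; the reverse decoder itself is passed in as a hypothetical callable `reverse_decode`, and the `ReverseOutcome` record, like every name below, is an assumption introduced for this example rather than part of the paper.

```python
from typing import Callable, Iterator, NamedTuple, Optional


class ReverseOutcome(NamedTuple):
    all_vlcs_valid: bool             # False corresponds to Case 1
    flag_position: Optional[int]     # None corresponds to Case 2


def potential_eobs(bits: str, start: int, marker: str = "10") -> Iterator[int]:
    """Yield every position at or after `start` where the EOB pattern is emulated."""
    pos = bits.find(marker, start)
    while pos != -1:
        yield pos
        pos = bits.find(marker, pos + 1)


def is_plausible(outcome: ReverseOutcome, expected_flag_position: int) -> bool:
    """Accept only Case 3 with the flag at the position saved on forward decode."""
    if not outcome.all_vlcs_valid:
        return False                 # Case 1: invalid VLC or prefix
    if outcome.flag_position is None:
        return False                 # Case 2: no flag found
    return outcome.flag_position == expected_flag_position


def locate_end_of_block(
    bits: str,
    halt_pos: int,
    expected_flag_position: int,
    reverse_decode: Callable[[str, int], ReverseOutcome],
) -> Optional[int]:
    """Return the first PEOB whose reverse decode is plausible, or None."""
    for peob in potential_eobs(bits, halt_pos):
        if is_plausible(reverse_decode(bits, peob), expected_flag_position):
            return peob
    return None


# Stub reverse decoder for demonstration only: pretends the PEOB at bit 4 is genuine.
def stub_reverse_decode(bits: str, peob: int) -> ReverseOutcome:
    return ReverseOutcome(True, 3) if peob == 4 else ReverseOutcome(False, None)


print(list(potential_eobs("0101101", 0)))                        # [1, 4]
print(locate_end_of_block("0101101", 0, 3, stub_reverse_decode))  # 4
```

Returning the first plausible PEOB mirrors the text: a chance Case 3 match would be accepted here too, which is why the encoder-side check described above is needed.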
5. EXPERIMENTAL RESULTS

The proposed algorithm was implemented on VLCs from 3 MPEG-2 streams. Video footage was recorded using a Sony miniDV Handycam model DCR-TRV30 and transcoded to MPEG-2. The specifications of the three files are shown in Table 1.

| Video | Length (sec) | Size (kB) |
|-------|--------------|-----------|
| 1     | 3            | 102       |
| 2     | 3            | 185       |
| 3     | 10           | 1,222     |

Table 1: Sample File Specifications

The blocks of intracoded luminance and chrominance macroblocks are used as the basic bidirectional packet. As a result of this choice, the number of blocks in the video and their content are critical to the implementation of the algorithm and the resulting capacity. For this purpose, an exhaustive analysis of each file was conducted and the results are summarized in Table 2.
| Video | # blks used | # skp'd: escape | # skp'd: lack bits | Total skp'd | Avg. VLC/blk | Avg. bits/blk | Avg. PEOB/blk | Error |
|-------|-------------|-----------------|--------------------|-------------|--------------|---------------|---------------|-------|
| 1     | 1882        | 43              | 11472              | 11515       | 9.27         | 38.79         | 9.18          | 0     |
| 2     | 8263        | 556             | 26479              | 27035       | 10.79        | 49.67         | 11.91         | 0.15% |
| 3     | 74994       | 4645            | 124470             | 129115      | 12.67        | 61.49         | 14.71         | 0.13% |

Table 2: Simulation Results

To determine flag sequences, an exhaustive search of the 114 VLCs in table B-14 of the MPEG-2 standard provided a set of 566 binary strings of length less than 17 that are neither valid VLCs nor prefixes to other valid VLCs. The shortest available flag is 12 bits long. This number requires that blocks contain 25 bits or more.

In Table 2 the columns, read from left to right, represent the following information: video number, the number of blocks used, number of blocks skipped because they contained escape codes, number of blocks skipped because the total bits in the block were less than 25, total skipped blocks for the video, average number of VLCs per block, average number of bits per block, average number of PEOBs per block, and percentage error in the simulation. An error in the simulation is defined as the decoding of an incorrect watermark. It was found that this scenario occurs when a backward decode from a PEOB which is not the correct end-of-block finds the suffix flag in the correct position. The decoder will then find the correct watermark bit position but might not extract the correct value because it would perform the XOR operation with an incorrect value.

Increasing the number of VLCs per block causes the number of bits per block to increase. This increase also caused the PEOBs per block count to increase. It is interesting to note that an increase in PEOBs did not cause an increase in error rate. This simulation inserted one watermark bit per block, although no such restriction is in place. Therefore, the capacity of each video is exactly the value located in column two of Table 2.

The cost of lossless embedding is file size growth. The growth is governed by L bits per block. However, not all blocks are used in bidirectional encoding. The formula used for file size growth is as follows:
$$\%\,\mathrm{growth} = \frac{25 \times \mathrm{block\_size(bits)} \times 100}{\mathrm{file\_size(bits)}} \tag{5}$$
The results of these calculations are available in Table 3. Note that percent growth for longer videos is greater than for the others. The reason is that longer videos have used disproportionately more blocks than shorter videos. For example, Video 3 is 12 times larger than Video 1 but has used 40 times more blocks for watermarking, causing (5) to grow faster. The reason, of course, is that the shorter video had fewer qualifying blocks and many more were skipped.
| Video | Capacity (bits) | Growth (bits) | %Growth |
|-------|-----------------|---------------|---------|
| 1     | 1882            | 47050         | 5.76%   |
| 2     | 8263            | 206575        | 13.9%   |
| 3     | 74994           | 1874850       | 19.1%   |

Table 3: Size and Capacity Metrics

Watermarking capacity is clearly greater for longer videos but length alone does not control capacity. A less obvious factor is that scenes with more motion activity generate more coded macroblocks and hence provide additional embedding capacity.
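The entries of Table 3 can be reproduced with a few lines of arithmetic. The sketch below rests on two assumptions that are not stated in the text but match the published numbers: the growth in bits is taken as 25 bits times the number of blocks used (column two of Table 2), and 1 kB is taken as 1000 bytes when converting the file sizes of Table 1 to bits.

```python
PAD_BITS = 25                      # L bits added per bidirectionally coded block

# video number -> (blocks used, from Table 2; file size in kB, from Table 1)
VIDEOS = {1: (1882, 102), 2: (8263, 185), 3: (74994, 1222)}


def growth(blocks_used: int, size_kb: int):
    """Return (growth in bits, percent growth) for one video."""
    growth_bits = PAD_BITS * blocks_used
    file_bits = size_kb * 1000 * 8           # assumption: 1 kB = 1000 bytes
    return growth_bits, 100.0 * growth_bits / file_bits


for video, (blocks, size_kb) in VIDEOS.items():
    bits, pct = growth(blocks, size_kb)
    print(f"Video {video}: capacity {blocks} bits, growth {bits} bits, {pct:.2f}%")
```

Under these assumptions the computed percentages agree with Table 3 once truncated to the precision shown there.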
6. CONCLUSIONS

In this work we have proposed a lossless watermarking algorithm for compressed media. The novelty of the approach lies in the fact that watermarks are treated as intentional bit errors. We have used resynchronizing VLCs, originally designed to counter channel errors, to recover watermark bits in error-free channels. Decoding does not require access to the original title, hence the algorithm is both lossless and blind. In addition, applying the algorithm to actual compressed bitstreams introduced new problems that were not envisioned in the original bidirectional coding. They include decoding with no knowledge of packet length and searching for the end-of-packet following loss of synchronization.
7. ACKNOWLEDGMENTS

This research is supported in part by a grant from the US Air Force Office of Scientific Research.
REFERENCES

1. F. Hartung, B. Girod, "Watermarking of uncompressed and compressed video," Signal Processing, vol. 66, 1998, pp. 283-301.
2. B. Mobasseri, "A spatial video watermark that survives MPEG," IEEE International Conference on Information Technology: Coding and Computing, March 27-29, 2000, Las Vegas.
3. G. Langelaar, et al., "Watermarking of digital image and video," IEEE Signal Processing Magazine, vol. 17, no. 5, pp. 20-46, 2000.
4. D. Cross, B. Mobasseri, "Watermarking for self-authentication of compressed video," IEEE International Conference on Image Processing, Rochester, NY, September 22-26, 2002.
5. I. Setyawan et al., "Low bit rate video watermarking using temporally extended differential energy watermarking (DEW) algorithm," Proceedings SPIE Security and Watermarking of Multimedia Contents III, January 22-25, 2001, San Jose, CA.
6. S. Hemami, "Robust image transmission using resynchronizing variable length codes and error concealment," IEEE Journal on Selected Areas in Communications, vol. 18, no. 6, pp. 927-939, June 2000.
7. B. Girod, "Bidirectionally decodable streams of prefix code-words," IEEE Communications Letters, vol. 3, no. 8, pp. 245-247, August 1999.
8. B. Mobasseri, D. Cinalli, "Watermarking of compressed multimedia using error-resilient VLCs," Proc. IEEE Workshop on Multimedia Signal Processing, St. Thomas, The US Virgin Islands, Dec. 9-11, 2002.