Image Watermarking and Data Hiding Techniques
By Wong Hon Wah
A Thesis Submitted to The Hong Kong University of Science and Technology in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Electrical and Electronic Engineering
August 2003, Hong Kong
Authorization I hereby declare that I am the sole author of the thesis. I authorize the Hong Kong University of Science and Technology to lend this thesis to other institutions or individuals for the purpose of scholarly research. I further authorize the Hong Kong University of Science and Technology to reproduce the thesis by photocopying or by other means, in total or in part, at the request of other institutions or individuals for the purpose of scholarly research.
Wong Hon Wah
Image Watermarking and Data Hiding Techniques by Wong Hon Wah This is to certify that I have examined the above PhD Thesis and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the thesis examination committee have been made.
Department of Electrical and Electronic Engineering 20 August 2003
Acknowledgments

I would like to express my earnest gratitude to my supervisor, Professor Oscar Au, for his invaluable guidance and stimulation throughout this research work, and for giving me constructive comments and infinite support. I would also like to thank Professor Kenneth S. K. Law, Professor Ross Murch, Professor Bertram Shi, and Professor Michael S. Brown for serving as my thesis committee members, and for reviewing and giving valuable comments on my thesis. Special thanks go to Professor Edward J. Delp of Purdue University for serving as my external thesis examiner. I would like to thank Professor Bing Zeng, Dr. Wai Ho Mow, and Dr. Gary Shueng Han Chan for serving as my proposal committee members and for their comments on my research. I wish to thank my schoolmates and colleagues, in particular Leo Yiu Hung Fok, Alexandros Tourapis, Ming Sun Fu, Ming Fai Fu, Derek Wing Cheong Chan, Gene Yick Ming Yeung, and Andy Chang, for their support and help throughout the past few years at HKUST. Special thanks go to Gene for his help with the JPEG2000 image capacity estimation. Finally, I would like to dedicate my thesis to the memory of my mother.
TABLE OF CONTENTS

Title Page
Authorization Page
Signature Page
Acknowledgments
Table of Contents
List of Figures
List of Tables
Abstract

Chapter 1  Introduction

Chapter 2  Data Hiding in JPEG Compressed Domain by AC Modification
    2.1  The J-Mark
    2.2  Block Selection for J-Mark
    2.3  DCT Coefficients Selection
    2.4  Bits Embedding for J-Mark
    2.5  Data Extraction for J-Mark
    2.6  Experimental Results and Discussions for J-Mark

Chapter 3  Single Watermark Embedding
    3.1  Single Watermark Embedding (SWE)
    3.2  Distortion Analysis in SWE
    3.3  Watermark Decoding and Detection in SWE
    3.4  Experimental Results and Discussions for SWE

Chapter 4  Multiple Watermarks Embedding
    4.1  Multiple Watermarks Embedding (MWE)
    4.2  Multiple Bits Embedding in Sub-vector
        4.2.1  Direct Approach (DA)
        4.2.2  Iterative Approach (IA)
    4.3  Watermark Decoding and Detection in MWE
    4.4  Experimental Results and Discussions for MWE

Chapter 5  Iterative Watermark Embedding
    5.1  Iterative Watermark Embedding (IWE)
    5.2  Host Vector Extraction in IWE
    5.3  Bit Embedding in Sub-Vector
    5.4  Watermark Decoding and Detection in IWE
    5.5  Experimental Results and Discussions for IWE

Chapter 6  Direct JPEG Watermark Embedding
    6.1  Direct JPEG Watermark Embedding (DJWE)
    6.2  Host Vector Extraction in DJWE
    6.3  Bit Embedding in Sub-Vector
    6.4  Watermark Decoding and Detection in DJWE
    6.5  Experimental Results and Discussions for DJWE

Chapter 7  Capacity Estimation for JPEG and JPEG2000 Images
    7.1  The JPEG-to-JPEG Watermarking Model
    7.2  Capacity Estimation in J2J
    7.3  Necessary Condition to Achieve Capacity
    7.4  JND Estimation
        7.4.1  The Watson's Model
        7.4.2  Modification of Watson's Model
    7.5  Experimental Results and Discussions
    7.6  The JPEG2000-to-JPEG2000 Watermarking Model
    7.7  Background of JPEG2000
    7.8  Capacity Estimation in J2K2J2K
    7.9  The HVS Model Used in J2K2J2K
    7.10 Experimental Results and Discussions

Chapter 8  Conclusions

Bibliography and References

List of Publications
LIST OF FIGURES

2.1  Data hiding process of J-Mark.
2.2  Data retrieving process of J-Mark.
2.3  DCT coefficients used to compute E_AC.
2.4  Two texture block examples: 6 texture neighbors in (a), 8 in (b).
2.5  DCT coefficients that are candidates for watermarking.
2.6  Example of zigzag scanning of candidates to select 3 candidates.
2.7  Compression ratio Vs scaling factor (SF) for Lena.
2.8  Compression ratio Vs scaling factor (SF) for Pepper.
2.9  Number of bits embedded using J-Mark for Lena.
2.10 Number of bits embedded using J-Mark for Pepper.
2.11 PSNR comparison for Lena.
2.12 PSNR comparison for Pepper.
2.13 Increased file size after data hiding using J-Mark for Lena.
2.14 Increased file size after data hiding using J-Mark for Pepper.
2.15 JPEG compressed Lena, SF=1.
2.16 Lena with 1897 bits embedded, SF=1.
2.17 Selected 698 texture blocks for Lena, SF=1, T=2000.
2.18 JPEG compressed Lena, SF=4.
2.19 Lena with 584 bits embedded, SF=4, T=4000.
2.20 Selected 342 texture blocks for Lena, SF=4, T=4000.
2.21 JPEG compressed Pepper, SF=1.
2.22 Pepper with 1085 bits embedded, SF=1, T=2000.
2.23 Selected 366 texture blocks for Pepper, SF=1, T=2000.
2.24 JPEG compressed Pepper, SF=4.
2.25 Pepper with 213 bits embedded, SF=4, T=4000.
2.26 Selected 106 texture blocks for Pepper, SF=4, T=4000.
3.1  Modulation of watermark.
3.2  Modification of sub-vector in SWE watermark embedding for case 1.
3.3  Modification of sub-vector in SWE watermark embedding for case 2.
3.4  Modification of sub-vector in SWE watermark embedding for case 3.
3.5  Example of the distribution of the inner product ⟨Yi, Ki⟩.
3.6  Example of the distribution of xi1 for di = 480.
3.7  Example of the distribution of xi2 for di = 480.
3.8  Example of the distribution of xi3 for di = 480.
3.9  Comparison of estimated Ew and the experimental results.
3.10 Testing images used in the experiments.
3.11 Logo 'UST' used as the watermark.
3.12 Typical SWE-watermarked image.
3.13 Average detection score against JPEG compression for SWE.
3.14 Distribution of S1 of SWE under JPEG attack.
3.15 Detection error of SWE under JPEG attack.
3.16 Distribution of S1 of SWE under LPF attack.
3.17 Detection error of SWE under LPF attack.
3.18 Distribution of S1 of SWE under noise attack.
3.19 Detection error of SWE under noise attack.
3.20 SWE-watermarked image under JPEG attack.
3.21 Decoded watermark after the JPEG attack.
3.22 Random watermark detection results under JPEG attack.
3.23 SWE-watermarked image under LPF attack.
3.24 Decoded watermark under LPF attack.
3.25 Random watermark detection results under LPF attack.
3.26 SWE-watermarked image under noise attack.
3.27 Decoded watermark under noise attack.
3.28 Random watermark detection results under noise attack.
3.29 SWE-watermarked image under print-and-scan attack.
3.30 Decoded watermark under print-and-scan attack.
3.31 Random watermark detection results under print-and-scan attack.
3.32 SWE-watermarked image under Y-shearing attack by Stirmark.
3.33 SWE-watermarked image under Gaussian filtering attack by Stirmark.
3.34 SWE-watermarked image under 50% cropping attack by Stirmark.
3.35 SWE-watermarked image under latest small random distortion (1.05) attack by Stirmark.
3.36 SWE-watermarked image under 7 by 7 median filtering attack by Stirmark.
3.37 SWE-watermarked image under 40% noise attack by Stirmark.
3.38 SWE-watermarked image under watermark embedding attack (strength = 50) by Stirmark.
3.39 SWE-watermarked image under random lines removal (1 of 20) by Stirmark.
3.40 SWE-watermarked image under 50% rescaling attack by Stirmark.
3.41 SWE-watermarked image under rotation attack (30 degrees) by Stirmark.
4.1  Watermarks embedding process of MWE.
4.2  Watermark decoding process of MWE.
4.3  Modification of sub-vector using correlated random sub-vectors.
4.4  Modification of sub-vector using orthogonal random sub-vectors.
4.5  Original logo 'Alphabet'.
4.6  Average PSNR of MWE Vs number of watermarks (Q).
4.7  Complexity comparison between IA-R and IA-F.
4.8  Average detection score S1 of MWE (Q=5) Vs different SF.
4.9  MWE-watermarked image with 5 watermarks embedded (Q=5, IA-R).
4.10 Distribution of S1 of MWE (IA-R, Q=5) under JPEG attack.
4.11 Detection error of MWE under JPEG attack.
4.12 Distribution of S1 of MWE under LPF attack.
4.13 Detection error of MWE under LPF attack.
4.14 Distribution of S1 of MWE under noise attack.
4.15 Detection error of MWE under noise attack.
4.16 MWE-watermarked image under JPEG attack.
4.17 Decoded watermarks under JPEG attack.
4.18 Random watermark detection results of logo 'A' under JPEG attack.
4.19 MWE-watermarked image under LPF attack.
4.20 Decoded watermarks under LPF attack.
4.21 Random watermark detection results of logo 'A' under LPF attack.
4.22 MWE-watermarked image under noise attack.
4.23 Decoded watermarks under noise attack.
4.24 Random watermark detection results of logo 'A' under noise attack.
4.25 MWE-watermarked image under print-and-scan attack.
4.26 Decoded watermarks under print-and-scan attack.
4.27 Random watermark detection results of logo 'A' under print-and-scan attack.
5.1  The iteration loop in IWE.
5.2  Average detection score of IWE for JPEG images Vs SF of original JPEG images (no attack).
5.3  Average PSNR of IWE-watermarked images Vs SF of original JPEG image (no attack).
5.4  Average number of iterations of IWE per embedded bit.
5.5  JPEG compressed image.
5.6  IWE-watermarked JPEG image.
5.7  Distribution of S1 of IWE under JPEG attack.
5.8  Detection error of IWE under JPEG attack.
5.9  Distribution of S1 of IWE under LPF attack.
5.10 Detection error of IWE under LPF attack.
5.11 Distribution of S1 of IWE under noise attack.
5.12 Detection error of IWE under noise attack.
5.13 IWE-watermarked image under JPEG transcoding attack.
5.14 Decoded watermark under JPEG transcoding attack.
5.15 Random watermark detection results under JPEG transcoding attack.
5.16 IWE-watermarked image under LPF attack.
5.17 Decoded watermark under LPF attack.
5.18 Random watermark detection results under LPF attack.
5.19 IWE-watermarked image under noise attack.
5.20 Decoded watermark under noise attack.
5.21 Random watermark detection results under noise attack.
5.22 IWE-watermarked image under print-and-scan attack.
5.23 Decoded watermark under print-and-scan attack.
5.24 Random watermark detection results under print-and-scan attack.
6.1  An example of the proposed shuffling.
6.2  Watermarked image without shuffling.
6.3  Watermarked image after shuffling is applied.
6.4  PSNR of watermarked images Vs scaling factors.
6.5  BER in Gaussian noise contaminated images.
6.6  BER under JPEG transcoding attack.
6.7  JPEG compressed image with SF=1.
6.8  DJWE-watermarked image.
6.9  DJWE-watermarked image under LPF attack.
6.10 Decoded watermark under LPF attack.
6.11 Random watermark detection results under LPF attack.
6.12 DJWE-watermarked image under noise attack.
6.13 Decoded watermark under noise attack.
6.14 Random watermark detection results under noise attack.
7.1  JPEG-to-JPEG watermarking (J2J) model.
7.2  Magnified image using Watson's model.
7.3  Magnified image using modified Watson's model.
7.4  Compression ratio Vs scaling factor for Lena.
7.5  Compression ratio Vs scaling factor for Pepper.
7.6  PSNR of JPEG-compressed images and JND watermarked images of Lena.
7.7  PSNR of JPEG-compressed images and JND watermarked images of Pepper.
7.8  Estimated capacity of Lena.
7.9  Estimated capacity of Pepper.
7.10 Block capacity map of Lena.
7.11 Block capacity map of Pepper.
7.12 Block capacity histogram of Lena.
7.13 Block capacity histogram of Pepper.
7.14 An example of 3-level DWT decomposition.
7.15 Lena compressed by JPEG2000 at 1 bpp.
7.16 JND watermarked Lena at 1 bpp.
7.17 File size of compressed images Vs target bit rate.
7.18 PSNR of compressed images.
7.19 Estimated capacity Vs target bit rate.
LIST OF TABLES

2.1  Default quantization table of JPEG.
3.1  List of symbols in Chapter 3.
3.2  Stirmark results for affine transformation attacks.
3.3  Stirmark results for Gaussian filtering and sharpening attacks.
3.4  Stirmark results for cropping attacks.
3.5  Stirmark results for JPEG attacks.
3.6  Stirmark results for latest small random distortion attacks.
3.7  Stirmark results for median cut attacks.
3.8  Stirmark results for noise attacks.
3.9  Stirmark results for watermark embedding attacks.
3.10 Stirmark results for rescaling attacks.
3.11 Stirmark results for line removal attacks.
3.12 Stirmark results for small random distortions attacks.
3.13 Stirmark results for rotation and cropping attacks.
3.14 Stirmark results for rotation and scaling attacks.
3.15 Stirmark results for rotation attacks.
4.1  List of symbols in Chapter 4.
4.2  Stirmark results for affine transformation attacks.
4.3  Stirmark results for Gaussian filtering and sharpening attacks.
4.4  Stirmark results for cropping attacks.
4.5  Stirmark results for JPEG attacks.
4.6  Stirmark results for latest small random distortion attacks.
4.7  Stirmark results for median cut attacks.
4.8  Stirmark results for noise attacks.
4.9  Stirmark results for watermark embedding attacks.
4.10 Stirmark results for rescaling attacks.
4.11 Stirmark results for line removal attacks.
4.12 Stirmark results for small random distortions attacks.
4.13 Stirmark results for rotation and cropping attacks.
4.14 Stirmark results for rotation and scaling attacks.
4.15 Stirmark results for rotation attacks.
5.1  Stirmark results for affine transformation attacks.
5.2  Stirmark results for Gaussian filtering and sharpening attacks.
5.3  Stirmark results for cropping attacks.
5.4  Stirmark results for JPEG attacks.
5.5  Stirmark results for latest small random distortion attacks.
5.6  Stirmark results for median cut attacks.
5.7  Stirmark results for noise attacks.
5.8  Stirmark results for watermark embedding attacks.
5.9  Stirmark results for rescaling attacks.
5.10 Stirmark results for line removal attacks.
5.11 Stirmark results for small random distortions attacks.
5.12 Stirmark results for rotation and cropping attacks.
5.13 Stirmark results for rotation and scaling attacks.
5.14 Stirmark results for rotation attacks.
6.1  Stirmark results for affine transformation attacks.
6.2  Stirmark results for Gaussian filtering and sharpening attacks.
6.3  Stirmark results for cropping attacks.
6.4  Stirmark results for JPEG attacks.
6.5  Stirmark results for latest small random distortion attacks.
6.6  Stirmark results for median cut attacks.
6.7  Stirmark results for noise attacks.
6.8  Stirmark results for watermark embedding attacks.
6.9  Stirmark results for rescaling attacks.
6.10 Stirmark results for line removal attacks.
6.11 Stirmark results for small random distortions attacks.
6.12 Stirmark results for rotation and cropping attacks.
6.13 Stirmark results for rotation and scaling attacks.
6.14 Stirmark results for rotation attacks.
7.1  Parameters suggested in Watson's DWT model.
7.2  Optimal quantization factors.
8.1  Summary of proposed techniques in this thesis.
Image Watermarking and Data Hiding Techniques
by Wong Hon Wah

Department of Electrical and Electronic Engineering
The Hong Kong University of Science and Technology
Abstract

People are increasingly motivated to embed information such as owner information, date, time, camera settings, the event or occasion of the image, the image title, or even secret messages in digital images for value-added functionality and possibly secret communication. A novel sample-based method is proposed to embed information bits in the JPEG compressed domain. The proposed method, called J-Mark, embeds the information bits in the DCT coefficients with significant energy within selected blocks with significant masking properties.

The spread spectrum technique (SST) has been widely adopted for vector-based image and video watermarking in the past few years. Four novel techniques are proposed to embed watermarks for different purposes. The first, called Single Watermark Embedding (SWE), is used to embed a watermark bit sequence in digital images using two secret keys. The second, called Multiple Watermark Embedding (MWE), extends SWE to embed multiple watermarks simultaneously in the same watermark space while minimizing the watermark energy. The third, called Iterative Watermark Embedding (IWE), embeds watermarks in JPEG-compressed images; the proposed iterative approach largely prevents the potential removal of watermarks in the JPEG recompression process. The fourth, called Direct JPEG Watermark Embedding (DJWE), is an extension of IWE. DJWE embeds the watermarks with lower computational complexity than IWE and uses a Human Visual System (HVS) model to prioritize the coefficients to be altered to achieve good visual quality.

Two techniques for watermarking capacity estimation are proposed. The first estimates the capacity for JPEG-to-JPEG (J2J) image watermarking. In J2J image watermarking, the input is a JPEG image file and, after watermark embedding, the image is JPEG-compressed such that the output file is also a JPEG file. The second technique extends the first to JPEG2000-to-JPEG2000 (J2K-2-J2K) watermarking. In J2K-2-J2K, the input is a JPEG2000 image file and, after watermark embedding, the image is JPEG2000-compressed using the same quantization factors. Watson's Discrete Wavelet Transform (DWT) HVS model is used to estimate the just-noticeable difference (JND) of each DWT coefficient. The proposed techniques do not assume any specific watermarking method and thus apply to any watermarking method in the J2J and J2K-2-J2K frameworks.
CHAPTER 1 INTRODUCTION
In recent years, digital images can be captured easily with scanners, digital cameras and camcorders, and transmitted easily over the Internet. As a result, digital images appear widely on the Internet and the World Wide Web (WWW) and in storage media such as CD-ROM and DVD. One of the most popular image formats is JPEG, which can achieve high compression while retaining high image quality. Associated with the widespread circulation of images are issues of copyright infringement, authentication and privacy. One possible solution is to embed some invisible information into the images, where the embedded information can be extracted for different purposes. Digital watermarking is a process that embeds some information, called the watermark, into different kinds of media, called the cover work [34]. While some watermarks are visible [11], most watermarks of interest are invisible. There are many classes of invisible watermarks for different applications, such as fragile watermarks and robust watermarks. Fragile watermarks are designed to be broken easily by image processing operations. The broken watermark serves as an indication of alteration of the original image. Major applications include tampering detection of images placed on the WWW and authentication of images received from questionable sources. Robust watermarks are required to remain in the watermarked image even after it has been attacked. The attacks may be hostile attacks such as statistical averaging, signal processing, watermark estimation and removal, watermark counterfeiting, etc. The attacks may also be casual or unintended attacks, which are common image processing operations such as filtering, compression, scaling, cropping, etc. Some methods include spread spectrum in the frequency domain [2, 6, 12-13, 24, 28-30, 36-37, 39, 42, 45-47, 49-51, 54-58, 64, 91-93, 95] and the incorporation of perceptual models [13, 27, 37, 64]. Major applications include ownership establishment, copyright and distribution control. Data hiding watermarks, also called steganography [8], are used to embed data in images with the intention of recovering the data perfectly at the receiver. Such methods usually assume that there are no hostile or even casual attacks. Error control coding is usually used to combat channel noise and casual signal processing. Major applications include secret communication over the Internet and the embedding of value-added auxiliary data with such low economic value that there is no motivation for hostile attack. In recent years, many algorithms have been proposed to embed robust watermarks in digital images. Many of them focus on robustness to common signal processing such as low pass filtering, rotation, scaling, cropping and compression [17], and some claim that their algorithms are robust to JPEG compression [5-7]. However, those algorithms usually use normalized correlation as the measurement to detect the existence of the watermark information, which is not suitable for data hiding. In [9], several data hiding techniques are proposed, but they are not robust to JPEG compression. Embedding a bit sequence in a digital image is a difficult task since the bit sequence must be decoded correctly. Langelaar et al. [5] propose an algorithm to embed a bit sequence in a digital image by DCT coefficient removal, but the modification of DCT coefficients in smooth regions may result in visual artifacts. In [2, 6-7], the spread spectrum technique is used to embed the watermark, but the noise-like watermark may be suppressed significantly by JPEG compression. Van Schyndel et al. [14] insert the invisible data into the least significant bits (LSB) of the uncompressed image. Macq et al. [15] insert the watermark into the LSB only around
image contours. Caronni [16] hides small geometric patterns called tags in regions where the tags would be least visible, such as very bright, very dark or textured regions. Coltuc et al. [17] embed the watermark in the histogram. Bender et al. [18] choose random pairs of image points and increase the brightness of one while decreasing that of the other. Nikolaidis et al. [21] add a small positive number at random locations specified by the binary watermark pattern and use statistical hypothesis testing to detect the presence of the watermark. Voyatzis et al. [22] use dynamic systems (toral automorphisms) to generate chaotic orbits which are dense in the spatial domain and hide the watermark at the seemingly chaotic locations. While there are many robust watermarks in the DCT domain, there are relatively few existing data hiding watermarking techniques in the DCT domain [19-20]. Kim et al. [23] embed watermark bits as pseudo-random sequences in the frequency domain. Langelaar et al. [5] hide watermarks by removing or retaining selected DCT coefficients. Bors et al. [24] hide the watermark in JPEG images by forcing selected DCT blocks to satisfy certain linear or circular constraints. Koch et al. [25] select from some pre-defined pairs or triplets of DCT coefficients and use their relative strength to encode a bit of the watermark pattern. Some embed watermark patterns in the quantization module after the DCT [26] or in selected blocks based on human visual models [27].
Choi et al. [28, 54] utilize inter-block
correlation by forcing DCT coefficients of a block to be greater or smaller than the average of the neighboring blocks. Wu et al. [29] modify selected DCT coefficients by random shuffling and table lookup embedding. In this thesis, a sample-based method is proposed to embed information bits in the JPEG compressed domain. The proposed method, called J-Mark [58], embeds the information bits in the DCT coefficients with significant energy in the selected
blocks with significant masking properties. J-Mark will be described in Chapter 2. Some watermark schemes embed a series of watermark bits using some secret keys. Among these schemes, those requiring both the original data and the secret keys for watermark bit decoding are called private watermark schemes. Those requiring the secret keys but not the original data are called public or blind watermark schemes [36]. Other watermark schemes embed pseudo-random patterns as watermarks, and detection of the presence of the watermark patterns is performed on the test data. Those requiring both the original data and the watermark for watermark detection are called private watermark schemes. Those requiring only the watermark are called semi-private or semi-blind watermark schemes [36]. Usually, the robustness of private watermark schemes is good against many signal processing procedures such as JPEG compression and filtering. However, private schemes are not feasible in some situations, such as watermark detection in DVD players. On the other hand, blind watermark schemes detect the watermarks without the original data and are thus feasible in DVD players, but the trade-off is that the robustness is usually lower. Another weakness of blind watermark schemes is the relatively higher false alarm rate compared with private watermarking schemes. There are many existing private watermark schemes for robust watermarking. Cox et al. [2] use SST to embed the watermark in the DCT domain. To improve Cox's method, Lu et al. [37] use a cocktail watermark to improve the robustness and use the Human Visual System (HVS) to maintain high fidelity of the watermarked image. Hsu et al. [38-39] embed watermark bits by modifying the polarity of DCT and Discrete Wavelet Transform (DWT) coefficients and use a meaningful logo image as
the watermark. Huang et al. [40] embed a watermark pattern by modifying the DC components. Malvar et al. [35] introduce the improved spread spectrum technique. There are also many blind watermark schemes. Hartung [41] assumes small correlation between the secret key and the image and hides data using SST in the spatial or compressed domain. Wong et al. [7] embed the watermark in the log-2 spatial domain. Lu et al. [43] extend the cocktail watermark to a blind multi-purpose watermarking system which serves as both a robust and a fragile watermark, capable of detecting malicious modifications if the watermark is known. Hernandez et al. [44-45] use 2-D multi-pulse amplitude modulation and SST to embed bit sequences in digital images and develop an optimal detector. Langelaar et al. [47] embed a bit sequence by modifying the energy difference between adjacent blocks. Wong et al. [48] use a hash function to embed the watermark in the Least Significant Bit (LSB). Zhang et al. [49] embed a watermark pattern by modifying the DC and low frequency AC coefficients in the DCT domain. Some blind systems focus on embedding a rotation, scaling and translation (RST) invariant watermark pattern for watermark detection. Lin et al. [47] embed the watermark in the Fourier-Mellin transform domain. Solachidis et al. [51] use a circularly symmetric watermark in the Discrete Fourier Transform (DFT) domain. Licks et al. [52] use a different kind of circularly symmetric watermark and require exhaustive search in the watermark detection. Stankovic et al. [53] embed watermarks by means of the two-dimensional Radon-Wigner distribution with multiple watermark capabilities. All these RST invariant algorithms require the watermark for watermark detection. While most schemes embed only a single watermark, some allow for multiple watermark embedding [2, 37, 53, 57]. Cox [2] assumes the multiple watermarks are close to orthogonal and simply extends the single watermark algorithm to embed
them together. Some [37, 57] embed orthogonal watermarks and extend the single watermark algorithms for multiple watermarks. While many schemes embed the watermark in a raw image, very few embed watermarks in a JPEG-compressed image (a .jpg file), which is a common output file format for digital cameras. One problem of embedding watermarks in JPEG-compressed images is that the watermarked images need to be JPEG compatible. This implies that all DCT coefficients need to be re-quantized with the same quantization factor after the watermark insertion. The typically small-magnitude watermark can be completely removed in the re-quantization. Nevertheless, some techniques exist. Choi et al. [28, 54] and Luo et al. [55] use inter-block correlation to embed the bit information in selected DCT coefficients by adding or subtracting an offset to the mean value of the neighboring DCT coefficients. Hartung [41] and Arena et al. [56] use SST to embed watermarks in I-frames, P-frames or B-frames of MPEG-2 compressed video; I-frame compression is effectively JPEG compression. In this thesis, four blind watermarking techniques are proposed to embed a watermark in the watermark space. The first proposed technique, called Single Watermark Embedding (SWE) [91], uses secret keys to embed a meaningful binary logo image in the watermark space, using the spread spectrum technique and some novel features. It uses a dual-key system and does not require the watermark to be orthogonal to the original data, thus allowing bit sequence embedding even in small images. Based on SWE, the second proposed technique, called Multiple Watermark Embedding (MWE) [93-94], is developed to embed multiple watermarks simultaneously in the same watermark space. Different secret keys are used for different watermarks. Solutions are proposed for the special case when the secret keys are orthogonal and for the general case when the secret keys are correlated. It is shown that correlated secret
keys can be better than orthogonal keys. The third proposed technique, called Iterative Watermark Embedding (IWE) [92], embeds a watermark in a JPEG-compressed image to produce another JPEG-compressed image. A novel iterative approach is used to prevent the removal of the watermark in the re-quantization process. The fourth technique, called Direct JPEG Watermark Embedding (DJWE), is an extension of IWE. DJWE embeds the watermark with lower computational complexity than IWE, and a Human Visual System (HVS) model is used to prioritize the coefficients to be altered to achieve good visual quality. Most digital images are stored in the JPEG format, in digital cameras and on the WWW alike. To embed the watermark in such digital data, the input to the watermarking scheme is a JPEG image file and the output is also a JPEG image file. This kind of watermarking (or data hiding) scheme is called a JPEG-to-JPEG (J2J) watermarking scheme. There are many papers investigating the robustness of watermarks against JPEG compression, such as [2, 65-66]. Eggers et al. [66] analyze the quantization effect on the detection of watermarks by considering the additive watermark signal as a dithering signal. Although many watermarking (or data hiding) algorithms have been proposed to embed digital watermarks in uncompressed images, those algorithms may not be suitable for embedding watermarks in JPEG-compressed images (.jpg files). This is because the DCT coefficients in JPEG-compressed images have special statistical characteristics: they must be multiples of the corresponding quantization factors. These special characteristics reduce the degree of freedom for watermarking. If the output images are not JPEG compatible, the existence of the watermark may be detectable using steganalysis techniques [67]. If the output images are JPEG-compatible, all DCT coefficients must be re-quantized after the watermark
insertion, which further reduces the degree of freedom for watermarking. This is called the JPEG-to-JPEG (J2J) framework in this thesis. There are a few existing schemes for J2J watermarking [28, 30, 33, 41, 54, 58]. These methods embed different amounts of watermark bits into JPEG images while maintaining good visual quality of the watermarked JPEG images. However, none of them estimates the J2J data hiding capacity, that is, the maximum number of bits that can be embedded in JPEG image files. There are some existing methods to estimate the data hiding capacity of digital images [68-77, 96], though not for JPEG images. Most of them apply the work of Shannon [61] and Costa [62-63]. Servetto et al. [68] use statistical models to analyze the robustness of the spread spectrum technique and estimate the watermarking capacity against jamming noise. Barni et al. [69-70] use a generalized Gaussian density to model the watermark channels in the full-frame DCT coefficients. Moulin et al. [71] model coefficients in different domains and estimate the data hiding capacity under MSE constraints. Some papers combine the information-theoretic model [97] and perceptual models to estimate the capacity [72-73]. Some [74-75] focus on comparing the capacity among different transforms such as the identity transform, DCT, Karhunen-Loeve transform (KLT) and Hadamard transform. Fei et al. [74] suggest that the coefficients of the Slant transform have the highest capacity, while Ramkumar et al. [75] indicate that transforms with poor energy compaction properties, such as the Hadamard transform, tend to have higher capacity than those with higher energy compaction properties, such as the DCT. Sugihara [76] estimates the capacity by taking the robustness of the hidden data into account. Voloshynovskiy et al. [82] analyze the security of the hidden data and suggest different modulation schemes for different data hiding purposes. Kalker et al. [80] estimate the capacity of a particular data
hiding area, lossless data embedding, first proposed by Fridrich et al. [81]. In lossless data embedding, the original cover work can be restored at the decoder. This is particularly useful for many digital media such as medical images. Cohen et al. [79] analyze the capacity for private and public (or blind) data hiding schemes and the capacity under additive attacks. Instead of estimating the capacity, some propose realizations to approach the theoretical limit of capacity, such as [78, 82-83]. Pérez-González et al. [78] suggest using convolutional and orthogonal codes. Eggers et al. [84] propose the scalar Costa scheme (SCS), which treats data hiding as a communication-with-side-information problem and has good performance at high watermark-to-noise ratio (WNR). A technique is proposed in Chapter 7 to estimate the data hiding capacity of JPEG images in J2J watermarking schemes. There are two assumptions in J2J. The first assumption is that the watermarked images will be JPEG-compressed using either the original quantization table extracted from the input JPEG file or a new quantization table defined by the user. The second assumption is that the dimensions of the images are not changed by the watermark embedding. The J2J model makes no assumption on the domain in which the watermark is embedded. The algorithm is then extended to estimate the capacity of JPEG2000 compressed images. JPEG2000 is a new international standard for image coding [88-89]. It is known that the JPEG2000 standard gives better image quality than JPEG, and it is expected that JPEG2000 will become a popular image coding standard in the near future. In watermarking applications, the input to the watermarking scheme may be a JPEG2000 image file with the output also a JPEG2000 image file. This kind of watermarking (or data hiding) scheme is called a JPEG2000-to-JPEG2000
(J2K-2-J2K) watermarking scheme. Watson's Discrete Wavelet Transform (DWT) HVS model [90] is used to estimate the just-noticeable difference (JND) of each DWT coefficient.
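Before moving on, a toy example (with assumed numbers) makes the re-quantization problem discussed above concrete: a small additive watermark on a de-quantized coefficient is erased when the coefficient is re-quantized with the same factor.

```python
# A toy illustration, with assumed values, of why re-quantization can erase
# a small watermark in the J2J framework: the marked coefficient snaps back
# to the original multiple of the quantization factor.
q = 16                        # quantization factor for one DCT coefficient
coeff = 3 * q                 # de-quantized JPEG coefficient (multiple of q)
marked = coeff + 4            # small additive watermark, |w| < q/2
requantized = q * round(marked / q)
assert requantized == coeff   # the watermark is completely removed
```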
CHAPTER 2 DATA HIDING IN JPEG COMPRESSED DOMAIN - J-MARK
2.1 The J-Mark

In this chapter, a method called J-Mark is proposed to embed secret information in the JPEG compressed domain. Selected quantized DC and AC coefficients are used to embed the data. J-Mark is an extension of the previously proposed method called DC-Hide [30]. To embed data in AC coefficients, two problems arise. The first problem is that the quantization factors for AC coefficients are usually large, so even a slight modification of the quantized AC coefficients may significantly affect the visual quality of the image. The second problem is that the modification of AC coefficients may affect the identification of texture blocks at the decoder, since the texture blocks are selected based on the energy of the AC coefficients. To overcome these problems, a solution called J-Mark is proposed to hide data in JPEG images with negligible visual degradation. As shown in Figures 2.1 and 2.2, J-Mark embeds hidden data into a JPEG compressed image to generate another JPEG image with the hidden data. Three steps are involved in J-Mark: block selection, DCT coefficient selection, and modification of the selected DCT coefficients of the selected blocks. Alternatively, J-Mark can operate on an uncompressed image by compressing it with JPEG at the target quality factor and quantization table. It is assumed that the output JPEG image from J-Mark will not be transcoded or processed further before the hidden data are extracted. In other words, it is assumed that there are no hostile attacks or casual signal processing on
the JPEG compressed image. Under this assumption, the hidden data can be extracted perfectly. The data hiding process of J-Mark is shown in Figure 2.1 and the data extraction process is shown in Figure 2.2.

Figure 2.1: Data hiding process of J-Mark (JPEG compressed image → block selection → DCT coefficients selection → selected coefficients modification → JPEG image with hidden data; inputs: bits to be hidden, key).

Figure 2.2: Data extraction process of J-Mark (JPEG image with hidden data → block selection → DCT coefficients selection → bits decoding → extracted data; input: key).
2.2 Block Selection for J-Mark

J-Mark may hide 1, 2, 3 or more bits in an 8x8 block. However, not all blocks will be used. The AC energy of selected de-quantized DCT coefficients is used to select the texture blocks. The energy $E_{AC}$ of only some of the reconstructed DCT coefficients is first computed:

$$E_{AC} = \sum_{(i,j)\in N} \big( D_q(i,j) \cdot Q(i,j) \big)^2 \qquad (2.1)$$

where $D_q(i,j)$ is the quantized DCT coefficient, $Q(i,j)$ is the corresponding quantization factor, and $N$ is the shaded region in Figure 2.3.
Figure 2.3: DCT coefficients used to compute $E_{AC}$ (top left = DC coefficient).

Blocks with $E_{AC}$ larger than a threshold $T$ are declared texture blocks. The threshold $T$ can be a function of the additional quantization factor. However, false detection of texture blocks is possible. As small texture regions tend not to have good masking properties, J-Mark seeks to avoid such small texture regions. Thus a texture block needs to have at least $N_1$ texture block neighbors in its 3x3 neighborhood in order to be chosen for data hiding. When $N_1 = 0$, all texture blocks are chosen. Two texture block examples are shown in Figure 2.4. If $N_1 = 7$, the center texture block in case (a) of Figure 2.4 will not be selected, but the center block in case (b) will be selected.
Figure 2.4: Two texture block examples: 6 texture neighbors in (a), 8 in (b).

As will be explained in the next section, some DCT coefficients, shown in Figure 2.5, are candidates for watermarking and may be altered. Although this may change some of the AC coefficients in $N$ of the chosen texture blocks, the AC coefficients are changed in such a way that the $E_{AC}$ of the chosen texture blocks at the decoder remains larger than the threshold $T$, so they would continue to be classified
as texture blocks at the decoder. There are no changes to the non-chosen blocks or their $E_{AC}$, so their texture block classification does not change either. In other words, the texture block classification remains the same at both the encoder and the decoder. Consequently, the same blocks are chosen at the decoder and the block selection is consistent.
Figure 2.5: DCT coefficients that are candidates for watermarking.
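A minimal sketch of this block selection logic is given below, assuming the quantized coefficients arrive as a grid of 8x8 blocks. The exact membership of the region $N$ in Figure 2.3 is an assumption here, since the figure is not reproduced; low frequency AC positions are used as a stand-in.

```python
# Sketch of Section 2.2's selection: E_AC per (2.1), threshold T, and the
# N1-neighbor constraint in the 3x3 neighborhood. REGION stands in for the
# shaded region N of Figure 2.3 (an assumption, not taken from the text).
import numpy as np

REGION = [(i, j) for i in range(4) for j in range(4) if (i, j) != (0, 0)]

def ac_energy(Dq_block, Q):
    """E_AC of one 8x8 block: squared de-quantized coefficients in REGION."""
    return sum((Dq_block[i, j] * Q[i, j]) ** 2 for (i, j) in REGION)

def select_blocks(blocks, Q, T, N1):
    """Texture blocks (E_AC > T) with at least N1 texture neighbors."""
    h, w = blocks.shape[:2]                    # blocks: (h, w, 8, 8) array
    texture = np.array([[ac_energy(blocks[r, c], Q) > T for c in range(w)]
                        for r in range(h)])
    selected = np.zeros_like(texture)
    for r in range(h):
        for c in range(w):
            if texture[r, c]:
                nbrs = texture[max(r-1, 0):r+2, max(c-1, 0):c+2].sum() - 1
                selected[r, c] = nbrs >= N1    # keep only well-masked blocks
    return selected
```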
2.3 DCT Coefficient Selection

Zero, one or more bits are embedded in each chosen block. As most high frequency coefficients tend to be zero after quantization, especially when the bit rate is low, only the low frequency DCT coefficients shown in Figure 2.5 are considered as candidates for data hiding. The number of embedded bits within a chosen block depends on the energy profile of the candidates. Any candidate DCT coefficient whose absolute magnitude is less than a threshold $T_2$ is eliminated and will not be used for data hiding. At most $N_2$ of the remaining candidates are chosen, because too many altered DCT coefficients can result in visually detectable artifacts. In rare cases, all the candidates will be eliminated. When $N_2$ or fewer candidates remain, they are all selected for data hiding. When there are more than $N_2$ remaining candidates, they are scanned in a zigzag manner and the first $N_2$ candidates are chosen. An example is shown in Figure 2.6 with $N_2 = 3$. Part (a) of Figure 2.6 shows four remaining candidates after the energy thresholding, and the first $N_2 = 3$ encountered during the zigzag scanning are chosen, as shown in part (b).
Figure 2.6: Example of zigzag scanning of candidates to select 3 candidates: (a) 4 remaining candidates; (b) 3 selected coefficients.
2.4 Bits Embedding for J-Mark

J-Mark uses a 'randomized parity' to embed one bit of data into one selected quantized DCT coefficient of a selected 8x8 block. Suppose the $k$th data bit $w_k$ is to be hidden in the DCT coefficient at location $(i,j)$ of a selected block. The 'randomized parity' is defined as

$$\operatorname{mod}\!\left(\operatorname{round}\!\left(\frac{D_q(i,j)}{n_k}\right),\, 2\right) \qquad (2.3)$$

which is the parity of the rounded quotient of the quantized DCT coefficient $D_q(i,j)$ divided by a pseudo-random number $n_k$. When the 'randomized parity' and the desired data bit $w_k$ are equal, no change to $D_q(i,j)$ is needed. When they are different, $D_q(i,j)$ is altered in such a way that the resulting 'randomized parity' equals $w_k$. The pseudo-random number $n_k$ in J-Mark is uniformly distributed between two positive real numbers $R_1$ and $R_2$, generated with a user-defined key. Only the
key is needed for the data extraction. If $R_1 = R_2 = 1$, the 'randomized parity' becomes the ordinary parity of the quantized DCT coefficient. Note that the randomized parity is similar to the table lookup in [23]. However, the general table lookup can have a different non-uniform lookup table for each coefficient, which might require excessive memory. The randomized parity used in J-Mark is like a uniform lookup table with a different cell size for each selected DCT coefficient. The DCT coefficient with hidden data, $D_q'(i,j)$, is computed according to either (2.4) or (2.5):

$$D_q'(i,j) = \begin{cases} n_k \cdot \operatorname{round}\!\left(\dfrac{D_q(i,j)}{n_k} + 0.5\right) & \text{Case 1} \\[4pt] n_k \cdot \operatorname{round}\!\left(\dfrac{D_q(i,j)}{n_k} - 0.5\right) & \text{Case 2} \\[4pt] D_q(i,j) & \text{Case 3} \end{cases} \qquad (2.4)$$

where

Case 1: $D_q(i,j) \ge n_k \cdot \operatorname{round}\!\left(\dfrac{D_q(i,j)}{n_k}\right)$ and $\operatorname{mod}\!\left(\operatorname{round}\!\left(\dfrac{D_q(i,j)}{n_k}\right),2\right) \ne w_k$

Case 2: $D_q(i,j) < n_k \cdot \operatorname{round}\!\left(\dfrac{D_q(i,j)}{n_k}\right)$ and $\operatorname{mod}\!\left(\operatorname{round}\!\left(\dfrac{D_q(i,j)}{n_k}\right),2\right) \ne w_k$

Case 3: $\operatorname{mod}\!\left(\operatorname{round}\!\left(\dfrac{D_q(i,j)}{n_k}\right),2\right) = w_k$

$$D_q'(i,j) = \begin{cases} \operatorname{sign}(D_q(i,j)) \cdot n_k \cdot \left\lceil \operatorname{round}\!\left(\dfrac{|D_q(i,j)|}{n_k}\right) + 0.5 \right\rceil & \text{if } \operatorname{mod}\!\left(\operatorname{round}\!\left(\dfrac{D_q(i,j)}{n_k}\right),2\right) \ne w_k \\[4pt] D_q(i,j) & \text{if } \operatorname{mod}\!\left(\operatorname{round}\!\left(\dfrac{D_q(i,j)}{n_k}\right),2\right) = w_k \end{cases} \qquad (2.5)$$

where $\lceil x \rceil$ is the smallest integer greater than or equal to $x$, $\lfloor x \rfloor$ is the largest integer smaller than or equal to $x$, and $\operatorname{sign}(\cdot)$ is the signum function. It should be noted that
the $D_q'(i,j)$ obtained from either (2.4) or (2.5) can be decoded to recover the hidden bit $w_k$. When (2.4) is used, the distortion on $D_q(i,j)$ due to data hiding is minimized. However, the resulting $E_{AC}$ may be smaller than $T$, and $|D_q'(i,j)|$ may be smaller than $T_2$. If the resulting $E_{AC}$ is smaller than $T$, the block will be classified as a block with no hidden data in the data extraction process. If $|D_q'(i,j)|$ is smaller than $T_2$, this DCT coefficient will not be selected for extracting the hidden bit, even if the resulting $E_{AC}$ is larger than $T$. These two situations lead to a failure in data synchronization and are not desirable. In order to ensure that the correct blocks and DCT coefficients can be identified in the data extraction process, verification steps are carried out after $D_q'(i,j)$ is obtained by (2.4). If the resulting $E_{AC}$ is smaller than $T$, or the resulting absolute magnitude $|D_q'(i,j)|$ is smaller than $T_2$, (2.5) is used instead, since (2.5) guarantees $|D_q'(i,j)| \ge |D_q(i,j)|$; however, the distortion on $D_q(i,j)$ is larger when (2.5) is used.
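The embedding rules can be sketched as follows for a single coefficient. The exact form of the fallback rule (2.5) is reconstructed from the description above (it must flip the randomized parity while guaranteeing $|D_q'(i,j)| \ge |D_q(i,j)|$), so treat it as an assumption rather than the definitive formula.

```python
# Sketch of the randomized-parity embedding (2.3)-(2.5) and extraction (2.8)
# for one quantized coefficient D with pseudo-random number n and bit w.
import math

def rand_parity(D, n):
    return int(round(D / n)) % 2                    # (2.3)

def embed_bit(D, n, w):
    """Minimum-distortion embedding of bit w per (2.4)."""
    if rand_parity(D, n) == w:                      # Case 3: already correct
        return D
    if D >= n * round(D / n):                       # Case 1: move upward
        return n * round(D / n + 0.5)
    return n * round(D / n - 0.5)                   # Case 2: move downward

def embed_bit_outward(D, n, w):
    """Fallback (2.5), as reconstructed: flip the parity, never shrink |D|."""
    if rand_parity(D, n) == w:
        return D
    return math.copysign(1, D) * n * math.ceil(abs(D) / n + 0.5)

def extract_bit(D, n):
    return rand_parity(D, n)                        # (2.8)
```

A complete implementation would recompute $E_{AC}$ and $|D_q'(i,j)|$ after applying embed_bit and fall back to embed_bit_outward when either verification check fails, as described above.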
2.5 Data Extraction for J-Mark

To extract the hidden data at the decoder, the JPEG image with hidden data is processed as shown in Figure 2.2. It is assumed that no transcoding or other processing is performed on the JPEG image with hidden data. The block selection is the same as that described in Section 2.2. In other words, $E_{AC}$ is computed for every 8x8 block. Those blocks with $E_{AC} > T$ are declared texture blocks. The texture blocks with at least $N_1$ neighboring texture blocks are selected. By design,
the selected texture blocks at the decoder are identical to those selected at the encoder. The DCT coefficient selection is also the same as that described in Section 2.3: in each selected block, $N_2$ or fewer candidate DCT coefficients with absolute magnitude greater than or equal to $T_2$ are selected. To decode the watermark, the user-defined key is used to generate the pseudo-random numbers $n_k$. The extracted data bit $w_k'$ is decoded from the selected DCT coefficients $D_q(i,j)$ as

$$w_k' = \operatorname{mod}\!\left(\operatorname{round}\!\left(\frac{D_q(i,j)}{n_k}\right),\, 2\right) \qquad (2.8)$$
2.6 Experimental Results and Discussions for J-Mark

In the simulation, two testing images, 'Lena' and 'Pepper', are used. Both consist of 512 by 512 pixels, and only the luminance component is used. The original images are JPEG compressed with the default quantization matrix scaled by various scaling factors (SF) to achieve different compression ratios. Most research papers control the JPEG quality using the quality factor (QF); the SF is related to the QF by

$$SF = \begin{cases} \dfrac{50}{QF}, & QF \le 50 \\[4pt] 2 - \dfrac{QF}{50}, & 50 < QF < 100 \end{cases} \qquad (2.9)$$
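For reference, (2.9) transcribes directly into code:

```python
# A direct transcription of (2.9): mapping the JPEG quality factor (QF)
# to the quantization-table scaling factor (SF).
def qf_to_sf(qf: float) -> float:
    if qf <= 50:
        return 50.0 / qf
    return 2.0 - qf / 50.0    # valid for 50 < QF < 100
```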
The default quantization matrix, recommended in the JPEG standard, is shown in Table 2.1. Eight values of SF are used in the experiments: 0.5, 1, 1.5, 2, 2.5, 3, 3.5 and 4. The compression ratios obtained from JPEG compression using different SFs are shown in Figures 2.7 and 2.8 for Lena and Pepper respectively.
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68  109 103 77
24  35  55  64  81  104 113 92
49  64  78  87  103 121 120 101
72  92  95  98  112 100 103 99
Table 2.1: Default quantization table of JPEG.

In the block selection process, four values of the threshold $T$ are simulated: $T$ = 1000, 2000, 3000 and 4000. A smaller $T$ results in more blocks being selected for data hiding, which implies that more bits can be embedded. The block selection threshold $N_1$ is chosen to be 6. The DCT candidate selection threshold $N_2$ is chosen to be 3; in other words, at most 3 bits can be embedded in one selected texture block. Note that the number of blocks selected to embed the bit sequence depends on the image characteristics, the compression ratio and the threshold $T$. The threshold $T_2$ in the DCT coefficient selection process is chosen to be $3 \cdot Q(i,j)$, where $Q(i,j)$ is the quantization factor at the $(i,j)$th location of the DCT. As mentioned in Section 2.4, the pseudo-random numbers $n_k$ in J-Mark are uniformly distributed between two positive real numbers $R_1$ and $R_2$, and the length of the pseudo-random sequence is the same as that of the bit sequence. In the experiments, $R_1$ is set to 1. It is observed that when SF = 1, the maximum value of $R_2$ can be up to 3 if high fidelity is required, but when SF = 4, the maximum value of $R_2$ is only 2, as the quantization factors are large. Based on this observation, $R_2$ for other SF values is determined according to

$$R_2 = -\frac{1}{3}\cdot SF + \frac{10}{3} \qquad (2.10)$$

Without loss of generality, random bit sequences with approximately equal numbers of '1' and '0' bits are used in the experiments.
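A sketch of how the pseudo-random numbers $n_k$ might be generated under these settings follows; the key-to-seed mapping is an assumption, chosen only so that the decoder can regenerate the same sequence.

```python
# Sketch: n_k uniform in [R1, R2] with R1 = 1 and R2 from (2.10), seeded
# by the user key. The seeding scheme itself is an assumption.
import numpy as np

def generate_nk(key: int, num_bits: int, sf: float):
    r2 = -sf / 3.0 + 10.0 / 3.0          # (2.10): R2 = 3 at SF=1, 2 at SF=4
    rng = np.random.default_rng(key)     # key-seeded PRNG
    return rng.uniform(1.0, r2, size=num_bits)
```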
The numbers of bits embedded with different $T$ are shown in Figures 2.9 and 2.10 for Lena and Pepper respectively. The results using DC-Hide [30] are also shown in the figures for comparison. For Lena at SF = 0.5, 3075 bits are embedded using J-Mark with a 0.183 dB loss in PSNR. However, for Pepper at SF = 0.5, the increase in the number of bits embedded is only 876, because Pepper contains mostly smooth regions. For Pepper at SF = 4, only 106 texture blocks are selected by J-Mark. It is observed that neighboring constraints in the block selection process and a large value of $T_2$ should be used when SF is large, in order to ensure the fidelity of the outputs. As the quantization factors increase with the SF, even a small change in quantized AC coefficients may cause severe artifacts if the masking from high frequency signals is insufficient. The PSNR of the JPEG compressed images (no hidden data) and of the images with hidden data are shown in Figures 2.11 and 2.12 for Lena and Pepper respectively. J-Mark tends to incur a larger loss in PSNR at larger SF. The largest PSNR losses after data hiding are 0.86 dB and 0.52 dB for Lena and Pepper respectively. The file sizes before and after data hiding are shown in Figures 2.13 and 2.14. The largest increases in file size after data hiding, compared with the original JPEG images, are 239 bits and 137 bits for Lena and Pepper respectively, which are negligibly small compared with the number of bits embedded. Four data hiding examples are shown in Figures 2.15-2.26.
Figure 2.7: Compression ratio Vs scaling factor (SF) for Lena.
Figure 2.8: Compression ratio Vs scaling factor (SF) for Pepper.
Figure 2.9: Number of bits embedded using J-Mark for Lena.
Figure 2.10: Number of bits embedded using J-Mark for Pepper.
Figure 2.11: PSNR comparison for Lena.
Figure 2.12: PSNR comparison for Pepper.
Figure 2.13: Increased file size after data hiding using J-Mark for Lena.
Figure 2.14: Increase in file size after data hiding using J-Mark for Pepper.
Figure 2.15: JPEG compressed Lena, SF=1 (PSNR = 35.81dB, 163729 bits).
Figure 2.16: Lena with 1897 bits embedded, SF=1 (35.52dB, 163834 bits).
Figure 2.17: Selected 698 texture blocks for Lena, SF=1, T=2000.
Figure 2.18: JPEG compressed Lena, SF=4 (31.27dB, 69922 bits).
Figure 2.19: Lena with 584 bits embedded, SF=4, T=4000 (30.76dB, 69991 bits).
Figure 2.20: Selected 342 texture blocks for Lena, SF=4, T=4000.
Figure 2.21: JPEG compressed Pepper, SF=1 (34.76dB, 165897 bits).
Figure 2.22: Pepper with 1085 bits embedded, SF=1, T=2000 (34.63dB, 165950 bits).
Figure 2.23: Selected 366 texture blocks for Pepper, SF=1, T=2000.
Figure 2.24: JPEG compressed Pepper, SF=4 (30.87dB, 68878 bits).
Figure 2.25: Pepper with 213 bits embedded, SF=4, T=4000 (30.72dB, 68905 bits).
Figure 2.26: Selected 106 texture blocks for Pepper, SF=4, T=4000.
CHAPTER 3 SINGLE WATERMARK EMBEDDING
3.1 Single Watermark Embedding (SWE)

In this chapter, a vector-based watermarking technique called Single Watermark Embedding (SWE) is proposed.
SWE can be used to embed a single watermark in an image, and the embedded watermark is assumed to be represented by a bit sequence. In other words, the proposed SWE can also be used for data hiding. SWE can be applied in transform domains such as the DCT, and possibly the spatial domain, of an image. Some selected image pixels or transform coefficients are grouped to form a vector called the watermark host vector. The watermark host vector is divided into disjoint sub-vectors, and each sub-vector is used to embed one bit of information. Let the watermark host vector be $Y = [y_1, y_2, \ldots, y_M]$ with length $M$. The watermark $L = [l_1, l_2, \ldots, l_N]$ with $l_i \in \{0, 1\}$ is a bit sequence with length $N$, where $N < M$.
To compare the unweighted detection score $S_1$ with the weighted score $S_3$, suppose $N_1$ of the $N$ watermark bits are decoded correctly with probability $p_1$ and the remaining $N_2$ bits with probability $p_2$, and suppose the weights $b_i$ take only two values, $B_1$ for the first group and $B_2$ for the second, normalized so that $N_1 B_1 + N_2 B_2 = 1$ with $B_2 > \frac{1}{N} > B_1$. Writing $P(X_i = Y_i)$ for the probability that the $i$th bit is decoded correctly,

$$E(S_1) = \frac{1}{N}\sum_{i=1}^{N}\left[1\cdot P(X_i = Y_i) - 1\cdot P(X_i \ne Y_i)\right] = \frac{N_1(2p_1 - 1) + N_2(2p_2 - 1)}{N} = \frac{N_1}{N}(2p_1) + \frac{N_2}{N}(2p_2) - 1 \qquad (3.21)$$

$$E(S_3) = \sum_{i=1}^{N} b_i\left[1\cdot P(X_i = Y_i) - 1\cdot P(X_i \ne Y_i)\right] = N_1 B_1 (2p_1 - 1) + N_2 B_2 (2p_2 - 1) = N_1 B_1 (2p_1) + N_2 B_2 (2p_2) - 1 \qquad (3.22)$$

Let $\Delta = p_2 - p_1 > 0$. Then

$$E(S_1) = \frac{N_1}{N}(2p_1) + \frac{N_2}{N}(2p_1 + 2\Delta) - 1 = 2p_1 - 1 + \frac{2\Delta N_2}{N} \qquad (3.23)$$

$$E(S_3) = N_1 B_1 (2p_1) + N_2 B_2 (2p_1 + 2\Delta) - 1 = 2p_1 - 1 + 2 N_2 B_2 \Delta \qquad (3.24)$$

Therefore, $E(S_3) - E(S_1) = 2 N_2 \Delta \left( B_2 - \frac{1}{N} \right) > 0$.
44
the host vector as these components tend to have large energies such that the embedded watermarks tend to be robust against different kinds of attack. For the score S2 in (3.14), the weighting factor βi is chosen as follows. The default 8x8 quantization matrix in JPEG is bilinear interpolated to 512x512. Treating this as an image, the host vector of length M is extracted and divide them into N subvectors of length P. The weighting factor βi is chosen as the inverse of the sum of elements of the ith sub-vector such that the low frequency components have relatively large βi.
Figure 3.10: Testing images used in the experiments.
45
Figure 3.11: Logo ‘UST’ used as the watermark. The proposed SWE is used to embed the ‘UST’ logo in all the testing images. A typical SWE-watermarked image is shown in Figure 3.12. With very high PSNR, the watermarked images have very good visual quality. As expected, all the watermark bits can be decoded perfectly under no attack resulting in both detection scores S1 and S2 being 1. Several unintentional attacks are simulated, including JPEG compression, low-pass filtering, noise, and print-and-scan. In the JPEG compression attack, the watermarked images are JPEG compressed with the default quantization matrix scaled by various scaling factor (SF) to achieve different compression ratio. Ten trials with different keys D and K are performed for each SF to obtain the average detection scores. The first key D is a vector of 1320 random numbers generated independently from a Gaussian distribution with d = 480 and σ d = 4 . 2
The other key K is generated from a Gaussian distribution with k = 480 and σ k = 16 . 2
These parameters are chosen to achieve a PSNR of about 45 dB for the watermarked images. It is observed that at such PSNR, the watermarked images are almost indistinguishable from the original image to the human eyes. The typical average detection scores, S1 and S2, are shown in Figure 3.13. It is observed that, in most cases, S2 is larger than S1. This agrees with Section 3.2 that the expected value of S2 is larger than that of S1, in the case that βi takes on only two values, a larger one for the bits with low probability of error and vice versa. Perhaps S2 can be as good as, if not better than, S1 for watermark detection.
Figure 3.12: Typical SWE-watermarked image, PSNR = 46.12dB.
Figure 3.13: Average detection score against JPEG compression for SWE.
The sample distributions of $S_1$ in SWE watermark detection are shown in Figures 3.14, 3.16 and 3.18 for the JPEG attack, low pass filtering (LPF) attack and noise attack respectively. There are two types of detection error for a given detection threshold: type 1 being the false positive error and type 2 being the false negative error. The total detection errors (the sum of the type 1 and type 2 error probabilities) against different thresholds are shown in Figures 3.15, 3.17 and 3.19 respectively. All curves are averages of 100 trials with different keys. Six situations of JPEG compression attack, with SF = 0.5, 1, 1.5, 2, 2.5 and 3 applied to the SWE-watermarked images, are shown in Figure 3.14, together with the reference 'no watermark' situation. The sample distribution of 'no watermark' does not intersect any of the sample distributions of the six JPEG situations. As a result, any threshold value in the 'in-between' region in Figure 3.15 gives zero total detection error. Similarly, the sample distributions of 'watermarked' and 'no watermark' under the LPF attack do not intersect in Figure 3.16, and thus there are many thresholds in Figure 3.17 that give zero detection error. A 3x3 averaging filter (with coefficients of 1/9) is used in the LPF attack.
The LPF attack causes the decoded watermark to be somewhat noisy, with a bit error rate of 14.62%. Although the UST logo is visibly noisy, the detection scores remain large at S1 = 0.7076 and S2 = 0.7150. The watermark is clearly distinguishable in the random watermark test. Six situations of Gaussian noise attack with different noise variances (Var) are shown in Figure 3.18, together with the ‘no watermark’ situation. Again, the non-intersection of ‘no watermark’ with the other sample distributions leads to the possibility of zero detection error in Figure 3.19. A typical example of SWE under Gaussian noise attack is shown in Figure 3.26. The image attacked by additive zero-mean Gaussian noise with variance 225 is noisy, with a PSNR of only 24.26 dB. The severe noise causes the decoded watermark to be very noisy, with a bit error rate of 32.35%. Although the ‘UST’ logo can hardly be recognized and the detection scores are low at S1 = 0.3530 and S2 = 0.3537, the watermark is still distinguishable in the random watermark test. In the print-and-scan attack, the watermarked image of Figure 3.12 is printed at 180 dpi (2.84 inch x 2.84 inch) on photo paper using a 2880-dpi Epson 895 printer. It is then scanned with a 2400-dpi Epson 1250 scanner, resulting in an image with approximately 7000x7000 pixels. Software (e.g. Photoshop) is used to manually rotate the scanned image to an upright position, to crop out the image region, and to resample it down to a 512x512 image using bi-cubic interpolation. Typical results for the JPEG attack at SF = 3.0 are shown in Figures 3.20-3.22. At SF = 3.0 (0.319 bpp) with a PSNR of 32.23 dB, the JPEG-compressed image shows visible artifacts. The decoded watermark in Figure 3.21 (the 44x30 binary logo) is noisy with 25.08% bit errors and the UST logo is barely visible, but the detection scores remain quite large with S1 = 0.4985 and S2 = 0.5208. The watermark can be
clearly distinguished in the random watermark test in Figure 3.22. In the simulations, the score S2 tends to have larger magnitude than S1, suggesting that it may be more suitable for watermark detection. The results of the low-pass filtering attack for ‘Lena’ are shown in Figures 3.23-3.25. A 3x3 averaging filter (with coefficients of 1/9) is applied to the watermarked image in Figure 3.12. The resulting image, with a PSNR of 31.85 dB, is blurred as expected. The decoded watermark (the 44x30 binary logo) is somewhat noisy with 14.62% bit errors, but the UST logo remains visible. The detection scores remain large with S1 = 0.7076 and S2 = 0.7150. The watermark is clearly distinguishable in the random watermark test. The results of the noise attack for ‘Lena’ are shown in Figures 3.26-3.28. A zero-mean Gaussian noise with variance 225 is added to the watermarked image. The resulting image, with a PSNR of 24.26 dB, is noisy. The decoded watermark is very noisy with 32.35% bit errors and the UST logo can hardly be recognized. The detection scores are low with S1 = 0.3530 and S2 = 0.3537, but the watermark is still distinguishable in the random watermark test. The results of the print-and-scan attack for ‘Lena’ are shown in Figures 3.29-3.31. The resampled image looks good, although its PSNR is only 24.79 dB; the low PSNR suggests that the manual rotation and cropping might have led to slight misalignment. The decoded watermark is very noisy with 22.95% bit errors and the ‘UST’ logo is barely recognizable. The detection scores are low with S1 = 0.5409 and S2 = 0.5232, but the watermark is still distinguishable in the random watermark test. The score S2 is smaller than S1 in this case.
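The reported numbers are consistent with S1 being the normalized correlation between the decoded and original bit sequences, i.e. S1 = 1 - 2*BER (for example, a 25.08% bit error rate gives 1 - 2(0.2508) ≈ 0.4984, matching the 0.4985 above). A minimal sketch under that reading, together with the total-error computation behind Figures 3.15, 3.17 and 3.19, assuming NumPy; the function names are illustrative.

import numpy as np

def score_s1(w_decoded, w_original):
    """Normalized correlation of two bit sequences in {0, 1}; equals 1 - 2*BER."""
    a = 2.0 * np.asarray(w_decoded) - 1.0   # map bits to {-1, +1}
    b = 2.0 * np.asarray(w_original) - 1.0
    return float(np.mean(a * b))

def total_error(s_watermarked, s_no_watermark, threshold):
    """Type 1 (false positive) plus type 2 (false negative) error rate at one threshold."""
    type1 = np.mean(np.asarray(s_no_watermark) >= threshold)
    type2 = np.mean(np.asarray(s_watermarked) < threshold)
    return float(type1 + type2)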
Figure 3.14: Distribution of S1 of SWE under JPEG attack.
Figure 3.15: Detection error of SWE under JPEG attack.
Figure 3.16: Distribution of S1 of SWE under LPF attack.
Figure 3.17: Detection error of SWE under LPF attack.
Figure 3.18: Distribution of S1 of SWE under noise attack.
Figure 3.19: Detection error of SWE under noise attack.
Figure 3.20: SWE-watermarked image under JPEG attack. SF=3, PSNR=32.23dB, bpp=0.319.
Figure 3.21: Decoded watermark after the JPEG attack.
Figure 3.22: Random watermark detection results under JPEG attack.
Figure 3.23: SWE-watermarked image under LPF attack, PSNR = 31.85dB.
Figure 3.24: Decoded watermark under LPF attack.
Figure 3.25: Random watermark detection results under LPF attack.
Figure 3.26: SWE-watermarked image under noise attack. PSNR=24.26dB, variance=225.
Figure 3.27: Decoded watermark under noise attack.
Figure 3.28: Random watermark detection results under noise attack.
Figure 3.29: SWE-watermarked image under print-and-scan attack, PSNR = 24.79dB.
Figure 3.30: Decoded watermark under print-and-scan attack.
Figure 3.31: Random watermark detection results under print-and-scan attack.
The embedded watermark is further tested under Stirmark 4.0 [98-99] attacks. The default attacks of Stirmark 4.0 are:
• Affine transformation attacks
• Gaussian filtering attack
• Sharpening attack
• Cropping attacks
• JPEG compression attacks
• Latest small random distortion attacks
• Median filtering attacks
• Noise attacks
• Watermark embedding attacks
• Rescaling attacks
• Lines removal attacks
• Small random distortions attacks
• Rotation-and-cropping attacks
• Rotation-and-scaling attacks
• Rotation attacks
• Self similarities attacks
Typical examples under the above attacks are shown in Figures 3.32-3.41; most of the images under Stirmark attacks are highly distorted. The default profile of Stirmark is used in the simulation. The watermark detection results are shown in Tables 3.2-3.15. The self similarities attacks apply to color images only, so their results are not shown. As the image dimensions may change after an attack, two simple ways are used to obtain a 512 by 512 pixel image for watermark detection.
In the first way, the top-left 512 by 512 pixels are used; if either dimension of the test image is less than 512, zeros are padded to give the 512 by 512 image. In the second way, the image is rescaled to 512 by 512 using bicubic interpolation. The detection scores obtained using the first way are denoted as S1 and S2, and the detection scores for the second way are denoted as S1’ and S2’. For the affine transformation attacks, the detection scores in most cases are very small; only the scores for a small amount of X-shearing, and S1’, S2’ for a small amount of Y-shearing, are higher than 0.2. This implies that the watermark is not very robust to affine transformations, because the watermark is embedded in the DCT domain, which is known to be sensitive to geometric attacks. For the Gaussian filtering and sharpening attacks, the detection scores are small because the image is highly distorted, as shown in Figure 3.33. The watermark cannot be detected under the cropping attacks either. For the JPEG compression attacks, similar to the results shown above, the detection scores are higher than 0.4 for all quality factors. For the median filtering attacks, the detection scores are large for the 3 by 3 and 5 by 5 median filters but low for larger median filters. For the noise attacks, the detection scores are small for all nonzero noise levels since the noise energy is too large; an example for noise = 40% is shown in Figure 3.37. For the watermark embedding attacks, a random noise with different strengths is added to the watermarked image; as the embedded watermark is uncorrelated with the added noise, all the embedded bits can be decoded correctly, giving detection scores of 1. For the rescaling and lines removal attacks, the detection scores are very small when the first way is used to obtain the 512 by 512 image, but become much larger when the second way is used. For the rotation attacks, the detection scores are very small when the watermarked image is rotated by more than ±0.25 degrees.
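The two image-normalization steps described above can be sketched as follows, assuming NumPy and Pillow for a grayscale image; the function names are illustrative.

import numpy as np
from PIL import Image

def crop_or_pad_512(img):
    """First way: take the top-left 512x512 region, zero-padded if the image is smaller."""
    out = np.zeros((512, 512), dtype=img.dtype)
    h, w = img.shape
    out[:min(h, 512), :min(w, 512)] = img[:512, :512]
    return out

def rescale_512(img):
    """Second way: rescale the whole image to 512x512 using bicubic interpolation."""
    return np.asarray(Image.fromarray(img).resize((512, 512), Image.BICUBIC))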
Figure 3.32: SWE-watermarked image under Y-Shearing attack by Stirmark.
Figure 3.33: SWE-watermarked image under Gaussian filtering attack by Stirmark.
Figure 3.34: SWE-watermarked image under 50% cropping attack by Stirmark.
Figure 3.35: SWE-watermarked image under latest small random distortion (1.05) attack by Stirmark.
Figure 3.36: SWE-watermarked image under 7 by 7 median filtering attack by Stirmark.
Figure 3.37: SWE-watermarked image under 40% noise attack by Stirmark.
Figure 3.38: SWE-watermarked image under watermark embedding attack (strength = 50) by Stirmark.
Figure 3.39: SWE-watermarked image under random lines removal (1 of 20) by Stirmark.
Figure 3.40: SWE-watermarked image under 50% rescaling attack by Stirmark.
Figure 3.41: SWE-watermarked image under rotation attack (30 degrees) by Stirmark.
Attack                  S1      S2      S1'     S2'
Affine 1 (Y-Shearing)   0.068   0.064   0.250   0.253
Affine 2 (Y-Shearing)  -0.023  -0.025   0.020   0.012
Affine 3 (X-Shearing)   0.221   0.200   0.330   0.310
Affine 4 (X-Shearing)   0.061   0.046   0.086   0.065
Affine 5 (XY-Shearing)  0.014   0.014   0.009   0.011
Affine 6 (General)      0.029   0.036   0.052   0.053
Affine 7 (General)      0.033   0.034  -0.014  -0.023
Affine 8 (General)     -0.024  -0.027   0.038   0.040
Table 3.2: Stirmark results for affine transformation attacks.

Attack              S1      S2      S1'     S2'
Gaussian Filtering  0.118   0.087   0.118   0.087
Sharpening         -0.048  -0.062  -0.048  -0.062
Table 3.3: Stirmark results for Gaussian filtering and sharpening attacks.

Attack          S1      S2      S1'     S2'
Cropping (75%)  0.020   0.014   0.000   0.000
Cropping (50%) -0.006  -0.015  -0.044  -0.045
Cropping (25%) -0.029  -0.031   0.017   0.018
Cropping (20%) -0.009  -0.010  -0.065  -0.077
Cropping (15%) -0.017  -0.015  -0.039  -0.047
Cropping (10%) -0.008  -0.016  -0.042  -0.053
Cropping (5%)  -0.027  -0.039  -0.030  -0.039
Cropping (2%)  -0.038  -0.047  -0.035  -0.044
Cropping (1%)  -0.026  -0.036  -0.039  -0.048
Table 3.4: Stirmark results for cropping attacks.

Attack       S1      S2      S1'     S2'
JPEG (100%)  1.000   1.000   1.000   1.000
JPEG (90%)   1.000   1.000   1.000   1.000
JPEG (80%)   1.000   1.000   1.000   1.000
JPEG (70%)   1.000   1.000   1.000   1.000
JPEG (60%)   0.998   0.999   0.998   0.999
JPEG (50%)   0.995   0.995   0.995   0.995
JPEG (40%)   0.974   0.977   0.974   0.977
JPEG (35%)   0.945   0.952   0.945   0.952
JPEG (30%)   0.861   0.873   0.861   0.873
JPEG (25%)   0.748   0.773   0.748   0.773
JPEG (20%)   0.592   0.616   0.592   0.616
JPEG (15%)   0.470   0.490   0.470   0.490
Table 3.5: Stirmark results for JPEG attacks.
Attack                                 S1      S2      S1'     S2'
Latest Small Random Distortion (0.95)  0.017   0.015   0.017   0.015
Latest Small Random Distortion (1)    -0.030  -0.023  -0.030  -0.023
Latest Small Random Distortion (1.05) -0.041  -0.037  -0.041  -0.037
Latest Small Random Distortion (1.1)  -0.052  -0.053  -0.052  -0.053
Table 3.6: Stirmark results for latest small random distortion attacks.

Attack          S1      S2      S1'     S2'
Median Cut (3)  0.874   0.883   0.874   0.883
Median Cut (5)  0.315   0.349   0.315   0.349
Median Cut (7)  0.036   0.048   0.036   0.048
Median Cut (9) -0.003  -0.004  -0.003  -0.004
Table 3.7: Stirmark results for median cut attacks.

Attack           S1      S2      S1'     S2'
Add Noise (0)    1.000   1.000   1.000   1.000
Add Noise (20)   0.012   0.006   0.012   0.006
Add Noise (40)   0.038   0.042   0.038   0.042
Add Noise (60)   0.006   0.005   0.006   0.005
Add Noise (80)   0.008   0.010   0.008   0.010
Add Noise (100)  0.015   0.013   0.015   0.013
Table 3.8: Stirmark results for noise attacks.

Attack                     S1      S2      S1'     S2'
Watermark Embedding (0)    1.000   1.000   1.000   1.000
Watermark Embedding (10)   1.000   1.000   1.000   1.000
Watermark Embedding (20)   1.000   1.000   1.000   1.000
Watermark Embedding (30)   1.000   1.000   1.000   1.000
Watermark Embedding (40)   1.000   1.000   1.000   1.000
Watermark Embedding (50)   1.000   1.000   1.000   1.000
Watermark Embedding (60)   1.000   1.000   1.000   1.000
Watermark Embedding (70)   1.000   1.000   1.000   1.000
Watermark Embedding (80)   1.000   1.000   1.000   1.000
Watermark Embedding (90)   1.000   1.000   1.000   1.000
Watermark Embedding (100)  1.000   1.000   1.000   1.000
Table 3.9: Stirmark results for watermark embedding attacks.
Attack          S1      S2      S1'     S2'
Rescale (50%)  -0.014  -0.010   0.761   0.765
Rescale (75%)   0.055   0.059   0.986   0.986
Rescale (90%)  -0.044  -0.038   0.571   0.546
Rescale (110%)  0.000  -0.002   0.997   0.996
Rescale (150%) -0.005  -0.002   0.705   0.687
Rescale (200%) -0.039  -0.044   0.753   0.737
Table 3.10: Stirmark results for rescaling attacks.

Attack                   S1      S2      S1'     S2'
Remove Lines (1 of 10)  -0.055  -0.051   0.620   0.601
Remove Lines (1 of 20)  -0.002   0.003   0.547   0.521
Remove Lines (1 of 30)  -0.009  -0.021   0.611   0.589
Remove Lines (1 of 40)  -0.018  -0.019   0.505   0.489
Remove Lines (1 of 50)   0.011   0.022   0.655   0.637
Remove Lines (1 of 60)  -0.020  -0.024   0.562   0.539
Remove Lines (1 of 70)   0.017   0.015   0.671   0.648
Remove Lines (1 of 80)  -0.008  -0.011   0.630   0.609
Remove Lines (1 of 90)   0.023   0.026   0.591   0.577
Remove Lines (1 of 100)  0.017   0.014   0.626   0.599
Table 3.11: Stirmark results for line removal attacks.

Attack                           S1      S2      S1'     S2'
Small Random Distortions (0.95)  0.027   0.037   0.027   0.037
Small Random Distortions (1)     0.023   0.029   0.023   0.029
Small Random Distortions (1.05)  0.044   0.043   0.044   0.043
Small Random Distortions (1.1)   0.035   0.038   0.035   0.038
Table 3.12: Stirmark results for small random distortions attacks.

Attack                       S1      S2      S1'     S2'
RotationCrop (-0.25 degree) -0.061  -0.064   0.238   0.220
RotationCrop (-0.5 degree)   0.026   0.028   0.065   0.066
RotationCrop (-0.75 degree) -0.058  -0.050   0.000   0.003
RotationCrop (-1 degree)    -0.035  -0.043   0.009   0.014
RotationCrop (-2 degree)    -0.014  -0.014  -0.056  -0.051
RotationCrop (0.25 degree)   0.055   0.053   0.200   0.182
RotationCrop (0.5 degree)    0.023   0.025   0.011   0.020
RotationCrop (0.75 degree)  -0.030  -0.024  -0.003  -0.005
RotationCrop (1 degree)     -0.026  -0.025   0.029   0.027
RotationCrop (2 degree)     -0.003  -0.011   0.020   0.022
Table 3.13: Stirmark results for rotation and cropping attacks.
Attack                             S1      S2      S1'     S2'
Rotation and Scale (-0.25 degree)  0.215   0.200   0.215   0.200
Rotation and Scale (-0.5 degree)   0.059   0.057   0.059   0.057
Rotation and Scale (-0.75 degree)  0.003   0.008   0.003   0.008
Rotation and Scale (-1 degree)     0.012   0.016   0.012   0.016
Rotation and Scale (-2 degree)    -0.062  -0.059  -0.062  -0.059
Rotation and Scale (0.25 degree)   0.195   0.179   0.195   0.179
Rotation and Scale (0.5 degree)    0.006   0.014   0.006   0.014
Rotation and Scale (0.75 degree)   0.026   0.018   0.026   0.018
Rotation and Scale (1 degree)      0.018   0.019   0.018   0.019
Rotation and Scale (2 degree)      0.035   0.037   0.035   0.037
Table 3.14: Stirmark results for rotation and scaling attacks.

Attack                   S1      S2      S1'     S2'
Rotation (-0.25 degree)  0.050   0.039   0.174   0.155
Rotation (-0.5 degree)   0.002  -0.004  -0.039  -0.044
Rotation (-0.75 degree) -0.026  -0.030   0.012   0.009
Rotation (-1 degree)    -0.021  -0.013  -0.020  -0.012
Rotation (-2 degree)    -0.029  -0.028  -0.011  -0.007
Rotation (0.25 degree)   0.036   0.033   0.164   0.150
Rotation (0.5 degree)   -0.026  -0.023  -0.012  -0.022
Rotation (0.75 degree)  -0.023  -0.022  -0.003   0.002
Rotation (1 degree)     -0.003  -0.009  -0.005  -0.006
Rotation (2 degree)      0.042   0.042   0.021   0.020
Rotation (5 degree)      0.000   0.017  -0.008  -0.011
Rotation (10 degree)    -0.026  -0.030   0.033   0.036
Rotation (15 degree)     0.058   0.065   0.009   0.013
Rotation (30 degree)     0.017   0.025   0.067   0.072
Rotation (45 degree)     0.023   0.021  -0.005   0.004
Rotation (90 degree)    -0.030  -0.029  -0.030  -0.029
Table 3.15: Stirmark results for rotation attacks.
CHAPTER 4 MULTIPLE WATERMARKS EMBEDDING
4.1
Multiple Watermarks Embedding (MWE)
In this chapter, the SWE is generalized to embed multiple watermarks in the same image while retaining high image quality. In SWE, only one watermark can be embedded in the image, with one bit embedded in each sub-vector. In MWE, two or more watermarks are embedded simultaneously using different sets of secret keys, with each key set corresponding to exactly one watermark. The watermark host vector is extracted from some domain of the image and split into sub-vectors. Each sub-vector is used to embed two or more bits, and the number of bits embedded in each sub-vector equals the number of watermarks. The goal of MWE is to embed more than one bit in a sub-vector while keeping the distortion of the sub-vector due to watermarking as small as possible. The watermark embedding process is shown in Figure 4.1. One advantage of the proposed MWE is that it prevents crosstalk between different watermarks. To decode or detect a particular watermark, only the corresponding key set is needed at the decoder, and each embedded watermark bit sequence can be decoded independently. The watermark decoding process is shown in Figure 4.2. The symbols used in this chapter are listed in Table 4.1.
Figure 4.1: Watermarks embedding process of MWE.
Figure 4.2: Watermark decoding process of MWE.
Symbol                                      Description
Y = (y_1, y_2, ..., y_M)                    Watermark host vector
Y_i = (y_{i_1}, y_{i_2}, ..., y_{i_P})      ith sub-vector of Y
M                                           Length of host vector
N                                           Length of bit sequence
P                                           Length of sub-vector
Q                                           Number of watermarks
D_j = (d_{j,1}, d_{j,2}, ..., d_{j,N})      First key for the jth watermark
W_j = (w_{j,1}, w_{j,2}, ..., w_{j,N})      Modulated jth watermark bit sequence
K_j = (k_{j,1}, k_{j,2}, ..., k_{j,M})      Second key for the jth watermark
K_{j,i}                                     ith sub-vector of K_j
α_{j,i}                                     Scaling factor for the ith bit of the jth watermark
A_i = [α_{1,i}, α_{2,i}, ..., α_{Q,i}]      Set of scaling factors for the ith bit
Y' = (y_1', y_2', ..., y_M')                Watermarked vector
Y_i' = (y_{i_1}', y_{i_2}', ..., y_{i_P}')  Watermarked sub-vector
W' = (w_1', w_2', ..., w_N')                Decoded watermark bit sequence
d̄                                           Mean value of d_{j,i}
σ_d^2                                       Variance of d_{j,i}
k̄                                           Mean value of k_{j,i}
σ_k^2                                       Variance of k_{j,i}
E_w                                         Energy of watermark
E_{w_i}                                     Energy of the ith bit of the watermark
S1                                          Detection score
S2                                          Weighted detection score
β_i                                         Weighting factor for the ith bit
Table 4.1: List of symbols in Chapter 4.
4.2
Multiple Bits Embedding in Sub-vector
In MWE, Q bits are embedded simultaneously in each sub-vector Y_i. The first key is generalized to Q sets of N pseudo-random positive real numbers denoted as D_1, D_2, ..., D_Q, and the second key is generalized to Q pseudo-random vectors of length M, denoted as K_1, K_2, ..., K_Q with $K_j = (k_{j,1}, k_{j,2}, \ldots, k_{j,M})$ for $1 \le j \le Q$. Similar to SWE, the host vector Y and the random vectors K_1, K_2, ..., K_Q are split into N sub-vectors of length P. The ith element of D_j is denoted as d_{j,i} and the ith sub-vector of K_j is denoted as K_{j,i}. The ith bit of the jth watermark sequence is denoted as w_{j,i}. The watermarked ith sub-vector, denoted as Y_i', is

$$Y_i' = Y_i + \alpha_{1,i} K_{1,i} + \alpha_{2,i} K_{2,i} + \cdots + \alpha_{Q,i} K_{Q,i} \qquad (4.1)$$

The scaling factors form a row vector $A_i = [\alpha_{1,i}, \alpha_{2,i}, \ldots, \alpha_{Q,i}]$ with $\alpha_{j,i} \in \mathbb{R}$. The goal of the watermark embedding process is to derive a set of scaling factors (or vector $A_i$) which satisfies two conditions. The first condition is that the projection of Y_i' onto the direction of each K_{j,i} corresponds to the correct watermark bit:

$$\mathrm{Round}\!\left( \frac{\langle Y_i', K_{j,i} \rangle}{d_{j,i}} \right) \% 2 = w_{j,i}, \quad 1 \le j \le Q \qquad (4.2)$$

The second condition is that the distortion, i.e. the squared Euclidean distance $E_{w_i}$ between Y_i and Y_i', is minimized. This distance is also the energy of the ith bit of the watermark and is equal to

$$E_{w_i} = \left\| Y_i' - Y_i \right\|_2^2 = A_i C_i A_i^T \qquad (4.3)$$
where

$$C_i = \begin{bmatrix} \langle K_{1,i}, K_{1,i} \rangle & \langle K_{1,i}, K_{2,i} \rangle & \cdots & \langle K_{1,i}, K_{Q,i} \rangle \\ \langle K_{2,i}, K_{1,i} \rangle & \langle K_{2,i}, K_{2,i} \rangle & \cdots & \langle K_{2,i}, K_{Q,i} \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle K_{Q,i}, K_{1,i} \rangle & \langle K_{Q,i}, K_{2,i} \rangle & \cdots & \langle K_{Q,i}, K_{Q,i} \rangle \end{bmatrix} \qquad (4.4)$$
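In code, C_i is simply the Gram matrix of the Q key sub-vectors. A minimal sketch assuming NumPy, where keys_i is a Q-by-P array whose rows are K_{1,i}, ..., K_{Q,i} (the array name is an assumption of this sketch):

import numpy as np

def gram_matrix(keys_i):
    """C_i(k, l) = <K_{k,i}, K_{l,i}> per (4.4)."""
    keys_i = np.asarray(keys_i, dtype=float)   # shape (Q, P)
    return keys_i @ keys_i.T                   # shape (Q, Q)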
Substituting (4.1) into (4.2), Q simultaneous equations can be obtained. They are, in matrix form,

$$A_i C_i = B_i = [b_{1,i}, b_{2,i}, \ldots, b_{Q,i}] \qquad (4.5)$$

with

$$b_{j,i} = d_{j,i} \cdot \left( 2 r_{j,i} + w_{j,i} \right) - \langle Y_i, K_{j,i} \rangle, \quad 1 \le j \le Q \qquad (4.6)$$

where r_{j,i} is an integer for any i, j. If all the r_{j,i} are determined, the row vector B_i can be computed using (4.6) and the scaling vector A_i can then be obtained as $A_i = B_i C_i^{-1}$ from (4.5). It is important to choose the integers r_{j,i} such that $E_{w_i}$ is as small as possible. By substituting (4.5) into (4.3), $E_{w_i}$ can be rewritten in terms of C_i and B_i as

$$E_{w_i} = B_i C_i^{-1} B_i^T \qquad (4.7)$$
Two approaches are proposed to choose the r_{j,i}. The first, called the Direct Approach (DA), is simple and is optimal for the special case of orthogonal random key vectors. The second, called the Iterative Approach (IA), is more suitable for the general case of non-orthogonal random key vectors. The second approach is useful because non-orthogonal random vectors can incur smaller host signal distortion than orthogonal ones. This can be seen in the example in Figures 4.3 and 4.4, which use the same d_{1,i} and d_{2,i} with Q = 2. The random sub-vectors are orthogonal in Figure 4.4 but not in Figure 4.3, and the distortion incurred between Y_i and Y_i' is smaller in Figure 4.3 than in Figure 4.4.
Figure 4.3: Modification of sub-vector using correlated random sub-vectors.
Figure 4.4: Modification of sub-vector using orthogonal random sub-vectors.
4.2.1. Direct Approach (DA)
In the proposed direct approach, r_{j,i} is chosen according to (4.8) to minimize the absolute value of b_{j,i}:

$$r_{j,i} = \begin{cases} \dfrac{1}{2}\left( \mathrm{Round}\!\left( \dfrac{\langle Y_i, K_{j,i} \rangle}{d_{j,i}} \right) - w_{j,i} \right), & \text{for case 1} \\ \dfrac{1}{2}\left( \mathrm{Round}\!\left( \dfrac{\langle Y_i, K_{j,i} \rangle}{d_{j,i}} \right) - w_{j,i} + 1 \right), & \text{for case 2} \\ \dfrac{1}{2}\left( \mathrm{Round}\!\left( \dfrac{\langle Y_i, K_{j,i} \rangle}{d_{j,i}} \right) - w_{j,i} - 1 \right), & \text{for case 3} \end{cases} \qquad (4.8)$$

Case 1: $\mathrm{Round}\left( \langle Y_i, K_{j,i} \rangle / d_{j,i} \right) \% 2 = w_{j,i}$
Case 2: $\mathrm{Round}\left( \langle Y_i, K_{j,i} \rangle / d_{j,i} \right) \% 2 \ne w_{j,i}$ and $\langle Y_i, K_{j,i} \rangle \ge d_{j,i} \cdot \mathrm{Round}\left( \langle Y_i, K_{j,i} \rangle / d_{j,i} \right)$
Case 3: $\mathrm{Round}\left( \langle Y_i, K_{j,i} \rangle / d_{j,i} \right) \% 2 \ne w_{j,i}$ and $\langle Y_i, K_{j,i} \rangle < d_{j,i} \cdot \mathrm{Round}\left( \langle Y_i, K_{j,i} \rangle / d_{j,i} \right)$
It is easy to show that (4.8) guarantees that all r_{j,i} are integers. The advantage of DA is its simplicity, and it is the optimal solution for the special case of orthogonal random key vectors. It is not optimal, however, in the general case of non-orthogonal key vectors.
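A minimal sketch of DA for one sub-vector, assuming NumPy: it computes r_{j,i} per (4.8), forms B_i per (4.6), solves A_i = B_i C_i^{-1} from (4.5), and applies (4.1). Variable names are illustrative, not from the thesis.

import numpy as np

def embed_da(y_i, keys_i, d_i, w_i):
    """Direct Approach. y_i: host sub-vector (P,); keys_i: key sub-vectors (Q, P);
    d_i: first-key values (Q,); w_i: watermark bits in {0, 1} (Q,)."""
    proj = keys_i @ y_i                          # <Y_i, K_{j,i}> for each j
    n = np.round(proj / d_i)                     # Round(<Y_i, K_{j,i}> / d_{j,i})
    r = (n - w_i) / 2.0                          # case 1: parity already correct
    odd = (n % 2) != w_i                         # cases 2 and 3: parity mismatch
    up = proj >= d_i * n
    r[odd & up] = (n[odd & up] - w_i[odd & up] + 1) / 2.0      # case 2
    r[odd & ~up] = (n[odd & ~up] - w_i[odd & ~up] - 1) / 2.0   # case 3
    b = d_i * (2.0 * r + w_i) - proj             # B_i per (4.6)
    C = keys_i @ keys_i.T                        # C_i per (4.4)
    a = np.linalg.solve(C, b)                    # A_i = B_i C_i^{-1} (C_i symmetric)
    return y_i + a @ keys_i                      # Y_i' per (4.1)

For orthogonal key sub-vectors, C_i is diagonal and the solve reduces to Q independent scalar divisions, which is consistent with DA being optimal in that case.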
4.2.2. Iterative Approach (IA)
The iterative approach is as follows. Expanding (4.7), $E_{w_i}$ can be written as a second-order function of b_{1,i}, b_{2,i}, ..., b_{Q,i}:

$$E_{w_i} = \sum_{k=1}^{Q} C_i^{-1}(k,k) \, b_{k,i}^2 + 2 \sum_{k=1}^{Q-1} \sum_{l=k+1}^{Q} C_i^{-1}(k,l) \, b_{k,i} \, b_{l,i} \qquad (4.9)$$

where $C_i^{-1}(k,l)$ is the (k, l)th element of the matrix $C_i^{-1}$. The trivial solution $b_{1,i} = b_{2,i} = \cdots = b_{Q,i} = 0$ achieves the minimum $E_{w_i}$. However, these values are invalid as they are not in the form of (4.6). It is necessary to find a valid B_i in the form of (4.6), where all r_{j,i} are integers, such that $E_{w_i}$ is as small as possible. In the proposed Iterative Approach (IA), a pool of valid candidate vectors for B_i is maintained, and the size of the pool is allowed to grow as the iterations proceed. The algorithm starts with a vector pool containing only the one valid vector B_i obtained from the Direct Approach (DA). In each iteration, each valid vector in the pool is optimized by an algorithm, described below, to yield some new valid vectors. If a new valid vector is not already in the candidate pool, it is added to the pool for the next iteration. The $E_{w_i}$ is computed for all the vectors in the pool and the minimum $E_{w_i}$ is identified. The iterations stop when the incremental reduction of the minimum $E_{w_i}$ is less than a threshold.

A two-stage optimization is performed in each iteration. In the first stage, consider any valid vector B_i in the vector pool. A set of vectors is produced from B_i by optimizing some elements of B_i while keeping the others unchanged. Let the number of elements to be optimized be N_A. Since there are $\binom{Q}{N_A} = Q! / [(Q - N_A)! \, N_A!]$ ways to choose N_A elements from B_i (of length Q), there are $\binom{Q}{N_A}$ optimized vectors produced from each vector B_i after the first stage. It is observed experimentally that Q/2 or Q/2 + 1 are good values for N_A. Let S_A be the set of indices of the N_A elements selected from B_i. Then $E_{w_i}$ can be written as

$$E_{w_i} = \sum_{j \in S_A} C_i^{-1}(j,j) \, b_{j,i}^2 + \sum_{\substack{j,k \in S_A \\ j \ne k}} C_i^{-1}(j,k) \, b_{j,i} b_{k,i} + \sum_{\substack{j \in S_A \\ k \notin S_A}} \left[ C_i^{-1}(j,k) + C_i^{-1}(k,j) \right] b_{j,i} b_{k,i} + \sum_{j \notin S_A} C_i^{-1}(j,j) \, b_{j,i}^2 + \sum_{\substack{j,k \notin S_A \\ j \ne k}} C_i^{-1}(j,k) \, b_{j,i} b_{k,i} \qquad (4.10)$$
Differentiating $E_{w_i}$ with respect to each element of B_i to be optimized and setting the derivatives to zero, N_A simultaneous equations are obtained:

$$\sum_{j \in S_A} C_i^{-1}(j,k) \, b_{j,i} = - \sum_{j \notin S_A} C_i^{-1}(j,k) \, b_{j,i}, \quad \text{for } k \in S_A \qquad (4.11)$$
Since $E_{w_i}$ is a quadratic and positive definite function, it attains its minimum when (4.11) is satisfied. By solving (4.11), one optimal vector of B_i is obtained. Typically, the optimal vectors produced in the first stage are not valid since they are not in the form of (4.6) with integer r_{j,i}. The second stage is now applied to modify these optimal vectors to produce valid sub-optimal vectors. Consider a particular optimal vector $B_i^{opt}$. The problem is that the corresponding $r_{j,i}^{opt}$ computed from (4.6),

$$r_{j,i}^{opt} = \frac{\left( b_{j,i}^{opt} + \langle Y_i, K_{j,i} \rangle \right) / d_{j,i} - w_{j,i}}{2}, \quad \text{for } j \in S_A \qquad (4.12)$$
is not an integer. There are two ways to convert it into an integer. The first way is to simply round $r_{j,i}^{opt}$ to the nearest integer. This is called IA-rounding, or IA-R for short. The second way is to include all the combinations of $\mathrm{floor}(r_{j,i}^{opt})$ and $\mathrm{ceiling}(r_{j,i}^{opt})$ for $j \in S_A$. This produces $2^{N_A}$ sub-optimal valid vectors and is called IA-full, or IA-F for short. The pool size grows faster in IA-F than in IA-R.
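A condensed sketch of the IA-R variant, assuming NumPy and the same quantities as in the DA sketch above (Cinv is C_i^{-1}, proj holds the projections <Y_i, K_{j,i}>). It maintains the candidate pool, solves (4.11) for each index subset, and converts each result back to a valid vector via (4.12) with rounding. Names and the stopping constants are illustrative.

import numpy as np
from itertools import combinations

def ia_r(b0, Cinv, proj, d_i, w_i, n_a, tol=1e-6, max_iter=20):
    """IA-R for one sub-vector. b0: valid B_i from DA, shape (Q,)."""
    b0 = np.asarray(b0, dtype=float)
    Q = len(b0)
    energy = lambda b: float(b @ Cinv @ b)             # (4.7)
    pool = {tuple(b0)}
    best = energy(b0)
    for _ in range(max_iter):
        new_vectors = set()
        for v in list(pool):
            b = np.array(v)
            for subset in combinations(range(Q), n_a):
                S = list(subset)
                rest = [j for j in range(Q) if j not in S]
                # First stage: solve (4.11) for the N_A selected elements
                b_opt = b.copy()
                rhs = -Cinv[np.ix_(S, rest)] @ b[rest]
                b_opt[S] = np.linalg.solve(Cinv[np.ix_(S, S)], rhs)
                # Second stage: round r per (4.12), then back to a valid b per (4.6)
                r = ((b_opt[S] + proj[S]) / d_i[S] - w_i[S]) / 2.0
                b_opt[S] = d_i[S] * (2.0 * np.round(r) + w_i[S]) - proj[S]
                new_vectors.add(tuple(b_opt))
        pool |= new_vectors
        new_best = min(energy(np.array(v)) for v in pool)
        if best - new_best < tol:                      # incremental reduction below threshold
            break
        best = new_best
    return np.array(min(pool, key=lambda v: energy(np.array(v))))

IA-F would differ only in the second stage, enumerating all floor/ceiling combinations of r instead of rounding, at the cost of a faster-growing pool.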
4.3. Watermark Decoding and Detection in MWE
Although multiple watermarks are embedded in the host signal simultaneously in MWE, each embedded watermark bit sequence can be decoded independently, as in SWE. Watermark detection is performed in a similar way, by decoding the watermark bit sequence and computing the score S1 or S2 of Section 3.3. The ith bit of the jth watermark is decoded according to (4.13):

$$w_{j,i}' = \mathrm{Round}\!\left( \frac{\langle Y_i', K_{j,i} \rangle}{d_{j,i}} \right) \% 2, \quad 1 \le j \le Q, \ 1 \le i \le N \qquad (4.13)$$
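Decoding per (4.13) needs only the key pair (D_j, K_j) of the watermark of interest. A minimal sketch assuming NumPy, with illustrative names:

import numpy as np

def decode_bit(y_i_wm, k_ji, d_ji):
    """Decode the ith bit of the jth watermark from the watermarked sub-vector Y_i'."""
    return int(np.round(np.dot(y_i_wm, k_ji) / d_ji)) % 2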
4.4
Experimental Results and Discussion for MWE
The proposed DA, IA-R and IA-F for MWE are used to embed the ‘UST’ logo in ‘Lena’. The average PSNR of the watermarked images is shown in Figure 4.6 against Q, the number of simultaneously embedded watermark bit sequences. All values are averaged over ten trials with different random keys. The simulation is done with both orthogonal random sub-vectors K_{j,i} and non-orthogonal sub-vectors. The curves marked ‘DA’, ‘IA-R’ and ‘IA-F’ correspond to the non-orthogonal cases. The curve marked ‘ORV’ corresponds to DA in the orthogonal case. In the orthogonal case, both IA-F and IA-R degenerate into DA because DA is then the optimal solution. The keys D_1, D_2, ..., D_Q are generated from a Gaussian distribution with a mean of 250 and a variance of 4. The other keys K_1, K_2, ..., K_Q are generated from a Gaussian distribution with a mean of 0 and a variance of 16. As in SWE, these values are chosen in an ad hoc way to achieve a PSNR of about 45 dB for the watermarked images when Q = 5. In Figure 4.6, regardless of the algorithm, the watermarked images have very high PSNR and very good visual quality when only one (Q = 1) watermark is embedded. When more watermarks are embedded simultaneously, the PSNR of the watermarked images decreases with Q. In the non-orthogonal cases, the PSNR of IA-R and IA-F are similar, both being significantly higher than that of the low-complexity DA. This verifies that DA is non-optimal in the non-orthogonal cases, and that both IA-R and IA-F can improve over DA significantly.
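For reference, the Q key sets used in these experiments can be drawn independently per watermark, as in the SWE sketch earlier. A minimal illustration assuming NumPy; the seed and the sub-vector length P are arbitrary choices of this sketch:

import numpy as np

Q, N, P = 5, 1320, 8                       # five 44x30 logos give 1320 bits each; P illustrative
rng = np.random.default_rng(seed=42)       # illustrative seed acting as the secret key
D = rng.normal(250.0, np.sqrt(4.0), size=(Q, N))      # keys D_j: mean 250, variance 4
K = rng.normal(0.0, np.sqrt(16.0), size=(Q, N * P))   # keys K_j: mean 0, variance 16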
The results also verify that it is possible to achieve higher PSNR in the non-orthogonal cases (with IA-R and IA-F) than in the orthogonal cases. A comparison of IA-F and IA-R in terms of PSNR, complexity and robustness follows. Theoretically, IA-F should always achieve a PSNR at least as high as IA-R, but Figure 4.6 suggests that their PSNR difference is insignificant. Their complexities are shown in Figure 4.7 in terms of the average number of distinct vectors in the valid candidate vector pool. They have similar complexity for up to 5 watermarks (Q = 5), beyond which IA-F becomes significantly more complex than IA-R. In the Matlab 6.5 implementation on a Pentium 4 1.4 GHz PC, the average CPU times for IA-R and IA-F with Q = 5 are 17.11 s and 59.61 s respectively. As expected, all the watermark bits can be decoded perfectly under no attack, resulting in both detection scores S1 and S2 being 1. The average detection score S1 of MWE with 5 watermarks embedded (Q = 5) under the JPEG compression attack is shown in Figure 4.8 against the JPEG scaling factor (SF). The S1 of IA-F, IA-R and DA are very similar at any SF. Considering the PSNR, complexity and performance under attacks, IA-R appears to give a better quality-complexity-robustness tradeoff than IA-F. More robustness results of IA-R follow. The same set of experiments as in SWE is carried out for testing MWE. The five binary logos in Figure 4.5 (a)-(e) are embedded with MWE using IA-R. A typical MWE-watermarked image (IA-R with Q = 5) is shown in Figure 4.9. With a PSNR of 44.94 dB, the IA-R image has very good visual quality. The sample distributions of S1 of MWE (IA-R, Q = 5) watermark detection are shown in Figures 4.10, 4.12 and 4.14 for the JPEG attack, low-pass filtering and noise attack respectively. The corresponding total detection errors (sum of the type 1
and type 2 error probabilities) against different detection thresholds are shown in Figures 4.11, 4.13 and 4.15. In the JPEG attack on MWE in Figure 4.10, unlike the case of SWE, the distribution of ‘no watermark’ intersects with the distribution corresponding to SF = 3. As a result, no threshold can give zero detection error probability at SF = 3 in Figure 4.11. A typical example of MWE under JPEG attack (SF = 2, zero error) is shown in Figure 4.16. Similar observations can be made for the LPF attack on MWE in Figures 4.12, 4.13 and 4.19. With 5 watermarks embedded, the sample distributions of ‘watermarked’ and ‘no watermark’ intersect slightly, so errors can occur in some cases. An example of MWE with no error is shown in Figure 4.21. In the noise attack on MWE in Figure 4.14, the distribution of ‘no watermark’ intersects with many distributions, and thus no zero detection error is possible in the corresponding situations in Figure 4.15. Figure 4.24 is an example of MWE when no error occurs. Figure 4.25 is an example of MWE under the print-and-scan attack. Comparing these with the SWE results, it appears that more embedded watermarks tend to result in lower robustness. The results for Stirmark 4.0 [98-99] attacks are shown in Tables 4.2-4.15. Only the detection scores for the first watermark are shown in the tables. The results are similar to those of SWE in Chapter 3. The embedded watermark is robust to JPEG compression attacks, watermark embedding attacks, rescaling attacks, line removal attacks and small-angle rotation, but not robust to affine transformation attacks, Gaussian filtering, sharpening, cropping, noise attacks and large-angle rotation.
Figure 4.5: (a)-(e) Original logo ‘Alphabet’.
Figure 4.6: Average PSNR of MWE vs. number of watermarks (Q).
Figure 4.7: Complexity comparison between IA-R and IA-F.
Figure 4.8: Average detection score S1 of MWE (Q=5) vs. SF.
Figure 4.9: MWE-watermarked image with 5 watermarks embedded (Q=5, IA-R). PSNR=44.94dB.
Figure 4.10: Distribution of S1 of MWE (IA-R, Q=5) under JPEG attack.
Figure 4.11: Detection error of MWE under JPEG attack.
Figure 4.12: Distribution of S1 of MWE under LPF attack.
Figure 4.13: Detection error of MWE under LPF attack.
Figure 4.14: Distribution of S1 of MWE under noise attack.
Figure 4.15: Detection error of MWE under noise attack.
Figure 4.16: MWE-watermarked image under JPEG attack. SF=2, PSNR=33.46dB, bpp=0.412.
Figure 4.17: Decoded watermarks under JPEG attack.
Figure 4.18: Random watermark detection results of logo ‘A’ under JPEG attack.
Figure 4.19: MWE-watermarked image under LPF attack. PSNR=31.82dB.
Figure 4.20: Decoded watermarks under LPF attack.
Figure 4.21: Random watermark detection results of logo ‘A’ under LPF attack.
Figure 4.22: MWE-watermarked image under noise attack. PSNR=29.94dB, variance=64.
Figure 4.23: Decoded watermarks under noise attack.
Figure 4.24: Random watermark detection results of logo ‘A’ under noise attack.
Figure 4.25: MWE-watermarked image under print-and-scan attack. PSNR=27.41dB.
Figure 4.26: Decoded watermarks under print-and-scan attack.
Figure 4.27: Random watermark detection results of logo ‘A’ under print-and-scan attack.
Attack                  S1      S2      S1'     S2'
Affine 1 (Y-Shearing)   0.055   0.049   0.103   0.095
Affine 2 (Y-Shearing)   0.036   0.045  -0.021  -0.022
Affine 3 (X-Shearing)   0.144   0.129   0.208   0.189
Affine 4 (X-Shearing)   0.035   0.028   0.015   0.012
Affine 5 (XY-Shearing)  0.018   0.021   0.044   0.059
Affine 6 (General)     -0.021  -0.027  -0.026  -0.031
Affine 7 (General)      0.005   0.000  -0.018  -0.011
Affine 8 (General)      0.011   0.022  -0.055  -0.054
Table 4.2: Stirmark results for affine transformation attacks.

Attack              S1      S2      S1'     S2'
Gaussian Filtering  0.011   0.000   0.011   0.000
Sharpening          0.017   0.013   0.017   0.013
Table 4.3: Stirmark results for Gaussian filtering and sharpening attacks.

Attack          S1      S2      S1'     S2'
Cropping (1%)   0.015   0.015   0.012   0.013
Cropping (2%)   0.014   0.013   0.015   0.015
Cropping (5%)   0.008   0.009   0.005   0.004
Cropping (10%)  0.011   0.011   0.003   0.001
Cropping (15%) -0.021  -0.014   0.003  -0.001
Cropping (20%)  0.000  -0.008   0.029   0.028
Cropping (25%) -0.050  -0.054  -0.009  -0.009
Cropping (50%) -0.029  -0.028  -0.042  -0.048
Cropping (75%) -0.014  -0.013   0.009   0.013
Table 4.4: Stirmark results for cropping attacks.

Attack       S1      S2      S1'     S2'
JPEG (15%)   0.132   0.138   0.132   0.138
JPEG (20%)   0.223   0.227   0.223   0.227
JPEG (25%)   0.389   0.404   0.389   0.404
JPEG (30%)   0.503   0.513   0.503   0.513
JPEG (35%)   0.661   0.665   0.661   0.665
JPEG (40%)   0.705   0.714   0.705   0.714
JPEG (50%)   0.882   0.891   0.882   0.891
JPEG (60%)   0.958   0.962   0.958   0.962
JPEG (70%)   0.995   0.997   0.995   0.997
JPEG (80%)   1.000   1.000   1.000   1.000
JPEG (90%)   1.000   1.000   1.000   1.000
JPEG (100%)  1.000   1.000   1.000   1.000
Table 4.5: Stirmark results for JPEG attacks.
Attack                                 S1      S2      S1'     S2'
Latest Small Random Distortion (0.95)  0.020   0.020   0.020   0.020
Latest Small Random Distortion (1)    -0.006  -0.009  -0.006  -0.009
Latest Small Random Distortion (1.05) -0.018  -0.015  -0.018  -0.015
Latest Small Random Distortion (1.1)  -0.011  -0.009  -0.011  -0.009
Table 4.6: Stirmark results for latest small random distortion attacks.

Attack          S1      S2      S1'     S2'
Median Cut (3)  0.476   0.482   0.476   0.482
Median Cut (5)  0.114   0.117   0.114   0.117
Median Cut (7) -0.076  -0.071  -0.076  -0.071
Median Cut (9) -0.029  -0.022  -0.029  -0.022
Table 4.7: Stirmark results for median cut attacks.

Attack           S1      S2      S1'     S2'
Add Noise (0)    1.000   1.000   1.000   1.000
Add Noise (20)  -0.035  -0.028  -0.035  -0.028
Add Noise (40)  -0.009  -0.013  -0.009  -0.013
Add Noise (60)  -0.020  -0.021  -0.020  -0.021
Add Noise (80)  -0.042  -0.050  -0.042  -0.050
Add Noise (100) -0.023  -0.022  -0.023  -0.022
Table 4.8: Stirmark results for noise attacks.

Attack                     S1      S2      S1'     S2'
Watermark Embedding (0)    1.000   1.000   1.000   1.000
Watermark Embedding (10)   1.000   1.000   1.000   1.000
Watermark Embedding (20)   1.000   1.000   1.000   1.000
Watermark Embedding (30)   1.000   1.000   1.000   1.000
Watermark Embedding (40)   1.000   1.000   1.000   1.000
Watermark Embedding (50)   1.000   1.000   1.000   1.000
Watermark Embedding (60)   1.000   1.000   1.000   1.000
Watermark Embedding (70)   1.000   1.000   1.000   1.000
Watermark Embedding (80)   1.000   1.000   1.000   1.000
Watermark Embedding (90)   1.000   1.000   1.000   1.000
Watermark Embedding (100)  1.000   1.000   1.000   1.000
Table 4.9: Stirmark results for watermark embedding attacks.
Attack          S1      S2      S1'     S2'
Rescale (50%)  -0.009  -0.003   0.386   0.377
Rescale (75%)   0.005   0.009   0.792   0.786
Rescale (90%)  -0.002  -0.005   0.224   0.203
Rescale (110%)  0.012   0.013   0.935   0.932
Rescale (150%)  0.020   0.022   0.317   0.298
Rescale (200%)  0.021   0.021   0.429   0.410
Table 4.10: Stirmark results for rescaling attacks.

Attack                   S1      S2      S1'     S2'
Remove Lines (1 of 10)   0.011   0.014   0.173   0.155
Remove Lines (1 of 20)  -0.026  -0.027   0.121   0.121
Remove Lines (1 of 30)   0.035   0.044   0.191   0.173
Remove Lines (1 of 40)  -0.027  -0.025   0.085   0.075
Remove Lines (1 of 50)  -0.033  -0.036   0.238   0.227
Remove Lines (1 of 60)   0.020   0.020   0.156   0.135
Remove Lines (1 of 70)  -0.011  -0.006   0.258   0.245
Remove Lines (1 of 80)   0.006   0.013   0.117   0.105
Remove Lines (1 of 90)   0.024   0.014   0.141   0.129
Remove Lines (1 of 100)  0.003   0.000   0.209   0.192
Table 4.11: Stirmark results for line removal attacks.
Attack                           S1      S2      S1'     S2'
Small Random Distortions (0.95) -0.027  -0.015  -0.027  -0.015
Small Random Distortions (1)    -0.035  -0.029  -0.035  -0.029
Small Random Distortions (1.05) -0.018  -0.027  -0.018  -0.027
Small Random Distortions (1.1)   0.009   0.004   0.009   0.004
Table 4.12: Stirmark results for small random distortions attacks.

Attack                       S1      S2      S1'     S2'
RotationCrop (-0.25 degree)  0.044   0.045   0.061   0.051
RotationCrop (-0.5 degree)   0.000   0.003   0.014   0.005
RotationCrop (-0.75 degree) -0.053  -0.063   0.008   0.003
RotationCrop (-1 degree)    -0.017  -0.009  -0.029  -0.034
RotationCrop (-2 degree)    -0.024  -0.018   0.015   0.007
RotationCrop (0.25 degree)   0.020   0.014   0.045   0.035
RotationCrop (0.5 degree)    0.006  -0.002   0.023   0.021
RotationCrop (0.75 degree)   0.056   0.058  -0.023  -0.035
RotationCrop (1 degree)     -0.003  -0.005   0.011   0.016
RotationCrop (2 degree)     -0.015  -0.019  -0.050  -0.042
Table 4.13: Stirmark results for rotation and cropping attacks.
Attack                             S1      S2      S1'     S2'
Rotation and Scale (-0.25 degree)  0.056   0.043   0.056   0.043
Rotation and Scale (-0.5 degree)  -0.021  -0.027  -0.021  -0.027
Rotation and Scale (-0.75 degree)  0.035   0.025   0.035   0.025
Rotation and Scale (-1 degree)    -0.011  -0.016  -0.011  -0.016
Rotation and Scale (-2 degree)     0.015   0.001   0.015   0.001
Rotation and Scale (0.25 degree)   0.041   0.033   0.041   0.033
Rotation and Scale (0.5 degree)    0.038   0.036   0.038   0.036
Rotation and Scale (0.75 degree)  -0.023  -0.023  -0.023  -0.023
Rotation and Scale (1 degree)      0.020   0.016   0.020   0.016
Rotation and Scale (2 degree)      0.012   0.011   0.012   0.011
Table 4.14: Stirmark results for rotation and scaling attacks.

Attack                   S1      S2      S1'     S2'
Rotation (-1 degree)    -0.009  -0.012   0.041   0.052
Rotation (-2 degree)     0.062   0.065   0.002   0.001
Rotation (-0.25 degree)  0.003   0.004  -0.003  -0.004
Rotation (-0.5 degree)   0.020   0.009   0.038   0.043
Rotation (-0.75 degree)  0.006   0.020  -0.029  -0.033
Rotation (0.25 degree)   0.027   0.039  -0.002  -0.007
Rotation (0.5 degree)    0.014   0.006  -0.027  -0.026
Rotation (0.75 degree)   0.008   0.012  -0.023  -0.021
Rotation (1 degree)      0.020   0.021  -0.036  -0.042
Rotation (2 degree)     -0.015  -0.018  -0.021  -0.023
Rotation (5 degree)      0.018   0.027   0.002   0.005
Rotation (10 degree)     0.003  -0.001   0.024   0.018
Rotation (15 degree)    -0.011  -0.008  -0.009  -0.007
Rotation (30 degree)     0.027   0.031   0.026   0.024
Rotation (45 degree)     0.023   0.016  -0.002   0.005
Rotation (90 degree)    -0.015  -0.008  -0.015  -0.008
Table 4.15: Stirmark results for rotation attacks.
CHAPTER 5 ITERATIVE WATERMARK EMBEDDING (IWE)
5.1
Iterative Watermark Embedding (IWE)
An interesting problem arises when the original image for the proposed watermarking algorithm is a JPEG-compressed image from a .jpg file and the watermarked image needs to be JPEG-recompressed to produce another .jpg file. Would the watermark still be decodable or detectable in the JPEG-recompressed file? Given that the original image is JPEG-compressed, the original DCT coefficients are already quantized. For any watermarking method, the distortion of these quantized DCT coefficients due to watermarking is typically small compared with the quantization step. During the recompression to produce the JPEG-compatible file, the watermarked DCT coefficients are re-quantized, and this can completely remove the small distortion which carries the watermark information, restoring the original quantized DCT values. This is especially serious when the compression ratio is large. Based on SWE, a novel technique called Iterative Watermark Embedding (IWE) is proposed to prevent the removal of the watermark in the requantization process, so that watermark decoding and detection can still work. IWE assumes that the watermarked image will be recompressed with the same quantization matrix as the original JPEG-compressed image. IWE embeds the watermark in the reconstructed DCT coefficients.
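The erasure effect can be reproduced in a few lines. A minimal sketch assuming NumPy, where q stands for the quantization step of one DCT coefficient position (all values here are illustrative):

import numpy as np

rng = np.random.default_rng(0)
q = 16.0                                        # quantization step (illustrative)
c = q * rng.integers(-20, 21, size=8)           # coefficients already quantized to multiples of q
delta = rng.uniform(-q / 4, q / 4, size=8)      # small watermarking perturbation, below q/2
c_requantized = q * np.round((c + delta) / q)   # recompression with the same step
print(np.array_equal(c_requantized, c))         # True: the perturbation is erased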
5.2
Host Vector Extraction in IWE
The watermark host vector $Y = (y_1, y_2, \ldots, y_M)$ is extracted directly from the JPEG compressed domain of the original image. As the original image is already JPEG-compressed, the image is partitioned into 8x8 blocks and the DCT coefficients are arranged in zigzag order in the original .jpg file. To form the watermark host vector, the first AC coefficient in zigzag order is extracted from all the 8x8 blocks to form the first portion of Y. Then the second AC coefficient in zigzag order is extracted from all the blocks and appended to form the second portion of Y, and so on, until a total of M coefficients are extracted. Similar to SWE and MWE, the host vector is segmented into N sub-vectors of length P = M / N. Each sub-vector is used to embed one bit of watermark information. The ith sub-vector is denoted by Y_i such that $Y_i = (y_{i_1}, y_{i_2}, \ldots, y_{i_P})$.
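A sketch of this coefficient-major extraction, assuming the quantized DCT coefficients of each 8x8 block are available as a NumPy array of shape (num_blocks, 64), already in zigzag order; the array name and layout are assumptions of this sketch.

import numpy as np

def extract_host_vector(coeffs_zigzag, M):
    """Take AC coefficient 1 from every block, then AC coefficient 2, and so on."""
    ac = coeffs_zigzag[:, 1:]      # drop the DC coefficient (zigzag index 0)
    y = ac.T.reshape(-1)           # coefficient-major ordering across blocks
    return y[:M]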
5.3
Bit Embedding in Sub-vector
The watermark, $W_o = (w_{o,1}, w_{o,2}, \ldots, w_{o,N})$ with $w_{o,i} \in \{0,1\}$, is a bit sequence of length N where N