ITERATIVELY DECODABLE CODES FOR WATERMARKING APPLICATIONS
Mustafa Kesal, M. Kıvanç Mıhçak, Ralf Koetter and Pierre Moulin
U. of Illinois, Coordinated Science Research Lab. and ECE Dept.
1308 W. Main St., Urbana, IL 61801
Email: {kesal, koetter}@comm.csl.uiuc.edu, {mihcak, moulin}@ifp.uiuc.edu
ABSTRACT
The problem of information hiding or watermarking is investigated. Based on an information-theoretic analysis of the watermarking task, we investigate a strategy that employs binary codes to robustly hide data in a given host signal. The central theme of the problem is the interplay of a vector quantization task and a channel coding task. Turbo codes and other coding schemes are compared in terms of their performance in a watermarking application.

Keywords: Watermarking, vector quantization, turbo codes, iterative decoding, product codes

1. INTRODUCTION

Data hiding refers to nearly invisible embedding of information within a host data set such as text, audio, image, or video. Applications include watermarking, steganography, image databases, and in-band captioning. Most watermarking research to date has focused on novel ways to hide information and to detect and/or remove hidden information. However, a rigorous theory describing the fundamental limits of any watermarking system is just emerging. The watermarking problem can be viewed as a modified joint source-channel coding problem with given distortion and rate constraints. The main role played by the channel code is to embed the watermark signal into the original data with a small distortion and to protect the original and the watermark signal against the attacker's distorting signal. In this paper, we first introduce the general watermarking problem along with the optimum strategies to be followed by the intended sender and the attacker. A simple quantization approach to this problem, combined with channel coding, is then analyzed. In particular, we focus on iterative decoding techniques and the applicability of iteratively decodable codes to the watermarking problem. Performance plots for the various codes and future work are discussed in the final section.

2. WATERMARKING

A theory has recently been developed to establish the fundamental limits of the fairly general data hiding problem described below [1, 2]; see Fig. 1. A message $M$ is to be communicated to a receiver. The message is embedded into a length-$N$ sequence $\tilde{X}^N = (\tilde{X}_1, ..., \tilde{X}_N)$ termed the host data set, typically data from a host image, video, or audio signal. The embedding is done using a cryptographic key $K^N = (K_1, ..., K_N)$ that is also available at the decoder. The resulting watermarked data or composite data $X^N = (X_1, ..., X_N)$ is subject to attacks that attempt to remove any trace of $M$ from $X^N$. The data-hiding process should be transparent: $X^N$ should be similar to $\tilde{X}^N$, according to a suitable distortion measure. The system should also be robust: the hidden message should survive any attack (within a reasonable class of attacks). A typical restriction on the attacker is a limit on the amount of distortion that he/she is willing to introduce.
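The distortion constraints mentioned above can be made concrete with a small sketch. It assumes a per-sample squared-error distortion measure and uses hypothetical budgets D1 (data hider) and D2 (attacker). The additive embedding and the Gaussian noise attack are illustrative stand-ins only, not the scheme analyzed in this paper, and all function names are chosen for this example.

```python
import random


def mean_squared_distortion(a, b):
    """Per-sample squared-error distortion between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def within_budget(reference, modified, budget):
    """True if the distortion between the two sequences does not exceed budget."""
    return mean_squared_distortion(reference, modified) <= budget


rng = random.Random(0)  # fixed seed so the run is reproducible
N = 1000

# Assumed host model: i.i.d. Gaussian samples (standard deviation 10).
host = [rng.gauss(0.0, 10.0) for _ in range(N)]

# Hypothetical embedding: shift every sample by +0.5 or -0.5,
# so the embedding distortion is 0.25 per sample.
watermarked = [x + rng.choice((-0.5, 0.5)) for x in host]

# Hypothetical attack: additive Gaussian noise with standard deviation 1.5.
attacked = [x + rng.gauss(0.0, 1.5) for x in watermarked]

D1 = 1.0  # assumed distortion budget for the data hider
D2 = 4.0  # assumed distortion budget for the attacker

print("embedding distortion:", mean_squared_distortion(host, watermarked))
print("attack distortion:   ", mean_squared_distortion(watermarked, attacked))
print("transparent:", within_budget(host, watermarked, D1))
print("admissible attack:", within_budget(watermarked, attacked, D2))
```

Here the transparency requirement corresponds to the first check (host versus watermarked data) and the restriction on the attacker to the second (watermarked versus attacked data).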
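[Block diagram: the message $M$, the host data $\tilde{X}^N$ and the key $K^N$ enter the Encoder, which outputs $X^N$; the attack channel $Q(y|x)$ maps $X^N$ to $Y^N$; the Decoder, which also has $K^N$, outputs $\hat{M}$.]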
Figure 1: A general setup for the watermarking scheme.

The system can be analyzed by defining a statistical model for $M$, $\tilde{X}^N$ and $K^N$, a distortion function, specifying constraints on the admissible distortion levels $D_1$ and $D_2$ for the data hider and the attacker, and specifying the information available to all parties. Then one can seek the maximum rate of reliable transmission for $M$, over any possible data-hiding strategy and any attack that satisfy the specified constraints. This is done by application of information-theoretic principles, and in particular relies upon the following fundamental concept, which so far appears to have been overlooked in the watermarking literature: even if the host signal $\tilde{X}^N$ is not available at the decoder (blind watermarking), the fact that the encoder knows $\tilde{X}^N$ signifies that achievable rates are higher than if $\tilde{X}^N$ were some unknown interference. This problem falls in the category of communication problems where encoder and decoder have access to side information [4,7].

The hiding capacity bounds the rates of reliable transmission and depends on the choice of distortion function and on the admissible distortion levels $D_1$ and $D_2$. Closed-form expressions have been developed for i.i.d. Gaussian host signals and squared-error distortion functions for blind and non-blind watermarking [1,2]. It is assumed that the attacker does not know the encoding and decoding functions, for example because these functions depend on a cryptographic key. In both cases, the optimal attack is the Gaussian test channel from rate-distortion theory [4]. The capacity takes the form

$$C = \frac{1}{2} \log\left(1 + \frac{D_1}{\beta D_2}\right)$$

where $\beta$ is a noise boosting factor which tends to 1 at low distortion levels $D_1$, $D_2$