Multimedia selective encryption by means of randomized arithmetic coding

Marco Grangetto, Member, IEEE, Enrico Magli, Member, IEEE, Gabriella Olmo, Member, IEEE

Abstract: We propose a novel multimedia security framework based on a modification of the arithmetic coder, which is used by most international image and video coding standards as the entropy coding stage. In particular, we introduce a randomized arithmetic coding paradigm, which achieves encryption by inserting some randomization in the arithmetic coding procedure; notably, and unlike previous works on encryption by arithmetic coding, this is done at no expense in terms of coding efficiency. The proposed technique can be applied to any multimedia coder employing arithmetic coding; in this paper we describe an implementation tailored to the JPEG 2000 standard. The proposed approach turns out to be robust against attempts to estimate the image or discover the key, and allows very flexible protection procedures at the code-block level, making it possible to perform total and selective encryption, as well as conditional access.

Index Terms: Arithmetic coding; selective encryption; conditional access; JPEG 2000.

The authors are with CERCOM (Center for Multimedia Radio Communications), Dip. di Elettronica, Politecnico di Torino, Corso Duca degli Abruzzi 24 - 10129 Torino - Italy - Ph.: +39-011-5644195 - FAX: +39-011-5644099 - E-mail: marco.grangetto(enrico.magli,gabriella.olmo)@polito.it. Corresponding author: Enrico Magli. This work has been partially sponsored by MIUR (Italian Ministry of Education and Research) under the project PRIMO.

IEEE TRANSACTIONS ON MULTIMEDIA (RESUBMITTED NOV. 2005)


I. INTRODUCTION

Digital rights management is a rapidly emerging area of research that deals with all aspects of secure data communication, from the system-level key exchange protocols to the signal processing and cyphering algorithms employed to make the contents unusable by unauthorized parties. In particular, digital rights management requires multimedia securization technologies, and enables applications such as copyright protection, authentication, and conditional access, just to mention a few. The main underlying enabling technology is encryption. Data encryption [1] can be used to cypher all or parts of the content, so that only the user that has received a key can decrypt and display all or parts of the data. By carefully designing the encryption technology, it is possible to provide advanced functionalities. For example, conditional access amounts to providing different portions of a multimedia content under different policies; a thumbnail or a low-resolution version of the content can be made available for free, whereas the user may have to pay in order to see the full-quality content. Conditional access is also very important when a user browses a catalogue of multimedia files in order to retrieve the content of interest; in this case, low-resolution data can be quickly downloaded in order to select the desired content, which can be purchased in a higher resolution version. The potential applications are so important that the Joint Photographic Experts Group (JPEG) committee has started a new work item related to JPEG 2000 [2], namely Part 8 (JPSEC) [3], which specifically deals with the standardization of enabling tools for JPEG 2000 related security applications. In addition to encryption technology, key management protocols are also necessary to exchange keys between service providers and users; several such protocols exist, such as those described in [4], [5], [6], [7].
Although block-based data encryption techniques such as Data Encryption Standard (DES) [8] and Advanced Encryption Standard (AES) [9] have become very popular, they have some clear limitations for multimedia applications. First of all, they may require significant computational resources which, though feasible for desktop-PC applications, may be too demanding in low-power environments such as wireless communications. Secondly, block-based techniques are poorly suited to real-time communications, as they introduce delay; this is e.g. the case of speech coding and videoconferences, which require very short end-to-end latency. Moreover, encrypting a signal in its original domain makes it difficult to provide advanced functionalities such as conditional access, which are more easily implemented in a transformed domain (e.g., the discrete cosine transform or wavelet domain)


in which the relevant features such as low frequency components, texture, and so on are readily available. Finally, the majority of existing encryption standards such as DES and AES have been developed for i.i.d. data sources; however, multimedia data are typically non-i.i.d., and this may cause some serious problems.

Stream cyphers [10] are another popular way to perform encryption. In a stream cypher, the plaintext is converted to cyphertext one bit at a time; in this way, it is not necessary to wait for the entire block to be received in order to start decoding. Moreover, the initial portion of the data segment can be decrypted and decoded even if later portions have been lost; this is clearly desirable in a multimedia coding and transmission environment (see e.g. [11]).

More recently, to cope with such problems, selective encryption techniques have been proposed [12], [13], [14], [15]. Selective encryption consists in cyphering only a relatively small portion of the multimedia data. It is best done in a transform/compressed domain [16], [17], where it can exploit the fact that the data are represented as a sequence of approximately i.i.d. samples that contribute unequally to the quality of the reconstructed signal. As a consequence, it is often sufficient to encrypt a few coefficients in order to prevent unauthorized users from making use of the content. Moreover, in a transformed domain it is easier to provide advanced functionalities by carefully selecting the coefficients to be cyphered [18], [19]; for instance, conditional access can be implemented by letting the content be decoded and displayed without errors up to a certain resolution, whereas a key must be owned by the user in order to correctly decode the full quality content. A first selective encryption technique for JPEG 2000 compressed images has been presented in [20]; in order to achieve conditional access, the authors propose to pseudo-randomly invert some bits in the coding passes of the last layers, i.e. those that contribute detail to the image. A decoder knowing the seed of the random sequence (i.e., the key) can undo the scrambling and correctly decode such layers; otherwise, attempts to decode the protected layers will impair the obtained visual quality.

More recently, in [21] it is pointed out that the encryption can be carried out at different levels inside the multimedia encoder, namely the image level, the transform level, the quantization/bit-plane level, the entropy coding level and the codestream level. It must be noted that encryption at the image, transform, and quantization/bit-plane levels may reduce coding efficiency, as it alters the statistics of the data that are input to the entropy coder (see footnote 1). The approach of encrypting the final codestream can be desirable for several reasons. Firstly, it is possible to use an external encryption block whose security is well established, like AES or RC4.

Footnote 1: In [22] it is argued that in some settings the performance loss can be avoided by using distributed source coding techniques.


Moreover, keeping the design of the compression and encryption blocks separated places fewer constraints on the designer. On the other hand, external use of an encryption block can produce syntax errors and lead to non-compliant cyphered codestreams (e.g., marker emulation in JPEG 2000 and MPEG-4), and will generally hamper the exploitation of useful syntactical features such as compressed domain processing, progression order change, scalability, and so forth. The marker emulation problem is addressed in [23], where an iterative scheme is proposed to avoid emulations. Conversely, the joint design of the compression and encryption parts potentially makes it possible to provide a more flexible encryption of the data, and hence more features. An example is given by encryption at the entropy coding level, which allows one to perform compression and securization using one single tool, without the need of an external encryption block. In [24] it is proposed to associate a fixed length index to each variable length codeword, to encrypt the indexes, and to map them back to codewords; this approach is reported to work well for Huffman and Golomb codes. However, it has a number of disadvantages including reduced coding efficiency with respect to the original code, and generation of emulated markers; moreover, the concept does not apply in a straightforward way to entropy coders with fractional codeword length such as arithmetic coders, although some ideas to extend the work of [24] are provided in [21]. In [25], an arithmetic coder (AC) with random adaptation instants is employed, so that the decoder cannot exactly track the source statistics computation.
The proposed technique solves some problems related to attacks to ACs using nonrandom adaptation [26], [27], [28], [29], but still exhibits a performance loss with respect to a non-encrypted arithmetic coder; moreover, since encryption is actually performed in the statistical modeler and not in the AC, the implementation is dependent on the selected adaptation strategy. In [30], [31] it is proposed to encrypt only the first few bits of the AC stream; this requires an external encryption block, and can lead to violation of a predefined codestream syntax. More information on AC-based techniques is provided in Sect. III. In this paper we also adopt the entropy coding level approach, and propose a scheme that provides conditional access by means of a modified AC stage. Note that AC is being adopted by plenty of recent standards, such as H.264/AVC [32] and JPEG 2000, thanks to its high coding efficiency and the possibility to employ adaptive coding strategies; moreover, a number of low complexity implementations are available, e.g. the MQ coder in JPEG 2000. In particular, we propose a new general securization approach based on a modified AC, called randomized arithmetic coding (RAC), as an effective way to encrypt multimedia contents. Unlike the approach in [24], [25], RAC does not suffer any loss of compression efficiency with respect to a standard AC, nor is it affected by the codestream non-compliance issues of [24], [27], [30], [31]. The RAC approach can be applied to any AC, including adaptive and context-based ACs, and their multiplierless approximations, which


are very popular in the major international standards. In particular, we integrate the RAC approach in JPEG 2000 by implementing a randomized version of the MQ-coder. Encryption is carried out in such a way as to cypher selected sets of the wavelet transform coefficients, enabling advanced functionalities such as selective encryption and conditional access. Note that, throughout the paper, we assume that an existing key management protocol is used to communicate the key (see e.g. [5], [6], [7]), and we focus on how the key is used to achieve secrecy.

This paper is organized as follows. In Sect. II we briefly review the basic AC concepts and describe the proposed RAC approach, and in Sect. III we perform its cryptanalysis. In Sect. IV we detail the tailoring of the RAC approach to the JPEG 2000 image compression standard, as well as possible applications to total encryption, selective encryption and conditional access. In Sect. V we provide experimental results on the performance and robustness of the proposed approach, whereas in Sect. VI we draw some conclusions and outline possible future developments.

II. RANDOMIZED ARITHMETIC CODING

The RAC approach exploits the fact that arithmetic decoding is very sensitive to errors in the compressed data, which tend to propagate throughout the decoded block; this otherwise undesirable property can be used to design a robust multimedia securization algorithm. In fact, a single erroneous decoding step is able to cause an irreversible drift, thus making the data decoded any further completely useless; unlike Huffman coding, which tends to recover after a certain number of erroneously decoded symbols, AC exhibits very poor resynchronization capabilities [30]. If we can force decoding errors in an AC system that does not know the decoding key, then this system will not be able to properly decode and render the multimedia content. Ideally, this must be done in such a way as to not decrease coding efficiency. The AC encoding procedure is based on the classical recursive probability interval partition known as Elias coding; at each iteration the interval is split in two sub-intervals. In the partition of the probability interval, an AC decides in advance whether the interval related either to the least probable symbol (LPS) or to the most probable symbol (MPS) comes first. Although both options lead to the same coding efficiency, which ultimately depends only on the size of the last subinterval at the end of the encoding process, this decision is agreed upon between encoder and decoder once and for all. Conversely, RAC is based on a random organization of the encoding intervals, so that only a synchronized decoder is able to interpret correctly the encoded sequence. In particular, as the encoding is done on a bit-by-bit basis, and an interval partition is associated to each bit, for each bit an independent decision is made as to whether the LPS or MPS subinterval comes first. The operation of a classical AC and a RAC is sketched in Fig. 1. In Fig. 1-a a classical AC encodes the input string


001 by selecting a binary number contained in the interval I; in Fig. 1-b the RAC performs similar operations on the same input string, but the interval order for the second bit is swapped. This leads to a final interval I' different from that selected by the classical AC. If the classical AC were used to decode the RAC compressed data, it would output 01 for the first two bits (the second bit being erroneous), and the third bit would depend on which binary number inside the interval I' has been selected by the RAC. Note that the size of I is equal to that of I', implying that both coders have the same coding efficiency.

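Fig. 1. (a) Classical arithmetic coder; (b) Randomized arithmetic coder. [Figure: successive partitions of the probability interval into sub-intervals of size p0 and p1 for the input string 001, ending in the final interval I in panel (a) and I' in panel (b).]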

We propose an interval swapping rule defined as follows. The interval swapping is triggered by a random variable R, which takes on the values 0 and 1 with probability 0.5 each. A sample value R = r is drawn at each input bit, and interval swapping is performed only when r = 1. In practice, before encoding the bits $b_i$, $i = 0, \ldots, N-1$, of a binary data block, we initialize a random number generator with seed S. Then we perform the following operations:

    For $i = 0, \ldots, N-1$:
        draw a random number $r_i$ using the random generator;
        if $r_i = 1$ then select the order [LPS,MPS] for encoding bit $b_i$, otherwise select the order [MPS,LPS].

The seed S represents the encryption key, which will also be referred to as K in the following; more details are provided in Sect. IV-C. If encoder and decoder use the same key, they will generate the same random sequence for R, and will hence be synchronized; on the contrary, if the correct key is not available, the decoder will not be able to correctly decode the compressed data. In this latter case, the decoder drift guarantees that the decoded data are meaningless. A few such decoding examples are provided in Sect. V.
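The interval swapping rule and the role of the key can be illustrated with a toy coder. The following is only a minimal sketch, not the implementation used in this paper: it assumes a fixed (non-adaptive) binary model with probability `p0` for symbol 0, exact rational arithmetic instead of a finite-precision coder, and Python's `random.Random` as a stand-in for the keyed generator (it is not a secure generator). The names `rac_encode`, `rac_decode` and `_swap_bits` are hypothetical, and the default ordering (sub-interval of symbol 0 first) is an assumption made for illustration.

```python
from fractions import Fraction
import random


def _swap_bits(key, n):
    """Swap decisions r_0..r_{n-1}; key=None means no randomization."""
    if key is None:
        return [0] * n
    rng = random.Random(key)  # stand-in generator, NOT cryptographically secure
    return [rng.getrandbits(1) for _ in range(n)]


def rac_encode(bits, p0, key=None):
    """Return (low, width) of the final interval for a list of bits."""
    low, width = Fraction(0), Fraction(1)
    for b, r in zip(bits, _swap_bits(key, len(bits))):
        first = r  # symbol whose sub-interval comes first (assumed: 0 unless swapped)
        p_first = p0 if first == 0 else 1 - p0
        if b == first:
            width *= p_first
        else:
            low += width * p_first
            width *= 1 - p_first
    return low, width


def rac_decode(value, n, p0, key=None):
    """Decode n bits from a number lying inside the final interval."""
    low, width = Fraction(0), Fraction(1)
    out = []
    for r in _swap_bits(key, n):
        first = r
        p_first = p0 if first == 0 else 1 - p0
        split = low + width * p_first
        if value < split:
            out.append(first)
            width *= p_first
        else:
            out.append(1 - first)
            low = split
            width *= 1 - p_first
    return out


bits = [0, 0, 1, 0] * 8
p0 = Fraction(2, 3)
low, width = rac_encode(bits, p0, key=2005)
value = low + width / 2                      # any number inside the final interval
assert rac_decode(value, len(bits), p0, key=2005) == bits
assert rac_encode(bits, p0)[1] == width      # same interval size as the classical coder
print(rac_decode(value, len(bits), p0))      # classical decoder, i.e. without the key
```

In this sketch the final interval width is the product of the symbol probabilities whatever the swap decisions are, which mirrors the claim that randomization costs nothing in coding efficiency; a decoder run without the key follows a different sequence of sub-intervals from the first swapped bit onwards.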


Note that, if no randomization is employed (i.e., $r_i$ is always equal to zero), RAC is identical to a standard arithmetic coder, so that RAC can also be used to decode a non-encrypted data stream. Taking $r_i$ to have equal probability for 0 and 1 amounts to generating maximum randomization in the interval swapping process. Of course, one may decide to employ different amounts of randomization. In [33] it has been shown that the performance is little dependent on this probability (provided it is larger than 0.001), but that this probability does not add substantial security to the system. Finally, it should be noticed that, since RAC encrypts one bit at a time, it can be functionally seen as a stream cypher. In fact, although this is not investigated in detail in the present paper, it would be possible to decode only the first part of the data in case later parts had been lost or corrupted.

III. CRYPTANALYSIS OF A RANDOMIZED ARITHMETIC CODER

There has already been some past work regarding cryptanalysis of ACs [26], [27], [28], [29], [30]. We employ a few existing results to show that the RAC approach requires a prohibitively high computational cost to be broken. A binary AC consists of two main parts, i.e. the statistical modeler and the AC compression engine. A first encryption approach consists in exploiting the adaptivity of the statistical model; if the decoder does not know the initial state, then it will not be able to track the encoder adaptation process exactly [34]. However, this approach is known not to be robust [27], [28], because, as highlighted in [25], randomization in an AC should depend not only on the input data themselves, but also on some random number. Various randomizations have been proposed, including statistical modeling [25] and interval endpoints in [35]; however, these techniques suffer from reduced coding efficiency with respect to a standard AC.

Another technique has been proposed in [30], [31]. The authors do not modify the statistical modeler, nor the AC coding engine; this has the beneficial effect that the technique can be applied to any statistical modeler and any AC coder, including sophisticated context-based ones. The idea is to cypher only the first AC output bits, and to rely on the poor AC resynchronization capabilities. It is shown in [30] that, if the first b bits of the AC stream are unknown to the decoder, then the number of decoding attempts to be performed by an attacker in order to achieve full synchronization is lower-bounded by $O(2^{b/2})$, which leads to a huge complexity also for relatively small values of b. However, the attacker could try to perform partial synchronization, i.e. avoid estimating the first b bits, which he accepts to be unable to decode, and try to resynchronize from bit $b+1$ and decode the remainder of the stream. To do so, the attacker needs to find the lower endpoint of the current interval.
For partial resynchronization, analytical results are very difficult to work out. In [30] several numerical experiments are carried out under favorable hypotheses as to the information the attacker knows. In particular, it is assumed that the attacker knows the size of the current interval, although the lower endpoint is unknown. It is shown that the probability distribution of the lower endpoint is very flat, so that the search an attacker must carry out is nearly exhaustive; moreover, as the MPS probability tends to one (which typically happens for context-based coders), the distribution tends to be even flatter, and this behavior is very little dependent on the value of b. As an example, if the first 200 bits are encrypted, assuming that the MPS probability is 0.6 a statistical attack would require, with probability 0.5, more than $2^{30}$ decoding attempts. Furthermore, if the lower endpoint were known with accuracy $10^{-4}$, fewer than 50 bits could be correctly decoded before losing synchronization.

These results on the poor AC resynchronization capability, even when few initial bits of the AC stream are encrypted, suggest that estimating the original sequence when all the bits are encrypted, as is done in the RAC approach, is a prohibitively burdensome task. In fact, the lower bound on the number of decoding attempts for full resynchronization is $O(2^{N/2})$, with N being the block size. As in applications such as those presented in Sect. IV-D the block size is typically larger than 1000 bits, full resynchronization would require huge complexity, and partial resynchronization would still be unviable.

The resynchronization attack is not a known plaintext attack, toward which it is known [26], [28] that arithmetic coding is not robust. For example, if the randomization is embedded in the statistical modeler, one could flood the encoder with zeros, and bring it into a known state. A similar attack could be carried out against a RAC. Given a known input sequence (for example the all-zero sequence) the attacker could compare the AC-coded and RAC-coded sequences and attempt to infer the swapping decisions.
This attack seems to be nontrivial, because it requires coping with the variable-length coding, i.e. the lack of a symbol-by-symbol correspondence between input and output bits. This does not allow the attacker to make local decisions; rather, the decisions have to encompass as many symbols as the encoder memory. However, robustness towards known plaintext attacks has not been addressed in this paper. Moreover, known plaintext attacks are not relevant for the encryption systems proposed in Sect. IV-D, as in typical applications the original image is not available, nor is it possible to have a selected image cyphered. Still, the attacker could try to exploit possible weaknesses of the encryption key, and attempt to decode the data using several different seeds. The robustness towards this sort of attack is extensively discussed in Sect. V with regard to the JPEG 2000 based system.

IV. RANDOMIZED MQ-CODER FOR JPEG 2000

The RAC approach can be applied to any existing signal compression technique that employs AC as entropy coding stage, including adaptive and context-based ACs; in general, the interval swapping concept does not even depend on the specific context model employed by the arithmetic coder, since it operates at the last stage of the AC process. However, different algorithms will generally encode information in different ways, and the specific implementation of a selective encryption or conditional access scheme will have to be tailored to the specific coding scheme. For example, JPEG 2000 employs an AC to code the quantized wavelet coefficients bit-plane by bit-plane, whereas H.264 may employ the AC to encode the motion vectors in addition to the discrete cosine transform coefficients. While both standards are amenable to the RAC approach, the selection of to-be-cyphered data depends on the desired functionality. In the following we tailor the RAC approach to the JPEG 2000 Part 1 image coding standard, which employs a multiplierless AC called MQ-coder, and we describe how the selective encryption and conditional access functionalities can be obtained.

A. Overview of JPEG 2000 and the MQ-coder

JPEG 2000 is an international standard for lossy and lossless image compression. It provides progressive encoding with quality, spatial and resolution scalability, and supports region-of-interest coding. An image is processed componentwise, and each component may be optionally divided into nonoverlapping tiles. For each tile, a biorthogonal wavelet transform is computed, employing either the (9,7) filter (irreversible mode) or the (5,3) filter (reversible mode). Scalar quantization with an embedded deadzone is applied independently to each subband (a unit quantization step size is employed for reversible compression). After quantization, each subband of the wavelet decomposition is divided into rectangular regions referred to as code-blocks; each code-block is independently coded using a bit-plane approach, along with context-based AC.
In particular, for each code-block, each coefficient bit in the current bit-plane (starting from the most significant one) is encoded in only one out of three coding passes by means of the MQ-coder; after all code-blocks in all subbands have been encoded, rate distortion optimization is used to select the coding passes that actually have to be flushed to the codestream so as to achieve the desired bit-rate and scalability characteristics. In practice, in order to facilitate the decoding process and to enable scalability and progression order changes, coding passes from different bit-planes of different code-blocks are typically grouped into so-called packets; each packet has its own header, which contains information about the subbands, code-blocks, bit-planes and coding passes that contribute to the data in the packet body. Conceptually, the codestream is organized as a succession of layers, each layer containing contributions from each code-block that increment quality. The block truncation points associated with each layer are optimal in the rate distortion sense.

The MQ-coder is a binary and adaptive AC used to encode the decision bits $d_i$ corresponding to a code-block of a wavelet transformed image. The code string C is adjusted so as to point to the lower endpoint of the sub-interval corresponding to the input symbol $d_i$. The probability model is adapted to the source statistics by updating the probability $Q_e$ of the LPS for the next iteration. Moreover, MQ partitions the interval without using multiplications; in fact, the probability interval A is guaranteed to be in the range $0.75 \le A < 1.5$, so that the following approximations are assumed (see Fig. 2, top part): i) if the LPS occurs, the interval is reduced to $Q_e \approx A \cdot Q_e$; ii) if the MPS occurs, the interval is updated to $A - Q_e \approx A - A \cdot Q_e$; in this latter case $Q_e$ is added to the code string C, in order to make it point to the base of the MPS sub-interval.

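Fig. 2. MQ encoding intervals. [Figure: two panels showing the interval A, the LPS sub-interval of size Qe and the code string C; in the (standard) panel the LPS sub-interval precedes the MPS sub-interval, in the (swapped) panel the order is reversed.]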
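The effect of the multiplierless approximation can be checked numerically. The following is a small sketch under stated assumptions: the helper names are hypothetical, the value of `Qe` is purely illustrative (it is not taken from the probability table of the standard), and exact fractions are used instead of the fixed-point registers of a real MQ-coder.

```python
from fractions import Fraction


def exact_split(A, Qe):
    """Exact (LPS, MPS) sub-interval sizes, using a multiplication."""
    return A * Qe, A - A * Qe


def mq_split(A, Qe):
    """Multiplierless approximation: A * Qe is replaced by Qe."""
    return Qe, A - Qe


def lps_relative_error(A, Qe):
    """Relative error made on the LPS sub-interval by the approximation."""
    exact_lps, _ = exact_split(A, Qe)
    approx_lps, _ = mq_split(A, Qe)
    return abs(approx_lps - exact_lps) / exact_lps


Qe = Fraction(1, 10)  # illustrative LPS probability (assumption)
for A in (Fraction(3, 4), Fraction(1), Fraction(5, 4), Fraction(149, 100)):
    print(float(A), float(lps_relative_error(A, Qe)))
```

In this sketch the relative error on the LPS sub-interval equals |1 − A| / A, which does not depend on Qe and is at most 1/3 over the range 0.75 ≤ A < 1.5; the two approximate sub-intervals still add up to A exactly.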

B. Randomized MQ-coder

The randomized MQ-coder is based on the definition of the two alternative interval conventions shown in Fig. 2. The standard MQ-coder assumes that the LPS interval precedes the MPS interval; the randomized MQ-coder allows these two intervals to be swapped randomly. In particular, the standard MQ-coder performs the following operations for encoding bit $b_i$:

    If $b_i$ is the LPS
        $A \leftarrow Q_e$
        the codestring C is unchanged;
    else if $b_i$ is the MPS
        $A \leftarrow A - Q_e$
        $C \leftarrow C + Q_e$.

On the other hand, the randomized MQ-coder operates as follows for encoding bit $b_i$:

    Draw a decision $r_i$ using the random generator;
    if $r_i = 0$ then
        encode bit $b_i$ using the standard MQ-coder encoding procedure described above;
    else if $r_i = 1$
        if $b_i$ is the LPS
            $C \leftarrow C + (A - Q_e)$
            $A \leftarrow Q_e$;
        else if $b_i$ is the MPS
            $A \leftarrow A - Q_e$
            the codestring C is unchanged.

In the LPS case the update of C uses the value of A before A is overwritten. All the other auxiliary routines employed by the MQ-coder, including the bit-stuffing functionalities, are also utilized by the randomized MQ-coder. This makes sure that, unlike [30], [31], [25], [24], marker emulation is avoided and the syntax is not violated. If $r_i$ is always equal to zero, the randomized MQ-coder becomes a standard MQ-coder.

C. Integration within JPEG 2000

The randomized MQ-coder has been integrated in the JPEG 2000 Part 1 codec. As has been seen, JPEG 2000 divides the wavelet transform coefficients of each subband into rectangular regions called code-blocks, which are the basic encoding unit. In our scheme, we decide whether to employ the standard or the randomized MQ-coder on a per code-block basis. This approach has several advantages, as it makes it possible to achieve different advanced secrecy functionalities, such as selective encryption and conditional access. Note that, because of post-compression rate-distortion optimization, the encoder codes all coding passes, whereas the decoder only processes a subset of them. As a consequence, one has to make sure that, at the beginning of each code-block, encoder and decoder are synchronized in terms of the $r_i$ random sequence. A trivial solution would be to reset the random number generator at the beginning of each code-block. However, this would lead to using the same initial decisions for all code-blocks, leading to a potential security flaw. To avoid this, we use two different random number generators. The outer one (with seed S) provides as output a sequence of seeds $S_i$ to be employed as key for the i-th code-block. The code-block ordering is specified in [2]. The key that is communicated to the user is the seed S; the user is then able to reproduce a sequence of seeds $S_i$ to be used as keys for each code-block.
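The two interval conventions and the two-generator seed scheme described above can be sketched as follows. This is an illustrative sketch only, not the codec integration: renormalization, probability adaptation and bit-stuffing are omitted, the function names are hypothetical, and Python's `random.Random` is used as a stand-in for the (unspecified) generators, so it should not be read as a secure choice.

```python
import random


def randomized_mq_update(A, C, Qe, is_lps, r):
    """Return the updated (A, C) for one coded bit."""
    if r == 0:  # standard convention: LPS sub-interval first
        if is_lps:
            return Qe, C
        return A - Qe, C + Qe
    # swapped convention (r == 1): MPS sub-interval first
    if is_lps:
        return Qe, C + (A - Qe)  # uses the value of A before the update
    return A - Qe, C


def code_block_seeds(master_seed, n_blocks, bits=32):
    """Derive one seed S_i per code-block from the master seed S."""
    outer = random.Random(master_seed)  # stand-in, NOT a secure generator
    return [outer.getrandbits(bits) for _ in range(n_blocks)]


def swap_decisions(block_seed, n_bits):
    """Decisions r_i for one code-block, restarted from its own seed S_i."""
    inner = random.Random(block_seed)
    return [inner.getrandbits(1) for _ in range(n_bits)]


seeds = code_block_seeds(master_seed=12345, n_blocks=3)
for i, s in enumerate(seeds):
    print(i, s, swap_decisions(s, 8))
```

Because each code-block restarts the inner generator from its own seed, a decoder that processes only a subset of the coding passes of a code-block stays aligned with the encoder at the start of the next one, while different code-blocks do not share the same initial decisions.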
The vast majority of existing encryption algorithms rely on the use of a secret key; therefore, the encryption algorithm must be used in conjunction with a secure key exchange protocol. Most existing e-commerce systems are based on the assumption that one of the parties (e.g., the vendor) is a trusted party; this allows a reasonably secure communication scheme to be set up. Note that in this work we are exclusively dealing with the encryption technique, and assume that an existing key exchange protocol is used, such as those employed in e-commerce applications. The development of ad-hoc protocols goes beyond the scope of this paper. A similar reasoning can be made as to secure key and random sequence generation. There exist several secure key and random generators (see e.g. [4], [9], [36], [37], [38], [39], [40]). In this paper we focus on how the seeds are used, but the definition of a specific random number generator is outside the scope of this paper. In our simulations, we represent the seeds S and $S_i$ with 32 bits. This allows $2^{32}$ different keys to be employed.

It is worth pointing out that the proposed approach is backward compatible with JPEG 2000 in the following sense. The randomized MQ-coder does not produce a legal JPEG 2000 codestream, as it does not comply with the PSNR requirements defined in the compliance tests (Part 4 [41]). However, an image encoded by a modified JPEG 2000 Part 1 encoder employing the randomized MQ-coder shall not crash a Part 1 decoder attempting to decode the image; this is guaranteed by the fact that the JPEG 2000 headers are left intact, and that the header content is consistent with the codestream. However, a Part 1 decoder that does not employ the corresponding randomized decoder, or that does not know the encryption key, will display a meaningless image (see examples in Sect. V). It is worth pointing out that the resulting file is compliant with the JPEG 2000 syntax, and allows quality and resolution scalability, progression order change and other compressed domain processing to be performed directly on the compressed and encrypted data, without the need of decoding and/or decrypting.
On the other hand, a slightly modified decoder implementing the randomized MQ decoder will be able to correctly decrypt and decode the image. Note that, since the randomized MQ-coder with $r_i$ always equal to zero is identical to the standard MQ-coder, the randomized one can be used to decode non-encrypted images. The RAC approach offers the advantage of jointly performing compression and encryption, avoiding the need of an external cyphering software that processes and encrypts a JPEG 2000 compliant non-encrypted codestream. In particular, the added complexity inside the AC is very limited, if compared with the computational power required by a separate encryption algorithm (however, secure random generators may have significant complexity, and this has to be taken into account in the system design). Moreover, the cyphering software would need to be able to parse a JPEG 2000 codestream, and hence to read the (compressed) packet headers, whereas our scheme exploits the syntax parsing capabilities of the JPEG 2000 encoder. A thorough discussion on the use of an external encryption block is given in Sect. V-D.1.

D. Applications

We employ the RAC approach for three security applications, namely total encryption, selective encryption, and conditional access, as defined below. Note that, although we employ the same key for all code-blocks, during the decoding process the randomized MQ decoder must know which code-blocks have been encrypted. In some applications, the selection of to-be-cyphered code-blocks is part of the system design, and is assumed to be known to both encoder and decoder. In some specific applications (not considered in this paper), a list of encrypted code-blocks may have to be communicated to the decoder. This list (which amounts to very little information) can be communicated by means of the selected key exchange protocol at the time the key is sent; alternatively, it may be embedded in a JPEG 2000 comment marker segment. Note that, in the applications defined below, the selection of code-blocks to be cyphered is fixed and made in advance. In the results presented in Sect. V, we always assume, in favor of the attacker, that he knows which code-blocks have been cyphered (in addition to partial knowledge of the key).

1) Total encryption: In total encryption, the sole purpose of encryption is to prevent an unauthorized user from decoding and displaying the image. To this purpose, we cypher all code-blocks in the wavelet domain. As a consequence, there is no need to send to the decoder a list of cyphered code-blocks along with the key. In favor of an attacker, it is assumed that he knows that all code-blocks have been cyphered. Total encryption requires a small amount of additional complexity in the AC with respect to the standard JPEG 2000 encoders and decoders; however, the use of a secure random number generator requires additional computation, and makes total encryption somewhat complex.

2) Selective encryption: In selective encryption, we cypher only a given set of code-blocks. The purpose is to enforce secrecy of the image at reduced computational complexity. In particular, since it is known that the low-frequency wavelet subbands contain the most important visual information, we encrypt all code-blocks in the first low-frequency resolution levels, and leave some high-frequency resolution levels in the clear. The number of cyphered levels defines a trade-off between secrecy and computational saving, as the secure random generator is employed fewer times; while in Sect. V we provide results for different numbers of cyphered levels, once the application is defined this number is fixed. Again, this avoids the need to send to the decoder a list of cyphered code-blocks. In favor of an attacker, it is assumed that he knows which resolution levels have been cyphered.

3) Conditional access: In conditional access, we encrypt the first low-frequency resolution levels with a key that is made publicly available (or we leave them in the clear), and the remaining high-frequency levels with a secret key. Hence, a user can download the image and decrypt/decode the first levels, so as to get a low-resolution thumbnail of the image. He can then decide whether or not to purchase the key that will allow him to decode and display the image at full resolution.
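The three applications differ only in which resolution levels are cyphered and with which key. A minimal sketch of such a selection rule follows; the function name protection_plan, its parameters and the key labels are hypothetical and serve only to illustrate the definitions above, assuming that resolution levels are numbered from 0 (lowest frequency) upwards.

```python
def protection_plan(mode, n_levels, cut, public_key="K_public", secret_key="K_secret"):
    """Map each resolution level (0 = lowest frequency) to a key label or None.

    mode      -- "total", "selective" or "conditional"
    n_levels  -- number of resolution levels in the codestream
    cut       -- selective: levels 0..cut are cyphered;
                 conditional: levels 0..cut form the freely decodable preview
    A value of None means that the level is left in the clear.
    """
    plan = {}
    for level in range(n_levels):
        if mode == "total":
            plan[level] = secret_key
        elif mode == "selective":
            plan[level] = secret_key if level <= cut else None
        elif mode == "conditional":
            plan[level] = public_key if level <= cut else secret_key
        else:
            raise ValueError(f"unknown mode: {mode}")
    return plan


# Example: six resolution levels (0 to 5), selective encryption up to level 1.
print(protection_plan("selective", 6, 1))
```

Since the plan is fixed in advance and depends only on the resolution level, encoder and decoder can derive it independently, which is why no list of cyphered code-blocks has to be transmitted in these applications.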

It is also possible to perform joint selective encryption and conditional access (and this is what is actually done in Sect. V). For instance, one could decide that the highest-frequency resolution level is left in the clear, so as to save some computations. Again, we assume that the encryption settings (which resolution levels are cyphered) are fixed and declared publicly, so that both the user and the attacker know them.

An argument that may be made with regard to conditional access applications is that, in the wavelet domain, an attacker could use the available low- and mid-frequency information to estimate the cyphered high-frequency subbands, thus improving the quality of the decoded image. However, this is different from breaking the encryption system, as it does not involve estimating the key or exactly estimating the cyphered data. This is a system-level issue that is caused by the fact that the wavelet subbands are not completely uncorrelated, so that the residual correlation can be employed to estimate missing data (in fact, this is a sort of error concealment). If one employed an ideal transform, its samples would be uncorrelated, and no information about the encrypted samples could be gained by the samples in the clear. On the other hand, while there is still some correlation between the wavelet subbands, edge estimation from the low- and mid-frequency subbands typically can increase PSNR; in [42] a PSNR as high as 33.78 dB is reported for doubling the resolution of the Lena image (from 256x256 to 512x512) by means of wavelet edge estimation, though visual quality may be impaired by ringing artifacts.

V. EXPERIMENTAL RESULTS

The proposed randomized MQ-coder has been integrated in JPEG 2000 and used for total encryption, selective encryption and conditional access. In particular, we employ the Lena and Goldhill images, encoded at 1 bpp, using five resolution levels with the (9,7) filter, 64x64 code-blocks, no visual weighting, and no quality layers.

A. Total encryption: performance and robustness

In the case of total encryption, all code-blocks in the wavelet domain are cyphered by means of the randomized MQ-coder. This is the most secure version of the encryption scheme, and does not require sending any list of cyphered code-blocks, though its computational complexity is higher than that of selective encryption. A decoder employing the correct key achieves a PSNR of 40.36 dB. On the other hand, an unauthorized user that does not know the key, and blindly uses a JPEG 2000 Part 1 compliant decoder to decompress and display the image, obtains a PSNR of 14.50 dB. Note that the typical use of image encryption in security applications does not involve transmission over error-prone channels, as the

effect of errors on the encrypted data would be unpredictable; as a consequence, the JPEG 2000 error resilience tools would likely be switched off in security applications. Nevertheless, error-resilience tools may be used to detect encrypted portions of the codestream as errors, and hence to try to skip the decoding of encrypted parts. The performance of JPEG 2000 has therefore been evaluated also in the presence of error resilience tools. In this latter case the decoder has the capability to stop decoding as soon as a coding pass appears to be corrupted, thanks to the arithmetic coder termination and segmentation symbol options; decoding is then restarted from the next coding pass. Decoding with error resilience tools yields a PSNR of 9.23 dB. Hence, an error resilient decoder is not able to improve image quality; in fact, the value of PSNR=14.50 dB, obtained by the standard decoder, corresponds to the decoding of a uniform grey image (pixel values equal to 128), presented to the user when all coding passes are recognized as corrupted. On the other hand, the value of 9.23 dB is close to the value obtained filling the code-blocks with all zeros, which is equivalent to what the error-resilient decoder does when it discards all coding passes as erroneous.

Fig. 3 reports the PSNR achieved by a decoder which attempts to decode the image using different candidate seeds S; in this simulation 3000 decoding attempts were made with different values of the key, of which only one is correct (note that a random generator provides sequences that are reasonably “different” from each other, so that 3000 sequences provide a very representative sample of the complete sequence set). As can be seen, only decoding with the correct key yields a PSNR larger than 13 dB. This points out that the decoder would have to exhaustively try all possible seeds in order to break the encryption system. This demonstrates the security of the proposed system, which needs huge computational resources in order to violate the key.

In order to appreciate the encryption effect in terms of visual quality, Fig. 4 shows an image that has been encoded with key K = 100, and decoded by a standard JPEG 2000 decoder without knowledge of the key. The resulting PSNR is 8.99 dB, and the image is meaningless. This points out that the proposed scheme effectively scrambles the image information, so that no information can be inferred from the encrypted data.

B. Selective encryption

In this section we report results obtained by applying the randomized MQ-coder to selected portions of the wavelet coefficients. In particular, with the phrase encryption at level j, we mean that resolution levels from j down to 0 (where 0 indicates the lowest-level LL subband, and increasing j's indicate increasing high-frequency resolution levels) have been encrypted, whereas the higher-frequency subbands have been left uncyphered. This is because the high-frequency information yields a lower contribution to PSNR, and need not be encrypted. Lena at 1 bpp is again used as test image.
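All results in this section are reported as PSNR. For reference, a minimal sketch of the usual definition for 8-bit images is given below, assuming a peak value of 255 and images stored as lists of rows of pixel values; the function name psnr is ours, and the text does not state which implementation was used for the experiments.

```python
import math


def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equally sized greyscale images (lists of rows)."""
    squared_error = 0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for a, b in zip(ref_row, dec_row):
            squared_error += (a - b) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical images
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)


# A decoded image that is off by 10 grey levels at every pixel has MSE = 100.
reference = [[128] * 4 for _ in range(4)]
decoded = [[138] * 4 for _ in range(4)]
print(round(psnr(reference, decoded), 2))
```

For the example above the mean squared error is 100, so the PSNR is 10 log10(255^2 / 100), i.e. about 28.1 dB; values around 9 to 15 dB, as reported for decoding without the key, correspond to much larger errors.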

Fig. 3. PSNR obtained by decoding all possible combinations of the key; in this case, the decoder knows the correct swapping probability P = 0.025, and tries for all possible values of the seed S. (Plot: PSNR in dB, from 5 to 45, versus seed S, from 0 to 3000.)

A standard decoder will blindly attempt to decode the whole codestream as if it were not encrypted; the resulting image will contain the correct high-frequency information, superimposed on meaningless low-frequency information. Conversely, a decoder that knows the correct key can decrypt all resolution levels up to the maximum quality. Fig. 5 shows the PSNR obtained by the decoder that knows the correct key for various encryption levels. The PSNR for encryption level j is computed by decoding only up to level j; this PSNR provides an intuition of how much each additional resolution level contributes to PSNR. Clearly, if the decoder decodes all resolution levels, it will achieve the maximum PSNR (level 5) regardless of the encryption level. On the other hand, the standard decoder that does not know the correct key will achieve the PSNR shown in Fig. 6 as a function of the resolution levels that have been encrypted. As can be seen, if few resolution levels are encrypted, decoding the unencrypted high-frequency information increases PSNR; however, the maximum PSNR (when only the lowest level LL subband is cyphered) is still very low, and does not provide a meaningful image. In particular, Fig. 7 shows the image obtained by the standard decoder that attempts to decode an image encrypted up to level 2. Although some high-frequency details are intelligible, the overall image quality is very poor. Ultimately, the selection of the resolution levels to be encrypted is a trade-off between the secrecy

Fig. 4. Image produced by a standard JPEG 2000 decoder in case of total encryption. The encoder has used the key K = 100, which is not known by the decoder.

and complexity requirements. Encrypting all resolution levels clearly provides maximum secrecy, but 100% of the wavelet coefficients have to be encrypted; on the other hand, protecting only up to level 1 requires cyphering only about 0.4% of the wavelet coefficients (and using the random generator proportionally fewer times).
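The 0.4% figure follows from the dyadic structure of the wavelet decomposition. A short sketch of the arithmetic is given below, assuming D decomposition levels (so that resolution levels are numbered 0 to D, as in Figs. 5 and 6) and an image whose dimensions are divisible by 2^D; the function name cyphered_fraction is illustrative.

```python
from fractions import Fraction


def cyphered_fraction(j, decomposition_levels=5):
    """Fraction of wavelet coefficients contained in resolution levels 0..j.

    Levels 0..j reconstruct an image downsampled by 2**(D - j) in each
    dimension, hence they hold 1 / 4**(D - j) of all coefficients.
    """
    if not 0 <= j <= decomposition_levels:
        raise ValueError("j must lie between 0 and the number of decomposition levels")
    return Fraction(1, 4 ** (decomposition_levels - j))


for j in range(6):
    print(j, f"{float(cyphered_fraction(j)):.2%}")
```

With five decomposition levels, encryption at level 1 covers 1/256 of the coefficients, i.e. 0.39%, in agreement with the value of about 0.4% quoted above, while encrypting only level 0 covers 1/1024 of them.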

C. Conditional access

The proposed protection scheme can also be employed to achieve conditional access to the image. In particular, since the encryption is performed at the code-block level, it is possible to single out those code-blocks that contribute to the decoding of specific resolutions or regions of the image, and to encrypt them. As an example, a typical conditional access application consists in leaving the lowest resolution levels unencrypted (or encrypted with a key that is made publicly available), while encrypting only the last high-frequency resolution levels. In this way, a low resolution version (thumbnail) of the image can be decoded by anyone, and serves as a low-quality content preview. On the other hand, only those who purchase a decryption key can decode and display the encrypted resolution levels, thus obtaining the full-quality image. To investigate the performance of a JPEG 2000 based system for conditional access based on the randomized MQ-coder, we consider a system that encrypts resolution levels 0 and 1 with key K0

Fig. 5. PSNR achieved by a JPEG 2000 decoder knowing the correct key K = (100, 0.25). (Plot: PSNR in dB, from 15 to 45, versus resolution levels, from 5 down to 0.)

(which is made publicly available), levels 2 and 3 with key K1, and level 4 with key K2. Level 5 is not encrypted so as to reduce the computational burden. We first analyze the performance obtained by a user that decodes in the correct way, i.e. one that only decodes resolution levels 0 and 1, for which the correct (public) key is available, so as to display a thumbnail; resolution level 5 would not be decoded as it would actually impair the visual quality (though possibly improving PSNR). We set K1 = 1000 and K2 = 10; the public key has seed S = 100. The PSNR achieved by this user for the Goldhill image is equal to 22.55 dB. An example of thumbnail generation is provided in Fig. 8 for the Goldhill image. An attacker could decode the resolution levels for which the key is public, and then try to break the encryption of levels 2, 3 and 4. As stated in Sect. IV-D, as a worst case we also assume that the attacker knows that the code-blocks belonging to resolution levels 2, 3 and 4 have been encrypted, that the key for levels 2 and 3 is different from that of level 4, and that level 5 has not been encrypted. Additionally, we assume that the attacker is judicious, as he only attempts to break level 2, since he does not know if the key he is using is correct. As in the previous case, we investigate the performance obtained by an attacker that attempts to break the key by trying out 3000 different values of S. The results, shown in Fig. 9, indicate that the attacker always obtains a PSNR lower than that of a fair user, except when the key he is

Fig. 6. PSNR achieved by a JPEG 2000 standard decoder without knowledge of the key K = 100, as a function of the encrypted levels. The abscissae indicate that all resolution levels from 0 up to the abscissa value have been encrypted. (Plot: PSNR in dB, from 8 to 15, versus encrypted levels, from 5 down to 0.)

using is correct; these results are similar to those shown in Fig. 3. As a final case, we assume that the attacker has now gained knowledge of K0 and K1, and tries to estimate K2 by trying out 500 different seeds. The obtained PSNR is shown in Fig. 10-a. As can be seen, the PSNR obtained by the attacker is almost always below that of a fair user, though it occasionally reaches it. This shows that trying to decode an encrypted codestream for which not all the keys are available typically leads to a worse image than that obtained by a user that does not try to break the encryption. Fig. 10-b shows the visual quality obtained by knowing K0 and K1 and trying a wrong key for K2; as can be seen, there is a significant quality impairment that makes the decoded image quite unpleasant.
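The key assignment used in this experiment can be summarized by a small lookup. The sketch below is illustrative only: the dictionary layout and the function decodable_levels are our own, and it assumes, as in the text, that a fair user stops at the first resolution level for which he lacks the key, so that the unencrypted level 5 is decoded only when all lower levels are available.

```python
# Key protecting each resolution level in the experiment above; None = in the clear.
LEVEL_KEYS = {0: "K0", 1: "K0", 2: "K1", 3: "K1", 4: "K2", 5: None}


def decodable_levels(known_keys, level_keys=LEVEL_KEYS):
    """Return the contiguous resolution levels, starting from 0, a user can decode."""
    levels = []
    for level in sorted(level_keys):
        key = level_keys[level]
        if key is not None and key not in known_keys:
            break  # higher levels are skipped: decoding them would impair quality
        levels.append(level)
    return levels


print(decodable_levels({"K0"}))              # user holding only the public key
print(decodable_levels({"K0", "K1", "K2"}))  # user who purchased all keys
```

A user holding only the public key K0 obtains levels 0 and 1 (the thumbnail), whereas a user holding K0, K1 and K2 can decode all six levels.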

D. Comparison with other techniques

1) Use of an external encryption block: Since existing encryption techniques such as AES have been proved to be very difficult to break, it is worth investigating their applicability in the context of multimedia compression. Again, although in this work we employ the JPEG 2000 standard, most concepts can be applied to other techniques. The main issues related to the application of external cyphering schemes such as AES to an

Fig. 7. Image produced by a JPEG 2000 standard decoder without knowledge of the key K = 100; the image has been protected encrypting up to level 2.

Fig. 8. Thumbnail generation for the Goldhill image. (a) image decoded from levels 0 and 1; (b) same image, displayed at the correct resolution.

international standard are the following.

• An additional external encryption block is required. This may be an issue in terms of complexity. Block cyphers such as AES are not excessively complex, whereas secure random number generators as used in stream cyphers may require significantly more resources. This issue is further discussed below.

• The external encryption block must operate on data units defined by the specific compression technique employed. This implies that, for a block cypher such as AES, the block size will likely not match the data unit size, and padding will be necessary. This causes a loss in compression efficiency, because of the need to code and encrypt the padded data, as well as to signal their presence. Stream cyphers are potentially unaffected by this problem.

• Encrypting compressed data may cause marker emulation and other syntactical issues. This can be avoided, e.g. as proposed in [23], at the expense of some increase in computational complexity; a comparison with [23] is provided in Sect. V-D.2.

To assess the advantages and drawbacks of this kind of approach, we have set up an encryption scheme that employs an external AES block and attempts to provide functionalities similar to those provided by the proposed technique. This scheme is based on packet encryption (e.g. as proposed in [23]), as opposed to code-block encryption as proposed in this work; packet encryption facilitates the use of an external cypher, because packet data are contiguous in the codestream. The scheme operates as follows.

1) A normal JPEG 2000 encoder generates a compliant compressed file with no securization.
2) A syntax parser scans the main header, tile-part headers, and packet headers, and generates a table indexing all packet headers and packet bodies.
3) For each packet, the packet body is extracted and padded with zero bits to an integer multiple of 128 bits (which is the AES block size). The padded block is then encrypted using AES, and the (possibly larger) cyphered block is written in the compressed file in place of the original block.
4) The packet header is not encrypted. However, the length information field in the packet header, specifying how many bytes of each code-block are contained in the packet body, is updated in order to account for the zero padding. This ensures that the syntax is self-consistent, and avoids the risk that a decoder unaware of the encryption scheme crashes.

We have evaluated the performance loss of this scheme versus the unprotected JPEG 2000 encoding. The results are worked out using the 512x512 Lena image at 1 bpp using five decomposition levels of the (9,7) filter. Several numbers of quality layers have been used, since the number of packets in the codestream is proportional to the number of layers. The results are shown in Tab. I. As can be seen, using AES generates a rate overhead. This overhead is below 1% if the number of quality

TABLE I
RATE OVERHEAD DUE TO ZERO PADDING FOR AES, AS A FUNCTION OF THE NUMBER OF QUALITY LAYERS.

N. of layers   overhead bits   % rate overhead
1              392             0.15
2              768             0.29
3              1408            0.54
4              1752            0.67
5              2136            0.81

layers is reasonably small, but can become significant if many layers are required, e.g. for fine-grain rate scalability. In such a case, instead of the default mode, the output-feedback mode of AES can be used, which works on individual bytes, and can be used without padding. Since the complexity of AES is manageable, it is possible to devise AES-based encryption schemes for JPEG 2000 with a modest performance loss with respect to the standard. However, despite their high security level, there are still some disadvantages in using external cyphering blocks, in that they are less flexible as to what portions of the transform coefficients are encrypted or left in the clear, e.g. for encryption of selected regions of interest. Packet encryption would not allow progression order changes or other compressed domain processing operations that require access to code-blocks and coding passes. This information is typically spread over many packets, while packet data cannot be accessed any longer after they have been encrypted. In fact, in order to achieve the exact functionality of our scheme, external encryption software would need to rearrange the compressed coding passes into bit-planes, encrypt the bit-planes, rearrange the encrypted coding passes into the respective packet bodies, and update header information.

As noted previously, stream cyphers can also be employed. A stream cypher requires a random number generator, whose output is XORed with the plaintext to produce the cyphertext. In this case, all the security lies in the random number generator rather than in the encryption formula. RC4, for example, is used in several contexts, including secure socket layer to protect Internet traffic and WEP encryption in IEEE 802.11b wireless LANs. Although RC4 has rather low complexity, it falls short of the high standards of security set by cryptographers [43]; this has motivated the development of the IEEE 802.11i standard.
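The rate overhead in Tab. I follows from the padding rule of step 3) of the AES-based scheme: each packet body is extended to a multiple of the 128-bit AES block size. The sketch below computes this padding for a given packet body length and converts the overhead bits of Tab. I into percentages, assuming that the reference size is that of the 512x512 image at 1 bpp (262144 bits). The function names are hypothetical; the overhead values are those of Tab. I.

```python
AES_BLOCK_BITS = 128


def padding_bits(body_bits, block_bits=AES_BLOCK_BITS):
    """Zero bits needed to extend one packet body to a whole number of blocks."""
    return -body_bits % block_bits


def total_overhead(packet_body_bits):
    """Total padding, in bits, over all packet bodies of a codestream."""
    return sum(padding_bits(b) for b in packet_body_bits)


def overhead_percent(overhead_bits, width=512, height=512, bpp=1.0):
    """Overhead relative to the nominal size of the compressed image."""
    return 100.0 * overhead_bits / (width * height * bpp)


# Overhead bits reported in Tab. I for 1 to 5 quality layers.
table_1 = {1: 392, 2: 768, 3: 1408, 4: 1752, 5: 2136}
for layers, bits in table_1.items():
    print(layers, bits, round(overhead_percent(bits), 2))
```

Under this assumption the computed percentages (0.15, 0.29, 0.54, 0.67 and 0.81) coincide with the last column of Tab. I. Since every packet body contributes up to 127 padding bits, the overhead grows with the number of packets, and hence with the number of quality layers.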
It should also be noted that secure random number generators exist (see e.g. [44]), but they typically have much higher complexity than normal generators. We have compared the generator in [44] with the rand function of the C library under Linux, and found it to be more than 100 times slower in producing a single random number, employing a Pentium IV PC at 1.7 GHz. This shows

TABLE II
COMPUTATIONAL COMPLEXITY EMPLOYING A SIMPLE RANDOM NUMBER GENERATOR (NL) AND THAT IN [44]. THE FIGURES ARE RUNNING TIMES (MEASURED IN SECONDS) FOR THE ENCRYPTION OF THE 512 X 512 LENA IMAGE.

Levels   NL time (s)   [44] time (s)
0        0.22          1.30
all      0.33          374.01

that a careful selection of the data to be cyphered is a key aspect to allow real-time operation of such a system. Some results are reported in Tab. II; the table compares the computational cost of protecting the 512x512 Lena image, encrypting either all code-blocks in all resolution levels, or only those at the lowest resolution (level 0). We have compared the algorithm in [44], and a simpler one obtained by adding a few non-linear operations to the standard rand function of the Linux C library, so as to make it more secure without adding significant complexity; this latter generator is denoted as NL in the table. As can be seen, for total encryption the generator in [44] increases the total running time significantly, and well beyond the requirements for real-time operation. Real-time operation can be obtained either by resorting to the simpler but less secure generator, or by employing selective encryption.

In summary, for RAC the complexity mainly lies in the random number generator. One could use the RC4 keystream to drive RAC; this would lead to a complexity similar to AES. In fact, the complexity of AES and RC4 is rather low, as they can encrypt tens of MB/s of data. One could also use a very secure random number generator, e.g. [44], which will however increase complexity by several orders of magnitude, as reported in Tab. II. On the other hand, as seen above, one could devise a scheme that employs RC4 or AES as cyphering techniques over a non-encrypted codestream. The complexity of such a scheme would be low; however, it would require an external encryption block, and it would not provide a syntax-compliant codestream. In the next section we discuss the scheme in [23], which employs an external encryption block, and performs additional actions in order to retain compliance with the syntax.

2) Other encryption techniques for JPEG 2000: In [23] the syntax violation issue is specifically addressed for JPEG 2000.
In particular, [23] employs RC4 to encrypt the packet bodies of a JPEG 2000 codestream. Since the encrypted packet body may contain marker emulations, it is encrypted iteratively until no emulation is present. It is shown that the number of iterations required to eliminate all marker emulations is reasonable. Both RAC and [23] avoid any compression loss due to the encryption process, hence their rate-distortion performance is equivalent for a user that knows the secret key. Visual impairment after scrambling is not directly comparable, but seems to be very good for both schemes. RAC offers functionalities similar to [23], although they are based on code-block encryption rather than packet encryption. It would be possible to perform quality-progressive encryption using RAC, e.g. by starting the encryption from a given point of each coding pass instead of doing it from the first bit, although in this paper we do not focus on this aspect, but rather on encryption of selected resolution levels. RC4 is a stream cypher, and can handle data blocks of arbitrary size. Hence, it could be applied instead of AES in the scheme outlined in Sect. V-D.1, without the need for padding. However, as has been seen in Sect. I, RC4 is not as secure as other encryption schemes such as AES. It would be interesting to extend the algorithm of [23] by employing AES.

VI. CONCLUSIONS

In this paper we have proposed randomized arithmetic coding as a general tool for joint image encryption and compression. An implementation of this paradigm tailored to the JPEG 2000 image compression standard has been described, and applications such as total encryption, selective encryption and conditional access have been studied. The proposed scheme provides several advantages with respect to existing approaches. Most notably, the encryption does not reduce compression efficiency, as opposed to the vast majority of existing techniques. Moreover, it does not alter the syntax of the JPEG 2000 codestream, preserving the ability to perform compressed domain processing such as codestream truncation or reordering, progression order changes, and so on.
Compared with the use of an external cyphering block operating on the JPEG 2000 codestream, the RAC approach has low complexity and very high flexibility, as it can be easily adapted to any block length; moreover, the integration within the compression algorithm avoids the need for an external parser to locate packets and code-blocks inside the codestream, and prevents violations of the codestream syntax. Extensive tests have been carried out to evaluate the robustness of the proposed scheme to attacks aiming at finding the correct key or decoding with the wrong key. It has been found that the key is very robust, so that a user attempting to estimate the key by exhaustive search would have to carry out a huge number of decoding attempts. Total encryption, selective encryption and conditional access have been investigated as possible applications. Total encryption turns out to provide a very high degree of security, yielding meaningless images to an unauthorized user, with a small computational overhead with respect to the non-encrypted system. As for selective encryption, it has been found that cyphering only a small fraction of wavelet

coefficients still provides meaningless images to users that do not know the decryption key; in this case the complexity overhead is much lower. Moreover, it has been found that the proposed system can flexibly combine selective encryption and conditional access in order to deploy an effective multimedia communication system with a high degree of security. R EFERENCES [1] J.L. Massey, “An introduction to contemporary cryptology,” Proceedings of the IEEE, vol. 76, no. 5, pp. 533–549, May 1988. [2] D.S. Taubman and M.W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards, and Practice, Kluwer, 2001. [3] Y. Sadourni and V. Conan, “A proposal for supporting selective encryption in JPSEC,” IEEE Transactions on Consumer Electronics, vol. 49, no. 4, pp. 846–849, Nov. 2003. [4] American National Standards Institute. American National Standard X9.17: Financial Institution Key Management (Wholesale), 1985. [5] W. Diffie and M.E. Hellman, “New directions in cryptography,” IEEE Transactions on Information Theory, vol. 22, pp. 644–654, 1976. [6] E. Okamoto and K. Tanaka, “Key distribution system based on identification information,” IEEE Journal on Selected Areas in Communications, vol. 7, no. 4, pp. 481–485, May 1989. [7] D. Naor and M. Naor, “Protecting cryptographic keys: the trace-and-revoke approach,” Computer, pp. 47–53, July 2003. [8] FIPS PUBS 46-2, 1993, Data Encryption Standard. [9] FIPS PUBS 197, 2001, Advanced Encryption Standard. [10] B. Schneier, Applied Cryptography, 2nd Edition, John Wiley and Sons, Inc., 1995. [11] S.J. Wee and J.G. Apostolopoulos, “Secure scalable streaming enabling transcoding without decryption,” in Proceedings of IEEE International Conference on Image Processing, 2001. [12] D. Kundur and K. Karthik, “Video fingerprinting and encryption principles for digital rights management,” Proceedings of the IEEE, vol. 92, no. 6, pp. 918–932, June 2004. [13] T. Lookabaough and D.C. 
Sicker, “Selective encryption for consumer applications,” IEEE Communications Magazine, pp. 124–129, May 2004. [14] H. Cheng and X. Li, “Partial encryption of compressed images and videos,” IEEE Transactions on Image Processing, vol. 48, no. 8, pp. 2439–2451, Aug. 2000. [15] A. Servetti and J.C. De Martin, “Perception-based partial encryption of compressed speech,” IEEE Transactions on Speech and Audio Processing, vol. 10, no. 8, pp. 637–643, Nov. 2002. [16] B. Goldburg, S. Sridharan, and E. Dawson, “Design and cryptanalysis of transform-based analog speech scramblers,” IEEE Journal on Selected Areas in Communications, vol. 11, no. 5, pp. 735–744, June 1993. [17] W. Zeng and S. Lei, “Efficient frequency domain selective scrambling of digital video,” IEEE Transactions on Multimedia, vol. 5, no. 1, pp. 118–129, Mar. 2003. [18] M.S. Kankanhalli and T.T. Guan, “Compressed domain scrambler/descrambler for digital video,” IEEE Transactions on Consumer Electronics, vol. 48, no. 2, pp. 356–365, May 2002. [19] J. Wen, M. Severa, W. Zeng, M.H. Luttrelle, and W. Jin, “A format-compliant configurable encryption framework for access control of video,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 6, pp. 545–557, June 2002.

IEEE TRANSACTIONS ON MULTIMEDIA (RESUBMITTED NOV. 2005)

25

[20] R. Grosbois, P. Gerbelot, and T. Ebrahimi, “Authentication and access control in the JPEG 2000 compressed domain,” in Proceedings of the SPIE 46th Annual Meeting, 2001. [21] M. Wu and Y. Mao, “Communication-friendly encryption of multimedia,” in Proc. of IEEE International Workshop on Multimedia Signal Processing (MMSP), 2002. [22] M. Johnson, P. Ishwar, P. Prabhakaran, D. Schonberg, and K. Ramchandran, “On compressing encrypted data,” IEEE Transactions on Signal Processing, vol. 52, no. 10, pp. 2992–3006, Oct. 2004. [23] Y. Wu and R.H. Deng, “Compliant encryption of JPEG2000 codestreams,” in Proceedings of IEEE International Conference on Image Processing, 2004. [24] J. Wen, M. Muttrell, and M. Severa, “Access control of standard video bitstreams,” in Proceedings of International Conference on Media Future, 2001. [25] A. Barbir, “A methodology for performing secure data compression,” in Proceedings of Twenty-Ninth Southeastern Symposium on System Theory, 1997. [26] I.H. Witten and J.G. Clearly, “On the privacy offered by adaptive text compression,” Computers and Security, vol. 7, pp. 397–408, 1988. [27] H.A. Bergen and J.M. Hogan, “Data security in a fixed-model arithmetic coding compression algorithm,” Computers and Security, vol. 11, 1992. [28] H.A. Bergen and J.M. Hogan, “A chosen plaintext attack on an adaptive arithmetic coding compression algorithm,” Computers and Security, vol. 12, pp. 157–167, 1993. [29] J.G. Clearly, S.A. Irvine, and I. Rinsma-Melchert, “On the insecurity of arithmetic coding,” Computers and Security, vol. 14, pp. 167–180, 1995. [30] P.W. Moo and X. Wu, “Resynchronization properties of arithmetic coding,” in Proceedings of IEEE International Conference on Image Processing, 1999. [31] P.W. Moo and X. Wu, “Joint image/video compression and encryption via high-order conditional entropy coding of wavelet coefficients,” in Proceedings of IEEE International Conference on Multimedia Computing and Systems, 1999. 
[32] Joint Committee Draft (JCD), Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, May 2002.
[33] M. Grangetto, A. Grosso, and E. Magli, “Selective encryption of JPEG 2000 images by means of randomized arithmetic coding,” in Proceedings of IEEE International Workshop on Multimedia Signal Processing, 2004.
[34] D. Jones, “Applications of splay trees to data compression,” Communications of the ACM, pp. 996–1007, Aug. 1988.
[35] X. Liu, P.G. Farrell, and C.A. Boyd, “Resisting the Bergen-Hogan attack on adaptive arithmetic coding,” in Proceedings of 6th IMA International Conference on Cryptography and Coding, 1997.
[36] G.M. Bernstein and M.A. Lieberman, “Secure random number generation using chaotic circuits,” IEEE Transactions on Circuits and Systems, vol. 37, no. 9, pp. 1157–1164, Sept. 1990.
[37] R. Bernardini and G. Cortelazzo, “Tools for designing chaotic systems for secure random number generation,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 48, no. 5, pp. 552–564, May 2001.
[38] A. Gerosa, R. Bernardini, and S. Pietri, “A fully integrated chaotic system for the generation of truly random numbers,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 49, no. 7, pp. 993–1000, July 2002.
[39] Y. Hu and G. Xiao, “Generalized self-shrinking generator,” IEEE Transactions on Information Theory, vol. 50, no. 4, pp. 714–719, Apr. 2004.
[40] D.E. Denning and M. Smid, “Key escrowing today,” IEEE Communications Magazine, vol. 32, no. 9, pp. 58–68, Sept. 1994.
[41] ISO/IEC 15444-4:2002 Information technology - JPEG 2000 image coding system - Part 4: Conformance testing.

[42] Y. Zhu, S.C. Schwartz, and M.T. Orchard, “Wavelet domain image interpolation via statistical estimation,” in Proceedings of IEEE International Conference on Image Processing, 2001.
[43] S.R. Fluhrer, I. Mantin, and A. Shamir, “Weaknesses in the key scheduling algorithm of RC4,” in Proceedings of Annual International Workshop on Selected Areas in Cryptography, 2001.
[44] L. Blum, M. Blum, and M. Shub, “A simple unpredictable random number generator,” SIAM Journal on Computing, vol. 15, pp. 364–383, 1986.

Fig. 9. PSNR as a function of P (the constant line represents the PSNR achieved by a fair user): encryption is done using K0 = 100, K1 = 1000 and K2 = 10. The decoder employs the correct K0 and guesses K1 using 3000 different seeds.

Fig. 10. PSNR as a function of S (the constant line represents the PSNR achieved by a fair user): encryption is done using K0 = 100, K1 = 1000 and K2 = 10, with K2 unknown at the decoder.

Fig. 11. Decoding example: the Goldhill image, encrypted as described above, is decoded using a wrong K2.
