SIViP DOI 10.1007/s11760-013-0471-0

ORIGINAL PAPER

Transparent hashing in the encrypted domain for privacy preserving image retrieval Kannan Karthik · Sachin Kashyap

Received: 30 May 2012 / Revised: 1 January 2013 / Accepted: 28 January 2013 © Springer-Verlag London 2013

Abstract Search through a database of encrypted images against a crumpled and encrypted query will remain privacy preserving only if comparisons between selective features derived from these images are executed in the encrypted domain itself. To facilitate this, the encryption process must remain transparent to specific image statistics computed in the spatial or transform domain. Consequently, the perceptual hash formed by quantizing the image statistics remains the same before and after the encryption process. In this paper, we propose a transparent privacy preserving hashing scheme tailored to preserve the DCT-AC coefficient distributions, despite a constrained inter-block shuffling operation. These DCT distributions can be mapped onto a generalized Gaussian model characterized by shape and scale parameters, which can be quantized and Gray-coded into a binary hash matrix. The encryption scheme has been shown to be perceptually secure and does not impair the search reliability and accuracy of the hashing procedure. Experimental results have been provided to verify the robustness of the hash to content-preserving transformations, while demonstrating adequate sensitivity to discriminate between different images.

Keywords Privacy preserving search · Transparent hash · Constrained shuffling · Coefficient histogram · Encrypted domain

1 Introduction

Electronic stores for high-resolution digitized images use perceptually strong image encryption schemes to protect them from being copied and disseminated for illegal gratification. Any search and retrieval based on a low-resolution image as the query will have to be done in the encrypted domain. The query may be cropped, scaled, compressed, or blurred: it may remain perceptually equivalent to the original, yet differ considerably in the signal (e.g., pixel intensity) domain. Hence, image matching must be done based on high-level features, derived by carefully aggregating low-level features. These features could be statistical in nature [1,2], geometrically related to anchor points [3], features based on visual words [4,5], or those derived using learning-based techniques [6]. Comparison between any two images therefore becomes equivalent to a comparison between two feature sets designed to capture the perceptual variations within and between these images. Most of the architectures developed for privacy preserving search and retrieval engines for images use encrypted feature sets or specific feature transparent multimedia encryption schemes.

K. Karthik (B), Department of Electronics and Electrical Engineering, Indian Institute of Technology, Guwahati, India. e-mail: [email protected]

S. Kashyap, Research and Development Engineer at Tejas Networks, Bangalore, India. e-mail: [email protected]

2 Related literature

2.1 Search based on encrypted feature sets It was observed in Lu et al. [7] that certain features such as the relative Hamming distances between most significant bit (MSB) planes remained invariant to processes such as shuffling and addition of random binary sequences, which made it possible to compare these encrypted bit planes across images. It was also observed that random projections of high-level


image feature vectors preserved distances between these feature vectors in the lower dimensional space [8,9]. Without knowledge of the random projection matrix, it is extremely difficult to determine the original features. However, if the same random projection matrix is re-used several times in the process of creating an encrypted feature set, the system may become vulnerable to a chosen-plaintext attack owing to the linearity of the transform. Alternately, decomposing an image into a bag of visual words based on the principles presented by Nister et al. [4] has several benefits, such as (i) efficient content summarization based on the visual alphabet learnt, (ii) granular search [10], that is, searching for strings inside visual strings, and (iii) adaptation of the vocabulary space. While the process is computationally intensive in the learning and adaptation phases, it allows a more accurate representation of visual cues. To protect privacy and confidentiality, Lu et al. [5] proposed a family of secure indexing schemes, such as those based on order-preserving encryption, for matching visual strings in the encrypted domain. The main issue associated with this framework is the lack of information binding between the images and the encrypted features, which speeds up the search process but raises some issues related to trust, since any attacker can disturb the search by shuffling the encrypted features.

2.2 Search based on feature transparent multimedia encryption schemes

By allowing image-specific high-level features to seep through the encryption process, it is possible to perform comparisons in the encrypted domain itself. In Hsu et al. [11], a homomorphic encryption scheme is used for securing images, while Scale-Invariant Feature Transform (SIFT) features are extracted for image comparisons. They observed that the Difference of Gaussian (DoG) of a homomorphically encrypted image is equivalent to the encryption of the DoG of the original image. This transparency and invariance to the DoG operation allow SIFT features to be extracted from the encrypted image itself. The problem of identifying intrinsic image statistical parameters which remain invariant to soft-ciphers was examined by Kashyap et al. [12]. It was seen that the block means and variances remain invariant to intra-block shuffling, but carry significant information regarding the perceptual content (enough to discriminate between images). By appropriately trimming the feature set and quantizing the statistical parameters, it is possible to construct a robust hash which is transparent to block permutation ciphers. An issue with this approach is the amount of perceptual leakage, since the shuffling is confined to the blocks. The volume of image-specific information which may be allowed to seep through the encryption process depends on


the choice and nature of the cipher. With a little sacrifice of information theoretic and computational secrecy, but with good perceptual secrecy, it is possible to identify soft-ciphers which permit image comparisons in the encrypted domain.
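To make the random-projection idea from Sect. 2.1 concrete, here is a small sketch (my own illustration, with arbitrary dimensions and synthetic feature vectors, not the cited authors' code) showing that a shared random projection approximately preserves pairwise distances while hiding the original feature values:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 512, 64                          # original and projected dimensions (arbitrary)
P = rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, d))   # shared secret projection matrix

f1 = rng.normal(size=d)                 # two high-level feature vectors
f2 = rng.normal(size=d)

orig = np.linalg.norm(f1 - f2)
proj = np.linalg.norm(P @ f1 - P @ f2)
print(round(orig, 2), round(proj, 2))   # distances agree up to a small distortion
```

If the same matrix P is reused for many feature vectors, the linearity of the map is exactly what a chosen-plaintext adversary can exploit, as noted above.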

3 Problem formulation: encryption transparent hashing

The difficulty in preserving a portion of the image signature across an intermediate encryption operation is that any cipher has a tendency to distort the statistics of the original image (or plaintext). However, if we focus on a family of multimedia ciphers which achieve perceptual secrecy but allow the penetration of specific statistical parameters intrinsic to the image representation, it opens up the possibility of comparing some features in the encrypted domain. If an attempt is made to strengthen the encryption process by adopting Shannon's fundamental principles of confusion and diffusion [13] toward cipher design, this will choke the flow and accuracy of statistical measurements pertaining to the original image.

We define E_t() as a transparent encryption operation and H_pe() as a perceptual and encryption invariant hash function, constructed by quantizing certain statistical parameters derived from an image I. Let I_1 and I_2 be two different images, with the former carrying the same perceptual content as the image I. If we define D_h() as the distance metric for comparing two hash strings or matrices, we expect

D_h(H_pe(I), H_pe(I_1)) << 1    (1)

D_h(H_pe(I), H_pe(I_2)) ≈ 1    (2)

Since the notion of similarity or dissimilarity between images is a subjective measure, it is impossible to arrive at a perfect threshold characteristic for any perceptual hash function. Hence, Eq. 2 may not be applicable to all images which are dissimilar to I, and the inequalities in Eqs. 1 and 2 apply only in a probabilistic sense. If the transparent encryption operation does not alter the hash value, we have

H_pe(I) ≈ H_pe[E_t(I, K)]    (3)

where K is the encryption key. For some soft-ciphers, it is possible to ensure that the hash values before and after the encryption operation remain virtually unaltered. For stronger ciphers where there is some distortion of the original image statistics, we have

D_h(H_pe(I), H_pe[E_t(I, K)]) ≈ 0    (4)

Observe that the keying parameter K completely decouples the original image from the encrypted version. Hence, for two independent keys K_1 and K_2,

H_pe[E_t(I, K_1)] ≈ H_pe[E_t(I, K_2)]    (5)

Thus, the images I, E_t(I, K_1), and E_t(I, K_2) are statistically equivalent but perceptually very different, which is necessary to make the hash invariant to encryption.

The associations between the hashes and the corresponding encrypted and unencrypted images are shown in Fig. 1 for three different models: (i) Encrypt and Hash (E&H), (ii) Robust Perceptual Hashing (RPH), and (iii) Transparent Encryption Hashing (TEH).

4 Softening of encryption for transparency

To compare two images in the encrypted domain itself, the encryption process should allow some of the statistics derived from the image to percolate. For that, an encryption algorithm with limited diffusion capability is deployed. Permutation is a bijective map whose domain and range are the same. Let X be a vector represented by X = [x_1, x_2, ..., x_N] and let the function E_perm : X → X be a permutation. The encryption process can be written as

E_perm(X, K) = S_K X^T    (6)

where S_K represents the matrix obtained after a column permutation of the identity matrix I_N based on the encryption key K. If we compute the mean μ_X and variance σ_X^2 from the block of data X,

μ_X = (1/N) Σ_{i=1}^{N} x_i,    σ_X^2 = (1/N) Σ_{i=1}^{N} (x_i − μ_X)^2    (7)

The values of the mean and variance would not change if the block X were shuffled as Z = S_K X^T. Consequently,

μ_X = μ_Z,    σ_X^2 = σ_Z^2    (8)

Hence, the abstractions of the data set X in the form of mean and variance are preserved by the permutation cipher. However, localized permutation ciphers suffer from known-plaintext attacks, and block means and variances do not form sufficiently robust features for image comparison [12]. Hence, in this paper, we focus on the selection of better statistical features coupled with the development of a more perceptually secure but transparent encryption scheme.
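As a minimal illustration of this invariance (my own sketch, not the paper's code; the block length and pixel values are arbitrary), the following Python snippet shuffles a block of intensities with a keyed permutation and confirms that the mean and variance are unchanged, as in Eqs. (7)-(8):

```python
import numpy as np

def permute_block(x, key):
    """Shuffle the entries of a block with a key-seeded permutation
    (a toy stand-in for applying S_K to X)."""
    rng = np.random.default_rng(key)
    return x[rng.permutation(len(x))]

# Arbitrary 16-sample block of pixel intensities (illustrative values)
x = np.array([12, 200, 34, 90, 77, 255, 3, 18,
              140, 66, 91, 102, 250, 5, 43, 88], dtype=float)

z = permute_block(x, key=42)

# First- and second-order statistics survive the permutation
print(np.isclose(x.mean(), z.mean()))  # True
print(np.isclose(x.var(),  z.var()))   # True
```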

5 Need for information binding and transparent encryption

In any privacy preserving search and retrieval problem, identification tags or representative hashes can always be used in place of the original images for comparisons. These hashes can be derived either from the encrypted or the unencrypted images. The advantage in using hash-tags in place of images is that the search complexity is limited to O(n) hash comparisons as opposed to complex image similarity measures.

5.1 Encrypt and hash (E&H)

In this model, each image (both the query I_Q and those inside the database I_i, i = 1, 2, ..., n) is encrypted using a standard encryption algorithm and is then subjected to a cryptographic hashing algorithm such as SHA or MD5 to generate the corresponding hash-tag h_i = Crypto_hash[E_K(I_i), K_hash] (Fig. 1a). Matching is done by comparing the corresponding hash-tags of images. Since the hash is a cryptographic one, the tags will change significantly even if there are minute changes in the signal domain. Hence, the search will not be performed under the notion of perceptual similarity, rendering this approach ineffective for image searches.

Fig. 1 The associations between the hash-tags and the original and encrypted images for E&H, RPH, and TEH models


5.2 Robust perceptual hashing (RPH)

Contrary to the E&H model, where the tags are generated from the encrypted images, the tags here are generated from the original images I_i with the help of a perceptual and robust hashing algorithm:

h_i = H_p(I_i, K_hash)    (9)

where H_p is a perceptual hashing algorithm based on some predefined consistent image features. Once the hash-tags are computed, the images I_i are encrypted using a standard encryption scheme. Since the hash-tags are independent of the encryption algorithm (Fig. 1b), the comparisons can be made on perceptual grounds and the search can result in the retrieval of images which are perceptually similar to I_Q. There is a fundamental problem with this model. If an attacker disrupts the search operation by shuffling the hash-tags, there is no way the querying person can verify whether the search process is progressing accurately, till the final results are supplied by the search engine and some of the images within this subset are decrypted.

5.3 Proposed transparent encryption hashing (TEH) model and information binding

In both the E&H and the RPH models, the encryption scheme is completely independent of the hashing algorithm used. In the TEH model, there is some form of symbiosis between the encryption process and the hashing algorithm. The strength of the encryption algorithm depends on the extent to which the key information (K_i) is mixed with the image (I_i), resulting in a cipher-image (Y_i) whose statistics have no connection with the original unencrypted image. If the encryption process is softened, then the subsequent hashing algorithm can feed on some residual features deliberately left untouched by the encryption process. The computation of the hash-tags is related to both the original and encrypted images (depicted in Fig. 1c) as follows:

h_i = H_pe[I_i, K_hash] = H_pe[E_t(I_i, K_i), K_hash]    (10)

and, for a refreshed hashing key K_hash(new), the same tag results whether it is computed from the encrypted image or from the original:

h_i(new) = H_pe[E_t(I_i, K_i), K_hash(new)] = H_pe[I_i, K_hash(new)]    (11)

where H_pe is a perceptual and encryption transparent hashing algorithm and E_t() is the soft encryption scheme. The cipher design should ensure that the statistical features it leaks do not have much perceptual relevance. On the other hand, if one begins with the selection of robust features for the hashing algorithm, the strength of the encryption process may have to be compromised considerably to meet the demands of a good search/retrieval algorithm. Hence, the overall design process is iterative in nature till one finds the right compromise between encryption strength, perceptual secrecy, key size, and robustness of the transparent hash. Robustness implies ensuring good separability between image classes based on the choice of transparent hash features, while keeping the hash value relatively invariant to several signal processing operations. There are several advantages of linking the hash-tag to both the encrypted and original images.

• During the search, if there is a doubt regarding the authenticity and the linkability of the tag corresponding to a certain encrypted image, the tag can be recomputed directly from the encrypted image without exposing the underlying content. This avoids unnecessary security overheads and repercussions such as forced key refreshes.
• In the case of a breach and a compromise of several image-hash (I_t, h_t) pairs, a re-computation of tags for all the images may be unavoidable, since the inter-hash distances become heavily K_hash-dependent in a low-dimensional space. However, if this re-computation is restricted to the encrypted domain without requiring a tedious decryption-decompression-hashing operation for each image, then the process becomes more secure and computationally efficient.

6 Transparent hashing based on constrained shuffling

6.1 Choice of features

Calculating the DCT coefficients of an image block by block essentially carries out a frequency decomposition or content decomposition of that block. A DCT coefficient of a block represents the magnitude of that frequency in a particular spatial region. To start with, we compute the 16 × 16 block DCT of an image of dimension 256 × 256. In order to do that, the image is divided into N_B = 256 blocks, each of dimension 16 × 16. For the kth block A_k, the 2-dimensional DCT B_k is given as

B_k(i, j) = a_i a_j Σ_{m=0}^{15} Σ_{n=0}^{15} A_k(m, n) cos[π(2m + 1)i / 2M] cos[π(2n + 1)j / 2N]    (12)

where 0 ≤ i ≤ 15, 0 ≤ j ≤ 15, M = N = 16, and

a_i = { 1/4,  i = 0;   1/√8,  1 ≤ i ≤ 15 },    a_j = { 1/4,  j = 0;   1/√8,  1 ≤ j ≤ 15 }

B_k(i, j) is the DCT coefficient present at position (i, j) in block k. In order to create confusion, the synergy between space and frequency can be collapsed, which leads to the loss of perceptual information. The coefficients of each block are zigzag scanned, the sequence being B_k(0, 0), B_k(0, 1), B_k(1, 0), ..., B_k(15, 15). Spatial information is collapsed by forming,


Z_1 = [B_1(0, 0), B_2(0, 0), B_3(0, 0), ..., B_256(0, 0)]
Z_2 = [B_1(0, 1), B_2(0, 1), B_3(0, 1), ..., B_256(0, 1)]
...
Z_25 = [B_1(4, 4), B_2(4, 4), B_3(4, 4), ..., B_256(4, 4)]    (13)

The overall matrix can be written as

Z = [Z_1^T  Z_2^T  ...  Z_25^T]

The first column of the matrix represents the DC coefficient of all the blocks, the second column the first AC coefficient of all the blocks, and so on. The permutation cipher is applied individually to each of the columns of the matrix Z. The permutations are applied to only the first 25 coefficients. The encryption can be written as

E_K(Z) = [E_{K_1}(Z_1^T)  E_{K_2}(Z_2^T)  ...  E_{K_25}(Z_25^T)]    (14)

where K_1, K_2, ..., K_25 are the subkeys generated from the encryption key K, and E_{K_i}(Z_i^T) is given by E_{K_i}(Z_i^T) = S_{K_i} Z_i^T, where S_{K_i} represents the 16 × 16 matrix obtained by a column permutation of the identity matrix I_16. A different set of keys and, hence, different sets of shuffle tables are employed for the different frequency components. Constrained random permutation is employed, which deliberately preserves a portion of the statistics, such as the histogram of each frequency component B_k(i, j) created by accumulating the same frequency from all blocks, k = 1, 2, ..., 256. Five test sample images were picked from an assorted database (AD) to examine the perceptual security of this constrained shuffling: (a) Lena, (b) Actress, (c) Diplomat, (d) Girl with hat-1, and (e) Girl with hat-2. Figure 2 shows that this encryption process provides sufficient masking of perceptual details.
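A compact sketch of this constrained shuffling is given below (my own illustration rather than the authors' implementation: it uses scipy's DCT, a random stand-in image, and reads Eq. (13) as retaining the 5 × 5 low-frequency corner of each block in zigzag order; the per-column permutation over all 256 blocks stands in for the sub-keyed shuffle tables):

```python
import numpy as np
from scipy.fft import dctn

def zigzag_indices(n):
    """(i, j) pairs of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order

def per_frequency_sets(img, bs=16):
    """Block DCT of a 256x256 image and regrouping of the 25 retained
    coefficients across all blocks (the vectors Z_1 ... Z_25 of Eq. 13)."""
    h, w = img.shape
    blocks = [dctn(img[r:r + bs, c:c + bs], norm='ortho')
              for r in range(0, h, bs) for c in range(0, w, bs)]
    zz = zigzag_indices(5)                     # (0,0), (0,1), (1,0), ..., (4,4)
    return np.array([[b[i, j] for (i, j) in zz] for b in blocks])

def constrained_shuffle(Z, key):
    """Shuffle each frequency column independently across blocks, so every
    per-frequency histogram is preserved exactly (one permutation per sub-key)."""
    rng = np.random.default_rng(key)
    out = Z.copy()
    for c in range(Z.shape[1]):
        out[:, c] = Z[rng.permutation(Z.shape[0]), c]
    return out

# Usage with a random stand-in image (any 256x256 grayscale array works)
img = np.random.default_rng(0).integers(0, 256, (256, 256)).astype(float)
Z = per_frequency_sets(img)
Z_enc = constrained_shuffle(Z, key=7)
# Per-frequency histograms (hence the shape/scale estimates) are unchanged
print(np.allclose(np.sort(Z, axis=0), np.sort(Z_enc, axis=0)))  # True
```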

6.2 Statistical modeling of the DCT coefficients for hashing

The sample spaces Z_2, Z_3, ..., Z_10 represent the first 9 sets of AC coefficients, which have been used to derive the hash as they are relatively robust to signal processing operations. The histograms of these DCT coefficients are discrete distributions over a set of bins, and each bin can be thought of as one dimension of an n-dimensional histogram space H^n, where n is the number of bins. The histogram plots of the first 9 AC frequency samples of the Lena image are given in Fig. 3. Every n-bin histogram can be represented as a point in this space. Let s_Q be the histogram of the query image and s_E be the histogram of the encrypted image. The proximity evaluation methods that have been employed in the literature for histogram-based image retrieval are the Euclidean distance, the city-block distance, and the histogram intersection method. The Euclidean distance for any pair of histogram features in the histogram space H^n is given as

d_Euc(s_Q, s_E) = Σ_{t=1}^{n} (s_Q(t) − s_E(t))^2    (15)

The city-block distance is given as

d_CB(s_Q, s_E) = Σ_{t=1}^{n} |s_Q(t) − s_E(t)|    (16)

Fig. 2 Original and encrypted images, a Lena, b Actress, c Diplomat, d Girl with hat-1, e Girl with hat-2, f-j Corresponding encrypted images


Fig. 3 Histogram of the first 9 AC frequency sample spaces of Lena image. As can be seen from the histogram plots, they can be modeled as generalized Gaussian distribution

The normalized distance of histogram intersection is given as

d_h(s_Q, s_E) = 1 − Σ_{t=1}^{n} min(s_Q(t), s_E(t))    (17)

Since the permutation cipher preserves the histogram of the AC coefficients, the above-mentioned proximity evaluation methods will give an exact match between the query and its encrypted version in the database. But these evaluation techniques are not robust to content-preserving signal processing operations. We therefore look to model the coefficients and extract certain attributes which are robust to content-preserving operations. Each of the DCT sample spaces can be modeled as independent and identically distributed (IID) random variables having a generalized Gaussian distribution (GGD), one of the most widely used statistical models in image processing. The GGD is a parametric model whose characteristics are defined by two parameters, the shape and the scale. The GGD random variable can be denoted as

X ∼ GGD(μ, σ, α)    (18)

where μ is the mean, σ the standard deviation, and α the shape parameter. The probability density function is then

ggd(x; μ, σ, α) = [1 / (2Γ(1 + 1/α) S(α, σ))] exp(−(|x − μ| / S(α, σ))^α)    (19)

where x ∈ R and S(α, σ) = [σ^2 Γ(1/α) / Γ(3/α)]^{1/2} is called the scale parameter. The scale parameter defines the spread of the distribution, while the shape parameter defines the decay rate of the density function; smaller values of α correspond to a more peaked distribution. The same frequency components obtained after the block DCT decomposition and zigzag scanning are modeled as IID random variables having the generalized Gaussian distribution. The shape and the scale parameters of the 9 different frequency components are determined to form the feature vector for the hash calculation. Since permutation ciphers have been applied within the same frequency components, the histograms do not change, as a result of which the shape and scale parameters of all the frequency components remain virtually the same.

There is no closed-form expression to estimate the shape parameter of a generalized Gaussian distribution. In [14], a robust image hashing algorithm for image authentication has been proposed in which invariant statistical parameters of DCT coefficients are estimated using the GGD model; the maximum likelihood (ML) estimate is employed to find the shape parameter, which requires solving a transcendental equation. In an alternative approach proposed by Sharifi and Garcia [15], the shape parameter is estimated in terms of the variance and the mean absolute value of the data. We have employed this method to determine the shape parameters of the 9 different frequency samples. Let X be generalized Gaussian distributed data. There exists an explicit relationship between the variance σ^2 and the mean absolute value E|X|. If μ = E[X] = 0, then

σ^2 = R(α) E^2|X|    (20)

where

R(α) = Γ(3/α) Γ(1/α) / Γ^2(2/α)    (21)

is called the generalized Gaussian ratio function (GGRF) [16]. A look-up table is created that contains the GGRF for different shape parameters in the range 0–3. For each of the given frequency sample spaces Z_2, Z_3, ..., Z_10, the ratio of the variance and the square of the mean absolute value is calculated. Thereafter, from the look-up table, the shape parameter corresponding to the calculated GGRF value is obtained. Once the shape parameter is known, the scale parameter is calculated using the relation

S(α, σ) = [σ^2 Γ(1/α) / Γ(3/α)]^{1/2}    (22)

6.3 Verification of transparency

Although in theory it is expected that the histograms of the shuffled and original images should remain the same, since shuffling of coefficients does not change the numerical constitution of the set, it is worthwhile examining whether existing lossy compression models perturb some of the values owing to the changes in the block constitution. The hashing algorithm was run on two images (Lena and Actress) and their corresponding encrypted versions. The corresponding scale and shape parameters for the nine significant AC coefficients are shown in Table 1. Observe that the parameters show only a slight deviation due to round-off effects owing to quantization. This is anticipated because, when encrypted images are stored, they will be subjected to lossy compression (such as JPEG), leading to some round-off effects if a non-standard block size is used. It was observed that all images except Lena showed a deviation in the shape and scale parameter values obtained from the encrypted version. Hence, the columns pertaining to the encrypted Actress image in Table 1 show a greater deviation in shape and scale parameters with respect to the original Actress. The Actress image has lower texture compared to Lena (Fig. 2a, b), owing to which there is a greater skew in the energy distribution of the DCT coefficients toward the lower frequencies. Consequently, when the shuffled image of the Actress is quantized, there is likely to be a greater error in the representation, which possibly accounts for the deviation in the shape and scale parameters.

To check whether the feature vectors capture the texture diversity across different images, the shape and scale parameters for four images were estimated; Table 2 gives the result. While the shape parameters show a moderate deviation, considerable changes are seen in the scale parameters, especially at lower frequencies. A close match in a few of the parameters for some of the images is offset by significant deviations in others. Even if half the parameters of one image closely match another image's shape and scale parameters, there will still be sufficient variation to separate the hash values.
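A small sketch of this moment-based estimator follows (an illustration of Eqs. 20–22 rather than the authors' implementation; the α grid spacing, its lower limit, the use of scipy, and the synthetic test sample are my own choices):

```python
import numpy as np
from scipy.special import gamma

def ggrf(alpha):
    """Generalized Gaussian ratio function R(alpha), Eq. (21)."""
    return gamma(3.0 / alpha) * gamma(1.0 / alpha) / gamma(2.0 / alpha) ** 2

# Look-up table of R(alpha) for shape parameters in (0, 3]
ALPHAS = np.arange(0.05, 3.0001, 0.001)
R_TABLE = ggrf(ALPHAS)

def ggd_shape_scale(samples):
    """Estimate (alpha, S) for zero-mean GGD samples via Eqs. (20)-(22)."""
    x = np.asarray(samples, dtype=float)
    ratio = x.var() / (np.abs(x).mean() ** 2)            # empirical R(alpha)
    alpha = ALPHAS[np.argmin(np.abs(R_TABLE - ratio))]   # nearest table entry
    scale = np.sqrt(x.var() * gamma(1.0 / alpha) / gamma(3.0 / alpha))
    return alpha, scale

# Usage: a Laplacian sample is a GGD with alpha = 1, so the estimate
# should come out close to 1 for a reasonably large sample
rng = np.random.default_rng(0)
z = rng.laplace(scale=30.0, size=256)
print(ggd_shape_scale(z))
```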

7 Experimental verification

7.1 Assorted database

The purpose of this database was to investigate whether the proposed hashing algorithm and its extensions could withstand several signal processing operations and geometric transformations. The principal image used for the study was Lena, which was processed to construct several distorted and noisy variations of itself. The set had a total of 103 images, including the following:

• Processed versions of Lena: (i) cropped and scaled, (ii) with additive noise, (iii) rotated by ±5 degrees and scaled, (iv) JPEG compressed, (v) sharpened, (vi) radial blurring, (vii) median filtering, (viii) brightness and contrast changes, (ix) with bubbles, and (x) with unsharp masking (a total of 18 variations).
• Processed versions of the Nature image (By the sea) (5 variations) and of the Actress image (1 variation).
• Face images of different people with non-uniform backgrounds.
• Natural images.
• Fingerprint images.
• Medical images (including MRI and angiogram images).

All the images were of size 256 × 256 and were stored in JPEG compressed form with a 100 % quality factor. Among the 103 images, there were 24 processed versions which were not included in the database search. These processed (deliberately distorted) versions were used as queries.

Table 1 Estimated scale and shape parameters of parent image and its encrypted versions

Sample    Lena                 Encrypted Lena       Actress              Encrypted Actress
space     α       S(α,σ)       α       S(α,σ)       α       S(α,σ)       α       S(α,σ)
Z2        1.1895  219.0699     1.1905  219.2597     0.6835  100.6929     0.7435  109.4280
Z3        0.6605  41.7531      0.6655  42.2898      0.5715  42.3154      0.6365  52.5908
Z4        0.6125  12.0774      0.6165  12.1741      0.4795  9.6786       0.5485  15.5038
Z5        0.7195  40.6846      0.7255  41.5273      0.6215  31.2510      0.7345  47.4572
Z6        0.8185  73.9351      0.8325  75.8330      0.5385  18.0054      0.6305  29.9452
Z7        0.7975  36.7081      0.8115  37.8889      0.5785  13.1252      0.6665  20.4463
Z8        0.6255  19.5844      0.6355  20.4839      0.5385  13.2505      0.5945  18.5525
Z9        0.7025  22.0353      0.7295  23.9058      0.5145  8.6849       0.6405  18.3067
Z10       0.6655  10.1810      0.6785  10.7887      0.4785  4.0633       0.5625  8.1282


Table 2 Estimated scale and shape parameters of encrypted images showing variations across images

Sample    ENC Diplomat         ENC Girl-hat-1       ENC Girl-hat-2       ENC Girl-hat-3
space     α       S(α,σ)       α       S(α,σ)       α       S(α,σ)       α       S(α,σ)
Z2        0.5585  32.1572      0.7025  50.6716      1.0605  158.3460     0.7835  66.3013
Z3        0.6095  42.0957      0.6255  37.0478      0.9945  130.4893     0.5795  41.4360
Z4        0.5445  17.2795      0.5855  13.1864      0.8545  51.5102      0.5125  14.9773
Z5        0.4375  4.1112       0.5995  17.2200      1.2225  91.1855      0.6475  22.3976
Z6        0.6175  21.1952      0.5745  12.6524      0.9805  62.2412      0.6625  24.4761
Z7        0.5785  12.3402      0.5155  4.9772       1.0295  39.9353      0.5125  7.2670
Z8        0.4335  2.7244       0.6055  10.3642      1.2065  65.4668      0.6625  15.5918
Z9        0.4655  3.5337       0.5705  8.9848       1.1155  59.5846      0.6595  15.8499
Z10       0.4995  5.3937       0.5455  5.4228       0.9475  37.5653      0.4825  8.8154

Table 3 Estimated scale and shape parameters of Lena and JPEG compressed Lena showing the robustness of the feature vectors to JPEG compression when the quality factor is reduced from 80 to 10

Sample    JPEG (Q=80)          JPEG (Q=60)          JPEG (Q=30)          JPEG (Q=10)
space     α       S(α,σ)       α       S(α,σ)       α       S(α,σ)       α       S(α,σ)
Z2        1.1905  219.1870     1.1915  219.6131     1.1775  217.2714     1.1995  223.1164
Z3        0.6605  41.8208      0.6635  42.3406      0.6635  42.3770      0.6495  39.6964
Z4        0.6135  12.1869      0.6085  11.7932      0.6035  11.5713      0.5295  7.1096
Z5        0.7195  40.6565      0.7235  41.3585      0.7195  41.0025      0.7035  38.2652
Z6        0.8215  74.5930      0.8225  74.8795      0.8135  72.8004      0.8075  69.3592
Z7        0.8035  37.8265      0.7935  37.3332      0.8065  37.4950      0.7745  34.1703
Z8        0.6245  19.4783      0.6285  19.9274      0.6405  21.1181      0.6315  20.1166
Z9        0.7055  22.2848      0.7095  22.7321      0.6895  20.9733      0.6065  13.7288
Z10       0.6705  10.4570      0.6595  9.9167       0.7045  12.3553      0.6995  12.2713

7.2 Robustness to content-preserving operations

The chosen features must remain robust to content-preserving operations such as JPEG compression (different quality factors), filtering operations, scaling and stretching, noise insertion, and other minor geometric operations. The robustness of the feature vector to JPEG compression is examined in Table 3, where the quality factor is reduced from 80 to 10. The reference shape and scale parameters are given in Table 1 against the Lena image, which was originally stored at 100 % quality. The results indicate that there is only a minor perturbation in both the shape and scale parameters. Results for filtering operations such as block averaging, 3 × 3 median filtering, 5 × 5 median filtering, and low-pass Gaussian filtering are shown in Table 4. Since only the first 9 AC coefficients have been used to form the coefficient distributions, the histograms remain relatively insensitive to both linear and non-linear forms of spatial low-pass filtering. As a result, only a small deviation is observed in the shape and scale parameters for the different types of filter masks (Table 4).


Other content-preserving transformations, such as bubble insertion, scaling and stretching, unsharp masking, noise addition, and rotation, were applied to the original Lena image (Fig. 4). The effect on the shape and scale parameters for each of the transformed images is shown in Table 5. It can be observed that unsharp masking and noise addition do not alter the shape and scale parameters much; hence, the hash distances with respect to the original are likely to be low. However, the geometric operations (scaling and rotation) change some of the parameters significantly. The deviation is much higher in the case of rotation, since the spatial information in the blocks selected for hashing and encryption is de-synchronized throughout the image. This impacts the frequency distribution and consequently the histograms. Hence, this hashing scheme can tolerate only minor rotations. The bubble insertion transformation creates an aperiodic warping in the spatial domain in specific clusters of pixels, consequently influencing the frequency composition in selective blocks. This creates a deviation in some of the parameters. If the bubble frequency is increased further, then the histograms can be expected to be very widely separated as the degree of

Table 4 Estimated scale and shape parameters of filtered Lena showing the robustness of the feature vectors to the various filtering operations

Sample    Median (3 × 3)       Median (5 × 5)       Average              Low-pass Gaussian
space     α       S(α,σ)       α       S(α,σ)       α       S(α,σ)       α       S(α,σ)
Z2        1.1985  221.0625     1.1945  219.4404     1.2125  221.2976     1.2005  220.686
Z3        0.6565  40.91707     0.6555  40.30318     0.6675  42.51365     0.6615  41.75545
Z4        0.6155  11.9275      0.6335  12.54711     0.6035  10.86165     0.6105  11.7412
Z5        0.7105  38.83485     0.7065  37.13322     0.7335  41.25692     0.7235  40.78572
Z6        0.8135  71.59157     0.8145  68.25327     0.8315  73.24102     0.8195  73.04363
Z7        0.7935  34.98959     0.8295  36.03064     0.8275  36.03406     0.8105  35.95904
Z8        0.6255  19.15        0.6185  17.52896     0.6215  17.91584     0.6255  19.17038
Z9        0.6925  20.44735     0.6825  18.30602     0.6875  19.22337     0.6975  21.04978
Z10       0.6565  9.3253       0.6725  9.259025     0.6755  9.526948     0.6655  9.824065

Fig. 4 Altered images of Lena: a Bubble insertion, b Scaling and stretching, c Unsharp masking, d Noise insertion, e Rotation

Table 5 Estimated scale and shape parameters of Lena and other variations such as scaling, rotation, noise/bubble insertion, and unsharp masking

Sample    Lena                 Bubbles              Scaled               Unsharp              Noise                Rotation
space     α       S(α,σ)       α       S(α,σ)       α       S(α,σ)       α       S(α,σ)       α       S(α,σ)       α       S(α,σ)
Z2        1.1895  219.0699     1.1655  214.0409     1.0795  214.6384     1.1645  218.1100     1.1915  213.1177     0.8665  181.7014
Z3        0.6605  41.7531      0.7025  51.4926      0.6965  50.3672      0.6495  40.5620      0.6645  41.2323      0.5765  40.8080
Z4        0.6125  12.0774      0.8015  28.0742      0.6285  12.2737      0.6235  14.4281      0.6805  16.7677      0.4255  4.7716
Z5        0.7195  40.6846      0.7645  48.3948      0.6975  37.5860      0.7175  43.3836      0.7275  40.8433      0.7605  46.0363
Z6        0.8185  73.9351      0.8695  84.0358      0.8015  66.2808      0.8005  76.6797      0.8265  75.1247      0.8995  105.0993
Z7        0.7975  36.7081      0.7845  36.0113      0.6495  22.7304      0.7535  38.7547      0.8325  40.5720      0.6785  28.7258
Z8        0.6255  19.5844      0.7025  28.5787      0.5815  15.4523      0.6515  25.1151      0.6525  22.2412      0.7055  27.4103
Z9        0.7025  22.0353      0.8415  35.6578      0.5325  8.6455       0.7495  30.2408      0.7425  25.2412      0.6365  14.7785
Z10       0.6655  10.1810      0.7105  15.3956      0.5885  6.5420       0.6705  13.0505      0.7295  13.5649      0.4335  3.3420

warping will increase along with the perceptual distortion. At that point, the hashes between the transformed and original can no longer be expected to match.

8 Deriving the hash from the feature vectors After the extraction of the salient features in images, the binary hash sequences need to be generated. There are a total of 18 feature points—9 shape parameters and 9 scale parameters. The first step toward generating a

binary sequence is the quantization of the feature points. The shape parameter is quantized using the relation α_quan = round(α/0.001), and the scale parameter is quantized using the relation S_quan(α, σ) = round(S(α, σ)/0.2). The quantization steps are chosen keeping in view the robustness of the final hash to content-preserving signal operations. We also have to ensure that two content-different images do not converge to a similar hash value. The quantized shape and scale parameters were placed in a rectangular grid, and the cells in the grid were labeled and Gray-coded.


Gray coding is necessary since the number of bits by which two encoded cells differ must increase linearly as a function of the Euclidean distance between them in the spatial grid [17]. However, since conventional Gray codes have a deterministic structure, there is a need to randomize the code assignment so that the final hash code becomes a parametric function of both the hashing key K_hash and the image features. This can be done by rotating the codebook by an arbitrary number of units defined by the hashing key K_hash [18]. For each attribute i, we will get a hash which is an l_i-bit vector. The final image signature is obtained by the concatenation of all 18 vectors. The hashes obtained for each of the images are 198 bits in length. They are shown in the form of images of dimension 12 × 18 in Fig. 5 for the images Lena, Actress, Diplomat, Girl with hat-1, and Girl with hat-2. Note that, in the hash generation process, each column of the binary hash represents an encoded version of a shape or scale parameter. The two parameters are interleaved, starting from the encoding of the histogram for the first AC coefficient, after which the second histogram is picked for coding and interleaving. Each column of the encoded parameter has a sign bit at the top and the least significant bit at the bottom. The most significant bits are attached to the sign and thus form the top few bits. To check if these hashes capture the diversity across these five images, a direct XOR between the corresponding hashes is computed to form a set of 10 difference matrices along with the corresponding normalized Hamming distances (NHDs), as shown in Fig. 6. Since the most significant bits lie in the top few rows, the hash differences do not exhibit much randomness in the upper portions. The real information content capturing the diversity across images is in the middle and bottom rows of the hashes. There is some skew in the comparison although all these images are completely dissimilar in terms

of perceptual content, including background texture. Since the histograms are formed by collapsing space-frequency information, the notion of spatial texture and shape is lost; hence, some images (though very dissimilar, such as the Diplomat and Girl with hat-1, Fig. 5c, d) produce somewhat correlated hashes if the overall frequency constitution matches (Fig. 6h). In order to prove that the hash remains virtually invariant to the encryption process, the hashes of the original Lena image and its encrypted copy are determined. The results are presented in Fig. 7. As expected, the most significant and moderately significant planes remain the same, while the least significant rows in the hash matrix show some minor deviations due to round-off effects.
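The following sketch mirrors the quantize-and-Gray-code step (an illustration only: the per-parameter bit width, the keyed rotation, and the helper names are simplified guesses, not the authors' exact codebook):

```python
import numpy as np

def gray_code(v):
    """Binary-reflected Gray code of a non-negative integer."""
    return v ^ (v >> 1)

def encode_param(value, step, bits, k_hash):
    """Quantize one shape/scale parameter, rotate its grid label by a
    key-dependent offset, and Gray-code the result (MSB first)."""
    q = int(round(value / step)) % (1 << bits)
    code = gray_code((q + k_hash) % (1 << bits))
    return [(code >> b) & 1 for b in reversed(range(bits))]

def image_hash(shapes, scales, k_hash=5, bits=11):
    """Concatenate Gray-coded shape (step 0.001) and scale (step 0.2) parameters,
    interleaved per AC index as described in the text."""
    out = []
    for a, s in zip(shapes, scales):
        out += encode_param(a, 0.001, bits, k_hash)
        out += encode_param(s, 0.2,   bits, k_hash)
    return np.array(out, dtype=np.uint8)

def nhd(h1, h2):
    """Normalized Hamming distance between two binary hashes."""
    return float(np.mean(h1 != h2))

# Usage with the Lena / Actress parameters reported in Table 1 (Z2 and Z3 only)
lena    = image_hash([1.1895, 0.6605], [219.0699, 41.7531])
actress = image_hash([0.6835, 0.5715], [100.6929, 42.3154])
print(nhd(lena, actress))   # dissimilar images give a large distance
```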

9 Hash uniqueness across images and privacy preserving retrieval The goal of most image retrieval systems is to supply the search engine with a distorted query and to expect the engine to return perceptually relevant matches. When the system is privacy preserving, the query and the targets (images in the database) need to be stored in the encrypted form. Note that, although the matches between the queries and the images in the database are performed in the encrypted domain itself, we have shown only the corresponding unencrypted versions in the retrieved list containing the top-14 closest matches. Different distorted queries were presented (Fig. 8) from the assorted database: (i) Cropped and very low resolution of nature image, By the sea, (ii) Cropped house with car, (iii) Very low resolution of Actress image, and (iv) Lena with bubbles and additive noise. The encryption algorithm used was the constrained shuffling discussed in the Sect. 6 which

Fig. 5 Original figures and their corresponding hashes. a Lena, b Actress, c Diplomat, d Girl with Hat-1, e Girl with hat-2, f–j Corresponding hashes in the same order


Fig. 6 Distance between hashes from different images expressed graphically to illustrate variation in features across images. A direct XOR between pairs of hashes is taken and displayed. Numbers specified below the figures indicate the NHDs between the hashes of: a Lena and Actress, b Lena and Diplomat, c Lena and Girl with hat-1, d Lena and Girl with hat-2, e Actress and Diplomat, f Actress and Girl with hat-1, g Actress and Girl with hat-2, h Diplomat and Girl with hat-1, i Diplomat and Girl with hat-2, j Girl with hat-1 and Girl with hat-2

Fig. 7 Results showing the validity of the proposed algorithm. The hash of the original image and the encrypted image are the same. a Original Lena image, b Hash derived from original Lena image, c Encrypted Lena image, d Hash derived from encrypted Lena image

preserves the distribution of DCT coefficients in different orthogonal frequency bins. The queries were presented in the encrypted form, and the corresponding hashes were compared with the hashes of the encrypted images stored in the database. Only the top-14 closest matches have been displayed in Figs. 9, 10, 11, and 12. The NHDs of the closest matches are specified below the images.
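A bare-bones version of this matching loop is shown below (illustrative only: the database is a plain Python dict of precomputed 198-bit hashes, and the query is a randomly perturbed copy of one of them, standing in for the transparent hash of an encrypted, distorted query):

```python
import numpy as np

def nhd(h1, h2):
    """Normalized Hamming distance between two equal-length binary hashes."""
    return float(np.mean(h1 != h2))

def retrieve(query_hash, db_hashes, top_k=14):
    """Rank database images by hash distance to the query hash."""
    ranked = sorted(db_hashes.items(), key=lambda kv: nhd(query_hash, kv[1]))
    return ranked[:top_k]

# Usage with random stand-in hashes
rng = np.random.default_rng(1)
db = {f"img_{i:03d}": rng.integers(0, 2, 198, dtype=np.uint8) for i in range(100)}
query = db["img_007"].copy()
query[:5] ^= 1                        # mimic a mildly distorted query
for name, h in retrieve(query, db, top_k=5):
    print(name, round(nhd(query, h), 6))
```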

Retrieval results for a partially cropped and very low-resolution query, By the sea, are shown in Fig. 9. In the retrieved list, the original which is perceptually relevant to the query is positioned at 9. Note that, among the list of images retrieved, there are several irrelevant ones. The normalized Hamming distance (NHD) of the closest image is 0.282407, which is not even perceptually close to the query. The collapsing of space-frequency information to form histograms makes the search engine insensitive to the shapes and descriptions of both natural and artificial objects. This leads to a flattening of the search space, as opposed to constructing a multi-level hierarchical search space which would have ensured proper classification of images based on content. Also, observe that most of the distances of the top-14 images are clustered between 0.28 and 0.33. Hence, the price paid in selecting global features (such as coefficient distributions) which permit a privacy preserving search is the loss of significant spatial detail. The original By the sea image within the top-14 is shown encircled. To check whether the pertinent original image is picked up in response to other distorted queries, a differently distorted version of the House with car was presented to the engine. The processing was done by cropping the original image and stretching it to fit into 256 × 256 (Fig. 8b). The corresponding retrieval results are shown in Fig. 10. The relevant original images which are perceptually equivalent are shown encircled. The top 2 images are found pertinent and are positioned at distances 0.24537 and 0.268519, respectively. In this case, there is some class separability between the images which are relevant and those that are not. The closest irrelevant image positioned


Fig. 8 Distorted queries presented to the privacy preserving search algorithm

Fig. 9 Retrieved images and their NHDs from the assorted database (AD) when a distorted, low-resolution nature query is supplied

at 3 is at a distance 0.282407. If the texture or the relative constitution of objects in a particular image is kept fixed, either by translation, gentle rotation or minor cropping, and stretching, the histograms of the coefficients in the original and processed versions are likely to yield similar shape and scale parameters. The reason for this is the collapse of space-frequency information, which does not change, even if a few blocks are permuted or rotated. Hence, results for scaling and stretching were good. However, if the cropping and stretching are excessive, the chances of the original appearing within the top few will decrease considerably.


Retrieval results for a low-resolution, distorted query of the Actress image (Fig. 8c) are shown in Fig. 11. The image closest to the query is the original image of the Actress itself at a distance of 0.236111, shown encircled. The next closest image is that of the fingerprint at a distance of 0.240741 which is perceptually irrelevant. It was observed that the shape and scale parameters deviate significantly from their original values when the original images are subjected to heavy intensity quantization. A distorted version of Lena was presented as the query (Fig. 8d). The processing was done by adding bubbles to the original image along with additive noise. The bubbles create


Fig. 10 Retrieved images and their NHDs from the assorted database when a distorted, cropped, and scaled query of the House with car is supplied

Fig. 11 Retrieved images and their NHDs from the assorted database when a distorted, low-resolution query of the Actress is supplied


Fig. 12 Retrieved images and their NHDs from the assorted database when a distorted and noisy Lena image is the query

a warping of the texture information in Lena. This consequently affects the DCT coefficient distributions in blocks which partially overlap with the bubbles. Since the bubble frequency is low, the number of blocks which are affected is small. However, the addition of uniform noise throughout the image causes a significant deviation in the shape and scale parameters. Retrieval results are shown in Fig. 12. The effect of this deviation is an overlapping of the class of crumpled Lena images with other content-dissimilar images. The original image is retrieved at position 9 at a distance of 0.319444. All the other images in the top-14 are perceptually irrelevant.

9.1 Computation time

An encrypted version of the query was provided to the search engine. The images were stored in the unencrypted form, and each matching operation was between the query hash and the hash computed from the unencrypted target image in the database. Note that, even if the image were stored in the encrypted form, the time taken for the search would be the same, as the features would have been computed directly on the encrypted image without any form of post-processing. The hash computation process comprises the following stages:

• Block DCT computation; concatenation of DCT coefficients corresponding to specific frequency indices.
• Determination of the shape and scale parameters.
• Quantization of the shape and scale parameters (formation of the rectangular grid); random Gray coding.

The times taken for these three hash components (per image) are represented as T_DCT-hist, T_Shape, and T_quant, respectively, and are shown in Table 6. If the block size is reduced to 8 × 8, the computation time (per image match) increases to 0.56 s and the accuracy drops. The distorted Lena image was provided as a query, and this did not yield any matches in the top-14. However,

Table 6 Computation time (per image) for different stages in the hashing algorithm in seconds

Block size    T_DCT-hist (s)    T_Shape (s)    T_quant (s)    T_total (s) (per image)    Retrieval accuracy
16 × 16       0.1720            0.0160         0.0780         0.26                       Good (position no. 6 in top-14)
8 × 8         0.4840            0              0.078          0.56                       Poor (no appearance in top-14)


if the block size is maintained as 16 × 16, the computation time drops to 0.26 s (per image), with a much higher retrieval accuracy, with a match obtained in the 6th position (Table 6). Simulation with 32 × 32 was not possible since there were insufficient (very few) datapoints to compute the shape and scale parameters. The computed values for 32 × 32 were out of bounds of the pre-defined lookup table used. Hence, we expect the accuracy to drop if the size of the block is increased beyond 16. Hence, any gain in computation for a much larger block size will be offset by the considerable reduction in retrieval accuracy. The experiments were performed on a laptop with an Intel-CORE DUO processor, and the codes were written in MATLAB, which has several overheads. However, the computation time will be much smaller with dedicated software written in C/C++.

10 Security analysis and tradeoffs

10.1 Key space and size of the block

The strength of the permutation cipher depends on the number of datapoints within the block. If the size of the image is N × N and the DCT block size is chosen as M × M, the total number of blocks N_B is

N_B ≈ (N/M)^2    (23)

In constrained shuffling, a certain frequency point (DCT coefficient) is moved to another block to mask its original spatial orientation, but it retains its frequency position. As a result, a certain coefficient can migrate to N_B possible positions. Hence, the permutation key space is N_B!. In the encryption algorithm, a total of K = 25 coefficients are shuffled independently. As a result, the upper bound for the total key space becomes

Key_space ≤ (N_B!)^K    (24)

The actual key space is, however, much smaller than the upper bound due to the following factors:

• Significance in magnitude of a certain AC coefficient in relation to its neighbors.
• Correlation between DCT coefficients from different blocks.

If a certain AC coefficient contributes little to the energy of a block, its migration to a block having the same or much higher texture will not create any impact unless it displaces a significant coefficient in that block. If r is the number of insignificant coefficients at a certain frequency index among the N_B blocks, migration of coefficients among the insignificant positions will not cause any perceptual distortion, as a result of which the key space reduces to (N_B − r)! (allowing for migrations from the insignificant set to the significant set). A similar analysis can be done for coefficients which are correlated. If the magnitudes and signs of two AC coefficients at the same frequency index but from different blocks are close, then swapping them will not create any perceptual distortion. When the block size M decreases, the upper bound on the key space increases (since the shuffling is across blocks and N_B increases), but unfortunately, this also increases the number of correlated DCT coefficients at the same index. If B_p(i, j) is the AC coefficient in block p at index (i, j), then for two adjacent blocks p and q the probability that B_p(i, j) ≈ B_q(i, j) is likely to be high for a small M. Hence, the actual key space drops significantly.

There is another problem in choosing a small M. When the image is subjected to a signal processing operation such as spatial-domain low-pass filtering, the aperture effect due to a small M causes a greater frequency deviation in the corresponding block DCT coefficients. Hence, the distribution of AC coefficients is likely to change, influencing the search results. When a radial blurred version of Lena was supplied as a query for two different algorithm settings (M = 8 and M = 16), the number of relevant images in the top-5 increased from 1 to 4 with the increase in block size, as shown in Table 7. However, if the block size is increased further (to say M = 32), the performance drops, since the number of datapoints in the histogram is inversely proportional to M^2. For M = 16, the number of datapoints is N_B = (256/16)^2 = 256, and for M = 32, this becomes N_B = (256/32)^2 = 64, which affects the accuracy of the shape and scale parameter estimates considerably. Hence, one of the main challenges lies in finding the optimal choice M = M_opt which balances the hash sensitivity to content-preserving operations against the accuracy of the shape and scale estimates needed to capture the diversity across several image classes.
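As a quick numerical illustration of the upper bound in Eqs. (23)-(24) for the settings used in the paper (N = 256, M = 16, K = 25), ignoring the reductions discussed above; the bit-count conversion is my own addition:

```python
import math

N, M, K = 256, 16, 25
N_B = (N // M) ** 2                      # Eq. (23): 256 blocks

# Eq. (24): upper bound on the key space, expressed in bits
keyspace_bits = K * math.log2(math.factorial(N_B))
print(N_B, round(keyspace_bits))         # 256 blocks, roughly 4.2e4 bits
```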

10.2 Known-plaintext attacks

In a known-plaintext attack, the attacker, who has access to the encrypted image, acquires some a priori information regarding a portion of the original unencrypted image, such as a well-known LOGO or some other known repetitive pattern. This allows him to form several (plain-image piece, cipher-image

Table 7 Relevance results (top-5) when a radial blurred version of Lena was supplied as a query with M varied

Block size, M    Relevant images (in top-5)
8                1
16               4


Fig. 13 Search and coefficient localization attack based on known patterns/LOGOs

piece) pairs. If the attacker recovers sufficiently many independent plain-image pieces, it becomes possible for him to recover a portion of the key, provided the encryption process is simple (substitution or permutation but not both) and the key is re-used in several places. In simple intra-block shuffling, if the texture in a few blocks is simple and common, then the encryption keys can be estimated if the attacker knows the patterns a priori. However, in the case of inter-block shuffling, the plaintext reference is the entire image. This global shuffling operation makes it difficult for the attacker to execute a known-plaintext attack. Even if the attacker knows the pattern and position of a plaintext LOGO block as shown in Fig. 13, he has to conduct a parallel search to find pair-wise matches for each coefficient B_LOGO(i, j) among the other blocks within the cipher-image. If this coefficient is located (say in block t_1) and its value is unique, then, knowing the position of the LOGO block, he can swap back the corresponding coefficients of the LOGO block and the identified block t_1. The process can be repeated for all the coefficients within the LOGO block. If several such LOGO blocks are known to the attacker, partial decryption can be attempted. One way to avoid this search and identification problem is to add controlled noise to the DCT coefficients. This can form another layer of encryption. The addition of noise to the DCT coefficients disrupts the coefficient search, localization, and swapping for any attacker, even if he has prior knowledge of a portion of the image. With added noise along with permutation, this cipher resembles a product cipher, which is much more immune to known-plaintext attacks. But any addition of noise in the DCT domain is likely to influence the magnitude of the DCT-AC coefficients,


consequently perturbing the shape and scale parameters. The variance σ^2 of the noise needs to be controlled in such a way that the coefficient magnitudes are altered without significantly impacting the shape and scale parameters. The histogram of the mean absolute values of the DCT-AC coefficients B_k(i, j), computed over all the blocks, is shown for Lena in Fig. 14. There are 25 discrete frequency sets which are shuffled among themselves; hence, the expectation is computed over k for specific frequency indices,

Mean_AC(i, j) = (1/N_B) Σ_{k=1}^{N_B} |B_k(i, j)|    (25)

Fig. 14 Distribution of mean absolute values of DCT-AC coefficients


Fig. 15 Retrieval results obtained when an encrypted query of the Actress image was provided with and without the addition of noise

Figure 14 shows the distribution of Mean_AC(i, j) for only those values of (i, j) which form the first 25 coefficients, scanned in a zigzag manner. Note that, in the histogram shown in Fig. 14, there is a cluster centered around a magnitude of 30. Hence, insertion of noise with a standard deviation of around σ = 20 can alter the magnitudes of most coefficients, sufficient to disrupt the search and localization attack. To verify whether noise with σ = 20 preserves some of the histogram properties and the corresponding shape and scale parameters, the search engine was tested with the following images presented as the query: (a) a shuffled version of the Actress, and (b) a shuffled and noise-added Actress. Controlled noise was added to all the shuffled images stored in the database (for additional security). The top-5 results are shown in Fig. 15. In both cases, the best match is the original Actress image. For case (a), when only shuffling is done, the normalized Hamming distance of the closest retrieved image is 0 (the encrypted Actress, same as the query). The next closest match is the image Diplomat at a distance of 0.263889 (significant class separation). For case (b), when independent Gaussian noise with σ = 20 is added to both the query and the database images, the Hamming distance of the closest match (the Actress image itself) increases to 0.212963. This brings it closer to the other classes (the next closest match is the Lena image at a distance of 0.268519). There is still a significant separation of about 0.05 between the two classes, which should cover some signal processing operations. So the price paid for increased security is a reduction in the reliability of the search engine and the hash matching process.
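A sketch of this noise layer is given below (my illustration; the σ = 20 figure comes from the discussion above, the coefficient matrix is synthetic, and clipping/rounding details are omitted):

```python
import numpy as np

def add_coefficient_noise(Z, sigma=20.0, key=11):
    """Add independent Gaussian noise to the shuffled DCT-AC coefficient matrix Z
    (one column per frequency index) to frustrate the search-and-swap attack."""
    rng = np.random.default_rng(key)
    return Z + rng.normal(0.0, sigma, size=Z.shape)

# Usage with stand-in coefficients: compare the mean absolute value,
# which drives the shape/scale estimates, before and after adding noise
rng = np.random.default_rng(0)
Z = rng.laplace(scale=30.0, size=(256, 25))      # synthetic AC coefficients
Z_noisy = add_coefficient_noise(Z)
print(np.abs(Z).mean().round(2), np.abs(Z_noisy).mean().round(2))
```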

11 Conclusions In this paper, we have proposed a new framework for comparing encrypted images to facilitate privacy preserving search and retrieval. By allowing a portion of the statistical signature in the original image to surface despite the encryption operation, it becomes possible to compare two encrypted images on perceptual grounds, without tapping into their contents. The first challenge in such a framework was the incompatibility of a majority of the standard cryptographic algorithms with perceptual hashing models. The second challenge was the search for selective image statistics, extracted from the spatial or transform domain, that remained invariant to some soft encryption schemes. We have observed that DCT-AC coefficient distributions are preserved when inter-block constrained shuffling is employed. By treating these distributions as a GGD, shape and scale parameters can be extracted for further quantization and coding. The final encoded hash not only remains transparent to the permutation cipher, but also preserves most of the characteristics of a robust perceptual hash. We observed that in the proposed construction, since space-frequency information from the DCT-AC coefficients is concatenated to form histograms, there is loss of spatial detail which is sometimes crucial toward the classification of perceptual information contained in images. Hence, in response to several diverse distorted queries, the retrieved list shows several images which are perceptually irrelevant along with the correct original within the top-10. In an effort to extract the global image signature in the encrypted domain


while maintaining perceptual secrecy, spatial detail is compromised. Future work entails the exploration of privacy preserving search models which capture both local and global information, allowing classification of images based on their constitution.

References

1. Venkatesan, R., Koon, S.-M., Jakubowski, M., Moulin, P.: Robust image hashing. In: Proceedings of the International Conference on Image Processing, vol. 3, pp. 664–666 (2000)
2. Karthik, K.: Robust image hashing by downsampling: between mean and median. In: World Congress on Information and Communication Technologies (WICT), pp. 622–627 (2011)
3. Monga, V., Evans, B.L.: Perceptual image hashing via feature points: performance evaluation and tradeoffs. IEEE Trans. Image Process. 15, 3452–3465 (2006)
4. Nister, D., Stewenius, H.: Scalable recognition with a vocabulary tree. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR06), vol. 2, pp. 2161–2168 (2006)
5. Lu, W., Swaminathan, A., Varna, A.L., Wu, M.: Enabling search over encrypted multimedia databases. In: Proceedings of Media Forensics and Security I, vol. 7254 of SPIE Proceedings, SPIE (2009)
6. Monga, V., Banerjee, A., Evans, B.L.: A clustering based approach to perceptual image hashing. IEEE Trans. Inf. Forensics Secur. 1, 68–79 (2006)
7. Lu, W., Varna, A.L., Swaminathan, A., Wu, M.: Secure image retrieval through feature protection. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP09), 19–24 April 2009, pp. 1533–1536 (2009)


8. Achlioptas, D.: Database-friendly random projections. In: Proceedings of the Twentieth ACM Symposium on Principles of Database Systems, PODS '01, pp. 274–281 (2001)
9. Vempala, S.S.: The Random Projection Method. DIMACS Series in Discrete Mathematics and Theoretical Computer Science. American Mathematical Society, Providence (2004)
10. Chum, O., Perdoch, M., Matas, J.: Geometric min-hashing: finding a (thick) needle in a haystack. In: International Conference on Computer Vision and Pattern Recognition (CVPR 2009), 20–25 June 2009, Miami, Florida, USA, pp. 17–24, IEEE (2009)
11. Hsu, C.-Y., Lu, C.-S., Pei, S.-C.: Homomorphic encryption-based secure SIFT for privacy-preserving feature extraction. In: Proceedings of SPIE 7880, Media Watermarking, Security, and Forensics III, 788005, 8 Feb 2011
12. Kashyap, S., Karthik, K.: Authenticating encrypted data. In: National Conference on Communications (NCC) 2011, January 28–30 (2011)
13. Shannon, C.E.: Communication theory of secrecy systems. Bell Syst. Tech. J. 28, 656–715 (1949)
14. Wang, Y.-G., Lei, Y.-Q.: A robust content in DCT domain for image authentication. In: International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 94–97, Kyoto, Japan (2009)
15. Sharifi, K., Garcia, L.: Estimation of shape parameter for generalized Gaussian distribution in subband decomposition of video. IEEE Trans. Circuits Syst. Video Technol. 5, 52–56 (1995)
16. Dominguez, J.A.: A practical procedure to estimate the shape parameter in a generalized Gaussian distribution. Technical Report. http://www.cimat.mx/reportes/enlinea/I-01-18_eng.pdf (2001)
17. Swaminathan, A., Mao, Y., Wu, M.: Robust and secure image hashing. IEEE Trans. Inf. Forensics Secur. 1, 215–230 (2006)
18. Zhu, G., Kwong, S., Huang, J., Yang, J.: Random gray code and its performance analysis for image hashing. Signal Process. 91, 2178–2193 (2011)
