Privacy Preserving Through Fireworks Algorithm ...

IGI Global Microsoft Word 2007 Template Reference templateInstructions.pdf for detailed instructions on using this document.

Privacy Preserving Through Fireworks Algorithm Based Model for Image Perturbation in Big Data

Amine RAHMANI, Abdelmalek AMINE, Reda Mohamed HAMOU, Mohamed Elhadi RAHMANI, Hadj Ahmed BOUARARA
GeCoDe Laboratory, Department of Computer Science, Tahar Moulay University of Saida, Algeria

Abstract. Nowadays, social networks and cloud services have billions of users across the planet. Instagram, Facebook and other networks give users the opportunity to share images, and users upload millions of pictures each day, including personal ones. Another domain is medical studies, which require highly sensitive medical images that retain personal details about patients. Image perturbation has attracted a great deal of attention in the last few years, and many works on image ciphering and perturbation have been published. This paper deals with the problem of image perturbation for privacy preserving. We build three new systems that hide small details in pictures by rotating some pixels. Our models use two algorithms. The first system is a simulation of the firework algorithm in which we place fireworks on selected pixels and then represent sparks as rotation processes. The second system is a model of rotation based perturbation using the iterated local search (ILS) algorithm with 2 optimization stages. The third system follows the same principle as the second but uses the ILS algorithm with 3 optimization stages.

Keywords. Privacy preserving, fireworks algorithm, data perturbation, big data, privacy preserving data mining, bio-inspired algorithms

Introduction

With the emergence of new applications centred on the sharing of image data over large data services, such as medical imagery and social networks, privacy concerns have become a crucial problem in modern information sciences, particularly with the rapid growth of data mining and analytical techniques. Data perturbation and de-identification are two main fields in the privacy preserving research area. Unlike cryptographic systems, which apply a temporary change to the data without taking the utility of the ciphered data into consideration, perturbation techniques aim to balance the two sides of a major paradox: hiding sensitive data by altering its content while maintaining its utility.

In the age of big data, data perturbation, inherited from secure database techniques, has become an important issue in data protection. We distinguish two types of data perturbation: differential privacy, in which differential computations such as chaos mapping are often used (Tong, 2009), and privacy preserving data mining, in which a data mining technique is used (Ag). However, data perturbation is facing growing concerns, and researchers are attempting to adapt data mining techniques so that they are useful for privacy matters.

In most engineering disciplines, many problems can be simplified as numerical optimization problems through mathematical modelling. Nowadays, the study of biological phenomena is no longer the concern of biologists alone. This gave birth to a new research domain known as bio-inspired or meta-heuristic methods for optimization. Optimization algorithms and techniques of this kind are a set of metaheuristics known for their efficiency in solving difficult problems. They are generally probabilistic and stochastic processes inspired by life and nature. One such algorithm that recently saw the light is the conventional firework algorithm (FWA). Since its development in 2010 by Tan and Zhu (Tan, 2010), FWA has been used in many problems and has proved highly efficient in finding optimal solutions.

This paper suggests a new use of the FWA algorithm, not for an optimization problem but for security. We work out a model of rotation based perturbation that hides sensitive information in image data using firework explosions. The remainder of this paper is organised as follows: first we introduce some general concepts that are employed in our model. Then we introduce the general framework of our approach. After that, we describe our experiments and discuss their results in terms of efficiency by comparing them with results of known conventional works. Finally, we end with a conclusion and perspectives.

State of the art

Data perturbation

In the light of the broad emergence of data sharing services, data perturbation has taken an important place in the research community. There are many works in the literature and several approaches. Some of them use differential computations, such as the work reported in (Al-Najjar, 2011) using the chaotic logistic map. The authors proposed two approaches: one changes pixel values without shuffling the image using a Pixel Mapping Table, and the other modifies pixel values using row and column replacement. Tong et al. (Tong, 2009) propose a model of feedback image encryption using chaotic functions. Other published works include (Upmanyu, 2009), which uses the Chinese Remainder Theorem for a secret sharing scheme over video surveillance frames, and (Yavuz, 2008), in which the authors examined the efficiency of Time Reversal (TR) techniques for image ciphering. In (Chen, 2015), the authors demonstrated a cryptosystem for image data using 3-D chaotic map based joint image scrambling and random encoding in gyrator domains. Their system first shuffles the image, then applies random phase encoding in the spatial domain and the gyrator transform domain. They used the 3-D chaotic map to generate key stream elements. Another work, presented in (Gu, 2014), also used a 3-D chaotic map. Unlike (Chen, 2015), the authors built a scheme that uses a 3-D chaotic mapping process to iteratively permute pixel locations and replace their values. Other works were concerned with the usefulness of data mining techniques and meta-heuristics. In (Wu, 2014), the authors demonstrate a new image encoding algorithm inspired by the natural ripple-like phenomenon.

The majority of privacy preserving data mining works were done for text and structured data such as medical records, as in the work reported in (Kerschbaum, 2011) using immune systems. There are various data perturbation techniques, grouped into two major categories: value based perturbation and multi-dimensional perturbation. Data mining based perturbation is classed under multi-dimensional perturbation. In this paper, we are interested only in data mining based perturbation. We distinguish three major classes of techniques that use data mining tasks for perturbation, as Figure 1 indicates:

Condensation based perturbation

Random rotation based perturbation

Geometric perturbation


Figure 1. Data mining based perturbation

Condensation based perturbation

This class covers the condensation of multi-dimensional perturbation techniques. It preserves the covariance of the matrix formed from multiple columns by conserving the eigenvectors and eigenvalues of the graphical representation of the dataset, as Figure 2 shows.

Figure 2. Results of condensation based perturbation process

These techniques share the same principle, which works as follows. First, the dataset is divided into groups of size k. For each group, a randomly selected record is taken as the centre of the group, and its (k-1) nearest neighbours are selected as the other members of the group. The k chosen members are replaced by randomly generated ones before the next group is formed. At the end of the process, a new dataset is generated to which data mining algorithms can be applied without any change to the algorithms themselves.

Random rotation based perturbation

This is the category to which our approach belongs. It perturbs data by rotating or translating it using a matrix product. First, the dataset is represented by an n*m matrix X, then a random orthonormal matrix R is generated. R is considered the orthonormal rotation matrix. The perturbed data are then the result of the product of X by R. Many advances have been made in this kind of technique using classification (Chen, 2005), (Banerjee, 2014) and transformation techniques such as SVM, KNN, K-means, inner product and kernel methods.

Geometric perturbation

Geometric based models combine rotation, translation, and noise addition techniques. In other words, this technique rotates the dataset matrix X using a random orthonormal rotation matrix R, translates it using a random translation matrix T, and masks it using a noise addition matrix N. A system is a geometric based perturbation if it respects the following formula (Patel, 2013), (Chen, 2007):

\[ G(X) = RX + T + N \qquad (1) \]

where G(X) is the perturbed dataset, X denotes the original dataset, R is the random rotation matrix, T is the random translation matrix, and N is a noise addition matrix.
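As an illustration of formula (1), the following minimal sketch applies a rotation, a translation, and additive Gaussian noise to a small set of 2-D records. It is only a sketch under stated assumptions: the function name `geometric_perturbation`, the restriction to two dimensions, the fixed rotation angle, and the Gaussian noise model are all choices made here for illustration and are not taken from the cited works.

```python
import math
import random


def geometric_perturbation(points, angle, translation, noise_std, seed=0):
    """Apply G(X) = R X + T + N to a list of 2-D points.

    Assumptions (not from the paper): R is a plane rotation by `angle`,
    T is one translation vector shared by all points, and N is
    zero-mean Gaussian noise with standard deviation `noise_std`.
    """
    rng = random.Random(seed)
    c, s = math.cos(angle), math.sin(angle)
    tx, ty = translation
    perturbed = []
    for x, y in points:
        rx = c * x - s * y          # first row of R times the point
        ry = s * x + c * y          # second row of R times the point
        perturbed.append((rx + tx + rng.gauss(0.0, noise_std),
                          ry + ty + rng.gauss(0.0, noise_std)))
    return perturbed


def distance(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


data = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
rotated = geometric_perturbation(data, angle=math.pi / 3,
                                 translation=(5.0, -2.0), noise_std=0.0)
```

With `noise_std=0.0` the transformation is a pure rotation plus translation, so pairwise distances between records are unchanged, which is the property that lets distance-based data mining algorithms run on the perturbed data.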


Firework algorithm

As a swarm intelligence algorithm inspired by firework explosions, the firework algorithm was designed for the optimization of complex functions. Once a firework is set off and explodes, a set of sparks is generated and covers a local space around the point where the explosion took place. The general framework of the firework algorithm is shown in Figure 3.

Figure 3. General framework of firework algorithm

Figure 4 shows the difference between a bad and a good explosion according to the authors of (Tan, 2010).

Figure 4. Good and bad explosion according to (Tan, 2010)

As Figure 4 shows, a good explosion depends on two major factors: the number of sparks generated by each firework and the position of each spark. The number of sparks is computed using formula (2) below:

\[ S_i = m \cdot \frac{y_{max} - f(x_i) + \varepsilon}{\sum_{i=1}^{n} \left( y_{max} - f(x_i) \right) + \varepsilon} \qquad (2) \]

where m is a parameter that controls the total number of sparks generated by the n fireworks, f(x) is the function to minimize, y_max denotes the maximum value of f(x_i) over the n fireworks, and ε is used to avoid division by zero. The positions of the sparks are generated according to their amplitudes, which are computed using formula (3):

\[ A_i = \hat{A} \cdot \frac{f(x_i) - y_{min} + \varepsilon}{\sum_{i=1}^{n} \left( f(x_i) - y_{min} \right) + \varepsilon} \qquad (3) \]

where Â is the maximum amplitude of an explosion, y_min denotes the minimum value of f(x_i) over the n fireworks, and ε is used, as in the computation of the number of sparks, to prevent division by zero. The locations of the fireworks are chosen randomly only at the start of the process. From the second iteration onwards, the choice of new locations for the n fireworks follows a special treatment. The authors proposed two ways to select new locations: one depends on the quality of the firework f(x_i), while the other uses a Gaussian explosion process.

Since its introduction in 2010 (Tan, 2010), the firework algorithm has been widely used. Some works examined the algorithm in order to improve its efficiency. In (Liu, 2013), the authors offered a novel method of computing the number of sparks and the amplitudes of fireworks. They also applied a new random mutation operator that aims to control the diversity of the algorithm. They came out with two improved algorithms: one with best fitness selection and random mutation (IFWABS), and the other with roulette fitness selection and random mutation (IFWAFS). In (Zheng, 2013), the authors present an enhanced version of the firework algorithm (EFWA). Their improvement aims to tackle two major disadvantages of the classical firework algorithm: its inefficiency on shifted functions, for which the optimum may lie outside the local space of the firework, and the high computational cost of its iterations. To do that, they proposed some changes to the original algorithm:

- A changed minimal explosion check function.
- New operators for generating sparks and Gaussian sparks.
- A new mapping strategy for sparks that fall outside the search space.
- A new way of selecting the population for the next iterations.
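Returning to the conventional algorithm, formulas (2) and (3) can be sketched in a few lines. The function names `spark_counts` and `amplitudes` and the sample fitness values are assumptions chosen for illustration; the sketch returns real-valued spark counts and leaves out any rounding or bounding of those counts.

```python
def spark_counts(fitness, m, eps=1e-12):
    """Formula (2): sparks per firework, for a function to minimize.

    `fitness` holds f(x_i) for the n fireworks; `m` controls the total
    number of sparks; `eps` avoids division by zero.
    """
    y_max = max(fitness)
    denom = sum(y_max - f for f in fitness) + eps
    return [m * (y_max - f + eps) / denom for f in fitness]


def amplitudes(fitness, a_hat, eps=1e-12):
    """Formula (3): explosion amplitude per firework.

    `a_hat` is the maximum explosion amplitude.
    """
    y_min = min(fitness)
    denom = sum(f - y_min for f in fitness) + eps
    return [a_hat * (f - y_min + eps) / denom for f in fitness]


# Hypothetical fitness values for three fireworks (lower is better).
fitness = [1.0, 2.0, 4.0]
counts = spark_counts(fitness, m=50)
amps = amplitudes(fitness, a_hat=40.0)
```

In this example the best firework (fitness 1.0) receives the most sparks and the smallest amplitude, while the worst firework (fitness 4.0) receives almost no sparks and the largest amplitude: good fireworks search densely nearby, poor ones search sparsely further away.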

The same authors proposed a further improvement of EFWA in (Zheng, 2014) by integrating an adaptive dynamic local search mechanism. Again, their purpose is to tackle some limitations of the EFWA algorithm in amplitude computations and dependences. They suggest a dynamic search mechanism in their system, named dynFWA. Their idea is to use a dynamic explosion amplitude for the firework at the best position. If the fitness of this firework can be improved, the explosion amplitude increases in order to speed up convergence. Otherwise it decreases in order to narrow the search area. In (Pei, 2012), the authors presented an empirical study of the effect of approximation approaches on accelerating firework algorithms. They use three data sampling methods for fitness landscape approximation.

Iterated Local Search Algorithm

The ILS algorithm is a Stochastic Local Search (SLS) meta-heuristic that generates a sequence of solutions produced by an embedded heuristic. This algorithm was first created for optimising the travelling salesman problem (TSP). Its primary objective is to improve upon stochastic multi-restart search by sampling in the neighbourhood of candidate solutions and then refining the results to their local optima using a local search technique. The following pseudo-code illustrates the general framework of the ILS algorithm:


Algorithm 1: pseudo-code of Iterated Local Search
Input: Initial population (P), max_no_impr, max_iterations
Output: Sbest
Begin
    Sbest ← Construct Initial Solution ();
    Sbest ← Local Search (Sbest);
    SearchHistory ← Sbest;
    while (¬ stop criterion reached) do
        Scandidate ← Perturbation (Sbest, SearchHistory);
        Scandidate ← Local Search (Scandidate);
        SearchHistory ← Scandidate;
        if Acceptance Criterion (Sbest, Scandidate, SearchHistory) then
            Sbest ← Scandidate;
        end if
    end while
    Return Sbest;
End

Since its creation in 2001 by Lourenço and Martin (Lourenço, 2001), ILS has been widely studied and used. As it was designed to be, this algorithm has been predominantly used for discrete domains such as combinatorial optimization problems. Many works and extensions have been published for this algorithm. For example, in (Stützle, 2006) the author presented a new extension of the ILS algorithm to solve the quadratic assignment problem, and in (Rocki, 2012) the authors adapted large scale parallel computations to the ILS algorithm to solve the well-known travelling salesman problem (TSP). However, the behaviour of the algorithm varies between a greedy search when perturbations are too small and a stochastic search when perturbations are too large.
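The structure of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the toy cost function `displacement`, the random-swap local search, the segment-reversal perturbation, and the "accept only strict improvements" rule are all assumptions made here, and the search history is reduced to a list of candidate costs.

```python
import random


def iterated_local_search(initial, cost, local_search, perturb,
                          max_iterations, rng):
    """Generic ILS loop following the shape of Algorithm 1."""
    best = local_search(initial, cost, rng)
    history = [cost(best)]                     # simplified SearchHistory
    for _ in range(max_iterations):            # stop criterion: fixed budget
        candidate = perturb(best, rng)
        candidate = local_search(candidate, cost, rng)
        history.append(cost(candidate))
        if cost(candidate) < cost(best):       # assumed acceptance criterion
            best = candidate
    return best, history


def swap_local_search(solution, cost, rng, max_no_impr=50):
    """Try random swaps; stop after max_no_impr attempts without gain."""
    current = list(solution)
    no_impr = 0
    while no_impr < max_no_impr:
        i = rng.randrange(len(current))
        j = rng.randrange(len(current))
        neighbour = list(current)
        neighbour[i], neighbour[j] = neighbour[j], neighbour[i]
        if cost(neighbour) < cost(current):
            current, no_impr = neighbour, 0    # improvement resets the counter
        else:
            no_impr += 1
    return current


def reverse_segment(solution, rng):
    """Perturbation: reverse the elements between two random cut points."""
    i, j = sorted(rng.sample(range(len(solution) + 1), 2))
    return solution[:i] + solution[i:j][::-1] + solution[j:]


def displacement(perm):
    """Toy cost: total distance of each value from its sorted position."""
    return sum(abs(value - index) for index, value in enumerate(perm))


rng = random.Random(1)
start = list(range(10))
rng.shuffle(start)
best, history = iterated_local_search(start, displacement, swap_local_search,
                                      reverse_segment, 20, rng)
```

Because the local search and the acceptance rule only ever keep improvements, `displacement(best)` can never exceed `displacement(start)`. A looser acceptance criterion (for example, accepting slightly worse candidates) would move the behaviour toward the stochastic end of the range described above.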

Our approach

We propose a new rotation based perturbation of image data that rotates and replaces pixels. Our approach is very simple. It generates new pixels (sparks) from selected ones (fireworks) in order to replace other pixels according to their displacement (amplitude). Figure 5 illustrates how our approach works:


[Flowchart: Image codification → Select n random pixels → Set off n fireworks on the n pixels → Compute number of sparks → Compute the amplitude → Shift pixels → Stop criterion? (No: next iteration; Yes: Perturbed image)]

Figure 5. Firework algorithm based image perturbation

As Figure 5 indicates, our system is composed of six steps that work as follows. First, a codification process codes the pixels of the image according to their position in the picture, so that the image is considered as a vector of (x, y) pairs. After that, in an iterative process, the system chooses n random pixels and sets off a firework on each pixel. The next step computes the number of sparks generated by each firework using equation (2), in which m is the total number of pixels in the image, y_max denotes the maximum position among the chosen pixels, and ε is used to avoid division by zero. After defining the number of sparks that each firework can generate, the system computes the amplitude of each firework using equation (3), where Â denotes the minimum position that a spark could reach in the explosion and y_min denotes the first pixel, at position (0, 0). Afterwards, for each firework, the system chooses m random directions, where m here denotes the number of sparks of that firework, and rotates the two pixels (the firework and its spark). At the end, the resulting image is taken as the input of the next iteration for as long as the system has not reached its stop criterion. The stop criterion can be either a predefined number of iterations or a comparison between the correlation of adjacent pixels of the image at the start of the iteration and at the end of the iteration. If we use the correlation comparison criterion, we must ensure that the correlation of the final perturbed picture is as close as possible to 0.

Since the firework algorithm was designed in the first place as a local search process, we developed another approach using the Iterated Local Search (ILS) algorithm. We chose the ILS algorithm among all other algorithms because we aim to benefit from its perturbation and permutation processes. Figure 6 describes our algorithm, which we named ILS-IP:

Figure 6. Presentation of ILS algorithm for image perturbation

As Figure 6 shows, the ILS-IP algorithm is composed of five essential steps:

1. Segmentation: This step segments the image. We tried three different ways: by pixels, where each pixel represents an individual; by lines, where each line represents an individual; or by columns, where each column represents an individual. Each pixel is represented by its ARGB (Alpha, Red, Green, and Blue) value. The population composed of these individuals is then taken as the initial solution.
2. Random segmentation: This step randomly permutes the individuals. For each individual i, a random individual j is chosen among the positions between 0 and (n-i), where n is the number of individuals. The individual i is then swapped with the one at position (j + i).
3. Local Search: This process works as follows: for a given solution (Sbest) and a current solution (Scandidate), if the fitness value of Sbest is bigger than that of Scandidate, then Sbest is replaced by Scandidate. This process is repeated for a number of iterations (max_no_impr), and each time Sbest changes, the iteration count restarts from 0.
4. Perturbation: This process perturbs a part of the image as follows: first, two random positions i and j are chosen between 0 and n. Then the set of individuals between i and j is reversed. The result of this process is used as Sbest for another Local Search process.
5. Steps 3 and 4 are repeated for a number of iterations (max_iterations) in order to ensure maximum perturbation. Finally, the result is a permuted image.
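The steps above can be sketched as follows for the "by pixels" segmentation. Several details are assumptions made only for this illustration, because the text does not fix them: individuals are plain integer grey levels instead of ARGB values, the fitness is the absolute correlation between consecutive individuals (lower is better), and the local search builds each candidate by swapping two random individuals. The names `ils_ip`, `fitness`, and the default parameter values are hypothetical.

```python
import random


def adjacent_correlation(values):
    """Correlation between each individual and its successor."""
    xs, ys = values[:-1], values[1:]
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    num = n * sum(x * y for x, y in zip(xs, ys)) - sx * sy
    den = ((n * sum(x * x for x in xs) - sx * sx)
           * (n * sum(y * y for y in ys) - sy * sy)) ** 0.5
    return num / den if den else 0.0


def fitness(individuals):
    """Assumed fitness: absolute adjacent correlation (0 is best)."""
    return abs(adjacent_correlation(individuals))


def random_permutation(individuals, rng):
    """Step 2: swap individual i with the one at position i + j."""
    out = list(individuals)
    n = len(out)
    for i in range(n):
        j = rng.randrange(n - i)               # j in [0, n - i)
        out[i], out[i + j] = out[i + j], out[i]
    return out


def local_search(best, rng, max_no_impr):
    """Step 3: keep a candidate only if its fitness is lower."""
    no_impr = 0
    while no_impr < max_no_impr:
        candidate = list(best)
        a, b = rng.randrange(len(best)), rng.randrange(len(best))
        candidate[a], candidate[b] = candidate[b], candidate[a]  # assumed move
        if fitness(best) > fitness(candidate):
            best, no_impr = candidate, 0       # counter restarts on change
        else:
            no_impr += 1
    return best


def perturbation(individuals, rng):
    """Step 4: reverse the individuals between two random positions."""
    n = len(individuals)
    i, j = sorted((rng.randrange(n + 1), rng.randrange(n + 1)))
    return individuals[:i] + individuals[i:j][::-1] + individuals[j:]


def ils_ip(pixels, max_iterations=10, max_no_impr=30, seed=0):
    """Steps 2 to 5 applied to a flat list of pixel values."""
    rng = random.Random(seed)
    best = local_search(random_permutation(pixels, rng), rng, max_no_impr)
    for _ in range(max_iterations):            # step 5: repeat steps 3 and 4
        best = local_search(perturbation(best, rng), rng, max_no_impr)
    return best


pixels = list(range(64))                       # a smooth gradient
result = ils_ip(pixels)
```

The output contains exactly the same pixel values as the input, only in different positions, which reflects the fact that the approach permutes pixels without altering their values.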


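The correlation is a measurement computed for adjacent pixels horizontally, vertically, or diagonally. The formula to compute the correlation of a picture is as follows (Tong, 2009):

\[ Corr = \frac{N \sum_{j=1}^{N} (x_j \times y_j) - \left( \sum_{j=1}^{N} x_j \times \sum_{j=1}^{N} y_j \right)}{\sqrt{\left( N \sum_{j=1}^{N} x_j^2 - \left( \sum_{j=1}^{N} x_j \right)^2 \right) \times \left( N \sum_{j=1}^{N} y_j^2 - \left( \sum_{j=1}^{N} y_j \right)^2 \right)}} \qquad (4) \]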

where x_j and y_j are two adjacent pixels and N is the number of chosen pixels. The correlation takes values from -1 to +1. Table 1 gives the general interpretation of correlation values.

| Correlation value | Interpretation |
|---|---|
| Exactly -1 | A perfect downhill (negative) linear relationship |
| -0.70 | A strong downhill (negative) linear relationship |
| -0.50 | A moderate downhill (negative) relationship |
| -0.30 | A weak downhill (negative) linear relationship |
| 0 | No linear relationship |
| +0.30 | A weak uphill (positive) linear relationship |
| +0.50 | A moderate uphill (positive) relationship |
| +0.70 | A strong uphill (positive) linear relationship |
| Exactly +1 | A perfect uphill (positive) linear relationship |

Table 1. Interpretation of some correlation values
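Equation (4) can be computed directly from lists of adjacent pixel values. The sketch below is an illustration under assumptions: the image is a list of rows of grey levels, all adjacent pairs in the chosen direction are pooled into a single computation, and the helper names `correlation` and `adjacent_pairs` are hypothetical.

```python
import math


def correlation(xs, ys):
    """Equation (4) for two equally long lists of pixel values."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    num = n * sum(x * y for x, y in zip(xs, ys)) - sx * sy
    den = math.sqrt((n * sum(x * x for x in xs) - sx ** 2)
                    * (n * sum(y * y for y in ys) - sy ** 2))
    return num / den


def adjacent_pairs(image, direction):
    """Collect (pixel, neighbour) values in the given direction."""
    offsets = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}
    dr, dc = offsets[direction]
    xs, ys = [], []
    for r in range(len(image) - dr):
        for c in range(len(image[0]) - dc):
            xs.append(image[r][c])
            ys.append(image[r + dr][c + dc])
    return xs, ys


# A smooth gradient: each pixel is exactly 10 less than its right neighbour.
gradient = [[10 * c + r for c in range(4)] for r in range(4)]
h_gradient = correlation(*adjacent_pairs(gradient, "horizontal"))

# A checkerboard: each pixel is the opposite of its right neighbour.
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
h_checker = correlation(*adjacent_pairs(checker, "horizontal"))
```

The gradient gives a horizontal correlation of +1 (up to floating-point rounding) and the checkerboard gives -1, the two extremes of Table 1. A well perturbed image is expected to land near 0.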

Experiments and results

We carried out a set of experiments on different image types (multi-coloured and bi-coloured), and for each type of image we conducted our experiments on three main image sizes (320x240, 384x288, and 768x576). In order to give more credibility to our results, we developed two extensions of the ILS based image perturbation system (ILS-IP), in which we used an ILS algorithm with 2 stages and with 3 stages. To evaluate our system, we use the correlation coefficient of equation (4), interpreted as in Table 1, for both the original and the perturbed image. Broadly speaking, the closer the correlation is to 0, the more the image loses its original shape; in that case we consider that adjacent pixels are not linearly related, and vice versa. In our case we use only horizontal and vertical correlations. The horizontal correlation is computed by summing the correlations of each line, where N represents the number of pixels in the line, and the vertical correlation is computed by summing the correlations of each column, where N represents the size of the column. Figure 7 shows a sample of original and perturbed images after applying both the FWA and ILS algorithms:


[Figure 7 panels: (a) results of multi-coloured image, panels 1 to 4; (b) results of bi-coloured image, panels 1 to 4]

Figure 7. Sample of original images (a.1 and b.1) and perturbed images using FWA algorithm (a.2 and b.2), ILS algorithm with 2 stages of optimization (a.3 and b.3), and ILS algorithm with 3 stages of optimization (a.4 and b.4)

As Figure 7 shows, the images resulting from the three systems have lost their initial forms. The figure also shows a clear difference between the resulting images. Table 2 gives the correlation coefficients obtained for each type of image and each size with the two systems: firework based image perturbation (FWA-IP) and ILS based image perturbation (ILS-IP).

| System | Image size | Direction of correlation | Bi-coloured: plain image | Bi-coloured: perturbed image | Multi-coloured: plain image | Multi-coloured: perturbed image |
|---|---|---|---|---|---|---|
| FWA-IP | 320 x 240 | Horizontal | +0.9780042 | +0.0049711 | +0.9780042 | -0.0142322 |
| FWA-IP | 320 x 240 | Vertical | +0.6001065 | +0.0030513 | +0.8490958 | +0.0656542 |
| FWA-IP | 384 x 288 | Horizontal | +0.8754261 | +0.0041890 | +0.8754261 | -0.1384996 |
| FWA-IP | 384 x 288 | Vertical | +0.5902273 | -0.0020765 | +0.9645153 | -0.0827428 |
| FWA-IP | 768 x 576 | Horizontal | +0.9515227 | +0.0003856 | +0.9515227 | -0.0285306 |
| FWA-IP | 768 x 576 | Vertical | +0.3412473 | +0.0001565 | +1.0000000 | -0.0238296 |
| ILS-IP-2opt | 320 x 240 | Horizontal | +0.9780042 | +0.6217054 | +0.9780042 | -0.2550313 |
| ILS-IP-2opt | 320 x 240 | Vertical | +0.6001065 | +0.2803274 | +0.8490958 | -0.0988541 |
| ILS-IP-2opt | 384 x 288 | Horizontal | +0.8754261 | +0.4068285 | +0.8754261 | -0.0176774 |
| ILS-IP-2opt | 384 x 288 | Vertical | +0.5902273 | -0.2546009 | +0.9645153 | -0.1346297 |
| ILS-IP-2opt | 768 x 576 | Horizontal | +0.9515227 | -0.2921210 | +0.9515227 | -0.1346439 |
| ILS-IP-2opt | 768 x 576 | Vertical | +0.3412473 | +0.1434739 | +1.0000000 | -0.1324268 |
| ILS-IP-3opt | 320 x 240 | Horizontal | +0.9780042 | -0.2363632 | +0.9780042 | -0.2363632 |
| ILS-IP-3opt | 320 x 240 | Vertical | +0.6001065 | +0.6994568 | +0.8490958 | -0.1045505 |
| ILS-IP-3opt | 384 x 288 | Horizontal | +0.8754261 | +0.7574840 | +0.8754261 | +0.1707652 |
| ILS-IP-3opt | 384 x 288 | Vertical | +0.5902273 | -0.4218956 | +0.9645153 | -0.0197570 |
| ILS-IP-3opt | 768 x 576 | Horizontal | +0.9515227 | -0.1105119 | +0.9515227 | -0.1106347 |
| ILS-IP-3opt | 768 x 576 | Vertical | +0.3412473 | +0.1493740 | +1.0000000 | -0.1304763 |

Table 2. Results of horizontal and vertical correlation coefficients of multi-coloured and bi-coloured images in original and perturbed forms for three sizes using FWA-IP, ILS-IP-2opt, and ILS-IP-3opt


Analysing the results presented in Table 2 for the multi-coloured image, we can clearly see that the FWA-IP system was highly efficient (almost no linear relationship between adjacent pixels of the perturbed image). Its best horizontal correlation was given by the image of size 320x240 with -0.0142322, and its best vertical correlation by the image of size 768x576 with -0.0238296. By comparison, the two other systems gave satisfactory results; the worst horizontal correlation was given by the ILS-IP-2opt system with -0.2550313 (a weak downhill (negative) linear relationship).

For the bi-coloured image, the FWA-IP system was again highly efficient: the image of size 768x576 showed the biggest loss of form both in horizontal correlation, with +0.0003856, and in vertical correlation, with +0.0001565 (almost no linear relationship). However, we also noticed that the ILS-IP system in both cases (2-opt and 3-opt) lost some of its efficiency on the bi-coloured image, although it still gave satisfactory results. The worst results were given by the ILS-IP-3opt system, with a horizontal correlation of +0.7574840 on the image of size 384x288 and a vertical correlation of +0.6994568 on the image of size 320x240.

Overall, we notice that swarm intelligence, represented by the fireworks algorithm, was more efficient than stochastic methods, represented by the ILS algorithm, despite the fact that both algorithms perform local search space optimization. In order to give more credibility to our results, we applied the systems to the well-known picture "LENA" and compared them with some conventional works, namely the systems presented in (Tong, 2009), (Chen, 2015), (Gu, 2014), and (Al-Najjar, 2011). Figure 8 below shows the images resulting from the tests.

a) original image

b) perturbed image using FWA-IP

c) perturbed image using ILS-IP-2opt

d) perturbed image using ILS-IP-3opt

e) perturbed image using (Al-Najjar, 2011)

f) perturbed image using (Gu, 2014)

g) perturbed image using (Chen, 2015)

h) perturbed image using (Tong, 2009)

Figure 8. Resulting images of the Lena picture using several systems

As Figure 8 shows, both FWA-IP and ILS-IP perturb images only by shuffling the pixels, without touching the pixel values themselves. This may look odd, but it has a major advantage in maintaining the utility of images in several cases, such as private information retrieval protocols for images that use algorithms based on pixel similarity. Perturbation methods that produce uni-colour or bi-colour images may defeat this purpose: if all images of a dataset were transformed into black and white images, image retrieval algorithms would have a high probability of returning wrong answers because the images would have almost the same form. Table 3 gives the vertical and horizontal correlation coefficients of the original and perturbed images of the Lena picture using the FWA-IP, ILS-IP-2opt, and ILS-IP-3opt systems, in comparison with (Tong, 2009), (Chen, 2015), (Gu, 2014), and (Al-Najjar, 2011).

| System | Horizontal correlation | Vertical correlation |
|---|---|---|
| Plain-Image | +0.9349843 | +0.9378704 |
| FWA-IP | +0.0323642 | +0.1279019 |
| ILS-IP-2opt | +0.1564614 | +0.1803171 |
| ILS-IP-3opt | +0.1368696 | -0.0849279 |
| (Tong, 2009) | +0.0171888 | +0.0098527 |
| (Chen, 2015) | -0.0061000 | -0.0052000 |
| (Gu, 2014) | +0.0139000 | +0.0073000 |
| (Al-Najjar, 2011) | +0.0186500 | +0.0345000 |

Table 3. Results of vertical and horizontal correlation coefficients of original and perturbed image of Lena's picture

The results in Table 3 reveal that the system presented in (Chen, 2015) was extremely efficient, with the correlations closest to zero: -0.0061 horizontally and -0.0052 vertically. The next closest was the system presented in (Gu, 2014), with +0.0139 horizontally and +0.0073 vertically; both systems use 3-D chaotic maps. These two schemes are highly efficient because the 3-D chaotic map produces major changes in pixel values over a wide range of random values, which leads to a very low probability of getting two adjacent pixels with a strong linear relationship. Nevertheless, our systems, applied to Lena's picture, also gave satisfactory results compared to the existing ones in the literature.
The FWA-IP system gave a horizontal correlation of +0.0323642 and a vertical correlation of +0.1279019. Meanwhile, although the results of the ILS-IP system are acceptable, ILS-IP-2opt gave the worst results, with a horizontal correlation of +0.1564614 and a vertical one of +0.1803171. This is due to the fact that our approaches do not change pixel values. Theoretically, the firework algorithm is known for its linear computational complexity. Table 4 gives a theoretical comparison of our systems with some known perturbation algorithms in the literature.

| Approach | Used method | Complexity | Number of iterations | Memory | Input | Operators |
|---|---|---|---|---|---|---|
| FWA-IP | Firework algorithm | O(T*(n + m + m̂)) | Multiple | NO | image | selection, explosion, rotation |
| ILS-IP | Iterated Local Search | O(n^m)¹ | max_iterations x max_no_impr | YES | image, max_iterations, max_no_impr | permutation, search, perturbation |
| (Al-Najjar, 2011) | Logistic Map Chaotic function | O(N) | One | NO | image, key1, key2, key3 | shifting, modulus operations, columns and rows replacement |
| (Banerjee, 2014) | PCA, LDA and k-means clustering | O(m^2 n) for LDA, O(s n k) for k-means | Multiple | NO | dataset, number of classes, group size, noise parameter, number of selected coefficient | projection, clustering, data sanitization, noise addition |
| (Liu, 2006) | Multiplicative Random Projection Matrices | O(n^2 m) + O(n^3) | Multiple | NO | data matrices | projection, inner products, Euclidean distance |
| (Newton, 2003) | K-same | O(N) | One | YES | face image, k privacy constraint | matching, average |
| (Gross, 2009) | Multi-factor Models | O(Nm) | Multiple | YES | images | projection, factorization, de-identification |
| (Du, 2011) | Least degradation | O(N) | One | YES | plate bases, plate image, privacy prior | recognition, blurring, projection |

Table 7. A theoretical study of the FWA-IP and ILS-IP systems against some known systems

Table 7 shows that the FWA-IP system has linear computational complexity, like the system presented in (Newton, 2003), because it does not require any specific steps. The ILS-IP system, by contrast, has exponential computational complexity, which means that it requires a long time and cannot be used in real-time applications on large images. However, the ILS-IP system has a major advantage: it uses a memory to store the best local optimum found so far while trying others, which the FWA-IP system lacks. Moreover, the ILS-IP system keeps searching until the specified number of iterations is reached, even if it finds an optimal solution before reaching max_iterations, which gives it the chance of finding new optimal solutions. The FWA-IP system, unfortunately, stops as soon as it finds the first optimal solution, because it relies on a specific stop criterion.
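The role of the memory and of the max_iterations and max_no_impr parameters can be illustrated with a generic iterated local search skeleton. This is a minimal sketch under stated assumptions, not the ILS-IP implementation: the toy objective (`cost`), the swap-based neighbourhood, the `strength` parameter, and the acceptance rule are all hypothetical choices made for illustration.

```python
import random


def cost(pixels):
    """Toy objective (assumed): negated mean absolute difference between
    neighbouring pixels, so a lower cost means a more scrambled row."""
    return -sum(abs(a - b) for a, b in zip(pixels, pixels[1:])) / (len(pixels) - 1)


def local_search(solution, rng, max_no_impr):
    """Swap random pixel pairs; stop after max_no_impr consecutive failures."""
    best, best_cost = solution[:], cost(solution)
    failures = 0
    while failures < max_no_impr:
        i, j = rng.sample(range(len(best)), 2)
        candidate = best[:]
        candidate[i], candidate[j] = candidate[j], candidate[i]
        c = cost(candidate)
        if c < best_cost:
            best, best_cost, failures = candidate, c, 0
        else:
            failures += 1
    return best, best_cost


def perturb(solution, rng, strength=3):
    """Kick the solution out of its local optimum with a few random swaps."""
    kicked = solution[:]
    for _ in range(strength):
        i, j = rng.sample(range(len(kicked)), 2)
        kicked[i], kicked[j] = kicked[j], kicked[i]
    return kicked


def iterated_local_search(pixels, max_iterations=50, max_no_impr=30, seed=0):
    rng = random.Random(seed)
    # Memory: the best solution found so far survives every later attempt.
    best, best_cost = local_search(pixels, rng, max_no_impr)
    for _ in range(max_iterations):  # always runs to the end, no early stop
        candidate, c = local_search(perturb(best, rng), rng, max_no_impr)
        if c < best_cost:
            best, best_cost = candidate, c
    return best, best_cost


row = list(range(0, 160, 10))  # smooth toy row of 16 grey levels
scrambled, final_cost = iterated_local_search(row)
print(cost(row), final_cost)
```

Because the stored best solution is replaced only by a strictly better one, later iterations can never make the result worse. The price is that all max_iterations rounds are always executed.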

¹ m is the number of optimisation stages (m-opt).

Conclusion

In this paper, we presented two approaches to rotation-based perturbation of image data using bio-inspired algorithms. The first uses swarm intelligence based on the firework algorithm, and the second uses a stochastic method, the iterated local search algorithm. The FWA-based system codes the pixels according to their locations and then randomly shuffles the picture by selecting random pixels and setting off fireworks on them. Each explosion creates a set of sparks, each of which represents a rotation process that randomly selects a direction and then permutes the firework pixel with the selected one according to its amplitude. The major advantages of the FWA-IP system are its low computational complexity and its high efficiency in perturbing images. Its disadvantage is that it stops once it finds the first optimal solution. The ILS-IP system, on the other hand, was developed for comparison with the FWA-IP system. In terms of efficiency, ILS-IP hides information well, despite its worse results compared with conventional works and FWA-IP. Like the first system, ILS-IP has an advantage: because it uses a memory, it reaches a near-optimal solution among several candidates. Its disadvantage is its high computational complexity. As future work, we will add functionality that changes pixel values in both systems, such as modulus operations and encryption schemes. For the FWA-IP system, we will define a new extension that integrates a memory so that it can find an optimal solution among several. For the ILS-IP system, we will define approaches intended to reduce its computational complexity. In addition, we will apply these systems to other forms of data (medical records, textual data, etc.).
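The firework-and-spark perturbation summarised above can be sketched as follows. This is only an illustrative reading of the description, not the FWA-IP code: the eight-direction neighbourhood, the wrap-around at image borders, and the parameters `n_fireworks`, `n_sparks`, and `max_amplitude` are assumptions introduced for the example.

```python
import random

# Eight compass directions a spark may rotate towards (row step, column step).
DIRECTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def explode(image, firework, n_sparks, max_amplitude, rng):
    """Set off one firework: each spark swaps the firework pixel with the
    pixel reached by moving `amplitude` steps in a random direction."""
    rows, cols = len(image), len(image[0])
    r, c = firework
    for _ in range(n_sparks):
        dr, dc = rng.choice(DIRECTIONS)
        amplitude = rng.randint(1, max_amplitude)
        # Wrap around the borders so every spark lands inside the image
        # (an assumption; the text does not say how borders are handled).
        tr = (r + dr * amplitude) % rows
        tc = (c + dc * amplitude) % cols
        image[r][c], image[tr][tc] = image[tr][tc], image[r][c]


def perturb_image(image, n_fireworks=20, n_sparks=5, max_amplitude=4, seed=0):
    """Return a perturbed copy of `image`; pixel values are never modified."""
    rng = random.Random(seed)
    out = [row[:] for row in image]
    rows, cols = len(out), len(out[0])
    for _ in range(n_fireworks):
        firework = (rng.randrange(rows), rng.randrange(cols))
        explode(out, firework, n_sparks, max_amplitude, rng)
    return out


plain = [[(r + c) * 8 for c in range(16)] for r in range(16)]
perturbed = perturb_image(plain)
```

Since every spark only exchanges two pixel positions, the perturbed image keeps exactly the same pixel values as the original, which matches the limitation that motivates the value-changing operations proposed as future work.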

References

Tan, Y., Yu, C., Zheng, S., & Ding, K. (2013). Introduction to fireworks algorithm. International Journal of Swarm Intelligence Research (IJSIR), 4(4), 39-70.
Tan, Y., & Zhu, Y. (2010). Fireworks algorithm for optimization. In Advances in Swarm Intelligence (pp. 355-364). Springer Berlin Heidelberg.
Zheng, S., Janecek, A., & Tan, Y. (2013, June). Enhanced fireworks algorithm. In Evolutionary Computation (CEC), 2013 IEEE Congress on (pp. 2069-2077). IEEE.
Pei, Y., Zheng, S., Tan, Y., & Takagi, H. (2012, October). An empirical study on influence of approximation approaches on enhancing fireworks algorithm. In Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on (pp. 1322-1327). IEEE.
Liu, J., Zheng, S., & Tan, Y. (2013). The improvement on controlling exploration and exploitation of firework algorithm. In Advances in Swarm Intelligence (pp. 11-23). Springer Berlin Heidelberg.
Zheng, S., Janecek, A., Li, J., & Tan, Y. (2014, July). Dynamic search in fireworks algorithm. In Evolutionary Computation (CEC), 2014 IEEE Congress on (pp. 3222-3229). IEEE.
Li, J., Zheng, S., & Tan, Y. (2014, July). Adaptive fireworks algorithm. In Evolutionary Computation (CEC), 2014 IEEE Congress on (pp. 3214-3221). IEEE.
Yu, C., Kelley, L., Zheng, S., & Tan, Y. (2014, July). Fireworks algorithm with differential mutation for solving the CEC 2014 competition problems. In Evolutionary Computation (CEC), 2014 IEEE Congress on (pp. 3238-3245). IEEE.
Janecek, A., & Tan, Y. (2011, July). Iterative improvement of the multiplicative update NMF algorithm using nature-inspired optimization. In Natural Computation (ICNC), 2011 Seventh International Conference on (Vol. 3, pp. 1668-1672). IEEE.
Liu, J., Zheng, S., & Tan, Y. (2014, July). Analysis on global convergence and time complexity of fireworks algorithm. In Evolutionary Computation (CEC), 2014 IEEE Congress on (pp. 3207-3213). IEEE.
Patel, A., & Patel, H. S. (2014). A Study of Data Perturbation Techniques For Privacy Preserving Data Mining.

Keyvanpour, M., & Moradi, S. S. (2011). Classification and evaluation the privacy preserving data mining techniques by using a data modification-based framework. arXiv preprint arXiv:1105.1945.
Mivule, K. (2013). Utilizing Noise Addition for Data Privacy, an Overview. arXiv preprint arXiv:1309.3958.
Tong, X., & Cui, M. (2009). Image encryption scheme based on 3D baker with dynamical compound chaotic sequence cipher generator. Signal Processing, 89(4), 480-491.
Bonchi, F., Gionis, A., & Tassa, T. (2014). Identity obfuscation in graphs through the information theoretic lens. Information Sciences, 275, 232-256.
Wu, Y., Zhou, Y., Agaian, S., & Noonan, J. P. (2014). A symmetric image cipher using wave perturbations. Signal Processing, 102, 122-131.
Yavuz, M. E., & Teixeira, F. L. (2008). On the sensitivity of time-reversal imaging techniques to model perturbations. Antennas and Propagation, IEEE Transactions on, 56(3), 834-843.
Chen, J. X., Zhu, Z. L., Fu, C., & Yu, H. (2015). Optical image encryption scheme using 3-D chaotic map based joint image scrambling and random encoding in gyrator domains. Optics Communications, 341, 263-270.
Ng, W. W., He, Z. M., Chan, P. P., & Yeung, D. S. (2011, July). Blind steganalysis with high generalization capability for different image databases using L-GEM. In Machine Learning and Cybernetics (ICMLC), 2011 International Conference on (Vol. 4, pp. 1690-1695). IEEE.
Al-Najjar, H. M., Al-Najjar, A. M., & Arar, K. S. A. (2011). Image Encryption Algorithm Based on Logistic Map and Pixel Mapping Table. In International Arab Conference on Information Technology (ACIT 2011).
Banerjee, M., Chen, Z., & Gangopadhyay, A. (2014). A generic and distributed privacy preserving classification method with a worst-case privacy guarantee. Distributed and Parallel Databases, 32(1), 5-35.
Maryak, J. L., & Spall, J. C. (2005). Simultaneous perturbation optimization for efficient image restoration. IEEE Transactions on Aerospace and Electronic Systems, 41(1), 356-361.
Starck, J. L., Candès, E. J., & Donoho, D. L. (2002). The curvelet transform for image denoising. Image Processing, IEEE Transactions on, 11(6), 670-684.
Upmanyu, M., Namboodiri, A. M., Srinathan, K., & Jawahar, C. V. (2009, September). Efficient privacy preserving video surveillance. In Computer Vision, 2009 IEEE 12th International Conference on (pp. 1639-1646). IEEE.
Meyer, F. G., & Shen, X. (2014). Perturbation of the eigenvectors of the graph laplacian: Application to image denoising. Applied and Computational Harmonic Analysis, 36(2), 326-334.
Chen, K., & Liu, L. (2011). Geometric data perturbation for privacy preserving outsourced data mining. Knowledge and Information Systems, 29(3), 657-695.
Shinzawa, H., Morita, S. I., Awa, K., Okada, M., Noda, I., Ozaki, Y., & Sato, H. (2009). Multiple Perturbation Two-Dimensional Correlation Analysis of Cellulose by Attenuated Total Reflection Infrared Spectroscopy. Applied Spectroscopy, 63(5), 501-506.
Gross, R., Sweeney, L., Cohn, J., de la Torre, F., & Baker, S. (2009). Face de-identification. In Protecting Privacy in Video Surveillance (pp. 129-146). Springer London.
Du, L., & Ling, H. (2011, September). Preservative license plate de-identification for privacy protection. In Document Analysis and Recognition (ICDAR), 2011 International Conference on (pp. 468-472). IEEE.

Lee, N., & Kwon, O. (2015). A privacy-aware feature selection method for solving the personalization–privacy paradox in mobile wellness healthcare services. Expert Systems with Applications, 42(5), 2764-2771.
Gross, R., Sweeney, L., De la Torre, F., & Baker, S. (2006, June). Model-based face de-identification. In Computer Vision and Pattern Recognition Workshop, 2006. CVPRW'06. Conference on (pp. 161-161). IEEE.
Gu, G., & Ling, J. (2014). A fast image encryption method by using chaotic 3D cat maps. Optik-International Journal for Light and Electron Optics, 125(17), 4700-4705.
Kargupta, H., Datta, S., Wang, Q., & Sivakumar, K. (2003, November). On the privacy preserving properties of random data perturbation techniques. In Data Mining, 2003. ICDM 2003. Third IEEE International Conference on (pp. 99-106). IEEE.
Chen, K., & Liu, L. (2005, November). Privacy preserving data classification with rotation perturbation. In Data Mining, Fifth IEEE International Conference on (pp. 4-pp). IEEE.
Muralidhar, K., & Sarathy, R. (2003). A theoretical basis for perturbation methods. Statistics and Computing, 13(4), 329-335.
Liu, K., Kargupta, H., & Ryan, J. (2006). Random projection-based multiplicative data perturbation for privacy preserving distributed data mining. Knowledge and Data Engineering, IEEE Transactions on, 18(1), 92-106.
Liu, L., Kantarcioglu, M., & Thuraisingham, B. (2008). The applicability of the perturbation based privacy preserving data mining for real-world data. Data & Knowledge Engineering, 65(1), 5-21.
Gupta, N., & Rajput, I. (2013). Preserving Privacy Using Data Perturbation in Data Stream. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 2(5), pp-1699.
Li, F., Ma, J., & Li, J. H. (2009). Distributed anonymous data perturbation method for privacy-preserving data mining. Journal of Zhejiang University SCIENCE A, 10(7), 952-963.
Tiwari, P., & Gupta, H. (2012). Reconstruction of perturbed data using k-means. Journal of Global Research in Computer Science, 3(10), 18-21.
Newton, E. M., Sweeney, L., & Malin, B. (2005). Preserving privacy by de-identifying face images. Knowledge and Data Engineering, IEEE Transactions on, 17(2), 232-243.
Lindell, Y., & Pinkas, B. (2000, January). Privacy preserving data mining. In Advances in Cryptology - CRYPTO 2000 (pp. 36-54). Springer Berlin Heidelberg.

Zhao, J., Yang, J., & Zhang, J. (2014). Privacy Properties of Random Projection Perturbation When Random Matrix Is Leaking. Journal of Computational Information Systems, 10(8), 3465-3472.
Karandikar, P., & Deshpande, S. (2011). Preserving Privacy in Data Mining using Data Distortion Approach. International Journal of Computer Engineering Science, 1(2), 24-31.
Li, G., & Wang, Y. (2012). A Privacy-Preserving Classification Method Based on Singular Value Decomposition. Int. Arab J. Inf. Technol., 9(6), 529-534.
Machanavajjhala, A., Kifer, D., Gehrke, J., & Venkitasubramaniam, M. (2007). l-diversity: Privacy beyond k-anonymity. ACM Transactions on Knowledge Discovery from Data (TKDD), 1(1), 3.
Lensvelt-Mulders, G. J., Hox, J. J., Van der Heijden, P. G., & Maas, C. J. (2005). Meta-analysis of randomized response research: thirty-five years of validation. Sociological Methods & Research, 33(3), 319-348.

Aggarwal, C. C., & Yu, P. S. (2008). A general survey of privacy-preserving data mining models and algorithms (pp. 11-52). Springer US.
Li, N., Li, T., & Venkatasubramanian, S. (2007, April). t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In ICDE (Vol. 7, pp. 106-115).
Yun, U., & Kim, J. (2015). A fast perturbation algorithm using tree structure for privacy preserving utility mining. Expert Systems with Applications, 42(3), 1149-1165.
Liu, L., Zhu, H., & Huang, Z. (2011). Analysis of the minimal privacy disclosure for web services collaborations with role mechanisms. Expert Systems with Applications, 38(4), 4540-4549.
Stützle, T. (2006). Iterated local search for the quadratic assignment problem. European Journal of Operational Research, 174(3), 1519-1539.
Lourenço, H. R., Martin, O. C., & Stützle, T. (2001). Iterated local search. arXiv preprint math/0102188.
Rocki, K., & Suda, R. (2012, March). Large scale parallel iterated local search algorithm for solving traveling salesman problem. In Proceedings of the 2012 Symposium on High Performance Computing (p. 8). Society for Computer Simulation International.
