2013 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, SEPT. 22–25, 2013, SOUTHAMPTON, UK
AN ADAPTIVE ENCRYPTION BASED GENETIC ALGORITHMS FOR MEDICAL IMAGES
Ahmed Mahmood, Robert Dony, Shawki Areibi
School of Engineering, University of Guelph, Guelph, Canada

ABSTRACT
This paper presents a novel, efficient symmetric encryption technique that can be applied to medical images. It uses a genetic algorithm, which makes it highly adaptive. Standard DICOM images are segmented into a number of regions based on pixel intensity and entropy measurements. The novelty of the selective encryption method lies in the use of several encryption algorithms with variable key lengths to control both the processing time required for encryption and the robustness of the result. Encryption processing time, robustness of the encrypted image and the side information required for transmission of the decryption key are the main parameters for optimization. The trade-off among them stems from the variation in processing time with the key length of the encryption algorithm, the image size, the number of regions and the side information; the aim is to reduce processing time while maintaining a high level of robustness.

Index Terms: Selective encryption, Genetic algorithms, Medical image encryption, DICOM, Information theory

1. INTRODUCTION
Digital imaging and communications in medicine (DICOM) forms the main standard for picture archiving and communication systems (PACS) [1]. Securing medical imaging data within the DICOM system currently relies on techniques that include the Advanced Encryption Standard (AES) and the Triple Data Encryption Standard (3DES) algorithms [1]. However, implementation of these algorithms for medical images takes considerable processing time. For example, encrypting a 512 × 512 MRI image using AES takes almost 521.67 s in MATLAB code on an Intel Core2 Quad Q6700 computer [2]. Not only do these algorithms consume substantial processing time [3], but AES in particular also permits the background regions of the medical image to be identified when a short AES key is applied [4]. This can be considered a problem in robustness [5].
To achieve a high level of security for medical images, long keys for the encryption algorithms are required. The resulting increased time makes the methods highly inefficient. Therefore, the goal of this work is to both reduce the implementation time of encryption process and improve the robustness of the encrypted image.
In order to achieve these goals, we propose the use of adaptive encryption. Our method uses genetic algorithms as an optimization technique to determine the design parameters. The first contribution of this method is the control of the execution time required for the encryption process by means of the number of regions and the encryption algorithm securing each region. The algorithms used for the encryption process can be specified, but matching each algorithm to its region is difficult. The second contribution is establishing additional security for this method at a level higher than AES or 3DES. This goal is accomplished by applying several algorithms to the image which utilize longer key lengths.

This paper is organized as follows. Section 2 introduces essential background on security techniques and their use in medical imaging, along with necessary background on metaheuristic optimization techniques. Section 3 provides a literature review on the encryption types and their use in the new security approach. Section 4 describes the proposed security approach and the methods used to evaluate its performance. Results obtained are presented in Section 5. Finally, conclusions are discussed in Section 6.

2. BACKGROUND
The delay before a medical image transmitted by a medical center can be viewed by a remote radiologist has to be reduced. The delay depends on the medical image size and the complexity of the security algorithm. The image size determines the time required for the encryption process, with larger images having longer processing times [6]. The data size has a significant impact on encryption schemes because it determines the speed of the process. Therefore, a naive approach becomes very slow, because the encryption/decryption speed of many existing ciphers is too low for the rapid encryption of medical images.
Adaptive encryption, as a type of selective encryption, is used in our approach to ensure rapid encryption of images. Parameters that play an important role in the robustness of a cryptosystem are the key length, the avalanche property, and the compression/encryption stage. Breaking an encryption key has become easier as the speed of computers has increased [7]. Accordingly, the key length of a cryptosystem needs to be designed and determined with care to further protect the system. This added security ensures that the destination is the only place the original image can be reconstructed. A fundamental property in the design of an encryption algorithm is the avalanche property, whereby a small change in either the plaintext or the cipher key needs to produce a much larger change in the ciphertext. Compressibility also plays an important role in design because of the trade-off between compression ratio and encryption robustness. Thus, a stage within the compression can be identified in which to perform encryption without affecting the compression ratio. Encryption could occur after compression; however, encryption robustness may be affected. In high speed networks, or when large storage is available, compression could be applied after encryption. The randomness of the ciphertext will considerably decrease the amount of compression achieved, resulting in a larger file size with higher robustness.

In our proposed approach, genetic algorithms (GA) are used to achieve better design performance for time-robustness relationships by trying to find the optimum threshold values of entropy. Genetic algorithms are a class of optimization algorithms that seek improved performance by sampling areas of the parameter space that have a high probability of leading to good solutions [8]. Genetic algorithms have been applied to many hard optimization problems and have been recognized as a robust general-purpose optimization technique. As an optimization technique, genetic algorithms simultaneously examine and manipulate a set of possible solutions. The population evolves for a prespecified total number of generations under the application of evolutionary rules called genetic operators. Several characteristics of genetic algorithms qualify them as a robust search procedure.

The first feature of genetic algorithms is that they climb many peaks in parallel. Thus, the probability of finding a false peak is reduced compared with methods that proceed from point to point in the decision space. Secondly, the operators make use of a coding of the parameter space rather than the parameters themselves. Only the objective function information is used; this results in a simpler implementation.
3. LITERATURE REVIEW
The encryption algorithms currently adopted by the DICOM system to secure medical images are either 3DES or AES [1]. 3DES is obtained by applying the DES cipher algorithm three times to each data block [3]. It is a closed key system widely used in many practical applications. AES is the official encryption standard for the U.S. government [5], designed as a block-structured algorithm with variable length keys of 128 bits, 192 bits and 256 bits. The aim of AES is to replace 3DES and avoid its shorter keys and slow implementation. Selective encryption is an important technique for reducing encryption time by applying the encryption algorithm to a single subset of the data [9]. Ou et al. [6] attempted to reduce the AES encryption processing time using selective encryption by encrypting parts of the image. The level of security is decreased, however, with some regions of the image still visible. In addition, the background is sometimes used to embed a watermark for authentication and data integrity purposes. Other approaches include a selective encryption algorithm encrypting parts of the image pixels using AES [4], and an algorithm that combines permutation and selective encryption to minimize the amount of data to be encrypted, designed particularly to work with images that use JPEG2000 as a compression method [10]. However, both lack robustness.

4. PROPOSED FRAMEWORK
4.1. Methodology
The selective encryption algorithm proposed in this work uses a novel adaptive process that aims to achieve a high security level for the encrypted medical image in a short processing time. Based on selective encryption, the algorithm divides the medical image into a number of regions and encryption is applied separately to each region. The overall block diagram of the adaptive encryption method is shown in Figure 1. The novelty of the design lies in the adaptive control of the relationship between processing time and resulting robustness. Using an evolutionary technique in the form of a genetic algorithm creates an adaptive optimized method that controls the processing time by adjusting five parameters. These parameters are: encryption algorithms, key length, robustness parameters (CORR, NPCR), number of regions, and side information.

Fig. 1. General design of adaptive encryption method (block diagram; block labels: Input Medical Image, Encryption Algorithm, Key Length, Robustness Parameters, Number of Regions, Adaptive Encryption System, Output Encrypted Image, Side Information)
Determining the maximum processing time is an essential step within the proposed framework. This value needs to be less than the time required to encrypt the medical image using AES. The main task of the GA is to determine the most suitable parameters, based on the assumption that more robust encryption algorithms require more processing time than inferior encryption algorithms. Similarly, a longer key length requires more processing time. Robustness parameters such as correlation and the number of pixels change rate (NPCR) determine the minimum level of robustness. Dividing the image into a number of regions should reduce the processing time, as each region will be encrypted by a less time-consuming algorithm. As the number of regions increases, the processing time decreases due to the size reduction of the area that would have to be encrypted using AES. However, increasing the number of regions will increase the side information required to represent the individual regions.

Fig. 2. The proposed encryption method (flowchart; block labels: Read Medical Image, Divide Image into Blocks, Segmentation using Entropy, Optimization using GA, Region 1 to Region n each with its own Method and Keylength, Side Information, Compression, Permutation, Encrypted Medical Image, Transmitted Data)

4.2. Encryption Method
The proposed encryption method is shown in Figure 2. The input image is read and then divided into non-overlapping blocks of equal size. Results obtained in this work are based on blocks of the following sizes: 8×8, 16×8 and 16×16. The next step is applying segmentation based on the information density of the medical image. Entropy, the statistical measure of the information change rate, is applied to each block to define the region to which it belongs. The threshold values of the regions are calculated based on the ratio of the number of blocks to the number of regions. The technique attempts to achieve uniform size blocks. The entropy H(d) of data d is measured using Equation (1) [11]:

$$H(d) = \sum_{i=1}^{L} p(m_i) \log \frac{1}{p(m_i)} \qquad (1)$$

where L refers to the total number of pixel values and $p(m_i)$ represents the probability of occurrence of a pixel with value $m_i$. When the entropy of the encrypted image is close to log L bits, its histogram is considered sufficiently uniform [11].

4.3. Genetic Algorithm
The main goal of the proposed genetic algorithm is to determine the optimal entropy threshold values that define the size of each region. Increasing the size of a region increases the number of pixels that will be encrypted by the algorithm assigned to it.

There are essentially four basic components necessary for the successful implementation of a genetic algorithm. At the outset, there must be a code or scheme that allows for a bit string representation of possible solutions to the problem. Next, a suitable function must be devised that allows for a ranking or fitness assessment of any solution. This fitness influences the selection process for the next generation. The third component contains transformation functions that create new individuals from existing solutions in a population. The crossover and mutation operators are crucial to any GA implementation. Finally, the fourth component contains techniques for population initialization, generation replacement, and parent selection.

4.3.1. Population Initialization
Each candidate solution is represented by a string of symbols called a chromosome. The set of solutions $P_j$ is referred to as the population of the jth generation. The initialization techniques generally used are based on pseudo-random methods. The algorithm creates its starting population by filling it with pseudo-randomly generated bit strings.

4.3.2. Fitness Function
The fitness function that is used to provide a measure of how individuals have performed in the problem domain is:

$$\text{Total time} = R_1 \times T_1 + \dots + R_n \times T_n + S \times T_x \qquad (2)$$
where R is the number of blocks for a region, T is the encryption time of a block for an algorithm, S is the side information size and $T_x$ is its required transmission time. The number of regions may vary between 2 and 6. In the above equation the important variable is the region size R, since the processing time of an encryption algorithm is a constant.

4.3.3. Selection
Strings are selected for mating based on their fitness; those with greater fitness are awarded more offspring than those with lesser fitness. Parent selection techniques vary from stochastic to deterministic methods. The probability that a string i is selected for mating is $p_i$, the ratio of the fitness of string i to the sum of all string fitness values:

$$p_i = \frac{fitness_i}{\sum_j fitness_j}$$

The ratio of individual fitness to the fitness sum denotes a ranking of that string in the population.

4.3.4. Replacement
Generation replacement techniques are used to select a member of the old population and replace it with the new offspring. The quality of solutions obtained depends on the replacement scheme used.

4.3.5. Genetic Algorithm: Flow
Figure 3 illustrates a genetic algorithm implementation for encryption selection. The GA starts with several alternative solutions to the optimization problem, which are considered as individuals in a population. These solutions are coded as binary strings, called chromosomes. The initial population is constructed randomly. These individuals are evaluated using the specific fitness function. The GA then uses these individuals to produce a new generation of hopefully better solutions. In each generation, two of the individuals are selected probabilistically as parents, with the selection probability proportional to their fitness. The following two types of termination conditions have been employed in our work: (i) an upper limit on the number of generations is reached; (ii) no significant change to the average fitness of the population has been achieved in the past x generations.
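To make the evaluation of one candidate concrete, the following is a minimal Python sketch of how a set of entropy thresholds could be scored: blocks are assigned to regions by their entropy (Equation (1), with a base-2 logarithm) and the region sizes are combined into the total time of Equation (2). The function names, the list-of-lists image representation and the per-block times in the example are illustrative assumptions; they are not taken from the MATLAB implementation used in this work.

```python
import math
from bisect import bisect_right
from collections import Counter


def block_entropy(pixels):
    """Shannon entropy in bits (Eq. 1) of a flat sequence of pixel values."""
    n = len(pixels)
    return sum((c / n) * math.log2(n / c) for c in Counter(pixels).values())


def split_blocks(image, bh, bw):
    """Yield non-overlapping bh x bw blocks of a 2-D list as flat pixel lists."""
    for r in range(0, len(image) - bh + 1, bh):
        for c in range(0, len(image[0]) - bw + 1, bw):
            yield [image[r + i][c + j] for i in range(bh) for j in range(bw)]


def region_sizes(image, thresholds, bh=8, bw=8):
    """Count blocks per region; thresholds are ascending entropy cut points."""
    sizes = [0] * (len(thresholds) + 1)
    for block in split_blocks(image, bh, bw):
        sizes[bisect_right(thresholds, block_entropy(block))] += 1
    return sizes


def total_time(sizes, block_times, side_info_size, tx_time):
    """Eq. (2): R_1*T_1 + ... + R_n*T_n + S*T_x."""
    return sum(r * t for r, t in zip(sizes, block_times)) + side_info_size * tx_time


# Toy 16x16 image: the left half is flat, the right half uses 64 distinct values.
image = [[0 if c < 8 else (r * 8 + c) % 64 for c in range(16)] for r in range(16)]
sizes = region_sizes(image, thresholds=[3.0])  # -> [2, 2]
# Hypothetical per-block times: a cheap cipher for region 1, a costly one for region 2.
print(sizes, total_time(sizes, block_times=[1.0, 5.0], side_info_size=10, tx_time=0.1))
```

Because Equation (2) is a cost to be reduced, a GA that selects in proportion to fitness would need to map this value to a quantity that grows as the time shrinks; the mapping is left open here.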
1. Encode Solution Space for image
2. (a) set pop size, max gen, gen=0;
   (b) set cross rate, mutate rate;
3. Initialize Population.
4. While stopping criteria not met
     Evaluate Fitness
     For (i=1 to pop size)
       Select (mate1,mate2)
       if (rnd(0,1) ≤ cross rate)
         child = Crossover(mate1,mate2);
       if (rnd(0,1) ≤ mutate rate)
         child = Mutation();
       Repair child if necessary
     End For
     Add offspring to New Generation.
     gen = gen + 1
   End While
5. Return best chromosome(s).
Fig. 3. A genetic algorithm for image encryption
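The flow of Fig. 3 can be sketched in runnable form as follows. This is a hedged illustration, not the authors' code: it assumes bit-string chromosomes, fitness-proportional (roulette wheel) selection, single-point crossover, a single bit flip as the mutation, a fixed generation limit as the only stopping criterion, and a strictly positive fitness that is to be maximized. When no crossover occurs the child is taken to be a copy of the first parent, which Fig. 3 does not specify. The defaults follow the values reported in Section 4.3.6.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability fitness_i / sum_j fitness_j."""
    return rng.choices(population, weights=fitnesses, k=1)[0]


def crossover(a, b, rng):
    """Single-point crossover: head of a joined to tail of b."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chrom, rng):
    """Flip one randomly chosen bit."""
    i = rng.randrange(len(chrom))
    return chrom[:i] + [chrom[i] ^ 1] + chrom[i + 1:]


def run_ga(fitness, n_bits, pop_size=30, max_gen=50,
           cross_rate=0.85, mutate_rate=0.01, seed=0):
    """Maximize a strictly positive fitness over bit strings of length n_bits."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(max_gen):                    # stopping criterion (i) only
        fitnesses = [fitness(c) for c in population]
        new_generation = []
        for _ in range(pop_size):
            mate1 = roulette_select(population, fitnesses, rng)
            mate2 = roulette_select(population, fitnesses, rng)
            child = list(mate1)                 # assumption: default to a copy
            if rng.random() <= cross_rate:
                child = crossover(mate1, mate2, rng)
            if rng.random() <= mutate_rate:
                child = mutate(child, rng)
            # No repair step: every bit string is valid in this toy encoding.
            new_generation.append(child)
        population = new_generation
        best = max([best] + population, key=fitness)
    return best


# Placeholder fitness (number of ones, plus one to stay positive), standing in
# for a time/robustness score derived from the decoded entropy thresholds.
best = run_ga(lambda c: sum(c) + 1, n_bits=16)
print(best)
```

In the setting of this paper the chromosome would decode to the entropy thresholds, and the "Repair child" step of Fig. 3 would be the natural place to restore their ordering after crossover.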
4.3.6. Parameter Tuning
Running any meta-heuristic requires setting a number of parameters. Deciding on the best set of parameter values for a specific implementation is a non-trivial task. Poor settings lead to inferior results, whereas good settings require time-consuming trials to find. In our genetic algorithm implementation, initial tuning is performed to find effective values. Each parameter under investigation is varied over its range of possible values while all the other parameters are kept constant. The solution quality produced is used to select the proper value for each parameter. Population sizes between 20 and 50 seem to produce excellent results. It is important to keep the population size to a reasonable value so that the CPU time of the GA is minimized. Accordingly, we used a population size of 30. Good performance is associated with a high crossover rate combined with a low mutation rate. Accordingly, we have set our crossover rate in this work to 85% and higher while using a low mutation rate of 1% and lower.
4.3.7. Region Encoding
Coding the regions affects the size of the side information. The number of regions is an important parameter: using more regions provides more security and reduces the processing time. However, increasing the number of regions leads to larger side information due to the coding. Larger side information is considered a shortcoming of the proposed method. Representing n regions requires $\lceil \log_2 n \rceil$ bits for each region label. The output of the binary encoded image is compressed using the run length encoding (RLE) compression algorithm. The output is encrypted and is ready for transmission. In order to achieve better values for correlation and NPCR of the encrypted image, a permutation step might be applied. The algorithm in [12] provides a fast permutation process.

4.4. Evaluation Methods
4.4.1. Histogram
The encrypted image histogram needs to be close to the uniform distribution to avoid statistical attacks [11]. The histogram of an image shows the number of occurrences of each gray level in the medical image.

4.4.2. Correlation Coefficient
In order to avoid statistical attacks, the encryption algorithm of a medical image needs to have low correlation among the pixels of the encrypted image. Horizontal, vertical, and diagonal correlation coefficients ($Cor_{xy}$) of two adjacent pixels can be calculated using the following equations [12]:

$$Cor_{xy} = \frac{COV(x, y)}{\sqrt{D(x)}\,\sqrt{D(y)}} \qquad (3)$$

$$D(x) = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - \frac{1}{N} \sum_{j=1}^{N} x_j \right)^2 \qquad (4)$$

$$COV(x, y) = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - Av(x) \right) \left( y_i - Av(y) \right) \qquad (5)$$

where x and y are the gray-scale values of two adjacent pixels in the image and Av denotes the average value given by

$$Av(z) = \frac{1}{N} \sum_{i=1}^{N} z_i \qquad (6)$$
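Equations (3) to (6) translate directly into code. The sketch below is an illustrative Python version, assuming the image is a list of rows and that every adjacent pair in the chosen direction is used (implementations often sample a random subset of pairs instead); the helper names are hypothetical.

```python
import math


def av(z):
    """Eq. (6): mean of the samples."""
    return sum(z) / len(z)


def variance(x):
    """Eq. (4): D(x), the mean squared deviation from the mean."""
    m = av(x)
    return sum((xi - m) ** 2 for xi in x) / len(x)


def cov(x, y):
    """Eq. (5): covariance of paired samples."""
    mx, my = av(x), av(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)


def correlation(x, y):
    """Eq. (3); undefined (division by zero) if either sample is constant."""
    return cov(x, y) / (math.sqrt(variance(x)) * math.sqrt(variance(y)))


def adjacent_pairs(image, dr, dc):
    """Return (pixels, neighbours) for every pixel with a neighbour at (dr, dc)."""
    rows, cols = len(image), len(image[0])
    xs, ys = [], []
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                xs.append(image[r][c])
                ys.append(image[r2][c2])
    return xs, ys


# Offsets for horizontal, vertical and diagonal neighbours.
DIRECTIONS = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}

# A smooth gradient stands in for a plain image: neighbours are fully correlated.
img = [[r + c for c in range(4)] for r in range(4)]
for name, (dr, dc) in DIRECTIONS.items():
    print(name, correlation(*adjacent_pairs(img, dr, dc)))
```

For a well-encrypted image the three coefficients should be close to zero, whereas a plain image typically gives values near one.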
4.4.3. The Difference Between Encrypted and Plain-Images
In order to avoid ciphertext attacks, the encrypted image needs to differ significantly from the original one, which is indicated by a high value of the number of pixels change rate (NPCR). The NPCR is the percentage of corresponding pixels with different gray levels in two images. Let $C_1(i, j)$ and $C_2(i, j)$ be the gray levels of the pixels at the ith row and jth column of two (W × H) images. The NPCR of these two images is defined in [12] as:

$$NPCR = \frac{\sum_{i,j} D(i, j)}{W \times H} \times 100\% \qquad (7)$$
where D(i, j) is defined as

$$D(i, j) = \begin{cases} 0, & \text{if } C_1(i, j) = C_2(i, j) \\ 1, & \text{if } C_1(i, j) \neq C_2(i, j) \end{cases}$$

5. RESULTS & DISCUSSION
The genetic algorithm code was implemented in MATLAB 7.10 on an Intel i7-820 workstation running Windows 7. The DICOM images were obtained from [13]. The method proposed here aims at reducing the processing time of encrypting a medical image while obtaining a high level of security. Our approach is to divide the image into multiple regions based on their information density, encrypting the low information regions using a low processing algorithm such as the Gold code and the high information regions using a standard algorithm such as AES. The remaining regions can be encrypted with other algorithms such as DES or Blowfish. The processing time is thus reduced and the quality of the encrypted medical image is maintained. Figure 4 shows some medical images, their segmentation using entropy, and their histograms. The image histograms show diverse shapes that require various threshold values to define regions. The genetic algorithm is used to determine near optimal threshold values of entropy that define the size of each region.

Fig. 4. Segmented images and their histograms

In the example shown in Figure 5 the medical image is divided into four regions. Gold code sequence generators are used for encrypting low information regions of the medical images with 12, 13 and 20 bit key lengths, while AES 256 is used to encrypt the high information region. The right hand side of this figure shows the encrypted ankle image using AES with its histogram, while the middle image shows the encrypted ankle image using the proposed method with its histogram. The histogram of the proposed method is flatter than the AES histogram, which means more randomness and therefore a better encryption quality. In addition, by simple inspection it is clear that the encrypted image using the proposed method is more obscure than that based solely on AES. A comparison between the AES algorithm and the approach proposed in this paper, as seen in Table 1, reveals that the latter has higher value of entropy, close values of NPCR and lower value of correlation between neighbouring pixels, thus higher robustness.

Fig. 5. Encrypting the medical image with the AES and the proposed method

Image                Entropy   Correlation   NPCR
AES Algorithm        7.9441    0.001         0.9960
Proposed Algorithm   7.9432    0.009         0.9965
Table 1. Metrics comparison of two encryption methods

Table 2 shows the processing time required to implement AES for some medical images. Hence, the processing time of the proposed algorithm should be less than these values. The CPU time required for the proposed method is shown in Table 3 and Table 4. Table 3 shows the time required to achieve segmentation using entropy. The sum of the times in the tables is still less than the required processing time for AES in the ankle example. For example, the GA time is 1.1 s (to determine suitable threshold values) and the encryption time for Gold code encryption is 3.1 s, yet the AES total CPU time is 13.9 s. Table 3 shows the processing time for block sizes 8x8, 16x8 and 16x16. Increasing the block size should reduce the processing time but results in lower encryption robustness. Table 4 shows the side information processing time of four image regions, where Huffman encoding is used to obtain a smaller size for this information, which should be sent as a part of the decryption key. Increasing the number of regions should increase the side information, which can be considered a drawback of the proposed method.

Image   Dimensions   Processing Time (s)
hand    760x576      43.866
ankle   512x512      26.156
US      384x504      18.443
brain   256x256      6.491
Table 2. AES encryption processing time in seconds

Image   8x8      16x8     16x16
hand    1.8393   0.9213   0.4539
ankle   1.1037   0.5534   0.2693
US      0.8015   0.4140   0.2004
brain   0.2616   0.1343   0.0709
Table 3. Segmentation processing time in seconds

Image   No. of Pixels   RLE time (s)   Huffman size (bit)   Huffman time (s)
hand    437760          0.0480         1711                 0.4421
ankle   262144          0.0445         1025                 0.2211
US      193536          0.0040         757                  0.1982
brain   65536           0.0022         429                  0.1057
Table 4. Side information processing time

6. CONCLUSION
Securing medical data via adaptive encryption shows considerable promise for speeding up processing time and improving the security level. There is always a trade-off between time for encryption and the robustness of the product. Maintaining robustness, which depends on key length, will require not only longer key lengths but also considerably more computation time. Therefore, reducing the implementation time while maintaining the robustness are the goals set for this work. Selecting the sequence of the encryption/compression process avoids this problem and improves algorithm robustness and image compression. Implementing encryption in the spatial domain results in a good compression ratio that ensures a high level of robustness. A major advantage in creating a speedy and secure process comes with the use of genetic algorithms to obtain appropriate values of the segmentation threshold. While additional information is required to represent the segmentation, it can easily be compressed in a lossless way using RLE compression. The result of the proposed method is a faster, more robust encryption that adapts to the information content of each individual image.
7. REFERENCES
[1] NEMA, “Digital Imaging and Communications in Medicine (DICOM) Part 15: Security and System Management Profiles,” 2008.
[2] Y. Zhou, K. Panetta, and S. Agaian, “A lossless encryption method for medical images using edge maps,” in Proceedings of the 31st Annual International Conference in Medicine and Biology Society, EMBC 2009, Minneapolis, MN, 2009, pp. 3707–3710.
[3] D. Elminaam, H. Kader, and M. Hadhoud, “Evaluating the performance of symmetric encryption algorithms,” International Journal of Network Security, vol. 10, no. 3, pp. 213–219, 2010.
[4] R. Norcen, M. Podesser, A. Pommer, H. Schmidt, and A. Uhl, “Confidential storage and transmission of medical image data,” Computers in Biology and Medicine, vol. 33, no. 3, pp. 277–292, 2003.
[5] A. B. Mahmood and R. D. Dony, “Segmentation based encryption method for medical images,” in 6th International Conference for Internet Technology and Secured Transactions (ICITST), Abu Dhabi, United Arab Emirates, 2011, pp. 596–601.
[6] Y. Ou, C. Sur, and K. Rhee, “Region-based selective encryption for medical imaging,” in Proceedings of the 1st Annual International Conference on Frontiers in Algorithmics. Springer-Verlag, 2007, pp. 62–73.
[7] S. Vaudenay, A Classical Introduction to Cryptography: Applications for Communications Security. Boston, MA: Springer, 2006.
[8] D. Beasley, R. Martin, and D. Bull, “An overview of genetic algorithms: Part 1. Fundamentals,” University Computing, vol. 15, no. 4, pp. 170–181, 1993.
[9] A. Massoudi, F. Lefebvre, C. De Vleeschouwer, B. Macq, and J. Quisquater, “Overview on selective encryption of image and video: challenges and perspectives,” EURASIP Journal on Information Security, vol. 2008, pp. 1–18, 2008.
[10] Z. Brahimi, H. Bessalah, A. Tarabet, and M. Kholladi, “Selective encryption techniques of JPEG2000 codestream for medical images transmission,” WSEAS Transactions on Circuits and Systems, vol. 7, no. 7, pp. 718–727, 2008.
[11] K. Wong, “Image Encryption Using Chaotic Maps,” Intelligent Computing Based on Chaos, pp. 333–354, 2009.
[12] A. Mahmood and R. Dony, “Adaptive encryption using pseudo-noise sequences for medical images,” in The 3rd International Conference on Communications and Information Technology (ICCIT), Beirut, Lebanon, 2013, pp. 39–43.
[13] S. Barré. (2013) Medical image samples. [Online]. Available: http://www.barre.nom.fr/medical/samples/