Annual Conference on New Trends in Information & Communications Technology Applications-(NTICT'2017) 7 - 9 March 2017

Enhancement CAST Block Algorithm to Encrypt Big Data

Alaa Kadhim F.
Computer Sciences Department, University of Technology, Baghdad, Iraq

Ghassan H. Abdul-Majeed
Ministry of Higher Education, Baghdad, Iraq

Rasha Subhi Ali
Computer Sciences Department, University of Technology, Baghdad, Iraq

Abstract: This research discusses security issues in protecting big data. Encryption algorithms play an essential part in securing information. This paper analyzes the well-known block cipher CAST-256 and a modified version of it that uses a modified S-Box and key generator, and it discusses the implementation of the modified CAST algorithm. The algorithm works dynamically on big data, using the generated keys and a compression algorithm to reduce the execution time. The security analysis and experimental results indicate that the suggested encryption algorithm is fast, provides adequate protection, and adds little overhead to the data, which are among the main requirements for big data today. The main idea of this research is to modify the CAST encryption algorithm so that it works dynamically, depends on DNA computing, and uses one S-Box instead of 6 S-Boxes.

Keywords: CAST, Modified CAST, DNA computing, AES key generation.

I. INTRODUCTION

Data security has become essential today for the effective operation of any organization. One of the most important requirements of networks is the secure transmission of information from one place to another. Cryptography is one of the mechanisms that provide a secure way to transfer sensitive information from a sender to the intended receiver. Its major aim is to make sensitive information unreadable to everyone except the intended receiver [1]. Cryptographic algorithms can be divided into two groups: symmetric-key (also called secret-key) algorithms and asymmetric-key (also called public-key) algorithms. A symmetric-key algorithm uses one shared private key for both encryption and decryption. An asymmetric-key algorithm uses different keys for encryption and decryption [2]. The Advanced Encryption Standard (AES), Blowfish, CAST, RC4, and 3DES are examples of private-key cryptography [3]. Large texts pose particular challenges for encryption: a typical text can be very large, and a traditional encryption algorithm makes encrypting a large volume of data difficult [4].

This research is motivated by the following work. Gupta and Jain [5] proposed a symmetric-key image encryption algorithm based on the DNA approach, in which the original image is encrypted using DNA computation and the DNA complementary rule. Saeed Al-Wattar, Mahmod, Zukarnain, and Udzir [6] proposed a new S-Box design, inspired by biological DNA techniques, for use in SPN symmetric block ciphers. The design takes a biological process as its inspiration in order to create the S-Box in a simple and secure way, and the new S-Box was used in AES (Advanced Encryption Standard). Chowdhury, Sinha, and Dutta [7] proposed a new block cipher termed "Modular Arithmetic based Block Cipher with Varying Key-Spaces (MABCVK)", which uses private key-spaces of varying lengths to encrypt data files. The structural strength of the cipher and the freedom to use a long key-space are expected to make it reasonably non-vulnerable to possible cryptanalytic attacks.

The rest of this paper is organized as follows. Section II explains secret-key cryptographic algorithms, Section III gives an overview of deoxyribonucleic acid (DNA), and Section IV presents the suggested encryption algorithm, including the details of the system design, the substitution boxes (S-Boxes), the general framework, the key generation, and the round function. Section V explains the experimental results of the proposed methods, and Section VI concludes the work.

978-1-5386-2962-8/17/$31.00 ©2017 IEEE

II. SECRET KEY CRYPTOGRAPHIC ALGORITHMS

The following are some of the most common secret-key cryptographic algorithms:

A. DES (Data Encryption Standard)
DES is a block cipher that takes a fixed-length string of plaintext bits and transforms it into a ciphertext bit string of equal length. For DES, the block length is 64 bits. The key consists of 64 bits, but only 56 of them are actually used by the algorithm. The remaining eight bits are used only for parity checking and are then discarded [2].

B. AES
AES is a block cipher intended to replace DES for commercial applications. It uses a 128-bit block size and a key size of 128, 192, or 256 bits. AES does not use a Feistel structure. Instead, each full round consists of four separate functions: byte substitution, permutation, arithmetic operations over a finite field, and XOR with a key [8].

C. CAST
CAST was designed in Canada by Carlisle Adams and Stafford Tavares. They claim that the name refers to their design procedure and should evoke images of randomness, but note the authors' initials. The example CAST algorithm uses a 64-bit block size and a 64-bit key. It uses S-Boxes with an 8-bit input and a 32-bit output. The construction of these S-Boxes is implementation-dependent and complicated [9]. There are eight rounds in the algorithm. There is no known way to break CAST other than brute force. CAST was evaluated as a new encryption standard by the Canadian government [10]. The entire encryption process is given in the following four steps.

1. Split the plaintext into left and right halves of 32 bits each.
2. The right half enters the F function, which includes 6 S-Boxes and an XOR operation, to construct a new right half.
3. New left = old right.
4. Exchange the final blocks L and R and concatenate them to form the ciphertext.

Decryption is similar to the encryption algorithm given above, except that the rounds (and therefore the subkey pairs) are used in reverse order to compute (L0, R0) from (R16, L16) [2].

III. DEOXYRIBONUCLEIC ACID (DNA)

DNA (deoxyribonucleic acid) is considered the genetic blueprint of living creatures. Every body cell has a complete set of DNA, and DNA is unique to each individual. It is a polymer made of monomers called deoxyribose nucleotides, each of which has three fundamental components. A single strand of DNA is composed of a sequence of molecules named bases, which stick out from a sugar-phosphate backbone; the bases are denoted by four characters {A, C, G, T} [6]. A DNA sequence consists of four nucleic acid bases, A (adenine), C (cytosine), G (guanine), and T (thymine), where A and T are complementary and G and C are complementary [5]. In this paper C, T, A, and G are used to denote 00, 01, 10, and 11 (the corresponding decimal digits are "0123"). Using this encoding method, each 8-bit character can be represented as a nucleotide string of length four.

TABLE I. DNA MAP RULES [11]

TABLE II. OPERATION FOR DNA ADDITION AND SUBTRACTION [1]

IV. METHODS

This section discusses the method used and its working principle. The main goal of modifying CAST is to reduce computation and improve the security of the data. The original CAST uses 6 S-Boxes of 256 items, each cell holding 8 characters, which consumes too much time when dealing with big data. The suggested encryption algorithm uses one S-Box of 256 different items, each cell holding 2 characters, which decreases the required encryption time. Modified keys are also used. The S-Box is generated using DNA computing. The keys are generated using an AES key generator with some modifications: instead of using Rcon(i), the position of the character in the word is used.

The proposed Modified CAST encryption algorithm works dynamically. The CAST algorithm is applied to any plaintext (field) whose length is divisible by 32. A plaintext (field) longer than 32 is split into two parts: the part whose length is divisible by 32, and the remainder (plaintext (field) length mod 32), which is processed by XORing it with a shared key. A plaintext (field) shorter than 32 is processed by XORing it with a shared key. Figures 1 and 3 show the general structure of the proposed encryption and decryption algorithms.

Fig. 1. General Structure for the Encryption Process
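The dynamic handling of a field described in Section IV can be sketched as follows. This is only an illustration under stated assumptions: lengths are counted in bytes, the shared key is repeated cyclically when the remainder is longer than the key, and the block cipher is passed in as a callable. The names `split_field`, `xor_with_key`, and `process_field` are hypothetical and do not come from the paper.

```python
from typing import Callable, Tuple


def split_field(field: bytes, block: int = 32) -> Tuple[bytes, bytes]:
    """Split a field into the part divisible by `block` and the remainder."""
    cut = len(field) - (len(field) % block)
    return field[:cut], field[cut:]


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR data with the shared key (assumption: key repeats cyclically)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def process_field(field: bytes, key: bytes,
                  block_cipher: Callable[[bytes], bytes],
                  block: int = 32) -> bytes:
    """Apply the block cipher to full blocks and XOR the remainder with the key."""
    body, tail = split_field(field, block)
    blocks = (body[i:i + block] for i in range(0, len(body), block))
    return b"".join(block_cipher(b) for b in blocks) + xor_with_key(tail, key)
```

A field shorter than 32 bytes has an empty block part, so it is handled entirely by the XOR branch, which matches the third case in the text.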

A. Key Generation
The keys are generated using an AES key generator with some modifications. In this research 256 different keys are generated, each of length 128 bits, that is, 16 characters. The following algorithm explains the general steps for generating the keys.

Algorithm 1: Key Generation
INPUT: Shared key (K).
OUTPUT: 256 keys, each of 128 bits.
Begin:
1) If K.Length < 16 then
      Pad K
   End If
2) Convert K to hexadecimal and put the results in an array C(i)
3) Define an array R(4,4) to hold the elements of C(i) as 4 rows and 4 columns
4) Combine the content of each row and put it in another array O(i)
5) For i = 0 to 255
      If i mod 4 = 0 and i > 0 then
         O(i) = Rotate to right (O(i-1))
         OO(i) = S-Box(O(i))
      ElseIf i mod 4 ≠ 0 then
         OO(i) = O(i-4)
      Else
         OO(i) = O(i)
      End If
   Next
6) For i = 4 to 255
      F = ""
      If i > 0 and i mod 4 = 0 then
         L = ""
         For y = 1 to OO(i).length
            G = ASCII code of each character in OO(i)
            FF = ASCII code of each character in OO(i-4)
            F = Hex(G XOR FF + i*y)
         Next
         L += F
         OO(i) = L
      Else
         For y = 1 to OO(i).length
            G = ASCII code of each character in OO(i)
            FF = ASCII code of each character in OO(i-1)
            F = Hex(G XOR FF + i*y)
         Next
         L += F
         OO(i) = L
      End If
   Next
7) For i = 0 to 255
      List1(i) = OO(i) + OO(i)   // List1 is a list of 256 items, each of 16 characters
   Next
END
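Steps 1 to 4 of Algorithm 1 (padding the shared key, converting it to hexadecimal, arranging it as a 4 x 4 array, and combining each row) can be sketched as below. The sketch covers only these four preparatory steps, not the later rotation, S-Box, and XOR stages. The padding byte, the truncation of keys longer than 16 bytes, and the row-by-row fill order are assumptions, since the paper does not specify them, and the function name `key_to_words` is hypothetical.

```python
from typing import List


def key_to_words(shared_key: bytes, pad_byte: bytes = b"0") -> List[str]:
    """Return the four row words O(0)..O(3) as hexadecimal strings."""
    # Step 1: pad the key to 16 bytes (assumption: pad with "0", truncate if longer).
    key = shared_key[:16].ljust(16, pad_byte)
    # Step 2: convert every byte to two hexadecimal digits, giving C(0)..C(15).
    hex_cells = [format(b, "02x") for b in key]
    # Step 3: arrange the 16 cells as 4 rows of 4 columns (assumption: row by row).
    rows = [hex_cells[r * 4:(r + 1) * 4] for r in range(4)]
    # Step 4: combine the content of each row into one word.
    return ["".join(row) for row in rows]


if __name__ == "__main__":
    print(key_to_words(b"abcdefghijklmnop"))
```

Each returned word holds 4 bytes (8 hexadecimal digits), so the four words together cover the 128-bit key that the remaining steps expand.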

B. S-Box Creation
In this research DNA computing, together with some logical and mathematical operations, is used to create a new S-Box. The composition of this S-Box relies on a single key or two keys, depending on the user's request. The inverse S-Box is configured from the same keys and processes that are used to form the S-Box. Unlike a genetic algorithm, the proposed technique does not pass through several stages to select the best solution, so it is faster. A genetic algorithm starts with a set of solutions (represented by chromosomes) called a population. Solutions from one population are taken and used to form a new population, and so on. The stages of a genetic algorithm are: generate a random population of chromosomes; evaluate the fitness of each chromosome; create a new population by repeating the following steps until the new population is complete (selection, crossover, mutation, accepting, replacing); and test whether the end condition is satisfied.

DNA coding is used to convert user input into DNA (output) strings. The formed S-Box is a 16 * 16 matrix that includes 256 different items, and each cell includes two characters. The steps to create the S-Box are as follows:

- The user input strings are transformed into DNA series. The following procedure is used to transform user input strings into DNA code:

Fig. 2. Illustration for converting to DNA codes

The concept of converting to DNA series is as follows: row = 2 bits (0-3) and column = 4 bits (0-7). For example, let the current bits = 11 (row) and the location = 1 (column); the intersection of row and column gives the DNA series, which in this example is T. As another example, take the strings S1 and S2:
S1 = Com, DNA Series1 = CACGAAAACCGT
S2 = Aom, DNA Series2 = CACTAAAACCGT

- Apply the DNA addition and subtraction operations, which are calculated according to the rules in Table II. Using the example from the previous step (S1 and S2), let:
I = addition (DNASeries1) = AGTTCA
J = addition (DNASeries2) = ATTTCA
K = Subtraction (I, J) = CACCCC

- Apply some mathematical and logical operations to the value resulting from the previous steps, thereby generating an S-Box that includes different values.
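As a point of comparison for the first step, the fixed two-bit code stated in Section III (C = 00, T = 01, A = 10, G = 11) can be sketched in Python. This is not the position-dependent map of Fig. 2, so it is not expected to reproduce the series CACGAAAACCGT given for "Com". The bit order within a byte (most significant pair first) is an assumption, and the function names are hypothetical.

```python
# Two-bit nucleotide code from Section III: C=00, T=01, A=10, G=11.
BASES = "CTAG"  # the index 0..3 of each base is its two-bit value


def dna_encode(data: bytes) -> str:
    """Encode each byte as four bases (assumption: most significant pair first)."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_decode(strand: str) -> bytes:
    """Invert dna_encode; the strand length must be a multiple of four."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def dna_complement(strand: str) -> str:
    """Swap complementary bases: A with T and G with C."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in strand)
```

Under this code the complementary pairs differ in both bits (A = 10 and T = 01, G = 11 and C = 00), so complementing a strand corresponds to inverting every bit of the encoded bytes.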

In each S-Box and inverse S-Box pair, any value within the S-Box array acts as the row and column header in the inverse S-Box, and vice versa.

C. Modified CAST Encryption Algorithm
In this research the modified CAST encryption algorithm is used to encrypt a text file or a database file. In the proposed algorithm the keys are generated using an AES key generator with some modifications to the generating process. Also, one S-Box is used instead of the 6 S-Boxes of the traditional CAST algorithm. The S-Box used in the modified CAST algorithm is generated using DNA computing together with mathematical operations that distribute the S-Box items. The algorithm begins by taking the plaintext (database field) as input and splitting it into two parts, the left (Li) and the right (Ri). The right part then passes through three operations to produce (Ri+1): the S-Box, an XOR with (Ki), and finally an XOR with (Li). The new left is (Li+1) = (Ri). These operations continue for eight rounds, and at the end of round 8 the two halves are exchanged.

Algorithm 2: Modified CAST Encryption
INPUT: plaintext field M…M; key K: k…k
OUTPUT: ciphertext C…C
BEGIN:
1) Split the plaintext into left and right 128-bit halves L: m…m and R: m…m
2) For i from 1 to 8 compute Li and Ri as follows:
      Li = R(i-1)
      Ri = S-Box(R(i-1)) XOR K(i-1) XOR L(i-1)
3) Exchange the final blocks L8, R8 and concatenate them to form the ciphertext
END

D. Modified CAST Decryption Algorithm
Decryption is identical to the encryption algorithm given above, except that the round operations are used in reverse order to compute (L0, R0) from (R8, L8).

Fig. 3. General Structure for the Decryption Process

To authenticate the exchange of data between the client and the server, three key management methods are used in this research:

- In the first method, the user key (Ku) and the server shared secret key (Ks) are used for the encryption and decryption process, and the exchanged data depend on this key. This method places a load on the server site when dealing with big data.

Fig. 3. First Method for Key Management

- In the second method, the server has a database that includes the user identifiers with the user keys. Each user also has a secret key (Ku) shared with the server, but the server encrypts the database using the server key (Ks), and the exchanged data are encrypted using the server key only. This method gives stronger protection than the previous method but takes more time in the retrieval process. The retrieval process for the exchanged data is illustrated in the following figure.

Fig. 4. Key Management Structure by Using Second Method

- The third method also depends on the server database, which includes a key and an identifier for each user, but the exchanged data are encrypted using both the server key (Ks) and the user key (Ku). This method strengthens the protection of the exchanged data but takes more time than the previous methods.

Fig. 5. Structure of the Third Method of Key Management
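The round structure of Algorithm 2 and the reverse-order decryption of subsection D can be illustrated with a short Python sketch. It is a sketch of the Feistel-style skeleton only: the S-Box below is a placeholder affine map rather than the DNA-derived S-Box, the round keys are placeholders rather than the output of Algorithm 1, and applying the S-Box byte by byte to a 128-bit half is an assumption. All names are hypothetical.

```python
from typing import Sequence

# Placeholder S-Box (assumption): NOT the DNA-derived S-Box of the paper.
SBOX = [(7 * x + 3) % 256 for x in range(256)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def substitute(half: bytes) -> bytes:
    """Apply the S-Box to every byte of a half block."""
    return bytes(SBOX[b] for b in half)


def feistel(block: bytes, round_keys: Sequence[bytes]) -> bytes:
    """Run Li = R(i-1), Ri = S-Box(R(i-1)) XOR K(i-1) XOR L(i-1), then swap."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in round_keys:
        left, right = right, xor_bytes(xor_bytes(substitute(right), key), left)
    return right + left  # exchange the final halves and concatenate


def encrypt_block(block: bytes, round_keys: Sequence[bytes]) -> bytes:
    return feistel(block, round_keys)


def decrypt_block(block: bytes, round_keys: Sequence[bytes]) -> bytes:
    # Same rounds with the keys in reverse order recover (L0, R0) from (R8, L8).
    return feistel(block, list(reversed(round_keys)))


# Eight placeholder 128-bit round keys (assumption, for demonstration only).
ROUND_KEYS = [bytes((17 * r + j) % 256 for j in range(16)) for r in range(8)]

if __name__ == "__main__":
    plain = bytes(range(32))  # one block: two 128-bit halves
    cipher = encrypt_block(plain, ROUND_KEYS)
    print(decrypt_block(cipher, ROUND_KEYS) == plain)
```

Because each round only XORs the transformed right half into the left half, running the same rounds with reversed keys undoes the encryption in this sketch without needing the inverse of the placeholder S-Box.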


V. RESULTS

The suggested algorithm needs a 128-bit secret key, given as 16 bytes (hexadecimal), from which the key generator algorithm derives 256 keys to process each block of a text file or mdb file. In this paper the Modified CAST algorithm was implemented on a big database, and several text files and mdb files of different sizes were used to test the efficiency of the algorithm. The tests show that the Modified CAST algorithm is much faster than the AES algorithm, as shown in Table III and reference [12]. Reference [12] notes that the AES algorithm takes 4 seconds to encrypt a file of 20,527 bytes, while the Modified CAST takes 4 seconds for a file of 381 kilobytes (389,680 bytes), approximately 19 times bigger than the file encrypted with AES. Table III presents the results of evaluating the Modified CAST and AES algorithms on text files of different sizes. The Modified CAST algorithm is also much faster than the AES algorithm and the Modified AES algorithm when compared with the results of reference [13].

TABLE III. ENCRYPTION AND DECRYPTION RESULTS FOR TEXT FILES

File name     File size (KB)   Modified CAST                        AES
                               Encryption time   Decryption time    Encryption time   Decryption time
2014nation    104              6                 6                  12                13
LMGIS         105              7                 7                  13                13
UDWC          381              5                 5                  149               108

Fig. 6. The comparison results for encryption time in seconds

Fig. 7. The comparison results for decryption time in seconds

TABLE IV. ENCRYPTION AND DECRYPTION RESULTS FOR TEXT AND MDB FILES BY USING A MODIFIED CAST ALGORITHM

Table name    File size   Modified CAST
                          Encryption time   Decryption time
2014nation    104         6                 6
LMGIS         105         7                 7
UDWC          381         5                 5
dept10000     503         6                 6
2011aloes     2759        207               181
2014 st       3281        191               158
Erbel         69340       869               777

Fig. 8. The comparison results for Modified CAST encryption and decryption time in seconds

Several big text files were encrypted using both the original CAST encryption algorithm and the Modified CAST encryption algorithm. The results in Table V show that the Modified CAST encryption algorithm is faster than the traditional CAST on every file tested.

TABLE V. THE COMPARISON FOR ENCRYPTION TIME BETWEEN MODIFIED CAST AND ORIGINAL CAST IN SECONDS

Table name            File size   Modified CAST encryption time   Original CAST encryption time
Dept                  8           0.072                           0.086
2014nation            118         5.432                           10.513
DWC                   381         5.682                           6.327
dept10000             503         5.679                           5.755
ISI WEB OF SCIENCE    730         34.793                          69.902
2014 Hummn Drug       1344        59.529                          101.292
2011aloes             2759        207.135                         335.599
2014 st               3281        191.975                         338.151
london post code      15375       277.172                         291.277
DHUK                  74713       1358.415                        1385.86
Erbel                 69340       869.822                         920.612
BUSIENESS London      68308       2433.728                        4038.254
Kirkuk                68619       776.233                         906.871
BABEL                 124059      1575.578                        2333.705
SULAIMANIA            151846      1561.196                        3025.177

TABLE VI. COMPARISON OF RETRIEVAL TIME (IN SECONDS.MILLISECONDS) FOR THE THREE KEY MANAGEMENT METHODS

DB name       DB size
Dept          240
Dhuk          160,464
Erbel         174,732
Babel         422,468
Sulaimania    501,984

First method   Second method   Third method   No. of retrieved data
0.265          3.508           4.358          3
0.265          3.501           4.433          1
0.386          3.498           4.381          6
30.49          33.75           91.551         20813
10.569         12.947          29.254         6606
43.737         46.684          122.985        30819
5.968          8.429           18.452         3463
3.348          6.085           11.033         1607
0.378          3.49            4.656          34
18.556         20.909          49.858         11381
42.08          44.966          104.033        28680
47.999         46.928          120.282        30395
58.967         46.189          119.956        29554
68.582         63.11           167.557        38039
52.323         54.818          146.939        26027
11.976         15.03           34.353         7861
43.632         43.034          107.392        26496
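The file-size ratio quoted from reference [12] and the per-file speed-ups implied by Table V can be checked with a few lines of Python. The numbers are copied from the text and from Table V in row order; the variable names are hypothetical, and the script only restates the reported figures without adding any new measurement.

```python
# File sizes quoted in the comparison with reference [12].
AES_FILE_BYTES = 20_527
MODIFIED_CAST_FILE_BYTES = 389_680
size_ratio = MODIFIED_CAST_FILE_BYTES / AES_FILE_BYTES

# Encryption times in seconds from Table V, in row order.
modified_cast = [0.072, 5.432, 5.682, 5.679, 34.793, 59.529, 207.135, 191.975,
                 277.172, 1358.415, 869.822, 2433.728, 776.233, 1575.578, 1561.196]
original_cast = [0.086, 10.513, 6.327, 5.755, 69.902, 101.292, 335.599, 338.151,
                 291.277, 1385.86, 920.612, 4038.254, 906.871, 2333.705, 3025.177]

# Speed-up of Modified CAST over original CAST for each file.
speedups = [orig / mod for orig, mod in zip(original_cast, modified_cast)]

if __name__ == "__main__":
    print(f"file-size ratio: {size_ratio:.2f}")
    print(f"smallest speed-up: {min(speedups):.2f}")
    print(f"largest speed-up: {max(speedups):.2f}")
```

The ratio comes out just under 19, in line with the "approximately 19 times" stated in the text, and the speed-ups range from slightly above 1 to about 2 across the fifteen files.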

The proposed scheme is implemented in the Visual Basic .NET programming language on a computer with a Core i7 processor, Windows 8, and 8 GB of RAM.

VI. CONCLUSION

In this paper, a new design for reinforcing the security of the CAST algorithm was proposed. The Modified CAST algorithm differs from the original CAST in the S-Boxes and keys used. The created S-Boxes and keys depend on the user name and the server name, but the rest of the mathematical design of the original CAST is kept unchanged. The security analysis and experimental results indicate that the suggested encryption algorithm is fast, provides adequate protection, and adds little overhead to the data, which are among the main requirements for big data today. The theoretical tests and the experimental results of the implementation make it well suited to big data. One major advantage of the Modified CAST encryption algorithm is that it was much faster than the original CAST and AES encryption algorithms. Also, because the encryption process depends on the user name and server name (private and shared) keys, it provides more security for the encrypted files: no one can decrypt a file without both keys. The S-Boxes and the generated keys of the Modified CAST also depend on the user name and the shared key, so the results of the encryption process differ from one user to another. The recommendations for future work are: constructing new S-Boxes using hybrid artificial intelligence techniques, which may increase the strength of the constructed S-Box and hence of the protection system; and using a one-time system in the encryption process when exchanging data online, for example using the current time and date as a key.

REFERENCES

[1] M. Nagendra & M. C. Sekhar, "Performance Improvement of Advanced Encryption Algorithm using Parallel Computation", International Journal of Software Engineering and Its Applications, Vol. 8, No. 2 (2014), pp. 287-296.
[2] K. Aggarwal, J. K. Saini & H. K. Verma, "Performance Evaluation of RC6, Blowfish, DES, IDEA, CAST-128 Block Ciphers", International Journal of Computer Applications (0975-8887), Volume 68, No. 25, April 2013.
[3] R. Tiwari & A. Sinhal, "Block based text data partition with RC4 encryption for text data security", International Journal of Advanced Computer Research, Vol. 6(24), 2016, ISSN (Print): 2249-7277, ISSN (Online): 2277-7970.
[4] P. Kawle, A. Hiwase, G. Bagde, E. Tekam & R. Kalbande, "Modified Advanced Encryption Standard", International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume 4, Issue 1, March 2014.
[5] R. Gupta & A. Jain, "A New Image Encryption Algorithm based on DNA Approach", International Journal of Computer Applications (0975-8887), Volume 85, No. 18, January 2014.
[6] A. H. Saeed Al Wattar, R. Mahmod, Z. A. Zukarnain & N. I. Udzir, "Generating A New S-Box Inspired by Biological DNA", International Journal of Computer Science and Application, Vol. 4, No. 1, April 2015.
[7] A. Chowdhury, A. K. Sinha & S. Dutta, "Proposal of a New Block Cipher reasonably Non-Vulnerable against Cryptanalytic Attacks", IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 1, No. 1, January 2012.
[8] W. Stallings, "Cryptography and Network Security Principles and Practice", Fifth Edition, Prentice Hall, 2011.
[9] H. B. Pethe & S. R. Pande, "A Survey on Different Secret Key Cryptographic Algorithms", IBMRD's Journal of Management and Research, Print ISSN: 2277-7830, Online ISSN: 2348-5922, Volume 3, Issue 1, March 2014.
[10] A. K. Farhan, "Block Cipher", University of Technology, Computer Science Department, lectures for 3rd class, 2015.
[11] Y. Liu, "Cryptanalyzing a RGB image encryption algorithm based on DNA encoding and chaos map", College of Information Engineering, Xiangtan University, Xiangtan 411105, Hunan, China, arXiv:1307.4279v2 [cs.CR], 2 Jan 2014.
[12] A. Al Tamimi, "Performance Analysis of Data Encryption Algorithms", 2005, http://www.cse.wustl.edu/~jain/cse56706/ftp/encryption_perf/index.html.
[13] V. C. Koradia, "Modification in Advanced Encryption Standard", Journal of Information, Knowledge and Research in Computer Engineering, ISSN: 0975-6760, Nov 12 to Oct 13, Volume 02, Issue 02.
