A new randomized data hiding algorithm with encrypted secret message using modified generalized Vernam Cipher Method: RAN-SEC algorithm
Rishav Ray1, Jeeyan Sanyal2, Tripti Das3, Kaushik Goswami4, Sankar Das5, Asoke Nath6
Department of Computer Science, St. Xavier’s College, Kolkata, India
e-mail:
[email protected],
[email protected],
[email protected],
[email protected],
[email protected],
[email protected]
Abstract— This paper proposes a new method for hiding an encrypted secret message inside a cover file by substituting the LSBs of randomly selected bytes of the cover file. The secret message is encrypted with a new algorithm called the Modified Generalized Vernam Cipher Method (MGVCM). To hide the message, the bits of each character of the secret message file are inserted into the LSBs of eight randomly selected bytes of the cover file. The selected bytes correspond to successive locations of a randomized offset matrix, taken as offsets from a certain base address in the cover file. The offset matrix is randomized using the randomization method of the previously published MSA encryption method. Randomized embedding of the message in the cover file adds a layer of security on top of the encryption.
Keywords- randomization; cryptography; LSB; feedback; steganography
I. INTRODUCTION
As the use of the Internet and the World Wide Web for transferring and exchanging data increases day by day, the need for better data security systems grows proportionately. Various data encryption methods are already in use. However, encryption alone might not be enough to protect confidential data such as banking information, and hence the concept of data hiding comes in. In data hiding, the data is embedded in a file known as the host file, which is larger than the message file we want to hide. Several existing data hiding/embedding methods embed the bit pattern of the secret message (SM) in successive locations of the cover file (CF). These methods insert each bit of SM in the LSB, LSB+1, LSB+2, or even up to LSB+4. However, breaking this kind of embedding is relatively easy: the SM is hidden in consecutive locations of the CF, so knowledge of the starting location could reveal the entire SM. Simple data hiding can be made more secure by first encrypting the secret message and then hiding it in the file. In the present work we introduce a new randomized data hiding method, which hides the bit pattern of SM in seemingly random locations of the CF. The method consists of two distinct algorithms: (i) to encrypt SM using MGVCM proposed by Nath et al. [2]; (ii) to insert the encrypted secret message in the standard CF by changing the least significant bit (LSB) of bytes at random locations in CF.

978-1-4673-0126-8/11/$26.00 ©2011 IEEE

Here we have tried to make the steganography method more secure. An attacker first has to extract the SM from the CF, which is not easy because the SM is scattered over random bytes of the CF, and then decrypt the encrypted message. The scattered distribution of SM depends upon the 16×16 offset matrix, which contains the whole numbers 0 to 255, each exactly once, in random order. The offset matrix is generated from the text key provided by the user, using the MSA [1] randomization method. The successive elements of the matrix, read in row-major order, provide the offset from a certain base address of the host file at which a bit is inserted in the LSB of the byte. We consider a block of three 256-byte sub-blocks for embedding the SM. The sub-blocks are randomly chosen, and the bits are embedded in the locations of the sub-blocks corresponding to the respective elements of the offset matrices for the sub-blocks. In this manner 90 bytes of SM (720 bits) can safely be inserted in 768 bytes of CF. Embedding capacity and the quality of CF after embedding are inversely related, so there is a tradeoff between capacity and quality. In the proposed method we have concentrated on keeping the damage to the CF as small as possible, keeping in mind the quality of CF and the security of SM after embedding.

The SM is first encrypted using the MGVC [2] method, which uses a “feedback” effect and also reverses the file, making the ciphertext very hard to break by any brute-force method. This encryption method also uses a 16×16 random matrix, which is different from the matrix used for data hiding. This new randomization was proposed by Nath et al. [2]. The un-hiding process is similar to the hiding process.
The offset matrix is generated from the text key entered by the user, and the entire encrypted SM is retrieved from the host file. If the SM itself contains embedded data, that data can be retrieved in turn. The text key is not checked for consistency with the text key entered at the time of embedding, so a wrong key entered by an attacker simply extracts a garbled, meaningless message. Decryption with the MGVC [2] method then starts. In both the hiding and the un-hiding process, the user enters two separate key strings.
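The hiding and un-hiding of one 768-byte block described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the MSA randomization of the offset matrix is not reproduced, so a key-seeded shuffle stands in for it, and the rule for choosing a sub-block for each bit is likewise assumed. The names `slot_sequence`, `embed` and `extract` are hypothetical.

```python
import hashlib
import random

BLOCK = 768         # bytes of cover file per embedding block (from the text)
SUB = 256           # size of each of the three sub-blocks
SM_PER_BLOCK = 90   # secret-message bytes per block (720 of the 768 LSBs)


def slot_sequence(key: str) -> list[int]:
    """Return the byte positions (0..767) in the order bits are embedded.

    ASSUMPTION: a SHA-256-seeded PRNG replaces the MSA randomization, and
    the sub-block for each bit is drawn from the same PRNG.
    """
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest(), "big")
    rng = random.Random(seed)
    # One offset matrix per sub-block: a permutation of 0..255 in row-major order.
    matrices = [rng.sample(range(SUB), SUB) for _ in range(3)]
    used = [0, 0, 0]  # number of matrix elements already consumed per sub-block
    slots = []
    for _ in range(SM_PER_BLOCK * 8):
        sb = rng.choice([s for s in range(3) if used[s] < SUB])
        slots.append(sb * SUB + matrices[sb][used[sb]])
        used[sb] += 1
    return slots


def embed(block: bytes, message: bytes, key: str) -> bytes:
    """Write the bits of message (at most 90 bytes) into LSBs of a 768-byte block."""
    assert len(block) == BLOCK and len(message) <= SM_PER_BLOCK
    out = bytearray(block)
    slots = slot_sequence(key)
    for i, byte in enumerate(message):
        for b in range(8):                       # most significant bit first
            bit = (byte >> (7 - b)) & 1
            pos = slots[i * 8 + b]
            out[pos] = (out[pos] & 0xFE) | bit   # replace only the LSB
    return bytes(out)


def extract(block: bytes, length: int, key: str) -> bytes:
    """Read length bytes back from the LSBs, following the same slot order."""
    slots = slot_sequence(key)
    message = bytearray()
    for i in range(length):
        value = 0
        for b in range(8):
            value = (value << 1) | (block[slots[i * 8 + b]] & 1)
        message.append(value)
    return bytes(message)
```

With the same key, `extract(embed(cover, sm, key), len(sm), key)` returns `sm`. With a different key the call still succeeds but reads LSBs from unrelated positions, which mirrors the behaviour described above: no key check, only a meaningless result.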
The main objective of this paper is to provide improved security for confidential data while it is being transferred between users, first by encrypting it and then by hiding it.
II. RAN-SEC ALGORITHM
A. Data Hiding Algorithm
Step-1: Start
Step-2: Read host file name
Step-3: Read secret message file name
Step-3a: Calculate nhost=sizeof host file
Step-4: Calculate nstag=sizeof secret message file
Step-5: Calculate n1=int(nstag/90)
Step-6: Calculate r1=nstag-n1*90
Step-7: (i) set n=nhost-(n1+1)*768-50 for .doc file or .xls file
        (ii) set n=nhost-5001 for .pdf file
        (iii) set n=5000 for all other files
Step-8: Calculate size1=(n1+1)*768+n+50
Step-9: if size1>nhost then exit, otherwise continue from step-10
Step-10: Read 768 bytes from the ‘n’-th position of host file
Step-11: Divide the 768-byte block into 3 sub-blocks of size 256 bytes each
Step-12: Read 90 bytes from secret message file
Step-13: Take 1 byte from the 90-byte block and divide it into its 8-bit pattern
Step-14: For each bit, select a random sub-block and a random offset position in that sub-block
Step-15: Insert the bit in the LSB of this randomly selected byte
Step-16: Repeat steps 13-15 for all 90 bytes of the secret message block
Step-17: Overwrite the 768 bytes of host file with the modified 768-byte block
Step-18: Increase value of ‘n’ by 768
Step-19: Repeat steps 10-18 for n1 times
Step-20: Repeat the same embedding process for the remaining r1 bytes of secret message file
Step-21: End
B. Data Encryption Algorithm
1) Algorithm for encryption
Step-1: Start
Step-2: copy plaintext file name as pf[]
Step-3: copy encrypted file name in ef[]
Step-4: copy the content of pf[] file into a file sec_a.dat
Step-5: call function keygen() to calculate times (=randomization number)
Step-5a: Store (0-255) in matr[16][16]
Step-6: set i=1
Step-7: Call function randomize() to create randomized matrix matr[16][16]
Step-8: Copy all elements of matr[16][16] to key[256]
Step-9: if mod(key[i],2) 0 and n1 then goto step-24
Step-10: open file sec_a.dat in read mode
Step-11: open file sec_b.dat in write mode
Step-12: Read a block (Number of characters