
iJARS International Journal of Engineering Volume 10, Issue 5 (September - 2015) www.engineering. ijarsgroup.com Research Article

FPGA Prototyping of Single Chip 128-bit AES Encryption and Decryption of Image

Author: Bibek Bhattarai*

Address for correspondence: Department of Electronics and Computer Engineering, Central Campus, IOE, Tribhuwan University, Lalitpur, Nepal

*Corresponding author

Abstract: In today's world, as much of our personal information and many of our financial transactions are processed via the Internet, data encryption is an essential element of any effective information security system. The importance of cryptography for the security of electronic data storage and transactions has grown considerably during the last few years. This paper presents an FPGA implementation of the 128-bit Advanced Encryption Standard (AES) algorithm for symmetric-key encryption and decryption. The design is written in the Verilog hardware description language and implemented on a Spartan 3E Starter Kit FPGA board. The design was synthesized with Xilinx ISE and simulated with ISim. It uses an iterative looping approach with a block and key size of 128 bits and a lookup table implementation of the S-box. This gives a low-complexity architecture that readily achieves low latency as well as high throughput. Simulation results and performance results are presented and compared with previously reported designs.

Keywords: encryption; decryption; AES; FPGA; symmetric-key; iterative; latency; throughput

Manuscript Id: iJARS/1157. Author's copy; restricted to personal use only. Any manipulation will be against the copyright policy of iJARS.

I. INTRODUCTION

Each day, millions of users generate and exchange large volumes of information in various fields, such as financial and legal files, medical reports, and bank services, via the Internet. These and other applications deserve special treatment from the security point of view, not only in the transport of such information but also in its storage. In this sense, cryptographic techniques are essential. Dominant applications of cryptography include secure communication, such as HTTPS for web traffic and WPA2 (and WEP), GSM, and Bluetooth for wireless traffic, as well as user authentication and many more.

For a long time, DES (Data Encryption Standard), a product cipher developed by IBM, was widely adopted by the industry for use in security products. However, after Diffie and Hellman designed a machine to break DES in one day and estimated that it could be built for only 20 million dollars (Tanenbaum and Wetherall, 2011), NIST (National Institute of Standards and Technology) invited researchers from all over the world to submit proposals for a new standard, AES (Advanced Encryption Standard). The required properties of AES were that the algorithm must be a symmetric block cipher; the full design must be public; key lengths of 128, 192, and 256 bits must be supported; both software and hardware implementations must be possible; and the algorithm must be public or licensed on non-discriminatory terms. Fifteen serious proposals were made, and public conferences were organized in which they were presented and attendees were actively encouraged to find flaws in all of them. Based on the security, efficiency, simplicity, flexibility, and memory requirements of the proposed algorithms, Rijndael, by Joan Daemen and Vincent Rijmen, was selected as AES and published as FIPS 197 (Federal Information Processing Standard 197) (FIPS 197, 2001).

Rijndael has three parameters: plaintext, an array of 16 bytes containing the input data; ciphertext, an array of 16 bytes where the enciphered output will be returned; and key. During the calculation, the current state of the data is maintained in a byte array, state, whose size is Nrows × Ncols. For 128-bit blocks, this array is 4 × 4 bytes, so its 16 bytes hold the full 128-bit data block. At the start of the Cipher and Inverse Cipher, the input (the array of bytes in0, in1, ..., in15) is copied into the State array. It is copied in column order, with the first 4 bytes going into column 0, the next 4 bytes going into column 1, and so on. The Cipher or Inverse Cipher operations are then conducted on this State array, after which its final value is copied to the output (the array of bytes out0, out1, ..., out15).

The encryption or decryption starts by expanding the key into arrays of the same size as the state. They are stored in the round key, which is an array of structs, each containing a state array. One of these, known as the initial round key, is used at the start of the calculation, and the rest are used one per round, with the number of rounds given by the algorithm specification in Table I. AES is regarded as secure against known practical cryptographic attacks. Since every step is reversible, decryption can be done simply by running the algorithm backward. However, there is also a trick available in which decryption can be done by running the encryption algorithm with different tables. The algorithm has been designed for great security and great speed.

S0,0  S0,1  S0,2  S0,3
S1,0  S1,1  S1,2  S1,3
S2,0  S2,1  S2,2  S2,3
S3,0  S3,1  S3,2  S3,3

Fig. 1: The representation of the state array in the algorithm
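The column-order copy between the input bytes and the State can be illustrated with a short Python sketch. The function names `bytes_to_state` and `state_to_bytes` are illustrative and are not taken from the Verilog design; the mapping s[r][c] = in[r + 4c] is the one given in FIPS 197.

```python
def bytes_to_state(block):
    """Copy 16 input bytes into a 4x4 state in column order: s[r][c] = in[r + 4*c]."""
    if len(block) != 16:
        raise ValueError("AES operates on 16-byte blocks")
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_bytes(state):
    """Copy the state back to 16 output bytes, again column by column."""
    return bytes(state[r][c] for c in range(4) for r in range(4))


block = bytes(range(16))          # in0 .. in15 = 0 .. 15
state = bytes_to_state(block)
for row in state:
    print(row)
```

With this test block, the first four input bytes end up in column 0 of the state, so row 0 holds bytes 0, 4, 8 and 12.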

Fig. 2: The general block diagram of the image encryption process

Hardware implementations of the AES algorithm are faster still. FPGAs (Field Programmable Gate Arrays) are reconfigurable hardware devices that can be programmed in-system. The potential advantages of implementing an encryption algorithm in an FPGA are algorithm agility, algorithm upload, algorithm modification, architecture efficiency, throughput, and cost efficiency. FPGAs can deliver considerably higher throughput than software implementations, and the time and cost of developing an FPGA implementation of a given algorithm are much lower than for an ASIC (Application Specific Integrated Circuit) implementation.

AES, also known as Rijndael (Daemen and Rijmen), has a fixed block size of 128 bits and a key size of 128, 192 or 256 bits. This paper deals with an FPGA implementation of AES encryption and decryption using an iterative looping approach with a block and key size of 128 bits. This method gives a very low-complexity architecture and is easily operated to achieve low latency as well as high throughput. The implementation was run on a Spartan 3E Starter Kit FPGA board, and its output and performance are analyzed.

The image encryption is performed by a Verilog design running on the Spartan 3E Starter Kit FPGA board. The image input to the Verilog design is provided from a computer through a UART (Universal Asynchronous Receiver and Transmitter), and the outputs are observed and analyzed with a Python program for loading and displaying images, as depicted in Fig. 2. The image file is converted into a hexadecimal data stream, which is transferred to the FPGA over the UART and encrypted there by the Verilog design. The encrypted hexadecimal data stream is sent back to the computer, converted into an image, and the result is observed. DDR SDRAM is used on the FPGA side for intermediate storage of the images.
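The host-side preparation of the image can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the zero padding of a short final block is an assumption of this sketch (the 400 × 300 test image used later needs none), the image here is synthetic, and the UART transfer itself is omitted because it would need a serial-port library.

```python
BLOCK_BYTES = 16  # one AES block is 128 bits


def to_hex_blocks(pixels):
    """Split raw pixel bytes into 16-byte blocks and hex-encode each block.

    Zero padding of a short final block is an assumption of this sketch.
    """
    padded = pixels + bytes(-len(pixels) % BLOCK_BYTES)
    return [padded[i:i + BLOCK_BYTES].hex()
            for i in range(0, len(padded), BLOCK_BYTES)]


def from_hex_blocks(blocks, length):
    """Rebuild the pixel bytes from hex blocks and drop any padding."""
    return bytes.fromhex("".join(blocks))[:length]


# Synthetic stand-in for a 400 x 300, 8-bit greyscale image (120,000 bytes)
image = bytes((7 * x + y) % 256 for y in range(300) for x in range(400))
blocks = to_hex_blocks(image)
print(len(image), "bytes ->", len(blocks), "blocks of 128 bits")
```

Each hex string in `blocks` corresponds to one 128-bit word that the FPGA would encrypt; the inverse helper is what a display program would apply to the returned stream.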

II. RELATED THEORIES

A. Algorithm Specification

For the AES algorithm, the length of the input block, the output block and the state is 128 bits. This is represented by Nb = 4, which reflects the number of 32-bit words (number of columns) in the state. The length of the cipher key K is 128, 192, or 256 bits. The key length is represented by Nk = 4, 6, or 8, which reflects the number of 32-bit words in the cipher key. The number of rounds to be performed during the execution of the algorithm depends on the key size. The number of rounds is represented by Nr, where Nr = 10 when Nk = 4, Nr = 12 when Nk = 6, and Nr = 14 when Nk = 8 (FIPS 197, 2001). The only key-block-round combinations that conform to this standard are listed in Table I.

Table I: The key-block-round combinations conforming to AES

            Key length (Nk words)   Block size (Nb words)   Number of rounds (Nr)
AES-128     4                       4                       10
AES-192     6                       4                       12
AES-256     8                       4                       14

The algorithms for encryption and decryption are summarized in the flow diagram of Fig. 3.

Fig. 3: Block diagram of the AES encryption and decryption process
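As a small consistency check on Table I, the parameter combinations can be tabulated in Python. The dictionary and helper names are illustrative only. The relation Nr = Nk + 6 is an observation that holds for all three rows, and the key schedule length Nb*(Nr + 1) is the one stated in the key expansion discussion later in the paper.

```python
# (Nk, Nb, Nr) for each key length in bits, as listed in Table I
AES_PARAMS = {128: (4, 4, 10), 192: (6, 4, 12), 256: (8, 4, 14)}


def key_schedule_words(key_bits):
    """Number of 32-bit words in the key schedule: Nb * (Nr + 1)."""
    nk, nb, nr = AES_PARAMS[key_bits]
    return nb * (nr + 1)


for bits, (nk, nb, nr) in AES_PARAMS.items():
    print(f"AES-{bits}: Nk={nk} Nb={nb} Nr={nr} "
          f"schedule words={key_schedule_words(bits)}")
```

For the 128-bit configuration used in this paper, the schedule holds 44 words, that is, one initial round key plus ten round keys of four words each.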


B. Cipher

At the start of the cipher, the input is copied to the state array. After an initial round key addition, the state array is transformed by applying a round function 10, 12, or 14 times (depending on the key length), with the final round differing slightly from the first Nr-1 rounds in that it omits the column mixing step. The final state is then copied to the output to give the ciphertext.

C. Byte Substitution Transformation

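The Byte Substitution transformation is a nonlinear byte substitution that operates independently on each byte of the state using a substitution table (S-box). This S-box, which is invertible, is constructed by composing two transformations.

1. Take the multiplicative inverse in the finite field $GF(2^8)$. The multiplicative inverse of $b(x)$ is computed using polynomials $a(x)$ and $c(x)$ such that

$$ b(x)\,a(x) + m(x)\,c(x) = 1 $$

where $m(x) = x^8 + x^4 + x^3 + x + 1$ is an irreducible polynomial. Hence, $a(x) \cdot b(x) \bmod m(x) = 1$, which means

$$ b^{-1}(x) = a(x) \bmod m(x) $$

2. Apply the following affine transformation

$$ b'_i = b_i \oplus b_{(i+4) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+6) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus c_i $$

for $0 \le i < 8$, where $b_i$ is the $i$th bit of the byte, and $c_i$ is the $i$th bit of a byte $c$ with the value {63} or {01100011}.

In matrix form, the affine transformation element of the S-box can be expressed as

$$
\begin{bmatrix} b'_0 \\ b'_1 \\ b'_2 \\ b'_3 \\ b'_4 \\ b'_5 \\ b'_6 \\ b'_7 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
+
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
$$

D. Row Shifting Transformation

In the Row Shifting transformation, the bytes in the last three rows of the State are cyclically shifted over different numbers of bytes (offsets). The first row, r = 0, is not shifted. The row shifting transformation can be characterized as

$$ s'_{r,c} = s_{r,\,(c + \mathrm{shift}(r, Nb)) \bmod Nb} \quad \text{for } 0 < r < 4 \text{ and } 0 \le c < Nb $$

where the shift value shift(r, Nb) depends on the row number r, and for Nb = 4, shift(1,4) = 1, shift(2,4) = 2, shift(3,4) = 3. This has the effect of moving bytes to "lower" positions in the row (i.e., lower values of c in a given row), while the "lowest" bytes wrap around into the "top" of the row (i.e., higher values of c in a given row).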
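The S-box construction (multiplicative inverse followed by the affine transformation) and the row shifting rule can be checked with a small Python sketch. It is not the look-up-table implementation used on the FPGA: the inverse is found here by exhaustive search for clarity, {00} is mapped to itself as in FIPS 197, and all function names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo m(x) = x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:          # degree 8 reached: reduce by m(x) = 0x11b
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse by exhaustive search; {00} maps to itself."""
    if a == 0:
        return 0
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)


def affine(b):
    """b'_i = b_i ^ b_(i+4) ^ b_(i+5) ^ b_(i+6) ^ b_(i+7) ^ c_i, with c = {63}."""
    out = 0
    for i in range(8):
        bit = ((b >> i) ^ (b >> ((i + 4) % 8)) ^ (b >> ((i + 5) % 8))
               ^ (b >> ((i + 6) % 8)) ^ (b >> ((i + 7) % 8)) ^ (0x63 >> i)) & 1
        out |= bit << i
    return out


SBOX = [affine(gf_inv(v)) for v in range(256)]


def sub_bytes(state):
    """Apply the S-box to every byte of a 4x4 state."""
    return [[SBOX[v] for v in row] for row in state]


def shift_rows(state):
    """Rotate row r left by r positions: s'[r][c] = s[r][(c + r) % 4]."""
    return [[row[(c + r) % 4] for c in range(4)] for r, row in enumerate(state)]


print(hex(SBOX[0x53]))
print(shift_rows([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]))
```

Generating the table once and then indexing it mirrors the look-up-table idea: the arithmetic is only needed to build the 256 entries, not during encryption.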



Fig. 4: The row shifting operation; it cyclically shifts the last three rows

E. Columns Mixing Transformation

The Columns Mixing transformation operates on the State column by column, treating each column as a four-term polynomial. The columns are considered as polynomials over the finite Galois field $GF(2^8)$ and multiplied modulo $x^4 + 1$ with a fixed polynomial $a(x)$, given by

$$ a(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\} $$

This can be written as a matrix multiplication. With $s'(x) = a(x) \otimes s(x)$,

$$
\begin{bmatrix} s'_{0,c} \\ s'_{1,c} \\ s'_{2,c} \\ s'_{3,c} \end{bmatrix}
=
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
\begin{bmatrix} s_{0,c} \\ s_{1,c} \\ s_{2,c} \\ s_{3,c} \end{bmatrix}
\quad \text{for } 0 \le c < Nb
$$

Fig. 5: Mix column operation, operating on the State column by column

F. Add Round Key Transformation

In the Add Round Key transformation, a Round Key is added to the State by a simple bitwise XOR operation. Each Round Key consists of Nb words from the key schedule. Those Nb words are each added into the columns of the State, such that

$$ [s'_{0,c},\, s'_{1,c},\, s'_{2,c},\, s'_{3,c}] = [s_{0,c},\, s_{1,c},\, s_{2,c},\, s_{3,c}] \oplus [w_{round \cdot Nb + c}] \quad \text{for } 0 \le c < Nb \qquad (9) $$

where $[w_i]$ are the key schedule words, and round is a value in the range 0 ≤ round ≤ Nr. In the Cipher, the initial Round Key addition occurs when round = 0, prior to the first application of the round function. The application of the Add Round Key transformation to the Nr rounds of the Cipher occurs when 1 ≤ round ≤ Nr.

G. Inverse Cipher

The Cipher transformations can be inverted and then implemented in reverse order to produce a straightforward Inverse Cipher for the AES algorithm. The individual transformations used in the Inverse Cipher are Inverse Row Shifting, Inverse Byte Substitution, Inverse Columns Mixing, and Add Round Key.
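The invertibility of the column mixing and round key addition steps can be checked numerically. The Python sketch below is illustrative rather than a description of the Verilog design: it applies the FIPS 197 column mixing matrix and its inverse to a single column, using multiplication in GF(2^8) built from a left shift and a conditional XOR with {1b}. The function names and the test column are assumptions of this example.

```python
def xtime(b):
    """Multiply a byte by x ({02}) in GF(2^8): left shift, then reduce if needed."""
    b <<= 1
    return (b ^ 0x11B) if b & 0x100 else b


def gf_mul(a, b):
    """Shift-and-add multiplication built from repeated xtime()."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


MIX = [[0x02, 0x03, 0x01, 0x01],
       [0x01, 0x02, 0x03, 0x01],
       [0x01, 0x01, 0x02, 0x03],
       [0x03, 0x01, 0x01, 0x02]]

INV_MIX = [[0x0E, 0x0B, 0x0D, 0x09],
           [0x09, 0x0E, 0x0B, 0x0D],
           [0x0D, 0x09, 0x0E, 0x0B],
           [0x0B, 0x0D, 0x09, 0x0E]]


def mix_column(col, matrix=MIX):
    """Multiply one 4-byte column by the given matrix over GF(2^8)."""
    out = []
    for row in matrix:
        acc = 0
        for coeff, byte in zip(row, col):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out


def add_round_key(col, key_word):
    """XOR a state column with one key schedule word; the step is its own inverse."""
    return [s ^ k for s, k in zip(col, key_word)]


column = [0xDB, 0x13, 0x53, 0x45]
mixed = mix_column(column)
print([hex(v) for v in mixed])
print(mix_column(mixed, INV_MIX) == column)
```

Applying `add_round_key` twice with the same word returns the original column, which is why the Inverse Cipher reuses the Add Round Key step unchanged.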


H. Inverse Row Shifting Transformation

"Inverse Row Shifting" is the inverse of the Row Shifting transformation used in the cipher. The bytes in the last three rows of the State are cyclically shifted over different numbers of bytes (offsets). The first row, r = 0, is not shifted. The bottom three rows are cyclically shifted by Nb - shift(r, Nb) bytes, where the shift value shift(r, Nb) depends on the row number. Specifically, the "Inverse Row Shifting" transformation proceeds as given in (10). Fig. 6 illustrates the Inverse Row Shifting transformation.

$$ s'_{r,\,(c + \mathrm{shift}(r, Nb)) \bmod Nb} = s_{r,c} \quad \text{for } 0 < r < 4 \text{ and } 0 \le c < Nb \qquad (10) $$

Fig. 6: Inverse Row Shifting cyclically shifts the last three rows in the State

I. Inverse Byte Substitution Transformation

In the Inverse Byte Substitution Transformation, the inverse S-box is applied to each byte of the State. This is obtained by applying the inverse of the affine transformation followed by taking the multiplicative inverse in $GF(2^8)$.

J. Inverse Columns Mixing Transformation

Inverse Columns Mixing operates on the State column by column, treating each column as a four-term polynomial. The columns are considered as polynomials over $GF(2^8)$ and multiplied modulo $x^4 + 1$ with a fixed polynomial $a^{-1}(x)$, given by

$$ a^{-1}(x) = \{0b\}x^3 + \{0d\}x^2 + \{09\}x + \{0e\} $$

This can be written as a matrix multiplication. Let $s'(x) = a^{-1}(x) \otimes s(x)$; then

$$
\begin{bmatrix} s'_{0,c} \\ s'_{1,c} \\ s'_{2,c} \\ s'_{3,c} \end{bmatrix}
=
\begin{bmatrix}
0e & 0b & 0d & 09 \\
09 & 0e & 0b & 0d \\
0d & 09 & 0e & 0b \\
0b & 0d & 09 & 0e
\end{bmatrix}
\begin{bmatrix} s_{0,c} \\ s_{1,c} \\ s_{2,c} \\ s_{3,c} \end{bmatrix}
\quad \text{for } 0 \le c < Nb
$$

K. Inverse Add Round Key Transformation

The Add Round Key operation used in the cipher is its own inverse because the only operation it uses is the bitwise XOR of two words. So the "Inverse Add Round Key Transformation" is the same as the "Add Round Key Transformation".

L. Key Expansion

The AES algorithm takes the Cipher Key K and performs a Key Expansion operation to generate a key schedule. The Key Expansion generates a total of Nb*(Nr + 1) words: the algorithm requires an initial set of Nb words, and each of the Nr rounds requires Nb words of key data. The resulting key schedule consists of a linear array of 4-byte words, denoted $[w_i]$, with i in the range 0 ≤ i < Nb*(Nr + 1).


III. IMPLEMENTATION

Every step of the design of the AES encryption and decryption is straightforward, but the implementation of the byte substitution and column mixing is somewhat tricky. The byte substitution and inverse byte substitution are performed using look-up tables created in the block RAM of the FPGA. The look-up table for byte substitution is shown in Table II.

Table II: Byte substitution look-up table

Encryption and decryption each consist of 10 rounds, and each round has its own unique key, which is derived from the original 128-bit key. The logic of the key expansion algorithm is designed to ensure that if one bit of the key is changed, the change affects all the round keys that follow. The original key is mapped to four 32-bit words [ω0, ω1, ω2, ω3]. The algorithm expands these words into a 44-word key schedule, which can be labelled ω0, ω1, ω2, ω3, ω4, ω5, ω6, ..., ω41, ω42, ω43. For each round, four words [ωi, ωi+1, ωi+2, ωi+3] are combined to form the round key, and from these the next four words [ωi+4, ωi+5, ωi+6, ωi+7] have to be found. As in Fig. 7, we can write

$$
\begin{aligned}
\omega_{i+4} &= \omega_i \oplus g \\
\omega_{i+5} &= \omega_{i+1} \oplus \omega_{i+4} \\
\omega_{i+6} &= \omega_{i+2} \oplus \omega_{i+5} \\
\omega_{i+7} &= \omega_{i+3} \oplus \omega_{i+6}
\end{aligned}
$$

Fig. 7: The general block diagram of the key expansion algorithm

To produce 'g' from ωi+3, the word is first rotated cyclically left by 1 byte. Each byte of the rotated word is then substituted using the substitution table in Table II. The substituted word is finally XORed with the round constant Rcon[i]. The round constant word Rcon[i] contains the values given by $[x^{i-1}, \{00\}, \{00\}, \{00\}]$, with $x^{i-1}$ being powers of x (x is denoted as {02}) in the field $GF(2^8)$ (here i starts at 1, not 0). For ω4, the value of i for Rcon[i] is 1, and it is increased by one in each round.
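The key expansion for a 128-bit key can be sketched in Python as follows. This is an illustration of the word recurrence and the 'g' function described above, not the Verilog implementation: the S-box is generated arithmetically instead of being read from block RAM, and the function names are hypothetical. The key used is the one quoted in the results section of this paper.

```python
def xtime(b):
    """Multiply a byte by {02} in GF(2^8)."""
    b <<= 1
    return (b ^ 0x11B) if b & 0x100 else b


def gf_mul(a, b):
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def sbox_entry(v):
    """Multiplicative inverse ({00} -> {00}) followed by the affine transformation."""
    inv = 0 if v == 0 else next(x for x in range(1, 256) if gf_mul(v, x) == 1)
    out = 0
    for i in range(8):
        bit = ((inv >> i) ^ (inv >> ((i + 4) % 8)) ^ (inv >> ((i + 5) % 8))
               ^ (inv >> ((i + 6) % 8)) ^ (inv >> ((i + 7) % 8)) ^ (0x63 >> i)) & 1
        out |= bit << i
    return out


SBOX = [sbox_entry(v) for v in range(256)]

# RCON[i] = x^(i-1) in GF(2^8), for i = 1 .. 10 (index 0 is unused)
RCON = [0, 1]
for _ in range(9):
    RCON.append(xtime(RCON[-1]))


def g(word, i):
    """Rotate left by one byte, substitute each byte, XOR with [Rcon[i], 0, 0, 0]."""
    rotated = word[1:] + word[:1]
    substituted = [SBOX[b] for b in rotated]
    return [substituted[0] ^ RCON[i]] + substituted[1:]


def expand_key(key):
    """Expand a 16-byte key into the 44-word AES-128 key schedule."""
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    for i in range(4, 44):
        temp = words[i - 1]
        if i % 4 == 0:
            temp = g(temp, i // 4)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return words


key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
schedule = expand_key(key)
print(bytes(schedule[4]).hex(), bytes(schedule[43]).hex())
```

For this key, the first derived word and the last word of the schedule agree with the key expansion example in FIPS 197, which gives a convenient reference when comparing against simulation waveforms.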

In the look-up table of Table II, for example, if s1,1 = {53}, then the substitution value is determined by the intersection of the row with index '5' and the column with index '3'. This results in s'1,1 having the value {ed}. The inverse byte substitution look-up table is simply the reverse mapping of this table. For example, row 5 and column 2 of the byte substitution table contains {00}, so row 0 and column 0 of the inverse byte substitution table contains {52}.

The column mixing operation requires matrix multiplication. A direct implementation of the multiplication of binary numbers can be very costly in terms of time and resources, so the shift-and-add method is used to perform the multiplication. Multiplying the binary polynomial in (14) by x gives (15).

$$ b(x) = b_7x^7 + b_6x^6 + b_5x^5 + b_4x^4 + b_3x^3 + b_2x^2 + b_1x + b_0 \qquad (14) $$

$$ x \cdot b(x) = b_7x^8 + b_6x^7 + b_5x^6 + b_4x^5 + b_3x^4 + b_2x^3 + b_1x^2 + b_0x \qquad (15) $$

The result x • b(x) is obtained by reducing the above result modulo m(x). If b7 = 0, the result is already in reduced form. If b7 = 1, the reduction is accomplished by subtracting (i.e., XORing) the polynomial m(x). It follows that multiplication by x (i.e., {00000010} or {02}) can be implemented at the byte level as a left shift and a subsequent conditional bitwise XOR with {1b}. This operation on bytes is denoted by xtime(). Multiplication by higher powers of x can be implemented by repeated application of xtime(). By adding intermediate results, multiplication by any constant can be implemented. For example, {57} • {13} = {fe} because

{57} • {02} = xtime({57}) = {ae}
{57} • {04} = xtime({ae}) = {47}
{57} • {08} = xtime({47}) = {8e}
{57} • {10} = xtime({8e}) = {07},

thus,

{57} • {13} = {57} • ({01} ⊕ {02} ⊕ {10}) = {57} ⊕ {ae} ⊕ {07} = {fe}.

IV. RESULTS

First, the key expansion algorithm was run, and its simulation result for the original encryption key {2b 7e 15 16 28 ae d2 a6 ab f7 15 88 09 cf 4f 3c} is shown in Fig. 8.

Fig. 8: The result of the key expansion process

The encryption algorithm is used to encrypt a fingerprint image, which is sent from the PC using the UART (Universal Asynchronous Receiver and Transmitter), stored in DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory), fetched from there sequentially, encrypted, and stored back. The encrypted image is decrypted using the decryption routine and stored in DDR SDRAM. The encrypted and decrypted images are sent back to the computer via the UART and displayed using a Python program. The experiment performed on a greyscale fingerprint image of size 400 × 300 (120,000 bytes of data for an 8-bit image) is shown in Fig. 10.

Fig. 9: The simulation of the encryption and decryption process (Note: please refer to the Appendix for an enlarged view of Fig. 9)


Fig. 10: Encrypted fingerprint image (a); decryption of the cipher image to recover the fingerprint image (b)

V. PERFORMANCE AND RESOURCES ANALYSIS

The resources utilized for the encryption and decryption process are shown in detail in Fig. 11.

Fig. 11: The resource utilization for the Encryption/Decryption module (Note: please refer to the Appendix for an enlarged view of Fig. 11)

ACKNOWLEDGEMENTS

It gives us immense pleasure to thank all the anonymous reviewers for their constructive comments and suggestions.

REFERENCES

1. Andrew S. Tanenbaum, David J. Wetherall. Computer Networks. Fifth edition. Prentice Hall, 2011, pp. 780-783.
2. Advanced Encryption Standard (AES). Federal Information Processing Standards Publication 197, 2001.
3. Joan Daemen, Vincent Rijmen. The Rijndael Block Cipher.
4. J. Daemen and V. Rijmen. AES Proposal: Rijndael, AES Algorithm Submission. September 3, 1999.
5. Ashwini R. Tonde, Akshay P. Dhande. Review Paper on FPGA Based Implementation of Advanced Encryption Standard (AES) Algorithm. January 2014.
6. Rourab Paul, Sangeet Saha, Suman Sau, Alan Chakrabarti. Design and Implementation of Real Time AES-128 on Real Time Operating System for Multiple FPGA Communications.


APPENDIX

Fig. 9: The simulation of the encryption and decryption process

Fig. 11: The resource utilization for the Encryption/Decryption module
