hardware implementation of the safer+ encryption ...

2 downloads 0 Views 51KB Size Report
properties and the VLSI architecture are described. The whole design was ... BLUETOOTH SECURITY ARCHITECTURE ... "Armenian Shuffle" Permutation. PHT.
Proceedings of IEEE International Symposium on Circuits & Systems (ISCAS'02), Vol. IV, pp. 878-881, USA, May 26-29, 2002

HARDWARE IMPLEMENTATION OF THE SAFER+ ENCRYPTION ALGORITHM FOR THE BLUETOOTH SYSTEM P. Kitsos, N. Sklavos and O. Koufopavlou Electrical and Computer Engineering Department University of Patras. Patras, GREECE E-mail: [email protected] ABSTRACT

built-in security to prevent eavesdropping and falsifying the message originator. Therefore, functionalism for authentication and encryption has been added to this technology. Authentication is used to prevent unwanted access to data and to prevent falsifying the message originator. Encryption is used to prevent eavesdropping. In this paper a hardware implementation of the SAFER+ encryption algorithm, for use in Bluetooth telecommunication system is presented. Comparing with other implementations [2], [3], our proposed implementation is focus on the encryption part only. This is due to the Bluetooth system requirements. The proposed implementation reduced the covered area constraints about 25 percent comparing with the others implementations. In addition, achieves throughput 320 Mbit/sec, comparing with the 59 Mbit/sec of the implementation in [2] and 704 Mbit/sec of the [3]. Finally, the proposed implementation, achieves better performance than the software implementations, in terms of encryption throughput by a factor ranging from 9 to 12500. The paper is organized as follows: In sections 2 and 3 a brief description of the Bluetooth security features and the SAFER+ algorithm are presented. The proposed VLSI architecture is described in section 4. Experimental and simulation results for the FPGA implementation is shown in section 5, and the paper conclusions are given in section 6.

In this paper, a VLSI implementation for the SAFER+ encryption algorithm is presented. The combination of security, and high speed implementation, makes SAFER+ a very good choice for wireless systems. The SAFER+ algorithm is a basic component in the authentication Bluetooth mechanism. The relation between the algorithm properties and the VLSI architecture are described. The whole design was captured entirely in VHDL language using a bottom-up design and verification methodology. FPGA device was used for the hardware implementation of the algorithm. The proposed VLSI implementation of the SAFER+ algorithm reduces the covered area about 25 percent, and achieves a data throughput up to 320 Mbit/sec at a clock frequency of 20 Mhz.

1. INTRODUCTION Wireless communication technology has advanced at a very fast pace during the last years, creating new applications and opportunities. In addition, the number of computing and telecommunications devices is increasing. Special attention has to be given in order to connect efficiently these devices. In the past, cable and infrared light connectivity methods were used. The cable solution is complicated since it requires special connectors, cables and space. This produces a lot of malfunctions and connectivity problems. The infrared solution requires line of sight. In order to solve these problems a new technology, named Bluetooth, has been developed. With this communication system, users are able to connect a wide range of computing and telecommunications devices easily and simply, without the need for connecting cables. Bluetooth technology as a cable replacement technique exposes the need for security functionalism in the wireless solution. By replacing the cable and introducing radio signals there is a need for the Bluetooth device to have

2. BLUETOOTH SECURITY ARCHITECTURE The Bluetooth specifications include security features at the link level [1]. They support authentication (unidirectional or mutual) and encryption (Fig. 1). These features are based on a secret link key that is shared by a pair of devices. Authentication is based on a challenge-response procedure [1], where the verifier issues a random number. This random number is encrypted by the claimant (the other unit) using the shared key and sent back to the

1

Proceedings of IEEE International Symposium on Circuits & Systems (ISCAS'02), Vol. IV, pp. 878-881, USA, May 26-29, 2002 verifier. The verifier checks if the encrypted message is correct. PIN

PIN

E2

E2

diffusion, and second, they use additive constant factors (Bias vectors) in the scheduling for weak keys avoidance. 0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

K2r-1 128

e

Authentication

Link Key

E1

l

e

e

l

l

e

e

l

l

e

e

l

l

e

K 2r 128

E3

PHT

E3

Encryption E0

Figure 1: Bluetooth Security Architecture The authentication scheme is performed by using a stream-cipher (Ar), a common key and a Bluetooth address (Fig. 2). RAND

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

"Armenian Shuffle" Permutation

Encryption Key

Rounds r, r = 1,2,...,8

Encryption Key

l

Link Key

PHT

PHT

PHT

PHT

PHT

PHT

"Armenian Shuffle" Permutation

PHT

PHT

PHT

PHT

PHT

PHT

"Armenian Shuffle" Permutation

PHT

ADDRESS

PHT

PHT

PHT

PHT

PHT

K 17 K

128

Ar

0

ADD

2

3

4

5

6

7

8

9

10

11

12

13

14

15

Figure 3: One Round of the SAFER+ Algorithm

XOR

offset

1

Extent

The SAFER+ algorithm consists of a set of eight round and a parallel mechanism for generating the round keys, which are used in each round [1]. The algorithm produces a 128 bits result from a 128 bit “random” input string and a 128 bits round key (Fig. 3). More details about the SAFER+ algorithm are described in [1], [2] and [7].

L

Ar'

SRES / ACO

Figure 2: Bluetooth Authentication Scheme

4. PROPOSED ARCHITECTUR E

The cipher seed generator is a version of the public SAFER+ cipher, which takes a 128 bits key and a 128 bits data block and proceeds the encrypted data [2]. The key exchange procedure is based on a common PIN number, which, entered manually by user on both communication units. It is then used to encrypt a common random number. The SAFER+ algorithm is an enhanced version of an existing 64 bits block cipher SAFER K-128.

The architecture for the hardware implementation of the SAFER+ algorithm consists of the two main components, the data encryption path and the key scheduling. The round keys are applied in parallel in the encryption data path. In addition the data encryption path consis ts of an encryption round structure. For the implementation of the total encryption algorithm the data are applied in a repeated manner to the round structure by using a Feedback Register and an Input Buffer. In this proposed design the whole single round of the SAFER+ algorithm is implemented in hardware. In order to run the whole SAFER+ algorithm eight loops of the single round implementation are needed (Fig. 4). The single round implementation is chosen because the required system throughput can be achieved and in the same time the covered area is minimized.

3. DESCRIPTION OF SAFER+ ALGORITHM The SAFER+ (Secure And Fast Encryption Routine) algorithm is based on the existing SAFER family of ciphers, which comprises the ciphers SAFER K-64, SAFER K-128, SAFER SK-128. They have been developed by James L. Massey at the ETH Zurich [4], [5], [6]. All algorithms are byte-oriented block encryption algorithms, which are characterized by the following two properties. First, they use a non-orthodox linear transformation, which, is called Pseudo-Hadamard-Transformation (PHT) for the desired

2

Proceedings of IEEE International Symposium on Circuits & Systems (ISCAS'02), Vol. IV, pp. 878-881, USA, May 26-29, 2002 Input Key

Control1

Key scheduling

128

implemented by using one look-up table (LUT) with 256 bytes. The Data Demapping unit performs the reverse function of the Data Mapping. This design results in a 25 percent reduction of the covered area comparing to the conventional implementation. After that, the output of the nonlinear layer is added by a mixed byte-addition/XOR with the round key K2r [8]. The operations after that are four Pseudo-Hadamard-Transformation (PHT) layers that are connected through by three permutations. The PHT boxes defined as PHT (in1, in 2) = ( 2in1 + in 2, in1 + in 2) . The outputs of the PHT, out1 = 2in1 + in2 and out2 = in1 + in2 are implemented over in GF(256). The design of PHT element is shown in Fig. 6.

Control Unit

128

Round Keys

Control2

Plaintext

Ciphertext

one SAFER+ encryption round

128 128 128

128

128

Input Register

128

Feedback Register

Figure 4: SAFER+ Block Diagram Hardware Implementation A multiplexer is used in order to feed the one SAFER+ round with the Plaintext in the first round, and for the next seven round with the round encryption internal data. The detailed design of the one SAFER+ encryption round unit with the Input and Feedback registers is shown in Fig. 5.

In1

In2 8

Plaintext 128

SHL 1 Clock Reset

0

1

8

2 8

3 8

4 8

INPUT REGISTER(BUFFER) 6 7 8 9 10 11 12 13 14 15

5

8

8

8

8

8

8

8

8

8

8

8

x

x2

Data_ready Reset_Data_ready

8

8

K2r-1 128

8

8

8

Data Mapping for e , l Functions x1 8 x2 8

e

l

+

y1 8 y2 8 Data Demapping for e , l Functions

+

K2 r 128

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

PHT

"Armenian Shuffle" Permutation

PHT

PHT

PHT

PHT

PHT

PHT

Out1

Round_complete

"Armenian Shuffle" Permutation

PHT

PHT

PHT

PHT

PHT

PHT

0

2

4

6 7 8 9 10 11 12 13 14 15 FEEDBACK REGISTER

Out2

Figure 6: The PHT Implementation Multiplication by 2 can be achieved by one bit left wired shift. The four linear PHT layers connected through the permutations. The permutation boxes show how input byte indices are mapped into the output byte indices. Thus, position 0 (leftmost) is mapped on position 8, position 1 is mapped on position 11, etc.

"Armenian Shuffle" Permutation

PHT

8

8

PHT

only after last round

K1 7

128

1

3

5

Reset_Complet e

128

Key Sel

128

128 bits Key Register

Complete Ciphertext

8

8

Sum bytes bit-by-bit modulo two

Figure 5: The One SAFER+ Encryption Round M U X 2

The output of the Input Register is used as input for the mixed XOR/byte-addition unit with subkey K2r-1 [8]. Symbol denotes bit-by-bit XOR (modulo-2 sum), and symbol denotes additions mod 256 which is implemented over GF(256). Then the set of the 16 8-bit outputs is used by the Data Mapping unit, which produced the X1 and X2 (8-bit words). These words are the input of the non-linear layer units (e and l). The nonlinear function “e” is implemented as y 1 = 45 x 1 in

Select bytes 8

M U X 1

K2r-1 128

8

Bias2r-1

Rotate each octet left by 3 bit positions 8

8

8

8

K 2r Select bytes

+16 128

Bias2r

Rotate each octet left by 3 bit positions 128

GF(257) with the exception 45128 = 0 . The “l” function is implemented as y 2 = log 45 ( x 2 ) in GF(257) with the exception

+16

Select bytes

K 17 +16 128

Bias17

log 45 (0) = 128 . Each non-linear function is

Figure 7: VLSI Implementation of the Key Scheduling

3

Proceedings of IEEE International Symposium on Circuits & Systems (ISCAS'02), Vol. IV, pp. 878-881, USA, May 26-29, 2002 The process of keys generation is characterized by the 128-bit user-selected key, which divided into 16 bytes. The round keys (Key Scheduling) generation is illustrated in Fig. 7. The round keys r, (r=1, 2,…, 8) are denoted as K 2r −1 [ j ], K 2 r [ j], j = 0,1,2...,15 . The user-selected key itself is

Architecture

Rounds

Frequency

Archit. in [3]

8

44 MHz

704 Mb/sec

used as the first round key K 1[ j ] . Finally, the Key Scheduling use 16 16-byte bias vectors, B2, B3,...,B17 which are computed according to the equation [1]:

Archit. in [2] Software1 [3] Software2 [2]

8 8 8

62 MHz 200 MHz 16 MHz

58.9 Mb/sec 33 Mb/sec 25.6 Kb/sec

Proposed

8

20 MHz

320 Mb/sec

17 p +i +1

B[ p, i] = ((45(45

mod257 )

Throughput

Table II: Performance Comparisons

mod 257 ) mod 256) 6. CONCLUSIONS In this paper, a high-speed implementation for the SAFER+ algorithm is proposed. With the aid of iterative looping structure a very high data throughput of 320 Mbit/sec at a clock of 20 Mhz was achieved, with 25 percent covered area reduction. The whole design was captured entirely in VHDL language and implemented in a XILINX VIRTEX XCV400 FPGA device. Measurement results and comparisons between the proposed and previous hardware and software implementations are presented.

for i=0,…15. The B[p,i] correspondence the i-th byte of the bias vector Bp. The bias vectors are implemented by using one 16x16-byte look-up table (LUT). The Key scheduling design allows an on-the-fly computation of the round keys because the Bluetooth demand only the encryption mode of the SAFER+ algorithm. This means that there is no time delay for the key calculation. 5. EXPERIMENTAL AND SIMULATION RESULTS The whole design was captured entirely in VHDL language. All of the system components have been described with structural architecture. The proposed architecture is synthesized by using FPGA device of XILINX [9]. The system was simulated, for the verification of the correct functionality in real time operation conditions. The test vectors provided by [7] for the SAFER+ algorithm were applied to assure that the design was working as intended. Measurements for the proposed SAFER+ implementation resource utilization are presented in Table I. Target Device VIRTEX XCV400 [9] Speed Grade -3 Resource Used Avail Utilization I/Os 392 404 97 % Functions 8335 9600 87 % Generators CLB Slices 4168 4800 87 % Dffs or Latches 810 9600 8.5 %

7.

REFERENCES

[1] “Specification of the Bluetooth System”, Specification Volume 1, Version 1.1, February 22, 2001. [2] J. L. Massey, G. H. Khachatrian, M. K. Kuregian, “SAFER+ Cylink Corporation’s Submission for the Advanced Encryption Standard”, First Advanced Encryption Standard Candidate Conference, Ventura, CA, August 20-22, 1998. [3] J. L. Massey, “On the Optimality of SAFER+ Diffusion”, Second Advanced Encryption Standard Candidate Conference (AES2), Rome, Italy, March 22-23, on line available at http://csrc.nist.gov/encryption/aes/round1/conf2/aes2conf.htm. [4] J. L. Massey, “SAFER K-64: A Byte-Oriented Block Ciphering Algorithm”, Fast Software Encryption, Proceedings of the Cambridge Security Workshop, Cambridge, U.K, 1998, pp. 1-17. [5] J. L. Massey, “SAFER K-64: One Year Later”, Fast Software Encryption, Lectures Notes in Computer Science No. 809, New York, Springer, pp. 212-241, 1994. [6] B. Schneier, “Applied Cryptography, Protocols, Algorithms, and Source Code in C”, John Wiley & Sons 1994. [7] J. L. Massey, G. H. Khachatrian, M. K. Kuregian, “Nomination of SAFER+ as Candidate Algorithm for the Advance Encryption Standard”, First Advanced Encryption Standard Candidate Conference, Ventura, CA, August 20-22, 1998. [8] A. Schubert, V. Meyer, W. Anheier, “Reusable Cryptographic VLSI Core Based on the SAFER K-128 Algorithm with 251,8 Mbits/s Throughput”, IEEE Workshop on Signal Processing Systems, 1998, pp. 437-446. [9] Xilinx, San Jose, California, USA, Virtex, 2.5 V Field Programmable Gate Arrays, 2001, www.xilinx.com

Table I: Resource Utilization Measurements Performance comparisons between the proposed and previous designs are shown in Table II. The software1 implementation was coded in C and executed on 200 MHz Pentium PC [2], [3], and software2 implementation was coded in Assembly on 8-bit Processors of the MCS 51 family [2]. The encryption throughput of the proposed implementation is 5 times higher than the one proposed in [2], and many times higher comparing to the software implementations.

4

Suggest Documents