FPGA Implementation of Elliptic Curve Point Multiplication over GF(2^191)

Sameh M. Shohdy, Ashraf B. El-sisi, and Nabil Ismail
Computer Science Department, Faculty of Computers and Information, Menoufiya University, Shebin Elkom 32511, Egypt

Abstract. Hardware acceleration of cryptographic algorithms is beneficial because it can deliver considerable performance improvements over software implementations, which makes hardware implementations suitable for critical applications that require high encryption or decryption speeds. A parallel architecture with an efficient hardware implementation of Galois field arithmetic is used to speed up scalar multiplication, the main operation in an Elliptic Curve Cryptography (ECC) system. This work proposes a modification of the Karatsuba-Ofman algorithm, one of the best algorithms for multiplication over a Galois field. The modification truncates the Karatsuba-Ofman recursion at a low level and uses the classic polynomial multiplication algorithm below that level. In addition, this work proposes an architecture for implementing ECC in hardware using the Montgomery algorithm in projective coordinates. The results show that the proposed architecture computes a GF(2^191) elliptic curve scalar multiplication in 72.939 µs on a Xilinx Virtex-II XC2V6000 FPGA device and in 100.68 µs on a Xilinx VirtexE 2600. The proposed architecture can also be adapted to an arbitrary Galois field size with little modification.

Keywords: Galois field, elliptic curve cryptography, Karatsuba-Ofman multiplier, field programmable gate arrays, polynomial multiplication, polynomial inversion.

J.H. Park et al. (Eds.): ISA 2009, LNCS 5576, pp. 619–634, 2009. © Springer-Verlag Berlin Heidelberg 2009

1 Introduction

In the internet age, information is transferred through different media, in several ways, and using dissimilar methods. The value of the transferred information motivates eavesdroppers to attack the privacy of the data, which makes information security essential for guaranteeing privacy. Cryptography is the main method for guaranteeing secure data transfer through any medium. The sender encrypts a message from plain text to cipher text, so that even if someone eavesdrops on the medium, they cannot understand the encrypted message. The receiver is the only person who can decrypt the message from cipher form back to plain form and read it.

Cryptography systems differ in several respects. One is how the encrypt and decrypt processes are done. Symmetric cryptography systems depend on one secret key used for both encryption and decryption, so this key must remain private between sender and receiver; if the key has to be transferred between the two parties, the system must guarantee a secure channel for that transfer. Asymmetric cryptography systems depend on two keys, a public key and a private key. The public key is used to encrypt the message and is available to any party; the private key decrypts the message and is held by the receiver only, which guarantees that the receiver is the only one who can read the message. Cryptography systems also differ in the mathematical problem that makes it so difficult to compute the private key from the public key even though the two are related: for example, Discrete Logarithm (DL) systems (e.g. Diffie-Hellman, DSA), Integer Factorization (IF) systems (e.g. RSA), and Elliptic Curve Discrete Logarithm (EC) systems. Each algorithm can be implemented in software or in hardware. Software implementations are easier to build but slower and less secure; hardware implementations are harder to build but faster and more secure. Fig. 1 illustrates the ECC operations that an Arithmetic/Logic Unit (ALU) for an ECC processor can include.

Fig. 1. ALU for ECC Processor

This work implements one of the modern cryptography systems, Elliptic Curve Cryptography (ECC), on FPGA hardware. ECC is preferred over classical cryptosystems such as RSA because of its higher speed and lower power consumption: ECC guarantees the same security level as RSA with a shorter key size [1]. Elliptic curves are used in many applications (e.g. digital signatures and authentication protocols).


This paper is organized as follows. Section 2 describes the mathematical background of Galois fields and elliptic curve cryptography. Section 3 describes the architecture and design of the Galois field arithmetic and ECC operations. Section 4 presents results and a comparison for the implementation of ECC scalar multiplication in GF(2^191). Finally, Section 5 draws some concluding remarks and outlines future work.

2 Mathematical Background

2.1 Galois Field Arithmetic

A Galois field, or finite field, is written GF(p^m) and is a field with a finite number of elements. Galois field arithmetic plays a critical role in an elliptic curve cryptography implementation because it is the core of the ECC scalar multiplication operation, so a more efficient implementation of the underlying field operations makes the overall algorithm more efficient. The Galois fields suitable for ECC fall into two categories: prime fields, where m = 1, and binary fields, where p = 2 and m > 1. A binary field is the better fit for hardware because field elements are easily implemented as bit vectors (polynomial basis representation) and addition has no carry propagation. The Montgomery algorithm applied to a Weierstrass elliptic curve depends on four main field operations: multiplication, addition, squaring, and inversion. The algorithms that define arithmetic in a binary field depend on how field elements are represented. This section presents the finite field arithmetic operations in GF(2^m).

2.1.1 Addition Operation

Addition needs only one m-bit XOR operation (equal to bitwise addition modulo 2). The sum of two elements A, B ∈ GF(2^m) is given by equation (1).

$C(x) = A(x) + B(x) = \sum_{i=0}^{m-1} (a_i \oplus b_i)\, x^i$   (1)
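As a small illustration of equation (1), the sketch below stores a field element as a Python integer whose bit i is the coefficient of x^i. The function name `gf2m_add` is a name chosen here for illustration and does not come from the paper.

```python
def gf2m_add(a: int, b: int) -> int:
    """Add two GF(2^m) elements held as bit vectors (bit i = coefficient of x^i)."""
    return a ^ b  # coefficient-wise addition modulo 2, no carries


a = 0b1011  # x^3 + x + 1
b = 0b0110  # x^2 + x
print(bin(gf2m_add(a, b)))  # 0b1101, i.e. x^3 + x^2 + 1
```

Because every element is its own additive inverse, `gf2m_add(a, a)` is always 0, which is why subtraction and addition coincide in a binary field.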

2.1.2 Square Operation

The binary representation of the square of an element is obtained by inserting a 0 bit between consecutive bits of the element's binary representation. The square of A ∈ GF(2^m) is given by equation (2).

$A^2(x) = \sum_{i=0}^{m-1} a_i\, x^{2i} \bmod P(x)$   (2)

2.1.3 Multiplication Operation

Assume we have two elements A(x), B(x) that belong to the binary field GF(2^m) with irreducible polynomial P(x). Field multiplication is done in two steps:

1. Polynomial multiplication of A(x) and B(x): $C'(x) = A(x) \cdot B(x)$
2. Reduction using the irreducible polynomial P(x): $C(x) = C'(x) \bmod P(x)$
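The squaring rule and the two-step multiplication above can be sketched in a few lines. This is a minimal software model, not the paper's hardware design; the helper names and the toy field GF(2^4) with P(x) = x^4 + x + 1 are assumptions made for a small, checkable example.

```python
def poly_square(a: int) -> int:
    """Square a binary polynomial by inserting a 0 bit between consecutive bits."""
    result, i = 0, 0
    while a:
        if a & 1:
            result |= 1 << (2 * i)  # coefficient of x^i moves to x^(2i)
        a >>= 1
        i += 1
    return result


def poly_mul(a: int, b: int) -> int:
    """Step 1: classic shift-and-add polynomial multiplication over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_mod(c: int, p: int) -> int:
    """Step 2: reduce c modulo p by cancelling the leading term repeatedly."""
    deg_p = p.bit_length() - 1
    while c.bit_length() - 1 >= deg_p:
        c ^= p << (c.bit_length() - 1 - deg_p)
    return c


def gf_mul(a: int, b: int, p: int) -> int:
    """Field multiplication: polynomial product followed by reduction."""
    return poly_mod(poly_mul(a, b), p)


P_TOY = 0b10011  # x^4 + x + 1, a toy irreducible polynomial (assumption)
print(bin(poly_square(0b0110)))        # 0b10100
print(gf_mul(0b0110, 0b0111, P_TOY))   # 1
```

In this toy field the elements x^2 + x and x^2 + x + 1 multiply to 1, so they are inverses of each other.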

Many multipliers address the problem of computing the polynomial multiplication. Some are suitable for hardware, such as the polynomial basis LFSR multiplier [2], the Massey-Omura multiplier [3], the Berlekamp multiplier [4], and the Karatsuba multiplier [5]. The work in [6] suggests a modified version of the Karatsuba-Ofman multiplier, called the binary Karatsuba-Ofman multiplier, which multiplies two elements of a finite field GF(2^m) where m is an arbitrary number.

2.1.4 Reduction Operation

As mentioned above, multiplication and squaring need a reduction step, which reduces the degree of the result from m or more to at most m − 1.

$C(x) = C'(x) \bmod P(x)$
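The truncation idea described in the abstract (stop the Karatsuba-Ofman recursion at a low level and hand the small operands to the classic multiplier), followed by the reduction step above, can be sketched as below. This is a generic split-in-half Karatsuba, not necessarily the exact binary Karatsuba-Ofman variant of [6]. The threshold of 8 bits matches the truncation width mentioned in Section 3.1.3; the trinomial x^191 + x^9 + 1 is an assumption, chosen because it is a commonly used irreducible trinomial for this field, and all function names are illustrative.

```python
import random

THRESHOLD = 8   # operand width (bits) at or below which the classic multiplier is used
M, N = 191, 9   # assumed reduction trinomial x^191 + x^9 + 1


def classic_mul(a: int, b: int) -> int:
    """Classic shift-and-add polynomial multiplication over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def karatsuba_mul(a: int, b: int, n: int) -> int:
    """Multiply two binary polynomials of at most n bits, truncating the recursion."""
    if n <= THRESHOLD:
        return classic_mul(a, b)
    half = (n + 1) // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = karatsuba_mul(a_lo, b_lo, half)
    hi = karatsuba_mul(a_hi, b_hi, n - half)
    mid = karatsuba_mul(a_lo ^ a_hi, b_lo ^ b_hi, half)
    # three sub-products instead of four: the cross term is mid + lo + hi
    return (hi << (2 * half)) ^ ((mid ^ lo ^ hi) << half) ^ lo


def reduce_trinomial(c: int) -> int:
    """Reduce modulo x^M + x^N + 1 using x^M = x^N + 1 (XOR and shifts only)."""
    mask = (1 << M) - 1
    while c >> M:
        hi = c >> M
        c = (c & mask) ^ hi ^ (hi << N)
    return c


a, b = random.getrandbits(M), random.getrandbits(M)
product = karatsuba_mul(a, b, M)
assert product == classic_mul(a, b)          # same polynomial product
assert reduce_trinomial(product) < (1 << M)  # reduced result fits in m bits
```

Raising `THRESHOLD` trades recursion depth for larger classic multipliers, which is the area and timing trade-off the paper explores in hardware.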

2.1.5 Inversion Operation

Inversion is the most time-consuming operation when computing the scalar multiplication in elliptic curve cryptography using the Montgomery method, which is discussed later. Inversion in a binary finite field is the process of finding a^{-1} for a nonzero element a ∈ GF(2^m) such that:

$a \cdot a^{-1} \equiv 1 \pmod{P(x)}$

Several algorithms exist for computing the inverse, such as the Standard Division Algorithm [8], the extended Euclidean algorithm (EEA), the Modified Almost Inverse Algorithm (MAIA) [9], the binary inversion algorithm, and the Itoh-Tsujii multiplicative inverse algorithm. Each takes an element a and the irreducible polynomial f(x) as input and outputs the inverse a^{-1}. The extended Euclidean algorithm depends on the classical Euclidean algorithm (EA), which computes the greatest common divisor (gcd) of two polynomials A and B. The EA is defined as follows: let A and B be binary polynomials, not both zero. The greatest common divisor of A and B is the binary polynomial D of highest degree that divides both A and B without remainder, where deg(B) ≤ deg(A). The EEA is a modification of the EA, and computing the inverse of an element of a binary finite field is one of its applications. Suppose A has an inverse modulo P(x), where P(x) is an irreducible polynomial. Then B is the inverse of A, that is A · B = 1 mod P(x), if and only if gcd(A, P(x)) = 1. If this holds, there exist polynomials B and S such that:

$B \cdot A + S \cdot P(x) = 1$,

$B \cdot A = 1 + S \cdot P(x)$,

$B \cdot A \equiv 1 \pmod{P(x)}$


Algorithm 1 defines the EEA-based inversion, which gives a method for calculating a^{-1} efficiently. The loop in Algorithm 1 terminates when u = 1, at which point g1 is the inverse of a.

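Algorithm 1. Extended Euclidean Algorithm [10]

Input: a nonzero binary polynomial a(x) of degree at most m − 1, and the irreducible polynomial f(x) of degree m.
Output: a^{-1} mod f(x).
1. u ← a, v ← f.
2. g1 ← 1, g2 ← 0.
3. While u ≠ 1 do
   3.1 j ← deg(u) − deg(v).
   3.2 If j < 0 then: u ↔ v, g1 ↔ g2, j ← −j.
   3.3 u ← u + x^j · v.
   3.4 g1 ← g1 + x^j · g2.
4. Return (g1).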
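A direct software transcription of the extended Euclidean inversion may help when checking a hardware inverter against known answers. The sketch below follows the steps of the EEA-based inversion described in [10]; the function names and the toy field GF(2^4) with f(x) = x^4 + x + 1 are assumptions for illustration.

```python
def gf2m_inverse(a: int, f: int) -> int:
    """Invert a nonzero binary polynomial a modulo the irreducible polynomial f."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    u, v = a, f
    g1, g2 = 1, 0
    while u != 1:
        j = u.bit_length() - v.bit_length()  # deg(u) - deg(v)
        if j < 0:
            u, v = v, u
            g1, g2 = g2, g1
            j = -j
        u ^= v << j      # u <- u + x^j * v
        g1 ^= g2 << j    # g1 <- g1 + x^j * g2
    return g1


def mul_mod(a: int, b: int, f: int) -> int:
    """Reference field multiplication, used only to check the inverse."""
    deg_f = f.bit_length() - 1
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> deg_f:
            a ^= f
    return r


F_TOY = 0b10011  # x^4 + x + 1 (toy field, assumption)
inv = gf2m_inverse(0b0110, F_TOY)
print(inv, mul_mod(0b0110, inv, F_TOY))  # 7 1
```

The number of loop iterations depends on the operand, which is consistent with the paper's remark that the inverter's latency cannot be stated exactly.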

2.2 Elliptic Curve Arithmetic

Elliptic curves are defined over a chosen finite field. This paper uses a Weierstrass non-supersingular elliptic curve defined over GF(2^m), where m = 191. A Weierstrass non-supersingular elliptic curve E(GF(2^m)) is defined to be the set of points (x, y) ∈ GF(2^m) × GF(2^m) that satisfy equation (3).

$y^2 + xy = x^3 + a x^2 + b$, such that a, b ∈ GF(2^m) and b ≠ 0   (3)

The main operation in ECC is scalar multiplication, Q = k · P, where k is an integer, P is a point on the selected curve, and Q is the point that results from multiplying P by k. There is no multiplication operation in elliptic curve groups; the scalar product kP is obtained by adding k copies of the same point P. The security of elliptic curve systems is based on the difficulty of the elliptic curve discrete logarithm problem (ECDLP), which is defined as follows: given an elliptic curve E defined over a Galois field GF(2^m) and two points Q and P that belong to the curve, find the integer k such that Q = kP. Pollard's rho is one of the popular algorithms for solving the ECDLP. The largest ECDLP instance solved with Pollard's rho algorithm is for an elliptic curve over a 109-bit prime field, as reported in [10]. Different methods have been proposed for computing the scalar multiplication in elliptic curve cryptosystems. Ref. [13] describes one of the most efficient algorithms for this operation, Montgomery's method, and Ref. [14] presents an efficient implementation of this algorithm for scalar multiplication on non-supersingular elliptic curves over GF(2^m). This work uses the Montgomery algorithm to compute kP in GF(2^m).



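2.2.1 Montgomery Group Law Corresponding to Affine Coordinates

The Montgomery algorithm needs an implicit computation of point addition and point doubling in affine or projective coordinates. Let P = (x1, y1) and Q = (x2, y2) ∈ E(GF(2^m)). For all points on the curve we have P + Q = (x3, y3), where

$x_3 = \begin{cases} \left(\dfrac{y_1+y_2}{x_1+x_2}\right)^2 + \dfrac{y_1+y_2}{x_1+x_2} + x_1 + x_2 + a, & P \neq Q \\ x_1^2 + \dfrac{b}{x_1^2}, & P = Q \end{cases}$   (4)

$y_3 = \begin{cases} \left(\dfrac{y_1+y_2}{x_1+x_2}\right)(x_1+x_3) + x_3 + y_1, & P \neq Q \\ x_1^2 + \left(x_1 + \dfrac{y_1}{x_1}\right) x_3 + x_3, & P = Q \end{cases}$   (5)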
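To make equations (4) and (5) concrete, the following sketch evaluates the affine group law over a toy field. Everything about the setup is an assumption chosen for a small, checkable example: the field GF(2^4) with x^4 + x + 1, the curve coefficients a = b = 1, the helper names, and the use of `None` for the point at infinity. It is not the paper's GF(2^191) curve.

```python
M_BITS = 4        # toy field GF(2^4) (assumption)
POLY = 0b10011    # irreducible polynomial x^4 + x + 1
A, B = 1, 1       # hypothetical curve coefficients, b != 0


def fmul(a: int, b: int) -> int:
    """Multiply in GF(2^M_BITS): shift-and-add with interleaved reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M_BITS:
            a ^= POLY
    return r


def finv(a: int) -> int:
    """Invert a nonzero element as a^(2^m - 2); simple, not the EEA."""
    r = 1
    for _ in range(2 ** M_BITS - 2):
        r = fmul(r, a)
    return r


def on_curve(P) -> bool:
    """Check y^2 + xy = x^3 + a*x^2 + b; None is the point at infinity."""
    if P is None:
        return True
    x, y = P
    x2 = fmul(x, x)
    return fmul(y, y) ^ fmul(x, y) == fmul(x2, x) ^ fmul(A, x2) ^ B


def affine_add(P, Q):
    """Affine point addition and doubling following equations (4) and (5)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        if y1 != y2 or x1 == 0:
            return None                    # Q = -P, or P has order 2
        x1sq = fmul(x1, x1)
        x3 = x1sq ^ fmul(B, finv(x1sq))    # x1^2 + b / x1^2
        lam = x1 ^ fmul(y1, finv(x1))      # x1 + y1 / x1
        return (x3, x1sq ^ fmul(lam, x3) ^ x3)
    lam = fmul(y1 ^ y2, finv(x1 ^ x2))     # (y1 + y2) / (x1 + x2)
    x3 = fmul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
    return (x3, fmul(lam, x1 ^ x3) ^ x3 ^ y1)


P = (1, 0b0110)
print(on_curve(P), affine_add(P, P))  # True (0, 1)
```

Each call to `affine_add` performs exactly one field inversion, which is the cost that the projective formulas of the next section are designed to avoid.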

Equations 4 and 5 show that point addition in affine coordinates costs one field inversion and two field multiplications, neglecting the costs of field additions and squarings.

2.2.2 Montgomery Group Law Corresponding to Projective Coordinates

Point doubling Operation

Using projective coordinates to compute the elliptic curve scalar multiplication avoids the inversion operation in the underlying field, which is the most time-consuming operation. In projective coordinates, with P = (X1, Y1, Z1), point doubling can be computed as 2P = (X3, Y3, Z3), where

$X_3 = X_1^4 + b \cdot Z_1^4$   (6)

$Z_3 = X_1^2 \cdot Z_1^2$   (7)
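The link between the projective doubling formulas and the affine doubling rule x(2P) = x_1^2 + b/x_1^2 can be made explicit with a short derivation. The sketch below assumes the usual projective convention x_1 = X_1/Z_1, which the text does not state explicitly.

```latex
\begin{align*}
x(2P) &= x_1^2 + \frac{b}{x_1^2}
       = \frac{X_1^2}{Z_1^2} + \frac{b\,Z_1^2}{X_1^2} \\
      &= \frac{X_1^4 + b\,Z_1^4}{X_1^2\,Z_1^2}
       = \frac{X_3}{Z_3},
\qquad X_3 = X_1^4 + b\,Z_1^4,\quad Z_3 = X_1^2\,Z_1^2 .
\end{align*}
```

The division is deferred into the denominator Z_3, so no field inversion is needed until the final conversion back to affine coordinates.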

As equations 6 and 7 show, the cost of point doubling is one general multiplication, one multiplication by a constant, five squarings, and one addition. Algorithm 2 computes point doubling using the Montgomery method.

Algorithm 2. Montgomery point doubling [14]

Input: P = (X1, −, Z1), c such that c^2 = b
Output: 2P = M_double(X1, Z1)
1. T ← X1^2
2. M ← c · Z1^2
3. Z3 ← T · Z1^2
4. M ← M^2
5. T ← T^2
6. X3 ← T + M
7. Return (X3, Z3)
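A quick way to gain confidence in a doubling datapath is to compare the ratio X3/Z3 it produces with the affine value x^2 + b/x^2. The sketch below does this over a toy field; the field GF(2^4), the coefficient b, the helper names, and the intermediate variable names are assumptions for illustration only.

```python
M_BITS = 4        # toy field GF(2^4) (assumption)
POLY = 0b10011    # irreducible polynomial x^4 + x + 1
B = 0b0110        # hypothetical curve coefficient b


def fmul(a: int, b: int) -> int:
    """Multiply in GF(2^M_BITS): shift-and-add with interleaved reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M_BITS:
            a ^= POLY
    return r


def finv(a: int) -> int:
    """Invert a nonzero element as a^(2^m - 2)."""
    r = 1
    for _ in range(2 ** M_BITS - 2):
        r = fmul(r, a)
    return r


def fsqrt(a: int) -> int:
    """Square root in GF(2^m): a^(2^(m-1)), i.e. m - 1 successive squarings."""
    for _ in range(M_BITS - 1):
        a = fmul(a, a)
    return a


C = fsqrt(B)  # constant c with c^2 = b


def mdouble(X1: int, Z1: int):
    """Projective doubling: X3 = X1^4 + b*Z1^4, Z3 = X1^2 * Z1^2."""
    t = fmul(X1, X1)          # X1^2
    z1sq = fmul(Z1, Z1)       # Z1^2
    m = fmul(C, z1sq)         # c * Z1^2  (multiplication by a constant)
    Z3 = fmul(t, z1sq)        # general multiplication
    m = fmul(m, m)            # b * Z1^4
    t = fmul(t, t)            # X1^4
    return t ^ m, Z3


x = 0b0011
X3, Z3 = mdouble(x, 1)
affine = fmul(x, x) ^ fmul(B, finv(fmul(x, x)))
assert fmul(X3, finv(Z3)) == affine
```

Only two of the operations are true multiplications; the rest are squarings and one XOR, which matches the cost stated for the doubling step.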



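Point addition Operation

Similarly, in projective coordinates, with P = (X1, Y1, Z1) and Q = (X2, Y2, Z2), point addition can be computed as P + Q = (X3, Y3, Z3), where x denotes the affine x-coordinate of the base point (the difference of the two points being added):

$Z_3 = (X_1 \cdot Z_2 + X_2 \cdot Z_1)^2$   (8)

$X_3 = x \cdot Z_3 + (X_1 \cdot Z_2) \cdot (X_2 \cdot Z_1)$   (9)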

Algorithm 3 computes point addition using the Montgomery method.

Algorithm 3. Montgomery point addition Algorithm [14]

Input: P = (X1, −, Z1), Q = (X2, −, Z2)
Output: P + Q = M_add(X1, Z1, X2, Z2)
1. M ← (X1 · Z2) + (Z1 · X2)
2. Z3 ← M^2
3. N ← (X1 · Z2) · (Z1 · X2)
4. M ← x · Z3
5. X3 ← M + N
6. Return (X3, Z3)

As equations 8 and 9 and Algorithm 3 show, the cost of point addition is three general multiplications, one multiplication by the constant x, one squaring, and two additions in GF(2^m).

Convert projective coordinates to affine coordinates

The Montgomery algorithm, discussed in the next section, represents points in projective coordinates. Projective coordinates avoid repeated use of the inversion operation over GF(2^m), but one inversion is still needed to convert the result from projective back to affine coordinates, as shown in equations 10 and 11.

$x_3 = X_1 / Z_1$   (10)

$y_3 = (x + X_1/Z_1)\,\big[(X_1 + x Z_1)(X_2 + x Z_2) + (x^2 + y)(Z_1 Z_2)\big]\,(x Z_1 Z_2)^{-1} + y$   (11)

The coordinate conversion uses 10 multiplications and only 1 inversion, ignoring additions and squarings.

Montgomery Algorithm

The Montgomery scalar multiplication method is defined using three operations: point addition, point doubling, and conversion from projective to affine coordinates. The Montgomery algorithm computes Q = kP, where k is an integer and P and Q are two points on the curve. Algorithm 4 shows that the Montgomery algorithm depends on the addition and doubling operations on the elliptic curve. Both operations are executed in each iteration of the algorithm.

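Algorithm 4. Montgomery Algorithm [15]

Input: k = (k_{n−1}, k_{n−2}, …, k_1, k_0)_2 with k_{n−1} = 1, P = (x, y) ∈ E(GF(2^m))
Output: Q = kP
1. X1 ← x, Z1 ← 1, X2 ← x^4 + b, Z2 ← x^2
2. For i from n − 2 down to 0 do
   If k_i = 1 then
      M_add(X1, Z1, X2, Z2), M_double(X2, Z2)
   else
      M_add(X2, Z2, X1, Z1), M_double(X1, Z1)
3. Return Q = Mxy(X1, Z1, X2, Z2)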
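The whole flow (ladder initialisation, one addition and one doubling per key bit, and the final conversion of equations 10 and 11) can be modelled in software and checked against naive repeated addition. The sketch below is a toy model under loud assumptions: the field GF(2^4) with x^4 + x + 1, curve coefficients a = b = 1, all function names, and the handling of the point at infinity as `None` are illustrative choices, not the paper's design. It also assumes the base point has x ≠ 0.

```python
M_BITS, POLY = 4, 0b10011   # toy field GF(2^4), x^4 + x + 1 (assumption)
A, B = 1, 1                 # hypothetical curve coefficients


def fmul(a, b):
    """Multiply in GF(2^M_BITS) with interleaved reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M_BITS:
            a ^= POLY
    return r


def fsq(a):
    return fmul(a, a)


def finv(a):
    """Invert a nonzero element as a^(2^m - 2)."""
    r = 1
    for _ in range(2 ** M_BITS - 2):
        r = fmul(r, a)
    return r


def madd(X1, Z1, X2, Z2, x):
    """Equations (8), (9); x is the affine x-coordinate of the base point."""
    t1, t2 = fmul(X1, Z2), fmul(X2, Z1)
    Z3 = fsq(t1 ^ t2)
    return fmul(x, Z3) ^ fmul(t1, t2), Z3


def mdouble(X1, Z1):
    """Equations (6), (7)."""
    return fsq(fsq(X1)) ^ fmul(B, fsq(fsq(Z1))), fmul(fsq(X1), fsq(Z1))


def mxy(x, y, X1, Z1, X2, Z2):
    """Equations (10), (11), plus the two degenerate cases."""
    if Z1 == 0:
        return None              # kP is the point at infinity
    if Z2 == 0:
        return (x, x ^ y)        # (k+1)P is infinity, so kP = -P
    x3 = fmul(X1, finv(Z1))
    z12 = fmul(Z1, Z2)
    t = fmul(X1 ^ fmul(x, Z1), X2 ^ fmul(x, Z2)) ^ fmul(fsq(x) ^ y, z12)
    return (x3, fmul(fmul(x ^ x3, t), finv(fmul(x, z12))) ^ y)


def montgomery_kp(k, P):
    """Montgomery ladder for k >= 1 and a base point P = (x, y) with x != 0."""
    x, y = P
    X1, Z1 = x, 1
    X2, Z2 = fsq(fsq(x)) ^ B, fsq(x)
    for i in range(k.bit_length() - 2, -1, -1):
        if (k >> i) & 1:
            X1, Z1 = madd(X1, Z1, X2, Z2, x)
            X2, Z2 = mdouble(X2, Z2)
        else:
            X2, Z2 = madd(X2, Z2, X1, Z1, x)
            X1, Z1 = mdouble(X1, Z1)
    return mxy(x, y, X1, Z1, X2, Z2)


def affine_add(P, Q):
    """Reference affine group law (equations 4 and 5), used only for checking."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        if y1 != y2 or x1 == 0:
            return None
        x3 = fsq(x1) ^ fmul(B, finv(fsq(x1)))
        return (x3, fsq(x1) ^ fmul(x1 ^ fmul(y1, finv(x1)), x3) ^ x3)
    lam = fmul(y1 ^ y2, finv(x1 ^ x2))
    x3 = fsq(lam) ^ lam ^ x1 ^ x2 ^ A
    return (x3, fmul(lam, x1 ^ x3) ^ x3 ^ y1)


def naive_kp(k, P):
    Q = None
    for _ in range(k):
        Q = affine_add(Q, P)
    return Q


P = (1, 0b0110)   # a point on the toy curve
assert all(montgomery_kp(k, P) == naive_kp(k, P) for k in range(1, 10))
```

In each loop iteration the addition and the doubling read the same pair of points and write different ones, so the two can run side by side, which is the property a parallel hardware architecture can exploit.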

3 Architectural Description

3.1 Galois field Hardware Architectures

3.1.1 Addition Operation

In hardware, addition has no carry propagation. It is done using XOR operations only.

Fig. 2. Addition Operation in GF(2^m)

3.1.2 Square Operation

Polynomial squaring over GF(2^m) is essentially free in hardware. It needs no logic of its own and is done only by routing input bits to specific output bits. Squaring still needs the reduction step, described later, to complete its function. Figure 3 shows the hardware architecture for squaring in GF(2^m).

Fig. 3. Square Operation in GF(2^m)

3.1.3 Multiplication Operation

A useful observation is that truncating the binary Karatsuba-Ofman multiplier at a low level and using the classic multiplier below that level reduces the hardware resources and the time needed to complete the multiplication. Ref. [7] illustrates the architecture of the binary Karatsuba multiplier for GF(2^191), which can easily be modified for any size of m bits. The polynomial multiplier architecture for the binary field GF(2^191), implemented on a Xilinx xcv2600efg1156-8 FPGA device, occupies 6,265 slices with a time delay of 45.889 ns (Table 1). A Xilinx Virtex-II XC2V6000 FPGA device was also used to test the architecture of the suggested binary Karatsuba multiplier truncated at 8 bits, giving 6,265 slices with a time delay of 33.245 ns (Table 1).

3.1.4 Reduction Operation

Once the irreducible polynomial P(x) has been selected, the reduction step can be completed using XOR gates only.

$C'(x) = \sum_{i=0}^{2m-2} c'_i\, x^i$,  $C(x) = \sum_{i=0}^{m-1} c_i\, x^i$,  where $C(x) = C'(x) \bmod P(x)$   (12)

For this work we select an irreducible trinomial of the form P(x) = x^m + x^n + 1 with m = 191. Figure 4 illustrates the reduction step in GF(2^m).

3.1.5 Inversion Operation

The Extended Euclidean Algorithm (EEA) is used to implement the inversion operation in GF(2^m). As mentioned in Section 2.1.5, inversion is the most time-consuming operation in ECC scalar multiplication. The goal here is to minimize the number of inversion operations needed to complete the overall process.

Fig. 4. Reduction Step in GF(2^m)

The Montgomery algorithm reduces the number of inversion operations to one. The suggested EEA architecture takes about 1,346 slices. An exact value for the latency of the inverter cannot be given because it depends on the number of ones in the given element; instead, we chose a random element in GF(2^m) that requires 1420 clock cycles to complete the inversion. The inversion architecture for the binary field GF(2^m), implemented on a Xilinx xcv2600efg1156-8 FPGA device with a clock frequency of 21.79 MHz, occupies 1,346 slices with a time delay of 1420 clock cycles × 45.889 ns clock delay = 65.162 µs. A Xilinx Virtex-II XC2V6000 FPGA device was also used to test the inversion architecture for GF(2^m); with a clock frequency of 30.08 MHz it occupies 1,346 slices with a time delay of 1420 clock cycles × 33.245 ns clock delay = 47.207 µs.

3.2 Montgomery Algorithm Hardware Architecture

Different methods have been proposed for computing the scalar multiplication in elliptic curve cryptosystems. The work in [15] illustrates one of the most efficient algorithms for this operation, Montgomery's method, which computes scalar multiplication on non-supersingular elliptic curves over GF(2^m). This work uses the Montgomery algorithm to compute kP in GF(2^m). The Montgomery algorithm builds scalar multiplication from two other operations, point addition and point doubling. The next subsections illustrate the algorithm and the sub-algorithms it needs.

3.2.1 Point Addition Operation

This section illustrates the point addition operation, which is defined as follows: suppose P1, P2, P3, P ∈ E[GF(2^m)] are represented in projective coordinates. The issue is how to compute P3 = P1 + P2, where P2 = P1 + P. Point addition can be computed using the M_add algorithm given in Section 2.2.2.
The algorithm shows that the point addition computation consists of four multiplications, two additions, and only one squaring. Fig. 5 illustrates these operations step by step; Fig. 5 (a), (b), (c) and (d) are the four multiplication operations. As the figure shows, operation (c) depends on the results of operations (a) and (b), so point addition cannot be implemented using four concurrent Karatsuba multipliers in GF(2^m). This work suggests two hardware architectures for implementing the point addition operation. Fig. 6 illustrates the hardware architecture that uses one Karatsuba multiplier to perform the four multiplications in four computation cycles, which minimizes the area needed for point addition but increases the total time delay.

Fig. 5. Sequence of operations needed for the point addition algorithm, as illustrated in the MADD algorithm

Fig. 6. Point addition operation (P+Q) architecture using one Karatsuba multiplier

3.2.2 Point Doubling Operation

Point doubling is the second operation needed to compute the kP scalar multiplication. This operation is simpler than point addition. This section illustrates the point doubling operation, which is defined as follows: suppose P, P1 ∈ E[GF(2^m)] are represented in projective coordinates. The issue is how to compute P1 = 2 · P. The point doubling computation consists of two multiplications, one addition, and five squarings (Section 2.2.2). Fig. 7 illustrates these operations step by step; Fig. 7 (c) and (d) are the two multiplication operations needed. Fig. 8 illustrates a hardware architecture that uses one Karatsuba multiplier to perform the two multiplications in two computation cycles, which minimizes the area needed for the point doubling operation.

Fig. 7. Sequence of operations needed for the point doubling operation, as illustrated in the MDOUBLE algorithm

Fig. 8. Point Doubling (2P) operation with one Karatsuba multiplier in GF(2^m)

3.2.3 Montgomery Point Multiplication

The Montgomery algorithm computes Q = kP, where k is an integer and P and Q are two points on the curve. As illustrated in Algorithm 4, the Montgomery algorithm depends on the addition and doubling operations on the elliptic curve. The point multiplication is performed in projective coordinates, and therefore the point P = (x, y) must be mapped from affine coordinates to projective coordinates. This mapping is done for the point P and for the doubled point 2P. Because the point multiplication uses the Montgomery method, the y-coordinate is not needed during the point multiplication, and the mapping has to be done only for the x-coordinates of the points P (x1, y1) and 2P (x2, y2). After the Montgomery point multiplication in projective coordinates, the affine coordinates of the result point Q(x, y) are calculated as described in Section 2.2.2.

For parallel implementation, Fig. 9 illustrates an architecture that executes the point addition and point doubling operations in parallel, so two Karatsuba multipliers run in parallel, one for point addition and one for point doubling. In each step of the algorithm, one point addition and one point doubling are performed. The final step is to return the affine coordinates (x, y) from the projective coordinates (X, Y, Z). The affine coordinates of Q are calculated from the X- and Z-coordinates with equations 10 and 11. This computation costs 10 multiplications over GF(2^m) and only one inversion, so on the VirtexE device the conversion delay = 10 × 45.889 ns + 65,162 ns = 65.62 µs. Fig. 9 also shows the architecture of the Montgomery algorithm. Total time delay = 35.06 µs (point multiplication loop) + 65.62 µs (coordinate conversion) = 100.68 µs.
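The latency figures quoted in this section can be reproduced with a few lines of arithmetic. The sketch below is a back-of-the-envelope model, not the authors' timing analysis: it assumes one multiplication per clock period, four multiplication cycles per loop iteration (point addition, with doubling running in parallel), and 191 loop iterations. The iteration count is an assumption that happens to reproduce the 35.06 µs loop figure; the function and parameter names are illustrative.

```python
def scalar_mult_latency_us(clock_ns: float, inv_cycles: int = 1420,
                           iterations: int = 191) -> float:
    """Estimate the total kP latency in microseconds from the clock period."""
    inversion_ns = inv_cycles * clock_ns           # EEA inverter
    point_add_ns = 4 * clock_ns                    # four sequential multiplications
    loop_ns = iterations * point_add_ns            # doubling overlaps with addition
    conversion_ns = 10 * clock_ns + inversion_ns   # equations 10 and 11
    return (loop_ns + conversion_ns) / 1000.0


print(round(scalar_mult_latency_us(45.889), 2))  # 100.68 (VirtexE 2600)
print(round(scalar_mult_latency_us(33.245), 2))  # 72.94  (Virtex-II XC2V6000)
```

Under these assumptions roughly two thirds of the total time is spent in the single final inversion, which suggests why the inverter latency dominates the comparison between the two devices.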

Fig. 9. Montgomery point multiplication algorithm architecture

4 Results and Comparisons

The previous section presented architectures for the different operations needed to implement scalar multiplication in GF(2^m). Table 1 lists the results of implementing these architectures using Very High Speed Integrated Circuit Hardware Description Language (VHDL) code and the Xilinx ISE 9.1 tool on VirtexE 2600 and Virtex-II XC2V6000 FPGA devices. For each operation the table gives the algorithm used to implement it, the area as the number of occupied slices in the FPGA device, and the latency. The comparison concentrates on the performance of the implementations; implementation techniques are not presented here in detail. More detailed presentations of the implementations in this comparison can be found in the references where they were first published. Table 2 compares different hardware implementations of the ECC scalar multiplication operation. Many of the publications listed in Table 2 concentrate on minimizing the area, which badly affects the timing delay of the overall operation.

Table 1. Results for the operations needed for scalar multiplication on two Xilinx devices

Operation                        Algorithm                     Area (slices)  Latency (VirtexE 2600)  Latency (Virtex-II XC2V6000)
Square in GF(2^191)              -                             91             7.390 ns                6.04 ns
Multiplication in GF(2^191)      Binary Karatsuba multiplier   6,265          45.889 ns               33.245 ns
Inversion in GF(2^191)           Extended Euclidean algorithm  1,346          65.162 µs               47.207 µs
Point addition in E(GF(2^191))   -                             8,576          183.556 ns              132.98 ns
Point doubling in E(GF(2^191))   -                             7,115          91.778 ns               66.49 ns
Projective conversion            -                             -              65.62 µs                47.539 µs
Scalar multiplication            Montgomery algorithm          25,963         100.68 µs               72.939 µs

Table 2. Comparison between different scalar multiplication operation hardware implementations

Ref.           FPGA device          Field  Occupied slices  Clock (MHz)  Timing delay
Elhadj [16]    VirtexE 2600         163    9,581            Not avail.   2.618 ms
                                    163    1800 (est.)      Not avail.   5.2 ms (est.)
                                    163    7,579            Not avail.   3.976 ms
                                    163    1300 (est.)      Not avail.   4.1 ms (est.)
Smart [17]     XC4000XL             191    Not avail.       Not avail.   17.71 ms
                                    191    Not avail.       Not avail.   11.82 ms
Sakiyama [18]  Virtex II pro        163    Not avail.       100          0.84 ms
                                    191    Not avail.       100          2.11 ms
Gura [19]      Virtex XCV2000       163    Not avail.       66.4         0.143 ms
                                    193    Not avail.       66.4         0.187 ms
                                    233    Not avail.       66.4         0.225 ms
Bednara [20]   Virtex XCV1000       191    Not avail.       50           3.72 ms
                                    191    Not avail.       50           4.07 ms
                                    191    Not avail.       50           2.27 ms
                                    191    Not avail.       36           0.5 ms
                                    191    Not avail.       36           0.46 ms
                                    191    Not avail.       36           0.27 ms
Shu [21]       Virtex II            163    25,763           68.9         48 µs
                                    233    35,800           67.9         89 µs
This work      VirtexE 2600         191    25,963           21.8         100.68 µs
               Virtex-II XC2V6000   191    25,963           30.1         72.939 µs



As illustrated, this work balances area and timing delay for the overall operation by using a parallel architecture for the elliptic curve operations (point addition and point doubling) and a serial architecture for the underlying Galois field operations (multiplication and inversion). Some of the publications do not report certain parameters of their implementations, such as the clock frequency or the number of occupied slices, because they concentrate on making one parameter efficient and neglect the others.

5 Conclusion and Future Work

This work designs and implements the EC scalar multiplication operation using FPGA technology. The work does not concentrate on one parameter; it tries to balance area and timing delay. Two devices are used for the implementation, VirtexE 2600 and Virtex-II XC2V6000, using 25,963 slices and taking 100.68 µs and 72.939 µs, respectively. Different architectures for the operations needed to implement the kP operation are presented, and the time and area of each operation are listed. This work uses Xilinx ISE 9.1 as the synthesis tool and Xilinx Simulation Tool (XST) for simulation purposes. Future work is to improve the architectures of the different ECC operations. One improvement is to share the same multiplier between point doubling and the projective coordinate conversion, because these operations cannot run in parallel.

References

1. Lenstra, A., Verheul, E.: Selecting Cryptographic Key Sizes. In: Imai, H., Zheng, Y. (eds.) PKC 2000. LNCS, vol. 1751, pp. 446–465. Springer, Heidelberg (2000)
2. Bednara, M., Daldrup, M., von zur Gathen, J., Shokrollahi, J., Teich, J.: Reconfigurable implementation of elliptic curve crypto algorithms. In: Reconfigurable Architectures Workshop (RAW) (2002)
3. Omura, J.K., Massey, J.L.: Computational method and apparatus for finite field arithmetic, United States Patent 4,587,627 (1986)
4. McEliece, R.J.: Finite Fields for Computer Scientists and Engineers. The Kluwer International Series in Engineering and Computer Science. Kluwer Academic Publishers, Dordrecht (1987)
5. Karatsuba, A., Ofman, Y.: Multiplication of multidigit numbers on automata. Sov. Transaction Info. Theory 7(7), 595–596 (1963)
6. Rodriguez-Henriquez, F., Koç, Ç.K.: On Fully Parallel Karatsuba Multipliers for GF(2^m). In: International Conference on Computer Science and Technology (CST), pp. 405–410 (2003)
7. El-sisi, A.B., Shohdy, S., Ismail, N.: Reconfigurable Implementation of Karatsuba Multiplier for Galois Field in Elliptic Curves. In: International Joint Conferences on Computer, Information, and Systems Sciences, and Engineering (CISSE 2008) (2008)
8. Chang Shantz, S.: From Euclid's GCD to Montgomery Multiplication to the Great Divide. Technical Report SMLI TR-2001-95, Sun Microsystems Laboratories (June 2001)
9. Kejin, B., Younggang, S.: Hardware Implementation and Study of Inverse Algorithm in Finite Field. IJCSNS International Journal of Computer Science and Network Security 6(9A) (September 2006)
10. Hankerson, D., Menezes, A., Vanstone, S.: Guide to Elliptic Curve Cryptography. Springer, Heidelberg (2004)
11. Rodriguez-Henriquez, F., Saqib, N.A., Diaz-Perez, A., Koç, Ç.K.: Cryptographic Algorithms on Reconfigurable Hardware. Springer, Heidelberg (2006)
12. Lopez, J., Dahab, R.: An Overview of Elliptic Curve Cryptography, Tech. Report, IC-0010 (May 2000)
13. López, J., Dahab, R.: Fast multiplication on elliptic curves over GF(2^m) without precomputation. In: Koç, Ç.K., Paar, C. (eds.) CHES 1999. LNCS, vol. 1717, pp. 316–327. Springer, Heidelberg (1999)
14. Saqib, N.A., Rodríguez-Henríquez, F., Díaz-Pérez, A.: A Reconfigurable Processor for High Speed Point Multiplication in Elliptic Curves. Int'l J. Embedded Systems 1(3/4), 237–249 (2005)
15. Rodriguez-Henriquez, F., Saqib, N.A., Diaz-Pérez, A.: A fast parallel Implementation of Elliptic Curve point multiplication over GF(2^m). In: Computer Science Section, Electrical Engineering Department, Centro de Investigación y de Estudios Avanzados del IPN, Microprocessors and Microsystems, August 2, 2004, vol. 28(5-6), pp. 329–339 (2004)
16. Youssef Wajih, E.h., Zied, G., Mohsen, M., Rached, T.: Design and Implementation of Elliptic Curve Point Multiplication Processor over GF(2^m). IJCSES International Journal of Computer Sciences and Engineering Systems 2(2) (April 2008)
17. Smart, N.P.: The Hessian form of an elliptic curve. In: Koç, Ç.K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 118–125. Springer, Heidelberg (2001)
18. Sakiyama, K., De Mulder, E., Preneel, B., Verbauwhede, I.: A Parallel Processing Hardware Architecture for Elliptic Curve Cryptosystems. In: Acoustics, Speech and Signal Processing, ICASSP (May 2006)
19. Gura, N., Shantz, S., Eberle, H., et al.: An End-to-End Systems Approach to Elliptic Curve Cryptography. In: Kaliski Jr., B.S., Koç, Ç.K., Paar, C. (eds.) CHES 2002. LNCS, vol. 2523, pp. 349–365. Springer, Heidelberg (2003)
20. Bednara, M., Daldrup, M., Shokrollahi, J., Teich, J., von zur Gathen, J.: Reconfigurable Implementation of Elliptic Curve Crypto Algorithms. In: 9th Reconfigurable Architectures Workshop (RAW 2002), Fort Lauderdale, Florida, U.S.A., pp. 157–164 (April 2002)
21. Shu, C., Gaj, K., El-Ghazawi, T.A.: Low Latency Elliptic Curve Cryptography Accelerators for NIST Curves Over Binary Fields. In: Proceedings of the 2005 IEEE International Conference on Field-Programmable Technology, FPT 2005, Singapore, December 11-14, 2005, pp. 309–310. IEEE, Los Alamitos (2005)
