Faster Explicit Formulas for Computing Pairings over Ordinary Curves
Patrick Longa
Department of Combinatorics & Optimization, University of Waterloo

Eurocrypt 2011
Joint work with: Diego F. Aranha (University of Campinas), Koray Karabina (Certicom), Catherine Gebotys (University of Waterloo), Julio López (University of Campinas)

Outline
• Motivation
• Pairing-Based Cryptography
• Basics
• Contributions
• Generalized Lazy Reduction (GLR)
• GLR on Curve Arithmetic
• Maximizing operations without carry checks
• Compressed Squarings
• High-Speed Implementation
• Conclusions & References


Motivation
• Pairing computation is (usually) the most expensive operation in pairing-based protocols
• Pairings are thought to be inherently slow
• Some efforts even focus on pairing-less protocols
• So we need high-speed implementations
• Recently, a flood of efforts has enabled pairings at higher speeds on x86-64 processors:
– 10,000,000 cc on a 2.4GHz Core 2 Duo, R-ate pairing [Hankerson et al., 2008]
– 4,470,000 cc on a 2.4GHz Core 2 Quad, O-ate pairing [Naehrig et al., 2010]
– 2,950,000 cc on a 1.8GHz Core 2 Duo, O-ate pairing [Beuchat et al., 2010]
• Is it possible to do better?


Pairing-Based Cryptography

Basics
• Let G_1 and G_2 be cyclic elliptic curve subgroups and G_T be a cyclic multiplicative subgroup
• Let G_1, G_2 and G_T be defined over prime fields and have order n
• An admissible (asymmetric) pairing is an efficiently computable function e: G_1 × G_2 → G_T exhibiting two main properties: bilinearity and non-degeneracy
• Security (ultimately) relies on the discrete logarithm problems (DLP) in G_1, G_2 and G_T
• Tate pairing variants currently offer the best performance (e.g., O-ate, R-ate, X-ate, etc.)
• The Barreto-Naehrig (BN) curve, having k = 12, is optimal for the 128-bit security level
• What can be done with pairings? E.g., identity-based key agreements, identity-based encryption (IBE), short signatures (BLS), multi-party key agreements, non-interactive zero-knowledge proof systems, etc.


Contributions
Optimal ate pairing on BN curves (generalized for u < 0)
Input: P ∈ G_1, Q ∈ G_2, p = F(u), n = G(u), r = 6u + 2 = ∑_{i=0}^{⌊log_2 r⌋} r_i 2^i
Output: a_opt(Q, P)
1: T ← Q, f ← 1
2: For i = ⌊log_2 r⌋ − 1 downto 0 do
3:     f ← f^2 · l_{T,T}(P), T ← 2T
4:     if r_i = 1 then f ← f · l_{T,Q}(P), T ← T + Q
5: Q_1 ← π_p(Q), Q_2 ← π_p^2(Q)
6: If u < 0 then T ← −T, f ← f^{−1}
7: f ← f · l_{T,Q_1}(P), T ← T + Q_1
8: f ← f · l_{T,−Q_2}(P), T ← T − Q_2
9: f ← f^{(p^6 − 1)(p^2 + 1)(p^4 − p^2 + 1)/n}
10: Return f
Steps 1 to 8: MILLER LOOP. Step 9: FINAL EXPONENTIATION.
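The polynomials F(u) and G(u) are not spelled out on the slide. As a minimal sketch, assuming the standard Barreto-Naehrig parametrization p(u) = 36u^4 + 36u^3 + 24u^2 + 6u + 1 and n(u) = 36u^4 + 36u^3 + 18u^2 + 6u + 1, and an illustrative negative u (the value below is an assumption, not taken from the slides), the following Python checks the length of the Miller loop and the factorization of the exponent used in step 9. The function names are hypothetical.

```python
# Sketch: BN parameters and the final-exponent factorization.
# Assumptions: standard BN polynomials; u is an illustrative negative value.

def bn_params(u):
    """Return (p, n, r) for the BN parametrization at integer u."""
    p = 36 * u**4 + 36 * u**3 + 24 * u**2 + 6 * u + 1
    n = 36 * u**4 + 36 * u**3 + 18 * u**2 + 6 * u + 1
    r = 6 * u + 2
    return p, n, r

def final_exponent(p, n):
    """(p^12 - 1)/n written as (p^6 - 1)(p^2 + 1)(p^4 - p^2 + 1)/n."""
    hard, rem = divmod(p**4 - p**2 + 1, n)  # n divides p^4 - p^2 + 1 (k = 12)
    assert rem == 0
    return (p**6 - 1) * (p**2 + 1) * hard

u = -(2**62 + 2**55 + 1)  # hypothetical choice with u < 0
p, n, r = bn_params(u)

# The loop runs for i = floor(log2 |r|) - 1 downto 0,
# i.e. floor(log2 |r|) iterations in total.
miller_iterations = abs(r).bit_length() - 1
```

The split of the exponent matters because only the last factor, (p^4 − p^2 + 1)/n, is the "hard part"; the first two factors are cheap powers of p.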


• Efficient extension field arithmetic is crucial
– we apply a generalized lazy reduction (GLR) technique to eliminate reductions in F_{p^k}
– extensible to curve arithmetic (usually over a subfield F_{p^d})

• The hard part of the final exponentiation (FE) is performed in the cyclotomic subgroup
– we propose new compressed squarings for fast exponentiation in this subgroup

• We also eliminate an expensive inversion when u < 0 and some initial operations
• Plus a few other optimizations (maximization of operations without carry checks, elimination of small operations in point/line evaluation formulas, etc.)

In this talk (and for illustrative purposes), we will use the tower F_p → F_{p^2} → F_{p^6} → F_{p^{12}}:
– F_{p^2} = F_p[i]/(i^2 − β)
– F_{p^6} = F_{p^2}[v]/(v^3 − ξ)
– F_{p^{12}} = F_{p^6}[w]/(w^2 − v)
where the parameters of the irreducible binomials are assumed to have “small size”.
• M, S, R, A denote multiplication, squaring, reduction, addition in F_p
• i, m, s, r, a denote inversion, multiplication, squaring, reduction, addition in F_{p^2}
• X_u denotes an unreduced operation X


Generalized Lazy Reduction
Lazy reduction: a sum of products ∑ ± a_i b_i mod p (where a_i, b_i are elements in Montgomery representation) can be reduced with only one Montgomery reduction modulo p
• Inner products are accumulated as double-precision integers, and
• ∑ ± a_i b_i < 2^N · p, where N = n · w, n is the number of words needed to represent p and w is the computer word-size
So far, lazy reduction has been applied to operations in a given (extension) field with characteristic p
• E.g., multiplication in F_{p^2} (c = a × b using Karatsuba): let a = (a_0, a_1) and b = (b_0, b_1) ∈ F_{p^2}
c_0 = (a_0 × b_0) + β (a_1 × b_1)
c_1 = (a_0 + a_1) × (b_0 + b_1) − a_0 × b_0 − a_1 × b_1
Cost: 3M + 3R when every product is reduced; 3M + 2R with lazy reduction, with one reduction (rdcn) each for c_0 and c_1
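The saving from 3R to 2R can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: plain `% P` stands in for a Montgomery reduction, Python integers play the role of double-precision accumulators, and the names `Counter`, `fp2_mul_basic`, `fp2_mul_lazy` as well as the toy parameters P = 103 and β = −1 are hypothetical.

```python
# Sketch: multiplication in F_{p^2} = F_p[i]/(i^2 - beta), counting reductions.
# Assumption: "% P" stands in for a Montgomery reduction.

P = 103     # toy prime (hypothetical)
BETA = -1   # i^2 = beta, a "small size" constant (hypothetical)

class Counter:
    """Counts integer multiplications (M) and reductions (R)."""
    def __init__(self):
        self.M = 0
        self.R = 0

    def mul(self, x, y):
        self.M += 1
        return x * y

    def red(self, x):
        self.R += 1
        return x % P

def fp2_mul_basic(a, b, cnt):
    """Karatsuba with every product reduced immediately: 3M + 3R."""
    t0 = cnt.red(cnt.mul(a[0], b[0]))
    t1 = cnt.red(cnt.mul(a[1], b[1]))
    t2 = cnt.red(cnt.mul(a[0] + a[1], b[0] + b[1]))
    # The trailing "% P" models cheap modular adds/subs, not counted as R.
    return ((t0 + BETA * t1) % P, (t2 - t0 - t1) % P)

def fp2_mul_lazy(a, b, cnt):
    """Karatsuba with products accumulated unreduced: 3M + 2R."""
    T0 = cnt.mul(a[0], b[0])
    T1 = cnt.mul(a[1], b[1])
    T2 = cnt.mul(a[0] + a[1], b[0] + b[1])
    c0 = cnt.red(T0 + BETA * T1)   # single reduction for c_0
    c1 = cnt.red(T2 - T0 - T1)     # single reduction for c_1
    return (c0, c1)
```

Both functions return the same element of F_{p^2}; only the number of reductions differs.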

Generalized Lazy Reduction (GLR)
Key observation: elements in a tower field are ultimately sums of products ∑ ± a_i b_i mod p. Each one requires only one Montgomery reduction!
• Consider the tower F_p → F_{p^2} → F_{p^6}
– F_{p^2} = F_p[i]/(i^2 − β)
– F_{p^6} = F_{p^2}[v]/(v^3 − ξ)
• Let u = (u_0, u_1, u_2) and v = (v_0, v_1, v_2) ∈ F_{p^6}
• Compute w = u × v in F_{p^6} using Karatsuba:
w_0 = uv_0 + ξ [(u_1 + u_2)(v_1 + v_2) − uv_1 − uv_2]
w_1 = (u_0 + u_1)(v_0 + v_1) − uv_0 − uv_1 + ξ uv_2
w_2 = (u_0 + u_2)(v_0 + v_2) − uv_0 + uv_1 − uv_2
where: uv_0 = u_0 × v_0, uv_1 = u_1 × v_1, uv_2 = u_2 × v_2


• For example, take the coefficient
w_2 = (u_0 + u_2)(v_0 + v_2) − u_0 × v_0 + u_1 × v_1 − u_2 × v_2
Each product in F_{p^2} is computed with the Karatsuba formulas
c_0 = (a_0 × b_0) + β (a_1 × b_1)
c_1 = (a_0 + a_1) × (b_0 + b_1) − a_0 × b_0 − a_1 × b_1
and is left unreduced. The reduction (rdcn) is applied only once to each of the two F_p coefficients of w_2, instead of once per coefficient of every F_{p^2} product.
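The Karatsuba formulas for w_0, w_1, w_2 can be sketched in Python to make the operation count concrete. This is an illustration under stated assumptions, not the implementation from the talk: a toy prime, β = −1 and ξ = 1 + i are chosen here for convenience, `% P` stands in for Montgomery reduction, and all function names are hypothetical.

```python
# Sketch: F_{p^6} multiplication with generalized lazy reduction (GLR).
# Assumptions: toy prime, beta = -1, xi = 1 + i; "% P" stands in for
# Montgomery reduction; Python ints act as double-precision accumulators.

P = 103
stats = {"M": 0, "R": 0}

def mul2_u(a, b):
    """Unreduced Karatsuba product in F_{p^2} (i^2 = -1): 3 integer muls."""
    stats["M"] += 3
    t0, t1 = a[0] * b[0], a[1] * b[1]
    t2 = (a[0] + a[1]) * (b[0] + b[1])
    return (t0 - t1, t2 - t0 - t1)

def add2(a, b):
    return (a[0] + b[0], a[1] + b[1])

def sub2(a, b):
    return (a[0] - b[0], a[1] - b[1])

def mul_xi(a):
    """Multiply by xi = 1 + i using additions only."""
    return (a[0] - a[1], a[0] + a[1])

def red2(a):
    """Reduce both F_p coefficients of an unreduced F_{p^2} value: 2R."""
    stats["R"] += 2
    return (a[0] % P, a[1] % P)

def fp6_mul_glr(u, v):
    """Karatsuba in F_{p^6} = F_{p^2}[v]/(v^3 - xi), reducing only at the end."""
    uv0, uv1, uv2 = mul2_u(u[0], v[0]), mul2_u(u[1], v[1]), mul2_u(u[2], v[2])
    m12 = mul2_u(add2(u[1], u[2]), add2(v[1], v[2]))
    m01 = mul2_u(add2(u[0], u[1]), add2(v[0], v[1]))
    m02 = mul2_u(add2(u[0], u[2]), add2(v[0], v[2]))
    w0 = add2(uv0, mul_xi(sub2(sub2(m12, uv1), uv2)))
    w1 = add2(sub2(sub2(m01, uv0), uv1), mul_xi(uv2))
    w2 = sub2(add2(sub2(m02, uv0), uv1), uv2)
    return (red2(w0), red2(w1), red2(w2))

def fp6_mul_schoolbook(u, v):
    """Reference result: schoolbook product with reduced F_{p^2} products."""
    def m(a, b):
        return ((a[0] * b[0] - a[1] * b[1]) % P, (a[0] * b[1] + a[1] * b[0]) % P)
    w0 = add2(m(u[0], v[0]), mul_xi(add2(m(u[1], v[2]), m(u[2], v[1]))))
    w1 = add2(add2(m(u[0], v[1]), m(u[1], v[0])), mul_xi(m(u[2], v[2])))
    w2 = add2(add2(m(u[0], v[2]), m(u[1], v[1])), m(u[2], v[0]))
    return tuple((c[0] % P, c[1] % P) for c in (w0, w1, w2))
```

One call to `fp6_mul_glr` performs six unreduced F_{p^2} products (18 integer multiplications) and six reductions, one per F_p coefficient of the result.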

In general, for pairing-friendly towers with k = 2^i 3^j, i ≥ 1, j ≥ 0, a multiplication in F_{p^k} requires 3^i 6^j integer multiplications and k reductions
Consider the tower F_p → F_{p^2} → F_{p^6} → F_{p^{12}}
• E.g., multiplication in F_{p^{12}} (using Karatsuba): 54M + 36R → 54M + 12R
• E.g., squaring in F_{p^{12}} (using complex → Chung-Hasan → Karatsuba squaring): 36M + 30R → 36M + 12R
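The multiplication counts can be reproduced with a few lines of Python. The first function encodes the general statement for k = 2^i 3^j. The second is an interpretation rather than something stated on the slide: it assumes the "before" figure comes from applying lazy reduction only inside F_{p^2}, so that every F_{p^2} product costs 3M + 2R. Both function names are hypothetical.

```python
# Sketch: operation counts for one multiplication in F_{p^k}, k = 2^i * 3^j.

def glr_mul_cost(i, j):
    """Return (k, integer multiplications, reductions) with GLR."""
    k = 2**i * 3**j
    return k, 3**i * 6**j, k

def fp2_lazy_mul_cost(i, j):
    """Return (multiplications, reductions) when lazy reduction is used
    only inside F_{p^2} (assumed interpretation): Karatsuba gives
    3^(i-1) * 6^j products in F_{p^2}, each costing 3M + 2R."""
    fp2_products = 3**(i - 1) * 6**j
    return 3 * fp2_products, 2 * fp2_products
```

For k = 12 (i = 2, j = 1) this gives 54 multiplications in both cases, with 36 reductions under the F_{p^2}-only assumption and 12 with GLR, matching the multiplication figures above.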


• Different formulas (Karatsuba, Chung-Hasan, complex squaring, etc.) can be employed because the transformed expressions are still sums of products
• The technique was applied to:
– tower arithmetic (in the Miller loop, ML, and the final exponentiation, FE)
– curve arithmetic (in ML)
– new compressed squarings (in FE)


Curve Arithmetic
• Let E/F_p : y^2 = x^3 + b, and E′/F_{p^2} : y^2 = x^3 + b′/ξ be a sextic twist of E.
• Let T = (X_1, Y_1, Z_1) ∈ E′(F_{p^2}) be in homogeneous coordinates.
To compute 2T = (X_2, Y_2, Z_2) and the tangent line evaluated at P = (x_P, y_P) ∈ E(F_p), consider the following optimized formula:
X_2 = X_1 Y_1 (Y_1^2 − 9b′Z_1^2)/2
Y_2 = [(Y_1^2 + 9b′Z_1^2)/2]^2 − 27b′^2 Z_1^4
Z_2 = 2Y_1^3 Z_1
l = (−2Y_1 Z_1 y_P) vw + (3X_1^2 x_P) v^2 + ξ (3b′Z_1^2 − Y_1^2),
where b′ = 1 − i, that costs 3m + 6s + 22a + 4M

With reductions (rdcn) deferred through lazy reduction, the same doubling and line evaluation formula costs 3m_u + 6s_u + 22a + 4M + 8r
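The doubling formulas for X_2, Y_2, Z_2 can be sanity-checked against textbook affine doubling. The sketch below is a simplification made for illustration: it works on a toy curve y^2 = x^3 + B over a small prime field F_P instead of the twist over F_{p^2}, and the constants and function names are hypothetical. It needs Python 3.8 or later for `pow(x, -1, P)`.

```python
# Sketch: check the homogeneous doubling formulas against affine doubling.
# Assumption: toy curve y^2 = x^3 + B over F_P (not the twist over F_{p^2}).

P = 103
B = 2
INV2 = pow(2, -1, P)  # 1/2 mod P

def double_projective(X1, Y1, Z1):
    """2T for T = (X1 : Y1 : Z1) on Y^2 Z = X^3 + B Z^3."""
    y_sq = Y1 * Y1 % P
    b_z_sq = B * Z1 * Z1 % P
    X2 = X1 * Y1 * (y_sq - 9 * b_z_sq) * INV2 % P
    half = (y_sq + 9 * b_z_sq) * INV2 % P
    Y2 = (half * half - 27 * b_z_sq * b_z_sq) % P
    Z2 = 2 * Y1 * y_sq * Z1 % P
    return X2, Y2, Z2

def double_affine(x, y):
    """Textbook affine doubling with slope 3x^2 / (2y)."""
    lam = 3 * x * x * pow(2 * y, -1, P) % P
    x3 = (lam * lam - 2 * x) % P
    return x3, (lam * (x - x3) - y) % P

def to_affine(X, Y, Z):
    z_inv = pow(Z, -1, P)
    return X * z_inv % P, Y * z_inv % P

def find_point():
    """Brute-force search for an affine point with x != 0 and y != 0."""
    for x in range(1, P):
        for y in range(1, P):
            if (y * y - x**3 - B) % P == 0:
                return x, y
    raise ValueError("no point found")
```

A quick check is to scale an affine point by an arbitrary Z, double it projectively and compare the result with `double_affine` after converting back.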

Maximizing operations without carry checks
• Adds/subs (and other small operations) are not inexpensive on certain platforms (e.g., x86-64 processors)
• Lazy reduction requires “double-precision” adds/subs

• Prime p is selected such that bitlength(p) < N, where N = n · w, and the maximum number of consecutive adds w/o carry checks can be performed before and after muls (again, respecting the max. bound 2^N · p)
• For converting to positive after a subtraction of the form c = a + l · b, where l < 0, a, b ∈ [0, m p^2] and |l m p| < 2^N, the following options are studied:
Option 1 ⇒ r = c + (2^N · p/2^h), r ∈ [0, m p^2 + 2^N · p/2^h], h is a small integer
Option 2 ⇒ if c < 0, r = c + 2^N · p, r ∈ [0, 2^N · p]
Option 3 ⇒ r = c − l m p^2, r ∈ [0, (|l| + 1) m p^2]
Option 4 ⇒ if c < 0, r = c − l m p^2, r ∈ [0, |l m p^2|]
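The four options can be written out as a short Python sketch. Every correction term is a multiple of p, so r stays congruent to c modulo p. The function name `make_options` and the toy parameters used to exercise it are hypothetical. Note that Option 1 only guarantees r ≥ 0 when 2^N · p/2^h ≥ |l| m p^2, which is an extra condition made explicit here.

```python
# Sketch: four ways of making c = a + l*b non-negative, with l < 0 and
# a, b in [0, m*p^2]. Assumption: 2^N * p is the double-precision bound.

def make_options(p, N, m, l, h):
    """Return the four correction functions r = f(c), each with r = c (mod p)."""
    bound = (1 << N) * p                      # 2^N * p
    assert l < 0 and abs(l * m * p) < (1 << N)

    def option1(c):
        # Always add 2^N * p / 2^h (no branch, no comparison).
        return c + (bound >> h)

    def option2(c):
        # Add 2^N * p only when c is negative.
        return c + bound if c < 0 else c

    def option3(c):
        # Always add |l| * m * p^2.
        return c - l * m * p * p

    def option4(c):
        # Add |l| * m * p^2 only when c is negative.
        return c - l * m * p * p if c < 0 else c

    return option1, option2, option3, option4
```

Options 1 and 3 avoid a data-dependent branch at the price of a wider output range; Options 2 and 4 keep the range tighter but need a sign check.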

• Let F_{p^2} = F_p[i]/(i^2 + 1) and 2^N · p ≈ 6.8 p^2
• Multiplication in F_{p^2}, a × b = (a_0, a_1) × (b_0, b_1):
c_0 = (a_0 × b_0) − (a_1 × b_1)
c_1 = (a_0 + a_1) × (b_0 + b_1) − a_0 × b_0 − a_1 × b_1

Operation            Range
T_0 ← a_0 × b_0      [0, p^2]
T_1 ← a_1 × b_1      [0, p^2]
t_0 ← a_0 + a_1      [0, 2p]
t_1 ← b_0 + b_1      [0, 2p]
T_2 ← t_0 × t_1      [0, 4p^2]
T_3 ← T_0 + T_1      [0, 2p^2]
c_1 ← T_2 − T_3      [0, 2p^2]
c_0 ← T_0 − T_1      [0, x]

If (Option 1, h = 2) ⇒ c_0 = c_0 + 2^N · p/4, x ≈ 2.7p^2
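The range bookkeeping above can be replayed in Python on toy sizes. This is a sketch under assumptions: an 8-bit "double word" bound (N = 8) and the prime 59 are hypothetical stand-ins chosen so that bitlength(p) ≤ N − 2, and Python integers model the unreduced double-precision values. The difference T_0 − T_1 is the only step that can go negative, so it receives the Option 1 correction 2^N · p/4, which is a multiple of p.

```python
# Sketch: unreduced multiplication in F_{p^2} = F_p[i]/(i^2 + 1) with the
# Option 1 (h = 2) correction. Assumption: toy word size with N = 8.

N = 8
P = 59                      # toy prime, 59 = 3 (mod 4), bitlength 6 <= N - 2
LIMIT = (1 << N) * P        # 2^N * p, the bound for unreduced values

def fp2_mul_unreduced(a, b):
    """Return unreduced (real, imag) parts, both in [0, 2^N * p)."""
    T0 = a[0] * b[0]                # [0, p^2]
    T1 = a[1] * b[1]                # [0, p^2]
    t0 = a[0] + a[1]                # [0, 2p]
    t1 = b[0] + b[1]                # [0, 2p]
    T2 = t0 * t1                    # [0, 4p^2]
    T3 = T0 + T1                    # [0, 2p^2]
    imag = T2 - T3                  # a0*b1 + a1*b0, in [0, 2p^2]
    real = T0 - T1 + (LIMIT >> 2)   # Option 1, h = 2: add 2^N * p / 4
    return real, imag
```

Neither output needs a sign or carry check before the final reduction: `real` is non-negative because 2^N · p/4 ≥ p^2 for this choice of p and N.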
