Computational Aspects of Modular Forms and Elliptic Curves
By Denis Xavier Charles
A dissertation submitted in partial fulfillment of the requirements for the degree of
Doctor of Philosophy (Computer Sciences)
at the UNIVERSITY OF WISCONSIN – MADISON 2005
Abstract

In this thesis I investigate certain computational aspects of modular forms and elliptic curves. The main computational problem in the theory of Modular Forms is the computation of their Fourier coefficients. The first part of the thesis is concerned with the computational complexity of this problem. I show that if we do not fix the modular form, then the problem of computing Fourier coefficients is #P-hard. Next, in joint work with Eric Bach, we show that if we fix a modular form that is a Hecke eigenform, then computing the Fourier coefficients is at least as hard as factoring RSA moduli. I also provide a new algorithm to compute the Fourier coefficients of any fixed space of cusp forms. This algorithm has a faster asymptotic running time than any previous algorithm for computing Fourier coefficients of these forms.
The arithmetic theory of Elliptic Curves has many interesting computational problems. I study two such problems in this thesis. One is the problem of distinguishing elliptic curves with Complex Multiplication from ordinary elliptic curves. I give two algorithms for this problem: one is a randomized polynomial time algorithm and the other is a deterministic polynomial time algorithm. I also show how the randomized polynomial time algorithm can be made to have one-sided error. The other problem that I study is the problem of computing the modular polynomial. The modular polynomial is an object that governs isogenies between elliptic curves. In joint work with Kristin Lauter, we provide a new algorithm to compute the modular polynomial over finite fields. This algorithm also yields an efficient method of generating random isogenies from an elliptic
curve over a finite field.
Acknowledgements

This thesis would never have been possible without the help of many people. Here is my attempt to acknowledge them. I would like to express my sincere gratitude to my advisor Eric Bach for all his help, advice, support and encouragement. His willingness to listen to ideas or problems and his suggestions have been integral to the development of this work. I have learned a lot from Eric’s approach to problem solving, and his insight and uncanny ability to find the crux of a problem have been invaluable. I am very grateful for his editorial help; his advice has greatly improved my mathematical writing.

Thanks are due to Jin-Yi Cai for his help even from my early years as a graduate student. His advice was important for me in selecting Computational Number Theory as my research area. His enthusiastic approval of some of my ideas was a key factor in the development of the results of Chapter 3.

I am grateful to Kristin Lauter for giving me the opportunity to work together on some very interesting problems. Her advice and encouragement have been especially useful to me. I have learned a lot from her, not just number theory but also its practical aspects and applications. I am very grateful to her for being a friend and a mentor.

I would like to thank Nigel Boston for teaching me Galois Representations and their uses. The various courses that he taught have been very useful for my work. I am thankful for his comments and generous suggestions on my work. His candid advice has facilitated some difficult decisions for which I am thankful.

I am grateful to Ken Ono for his generous help with my work. By answering my myriad questions and pointing me to interesting and related work he has greatly helped my research. I am eternally grateful to him for suggesting that I read [Ser81], as the techniques and results in that paper are central to my work. He has also been directly responsible for the “atmosphere of modular forms” at Madison which has influenced my work.

I would like to thank Dieter van Melkebeek for his detailed comments on my work; they have improved the presentation significantly. Being a TA for some of Dieter’s courses has been a learning experience for me. His dedication and energy have been a constant inspiration.
Several people have helped me with various technical aspects of this thesis. In particular, I would like to thank Antal Balog, Tom Cusick, Steven Galbraith, Marty Isaacs, Ravi Kannan, Bill McGraw, René Schoof, Igor Shparlinski, William Stein, Gisbert Wüstholz and Tonghai Yang. I would like to thank my friends Matt Boylan, Iftikhar Burhanuddin, Venkat Chakaravarthy, Rohit Chatterjee, Jayce Getz, Ahmad El-Guindy, Rajasekar Krishnamurthy, Eric Mortenson, Harris Nover and Taliesin Sutton for very useful discussions that usually took place after classes, at student seminars, conferences and coffee shops. Special thanks to Rohit Chatterjee for responding to many late night phone calls about Hecke operators.
I bear a debt of gratitude to my family for their love and support. They have displayed remarkable patience in waiting for me to graduate. Special thanks to my fiancée Madhulika for her love and gentle encouragement. Her confidence in me has brightened many a day. I would like to thank my friends Pavan Aduri, Kirsten Anderson, Lakshmi Bairavasundaram, Ramesh Chokkalingam, John Curran, Aparna Das, S. P. Ganesh, Paul Gestwicki, Maurice Jansen, Karthik Jayachandran, Shankar Kumar, Arunvijay Kumar, Sricharan Poundarikapuram, Arun Raghavan, Shankar Ram, Vaidy Ramachandran, Ravi Ramamurthy, Karthik Ramani, Samik Sengupta, Ram Subramanian, Joe Titus and Joseph Tony for making my life so much fun. Finally, thanks to Paul Simon for providing the soundtrack for my graduate life.
Contents

Abstract
Acknowledgements
1 Introduction
    1.1 The Complexity Theory of Modular Forms
    1.2 Algorithms for Elliptic Curves
2 Modular Forms
    2.1 Definition of Classical Modular Forms
    2.2 Dimension of $M_k$
        2.2.1 The Modular j-function
    2.3 Bounds for the Fourier coefficients
    2.4 Hecke Operators
    2.5 Modular forms of Higher Level
        2.5.1 Modular Curves
    2.6 Dimensions of Modular Forms of Higher Level
    2.7 Hecke Operators for Modular Forms of Higher Level
3 Counting Lattice Vectors
    3.1 Definition of the Problem
    3.2 The Class #P
    3.3 Background and Statement of Results
    3.4 The #P-hardness Result
    3.5 A Naïve Algorithm
    3.6 The Approach for Unimodular Lattices
        3.6.1 Computing a Basis of $M_k$
        3.6.2 The Algorithm for Unimodular Lattices
    3.7 Reductions to Integer Factorization
        3.7.1 Fixed Rank Lattices
        3.7.2 Lattices with Bounded Norm Basis Vectors
    3.8 The General Case
        3.8.1 Odd Rank Lattices
4 Hardness of Computing a Hecke Eigenform
    4.1 Hardness of Computing $\tau(n)$
    4.2 The Reduction in General
5 Computing a Basis of Cusp Forms
    5.1 Computing the Ramanujan Tau function
    5.2 Computing level 1 cusp forms
        5.2.1 Cyclic Basis for modular forms
        5.2.2 Description of the Algorithm
        5.2.3 Another Approach
    5.3 Computing Cusp Forms of Higher Level
6 Elliptic Curves
    6.1 Definition of an Elliptic Curve
    6.2 Isogenies
    6.3 Structure of the Endomorphism Ring
    6.4 The Tate Module
    6.5 Elliptic Curves over Finite Fields
    6.6 Weil Height
7 Testing Elliptic Curves for Complex Multiplication
    7.1 The Problem
    7.2 A Direct Approach
    7.3 The Randomized Algorithm
        7.3.1 Finding the Discriminant of End(E)
        7.3.2 A One-Sided Error Algorithm
    7.4 The Deterministic Algorithm
        7.4.1 Galois Representations from Elliptic curves
        7.4.2 Image of $\rho_\ell$ if E does not have CM
        7.4.3 Image of $\rho_\ell$ if E has CM
        7.4.4 The Algorithm
8 Computing Modular Polynomials
    8.1 Local Computation of $\phi_\ell(x, j)$
    8.2 Computing $\phi_\ell(x, y)$ mod p
Bibliography
Chapter 1 Introduction

This thesis consists of two distinct parts. The first part is concerned with the problem of computing Fourier coefficients of Modular Forms. The second part deals with two interesting algorithmic problems that arise in the theory of Elliptic Curves. Here we discuss the motivation of the work and survey the results proved in this thesis.
1.1 The Complexity Theory of Modular Forms
One of the great achievements of mathematics in the last hundred years is the development of the theory of modular forms. Modular Forms, and their generalization, Automorphic Forms, lie at the heart of many deep problems in Number Theory: the Generalized Riemann Hypothesis, the Birch and Swinnerton-Dyer conjecture, Serre’s conjectures, and the Langlands Program. Development of this theory has led to the solution of such famous problems as Fermat’s Last Theorem and Gauss’s class number problem. In this thesis we begin the investigation of the complexity theoretic aspects of this theory. Modular Forms are holomorphic functions on the upper half-plane that satisfy certain functional equations. These functional equations lead to Fourier expansions of Modular Forms. It turns out that the Fourier coefficients of meromorphic Modular Forms (of integer and half-integer weight) encode a wealth of very important arithmetic information. Here is a short list (we have, of necessity, suppressed some details):
1. Divisor sums, partition numbers.
2. Traces of singular moduli.
3. (Wiles) Trace of Frobenius on elliptic curves over Q.
4. Trace of Frobenius of Galois representations of dimension 2 over finite fields (this is a conjecture).
5. Special values of L-functions.
6. Class numbers of quadratic fields.
7. Twists of central critical values of L-functions of elliptic curves.
8. Graded dimensions of irreducible representations of certain infinite dimensional Lie algebras.

Thus the principal computational problem in the theory of Modular Forms is, naturally, the computation of their Fourier coefficients.
Several methods have been proposed for computing Fourier coefficients of modular forms. Principal among them is the Modular Symbols method. The modular symbols approach is described in [Mer94] and is also implemented in the computational algebra system MAGMA [BC03]. Another method to compute Fourier coefficients is via theta series [Piz80]. Both methods have one thing in common: their running times are exponential. A natural question to ask in light of this fact is: What is the complexity of computing Fourier coefficients of modular forms? This question is the focus of Chapters 3 and 4 of the thesis. In Chapter 3 we show that a natural counting problem associated with
lattices is #P-complete. It turns out that this counting problem is intimately connected with computing Fourier coefficients of modular forms. Thus the hardness of the counting problem, in particular, implies that there is a family of modular forms whose Fourier coefficients are very hard to compute. We also utilize modular forms in another fashion in Chapter 3; we use them to show that certain restricted versions of the counting problem remain hard. We believe this is the first instance of a complexity theoretic reduction that uses modular forms.
The hardness result that we obtain in Chapter 3 is not very satisfactory, in the following sense. Usually, one is interested in computing the Fourier coefficients of a fixed modular form. Thus one would like to know the complexity of this problem. Chapter 4 tackles this question. We show that if the modular form is fixed then the problem is as hard as factoring RSA moduli. This result is in a sense the best possible, as there exist modular forms for which the computation of the Fourier coefficients is no harder than factoring integers.
In Chapter 5 we turn to the problem of computing a basis for the space of modular forms. We propose and analyze an algorithm based on the Selberg Trace formula for this problem. It turns out that if one fixes the weight and level of the space then our algorithm has an asymptotically faster running time than any of the existing methods.
Thus our contributions include new hardness results for the problem of computing Fourier coefficients of modular forms and a new algorithm to compute the space of modular forms.
1.2 Algorithms for Elliptic Curves
Elliptic Curves have found many applications in Computational Number Theory, Cryptography and Coding Theory. There are many computational problems in the arithmetic theory of elliptic curves. For instance, one can study efficient implementation of the group operations, counting the number of points on an elliptic curve over a finite field, evaluating the Weil-pairing, finding torsion points etc. In this thesis we study two such problems. The first is to design an algorithm for checking whether an elliptic curve defined over a number field has complex multiplication (sometimes abbreviated as CM). The second is an algorithm to compute the modular polynomial.
An efficient test for complex multiplication was posed as a question in Hartshorne’s book ([Har77] Remark 4.20.4). In Chapter 7 we discuss several methods to check whether an elliptic curve has CM. Using some well-known results of Deuring and Serre one can obtain a randomized (two-sided error) polynomial time algorithm for this problem. We then show that a slightly different approach leads to a provably one-sided error algorithm for CM testing. We also show that one can obtain more precise information about the endomorphism ring of an elliptic curve in this way. The algorithm is very easy to implement and works very well in practice. We also provide a deterministic polynomial time algorithm for this problem. The latter algorithm is more complicated, but it serves to show that the task of testing an elliptic curve for CM is in the complexity class P.
Finally, in Chapter 8 we give a new algorithm to compute the modular polynomials. Modular polynomials govern isogenies (the natural maps) between elliptic curves. These
polynomials have found uses in improvements to algorithms for point-counting on elliptic curves ([Elk98, Sat02]). The algorithm we give has the distinguishing feature that it does not involve computing Fourier coefficients of the modular j-function and it directly computes the polynomial modulo a prime. The latter feature makes it attractive for applications where the modular polynomial is used to find isogenies over a finite field. Another corollary of this algorithm is a fast method of producing random isogenies from an elliptic curve.
Chapter 2 Modular Forms

In this chapter we collect the results that we need from the theory of modular forms. The reader should consult the books [Gun62, Hec83b, Kob93, Lan76, Ogg69, Ono04, Shi71] for an in-depth treatment of the subject.
2.1 Definition of Classical Modular Forms
Let $\mathbb{H} = \{z \in \mathbb{C} \mid \mathrm{Im}(z) > 0\}$ be the Poincaré half-plane. Let $\Gamma$ be the group $PSL(2, \mathbb{Z})$, that is, the group of 2 × 2 matrices of determinant 1 with integer entries, taken modulo $\pm I$.

Definition 2.1.1. A holomorphic function $f : \mathbb{H} \to \mathbb{C}$ is called a modular form (for $\Gamma$) of weight k (k is a non-negative integer) if the following conditions hold:
1. $f\left(\frac{a\tau + b}{c\tau + d}\right) = (c\tau + d)^k f(\tau)$ for all $\tau \in \mathbb{H}$ and $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$;
2. As $\tau \to i\infty$, $|f(\tau)|$ is bounded.
If $f : \mathbb{H} \to \mathbb{C}$ is meromorphic but satisfies all the other conditions above, we call f a modular function.
The set of all modular forms of a certain weight, k (say), forms a $\mathbb{C}$-vector space, and we denote this space by $M_k$. Since $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \in \Gamma$, the transformation law (1) above implies that $f(\tau) = f(\tau + 1)$. Thus any modular form is periodic across vertical strips of width 1 in the complex plane. Now $\mathbb{H}/\{z \mapsto z + 1\}$ (essentially a cylinder) has a complex analytic isomorphism to the open punctured disc of radius 1, given by the map $z \mapsto e^{2\pi i z}$. Holomorphic maps f on $\mathbb{H} \cup \{i\infty\}$ such that $f(\tau) = f(\tau + 1)$, when considered as maps on the open disc, have a Taylor expansion about the origin:
$$f(z) = \sum_{n \ge 0} a_n z^n.$$
It is a fact that this expansion converges everywhere in the disc. Pulling this expansion back via the isomorphism we get the expansion
$$f(\tau) = \sum_{n \ge 0} a_n q^n,$$
where $q = e^{2\pi i \tau}$. We call this expansion the Fourier expansion (at infinity) of the modular form and refer to the $a_n$ as its Fourier coefficients. There is a natural subspace of $M_k$, namely the forms that vanish at $i\infty$. These are the so-called cusp forms of weight k and we denote this space by $S_k$. The space $S_k$ consists of the forms whose Fourier expansion has $a_0 = 0$.
Remark 2.1.2. We note that since $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \in \Gamma$, the transformation law (1) shows that if $f \in M_{2k+1}$ then $f(\tau) = -f(\tau)$ and hence $f = 0$. Thus there are no non-zero modular forms of odd weight.
2.2 Dimension of $M_k$

Our first task is to show that $M_k$ is finite dimensional and also provide an explicit basis for the space $M_k$.
Suppose $z \in \mathbb{H}$; then $\frac{az+b}{cz+d} \in \mathbb{H}$ for $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{R})/\{\pm I\}$. In fact, this defines an action of the group $SL(2, \mathbb{R})/\{\pm I\}$ on $\mathbb{H}$. The set $\mathbb{H}/\Gamma$ of equivalence classes under the action of $\Gamma$ is a non-compact Riemann surface that can be compactified by adding one point, namely $i\infty$ (see [Shi71] §1.5). The resulting compact Riemann surface is denoted $\mathbb{H}^*/\Gamma$ (or $X_0(1)$). Suppose $f \in M_0$; then the transformation conditions imply that $f : \mathbb{H}^*/\Gamma \to \mathbb{C}$ is a holomorphic function. Since there are no non-constant holomorphic functions from a compact Riemann surface to $\mathbb{C}$, we have that $M_0 = \mathbb{C}$ and $\dim_{\mathbb{C}} M_0 = 1$. To compute the dimension of $M_k$ we need the following important result, which is called the valence formula (see [Lan76] page 6 for a proof).

Theorem 2.2.1. (Valence Formula) Let $f \in M_k$ and $f \neq 0$. Then
$$\mathrm{ord}_\infty(f) + \frac{1}{3}\,\mathrm{ord}_\rho(f) + \frac{1}{2}\,\mathrm{ord}_i(f) + \sum_{P \in \mathbb{H}/\Gamma,\ P \neq i, \rho} \mathrm{ord}_P(f) = \frac{k}{12}, \tag{2.1}$$
where $\rho$ is $e^{2\pi i/3}$ and $\mathrm{ord}_Q$ refers to the order of vanishing of f at the point Q.
We also need the following important examples of modular forms. Let k > 2 be even. The Eisenstein series of weight k is
$$E_k(\tau) = 1 - \frac{2k}{B_k} \sum_{n \ge 1} \sigma_{k-1}(n) q^n, \quad \tau \in \mathbb{H} \text{ and } q = e^{2\pi i \tau}, \tag{2.2}$$
where $B_k$ is the k-th Bernoulli number (the coefficient of $x^k/k!$ in the Taylor expansion of $\frac{x}{e^x - 1}$) and $\sigma_{k-1}(n) = \sum_{d \mid n} d^{k-1}$. The Discriminant function $\Delta$ is defined by
$$\Delta(\tau) = q \prod_{n \ge 1} (1 - q^n)^{24}, \quad \tau \in \mathbb{H} \text{ and } q = e^{2\pi i \tau}. \tag{2.3}$$
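As an illustration (not part of the original text), the definitions (2.2) and (2.3) can be turned directly into a computation of the first few Fourier coefficients. The following is a minimal Python sketch; the function names `bernoulli`, `sigma`, `eisenstein` and `delta` are hypothetical, and the Bernoulli numbers are obtained from the standard recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ for $m \ge 1$, which matches the generating function $x/(e^x - 1)$.

```python
from fractions import Fraction
from math import comb


def bernoulli(k):
    """B_k, the coefficient of x^k/k! in x/(e^x - 1), via the usual recurrence."""
    B = [Fraction(0)] * (k + 1)
    B[0] = Fraction(1)
    for m in range(1, k + 1):
        # sum_{j=0}^{m} C(m+1, j) * B_j = 0, solved for B_m
        B[m] = -sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1)
    return B[k]


def sigma(n, s):
    """Divisor power sum sigma_s(n) = sum of d^s over the divisors d of n."""
    return sum(d ** s for d in range(1, n + 1) if n % d == 0)


def eisenstein(k, prec):
    """Coefficients a_0, ..., a_{prec-1} of E_k as in (2.2), for even k > 2."""
    c = -Fraction(2 * k) / bernoulli(k)
    return [Fraction(1)] + [c * sigma(n, k - 1) for n in range(1, prec)]


def delta(prec):
    """Coefficients of q^0, ..., q^{prec-1} of Delta = q * prod (1 - q^n)^24."""
    poly = [1] + [0] * (prec - 1)
    for n in range(1, prec):
        for _ in range(24):
            # multiply the truncated series by (1 - q^n) in place
            for i in range(prec - 1, n - 1, -1):
                poly[i] -= poly[i - n]
    return [0] + poly[:prec - 1]


print(eisenstein(4, 4))
print(delta(6))
```

Exact rational arithmetic is used for $E_k$ because $2k/B_k$ is not an integer for every k; the naive divisor sum and repeated multiplication are chosen for clarity, not speed.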
Theorem 2.2.2. (see [Lan76]). For k > 2 and even, $E_k \in M_k$ and $\Delta \in S_{12}$.

Finally, we are able to show the following well known result.

Theorem 2.2.3. Let k be a positive even integer.
1. $M_0 = \mathbb{C}$ and $M_2 = 0$.
2. $M_k = \mathbb{C}E_k$ for k = 4, 6, 8, 10 or 14.
3. $S_k = 0$ if k < 12, $S_{12} = \mathbb{C}\Delta$ and for k > 14, $S_k = \Delta M_{k-12}$.
4. $M_k = S_k \oplus \mathbb{C}E_k$ for k > 2.

Proof: We have already proved the first part of (1), and the second follows from the Valence formula since there is no way the non-negative terms on the left of (2.1) can add up to $\frac{1}{6}$. When k = 4, 6, 8, 10 or 14, there is exactly one possibility for $\mathrm{ord}_P(f)$ so that (2.1) holds:
for k = 4, we must have $\mathrm{ord}_\rho(f) = 1$ and $\mathrm{ord}_P(f) = 0$ for all other P;
for k = 6, we must have $\mathrm{ord}_i(f) = 1$ and $\mathrm{ord}_P(f) = 0$ for all other P;
for k = 8, we must have $\mathrm{ord}_\rho(f) = 2$ and $\mathrm{ord}_P(f) = 0$ for all other P;
for k = 10, we must have $\mathrm{ord}_\rho(f) = \mathrm{ord}_i(f) = 1$ and $\mathrm{ord}_P(f) = 0$ for all other P;
for k = 14, we must have $\mathrm{ord}_\rho(f) = 2$, $\mathrm{ord}_i(f) = 1$ and $\mathrm{ord}_P(f) = 0$ for all other P.

Now if $f, g \in M_k$ for k = 4, 6, 8, 10 or 14, then $f/g \in M_0$ as both f and g have the same zeros, and thus by part (1) they are constant multiples of each other. Thus (2) follows by taking $f = E_k$. For k = 12 and $f = \Delta$ the valence formula says that the only zero of $\Delta$ is at $i\infty$. Suppose $f \in S_k$; then $\mathrm{ord}_{i\infty}(f) > 0$ since it vanishes at infinity. Now $f/\Delta$ is a modular form of weight k − 12 since it is holomorphic everywhere. This gives us the claim (3). If $f \in M_k$ we can subtract a suitable multiple of the Eisenstein series $E_k$ to get a cusp form, thus $M_k = S_k \oplus \mathbb{C}E_k$. □
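We get the following corollary immediately:

Corollary 2.2.4. For even k we have
$$\dim_{\mathbb{C}} M_k = \begin{cases} \lfloor k/12 \rfloor + 1, & \text{if } k \not\equiv 2 \bmod 12 \\ \lfloor k/12 \rfloor, & \text{if } k \equiv 2 \bmod 12. \end{cases}$$
Furthermore, a basis for the space $M_k$ is given by the set of forms $\Delta^\ell E_{k-12\ell}$ for $0 \le \ell \le \frac{k-4}{12}$, and if k is divisible by 12 the function $\Delta^{k/12}$ is also in the basis.

Proof: (Sketch) Given the above theorem, the only thing left to check is that the forms in the claimed basis are linearly independent. This is easily accomplished by looking at their Fourier expansions: the expansion of $\Delta^\ell E_{k-12\ell}$ begins with $q^\ell$, so the forms have distinct leading exponents. □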
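The count in Corollary 2.2.4 is easy to check mechanically. The sketch below (an illustration only; the names `dim_M` and `basis_labels` are hypothetical) lists the basis as pairs $(\ell, k - 12\ell)$ standing for $\Delta^\ell E_{k-12\ell}$, with the pair $(k/12, 0)$ standing for $\Delta^{k/12}$, and compares the number of basis elements with the dimension formula.

```python
def dim_M(k):
    """Dimension of M_k for level 1, following Corollary 2.2.4 (0 for odd or negative k)."""
    if k < 0 or k % 2 == 1:
        return 0
    return k // 12 if k % 12 == 2 else k // 12 + 1


def basis_labels(k):
    """Labels (l, k - 12*l) for the basis Delta^l * E_{k-12l}; (k/12, 0) means Delta^(k/12)."""
    labels = []
    if k >= 4:
        labels = [(l, k - 12 * l) for l in range((k - 4) // 12 + 1)]
    if k % 12 == 0:
        labels.append((k // 12, 0))
    return labels


for k in (0, 2, 4, 12, 14, 24, 26):
    print(k, dim_M(k), basis_labels(k))
```

For k = 0 the only label is (0, 0), which stands for the constant function 1.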
2.2.1 The Modular j-function

The functions $E_4^3$ and $\Delta$ are both in $M_{12}$, and with $\Delta$ normalized as in (2.3) one has $E_4^3 - E_6^2 = 1728\Delta$. So the function
$$j(z) \overset{\mathrm{def}}{=} 1728\, \frac{E_4^3}{E_4^3 - E_6^2} = \frac{E_4^3}{\Delta}$$
is a weight 0 modular function. We can see that j(z) has a pole only at $i\infty$ since $\Delta$ vanishes only at $i\infty$ whereas $E_4^3$ does not vanish at $i\infty$ (since it is not a cusp form). The Fourier expansion of j begins as follows:
$$j(z) = \frac{1}{q} + 744 + 196884q + 21493760q^2 + 864299970q^3 + 20245856256q^4 + \cdots$$
It turns out that j uniformizes the function field of $X_0(1)$.
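The expansion of j can be reproduced from the expansions of $E_4$ and $\Delta$ by power series arithmetic. The following Python sketch is an illustration under the assumption that $j = E_4^3/\Delta$ with $\Delta = q\prod_{n \ge 1}(1 - q^n)^{24}$; the helper names are hypothetical. It computes $E_4^3 \cdot (\Delta/q)^{-1}$, whose coefficient of $q^m$ is the coefficient of $q^{m-1}$ in j.

```python
def sigma3(n):
    """sigma_3(n), the sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def multiply(a, b, prec):
    """Product of two power series, truncated to prec coefficients."""
    out = [0] * prec
    for i, x in enumerate(a[:prec]):
        if x:
            for j, y in enumerate(b[:prec - i]):
                out[i + j] += x * y
    return out


def inverse(a, prec):
    """Inverse of a power series with constant term 1, truncated to prec coefficients."""
    inv = [1] + [0] * (prec - 1)
    for n in range(1, prec):
        inv[n] = -sum(a[i] * inv[n - i] for i in range(1, n + 1))
    return inv


def j_coefficients(prec):
    """Coefficients of q^{-1}, q^0, q^1, ... (prec of them) in the expansion of j."""
    e4 = [1] + [240 * sigma3(n) for n in range(1, prec)]
    eta24 = [1] + [0] * (prec - 1)  # Delta / q = prod (1 - q^n)^24
    for n in range(1, prec):
        for _ in range(24):
            for i in range(prec - 1, n - 1, -1):
                eta24[i] -= eta24[i - n]
    e4_cubed = multiply(multiply(e4, e4, prec), e4, prec)
    return multiply(e4_cubed, inverse(eta24, prec), prec)


print(j_coefficients(6))
```

All arithmetic stays in the integers because $\Delta/q$ has constant term 1, so its inverse has integer coefficients.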
2.3 Bounds for the Fourier coefficients
We begin with estimates for cusp forms. The following is usually referred to as the Hecke bound.

Lemma 2.3.1. Let $f \in S_k$ with Fourier expansion
$$f(z) = \sum_{n \ge 1} a_n q^n.$$
Then $|a_n| = O\left(n^{k/2}\right)$.

Proof: Let $z = x + iy$. The transformation formula for f shows that the function $z \mapsto y^{k/2}|f(z)|$ is invariant under the action of $\Gamma$. Since f is a cusp form, this function tends to 0 as $y \to \infty$, and hence it is bounded on $\mathbb{H}$. We get
$$|f(x + iy)| \ll \frac{1}{y^{k/2}} \quad \text{for } y \to 0.$$
Finding the n-th Fourier coefficient by integration yields
$$e^{-2\pi n y} a_n = \int_0^1 f(x + iy) e^{-2\pi i n x}\, dx.$$
This is true for any value of y, so we let $y = \frac{1}{n}$ to get $|a_n| \ll n^{k/2}$. □
The famous Ramanujan conjecture states that in fact,
$$|a_n| \ll n^{\frac{k-1}{2} + \epsilon} \quad \text{for any } \epsilon > 0,$$
where $a_n$ are the Fourier coefficients of a cusp form of weight k. This conjecture was proved by Deligne as a consequence of his proof of the Weil conjectures.
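As a numerical sanity check for the cusp form $\Delta \in S_{12}$ (an illustration, not part of the original argument), the sketch below computes $\tau(n)$, the n-th Fourier coefficient of $\Delta$, from the product (2.3) and tests the explicit form of Deligne's bound, $|\tau(n)| \le d(n)\, n^{11/2}$, where $d(n)$ is the number of divisors of n. The comparison is done on squares so that only integer arithmetic is needed; the function names are hypothetical.

```python
def delta_coefficients(prec):
    """Coefficients tau(0)=0, tau(1), ..., tau(prec-1) of Delta = q * prod (1 - q^n)^24."""
    poly = [1] + [0] * (prec - 1)
    for n in range(1, prec):
        for _ in range(24):
            # multiply the truncated series by (1 - q^n) in place
            for i in range(prec - 1, n - 1, -1):
                poly[i] -= poly[i - n]
    return [0] + poly[:prec - 1]


def num_divisors(n):
    """d(n), the number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def satisfies_deligne(n, tau_n):
    """Check |tau(n)| <= d(n) * n^(11/2) by comparing squares of both sides."""
    return tau_n ** 2 <= num_divisors(n) ** 2 * n ** 11


tau = delta_coefficients(60)
print(all(satisfies_deligne(n, tau[n]) for n in range(1, 60)))
```

A finite check of this kind proves nothing, of course; it only shows how the bound reads on concrete coefficients.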
We know from Theorem 2.2.3 and Corollary 2.2.4 that if $f \in M_k$ then $f = \alpha E_k + g$ where g is a cusp form of weight k and $\alpha$ is a constant. Furthermore, we know explicitly the Fourier coefficients of $E_k$ and one can show that they are $O(n^{k-1+\epsilon})$ for every $\epsilon > 0$. Thus we get:

Lemma 2.3.2. Let $f \in M_k$ with Fourier expansion
$$f(z) = \sum_{n \ge 0} a_n q^n.$$
Then $|a_n| \ll n^{k-1+\epsilon}$ for every $\epsilon > 0$.
2.4 Hecke Operators
There is an important algebra of operators, the Hecke operators, on the space of modular forms, with very useful properties. We define these operators by their operation on the Fourier expansion of the modular forms and enumerate some of their most important properties. Let m be a positive integer. Define $T_{m,k}$, the m-th Hecke operator on the space $M_k$, as follows: If $f(z) = \sum_{n \ge 0} a(n) q^n$ then
$$f(z)|T_{m,k} = \sum_{n \ge 0} \left( \sum_{d \mid \gcd(m,n)} d^{k-1}\, a\!\left(\frac{mn}{d^2}\right) \right) q^n.$$
We suppress the weight k in the notation of the Hecke operator whenever the weight is clear from the context.
Proposition 2.4.1. For each positive integer n, the n-th Hecke operator $T_{n,k}$ is a linear operator on the space $M_k$ that preserves the subspace $S_k$ of cusp forms.

Theorem 2.4.2. The Hecke operators commute and are multiplicative, i.e., if gcd(m, n) = 1 then $T_{mn} = T_m T_n$. If p is a prime then $T_{p^\ell} = T_{p^{\ell-1}} T_p - p^{k-1} T_{p^{\ell-2}}$ for $\ell \ge 2$.

Definition 2.4.3. A modular form $f \in M_k$ is called a Hecke eigenform if for every $m \ge 2$ there is a complex number $\lambda(m)$ for which $f(z)|T_m = \lambda(m) f(z)$.

For example, $\Delta \in S_{12}$ and $S_{12}$ is 1-dimensional; since each $T_m$ preserves $S_{12}$ (Proposition 2.4.1), $\Delta$ is a Hecke eigenform.

Theorem 2.4.4. Suppose that $f(z) = \sum_{n \ge 0} a(n) q^n \in M_k$ is a Hecke eigenform for which $f(z)|T_m = \lambda(m) f(z)$.
1. If f(z) is non-constant, then $a(1) \neq 0$.
2. If f(z) is a cusp form normalized so that a(1) = 1, then $a(m) = \lambda(m)$. Moreover, if m and n are coprime, then $a(n)a(m) = a(mn)$.

Another important theorem is the following:

Theorem 2.4.5. There is a basis of Hecke eigenforms for the spaces $M_k$ and $S_k$.
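The definition of $T_{m,k}$ acts directly on a truncated list of Fourier coefficients, which makes the statements above easy to test on $\Delta$. The sketch below is an illustration only (function names are hypothetical): it applies $T_{m,12}$ to the expansion of $\Delta$ and compares the result with $\tau(m)\Delta$, as Theorem 2.4.4 predicts for the normalized eigenform $\Delta$. Note that from P known coefficients of f one only obtains the coefficients b(n) of $f|T_m$ with $mn < P$.

```python
from math import gcd


def delta_coefficients(prec):
    """Coefficients tau(0)=0, tau(1), ..., tau(prec-1) of Delta = q * prod (1 - q^n)^24."""
    poly = [1] + [0] * (prec - 1)
    for n in range(1, prec):
        for _ in range(24):
            for i in range(prec - 1, n - 1, -1):
                poly[i] -= poly[i - n]
    return [0] + poly[:prec - 1]


def hecke_operator(coeffs, m, k):
    """Apply T_{m,k} to the q-expansion coeffs[n] = a(n); returns b(n) for all n with m*n < len(coeffs)."""
    result = []
    n = 0
    while m * n < len(coeffs):
        g = gcd(m, n)  # gcd(m, 0) = m, so n = 0 sums over all divisors of m
        b_n = sum(d ** (k - 1) * coeffs[m * n // (d * d)]
                  for d in range(1, g + 1) if g % d == 0)
        result.append(b_n)
        n += 1
    return result


tau = delta_coefficients(60)
for m in (2, 3, 4, 5, 6):
    image = hecke_operator(tau, m, 12)
    print(m, image == [tau[m] * t for t in tau[:len(image)]])
```

The same eigenvalues also satisfy the relations of Theorem 2.4.2, for example $\tau(6) = \tau(2)\tau(3)$ and $\tau(4) = \tau(2)^2 - 2^{11}$.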
2.5 Modular forms of Higher Level

Modular forms of higher level are obtained by requiring the transformation property of Definition 2.1.1 to hold for subgroups of $\Gamma$ of finite index. The most important cases are the following groups.
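Definition 2.5.1. If N is a positive integer, then define
$$\Gamma_0(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma \;\middle|\; c \equiv 0 \bmod N \right\},$$
and
$$\Gamma_1(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N) \;\middle|\; a \equiv 1 \bmod N \right\}.$$
These subgroups are called congruence subgroups of level N.

Definition 2.5.2. A holomorphic function $f : \mathbb{H} \to \mathbb{C}$ is called a modular form of weight k on a congruence subgroup $\Gamma$ (either $\Gamma_1(N)$ or $\Gamma_0(N)$) of level N if the following hold:
1. $f\left(\frac{az+b}{cz+d}\right) = (cz+d)^k f(z)$ for all $z \in \mathbb{H}$ and $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$.
2. If $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in PSL(2, \mathbb{Z})$, then the function $g_\gamma(z) := (cz+d)^{-k} f\left(\frac{az+b}{cz+d}\right)$ has a Fourier expansion of the form
$$g_\gamma(z) = \sum_{n \ge 0} a_\gamma(n)\, q_N^n,$$
where $q_N = e^{2\pi i z/N}$.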
We adopt the notation $M_k(\Gamma_0(N))$ (resp. $S_k(\Gamma_0(N))$) and $M_k(\Gamma_1(N))$ (resp. $S_k(\Gamma_1(N))$) for the $\mathbb{C}$-vector space of modular forms (resp. cusp forms) of weight k for the congruence subgroups $\Gamma_0(N)$ and $\Gamma_1(N)$ respectively.

We also identify certain subspaces of $M_k(\Gamma_1(N))$ that transform nicely with respect to the action by $\Gamma_0(N)$.
Definition 2.5.3. Let $\chi$ be a Dirichlet character modulo N. Then we say that a modular form $f \in M_k(\Gamma_1(N))$ is a modular form of weight k and Nebentypus (character) $\chi$ if
$$f\left(\frac{az+b}{cz+d}\right) = \chi(d)(cz+d)^k f(z), \quad \text{for all } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N).$$
The space of such modular forms is denoted $M_k(\Gamma_0(N), \chi)$ (the space of such cusp forms is denoted $S_k(\Gamma_0(N), \chi)$).

Remark 2.5.4. Since $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \in \Gamma_0(N)$, if $\chi(-1) \neq (-1)^k$, then there are no non-zero modular forms in $M_k(\Gamma_0(N), \chi)$.
2.5.1 Modular Curves

The surface $\mathbb{H}/\Gamma_0(N)$ is non-compact and can be compactified by adding finitely many points. This results in a compact Riemann surface, which is a curve over the complex numbers, denoted $X_0(N)$. These are the so-called modular curves. They are moduli spaces of elliptic curves with level N structure and turn out to be defined over $\mathbb{Q}$. In Chapter 8 we give an algorithm to compute a (singular) model of $X_0(\ell)$ where $\ell$ is a prime.
2.6 Dimensions of Modular Forms of Higher Level
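The following formulas for the dimensions of $M_k(\Gamma_0(N), \chi)$ are from [CO77] (see also [Ono04] §1.2.3). We need some notation to state the formulas. Let k be an integer (positive or negative), and let $\chi$ be a Dirichlet character modulo N for which $\chi(-1) = (-1)^k$ (otherwise $M_k(\Gamma_0(N), \chi) = 0$). If $p \mid N$ is a prime, then let $r_p$ (resp. $s_p$) denote the exact power of p dividing N (resp. the conductor of $\chi$). Define the integer $\lambda(r_p, s_p, p)$ by
$$\lambda(r_p, s_p, p) = \begin{cases} p^{r'} + p^{r'-1} & \text{if } 2s_p \le r_p = 2r'; \\ 2p^{r'} & \text{if } 2s_p \le r_p = 2r' + 1; \\ 2p^{r_p - s_p} & \text{if } 2s_p > r_p. \end{cases}$$
Define rational numbers $\nu_k$ and $\mu_k$ by
$$\nu_k = \begin{cases} 0 & \text{if } k \text{ is odd}; \\ -\frac{1}{4} & \text{if } k \equiv 2 \bmod 4; \\ \frac{1}{4} & \text{if } k \equiv 0 \bmod 4, \end{cases} \qquad \mu_k = \begin{cases} 0 & \text{if } k \equiv 1 \bmod 3; \\ -\frac{1}{3} & \text{if } k \equiv 2 \bmod 3; \\ \frac{1}{3} & \text{if } k \equiv 0 \bmod 3. \end{cases}$$
With the above notation, we have:

Theorem 2.6.1. If k is an integer and $\chi$ a Dirichlet character modulo N for which $\chi(-1) = (-1)^k$, then
$$\dim_{\mathbb{C}}(S_k(\Gamma_0(N), \chi)) - \dim_{\mathbb{C}}(M_{2-k}(\Gamma_0(N), \chi)) = \frac{(k-1)N}{12} \prod_{p \mid N} \left(1 + \frac{1}{p}\right) - \frac{1}{2} \prod_{p \mid N} \lambda(r_p, s_p, p) + \nu_k \sum_{\substack{x \bmod N, \\ x^2 + 1 \equiv 0 \bmod N}} \chi(x) + \mu_k \sum_{\substack{x \bmod N, \\ x^2 + x + 1 \equiv 0 \bmod N}} \chi(x).$$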
To use this theorem one notes that if k > 2, then $\dim_{\mathbb{C}}(M_{2-k}(\Gamma_0(N), \chi)) = 0$. Hence the left hand side of Theorem 2.6.1 reduces to $\dim_{\mathbb{C}}(S_k(\Gamma_0(N), \chi))$. A similar argument applies when k = 2, depending on whether $\chi$ is trivial. If $k \le 0$, then $\dim_{\mathbb{C}}(S_k(\Gamma_0(N), \chi)) = 0$. In these cases the left hand side of Theorem 2.6.1 reduces to $-\dim_{\mathbb{C}}(M_{2-k}(\Gamma_0(N), \chi))$. One can also derive the dimensions of $M_k(\Gamma_1(N))$ and $S_k(\Gamma_1(N))$ using the following decomposition of the space:
$$M_k(\Gamma_1(N)) = \bigoplus_{\chi} M_k(\Gamma_0(N), \chi), \qquad S_k(\Gamma_1(N)) = \bigoplus_{\chi} S_k(\Gamma_0(N), \chi),$$
where both sums are over all Dirichlet characters $\chi$ modulo N.
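To show how the formula is used in practice, here is a minimal Python sketch for the special case of the trivial character and even weight $k \ge 2$ (an assumption made to keep the example short; the function names are hypothetical). In this case the conductor is 1, so $s_p = 0$ for every p, $\chi(x) = 1$ on every solution of the two congruences, and $\dim M_{2-k}(\Gamma_0(N))$ is 1 for k = 2 (the constants) and 0 for k > 2.

```python
from fractions import Fraction


def factorize(N):
    """Prime factorization of N as a dict {p: r_p}, by trial division."""
    factors = {}
    p = 2
    while p * p <= N:
        while N % p == 0:
            factors[p] = factors.get(p, 0) + 1
            N //= p
        p += 1
    if N > 1:
        factors[N] = factors.get(N, 0) + 1
    return factors


def lam(r, s, p):
    """The integer lambda(r_p, s_p, p) defined before Theorem 2.6.1."""
    if 2 * s <= r:
        half = r // 2
        if r % 2 == 0:
            return p ** half + p ** (half - 1)
        return 2 * p ** half
    return 2 * p ** (r - s)


def dim_cusp_forms_trivial_character(N, k):
    """dim S_k(Gamma_0(N)) for the trivial character and even k >= 2, via Theorem 2.6.1."""
    if k < 2 or k % 2 == 1:
        raise ValueError("this sketch only handles even k >= 2")
    main = Fraction((k - 1) * N, 12)
    lam_product = Fraction(1)
    for p, r in factorize(N).items():
        main *= Fraction(p + 1, p)
        lam_product *= lam(r, 0, p)  # s_p = 0 for the trivial character
    nu = Fraction(-1, 4) if k % 4 == 2 else Fraction(1, 4)
    mu = {1: Fraction(0), 2: Fraction(-1, 3), 0: Fraction(1, 3)}[k % 3]
    sum4 = sum(1 for x in range(N) if (x * x + 1) % N == 0)
    sum3 = sum(1 for x in range(N) if (x * x + x + 1) % N == 0)
    rhs = main - lam_product / 2 + nu * sum4 + mu * sum3
    dim_M_dual = 1 if k == 2 else 0  # dim M_{2-k}(Gamma_0(N))
    dim = rhs + dim_M_dual
    assert dim.denominator == 1
    return int(dim)


for N, k in [(1, 12), (11, 2), (23, 2), (37, 2)]:
    print(N, k, dim_cusp_forms_trivial_character(N, k))
```

For k = 2 the result is the genus of $X_0(N)$; handling a non-trivial $\chi$ would additionally require the conductor of $\chi$ and its values on the solutions of the congruences.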
2.7 Hecke Operators for Modular Forms of Higher Level
The general definition of Hecke operators is given below:

Definition 2.7.1. If $f(z) = \sum_{n \ge 0} a(n) q^n \in M_k(\Gamma_0(N), \chi)$, then for m coprime to N the action of the m-th Hecke operator $T_{m,k,\chi}$ is given by
$$f(z)|T_{m,k,\chi} = \sum_{n \ge 0} \left( \sum_{d \mid \gcd(m,n)} \chi(d)\, d^{k-1}\, a\!\left(\frac{mn}{d^2}\right) \right) q^n.$$
We will drop the subscripts k and $\chi$ if it is clear from the context what they are.

Proposition 2.7.2. For each positive integer n with gcd(n, N) = 1, the n-th Hecke operator $T_{n,k,\chi}$ is a linear operator on the space $M_k(\Gamma_0(N), \chi)$ that preserves the subspace $S_k(\Gamma_0(N), \chi)$ of cusp forms.

Proposition 2.7.3. The Hecke operators commute and are multiplicative, i.e., $T_{m,k,\chi} T_{n,k,\chi} = T_{mn,k,\chi}$ for gcd(m, n) = 1. If p is a prime then $T_{p^\ell} = T_{p^{\ell-1}} T_p - \chi(p) p^{k-1} T_{p^{\ell-2}}$ for gcd(p, N) = 1 and $\ell \ge 2$.
Definition 2.7.4. A modular form $f \in M_k(\Gamma_0(N), \chi)$ is called a Hecke eigenform if for every $m \ge 2$ with gcd(m, N) = 1 there is a complex number $\lambda(m)$ for which $f(z)|T_m = \lambda(m) f(z)$.

Theorem 2.7.5. Suppose that $f(z) = \sum_{n \ge 0} a(n) q^n \in M_k(\Gamma_0(N), \chi)$ is a Hecke eigenform for which $f(z)|T_m = \lambda(m) f(z)$.
1. If f(z) is non-constant, then $a(1) \neq 0$.
2. If f(z) is a cusp form normalized so that a(1) = 1, then $a(m) = \lambda(m)$. Moreover, if m and n are coprime, then $a(n)a(m) = a(mn)$.

There is a basis for the space of cusp forms made up of eigenforms.

Theorem 2.7.6. There exists a basis of the space $S_k(\Gamma_0(N), \chi)$ whose elements are Hecke eigenforms.
Chapter 3 Counting Lattice Vectors

In this chapter we study the problem of exactly counting the number of vectors in a lattice at a given distance (under the $L_2$-norm). We show that the problem is #P-complete if inputs are in binary. Then we give a deterministic algorithm for counting the number of lattice vectors in a lattice. The algorithm is asymptotically much faster than exhaustive search and uses modular forms in an essential way. One can view the algorithm as providing a reduction from the #P-complete problem of counting lattice vectors to that of computing Fourier coefficients of modular forms. This view yields our first result on the complexity of computing Fourier coefficients of modular forms. The results of this chapter are from [Cha05c].
3.1
Definition of the Problem
A lattice $L \subseteq \mathbb{Q}^n$ is the integer linear span of r ≤ n linearly independent vectors of $\mathbb{Q}^n$. In other words L is a $\mathbb{Z}$-submodule of $\mathbb{Q}^n$, not necessarily of full rank. Our encoding of a lattice lists the basis vectors whose entries are given in binary. Throughout this chapter when we refer to norm (or length of a vector) we mean the $L_2$ norm, i.e., if $v = (a_1, \dots, a_n) \in \mathbb{Q}^n$ then
$$\|v\|^2 = \sum_{1 \le i \le n} a_i^2.$$
If L is a lattice, then we define a function $\vartheta_L : \mathbb{N} \to \mathbb{N}$ by $\vartheta_L(d) = \#\{v \in L : \|v\|^2 = d\}$. The computational problem that we are interested in is the following:
Counting Lattice Vectors
Input: A lattice $L \subseteq \mathbb{Z}^n$ (all the basis vectors have integer coordinates), and an integer d in binary.
Question: What is $\vartheta_L(d)$?
The assumption that the basis vectors have integer coordinates is mild since any lattice can be scaled up by $\alpha \in \mathbb{Z}$ (say) so that every basis vector has integer coordinates and furthermore $\vartheta_L(d) = \vartheta_{\alpha L}(\alpha^2 d)$.
3.2
The Class #P
The class #P is a complexity class that consists of functions. The formal definition is given below:

Definition 3.2.1. A function $f : \{0,1\}^* \to \mathbb{N}$ belongs to the class #P iff there exists a predicate $R : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}$ such that the following holds:
1. R(x, y) can be computed in time $|x|^c$, where c is a constant that depends only on f.
2. Moreover, $f(x) = \#\{y : R(x, y) = 1\}$.

Informally, #P consists of functions mapping strings to the natural numbers that have the property that the value of the function at an argument is the cardinality of a set for which membership can be efficiently checked (in a uniform fashion).

Example 3.2.2. The function #SAT that counts the number of satisfying assignments of a boolean formula given in CNF is in #P.

Next, we introduce the idea of #P-hardness:

Definition 3.2.3. A function $f : \{0,1\}^* \to \mathbb{N}$ is said to be #P-hard if every function g ∈ #P can be computed by a polynomial time Turing machine that is allowed oracle access to the function f. If f is #P-hard and also belongs to the class #P, then we say that f is #P-complete.

Example 3.2.4. #SAT is an example of a #P-complete function. This is because the reduction that shows SAT is NP-hard is a parsimonious reduction.

Remark 3.2.5. Note that we can efficiently solve SAT if we are allowed oracle access to #SAT. This observation shows that if every function in #P admits polynomial time algorithms then in particular, NP = P. Thus, it is unlikely that #P-complete functions can be computed in polynomial time.
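The observation of Remark 3.2.5 can be illustrated with a small Python sketch. The names `count_sat` and `sat_via_count_oracle` and the DIMACS-style clause encoding are assumptions made for this illustration; the brute-force counter merely stands in for a #SAT oracle and takes exponential time.

```python
from itertools import product

def count_sat(num_vars, clauses):
    """Brute-force #SAT for a CNF formula.

    Each clause is a list of nonzero integers (an assumed DIMACS-style
    encoding): literal k means variable k is true, -k means it is false.
    """
    count = 0
    for assignment in product([False, True], repeat=num_vars):
        if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count

def sat_via_count_oracle(num_vars, clauses, oracle=count_sat):
    """Decide SAT with a single query to a counting oracle."""
    return oracle(num_vars, clauses) > 0
```

For instance, the formula (x1 ∨ x2) ∧ (¬x1 ∨ x3) is encoded as `[[1, 2], [-1, 3]]`; a single oracle query decides satisfiability.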
3.3
Background and Statement of Results
Ajtai showed in [Ajt98] that finding the shortest non-zero vector in a lattice in the $L_2$-norm is NP-hard. But the reduction he obtained is randomized and non-parsimonious, thus the #P-hardness of the counting version of this problem remained open. In [RS99] Ravi Kumar and Sivakumar asked whether counting lattice vectors is #P-hard. We show the following hardness results regarding this problem.
1. Counting lattice vectors is #P-complete. This resolves the question of the hardness of the counting problem.
2. There is a randomized polynomial time reduction from integer factorization to the problem of counting lattice vectors in lattices of fixed rank r ≥ 8.
3. There is a randomized polynomial time reduction from integer factorization to the problem of counting lattice vectors in lattices generated by vectors of bounded norm.

We describe an algorithm to compute $\vartheta_L(d)$ in time $2^{O(rs + \log d)}$, where r is the rank of the lattice and s is the number of bits of the encoding of L. The exhaustive search method leads to an algorithm that requires $2^{O(r \log d)}$ time, thus our method is faster for large rank r and norm d.

Remark 3.3.1. One could consider a variant of the problem which is perhaps more natural, namely that of counting the number of vectors of norm at most d for a lattice L. It is evident that computing $\vartheta_L(d)$ Turing reduces in polynomial time to this problem. Thus this variant of the problem is also #P-complete as a consequence of our theorem. Furthermore, our algorithm for computing $\vartheta_L(d)$ can be used to solve this problem by computing the sum $\sum_{\ell \le d} \vartheta_L(\ell)$ in the same asymptotic running time. Thus both variants are equivalent for our considerations. However, the reduction we present in the next section is considerably simpler for the version of the problem that we have stated.
3.4
The #P-hardness Result
Theorem 3.4.1. Counting Lattice Vectors is #P-complete under polynomial time Turing reductions.

Proof: It is easy to see that the problem of Counting Lattice Vectors is in #P, so we concentrate on showing that the problem is hard for the class #P.

It is known that computing the permanent of an n × n matrix with entries in {0, 1} is #P-complete (see [Val79]). Our aim is to give a polynomial time reduction from computing the permanent of such matrices to counting lattice vectors in suitable lattices.
We are given a matrix $M = (a_{ij})_{1 \le i,j \le n}$, where $a_{ij} \in \{0, 1\}$. We wish to compute $\mathrm{Per}\, M = \sum_{\sigma \in S_n} \prod_{1 \le i \le n} a_{i\sigma(i)}$, where $S_n$ denotes the full group of permutations of n letters.

Let $\log n < \alpha_1 < \alpha_2 < \alpha_3 < \dots < \alpha_n < \beta_1 < \beta_2 < \dots < \beta_n < \gamma$ be a sequence of 2n + 1 integers. Consider the lattice $L \subseteq \mathbb{Q}^{3n^2}$ of rank $n^2$ given by basis vectors that are defined below. A vector in $\mathbb{Q}^{3n^2}$ is given by a tuple of $3n^2$ rational numbers. We treat this tuple as being made up of three blocks each of $n^2$ consecutive entries of the vector. Each block in turn can be thought of as an n × n matrix. We will call these blocks the A, B and C blocks respectively. We now define the basis vector $v_{ij}$ for 1 ≤ i, j ≤ n. Each block will have at most one non-zero entry. The ⟨i, j⟩-th entry of the A-block of $v_{ij}$ is $a_{ij}$ from the matrix M. The ⟨i, j⟩-th entry of the B-block of $v_{ij}$ is $2^{\alpha_i}$ if $a_{ij} = 1$ and $2^{\gamma}$ otherwise, and the ⟨i, j⟩-th entry of the C-block of $v_{ij}$ is $2^{\beta_j}$ if $a_{ij} = 1$ and $2^{\gamma}$ otherwise. This completes the definition of the lattice. It is clear that the rank of L is $n^2$.
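We make the following key claim:

Claim: There are choices of the sequence $\langle \alpha_i, \beta_j \rangle_{1 \le i,j \le n}$ and γ such that the following is true. Suppose $v = \theta_{11} v_{11} + \theta_{12} v_{12} + \dots + \theta_{nn} v_{nn}$. Then $\|v\|^2 = n + \sum_{1 \le i \le n} (2^{2\alpha_i} + 2^{2\beta_i})$ iff there is a $\sigma \in S_n$ such that $\prod_{1 \le i \le n} a_{i\sigma(i)} = 1$.

Proof of Claim: First suppose there is a $\sigma \in S_n$ such that $\prod_{1 \le i \le n} a_{i\sigma(i)} = 1$, then the vector $v = \sum_{1 \le i \le n} v_{i\sigma(i)}$ has $\|v\|^2 = n + \sum_{1 \le i \le n} (2^{2\alpha_i} + 2^{2\beta_i})$.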
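Let $D = n + \sum_{1 \le i \le n} (2^{2\alpha_i} + 2^{2\beta_i})$, and let $v = \theta_{11} v_{11} + \theta_{12} v_{12} + \dots + \theta_{nn} v_{nn}$ be a vector in the lattice L such that $\|v\|^2 = D$.

As the $v_{ij}$ are orthogonal we get that
$$D = \|v\|^2 = \langle v, v \rangle = \sum_{1 \le i,j \le n} \theta_{ij}^2 \|v_{ij}\|^2. \qquad (3.1)$$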
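Note that if $\theta_{ij} \ne 0$ this implies that $a_{ij} = 1$, for otherwise $\|v\|^2 \ge \|v_{ij}\|^2 \ge 2^{2\gamma+1} > D$. Let $\delta_{ij} = \theta_{ij}^2 \|v_{ij}\|^2$, so that if $\theta_{ij} \ne 0$ then $\delta_{ij} = \theta_{ij}^2 (1 + 2^{2\alpha_i} + 2^{2\beta_j})$. Reducing both sides of equation (3.1) modulo $2^{2\alpha_1}$ we get: $\sum_{1 \le i,j \le n} \theta_{ij}^2 \equiv n \bmod 2^{2\alpha_1}$. If $\sum_{1 \le i,j \le n} \theta_{ij}^2 = n + k\, 2^{2\alpha_1}$ with k ≥ 1 then there is a $\theta_{rs}$ such that
$$\theta_{rs}^2 \ge \frac{2^{2\alpha_1}}{n^2}.$$
This implies that $\delta_{rs} \ge 2^{2\alpha_1 + 2\alpha_r - 2\log n}$. Suppose we select $\alpha_i$ and $\beta_j$ such that $\beta_n < 2^{2\alpha_1 - \log n}$ then $\delta_{rs} > D$ which is impossible. Thus k = 0 and the congruence is an equality, so that
$$\sum_{1 \le i,j \le n} \theta_{ij}^2 = n.$$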
Thus we get that $|\theta_{ij}| \le \sqrt{n}$ and since they are integers there are at most n $\theta_{ij}$'s that are non-zero. If, in addition, we have $n^3 2^{\alpha_i} < 2^{\alpha_{i+1}}$ and $n^3 2^{\beta_i} < 2^{\beta_{i+1}}$ then we argue that in fact $|\theta_{ij}| \le 1$. Suppose to the contrary we had a vector with $1 < |\theta_{ij}|^2 \le n$, then $\delta_{ij} = \theta_{ij}^2 + \theta_{ij}^2 2^{2\alpha_i} + \theta_{ij}^2 2^{2\beta_j}$. Now $\theta_{ij}^2 2^{2\alpha_i} > 2^{2\alpha_i}$ so there must be at least one other vector which helps this vector “cheat” so that the sum adds to a valid power $2^{2\alpha_k}$ (say). Let S be the set of basis vectors that help to make $\theta_{ij}^2 2^{2\alpha_i}$ another valid power of 2. But $|S| \le n^2$ and each of these vectors can add a factor of at most $n 2^{2\alpha_i}$ to the norm to boost it to the next valid power of 2, but then since $n^3 2^{\alpha_i} < 2^{\alpha_{i+1}}$ this is impossible. Thus the set S is empty and all the $|\theta_{ij}| \le 1$.
But now we have $\sum_{1 \le i,j \le n} \theta_{ij}^2 = n$, with each $|\theta_{ij}| \le 1$ and $\theta_{ij}$ are integers. This implies that there must be exactly n non-zero $\theta_{ij}$. Suppose $\theta_{ij_1}, \theta_{ij_2}, \dots, \theta_{ij_k}$ with 1 < k ≤ n are all non-zero. Then clearly the $2^{2\alpha_i}$ term of the norm of v cannot be accounted for by any of the basis elements, thus for each i there is exactly one j such that $|\theta_{ij}| = 1$. This defines for us a permutation $\sigma \in S_n$ such that for each i, 1 ≤ i ≤ n, $|\theta_{i\sigma(i)}| = 1$. It is now evident that $\prod_{1 \le i \le n} a_{i\sigma(i)} = 1$. Thus we have proved the claim.
To finish the proof of the theorem note that for each $\sigma \in S_n$ if $\prod_{1 \le i \le n} a_{i\sigma(i)} = 1$ then there are $2^n$ vectors given by $\sum_{1 \le i \le n} \pm v_{i\sigma(i)}$ of norm square $n + \sum_{1 \le i \le n} (2^{2\alpha_i} + 2^{2\beta_i})$. Hence we have that:
$$2^n\, \mathrm{Per}\, M = \#\Bigl\{ v \in L : \|v\|^2 = n + \sum_{1 \le i \le n} (2^{2\alpha_i} + 2^{2\beta_i}) \Bigr\}.$$
Since a sequence $\alpha_i$, $\beta_j$ and γ that satisfies all the conditions imposed in our proof can be picked in polynomial time, this proves the theorem. In particular, an acceptable sequence would be $\alpha_1 = cn^2$, for some constant c > 0; $\alpha_i = cn^2 + i\,b \lfloor \log n \rfloor$, for b > 3 another constant and 1 < i ≤ n; $\beta_i = cn^2 + (i + n)\, b \lfloor \log n \rfloor$ and $\gamma > \beta_n$. □
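The identity $2^n\,\mathrm{Per}\,M = \vartheta_L(D)$ can be checked by brute force on tiny instances. The following Python sketch does this for n = 2; the helper names and the concrete parameters α = (4, 6), β = (8, 10), γ = 12 are assumptions chosen by hand for illustration and are not claimed to satisfy every inequality imposed in the proof. Because the basis vectors $v_{ij}$ are orthogonal, only their squared norms are needed.

```python
from itertools import permutations

def permanent(M):
    """Permanent of a square matrix straight from the definition."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= M[i][sigma[i]]
        total += term
    return total

def basis_norms(M, alpha, beta, gamma):
    """Squared norms ||v_ij||^2 of the orthogonal basis built from M."""
    n = len(M)
    norms = []
    for i in range(n):
        for j in range(n):
            if M[i][j] == 1:
                norms.append(1 + 4 ** alpha[i] + 4 ** beta[j])
            else:
                norms.append(2 * 4 ** gamma)  # A-entry 0, B- and C-entries 2^gamma
    return norms

def count_vectors(norms, target):
    """Number of integer tuples theta with sum(theta_k^2 * norms[k]) == target."""
    if not norms:
        return 1 if target == 0 else 0
    w, rest = norms[0], norms[1:]
    total, t = 0, 0
    while t * t * w <= target:
        ways = count_vectors(rest, target - t * t * w)
        total += ways if t == 0 else 2 * ways  # account for the sign of theta
        t += 1
    return total

def theta_at_D(M, alpha, beta, gamma):
    """Count lattice vectors of norm square D = n + sum(4^alpha_i + 4^beta_i)."""
    n = len(M)
    D = n + sum(4 ** a for a in alpha) + sum(4 ** b for b in beta)
    return count_vectors(basis_norms(M, alpha, beta, gamma), D)
```

With the all-ones 2 × 2 matrix the permanent is 2, so the count should be $2^2 \cdot 2 = 8$.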
3.5
A Naïve Algorithm
In this section we analyze the exhaustive search method for counting lattice vectors. We use this algorithm as a part of our main algorithm.
Let $L \subseteq \mathbb{Q}^n$ be a lattice with basis $v_1, \dots, v_r$ and $v = \langle \alpha_1, \dots, \alpha_n \rangle \in L$ be such that $\|v\|^2 = d$. Then $|\alpha_i| \le \sqrt{d}$. If we are given a vector $v \in \mathbb{Z}^n$, we can check if it belongs to the lattice L by solving $v = e_1 v_1 + \dots + e_r v_r$ for the $e_i$ and checking whether $e_i \in \mathbb{Z}$. We can thus evaluate $\vartheta_L(d)$ by exhaustive search in time $2^{O(n \log d)}$.

We can improve the exhaustive search in the case where the lattice is not full rank as follows. Suppose $v_i = \langle \gamma_{i1}, \dots, \gamma_{in} \rangle$ and assume (without loss of generality) that the r × r minor $(\gamma_{ij})_{1 \le i,j \le r}$ is full rank. A lattice vector v is then uniquely determined by its first r coordinates. Further, given the first r coordinates of a vector v, we can check if there is a vector in L with the same initial block of r coordinates. Furthermore, we can produce such a lattice vector by solving the appropriate system of linear equations. Hence we can refine our exhaustive search by generating tuples $\langle \alpha_1, \dots, \alpha_r \rangle$ with $|\alpha_i| \le \sqrt{d}$ and checking if there is a vector in L whose projection along the first r coordinates matches the tuple $\langle \alpha_1, \dots, \alpha_r \rangle$ and also if it is of the correct norm. This yields a method to compute $\vartheta_L(d)$ in time $2^{O(r \log d)}$ (ignoring factors that are polynomial in n). Summarizing, we have:

Theorem 3.5.1. There is a deterministic algorithm that when given a lattice $L \subseteq \mathbb{Q}^n$ of rank r and an integer d in binary computes $\vartheta_L(d)$ in time $2^{O(r \log d + \log n + \log s)}$, where s is the number of bits to encode the basis of L.
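A minimal Python sketch of the exhaustive search, assuming an integer lattice given as a list of basis rows. The function names are illustrative and not from the thesis, and the sketch enumerates all n coordinates rather than only the first r, so it corresponds to the unrefined $2^{O(n \log d)}$ search. Membership is tested by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction
from itertools import product
from math import isqrt

def lattice_coefficients(basis, v):
    """Solve v = e_1*basis[0] + ... + e_r*basis[r-1] over Q.

    Returns the list of e_i, or None if v is not in the rational span.
    Assumes the basis rows are linearly independent.
    """
    r, n = len(basis), len(v)
    # One equation per coordinate j: sum_i e_i * basis[i][j] = v[j].
    rows = [[Fraction(basis[i][j]) for i in range(r)] + [Fraction(v[j])]
            for j in range(n)]
    pivot_row = 0
    for col in range(r):
        p = next((k for k in range(pivot_row, n) if rows[k][col] != 0), None)
        if p is None:
            raise ValueError("basis vectors are linearly dependent")
        rows[pivot_row], rows[p] = rows[p], rows[pivot_row]
        lead = rows[pivot_row][col]
        rows[pivot_row] = [x / lead for x in rows[pivot_row]]
        for k in range(n):
            if k != pivot_row and rows[k][col] != 0:
                f = rows[k][col]
                rows[k] = [a - f * b for a, b in zip(rows[k], rows[pivot_row])]
        pivot_row += 1
    # Remaining equations must read 0 = 0 for the system to be consistent.
    if any(rows[k][r] != 0 for k in range(r, n)):
        return None
    return [rows[i][r] for i in range(r)]

def theta_naive(basis, d):
    """Count vectors of norm square d in the integer lattice spanned by basis."""
    n = len(basis[0])
    bound = isqrt(d)  # every coordinate satisfies |a_i| <= sqrt(d)
    count = 0
    for v in product(range(-bound, bound + 1), repeat=n):
        if sum(x * x for x in v) != d:
            continue
        coeffs = lattice_coefficients(basis, v)
        if coeffs is not None and all(c.denominator == 1 for c in coeffs):
            count += 1
    return count
```

For example, `theta_naive([[1, 0], [0, 1]], 5)` counts the representations of 5 as a sum of two squares.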
3.6
The Approach for Unimodular Lattices
In this section we describe our method for counting lattice vectors in a restricted class of lattices. For these lattices, the description of the algorithm is quite simple and one can see all the aspects of the general case illustrated here. In §3.8 we remove the restrictions we place here.
Let $L \subseteq \mathbb{Q}^d$ be a rank r lattice. Choosing a basis for L we can form an isomorphism to $\mathbb{Z}^r$. This isomorphism is given by a linear transformation. Under the isomorphism, the square of the norm function for L transforms into a positive definite quadratic form $Q_L$ on $\mathbb{Q}^r$. More concretely, if $L = \langle v_1, \dots, v_r \rangle$ then $Q_L(z_1, \dots, z_r) = \|z_1 v_1 + \dots + z_r v_r\|^2$. The theta series associated to the lattice L is given by
$$\Theta_L(\tau) = \sum_{v \in L} q^{\|v\|^2} = \sum_{x \in \mathbb{Z}^r} q^{Q_L(x)}, \qquad q = e^{2\pi i \tau}.$$
The quadratic form $Q_L(x)$ can be written as $\frac{1}{2} x^t A x$ for an even symmetric matrix A (i.e., $A = (a_{ij}) \in \mathbb{Z}^{r \times r}$, $A = A^t$ and $a_{ii}$ are even integers). The lattice L is said to be unimodular if det A = 1.
The following astonishing fact (and some of its generalizations) was proved by Schoeneberg [Sch39], see also [Hec83a].

Theorem 3.6.1. Let L be a lattice of even rank r, such that the matrix A associated to the quadratic form $Q_L$ of the lattice is unimodular. Then the theta series $\Theta_L$ of the lattice is a modular form of weight $\frac{r}{2}$ for the full modular group.
This suggests the following algorithm. Given a unimodular lattice L of even rank r, we know that the theta series of the lattice $\Theta_L$ lives in the finite dimensional space $M_{r/2}$. By definition $\Theta_L = \sum_{n \ge 0} a_n q^n$ ($q = e^{2\pi i \tau}$), where $a_n = \#\{v \in L : \|v\|^2 = n\}$, so our task is to compute the Fourier coefficients of $\Theta_L$. Furthermore, we know an explicit basis, from Corollary 2.2.4, for this space (say) $\{f_1, \dots, f_D\}$, where D is the dimension of $M_{r/2}$. Suppose we can also find $\alpha_1, \dots, \alpha_D$ such that $\Theta_L = \alpha_1 f_1 + \dots + \alpha_D f_D$. Then we can find the Fourier coefficients of $\Theta_L$ by combining the appropriate Fourier coefficients of the $f_i$ according to the linear relation we found for $\Theta_L$. If we can compute the Fourier coefficients of $f_i$ asymptotically faster than the running time of the algorithm in §3.5 then we get a faster algorithm to count lattice vectors.
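As a worked instance of this strategy, consider the smallest case r = 8. The derivation below is a sketch that assumes the normalization $G_4 = \frac{1}{240} + \sum_{n \ge 1} \sigma_3(n) q^n$ for the weight 4 Eisenstein series (so that the m-th coefficient is $\sigma_3(m)$, as in the theorems of the next subsection) and uses the standard fact that $M_4$ is one-dimensional.

```latex
% Any L satisfying Theorem 3.6.1 with r = 8 has \Theta_L \in M_4 = \mathbb{C}\, G_4.
\Theta_L = \alpha_1 G_4
  = \alpha_1 \Bigl( \tfrac{1}{240} + \sum_{n \ge 1} \sigma_3(n)\, q^n \Bigr).
% The constant term of \Theta_L counts only the zero vector, so a_0 = 1:
1 = a_0 = \frac{\alpha_1}{240} \quad\Longrightarrow\quad \alpha_1 = 240.
% Hence every coefficient is determined by a divisor sum:
a_n = \#\{ v \in L : \|v\|^2 = n \} = 240\, \sigma_3(n) \qquad (n \ge 1).
```

Here a single known coefficient (the constant term) fixes the one unknown coordinate; in higher weight, D coefficients are needed to solve for $\alpha_1, \dots, \alpha_D$.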
3.6.1
Computing a Basis of Mk
Here we show that computing the m-th Fourier coefficient of our basis elements of $M_k$ can be done in $2^{O(\log m)}$ time.

Theorem 3.6.2. There is a deterministic algorithm that when given m in binary computes the m-th Fourier coefficient of $G_k$ in $2^{O(\log m + \log\log k)}$ time if m ≥ 1 and in $2^{O(\log k)}$ time if m = 0.

Proof: If m = 0, we need to compute the k-th Bernoulli number. This can be done in $k^{O(1)}$ time using the Akiyama-Tanigawa algorithm (see [Kan00]). If m ≥ 1, then the m-th Fourier coefficient of $G_k$ is $\sigma_{k-1}(m) = \sum_{d \mid m} d^{k-1}$. One simple way of computing this is to factor m completely and then to evaluate the sum by running over all the divisors of m. Factoring the number m can clearly be done in $2^{O(\log m)}$ time, even by simple trial division. As every divisor of m is ≤ m the number of divisors is O(m). Computing the term $d^{k-1}$ can be done in O(log k log d) time. Thus the sum can be evaluated by this procedure in $\log k \times 2^{O(\log m)}$ time as claimed. □
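The procedure in the proof (factor m by trial division, enumerate the divisors, sum the powers) can be sketched in Python as follows. The function names are illustrative, not from the thesis.

```python
def factorize(m):
    """Prime factorization of m >= 1 by trial division, as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors

def divisors(m):
    """All positive divisors of m, generated from its factorization."""
    divs = [1]
    for p, e in factorize(m).items():
        divs = [d * p ** i for d in divs for i in range(e + 1)]
    return divs

def sigma(power, m):
    """Divisor power sum: sum of d**power over all divisors d of m."""
    return sum(d ** power for d in divisors(m))
```

For weight k, the m-th coefficient of $G_k$ is then `sigma(k - 1, m)`.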
Theorem 3.6.3. There is a deterministic algorithm that when given m in binary, computes the m-th Fourier coefficient of $\Delta^\ell$ in $2^{O(\log m + \log\log \ell)}$ time.

Proof: Now $\Delta^\ell = q^\ell \prod_{1 \le r} (1 - q^r)^{24\ell}$. We just need to compute this product up to the $r = O(m/\ell)$ term. Each term of the product requires $(\log \ell)^{O(1)}$ multiplications (by repeated squaring), we need to compute O(m) such products, and this can be done in $2^{O(\log m + \log\log \ell)}$ time. □
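A direct way to see the product expansion at work is the following Python sketch. It multiplies out the truncated product one factor $(1 - q^r)$ at a time instead of using repeated squaring, so it is simpler but slower than the procedure in the proof; the function name is an assumption for illustration.

```python
def delta_power_coefficient(m, ell=1):
    """Coefficient of q^m in Delta^ell = q^ell * prod_{r>=1} (1 - q^r)^(24*ell)."""
    if m < ell:
        return 0
    size = m - ell + 1  # need the product only up to degree m - ell
    c = [0] * size      # c[j] = coefficient of q^j in the truncated product
    c[0] = 1
    for r in range(1, size):
        for _ in range(24 * ell):
            # In-place multiplication by (1 - q^r); descending j keeps
            # c[j - r] at its value from before this multiplication.
            for j in range(size - 1, r - 1, -1):
                c[j] -= c[j - r]
    return c[size - 1]
```

With `ell=1` this returns the Ramanujan tau function, for example `delta_power_coefficient(2)` gives −24.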
Given these two theorems it is easy to see that the m-th coefficient of the basis for $M_k$ can be computed in $2^{O(\log m + \log k)}$ time.

Remark 3.6.4. Let $D = \dim M_k$, and $f_1, \dots, f_D$ be the basis for the vector space $M_k$ given in Corollary 2.2.4. Let the q-expansion of the $f_i$'s be given by
$$f_i(\tau) = \sum_{0 \le j} a_{ij} q^j, \quad \text{for } 1 \le i \le D.$$
Then the matrix $(a_{ij})_{1 \le i \le D,\, 0 \le j}$ [...] 8 we can boost the rank of $E_8$ to r. More precisely, let $v_i = \langle v_{ij} \rangle$ for 1 ≤ i, j ≤ 8 be the basis for $E_8$. We construct a lattice $E_8^r \subseteq \mathbb{Q}^r$ given by the following basis vectors:
$$\begin{aligned}
v_1 &= \langle v_{11}, \dots, v_{18}, \underbrace{0, \dots, 0}_{r-8} \rangle \\
v_2 &= \langle v_{21}, \dots, v_{28}, 0, \dots, 0 \rangle \\
&\ \ \vdots \\
v_8 &= \langle v_{81}, \dots, v_{88}, 0, \dots, 0 \rangle \\
v_9 &= \langle \underbrace{0, \dots, 0}_{8}, d, 0, \dots, 0 \rangle \\
&\ \ \vdots \\
v_r &= \langle 0, \dots, 0, 0, 0, \dots, d \rangle.
\end{aligned}$$
One can see that $\vartheta_{E_8^r}(d) = \vartheta_{E_8}(d)$ and so we get a reduction from factoring to computing $\vartheta_L$ where rank(L) > 8. □
It is likely that one could show a result analogous to Theorem 3.7.1 even for lattices of rank r < 8. In particular, note that for the lattice of dimension 2 (say) $\mathbb{Z}^2$, generated by ⟨0, 1⟩, ⟨1, 0⟩ we have $\vartheta_{\mathbb{Z}^2}(d) = r_2(d)$, the number of representations of d as a sum of two squares. It is a classical fact that $r_2(n) = 4(d_1(n) - d_3(n))$ where $d_i(n)$ is the number of divisors of n of the form 4k + i. It seems that computing $r_2(n)$ is hard.
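The classical formula for $r_2(n)$ is easy to check numerically. The sketch below compares a brute-force count of lattice points with the divisor formula; both function names are assumptions for illustration, and the divisor count runs over all d ≤ n, so it is not meant to be efficient.

```python
from math import isqrt

def r2_bruteforce(n):
    """Number of (x, y) in Z^2 with x^2 + y^2 == n."""
    count = 0
    bound = isqrt(n)
    for x in range(-bound, bound + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            count += 1 if y == 0 else 2  # y and -y
    return count

def r2_divisor_formula(n):
    """4 * (d_1(n) - d_3(n)), with d_i counting divisors congruent to i mod 4."""
    d1 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 1)
    d3 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 3)
    return 4 * (d1 - d3)
```

For example, both functions return 12 for n = 25.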
3.7.2
Lattices with Bounded Norm Basis Vectors
The reduction in Theorem 3.4.1 has the feature that the lattice produced has a basis of vectors that have large norms. We can consider a variant of the counting problem, where we restrict the lattices to have a basis of vectors all of whose norms are bounded. With regard to this question, we can show the following theorem:

Theorem 3.7.2. There is a reduction from integer factorization to computing $\vartheta_L$ for lattices with a basis of bounded norm vectors.

We need some preliminary results before we prove Theorem 3.7.2.

Definition 3.7.3. Let $L \subseteq \mathbb{Q}^{m_1}$ be a lattice of rank $n_1$ given by basis $u_i = \langle u_{ij} \rangle$ for $1 \le i \le n_1$, $1 \le j \le m_1$ and let $M \subseteq \mathbb{Q}^{m_2}$ be another lattice of rank $n_2$ given by basis $v_k = \langle v_{kl} \rangle$ for $1 \le k \le n_2$, $1 \le l \le m_2$. Then define $L \oplus M \subseteq \mathbb{Q}^{m_1 + m_2}$ to be the lattice generated by the basis
$$\begin{aligned}
w_1 &= \langle u_{11}, \dots, u_{1m_1}, \underbrace{0, \dots, 0}_{m_2} \rangle \\
&\ \ \vdots \\
w_{n_1} &= \langle u_{n_1 1}, \dots, u_{n_1 m_1}, 0, \dots, 0 \rangle \\
w_{n_1+1} &= \langle \underbrace{0, \dots, 0}_{m_1}, v_{11}, \dots, v_{1m_2} \rangle \\
&\ \ \vdots \\
w_{n_1+n_2} &= \langle 0, \dots, 0, v_{n_2 1}, \dots, v_{n_2 m_2} \rangle.
\end{aligned}$$
The following lemma is immediate from the definition.

Lemma 3.7.4. If L and M are two lattices then $\Theta_{L \oplus M} = \Theta_L \Theta_M$.

Let d ≥ 3 be an integer. Consider the lattice $L_d \subseteq \mathbb{Q}^d$ of rank d − 1 generated by the basis vectors
$$\begin{aligned}
v_1 &= \underbrace{\langle 1, -1, 0, \dots, 0 \rangle}_{d} \\
v_2 &= \langle 0, 1, -1, 0, \dots, 0 \rangle \\
&\ \ \vdots \\
v_{d-1} &= \langle 0, \dots, 0, 1, -1 \rangle.
\end{aligned}$$
The following lemma is evident from the definition of $L_d$.

Lemma 3.7.5. Suppose $w = \langle w_0, w_1, \dots, w_{d-1} \rangle \in L_d$, then the cyclic shift $\overline{w} = \langle w_{d-1}, w_0, \dots, w_{d-2} \rangle \in L_d$.

Proposition 3.7.6. Let $w \in L_p$ where p is an odd prime, and write $\overline{w}^{\,i}$ for the i-fold cyclic shift of w. If w ≠ 0 then $w, \overline{w}, \overline{w}^{\,2}, \dots, \overline{w}^{\,p-1}$ are all distinct.

Proof: Suppose $\overline{w}^{\,i} = \overline{w}^{\,j}$ for 0 ≤ i ≠ j ≤ p − 1. Then $\overline{w}^{\,(i-j)} = w = \overline{w}^{\,p}$, which implies that $\overline{w}^{\,\gcd(i-j,p)} = w$. Thus $\overline{w} = w$, but this means that all the coordinates of w are equal. But all vectors of $L_p$ have coordinates summing to 0. Thus w must be the zero vector contradicting the hypothesis of the proposition. □

Corollary 3.7.7. If p is an odd prime then $\Theta_{L_p} \equiv 1 \bmod p$.

Proof: Group all non-zero vectors in $L_p$ by their orbits via the action $w \mapsto \overline{w}$. Each such orbit is of size p by Proposition 3.7.6. Further, noting that $\|w\| = \|\overline{w}\|$ we see that $\Theta_{L_p} \equiv 1 \bmod p$. □
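Corollary 3.7.7 can be checked numerically for small primes. The sketch below relies on the fact that the lattice generated by the vectors ⟨…, 1, −1, …⟩ is exactly the set of integer vectors whose coordinates sum to zero; the function name `theta_Lp` is an assumption for illustration, and the enumeration is exponential in p.

```python
from itertools import product
from math import isqrt

def theta_Lp(p, d):
    """Number of w in Z^p with coordinate sum 0 and norm square d."""
    bound = isqrt(d)
    return sum(
        1
        for w in product(range(-bound, bound + 1), repeat=p)
        if sum(w) == 0 and sum(x * x for x in w) == d
    )
```

The constant term is `theta_Lp(p, 0) == 1`, and every other coefficient should be divisible by p; for instance the count for p = 3, d = 2 is 6.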
Proof :(of Theorem 3.7.2) Suppose A is an algorithm that can compute ϑL for lattices generated by a basis of bounded norm vectors. Then we show that A can be used to compute the function σ3 (n), which will prove the theorem in view of [BMS86].
We first pick small primes $p_i$ for 1 ≤ i ≤ k such that $\prod_{1 \le i \le k} p_i > n^4 > \sigma_3(n)$. Then we use A to compute $\vartheta_{E_8 \oplus L_{p_i}}(n)$ for each $p_i$. By Corollary 3.7.7 and Lemma 3.7.4, we have that $\vartheta_{E_8 \oplus L_{p_i}}(n) \equiv \vartheta_{E_8}(n) \bmod p_i$. Now applying the Chinese remainder theorem we can find $\vartheta_{E_8}(n)$. By the prime number theorem it suffices to take the first k = O(log n) primes for the $p_i$. The theorem now follows. □
Remark 3.7.8. All the reductions presented in this chapter have the property that for the lattices produced by the reduction, the problem of finding the shortest non-zero vector in them is trivial. This indicates that the hardness of the counting problem is independent of the hardness of the shortest vector problem in these lattices.
3.8
The General Case
In the general case the theta series of the lattice is no longer a modular form for the full modular group, but for a congruence subgroup.

Theorem 3.8.1. Let L be a lattice of rank r (r even). Let $Q_L$ be the associated quadratic form, and A be the even symmetric matrix with integer entries such that $Q_L = \frac{1}{2} x^t A x$. Let N be the smallest positive integer such that $N A^{-1}$ is again even symmetric with integer entries. Let $D = (-1)^{r/2} \det A$. Then the theta series of the lattice L is a modular form of level N, weight $\frac{r}{2}$ and character $\chi(d) = \left(\frac{D}{d}\right)$ (the Kronecker symbol), i.e., $\Theta_L \in M_{r/2}(\Gamma_0(N), \chi)$.

Remark 3.8.2. In our situation the basis vectors have integer entries so N is always a divisor of det A, so that χ is indeed a character modulo N, even though it need not be a primitive character modulo N. The fact that the matrix A is invertible follows from the theory of bilinear forms and that $Q_L$ arose from an inner product (see [MH73] Lemma I.§2.2). See Ogg's book [Ogg69] Chapter 6, or Zagier's article [Zag92] for more background on this theorem.

The following theorem shows that we can compute a basis of forms for the space $M_k(\Gamma_1(N))$. The algorithm is the result of the cumulative work of many individuals, see [AL70, Cre97, Kob93, Man72, Mer94, Ste00].

Theorem 3.8.3. There is a basis for $M_k(\Gamma_1(N))$ composed of forms each of whose n-th Fourier coefficient can be computed in $\dim M_k(\Gamma_1(N)) \times 2^{O(\log n)}$ time.

We only sketch the ideas behind the method for computing the Fourier coefficients of the basis elements since the details are available in other sources. The basis for the space generated by the generalized Eisenstein series can be explicitly worked out (see for instance [Kob93] III.§3, Proposition 22). The Fourier coefficients of these elements can also be computed though they are no longer rational but involve roots of unity. Computing the space of cusp forms is much more involved. In this case we know that there is an algebra of operators, the Hecke operators, on this space $S_k(N, \chi)$. A beautiful theorem of Hecke (Theorem 2.7.6) says that there is a basis for the space $S_k(N, \chi)$ composed of eigenforms for this algebra (see [Hec83b]). More importantly, if suitably normalized, the eigenvalues are the Fourier coefficients of the eigenform (see Theorem 2.7.5). To understand this in more detail refer to [AL70]. Since we do not have a basis for the cusp forms it seems that it is impossible to determine the eigenvectors for the operators, seemingly a circular problem. The idea is to use the space of modular symbols for which a concrete presentation is available by an idea of Manin [Man72]. The space of modular forms embeds (actually as a dual) into the space of modular symbols by the Eichler-Shimura theory. The Hecke algebra acts on the space of modular symbols, and the eigenvectors for this action are then translated to the space of modular forms. The details of this method have been worked out in exhaustive detail in [Mer94], [Cre97] chapter 2, and in [Ste00] chapters 2 and 3. The Fourier coefficients are algebraic and the number field containing all the coefficients of the basis is a finite extension but the degree can be very large (as big as $(\dim S_k(N)^2)!$), since we need to construct the splitting field of the characteristic polynomials of the Hecke operators. For our purposes it suffices to get good approximations to these coefficients, which we do indeed get from the algorithm.
Remark 3.8.4. In Chapter 5 we give an algorithm to compute a basis for the space Sk (N, χ), but it turns out that we cannot use this algorithm here because of the ineffectivity of the algorithm if we allow k and N to vary.
Now our previous algorithm for computing $\vartheta_L$ generalizes readily to this situation. The space $S_k(N, \chi)$ has dimension $O(kN^2)$ (see §2.6). The key step in the algorithm is to find the coordinates of the theta series $\Theta_L$ in the space $M_{r/2}(\Gamma_1(N))$. To do this we must accumulate enough linear relations among the Fourier coefficients of the basis for the space and $\Theta_L$. For the case discussed in the previous section this was easy since the matrix formed by the first $\dim M_k$ coefficients of the basis forms has full rank. In our case, we do not have an explicit basis to work with so we must argue indirectly. We make use of a result (Proposition 2.16 in [Shi71]) that says that if $F \in M_k(\Gamma)$ for a congruence subgroup Γ (the result actually holds in more generality) and Z is the number of zeros of F in H counted with multiplicity, then $Z = \Theta(\dim M_k(\Gamma))$. This is a generalization of the valence formula of Theorem 2.2.1. Let $D = \dim M_{r/2}(\Gamma_1(N))$, and let $f_1, \dots, f_D$ be a basis for $M_{r/2}(\Gamma_1(N))$. Suppose we had scalars $\alpha_1, \dots, \alpha_D$ such that $\alpha_1 f_1 + \dots + \alpha_D f_D = c_m q^m + c_{m+1} q^{m+1} + \cdots$, with $c_m \ne 0$ (i.e., the $\alpha_i$ cancel out all Fourier coefficients below m) then the function $F(z) = \alpha_1 f_1 + \dots + \alpha_D f_D$ vanishes to order m at i∞. Thus in particular if m > Z then F(z) is identically zero by the above result. This implies that $\alpha_i = 0$ since the $f_i$ form a basis. Thus among the first Θ(D) Fourier coefficients of the basis elements, we arrive at a matrix of full rank. One can also derive this bound directly from a theorem of Sturm [Stu87].
Suppose we have an algorithm that counts the number of points in a lattice L (of level N) of rank r of norm square d in time T(r, d). Then in time $T(r, D)^{O(1)}$ (where $D = \dim M_{r/2}(\Gamma_1(N))$) we can find the coordinates of the theta series $\Theta_L$ in the space $M_{r/2}(\Gamma_1(N))$ by solving the linear system gathered from the coefficients. Then the number of points of norm square d can be found in time $T(r, rN^2)^{O(1)}\, 2^{O(\log r + \log N + \log d)}$ by combining the Fourier coefficients. This yields the following theorem:

Theorem 3.8.5. Let L be a lattice in $\mathbb{Q}^n$ of rank r (all of whose basis vectors have integer entries and r is even), with $Q_L$ as the associated quadratic form and A the even symmetric matrix of the quadratic form. Let N be the smallest integer such that $N A^{-1}$ is integral and even symmetric. Suppose that there is an algorithm B that can compute the number of lattice vectors of norm square at most d in time T(r, d). Then there is a deterministic algorithm to do the same in time $T(r, rN^2)^{O(1)}\, 2^{O(\log r + \log N + \log d)}$.

Clearly, the above theorem is not useful if the existing algorithm B is very efficient, but it can be used to boost the performance of an algorithm that does not perform well for large values of the distance d. For example, using the algorithm presented in Section 3.5 and observing that if the lattice is encoded by vectors using s bits then $N \le \det A \le 2^{O(s)}$ we get:

Theorem 3.8.6. Let L be a lattice of rank r (r even) in $\mathbb{Q}^n$, such that the basis vectors can be encoded using s bits. Then the number of lattice vectors of norm square d can be computed deterministically in time $2^{O(rs + \log d)}$.
3.8.1
Odd Rank Lattices
We can reduce the case of odd rank lattices to the even rank case as follows.
The idea is to use the “rank boosting” method of Section 3.7. Let $L \subseteq \mathbb{Q}^n$ be an odd rank lattice. Set $M_d \subseteq \mathbb{Q}$ to be the rank 1 lattice generated by the vector ⟨2d⟩. Now the lattice $M_d \oplus L$ is an even rank lattice, which satisfies $\vartheta_{M_d \oplus L}(d) = \vartheta_L(d)$. We can apply our algorithm to $M_d \oplus L$ to count the vectors of norm square d and we note that the reduction does not change the asymptotic running time of the algorithm.
Chapter 4 Hardness of Computing a Hecke Eigenform

One way to interpret the results of Chapter 3 is that we have a family of modular forms whose Fourier coefficients are #P-hard to compute. What happens if we fix a modular form and try to compute its Fourier coefficients? This is the case that we investigate in this chapter. The hardness results we obtain are much weaker. We are only able to show that the problem is as hard as factoring a dense subset of the RSA moduli (numbers that are products of two primes). Moreover, the result only applies to Hecke eigenforms. The result also implies that computing a basis of modular forms (of a fixed weight and level) cannot be easier in general than factoring RSA moduli.
The outline of the chapter is as follows. In the first section we discuss the special case of the Ramanujan Tau function and in the following section we generalize the results to cuspidal eigenforms of higher weight and level. The results of this chapter were obtained in joint work with Eric Bach [BC05].
4.1
Hardness of Computing τ (n)
The Ramanujan Tau function τ(n) is defined to be the n-th Fourier coefficient of the Discriminant function ∆(z) (cf. §2.3):
$$\begin{aligned}
\Delta(z) &= q \prod_{1 \le n} (1 - q^n)^{24}, \qquad q = e^{2\pi i z} \\
&= q - 24q^2 + 252q^3 - 1472q^4 + 4830q^5 - 6048q^6 - 16744q^7 + \cdots \\
&=_{\mathrm{def}} \sum_{1 \le n} \tau(n) q^n.
\end{aligned}$$
We know from Theorems 2.2.2 and 2.2.3 that ∆ is a cusp form of weight 12 spanning the one-dimensional space $S_{12}$. Thus ∆ is a Hecke eigenform for the weight 12 Hecke operators. This implies (by Theorems 2.4.2 and 2.4.4) that
1. τ(nm) = τ(n)τ(m) if gcd(m, n) = 1.
2. $\tau(p^k) = \tau(p^{k-1})\tau(p) - p^{11}\tau(p^{k-2})$ for k ≥ 2.

Given an algorithm that computes τ(n), our task is to construct another algorithm that would factor the integer n. Now, if τ(n) = 0 we obtain no information about the factorization of the integer n. A conjecture of Lehmer states that τ(n) is, indeed, never 0. We assume this conjecture for the statement of the next theorem; in the next section we remove the dependence on the conjecture to obtain a slightly weaker result.

Theorem 4.1.1. Assume Lehmer's conjecture that τ(n) ≠ 0 for all n. Then a polynomial time algorithm for computing τ(n) implies a polynomial time algorithm for factoring RSA moduli (numbers of the form n = pq, where p, q are distinct odd primes).

Proof: Suppose n = pq, and let $\tau(p) = p^{11/2} x$, $\tau(q) = q^{11/2} y$. Our goal is to find x and y, then p and q. Using the algorithm to compute τ, we can compute
$$a =_{\mathrm{def}} \tau(n) = n^{11/2} xy \quad \text{and} \quad b =_{\mathrm{def}} \tau(n^2).$$
We note that $b = n^{11}(x^2 - 1)(y^2 - 1)$ using the multiplicativity and the recurrence relations that τ satisfies for prime powers. This is a pair of simultaneous equations that we can solve for x and y. Setting $\alpha = a/n^{11/2}$ and $\beta = b/n^{11}$, one obtains
$$x^2 = \frac{\alpha^2 - \beta + 1 \pm \sqrt{(\alpha^2 - \beta + 1)^2 - 4\alpha^2}}{2} \quad \text{and} \quad y^2 = \frac{\alpha^2}{x^2}.$$
Our assumption that τ(n) ≠ 0 implies that x ≠ 0 and y ≠ 0, also their squares are rational. We can determine $x^2$ as follows. Replacing α and β by their definitions and clearing fractions, we have
$$x^2 = \frac{a^2 - b + n^{11} \pm \sqrt{(a^2 - b + n^{11})^2 - 4a^2 n^{11}}}{2n^{11}}.$$
The radicand is an integer square, and so its square root can be found exactly. We now observe that
$$x^2 = \frac{\tau(p)^2}{p^{11}}.$$
In lowest terms, the denominator d must be p to an odd power. Therefore, gcd(n, d) gives us a non-trivial factor of n. □
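The reduction in this proof can be run on a toy modulus. In the Python sketch below, the oracle for τ is simulated by expanding the product formula for ∆, which is feasible only for very small n; the names `tau_table` and `factor_with_tau` are assumptions for illustration. Taking the + sign recovers whichever of $x^2$, $y^2$ is larger, and either one yields a factor.

```python
from fractions import Fraction
from math import gcd, isqrt

def tau_table(m):
    """Return [tau(1), ..., tau(m)] from Delta = q * prod_{r>=1} (1 - q^r)^24."""
    c = [0] * m
    c[0] = 1
    for r in range(1, m):
        for _ in range(24):
            # In-place multiplication by (1 - q^r), truncated at degree m - 1.
            for j in range(m - 1, r - 1, -1):
                c[j] -= c[j - r]
    return c  # c[j] is the coefficient of q^(j+1)

def factor_with_tau(n, tau):
    """Given n = p*q and an oracle tau, return a non-trivial factor of n."""
    a = tau(n)
    b = tau(n * n)
    n11 = n ** 11
    s = a * a - b + n11
    radicand = s * s - 4 * a * a * n11
    root = isqrt(radicand)
    assert root * root == radicand, "radicand should be a perfect square"
    x_squared = Fraction(s + root, 2 * n11)  # automatically in lowest terms
    return gcd(n, x_squared.denominator)
```

For n = 15 the table needs τ up to 225, for example `table = tau_table(225)` followed by `factor_with_tau(15, lambda m: table[m - 1])`.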
4.2
The Reduction in General
In what follows, fix $f(z) = \sum_{1 \le n} a(n) q^n \in S_k(\Gamma_0(N), \chi)$ (k ≥ 2 and even) to be a normalized (a(1) = 1) Hecke eigenform. We will also assume that f is not of CM type in the sense of Ribet [Rib77]. This means that there does not exist an imaginary quadratic field L such that a(p) = 0 for all primes p that are inert in L. Under these assumptions, a beautiful theorem of Serre ([Ser81] Corollary 2 to Theorem 15) gives us bounds on the number of primes p for which a(p) = 0.

Theorem 4.2.1. Let $f(z) = \sum_{1 \le n} a(n) q^n \in S_k(\Gamma_0(N), \chi)$ (k ≥ 2) be a normalized Hecke eigenform that is not of CM type. Define $P_f(x) = \#\{p \le x : p \text{ a prime such that } a(p) = 0\}$. Then
$$P_f(x) = O\!\left( \frac{x}{(\log x)^{\frac{3}{2} - \delta}} \right) \quad \text{for all } \delta > 0.$$
Moreover, if one assumes the Generalized Riemann Hypothesis, we have $P_f(x) = O\bigl(x^{3/4}\bigr)$.

The Fourier coefficients of a normalized Hecke eigenform need not be integers, but they are at least algebraic integers (see [Ono04] §2.4 & §2.5; the result also follows from [Shi71] Theorem 3.52). Furthermore, we know that each eigenvalue lies in a number field of degree at most $\dim S_k(\Gamma_0(N), \chi)$ since the characteristic polynomials of the Hecke operators have degree $\dim S_k(\Gamma_0(N), \chi)$. In fact, the field $\mathbb{Q}(a(2), a(3), \dots, a(n), \dots)$ is a number field and so a finite degree extension of $\mathbb{Q}$. We assume that the supposed algorithm that computes the Fourier coefficients takes as input an integer n and gives us the (monic) minimal polynomial of the n-th Fourier coefficient a(n). Since we can assume the space $S_k(\Gamma_0(N), \chi)$ is known, we can also compute χ(n) for any integer n.
We describe the reduction below. Given the positive integer n = pq, where p, q are distinct odd primes. We define the quantities x and y by x = a(p)/χ(p)p a(q)/χ(q)q
k−1 2
k−1 2
and y =
. Note that we can also assume that χ(n) 6= 0 for otherwise gcd(n, N ) 6= 1.
Using the algorithm to compute the Fourier coefficients of f we can compute A =def n =
k−1 2
xy
a(n) χ(n)
and B =def a(n2 ). Now by multiplicativity B = a(n2 ) = a(p2 )a(q 2 )
(4.1)
= (a(p)2 − pk−1 χ(p))(a(q)2 − q k−1 χ(q))
(4.2)
= nk−1 χ(n)(x2 − 1)(y 2 − 1).
(4.3)
Thus we have a pair of simultaneous equations for $x$ and $y$ which we can solve as before. Setting $\alpha = A/n^{\frac{k-1}{2}}$ and $\beta = B/(n^{k-1}\chi(n))$, one obtains
\[ x^2 = \frac{\alpha^2 - \beta + 1 \pm \sqrt{(\alpha^2 - \beta + 1)^2 - 4\alpha^2}}{2} \quad \text{and} \quad y^2 = \frac{\alpha^2}{x^2}. \]
Substituting the definitions of $\alpha$ and $\beta$ and clearing denominators we get
\[ x^2 = \frac{A^2\chi(n) - B + n^{k-1}\chi(n) \pm \sqrt{(A^2\chi(n) - B + n^{k-1}\chi(n))^2 - 4A^2\chi(n)^2 n^{k-1}}}{2\chi(n)\, n^{k-1}}. \]
We note that the radicand is the square of an algebraic integer and hence the square root can be computed exactly. By the definition of $x$ we have that
\[ x^2 = \frac{a(p)^2}{\chi(p)^2\, p^{k-1}}. \]
We claim that $x^2$ cannot be an algebraic integer if $p$ is large enough. For otherwise, since $k - 1$ is odd, this would imply that $\sqrt{p} \in \mathbb{Q}(\chi, a(2), a(3), \dots, a(n), \dots)$, but the latter is a finite extension and thus if $p$ is large enough it cannot contain $\sqrt{p}$. Thus we can recover $p$ from the above expression.

Suppose $y = 0$ but $x \ne 0$ (i.e. $a(q) = 0$ but $a(p) \ne 0$). We can still proceed as follows. By equation (4.3) we find that $B = n^{k-1}\chi(n)(1 - x^2)$. Thus we can still get $x^2$ and by the above argument find $p$.
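To make the reduction concrete, the following Python sketch carries it out in the special case $f = \Delta$ (weight $k = 12$, level 1, trivial character), where $\chi(n) = 1$ and $x^2 = \tau(p)^2/p^{11}$ is a rational number. The function names `tau_table` and `factor_from_coefficients` are ours, and the brute-force expansion of $\Delta$ only stands in for the hypothetical fast coefficient algorithm. It is an illustration under these assumptions, not the general reduction over number fields.

```python
from fractions import Fraction
from math import gcd, isqrt


def tau_table(limit):
    """Return a list t with t[m] = tau(m) for 1 <= m <= limit (t[0] is unused).

    Brute-force expansion of Delta = q * prod_{n>=1} (1 - q^n)^24.
    """
    coeffs = [0] * limit
    coeffs[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # Multiply the truncated series in place by (1 - q^n).
            for m in range(limit - 1, n - 1, -1):
                coeffs[m] -= coeffs[m - n]
    return [0] + coeffs


def factor_from_coefficients(n, a_n, a_n2, k=12):
    """Recover a prime factor of n = p*q from A = a(n) and B = a(n^2).

    Assumes trivial character, so x^2 and y^2 are the roots of
    w*T^2 - (A^2 - B + w)*T + A^2 = 0 with w = n^(k-1).
    """
    w = n ** (k - 1)
    s = a_n * a_n - a_n2 + w            # equals w * (x^2 + y^2)
    disc = s * s - 4 * w * a_n * a_n    # equals (w * (x^2 - y^2))^2
    r = isqrt(disc)
    if r * r != disc:
        return None
    for numerator in (s + r, s - r):
        x_squared = Fraction(numerator, 2 * w)
        # In lowest terms the denominator is a power of a single prime factor.
        g = gcd(x_squared.denominator, n)
        if 1 < g < n:
            return g
    return None


if __name__ == "__main__":
    tau = tau_table(225)
    print(factor_from_coefficients(15, tau[15], tau[225]))
```

The denominator of the reduced fraction plays the role of the argument that $x^2$ is not an (algebraic) integer: it is a non-trivial power of $p$ (or of $q$), so a single gcd with $n$ exposes the factor.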
Thus our reduction will succeed in factoring the integer $n$, unless both $a(p)$ and $a(q)$ are zero. Since the set of such primes has density 0 (by Theorem 4.2.1), we get the following theorem:

Theorem 4.2.2. Let $f(z) = \sum_{1 \le n} a(n) q^n \in S_{2k}(\Gamma_0(N), \chi)$ be a Hecke eigenform that is not of CM-type. Suppose there is a polynomial time algorithm that computes $a(n)$ given $n$. Then there is a polynomial time algorithm that factors a density 1 subset of the RSA moduli.

In the case that $f \in S_{2k+1}(\Gamma_0(N), \chi)$ the entire reduction works as long as $p^{\frac{k-1}{2}}$ does not divide $a(p)$ for one of the primes dividing $n$. This happens very rarely. If $k \ge 3$ and odd then this implies that $a(p) \equiv 0 \bmod p$, which means that $p$ is a so-called non-ordinary prime. A heuristic argument given in [Gou97] shows that the number of non-ordinary primes below $x$ is $O(\log \log x)$. Thus it is likely that the result of Theorem 4.2.2 remains true even for odd weight cuspidal eigenforms.
Chapter 5
Computing a Basis of Cusp Forms

We have shown that computing the Fourier coefficients of cuspidal eigenforms is at least as hard as factoring integers. Suppose we can quickly factor integers; can we then quickly compute the Fourier coefficients of cuspidal eigenforms? In other words, can we compute the $p$-th Fourier coefficient of a cuspidal eigenform in polynomial time? Recent work of Edixhoven, Couveignes et al. suggests that this might be possible. But at this time, this question is still open.
In this chapter we discuss an algorithm based on the Selberg trace formula that runs asymptotically faster than the usual algorithm to compute the Fourier coefficients of level 1 cusp forms. We also discuss a generalization of this algorithm to compute Fourier coefficients of higher level cusp forms. The ideas in this chapter are not completely new. It has been known for quite a while that the trace formula can be used to compute Fourier coefficients (see, for instance, [Mes86]). However, there seems to have been a misconception that the method is not very efficient, owing to the fact that the trace formula involves class numbers. The results of this chapter are from [Cha05d] and [Cha05b].
5.1 Computing the Ramanujan Tau function
We illustrate our method in the special case of computing a basis for the 1-dimensional space of cusp forms $S_{12}$. From Theorem 2.2.3 we know that $\Delta = q \prod_{1 \le n} (1 - q^n)^{24}$ spans this space. Furthermore, the Fourier coefficients of $\Delta = \sum_{1 \le n} \tau(n) q^n$ define the Ramanujan Tau function $\tau(n)$. Since $\Delta$ is a normalized Hecke eigenform we recall that:
1. If $\gcd(n, m) = 1$, then $\tau(nm) = \tau(n)\tau(m)$.
2. If $r \ge 1$ and $p$ is a prime then $\tau(p^{r+1}) = \tau(p)\tau(p^r) - p^{11}\tau(p^{r-1})$.
Thus $\tau(n)$ is completely determined by $\tau(p)$ for prime $p \mid n$. Furthermore, $\tau(n)$ is the eigenvalue of the $n$-th Hecke operator $T_n$. Since $\dim S_{12} = 1$, $\tau(n) = \operatorname{Tr} T_n$. We can use this interpretation to compute $\tau(n)$ as follows. Selberg in [Sel56] proved his famous Trace formula, which can be used to compute the trace of the Hecke operators. The Trace formula gives:

Theorem 5.1.1. Let $p$ be a prime. Then
\[ \tau(p) = -\sum_{1 \le t \le \sqrt{4p}} P(t, p)\, H(4p - t^2) + \frac{1}{2}\, p^5 H(4p) - 1 \]
where $P(t, p) = t^{10} - 9t^8 p + 28t^6 p^2 - 35t^4 p^3 + 15t^2 p^4 - p^5$ and $H(D)$ is the Hurwitz class number.

To compute $\tau(p)$, given Theorem 5.1.1, we only need to show how the Hurwitz class numbers can be computed, since it is easy to compute the above sum. For this task we need the following lemma (see [Coh93] Lemma 5.3.7):
Lemma 5.1.2. Let $w(-3) = 3$, $w(-4) = 2$ and $w(D) = 1$ for $D < -4$, and set $h'(D) = h(D)/w(D)$, where $h(D)$ is defined to be the class number of the order of discriminant $D$ in $\mathbb{Q}(\sqrt{D})$ if $D \equiv 0, 1 \bmod 4$ and 0 otherwise. Then for $N > 0$ we have
\[ H(N) = \sum_{d^2 \mid N} h'\!\left(-\frac{N}{d^2}\right). \]
There are randomized sub-exponential time algorithms to compute the class number (see [Coh93] Chapter 5).

Theorem 5.1.3. The class number $h(D)$ can be computed deterministically in time $|D|^{\frac{1}{4}+\epsilon}$ for every $\epsilon > 0$, or by a randomized algorithm with expected running time (under the GRH) $e^{O(\sqrt{\ln |D| \ln \ln |D|})}$.

Proposition 5.1.4. The Hurwitz class number $H(N)$ can be computed by a deterministic algorithm in time $O(N^{\frac{1}{4}+\epsilon})$ or a randomized algorithm with an expected running time $O(N^{\epsilon})$ for every $\epsilon > 0$ under GRH.

Proof: By Lemma 5.1.2 we have
\[ H(N) = \sum_{d^2 \mid N} h'\!\left(-\frac{N}{d^2}\right). \]
The function $h'(D)$ is essentially just the class number of $\mathbb{Q}(\sqrt{D})$ and so can be computed in time $O(|D|^{\epsilon})$ if we use the randomized algorithm or in time $O(|D|^{\frac{1}{4}+\epsilon})$ if we use the deterministic algorithm. The number of terms in the sum is at most the number of divisors of $N$. It is known (see [Ten95] Chapter I.5) that the number of divisors $d(N) = O(N^{\epsilon})$ for every $\epsilon > 0$. Thus the sum can be evaluated by computing each of the terms in the stated time bound provided we can run through the divisors of $N$ efficiently. This can clearly be done efficiently if we know the prime factorization of the number $N$. For the deterministic algorithm to compute $H(N)$ we can use the deterministic factoring algorithm mentioned earlier ([Coh93] §8.6) to factor $N$; this has a running time of $O(N^{\frac{1}{4}+\epsilon})$. For the randomized algorithm we can use any randomized factoring method with running time $O(N^{\epsilon})$. For instance, we can use Dixon's random squares method [Dix81] which has a provable running time of $e^{O(\sqrt{\log N \log \log N})}$. □
Thus putting all these results together we get the following:

Theorem 5.1.5. There is a randomized algorithm to compute $\tau(p)$ with expected running time $O(p^{\frac{1}{2}+\epsilon})$ for every $\epsilon > 0$ under GRH.

Theorem 5.1.6. There is a deterministic algorithm to compute $\tau(p)$ in time $O(p^{\frac{3}{4}+\epsilon})$ for every $\epsilon > 0$.
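The trace formula for $\tau(p)$ can be tried out directly. The sketch below is a naive illustration in Python, with the sum over $t$ starting at $t = 1$. It evaluates $H(N)$ by enumerating reduced (not necessarily primitive) positive definite binary quadratic forms of discriminant $-N$, counting forms proportional to $x^2 + y^2$ with weight $1/2$ and forms proportional to $x^2 + xy + y^2$ with weight $1/3$, which is the classical description of the Hurwitz class number. This enumeration is our stand-in for the class number algorithms of Theorem 5.1.3, so it does not achieve the running times of Theorems 5.1.5 and 5.1.6. The names `hurwitz` and `tau_prime` are ours.

```python
from fractions import Fraction
from math import isqrt


def hurwitz(N):
    """Hurwitz class number H(N) for N > 0, by enumerating reduced forms.

    Counts forms a*x^2 + b*x*y + c*y^2 with b^2 - 4ac = -N and
    |b| <= a <= c, with the usual weights 1/2 and 1/3 for the two
    forms that have extra automorphisms.
    """
    if N % 4 not in (0, 3):
        return Fraction(0)
    total = Fraction(0)
    b = N % 2                      # b has the same parity as N
    while 3 * b * b <= N:
        m = (b * b + N) // 4       # m = a * c
        a = max(b, 1)
        while a * a <= m:
            if m % a == 0:
                c = m // a
                if b == 0 and a == c:
                    total += Fraction(1, 2)   # multiple of x^2 + y^2
                elif a == b == c:
                    total += Fraction(1, 3)   # multiple of x^2 + xy + y^2
                elif b == 0 or a == b or a == c:
                    total += 1                # (a, b, c) and (a, -b, c) coincide
                else:
                    total += 2                # both signs of b are reduced
            a += 1
        b += 2
    return total


def tau_prime(p):
    """tau(p) for a prime p via the trace formula, summing over t >= 1."""
    def P(t):
        return (t**10 - 9 * t**8 * p + 28 * t**6 * p**2
                - 35 * t**4 * p**3 + 15 * t**2 * p**4 - p**5)

    total = Fraction(p**5, 2) * hurwitz(4 * p) - 1
    for t in range(1, isqrt(4 * p) + 1):
        total -= P(t) * hurwitz(4 * p - t * t)
    assert total.denominator == 1
    return int(total)


if __name__ == "__main__":
    print([tau_prime(p) for p in (2, 3, 5)])
```

Exact rational arithmetic is used because $H(N)$ can have denominators 2 and 3, while the final trace is always an integer.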
5.2 Computing level 1 cusp forms

The standard basis for the space $S_k$ is the one given in §2.2 (Corollary 2.2.4). It can be shown that the $n$-th Fourier coefficient of any of these basis forms can be computed in time $O(n^2)$. In this section we show that there is a basis for this space composed of forms for which we can compute the $n$-th Fourier coefficient in time $O(n^{\frac{1}{2}+\epsilon})$ by a randomized algorithm (under GRH). The results here generalize those of the previous section.
5.2.1 Cyclic Basis for modular forms

Let $V$ be any finite dimensional vector space and let $T : V \to V$ be a linear map. Suppose that there is a $v \in V$ such that $T(v), T(T(v)), \dots$ form a basis of $V$; then we say that $V$ has a cyclic basis with respect to $T$ (or simply, $T$ has a cyclic basis). For example, the finite field $\mathbb{F}_{p^n}$ is an $\mathbb{F}_p$-vector space of dimension $n$, and it is a fact that $\mathbb{F}_{p^n}$ has a cyclic basis with respect to the Frobenius automorphism $x \mapsto x^p$.
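The definition can be checked mechanically for a given matrix. The following Python sketch (the helper names `mat_vec`, `det` and `is_cyclic_vector` are ours) tests whether $T(v), T^2(v), \dots, T^d(v)$ are linearly independent by computing an exact determinant over the rationals. It is meant only as a small illustration for matrices with rational entries.

```python
from fractions import Fraction


def mat_vec(T, v):
    """Apply the square matrix T (list of rows) to the vector v."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]


def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            result = -result
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return result


def is_cyclic_vector(T, v):
    """True iff T(v), T^2(v), ..., T^d(v) form a basis, where d = len(v)."""
    rows = []
    w = v
    for _ in range(len(v)):
        w = mat_vec(T, w)
        rows.append(w)
    return det(rows) != 0


if __name__ == "__main__":
    diag = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
    print(is_cyclic_vector(diag, [1, 1, 1]))
```

For a diagonal matrix the rows built here are the powers of the eigenvalues, so a repeated or zero eigenvalue makes the determinant vanish.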
For $S_k$ the space of cusp forms of weight $k$, we have the Hecke operators $T_p : S_k \to S_k$ for each prime $p$. It is natural to ask: For which primes $p$ does $S_k$ have a cyclic basis with respect to $T_p$? A complete answer to this question is very difficult. For example, for $k = 12$, the operator $T_p$ has a cyclic basis iff $\tau(p) \ne 0$ and so the existence of a cyclic basis for $T_p$ for every prime $p$ is equivalent to Lehmer's conjecture. In what follows, we will show that for every even $k \ge 12$ there is a set of primes $P$ of density 1, such that for every $p \in P$ the operator $T_p$ has a cyclic basis. We begin with a lemma.

Lemma 5.2.1. Let $V$ be a finite dimensional vector space (say $\dim V = d$), and let $T : V \to V$ be a linear map. Suppose that $V$ has a basis of eigenvectors $v_1, \dots, v_d$ with corresponding eigenvalues $\alpha_1, \dots, \alpha_d$. If $\alpha_i \ne 0$ and $\alpha_i \ne \alpha_j$ for $i \ne j$ then $T$ has a cyclic basis.

Proof: Consider the vector $w = v_1 + v_2 + \cdots + v_d$. Now $T(w), T(T(w)), \dots, T^d(w)$ are linearly independent iff the determinant of the following matrix does not vanish:
\[ \begin{pmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_d \\ \alpha_1^2 & \alpha_2^2 & \cdots & \alpha_d^2 \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_1^d & \alpha_2^d & \cdots & \alpha_d^d \end{pmatrix} \]
Since this is a Vandermonde matrix, its determinant is
\[ \alpha_1 \cdots \alpha_d \prod_{1 \le i < j \le d} (\alpha_i - \alpha_j) \]
[...] $> 0$. Then the trace of the Hecke operator $T_m$ on the space of cusp forms $S_k$ is given by
\[ \operatorname{Tr} T_m = -\frac{1}{2} \sum_{-\infty < t < \infty} P_k(t, m)\, H(4m - t^2) - \frac{1}{2} \sum_{dd' = m} \min\{d, d'\}^{k-1}. \]
[...] $> 0$.

Proof: Let $p_k$ be a prime such that $T_{p_k}$ has distinct non-zero eigenvalues. Then $T_{p_k}$ has a cyclic basis; moreover, the form
\[ f_{\mathrm{Tr}} = \sum_{1 \le n} \operatorname{Tr} T_n\, q^n \]
is a cyclic vector. Define $f_i$ to be $\underbrace{T_{p_k} \circ T_{p_k} \circ \cdots \circ T_{p_k}}_{i}(f_{\mathrm{Tr}})$ for $1 \le i \le D$. By Lemma 5.2.1 the $f_i$'s form a basis for $S_k$. Propositions 5.2.4 and 5.2.5 now imply the results of the theorem. □
Remark 5.2.7. As stated, the constant under the big-Oh in the running time of the algorithm above is ineffective, as we do not have an effective bound on the smallest prime $p_k$ for which $T_{p_k}$ has distinct non-zero eigenvalues. However, a well-known conjecture implies that the prime 2 should work for all $k$ (see [JO98]).
5.2.3 Another Approach

One can use a different approach to compute the space of cusp forms. Starting from $f_{\mathrm{Tr}}$ one can consider the forms $f_i = f_{\mathrm{Tr}} | T_{m_i}$ for various $m_i$. Suppose $\alpha_{ij}$ are the eigenvalues of $T_{m_i}$ for $1 \le i, j \le D$ ($= \dim S_k$). Now the $f_i$ will form a basis of $S_k$ iff the matrix $(\alpha_{ij})$ has full rank. Our analysis in §3.8 on page 38 shows that among the Hecke operators $T_1, \dots, T_N$ for $N = O(D)$ we will find a full rank matrix of eigenvalues. Using this idea, one gets an alternate algorithm for computing a basis of $S_k$ with the same running time as before. We will use this idea to compute a basis for the space $S_k(\Gamma_0(N), \chi)$. However, the previous algorithm is somewhat simpler to implement as it uses only a single Hecke operator.
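5.3 Computing Cusp Forms of Higher Level

In this section we sketch a generalization of the approach in §5.2.3 to compute a basis of cusp forms of higher level. The Selberg trace formula for the Hecke operators $T_{k,\chi}(n)$ on the space $S_k(\Gamma_0(N), \chi)$ is given in the following theorem (the explicit form we give here is from [HPS89] Chapter II).

Theorem 5.3.1. Let $k \ge 2$ be an integer. Let $\chi$ be a character modulo $N$ and assume that $\chi(-1) = (-1)^k$. Write $\chi = \prod_{\ell} \chi_\ell$, where for each prime $\ell$ dividing $N$, $\chi_\ell$ is a character mod $\ell^\nu$, $\ell^\nu \| N$. Then for $n$ such that $\gcd(n, N) = 1$ we have
\begin{align*}
\operatorname{Tr} T_{k,\chi}(n) = {} & -\sum_{s} a(s) \sum_{f} b(s, f) \prod_{\ell \mid N} c'_\chi(s, f, \ell) \\
& + \delta(\chi) \deg T_{k,\chi}(n) \\
& + \delta(\sqrt{n})\, \frac{k-1}{12}\, N \prod_{\ell \mid N} \left(1 + \frac{1}{\ell}\right) \\
& - \delta(\sqrt{n})\, \frac{\sqrt{n}}{2} \prod_{\ell \mid N} \operatorname{par}(\ell)
\end{align*}
where
\[ \delta(\chi) = \begin{cases} 1 & \text{if } k = 2 \text{ and } \chi \text{ is trivial;} \\ 0 & \text{otherwise,} \end{cases} \]
\[ \delta(\sqrt{n}) = \begin{cases} n^{\frac{k}{2} - 1} \chi(\sqrt{n}), & \text{if } n \text{ is a perfect square} \\ 0 & \text{otherwise,} \end{cases} \]
\[ \operatorname{par}(\ell) = \begin{cases} 2\ell^{\nu - e}, & \text{if } e \ge \rho + 1 \\ \ell^{\rho} + \ell^{\rho - 1}, & \text{if } e \le \rho \text{ and } \nu \text{ is even} \\ 2\ell^{\rho}, & \text{if } e \le \rho \text{ and } \nu \text{ is odd.} \end{cases} \]
Here for fixed $\ell \mid N$, $\ell^\nu \| N$, $\rho = \lfloor \nu/2 \rfloor$, and $e = e(\chi_\ell)$ (the exponential conductor of $\chi_\ell$).

The meaning of $s$, $a(s)$, $b(s, f)$ and $c'_\chi(s, f, \ell)$ is as follows: Let $s$ run over all integers such that $s^2 - 4n$ is negative or a positive square. Hence, for some positive integer $t$ and squarefree negative integer $m$, $s^2 - 4n$ has one of the following forms, which we classify into the cases (h) or (e) as follows:
\[ s^2 - 4n = \begin{cases} t^2 & \text{(h)} \\ t^2 m, \quad 0 > m \equiv 1 \bmod 4 & \text{(e)} \\ t^2 (4m), \quad 0 > m \equiv 2, 3 \bmod 4 & \text{(e).} \end{cases} \]
Let $\Phi(X) = \Phi_s(X) = X^2 - sX + n$ and let $x$ and $y$ be the roots in $\mathbb{C}$ of $\Phi(X) = 0$. Corresponding to the classification of $s$ put
\[ a(s) = \begin{cases} (\min\{|x|, |y|\})^{k-1}\, |x - y|^{-1}\, \operatorname{sign}(x)^k & \text{(h)} \\[4pt] \dfrac{x^{k-1} - y^{k-1}}{2(x - y)} & \text{(e).} \end{cases} \]
For each fixed $s$ let $f$ run over all positive divisors of $t$ and let
\[ b(s, f) = \begin{cases} \dfrac{1}{2}\, \varphi\!\left( \dfrac{\sqrt{s^2 - 4n}}{f} \right) & \text{(h)} \\[8pt] \dfrac{h((s^2 - 4n)/f^2)}{\omega((s^2 - 4n)/f^2)} & \text{(e)} \end{cases} \]
where $\varphi$ is Euler's function and $h(d)$ (resp. $\omega(d)$) denotes the class number of locally principal ideals (resp. 1/2 the cardinality of the unit group) of the order of $\mathbb{Q}(\sqrt{d})$ with discriminant $d$.

For a fixed pair $(s, f)$ and a prime divisor $\ell$ of $N$, let $\nu$ be such that $\ell^\nu \| N$ and $b$ such that $\ell^b \| f$, and put $\tilde{A} = \{x \in \mathbb{Z} \mid \Phi(x) \equiv 0 \bmod \ell^{\nu + 2b},\ 2x \equiv s \bmod \ell^{b}\}$ and $\tilde{B} = \{x \in \tilde{A} \mid \Phi(x) \equiv 0 \bmod \ell^{\nu + 2b + 1}\}$. Let $A = A(s, f, \ell)$ (resp. $B = B(s, f, \ell)$) be a complete set of representatives of $\tilde{A}$ (resp. $\tilde{B}$) mod $\ell^{\nu + b}$ and let $B' = B'(s, f, \ell) = \{s - z \mid z \in B\}$. Then
\[ c'_\chi(s, f, \ell) = \begin{cases} \sum_{x} \chi_\ell(x) & \text{if } (s^2 - 4n)/f^2 \not\equiv 0 \bmod \ell \\[4pt] \sum_{x} \chi_\ell(x) + \sum_{y} \chi_\ell(y) & \text{if } (s^2 - 4n)/f^2 \equiv 0 \bmod \ell, \end{cases} \]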
where $x$ runs over all elements of $A(s, f, \ell)$ and $y$ runs over all elements of $B'(s, f, \ell)$.

Using the ideas from the last section it can be shown that $\operatorname{Tr} T_{k,\chi}(n)$ can be computed in $O(n^{\frac{1}{2}+\epsilon})$ time by a randomized algorithm. The idea then is to use the image of the form $f_{\mathrm{Tr}}$ under a suitable set of Hecke operators $T_{k,\chi}(m_1), \dots, T_{k,\chi}(m_D)$ (here $D = \dim S_k(\Gamma_0(N), \chi)$) that have linearly independent sets of eigenvalues. Our argument in §5.2.3, that among $T_1, \dots, T_N$ with $N = O(\dim S_k)$ such a set exists, is no longer valid. However, one can argue as follows. Since there is a basis of simultaneous eigenforms for all the Hecke operators $T_{k,\chi}(n)$ for $\gcd(n, N) = 1$, if there is a linear relation among the eigenvalues of any set of $D$ Hecke operators then this linear relation has to be the same among all the sets. Thus we have a linear relation $\sum_i \alpha_i T_{k,\chi}(m_i) = 0$ for any $D$ of the operators $T_{k,\chi}(m_i)$. This contradicts the fact that the Hecke operators form a full rank commutative algebra over $\mathbb{C}$ (see [Shi71] Theorem 3.45). Since we are only interested in an algorithm to compute a basis for a fixed space $S_k(\Gamma_0(N), \chi)$ the existence of such a set of operators is sufficient. Thus we have the following theorem:

Theorem 5.3.2. Fix a space $S \stackrel{\text{def}}{=} S_k(\Gamma_0(N), \chi)$. Then there is a basis for $S$ composed of cusp forms whose $n$-th Fourier coefficient can be computed in time $O(n^{\frac{1}{2}+\epsilon})$ by a randomized algorithm (under GRH) for any $\epsilon > 0$. The implied constant is ineffective and depends only on $k$ and $N$.

We remark that in practice the ineffectivity is not a serious problem. It is quite easy to find Hecke operators with linearly independent sets of eigenvalues. The main problem with the trace formula approach is that it cannot yield an algorithm with running time better than $\Theta(\sqrt{n})$.
Chapter 6
Elliptic Curves

We record many of the results about elliptic curves that we will use in this chapter. The standard references for the arithmetic theory of elliptic curves are [Sil86, Tat74].

6.1 Definition of an Elliptic Curve
Definition 6.1.1. Let $K$ be a field. An elliptic curve $E/K$ is a projective non-singular irreducible curve over $K$ of genus 1 together with a $K$-rational point. If the characteristic of $K$ is not 2 or 3 then $E$ has a model of the form
\[ E : Y^2 Z = X^3 + AXZ^2 + BZ^3 \]
where $A, B \in K$ and $\Delta_E \stackrel{\text{def}}{=} -16(4A^3 + 27B^2) \ne 0$. The quantity $\Delta_E$ is called the discriminant of $E$. The curve $E$ has two affine components: one where $Z \ne 0$, where setting $x = X/Z$ and $y = Y/Z$ we get
\[ y^2 = x^3 + Ax + B; \]
the other affine component has $Z = 0$ and there is only one such point on $E$, namely, $(0 : 1 : 0)$. This latter point is sometimes denoted $\infty$, but we will use the notation $O_E$.
Let $E/K$ be an elliptic curve and let $L \supseteq K$ be a field extension. Then the set $E(L) = \{(x, y) \in L^2 : y^2 = x^3 + Ax + B\} \cup \{O_E\}$ can be given the structure of an abelian group with $O_E$ as the identity. The group operation is given locally by polynomials (defined over $K$¹), and can be stated succinctly: Three points $P$, $Q$ and $R$ in $E(L)$ satisfy $P + Q + R = O_E$ iff they are collinear.
All the group operations on E(L) can be efficiently computed, and explicit formulas can be given for the operations (see, for instance, [Sil86] Page 58).
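As an illustration of these explicit formulas, here is a short Python sketch of the chord-and-tangent group law on $y^2 = x^3 + Ax + B$ over a prime field $\mathbb{F}_p$ with $p > 3$. The function names are ours, `None` stands for $O_E$, points are assumed to have coordinates reduced to $[0, p)$, and the modular inverse `pow(x, -1, p)` assumes Python 3.8 or later. The small example curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ is our choice.

```python
def ec_neg(P, p):
    """Negative of a point; None represents the identity O_E."""
    if P is None:
        return None
    x, y = P
    return (x, (-y) % p)


def ec_add(P, Q, A, p):
    """Add two points of y^2 = x^3 + A*x + B over F_p (B is not needed)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                       # P + (-P) = O_E
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, p) % p   # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)


def ec_mul(m, P, A, p):
    """Compute the multiple m*P by double-and-add; negative m uses -P."""
    if m < 0:
        return ec_mul(-m, ec_neg(P, p), A, p)
    result = None
    while m:
        if m & 1:
            result = ec_add(result, P, A, p)
        P = ec_add(P, P, A, p)
        m >>= 1
    return result


if __name__ == "__main__":
    A, p = 2, 17
    P = (5, 1)                      # a point on y^2 = x^3 + 2x + 2 over F_17
    print(ec_add(P, P, A, p), ec_mul(3, P, A, p))
```

The collinearity rule is visible in the formulas: the third intersection point of the line through $P$ and $Q$ is $(x_3, -y_3)$, and the sum is its negative.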
From now on, we will assume that $\operatorname{char} K \ne 2$ or $3$ and we will usually write only the affine model for an elliptic curve $E$ as $y^2 = x^3 + Ax + B$.

The only maps between elliptic curves that we consider are the morphisms. Informally, these are well-defined maps given locally by polynomials. We refer the reader to [Sil86] Chapter I §3 for the formal definition. An isomorphism $E_1 \cong E_2$ is a morphism with a two-sided inverse.

If $E/K$ is an elliptic curve given by $y^2 = x^3 + Ax + B$, we define the $j$-invariant of $E$ as the quantity
\[ j_E = \frac{-1728(4A)^3}{\Delta_E}. \]
The following proposition justifies the nomenclature of the $j$-invariant.

¹ This will turn out to be important later.

Proposition 6.1.2. Let $K$ be a field and fix an algebraic closure $\overline{K}$ of $K$. Let $E_1, E_2$ be two elliptic curves over $K$. Then the two elliptic curves are isomorphic over $\overline{K}$ iff $j_{E_1} = j_{E_2}$.
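A small Python sketch of the $j$-invariant for curves with integer coefficients (the function names are ours). It also checks one direction of the proposition on an example: the change of variables $(x, y) \mapsto (u^2 x, u^3 y)$ sends $y^2 = x^3 + Ax + B$ to $y^2 = x^3 + u^4 A x + u^6 B$, and the two curves have the same $j$-invariant.

```python
from fractions import Fraction


def discriminant(A, B):
    """Discriminant of y^2 = x^3 + A*x + B."""
    return -16 * (4 * A**3 + 27 * B**2)


def j_invariant(A, B):
    """j-invariant of y^2 = x^3 + A*x + B for integers A, B."""
    delta = discriminant(A, B)
    if delta == 0:
        raise ValueError("singular curve: discriminant is zero")
    return Fraction(-1728 * (4 * A) ** 3, delta)


if __name__ == "__main__":
    u = 2
    print(j_invariant(1, 1), j_invariant(u**4 * 1, u**6 * 1))
    print(j_invariant(-1, 0))   # the curve y^2 = x^3 - x
```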
6.2 Isogenies

Definition 6.2.1. A morphism $\phi : E_1 \to E_2$ between elliptic curves is called an isogeny if $\phi(O_{E_1}) = O_{E_2}$. The curves $E_1$ and $E_2$ are called isogenous if there is a non-zero isogeny between them.

It is a fact that $\phi : E_1 \to E_2$ is surjective unless $\phi$ is the zero isogeny. If $\phi : E_1 \to E_2$ is a non-zero isogeny, we get an injection of function fields
\[ \phi^* : K(E_2) \hookrightarrow K(E_1). \]
This is a finite extension whose degree is called the degree of the isogeny $\phi$. By convention we set $\deg[0] = 0$. If this field extension is separable (resp. inseparable or purely inseparable) we call the isogeny separable (resp. inseparable or purely inseparable). The set $\operatorname{Hom}(E_1, E_2) = \{\text{isogenies } \phi : E_1 \to E_2\}$ is a group under the addition law $(\phi + \psi)(P) = \phi(P) + \psi(P)$. We define $\operatorname{End}(E) = \operatorname{Hom}(E, E)$, and this can be given the structure of a ring with multiplication defined by composition of isogenies. $\operatorname{End}(E)$ is called the endomorphism ring of $E$.
Example 6.2.2. Let $E$ be an elliptic curve over $K$. For each $m \in \mathbb{Z}$ we can define an isogeny, the multiplication by $m$ map, denoted $[m] : E \to E$, as follows: If $m > 0$ then $[m](P) = \underbrace{P + P + \cdots + P}_{m \text{ terms}}$; if $m < 0$ then define $[m](P) = [-m](-P)$. It turns out that $[m]$ is a non-zero isogeny if $m \ne 0$ and is always defined over the base field $K$.

We note some important properties of isogenies and the endomorphism ring below:

Proposition 6.2.3. Let $E, E_1, E_2$ be elliptic curves over $K$. Then the following are true:
1. If $m \in \mathbb{Z}$, $m \ne 0$ then the multiplication by $m$ map $[m] : E \to E$ is non-constant.
2. The group $\operatorname{Hom}(E_1, E_2)$ is a torsion-free $\mathbb{Z}$-module.
3. $\operatorname{End}(E)$ is a (not necessarily commutative) ring of characteristic 0 with no zero divisors.

Definition 6.2.4. Let $E$ be an elliptic curve and $m \in \mathbb{Z}$, $m \ne 0$. The $m$-torsion subgroup of $E$, denoted $E[m]$, is the set of points of order dividing $m$ in $E$, $E[m] = \{P \in E : [m]P = O_E\}$.

The following important result shows that isogenies are actually group homomorphisms.

Theorem 6.2.5. Let $\phi : E_1 \to E_2$ be a non-zero isogeny.
1. Then $\phi$ is a group homomorphism (even if $\phi$ is the zero isogeny).
2. Furthermore, $\ker \phi = \phi^{-1}(O_{E_2})$ is a finite subgroup.
3. For every $Q \in E_2$, $\#\phi^{-1}(Q) = \deg_s(\phi)$ (the separable degree of the extension of function fields that $\phi$ induces).
4. If $\phi$ is a separable isogeny, then $\#\ker \phi = \deg \phi$ and $K(E_1)$ is a Galois extension of $\phi^* K(E_2)$.

Theorem 6.2.6. Let $\phi : E_1 \to E_2$ be a non-constant isogeny of degree $m$. Then there exists a unique isogeny $\hat{\phi} : E_2 \to E_1$ satisfying $\hat{\phi} \circ \phi = [m]$. This isogeny is called the dual isogeny of $\phi$. If $\phi = [0]$ then we define the dual isogeny to be $[0]$.

The following theorem summarizes some important properties of the dual isogeny.

Theorem 6.2.7. Let $\phi : E_1 \to E_2$ be an isogeny.
1. Let $m = \deg \phi$. Then $\hat{\phi} \circ \phi = [m]$ on $E_1$; $\phi \circ \hat{\phi} = [m]$ on $E_2$.
2. Let $\lambda : E_2 \to E_3$ be another isogeny. Then $\widehat{\lambda \circ \phi} = \hat{\phi} \circ \hat{\lambda}$.
3. Let $\psi : E_1 \to E_2$ be another isogeny. Then $\widehat{\phi + \psi} = \hat{\phi} + \hat{\psi}$.
4. For all $m \in \mathbb{Z}$, $\widehat{[m]} = [m]$ and $\deg[m] = m^2$.
5. $\deg \hat{\phi} = \deg \phi$.
6. $\hat{\hat{\phi}} = \phi$.

The next result gives the structure of the $m$-torsion subgroups.

Theorem 6.2.8. Let $E/K$ be an elliptic curve and $m \in \mathbb{Z}$, $m \ne 0$.
1. If $\operatorname{char}(K) = 0$ or if $m$ is prime to $\operatorname{char}(K)$, then $E[m] \cong (\mathbb{Z}/m\mathbb{Z}) \times (\mathbb{Z}/m\mathbb{Z})$.
2. If $\operatorname{char}(K) = p$, then either $E[p^e] \cong \{0\}$ for all $e = 1, 2, 3, \dots$; or $E[p^e] \cong \mathbb{Z}/p^e\mathbb{Z}$ for all $e = 1, 2, 3, \dots$.
6.3 Structure of the Endomorphism Ring

The following theorem of Deuring gives the possibilities for the structure of the endomorphism ring of an elliptic curve.

Theorem 6.3.1. Let $E$ be an elliptic curve over $K$. Then $\operatorname{End}(E)$ is one of the following three kinds of rings.
1. $\operatorname{End}(E) \cong \mathbb{Z}$.
2. $\operatorname{End}(E)$ is an order in a quadratic imaginary extension of $\mathbb{Q}$.
3. $\operatorname{End}(E)$ is an order in a quaternion algebra over $\mathbb{Q}$.

It turns out that the last possibility can happen only for elliptic curves over fields of characteristic $p > 0$.

Definition 6.3.2. Let $E$ be an elliptic curve over a number field $K$. Then $E$ is said to have complex multiplication (or just CM) if $\operatorname{End}(E)$ is an order in an imaginary quadratic field.

Example 6.3.3. Let $E/\mathbb{Q}$ be the elliptic curve $E : y^2 = x^3 - x$. Then $\operatorname{End}(E)$ contains an element which we denote $[\imath]$, given by $[\imath] : (x, y) \mapsto (-x, \imath y)$ (here $\imath \in \mathbb{C}$ is $\sqrt{-1}$). In this case one can show $\operatorname{End}(E) \cong \mathbb{Z}[\imath]$. Thus $E$ has complex multiplication.
6.4 The Tate Module

Let $E/K$ be an elliptic curve and $m \ge 2$ an integer (prime to $\operatorname{char}(K)$ if $\operatorname{char}(K) > 0$). Then by Theorem 6.2.8 we have $E[m] \cong (\mathbb{Z}/m\mathbb{Z}) \times (\mathbb{Z}/m\mathbb{Z})$.

The subgroup $E[m]$ has a Galois action. Suppose $\sigma \in \operatorname{Gal}(\overline{K}/K)$ and $P \in E[m]$; then $P^\sigma \in E[m]$, where $\sigma$ acts on $P$ coordinatewise, since the multiplication by $m$ map is defined over $K$. We thus obtain a representation
\[ \operatorname{Gal}(\overline{K}/K) \to \operatorname{Aut}(E[m]) \cong \operatorname{GL}_2(\mathbb{Z}/m\mathbb{Z}). \]
The $E[m]$ for various choices of $m$ enjoy some compatibilities, and the following is a natural object to study.

Definition 6.4.1. Let $E$ be an elliptic curve and $\ell \in \mathbb{Z}$ be a prime. The $\ell$-adic Tate module of $E$ is the group
\[ T_\ell(E) = \varprojlim_{n} E[\ell^n], \]
where the inverse limit is taken with respect to the maps $E[\ell^{n+1}] \xrightarrow{[\ell]} E[\ell^n]$. Note that the Tate module has a natural $\mathbb{Z}_\ell$-module structure.
6.5 Elliptic Curves over Finite Fields

Let $K$ be a finite field with $q$ elements and let $E/K$ be an elliptic curve. The following important theorem of Hasse gives a bound for the number of elements in $E(K)$.

Theorem 6.5.1. Let $E/K$ be an elliptic curve defined over a finite field with $q$ elements. Then
\[ |\#E(K) - q - 1| \le 2\sqrt{q}. \]
Let $E/K$ be an elliptic curve, where $K$ is a finite field with $q$ elements. Then the Frobenius map $\phi : \overline{K} \to \overline{K}$ that sends $x \mapsto x^q$ induces an isogeny, which we also denote $\phi$, on $E$. The map $\phi$ induces a map $\phi_\ell$ on the Tate module $T_\ell(E)$ (for $\ell \ne \operatorname{char}(K)$).
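Hasse's bound is easy to observe on small examples. The Python sketch below counts the points of $y^2 = x^3 + Ax + B$ over a prime field $\mathbb{F}_p$ by brute force (the function names and the example curves are ours) and checks the inequality in the equivalent integer form $(\#E(K) - p - 1)^2 \le 4p$. It takes time linear in $p$, so it is only an illustration and not a point counting algorithm one would use in practice.

```python
def count_points(A, B, p):
    """Number of points of y^2 = x^3 + A*x + B over F_p, including O_E."""
    # square_roots[r] = number of y in F_p with y^2 = r
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    affine = sum(square_roots[(x * x * x + A * x + B) % p] for x in range(p))
    return affine + 1          # add the point at infinity


def satisfies_hasse(A, B, p):
    """Check |#E(F_p) - p - 1| <= 2*sqrt(p) using integers only."""
    n = count_points(A, B, p)
    return (n - p - 1) ** 2 <= 4 * p


if __name__ == "__main__":
    print(count_points(2, 2, 17), satisfies_hasse(2, 2, 17))
```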
Proposition 6.5.2. Let $E$ be an elliptic curve over a finite field $K$ with $q$ elements. Let $\ell$ be a prime different from $\operatorname{char}(K)$ and $\phi_\ell$ denote the map induced by the $q$-th power Frobenius map on $T_\ell(E)$. Then $\det(\phi_\ell) = q$ and $\#E(K) = q + 1 - \operatorname{Tr} \phi_\ell$. Moreover, there are (conjugate) complex numbers $\alpha, \beta$ with $|\alpha| = |\beta| = \sqrt{q}$ such that $\operatorname{Tr} \phi_\ell = \alpha + \beta$ and if $K_n \supseteq K$ is a field with $q^n$ elements, then $\#E(K_n) = q^n + 1 - \alpha^n - \beta^n$.

We noted earlier that the endomorphism ring of an elliptic curve over a field of characteristic $p$ can be non-commutative. The next result, due to Deuring, shows when that can occur.

Theorem 6.5.3. Let $K$ be a field of characteristic $p$ and $E/K$ an elliptic curve. For each integer $r \ge 1$, let
\[ \phi_r : E \to E^{(p^r)} \quad \text{and} \quad \hat{\phi}_r : E^{(p^r)} \to E \]
be the $p^r$-power Frobenius map and its dual. Here if $E$ is given by $y^2 = x^3 + Ax + B$ then $E^{(p^r)}$ is the elliptic curve given by $y^2 = x^3 + A^{p^r} x + B^{p^r}$.
1. The following are equivalent:
(a) $E[p^r] = 0$ for one (all) $r \ge 1$.
(b) $\hat{\phi}_r$ is (purely) inseparable for one (all) $r \ge 1$.
(c) The map $[p] : E \to E$ is purely inseparable and $j(E) \in \mathbb{F}_{p^2}$.
(d) $\operatorname{End}(E)$ is an order in a quaternion algebra.
2. If the equivalent conditions in (1) do not hold, then $E[p^r] = \mathbb{Z}/p^r\mathbb{Z}$ for all $r \ge 1$. Further, if $j(E) \in \overline{\mathbb{F}}_p$, then $\operatorname{End}(E)$ is an order in an imaginary quadratic field.

Definition 6.5.4. If $E$ satisfies any of the equivalent conditions of (1), then we call $E$ supersingular. Otherwise we say that $E$ is ordinary.
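For curves over a prime field the trace of Frobenius gives a convenient test: it is a standard fact (not stated above, so we flag it as an assumption of this sketch) that $E/\mathbb{F}_q$ is supersingular iff $p$ divides $\operatorname{Tr} \phi_\ell$. The Python sketch below uses this together with brute-force point counting, and also evaluates $\#E(K_n)$ from the trace through the recurrence $s_n = a\, s_{n-1} - q\, s_{n-2}$ for $s_n = \alpha^n + \beta^n$. The function names and example curves are ours.

```python
def count_points(A, B, p):
    """Number of points of y^2 = x^3 + A*x + B over F_p, including O_E."""
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    return 1 + sum(square_roots[(x * x * x + A * x + B) % p] for x in range(p))


def frobenius_trace(A, B, p):
    """Trace a = p + 1 - #E(F_p) of the p-th power Frobenius."""
    return p + 1 - count_points(A, B, p)


def is_supersingular(A, B, p):
    """Supersingularity test over F_p: the trace is divisible by p."""
    return frobenius_trace(A, B, p) % p == 0


def count_over_extension(a, q, n):
    """#E(K_n) = q^n + 1 - (alpha^n + beta^n), given the trace a over K."""
    s_prev, s = 2, a               # s_0 = 2, s_1 = alpha + beta = a
    for _ in range(n - 1):
        s_prev, s = s, a * s - q * s_prev
    return q**n + 1 - s


if __name__ == "__main__":
    # y^2 = x^3 - x has CM by Z[i]; compare p = 7 with p = 5.
    print(is_supersingular(-1, 0, 7), is_supersingular(-1, 0, 5))
    a = frobenius_trace(2, 2, 17)
    print(count_over_extension(a, 17, 2))
```

On the CM curve $y^2 = x^3 - x$ the primes $p \equiv 3 \bmod 4$ are exactly the supersingular ones among the small primes tried here, which is the kind of abundance the randomized algorithm of Chapter 7 exploits.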
6.6 Weil Height

We will need the notion of the Weil height of an elliptic curve in Chapter 7, which we define here.

Definition 6.6.1. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number with minimal polynomial $p_\alpha(x) = a_0 x^d + a_1 x^{d-1} + \cdots + a_d \in \mathbb{Z}[x]$. Assume that $p_\alpha(x) = a_0 (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_d)$ with $\alpha_i \in \mathbb{C}$. Then the absolute logarithmic Weil height (or just Weil height) of $\alpha$ is defined to be the quantity
\[ h(\alpha) = \frac{1}{d} \left( \log |a_0| + \sum_{1 \le i \le d} \log \max\{1, |\alpha_i|\} \right). \]
With the notation of the definition, we have the following useful bound ([Fel82] Lemma 8.2)
\[ h(\alpha) \le \frac{1}{d} \log \sum_{i} |a_i|. \]
Thus the Weil height of an algebraic number is bounded polynomially by the encoding length of its minimal polynomial. Also, we denote the quantity $\sum_i |a_i|$ by $w(\alpha)$.

If $E/L$ is an elliptic curve we define the Weil height of $E$ to be $h(j_E)$, the Weil height of its $j$-invariant.
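A quick numerical sanity check of the height bound, as a Python sketch with names of our choosing. It treats only the simplest case, a rational number $a/b$ in lowest terms with minimal polynomial $bx - a$, where the definition gives $h(a/b) = \log \max\{|a|, |b|\}$, and compares it with the bound $\frac{1}{d} \log w(\alpha)$ computed from the coefficient list.

```python
import math
from fractions import Fraction


def length(coeffs):
    """w(alpha): the sum of |a_i| over the coefficients of the minimal polynomial."""
    return sum(abs(a) for a in coeffs)


def height_upper_bound(coeffs):
    """The bound (1/d) * log(sum |a_i|) for a polynomial of degree d >= 1."""
    degree = len(coeffs) - 1
    return math.log(length(coeffs)) / degree


def height_of_rational(r):
    """Weil height of a rational number: log max(|numerator|, |denominator|)."""
    r = Fraction(r)
    return math.log(max(abs(r.numerator), abs(r.denominator), 1))


if __name__ == "__main__":
    r = Fraction(3, 2)                       # minimal polynomial 2x - 3
    print(height_of_rational(r), height_upper_bound([2, -3]))
```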
Chapter 7
Testing Elliptic Curves for Complex Multiplication

Elliptic curves defined over number fields with complex multiplication have found applications in cryptography and coding theory, since there are closed form expressions for the number of points on such curves modulo prime ideals. This property was also utilized in the Atkin-Morain primality proving method [AM93]. Constructing elliptic curves with complex multiplication is computationally very expensive. In this chapter we show that testing an elliptic curve for CM on the other hand is easy.

One should be careful in the formulation of the computational problem here. For instance, if one fixes the number field over which the elliptic curves are defined, then there are only finitely many elliptic curves that have CM. Thus the problem becomes very easy from a computational standpoint (this issue is explained in §7.1). For this reason we consider the number field as being part of the input. Once we define the problem in this way, we can try the following idea: transform the method of constructing elliptic curves with CM into a solution for the problem. Unfortunately, to implement this idea we need good effective lower bounds on class numbers of imaginary quadratic fields, which is a notorious open problem. We describe this idea and analyze it in detail in §7.2.
Our next approach, discussed in §7.3, uses the elegant results of Deuring on the reduction of endomorphism rings of elliptic curves and Serre on the density of supersingular primes. The approach is based on the observation that supersingular primes are plentiful for curves with complex multiplication. This yields a two-sided error probabilistic polynomial time algorithm for this problem. We also show how this method can be adapted to find the discriminant of the endomorphism ring, but the analysis of this stage of the algorithm presents some interesting open questions. However, we can use the results we obtain here to make the error in the randomized algorithm one-sided. A similar algorithm is sketched in [CNST98] without a precise analysis of the probability of failure and the running time. We improve their results in two ways. First, our algorithm is simpler to implement. Second, unlike theirs, our proof is rigorous and does not rely on any unproven heuristic assumptions.
The final method, which we believe is new, discussed in §7.4 is based on studying the image of the Galois representations afforded by `-torsion points on the curve. This method is deterministic and has a polynomial running time, but we are unable to bound the (multiplicative) constant in the running time effectively. The results of this chapter are from [Cha05a].
7.1 The Problem

The computational problem that is the focus of this Chapter is the following:

Complex multiplication of elliptic curves:
Input: A number field $L$, and an elliptic curve $E : Y^2 Z = X^3 + AXZ^2 + BZ^3$ with $A, B \in L$.
Question: Does $E$ have complex multiplication?

We will assume that $L = \mathbb{Q}(j_E)$, since $E$ always has a model over $\mathbb{Q}(j_E)$ and we can restrict to the subfield generated by $j_E$. The input is specified by giving the minimal polynomials of $A$ and $B$, from which the minimal polynomial of $j_E$ can be determined efficiently ([Len91]). The size of the input is measured by the size of the encoding of the minimal polynomials of $A$ and $B$. The encoding length of a polynomial $p(x) = a_0 x^d + a_1 x^{d-1} + \cdots + a_d$, with integer coefficients, is defined to be the quantity $\sum_{0 \le i \le d} \max\{1, \log |a_i|\}$. Note that the encoding length of a non-zero polynomial $p(x)$ is at least the degree of $p(x)$.
Our main concern is the complexity of the above decision problem. A consequence of the algorithms presented in this Chapter is that the above decision problem is in P. Next, we explain why the number field needs to be part of the input.
The set of complex points on $E$, namely $E(\mathbb{C})$, has a particularly simple interpretation as $\mathbb{C}/L_E$, where $L_E$ is a rank 2 lattice such that $L_E \otimes_{\mathbb{Z}} \mathbb{R} = \mathbb{C}$. In this description, isomorphic elliptic curves correspond to lattices that differ by a non-zero complex scalar ([Sil86] VI Ex. 6.6). Suppose $E/\mathbb{C}$ is given by a lattice $L_E$. Then there is an isomorphic elliptic curve given by the lattice $\mathbb{Z} + \mathbb{Z}\tau_E$ with $\tau_E \in \mathbb{H}$, where $\mathbb{H} = \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$.

There is a simple criterion for deciding when $E$ has complex multiplication, provided $E$ is given as $\mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau_E)$ ([Sil86] Theorem VI.5.5):

Let $\tau$ be an imaginary quadratic number with minimal polynomial $ax^2 + bx + c$ and $\gcd(a, b, c) = 1$. Then the discriminant of $\tau$ is $b^2 - 4ac$.

Theorem 7.1.1. Let $E \cong \mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau_E)$ with $\tau_E \in \mathbb{H}$. Then $E$ has complex multiplication by an order $\mathcal{O}_D$ of discriminant $D$ iff $\tau_E$ is a quadratic number of discriminant $D$ as defined above.

We also have the following important theorem (see [Coh93] Theorem 7.2.14 or [Sil94] Chapter 2):

Theorem 7.1.2. Let $\tau \in \mathbb{H}$ be an imaginary quadratic number, and let $D$ be its discriminant. Then $j(\tau)$ (here $j$ is the usual modular $j$-function) is an algebraic integer of degree equal to $h(D)$, where $h(D)$ is the class number of the imaginary quadratic order of discriminant $D$. More precisely, the minimal polynomial of $j(\tau)$ over $\mathbb{Z}$ is $\prod (X - j(\alpha))$, where $\alpha$ runs over the quadratic numbers associated to the reduced forms of discriminant $D$.

We can interpret Theorems 7.1.1 and 7.1.2 as follows. If $E/L$ has complex multiplication by $\mathcal{O}_D$, an order of discriminant $D$, then its $j$-invariant has only $h(D)$ possibilities, and is an algebraic integer of degree $h(D)$. Siegel's theorem implies that $h(D) \to \infty$ as $D \to -\infty$, thus one concludes that if we fix a number field $L$, then there are only finitely many $j$-invariants of elliptic curves defined over $L$ that have complex multiplication. In other words, if we fix any $L$, the problem of checking when an elliptic curve over $L$ has CM becomes trivial from a complexity viewpoint: pre-compute this list of $j$-invariants for the field and check if the curve is one of them. The pre-computation cost, though prohibitive in practice, is still a computation that requires only $O(1)$ time. For instance, the list for $L = \mathbb{Q}$ is given in §7.2 of [Coh93]. This is why we insist on the field being part of the input.
Remark 7.1.3. The j-invariants of elliptic curves with CM are called singular moduli, and these enjoy many nice properties. They turn out to be algebraic integers and generate dihedral extensions of Q. Furthermore, in an important paper Gross-Zagier ([GZ85]) derived a formula for the prime ideal factorization of j(τ1 ) − j(τ2 ) where τ1 , τ2 generate maximal quadratic orders with coprime discriminants. Such numbers are divisible by many primes of small norm. There is even a conjectural extension of this work to the case where the τi do not generate maximal orders; see [Hut98]. We utilize some of these properties in §7.4.
7.2 A Direct Approach

We can turn the results of Theorems 7.1.1 and 7.1.2 into an algorithm for checking if an elliptic curve has CM as follows. First compute the Hilbert class polynomials $H_D = \prod (x - j(\alpha))$, where $\alpha$ runs over the quadratic numbers associated to the reduced quadratic forms of (negative) discriminant $D$. Next we check if the $j$-invariant of the elliptic curve is a root of this polynomial. If so, we know that $E$ has CM by an order of discriminant $D$. This computation can be done in $|D|^{O(1)}$ time (cf. [Sch85] §4). One does this for each $D \equiv 0, 1 \bmod 4$ until the degree of $H_D$ exceeds the degree of the field of definition of the elliptic curve. At this point we declare that the curve does not have CM.
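The degree of $H_D$ is the number of reduced forms of discriminant $D$, which is easy to enumerate. The Python sketch below lists the primitive reduced forms $(a, b, c)$ with $b^2 - 4ac = D$ and counts them (the function names are ours). It covers only this combinatorial part of the approach: computing $H_D$ itself would require evaluating $j$ at the associated quadratic numbers to high precision, which we do not attempt here.

```python
from math import gcd


def reduced_forms(D):
    """Primitive reduced forms (a, b, c) of negative discriminant D.

    Reduced means |b| <= a <= c, with b >= 0 whenever |b| = a or a = c.
    """
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    b = D % 2                        # b has the same parity as D
    while 3 * b * b <= -D:
        m = (b * b - D) // 4         # m = a * c
        a = max(b, 1)
        while a * a <= m:
            if m % a == 0:
                c = m // a
                if gcd(gcd(a, b), c) == 1:
                    forms.append((a, b, c))
                    if 0 < b < a < c:
                        forms.append((a, -b, c))   # inequivalent mirror form
            a += 1
        b += 2
    return forms


def class_number(D):
    """h(D), which is also the degree of the Hilbert class polynomial H_D."""
    return len(reduced_forms(D))


if __name__ == "__main__":
    print(reduced_forms(-23), class_number(-163))
```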
The problem with the above approach is this. When do we stop trying new discriminants? The Brauer-Siegel theorem says that h(D) grows roughly as $|D|^{1/2}$, but this bound is not effective. We need an explicit lower bound for the class number in terms of the discriminant to be able to decide when to stop. This is a hard problem, first studied by Gauss. Only recently the following explicit bound was proved by Gross, Zagier, Goldfeld and Oesterlé (see [Zag84, GZ86]):

Theorem 7.2.1. If D is a negative fundamental discriminant, then

$$h(D) > \begin{cases} \dfrac{1}{7000} \ln(|D|) \displaystyle\prod_{p \mid D} \left(1 - \dfrac{\lfloor 2\sqrt{p} \rfloor}{p+1}\right), & \text{if } \gcd(D, 5077) \neq 1, \\[2ex] \dfrac{1}{55} \ln(|D|) \displaystyle\prod_{p \mid D} \left(1 - \dfrac{\lfloor 2\sqrt{p} \rfloor}{p+1}\right), & \text{otherwise.} \end{cases}$$
Using the fact that the class number of an order is a multiple of the class number of the quadratic field associated to it, and the observation that if D has t prime factors then $2^{t-1} \mid h(D)$ (by Gauss’s genus theory), we obtain an effective lower bound on h(D). This results in a method whose running time is exponential in the degree of the field.
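To see how weak the explicit bound is in practice, the following Python sketch evaluates the right-hand side of Theorem 7.2.1 exactly as stated above. The function names are illustrative, and the factorization is plain trial division, which is an assumption made for brevity.

```python
import math

def prime_factors(n):
    """Distinct prime factors of |n| by trial division."""
    n = abs(n)
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def class_number_lower_bound(D):
    """Right-hand side of Theorem 7.2.1 for a negative fundamental discriminant D."""
    constant = 7000 if math.gcd(D, 5077) != 1 else 55
    product = 1.0
    for p in prime_factors(D):
        # math.isqrt(4 * p) equals floor(2 * sqrt(p)).
        product *= 1 - math.isqrt(4 * p) / (p + 1)
    return math.log(abs(D)) * product / constant
```

For D = −163 the bound is roughly 0.0785, far below the true value h(−163) = 1. Because the bound grows only like ln|D|, guaranteeing h(D) > n requires |D| to be exponentially large in n, which is the source of the exponential running time mentioned above.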
7.3 The Randomized Algorithm
The randomized algorithm is based on the observation that if E/L has CM, then there is an abundance of supersingular primes. This differs from the case where E does not have CM. We describe the algorithm first:
Input: A number field L and $E : Y^2 Z = X^3 + AXZ^2 + BZ^3$, with $A, B \in L$.

Steps:

1. If $j_E$ is not an algebraic integer, output “E does not have CM.”
2. Pick a prime p at random in the interval $I = [2, \ldots, (h \exp(n^{2+\epsilon}) \max\{w(A), w(B)\})^c]$, where c, h and ε are positive constants and $n = [L : \mathbb{Q}]$.
3. Find the decomposition $(p) = \prod_i P_i^{e_i}$, where the $P_i$ are prime ideals of $O_L$ (the ring of integers of L). If this step fails go back to step (2).
4. Choose a prime P in this factorization uniformly at random, treating the $e_i$ copies of $P_i$ as distinct.
5. If $N_{L/\mathbb{Q}} P$ lies outside the interval I then go to step (2).
6. With probability $1/\deg P$ proceed with the next step; otherwise, return to step (2).
7. Compute the reduction $\tilde{E}$ of E mod P. If this step does not succeed return to step (2).
8. Compute $a_P$, the trace of the Frobenius endomorphism of $\tilde{E}$.
9. If $a_P \equiv 0 \pmod{p}$ then output “E probably has CM”; otherwise, output “E probably does not have CM.”

First we argue that all the steps can be done efficiently, and also bound the probability of failure in some of the steps. Step (1) can be done by computing the minimal polynomial of $j_E$ and checking if it is monic with integer coefficients. This can be done in polynomial time [Len91]. Step (2) can be done efficiently using our source of random bits and randomized primality testing methods. To find the splitting of the prime p we make use of Theorem 4.8.13 in [Coh93], which leads to a randomized polynomial time algorithm. This algorithm not only provides us with the prime factorization $(p) = \prod_i P_i^{e_i}$ but also
gives us the isomorphism $O_L/P \cong \mathbb{F}_{p^d}$, where $P \supseteq (p)$ is a prime and $d = \deg(P)$. The isomorphism can be used to compute the reduction of the curve in step (7). The prime decomposition method we suggest will fail if the prime p divides the index $[O_L : \mathbb{Z}[\theta]]$, where $\theta = j_E$ (note that θ is an algebraic integer as a consequence of the check made at step (1)). The number of primes for which this failure can occur is bounded by the number of primes that divide the discriminant of the order $\mathbb{Z}[\theta]$. Since this order has a basis of the form $1, \theta, \theta^2, \ldots, \theta^{n-1}$, its discriminant is that of its minimal polynomial $T(x) = x^n + a_1 x^{n-1} + \cdots + a_n$. Using the Hadamard bound, we see that the number of primes dividing the discriminant is bounded by $\log\big(\big(\sum_i (n a_i)^2\big)^{2n-1}\big)$, which is still polynomial in the input length. The reduction of the elliptic curve can be done in step (7) if $p \nmid N_{L/\mathbb{Q}} \Delta_E$, and this again excludes only a few primes. Thus, if c and h are large enough the probability that we pick a prime for which either step (3) or (7) fails will be negligible. Step (8) can be done in polynomial time using, for instance, Schoof’s algorithm [Sch85].
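The steps above can be sketched for the simplest case $L = \mathbb{Q}$, where every prime has degree 1 and steps (3) to (6) collapse to picking a rational prime of good reduction. The Python code below is a toy under several assumptions that are not in the text: A and B are integers with a nonsingular curve, the sampling interval is a small user-supplied bound rather than the interval I, and the trace is found by naive point counting instead of Schoof's algorithm. All names are hypothetical.

```python
import random
from fractions import Fraction

def is_prime(n):
    """Trial-division primality test, adequate for the small primes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def trace_of_frobenius(A, B, p):
    """a_p = p + 1 - #E(F_p) for y^2 = x^3 + A x + B, p an odd prime of good reduction."""
    total = 0
    for x in range(p):
        f = (x * x * x + A * x + B) % p
        if f != 0:
            # Euler's criterion: f is a square mod p iff f^((p-1)/2) = 1.
            total += 1 if pow(f, (p - 1) // 2, p) == 1 else -1
    return -total

def probably_has_cm(A, B, bound, rng):
    """One round of the test for E: y^2 = x^3 + A x + B over Q (bound must exceed 7)."""
    disc = 4 * A**3 + 27 * B**2
    # Step (1): the j-invariant must be an integer.
    if Fraction(1728 * 4 * A**3, disc).denominator != 1:
        return False
    # Steps (2)-(7): sample a prime p >= 5 of good reduction by rejection.
    while True:
        p = rng.randrange(5, bound)
        if is_prime(p) and disc % p != 0:
            break
    # Steps (8)-(9): supersingular reduction iff a_p is divisible by p.
    return trace_of_frobenius(A, B, p) % p == 0
```

Running probably_has_cm(-1, 0, 500, random.Random(1)) repeatedly on the CM curve $y^2 = x^3 - x$ answers True about half the time, since this curve is supersingular exactly at the primes $p \equiv 3 \bmod 4$.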
We now explain the reason for sampling the primes as we do in steps (2) - (5). We wish to pick primes P uniformly at random from the primes of OL whose norm lies in the interval I. The sampling method we use is acceptance-rejection sampling and this ensures that we pick primes according to our requirement.
Firstly, if E has CM then its j-invariant is an algebraic integer (Theorems 7.1.1 and 7.1.2), and step (1) checks that this holds. Next, we argue that if E has complex multiplication then with non-negligible probability the algorithm will output that E probably has CM. For this we need a theorem of Deuring ([Lan87] Chapter 13 §4):
Theorem 7.3.1 (Deuring). Let E/L be an elliptic curve with complex multiplication by an order $O_E$ of an imaginary quadratic field K. Let P be a prime ideal over the rational prime p. Assume that E has good reduction at P. Then E mod P is supersingular iff p either ramifies or remains inert in K.

Let E be an elliptic curve over a finite field $\mathbb{F}_{p^d}$. Then E is supersingular iff it has no p-torsion points. This is equivalent to the trace of the $p^d$-power Frobenius endomorphism being a multiple of p ([Sil86] V. Ex. 5.10). Thus step (9) checks if E has supersingular reduction at the prime P.
Suppose E/L is a curve with complex multiplication by an order in the imaginary quadratic field $K = \mathbb{Q}(\sqrt{D})$ (where D is the discriminant of K). Then by Theorem 7.3.1 the primes where E has supersingular reduction are precisely those primes that are either ramified or inert in K. The primes that ramify are those that divide the discriminant D, and the primes p that remain inert are those for which $\left(\frac{D}{p}\right) = -1$. This immediately suggests that the proportion of such primes can be worked out by counting primes in certain arithmetic progressions mod D. However, since the discriminant of the field K depends on the input, we need a result that is uniform in the modulus D. Indeed, using quadratic reciprocity and the uniform prime number theorem for arithmetic progressions ([Dav00] Chapter 20) one can show the following theorem:

Theorem 7.3.2. Define

$$\pi_0(x) = \#\left\{ p \le x : \left(\frac{D}{p}\right) = -1 \right\}$$

and let δ > 0 be fixed. Then there is a positive effective constant c > 0 depending on δ such that if $|D| \le (\log x)^{1-\delta}$ then

$$\pi_0(x) = \frac{1}{2} \mathrm{Li}(x) + O\left(x e^{-c\sqrt{\log x}}\right)$$

uniformly in D.

To apply Theorem 7.3.2 we need to ensure that $|D| \le (\log x)^{1-\delta}$. In other words, we need to pick primes in an interval which is longer than $\exp(|D|^{1/(1-\delta)})$ for some δ > 0. At this point we apply Siegel’s theorem to get a bound on |D| in terms of the degree of the field over which E is defined. We use Siegel’s theorem, even though it is ineffective, because the ineffectiveness affects only the error term in the success probability of the algorithm. This does not affect the implementation of the algorithm. Furthermore, the effective estimate is too weak for our purpose.

Theorem 7.3.3 (Siegel). For each ε > 0 there is a constant (ineffective) c > 0 such that the class number h(−D) satisfies

$$h(-D) \ge c D^{1/2 - \epsilon}.$$

By Theorem 7.1.2 we have that $[L : \mathbb{Q}] \ge h(-D)$, because $j_E \in L$ has degree h(−D) over $\mathbb{Q}$; here −D is the discriminant of the order by which E has CM. By Siegel’s theorem we get that $D \le c' [L : \mathbb{Q}]^{2+\epsilon}$, where c′ is a positive constant depending on ε. Thus, picking primes that are at least $\exp(c' [L : \mathbb{Q}]^{2+\epsilon})$ will ensure (Theorem 7.3.2) that we have a positive density of supersingular primes. In summary, we have proved the following theorem:
Theorem 7.3.4. Fix any ε > 0 and let E/L be an elliptic curve with CM. If p is a prime picked uniformly at random in an interval containing $[2, \ldots, \exp([L : \mathbb{Q}]^{2+\epsilon})]$ and E has good reduction at $P \supseteq (p)$, then the probability that E has supersingular reduction at P is at least $\frac{1}{2} + o(1)$, the error term being ineffective.
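The density of one half behind Theorems 7.3.2 and 7.3.4 can be checked empirically for a fixed small discriminant. The Python sketch below counts the primes $p \le x$ with $\left(\frac{D}{p}\right) = -1$ using a sieve and Euler's criterion. The helper names are illustrative, and the prime 2 is handled through the Kronecker symbol, which is an assumption about how $\pi_0$ treats p = 2.

```python
import math

def primes_up_to(x):
    """Sieve of Eratosthenes returning all primes <= x (x >= 1)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return [i for i in range(2, x + 1) if sieve[i]]

def kronecker(D, p):
    """Value of (D/p) for a prime p, where D is congruent to 0 or 1 mod 4."""
    if p == 2:
        if D % 2 == 0:
            return 0
        return 1 if D % 8 in (1, 7) else -1
    r = pow(D % p, (p - 1) // 2, p)   # Euler's criterion
    return -1 if r == p - 1 else r

def pi0(D, x):
    """Number of primes p <= x with (D/p) = -1, i.e. primes inert in Q(sqrt(D))."""
    return sum(1 for p in primes_up_to(x) if kronecker(D, p) == -1)
```

For D = −4 and x = 100 the count is 13 out of 25 primes, and the ratio pi0(-4, x) / len(primes_up_to(x)) stays close to 1/2 as x grows.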
We have shown that about $\frac{1}{2}$ of the rational primes give us primes of supersingular reduction for E. But our algorithm selects primes P of $O_L$ that are most likely degree 1 primes. We need to ensure that this somehow does not bias against the primes of supersingular reduction for E. To argue this we consider the following diagram of fields:

$\mathbb{Q}(\sqrt{-D}, j_E)$
Q(jE