The p-adic ergodic theory and applications

Vladimir Anashin

The first draft: December 15, 2014

To my beloved Alla and Masha

Contents

Preface

0 Background
  0.1 Facts from number theory
  0.2 Facts from algebra
    0.2.1 Rings
    0.2.2 Finite Fields

I Basics of p-adic analysis

1 Rings and fields of p-adic numbers
  1.1 p-adic valuation
    1.1.1 Non-Archimedean absolute value and ultrametric
    1.1.2 Non-Archimedean rings and fields
  1.2 Canonical expansion of p-adic numbers
    1.2.1 Canonical representation
    1.2.2 Properties of p-adic integers

2 p-adic Calculus
  2.1 Univariate functions
    2.1.1 Continuous functions
    2.1.2 Differentiability
  2.2 Multivariate functions
    2.2.1 Differentiability modulo $p^k$ for multivariate functions
    2.2.2 Properties of derivatives modulo $p^k$

3 p-adic series
  3.1 Taylor series
    3.1.1 Analytic functions on balls
    3.1.2 Analytic functions on $\mathbb{Z}_p$
  3.2 Mahler series
    3.2.1 Identities modulo $p^k$
  3.3 Van der Put series

4 Compatible functions
  4.1 Compatibility and local compatibility
    4.1.1 Coordinate functions
    4.1.2 Differences of compatible functions
  4.2 Compatibility and differentiability
    4.2.1 Differentiability modulo $p$
  4.3 Mahler expansion of compatible functions
  4.4 Van der Put expansion of compatible functions

5 Special classes of compatible functions
  5.1 Class C
  5.2 Class B
    5.2.1 Stone-Weierstrass theorem for B-functions
    5.2.2 Taylor theorem for B-functions
    5.2.3 Important subclasses of B
  5.3 Class A

II The p-adic ergodic theory

6 Ergodic theory: basic notions and facts
  6.1 Dynamical systems: basic notions
    6.1.1 Finite dynamics
  6.2 Dynamics on $\mathbb{Z}_p^n$
    6.2.1 Probability measure on $\mathbb{Z}_p^n$
    6.2.2 Hereditable dynamical properties

7 The main ergodic theorem
  7.1 Measure-preserving isometries
  7.2 1-Lipschitz measure-preserving functions
  7.3 1-Lipschitz ergodic functions

8 1-Lipschitz ergodicity on $\mathbb{Z}_p$
  8.1 Ergodicity of affine mappings
  8.2 Ergodicity and measure-preservation in terms of coordinate functions
  8.3 Ergodicity and measure-preservation in terms of Mahler expansion
  8.4 Ergodicity and measure-preservation in terms of van der Put expansion
    8.4.1 Measure-preservation criteria for T-functions
    8.4.2 Ergodicity criteria for T-functions

9 Ergodicity and differentiability
  9.1 Conditions for measure-preservation
  9.2 No uniformly differentiable 1-Lipschitz ergodic transformations on $\mathbb{Z}_p^n$, $n \ge 2$
  9.3 Differentiable ergodic transformations on $\mathbb{Z}_p$
  9.4 Measure-preservation and ergodicity of A-, B-, and C-functions

10 1-Lipschitz ergodicity on subspaces
  10.1 1-Lipschitz ergodic transformations on balls
  10.2 1-Lipschitz ergodic transformations on spheres
    10.2.1 Ergodicity of B-functions and of analytic functions
    10.2.2 Ergodicity of A-functions on spheres

11 Euclidean plots
  11.1 Maps of $\mathbb{Z}_p$ into $\mathbb{R}$
  11.2 Points falling on hyperplanes

III Applications

12 The p-adic ergodic theory of automata
  12.1 Automata and automata maps
    12.1.1 Automata word transformations
    12.1.2 Reversibility of automata
  12.2 Transitivity of automata
  12.3 Automata 0-1 law
  12.4 Conditions for complete transitivity
    12.4.1 Finite automata are all of measure 0
    12.4.2 Complete and absolute transitivity
  12.5 Automata finiteness criterion

13 Pseudorandom generators
  13.1 Pseudorandom generator is a dynamical system
    13.1.1 What pseudorandom generators are good?
    13.1.2 Why the p-adic ergodic theory?
  13.2 Congruential generators of the longest period
    13.2.1 Types of congruential generators
    13.2.2 Periods of congruential generators

14 Latin squares
  14.1 The p-adic ergodic theory in design of Latin squares
  14.2 Orthogonal Latin squares

Preface

This textbook is based on a 28-hour lecture course on applied algebraic dynamics that I gave to MS and PhD students of the Graduate University of the Chinese Academy of Sciences. Although the textbook overlaps with the monograph [5] “Applied Algebraic Dynamics” (De Gruyter, 2009), authored by me and Prof. Khrennikov, the monograph is much wider both in volume and in coverage than the textbook, but it contains no exercises. On the other hand, the textbook includes some new results that are not mentioned in the monograph. These results mainly concern the use of van der Put series in the p-adic ergodic theory and its applications to automata theory. The corresponding parts of the textbook are based on the recent papers [10, 7, 11, 12, 8, 9].

Algebraic dynamics applies successfully to computer science and cryptology because both word transformations performed by automata and basic computer instructions can be considered as continuous non-Archimedean functions and are therefore subject to non-Archimedean dynamics and p-adic analysis. That is why some exercises and examples of the textbook intentionally deal with functions that are basic computer instructions, e.g., XOR, AND, NOT, etc. The application part of the textbook also focuses mainly on applications to computer science and cryptology; so T-functions (which actually are straight-line programs) are of special interest throughout the book, since a number of cryptographic primitives (as well as whole ciphers) are in fact T-function-based.
Unfortunately, due to space constraints I could not include in the textbook such useful applications to cryptology as linear relations in T-functions and fast evaluation of T-functions [49, 50], which were obtained within the research project during my stay in China. In 2009 the Chinese Academy of Sciences awarded me a Visiting Professorship for Senior International Scientists (Grant 2009G2-11), and both the lecture course and the textbook preparation (along with participation in research projects at the State Key Laboratory of Information Security of the Institute of Information Engineering CAS) were parts of my visiting professorship duties. I use the opportunity to express my deep gratitude to Prof. Dongdai Lin, the Head of SKLOIS, who was my host professor during my stay in China and became my close friend. I also would like to say words of thanks to all people who made my stay and work in China pleasant and fruitful: Prof. Yuefei Wang, Prof. Jia-Yan Yao, Prof. Aihua Fan and many other people from the Chinese Academy of Sciences and the numerous universities of China which I visited and where I gave talks. And of course my special thanks go to the people who took care of me during this time: Dr. Tao Shi, Dr. Baofeng Wu and Mrs. Wei Han. All these people have made my visiting professorship an unforgettable and exciting journey to China, letting me get acquainted with the culture, history and traditions of the great Chinese civilization.

Vladimir Anashin
November-December 2014
Moscow-Beijing

Chapter 0

Background

This preliminary chapter is just a reminder of some basic notions and results we use throughout the book; it is for further reference only.

0.1 Facts from number theory

In this section, we recall some important facts and useful formulas from number theory. We assume that the reader is familiar with residues modulo $N$ and their basic properties.

Chinese Remainder Theorem

Theorem 0.1 (Chinese Remainder Theorem). Let $N \in \mathbb{N}$ be a natural number, $N > 1$. Represent $N = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$, where $p_j$, $1 \le j \le r$, are pairwise distinct prime numbers, and $r$ is the number of different primes in the decomposition of $N$. Then, given arbitrary integers $a_j \in \{0, 1, \ldots, p_j^{e_j} - 1\}$, $1 \le j \le r$, there exists an integer $a \in \{0, 1, \ldots, N - 1\}$ such that $a \equiv a_j \pmod{p_j^{e_j}}$ for all $1 \le j \le r$.

Note that the proof of Theorem 0.1 is constructive; that is, it gives an algorithm to find this $a$ explicitly, see any relevant book on number theory.

Binomial coefficients

For $i \in \mathbb{N}_0$, $n \in \mathbb{N}$, the binomial coefficient is
\[ \binom{n}{i} = \frac{n(n-1)\cdots(n-i+1)}{i!}; \]
note that $\binom{n}{0} = 1$ by the definition. Note also that $\binom{n}{i} = 0$ for $i > n$.
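Returning to Theorem 0.1, the constructive procedure mentioned after it can be sketched in Python. This is only a minimal illustration, assuming the moduli are supplied as pairwise coprime integers (for instance, powers of distinct primes); the function name `crt` is hypothetical and is not taken from the text.

```python
from math import prod

def crt(residues, moduli):
    """Return a in {0, ..., N-1} with a ≡ residues[j] (mod moduli[j]) for all j.

    Assumes the moduli are pairwise coprime (e.g. powers of distinct primes).
    """
    N = prod(moduli)
    a = 0
    for a_j, m_j in zip(residues, moduli):
        M_j = N // m_j               # product of all the other moduli
        inv = pow(M_j, -1, m_j)      # inverse of M_j modulo m_j (Python 3.8+)
        a += a_j * M_j * inv         # ≡ a_j mod m_j and ≡ 0 mod the other moduli
    return a % N

# Example with N = 360 = 2^3 * 3^2 * 5
a = crt([5, 7, 3], [8, 9, 5])
print(a, a % 8, a % 9, a % 5)
```

Each summand is congruent to $a_j$ modulo its own modulus and to $0$ modulo all the others, which is why the sum satisfies all the congruences at once.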


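Theorem 0.2 (Lucas' Theorem). Let $p$ be a prime, and let $r = \sum_{i=0}^{N} r_i p^i$ and $n = \sum_{i=0}^{N} n_i p^i$ be base-$p$ expansions of $r, n \in \mathbb{N}_0$: $r_i, n_i \in \{0, 1, \ldots, p-1\}$ $(i = 0, 1, 2, \ldots)$. Then the following congruence for binomial coefficients holds:
\[ \binom{r}{n} \equiv \binom{r_0}{n_0}\binom{r_1}{n_1}\cdots\binom{r_N}{n_N} \pmod p. \]

Proof. See e.g. [3].

Corollary 0.3. Under the conditions of Theorem 0.2, let $\ell \ge 1$, $k \ge 1$, $n \le p^k - 1$. Then
\[ \binom{p^k\ell - 1}{n} \equiv (-1)^n \pmod p. \]

Proof of Corollary 0.3. Take $r = p^k\ell - 1$ in the statement of Theorem 0.2; then $r_i = p - 1$ for $i = 0, 1, \ldots, k-1$, and $n_i = 0$ for $i \ge k$. Now Theorem 0.2 implies that
\[ \binom{p^k\ell - 1}{n} \equiv \binom{p-1}{n_0}\binom{p-1}{n_1}\cdots\binom{p-1}{n_{k-1}} \equiv (-1)^{n_0 + n_1 + \cdots + n_{k-1}} \equiv (-1)^n \pmod p, \]
as obviously
\[ \binom{p-1}{j} = \frac{(p-1)\cdot(p-2)\cdots(p-j)}{j!} \equiv (-1)^j \pmod p \]
for all $j \in \{0, 1, \ldots, p-1\}$. The last congruence in the chain holds since $n \equiv n_0 + n_1 + \cdots + n_{k-1} \pmod 2$ when $p$ is odd, and since $-1 \equiv 1 \pmod 2$ when $p = 2$.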
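Lucas' Theorem and Corollary 0.3 are easy to test numerically. The following Python sketch is an illustration only; the helper names `digits` and `lucas` are assumptions introduced here, and the comparison against `math.comb` is a sanity check rather than a proof.

```python
from math import comb

def digits(n, p):
    """Base-p digits of a non-negative integer n, least significant first."""
    ds = []
    while n:
        n, d = divmod(n, p)
        ds.append(d)
    return ds

def lucas(r, n, p):
    """Binomial coefficient C(r, n) modulo a prime p, computed digit by digit."""
    rd, nd = digits(r, p), digits(n, p)
    if len(nd) > len(rd):
        return 0                              # n > r, so C(r, n) = 0
    result = 1
    for i, r_i in enumerate(rd):
        n_i = nd[i] if i < len(nd) else 0     # pad n with leading zero digits
        result = result * comb(r_i, n_i) % p  # comb(r_i, n_i) = 0 when n_i > r_i
    return result

# Compare with the directly computed binomial coefficient
print(lucas(100, 37, 7), comb(100, 37) % 7)

# Corollary 0.3 for p = 3, k = 2, l = 4: C(35, n) ≡ (-1)^n (mod 3) for n <= 8
print(all(comb(3**2 * 4 - 1, n) % 3 == (-1) ** n % 3 for n in range(3**2)))
```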

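Stirling numbers, falling powers

Recall that by the definition, the Stirling number of the first kind $\genfrac{[}{]}{0pt}{}{n}{i}$ is the number of ways to arrange $n$ elements into $i$ cycles, and the Stirling number of the second kind $\genfrac{\{}{\}}{0pt}{}{n}{i}$ is the number of ways to partition a set of $n$ elements into $i$ nonempty subsets, see e.g. [26, Section 6.1]. Note that by the definition
\[ \genfrac{[}{]}{0pt}{}{0}{0} = \genfrac{\{}{\}}{0pt}{}{0}{0} = 1. \]
In the sequel we will use Stirling numbers to express powers via falling powers and vice versa. Recall that by the definition the $n$-th falling power (also known as the falling factorial power) of $x$ is
\[ x^{\underline{n}} = n!\binom{x}{n} = x(x-1)\cdots(x-n+1), \qquad x^{\underline{0}} = 1. \]
The factorial powers and the (ordinary) powers are related to one another in the following way:
\[ x^n = \sum_{i=0}^{n} \genfrac{\{}{\}}{0pt}{}{n}{i}\, x^{\underline{i}}; \qquad x^{\underline{n}} = \sum_{i=0}^{n} (-1)^{n-i} \genfrac{[}{]}{0pt}{}{n}{i}\, x^{i}. \tag{1} \]
Here the sign $(-1)^{n-i}$ appears because the Stirling numbers of the first kind, as defined above via cycles, are non-negative.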
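The two relations between powers and falling powers can be checked on small values. The sketch below is a hedged illustration: the function names are hypothetical, and the Stirling numbers are computed by their standard recurrences, which are not stated in the text.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling1(n, i):
    """Unsigned Stirling number of the first kind: n elements into i cycles."""
    if n == i:
        return 1
    if i == 0 or i > n:
        return 0
    return (n - 1) * stirling1(n - 1, i) + stirling1(n - 1, i - 1)

@lru_cache(maxsize=None)
def stirling2(n, i):
    """Stirling number of the second kind: n elements into i nonempty subsets."""
    if n == i:
        return 1
    if i == 0 or i > n:
        return 0
    return i * stirling2(n - 1, i) + stirling2(n - 1, i - 1)

def falling(x, n):
    """The n-th falling power x(x-1)...(x-n+1); equals 1 for n = 0."""
    result = 1
    for j in range(n):
        result *= x - j
    return result

x, n = 7, 4
print(x ** n, sum(stirling2(n, i) * falling(x, i) for i in range(n + 1)))
print(falling(x, n),
      sum((-1) ** (n - i) * stirling1(n, i) * x ** i for i in range(n + 1)))
```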



Finite differences

Definition 0.4. A difference (with respect to the variable $x_i$) of a function $f(x_1, \ldots, x_n)$ is
\[ \Delta_i f(x_1, \ldots, x_n) = f(x_1, \ldots, x_{i-1}, x_i + 1, x_{i+1}, \ldots, x_n) - f(x_1, \ldots, x_n), \]
and the $s$-th difference (with respect to the variable $x_i$) of the function $f$ is
\[ \Delta_i^s f = \Delta_i^{s-1}(\Delta_i f) \qquad (s = 1, 2, \ldots), \]
where $\Delta_i^0 f = f$ by the definition. We write $\Delta f(x)$ rather than $\Delta_1 f(x)$ for a univariate function $f$. One verifies directly that
\[ \Delta\binom{n}{i} = \binom{n+1}{i} - \binom{n}{i} = \binom{n}{i-1}. \tag{2} \]

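Theorem 0.5 (Gregory-Newton formula). The following identity holds for all $n \in \mathbb{N}$ and all functions $g$:
\[ g(y + n) = \sum_{i=0}^{\infty} \binom{n}{i} \Delta^i g(y). \]

Theorem 0.6 (Binomial inversion formula).
\[ \alpha_m = \sum_{k=0}^{\infty} \binom{m}{k} \beta_k \]
if and only if
\[ \beta_k = \sum_{m=0}^{\infty} (-1)^{m+k} \binom{k}{m} \alpha_m. \]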
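Both theorems involve only finitely many non-zero terms, since $\binom{n}{i} = 0$ for $i > n$, so they can be tested directly. The following Python sketch is an illustration under that observation; the names `delta`, `delta_pow`, `newton` and `invert` are hypothetical.

```python
from math import comb

def delta(g):
    """Forward difference operator: (Δg)(y) = g(y + 1) - g(y)."""
    return lambda y: g(y + 1) - g(y)

def delta_pow(g, s):
    """The s-th difference Δ^s g (with Δ^0 g = g)."""
    for _ in range(s):
        g = delta(g)
    return g

def newton(g, y, n):
    """Right-hand side of the Gregory-Newton formula; terms with i > n vanish."""
    return sum(comb(n, i) * delta_pow(g, i)(y) for i in range(n + 1))

def invert(alpha):
    """From alpha_0..alpha_M compute beta_k = sum_m (-1)^(m+k) C(k, m) alpha_m."""
    return [sum((-1) ** (m + k) * comb(k, m) * alpha[m] for m in range(k + 1))
            for k in range(len(alpha))]

g = lambda y: y ** 3 - 2 * y + 5
print(newton(g, 2, 4), g(2 + 4))

beta = [3, -1, 4, 1, -5]
alpha = [sum(comb(m, k) * beta[k] for k in range(m + 1)) for m in range(len(beta))]
print(invert(alpha) == beta)
```

The repeated-difference helper recomputes values recursively, which is fine for small orders but would be wasteful for large ones.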

Möbius and Euler functions

Let us begin with the definition of the Möbius function.

Definition 0.7. Let $n \in \mathbb{N}$. Then we can write $n = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$, where $p_j$, $1 \le j \le r$, are prime numbers and $r$ is the number of different primes. The function $\mu$ on $\mathbb{N}$ defined by $\mu(1) = 1$, $\mu(n) = 0$ if some $e_j > 1$, and $\mu(n) = (-1)^r$ if $e_1 = \ldots = e_r = 1$, is called the Möbius function.

The Möbius function has the following property, see for example [28] or [13]:
\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1, & \text{if } n = 1, \\ 0, & \text{if } n > 1, \end{cases} \]
where $d$ runs over the positive divisors of $n$. This property is used for proving the following classical result.

Theorem 0.8 (Möbius inversion formula). Let $f$ and $g$ be functions defined for each $n \in \mathbb{N}$. Then
\[ f(n) = \sum_{d \mid n} g(d) \tag{3} \]
for all $n \in \mathbb{N}$ if and only if
\[ g(n) = \sum_{d \mid n} \mu(d) f(n/d) \tag{4} \]
for all $n \in \mathbb{N}$.
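The divisor-sum property of $\mu$ and the inversion formulas (3) and (4) can be illustrated by a short brute-force computation. This Python sketch uses trial division and is meant only for small arguments; the function names and the sample function `g` are assumptions made for the example.

```python
def mobius(n):
    """Möbius function of a positive integer n, by trial division."""
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0          # the square of a prime divides the argument
            result = -result      # one more distinct prime factor
        d += 1
    if n > 1:
        result = -result          # the remaining cofactor is a prime
    return result

def divisors(n):
    """All positive divisors of n (brute force)."""
    return [d for d in range(1, n + 1) if n % d == 0]

# Sum of mu(d) over d | n is 1 for n = 1 and 0 otherwise
print([sum(mobius(d) for d in divisors(n)) for n in range(1, 11)])

# Möbius inversion: recover g from f(n) = sum_{d | n} g(d)
g = lambda n: n * n + 1
f = lambda n: sum(g(d) for d in divisors(n))
print(all(g(n) == sum(mobius(d) * f(n // d) for d in divisors(n))
          for n in range(1, 50)))
```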

We recall the definition of Euler's totient function and Euler's Theorem.

Definition 0.9. Let $n$ be a positive integer. Henceforth, we will denote by $\varphi(n)$ the number of natural numbers not exceeding $n$ which are relatively prime to $n$. The function $\varphi$ is called Euler's totient function.

If $p$ is a prime number then $\varphi(p^l) = p^{l-1}(p - 1)$.

Theorem 0.10 (Euler's Theorem). If $a$ is an integer relatively prime to a natural number $b$ then $a^{\varphi(b)} \equiv 1 \pmod b$.

For later use we also recall that
\[ \varphi(n) = \sum_{d \mid n} \mu(d)\, \frac{n}{d}. \tag{5} \]

0.2 Facts from algebra

0.2.1 Rings

In this subsection we recall some notions and facts from ring theory, mainly following [45, 44, 14, 43].

A ring $R$ is a universal algebra with two operations, $+$ (addition) and $\cdot$ (multiplication), such that $R$ with respect to $+$ is a commutative group (which is denoted as $R^+$) with neutral element $0$, which is called zero, and inverse $-$ (that is, $-a$ is the additive inverse of $a \in R$: $a + (-a) = 0$); $R$ is a semigroup with respect to $\cdot$; and $(a + b) \cdot c = (a \cdot c) + (b \cdot c)$, $c \cdot (a + b) = (c \cdot a) + (c \cdot b)$ for all $a, b, c \in R$. We mainly consider commutative rings in this book; that is, $a \cdot b = b \cdot a$ for all $a, b \in R$. As usual, we omit the sign of multiplication $\cdot$ in expressions, and we omit parentheses according to the common rule: $a + bc = a + (b \cdot c)$. Whenever the ring $R$ has a unity, that is, a multiplicative neutral element, we denote it as $1$: $a \cdot 1 = 1 \cdot a = a$ for all $a \in R$. A ring having a unity is called a ring with unity. Further within this subsection 'ring' stands for 'commutative ring with unity'.

The additive order of $1$, that is, the smallest $n \in \mathbb{N}$ such that $n \cdot 1 = 0$, if such $n$ exists, is called the characteristic of $R$ and is denoted via $\operatorname{char}(R)$. A ring is said to be of zero characteristic if no such $n$ exists. If an element $a \in R$ has a multiplicative inverse, it is denoted via $a^{-1}$: $a \cdot a^{-1} = a^{-1} \cdot a = 1$. All invertible elements (those having multiplicative inverses) are called units. They form a group $R^*$ with respect to ring multiplication; this group is called the unit group, or the group of units, or the multiplicative (sub)group of the ring $R$. If $R^* = R \setminus \{0\}$, the ring $R$ is called a field. Note that in ring theory the term identity is often used rather than unity; we avoid using identity in this meaning in the book.

A non-zero element $a \in R$ is called a zero divisor whenever there exists an element $b \in R \setminus \{0\}$ such that $ab = 0$. A non-zero element $a \in R$ is called nilpotent whenever $a^n = 0$ for some $n \in \mathbb{N}$; the smallest such $n$ is called the nilpotency index of $a$. A ring $R$ without zero divisors is called an integral domain. Every integral domain can be embedded into a field; the smallest such field is called the quotient field of $R$ and is denoted as $Q(R)$. For instance, the ring $\mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$ of all rational integers is an integral domain; its quotient field is $\mathbb{Q}$, the field of all rational numbers.

An integer-valued function is a map $F \colon Q(R)^n \to Q(R)^m$ such that $F(R^n) \subset R^m$. We recall that any integer-valued polynomial $f$ over $\mathbb{Q}$ in a variable $x$ can be expressed as
\[ f(x) = \sum_{i=0}^{d} a_i \binom{x}{i}, \]
where $a_i \in \mathbb{Z}$, $i = 0, 1, \ldots, d$, and vice versa; see the substantial monograph [17] on various aspects of integer-valued polynomials. Integer-valued functions on the field of p-adic numbers $\mathbb{Q}_p$ are the maps we mostly focus on.

A module over a ring $R$ is a commutative group $M$ with respect to an operation $\oplus$, endowed with an 'external' operation of multiplication by elements of $R$: given $r, s \in R$ and $h, g \in M$, one defines the product $r \cdot h \in M$ so that $(rs) \cdot h = r \cdot (s \cdot h)$, $r \cdot (h \oplus g) = (r \cdot h) \oplus (r \cdot g)$, $(r + s) \cdot h = (r \cdot h) \oplus (s \cdot h)$ and $1 \cdot h = h$. Vector spaces over fields are important examples of modules; other important examples are ideals. A non-empty subset $I \subset R$ is called an ideal whenever $I$ is a subgroup with respect to ring addition $+$, and $ra \in I$ for all $r \in R$, $a \in I$. An ideal $I$ is called proper whenever $I \ne R$ and $I \ne \{0\}$. Ideals are kernels of ring homomorphisms, and vice versa. It is clear that given $a_1, \ldots, a_n \in R$, the set $a_1 \cdot R + \cdots + a_n \cdot R$, which is the set of all sums $a_1 r_1 + \cdots + a_n r_n$, $r_1, \ldots, r_n \in R$, is an ideal of $R$, the smallest ideal that contains $a_1, \ldots, a_n$. This ideal is called the ideal generated by the elements $a_1, \ldots, a_n$. An ideal that is generated by a single element is called principal. A ring all of whose ideals are principal is called a principal ideal ring. It is clear that factor rings of principal ideal rings are again principal ideal rings.

Theorem 0.11. The ring $R[x]$ of all polynomials in a variable $x$ over a field $R$ is a principal ideal ring.

In the sequel, we will also need a more special type of rings, rings of formal power series. Given a ring $R$ and a variable $x$, consider all formal expressions $\sum_{i=0}^{\infty} a_i x^i$, $a_i \in R$, $i = 0, 1, 2, \ldots$. We can define addition and multiplication of these sums by the common rules for infinite series; as every coefficient of a sum or product is then a finite expression in finitely many coefficients of the summands (respectively, factors), these operations are well defined. Thus we obtain the ring $R[[x]]$ of formal power series; its elements are called formal power series over $R$.

Now we recall some facts about finite rings.

Proposition 0.12. Every non-zero element of a finite ring is either a unit or a zero divisor.

We mainly deal with residue rings $\mathbb{Z}/N\mathbb{Z}$ modulo $N$. For these rings, we have the following restatement of the Chinese Remainder Theorem:

Theorem 0.13 (Chinese Remainder Theorem, equivalent form). Let $N$ be a natural number, $N > 1$. Represent $N = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$, where $p_j$, $1 \le j \le r$, are prime numbers, and $r$ is the number of different primes in the decomposition of $N$. Then the residue ring $\mathbb{Z}/N\mathbb{Z}$ is a direct sum of the residue rings $\mathbb{Z}/p_j^{e_j}\mathbb{Z}$, $1 \le j \le r$.

For residue rings $\mathbb{Z}/N\mathbb{Z}$ there exists a simple way to determine whether a given element is invertible or a zero divisor, cf. Proposition 0.12:

Proposition 0.14 (Invertibility modulo $N$). Let $N$ be a natural number, $N > 1$. Represent $N = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$, where $p_j$, $1 \le j \le r$, are prime numbers, and $r$ is the number of different primes in the decomposition of $N$. Then the element $a$ of the residue ring $\mathbb{Z}/N\mathbb{Z}$ is invertible if and only if $a \not\equiv 0 \pmod{p_j}$ for all $1 \le j \le r$.
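Propositions 0.12 and 0.14 can be illustrated for residue rings by brute force. The Python sketch below is an assumption-laden illustration (trial-division factoring, hypothetical function names) and is suitable only for small $N$.

```python
from math import gcd

def prime_factors(N):
    """Distinct primes dividing N, found by trial division."""
    ps, d = [], 2
    while d * d <= N:
        if N % d == 0:
            ps.append(d)
            while N % d == 0:
                N //= d
        d += 1
    if N > 1:
        ps.append(N)
    return ps

def is_unit(a, N):
    """Criterion of Proposition 0.14: no prime factor of N divides a."""
    return all(a % p != 0 for p in prime_factors(N))

def is_zero_divisor(a, N):
    """Brute force: a is non-zero mod N and a*b ≡ 0 (mod N) for some non-zero b."""
    return a % N != 0 and any(a * b % N == 0 for b in range(1, N))

N = 12
print([a for a in range(1, N) if is_unit(a, N)])          # units of Z/12Z
print([a for a in range(1, N) if is_zero_divisor(a, N)])  # zero divisors of Z/12Z
```

Every non-zero residue appears in exactly one of the two printed lists, in line with Proposition 0.12.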
With the use of these results in combination with the following Proposition 0.15, it is easy to determine the multiplicative subgroups of residue rings.

Proposition 0.15. Let $p$ be a prime, let $k \in \mathbb{N}$. The group $(\mathbb{Z}/p^k\mathbb{Z})^*$ of all invertible elements of the residue ring $\mathbb{Z}/p^k\mathbb{Z}$ is a cyclic group of order $(p - 1) \cdot p^{k-1}$ whenever $p$ is odd. If $p = 2$ and $k > 2$ then $(\mathbb{Z}/2^k\mathbb{Z})^*$ is a direct product of a group of order $2$ and a cyclic group of order $2^{k-2}$. The group $(\mathbb{Z}/4\mathbb{Z})^*$ is a cyclic group of order $2$; the group $(\mathbb{Z}/2\mathbb{Z})^*$ is trivial.

In the case when the multiplicative group $(\mathbb{Z}/p^k\mathbb{Z})^*$ is cyclic, the number $a \in \mathbb{Z}$ is said to be primitive modulo $p^k$ if the residue of $a$ modulo $p^k$ is a generator of $(\mathbb{Z}/p^k\mathbb{Z})^*$.
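For small moduli, Proposition 0.15 can be explored by computing multiplicative orders directly. The following Python sketch is a brute-force illustration; `mult_order` and `primitive_elements` are hypothetical names, and the search is exhaustive rather than efficient.

```python
from math import gcd

def mult_order(a, m):
    """Multiplicative order of a modulo m (a must be coprime to m)."""
    order, x = 1, a % m
    while x != 1:
        x = x * a % m
        order += 1
    return order

def primitive_elements(p, k):
    """Residues modulo p^k whose order equals the order of the unit group."""
    m = p ** k
    group_order = (p - 1) * p ** (k - 1)
    return [a for a in range(1, m)
            if gcd(a, m) == 1 and mult_order(a, m) == group_order]

print(primitive_elements(3, 2))   # generators of the cyclic group (Z/9Z)*
print(primitive_elements(2, 3))   # empty: (Z/8Z)* is not cyclic
print(max(mult_order(a, 16) for a in range(1, 16, 2)))  # largest order in (Z/16Z)*
```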


0.2.2 Finite Fields

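Finite fields have some special properties that we use throughout the book. The characteristic $\operatorname{char}(\mathbb{F})$ of a finite field $\mathbb{F}$ is a prime number $p$, and $\#\mathbb{F} = p^n$ for a suitable $n \in \mathbb{N}$. Given a prime $p$ and a positive rational integer $n$, there exists (up to a ring isomorphism) a unique field of order $p^n$. We denote this unique field of $p^n$ elements via $\mathbb{F}_{p^n}$. In particular, if $n = 1$, then $\mathbb{F}_p$ is isomorphic to the residue ring $\mathbb{Z}/p\mathbb{Z}$ modulo $p$. The multiplicative subgroup $\mathbb{F}_{p^n}^*$ is a cyclic group of order $p^n - 1$; generators of this group are called primitive elements of the field $\mathbb{F}_{p^n}$. Thus, there are exactly $\varphi(p^n - 1)$ different primitive elements in $\mathbb{F}_{p^n}$, where $\varphi$ is the Euler totient function.

Finite fields are polynomially complete: given a map $\varphi \colon \mathbb{F}_q \to \mathbb{F}_q$, where $q = p^n$, there exists a polynomial $f_\varphi(x) \in \mathbb{F}_q[x]$ such that $f_\varphi(z) = \varphi(z)$ for all $z \in \mathbb{F}_q$:
\[ f_\varphi(x) = \sum_{z \in \mathbb{F}_q} \varphi(z) \cdot \frac{x^q - x}{z - x}. \tag{6} \]
We note that $f_\varphi(x)$ is indeed a polynomial over $\mathbb{F}_q$ as $x^q - x = \prod_{z \in \mathbb{F}_q} (x - z)$. Formula (6) holds since, as a function of $x \in \mathbb{F}_q$,
\[ \frac{x^q - x}{z - x} = \begin{cases} 1, & \text{whenever } x = z; \\ 0, & \text{otherwise.} \end{cases} \]
Using this method, which goes back to Lagrange, we can construct an interpolation polynomial for an arbitrary $n$-variate mapping from $\mathbb{F}_q^n$ to $\mathbb{F}_q^m$, as, e.g.,
\[ \frac{x^q - x}{a - x} \cdot \frac{y^q - y}{b - y} = \begin{cases} 1, & \text{whenever } x = a \text{ and } y = b; \\ 0, & \text{otherwise,} \end{cases} \]
and so on. Moreover, we can interpolate simultaneously a mapping and its derivative, in the following way:

Proposition 0.16. Given two mappings $\varphi \colon \mathbb{F}_q \to \mathbb{F}_q$ and $\psi \colon \mathbb{F}_q \to \mathbb{F}_q$, there exists a polynomial $f_{\varphi,\psi}(x) \in \mathbb{F}_q[x]$ such that
• $f_{\varphi,\psi}$ induces on $\mathbb{F}_q$ the mapping $\varphi$: $f_{\varphi,\psi}(z) = \varphi(z)$ for all $z \in \mathbb{F}_q$,
• the derivative $f'_{\varphi,\psi}(x)$ induces on $\mathbb{F}_q$ the mapping $\psi$: $f'_{\varphi,\psi}(z) = \psi(z)$ for all $z \in \mathbb{F}_q$.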



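Proof. Given mappings $\varphi$ and $\psi$, construct interpolation polynomials $f_\varphi$ and $f_\psi$ according to formula (6). Then put
\[ f_{\varphi,\psi}(x) = f_\varphi(x) + (x^q - x) \cdot (f'_\varphi(x) - f_\psi(x)). \]
Note that $z^q - z = 0$ for all $z \in \mathbb{F}_q$, that $(x^q - x)' = qx^{q-1} - 1$ is identically $-1$ on $\mathbb{F}_q$, and that $f'_\varphi(x)$ is a polynomial over $\mathbb{F}_q$ (as $f_\varphi(x)$ is a polynomial over $\mathbb{F}_q$). Hence $f_{\varphi,\psi}(z) = f_\varphi(z) = \varphi(z)$ and $f'_{\varphi,\psi}(z) = f'_\varphi(z) - (f'_\varphi(z) - f_\psi(z)) = \psi(z)$ for all $z \in \mathbb{F}_q$.

Note 0.17. This Proposition can also be generalized to arbitrary mappings from $\mathbb{F}_q^n$ to $\mathbb{F}_q^m$ with the use of the interpolation formulas for $n$-variate mappings mentioned above, as well as to higher order derivatives.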
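Formula (6) and the construction in the proof of Proposition 0.16 can be tried out over a prime field. The Python sketch below assumes $q = p$ is prime, so that arithmetic modulo $p$ models $\mathbb{F}_p$; polynomials are stored as coefficient lists (constant term first), and all function names are hypothetical. It uses the identity $(x^p - x)/(z - x) = -\prod_{w \ne z}(x - w)$, which follows from $x^p - x = \prod_{w}(x - w)$.

```python
def add(f, g, p):
    """Sum of two polynomials over F_p."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [(a + b) % p for a, b in zip(f, g)]

def mul(f, g, p):
    """Product of two polynomials over F_p."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % p
    return h

def deriv(f, p):
    """Formal derivative over F_p."""
    return [(i * c) % p for i, c in enumerate(f)][1:] or [0]

def evaluate(f, x, p):
    return sum(c * pow(x, i, p) for i, c in enumerate(f)) % p

def interpolate(phi, p):
    """f_phi from formula (6); phi is the list of values phi(0), ..., phi(p-1)."""
    f = [0]
    for z in range(p):
        e = [p - 1]                            # the constant polynomial -1
        for w in range(p):
            if w != z:
                e = mul(e, [(-w) % p, 1], p)   # multiply by (x - w)
        f = add(f, mul([phi[z] % p], e, p), p)
    return f

def interpolate_with_derivative(phi, psi, p):
    """f_{phi,psi} = f_phi + (x^p - x) * (f_phi' - f_psi), as in the proof."""
    f_phi, f_psi = interpolate(phi, p), interpolate(psi, p)
    xp_minus_x = [0, p - 1] + [0] * (p - 2) + [1]
    correction = add(deriv(f_phi, p), [(-c) % p for c in f_psi], p)
    return add(f_phi, mul(xp_minus_x, correction, p), p)

p, phi, psi = 5, [3, 1, 4, 1, 0], [2, 2, 0, 4, 1]
f = interpolate_with_derivative(phi, psi, p)
print([evaluate(f, z, p) for z in range(p)])            # values of phi
print([evaluate(deriv(f, p), z, p) for z in range(p)])  # values of psi
```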

Part I

Basics of p-adic analysis


Chapter 1

Rings and fields of p-adic numbers

In this chapter, we introduce the basic notions of p-adic valuation, p-adic metric and p-adic integers.

1.1 p-adic valuation

Let $p$ be a fixed prime number. By the fundamental theorem of arithmetic, each non-zero integer $n$ can be written uniquely as
\[ n = p^{\operatorname{ord}_p n}\, \hat{n}, \]
where $\hat{n}$ is a non-zero integer, $p \nmid \hat{n}$, and $\operatorname{ord}_p n$ is a unique non-negative integer. The function $\operatorname{ord}_p \colon \mathbb{Z} \setminus \{0\} \to \mathbb{N}_0$ is called the p-adic valuation. In the sequel we will need the following classical result of Legendre (for the proof see, e.g., [48, Lemma 25.5], [2, Corollary 3.2.2], [36, Ch.1, Section 2, Exercise 14], [33]):

Lemma 1.1 (Valuation of factorial). Write a natural number $n$ in its base-$p$ expansion: $n = a_0 + a_1 p + \cdots + a_m p^m$. Denote by $\operatorname{wt}_p n = \sum_{k=0}^{m} a_k$ the p-adic weight of $n$. Then
\[ \operatorname{ord}_p n! = \frac{n - \operatorname{wt}_p n}{p - 1}. \]

Corollary 1.2 (Valuation of binomial coefficient). For all $i, k \in \mathbb{N}_0$,
\[ \operatorname{ord}_p \binom{i + k}{i} = \frac{1}{p - 1}\left(\operatorname{wt}_p i + \operatorname{wt}_p k - \operatorname{wt}_p (i + k)\right). \]

If $a, b \in \mathbb{Z}$, $a \ne 0$, $b \ne 0$, then we define the p-adic valuation of $x = a/b \in \mathbb{Q}$ as $\operatorname{ord}_p x = \operatorname{ord}_p a - \operatorname{ord}_p b$.
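The valuation, the p-adic weight and the formulas of Lemma 1.1 and Corollary 1.2 translate directly into code. The Python sketch below is a small illustration with hypothetical function names; it checks the formulas numerically and does not replace the proofs cited above.

```python
from fractions import Fraction
from math import comb, factorial

def ord_p(n, p):
    """p-adic valuation of a non-zero integer n."""
    if n == 0:
        raise ValueError("ord_p is defined here for non-zero integers only")
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k

def wt_p(n, p):
    """p-adic weight: the sum of the base-p digits of a non-negative integer n."""
    s = 0
    while n:
        n, d = divmod(n, p)
        s += d
    return s

def ord_p_rational(x, p):
    """p-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    return ord_p(x.numerator, p) - ord_p(x.denominator, p)

n, p = 100, 3
print(ord_p(factorial(n), p), (n - wt_p(n, p)) // (p - 1))   # Lemma 1.1
i, k = 20, 7
print(ord_p(comb(i + k, i), p),
      (wt_p(i, p) + wt_p(k, p) - wt_p(i + k, p)) // (p - 1))  # Corollary 1.2
print(ord_p_rational(Fraction(12, 50), 5))
```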


Exercise 1.3. Show that the p-adic valuation on $\mathbb{Q}$ is well defined; i.e., that $\operatorname{ord}_p x$ does not depend on the fractional representation of $x$.

By using the p-adic valuation we will define a new absolute value on the field $\mathbb{Q}$ of rational numbers.

1.1.1 Non-Archimedean absolute value and ultrametric

Let $K$ be a commutative ring with a unity $1$. An absolute value on $K$ is a function $|\cdot| \colon K \to \mathbb{R}$ such that
• $|x| \ge 0$ for all $x \in K$,
• $|x| = 0$ if and only if $x = 0$,
• $|xy| = |x||y|$ for all $x, y \in K$,
• $|x + y| \le |x| + |y|$ for all $x, y \in K$.
If $|\cdot|$ in addition satisfies the strong triangle inequality
\[ |x + y| \le \max(|x|, |y|) \]
for all $x, y \in K$ then we say that $|\cdot|$ is non-Archimedean. If $|x| = 1$ for all non-zero $x \in K$ we call $|\cdot|$ the trivial absolute value; this absolute value is non-Archimedean.

Note. Some authors also call an absolute value a norm.

Example 1.4. Let $K$ be a field and let $|\cdot|$ be a non-Archimedean absolute value on $K$. Let $x, y \in K$ be such that $|x| \ne |y|$. Then $|x + y| = \max(|x|, |y|)$.

Proof. Assume that $|x| > |y|$. By the strong triangle inequality we have
\[ |x| = |(x + y) - y| \le \max(|x + y|, |y|). \]
The assumption $|x| > |y|$ implies $\max(|x + y|, |y|) = |x + y|$. Thus $|x| \le |x + y|$. By the strong triangle inequality, $|x + y| \le \max(|x|, |y|) = |x|$. We conclude that $|x + y| = |x|$.

p-adic absolute value

The absolute value we mostly deal with is the p-adic absolute value:

Definition 1.5. The p-adic absolute value of $x \in \mathbb{Q} \setminus \{0\}$ is given by $|x|_p = p^{-\operatorname{ord}_p x}$, and $|0|_p = 0$.

Example 1.6. If $p = 2$ then $\operatorname{ord}_2 \frac{1}{2} = -1$ and $\left|\frac{1}{2}\right|_2 = 2$. Moreover $\operatorname{ord}_2 3 = 0$ and $|3|_2 = 1$. If $p = 3$ then $\operatorname{ord}_3 \frac{1}{2} = 0$, $\operatorname{ord}_3 3 = 1$, $\left|\frac{1}{2}\right|_3 = 1$ and $|3|_3 = \frac{1}{3}$.

With the use of the p-adic absolute value, the p-adic metric can be defined in the usual way.

p-adic metric

Let $X$ be a set. Recall that by definition a metric on $X$ is a map $\rho$ from $X \times X$ to $\mathbb{R}$ that satisfies the following properties:
• For all $x, y \in X$, $\rho(x, y) \ge 0$, and $\rho(x, y) = 0$ if and only if $x = y$.
• For all $x, y \in X$, $\rho(x, y) = \rho(y, x)$.
• For all $x, y, z \in X$, $\rho(x, z) \le \rho(x, y) + \rho(y, z)$ (the triangle inequality).
We say that the pair $(X, \rho)$ is a metric space. The p-adic absolute value is non-Archimedean. It induces the p-adic metric $\rho(x, y) = |x - y|_p$, which is non-Archimedean; that is, the metric $\rho$ satisfies the strong triangle inequality
\[ \rho(x, z) \le \max\{\rho(x, y), \rho(y, z)\}. \]

Exercise 1.7. Prove this!

Exercise 1.8. Show that in non-Archimedean metric spaces all triangles are isosceles.

Exercise 1.9. Show that non-Archimedean metric spaces violate the Archimedean Axiom, which reads: Appending a segment to itself a sufficient number of times results in a segment that is longer than the given one.

We note that a non-Archimedean metric is often also called an ultrametric, and the corresponding metric spaces are also called ultrametric spaces.
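The values in Example 1.6 and the strong triangle inequality can be checked experimentally on rational numbers. The Python sketch below uses exact `Fraction` arithmetic; the names `abs_p` and `dist_p` are hypothetical, and the numerical checks illustrate Exercises 1.7 and 1.8 without proving them.

```python
from fractions import Fraction
from itertools import product

def abs_p(x, p):
    """p-adic absolute value of a rational x, returned as an exact Fraction."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    k, num, den = 0, x.numerator, x.denominator
    while num % p == 0:          # factors of p in the numerator raise ord_p
        num //= p
        k += 1
    while den % p == 0:          # factors of p in the denominator lower ord_p
        den //= p
        k -= 1
    return Fraction(1, p) ** k   # |x|_p = p^(-ord_p x)

def dist_p(x, y, p):
    """p-adic distance between two rationals."""
    return abs_p(Fraction(x) - Fraction(y), p)

print(abs_p(Fraction(1, 2), 2), abs_p(3, 2), abs_p(Fraction(1, 2), 3), abs_p(3, 3))

# In every triangle the two largest side lengths coincide (isosceles triangles)
points = [Fraction(a, b) for a in range(-4, 5) for b in (1, 2, 3)]
print(all(sorted([dist_p(x, y, 3), dist_p(y, z, 3), dist_p(x, z, 3)])[1]
          == max(dist_p(x, y, 3), dist_p(y, z, 3), dist_p(x, z, 3))
          for x, y, z in product(points[:12], repeat=3)))
```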



Metrics on $\mathbb{Q}$

Two absolute values on a field (or on a ring) $K$ are said to be equivalent if they generate the same topology on $K$. Essentially there are only two types of non-trivial absolute values on $\mathbb{Q}$; this is the essence of the following theorem.

Theorem 1.10 (Ostrowski). Every non-trivial absolute value on $\mathbb{Q}$ is equivalent either to the real absolute value or to one of the p-adic absolute values.

For a proof of Ostrowski's theorem see, for example, [48] or [25].

1.1.2 Non-Archimedean rings and fields

Cauchy sequences and completions of $\mathbb{Q}$

Let $\rho$ be the metric induced by the p-adic absolute value on $\mathbb{Q}$; $(\mathbb{Q}, \rho)$ is then a metric space. However, this space is not complete: there exist Cauchy sequences that converge to no element of $\mathbb{Q}$.

Example 1.11. There is no rational number $x$ satisfying $x^2 = 7$. But since this equation has a solution modulo $3$ (namely, $x \equiv 1 \pmod 3$), it is possible to construct a sequence $(x_j)_{j \ge 0}$ of integers such that $x_{j+1} \equiv x_j \pmod{3^{j+1}}$ and $x_j^2 \equiv 7 \pmod{3^{j+1}}$. This sequence is a Cauchy sequence with respect to $|\cdot|_3$ because $|x_j - x_{j+1}|_3 \le 3^{-(j+1)} \to 0$ as $j \to \infty$ (see Theorem 1.12 below). The limit of this sequence, if it existed in $\mathbb{Q}$, would be a solution of $x^2 = 7$, since $|x_j^2 - 7|_3 \le 3^{-(j+1)} \to 0$ as $j \to \infty$. Thus the sequence has no limit in $\mathbb{Q}$. We have proved that $\mathbb{Q}$ endowed with the metric induced by the 3-adic absolute value is not complete.

Theorem 1.12 (p-adic Cauchy sequences). A sequence $(x_j)$ in $\mathbb{Q}$ is a Cauchy sequence with respect to the p-adic absolute value if and only if
\[ \lim_{j \to \infty} |x_{j+1} - x_j|_p = 0. \tag{1.1} \]

Proof. If $(x_j)$ is a Cauchy sequence then it is clear that $|x_{j+1} - x_j|_p \to 0$ as $j \to \infty$. Assume now that $(x_j)$ is a sequence that satisfies (1.1). Let $i > j$. Then there exists $k \in \mathbb{Z}_+$ such that $i = j + k$. By the strong triangle inequality, we have
\[ |x_i - x_j|_p \le \max(|x_{j+k} - x_{j+k-1}|_p, |x_{j+k-1} - x_{j+k-2}|_p, \ldots, |x_{j+1} - x_j|_p). \]
If $|x_{j+1} - x_j|_p \to 0$ as $j \to \infty$, it follows that $|x_i - x_j|_p \to 0$ as $i, j \to \infty$. Hence $(x_j)$ is a Cauchy sequence.
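The sequence of Example 1.11 can be produced explicitly. The Python sketch below lifts the solution $x \equiv 1 \pmod 3$ of $x^2 \equiv 7$ step by step; the digit-by-digit search is one simple way to do this (a Hensel-type lifting), chosen here for clarity and not taken from the text, and the function name is hypothetical.

```python
def sqrt7_approximations(count):
    """Integers x_0, x_1, ... with x_j^2 ≡ 7 (mod 3^(j+1)) and
    x_{j+1} ≡ x_j (mod 3^(j+1)), starting from x_0 = 1."""
    xs = [1]                                  # 1^2 = 1 ≡ 7 (mod 3)
    for j in range(count - 1):
        step = 3 ** (j + 1)
        modulus = 3 ** (j + 2)
        x = xs[-1]
        # Exactly one of x, x + step, x + 2*step squares to 7 modulo 3^(j+2),
        # because 2x is invertible modulo 3.
        xs.append(next(x + t * step for t in range(3)
                       if ((x + t * step) ** 2 - 7) % modulus == 0))
    return xs

xs = sqrt7_approximations(8)
print(xs)
print([(x * x - 7) % 3 ** (j + 1) for j, x in enumerate(xs)])  # all zeros
```

Consecutive terms differ by a multiple of $3^{j+1}$, so by Theorem 1.12 the sequence is Cauchy with respect to $|\cdot|_3$.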



In fact, we can generalize Example 1.11 to any metric space $(\mathbb{Q}, \rho)$, where $\rho$ is the metric induced by the p-adic absolute value, see [25]. Thus, the following is true:

Theorem 1.13. The metric space $(\mathbb{Q}, \rho)$, where $\rho$ is the metric induced by the p-adic absolute value, is not complete.

The completion of $\mathbb{Q}$ with respect to the p-adic metric is a field, the field of p-adic numbers $\mathbb{Q}_p$. The p-adic absolute value is extended to $\mathbb{Q}_p$, and $\mathbb{Q}$ is dense in $\mathbb{Q}_p$. It is worth noting that
\[ \{|x|_p : x \in \mathbb{Q}_p\} = \{|x|_p : x \in \mathbb{Q}\} = \{p^m : m \in \mathbb{Z}\} \cup \{0\}. \]
By Ostrowski's Theorem 1.10, the complete list of completions of $\mathbb{Q}$ with respect to non-trivial absolute values consists of $\mathbb{R}$ and $\mathbb{Q}_p$, $p$ a prime.

Exercise 1.14. In $\mathbb{Q}_p$, the series $\sum_{i=0}^{\infty} z_i$ converges if and only if $\lim_{i \to \infty} z_i = 0$. [Hint: Use Theorem 1.12 on p-adic Cauchy sequences.]

p-adic topology

Now we mention some topological properties of fields (and rings) of p-adic numbers. Recall that a topological space is called locally compact if every point has a compact neighborhood.

Exercise 1.15. Prove that the space $\mathbb{Q}_p$ is locally compact.

A ring $K$ endowed with a topology is said to be a topological ring if the operations of addition and multiplication are continuous.

Exercise 1.16. Prove that the field $\mathbb{Q}_p$ of p-adic numbers is a topological field.

Recall that the open (resp., closed) ball of radius $\varepsilon$ centered at the point $a \in M$ in a metric space $(M, \rho)$ is the set $B_\varepsilon^-(a) = \{z \in M : \rho(z, a) < \varepsilon\}$ (resp., $B_\varepsilon(a) = \{z \in M : \rho(z, a) \le \varepsilon\}$). As the non-zero values of the absolute value $|\cdot|_p$ are only $p^\ell$, $\ell \in \mathbb{Z}$, for p-adic balls (i.e., for balls in $\mathbb{Q}_p$) we see that $B_{p^\ell}^-(a) = B_{p^{\ell-1}}(a)$; thus, p-adic balls (of non-zero radii) are open and closed simultaneously! Sets that are open and closed simultaneously are called clopen; so $B_{p^\ell}(a)$ is a clopen ball of radius $p^\ell$ centered at $a \in \mathbb{Q}_p$. Balls are compact; the set of all balls (of non-zero radii) forms a base of the topology of a metric space. Thus, $\mathbb{Q}_p$ is a totally disconnected topological space.

Recall that a topological space $X$ is said to be totally disconnected if for each pair of distinct points $a, b \in X$ we can find open subsets $A, B$ of $X$ such that $a \in A$, $b \in B$, $A \cap B = \emptyset$ and $A \cup B = X$.



In the sequel, we are mostly interested in a subspace B1 (0) = Zp , which is a compact clopen totally disconnected metric subspace of Qp . In p-adic analysis, Zp plays a role that somewhat resembles a role of a closed real unit interval in real analysis. However. in a contrast to real interval, Zp is endowed with an algebraic structure: it is a ring. We summarize: The subspace B1 (0) = {z ∈ Qp : |z|p ≤ 1} is a ring; it is called the ring of p-adic integers and denoted via Zp . The ring Zp is a completion with respect to the p-adic absolute value of the ring Z; moreover, both N and −N are dense in Zp .

From the strong triangle inequality it follows that in a contrast to real numbers, p-adic numbers are “disordered structures”: Neither the field Qp nor the ring Zp can be totally ordered ; that is, there is no relation ≤ defined for all x, y ∈ Qp (respectively, for all x, y ∈ Zp ) such that for all a, b, c, the inequality a ≤ b implies inequalities a + c ≤ b + c and ac ≤ bc (if 0 ≤ c) or bc ≤ ac (if c ≤ 0). Exercise 1.17. Prove that every point of a p-adic ball is a center of that ball. A similar result holds for p-adic spheres as well. By the definition, the p-adic sphere of radius p−` (` ∈ N0 ) around a ∈ Zp is Sp−` (a) = {z ∈ Zp : |z − a|p = p−` } = Bp−` (a) \ Bp−`−1 (a).

It is clear that the sphere is a disjoint union of $p - 1$ balls of radius $p^{-\ell-1}$; so a p-adic sphere is a compact clopen set. Recall that a family of sets $\mathcal{F}$ is said to have the finite intersection property if every finite intersection of sets of $\mathcal{F}$ is nonempty. It is well known that a metric space is compact if and only if every family of closed sets with the finite intersection property has a nonempty intersection. This implies the following theorem:

Theorem 1.18. Given a nested sequence of non-empty closed subsets $M_0 \supset M_1 \supset \cdots \supset M_i \supset \cdots$ in $\mathbb{Z}_p$, the intersection $\bigcap_{i=0}^{\infty} M_i$ is not empty.

An ultrametric space is said to be spherically complete if each nested sequence of balls has a nonempty intersection, cf. [48, Definition 20.1]. The space $\mathbb{Q}_p$ is spherically complete; that is, Theorem 1.18 (which is equivalent to the compactness of $\mathbb{Z}_p$) remains true for nested sequences of balls from $\mathbb{Q}_p$. However, the theorem does not hold for nested sequences of arbitrary closed subsets of $\mathbb{Q}_p$, since $\mathbb{Q}_p$ is locally compact but not compact.

1.2 Canonical expansion of p-adic numbers

Residue field of $\mathbb{Q}_p$

The ball $B_1^-(0) = \{x \in \mathbb{Z}_p : |x|_p < 1\} = B_{p^{-1}}(0) = p\mathbb{Z}_p$ is a maximal ideal of $\mathbb{Z}_p$. The quotient ring $\mathbb{Z}_p / B_1^-(0) = \mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p$ is then a field, the finite field $\mathbb{F}_p$ of p elements; it is called the residue (class) field of $\mathbb{Q}_p$.

1.2.1 Canonical representation

Theorem 1.19. For each $x \in \mathbb{Z}_p$ there exists a sequence $(x_j)_{j \ge 0}$ such that
$$x_j \in \mathbb{Z}, \qquad 0 \le x_j \le p^{j+1} - 1, \qquad x_{j+1} \equiv x_j \pmod{p^{j+1}}$$
for all $j \ge 0$, and $|x - x_j|_p \le p^{-(j+1)}$.

Proof. Let $x \in \mathbb{Z}_p$. As $\mathbb{Q}$ is dense in $\mathbb{Q}_p$, for every $j$ we can find a rational number $a/b$ (in lowest terms) such that $|x - a/b|_p \le p^{-(j+1)}$. In fact, this number can be chosen to be an integer. Since $|a/b|_p \le \max(|x|_p, |a/b - x|_p) \le 1$, it is clear that $p \nmid b$, so $\gcd(p^{j+1}, b) = 1$. Therefore there exist integers $b'$ and $p'$ such that $p' p^{j+1} + b' b = 1$ or, equivalently, $b' b \equiv 1 \pmod{p^{j+1}}$. We then have $|a/b - ab'|_p = |a/b|_p \, |1 - b'b|_p \le p^{-(j+1)}$, and $|x - ab'|_p \le \max(|x - a/b|_p, |a/b - ab'|_p) \le p^{-(j+1)}$. There is a unique integer $x_j$ satisfying $0 \le x_j \le p^{j+1} - 1$ and $x_j \equiv ab' \pmod{p^{j+1}}$. It is clear that $|x_j - x|_p \le p^{-(j+1)}$. It remains to show that $x_{j+1} \equiv x_j \pmod{p^{j+1}}$. This follows from the fact that $|x_{j+1} - x_j|_p \le \max(|x_{j+1} - x|_p, |x - x_j|_p) \le \max(p^{-(j+2)}, p^{-(j+1)}) = p^{-(j+1)}$.

Corollary 1.20. The residue class field of $\mathbb{Q}_p$ is isomorphic to the finite field $\mathbb{F}_p$ of p elements.

Proof. It follows from Theorem 1.19 that the integers $\{0, 1, \ldots, p - 1\}$ form a complete set of representatives of the cosets of $B_1^-(0)$.

Every p-adic number has a unique representation as the sum of a special convergent p-adic series, which is called a canonical representation, or a p-adic expansion, or a base-p expansion.

Theorem 1.21 (Canonical representation). Given $x \in \mathbb{Z}_p$, there exists a unique sequence $\delta_0(x), \delta_1(x), \ldots \in \{0, 1, \ldots, p - 1\}$ such that
$$x = \delta_0(x) + \delta_1(x) \cdot p + \delta_2(x) \cdot p^2 + \cdots + \delta_j(x) \cdot p^j + \cdots.$$



Proof. By expanding the elements of the sequence $(x_j)$ from Theorem 1.19 in the base p we get
$$\begin{aligned}
x_0 &= y_0, & 0 \le y_0 \le p - 1,\\
x_1 &= y_0 + y_1 p, & 0 \le y_1 \le p - 1,\\
x_2 &= y_0 + y_1 p + y_2 p^2, & 0 \le y_2 \le p - 1,\\
&\;\;\vdots\\
x_j &= y_0 + y_1 p + \cdots + y_j p^j, & 0 \le y_j \le p - 1.
\end{aligned}$$
It is clear that the sum $\sum_{j \ge 0} y_j p^j$ converges.

Note 1.22. In the sequel, for $x \in \mathbb{Z}_p$ we use the notation $\delta_i(x) = y_i$, $i = 0, 1, 2, \ldots$. Thus $\delta_i(x) \in \{0, 1, \ldots, p - 1\}$ for all $i = 0, 1, 2, \ldots$. The function $\delta_i(x)$ is called the i-th coordinate function.

Corollary 1.23. Every $x \in \mathbb{Q}_p$ has a unique representation as
$$x = \sum_{j = \operatorname{ord}_p x}^{\infty} \delta_j(x) \cdot p^j, \tag{1.2}$$
where $\delta_j(x) \in \{0, 1, \ldots, p - 1\}$ for $j \ge \operatorname{ord}_p x$.

Proof. Let $x \in \mathbb{Q}_p$. If $x \in \mathbb{Z}_p$, then (1.2) holds by Theorem 1.21. Further, if $x \in \mathbb{Q}_p \setminus \mathbb{Z}_p$, then x can be written as $x = y \cdot p^{-m}$ for the positive integer $m = -\operatorname{ord}_p x$ and some $y \in \mathbb{Z}_p$. Indeed, put $y = p^{-\operatorname{ord}_p x} x$. Then
$$|p^{-\operatorname{ord}_p x} x|_p = p^{\operatorname{ord}_p x} \cdot p^{-\operatorname{ord}_p x} = 1.$$
Thus $y \in \mathbb{Z}_p$, so $x = p^{\operatorname{ord}_p x} y$. Now by Theorem 1.21 we obtain an expansion of y. If we then divide it by $p^m$, we get (1.2).

For each integer $m \ge 2$ we can expand a real number r with respect to the base m in the following way:
$$r = \sum_{i \le i_{\max}} r_i m^i, \tag{1.3}$$
for some integer $i_{\max}$. A real number r can have infinitely many negative powers in this expansion, whereas a p-adic number can have infinitely many positive powers in the expansion (1.2).

Exercise 1.24. Prove that for every prime p we have the following expansion:
$$-1 = (p - 1) + (p - 1)p + (p - 1)p^2 + \cdots$$

Examples. In $\mathbb{Q}_2$, the rational number 1/3 has the expansion
$$\frac{1}{3} = 1 + 1 \cdot 2 + 0 \cdot 2^2 + 1 \cdot 2^3 + 0 \cdot 2^4 + \cdots,$$
and 1/6 has the expansion
$$\frac{1}{6} = \frac{1}{2} - \frac{1}{3} = 1 \cdot 2^{-1} + 1 \cdot 1 + 0 \cdot 2 + 1 \cdot 2^2 + 0 \cdot 2^3 + \cdots$$
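The digits in these examples can be reproduced mechanically. The following is a minimal sketch (the function name `padic_digits` and the convention of returning 0 as the order of zero are our assumptions, not notation from the text): it strips the powers of p to find $\operatorname{ord}_p x$ and then peels off one canonical digit at a time.

```python
from fractions import Fraction


def padic_digits(x, p, n):
    """Return (v, digits) with x = sum(digits[j] * p**(v + j)) to n terms, v = ord_p x."""
    x = Fraction(x)
    if x == 0:
        return 0, [0] * n        # convention of this sketch for x = 0
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:          # strip powers of p from the numerator
        num //= p
        v += 1
    while den % p == 0:          # strip powers of p from the denominator
        den //= p
        v -= 1
    den_inv = pow(den, -1, p)    # den is invertible modulo p (Python 3.8+)
    digits = []
    for _ in range(n):
        d = (num * den_inv) % p      # lowest digit of the p-adic integer num/den
        digits.append(d)
        num = (num - d * den) // p   # exact: num - d*den is divisible by p
    return v, digits


print(padic_digits(Fraction(1, 3), 2, 5))   # (0, [1, 1, 0, 1, 0])
print(padic_digits(Fraction(1, 6), 2, 5))   # (-1, [1, 1, 0, 1, 0])
print(padic_digits(-1, 3, 4))               # (0, [2, 2, 2, 2])
```

The last call illustrates Exercise 1.24 for p = 3: every digit of −1 equals p − 1.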



Computer’s instructions In modern computers, a number of basic processor’s instructions may be regarded as 2-adic functions. In the sequel we use these functions as running examples; also, they will be a subject of our applied studies in Part III. Now we give formal definitions of these basic instructions, bitwise logical and machine: Let z = δ0 (z) + δ1 (z) ∙ 2 + δ2 (z) ∙ 22 + δ3 (z) ∙ 23 + ∙ ∙ ∙

be a 2-adic canonical expansion for z ∈ Z2 (that is, δj (z) ∈ {0, 1}); then,

• y XOR z is a bitwise addition modulo 2: δj (y XOR z) ≡ δj (y) + δj (z) (mod 2), for all j = 0, 1, 2 . . .;

• y AND z is a bitwise multiplication modulo 2: $\delta_j(y \text{ AND } z) \equiv \delta_j(y) \cdot \delta_j(z) \pmod 2$, for all j = 0, 1, 2, . . .;

• NOT, a bitwise logical negation: $\delta_j(\text{NOT}(z)) \equiv \delta_j(z) + 1 \pmod 2$, for all j = 0, 1, 2, . . .;

• y OR z is a bitwise logical ‘or’: $\delta_j(y \text{ OR } z) \equiv \delta_j(y) \text{ OR } \delta_j(z) \pmod 2$, for all j = 0, 1, 2, . . .;

• $\lfloor z/2 \rfloor$, the integral part of $z/2$, is a shift towards less significant bits;

• $2^k \cdot z$, a multiplication by the k-th power of 2, is a k-bit shift towards more significant bits;

• y AND z, where y is a constant, is also called a masking of z with the mask y;

• $z \bmod 2^k = z \text{ AND } (2^k - 1)$ is a reduction of z modulo $2^k$; a truncation of all high order bits starting with the k-th one (note that $2^k - 1 = \sum_{i=0}^{k-1} 2^i$).

Exercise 1.25 (Identities for computer’s instructions). Prove that for all $u, v \in \mathbb{Z}_2$ the following identities hold:
$$\begin{aligned}
\text{NOT } u &= u \text{ XOR } (-1);\\
u + (\text{NOT } u) &= -1;\\
u \text{ XOR } v &= u + v - 2 \cdot (u \text{ AND } v);\\
u \text{ OR } v &= u + v - (u \text{ AND } v);\\
u \text{ OR } v &= (u \text{ XOR } v) + (u \text{ AND } v).
\end{aligned} \tag{1.4}$$

Note. In the sequel, to avoid extra parentheses in complex expressions that contain both arithmetic and bitwise logical operations, we assume that multiplication and taking the inverse (division) precede bitwise logical operations, while the latter precede the arithmetic operations + and −; for instance, $a + b^2 \text{ AND } c$ stands for $a + ((b^2) \text{ AND } c)$.
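The identities (1.4) can be spot-checked on rational integers. The sketch below relies on the fact that Python's `~`, `&`, `|`, `^` act on integers as on infinite two's-complement strings, which for integers coincide with 2-adic canonical expansions; the helper name `check_identities` and the sample ranges are our assumptions. Note that Python's operator precedence differs from the convention of the Note above, so every bitwise operation is parenthesized explicitly. A passing check is only evidence, not a proof of Exercise 1.25.

```python
import random


def check_identities(u, v):
    """Return True if all five identities (1.4) hold for the integers u, v."""
    return (
        (~u) == (u ^ -1)                          # NOT u = u XOR (-1)
        and u + (~u) == -1                        # u + NOT u = -1
        and (u ^ v) == u + v - 2 * (u & v)        # XOR via AND
        and (u | v) == u + v - (u & v)            # OR via AND
        and (u | v) == (u ^ v) + (u & v)          # OR via XOR and AND
    )


random.seed(0)
samples = [(random.randint(-10**6, 10**6), random.randint(-10**6, 10**6))
           for _ in range(1000)]
all_ok = all(check_identities(u, v) for u, v in samples)
print(all_ok)
```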


1.2.2 Properties of p-adic integers

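Let $x = \delta_0(x) + \delta_1(x) \cdot p + \delta_2(x) \cdot p^2 + \cdots + \delta_j(x) \cdot p^j + \cdots$ be a p-adic integer in its canonical representation. The map
$$\bmod\, p^k : x = \sum_{i=0}^{\infty} \delta_i(x) \cdot p^i \mapsto x \bmod p^k = \sum_{i=0}^{k-1} \delta_i(x) \cdot p^i \in \mathbb{Z}/p^k\mathbb{Z} \tag{1.5}$$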

is a continuous ring epimorphism of the ring $\mathbb{Z}_p$ onto the ring $\mathbb{Z}/p^k\mathbb{Z}$ of residues modulo $p^k$; it is called the reduction map modulo $p^k$ (we will often omit the word “map”, however).

Exercise 1.26. Prove that $\bmod\, p^k$ is a continuous ring epimorphism.

The kernel of the epimorphism $\bmod\, p^k$ is the ball $p^k\mathbb{Z}_p = B_{p^{-k}}(0)$ of radius $p^{-k}$ around 0. The remaining $p^k - 1$ balls of radius $p^{-k}$ in $\mathbb{Z}_p$ are cosets with respect to this epimorphism; e.g., $B_{p^{-k}}(1) = 1 + p^k\mathbb{Z}_p$ is the coset of 1, i.e., the set of all p-adic integers that are congruent to 1 modulo $p^k$. This is a good illustration of why any point of a ball is a center of this ball.

p-adic units

Note 1.27. A p-adic integer $x \in \mathbb{Z}_p$ is invertible in $\mathbb{Z}_p$ (that is, has a multiplicative inverse $x^{-1} \in \mathbb{Z}_p$, $x^{-1} \cdot x = 1$) if and only if $\delta_0(x) \ne 0$; that is, if and only if x is invertible modulo p, meaning that $x \bmod p$ is invertible in $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$.

Exercise 1.28. Prove this!

An invertible p-adic integer is also called a unit (cf. the multiplicative neutral element, 1, which is called the unity).

Exercise 1.29. Prove that given $x \in \mathbb{Q}_p \setminus \{0\}$, there is a unique unit $\bar{x} \in \mathbb{Z}_p$ such that $x = p^{\operatorname{ord}_p x}\, \bar{x}$. (Hint: Use Theorem 1.21 and Corollary 1.23.)

Exercise 1.30. Prove the following useful formula:
$$(1 + pz)^{-1} = 1 - pz + p^2 z^2 - p^3 z^3 + \cdots + (-1)^j p^j z^j + \cdots$$

The set Z∗p of all units of Zp (which is a group with respect to multiplication, called a group of units, or a multiplicative subgroup of Zp ) is a p-adic sphere of radius 1 around 0: Z∗p = {z ∈ Zp : |z|p = 1} = Zp \ pZp = B1 (0) \ Bp−1 (0) = S1 (0). p-adic tree The ring Zp can be represented as a homogeneous tree with p branches leaving each vertex and one incoming branch. Here is the 2-adic tree that corresponds to Z2 :


[Figure 1.1: The 2-adic tree]
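The unit criterion of Note 1.27 and the formula of Exercise 1.30 can be tried out modulo $p^k$. The following is a small sketch under the assumption that we work with rational integers reduced modulo $p^k$; the names `is_unit`, `inverse_mod_pk` and `geometric_inverse` are ours. Truncating the series after k terms suffices, since $(1 + pz) \sum_{j<k} (-pz)^j = 1 - (-pz)^k \equiv 1 \pmod{p^k}$.

```python
def is_unit(x, p):
    """x is a unit of Z_p iff its lowest digit delta_0(x) = x mod p is non-zero."""
    return x % p != 0


def inverse_mod_pk(x, p, k):
    """Inverse of the unit x in Z/p^k Z (requires Python 3.8+ for pow with -1)."""
    if not is_unit(x, p):
        raise ValueError("x is divisible by p, hence not a unit")
    return pow(x, -1, p**k)


def geometric_inverse(z, p, k):
    """Partial sum 1 - pz + (pz)^2 - ... with k terms, reduced modulo p^k."""
    return sum((-p * z) ** j for j in range(k)) % p**k


# The truncated series agrees with the modular inverse of 1 + pz.
print(geometric_inverse(5, 3, 6) == inverse_mod_pk(1 + 3 * 5, 3, 6))   # True
```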


Chapter 2

p-adic Calculus

In this chapter, we develop the p-adic Calculus for functions defined on (and taking values in) the ring of p-adic integers $\mathbb{Z}_p$. That is, the domain of the functions under study is $\mathbb{Z}_p^n$, the n-th Cartesian power of $\mathbb{Z}_p$, and the functions take values in $\mathbb{Z}_p^m$, for some $n, m \in \mathbb{N} = \{1, 2, 3, \ldots\}$, and usually $m \le n$. We start with the case m = n = 1.

2.1 Univariate functions

For convenience, we re-state the basic notions of p-adic analysis via congruences rather than via absolute values. Let us make some conventions first. Given $s \in \mathbb{N}$ and $a, b \in \mathbb{Q}_p$, we write $a \equiv b \pmod{p^s}$ if and only if $|a - b|_p \le p^{-s}$ (or, which is the same, if and only if $a = b + cp^s$ for suitable $c \in \mathbb{Z}_p$). In other words, we use $a \equiv b \pmod{p^s}$ rather than $|a - b|_p \le p^{-s}$, meaning that both a and b lie in a ball of radius $p^{-s}$ of the space $\mathbb{Q}_p$. Therefore, we can restate the definition of the p-adic limit in the following manner:

Definition 2.1 (p-adic limit). A p-adic integer z is said to be a limit of the sequence $(z_i)_{i=0}^{\infty}$,
$$z = \lim^{p}_{i \to \infty} z_i,$$
if and only if for every (sufficiently large) positive rational integer K there exists N such that $z_i \equiv z \pmod{p^K}$ for all i > N.

2.1.1 Continuous functions

With the above re-statement of the p-adic limit we re-state the notion of continuous function:

Definition 2.2 (p-adic continuous function). A function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be continuous at the point $z \in \mathbb{Z}_p$ if and only if for every (sufficiently large) positive rational integer M there exists a positive rational integer L such that $f(x) \equiv f(z) \pmod{p^M}$ whenever $x \equiv z \pmod{p^L}$. The function f is said to be uniformly continuous on $\mathbb{Z}_p$ if and only if f is continuous at every point $z \in \mathbb{Z}_p$, and L depends only on M, and not on z.

Exercise 2.3 (Computer’s instructions are continuous). Let $c \in \mathbb{Z}_2$. Prove that all functions $x + c$, $x \cdot c$, $x \text{ XOR } c$, $x \text{ AND } c$, $x \text{ OR } c$ and $\text{NOT } x$ are uniformly continuous functions on $\mathbb{Z}_2$. (Hint: Prove that the functions satisfy the 2-adic Lipschitz condition with a constant 1.)

1-Lipschitz functions

The 1-Lipschitz p-adic functions $f : \mathbb{Z}_p \to \mathbb{Z}_p$, i.e., the ones that satisfy the p-adic Lipschitz condition with a constant 1,
$$|f(a) - f(b)|_p \le |a - b|_p,$$
are also called non-expansive, or compatible (in the sequel, we use all three of these terms). It is clear that these functions are uniformly continuous on $\mathbb{Z}_p$.

Exercise 2.4. Prove that the function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is 1-Lipschitz if and only if $\delta_i(f(x))$ does not depend on $\delta_j(x)$ for j > i, for all i = 0, 1, 2, . . ..

Exercise 2.4 shows that a function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is 1-Lipschitz if and only if its canonical representation has a special triangular-like form:
$$f : x = \chi_0 + \chi_1 \cdot p + \chi_2 \cdot p^2 + \cdots \mapsto \psi_0(\chi_0) + \psi_1(\chi_0, \chi_1) \cdot p + \psi_2(\chi_0, \chi_1, \chi_2) \cdot p^2 + \cdots,$$
where $\chi_i = \delta_i(x) \in \mathbb{F}_p$, $\psi_i = \delta_i(f) : \mathbb{F}_p^{i+1} \to \mathbb{F}_p$, i = 0, 1, 2, . . .. For this reason, in the case p = 2 the 1-Lipschitz functions are also known under the name of T-functions, T from ‘triangular’.

Exercise 2.5. Prove that the function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is 1-Lipschitz if and only if for all k = 1, 2, . . . the map $f \bmod p^k : a \mapsto f(a) \bmod p^k$, $a \in \{0, 1, \ldots, p^k - 1\}$, of the ring $\mathbb{Z}/p^k\mathbb{Z} = \{0, 1, \ldots, p^k - 1\}$ of residues modulo $p^k$ into itself is well defined; that is, does not depend on the choice of representatives in the cosets $a + p^k\mathbb{Z}_p$.
In the sequel, given a 1-Lipschitz function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ and $k \in \mathbb{N}$, we call the map $f \bmod p^k : \mathbb{Z}/p^k\mathbb{Z} \to \mathbb{Z}/p^k\mathbb{Z}$ the reduction of the function f modulo $p^k$, or, when it is clear what the modulus is, simply the reduced function.
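The criterion of Exercise 2.5 suggests a simple finite experiment for a function given on rational integers: shift each residue a by a few multiples of $p^k$ and see whether the value modulo $p^k$ changes. The sketch below is only such an experiment (it can refute compatibility, not prove it); the names `is_compatible_mod`, `reduce_function`, the sample functions and the number of tested lifts are our assumptions.

```python
def is_compatible_mod(f, p, k, lifts=4):
    """Check f(a + t*p^k) == f(a) (mod p^k) for all residues a and a few lifts t."""
    modulus = p**k
    return all((f(a + t * modulus) - f(a)) % modulus == 0
               for a in range(modulus)
               for t in range(1, lifts + 1))


def reduce_function(f, p, k):
    """Table of the reduced function f mod p^k on {0, 1, ..., p^k - 1}."""
    modulus = p**k
    return [f(a) % modulus for a in range(modulus)]


def t_function(x):
    return x + ((x * x) | 5)     # x + (x^2 OR 5), a T-function


def shift_right(x):
    return x >> 1                # floor(x / 2), not 1-Lipschitz


print(all(is_compatible_mod(t_function, 2, k) for k in range(1, 7)))   # True
print(is_compatible_mod(shift_right, 2, 3))                            # False
print(reduce_function(t_function, 2, 2))                               # [1, 2, 3, 0]
```

The shift towards less significant bits fails the test because the bit k of the argument moves into position k − 1 of the value, so the value modulo $2^k$ depends on a higher-order digit of the argument.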

2.1.2 Differentiability

The following definition is just a usual definition of a derivative of a function in the case when metric is the p-adic metric; thus, it is a direct analog of its classical counter-part from real analysis:



Definition 2.6 (p-adic differentiability). A function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be differentiable at the point $x \in \mathbb{Z}_p$ if there exists $f'(x) \in \mathbb{Q}_p$ (called the derivative of f at the point x) such that, given arbitrary (sufficiently large) $k \in \mathbb{N}$,
$$f(x + h) \equiv f(x) + f'(x) h \pmod{p^{\operatorname{ord}_p h + k}} \tag{2.1}$$
for all sufficiently small $h \in \mathbb{Z}_p$; in other words, there exists $N = N_k \in \mathbb{N}$ such that congruence (2.1) holds once $|h|_p \le p^{-N}$; that is, once $h \equiv 0 \pmod{p^N}$ (equivalently, once $\operatorname{ord}_p h \ge N$). The function f is called uniformly differentiable (on $\mathbb{Z}_p$) if it is differentiable at every point $x \in \mathbb{Z}_p$ and $N_k$ does not depend on x; that is, given sufficiently large $k \in \mathbb{N}$, there exists $N = N_k \in \mathbb{N}$ such that congruence (2.1) holds simultaneously for all $x \in \mathbb{Z}_p$ and all $h \in \mathbb{Z}_p$ once $\operatorname{ord}_p h \ge N$.

Rules of derivation in the p-adic case are the same as in the real case; that is, given (uniformly) differentiable functions $u, v : \mathbb{Z}_p \to \mathbb{Z}_p$, the functions u + v (sum), uv (product), and u(v) (composition) are (uniformly) differentiable, and

• $(u + v)' = u' + v'$ (derivative of a sum);

• $(uv)' = u'v + v'u$ (derivative of a product);

• $(u(v))' = u'(v)\,v'$ (chain rule).

The proof follows the same lines as in the real case.

Exercise 2.7. Complete the proof.

The derivative of a power also has the same form as in the real case:

Example 2.8. For $m \in \mathbb{N}$, the function $f(x) = x^m$ is uniformly differentiable on $\mathbb{Z}_p$ and $f'(x) = m x^{m-1}$. (Note that f is 1-Lipschitz.) Indeed, by Newton’s binomial formula,
$$(x + p^n z)^m = \sum_{i=0}^{m} \binom{m}{i} x^{m-i} p^{ni} z^i;$$
so $(x + p^n z)^m \equiv x^m + m x^{m-1} \cdot p^n z \pmod{p^{n+k}}$ for $n \ge k$. That is, $(x + h)^m \equiv x^m + m x^{m-1} \cdot h \pmod{p^{\operatorname{ord}_p h + k}}$ once $h \equiv 0 \pmod{p^k}$. In other words, we may take $N_k = k$ in Definition 2.6.

Also, in the p-adic case the derivative of a constant is 0, as in the real case; the peculiarity is that the converse is not true in the p-adic case:

Exercise 2.9. Prove that the function $\delta_0(x)$ is uniformly differentiable on $\mathbb{Z}_p$ and that $\delta_0'(x) = 0$ for all $x \in \mathbb{Z}_p$. (The function $\delta_0(x)$ is 1-Lipschitz as well.)
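Both the congruence of Example 2.8 and the vanishing derivative of $\delta_0$ can be observed numerically on rational integers. The following is a minimal sketch; the helper names `ord_p`, `power_congruence_holds` and `delta0`, and the chosen ranges, are our assumptions, and a finite check of this kind is of course no substitute for the proofs.

```python
def ord_p(h, p):
    """p-adic order of a non-zero integer h."""
    n = 0
    while h % p == 0:
        h //= p
        n += 1
    return n


def power_congruence_holds(x, h, m, p, k):
    """Check (x+h)^m == x^m + m*x^(m-1)*h  (mod p^(ord_p h + k))."""
    modulus = p ** (ord_p(h, p) + k)
    return ((x + h) ** m - x ** m - m * x ** (m - 1) * h) % modulus == 0


def delta0(x, p):
    """Lowest canonical digit of the integer x."""
    return x % p


p, k, m = 3, 2, 5
# h runs over non-zero multiples of p^k, i.e. ord_p h >= k = N_k.
ok_power = all(power_congruence_holds(x, t * p**k, m, p, k)
               for x in range(-20, 21) for t in range(1, 11))
# delta_0 is unchanged by any h divisible by p, so its derivative is 0.
ok_delta = all(delta0(x + t * p, p) == delta0(x, p)
               for x in range(-20, 21) for t in range(1, 11))
print(ok_power, ok_delta)   # True True
```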



The function $\delta_0$ serves as an example of the so-called pseudo-constants; the latter are non-constant functions that are differentiable everywhere on the domain and whose derivatives are identically zero. In real analysis, there are no pseudo-constants due to Rolle’s theorem which, in turn, is based on the fact that $\mathbb{R}$ is a totally ordered field, in contrast to $\mathbb{Q}_p$, which cannot be totally ordered.

Exercise 2.10. Let $c \in \mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$. Prove the following:

1. The function $x \text{ AND } c$ is uniformly differentiable on $\mathbb{Z}_2$; $(x \text{ AND } c)' = 0$ if $c \ge 0$, and $(x \text{ AND } c)' = 1$ if $c < 0$.

2. The function $f(x) = x \text{ XOR } c$ is uniformly differentiable on $\mathbb{Z}_2$; $f'(x) = 1$ if $c \ge 0$, and $f'(x) = -1$ if $c < 0$.

3. The function $x \text{ OR } c$ is uniformly differentiable on $\mathbb{Z}_2$; $(x \text{ OR } c)' = 1$ if $c \ge 0$, and $(x \text{ OR } c)' = 0$ if $c < 0$.

4. Given $n \in \mathbb{N}$, the function $x \bmod 2^n$ is uniformly differentiable on $\mathbb{Z}_2$; $(x \bmod 2^n)' = 0$.

5. The function $\text{NOT } x$ is uniformly differentiable on $\mathbb{Z}_2$; $(\text{NOT } x)' = -1$.

Exercise 2.11. Let p = 2. Prove that the function $f(x) = x \text{ XOR } (-\tfrac{1}{3})$ is differentiable at no point $x \in \mathbb{Z}_2$. (Hint: consider congruences modulo $p^{\operatorname{ord}_p h + 2}$.)
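The derivatives claimed in Exercise 2.10 can be sanity-checked on rational integers, using the fact that Python's bitwise operators treat negative integers as infinite two's-complement strings. This is a sketch only: the helper names `ord2` and `has_derivative`, the constant c = ±13, the threshold `n_min` (chosen so that h does not touch the bits in which c, or NOT c, is non-trivial) and the test ranges are all our assumptions.

```python
def ord2(h):
    """2-adic order of a non-zero integer h."""
    n = 0
    while h % 2 == 0:
        h //= 2
        n += 1
    return n


def has_derivative(f, d, n_min, k=8):
    """Check f(x+h) == f(x) + d*h (mod 2^(ord_2 h + k)) for sample x and h with ord_2 h >= n_min."""
    for x in range(-64, 64):
        for t in range(1, 9):
            h = t << n_min                      # h = t * 2^n_min
            modulus = 1 << (ord2(h) + k)
            if (f(x + h) - f(x) - d * h) % modulus != 0:
                return False
    return True


c = 13
n_min = max(c, ~c).bit_length()                # 4 bits cover the non-trivial part of c
checks = [
    has_derivative(lambda x: x & c, 0, n_min),
    has_derivative(lambda x: x & -c, 1, n_min),
    has_derivative(lambda x: x ^ c, 1, n_min),
    has_derivative(lambda x: x ^ -c, -1, n_min),
    has_derivative(lambda x: x | c, 1, n_min),
    has_derivative(lambda x: x | -c, 0, n_min),
    has_derivative(lambda x: ~x, -1, n_min),
]
print(all(checks))   # True
```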

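Given $x \in \mathbb{Z}_p$, the generalized multiplicative inverse $\operatorname{inv}_p(x)$ of x is defined as follows:
$$\operatorname{inv}_p(x) = p^{\operatorname{ord}_p x} \cdot \left(\frac{x}{p^{\operatorname{ord}_p x}}\right)^{-1}. \tag{2.2}$$

Exercise 2.12. Prove the following:

1. The function $\operatorname{inv}_p$ is well defined on $\mathbb{Z}_p$. (Hint: Use Note 1.27 and Exercise 1.29; prove that $\lim^{p}_{x \to 0} \operatorname{inv}_p(x) = 0$.)

2. The function $\operatorname{inv}_p$ is 1-Lipschitz.

3. The function $\operatorname{inv}_p$ is infinitely many times differentiable everywhere on $\mathbb{Z}_p \setminus \{0\}$ and the i-th derivative is
$$\operatorname{inv}_p^{(i)}(x) = \frac{(-1)^i\, i!}{p^{(i-1)\operatorname{ord}_p x}} \left(\frac{x}{p^{\operatorname{ord}_p x}}\right)^{-i-1} = \frac{(-1)^i\, i!}{p^{2i \operatorname{ord}_p x}} \left(\operatorname{inv}_p x\right)^{i+1}, \qquad i = 0, 1, 2, 3, \ldots \tag{2.3}$$

4. The function $\operatorname{inv}_p$ is not differentiable at 0.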
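Definition (2.2) is easy to evaluate modulo $p^k$ for rational integers, which also gives a quick experiment for the 1-Lipschitz property of item 2. The sketch below is an illustration under our own conventions: the function name `inv_p`, the choice to return residues modulo $p^k$, and the value 0 for arguments divisible by $p^k$ (consistent with the limit in item 1) are assumptions.

```python
def inv_p(x, p, k):
    """Generalized inverse p^ord(x) * (x / p^ord(x))^(-1), reduced modulo p^k."""
    modulus = p**k
    x %= modulus
    if x == 0:
        return 0                      # inv_p(x) has order >= k, so it is 0 modulo p^k
    v = 0
    while x % p == 0:                 # split x = p^v * u with u a unit
        x //= p
        v += 1
    return (p**v * pow(x, -1, modulus)) % modulus


print(inv_p(6, 2, 8))                 # 86, i.e. 2 * 3^(-1) modulo 256

# 1-Lipschitz experiment: a == b (mod p^j) should force inv_p(a) == inv_p(b) (mod p^j).
p, k = 3, 4
lipschitz_ok = all(inv_p(a, p, k) % p**j == inv_p(a % p**j, p, k) % p**j
                   for j in range(1, k + 1) for a in range(p**k))
print(lipschitz_ok)                   # True
```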



Hensel’s Lemma

The following result, the famous Hensel’s Lemma (proved by the German mathematician Kurt Hensel, who discovered p-adic numbers), may be regarded as the starting point of p-adic analysis at the turn of the 20th century.

Lemma 2.13 (Classical Hensel’s Lemma). If a polynomial $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots$ over $\mathbb{Z}$ has a root z modulo p (i.e., if $f(z) \equiv 0 \pmod p$) such that $f'(z) \not\equiv 0 \pmod p$, then f has a root $\hat{z}$ in $\mathbb{Z}_p$, and $z \equiv \hat{z} \pmod p$.

Proof. Put $z_1 = z$; we will show that given $n \ge 1$ and $z_n \in \mathbb{Z}_p$ such that $f(z_n) \equiv 0 \pmod{p^n}$ and $z_n \equiv z \pmod p$, there exists $z_{n+1} \in \mathbb{Z}_p$ such that
$$f(z_{n+1}) \equiv 0 \pmod{p^{n+1}}; \tag{2.4}$$
$$z_{n+1} \equiv z_n \pmod{p^n}. \tag{2.5}$$
Write $z_{n+1} = z_n + p^n t$; we will find $t \in \mathbb{Z}_p$ to satisfy these conditions. By Example 2.8, for $n \ge 1$ we have that $f(z_n + p^n t) \equiv f(z_n) + f'(z_n) \cdot p^n t \pmod{p^{n+1}}$; so we must find t to satisfy
$$0 \equiv p^n s + f'(z_n) \cdot p^n t \pmod{p^{n+1}} \tag{2.6}$$
where $p^n s \equiv f(z_n) \pmod{p^{n+1}}$. Recall that $f(z_n) \equiv 0 \pmod{p^n}$ implies that $f(z_n) = p^n s$ for a suitable $s \in \mathbb{Z}_p$. However, congruence (2.6) is equivalent to the congruence $0 \equiv s + f'(z_n) \cdot t \pmod p$, which has a solution with respect to the unknown t, as $f'(z_n) \equiv f'(z) \not\equiv 0 \pmod p$ (since $z_n \equiv z \pmod p$) and thus $f'(z_n)$ is a unit of $\mathbb{Z}_p$ by Note 1.27. In view of (2.5), the sequence $(z_n)$ is a p-adic Cauchy sequence by Theorem 1.12; so it has a limit, which is a p-adic integer $\hat{z}$, as $\mathbb{Z}_p$ is closed in the p-adic topology. However, $f(\hat{z}) = \lim^{p}_{n \to \infty} f(z_n) = 0$ by (2.4), as f is continuous.
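The proof is constructive: each step solves a linear congruence modulo p for the next digit t. The following sketch carries out this lifting for an integer polynomial given by its coefficient list $[a_0, a_1, \ldots]$; the names `poly_eval`, `poly_deriv` and `hensel_lift` are ours, and the routine returns an integer approximating $\hat{z}$ modulo $p^n$ rather than $\hat{z}$ itself.

```python
def poly_eval(coeffs, x):
    """Evaluate a0 + a1*x + a2*x^2 + ... by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def poly_deriv(coeffs):
    """Coefficients of the formal derivative."""
    return [i * a for i, a in enumerate(coeffs)][1:]


def hensel_lift(coeffs, z, p, n):
    """Lift a root z of f modulo p with f'(z) != 0 (mod p) to a root modulo p^n."""
    deriv = poly_deriv(coeffs)
    if poly_eval(coeffs, z) % p != 0:
        raise ValueError("z is not a root modulo p")
    if poly_eval(deriv, z) % p == 0:
        raise ValueError("f'(z) vanishes modulo p")
    root = z % p
    for j in range(1, n):
        modulus = p**j
        s = poly_eval(coeffs, root) // modulus    # f(root) = p^j * s exactly
        d = poly_eval(deriv, root) % p            # a unit modulo p
        t = (-s * pow(d, -1, p)) % p              # solves s + d*t == 0 (mod p)
        root += modulus * t                       # now f(root) == 0 (mod p^(j+1))
    return root


r = hensel_lift([-2, 0, 1], 3, 7, 3)              # a square root of 2 in Z_7, modulo 7^3
print(r, (r * r - 2) % 7**3)                      # 108 0
```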

Note. From the proof of Hensel’s lemma it immediately follows that the Lemma is also true for polynomials over $\mathbb{Z}_p$.

Proposition 2.14. If a 1-Lipschitz function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is differentiable at the point $x \in \mathbb{Z}_p$ then $f'(x) \in \mathbb{Z}_p$.

Proof. If $f'(x) = 0$ there is nothing to prove. Let $f'(x) \ne 0$; represent $f'(x) = p^m s$, where $s \in \mathbb{Z}_p$, $s \not\equiv 0 \pmod p$, $m = \operatorname{ord}_p f'(x)$ (cf. Exercise 1.29 and Note 1.27). By differentiability we have that $f(x + p^n) \equiv f(x) + p^m s \cdot p^n \pmod{p^{n+1}}$ for all sufficiently large n; so $f(x + p^n) - f(x) \equiv p^{m+n} s \pmod{p^{n+1}}$. On the other hand, $|f(x + p^n) - f(x)|_p \le p^{-n}$ as f is 1-Lipschitz; so $f(x + p^n) - f(x) = p^n a$ for suitable $a \in \mathbb{Z}_p$. Consequently, $p^n a \equiv p^{m+n} s \pmod{p^{n+1}}$; whence $a \equiv p^m s \pmod p$. The latter implies that $m \ge 0$ since $a, s \in \mathbb{Z}_p$ and $s \not\equiv 0 \pmod p$. Thus, $\operatorname{ord}_p f'(x) \ge 0$; so $f'(x) \in \mathbb{Z}_p$.

Exercise 2.15. Prove the following version of Hensel’s Lemma for 1-Lipschitz functions: If a 1-Lipschitz function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ has a zero $z \in \mathbb{Z}_p$ modulo p (i.e., if $f(z) \equiv 0 \pmod p$) such that f is differentiable at z and $f'(z) \not\equiv 0 \pmod p$, then f has a zero $\hat{z}$ in $\mathbb{Z}_p$ (i.e., $f(\hat{z}) = 0$) and $z \equiv \hat{z} \pmod p$.

Differentiability modulo $p^k$

Now we introduce a notion of differentiability modulo $p^k$ which is of high importance in the sequel; the notion has no direct analogs in classical real Calculus.

Definition 2.16 (Differentiability modulo $p^k$). Given $k \in \mathbb{N}$, a function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be differentiable modulo $p^k$ at the point $x \in \mathbb{Z}_p$ if there exists $f_k'(x) \in \mathbb{Q}_p$ (called the derivative modulo $p^k$ of the function f at the point x) such that
$$f(x + h) \equiv f(x) + f_k'(x) h \pmod{p^{\operatorname{ord}_p h + k}} \tag{2.7}$$
for all sufficiently small $h \in \mathbb{Z}_p$. The function f is called uniformly differentiable modulo $p^k$ (on $\mathbb{Z}_p$) if it is differentiable modulo $p^k$ at every point $x \in \mathbb{Z}_p$ and the bound on h does not depend on x; that is, there exists $N \in \mathbb{N}$ such that congruence (2.7) holds simultaneously for all $x \in \mathbb{Z}_p$ and all $h \in \mathbb{Z}_p$ once $\operatorname{ord}_p h \ge N$. The smallest N with this property is denoted via $N_k(f)$.

From Definitions 2.6 and 2.16 we easily conclude that if the function f is differentiable at x then it is differentiable modulo $p^k$ for all $k \in \mathbb{N}$, and that $f'(x) \equiv f_k'(x) \pmod{p^k}$. Speaking loosely, the differentiability of f means that $f(x + h) - f(x) \approx f'(x) h$ with arbitrarily high accuracy, whereas in the case of differentiability modulo $p^k$ we have $f(x + h) - f(x) \approx f_k'(x) h$ with an accuracy that is only not worse than $p^{-k}$.

Note 2.17. As a derivative modulo $p^k$ is defined up to a term that is 0 modulo $p^k$, for a 1-Lipschitz function f we may assume when convenient that $f_k'(x) \in \mathbb{Z}/p^k\mathbb{Z} = \{0, 1, \ldots, p^k - 1\}$; so given a 1-Lipschitz function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ that is differentiable modulo $p^k$ at every point $x \in \mathbb{Z}_p$, the derivative $f_k'$ maps $\mathbb{Z}_p$ into the residue ring $\mathbb{Z}/p^k\mathbb{Z}$.

Example 2.18. Let p = 2. The function $f(x) = x + (x^2 \text{ OR } 5)$ is uniformly differentiable on $\mathbb{Z}_2$ (whence, uniformly differentiable modulo $2^k$ for all $k \in \mathbb{N}$), $N_k(f) = k$, and $f'(x) = 1 + 2x$.



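Indeed, given $h = 2^{\ell} \bar{h}$, $\bar{h} \in \mathbb{Z}_2$, $|\bar{h}|_2 = 1$ (cf. Exercise 1.29), it is clear that $(x + 2^{\ell} \bar{h})^2 \text{ OR } 5 = (x^2 \text{ OR } 5) + 2^{\ell+1} \bar{h} x + 2^{2\ell} \bar{h}^2$ once $\ell \ge 2$ (as $\delta_i(5) = 0$ for $i \ge 3$); so $f(x + h) \equiv f(x) + (1 + 2x) h \pmod{2^{\operatorname{ord}_2 h + k}}$ once $\operatorname{ord}_2 h \ge k$.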

Exercise 2.19. Let p = 2. Prove that the function $f(x) = x + x^2 \text{ XOR } (-\tfrac{1}{3})$ is differentiable at no point $x \in \mathbb{Z}_2$. Prove that nevertheless f is uniformly differentiable modulo 2 (i.e., with k = 1) on $\mathbb{Z}_2$.

Exercise 2.20. Prove that if a 1-Lipschitz function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is differentiable modulo $p^k$ at the point $x \in \mathbb{Z}_p$ then $f_k'(x) \in \mathbb{Z}_p$.

Exercise 2.21. Show that the statement of Exercise 2.15 remains true for functions that are differentiable modulo $p^k$ (for some k) at the point z.

2.2 Multivariate functions

To deal with multivariate p-adic functions we first define the (ultra)metric on $\mathbb{Q}_p^n$ as follows: Given $a = (a_1, \ldots, a_n)$, $b = (b_1, \ldots, b_n) \in \mathbb{Q}_p^n$,
$$|a - b|_p = \max\{|a_i - b_i|_p : i = 1, 2, \ldots, n\}.$$

Exercise 2.22. Prove that this indeed is an ultrametric.

Note 2.23. As the metric space $\mathbb{Z}_p$ is compact, all metric spaces $\mathbb{Z}_p^n$, $n \in \mathbb{N}$, are also compact.

To work further with congruences rather than with absolute values, in a manner similar to the way we have dealt with univariate p-adic functions, we need to define congruences for vectors from $\mathbb{Q}_p^n$. Let $s \in \mathbb{N}$; let $a = (a_1, \ldots, a_n)$, $b = (b_1, \ldots, b_n) \in \mathbb{Q}_p^n$. We write $a \equiv b \pmod{p^s}$ if and only if $|a_i - b_i|_p \le p^{-s}$ for all $i = 1, 2, \ldots, n$; or, which is the same, if and only if $a_i = b_i + c_i p^s$ for suitable $c_i \in \mathbb{Z}_p$, $i = 1, 2, \ldots, n$. In other words, in what follows $a \equiv b \pmod{p^s}$ stands for $|a - b|_p \le p^{-s}$, meaning that both a and b lie in a ball of radius $p^{-s}$ of the space $\mathbb{Q}_p^n$. For instance, the function $F = (f_1, \ldots, f_m) : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is 1-Lipschitz (i.e., $|F(a) - F(b)|_p \le |a - b|_p$ for all $a, b \in \mathbb{Z}_p^n$) if and only if
$$F(a) \equiv F(b) \pmod{p^{\ell}} \quad \text{once } a \equiv b \pmod{p^{\ell}}. \tag{2.8}$$

Note 2.24. Note that for all $s \in \mathbb{N}$ the binary relation $\cdot \equiv \cdot \pmod{p^s}$ is a congruence of $\mathbb{Q}_p^n$ whenever $\mathbb{Q}_p^n$ is considered as a module over the ring $\mathbb{Z}_p$. In other words, we can work with the relation $\cdot \equiv \cdot \pmod{p^s}$ in the usual manner; e.g., multiply both sides by a p-adic integer, or add congruences sidewise. This implies in particular that the reduction map $\bmod\, p^s : \mathbb{Z}_p^n \to (\mathbb{Z}/p^s\mathbb{Z})^n$ is well defined, cf. (1.5):
$$\bmod\, p^s : a = (a_1, \ldots, a_n) \mapsto a \bmod p^s = (a_1 \bmod p^s, \ldots, a_n \bmod p^s). \tag{2.9}$$


2.2.1 Differentiability modulo $p^k$ for multivariate functions

Definition 2.25 (Differentiability modulo $p^k$). Given $k \in \mathbb{N}$, a function $F = (f_1, \ldots, f_m) : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is said to be differentiable modulo $p^k$ at the point $u = (u_1, \ldots, u_n) \in \mathbb{Z}_p^n$ if there exist a positive rational integer N and an $n \times m$ matrix $F_k'(u)$ over $\mathbb{Q}_p$ (called the Jacobi matrix modulo $p^k$ of the function F at the point u) such that for every positive rational integer $K \ge N$ and every $h = (h_1, \ldots, h_n) \in \mathbb{Z}_p^n$ the congruence
$$F(u + h) \equiv F(u) + h \cdot F_k'(u) \pmod{p^{k+K}} \tag{2.10}$$
holds whenever $|h|_p \le p^{-K}$. In the case m = 1 the Jacobi matrix modulo $p^k$ is called a differential modulo $p^k$. In the case m = n the determinant of the Jacobi matrix modulo $p^k$ is called a Jacobian modulo $p^k$. Entries of the Jacobi matrix modulo $p^k$ are called partial derivatives modulo $p^k$ of the function F at the point u. Similarly to univariate functions, the function F is called differentiable at the point u if it is differentiable modulo $p^k$ at u for all $k \in \mathbb{N}$. We denote a partial derivative (respectively, a differential) modulo $p^k$ via $\frac{\partial_k f_i(u)}{\partial_k x_j}$ (respectively, via $d_k F(u) = \sum_{i=1}^{n} \frac{\partial_k F(u)}{\partial_k x_i}\, d_k x_i$).

In cases when all partial derivatives modulo pk at all points of Znp are p-adic integers we say that the function F has integer-valued derivatives modulo pk . In these cases we can associate to each partial derivative modulo pk a unique element of the ring Z/pk Z; a Jacobi matrix modulo pk at every point u ∈ Znp can then be considered as a matrix over the ring Z/pk Z.

Exercise 2.26. Prove that if a 1-Lipschitz function $F = (f_1, \ldots, f_m) : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is differentiable modulo $p^k$ then all its partial derivatives modulo $p^k$ are integer-valued.

Integer-valued functions that have integer-valued derivatives are sometimes called twice integer-valued.

Definition 2.27 (Uniform differentiability modulo $p^k$). A function $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is said to be uniformly differentiable modulo $p^k$ on $\mathbb{Z}_p^n$ if and only if there exists $K \in \mathbb{N}$ such that the congruence (2.10) holds simultaneously for all $u \in \mathbb{Z}_p^n$ whenever $|h|_p \le p^{-K}$. The smallest of these K is denoted via $N_k(F)$.

Example 2.28. Let p = 2. The function $f(x, y) = x \text{ XOR } y : \mathbb{Z}_2^2 \to \mathbb{Z}_2$ is not uniformly differentiable on $\mathbb{Z}_2^2$ (as a bivariate function); however, f is uniformly differentiable modulo 2 on $\mathbb{Z}_2^2$, and its partial derivatives modulo 2 are 1 everywhere on $\mathbb{Z}_2^2$. The function $f(x) = x \text{ XOR } c$ is uniformly differentiable on $\mathbb{Z}_2$ for any $c \in \mathbb{Z}$, and $f'(x) = 1$ for $c \ge 0$, $f'(x) = -1$ for $c < 0$, cf. Exercise 2.10. Therefore f is not differentiable as a bivariate function, since a 2-adic integer can be simultaneously considered as a limit of a sequence of positive rational integers and as a limit of a sequence of negative rational integers, as both $\mathbb{N}$ and $-\mathbb{N}$ are dense in $\mathbb{Z}_p$. However, at the same time this implies that f is uniformly differentiable modulo 2 on $\mathbb{Z}_2^2$, as $-1 \equiv 1 \pmod 2$ (and $N_1(f) = 1$).

Proposition 2.29. Let the functions $G : \mathbb{Z}_p^s \to \mathbb{Z}_p^n$ and $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ be differentiable modulo $p^k$ at the points, respectively, $v = (v_1, \ldots, v_s)$ and $u = G(v)$, and let all partial derivatives modulo $p^k$ of the functions G and F at the points, respectively, v and u be p-adic integers. Then the composition $F \circ G : \mathbb{Z}_p^s \to \mathbb{Z}_p^m$ is differentiable modulo $p^k$ at the point v, all its partial derivatives modulo $p^k$ at this point are p-adic integers, and $(F \circ G)_k'(v) \equiv G_k'(v) F_k'(u) \pmod{p^k}$. In particular, if the functions $f, g : \mathbb{Z}_p \to \mathbb{Z}_p$ are differentiable modulo $p^k$ at the point $u \in \mathbb{Z}_p$, and if their derivatives modulo $p^k$ at this point are integer-valued, then
$$(f + g)_k'(u) \equiv f_k'(u) + g_k'(u) \pmod{p^k};$$

$$(f \cdot g)_k'(u) \equiv f_k'(u) g(u) + f(u) g_k'(u) \pmod{p^k}.$$
If, moreover, there exists an open ball $U \ni u$ such that $g(r) \not\equiv 0 \pmod p$ at every point $r \in U$, then the function $\frac{f}{g} : U \to \mathbb{Z}_p$ is differentiable modulo $p^k$ at the point u, has an integer-valued derivative modulo $p^k$ at this point, and
$$\left(\frac{f}{g}\right)_k'(u) \equiv \frac{f_k'(u) g(u) - f(u) g_k'(u)}{g(u)^2} \pmod{p^k}.$$
If additionally the functions F, G, f, g are uniformly differentiable modulo $p^k$, and if their derivatives modulo $p^k$ are integer-valued everywhere on $\mathbb{Z}_p$, then the same is true for the functions $F \circ G$, $f + g$, and $f \cdot g$. Finally, if $g(v) \not\equiv 0 \pmod p$ for all $v \in \mathbb{Z}_p$, then the function $\frac{f}{g}$ is integer-valued and uniformly differentiable modulo $p^k$ everywhere on $\mathbb{Z}_p$, and its derivative modulo $p^k$ is integer-valued at all points of $\mathbb{Z}_p$.

Sketch of the proof. A proof of the Proposition follows, with minor changes due to the non-Archimedean metric (and up to congruences modulo $p^n$ rather than equations), the one from classical Calculus. The argument is still valid since a congruence modulo $p^n$ is a congruence relation on the ring $\mathbb{Z}_p$: We can, for instance, multiply both sides of some congruence modulo $p^n$ by a p-adic unit (i.e., by a p-adic integer of norm 1) without affecting the validity of this congruence.

Example 2.30. The function
$$F(x, y) = (f(x, y), g(x, y)) = (x \text{ XOR } (2 \cdot (x \text{ AND } y)),\ (y + 3x^3) \text{ XOR } x)$$
is uniformly differentiable modulo 2 as a bivariate function, and $N_1(F) = 1$; namely
$$F(x + 2^n t, y + 2^m s) \equiv F(x, y) + (2^n t, 2^m s) \cdot \begin{pmatrix} 1 & x + 1 \\ 0 & 1 \end{pmatrix} \pmod{2^{k+1}}$$
for all $m, n \ge 1$ (here $k = \min\{m, n\}$). The matrix $\begin{pmatrix} 1 & x + 1 \\ 0 & 1 \end{pmatrix} = F_1'(x, y)$ is a Jacobi matrix modulo 2 of F (see Definition 2.25). Here is how we calculate partial derivatives modulo 2: For instance,
$$\frac{\partial_1 g(x, y)}{\partial_1 x} = \frac{\partial_1 (y + 3x^3)}{\partial_1 x} \cdot \left.\frac{\partial_1 (u \text{ XOR } x)}{\partial_1 u}\right|_{u = y + 3x^3} + \frac{\partial_1 x}{\partial_1 x} \cdot \left.\frac{\partial_1 (u \text{ XOR } x)}{\partial_1 x}\right|_{u = y + 3x^3} = 9x^2 \cdot 1 + 1 \cdot 1 \equiv x + 1 \pmod 2.$$
Note that a partial derivative modulo 2 of the function $2 \cdot (x \text{ AND } y)$ is always 0 modulo 2, due to the multiplier 2: The function $x \text{ AND } y$ is not differentiable modulo 2 as a bivariate function; however, the function $2 \cdot (x \text{ AND } y)$ is. So the Jacobian of the function F is $\det F_1' \equiv 1 \pmod 2$.
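The congruence defining the Jacobi matrix modulo 2 in Example 2.30 lends itself to a brute-force check on rational integers. The following sketch uses our own names (`F`, `jacobi_congruence_holds`) and arbitrarily chosen sample ranges; it multiplies the row vector $(2^n t, 2^m s)$ by the matrix with rows $(1, x + 1)$ and $(0, 1)$ and compares the result with the actual increment of F modulo $2^{k+1}$, $k = \min\{m, n\}$.

```python
from itertools import product


def F(x, y):
    f = x ^ (2 * (x & y))             # x XOR (2 * (x AND y))
    g = (y + 3 * x**3) ^ x            # (y + 3x^3) XOR x
    return f, g


def jacobi_congruence_holds(x, y, t, s, n, m):
    """Check F(x + 2^n t, y + 2^m s) == F(x, y) + (2^n t, 2^m s) * J  (mod 2^(k+1))."""
    k = min(n, m)
    modulus = 1 << (k + 1)
    hx, hy = t * 2**n, s * 2**m
    f0, g0 = F(x, y)
    f1, g1 = F(x + hx, y + hy)
    # Row vector (hx, hy) times J = [[1, x + 1], [0, 1]].
    pred_f = f0 + hx * 1 + hy * 0
    pred_g = g0 + hx * (x + 1) + hy * 1
    return (f1 - pred_f) % modulus == 0 and (g1 - pred_g) % modulus == 0


ok = all(jacobi_congruence_holds(x, y, t, s, n, m)
         for x, y in product(range(-8, 8), repeat=2)
         for t, s in product(range(-3, 4), repeat=2)
         for n, m in product(range(1, 5), repeat=2))
print(ok)   # True
```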

Note 2.31. The derivation rules modulo $p^k$ generally do not hold for functions whose derivatives modulo $p^k$ are not integer-valued. However, both a sum of functions that are (uniformly) differentiable modulo $p^k$ and a product of such a function by a p-adic integer are still (uniformly) differentiable modulo $p^k$, since a congruence modulo $p^n$ is a congruence relation on $\mathbb{Q}_p$ when $\mathbb{Q}_p$ is considered as a module over the ring $\mathbb{Z}_p$.

Exercise 2.32. Complete the proof of Proposition 2.29.

2.2.2 Properties of derivatives modulo $p^k$

Recall that, by definition, a locally constant function is a function such that, given a point in the domain, there exists a neighborhood of this point where the function is a constant. Therefore a function G with domain $\mathbb{Z}_p^n$ is said to be locally constant if given $x = (x_1, \ldots, x_n) \in \mathbb{Z}_p^n$, there exists a ball of non-zero radius around x where G is a constant. Following [42, 48], we call a function G with domain $\mathbb{Z}_p^n$ a step function if there exists $N \in \mathbb{N}_0$ such that $G(x) = G(y)$ once $|x - y|_p \le p^{-N}$; the smallest N with this property is called the order of the step function G.

Exercise 2.33. Prove that the function $\delta_k : \mathbb{Z}_p \to \{0, 1, \ldots, p - 1\} \subset \mathbb{Z}_p$ is a step function of order k + 1.

Exercise 2.34. Prove that the function $G : \mathbb{Z}_p \to \mathbb{Z}_p$ is a step function if and only if the sequence $(G(i))_{i=0}^{\infty}$ is periodic; and that the length of the shortest period of the sequence is then $p^N$, where N is the order of the step function G.

Note 2.35. In the book, by the period of length K > 0 of the sequence $(z(i))_{i=0}^{\infty}$ we understand a finite sequence $a(0), a(1), \ldots, a(K - 1)$ such that $z(i) = a(i \bmod K)$ for all $i \ge n_0 \ge 0$. Recall that a sequence having a period is called eventually periodic, whereas an eventually periodic sequence $(z(i))_{i=0}^{\infty}$ is called (purely) periodic if $n_0 = 0$.

Exercise 2.36. Prove that if the function $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is locally constant on $\mathbb{Z}_p^n$ then it is differentiable everywhere on $\mathbb{Z}_p^n$ and that all partial derivatives are identically zero.

Exercise 2.37. Prove that locally constant functions on $\mathbb{Z}_p^n$ are step functions, and vice versa; prove that a step function on $\mathbb{Z}_p^n$ takes only finitely many values. (Hint: Use that $\mathbb{Z}_p^n$ is compact.)

From these exercises we deduce that step functions are uniformly differentiable on $\mathbb{Z}_p^n$ and that their partial derivatives vanish everywhere on $\mathbb{Z}_p^n$. The following proposition therefore shows that if a function $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is uniformly differentiable modulo $p^k$ then it is ‘smooth modulo $p^k$’: all its partial derivatives modulo $p^k$ are uniformly differentiable on $\mathbb{Z}_p^n$ and the second derivatives modulo $p^k$ are identically 0.

Proposition 2.38. If the function $F = (f_1, \ldots, f_m) : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is uniformly differentiable modulo $p^k$, then each of its partial derivatives modulo $p^k$ is a step function of order not greater than $N_k(F)$ (cf. Definition 2.27).

Proof. The proof can obviously be restricted to the case m = n = 1. According to Definitions 2.25 and 2.27, if $|h|_p \le p^{-K}$ then for all $u \in \mathbb{Z}_p$ and $K \ge N_k(F)$ the following congruence holds:
$$F(u + h) - F(u) \equiv h\, \frac{\partial_k F(u)}{\partial_k x} \pmod{p^{k+K}}. \tag{2.11}$$

Taking $|h_1|_p \le |h|_p$ and substituting $u = u_1 + h_1$ into (2.11), represent
$$F(u + h) - F(u) = F(u_1 + h_1 + h) - F(u_1) - (F(u_1 + h_1) - F(u_1)). \tag{2.12}$$
Now applying (2.11) to (2.12) we obtain that
$$F(u + h) - F(u) \equiv (h_1 + h)\, \frac{\partial_k F(u_1)}{\partial_k x} - h_1\, \frac{\partial_k F(u_1)}{\partial_k x} \pmod{p^{k+K}},$$
and conclude that
$$F(u + h) - F(u) \equiv h\, \frac{\partial_k F(u_1)}{\partial_k x} \pmod{p^{k+K}} \tag{2.13}$$
since a congruence modulo $p^{k+K}$ is a congruence relation of the module $\mathbb{Q}_p$ over the ring $\mathbb{Z}_p$, see Note 2.31. Now comparing (2.11) and (2.13), and taking $h = p^K$, we obtain that
$$\frac{\partial_k F(u)}{\partial_k x} \equiv \frac{\partial_k F(u_1)}{\partial_k x} \pmod{p^k}$$
whenever $|u_1 - u|_p \le p^{-N_k(F)}$.


32

Note. Nowhere in the proof do we demand that the derivatives modulo $p^k$ be integer-valued!

Note 2.39. Proposition 2.38 implies that each partial derivative modulo $p^k$ can be considered as a function defined on the residue ring $\mathbb{Z}/p^{N_k(F)}\mathbb{Z}$ and valuated in $\mathbb{Z}/p^k\mathbb{Z}$. Moreover, if a continuation $\tilde F$ of the function $F = (f_1, \ldots, f_m) \colon \mathbb{N}_0^n \to \mathbb{N}_0^m$ to the space $\mathbb{Z}_p^n$ is uniformly differentiable modulo $p^k$ on $\mathbb{Z}_p^n$, then one can simultaneously continue the function $F$ together with all its (partial) derivatives modulo $p^k$ to the whole space $\mathbb{Z}_p^n$. Consequently, we may study, if necessary, the (partial) derivatives modulo $p^k$ of the function $\tilde F$ rather than those of $F$, and vice versa. For example, a partial derivative $\frac{\partial_k f_i(u)}{\partial_k x_j}$ modulo $p^k$ vanishes modulo $p^k$ at no point of $\mathbb{Z}_p^n$ (that is, $\frac{\partial_k f_i(u)}{\partial_k x_j} \not\equiv 0 \pmod{p^k}$ for all $u \in \mathbb{Z}_p^n$, or equivalently, $\left| \frac{\partial_k f_i(u)}{\partial_k x_j} \right|_p > p^{-k}$ everywhere on $\mathbb{Z}_p^n$) if and only if

\[ \frac{\partial_k f_i(u)}{\partial_k x_j} \not\equiv 0 \pmod{p^k} \]

for all $u \in \{0, 1, \ldots, p^{N_k(F)} - 1\}$.

Example 2.40. Let $p = 2$. The function

\[ F(x, y) = (f(x, y), g(x, y)) = \bigl(x \mathbin{\mathrm{XOR}} (2 \cdot (x \mathbin{\mathrm{AND}} y)),\ (y + 3x^3) \mathbin{\mathrm{XOR}} x\bigr) \]

is uniformly differentiable modulo 2 as a bivariate function, and $N_1(F) = 1$; namely

\[ F(x + 2^n t, y + 2^m s) \equiv F(x, y) + (2^n t, 2^m s) \cdot \begin{pmatrix} 1 & x + 1 \\ 0 & 1 \end{pmatrix} \pmod{2^{k+1}} \]

for all $m, n \ge 1$ (here $k = \min\{m, n\}$). The matrix $\begin{pmatrix} 1 & x + 1 \\ 0 & 1 \end{pmatrix} = F_1'(x, y)$ is a Jacobi matrix modulo 2 of $F$ (see Definition 2.25). Here is how we calculate partial derivatives modulo 2. For instance,

\[ \frac{\partial_1 g(x, y)}{\partial_1 x} = \frac{\partial_1 (y + 3x^3)}{\partial_1 x} \cdot \left. \frac{\partial_1 (u \mathbin{\mathrm{XOR}} x)}{\partial_1 u} \right|_{u = y + 3x^3} + \frac{\partial_1 x}{\partial_1 x} \cdot \left. \frac{\partial_1 (u \mathbin{\mathrm{XOR}} x)}{\partial_1 x} \right|_{u = y + 3x^3} = 9x^2 \cdot 1 + 1 \cdot 1 \equiv x + 1 \pmod 2. \]

Note that a partial derivative modulo 2 of the function $2 \cdot (x \mathbin{\mathrm{AND}} y)$ is always 0 modulo 2, due to the multiplier 2: the function $x \mathbin{\mathrm{AND}} y$ is not differentiable modulo 2 as a bivariate function; however, the function $2 \cdot (x \mathbin{\mathrm{AND}} y)$ is. So the Jacobian of the function $F$ is $\det F_1' \equiv 1 \pmod 2$.

Exercise 2.41. Let 1-Lipschitz functions $f_i \colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ be uniformly differentiable modulo $p$ on $\mathbb{Z}_p^n$, $i = 1, 2, \ldots, n$. Prove that if the system of equations $f_i(x_1, \ldots, x_n) = 0$, $i = 1, 2, \ldots, n$, in unknowns $x_1, \ldots, x_n$ has a solution $\bar u = (\bar u_1, \ldots, \bar u_n)$ modulo $p$ (i.e., $f_i(\bar u_1, \ldots, \bar u_n) \equiv 0 \pmod p$, $i = 1, 2, \ldots, n$) such that the Jacobian of the system does not vanish modulo $p$ at $\bar u$, then the system has a solution $u = (u_1, \ldots, u_n)$ in $\mathbb{Z}_p^n$ such that $u \equiv \bar u \pmod p$. (Hint: cf. Hensel's Lemma 2.13 and Exercises 2.15, 2.21, 2.26.)
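The congruence of Example 2.40 can be spot-checked by brute force on small non-negative integers. The following is only an illustrative sketch, not a proof: the function names (`F`, `jacobi_step`, `congruence_holds`, `check_example`) and the search bounds are assumptions made here, and Python's bitwise operators on non-negative integers stand in for XOR and AND on 2-adic integers.

```python
from itertools import product


def F(x, y):
    """The map F = (f, g) of Example 2.40, restricted to non-negative integers."""
    f = x ^ (2 * (x & y))
    g = (y + 3 * x**3) ^ x
    return f, g


def jacobi_step(x, h1, h2):
    """Row vector (h1, h2) multiplied by the matrix [[1, x + 1], [0, 1]]."""
    return h1, h1 * (x + 1) + h2


def congruence_holds(x, y, n, m, t, s):
    """Check F(x + 2^n t, y + 2^m s) = F(x, y) + (2^n t, 2^m s) J  (mod 2^(k+1))."""
    k = min(n, m)
    modulus = 2 ** (k + 1)
    h1, h2 = 2**n * t, 2**m * s
    f1, g1 = F(x + h1, y + h2)
    f0, g0 = F(x, y)
    d1, d2 = jacobi_step(x, h1, h2)
    return (f1 - f0 - d1) % modulus == 0 and (g1 - g0 - d2) % modulus == 0


def check_example(limit=16, max_exp=4):
    """Test the congruence for 0 <= x, y < limit, 1 <= n, m <= max_exp, 0 <= t, s < 4."""
    return all(
        congruence_holds(x, y, n, m, t, s)
        for x, y in product(range(limit), repeat=2)
        for n, m in product(range(1, max_exp + 1), repeat=2)
        for t, s in product(range(4), repeat=2)
    )
```

Calling `check_example()` exercises the congruence for every combination in the chosen ranges; a `False` result would point to a mistake in the Jacobi matrix.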

Chapter 3

p-adic series

3.1 Taylor series

The following definition is just a restatement of the corresponding definition from real analysis.

Definition 3.1 (Taylor series). Let the function $f \colon \mathbb{Z}_p \to \mathbb{Z}_p$ be infinitely many times differentiable at the point $a \in \mathbb{Z}_p$. The power series

\[ \sum_{i=0}^{\infty} \frac{f^{(i)}(a)}{i!} (x - a)^i \tag{3.1} \]

is called the Taylor series of the function $f$ at the point $a$.

As in real analysis, the central questions about Taylor series are where the series (3.1) converges and, if it converges for some $x$, whether it converges to $f(x)$. These questions lead to the notion of an analytic function.

3.1.1 Analytic functions on balls

Definition 3.2 (Analytic function). The function $f \colon \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be analytic on the ball $B_{p^{-r}}(a) \subset \mathbb{Z}_p$ if it can be represented by a power series that converges everywhere on $B_{p^{-r}}(a)$:

\[ f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i \]

for all $x \in B_{p^{-r}}(a)$ (here $a_i \in \mathbb{Q}_p$, $i = 0, 1, 2, \ldots$). If the function $f$ is analytic on the ball $B_{p^{-r}}(a)$, it can be represented by its Taylor series everywhere on the ball; that is, $a_i = \frac{f^{(i)}(a)}{i!}$, $i = 0, 1, 2, \ldots$, see e.g. [48, Theorem 40.2].

Example 3.3. The following functions are analytic on the respective balls:

• The p-adic exponential function

\[ \exp_p x = \sum_{j=0}^{\infty} \frac{x^j}{j!} \]

is analytic on $B_{p^{-1}}(0) = p\mathbb{Z}_p$ if $p \ge 3$ and on $B_{1/4}(0) = 4\mathbb{Z}_2$ if $p = 2$. Note that the 'p-adic e' does not exist since the series does not converge at $x = 1$.

• In the same way, i.e., by considering the corresponding power series, we can introduce p-adic trigonometric functions:

\[ \sin_p x = \sum_{j=0}^{\infty} \frac{(-1)^j x^{2j+1}}{(2j+1)!}, \qquad \cos_p x = \sum_{j=0}^{\infty} \frac{(-1)^j x^{2j}}{(2j)!}. \]

They are analytic on the same balls as the exponential function.

• The p-adic logarithm function $\ln_p x = \sum_{k=1}^{\infty} \frac{(-1)^{k+1} (x - 1)^k}{k}$ is analytic on $B_{p^{-1}}(1) = 1 + p\mathbb{Z}_p$.

3.1.2 Analytic functions on $\mathbb{Z}_p$

In the sequel we are mostly interested in functions that are analytic on the whole space $\mathbb{Z}_p$, which is considered as a ball of radius 1 centered at 0; these functions are represented by power series $\sum_{i=0}^{\infty} a_i x^i$.

Theorem 3.4. Given $a_0, a_1, \ldots \in \mathbb{Q}_p$, the power series $\sum_{i=0}^{\infty} a_i x^i$ converges everywhere on $\mathbb{Z}_p$ if and only if $\lim^{p}_{i \to \infty} a_i = 0$; under the latter condition the series defines a continuous function on $\mathbb{Z}_p$.

Exercise 3.5. Prove Theorem 3.4. (Hint: Consider $x = 1$, use Exercise 1.14.)

Exercise 3.6. Prove that even if the Taylor series of a function $f$ converges everywhere on $\mathbb{Z}_p$, it may not converge to $f$. (Hint: Consider a nonconstant pseudo-constant.)

Exercise 3.7. Given $z \in \mathbb{Z}_p$, prove that the function

\[ \ln_p(1 + pzx) = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{p^i z^i x^i}{i} \]

of variable $x$ is integer-valued, 1-Lipschitz, and analytic on $\mathbb{Z}_p$. (Hint: Use Theorem 3.4.)

Exercise 3.8. Prove that the function $\mathrm{inv}_p\, x$ is analytic on $B_{p^{-1}}(1) = 1 + p\mathbb{Z}_p$. (Hint: Change variable, use Exercise 1.30 and Theorem 3.4.)

Exercise 3.9. Prove the claims of Example 3.3. (Hint: Change variable, use Theorem 3.4.)


3.2 Mahler series

In this Section, we introduce the Mahler expansion, a useful technique which we will need in further chapters to study dynamics produced by compatible (i.e., 1-Lipschitz) mappings. Every function $f \colon \mathbb{N}_0 \to \mathbb{Z}_p$ (or, respectively, $f \colon \mathbb{N}_0 \to \mathbb{Z}$) has a unique Mahler expansion, that is, a unique representation via the so-called Mahler (interpolation) series

\[ f(x) = \sum_{i=0}^{\infty} a_i \binom{x}{i}, \tag{3.2} \]

where $a_i \in \mathbb{Z}_p$ (respectively, $a_i \in \mathbb{Z}$), $i = 0, 1, 2, \ldots$, and

\[ \binom{x}{i} = \frac{x(x - 1) \cdots (x - i + 1)}{i!} \quad \text{for } i = 1, 2, \ldots; \qquad \binom{x}{0} = 1 \]

by definition. Various properties of the function $f \colon \mathbb{Z}_p \to \mathbb{Z}_p$ can be expressed via properties of the coefficients of its Mahler expansion. We recall some basic facts about Mahler series, referring to [42] or [48] for their proofs.

If $f$ is uniformly continuous on $\mathbb{N}_0$ with respect to the p-adic metric, it can be uniquely extended to a uniformly continuous function on $\mathbb{Z}_p$. Hence the interpolation series for $f$ converges uniformly on $\mathbb{Z}_p$. The following is true: the series (3.2) converges uniformly on $\mathbb{Z}_p$ if and only if

\[ \lim^{p}_{i \to \infty} a_i = 0. \tag{3.3} \]

Hence a uniformly convergent series defines a uniformly continuous function on $\mathbb{Z}_p$. The function $f$ represented by the interpolation series (3.2) is (uniformly) differentiable everywhere on $\mathbb{Z}_p$ if and only if

\[ \lim^{p}_{i \to \infty} \frac{a_{i+n}}{i} = 0 \tag{3.4} \]

for all $n \in \mathbb{N}_0$; in this case the following formula for the derivative holds:

\[ f'(x) = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{\Delta^i f(x)}{i}. \tag{3.5} \]

The function $f$ is analytic on $\mathbb{Z}_p$ if and only if

\[ \lim^{p}_{i \to \infty} \frac{a_i}{i!} = 0. \tag{3.6} \]
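For a function given on $\mathbb{N}_0$, the Mahler coefficients can be computed from finite differences at 0, using the standard identity $a_i = \Delta^i f(0) = \sum_{j=0}^{i} (-1)^{i-j} \binom{i}{j} f(j)$. The sketch below illustrates this for integer-valued functions; the names `mahler_coefficients` and `mahler_evaluate` are hypothetical helpers introduced here, and only finitely many coefficients are computed.

```python
from math import comb


def mahler_coefficients(f, count):
    """Return a_0, ..., a_{count-1}, where a_i = (Delta^i f)(0)."""
    return [
        sum((-1) ** (i - j) * comb(i, j) * f(j) for j in range(i + 1))
        for i in range(count)
    ]


def mahler_evaluate(coeffs, x):
    """Evaluate the finite Mahler sum sum_i a_i * C(x, i) at a non-negative integer x.

    For x in N_0 the binomial C(x, i) vanishes when i > x, so the sum is exact
    as soon as len(coeffs) > x.
    """
    return sum(a * comb(x, i) for i, a in enumerate(coeffs))


# Example: x^3 = C(x, 1) + 6 C(x, 2) + 6 C(x, 3).
cube_coeffs = mahler_coefficients(lambda x: x**3, 6)
```

For a polynomial of degree $d$ all coefficients beyond $a_d$ vanish, so the finite sum reproduces the polynomial at every non-negative integer.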

To represent functions of several variables we use interpolation series of the following form:

\[ f(x_1, \ldots, x_n) = \sum_{(i_1, \ldots, i_n) \in \mathbb{N}_0^n} a_{i_1, \ldots, i_n} \binom{x_1}{i_1} \binom{x_2}{i_2} \cdots \binom{x_n}{i_n}; \tag{3.7} \]

here $a_{i_1, \ldots, i_n} \in \mathbb{Z}_p$. The following is an example of using Mahler series to determine whether a function is analytic:

Example 3.10. The following 1-Lipschitz function $f$ of variable $x$ is analytic on $\mathbb{Z}_p$: $f(x) = (1 + pz)^x$; $z \in \mathbb{Z}_p$ and $p$ odd. Indeed, $(1 + pz)^x = \sum_{i=0}^{\infty} p^i z^i \binom{x}{i}$ and

\[ \operatorname{ord}_p \frac{p^i z^i}{i!} = i - \frac{1}{p - 1}\,(i - \operatorname{wt}_p i) + \operatorname{ord}_p z^i \to \infty \quad \text{as } i \to \infty; \]

i.e., $\lim^{p}_{i \to \infty} \frac{p^i z^i}{i!} = 0$.

Exercise 3.11. Prove that in the case $p = 2$ the function $f(x) = (1 + pz)^x$ is analytic on $\mathbb{Z}_2$ if and only if $z \in B_{1/2}(0) = 2\mathbb{Z}_2$.
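The valuation formula used in Example 3.10 rests on $\operatorname{ord}_p i! = \frac{i - \operatorname{wt}_p i}{p - 1}$, where $\operatorname{wt}_p i$ is the sum of the base-$p$ digits of $i$. The sketch below checks that identity numerically and tabulates $\operatorname{ord}_p (p^i / i!)$ under the simplifying assumption that $z$ is a unit (so $\operatorname{ord}_p z^i = 0$); the helper names are hypothetical.

```python
from math import factorial


def ord_p(n, p):
    """p-adic valuation of a non-zero integer n."""
    n, k = abs(n), 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def wt_p(i, p):
    """Sum of the base-p digits of a non-negative integer i."""
    total = 0
    while i > 0:
        total += i % p
        i //= p
    return total


def ord_factorial(i, p):
    """ord_p(i!) computed as (i - wt_p(i)) / (p - 1)."""
    return (i - wt_p(i, p)) // (p - 1)


def ord_term(i, p):
    """ord_p(p^i / i!), i.e. the valuation of the i-th term when z is a unit."""
    return i - ord_factorial(i, p)
```

For odd $p$ the value `ord_term(i, p)` is at least $i/2$ and so grows without bound, whereas for $p = 2$ it equals $\operatorname{wt}_2 i$, which stays equal to 1 along $i = 2^k$; this is in line with the extra condition on $z$ in Exercise 3.11.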

Exercise 3.12. For p = 2, express the function δ0 (x) via van der Put series. (Hint: Use Theorems 0.6 and 0.2.)

Open question 3.13. Find analog of conditions (3.4) for uniform differentiability modulo pk on Zp .

3.2.1 Identities modulo $p^k$

This is an auxiliary subsection; we describe here a special class of functions which are, loosely speaking, sufficiently small with respect to the p-adic metric, but not too small.

Definition 3.14. A function $F \colon \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is called an identity modulo $p^k$ if for every $u \in \mathbb{Z}_p^n$ the following congruence holds:

\[ F(u) \equiv (0, \ldots, 0) \pmod{p^k}. \]

In other words, $F$ is an identity modulo $p^k$ if and only if $|F(u)|_p \le p^{-k}$ for all $u \in \mathbb{Z}_p^n$.

We need to characterize identities modulo $p^k$ in order to study the behavior of compatible functions modulo some $p^k$, since it is clear that two compatible functions coincide modulo $p^k$ if and only if their difference is an identity modulo $p^k$. The following easy proposition characterizes identities modulo $p^k$ in terms of Mahler expansions.

Proposition 3.15. A function $f \colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ is an identity modulo $p^k$ if and only if all coefficients of its Mahler expansion (3.7) are 0 modulo $p^k$:

\[ |a_{i_1, \ldots, i_n}|_p \le p^{-k} \quad \text{for all } (i_1, \ldots, i_n) \in \mathbb{N}_0^n. \]

Proof. Induction on $n$. Let $n = 1$. As $f$ is a continuous function, and as $\mathbb{N}_0$ is a dense subset of $\mathbb{Z}_p$, $f$ is an identity modulo $p^k$ if and only if

\[ \sum_{i=0}^{s} a_i \binom{s}{i} \equiv 0 \pmod{p^k} \tag{3.8} \]

for all $s = 0, 1, 2, \ldots$. However, the triangular system of congruences (3.8) has a unique solution

\[ 0 \equiv a_0 \equiv a_1 \equiv a_2 \equiv \cdots \pmod{p^k}; \tag{3.9} \]

hence for $n = 1$ the proposition is true. For $n > 1$, as

\[ f(x_1, \ldots, x_{n-1}, s) = \sum_{i=0}^{s} \binom{s}{i}\, g_i(x_1, \ldots, x_{n-1}) \]

for every $s \in \mathbb{N}_0$ (here $g_i$ collects the terms of (3.7) with $i_n = i$), by a similar argument we conclude that $f(x_1, \ldots, x_n)$ is an identity modulo $p^k$ if and only if $g_i(x_1, \ldots, x_{n-1}) \equiv 0 \pmod{p^k}$ for all $x_1, \ldots, x_{n-1} \in \mathbb{Z}_p$ and all $i = 0, 1, 2, \ldots$. By the induction hypothesis, in view of (3.7) the latter condition holds if and only if the congruences $a_{i_1, \ldots, i_{n-1}, i} \equiv 0 \pmod{p^k}$ hold for all $i = 0, 1, 2, \ldots$.
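Proposition 3.15 can be illustrated on integer polynomials, whose Mahler expansions are finite. The sketch below is an assumption-laden illustration rather than part of the proof: it uses the standard formula $a_i = \Delta^i f(0)$ for the coefficients, the helper names are hypothetical, and `count` must exceed the degree of the polynomial for the test to be conclusive.

```python
from math import comb


def mahler_coefficients(f, count):
    """Return a_0, ..., a_{count-1}, where a_i = (Delta^i f)(0)."""
    return [
        sum((-1) ** (i - j) * comb(i, j) * f(j) for j in range(i + 1))
        for i in range(count)
    ]


def is_identity_by_coefficients(f, p, k, count):
    """Criterion of Proposition 3.15 applied to the first `count` coefficients."""
    return all(a % p**k == 0 for a in mahler_coefficients(f, count))


def is_identity_by_values(f, p, k, samples):
    """Direct check of f(x) = 0 (mod p^k) on 0, 1, ..., samples - 1."""
    return all(f(x) % p**k == 0 for x in range(samples))
```

For instance, by Fermat's little theorem $x^p - x$ is an identity modulo $p$, while $x^2 - x = 2\binom{x}{2}$ is an identity modulo 2 but not modulo 3; both checks agree on these examples.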

3.3 Van der Put series

Now we recall the definition and some properties of van der Put series, see e.g. [42, 48] for details. Given a continuous function $f \colon \mathbb{Z}_p \to \mathbb{Z}_p$, there exists a unique sequence $B_0, B_1, B_2, \ldots$ of p-adic integers such that

\[ f(x) = \sum_{m=0}^{\infty} B_m \chi(m, x) \tag{3.10} \]

for all $x \in \mathbb{Z}_p$, where

\[ \chi(m, x) = \begin{cases} 1, & \text{if } |x - m|_p \le p^{-n}; \\ 0, & \text{otherwise,} \end{cases} \]

and $n = 1$ if $m = 0$; otherwise $n$ is uniquely defined by the inequality $p^{n-1} \le m \le p^n - 1$. The series on the right-hand side of (3.10) is called the van der Put series of the function $f$. Note that the sequence $B_0, B_1, \ldots, B_m, \ldots$ of van der Put coefficients of the function $f$ tends p-adically to 0 as $m \to \infty$, and the series converges uniformly on $\mathbb{Z}_p$. Vice versa, if a sequence $B_0, B_1, \ldots, B_m, \ldots$ of p-adic integers tends p-adically to 0 as $m \to \infty$, then the series on the right-hand side of (3.10) converges uniformly on $\mathbb{Z}_p$ and thus defines a continuous function $f \colon \mathbb{Z}_p \to \mathbb{Z}_p$.

The number $n$ in the definition of $\chi(m, x)$ has a very natural meaning; it is just the number of digits in a base-$p$ expansion of $m \in \mathbb{N}_0$:

\[ \lfloor \log_p m \rfloor = (\text{the number of digits in a base-}p\text{ expansion of } m) - 1; \tag{3.11} \]

therefore $n = \lfloor \log_p m \rfloor + 1$ for all $m \in \mathbb{N}_0$; for this reason we assume $\lfloor \log_p 0 \rfloor = 0$. Recall that $\lfloor \alpha \rfloor$ for a real $\alpha$ denotes the integral part of $\alpha$, that is, the nearest to $\alpha$ rational integer which does not exceed $\alpha$.

Note that the coefficients $B_m$ are related to the values of the function $f$ in the following way: let $m = m_0 + \cdots + m_{n-2} p^{n-2} + m_{n-1} p^{n-1}$ be a base-$p$ expansion of $m$, i.e., $m_j \in \{0, \ldots, p - 1\}$, $j = 0, 1, \ldots, n - 1$, and $m_{n-1} \ne 0$; then

\[ B_m = \begin{cases} f(m) - f(m - m_{n-1} p^{n-1}), & \text{if } m \ge p; \\ f(m), & \text{otherwise.} \end{cases} \tag{3.12} \]

It is worth noticing also that $\chi(m, x)$ is merely the characteristic function of the ball $B_{p^{-\lfloor \log_p m \rfloor - 1}}(m) = m + p^{\lfloor \log_p m \rfloor + 1} \mathbb{Z}_p$ of radius $p^{-\lfloor \log_p m \rfloor - 1}$ centered at $m \in \mathbb{N}_0$:

\[ \chi(m, x) = \begin{cases} 1, & \text{if } x \in B_{p^{-\lfloor \log_p m \rfloor - 1}}(m); \\ 0, & \text{otherwise} \end{cases} = \begin{cases} 1, & \text{if } x \equiv m \pmod{p^{\lfloor \log_p m \rfloor + 1}}; \\ 0, & \text{otherwise.} \end{cases} \tag{3.13} \]

Exercise 3.16. Represent a constant function via van der Put series.

Exercise 3.17. Represent the function $\delta_k(x)$ via van der Put series.
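Formulas (3.12) and (3.13) translate directly into a small computation on non-negative integers. The sketch below is illustrative only: the function names are hypothetical, the first case of (3.12) is read as $m \ge p$, and the series is evaluated only at $x \in \mathbb{N}_0$, where every term with $m > x$ has $\chi(m, x) = 0$, so the sum is finite.

```python
def leading_part(m, p):
    """Return m_{n-1} * p^(n-1), the top base-p digit term of an integer m >= 1."""
    power = 1
    while power * p <= m:
        power *= p
    return (m // power) * power


def van_der_put_coefficient(f, m, p):
    """B_m as in (3.12): f(m) for m < p, else f(m) - f(m - m_{n-1} p^(n-1))."""
    if m < p:
        return f(m)
    return f(m) - f(m - leading_part(m, p))


def chi(m, x, p):
    """chi(m, x) as in (3.13): 1 iff x = m (mod p^n), n = number of base-p digits of m."""
    modulus = p  # n = 1 when m = 0
    while modulus <= m:
        modulus *= p
    return 1 if (x - m) % modulus == 0 else 0


def van_der_put_sum(f, x, p):
    """Sum of B_m * chi(m, x) over 0 <= m <= x; all later terms vanish for x in N_0."""
    return sum(van_der_put_coefficient(f, m, p) * chi(m, x, p) for m in range(x + 1))
```

The sum telescopes along the truncations $x \bmod p, x \bmod p^2, \ldots$ of $x$, which is why it returns $f(x)$ for any function on $\mathbb{N}_0$.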

Chapter 4

Compatible functions

In this Chapter we study the class of compatible (1-Lipschitz, non-expansive) p-adic functions in detail, since further in the book we are mostly interested in dynamics and ergodic properties of the corresponding maps. In the book we deal mostly with integer-valued functions $F$ that map $\mathbb{Z}_p^n$ into $\mathbb{Z}_p^m$; that is, $F = (f_1, \ldots, f_m)$, where $f_i \colon \mathbb{Z}_p^n \to \mathbb{Z}_p$, $i = 1, 2, \ldots, m$, and the compatible functions are an important subclass of these.

4.1 Compatibility and local compatibility

From Section 2.2 we already know that an integer-valued function $F \colon \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is 1-Lipschitz if and only if

\[ |F(a) - F(b)|_p \le |a - b|_p \tag{4.1} \]

for all $a, b \in \mathbb{Z}_p^n$ (note that we use the same symbol for absolute values on $\mathbb{Z}_p^n$ and on $\mathbb{Z}_p^m$). Often 1-Lipschitz maps from $\mathbb{Z}_p^n$ into $\mathbb{Z}_p^n$ are also called non-expansive since the maps do not increase distances between points. The function $F$ is 1-Lipschitz if and only if it is compatible (cf. (2.8)):

\[ F(a) \equiv F(b) \pmod{p^{\ell}} \quad \text{once} \quad a \equiv b \pmod{p^{\ell}}. \]

Compatibility is an algebraic notion: a function defined on a Cartesian power of a universal algebra and valuated in a Cartesian power of the universal algebra is called compatible once the function agrees with all congruences of the algebra, cf. [40]. All congruences of the ring $\mathbb{Z}_p$ (and of $\mathbb{Z}_p^n$) are the ones that correspond to epimorphisms $\bmod\ p^k$ for a suitable $k \in \mathbb{N}$, cf. (1.5) and (2.9). Thus, in the book 1-Lipschitz maps and compatible maps are just synonyms.

Along with compatible functions we also consider locally compatible functions (or, locally 1-Lipschitz functions). The latter are the functions that satisfy (4.1) locally; that is, given $a \in \mathbb{Z}_p^n$, there exists an open neighbourhood $O_a$ of $a$ such that (4.1) holds for all $b \in O_a$. As $\mathbb{Z}_p^n$ is compact, the function $F \colon \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is locally compatible if and only if (4.1) holds for all $a, b \in \mathbb{Z}_p^n$ which are sufficiently close to one another, that is, there exists $r \in \mathbb{N}_0$ such that (4.1) is true once $|a - b|_p \le p^{-r}$.

Exercise 4.1. Prove the latter assertion.

4.1.1 Coordinate functions

Now we characterize multivariate compatible functions in terms of the so-called coordinate functions; the latter are functions $\delta_i(f(x_1, \ldots, x_n))$ defined on $\mathbb{Z}_p^n$ and valuated in $\{0, 1, \ldots, p - 1\}$: the $i$-th coordinate function is merely the value of the coefficient of the $i$-th term in the canonical p-adic expansion of $f(x_1, \ldots, x_n)$, see Note 1.22.

Proposition 4.2 (cf. Exercise 2.4). A function $f \colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ is compatible if and only if for every $i = 1, 2, \ldots$ the $i$-th coordinate function $\delta_i(f(x_1, \ldots, x_n))$ does not depend on $\delta_{i+k}(x_s)$, for all $s = 1, 2, \ldots, n$ and $k = 1, 2, \ldots$.

Proof. Let the function $\delta_i(f(x_1, \ldots, x_n))$ depend on $\delta_{i+k}(x_s)$ for some $i, s, k$; i.e., let there exist $(u_1, \ldots, u_n)$ and $(v_1, \ldots, v_n)$ in $\mathbb{Z}_p^n$ such that $u_j = v_j$ for $j = 1, 2, \ldots, n$, $j \ne s$, and $\delta_{i+k}(u_s) \ne \delta_{i+k}(v_s)$, $\delta_r(u_s) = \delta_r(v_s)$ for all $r = 0, 1, 2, \ldots$, $r \ne i + k$, and

\[ \delta_i(f(u_1, \ldots, u_n)) \ne \delta_i(f(v_1, \ldots, v_n)). \tag{4.2} \]

This means that $(u_1, \ldots, u_n) \equiv (v_1, \ldots, v_n) \pmod{p^{i+k}}$, i.e., in particular $(u_1, \ldots, u_n) \equiv (v_1, \ldots, v_n) \pmod{p^{i+1}}$, whereas in view of (4.2) $f(u_1, \ldots, u_n) \not\equiv f(v_1, \ldots, v_n) \pmod{p^{i+1}}$, which contradicts the compatibility of $f$.

Note 4.3. From the proof of Proposition 4.2 it immediately follows that a function $f \colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ is locally compatible if and only if there exists $N \in \mathbb{N}_0$ such that for every $i = N, N + 1, N + 2, \ldots$ the $i$-th coordinate function $\delta_i(f(x_1, \ldots, x_n))$ does not depend on $\delta_{i+k}(x_s)$, for all $s = 1, 2, \ldots, n$ and $k = 1, 2, \ldots$.

Proposition 4.2 demonstrates that a compatible function $F \colon \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is just a triangular function from a $p$-valued logic, and vice versa, every triangular function defines a compatible function $F \colon \mathbb{Z}_p^n \to \mathbb{Z}_p^m$. The following definition originates from the theory of discrete functions (functions of $p$-valued logic):

Definition 4.4. An $n$-variate triangular function is a mapping

\[ \Phi \colon (\alpha_0^{\downarrow}, \alpha_1^{\downarrow}, \alpha_2^{\downarrow}, \ldots) \mapsto (\Phi_0^{\downarrow}(\alpha_0^{\downarrow}),\ \Phi_1^{\downarrow}(\alpha_0^{\downarrow}, \alpha_1^{\downarrow}),\ \Phi_2^{\downarrow}(\alpha_0^{\downarrow}, \alpha_1^{\downarrow}, \alpha_2^{\downarrow}), \ldots), \]

where $\alpha_i^{\downarrow} \in \mathbb{F}_p^n$ is an $n$-dimensional column vector, $\mathbb{F}_p = \{0, 1, \ldots, p - 1\}$, and the mapping $\Phi_i^{\downarrow} \colon (\mathbb{F}_p^n)^{i+1} \to \mathbb{F}_p^m$ maps $n$-dimensional vectors $\alpha_0^{\downarrow}, \ldots, \alpha_i^{\downarrow}$ to an $m$-dimensional vector $\Phi_i^{\downarrow}(\alpha_0^{\downarrow}, \ldots, \alpha_i^{\downarrow}) \in \mathbb{F}_p^m$. Accordingly, a univariate triangular function $f$ is a mapping

\[ (\chi_0, \chi_1, \chi_2, \ldots) \overset{f}{\mapsto} (\psi_0(\chi_0);\ \psi_1(\chi_0, \chi_1);\ \psi_2(\chi_0, \chi_1, \chi_2);\ \ldots), \]

where $\chi_j \in \{0, 1, \ldots, p - 1\}$, and each $\psi_j(\chi_0, \ldots, \chi_j) \in \{0, 1, \ldots, p - 1\}$ is a function in variables $\chi_0, \ldots, \chi_j$ of a $p$-valued logic.

Triangular functions define p-adic functions in an obvious manner: e.g., a univariate triangular function $f$ sends a p-adic integer $\chi_0 + \chi_1 \cdot p + \chi_2 \cdot p^2 + \cdots$ to the p-adic integer $\psi_0(\chi_0) + \psi_1(\chi_0, \chi_1) \cdot p + \psi_2(\chi_0, \chi_1, \chi_2) \cdot p^2 + \cdots$. Seemingly the triangular functions originate from automata theory: actually, every automaton on $p$ symbols (with $n$ inputs and $m$ outputs) defines a triangular function $\Phi$, and vice versa, see Chapter 12 for details. Note that in automata theory triangular functions are also known under the name of determinate functions, as well as of automata functions. In cryptology, triangular functions are usually considered only for $p = 2$ and are called T-functions by some authors in this case, see Chapter 13.
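The compatibility condition ($f(a) \equiv f(b) \pmod{p^{\ell}}$ once $a \equiv b \pmod{p^{\ell}}$) can be tested exhaustively on an initial segment of $\mathbb{N}_0$. The sketch below is a finite sanity check under assumptions made here: `is_compatible` is a hypothetical helper, it only inspects arguments below $p^{\text{bits}}$, and a positive answer is evidence rather than a proof.

```python
def is_compatible(f, p=2, bits=5):
    """Test f(a) = f(b) (mod p^l) for all 0 <= a, b < p^bits with a = b (mod p^l)."""
    top = p**bits
    for l in range(1, bits + 1):
        modulus = p**l
        for a in range(top):
            # Run over all b in [0, top) that are congruent to a modulo p^l.
            for b in range(a % modulus, top, modulus):
                if (f(a) - f(b)) % modulus != 0:
                    return False
    return True


def t_function(x):
    """A univariate map built from g of Example 2.40 with y = x (an illustrative choice)."""
    return (x + 3 * x**3) ^ x


def right_shift(x):
    """The map x -> floor(x / 2): its i-th output bit depends on input bit i + 1."""
    return x >> 1
```

The first map passes the test for $p = 2$, as expected of a T-function, while the right shift fails already for $\ell = 1$ (compare the arguments 0 and 2).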

4.1.2 Differences of compatible functions

In further study we need the following characterisation of compatible p-adic functions.

Proposition 4.5. A continuous function $f \colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ is compatible if and only if every function $\frac{1}{i} \Delta_j^i f$ (where $j = 1, 2, \ldots, n$; $i = 1, 2, \ldots$) is integer-valued on $\mathbb{Z}_p^n$ (i.e., all its values are p-adic integers).

Proof. In view of (4.1) we conclude that $f \colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ is compatible if and only if

\[ |f(x_1, \ldots, x_{i-1}, x_i + h, x_{i+1}, \ldots, x_n) - f(x_1, \ldots, x_n)|_p \le |h|_p \tag{4.3} \]

for all $x_1, \ldots, x_n, h \in \mathbb{Z}_p$ and all $i = 1, 2, \ldots, n$; or, equivalently, if and only if the p-adic number

\[ \alpha_h = \frac{1}{h} \bigl( f(x_1, \ldots, x_{i-1}, x_i + h, x_{i+1}, \ldots, x_n) - f(x_1, \ldots, x_n) \bigr) \tag{4.4} \]

is a p-adic integer for all $h \in \mathbb{Z}_p \setminus \{0\}$ and all $x_1, \ldots, x_n \in \mathbb{Z}_p$. As $f(x_1, \ldots, x_n)$ is continuous, (4.3) holds for all $h \in \mathbb{Z}_p$ if and only if it holds for all $h \in \mathbb{N}$, since $\mathbb{N}$ is a dense subset of $\mathbb{Z}_p$. Thus, a continuous function $f$ is compatible if and only if $\alpha_h$ is a p-adic integer for each positive rational integer $h$. Now applying the Gregory-Newton formula (Theorem 0.5), we conclude that for a positive rational integer $h$ the p-adic number $\alpha_h$ can be expressed as

\[ \alpha_h = \frac{1}{h} \sum_{j=1}^{h} \binom{h}{j} \Delta_i^j f(x_1, \ldots, x_n) = \sum_{j=1}^{h} \binom{h - 1}{j - 1} \frac{1}{j} \Delta_i^j f(x_1, \ldots, x_n). \]

Thus, the function $f$ is compatible if and only if each p-adic number

\[ \alpha_m = \sum_{k=0}^{m-1} \binom{m - 1}{k} \frac{1}{k + 1} \Delta_i^{k+1} f(x_1, \ldots, x_n) \tag{4.5} \]

is a p-adic integer for $m = 1, 2, 3, \ldots$. Now applying the combinatorial relations of Theorem 0.6 we express $\frac{1}{k+1} \Delta_i^{k+1} f(x_1, \ldots, x_n)$ from (4.5) via the numbers $\alpha_m$:

\[ \frac{1}{k + 1} \Delta_i^{k+1} f(x_1, \ldots, x_n) = \sum_{m=0}^{k} (-1)^{m+k} \binom{k}{m} \alpha_{m+1}, \tag{4.6} \]

where $k = 0, 1, 2, \ldots$. Now (4.6) implies that all fractions $\frac{1}{k+1} \Delta_i^{k+1} f(x_1, \ldots, x_n)$ are p-adic integers whenever all $\alpha_m$ for $m = 1, 2, 3, \ldots$ are p-adic integers, whereas (4.5) implies the converse. Whence, all $\alpha_m$ for $m = 1, 2, 3, \ldots$ are p-adic integers if and only if all fractions $\frac{1}{k+1} \Delta_i^{k+1} f(x_1, \ldots, x_n)$ for $k = 0, 1, 2, \ldots$ are p-adic integers.
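The criterion of Proposition 4.5 says, in the univariate case, that $\operatorname{ord}_p \Delta^i f(x) \ge \operatorname{ord}_p i$ for all $i \ge 1$. A finite numerical check on non-negative integers might look as follows; this is a sketch with hypothetical helper names and arbitrarily chosen bounds, and it can refute compatibility but not establish it.

```python
from math import comb


def valuation(n, p):
    """ord_p(n) for a non-zero integer n; None stands for ord_p(0) = +infinity."""
    if n == 0:
        return None
    n, k = abs(n), 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def forward_difference(f, i, x):
    """(Delta^i f)(x) = sum_j (-1)^(i-j) C(i, j) f(x + j)."""
    return sum((-1) ** (i - j) * comb(i, j) * f(x + j) for j in range(i + 1))


def differences_are_integral(f, p, max_order=16, points=8):
    """Check ord_p(Delta^i f(x)) >= ord_p(i) for 1 <= i <= max_order, 0 <= x < points."""
    for i in range(1, max_order + 1):
        for x in range(points):
            v = valuation(forward_difference(f, i, x), p)
            if v is not None and v < valuation(i, p):
                return False
    return True
```

For example, the 2-adically compatible map $x \mapsto (x + 3x^3) \mathbin{\mathrm{XOR}} x$ passes, while $x \mapsto \lfloor x/2 \rfloor$ fails at $i = 2$, $x = 0$, where $\Delta^2 f(0) = 1$ is odd.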

4.2 Compatibility and differentiability

We already know that if a 1-Lipschitz function is uniformly differentiable modulo $p^k$ then all its partial derivatives modulo $p^k$ are integer-valued, see Exercise 2.26. The following theorem demonstrates that 1-Lipschitz functions are tightly related to functions that are uniformly differentiable (or at least are uniformly differentiable modulo some $p^k$) and have integer-valued derivatives.

Theorem 4.6. Let a function $F = (f_1, \ldots, f_m) \colon \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ be uniformly differentiable modulo $p$, and let it have integer-valued derivatives modulo $p$ at all points of $\mathbb{Z}_p^n$. Then

\[ F(x_1, \ldots, x_n) = P(x_1, \ldots, x_n) + C(x_1, \ldots, x_n), \]

where $P$ is a step function of order not greater than $N_1(F)$ and $C$ is a compatible function. Consequently, $F$ is locally compatible and $C$ is uniformly differentiable modulo $p$.

Proof. Put

\[ P(x_1, \ldots, x_n) = \bigl( f_1(x_1, \ldots, x_n) \bmod p^{N_1(F)}, \ldots, f_m(x_1, \ldots, x_n) \bmod p^{N_1(F)} \bigr), \]
\[ C(x_1, \ldots, x_n) = F(x_1, \ldots, x_n) - P(x_1, \ldots, x_n). \]

For $l \ge N_1(F)$ and all $s_1, \ldots, s_n \in \mathbb{Z}_p$ Definition 2.25 implies that

\[ F(x_1 + s_1 p^l, \ldots, x_n + s_n p^l) \equiv F(x_1, \ldots, x_n) \pmod{p^l} \tag{4.7} \]

since $F_1'(x_1, \ldots, x_n)$ is a matrix over $\mathbb{Z}/p\mathbb{Z}$, and consequently $(s_1 p^l, \ldots, s_n p^l) F_1'(x_1, \ldots, x_n) \equiv (0, \ldots, 0) \pmod{p^l}$. In particular, (4.7) implies that $F$ is locally compatible. This in turn means that for $i \ge N_1(F)$ the function $\delta_i(f_j(x_1, \ldots, x_n))$ depends only on $\delta_0(x_1), \ldots, \delta_0(x_n), \ldots, \delta_i(x_1), \ldots, \delta_i(x_n)$; i.e., the function $\delta_i(f_j(x_1, \ldots, x_n))$ is a step function of order not greater than $i + 1$. Hence $C$ is compatible. On the other hand, (4.7) implies that if $i < N_1(F)$ then $\delta_i(f_j(x_1, \ldots, x_n))$ does not depend on $\delta_r(x_t)$ for $r = N_1(F), N_1(F) + 1, \ldots$ and $t = 1, 2, \ldots, n$; that is, for all $i = 0, 1, \ldots, N_1(F) - 1$ and all $j = 1, 2, \ldots, m$ the function $\delta_i(f_j(x_1, \ldots, x_n))$ is a step function of order not greater than $N_1(F)$. Hence the function $P(x_1, \ldots, x_n)$ is a step function of order not greater than $N_1(F)$ since

\[ f_j(x_1, \ldots, x_n) \bmod p^{N_1(F)} = \sum_{i=0}^{N_1(F) - 1} \delta_i(f_j(x_1, \ldots, x_n))\, p^i \]

for $j = 1, 2, \ldots, m$. Thus $P(x_1, \ldots, x_n)$ is a pseudo-constant, whence it has zero derivatives. We conclude finally that the function $C = F - P$ is uniformly differentiable modulo $p$, and that the corresponding partial derivatives of $C$ and $F$ modulo $p$ pairwise coincide.

Note 4.7. From the proof of Theorem 4.6 it easily follows that any locally compatible function is a sum of a compatible function and of a step function of order $K$ for some $K \in \mathbb{N}_0$, and vice versa, any such sum is locally compatible, since the congruence (4.7) of the proof of Theorem 4.6 is equivalent to the local compatibility of $F$. Moreover, this $K$ is equal to $N$ from the statement of Note 4.3: actually, from the proof of Theorem 4.6, as well as from the proof of Proposition 4.2, it can be easily deduced that a function $f \colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ is locally compatible if and only if there exists $N \in \mathbb{N}_0$ such that

\[ f(x_1, \ldots, x_n) = g(x_1, \ldots, x_n) + c(x_1, \ldots, x_n), \]

where $c \colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ is a compatible function (which is an identity modulo $p^N$) and $g \colon \mathbb{Z}_p^n \to \{0, 1, \ldots, p^N - 1\}$ is a step function of order not greater than $N$.

Indeed, from the proof of Theorem 4.6, as well as from the proof of Proposition 4.2, it follows that $g(x_1, \ldots, x_n) = f(x_1, \ldots, x_n) \bmod p^N$ and $c(x_1, \ldots, x_n) = f(x_1, \ldots, x_n) - g(x_1, \ldots, x_n)$, where the mapping $\cdot \bmod p^N \colon \mathbb{Z}_p \to \{0, 1, \ldots, p^N - 1\}$ is just a reduction modulo $p^N$ of a p-adic integer: $z \bmod p^N = \delta_0(z) + \delta_1(z) \cdot p + \cdots + \delta_{N-1}(z) \cdot p^{N-1}$. Thus, the most essential component of any locally compatible function is a compatible function: for instance, the function $f$ is differentiable if and only if its compatible summand $c$ is differentiable, since every step function is differentiable everywhere on $\mathbb{Z}_p$ and its derivative is 0. So in the sequel we focus our study on compatible functions, making remarks about locally compatible ones whenever it is reasonable.

Exercise 4.8. Prove that locally compatible functions that are differentiable modulo $p^k$ have integer-valued derivatives modulo $p^k$.

4.2.1 Differentiability modulo $p$

Now we state a criterion for differentiability modulo $p$ of a compatible univariate function and find a formula for the derivative modulo $p$.

Theorem 4.9. A compatible function $f \colon \mathbb{Z}_p \to \mathbb{Z}_p$ is differentiable modulo $p$ at the point $u \in \mathbb{Z}_p$ if and only if

\[ \frac{\Delta^i f(u)}{i} \equiv 0 \pmod p \]

for all sufficiently large $i$. If this condition is satisfied, the derivative $f_1'(u)$ modulo $p$ of the function $f$ at the point $u$ is

\[ f_1'(u) \equiv \sum_{i=1}^{\infty} (-1)^{i-1} \frac{\Delta^i f(u)}{i} \equiv \sum_{t=0}^{\infty} \sum_{k=1}^{p-1} (-1)^{k-1} \frac{\Delta^{kp^t} f(u)}{kp^t} \pmod p. \]

Note. Since $f$ is compatible, the fraction $\frac{\Delta^j f(u)}{j}$ is a p-adic integer for all $j \in \mathbb{N}$, see Proposition 4.5.

To prove the theorem, we need some technical lemmas.

Lemma 4.10. Let $f \colon \mathbb{Z}_p \to \mathbb{Z}_p$ be a compatible function, let $u \in \mathbb{Z}_p$, and let the base-$p$ expansion of $i$ contain more than one nonzero digit (i.e., $i \ne p^{\alpha} l$ for $\alpha \in \{0, 1, 2, \ldots\}$, $l \in \{1, 2, \ldots, p - 1\}$). Then $\frac{1}{i} \Delta^i f(u) \equiv 0 \pmod p$.

Proof of Lemma 4.10. Since

\[ \frac{1}{i} \Delta^i f(u) = \sum_{j=1}^{i} (-1)^{i+j} \binom{i - 1}{j - 1} \frac{1}{j} \bigl( f(u + j) - f(u) \bigr), \]

see (4.6) and (4.4) of Proposition 4.5, it is sufficient to demonstrate that

\[ S(i) = \sum_{j=1}^{\infty} (-1)^{j} \binom{i - 1}{j - 1} \frac{1}{j} \bigl( f(u + j) - f(u) \bigr) \equiv 0 \pmod p \]

whenever $i \ne l p^{\alpha}$, where $l \in \{1, 2, \ldots, p - 1\}$ and $\alpha \in \mathbb{N}_0$. Note that all fractions $\frac{1}{j} (f(u + j) - f(u))$ are p-adic integers since $f$ is compatible. Represent $j \in \mathbb{N}$ as $j = p^r l + p^{r+1} t$ where $r = \operatorname{ord}_p j$, $l \in \{1, 2, \ldots, p - 1\}$, $t \in \mathbb{N}_0$. We have then

\[ S(i) = \sum_{r=0}^{\infty} \sum_{l=1}^{p-1} \sum_{t=0}^{\infty} (-1)^{p^r l + p^{r+1} t} \binom{i - 1}{p^r l + p^{r+1} t - 1} \frac{f(u + p^r l + p^{r+1} t) - f(u)}{p^r l + p^{r+1} t}. \tag{4.8} \]

The compatibility of $f$ implies that $f(u + p^r l + p^{r+1} t) = f(u + p^r l) + p^{r+1} \xi$ for a suitable $\xi \in \mathbb{Z}_p$; hence

\[ \frac{f(u + p^r l + p^{r+1} t) - f(u)}{p^r l + p^{r+1} t} \equiv \frac{f(u + p^r l) - f(u)}{p^r l + p^{r+1} t} \pmod p \tag{4.9} \]

since $l + pt$ is a unit in $\mathbb{Z}_p$. Whence

\[ \frac{f(u + p^r l) - f(u)}{p^r l + p^{r+1} t} \equiv \frac{f(u + p^r l) - f(u)}{p^r l} \pmod p \tag{4.10} \]

since $(l + pt)^{-1} \equiv l^{-1} \pmod p$. Now from (4.8) in view of (4.9) and of (4.10) we conclude that

\[ S(i) \equiv \sum_{r=0}^{\infty} \sum_{l=1}^{p-1} \frac{f(u + p^r l) - f(u)}{p^r l} \sum_{t=0}^{\infty} (-1)^{l+t} \binom{i - 1}{p^r l + p^{r+1} t - 1} \pmod p, \tag{4.11} \]

since $(-1)^p \equiv -1 \pmod p$ for every prime $p$. Denote

\[ \sigma_r(i) = \sum_{t=0}^{\infty} (-1)^{l+t} \binom{i - 1}{p^r l + p^{r+1} t - 1}. \tag{4.12} \]

Note that whenever $s$ is a p-adic integer with $\operatorname{ord}_p s = k$, the $j$-th digit $\delta_j(s - 1)$ of the base-$p$ expansion of $s - 1$ is $p - 1$ for $j < k$. With this in mind, we consider the cases $\operatorname{ord}_p i < r$, $\operatorname{ord}_p i > r$, and $\operatorname{ord}_p i = r$ separately.

Case 1: $\operatorname{ord}_p i < r$. The above note in view of Lucas' Theorem 0.2 implies that

\[ \binom{i - 1}{p^r l + p^{r+1} t - 1} \equiv 0 \pmod p \]

whenever $\operatorname{ord}_p i < r$, and consequently, that $\sigma_r(i) \equiv 0 \pmod p$ in this case.

Case 2: $\operatorname{ord}_p i > r$. In this case Lucas' Theorem 0.2 implies that

\[ \binom{i - 1}{p^r l + p^{r+1} t - 1} \equiv \binom{p - 1}{l - 1} \binom{\nu(i, r)}{t} \pmod p, \tag{4.13} \]

where $\nu(i, r) = \left\lfloor \frac{i - 1}{p^{r+1}} \right\rfloor$, the integral part of $\frac{i - 1}{p^{r+1}}$. Now, since

\[ \binom{p - 1}{l - 1} \equiv (-1)^{l-1} \pmod p, \]

combining (4.13) and (4.12) we conclude that

\[ \sigma_r(i) \equiv - \sum_{t=0}^{\infty} (-1)^{t} \binom{\nu(i, r)}{t} \pmod p. \tag{4.14} \]

Further, since

\[ \sum_{\ell=0}^{\infty} (-1)^{\ell} \binom{m}{\ell} = \begin{cases} 1, & \text{if } m = 0; \\ 0, & \text{otherwise,} \end{cases} \]

the right-hand part of (4.14) is zero modulo $p$ whenever $\nu(i, r) \ne 0$, that is, whenever $i > p^{r+1}$. However, we are considering the case $\operatorname{ord}_p i > r$; thus, since the conditions $\operatorname{ord}_p i > r$ and $i \le p^{r+1}$ hold simultaneously only when $i = p^{r+1}$, the condition $\sigma_r(i) \not\equiv 0 \pmod p$ necessarily implies that $i = p^{r+1}$ in the case under consideration.

Case 3: $\operatorname{ord}_p i = r$. In a manner similar to that of Case 2 we prove that

\[ \sigma_r(i) \equiv \sum_{t=0}^{\infty} (-1)^{l+t} \binom{\delta_r(i) - 1}{l - 1} \binom{\nu(i, r)}{t} \pmod p, \]

and that the sum in the right-hand part of this congruence may fail to vanish modulo $p$ only if the two conditions $\delta_r(i) \ge l$ and $\nu(i, r) = 0$ hold simultaneously. But these two conditions hold simultaneously only when $i = p^r \delta_r(i)$. This in view of (4.11) and (4.12) finishes the proof of Lemma 4.10.

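Lemma 4.11. Let $f \colon \mathbb{Z}_p \to \mathbb{Z}_p$ be a compatible function, and let $u, h \in \mathbb{Z}_p$. Then the following congruence holds:

\[ f(u + h) \equiv f(u) + h \left( \tilde f_m(u) + \sum_{i=2}^{p-1} \binom{h - 1}{i p^m - 1} \frac{\Delta^{i p^m} f(u)}{i p^m} \right) \pmod{p^{m+1}}, \]

where $m = \operatorname{ord}_p h$ and

\[ \tilde f_m(u) \equiv \sum_{t=0}^{m-1} \sum_{l=1}^{p-1} (-1)^{l-1} \frac{\Delta^{l p^t} f(u)}{l p^t} + \frac{\Delta^{p^m} f(u)}{p^m} \pmod p. \]

In particular, if $p = 2$ then

\[ f(u + h) \equiv f(u) + h \sum_{i=0}^{m} \frac{\Delta^{2^i} f(u)}{2^i} \pmod{2^{m+1}}. \]

Proof of Lemma 4.11. In view of the compatibility of $f$ it is sufficient to prove the lemma under the assumption that $h = p^m \chi$, $\chi \in \{1, 2, \ldots, p - 1\}$. Applying the Gregory-Newton formula of Theorem 0.5, we see that

\[ f(u + p^m \chi) = \sum_{i=0}^{p^m \chi} \binom{p^m \chi}{i} \Delta^i f(u); \]

thus

\[ f(u + p^m \chi) = f(u) + p^m \chi \sum_{i=1}^{p^m \chi} \binom{p^m \chi - 1}{i - 1} \frac{\Delta^i f(u)}{i} \]

since

\[ \ell \binom{\ell - 1}{j - 1} = j \binom{\ell}{j}. \]

Now Lemma 4.10 implies that

\[ f(u + p^m \chi) \equiv f(u) + p^m \chi \left( \sum_{t=0}^{m-1} \sum_{l=1}^{p-1} \binom{p^m \chi - 1}{p^t l - 1} \frac{\Delta^{l p^t} f(u)}{l p^t} + \sum_{j=1}^{\chi} \binom{p^m \chi - 1}{p^m j - 1} \frac{\Delta^{j p^m} f(u)}{j p^m} \right) \pmod{p^{m+1}}. \]

From here, combining together the congruence

\[ \binom{\chi - 1}{j - 1} \equiv \binom{p^m \chi - 1}{p^m j - 1} \pmod p, \tag{4.15} \]

which follows immediately from Lucas' Theorem 0.2, and the obvious congruence

\[ \binom{p - 1}{k} \equiv (-1)^k \pmod p, \]

we deduce that

\[ f(u + p^m \chi) \equiv f(u) + p^m \chi \left( \sum_{t=0}^{m-1} \sum_{l=1}^{p-1} (-1)^{l-1} \frac{\Delta^{l p^t} f(u)}{l p^t} + \sum_{j=1}^{\chi} \binom{\chi - 1}{j - 1} \frac{\Delta^{j p^m} f(u)}{j p^m} \right) \pmod{p^{m+1}}. \]

The latter congruence implies that

\[ f(u + p^m \chi) \equiv f(u) + p^m \chi \left( \tilde f_m(u) + \sum_{j=2}^{p-1} \binom{\chi - 1}{j - 1} \frac{\Delta^{j p^m} f(u)}{j p^m} \right) \pmod{p^{m+1}}, \]

since $\chi \in \{1, 2, \ldots, p - 1\}$. This in view of (4.15) proves Lemma 4.11.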

Proof of Theorem 4.9. If $\frac{\Delta^i f(u)}{i} \equiv 0 \pmod p$ for all $i \ge N$ then in view of Lemma 4.11 the following congruences hold:

\[ f(u + h) \equiv f(u) + h \tilde f_m(u) \pmod{p^{m+1}}, \qquad \tilde f_m(u) \equiv \tilde f_{m+1}(u) \equiv \cdots \pmod p \]

for all sufficiently small $h \in \mathbb{Z}_p$ (i.e., for all $h$ with $|h|_p = p^{-m}$, where $m$ is sufficiently large). Consequently, $f$ is differentiable modulo $p$ at the point $u \in \mathbb{Z}_p$.

Vice versa, let the function $f$ be differentiable modulo $p$ at the point $u$, i.e., let there exist $N \in \mathbb{N}$ and $c \in \mathbb{Q}_p$ such that

\[ f(u + h) \equiv f(u) + hc \pmod{p^{m+1}}, \tag{4.16} \]

where $|h|_p = p^{-m}$, $m \ge N$. From (4.16) in view of Lemma 4.11 we deduce that

\[ \tilde f_m(u) + \sum_{j=2}^{p-1} \binom{h - 1}{j p^m - 1} \frac{\Delta^{j p^m} f(u)}{j p^m} \equiv c \pmod p \tag{4.17} \]

for all $m \ge N$. In the case $p = 2$ the sum in the left-hand part of congruence (4.17) vanishes, so suppose for a moment that $p \ne 2$. According to Lucas' Theorem 0.2 we then have

\[ \sum_{j=2}^{p-1} \binom{h - 1}{j p^m - 1} \frac{\Delta^{j p^m} f(u)}{j p^m} \equiv \sum_{j=2}^{p-1} \binom{h_m - 1}{j - 1} \frac{\Delta^{j p^m} f(u)}{j p^m} \pmod p, \]

where $h_m = \delta_m(h)$, the $m$-th p-adic digit of $h$. So in view of (4.17) the function $\Psi_u(h_m)$ defined by the equation

\[ \Psi_u(h_m) = \delta_0 \left( \sum_{j=2}^{p-1} \binom{h_m - 1}{j - 1} \frac{\Delta^{j p^m} f(u)}{j p^m} \right) \]

is a constant whenever $|h|_p = p^{-m}$, $m \ge N$. In particular, $\Psi_u(h_m) = \Psi_u(1) = 0$, and this implies that for all $m \ge N$ the following system of congruences modulo $p$ holds:

\[ \sum_{j=2}^{p-1} \binom{k - 1}{j - 1} \frac{\Delta^{j p^m} f(u)}{j p^m} \equiv 0 \pmod p, \quad (k = 2, 3, \ldots, p - 1). \tag{4.18} \]

System (4.18) of congruences is triangular, so necessarily

\[ \frac{\Delta^{j p^m} f(u)}{j p^m} \equiv 0 \pmod p, \quad (j = 2, 3, \ldots, p - 1) \tag{4.19} \]

for all $m \ge N$. Now from (4.17) in view of (4.19) and of Lemma 4.11, we deduce that for all primes $p$ the following congruence holds:

\[ \sum_{t=0}^{N-1} \sum_{l=1}^{p-1} (-1)^{l-1} \frac{\Delta^{l p^t} f(u)}{l p^t} + \sum_{s=N}^{m} \frac{\Delta^{p^s} f(u)}{p^s} \equiv c \pmod p, \tag{4.20} \]

where $c$ does not depend on $m$. Hence

\[ \frac{\Delta^{p^s} f(u)}{p^s} \equiv 0 \pmod p \tag{4.21} \]

for all $s \ge N + 1$. Finally, combining (4.19) and (4.21) with Lemma 4.10, we obtain that $\frac{\Delta^i f(u)}{i} \equiv 0 \pmod p$ for all $i \ge p^N + 1$. The second statement of Theorem 4.9 follows from (4.20) in view of Lemma 4.10 since $c \equiv f_1'(u) \pmod p$, see (4.16).

It is worth comparing here the notions of differentiability and of differentiability modulo $p^k$ once again. As for differentiability of a function $f \colon \mathbb{Z}_p \to \mathbb{Q}_p$ at the point $u \in \mathbb{Z}_p$, the following result is known (see e.g. [42, Chapter 13, Theorem 1]):

Theorem 4.12. A function $f \colon \mathbb{Z}_p \to \mathbb{Q}_p$ is differentiable at the point $u \in \mathbb{Z}_p$ if and only if

\[ \lim^{p}_{i \to \infty} \frac{\Delta^i f(u)}{i} = 0. \]

If this condition is satisfied, the derivative $f'(u)$ of the function $f$ at the point $u$ is

\[ f'(u) = \sum_{i=1}^{\infty} (-1)^{i-1} \frac{\Delta^i f(u)}{i}. \]
Comparing Theorem 4.12 to Theorem 4.9 it is reasonable to suppose that a similar result should hold for differentiability modulo pk , k ≥ 2. Note that the case k = 2 is of highest importance in view of Theorem 9.16 on ergodicity. Thus we set the following problem: Open question 4.13. Is this true that a compatible function f : Zp → Zp is differentiable modulo pk (k ≥ 2) at the point u ∈ Zp if and only if Δi f (u) ≡ 0 (mod pk ) i for all sufficiently large i?



Note that anyway the formula from Theorem 4.12 holds for a derivative modulo $p^k$ as well, in the following sense:

Proposition 4.14. If the function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is differentiable modulo $p^k$ at the point $u \in \mathbb{Z}_p$, then
\[ f_k'(u) \equiv \lim_{\ell\to\infty}\left(\sum_{i=1}^{z_\ell}(-1)^{i-1}\frac{\Delta^i f(u)}{i} \bmod p^k\right) \]
for every sequence $\{z_\ell \in \mathbb{N}_0\}_{\ell=0}^{\infty}$ that converges to 0 with respect to the $p$-adic metric.

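Proof. Applying the Gregory-Newton formula of Theorem 0.5, we see that
\[ f(u + z_\ell) = \sum_{i=0}^{z_\ell}\binom{z_\ell}{i}\Delta^i f(u); \]
thus
\[ f(u + z_\ell) = f(u) + z_\ell\cdot\sum_{i=1}^{z_\ell}\binom{z_\ell-1}{i-1}\frac{\Delta^i f(u)}{i} \]
since $\binom{z_\ell}{j} = \frac{z_\ell}{j}\binom{z_\ell-1}{j-1}$. However, as $\binom{z-1}{j}$ is a continuous function of $z$ on $\mathbb{Z}_p$, we have $\lim^{p}_{z\to 0}\binom{z-1}{j} = (-1)^j$, so
\[ \frac{f(u+z_\ell)-f(u)}{z_\ell} = \sum_{i=1}^{z_\ell}\binom{z_\ell-1}{i-1}\frac{\Delta^i f(u)}{i} \equiv \sum_{i=1}^{z_\ell}(-1)^{i-1}\frac{\Delta^i f(u)}{i} \pmod{p^k} \]
for all sufficiently large $\ell$. As $\lim_{\ell\to\infty}\left(\frac{f(u+z_\ell)-f(u)}{z_\ell} \bmod p^k\right) = f_k'(u)$ by the definition of a derivative modulo $p^k$, the conclusion follows.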

In other words, Proposition 4.14 claims that the function
\[ S(z) = \sum_{i=1}^{z}(-1)^{i-1}\frac{\Delta^i f(u)}{i} \bmod p^k \]
of variable $z$ is a constant on a sufficiently small ball $p^N\mathbb{Z}_p$: $S(z) = f_k'(u)$ for all $z \in p^N\mathbb{Z}_p$. That is, differentiability modulo $p^k$ implies that all sums
\[ \sum_{i=p^N t+1}^{p^N(t+1)}(-1)^{i-1}\frac{\Delta^i f(u)}{i} \]
are 0 modulo $p^k$, for all $t = 1, 2, \ldots$ and sufficiently large $N$, and Open question 4.13 asks whether differentiability modulo $p^k$ implies that all terms of these sums are 0 modulo $p^k$. At present we know only that the answer is affirmative for $k = 1$ (see Theorem 4.9); for $k > 1$ the problem is still open.
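The quantities entering Theorem 4.12 and Proposition 4.14 are easy to experiment with. The following is a minimal sketch, assuming the function is given as a Python callable on non-negative integers; the names `forward_differences` and `partial_sum_S` are hypothetical and are not taken from the text. It computes the differences $\Delta^i f(u)$ exactly and the rational partial sums $\sum_{i=1}^{z}(-1)^{i-1}\Delta^i f(u)/i$ before any reduction modulo $p^k$.

```python
from fractions import Fraction
from math import comb


def forward_differences(f, u, n):
    """Return [Δ^0 f(u), ..., Δ^n f(u)], using Δ^i f(u) = Σ_j (-1)^(i-j) C(i,j) f(u+j)."""
    return [
        sum((-1) ** (i - j) * comb(i, j) * f(u + j) for j in range(i + 1))
        for i in range(n + 1)
    ]


def partial_sum_S(f, u, z):
    """Exact rational value of Σ_{i=1}^{z} (-1)^(i-1) Δ^i f(u) / i."""
    d = forward_differences(f, u, z)
    return sum(Fraction((-1) ** (i - 1) * d[i], i) for i in range(1, z + 1))


def cube(x):
    return x ** 3


if __name__ == "__main__":
    print(forward_differences(cube, 1, 3))
    print([partial_sum_S(cube, 1, z) for z in range(1, 7)])
```

For a polynomial the differences vanish beyond the degree, so the partial sums stabilise; for the cube at $u = 1$ they equal $3 = f'(1)$ from $z = 3$ on. For a general compatible function only the residues modulo $p^k$ are expected to stabilise, and only along small balls, which is exactly what the discussion above is about.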


4.3 Mahler expansion of compatible functions

In this Section we characterize compatible functions in terms of Mahler expansions. Recall that for $\alpha \in \mathbb{N}$ the number $\lfloor\log_p\alpha\rfloor$ is the number of digits in the base-$p$ expansion of $\alpha$, reduced by 1, see (3.11).

Theorem 4.15. A function $f\colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ represented by Mahler expansion (3.7) is compatible if and only if
\[ |a_{i_1,\ldots,i_n}|_p \le p^{-\mu(i_1,\ldots,i_n)}, \]
where $\mu(i_1,\ldots,i_n) = \max\{\lfloor\log_p i_k\rfloor : k = 1, 2, \ldots, n\}$. In particular, a univariate function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ represented by Mahler expansion (3.2) is compatible if and only if
\[ |a_i|_p \le p^{-\lfloor\log_p i\rfloor} \]
for all $i = 1, 2, \ldots$. In other words, a function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is compatible if and only if it can be represented as
\[ f(x) = \sum_{i=0}^{\infty} c_i\, p^{\lfloor\log_p i\rfloor}\binom{x}{i}, \tag{4.22} \]
for suitable $c_i \in \mathbb{Z}_p$; $i = 0, 1, 2, \ldots$.
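The univariate criterion of Theorem 4.15 can be tested on finitely many coefficients. The sketch below is only an illustration under stated assumptions: the function is an integer-valued Python callable on non-negative integers, the Mahler coefficients are computed as $a_i = \Delta^i f(0)$, and the helper names (`mahler_coefficients`, `ord_p`, `floor_log`, `satisfies_criterion`) are hypothetical. A finite check can refute compatibility but cannot prove it.

```python
from math import comb


def mahler_coefficients(f, n):
    """a_i = Δ^i f(0) = Σ_j (-1)^(i-j) C(i,j) f(j), for i = 0..n."""
    return [
        sum((-1) ** (i - j) * comb(i, j) * f(j) for j in range(i + 1))
        for i in range(n + 1)
    ]


def ord_p(a, p):
    """p-adic valuation of a non-zero integer a."""
    a = abs(a)
    k = 0
    while a % p == 0:
        a //= p
        k += 1
    return k


def floor_log(i, p):
    """floor(log_p i) for i >= 1: the number of base-p digits of i, minus 1."""
    k = 0
    while i >= p:
        i //= p
        k += 1
    return k


def satisfies_criterion(f, p, n):
    """Check ord_p(a_i) >= floor(log_p i) for 1 <= i <= n (zero coefficients pass)."""
    a = mahler_coefficients(f, n)
    return all(
        a[i] == 0 or ord_p(a[i], p) >= floor_log(i, p) for i in range(1, n + 1)
    )


if __name__ == "__main__":
    print(satisfies_criterion(lambda x: x ** 3 + 2 * x, 2, 8))  # integer polynomial
    print(satisfies_criterion(lambda x: comb(x, 2), 2, 8))      # binomial C(x, 2)
```

The polynomial $x^3 + 2x$ passes for $p = 2$, whereas $\binom{x}{2}$ fails already at $i = 2$, since its coefficient $a_2 = 1$ is not divisible by $2^{\lfloor\log_2 2\rfloor} = 2$.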

Proof of Theorem 4.15. Induction on $n$. Let $n = 1$. According to Proposition 4.5, the function $f$ is compatible if and only if $\frac{\Delta^i f(x)}{i}$ is a $p$-adic integer for all $x \in \mathbb{Z}_p$, $i = 1, 2, \ldots$. Yet
\[ \frac{\Delta^i f(x)}{i} = \frac{1}{i}\sum_{j=i}^{\infty} a_j\binom{x}{j-i} \tag{4.23} \]
in view of (2). Now (4.23) implies that $\frac{\Delta^i f(x)}{i}$ is a $p$-adic integer if and only if
\[ \sum_{j=i}^{\infty} a_j\binom{x}{j-i} \]
is an identity modulo $p^{\operatorname{ord}_p i}$. Proposition 3.15 implies now that $\frac{\Delta^i f(x)}{i}$ is a $p$-adic integer if and only if the following congruences hold simultaneously for all $j \ge i$:
\[ a_j \equiv 0 \pmod{p^{\operatorname{ord}_p i}}. \tag{4.24} \]
Thus, $f$ is compatible if and only if congruences (4.24) hold simultaneously for all $i = 1, 2, \ldots$ and all $j \ge i$. This means (since $\lfloor\log_p j\rfloor = \max\{\operatorname{ord}_p i : i = 1, 2, 3, \ldots, j\}$) that the following congruences hold simultaneously:
\[ a_j \equiv 0 \pmod{p^{\lfloor\log_p j\rfloor}}, \quad (j = 1, 2, 3, \ldots). \]
This proves Theorem 4.15 for $n = 1$.

Now let the statement of the theorem be true for all $r$-variate functions that satisfy the conditions of the theorem, $r < n$. Represent
\[ f(x_1,\ldots,x_n) = \sum_{j=0}^{\infty} g_j(x_1,\ldots,x_{n-1})\binom{x_n}{j}, \]
where all functions $g_j$ are uniformly continuous on $\mathbb{Z}_p^{n-1}$, for all $j = 0, 1, 2, \ldots$:
\[ g_j(x_1,\ldots,x_{n-1}) = \sum_{(i_1,\ldots,i_{n-1})\in\mathbb{N}_0^{n-1}} a_{i_1,\ldots,i_{n-1},j}\binom{x_1}{i_1}\binom{x_2}{i_2}\cdots\binom{x_{n-1}}{i_{n-1}}. \]
According to Proposition 4.5, the function $f(x_1,\ldots,x_n)$ is compatible if and only if all fractions $\frac{1}{i}\Delta_s^i f(x_1,\ldots,x_n)$ are $p$-adic integers, for all $i = 1, 2, \ldots$, all $s = 1, 2, \ldots, n$, and all $x_1,\ldots,x_n \in \mathbb{Z}_p$. Using an argument similar to that of the case $n = 1$ we conclude that
\[ \frac{1}{i}\Delta_s^i f(x_1,\ldots,x_n) = \begin{cases} \sum_{j=i}^{\infty}\frac{1}{i}\, g_j(x_1,\ldots,x_{n-1})\binom{x_n}{j-i}, & \text{if } s = n;\\[4pt] \sum_{j=0}^{\infty}\frac{1}{i}\,\Delta_s^i g_j(x_1,\ldots,x_{n-1})\binom{x_n}{j}, & \text{otherwise.} \end{cases} \tag{4.25} \]

If $s \ne n$, all functions $\frac{1}{i}\Delta_s^i f(x_1,\ldots,x_n)$ ($i = 1, 2, \ldots$) are simultaneously integer-valued if and only if all functions $\frac{1}{i}\Delta_s^i g_j(x_1,\ldots,x_{n-1})$ are simultaneously integer-valued, for all $j = 0, 1, 2, \ldots$ and all $i = 1, 2, \ldots$. By Proposition 4.5, this implies that every function $g_j(x_1,\ldots,x_{n-1})$ ($j = 0, 1, 2, \ldots$) is compatible. By the induction hypothesis, the latter holds if and only if the following inequalities hold simultaneously:
\[ |a_{i_1,\ldots,i_{n-1},j}|_p \le p^{-\mu(i_1,\ldots,i_{n-1})}, \quad (j, i_1,\ldots,i_{n-1} \in \mathbb{N}_0). \tag{4.26} \]
If $s = n$, then by an argument similar to that of the case $n = 1$ from (4.25) we deduce that all functions $\frac{1}{i}\Delta_n^i f(x_1,\ldots,x_n)$ ($i = 1, 2, \ldots$) are integer-valued if and only if the following inequalities hold simultaneously for all $j = 1, 2, \ldots$ and all $x_1,\ldots,x_{n-1} \in \mathbb{Z}_p$:
\[ |g_j(x_1,\ldots,x_{n-1})|_p \le p^{-\lfloor\log_p j\rfloor}. \tag{4.27} \]
But these conditions imply that every function $g_j(x_1,\ldots,x_{n-1})$ is an identity modulo $p^{\lfloor\log_p j\rfloor}$; whence, in view of Proposition 3.15, the following conditions hold simultaneously for all $i_1,\ldots,i_{n-1} \in \mathbb{N}_0$ and all $j \in \mathbb{N}$:
\[ |a_{i_1,\ldots,i_{n-1},j}|_p \le p^{-\lfloor\log_p j\rfloor}. \tag{4.28} \]
Now combining (4.26) with (4.28) we finish the proof of Theorem 4.15.



Corollary 4.16 (cf. [29]). An integer-valued polynomial $f(x) \in \mathbb{Q}[x]$ is compatible as a mapping of the ring $\mathbb{Z}$ into $\mathbb{Z}$ (that is, a congruence $a \equiv b \pmod m$ implies a congruence $f(a) \equiv f(b) \pmod m$, for all $m \in \mathbb{N}\setminus\{1\}$ and all $a, b \in \mathbb{Z}$) if and only if $f$ can be represented in the following form:
\[ f(x) = a_0 + \sum_{i=1}^{d} a_i\cdot\operatorname{lcm}(1, 2, \ldots, i)\cdot\binom{x}{i}, \]
where $a_0, a_1, \ldots \in \mathbb{Z}$, and $\operatorname{lcm}(k, l, m, \ldots)$ for $k, l, m, \ldots \in \mathbb{N}$ is the least common multiple of $k, l, m, \ldots$.

Exercise 4.17. Prove Corollary 4.16. (Hint: Use Chinese Remainder Theorem 0.1 and 0.13).
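Corollary 4.16 lends itself to a brute-force sanity check on a finite range. The sketch below assumes Python 3.9 or later (for `math.lcm`); the names `lcm_term` and `is_congruence_preserving` are hypothetical. It tests whether $a \equiv b \pmod m$ implies $f(a) \equiv f(b) \pmod m$ for all non-negative $a, b$ below a bound, so it can only refute compatibility, not establish it.

```python
from math import comb, lcm


def lcm_term(i):
    """Return the function x -> lcm(1, ..., i) * C(x, i) for i >= 1."""
    scale = lcm(*range(1, i + 1))
    return lambda x: scale * comb(x, i)


def is_congruence_preserving(f, bound):
    """Check f(a) ≡ f(b) (mod m) whenever a ≡ b (mod m), for 0 <= a < b < bound."""
    for m in range(2, bound):
        for a in range(bound):
            for b in range(a + m, bound, m):
                if (f(b) - f(a)) % m != 0:
                    return False
    return True


if __name__ == "__main__":
    print(is_congruence_preserving(lcm_term(4), 40))             # 12 * C(x, 4)
    print(is_congruence_preserving(lambda x: comb(x, 4), 40))    # C(x, 4) alone
```

The term $12\binom{x}{4}$ does not have integer coefficients as a polynomial in $x$, yet it passes the check, while $\binom{x}{4}$ alone fails (take $a = 0$, $b = 4$, $m = 4$).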

4.4 Van der Put expansion of compatible functions

Now we prove the compatibility criterion for an arbitrary map $\mathbb{Z}_p \to \mathbb{Z}_p$ represented by a van der Put series.

Theorem 4.18. Let a function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ be represented via van der Put series (3.10); then $f$ is compatible (that is, satisfies the $p$-adic Lipschitz condition with a constant 1) if and only if $|B_m|_p \le p^{-\lfloor\log_p m\rfloor}$ for all $m = 0, 1, 2, \ldots$. In other words, $f$ is compatible if and only if it can be represented as
\[ f(x) = \sum_{m=0}^{\infty} p^{\lfloor\log_p m\rfloor}\, b_m\, \chi(m, x), \tag{4.29} \]

for suitable $b_m \in \mathbb{Z}_p$; $m = 0, 1, 2, \ldots$.

Proof of Theorem 4.18. To prove the necessity of the conditions, take $m \in \mathbb{N}_0$ and consider its base-$p$ expansion $m = m_0 + \cdots + m_{n-1}p^{n-1}$. Here $m_j \in \{0, \ldots, p-1\}$, $m_{n-1} \ne 0$, and $n = \lfloor\log_p m\rfloor + 1$. As
\[ m_0 + \cdots + m_{n-2}p^{n-2} \equiv m_0 + \cdots + m_{n-2}p^{n-2} + m_{n-1}p^{n-1} \pmod{p^{n-1}}, \]
we have $f(m_0 + \cdots + m_{n-2}p^{n-2}) \equiv f(m_0 + \cdots + m_{n-1}p^{n-1}) \pmod{p^{n-1}}$ by the compatibility of $f$. From the latter congruence in view of (3.12) it follows that $B_m \equiv 0 \pmod{p^{n-1}}$ for $m \ge p$; so $|B_m|_p \le p^{-\lfloor\log_p m\rfloor}$.

Now we prove the sufficiency of the conditions. As $|B_m|_p \le p^{-\lfloor\log_p m\rfloor}$, the sequence $B_0, B_1, \ldots$ tends $p$-adically to 0 as $m \to \infty$ and so the function $f$ is continuous. Hence while proving that $|f(x) - f(y)|_p \le |x - y|_p$ for $x, y \in \mathbb{Z}_p$ we may assume that $x, y \in \mathbb{N}_0$, as $\mathbb{N}_0$ is dense in $\mathbb{Z}_p$. Moreover, for the same reason, to prove that $f$ satisfies the $p$-adic Lipschitz condition with a constant 1 it suffices to prove that, given $x \in \mathbb{N}_0$ and $h, n \in \mathbb{N}$,
\[ |f(x + hp^n) - f(x)|_p \le p^{-n}. \]
Let $h = h_0 + h_1 p + \cdots + h_k p^k$ be a base-$p$ expansion of $h \in \mathbb{N}$, and let $n_0 < n_1 < n_2 < \cdots < n_k$ be all indices in the latter base-$p$ expansion such that $h_{n_0}, h_{n_1}, \ldots, h_{n_k}$ are nonzero; so $h = h_{n_0}p^{n_0} + h_{n_1}p^{n_1} + \cdots + h_{n_k}p^{n_k}$. Now in view of (3.12) we have that
\[ \begin{aligned} f(x + hp^n) &= f(x) + \bigl[f(x + p^n h_{n_0}p^{n_0}) - f(x)\bigr] \\ &\quad + \sum_{j=1}^{k}\bigl[f(x + p^n(h_{n_0}p^{n_0} + \cdots + h_{n_j}p^{n_j})) - f(x + p^n(h_{n_0}p^{n_0} + \cdots + h_{n_{j-1}}p^{n_{j-1}}))\bigr] \\ &= f(x) + \sum_{j=0}^{k} B_{x + h_{n_0}p^{n+n_0} + \cdots + h_{n_j}p^{n+n_j}}. \end{aligned} \tag{4.30} \]
However, by our assumption,
\[ \bigl|B_{x + h_{n_0}p^{n+n_0} + \cdots + h_{n_j}p^{n+n_j}}\bigr|_p \le p^{-(n+n_j)}; \]
so (4.30) implies that $|f(x + hp^n) - f(x)|_p \le p^{-n}$ due to the strong triangle inequality that holds for the $p$-adic absolute value.
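The criterion of Theorem 4.18 can also be probed numerically. The sketch below assumes the standard definition of van der Put coefficients, namely $B_m = f(m)$ for $m < p$ and $B_m = f(m) - f(m - q)$ for $m \ge p$, where $q$ is the leading term of the base-$p$ expansion of $m$; this is presumed to agree with formula (3.12), which is not reproduced in this section. The function names are hypothetical.

```python
from math import comb


def van_der_put_term(f, m, p):
    """Return (B_m, p^floor(log_p m)).

    B_m = f(m) for m < p; otherwise B_m = f(m) - f(m - q),
    where q is the leading base-p term of m (assumed standard definition).
    """
    if m < p:
        return f(m), 1
    power = 1
    while power * p <= m:
        power *= p
    q = (m // power) * power  # leading digit times p^floor(log_p m)
    return f(m) - f(m - q), power


def satisfies_vdp_criterion(f, p, bound):
    """Check that p^floor(log_p m) divides B_m for all 0 <= m < bound."""
    for m in range(bound):
        b_m, power = van_der_put_term(f, m, p)
        if b_m % power != 0:
            return False
    return True


if __name__ == "__main__":
    print(satisfies_vdp_criterion(lambda x: x ** 3 + x, 3, 200))
    print(satisfies_vdp_criterion(lambda x: comb(x, 3), 3, 200))
```

An integer polynomial such as $x^3 + x$ passes for $p = 3$; the function $\binom{x}{3}$ fails at $m = 3$, where $B_3 = 1$ is not divisible by 3.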

Chapter 5

Special classes of compatible functions

In this Chapter, we study special classes of $p$-adic compatible functions which were originally introduced in [6]. We will need these functions later in Part II, where we study ergodic properties of the corresponding transformations. All these functions are locally analytic. We recall the definition (see e.g. [48, Definition 25.3]):

Definition 5.1 (Locally analytic function). A function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be locally analytic of order $r$ iff $f$ is analytic on all balls of radii $p^{-r}$.

The definition can be re-formulated in an equivalent form:

Definition 5.2 (Locally analytic functions). A function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be locally analytic of order $r$ iff
\[ f(a + h) = \sum_{i=0}^{\infty}\frac{f^{(i)}(a)}{i!}\, h^i \]

for all $a \in \mathbb{Z}_p$ whenever $|h|_p \le p^{-r}$. Here as usual $f^{(i)}(a)$ stands for the $i$-th derivative of the function $f$ at the point $a \in \mathbb{Z}_p$. Note that functions that are locally analytic of order 0 are analytic on $\mathbb{Z}_p$ and vice versa. The following theorem by Yvette Amice gives a complete characterization of locally analytic functions represented via Mahler series:

Theorem 5.3 (Amice, [4, Ch. III, Sec. 10, Th. 3, Cor. 1(c)]). The function $f(x) = \sum_{i=0}^{\infty} a_i\binom{x}{i}$, $a_i \in \mathbb{Q}_p$, is locally analytic of order $r$ if and only if
\[ \lim_{i\to\infty}\left(\operatorname{ord}_p a_i + \frac{1}{p-1}\left(\operatorname{wt}_p i - \frac{i}{p^r}\right)\right) = +\infty. \]


5.1 Class C

According to Theorem 3.4, a power series $\sum_{i=0}^{\infty} c_i x^i$, where $c_i \in \mathbb{Q}_p$ for $i = 0, 1, 2, \ldots$, converges everywhere on $\mathbb{Z}_p$ if and only if $\lim^{p}_{i\to\infty} c_i = 0$; under the latter condition the series defines a continuous function on $\mathbb{Z}_p$. Of course, in general a function defined by such a series need not be integer-valued, let alone compatible. Consider, however, the special case when all coefficients $c_i$ are $p$-adic integers. Namely, in the ring $\mathbb{Z}_p[[x]]$ of all formal power series in one variable $x$ over the ring $\mathbb{Z}_p$ consider the set $C(x)$ of all series
\[ s(x) = \sum_{i=0}^{\infty} c_i x^i \quad (c_i \in \mathbb{Z}_p,\ i = 0, 1, 2, \ldots), \tag{5.1} \]
that converge everywhere on $\mathbb{Z}_p$. In other words, $s(x) \in C(x)$ if and only if $\lim^{p}_{i\to\infty} c_i = 0$. Once these conditions are satisfied, the series $s(x) \in C(x)$ defines on $\mathbb{Z}_p$ an integer-valued function $s\colon \mathbb{Z}_p \to \mathbb{Z}_p$ which is called a C-function.

Exercise 5.4. Prove that any C-function is compatible.

Proposition 5.5. Every C-function $s\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is uniformly differentiable on $\mathbb{Z}_p$; its derivative is integer-valued everywhere on $\mathbb{Z}_p$.

Proof. Consider the formal derivative $s'(x) \in \mathbb{Z}_p[[x]]$ of the series $s(x)$:
\[ s'(x) = \sum_{i=1}^{\infty} i c_i x^{i-1}. \]
Since $0 \le |ic_i|_p = |i|_p |c_i|_p \le |c_i|_p$, and $\lim^{p}_{i\to\infty} c_i = 0$, we conclude that $\lim^{p}_{i\to\infty} ic_i = 0$, and hence that $s'(x) \in C(x)$. We assert that the function $s'\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is a derivative of the function $s\colon \mathbb{Z}_p \to \mathbb{Z}_p$ with respect to the $p$-adic metric. Indeed, it is known that in the ring $\mathbb{Z}_p[[x, y]]$ of all formal power series in variables $x, y$ over $\mathbb{Z}_p$ the following equality holds:
\[ s(x + y) = \sum_{i=0}^{\infty}\frac{s^{(i)}(x)}{i!}\, y^i, \]
where $s^{(i)}(x) \in \mathbb{Z}_p[[x]]$ ($i = 1, 2, \ldots$) is the $i$-th formal derivative of the series $s(x)$, and $s^{(0)}(x) = s(x)$. By the assertion we have just proved, $s^{(i)}(x) \in C(x)$ for all $i = 0, 1, 2, \ldots$. Thus,
\[ \frac{s^{(i)}(u)}{i!} = \sum_{j=i}^{\infty}\binom{j}{i} c_j u^{j-i} \in \mathbb{Z}_p \tag{5.2} \]
for every $u \in \mathbb{Z}_p$. However,
\[ \left|\frac{s^{(i)}(u)}{i!}\right|_p = \left|\sum_{j=i}^{\infty}\binom{j}{i} c_j u^{j-i}\right|_p \le \max\{|c_j|_p : j = i, i+1, \ldots\}, \]
and consequently,
\[ \lim^{p}_{i\to\infty}\frac{s^{(i)}(u)}{i!} = 0, \tag{5.3} \]
since $\lim^{p}_{i\to\infty} c_i = 0$; so for every $u \in \mathbb{Z}_p$ we conclude that
\[ s(u + y) = \sum_{i=0}^{\infty}\frac{s^{(i)}(u)}{i!}\, y^i \in C(y). \tag{5.4} \]
Finally, if $s(x) \in C(x)$, then Taylor series (5.4) at the point $u \in \mathbb{Z}_p$ converges to $s$ everywhere on $\mathbb{Z}_p$. In particular, for $h \in \mathbb{Z}_p$ we obtain that $s(u + h) = s(u) + s'(u)h + \alpha(u, h)$, where
\[ \lim^{p}_{h\to 0}\frac{\alpha(u, h)}{h} = \lim^{p}_{h\to 0} h\sum_{i=2}^{\infty}\frac{s^{(i)}(u)}{i!}\, h^{i-2} = 0, \]
since $\sum_{i=2}^{\infty}\frac{s^{(i)}(u)}{i!}\, h^{i-2} \in \mathbb{Z}_p$ in view of (5.2), (5.3) and of Theorem 3.4. Moreover,
\[ \left|\frac{\alpha(u, h)}{h}\right|_p = \left|h\sum_{i=2}^{\infty}\frac{s^{(i)}(u)}{i!}\, h^{i-2}\right|_p \le |h|_p \]
for all $u, h \in \mathbb{Z}_p$. Whence, $s$ is uniformly differentiable on $\mathbb{Z}_p$, and $s'$ is a derivative of the function $s$.

From this proposition we immediately deduce the following

Corollary 5.6. The class C of all C-functions is closed with respect to derivations; all C-functions are infinitely many times differentiable.

Now consider Mahler expansions for functions defined by series from $C(x)$: Let
\[ s(x) = \sum_{i=0}^{\infty} s_i\binom{x}{i} \tag{5.5} \]

be an interpolation series for the function $s(x) \in C(x)$ defined by convergent power series (5.1). We note:

Proposition 5.7. All fractions $\frac{s_i}{i!}$ are $p$-adic integers, for all $i = 0, 1, 2, \ldots$.

Proof. Indeed, using formulas for Stirling numbers and falling powers (1) we see that
\[ s(x) = \sum_{k=0}^{\infty} c_k x^k = \sum_{k=0}^{\infty} c_k\sum_{i=0}^{k}{k \brace i}\, i!\binom{x}{i} = \sum_{i=0}^{\infty}\binom{x}{i}\, i!\sum_{k=i}^{\infty}{k \brace i}\, c_k. \tag{5.6} \]
Further, since all Stirling numbers are rational integers, $\left|{k \brace i}\right|_p \le 1$; whence, as power series (5.1) is convergent, $\lim^{p}_{i\to\infty} c_i = 0$, and thus $\lim^{p}_{k\to\infty}{k \brace i}\, c_k = 0$. Consequently, the series $\sum_{k=i}^{\infty}{k \brace i}\, c_k$ converges to a certain $A_i \in \mathbb{Z}_p$, for all $i = 0, 1, 2, \ldots$. This proves our assertion since
\[ s_i = i!\sum_{k=i}^{\infty}{k \brace i}\, c_k = i!\, A_i \quad (i = 0, 1, 2, \ldots), \tag{5.7} \]
see (5.6).

Exercise 5.8. Let $p \ne 2$. Prove that the functions $f(x) = \exp_p(px)$, $f(x) = \sin_p(px)$, $f(x) = \cos_p(px)$ are C-functions (cf. Example 3.3).

Exercise 5.9. Let $p = 2$. Prove that the functions $f(x) = \exp_p(px)$, $f(x) = \sin_p(px)$, $f(x) = \cos_p(px)$ are not C-functions.

Exercise 5.10. Prove that the functions $f(x) = \exp_p(p^2 x)$, $f(x) = \sin_p(p^2 x)$, $f(x) = \cos_p(p^2 x)$, $\ln_p(1 + px)$ are C-functions, for any prime $p$.

Exercise 5.11. Prove that the function f (x) = (1 + px) is a C-function.
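Relation (5.7) between power-series coefficients and Mahler coefficients can be checked directly for polynomials, where all sums are finite. The sketch below is an illustration only; it assumes that the Stirling numbers in (5.6) are the Stirling numbers of the second kind, and the names `stirling2` and `mahler_from_power` are hypothetical.

```python
from math import comb, factorial


def stirling2(k, i):
    """Stirling number of the second kind, via the explicit alternating sum."""
    total = sum((-1) ** (i - j) * comb(i, j) * j ** k for j in range(i + 1))
    return total // factorial(i)  # the sum is always divisible by i!


def mahler_from_power(c):
    """Mahler coefficients s_i = i! * Σ_{k>=i} S(k,i) c_k of the polynomial Σ c_k x^k."""
    n = len(c)
    return [
        factorial(i) * sum(stirling2(k, i) * c[k] for k in range(i, n))
        for i in range(n)
    ]


if __name__ == "__main__":
    coefficients = [0, 2, 0, 1]          # the polynomial x^3 + 2x
    s = mahler_from_power(coefficients)
    print(s)
    print([s_i // factorial(i) for i, s_i in enumerate(s)])  # the integers A_i
```

For $x^3 + 2x$ this gives the Mahler coefficients $0, 3, 6, 6$, each divisible by the corresponding $i!$, in line with Proposition 5.7.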

5.2 Class B

Proposition 5.7 shows that any function defined by a series from $C(x)$ can be represented as a falling power series over $\mathbb{Z}_p$ via falling powers $x^{\underline{i}}$:
\[ s(x) = \sum_{i=0}^{\infty} b_i x^{\underline{i}} \quad \left(b_i = \frac{s_i}{i!} \in \mathbb{Z}_p;\ i = 0, 1, 2, \ldots\right). \]
We now consider a wider class $B(x)$ of falling factorial series with $p$-adic integer coefficients; that is, $f(x) \in B(x)$ if and only if $f(x) = \sum_{i=0}^{\infty} b_i x^{\underline{i}}$, ($b_i \in \mathbb{Z}_p$). In other words,
\[ B(x) = \left\{\sum_{i=0}^{\infty} a_i\binom{x}{i} : \frac{a_i}{i!} \in \mathbb{Z}_p;\ i = 0, 1, 2, \ldots\right\}. \tag{5.8} \]
By the convergence criterion for Mahler interpolation series (see (3.3)), series from $B(x)$ are uniformly convergent on $\mathbb{Z}_p$ and thus define uniformly continuous functions on $\mathbb{Z}_p$, which we call B-functions: as $a_i/i! \in \mathbb{Z}_p$, we have $\lim^{p}_{i\to\infty} a_i = 0$ since $\lim^{p}_{i\to\infty} i! = 0$.

Exercise 5.12. Prove that any B-function is compatible. (Hint: Use Theorem 4.15, Lemma 1.1 and the inequality $\operatorname{wt}_p i \le (p-1)(\lfloor\log_p i\rfloor + 1)$; prove that the latter inequality holds for all $i = 1, 2, \ldots$ and all prime $p$.)

Exercise 5.13. Prove that any B-function is uniformly differentiable on $\mathbb{Z}_p$. (Hint: Use (3.4).)

Denote by B the class of all functions defined by series from $B(x)$. It turns out that distinct series define distinct functions, so in the sequel we do not distinguish series from the functions they define.

Proposition 5.14. Any two distinct series from $B(x)$ (respectively, from $C(x)$) define two distinct functions on $\mathbb{Z}_p$.

Proof. Indeed, for functions defined by series from $B(x)$ the assertion follows from the definition of B-functions in view of Proposition 3.15. As for functions defined by series from $C(x)$, we note that the above mentioned interpolation series (5.5) for $s(x) \in C(x)$ defines a function which is identically 0 on $\mathbb{Z}_p$ if and only if all coefficients $s_i$ are 0. Whence, $A_i = 0$ for $i = 0, 1, 2, \ldots$, see (5.7). However, $A_i = \sum_{k=i}^{\infty}{k \brace i}\, c_k$, thus $c_i = \sum_{k=i}^{\infty}{k \brack i}\, A_k = 0$, and the assertion follows.
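The two arithmetic facts invoked in the hint to Exercise 5.12 can be checked numerically before attempting a proof. The sketch below assumes that Lemma 1.1 is Legendre's formula $\operatorname{ord}_p i! = \frac{1}{p-1}(i - \operatorname{wt}_p i)$, as it is quoted later in the proof of Theorem 5.16; the function names are hypothetical.

```python
from math import factorial


def wt(i, p):
    """Sum of the base-p digits of a non-negative integer i."""
    total = 0
    while i:
        total += i % p
        i //= p
    return total


def ord_p(a, p):
    """p-adic valuation of a positive integer a."""
    k = 0
    while a % p == 0:
        a //= p
        k += 1
    return k


def floor_log(i, p):
    """floor(log_p i) for i >= 1."""
    k = 0
    while i >= p:
        i //= p
        k += 1
    return k


def check_hint(p, bound):
    """Verify Legendre's formula and wt_p(i) <= (p-1)(floor(log_p i) + 1) for 1 <= i < bound."""
    for i in range(1, bound):
        if ord_p(factorial(i), p) * (p - 1) != i - wt(i, p):
            return False
        if wt(i, p) > (p - 1) * (floor_log(i, p) + 1):
            return False
    return True


if __name__ == "__main__":
    print([check_hint(p, 300) for p in (2, 3, 5, 7)])
```

The digit-sum inequality simply says that a number with $\lfloor\log_p i\rfloor + 1$ digits has digit sum at most $p - 1$ per digit.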

5.2.1 Stone-Weierstrass theorem for B-functions

The class B is endowed with the non-Archimedean metric $D_p(f, g) = \max\{|f(z) - g(z)|_p : z \in \mathbb{Z}_p\}$. In other words, we put the distance $D_p$ between two B-functions $f$ and $g$ to be equal to $p^{-N}$ whenever $N$ is the largest natural number such that the corresponding reduced maps $f \bmod p^N$ and $g \bmod p^N$ of the residue ring $\mathbb{Z}/p^N\mathbb{Z}$ coincide point-wise: $f(z) \bmod p^N = g(z) \bmod p^N$ for all $z \in \mathbb{Z}/p^N\mathbb{Z}$.

Exercise 5.15. Prove that $D_p$ is indeed a non-Archimedean metric on the space of all compatible functions defined on $\mathbb{Z}_p$ and taking values in $\mathbb{Z}_p$.
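The second description of $D_p$ suggests a direct computation for compatible functions given as Python callables on integers. In the sketch below the name `agreement_level` and the cut-off `max_n` are hypothetical: the function returns the largest $N \le$ `max_n` such that the two maps agree modulo $p^N$ on all residues, so a return value equal to `max_n` only shows that the distance is at most $p^{-\text{max\_n}}$. Checking residues $0, \ldots, p^N - 1$ suffices only because both functions are assumed compatible.

```python
def agreement_level(f, g, p, max_n):
    """Largest N <= max_n with f(z) ≡ g(z) (mod p^N) for all residues z mod p^N.

    Returns 0 if the functions already differ modulo p.
    Both f and g are assumed to be compatible.
    """
    level = 0
    for n in range(1, max_n + 1):
        modulus = p ** n
        if any((f(z) - g(z)) % modulus != 0 for z in range(modulus)):
            break
        level = n
    return level


if __name__ == "__main__":
    identity = lambda x: x
    perturbed = lambda x: x + 9 * x * x
    n = agreement_level(identity, perturbed, 3, 4)
    print(n, 3.0 ** (-n))  # agreement level and the corresponding distance
```

Here the two polynomials differ by $9x^2$, whose largest $3$-adic absolute value on $\mathbb{Z}_3$ is $3^{-2}$, so the agreement level is 2.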

A B-function is not necessarily analytic on $\mathbb{Z}_p$: Indeed, the function $\sum_{i=0}^{\infty} x^{\underline{i}}$ is a B-function; however, conditions (3.6) are not satisfied (so C is a proper subclass of B). On the other hand, B-functions behave somewhat like real analytic functions: A composition of real analytic functions is again an analytic function; generally this is not the case for $p$-adic analytic functions, see [48, Section 41] for details. Fortunately, for B-functions the following analog of the Stone-Weierstrass theorem holds:

Theorem 5.16 (Stone-Weierstrass theorem for B-functions). The class B is a complete (with respect to $D_p$) metric space. The class B is closed with respect to additions, multiplications, derivations, and compositions of functions. Polynomials with non-negative rational integer coefficients constitute a dense subset of B.

Note. Although Theorem 5.16 can be considered as an analog of the Stone-Weierstrass theorem, it is not the $p$-adic Stone-Weierstrass theorem: The latter theorem states conditions under which the uniform closure of a class of functions defined on a compact set is the class of all continuous functions on the set, see e.g. [48, Appendix A.4]. Furthermore, the $p$-adic Stone-Weierstrass theorem says nothing special about differentiability of the uniform closure; moreover, in contrast to Theorem 5.16, the uniform closure of polynomials over $\mathbb{Q}_p$ contains functions that are not uniformly differentiable, whereas the uniform closure of polynomials over $\mathbb{N}_0$ consists of uniformly differentiable functions only. Theorem 5.16 deals with the uniform closure of a smaller class of functions, the polynomials over $\mathbb{N}_0$ rather than over $\mathbb{Q}_p$; however, the closure is taken with respect to the same metric as in the $p$-adic Stone-Weierstrass theorem. From this point of view, Theorem 5.16 can also be considered as an analog of the $p$-adic Kaplansky theorem (see e.g. [48, Theorem 43.3]), though not for all polynomials over the field $\mathbb{Q}_p$, which the $p$-adic Kaplansky theorem deals with, but only for polynomials over $\mathbb{N}_0$ or, equivalently, over the ring $\mathbb{Z}$. The $p$-adic Kaplansky theorem yields that all continuous functions defined on a compact set can be uniformly approximated (with respect to $D_p$) by polynomials over $\mathbb{Q}_p$, whereas Theorem 5.16 deals with the uniform closure of polynomials over $\mathbb{Z}$ (or, which is equivalent, with the uniform closure of polynomials over $\mathbb{Z}_p$), and all functions of the closure turn out to be uniformly differentiable, and not only continuous.
Finally we note that the classical Weierstrass approximation theorem also yields only that continuous functions defined on a compact subset of $\mathbb{R}$ and taking values in $\mathbb{R}$ can be uniformly approximated by polynomials over $\mathbb{R}$; that is, the limit of a uniformly convergent sequence of polynomials over $\mathbb{R}$ (which are differentiable functions) need not be a differentiable function. However, Theorem 5.16 yields that in the case of polynomials over $\mathbb{Z}_p$ the limit is a uniformly differentiable function. All the above explains why Theorem 5.16 can be derived neither from the $p$-adic Stone-Weierstrass theorem nor from the $p$-adic Kaplansky theorem and needs a special proof.

Proof of Theorem 5.16. First we prove that B-functions are uniformly differentiable on $\mathbb{Z}_p$, and that B is closed with respect to derivations: If $f \in$ B, then $f' \in$ B. Recall that a uniformly continuous function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ that is represented by the interpolation series (3.2) is uniformly differentiable on $\mathbb{Z}_p$ if and only if (3.4) holds for all $n \in \mathbb{N}_0$. Yet the latter condition is obviously true for $f \in$ B since $\operatorname{ord}_p a_i \ge \operatorname{ord}_p i! = \frac{1}{p-1}(i - \operatorname{wt}_p i)$ (see Lemma 1.1), and $\lfloor\log_p i\rfloor \ge \operatorname{ord}_p i$ for all $i = 0, 1, 2, \ldots$. Thus, the derivative $f'$ of the function $f$ is defined everywhere on $\mathbb{Z}_p$, and
\[ f'(x) = \sum_{i=1}^{\infty}(-1)^{i+1}\frac{\Delta^i f(x)}{i}, \]
see (3.5). However, $\frac{\Delta^i f(x)}{i} = \frac{1}{i}\sum_{j=i}^{\infty} a_j\binom{x}{j-i}$; consequently,
\[ \sum_{i=1}^{\infty}(-1)^{i+1}\frac{\Delta^i f(x)}{i} = \sum_{k=0}^{\infty}\binom{x}{k}\sum_{i=1}^{\infty}(-1)^{i+1}\frac{a_{k+i}}{i}. \tag{5.9} \]

Since (3.4) holds, the series $\sum_{i=1}^{\infty}(-1)^{i+1}\frac{a_{k+i}}{i}$ converges for every $k \in \mathbb{N}_0$ to some $S_k \in \mathbb{Q}_p$. Moreover,
\[ \begin{aligned} \operatorname{ord}_p\frac{a_{k+i}}{i} &= \operatorname{ord}_p a_{k+i} - \operatorname{ord}_p i \ge \operatorname{ord}_p (k+i)! - \lfloor\log_p i\rfloor = \frac{1}{p-1}\bigl(i + k - \operatorname{wt}_p(i+k)\bigr) - \lfloor\log_p i\rfloor \\ &= \frac{1}{p-1}(i - \operatorname{wt}_p i) - \lfloor\log_p i\rfloor + \frac{1}{p-1}(k - \operatorname{wt}_p k) + \frac{1}{p-1}\bigl(\operatorname{wt}_p k - \operatorname{wt}_p(i+k) + \operatorname{wt}_p i\bigr) \\ &\ge \frac{1}{p-1}(k - \operatorname{wt}_p k) = \operatorname{ord}_p k!. \end{aligned} \]
The latter inequality holds since $\frac{1}{p-1}(i - \operatorname{wt}_p i) \ge \lfloor\log_p i\rfloor$ and $\frac{1}{p-1}\bigl(\operatorname{wt}_p k - \operatorname{wt}_p(i+k) + \operatorname{wt}_p i\bigr) = \operatorname{ord}_p\binom{i+k}{i} \ge 0$, see Lemma 1.1 and Corollary 1.2. Thus, $\frac{S_k}{k!} \in \mathbb{Z}_p$ for all $k \in \mathbb{N}_0$; whence $f' \in$ B.

Now we prove that B is a closure (with respect to the metric $D_p$) of the class of all functions induced by polynomials with non-negative rational integer coefficients. Since every polynomial from $\mathbb{Z}_p[x]$ is congruent modulo $p^k$ to some polynomial with non-negative rational integer coefficients, it suffices to prove that B is a closure of $\mathbb{Z}_p[x]$ with respect to the metric $D_p$. From the definition of the class B it easily follows that every function $f \in$ B can be uniformly approximated by polynomials over $\mathbb{Z}_p$: For each $n \in \mathbb{N}$ there exists a polynomial $f_n(x) \in \mathbb{Z}_p[x]$ such that $f(z) \equiv f_n(z) \pmod{p^n}$ for all $z \in \mathbb{Z}_p$. Actually, the series $\sum_{j=0}^{\infty} r_j\binom{x}{j}$ defines a function that is identically 0 modulo $p^n$ if and only if all $r_j \equiv 0 \pmod{p^n}$, see Proposition 3.15. So in view of Lemma 1.1 we may put $f_n(x) = \sum_{i=0}^{\omega(n)} a_i\binom{x}{i}$, where $\omega(n) = \max\{j \in \mathbb{N}_0 : \frac{1}{p-1}(j - \operatorname{wt}_p j) < n\}$.

The inverse assertion is also true: Suppose a function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ can be uniformly approximated by polynomials over $\mathbb{Z}_p$ in the above mentioned sense; then $f \in$ B. To prove the assertion, assume that $f(z) \equiv f_i(z) \pmod{p^i}$ for all $z \in \mathbb{Z}_p$, where $f_i(x) \in \mathbb{Z}_p[x]$, $i = 1, 2, \ldots$. Every polynomial $f_i(x)$ of degree $d_i$ has one and only one Mahler expansion (3.2): $f_i(x) = \sum_{j=0}^{d_i} a_{ij}\binom{x}{j}$, where $a_{ij} \in \mathbb{Z}_p$ and $\operatorname{ord}_p a_{ij} \ge \operatorname{ord}_p(j!)$ in view of (5.7), since $f_i \in$ C $\subset$ B. Given a function $f$, every polynomial $f_i(x)$ is uniquely determined up to a summand that is 0 modulo $p^i$ everywhere on $\mathbb{Z}_p$. So we may assume that $d_i = \omega(i)$; then coefficients of the polynomial $f_i(x)$ are determined uniquely up to summands whose $p$-adic absolute values do not exceed $p^{-i}$. This implies that $a_{i+1,j} \equiv a_{ij} \pmod{p^i}$ (we assume $a_{ij} = 0$ for $j > \omega(i)$). Hence, $\lim^{p}_{i\to\infty} a_{ij} = a_j \in \mathbb{Z}_p$, and $\frac{a_j}{j!} \in \mathbb{Z}_p$. Consequently, the series $\sum_{i=0}^{\infty} a_i\binom{x}{i}$ defines a function $\tilde{f} \in$ B, which is uniformly continuous on $\mathbb{Z}_p$. The function $\tilde{f}$ is pointwise equal to $f$ since $f(z) \equiv f_i(z) \equiv \tilde{f}(z) \pmod{p^i}$ for all $z \in \mathbb{Z}_p$ and all $i = 1, 2, \ldots$.

Actually we have proved that B is a complete metric space with respect to the metric $D_p$; from here it follows that B is closed with respect to additions, multiplications and compositions of functions: If $f, g \in$ B then $f + g$, $f \cdot g$, $f(g) \in$ B. Indeed, let $g$ be uniformly approximated by a sequence $\{g_n(x) \in \mathbb{Z}_p[x] : n = 1, 2, \ldots\}$, that is, $g_n(z) \equiv g(z) \pmod{p^n}$ for all $z \in \mathbb{Z}_p$. Now compatibility of the function $f$ implies that $D_p(f(g), f(g_n)) \le p^{-n}$, i.e., that the sequence $f(g_n)$ converges to $f(g)$ with respect to the metric $D_p$ as $n \to \infty$. Yet $f(g_n) \in$ B for every $n = 1, 2, \ldots$: If $f$ is uniformly approximated by a sequence $\{f_m(x) \in \mathbb{Z}_p[x] : m = 1, 2, \ldots\}$, then $f_m(g_n(z)) \equiv f(g_n(z)) \pmod{p^m}$ for all $z \in \mathbb{Z}_p$. Hence, the sequence $\{f_m(g_n(x)) \in \mathbb{Z}_p[x] : m = 1, 2, \ldots\}$ converges to the function $f(g_n)$ with respect to the metric $D_p$, and $f_m(g_n) \in$ B since $f_m(g_n)$ is a polynomial over $\mathbb{Z}_p$. Consequently, $f(g) \in$ B in view of completeness of B with respect to $D_p$.
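The truncation step in the proof above can be made concrete for the B-function $\sum_{i\ge 0} x^{\underline{i}}$ mentioned before Theorem 5.16 (its Mahler coefficients are $a_i = i!$). The sketch below is an illustration under stated assumptions: values are computed only at non-negative integers $z$, where the series is a finite sum, and the names `omega`, `falling`, `full_value`, `truncated_value` are hypothetical.

```python
def ord_p_factorial(j, p):
    """ord_p(j!) by Legendre's formula: Σ_{k>=1} floor(j / p^k)."""
    total = 0
    while j:
        j //= p
        total += j
    return total


def omega(n, p):
    """Largest j with ord_p(j!) < n, i.e. the truncation degree ω(n)."""
    j = 0
    while ord_p_factorial(j + 1, p) < n:
        j += 1
    return j


def falling(z, i):
    """Falling power z (z-1) ... (z-i+1); equals 0 when i > z >= 0."""
    result = 1
    for t in range(i):
        result *= z - t
    return result


def full_value(z):
    """Σ_{i>=0} falling(z, i) at a non-negative integer z (terms vanish for i > z)."""
    return sum(falling(z, i) for i in range(z + 1))


def truncated_value(z, n, p):
    """The approximating polynomial f_n evaluated at z: terms up to degree ω(n)."""
    return sum(falling(z, i) for i in range(min(z, omega(n, p)) + 1))


if __name__ == "__main__":
    p, n = 3, 2
    print(omega(n, p))
    print(all((full_value(z) - truncated_value(z, n, p)) % p ** n == 0 for z in range(60)))
```

Every discarded term $z^{\underline{i}} = i!\binom{z}{i}$ with $i > \omega(n)$ is divisible by $p^n$, which is why the truncated polynomial agrees with the full function modulo $p^n$.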

It is worth noticing here that Theorem 5.16 shows that B may be considered as an analog of a Banach space, that is, of a normed vector space which is complete with respect to the metric induced by the norm. Recall that given a vector space $V$ over the field $\mathbb{Q}_p$, a mapping $v \mapsto \|v\| \in \mathbb{R}$ is called a norm (and $V$ is called a normed (vector) space) if all the following conditions are satisfied:

1. $\|v\| \ge 0$ for all $v \in V$, and $\|v\| = 0$ if and only if $v = 0$;
2. $\|\alpha v\| = |\alpha|_p\|v\|$, for all $v \in V$ and all $\alpha \in \mathbb{Q}_p$;
3. $\|u + v\| \le \max\{\|u\|, \|v\|\}$, for all $u, v \in V$.

Given the norm $\|\cdot\|$ on the normed space $V$, the induced metric is defined in the standard manner: the distance between $u, v \in V$ is $\|u - v\|$ by definition. Recall that a function $f\colon \mathbb{Z}_p \to \mathbb{Q}_p$ is called bounded if $\|f\|_p = \max\{|f(z)|_p : z \in \mathbb{Z}_p\} < +\infty$ (cf. [48, Section 13]).

Exercise 5.17. Prove that $\|\cdot\|_p$ is a norm and that the set $F$ of all bounded functions $f\colon \mathbb{Z}_p \to \mathbb{Q}_p$ is a normed vector space with respect to the norm $\|\cdot\|_p$.

Actually $F$ is a Banach space over $\mathbb{Q}_p$ with respect to $\|\cdot\|_p$, see [48, Proposition 13.2]. The point is that although the set B $\subset F$ is not a subspace of $F$, since B is not a vector space over $\mathbb{Q}_p$, nevertheless B is a submodule of the module $F$ over the ring $\mathbb{Z}_p$, and all the above conditions 1 to 3 are satisfied for $V =$ B and $\alpha \in \mathbb{Z}_p$. Moreover, by Theorem 5.16 the metric space B is complete with respect to the metric $D_p$, which is the metric induced by the norm $\|\cdot\|_p$. So B is indeed an analog of a Banach space, with the only difference that B is a $\mathbb{Z}_p$-module rather than a vector space over the field $\mathbb{Q}_p$.

5.2.2 Taylor theorem for B-functions

Although a B-function is not necessarily analytic on $\mathbb{Z}_p$, it turns out to be analytic on all balls of radii less than 1; that is, all B-functions are locally analytic of order 1, cf. Definition 5.1. Hence the following Taylor theorem for B-functions holds:

Theorem 5.18 (Taylor theorem for B-functions). For every $f \in$ B, $a, h \in \mathbb{Z}_p$ and $k = 1, 2, 3, \ldots$ the following equality holds:
\[ f(a + p^k h) = f(a) + f'(a)\cdot p^k h + \frac{f''(a)}{2!}\cdot p^{2k}h^2 + \frac{f'''(a)}{3!}\cdot p^{3k}h^3 + \cdots \tag{5.10} \]
Moreover, all $\frac{f^{(j)}(a)}{j!}$ are $p$-adic integers, $j = 0, 1, 2, \ldots$.

Proof. The first claim of Theorem 5.18 immediately follows from Theorem 5.3, which obviously holds with $r = 1$ for any B-function $f$ by the definition of the class B, see (5.8). To prove the second claim of the theorem we note that
\[ f^{(n)}(x) = \sum_{k=0}^{\infty}\binom{x}{k}\sum_{i_1, i_2, \ldots, i_n \ge 1}(-1)^{n + i_1 + i_2 + \cdots + i_n}\frac{a_{k + i_1 + i_2 + \cdots + i_n}}{i_1\cdot i_2\cdots i_n}; \]
this can be easily proved by induction on $n$ in view of (3.5) and (5.9). Further,
\[ \sum_{i_1, i_2, \ldots, i_n \ge 1}(-1)^{n + i_1 + i_2 + \cdots + i_n}\frac{a_{k + i_1 + i_2 + \cdots + i_n}}{i_1\cdot i_2\cdots i_n} = \sum_{s=n}^{\infty}\ \sum_{\substack{i_1, i_2, \ldots, i_n \ge 1\\ i_1 + i_2 + \cdots + i_n = s}}(-1)^{n+s}\frac{a_{k+s}}{i_1\cdot i_2\cdots i_n}, \tag{5.11} \]
and $\frac{a_{k+s}}{i_1\cdot i_2\cdots i_n} = \frac{a_{k+s}}{s!}\cdot\frac{s!}{i_1\cdot i_2\cdots i_n} \in \mathbb{Z}_p$ since both $\frac{s!}{i_1\cdot i_2\cdots i_n} = \frac{(i_1 + i_2 + \cdots + i_n)!}{i_1\cdot i_2\cdots i_n} \in \mathbb{Z}$ and $\frac{a_{k+s}}{(k+s)!} \in \mathbb{Z}_p$, see the definition of a B-function (5.8) for the latter. Thus, the sum
\[ \sigma_s = \sum_{\substack{i_1, i_2, \ldots, i_n \ge 1\\ i_1 + i_2 + \cdots + i_n = s}}(-1)^{n+s}\frac{a_{k+s}}{i_1\cdot i_2\cdots i_n} \]
in the right-hand side of (5.11) is a $p$-adic integer. Moreover, as $\frac{a_{k+s}}{i_1\cdot i_2\cdots i_n} = \frac{a_{k+s}}{j_1\cdot j_2\cdots j_n}$ whenever $j_1, j_2, \ldots, j_n$ is a permutation of $i_1, i_2, \ldots, i_n$, the sum $\sigma_s$ is a multiple of $n!$, i.e., $\frac{\sigma_s}{n!} \in \mathbb{Z}_p$. This proves the theorem.

5.2.3 Important subclasses of B

Here we consider some important types of B-functions.

Exponential B-functions

Consider $p$-adic functions of the form $f(x) = u(x)^{v(x)}$ where $u, v\colon \mathbb{Z}_p \to \mathbb{Z}_p$ are compatible functions. The domain of the function $f$ may be smaller than $\mathbb{Z}_p$ (and actually may be empty), to say nothing of compatibility. The following proposition states sufficient conditions for the function $f$ to be compatible and moreover to lie in B:

Proposition 5.19. Let $u, v\colon \mathbb{Z}_p \to \mathbb{Z}_p$ be compatible functions and let $u(z) \equiv 1 \pmod p$ for all $z \in \mathbb{Z}_p$. Then the function $f(z) = u(z)^{v(z)}$ is well defined for all $z \in \mathbb{Z}_p$, integer-valued and compatible. Moreover, if $w, v \in$ B, $u(z) = 1 + p\cdot w(z)$, then $f \in$ B.

Proof. As $u(z) \equiv 1 \pmod p$ for all $z \in \mathbb{Z}_p$, $u(z) = 1 + p\cdot t(z)$ for a suitable function $t\colon \mathbb{Z}_p \to \mathbb{Z}_p$; so $f(z) = \sum_{i=0}^{\infty} p^i t(z)^i\binom{v(z)}{i}$ is a $p$-adic integer since $p^i t(z)^i\binom{v(z)}{i} \in \mathbb{Z}_p$ and $\lim^{p}_{i\to\infty} p^i t(z)^i\binom{v(z)}{i} = 0$. Hence, the function $f$ is well defined on $\mathbb{Z}_p$ and $f$ is integer-valued.

To prove the compatibility of $f$ note that for arbitrary $a, b, c, d \in \mathbb{Z}_p$ with $a \equiv 1 \pmod p$ and $n = 1, 2, \ldots$ one has $(a + p^n b)^{c + p^n d} = (a + p^n b)^c\bigl((a + p^n b)^{p^n}\bigr)^d$ (note that basic properties of powers are of the same form both in real and $p$-adic cases, see e.g. [42, Ch. 14, Section 5]). As both $u$ and $v$ are compatible functions, for arbitrary $z, r \in \mathbb{Z}_p$ there exist $s, t \in \mathbb{Z}_p$ such that $(u(z + p^n r))^{v(z + p^n r)} = (u(z) + p^n t)^{v(z) + p^n s}$; hence
\[ (u(z + p^n r))^{v(z + p^n r)} = (u(z) + p^n t)^{v(z)}\bigl((u(z) + p^n t)^{p^n}\bigr)^s \equiv (u(z) + p^n t)^{v(z)} \pmod{p^n} \]
since $(u(z) + p^n t)^{p^n} \equiv 1 \pmod{p^n}$. Here is a proof of the latter congruence: As $u(z) \equiv 1 \pmod p$, for a suitable $k \in \mathbb{Z}_p$ we have $u(z) + p^n t = 1 + pk$; yet
\[ (1 + pk)^{p^n} = \sum_{i=0}^{p^n} k^i p^i\binom{p^n}{i} = \sum_{i=0}^{p^n} k^i\,\frac{p^i}{i!}\,(p^n)^{\underline{i}} \equiv 1 \pmod{p^n} \]
since $\frac{p^i}{i!} \in \mathbb{Z}_p$ in view of Lemma 1.1. Finally, denoting by $\bar{v}(z) = v(z) \bmod p^n$ the least nonnegative residue of $v(z)$ modulo $p^n$, for a suitable $h \in \mathbb{Z}_p$ we obtain
\[ \begin{aligned} f(z + p^n r) &\equiv (u(z) + p^n t)^{v(z)} = (u(z) + p^n t)^{\bar{v}(z)}\bigl((u(z) + p^n t)^{p^n}\bigr)^h \equiv (u(z) + p^n t)^{\bar{v}(z)} \\ &= \sum_{i=0}^{\bar{v}(z)} u(z)^{\bar{v}(z) - i} p^{ni} t^i\binom{\bar{v}(z)}{i} \equiv (u(z))^{\bar{v}(z)} \equiv (u(z))^{\bar{v}(z)}\bigl((u(z))^{p^n}\bigr)^h = (u(z))^{v(z)}, \end{aligned} \]
where $\cdot \equiv \cdot$ stands for $\cdot \equiv \cdot \pmod{p^n}$. This proves that $f$ is compatible.

To prove the rest of the proposition note that for every $z \in \mathbb{Z}_p$ and every $n = 1, 2, \ldots$ the congruence $(u(z))^{v(z)} \equiv \sum_{i=0}^{n-1}(u(z) - 1)^i\binom{v(z)}{i} \pmod{p^n}$ holds since $|u(z) - 1|_p \le \frac{1}{p}$. This implies that

• in view of Theorem 5.16, all functions $f_n = \sum_{i=0}^{n-1}\frac{p^i}{i!}\, v^{\underline{i}}\, w^i$ are in B since all fractions $\frac{p^i}{i!}$ are $p$-adic integers, see Lemma 1.1;

• the sequence $\{f_n : n = 1, 2, \ldots\}$ converges to $f$ with respect to the metric $D_p$.

From here it follows that $f \in$ B by Theorem 5.16.

Exercise 5.20. Given $a \in \mathbb{Z}_p$, $a \equiv 1 \pmod p$, prove that the function $f(x) = a^x$ is a B-function; prove that $f$ is even a C-function with the only exception of the case when $p = 2$ and $a \not\equiv 1 \pmod 4$.

Rational B-functions

A rational function over $\mathbb{Z}_p$ is a function $f(x) = \frac{g(x)}{u(x)}$, where $g(x), u(x)$ are polynomials with $p$-adic integer coefficients. From Proposition 5.19 it immediately follows that $f$ is a B-function once $u(x) = 1 + pw(x)$ where $w$ is a polynomial over $\mathbb{Z}_p$. However, the latter assertion can be slightly generalized:

Proposition 5.21. The rational function $f(x) = \frac{g(x)}{u(x)}$ is a B-function if the denominator $u(x)$ vanishes modulo $p$ nowhere on $\mathbb{Z}_p$ (that is, if $u(z) \not\equiv 0 \pmod p$ for all $z \in \mathbb{Z}_p$).

Exercise 5.22. Prove Proposition 5.21. (Hint: Prove that $\frac{g(z)}{u(z)} \equiv g(z)\,u(z)^{\varphi(p^n) - 1} \pmod{p^n}$, where $\varphi$ is Euler's totient function; then apply Theorem 5.16.)
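Both constructions of this subsection can be evaluated modulo $p^n$ with ordinary modular arithmetic, which gives a quick way to test statements such as Proposition 5.19 and the hint to Exercise 5.22 on examples. The sketch below is a hedged illustration: it works only at non-negative integer arguments, uses Python's built-in three-argument `pow`, and the names `power_value_mod` and `rational_value_mod` are hypothetical.

```python
def power_value_mod(u, v, z, p, n):
    """u(z)^v(z) mod p^n for integer-valued u, v with v(z) >= 0."""
    return pow(u(z), v(z), p ** n)


def rational_value_mod(g, u, z, p, n):
    """g(z)/u(z) mod p^n, computed as g(z) * u(z)^(phi(p^n) - 1); requires p not dividing u(z)."""
    modulus = p ** n
    phi = modulus - modulus // p  # Euler's totient of p^n
    return g(z) * pow(u(z), phi - 1, modulus) % modulus


if __name__ == "__main__":
    p, n = 3, 3
    u = lambda x: 1 + p * x      # u(z) ≡ 1 (mod p) for every z
    v = lambda x: x
    g = lambda x: x * x + 1

    # Compatibility of z -> u(z)^v(z): shifting z by p^n does not change the residue.
    print(all(
        power_value_mod(u, v, z + r * p ** n, p, n) == power_value_mod(u, v, z, p, n)
        for z in range(p ** n) for r in range(1, 4)
    ))
    # The modular quotient really inverts the denominator.
    print(all(
        (u(z) * rational_value_mod(g, u, z, p, n) - g(z)) % p ** n == 0
        for z in range(p ** n)
    ))
```

The second check relies on Euler's theorem: $u(z)^{\varphi(p^n)} \equiv 1 \pmod{p^n}$ whenever $u(z)$ is not divisible by $p$.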

5.3 Class A

Some important functions (for instance, some compatible integer-valued polynomials over $\mathbb{Q}_p$; i.e., polynomials that do not necessarily have p-adic integer coefficients yet map $\mathbb{Z}_p$ into itself and satisfy the Lipschitz condition with constant 1 everywhere on $\mathbb{Z}_p$) do not lie in B. However, they lie in a wider class A:

Definition 5.23. A function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ lies in A (and is said to be an A-function) if and only if $f$ is compatible (i.e., satisfies the Lipschitz condition with constant 1) and $p^n f \in B$ for some non-negative rational integer $n$.

Now, since $f = \frac{1}{p^n}\, g$ for a suitable B-function $g$ and a suitable non-negative rational integer $n$, from Theorem 5.18 we immediately conclude that the Taylor theorem holds for every A-function $f$ in the following form:

Theorem 5.24 (Taylor theorem for A-functions). For every $f \in A$, $a, h \in \mathbb{Z}_p$ and $k = 1, 2, 3, \ldots$, the function $f(a + p^k h)$ in the variable $h$ can be represented via the convergent Taylor series

$$f(a + p^k h) = f(a) + f'(a) \cdot p^k h + \frac{f''(a)}{2!} \cdot p^{2k} h^2 + \frac{f'''(a)}{3!} \cdot p^{3k} h^3 + \cdots. \qquad (5.12)$$

Note that the coefficients $\frac{f^{(j)}(a)}{j!}$ are not necessarily p-adic integers now; however, in view of the second claim of Theorem 5.18, $\left|\frac{f^{(j)}(a)}{j!}\right|_p \le p^n$ for all $j = 1, 2, \ldots$.

Moreover, $f'(a)$ is still a p-adic integer, cf. Example 2.26.

The most important examples of A-functions that are not necessarily B-functions are compatible integer-valued polynomials over $\mathbb{Q}_p$, i.e., functions of the form $f(x) = \sum_{i=0}^{d} a_i\, p^{\lfloor \log_p i \rfloor} \binom{x}{i}$, where $a_i \in \mathbb{Z}_p$, $i = 0, 1, 2, \ldots$. This example stresses the importance of A-functions: in view of Theorem 4.15 and Proposition 3.15, every compatible function can be uniformly approximated on $\mathbb{Z}_p$ (with respect to the metric $D_p$) by A-functions.

Exercise 5.25. Prove that the function $f(x) = \frac{(x^p - x)^2}{p}$ is an A-function and not a B-function.

Exercise 5.26. Characterize A-functions in terms of Mahler expansion. (Hint: Prove that any A-function is of the form $\frac{g(x)}{p^n}$, where $g \in B$ and $g$ is an identity modulo $p^n$; then apply Proposition 3.15.)

Part II

The p-adic ergodic theory


Chapter 6

Ergodic theory: basic notions and facts

Now we are in a position to develop our main topic, the p-adic ergodic theory, meaning the ergodic theory for functions that are defined on, and take values in, the space of p-adic integers. Ergodic theory is a part of dynamics, the theory of dynamical systems. Therefore we need to introduce the main notions of both theories.

6.1

Dynamical systems: basic notions

In this book, by a (discrete) dynamical system $\mathfrak{S}$ on a phase space (or configuration space) $\mathbb{S}$ we understand a triple $\langle \mathbb{S}; \mu; f \rangle$, where $\mathbb{S}$ is a measure space endowed with a measure $\mu$ and $f : \mathbb{S} \to \mathbb{S}$ is a measurable map; that is, the $f$-preimage $f^{-1}(S)$ of any $\mu$-measurable subset $S \subset \mathbb{S}$ is a $\mu$-measurable subset of $\mathbb{S}$. We note that topological dynamical systems are also considered; the latter are pairs $\langle X; f \rangle$, where $X$ is a topological space and $f$ is a continuous transformation of $X$. In the cases we study in the book, dynamical systems are also topological, since configuration spaces are not only measure spaces but also metric spaces, and the corresponding transformations are not only measurable but also continuous.

The orbit (or trajectory) of a point $x_0$ of the dynamical system $\mathfrak{S}$ is the sequence $x_0 = f^0(x_0)$, $x_1 = f(x_0)$, $x_2 = f(x_1) = f^2(x_0)$, $\ldots$, $x_i = f(x_{i-1}) = f^i(x_0)$, $\ldots$ of points of the space $\mathbb{S}$; that is, the orbit of the point $x_0$ is just the sequence of iterates $(f^i(x_0))_{i=0}^{\infty}$. The point $x_0$ is called the initial point of the orbit $(f^i(x_0))_{i=0}^{\infty}$. The point $x_0$ is called (eventually) periodic if its orbit is (eventually) periodic; eventually periodic points are also called pre-periodic. If the orbit $(f^i(x_0))_{i=0}^{\infty}$ has a period of length $r$, the point $x_0$ is called an $r$-periodic point. A 1-periodic point is called a fixed point.
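The notions of orbit, pre-period and period can be illustrated on a finite system. The following Python sketch is not part of the original text: the function name `orbit_structure` and the example map (squaring modulo 16) are assumptions chosen only for illustration. It records the index at which each orbit point first appears and stops at the first repetition.

```python
def orbit_structure(f, x0):
    """Return (preperiod, period) of the orbit x0, f(x0), f(f(x0)), ...

    Assumes the orbit is eventually periodic, which always holds
    when the phase space is finite.
    """
    first_seen = {}              # point -> index of its first appearance
    x, i = x0, 0
    while x not in first_seen:
        first_seen[x] = i
        x, i = f(x), i + 1
    preperiod = first_seen[x]    # index where the cycle is entered
    period = i - preperiod       # length of the cycle
    return preperiod, period


def square_mod_16(x):
    # Hypothetical example map on the finite phase space {0, ..., 15}.
    return (x * x) % 16


print(orbit_structure(square_mod_16, 2))   # orbit 2, 4, 0, 0, ...: pre-periodic
print(orbit_structure(square_mod_16, 1))   # 1 is a fixed point
```

A point is periodic exactly when the returned pre-period is 0, and it is a fixed point when, in addition, the period is 1.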

If $F : \mathbb{S} \to \mathbb{T}$ is a measurable map of $\mathbb{S}$ to some other measure space $\mathbb{T}$ endowed with a measure $\nu$ (that is, if the $F$-preimage of any $\nu$-measurable subset of $\mathbb{T}$ is a $\mu$-measurable subset of $\mathbb{S}$), the sequence $F(x_0), F(x_1), F(x_2), \ldots$ is called the observable. The map $F : \mathbb{S} \to \mathbb{T}$ is said to be measure-preserving if $\mu(F^{-1}(S)) = \nu(S)$ for each measurable subset $S \subset \mathbb{T}$. In this case, if in addition $\mathbb{S} = \mathbb{T}$ and $\mu = \nu$, the measure $\mu$ is said to be invariant with respect to $F$ (or simply invariant when it is clear from the context which $F$ is meant). In the case when $\mathbb{S} = \mathbb{T}$ and $\mu = \nu$, a measure-preserving map $F$ is said to be ergodic if for each measurable subset $S$ such that $F^{-1}(S) = S$ either $\mu(S) = 1$ or $\mu(S) = 0$ holds. A measurable subset $S \subset \mathbb{S}$ is called an invariant subset of the map $F : \mathbb{S} \to \mathbb{S}$ (or $F$-invariant) if $F^{-1}(S) = S$; so ergodicity of the map $F$ just means that $F$ has no proper invariant subsets, that is, invariant subsets whose measure is neither 0 nor 1.

Note that further in the book an ergodic map is assumed to be measure-preserving; in general ergodic theory, however, an ergodic map need not be measure-preserving by definition, although it is often assumed to be. Nonetheless, in general ergodic theory, even if an ergodic map is not measure-preserving, it is most often assumed to be surjective, that is, to be an 'onto' map. However, in the case when $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ is a 1-Lipschitz surjective map that has no proper invariant subsets with respect to the natural probability measure $\mu_p$ (which is introduced further in Subsection 6.2.1), $F$ is necessarily measure-preserving: this is a direct consequence of Proposition 7.6 (as well as of the results of Subsection 7.2 for the case $m = n$), as a surjective map of a finite set onto itself is necessarily bijective.

Ergodic theory

Basic questions in general ergodic theory are: How often does an orbit visit a given region (that is, a given measurable subset $S \subset \mathbb{S}$)? What is the frequency with which the orbit of $x = x_0$ hits $S$? For instance, when is the frequency equal to the probability that a randomly chosen point of $\mathbb{S}$ lies in $S$? That is, if we denote by $\chi(S, x)$ the characteristic function of $S$ (i.e., $\chi(S, x) = 1$ if and only if $x \in S$; otherwise $\chi(S, x) = 0$), when does the following condition hold?

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=0}^{N-1} \chi(S, x_i) = \mu(S). \qquad (6.1)$$
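Condition (6.1) can be checked empirically on a finite model. The sketch below is an illustration only, not the book's construction: the helper `visit_frequency`, the shift map $x \mapsto x + 1$ on residues modulo $2^4$, and the chosen subset are all assumptions. Exact rational arithmetic is used so that the comparison with the measure involves no rounding.

```python
from fractions import Fraction


def visit_frequency(f, x0, subset, n_steps):
    """Fraction of the orbit points x_0, ..., x_{N-1} (N = n_steps) lying in subset."""
    hits, x = 0, x0
    for _ in range(n_steps):
        hits += x in subset      # the characteristic function chi(S, x_i)
        x = f(x)
    return Fraction(hits, n_steps)


MODULUS = 2 ** 4                 # hypothetical finite phase space Z/16Z


def shift(x):
    return (x + 1) % MODULUS


# Residues congruent to 1 modulo 4: a "ball" in the finite model.
ball = {x for x in range(MODULUS) if x % 4 == 1}
measure = Fraction(len(ball), MODULUS)

# Ten full periods of the shift: the frequency equals the measure exactly.
freq = visit_frequency(shift, 0, ball, 10 * MODULUS)
print(freq, measure)
```

For a finite transitive map the average over any whole number of periods already equals the measure, so no limit needs to be taken in this toy case.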

In the book, we call a sequence $(x_i)_{i=0}^{\infty}$ uniformly distributed if (6.1) holds for all measurable subsets $S$ of $\mathbb{S}$. A word of caution: this definition must not be considered as a definition of uniform distribution on an arbitrary measure space $\mathbb{S}$ with an arbitrary probability measure $\mu$! Some restrictions must be imposed both on the measurable sets $S$ and on the measure $\mu$. However, all these restrictions are satisfied for all dynamical systems we consider in the book, so we are free to use the above definition of uniform distribution throughout the book. Readers interested in further detail on these restrictions, as well as in the theory of uniform distribution of sequences, are referred to [40].

One of the fundamental results of ergodic theory states that once the function $f$ is ergodic, the orbit $(f^i(x_0))_{i=0}^{\infty}$ is uniformly distributed. Again, to make this assertion true in general, some additional restrictions must be imposed (see [40]). However, for the dynamical systems we consider further, the ones on finite spaces and on $\mathbb{Z}_p^n$, the restrictions are satisfied; so we use the just stated assertion without limitations.

In the book, we mainly focus on the following questions: given a map $f : \mathbb{S} \to \mathbb{T}$, is it measure-preserving? Is it ergodic? These questions are crucial with respect to the applied problems we consider in Part III of the book.

Topological transitivity

As mentioned at the very beginning of the chapter, the dynamical systems we mostly consider in the book are also topological, since configuration spaces are not only measure spaces but also metric spaces, and the corresponding transformations are not only measurable but also continuous. In the theory of topological dynamical systems there is an important counterpart of the notion of ergodicity, the topological transitivity.

Definition 6.1.
Given a topological space $X$ and a continuous mapping $F : X \to X$, the mapping $F$ (respectively, the dynamical system $\langle X; F \rangle$) is called topologically transitive if there exists a dense orbit of $F$; that is, if there exists $x \in X$ such that the set of iterates $\{F^i(x) : i \in \mathbb{N}_0\}$ is everywhere dense in $X$. The system $\langle X; F \rangle$ is called minimal if every orbit is dense.

It is worth noticing here that once a configuration space is endowed both with a metric (thus, with a topology) and with a measure, even if the metric and the measure somehow agree, the properties of ergodicity and of minimality may be quite different. Fortunately, in the cases we consider further in the book, ergodicity and minimality turn out to be equivalent, since we deal with isometries of compact metric spaces. The following very general proposition is true (see e.g. [30, Corollary 4.3.6]):

Proposition 6.2. A minimal isometry of a compact space is uniquely ergodic.

Unique ergodicity means that there exists a unique (up to multiplication by a constant) invariant measure. In the cases considered further in the book the invariant measure is a probability measure; namely, a Haar measure normalized so that the measure of the whole space is 1. Therefore in many further proofs of ergodicity we actually prove minimality of the corresponding dynamical systems; this implies ergodicity of the systems by Proposition 6.2.

6.1.1

Finite dynamics

Now we consider the notions we have just introduced for a very special (however, important) case when the dynamical system $\mathfrak{S} = \langle \mathbb{S}; \mu; f \rangle$ is finite, that is, the configuration space $\mathbb{S}$ contains only finitely many points. The total number of these points (that is, the cardinality of the set $\mathbb{S}$) is denoted by $\#\mathbb{S}$. Of course, when the dynamics is finite, all points are eventually periodic.

Now consider measure-preservation and ergodicity for finite dynamical systems. Let $\mathbb{S}$, $\mathbb{T}$ be finite. For $S \subset \mathbb{S}$ and $T \subset \mathbb{T}$ put

$$\mu(S) = \frac{\#S}{\#\mathbb{S}}, \qquad \nu(T) = \frac{\#T}{\#\mathbb{T}}.$$

That is, $\mu$, $\nu$ are "standard" probability measures on $\mathbb{S}$, $\mathbb{T}$, respectively. Let $F : \mathbb{S} \to \mathbb{T}$ be a surjective (that is, an "onto") map. It is easy to prove that the following is true:

1. The map $F$ is measure-preserving if and only if $F$ is balanced, i.e., if and only if the number $\#F^{-1}(t)$ of $F$-pre-images of the point $t \in \mathbb{T}$ does not depend on the point $t$: $\#F^{-1}(t) = \frac{\#\mathbb{S}}{\#\mathbb{T}}$ for all $t \in \mathbb{T}$.

2. In particular, in the case when $\mathbb{S} = \mathbb{T}$ and $\mu = \nu$, the map $F$ is measure-preserving if and only if $F$ is bijective, i.e., if and only if $F$ is a permutation on $\mathbb{S}$.

3. Finally, $F$ is ergodic if and only if $F$ is transitive on $\mathbb{S}$, i.e., if and only if $F$ is a cycle of length $\#\mathbb{S}$ (that is, $F$ cyclically permutes the points of $\mathbb{S}$).

It is clear that if a map $f : \mathbb{S} \to \mathbb{S}$ is transitive, then each of its orbits is uniformly distributed, since the orbit is a periodic sequence such that the length of its shortest period is $\#\mathbb{S}$, and every element of $\mathbb{S}$ appears in the period exactly once. Sequences having this property are called strictly uniformly distributed.

Exercise 6.3. Prove assertions 1–3.
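The three finite notions above (balance, bijectivity, transitivity) are straightforward to test by brute force. The following Python sketch is an illustration under stated assumptions: the function names and the example maps on $\mathbb{Z}/16\mathbb{Z}$ are hypothetical and do not come from the text.

```python
from collections import Counter


def is_balanced(F, domain, codomain):
    """True if every point of codomain has exactly #domain / #codomain pre-images."""
    counts = Counter(F(s) for s in domain)
    expected, remainder = divmod(len(domain), len(codomain))
    return remainder == 0 and all(counts[t] == expected for t in codomain)


def is_bijective(F, domain):
    """True if F permutes the finite set domain."""
    return sorted(F(s) for s in domain) == sorted(domain)


def is_transitive(F, domain):
    """True if F is a single cycle of length #domain."""
    points = list(domain)
    start, x, steps = points[0], F(points[0]), 1
    while x != start and steps < len(points):
        x, steps = F(x), steps + 1
    return x == start and steps == len(points)


Z16 = range(16)
affine = lambda x: (5 * x + 1) % 16     # hypothetical example: a full cycle
triple = lambda x: (3 * x) % 16         # a permutation with the fixed point 0
mod4 = lambda x: x % 4                  # projection of Z/16Z onto Z/4Z

print(is_transitive(affine, Z16), is_bijective(triple, Z16),
      is_transitive(triple, Z16), is_balanced(mod4, Z16, range(4)))
```

The map `triple` shows that bijectivity (measure-preservation) does not imply transitivity (ergodicity), while `mod4` is balanced without being a map of a set into itself.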

Exercise 6.4. Given a finite dynamical system $\mathfrak{S} = \langle \mathbb{S}; \mu; f \rangle$, where $\mu$ is a "standard" probability measure as above, consider the graph $G(f)$ of the map $f$: the vertices of $G(f)$ are the points of $\mathbb{S}$; vertices $a, b \in \mathbb{S}$ are connected by an edge that goes from $a$ to $b$ if and only if $b = f(a)$. Characterize $f$-invariant subsets in terms of $G(f)$.

Heritable dynamical properties

Balanced, bijective and transitive maps may under certain conditions be 'projected' onto smaller domains so that the 'images' of these maps remain balanced, bijective, transitive, respectively. We consider these conditions for the case of maps of finite rings.

Let $A$, $B$ be rings, and let $\varphi : A \to B$ be a ring epimorphism. Recall (see [40]) that a mapping $f : A \to A$ is called compatible with the epimorphism $\varphi$ if there exists a commutative closure $\varphi(f)$ of the following diagram:

$$\begin{array}{ccc} A & \xrightarrow{\;\;f\;\;} & A \\ \big\downarrow{\scriptstyle\varphi} & & \big\downarrow{\scriptstyle\varphi} \\ B & \xrightarrow{\;\varphi(f)\;} & B \end{array}$$

Here by definition $(\varphi(f))(b) = \varphi(f(a))$ for $b \in B$, where $a \in A$ is a $\varphi$-pre-image of $b$: $a \in \varphi^{-1}(b)$. In other words, the compatibility of $f$ with respect to the epimorphism $\varphi$ means that the map $\varphi(f)$ is well defined: given arbitrary $x, y \in A$ such that their $\varphi$-images coincide, $\varphi(x) = \varphi(y)$, one has $\varphi(f(x)) = \varphi(f(y))$. So each compatible transformation on $A$ defines a unique transformation on each epimorphic image of $A$. As each epimorphism $\varphi$ of $A$ defines a unique ideal $I_\varphi$ of $A$, the kernel of $\varphi$, and vice versa (see Subsection 0.2.1), we will also say that $f$ is compatible with respect to the ideal $I_\varphi$. We will say that $f$ has some property P modulo the ideal $I_\varphi$ or, which is the same, modulo the epimorphism $\varphi$ if the mapping $\varphi(f)$ induced by $f$ on the epimorphic image $B$ of $A$ has the property P.

The notion of compatibility with an epimorphism can be defined for multivariate maps as well: if $F : A^n \to A^m$, $m \le n$, is a map from the $n$-th Cartesian power of $A$ onto its $m$-th Cartesian power (in ring theory, Cartesian products are usually called direct sums), and if $I$ is an ideal in $A$, then $I^n$, $I^m$ are ideals in $A^n$, $A^m$, respectively; so the map $F$ is called compatible with respect to the ideal $I$ of $A$ if there exists a commutative closure $\varphi(F)$ of the following diagram:

$$\begin{array}{ccc} A^n & \xrightarrow{\;\;F\;\;} & A^m \\ \big\downarrow{\scriptstyle\varphi} & & \big\downarrow{\scriptstyle\varphi} \\ B^n & \xrightarrow{\;\varphi(F)\;} & B^m \end{array}$$
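Compatibility with an ideal can be tested directly for maps of a residue ring. The sketch below is a hedged illustration: it takes $A = \mathbb{Z}/N\mathbb{Z}$ and the ideal generated by a divisor $d$ of $N$ (so that $A/I \cong \mathbb{Z}/d\mathbb{Z}$), and the names `is_compatible`, `induced_map`, `poly` and `half` are hypothetical.

```python
def is_compatible(f, modulus, divisor):
    """True if x = y (mod divisor) implies f(x) = f(y) (mod divisor) on Z/modulus.

    It suffices to compare each x with its least non-negative residue x % divisor.
    """
    if modulus % divisor != 0:
        raise ValueError("divisor must divide modulus")
    return all((f(x) - f(x % divisor)) % divisor == 0 for x in range(modulus))


def induced_map(f, divisor):
    """Map induced on Z/divisor by a compatible f, computed on representatives."""
    return lambda b: f(b) % divisor


poly = lambda x: (x * x + 3 * x + 1) % 16   # integer polynomial: compatible
half = lambda x: (x // 2) % 16              # right shift of bits: not compatible

print([is_compatible(poly, 16, d) for d in (2, 4, 8)])
print(is_compatible(half, 16, 2))
print([induced_map(poly, 4)(b) for b in range(4)])
```

For a compatible map the induced map does not depend on the choice of representatives, which is exactly the statement that the diagram above has a commutative closure.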

Note that we denote the epimorphisms $A^n \to B^n$ and $A^m \to B^m$ with kernels $I^n$, $I^m$, respectively, by the same symbol $\varphi$ just to avoid complicated notation. We will also say that $F$ has some property P modulo $I$ (modulo $\varphi$) if the map $\varphi(F)$ has the property P. The following proposition holds:

Proposition 6.5. Let $A$ be a finite ring, let $I$ be an ideal of $A$, and let $F : A^n \to A^m$ (where $m \le n$) be a balanced (resp., bijective, transitive) compatible (with respect to $I$) mapping of the $n$-th Cartesian power $A^n$ onto the $m$-th Cartesian power $A^m$ of the ring $A$. Then $F$ is balanced (resp., bijective, transitive) modulo $I$. Further, if $k = |A : I|$ is the index of $I$ in $A$ (that is, the order of the factor-ring $A/I$ of the ring $A$ by the ideal $I$), then the mapping $F : A^n \to A^n$ is transitive if and only if $F$ is transitive modulo $I$ and the iterated mapping $F^{k^n} : I^n \to I^n$ is transitive on $I^n$. Moreover, if $A$ is a direct product of rings $B$ and $C$, $A = B \times C$, then $F$ is balanced on $A$ if and only if $F$ is balanced both on $B$ and $C$, i.e., modulo each projection onto a direct factor. Finally, the mapping $F : A \to A$ is transitive if and only if it is transitive both on $B$ and $C$ and the orders $\#B$ and $\#C$ are coprime.

Proof. Choose an arbitrary element $c \in A^m$ and consider the following inclusion:

$$F(x_1 + I, \ldots, x_n + I) \subseteq c + I^m. \qquad (6.2)$$

Let $S \subseteq A^n$ be an arbitrary complete system of representatives of cosets with respect to the ideal $I^n$; that is, $S$ is a set that contains one and only one element of each coset $h + I^n$. Let $t$ be the number of elements of $S$ that satisfy (6.2). Consider the inclusion

$$F(a_1, \ldots, a_n) \in c + I^m. \qquad (6.3)$$

If $x = (x_1, \ldots, x_n) \in S$ satisfies (6.2), then each element $(a_1, \ldots, a_n)$ of the coset $(x_1, \ldots, x_n) + I^n$ satisfies (6.3), since $F$ is compatible. Thus, the number of elements of $A^n$ that satisfy (6.3) is exactly $t \cdot (\#I)^n$. On the other hand, let $F$ be balanced. Then for each $d \in c + I^m$ the equation $F(a_1, \ldots, a_n) = d$ has exactly $(\#A)^{n-m}$ solutions in $A^n$, and consequently there exist exactly $(\#A)^{n-m} \cdot (\#I)^m$ elements of $A^n$ that satisfy (6.3). In view of the argument above this implies that $(\#A)^{n-m} \cdot (\#I)^m = t \cdot (\#I)^n$. Hence, $t = (\#(A/I))^{n-m}$. Thus, $t$ does not depend on the choice of $c$ and, consequently, $F$ induces a balanced mapping of the factor-ring $(A/I)^n$ onto the factor-ring $(A/I)^m$. The reader is encouraged to prove the other assertions of the proposition.

Exercise 6.6. Complete the proof of Proposition 6.5.


6.2

Dynamics on $\mathbb{Z}_p^n$

The configuration spaces of the dynamical systems we focus on in the book are the metric spaces $\mathbb{Z}_p^n$. These spaces are endowed with a natural probability measure.

6.2.1

Probability measure on $\mathbb{Z}_p^n$

The ring $\mathbb{Z}_p$ can be endowed with a probability measure $\mu_p$, thus becoming a probability space. This measure is a normalized Haar measure: the base of the corresponding σ-algebra of measurable subsets of $\mathbb{Z}_p$, the elementary measurable subsets, are all balls of non-zero radii. That is, every element of the σ-algebra, i.e., every measurable subset of $\mathbb{Z}_p$, can be constructed from the elementary measurable subsets by taking complements and countable unions. We put $\mu_p(B_{p^{-\ell}}(a)) = p^{-\ell}$. We will sometimes refer to this probability measure on $\mathbb{Z}_p$ as the p-adic measure.

Exercise 6.7. Prove that, given $r \in \{0, 1, \ldots, p-1\}$, the set $R_k(r) = \{z \in \mathbb{Z}_p : \delta_k(z) = r\}$ is measurable. What is $\mu_p(R_k(r))$?

Exercise 6.8. Prove that the measure of a point is zero.

Therefore, if a measurable subset $S \subset \mathbb{Z}_p$ is a countable union of pairwise disjoint balls $B_i$ of radii $r_i$, $S = \bigcup_{i=0}^{\infty} B_i$, then

$$\mu_p(S) = \sum_{i=0}^{\infty} r_i \qquad (6.4)$$

and the series on the right-hand side of (6.4) converges.

In a similar manner we define a probability measure $\mu_p$ on $\mathbb{Z}_p^n$: for an $n$-dimensional ball $B_{p^{-\ell}}(a) \subset \mathbb{Z}_p^n$ we put $\mu_p(B_{p^{-\ell}}(a)) = p^{-\ell n}$. Recall that in Section 2.2 we defined the (ultra)metric on $\mathbb{Q}_p^n$ as follows: $|a - b|_p = \max\{|a_i - b_i|_p : i = 1, 2, \ldots, n\}$ for $a = (a_1, \ldots, a_n)$, $b = (b_1, \ldots, b_n) \in \mathbb{Q}_p^n$.

We recall (see e.g. [40]) that if a measure space $\mathbb{S}$ endowed with a probability measure $\mu$ is also a topological space, the measure $\mu$ is called Borel if all Borel sets in $\mathbb{S}$ are $\mu$-measurable. Recall that a Borel set is any element of the σ-algebra generated by all open subsets of $\mathbb{S}$; that is, a Borel subset can be constructed from open subsets with the use of complements and countable unions. Recall further that a probability measure $\mu$ is called regular (see e.g. [40]) if

$$\mu(E) = \sup\{\mu(C) : C \subseteq E,\ C \text{ closed}\} = \inf\{\mu(D) : E \subseteq D,\ D \text{ open}\} \qquad (6.5)$$

for all Borel sets $E$ in $\mathbb{S}$.

Exercise 6.9. Prove that $\mu_p$ is Borel and regular.
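The metric on $\mathbb{Q}_p^n$ and the measure of a ball can be made concrete for points with rational integer coordinates. The following Python sketch is only an illustration under that restriction; the function names (`valuation`, `norm`, `distance`, `ball_measure`) are assumptions, not notation from the text.

```python
from fractions import Fraction


def valuation(x, p):
    """p-adic valuation of a non-zero rational integer x."""
    if x == 0:
        raise ValueError("the valuation of 0 is infinite")
    v = 0
    while x % p == 0:
        x, v = x // p, v + 1
    return v


def norm(x, p):
    """p-adic absolute value |x|_p of a rational integer x."""
    return Fraction(0) if x == 0 else Fraction(1, p ** valuation(x, p))


def distance(a, b, p):
    """|a - b|_p = max_i |a_i - b_i|_p for integer vectors a, b of equal length."""
    return max(norm(ai - bi, p) for ai, bi in zip(a, b))


def ball_measure(level, n, p):
    """mu_p of a ball of radius p^(-level) in Z_p^n, i.e. p^(-level * n)."""
    return Fraction(1, p ** (level * n))


print(distance((1, 10), (10, 1), 3))   # both coordinates differ by a multiple of 9
print(3 ** 2 * ball_measure(1, 2, 3))  # the 9 balls of radius 1/3 cover Z_3^2
```

The second line reflects additivity: $\mathbb{Z}_3^2$ is the disjoint union of $3^2$ balls of radius $3^{-1}$, each of measure $3^{-2}$.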

Further in the book, we simply say that a p-adic function is measure-preserving or ergodic, meaning that it has these properties with respect to the measure $\mu_p$.

Exercise 6.10. Prove that a locally 1-Lipschitz map $F$ from $\mathbb{Z}_p^n$ to $\mathbb{Z}_p^m$ (which is not necessarily an 'onto' map) is measurable.

6.2.2

Hereditable dynamical properties

Given a (locally) 1-Lipschitz map $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ of $\mathbb{Z}_p^n$ onto $\mathbb{Z}_p^m$, $m \le n$, the map $F \bmod p^k : (\mathbb{Z}/p^k\mathbb{Z})^n \to (\mathbb{Z}/p^k\mathbb{Z})^m$ is well defined due to the compatibility of $F$ with every epimorphism $\bmod\ p^k$, for all (sufficiently large) $k$, see Section 4.1; so we can define the notion of hereditable properties with respect to the epimorphism $\bmod\ p^k$ in the same way as we did in Subsection 6.1.1. As we focus on the basic properties of ergodic theory, measure-preservation and ergodicity, the following notions are the most important further in the book:

Definition 6.11 (Bijectivity, transitivity, balance modulo $p^k$). A locally 1-Lipschitz function $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is said to be balanced modulo $p^k$ (respectively, bijective modulo $p^k$ or transitive modulo $p^k$) if the reduced map $F \bmod p^k : (\mathbb{Z}/p^k\mathbb{Z})^n \to (\mathbb{Z}/p^k\mathbb{Z})^m$ is well defined and balanced (respectively, bijective or transitive).

We stress once again that if $F$ is a 1-Lipschitz function then all reduced maps $F \bmod p^k$, $k = 1, 2, 3, \ldots$, are well defined! For locally 1-Lipschitz functions both measure-preservation and ergodicity turn out to be hereditable properties with respect to $\bmod\ p^k$. We recall that when the configuration space is finite, measure-preservation means balance whereas ergodicity means transitivity.

Proposition 6.12. Given a measure-preserving (locally) 1-Lipschitz map $F$ of $\mathbb{Z}_p^n$ onto $\mathbb{Z}_p^m$, $m \le n$, the reduced map $F \bmod p^k : (\mathbb{Z}/p^k\mathbb{Z})^n \to (\mathbb{Z}/p^k\mathbb{Z})^m$ is balanced for all (sufficiently large) $k = 1, 2, 3, \ldots$. If $m = n$ and the map $F$ is ergodic, the reduced map $F \bmod p^k$ is transitive for all (sufficiently large) $k = 1, 2, 3, \ldots$.

Proof. During the proof, we denote $F \bmod p^k$ by $\bar F_k$ so as not to overload the formulae. To prove the first claim of the proposition, suppose that for some $k$ there exist $\bar x, \bar y \in (\mathbb{Z}/p^k\mathbb{Z})^m = \{0, 1, \ldots, p^k - 1\}^m$ such that $\#\bar F_k^{-1}(\bar x) \ne \#\bar F_k^{-1}(\bar y)$. The sets $\bar F_k^{-1}(\bar x)$ and $\bar F_k^{-1}(\bar y)$ are disjoint subsets of the finite set $(\mathbb{Z}/p^k\mathbb{Z})^n = \{0, 1, \ldots, p^k - 1\}^n$. Consider the two balls $\bar x + p^k\mathbb{Z}_p^m$ and $\bar y + p^k\mathbb{Z}_p^m$ in $\mathbb{Z}_p^m$. Then

$$F^{-1}(\bar x + p^k\mathbb{Z}_p^m) = \bigcup_{z \in \bar F_k^{-1}(\bar x)} (z + p^k\mathbb{Z}_p^n), \qquad F^{-1}(\bar y + p^k\mathbb{Z}_p^m) = \bigcup_{z \in \bar F_k^{-1}(\bar y)} (z + p^k\mathbb{Z}_p^n),$$

as the pre-image of a ball of (sufficiently small) radius $r$ with respect to a (locally) 1-Lipschitz map is a finite union of pairwise disjoint balls of the same radius $r$. Since each ball $z + p^k\mathbb{Z}_p^n$ has measure $p^{-kn}$, it follows that $\mu_p(F^{-1}(\bar x + p^k\mathbb{Z}_p^m)) \ne \mu_p(F^{-1}(\bar y + p^k\mathbb{Z}_p^m))$, although the balls $\bar x + p^k\mathbb{Z}_p^m$ and $\bar y + p^k\mathbb{Z}_p^m$ have the same measure $p^{-km}$; a contradiction.

By the conditions of the second claim, $m = n$ and $F$ is ergodic; thus, $F$ preserves the measure $\mu_p$. Hence by the first claim, for (a sufficiently large) $k \in \mathbb{N}$ the mapping $\bar F_k$ is a permutation of the elements of the residue ring $(\mathbb{Z}/p^k\mathbb{Z})^n$. If for some $k$ the permutation $\bar F_k$ has more than one cycle, then there exists a proper non-empty subset $\bar A \subset (\mathbb{Z}/p^k\mathbb{Z})^n = \{0, 1, \ldots, p^k - 1\}^n$ such that $\bar F_k^{-1}(\bar A) = \bar A$. This implies that $F^{-1}(\bar A + p^k\mathbb{Z}_p^n) = \bar F_k^{-1}(\bar A) + p^k\mathbb{Z}_p^n = \bar A + p^k\mathbb{Z}_p^n$. However, $\mu_p(\bar A + p^k\mathbb{Z}_p^n) = (\#\bar A) \cdot p^{-kn}$, and $0 < (\#\bar A) \cdot p^{-kn} < 1$ as $\bar A$ is a proper subset of $\{0, 1, \ldots, p^k - 1\}^n$. This contradicts the ergodicity of $F$.

In the next chapter we will see that the converse of Proposition 6.12 is also true.
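The key step of the proof, that the measure of $F^{-1}(\bar x + p^k\mathbb{Z}_p^m)$ equals $\#\bar F_k^{-1}(\bar x) \cdot p^{-kn}$, can be tabulated for small cases. The sketch below is an illustration only and makes two assumptions that are not in the text: $m = 1$, and $F$ is given by a polynomial with rational integer coefficients (which is 1-Lipschitz, so the reduced map can be computed on representatives). The name `preimage_measures` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def preimage_measures(F, n, p, k):
    """For each residue t modulo p^k, return mu_p(F^{-1}(t + p^k Z_p)).

    The measure is (#pre-images of t under F mod p^k) * p^(-k*n).
    """
    q = p ** k
    counts = Counter(F(*point) % q for point in product(range(q), repeat=n))
    return {t: Fraction(counts[t], q ** n) for t in range(q)}


add = lambda x, y: x + y     # balanced modulo every p^k
mul = lambda x, y: x * y     # not balanced already modulo p

print(preimage_measures(add, 2, 3, 1))
print(preimage_measures(mul, 2, 3, 1))
```

For `add` every pre-image has measure $1/3$, the measure of the ball $t + 3\mathbb{Z}_3$. For `mul` the pre-image of $3\mathbb{Z}_3$ has measure $5/9$, so multiplication does not preserve $\mu_p$, in agreement with the first claim of Proposition 6.12.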

Chapter 7

The main ergodic theorem for p-adic 1-Lipschitz functions

A central result of this chapter is the following theorem:

Theorem 7.1. Let $F$ be a (locally) 1-Lipschitz map of $\mathbb{Z}_p^n$ onto $\mathbb{Z}_p^m$, $m \le n$. Whenever $m = n$, the map $F$ preserves the p-adic measure $\mu_p$ (or, accordingly, is ergodic with respect to $\mu_p$) if and only if $F$ is bijective (accordingly, transitive) modulo $p^k$ for all (sufficiently large) $k = 1, 2, 3, \ldots$. For $n \ge m$, the map $F$ preserves the measure $\mu_p$ if and only if $F$ is balanced modulo $p^k$ for all (sufficiently large) $k = 1, 2, 3, \ldots$.

The 'only if' part of the statement is already proved, see Proposition 6.12. The rest of the statement of the theorem follows directly from Propositions 7.11 and 7.12 below, which are of special interest by themselves. For instance, as a bonus we obtain that a 1-Lipschitz measure-preserving transformation on $\mathbb{Z}_p^n$ is an isometric bijection. So as not to overload notation, we prove the first claim of the theorem only for the case $m = n = 1$, since the proofs can easily be adjusted to the general case. Moreover, for the same reason we will prove the theorem only for 1-Lipschitz maps, leaving the corresponding adjustments of statements and proofs to the reader.

Exercise 7.2. Do the said adjustments.

7.1

Measure-preserving isometries

We first prove that a 1-Lipschitz map $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ preserves the measure $\mu_p$ if and only if $F$ is bijective modulo $p^k$ for all $k = 1, 2, \ldots$. As said, we consider the case $n = 1$ just to simplify notation; the statements of Propositions 7.3 and 7.6, as well as of Notes 7.5, 7.8 and of Corollary 7.7, remain true for arbitrary $n \in \mathbb{N}$, and the respective proofs are quite similar to those for the case $n = 1$. It is worth noticing here that Proposition 7.3 can also be deduced from a more general result stated in Section 7.2. However, we present a separate proof of that proposition to obtain some extra information on the functions of the considered type.

Proposition 7.3. A 1-Lipschitz measure-preserving function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is a bijection of $\mathbb{Z}_p$ onto itself.

Proof. We prove that $f$ is both injective and surjective.

Claim 1: Under the conditions of Proposition 7.3 the function $f$ is injective. Indeed, if there exist $a, b \in \mathbb{Z}_p$ ($a \ne b$) such that $f(a) = f(b) = z$, then for some $k$ the balls $a + p^k\mathbb{Z}_p$ and $b + p^k\mathbb{Z}_p$ are disjoint, whereas $f(a + p^k\mathbb{Z}_p), f(b + p^k\mathbb{Z}_p) \subset z + p^k\mathbb{Z}_p$. Hence $\mu_p(f^{-1}(z + p^k\mathbb{Z}_p)) \ge 2 \cdot p^{-k}$, since $f^{-1}(z + p^k\mathbb{Z}_p) \supset (a + p^k\mathbb{Z}_p) \cup (b + p^k\mathbb{Z}_p)$; so $f$ does not preserve $\mu_p$.

Claim 2: Under the conditions of Proposition 7.3 the function $f$ is bijective modulo $p^k$ for all $k = 1, 2, \ldots$. Already proved; see Proposition 6.12.

Claim 3: Under the conditions of Proposition 7.3 the function $f$ is surjective. Take an arbitrary $z \in \mathbb{Z}_p$. Then in view of Claim 2 there exists exactly one $x_1 \in \mathbb{Z}/p\mathbb{Z}$ such that $f(x_1) \equiv z \pmod p$ (here and further we identify elements of the residue ring $\mathbb{Z}/p^k\mathbb{Z}$ with the non-negative rational integers $0, 1, \ldots, p^k - 1$ in the obvious way). Similarly, there exists exactly one $x_2 \in \mathbb{Z}/p^2\mathbb{Z}$ such that $f(x_2) \equiv z \pmod{p^2}$; whence necessarily $x_2 \equiv x_1 \pmod p$, etc. This way we obtain a sequence $x_1, x_2, \ldots$ such that $|f(x_i) - z|_p \le p^{-i}$ and $|x_{i+1} - x_i|_p \le p^{-i}$ for $i = 1, 2, \ldots$. It is an exercise to show now that the sequence $x_1, x_2, \ldots$ is a Cauchy sequence (which hence converges to some $x \in \mathbb{Z}_p$), and that $f(x) = z$.

Exercise 7.4. Complete the proof.

Note 7.5. As a bonus we have that whenever a 1-Lipschitz function $g : \mathbb{Z}_p \to \mathbb{Z}_p$ is bijective modulo $p^k$ for all $k = 1, 2, \ldots$, it is a bijection of $\mathbb{Z}_p$ onto $\mathbb{Z}_p$, see the proofs of Claims 2 and 3 of the proof of Proposition 7.3.

Proposition 7.6. Let a 1-Lipschitz function $g : \mathbb{Z}_p \to \mathbb{Z}_p$ be bijective modulo $p^k$ for all $k = 1, 2, \ldots$. Then $g$ preserves the measure $\mu_p$.

Proof. In view of Note 7.5 the function $g$ is a bijection of $\mathbb{Z}_p$ onto $\mathbb{Z}_p$; whence there exists an inverse function $f = g^{-1}$, which is also a bijection of $\mathbb{Z}_p$ onto $\mathbb{Z}_p$. Moreover, $f$ is continuous since $g$ is continuous.

Claim 1: $f$ is 1-Lipschitz. If there are $a, b \in \mathbb{Z}_p$ such that $a \equiv b \pmod{p^k}$ and $f(a) \not\equiv f(b) \pmod{p^k}$, then, writing $a = g(u)$, $b = g(v)$ for uniquely defined $u, v \in \mathbb{Z}_p$, we have $g(u) \equiv g(v) \pmod{p^k}$ and $f(g(u)) \not\equiv f(g(v)) \pmod{p^k}$; that is, $g(u) \equiv g(v) \pmod{p^k}$ and $u \not\equiv v \pmod{p^k}$. The latter contradicts the conditions of Proposition 7.6.

Claim 2: $f(a + p^k\mathbb{Z}_p) = f(a) + p^k\mathbb{Z}_p$ for every $a \in \mathbb{Z}_p$ and every $k = 1, 2, \ldots$. In view of Claim 1, $f(a + p^k\mathbb{Z}_p) \subset f(a) + p^k\mathbb{Z}_p$. To prove the inverse inclusion, denote $f(a) = b$; then $g(b) = a$. Since $g$ is 1-Lipschitz, $g(b + p^k\mathbb{Z}_p) \subset g(b) + p^k\mathbb{Z}_p$. Applying the bijection $f$ to both sides of the inclusion, we get $b + p^k\mathbb{Z}_p \subset f(g(b) + p^k\mathbb{Z}_p)$ as $f$ is 1-Lipschitz (see Claim 1); that is, $f(a) + p^k\mathbb{Z}_p \subset f(a + p^k\mathbb{Z}_p)$, the needed inverse inclusion.

Claim 3: $f$ is bijective modulo $p^k$ for all $k = 1, 2, \ldots$. Assuming there exist $u, v \in \mathbb{Z}_p$ and $k \in \{1, 2, \ldots\}$ such that $u \not\equiv v \pmod{p^k}$ and $f(u) \equiv f(v) \pmod{p^k}$, one obtains that $(u + p^k\mathbb{Z}_p) \cap (v + p^k\mathbb{Z}_p) = \emptyset$ and $f(u) + p^k\mathbb{Z}_p = f(v) + p^k\mathbb{Z}_p$; a contradiction in view of Claim 2.

Claim 4: $f$ satisfies the conditions of Proposition 7.6. See Claims 1 and 3.

Claim 5: $g(a + p^k\mathbb{Z}_p) = g(a) + p^k\mathbb{Z}_p$ for every $a \in \mathbb{Z}_p$ and every $k = 1, 2, \ldots$. See Claim 4.

Claim 6: $\mu_p(g(M)) = \mu_p(M)$ for every $\mu_p$-measurable $M \subset \mathbb{Z}_p$. Since $M$ is measurable, $\mu_p(M) = \inf\{\mu_p(V) : V \supset M,\ V \text{ is open in } \mathbb{Z}_p\}$ as the measure $\mu_p$ is regular, cf. Exercise 6.9. Since $V$ is open, it is a disjoint union of a countable number of balls $V_j$ of non-zero radius each: $V = \bigcup_{j \in J} V_j$. Then $g(V) = \bigcup_{j \in J} g(V_j)$, since $g$ is a bijection. Note that in view of Claim 5, each $g(V_j)$ is a ball of radius equal to that of the ball $V_j$; that is, $\mu_p(g(V_j)) = \mu_p(V_j)$ for all $j \in J$. Moreover, the balls are disjoint: $g(V_i) \cap g(V_j) = \emptyset$ whenever $i \ne j$ (since $f(g(V_i) \cap g(V_j)) = V_i \cap V_j$ in view of Claim 2). This implies that $\mu_p(g(V)) = \mu_p(V)$. Note that $g(V)$ is open since $g$ is a continuous bijection. Hence, $\mu_p(g(M)) \le \inf\{\mu_p(g(V)) : V \supset M,\ V \text{ is open in } \mathbb{Z}_p\} = \mu_p(M)$. In view of Claim 4, one has then $\mu_p(f(R)) \le \mu_p(R)$ for every measurable $R \subset \mathbb{Z}_p$.
Now we take $R = g(M)$ (whence $f(R) = M$) and obtain $\mu_p(M) \le \mu_p(g(M))$, thus proving the proposition.

Corollary 7.7. A 1-Lipschitz function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ preserves the measure $\mu_p$ if and only if it is bijective modulo $p^k$ for all $k = 1, 2, \ldots$.

Proof. Necessity of the conditions is proved by Proposition 6.12, whereas their sufficiency is proved by Proposition 7.6.



Note 7.8. As a bonus we have that every 1-Lipschitz measure-preserving function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is an isometry: the distance between two points is just the radius of the smallest ball that contains them both; however, as was shown, a measure-preserving 1-Lipschitz mapping is a bijection that merely permutes balls of pairwise equal radii.
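Corollary 7.7 and Note 7.8 can be probed numerically. The sketch below is an illustration, not a proof: it assumes the hypothetical example $f(x) = x + 3x^2$ with $p = 3$, for which $f(x) - f(y) = (x - y)(1 + 3(x + y))$ and the second factor is a 3-adic unit, and it only tests finitely many moduli and sample points. The helper names are assumptions.

```python
import math


def valuation(x, p):
    """p-adic valuation of a rational integer; math.inf for x == 0."""
    if x == 0:
        return math.inf
    v = 0
    while x % p == 0:
        x, v = x // p, v + 1
    return v


def is_bijective_mod(f, p, k):
    """True if f induces a permutation of the residues modulo p^k."""
    q = p ** k
    return len({f(x) % q for x in range(q)}) == q


def preserves_distances(f, p, samples):
    """True if |f(x) - f(y)|_p == |x - y|_p for all sample pairs."""
    return all(valuation(f(x) - f(y), p) == valuation(x - y, p)
               for x in samples for y in samples)


f = lambda x: x + 3 * x * x      # hypothetical measure-preserving example
g = lambda x: x * x              # 1-Lipschitz, but not bijective modulo 3

print([is_bijective_mod(f, 3, k) for k in range(1, 5)])
print(preserves_distances(f, 3, range(30)), preserves_distances(g, 3, range(30)))
```

Bijectivity modulo $3^k$ for a few values of $k$ is of course only evidence; the statement for all $k$ follows here from the factorization mentioned above.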

7.2

1-Lipschitz measure-preserving functions

Now we prove that a 1-Lipschitz function $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$, $m \le n$, preserves the measure $\mu_p$ if and only if $F$ is balanced modulo $p^k$ for all $k = 1, 2, \ldots$. The 'only if' part is already proved, see Proposition 6.12. To prove the 'if' part, we need the following lemma.

Lemma 7.9. Let a 1-Lipschitz function $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^m$, $m \le n$, be balanced modulo $p^k$ for all $k = 1, 2, \ldots$. Then for every $b \in \mathbb{Z}_p^m$ the full preimage $F^{-1}(b + p^s\mathbb{Z}_p^m)$ is a union of $p^{s(n-m)}$ pairwise disjoint balls $a_j + p^s\mathbb{Z}_p^n$, $j = 1, 2, \ldots, p^{s(n-m)}$.

Proof. We start with proving the lemma 'modulo $p^k$'. Recall that $\bar F_k$ stands for the reduced map $F \bmod p^k$.

Claim 1. For every $\bar b_k \in (\mathbb{Z}/p^k\mathbb{Z})^m$, the full preimage $\bar F_k^{-1}(\bar b_k + p^s(\mathbb{Z}/p^k\mathbb{Z})^m)$ of the coset $\bar b_k + p^s(\mathbb{Z}/p^k\mathbb{Z})^m \subset (\mathbb{Z}/p^k\mathbb{Z})^m$ is a union of $p^{s(n-m)}$ suitable pairwise disjoint cosets:

$$\bar F_k^{-1}(\bar b_k + p^s(\mathbb{Z}/p^k\mathbb{Z})^m) = \bigcup_{j=1}^{p^{s(n-m)}} (\bar a_{k,j} + p^s(\mathbb{Z}/p^k\mathbb{Z})^n).$$

Here and further we assume that $s \le k$. In this case $\#(\bar b_k + p^s(\mathbb{Z}/p^k\mathbb{Z})^m) = p^{m(k-s)}$, and since $F$ is balanced modulo $p^k$,

$$\#\bar F_k^{-1}(\bar b_k + p^s(\mathbb{Z}/p^k\mathbb{Z})^m) = p^{k(n-m)} \cdot p^{m(k-s)} = p^{kn-ms}. \qquad (7.1)$$

Further, since $F$ is balanced modulo $p^s$, we have $\#\bar F_s^{-1}(\bar b_s) = p^{s(n-m)}$ for every $\bar b_s \in \{0, 1, \ldots, p^s - 1\}^m = (\mathbb{Z}/p^s\mathbb{Z})^m$. Take $\bar b_s \equiv \bar b_k \pmod{p^s}$ and let $\bar F_s^{-1}(\bar b_s) = \{\bar a_{s,1}, \ldots, \bar a_{s,p^{s(n-m)}}\} \subset (\mathbb{Z}/p^s\mathbb{Z})^n = \{0, 1, \ldots, p^s - 1\}^n$. For $j = 1, 2, \ldots, p^{s(n-m)}$ choose (and fix) $\bar a_{k,j} \in (\mathbb{Z}/p^k\mathbb{Z})^n$ so that $\bar a_{k,j} \equiv \bar a_{s,j} \pmod{p^s}$. Note that the latter congruence, in accordance with what has been agreed at the beginning of Section 2.2, just means that $|\bar a_{k,j} - \bar a_{s,j}|_p \le p^{-s}$; that is, $\bar a_{k,j}^{(i)} \equiv \bar a_{s,j}^{(i)} \pmod{p^s}$ for each $i$-th component $\bar a_{k,j}^{(i)}$ of $\bar a_{k,j} \in (\mathbb{Z}/p^k\mathbb{Z})^n = \{0, 1, \ldots, p^k - 1\}^n$, $i = 1, 2, \ldots, n$.



Now for $j = 1, 2, \ldots, p^{s(n-m)}$ take $\hat a_{k,j} \in (\mathbb{Z}/p^k\mathbb{Z})^n$ so that $\hat a_{k,j} \equiv \bar a_{s,j} \pmod{p^s}$; that is, $\hat a_{k,j} \in \bar a_{k,j} + p^s(\mathbb{Z}/p^k\mathbb{Z})^n$, and vice versa. Since $F$ is 1-Lipschitz, $\bar F_k(\hat a_{k,j}) \equiv \bar b_s \pmod{p^s}$; thus, $\bar F_k(\hat a_{k,j}) \in \bar b_k + p^s(\mathbb{Z}/p^k\mathbb{Z})^m$ (recall that $\bar b_s \equiv \bar b_k \pmod{p^s}$ by our choice). So every $\hat a_{k,j}$ is an $\bar F_k$-preimage of a certain element of the coset $\bar b_k + p^s(\mathbb{Z}/p^k\mathbb{Z})^m$, and there are exactly $p^{s(n-m)} \cdot p^{n(k-s)} = p^{nk-ms}$ of these elements $\hat a_{k,j}$. Comparing this number with what is given by equation (7.1), we conclude that all these $\hat a_{k,j}$ constitute the full preimage $\bar F_k^{-1}(\bar b_k + p^s(\mathbb{Z}/p^k\mathbb{Z})^m)$, which is then just the union of the cosets $\bar a_{k,j} + p^s(\mathbb{Z}/p^k\mathbb{Z})^n$ over $j \in \{1, \ldots, p^{s(n-m)}\}$. These cosets are disjoint since all $\bar a_{k,j}$ are different modulo $p^s$.

Claim 2. For $j = 1, 2, \ldots, p^{s(n-m)}$ fix $a_j \in \mathbb{Z}_p^n$ such that $a_j \equiv \bar a_{s,j} \pmod{p^s}$, where the $\bar a_{s,j}$ are defined as above for $\bar b_k \equiv b \pmod{p^k}$. Then

$$F^{-1}(b + p^s\mathbb{Z}_p^m) = \bigcup_{j=1}^{p^{s(n-m)}} (a_j + p^s\mathbb{Z}_p^n).$$

First note that in this setting the definition of $\bar a_{s,j}$ (whence, of $a_j$) does not depend on $k$, only on $b$ and $s$, since for $\bar b_k \equiv b \pmod{p^k}$ the set $\{\bar a_{s,1}, \ldots, \bar a_{s,p^{s(n-m)}}\}$ is just the full $\bar F_s$-preimage of $(b \bmod p^s)$. Here $(b \bmod p^s)$ is the unique element of $\{0, 1, \ldots, p^s - 1\}^m$ that lies at distance at most $p^{-s}$ from the point $b$; it is an approximation of $b$ by non-negative rational integers with accuracy $p^{-s}$ with respect to the p-adic metric. In other words, given $b \in \mathbb{Z}_p^m$, we put $\bar b_s \equiv b \pmod{p^s}$, where $\bar b_s \in \{0, 1, \ldots, p^s - 1\}^m$, then take all solutions $\bar a_{s,j} \in \{0, 1, \ldots, p^s - 1\}^n$ of the congruence $\bar F_s(x) \equiv \bar b_s \pmod{p^s}$ in the indeterminate $x$, and after that, for each of these $p^{s(n-m)}$ solutions $\bar a_{s,j}$, we choose an arbitrary $a_j \in \mathbb{Z}_p^n$ so that $a_j \equiv \bar a_{s,j} \pmod{p^s}$.

From the definition of $a_j$ it follows immediately that for every $h \in \mathbb{Z}_p^n$, $F(a_j + p^s \cdot h) \equiv b \pmod{p^s}$ since $F$ is 1-Lipschitz; whence $F^{-1}(b + p^s\mathbb{Z}_p^m) \supset \bigcup_{j=1}^{p^{s(n-m)}} (a_j + p^s\mathbb{Z}_p^n)$. Thus, we must prove only the inverse inclusion.

Given $c \in b + p^s\mathbb{Z}_p^m$, for every $k \ge s$ it follows from Claim 1 that $F^{-1}(c) \subset \bar F_k^{-1}(c \bmod p^k) + p^k\mathbb{Z}_p^n$, where $\bar F_k^{-1}(c \bmod p^k)$ is a subset of the finite set $\bigcup_{j=1}^{p^{s(n-m)}} (\bar a_{k,j} + p^s \cdot \{0, 1, \ldots, p^{k-s} - 1\}^n)$.



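Thus, applying Claim 1 we obtain:

$$\begin{aligned} F^{-1}(c) &\subset \bigcap_{k=s}^{\infty} \left(\bar F_k^{-1}(c \bmod p^k) + p^k\mathbb{Z}_p^n\right) \\ &\subset \bigcap_{k=s}^{\infty} \bigcup_{j=1}^{p^{s(n-m)}} \left(\bar a_{k,j} + p^s \cdot \{0, 1, \ldots, p^{k-s} - 1\}^n + p^k\mathbb{Z}_p^n\right) \\ &= \bigcup_{j=1}^{p^{s(n-m)}} \bigcap_{k=s}^{\infty} \left(\bar a_{k,j} + p^s \cdot \{0, 1, \ldots, p^{k-s} - 1\}^n + p^k\mathbb{Z}_p^n\right) \\ &= \bigcup_{j=1}^{p^{s(n-m)}} \bigcap_{k=s}^{\infty} \left(\bar a_{s,j} + p^s \cdot \{0, 1, \ldots, p^{k-s} - 1\}^n + p^k\mathbb{Z}_p^n\right) \\ &= \bigcup_{j=1}^{p^{s(n-m)}} (\bar a_{s,j} + p^s\mathbb{Z}_p^n) = \bigcup_{j=1}^{p^{s(n-m)}} (a_j + p^s\mathbb{Z}_p^n). \end{aligned}$$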

This finishes the proof of Lemma 7.9.

Corollary 7.10. $\mu_p(F^{-1}(b + p^s\mathbb{Z}_p^m)) = \sum_{j=1}^{p^{s(n-m)}} \mu_p(a_j + p^s\mathbb{Z}_p^n) = p^{s(n-m)} \cdot p^{-sn} = p^{-sm} = \mu_p(b + p^s\mathbb{Z}_p^m)$.
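The counting in Claim 1 and in Corollary 7.10 can be verified by enumeration in a small case. The sketch below is an illustration under assumptions that are not in the text: $F(x, y) = x + y$ (so $n = 2$, $m = 1$), $p = 2$, $k = 3$, $s = 2$, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def coset_preimage(F, n, p, k, s, b):
    """Points of (Z/p^k)^n that F mod p^k maps into the coset b + p^s (Z/p^k)."""
    q = p ** k
    return [pt for pt in product(range(q), repeat=n)
            if (F(*pt) - b) % (p ** s) == 0]


def coset_representatives(points, p, s):
    """Residues modulo p^s of the given points: one per coset a + p^s (Z/p^k)^n."""
    return {tuple(c % (p ** s) for c in pt) for pt in points}


add = lambda x, y: x + y
p, k, s, n, m = 2, 3, 2, 2, 1

points = coset_preimage(add, n, p, k, s, b=3)
reps = coset_representatives(points, p, s)

print(len(points), p ** (k * n - m * s))        # both sides of (7.1)
print(sorted(reps), p ** (s * (n - m)))         # the cosets of Claim 1
print(Fraction(len(reps), p ** (s * n)))        # measure in Corollary 7.10
```

Each of the $p^{s(n-m)}$ representatives corresponds to a ball $a_j + p^s\mathbb{Z}_p^n$ of measure $p^{-sn}$, which gives the total $p^{-sm}$ stated in the corollary.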

Proposition 7.11. Under conditions of Lemma 7.9, the function F preserves the measure μp . Proof. Balls b + ps Zm p constitute a base of a σ-ring of all μp -measurable sets of the space Zm . In view of Corollary 7.10, F is then a measurable mapping; p that is, any preimage of a measurable set is measurable. Now let us find μp (F −1 (M ) for a measurable M ⊂ Zm p . Any open subset A ⊂ Zm is a union of balls since all balls of non-zero p radii (these balls are clopen then) constitute a base of the p-adic topology: A = ∪i∈I Bi where I is a set of indexes and all Bi are balls of non-zero radii. Given any two balls, they are either disjoint or one of them contains another −sm . As one. As every ball from Znp is of a form a + ps Zm p , its radius is p m there are only finitely many balls of a fixed non-zero radius in Zp , and as t m −tm ≤ p−sm then there exist maximal w.r.t. if a + ps Zm p ⊂ b + p Zp then p inclusion balls in the set {Bi : i ∈ I}, and there are not more than countably many these maximal balls since there only countably many balls of non-zero radii in Zm p . Therefore A is a disjoint union of these maximal balls; so every open subset in Zm p is a disjoint union of a countably many balls of non-zero radii. As any open measurable subset A ⊂ Zm p is a countable disjoint union of balls of non-zero radii then F −1 (A) is an open measurable subset of Znp ,



and $\mu_p(F^{-1}(A)) = \mu_p(A)$ in view of Corollary 7.10. As any closed subset $C \subset \mathbb{Z}_p^m$ is a complement of a suitable open subset $A$, i.e., $C = \mathbb{Z}_p^m \setminus A$, then $F^{-1}(C) = \mathbb{Z}_p^n \setminus F^{-1}(A)$ and therefore $F^{-1}(C)$ is a closed measurable subset of $\mathbb{Z}_p^n$ of measure $\mu_p(F^{-1}(C)) = 1 - \mu_p(F^{-1}(A)) = 1 - \mu_p(A) = \mu_p(C)$. Further, for a measurable $M$ one has $\mu_p(M) = \inf\{\mu_p(V) : V \supset M,\ V \text{ is open in } \mathbb{Z}_p^m\}$ as the measure $\mu_p$ is regular, cf. Exercise 6.9; thus,
\[
\mu_p(F^{-1}(M)) \le \inf\{\mu_p(F^{-1}(V)) : V \supset M,\ V \text{ is open in } \mathbb{Z}_p^m\} = \mu_p(M).
\]

On the other hand, $\mu_p(M) = \sup\{\mu_p(W) : W \subset M,\ W \text{ is closed in } \mathbb{Z}_p^m\}$ since $\mu_p$ is regular. But $F^{-1}(W)$ is a closed subset of $\mathbb{Z}_p^n$; moreover, we have already proved that $\mu_p(F^{-1}(W)) = \mu_p(W)$ for every closed $W \subset \mathbb{Z}_p^m$. Thus,
\[
\mu_p(F^{-1}(M)) \ge \sup\{\mu_p(F^{-1}(W)) : W \subset M,\ W \text{ is closed in } \mathbb{Z}_p^m\} = \mu_p(M).
\]

Finally we conclude that μp (F −1 (M )) = μp (M ) thus proving the proposition.

7.3 1-Lipschitz ergodic functions

We finally characterize ergodic functions among all 1-Lipschitz functions F : Znp → Znp .

Proposition 7.12. A 1-Lipschitz function $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ is ergodic if and only if $F$ is transitive modulo $p^k$ for all $k = 1, 2, \ldots$.

Proof. The 'only if' part is already proved, see Proposition 6.12. We need to prove only the 'if' part of the statement. By the main claim of Subsection 7.2, under the conditions of Proposition 7.12 the function $F$ is a measure-preserving isometry by Note 7.8, which holds for measure-preserving maps of $\mathbb{Z}_p^n$ to $\mathbb{Z}_p^n$ as well. Moreover, the conditions imply that every $F$-orbit is dense in $\mathbb{Z}_p^n$: given arbitrary $x, y \in \mathbb{Z}_p^n$ and $k \in \mathbb{N}$ there exists $\ell \in \mathbb{N}_0$ such that $\bar y = \bar F_k^{\ell}(\bar x)$, where $\bar F_k$ stands for the reduced map $F \bmod p^k$ and $\bar x, \bar y$ are the respective residues modulo $p^k$ in the Cartesian product $(\mathbb{Z}/p^k\mathbb{Z})^n$; so $|F^{\ell}(x) - y|_p \le p^{-k}$. Therefore $F$ is a minimal isometry on a compact metric space; but all such isometries are known to be uniquely ergodic (thus ergodic), cf. Proposition 6.2.

Exercise 7.13. Given a 1-Lipschitz transformation $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ and $x \in \mathbb{Z}_p^n$, prove that every sequence $(F^i(x) \bmod p^k)_{i=0}^{\infty}$, $k = 1, 2, 3, \ldots$, is strictly uniformly distributed if and only if $F$ is ergodic.

Exercise 7.14. Given a 1-Lipschitz ergodic transformation $F : \mathbb{Z}_p^n \to \mathbb{Z}_p^n$, prove that every orbit $(F^i(x))_{i=0}^{\infty}$ is uniformly distributed with respect to the measure $\mu_p$.
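The criterion of Proposition 7.12 can be explored numerically in the univariate case $n = 1$. The following is only a sketch: it assumes the 1-Lipschitz map is given as an integer-valued Python callable (so that reduction modulo $p^k$ is well defined), it can test only finitely many $k$, and the helper names are hypothetical.

```python
def is_transitive_mod(f, modulus):
    """True if x -> f(x) mod `modulus` is a single cycle through all residues."""
    x = 0
    for step in range(1, modulus + 1):
        x = f(x) % modulus
        if x == 0:
            # The orbit of 0 closes after `step` iterations; it covers every
            # residue exactly when `step` equals the number of residues.
            return step == modulus
    return False  # the orbit never returned to 0, so f is not a single cycle


def transitive_up_to(f, p, k_max):
    """Check transitivity modulo p, p^2, ..., p^k_max (a finite approximation)."""
    return all(is_transitive_mod(f, p ** k) for k in range(1, k_max + 1))


print(transitive_up_to(lambda x: x + 1, 3, 4))      # the odometer x + 1 on Z_3
print(transitive_up_to(lambda x: 3 * x + 1, 2, 4))  # 3x + 1 on Z_2
```

A positive answer for all tested $k$ is evidence, not a proof, of ergodicity; a single failure, however, rules ergodicity out.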

Chapter 8

Ergodic 1-Lipschitz transformations on Zp

In this chapter we obtain various results on ergodicity (and measure-preservation) for 1-Lipschitz transformations on Zp. Here and further, ergodicity (accordingly, measure-preservation) stands for ergodicity (accordingly, measure-preservation) with respect to the measure μp.

8.1 Ergodicity of affine mappings

In this section we obtain explicit conditions for ergodicity of the affine transformation $f(x) = ax + b$ on the space $\mathbb{Z}_p$, where $a, b \in \mathbb{Z}_p$. This case serves as a base for further considerations; also, it is important for applications, see Chapter 13. In view of Theorem 7.1 it is clear that $f$ is measure-preserving if and only if $a$ has a multiplicative inverse modulo $p^k$ for all $k = 1, 2, \ldots$ (that is, $a$ is a unit in $\mathbb{Z}_p$); in other words, if and only if $a \not\equiv 0 \pmod p$.

Theorem 8.1. The function $f(x) = ax + b$, where $a, b \in \mathbb{Z}_p$, is an ergodic transformation on $\mathbb{Z}_p$ if and only if the following conditions hold simultaneously:
\[ b \not\equiv 0 \pmod p; \tag{8.1} \]
\[ a \equiv 1 \pmod p, \quad \text{for } p \text{ odd}; \tag{8.2} \]
\[ a \equiv 1 \pmod 4, \quad \text{for } p = 2. \tag{8.3} \]
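Conditions (8.1) to (8.3) can be compared with a brute-force cycle count on a small residue ring. The sketch below assumes integer $a, b$ and checks transitivity modulo $p^3$ only, which is already enough to separate the cases for affine maps; the function names are hypothetical.

```python
def affine_is_ergodic(a, b, p):
    """Conditions (8.1)-(8.3) of Theorem 8.1 for f(x) = a*x + b."""
    if b % p == 0:
        return False              # (8.1) fails
    if p == 2:
        return a % 4 == 1         # (8.3)
    return a % p == 1             # (8.2)


def is_single_cycle(a, b, modulus):
    """Brute force: does x -> a*x + b run through all residues in one cycle?"""
    x, steps = 0, 0
    while True:
        x = (a * x + b) % modulus
        steps += 1
        if x == 0 or steps == modulus:
            break
    return x == 0 and steps == modulus


# Compare the criterion with brute force modulo p^3 for all residues a, b.
for p in (2, 3):
    modulus = p ** 3
    agree = all(
        affine_is_ergodic(a, b, p) == is_single_cycle(a, b, modulus)
        for a in range(modulus)
        for b in range(modulus)
    )
    print(p, agree)
```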

Proof of Theorem 8.1. In view of Theorem 7.1 we must prove that $f$ is transitive modulo $p^k$ for all $k$ if and only if the conditions of Theorem 8.1 hold. We prove this by induction on $k$, and we state the base of induction as a lemma:

Lemma 8.2. The function $f(x) = ax + b$ is transitive modulo $p$ if and only if $b \not\equiv 0 \pmod p$ and $a \equiv 1 \pmod p$.

CHAPTER 8. 1-LIPSCHITZ ERGODICITY ON ZP


Proof of Lemma 8.2. It is clear that $a \not\equiv 0 \pmod p$ (otherwise $f$ is a constant modulo $p$) and that $b \not\equiv 0 \pmod p$ (otherwise 0 is a fixed point of $f$ modulo $p$). Now, as for every $i = 1, 2, \ldots$
\[ f^i(x) = a^i x + b(a^{i-1} + a^{i-2} + \cdots + a + 1), \tag{8.4} \]
we conclude that if $a \not\equiv 1 \pmod p$ then $a - 1$ is invertible in $\mathbb{Z}_p$ by Note 1.27 and so $f^p(x) = a^p x + b(a^p - 1)(a - 1)^{-1}$, where $(a-1)^{-1}$ is a multiplicative inverse of $(a - 1)$ in the ring $\mathbb{Z}_p$. Thus, as $z^p \equiv z \pmod p$ for every $z \in \mathbb{Z}_p$, we have $f^p(x) \equiv ax + b \pmod p$, i.e., $f^p(x) \equiv f(x) \pmod p$. However, if $f$ is transitive modulo $p$ then $f^p(x) \equiv x \pmod p$. This contradiction proves that $a \equiv 1 \pmod p$.

The converse statement of Lemma 8.2 is obvious: if $a \equiv 1 \pmod p$ then (8.4) implies that $f^i(x) \equiv x + bi \pmod p$, i.e., given $x, y \in \{0, 1, \ldots, p-1\}$, from the congruence $x + bi \equiv y \pmod p$ one finds $i \in \{0, 1, \ldots, p-1\}$ (since $b \not\equiv 0 \pmod p$) such that $f^i(x) \equiv y \pmod p$.

Now we assume that the conditions of Theorem 8.1 imply transitivity of $f$ modulo $p^k$; we claim that then $f$ is transitive modulo $p^{k+1}$. As $f$ is measure-preserving, $f$ is bijective modulo $p^{k+1}$; thus, as $f$ is transitive modulo $p^k$, it is clear that $f$ is transitive modulo $p^{k+1}$ whenever $f^i(0) \equiv 0 \pmod{p^{k+1}}$ implies $i \equiv 0 \pmod{p^{k+1}}$. Note that $f^i(0) \equiv 0 \pmod{p^{k+1}}$ implies $i \equiv 0 \pmod{p^k}$ since $f$ is transitive modulo $p^k$. Now we just calculate $f^i(0) \bmod p^{k+1}$ for $i = p^k\ell$. As $a = 1 + pr$ for a suitable $r \in \mathbb{Z}_p$, from (8.4) we get
\[ f^i(0) = b \cdot \frac{(1 + pr)^i - 1}{pr} = b \cdot \sum_{j=1}^{i} \binom{i}{j} p^{j-1} r^{j-1}. \tag{8.5} \]
Now represent
\[ \binom{i}{j} = \frac{1}{j!}\, i(i-1)\cdots(i-j+1) = \frac{i}{j} \cdot \left(\frac{i}{1} - 1\right) \cdot \left(\frac{i}{2} - 1\right) \cdots \left(\frac{i}{j-1} - 1\right). \]
As the product of the last $j - 1$ factors equals the integer $\binom{i-1}{j-1}$, for $i = p^k\ell$ we conclude that $\operatorname{ord}_p \binom{p^k\ell}{j} \ge k - \operatorname{ord}_p j$, so from (8.5) it follows that
\[ f^{p^k\ell}(0) \equiv b \cdot p^k\ell \pmod{p^{k+1}} \quad \text{for } p \text{ odd, and} \tag{8.6} \]
\[ f^{2^k\ell}(0) \equiv b \cdot \left(2^k\ell + 2r\binom{2^k\ell}{2}\right) \pmod{2^{k+1}}, \tag{8.7} \]
since $j - \operatorname{ord}_p j < 2$ if and only if either $j = 1$, or $p = 2$ and $j = 2$. Whenever $p$ is odd, from (8.6) it follows that $f^{p^k\ell}(0) \equiv 0 \pmod{p^{k+1}}$ if and only if $\ell \equiv 0 \pmod p$, thus proving our claim for odd $p$. For $p = 2$, however, (8.7) implies that $f^{2^k\ell}(0) \equiv 0 \pmod{2^{k+1}}$ either when $\ell$ is even, or



when both $\ell$ and $r$ are odd. Yet the latter case does not hold since $a \equiv 1 \pmod 4$. We conclude finally that the conditions of Theorem 8.1 are sufficient. In view of Theorem 7.1, the above argument shows that these conditions are also necessary.

We stress the leading idea of the proof:

Note 8.3. Given a 1-Lipschitz (that is, a compatible) measure-preserving function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ which is transitive modulo $p^k$, the function $f$ is transitive modulo $p^{k+1}$ if and only if $f^{p^k\ell}(z) \equiv z \pmod{p^{k+1}}$ implies $\ell \equiv 0 \pmod p$ for some (or, equivalently, every) $z \in \mathbb{Z}_p$; cf. Proposition 6.5.

In the sequel, we exploit this observation frequently. Note also that the statement of Note 8.3 holds for locally compatible functions as well once k is sufficiently large.
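The lifting test of Note 8.3 can be sketched in code as follows. The sketch assumes that $f$ is an integer-valued compatible map which is already known to be transitive modulo $p^k$ and bijective modulo $p^{k+1}$; it searches for the smallest $\ell \ge 1$ with $f^{p^k\ell}(z) \equiv z \pmod{p^{k+1}}$ and reports whether that $\ell$ equals $p$. The helper names are hypothetical.

```python
def iterate(f, x, times, modulus):
    """Apply x -> f(x) mod `modulus` the given number of times."""
    for _ in range(times):
        x = f(x) % modulus
    return x


def lifts_transitivity(f, p, k, z=0):
    """Note 8.3 test for passing from modulus p^k to modulus p^(k+1).

    Assumes (does not verify) that f is transitive modulo p^k and
    bijective modulo p^(k+1).
    """
    modulus = p ** (k + 1)
    block = p ** k
    start = z % modulus
    x = start
    for ell in range(1, p + 1):
        x = iterate(f, x, block, modulus)   # x = f^(p^k * ell)(z) mod p^(k+1)
        if x == start:
            # Every returning exponent is a multiple of the smallest one,
            # so the implication of Note 8.3 holds exactly when ell == p.
            return ell == p
    return False


print([lifts_transitivity(lambda x: 5 * x + 1, 2, k) for k in range(1, 6)])
print(lifts_transitivity(lambda x: 3 * x + 1, 2, 1))
```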

Exercise 8.4. Given arbitrary $n \in \{2, 3, 4, \ldots\}$, state and prove a criterion for the affine map $x \mapsto ax + b$ to be transitive on the residue ring $\mathbb{Z}/n\mathbb{Z}$. (Hint: Use the Chinese Remainder Theorem 0.13 and Proposition 6.5.)

8.2 Ergodicity and measure-preservation in terms of coordinate functions

In this subsection we prove criteria of measure-preservation and of ergodicity for 1-Lipschitz functions $f : \mathbb{Z}_2 \to \mathbb{Z}_2$ in terms of coordinate functions, which were defined in Subsection 4.1.1. Recall that according to Proposition 4.2 every 1-Lipschitz function $f : \mathbb{Z}_2 \to \mathbb{Z}_2$ can be represented in the form
\[ f\left(\sum_{i=0}^{\infty} \chi_i \cdot 2^i\right) = \sum_{i=0}^{\infty} \psi_i(\chi_0, \ldots, \chi_i) \cdot 2^i \tag{8.8} \]
where $\chi_i \in \{0, 1\}$, and each $i$-th coordinate function $\psi_i(\chi_0, \ldots, \chi_i) = \delta_i(f(x))$ is a Boolean function in Boolean variables $\chi_0, \ldots, \chi_i$; that is, $\psi_i : \{0, 1\}^{i+1} \to \{0, 1\}$; $i = 0, 1, 2, \ldots$.

Recall that an algebraic normal form, the ANF, of the Boolean function $\psi_i(\chi_0, \ldots, \chi_i)$ is a representation of this function via $\oplus$ (addition modulo 2, that is, logical 'exclusive or') and $\cdot$ (multiplication modulo 2, that is, logical 'and', or conjunction). In other words, the ANF of the Boolean function $\psi$ is its representation in the form
\[ \psi(\chi_0, \ldots, \chi_j) = \beta \oplus \beta_0\chi_0 \oplus \beta_1\chi_1 \oplus \ldots \oplus \beta_{0,1}\chi_0\chi_1 \oplus \ldots, \]
where $\beta, \beta_0, \ldots \in \{0, 1\}$ and $\chi_0, \ldots, \chi_j$ are Boolean variables. Recall that the weight of the Boolean function $\psi$ in $(j + 1)$ variables is the number of $(j + 1)$-bit words that satisfy $\psi$; that is, the weight is a



cardinality of the truth set of $\psi$, and the truth set of $\psi$ is the set of all points from $\{0, 1\}^{j+1}$ where $\psi$ takes value 1.

Theorem 8.5 (Folklore). The function $f$ defined by equation (8.8) is measure-preserving if and only if for every $i = 0, 1, \ldots$ the ANF of the $i$-th coordinate function is
\[ \psi_i(\chi_0, \ldots, \chi_i) = \chi_i \oplus \varphi_i(\chi_0, \ldots, \chi_{i-1}), \]
where $\varphi_i$ is an ANF of a Boolean function in Boolean variables $\chi_0, \ldots, \chi_{i-1}$, and $\varphi_0$ is a constant from $\{0, 1\}$. The function $f$ is ergodic if and only if, additionally, $\varphi_0 = 1$, and every Boolean function $\varphi_i$ is of odd weight, that is, takes value 1 at an odd number of points from $\{0, 1\}^i$ for $i = 1, 2, \ldots$. The latter condition holds if and only if the degree of the ANF of $\varphi_i$ for $i \ge 1$ is exactly $i$, that is, if and only if the ANF of $\varphi_i$ contains the monomial $\chi_0 \cdots \chi_{i-1}$.
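Theorem 8.5 lends itself to a direct finite check. The sketch below assumes that the T-function is given as an integer-valued Python callable built from compatible operations, and it inspects only the first `bits` coordinate functions; `coordinate_criterion` is a hypothetical helper name, and the second test map $x + (x^2 \vee 5)$ is merely an illustrative choice.

```python
def bit(value, i):
    """The i-th binary digit delta_i(value)."""
    return (value >> i) & 1


def coordinate_criterion(f, bits):
    """Theorem 8.5 checked on psi_0, ..., psi_{bits-1}.

    Returns (measure_preserving, ergodic) as far as these bits can tell.
    """
    measure_preserving, odd_weights = True, True
    for i in range(bits):
        weight = 0
        for low in range(1 << i):            # low encodes (chi_0, ..., chi_{i-1})
            phi = bit(f(low), i)              # phi_i = psi_i with chi_i = 0
            if bit(f(low + (1 << i)), i) != phi ^ 1:
                measure_preserving = False    # psi_i is not chi_i XOR phi_i
            weight += phi
        if weight % 2 == 0:
            odd_weights = False               # phi_0 = 0, or phi_i has even weight
    return measure_preserving, measure_preserving and odd_weights


def is_transitive_mod(f, modulus):
    """Brute-force single-cycle test used for comparison."""
    x = 0
    for step in range(1, modulus + 1):
        x = f(x) % modulus
        if x == 0:
            return step == modulus
    return False


for name, f in [("x+1", lambda x: x + 1),
                ("3x+1", lambda x: 3 * x + 1),
                ("x+(x^2|5)", lambda x: x + (x * x | 5))]:
    print(name, coordinate_criterion(f, 8), is_transitive_mod(f, 2 ** 8))
```

The last column should agree with the second component of the pair, since transitivity modulo $2^8$ depends only on the first eight coordinate functions.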

Proof. Gathering all terms of the ANF that do not contain the variable $\chi_i$, we represent the function $\psi_i$ in the following form:
\[ \psi_i(\chi_0, \ldots, \chi_i) = \chi_i \cdot \omega_i(\chi_0, \ldots, \chi_{i-1}) \oplus \varphi_i(\chi_0, \ldots, \chi_{i-1}), \]
where both $\omega_i(\chi_0, \ldots, \chi_{i-1})$ and $\varphi_i(\chi_0, \ldots, \chi_{i-1})$ are Boolean functions in Boolean variables $\chi_0, \ldots, \chi_{i-1}$. Obviously, whenever all $\omega_i(\chi_0, \ldots, \chi_{i-1})$ are identically 1, the function $f$ is measure-preserving in view of Theorem 7.1 since $f$ is bijective modulo $2^{k+1}$ for every $k = 0, 1, 2, \ldots$: to find a preimage under the mapping $f \bmod 2^{k+1}$ one must solve a system of Boolean equations
\[
\begin{cases}
\chi_0 \oplus \varphi_0 = \alpha_0, \\
\chi_1 \oplus \varphi_1(\chi_0) = \alpha_1, \\
\dotfill \\
\chi_k \oplus \varphi_k(\chi_0, \ldots, \chi_{k-1}) = \alpha_k,
\end{cases}
\]
which has a unique solution given any $\alpha_0, \ldots, \alpha_k \in \{0, 1\}$. Conversely, let $i$ be the smallest number such that $\omega_i(\varepsilon_0, \ldots, \varepsilon_{i-1}) = 0$ for a certain vector $(\varepsilon_0, \ldots, \varepsilon_{i-1})$ of zeros and ones. Then
\[ f(\varepsilon_0 + \varepsilon_1 \cdot 2 + \cdots + \varepsilon_{i-1} \cdot 2^{i-1} + 0 \cdot 2^i) \equiv f(\varepsilon_0 + \varepsilon_1 \cdot 2 + \cdots + \varepsilon_{i-1} \cdot 2^{i-1} + 1 \cdot 2^i) \pmod{2^{i+1}}. \]
Whence $f$ is not bijective modulo $2^{i+1}$, thus not measure-preserving in view of Theorem 7.1.

Now, to prove the ergodicity part of the statement we first note that $f$ is transitive modulo 2 if and only if $\psi_0(\chi_0) = \chi_0 \oplus 1$. Further, if $f$ is transitive modulo $2^{k+1}$, then $f$ is transitive modulo $2^j$ for all $j = 1, 2, \ldots, k$; so the $i$-th coordinate function $\delta_i(f^{2^k}(x))$ of the $2^k$-th iterate of the function $f$ is
\[
\delta_i\left(f^{2^k}(\chi_0 + \chi_1 \cdot 2 + \chi_2 \cdot 4 + \cdots)\right) =
\begin{cases}
\chi_i, & \text{if } i < k; \\
\chi_k \oplus \sigma, & \text{if } i = k,
\end{cases}
\tag{8.9}
\]



where $\sigma$ is the sum modulo 2 of all values of the Boolean function $\varphi_k$ at all points from $\{0, 1\}^k$; that is, $\sigma$ is the weight modulo 2 of the function $\varphi_k$. From (8.9) it follows then that the transitivity of the function $f$ modulo $2^{k+1}$ implies $\sigma = 1$; otherwise $f^{2^k}(x) \equiv x \pmod{2^{k+1}}$ for every $x \in \mathbb{Z}_2$. Thus, the weight of the function $\varphi_k$ must be odd.

The rest of the statement of the theorem is a well-known result from the theory of Boolean functions: the weight of a Boolean function is odd if and only if the ANF of the function has the maximum degree. To prove this claim consider a Boolean function $\psi(\chi_0, \ldots, \chi_j)$ in Boolean variables $\chi_0, \ldots, \chi_j$. For $\alpha, \beta \in \{0, 1\}$ define $\alpha^{\beta} = 1$ whenever $\alpha = \beta$, and $\alpha^{\beta} = 0$ otherwise. Then we can represent the Boolean function $\psi$ as
\[ \psi(\chi_0, \ldots, \chi_j) = \bigoplus_{(\beta_0, \ldots, \beta_j) \in T(\psi)} \chi_0^{\beta_0} \cdots \chi_j^{\beta_j}, \tag{8.10} \]
where $T(\psi) \subset \{0, 1\}^{j+1}$ is the truth set of the Boolean function $\psi$. To obtain the ANF from representation (8.10) we substitute $\chi^{\beta} = \chi \oplus \beta \oplus 1$ and perform all multiplications and additions modulo 2; it is obvious then that the coefficient $\operatorname{Coef}_{\chi_0 \cdots \chi_j} \psi$ of the term $\chi_0 \cdots \chi_j$ (of degree $j + 1$, which is the maximum degree of any Boolean function in $j + 1$ variables) in the ANF of the Boolean function $\psi$ is $\#T(\psi) \bmod 2$.

Exercise 8.6. Find explicitly the ANFs of the coordinate functions of the function $f(x) = 1 + x$.

Exercise 8.7. Prove that given an arbitrary 1-Lipschitz function $g : \mathbb{Z}_2 \to \mathbb{Z}_2$ and an arbitrary 1-Lipschitz measure-preserving function $f : \mathbb{Z}_2 \to \mathbb{Z}_2$, both functions $u(x) = f(x) + 2 \cdot g(x)$ and $v(x) = f(x) \operatorname{XOR} 2 \cdot g(x)$ are measure-preserving.

Exercise 8.8. Prove the following generalization of Exercise 8.7: Let a 1-Lipschitz function $F : \mathbb{Z}_2^{n+1} \to \mathbb{Z}_2$ be such that for all $z_1, \ldots, z_n \in \mathbb{Z}_2$ the function $F(x, z_1, \ldots, z_n) : \mathbb{Z}_2 \to \mathbb{Z}_2$ is measure-preserving. Then, given arbitrary 1-Lipschitz functions $g_1, \ldots, g_n : \mathbb{Z}_2 \to \mathbb{Z}_2$ and an arbitrary 1-Lipschitz measure-preserving function $f : \mathbb{Z}_2 \to \mathbb{Z}_2$, the composite function $F(f(x), 2 \cdot g_1(x), \ldots, 2 \cdot g_n(x))$ is measure-preserving.

Exercise 8.9. Prove that given an arbitrary 1-Lipschitz function $g : \mathbb{Z}_2 \to \mathbb{Z}_2$ and an arbitrary 1-Lipschitz ergodic function $f : \mathbb{Z}_2 \to \mathbb{Z}_2$, both functions $u(x) = f(x) + 8 \cdot g(x)$ and $v(x) = f(x) \operatorname{XOR} 4 \cdot g(x)$ are ergodic.

Exercise 8.10. Prove that given an arbitrary 1-Lipschitz function $g : \mathbb{Z}_2 \to \mathbb{Z}_2$ and an arbitrary 1-Lipschitz ergodic function $f : \mathbb{Z}_2 \to \mathbb{Z}_2$, both functions $u(x) = f(x + 8 \cdot g(x))$ and $v(x) = f(x \operatorname{XOR} 4 \cdot g(x))$ are ergodic.

Exercise 8.11. Prove that the statements of Exercises 8.9 and 8.10 remain true after replacing the multiplier 8 by the multiplier 4.
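The parity claim that closes the proof of Theorem 8.5 (the coefficient of the top monomial $\chi_0 \cdots \chi_j$ equals the weight modulo 2) can be illustrated with the binary Möbius transform, a standard way to pass from a truth table to ANF coefficients. This is a sketch; the indexing convention (bit $i$ of the index stands for $\chi_i$) is an assumption of the example, not notation from the text.

```python
from itertools import product


def anf_coefficients(truth_table):
    """Binary Moebius transform: truth table -> ANF coefficients.

    Entry t of the table is psi at the point whose i-th coordinate is bit i
    of t; entry t of the result is the coefficient of the monomial formed by
    the variables chi_i with bit i of t set.
    """
    coeffs = list(truth_table)
    variables = len(coeffs).bit_length() - 1
    for i in range(variables):
        step = 1 << i
        for t in range(len(coeffs)):
            if t & step:
                coeffs[t] ^= coeffs[t ^ step]
    return coeffs


print(anf_coefficients([0, 1, 1, 1]))  # OR = chi_0 + chi_1 + chi_0*chi_1 (mod 2)

# Top coefficient versus weight parity, for every function in 3 variables.
print(all(anf_coefficients(tt)[-1] == sum(tt) % 2
          for tt in product((0, 1), repeat=8)))
```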


8.3 Ergodicity and measure-preservation in terms of Mahler expansion

Recall that every continuous function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ can be expressed via the Mahler interpolation series (3.2)
\[ f(x) = \sum_{i=0}^{\infty} a_i \binom{x}{i}, \]
where $a_i \in \mathbb{Z}_p$, $i = 0, 1, 2, \ldots$. We are now going to describe how one can determine from the coefficients $a_i$ whether $f$ is measure-preserving or, respectively, ergodic. The central result of this section is the following

Theorem 8.12. The function $f$ defines a 1-Lipschitz measure-preserving transformation on $\mathbb{Z}_p$ whenever the following conditions hold simultaneously:
\[ a_1 \not\equiv 0 \pmod p; \tag{8.11} \]
\[ a_i \equiv 0 \pmod{p^{\lfloor \log_p i \rfloor + 1}}, \quad i = 2, 3, \ldots. \tag{8.12} \]
The function $f$ defines a 1-Lipschitz ergodic transformation on $\mathbb{Z}_p$ whenever the following conditions hold simultaneously:
\[ a_0 \not\equiv 0 \pmod p; \tag{8.13} \]
\[ a_1 \equiv 1 \pmod p, \quad \text{for } p \text{ odd}; \tag{8.14} \]
\[ a_1 \equiv 1 \pmod 4, \quad \text{for } p = 2; \tag{8.15} \]
\[ a_i \equiv 0 \pmod{p^{\lfloor \log_p (i+1) \rfloor + 1}}, \quad i = 2, 3, \ldots. \tag{8.16} \]
Moreover, in the case $p = 2$ these conditions are necessary: namely, if $f$ is 1-Lipschitz and measure-preserving then conditions (8.11) and (8.12) hold simultaneously; if $f$ is 1-Lipschitz and ergodic then conditions (8.13), (8.15) and (8.16) hold simultaneously.

Thus, Theorem 8.12 gives a complete description of 1-Lipschitz measure-preserving (respectively, of 1-Lipschitz ergodic) transformations on $\mathbb{Z}_p$ for $p = 2$ in terms of Mahler expansion. We also show in this subsection that $p = 2$ is the only case when the conditions of Theorem 8.12 are necessary. To prove the theorem we need some extra results, which are of interest in their own right.

Lemma 8.13. Given a 1-Lipschitz function $v : \mathbb{Z}_p \to \mathbb{Z}_p$ and $p$-adic integers $c, d$, $c \not\equiv 0 \pmod p$, the function $g(x) = d + cx + p \cdot v(x)$ preserves measure, and the function $h(x) = c + x + p \cdot \Delta v(x)$ is ergodic. (Recall that $\Delta$ is the difference operator: $\Delta v(x) = v(x + 1) - v(x)$ by definition.)

Proof of Lemma 8.13. In view of Theorem 7.1 we must show that the function $g$ (respectively, $h$) is bijective (respectively, transitive) modulo $p^k$ for all $k = 1, 2, 3, \ldots$.
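Before following the proof, the two claims of Lemma 8.13 can be tried out on a concrete example. The sketch below assumes the sample 1-Lipschitz function $v(x) = x^3 + 2x$ (any polynomial over $\mathbb{Z}$ would do) and integer parameters $c, d$; all function names are hypothetical, and only finitely many moduli are tested.

```python
def v(x):
    """Sample 1-Lipschitz function: a polynomial with integer coefficients."""
    return x ** 3 + 2 * x


def make_g(p, c, d):
    """g(x) = d + c*x + p*v(x), claimed to be measure-preserving."""
    return lambda x: d + c * x + p * v(x)


def make_h(p, c):
    """h(x) = c + x + p*(v(x+1) - v(x)), claimed to be ergodic."""
    return lambda x: c + x + p * (v(x + 1) - v(x))


def is_bijective_mod(f, modulus):
    return len({f(x) % modulus for x in range(modulus)}) == modulus


def is_transitive_mod(f, modulus):
    x = 0
    for step in range(1, modulus + 1):
        x = f(x) % modulus
        if x == 0:
            return step == modulus
    return False


for p, c, d, k in [(2, 1, 0, 6), (3, 2, 5, 4), (5, 3, 1, 3)]:
    modulus = p ** k
    print(p, is_bijective_mod(make_g(p, c, d), modulus),
          is_transitive_mod(make_h(p, c), modulus))
```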



First we prove by induction on $k$ that $g$ is bijective modulo $p^k$ for all $k = 1, 2, 3, \ldots$. The assertion is obviously true for $k = 1$. Assume our claim is true for $k = 1, 2, \ldots, n-1$ and prove that it holds for $k = n$. Let $g(a) \equiv g(b) \pmod{p^n}$ for some $p$-adic integers $a, b$. Then $a \equiv b \pmod{p^{n-1}}$ by the induction hypothesis. Hence $p \cdot v(a) \equiv p \cdot v(b) \pmod{p^n}$ since $v$ is 1-Lipschitz. Further, the congruence $g(a) \equiv g(b) \pmod{p^n}$ implies the congruence $c \cdot a + p \cdot v(a) \equiv c \cdot b + p \cdot v(b) \pmod{p^n}$, and consequently, $c \cdot a \equiv c \cdot b \pmod{p^n}$. Since $c \not\equiv 0 \pmod p$, the latter congruence implies that $a \equiv b \pmod{p^n}$, thus proving the first assertion of Lemma 8.13.

To prove the remaining part of the statement we note that the assertion we just proved implies that the function $h$ preserves measure. To prove transitivity of $h$ modulo $p^k$ for all $k = 1, 2, 3, \ldots$ we use induction on $k$. From Lemma 8.2 it follows that $h$ is transitive modulo $p$. Assume that $h$ is transitive modulo $p^{k-1}$ and proceed as in Note 8.3. We calculate successively
\[
\begin{aligned}
h^1(x) &= c + x + p \cdot v(x + 1) - p \cdot v(x), \\
&\;\;\vdots \\
h^j(x) &= h(h^{j-1}(x)) = c + h^{j-1}(x) + p \cdot v(h^{j-1}(x) + 1) - p \cdot v(h^{j-1}(x)) \\
&= cj + x + p \sum_{i=0}^{j-1} v(h^i(x) + 1) - p \sum_{i=0}^{j-1} v(h^i(x)),
\end{aligned}
\]
and so on. We recall that $h^0(x) = x$ by definition. Thus
\[ h^{p^{k-1}\ell}(x) = cp^{k-1}\ell + x + p \sum_{i=0}^{p^{k-1}\ell - 1} v(h^i(x) + 1) - p \sum_{i=0}^{p^{k-1}\ell - 1} v(h^i(x)). \tag{8.17} \]
However, as $h$ is transitive modulo $p^{k-1}$ and $v$ is 1-Lipschitz, we see that
\[ \sum_{i=0}^{p^{k-1}\ell - 1} v(h^i(x) + 1) \equiv \sum_{i=0}^{p^{k-1}\ell - 1} v(h^i(x)) \equiv \ell \sum_{z=0}^{p^{k-1} - 1} v(z) \pmod{p^{k-1}}, \]
so (8.17) implies that $h^{p^{k-1}\ell}(x) \equiv cp^{k-1}\ell + x \pmod{p^k}$. Yet $c \not\equiv 0 \pmod p$; thus if $h^{p^{k-1}\ell}(0) \equiv 0 \pmod{p^k}$ then necessarily $\ell \equiv 0 \pmod p$. This proves Lemma 8.13 in view of Note 8.3.

Corollary 8.14. Under the assumptions of Lemma 8.13, let $r \equiv 1 \pmod p$ if $p$ is odd, and let $r \equiv 1 \pmod 4$ if $p = 2$. Then the function $c + rx + p \cdot \Delta v(x)$ is ergodic.

Proof of Corollary 8.14. As $r = 1 + ps$ for odd $p$ (respectively, $r = 1 + 4s$ for $p = 2$) where $s \in \mathbb{Z}_p$, the function $u(x) = s\binom{x}{2}$ (respectively, $u(x) = 2s\binom{x}{2}$) is a polynomial over $\mathbb{Z}_p$, thus 1-Lipschitz. Consequently, the function



$v_1(x) = u(x) + v(x)$ is 1-Lipschitz as well. Since $\Delta v_1(x) = sx + \Delta v(x)$ for odd $p$ (respectively, $\Delta v_1(x) = 2sx + \Delta v(x)$ for $p = 2$), the proof is finished in view of Lemma 8.13.

Proof of Theorem 8.12. Recall that according to Theorem 4.15, a function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is 1-Lipschitz if and only if it can be represented in the form
\[ f(x) = b_0 + \sum_{i=1}^{\infty} b_i\, p^{\lfloor \log_p i \rfloor} \binom{x}{i}, \tag{8.18} \]
where $b_i \in \mathbb{Z}_p$, $i = 0, 1, 2, \ldots$. As $\lfloor \log_p i \rfloor = \lfloor \log_p (i + 1) \rfloor$ for all $i = 1, 2, \ldots$ except $i = p^t - 1$ ($t = 1, 2, 3, \ldots$), and as, for a 1-Lipschitz function $v$ written in the form (8.18),
\[ \Delta v(x) = \sum_{i=1}^{\infty} b_i\, p^{\lfloor \log_p i \rfloor} \binom{x}{i-1} \]
(see (2)), sufficiency of the conditions of Theorem 8.12 follows now from Lemma 8.13 and Corollary 8.14.

To prove that for $p = 2$ the conditions of Theorem 8.12 are necessary we will express the coefficients of the algebraic normal forms of the coordinate functions (see Subsection 8.2) via the coefficients of the Mahler expansion (8.18) and then apply Theorem 8.5. During the proof we denote $\chi_i = \delta_i(x) \in \{0, 1\}$. Then for arbitrary $n \in \mathbb{N}$ and $x \in \mathbb{Z}_2$ Lemma 4.11 implies that
\[ f(x) \equiv f(\chi_0 + \chi_1 2 + \cdots + \chi_{n-1} 2^{n-1}) + 2^n \chi_n \tilde f_n(x) \pmod{2^{n+1}}, \tag{8.19} \]
where
\[ \tilde f_n(x) \equiv \sum_{s=0}^{n} \frac{\Delta^{2^s} f(x)}{2^s} \pmod 2. \tag{8.20} \]
From (4.23) (see the proof of Theorem 4.15) we conclude that
\[ \frac{\Delta^{2^s} f(x)}{2^s} = \frac{1}{2^s} \sum_{i=2^s}^{\infty} a_i \binom{x}{i - 2^s} = \sum_{i=2^s}^{\infty} b_i\, 2^{\lfloor \log_2 i \rfloor - s} \binom{x}{i - 2^s}. \]
This, in view of Lucas' Theorem 0.2, implies that the following congruences modulo 2 hold:
\[ \delta_0\left(\frac{\Delta^{2^s} f(x)}{2^s}\right) \equiv \sum_{i=2^s}^{2^{s+1}-1} \delta_0(b_i) \binom{x}{i - 2^s} \equiv \sum_{j=0}^{2^s - 1} \delta_0(b_{2^s + j}) \binom{\chi_0}{\delta_0(j)} \cdots \binom{\chi_{s-1}}{\delta_{s-1}(j)} \pmod 2. \tag{8.21} \]



From (8.21) it follows that $\delta_0\left(\frac{\Delta^{2^s} f(x)}{2^s}\right)$ does not depend on $\chi_s, \chi_{s+1}, \ldots$ and that $\delta_0(\Delta f(x)) \equiv b_1 \equiv a_1 \pmod 2$. Now the latter congruence in view of (8.20) implies
\[ \tilde f_n(x) \equiv \tilde f_n(\bar x_n) \equiv \sum_{s=1}^{n} \frac{\Delta^{2^s} f(\bar x_s)}{2^s} + a_1 \pmod 2 \tag{8.22} \]
where here and after $\bar x_k$ stands for $x \bmod 2^k = \chi_0 + \chi_1 2 + \cdots + \chi_{k-1} 2^{k-1}$ ($k = 1, 2, \ldots$). Theorem 8.5 implies now that $f$ preserves measure if and only if the following two conditions hold:

• $f$ is bijective modulo 2, (8.23)
• $\tilde f_n(x) \equiv 1 \pmod 2$ for all $n = 1, 2, \ldots$ and all $x \in \mathbb{Z}_2$. (8.24)

As $f(x) \equiv a_0 + a_1 x \pmod 2$, condition (8.23) is equivalent to the following condition:
\[ a_1 \equiv 1 \pmod 2. \tag{8.25} \]
Now, in view of (8.22) and (8.25), condition (8.24) holds if and only if the following condition
\[ \frac{\Delta^{2^s} f(\bar x_s)}{2^s} \equiv 0 \pmod 2 \tag{8.26} \]
holds for all $s = 1, 2, 3, \ldots$ and all $x \in \mathbb{Z}_2$. However, in view of (8.21), condition (8.26) holds for all $s = 1, 2, 3, \ldots$ and all $x \in \mathbb{Z}_2$ if and only if the condition
\[ b_i \equiv 0 \pmod 2 \tag{8.27} \]
holds for all $i = 2, 3, \ldots$. As $a_i = b_i 2^{\lfloor \log_2 i \rfloor}$ for $i = 1, 2, \ldots$, (8.25) and (8.27) imply necessity of conditions (8.11) and (8.12) when $p = 2$.

Further, as an ergodic function $f$ preserves measure, from Theorem 8.5 in view of (8.19) and condition (8.24) we conclude that the ANF of the Boolean function $\delta_i(f(x)) = \psi_i(\chi_0, \ldots, \chi_i)$ is of the following form:
\[ \psi_i(\chi_0, \ldots, \chi_i) = \varphi_i(\chi_0, \ldots, \chi_{i-1}) \oplus \chi_i \tag{8.28} \]
where $\varphi_i(\chi_0, \ldots, \chi_{i-1}) = \delta_i(f(\chi_0 + \cdots + \chi_{i-1} 2^{i-1}))$ and $\varphi_0$ is a constant. Now from Theorem 8.5 it follows that once the function $f$ is ergodic, $\varphi_0 = 1$, and the coefficient $\operatorname{Coef}_{\chi_0 \cdots \chi_{i-1}} \varphi_i$ of the monomial $\chi_0 \cdots \chi_{i-1}$ in the ANF of $\varphi_i$ must be 1 for all $i = 1, 2, \ldots$. Since obviously $\varphi_0 \equiv a_0 \pmod 2$, we conclude now that $f$ is a 1-Lipschitz ergodic function if and only if the following conditions (8.29) to (8.32) hold



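simultaneously:
\[ a_0 \equiv 1 \pmod 2; \tag{8.29} \]
\[ a_1 \equiv 1 \pmod 2; \tag{8.30} \]
\[ a_j \equiv 0 \pmod{2^{\lfloor \log_2 j \rfloor + 1}}, \quad \text{for all } j = 2, 3, \ldots; \tag{8.31} \]
\[ C_i = 1, \quad \text{for all } i = 1, 2, \ldots, \tag{8.32} \]
where $C_i = \operatorname{Coef}_{\chi_0 \cdots \chi_{i-1}} \varphi_i$. To finish the proof, we use the following recursive formula for $\operatorname{Coef}_{\chi_0 \cdots \chi_{i-1}} \varphi_i$:

Lemma 8.15. If a 1-Lipschitz function $f$ preserves measure, then
\[ \operatorname{Coef}_{\chi_0 \cdots \chi_n} \varphi_{n+1} \equiv \delta_1(b_{2^{n+1}-1}) + \operatorname{Coef}_{\chi_0 \cdots \chi_{n-1}} \varphi_n \pmod 2 \]
for all $n = 1, 2, \ldots$.

Proof of Lemma 8.15. We begin as in the proof of Lemma 4.11: with the use of the Gregory-Newton formula from Theorem 0.5 and taking into account that $\chi_n \in \{0, 1\}$, we conclude that
\[ f(\bar x_{n+1}) = f(\bar x_n) + \chi_n \sum_{i=1}^{2^n} \binom{2^n}{i} \Delta^i f(\bar x_n) = f(\bar x_n) + 2^n \chi_n \sum_{k=0}^{2^n - 1} \frac{\Delta^{k+1} f(\bar x_n)}{k+1} \binom{2^n - 1}{k}. \]
Hence,
\[ \delta_{n+1}(f(\bar x_{n+1})) \equiv \delta_{n+1}(f(\bar x_n)) + \delta_1(\chi_n S_n) + \delta_n(f(\bar x_n))\,\delta_0(\chi_n S_n) \pmod 2, \tag{8.33} \]
where
\[ S_n = \sum_{k=0}^{2^n - 1} \frac{\Delta^{k+1} f(\bar x_n)}{k+1} \binom{2^n - 1}{k}. \]
As by Lucas' Theorem 0.2 $\binom{2^n - 1}{k} \equiv 1 \pmod 2$ for all $k \in \{0, 1, \ldots, 2^n - 1\}$, combining Lemma 4.10 and Lemma 4.11 we conclude that
\[ S_n \equiv \sum_{s=0}^{n} \frac{\Delta^{2^s} f(\bar x_n)}{2^s} \equiv \tilde f_n(\bar x_n) \pmod 2. \tag{8.34} \]
However, $\tilde f_n(\bar x_n) \equiv 1 \pmod 2$ since $f$ preserves measure, see (8.24). Then (8.34) implies that
\[ \delta_0(S_n) = 1. \tag{8.35} \]
This, in view of (8.33), implies that
\[ \operatorname{Coef}_{\chi_0 \cdots \chi_n} \delta_{n+1}(f(\bar x_{n+1})) \equiv \operatorname{Coef}_{\chi_0 \cdots \chi_n} \delta_1(\chi_n S_n) + \operatorname{Coef}_{\chi_0 \cdots \chi_{n-1}} \delta_n(f(\bar x_n)) \pmod 2. \tag{8.36} \]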



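As $\delta_1(\chi_n S_n) = \chi_n \delta_1(S_n)$, then
\[ \operatorname{Coef}_{\chi_0 \cdots \chi_n} \delta_1(\chi_n S_n) = \operatorname{Coef}_{\chi_0 \cdots \chi_{n-1}} \delta_1(S_n). \tag{8.37} \]
Now we must calculate $\delta_1(S_n)$. From the 'school-textbook' algorithms of addition and multiplication of 2-adic integers $u_k, v_k \in \mathbb{Z}_2$; $u_k = \delta_0(u_k) + \delta_1(u_k) \cdot 2 + \delta_2(u_k) \cdot 2^2 + \cdots$ and $v_k = \delta_0(v_k) + \delta_1(v_k) \cdot 2 + \delta_2(v_k) \cdot 2^2 + \cdots$, it follows that
\[ \delta_1\left(\sum_{k=0}^{m} u_k v_k\right) \equiv \sum_{k=0}^{m} \delta_0(u_k)\delta_1(v_k) + \sum_{k=0}^{m} \delta_1(u_k)\delta_0(v_k) + \delta_1\left(\sum_{k=0}^{m} \delta_0(u_k v_k)\right) \pmod 2. \tag{8.38} \]
For $\varepsilon_k \in \{0, 1\}$, $k = 0, 1, 2, \ldots, m$, denote $\Xi(\varepsilon_0, \ldots, \varepsilon_m) = \delta_1\left(\sum_{k=0}^{m} \varepsilon_k\right)$; then clearly
\[ \Xi(\varepsilon_0, \ldots, \varepsilon_m) \equiv \left\lfloor \frac{1}{2} \operatorname{Wt}(\varepsilon_0, \ldots, \varepsilon_m) \right\rfloor \pmod 2, \]
where $\operatorname{Wt}(\varepsilon_0, \ldots, \varepsilon_m)$ is the number of nonzero coordinates of the binary vector $(\varepsilon_0, \ldots, \varepsilon_m)$. Now, assuming $m = 2^n - 1$, $u_k = \frac{\Delta^{k+1} f(\bar x_n)}{k+1}$, $v_k = \binom{2^n - 1}{k}$, we apply (8.38) to calculate $\delta_1(S_n)$. From Lucas' Theorem 0.2 it follows that
\[ \delta_0(v_k) = \delta_0\binom{2^n - 1}{k} = 1 \]
for all $k = 0, 1, \ldots, 2^n - 1$. Hence,
\[ \delta_1\left(\sum_{k=0}^{m} \delta_0(u_k v_k)\right) = \delta_1\left(\sum_{k=0}^{m} \delta_0(u_k)\delta_0(v_k)\right) = \Xi(\delta_0(u_0), \ldots, \delta_0(u_m)). \]
Further, from Lemma 4.10 it follows that for all $k \ne 2^r - 1$
\[ \delta_0(u_k) = \delta_0\left(\frac{\Delta^{k+1} f(\bar x_n)}{k+1}\right) = 0. \tag{8.39} \]
As $f$ preserves measure, (8.26) holds for all $s = 1, 2, \ldots$ and all $x \in \mathbb{Z}_2$, so from (8.39) it follows that $\delta_0(u_1) = \cdots = \delta_0(u_m) = 0$, whence the function
\[ \Xi(\delta_0(u_0), \ldots, \delta_0(u_m)) = \delta_1\left(\sum_{k=0}^{m} \delta_0(u_k v_k)\right) \]
in the right-hand part of (8.38) vanishes. Finally, applying (8.39) and (8.26) to (8.38) we conclude that
\[ \delta_1(S_n) \equiv \delta_0(\Delta f(\bar x_n)) + \sum_{k=0}^{2^n - 1} \delta_1\left(\frac{\Delta^{k+1} f(\bar x_n)}{k+1}\right) \pmod 2. \tag{8.40} \]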



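As $f$ preserves measure, (8.25) and (8.27) hold; thus, the coefficients $b_i$ of the Mahler expansion (8.18) satisfy the following conditions:
\[
\begin{cases}
b_1 \equiv 1 \pmod 2; \\
b_i = 2c_i, \ \text{for appropriate } c_i \in \mathbb{Z}_2; \ i = 2, 3, \ldots.
\end{cases}
\tag{8.41}
\]
Hence for every $s \ge 2$, $s = \hat s \cdot 2^{\operatorname{ord}_2 s}$, $\hat s$ odd, we have
\[ \frac{\Delta^s f(x)}{s} = \frac{2}{\hat s} \sum_{i=s}^{\infty} c_i\, 2^{\lfloor \log_2 i \rfloor - \operatorname{ord}_2 s} \binom{x}{i-s}, \tag{8.42} \]
in view of (2) (we note that $\hat s$ is a unit of $\mathbb{Z}_2$; thus, $\hat s$ has a multiplicative inverse $\frac{1}{\hat s} \in \mathbb{Z}_2$). Consequently, (8.42) implies that
\[ \delta_1\left(\frac{\Delta^s f(x)}{s}\right) \equiv \sum_{i=s}^{\infty} c_i\, 2^{\lfloor \log_2 i \rfloor - \operatorname{ord}_2 s} \binom{x}{i-s} \pmod 2. \tag{8.43} \]
As $\lfloor \log_2 i \rfloor > \operatorname{ord}_2 s$ for all $i \ge s$ except in the case $s = 2^r$, $2^r \le i \le 2^{r+1} - 1$, congruence (8.43) implies that
\[
\delta_1\left(\frac{\Delta^s f(x)}{s}\right) \equiv
\begin{cases}
\sum_{i=2^r}^{2^{r+1}-1} c_i \binom{x}{i - 2^r} \pmod 2, & \text{if } s = 2^r \text{ for } r = 1, 2, \ldots; \\
0 \pmod 2, & \text{otherwise.}
\end{cases}
\tag{8.44}
\]
Further, from (8.18) in view of (8.41) and (2) we derive that
\[ \Delta f(x) \equiv b_1 \pmod 4. \tag{8.45} \]
Now from (8.40) in view of (8.41), (8.44) and (8.45) it follows that
\[ \delta_1(S_n) \equiv 1 + \delta_1(b_1) + \sum_{s=1}^{n} \sum_{j=0}^{2^s - 1} c_{j + 2^s} \binom{\bar x_n}{j} \pmod 2. \tag{8.46} \]
From here with the use of Lucas' Theorem 0.2 we deduce that
\[ \operatorname{Coef}_{\chi_0 \cdots \chi_{n-1}} \delta_1(S_n) \equiv c_{2^{n+1}-1} \pmod 2. \]
The latter congruence in view of (8.41), (8.36) and (8.37) finishes the proof of Lemma 8.15.

Now we can finish our proof of Theorem 8.12. Lemma 8.15 implies that
\[ \operatorname{Coef}_{\chi_0 \cdots \chi_{i-1}} \delta_i(f(\bar x_i)) \equiv \sum_{r=2}^{i} \delta_1(b_{2^r - 1}) + \operatorname{Coef}_{\chi_0} \delta_1(f(\chi_0)) \pmod 2. \tag{8.47} \]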



From (8.18) we have $f(\chi_0) = b_0 + b_1\chi_0$, so taking into account (8.41) we conclude that $\delta_1(f(\chi_0)) \equiv \delta_1(b_0) + \chi_0(\delta_1(b_1) + \delta_0(b_0)) \pmod 2$. Thus, (8.47) in view of (8.29) implies that
\[ \operatorname{Coef}_{\chi_0 \cdots \chi_{i-1}} \delta_i(f(\bar x_i)) \equiv 1 + \sum_{r=1}^{i} \delta_1(b_{2^r - 1}) \pmod 2, \]
since $a_0 = b_0$. This means that condition (8.32) is equivalent to the following condition
\[ \sum_{r=1}^{i} \delta_1(b_{2^r - 1}) \equiv 0 \pmod 2; \quad i = 1, 2, 3, \ldots, \]
or, equivalently, to the condition
\[ \delta_1(b_{2^r - 1}) = 0; \quad (r = 1, 2, 3, \ldots). \tag{8.48} \]
As $a_j = b_j 2^{\lfloor \log_2 j \rfloor}$, combining (8.29), (8.30), (8.31) and (8.48), we finish the proof of Theorem 8.12.

Exercise 8.16. Prove that for every prime $p$ and every $a \equiv 1 \pmod p$ the function $f(x) = ax + a^x$ is a 1-Lipschitz ergodic transformation on $\mathbb{Z}_p$.

Exercise 8.17. Prove the following:
1. A function $f : \mathbb{Z}_2 \to \mathbb{Z}_2$ is 1-Lipschitz and measure-preserving if and only if $f$ can be represented as $f(x) = c + x + 2 \cdot v(x)$, where $c \in \mathbb{Z}_2$ and $v$ is a 1-Lipschitz function.
2. A function $f : \mathbb{Z}_2 \to \mathbb{Z}_2$ is 1-Lipschitz and ergodic if and only if $f$ can be represented as $f(x) = 1 + x + 2 \cdot \Delta v(x)$, where $v$ is a 1-Lipschitz function.

Exercise 8.18. Let $f$ be a B-function on $\mathbb{Z}_2$; that is, let $f(x) = \sum_{i=0}^{\infty} e_i \cdot x^{\underline{i}}$ for suitable $e_i \in \mathbb{Z}_2$, $i = 0, 1, 2, \ldots$, where $x^{\underline{i}} = x(x-1)\cdots(x-i+1)$ is the descending factorial power (see Section 5.2). Prove that the function $f$ is measure-preserving if and only if
\[ e_1 \equiv 1 \pmod 2, \quad e_2 \equiv 0 \pmod 2, \quad e_3 \equiv 0 \pmod 2. \]
Prove that the function $f$ is ergodic on $\mathbb{Z}_2$ if and only if
\[ e_0 \equiv 1 \pmod 2, \quad e_1 \equiv 1 \pmod 4, \quad e_2 \equiv 0 \pmod 2, \quad e_3 \equiv 0 \pmod 4. \]



Exercise 8.19. Prove that a C-function $f : \mathbb{Z}_2 \to \mathbb{Z}_2$ is ergodic on $\mathbb{Z}_2$ if and only if $f$ is transitive modulo 8. (Hint: Use Exercise 8.18 and Proposition 3.15.)

Exercise 8.20. From the preceding exercise we already know that a polynomial over $\mathbb{Z}_2$ is ergodic if and only if it is transitive modulo 8. Prove that a complete list of polynomial transitive transformations on $\mathbb{Z}/8\mathbb{Z}$ is as follows:
\[
\begin{array}{llll}
x + 1 & 5x + 1 & 2x^2 + 3x + 1 & 2x^2 + 7x + 1 \\
x + 3 & 5x + 3 & 2x^2 + 3x + 3 & 2x^2 + 7x + 3 \\
x + 5 & 5x + 5 & 2x^2 + 3x + 5 & 2x^2 + 7x + 5 \\
x + 7 & 5x + 7 & 2x^2 + 3x + 7 & 2x^2 + 7x + 7
\end{array}
\]
(Hint: Use Exercise 8.18 and Proposition 3.15.)

Exercise 8.21. Let a C-function $f : \mathbb{Z}_2 \to \mathbb{Z}_2$ be represented via a power series: $f(x) = \sum_{i=0}^{\infty} c_i x^i$, $c_i \in \mathbb{Z}_2$, $i = 0, 1, 2, \ldots$. Prove that $f$ is ergodic if and only if the following conditions hold simultaneously:
\[
\begin{aligned}
c_3 + c_5 + c_7 + \cdots &\equiv 2c_2 \pmod 4; \\
c_4 + c_6 + c_8 + \cdots &\equiv c_1 + c_2 - 1 \pmod 4; \\
c_1 &\equiv 1 \pmod 2; \\
c_0 &\equiv 1 \pmod 2.
\end{aligned}
\]
(Hint: Use Exercise 8.18 and relations (1) from Section 0.1.)
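Returning to Theorem 8.12, its conditions for $p = 2$ are easy to test once the Mahler coefficients are known. The sketch below computes them by repeated differences, using the standard identity $a_i = \Delta^i f(0)$, and checks (8.13), (8.15) and (8.16) on the first few coefficients. It assumes an integer-valued polynomial input, so that all coefficients beyond the degree vanish; the function names are hypothetical.

```python
def mahler_coefficients(f, count):
    """First `count` Mahler coefficients a_i = (Delta^i f)(0)."""
    values = [f(j) for j in range(count)]
    coeffs = []
    for _ in range(count):
        coeffs.append(values[0])
        # Replace the row of values by its row of forward differences.
        values = [values[j + 1] - values[j] for j in range(len(values) - 1)]
    return coeffs


def satisfies_ergodic_conditions_p2(coeffs):
    """Conditions (8.13), (8.15), (8.16) for p = 2 on the given coefficients."""
    a0, a1 = coeffs[0], coeffs[1]
    if a0 % 2 == 0 or a1 % 4 != 1:
        return False
    for i in range(2, len(coeffs)):
        exponent = (i + 1).bit_length()       # floor(log2(i + 1)) + 1
        if coeffs[i] % (1 << exponent) != 0:
            return False
    return True


f = lambda x: 2 * x * x - x + 1               # congruent to 2x^2 + 7x + 1 mod 8
print(mahler_coefficients(f, 4))
print(satisfies_ergodic_conditions_p2(mahler_coefficients(f, 8)))
print(satisfies_ergodic_conditions_p2(mahler_coefficients(lambda x: x * x + 1, 8)))
```

The first polynomial reduces modulo 8 to an entry of the list in Exercise 8.20, which is consistent with the positive answer.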

8.4 Ergodicity and measure-preservation in terms of van der Put expansion

In this section, measure-preservation and ergodicity criteria for 1-Lipschitz transformations on Z2 are proved. Recall that in the case p = 2, 1-Lipschitz functions are also called T-functions. In this section, we use this term to stress that we deal with the case p = 2 only.

8.4.1 Measure-preservation criteria for T-functions

Now we prove two criteria of measure-preservation for T-functions.

Theorem 8.22. Let $f : \mathbb{Z}_2 \to \mathbb{Z}_2$ be a T-function represented via the van der Put series (3.10). The T-function $f$ is measure-preserving if and only if the following conditions hold simultaneously:
1. $B_0 + B_1 \equiv 1 \pmod 2$;
2. $|B_m|_2 = 2^{-\lfloor \log_2 m \rfloor}$, $m = 2, 3, \ldots$.



Proof. By Exercise 8.17, the T-function $f$ is measure-preserving if and only if it can be represented in the form $f(x) = d + x + 2g(x)$, where $g : \mathbb{Z}_2 \to \mathbb{Z}_2$ is a T-function and $d \in \mathbb{Z}_2$. Now, given $g(x) = \sum_{m=0}^{\infty} \tilde B_m \chi(m, x)$, the van der Put series for the T-function $g$, we find the van der Put coefficients of the function $f(x) = d + x + 2g(x)$. As the van der Put series of the T-function $t(x) = x$ is
\[ t(x) = \sum_{m=1}^{\infty} 2^{\lfloor \log_2 m \rfloor} \chi(m, x), \tag{8.49} \]
and as $d = d \cdot \chi(0, x) + d \cdot \chi(1, x)$, we get:
\[ f(x) = (d + 2\tilde B_0)\chi(0, x) + (1 + d + 2\tilde B_1)\chi(1, x) + \sum_{m=2}^{\infty} \left(2^{\lfloor \log_2 m \rfloor} + 2\tilde B_m\right)\chi(m, x). \tag{8.50} \]
This proves necessity of the conditions of the Theorem since $|\tilde B_m|_2 \le 2^{-\lfloor \log_2 m \rfloor}$ by Theorem 4.18 and hence $\left|2^{\lfloor \log_2 m \rfloor} + 2\tilde B_m\right|_2 = 2^{-\lfloor \log_2 m \rfloor}$ by the strong triangle inequality.

To prove sufficiency of the conditions, we note that the condition $|B_m|_2 = 2^{-\lfloor \log_2 m \rfloor}$ implies that $B_m = 2^{\lfloor \log_2 m \rfloor} + 2\tilde B_m$ for suitable $\tilde B_m \in \mathbb{Z}_2$, where
\[ |\tilde B_m|_2 \le 2^{-\lfloor \log_2 m \rfloor}, \tag{8.51} \]
and the condition $B_0 + B_1 \equiv 1 \pmod 2$ implies that $B_0 = d + 2\tilde B_0$ and $B_1 = 1 + d + 2\tilde B_1$ for suitable $d, \tilde B_0, \tilde B_1 \in \mathbb{Z}_2$. Now from equations (8.50) and (8.49) it follows that $f(x) = d + x + 2g(x)$, where $g(x) = \sum_{m=0}^{\infty} \tilde B_m \chi(m, x)$ is a T-function by Theorem 4.18 in view of inequality (8.51). Thus, $f$ is measure-preserving by Exercise 8.17.

From Theorems 4.18 and 8.22 we deduce now the following

Corollary 8.23. A map $f : \mathbb{Z}_2 \to \mathbb{Z}_2$ is a measure-preserving T-function if and only if it can be represented as
\[ f(x) = b_0\chi(0, x) + b_1\chi(1, x) + \sum_{m=2}^{\infty} 2^{\lfloor \log_2 m \rfloor} b_m \chi(m, x), \]
where $b_m \in \mathbb{Z}_2$, and the following conditions hold simultaneously:
1. $b_0 + b_1 \equiv 1 \pmod 2$;
2. $b_m \equiv 1 \pmod 2$, $m = 2, 3, 4, \ldots$.
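The conditions of Theorem 8.22 can be checked on finitely many van der Put coefficients. The sketch below assumes the usual formula for these coefficients when $p = 2$, namely $B_m = f(m)$ for $m < 2$ and $B_m = f(m) - f(m - 2^{\lfloor \log_2 m \rfloor})$ for $m \ge 2$ (this is what (3.12) is taken to say; the formula itself lies outside this section), and an integer-valued T-function given as a Python callable. The helper names are hypothetical.

```python
def van_der_put_coefficient(f, m):
    """B_m for p = 2 (assumed form of (3.12))."""
    if m < 2:
        return f(m)
    q = 1 << (m.bit_length() - 1)            # 2^floor(log2 m)
    return f(m) - f(m - q)


def ord2(value):
    """2-adic valuation of a non-zero integer."""
    return (value & -value).bit_length() - 1


def measure_preserving_up_to(f, levels):
    """Conditions 1 and 2 of Theorem 8.22 for all m < 2^levels."""
    if (van_der_put_coefficient(f, 0) + van_der_put_coefficient(f, 1)) % 2 != 1:
        return False
    for m in range(2, 1 << levels):
        b = van_der_put_coefficient(f, m)
        # |B_m|_2 = 2^(-floor(log2 m)) means ord_2(B_m) = floor(log2 m).
        if b == 0 or ord2(b) != m.bit_length() - 1:
            return False
    return True


def is_bijective_mod(f, modulus):
    return len({f(x) % modulus for x in range(modulus)}) == modulus


for name, f in [("x", lambda x: x), ("3x+1", lambda x: 3 * x + 1),
                ("x^2+x", lambda x: x * x + x), ("x+(x^2|5)", lambda x: x + (x * x | 5))]:
    print(name, measure_preserving_up_to(f, 6), is_bijective_mod(f, 2 ** 6))
```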


8.4.2 Ergodicity criteria for T-functions

In this subsection, we prove the following criterion of ergodicity for T-functions:

Theorem 8.24. A T-function $f : \mathbb{Z}_2 \to \mathbb{Z}_2$ is ergodic if and only if it can be represented as
\[ f(x) = b_0\chi(0, x) + b_1\chi(1, x) + \sum_{m=2}^{\infty} 2^{\lfloor \log_2 m \rfloor} b_m \chi(m, x) \]
for suitable $b_m \in \mathbb{Z}_2$ that satisfy the following conditions:
1. $b_0 \equiv 1 \pmod 2$;
2. $b_0 + b_1 \equiv 3 \pmod 4$;
3. $|b_m|_2 = 1$, $m \ge 2$;
4. $b_2 + b_3 \equiv 2 \pmod 4$;
5. $\sum_{m=2^{n-1}}^{2^n - 1} b_m \equiv 0 \pmod 4$, $n \ge 3$.

To prove the Theorem, we need the following Lemma:

Lemma 8.25. Let $f : \mathbb{Z}_2 \to \mathbb{Z}_2$ be a T-function represented by the van der Put series (3.10). Then $f$ is ergodic if and only if there exists a sequence $a_0, a_1, \ldots$ of 2-adic integers such that
\[
B_m =
\begin{cases}
1 + 2(a_1 - a_0), & \text{if } m = 0; \\
2(1 + a_0 + 2a_2 - a_1), & \text{if } m = 1; \\
2^{n-1} + 2^n a_{m+1} - 2^n a_m, & \text{if } 2^{n-1} \le m < 2^n - 1,\ n \ge 2; \\
2^{n-1} + 2^{n+1} a_{2^n} - 2^n a_{2^n - 1} - 2^n a_{2^{n-1}}, & \text{if } m = 2^n - 1,\ n \ge 2.
\end{cases}
\tag{8.52}
\]

Proof. By Exercise 8.17, a T-function $f$ is ergodic if and only if it can be represented as $f(x) = 1 + x + 2(g(x + 1) - g(x))$, where $g(x)$ is a suitable T-function. That is, by Theorem 4.18,
\[ g(x) = a_0\chi(0, x) + \sum_{m=1}^{\infty} 2^{n_m - 1} a_m \chi(m, x) = \sum_{m=0}^{\infty} \tilde B_m \chi(m, x), \tag{8.53} \]
for suitable $a_0, a_1, a_2, \ldots \in \mathbb{Z}_2$, $\tilde B_0, \tilde B_1, \tilde B_2, \ldots \in \mathbb{Z}_2$ (here $n_m = \lfloor \log_2 m \rfloor + 1$, $m = 1, 2, 3, \ldots$).



Now, to prove necessity of the conditions of the Lemma, we just need to express the van der Put coefficients of the function $f$ via the coefficients of the function $g$. First, we do this for the van der Put coefficients $\bar B_m$ of the T-function
\[
g(x+1) = \sum_{m=0}^{\infty} \bar B_m \chi(m,x).
\tag{8.54}
\]
If $m > 1$ then by (3.12) $\bar B_m = g(m+1) - g(m+1-q(m))$, where $q(m) = \delta_{n_m-1}(m)\, 2^{n_m-1}$. If $m \ne 2^{n_m} - 1$ then $q(m) = q(m+1)$, therefore $\bar B_m = g(m+1) - g(m+1-q(m+1)) = \tilde B_{m+1} = 2^{n_m-1} a_{m+1}$ by (8.53), as $n_m = n_{m+1}$ in this case. If $m = 2^{n_m} - 1$ then $\bar B_m = g(2^{n_m}) - g(2^{n_m-1})$, as $q(2^{n_m}-1) = 2^{n_m-1}$. As $\tilde B_{2^n} = g(2^n) - g(0)$ and $\tilde B_{2^{n-1}} = g(2^{n-1}) - g(0)$ by (8.53), we conclude that $\bar B_{2^n-1} = \tilde B_{2^n} - \tilde B_{2^{n-1}}$. Finally, the coefficients $\bar B_0, \bar B_1$ can be found directly from (8.54): $\bar B_0 = g(1) = \tilde B_1$, $\bar B_1 = g(2) = \tilde B_0 + \tilde B_2$. Now we can find the van der Put coefficients $\hat B_m$ of the function $2(g(x+1) - g(x))$; they are:
\[
\hat B_m =
\begin{cases}
2(\tilde B_1 - \tilde B_0), & \text{if } m = 0;\\
2(\tilde B_0 + \tilde B_2 - \tilde B_1), & \text{if } m = 1;\\
2(\tilde B_{m+1} - \tilde B_m), & \text{if } 2^{n-1} \le m < 2^n - 1,\ n \ge 2;\\
2(\tilde B_{m+1} - \tilde B_m - \tilde B_{\frac{m+1}{2}}), & \text{if } m = 2^n - 1,\ n \ge 2.
\end{cases}
\tag{8.55}
\]

As $\chi(0,x) + \chi(1,x) = 1$ for all $x \in \mathbb{Z}_2$, from (4.29) we derive the van der Put expansion for the function $x + 1$; namely,
\[
x + 1 = \chi(0,x) + 2\chi(1,x) + \sum_{m=2}^{\infty} 2^{n_m-1}\chi(m,x).
\tag{8.56}
\]
From (8.53) we have that $\tilde B_0 = a_0$, $\tilde B_m = 2^{n-1} a_m$ when $2^{n-1} \le m \le 2^n - 1$, $n = 1, 2, \ldots$. Now combining the latter expressions with (8.56) and (8.54) we conclude that the van der Put coefficients $B_m$ of the function $f(x) = 1 + x + 2(g(x+1) - g(x))$ are of the form (8.52).

To prove sufficiency of the conditions of the Lemma we just remark that the above argument shows that, given expressions (8.52) for the van der Put coefficients of the function $f$, we can represent the T-function $f$ in the form $f(x) = 1 + x + 2(g(x+1) - g(x))$, where the van der Put expansion for $g$ is given by (8.53). That is, the function $g$ is a T-function by Theorem 4.18; therefore the T-function $f$ is ergodic by Exercise 8.17.

Now we are able to prove the following Proposition, which actually is a criterion of ergodicity for T-functions in terms of van der Put coefficients:

Proposition 8.26. Let $f\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ be a T-function which is represented by the van der Put series (3.10). Then $f$ is ergodic if and only if the following conditions are satisfied simultaneously:



1. $B_0 \equiv 1 \pmod 2$;
2. $B_0 + B_1 \equiv 3 \pmod 4$;
3. $|B_m|_2 = 2^{-(n-1)}$, $n \ge 2$, $2^{n-1} \le m < 2^n - 1$;
4. $\left|\sum_{m=2^{n-1}}^{2^n-1} (B_m - 2^{n-1})\right|_2 \le 2^{-(n+1)}$, $n \ge 2$.
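The four conditions of Proposition 8.26 are easy to test on finitely many coefficients. The following sketch is illustrative only: it computes van der Put coefficients of a map given on non-negative integers, assuming the rule $B_m = f(m)$ for $m < 2$ and $B_m = f(m) - f(m - 2^{n-1})$ for $2^{n-1} \le m < 2^n$, and checks the conditions for $n \le$ `max_n`. The sample map `klimov_shamir` reads $x + x^2\ \mathrm{OR}\ 5$ as $x + (x^2\ \mathrm{OR}\ 5)$, which is an assumption about the intended parenthesization.

```python
def vdp_coefficients(f, count):
    """Van der Put coefficients B_0, ..., B_{count-1} of f over Z_2 (assumed rule, see lead-in)."""
    coeffs = []
    for m in range(count):
        if m < 2:
            coeffs.append(f(m))
        else:
            q = 1 << (m.bit_length() - 1)     # 2**(n-1), the leading binary digit of m
            coeffs.append(f(m) - f(m - q))
    return coeffs


def valuation2(z):
    """2-adic valuation of an integer (infinite for zero)."""
    if z == 0:
        return float("inf")
    return (z & -z).bit_length() - 1


def passes_prop_8_26(f, max_n):
    """Check conditions 1-4 of Proposition 8.26 for all n = 2, ..., max_n."""
    B = vdp_coefficients(f, 1 << max_n)
    if B[0] % 2 != 1 or (B[0] + B[1]) % 4 != 3:
        return False
    for n in range(2, max_n + 1):
        lo, hi = 1 << (n - 1), 1 << n
        if any(valuation2(B[m]) != n - 1 for m in range(lo, hi - 1)):
            return False                       # condition 3
        if sum(B[m] - lo for m in range(lo, hi)) % (2 * hi) != 0:
            return False                       # condition 4: sum is 0 modulo 2**(n+1)
    return True


def klimov_shamir(x):
    return x + ((x * x) | 5)


print(passes_prop_8_26(lambda x: x + 1, 8))    # B_m = 2**(n-1) for m >= 2
print(passes_prop_8_26(klimov_shamir, 4))
print(passes_prop_8_26(lambda x: 3 * x + 1, 4))
```

Passing the check for a bounded range of $n$ is only necessary evidence of ergodicity; the proposition requires the conditions for every $n \ge 2$.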

Proof. By Lemma 8.25, if the T-function $f$ is ergodic then its van der Put coefficients $B_m$ can be expressed in the form (8.52), for suitable $a_0, a_1, a_2, \ldots \in \mathbb{Z}_2$. From (8.52), by direct calculation, we easily prove that conditions 1–4 of the proposition are true.

By Lemma 8.25, to prove sufficiency of the conditions of the proposition we must find a sequence of 2-adic integers $a_0, a_1, a_2, \ldots$ such that relations (8.52) for the van der Put coefficients $B_m$ hold. Take arbitrary $a_0, a_1 \in \mathbb{Z}_2$ so that
\[
a_1 - a_0 = \frac{B_0 - 1}{2}
\tag{8.57}
\]
(cf. the first equation from (8.52) and condition 1 of the proposition); then put
\[
a_2 = \frac{B_1 + 2(a_1 - a_0) - 2}{4} = \frac{B_1 + B_0 - 3}{4}
\tag{8.58}
\]
(cf. the second equation from (8.52)). Note that $a_2 \in \mathbb{Z}_2$ due to condition 2 of the proposition. We construct $a_3, a_4, a_5, \ldots \in \mathbb{Z}_2$ inductively. Denote
\[
\check B_m = \frac{B_m - 2^{n-1}}{2^n},
\tag{8.59}
\]
where $n = \lfloor \log_2 m \rfloor + 1$, $m \ge 3$; then $\check B_m \in \mathbb{Z}_2$ by condition 3. Given $a_{2^{n-1}} \in \mathbb{Z}_2$, for $\alpha = 1, 2, \ldots, 2^{n-1} - 1$ put
\[
a_{2^{n-1}+\alpha} = a_{2^{n-1}} + \sum_{m=2^{n-1}}^{2^{n-1}+\alpha-1} \check B_m,
\tag{8.60}
\]
\[
a_{2^n} = a_{2^{n-1}} + \frac{1}{2} \sum_{m=2^{n-1}}^{2^n-1} \check B_m.
\tag{8.61}
\]
Then $a_{2^{n-1}+\alpha} \in \mathbb{Z}_2$ by condition 3 of the Proposition, and $a_{2^n} \in \mathbb{Z}_2$ by condition 4. Therefore all $a_0, a_1, a_2, \ldots$ are in $\mathbb{Z}_2$. Now solving the system of equations (8.57), (8.58), (8.59), (8.60), (8.61) with respect to the unknowns $B_m$, $m = 0, 1, 2, 3, \ldots$, we see that the van der Put coefficients $B_m$ satisfy conditions (8.52) of Lemma 8.25. Therefore $f$ is ergodic.

Now we are able to prove Theorem 8.24:



Proof of Theorem 8.24. Consider the van der Put expansion (3.10) of the T-function $f$; then by Theorem 4.18, $B_m = 2^{\lfloor \log_2 m \rfloor} b_m$ for suitable $b_m \in \mathbb{Z}_2$. It is clear now that conditions 1, 2 and 3 of Proposition 8.26 are equivalent respectively to conditions 1, 2 and 3 of Theorem 8.24. Take $2^{n-1} \le m < 2^n$, $n \ge 2$; thus $B_m = 2^{n-1} b_m$. Then condition 4 of Proposition 8.26 is equivalent to the congruence $\sum_{m=2^{n-1}}^{2^n-1} (B_m - 2^{n-1}) \equiv 0 \pmod{2^{n+1}}$, whence it is equivalent to the congruence $2^{n-1} \sum_{m=2^{n-1}}^{2^n-1} (b_m - 1) \equiv 0 \pmod{2^{n+1}}$, as $B_m = 2^{n-1} b_m$. However, the latter congruence is equivalent to the congruence $\sum_{m=2^{n-1}}^{2^n-1} (b_m - 1) \equiv 0 \pmod 4$, which in turn is equivalent either to the congruence $\sum_{m=2^{n-1}}^{2^n-1} b_m \equiv 0 \pmod 4$ (when $n \ge 3$) or to the congruence $\sum_{m=2^{n-1}}^{2^n-1} b_m \equiv 2 \pmod 4$ (when $n = 2$). The latter two congruences are respectively conditions 5 and 4 of Theorem 8.24.

Exercise 8.27. Prove that given a sequence $c, c_0, c_1, c_2, \ldots$ of 2-adic integers, the series
\[
c + \sum_{i=0}^{\infty} c_i \delta_i(x)
\tag{8.62}
\]

defines an ergodic T-function $f\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ if and only if the following conditions hold simultaneously:
1. $c \equiv 1 \pmod 2$;
2. $c_0 \equiv 1 \pmod 4$;
3. $|c_i|_2 = 2^{-i}$, for $i = 1, 2, 3, \ldots$.

Exercise 8.28. Prove that the following T-function $f$ is ergodic on $\mathbb{Z}_2$:
\[
f(x) = 1 + \delta_0(x) + 6\delta_1(x) + \sum_{k=2}^{\infty} \bigl(1 + 2(x \mathbin{\mathrm{AND}} (2^k - 1))\bigr)\, 2^k\, \delta_k(x).
\]
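As a sanity check for Exercise 8.28 (not a solution to it), one can reduce the function modulo $2^k$ and test whether the reduced map is a single cycle. In the sketch below, which is illustrative only, $\delta_i(x)$ is taken to be the $i$-th binary digit of $x$, and terms with index $i \ge k$ are dropped because they are divisible by $2^k$.

```python
def f_828(x, k):
    """The function of Exercise 8.28 reduced modulo 2**k, for an integer 0 <= x < 2**k."""
    def digit(i):
        return (x >> i) & 1                   # delta_i(x), the i-th binary digit

    total = 1 + digit(0) + 6 * digit(1)
    for i in range(2, k):                     # terms with i >= k vanish modulo 2**k
        total += (1 + 2 * (x & ((1 << i) - 1))) * (1 << i) * digit(i)
    return total % (1 << k)


def is_single_cycle(k):
    """True if x -> f_828(x, k) is one cycle of length 2**k."""
    x = 0
    for steps in range(1, (1 << k) + 1):
        x = f_828(x, k)
        if x == 0:
            return steps == (1 << k)
    return False


print([is_single_cycle(k) for k in range(1, 5)])
```

For instance, modulo 8 the orbit of 0 is 0, 1, 2, 7, 4, 5, 6, 3 and then returns to 0.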

Chapter 9

Measure-preservation and ergodicity of uniformly differentiable functions on $\mathbb{Z}_p^n$

In this chapter we obtain ergodicity and/or measure-preservation conditions for functions $F\colon \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ that are uniformly differentiable modulo $p$ and that have integer-valued derivatives modulo $p$. Recall that in view of Theorem 4.6 all these functions are locally 1-Lipschitz (or, locally compatible), that is, they are 1-Lipschitz on all sufficiently small balls. Therefore for these functions $F$ the reduced maps $F \bmod p^k\colon (\mathbb{Z}_p/p^k\mathbb{Z}_p)^n \to (\mathbb{Z}_p/p^k\mathbb{Z}_p)^m$ are well defined provided $k$ is sufficiently large, say, $k \ge N_1(F)$; cf. Definition 2.27. Thus, we can apply Theorem 7.1 to study measure-preservation and ergodicity of $F$. As 1-Lipschitz uniformly differentiable functions are a special case of the functions under consideration, the theory that follows can be applied to various important classes of functions, e.g., to analytic functions on $\mathbb{Z}_p$, C-functions, B-functions, A-functions (in particular, to twice integer-valued polynomials over $\mathbb{Q}_p$), etc. Also, the theory works for a number of problems arising in computer science, numerical simulations, and cryptology; see Chapters 12, 13 and 14.

9.1 Conditions for measure-preservation

In this section we answer the question of when a function that is uniformly differentiable modulo $p$ is measure-preserving, provided that all its derivatives modulo $p$ are integer-valued.

Theorem 9.1. Let the function $F\colon \mathbb{Z}_p^n \to \mathbb{Z}_p^m$, $m \le n$, be uniformly differentiable modulo $p$, and let all partial derivatives modulo $p$ of the function $F$ be integer-valued. Then $F$ is measure-preserving whenever the following two conditions hold simultaneously:


1. $F$ is balanced modulo $p^k$ for some $k \ge N_1(F)$.
2. The rank $\operatorname{rk} F_1'(y)$ of the Jacobi matrix $F_1'(y)$ modulo $p$ is $m$ at all points $y \in \mathbb{Z}_p^n$.

Moreover, in the case when $m = n$ the mentioned conditions are also necessary: if $F\colon \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ is measure-preserving then $F$ is bijective modulo $p^k$ for all $k \ge N_1(F)$, and $\det F_1'(y) \not\equiv 0 \pmod p$ for all $y \in \mathbb{Z}_p^n$. Finally, the function $F\colon \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ is measure-preserving if and only if $F$ is bijective modulo $p^k$ for some $k \ge N_1(F) + 1$.

Proof. During the proof we consider elements of the ring $(\mathbb{Z}/p^r\mathbb{Z})^\ell$ (the $\ell$-th Cartesian power of the residue ring $\mathbb{Z}/p^r\mathbb{Z}$) as ordered strings of $\ell$ numbers from $\{0, 1, \ldots, p^r - 1\}$. With this in mind, for $y \in (\mathbb{Z}/p^s\mathbb{Z})^m$ denote $F_s^{-1}(y) = \{v \in (\mathbb{Z}/p^s\mathbb{Z})^n : F(v) \equiv y \pmod{p^s}\}$, the preimage of $y$ with respect to the reduced map $F \bmod p^s\colon (\mathbb{Z}/p^s\mathbb{Z})^n \to (\mathbb{Z}/p^s\mathbb{Z})^m$. Let $s \ge k \ge N_1(F)$. As $F$ is locally 1-Lipschitz, $F$ is a sum of a 1-Lipschitz function with a step function of order not greater than $N_1(F)$ (see Theorem 4.6); so we conclude that if $u \in F_{s+1}^{-1}(w)$, then $\bar u \in F_s^{-1}(\bar w)$. Here and further in the proof $\bar a = (\bar a_1, \ldots, \bar a_m) \in (\mathbb{Z}/p^s\mathbb{Z})^m$ stands for $a \bmod p^s = (a_1 \bmod p^s, \ldots, a_m \bmod p^s)$, where $a = (a_1, \ldots, a_m) \in (\mathbb{Z}/p^{s+1}\mathbb{Z})^m$. Put $z = \bar u + p^s h \in (\mathbb{Z}/p^{s+1}\mathbb{Z})^n$, where $h \in (\mathbb{Z}/p\mathbb{Z})^n$. As $F$ is uniformly differentiable modulo $p$, then
\[
F(z) \equiv F(\bar u) + p^s h \cdot F_1'(\bar u) \pmod{p^{s+1}},
\tag{9.1}
\]
cf. Definition 2.25. As $F(\bar u) \equiv \bar w + p^s b \pmod{p^{s+1}}$ and $w = \bar w + p^s c$ for suitable $b, c \in (\mathbb{Z}/p\mathbb{Z})^m$, from (9.1) we deduce that $z \in F_{s+1}^{-1}(w)$ if and only if $\bar z \in F_s^{-1}(\bar w)$ (i.e., $\bar u \in F_s^{-1}(\bar w)$) and $h$ satisfies the following system of linear equations over the field $\mathbb{Z}/p\mathbb{Z}$:
\[
b + h \cdot F_1'(\bar u) = c.
\tag{9.2}
\]
Consequently, if the columns of the matrix $F_1'(\bar u)$ are linearly independent over $\mathbb{Z}/p\mathbb{Z}$, then linear system (9.2) has exactly $p^{n-m}$ pairwise distinct solutions $h \in (\mathbb{Z}/p\mathbb{Z})^n$ given arbitrary $b, c \in (\mathbb{Z}/p\mathbb{Z})^m$. From here it follows that
\[
\# F_{s+1}^{-1}(w) = (\# F_s^{-1}(\bar w)) \cdot p^{n-m}.
\tag{9.3}
\]
Hence, if $F$ is balanced modulo $p^s$ (i.e., if $\# F_s^{-1}(\bar w)$ does not depend on $\bar w$) and if the rank of the matrix $F_1'(\bar w)$ is $m$ for all $\bar w \in (\mathbb{Z}/p^s\mathbb{Z})^n$, then (9.3) implies that $F$ is balanced modulo $p^{s+1}$. However, in view of Proposition 2.38, the matrix $F_1'(\bar w)$ depends only on $\bar w \bmod p^{N_1(F)}$. This in view of Theorem 7.1 proves the first claim of Theorem 9.1.

To prove the second claim, take $m = n$ and suppose that $F\colon \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ is a measure-preserving function. In view of Theorem 7.1 this implies that



$F$ is bijective modulo $p^k$ for all $k \ge N_1(F)$. Definition 2.25 of uniform differentiability modulo $p$ implies that
\[
F(u + p^k h) \equiv F(u) + p^k h \cdot F_1'(u) \pmod{p^{k+1}}
\tag{9.4}
\]
for all $u, h \in \mathbb{Z}_p^n$. Here $F_1'(u)$ is an $n \times n$ matrix over the field $\mathbb{Z}/p\mathbb{Z}$. If $\det F_1'(u) \equiv 0 \pmod p$ for some $u \in \mathbb{Z}_p^n$ (or, equivalently, for some $u \in \{0, 1, \ldots, p^{N_1(F)} - 1\}^n$, as partial derivatives modulo $p$ are step functions, see Proposition 2.38), then there exists $h \in \{0, 1, \ldots, p-1\}^n$, $h \not\equiv (0, \ldots, 0) \pmod p$, such that $h \cdot F_1'(u) \equiv (0, \ldots, 0) \pmod p$. However, then (9.4) implies that $F(u + p^k h) \equiv F(u) \pmod{p^{k+1}}$, in contradiction with the bijectivity modulo $p^{k+1}$ of the function $F$, as $u + p^k h \not\equiv u \pmod{p^{k+1}}$.

Finally, if $F$ is bijective modulo $p^k$ for some $k \ge N_1(F) + 1$ then $F$ is bijective modulo $p^{k-1}$ due to the compatibility of $F$ (see Proposition 6.5), and $\det F_1'(u) \equiv 0 \pmod p$ nowhere on $\mathbb{Z}_p^n$, since otherwise the above argument shows that $F$ is not bijective modulo $p^k$. Therefore $F$ is measure-preserving by the first assertion of Theorem 9.1.

The following note is important for applications in computer science and cryptology, see Chapter 13.

Note 9.2. As partial derivatives modulo $p$ are step functions by Proposition 2.38, in order to verify whether the condition $\operatorname{rk} F_1'(y) = m$ of Theorem 9.1 (or, respectively, the condition $\det F_1'(y) \not\equiv 0 \pmod p$) holds for all $y \in \mathbb{Z}_p^n$ it is sufficient to check only whether these conditions hold for $y \in \{0, 1, \ldots, p^{N_1(F)} - 1\}^n$.
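The finite check described in Note 9.2 can be sketched for $m = n = 2$. The code below is an illustration under stated assumptions: the maps `F_good` and `F_bad` are hypothetical polynomial examples over $\mathbb{Z}_3$, their Jacobi matrices are supplied by hand, and $N_1(F) = 1$ is assumed for polynomial maps, so the determinant is tested on $\{0, 1, 2\}^2$ only.

```python
from itertools import product


def is_bijective_mod(F, p, k):
    """True if F induces a bijection on (Z/p**k Z)**2."""
    q = p ** k
    images = {tuple(c % q for c in F(x, y)) for x, y in product(range(q), repeat=2)}
    return len(images) == q * q


def jacobian_nonsingular_mod_p(J, p):
    """Check det J(x, y) != 0 (mod p) on the finite set {0, ..., p-1}**2."""
    for x, y in product(range(p), repeat=2):
        (a, b), (c, d) = J(x, y)
        if (a * d - b * c) % p == 0:
            return False
    return True


def F_good(x, y):
    return (x + y * y, y + 3 * x)


def J_good(x, y):
    return ((1, 2 * y), (3, 1))               # partial derivatives, row by row


def F_bad(x, y):
    return (x ** 3, y)


def J_bad(x, y):
    return ((3 * x * x, 0), (0, 1))


p = 3
print(jacobian_nonsingular_mod_p(J_good, p), [is_bijective_mod(F_good, p, k) for k in (1, 2, 3)])
print(jacobian_nonsingular_mod_p(J_bad, p), [is_bijective_mod(F_bad, p, k) for k in (1, 2)])
```

In this sketch `F_good` is bijective modulo 3 with a non-singular Jacobi matrix and stays bijective modulo 9 and 27, while `F_bad` is bijective modulo 3 but its determinant vanishes modulo 3 and bijectivity fails modulo 9, in line with the necessity part of Theorem 9.1.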

Exercise 9.3. Prove that the bound given by Theorem 9.1 is sharp: find a function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ such that
• $f$ is uniformly differentiable modulo $p$,
• $f_1'$ is integer-valued,
• $f$ is bijective modulo $p^{N_1(f)}$,
• $f$ is not bijective modulo $p^{N_1(f)+1}$, and
• $f$ is not measure-preserving.
(Hint: Consider $f(x) = 1 + x^p$.)

Exercise 9.4. Prove that a polynomial from $\mathbb{Z}_p[x]$ is measure-preserving if and only if it is bijective modulo $p$ and its derivative vanishes modulo $p$ nowhere on $\mathbb{Z}/p\mathbb{Z}$.



Exercise 9.5. Prove that a C-function $f = \sum_{i=0}^{\infty} c_i x^i$ is measure-preserving on $\mathbb{Z}_2$ if and only if the following conditions hold simultaneously:
\[
\begin{aligned}
c_2 + c_4 + c_6 + \cdots &\equiv 0 \pmod 2;\\
c_3 + c_5 + c_7 + \cdots &\equiv 0 \pmod 2;\\
c_1 &\equiv 1 \pmod 2.
\end{aligned}
\tag{9.5}
\]
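For polynomials with integer coefficients, conditions (9.5) can be compared with brute-force bijectivity modulo small powers of 2. The sketch below is illustrative and covers only polynomials of degree at most 4 with coefficients in $\{0, 1, 2, 3\}$ (a restriction chosen here for a fast exhaustive loop), not general C-functions.

```python
from itertools import product


def poly_value(coeffs, x, modulus):
    """Horner evaluation of c_0 + c_1 x + c_2 x**2 + ... modulo `modulus`."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % modulus
    return acc


def is_bijective_mod(coeffs, modulus):
    """True if the polynomial permutes the residues modulo `modulus`."""
    return len({poly_value(coeffs, x, modulus) for x in range(modulus)}) == modulus


def satisfies_9_5(coeffs):
    """Conditions (9.5): c_1 odd, c_2 + c_4 + ... even, c_3 + c_5 + ... even."""
    c = list(coeffs) + [0, 0]                 # pad so that short polynomials are handled
    return c[1] % 2 == 1 and sum(c[2::2]) % 2 == 0 and sum(c[3::2]) % 2 == 0


agree = all(
    satisfies_9_5(c) == is_bijective_mod(c, 4) == is_bijective_mod(c, 16)
    for c in product(range(4), repeat=5)
)
print(agree)
```

The comparison with the modulus 4 reflects the statement of Exercise 9.7 for $p = 2$ (bijectivity modulo $p^2$), and the modulus 16 is an additional consistency check.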

Exercise 9.6. Prove that the T-function $f(x) = c_0 \circ_1 c_1 x \circ_2 c_2 x^2 \circ_3 \cdots \circ_n c_n x^n$, where $\circ_j \in \{+, \mathrm{XOR}\}$, $j = 1, 2, \ldots, n$, is measure-preserving on $\mathbb{Z}_2$ if and only if conditions (9.5) hold simultaneously.

Exercise 9.7. Prove that a polynomial from $\mathbb{Z}_p[x]$ is measure-preserving if and only if it is bijective modulo $p^2$.

Exercise 9.8. Prove that a polynomial from $\mathbb{Z}_p[x_1, \ldots, x_n]$ is measure-preserving whenever it is balanced modulo $p$ and all its partial derivatives vanish modulo $p$ simultaneously at no point from $(\mathbb{Z}/p\mathbb{Z})^n = \{0, 1, \ldots, p-1\}^n$.

The two preceding exercises may be generalized:

Exercise 9.9. Under the assumptions of Theorem 9.1 assume that $m = 1$. Prove that $F$ is measure-preserving whenever $F$ is balanced modulo $p^k$ for some $k \ge N_1(F)$, and all partial derivatives modulo $p$ of the function $F$ vanish simultaneously at no point of $(\mathbb{Z}/p^k\mathbb{Z})^n$. If additionally $n = 1$, then $F$ is measure-preserving if and only if it is bijective modulo $p^{N_1(F)}$ and its derivative modulo $p$ vanishes at no point of $\{0, 1, \ldots, p^{N_1(F)} - 1\}$. Equivalently, if $m = n = 1$ then $F$ is measure-preserving if and only if $F$ is bijective modulo $p^{N_1(F)+1}$.

Comparing the first assertion of Theorem 9.1 with the remaining ones, it is natural to ask whether the sufficient conditions of measure-preservation from the first assertion are also necessary. The answer is negative even for polynomials over $\mathbb{Z}_p$:

Example 9.10. Consider a polynomial $f(x, y) = 2x + y^3$ over $\mathbb{Z}_2$, in variables $x, y$. As $\frac{\partial f(x,y)}{\partial x} = 2$, $\frac{\partial f(x,y)}{\partial y} = 3y^2$, both partial derivatives are 0 modulo 2 whenever $y \equiv 0 \pmod 2$. Nevertheless, $f$ is a measure-preserving mapping from $\mathbb{Z}_2^2$ onto $\mathbb{Z}_2$. Here is a proof. By induction on $\ell$ we prove that $f$ is balanced modulo $2^\ell$ for all $\ell = 1, 2, \ldots$. The claim follows then from Theorem 7.1. For $\ell = 1$ we have that $f(x, y) \equiv y \pmod 2$, that is, $f$ is balanced modulo 2. Let $\ell > 1$. We will show that for every $z \in \mathbb{Z}/2^\ell\mathbb{Z}$ there exist exactly $2^\ell$ pairs $(x, y)$ such that $f(x, y) \equiv z \pmod{2^\ell}$ and $(x, y) \in \{0, 1, \ldots, 2^\ell - 1\}^2$. Indeed, if $z = 1 + 2r$ for some $r \in \{0, 1, \ldots, 2^{\ell-1} - 1\}$, then it follows that $y = 1 + 2k$ for some $k \in \{0, 1, \ldots, 2^{\ell-1} - 1\}$. So $2x + (1 + 2k)^3 \equiv 1 + 2r \pmod{2^\ell}$ implies $x + 3k + 6k^2 + 4k^3 \equiv r \pmod{2^{\ell-1}}$. The left hand side of the latter congruence is a polynomial $g(x, k)$ in $x, k$. The polynomial $g(x, k)$



is measure-preserving in view of Theorem 9.1. This implies that the congruence $g(x, k) \equiv r \pmod{2^{\ell-1}}$ in unknowns $x, k$ has exactly $2^{\ell-1}$ solutions in $\{0, 1, \ldots, 2^{\ell-1} - 1\}^2$. If $z = 2r$ for some $r \in \{0, 1, \ldots, 2^{\ell-1} - 1\}$, then it follows that $y = 2k$ for some $k \in \{0, 1, \ldots, 2^{\ell-1} - 1\}$; consequently, the congruence $f(x, y) \equiv z \pmod{2^\ell}$ implies the congruence $x + 4k^3 \equiv r \pmod{2^{\ell-1}}$. The polynomial $d(x, k) = x + 4k^3$ is measure-preserving in view of Theorem 9.1. Now using an argument similar to that of the case $z = 1 + 2r$ we conclude that the congruence $f(x, y) \equiv 2r \pmod{2^\ell}$ in unknowns $x, y$ has exactly $2^\ell$ solutions in $\{0, 1, \ldots, 2^\ell - 1\}^2$. This proves that $f$ is measure-preserving.

Theorem 9.1 together with Example 9.10 give rise to the following problem that is important both for theory and for various applications (e.g., in computer science and cryptology, see Chapter 13):

Open question 9.11. Given a function $F\colon \mathbb{Z}_p^n \to \mathbb{Z}_p^m$, $m < n$, that is uniformly differentiable modulo $p$ and such that all its partial derivatives modulo $p$ are integer-valued. Find necessary and sufficient conditions of measure-preservation for the function $F\colon \mathbb{Z}_p^n \to \mathbb{Z}_p^m$. These conditions are not known even if $F$ is a multivariate polynomial over $\mathbb{Z}_p$ (or over $\mathbb{Z}$).
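The balance property established in Example 9.10 is easy to confirm by exhaustive counting for small $\ell$. The following sketch is an illustration only: it counts, for each residue $z$ modulo $2^\ell$, the pairs $(x, y) \in \{0, \ldots, 2^\ell - 1\}^2$ with $2x + y^3 \equiv z \pmod{2^\ell}$.

```python
from collections import Counter


def preimage_counts(level):
    """Number of solutions of 2x + y**3 = z (mod 2**level) for each residue z."""
    q = 1 << level
    return Counter((2 * x + y ** 3) % q for x in range(q) for y in range(q))


for level in range(1, 7):
    counts = preimage_counts(level)
    balanced = len(counts) == 1 << level and set(counts.values()) == {1 << level}
    print(level, balanced)
```

Every residue is hit exactly $2^\ell$ times, although both partial derivatives vanish modulo 2 at every point with even $y$.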

9.2 No uniformly differentiable 1-Lipschitz ergodic transformations on $\mathbb{Z}_p^n$, $n \ge 2$

Now we turn to conditions for ergodicity of functions that are uniformly differentiable modulo $p$ and that have integer-valued derivatives modulo $p$. This class of functions contains all locally 1-Lipschitz functions that are uniformly differentiable modulo $p$, cf. Exercise 2.26. It turns out that ergodic functions of this kind can be univariate only; namely:

Theorem 9.12. Let a function $F\colon \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ be uniformly differentiable modulo $p$, let all its partial derivatives modulo $p$ be integer-valued, and let $F$ be ergodic. Then $n = 1$.

To prove Theorem 9.12, we need two lemmas.

Lemma 9.13. Let a function $f\colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ be uniformly differentiable modulo $p$, let it have integer-valued derivatives modulo $p$, and let $f$ be an identity modulo $p^k$ for some $k > N_1(f)$. Then every partial derivative modulo $p$ of the function $f$ is an identity modulo $p$ (cf. Definition 3.14).

Proof of Lemma 9.13. Fix arbitrary $x_0, x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n \in \mathbb{Z}_p$ and consider the function $g_i(x_0, x_1, \ldots, x_n) = x_i + x_0 f(x_1, \ldots, x_n)$ of the variable $x_i$. It is clear that $g_i$ is uniformly differentiable modulo $p^k$, its derivative modulo $p^k$



is integer-valued, and $g_i$ is bijective modulo $p^k$. As $k > N_1(f)$, the function $g_i$ is measure-preserving by Theorem 9.1; so the derivative modulo $p$ of the function $g_i$ is not zero modulo $p$ everywhere on $\mathbb{Z}_p$:
\[
\frac{\partial_1 g_i(u_0, \ldots, u_n)}{\partial_1 x_i} = 1 + u_0 \cdot \frac{\partial_1 f(u_1, \ldots, u_n)}{\partial_1 x_i} \not\equiv 0 \pmod p
\tag{9.6}
\]
for all $u_0, \ldots, u_n \in \mathbb{Z}_p$. If
\[
\frac{\partial_1 f(u_1, \ldots, u_n)}{\partial_1 x_i} \equiv d \not\equiv 0 \pmod p
\]
for some $u_1, \ldots, u_n \in \mathbb{Z}_p$, then taking $u_0$ such that $u_0 d \equiv -1 \pmod p$ we obtain a contradiction to (9.6). This proves Lemma 9.13.

Lemma 9.14. Let a function $H\colon \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ be uniformly differentiable modulo $p$, and let $H$ have integer-valued derivatives modulo $p$. If $H$ is bijective modulo $p^k$ and if $H$ induces a trivial permutation modulo $p^{k-1}$ (i.e., an identity transformation on $(\mathbb{Z}/p^{k-1}\mathbb{Z})^n$) for some $k > N_1(H) + 1$, then $H$ induces on $(\mathbb{Z}/p^k\mathbb{Z})^n$ either a trivial permutation, or a permutation of multiplicative order $p$ (that is, either this permutation is the unit element of $\mathrm{Sym}(p^{kn})$, the finite symmetric group of all permutations of a $p^{kn}$-element set, or the order of this permutation as an element of $\mathrm{Sym}(p^{kn})$ is $p$).

Proof of Lemma 9.14. Let $G$ be an arbitrary function that satisfies the conditions of Lemma 9.14, and let $N_1(G) = N_1(H)$. Represent both $H$ and $G$ in the following form:
\[
H(x_1, \ldots, x_n) = (x_1, \ldots, x_n) + U(x_1, \ldots, x_n); \qquad G(x_1, \ldots, x_n) = (x_1, \ldots, x_n) + V(x_1, \ldots, x_n).
\]
Then both $U$ and $V$ are uniformly differentiable modulo $p$, have integer-valued derivatives modulo $p$, and $N_1(U) = N_1(V) = N_1(H)$. Moreover, both $U$ and $V$ are identities modulo $p^{k-1}$ whenever $k - 1 > N_1(H)$. Then Lemma 9.13 implies that $U_1' = V_1' = 0$ everywhere on $\mathbb{Z}_p^n$. As $|U|_p \le p^{-k+1}$ and $|V|_p \le p^{-k+1}$ everywhere on $\mathbb{Z}_p^n$, and as both $U$ and $V$ are uniformly differentiable modulo $p$, from (2.10) we deduce that
\[
\begin{aligned}
H(G(h_1, \ldots, h_n)) &= H((h_1, \ldots, h_n) + V(h_1, \ldots, h_n))\\
&\equiv H(h_1, \ldots, h_n) + V(h_1, \ldots, h_n) \cdot H_1'(h_1, \ldots, h_n)\\
&\equiv H(h_1, \ldots, h_n) + V(h_1, \ldots, h_n) + V(h_1, \ldots, h_n) \cdot U_1'(h_1, \ldots, h_n)\\
&\equiv (h_1, \ldots, h_n) + U(h_1, \ldots, h_n) + V(h_1, \ldots, h_n) \pmod{p^k}
\end{aligned}
\]
for all $h_1, \ldots, h_n \in \mathbb{Z}_p$. This implies, in particular, that for all $s \in \mathbb{N}$ the following congruence for iterates of $H$ holds: $H^s(h_1, \ldots, h_n) \equiv (h_1, \ldots, h_n) + s \cdot U(h_1, \ldots, h_n) \pmod{p^k}$.



As $U$ is an identity modulo $p^{k-1}$, the latter congruence implies that $H^p(h_1, \ldots, h_n) \equiv (h_1, \ldots, h_n) \pmod{p^k}$ for all $h_1, \ldots, h_n \in \mathbb{Z}_p$. This proves Lemma 9.14 since in view of Theorem 9.1 the function $H$ is measure-preserving and thus in view of Theorem 7.1 induces a permutation of elements of $(\mathbb{Z}/p^k\mathbb{Z})^n$.

Proof of Theorem 9.12. As $F$ is an ergodic locally 1-Lipschitz function, by Theorem 7.1 there exists $k > N_1(F) + 1$ such that $F$ is transitive modulo $p^s$ for all $s \ge k - 1$. The function $F$ then permutes elements of $(\mathbb{Z}/p^k\mathbb{Z})^n$; we denote the corresponding permutation via $\sigma_k(F)$. Consider the permutation $\sigma = \sigma_k(F)^{p^{(k-1)n}}$. As $F$ is transitive modulo $p^k$, the multiplicative order of the permutation $\sigma$ is $p^n$; hence $\sigma$ is not a trivial permutation (that is, $\sigma$ is not the unit element of the group $\mathrm{Sym}(p^{kn})$). On the other hand, $\sigma = \sigma_k(F^{p^{(k-1)n}})$. However, $F^{p^{(k-1)n}}$ is bijective modulo $p^k$ and induces a trivial permutation modulo $p^{k-1}$ (the latter claim follows from the transitivity of $F$ modulo $p^{k-1}$). As $\sigma$ is not trivial, in view of Lemma 9.14 the multiplicative order of $\sigma$ must be $p$. However, according to the preceding argument, the multiplicative order of $\sigma$ is $p^n$, so necessarily $n = 1$.

Of course, there exist non-differentiable 1-Lipschitz ergodic transformations on $\mathbb{Z}_p^n$ for every $n > 1$. Actually, given a 1-Lipschitz ergodic transformation $f$ on $\mathbb{Z}_p$, one can construct a 1-Lipschitz ergodic transformation on $\mathbb{Z}_p^n$ for every $n > 1$ in the following way. Consider a bijection $B\colon \mathbb{Z}_p^n \to \mathbb{Z}_p$ defined by the rule $\delta_k(B(x_0, \ldots, x_{n-1})) = \delta_\ell(x_r)$, where $r \in \{0, 1, \ldots, n-1\}$ is the least non-negative residue of $k \in \{0, 1, 2, \ldots\}$ modulo $n$, $k = \ell \cdot n + r$, $(x_0, \ldots, x_{n-1}) \in \mathbb{Z}_p^n$. Loosely speaking, we consider an element of $\mathbb{Z}_p^n$ as a table of $n$ one-side infinite rows (say, stretching from left to right) of symbols from $\{0, 1, \ldots, p-1\}$, and we put into correspondence to this table an infinite string of symbols from $\{0, 1, \ldots, p-1\}$ (that is, an element of $\mathbb{Z}_p$) obtained by reading successively the elements of each column of the table, from top to bottom and from left to right. Now take a 1-Lipschitz transformation $H\colon \mathbb{Z}_p \to \mathbb{Z}_p$ and the conjugate transformation
\[
H^B(x_0, \ldots, x_{n-1}) = B^{-1}(H(B(x_0, \ldots, x_{n-1}))),
\]
\[
H^B(x_0, \ldots, x_{n-1}) = (f_0(x_0, \ldots, x_{n-1}), \ldots, f_{n-1}(x_0, \ldots, x_{n-1}))\colon \mathbb{Z}_p^n \to \mathbb{Z}_p^n.
\]
Obviously, by force of Theorem 7.1, the conjugate mapping $H^B$ is 1-Lipschitz and ergodic whenever the mapping $H$ is ergodic: given a univariate triangular mapping $H$ (see Subsection 4.1.1 about these)
\[
x = \sum_{i=0}^{\infty} \chi_i \cdot p^i = (\chi_0, \chi_1, \chi_2, \ldots) \stackrel{H}{\mapsto} (\psi_0(\chi_0); \psi_1(\chi_0, \chi_1); \psi_2(\chi_0, \chi_1, \chi_2); \ldots),
\]



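we just construct an $n$-variate triangular mapping
\[
\begin{array}{lcl}
(\chi_0, \chi_n, \chi_{2n}, \ldots) & \stackrel{f_0}{\mapsto} & (\psi_0(x), \psi_n(x), \psi_{2n}(x), \ldots)\\
(\chi_1, \chi_{n+1}, \chi_{2n+1}, \ldots) & \stackrel{f_1}{\mapsto} & (\psi_1(x), \psi_{n+1}(x), \psi_{2n+1}(x), \ldots)\\
\qquad\ldots & \ldots & \qquad\ldots\\
(\chi_{n-1}, \chi_{2n-1}, \chi_{3n-1}, \ldots) & \stackrel{f_{n-1}}{\mapsto} & (\psi_{n-1}(x), \psi_{2n-1}(x), \psi_{3n-1}(x), \ldots)
\end{array}
\]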

where $\chi_0, \chi_1, \ldots \in \{0, 1, \ldots, p-1\}$, $\psi_m(x) = \psi_m(\chi_0, \ldots, \chi_m) \in \{0, 1, \ldots, p-1\}$, $m = 0, 1, 2, \ldots$. Now assuming that the rows in the left-hand side are new variables, $x_j = (\chi_j, \chi_{n+j}, \chi_{2n+j}, \ldots) = \sum_{i=0}^{\infty} \chi_{in+j} \cdot p^i$ ($j = 0, 1, \ldots, n-1$), we see that the $n$-variate mapping $H^B = (f_0, f_1, \ldots, f_{n-1})$, where $f_j(x_0, \ldots, x_{n-1}) = \sum_{i=0}^{\infty} \psi_{in+j}(x) \cdot p^i$ for $j = 0, 1, \ldots, n-1$, is transitive modulo $p^k$ for all $k = 1, 2, \ldots$ whenever $H$ is transitive modulo $p^k$ for all $k = 1, 2, \ldots$.

This easy construction of multivariate ergodic transformations is of some importance in computer science. However, it would be highly desirable to characterize multivariate 1-Lipschitz ergodic transformations of $\mathbb{Z}_p^n$ that cannot be reduced in this sense to univariate ergodic transformations. Thus we state

Open question 9.15. Characterize 1-Lipschitz ergodic transformations on $\mathbb{Z}_p^n$, $n > 1$.

Another open problem concerns the uniformity condition: Theorem 9.12 says that uniformly differentiable modulo $p$ ergodic functions can only be univariate; however, in the case when a multivariate function is differentiable everywhere on $\mathbb{Z}_p^n$, although non-uniformly, it is still not known whether the function can be ergodic.
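The digit-interleaving bijection $B$ and the conjugate map $H^B$ of this section can be sketched on truncated expansions. The code below is a minimal illustration, assuming $k$ digits per coordinate so that $B$ identifies $(\mathbb{Z}/p^k\mathbb{Z})^n$ with $\mathbb{Z}/p^{nk}\mathbb{Z}$; the function names and the choice $H(x) = x + 1$ are hypothetical.

```python
def interleave(xs, p, k):
    """Truncated B: digit number l*n + r of the result is digit number l of xs[r]."""
    n = len(xs)
    value = 0
    for l in range(k):
        for r in range(n):
            digit = (xs[r] // p ** l) % p
            value += digit * p ** (l * n + r)
    return value


def deinterleave(value, n, p, k):
    """Inverse of interleave: recover the n coordinates, k digits each."""
    xs = [0] * n
    for l in range(k):
        for r in range(n):
            digit = (value // p ** (l * n + r)) % p
            xs[r] += digit * p ** l
    return tuple(xs)


def conjugate(H, n, p, k):
    """H^B = B^{-1} o H o B, reduced modulo p**k in each coordinate."""
    modulus = p ** (n * k)
    return lambda xs: deinterleave(H(interleave(xs, p, k)) % modulus, n, p, k)


def orbit_length(F, start, limit):
    """Length of the cycle through `start`, or None if it is longer than `limit`."""
    x, steps = F(start), 1
    while x != start and steps < limit:
        x, steps = F(x), steps + 1
    return steps if x == start else None


p, n, k = 3, 2, 2
HB = conjugate(lambda x: x + 1, n, p, k)
print(orbit_length(HB, (0, 0), p ** (n * k)))   # number of points of (Z/9Z)**2 in one cycle
```

With $H(x) = x + 1$, which is transitive modulo every power of $p$, the conjugate map runs through all $p^{nk}$ points of $(\mathbb{Z}/p^k\mathbb{Z})^n$ in a single cycle.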

9.3 Differentiable ergodic transformations on $\mathbb{Z}_p$

In this section we study conditions for ergodicity of differentiable transformations on $\mathbb{Z}_p$. The central result of this section is Theorem 9.16, which gives necessary and sufficient conditions of ergodicity for functions that are uniformly differentiable modulo $p^2$.

Theorem 9.16. Let a function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ be uniformly differentiable modulo $p^2$, and let the derivative modulo $p^2$ of the function $f$ be integer-valued. Then $f$ is ergodic if and only if it is transitive modulo $p^n$ for some (equivalently, for every) $n \ge N_2(f) + 1$ whenever $p$ is odd or, respectively, for some (equivalently, for every) $n \ge N_2(f) + 2$ whenever $p = 2$.

To prove the theorem, we need a lemma.
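Before turning to the lemma, the finite nature of the criterion in Theorem 9.16 can be illustrated numerically for $p = 2$. The sketch below is an illustration under stated assumptions: it computes the cycle structure modulo $2^k$ of the map $x \mapsto x + (x^2\ \mathrm{OR}\ C)$ from Exercise 9.20, reading the formula with this parenthesization (an assumption). The value of $N_2(f)$ is not computed here, so the loop over $k$ is numerical evidence rather than an application of the theorem.

```python
def cycle_lengths(f, k):
    """Sorted cycle lengths of x -> f(x) mod 2**k, or None if the map is not a permutation."""
    q = 1 << k
    image = [f(x) % q for x in range(q)]
    if len(set(image)) != q:
        return None
    seen = [False] * q
    lengths = []
    for start in range(q):
        if seen[start]:
            continue
        x, length = start, 0
        while not seen[x]:
            seen[x] = True
            x = image[x]
            length += 1
        lengths.append(length)
    return sorted(lengths)


def x_plus_square_or(c):
    """The map x -> x + (x*x OR c); the parenthesization is an assumption."""
    return lambda x: x + ((x * x) | c)


print([cycle_lengths(x_plus_square_or(5), k) == [1 << k] for k in range(1, 11)])
print(cycle_lengths(x_plus_square_or(1), 2), cycle_lengths(x_plus_square_or(1), 3))
```

For $C = 5$ the map is a single cycle modulo $2^k$ for every $k$ tested. For $C = 1$ it is transitive modulo 4 but splits modulo 8 into two cycles of length 4, which matches the dichotomy stated in Lemma 9.17.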



Lemma 9.17. Let a function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ be uniformly differentiable modulo $p$, let its derivative modulo $p$ be integer-valued, and let the function $f$ be transitive modulo $p^k$ for some $k \ge N_1(f) + 1$. Then $f$ induces on $\mathbb{Z}/p^{k+1}\mathbb{Z}$ a permutation that is either a single cycle of length $p^{k+1}$ or a product of $p$ pairwise disjoint cycles of length $p^k$ each.

Proof of Lemma 9.17. The idea of the proof is as follows: As $f$ is transitive (whence, bijective) modulo $p^k$ for some $k \ge N_1(f) + 1$, then in view of Theorem 9.1 $f$ is bijective modulo $p^{k+1}$. The corresponding permutation of elements of the residue ring $\mathbb{Z}/p^{k+1}\mathbb{Z}$ is a product of disjoint cycles, and the reduction modulo $p^k$ maps each of these cycles onto the whole residue ring $\mathbb{Z}/p^k\mathbb{Z}$ since $f$ is transitive modulo $p^k$. Thus, the length of a cycle must be a multiple of $p^k$. Further, as $f$ is asymptotically 1-Lipschitz (see the very beginning of Section 9.3), $f$ maps balls (of radii less than $p^{-N_1(f)}$) into balls; thus, as $p$-adic balls are cosets in the ring $\mathbb{Z}_p$ with respect to ideals generated by powers of $p$, the iterate $f^{p^k} \bmod p^{k+1}$ permutes cosets of the ring $\mathbb{Z}/p^{k+1}\mathbb{Z}$ with respect to the ideal generated by $p^k$. Moreover, as $f^{p^k} \bmod p^k$ is an identity transformation on $\mathbb{Z}/p^k\mathbb{Z}$, each of these cosets must be invariant with respect to the action of $f^{p^k} \bmod p^{k+1}$. Now it is clear that whenever this action is transitive on the coset, then $f$ is transitive on $\mathbb{Z}/p^{k+1}\mathbb{Z}$. However, it turns out that $f^{p^k} \bmod p^{k+1}$ acts on the coset by an affine transformation; that is, the action is conjugate to an affine transformation on the finite field of $p$ elements. Here Lemma 8.2 comes into play. We proceed with all this in mind.

Recall that for $x \in \mathbb{Z}_p$ we denote via $\chi_i = \delta_i(x) \in \{0, 1, \ldots, p-1\}$ the coefficient of the $i$-th term in the $p$-adic canonical representation of $x$; $i = 0, 1, 2, \ldots$ (see Theorem 1.21 and Note 1.22). Now Definition 2.27 of uniform differentiability modulo $p^k$ implies that for an arbitrary $x \in \mathbb{Z}_p$ and $s \ge N_1(f) = N$ the following congruence holds:
\[
\begin{aligned}
f(\chi_0 + \chi_1 \cdot p + \cdots + \chi_{s-1} \cdot p^{s-1} + \chi_s \cdot p^s) \equiv{}& f(\chi_0 + \chi_1 \cdot p + \cdots + \chi_{s-1} \cdot p^{s-1})\\
&+ \chi_s \cdot p^s \cdot f_1'(\chi_0 + \chi_1 \cdot p + \cdots + \chi_{s-1} \cdot p^{s-1}) \pmod{p^{s+1}}.
\end{aligned}
\tag{9.7}
\]
The latter congruence implies that the $s$-th coordinate function $\delta_s(f(x))$ of the function $f$ is of the following form:
\[
\delta_s(f(x)) \equiv \Phi_s(\chi_0, \ldots, \chi_{s-1}) + \chi_s \cdot f_1'(x) \pmod p,
\tag{9.8}
\]
where $\Phi_s(\chi_0, \ldots, \chi_{s-1}) = \delta_s(f(\chi_0 + \chi_1 \cdot p + \cdots + \chi_{s-1} \cdot p^{s-1}))$. As the derivative $f_1'(x)$ modulo $p$ is a step function of order not greater than $N$ by Proposition 2.38, the derivative $f_1'(x)$ depends only on $\chi_0, \ldots, \chi_{N-1}$; so we can rewrite (9.8) in the form
\[
\delta_s(f(x)) \equiv \Phi_s(\chi_0, \ldots, \chi_{s-1}) + \chi_s \cdot \Psi(\chi_0, \ldots, \chi_{N-1}) \pmod p,
\tag{9.9}
\]



where $\Psi(\chi_0, \ldots, \chi_{N-1}) = f_1'(x)$. Further, as a chain rule holds for derivatives modulo $p$ as well (cf. Proposition 2.29), we conclude that for the derivative modulo $p$ of the $r$-th iterate of $f$ ($r = 1, 2, \ldots$) the following congruence holds:
\[
(f^r(x))_1' \equiv \prod_{j=0}^{r-1} f_1'(f^j(x)) \pmod p.
\tag{9.10}
\]
As $f$ is uniformly differentiable modulo $p$ and its derivative modulo $p$ is integer-valued, $f$ is locally 1-Lipschitz (cf. Theorem 4.6); so transitivity of $f$ modulo $p^k$ for some $k \ge N$ implies transitivity of $f$ modulo $p^n$ for all $k \ge n \ge N$, see Proposition 6.5. However, as $f_1'$ depends only on $\chi_0, \ldots, \chi_{N-1}$, and as $f$ is transitive modulo $p^N$, from (9.10) we deduce that
\[
(f^{p^n}(x))_1' \equiv \left( \prod_{\gamma_0, \ldots, \gamma_{N-1} = 0}^{p-1} \Psi(\gamma_0, \ldots, \gamma_{N-1}) \right)^{p^{n-N}} \pmod p.
\tag{9.11}
\]
Denote the product in parentheses in the right hand side of (9.11) via $\Pi$. Then, as the function $f^{p^n}$ is uniformly differentiable modulo $p$ and its derivative modulo $p$ is integer-valued, from (9.9) and (9.11) we get that
\[
\delta_n(f^{p^n}(x)) \equiv \Xi_n(\chi_0, \ldots, \chi_{n-1}) + \chi_n \cdot \Pi^{p^{n-N}} \pmod p,
\tag{9.12}
\]
where $\Xi_n(\chi_0, \ldots, \chi_{n-1}) = \delta_n(f^{p^n}(\chi_0 + \chi_1 \cdot p + \cdots + \chi_{n-1} \cdot p^{n-1}))$. As a locally 1-Lipschitz function $f$ is transitive modulo $p^{n+1}$ for $k \ge n \ge N$, the function $f^{p^n}$, on the one hand, induces a trivial permutation modulo $p^n$, and on the other hand, induces on each coset $a + p^n \cdot (\mathbb{Z}/p^{n+1}\mathbb{Z})$ of the residue ring $\mathbb{Z}/p^{n+1}\mathbb{Z}$ a permutation that is a cycle of length $p$. This in particular implies that the right hand side of (9.12), considered as a function in the variable $\chi_n$, must be a permutation, moreover, a cycle of length $p$ on $\{0, 1, \ldots, p-1\}$. However, as this function is an affine transformation on the finite field $\mathbb{Z}/p\mathbb{Z}$, from Lemma 8.2 it follows that $\Pi^{p^{n-N}} \equiv 1 \pmod p$; whence $\Pi \equiv 1 \pmod p$ (since $z^p \equiv z \pmod p$, see Subsection 0.2.2). Finally we conclude that
\[
\begin{aligned}
f^{p^k}(x) &\equiv f^{p^k}(\chi_0 + \chi_1 \cdot p + \cdots + \chi_k \cdot p^k)\\
&\equiv \chi_0 + \chi_1 \cdot p + \cdots + \chi_{k-1} \cdot p^{k-1} + p^k \cdot (\Xi_k(\chi_0, \ldots, \chi_{k-1}) + \chi_k) \pmod{p^{k+1}}.
\end{aligned}
\tag{9.13}
\]
The latter congruence implies that $f$ induces a permutation modulo $p^{k+1}$ which we denote as $\sigma$. We claim that if $\Xi_k(\chi_0, \ldots, \chi_{k-1}) \not\equiv 0 \pmod p$



for some (equivalently, for all) $\chi_0, \ldots, \chi_{k-1} \in \{0, 1, \ldots, p-1\}$, then $f$ is transitive modulo $p^{k+1}$; otherwise the permutation $\sigma$ is a product of $p$ disjoint cycles of length $p^k$ each.

To prove the latter claim, take arbitrary $\gamma_0, \ldots, \gamma_k \in \{0, 1, \ldots, p-1\}$ and denote by $C$ the cycle of the permutation $\sigma$ that contains the point $\gamma_0 + \gamma_1 \cdot p + \cdots + \gamma_{k-1} \cdot p^{k-1} + \chi_k \cdot p^k \in \mathbb{Z}/p^{k+1}\mathbb{Z}$. As $f$ is transitive modulo $p^k$, then $C \bmod p^k = \mathbb{Z}/p^k\mathbb{Z}$; thus, $p^k$ is a factor of $\#C$, the length of the cycle $C$. Now, if $\Xi_k(\gamma_0, \ldots, \gamma_{k-1}) \not\equiv 0 \pmod p$, then (9.13) implies that
\[
f^{p^k}(\gamma_0 + \gamma_1 \cdot p + \cdots + \gamma_{k-1} \cdot p^{k-1} + \chi_k \cdot p^k) \not\equiv \gamma_0 + \gamma_1 \cdot p + \cdots + \gamma_{k-1} \cdot p^{k-1} + \chi_k \cdot p^k \pmod{p^{k+1}},
\tag{9.14}
\]
i.e., that $\#C > p^k$. On the other hand, (9.13) implies that $\#C$ is a factor of $p^{k+1}$. Finally we conclude that in this case $\#C = p^{k+1}$; that is, $f$ is transitive modulo $p^{k+1}$.

Now let the congruence $\Xi_k(\gamma_0, \ldots, \gamma_{k-1}) \equiv 0 \pmod p$ hold for some $\gamma_0, \ldots, \gamma_k \in \{0, 1, \ldots, p-1\}$. Then this congruence must hold for all $\gamma_0, \ldots, \gamma_k \in \{0, 1, \ldots, p-1\}$, since otherwise in view of the preceding argument the function $f$ is transitive modulo $p^{k+1}$, so (9.14) holds for all $\gamma_0, \ldots, \gamma_k \in \{0, 1, \ldots, p-1\}$; this in view of (9.13) implies that $\Xi_k(\gamma_0, \ldots, \gamma_{k-1}) \not\equiv 0 \pmod p$ for all $\gamma_0, \ldots, \gamma_k \in \{0, 1, \ldots, p-1\}$, a contradiction. Thus, in the case under consideration (9.13) implies that $\sigma^{p^k}$ is an identity permutation; hence, $\#C = p^k$ as $p^k$ is a factor of $\#C$. This finally proves Lemma 9.17.

Note 9.18. During the proof of Lemma 9.17 we have shown that whenever the function $f$ is transitive modulo $p^{N_1(f)+1}$ (in particular, whenever $f$ is ergodic) then necessarily $\prod_{i=0}^{p^{N_1(f)}-1} f_1'(f^i(x)) \equiv 1 \pmod p$ for every $x \in \mathbb{Z}_p$.

Proof of Theorem 9.16. During the proof of Lemma 9.17 we have shown that if $f$ is transitive modulo $p^k$ for some $k \ge N_1(f)$ then $f$ is transitive modulo $p^n$ for all $k \ge n \ge N_1(f)$. This in view of Theorem 7.1 proves the ‘only if’ part of the statement of Theorem 9.16 as $f$ is locally 1-Lipschitz and $N_2(f) + 1 > N_1(f)$. To prove the ‘if’ part of the statement, in view of Theorem 7.1 it is sufficient to prove that if $n \ge N_2(f) + 1$ (resp., if $n \ge N_2(f) + 2$ for $p = 2$) and if $f$ is transitive modulo $p^n$, then $f$ is transitive modulo $p^{n+1}$. In turn, to prove the latter claim, in view of Lemma 9.17 it is sufficient to prove that not every element of the residue ring $\mathbb{Z}/p^{n+1}\mathbb{Z}$ is a fixed point of the transformation $f^{p^n} \bmod p^{n+1}$:
\[
f^{p^n}(x) \not\equiv x \pmod{p^{n+1}} \quad \text{for some } x \in \mathbb{Z}_p.
\tag{9.15}
\]



As transitivity of $f$ modulo $p^n$ implies transitivity of $f$ modulo $p^{n-1}$ (see Proposition 6.5), we get that
\[
f^{p^{n-1}}(x) = x + p^{n-1}\xi(x),
\tag{9.16}
\]
where $\xi\colon \mathbb{Z}_p \to \mathbb{Z}_p$; note that $\xi(x) \not\equiv 0 \pmod p$ for all $x \in \mathbb{Z}_p$ since otherwise Lemma 9.17 implies that $f$ is not transitive modulo $p^n$. Further, as $f$ is uniformly differentiable modulo $p^2$ and its derivative modulo $p^2$ is integer-valued, the $r$-th iterate $f^r$ is uniformly differentiable modulo $p^2$ and its derivative modulo $p^2$ is integer-valued, for all $r = 1, 2, \ldots$; moreover, $(f^r(x))_2' \equiv \prod_{j=0}^{r-1} f_2'(f^j(x)) \pmod{p^2}$, cf. (9.10). Now, as $n - 1 \ge N_2(f)$, then using the chain rule for derivatives modulo $p^2$ and the obvious equality $f^{sp^{n-1}}(x) = f^{(s-1)p^{n-1}}(x + p^{n-1}\xi(x))$ ($s = 1, 2, \ldots$), which follows from (9.16), we successively calculate
\[
\begin{aligned}
f^{p^n}(x) &\equiv f^{(p-1)p^{n-1}}(x) + p^{n-1}\xi(x) \prod_{j=0}^{(p-1)p^{n-1}-1} f_2'(f^j(x))\\
&\equiv f^{(p-2)p^{n-1}}(x) + p^{n-1}\xi(x) \left( \prod_{j=0}^{(p-2)p^{n-1}-1} f_2'(f^j(x)) + \prod_{j=0}^{(p-1)p^{n-1}-1} f_2'(f^j(x)) \right)\\
&\equiv \cdots \equiv x + p^{n-1}\xi(x) \left( 1 + \sum_{i=1}^{p-1} \prod_{j=0}^{(p-i)p^{n-1}-1} f_2'(f^j(x)) \right) \pmod{p^{n+1}}.
\end{aligned}
\tag{9.17}
\]
As $f_2'$ is a step function of order not greater than $N_2(f)$ (see Proposition 2.38) and $f$ is transitive modulo $p^{n-1}$, we conclude that for arbitrary $i, j \in \mathbb{N}$ the following congruence holds:
\[
f_2'(f^j(x)) \equiv f_2'(f^{j+ip^{n-1}}(x)) \pmod{p^2}.
\]
In view of transitivity of $f$ modulo $p^{n-1}$, the latter congruence implies that
\[
\prod_{j=0}^{(p-i)p^{n-1}-1} f_2'(f^j(x)) \equiv \alpha(x)^{p-i} \pmod{p^2}, \qquad \text{where } \alpha(x) = \prod_{j=0}^{p^{n-1}-1} f_2'(f^j(x)).
\]
In view of (9.17) we now conclude that
\[
f^{p^n}(x) \equiv x + p^{n-1}\xi(x) \left( 1 + \sum_{i=1}^{p-1} \alpha(x)^i \right) \pmod{p^{n+1}}.
\tag{9.18}
\]



Again, as $f'_2$ is a periodic function with a period of length $p^{N_2(f)}$, and as $f$ is transitive modulo $p^{n-1}$ for $n - 1 \ge N_2(f)$, then $\alpha(x) \bmod p^2$ does not depend on $x$; namely
\[ \alpha(x) = \prod_{j=0}^{p^{n-1}-1} f'_2(f^j(x)) \equiv \left( \prod_{z=0}^{p^{N_2(f)}-1} f'_2(z) \right)^{p^{n-1-N_2(f)}} \pmod{p^2}. \tag{9.19} \]

We claim that $\alpha(x) \equiv 1 \pmod p$. Indeed, during the proof of Lemma 9.17 we have already established that if $k \ge N_1(f)$ and if $f$ is transitive modulo $p^k$, then
\[ \prod_{j=0}^{p^{N_1(f)}-1} f'_1(f^j(x)) \equiv 1 \pmod p \tag{9.20} \]
for all $x \in \mathbb{Z}_p$, see the proof of (9.13). From Definition 2.25 of a derivative modulo some $p^\ell$ it follows that $f'_2(x) \equiv f'_1(x) \pmod p$; consequently,
\[ \alpha(x) \equiv 1 + p\beta \pmod{p^2} \tag{9.21} \]
for some $\beta \in \mathbb{N}_0$. In view of (9.20) and (9.21), from (9.18) we deduce now that
\[ f^{p^n}(x) \equiv x + p^{n-1} \cdot \xi(x) \left( p + p\beta \sum_{i=1}^{p-1} i \right) \pmod{p^{n+1}}; \tag{9.22} \]
so for $p \ne 2$ we conclude that
\[ f^{p^n}(x) \equiv x + p^n \cdot \xi(x) \pmod{p^{n+1}}. \]
This, in view of Lemma 9.17, proves Theorem 9.16 in the case $p \ne 2$ since $\xi(x) \not\equiv 0 \pmod p$, see (9.16) and the text thereafter. For the case $p = 2$, congruence (9.22) implies that
\[ f^{2^n}(x) \equiv x + 2^n(1 + \beta) \pmod{2^{n+1}}; \tag{9.23} \]

so to finish the proof it suffices to show that $\beta$ is even. For $n \ge N_2(f) + 2$ the transitivity of $f$ modulo $2^n$ implies that $f$ is transitive modulo $2^{N_2(f)+2}$, so in view of the definition of a derivative modulo $p^2$ we have that
\[ f^{2^N}(x + 2^N\xi) \equiv f^{2^N}(x) + 2^N\xi \prod_{j=0}^{2^N-1} f'_2(f^j(x)) \pmod{2^{N+2}} \tag{9.24} \]
where $N = N_2(f)$, $\xi \in \mathbb{Z}_2$. As $f$ is transitive modulo $2^{N+2}$, we conclude that for every $x \in \{0, 1, \ldots, 2^N - 1\}$ the mapping
\[ \varphi_x\colon \xi \mapsto \delta_N(f^{2^N}(x + 2^N\xi)) + 2 \cdot \delta_{N+1}(f^{2^N}(x + 2^N\xi)) \qquad (\xi \in \{0, 1, 2, 3\}) \]



is a cycle of length 4 on the residue ring $\mathbb{Z}/4\mathbb{Z}$. From (9.21) and (9.19) we now conclude that
\[ \prod_{j=0}^{2^N-1} f'_2(f^j(x)) \equiv 1 + 2\beta \pmod 4; \]
thus, (9.24) implies now that
\[ \varphi_x(\xi) \equiv c(x) + \xi(1 + 2\beta) \pmod 4, \tag{9.25} \]
where $c(x) = \delta_N(f^{2^N}(x)) + 2\delta_{N+1}(f^{2^N}(x))$. However, for every $x$ the mapping $\varphi_x$ is transitive modulo 4, so (9.25) in view of Theorem 8.1 implies that $\beta \equiv 0 \pmod 2$. This ends the proof of Theorem 9.16.

Exercise 9.19. State and prove the ergodicity criterion for a polynomial over $\mathbb{Z}_p$.

Exercise 9.20. Let $p = 2$. State and prove the ergodicity criterion for the function $f(x) = x + (x^2 \mathbin{\mathrm{OR}} C)$, $C \in \mathbb{N}_0$. With this criterion, prove that the function $f(x) = x + (x^2 \mathbin{\mathrm{OR}} 5)$ is ergodic.

Exercise 9.21. Prove that the bound given by Theorem 9.16 is sharp: Take an odd prime $p$, consider the function $f(x) = \delta_0(x + 1)$ and show that
• $f$ is uniformly differentiable modulo $p^2$,
• a derivative $f'_2$ is integer-valued,
• $f$ is transitive modulo $p^{N_2(f)}$,
• $f$ is not transitive modulo $p^{N_2(f)+1}$,
• $f$ is not ergodic.

Note 9.22. A straightforward analog of Theorem 9.16 for functions that are uniformly differentiable modulo $p$ is not true. Namely, for every $n \in \mathbb{N}$ there exists a 1-Lipschitz function $f\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ such that $f$ is uniformly differentiable modulo 2, $f'_1 = 1$ everywhere on $\mathbb{Z}_2$, $N_1(f) = 1$, $f$ is transitive modulo $2^k$ for $k = 1, 2, \ldots, n$, and $f$ is not transitive modulo $2^k$ for all $k > n$. By an argument similar to that which follows, one can construct a counterexample for $p \ne 2$ as well. Indeed, for $x \in \mathbb{Z}_2$ consider its canonical 2-adic expansion $x = \chi_0 + \chi_1 \cdot 2 + \chi_2 \cdot 2^2 + \cdots$, where $\chi_0, \chi_1, \chi_2, \ldots \in \{0, 1\}$. Consider a function
\[ f(x) = \sum_{i=0}^{\infty} \psi_i(\chi_0, \ldots, \chi_i) \cdot 2^i, \]


where every $\psi_i(x_0, \ldots, x_i)$ is a Boolean function that is linear with respect to the Boolean variable $x_i$; that is, the algebraic normal form (ANF) of the function $\psi_i(x_0, \ldots, x_i)$ is $\psi_i(\chi_0, \ldots, \chi_i) = \varphi_i(\chi_0, \ldots, \chi_{i-1}) \oplus \chi_i$, see Subsection 8.2. The function $f$ is 1-Lipschitz. Moreover, direct calculations show that for arbitrary $s \in \mathbb{N}$ and $h \in \mathbb{Z}_2$ the congruence $f(x + 2^s h) \equiv f(x) + 2^s h \pmod{2^{s+1}}$ holds; whence, the function $f$ is uniformly differentiable modulo 2, $f'_1(x) = 1$ for all $x \in \mathbb{Z}_2$, and $N_1(f) = 1$. Now, given $n \in \mathbb{N}$, take a function $f$ such that $\varphi_0 = 1$, all Boolean functions $\varphi_i(\chi_0, \ldots, \chi_{i-1})$ are of odd weight for all $i = 1, 2, \ldots$ except $i = n$, and $\varphi_n(\chi_0, \ldots, \chi_{n-1})$ is of even weight. Then, according to Theorem 8.5, $f$ is transitive modulo $2^k$ for $k = 1, 2, \ldots, n$, but $f$ is not transitive modulo $2^{n+1}$; thus, $f$ is not ergodic.

Note, however, that although the whole Theorem 9.16 holds only for functions that are uniformly differentiable modulo $p^2$, a substantial part of the theorem, Lemma 9.17, holds for functions that are uniformly differentiable modulo $p$, and not necessarily modulo $p^2$. As in applications some important functions are differentiable modulo $p$, and not modulo $p^2$ (e.g., the bivariate function XOR, see Example 2.28), it is highly desirable to find necessary and sufficient conditions of ergodicity for functions that are uniformly differentiable modulo $p$, and not modulo $p^2$. So we set the following problem:

Open question 9.23. Find necessary and sufficient conditions of ergodicity for 1-Lipschitz functions $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ that are uniformly differentiable modulo $p$.
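The counterexample of Note 9.22 can be checked by brute force for small $n$. The sketch below is only an illustration under stated assumptions: it fixes one particular admissible choice of the Boolean functions, namely $\varphi_i = \chi_0 \chi_1 \cdots \chi_{i-1}$ (weight 1, hence odd) for $i \ne n$ and $\varphi_n = 0$ (weight 0, hence even), and it tests transitivity by walking the orbit of 0 rather than by invoking Theorem 8.5. The helper names `make_f` and `is_transitive` are hypothetical.

```python
def make_f(n, bits):
    """Build f from Note 9.22 on inputs read to `bits` binary digits.

    Output bit i is psi_i = phi_i(chi_0..chi_{i-1}) XOR chi_i with the
    assumed choice phi_i = chi_0 AND ... AND chi_{i-1} for i != n
    (so phi_0 = 1, the empty product) and phi_n = 0.
    """
    def f(x):
        y = 0
        for i in range(bits):
            chi_i = (x >> i) & 1
            mask = (1 << i) - 1
            low_all_ones = (x & mask) == mask  # True for i = 0
            phi_i = 0 if i == n else int(low_all_ones)
            y |= (phi_i ^ chi_i) << i
        return y
    return f


def is_transitive(f, modulus):
    """True iff x -> f(x) mod `modulus` is a single cycle of length `modulus`."""
    x, steps = 0, 0
    while True:
        x = f(x) % modulus
        steps += 1
        if x == 0 or steps >= modulus:
            break
    return x == 0 and steps == modulus


if __name__ == "__main__":
    n = 3
    f = make_f(n, bits=8)
    for k in range(1, 7):
        print(k, is_transitive(f, 2 ** k))
```

With this choice of $\varphi_i$ the function coincides with $x \mapsto x + 1$ on the low $n$ binary digits and leaves digit $n$ unchanged, which is why transitivity is lost exactly at modulus $2^{n+1}$.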

9.4

Measure-preservation and ergodicity of A-, B-, and C-functions

Theorems 9.1 and 9.16 exhibit a ‘Hensel’s-Lemma-like’ phenomenon that often occurs in $p$-adic dynamics: the behaviour of a dynamical system on the whole continuum space is determined by its ‘behaviour modulo $p^k$’, i.e. on a finite space (cf. Hensel’s Lemma, 2.13). The phenomenon is important in applications: e.g., to determine whether a dynamical system is ergodic (that is, transitive) on a large finite space it is sufficient to determine whether it is ergodic on a relatively small finite space; for a smaller space one may use computers, whereas for a larger space one cannot. Thus, it is important to estimate $N_1(f)$ and $N_2(f)$ with the highest possible accuracy to reduce computational costs. Moreover, although both Theorem 9.1 and Theorem 9.16 give sharp bounds for the cardinality of these smaller spaces where one must verify measure preservation (respectively, ergodicity) of a dynamical system, see Exercises 9.3 and 9.21, these bounds are sharp only in the class of all functions that are uniformly differentiable modulo $p$ (respectively, modulo $p^2$). However, for narrower classes of functions these bounds can obviously be sharpened; e.g., for affine functions: Theorem 8.1 together with Lemma 8.2 implies that an affine function $f(x) = ax + b$ is ergodic if and only if it is transitive modulo $p$ whenever $p$ is odd, or modulo 4 whenever $p = 2$, and not modulo $p^2$ and modulo 8, respectively, as follows from Theorem 9.16.

In this subsection we show that for some important classes of functions the said bounds can be significantly reduced. Moreover, we calculate these bounds explicitly, in contrast to those given by Theorems 9.1 and 9.16: it might not be an easy problem to find $N_1(f)$ and $N_2(f)$ given an arbitrary $f$.

We start with A-functions. Let $f \in \mathcal{A}$; then, according to Definition 5.23 of A-functions, $p^n \cdot f \in \mathcal{B}$ for a suitable $n \in \mathbb{N}_0$. Given $f \in \mathcal{A}$, denote $\rho(f) = \min\{n \in \mathbb{N}_0 : p^n \cdot f \in \mathcal{B}\}$; put
\[ \lambda(f) = \min\left\{ k \in \mathbb{N} : 2 \cdot \frac{p^k - 1}{p - 1} - k > \rho(f) \right\}. \]
The following theorem is true.

Theorem 9.24. An A-function $f$ is measure-preserving if and only if it is bijective modulo $p^{\lambda(f)+1}$. The function $f$ is ergodic if and only if it is transitive modulo $p^{\lambda(f)+1}$ whenever $p \notin \{2, 3\}$, or modulo $p^{\lambda(f)+2}$ whenever $p \in \{2, 3\}$.

Basically, our proof of Theorem 9.24 will follow the lines of the proof of Theorem 9.16; however, we will need more than 2 terms in the decomposition of the function $f(x + p^k h)$ modulo some power of $p$, cf. (9.7). According to Theorem 5.24, we can develop any A-function $f$ into a Taylor series; unfortunately, the coefficients $\frac{f^{(j)}(x)}{j!}$ in the series are not necessarily $p$-adic integers if $j > 1$. So we need to use more delicate techniques to calculate $f(x + p^k h)$ modulo some power of $p$; thus we start with some technical results.

Lemma 9.25. The sequence $\kappa(i) = \operatorname{ord}_p i! - \lfloor \log_p i \rfloor$ ($i = 1, 2, 3, \ldots$) is monotone nondecreasing.

Proof of Lemma 9.25. Obviously, $\operatorname{ord}_p i! \ge \operatorname{ord}_p (i-1)!$; so if $\lfloor \log_p i \rfloor = \lfloor \log_p (i-1) \rfloor$ then $\kappa(i-1) \le \kappa(i)$. Assume now that $\lfloor \log_p j \rfloor > \lfloor \log_p (j-1) \rfloor$ for some positive rational integer $j$. Evidently, $\lfloor \log_p j \rfloor + 1$ is the number of digits in the base-$p$ expansion of $j$. Hence, our assumption holds if and only if $j - 1 = (p-1) + (p-1)p + \cdots + (p-1)p^n = p^{n+1} - 1$ for some $n \in \mathbb{N}_0$. But then $\operatorname{ord}_p j! = \operatorname{ord}_p (j-1)! + n + 1$, $\lfloor \log_p (j-1) \rfloor = n$, $\lfloor \log_p j \rfloor = n + 1$, and thus $\kappa(j) = \kappa(j-1) + n \ge \kappa(j-1)$.
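Lemma 9.25 is easy to test numerically. The following sketch is an illustration rather than part of the proof: it computes $\operatorname{ord}_p i!$ by Legendre's formula, compares it with the expression $\frac{1}{p-1}(i - \operatorname{wt}_p i)$ quoted from Lemma 1.1 later in the text, and checks that $\kappa$ does not decrease on an initial range. The function names are hypothetical.

```python
def ord_p_factorial(i, p):
    """p-adic valuation of i! via Legendre's formula: sum of floor(i / p^t)."""
    total, q = 0, p
    while q <= i:
        total += i // q
        q *= p
    return total


def floor_log(i, p):
    """floor(log_p i) for i >= 1, using integer arithmetic only."""
    e = 0
    while i >= p:
        i //= p
        e += 1
    return e


def wt(i, p):
    """Sum of the base-p digits of i."""
    s = 0
    while i:
        s += i % p
        i //= p
    return s


def kappa(i, p):
    """kappa(i) = ord_p(i!) - floor(log_p i), as in Lemma 9.25."""
    return ord_p_factorial(i, p) - floor_log(i, p)


if __name__ == "__main__":
    for p in (2, 3, 5):
        print(p, [kappa(i, p) for i in range(1, 30)])
```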



As $f$ is 1-Lipschitz, in view of Theorem 4.15 it can be represented in the following form:
\[ f(x) = b_0 + \sum_{i=1}^{\infty} b_i\, p^{\lfloor \log_p i \rfloor} \binom{x}{i}, \tag{9.26} \]
where $b_j \in \mathbb{Z}_p$, $j = 0, 1, 2, \ldots$. Everywhere during the proof of Theorem 9.24 we assume that $f$ is represented in this form. In the sequel we denote $\lambda(f)$ by $\lambda$ and $\rho(f)$ by $\rho$.

Lemma 9.26. Under assumptions of Theorem 9.24, let $p$ be an odd prime; then the following is true:
\[ b_i \equiv \begin{cases} 0 \pmod p, & \text{for } i \ge 2p^\lambda; \\ 0 \pmod{p^2}, & \text{for } i \ge 3p^\lambda. \end{cases} \]

Proof of Lemma 9.26. Represent $f$ as
\[ f(x) = b_0 + \sum_{i=1}^{\infty} \frac{1}{i!}\, b_i\, p^{\lfloor \log_p i \rfloor} x^{\underline{i}}, \]
where, we recall, $x^{\underline{i}} = x(x-1)\cdots(x-i+1)$ is the $i$th falling factorial power of $x$, $x^{\underline{0}} = 1$. As $f \in \mathcal{A}$, i.e., as $\left| b_i\, p^{\lfloor \log_p i \rfloor} \right|_p \le p^\rho\, |i!|_p$, we conclude that
\[ \operatorname{ord}_p b_i \ge \operatorname{ord}_p i! - \lfloor \log_p i \rfloor - \rho, \tag{9.27} \]
for all $i = 1, 2, \ldots$. In view of Lemma 9.25, to finish the proof of Lemma 9.26 it is sufficient to show only that $\kappa(2p^\lambda) - \rho \ge 1$ and $\kappa(3p^\lambda) - \rho \ge 2$. We recall that $\operatorname{ord}_p i! = \frac{1}{p-1}(i - \operatorname{wt}_p i)$, see Lemma 1.1.

As $p \ne 2$, we conclude that $\kappa(2p^\lambda) - \rho = \frac{1}{p-1}(2p^\lambda - 2) - \lambda - \rho \ge 1$ in view of the definition of $\lambda = \lambda(f)$. Hence, if $p \ne 3$, then
\[ \kappa(3p^\lambda) - \rho = \frac{1}{p-1}(3p^\lambda - 3) - \lambda - \rho = \kappa(2p^\lambda) + \frac{1}{p-1}(p^\lambda - 1) - \rho \ge 2. \]
This proves Lemma 9.26 for $p \ne 3$. Finally, let $p = 3$. Then
\[ \kappa(3p^\lambda) - \rho = \kappa(3^{\lambda+1}) - \rho = \frac{1}{2}(3^{\lambda+1} - 1) - \lambda - 1 - \rho \ge 2, \]
otherwise in view of the inequality $3^\lambda - 1 - \lambda > \rho$ (which follows directly from the definition of $\lambda$) we conclude that
\[ \frac{1}{2}(3^{\lambda+1} - 1) - \lambda - 1 - 3^\lambda + 1 + \lambda < 1, \]
i.e., that $3^\lambda - 1 < 2$; so $\lambda < 1$, a contradiction. This finishes the proof of Lemma 9.26.
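The two inequalities that carry the proof of Lemma 9.26, $\kappa(2p^\lambda) - \rho \ge 1$ and $\kappa(3p^\lambda) - \rho \ge 2$, can be confirmed numerically for small odd primes. The sketch below is a sanity check under the definitions of $\lambda$ and $\kappa$ given above, not a replacement for the argument; `lam` and `kappa` are hypothetical helper names, and $\rho$ is simply run over a range of nonnegative integers.

```python
def kappa(i, p):
    """kappa(i) = ord_p(i!) - floor(log_p i), with ord_p(i!) by Legendre's formula."""
    ord_fact, q = 0, p
    while q <= i:
        ord_fact += i // q
        q *= p
    log_floor, j = 0, i
    while j >= p:
        j //= p
        log_floor += 1
    return ord_fact - log_floor


def lam(rho, p):
    """lambda = min{k >= 1 : 2*(p^k - 1)/(p - 1) - k > rho}."""
    k = 1
    # (p^k - 1) is divisible by (p - 1), so the integer division is exact.
    while 2 * (p ** k - 1) // (p - 1) - k <= rho:
        k += 1
    return k


if __name__ == "__main__":
    for p in (3, 5, 7):
        for rho in range(6):
            L = lam(rho, p)
            print(p, rho, L, kappa(2 * p ** L, p) - rho, kappa(3 * p ** L, p) - rho)
```

For $\rho = 0$ the function `lam` returns 1 for every prime, which is the value of $\lambda(f)$ used for B-functions in Corollary 9.34.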



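Corollary 9.27. Under assumptions of Theorem 9.24, let $p$ be an odd prime; then for every $i \in \mathbb{N}$ the following is true:
\[ \frac{\Delta^i f(x)}{i} \equiv \begin{cases} 0 \pmod{p^2}, & \text{if } i \ge 2p^\lambda + 1; \\ 0 \pmod p, & \text{if } i \ge p^\lambda + 1. \end{cases} \]

Proof of Corollary 9.27. As $\Delta^j \binom{x}{i} = \binom{x}{i-j}$ if $i \ge j$ and $\Delta^j \binom{x}{i} = 0$ if $i < j$ (see (2)), then
\[ \frac{\Delta^i f(x)}{i} = \frac{1}{\hat\imath} \sum_{j=i}^{\infty} b_j\, p^{\lfloor \log_p j \rfloor - \operatorname{ord}_p i} \binom{x}{j-i}, \]
where $\hat\imath = i p^{-\operatorname{ord}_p i} \in \mathbb{Z}_p$, $\operatorname{ord}_p \hat\imath = 0$. Now the result is obvious in view of Lemma 9.26.

Recall that every A-function is infinitely many times differentiable on $\mathbb{Z}_p$, and its derivative $f'$ is integer-valued, see Section 5.3.

Proposition 9.28. Under assumptions of Theorem 9.24, let $p$ be an odd prime; then $N_1(f) \le \lambda$, $N_2(f) \le \lambda + 1$, and
\[ f'(x) \equiv \sum_{i=1}^{p^\lambda} (-1)^{i-1} \frac{\Delta^i f(x)}{i} \pmod p, \]
\[ f'(x) \equiv \sum_{i=1}^{2p^\lambda} (-1)^{i-1} \frac{\Delta^i f(x)}{i} \pmod{p^2}, \]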

where $\lambda = \lambda(f)$.

Proof of Proposition 9.28. To prove Proposition 9.28 we show that for all $x, h \in \mathbb{Z}_p$
\[ f(x + p^m h) \equiv f(x) + p^m h \cdot f'(x) \pmod{p^{m+2}} \tag{9.28} \]
whenever $m \ge \lambda + 1$, and that
\[ f(x + p^m h) \equiv f(x) + p^m h \cdot f'(x) \pmod{p^{m+1}} \tag{9.29} \]
whenever $m \ge \lambda$. Since $f$ is 1-Lipschitz, it is sufficient to prove congruences (9.28) and (9.29) only for $h \in \{1, 2, \ldots, p^2 - 1\}$. By the Gregory-Newton formula (see Theorem 0.5),
\[ f(x + n) = \sum_{i=0}^{n} \binom{n}{i} \Delta^i f(x); \]
thus, for $n = p^m h$ we obtain that
\[ f(x + p^m h) = f(x) + p^m h\, \varphi_m(x, h), \tag{9.30} \]
where
\[ \varphi_m(x, h) = \sum_{i=1}^{p^m h} \binom{p^m h - 1}{i - 1} \frac{\Delta^i f(x)}{i}. \tag{9.31} \]

Now from Corollary 9.27 we deduce that
\[ \varphi_m(x, h) \equiv \sum_{i=1}^{p^\lambda} \binom{p^m h - 1}{i - 1} \frac{\Delta^i f(x)}{i} \pmod p \tag{9.32} \]
whenever $m \ge \lambda$ and that
\[ \varphi_m(x, h) \equiv \sum_{i=1}^{2p^\lambda} \binom{p^m h - 1}{i - 1} \frac{\Delta^i f(x)}{i} \pmod{p^2} \tag{9.33} \]
whenever $m \ge \lambda + 1$. In view of Corollary 0.3 from (9.32) it follows that
\[ \varphi_m(x, h) \equiv \sum_{i=1}^{p^\lambda} (-1)^{i-1} \frac{\Delta^i f(x)}{i} \pmod p, \]
for $m \ge \lambda$, thus proving the assertions of Proposition 9.28 on the estimate of $N_1(f)$ and on the residue $f'(x) \bmod p$. To prove the rest of the statement of Proposition 9.28 we first note that for $i = 1, 2, \ldots, 2p^\lambda$ the following obvious equality holds:
\[ \binom{p^m h - 1}{i - 1} = \prod_{k=0}^{i-2} \frac{p^m h - (k+1)}{k+1} = \prod_{j=1}^{i-1} \left( \frac{h}{\hat\jmath}\, p^{m - \operatorname{ord}_p j} - 1 \right), \tag{9.34} \]

where ˆ = jp− ordp j is a unit of Zp , (i.e., ˆ has a multiplicative inverse 1ˆ in Zp ); hence, every term of the product in the right hand side of (9.34) is a p-adic integer. If i ≤ pλ then m − ordp j ≥ 2 for all j = 1, 2, . . . , i − 1; so (9.34) implies that m p h−1 ≡ (−1)i−1 (mod p2 ). (9.35) i−1

If pλ + 1 ≤ i ≤ 2pλ and j ∈ {1, 2, . . . , i − 1} then m − ordp j = 1 only in the case when simultaneously j = pλ and m = λ + 1; otherwise m − ordp j ≥ 2. However, if m − ordp j = 1 then Δi f (x) ≡ 0 (mod p) i (see Corollary 9.27); hence in both cases we have that i Δ f (x) h m−ordp j Δi f (x) −1 p ≡− (mod p2 ). ˆ i i

CHAPTER 9. ERGODICITY AND DIFFERENTIABILITY From here in view of (9.34) we deduce that m Δi f (x) p h − 1 Δi f (x) ≡ (−1)i−1 i−1 i i

(mod p2 ).

122

(9.36)

for all i = 1, 2, . . . , 2pλ . Now combining together (9.33), (9.35), and (9.36) we conclude that λ

ϕm (x, h) ≡

2p X

(−1)i−1

i=1

Δi f (x) i

(mod p2 ).

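This in view of (9.30), (9.31), and (9.33) completes the proof of Proposition 9.28.

Lemma 9.29. Under assumptions of Theorem 9.24, let $p$ be an odd prime; then the function
\[ \theta(x) = \sum_{j=2}^{p-1} (-1)^j\, \frac{\Delta^{jp^{\lambda-1}} f(x)}{jp^{\lambda-1}} \cdot \sum_{i=1}^{j-1} \frac{1}{i} + \sum_{k=1}^{p-1} (-1)^{k-1} \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^\lambda} + \frac{\Delta^{2p^\lambda} f(x)}{2p^{\lambda+1}} \]
is integer-valued, $\theta(a) \equiv \theta(b) \pmod p$ whenever $a \equiv b \pmod{p^\lambda}$, and
\[ f(x + p^\lambda h) \equiv f(x) + p^\lambda h \cdot f'(x) + p^{\lambda+1} h^2 \cdot \theta(x) \pmod{p^{\lambda+2}} \]
for all $x, h \in \mathbb{Z}_p$.

Proof of Lemma 9.29. First we prove that $\theta$ is integer-valued, i.e., that $\theta$ maps $\mathbb{Z}_p$ into $\mathbb{Z}_p$. As $f$ is 1-Lipschitz, every fraction $\frac{\Delta^s f(x)}{s}$ ($s = 1, 2, 3, \ldots$) is a $p$-adic integer (see Proposition 4.5); so it suffices to show only that the following functions $\alpha(x)$ and $\beta_k(x)$ are integer-valued:
\[ \alpha(x) = \frac{\Delta^{2p^\lambda} f(x)}{2p^{\lambda+1}}; \qquad \beta_k(x) = \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^\lambda} \]
for all $k \in \{1, 2, \ldots, p - 1\}$. As
\[ \Delta^i f(x) = \sum_{j=i}^{\infty} b_j\, p^{\lfloor \log_p j \rfloor} \binom{x}{j-i}, \tag{9.37} \]
for $i = 1, 2, 3, \ldots$, and as
\[ b_j\, p^{\lfloor \log_p j \rfloor} \equiv 0 \pmod{p^{\lambda+1}} \tag{9.38} \]
for all rational integers $j \ge 2p^\lambda$, in force of (9.26) and Lemma 9.26, then $\alpha(x) \in \mathbb{Z}_p$ for all $x \in \mathbb{Z}_p$. If $j \ge kp^{\lambda-1} + p^\lambda$ then $\lfloor \log_p j \rfloor \ge \lambda$; so (9.37) implies that $\beta_k(x) \in \mathbb{Z}_p$ for all $x \in \mathbb{Z}_p$.

Now we prove that for all $a, b \in \mathbb{Z}_p$ the congruence $a \equiv b \pmod{p^\lambda}$ implies the congruence $\theta(a) \equiv \theta(b) \pmod p$. From (9.37) it follows that
\[ \alpha(x) \equiv \frac{1}{2} \sum_{j=2p^\lambda}^{3p^\lambda - 1} \frac{1}{p} \cdot b_j \binom{x}{j - 2p^\lambda} \pmod p. \tag{9.39} \]
Note that in (9.39) every fraction $\frac{b_j}{p}$ is a $p$-adic integer in force of Lemma 9.26. Now, as $a \equiv b \pmod{p^\lambda}$, Lucas’ Theorem 0.2 implies that for all $j = 2p^\lambda, 2p^\lambda + 1, \ldots, 3p^\lambda - 1$ the following congruence holds:
\[ \binom{a}{j - 2p^\lambda} \equiv \binom{b}{j - 2p^\lambda} \pmod p. \]
Thus, (9.39) implies that
\[ \alpha(a) \equiv \alpha(b) \pmod p. \tag{9.40} \]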

Further, combining (9.37) with Lemma 9.26 we conclude that the following congruence holds for all $k = 1, 2, \ldots, p - 1$:
\[ \beta_k(x) \equiv \frac{1}{k} \sum_{j=kp^{\lambda-1}+p^\lambda}^{2p^\lambda - 1} b_j \binom{x}{j - kp^{\lambda-1} - p^\lambda} \pmod p. \]
From this congruence in force of Lucas’ Theorem 0.2 it follows that
\[ \beta_k(a) \equiv \beta_k(b) \pmod p \tag{9.41} \]
whenever $a \equiv b \pmod{p^\lambda}$. Further, denote
\[ \gamma_k(x) = \frac{\Delta^{kp^{\lambda-1}} f(x)}{kp^{\lambda-1}}; \]
then in view of (9.37) we conclude that
\[ \gamma_k(x) \equiv \frac{1}{k} \sum_{j=kp^{\lambda-1}}^{p^\lambda - 1} b_j \binom{x}{j - kp^{\lambda-1}} \pmod p \]
for all $k = 1, 2, \ldots, p - 1$. Now applying Lucas’ Theorem 0.2 once again we conclude that
\[ \gamma_k(a) \equiv \gamma_k(b) \pmod p \tag{9.42} \]
whenever $a \equiv b \pmod{p^\lambda}$. Now from (9.40)–(9.42) it follows that the congruence $a \equiv b \pmod{p^\lambda}$ implies the congruence $\theta(a) \equiv \theta(b) \pmod p$.



Now we prove the final assertion of Lemma 9.29. Our proof will follow the lines of the proof of Proposition 9.28; however, now we are considering the case $m = \lambda$ rather than $m \ge \lambda + 1$. Actually we will derive a congruence for $f(x + p^\lambda h)$ modulo $p^{\lambda+2}$ from equality (9.30) with $m = \lambda$. In order to do this, we must find a residue of $\varphi_\lambda(x, h)$ (see (9.31)) modulo $p^2$. Again, as $f$ is 1-Lipschitz, during the proof we may assume that $h \in \mathbb{N}$. In view of Lemma 4.10, from (9.34) it follows that if $i \in \{1, 2, \ldots, 2p^\lambda\}$ and either $i \le p^{\lambda-1}$, or $p^{\lambda-1} < i < p^\lambda$ and $p^{\lambda-1}$ is not a factor of $i$, then:
\[ \binom{p^\lambda h - 1}{i - 1} \frac{\Delta^i f(x)}{i} \equiv (-1)^{i-1} \frac{\Delta^i f(x)}{i} \pmod{p^2}. \tag{9.43} \]
Let now $i = kp^{\lambda-1}$ for $k \in \{2, 3, \ldots, p - 1\}$; then (9.34) implies that
\[ \binom{p^\lambda h - 1}{i - 1} \equiv (-1)^{kp^{\lambda-1}-1} + (-1)^k\, ph \sum_{j=1}^{k-1} \frac{1}{j} \pmod{p^2}. \tag{9.44} \]
Further, if $p^\lambda \le i \le 2p^\lambda$ and $\operatorname{ord}_p i \ne \lambda, \lambda - 1$ then combining (9.37) together with (9.38) we see that
\[ \frac{\Delta^i f(x)}{i} \equiv 0 \pmod{p^2}. \tag{9.45} \]
Now we find residues modulo $p^2$ of the terms $\binom{p^\lambda h - 1}{i - 1} \frac{\Delta^i f(x)}{i}$ of the function $\varphi_\lambda(x, h)$ (see (9.31)) in the two remaining cases, when $i = \nu p^\lambda$, $\nu \in \{1, 2\}$, and, respectively, when $i = kp^{\lambda-1} + p^\lambda$, $k \in \{1, 2, \ldots, p - 1\}$. In the latter case in view of Corollary 9.27 and (9.34) the following congruence holds:
\[ \binom{p^\lambda h - 1}{i - 1} \frac{\Delta^i f(x)}{i} \equiv (-1)^{i-1} \frac{\Delta^i f(x)}{i} + (-1)^{k-1} h\, \frac{\Delta^i f(x)}{i} \pmod{p^2}. \tag{9.46} \]
It is obvious that for all $k = 1, 2, \ldots, p - 1$ the following trivial equality holds:
\[ \left( 1 + \frac{p}{k} \right) \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^{\lambda-1} + p^\lambda} = \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^{\lambda-1}}. \tag{9.47} \]
As, in view of Corollary 9.27,
\[ \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^{\lambda-1} + p^\lambda} \equiv 0 \pmod p, \]
then, since $\frac{p}{k} \in \mathbb{Z}_p$ and $\operatorname{ord}_p \frac{p}{k} = 1$, the equality (9.47) implies that
\[ \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^{\lambda-1} + p^\lambda} \equiv \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^{\lambda-1}} \pmod{p^2}. \]


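From here, substituting $i = kp^{\lambda-1} + p^\lambda$ into (9.46), we deduce that
\[ \binom{p^\lambda h - 1}{kp^{\lambda-1} + p^\lambda - 1} \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^{\lambda-1} + p^\lambda} \equiv (-1)^{kp^{\lambda-1}+p^\lambda-1} \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^{\lambda-1} + p^\lambda} + (-1)^{k-1} ph \cdot \beta_k(x) \pmod{p^2}. \tag{9.48} \]
In the case $i = p^\lambda$, the equality (9.34) implies that
\[ \binom{p^\lambda h - 1}{p^\lambda - 1} \equiv (-1)^{p^\lambda - 1} - ph \sum_{j=1}^{p-1} \frac{1}{j} \equiv (-1)^{p^\lambda - 1} \pmod{p^2} \tag{9.49} \]
since $\sum_{j=1}^{p-1} \frac{1}{j} \equiv \sum_{j=1}^{p-1} j \equiv 0 \pmod p$ for $p \ne 2$. Finally for $i = 2p^\lambda$ from (9.34) in view of Corollary 9.27 we conclude that
\[ \binom{p^\lambda h - 1}{2p^\lambda - 1} \frac{\Delta^{2p^\lambda} f(x)}{2p^\lambda} \equiv (-1)^{2p^\lambda - 1} \frac{\Delta^{2p^\lambda} f(x)}{2p^\lambda} + h\, \frac{\Delta^{2p^\lambda} f(x)}{2p^\lambda} \equiv (-1)^{2p^\lambda - 1} \frac{\Delta^{2p^\lambda} f(x)}{2p^\lambda} + hp \cdot \alpha(x) \pmod{p^2}. \tag{9.50} \]
Now collecting together (9.43), (9.45), and (9.48)–(9.50), we finish the proof of Lemma 9.29 in the same way as in Proposition 9.28.

Lemma 9.30. Under assumptions of Theorem 9.24, let $p$ be an odd prime; then for all $x, h \in \mathbb{Z}_p$ the following congruence holds:
\[ f'(x + p^\lambda h) \equiv f'(x) + 2ph \cdot \theta(x) \pmod{p^2}. \]
Here $\theta$ is the function defined in the statement of Lemma 9.29.

Proof of Lemma 9.30. From Proposition 9.28 it follows that
\[ f'(x + p^\lambda h) \equiv \sum_{i=1}^{2p^\lambda} (-1)^{i-1} \frac{\Delta^i f(x + p^\lambda h)}{i} \pmod{p^2}. \tag{9.51} \]
For $i = 1, 2, \ldots, 2p^\lambda$ Lemma 9.29 implies that
\[ \frac{\Delta^i f(x + p^\lambda h)}{i} \equiv \frac{\Delta^i f(x)}{i} + h p^{\lambda - \operatorname{ord}_p i} \cdot \frac{\Delta^i f'(x)}{\hat\imath} + h^2 p^{\lambda + 1 - \operatorname{ord}_p i} \cdot \frac{\Delta^i \theta(x)}{\hat\imath} \pmod{p^2}, \tag{9.52} \]
where $\hat\imath = i p^{-\operatorname{ord}_p i}$ is a unit in $\mathbb{Z}_p$; that is, $\hat\imath$ has a multiplicative inverse $\frac{1}{\hat\imath} \in \mathbb{Z}_p$.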



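We show now that the term of order 2 (with respect to $h$) in (9.52) is 0 modulo $p^2$. If this term is not 0 modulo $p^2$, then necessarily $i \in \{p^\lambda, 2p^\lambda\}$. However, from (9.37) it follows that in this case
\[
\begin{aligned}
\frac{\Delta^{i+kp^{\lambda-1}} f(x)}{kp^{\lambda-1}} &\equiv 0 \pmod p, \\
\frac{\Delta^{i+kp^{\lambda-1}+p^\lambda} f(x)}{kp^\lambda} &\equiv 0 \pmod p, \\
\frac{\Delta^{i+2p^\lambda} f(x)}{2p^\lambda} &\equiv 0 \pmod p,
\end{aligned}
\tag{9.53}
\]
for all $k \in \{1, 2, \ldots, p - 1\}$. Now, by the definition of $\theta$, from (9.53) it follows that $\Delta^i \theta(x) \equiv 0 \pmod p$ for $i \in \{p^\lambda, 2p^\lambda\}$, and thus
\[ h^2 p^{\lambda + 1 - \operatorname{ord}_p i} \cdot \frac{\Delta^i \theta(x)}{\hat\imath} \equiv 0 \pmod{p^2}; \qquad i = 1, 2, \ldots, 2p^\lambda. \tag{9.54} \]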

Now we consider the term of order 1 in (9.52). If this term is not 0 modulo $p^2$ then necessarily $i \in \{1, 2, \ldots, 2p^\lambda\}$ and $\operatorname{ord}_p i \ge \lambda - 1$; that is, $i \in \{p^\lambda, 2p^\lambda, kp^{\lambda-1}, kp^{\lambda-1} + p^\lambda : k = 1, 2, \ldots, p - 1\}$. Combining together Corollary 9.27, Proposition 9.28, and Lemma 4.10 we conclude that
\[ f'(x) \equiv \frac{\Delta^{p^\lambda} f(x)}{p^\lambda} + \sum_{t=0}^{\lambda-1} \sum_{\tau=1}^{p-1} (-1)^{\tau-1} \frac{\Delta^{\tau p^t} f(x)}{\tau p^t} \pmod p; \tag{9.55} \]
whence,
\[ \Delta^i f'(x) \equiv \frac{\Delta^{i+p^\lambda} f(x)}{p^\lambda} + \sum_{t=0}^{\lambda-1} \sum_{\tau=1}^{p-1} (-1)^{\tau-1} \frac{\Delta^{i+\tau p^t} f(x)}{\tau p^t} \pmod p. \tag{9.56} \]
The latter congruence in force of (9.37) and Lemma 9.26 implies that $\Delta^i f'(x) \equiv 0 \pmod p$ when $i \in \{kp^{\lambda-1} + p^\lambda : k = 1, 2, \ldots, p - 1\}$; consequently,
\[ hp \cdot \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f'(x)}{k + p} \equiv 0 \pmod{p^2}; \qquad k = 1, 2, \ldots, p - 1, \tag{9.57} \]
since a multiplicative inverse $\frac{1}{k+p}$ of $k + p$ is in $\mathbb{Z}_p$ for $k = 1, 2, \ldots, p - 1$. If $i \in \{kp^{\lambda-1} : k = 1, 2, \ldots, p - 1\}$ then in view of Lemma 9.26 from (9.37) and (9.56) we deduce that
\[ \Delta^{kp^{\lambda-1}} f'(x) \equiv \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{p^\lambda} + \sum_{\tau=1}^{p-k-1} (-1)^{\tau-1} \frac{\Delta^{(\tau+k)p^{\lambda-1}} f(x)}{\tau p^{\lambda-1}} \pmod p. \tag{9.58} \]


If $i = 2p^\lambda$ then Proposition 9.28 implies that
\[ \Delta^{2p^\lambda} f'(x) \equiv \sum_{j=1}^{2p^\lambda} (-1)^{j-1} \frac{\Delta^{j+2p^\lambda} f(x)}{j} \pmod{p^2}. \]
This, in view of (9.37) and Lemma 9.26, implies that
\[ \Delta^{2p^\lambda} f'(x) \equiv 0 \pmod{p^2}. \tag{9.59} \]
Now we consider the case $i = p^\lambda$. Proposition 9.28 implies that
\[ \Delta^{p^\lambda} f'(x) \equiv \sum_{j=1}^{1+p^\lambda} (-1)^{j-1} \frac{\Delta^{j+p^\lambda} f(x)}{j} \pmod{p^2}, \tag{9.60} \]
since for $j = p^\lambda + 1, \ldots, 2p^\lambda$ from (9.37) in view of Lemma 9.26 it follows that
\[ \frac{\Delta^{j+p^\lambda} f(x)}{j} \equiv 0 \pmod{p^2}. \]
Moreover, (9.37) implies that the latter congruence holds also for all $j \le p^\lambda - 1$ such that $j \ne kp^{\lambda-1}$, where $k = 1, 2, \ldots, p - 1$. Thus, from (9.60) we deduce that
\[ \Delta^{p^\lambda} f'(x) \equiv \frac{\Delta^{2p^\lambda} f(x)}{p^\lambda} + \sum_{k=1}^{p-1} (-1)^{k-1} \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^{\lambda-1}} \pmod{p^2}. \tag{9.61} \]

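Now, substituting (9.54), (9.57), (9.58), (9.59), (9.61) into (9.52) and summing up all obtained congruences for $i$ ranging from 1 to $2p^\lambda$, in view of (9.51) and Proposition 9.28 we conclude that
\[
\begin{aligned}
f'(x + p^\lambda h) \equiv f'(x) &+ hp \cdot \left( \sum_{k=1}^{p-1} \frac{(-1)^{k-1}}{k} \sum_{\tau=1}^{p-k-1} (-1)^{\tau-1} \frac{\Delta^{(\tau+k)p^{\lambda-1}} f(x)}{\tau p^{\lambda-1}} + \sum_{k=1}^{p-1} (-1)^{k-1} \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^\lambda} \right) \\
&+ h \sum_{k=1}^{p-1} (-1)^{k-1} \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^{\lambda-1}} + h \cdot \frac{\Delta^{2p^\lambda} f(x)}{p^\lambda} \pmod{p^2}.
\end{aligned}
\tag{9.62}
\]
Easy calculations in $\mathbb{Q}_p$ prove that the following equality for $k, \tau \in \{1, 2, \ldots, p - 1\}$ is true:
\[ \sum_{k+\tau=m} \frac{1}{k\tau} = \sum_{k+\tau=m} \frac{1}{(m - \tau)\tau} = \frac{1}{m} \sum_{k+\tau=m} \left( \frac{1}{\tau} + \frac{1}{m - \tau} \right) = \frac{2}{m} \sum_{\tau=1}^{m-1} \frac{1}{\tau}. \]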


From here it follows that
\[
\begin{aligned}
\sum_{k=1}^{p-1} \frac{(-1)^{k-1}}{k} \sum_{\tau=1}^{p-k-1} (-1)^{\tau-1} \frac{\Delta^{(\tau+k)p^{\lambda-1}} f(x)}{\tau p^{\lambda-1}}
&= \sum_{m=1}^{p-1} (-1)^m \sum_{k+\tau=m} \frac{1}{k\tau} \cdot \frac{\Delta^{mp^{\lambda-1}} f(x)}{p^{\lambda-1}} \\
&= 2 \sum_{m=1}^{p-1} (-1)^m \sum_{\tau=1}^{m-1} \frac{1}{\tau} \cdot \frac{\Delta^{mp^{\lambda-1}} f(x)}{mp^{\lambda-1}}.
\end{aligned}
\tag{9.63}
\]
As was shown in the proof of Lemma 9.29, both $\alpha(x)$ and $\beta_k(x)$ are $p$-adic integers for $k = 1, 2, \ldots, p - 1$ and $x \in \mathbb{Z}_p$; thus
\[ 2hp \cdot \alpha(x) = h \cdot \frac{\Delta^{2p^\lambda} f(x)}{p^\lambda}, \qquad hp \cdot \beta_k(x) = h \cdot \frac{\Delta^{kp^{\lambda-1}+p^\lambda} f(x)}{kp^{\lambda-1}}, \tag{9.64} \]
and the fractions in the right-hand side are $p$-adic integers. Finally, the assertion of Lemma 9.30 follows from (9.62), (9.63), (9.64), and from the definition of the function $\theta$.

Proof of Theorem 9.24. For $p = 2$, Theorem 9.24 follows from Theorem 8.12 in view of Lemma 9.25. Indeed, under conditions of Theorem 9.24, the coefficients $a_j$ of the Mahler expansion $f(x) = \sum_{j=0}^{\infty} a_j \binom{x}{j}$ of the function $f$ satisfy the following congruence: $2^\rho a_i \equiv 0 \pmod{2^{\operatorname{ord}_2(i!)}}$. However, from the definition of $\lambda$ in view of Lemma 9.25 it follows that $\operatorname{ord}_2(i!) - \lfloor \log_2 i \rfloor \ge \rho + 1$ for all $i \ge 2^{\lambda+1}$, as $\operatorname{ord}_2(2^{\lambda+1}!) = 2^{\lambda+1} - 1$, see Lemma 1.1. That is, $a_i \equiv 0 \pmod{2^{\lfloor \log_2 i \rfloor + 1}}$ for all $i \ge 2^{\lambda+1}$. A similar argument proves that $a_i \equiv 0 \pmod{2^{\lfloor \log_2 (i+1) \rfloor + 1}}$ for all $i \ge 2^{\lambda+2}$. In view of Theorem 8.12, this proves Theorem 9.24 in the case $p = 2$.

Now let $p \ne 2$. The first assertion of Theorem 9.24 in this case immediately follows from Theorem 9.1 and Proposition 9.28. Further, if $p = 3$, then, as $N_2(f) \le \lambda + 1$ according to Proposition 9.28, the second assertion of Theorem 9.24 follows from Theorem 9.16.

Thus, we only must prove the second assertion of Theorem 9.24 for $p \notin \{2, 3\}$. As $N_2(f) \le \lambda + 1$ according to Proposition 9.28, in force of Theorem 9.16 it is sufficient to show that $f$ is transitive modulo $p^{\lambda+2}$ whenever $f$ is transitive modulo $p^{\lambda+1}$. For this purpose, in view of Lemma 9.17 it is sufficient only to prove that $f^{p^{\lambda+1}}(x) \not\equiv x \pmod{p^{\lambda+2}}$ for at least one $x \in \mathbb{Z}_p$. Now we merely calculate $f^{p^{\lambda+1}}(x) \bmod p^{\lambda+2}$.

Under our assumptions, $f$ is transitive modulo $p^\lambda$ since $f$ is 1-Lipschitz. Then by Lemma 9.17 we conclude that
\[ f^{p^\lambda}(x) = x + p^\lambda \cdot \xi(x), \qquad \xi(x) \not\equiv 0 \pmod p, \tag{9.65} \]



for all $x \in \mathbb{Z}_p$; here $\xi\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is a function defined everywhere on $\mathbb{Z}_p$. We claim that for all $i = 0, 1, 2, \ldots$ the following congruence holds:
\[ f^{p^\lambda + i}(x) \equiv f^i(x) + p^\lambda \cdot \xi(x) \prod_{j=0}^{i-1} f'(f^j(x)) + p^{\lambda+1} \cdot \xi(x)^2 \prod_{j=0}^{i-1} f'(f^j(x)) \sum_{k=0}^{i-1} \frac{\theta(f^k(x))}{f'(f^k(x))} \prod_{\tau=0}^{k-1} f'(f^\tau(x)) \pmod{p^{\lambda+2}}. \tag{9.66} \]
Recall that a sum (respectively, a product) over an empty set of indices is assumed to be 0 (respectively, 1). Note also that as $f$ is transitive modulo $p^{\lambda+1}$, $f$ is bijective modulo $p^{\lambda+1}$. Then (as $\lambda + 1 \ge N_1(f) + 1$ by Proposition 9.28) $f$ is measure-preserving, and $f'(z) \not\equiv 0 \pmod p$ for all $z \in \mathbb{Z}_p$, cf. Exercise 9.9. Thus, the denominators of all fractions in (9.66) have multiplicative inverses in $\mathbb{Z}_p$; so during the proof of (9.66) and further on, we assume that all calculations are performed in $\mathbb{Z}_p$.

To prove (9.66) we note that according to the chain rule, $\prod_{j=0}^{i-1} f'(f^j(x)) = (f^i(x))'$, so (9.66) can be rewritten in the form
\[ f^{p^\lambda + i}(x) \equiv f^i(x) + p^\lambda \cdot \xi(x) \cdot (f^i(x))' + p^{\lambda+1} \cdot \xi(x)^2 \cdot (f^i(x))' \sum_{k=0}^{i-1} \frac{(f^k(x))' \cdot \theta(f^k(x))}{f'(f^k(x))} \pmod{p^{\lambda+2}} \]
and then proved by induction on $i$. Indeed, for $i = 0$ our claim trivially follows from (9.65). Now we substitute the above expression for $f^{p^\lambda + i}(x) \bmod p^{\lambda+2}$ into the equation $f^{p^\lambda + i + 1}(x) = f(f^{p^\lambda + i}(x))$ and with the use of Lemma 9.29 and obvious direct calculations we prove the demanded congruence for $f^{p^\lambda + i + 1}(x)$. We omit details.

Now we apply (9.66) to calculate $f^{p^{\lambda+1}}(x) \bmod p^{\lambda+2}$. We have
\[ f^{p^\lambda + i}(x) \equiv f^i(x) + p^\lambda \cdot \xi(x) \cdot A_i(x) + p^{\lambda+1} \cdot \xi(x)^2 \cdot B_i(x) \pmod{p^{\lambda+2}}, \tag{9.67} \]
where
\[
\begin{aligned}
A_i(x) &= (f^i(x))' = \prod_{j=0}^{i-1} f'(f^j(x)); \\
B_i(x) &= (f^i(x))' \sum_{k=0}^{i-1} \frac{(f^k(x))'\, \theta(f^k(x))}{f'(f^k(x))} = \left( \prod_{j=0}^{i-1} f'(f^j(x)) \right) \cdot \sum_{k=0}^{i-1} \left( \frac{\theta(f^k(x))}{f'(f^k(x))^2} \prod_{\tau=0}^{k} f'(f^\tau(x)) \right).
\end{aligned}
\]


Lemma 9.30 implies that $f'(a + p^\lambda h) \equiv f'(a) \pmod p$. From here we deduce that $f'(f^k(x)) \equiv f'(f^r(x)) \pmod p$ whenever $k \equiv r \pmod{p^\lambda}$, as $f$ is transitive modulo $p^\lambda$. For the same reason, $\theta(f^k(x)) \equiv \theta(f^r(x)) \pmod p$ whenever $k \equiv r \pmod{p^\lambda}$, in view of Lemma 9.29. Further, $N_1(f) \le \lambda$ by Proposition 9.28, and $f$ is transitive modulo $p^{\lambda+1}$ by our assumption, so necessarily
\[ \prod_{\tau=0}^{p^\lambda - 1} f'(f^\tau(x)) \equiv 1 \pmod p, \tag{9.68} \]
see the proof of Lemma 9.17; consequently, $\prod_{\tau=0}^{k} f'(f^\tau(x)) \equiv \prod_{\tau=0}^{r} f'(f^\tau(x)) \pmod p$ whenever $k \equiv r \pmod{p^\lambda}$. Finally we conclude that
\[ B_{tp^\lambda}(x) \equiv t \sum_{\tau=0}^{p^\lambda - 1} \frac{\theta(f^\tau(x))}{f'(f^\tau(x))^2} \prod_{\nu=0}^{\tau} f'(f^\nu(x)) \equiv t \cdot B_{p^\lambda}(x) \pmod p, \tag{9.69} \]
for every $t \in \mathbb{N}$. Now we calculate $A_{tp^\lambda}(x) \bmod p^2$ for $t \in \mathbb{N}$. Congruence (9.67) in view of (9.68) implies that
\[ f^{kp^\lambda + \tau}(x) \equiv f^\tau(x) + kp^\lambda \cdot \xi(x) \prod_{j=0}^{\tau-1} f'(f^j(x)) \pmod{p^{\lambda+1}}, \tag{9.70} \]
for all $k \in \mathbb{N}$ and all $\tau \in \{0, 1, \ldots, p^\lambda - 1\}$. As Lemma 9.30 implies that $f'(u) \equiv f'(v) \pmod{p^2}$ whenever $u \equiv v \pmod{p^{\lambda+1}}$, and as
\[ A_{tp^\lambda}(x) = \prod_{k=0}^{t-1} \prod_{\tau=0}^{p^\lambda - 1} f'(f^{kp^\lambda + \tau}(x)), \]
we conclude in view of congruence (9.70) that
\[ A_{tp^\lambda}(x) \equiv \prod_{k=0}^{t-1} \prod_{\tau=0}^{p^\lambda - 1} f'\!\left( f^\tau(x) + kp^\lambda \cdot \xi(x) \prod_{j=0}^{\tau-1} f'(f^j(x)) \right) \pmod{p^2}. \]

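This implies, in view of Lemma 9.30, that
\[
\begin{aligned}
A_{tp^\lambda}(x) &\equiv \prod_{k=0}^{t-1} \prod_{\tau=0}^{p^\lambda - 1} \left( f'(f^\tau(x)) + 2kp \cdot \xi(x) \cdot \theta(f^\tau(x)) \prod_{j=0}^{\tau-1} f'(f^j(x)) \right) \\
&\equiv \prod_{k=0}^{t-1} \left( \prod_{\tau=0}^{p^\lambda - 1} f'(f^\tau(x)) + 2kp \cdot \xi(x) \cdot B_{p^\lambda}(x) \right) \pmod{p^2}.
\end{aligned}
\tag{9.71}
\]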



According to (9.68),
\[ \prod_{j=0}^{p^\lambda - 1} f'(f^j(x)) = 1 + p\varepsilon \]
for a suitable $\varepsilon \in \mathbb{Z}_p$; consequently, (9.71) implies that
\[ A_{tp^\lambda}(x) \equiv \prod_{k=0}^{t-1} \left( 1 + p\varepsilon + 2kp \cdot \xi(x) \cdot B_{p^\lambda}(x) \right) \equiv 1 + tp\varepsilon + pt(t-1) \cdot \xi(x) \cdot B_{p^\lambda}(x) \pmod{p^2}. \tag{9.72} \]
Now combining together (9.67), (9.69), and (9.72) we conclude that
\[ f^{(t+1)p^\lambda}(x) = f^{p^\lambda + tp^\lambda}(x) \equiv f^{tp^\lambda}(x) + p^\lambda \cdot \xi(x) + \varepsilon t p^{\lambda+1} \cdot \xi(x) + p^{\lambda+1} t^2 \cdot \xi(x)^2 \cdot B_{p^\lambda}(x) \pmod{p^{\lambda+2}}. \tag{9.73} \]
Finally, by obvious induction on $n$, from (9.73) and (9.65) we deduce that
\[ f^{np^\lambda}(x) \equiv x + np^\lambda \cdot \xi(x) + \varepsilon p^{\lambda+1} \cdot \xi(x) \cdot \frac{n(n-1)}{2} + p^{\lambda+1} \cdot \xi(x)^2 \cdot B_{p^\lambda}(x) \cdot \frac{n(n-1)(2n-1)}{6} \pmod{p^{\lambda+2}}. \]
From here it follows in particular that $f^{p^{\lambda+1}}(x) \equiv x + p^{\lambda+1} \cdot \xi(x) \pmod{p^{\lambda+2}}$ since $p \ne 2, 3$. However, the latter congruence in view of (9.65) implies that $f^{p^{\lambda+1}}(x) \not\equiv x \pmod{p^{\lambda+2}}$. This finally proves Theorem 9.24.

Note 9.31. With the use of Theorem 9.24 we can determine whether a given integer-valued and compatible polynomial $f(x) \in \mathbb{Q}_p[x]$ is ergodic. Represent $f(x)$ in the form $f(x) = \frac{g(x)}{r}$, where $r \in \mathbb{Z}_p$, $g(x) \in \mathbb{Z}_p[x]$, and at least one coefficient of $g(x)$ is coprime with $p$. Actually, $r$ is a least common denominator of all coefficients of $f(x)$ represented as irreducible fractions: We assume that $f(x)$ is represented in a falling factorial basis $x^{\underline{0}} = 1$, $x^{\underline{1}} = x$, $x^{\underline{2}} = x(x-1)$, $\ldots$, or in a standard basis $1, x, x^2, \ldots$. Then $\rho(f) = \operatorname{ord}_p r$; note that $r$ does not depend on the choice of a basis. Now we easily find $\lambda(f)$ and determine (e.g., by direct calculations) whether $f$ is transitive modulo $p^{\lambda(f)+1}$ in the case $p \ne 2, 3$ or, respectively, modulo $p^{\lambda(f)+2}$ whenever $p = 2$ or $p = 3$.

Actually one can determine whether a polynomial $f(x) \in \mathbb{Q}_p[x]$ induces a 1-Lipschitz measure-preserving (respectively, ergodic) transformation on $\mathbb{Z}_p$ by evaluating $f$ at $\approx p^3 \cdot \deg f$ points:



Proposition 9.32. A polynomial $f(x) \in \mathbb{Q}_p[x]$ induces a 1-Lipschitz measure-preserving (respectively, ergodic) transformation on $\mathbb{Z}_p$ if and only if the mapping $z \mapsto f(z) \bmod p^{\lfloor \log_p(\deg f) \rfloor + 3}$ is a compatible and bijective (respectively, transitive) transformation on the residue ring $\mathbb{Z}/p^{\lfloor \log_p(\deg f) \rfloor + 3}\mathbb{Z}$.

Proof. We prove only the ergodicity claim; a proof of the measure-preservation claim goes along similar lines and thus is omitted. Coefficients $a_i \in \mathbb{Q}_p$ ($i = 0, 1, \ldots, d$) in the Mahler expansion of the polynomial $f(x)$ of degree $d$ are completely determined by the values of $f(x)$ at the points $0, 1, \ldots, d$. In particular, all values $f(0), f(1), \ldots, f(d)$ are $p$-adic integers if and only if all coefficients $a_i \in \mathbb{Q}_p$ ($i = 0, 1, \ldots, d$) are $p$-adic integers, i.e., if and only if the polynomial $f(x)$ is integer-valued. As $\Delta^i f(x) = 0$ for $i > \deg f = d$, in view of Theorem 4.15 from the proof of Proposition 4.5 it follows that $f$ is a 1-Lipschitz transformation on $\mathbb{Z}_p$ if and only if $f$ induces a compatible transformation on the residue ring $\mathbb{Z}/p^k\mathbb{Z}$ for some (arbitrarily fixed) $k \ge \lfloor \log_p d \rfloor + 1$.

In force of Theorem 9.24, an integer-valued polynomial $f(x) \in \mathbb{Q}_p[x]$ that induces a 1-Lipschitz transformation on $\mathbb{Z}_p$ is ergodic (on $\mathbb{Z}_p$) if and only if $f$ is transitive modulo $p^k$ for any (arbitrarily fixed) $k \ge \lambda(f) + 2$. Considering the Mahler expansion (9.26) for $f(x)$, $f(x) = b_0 + \sum_{i=1}^{d} b_i\, p^{\lfloor \log_p i \rfloor} \binom{x}{i}$, where $b_j \in \mathbb{Z}_p$ for $j = 0, 1, 2, \ldots$, we conclude that $\rho(f)$ is the least nonnegative rational integer that is not smaller than any of $\operatorname{ord}_p(i!) - \lfloor \log_p i \rfloor - \operatorname{ord}_p b_i$ ($i = 1, 2, \ldots, d$). Thus, since the function $\operatorname{ord}_p(i!) - \lfloor \log_p i \rfloor$ is monotone nondecreasing by Lemma 9.25, every $k \in \mathbb{N}$ that satisfies the inequality $2 \cdot \frac{p^k - 1}{p - 1} - k > \operatorname{ord}_p(d!) - \lfloor \log_p d \rfloor$ cannot be smaller than $\lambda(f)$. However, $\operatorname{ord}_p(d!) = \frac{1}{p-1} \cdot (d - \operatorname{wt}_p d)$ by Lemma 1.1; so taking arbitrary $k \in \mathbb{N}$ that satisfies the inequality
\[ 2 \cdot \frac{p^k - 1}{p - 1} - k > \frac{d}{p - 1}, \tag{9.74} \]
we conclude that $k \ge \lambda(f)$. Elementary considerations show that $k = \lfloor \log_p d \rfloor + 1$ satisfies inequality (9.74), thus ending the proof.

Exercise 9.33. Prove the assertion of Proposition 9.32 on measure preservation.

It is obvious that in some cases the conditions of Theorem 9.24 and of Proposition 9.32 can be relaxed; e.g., it is obvious that whenever $p > 3$, the proposition remains true after replacing $p^{\lfloor \log_p(\deg f) \rfloor + 3}$ by $p^{\lfloor \log_p(\deg f) \rfloor + 2}$. However, the point is that for some important classes of functions these bounds can be tightened significantly so that the conditions depend only on the whole class rather than on a concrete function from the class:

Corollary 9.34. A B-function (and thus a C-function) $f$ is measure-preserving if and only if $f$ is bijective modulo $p^2$. The function $f$ is ergodic if and only if $f$ is transitive modulo $p^2$ whenever $p \notin \{2, 3\}$, or modulo $p^3$ whenever $p \in \{2, 3\}$.

Proof. By the definition of the class $\mathcal{B}$, $\rho(f) = 0$ for every $f \in \mathcal{B}$; whence, $\lambda(f) = 1$, and the conclusion follows from Theorem 9.24.

From here we immediately deduce

Corollary 9.35 (cf. [39, 19]). A polynomial $f \in \mathbb{Z}_p[x]$ is ergodic if and only if $f$ is transitive modulo $p^2$ whenever $p \notin \{2, 3\}$, or modulo $p^3$ whenever $p \in \{2, 3\}$.

Note. The bounds given by Corollary 9.35 (and therefore by Corollary 9.34) are sharp: A polynomial $2x^3 + 3x + 5$ is transitive modulo 4, and is not transitive modulo 8 (whence, is not ergodic on $\mathbb{Z}_2$); a polynomial $1 + x - x(x-1)(x-2)(x-3)(x-4)(x-6)(x-7)$ is transitive modulo 9, and is not transitive modulo 27 (whence, is not ergodic on $\mathbb{Z}_3$); a polynomial $1 + x^p$ is transitive modulo $p$, and is not transitive (even is not bijective in view of Exercise 9.9) modulo $p^2$; whence, is not measure-preserving on $\mathbb{Z}_p$.³

Exercise 9.36. Prove that given an arbitrary 1-Lipschitz function $g\colon \mathbb{Z}_p \to \mathbb{Z}_p$ and an arbitrary ergodic B-function $u\colon \mathbb{Z}_p \to \mathbb{Z}_p$, the function $f(x) = u(x) + p^2 \cdot g(x)$ is ergodic.

³ The first two examples are due to M. V. Larin [39].
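Transitivity modulo a given $p^k$, which is the criterion used in Corollary 9.35, is easy to test numerically. The following is only an illustrative sketch: the helper name `is_transitive` is ours and does not come from the text. It follows the orbit of 0 and checks that it returns to 0 after exactly $m$ steps, which happens if and only if the reduced map permutes $\mathbb{Z}/m\mathbb{Z}$ in a single cycle.

```python
def is_transitive(f, m):
    """Return True if x -> f(x) mod m is a single cycle of length m on Z/mZ.

    Hypothetical helper for illustration: the orbit of 0 closes up after
    exactly m steps if and only if the reduced map is transitive.
    """
    x, steps = 0, 0
    while True:
        x = f(x) % m
        steps += 1
        if x == 0 or steps > m:
            break
    return x == 0 and steps == m


# The first and third examples from the Note to Corollary 9.35.
f2 = lambda x: 2 * x**3 + 3 * x + 5
print(is_transitive(f2, 4), is_transitive(f2, 8))      # True False

p = 3
fp = lambda x: 1 + x**p
print(is_transitive(fp, p), is_transitive(fp, p * p))  # True False
```

The same helper can be applied to any integer-valued map; for a polynomial over $\mathbb{Z}_p$ it suffices, by Corollary 9.35, to run it with $m = p^2$ (or $m = p^3$ when $p \in \{2, 3\}$).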

Chapter 10

1-Lipschitz ergodicity on subspaces

In this chapter we study 1-Lipschitz ergodic transformations on certain subspaces $S$ of the space $\mathbb{Z}_p$, namely on balls and on spheres. We consider the measure $\hat\mu_p$ induced on $S$ by the Haar measure $\mu_p$ on the whole space $\mathbb{Z}_p$; we assume that $\hat\mu_p$ is normalized so that $\hat\mu_p(S) = 1$. Now, if $f : S \to S$ is a 1-Lipschitz map, we can speak of ergodicity of this map with respect to the measure $\hat\mu_p$. In the sequel, speaking of ergodicity (and of measure preservation) of a map $f$ on a subspace $S$, we mean that $S$ is invariant under the action of $f$ and the measure is $\hat\mu_p$.

10.1 1-Lipschitz ergodic transformations on balls

Usually the problem of determining ergodicity of a 1-Lipschitz transformation on a ball $B_{p^{-k}}(a) = a + p^k\mathbb{Z}_p$ in $\mathbb{Z}_p$ can be reduced to the same problem on the whole space $\mathbb{Z}_p$. Indeed, if $f$ is a 1-Lipschitz transformation such that $f(a + p^k\mathbb{Z}_p) \subset a + p^k\mathbb{Z}_p$, then necessarily $f(a) = a + p^k y$ for a suitable $y \in \mathbb{Z}_p$. Thus, $f(a + p^k z) = a + p^k \cdot g(z)$ for any $z \in \mathbb{Z}_p$, with $g(0) = y$; so we can relate to $f$ the following 1-Lipschitz transformation on $\mathbb{Z}_p$:

$$g : z \mapsto g(z) = \frac{1}{p^k}\bigl(f(a + p^k z) - a\bigr); \quad z \in \mathbb{Z}_p.$$

It is clear that the transformation $f$ is ergodic on the ball $B_{p^{-k}}(a)$ if and only if the transformation $g$ is ergodic on $\mathbb{Z}_p$.

Exercise 10.1 (cf. [20]). Prove that the function $f(x) = a x^{-1} + b$, where $a + b \equiv 1 \pmod 2$, is ergodic on the ball $1 + 2\mathbb{Z}_2$ if and only if $a \equiv 1 \pmod 4$ and $b \equiv 2 \pmod 4$.

Exercise 10.2 (cf. [32]). The function $f(x) = (a x^{-1} + b + c x) \bmod 2^n$, where $a + b + c \equiv 1 \pmod 2$, is ergodic on the ball $1 + 2\mathbb{Z}_2$ if and only if $a + c \equiv 1 \pmod 4$ and $b \equiv 2 \pmod 4$.
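The reduction from a ball to the whole space can be tried out at finite precision. The sketch below is an illustration under stated assumptions, not a proof: all function names are ours, the conjugate map is taken to be $g(z) = (f(a + p^k z) - a)/p^k$ (the conjugate of $f$ by $z \mapsto a + p^k z$), and only residues modulo $2^n$ for one small $n$ are inspected. It uses the map of Exercise 10.1 on the ball $1 + 2\mathbb{Z}_2$; modular inverses via `pow(x, -1, m)` require Python 3.8 or later.

```python
def is_single_cycle(f, start, modulus, size):
    """True if the orbit of `start` under x -> f(x) mod modulus
    first returns to `start` after exactly `size` steps."""
    start %= modulus
    x = start
    for step in range(1, size + 1):
        x = f(x) % modulus
        if x == start:
            return step == size
    return False


def inversive(a, b, n):
    """The map x -> a * x^(-1) + b on odd residues modulo 2^n (Exercise 10.1)."""
    m = 2 ** n
    return lambda x: (a * pow(x, -1, m) + b) % m


def conjugate(f, a, p, k):
    """g(z) = (f(a + p^k z) - a) / p^k; assumes f maps a + p^k Z_p into itself."""
    def g(z):
        value = f(a + p**k * z) - a
        assert value % p**k == 0, "the ball is not invariant under f"
        return value // p**k
    return g


n = 5
f = inversive(1, 2, n)                       # a = 1 (mod 4), b = 2 (mod 4)
print(is_single_cycle(f, 1, 2**n, 2**(n - 1)))        # True: one cycle on odd residues
g = conjugate(f, 1, 2, 1)                    # ball 1 + 2 Z_2, i.e. a = 1, p = 2, k = 1
print(is_single_cycle(g, 0, 2**(n - 1), 2**(n - 1)))  # True: one cycle on Z / 2^(n-1) Z
```

Transitivity of $f$ on the odd residues modulo $2^n$ and transitivity of $g$ on all residues modulo $2^{n-1}$ are the same statement seen through the conjugation, which is what the two printed values show.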


The situation with ergodicity on spheres is much more complicated.

10.2 1-Lipschitz ergodic transformations on spheres

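Let $S_{p^{-r}}(y)$ be a sphere of radius $\frac{1}{p^r}$, $r \ge 1$, with a center at $y \in \mathbb{Z}_p$; that is,

$$S_{p^{-r}}(y) = \left\{ z \in \mathbb{Z}_p : |z - y|_p = \frac{1}{p^r} \right\}.$$

We remind that the sphere is a disjoint union of balls of radius $\frac{1}{p^{r+1}}$ each,

$$S_{p^{-r}}(y) = \bigcup_{s=1}^{p-1} \left( y + p^r s + p^{r+1}\mathbb{Z}_p \right), \tag{10.1}$$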

since $S_{p^{-r}}(y)$ is the set-theoretic complement of the ball $y + p^{r+1}\mathbb{Z}_p$ in the ball $y + p^r\mathbb{Z}_p$ (cf. Subsection 1.2.2). So $S_{p^{-r}}(y)$ is simultaneously a closed and an open (whence, a $\mu_p$-measurable) subset of $\mathbb{Z}_p$. The following easy proposition holds:

Proposition 10.3. If $S_{p^{-r}}(y)$ is invariant under the action of a 1-Lipschitz map $f$, then $f(y) \equiv y \pmod{p^r}$.

Proof. Since $S_{p^{-r}}(y)$ is invariant, and since $f$ maps balls into balls, $f(y + p^r s + p^{r+1}\mathbb{Z}_p) \subset y + p^r \hat s + p^{r+1}\mathbb{Z}_p$ for a suitable $\hat s \in \{1, 2, \ldots, p-1\}$ (see (10.1)). However, $f(y + p^r s) \equiv f(y) \pmod{p^r}$ since $f \in L_1$, and the result follows.

From this proposition we derive the following

Corollary 10.4. Let all spheres around $y \in \mathbb{Z}_p$ of radii less than $\varepsilon > 0$ be invariant under the action of a 1-Lipschitz map $f$. Then $y$ is a fixed point of $f$.

Exercise 10.5. Prove Corollary 10.4.

What is important, the analogue of Theorem 7.1 for a sphere (rather than for the whole space $\mathbb{Z}_p$) remains true:

Proposition 10.6. A 1-Lipschitz mapping $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is ergodic on the sphere $S_{p^{-r}}(y)$ if and only if it induces on the residue ring $\mathbb{Z}/p^{k+1}\mathbb{Z}$ a mapping which is transitive on all subsets

$$S_{p^{-r}}(y) \bmod p^{k+1} = \{ y + p^r s + p^{r+1}\mathbb{Z} : s = 1, 2, \ldots, p-1 \} \subset \mathbb{Z}/p^{k+1}\mathbb{Z}, \quad k = r, r+1, \ldots.^{1}$$

¹ That is, the reduced mapping $f \bmod p^{k+1}$ permutes cyclically the elements of every subset $S_{p^{-r}}(y) \bmod p^{k+1}$, cf. Subsection 6.1.1.



Exercise 10.7. Prove Proposition 10.6.

Exercise 10.8. State and prove an analogue of Proposition 10.6 for balls rather than for spheres.

It is also worth noticing that whenever a 1-Lipschitz mapping $f$ is ergodic on the sphere $S_{p^{-r}}(y)$, $f$ is a bijection of this sphere onto itself; moreover, it is an isometry on this sphere, see Notes 7.5 and 7.8. The same holds for balls. From these remarks we deduce the following lemma:

Lemma 10.9. A 1-Lipschitz mapping $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is ergodic on the sphere $S_{p^{-r}}(y)$ if and only if the following two conditions hold simultaneously:

1. The mapping $z \mapsto f(z) \bmod p^{r+1}$ is transitive on the set $S_{p^{-r}}(y) \bmod p^{r+1} = \{ y + p^r s : s = 1, 2, \ldots, p-1 \} \subset \mathbb{Z}/p^{r+1}\mathbb{Z}$.

2. The mapping $z \mapsto f^{p-1}(z) \bmod p^{r+t+1}$ is transitive on the set $B_{p^{-(r+1)}}(y + p^r s) \bmod p^{r+t+1} = \{ y + p^r s + p^{r+1} S : S = 0, 1, 2, \ldots, p^t - 1 \}$, for all $t = 1, 2, \ldots$ and some (equivalently, all) $s \in \{1, 2, \ldots, p-1\}$.

Condition 2 holds if and only if $f^{p-1}$ is an ergodic transformation on the ball $B_{p^{-(r+1)}}(y + p^r s) = y + p^r s + p^{r+1}\mathbb{Z}_p$ of radius $\frac{1}{p^{r+1}}$ centered at $y + p^r s$, for some (equivalently, all) $s \in \{1, 2, \ldots, p-1\}$.

Proof. As every 1-Lipschitz ergodic transformation $f$ of the sphere is bijective on this sphere, and $f$ is an isometry on this sphere as well (see the remarks above), $f(a + p^k\mathbb{Z}_p) = f(a) + p^k\mathbb{Z}_p$ for all $a \in \mathbb{Z}_p$ and all $k = 1, 2, \ldots$. Thus, the mapping $z \mapsto f(z) \bmod p^{k+1}$ ($k > r$) permutes cyclically the elements of the set

$$S_{p^{-r}}(y) \bmod p^{k+1} = \{ y + p^r s + p^{r+1} S : s = 1, 2, \ldots, p-1;\ S = 0, 1, 2, \ldots, p^{k-r} - 1 \}$$

if and only if conditions 1 and 2 hold simultaneously for $t = k - r$. This proves the first part of the statement of the lemma, in view of Proposition 10.6. The second part of the statement is just an analogue of Proposition 10.6 for balls rather than for spheres.

Exercise 10.10. Prove that Lemma 10.9 holds for spheres of radius 1 as well, in the following form: A 1-Lipschitz transformation $f : S_1(y) \to S_1(y)$ on the sphere

$$S_1(y) = \bigcup_{s \in \{0, \ldots, p-1\} \setminus \{y\}} (s + p\mathbb{Z}_p)$$

is ergodic if and only if $f \bmod p$ is transitive on the set $\{0, \ldots, p-1\} \setminus \{y\}$ and $f^{p-1}$ is an ergodic transformation on every (equivalently, some) ball $B_{p^{-1}}(s)$, $s \in \{0, \ldots, p-1\} \setminus \{y\}$.



Exercise 10.11. Prove that both Lemma 10.9 and Proposition 10.6 hold for a 1-Lipschitz mapping with domain $S_{p^{-r}}(y)$ rather than with domain $\mathbb{Z}_p$; that is, $f$ may be defined only on the sphere $S_{p^{-r}}(y)$ rather than on the whole space $\mathbb{Z}_p$.
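The criterion of Proposition 10.6 can be examined at finite precision: one lists the residues of the sphere modulo $p^{k+1}$ and checks whether the reduced map permutes them cyclically. The sketch below is only an illustration; the function names are ours, and checking finitely many $k$ gives evidence about ergodicity but not a proof. As test maps we use $z \mapsto 2z$ and $z \mapsto 8z$ on the sphere of radius $1/3$ around 0 in $\mathbb{Z}_3$.

```python
def sphere_residues(y, p, r, k):
    """Residues modulo p^(k+1) of the sphere |z - y|_p = p^(-r), for k >= r."""
    m = p ** (k + 1)
    return [(y + p**r * u) % m for u in range(p ** (k + 1 - r)) if u % p != 0]


def transitive_on_sphere(f, y, p, r, k):
    """True if f mod p^(k+1) permutes the sphere residues in one cycle."""
    m = p ** (k + 1)
    points = set(sphere_residues(y, p, r, k))
    start = (y + p**r) % m
    x, steps = start, 0
    while True:
        x = f(x) % m
        steps += 1
        if x not in points:          # the sphere is not invariant
            return False
        if x == start or steps > len(points):
            return x == start and steps == len(points)


double = lambda z: 2 * z     # 2 generates the units modulo 9
times8 = lambda z: 8 * z     # 8 generates the units modulo 3 but not modulo 9

print([transitive_on_sphere(double, 0, 3, 1, k) for k in (1, 2, 3)])  # [True, True, True]
print([transitive_on_sphere(times8, 0, 3, 1, k) for k in (1, 2)])     # [True, False]
```

The second map passes the test at the coarsest level $k = r$ and fails at the next one, which is the kind of behaviour that condition 2 of Lemma 10.9 is designed to detect.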

10.2.1 Ergodicity of B-functions and of analytic functions

Recall that a p-adic number $z \in \mathbb{Z}_p$ is called primitive modulo $p^k$ whenever $z \bmod p^k$ generates the whole group $(\mathbb{Z}/p^k\mathbb{Z})^*$ of invertible elements of the residue ring $\mathbb{Z}/p^k\mathbb{Z}$, cf. Subsection 0.2.1. Note that whenever $k > 2$ we speak of primitivity modulo $p^k$ only for odd $p$, see Proposition 0.15.

Theorem 10.12. Let the function $f$ lie in B. The function $f$ is ergodic on the sphere $S_{p^{-r}}(y)$ of sufficiently small radius $p^{-r}$ if and only if one of the following alternatives holds:

1. Whenever $p$ is odd, then simultaneously
• $f(y) \equiv y \pmod{p^{r+1}}$,
• $f'(y)$ is primitive modulo $p^2$.

2. Whenever $p = 2$, then simultaneously
• $f(y) \equiv y \pmod{2^{r+1}}$,
• $f(y) \not\equiv y \pmod{2^{r+2}}$,
• $f'(y) \equiv 1 \pmod 4$.

Note. Within the context of the theorem, 'sufficiently small' means that $r \ge 2$ if $p > 3$, or $r \ge 3$ if $p \le 3$.

Proof. As immediately follows from Theorem 5.18, for every $g \in B$, all $a \in \mathbb{Z}_p$ and $k = 1, 2, 3, \ldots$ the equality

$$g(a + p^k h) = g(a) + g'(a) \cdot p^k h + p^{2k} h^2 \cdot \hat g(h) \tag{10.2}$$

holds for a suitable C-function $\hat g$ of the variable $h$.² Since $f(y) = y + p^r z$ for a suitable $z \in \mathbb{Z}_p$ in view of Proposition 10.3, from (10.2) we deduce the following equality:

$$f(y + p^r s + p^{r+1} S) = f(y) + (p^r s + p^{r+1} S) \cdot f'(y) + p^{2r} \cdot (s + pS)^2 \cdot \hat w(s + pS) = y + p^r z + p^r s \cdot f'(y) + p^{r+1} S \cdot f'(y) + p^{2r} \cdot v(s) + p^{2r+1} \cdot w(S), \tag{10.3}$$

where $v$, $\hat w$ and $w$ are C-functions in the respective variables and $r \ge 1$ (note that we have used (10.2) twice: with $g = f$, $a = y$, $p^k h = p^r s + p^{r+1} S$ for the first time, and with $g = \hat w$, $a = s$, $p^k h = pS$ for the second time). Note that $w$ depends also on $s$, yet this is of no importance in the argument that follows.

² Of course, the coefficients of the series (5.8) that represents the function $p^{2k} \cdot \hat g \in B$ depend also on $a$ and $k$, but this is of no importance at the moment.

Iterating (10.3) we conclude that

$$f^{p-1}(y + p^r s + p^{r+1} S) = y + p^r z \sum_{i=0}^{p-2} (f'(y))^i + p^r s \cdot (f'(y))^{p-1} + p^{r+1} S \cdot (f'(y))^{p-1} + p^{2r} \cdot \breve v(s) + p^{2r+1} \cdot \breve w(S) \tag{10.4}$$

for suitable $\breve v$ and $\breve w$, which are B-functions (as compositions of C-functions). Now, to satisfy condition 2 of Lemma 10.9, the ball $y + p^r s + p^{r+1}\mathbb{Z}_p$ must be invariant under the action of $f^{p-1}$, and $f^{p-1}$ must act ergodically on this ball. However, (10.4) implies that the ball is invariant if and only if

$$\sigma(z, s) = z \sum_{i=0}^{p-2} (f'(y))^i + s \cdot (f'(y))^{p-1} \equiv s \pmod p. \tag{10.5}$$

Assuming the ball is invariant, we conclude that $\sigma(z, s) = s + p \cdot \gamma(z, s)$ for a suitable p-adic integer $\gamma(z, s)$. So, having $s$ fixed, from (10.4) we see that under this assumption the following equality holds:

$$f^{p-1}(y + p^r s + p^{r+1} S) = y + p^r s + p^{r+1} \cdot \bigl( \gamma(z, s) + S \cdot (f'(y))^{p-1} + p^{r-1} \cdot \breve v(s) + p^r \cdot \breve w(S) \bigr).$$

Thus, to satisfy condition 2 of Lemma 10.9, the following B-function

$$G_{z,s}(S) = \gamma(z, s) + S \cdot (f'(y))^{p-1} + p^{r-1} \cdot \breve v(s) + p^r \cdot \breve w(S) \tag{10.6}$$

in the variable $S$ must be ergodic on $\mathbb{Z}_p$. Now, whenever $r > 1$ and $p > 3$, or whenever $r > 2$ and $p \le 3$, from Corollary 9.34 we deduce that the B-function $G_{z,s}(S)$ from (10.6) is ergodic on $\mathbb{Z}_p$ if and only if the polynomial

$$L_{z,s}(S) = \gamma(z, s) + p^{r-1} \cdot \breve v(s) + S \cdot (f'(y))^{p-1} \tag{10.7}$$

of degree 1 in the variable $S$ is transitive modulo $p^2$ for $p > 3$, or modulo $p^3$ for $p \le 3$. But this, in view of Theorem 8.1 and (10.7), implies that $f'(y) \not\equiv 0 \pmod p$. Now, as $f(y) = y + p^r z$, from (10.3) it follows that to satisfy condition 1 of Lemma 10.9, the mapping $s \mapsto z + s \cdot f'(y) \pmod p$ must be transitive on the multiplicative group (i.e., on the whole group of units) $(\mathbb{Z}/p\mathbb{Z})^*$ of the field $\mathbb{Z}/p\mathbb{Z}$. Hence, $z \equiv 0 \pmod p$ (that is, $f(y) \equiv y \pmod{p^{r+1}}$) since otherwise $s \mapsto 0 \pmod p$ for $s \equiv -\frac{z}{f'(y)} \pmod p$. From this moment we consider the cases $p = 2$ and $p > 2$ separately.

Case 1: $p > 2$. In this case the mapping $s \mapsto s \cdot f'(y) \pmod p$ is transitive on $(\mathbb{Z}/p\mathbb{Z})^*$ if and only if $f'(y)$ is a primitive element of the field $\mathbb{F}_p$ (that is, $f'(y)$ generates the cyclic group $(\mathbb{Z}/p\mathbb{Z})^*$).



Whenever this holds, every ball $y + p^r s + p^{r+1}\mathbb{Z}_p$, $s \in \{1, 2, \ldots, p-1\}$, is invariant under the action of $f^{p-1}$ in view of (10.5). Moreover, since $z \equiv 0 \pmod p$, in the case when $f'(y)$ is primitive modulo $p$ we have that $\sigma(z, s) \equiv s \cdot (f'(y))^{p-1} \pmod{p^2}$ and whence $\gamma(z, s) \equiv bs \pmod p$, where $(f'(y))^{p-1} = 1 + pb$, $b \in \mathbb{Z}_p$ (see (10.5) and the text thereafter for the definition of $\sigma(z, s)$ and $\gamma(z, s)$). Now, the polynomial $L_{z,s}(S)$ (see (10.7)) in the variable $S$ is ergodic on $\mathbb{Z}_p$ (and thus condition 2 of Lemma 10.9 is satisfied) if and only if $b \not\equiv 0 \pmod p$, see Theorem 8.1. Yet this means that $f'(y)$ must be a generator of the multiplicative group $(\mathbb{Z}/p^2\mathbb{Z})^*$.

Case 2: $p = 2$. In this case the sphere $S_{2^{-r}}(y) = y + 2^r + 2^{r+1}\mathbb{Z}_2$ is a ball, see (10.1). Moreover, the above condition $f'(y) \not\equiv 0 \pmod p$ means that $f'(y) \equiv 1 \pmod 2$, and so the condition that the mapping $s \mapsto s \cdot f'(y) \pmod p$ is transitive on the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^*$, which just means that $z + f'(y) \equiv 1 \pmod 2$ in this case, is automatically satisfied since we have already proved that $z \equiv 0 \pmod p$ (i.e., that $z = pc$ for a suitable $c \in \mathbb{Z}_p$) for any $p$. Further, if the polynomial $L_{z,s}(S)$ in the variable $S$ is transitive modulo $p^3$ then $f'(y) \equiv 1 \pmod 4$, see (10.7) and Theorem 8.1. That is, $f'(y) = 1 + 4b$ for some $b \in \mathbb{Z}_2$. Hence $\gamma(z, s) = c + 2b$ (see (10.5) and the text thereafter), so in view of (10.7) and Theorem 8.1, if $L_{z,s}(S)$ is transitive modulo 8, then $c \equiv 1 \pmod 2$; that is, $f(y) = y + 2^r z = y + 2^{r+1} c \not\equiv y \pmod{2^{r+2}}$. This proves Theorem 10.12.

Corollary 10.13. Let $y \in \mathbb{Z}_p$ be a fixed point of the function $f \in B$, and let $p$ be odd. Then $f$ is ergodic on all spheres around $y$ of sufficiently small radii if and only if $f$ is ergodic on some sphere around $y$ of a sufficiently small radius.

From Theorem 10.12 we derive a complete characterization of B-functions that are ergodic on p-adic spheres.

Theorem 10.14. Let $f$ be a B-function. Whenever $p$ is odd, the mapping $z \mapsto f(z)$ is an ergodic transformation on every sufficiently small sphere centered at $y \in \mathbb{Z}_p$ if and only if the following two conditions hold simultaneously:
• $f(y) = y$, and
• the derivative $f'(y)$ of the function $f$ at the point $y \in \mathbb{Z}_p$ is primitive modulo $p^2$.
In the case $p = 2$ no B-function exists such that the mapping $z \mapsto f(z)$ is ergodic on all spheres around $y \in \mathbb{Z}_2$ of radii less than $\varepsilon$, whatever $\varepsilon > 0$ is taken.

Exercise 10.15. Prove Theorem 10.14.



Exercise 10.16 (cf. [16]). Let $p$ be an odd prime. Consider the map $M_{a,\ell} : \mathbb{Z}_p \to \mathbb{Z}_p$, $M_{a,\ell}(z) = a z^{\ell}$, where $\ell \in \mathbb{N}$, $\ell \not\equiv 1 \pmod p$ and $a \in B_{p^{-1}}(1)$. Prove that the map $M_{a,\ell}$ has a unique fixed point $x_0 \in B_{p^{-1}}(1)$ and that $M_{a,\ell}$ is ergodic on $S_{p^{-r}}(x_0)$ if and only if $\ell$ is primitive modulo $p^2$.

Exercise 10.17 (cf. [16]). Let $p$ be an odd prime. Prove that the affine map $T_{a,b} : z \mapsto az + b$, where $a, b \in \mathbb{Z}_p$, $a \ne 1$, has a fixed point $y = \frac{b}{1-a} \in \mathbb{Q}_p$. Prove that in the case when $y \in \mathbb{Z}_p$, the map $T_{a,b}$ is ergodic on $S_{p^{-r}}(y)$ if and only if $a$ is primitive modulo $p^2$.

Exercise 10.18. Let $p$ be an odd prime. Prove that if $\ell \in \mathbb{N}$ is primitive modulo $p^2$, the functions $f(x) = 1 + \ell \cdot (-1 + x + p^2 \cdot v(x))$ and $g(x) = \ell \cdot (a^x + ax - 2a) + 1$ are ergodic on all (sufficiently small) spheres around 1, for every $a \in 1 + p^2\mathbb{Z}_p$ and every B-function $v$.

Exercise 10.19. Let $p$ be an odd prime. Prove that the functions $f(x) = \ell \cdot x + \ln_p(1 + p^2 x)$ and $g(x) = \frac{\ell \cdot x}{1 + p^2 x}$ are ergodic on all (sufficiently small) spheres around 0. Recall that $\ln_p$ is the p-adic logarithm, see Example 3.3.

The following exercise gives a solution to a problem posed in [27]:

Exercise 10.20. Prove that the perturbed monomial mapping $f : x \mapsto x^{\ell} + q(x)$, where $q(x) = p^{r+1} u(x)$ and $u$ is a B-function, is ergodic on the sphere $S_{p^{-r}}(1)$, $r > 1$, if and only if $\ell$ is primitive modulo $p^2$. Consider the case $p = 2$ as well!
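The recurring condition in Theorem 10.12, Theorem 10.14 and the exercises above is primitivity modulo $p^2$. A minimal way to test it for a rational integer is to compute its multiplicative order modulo $p^2$ and compare it with $p(p-1)$, the order of the group $(\mathbb{Z}/p^2\mathbb{Z})^*$. The helper names below are ours and serve only as an illustration.

```python
def multiplicative_order(a, m):
    """Order of a in the group of units modulo m; assumes gcd(a, m) == 1."""
    x, n = a % m, 1
    while x != 1:
        x = x * a % m
        n += 1
    return n


def is_primitive_mod_p2(a, p):
    """True if a generates (Z/p^2 Z)^*, where p is an odd prime."""
    if a % p == 0:
        return False
    return multiplicative_order(a, p * p) == p * (p - 1)


print([a for a in range(1, 9) if is_primitive_mod_p2(a, 3)])  # [2, 5]
print(is_primitive_mod_p2(2, 5), is_primitive_mod_p2(7, 5))   # True False
```

The last line shows why primitivity modulo $p$ alone is not sufficient: 7 generates $(\mathbb{Z}/5\mathbb{Z})^*$, but $7^2 \equiv -1 \pmod{25}$, so its order modulo 25 is only 4.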

10.2.2 Ergodicity of A-functions on spheres

Many important functions that satisfy the Lipschitz condition with constant 1 everywhere on $\mathbb{Z}_p$ do not lie in B; however, they lie in a wider class A, see Sections 5.3 and 5.2. We can determine whether an A-function is ergodic on a p-adic sphere as well.

Theorem 10.21. The statement of Theorem 10.12 remains true for $f \in A$.

Sketch of proof. By definition, any A-function $f$ can be represented as $f = \frac{1}{p^n} \bar f$ for a suitable B-function $\bar f$ and a suitable non-negative rational integer $n$, see Section 5.3. By Theorem 5.24, we can rewrite the key equation (10.2) from the proof of Theorem 10.12 in the following form:

$$g(a + p^k h) = g(a) + g'(a) \cdot p^k h + p^{2k-n} \cdot h^2 \cdot \hat g(h), \tag{10.8}$$

where $g \in A$, $p^n g \in B$, $\hat g \in C$, and $k$ is sufficiently large (so that $2k - n$ is positive). Then from (10.3) we obtain (for a sufficiently large $r$) that

$$f(y + p^r s + p^{r+1} S) = f(y) + (p^r s + p^{r+1} S) \cdot f'(y) + p^{2r-n} \cdot (s + pS)^2 \cdot \hat w(s + pS) = y + p^r z + p^r s \cdot f'(y) + p^{r+1} S \cdot f'(y) + p^{2r-n} \cdot v(s) + p^{2r+1-n} \cdot w(S), \tag{10.9}$$

where $v$, $\hat w$ and $w$ are C-functions in the respective variables. Now we assume that $r$ is so large that the inequality $2r - n \ge r + 3$ holds, and finish the proof in a manner similar to that of Theorem 10.12.



Exercise 10.22. Complete the proof of Theorem 10.21.

Exercise 10.23. Let $p$ be an odd prime, $\ell \in \mathbb{N}$. When is the perturbed monomial $x^{\ell} + \frac{1}{p}(x^p - x)^2$ (which is an integer-valued polynomial over $\mathbb{Q}_p$, and not a polynomial over $\mathbb{Z}_p$) ergodic on sufficiently small spheres around 1?

Exercise 10.24. Let $p$ be an odd prime, $\ell \in \mathbb{N}$. When are the following A-functions (which are not B-functions!)

$$f(x) = \ell \cdot x + \ln_p(1 + p^2 x) + \frac{1}{p}(x^p - x)^2; \qquad g(x) = \frac{\ell \cdot x}{1 + p^2 x} + \frac{1}{p}(x^p - x)^2$$

ergodic on all sufficiently small spheres around 0?

Note that our proofs of the main results of the section use the fact that A-functions (whence, B-functions) are locally analytic of order 1, cf. Definition 5.1. Within this context it would be interesting to answer the following question:

Open question 10.25. Is it possible to extend Theorem 10.12 to the class of all 1-Lipschitz functions that are locally analytic of order $n$, $n = 1, 2, \ldots$?

Chapter 11

Plotting p-adic dynamics in Euclidean space

In this chapter, we consider 'plots' of p-adic dynamical systems $\langle \mathbb{Z}_p; f \rangle$ in $\mathbb{R}^n$ to obtain more information on the distribution of points along orbits of the systems. This information is important both for a better understanding of 1-Lipschitz ergodic dynamics on p-adic integers and for applications to automata theory and pseudorandom number generators. We already know (cf. the proof of Proposition 7.12) that all orbits of every 1-Lipschitz ergodic dynamical system are dense in $\mathbb{Z}_p$. However, using maps into Euclidean spaces we will see that these orbits (although all dense) differ drastically depending on $f$. The maps give tools to characterize the differences mathematically as well as to make them more 'visual'.

11.1 Maps of $\mathbb{Z}_p$ into $\mathbb{R}$

A well-known map $m$ (which is sometimes called the Monna map) from $\mathbb{Z}_p$ onto the unit interval $I = [0, 1] \subset \mathbb{R}$ of real numbers is defined as follows: Given $z \in \mathbb{Z}_p$, consider the canonical p-adic expansion $z = \sum_{i=0}^{\infty} \delta_i(z) \cdot p^i$, where $\delta_i(z) \in \{0, 1, \ldots, p-1\}$; then $m(z) = \sum_{i=0}^{\infty} \delta_i(z) \cdot p^{-i-1} \in [0, 1]$. So, given a map $f : \mathbb{Z}_p \to \mathbb{Z}_p$, we can consider the set of all pairs $(m(z), m(f(z)))$, $z \in \mathbb{Z}_p$, which is a subset of the unit square $I^2 = [0, 1] \times [0, 1]$, a sort of 'graph' of the function $f$, see Figures 11.1, 11.2, 11.3, and 11.4. Of course, all these figures were actually obtained as sets of points $(m(z), m(f(z) \bmod p^n))$, $z \in \mathbb{Z}/p^n\mathbb{Z}$, for some $n$ ($p = 2$ and $n = 17$, to be more exact). However, it is clear that these pictures do not depend 'visually' on $n$: the larger $n$ is, the less the position of the point $(m(z \bmod p^n), m(f(z) \bmod p^n))$ in the unit square depends on the $n$-th digit in the base-$p$ representation of the fraction $m(f(z) \bmod p^n)$, since $(m(z \bmod p^n), m(f(z) \bmod p^n)) \to (m(z), m(f(z)))$ as $n \to \infty$.

However, given a 1-Lipschitz transformation $f$ on $\mathbb{Z}_p$, we can study a map of another sort: For every $n \in \mathbb{N}$ consider all points $\left( \frac{x}{p^n}, \frac{f(x) \bmod p^n}{p^n} \right)$, $x \in \{0, 1, \ldots, p^n - 1\}$, as $n \to \infty$. The corresponding 'graphs' are much more informative compared to the graph obtained for the Monna map, since in the latter case the more significant digits in the base-$p$ representation of $f(z)$ play the leading role: For instance, whereas Figures 11.1, 11.2, 11.3, and 11.4 look somewhat alike, the graphs of the second type for the corresponding functions are quite different visually, cf. Figures 11.7, 11.8, and 11.5, respectively: We can observe various geometrical structures there, such as straight lines, parabolas, stripes, etc. Moreover, some of these graphs exhibit strong dependence on $n$, see e.g. Figures 11.9–11.12. In this section, we derive some important information about the transformation $f$ from its graph of the second kind. This information, as we will see, is sometimes crucial whenever one is going to use $f$ as a state transition function of a pseudorandom generator, since the mentioned graph reflects the statistical quality of the produced sequence. Also, this graph says a lot about the behaviour of the corresponding automaton that evaluates $f$.
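Both kinds of plot described above are straightforward to compute for a fixed precision $n$. The following sketch is an illustration only: the function names are ours, exact fractions are used instead of floating-point numbers to avoid rounding, and no plotting library is assumed (the returned lists of points can be passed to any plotting tool).

```python
from fractions import Fraction


def digits(x, p, n):
    """First n base-p digits of a non-negative integer x, least significant first."""
    return [(x // p**i) % p for i in range(n)]


def monna(x, p, n):
    """m(x mod p^n): the sum of delta_i(x) * p^(-i-1) over i < n, as an exact fraction."""
    return sum(Fraction(d, p ** (i + 1)) for i, d in enumerate(digits(x, p, n)))


def points_monna(f, p, n):
    """Points (m(z), m(f(z) mod p^n)) for z in Z/p^n Z (first kind of plot)."""
    m = p ** n
    return [(monna(z, p, n), monna(f(z) % m, p, n)) for z in range(m)]


def points_second_kind(f, p, n):
    """Points (x / p^n, (f(x) mod p^n) / p^n) for x in {0, ..., p^n - 1}."""
    m = p ** n
    return [(Fraction(x, m), Fraction(f(x) % m, m)) for x in range(m)]


affine = lambda x: 3 + 5 * x
print(monna(6, 2, 3))                    # 3/8, since 6 = 110 in base 2
print(points_second_kind(affine, 2, 2))  # four points of the unit square
```

Note how the Monna map reverses the roles of the digits: the least significant digit of $z$ contributes the most to $m(z)$, whereas in the second kind of plot the most significant digit of $x$ dominates.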

11.2 Points falling on hyperplanes

In this section we study, loosely speaking, what straight lines in the graphs mentioned above imply. In more precise terms, we study the linear complexity of the sequence of iterations $x, f(x), f^2(x), \ldots$. Here is a definition:

Definition 11.1. Let $Z = (z_i)_{i=0}^{\infty}$ be a sequence over a commutative ring $R$. The linear complexity $\lambda_R(Z)$ of the sequence $Z$ over $R$ is the smallest $r \in \mathbb{N}_0$ such that there exist $c, c_0, c_1, \ldots, c_{r-1} \in R$ (not all equal to 0) such that for all $i = 0, 1, 2, \ldots$

$$c + \sum_{j=0}^{r-1} c_j \cdot z_{i+j} = 0. \tag{11.1}$$

We say that $\lambda_R(Z) = \infty$ if no such $r$ exists.

We should notice that in this section we use the notion of linear complexity of a sequence over a ring in a somewhat broader sense than it is commonly used, see e.g. [23]. More often the linear complexity of the sequence $(x_n)$ of elements of a commutative ring $R$ is understood as the smallest $r > 0$ such that there exist $c_0, \ldots, c_{r-1} \in R$ that satisfy simultaneously all equations $x_{n+r} = \sum_{j=0}^{r-1} c_j x_{n+j}$ for $n = 0, 1, 2, \ldots$. We, in distinction to the latter, consider non-homogeneous relations (i.e., with a nonzero constant term), as well as relations where all coefficients may be zero divisors (however, not all 0 simultaneously; in the assertion of Theorem 11.5 that follows, the latter, however, is not important). If $R$ is a field, then both notions basically do not differ one from another: If a sequence satisfies



Figure 11.1: The function $f(x) = x + (x^2 \ \mathrm{OR}\ C)$, $C = -131065$

Figure 11.2: Same function, $C = 101_2$

Figure 11.3: Same function, $C = 11101010100001001_2$

Figure 11.4: The function $f(x) = 3 + 5x$

the relation $c + \sum_{i=0}^{r} c_i x_{n+i} = 0$ where $c_r \ne 0$, then it satisfies the relation $x_{n+r+1} = c_r^{-1} c_0 x_n - \sum_{j=0}^{r-1} c_r^{-1} (c_j - c_{j+1}) x_{n+j+1}$. Our definition is somewhat more convenient for geometric interpretations. For instance, if $R = \mathbb{Z}/p^k\mathbb{Z}$, then geometrically equation (11.1) means that all points $\left( \frac{z_i}{p^k}, \frac{z_{i+1}}{p^k}, \ldots, \frac{z_{i+r-1}}{p^k} \right)$, $i = 0, 1, 2, \ldots$, of the unit $r$-dimensional Euclidean hypercube fall into parallel hyperplanes. Given a 1-Lipschitz ergodic transformation $f$ on $\mathbb{Z}_p$, with the use of linear complexity over the residue ring $\mathbb{Z}/p^k\mathbb{Z}$ we can study the distribution of $r$-tuples of the sequence $(f^i(x))_{i=0}^{\infty}$ modulo $p^k$. From Theorem 7.1, we know that independently of which concrete transformation $f$ is taken, this sequence is strictly uniformly distributed as a sequence of elements from $\mathbb{Z}/p^k\mathbb{Z}$: The length of the shortest period is $p^k$, and every element from $\mathbb{Z}/p^k\mathbb{Z}$ occurs at the period



exactly once. However, the distribution of consecutive pairs (triples, etc.) of elements in this sequence varies depending on $f$. For example, although every orbit of an affine ergodic transformation $f(x) = a + bx$ on $\mathbb{Z}_p$ is strictly uniformly distributed modulo $p^n$ for all $n \in \mathbb{N}$, the linear complexity over $\mathbb{Z}/p^k\mathbb{Z}$ of the orbit is only 2, as immediately follows from (11.1). Hence, the points that correspond to pairs of consecutive residues fall into a small number of parallel straight lines in the unit square, and this picture does not depend on $k$, cf. Figure 11.5. Yet another example: The transformation

Figure 11.5: Affine map f (x) = 3 + 5x, p = 2

Figure 11.6: Polynomial map of degree 8

$f(x) = x + (x^2 \ \mathrm{OR}\ C)$ on $\mathbb{Z}_2$ (first considered in [34]) is ergodic if and only if $C \equiv 5 \pmod 8$, or $C \equiv 7 \pmod 8$ ($k$ is fixed). However, the distribution of pairs in the orbit degrades as the number of 1-s in more significant bit positions of $C$ increases, cf. Figure 11.7 and Figure 11.8. Moreover, in some cases (e.g., when $C$ is a negative rational integer) the distribution degenerates as $k$ increases unboundedly, see Figures 11.9–11.12; note that the limit plot (as $k \to \infty$) in this case will be the same as for the linear transformation $f(x) = x - 1$. The central result of the section is that for ergodic polynomials of degree greater than 1 the corresponding linear complexities tend to infinity as $k \to \infty$, cf. Figure 11.6. In other words, the orbits have infinite linear complexities over $\mathbb{Z}_p$ (and over $\mathbb{Q}_p$):

Theorem 11.2. Let $f(x) \in \mathbb{Q}_p[x]$ be an integer-valued 1-Lipschitz ergodic polynomial¹ of degree $\ge 2$. Then the linear complexity $\lambda_{\mathbb{Z}/p^k\mathbb{Z}}(X_k)$ of the sequence $X_k = (f^i(x_0) \bmod p^k)_{i=0}^{\infty}$ over $\mathbb{Z}/p^k\mathbb{Z}$ tends to infinity as $k \to \infty$:

$$\lim_{k \to \infty} \lambda_{\mathbb{Z}/p^k\mathbb{Z}}(X_k) = \infty.$$

¹ These are characterised by Proposition 9.32.
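The remark that the orbit of an affine map has linear complexity 2 amounts to a relation of the form (11.1) with $r = 2$: for $f(x) = a + bx$ one may take $c = -a$, $c_0 = -b$, $c_1 = 1$. The sketch below checks such a relation along an orbit modulo $p^k$. It is an illustration only; the function names are ours, and the code verifies a given relation rather than searching for the smallest $r$.

```python
def orbit(f, x0, m, length):
    """The first `length` terms of x0, f(x0), f(f(x0)), ... reduced modulo m."""
    xs, x = [], x0 % m
    for _ in range(length):
        xs.append(x)
        x = f(x) % m
    return xs


def satisfies_relation(seq, c, coeffs, m):
    """Check c + sum_j coeffs[j] * seq[i + j] == 0 (mod m) for every window i."""
    r = len(coeffs)
    return all(
        (c + sum(cj * seq[i + j] for j, cj in enumerate(coeffs))) % m == 0
        for i in range(len(seq) - r + 1)
    )


m = 2 ** 10
affine = lambda x: 3 + 5 * x
seq = orbit(affine, 0, m, m + 1)

# x_{n+1} - 5 * x_n - 3 = 0 along the whole orbit: a relation with r = 2.
print(satisfies_relation(seq, -3, [-5, 1], m))  # True
print(satisfies_relation(seq, -3, [-5, 2], m))  # False
```

Geometrically, the first relation says that every point $(x_n / p^k, x_{n+1} / p^k)$ lies on one of finitely many parallel lines of slope 5 in the unit square.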


Figure 11.7: The map f (x) = x + x2 OR C, C = 101


Figure 11.8: Same map, C = 10101010100001001
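The ergodicity condition for the map $f(x) = x + (x^2 \ \mathrm{OR}\ C)$ stated above ($C \equiv 5$ or $7 \pmod 8$) can be observed at finite precision. The sketch below is an illustration, not a proof: the function names are ours, and transitivity is tested only modulo $2^n$ for a few small $n$. Python integers behave like 2-adic integers under bitwise OR (negative numbers are treated as infinitely sign-extended), so negative values of $C$, such as the one used in Figure 11.9, need no special handling.

```python
def is_transitive(f, m):
    """True if x -> f(x) mod m is a single cycle of length m on Z/mZ."""
    x, steps = 0, 0
    while True:
        x = f(x) % m
        steps += 1
        if x == 0 or steps > m:
            break
    return x == 0 and steps == m


def square_or_map(C):
    """The map x -> x + (x^2 OR C) for a fixed rational integer C."""
    return lambda x: x + ((x * x) | C)


n = 8
print([C for C in range(16) if is_transitive(square_or_map(C), 2 ** n)])  # [5, 7, 13, 15]
print(is_transitive(square_or_map(-131065), 2 ** 10))  # True, as -131065 = 7 (mod 8)
```

All of these maps are transitive modulo $2^n$, so their orbits are indistinguishable as sets of residues; the differences between them become visible only in the distribution of consecutive pairs, which is what the figures show.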

We split the proof of this theorem into several assertions that are of special interest.

Proposition 11.3. Let $f \in \mathbb{Q}_p[x]$ be an integer-valued 1-Lipschitz ergodic polynomial of degree $d$ over the field $\mathbb{Q}_p$ of p-adic numbers; let $r$ be a positive rational integer such that for each $k = 1, 2, \ldots$ there exist $c, c_0, \ldots, c_r \in \mathbb{Z}_p$ (not all congruent to 0 modulo $p$) that satisfy the following congruences:

$$c + \sum_{i=0}^{r} c_i x_{n+i} \equiv 0 \pmod{p^k}, \quad (n = 0, 1, 2, \ldots) \tag{11.2}$$

where $x_j = f^j(x_0)$, $x_0 \in \mathbb{Z}_p$, $j = 0, 1, 2, \ldots$. Then $d = 1$.

To prove the proposition, we need the following lemma:

Lemma 11.4. Under the assumptions of Proposition 11.3, let $c, c_0, \ldots, c_r \in \mathbb{Z}_p$ not depend on $k$; that is, let there exist $c, c_0, \ldots, c_r \in \mathbb{Z}_p$ that satisfy (11.2) for all $k \in \mathbb{N}$ simultaneously. Then $d = 1$.

Proof of Lemma 11.4. As $f$ is ergodic, $d \ne 0$. Assume that $d > 1$. Consider $w(x) = c + \sum_{i=0}^{r} c_i \cdot f^i(x)$. As $w(x)$ is a composition of integer-valued 1-Lipschitz polynomials over $\mathbb{Q}_p$, $w(x) \in \mathbb{Q}_p[x]$ is an integer-valued 1-Lipschitz polynomial over $\mathbb{Q}_p$. However, $\deg f^i(x) = d^i$; whence, as $d > 1$, we conclude that $w(x)$, being a sum of polynomials of pairwise distinct degrees, must be a polynomial of a nonzero degree. On the other hand, since $x_{n+i} \equiv f^i(f^n(x_0)) \pmod{p^k}$, the assumptions of the lemma imply that $w(x_n) \equiv 0 \pmod{p^k}$ for all $n = 0, 1, 2, \ldots$. In other words, $w(z) \equiv 0 \pmod{p^k}$ for all $z \in \mathbb{Z}_p$ since $x_n$ takes all values in $\{0, 1, \ldots, p^k - 1\}$ in view of the ergodicity of $f$, and $w(x)$ is 1-Lipschitz.



Figure 11.9: The function f (x) = x + ((x2 ) OR (−131065)), k = 16

Figure 11.10: Same function, k = 17



The assumptions of the lemma now imply that $w(z) \equiv 0 \pmod{p^k}$ for all $z \in \mathbb{Z}_p$ and all $k = 1, 2, \ldots$. Consequently, $w(z) = 0$ for all $z \in \mathbb{Z}_p$ and hence the polynomial $w(x)$ must be 0 in the ring $\mathbb{Q}_p[x]$. A contradiction that proves the lemma.

Proof of Proposition 11.3. By the assumption, for each $k \in \mathbb{N}$ the set $L_k$ of all $\mathbf{c} = (c, c_0, \ldots, c_r) \in \mathbb{Z}_p^{r+2}$ such that $|\mathbf{c}|_p = 1$ and $c, c_0, \ldots, c_r$ satisfy (11.2), is not empty. Obviously, $L_1 \supset L_2 \supset \ldots$ since $f$ is 1-Lipschitz. Further, we assert that each set $L_k$ is closed in the topology of the metric space $\mathbb{Z}_p^{r+2}$. Indeed, if $\mathbf{c} \in L_k$, $\mathbf{c}' \in \mathbb{Z}_p^{r+2}$, $|\mathbf{c} - \mathbf{c}'|_p \le p^{-s}$, $s \ge k$, then $\mathbf{c}' = \mathbf{c} + p^s \mathbf{z}$ for a suitable $\mathbf{z} \in \mathbb{Z}_p^{r+2}$. Hence, $|\mathbf{c}'|_p = 1$ and $\mathbf{c}'$ satisfies (11.2); consequently, $\mathbf{c}' \in L_k$.

So we have a decreasing sequence $L_1 \supset L_2 \supset \ldots$ of non-empty closed subsets in $\mathbb{Z}_p^{r+2}$. It is well known that in compact metric spaces every decreasing sequence of non-empty closed subsets has a non-empty intersection (actually this property is equivalent to compactness); thus, as $\mathbb{Z}_p^{r+2}$ is compact, the intersection of the sets $L_1 \supset L_2 \supset \ldots$ is not empty. That is, there exists $\mathbf{c}'' \in \mathbb{Z}_p^{r+2}$ that satisfies the assumptions of Lemma 11.4. Yet then $d = 1$.

Now we are able to prove the following theorem:

Theorem 11.5. Let $f \in \mathbb{Q}_p[x]$ be an integer-valued 1-Lipschitz ergodic polynomial, let $\deg f > 1$, and let there exist $r \in \mathbb{N}$ such that for each $k \in \mathbb{N}$ the linear complexity over the ring $\mathbb{Z}/p^k\mathbb{Z}$ of the recurrence sequence $(x_n)_{n=0}^{\infty}$ defined by the recursion $x_{n+1} \equiv f(x_n) \pmod{p^k}$ does not exceed $r$. In other words, let there exist $c^{(k)}, c_0^{(k)}, \ldots, c_r^{(k)} \in \mathbb{Z}_p$ such that the following congruences hold:

$$c^{(k)} + \sum_{i=0}^{r} c_i^{(k)} x_{n+i} \equiv 0 \pmod{p^k} \quad (n = 0, 1, 2, \ldots). \tag{11.3}$$

Then, with limits taken with respect to the p-adic metric,

$$\lim_{k \to \infty} c^{(k)} = \lim_{k \to \infty} c_0^{(k)} = \ldots = \lim_{k \to \infty} c_r^{(k)} = 0.$$

Proof. To start with, we note that from the proofs of both Lemma 11.4 and Proposition 11.3 it follows that they remain true if we let $k$ under their assumptions range over an arbitrary infinite subset of $\mathbb{N}$ rather than over the whole set $\mathbb{N}$.

Now for each $k \in \mathbb{N}$ take (and fix) $(c^{(k)}, c_0^{(k)}, c_1^{(k)}, \ldots, c_r^{(k)}) \in \mathbb{Z}_p^{r+2}$ that satisfy (11.3). Put $\mathbf{c}_k = (c^{(k)}, c_0^{(k)}, c_1^{(k)}, \ldots, c_r^{(k)}) \in \mathbb{Z}_p^{r+2}$. In view of Proposition 11.3 we have then $|\mathbf{c}_k|_p < 1$ for all sufficiently large $k \in \mathbb{N}$. Denote $\mathcal{N} = \{ k \in \mathbb{N} : 1 > |\mathbf{c}_k|_p > p^{-k} \}$. In other words, $k \notin \mathcal{N}$ if and only if (11.3) is equivalent to the congruence $0 \equiv 0 \pmod{p^k}$. It is obvious that if $\mathcal{N}$ is finite, then the conclusion of the theorem is true. Let $\mathcal{N}$ be infinite. For $k \in \mathcal{N}$ put $\hat{\mathbf{c}}_k = |\mathbf{c}_k|_p \, \mathbf{c}_k$ and denote by $\hat{\mathcal{N}}$ the set of all $m \in \mathbb{N}$ such that $p^k |\mathbf{c}_k|_p = p^m$ for a suitable $k \in \mathcal{N}$. In other words, we replace every set of congruences (11.3) with the equivalent system of congruences

$$\hat c^{(k)} + \sum_{i=0}^{r} \hat c_i^{(k)} x_{n+i} \equiv 0 \pmod{p^m} \quad (n = 0, 1, 2, \ldots),$$

where $(\hat c^{(k)}, \hat c_0^{(k)}, \hat c_1^{(k)}, \ldots, \hat c_r^{(k)}) = \hat{\mathbf{c}}_k$, $p^m = p^k |\mathbf{c}_k|_p$. If the set $\hat{\mathcal{N}}$ is finite, the conclusion of the theorem is obviously true. If $\hat{\mathcal{N}}$ is infinite, then, since $|\hat{\mathbf{c}}_k|_p = 1$, in view of Proposition 11.3 and the note at the beginning of the proof, we conclude that $\deg f = 1$. A contradiction.



Proof of Theorem 11.2. We just note that by Lemma 11.4, every orbit of the dynamical system $\langle \mathbb{Z}_p; f \rangle$ has infinite linear complexity over the ring $\mathbb{Z}_p$ provided $f \in \mathbb{Q}_p[x]$ is an integer-valued 1-Lipschitz ergodic polynomial of degree $d > 1$.

Exercise 11.6. Prove that under the conditions of Theorem 11.2 every orbit of the dynamical system $\langle \mathbb{Z}_p; f \rangle$ has infinite linear complexity over the field $\mathbb{Q}_p$.

Exercise 11.7. Prove that in Theorem 11.2, the condition that $f$ is a polynomial over the field $\mathbb{Q}_p$ is essential. (Hint: Consider the transformation $f(x) = 1 + x + 4 \cdot (-1)^{1+x}$ on $\mathbb{Z}_2$.)

Much more information about the dynamical system $\langle \mathbb{Z}_p; f \rangle$ for the case when $f$ is 1-Lipschitz can be derived if we look at the system from another angle, as an automaton, see Chapter 12.

Part III

Applications


Chapter 12

The p-adic ergodic theory of automata

In this chapter, we consider dynamical systems $\langle \mathbb{Z}_p; f \rangle$, where $f$ is a 1-Lipschitz map, as automata with p-letter input/output alphabets. We show that to every system of this sort there corresponds an automaton; and conversely, every automaton defines a 1-Lipschitz dynamical system. We consider ergodicity of collections of maps (rather than of a single map) and develop an ergodic theory of automata that gives useful information about the behaviour of automata as well as about the distribution of points along orbits of the dynamical systems. We also explain what properties of dynamical systems are exhibited by their Euclidean plots introduced in Chapter 11.

12.1 Automata and automata maps

By definition, a (non-initial) automaton is a 5-tuple A = ⟨I, S, O, S, O⟩, where I is a finite set, the input alphabet; O is a finite set, the output alphabet; S is a non-empty (possibly infinite) set of states; S : I × S → S is the state transition function; and O : I × S → O is the output function. An automaton in which both the input alphabet I and the output alphabet O are non-empty is called a transducer, see e.g. [2]^1; an automaton whose input alphabet is empty while its output alphabet is non-empty is called a generator. The initial automaton A(s_0) = ⟨I, S, O, S, O, s_0⟩ is an automaton A in which one state s_0 ∈ S is fixed; it is called the initial state. We stress that the definition of the initial automaton A(s_0) is nearly the same as that of a Mealy automaton (see e.g. [15]), with one important difference: the set of states S of A(s_0) is not necessarily finite.

^1 In [2], transducers are understood in a wider sense; the transducers we consider in the book correspond to the 1-regular transducers of [2].

Given a non-empty alphabet A, its elements are called symbols, or letters. By definition, a word of length n over the alphabet A is a finite sequence (stretching from right to left) α_{n−1} ⋯ α_1 α_0, where α_{n−1}, ..., α_1, α_0 ∈ A. The empty word is the sequence of length 0, that is, the one that contains no symbols. Hereinafter the length of the word w is denoted by |w|. Given a word w = α_{n−1} ⋯ α_1 α_0, any word v = α_{k−1} ⋯ α_1 α_0, k ≤ n, is called a prefix of the word w, whereas any word u = α_{n−1} ⋯ α_{i+1} α_i, 0 ≤ i ≤ n − 1, is called a suffix of the word w. Given words a = α_{n−1} ⋯ α_1 α_0 and b = β_{k−1} ⋯ β_1 β_0, the concatenation a ◦ b is the following word (of length n + k): a ◦ b = α_{n−1} ⋯ α_1 α_0 β_{k−1} ⋯ β_1 β_0.

Given an input word w = χ_{n−1} ⋯ χ_1 χ_0 over the alphabet I, an initial transducer A(s_0) = ⟨I, S, O, S, O, s_0⟩ transforms w into the output word w′ = ξ_{n−1} ⋯ ξ_1 ξ_0 over the output alphabet O as follows (cf. Figure 12.1). Initially the transducer A(s_0) is in the state s_0; accepting the input symbol χ_0 ∈ I, the transducer outputs the symbol ξ_0 = O(χ_0, s_0) ∈ O and reaches the state s_1 = S(χ_0, s_0) ∈ S; then the transducer accepts the next input symbol χ_1 ∈ I, outputs ξ_1 = O(χ_1, s_1) ∈ O, reaches the state s_2 = S(χ_1, s_1) ∈ S, and the routine repeats.

[Figure 12.1: Initial transducer, schematically. The input ⋯ χ_{i+1} χ_i enters the transducer, which is in the state s_i; the state transition is s_{i+1} = S(χ_i, s_i), and the symbol ξ_i = O(χ_i, s_i) is appended to the output ξ_i ξ_{i−1} ⋯ ξ_0.]

Throughout the book, ‘automaton’ mostly stands for ‘initial automaton’; we make corresponding remarks if not. Further in the book we mostly consider transducers and generators. Furthermore, throughout the book we consider only reachable transducers; that is, we assume that all the states of an initial transducer A(s_0) are reachable from s_0: given s ∈ S, there exists an input word w over the alphabet I such that after the word w has been fed to the automaton A(s_0), the automaton reaches the state s. A reachable transducer is called finite if its set S of states is finite, and infinite otherwise.

To the initial automaton A(s_0) we put into correspondence the family F(A) of all subautomata A(s) = ⟨I, S̃, O, S̃, Õ, s⟩, s ∈ S, where S̃ = S̃(s) ⊂ S is the set of all states that are reachable from the state s, and S̃, Õ are the respective restrictions of the state transition and output functions S, O to I × S̃.
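The word transformation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name run_transducer and the example transducer (multiplication by 3 over the binary alphabet, with the carry as the state) are hypothetical and are not taken from the text.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

State = Hashable


def run_transducer(
    transition: Callable[[int, State], State],
    output: Callable[[int, State], int],
    initial_state: State,
    letters: Sequence[int],
) -> Tuple[List[int], State]:
    """Feed the letters chi_0, chi_1, ... (in this order) to an initial transducer.

    Returns the output letters xi_0, xi_1, ... and the state reached at the end.
    """
    state = initial_state
    produced = []
    for chi in letters:
        produced.append(output(chi, state))  # xi_i = O(chi_i, s_i)
        state = transition(chi, state)       # s_{i+1} = S(chi_i, s_i)
    return produced, state


# Hypothetical example: the binary transducer computing x -> 3x.
# The state is the current carry, which takes values in {0, 1, 2}.
def times3_transition(chi: int, carry: int) -> int:
    return (3 * chi + carry) // 2


def times3_output(chi: int, carry: int) -> int:
    return (3 * chi + carry) % 2


# The word chi_3 chi_2 chi_1 chi_0 = 1011 is the base-2 expansion of 11.
out, final_state = run_transducer(times3_transition, times3_output, 0, [1, 1, 0, 1])
print(out, final_state)
```

The letters are passed in the order in which they are fed, that is, χ_0 first; the output list [ξ_0, ξ_1, ...] therefore has to be read from right to left to obtain the output word in the notation of the text. In this example the output is the base-2 expansion of 3 · 11 = 33 reduced modulo 16.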

12.1.1

Automata word transformations

Hereinafter in the section the word ‘automaton’ stands for an initial transducer whose input and output alphabets consist of p symbols each. We mostly assume that p is a prime (although many of the further results, for instance Theorem 12.1 below, are true without this restriction), and so we identify input/output symbols with elements of the p-element field F_p = {0, 1, ..., p − 1}. Thus, for every n = 1, 2, 3, ... the automaton A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩ maps n-letter words over F_p to n-letter words over F_p according to the procedure described above, cf. Figure 12.1.

We identify n-letter words over F_p with non-negative integers in a natural way: given an n-letter word w = χ_{n−1} χ_{n−2} ⋯ χ_0 (i.e., χ_i ∈ F_p for i = 0, 1, 2, ..., n − 1), we consider w as the base-p expansion of the number χ_0 + χ_1 · p + ⋯ + χ_{n−1} · p^{n−1}. In turn, the latter number can be considered as an element of the residue ring Z/p^n Z modulo p^n. Thus, to every automaton A there corresponds a map f_{n,A} from Z/p^n Z to Z/p^n Z, for every n = 1, 2, 3, .... Note that when necessary we may also identify n-letter words over F_p with elements of F_p^n, the n-th Cartesian power of F_p; further we use these one-to-one correspondences between n-letter words and residues modulo p^n (as well as between the words and elements of F_p^n) without extra comments.

In a similar manner, every automaton A = A(s_0) defines a map f_A from Z_p to Z_p: given an infinite word w = ⋯ χ_{n−1} χ_{n−2} ⋯ χ_0 (that is, an infinite sequence) over F_p, we consider the p-adic integer whose canonical expansion is z = z(w) = χ_0 + χ_1 · p + ⋯ + χ_{n−1} · p^{n−1} + ⋯; so, by definition, for every z ∈ Z_p we put

\[ \delta_i(f_A(z)) = O(\delta_i(z), s_i) \qquad (i = 0, 1, 2, \ldots), \tag{12.1} \]

where s_i = S(δ_{i−1}(z), s_{i−1}), i = 1, 2, .... The map f_A so defined is called the automaton function (or the automaton map) of the automaton A. The point is that the class of all automaton functions coincides with the class of all 1-Lipschitz maps from Z_p to Z_p:

Theorem 12.1. The automaton function f_{A(s_0)} : Z_p → Z_p of the automaton A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩ is 1-Lipschitz. Conversely, for every 1-Lipschitz function f : Z_p → Z_p there exists an automaton A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩ such that f = f_{A(s_0)}.

Proof. As s_i = S(δ_{i−1}(z), s_{i−1}) for every i = 1, 2, ..., the i-th output symbol ξ_i = δ_i(f_A(z)) depends only on the input symbols χ_0, χ_1, ..., χ_i; that is,

\[ \delta_i(f_A(z)) = \psi_i(\delta_0(z), \delta_1(z), \ldots, \delta_i(z)) \]

for all i = 0, 1, 2, ... and for suitable maps ψ_i : F_p^{i+1} → F_p. That is, f = f_{A(s_0)} : Z_p → Z_p is of the form

\[ f \colon x = \sum_{i=0}^{\infty} \chi_i p^i \mapsto f(x) = \sum_{i=0}^{\infty} \psi_i(\chi_0, \ldots, \chi_i)\, p^i. \tag{12.2} \]

By Proposition 4.2 this means that the function f_{A(s_0)} is 1-Lipschitz.

Conversely, let f : Z_p → Z_p be a 1-Lipschitz map; then by Proposition 4.2 f may be represented in the form (12.2) for suitable maps ψ_i : F_p^{i+1} → F_p. We now construct an automaton A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩ such that f_{A(s_0)} = f. Let F_p^* be the set of all non-empty words over the alphabet F_p. We consider these words as base-p expansions of numbers from N = {1, 2, 3, ...} and enumerate all these words by the integers 1, 2, 3, ... in lexicographical order in accordance with the natural order on F_p: 0 < 1 < 2 < ⋯ < p − 1. This way we establish a one-to-one correspondence between the words w ∈ F_p^* and the integers i ∈ N: w ↔ ν(w), i ↔ ω(i) (ν(w) ∈ N, ω(i) ∈ F_p^*). Note that ν(ω(i)) = i and ω(ν(w)) = w for all i ∈ N and all non-empty words w ∈ F_p^*. Assume that ω(0) is the empty word. Now put S = N_0 = {0, 1, 2, 3, ...}, the set of all states of the automaton A under construction, and take the initial state s_0 = 0. The state transition function S is defined as follows:

\[ S(r, i) = \nu(r \circ \omega(i)), \tag{12.3} \]

where i = 0, 1, 2, ... and r ∈ F_p. That is, S(r, i) is the number of the word r ◦ ω(i), which is the concatenation of the word ω(i) (the word that has number i), the prefix, with the single-letter word r, the suffix. Now consider the one-to-one map θ_n(χ_{n−1} ⋯ χ_1 χ_0) = (χ_0, χ_1, ..., χ_{n−1}) from the n-letter words onto F_p^n and define the output function of the automaton A as follows:

\[ O(r, i) = \psi_{|\omega(i)|}\bigl(\theta_{|\omega(i)|+1}(r \circ \omega(i))\bigr), \tag{12.4} \]

where i = 0, 1, 2, ... and r ∈ F_p. As both f and f_{A(s_0)} are 1-Lipschitz, thus continuous with respect to the p-adic metric, and as N_0 is dense in Z_p, to prove that f = f_{A(s_0)} it suffices to show that

\[ f_{A(s_0)}(\tilde{w}) \equiv f(\tilde{w}) \pmod{p^{|w|}} \tag{12.5} \]

for all finite non-empty words w ∈ F_p^*, where w̃ ∈ N_0 stands for the integer whose base-p expansion is w. We prove that (12.5) holds for all w ∈ F_p^* with |w| = n > 0 by induction on n. If n = 1 then w̃ ∈ F_p; so once w is fed to A, the automaton reaches the state S(w, 0) = ν(w) (cf. (12.3)) and outputs O(w, 0) = ψ_0(θ_1(w)) = f(w̃) mod p (cf. (12.4)), see (12.2). Thus, (12.5) holds in this case.

Now assume that (12.5) holds for all w ∈ F_p^* such that |w| = n < k and prove that (12.5) holds also when |w| = n = k. Represent w = r ◦ v, where r ∈ F_p and |v| = n − 1. By the induction hypothesis, after the word v has been fed to A, the automaton reaches the state ν(v) and outputs the word v_1 of length n − 1 such that ṽ_1 ≡ f(ṽ) (mod p^{n−1}). Next, being fed the letter r, the automaton (which is now in the state ν(v)) outputs the letter O(r, ν(v)) = ψ_{|ω(ν(v))|}(θ_{|ω(ν(v))|+1}(r ◦ ω(ν(v)))) = ψ_{|v|}(θ_{|v|+1}(r ◦ v)). This means that once fed the word w, the automaton A(s_0) outputs the word v_2 = (ψ_{|v|}(θ_{|v|+1}(r ◦ v))) ◦ v_1. However, ṽ_2 ≡ f(w̃) (mod p^n), cf. (12.2).

Note 12.2. From the proof of Theorem 12.1 it is clear that the mapping f_{n,A(s_0)} : Z/p^n Z → Z/p^n Z is just the reduction modulo p^n of the automaton function f_{A(s_0)}: f_{n,A(s_0)} = f_{A(s_0)} mod p^n for all n = 1, 2, 3, ....

Further, given a 1-Lipschitz function f : Z_p → Z_p, we denote by A_f an initial transducer ⟨F_p, S, F_p, S, O, s_0⟩ whose automaton function is f; that is, f_{A_f} = f. Note that the automaton A_f is not unique: there are many automata that have the same automaton function. However, this non-uniqueness will not cause misunderstanding since in the book we are mostly interested in automaton functions rather than in the ‘internal structure’ (e.g., state sets, state transition and output functions, etc.) of the automata themselves.

Exercise 12.3. Let p = 2. Given the function f(x) = 1 + x, construct an automaton A_f that has a minimum number of states.

Exercise 12.4. Prove that there exists no finite automaton whose automaton function is f(x) = x^2.
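The construction used in the second half of the proof of Theorem 12.1 can be tried out numerically. The sketch below is a simplification made for illustration only: instead of numbering the words by ν, a state is stored as the pair (length, value) describing the word read so far, and the function names are hypothetical. It assumes that f is given as a Python function on non-negative integers that is 1-Lipschitz (for example, a polynomial with integer coefficients).

```python
def automaton_from_function(f, p):
    """Build (transition, output, initial state) of an automaton for a 1-Lipschitz f.

    A state (length, value) encodes the word read so far: `length` letters
    whose base-p value is `value`. The empty word is the initial state.
    """
    initial = (0, 0)

    def transition(r, state):
        length, value = state
        return (length + 1, value + r * p**length)  # word r followed by the old word

    def output(r, state):
        length, value = state
        x = value + r * p**length
        # psi_length(chi_0, ..., chi_length): digit number `length` of f(x)
        return (f(x) // p**length) % p

    return transition, output, initial


def automaton_function_mod(f, p, n, x):
    """Run the constructed automaton on the n-letter word of x; return the output as a number."""
    transition, output, state = automaton_from_function(f, p)
    result = 0
    for i in range(n):
        chi = (x // p**i) % p
        result += output(chi, state) * p**i
        state = transition(chi, state)
    return result


# Hypothetical test function: a polynomial over Z is 1-Lipschitz.
def poly(x):
    return x * x + 1


print(all(automaton_function_mod(poly, 3, 4, x) == poly(x) % 3**4 for x in range(3**4)))
```

The state set of this automaton is infinite, in agreement with the remark that the set of states of A(s_0) need not be finite; the check above only compares the output of the constructed automaton with f mod p^n on all n-letter words for one small n.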

12.1.2

Reversibility of automata

An initial transducer A = A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩ is called n-reversible (or reversible on words of length n) whenever the reduction f_{n,A} = f_A mod p^n of the automaton function f_A : Z_p → Z_p to the residue ring Z/p^n Z is bijective. That is, the automaton is n-reversible if and only if it performs an invertible (i.e., bijective) mapping of n-letter words to n-letter words. The automaton A is called reversible (or invertible) whenever its automaton function f = f_A is bijective. Speaking loosely, the reversibility of an automaton means that one may work ‘in the opposite direction’, that is, convert output words back to input words.

From Theorem 7.1 it follows that the automaton A is reversible if and only if it is n-reversible for all n = 1, 2, ...; that is, if and only if the automaton function f = f_A : Z_p → Z_p is measure-preserving. Now, given an automaton function, to determine whether the automaton is reversible one may use the various techniques we considered in Part II of the book.

It is worth mentioning here that the techniques can also be applied to determine reversibility of transducers whose input and output alphabets are of prime-power (rather than just prime) order. Indeed, a transducer A whose input/output alphabet consists of p^k letters can be considered as an automaton A′ with k inputs and k outputs over the alphabet F_p (see Figure 12.2), up to a natural correspondence between letters of the p^k-letter alphabet and k-dimensional vectors α↓ over F_p. Theorem 12.1 in this case yields that the automaton function F is triangular, cf. Definition 4.4; that is, F can be considered as a k-variate 1-Lipschitz map F : Z_p^k → Z_p^k. Thus, Theorem 7.1 holds for these automaton maps; so the reversibility of the automaton is equivalent to measure-preservation of its automaton function. Therefore to determine reversibility of the automaton one may apply the various techniques from Part II.

[Figure 12.2: Automaton with k inputs and k outputs. The k-letter input α_i↓ enters A′, which produces the k-letter output Φ_i↓(α_0↓, ..., α_i↓).]

For instance, the following theorem is just a re-statement of claim 3 of Theorem 9.1:

Theorem 12.5. Let the automaton function F = F_A : Z_p^k → Z_p^k be uniformly differentiable modulo p. Then the automaton A is reversible if and only if it is n-reversible for a sufficiently large n (actually, for n ≥ N_1(F) + 1).

Exercise 12.6. Let A be an automaton with binary input/output. Let f_A(x) = x + (x^2 OR C), C ∈ Z. Prove that the automaton is reversible.

Exercise 12.7. Consider an automaton A whose input/output alphabets are {0, 1, 2, 3} and whose automaton function F = F_A : Z_4 → Z_4 is F(x + 2y) = (x XOR (2 · (x AND y))) + 2 · ((y + 3x^3) XOR x) for x, y ∈ Z_2. Prove that the automaton is reversible (Hint: Every 4-adic integer z can be represented as z = x + 2y where x, y are 2-adic integers; that is, as a pair (x, y) of 2-adic integers).

The Thue-Morse automaton is the initial transducer A = ⟨F_2, F_2, F_2, S, O, 0⟩, where S = O, and S(0, 0) = 0, S(1, 0) = 1, S(0, 1) = 1, S(1, 1) = 0; that is, S is just the addition in the two-element field F_2: S(x, y) = x + y for x, y ∈ F_2.

Exercise 12.8. Prove that the Thue-Morse automaton is reversible (Hint: Use either Theorem 8.5 or Theorem 12.5).
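For small n, n-reversibility can be checked by brute force, simply by testing whether the reduced map on n-letter words is a bijection. The following sketch does this for the automaton with S = O and S(x, y) = x + y over F_2 defined above; the function names are hypothetical, and such a finite check is of course only an experiment, not a proof of reversibility.

```python
def thue_morse_mod(n, x):
    """Output word (as an integer) of the automaton with S = O = addition in F_2.

    The n-bit input x is fed starting from its least significant bit;
    the initial state is 0.
    """
    state, result = 0, 0
    for i in range(n):
        chi = (x >> i) & 1
        xi = chi ^ state      # output letter O(chi_i, s_i) = chi_i + s_i in F_2
        state = chi ^ state   # next state S(chi_i, s_i) = chi_i + s_i in F_2
        result |= xi << i
    return result


def is_n_reversible(f_mod, p, n):
    """True if the map x -> f_mod(n, x) is a bijection of {0, ..., p^n - 1}."""
    images = {f_mod(n, x) for x in range(p**n)}
    return len(images) == p**n


print([is_n_reversible(thue_morse_mod, 2, n) for n in range(1, 9)])
```

By Theorem 12.5, for a function that is uniformly differentiable modulo p it suffices to run such a check for a single sufficiently large n; the sketch does not compute that bound.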

12.2

Transitivity of automata

In Subsection 12.1.2 we have already learnt how the p-adic ergodic theory can be applied to determine reversibility of automata, since reversibility is equivalent to the measure-preservation of the corresponding automaton function. The same approach can be applied to determine whether an automaton function is ergodic; however, the transitivity of automata is a wider notion which does not reduce to the ergodicity of the automaton function alone; moreover, it is related to how the points of orbits are distributed. In this section, we study various aspects of transitivity.

Conventions: As previously, in the section by automaton we mean an initial transducer A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩. However, we will also consider non-initial automata A = ⟨F_p, S, F_p, S, O⟩. To avoid possible confusion, the latter will be referred to as discrete systems, or, for brevity, just as systems. The justification of the latter term is as follows: according to the most general definition of a system in the general mathematical systems theory (see e.g. [31]), a discrete system is usually understood as a stationary dynamical system with discrete time, that is, a 5-tuple A = ⟨I, S, O, S, O⟩ where I is a non-empty finite set, the input alphabet; O is a non-empty finite set, the output alphabet; S is a non-empty (possibly infinite) set of states; S : I × S → S is a state transition function; O : I × S → O is an output function. That is, the stationary dynamical systems with discrete time are just non-initial automata in our terminology. Obviously, the system A corresponds to the family F(A) of all automata A(s) = ⟨I, S, O, S, O, s⟩, s ∈ S. To the latter family, we relate the family of automaton functions f_{A(s)}, s ∈ S. In the section, we additionally assume that I = O = F_p, p a prime (though some further results are true without this limitation), and that there exists a state s_0 ∈ S such that all the states of the system A are reachable from s_0.

To introduce the main notion of the section, we recall the notion of transitivity of a family of maps:

Definition 12.9 (Transitivity). A family F of mappings of a finite non-empty set M into M is called transitive whenever given a pair (a, b) ∈ M × M, there exists f ∈ F such that f(a) = b.

Note that whenever F consists of only one mapping f, the latter is transitive if it is bijective and the family {e = f^0, f = f^1, f^2, f^3, ...} is transitive in the sense of the above definition (here as usual f^i stands for the i-th iterate of f). In other words, the mapping f : M → M is transitive if and only if it cyclically permutes the elements of M. Now we are able to state the main notion of the section:

Definition 12.10 (Automata transitivity). The automaton A(s_0) (equivalently, the system A) is said to be
• n-word transitive, if the mapping f_{n,A(s_0)} is transitive on the set W_n of all words of length n;
• word transitive, if A(s_0) is n-word transitive for all n ∈ N;
• completely transitive, if for every n ∈ N, the family f_{n,A(s)}, s ∈ S, is transitive on W_n;
• absolutely transitive, if for every s ∈ S the automaton A(s) is completely transitive; that is, if for every n ∈ N the family f_{n,A(t)}, t ∈ S_{A(s)}, is transitive on W_n, where S_{A(s)} is the set of all reachable states of the automaton A(s).

The transitivity properties may be defined in an equivalent way:

Definition 12.11 (Automata transitivity, equivalent).
1. Word transitivity means that given two finite words w, w′ of equal lengths, |w| = |w′| = n, the word w can be transformed into w′ by a sequential composition of a sufficient number of copies of A(s_0).
[Diagram: the word w is passed through a chain of copies of A; the output of the last copy is w′.]
2. Complete transitivity means that given finite words w, w′, |w| = |w′|, there exists a finite word y (possibly of a length other than that of w and w′) such that the automaton A(s_0) transforms the input word w ◦ y (with the prefix y) into the output word w′ ◦ y′ that has the suffix w′.
[Diagram: the input word w ◦ y, where y = ∗ ⋯ ∗ is a suitable word, is fed to A; the output word has the suffix w′.]
3. Absolute transitivity means that given finite words x, w, w′ such that |w| = |w′| (possibly |x| ≠ |w|), there exists a finite word y such that the automaton A(s_0) transforms the input word w ◦ y ◦ x into the output word w′ ◦ y′ ◦ x′.
[Diagram: the input word w ◦ y ◦ x, where x is given and y = ∗ ⋯ ∗ is a suitable word, is fed to A; the output word has the suffix w′.]

Exercise 12.12. Prove that the two above definitions of transitivity are indeed equivalent.

By Theorem 7.1, an automaton A = A(s_0) is word transitive if and only if its automaton function f_A is ergodic; so to determine whether the automaton is word transitive one may apply the various techniques developed in Part II. For instance, the following theorem is just a re-statement of Theorem 9.16:

Theorem 12.13. Let the automaton function f = f_A : Z_p → Z_p be uniformly differentiable modulo p^2. Then the automaton A is word transitive if and only if it is n-word transitive for a sufficiently large n.

Exercise 12.14. Let p = 2. Prove that an automaton whose automaton function is f(x) = x + (x^2 OR 5) is word transitive.

The notions of complete and absolute transitivity are more complex: they are related to the ergodicity of a family of maps rather than to the ergodicity of a single map.

Definition 12.15 (Ergodicity of a family of maps). A family F = {f_i : i ∈ I} of measurable maps f_i : 𝕊 → 𝕊 (which are not necessarily ‘onto’ maps) of a measure space 𝕊 endowed with a probability measure μ is called ergodic if the maps f_i, i ∈ I, have no common μ-measurable invariant subset other than sets of measure 0 or 1; that is, if there exists a μ-measurable subset S ⊂ 𝕊 such that f_i^{−1}(S) = S for all i ∈ I, then necessarily either μ(S) = 0, or μ(S) = 1.

Note that if a family consists of only one map, Definition 12.15 yields the general definition of ergodicity of a single map, with no assumption about the measure-preservation of the latter map (cf. a notice in Section 6.1), although everywhere in the book, speaking of ergodicity of a single map, we additionally assume that the map is measure-preserving! Theorem 7.1 can be re-stated for families of maps in the following form:

Theorem 12.16. If for all k = 1, 2, ... a family F = {f_i : i ∈ I} of 1-Lipschitz maps f_i : Z_p^n → Z_p^n is transitive modulo p^k (that is, if the family F mod p^k = {f_i mod p^k : i ∈ I} of reduced maps is transitive on (Z/p^k Z)^n for all k = 1, 2, ...) then the family F is ergodic with respect to the p-adic measure μ_p.

Exercise 12.17. Prove Theorem 12.16 by modifying the ‘if’ part of the proof of Proposition 7.12 for the case of a family of maps.

To study transitivity of automata (that is, ergodicity of the corresponding families of 1-Lipschitz maps) we develop extra techniques in the next sections.
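Since a single map is transitive on a finite set exactly when it permutes the set cyclically, n-word transitivity can be tested experimentally by following the orbit of 0 under the reduced map. The sketch below is a minimal illustration with hypothetical function names; it uses the function of Exercise 12.14 as a test case, and a finite check of this kind is not a substitute for the proof the exercise asks for.

```python
def is_single_cycle(f, modulus):
    """True if x -> f(x) mod `modulus` is one cycle through all residues."""
    x, steps = 0, 0
    while True:
        x = f(x) % modulus
        steps += 1
        if x == 0 or steps > modulus:
            break
    # The orbit of 0 must return to 0 for the first time after exactly
    # `modulus` steps; then it has visited every residue once.
    return x == 0 and steps == modulus


def is_n_word_transitive(f, p, n):
    """n-word transitivity of an automaton whose automaton function is f."""
    return is_single_cycle(f, p**n)


# Test case: f(x) = x + (x^2 OR 5) with p = 2, cf. Exercise 12.14.
def f(x):
    return x + ((x * x) | 5)


print([is_n_word_transitive(f, 2, n) for n in range(1, 11)])
```

Reducing x modulo 2^n before applying f is harmless here, because addition, multiplication and bitwise OR are all compatible with reduction modulo 2^n.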

12.3

Automata 0-1 law

As previously, in the section ‘automaton’ stands for an initial transducer A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩ such that all states from S are reachable from the initial state s_0. Given an automaton A(s_0), consider the corresponding automaton function f = f_{A(s_0)} : Z_p → Z_p. For k = 1, 2, ... let E_k(f) be the set of all the following points e_k^f(x) of the Euclidean unit square I^2 = [0, 1] × [0, 1] ⊂ R^2:

\[ e_k^f(x) = \left( \frac{x \bmod p^k}{p^k}, \frac{f(x) \bmod p^k}{p^k} \right), \]

where x ∈ Z_p. Note that x mod p^k corresponds to the prefix of length k of the infinite word x ∈ Z_p, i.e., to the input word of length k of the automaton A(s_0), while f(x) mod p^k corresponds to the respective output word of length k. That is, given an input word w = χ_{k−1} ⋯ χ_1 χ_0 and the corresponding output word w′ = ξ_{k−1} ⋯ ξ_1 ξ_0, we consider in I^2 the set of all points (χ_{k−1} p^{−1} + ⋯ + χ_1 p^{−k+1} + χ_0 p^{−k}, ξ_{k−1} p^{−1} + ⋯ + ξ_1 p^{−k+1} + ξ_0 p^{−k}), for all pairs (w, w′) of input/output words of length k.

[Diagram: the automaton A transforms the input word x mod p^k = χ_{k−1} ⋯ χ_1 χ_0 into the output word ξ_{k−1} ⋯ ξ_1 ξ_0 = f(x) mod p^k; this pair of words corresponds to the point (0.χ_{k−1} … χ_1 χ_0, 0.ξ_{k−1} … ξ_1 ξ_0).]

We have already met these point sets E_k(f), the ‘Euclidean plots’ of p-adic 1-Lipschitz maps f, in Section 11.1. In the current section, we will show that the behaviour of these plots as k → ∞ can be of two types only, and that the type of behaviour depends on whether the family of automaton maps F(A) of the system A (which corresponds to the initial transducer A(s_0)) is ergodic or not. Experiments suggest the following two types of behaviour:
1. As k → ∞, the point set E_k(f) is getting more and more dense (cf. Fig. 12.3–12.6, p = 2), or
2. E_k(f) is getting less and less dense as k → ∞, cf. Fig. 11.9–11.12 (p = 2).

Figure 12.3: f(x) = 2x^2 + 3x + 1, k = 16
Figure 12.4: Same function f, k = 18
Figure 12.5: Same function f, k = 20
Figure 12.6: Same function f, k = 23
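The point sets E_k(f) are easy to generate for small k, which allows one to repeat experiments of the kind shown in the figures. The sketch below is an illustration only: the function names are hypothetical, and the count of grid cells hit by the points is an ad hoc way of quantifying how dense a plot looks, not a quantity used in the text.

```python
from fractions import Fraction


def euclidean_plot(f, p, k):
    """Points ((x mod p^k)/p^k, (f(x) mod p^k)/p^k) for x = 0, ..., p^k - 1."""
    q = p**k
    return [(Fraction(x, q), Fraction(f(x) % q, q)) for x in range(q)]


def covered_cells(points, p, m):
    """Number of cells of the p^m x p^m grid on the unit square hit by the points."""
    g = p**m
    return len({(int(x * g), int(y * g)) for x, y in points})


# The function from Figure 12.3, with p = 2 and a small k.
def f(x):
    return 2 * x * x + 3 * x + 1


points = euclidean_plot(f, 2, 10)
print(len(points), covered_cells(points, 2, 3), "of", (2**3) ** 2, "cells hit")
```

Exact rational arithmetic is used so that points on cell boundaries are assigned to cells without rounding errors; for plotting, the fractions can be converted to floating-point numbers.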

Now we explain this experimental phenomenon. Denote by E(f) the closure, in the topology of the real plane R^2, of the union of the sets E_k(f) over all k = 1, 2, .... As E(f) is closed, it is measurable with respect to the Lebesgue measure on the real plane R^2. Let α(f) be the Lebesgue measure of E(f). It is clear that 0 ≤ α(f) ≤ 1; however, it turns out that in fact only the two extreme cases occur: α(f) = 0 or α(f) = 1.

Theorem 12.18 (The automata 0-1 law). For a 1-Lipschitz map f : Z_p → Z_p, the following alternative holds: either α(f) = 0 (equivalently, E(f) is nowhere dense in I^2), or α(f) = 1 (equivalently, E(f) = I^2). Moreover, α(f) = 1 if and only if the automaton A_f (i.e., an automaton whose automaton function is f) is completely transitive.

Recall that nowhere dense sets can nevertheless have positive Lebesgue measure, cf. fat Cantor sets (e.g. the Smith-Volterra-Cantor set), also known as ε-Cantor sets, see e.g. [1].

Proof of Theorem 12.18. Let α(f) > 0; we are going to prove that then α(f) = 1 and E(f) = I^2. Exactly one of the following two cases is possible: 1) some point from E(f) has an open neighbourhood (in the unit square I^2) that lies completely in E(f), or, on the contrary, 2) no such point in E(f) exists (thus, E(f) is nowhere dense in I^2 then). We consider the two cases separately and prove that within the first one necessarily α(f) = 1 while the second one is impossible (that is, if E(f) is nowhere dense in I^2 then necessarily α(f) = 0).

Case 1: In this case, there exist u, v, u′, v′, 0 ≤ u < v ≤ 1, 0 ≤ u′ < v′ ≤ 1, such that the square [u, v] × [u′, v′] ⊂ I^2 lies completely in E(f), and every point from the open real interval (u′; v′) is a limit (with respect to the standard Archimedean metric in R) of some sequence of fractions (f(a_m) mod p^m)/p^m such that u′ < (f(a_m) mod p^m)/p^m < v′, where u < a_m/p^m < v, m = 1, 2, .... Thus, we can take n ∈ N and w = ω_0 + ω_1 · p + ⋯ + ω_{n−1} · p^{n−1}, where ω_i ∈ {0, 1, ..., p − 1}, i = 0, 1, ..., n − 1, so that the square

\[ S = \left[ \frac{w}{p^n}, \frac{w}{p^n} + \frac{1}{p^n} \right] \times \left[ \frac{f(w) \bmod p^n}{p^n}, \frac{f(w) \bmod p^n}{p^n} + \frac{1}{p^n} \right] \]

lies completely in E(f), and every inner point^2 (x, y) of the square S is a limit as j → ∞ (with respect to the standard Archimedean metric in R^2) of a sequence of inner points

\[ (r_j, t_j) = \left( \frac{z_j + p^{N_j} \cdot w}{p^{N_j + n}}, \frac{f(z_j + p^{N_j} \cdot w) \bmod p^{N_j + n}}{p^{N_j + n}} \right) \in S, \]

where N_j ∈ N, z_j ∈ {0, 1, ..., p^{N_j} − 1}.

^2 that is, (x, y) has an open neighbourhood that lies completely in S

However, as f is a 1-Lipschitz map from Z_p to Z_p, for every z ∈ {0, 1, ..., p^N − 1} we have that f(z + p^N · w) ≡ (f(z) mod p^N) + p^N · ξ_N(z) (mod p^{N+n}) for a suitable ξ_N(z) ∈ {0, 1, ..., p^n − 1}; thus,

\[ \frac{f(z + p^N \cdot w) \bmod p^{N+n}}{p^{N+n}} = \frac{f(z) \bmod p^N}{p^{N+n}} + \frac{\xi_N(z)}{p^n}. \]

Hence, ξ_{N_j}(z_j) = f(w) mod p^n for all j = 1, 2, ... as all (r_j, t_j) are inner points of S. Therefore, every inner point (x, y) ∈ S, which can then be represented as

\[ (x, y) = \left( \frac{w}{p^n} + \frac{\chi}{p^n}, \frac{f(w) \bmod p^n}{p^n} + \frac{\gamma}{p^n} \right), \]

where χ and γ are real numbers, 0 < χ < 1, 0 < γ < 1, is a limit (as j → ∞) of the point sequence

\[ (r_j, t_j) = \left( \frac{w}{p^n} + \frac{z_j}{p^{N_j}} \cdot \frac{1}{p^n}, \frac{f(w) \bmod p^n}{p^n} + \frac{f(z_j) \bmod p^{N_j}}{p^{N_j}} \cdot \frac{1}{p^n} \right) \in S. \]

From here it follows that every inner point (χ, γ) ∈ I^2 is a limit point of the corresponding sequence of points (z_j / p^{N_j}, (f(z_j) mod p^{N_j}) / p^{N_j}) as j → ∞. This means that E(f) = I^2 and thus α(f) = 1.

Case 2: No point from E(f) has an open neighbourhood that lies completely in E(f); i.e., any open neighbourhood U of any point from E(f) contains points from the subset I^2 \ E(f), which is open in I^2. Hence, U contains an open subset that lies completely in I^2 \ E(f) (we assume that I^2 \ E(f) ≠ ∅ since otherwise α(f) = 1 and there is nothing to prove). Then there exists an open square

\[ T_m(a, b) = \left( \frac{a}{p^m}, \frac{a}{p^m} + \frac{1}{p^m} \right) \times \left( \frac{b}{p^m}, \frac{b}{p^m} + \frac{1}{p^m} \right), \]

where a, b ∈ {0, 1, ..., p^m − 1}, that lies completely in I^2 \ E(f). That is, T_m(a, b) contains no points of the form

\[ \left( \frac{x \bmod p^k}{p^k}, \frac{f(x) \bmod p^k}{p^k} \right), \]

where x ∈ Z_p and k ∈ N. In other words this means that there exist words ã, b̃ of length m over the alphabet F_p (which are just the base-p representations of a and b, respectively) such that, whenever the automaton A = A_f is fed any input word w̃ with the suffix ã, i.e., w = p^ℓ a + u where u ∈ {0, 1, ..., p^ℓ − 1}, the corresponding output word f(w) mod p^{ℓ+m} = p^ℓ t + v, v ∈ {0, 1, ..., p^ℓ − 1}, never has the suffix b̃, i.e., t ≠ b for all ℓ ∈ N_0 and all u ∈ {0, 1, ..., p^ℓ − 1} (u is the empty word if ℓ = 0).

It is clear now that given any numbers a′, b′ ∈ {0, 1, ..., p^{m′} − 1}, m′ ≥ m, such that a′ ≡ a (mod p^m), b′ ≡ b (mod p^m), the corresponding open square T_{m′}(a′, b′) lies completely outside of E(f), i.e., contains no points of the form ((x mod p^k)/p^k, (f(x) mod p^k)/p^k), where x ∈ Z_p and k ∈ N. Indeed, otherwise some input word w′ with the suffix a′ results in an output word with the suffix b′; however, this means that the corresponding initial subword (whose suffix is a) of the word w′ results in an output word whose suffix is b. The latter contradicts our choice of a, b.

Now take m′ = im for i = 1, 2, ... and construct inductively a disjoint family T_i of (p^{2m} − 1)^{i−1} open squares T_{m′}(a′, b′). The family T_1 consists of the only member T_m(a, b).

Given the family T_{i−1}, the family T_i consists of all open squares T_{im}(a′, b′), where a′, b′ ∈ {0, 1, ..., p^{im} − 1}, a′ ≡ a (mod p^m), b′ ≡ b (mod p^m), that are disjoint from all members of the families T_1, ..., T_{i−1}. That is, at the first step we obtain a family T_1 that consists of the only p^{−m} × p^{−m} square T_m(a, b); at the second step we obtain a family T_2 that consists of p^{2m} − 1 disjoint p^{−2m} × p^{−2m} squares; at the third step we obtain a family T_3 that consists of (p^{2m} − 1)p^{2m} − (p^{2m} − 1) = (p^{2m} − 1)^2 disjoint p^{−3m} × p^{−3m} squares, etc. The union T of all these open squares from T_1, T_2, ... is open, whence measurable, and the Lebesgue measure of T is

\[ \frac{1}{p^{2m}} + (p^{2m} - 1) \cdot \frac{1}{p^{4m}} + (p^{2m} - 1)^2 \cdot \frac{1}{p^{6m}} + \cdots = 1. \]
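The value of this sum can be checked directly, since the series is geometric with ratio 1 − p^{−2m}; the i-th term is the number (p^{2m} − 1)^{i−1} of squares in the family T_i times the area p^{−2mi} of each of them:

```latex
\[
\sum_{i=1}^{\infty} \frac{(p^{2m}-1)^{i-1}}{p^{2mi}}
  = \frac{1}{p^{2m}} \sum_{j=0}^{\infty} \left( 1 - \frac{1}{p^{2m}} \right)^{j}
  = \frac{1}{p^{2m}} \cdot \frac{1}{1 - \left( 1 - p^{-2m} \right)}
  = 1 .
\]
```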

However, by the construction, T contains no points of the form

\[ \left( \frac{x \bmod p^k}{p^k}, \frac{f(x) \bmod p^k}{p^k} \right), \]

where x ∈ Z_p and k ∈ N. Consequently, T ∩ E(f) = ∅; in turn, this implies that the Lebesgue measure of E(f) must be 0, i.e., that α(f) = 0. The latter contradicts the assumption from the beginning of the proof. This proves the first assertion of the theorem. The final assertion of the theorem now easily follows from the first one, cf. the equivalent definition of complete transitivity in terms of words.

Exercise 12.19. Prove the final assertion of Theorem 12.18.

In view of Theorem 12.18, from now on we say for short that a 1-Lipschitz function f : Z_p → Z_p (respectively, a transducer A = A_f) is of measure 1 if and only if α(f) = 1, and of measure 0 otherwise.

Exercise 12.20. Find necessary and sufficient conditions for a constant function to be of measure 1. (Hint: Consider finite subwords of the infinite word that corresponds to the constant.)

12.4

Conditions for complete transitivity

In this section, we study when an automaton is completely transitive. According to Theorem 12.18 to determine if an automaton is completely transitive is to determine whether it is of measure 1 or not; so below we prove some sufficient conditions for an automaton to be of measure 0 or 1. We keep the same conventions about the meaning of the term ‘automaton’ as before. The first result is of somewhat negative sort: there are no completely transitive automata among finite ones.


12.4.1

Finite automata are all of measure 0

For convenience in proofs, we say that a 1-Lipschitz function f : Z_p → Z_p has lacunas whenever there exists an open (in the standard topology of R^2) subset U of the unit square I^2 that contains no points of the form ((x mod p^n)/p^n, (f(x) mod p^n)/p^n), x ∈ Z_p, n = 1, 2, 3, .... We call this open subset U an f-lacuna. We omit ‘f-’ when this does not lead to misunderstanding. According to Theorem 12.18, f has lacunas if and only if f is of measure 0. Thus, to prove that a 1-Lipschitz function is of measure 0 it suffices to demonstrate that it has a lacuna.

Theorem 12.21. Whenever a 1-Lipschitz function f : Z_p → Z_p is the automaton function of a finite automaton, f is of measure 0.

Proof. As said, to prove the theorem it suffices to construct an f-lacuna. As f is 1-Lipschitz, it is clear that given k ∈ N, for all x ∈ Z_p we can represent f(x) as

\[ f(x) = \bigl( f(x \bmod p^k) \bigr) \bmod p^k + p^k \cdot g_z(y), \tag{12.6} \]

where y = p^{−k}(x − (x mod p^k)) ∈ Z_p, z = x mod p^k, and g_z : Z_p → Z_p is a 1-Lipschitz function. Now, as f is the automaton function of a finite automaton A = ⟨F_p, S, F_p, S, O, s_0⟩, the number of pairwise distinct functions g_z is finite, as actually g_z is the automaton function of the automaton A_z = ⟨F_p, S, F_p, S, O, s(z)⟩, where s(z) ∈ S is the state the automaton A reaches after being fed the input word z = x mod p^k. Therefore there exists N ∈ N such that for all k > N the function g_z, z = z_k(x), in the equality (12.6) ranges over a finite set of functions, and the number of these functions does not depend on k. Clearly, this number does not exceed p^N, where N = ⌈#S⌉; we recall that all states of the automaton A are assumed to be reachable from the initial state s_0. Now we take n > N, fix arbitrarily α_0, ..., α_{n−1} ∈ {0, 1, ..., p − 1}, and denote a = α_0 + α_1 · p + ⋯ + α_{n−1} · p^{n−1}. There exist not more than p^N pairwise distinct numbers g_z(a) mod p^n ∈ {0, 1, ..., p^n − 1} as there exist not more than p^N pairwise distinct functions g_z. As n > N, there exists b ∈ {0, 1, ..., p^n − 1} different from all these g_z(a) mod p^n:

\[ b = \beta_0 + \beta_1 \cdot p + \cdots + \beta_{n-1} \cdot p^{n-1}, \]

for suitable β_i ∈ F_p, i = 0, 1, 2, ..., n − 1. That is, as A is a finite automaton, given a sufficiently long word α_{n−1} ⋯ α_0 over the alphabet F_p, there exists a word β_{n−1} ⋯ β_0 such that no output word (of length n + K, K ≥ N) of the automaton A has the suffix β_{n−1} ⋯ β_0 whenever the input word (of length n + K) of the automaton has the suffix α_{n−1} ⋯ α_0. Therefore given a number a = α_0 + α_1 · p + ⋯ + α_{n−1} · p^{n−1} we have that if for some x ∈ Z_p and L ≥ N + n

\[ \frac{x \bmod p^L}{p^L} \in I(a) = \left[ \frac{p^N \cdot a}{p^{N+n}}, \frac{p^N - 1 + p^N \cdot a}{p^{N+n}} \right] = \left[ \frac{a}{p^n}, \frac{a}{p^n} + \frac{p^N - 1}{p^{N+n}} \right], \]

then

\[ \frac{f(x) \bmod p^L}{p^L} \notin I(b) = \left[ \frac{b}{p^n}, \frac{b}{p^n} + \frac{p^N - 1}{p^{N+n}} \right], \]

where I(a), I(b) are shortcuts for the corresponding closed real segments. As only a finite number of rational numbers of the form (x mod p^L)/p^L (where x ∈ Z_p) from the real closed segment I(a) are such that (f(x) mod p^L)/p^L ∈ I(b) (possibly only those with L < N + n), the open real interval I′(a) = (a/p^n; a/p^n + 1/p^{k+n}) ⊂ I(a) contains no points of this kind. So I′(a) × I′(b) ⊂ I^2, where I′(b) stands for the open real interval (b/p^n; b/p^n + 1/p^{k+n}), is an f-lacuna.

Exercise 12.22. Let p = 2. Prove that the function f(x) = x + (x^2 OR c), where c is a negative rational integer, is of measure 0. Prove that there exists a finite automaton whose automaton function is f.

Exercise 12.23. Let p = 2. Prove that the function f(x) = x + (x^2 OR (−1/3)) is of measure 0 and that there exists no finite automaton whose automaton function is f.
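The proof of Theorem 12.21 rests on the fact that for a finite automaton the decomposition (12.6) involves only finitely many distinct functions g_z. The following sketch illustrates this numerically. It is an experiment under stated assumptions, not part of the proof: the function names are hypothetical, f is assumed to be a 1-Lipschitz function given on non-negative integers, and two functions g_z are told apart only by their values modulo p^n on the arguments 0, ..., p^n − 1.

```python
def residual_function(f, p, k, z):
    """Return g_z from f(z + p^k * y) = (f(z) mod p^k) + p^k * g_z(y)."""
    q = p**k
    base = f(z) % q

    def g(y):
        # The division is exact because f is 1-Lipschitz:
        # f(z + q*y) is congruent to f(z) modulo q.
        return (f(z + q * y) - base) // q

    return g


def count_distinct_residuals(f, p, k, n):
    """Count the functions g_z, z = 0, ..., p^k - 1, distinguished modulo p^n."""
    q, r = p**k, p**n
    signatures = set()
    for z in range(q):
        g = residual_function(f, p, k, z)
        signatures.add(tuple(g(y) % r for y in range(r)))
    return len(signatures)


# x -> 3x + 1 is computed by a finite automaton (the state is a carry);
# x -> x^2 is included for comparison.
for k in range(2, 6):
    print(k,
          count_distinct_residuals(lambda x: 3 * x + 1, 2, k, k),
          count_distinct_residuals(lambda x: x * x, 2, k, k))
```

For f(x) = 3x + 1 one has g_z(y) = 3y + c with a carry c ∈ {0, 1, 2}, so the count stays bounded as k grows, whereas for f(x) = x^2 the count grows with k.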

12.4.2

Complete and absolute transitivity

In this subsection we establish sufficient conditions for a 1-Lipschitz function to be of measure 1 (thus, sufficient conditions for complete transitivity of the corresponding automata) and then we prove that automata whose automaton functions are polynomials of degree greater than 1 over Z are all absolutely transitive (and thus all of measure 1). We keep the same conventions on terminology as before.

Theorem 12.24. Let f : Z_p → Z_p be a 1-Lipschitz function, and let f be differentiable everywhere in a ball B ⊂ Z_p of a non-zero radius. The function f is of measure 1 whenever the following two conditions hold simultaneously:
1. f(B ∩ N_0) ⊂ N_0;
2. f is two times differentiable at some point v ∈ B ∩ N_0, and f″(v) ≠ 0.

Proof. We will show that for every sufficiently large k and every z, u ∈ {0, 1, ..., p^k − 1} there exist M = M(k) and a ∈ {0, 1, ..., p^M − 1} such that

\[ \left| \frac{a}{p^M} - \frac{u}{p^k} \right| < \frac{1}{p^k} \quad \text{and} \quad \left| \frac{f(a) \bmod p^M}{p^M} - \frac{z}{p^k} \right| < \frac{1}{p^k}. \tag{12.7} \]

This will prove the theorem as every point from the unit square I^2 can be approximated by points of the form (u/p^k, z/p^k).

Briefly, the idea of the proof is as follows. As $v \in \mathbb{N}_0$, there exists $k \in \mathbb{N}_0$ such that all terms $\nu_i \in \{0, 1, \ldots, p-1\}$ in the $p$-adic expansion $v = \sum_{i=0}^{\infty} \nu_i \cdot p^i$ are zero for all $i \ge k$. We then replace zeros in this expansion at positions starting with the $\ell$-th, $\ell > k$, by certain other digits from $\{0, 1, \ldots, p-1\}$ so that the obtained natural number $a = v + p^{\ell} t$ will satisfy inequalities (12.7) for some $M$.

As $f$ is differentiable everywhere in $B$, for $x \in B$ we have that, given arbitrary $K \in \mathbb{N}$, the following congruence holds for all $h \in \mathbb{Z}_p$ and all sufficiently large $L \in \mathbb{N}$:
\[
f(x + p^L h) \equiv f(x) + p^L h \cdot f'(x) \pmod{p^{K+L}}. \tag{12.8}
\]

Let $|f''(v)|_p = p^{-s}$; that is, $f''(v) = p^s \cdot \xi$, where $s \in \mathbb{Z}$ and $\xi$ is a unit of $\mathbb{Z}_p$ (in other words, $\xi$ has a multiplicative inverse in $\mathbb{Z}_p$). Note that $s$ is not necessarily non-negative since $f''(v)$ is in $\mathbb{Q}_p$, and not necessarily in $\mathbb{Z}_p$; however, further in the proof we assume that $k + s > 0$, as we may take $k$ large enough.

Now let $r \in \mathbb{N}$ be an arbitrary number such that $r > s$, $p^r > v$, and $p^{-r}$ is less than the radius of the ball $B$ (it is clear that there are infinitely many choices of $r$). Given $r$, consider $n \in \mathbb{N}$ such that $n > \max\{\log_p f(v + p^{k+r} t) : t = 0, 1, 2, \ldots, p^k - 1\}$ and $n > 2k + 2r + 2s$ (we recall that, in view of condition 1 of the theorem, all $f(v + p^{k+r} t)$ are in $\mathbb{N}_0$ due to our choice of $r$). Put
\[
\tilde{u} = 1 + p^{k+r+s} u, \tag{12.9}
\]
\[
\tilde{z} = f'(v) + p^{k+r+s} \hat{z}, \tag{12.10}
\]
where $\hat{z} \in \{0, 1, \ldots, p^k - 1\}$ is such that $\left\lfloor \frac{\tilde{z}}{p^{k+r+s}} \right\rfloor \bmod p^k = z$. In other words, we choose $\hat{z}$ in such a way that the number whose base-$p$ expansion stands in positions from the $(k+r+s)$-th to the $(2k+r+s-1)$-th in the canonical $p$-adic expansion of $\tilde{z}$ is equal to $z$. Obviously, given $f'(v)$ and $z$, there exists a unique $\hat{z}$ that satisfies this condition: $\hat{z} \equiv z - \left\lfloor \frac{f'(v)}{p^{k+r+s}} \right\rfloor \pmod{p^k}$; so
\[
\tilde{z} \bmod p^{2k+r+s} = (f'(v) \bmod p^{k+r+s}) + p^{k+r+s} \cdot z. \tag{12.11}
\]

As $f$ is two times differentiable at $v$, for every $\zeta \in \{0, 1, \ldots, p^k - 1\}$ we conclude that
\[
f'(v + p^{r+k} \zeta) \equiv f'(v) + p^{r+k} \zeta \cdot f''(v) \pmod{p^{2k+r+s}} \tag{12.12}
\]
for all sufficiently large $r$ (formally, we just substitute $f'$ for $f$, $v$ for $x$, $\zeta$ for $h$, $k + s$ for $K$, and $r + k$ for $L$ in (12.8)). From here we deduce that, as $f$ is differentiable in $B$, the following congruence holds for all sufficiently large $n$:
\[
f(v + p^{r+k} \zeta + p^n \tilde{u}) \equiv f(v + p^{r+k} \zeta) + p^n \tilde{u} \cdot (f'(v) + p^{r+k} \zeta \cdot f''(v)) \pmod{p^{n+2k+r+s}}. \tag{12.13}
\]

Note that the latter congruence is obtained by combining congruence (12.8), where $K = 2k + r + s$, $x = v + p^{r+k} \zeta$, $h = \tilde{u}$ and $L = n$, with congruence (12.12). We claim that there exists $\zeta \in \{0, 1, \ldots, p^k - 1\}$ such that
\[
\tilde{u} \cdot (f'(v) + p^{r+k} \zeta \cdot f''(v)) \equiv \tilde{z} \pmod{p^{2k+r+s}}. \tag{12.14}
\]

Indeed, in view of (12.9)–(12.10) this congruence is equivalent to the congruence
\[
(1 + p^{k+r+s} u) \cdot (f'(v) + p^{r+k} \zeta \cdot f''(v)) \equiv f'(v) + p^{k+r+s} \hat{z} \pmod{p^{2k+r+s}},
\]
and the latter congruence is equivalent to the congruence
\[
f'(v) + p^{r+k} \zeta \cdot f''(v) \equiv (1 - p^{k+r+s} u) \cdot (f'(v) + p^{k+r+s} \hat{z}) \pmod{p^{2k+r+s}}
\]
as $(1 + p^{k+r+s} u)^{-1} \equiv 1 - p^{k+r+s} u \pmod{p^{2k+r+s}}$. That is, congruence (12.14) is equivalent to the congruence
\[
p^{k+r} \zeta \cdot f''(v) \equiv p^{k+r+s} \hat{z} - p^{k+r+s} u \cdot f'(v) \pmod{p^{2k+r+s}}.
\]
However, as $f''(v) = p^s \xi$, the latter congruence is equivalent to the congruence $\zeta \xi \equiv \hat{z} - u \cdot f'(v) \pmod{p^k}$. From here we find $\zeta \equiv \xi^{-1} \cdot (\hat{z} - u \cdot f'(v)) \pmod{p^k}$, thus proving our claim (we recall that $\xi$ is a unit of $\mathbb{Z}_p$; hence, $\xi$ has a multiplicative inverse $\xi^{-1}$ modulo $p^k$).

Now we put $M = n + 2k + r + s$ and $a = v + p^{r+k} \zeta + p^n \cdot (1 + p^{k+r+s} u)$; then
\[
\frac{a}{p^M} = \frac{v + p^{r+k} \zeta + p^n}{p^{n+2k+r+s}} + \frac{u}{p^k},
\]
so $\left|\frac{a}{p^M} - \frac{u}{p^k}\right| < \frac{1}{p^k}$, since $v < p^r$, $\zeta < p^k$, and $n > 2r + 2s + 2k$. However, at the same time, combining (12.14), (12.10), (12.11), and (12.13), we see that
\[
\frac{f(a) \bmod p^M}{p^M} = \frac{z}{p^k} + \frac{1}{p^{2k+r+s}} \cdot \frac{f(v + p^{r+k} \zeta)}{p^n} + \frac{f'(v) \bmod p^{k+r+s}}{p^{k+r+s}} \cdot \frac{1}{p^k}, \tag{12.15}
\]
since $f(a) \bmod p^M = f(v + p^{r+k} \zeta) + p^n \cdot (f'(v) \bmod p^{k+r+s}) + p^{n+k+r+s} z$ (the number on the right-hand side is less than $p^M$ due to our choice of $n$). Now from (12.15) it follows that $\left|\frac{f(a) \bmod p^M}{p^M} - \frac{z}{p^k}\right| < \frac{1}{p^k}$, since $0 \le f(v + p^{r+k} \zeta) \le p^n - 1$ due to our choice of $n$.
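The claim (12.7) can be checked by brute force for small parameters. The following Python sketch is only an illustration and not part of the proof: the function name `find_witness`, the search bound `max_M`, and the choice $f(x) = x^2$, $p = 2$, $k = 2$ are assumptions made for the example.

```python
from fractions import Fraction


def find_witness(f, p, k, u, z, max_M):
    """Search for a pair (M, a) satisfying both inequalities of (12.7).

    Returns the first pair found with M <= max_M, or None if there is none.
    Exact rational arithmetic avoids floating-point comparison issues.
    """
    bound = Fraction(1, p**k)
    target_x = Fraction(u, p**k)
    target_y = Fraction(z, p**k)
    for M in range(1, max_M + 1):
        pM = p**M
        for a in range(pM):
            x = Fraction(a, pM)
            y = Fraction(f(a) % pM, pM)
            if abs(x - target_x) < bound and abs(y - target_y) < bound:
                return M, a
    return None


def square(x):
    # Hypothetical example: a polynomial of degree 2 over Z.
    return x * x


# One witness for every pair (u, z) with 0 <= u, z < 2^2.
witnesses = {
    (u, z): find_witness(square, 2, 2, u, z, max_M=6)
    for u in range(4)
    for z in range(4)
}
```

For these small parameters every pair $(u, z)$ receives a witness with $M \le 6$; the proof above shows how such a witness is constructed in general rather than searched for.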

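Note 12.25. We note that $\alpha(f(x)) = \alpha(-f(x)) = \alpha(f(-x))$ for every 1-Lipschitz function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ of variable $x$; so we may replace condition 1 of Theorem 12.24 by any of the conditions $f(B \cap -\mathbb{N}_0) \subset \mathbb{N}_0$, $f(B \cap \mathbb{N}_0) \subset -\mathbb{N}_0$, or $f(B \cap -\mathbb{N}_0) \subset -\mathbb{N}_0$, where $-\mathbb{N}_0 = \{0, -1, -2, \ldots\}$.

Indeed, for every $c \in \mathbb{N}$ and every $n \in \mathbb{N}$ we have that
\[
\frac{-c \bmod p^n}{p^n} = \frac{p^n - (c \bmod p^n)}{p^n} = 1 - \frac{c \bmod p^n}{p^n}.
\]
Thus, a symmetry with respect to the axis $y = \frac{1}{2}$ of the unit square $\mathbb{I}^2 \subset \mathbb{R}^2$ maps the subset
\[
E(f) = \left\{ \left( \frac{x \bmod p^n}{p^n}, \frac{f(x) \bmod p^n}{p^n} \right) : x \in \mathbb{Z}_p,\ n \in \mathbb{N} \right\} \subset \mathbb{I}^2
\]
onto the subset $E(-f)$ and vice versa; so $\alpha(f(x)) = \alpha(-f(x))$. A similar argument proves that $\alpha(f(x)) = \alpha(f(-x))$.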
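The symmetry used in Note 12.25 can be observed on a single level $n$ of the set $E(f)$. The sketch below is a hedged illustration: the helper names `plot_points` and `reflect`, and the sample polynomial $f(x) = x^2 + 3$ with $p = 3$, $n = 4$, are assumptions chosen for the example. The reflection is taken modulo 1 so that points with ordinate 0 stay at 0, which matches the behaviour of $-c \bmod p^n$ when $c \equiv 0 \pmod{p^n}$.

```python
from fractions import Fraction


def plot_points(f, p, n):
    """Level-n points ((x mod p^n)/p^n, (f(x) mod p^n)/p^n), x = 0..p^n - 1."""
    q = p**n
    return {(Fraction(x, q), Fraction(f(x) % q, q)) for x in range(q)}


def reflect(points):
    """Reflect about the axis y = 1/2, modulo 1 (so y = 0 is mapped to 0)."""
    return {(x, (1 - y) % 1) for (x, y) in points}


def f(x):
    # Hypothetical sample function with rational integer coefficients.
    return x * x + 3


def neg_f(x):
    return -f(x)


# Level-4 points of E(-f) coincide with the reflected level-4 points of E(f).
symmetric = plot_points(neg_f, 3, 4) == reflect(plot_points(f, 3, 4))
```

Since a 1-Lipschitz function is determined modulo $p^n$ by its argument modulo $p^n$, letting $x$ run over $0, \ldots, p^n - 1$ covers the whole level $n$.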

Theorem 12.26. If the automaton function $f = f_A$ of an automaton $A$ is a univariate polynomial of degree $\ge 2$ with rational integer coefficients, then the automaton $A$ is absolutely transitive.

Proof. As $f$ is a polynomial, $f$ has not more than a finite number of zeros in $\mathbb{R}$, so there exists $d \in \mathbb{N}_0$ such that for all $b \ge d$ either the values $f(b)$ are all positive or they are all negative. It suffices to consider only the case when all $f(b) > 0$. Indeed, let $A = \langle \mathbb{F}_p, S, \mathbb{F}_p, S, O, s_0 \rangle$; consider the automaton $A' = \langle \mathbb{F}_p, S, \mathbb{F}_p, S, \sigma(O), s_0 \rangle$, where for $\chi \in \mathbb{F}_p$
\[
\sigma(\chi) =
\begin{cases}
\chi, & \text{if } \chi \notin \{0, p-1\}; \\
0, & \text{if } \chi = p-1; \\
p-1, & \text{if } \chi = 0.
\end{cases}
\]

As the map $\sigma$ is just a permutation on the alphabet $\mathbb{F}_p$, it is clear that the automaton $A$ is absolutely transitive if and only if the automaton $A'$ is absolutely transitive, cf. the equivalent definition of absolute transitivity in terms of words. However, under the alphabet substitution $\sigma$, negative rational integers (the ones in whose canonical $p$-adic representations there are only finitely many coefficients distinct from $p - 1$) will change to non-negative rational integers (the ones in whose canonical $p$-adic representations there are only finitely many non-zero coefficients), and vice versa.

Given a finite non-empty word $\tilde{g}$ over the alphabet $\mathbb{F}_p$, take a finite word $\tilde{v}$ whose prefix is $\tilde{g}$ and such that the corresponding non-negative rational integer $v$³ is greater than $d$, $v > d$, and $f''(v) \ne 0$. The word $\tilde{v}$ that satisfies these three conditions simultaneously exists, as $f''$ is a polynomial (and thus has not more than a finite number of zeros) and fixing an arbitrary $\tilde{g}$ means that only some less significant digits in the base-$p$ expansion of $v$ are fixed; so the condition $v > d$ can also be satisfied just by taking $v$ whose base-$p$ expansion is sufficiently long. Therefore both $f$ and the so constructed $v$ satisfy the conditions of Theorem 12.24.

Now note that the claim stated at the very beginning of the proof of Theorem 12.24 is just a re-statement of item 2 from Definition 12.11. Indeed, in the notation of Definition 12.11 and the one from the beginning of the proof of Theorem 12.24, the concatenation $w \circ y$ corresponds to $a$, $w$ corresponds to $u$, $w'$ corresponds to $z$, and $w$ is a $k$-letter suffix of the output word which is a base-$p$ expansion for $f(a) \bmod p^M$, whereas $M$ is the length of the word $w \circ y$. Up to these correspondences, condition (12.7) is equivalent to the statement of item 2 from Definition 12.11.

Furthermore, as the word $\tilde{v}$ has an arbitrarily chosen prefix $\tilde{g}$, and as condition (12.7) holds for $a = v + p^{\ell} t$ from the proof of Theorem 12.24 (as the whole Theorem 12.24 holds for $f$ and $v$), this means that the statement of item 2 from Definition 12.11 holds for an input word with arbitrarily chosen prefix $\tilde{g}$, up to all mentioned correspondences. However, this means that the statement of item 3 from Definition 12.11 also holds for $x = \tilde{g}$ in the case under consideration. This finally proves Theorem 12.26.

³ That is, the one whose base-$p$ expansion is $\tilde{v}$; recall that according to our conventions words are read from right to left, that is, the rightmost letters of $\tilde{v}$ correspond to low order digits in the base-$p$ expansion of $v$.

Exercise 12.27. Prove that the following functions are of measure 1:
• $f(x) = cx + c^x$ if $c \in \{2, 3, 4, \ldots\}$ and $c \equiv 1 \pmod{p}$;
• $f(x) = (x \,\mathrm{AND}\, c) + ((x^2) \,\mathrm{OR}\, c)$ if $c \in \mathbb{Z}$ and $p = 2$.

We also note that the following properties of automaton functions are independent:
• to be ergodic/non-ergodic;
• to be of measure 0/of measure 1.

Indeed, by Theorem 12.26, a polynomial of degree not less than 2 with rational integer coefficients is of measure 1; however, it may be ergodic or non-ergodic depending on its coefficients, cf. Theorem 9.16. For polynomials of degree 1 over $\mathbb{Z}$ see the following exercise:

Exercise 12.28. Prove that a polynomial of degree 1 over $\mathbb{Z}$ is of measure 0; prove that there exist ergodic as well as non-ergodic polynomials of this kind.

The following exercise is a tool to construct various absolutely transitive automata:

Exercise 12.29. By a proper modification of the proof of Theorem 12.26, prove the following: Under the conditions of Theorem 12.24, let $B = \mathbb{Z}_p$ and let $f''(x)$ have no more than a finite number of zeros in $\mathbb{N}_0$. Then an automaton whose automaton function is $f$ is absolutely transitive.

12.5 Automata finiteness criterion

Given an automaton function, no criterion is currently known to determine whether the corresponding automaton is of measure 1 or 0. In the foregoing section we have already proved some sufficient conditions for an automaton to be of measure 1. On the other hand, from Theorem 12.21 we already know that a finite automaton is always of measure 0. Therefore if, given an automaton function, one can determine whether the function corresponds to a finite automaton, this will serve as a sufficient condition for an automaton to be of measure 0. Now we are going to prove an automata finiteness criterion in terms of van der Put series of automaton functions. We first recall some notions and facts from the theory of automatic sequences following [2].

An infinite sequence $a = (a_i)_{i=0}^{\infty}$ over a finite alphabet $\mathcal{A}$, $\#\mathcal{A} = L < \infty$, is called $p$-automatic if there exists a finite transducer⁴ $T = \langle \mathbb{F}_p, S, \mathcal{A}, S, O, s_0 \rangle$ such that for all $n = 0, 1, 2, \ldots$, if $T$ is fed by the word $\chi_k \chi_{k-1} \cdots \chi_0$ which is a base-$p$ expansion of $n = \chi_0 + \chi_1 p + \cdots + \chi_k p^k$, $\chi_k \ne 0$ if $n \ne 0$, then the $k$-th output symbol of $T$ is $a_n$; or, in other words, such that $\delta_k^{\mathcal{A}}(f_T(n)) = a_n$ for all $n \in \mathbb{N}_0$, where $k = \lfloor \log_p n \rfloor$ and $\delta_k^{\mathcal{A}}(r)$ stands for the $k$-th digit in the base-$L$ expansion of $r$.

The $p$-kernel of the sequence $a$ is the set $\ker_p(a)$ of all subsequences $(a_{jp^m + t})_{j=0}^{\infty}$, $m = 0, 1, 2, \ldots$; $0 \le t < p^m$.

Theorem 12.30 (Automaticity criterion, cf. [2, Theorem 6.6.2]). Let $p \ge 2$; then the sequence $a$ is $p$-automatic if and only if its $p$-kernel is finite.

Now we are able to state the main result of the section:

Theorem 12.31 (Automata finiteness criterion). Given a 1-Lipschitz function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ represented by the van der Put series (4.29),
\[
f(x) = \sum_{m=0}^{\infty} b_m p^{\lfloor \log_p m \rfloor} \chi(m, x),
\]
the function $f$ is the automaton function of a finite automaton if and only if the following conditions hold simultaneously:
(i) all coefficients $b_m$, $m = 0, 1, 2, \ldots$, constitute a finite subset $B_f \subset \mathbb{Q} \cap \mathbb{Z}_p$, and
(ii) the $p$-kernel of the sequence $(b_m)_{m=0}^{\infty}$ is finite.

Note 12.32. Condition (ii) of the theorem is equivalent to the condition that the sequence $(b_m)_{m=0}^{\infty}$ is $p$-automatic, cf. Theorem 12.30.

Now we are going to present an equivalent statement of Theorem 12.31 in terms of formal power series. Given a $q$-element field $\mathbb{F}_q$, denote via $\mathbb{F}_q[[X]]$ the ring of formal power series in variable $X$ over $\mathbb{F}_q$:
\[
\mathbb{F}_q[[X]] = \left\{ \sum_{i=0}^{\infty} a_i X^i : a_i \in \mathbb{F}_q \right\};
\]

⁴ Note that the definition of a $p$-automatic sequence from [2] reads at this place “DFAO”, a deterministic finite automaton with output (which is also known under the name of Moore automaton), rather than “transducer”. However, in automata theory it is well known that any Moore automaton is equivalent to a suitable Mealy automaton in the following sense: Given a Mealy automaton, there exists a Moore automaton whose automaton function is the same as the one of the Mealy automaton (and vice versa), see e.g. [24, Section 1.9]. As in the sense of our book finite transducers are exactly Mealy automata, we may speak of transducers rather than of DFAO in the definition.

denote via $\mathbb{F}_q((X))$ the ring of formal Laurent series over $\mathbb{F}_q$:
\[
\mathbb{F}_q((X)) = \left\{ \sum_{i=-n_0}^{\infty} a_i X^i : n_0 \in \mathbb{N}_0,\ a_i \in \mathbb{F}_q \right\}.
\]
Denote via $\mathbb{F}_q(X)$ the field of (univariate) rational functions over $\mathbb{F}_q$:
\[
\mathbb{F}_q(X) = \left\{ \frac{u(X)}{v(X)} : u(X), v(X) \in \mathbb{F}_q[X],\ v(X) \ne 0 \right\},
\]
where $\mathbb{F}_q[X]$ is the ring of polynomials in variable $X$ over $\mathbb{F}_q$. As the field $\mathbb{F}_q((X))$ contains the subfield $\mathbb{F}_q(X)$, it is possible to define algebraicity over $\mathbb{F}_q(X)$ in a standard way: A formal Laurent series $F(X) = \sum_{i=-n_0}^{\infty} a_i X^i$ is algebraic over $\mathbb{F}_q(X)$ if and only if there exist $d \in \mathbb{N}$ and polynomials $u_0(X), \ldots, u_d(X) \in \mathbb{F}_q[X]$, not all zero, such that in the field $\mathbb{F}_q((X))$ the following identity holds:
\[
u_0(X) + u_1(X) \cdot F(X) + \cdots + u_d(X) \cdot (F(X))^d = 0.
\]
Now we recall Christol's theorem, [2, Theorem 12.2.5]:

Theorem 12.33 (Christol). Let $p$ be a prime, and let $a = (a_i)_{i=0}^{\infty}$ be an infinite sequence over a finite non-empty alphabet $\mathcal{A}$. The sequence $a$ is $p$-automatic if and only if there exist an integer $\ell \in \mathbb{N}$ and an injection $\tau\colon \mathcal{A} \to \mathbb{F}_{p^{\ell}}$ such that the formal power series $\sum_{i=0}^{\infty} \tau(a_i) X^i$ is algebraic over $\mathbb{F}_{p^{\ell}}(X)$.

By Christol's theorem, we may now replace condition (ii) from the statement of Theorem 12.31 by an equivalent one, thus getting an equivalent finiteness criterion:

Theorem 12.34 (Automata finiteness criterion, equivalent). Given a 1-Lipschitz function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ represented by the van der Put series (4.29), the function $f$ is the automaton function of a finite automaton if and only if the following conditions hold simultaneously:
(i) all coefficients $b_m$, $m = 0, 1, 2, \ldots$, constitute a finite subset $B_f \subset \mathbb{Q} \cap \mathbb{Z}_p$, and
(ii) under a suitable injection $\tau\colon B_f \to \mathbb{F}_{p^{\ell}}$, the formal power series $\sum_{m=0}^{\infty} \tau(b_m) X^m$ over $\mathbb{F}_{p^{\ell}}$ is algebraic over $\mathbb{F}_{p^{\ell}}(X)$.
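The finiteness of the $p$-kernel from Theorem 12.30 can be explored numerically on finite prefixes. The sketch below is only a heuristic illustration: a truncated computation can suggest, but never prove, that a kernel is finite. The function name `truncated_kernel`, the truncation parameters, and the two sample sequences (the Thue-Morse sequence and the characteristic sequence of perfect squares) are assumptions made for the example.

```python
from math import isqrt


def truncated_kernel(a, p, max_m, length):
    """Distinct length-`length` prefixes of the subsequences (a(j*p^m + t))_j.

    Only exponents m <= max_m and offsets 0 <= t < p^m are examined.
    """
    kernel = set()
    for m in range(max_m + 1):
        step = p**m
        for t in range(step):
            kernel.add(tuple(a(j * step + t) for j in range(length)))
    return kernel


def thue_morse(n):
    # Parity of the number of ones in the binary expansion of n.
    return bin(n).count("1") % 2


def is_square(n):
    # Characteristic sequence of perfect squares 0, 1, 4, 9, ...
    r = isqrt(n)
    return 1 if r * r == n else 0


tm_kernel = truncated_kernel(thue_morse, 2, max_m=5, length=16)
sq_kernel = truncated_kernel(is_square, 2, max_m=5, length=16)
```

For the Thue-Morse sequence the truncated 2-kernel contains exactly two elements (the sequence and its complement), in agreement with its 2-automaticity; for the sequence of squares the number of distinct prefixes keeps growing as `max_m` increases.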

Proof of Theorem 12.31. Given a 1-Lipschitz function $f$, for $n \in \mathbb{N}_0$, $k \ge \lfloor \log_p n \rfloor + 1$ consider functions $f_{n,k}\colon \mathbb{Z}_p \to \mathbb{Z}_p$ defined as follows:
\[
f_{n,k}(z) = \frac{1}{p^k} \left( f(n + p^k z) - (f(n) \bmod p^k) \right); \quad z \in \mathbb{Z}_p.
\]
The function $f$ is an automaton function of a finite automaton if and only if the collection $\mathcal{F}$ of functions $f_{n,k}$ ($n \in \mathbb{N}_0$, $k \in \mathbb{N}$, $k \ge \lfloor \log_p n \rfloor + 1$) contains only finitely many pairwise distinct functions: Note that $f_{n,k}$ is the automaton function that corresponds to the automaton $A(s(n_k)) = \langle \mathbb{F}_p, S, \mathbb{F}_p, S, O, s(n_k) \rangle$, where $s(n_k) \in S$ is the state the automaton $A = A_f = \langle \mathbb{F}_p, S, \mathbb{F}_p, S, O, s_0 \rangle$ reaches after it has been fed with the input word $n_k$ (of length $k$) that corresponds to a base-$p$ expansion of $n$ (so the word $n_k$ may contain some leading zeros that correspond to higher order digits of the expansion). Note that by Theorem 4.18,
\[
b_{n + p^k s} = \frac{1}{p^k} \left( f(n + p^k s) - f(n) \right) = \frac{1}{p^k} \left( f(n + p^k s) - (f(n) \bmod p^k) \right) - \frac{1}{p^k} \left( f(n) - (f(n) \bmod p^k) \right) = f_{n,k}(s) - f_{n,k}(0)
\]
if $n \le p^k - 1$ and $s \in \{1, 2, \ldots, p-1\}$, cf. (3.12); so the finiteness of $\mathcal{F}$ implies that in the sequence $(b_m)_{m=0}^{\infty}$ there are only finitely many pairwise distinct terms. We proceed with this in mind.

Take $n \in \mathbb{N}_0$ and $k \ge \lfloor \log_p n \rfloor + 1$. By (4.29), the value $f(n + p^k z)$ can be represented as $f(n + p^k z) = A_{n,k}(z) + B_{n,k}(z)$, where
\[
A_{n,k}(z) = \sum_{m=0}^{p^k - 1} b_m p^{\lfloor \log_p m \rfloor} \chi(m, n + p^k z); \tag{12.16}
\]
\[
B_{n,k}(z) = \sum_{m=p^k}^{\infty} b_m p^{\lfloor \log_p m \rfloor} \chi(m, n + p^k z). \tag{12.17}
\]

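By (3.13), once $m \le p^k - 1$, the equality $\chi(m, n + p^k z) = 0$ holds if and only if $m \not\equiv n \pmod{p^{\lfloor \log_p m \rfloor + 1}}$; and once $m \ge p^k$, the equality $\chi(m, n + p^k z) = 0$ holds if and only if $m \not\equiv n + p^k z \pmod{p^{\lfloor \log_p m \rfloor + 1}}$ (note that $\lfloor \log_p m \rfloor + 1 \ge k + 1$ under the conditions of the latter case); thus
\[
A_{n,k}(z) = \sum_{m=0}^{p^k - 1} b_m p^{\lfloor \log_p m \rfloor} \chi(m, n); \tag{12.18}
\]
\[
B_{n,k}(z) = \sum_{t=1}^{\infty} b_{n + p^k t}\, p^{k + \lfloor \log_p t \rfloor} \chi(n + p^k t, n + p^k z) = \sum_{t=1}^{\infty} b_{n + p^k t}\, p^{k + \lfloor \log_p t \rfloor} \chi(t, z). \tag{12.19}
\]
From here in particular it follows that $A_{n,k}(z)$ does not depend on $z$ and that $B_{n,k}(z) \equiv 0 \pmod{p^k}$ for all $z \in \mathbb{Z}_p$; consequently,
\[
f_{n,k}(z) = \frac{1}{p^k} \left( A_{n,k}(0) + B_{n,k}(z) - (A_{n,k}(0) \bmod p^k) \right) = C_{n,k} + D_{n,k}(z), \tag{12.20}
\]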

where
\[
C_{n,k} = \frac{1}{p^k} \left( A_{n,k}(0) - (A_{n,k}(0) \bmod p^k) \right); \tag{12.21}
\]
\[
D_{n,k}(z) = \sum_{t=1}^{\infty} b_{n + p^k t}\, p^{\lfloor \log_p t \rfloor} \chi(t, z). \tag{12.22}
\]

Note that from (12.22) it follows that $D_{n,k}(0) = 0$ (cf. (3.13)), so from (12.20) we deduce that $f_{n,k}(0) = C_{n,k}$. Thus, we have obtained the following criterion of the finiteness of the number of distinct functions in the collection $\mathcal{F}$: There are only finitely many pairwise distinct functions $f_{n,k} \in \mathcal{F}$, $n \in \mathbb{N}_0$, $k \ge \lfloor \log_p n \rfloor + 1$, if and only if the following two conditions hold simultaneously:
1. There are only finitely many pairwise distinct constants $C_{n,k} \in \mathbb{Z}_p$, $n \in \mathbb{N}_0$, $k \ge \lfloor \log_p n \rfloor + 1$.
2. There are only finitely many pairwise distinct functions $D_{n,k}\colon \mathbb{Z}_p \to \mathbb{Z}_p$, $n \in \mathbb{N}_0$, $k \ge \lfloor \log_p n \rfloor + 1$.

However, since representation (12.22) is a (unique) van der Put expansion of the function $D_{n,k}$, condition 2 is equivalent to the condition that in the sequence $(b_n)_{n=0}^{\infty}$ there are only finitely many pairwise distinct subsequences $(b_{n + p^k t})_{t=1}^{\infty}$, where $k \ge \lfloor \log_p n \rfloor + 1$, $n \in \mathbb{N}_0$. In turn, the latter condition is equivalent to the condition that there are only finitely many pairwise distinct subsequences $(b_{n + p^k t})_{t=0}^{\infty}$, $n \in \mathbb{N}_0$, $k \in \mathbb{N}$. Note that if the condition holds, there are only finitely many pairwise distinct terms $b_m$ in the sequence $(b_m)_{m=0}^{\infty}$.

Consider condition 1. Note that for $k \ge \lfloor \log_p n \rfloor + 1$ we have that
\[
\sum_{m=0}^{p^k - 1} b_m p^{\lfloor \log_p m \rfloor} \chi(m, n) = \sum_{m=0}^{p^{\lfloor \log_p n \rfloor + 1} - 1} b_m p^{\lfloor \log_p m \rfloor} \chi(m, n) \tag{12.23}
\]
since $\chi(m, n) = 0$ once $\lfloor \log_p m \rfloor > \lfloor \log_p n \rfloor$, cf. (3.13). Thus, $A_{n,k}(z)$ does not depend on $k$ (and on $z$, as we have already shown); so denoting the right-hand side in (12.23) via $A(n)$, we have that $C_{n,k} = p^{-k} (A(n) - (A(n) \bmod p^k))$, cf. (12.21) and (12.18). From here by (3.13) we get that $C_{n,k} = p^{-k} (b_n - (b_n \bmod p^k))$ once $n$ is such that $\lfloor \log_p n \rfloor = 0$. Consequently, given $n \in \{0, 1, \ldots, p-1\}$, the finiteness of the number of pairwise distinct $C_{n,k}$, where $k \ge \lfloor \log_p n \rfloor + 1 = 1$, is equivalent to the condition that the sequence $(\delta_i(b_n))_{i=0}^{\infty}$ is eventually periodic, that is, to the condition that $b_n \in \mathbb{Q} \cap \mathbb{Z}_p$. Using this as a base for induction on $\lfloor \log_p n \rfloor$, assuming that all $b_n \in \mathbb{Q} \cap \mathbb{Z}_p$ for $n$ such that $\lfloor \log_p n \rfloor < N$, we see that, given $n$ such that $\lfloor \log_p n \rfloor = N$, the finiteness of the number of pairwise distinct $C_{n,k} = p^{-k} (A(n) - (A(n) \bmod p^k))$ for

$k \ge \lfloor \log_p n \rfloor + 1 = N + 1$ is equivalent to the condition that $A(n) \in \mathbb{Q} \cap \mathbb{Z}_p$. However, in view of (3.13), from the definition of $A(n)$ it follows that
\[
A(n) = b_n p^{\lfloor \log_p n \rfloor} + \sum_{m=0}^{p^{\lfloor \log_p n \rfloor} - 1} b_m p^{\lfloor \log_p m \rfloor} \chi(m, n).
\]
The sum on the right-hand side is in $\mathbb{Q} \cap \mathbb{Z}_p$ by the induction hypothesis; so $b_n p^{\lfloor \log_p n \rfloor} \in \mathbb{Q} \cap \mathbb{Z}_p$, whence $b_n \in \mathbb{Q} \cap \mathbb{Z}_p$ since $b_n \in \mathbb{Z}_p$. We have finally proved that conditions 1–2 hold simultaneously if and only if the following conditions hold simultaneously:
1′. All coefficients $b_m$, $m = 0, 1, 2, \ldots$, constitute a non-empty finite subset (denoted as $B_f$) in $\mathbb{Q} \cap \mathbb{Z}_p$.
2′. There are only finitely many pairwise distinct subsequences $(b_{n + p^k t})_{t=0}^{\infty}$, $n \in \mathbb{N}_0$, $k \in \mathbb{N}$, $k \ge \lfloor \log_p n \rfloor + 1$; that is, the $p$-kernel of the sequence $(b_m)_{m=0}^{\infty}$ is finite.
This ends the proof.
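Condition (i) of Theorem 12.31 can be probed numerically for functions that map non-negative integers to integers. The sketch below is an illustration under stated assumptions: the helper name `van_der_put_coefficients` is hypothetical; for $m \ge p$ it uses the formula $b_{n + p^k s} = p^{-k}(f(n + p^k s) - f(n))$ quoted in the proof above, and for $m < p$ it assumes $b_m = f(m)$, which follows from evaluating the series (4.29) at $x = m$.

```python
def van_der_put_coefficients(f, p, count):
    """Coefficients b_m, 0 <= m < count, of f(x) = sum b_m p^{floor(log_p m)} chi(m, x)."""
    coeffs = []
    for m in range(count):
        if m < p:
            # Assumed base case: only chi(m, m) is non-zero at x = m.
            coeffs.append(f(m))
            continue
        # Write m = n + top * p^k with k = floor(log_p m) and leading digit `top`.
        k, top = 0, m
        while top >= p:
            top //= p
            k += 1
        diff = f(m) - f(m - top * p**k)
        if diff % p**k != 0:
            raise ValueError("f does not look 1-Lipschitz at m = %d" % m)
        coeffs.append(diff // p**k)
    return coeffs


def increment(x):
    # x -> x + 1: the automaton function of a finite automaton (adder with carry).
    return x + 1


def square(x):
    # x -> x^2: of measure 1 by Theorem 12.26, hence not from a finite automaton.
    return x * x


inc_values = set(van_der_put_coefficients(increment, 2, 64))
sq_values = set(van_der_put_coefficients(square, 2, 64))
```

For $x \mapsto x + 1$ with $p = 2$ the coefficients take only the values 1 and 2, whereas for $x \mapsto x^2$ the set of values grows with the number of coefficients computed, which is consistent with condition (i) failing for the latter function.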

Chapter 13

Pseudorandom generators

While the preceding chapter was devoted mostly to one type of automata, the transducers, in the current chapter we study another type of automata, the generators. Recall that according to the general definition of an automaton, the generator is a (non-initial) automaton whose input alphabet is empty while the output alphabet is not empty (however, finite), cf. the very beginning of Section 12.1. We are mostly interested in pseudorandom number generators (PRNGs), as these generators can be used to produce pseudorandom sequences that ‘look like’ random ones. Of course, to construct a mathematical theory, the loose term ‘look like’ must be defined more precisely. However, currently there is no standard mathematical definition of what pseudorandom sequences are, since in various applications a sequence is considered pseudorandom if it passes a given set of statistical tests which vary depending on the application. Consequently, the definition of what a pseudorandom sequence is (whence, what a PRNG is) depends on the choice of these tests. We stress that the class of tests a PRNG must pass is set beforehand; for instance, if one takes all polynomial-time tests, one obtains a definition of pseudorandomness in the sense of complexity theory. However, in practice one often uses some standard batteries of tests, e.g., NIST, DIEHARD, or some other. As a rule, uniform distribution is the weakest statistical property the sequence must necessarily satisfy to be considered pseudorandom in any reasonable sense, as frequencies of occurrences of terms in a truly random sequence are (approximately) equal. That is why in the current chapter we focus on algorithms that produce uniformly distributed sequences out of a given short random string. Pseudorandom generators are widely used in numerous applications, especially in modelling, computer simulation (e.g., in quasi-Monte Carlo methods) and cryptography (e.g., in stream ciphers).
The latter are ciphers that encrypt information according to the following protocol.


Let information be represented in a binary form, as a sequence of zeros and ones; so a plaintext, the information to be encrypted, is a sequence $\alpha_0, \alpha_1, \alpha_2, \ldots$, where $\alpha_j \in \{0, 1\}$. Let $\Gamma = \gamma_0, \gamma_1, \gamma_2, \ldots$ be another sequence of zeros and ones, which is known both to Alice and Bob, and which is known to no third party. The sequence $\Gamma$ is called a keystream. To encrypt a plaintext, Alice just XORs it with the keystream (see Subsection 1.2.1 for the definition of XOR, the bitwise addition modulo 2):
\[
\begin{array}{lll}
 & \alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_i, \ldots & \text{(plaintext)} \\
\mathrm{XOR} & \gamma_0, \gamma_1, \gamma_2, \ldots, \gamma_i, \ldots & \text{(keystream)} \\
\hline
 & \zeta_0, \zeta_1, \zeta_2, \ldots, \zeta_i, \ldots & \text{(encrypted text)}
\end{array}
\]
To decrypt, Bob acts in the opposite order:
\[
\begin{array}{lll}
 & \zeta_0, \zeta_1, \zeta_2, \ldots, \zeta_i, \ldots & \text{(encrypted text)} \\
\mathrm{XOR} & \gamma_0, \gamma_1, \gamma_2, \ldots, \gamma_i, \ldots & \text{(keystream)} \\
\hline
 & \alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_i, \ldots & \text{(plaintext)}
\end{array}
\]
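The protocol above amounts to a single bitwise operation applied twice. A minimal Python sketch follows; the function name `xor_stream` and the sample bit strings are hypothetical, and the fixed keystream is used purely for illustration (it is of course not a secure choice).

```python
def xor_stream(bits, keystream):
    """Bitwise addition modulo 2 of a bit sequence with a keystream."""
    if len(keystream) < len(bits):
        raise ValueError("keystream is shorter than the message")
    return [b ^ g for b, g in zip(bits, keystream)]


# Hypothetical sample data.
plaintext = [1, 0, 1, 1, 0, 0, 1, 0]
keystream = [0, 1, 1, 0, 1, 0, 0, 1]

ciphertext = xor_stream(plaintext, keystream)   # Alice encrypts
recovered = xor_stream(ciphertext, keystream)   # Bob decrypts
```

Decryption recovers the plaintext because $(\alpha_i \oplus \gamma_i) \oplus \gamma_i = \alpha_i$ for every $i$.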

Loosely speaking, Shannon's Theorem yields that this encryption is secure provided the keystream $\Gamma$ is picked at random for each plaintext. In real life settings we very rarely can fulfil the conditions of Shannon's Theorem, and usually a pseudorandom keystream $\Gamma$ is used rather than a random one. That is, usually in real life ciphers $\Gamma$ is produced by a certain algorithm, and $\Gamma$ only looks random (that is, passes certain statistical tests). A standard reasoning at this point is that any adversary can apply only a restricted number of tests to distinguish a pseudorandom keystream from a truly random one; so whenever a pseudorandom string passes all these tests, an adversary must conclude that the keystream is random and therefore the cipher cannot be broken, since otherwise a successful attack that broke the cipher could actually serve as a test that distinguishes the keystream from a truly random one.

In cryptology, a stream cipher is thought of as an algorithm that takes a short random string (which is called a key) and stretches it into a much longer sequence, the keystream. Actually, within the scope of the book, by a stream cipher we mean a PRNG which is used for encryption according to the protocol described above. Not every PRNG is suitable for stream encryption. Stream ciphers are cryptographically secure PRNGs; that is, they must not only produce statistically good sequences, but they must also withstand an adversary's attacks. It is worth noticing here that, according to the postulates of cryptology, both the algorithm and the keystream are assumed to be known to an adversary; the only thing the adversary does not know is the key, and in most cases an attack is aimed at determining the key given both the algorithm and the keystream that corresponds to the unknown key.

13.1 Pseudorandom generator is a dynamical system

Basically, a real-life PRNG can be considered as a finite non-initial automaton $A = \langle \mathcal{N}, \mathcal{M}, f, F \rangle$ without input (that is, with empty input alphabet), where $\mathcal{N}$ is a finite set of states, $f\colon \mathcal{N} \to \mathcal{N}$ is a state transition function, $\mathcal{M}$ is a finite output alphabet, and $F\colon \mathcal{N} \to \mathcal{M}$ is an output function (sometimes in cryptology called a filter). That is, a mathematical model of a real-life PRNG is (in our terms) a finite generator, cf. the definition of the latter at the beginning of Section 12.1. The schematics of a typical PRNG is shown in Figure 13.1.

[Figure 13.1: Pseudorandom generator. The state transition function $f$ updates the state, $u_{i+1} = f(u_i)$; the output function $F$ produces the output $z_i = F(u_i)$.]

Given an initial state (which is sometimes also called a seed) $u_0 \in \mathcal{N}$, the PRNG produces a sequence
\[
Z = \{F(u_0), F(f(u_0)), F(f^2(u_0)), \ldots, F(f^j(u_0)), \ldots\}
\]
over the set $\mathcal{M}$, where
\[
f^j(u_0) = \underbrace{f(\ldots f(}_{j \text{ times}} u_0) \ldots) \quad (j = 1, 2, \ldots); \qquad f^0(u_0) = u_0.
\]
As the set of states $\mathcal{N}$ is finite, the output sequence is necessarily eventually periodic. Note that the output sequence depends on the initial state $u_0$.¹

¹ In cryptology, the initial state is usually a key which is chosen from $\mathcal{N}$ at random.



That is, the PRNG can be considered as a mapping from $\mathcal{N}$ into the set of all (eventually) periodic sequences over $\mathcal{M}$.

The generators may be considered either as pseudorandom generators per se, or as components of more complicated automata, the so-called counter-dependent generators; the latter produce sequences $\{z_0, z_1, z_2, \ldots\}$ over $\mathcal{M}$ according to the rule
\[
z_0 = F_0(u_0),\ u_1 = f_0(u_0);\ \ldots\ z_i = F_i(u_i),\ u_{i+1} = f_i(u_i);\ \ldots
\]
That is, at the $(i+1)$-th step the automaton $A_i = \langle \mathcal{N}, \mathcal{M}, f_i, F_i, u_i \rangle$ is applied to the state $u_i \in \mathcal{N}$, producing a new state $u_{i+1} = f_i(u_i) \in \mathcal{N}$ and outputting a symbol $z_i = F_i(u_i) \in \mathcal{M}$. It is clear that the counter-dependent generator may actually also be considered as a generator whose set of states is the Cartesian product $\mathbb{N}_0 \times \mathcal{N}$ (and whose initial states are pairs $(0, s) \in \mathbb{N}_0 \times \mathcal{N}$). We will focus on ordinary PRNGs which are represented by Figure 12.2.

Note that, formally speaking, the sequence of states
\[
u_0,\ u_1 = f(u_0),\ u_2 = f(u_1),\ \ldots,\ u_{i+1} = f(u_i) = f^{i+1}(u_0),\ \ldots \tag{13.1}
\]
can be considered as an orbit of a dynamical system $\langle \mathcal{N}, f \rangle$, whereas the output sequence
\[
z_0 = F(u_0),\ z_1 = F(u_1),\ \ldots,\ z_i = F(u_i) = F(f^i(u_0)),\ \ldots \tag{13.2}
\]
is an observable, see Section 6.1. We will show now that this consideration is not only formal, but discloses the essence of the problem of how to construct a good PRNG.
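The orbit (13.1) and the observable (13.2) translate directly into a lazy generator. The Python sketch below is a minimal illustration, not a recommended PRNG: the names `generator` and `orbit_length`, the word length $n = 8$, the output length $m = 2$, and the particular maps are assumptions made for the example. The state map $u \mapsto u + (u^2 \,\mathrm{OR}\, 5) \bmod 2^n$ is used as a sample transformation built from arithmetic and bitwise operations.

```python
from itertools import islice

n, m = 8, 2          # hypothetical state and output word lengths, in bits


def f(u):
    # Sample state transition function on n-bit words.
    return (u + ((u * u) | 5)) % 2**n


def F(u):
    # Sample output function: the m most significant bits of the state.
    return u >> (n - m)


def generator(f, F, u0):
    """Yield z_i = F(f^i(u0)) for i = 0, 1, 2, ..."""
    u = u0
    while True:
        yield F(u)
        u = f(u)


def orbit_length(f, u0):
    """Number of steps after which the orbit of u0 returns to u0.

    Assumes u0 lies on a cycle of f; this holds when f is a permutation.
    """
    u, steps = f(u0), 1
    while u != u0:
        u, steps = f(u), steps + 1
    return steps


first_outputs = list(islice(generator(f, F, 0), 8))
period = orbit_length(f, 0)
```

With these sample maps the orbit of the seed 0 passes through all $2^8$ states before returning, so the state sequence has the longest possible period for this state set.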

13.1.1 What pseudorandom generators are good?

A PRNG that could be considered any good obviously must meet the following conditions:
• The output sequence must be pseudorandom (i.e., must pass certain statistical tests).
• For cryptographic applications, given a segment $z_j, z_{j+1}, \ldots, z_{j+s-1}$ of the output sequence, finding the corresponding initial state (which usually is a key) must be infeasible in some properly defined sense.
• The PRNG must be suitable for software (or hardware) implementations; the performance must be sufficiently fast.

In the case the PRNG is an automaton represented by Figure 13.1, we can restate these conditions as follows:

Condition 1: The state transition function $f$ must provide pseudorandomness; in particular, it must guarantee uniform distribution and a long period of the sequence of states $\{u_i\}$.



For cryptographic purposes, it would be great if one could provide cryptographic security of this sequence as well; that is, given $u_i$, it must be infeasible either to find (or to predict) $u_{i+1}$ or to find $u_0$. Unfortunately, it is not easy to provide these properties in a real life setting: PRNGs that are ‘provably secure’, for which there exist proofs (based on some plausible, yet still unproven conjectures) that their output sequences cannot be predicted by polynomial-time algorithms, are too slow for most practical applications. In practice, one has to undertake additional efforts to make the output sequence secure: This is what output functions are needed for.

Condition 2: The output function $F$ must not spoil pseudorandomness; at least, the output sequence $\{z_i\}$ must be uniformly distributed and must have a long period. Moreover, in cryptographic applications the function $F$ must make the PRNG secure: Given $z_i$, $F$ and $f$, it must be difficult to find $u_i$ from the equation $z_i = F(u_i)$.

Finally, in practice, both in cryptography and in computer simulations, PRNGs are implemented in software or hardware, and it is highly desirable to make these programs platform-independent so that the same algorithm can be run on various platforms. Moreover, the performance of the corresponding programs must be sufficiently fast on all platforms. This demand results in the following condition:

Condition 3: To make the PRNG suitable for software/hardware implementations, and to make it platform-independent, both $f$ and $F$ must be (not too complicated) compositions of basic instructions from Subsection 1.2.1.

To satisfy condition 1, one may take a transitive state transition function $f\colon \mathcal{N} \to \mathcal{N}$; the sequence of states (13.1) will then have the longest possible period (of length $\#\mathcal{N}$) and strict uniform distribution: Every element from $\mathcal{N}$ will occur at the period exactly once, see Subsection 6.1.1.

To satisfy the first part of condition 2, one may take a balanced output function $F\colon \mathcal{N} \to \mathcal{M}$; see Subsection 6.1.1 for the definition (in this case we assume that $\#\mathcal{N}$ is a multiple of $\#\mathcal{M}$). Whenever $\#\mathcal{N} = \#\mathcal{M}$, balanced mappings are just invertible (that is, bijective, one-to-one) mappings. It is easy to see that if a balanced output function is applied to a strictly uniformly distributed sequence of states, the output sequence is also strictly uniformly distributed: It is periodic with a period of length $\#\mathcal{N}$, and every element from $\mathcal{M}$ occurs at the period exactly $\frac{\#\mathcal{N}}{\#\mathcal{M}}$ times. We state this as a Proposition:

Proposition 13.1. If the state transition function $f$ of the automaton $A$ is transitive on the state set $\mathcal{N}$, i.e., if $f$ is a permutation with a single cycle of length $N = \#\mathcal{N}$; if, further, $N$ is a multiple of $M = \#\mathcal{M}$, and if the output function $F\colon \mathcal{N} \to \mathcal{M}$ is balanced (i.e., $\#F^{-1}(s) = \#F^{-1}(t)$ for all $s, t \in \mathcal{M}$), then the output sequence $Z$ of the automaton $A$ is purely periodic with a period of length $N$ (i.e., the maximum possible), and each element of $\mathcal{M}$ occurs at the period the same number of times: $\frac{N}{M}$ exactly. That is, the output sequence $Z$ is uniformly distributed.

Exercise 13.2. Prove Proposition 13.1.

Whenever $\#\mathcal{M}$ is much less than $\#\mathcal{N}$, balanced functions may also satisfy the second part of condition 2, since the equation $z_i = F(x_i)$ then has too many solutions, $\frac{\#\mathcal{N}}{\#\mathcal{M}}$, so it is infeasible for an adversary to try them all.

Finally, to satisfy condition 3, one may use only operations that are common to all platforms: These are arithmetic (numerical) operations, that is, addition, multiplication, subtraction, division, and exponentiation of integers. In this case both $\mathcal{N}$ and $\mathcal{M}$ can be associated with the respective sets of rational integers $0, 1, 2, \ldots, N-1$ and $0, 1, 2, \ldots, M-1$; and moreover, with the residue rings $\mathbb{Z}/N\mathbb{Z}$ and $\mathbb{Z}/M\mathbb{Z}$, respectively. Moreover, if one takes $N = 2^n$ and $M = 2^m$, then actually both $f$ and $F$ will work with $n$-bit words to produce an output sequence of $m$-bit words. This case is the most convenient for programming; moreover, in this case one may use, along with arithmetic operations, bitwise logical operations as well, and other basic instructions (see Subsection 1.2.1) to construct $f$ and $F$.
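Proposition 13.1 is easy to observe on a toy generator. The following sketch is an illustration only and does not replace the proof asked for in Exercise 13.2: the state set $\mathbb{Z}/2^6\mathbb{Z}$, the transitive map $u \mapsto 5u + 1 \bmod 2^6$, the balanced output map $u \mapsto u \bmod 4$, and the helper names are assumptions chosen for the example.

```python
from collections import Counter

N, M = 2**6, 2**2    # hypothetical numbers of states and of output symbols


def f(u):
    # Sample transitive map on Z/64Z (multiplier = 1 mod 4, odd increment).
    return (5 * u + 1) % N


def F(u):
    # Sample balanced output map: every residue mod 4 has N/M preimages.
    return u % M


def one_period(f, F, u0, length):
    """Outputs F(u0), F(f(u0)), ..., over `length` consecutive steps."""
    outputs, u = [], u0
    for _ in range(length):
        outputs.append(F(u))
        u = f(u)
    return outputs


def is_transitive(f, size):
    """True if f permutes {0, ..., size-1} in a single cycle."""
    seen, u = set(), 0
    for _ in range(size):
        seen.add(u)
        u = f(u)
    return u == 0 and len(seen) == size


counts = Counter(one_period(f, F, u0=0, length=N))
```

Over one full period each of the $M = 4$ output symbols occurs exactly $N/M = 16$ times, as the proposition predicts for a transitive $f$ and a balanced $F$.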

13.1.2

Why the p-adic ergodic theory?

Now we explain a general way to construct transitive mappings $f$ and balanced mappings $F$ out of arithmetic operations (in the case when both $N$ and $M$ are composite numbers), and out of arithmetic and bitwise logical operations (in the case when both $N$ and $M$ are powers of 2). The idea is as follows: Let, say, $N = 2^n$ and $M = 2^m$, $m \le n$, $n = kr$, $m = ks$; then using the p-adic ergodic theory developed in Part II we construct an ergodic mapping $f\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ and a measure-preserving mapping $F\colon \mathbb{Z}_2^r \to \mathbb{Z}_2^s$ out of arithmetic and bitwise logical operations, as these operations are 1-Lipschitz functions defined on the space of 2-adic integers $\mathbb{Z}_2$ and valuated in $\mathbb{Z}_2$ (cf. Subsection 1.2.1). Then, according to Theorem 7.1, taking residues of $f$ and of $F$ modulo $2^n$ and $2^k$, respectively, we obtain a transitive transformation $f \bmod 2^n$ of the residue ring $\mathbb{Z}/2^n\mathbb{Z}$ and a balanced mapping $F \bmod 2^k\colon (\mathbb{Z}/2^k\mathbb{Z})^r \to (\mathbb{Z}/2^k\mathbb{Z})^s$. So $f \bmod 2^n$ will serve as a state transition function, whereas $F \bmod 2^k$ will serve as an output function, since elements of the residue ring $\mathbb{Z}/2^n\mathbb{Z}$ and of the Cartesian powers $(\mathbb{Z}/2^k\mathbb{Z})^r$ and $(\mathbb{Z}/2^k\mathbb{Z})^s$ can be treated as $n$-bit and $m$-bit words, respectively. Note also that any number whose base-2 representation is longer than the machine word length $k$ is reduced modulo $2^k$ automatically by a computer.

The case when both $N$ and $M$ are composite numbers can be reduced to the case of prime powers: That is, we will construct ergodic mappings $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ and measure-preserving mappings $F\colon \mathbb{Z}_p^r \to \mathbb{Z}_p^s$ and then take $f \bmod p^n$ and $F \bmod p^k$, for all prime factors of $N$ and $M$ (we assume that the prime factors of $N$ and of $M$ form the same set). Then with the use of Chinese Remainder Theorem 0.1 we construct mappings modulo $N$ and $M$ which coincide accordingly with $f \bmod p^n$ and $F \bmod p^k$ for all prime factors $p$ of $N$ and of $M$ (this approach will be discussed in more detail in Chapter 14, see Theorem 14.4 and the example thereafter).

Now we make some conventions on terminology, cf. Subsection 6.1.1:

Definition 13.3. A sequence $(s_i)_{i=0}^{\infty}$ of p-adic integers is called strictly uniformly distributed modulo $p^k$ whenever the sequence $(s_i \bmod p^k)_{i=0}^{\infty}$ of residues modulo $p^k$ is strictly uniformly distributed over the residue ring $\mathbb{Z}/p^k\mathbb{Z}$.

Note. A sequence $(s_i)_{i=0}^{\infty}$ of p-adic integers is uniformly distributed (with respect to the normalized Haar measure $\mu$ on $\mathbb{Z}_p$) if and only if it is uniformly distributed modulo $p^k$ for all $k = 1, 2, \ldots$; that is, for every $a \in \mathbb{Z}/p^k\mathbb{Z}$ the relative numbers of occurrences of $a$ in the initial segment of length $\ell$ of the sequence $\{s_i \bmod p^k\}$ of residues modulo $p^k$ are asymptotically equal, i.e., $\lim_{\ell\to\infty} \frac{A(a,\ell)}{\ell} = \frac{1}{p^k}$, where $A(a, \ell) = \#\{s_i \equiv a \pmod{p^k} : i < \ell\}$ (see [38] for details). So strictly uniformly distributed sequences are uniformly distributed in the usual sense of the theory of distribution of sequences.

Note that in view of Proposition 13.1 one can vary both the state transition function and the output function of a PRNG (and, for instance, make them key-dependent) without affecting uniform distribution of the output sequence, as the only conditions that must be satisfied to make the output uniformly distributed are ergodicity of the state transition function and measure-preservation of the output function.

Of course, to make all these considerations practical we must choose the functions $f$ and $F$ from suitably large classes of ergodic and measure-preserving functions. In other words, we must develop certain tools to produce a number of various measure-preserving, ergodic mappings out of arithmetic (and bitwise logical) operations. We consider these methods in the next section.
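The reduction of a composite modulus to prime powers by the Chinese Remainder Theorem, described above, can be sketched in a few lines. This is a hedged illustration, not the construction of Chapter 14: the two component maps (one modulo 8, one modulo 25) are assumptions chosen for the example, and the helper names `is_transitive` and `crt_glue` are hypothetical. Python 3.8 or later is assumed for the modular inverse `pow(m1, -1, m2)`.

```python
def is_transitive(f, N):
    """True if x -> f(x) is a single cycle of length N on {0, ..., N-1}."""
    seen, x = set(), 0
    for _ in range(N):
        if x in seen:
            return False
        seen.add(x)
        x = f(x)
    return x == 0

def crt_glue(f1, m1, f2, m2):
    """Glue maps on Z/m1 and Z/m2 (m1, m2 coprime) into one map on Z/(m1*m2)."""
    inv = pow(m1, -1, m2)            # inverse of m1 modulo m2
    def f(x):
        r1, r2 = f1(x % m1), f2(x % m2)
        # The unique residue modulo m1*m2 that is r1 mod m1 and r2 mod m2.
        return r1 + m1 * (((r2 - r1) * inv) % m2)
    return f

def f8(x):
    # Illustrative single-cycle map modulo 2^3 (arithmetic plus bitwise OR).
    return (x + ((x * x) | 5)) % 8

def f25(x):
    # Illustrative single-cycle map modulo 5^2.
    return (6 * x + 1) % 25

f200 = crt_glue(f8, 8, f25, 25)
print(is_transitive(f8, 8), is_transitive(f25, 25), is_transitive(f200, 200))
```

The glued map is a single cycle because its two components are single cycles of coprime lengths 8 and 25.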

13.2

Congruential generators of the longest period

In this section we consider the so-called congruential generators, a class of pseudorandom number generators which are widely used in various applications and widely studied in the literature. We will show that the theory of these generators is actually a part of p-adic ergodic theory: Numerous known sporadic results on these generators can be explained in a unified way by the p-adic ergodic theory considered in Part II. We will show that all known results about periods of these generators can be deduced from basic theorems of p-adic ergodic theory; also, we will prove some new general results in this area.

Actually, in this Section we explain how to construct a transformation on a given finite set $\mathcal{N}$ such that this transformation has a prescribed form and the longest possible period. These transformations will be compositions of arithmetic operators, and also of bitwise logical operators whenever $\#\mathcal{N}$ is a power of 2. Thus, generators based on so-called T-functions, which are of interest for modern cryptology and which are just triangular functions from Definition 4.4 when $p = 2$, are within the scope of our study as well. (Actually, T-functions are 1-Lipschitz 2-adic functions, see Subsection 4.1.1; so the theory of T-functions is a part of p-adic theory.)

Now we introduce the main notion of this Section:

Definition 13.4. A congruential generator is a non-initial automaton $A = \langle \mathcal{N}, \mathcal{M}, f, F \rangle$ (cf. the very beginning of Section 13.1) where $\mathcal{M} = \mathcal{N} = \mathbb{Z}/N\mathbb{Z}$, $F\colon \mathcal{M} \to \mathcal{M}$ is the identity mapping, and the state transition function $f\colon \mathbb{Z}/N\mathbb{Z} \to \mathbb{Z}/N\mathbb{Z}$ preserves all congruences of the residue ring $\mathbb{Z}/N\mathbb{Z}$: $f(a) \equiv f(b) \pmod L$ whenever $a \equiv b \pmod L$ and $L \ne 1$ is a factor of $N$. The function $f$ is called the recurrence law of the congruential generator.

Note 13.5. In view of Chinese Remainder Theorem 0.13 it is obvious that the output sequence of the congruential generator has the longest possible period (that is, of length $N$) if and only if the function $f \bmod p^n$ is transitive modulo $p^n$, where $n = \operatorname{ord}_p N$, for every prime factor $p$ of $N$.

In the literature, some authors consider one more class of generators, which they call explicit congruential generators.

Definition 13.6. An explicit congruential generator corresponds to the case when the state transition function of the automaton $A$ from Definition 13.4 is the map $f(x) = (x + 1) \bmod N$ (which is sometimes called an adding machine), whereas the output function $F\colon \mathbb{Z}/N\mathbb{Z} \to \mathbb{Z}/N\mathbb{Z}$ preserves all congruences of the residue ring $\mathbb{Z}/N\mathbb{Z}$.

Note 13.7. Obviously, the explicit congruential generator attains the longest possible period (of length $N$) if and only if the function $F \bmod p^n$ is bijective modulo $p^n$, where $n = \operatorname{ord}_p N$, for every prime factor $p$ of $N$.

We stress here that, according to Chapter 7, to determine whether a congruential generator (in the meaning of Definition 13.4) attains the longest period (of length $N$) we should study ergodicity of the function $f$ on the space $\mathbb{Z}_p$, for all primes $p \mid N$; whereas in the case of an explicit congruential generator we should study measure-preservation of $F$. This is the leading idea of the current section.
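Note 13.5 can be checked by brute force for small moduli. The sketch below is an illustration under assumptions made here: it uses polynomial recurrence laws (which preserve all congruences) and the modulus $N = 12$, and the helper names are hypothetical.

```python
def is_transitive(f, N):
    """True if x -> f(x) is a single cycle of length N on {0, ..., N-1}."""
    seen, x = set(), 0
    for _ in range(N):
        if x in seen:
            return False
        seen.add(x)
        x = f(x)
    return x == 0

def prime_power_factors(N):
    """Return the prime powers p^ord_p(N) whose product is N."""
    factors, p = [], 2
    while N >= p * p:
        if N % p == 0:
            q = 1
            while N % p == 0:
                N //= p
                q *= p
            factors.append(q)
        p += 1
    if N > 1:
        factors.append(N)
    return factors

def poly_law(coeffs, N):
    """Recurrence law x -> (c0 + c1*x + c2*x^2 + ...) mod N."""
    return lambda x: sum(c * x ** i for i, c in enumerate(coeffs)) % N

def agrees_with_note_13_5(coeffs, N):
    """Compare transitivity modulo N with transitivity modulo each prime power."""
    whole = is_transitive(poly_law(coeffs, N), N)
    parts = all(is_transitive(poly_law(coeffs, q), q)
                for q in prime_power_factors(N))
    return whole == parts

print(prime_power_factors(12))                 # the factors 4 and 3
print(agrees_with_note_13_5((1, 1), 12))       # law x + 1
```

For a quick experiment one can loop `agrees_with_note_13_5` over all small coefficient triples; the two sides agree in every case, as the note predicts.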


13.2.1

Types of congruential generators

Congruential generators from Definition 13.4 (as well as explicit congruential generators from Definition 13.6) were studied in a number of works, see monographs [35, 23, 46] and references therein. In the current subsection we consider some known and widely used types of congruential generators. We will demonstrate that in all cases the longest possible periods are attained by these generators whenever the corresponding state transition function $f$ is ergodic on certain subspaces of $\mathbb{Z}_p$, for some prime numbers $p$. This gives a general method to determine lengths of periods of congruential generators with the use of various techniques of the p-adic ergodic theory from Part II. Further we explain how to tweak these generators to lengthen their periods if the periods are not the longest possible.

Linear, quadratic, and cubic congruential generators

Among the most widespread types of congruential generators are linear congruential generators (which are sometimes also called Lehmer generators); they correspond to the case when $f(x) = (ax + b) \bmod N$, where $a, b$ are rational integers and $N > 1$ is a natural number. Note that one speaks of the congruential method of generating pseudorandom numbers whenever $b \equiv 0 \pmod N$, and of the mixed congruential method otherwise, see [35]. Other congruential generators that are often used in applications are quadratic and cubic; they correspond to the cases when $f(x)$ is a polynomial with rational integer coefficients, of degree 2 or 3, respectively. Note that Corollary 9.35 yields necessary and sufficient conditions for transitivity modulo $N$ of a polynomial of arbitrary degree with rational integer coefficients; thus, Corollary 9.35 gives a criterion for when a quadratic or cubic congruential generator attains the longest period. The question of when a linear congruential generator has the longest possible period (that is, of length $N$) was answered in 1962 by Hull and Dobell, cf. [35]. In view of Note 13.5 and Theorem 7.1, the criterion is actually stated by Theorem 8.1.

Note that the longest possible period (of length $N$) can be achieved only with the use of the mixed congruential method, when $b \not\equiv 0 \pmod N$ (actually, only when $b$ and $N$ are coprime, see Theorem 8.1). However, a multiplicative generator (with $f(x) = ax \bmod N$) is also often used in applications. In this case every ideal of the residue ring $\mathbb{Z}/N\mathbb{Z}$ is an invariant subset of the mapping $f(x) = ax$, so the longest possible period is achieved whenever $f$ is ergodic on spheres around 0; this holds if and only if $a$ is primitive either modulo $p^2$ for all primes $p$ such that $p^2 \mid N$, or modulo $p$, if $p \mid N$ and $p^2 \nmid N$, see Theorem 10.12. Usually a multiplicative generator is assumed to work only on the unit group of the residue ring $\mathbb{Z}/N\mathbb{Z}$, that is, on the multiplicative group $(\mathbb{Z}/N\mathbb{Z})^*$ of all invertible elements of the ring $\mathbb{Z}/N\mathbb{Z}$. In this case (for odd $N$) the generator is obviously equivalent to a linear congruential generator modulo $\varphi(N)$, the value of Euler's totient function, as the group $(\mathbb{Z}/p^k\mathbb{Z})^*$ is a cyclic group of order $(p-1)p^{k-1}$ for an odd prime $p$; so the longest period of the generator is of length $\varphi(N)$ in this case. Note that for $N = 2^k$, $k \ge 2$, the multiplicative group $(\mathbb{Z}/2^k\mathbb{Z})^*$ is a direct product of a group of order 2 by a cyclic group of order $2^{k-2}$; so the maximum length of the period of a multiplicative generator is $2^{k-2}$ in this case (for $k \ge 3$).

Power generators

Another type of congruential generator used in real-life applications is the power generator, with $f(x) = x^n \bmod N$. Power generators cannot achieve periods of length $N$ since every p-adic sphere centered at 1 is an invariant subset of the transformation $x \mapsto x^n$ on $\mathbb{Z}_p$: They achieve the longest possible period when they are ergodic on p-adic spheres centered at 1; this holds if and only if $n$ is primitive either modulo $p^2$ for all primes $p$ such that $p^2 \mid N$, or modulo $p$, if $p \mid N$ and $p^2 \nmid N$, see Theorem 10.12. Note that the maximum length of a period of the power generator can be determined with the use of Lemma 10.9.

Exercise 13.8. Let $N$ be odd and let $\operatorname{ord}_p N \ge 2$ for all primes $p \mid N$. When does a power generator with the recurrence law $f(x) = x^N \bmod N$ achieve the longest period, and what is the length of that period?

Inversive generators

Inversive generators are studied in numerous papers, see e.g. the survey paper [22] and references therein. When $N$ is a prime, $f(x)$ (or $F(x)$, for explicit generators) is of the form $ax^{-1} + b$ or $(a + bx)^{-1}$; here $0^{-1} = 0$ by definition, and $a, b \in \mathbb{Z}$. These functions cannot be extended directly to residue rings modulo composite $N$; in the latter case the domains of $f$ and $F$ are assumed to be restricted to the unit group $(\mathbb{Z}/N\mathbb{Z})^*$, which is a Cartesian product of the unit groups $(\mathbb{Z}/p^{\operatorname{ord}_p N}\mathbb{Z})^*$, for all primes $p \mid N$. Now we can study the behavior of the functions $ax^{-1} + b$ or $(b + ax)^{-1}$ on the unit group $\mathbb{Z}_p^*$ of all invertible p-adic integers to determine periods of these functions modulo $N$. As the unit group is a p-adic sphere of radius 1 centered at 0, and as both functions are 1-Lipschitz, the problem of maximality of the period length can be reduced to the problem of ergodicity of these functions on a p-adic sphere. We have already considered the corresponding examples, cf. Exercises 10.1 and 10.2.

Inversive generators of another kind are also known, namely the ones based on the mapping $\operatorname{inv}_p$, the generalized multiplicative inverse, cf. (2.2) in Subsection 2.1.2. For instance, in [21] it is proved that, given $a, b \in \mathbb{Z}$, the function $f(x) = a \cdot \operatorname{inv}_2(x) + b$ is transitive modulo $2^n$, $n \ge 2$, if and only if $a \equiv 1 \pmod 4$ and $b \equiv 1 \pmod 2$. Later, by using techniques of the p-adic ergodic theory, we will prove a much more general result, see Proposition 13.23 below. Now we only mention that, as the function $\operatorname{inv}_p(x)$ is a 1-Lipschitz transformation of $\mathbb{Z}_p$, the question of transitivity of the function $a \cdot \operatorname{inv}_2(x) + b$ modulo $2^n$ is equivalent to the question of ergodicity of this function on $\mathbb{Z}_2$, by Theorem 7.1.
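For the linear and multiplicative generators discussed in this subsection, the period statements are easy to test numerically. The sketch below assumes the classical form of the Hull and Dobell criterion ($b$ coprime to $N$; $a - 1$ divisible by every prime factor of $N$; $a - 1$ divisible by 4 when $4 \mid N$), which is stated here from general knowledge rather than quoted from Theorem 8.1; all function names are hypothetical.

```python
from math import gcd

def is_transitive(f, N):
    """True if x -> f(x) is a single cycle of length N on {0, ..., N-1}."""
    seen, x = set(), 0
    for _ in range(N):
        if x in seen:
            return False
        seen.add(x)
        x = f(x)
    return x == 0

def prime_factors(N):
    """Distinct prime factors of N (trial division, fine for small N)."""
    primes, p = [], 2
    while N > 1:
        if N % p == 0:
            primes.append(p)
            while N % p == 0:
                N //= p
        p += 1
    return primes

def hull_dobell(a, b, N):
    """Classical full-period criterion for x -> (a*x + b) mod N."""
    if gcd(b, N) != 1:
        return False
    if N % 4 == 0 and (a - 1) % 4 != 0:
        return False
    return all((a - 1) % p == 0 for p in prime_factors(N))

def mult_order(a, N):
    """Multiplicative order of a modulo N (a must be coprime to N)."""
    x, t = a % N, 1
    while x != 1:
        x, t = (x * a) % N, t + 1
    return t

def max_mult_period(k):
    """Longest period of x -> a*x on the unit group modulo 2^k."""
    return max(mult_order(a, 2 ** k) for a in range(1, 2 ** k, 2))

print(hull_dobell(5, 1, 16), is_transitive(lambda x: (5 * x + 1) % 16, 16))
print(max_mult_period(6), 2 ** (6 - 2))
```

Comparing `hull_dobell` with the brute-force `is_transitive` over all pairs $(a, b)$ for small $N$ shows the two always agree, and `max_mult_period(k)` returns $2^{k-2}$ for $k \ge 3$.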

13.2.2

Periods of congruential generators

Now we discuss various techniques which may be used to construct congruential generators of the longest period or to determine lengths of periods of congruential generators. In this way we demonstrate that the theory of congruential generators is essentially a part of p-adic ergodic theory.

Techniques based on convergent p-adic series

The most general characterizations of 1-Lipschitz measure-preserving and/or ergodic transformations on $\mathbb{Z}_p$ are given in terms of Mahler expansions, that is, by representation of the transformation via convergent interpolation series, see Subsection 8.3. This method is the most general as every continuous transformation on $\mathbb{Z}_p$ admits a Mahler expansion. Another method is based on the van der Put expansion, cf. Section 8.4. As every continuous transformation of $\mathbb{Z}_p$ can be represented by a convergent van der Put series, the method is as general as the one based on the Mahler expansion. In some cases, e.g. for analytic functions, we can also use representations via power series, or via falling factorial series, to determine whether the function is measure-preserving or ergodic by applying the results of Section 9.4. We can also use these methods to tweak generators in order to lengthen their periods.

We proceed with the examples. As said, an exponential generator, which has the recurrence law $f(x) = a^x \bmod N$, never attains the longest period, of length $N$. However, using the Mahler expansion, we can immediately tweak generators of this kind to make the lengths of their shortest periods the longest, i.e., $N$, just by adding a linear term to the recurrence law: For instance, Exercise 8.16 just means that for every prime $p$ and every $a \equiv 1 \pmod p$ the function $f(x) = ax + a^x$ is a 1-Lipschitz ergodic transformation on $\mathbb{Z}_p$.
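The effect of adding the linear term can be observed directly for small prime powers. The following sketch is only an illustration: the base $a = 11$ (which is congruent to 1 modulo both 2 and 5) and the list of moduli are choices made here, and the helper names are hypothetical.

```python
def is_transitive(f, N):
    """True if x -> f(x) is a single cycle of length N on {0, ..., N-1}."""
    seen, x = set(), 0
    for _ in range(N):
        if x in seen:
            return False
        seen.add(x)
        x = f(x)
    return x == 0

def exponential_law(a, N):
    """Plain exponential generator x -> a^x mod N."""
    return lambda x: pow(a, x, N)

def tweaked_law(a, N):
    """Exponential generator with an added linear term: x -> (a*x + a^x) mod N."""
    return lambda x: (a * x + pow(a, x, N)) % N

for q in (2, 4, 8, 16, 5, 25, 125):
    plain = is_transitive(exponential_law(11, q), q)
    tweaked = is_transitive(tweaked_law(11, q), q)
    print(q, plain, tweaked)
```

For each of these moduli the plain exponential law fails to be a single cycle (its values are all congruent to 1 modulo $p$), whereas the tweaked law runs through every residue.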
Now, combining Exercise 8.16 with Theorem 7.1 and with Chinese Remainder Theorem 0.1, we are able to construct exponential generators that attain the longest period (of length $N$) modulo $N$ for arbitrary composite $N$: For instance, the function $f(x) = 11x + 11^x$ is transitive modulo $10^n$ for all $n = 1, 2, \ldots$, as $f$ is ergodic on $\mathbb{Z}_p$ for $p = 2$ and for $p = 5$, thus transitive modulo $p^n$ for all $n = 1, 2, \ldots$ in view of Theorem 7.1; whence, $f$ is transitive modulo $10^n$ for all $n = 1, 2, \ldots$ in view of Chinese Remainder Theorem 0.13.

In a similar manner we can make tweaks to inversive generators modulo $N$ to lengthen their periods to the maximum value, $N$. The idea is to use the mapping $\iota_p\colon x \mapsto (1 + pmx)^{-1}$ (for some $m \in \mathbb{Z}_p$) in a composition of $f(x)$ rather than the mapping $x \mapsto x^{-1}$: Although both mappings are 1-Lipschitz p-adic mappings, the first one is defined everywhere on $\mathbb{Z}_p$, whereas the domain of the second one is the unit group $\mathbb{Z}_p^*$ (i.e., the p-adic sphere $S_1(0) = \bigcup_{a=1}^{p-1} (a + p\mathbb{Z}_p)$ of radius 1 centered at 0). Moreover, the function $\iota_p$ is a C-function; that is, a p-adic analytic function defined by a power series with p-adic integer coefficients that converges everywhere on $\mathbb{Z}_p$, see Section 5.1:

$$(1 + pmx)^{-1} = 1 - pmx + p^2m^2x^2 - p^3m^3x^3 + \cdots.$$

As a C-function is ergodic if and only if it is transitive either modulo $p^2$ (when $p > 3$), or modulo $p^3$ (when $p \le 3$), cf. Corollary 9.34, the function $f(x) = x + (1 + p^3x)^{-1}$ is transitive modulo $p^n$ for all $n = 1, 2, \ldots$ by Theorem 7.1; for the same reason, if $p > 3$, then the function $f(x) = x + (1 + p^2x)^{-1}$ is transitive modulo $p^n$ for all $n = 1, 2, \ldots$. Now, using Chinese Remainder Theorem 0.1, we can construct an inversive generator modulo $N$ whose shortest period is of length $N$, for arbitrary composite $N$. For instance, taking $f(x) = (x + (1 + 200x)^{-1}) \bmod 10^n$, we obtain an inversive generator whose period length is the maximum, $10^n$, whatever $n = 1, 2, 3, \ldots$ is taken: Again, this follows from Theorem 7.1 and Chinese Remainder Theorem 0.13, as this transformation $f$ is ergodic on $\mathbb{Z}_p$ for $p \in \{2, 5\}$.

Exercise 13.9. Given a composite $N$, let $\check{N} = \prod_{p^2 \mid N} p^2 \cdot \prod_{p^2 \nmid N} p^{\operatorname{ord}_p N}$. Prove that the length of the shortest period of the inversive generator whose recurrence law is $f(x) = (x + (1 + \check{N}x)^{-1}) \bmod N$ is the maximum possible, i.e., $N$. [Hint: Use Exercise 9.36.]

For instance, the length of the shortest period of the inversive generator with the law $f(x) = (x + (1 + 100x)^{-1}) \bmod 10^n$ is $10^n$, whatever $n = 2, 3, \ldots$ is taken.
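The tweaked inversive generators above can be tried out for small $n$. This is a minimal sketch, assuming Python 3.8 or later (for the modular inverse `pow(b, -1, N)`); the helper names `is_transitive` and `inversive_law` are hypothetical and not part of the text.

```python
def is_transitive(f, N):
    """True if x -> f(x) is a single cycle of length N on {0, ..., N-1}."""
    seen, x = set(), 0
    for _ in range(N):
        if x in seen:
            return False
        seen.add(x)
        x = f(x)
    return x == 0

def inversive_law(c, N):
    """Recurrence law x -> (x + (1 + c*x)^(-1)) mod N.

    1 + c*x must be invertible modulo N; this holds when every prime
    factor of N divides c, as in the examples of the text.
    """
    return lambda x: (x + pow(1 + c * x, -1, N)) % N

for n in (1, 2, 3):
    N = 10 ** n
    print(N, is_transitive(inversive_law(200, N), N))

for n in (2, 3):
    N = 10 ** n
    print(N, is_transitive(inversive_law(100, N), N))
```

Both laws produce a single cycle through all residues for the moduli tried here; larger $n$ can be checked the same way at the cost of $10^n$ iterations.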
With these ideas, by using Exercise 9.36 in combination with Proposition 5.19 and Corollary 9.34, we immediately construct a number of different generators of these two kinds (inversive and exponential) that have the longest periods:

Exercise 13.10. Prove that the following generators with the law $f(x) \bmod N$ have the longest possible period, $N$:

• $f(x) = 1 + x + p^2 \cdot a^{b^x}$, $a \equiv b \equiv 1 \pmod p$ (doubly exponential generator),

• $f(x) = 1 + x + \frac{p^2}{1 + px}$ (inversive generator),

• $f(x) = 1 + x + p^2 \cdot (1 + px)^{\frac{1}{1+px}}$ (exponential-inversive generator).

Now we explain how, given a congruential generator with the recurrence law $f(x) \bmod N$, one can determine the length of a period of the generator. In view of Chinese Remainder Theorem 0.13, it suffices to consider only prime power moduli $N$. For $N = p^k$, $p$ prime, the idea is to reduce the problem of finding the period length to the problem of finding a closed subset of $\mathbb{Z}_p$ (usually a ball or a sphere) where a certain iterate $f^i(x)$ is ergodic.

For illustration, consider an exponential generator with the law $f(x) = a^x$, where $a \equiv 1 \pmod p$; i.e., $a = 1 + pz$ for some $z \in \mathbb{Z}_p$. It is clear that $f$ maps $\mathbb{Z}_p$ into the ball $B_{p^{-1}}(1) = 1 + p\mathbb{Z}_p$; so we can write $1 + p \cdot g(x) = (1 + pz)^x$ and then study the function $g(x)$. As $(1 + pz)^x = \sum_{i=0}^{\infty} p^i z^i \binom{x}{i}$ is the Mahler expansion for $a^x$, we see that $g(x) = zx + pz^2\binom{x}{2} + p^2z^3\binom{x}{3} + \cdots$. Whenever $z \not\equiv 0 \pmod p$, all p-adic spheres around 0 are invariant under the action of $g$, so the period will be the longest possible if $g$ is ergodic on the spheres $S_{p^{-r}}(0)$ around 0. Now we can apply Theorem 10.14 and Theorem 10.12 on ergodicity on spheres. From these Theorems we deduce that whenever $p \ne 2$, the derivative $g'(0)$ must be primitive modulo $p^2$; however, as $g'(0) \equiv z - \frac{p}{2}z^2 \pmod{p^2}$, and as $(1 - \frac{p}{2}z)^i \equiv 1 - i \cdot \frac{p}{2}z \pmod{p^2}$, the element $z - \frac{p}{2}z^2 = z \cdot (1 - \frac{p}{2}z)$ of the residue ring modulo $p^n$, $n \ge 2$, is primitive modulo $p^2$ whenever $z$ is primitive modulo $p^2$ (we remind the reader that 2 has a multiplicative inverse in $\mathbb{Z}_p$ whenever $p \ne 2$, so $\frac{p}{2} \in \mathbb{Z}_p$ in this case and the least non-negative residue of $\frac{p}{2}$ modulo $p^k$ is well defined). Now an easy calculation shows that $g^{p-1}(x) \equiv xz^{p-1} + \frac{p}{2}xz \pmod{p^2}$, or, which is the same, that $g^{p-1}(x) \equiv xz^{p-1} \cdot (1 + z\frac{p}{2}) \pmod{p^2}$; so $g^{p-1}(x)$ is ergodic on the ball $1 + p\mathbb{Z}_p$ by Corollary 9.34 and Theorem 8.1 (if $p > 3$).

Indeed, represent $x \in 1 + p\mathbb{Z}_p$ as $x = 1 + py$. We must show ergodicity of the transformation $w\colon y \mapsto \frac{1}{p}(g^{p-1}(1 + py) - 1)$ on $\mathbb{Z}_p$. As $w(y) = \left(\frac{z^{p-1} - 1}{p} + \frac{z}{2}\right) + \left(z^{p-1} + \frac{p}{2}\right)y$ is an affine transformation, in view of Theorem 8.1 $w$ is ergodic if and only if the following conditions hold simultaneously: $z^{p-1} + \frac{p}{2} \equiv 1 \pmod p$ (this holds as $z$ is primitive modulo $p^2$; whence, modulo $p$) and $\frac{z^{p-1} - 1}{p} + \frac{z}{2} \not\equiv 0 \pmod p$. Assuming $\frac{z^{p-1} - 1}{p} + \frac{z}{2} \equiv 0 \pmod p$, we obtain that $z^{p-1} \equiv 1 - \frac{p}{2}z \pmod{p^2}$, whence $z^p \equiv z - \frac{p}{2}z^2 \pmod{p^2}$. As $z - \frac{p}{2}z^2$ is primitive modulo $p^2$, $z^p$ must then be primitive modulo $p^2$. However, as $z$ is primitive modulo $p^2$, the multiplicative order of $z^p$ modulo $p^2$ is $p - 1$, and not $p(p-1)$. A contradiction.

Finally, using Exercise 10.10, we conclude that $g$ is ergodic on the sphere $S_1(0)$ of radius 1 around 0. This means, in particular, that the length of the shortest period of the exponential generator with the law $f(x) = (1 + pz)^x \bmod p^k$, where $p > 3$ and $z$ is primitive modulo $p^2$, is $(p-1)p^{k-2}$, for all $k = 2, 3, \ldots$.

Exercise 13.11. Determine the lengths of periods of the exponential generator in the remaining cases.

In a similar manner one may use van der Put series and Theorem 8.24 (rather than Mahler series and Theorem 8.12) to find lengths of periods of congruential generators, especially when the recurrence law of the generator is a T-function composed of the computer instructions which were defined in Subsection 1.2.1; cf. Exercises 8.27 and 8.28.


Techniques based on p-adic derivations

As was demonstrated above, the problem of determining whether a congruential generator (or, respectively, an explicit congruential generator) attains the longest period can be reduced to the problem of verifying whether given 1-Lipschitz transformations on $\mathbb{Z}_p$, for some prime $p$, are ergodic or, respectively, measure-preserving. In a number of practically interesting cases these transformations are differentiable, so we can apply the results of Sections 9.1 and 9.3 to check measure-preservation and ergodicity. The method is not as general as the techniques based on Mahler (or van der Put) expansions, since the class of functions it can be applied to is smaller; however, in a number of cases it is easier to calculate derivatives of compositions of functions than their Mahler expansions. Moreover, in the case $p = 2$ (which is one of the most important cases for applications) it turns out that when we limit our study to differentiable functions only, we actually do not make the class of measure-preserving functions under consideration smaller:

Proposition 13.12. If a 1-Lipschitz function $f\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ is measure-preserving then it is uniformly differentiable modulo 2, its derivative modulo 2 is 1 everywhere on $\mathbb{Z}_2$, and $N_1(f) = 1$.

Proof. Indeed, by Exercise 8.17, $f$ is measure-preserving if and only if $f(x) = c + x + 2 \cdot v(x)$, where $c \in \mathbb{Z}_2$ is a constant and $v\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ is a 1-Lipschitz transformation. Then $f(x + 2^kh) = c + x + 2^kh + 2 \cdot v(x + 2^kh) \equiv f(x) + 2^kh \pmod{2^{k+1}}$, as $2 \cdot v(x + 2^kh) \equiv 2 \cdot v(x) \pmod{2^{k+1}}$ since $v$ is 1-Lipschitz. Thus, $f$ is uniformly differentiable modulo 2, $f_1'(x) \equiv 1 \pmod 2$, and $N_1(f) = 1$ by Definition 2.27.

Thus, Proposition 13.12 implies that if the recurrence law of a congruential generator is not differentiable modulo 2 at some point of $\mathbb{Z}_2$, then the generator is not transitive modulo $2^n$ for all sufficiently large $n$ (actually, it is not even bijective modulo $2^n$ for these $n$). This also means that the corresponding explicit congruential generator does not achieve the maximum period length on $n$-bit words, for all sufficiently large $n$.

So, to determine whether the length of the shortest period of the explicit congruential generator with the law $y_i = f(i) \bmod 2^n$, $i = 1, 2, \ldots$, is $2^n$, we just use Theorem 9.1, which states that whenever $f$ is uniformly differentiable modulo 2, then $f$ is measure-preserving if and only if $f$ is bijective modulo $2^{N_1(f)}$ and $f_1'(x) \equiv 1 \pmod 2$ for all $x \in \mathbb{Z}/2^{N_1(f)}\mathbb{Z}$. Note that to determine whether the length of the shortest period of the congruential generator with the recurrence law $f \bmod 2^n$ is $2^n$, we should use Theorem 9.16, which demands that the function $f$ be uniformly differentiable modulo 4 rather than modulo 2.

Consider examples of congruential generators modulo $2^n$, both explicit and non-explicit, to illustrate the approach. Recall that an (explicit) congruential generator modulo $2^n$ attains the longest period if and only if its law is (bijective) transitive modulo $2^n$.

Exercise 13.13 (cf. [34]). An explicit congruential generator whose output function is one of the following attains the longest possible period, of length $2^n$ ($n = 1, 2, 3, \ldots$):

• $x \mapsto (x + 2x^2) \bmod 2^n$,

• $x \mapsto (x + (x^2 \,\mathrm{OR}\, 1)) \bmod 2^n$,

• $x \mapsto (x \,\mathrm{XOR}\, (x^2 \,\mathrm{OR}\, 1)) \bmod 2^n$.

Exercise 13.14 (cf. [47, 34]). Prove that an explicit polynomial generator whose output function is $P(x) = (a_0 + a_1x + \cdots + a_dx^d) \bmod 2^n$, where $n > 1$ and $a_0, a_1, \ldots, a_d$ are rational integers, attains the period of length $2^n$ if and only if $a_1$ is odd, $(a_2 + a_4 + \cdots)$ is even, and $(a_3 + a_5 + \cdots)$ is even. [Hint: Use Theorem 9.1.]

Exercise 13.15. Prove that an explicit generator whose output function is $x \mapsto P(x) = (a_0 \,\mathrm{XOR}\, a_1x \,\mathrm{XOR}\, \cdots \,\mathrm{XOR}\, a_dx^d) \bmod 2^n$, where $a_i \in \mathbb{Z}$, $i = 0, 1, 2, \ldots, d$, attains the period of length $2^n$ if and only if the $a_i$ satisfy the conditions of Exercise 13.14.

Exercise 13.16 (cf. [34]). A congruential generator whose recurrence law is $f(x) = (x + (x^2 \,\mathrm{OR}\, 5)) \bmod 2^n$ attains the period of length $2^n$, for all $n = 1, 2, \ldots$. [Hint: Use Theorem 9.16.]

Exercise 13.17. A congruential generator whose recurrence law is one of the following functions attains the period of length $2^n$, for all $n = 1, 2, 3, \ldots$:

• $f(x) = (x + (5x^2 \,\mathrm{OR}\, 5)) \bmod 2^n$,

• $f(x) = (x + (5^{x^2} \,\mathrm{OR}\, 5)) \bmod 2^n$,

• $f(x) = (x + (5^{-x^2} \,\mathrm{OR}\, 5)) \bmod 2^n$,

• $f(x) = (x + (5^{x^2} \,\mathrm{AND}\, (-5))) \bmod 2^n$,

• $f(x) = (5x + (5^{x^5} \,\mathrm{AND}\, (-5))) \bmod 2^n$,

• $f(x) = (5x + (5^{5^x} \,\mathrm{AND}\, (-5))) \bmod 2^n$,

• $f(x) = \bigl(x + \bigl((1 + 4 \cdot (x^2 \,\mathrm{AND}\, (-5)))^{(1 + 2 \cdot (x \,\mathrm{XOR}\, (-5)))^{-5}} \,\mathrm{OR}\, 5\bigr)\bigr) \bmod 2^n$.

[Hint: Just mimic the proof of Exercise 13.16.]

All these exercises demonstrate that the technique based on p-adic derivations can handle the case when recurrence laws of congruential generators are rather complicated compositions of both arithmetic and bitwise logical computer instructions.
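The statements of Exercises 13.13, 13.14 and 13.16 are easy to test by exhaustive search on short words. The sketch below is a sanity check, not a proof; the helper names are hypothetical, and the coefficient range used for the polynomial test is an arbitrary choice made here.

```python
from itertools import product

def is_bijective_mod(F, n):
    """True if x -> F(x) mod 2^n is a bijection on n-bit words."""
    N = 2 ** n
    return len({F(x) % N for x in range(N)}) == N

def is_transitive_mod(f, n):
    """True if x -> f(x) mod 2^n is a single cycle of length 2^n."""
    N = 2 ** n
    seen, x = set(), 0
    for _ in range(N):
        if x in seen:
            return False
        seen.add(x)
        x = f(x) % N
    return x == 0

# Output functions of Exercise 13.13.
explicit_laws = [
    lambda x: x + 2 * x * x,
    lambda x: x + ((x * x) | 1),
    lambda x: x ^ ((x * x) | 1),
]

def law_13_16(x):
    """Recurrence law of Exercise 13.16 before reduction modulo 2^n."""
    return x + ((x * x) | 5)

def parity_criterion(coeffs):
    """Conditions of Exercise 13.14 on the coefficients (a0, a1, ..., ad)."""
    a1_odd = len(coeffs) > 1 and coeffs[1] % 2 == 1
    even_sum = sum(coeffs[2::2]) % 2 == 0    # a2 + a4 + ...
    odd_sum = sum(coeffs[3::2]) % 2 == 0     # a3 + a5 + ...
    return a1_odd and even_sum and odd_sum

def poly(coeffs):
    return lambda x: sum(c * x ** i for i, c in enumerate(coeffs))

print([is_bijective_mod(F, 8) for F in explicit_laws])
print(is_transitive_mod(law_13_16, 8))
print(all(is_bijective_mod(poly(c), 3) == parity_criterion(c)
          for c in product(range(4), repeat=4)))
```

Each check confirms the corresponding exercise on the tested word lengths: the three explicit laws are bijective, the law of Exercise 13.16 is a single cycle, and bijectivity of cubic polynomials modulo 8 matches the parity conditions.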


Now we explain how, given a composite $N$, one should use the technique to construct various polynomial generators modulo $N$ that attain the longest period, of length $N$. It is clear that in view of Chinese Remainder Theorem 0.13 the problem can be reduced to the case when $N$ is a prime power, $N = p^n$. In the latter case we must first construct a transitive polynomial modulo $p$ and then lift it to a polynomial that is transitive modulo $p^n$. In view of Corollary 9.35, it is sufficient to lift a transitive polynomial over $\mathbb{F}_p$ to a transitive polynomial modulo $p^3$ in the case $p \in \{2, 3\}$, or, respectively, modulo $p^2$, if $p > 3$. Now we outline a procedure that, given a transitive transformation $\varphi$ on $\mathbb{F}_p$, returns a polynomial $\tilde{f}_\varphi(x) \in \mathbb{Z}[x]$ which is transitive modulo $p^n$ for all $n = 1, 2, 3, \ldots$, and such that $\tilde{f}_\varphi(x) \equiv \varphi(x) \pmod p$ for all $x \in \mathbb{F}_p$:

• Step 1: Consider an arbitrary transitive transformation $\varphi$ on $\mathbb{F}_p$ and represent $\varphi$ via the corresponding interpolation polynomial $f_\varphi(x) \in \mathbb{F}_p[x]$ according to interpolation formula (6). Note that $f_\varphi(x)$ can be (and will be) considered as a polynomial with rational integer coefficients.

• Step 2: Verify whether the polynomial $f_\varphi(x)$ is transitive modulo $p^3$ or modulo $p^2$, respectively, depending on whether $p \le 3$ or $p > 3$. If yes, $f_\varphi(x)$ is the ergodic polynomial $\tilde{f}_\varphi(x) \in \mathbb{Z}[x]$ we are seeking; otherwise go to the next step.

• Step 3: Note that in this case $p > 3$, since formula (6) gives $f_\varphi(x) = x + 1$ for $p = 2$, which is ergodic on $\mathbb{Z}_2$, and formula (6) gives either $f_\varphi(x) = x + 1$ or $f_\varphi(x) = x - 1$ for $p = 3$; both polynomials are ergodic on $\mathbb{Z}_3$. So it suffices to tweak the polynomial $f_\varphi(x)$ to make it transitive modulo $p^2$. We will do this with the use of Proposition 0.16. Denote $g_i = f_\varphi^i(0) \bmod p$; then the string $g_0, g_1, \ldots, g_{p-1}$ is a permutation of the string $0, 1, \ldots, p-1$. Note that $\varphi\colon g_i \mapsto g_{(i+1) \bmod p}$, $i = 0, 1, \ldots, p-1$, as $f_\varphi(x) \equiv \varphi(x) \pmod p$ and $\varphi$ is transitive on $\{0, 1, \ldots, p-1\}$. Take arbitrary $h_0, h_1, \ldots, h_{p-1} \in \{1, \ldots, p-1\}$ that satisfy the following two conditions:

$$h_0 \cdot h_1 \cdots h_{p-1} \equiv 1 \pmod p, \tag{13.3}$$

$$\sum_{i=0}^{p-2} h_i \cdot h_{i+1} \cdots h_{p-2} \equiv 0 \pmod p. \tag{13.4}$$

It is clear that choices of $h_0, h_1, \ldots, h_{p-1}$ that satisfy this system of congruences exist: For instance, $h_1 = \cdots = h_{p-2} = 1$, $h_0 = 2$, $h_{p-1} \equiv \frac{1}{2} \pmod p$ is one of the possible choices, as $p \ne 2$. (Note that condition (13.3) follows from Note 9.18, while condition (13.4) guarantees that the second term in (13.5) below is $p$ modulo $p^2$.) Now take the mapping $\psi\colon \mathbb{F}_p \to \mathbb{F}_p$ such that $\psi(g_i) = h_i$, $i = 0, 1, \ldots, p-1$, and construct a polynomial $f_{\varphi,\psi}(x)$ by Proposition 0.16; thus, $f_{\varphi,\psi}(x) \equiv \varphi(x) \pmod p$ and $f'_{\varphi,\psi}(x) \equiv \psi(x) \pmod p$ for $x \in \{0, 1, \ldots, p-1\}$. Consider $f_{\varphi,\psi}(x)$ as a polynomial over $\mathbb{Z}$ and verify whether $f_{\varphi,\psi}(x)$ is transitive modulo $p^2$. If yes, $f_{\varphi,\psi}(x)$ is the polynomial $\tilde{f}_\varphi(x) \in \mathbb{Z}[x]$ we need; otherwise go to Step 4.

• Step 4: Note that by Step 3, the derivative of the polynomial $f_{\varphi,\psi}(x)$ vanishes modulo $p$ nowhere on $\mathbb{Z}_p$, so $f_{\varphi,\psi}(x)$ is measure-preserving in view of Theorem 9.1; thus, $f_{\varphi,\psi}(x)$ is bijective modulo $p^2$. In view of Lemma 9.17, $f_{\varphi,\psi}^p(x) \equiv x \pmod{p^2}$, since otherwise $f_{\varphi,\psi}(x)$ would be transitive modulo $p^2$. Now put $\tilde{f}(x) = f_{\varphi,\psi}(x) + p$. We claim that $\tilde{f}$ is the polynomial $\tilde{f}_\varphi(x) \in \mathbb{Z}[x]$ we are seeking. Indeed, $\tilde{f}(x) \equiv f_{\varphi,\psi}(x) \equiv \varphi(x) \pmod p$ for all $x \in \mathbb{Z}_p$, by the construction; moreover, an easy induction on $j$ shows that

$$\tilde{f}^j(x) \equiv f_{\varphi,\psi}^j(x) + p \cdot \left(1 + \sum_{i=0}^{j-2} \prod_{k=i}^{j-2} f'_{\varphi,\psi}\bigl(f_{\varphi,\psi}^k(x)\bigr)\right) \pmod{p^2}. \tag{13.5}$$

However, the latter congruence implies that $\tilde{f}^p(0) \equiv p \pmod{p^2}$, as $f_{\varphi,\psi}^p(0) \equiv 0 \pmod{p^2}$ and $f'_{\varphi,\psi}(f_{\varphi,\psi}^k(0)) \equiv h_k \pmod p$ for all $k = 0, 1, \ldots, p-1$, see Step 3. Hence, $\tilde{f}(x)$ is transitive modulo $p^2$ in view of Lemma 9.17.

Note 13.18. The above procedure can obviously be modified to enumerate all polynomials that are transitive modulo $p^2$ (and even modulo $p^3$ for $p \le 3$) and thus (with the use of Proposition 3.15) to obtain a complete list of ergodic polynomials in explicit form. Note that there are exactly $(p-1)!$ pairwise distinct transitive transformations on $\mathbb{F}_p$. With the use of formula (6), every such transformation can be represented by a polynomial; however, no better description of transitive polynomials on $\mathbb{F}_p$ is known.

Exercise 13.19. Construct a polynomial generator that has a recurrence law $f \bmod 10^n$ and such that

• the length of the shortest period of the generator is $10^n$, for all $n = 1, 2, 3, \ldots$, and

• $f \bmod 5 = \varphi$, where $\varphi$ is a single cycle permutation, $\varphi = (0, 1, 4, 3, 2)$ (i.e., $\varphi(0) = 1$, $\varphi(1) = 4$, $\ldots$, $\varphi(2) = 0$).

Techniques based on algebraic normal forms

In the case when we need to determine whether a given congruential generator with the recurrence law $f \bmod 2^n$, where $f$ is a 1-Lipschitz transformation of $\mathbb{Z}_2$, has the longest period, we may use one more method, that of Theorem 8.5 from Subsection 8.2. Compared to the two methods presented above, the method based on Theorem 8.5 can be applied only to relatively simple compositions of arithmetic and bitwise logical instructions; nonetheless, some useful results can be obtained by this technique as well. We illustrate the method by examples; some of these are of practical value.

By using the ANF techniques, it is possible to construct very fast congruential generators, the so-called add-xor generators of the longest possible period.

Exercise 13.20 (Add-xor generator, [37]). Prove that a congruential generator whose recurrence law is $f(x) = (\ldots(((x + c_0) \,\mathrm{XOR}\, d_0) + \cdots + c_m) \,\mathrm{XOR}\, d_m) \bmod 2^n$ ($n \ge 2$) attains a period of length $2^n$ if and only if it attains a period of length 4 when $n = 2$.

With the use of the ANF technique it is possible to give a short proof of the main result of paper [34], see Example 13.21 below. Note that the recurrence laws of several cryptographic generators are based on the corresponding mapping.

Example 13.21 (cf. [34, Theorem 3]). The mapping $f(x) = x + (x^2 \,\mathrm{OR}\, C)$ over $n$-bit words is invertible if and only if the least significant bit of $C$ is 1. For $n \ge 3$ it is a permutation with a single cycle if and only if both the least significant bit and the third least significant bit of $C$ are 1.

Proof. We shall prove that the function $f(x) = x + (x^2 \,\mathrm{OR}\, C)$ is measure-preserving (respectively, ergodic) if and only if the conditions on $C$ stated above hold. Denote $c_i = \delta_i(C)$; for $x \in \mathbb{Z}_2$ and $i = 0, 1, 2, \ldots$ denote $\chi_i = \delta_i(x) \in \{0, 1\}$. To calculate the ANF of the Boolean function $\delta_i(x + (x^2 \,\mathrm{OR}\, C))$ in the variables $\chi_0, \chi_1, \ldots$, we start with the following easy claims:

• $\delta_0(x^2) = \chi_0$, $\delta_1(x^2) = 0$, $\delta_2(x^2) = \chi_0\chi_1 + \chi_1$,

• $\delta_n(x^2) = \chi_{n-1}\chi_0 + \psi_n(\chi_0, \ldots, \chi_{n-2})$ for all $n \ge 3$, where $\psi_n$ is a Boolean function in the $n - 1$ Boolean variables $\chi_0, \ldots, \chi_{n-2}$.
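The two claims about the bits of $x^2$, as well as the statement of Example 13.21 itself, can be confirmed by direct computation on short words before reading the proof. The following sketch is only a numerical sanity check; the helper names are hypothetical and the tested word lengths are an arbitrary choice made here.

```python
def bit(v, i):
    """The i-th binary digit of a non-negative integer v."""
    return (v // 2 ** i) % 2

def square_bit_claims_hold(max_n=8):
    """Check both claims on the low-order bits of x^2 for words up to max_n bits."""
    for x in range(2 ** max_n):
        c0, c1, sq = bit(x, 0), bit(x, 1), x * x
        if bit(sq, 0) != c0 or bit(sq, 1) != 0:
            return False
        if bit(sq, 2) != ((c0 * c1) ^ c1):
            return False
    # Second claim: flipping bit n-1 of x flips bit n of x^2 exactly when
    # bit 0 of x is 1 (the remaining part depends on lower bits only).
    for n in range(3, max_n + 1):
        half = 2 ** (n - 1)
        for x in range(half):
            if (bit((x + half) ** 2, n) ^ bit(x * x, n)) != bit(x, 0):
                return False
    return True

def law(C, n):
    """The mapping x -> x + (x^2 OR C) on n-bit words."""
    N = 2 ** n
    return lambda x: (x + ((x * x) | C)) % N

def is_invertible(f, N):
    return len({f(x) for x in range(N)}) == N

def is_single_cycle(f, N):
    seen, x = set(), 0
    for _ in range(N):
        if x in seen:
            return False
        seen.add(x)
        x = f(x)
    return x == 0

print(square_bit_claims_hold())
print([C for C in range(8) if is_single_cycle(law(C, 3), 8)])
```

On 3-bit words the single-cycle constants are exactly those with bits 0 and 2 set, in agreement with the example.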
The first of these claims can easily be verified by direct calculations. To prove the second one, represent $x = \bar x_{n-1} + 2^{n-1} s_{n-1}$ for $\bar x_{n-1} = x \bmod 2^{n-1}$ and calculate $x^2 = (\bar x_{n-1} + 2^{n-1} s_{n-1})^2 = \bar x_{n-1}^2 + 2^n s_{n-1}\bar x_{n-1} + 2^{2n-2} s_{n-1}^2 \equiv \bar x_{n-1}^2 + 2^n \chi_{n-1}\chi_0 \pmod{2^{n+1}}$ for $n \ge 3$, and note that $\bar x_{n-1}^2$ depends only on $\chi_0, \ldots, \chi_{n-2}$. This gives:
1. $\delta_0(x^2\ \mathrm{OR}\ C) = \chi_0 + c_0 + \chi_0 c_0$
2. $\delta_1(x^2\ \mathrm{OR}\ C) = c_1$
3. $\delta_2(x^2\ \mathrm{OR}\ C) = \chi_0\chi_1 + \chi_1 + c_2 + c_2\chi_1 + c_2\chi_0\chi_1$
4. $\delta_n(x^2\ \mathrm{OR}\ C) = \chi_{n-1}\chi_0 + \psi_n + c_n + c_n\chi_{n-1}\chi_0 + c_n\psi_n$ for $n \ge 3$

From here it follows that if $n \ge 3$ then $\delta_n(x^2\ \mathrm{OR}\ C) = \lambda_n(\chi_0, \ldots, \chi_{n-1})$ and $\deg \lambda_n \le n - 1$, since $\psi_n$ depends only on $\chi_0, \ldots, \chi_{n-2}$.

Now we successively calculate $\gamma_n = \delta_n(x + (x^2\ \mathrm{OR}\ C))$ for $n = 0, 1, 2, \ldots$. We have $\delta_0(x + (x^2\ \mathrm{OR}\ C)) = c_0 + \chi_0 c_0$, so necessarily $c_0 = 1$ since otherwise $f$ is not bijective modulo 2. Proceeding further with $c_0 = 1$ we obtain $\delta_1(x + (x^2\ \mathrm{OR}\ C)) = c_1 + \chi_0 + \chi_1$, since $\chi_0$ is the carry. Then $\delta_2(x + (x^2\ \mathrm{OR}\ C)) = (c_1\chi_0 + c_1\chi_1 + \chi_0\chi_1) + (\chi_0\chi_1 + \chi_1 + c_2 + c_2\chi_1 + c_2\chi_0\chi_1) + \chi_2 = c_1\chi_0 + c_1\chi_1 + \chi_1 + c_2 + c_2\chi_1 + c_2\chi_0\chi_1 + \chi_2$; here $c_1\chi_0 + c_1\chi_1 + \chi_0\chi_1$ is the carry. From here, in view of Theorem 8.5, we immediately deduce that $c_2 = 1$, since otherwise $f$ is not transitive modulo 8.

Now for $n \ge 3$ one has $\gamma_n = \alpha_n + \lambda_n + \chi_n$, where $\alpha_n$ is the carry, and $\alpha_{n+1} = \alpha_n\lambda_n + \alpha_n\chi_n + \lambda_n\chi_n$. But if $c_2 = 1$ then $\deg \alpha_3 = \deg(\mu\nu + \chi_2\mu + \chi_2\nu) = 3$, where $\mu = c_1\chi_0 + c_1\chi_1 + \chi_0\chi_1$ and $\nu = \chi_0\chi_1 + \chi_1 + c_2 + c_2\chi_1 + c_2\chi_0\chi_1 = 1$. This implies inductively, in view of Claim 4 above, that $\deg \alpha_{n+1} = n + 1$ and that $\gamma_{n+1} = \chi_{n+1} + \xi_{n+1}(\chi_0, \ldots, \chi_n)$, $\deg \xi_{n+1} = n + 1$. So the conditions of Theorem 8.5 are satisfied, thus finishing the proof of Theorem 3 from [34].

The ANF techniques turned out to be useful in a study of generators based on taking generalized multiplicative inverses, i.e., on the map $x \mapsto \mathrm{inv}_p(x)$, see (2.2).

Lemma 13.22. Let $p = 2$. Then the ANF of the $i$th coordinate function $\delta_i(\mathrm{inv}_2(x))$ is of the form $\delta_i(\mathrm{inv}_2(x)) = \chi_i \oplus \varphi_i(\chi_0, \ldots, \chi_{i-1})$, where $\chi_i = \delta_i(x)$, $\varphi_0 = 0$, and the weight of every Boolean function $\varphi_i(\chi_0, \ldots, \chi_{i-1})$ in Boolean variables $\chi_0, \ldots, \chi_{i-1}$ is even, $i = 0, 1, 2, \ldots$.

Note. Recall that the weight of the Boolean function $\varphi_i(\chi_0, \ldots, \chi_{i-1})$ in Boolean variables $\chi_0, \ldots, \chi_{i-1}$ is even if and only if its ANF does not contain the monomial $\chi_0 \cdots \chi_{i-1}$, see Theorem 8.5.

Proof.
As $\mathrm{inv}_2\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ is a 1-Lipschitz measure-preserving transformation on $\mathbb{Z}_2$, in view of equation (8.8) of Subsection 8.2 and of Theorem 8.5 the Boolean function $\delta_i(\mathrm{inv}_2(x))$ depends only on Boolean variables $\chi_0, \ldots, \chi_i$ and is linear with respect to the variable $\chi_i$: $\delta_i(\mathrm{inv}_2(x)) = \chi_i \oplus \varphi_i(\chi_0, \ldots, \chi_{i-1})$ for a suitable Boolean function $\varphi_i(\chi_0, \ldots, \chi_{i-1})$ in Boolean variables $\chi_0, \ldots, \chi_{i-1}$, for all $i = 0, 1, 2, \ldots$ (recall that a Boolean function on the empty set of variables is a constant). Now by induction on $i$ we prove that the weight of the Boolean function $\varphi_i(\chi_0, \ldots, \chi_{i-1})$ is even for all $i = 0, 1, 2, \ldots$; that is, the number of Boolean $i$-dimensional vectors on which $\varphi_i(\chi_0, \ldots, \chi_{i-1})$ takes value 1 is even.

Direct calculations show that $\mathrm{inv}_2(x) \equiv x \pmod{2^n}$ for $n = 1, 2, 3$; so $\varphi_0 = \varphi_1 = \varphi_2 = 0$. For $n = 4$ we have $\mathrm{inv}_2(x) \not\equiv x \pmod{2^n}$ if and only if $x$ is congruent to 3, 5, 11, or 13 modulo 16, so the weight of the Boolean function $\varphi_3(\chi_0, \chi_1, \chi_2)$ is 2. Let our claim be true for the Boolean functions $\varphi_0, \ldots, \varphi_{i-1}$; let us prove it for the Boolean function $\varphi_i(\chi_0, \ldots, \chi_{i-1})$.

For a Boolean function $\psi$ denote by $\bar\psi$ its negation; that is, $\bar\psi = \psi \oplus 1$. Now take an arbitrary $x \equiv 1 \pmod 2$ (in other words, put $\chi_0 = 1$) and consider $\delta_i(\mathrm{inv}_2(1 + \mathrm{NOT}(x)))$. Since $x = 1 + 2z$, where $z = \chi_1 + 2\cdot\chi_2 + 4\cdot\chi_3 + \cdots$, we have $\mathrm{inv}_2(1 + \mathrm{NOT}(x)) = (1 + 2\cdot\mathrm{NOT}(z))^{-1} = (1 - 2\cdot(1 + z))^{-1} = -(1 + 2z)^{-1} = 1 + \mathrm{NOT}((1 + 2z)^{-1})$ (we used the second formula from (1.4) during these conversions). It is obvious that if we denote $(1 + 2z)^{-1} = 1 + \sum_{j=1}^{\infty} 2^j \zeta_j$, where $\zeta_j \in \{0, 1\}$ ($j = 1, 2, \ldots$), then $1 + \mathrm{NOT}((1 + 2z)^{-1}) = 1 + \sum_{j=1}^{\infty} 2^j \bar\zeta_j$. For this reason, the just proven equality $(1 + 2\cdot\mathrm{NOT}(z))^{-1} = 1 + \mathrm{NOT}((1 + 2z)^{-1})$ implies that

$\varphi_i(1, \chi_1, \ldots, \chi_{i-1}) = \varphi_i(1, \bar\chi_1, \ldots, \bar\chi_{i-1})$,    (13.6)

for all $\chi_1, \ldots, \chi_{i-1} \in \{0, 1\}$, since $\zeta_i = \delta_i(\mathrm{inv}_2(x)) = \chi_i \oplus \varphi_i(\chi_0, \ldots, \chi_{i-1})$, $i = 1, 2, \ldots$. Hence the vectors with $\chi_0 = 1$ on which $\varphi_i$ takes value 1 split into pairs of mutually complementary vectors, so their number is even. Further, since $\mathrm{inv}_2(ab) = \mathrm{inv}_2(a)\cdot\mathrm{inv}_2(b)$ for all $a, b \in \mathbb{Z}_2$, we have $\mathrm{inv}_2(2\cdot z) = 2\cdot\mathrm{inv}_2(z)$, so $\varphi_i(0, \chi_1, \ldots, \chi_{i-1}) = \varphi_{i-1}(\chi_1, \ldots, \chi_{i-1})$; however, by the induction hypothesis, the weight of the Boolean function $\varphi_{i-1}(\chi_1, \ldots, \chi_{i-1})$ in Boolean variables $\chi_1, \ldots, \chi_{i-1}$ is even. This, together with equation (13.6), completes the induction and proves the Lemma.

Now we are able to prove the following Proposition, which gives rise to a large new family of inversive generators modulo $2^n$ that involve the function $\mathrm{inv}_2$ in their compositions and whose shortest periods are of length $2^n$:

Proposition 13.23. Let $f$ be any 1-Lipschitz transformation on $\mathbb{Z}_2$. If $f$ is ergodic, then both compositions $f(\mathrm{inv}_2(x))$ and $\mathrm{inv}_2(f(x))$ are ergodic. Vice versa, if either of the transformations $f(\mathrm{inv}_2(x))$ or $\mathrm{inv}_2(f(x))$ is ergodic, then $f$ is ergodic.

Proof. For $i = 0, 1, 2, \ldots$ denote $\delta_i(x) = \chi_i$. If $f$ is ergodic, then by Theorem 8.5,

$\delta_i(f(x)) = \chi_i \oplus \chi_0 \cdots \chi_{i-1} \oplus \psi_i(\chi_0, \ldots, \chi_{i-1})$,    (13.7)

where the ANF of the Boolean function $\psi_i(\chi_0, \ldots, \chi_{i-1})$ does not contain the monomial $\chi_0 \cdots \chi_{i-1}$, $\psi_0 = 0$, $i = 0, 1, 2, \ldots$ (we recall that the product over the empty set is 1). By Lemma 13.22,

$\delta_i(\mathrm{inv}_2(x)) = \chi_i \oplus \varphi_i(\chi_0, \ldots, \chi_{i-1})$,    (13.8)

where $\varphi_0 = 0$ and the ANF of the Boolean function $\varphi_i(\chi_0, \ldots, \chi_{i-1})$ does not contain the monomial $\chi_0 \cdots \chi_{i-1}$, $i = 0, 1, 2, \ldots$. Whence the ANF of the Boolean function $\delta_i(u(x))$, where $u(x)$ is either of the functions $f(\mathrm{inv}_2(x))$ or $\mathrm{inv}_2(f(x))$, is of the form

$\delta_i(u(x)) = \chi_i \oplus \chi_0 \cdots \chi_{i-1} \oplus \vartheta_i(\chi_0, \ldots, \chi_{i-1})$,    (13.9)

where the ANF of the Boolean function $\vartheta_i(\chi_0, \ldots, \chi_{i-1})$ does not contain the monomial $\chi_0 \cdots \chi_{i-1}$, $\vartheta_0 = 0$, $i = 0, 1, 2, \ldots$. Thus, by Theorem 8.5, both $f(\mathrm{inv}_2(x))$ and $\mathrm{inv}_2(f(x))$ are ergodic.

To prove the converse statement, note that if $f$ is not ergodic, then by Theorem 8.5 the ANF of some Boolean function $\delta_i(f(x))$ in representation (13.7) does not contain the monomial $\chi_0 \cdots \chi_{i-1}$. Thus, in view of (13.8), representation (13.9) of $\delta_i(u(x))$ does not contain the monomial $\chi_0 \cdots \chi_{i-1}$ either. Therefore $u(x)$ is not ergodic by virtue of Theorem 8.5.

From Proposition 13.23 the main result of [21] follows immediately: The length of the shortest period of the congruential generator with the recurrence law $(a\cdot\mathrm{inv}_2(x) + b) \bmod 2^n$ is $2^n$, $n \ge 2$, if and only if $a \equiv 1 \pmod 4$ and $b \equiv 1 \pmod 2$. Indeed, by Proposition 13.23, the transformation $a\cdot\mathrm{inv}_2(x) + b$ is ergodic on $\mathbb{Z}_2$ if and only if the polynomial $ax + b$ is ergodic on $\mathbb{Z}_2$; by Theorem 8.1, the latter holds if and only if $ax + b$ is transitive modulo 4, or, equivalently, if and only if $a \equiv 1 \pmod 4$ and $b \equiv 1 \pmod 2$.

More complex congruential generators can be constructed with the use of Proposition 13.23. For instance, the transformation $f(x) = 3\cdot\mathrm{inv}_2(x) + 3^{\mathrm{inv}_2(x)}$ is ergodic on $\mathbb{Z}_2$ (see Exercise 8.16); this transformation results in an inversive-exponential generator modulo $2^n$ with the shortest period of length $2^n$. In a similar way we conclude that the length of the shortest period of a more complicated exponential-inversive generator with the recurrence law $(\mathrm{inv}_2(1 + x) + 4\cdot(1 + \mathrm{inv}_2(2x))^{\mathrm{inv}_2(x)}) \bmod 2^n$ is also $2^n$ (cf. Exercise 9.36); the same holds for generators with recurrence laws $(\mathrm{inv}_2(2x^2) + \mathrm{inv}_2(7x) + 1) \bmod 2^n$ and $(\mathrm{inv}_2(2x^2 + 7x + 1)) \bmod 2^n$ (cf. Exercise 8.20), etc.

We conclude Subsection 13.2.2 with an open problem concerning congruential generators based on the function $\mathrm{inv}_p\colon \mathbb{Z}_p \to \mathbb{Z}_p$ for odd prime $p$.
The function $\mathrm{inv}_p(x)$ is infinitely many times differentiable on $\mathbb{Z}_p \setminus \{0\}$ (see Exercise 2.12); moreover, it is not difficult to see that $\mathrm{inv}_p(x)$ can be expressed via a Taylor power series at every point of $\mathbb{Z}_p$ except 0. Unfortunately, $\mathrm{inv}_p(x)$ is not a C-function (nor is it a B-function or an A-function). Thus, we cannot directly apply the corresponding theorems from Section 9.4 on ergodicity of compositions of functions if there is $\mathrm{inv}_p$ in a composition. So the following (somewhat informally posed) open question reads:

Open question 13.24. What compositions of the function $\mathrm{inv}_p$ with A-, B- or C-functions are ergodic on $\mathbb{Z}_p$, for odd prime $p$?

Note that the answer to the analogous question on measure-preservation is rather clear: e.g., it is obvious that whenever $f$ is 1-Lipschitz, then, as $\mathrm{inv}_p$ is measure-preserving, either of the compositions $f(\mathrm{inv}_p(x))$ and $\mathrm{inv}_p(f(x))$ is measure-preserving if and only if $f$ is measure-preserving.
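Example 13.21 and the criterion for the generator $(a\cdot\mathrm{inv}_2(x) + b) \bmod 2^n$ can be checked by brute force for small word sizes. The following Python sketch is only an illustration and not part of the proofs. The helper names (`is_single_cycle`, `klimov_shamir`, `inv2`, `inversive`) are hypothetical, and `inv2` assumes that the generalized inverse (2.2) sends $2^k u$ with odd $u$ to $2^k u^{-1}$ and 0 to 0, which agrees with the relation $\mathrm{inv}_2(2z) = 2\cdot\mathrm{inv}_2(z)$ used in the proof of Lemma 13.22. Python 3.8 or later is assumed for `pow(u, -1, m)`.

```python
def is_single_cycle(f, n):
    """True iff x -> f(x) mod 2**n is a permutation with one cycle of length 2**n."""
    m = 1 << n
    x = 0
    for step in range(1, m + 1):
        x = f(x) % m
        if x == 0:
            # The orbit of 0 closes after `step` iterations;
            # it is the whole residue ring only if step == 2**n.
            return step == m
    return False  # 0 is never revisited, so the map is not even bijective


def klimov_shamir(C):
    """Recurrence law of Example 13.21: x -> x + (x^2 OR C)."""
    return lambda x: x + ((x * x) | C)


def inv2(x, n):
    """Assumed generalized inverse modulo 2**n: 2^k * u -> 2^k * u^(-1), 0 -> 0."""
    m = 1 << n
    x %= m
    if x == 0:
        return 0
    k = (x & -x).bit_length() - 1   # number of trailing zero bits of x
    u = x >> k                      # odd part of x
    return (pow(u, -1, m) << k) % m


def inversive(a, b, n):
    """Recurrence law x -> a * inv2(x) + b of the inversive generator modulo 2**n."""
    return lambda x: a * inv2(x, n) + b


if __name__ == "__main__":
    for C in (1, 3, 4, 5, 7):
        print("C =", C, "single cycle mod 2^8:", is_single_cycle(klimov_shamir(C), 8))
    for a, b in ((1, 1), (5, 3), (3, 1), (5, 2)):
        print("a, b =", (a, b), "full period mod 2^8:", is_single_cycle(inversive(a, b, 8), 8))
```

The exhaustive orbit walk costs $2^n$ steps, so it is practical only for small $n$; the point of Theorem 8.5 is precisely that such a search is unnecessary.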

Chapter 14

Latin squares

This short chapter serves as one more example of how p-adic ergodic theory is applied to a 'non-dynamical' area of mathematics, combinatorics, namely, to the theory of Latin squares. We recall that a Latin square of order $P$ is a $P \times P$ matrix containing $P$ distinct symbols (usually denoted by $0, 1, \ldots, P - 1$) such that each row and each column of the matrix contains each symbol exactly once. A circulant matrix serves as a simple example of a Latin square. Here is a $6 \times 6$ one:

0 1 2 3 4 5
1 2 3 4 5 0
2 3 4 5 0 1
3 4 5 0 1 2
4 5 0 1 2 3
5 0 1 2 3 4

In algebra, Latin squares are also known as binary quasigroups: a quasigroup is an algebraic system on the set $A = \{0, 1, \ldots, P - 1\}$ with a single binary operation $*$ defined by a Cayley table which is a Latin square. Note that the operation $*$ is invertible with respect to each variable: given $a, b \in A$, each of the equations $a * y = b$ and $x * a = b$ has a unique solution. However, the operation $*$ need not be associative. In other words, a Latin square is a 2-variate mapping $f\colon A^2 \to A$, where $A = \{0, 1, \ldots, P - 1\}$, which is invertible (i.e., bijective) with respect to each variable.

Latin squares are used widely: for games (recall sudoku), and for more serious applications such as private communication networks (for password distribution), coding theory, some cryptographic algorithms (under the name of multipermutations), etc., see monographs [41, 18]. However, known methods (e.g., the ones from the mentioned books) may not work efficiently in some practical cases. For instance, a real problem is to write software that produces a number of large Latin squares; however, this is only a part of the problem. Another part of the problem is that in some constrained environments (e.g., in smart cards) it is impossible to store the whole matrix: given two numbers $a, b \in \{0, 1, \ldots, P - 1\}$, the software must calculate the $(a, b)$-th entry of the matrix on-the-fly. We apply p-adic ergodic theory to give a solution to this problem, in the following way.
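To make the 'on-the-fly' requirement concrete, a Latin square can be represented by a function that returns a single entry, with no matrix stored. The following Python sketch is a minimal illustration under that assumption; the names `is_latin_square` and `circulant` are hypothetical, and the exhaustive checker is meant only for small orders.

```python
def is_latin_square(entry, order):
    """Check that (a, b) -> entry(a, b) is a Latin square on {0, ..., order - 1}."""
    symbols = set(range(order))
    rows_ok = all({entry(a, b) for b in range(order)} == symbols
                  for a in range(order))
    cols_ok = all({entry(a, b) for a in range(order)} == symbols
                  for b in range(order))
    return rows_ok and cols_ok


def circulant(order):
    """Entry function of the circulant Latin square: (a, b) -> (a + b) mod order."""
    return lambda a, b: (a + b) % order


if __name__ == "__main__":
    square = circulant(6)
    for a in range(6):
        # Each row is computed entry by entry; nothing is precomputed.
        print(" ".join(str(square(a, b)) for b in range(6)))
    print(is_latin_square(square, 6))
```

Multiplication modulo 6, by contrast, fails the check, since the row of 0 is constant.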

14.1 The p-adic ergodic theory in design of Latin squares

According to Theorem 7.1, a bivariate 1-Lipschitz function $f\colon \mathbb{Z}_p^2 \to \mathbb{Z}_p$ is bijective modulo $p^k$ for all $k \in \mathbb{N}$ with respect to either variable if and only if $f$ is measure-preserving with respect to either variable. And Theorem 9.1 actually states that functions that are uniformly differentiable modulo $p$ are bijective modulo $p^k$ for all $k \in \mathbb{N}$ if and only if they are bijective modulo $p^k$ for some (in most cases, small) $k$. Note that polynomials with integer coefficients are uniformly differentiable functions; whence, they are uniformly differentiable modulo $p$. Also, polynomials are easily programmable functions, as they are just compositions of additions and multiplications. The idea is to use polynomials with integer coefficients to construct easily programmable Latin squares. Moreover, in the case $p = 2$ we can also add to numerical operations (addition and multiplication) some bitwise logical operators (e.g., XOR, AND, etc.) to construct measure-preserving functions, see Subsection 1.2.1.

So the main tool we use to construct easily programmable Latin squares is the following Corollary 14.1 of Theorem 9.1. We say that a bivariate 1-Lipschitz function $f\colon \mathbb{Z}_p^2 \to \mathbb{Z}_p$ is a Latin square modulo $p^k$ whenever the reduced mapping $\bar f = f \bmod p^k\colon \mathbb{Z}/p^k\mathbb{Z} \times \mathbb{Z}/p^k\mathbb{Z} \to \mathbb{Z}/p^k\mathbb{Z}$ is a Latin square on $A = \mathbb{Z}/p^k\mathbb{Z} = \{0, 1, \ldots, p^k - 1\}$.

Corollary 14.1 (of Theorem 9.1). A uniformly differentiable modulo $p$ triangular (i.e., 1-Lipschitz) function $f\colon \mathbb{Z}_p^2 \to \mathbb{Z}_p$ is a Latin square modulo $p^k$ for all $k = 1, 2, \ldots$ whenever $f$ is a Latin square modulo $p^{N_1(f)}$ and $\frac{\partial_1 f(u)}{\partial_1 x_i} \not\equiv 0 \pmod p$ for all $u \in (\mathbb{Z}/p^{N_1(f)}\mathbb{Z})^2$, $i = 1, 2$. Equivalent statement: if and only if $f$ is bijective modulo $p^{N_1(f)+1}$ with respect to either variable.

Proof. Indeed, in view of Theorem 9.1, the function $f$ is bijective modulo $p^k$ with respect to either variable if and only if $f$ is bijective modulo $p^{N_1(f)}$ with respect to either variable, and both $\frac{\partial_1 f(x,y)}{\partial_1 x}$ and $\frac{\partial_1 f(x,y)}{\partial_1 y}$ are 0 modulo $p$ nowhere; these conditions are equivalent to the bijectivity modulo $p^{N_1(f)+1}$ of the function $f$ with respect to either variable.

Exercise 14.2. Prove that given an arbitrary 1-Lipschitz function $v(x, y)$ (e.g., an arbitrary composition of numerical and bitwise logical operators, see Subsection 1.2.1) and an arbitrary integer $\gamma \in \mathbb{Z}$, the map

$f_{2^k}(x, y) = (x + y + \gamma + 2\cdot v(x, y)) \bmod 2^k$

is a Latin square on $2^k$ symbols for all $k = 1, 2, \ldots$.

Exercise 14.3. Given an arbitrary polynomial $v(x, y)$ with integer coefficients and an arbitrary rational integer $N > 2$ that has prime decomposition $N = 2^k\cdot 3^\ell \cdots p^r$, prove that the function $f_N(x, y) = (x + y + 2\cdot 3 \cdots p\cdot v(x, y)) \bmod N$ is a Latin square on $N$ symbols. (Hint: Use Chinese Remainder Theorem 0.1.)

Now we expand the underlying idea of the latter exercise. Actually, given arbitrary Latin squares $f_2, f_3, f_5, \ldots, f_p$ on $2, 3, 5, \ldots, p$ symbols, respectively (some primes may be absent), we can construct a bivariate polynomial $f(x, y)$ with integer coefficients so that $f(x, y) \equiv f_2(x, y) \pmod 2$, $f(x, y) \equiv f_3(x, y) \pmod 3$, $f(x, y) \equiv f_5(x, y) \pmod 5$, ..., $f(x, y) \equiv f_p(x, y) \pmod p$, and that $f(x, y) \bmod N$ is a Latin square on $N = 2^k\cdot 3^\ell \cdots p^r$ symbols, for all $k, \ell, \ldots, r \in \mathbb{N}$.

Theorem 14.4. Let $f_2(x, y), f_3(x, y), f_5(x, y), \ldots, f_p(x, y)$ be Latin squares on $2, 3, 5, \ldots, p$ symbols, respectively (some primes may be absent). There exists a polynomial with rational integer coefficients $g(x, y) \in \mathbb{Z}[x, y]$ such that every function $f_N(x, y) = (g(x, y) + 2\cdot 3 \cdots p\cdot v(x, y)) \bmod N$ is a Latin square on $N = 2^k\cdot 3^\ell \cdots p^r$ symbols, for all natural $k, \ell, \ldots, r$, and $f_N(x, y) \equiv f_q(x, y) \pmod q$ for all $q = 2, 3, 5, \ldots, p$. Here $v(x, y) \in \mathbb{Z}[x, y]$ is an arbitrary polynomial with rational integer coefficients.

Sketch proof. The key idea of the proof exploits the fact that every bivariate function $f_q\colon (\mathbb{Z}/q\mathbb{Z})^2 \to \mathbb{Z}/q\mathbb{Z}$, $q$ prime, can be represented by a polynomial with rational integer coefficients such that a derivative of this polynomial with respect to either variable defines a prescribed mapping of $\mathbb{Z}/q\mathbb{Z}$ into $\mathbb{Z}/q\mathbb{Z}$, see interpolation formula (6). That is, for every $f_q(x, y)$, $q \in \{2, 3, 5, \ldots, p\}$ (some primes may be absent), we construct a polynomial $g_q(x, y)$ such that $f_q(x, y) = g_q(x, y)$ for all $(x, y) \in (\mathbb{Z}/q\mathbb{Z})^2$. Then we use Chinese Remainder Theorem 0.1 to construct a polynomial $\tilde g(x, y) \in \mathbb{Z}[x, y]$ such that $\tilde g(x, y) \equiv g_q(x, y) \pmod q$ for all $q \in \{2, 3, 5, \ldots, p\}$ (with the respective primes absent). Then, with the use of Proposition 0.16, by adding new terms of the form $\frac{\bar N}{q}\cdot((x^q - x)\cdot u_q(x, y) + (y^q - y)\cdot v_q(x, y))$ to the polynomial $\tilde g(x, y)$, where $\bar N = 2\cdot 3\cdot 5 \cdots p$ (with the respective primes absent from the product), we construct a polynomial $g(x, y)$ such that $g(x, y) \equiv \tilde g(x, y) \pmod q$, $\frac{\partial g(x,y)}{\partial x} \not\equiv 0 \pmod q$ and $\frac{\partial g(x,y)}{\partial y} \not\equiv 0 \pmod q$ for all corresponding primes $q$ and all $(x, y) \in \mathbb{Z}^2$. Now a combination of Theorem 9.1 with the equivalent form of Chinese Remainder Theorem 0.13 proves Theorem 14.4.
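The statements of Exercises 14.2 and 14.3 are easy to test numerically for small orders. The Python sketch below is not a proof and only illustrates the constructions: the particular choices of $v(x, y)$ (a mix of multiplication, XOR and OR in `f_2k`, and a cubic polynomial in `f_N`) are arbitrary assumptions, and all function names are hypothetical.

```python
def is_latin_square(entry, order):
    """Check that (a, b) -> entry(a, b) is a Latin square on {0, ..., order - 1}."""
    symbols = set(range(order))
    return (all({entry(a, b) for b in range(order)} == symbols for a in range(order))
            and all({entry(a, b) for a in range(order)} == symbols for b in range(order)))


def radical(n):
    """Product of the distinct primes dividing n (the factor 2*3*...*p of Exercise 14.3)."""
    r, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            r *= p
            while n % p == 0:
                n //= p
        p += 1
    return r * n if n > 1 else r


def f_2k(x, y, k, gamma=0):
    """Exercise 14.2 with an arbitrarily chosen 1-Lipschitz v built from *, XOR, OR."""
    v = (x * y) ^ (x | y)
    return (x + y + gamma + 2 * v) % (1 << k)


def f_N(x, y, N):
    """Exercise 14.3 with an arbitrarily chosen integer polynomial v."""
    v = x * x * y + 3 * x * y ** 3 + 7
    return (x + y + radical(N) * v) % N


if __name__ == "__main__":
    for k in range(1, 7):
        print(2 ** k, is_latin_square(lambda x, y: f_2k(x, y, k, gamma=5), 1 << k))
    print(72, is_latin_square(lambda x, y: f_N(x, y, 72), 72))
```

Each entry is computed from $(x, y)$ alone with a handful of machine operations, which is what the smart-card setting described in the chapter introduction requires.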



Exercise 14.5. Complete the proof of Theorem 14.4.

Note that Theorem 14.4 not only states the existence of the polynomial $g(x, y)$ but also gives a method to construct it explicitly, as both Proposition 0.16 and Chinese Remainder Theorem 0.1 are constructive. For example, let us construct with the use of Theorem 14.4 a Latin square on $10^n$ symbols. We skip the first step, the construction of the respective interpolation polynomials for Latin squares on 2 and 5 symbols, as this procedure is clear from interpolation formula (6); we assume that these Latin squares are already represented by bivariate polynomials¹: $f_2(x, y) = x + y$ and $f_5(x, y) = 1 + 3x^2 + y$. We see that $f_5(x, y) \equiv f_2(x, y) + 1 \pmod 2$; so we only must 'tweak' the constant term (note that in the general case we would use Chinese Remainder Theorem 0.1 here): we put $\tilde g(x, y) = 6 + 3x^2 + y$, as $6 \equiv 1 \pmod 5$ and $6 \equiv 0 \pmod 2$. Then, as $\frac{\partial \tilde g(x,y)}{\partial x} = 6x$ and $\frac{\partial \tilde g(x,y)}{\partial y} = 1$, we must find a tweak $g(x, y)$ of $\tilde g(x, y)$ that makes the partial derivative $\frac{\partial g(x,y)}{\partial x}$ non-zero both modulo 2 and modulo 5 everywhere on $\mathbb{Z}/2\mathbb{Z}$ and $\mathbb{Z}/5\mathbb{Z}$, respectively; however, this tweak must change $\tilde g(x, y)$ neither modulo 2 nor modulo 5; that is, $\tilde g(x, y) \equiv g(x, y) \pmod 2$ and $\tilde g(x, y) \equiv g(x, y) \pmod 5$ must hold for all $(x, y) \in \mathbb{Z}^2$. Let us tweak $\tilde g(x, y)$ so that, say, $\frac{\partial g(x,y)}{\partial x} \equiv 1 \pmod 2$ everywhere on $\mathbb{Z}/2\mathbb{Z}$ and $\frac{\partial g(x,y)}{\partial x} \equiv 4 \pmod 5$ everywhere on $\mathbb{Z}/5\mathbb{Z}$. For this purpose, according to the formula from Proposition 0.16, we put

$g(x, y) = 6 + 3x^2 + y + 6(x^5 - x)(x + 1) + 5(x^2 - x) = y + 6 - 11x + 2x^2 + 6x^5 + 6x^6$.

That is, $f(x, y) = g(x, y) + 10\cdot v(x, y)$, where $v(x, y)$ is an arbitrary polynomial over $\mathbb{Z}$. Both $g(x, y) \bmod 10^n$ and $f(x, y) \bmod 10^n$ are Latin squares modulo $10^n$ for every $n = 1, 2, 3, \ldots$.
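The arithmetic behind the tweak of $\tilde g$ can be checked mechanically. The following Python sketch (all function names are hypothetical) verifies only three things: that the expanded form of $g$ agrees with the unexpanded one, that $g$ agrees with $\tilde g$ pointwise modulo 2 and modulo 5, and that $\partial g/\partial x$ takes the residues 1 and 4, respectively. It does not by itself test the Latin square property.

```python
def g_tilde(x, y):
    """The polynomial before the tweak: 6 + 3x^2 + y."""
    return 6 + 3 * x ** 2 + y


def g_unexpanded(x, y):
    """g as written via the correction terms of the tweak."""
    return 6 + 3 * x ** 2 + y + 6 * (x ** 5 - x) * (x + 1) + 5 * (x ** 2 - x)


def g(x, y):
    """g in expanded form: y + 6 - 11x + 2x^2 + 6x^5 + 6x^6."""
    return y + 6 - 11 * x + 2 * x ** 2 + 6 * x ** 5 + 6 * x ** 6


def dg_dx(x):
    """Formal partial derivative of the expanded g with respect to x."""
    return -11 + 4 * x + 30 * x ** 4 + 36 * x ** 5


if __name__ == "__main__":
    grid = range(-5, 6)
    print(all(g(x, y) == g_unexpanded(x, y) for x in grid for y in grid))
    for q in (2, 5):
        same = all((g(x, y) - g_tilde(x, y)) % q == 0
                   for x in range(q) for y in range(q))
        print(q, same, sorted({dg_dx(x) % q for x in range(q)}))
```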

14.2 Orthogonal Latin squares

Now consider how p-adic ergodic theory may be of use in constructing mutually orthogonal Latin squares. Recall that two $P \times P$ Latin squares are said to be orthogonal if, when the squares are superimposed, each of the $P^2$ ordered pairs of symbols appears exactly once. Here is an example of a pair of orthogonal Latin squares on 3 symbols. The Latin squares

0 1 2      0 1 2
1 2 0      2 0 1
2 0 1      1 2 0

are orthogonal since after we superimpose them, we get a square

(0, 0) (1, 1) (2, 2)
(1, 2) (2, 0) (0, 1)
(2, 1) (0, 2) (1, 0)

where all pairs are different.

¹ The reader may verify by direct calculations that both $f_2(x, y)$ and $f_5(x, y)$ are Latin squares on $\mathbb{Z}/2\mathbb{Z}$ and $\mathbb{Z}/5\mathbb{Z}$, respectively.

Mutually orthogonal Latin squares are used in experiment design to provide consistent testing of samples, as well as in cryptography (e.g., as block mixers for block ciphers, and as cipher combiners), etc. For instance, consider three programs which must be tested on each of three platforms. To run all these 9 tests, we must have a sort of schedule. We can make a schedule using the just mentioned example of orthogonal Latin squares of order 3. Namely, the table of pairs of superimposed squares gives us a schedule: columns give us days of testing, the first number in a pair is the number of a platform, the second number is the number of a program. As the pair (0, 2) occurs in the second column, this means that program No 2 must be tested on platform No 0 on the second day.

The reasons for using methods of p-adic ergodic theory to construct mutually orthogonal Latin squares are similar to those of the preceding section: there is no problem in constructing a pair of small mutually orthogonal Latin squares; the problem is to create software that produces pairs of large Latin squares, and that does so in a somewhat 'pseudorandom' way². Here we explain the corresponding method; it again utilises Theorem 9.1.

We say that a pair of bivariate 1-Lipschitz functions $f, g\colon \mathbb{Z}_p^2 \to \mathbb{Z}_p$ are mutually orthogonal Latin squares modulo $p^k$ whenever the reduced mappings $\bar f = f \bmod p^k\colon \mathbb{Z}/p^k\mathbb{Z} \times \mathbb{Z}/p^k\mathbb{Z} \to \mathbb{Z}/p^k\mathbb{Z}$ and $\bar g = g \bmod p^k\colon \mathbb{Z}/p^k\mathbb{Z} \times \mathbb{Z}/p^k\mathbb{Z} \to \mathbb{Z}/p^k\mathbb{Z}$ constitute a pair of orthogonal Latin squares on $A = \mathbb{Z}/p^k\mathbb{Z} = \{0, 1, \ldots, p^k - 1\}$. The basis of the method is the following assertion:

Corollary 14.6 (of Theorem 9.1). Let $g, f\colon \mathbb{Z}_p^2 \to \mathbb{Z}_p$ be uniformly differentiable modulo $p$ 1-Lipschitz functions, and let $f$ and $g$ be Latin squares modulo $p^k$ for all $k = 1, 2, \ldots$ (cf. Corollary 14.1). These Latin squares are orthogonal modulo $p^k$ for all $k = 1, 2, \ldots$ if and only if the function $F(x, y) = (f(x, y), g(x, y))\colon \mathbb{Z}_p^2 \to \mathbb{Z}_p^2$ preserves measure.
This holds if and only if $f$ and $g$ are orthogonal modulo $p^k$ for some $k \ge \max\{N_1(f), N_1(g)\}$, and

$$\det\begin{pmatrix} \dfrac{\partial_1 f(x,y)}{\partial_1 x} & \dfrac{\partial_1 g(x,y)}{\partial_1 x} \\[2ex] \dfrac{\partial_1 f(x,y)}{\partial_1 y} & \dfrac{\partial_1 g(x,y)}{\partial_1 y} \end{pmatrix} \not\equiv 0 \pmod p$$

for all $(x, y) \in (\mathbb{Z}/p^{N_1(F)}\mathbb{Z})^2$.

Proof. From the definition of orthogonal Latin squares it immediately follows that a necessary and sufficient condition for orthogonality modulo $p^k$ is bijectivity of $F$ modulo $p^k$; so the Latin squares are orthogonal modulo $p^k$ for all $k = 1, 2, 3, \ldots$ if and only if $F$ is measure-preserving, see Theorem 7.1. Now the conclusion follows from Theorem 9.1.

² Problems of this kind often arise in genetics, quantitative biology, chemistry, etc., see [18].



Note that Corollary 14.6 gives no method to construct pairs of orthogonal Latin squares on $2^k$ symbols: from Corollaries 14.1 and 14.6 it immediately follows that for $p = 2$ no pair of functions $f$ and $g$ satisfies Corollary 14.6. Indeed, from Corollary 14.1 it follows that, as each of the functions $f$ and $g$ is a Latin square modulo $2^k$, every partial derivative modulo 2 of both $f$ and $g$ must be 1; however, this implies that the determinant from Corollary 14.6 is zero modulo 2.

However, for $p \ne 2$, Corollary 14.6 implies a method to construct large orthogonal Latin squares out of small orthogonal Latin squares. For instance, let $p = 3$, and let

$$f(x, y) \bmod 3 = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 2 & 0 \\ 2 & 0 & 1 \end{pmatrix}, \qquad g(x, y) \bmod 3 = \begin{pmatrix} 0 & 1 & 2 \\ 2 & 0 & 1 \\ 1 & 2 & 0 \end{pmatrix}$$

be a pair of orthogonal Latin squares of order 3 each. Then, given arbitrary polynomials $v(x, y), w(x, y) \in \mathbb{Z}_3[x, y]$, the functions $f(x, y) = x + y + 3\cdot v(x, y)$ and $g(x, y) = 2x + y + 3\cdot w(x, y)$ define a pair of orthogonal Latin squares modulo $3^k$, for all $k = 1, 2, \ldots$, since

$$\det\begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix} \equiv 2 \pmod 3.$$

For the same reason, given a set $\mathcal{P}$ of odd primes and arbitrary polynomials $v(x, y), w(x, y) \in \mathbb{Z}[x, y]$, the following two Latin squares are orthogonal modulo $P$ for every $P$ such that all prime factors of $P$ are in $\mathcal{P}$:

$f(x, y) = x + y + \Pi\cdot v(x, y)$;  $g(x, y) = -x + y + \Pi\cdot w(x, y)$,

where $\Pi = \prod_{p \in \mathcal{P}} p$.

In the same fashion, Theorem 14.4 can be re-stated for pairs of orthogonal Latin squares, and a method of constructing a pair of orthogonal Latin squares on $P$ symbols for large composite odd $P$ can be derived from this theorem as well. Namely, given $N$ pairs of orthogonal Latin squares on $p_1, \ldots, p_N$ symbols ($p_i$ prime, $i = 1, 2, \ldots, N$), we construct $N$ pairs of bivariate mappings $f_1(x, y), \ldots, f_N(x, y)$ and $g_1(x, y), \ldots, g_N(x, y)$ modulo $p_1, \ldots, p_N$, respectively, such that every pair $f_i(x, y)$ and $g_i(x, y)$ represents the $i$-th pair of given orthogonal Latin squares on $p_i$ symbols. For this purpose we apply interpolation formula (6). Then, using Chinese Remainder Theorem 0.1, we construct two bivariate polynomials $f(x, y)$ and $g(x, y)$ with rational integer coefficients such that $f(x, y) \equiv f_i(x, y) \pmod{p_i}$ and $g(x, y) \equiv g_i(x, y) \pmod{p_i}$, for all $i = 1, 2, \ldots, N$. After that, with the use of the method from Proposition 0.16, we tweak the polynomials $f(x, y)$ and $g(x, y)$ so that their partial derivatives satisfy the conditions of Corollaries 14.1 and 14.6, in the manner described in the proof of Theorem 14.4 and in the text thereafter. We leave the details to the reader as an exercise.
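The two explicit families of orthogonal pairs given above can be tested by exhaustive search for small orders. The Python sketch below is an illustration only: the polynomials `v` and `w` are arbitrary assumptions, and the helper names are hypothetical. Orthogonality is tested exactly as in the proof of Corollary 14.6, as bijectivity of the map $(x, y) \mapsto (f(x, y), g(x, y))$ on the residue ring.

```python
def is_latin_square(entry, order):
    """Check that (a, b) -> entry(a, b) mod order is a Latin square."""
    symbols = set(range(order))
    return (all({entry(a, b) % order for b in range(order)} == symbols
                for a in range(order))
            and all({entry(a, b) % order for a in range(order)} == symbols
                    for b in range(order)))


def are_orthogonal(f, g, order):
    """Superimpose two squares: all order**2 ordered pairs must be distinct."""
    pairs = {(f(x, y) % order, g(x, y) % order)
             for x in range(order) for y in range(order)}
    return len(pairs) == order * order


def v(x, y):
    return x * x * y + 1        # arbitrary integer polynomial (assumption)


def w(x, y):
    return x * y ** 3 + x       # arbitrary integer polynomial (assumption)


def f3(x, y):
    return x + y + 3 * v(x, y)


def g3(x, y):
    return 2 * x + y + 3 * w(x, y)


def make_pair(pi):
    """Pair x + y + pi*v and -x + y + pi*w for a product pi of odd primes."""
    return (lambda x, y: x + y + pi * v(x, y),
            lambda x, y: -x + y + pi * w(x, y))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(3 ** k, are_orthogonal(f3, g3, 3 ** k))
    f15, g15 = make_pair(15)
    print(45, are_orthogonal(f15, g15, 45))
```

For order $3^k$ the search costs $9^k$ evaluations, so it is feasible only for small $k$; Corollary 14.6 reduces the question to a check modulo a small power of $p$ together with the determinant condition.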



Concluding the section, we stress that the presented techniques can obviously be used to construct Latin squares (and mutually orthogonal Latin squares) out of arbitrary uniformly differentiable (modulo some $p^k$) functions, and not necessarily out of polynomials; e.g., out of rational functions, analytic functions, etc., if needed.

Bibliography

[1] Charalambos D. Aliprantis and Owen Burkinshaw. Principles of real analysis. Academic Press, Inc., third edition, 1998.
[2] J.-P. Allouche and J. Shallit. Automatic Sequences. Theory, Applications, Generalizations. Cambridge Univ. Press, 2003.
[3] R. C. Alperin. p-adic binomial coefficients mod p. The Amer. Math. Month., 92(8):576–578, 1985.
[4] Y. Amice. Interpolation p-adique. Bull. Soc. Math. France, 92:117–180, 1964.
[5] V. Anashin and A. Khrennikov. Applied Algebraic Dynamics, volume 49 of de Gruyter Expositions in Mathematics. Walter de Gruyter GmbH & Co., Berlin–N.Y., 2009.
[6] V. S. Anashin. Uniformly distributed sequences of p-adic integers. Discrete Math. Appl., 12(6):527–590, 2002.
[7] V. S. Anashin, A. Yu. Khrennikov, and E. I. Yurova. Characterization of ergodicity of p-adic dynamical systems by using van der Put basis. Doklady Mathematics, 83(3):306–308, 2011.
[8] Vladimir Anashin. Automata finiteness criterion in terms of van der Put series of automata functions. p-Adic Numbers, Ultrametric Analysis and Applications, 4(2):151–160, 2012.
[9] Vladimir Anashin. The non-Archimedean theory of discrete systems. Math. Comp. Sci., 6(4):375–393, 2012.
[10] Vladimir Anashin, Andrei Khrennikov, and Ekaterina Yurova. Using van der Put basis to determine if a 2-adic function is measure-preserving or ergodic w.r.t. Haar measure. In Advances in non-Archimedean analysis, volume 551 of Contemporary Mathematics, pages 33–38. American Mathematical Society, Providence, RI, 2011.
[11] Vladimir Anashin, Andrei Khrennikov, and Ekaterina Yurova. Ergodicity criteria for non-expanding transformations of 2-adic spheres. Discrete and Continuous Dynamical Systems, 34(2):367–377, 2014.
[12] Vladimir Anashin, Andrei Khrennikov, and Ekaterina Yurova. T-functions revisited: new criteria for bijectivity/transitivity. Designs, Codes, and Cryptography, 71(3):383–407, 2014.
[13] T. M. Apostol. Introduction to Analytic Number Theory. Springer-Verlag, Berlin, New York, Heidelberg, 1976.
[14] L. M. Arkhipov. Finite principal ideal rings. Math. Notes, 12:656–659, 1973.
[15] W. Brauer. Automatentheorie. B. G. Teubner, Stuttgart, 1984.
[16] J. Bryk and C. E. Silva. Measurable dynamics of simple p-adic polynomials. Amer. Math. Monthly, 112(3):212–232, 2005.
[17] P.-J. Cahen and J.-L. Chabert. Integer-Valued Polynomials, volume 48 of Math. Surv. and Monogr. Amer. Math. Soc., 1997.
[18] J. Dénes and A. D. Keedwell. Latin squares. North-Holland, Amsterdam, 1991.
[19] D. L. Desjardins and M. E. Zieve. On the structure of polynomial mappings modulo an odd prime power. Available from http://arxiv.org/abs/math/0103046, 2001.
[20] J. Eichenauer, J. Lehn, and A. Topuzoğlu. A nonlinear congruential pseudorandom number generator with power of two modulus. Math. Comp., 51:757–759, 1988.
[21] J. Eichenauer-Herrmann and H. Grothe. A new inversive congruential pseudorandom number generator with power of two modulus. ACM Trans. Modelling and Computer Simulation, 2:1–11, 1992.
[22] J. Eichenauer-Herrmann, E. Herrmann, and S. Wegenkittl. A survey of quadratic and inversive congruential pseudorandom numbers. In P. Hellekalek, G. Larcher, H. Niederreiter, and P. Zinterhof, editors, Monte Carlo and Quasi-Monte Carlo Methods 1996, volume 127 of Lecture Notes in Statistics, pages 66–97, N.Y., 1998. Springer.
[23] G. Everest, A. van der Poorten, I. Shparlinski, and T. Ward. Recurrence Sequences, volume 104 of American Mathematical Society Surveys. American Mathematical Society, 2003.
[24] A. Gill. Introduction to the theory of finite-state machines. McGraw-Hill Inc., 1963.
[25] F. Q. Gouvêa. p-adic Numbers, An Introduction. Springer-Verlag, Berlin–Heidelberg–New York, second edition, 1997.
[26] R. L. Graham, D. E. Knuth, and O. Patashnik. Concrete Mathematics: A Foundation for Computer Science. Addison–Wesley, Reading, Ma., second edition, 1998.
[27] V. M. Gundlach, A. Yu. Khrennikov, and K.-O. Lindahl. Ergodicity on p-adic sphere. In German Open Conference on Probability and Statistics, pages 15–21, Hamburg, 2000. University of Hamburg Press.
[28] M. Hall. Combinatorial Theory. Blaisdell, Waltham, Mass., 1967.
[29] R. R. Hall. On pseudo-polynomials. Arch. Math., 18:71–77, 1971.
[30] B. Hasselblatt and A. Katok, editors. Handbook of Dynamical Systems, volume 1A. Elsevier Science B. V., Amsterdam, 2002.
[31] R. E. Kalman, P. L. Falb, and M. A. Arbib. Topics in mathematical system theory. McGraw-Hill, N. Y., 1969.
[32] T. Kato, L.-M. Wu, and N. Yanagihara. On a nonlinear congruential pseudorandom number generator. Math. Comp., 65:227–233, 1996.
[33] A. Yu. Khrennikov. Non-Archimedean Analysis: Quantum Paradoxes, Dynamical Systems and Biological Models. Kluwer Academic Publishers, Dordrecht, 1997.
[34] A. Klimov and A. Shamir. A new class of invertible mappings. In B. S. Kaliski Jr. et al., editors, Cryptographic Hardware and Embedded Systems 2002, volume 2523 of Lect. Notes in Comp. Sci., pages 470–483. Springer-Verlag, 2003.
[35] D. Knuth. The Art of Computer Programming, volume 2: Seminumerical Algorithms. Addison-Wesley, third edition, 1997.
[36] N. Koblitz. p-adic numbers, p-adic analysis, and zeta-functions, volume 58 of Graduate texts in math. Springer-Verlag, second edition, 1984.
[37] L. Kotomina. Fast nonlinear congruential generators. Master's thesis, Russian State University for the Humanities, Moscow, 1999. (in Russian).
[38] L. Kuipers and H. Niederreiter. Uniform Distribution of Sequences. John Wiley & Sons, N.Y. etc., 1974.
[39] M. V. Larin. Transitive polynomial transformations of residue class rings. Discrete Mathematics and Applications, 12(2):141–154, 2002.
[40] Hans Lausch and Wilfried Nöbauer. Algebra of Polynomials. North-Holl. Publ. Co, American Elsevier Publ. Co, 1973.
[41] C. F. Laywine and G. L. Mullen. Discrete mathematics using Latin squares. John Wiley & Sons, Inc., New York, 1998.
[42] K. Mahler. p-adic numbers and their functions. Cambridge Univ. Press, 1981. (2nd edition).
[43] B. R. McDonald. Finite Rings with Identity. Marcel Dekker, N.Y., 1974.
[44] M. Nagata. Local Rings, volume 13. Interscience tracts in pure and applied mathematics, N.Y.–London, 1962.
[45] A. A. Nechaev. Finite rings with applications. In M. Hazewinkel, editor, Handbook of Algebra, volume 5, pages 213–320. Elsevier B.V., 2008.
[46] H. Niederreiter. Random number generation and quasi-Monte Carlo methods. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1992.
[47] R. Rivest. Permutation polynomials modulo $2^w$. Finite Fields and Appl., 7(2):287–292, 2001.
[48] W. H. Schikhof. Ultrametric calculus. An introduction to p-adic analysis. Cambridge University Press, Cambridge, 1984.
[49] Tao Shi, Vladimir Anashin, and Dongdai Lin. Linear weaknesses in T-functions. In T. Helleseth and J. Jedwab, editors, SETA 2012, volume 7280 of Lecture Notes Comp. Sci., pages 279–290, Berlin–Heidelberg, 2012. Springer-Verlag.
[50] Tao Shi, Vladimir Anashin, and Dongdai Lin. Fast evaluation of T-functions via time-memory trade-offs. In Information Security and Cryptology, volume 7763 of Lecture Notes in Computer Science, pages 263–275, Berlin–Heidelberg, 2013. Springer.

Index 1-Lipschitz function (p-adic), 22, 27, clopen set, 14 39 compatibility, 39 –with epimorphism, 72 A-function, 65 –with respect to ideal, 72 adding machine, 185 compatible function (p-adic), 22, 39 algebraic normal form, ANF, 88 complexity of a sequence analytic function (p-adic), 33 – linear, 145 – locally, of order r, 55 computer’s instructions, 18 Archimedean Axiom, 12 configuration space, 68 automaton, 153 congruential generator, 185 – Mealy, 153 – Lehmer, 186 – Tue-Morse, 159 – add-xor, 195 – generator, 153 – explicit, 185 – initial, 153 – inversive, 187 – reachable state of, 154 – linear, 186 – subautomaton of, 154 – multiplicative, 186 – initial state of, 153 – power, 187 – input alphabet of, 153 – recurrence law of, 185 – of measure 0, 166 continuous function (p-adic), 21 – of measure 1, 166 coordinate function, 17, 40 – output alphabet of, 153 – output function of, 153 derivative (p-adic), 23 – set of states of, 153 – modulo pk , 26, 28 – state transition function of, 153 – integer-valued, 28 – transducer, 153 determinate function, 41 – reversible, 157 difference, Δ, 3 automaton function, 41, 155 differentiable function (p-adic), 23, 28 – modulo pk , 26, 28 B-function, 59 – uniformly, 28 balanced map, 71 dynamical system, 68 Banach space, 62 – minimal, 70 Borel measure, 74 – topological, 68 Borel set, 74 – topologically transitive, 70 bounded p-adic function, 62 ergodic C-function, 56 – family of maps, 161 characteristic function of a ball, 38 211

– map, 69
– uniquely, 71
ergodic (p-adic) function, 75
Euler’s totient function, 4
exponential B-function, 64
falling (factorial) power, 2
falling power series, 58
field, 5
– primitive element of, 7
– quotient, 5
filter, 180
finite intersection property, 15
fixed point, 69
formal power series, 6
generator, 153
generator (automaton)
– counter-dependent, 181
Haar measure (on Z_p^n), 74
Hensel’s Lemma, 25
identity modulo p^k, 36
induced metric (by a norm), 62
integer-valued function, 5, 39
integer-valued polynomial, 5
integral domain, 5
invariant measure, 69
invariant subset, 69
– proper, 69
key, 179
keystream, 179
lacuna, 167
Latin square, 200
– modulo p^k, 201
– orthogonality modulo p^k, 204
linear complexity (of a sequence), 145
locally constant function (p-adic), 30
Lucas’ Theorem, 2
Mahler (interpolation) series, 35
Mahler expansion, 35
measurable map, 68
measure
– Borel, 74
– Haar (on Z_p^n), 74
– invariant, 69
– p-adic, 74
– regular, 74
measure-preserving (p-adic) function, 75
measure-preserving map, 69
metric, 12
– p-adic, 12
– non-Archimedean, 12
metric space, 12
minimal dynamical system, 70
Möbius function, 3
module (over a ring), 5
Monna map, 144
multiplicative inverse, 19
– generalized, 24, 187, 196
non-Archimedean, 11
non-expansive function (p-adic), 22, 39
norm, 62
normed (vector) space (over Q_p), 62
observable, 69
orbit, 68
orthogonal Latin squares, 203
Ostrowski’s theorem, 13
p-adic absolute value, 12
p-adic ball, 14
p-adic derivative, 23
p-adic expansion, 17
p-adic function
– 1-Lipschitz, 22, 27, 39
– of measure 0, 166
– of measure 1, 166
– locally, 39
– analytic, 33
– locally, of order r, 55
– compatible, 39
– locally, 39
– continuous, 21

– uniformly, 22
– differentiable, 23
– modulo p^k, 26
– uniformly, 23
– ergodic, 75
– exponential, exp_p, 34
– integer-valued, 39
– twice, 28
– locally constant, 30
– measure-preserving, 75
– non-expansive, 39
– pseudo-constant, 24
– step, 30
– of order N, 30
p-adic integers, 15
p-adic logarithm, ln_p, 34
p-adic measure, 74
p-adic metric, 12
p-adic numbers, 14
p-adic sphere, 15
p-adic Taylor series, 33
p-adic trigonometric functions, 34
p-adic valuation ord_p, 10
p-adic weight wt_p, 10
p-automatic sequence, 173
p-kernel (of a sequence), 173
period
– of a sequence, 30
phase space, 68
plaintext, 179
point
– eventually periodic, 68
– fixed, 69
– periodic, 68
– pre-periodic, 68
– r-periodic, 69
primitive element (of a field), 7
primitive modulo p^k, 6, 139
pseudo-constant, 24
pseudorandom sequence, 178
quasigroup (binary), 200
quotient field, 5

rational B-function, 65
reduced function (modulo p^k), 22
reduction map (modulo p^k), 19, 27
reduction of a function modulo p^k, 22
regular measure, 74
ring, 4
– (integral) domain, 5
– characteristic of, 5
– commutative, 4
– group of units, 5
– ideal of, 5
– proper, 5
– invertible element of, 5
– multiplicative (sub)group of, 5
– nilpotent element of, 5
– nilpotency index of, 5
– of formal power series, 6
– of zero characteristic, 5
– principal ideal, 6
– unit group of, 5
– unit of, 5
– with unity, 5
– zero divisor of, 5
– ideal of
– principal, 6
seed, 180
sequence
– (purely) periodic, 31
– eventually periodic, 31
– p-automatic, 173
– p-kernel of, 173
– pseudorandom, 178
– uniformly distributed, 70
– strictly, 71
– strictly modulo p^n, 184
step function, 30
– of order N, 30
Stirling numbers, 2
Stone-Weierstrass theorem for B-functions, 60
stream cipher, 179
strong triangle inequality, 11
subautomaton, 154

system (automaton), 159
T-function, 22, 41, 185
Taylor series, 33
topological transitivity, 70
totally disconnected, 14
trajectory, 68
transducer, 153
– n-reversible, 157
– reachable, 154
– finite, 154
– infinite, 154
– reversible, 157
transitive
– family of maps, 159
– mapping, 71, 160
triangle inequality, 12
– strong, 11, 12
triangular function, 41
truth set (of a Boolean function), 89
twice integer-valued function, 28
ultrametric, 12
ultrametric space, 12
uniform differentiability (on Z_p), 23
uniformly continuous function (p-adic), 22
uniformly distributed sequence, 70
unique ergodicity, 71
valuation
– p-adic, 10
van der Put series, 37
weight
– of a Boolean function, 88
word (over an alphabet), 153
– empty, 154
– of length n, 153
– prefix of, 154
– suffix of, 154
