Pointwise convergence of multiple ergodic averages and strictly ...

9 downloads 50 Views 381KB Size Report
Jun 23, 2014 - DS] 23 Jun 2014. POINTWISE CONVERGENCE OF MULTIPLE ERGODIC. AVERAGES AND STRICTLY ERGODIC MODELS. WEN HUANG ...
POINTWISE CONVERGENCE OF MULTIPLE ERGODIC AVERAGES AND STRICTLY ERGODIC MODELS

arXiv:1406.5930v1 [math.DS] 23 Jun 2014

WEN HUANG, SONG SHAO, AND XIANGDONG YE Abstract. By building some suitable strictly ergodic models, we prove that for an ergodic system (X, X , µ, T ), d ∈ N, f1 , . . . , fd ∈ L∞ (µ), the averages X 1 f1 (T n x)f2 (T n+m x) . . . fd (T n+(d−1)m x) N2 2 (n,m)∈[0,N −1]

converge µ a.e. Deriving some results from the construction, for distal systems we answer positively the question if the multiple ergodic averages converge a.e. That is, we show that if (X, X , µ, T ) is an ergodic distal system, and f1 , . . . , fd ∈ L∞ (µ), then multiple ergodic averages

converge µ a.e..

N −1 1 X f1 (T n x) . . . fd (T dn x) N n=0

1. Introduction 1.1. Main results. Throughout this paper, by a topological dynamical system (t.d.s. for short) we mean a pair (X, T ), where X is a compact metric space and T is a homeomorphism from X to itself. A measurable system (m.p.t. for short) is a quadruple (X, X , µ, T ), where (X, X , µ) is a Lebesgue probability space and T : X → X is an invertible measure preserving transformation. ˆ Tˆ ) is a topological model (or Let (X, X , µ, T ) be an ergodic m.p.t. We say that (X, ˆ Tˆ ) is a t.d.s. and there exists an invariant probjust a model) for (X, X , µ, T ) if (X, ˆ such that the systems (X, X , µ, T ) ability measure µ ˆ on the Borel σ-algebra B(X) ˆ ˆ ˆ and (X, B(X), µ ˆ, T ) are measure theoretically isomorphic. The well-known Jewett-Krieger’s theorem [29, 30] states that every ergodic system has a strictly ergodic model. We note that one can add some additional properties to the topological model. For example, in [31] Lehrer showed that the strictly ergodic model can be required to be a topological (strongly) mixing system in addition. Now let τˆd = Tˆ ×. . .×Tˆ (d times) and σ ˆd = Tˆ ×. . .×Tˆ d . The group generated by τˆd ˆ let Nd (X, ˆ x) = O((x, . . . , x), hˆ and σ ˆd is denoted hˆ τd , σ ˆd i. For any x ∈ X, τd , σ ˆd i), the Date: June 22, 2014. 2000 Mathematics Subject Classification. Primary: 37A05, 37B05. Key words and phrases. ergodic averages, distal systems. Huang is partially supported by NNSF for Distinguished Young School (11225105), Shao and Ye is partially supported by NNSF of China (11171320), and Huang and Ye is partially supported by NNSF of China (11371339). 1

2

WEN HUANG, SONG SHAO, AND XIANGDONG YE

orbit closure of (x, . . . , x) (d times) under the action of the group hˆ τd , σ ˆd i. We remark ˆ Tˆ) is minimal, then all Nd (X, ˆ x) coincide, which will be denoted by that if (X, ˆ ˆ Tˆ ) is minimal, then (Nd (X), ˆ hˆ Nd (X). It was shown by Glasner [20] that if (X, τd , σ ˆd i) is minimal. In this paper, first we will show the following theorem. Theorem A: Let (X, X , µ, T ) be an ergodic m.p.t. and d ∈ N. Then it has a strictly ˆ Tˆ) such that (Nd (X), ˆ hˆ ergodic model (X, τd , σ ˆd i) is strictly ergodic. As a consequence, we have: Theorem B: Let (X, X , µ, T ) be an ergodic m.p.t. and d ∈ N. Then for f1 , . . . , fd ∈ L∞ (µ) the averages X 1 (1.1) f1 (T n x)f2 (T n+m x) . . . fd (T n+(d−1)m x) N2 n,m∈[0,N −1]

converge µ a.e.

We remark that similar theorems as Theorems A and B can be established for cubes. Moreover, the convergence in Theorem B can be stated for any tempered Følner sequence {FN }N ≥1 of Z2 instead of [0, N − 1]2 . PN −1 d Πj=1fj (T jn x) It is a long open question if the multiple ergodic averages N1 n=0 converge a.e. Using some results developed when proving Theorem A, we answer the question positively for distal systems. Namely, we have Theorem C: Let (X, X , µ, T ) be an ergodic distal system, and d ∈ N. Then for all f1 , . . . , fd ∈ L∞ (µ) N −1 1 X f1 (T n x) . . . fd (T dn x) N n=0 converge µ a.e.. Note that Furstenberg’s structure theorem [19] states that each ergodic systems is a weakly mixing extension of an ergodic distal system. Thus, by Theorem C the open question is reduced to deal with the weakly mixing extensions. Moreover, we have the following conjecture. Conjecture: Let (X, X , µ, T ) be an ergodic system, then it has a topological model ˆ Tˆ) such that for a.e. x ∈ X, ˆ (x, . . . , x) is a generic point of some ergodic (X, (d) measures µx invariant under Tˆ × . . . × Tˆ d . (d)

We also conjecture that the measures µx are the ones defined in Theorem 6.1. Once the conjecture is proven then the multiple ergodic averages converge a.e. by a similar argument we used to prove Theorem B. 1.2. Backgrounds. In this subsection we will give backgrounds of our research.

POINTWISE CONVERGENCE OF MULTIPLE ERGODIC AVERAGES

3

1.2.1. Ergodic averages. In this subsection we recall some results related to pointwise ergodic averages. The first pointwise ergodic theorem was proved by Birkhoff in 1931. Followed from Furstenberg’s beautiful work on the dynamical proof of Szemeradi’s theorem in 1977, problems concerning the convergence of multiple ergodic averages (in L2 or pointwisely) attracts a lot of attention. The convergence of the averages (1.2)

N −1 1 X f1 (T n x) . . . fd (T dn x) N n=0

in L2 norm was established by Host and Kra [27] (see also Ziegler [44]). We note that in their proofs, the characteristic factors play a great role. The convergence of the multiple ergodic average for commuting transformations was obtained by Tao [36] using the finitary ergodic method, see [5, 26] for more traditional ergodic proofs by Austin and Host respectively. Recently, the convergence of multiple ergodic averages for nilpotent group actions was proved by Walsh [38]. The first breakthrough on pointwise convergence of (1.2) for d > 1 is due to Bourgain, who showed in [10] that for d = 2, for p, q ∈ N and for all f1 , f2 ∈ L∞ , the PN −1 limit N1 n=0 f1 (T p x)f2 (T q x) exists a.e. Before Bourgain’s work, Lesigne showed this convergence holds if the system is distal, with T p , T q and T p−q ergodic [32]. Also in [14, 2], it was shown that the problem of the almost everywhere convergence of (1.2) can be deduced to the case when the m.p.t. has zero entropy. One can also find some results for weakly mixing transformations in [1, 2, 4]. Recently there are some studies on the limiting behavior of the averages along cubes, and please refer to [8, 27, 3, 11] for details. Also in [11], they obtained the following result. For i = 1, 2, . . . , d, let Ti : X → X be m.p.t., fi ∈ L∞ (µ) be functions, pi ∈ Z[t] be non-constant polynomials such that pi − pj is non-constant for i 6= j, and b : N → N be a sequence such that b(N) → ∞ and b(N)/N 1/d → 0 as N → ∞, where d is the maximum degree of the polynomials pi . Then the averages X 1 m+p (n) m+p (n) f1 (T1 1 x) . . . fd (Td l x) Nb(N) 1≤m≤N,1≤n≤b(N )

converge pointwise as N → ∞.

1.2.2. Topological model. The pioneering work on topological model was done by Jewett in [29]. He proved the theorem under the additional assumption that T is weakly mixing, and the general case was proved by Krieger in [30] soon. The papers of Hansel and Raoult [24], Bellow and Furstenberg [6] and Denker [12], gave different proofs of the theorem in the general ergodic case (see also [13]). One can add some additional properties to the topological model. For example, in [31] Lehrer showed that the strictly ergodic model can be required as a topological (strongly) mixing system in addition. Our Theorems A strengthen Jewett-Krieger’ Theorem in other direction, i.e. we can require the model to be well behavioral under some group actions on some subsets of the product space.

4

WEN HUANG, SONG SHAO, AND XIANGDONG YE

It is well known that each m.p.t. has a topological model [19]. There are universal models, models for some group actions and models for some special classes. Weiss [40] showed the following nice result: there exists a minimal t.d.s. (X, T ) with the property that for every aperiodic ergodic m.p.t. (Y, Y, ν, S) there exists a T invariant Borel probability measure µ on X such that the systems (Y, Y, ν, S) and (X, B(X), µ, T ) are measure theoretically isomorphic. Weiss [39] (see also [42, 21, 23]) showed that Jewett-Krieger’ Theorem can be generalized from Z-actions to commutative group actions. An ergodic system has a doubly minimal model if and only if it has zero entropy [41] (other topological models for zero entropy systems can be found in [25, 16]); and an ergodic system has a strictly ergodic, UPE (uniform positive entropy) model if and only if it has positive entropy [22]. Note that not any dynamical properties can be added in the uniquely ergodic models. For example, Lindernstrauss showed that every ergodic measure distal system (X, X , µ, T ) has a minimal topologically distal model [33]. This topological model need not, in general, be uniquely ergodic. In other words there are measure distal systems for which no uniquely ergodic topologically distal model exists [33]. We refer to [23] for more information on the topic. ˆ → Yˆ is a topological model for π : (X, X , µ, T ) → (Y, Y, ν, S) We say that π ˆ:X if π ˆ is a topological factor map and there exist measure theoretical isomorphisms φ and ψ such that the diagram φ ˆ X −−−→ X     πy yπˆ

ψ Y −−−→ Yˆ is commutative, i.e. π ˆ φ = ψπ. Weiss [39] generalized the theorem of Jewett-Krieger to the relative case. Namely, he proved that if π : (X, X , µ, T ) → (Y, Y, ν, S) is a ˆ νˆ, S) ˆ is a uniquely ergodic model factor map with (X, X , µ, T ) ergodic and (Yˆ , Y, ˆ Xˆ , µ for (Y, Y, ν, T ), then there is a uniquely ergodic model (X, ˆ, Tˆ ) for (X, X , µ, T ) ˆ → Yˆ which is a model for π : X → Y . We will refer this and a factor map π ˆ:X theorem as Weiss’s Theorem. We note that in [39] Weiss pointed that the relative case holds for commutative group actions.

1.3. Main ideas of the proofs. Now we describe the main ideas and ingredients in the proofs. To prove Theorem A the first fact we face is that for an ergodic m.p.t. (X, X , µ, T ), not every strictly ergodic model is the one we need in Theorem A. This indicates that to obtain Theorem A, Jewett-Krieger’ Theorem is not enough for our purpose. Fortunately, we find that Weiss’s Theorem is a right tool. Precisely, for d ≥ 3 let πd−2 : X → Zd−2 be the factor map from X to its d − 2step nilfactor Zd−2 . By the results in [28], Zd−2 may be regarded as a topological system in the natural way. Using Weiss’s Theorem there is a uniquely ergodic model ˆ Xˆ , µ ˆ → Zd−2 which is a model (X, ˆ, Tˆ ) for (X, X , µ, T ) and a factor map π ˆd−2 : X

POINTWISE CONVERGENCE OF MULTIPLE ERGODIC AVERAGES

5

for πd−2 : X → Zd−2 . X   πd−2 y

φ

−−−→

ˆ X  πˆ y d−2

Zd−2 −−−→ Zd−2 ˆ Tˆ ) is what we need. To this aim, We then show (though it is not easy) that (X, we need understand well the ergodic decomposition of d-fold self-joinings of X. We first study the σ-algebra of σd -invariant sets under σd , and show that we always can deduce this σ-algebra to the one on its nilfactors. Then via studying nilsystems, we get the ergodic decomposition of Furstenberg self-joinings under the action σd . This is the main tool we develop to prove Theorem A. Once Theorem A is proven, Theorem B will follow by an argument using some well known theorems related to pointwise convergence for Zd actions and for uniquely ergodic systems. Let (X, X , µ, T ) be an ergodic distal system. Then πd−1 : X → Zd−1 is a distal extension. By Furstenberg’s Structure Theorem, πd−1 is decomposed into isometric extensions and inverse limit. We show that the property of almost everywhere convergence of the multiple ergodic averages (1.2) is preserved by these isometric extensions, and then we conclude Theorem C. This argument is inspired by Lesigne’s work in [32]. 1.4. Organization of the paper. We organize the paper as follows. In Section 2 we introduce some basic notions and leave most related notions and results in the appendix. In Section 3 we study the ergodic decomposition of self-joinings under T × T 2 × . . . × T d . Then in Sections 4,5 and 6, we prove Theorems A, B and C respectively. 2. Preliminaries 2.1. Ergodic theory and topological dynamics. In this subsection we introduce some basic notions in ergodic theory and topological dynamics. For more information, see Appendix. In this paper, instead of just considering a single transformation T , we will consider commuting transformations T1 , . . . , Tk of X. We only recall some basic definitions and properties of systems for one transformation. Extensions to the general case are straightforward. 2.1.1. Measurable systems. For a m.p.t. (X, X , µ, T ) we write I(X, X , µ, T ) for the σ-algebra {A ∈ X : T −1 A = A} of invariant sets. Sometimes we will use I or I(T ) for short. A m.p.t. is ergodic if all the T -invariant sets have measure either 0 or 1. (X, X , µ, T ) is weakly mixing if the product system (X × X, X × X , µ × µ, T × T ) is erdogic. A homomorphism from m.p.t. (X, X , µ, T ) to (Y, Y, ν, S) is a measurable map π : X0 → Y0 , where X0 is a T -invariant subset of X and Y0 is an S-invariant subset of Y , both of full measure, such that π∗ µ = µ ◦ π −1 = ν and S ◦ π(x) = π ◦ T (x) for x ∈ X0 . When we have such a homomorphism we say that (Y, Y, ν, S) is a factor of

6

WEN HUANG, SONG SHAO, AND XIANGDONG YE

(X, X , µ, T ). If the factor map π : X0 → Y0 can be chosen to be bijective, then we say that (X, X , µ, T ) and (Y, Y, ν, S) are (measure theoretically) isomorphic (bijective maps on Lebesgue spaces have measurable inverses). A factor can be characterized (modulo isomorphism) by π −1 (Y), which is a T -invariant sub- σ-algebra of X , and conversely any T -invariant sub-σ-algebra of X defines a factor. By a classical result abuse of terminology we denote by the same letter the σ-algebra Y and its inverse image by π. In other words, if (Y, Y, ν, S) is a factor of (X, X , µ, T ), we think of Y as a sub-σ-algebra of X . 2.1.2. Topological dynamical systems. A t.d.s. (X, T ) is transitive if there exists some point x ∈ X whose orbit O(x, T ) = {T n x : n ∈ Z} is dense in X and we call such a point a transitive point. The system is minimal if the orbit of any point is dense in X. This property is equivalent to saying that X and the empty set are the only closed invariant sets in X. (X, T ) is topologically weakly mixing if the product system (X × X, T × T ) is transitive. A factor of a t.d.s. (X, T ) is another t.d.s. (Y, S) such that there exists a continuous and onto map φ : X → Y satisfying S ◦ φ = φ ◦ T . In this case, (X, T ) is called an extension of (Y, S). The map φ is called a factor map. 2.1.3. M(X) and MT (X). For a t.d.s. (X, T ), denote by M(X) the set of all probability measure on X. Let MT (X) = {µ ∈ M(X) : T∗ µ = µ ◦ T −1 = µ} be the set of all T -invariant Borel probability measures of X and MTe (X) be the set of ergodic elements of MT (X). It is well known that MTe (X) 6= ∅. A t.d.s. (X, T ) is called uniquely ergodic if there is a unique T -invariant probability measure on X. It is called strictly ergodic if it is uniquely ergodic and minimal. 2.1.4. Topological distal systems. A t.d.s. (X, T ) (with metric ρ) is called topologically distal if inf n∈Z ρ(T n x, T n x′ ) > 0 whenever x, x′ ∈ X are distinct. 2.2. System of order d − 1. A factor Z of X is characteristic for averages (2.1)

N −1 1 X f1 (T n x) . . . fd (T dn x) N n=0

if the limiting behavior of (2.1) only depends on the conditional expectation of fi with respect to Z: N 1 X n (T f1 T 2n f2 . . . T dn fd − T n E(f1 |Z)T 2n E(f2 |Z) . . . T dn E(fd |Z))||L2 = 0 || lim N →∞ N n=1

for any f1 , . . . , fd ∈ L∞ (X, X , µ). The minimal characteristic factor of (2.1) always exists [27, 44], and it is denoted by (Zd−1 , Zd−1 , µd−1 ). Theorem 2.1. [27] Let (X, X , µ, T ) be an ergodic system and d ∈ N. Then the system (Zd−1 , Zd−1 , µd−1 , T ) is a (measure theoretic) inverse limit of d − 1-step nilsystems. (Zd−1 , Zd−1 , µd−1 , T ) is called a system of order d − 1.

POINTWISE CONVERGENCE OF MULTIPLE ERGODIC AVERAGES

7

2.3. Topological system of order (d − 1). We need to use the inverse limit of nilsystems, so we recall the definition of an inverse limit of t.d.s. If (Xi , Ti )i∈N are t.d.s. with diam(Xi ) ≤ 1 and πi : Xi+1 → Xi are factor Q maps, the inverse limit of the systems is defined to be the compact subset of i∈N Xi given by {(xi )i∈N : πi (xi+1 ) = xi , i ∈ N}, and we denote it by lim(Xi , Ti )i∈N . It is a compact metric ←− P space endowed with the distance ρ((xi )i∈N , (yi )i∈N ) = i∈N 1/2i ρi (xi , yi ), where ρi is the metric in Xi . We note that the maps Ti induce naturally a transformation T on the inverse limit. Definition 2.2. [28] An inverse limit of (d − 1)-step minimal nilsystems is called a topological system of order (d − 1). Definition 2.3. Let (X, T ) be a t.d.s. and let d ∈ N. The points x, y ∈ X are said to be regionally proximal of order d if for any δ > 0, there exist x′ , y ′ ∈ X and a vector n = (n1 , . . . , nd ) ∈ Zd such that ρ(x, x′ ) < δ, ρ(y, y ′) < δ, and ρ(T n·ǫ x′ , T n·ǫ y ′ ) < δ for any ǫ ∈ {0, 1}d \ {(0, 0, . . . , 0)}. The set of regionally proximal pairs of order d is denoted by RP[d] (or by RP[d] (X, T ) in case of ambiguity), and is called the regionally proximal relation of order d. The above definition was introduced in [28] by Host-Kra-Maass and it was proved that for a minimal distal system, RP[d] is an equivalence relation and X/RP[d] is a topological system of order d. Later it was shown that it is an equivalence relation for any minimal systems by Shao-Ye in [35]. We will use the following theorems in the paper. Theorem 2.4. [28, Theorem 1.2] Let (X, T ) be a minimal topologically distal system and let d ∈ N. Then (X, T ) is a topological system of order d if and only if RP[d] = ∆X . Theorem 2.5. [28, Theorem A.1] Any system of order d is isomorphic in the measure theoretic sense to a topological system of order d. Lemma 2.6. [15, Lemma A.3] Let (X, T ) be a system of order d, then the maximal measurable and topological factors of order j coincide, where j ≤ d. 3. ergodic decomposition of self-joinings under T × T 2 × . . . × T d In this section we study ergodic decomposition of self-joinings under T × T 2 × . . . × T d . The theorems in this section are important for our proofs, and also they have their own interest. 3.1. Furstenberg self-joining. Let T : X → X be a map and d ∈ N. Set τd = T × . . . × T (d times), σd = T × T 2 × . . . × T d and σd′ = id × T × . . . × T d−1 = id × σd−1 . Note that hτd , σd i = hτd , σd′ i. For any x ∈ X, let Nd (X, x) = O((x, . . . , x), hτd , σd i), the orbit closure of (x, . . . , x) (d times) under the action of the group hτd , σd i. We

8

WEN HUANG, SONG SHAO, AND XIANGDONG YE

remark that if (X, T ) is minimal, then all Nd (X, x) coincide, which will be denoted by Nd (X). It was shown by Glasner [20] that if (X, T ) is minimal, then (Nd (X), hτd , σd i) is minimal. Hence if (Nd (X), hτd , σd i) is uniquely ergodic, then it is strictly ergodic. Definition 3.1. Let (X, T ) be a minimal system with µ ∈ MT (X) and d ≥ 1. If (Nd (X), hτd, σd i) is uniquely ergodic, then we denote the unique measure by µ(d) , and call it the Furstenberg self-joining. For a t.d.s. (X, T, µ) and d ∈ N, we define µ(d) as one of the weak limit points of PN −1 n d σd µ∆ } in M(X d ), where µd∆ is the diagonal measure on X d as sequence { N1 n=0 defined in [18], i.e. it is defined on X d as follows Z Z d f1 (x1 ) . . . fd (xd ) dµ∆ (x1 , . . . , xd ) = f1 (x) . . . fd (x) dµ(x). Xd

X

When (Nd (X), hτd , σd i) is uniquely ergodic, it is easy to see that N −1 1 X n d σ µ −→ µ(d) , N → ∞, N n=0 d ∆

weakly in M(X d ),

3.2. The σ-algebra of invariant sets under σd = T × T 2 × . . . × T d . In this subsection we study the σ-algebra of invariant sets under σd = T × T 2 × . . . × T d . We will show we always can deduce this σ-algebra to the one on its nilfactors. The proof of the following lemma is similar to the proof of Theorem 12.1 in [27]. Lemma 3.2. Let (X, X , µ, T ) be an ergodic system and d ≥ 1 be an integer. Suppose that λ is a d-fold self-joining of X. Assume that f1 , . . . , fd ∈ L∞ (X, µ) with kfj k∞ ≤ 1 for j = 1, . . . , d. Then −1

1 N

X

n 2n dn (3.1) lim ≤ min {l · 9fl 9d } f1 (T x1 )f2 (T x2 ) . . . fd (T xd ) N →∞ N 1≤l≤d L2 (X d ,λ) n=0 Proof. We proceed by induction. For d = 1, by the Ergodic Theorem, Z N −1 1 X n T f1 kL2 (µ) → | f1 dµ| = 9f1 91 . k N n=0

Let d ≥ 1 and assume that (3.1) holds for d and any d-fold self-joining of X. Let f1 , . . . , fd+1 ∈ L∞ (µ) with kfj k∞ ≤ 1 for j = 1, . . . , d + 1. Let λ be any d + 1-fold self-joining of X. Choose l ∈ {2, 3, . . . , d + 1}. (The case l = 1 is similar). Write ξn =

d+1 O

T j fj = f1 (T n x1 )f2 (T 2n x2 ) . . . fd+1 (T (d+1)n xd+1 ).

j=1

By the van der Corput lemma (Lemma F.1),

−1 H−1 −1 Z 1 N X X X

2

1 N 1 ξn L2 (λ) ≤ lim sup lim sup ξ n+h · ξn dλ . lim sup N H N N →∞ H→∞ N →∞ n=0 n=0 h=0

POINTWISE CONVERGENCE OF MULTIPLE ERGODIC AVERAGES

9

Letting M denote the last lim sup, we need to show that M ≤ l2 9 fl 92d+1 . For any h ≥ 1, −1 Z 1 N X ξ · ξ dλ n n+h N n=0 Z d+1 N −1 O X 1 h jh n = (f1 · T f 1 ) ⊗ fj · T f j dλ(x1 , . . . , xd+1 ) (σd ) N n=0 j=2



≤ f1 · T h f 1



= f1 · T h f 1

d+1 −1

1 N

O

X

· fj · T jh f j (σd )n N n=0 L2 (λ) L2 (λ) j=2

d+1 −1

1 N

O

X

n fj · T jh f j (σd ) · N n=0 L2 (µ) L2 (λ′ ) j=2

where λ′ is the image of λ to the last d coordinates. It is clear λ′ is a d-fold self-joining of X, and by the inductive assumption, −1 Z 1 N X ξ n+h · ξn dλ ≤ l 9 fl · T lh f l 9d . N n=0

We get

M ≤ l · lim sup H→∞ 2

H−1 H−1 1 X 1 X 9fl · T lh f l 9d ≤ l2 · lim sup 9fl · T h f l 9d H h=0 H→∞ H h=0

≤ l · lim sup H→∞

 1 H−1 X H

9fl · T

h

d f l 92d

h=0

= l2 · 9fl 92d+1 .

1/2d

The last equation follows from Lemma E.1. The proof is completed.



Lemma 3.3. Let (X, X , µ, T ) be an ergodic system and d ∈ N.Suppose that λ is a d-fold self-joining of X and it is σd -invariant. Assume that f1 , . . . , fd ∈ L∞ (X, µ). Then d d O  O  d (3.2) E fj I(X , λ, σd ) = E E(fj |Zd−1 ) I(X d , λ, σd ) . j=1

j=1

Proof. By Lemma 5.1, it suffices to show that d O  d (3.3) E fj I(X , λ, σd ) = 0 j=1

10

WEN HUANG, SONG SHAO, AND XIANGDONG YE

whenever E(fk |Zd−1 ) = 0 for some k ∈ {1, 2, . . . , d}. This condition implies that 9fk 9d = 0. By the Ergodic Theorem and Lemma 3.2, we have d

O 



d (d) fj I(X , µ , σd )

E j=1

L2 (λ)

−1

1 N

X f1 (T n x1 )f2 (T 2n x2 ) . . . fd (T dn xd ) ≤ k · 9fk 9d = 0. = lim N →∞ N L2 (λ) n=0

So the lemma follows.



Proposition 3.4. Let (X, X , µ, T ) be ergodic and d ∈ N. Suppose that λ is a d-fold self-joining of X and it is σd -invariant. Then the σ-algebra I(X d , λ, σd ) is d measurable with respect to Zd−1 . Proof. Every bounded function on X d which is measurable with respect to I(X d , λ, σd ) can be approximated in L2 (λ) by finite sums of functions of the form N E( dj=1 fj |I(X d , λ, σd )) where f1 , . . . , fd are bounded functions on X. By Lemma 3.3, one can assume that these functions are measurable with respect to Zd−1 . In N d d . Since this σ-algebra Zd−1 is this case dj=1 fj is measurable with respect to Zd−1 N −1  N P Nd n invariant under σd , E( dj=1 fj |I(X d , λ, σd )) = lim N1 j=1 fj ◦ σd is also N →+∞

n=0

d measurable with respect to Zd−1 . Therefore I(X d , λ, σd ) is measurable with respect d to Zd−1 . 

Corollary 3.5. Let (X, X , µ, T ) be an ergodic system and d ∈ N. Suppose that d λ is a d-fold self-joining of X and it is σd -invariant. Then the factor map πd−1 : d d e e (X , λ, σd ) → (Zd−1 , λ, σd ) is ergodic, where λ is the image of λ. d e σd ). In particular, one has that I(X d , λ, σd ) is isomorphic to I(Zd−1 , λ,

3.3. Ergodic decomposition of Furstenberg self-joining of nilsystems under action σd . In the previous subsection we show that to study the σ-algebra of invariant sets under σd = T × T 2 × . . . × T d , we only need to study the one on its nilfactors. Hence in this subsection we study the ergodic decomposition of Furstenberg self-joinings of nilsystems under the action σd . In this subsection d ≥ 2 is an integer, and (X = Zd−1 , µd−1 , T ) is a topological system of order d − 1. Recall Nd = Nd (X) = O(∆d (X), σd ) = O((x, . . . , x), hτd , σd i) ⊂ X d and Nd [x] = O((x, . . . , x), σd′ ), where x ∈ X.

POINTWISE CONVERGENCE OF MULTIPLE ERGODIC AVERAGES

11

3.3.1. Basic properties. First we recall some basic properties. Theorem 3.6. [9, 43] With the notations above, we have (1) The (Nd , hτd , σd i) is ergodic (and thus uniquely ergodic) with some measure (d) µd−1 . (2) For each x ∈ X, the system (Nd [x], σd′ ) is uniquely ergodic with some measure (d) δx × µd−1,x Z . (d)

(3) µd−1 =

(d)

X

δx × µd−1,x dµ(x).

(4) (Ziegler) Let f1 , f2 , . . . , fd−1 be continuous functions on X and let {Mi } and {Ni } be two sequences of integers such that Ni → ∞. For µd−1 -almost every x ∈ X,

(3.4)

1 Ni

Ni +M Xi −1

f1 (T n x)f2 (T 2n x) . . . fd−1 (T (d−1)n x)

n=Mi

→ as i → ∞.

Z

(d)

f1 (x1 )f2 (x2 ) . . . fd−1 (xd−1 ) dµd−1,x (x1 , x2 , . . . , xd−1 )

Remark 3.7. In fact, in [9, 43] Theorem 3.6 is for nilsystems. Via an inverse limit argument, it is easy to see that Theorem 3.6 holds for topological systems of order d. (d)

3.3.2. The ergodic decomposition of µd−1 under σd . Now we study the ergodic de(d) (d) composition of µd−1 under σd . For each x ∈ X, let νd−1,x be the unique σd -invariant measure on O(xd , σd ), where xd = (x, x, . . . , x) ∈ X d . Then ϕ : X −→ M(Nd );

(d)

x 7→ νd−1,x

P is a Borel map and ϕ(X) ⊆ Mσed (Nd ). This fact follows from that x 7→ N1 n 0, where (d+1)

d+1 C = {s ∈ Zd−1 : (πd−1 )∗ (λs × νs(d) )(f ) > δs × µd−1,s (f )}.

Let B = ψ −1 (C) and A = p−1 d−1,2 (B), where pd−1,2 : (Nd+1 (Zd−1 ), hτd+1 , σd+1 i) → (Nd (Zd−1 ), hτd , σd i); (s1 , s) 7→ s (d)

(d+1)

(d)

(d)

d be the projection. Since (πd−1 )∗ (νs ) = µd−1,s = (µd−1 )s and (µd−1 )s (ψ −1 (s)) = 1 for µd−1 -a.e. s ∈ Zd−1 , one has that for µd−1 -a.e. s ∈ Zd−1 , ( ( 1 if s ∈ C 1 if s ∈ C (d+1) d µd−1,s (B) = and (πd−1 )∗ (νs(d) )(B) = 0 if s 6∈ C 0 if s 6∈ C.

Moreover, for µd−1 -a.e. s ∈ Zd−1 , one has that for µd−1 -a.e. s ∈ Zd−1 , ( 1 if s ∈ C d+1 d (πd−1 )∗ (λs × νs(d) )(A) = (πd−1 )∗ (νs(d) )(B) = 0 if s 6∈ C and

( 1 (d+1) (d+1) δs × µd−1,s (A) = µd−1,s (B) = 0

if s ∈ C . if s ∈ 6 C

POINTWISE CONVERGENCE OF MULTIPLE ERGODIC AVERAGES

Thus (d+1) µd−1 (f

· 1A ) =

Z

(d+1)

Nd+1 (Zd−1 )

=

Z

Zd−1

=

19

Z

f · 1A dµd−1

Z

Nd+1 (Zd−1 ) (d+1)

ZC

(d+1)  f · 1A dδs × µd−1,s dµd−1 (s)

δs × µd−1,s (f )dµd−1 (s)

d+1 (πd−1 )∗ (λs × νs(d) )(f ) dµd−1 (s) Z Z  d+1 f · 1A d(πd−1 )∗ (λs × νs(d) ) dµd−1 (s) =


0. Without loss of generality, we assume that for all 1 ≤ j ≤ d, kfj k∞ ≤ 1. Choose continuous functions gj such that kgj k∞ ≤ 1 and kfj − gj k1 < δ/d for all 1 ≤ j ≤ d. We have Z d d 1 X Y O n+(j−1)m (d) f (T x) − f dµ 2 j j N n∈[0,N−1] Nd (X) j=1 j=1 m∈[0,N−1] d d 1 X X Y Y 1 ≤ 2 fj (T n+(j−1)m x) − 2 gj (T n+(j−1)m x) N n∈[0,N−1] N n∈[0,N−1] j=1 j=1 m∈[0,N−1] m∈[0,N−1] (5.2) Z d d 1 X Y O + 2 gj (T n+(j−1)m x) − gj dµ(d) N n∈[0,N−1] Nd (X) j=1 j=1 m∈[0,N−1] Z Z d d O O (d) gj dµ − fj dµ(d) . + Nd (X) Nd (X) j=1

j=1

20

WEN HUANG, SONG SHAO, AND XIANGDONG YE

Now by Pointwise Ergodic Theorem for Z2 i.e. Theorem B.2 (applying to (n, m) 7→ T n+(j−1)m ) we have that for all 1 ≤ j ≤ d X 1 n+(j−1)m n+(j−1)m f (T x) − g (T x) (5.3) −→ kfj − gj k1 , N → ∞. j j N 2 n∈[0,N−1] m∈[0,N−1]

for µ a.e. This implies that there is a measurable X0δ ⊂ X with µ(X0δ ) = 1 such that for any x ∈ X0δ (5.3) holds for all 1 ≤ j ≤ d. Fix x ∈ X0δ . Hence by Lemma 5.1, when N is large d d 1 X Y X Y 1 n+(j−1)m n+(j−1)m fj (T x) − 2 gj (T x) 2 N n∈[0,N−1] N n∈[0,N−1] j=1 j=1 m∈[0,N−1] m∈[0,N−1] (5.4)

d h X 1 ≤ N2 j=1

0, choose continuous functions gj such that kgj k∞ ≤ 1 and E(|fj − gj ||I(X, T j )) < δ/d for all 1 ≤ j ≤ d. Recall that I(X, T j ) = {A ∈ X : T j A = A}. Since T is ergodic, I(X, T j ) is finite which guarantees the existence of such gj . By Lemma 5.1, we have −1 Y d N −1 Y d 1 N X X 1 fj (T jn x) − gj (T jn x) N n=0 N n=0 j=1 j=1 N −1 d d 1 X hY i Y = fj (T jn x) − gj (T jn x) N n=0

j=1

j=1

d h N −1 i X 1 X jn jn ≤ fj (T x) − gj (T x) . N n=0 j=1

Now by Birkhoff Pointwise Ergodic Theorem we have that for all 1 ≤ j ≤ d N −1 1 X jn jn fj (T x) − gj (T x) −→ E(|fj − gj ||I(T j )), N → ∞. N n=0

for µ a.e.. Hence

(6.13)



−→

−1 Y d N −1 Y d 1 N X X 1 jn jn fj (T x) − gj (T x) N N n=0 j=1 n=0 j=1 d h N −1 i X 1 X fj (T jn x) − gj (T jn x) N n=0 j=1 d X

E(|fj − gj ||I(T j )) ≤ δ,

N → ∞.

j=1

for µ a.e.. By (6.12), (6.14) for µ a.e..

Z O N −1 d d 1 XY jn gj (T x) → gj dµx(d) , N → ∞. N n=0 j=1 X d j=1

28

WEN HUANG, SONG SHAO, AND XIANGDONG YE

By Lemma 5.1, Z Z O d d Z d X O (d) (d) gj dµx − |gj − fj |dµ ≤ δ. fj dµx ≤ (6.15) Xd Xd X j=1

j=1

j=1

So combining (6.13)-(6.15), we have for µ a.e. x ∈ X, when N is large enough Z O d d −1 Y 1 N X jn (d) f (T x) − f dµ j j x < 3δ. N d X j=1 n=0 j=1

Since δ is arbitrary, we have for µ a.e. x ∈ X

N −1 1 X f1 (T n x)f2 (T 2n x) . . . fd (T dn x) N n=0 Z −→ f1 (x1 )f2 (x2 ) . . . fd (xd ) dµx(d) (x1 , x2 , . . . , xd ) Xd

as N → ∞. The proof is completed.



6.3. Proof of Theorem C. In this final subsection we will prove Theorem C. We start with the definition of a distal system. Definition 6.4. Let π : (X, X, µ, T ) → (Y, Y, ν, T ) be a factor between two ergodic systems. We call the extension π a distal extension if there exists a countable ordinal η and a directed family of factors (Xθ , Xθ , µθ , T ), θ ≤ η such that (1) X0 = Y and Xη = X. (2) For θ < η the extension πθ : Xθ+1 → Xθ is isometric and non-trivial (i.e. not an isomorphism). W (3) For a limit ordinal λ ≤ η, Xλ = lim Xθ (i.e. Xλ = Xθ ). ←− θ), if for all f1 , . . . , fd ∈ L∞ (µ) N −1 1 X f1 (T n x) . . . fd (T dn x) N n=0

converge µ a.e.. The aim is to prove each distal system satisfies (>). We will use the structure of distal systems and Proposition 6.3 to complete the proof.

Let πd−1 : X → Zd−1 be the factor map. Then πd−1 is distal since X is distal. By the definition of a distal extension, there exists a countable ordinal η and a directed family of factors (Xθ , Xθ , µθ , T ), θ ≤ η such that (1) X0 = Zd−1 and Xη = X. (2) For θ < η the extension πθ : Xθ+1 → Xθ is non-trivial isometric. (3) For a limit ordinal λ ≤ η, Xλ = lim Xθ . ←− θ). (ii) For θ < η the extension πθ : Xθ+1 → Xθ is non-trivial isometric. If Xθ satisfies (>), then by Proposition 6.3, Xθ+1 satisfies (>). (iii) For a limit ordinal λ ≤ η, if for all θ < λ, Xθ satisfies (>), then it is easy to verify that the inverse limit Xλ = lim Xθ also satisfies (>). ←− θ). The proof is completed. Appendix A. Background on Ergodic Theory In this Appendix we will state notions and results in ergodic theory which are used in the article. Let (X, X , µ, T ) be a m.p.t. A.0.1. Ergodicity and weak mixing. First we list some equivalent conditions for ergodicity and weak mixing. Theorem A.1. Let (X, X , µ, T ) be a m.p.t. Then the following conditions are equivalent: (1) T is ergodic. (2) Every measurable function f from X to some Polish space P satisfying f ◦ T = f a.e. is of form f ≡ p a.e. for some point p ∈ P . Z Z N −1 Z X n (3) lim f ◦ T · g dµ = f dµ g dµ, for all f, g ∈ L2 (µ) (or L1 (µ)). N →∞

n=0

Theorem A.2. Let (X, X , µ, T ) be a m.p.t. Then the following conditions are equivalent: (1) T is weakly mixing. (2) 1 is the only eigenvalue of T and the geometric multiplicity of eigenvalue 1 is 1. (3) The product system with any ergodic system is still ergodic. A.0.2. Conditional expectation. If Y is a T -invariant sub-σ-algebra of X and f ∈ L1 (µ), we write E(f |Y), or Eµ (f |Y) if needed, for the conditional expectation of f with respect to Y. The conditional expectation E(f |Y) is characterized as the unique Y-measurable function in L2 (Y, Y, ν) such that Z Z (A.1) gE(f |Y)dν = g ◦ πf dµ Y

X

2

for all g ∈ L (Y, Y, ν). We will frequently make use of the identities Z Z E(f |Y) dµ = f dµ and T E(f |Y) = E(T f |Y).

We say that a function f is orthogonal to Y, and we write f ⊥ Y, when it has a zero conditional expectation on Y. If a function f ∈ L1 (µ) is measurable with respect to the factor Y, we write f ∈ L1 (Y, Y, ν).

30

WEN HUANG, SONG SHAO, AND XIANGDONG YE

R The disintegration of µ over ν, written as µ = µy d ν(y), is given by a measurable map y 7→ µy from Y to the space of probability measures on X such that Z (A.2) E(f |Y)(y) = f dµy X

ν-almost everywhere.

A.0.3. Ergodic decomposition. The ergodic decomposition is a basic result in ergodic. We quote this theorem from [21]. Theorem A.3. Let (X, X , µ, T ) be a dynamical system I the σ-algebra of T -invariant sets. Let π : (X,RX , µ, T ) → (Y, Y, ν, S) be the factor defined by I, so that I = π −1 (Y). Let µ = Y µy dν(y) be the disintegration of µ over ν and for every y ∈ Y denote Xy = π −1 (y) and Xy = X ∩ Xy , then: (1) S acts as the identity on Y . (2) For ν a.e. y ∈ Y , the system (Xy , Xy , µy , T ) is ergodic. (3) This decomposition is unique in the following sense. If (Z, Z, η) is a standard probabilityR space and ψ : z 7→ µ ˜ z a measurable map from Z into MTe (X) such that µ = Z µ ˜y dη(z), then there exists a measurable map φ : (Z, η) → (Y, ν) such that η a.e. µφ(z) = µ ˜z .

A.0.4. Inverse limit. We say that (X, X , µ, T ) is an inverse limit of a sequence of factors (X,WXj , µ, T ) if (Xj )j∈N is an increasing sequence of T -invariant sub-σ-algebras such that j∈N Xj = X up to sets of measure zero. A.0.5. Group rotation. All locally compact groups are implicitly assumed to be metrizable and endowed with their Borel σ-algebras. Every compact group G is endowed with its Haar measure, denoted by mG . For a compact abelian group Z and t ∈ Z, we write (Z, t) for the probability space (Z, mZ ), endowed with the transformation given by z 7→ tz. A system of this kind is called a rotation. A.0.6. Joining and conditional product measure. The notions of joining and conditional product measure are introduced by Furstenberg in [18]. Let (Xi , µi , Ti ), i = 1, . . . , k, be m.p.t, and let (Yi , νi , SiQ ) be corresponding factors, and πi : Xi → Yi the factor maps. A measure ν on Y = i Yi defines a joining of the measures Q on Yi if it is invariant under S1 ×. . . ×Sk and maps onto νj under the natural map i Yi → Yj . When S1 = . . . = Sk , we then say that ν is a k-fold self-joining. R Let ν be a joining of the measures on Yi , i = 1, . . . , k, and let µi = µXi ,yi dνi (yi ) represent the disintegration of µi with respect to νi . Let µ be a measure on X = Q i Xi defined by Z (A.3) µ= µX1 ,y1 × µX2 ,y2 × . . . × µXk ,yk dν(y1 , y2 , . . . , yk ). Y

Then µ is called the conditional product measure with respect to ν.

POINTWISE CONVERGENCE OF MULTIPLE ERGODIC AVERAGES

31

Equivalently, µ is conditional product measure relative to ν if and only if for all k-tuple fi ∈ L∞ (Xi , µi ), i = 1, . . . , k Z f1 (x1 )f2 (x2 ) . . . fk (xk ) dµ(x1 , x2 , . . . , xk ) X Z (A.4) = E(f1 |Y1 )(y1 )E(f2 |Y2)(y2 ) . . . E(fk |Yk )(yk ) dν(y1 , y2 , . . . , yk ). Y

A.0.7. Relatively independent joining. Let (X1 , X1 , µ1 , T ), (X2, X2 , µ2 , T ) be two systems and let (Y, Y, R ν, S) be a common factor with πi : Xi → Y for i = 1, 2 the factor maps. Let µi = µi,y dν(y) represent the disintegration of µi with respect to Y . Let µ1 ×Y µ2 denote the measure defined by Z µ1 ×Y µ2 (A) = µ1,y × µ2,y dν(y), Y

for all A ∈ X1 × X2 . The system (X1 × X2 , X1 × X2 , µ1 ×Y µ2 , T × T ) is called the relative product of X1 and X2 with respect to Y and is denoted X1 ×Y X2 . µ1 ×Y µ2 is also called relatively independent joining of X1 and X2 over Y .

A.0.8. Isometric extensions and Ergodic extensions. Let π : (X, X , µ, T ) → (Y, Y, ν, S) be a factor map. The L2 (X, X , µ) norm is denoted by || · || and the L2 (X, X , µy ) norm by || · ||y for ν-almost every y ∈ Y . Recall {µy }y∈Y is the disintegration of µ relative to ν. A function f ∈ L2 (X, X , µ) is almost periodic over Y if for every ǫ > 0 there exist g1 , . . . , gl ∈ L2 (X, X , µ) such that for all n ∈ Z min ||T n f − gj ||y < ǫ

1≤j≤l

for ν almost every y ∈ Y . One writes f ∈ AP (Y). Let K(X|Y, T ) be the closed subspace of L2 (X) spanned by the almost periodic functions over Y. When Y is trivial, K(X, T ) = K(X|Y, T ) is the closed subspace spanned by eigenfunctions of T. X is an isometric extensions of Y if K(X|Y, T ) = L2 (X) and it is a (relatively) weak mixing extension of Y if K(X|Y, T ) = L2 (Y ). Definition A.4. Let π : (X, X , µ, T ) → (Y, Y, ν, S) be a homomorphism. π is relatively ergodic or (X, X , µ, T ) is an ergodic extension of (Y, Y, ν, T ) if T −invariant sets of X is contained in Y, i.e. I(T ) ⊆ Y. Appendix B. The pointwise ergodic theorem for amenable groups B.0.9. Amenability has many equivalent formulations; for us, the most convenient definition is that a locally compact group G is amenable if for any compact K ⊂ G and δ > 0 there is a compact set F ⊂ G such that |F ∆KF | < δ|F |, where we use both | · | and m to denote the left Haar measure on G (for discrete G, we take this to be the counting measure on G). Such a set F will be called (K, δ)invariant. A sequence F1 , F2 , . . . of compact subsets of G will be called a Følner sequence if for every compact K and δ > 0, for all large enough n we have that

32

WEN HUANG, SONG SHAO, AND XIANGDONG YE

Fn is (K, δ)-invariant. Here all groups are assumed to be locally compact second countable. B.0.10. Suppose now that G acts bi-measurably from the left by measure preserving transformations on a Lebesgue space (X, B, µ) with µ(X) = 1. We will use for any f : X → R the symbol A(F, f )(x) = AF (f ) to denote the average Z 1 A(F, f )(x) = f (gx) dm(g). |F | F

Definition B.1. A Følner sequence Fn will be said to be tempered if for some C > 0 and all n [ −1 Fk Fn < C |Fn | . (B.1) k. Then N H N 1 X X

1 X

2 1 lim sup lim sup xn ≤ lim sup < xn , xn+h > . N n=1 N →∞ N →∞ N n=1 H→∞ H h=1 References

[1] E. H. El Abdalaoui, On the pointwise convergence of multiple ergodic averages, arXiv:1406.2608 [2] I. Assani, Multiple recurrence and almost sure convergence for weakly mixing dynamical systems, Israel J. Math. 103 (1998), 111–124. [3] I. Assani, Pointwise convergence of ergodic averages along cubes, J. Analyse Math. 110 (2010), 241–269. [4] I. Assani, A.E. Multiple recurrence for weakly mixing commuting actions, arXiv:1312.5270. [5] T. Austin, On the norm convergence of non-conventional ergodic averages, Ergod. Th. and Dynam. Sys., 30(2010), 321–338. [6] A. Bellow and H. Furstenberg, An application of number theory to ergodic theory and the construction of uniquely ergodic models. A collection of invited papers on ergodic theory. Israel J. Math. 33 (1979), 231–240 (1980). [7] V. Bergelson, Weakly mixing PET, Ergodic Theory Dynam. Systems, 7 (1987), 337–349.

POINTWISE CONVERGENCE OF MULTIPLE ERGODIC AVERAGES

35

[8] V. Bergelson. The multifarious Poincare recurrence theorem, Descriptive set theory and dynamical systems. London Math. Soc. Lecture Note Series 277, Cambridge Univ. Press, Cambridge, (2000), 31–57. [9] V. Bergelson, B. Host and B. Kra, Multiple recurrence and nilsequences. With an appendix by Imre Ruzsa, Invent. Math. 160 (2005), no. 2, 261–303. [10] J. Bourgain, Double recurrence and almost sure convergence. J. Reine Angew. Math. 404 (1990), 140–161. [11] Q. Chu and N. Frantzikinakis, Pointwise convergence for cubic and polynomial ergodic averages of non-commuting transformations, Ergod. Th. and Dynam. Sys. 32 (2012), 877-897. [12] M. Denker, On strict ergodicity, Math. Z. 134 (1973), 231–253. [13] M. Denker, C. Grillenberger, and K. Sigmund, Ergodic theory on compact spaces, Lecture Notes in Mathematics, Vol. 527. Springer-Verlag, Berlin-New York, 1976. iv+360 pp. [14] J. Derrien and E. Lesigne, Un th´eor`eme ergodique polynomial ponctuel pourles endomorphismes exacts et les K-syst`emes, Ann. Inst. H. Poincar´e Probab. Statist. 32 (1996), no. 6, 765–778. [15] P. Dong, S. Donoso, A. Maass, S. Shao and X. Ye, Infinite-step nilsystems, independence and complexity, Ergod. Th. and Dynam. Sys., 33(2013), 118-143. [16] T. Downarowicz and Y. Lacroix, Forward mean proximal pairs and zero entropy, Israel J. Math. 191 (2012), 945–957. [17] M. Einsiedler and T. Ward, Ergodic theory with a view towards number theory, Graduate Texts in Mathematics, 259. Springer-Verlag London, Ltd., London, 2011. [18] H. Furstenberg, Ergodic behavior of diagonal measures and a theorem of Szemer´edi on arithmetic progressions. J. Analyse Math. 31 (1977), 204–256. [19] H. Furstenberg, Recurrence in ergodic theory and combinatorial number theory, M. B. Porter Lectures. Princeton University Press, Princeton, N.J., 1981. [20] E. Glasner, Topological ergodic decompositions and applications to products of powers of a minimal transformation, J. Anal. Math., 64 (1994), 241–262. [21] E. Glasner, Ergodic theory via joinings, Mathematical Surveys and Monographs, 101. American Mathematical Society, Providence, RI, 2003. xii+384 pp. [22] E. Glasner and B. Weiss, Strictly ergodic, uniform positive entropy models, Bull. Soc. Math. France 122 (1994), no. 3, 399–412. [23] E. Glasner and B. Weiss, On the interplay between measurable and topological dynamics, Handbook of dynamical systems. Vol. 1B, 597–648, Elsevier B. V., Amsterdam, 2006. [24] G. Hansel and J. P. Raoult, Ergodicity, uniformity and unique ergodicity, Indiana Univ. Math. J. 23 (1973/74), 221–237. [25] M. Hochman, On notions of determinism in topological dynamics, Ergod. Th. and Dynam. Sys. 32 (2012), 119–140. [26] B. Host, Ergodic seminorms for commuting transformations and applications, Studia Math. 195 (2009), no. 1, 31–49. [27] B. Host and B. Kra, Nonconventional averages and nilmanifolds, Ann. of Math., 161 (2005) 398–488. [28] B. Host, B. Kra and A. Maass, Nilsequences and a structure theory for topological dynamical systems, Advances in Mathematics, 224 (2010) 103–129. [29] R.I. Jewett, The prevalence of uniquely ergodic systems, J. Math. Mech. 19 1969/1970 717– 729. [30] W. Krieger, On unique ergodicity, Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. II: Probability theory, pp. 327–346, Univ. California Press, Berkeley, Calif., 1972. [31] E. Lehrer, Topological mixing and uniquely ergodic systems, Israel J. Math. 57 (1987), no. 2, 239–255. [32] E. Lesigne, Th´eor`emes ergodiques ponctuels pour des mesures diagonales, Cas des syst`emes distaux, Ann. Inst. H. Poincare Probab. Statist. 23 (1987), 593–612.

36

WEN HUANG, SONG SHAO, AND XIANGDONG YE

[33] E. Lindenstrauss, Measurable distal and topological distal systems, Ergodic Theory Dynamical Systems 19 (1999), 1063–1076. [34] E. Lindenstrauss, Pointwise theorems for amenable groups. Invent. Math. 146 (2001), 259– 295. [35] S. Shao and X. Ye, Regionally proximal relation of order d is an equivalence one for minimal systems and a combinatorial consequence, Adv. in Math., 231(2012), 1786–1817. [36] T. Tao, Norm convergence of multiple ergodic averages for commuting transformations, Ergodic Theory Dynam. Systems 28 (2008), no. 2, 657–688. [37] V.S. Varadarajan, Groups of automorphisms of Borel spaces, Trans. Amer. Math. Soc., 109(1963), 191-220. [38] M. Walsh, Norm convergence of nilpotent ergodic averages, Ann. of Math., 175 (2012) 1667– 1688. [39] B. Weiss, Strictly ergodic models for dynamical systems, Bull. Amer. Math. Soc. (N.S.) 13 (1985), 143–146. [40] B. Weiss, Countable generators in dynamics – Universal minimal models, Contemporary Mathematics 94, (1989), 321–326. [41] B. Weiss, Multiple recurrence and doubly minimal systems, Topological dynamics and applications (Minneapolis, MN, 1995), 189–196, Contemp. Math., 215, Amer. Math. Soc., Providence, RI, 1998. [42] B. Weiss, Single orbit dynamics, CBMS Regional Conference Series in Mathematics, 95. American Mathematical Society, Providence, RI, 2000. x+113 pp. [43] T. Ziegler, A non-conventional ergodic theorem for a nilsystem, Ergodic Theory Dynam. Systems 25 (2005), no. 4, 1357–1370. [44] T. Ziegler, Universal characteristic factors and Furstenberg averages. J. Amer. Math. Soc. 20 (2007), no. 1, 53–97. Department of Mathematics, University of Science and Technology of China, Hefei, Anhui, 230026, P.R. China. E-mail address: [email protected] E-mail address: [email protected] E-mail address: [email protected]