Introduction to Calculus in Several Variables

Introduction to Calculus in Several Variables Michael E. Taylor


Contents

 0. One-variable calculus
 1. The derivative
 2. Inverse function and implicit function theorem
 3. Fundamental local existence theorem for ODE
 4. The Riemann integral in n variables
 5. Integration on surfaces
 6. Differential forms
 7. Products and exterior derivatives of forms
 8. The general Stokes formula
 9. The classical Gauss, Green, and Stokes formulas
10. Holomorphic functions and harmonic functions
11. The Brouwer fixed-point theorem

 A. Metric spaces, convergence, and compactness
 B. Partitions of unity
 C. Differential forms and the change of variable formula
 D. Remarks on power series
 E. The Weierstrass theorem and the Stone-Weierstrass theorem
 F. Convolution approach to the Weierstrass approximation theorem
 G. Fourier series – first steps
 H. Inner product spaces


Introduction

The goal of this text is to present a compact treatment of the basic results of multivariable calculus, to students who have had 3 semesters of calculus, including an introductory treatment of multivariable calculus, and who have had a course in linear algebra, but who could use a logical development of the basics, particularly of the integral.

We begin with an introductory section, §0, presenting the elements of one-dimensional calculus. We first define the Riemann integral of a class of functions on an interval. We then introduce the derivative, and establish the Fundamental Theorem of Calculus, relating differentiation and integration as essentially inverse operations. Further results are dealt with in the exercises, such as the change of variable formula for integrals, and the Taylor formula with remainder.

In §1 we define the derivative of a function F : O → R^m, when O is an open subset of R^n, as a linear map from R^n to R^m. We establish some basic properties, such as the chain rule. We use the one-dimensional integral as a tool to show that, if the matrix of first order partial derivatives of F is continuous on O, then F is differentiable on O. We also discuss two convenient multi-index notations for higher derivatives, and derive the Taylor formula with remainder for a smooth function F on O ⊂ R^n.

In §2 we establish the Inverse Function Theorem, stating that a smooth map F : O → R^n with an invertible derivative DF(p) has a smooth inverse defined near q = F(p). We derive the Implicit Function Theorem as a consequence of this. As a tool in proving the Inverse Function Theorem, we use a fixed point theorem known as the Contraction Mapping Principle.

In §3 we establish the fundamental local existence theorem for systems of ordinary differential equations. ODE is not a major topic in this course, but a treatment of the existence theory fits in well here, as it makes further use of the Contraction Mapping Principle.

In §4 we take up the multidimensional Riemann integral. The basic definition is quite parallel to the one-dimensional case, but a number of central results, while parallel in statement to the one-dimensional case, require more elaborate demonstrations in higher dimensions. This section is the most demanding in these notes, but it is in a sense the heart of the course. One central result is the change of variable formula for multidimensional integrals. Another is the reduction of multiple integrals to iterated integrals.

In §5 we pass from integrals of functions defined on domains in R^n to integrals of functions defined on n-dimensional surfaces. This includes the study of surface area. Surface area is defined via a "metric tensor," an object that is a little more complicated than a vector.

In §6 we introduce a further class of objects that are defined on surfaces, differential forms. A k-form can be integrated over a k-dimensional surface, endowed with an extra piece of structure, an "orientation." The change of variable formula established in §4 plays a central role in establishing this. Properties of differential forms are developed further in the next two sections. In §7 we define exterior products of forms, and interior products of forms with vector fields. Then we define the exterior derivative of a form.

Section 8 is devoted to the general Stokes formula, an important integral identity which contains as special cases classical identities of Green, Gauss, and Stokes. These special cases are discussed in some detail in §9.

In §10 we use Green's Theorem to derive fundamental properties of holomorphic functions of a complex variable. Sprinkled throughout earlier sections are some allusions to functions of complex variables, particularly in some of the exercises in §§1–2. Readers with no previous exposure to complex variables might wish to return to these exercises after getting through §10. In this section, we also discuss some results on the closely related study of harmonic functions. One result is Liouville's Theorem, stating that a bounded harmonic function on all of R^n must be constant. When specialized to holomorphic functions on C = R^2, this yields a proof of the Fundamental Theorem of Algebra.

In §11 we present the Brouwer Fixed Point Theorem, which states that a continuous map from the closed unit ball in R^n to itself must have a fixed point. We give a proof of this that makes use of the Stokes formula for differential forms. This fixed point theorem plays an important role in several areas of mathematics, as the student who continues to study the subject will see. Here we present it principally as an illustration of the usefulness of the calculus of differential forms.

Appendix A at the end of these notes covers some basic notions of metric spaces and compactness, used from time to time throughout the notes, particularly in the study of the Riemann integral and in the proof of the fundamental existence theorem for ODE. Appendix B discusses partitions of unity, useful particularly in the proof of the Stokes formula. Appendix C presents a proof of the change of variable formula for multiple integrals, following an idea presented by Lax in [L], which is somewhat shorter than that given in §4. Appendix D discusses the remainder term in the Taylor series of a function. Appendix E gives the Weierstrass theorem on approximating a continuous function by polynomials, and an extension, known as the Stone-Weierstrass theorem, which is a useful tool in analysis. Appendix F treats another approach to the Weierstrass approximation theorem. Appendix G provides an introduction to Fourier series. Appendix H gives some basic material on inner product spaces.

It has been our express intention to make this presentation of multivariable calculus short. As part of this package, the exercises play a particularly important role in developing the material. In addition to exercises providing practice in applying results established in the text, there are also exercises asking the reader to supply details for some of the arguments used in the text. There are also scattered exercises intended to give the reader a fresh perspective on topics familiar from previous study. We mention in particular exercises on determinants and on the cross product in §1, and on trigonometric functions in §3.


0. One-variable calculus

In this brief discussion of one-variable calculus, we first consider the integral, and then relate it to the derivative. We will define the Riemann integral of a bounded function over an interval I = [a, b] on the real line. To do this, we partition I into smaller intervals. A partition P of I is a finite collection of subintervals {J_k : 0 ≤ k ≤ N}, disjoint except for their endpoints, whose union is I. We can order the J_k so that J_k = [x_k, x_{k+1}], where

(0.1)   x_0 < x_1 < \cdots < x_N < x_{N+1}, \qquad x_0 = a, \quad x_{N+1} = b.

We call the points x_k the endpoints of P. We set

(0.2)   \ell(J_k) = x_{k+1} - x_k, \qquad \operatorname{maxsize}(P) = \max_{0 \le k \le N} \ell(J_k).

We then set

(0.3)   \overline{I}_P(f) = \sum_k \Bigl(\sup_{J_k} f\Bigr)\, \ell(J_k), \qquad \underline{I}_P(f) = \sum_k \Bigl(\inf_{J_k} f\Bigr)\, \ell(J_k).

(See (A.10)–(A.11) for the definition of sup and inf.) Note that \underline{I}_P(f) \le \overline{I}_P(f). These quantities should approximate the Riemann integral of f, if the partition P is sufficiently "fine." To be more precise, if P and Q are two partitions of I, we say P refines Q, and write P ≻ Q, if P is formed by partitioning each interval in Q. Equivalently, P ≻ Q if and only if all the endpoints of Q are also endpoints of P. It is easy to see that any two partitions have a common refinement; just take the union of their endpoints, to form a new partition. Note also that

(0.4)   P \succ Q \Longrightarrow \overline{I}_P(f) \le \overline{I}_Q(f) \ \text{and} \ \underline{I}_P(f) \ge \underline{I}_Q(f).

Consequently, if P_1, P_2 are any two partitions and Q is a common refinement, we have

(0.5)   \underline{I}_{P_1}(f) \le \underline{I}_Q(f) \le \overline{I}_Q(f) \le \overline{I}_{P_2}(f).

Now, whenever f : I → R is bounded, the following quantities are well defined:

(0.6)   \overline{I}(f) = \inf_{P \in \Pi(I)} \overline{I}_P(f), \qquad \underline{I}(f) = \sup_{P \in \Pi(I)} \underline{I}_P(f),

where Π(I) is the set of all partitions of I. Clearly, by (0.5), \underline{I}(f) \le \overline{I}(f). We then say that f is Riemann integrable provided \underline{I}(f) = \overline{I}(f), and in such a case, we set

(0.7)   \int_a^b f(x)\, dx = \int_I f(x)\, dx = \underline{I}(f) = \overline{I}(f).

We will denote the set of Riemann integrable functions on I by R(I). We derive some basic properties of the Riemann integral.


Proposition 0.1. If f, g ∈ R(I), then f + g ∈ R(I), and

(0.8)   \int_I (f + g)\, dx = \int_I f\, dx + \int_I g\, dx.

Proof. If J_k is any subinterval of I, then

\sup_{J_k} (f + g) \le \sup_{J_k} f + \sup_{J_k} g,

so, for any partition P, we have \overline{I}_P(f + g) \le \overline{I}_P(f) + \overline{I}_P(g). Also, using common refinements, we can simultaneously approximate \overline{I}(f) and \overline{I}(g) by \overline{I}_P(f) and \overline{I}_P(g). Thus the characterization (0.6) implies \overline{I}(f + g) \le \overline{I}(f) + \overline{I}(g). A parallel argument implies \underline{I}(f + g) \ge \underline{I}(f) + \underline{I}(g), and the proposition follows.

Next, there is a fair supply of Riemann integrable functions.

Proposition 0.2. If f is continuous on I, then f is Riemann integrable.

Proof. Any continuous function on a compact interval is bounded (see Proposition A.14) and uniformly continuous (see Proposition A.15); let ω(δ) be a modulus of continuity for f, so

(0.9)   |x - y| \le \delta \Longrightarrow |f(x) - f(y)| \le \omega(\delta), \qquad \omega(\delta) \to 0 \ \text{as} \ \delta \to 0.

Then

(0.10)   \operatorname{maxsize}(P) \le \delta \Longrightarrow \overline{I}_P(f) - \underline{I}_P(f) \le \omega(\delta) \cdot \ell(I),

which yields the proposition.

We denote the set of continuous functions on I by C(I). Thus Proposition 0.2 says C(I) ⊂ R(I). The proof of Proposition 0.2 provides a criterion on a partition guaranteeing that \overline{I}_P(f) and \underline{I}_P(f) are close to \int_I f\, dx when f is continuous. We produce an extension, showing when \overline{I}_P(f) and \overline{I}(f) are close, and \underline{I}_P(f) and \underline{I}(f) are close, given f bounded on I. Given a partition P_0 of I, set

(0.11)   \operatorname{minsize}(P_0) = \min\{\ell(J_k) : J_k \in P_0\}.


Proposition 0.3. Let P_0 and P be two partitions of I. Assume

(0.12)   \operatorname{maxsize}(P) \le \frac{1}{k}\operatorname{minsize}(P_0).

Let |f| ≤ M on I. Then

(0.13)   \overline{I}_P(f) \le \overline{I}_{P_0}(f) + \frac{2M}{k}\,\ell(I), \qquad \underline{I}_P(f) \ge \underline{I}_{P_0}(f) - \frac{2M}{k}\,\ell(I).

Proof. Let P_1 denote the minimal common refinement of P and P_0. Consider on the one hand those intervals in P that are contained in intervals in P_0, and on the other hand those intervals in P that are not contained in intervals in P_0 (whose lengths sum to ≤ \ell(I)/k). We obtain

\overline{I}_P(f) \le \overline{I}_{P_1}(f) + \frac{2M}{k}\,\ell(I), \qquad \underline{I}_P(f) \ge \underline{I}_{P_1}(f) - \frac{2M}{k}\,\ell(I).

Since also \overline{I}_{P_1}(f) \le \overline{I}_{P_0}(f) and \underline{I}_{P_1}(f) \ge \underline{I}_{P_0}(f), we obtain (0.13).

The following corollary is sometimes called Darboux's Theorem.

Corollary 0.4. Let P_ν be a sequence of partitions of I into ν intervals J_{νk}, 1 ≤ k ≤ ν, such that maxsize(P_ν) → 0. If f : I → R is bounded, then

(0.14)   \overline{I}_{P_\nu}(f) \to \overline{I}(f) \quad \text{and} \quad \underline{I}_{P_\nu}(f) \to \underline{I}(f).

Consequently,

(0.15)   f \in R(I) \iff \overline{I}(f) = \lim_{\nu \to \infty} \sum_{k=1}^{\nu} f(\xi_{\nu k})\, \ell(J_{\nu k}),

for arbitrary \xi_{\nu k} \in J_{\nu k}, in which case the limit is \int_I f\, dx.

The sums on the right side of (0.15) are called Riemann sums, approximating \int_I f\, dx (when f is Riemann integrable).
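Darboux's Theorem (Corollary 0.4) is easy to try out numerically. The following sketch computes upper and lower Darboux sums of a sample function over uniform partitions; the choice of f(x) = x² on [0, 1] and the sampling used to estimate sup and inf on each subinterval are assumptions of the example, not part of the text (for a monotone function, the endpoint samples give the sup and inf exactly).

```python
# Upper and lower Darboux sums over uniform partitions of [a, b].
# As maxsize -> 0, both converge to the integral (Corollary 0.4).

def darboux_sums(f, a, b, nu):
    """Return (upper, lower) Darboux sums of f for the uniform
    partition of [a, b] into nu subintervals.  sup/inf on each
    subinterval are estimated by sampling; exact for monotone f."""
    h = (b - a) / nu
    upper = lower = 0.0
    for k in range(nu):
        x0 = a + k * h
        samples = [f(x0 + j * h / 10) for j in range(11)]  # includes endpoints
        upper += max(samples) * h
        lower += min(samples) * h
    return upper, lower

f = lambda x: x * x          # its integral over [0, 1] is 1/3
for nu in (10, 100, 1000):
    U, L = darboux_sums(f, 0.0, 1.0, nu)
    print(nu, round(L, 6), round(U, 6))   # L <= 1/3 <= U, gap shrinks
```

For this monotone f the gap between the sums is exactly h·(f(1) − f(0)), illustrating the estimate (0.10).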

Remark. A second proof of Proposition 0.1 can readily be deduced from Corollary 0.4.

One should be warned that, once such a specific choice of P_ν and ξ_{νk} has been made, the limit on the right side of (0.15) might exist for a bounded function f that is not Riemann integrable. This and other phenomena are illustrated by the following example of a function which is not Riemann integrable. For x ∈ I, set

(0.16)   \vartheta(x) = 1 \ \text{if} \ x \in Q, \qquad \vartheta(x) = 0 \ \text{if} \ x \notin Q,

where Q is the set of rational numbers. Now every interval J ⊂ I of positive length contains points in Q and points not in Q, so for any partition P of I we have \overline{I}_P(\vartheta) = \ell(I) and \underline{I}_P(\vartheta) = 0, hence

(0.17)   \overline{I}(\vartheta) = \ell(I), \qquad \underline{I}(\vartheta) = 0.

Note that, if P_ν is a partition of I into ν equal subintervals, then we could pick each ξ_{νk} to be rational, in which case the limit on the right side of (0.15) would be \ell(I), or we could pick each ξ_{νk} to be irrational, in which case this limit would be zero. Alternatively, we could pick half of them to be rational and half to be irrational, and the limit would be \ell(I)/2.

Associated to the Riemann integral is a notion of size of a set S, called content. If S is a subset of I, define the "characteristic function"

(0.18)   \chi_S(x) = 1 \ \text{if} \ x \in S, \qquad \chi_S(x) = 0 \ \text{if} \ x \notin S.

We define "upper content" cont^+ and "lower content" cont^- by

(0.19)   \operatorname{cont}^+(S) = \overline{I}(\chi_S), \qquad \operatorname{cont}^-(S) = \underline{I}(\chi_S).

We say S "has content," or "is contented," if these quantities are equal, which happens if and only if \chi_S \in R(I), in which case the common value of cont^+(S) and cont^-(S) is

(0.20)   m(S) = \int_I \chi_S(x)\, dx.

It is easy to see that

(0.21)   \operatorname{cont}^+(S) = \inf\Bigl\{ \sum_{k=1}^N \ell(J_k) : S \subset J_1 \cup \cdots \cup J_N \Bigr\},

where the J_k are intervals. Here, we require S to be contained in the union of a finite collection of intervals.

See the appendix at the end of this section for a generalization of Proposition 0.2, giving a sufficient condition for a bounded function to be Riemann integrable on I, in terms of the upper content of its set of discontinuities.

There is a more sophisticated notion of the size of a subset of I, called Lebesgue measure. The key to the construction of Lebesgue measure is to cover a set S by a countable (either finite or infinite) collection of intervals. The outer measure of S ⊂ I is defined by

(0.22)   m^*(S) = \inf\Bigl\{ \sum_{k \ge 1} \ell(J_k) : S \subset \bigcup_{k \ge 1} J_k \Bigr\}.

Here {J_k} is a finite or countably infinite collection of intervals. Clearly

(0.23)   m^*(S) \le \operatorname{cont}^+(S).

Note that, if S = I ∩ Q, then \chi_S = \vartheta, defined by (0.16). In this case it is easy to see that cont^+(S) = \ell(I), but m^*(S) = 0. Zero is the "right" measure of this set. More material on the development of measure theory can be found in a number of books, including [Fol] and [T2].

It is useful to note that \int_I f\, dx is additive in I, in the following sense.

Proposition 0.5. If a < b < c, f : [a, c] → R, f_1 = f|_{[a,b]}, f_2 = f|_{[b,c]}, then

(0.24)   f \in R([a, c]) \iff f_1 \in R([a, b]) \ \text{and} \ f_2 \in R([b, c]),

and, if this holds,

(0.25)   \int_a^c f\, dx = \int_a^b f_1\, dx + \int_b^c f_2\, dx.

Proof. Since any partition of [a, c] has a refinement for which b is an endpoint, we may as well consider a partition P = P_1 ∪ P_2, where P_1 is a partition of [a, b] and P_2 is a partition of [b, c]. Then

(0.26)   \overline{I}_P(f) = \overline{I}_{P_1}(f_1) + \overline{I}_{P_2}(f_2), \qquad \underline{I}_P(f) = \underline{I}_{P_1}(f_1) + \underline{I}_{P_2}(f_2),

so

(0.27)   \overline{I}_P(f) - \underline{I}_P(f) = \bigl\{\overline{I}_{P_1}(f_1) - \underline{I}_{P_1}(f_1)\bigr\} + \bigl\{\overline{I}_{P_2}(f_2) - \underline{I}_{P_2}(f_2)\bigr\}.

Since both terms in braces in (0.27) are ≥ 0, we have the equivalence in (0.24). Then (0.25) follows from (0.26) upon taking sufficiently fine partitions.

Let I = [a, b]. If f ∈ R(I), then f ∈ R([a, x]) for all x ∈ [a, b], and we can consider the function

(0.28)   g(x) = \int_a^x f(t)\, dt.

If a ≤ x_0 ≤ x_1 ≤ b, then

(0.29)   g(x_1) - g(x_0) = \int_{x_0}^{x_1} f(t)\, dt,

so, if |f| ≤ M,

(0.30)   |g(x_1) - g(x_0)| \le M\, |x_1 - x_0|.

In other words, if f ∈ R(I), then g is Lipschitz continuous on I.

A function g : (a, b) → R is said to be differentiable at x ∈ (a, b) provided there exists the limit

(0.31)   \lim_{h \to 0} \frac{1}{h}\bigl[g(x + h) - g(x)\bigr] = g'(x).

When such a limit exists, g'(x), also denoted dg/dx, is called the derivative of g at x. Clearly g is continuous wherever it is differentiable. The next result is part of the Fundamental Theorem of Calculus.


Theorem 0.6. If f ∈ C([a, b]), then the function g, defined by (0.28), is differentiable at each point x ∈ (a, b), and

(0.32)   g'(x) = f(x).

Proof. Parallel to (0.29), we have, for h > 0,

(0.33)   \frac{1}{h}\bigl[g(x + h) - g(x)\bigr] = \frac{1}{h}\int_x^{x+h} f(t)\, dt.

If f is continuous at x, then, for any ε > 0, there exists δ > 0 such that |f(t) − f(x)| ≤ ε whenever |t − x| ≤ δ. Thus the right side of (0.33) is within ε of f(x) whenever h ∈ (0, δ]. Thus the desired limit exists as h ↘ 0. A similar argument treats h ↗ 0.

The next result is the rest of the Fundamental Theorem of Calculus.

Theorem 0.7. If G is differentiable and G'(x) is continuous on [a, b], then

(0.34)   \int_a^b G'(t)\, dt = G(b) - G(a).

Proof. Consider the function

(0.35)   g(x) = \int_a^x G'(t)\, dt.

We have g ∈ C([a, b]), g(a) = 0, and, by Theorem 0.6,

g'(x) = G'(x), \qquad \forall\, x \in (a, b).

Thus f(x) = g(x) − G(x) is continuous on [a, b], and

(0.36)   f'(x) = 0, \qquad \forall\, x \in (a, b).

We claim that (0.36) implies f is constant on [a, b]. Granted this, since f(a) = g(a) − G(a) = −G(a), we have f(x) = −G(a) for all x ∈ [a, b], so the integral (0.35) is equal to G(x) − G(a) for all x ∈ [a, b]. Taking x = b yields (0.34).

The fact that (0.36) implies f is constant on [a, b] is a consequence of the following result, known as the Mean Value Theorem.

Theorem 0.8. Let f : [a, β] → R be continuous, and assume f is differentiable on (a, β). Then ∃ ξ ∈ (a, β) such that

(0.37)   f'(\xi) = \frac{f(\beta) - f(a)}{\beta - a}.


Proof. Replacing f(x) by f̃(x) = f(x) − κ(x − a), where κ is the right side of (0.37), we can assume without loss of generality that f(a) = f(β). Then we claim that f'(ξ) = 0 for some ξ ∈ (a, β). Indeed, since [a, β] is compact, f must assume a maximum and a minimum on [a, β]. If f(a) = f(β), one of these must be assumed at an interior point ξ, at which f' clearly vanishes.

Now, to see that (0.36) implies f is constant on [a, b]: if not, ∃ β ∈ (a, b] such that f(β) ≠ f(a). Then just apply Theorem 0.8 to f on [a, β]. This completes the proof of Theorem 0.7.

If a function G is differentiable on (a, b), and G' is continuous on (a, b), we say G is a C^1 function, and write G ∈ C^1((a, b)). Inductively, we say G ∈ C^k((a, b)) provided G' ∈ C^{k-1}((a, b)).

An easy consequence of the definition (0.31) of the derivative is that, for any real constants a, b, and c,

f(x) = ax^2 + bx + c \Longrightarrow \frac{df}{dx} = 2ax + b.

Now, it is a simple enough step to replace a, b, c by y, z, w in these formulas. Having done that, we can regard y, z, and w as variables, along with x:

F(x, y, z, w) = yx^2 + zx + w.

We can then hold y, z, and w fixed (e.g., set y = a, z = b, w = c), and then differentiate with respect to x; we get

\frac{\partial F}{\partial x} = 2yx + z,

the partial derivative of F with respect to x. Generally, if F is a function of n variables, x_1, \dots, x_n, we set

\frac{\partial F}{\partial x_j}(x_1, \dots, x_n) = \lim_{h \to 0} \frac{1}{h}\bigl[F(x_1, \dots, x_{j-1}, x_j + h, x_{j+1}, \dots, x_n) - F(x_1, \dots, x_{j-1}, x_j, x_{j+1}, \dots, x_n)\bigr],

where the limit exists. Section 1 carries on with a further investigation of the derivative of a function of several variables.
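The limit defining ∂F/∂x_j above can be approximated by its difference quotient at a small fixed h. The following sketch does this for the function F(x, y, z, w) = yx² + zx + w from the text; the step size, test point, and tolerance are assumptions of the example.

```python
# Forward difference-quotient approximation of a partial derivative,
# following the limit defining dF/dx_j in the text.

def partial(F, x, j, h=1e-6):
    """Approximate dF/dx_j of F at the point x (a tuple),
    holding the other variables fixed."""
    xh = list(x)
    xh[j] += h
    return (F(*xh) - F(*x)) / h

F = lambda x, y, z, w: y * x**2 + z * x + w   # example from the text
p = (2.0, 3.0, 5.0, 7.0)                      # the point (x, y, z, w)
print(partial(F, p, 0))   # dF/dx = 2*y*x + z = 17 at p, up to O(h)
```

Since F is quadratic in x, the forward quotient differs from the exact partial derivative 2yx + z only by y·h.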

Complementary results on Riemann integrability

Here we provide a condition, more general than Proposition 0.2, which guarantees Riemann integrability.


Proposition 0.9. Let f : I → R be a bounded function, with I = [a, b]. Suppose that the set S of points of discontinuity of f has the property

(0.38)   \operatorname{cont}^+(S) = 0.

Then f ∈ R(I).

Proof. Say |f(x)| ≤ M. Take ε > 0. As in (0.21), take intervals J_1, . . . , J_N such that S ⊂ J_1 ∪ · · · ∪ J_N and \sum_{k=1}^N \ell(J_k) < ε. In fact, fatten each J_k such that S is contained in the interior of this collection of intervals, keeping the sum of their lengths < 2ε. Consider a partition P_0 of I, whose intervals include J_1, . . . , J_N, amongst others, which we label I_1, . . . , I_K. Now f is continuous on each interval I_ν, so, subdividing each I_ν as necessary, hence refining P_0 to a partition P_1, we arrange that sup f − inf f < ε on each such subdivided interval. Denote these subdivided intervals I'_1, . . . , I'_L. It readily follows that

(0.39)   0 \le \overline{I}_{P_1}(f) - \underline{I}_{P_1}(f) < 4M\varepsilon + \varepsilon\,\ell(I).

Since ε can be taken arbitrarily small, this establishes that f ∈ R(I).

Denote by PK(I) the set of piecewise constant functions on I. A useful criterion is that a bounded f : I → R belongs to R(I) if and only if, for each ε > 0,

(0.40)   \exists\, f_0, f_1 \in \mathrm{PK}(I) \ \text{such that} \ f_0 \le f \le f_1 \ \text{and} \ \int_I (f_1 - f_0)\, dx < \varepsilon.

This can be used to prove

(0.41)   f, g \in R(I) \Longrightarrow fg \in R(I),

via the fact that

(0.42)   f_j, g_j \in \mathrm{PK}(I) \Longrightarrow f_j g_j \in \mathrm{PK}(I).

Exercises

1. Let c > 0 and let f : [ac, bc] → R be Riemann integrable. Working directly with the definition of integral, show that

(0.43)   \int_a^b f(cx)\, dx = \frac{1}{c} \int_{ac}^{bc} f(x)\, dx.

More generally, show that

(0.44)   \int_{a - d/c}^{b - d/c} f(cx + d)\, dx = \frac{1}{c} \int_{ac}^{bc} f(x)\, dx.

2. Let f : I × S → R be continuous, where I = [a, b] and S ⊂ R^n. Take \varphi(y) = \int_I f(x, y)\, dx. Show that \varphi is continuous on S.
Hint. If f_j : I → R are continuous and |f_1(x) − f_2(x)| ≤ δ on I, then

(0.45)   \Bigl| \int_I f_1\, dx - \int_I f_2\, dx \Bigr| \le \ell(I)\, \delta.

3. With f as in Exercise 2, suppose g_j : S → R are continuous and a ≤ g_0(y) < g_1(y) ≤ b. Take \varphi(y) = \int_{g_0(y)}^{g_1(y)} f(x, y)\, dx. Show that \varphi is continuous on S.
Hint. Make a change of variables, linear in x, to reduce this to Exercise 2.

4. Suppose f : (a, b) → (c, d) and g : (c, d) → R are differentiable. Show that h(x) = g(f(x)) is differentiable and

(0.46)   h'(x) = g'(f(x))\, f'(x).

This is the chain rule.
Hint. Peek at the proof of the chain rule in §1.

5. If f_1 and f_2 are differentiable on (a, b), show that f_1(x) f_2(x) is differentiable and

(0.47)   \frac{d}{dx}\bigl(f_1(x) f_2(x)\bigr) = f_1'(x)\, f_2(x) + f_1(x)\, f_2'(x).


If f_2(x) ≠ 0, ∀ x ∈ (a, b), show that f_1(x)/f_2(x) is differentiable and

(0.48)   \frac{d}{dx}\Bigl(\frac{f_1(x)}{f_2(x)}\Bigr) = \frac{f_1'(x)\, f_2(x) - f_1(x)\, f_2'(x)}{f_2(x)^2}.

6. If f : (a, b) → R is differentiable, show that

(0.49)   \frac{d}{dx} f(x)^n = n\, f(x)^{n-1} f'(x),

for all positive integers n. If f(x) ≠ 0 ∀ x ∈ (a, b), show that this holds for all n ∈ Z.
Hint. To get (0.49) for positive n, use induction and apply (0.47).

7. Let \varphi : [a, b] → [A, B] be C^1 on a neighborhood J of [a, b], with \varphi'(x) > 0 for all x ∈ [a, b]. Assume \varphi(a) = A, \varphi(b) = B. Show that the identity

(0.50)   \int_A^B f(y)\, dy = \int_a^b f(\varphi(t))\, \varphi'(t)\, dt,

for any f ∈ C(J), follows from the chain rule and the Fundamental Theorem of Calculus. Hint. Replace b by x, B by ϕ(x), and differentiate. Note that this result contains that of Exercise 1. (Optional) Try to establish (0.50) directly by working with the definition of the integral as a limit of Riemann sums. 8. Show that, if f and g are C 1 on a neighborhood of [a, b], then Z

b

(0.51)

Z

b

0

f (s)g (s) ds = − a

£ ¤ f 0 (s)g(s) ds + f (b)g(b) − f (a)g(a) .

a

This transformation of integrals is called "integration by parts."

9. Let f : (−a, a) → R be a C^{j+1} function. Show that, for x ∈ (−a, a),

(0.52)   f(x) = f(0) + f'(0)\, x + \frac{f''(0)}{2}\, x^2 + \cdots + \frac{f^{(j)}(0)}{j!}\, x^j + R_j(x),

where

(0.53)   R_j(x) = \int_0^x \frac{(x - s)^j}{j!}\, f^{(j+1)}(s)\, ds.

This is Taylor's formula with remainder.
Hint. Use induction. If (0.52)–(0.53) holds for 0 ≤ j ≤ k, show that it holds for j = k + 1, by showing that

(0.54)   \int_0^x \frac{(x - s)^k}{k!}\, f^{(k+1)}(s)\, ds = \frac{f^{(k+1)}(0)}{(k+1)!}\, x^{k+1} + \int_0^x \frac{(x - s)^{k+1}}{(k+1)!}\, f^{(k+2)}(s)\, ds.


To establish this, use the integration by parts formula (0.51), with f(s) replaced by f^{(k+1)}(s), and with appropriate g(s). See Exercise 7 of §1 for another approach. Note that another presentation of (0.53) is

(0.55)   R_j(x) = \frac{x^{j+1}}{(j+1)!} \int_0^1 f^{(j+1)}\bigl((1 - t^{1/(j+1)})\, x\bigr)\, dt.
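Taylor's formula (0.52)–(0.53) can be verified numerically: the polynomial part plus the remainder integral should reproduce f(x) exactly. In the sketch below, the choice f = exp (so every derivative is exp), the values of j and x, and the midpoint quadrature of the remainder integral are assumptions of the example.

```python
# Numerical check of Taylor's formula with remainder (0.52)-(0.53)
# for f = exp at x = 1: polynomial part + R_j(x) should give e.
import math

def remainder(x, j, n=10000):
    """Midpoint-rule approximation of the remainder integral
    R_j(x) = int_0^x (x - s)**j / j! * f^(j+1)(s) ds, for f = exp."""
    h = x / n
    return sum((x - s)**j / math.factorial(j) * math.exp(s)
               for s in ((k + 0.5) * h for k in range(n))) * h

x, j = 1.0, 4
poly = sum(x**k / math.factorial(k) for k in range(j + 1))
print(poly + remainder(x, j), math.exp(x))   # the two should agree
```
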

10. Assume f : (−a, a) → R is a C^j function. Show that, for x ∈ (−a, a), (0.52) holds, with

(0.56)   R_j(x) = \frac{1}{(j-1)!} \int_0^x (x - s)^{j-1} \bigl[f^{(j)}(s) - f^{(j)}(0)\bigr]\, ds.

Hint. Apply (0.53) with j replaced by j − 1. Add and subtract f^{(j)}(0) to the factor f^{(j)}(s) in the resulting integrand. See Appendix D for further discussion.

11. Given I = [a, b], show that

(0.57)

f, g ∈ R(I) =⇒ f g ∈ R(I),

as advertised in (0.41).

12. Assume f_k ∈ R(I) and f_k → f uniformly on I. Prove that f ∈ R(I) and

(0.58)   \int_I f_k\, dx \longrightarrow \int_I f\, dx.

13. Given I = [a, b], Iε = [a + ε, b − ε], assume fk ∈ R(I), |fk | ≤ M on I for all k, and (0.59)

fk −→ f uniformly on Iε ,

for all ε ∈ (0, (b − a)/2). Prove that f ∈ R(I) and (0.58) holds.

14. We say f ∈ R(R) provided f|_{[k,k+1]} ∈ R([k, k + 1]) for each k ∈ Z, and

(0.60)   \sum_{k=-\infty}^{\infty} \int_k^{k+1} |f(x)|\, dx < \infty.

If f ∈ R(R), we set

(0.61)   \int_{-\infty}^{\infty} f(x)\, dx = \lim_{k \to \infty} \int_{-k}^{k} f(x)\, dx.

Formulate and demonstrate basic properties of the integral over R of elements of R(R).


1. The derivative

Let O be an open subset of R^n, and F : O → R^m a continuous function. We say F is differentiable at a point x ∈ O, with derivative L, if L : R^n → R^m is a linear transformation such that, for y ∈ R^n small,

(1.1)   F(x + y) = F(x) + Ly + R(x, y)

with

(1.2)   \frac{\|R(x, y)\|}{\|y\|} \to 0 \ \text{as} \ y \to 0.

We denote the derivative at x by DF(x) = L. With respect to the standard bases of R^n and R^m, DF(x) is simply the matrix of partial derivatives,

(1.3)   DF(x) = \Bigl(\frac{\partial F_j}{\partial x_k}\Bigr) = \begin{pmatrix} \partial F_1/\partial x_1 & \cdots & \partial F_1/\partial x_n \\ \vdots & & \vdots \\ \partial F_m/\partial x_1 & \cdots & \partial F_m/\partial x_n \end{pmatrix},

so that, if v = (v_1, \dots, v_n)^t (regarded as a column vector), then

(1.4)   DF(x)\, v = \begin{pmatrix} \sum_k (\partial F_1/\partial x_k)\, v_k \\ \vdots \\ \sum_k (\partial F_m/\partial x_k)\, v_k \end{pmatrix}.

It will be shown below that F is differentiable whenever all the partial derivatives exist and are continuous on O. In such a case we say F is a C^1 function on O. More generally, F is said to be C^k if all its partial derivatives of order ≤ k exist and are continuous. If F is C^k for all k, we say F is C^∞. In (1.2), we can use the Euclidean norm on R^n and R^m. This norm is defined by

(1.5)   \|x\| = \bigl(x_1^2 + \cdots + x_n^2\bigr)^{1/2}

for x = (x_1, \dots, x_n) ∈ R^n. Any other norm would do equally well. An application of the Fundamental Theorem of Calculus, to functions of each variable x_j separately, yields the following. If we assume F : O → R^m is differentiable in each variable separately, and that each ∂F/∂x_j is continuous on O, then

(1.6)   F(x + y) = F(x) + \sum_{j=1}^n \bigl[F(x + z_j) - F(x + z_{j-1})\bigr] = F(x) + \sum_{j=1}^n A_j(x, y)\, y_j,
        A_j(x, y) = \int_0^1 \frac{\partial F}{\partial x_j}\bigl(x + z_{j-1} + t y_j e_j\bigr)\, dt,

where z_0 = 0, z_j = (y_1, \dots, y_j, 0, \dots, 0), and {e_j} is the standard basis of R^n. Consequently,

(1.7)   F(x + y) = F(x) + \sum_{j=1}^n \frac{\partial F}{\partial x_j}(x)\, y_j + R(x, y),
        R(x, y) = \sum_{j=1}^n y_j \int_0^1 \Bigl\{\frac{\partial F}{\partial x_j}(x + z_{j-1} + t y_j e_j) - \frac{\partial F}{\partial x_j}(x)\Bigr\}\, dt.

Now (1.7) implies F is differentiable on O, as we stated below (1.4). Thus we have established the following.

Proposition 1.1. If O is an open subset of R^n and F : O → R^m is of class C^1, then F is differentiable at each point x ∈ O.

As is shown in many calculus texts, one can use the Mean Value Theorem instead of the Fundamental Theorem of Calculus, and obtain a slightly sharper result.

Let us give some examples of derivatives. First, take n = 2, m = 1, and set

(1.8)

F (x) = (sin x1 )(sin x2 ).

Then (1.9)

DF (x) = ((cos x1 )(sin x2 ), (sin x1 )(cos x2 )).

Next, take n = m = 2 and

(1.10)   F(x) = \begin{pmatrix} x_1 x_2 \\ x_1^2 - x_2^2 \end{pmatrix}.

Then

(1.11)   DF(x) = \begin{pmatrix} x_2 & x_1 \\ 2x_1 & -2x_2 \end{pmatrix}.
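The Jacobian matrix (1.11) can be checked against difference quotients, one column per variable, following (1.3)–(1.4). In this sketch the map F is the example (1.10); the step size and test point are assumptions of the example.

```python
# Finite-difference approximation of the Jacobian DF(x) for the
# map (1.10): F(x) = (x1*x2, x1**2 - x2**2).

def jacobian(F, x, h=1e-6):
    """Approximate DF(x) by forward differences; returns the
    matrix of rows [dF_j/dx_k] as nested lists."""
    Fx = F(x)
    J = []
    for j in range(len(Fx)):
        row = []
        for k in range(len(x)):
            xh = list(x)
            xh[k] += h
            row.append((F(xh)[j] - Fx[j]) / h)
        J.append(row)
    return J

F = lambda x: [x[0] * x[1], x[0]**2 - x[1]**2]
p = [2.0, 3.0]
print(jacobian(F, p))   # close to [[3, 2], [4, -6]], which is (1.11) at p
```
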

We can replace R^n and R^m by more general finite-dimensional real vector spaces, isomorphic to Euclidean space. For example, the space M(n, R) of real n × n matrices is isomorphic to R^{n^2}. Consider the function

(1.12)

S : M (n, R) −→ M (n, R),

S(X) = X 2 .

We have (1.13)

(X + Y )2 = X 2 + XY + Y X + Y 2 = X 2 + DS(X)Y + R(X, Y ),

with R(X, Y) = Y^2, and hence

(1.14)

DS(X)Y = XY + Y X.
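The identity (1.13)–(1.14) says that S(X + tY) − S(X) equals t(XY + YX) up to a remainder t²Y². The following sketch checks this for 2×2 matrices stored as nested lists; the particular X, Y, and t are assumptions of the example.

```python
# Check of (1.14), DS(X)Y = XY + YX for S(X) = X^2: the difference
# quotient (S(X + tY) - S(X)) / t approaches XY + YX as t -> 0.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

def madd(A, B, c=1.0):
    """Entrywise A + c*B."""
    return [[A[i][j] + c * B[i][j] for j in range(2)] for i in range(2)]

X = [[1.0, 2.0], [3.0, 4.0]]
Y = [[0.0, 1.0], [1.0, 0.0]]
t = 1e-6

XtY = madd(X, Y, t)                                   # X + tY
diff = madd(matmul(XtY, XtY), matmul(X, X), -1.0)     # S(X+tY) - S(X)
DSY = madd(matmul(X, Y), matmul(Y, X))                # XY + YX
err = max(abs(diff[i][j] / t - DSY[i][j]) for i in range(2) for j in range(2))
print(err)   # O(t), since the remainder is exactly t**2 * Y**2
```
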


For our next example, we take

(1.15)   O = \mathrm{Gl}(n, R) = \{X \in M(n, R) : \det X \ne 0\},

which, as shown below, is open in M(n, R). We consider

(1.16)   \Phi : \mathrm{Gl}(n, R) \longrightarrow M(n, R), \qquad \Phi(X) = X^{-1},

and compute DΦ(I). We use the following. If, for A ∈ M(n, R),

(1.17)   \|A\| = \sup\{\|Av\| : v \in R^n, \|v\| \le 1\},

then

(1.18)   A, B \in M(n, R) \Rightarrow \|AB\| \le \|A\| \cdot \|B\|, \quad \text{hence} \quad Y \in M(n, R) \Rightarrow \|Y^k\| \le \|Y\|^k.

Also

(1.19)   S_k = I - Y + Y^2 - \cdots + (-1)^k Y^k
        \Rightarrow Y S_k = S_k Y = Y - Y^2 + Y^3 - \cdots + (-1)^k Y^{k+1}
        \Rightarrow (I + Y) S_k = S_k (I + Y) = I + (-1)^k Y^{k+1},

hence

(1.20)   \|Y\| < 1 \Longrightarrow (I + Y)^{-1} = \sum_{k=0}^{\infty} (-1)^k Y^k = I - Y + Y^2 - \cdots,

so (1.21)

DΦ(I)Y = −Y.

Related calculations show that Gl(n, R) is open in M(n, R). In fact, given X ∈ Gl(n, R), Y ∈ M(n, R),

(1.22)

X + Y = X(I + X −1 Y ),

which by (1.20) is invertible as long as (1.23)

kX −1 Y k < 1.
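The Neumann series (1.20) underlying these calculations can be tried out numerically: summing the partial sums S_k of (1.19) for a small Y should produce (I + Y)⁻¹. In the sketch below, the 2×2 matrix Y and the truncation order are assumptions of the example.

```python
# Neumann-series computation of (I + Y)^{-1} = I - Y + Y^2 - ...,
# as in (1.19)-(1.20), for a matrix Y with small norm.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

I = [[1.0, 0.0], [0.0, 1.0]]
Y = [[0.1, 0.2], [0.05, -0.1]]            # small entries, so ||Y|| < 1

S = [row[:] for row in I]                  # partial sum S_k, starting at I
term = [row[:] for row in I]               # current power Y^k
sign = 1
for _ in range(40):
    term = matmul(term, Y)
    sign = -sign
    S = [[S[i][j] + sign * term[i][j] for j in range(2)] for i in range(2)]

IplusY = [[1.0 + Y[0][0], Y[0][1]], [Y[1][0], 1.0 + Y[1][1]]]
check = matmul(IplusY, S)
print(check)   # close to the 2x2 identity matrix
```
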

One can proceed from here to compute DΦ(X). See the exercises. We return to general considerations, and derive the chain rule for the derivative. Let F : O → Rm be differentiable at x ∈ O, as above, let U be a neighborhood of z = F (x) in Rm , and let G : U → Rk be differentiable at z. Consider H = G ◦ F. We have

(1.24)

H(x + y) = G(F(x + y))
         = G\bigl(F(x) + DF(x)\, y + R(x, y)\bigr)
         = G(z) + DG(z)\bigl(DF(x)\, y + R(x, y)\bigr) + R_1(x, y)
         = G(z) + DG(z)\, DF(x)\, y + R_2(x, y)


with

\frac{\|R_2(x, y)\|}{\|y\|} \to 0 \ \text{as} \ y \to 0.

Thus G ◦ F is differentiable at x, and (1.25)

D(G ◦ F )(x) = DG(F (x)) · DF (x).
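The chain rule (1.25) can be checked numerically: the directional difference quotient of G ∘ F in a direction y should match DG(F(x)) applied to DF(x)y. In this sketch the maps F and G, the point, and the direction are illustrative assumptions, with their Jacobians written out by hand.

```python
# Numerical check of the chain rule (1.25) for F: R^2 -> R^2 and
# G: R^2 -> R, comparing DG(F(x)) DF(x) y with a difference quotient.
import math

F = lambda x: (x[0] * x[1], x[0] + x[1]**2)
G = lambda z: math.sin(z[0]) + z[1]**2

x, y, h = (1.0, 2.0), (0.3, -0.4), 1e-6

DF = [[x[1], x[0]], [1.0, 2 * x[1]]]      # rows dF_j/dx_k, from the formulas
z = F(x)
DG = [math.cos(z[0]), 2 * z[1]]           # gradient of G at F(x)

DFy = [DF[0][0] * y[0] + DF[0][1] * y[1],
       DF[1][0] * y[0] + DF[1][1] * y[1]]
chain = DG[0] * DFy[0] + DG[1] * DFy[1]   # DG(F(x)) DF(x) y

xh = (x[0] + h * y[0], x[1] + h * y[1])
quotient = (G(F(xh)) - G(F(x))) / h       # directional difference quotient
print(chain, quotient)   # agree up to O(h)
```
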

Another useful remark is that, by the Fundamental Theorem of Calculus, applied to \varphi(t) = F(x + ty),

(1.26)   F(x + y) = F(x) + \int_0^1 DF(x + ty)\, y\, dt,

provided F is C^1. For a typical application, see (6.6).

For the study of higher order derivatives of a function, the following result is fundamental.

Proposition 1.2. Assume F : O → R^m is of class C^2, with O open in R^n. Then, for each x ∈ O, 1 ≤ j, k ≤ n,

(1.27)   \frac{\partial}{\partial x_j} \frac{\partial F}{\partial x_k}(x) = \frac{\partial}{\partial x_k} \frac{\partial F}{\partial x_j}(x).

Proof. It suffices to take m = 1. We label our function f : O → R. For 1 ≤ j ≤ n, we set

(1.28)   \Delta_{j,h} f(x) = \frac{1}{h}\bigl(f(x + h e_j) - f(x)\bigr),

where {e1 , . . . , en } is the standard basis of Rn . The mean value theorem (for functions of xj alone) implies that if ∂j f = ∂f /∂xj exists on O, then, for x ∈ O, h > 0 sufficiently small, (1.29)

∆j,h f (x) = ∂j f (x + αj hej ),

for some α_j ∈ (0, 1), depending on x and h. Iterating this, if ∂_j(∂_k f) exists on O, then, for x ∈ O, h > 0 sufficiently small,

(1.30)   \Delta_{k,h} \Delta_{j,h} f(x) = \partial_k(\Delta_{j,h} f)(x + \alpha_k h e_k) = \Delta_{j,h}(\partial_k f)(x + \alpha_k h e_k) = \partial_j \partial_k f(x + \alpha_k h e_k + \alpha_j h e_j),

with α_j, α_k ∈ (0, 1). Here we have used the elementary result

(1.31)   \partial_k \Delta_{j,h} f = \Delta_{j,h}(\partial_k f).

We deduce the following.


Proposition 1.3. If ∂k f and ∂j ∂k f exist on O and ∂j ∂k f is continuous at x0 ∈ O, then (1.32)

\partial_j \partial_k f(x_0) = \lim_{h \to 0} \Delta_{k,h} \Delta_{j,h} f(x_0).

Clearly (1.33)

∆k,h ∆j,h f = ∆j,h ∆k,h f,

so we have the following, which easily implies Proposition 1.2. Corollary 1.4. In the setting of Proposition 1.3, if also ∂j f and ∂k ∂j f exist on O and ∂k ∂j f is continuous at x0 , then (1.34)

∂j ∂k f (x0 ) = ∂k ∂j f (x0 ).

We now describe two convenient notations to express higher order derivatives of a C k function f : Ω → R, where Ω ⊂ Rn is open. In one, let J be a k-tuple of integers between 1 and n; J = (j1 , . . . , jk ). We set (1.35)

f^{(J)}(x) = \partial_{j_k} \cdots \partial_{j_1} f(x), \qquad \partial_j = \frac{\partial}{\partial x_j}.

We set |J| = k, the total order of differentiation. As we have seen in Proposition 1.2, \partial_i \partial_j f = \partial_j \partial_i f provided f ∈ C^2(Ω). It follows that, if f ∈ C^k(Ω), then \partial_{j_k} \cdots \partial_{j_1} f = \partial_{\ell_k} \cdots \partial_{\ell_1} f whenever (\ell_1, \dots, \ell_k) is a permutation of (j_1, \dots, j_k). Thus, another convenient notation to use is the following. Let α be an n-tuple of non-negative integers, α = (α_1, \dots, α_n). Then we set

(1.36)

f^{(\alpha)}(x) = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n} f(x),

|α| = α1 + · · · + αn .

Note that, if |J| = |α| = k and f ∈ C k (Ω), (1.37)

f^{(J)}(x) = f^{(\alpha)}(x), \ \text{with} \ \alpha_i = \#\{\ell : j_\ell = i\}.

Correspondingly, there are two expressions for monomials in x = (x1 , . . . , xn ): (1.38)

x^J = x_{j_1} \cdots x_{j_k}, \qquad x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n},

and xJ = xα provided J and α are related as in (1.37). Both these notations are called “multi-index” notations. We now derive Taylor’s formula with remainder for a smooth function F : Ω → R, making use of these multi-index notations. We will apply the one variable formula (0.52)– (0.53), i.e., (1.39)

\varphi(t) = \varphi(0) + \varphi'(0)\, t + \frac{1}{2}\varphi''(0)\, t^2 + \cdots + \frac{1}{k!}\varphi^{(k)}(0)\, t^k + r_k(t),

with

(1.40)  rk(t) = (1/k!) ∫_0^t (t − s)^k ϕ^(k+1)(s) ds,

given ϕ ∈ C^(k+1)(I), I = (−a, a). (See the exercises, and also Appendix D, for further discussion.) Let us assume 0 ∈ Ω, and that the line segment from 0 to x is contained in Ω. We set ϕ(t) = F(tx) and apply (1.39)–(1.40) with t = 1. Applying the chain rule, we have

(1.41)  ϕ′(t) = Σ_{j=1}^n ∂j F(tx) xj = Σ_{|J|=1} F^(J)(tx) x^J.

Differentiating again, we have

(1.42)  ϕ″(t) = Σ_{|J|=1,|K|=1} F^(J+K)(tx) x^(J+K) = Σ_{|J|=2} F^(J)(tx) x^J,

where, if |J| = k and |K| = ℓ, we take J + K = (j1, . . . , jk, k1, . . . , kℓ). Inductively, we have

(1.43)  ϕ^(k)(t) = Σ_{|J|=k} F^(J)(tx) x^J.

Hence, from (1.39) with t = 1,

F(x) = F(0) + Σ_{|J|=1} F^(J)(0) x^J + · · · + (1/k!) Σ_{|J|=k} F^(J)(0) x^J + Rk(x),

or, more briefly,

(1.44)  F(x) = Σ_{|J|≤k} (1/|J|!) F^(J)(0) x^J + Rk(x),

where

(1.45)  Rk(x) = (1/k!) Σ_{|J|=k+1} ( ∫_0^1 (1 − s)^k F^(J)(sx) ds ) x^J.

This gives Taylor's formula with remainder for F ∈ C^(k+1)(Ω), in the J-multi-index notation. We also want to write the formula in the α-multi-index notation. We have

(1.46)  Σ_{|J|=k} F^(J)(tx) x^J = Σ_{|α|=k} ν(α) F^(α)(tx) x^α,

where

(1.47)  ν(α) = #{J : α = α(J)},

and we define the relation α = α(J) to hold provided the condition (1.37) holds, or equivalently provided x^J = x^α. Thus ν(α) is uniquely defined by

(1.48)  Σ_{|α|=k} ν(α) x^α = Σ_{|J|=k} x^J = (x1 + · · · + xn)^k.

One sees that, if |α| = k, then ν(α) is equal to the number of combinations of k objects taken α1 at a time, times the number of combinations of k − α1 objects taken α2 at a time, . . . , times the number of combinations of k − (α1 + · · · + αn−1) objects taken αn at a time. Thus

(1.49)  ν(α) = C(k, α1) C(k − α1, α2) · · · C(k − α1 − · · · − αn−1, αn) = k!/(α1! α2! · · · αn!),

where C(m, j) = m!/(j!(m − j)!) denotes the binomial coefficient. In other words, for |α| = k,

(1.50)  ν(α) = k!/α!,  where α! = α1! · · · αn!.
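The counting identity (1.50) can be confirmed by brute force for small n and k; the following minimal sketch (an illustration, with n = 3 and k = 4 chosen arbitrarily) enumerates all k-tuples J and tallies ν(α):

```python
from math import factorial
from itertools import product

# Brute-force check of (1.48) and (1.50) for n = 3, k = 4.
n, k = 3, 4

counts = {}
for J in product(range(1, n + 1), repeat=k):
    # alpha_i = #{l : j_l = i}, as in (1.37)
    alpha = tuple(J.count(i) for i in range(1, n + 1))
    counts[alpha] = counts.get(alpha, 0) + 1

for alpha, nu in counts.items():
    alpha_fact = 1
    for a in alpha:
        alpha_fact *= factorial(a)
    assert nu == factorial(k) // alpha_fact          # (1.50)

# (1.48): the nu(alpha) sum to n^k, the number of terms in (x1+...+xn)^k
assert sum(counts.values()) == n ** k
print("nu(alpha) = k!/alpha! for every alpha with |alpha| = 4")
```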

Thus the Taylor formula (1.44) can be rewritten

(1.51)  F(x) = Σ_{|α|≤k} (1/α!) F^(α)(0) x^α + Rk(x),

where

(1.52)  Rk(x) = Σ_{|α|=k+1} ((k + 1)/α!) ( ∫_0^1 (1 − s)^k F^(α)(sx) ds ) x^α.

The formula (1.51)–(1.52) holds for F ∈ C^(k+1). It is significant that (1.51), with a variant of (1.52), holds for F ∈ C^k. In fact, for such F, we can apply (1.52) with k replaced by k − 1, to get

(1.53)  F(x) = Σ_{|α|≤k−1} (1/α!) F^(α)(0) x^α + R_{k−1}(x),

with

(1.54)  R_{k−1}(x) = Σ_{|α|=k} (k/α!) ( ∫_0^1 (1 − s)^(k−1) F^(α)(sx) ds ) x^α.

We can add and subtract F^(α)(0) to F^(α)(sx) in the integrand above, and obtain the following.

Proposition 1.5. If F ∈ C^k on a ball Br(0), the formula (1.51) holds for x ∈ Br(0), with

(1.55)  Rk(x) = Σ_{|α|=k} (k/α!) ( ∫_0^1 (1 − s)^(k−1) [F^(α)(sx) − F^(α)(0)] ds ) x^α.

Remark. Note that (1.55) yields the estimate

(1.56)  |Rk(x)| ≤ Σ_{|α|=k} (|x^α|/α!) sup_{0≤s≤1} |F^(α)(sx) − F^(α)(0)|.
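To make (1.51) and the estimate (1.56) concrete, one can evaluate the Taylor polynomial for a function whose derivatives are known in closed form. The minimal sketch below (the function and the evaluation point are arbitrary illustrative choices) uses F(x1, x2) = e^(x1 + 2 x2), for which F^(α)(0) = 2^α2:

```python
from math import exp, factorial

# Partial sums of (1.51) over |alpha| <= k approach F(x), with the error
# shrinking as k grows, consistent with (1.56).

def taylor_poly(x1, x2, k):
    total = 0.0
    for a1 in range(k + 1):
        for a2 in range(k + 1 - a1):
            deriv0 = 2 ** a2                       # F^(alpha)(0) = 1**a1 * 2**a2
            total += deriv0 * x1 ** a1 * x2 ** a2 / (factorial(a1) * factorial(a2))
    return total

x1, x2 = 0.1, 0.05
exact = exp(x1 + 2 * x2)
errors = [abs(taylor_poly(x1, x2, k) - exact) for k in (1, 2, 4)]
print(errors)   # rapidly decreasing with k
```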

The term corresponding to |J| = 2 in (1.44), or |α| = 2 in (1.51), is of particular interest. It is

(1.57)  (1/2) Σ_{|J|=2} F^(J)(0) x^J = (1/2) Σ_{j,k=1}^n (∂^2 F/∂xk ∂xj)(0) xj xk.

We define the Hessian of a C^2 function F : O → R as the n × n matrix

(1.58)  D^2 F(y) = ( ∂^2 F/∂xk ∂xj (y) ).

Then the power series expansion of second order about 0 for F takes the form

(1.59)  F(x) = F(0) + DF(0)x + (1/2) x · D^2 F(0)x + R2(x),

where, by (1.56),

(1.60)  |R2(x)| ≤ Cn |x|^2 sup_{0≤s≤1, |α|=2} |F^(α)(sx) − F^(α)(0)|.

In all these formulas we can translate coordinates and expand about y ∈ O. For example, (1.59) extends to

(1.61)  F(x) = F(y) + DF(y)(x − y) + (1/2)(x − y) · D^2 F(y)(x − y) + R2(x, y),

with

(1.62)  |R2(x, y)| ≤ Cn |x − y|^2 sup_{0≤s≤1, |α|=2} |F^(α)(y + s(x − y)) − F^(α)(y)|.

Example. If we take F(x) as in (1.8), so DF(x) is as in (1.9), then

D^2 F(x) = ( −sin x1 sin x2    cos x1 cos x2 )
           (  cos x1 cos x2   −sin x1 sin x2 ).
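The displayed Hessian can be checked with second difference quotients; the minimal sketch below assumes, consistent with the displayed matrix, that the F of (1.8) is F(x) = sin x1 sin x2 (that assumption is ours, since (1.8) is not reproduced here):

```python
import math

# Compare central second differences with the closed-form entries of D^2 F,
# assuming F(x1, x2) = sin(x1) sin(x2).

def F(x1, x2):
    return math.sin(x1) * math.sin(x2)

def hessian_fd(f, x1, x2, h=1e-4):
    d11 = (f(x1 + h, x2) - 2 * f(x1, x2) + f(x1 - h, x2)) / h**2
    d22 = (f(x1, x2 + h) - 2 * f(x1, x2) + f(x1, x2 - h)) / h**2
    d12 = (f(x1 + h, x2 + h) - f(x1 + h, x2 - h)
           - f(x1 - h, x2 + h) + f(x1 - h, x2 - h)) / (4 * h**2)
    return [[d11, d12], [d12, d22]]

x1, x2 = 0.4, 1.1
H = hessian_fd(F, x1, x2)
exact = [[-math.sin(x1) * math.sin(x2), math.cos(x1) * math.cos(x2)],
         [math.cos(x1) * math.cos(x2), -math.sin(x1) * math.sin(x2)]]
for i in range(2):
    for j in range(2):
        assert abs(H[i][j] - exact[i][j]) < 1e-5
print("finite-difference Hessian matches D^2 F")
```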

The results (1.61)–(1.62) are useful for extremal problems, i.e., determining where a sufficiently smooth function F : O → R has local maxima and local minima. Clearly if F ∈ C^1(O) and F has a local maximum or minimum at x0 ∈ O, then DF(x0) = 0. In such a case, we say x0 is a critical point of F. To check what kind of critical point x0 is, we look at the n × n matrix A = D^2 F(x0), assuming F ∈ C^2(O). By Proposition 1.2, A is a symmetric matrix. A basic result in linear algebra is that if A is a real, symmetric n × n matrix, then R^n has an orthonormal basis of eigenvectors, {v1, . . . , vn}, satisfying A vj = λj vj; the real numbers λj are the eigenvalues of A. We say A is positive definite if all λj > 0, and we say A is negative definite if all λj < 0. We say A is strongly indefinite if some λj > 0 and another λk < 0. Equivalently, given a real, symmetric matrix A,

(1.63)  A positive definite ⟺ v · Av ≥ C|v|^2,
        A negative definite ⟺ v · Av ≤ −C|v|^2,

for some C > 0 and all v ∈ R^n, and

(1.64)  A strongly indefinite ⟺ there exist nonzero v, w ∈ R^n such that v · Av ≥ C|v|^2 and w · Aw ≤ −C|w|^2, for some C > 0.

In light of (1.44)–(1.45), we have the following result.

Proposition 1.6. Assume F ∈ C^2(O) is real valued, O open in R^n. Let x0 ∈ O be a critical point for F. Then
(i) D^2 F(x0) positive definite ⟹ F has a local minimum at x0,
(ii) D^2 F(x0) negative definite ⟹ F has a local maximum at x0,
(iii) D^2 F(x0) strongly indefinite ⟹ F has neither a local maximum nor a local minimum at x0.

In case (iii), we say x0 is a saddle point for F. The following is a test for positive definiteness.

Proposition 1.7. Let A = (aij) be a real, symmetric, n × n matrix. For 1 ≤ ℓ ≤ n, form the ℓ × ℓ matrix Aℓ = (aij)_{1≤i,j≤ℓ}. Then

(1.65)  A positive definite ⟺ det Aℓ > 0 for all ℓ ∈ {1, . . . , n}.

Regarding the implication ⇒, note that if A is positive definite, then so is each Aℓ (restrict the quadratic form v · Av to vectors of the form v = (v1, . . . , vℓ, 0, . . . , 0)); since the determinant of a positive definite matrix is the product of its eigenvalues, all > 0, each det Aℓ > 0, and we have ⇒.

The implication ⇐ is easy enough for 2 × 2 matrices. If A is symmetric and det A > 0, then either both its eigenvalues are positive (so A is positive definite) or both are negative (so A is negative definite). In the latter case, A1 = (a11) must be negative, contradicting det A1 > 0, so we have ⇐ in this case.

We prove ⇐ for n ≥ 3, using induction. The inductive hypothesis implies that if det Aℓ > 0 for each ℓ ≤ n, then A_{n−1} is positive definite. The next lemma then guarantees that A = An has at least n − 1 positive eigenvalues. The hypothesis that det A > 0 does not allow that the remaining eigenvalue be ≤ 0, so all the eigenvalues of A must be positive. Thus Proposition 1.7 is proven, once we have the following.

Lemma 1.8. In the setting of Proposition 1.7, if A_{n−1} is positive definite, then A = An has at least n − 1 positive eigenvalues.

Proof. Since A is symmetric, R^n has an orthonormal basis v1, . . . , vn of eigenvectors of A, with A vj = λj vj. If the conclusion of the lemma is false, at least two of the eigenvalues, say λ1 and λ2, are ≤ 0. Let W = Span(v1, v2), so w ∈ W ⟹ w · Aw ≤ 0. Since W has dimension 2, R^(n−1) ⊂ R^n satisfies R^(n−1) ∩ W ≠ 0, so there exists a nonzero w ∈ R^(n−1) ∩ W, and then w · A_{n−1} w = w · Aw ≤ 0, contradicting the hypothesis that A_{n−1} is positive definite.

Remark. Given (1.65), we see by taking A ↦ −A that if A is a real, symmetric n × n matrix,

(1.66)  A negative definite ⟺ (−1)^ℓ det Aℓ > 0 for all ℓ ∈ {1, . . . , n}.
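Propositions 1.6, 1.7, and (1.66) combine into a practical recipe: compute the leading principal minors of D^2 F(x0) and read off the signs. A minimal sketch (determinant by cofactor expansion, adequate for the tiny matrices used here; the sample matrices are arbitrary):

```python
# Classify a real symmetric matrix via its leading principal minors,
# per (1.65) and (1.66). No external libraries.

def det(M):
    # cofactor expansion along the first row
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def classify(A):
    n = len(A)
    minors = [det([row[:l] for row in A[:l]]) for l in range(1, n + 1)]
    if all(m > 0 for m in minors):
        return "positive definite"                       # (1.65)
    if all((-1) ** l * m > 0 for l, m in enumerate(minors, 1)):
        return "negative definite"                       # (1.66)
    return "indefinite or degenerate"

print(classify([[2.0, 1.0], [1.0, 2.0]]))    # positive definite
print(classify([[-2.0, 1.0], [1.0, -2.0]]))  # negative definite
print(classify([[0.0, 1.0], [1.0, 0.0]]))    # indefinite or degenerate
```

Note that when some minor vanishes, as in the last example, the minor test is inconclusive and one must examine the eigenvalues directly.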

We return to higher order power series formulas with remainder, and complement Proposition 1.5. Let us go back to (1.39)–(1.40) and note that the integral in (1.40) is t^(k+1)/(k + 1) times a weighted average of ϕ^(k+1)(s) over s ∈ [0, t]. Hence we can write

rk(t) = (1/(k + 1)!) ϕ^(k+1)(θt) t^(k+1), for some θ ∈ [0, 1],

if ϕ is of class C^(k+1). This is the Lagrange form of the remainder; see Appendix D for more on this, and for a comparison with the Cauchy form of the remainder. If ϕ is of class C^k, we can replace k + 1 by k in (1.39) and write

(1.67)  ϕ(t) = ϕ(0) + ϕ′(0)t + · · · + (1/(k − 1)!) ϕ^(k−1)(0) t^(k−1) + (1/k!) ϕ^(k)(θt) t^k,

for some θ ∈ [0, 1]. Plugging (1.67) into (1.43) for ϕ(t) = F(tx) gives

(1.68)  F(x) = Σ_{|J|≤k−1} (1/|J|!) F^(J)(0) x^J + (1/k!) Σ_{|J|=k} F^(J)(θx) x^J,

for some θ ∈ [0, 1] (depending on x and on k, but not on J), when F is of class C^k on a neighborhood Br(0) of 0 ∈ R^n. Similarly, using the α-multi-index notation, we have, as an alternative to (1.53)–(1.54),

(1.69)  F(x) = Σ_{|α|≤k−1} (1/α!) F^(α)(0) x^α + Σ_{|α|=k} (1/α!) F^(α)(θx) x^α,

for some θ ∈ [0, 1] (depending on x and on |α|, but not on α), if F ∈ C^k(Br(0)). Note also that

(1.70)  (1/2) Σ_{|J|=2} F^(J)(θx) x^J = (1/2) Σ_{j,k=1}^n (∂^2 F/∂xk ∂xj)(θx) xj xk = (1/2) x · D^2 F(θx) x,

with D^2 F(y) as in (1.58), so if F ∈ C^2(Br(0)), we have, as an alternative to (1.59),

(1.71)  F(x) = F(0) + DF(0)x + (1/2) x · D^2 F(θx) x,

for some θ ∈ [0, 1].

We next complement the multi-index notations for higher derivatives of a function F with a multi-linear notation, defined as follows. If k ∈ N, F ∈ C^k(U), and y ∈ U ⊂ R^n, set

(1.72)  D^k F(y)(u1, . . . , uk) = ∂t1 · · · ∂tk F(y + t1 u1 + · · · + tk uk) |_{t1=···=tk=0},

for u1, . . . , uk ∈ R^n. For k = 1, this formula is equivalent to the definition of DF given at the beginning of this section. For k = 2, we have

(1.73)  D^2 F(y)(u, v) = u · D^2 F(y) v,

with D^2 F(y) as in (1.58). Generally, (1.72) defines D^k F(y) as a symmetric, k-linear form in u1, . . . , uk ∈ R^n.

We can relate (1.72) to the J-multi-index notation as follows. We start with

(1.74)  ∂t1 F(y + t1 u1 + · · · + tk uk) = Σ_{|J|=1} F^(J)(y + Σ tj uj) u1^J,

and inductively obtain

(1.75)  ∂t1 · · · ∂tk F(y + Σ tj uj) = Σ_{|J1|=···=|Jk|=1} F^(J1+···+Jk)(y + Σ tj uj) u1^J1 · · · uk^Jk,

hence

(1.76)  D^k F(y)(u1, . . . , uk) = Σ_{|J1|=···=|Jk|=1} F^(J1+···+Jk)(y) u1^J1 · · · uk^Jk.

In particular, if u1 = · · · = uk = u,

(1.77)  D^k F(y)(u, . . . , u) = Σ_{|J|=k} F^(J)(y) u^J.

Hence (1.68) yields

(1.78)  F(x) = F(0) + DF(0)x + · · · + (1/(k − 1)!) D^(k−1) F(0)(x, . . . , x) + (1/k!) D^k F(θx)(x, . . . , x),

for some θ ∈ [0, 1], if F ∈ C^k(Br(0)). In fact, rather than appealing to (1.68), we can note that

ϕ(t) = F(tx) ⟹ ϕ^(k)(t) = ∂t1 · · · ∂tk ϕ(t + t1 + · · · + tk) |_{t1=···=tk=0} = D^k F(tx)(x, . . . , x),

and obtain (1.78) directly from (1.67). We can also use the notation D^j F(y) x^⊗j = D^j F(y)(x, . . . , x), with j copies of x within the last set of parentheses, and rewrite (1.78) as

(1.79)  F(x) = F(0) + DF(0)x + · · · + (1/(k − 1)!) D^(k−1) F(0) x^⊗(k−1) + (1/k!) D^k F(θx) x^⊗k.

Note how (1.78) and (1.79) generalize (1.71).
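For k = 2, the t-derivative definition (1.72) and the Hessian pairing (1.73) can be compared numerically: the left side is a second difference quotient. A minimal sketch (F, y, u, v are arbitrary illustrative choices):

```python
import math

# Check that d/dt1 d/dt2 F(y + t1 u + t2 v) at t1 = t2 = 0 agrees with
# u . D^2 F(y) v, for F(x1, x2) = exp(x1) sin(x2).

def F(x1, x2):
    return math.exp(x1) * math.sin(x2)

def D2F_bilinear(y, u, v, h=1e-4):
    def g(t1, t2):
        return F(y[0] + t1 * u[0] + t2 * v[0], y[1] + t1 * u[1] + t2 * v[1])
    # central-difference version of the mixed t-derivative in (1.72)
    return (g(h, h) - g(h, -h) - g(-h, h) + g(-h, -h)) / (4 * h**2)

y, u, v = (0.2, 0.5), (1.0, -1.0), (0.5, 2.0)
# Hessian of F at y, in closed form
ex, s, c = math.exp(y[0]), math.sin(y[1]), math.cos(y[1])
H = [[ex * s, ex * c], [ex * c, -ex * s]]
pairing = sum(u[i] * H[i][j] * v[j] for i in range(2) for j in range(2))
print(abs(D2F_bilinear(y, u, v) - pairing))   # small
```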

Exercises

1. Consider the following function f : R^2 → R:

f(x, y) = (cos x)(cos y).

Find all its critical points, and determine which of these are local maxima, local minima, and saddle points.

2. Let M(n, R) denote the space of real n × n matrices. Assume F, G : M(n, R) → M(n, R) are of class C^1. Show that H(X) = F(X)G(X) defines a C^1 map H : M(n, R) → M(n, R), and

DH(X)Y = DF(X)Y G(X) + F(X) DG(X)Y.

3. Let Gl(n, R) ⊂ M(n, R) denote the set of invertible matrices. Show that

Φ : Gl(n, R) → M(n, R),  Φ(X) = X^(−1),

is of class C^1 and that

DΦ(X)Y = −X^(−1) Y X^(−1).
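The formula in Exercise 3 is easy to probe numerically before proving it; a minimal sketch in the 2 × 2 case, where the inverse is explicit (the matrices X, Y are arbitrary choices):

```python
# Check that (Phi(X + tY) - Phi(X))/t approaches -X^{-1} Y X^{-1} as t -> 0,
# for Phi(X) = X^{-1}, using 2x2 matrices.

def inv2(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

X = [[3.0, 1.0], [1.0, 2.0]]
Y = [[0.0, 1.0], [-1.0, 0.5]]
t = 1e-6

Xi = inv2(X)
XtYi = inv2([[X[i][j] + t * Y[i][j] for j in range(2)] for i in range(2)])
quotient = [[(XtYi[i][j] - Xi[i][j]) / t for j in range(2)] for i in range(2)]
formula = [[-v for v in row] for row in matmul(Xi, matmul(Y, Xi))]

err = max(abs(quotient[i][j] - formula[i][j]) for i in range(2) for j in range(2))
print(err)   # O(t)
```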

4. Identify R^2 and C via z = x + iy. Then multiplication by i on C corresponds to applying

J = ( 0  −1 )
    ( 1   0 ).

Let O ⊂ R^2 be open, f : O → R^2 be C^1. Say f = (u, v). Regard Df(x, y) as a 2 × 2 real matrix. One says f is holomorphic, or complex-analytic, provided the Cauchy-Riemann equations hold:

∂u/∂x = ∂v/∂y,  ∂u/∂y = −∂v/∂x.

Show that this is equivalent to the condition

Df(x, y) J = J Df(x, y).

Generalize to O open in C^m, f : O → C^n.

5. Let f be C^1 on a region in R^2 containing [a, b] × {y}. Show that, as h → 0,

(1/h)[f(x, y + h) − f(x, y)] → ∂f/∂y (x, y), uniformly on [a, b] × {y}.

Hint. Show that the left side is equal to

(1/h) ∫_0^h ∂f/∂y (x, y + s) ds,

and use the uniform continuity of ∂f/∂y on [a, b] × [y − δ, y + δ]; cf. Proposition A.15.

6. In the setting of Exercise 5, show that

(d/dy) ∫_a^b f(x, y) dx = ∫_a^b ∂f/∂y (x, y) dx.

7. Considering the power series

f(x) = f(y) + f′(y)(x − y) + · · · + (f^(j)(y)/j!)(x − y)^j + Rj(x, y),

show that

∂Rj/∂y = −(1/j!) f^(j+1)(y)(x − y)^j,  Rj(x, x) = 0.

Use this to re-derive (0.53), and hence (1.22)–(1.23).

We define "big oh" and "little oh" notation:

f(x) = O(x) (as x → 0) ⟺ |f(x)/x| ≤ C as x → 0,
f(x) = o(x) (as x → 0) ⟺ f(x)/x → 0 as x → 0.

8. Let O ⊂ R^n be open and y ∈ O. Show that

f ∈ C^(k+1)(O) ⟹ f(x) = Σ_{|α|≤k} (1/α!) f^(α)(y)(x − y)^α + O(|x − y|^(k+1)),

f ∈ C^k(O) ⟹ f(x) = Σ_{|α|≤k} (1/α!) f^(α)(y)(x − y)^α + o(|x − y|^k).

Exercises 9–11 deal with properties of the determinant, as a differentiable function on spaces of matrices. Foundational work on determinants is developed in an accompanying set of auxiliary exercises on determinants.

9. Let Mn×n be the space of complex n × n matrices, det : Mn×n → C the determinant. Show that, if I is the identity matrix,

D det(I)B = Tr B,

i.e.,

(d/dt) det(I + tB)|_{t=0} = Tr B.

10. If A(t) = (ajk(t)) is a curve in Mn×n, use the expansion of (d/dt) det A(t) as a sum of n determinants, in which the rows of A(t) are successively differentiated, to show that, for A ∈ Mn×n,

D det(A)B = Tr(Cof(A)^t · B),

where Cof(A) is the cofactor matrix of A.

11. Suppose A ∈ Mn×n is invertible. Using

det(A + tB) = (det A) det(I + tA^(−1)B),

show that

D det(A)B = (det A) Tr(A^(−1)B).

Comparing the result of Exercise 10, deduce Cramer's formula:

(det A) A^(−1) = Cof(A)^t.

12. Assume G : U → O, F : O → Ω. Show that

F, G ∈ C^1 ⟹ F ◦ G ∈ C^1.

(Hint. Use (1.7).) Show that, for any k ∈ N,

F, G ∈ C^k ⟹ F ◦ G ∈ C^k.


13. Show that the map Φ : Gl(n, R) → Gl(n, R) given by Φ(X) = X^(−1) is C^k for each k, i.e., Φ ∈ C^∞.
Hint. Start with the material of Exercise 3.
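The identity in Exercise 11 can be probed numerically before it is proved; the minimal sketch below (2 × 2 case, so det and the inverse are explicit; the matrices are arbitrary) compares a difference quotient of det with the claimed formula:

```python
import random

# Numerical sanity check of Exercise 11: D det(A)B = (det A) Tr(A^{-1} B),
# probed with the difference quotient (det(A + tB) - det(A))/t for small t.

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inv2(A):
    d = det2(A)
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def tr_prod(M, N):
    # Tr(M N) for 2x2 matrices
    return sum(M[i][k] * N[k][i] for i in range(2) for k in range(2))

random.seed(0)
A = [[2.0, 1.0], [0.5, 3.0]]
B = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]

t = 1e-6
ApB = [[A[i][j] + t * B[i][j] for j in range(2)] for i in range(2)]
quotient = (det2(ApB) - det2(A)) / t
formula = det2(A) * tr_prod(inv2(A), B)
print(abs(quotient - formula))   # O(t)
```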

Auxiliary exercises on determinants

If Mn×n denotes the space of n × n complex matrices, we want to show that there is a map

(1.79)  det : Mn×n → C

which is uniquely specified as a function ϑ : Mn×n → C satisfying:
(a) ϑ is linear in each column aj of A,
(b) ϑ(Ã) = −ϑ(A) if Ã is obtained from A by interchanging two columns,
(c) ϑ(I) = 1.

1. Let A = (a1, . . . , an), where the aj are column vectors, aj = (a1j, . . . , anj)^t. Show that, if (a) holds, we have the expansion

(1.80)  det A = Σ_j aj1 det(ej, a2, . . . , an) = · · · = Σ_{j1,...,jn} aj1 1 · · · ajn n det(ej1, ej2, . . . , ejn),

where {e1, . . . , en} is the standard basis of C^n.

2. Show that, if (b) and (c) also hold, then

(1.81)  det A = Σ_{σ∈Sn} (sgn σ) aσ(1)1 aσ(2)2 · · · aσ(n)n,

where Sn is the set of permutations of {1, . . . , n}, and

(1.82)  sgn σ = det(eσ(1), . . . , eσ(n)) = ±1.

To define sgn σ, the "sign" of a permutation σ, we note that every permutation σ can be written as a product of transpositions, σ = τ1 · · · τν, where a transposition of {1, . . . , n} interchanges two elements and leaves the rest fixed. We say sgn σ = 1 if ν is even and sgn σ = −1 if ν is odd. It is necessary to show that sgn σ is independent of the choice of such a product representation. (Referring to (1.82) begs the question until we know that det is well defined.)
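The expansion (1.81) can be implemented directly; the minimal sketch below computes sgn σ by counting inversions, which (as one shows in these exercises) has the same parity as any factorization into transpositions:

```python
from itertools import permutations

# Determinant from the permutation expansion (1.81).

def sgn(sigma):
    inv = sum(1 for i in range(len(sigma)) for j in range(i + 1, len(sigma))
              if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

def det(A):
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        # sgn(sigma) * a_{sigma(1),1} * ... * a_{sigma(n),n}
        term = sgn(sigma)
        for col in range(n):
            term *= A[sigma[col]][col]
        total += term
    return total

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
print(det(A))   # 18, matching cofactor expansion along the first row
```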


3. Let σ ∈ Sn act on a function of n variables by

(1.83)  (σf)(x1, . . . , xn) = f(xσ(1), . . . , xσ(n)).

Let P be the polynomial

(1.84)  P(x1, . . . , xn) = Π_{1≤j<k≤n} (xj − xk).

2. Let T ∈ SO(3), i.e., T is an orthogonal 3 × 3 matrix with det T > 0 (hence det T = 1). Show that

(1.93)  T ∈ SO(3) ⟹ Tu × Tv = T(u × v).

Hint. Multiply the 3 × 3 matrix in Exercise 1 on the left by T.

3. Show that, if θ is the angle between u and v in R^3, then

(1.94)  |u × v| = |u| |v| |sin θ|.

Hint. Check this for u = i, v = ai + bj, and use Exercise 2 to show this suffices.

4. Show that κ : R^3 → Skew(3), the set of antisymmetric real 3 × 3 matrices, given by

(1.95)  κ(y1, y2, y3) = (  0   −y3   y2 )
                        (  y3   0   −y1 )
                        ( −y2   y1   0  ),

satisfies

(1.96)  Kx = y × x,  K = κ(y).

Show that, with [A, B] = AB − BA,

(1.97)  κ(x × y) = [κ(x), κ(y)],  Tr(κ(x) κ(y)^t) = 2 x · y.
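The identities (1.96)–(1.97) are finite checks; a minimal sketch with concrete (arbitrarily chosen) vectors:

```python
# Check (1.95)-(1.97) at concrete vectors x, y.

def kappa(y):
    y1, y2, y3 = y
    return [[0.0, -y3, y2],
            [y3, 0.0, -y1],
            [-y2, y1, 0.0]]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def matvec(M, x):
    return tuple(sum(M[i][j] * x[j] for j in range(3)) for i in range(3))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

x, y = (1.0, 2.0, -1.0), (0.5, -1.0, 3.0)

# (1.96): kappa(y) applied to x gives the cross product of y with x
assert all(abs(a - b) < 1e-12 for a, b in zip(matvec(kappa(y), x), cross(y, x)))

# (1.97): kappa(x cross y) equals the commutator [kappa(x), kappa(y)]
AB = matmul(kappa(x), kappa(y))
BA = matmul(kappa(y), kappa(x))
K = kappa(cross(x, y))
assert all(abs(K[i][j] - (AB[i][j] - BA[i][j])) < 1e-12
           for i in range(3) for j in range(3))

# (1.97): Tr(kappa(x) kappa(y)^t) = 2 x . y
tr = sum(kappa(x)[i][j] * kappa(y)[i][j] for i in range(3) for j in range(3))
dot = sum(a * b for a, b in zip(x, y))
assert abs(tr - 2 * dot) < 1e-12
print("(1.96) and (1.97) verified")
```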

2. Inverse function and implicit function theorem

The Inverse Function Theorem gives a condition under which a function can be locally inverted. This theorem and its corollary, the Implicit Function Theorem, are fundamental results in multivariable calculus. First we state the Inverse Function Theorem. Here, we assume k ≥ 1.

Theorem 2.1. Let F be a C^k map from an open neighborhood Ω of p0 ∈ R^n to R^n, with q0 = F(p0). Suppose the derivative DF(p0) is invertible. Then there is a neighborhood U of p0 and a neighborhood V of q0 such that F : U → V is one-to-one and onto, and F^(−1) : V → U is a C^k map. (One says F : U → V is a diffeomorphism.)

First we show that F is one-to-one on a neighborhood of p0, under these hypotheses. In fact, we establish the following result, of interest in its own right.

Proposition 2.2. Assume Ω ⊂ R^n is open and convex, and let f : Ω → R^n be C^1. Assume that the symmetric part of Df(u) is positive-definite, for each u ∈ Ω. Then f is one-to-one on Ω.

Proof. Take distinct points u1, u2 ∈ Ω, and set u2 − u1 = w. Consider ϕ : [0, 1] → R, given by ϕ(t) = w · f(u1 + tw). Then ϕ′(t) = w · Df(u1 + tw)w > 0 for t ∈ [0, 1], so ϕ(0) ≠ ϕ(1). But ϕ(0) = w · f(u1) and ϕ(1) = w · f(u2), so f(u1) ≠ f(u2).

To continue the proof of Theorem 2.1, let us set

(2.1)  f(u) = A(F(p0 + u) − q0),  A = DF(p0)^(−1).

Then f(0) = 0 and Df(0) = I, the identity matrix. We show that f maps a neighborhood of 0 one-to-one and onto some neighborhood of 0. Proposition 2.2 applies, so we know f is one-to-one on some neighborhood O of 0. We next show that the image of O under f contains a neighborhood of 0. We can write

(2.2)  f(u) = u + R(u),  R(0) = 0, DR(0) = 0.

For v small, we want to solve

(2.3)  f(u) = v.

This is equivalent to u + R(u) = v, so let

(2.4)  Tv(u) = v − R(u).

Thus solving (2.3) is equivalent to solving

(2.5)  Tv(u) = u.

We look for a fixed point u = K(v) = f^(−1)(v). Also, we want to prove that DK(0) = I, i.e., that K(v) = v + r(v) with r(v) = o(‖v‖). (The "little oh" notation is defined in Exercise 8 of §1.) If we succeed in doing this, it follows easily that, for general x close to q0, G(x) = F^(−1)(x) is defined, and DG(q0) = DF(p0)^(−1). A parallel argument, with p0 replaced by nearby u and x = F(u), gives

(2.6)  DG(x) = (DF(G(x)))^(−1).

Then a simple inductive argument shows that G is C^k if F is C^k. See Exercise 6 at the end of this section for an approach to this last argument.

A tool we will use to solve (2.5) is the following general result, known as the Contraction Mapping Principle.

Theorem 2.3. Let X be a complete metric space, and let T : X → X satisfy

(2.7)  dist(Tx, Ty) ≤ r dist(x, y),

for some r < 1. (We say T is a contraction.) Then T has a unique fixed point x. For any y0 ∈ X, T^k y0 → x as k → ∞.

Proof. Pick y0 ∈ X and let yk = T^k y0. Then dist(yk, yk+1) ≤ r^k dist(y0, y1), so

(2.8)  dist(yk, yk+m) ≤ dist(yk, yk+1) + · · · + dist(yk+m−1, yk+m)
              ≤ (r^k + · · · + r^(k+m−1)) dist(y0, y1)
              ≤ r^k (1 − r)^(−1) dist(y0, y1).

It follows that (yk) is a Cauchy sequence, so it converges; yk → x. Since T yk = yk+1 and T is continuous, it follows that Tx = x, i.e., x is a fixed point. Uniqueness of the fixed point is clear from the estimate dist(Tx, Tx′) ≤ r dist(x, x′), which implies dist(x, x′) = 0 if x and x′ are fixed points. This proves Theorem 2.3.

Returning to the problem of solving (2.5), we pick b > 0 such that

(2.9)  B2b(0) ⊂ O and ‖w‖ ≤ 2b ⟹ ‖DR(w)‖ ≤ 1/2.

We claim that

(2.10)  ‖v‖ ≤ b ⟹ Tv : Xv → Xv,

where

(2.11)  Xv = {u ∈ O : ‖u − v‖ ≤ Av},  Av = sup_{‖w‖≤2‖v‖} ‖R(w)‖.

See Fig. 2.1. To prove that (2.10) holds, note that

(2.12)  Tv(u) − v = −R(u),

so we need to show that

(2.13)  ‖v‖ ≤ b, u ∈ Xv ⟹ ‖R(u)‖ ≤ Av.

Indeed,

(2.14)  u ∈ Xv ⟹ ‖u‖ ≤ ‖v‖ + Av,

and, by (2.9) and (2.11),

(2.15)  ‖v‖ ≤ b ⟹ Av ≤ ‖v‖,

and we have

(2.16)  u ∈ Xv ⟹ ‖u‖ ≤ 2‖v‖ ⟹ ‖R(u)‖ ≤ sup_{‖w‖≤2‖v‖} ‖R(w)‖ = Av

(hence, parenthetically, Xv ⊂ B2b(0)). This establishes (2.10). Now, given uj ∈ Xv, ‖v‖ ≤ b,

(2.17)  ‖Tv(u1) − Tv(u2)‖ = ‖R(u2) − R(u1)‖ ≤ (1/2)‖u1 − u2‖,

the last inequality by (2.9), so the map (2.10) is a contraction, if ‖v‖ ≤ b. Hence there exists a unique fixed point u = K(v) ∈ Xv. Also, since u ∈ Xv,

(2.18)  ‖K(v) − v‖ ≤ Av = o(‖v‖),

so DK(0) = I, and the Inverse Function Theorem is proved.

Thus, if DF is invertible on the domain of F, then F is a local diffeomorphism. Stronger hypotheses are needed to guarantee that F is a global diffeomorphism onto its range. Proposition 2.2 provides one tool for doing this. Here is a slight strengthening.

Corollary 2.4. Assume Ω ⊂ R^n is open and convex, and that F : Ω → R^n is C^1. Assume there exist n × n matrices A and B such that the symmetric part of A DF(u) B is positive definite for each u ∈ Ω. Then F maps Ω diffeomorphically onto its image, an open set in R^n.

Proof. Exercise.
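The proof pattern of Theorem 2.3 — iterate T and use the geometric bound (2.8) — can be watched numerically. A minimal sketch with a contraction of R (the particular map is an arbitrary choice; by the mean value theorem it has contraction constant r = 1/2):

```python
from math import sin

# T(x) = (1/2) sin x + 1 satisfies |T(x) - T(y)| <= (1/2)|x - y|, so
# iteration from any starting point converges to the unique fixed point.

def T(x):
    return 0.5 * sin(x) + 1.0

x = 0.0
for _ in range(60):            # error shrinks like (1/2)^k, as in (2.8)
    x = T(x)
print(x, abs(T(x) - x))        # fixed point; residual at machine precision
```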


We make a comment about solving the equation F(x) = y, under the hypotheses of Theorem 2.1, when y is close to q0. The fact that finding the fixed point for Tv in (2.9) is accomplished by taking the limit of Tv^k(v) implies that, when y is sufficiently close to q0, the sequence (xk), defined by

(2.19)  x0 = p0,  xk+1 = xk + DF(p0)^(−1)(y − F(xk)),

converges to the solution x. An analysis of the rate at which xk → x, and F(xk) → y, can be made by applying F to (2.19), yielding

F(xk+1) = F(xk + DF(p0)^(−1)(y − F(xk)))
        = F(xk) + DF(xk) DF(p0)^(−1)(y − F(xk)) + R(xk, DF(p0)^(−1)(y − F(xk))),

and hence

(2.20)  y − F(xk+1) = (I − DF(xk) DF(p0)^(−1))(y − F(xk)) + R̃(xk, y − F(xk)),

with ‖R̃(xk, y − F(xk))‖ = o(‖y − F(xk)‖). It turns out that replacing p0 by xk in (2.19) yields a faster approximation. This method, known as Newton's method, is described in the exercises.

We consider some examples of maps to which Theorem 2.1 applies. First, we look at

(2.21)  F : (0, ∞) × R → R^2,  F(r, θ) = ( x(r, θ) ) = ( r cos θ )
                                         ( y(r, θ) )   ( r sin θ ).

Then

(2.22)  DF(r, θ) = ( ∂r x  ∂θ x ) = ( cos θ  −r sin θ )
                   ( ∂r y  ∂θ y )   ( sin θ   r cos θ ),

so

(2.23)  det DF(r, θ) = r cos^2 θ + r sin^2 θ = r.

Hence DF(r, θ) is invertible for all (r, θ) ∈ (0, ∞) × R. Theorem 2.1 implies that each (r0, θ0) ∈ (0, ∞) × R has a neighborhood U, and (x0, y0) = (r0 cos θ0, r0 sin θ0) has a neighborhood V, such that F is a smooth diffeomorphism of U onto V. In this simple situation, it can be verified directly that

(2.24)  F : (0, ∞) × (−π, π) → R^2 \ {(x, 0) : x ≤ 0}

is a smooth diffeomorphism.

Note that DF(1, 0) = I in (2.22). Let us check the domain of applicability of Proposition 2.2. The symmetric part of DF(r, θ) in (2.22) is

(2.25)  S(r, θ) = (        cos θ         (1/2)(1 − r) sin θ )
                  ( (1/2)(1 − r) sin θ         r cos θ      ).


By Proposition 1.7, this is positive definite if and only if

(2.26)  cos θ > 0,

and

(2.27)  det S(r, θ) = r cos^2 θ − (1/4)(1 − r)^2 sin^2 θ > 0.

Now (2.26) holds for θ ∈ (−π/2, π/2), but not on all of (−π, π). Furthermore, (2.27) holds for (r, θ) in a neighborhood of (r0, θ0) = (1, 0), but it does not hold on all of (0, ∞) × (−π/2, π/2). We see that Proposition 2.2 does not capture the full force of (2.24).

We move on to another example. As in §1, we can extend Theorem 2.1, replacing R^n by a finite dimensional real vector space, isometric to a Euclidean space, such as M(n, R) ≈ R^(n^2). As an example, consider

(2.28)  Exp : M(n, R) → M(n, R),  Exp(X) = e^X = Σ_{k=0}^∞ (1/k!) X^k.

Since

(2.29)  Exp(Y) = I + Y + (1/2)Y^2 + · · · ,

we have

(2.30)  D Exp(0)Y = Y,  for all Y ∈ M(n, R),

so D Exp(0) is invertible. Then Theorem 2.1 implies that there exist a neighborhood U of 0 ∈ M(n, R) and a neighborhood V of I ∈ M(n, R) such that Exp : U → V is a smooth diffeomorphism.

To motivate the next result, we consider the following example. Take a > 0 and consider the equation

(2.31)  x^2 + y^2 = a^2,  F(x, y) = x^2 + y^2.

Note that

(2.32)  DF(x, y) = (2x 2y),  Dx F(x, y) = 2x,  Dy F(x, y) = 2y.

The equation (2.31) defines y "implicitly" as a smooth function of x if |x| < a. Explicitly,

(2.33)  |x| < a ⟹ y = √(a^2 − x^2).

Similarly, (2.31) defines x implicitly as a smooth function of y if |y| < a; explicitly,

(2.34)  |y| < a ⟹ x = √(a^2 − y^2).


Now, given x0 ∈ R and a > 0, there exists y0 ∈ R such that F(x0, y0) = a^2 if and only if |x0| ≤ a. Furthermore,

(2.35)  given F(x0, y0) = a^2,  Dy F(x0, y0) ≠ 0 ⟺ |x0| < a.

Similarly, given y0 ∈ R, there exists x0 such that F(x0, y0) = a^2 if and only if |y0| ≤ a, and

(2.36)  given F(x0, y0) = a^2,  Dx F(x0, y0) ≠ 0 ⟺ |y0| < a.

Note also that, whenever (x, y) ∈ R^2 and F(x, y) = a^2 > 0,

(2.37)  DF(x, y) ≠ 0,

so either Dx F(x, y) ≠ 0 or Dy F(x, y) ≠ 0. As seen above, whenever (x0, y0) ∈ R^2 and F(x0, y0) = a^2 > 0, we can solve F(x, y) = a^2 either for y as a smooth function of x, for x near x0, or for x as a smooth function of y, for y near y0. We move from these observations to the next result, the Implicit Function Theorem.

Theorem 2.5. Suppose U is a neighborhood of x0 ∈ R^m, V a neighborhood of y0 ∈ R^ℓ, and we have a C^k map

(2.38)  F : U × V → R^ℓ,  F(x0, y0) = u0.

Assume Dy F(x0, y0) is invertible. Then the equation F(x, y) = u0 defines y = g(x, u0) for x near x0 (satisfying g(x0, u0) = y0), with g a C^k map.

To prove this, consider H : U × V → R^m × R^ℓ defined by

(2.39)  H(x, y) = (x, F(x, y)).

(Actually, regard (x, y) and (x, F(x, y)) as column vectors.) We have

(2.40)  DH = (  I      0   )
             ( Dx F   Dy F ).

Thus DH(x0, y0) is invertible, so J = H^(−1) exists, on a neighborhood of (x0, u0), and is C^k, by the Inverse Function Theorem. It is clear that J(x, u0) has the form

(2.41)  J(x, u0) = (x, g(x, u0)),

and g is the desired map.

4

2

F : R −→ R ,

µ ¶ x(u2 + v 2 ) F (u, v, x, y) = . xu + yv

40

We have µ ¶ 4 F (2, 0, 1, 1) = . 2

(2.43) Note that

µ (2.44)

Du,v F (u, v, x, y) =

2xu 2xv x y

¶ ,

hence µ (2.45)

Du,v F (2, 0, 1, 1) =

4 0 1 1



is invertible, so Theorem 2.5 (with (u, v) in place of y and (x, y) in place of x) implies that the equation µ ¶ 4 (2.46) F (u, v, x, y) = 2 defines smooth functions (2.47)

u = u(x, y),

v = v(x, y),

for (x, y) near (x0, y0) = (1, 1), satisfying (2.46), with (u(1, 1), v(1, 1)) = (2, 0).

Let us next focus on the case ℓ = 1 of Theorem 2.5, so

(2.48)  z = (x, y) ∈ R^n,  x ∈ R^(n−1),  y ∈ R,  F(z) ∈ R.

Then Dy F = ∂y F. If F(x0, y0) = u0, Theorem 2.5 says that if

(2.49)  ∂y F(x0, y0) ≠ 0,

then one can solve

(2.50)  F(x, y) = u0 for y = g(x, u0),

for x near x0 (satisfying g(x0, u0) = y0), with g a C^k function. This phenomenon was illustrated in (2.31)–(2.35). To generalize the observations involving (2.36)–(2.37), we note the following. Set (x, y) = z = (z1, . . . , zn), z0 = (x0, y0). The condition (2.49) is that ∂zn F(z0) ≠ 0. Now a simple permutation of variables allows us to assume

(2.51)  ∂zj F(z0) ≠ 0,  F(z0) = u0,

and deduce that one can solve

(2.52)  F(z) = u0,  for zj = g(z1, . . . , zj−1, zj+1, . . . , zn).

Let us record this result, changing notation and replacing z by x.


Proposition 2.6. Let Ω be a neighborhood of x0 ∈ R^n. Assume we have a C^k function

(2.53)  F : Ω → R,  F(x0) = u0,

and assume

(2.54)  DF(x0) ≠ 0,  i.e., (∂1 F(x0), . . . , ∂n F(x0)) ≠ 0.

Then there exists j ∈ {1, . . . , n} such that one can solve F(x) = u0 for

(2.55)  xj = g(x1, . . . , xj−1, xj+1, . . . , xn),

with (x10, . . . , xj0, . . . , xn0) = x0, for a C^k function g.

Remark. For F : Ω → R, it is common to denote DF(x) by ∇F(x),

(2.56)  ∇F(x) = (∂1 F(x), . . . , ∂n F(x)).

Here is an example to which Proposition 2.6 applies. Using the notation (x, y) = (x1, x2), set

(2.57)  F : R^2 → R,  F(x, y) = x^2 + y^2 − x.

Then

(2.58)  ∇F(x, y) = (2x − 1, 2y),

which vanishes if and only if x = 1/2, y = 0. Hence Proposition 2.6 applies if and only if (x0, y0) ≠ (1/2, 0).

Let us give an example involving a real valued function on M(n, R), namely

(2.59)  det : M(n, R) → R.

As indicated in Exercise 11 of §1 (the first exercise set), if det X ≠ 0,

(2.60)  D det(X)Y = (det X) Tr(X^(−1) Y),

so

(2.61)  det X ≠ 0 ⟹ D det(X) ≠ 0.

We deduce that, if

(2.62)  X0 ∈ M(n, R),  det X0 = a ≠ 0,

then, writing

(2.63)  X = (xjk)_{1≤j,k≤n},

there exist µ, ν ∈ {1, . . . , n} such that the equation

(2.64)  det X = a

has a smooth solution of the form

(2.65)  xµν = g(xαβ : (α, β) ≠ (µ, ν)),

such that, if the argument of g consists of the matrix entries of X0 other than the (µ, ν) entry, then the left side of (2.65) is the (µ, ν) entry of X0.

Let us return to the setting of Theorem 2.5, with ℓ not necessarily equal to 1. In notation parallel to that of (2.51), we assume F is a C^k map,

(2.66)  F : Ω → R^ℓ,  F(z0) = u0,

where Ω is a neighborhood of z0 in R^n. We assume

(2.67)  DF(z0) : R^n → R^ℓ is surjective.

Then, upon reordering the variables z = (z1, . . . , zn), we can write z = (x, y), x = (x1, . . . , xn−ℓ), y = (y1, . . . , yℓ), such that Dy F(z0) is invertible, and Theorem 2.5 applies. Thus (for this reordering of variables), we have a C^k solution to

(2.68)  F(x, y) = u0,  y = g(x, u0),

satisfying y0 = g(x0, u0), z0 = (x0, y0).

To give one example to which this result applies, we take another look at F : R^4 → R^2 in (2.42). We have

(2.69)  DF(u, v, x, y) = ( 2xu  2xv  u^2 + v^2  0 )
                         (  x    y       u      v ).

The reader is invited to determine for which (u, v, x, y) ∈ R^4 the matrix on the right side of (2.69) has rank 2.

Here is another example, involving a map defined on M(n, R). Set

(2.70)  F : M(n, R) → R^2,  F(X) = ( det X )
                                   (  Tr X ).

Parallel to (2.60), if det X ≠ 0 and Y ∈ M(n, R),

(2.71)  DF(X)Y = ( (det X) Tr(X^(−1) Y) )
                 (         Tr Y         ).

Hence, given det X ≠ 0, DF(X) : M(n, R) → R^2 is surjective if and only if

(2.72)  L : M(n, R) → R^2,  LY = ( Tr(X^(−1) Y) )
                                 (     Tr Y     )

is surjective. This is seen to be the case if and only if X is not a scalar multiple of the identity I ∈ M(n, R).

Exercises

1. Suppose F : U → R^n is a C^2 map, U open in R^n, p ∈ U, and DF(p) is invertible. With q = F(p), define a map N on a neighborhood of p by

(2.73)  N(x) = x + DF(x)^(−1)(q − F(x)).

Show that there exist ε > 0 and C < ∞ such that, for 0 ≤ r < ε,

‖x − p‖ ≤ r ⟹ ‖N(x) − p‖ ≤ C r^2.

Conclude that, if ‖x1 − p‖ ≤ r with r < min(ε, 1/2C), then xj+1 = N(xj) defines a sequence converging very rapidly to p. This is the basis of Newton's method, for solving F(p) = q for p.
Hint. Apply F to both sides of (2.73).

2. Applying Newton's method to f(x) = 1/x, show that you get a fast approximation to division using only addition and multiplication.
Hint. Carry out the calculation of N(x) in this case and notice a "miracle."

3. Identify R^(2n) with C^n via z = x + iy, as in Exercise 4 of §1. Let U ⊂ R^(2n) be open, F : U → R^(2n) be C^1. Assume p ∈ U, DF(p) invertible. If F^(−1) : V → U is given as in Theorem 2.1, show that F^(−1) is holomorphic provided F is.

4. Let O ⊂ R^n be open. We say a function f ∈ C^∞(O) is real analytic provided that, for each x0 ∈ O, we have a convergent power series expansion

(2.74)  f(x) = Σ_{α≥0} (1/α!) f^(α)(x0)(x − x0)^α,

valid in a neighborhood of x0. Show that we can let x be complex in (2.74), and obtain an extension of f to a neighborhood of O in C^n. Show that the extended function is holomorphic, i.e., satisfies the Cauchy-Riemann equations.
Remark. It can be shown that, conversely, any holomorphic function has a power series expansion. See §10. For the next exercise, assume this as known.

5. Let O ⊂ R^n be open, p ∈ O, f : O → R^n be real analytic, with Df(p) invertible. Take


f⁻¹ : V → U as in Theorem 2.1. Show f⁻¹ is real analytic.
Hint. Consider a holomorphic extension F : Ω → Cn of f and apply Exercise 3.

6. Use (2.6) to show that if a C^1 diffeomorphism F has a C^1 inverse G, and if actually F is C^k, then also G is C^k.
Hint. Use induction on k. Write (2.6) as 𝒢(x) = Φ ◦ ℱ ◦ G(x), with Φ(X) = X⁻¹, as in Exercises 3 and 13 of §1, 𝒢(x) = DG(x), ℱ(x) = DF(x). Apply Exercise 12 of §1 to show that, in general, G, ℱ, Φ ∈ C^ℓ ⟹ 𝒢 ∈ C^ℓ. Deduce that if one is given F ∈ C^k and one knows that G ∈ C^{k−1}, then this result applies to give 𝒢 = DG ∈ C^{k−1}, hence G ∈ C^k.

7. Show that there is a neighborhood O of (1, 0) ∈ R2 and there are functions u, v, w ∈ C^1(O) (u = u(x, y), etc.) satisfying the equations

    u³ + v³ − xw³ = 0,
    u² + yw² + v = 1,
    xu + yvw = 1,

for (x, y) ∈ O, and satisfying

    u(1, 0) = 1,    v(1, 0) = 0,    w(1, 0) = 1.

Hint. Define F : R5 → R3 by

    F(u, v, w, x, y) = (u³ + v³ − xw³, u² + yw² + v, xu + yvw)^t.

Then F(1, 0, 1, 1, 0) = (0, 1, 1)^t. Evaluate the 3 × 3 matrix D_{u,v,w} F(1, 0, 1, 1, 0). Compare (2.42)–(2.47).

8. Consider F : M(n, R) → M(n, R), given by F(X) = X². Show that F is a diffeomorphism of a neighborhood of the identity matrix I onto a neighborhood of I. Show that F is not a diffeomorphism of a neighborhood of

    ( 1   0
      0  −1 )

onto a neighborhood of I (in case n = 2).

9. Prove Corollary 2.4.
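As an aside, the quadratic convergence asserted in Exercise 1 is easy to watch numerically. The sketch below is not from the text: the map F, the point p, and the starting guess are illustrative choices, and the 2 × 2 Jacobian is inverted by hand.

```python
# Newton iteration N(x) = x + DF(x)^{-1} (q - F(x)), as in Exercise 1,
# for the illustrative map F(x, y) = (x^2 - y, x + y^2) on R^2.

def F(x, y):
    return (x * x - y, x + y * y)

def newton_step(x, y, q):
    # DF(x, y) = [[2x, -1], [1, 2y]], inverted directly (det = 4xy + 1)
    det = 4 * x * y + 1
    rx, ry = q[0] - x * x + y, q[1] - x - y * y
    return (x + (2 * y * rx + ry) / det,
            y + (-rx + 2 * x * ry) / det)

p = (1.0, 2.0)
q = F(*p)                      # so that p solves F(p) = q by construction
z = (1.2, 2.3)                 # starting guess near p
errs = []
for _ in range(6):
    z = newton_step(z[0], z[1], q)
    errs.append(max(abs(z[0] - p[0]), abs(z[1] - p[1])))
# errs shrinks roughly quadratically, as the exercise predicts
```

For Exercise 2, the same recipe applied to f(x) = 1/x, solving 1/x = a, gives N(x) = x(2 − ax): no division appears, which is the "miracle" the hint points to.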


3. Fundamental local existence theorem for ODE

The goal of this section is to establish existence of solutions to an ODE

(3.1)    dy/dt = F(t, y),    y(t0) = y0.

We will prove the following fundamental result.

Theorem 3.1. Let y0 ∈ O, an open subset of Rn, I ⊂ R an interval containing t0. Suppose F is continuous on I × O and satisfies the following Lipschitz estimate in y:

(3.2)    ‖F(t, y1) − F(t, y2)‖ ≤ L ‖y1 − y2‖

for t ∈ I, yj ∈ O. Then the equation (3.1) has a unique solution on some t-interval containing t0.

To begin the proof, we note that the equation (3.1) is equivalent to the integral equation

(3.3)    y(t) = y0 + ∫_{t0}^{t} F(s, y(s)) ds.

Existence will be established via the Picard iteration method, which is the following. Guess y0(t), e.g., y0(t) = y0. Then set

(3.4)    yk(t) = y0 + ∫_{t0}^{t} F(s, y_{k−1}(s)) ds.

We aim to show that, as k → ∞, yk(t) converges to a (unique) solution of (3.3), at least for t close enough to t0. To do this, we use the Contraction Mapping Principle, established in §2. We look for a fixed point of T, defined by

(3.5)    (T y)(t) = y0 + ∫_{t0}^{t} F(s, y(s)) ds.
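As a numerical aside (not part of the proof), the Picard iterates (3.4) can be computed approximately by discretizing the integral. The sketch below uses the illustrative choice F(t, y) = y, y0 = 1, t0 = 0, whose solution is e^t, with the trapezoid rule for the integral:

```python
# Picard iteration y_{k+1}(t) = 1 + int_0^t y_k(s) ds for dy/dt = y,
# y(0) = 1, on a grid over [0, 1/2]; the iterates converge to exp(t).
import math

n = 100
h = 0.5 / n
t = [i * h for i in range(n + 1)]
y = [1.0] * (n + 1)            # initial guess y_0(t) = y0 = 1

for _ in range(20):
    integral = 0.0
    z = [1.0]
    for i in range(n):
        integral += 0.5 * h * (y[i] + y[i + 1])   # trapezoid rule
        z.append(1.0 + integral)
    y = z

err = max(abs(y[i] - math.exp(t[i])) for i in range(n + 1))
# err is small: the iteration error decays like t^k / k!, and the
# quadrature error is O(h^2)
```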

Let

(3.6)    X = { u ∈ C(J, Rn) : u(t0) = y0, sup_{t∈J} ‖u(t) − y0‖ ≤ K }.

Here J = [t0 − ε, t0 + ε], where ε will be chosen, sufficiently small, below. The quantity K is picked so that {y : ‖y − y0‖ ≤ K} is contained in O, and we also suppose J ⊂ I. Then there exists M such that

(3.7)    sup_{s∈J, ‖y−y0‖≤K} ‖F(s, y)‖ ≤ M.


Then, provided

(3.8)    ε ≤ K/M,

we have (3.9)

T : X → X.

Now, using the Lipschitz hypothesis (3.2), we have, for t ∈ J,

(3.10)    ‖(T y)(t) − (T z)(t)‖ ≤ ∫_{t0}^{t} L ‖y(s) − z(s)‖ ds ≤ ε L sup_{s∈J} ‖y(s) − z(s)‖

assuming y and z belong to X. It follows that T is a contraction on X provided one has (3.11)

ε < 1/L.
T0, as stated below (3.11).

3. Let M be a compact smooth surface in Rn. Suppose F : Rn → Rn is a smooth map (vector field), such that, for each x ∈ M, F (x) is tangent to M, i.e., the line γx(t) = x + tF(x) is tangent to M at x, at t = 0. Show that, if x ∈ M, then the initial value problem

    dy/dt = F(y),    y(0) = x


has a solution for all t ∈ R, and y(t) ∈ M for all t.
Hint. Locally, straighten out M to be a linear subspace of Rn, to which F is tangent. Use uniqueness. Material in §2 helps do this local straightening.

4. Show that the initial value problem

    dx/dt = −x(x² + y²),    dy/dt = −y(x² + y²),    x(0) = x0, y(0) = y0

has a solution for all t ≥ 0, but not for all t < 0, unless (x0, y0) = (0, 0).

5. Let a ∈ R. Show that the unique solution to u′(t) = au(t), u(0) = 1 is given by

(3.20)    u(t) = Σ_{j=0}^{∞} (a^j / j!) t^j.

We denote this function by u(t) = e^{at}, the exponential function. We also write exp(t) = e^t.
Hint. Integrate the series term by term and use the Fundamental Theorem of Calculus.
Alternative. Setting u0(t) = 1, and using the Picard iteration method (3.4) to define the sequence uk(t), show that uk(t) = Σ_{j=0}^{k} a^j t^j / j!.

6. Show that, for all s, t ∈ R,

(3.21)    e^{a(s+t)} = e^{as} e^{at}.

Hint. Show that u1(t) = e^{a(s+t)} and u2(t) = e^{as} e^{at} solve the same initial value problem.

7. Show that exp : R → (0, ∞) is a diffeomorphism. We denote the inverse by

    log : (0, ∞) → R.

Show that v(x) = log x solves the ODE dv/dx = 1/x, v(1) = 0, and deduce that

(3.22)    ∫_1^x (1/y) dy = log x.

8. Let a ∈ R, i = √−1. Show that the unique solution to f′(t) = iaf, f(0) = 1 is given by

(3.23)    f(t) = Σ_{j=0}^{∞} ((ia)^j / j!) t^j.

We denote this function by f(t) = e^{iat}. Show that, for all s, t ∈ R,

(3.24)    e^{ia(s+t)} = e^{ias} e^{iat}.


9. Write

(3.25)    e^{it} = Σ_{j=0}^{∞} ((−1)^j / (2j)!) t^{2j} + i Σ_{j=0}^{∞} ((−1)^j / (2j+1)!) t^{2j+1} = u(t) + i v(t).

Show that

    u′(t) = −v(t),    v′(t) = u(t).

We denote these functions by u(t) = cos t, v(t) = sin t. The identity

(3.26)    e^{it} = cos t + i sin t

is called Euler's formula.
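As a quick sanity check (an aside, not in the text), the partial sums of the series (3.25) can be compared with the library cosine and sine in floating point:

```python
# Partial sum of e^{it} = sum_{j>=0} (it)^j / j!, compared with
# cos t + i sin t, i.e., Euler's formula (3.26).
import math

t = 1.3
s = complex(0.0, 0.0)
term = complex(1.0, 0.0)       # current term (it)^j / j!
for j in range(40):
    s += term
    term *= complex(0.0, t) / (j + 1)
# s.real is close to cos t and s.imag is close to sin t
```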

Auxiliary exercises on trigonometric functions

We use the definitions of sin t and cos t given in Exercise 9 of the last exercise set.

1. Use (3.24) to derive the identities

(3.27)    sin(x + y) = sin x cos y + cos x sin y,
          cos(x + y) = cos x cos y − sin x sin y.

2. Use (3.26)–(3.27) to show that

(3.28)    sin² t + cos² t = 1,    cos² t = (1/2)(1 + cos 2t).

3. Show that γ(t) = (cos t, sin t) is a map of R onto the unit circle S¹ ⊂ R2 with non-vanishing derivative, and, as t increases, γ(t) moves monotonically, counterclockwise. We define π to be the smallest number t1 ∈ (0, ∞) such that γ(t1) = (−1, 0), so

    cos π = −1,    sin π = 0.

Show that 2π is the smallest number t2 ∈ (0, ∞) such that γ(t2) = (1, 0), so

    cos 2π = 1,    sin 2π = 0.

Show that

    cos(t + 2π) = cos t,    cos(t + π) = − cos t,
    sin(t + 2π) = sin t,    sin(t + π) = − sin t.

Show that γ(π/2) = (0, 1), and that

    cos(t + π/2) = − sin t,    sin(t + π/2) = cos t.

4. Show that sin : (−π/2, π/2) → (−1, 1) is a diffeomorphism. We denote its inverse by

    arcsin : (−1, 1) → (−π/2, π/2).

Show that u(t) = arcsin t solves the ODE

    du/dt = 1/√(1 − t²),    u(0) = 0.

Hint. Apply the chain rule to sin u(t) = t. Deduce that, for t ∈ (−1, 1),

(3.29)    arcsin t = ∫_0^t dx/√(1 − x²).

5. Show that

    e^{πi/3} = 1/2 + (√3/2) i,    e^{πi/6} = √3/2 + (1/2) i.

Hint. First compute

    (1/2 + (√3/2) i)³

and use Exercise 3. Then compute e^{πi/2} e^{−πi/3}. For intuition behind these formulas, look at Fig. 3.1.

6. Show that sin(π/6) = 1/2, and hence that

    π/6 = ∫_0^{1/2} dx/√(1 − x²) = Σ_{n=0}^{∞} (a_n / (2n + 1)) (1/2)^{2n+1},

where

    a_0 = 1,    a_{n+1} = ((2n + 1)/(2n + 2)) a_n.

Show that

    π/6 − Σ_{n=0}^{k} (a_n / (2n + 1)) (1/2)^{2n+1} < 4^{−k} / (3(2k + 3)).

Using a calculator, sum the series over 0 ≤ n ≤ 20, and verify that π ≈ 3.141592653589 · · ·
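The calculator computation suggested here takes only a few lines; the following sketch implements the recurrence for a_n from Exercise 6 and sums the series over 0 ≤ n ≤ 20:

```python
# pi/6 = sum_{n>=0} (a_n / (2n+1)) (1/2)^{2n+1}, with a_0 = 1 and
# a_{n+1} = ((2n+1)/(2n+2)) a_n; the truncation error after n = k
# is below 4^{-k} / (3(2k+3)), per the exercise.
import math

a = 1.0
s = 0.0
for n in range(21):            # 0 <= n <= 20
    s += a / (2 * n + 1) * 0.5 ** (2 * n + 1)
    a *= (2 * n + 1) / (2 * n + 2)

pi_approx = 6 * s              # ~ 3.141592653589...
```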


7. For x ≠ (k + 1/2)π, k ∈ Z, set

    tan x = sin x / cos x.

Show that 1 + tan² x = 1/cos² x. Show that w(x) = tan x satisfies the ODE

    dw/dx = 1 + w²,    w(0) = 0.

8. Show that tan : (−π/2, π/2) → R is a diffeomorphism. Denote the inverse by

    arctan : R → (−π/2, π/2).

Show that

(3.30)    arctan y = ∫_0^y dx/(1 + x²).

3b. Vector fields and flows

Let U ⊂ Rn be open. A vector field on U is a smooth map

(3.31)    X : U → Rn.

Consider the corresponding ODE

(3.32)    dy/dt = X(y),    y(0) = x,

with x ∈ U. A curve y(t) solving (3.32) is called an integral curve of the vector field X. It is also called an orbit. For fixed t, write

(3.33)    y = y(t, x) = F_X^t(x).

The locally defined F_X^t, mapping (a subdomain of) U to U, is called the flow generated by the vector field X. In (3.33), y is a smooth function of (t, x). A proof of this can be found in §6 of [T:1]. The vector field X defines a differential operator on scalar functions, as follows:

(3.34)    L_X f(x) = lim_{h→0} h⁻¹ [ f(F_X^h x) − f(x) ] = (d/dt) f(F_X^t x) |_{t=0}.


We also use the common notation

(3.35)    L_X f(x) = Xf,

that is, we apply X to f as a first order differential operator. Note that, if we apply the chain rule to (3.34) and use (3.32), we have

(3.36)    L_X f(x) = X(x) · ∇f(x) = Σ_j a_j(x) ∂f/∂x_j,

if X = Σ_j a_j(x) e_j, with {e_j} the standard basis of Rn. In particular, using the notation (3.35), we have

(3.37)    a_j(x) = X x_j.

In the notation (3.35),

(3.38)    X = Σ_j a_j(x) ∂/∂x_j.

We next consider mapping properties of vector fields. If F : V → W is a diffeomorphism between two open domains in Rn, and Y is a vector field on W, we define a vector field F# Y on V so that

(3.39)    F_{F# Y}^t = F⁻¹ ◦ F_Y^t ◦ F,

or equivalently, by the chain rule,

(3.40)    (F# Y)(x) = DF⁻¹(F(x)) Y(F(x)).

In particular, if U ⊂ Rn is open and X is a vector field on U, defining a flow F^t, then for a vector field Y, F_#^t Y is defined on most of U, for |t| small, and we can define the Lie derivative:

(3.41)    L_X Y = lim_{h→0} h⁻¹ (F_#^h Y − Y) = (d/dt) F_#^t Y |_{t=0},

as a vector field on U. Another natural construction is the operator-theoretic bracket:

(3.42)    [X, Y] = XY − YX,

where the vector fields X and Y are regarded as first order differential operators on C∞(U). One verifies that (3.42) defines a vector field on U. In fact, if X = Σ a_j(x) ∂/∂x_j, Y = Σ b_j(x) ∂/∂x_j, then

(3.43)    [X, Y] = Σ_{j,k} ( a_k ∂b_j/∂x_k − b_k ∂a_j/∂x_k ) ∂/∂x_j.
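Formula (3.43) lends itself to a direct numerical check (an illustrative sketch, not in the text): replacing the partial derivatives by centered difference quotients, the bracket of X = ∂/∂x1 and Y = x1 ∂/∂x2 on R2 should come out close to ∂/∂x2, i.e., to the coefficient vector (0, 1).

```python
# [X, Y]_j = sum_k (a_k db_j/dx_k - b_k da_j/dx_k), per (3.43), with the
# partial derivatives approximated by centered differences.

def bracket(Xc, Yc, p, h=1e-5):
    n = len(p)
    a, b = Xc(p), Yc(p)
    out = []
    for j in range(n):
        v = 0.0
        for k in range(n):
            pp = list(p); pp[k] += h
            pm = list(p); pm[k] -= h
            db = (Yc(pp)[j] - Yc(pm)[j]) / (2 * h)
            da = (Xc(pp)[j] - Xc(pm)[j]) / (2 * h)
            v += a[k] * db - b[k] * da
        out.append(v)
    return out

Xc = lambda p: (1.0, 0.0)      # coefficients of X = d/dx1
Yc = lambda p: (0.0, p[0])     # coefficients of Y = x1 d/dx2
c = bracket(Xc, Yc, [0.7, -0.3])
# c is close to (0.0, 1.0), the coefficient vector of d/dx2
```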

The basic elementary fact about the Lie bracket is the following.


Theorem 3.2. If X and Y are smooth vector fields, then

(3.44)    L_X Y = [X, Y].

Proof. We examine L_X Y = (d/ds) F_{X#}^s Y |_{s=0}, using (3.40), which implies that

(3.45)    Y_s(x) = (F_{X#}^s Y)(x) = DF_X^{−s}(F_X^s(x)) Y(F_X^s(x)).

Let us set 𝒢^s = DF_X^{−s}. Note that 𝒢^s : U → End(Rn). Hence, for x ∈ U, D𝒢^s(x) is an element of Hom(Rn, End(Rn)). Taking the s-derivative of (3.45), we have

(3.46)    (d/ds) Y_s(x) = −DX(F_X^s(x)) Y(F_X^s(x)) + D𝒢^s(F_X^s(x)) X(F_X^s(x)) Y(F_X^s(x))
                          + DF_X^{−s}(F_X^s(x)) DY(F_X^s(x)) X(F_X^s(x)).

Note that 𝒢^0(x) = I ∈ End(Rn) for all x ∈ U, so D𝒢^0 = 0. Thus

(3.47)    L_X Y = (d/ds) Y_s(x) |_{s=0} = −DX(x) Y(x) + DY(x) X(x),

which agrees with the formula (3.43) for [X, Y].

Corollary 3.3. If X and Y are smooth vector fields on U, then

(3.48)    (d/dt) F_{X#}^t Y = F_{X#}^t [X, Y]

for all t.

Proof. Since locally F_X^{t+s} = F_X^s F_X^t, we have the same identity for F_{X#}^{t+s}, which yields (3.48) upon taking the s-derivative.


4. The Riemann integral in n variables

We define the Riemann integral of a bounded function f : R → R, where R ⊂ Rn is a cell, i.e., a product of intervals R = I1 × · · · × In, where Iν = [aν, bν] are intervals in R. Recall that a partition of an interval I = [a, b] is a finite collection of subintervals {Jk : 0 ≤ k ≤ N}, disjoint except for their endpoints, whose union is I. We can take Jk = [xk, xk+1], where

(4.1)    a = x0 < x1 < · · · < xN < xN+1 = b.

Now, if one has a partition of each Iν into Jν1 ∪ · · · ∪ Jν,N(ν), then a partition P of R consists of the cells

(4.2)    Rα = J1α1 × J2α2 × · · · × Jnαn,

where 0 ≤ αν ≤ N(ν). For such a partition, define

(4.3)    maxsize(P) = max_α diam Rα,

where (diam Rα)² = ℓ(J1α1)² + · · · + ℓ(Jnαn)². Here, ℓ(J) denotes the length of an interval J. Each cell has n-dimensional volume

(4.4)    V(Rα) = ℓ(J1α1) · · · ℓ(Jnαn).

Sometimes we use Vn(Rα) for emphasis on the dimension. We also use A(R) for V2(R), and, of course, ℓ(R) for V1(R). We set the upper and lower sums

(4.5)    Ī_P(f) = Σ_α sup_{x∈Rα} f(x) V(Rα),    I_P(f) = Σ_α inf_{x∈Rα} f(x) V(Rα).
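To make (4.5) concrete (an illustrative aside, not from the text): for f(x, y) = xy on the cell R = [0, 1] × [0, 1], whose integral is 1/4, the sup and inf over each subcell of a uniform N × N partition occur at opposite corners, since f increases in each variable. The upper and lower sums then bracket 1/4 and approach it as N grows:

```python
# Upper and lower sums (4.5) for f(x, y) = x*y over [0, 1]^2 with a
# uniform N x N partition; f increases in each variable, so the inf
# and sup on each subcell sit at its lower-left / upper-right corner.
N = 100
h = 1.0 / N
f = lambda x, y: x * y

lower = sum(f(i * h, j * h) * h * h
            for i in range(N) for j in range(N))
upper = sum(f((i + 1) * h, (j + 1) * h) * h * h
            for i in range(N) for j in range(N))
# lower <= 1/4 <= upper, with upper - lower equal to 1/N here
```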

Note that I_P(f) ≤ Ī_P(f). These quantities should approximate the Riemann integral of f, if the partition P is sufficiently "fine." To be more precise, if P and Q are two partitions of R, we say P refines Q, and write P ≻ Q, if each partition of each interval factor Iν of R involved in the definition of Q is further refined in order to produce the partitions of the factors Iν used to define P, via (4.2). It is an exercise to show that any two partitions of R have a common refinement. Note also that

(4.6)    P ≻ Q ⟹ Ī_P(f) ≤ Ī_Q(f), and I_P(f) ≥ I_Q(f).


Consequently, if Pj are any two partitions of R and Q is a common refinement, we have

(4.7)    I_{P1}(f) ≤ I_Q(f) ≤ Ī_Q(f) ≤ Ī_{P2}(f).

Now, whenever f : R → R is bounded, the following quantities are well defined:

(4.8)    Ī(f) = inf_{P∈Π(R)} Ī_P(f),    I(f) = sup_{P∈Π(R)} I_P(f),

where Π(R) is the set of all partitions of R, as defined above. Clearly, by (4.7), I(f) ≤ Ī(f). We then say that f is Riemann integrable (on R) provided I(f) = Ī(f), and in such a case, we set

(4.9)    ∫_R f(x) dV(x) = Ī(f) = I(f).

We will denote the set of Riemann integrable functions on R by R(R). If dim R = 2, we will often use dA(x) instead of dV(x) in (4.9), and we will often use simply dx.

We derive some basic properties of the Riemann integral. First, the proofs of Proposition 0.3 and Corollary 0.4 readily extend, to give:

Proposition 4.1. Let Pν be any sequence of partitions of R such that maxsize(Pν) = δν → 0, and let ξνα be any choice of one point in each cell Rνα in the partition Pν. Then, whenever f ∈ R(R),

(4.10)    ∫_R f(x) dV(x) = lim_{ν→∞} Σ_α f(ξνα) V(Rνα).

Also, we can extend the proof of Proposition 0.1, to obtain:

Proposition 4.2. If fj ∈ R(R) and cj ∈ R, then c1 f1 + c2 f2 ∈ R(R), and

(4.11)    ∫_R (c1 f1 + c2 f2) dV = c1 ∫_R f1 dV + c2 ∫_R f2 dV.

Next, we establish an integrability result analogous to Proposition 0.2.

Proposition 4.3. If f is continuous on R, then f ∈ R(R).

Proof. As in the proof of Proposition 0.2, we have that

(4.12)    maxsize(P) ≤ δ ⟹ Ī_P(f) − I_P(f) ≤ ω(δ) · V(R),

where ω(δ) is a modulus of continuity for f on R. This proves the proposition.


When the number of variables exceeds one, it becomes more important to identify some nice classes of discontinuous functions on R that are Riemann integrable. A useful tool for this is the following notion of size of a set S ⊂ R, called content. Extending (0.18)–(0.19), we define "upper content" cont⁺ and "lower content" cont⁻ by

(4.13)    cont⁺(S) = Ī(χS),    cont⁻(S) = I(χS),

where χS is the characteristic function of S. We say S has content, or "is contented," if these quantities are equal, which happens if and only if χS ∈ R(R), in which case the common value of cont⁺(S) and cont⁻(S) is

(4.14)    V(S) = ∫_R χS(x) dV(x).

We mention that, if S = I1 × · · · × In is a cell, it is readily verified that the definitions in (4.5), (4.8), and (4.13) yield cont⁺(S) = cont⁻(S) = ℓ(I1) · · · ℓ(In), so the definition of V(S) given by (4.14) is consistent with that given in (4.4). It is easy to see that

(4.15)    cont⁺(S) = inf { Σ_{k=1}^{N} V(Rk) : S ⊂ R1 ∪ · · · ∪ RN },

where Rk are cells contained in R. In the formal definition, the Rα in (4.15) should be part of a partition P of R, as defined above, but if {R1, . . . , RN} are any cells in R, they can be chopped up into smaller cells, some perhaps thrown away, to yield a finite cover of S by cells in a partition of R, so one gets the same result. It is an exercise to see that, for any set S ⊂ R,

(4.16)    cont⁺(S) = cont⁺(S̄),

where S̄ is the closure of S. We note that, generally, for a bounded function f on R,

(4.17)    Ī(f) + I(1 − f) = V(R).

This follows directly from (4.5). In particular, given S ⊂ R,

(4.18)    cont⁻(S) + cont⁺(R \ S) = V(R).

Using this together with (4.16), with S and R \ S switched, we have

(4.19)    cont⁻(S) = cont⁻(S̊),

where S̊ is the interior of S. The difference S̄ \ S̊ is called the boundary of S, and denoted bS. Note that

(4.20)    cont⁻(S) = sup { Σ_{k=1}^{N} V(Rk) : R1 ∪ · · · ∪ RN ⊂ S },

where here we take {R1, . . . , RN} to be cells within a partition P of R, and let P vary over all partitions of R. Now, given a partition P of R, classify each cell in P as either being contained in R \ S̄, or intersecting bS, or contained in S̊. Letting P vary over all partitions of R, we see that

(4.21)    cont⁺(S) = cont⁺(bS) + cont⁻(S).

In particular, we have:

Proposition 4.4. If S ⊂ R, then S is contented if and only if cont⁺(bS) = 0.

If a set Σ ⊂ R has the property that cont⁺(Σ) = 0, we say that Σ has content zero, or is a nil set. Clearly Σ is nil if and only if Σ̄ is nil. It follows easily from Proposition 4.2 that, if Σj are nil, 1 ≤ j ≤ K, then ⋃_{j=1}^{K} Σj is nil.

If S1, S2 ⊂ R and S = S1 ∪ S2, then S̄ = S̄1 ∪ S̄2 and S̊ ⊃ S̊1 ∪ S̊2. Hence bS ⊂ b(S1) ∪ b(S2). It follows then from Proposition 4.4 that, if S1 and S2 are contented, so is S1 ∪ S2. Clearly, if Sj are contented, so are Sj^c = R \ Sj. It follows that, if S1 and S2 are contented, so is S1 ∩ S2 = (S1^c ∪ S2^c)^c. A family F of subsets of R such that R ∈ F, Sj ∈ F ⇒ S1 ∪ S2 ∈ F, and S1 ∈ F ⇒ R \ S1 ∈ F is called an algebra of subsets of R. Algebras of sets are automatically closed under finite intersections also. We see that:

Proposition 4.5. The family of contented subsets of R is an algebra of sets.

The following result specifies a useful class of Riemann integrable functions.

Proposition 4.6. If f : R → R is bounded and the set S of points of discontinuity of f is a nil set, then f ∈ R(R).

Proof. Suppose |f| ≤ M on R, and take ε > 0. Take a partition P of R, and write P = P′ ∪ P′′, where cells in P′ do not meet S, and cells in P′′ do intersect S. Since cont⁺(S) = 0, we can pick P so that the cells in P′′ have total volume ≤ ε. Now f is continuous on each cell in P′. Further refining the partition if necessary, we can assume that f varies by ≤ ε on each cell in P′. Thus

(4.22)    Ī_P(f) − I_P(f) ≤ [V(R) + 2M] ε.

This proves the proposition.

To give an example, suppose K ⊂ R is a closed set such that bK is nil. Let f : K → R be continuous. Define f̃ : R → R by

(4.23)    f̃(x) = f(x) for x ∈ K,    f̃(x) = 0 for x ∈ R \ K.


Then the set of points of discontinuity of f̃ is contained in bK. Hence f̃ ∈ R(R). We set

(4.24)    ∫_K f dV = ∫_R f̃ dV.

In connection with this, we note the following fact, whose proof is an exercise. Suppose R and R̃ are cells, with R ⊂ R̃. Suppose that g ∈ R(R) and that g̃ is defined on R̃, to be equal to g on R and to be 0 on R̃ \ R. Then

(4.25)    g̃ ∈ R(R̃), and ∫_R g dV = ∫_{R̃} g̃ dV.

This can be shown by an argument involving refining any given pair of partitions of R and R̃, respectively, to a pair of partitions P_R and P_{R̃} with the property that each cell in P_R is a cell in P_{R̃}.

The following is a useful criterion for a set S ⊂ Rn to have content zero.

Proposition 4.7. Let Σ ⊂ Rn−1 be a closed bounded set and let g : Σ → R be continuous. Then the graph of g,

    G = { (x, g(x)) : x ∈ Σ },

is a nil subset of Rn.

Proof. Put Σ in a cell R0 ⊂ Rn−1. Suppose |g| ≤ M on Σ. Take N ∈ Z⁺ and set ε = M/N. Pick a partition P0 of R0, sufficiently fine that g varies by at most ε on each set Σ ∩ Rα, for any cell Rα ∈ P0. Partition the interval I = [−M, M] into 2N equal intervals Jν, of length ε. Then {Rα × Jν} = {Qαν} forms a partition of R0 × I. Now, over each cell Rα ∈ P0, there lie at most 2 cells Qαν which intersect G, so cont⁺(G) ≤ 2ε · V(R0). Letting N → ∞, we have the proposition.

Similarly, for any j ∈ {1, . . . , n}, the graph of xj as a continuous function of the complementary variables is a nil set in Rn. So are finite unions of such graphs. Such sets arise as boundaries of many ordinary-looking regions in Rn. Here is a further class of nil sets.

Proposition 4.8. Let S be a compact nil subset of Rn and f : S → Rn a Lipschitz map. Then f(S) is a nil subset of Rn.

Proof. The Lipschitz hypothesis on f is that, for p, q ∈ S, |f(p) − f(q)| ≤ C1 |p − q|. If we cover S with k cells (in a partition), of total volume ≤ α, each cubical with edgesize δ, then f(S) is covered by k sets of diameter ≤ C1 √n δ, hence it can be covered by k cubical cells of edgesize C1 √n δ, having total volume ≤ (C1 √n)ⁿ α. From this we have the (not very sharp) general bound

(4.26)    cont⁺(f(S)) ≤ (C1 √n)ⁿ cont⁺(S),

which proves the proposition.

In evaluating n-dimensional integrals, it is usually convenient to reduce them to iterated integrals. The following is a special case of a result known as Fubini's Theorem.


Theorem 4.9. Let Σ ⊂ Rn−1 be a closed, bounded contented set and let gj : Σ → R be continuous, with g0(x) < g1(x) on Σ. Take

(4.27)    Ω = { (x, y) ∈ Rn : x ∈ Σ, g0(x) ≤ y ≤ g1(x) }.

Then Ω is a contented set in Rn. If f : Ω → R is continuous, then

(4.28)    ϕ(x) = ∫_{g0(x)}^{g1(x)} f(x, y) dy

is continuous on Σ, and

(4.29)    ∫_Ω f dVn = ∫_Σ ϕ dVn−1,

i.e.,

(4.30)    ∫_Ω f dVn = ∫_Σ ( ∫_{g0(x)}^{g1(x)} f(x, y) dy ) dVn−1(x).
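A numerical illustration of (4.30) (a sketch; the choices of Σ, gj, and f are illustrative, not from the text): with Σ = [0, 1], g0 = 0, g1(x) = x, and f ≡ 1, both sides equal the area 1/2 of the triangle {0 ≤ y ≤ x ≤ 1}. The iterated integral is approximated by midpoint rules:

```python
# Iterated integral (4.30) with Sigma = [0,1], g0 = 0, g1(x) = x, f = 1;
# the exact value is the area 1/2 of the triangle {0 <= y <= x <= 1}.
M = 200
h = 1.0 / M

def inner(x):                  # midpoint rule for int_{g0(x)}^{g1(x)} f dy
    g0, g1 = 0.0, x
    hy = (g1 - g0) / M
    return sum(1.0 * hy for _ in range(M))

total = sum(inner((i + 0.5) * h) * h for i in range(M))
# total is ~ 0.5
```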

Proof. The continuity of (4.28) is an exercise in one-variable integration; see Exercises 2–3 of §0. Let ω(δ) be a modulus of continuity for g0, g1, and f, and also ϕ. We also can assume that ω(δ) ≥ δ. Put Σ in a cell R0 and let P0 be a partition of R0. If A ≤ g0 ≤ g1 ≤ B, partition the interval [A, B], and from this and P0 construct a partition P of R = R0 × [A, B]. We denote a cell in P0 by Rα and a cell in P by Rαℓ = Rα × Jℓ. Pick points ξαℓ ∈ Rαℓ.

Write P0 = P0′ ∪ P0′′ ∪ P0′′′, consisting respectively of cells inside Σ̊, meeting bΣ, and inside R0 \ Σ. Similarly write P = P′ ∪ P′′ ∪ P′′′, consisting respectively of cells inside Ω̊, meeting bΩ, and inside R \ Ω, as illustrated in Fig. 4.1. For fixed α, let z′(α) = {ℓ : Rαℓ ∈ P′}, and let z′′(α) and z′′′(α) be similarly defined. Note that z′(α) ≠ ∅ ⟺ Rα ∈ P0′, provided we assume maxsize(P) ≤ δ and 2δ < min [g1(x) − g0(x)], as we will from here on. It follows from (0.9)–(0.10) that

(4.31)    | Σ_{ℓ∈z′(α)} f(ξαℓ) ℓ(Jℓ) − ∫_{A(α)}^{B(α)} f(x, y) dy | ≤ (B − A) ω(δ),    ∀ x ∈ Rα,


where ⋃_{ℓ∈z′(α)} Jℓ = [A(α), B(α)]. Note that A(α) and B(α) are within 2ω(δ) of g0(x) and g1(x), respectively, for all x ∈ Rα, if Rα ∈ P0′. Hence, if |f| ≤ M,

(4.32)    | ∫_{A(α)}^{B(α)} f(x, y) dy − ϕ(x) | ≤ 4M ω(δ).

Thus, with C = B − A + 4M,

(4.33)    | Σ_{ℓ∈z′(α)} f(ξαℓ) ℓ(Jℓ) − ϕ(x) | ≤ C ω(δ),    ∀ x ∈ Rα ∈ P0′.

Multiplying by Vn−1(Rα) and summing over Rα ∈ P0′, we have

(4.34)    | Σ_{Rαℓ∈P′} f(ξαℓ) Vn(Rαℓ) − Σ_{Rα∈P0′} ϕ(xα) Vn−1(Rα) | ≤ C V(R0) ω(δ),

where xα is an arbitrary point in Rα. Now, if P0 is a sufficiently fine partition of R0, it follows from the proof of Proposition 4.6 that the second sum in (4.34) is arbitrarily close to ∫_Σ ϕ dVn−1, since bΣ has content zero. Furthermore, an argument such as used to prove Proposition 4.7 shows that bΩ has content zero, and one verifies that, for a sufficiently fine partition, the first sum in (4.34) is arbitrarily close to ∫_Ω f dVn. This proves the desired identity (4.29).

We next take up the change of variables formula for multiple integrals, extending the one-variable formula, (0.42). We begin with a result on linear changes of variables. The set of invertible real n × n matrices is denoted Gl(n, R).

Proposition 4.10. Let f be a continuous function with compact support in Rn. If A ∈ Gl(n, R), then

(4.35)    ∫ f(x) dV = |det A| ∫ f(Ax) dV.

Proof. Let G be the set of elements A ∈ Gl(n, R) for which (4.35) is true. Clearly I ∈ G. Using det A⁻¹ = (det A)⁻¹, and det AB = (det A)(det B), we conclude that G is a subgroup of Gl(n, R). To prove the proposition, it will suffice to show that G contains all elements of the following 3 forms, since the method of applying elementary row operations to reduce a matrix shows that any element of Gl(n, R) is a product of a finite number of these elements. Here, {ej : 1 ≤ j ≤ n} denotes the standard basis of Rn, and σ a permutation of {1, . . . , n}:

(4.36)    A1 ej = eσ(j),
          A2 ej = cj ej,    cj ≠ 0,
          A3 e2 = e2 + c e1,    A3 ej = ej for j ≠ 2.

The first two cases are elementary consequences of the definition of the Riemann integral, and can be left as exercises.
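The formula (4.35) itself can also be corroborated numerically; the sketch below (an illustrative aside, not from the text) checks it for the matrix A(x, y) = (2x + y, y), with det A = 2, against a Gaussian integrand on a large box:

```python
# Check int f dV = |det A| int f(Ax) dV for A(x, y) = (2x + y, y),
# det A = 2, with f(x, y) = exp(-x^2 - y^2), using midpoint Riemann
# sums over the box [-10, 10]^2 (the tails outside are negligible).
import math

f = lambda x, y: math.exp(-x * x - y * y)
N = 400
h = 20.0 / N
I = IA = 0.0
for i in range(N):
    x = -10.0 + (i + 0.5) * h
    for j in range(N):
        y = -10.0 + (j + 0.5) * h
        I += f(x, y) * h * h
        IA += f(2 * x + y, y) * h * h
# I is ~ pi (compare (4.66) below), and I is ~ 2 * IA, matching (4.35)
```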


We show that (4.35) holds for transformations of the form A3 by using Theorem 4.9 (in a special case), to reduce it to the case n = 1. Given f ∈ C(R), compactly supported, and b ∈ R, we clearly have

(4.37)    ∫ f(x) dx = ∫ f(x + b) dx.

Now, for the case A = A3, with x = (x1, x′), we have

(4.38)    ∫ f(x1 + c x2, x′) dVn(x) = ∫ ( ∫ f(x1 + c x2, x′) dx1 ) dVn−1(x′)
                                    = ∫ ( ∫ f(x1, x′) dx1 ) dVn−1(x′),

the second identity by (4.37). Thus we get (4.35) in case A = A3, so the proposition is proved.

It is desirable to extend Proposition 4.10 to some discontinuous functions. Given a cell R and f : R → R, bounded, we say

(4.39)    f ∈ C(R) ⟺ the set of discontinuities of f is nil.

Proposition 4.6 implies

(4.40)    C(R) ⊂ R(R).

From the closure of the class of nil sets under finite unions it is clear that C(R) is closed under sums and products, i.e., that C(R) is an algebra of functions on R. We will denote by Cc(Rn) the set of functions f : Rn → R such that f has compact support and its set of discontinuities is nil. Any f ∈ Cc(Rn) is supported in some cell R, and f|_R ∈ C(R). The following result extends Proposition 4.10. A stronger extension will be given in Theorem 4.15.

Proposition 4.11. Given A ∈ Gl(n, R), the identity (4.35) holds for all f ∈ Cc(Rn).

Proof. Assume |f| ≤ M. Suppose supp f is in the interior of a cell R. Let S be the set of discontinuities of f. The function g(x) = f(Ax) has A⁻¹S as its set of discontinuities, which is also nil, by Proposition 4.8. Thus g ∈ Cc(Rn). Take ε > 0. Let P be a partition of R, of maxsize ≤ δ, such that S is covered by cells in P of total volume ≤ ε. Write P = P′ ∪ P′′ ∪ P′′′, where P′′ consists of cells meeting S, P′′′ consists of all cells touching cells in P′′, and P′ consists of all the rest. Then P′′ ∪ P′′′ consists of cells of total volume ≤ 3ⁿ ε. Thus if we set

(4.41)    B = ⋃_{Rα∈P′} Rα,

we have that B is compact, B ∩ S = ∅, and V(R \ B) ≤ 3ⁿ ε.


Now it is an exercise to construct a continuous function ϕ : R → R such that

(4.42)    ϕ = 1 on B,    ϕ = 0 near S,    0 ≤ ϕ ≤ 1,

and set

(4.43)    f = ϕf + (1 − ϕ)f = f1 + f2.

We see that f1 is continuous on Rn, with compact support, |f2| ≤ M, and f2 is supported on R \ B, which is a union of cells of total volume ≤ 3ⁿ ε. Now Proposition 4.10 applies directly to f1, to yield

(4.44)    ∫ f1(y) dV(y) = |det A| ∫ f1(Ax) dV(x).

We have f2 ∈ Cc(Rn), and g2(x) = f2(Ax) ∈ Cc(Rn). Also, g2 is supported by A⁻¹(R \ B), which by the estimate (4.26) has the property cont⁺(A⁻¹(R \ B)) ≤ (3 C1 √n)ⁿ ε. Thus

(4.45)    | ∫ f2(y) dV(y) | ≤ 3ⁿ M ε,    | ∫ f2(Ax) dV(x) | ≤ (3 C1 √n)ⁿ M ε.

Taking ε → 0, we have the proposition.

In particular, if Σ ⊂ Rn is a compact set that is contented, then its characteristic function χΣ, which has bΣ as its set of points of discontinuity, belongs to Cc(Rn), and we deduce:

Corollary 4.12. If Σ ⊂ Rn is a compact, contented set and A ∈ Gl(n, R), then A(Σ) = {Ax : x ∈ Σ} is contented, and

(4.46)    V(A(Σ)) = |det A| V(Σ).

We now extend Proposition 4.10 to nonlinear changes of variables.

Proposition 4.13. Let O and Ω be open in Rn, G : O → Ω a C^1 diffeomorphism, and f a continuous function with compact support in Ω. Then

(4.47)    ∫_Ω f(y) dV(y) = ∫_O f(G(x)) |det DG(x)| dV(x).

Proof. It suffices to prove the result under the additional assumption that f ≥ 0, which we make from here on. If K = supp f, put G⁻¹(K) inside a rectangle R and let P = {Rα} be a partition of R. Let A be the set of α such that Rα ⊂ O. For α ∈ A, G(Rα) ⊂ Ω. See Fig. 4.2. Pick P fine enough that α ∈ A whenever Rα ∩ G⁻¹(K) is nonempty. Note that bG(Rα) = G(bRα), so G(Rα) is contented, in view of Propositions 4.4 and 4.8.


Let ξα be the center of Rα, and let R̃α = Rα − ξα, a cell with center at the origin. Then

(4.48)    G(ξα) + DG(ξα)(R̃α) = ηα + Hα

is an n-dimensional parallelepiped which is close to G(Rα), if Rα is small enough. See Fig. 4.3. In fact, given ε > 0, if δ > 0 is small enough and maxsize(P) ≤ δ, then we have

(4.49)    ηα + (1 + ε)Hα ⊃ G(Rα),

for all α ∈ A. Now, by (4.46),

(4.50)    V(Hα) = |det DG(ξα)| V(Rα).

Hence

(4.51)    V(G(Rα)) ≤ (1 + ε)ⁿ |det DG(ξα)| V(Rα).

Now we have

(4.52)    ∫ f dV = Σ_{α∈A} ∫_{G(Rα)} f dV
                 ≤ Σ_{α∈A} sup_{Rα} f ◦ G(x) · V(G(Rα))
                 ≤ (1 + ε)ⁿ Σ_{α∈A} sup_{Rα} f ◦ G(x) · |det DG(ξα)| V(Rα).

To see that the first line of (4.52) holds, note that f χ_{G(Rα)} is Riemann integrable, by Proposition 4.6; note also that Σ_α f χ_{G(Rα)} = f except on a set of content zero. Then the additivity result in Proposition 4.2 applies. The first inequality in (4.52) is elementary; the second inequality uses (4.51) and f ≥ 0. If we set

(4.53)    h(x) = f ◦ G(x) |det DG(x)|,

then we have

(4.54)    sup_{Rα} f ◦ G(x) |det DG(ξα)| ≤ sup_{Rα} h(x) + M ω(δ),

provided |f| ≤ M and ω(δ) is a modulus of continuity for DG. Taking arbitrarily fine partitions, we get

(4.55)    ∫_Ω f dV ≤ ∫_O h dV.

If we apply this result, with G replaced by G⁻¹, O and Ω switched, and f replaced by h, given by (4.53), we have

(4.56)    ∫_O h dV ≤ ∫_Ω h ◦ G⁻¹(y) |det DG⁻¹(y)| dV(y) = ∫_Ω f dV.

The inequalities (4.55) and (4.56) together yield the identity (4.47).

Now the same argument used to establish Proposition 4.11 from Proposition 4.10 yields the following improvement of Proposition 4.13. We say f ∈ Cc(Ω) if f ∈ Cc(Rn) has support in an open set Ω.


Proposition 4.14. Let O and Ω be open in Rn, G : O → Ω a C^1 diffeomorphism, and f ∈ Cc(Ω). Then f ◦ G ∈ Cc(O), and

(4.57)    ∫_Ω f(y) dV(y) = ∫_O f(G(x)) |det DG(x)| dV(x).

Proposition 4.14 is adequate for garden-variety functions, but it is natural to extend it to more general Riemann-integrable functions. Say f ∈ Rc(Rn) if f has compact support, say in some cell R, and f ∈ R(R). If Ω ⊂ Rn is open and f ∈ Rc(Rn) has support in Ω, we say f ∈ Rc(Ω).

Theorem 4.15. Let O and Ω be open in Rn, G : O → Ω a C^1 diffeomorphism. If f ∈ Rc(Ω), then f ◦ G ∈ Rc(O), and (4.57) holds.

Proof. We can find a compact K ⊂ Ω, a sequence of partitions Pν of a cell containing K, and functions ψν, ϕν, constant on the interior of each cell in the partition Pν, such that supp ψν, ϕν ⊂ K,

    ψν ≤ f ≤ ϕν,

and, with A = ∫_Ω f(y) dV(y),

    A − 1/ν ≤ ∫ ψν(y) dV(y) ≤ A ≤ ∫ ϕν(y) dV(y) ≤ A + 1/ν.

Now ϕν and ψν belong to Cc(Ω), so Proposition 4.14 applies. Thus

    A − 1/ν ≤ ∫_O ψν(G(x)) |det DG(x)| dV(x) ≤ A ≤ ∫_O ϕν(G(x)) |det DG(x)| dV(x) ≤ A + 1/ν.

The proof is completed by the following result, whose proof in turn is an exercise:

Lemma 4.16. Let F : R → R be bounded, A ∈ R. Suppose that, for each ν ∈ Z⁺, there exist Ψν, Φν ∈ R(R) such that Ψν ≤ F ≤ Φν and

    A − 1/ν ≤ ∫_R Ψν(x) dV(x) ≤ ∫_R Φν(x) dV(x) ≤ A + 1/ν.

Then F ∈ R(R) and ∫_R F(x) dV(x) = A.

The most frequently invoked case of the change of variable formula, in the case n = 2, involves the following change from Cartesian to polar coordinates:

(4.58)    x = r cos θ,    y = r sin θ.

Thus, take G(r, θ) = (r cos θ, r sin θ). We have

(4.59)    DG(r, θ) = ( cos θ   −r sin θ
                       sin θ    r cos θ ),    det DG(r, θ) = r.

For example, if ρ ∈ (0, ∞) and

(4.60)    Dρ = {(x, y) ∈ R2 : x² + y² ≤ ρ²},

then, for f ∈ C(Dρ),

(4.61)    ∫_{Dρ} f(x, y) dA = ∫_0^ρ ∫_0^{2π} f(r cos θ, r sin θ) r dθ dr.
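As a quick check of (4.61) (an illustrative aside, with an integrand of our choosing): for f(x, y) = x² + y² and ρ = 1, the right-hand side is ∫_0^1 ∫_0^{2π} r² · r dθ dr = 2π/4 = π/2, which a midpoint rule in r reproduces:

```python
# Right side of (4.61) for f(x, y) = x^2 + y^2 on the unit disk:
# the theta-integral is exactly 2*pi, leaving 2*pi * int_0^1 r^3 dr.
import math

M = 1000
h = 1.0 / M
rhs = 2 * math.pi * sum(((i + 0.5) * h) ** 3 * h for i in range(M))
# rhs is ~ pi / 2
```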

To get this, we first apply Proposition 4.14, with O = [ε, ρ] × [0, 2π − ε], then apply Theorem 4.9, then let ε ↘ 0.

It is often useful to integrate a function whose support is not bounded. Generally, given a bounded function f : Rn → R, we say f ∈ R(Rn) provided f|_R ∈ R(R) for each cell R ⊂ Rn, and ∫_R |f| dV ≤ C, for some C < ∞, independent of R. If f ∈ R(Rn), we set

(4.62)    ∫_{Rn} f dV = lim_{s→∞} ∫_{Rs} f dV,    Rs = {x ∈ Rn : |xj| ≤ s, ∀ j}.

The existence of the limit in (4.62) can be established as follows. If M < N, then

    ∫_{R_N} f dV − ∫_{R_M} f dV = ∫_{R_N \ R_M} f dV,

which is dominated in absolute value by ∫_{R_N \ R_M} |f| dV. If f ∈ R(Rn), then a_N = ∫_{R_N} |f| dV is a bounded monotone sequence, which hence converges, so

    ∫_{R_N \ R_M} |f| dV = ∫_{R_N} |f| dV − ∫_{R_M} |f| dV → 0, as M, N → ∞.

It can be shown that, if Kν is any sequence of compact contented subsets of Rn such that each Rs, for s < ∞, is contained in all Kν for ν sufficiently large, i.e., ν ≥ N(s), then, whenever f ∈ R(Rn),

(4.63)    ∫_{Rn} f dV = lim_{ν→∞} ∫_{Kν} f dV.

Change of variables formulas and Fubini's Theorem extend to this case. For example, the limiting case of (4.61) as ρ → ∞ is

(4.64)    f ∈ C(R2) ∩ R(R2) ⟹ ∫_{R2} f(x, y) dA = ∫_0^∞ ∫_0^{2π} f(r cos θ, r sin θ) r dθ dr.

The following is a good example. Take f(x, y) = e^{−x²−y²}. We have

(4.65)    ∫_{R2} e^{−x²−y²} dA = ∫_0^∞ ∫_0^{2π} e^{−r²} r dθ dr = 2π ∫_0^∞ e^{−r²} r dr.

Now, methods of §0 allow the substitution s = r², so

    ∫_0^∞ e^{−r²} r dr = (1/2) ∫_0^∞ e^{−s} ds = 1/2.

Hence

(4.66)    ∫_{R2} e^{−x²−y²} dA = π.

On the other hand, Theorem 4.9 extends to give

(4.67)    ∫_{R2} e^{−x²−y²} dA = ∫_{−∞}^{∞} ( ∫_{−∞}^{∞} e^{−x²−y²} dy ) dx
                               = ( ∫_{−∞}^{∞} e^{−x²} dx ) ( ∫_{−∞}^{∞} e^{−y²} dy ).

Note that the two factors in the last product are equal. We deduce that

(4.68)    ∫_{−∞}^{∞} e^{−x²} dx = √π.

We can generalize (4.67), to obtain (via (4.68))

(4.69)    ∫_{Rn} e^{−|x|²} dVn = ( ∫_{−∞}^{∞} e^{−x²} dx )ⁿ = π^{n/2}.
The integrals (4.65)–(4.69) are called Gaussian integrals, and their evaluation has many uses. We shall see some in §5.
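As a numerical sanity check (not part of the text), (4.68) and (4.69) can be verified by a midpoint rule; since e^{−x²} decays very fast, truncating the line to [−L, L] with L = 8 already gives the integral essentially exactly:

```python
import math

# Midpoint-rule approximation of ∫_{-L}^{L} e^{-x^2} dx
def gauss_1d(L=8.0, n=20000):
    h = 2 * L / n
    return sum(math.exp(-(-L + (i + 0.5) * h) ** 2) for i in range(n)) * h

print(gauss_1d(), math.sqrt(math.pi))   # (4.68): both ≈ 1.77245
print(gauss_1d() ** 3, math.pi ** 1.5)  # (4.69) with n = 3: both ≈ 5.56833
```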

Exercises

1. Show that any two partitions of a cell R have a common refinement.
Hint. Consider the argument given for the one-dimensional case in §0.

2. Write down a careful proof of the identity (4.16), i.e., cont⁺(S) = cont⁺(S̄), where S̄ denotes the closure of S.


3. Write down the details of the argument giving (4.25), on the independence of the integral from the choice of cell.

4. Write down a direct proof that the transformation formula (4.35) holds for those linear transformations of the form A₁ and A₂ in (4.36). Compare Exercise 1 of §0.

5. Show that there is a continuous function ϕ having the properties in (4.42).
Hint. To start, let K be a compact neighborhood of S, disjoint from B, and let ψ(x) = dist(x, K). Alter this to get ϕ.

6. Write down the details for a proof of Proposition 4.14.

7. Consider spherical polar coordinates on R³, given by

x = ρ sin ϕ cos θ,  y = ρ sin ϕ sin θ,  z = ρ cos ϕ,

i.e., take G(ρ, ϕ, θ) = (ρ sin ϕ cos θ, ρ sin ϕ sin θ, ρ cos ϕ). Show that

det DG(ρ, ϕ, θ) = ρ² sin ϕ.

Use this to compute the volume of the unit ball in R³.

8. Prove Lemma 4.16.

9. Prove the following variant of Lemma 4.16:

Lemma 4.16A. Let F : R → R be bounded. Suppose there exist Ψ_ν, Φ_ν ∈ R(R) such that

Ψ_ν ≤ F ≤ Φ_ν,  ∫_R Φ_ν(x) dV(x) − ∫_R Ψ_ν(x) dV(x) ≤ 1/ν.

Then F ∈ R(R).

10. Given f₁, f₂ ∈ R(R), prove that f₁f₂ ∈ R(R).
Hint. Considering the effect of adding constants, reduce to considering positive f_j. Take partitions P_ν and positive functions ψ_{jν}, ϕ_{jν}, constant on the interior of each cell in P_ν, such that

0 < A ≤ ψ_{jν} ≤ f_j ≤ ϕ_{jν} ≤ B

and

∫ ψ_{jν}(x) dV(x), ∫ ϕ_{jν}(x) dV(x) → ∫ f_j(x) dV(x).

Show that the result of Exercise 9 holds for F = f₁f₂, Ψ_ν = ψ_{1ν}ψ_{2ν}, Φ_ν = ϕ_{1ν}ϕ_{2ν}.


11. If R is a cell, S ⊂ R is a contented set, and f ∈ R(R), define

∫_S f(x) dV(x) = ∫_R χ_S(x) f(x) dV(x).

Show that, if S_j ⊂ R are contented and disjoint (or, more generally, cont⁺(S₁ ∩ S₂) = 0), then, for f ∈ R(R),

∫_{S₁∪S₂} f(x) dV(x) = ∫_{S₁} f(x) dV(x) + ∫_{S₂} f(x) dV(x).

12. Establish the convergence result (4.63), for all f ∈ R(Rⁿ).

13. Extend the change of variable formula to elements of R(Rⁿ).

14. Let f : Rⁿ → R be unbounded. For s ∈ (0, ∞), set

f_s(x) = f(x) if |f(x)| ≤ s,  f_s(x) = 0 otherwise.

Assume f_s ∈ R(Rⁿ) for each s < ∞, and assume

∫_{Rⁿ} |f_s(x)| dx ≤ B < ∞, ∀ s < ∞.

In such a case, say f ∈ R#(Rⁿ). Set

∫_{Rⁿ} f(x) dx = lim_{s→∞} ∫_{Rⁿ} f_s(x) dx.

Show that this limit exists whenever f ∈ R#(Rⁿ).

15. Extend the change of variable formula to functions in R#(Rⁿ). Show that

f(x) = |x|^{−a} e^{−|x|²} ∈ R#(Rⁿ) ⟺ a < n.

16. Theorem 4.9, relating multiple integrals and iterated integrals, played the following role in the proof of the change of variable formula (4.47). Namely, it was used to establish the identity (4.50) for the volume of the parallelepiped H_α, via an appeal to Corollary 4.12, hence to Proposition 4.10, whose proof relied on Theorem 4.9. Try to establish Corollary 4.12 directly, without using Theorem 4.9, in the case when Σ is either a cell or the image of a cell under an element of Gl(n, R).


17. Assume f ∈ R(R), |f | ≤ M , and let ϕ : [−M, M ] → R be Lipschitz and monotone. Show directly from the definition that ϕ ◦ f ∈ R(R). 18. If ϕ : [−M, M ] → R is continuous and piecewise linear, show that you can write ϕ = ϕ1 − ϕ2 with ϕj Lipschitz and monotone. Deduce that f ∈ R(R) ⇒ ϕ ◦ f ∈ R(R) when ϕ is piecewise linear. 19. Assume uν ∈ R(R) and that uν → u uniformly on R. Show that u ∈ R(R). Deduce that if f ∈ R(R), |f | ≤ M , and ψ : [−M, M ] → R is continuous, then ψ ◦ f ∈ R(R).

Exercises on row reduction and matrix products

We consider the following three types of row operations on an n × n matrix A = (a_{jk}). If σ is a permutation of {1, . . . , n}, let

ρ_σ(A) = (a_{σ(j)k}).

If c = (c₁, . . . , cₙ), and all c_j are nonzero, set

μ_c(A) = (c_j^{−1} a_{jk}).

Finally, if c ∈ R and μ ≠ ν, define

ε_{μνc}(A) = (b_{jk}),  b_{νk} = a_{νk} − c a_{μk},  b_{jk} = a_{jk} for j ≠ ν.

We relate these operations to left multiplication by matrices P_σ, M_c, and E_{μνc}, defined by the following actions on the standard basis {e₁, . . . , eₙ} of Rⁿ:

P_σ e_j = e_{σ(j)},  M_c e_j = c_j e_j,

and

E_{μνc} e_μ = e_μ + c e_ν,  E_{μνc} e_j = e_j for j ≠ μ.

1. Show that

A = P_σ ρ_σ(A),  A = M_c μ_c(A),  A = E_{μνc} ε_{μνc}(A).

2. Show that P_σ^{−1} = P_{σ^{−1}}.

3. Show that, if μ ≠ ν, then E_{μνc} = P_σ^{−1} E_{21c} P_σ, for some permutation σ.

4. If B = ρ_σ(A) and C = μ_c(B), show that A = P_σ M_c C. Generalize this to other cases where a matrix C is obtained from a matrix A via a sequence of row operations.

5. If A is an invertible, real n × n matrix (i.e., A ∈ Gl(n, R)), then the rows of A form a basis of Rⁿ. Use this to show that A can be transformed to the identity matrix via a sequence of row operations. Deduce that any A ∈ Gl(n, R) can be written as a finite product of matrices of the form P_σ, M_c, and E_{μνc}, hence as a finite product of matrices of the form listed in (4.36).
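The identities of Exercise 1 can be sanity-checked numerically. The following plain-Python sketch (the 3 × 3 example matrix is arbitrary) verifies A = P_σ ρ_σ(A) and A = E_{μνc} ε_{μνc}(A):

```python
# Check of the identities A = P_sigma rho_sigma(A) and
# A = E_{mu,nu,c} eps_{mu,nu,c}(A) on a 3x3 example.
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2.0, 1.0, 0.0], [4.0, 3.0, 1.0], [0.0, 5.0, 6.0]]
n = 3

# rho_sigma permutes rows: rho_sigma(A)_{jk} = a_{sigma(j)k}
sigma = [1, 2, 0]                 # sigma(0)=1, sigma(1)=2, sigma(2)=0
rho = [A[sigma[j]] for j in range(n)]
# P_sigma e_j = e_{sigma(j)}, i.e. (P_sigma)_{ij} = 1 iff i = sigma(j)
P = [[1.0 if i == sigma[j] else 0.0 for j in range(n)] for i in range(n)]
assert matmul(P, rho) == A

# eps subtracts c * (row mu) from row nu; E adds it back
mu, nu, c = 0, 2, 2.5
eps = [row[:] for row in A]
eps[nu] = [A[nu][k] - c * A[mu][k] for k in range(n)]
E = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
E[nu][mu] = c                     # E e_mu = e_mu + c e_nu
assert matmul(E, eps) == A
print("row-operation identities verified")
```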


5. Integration on surfaces

A smooth m-dimensional surface M ⊂ Rⁿ is characterized by the following property. Given p ∈ M, there is a neighborhood U of p in M and a smooth map ϕ : O → U, from an open set O ⊂ Rᵐ bijectively to U, with injective derivative at each point. Such a map ϕ is called a coordinate chart on M. We call U ⊂ M a coordinate patch. If all such maps ϕ are smooth of class C^k, we say M is a surface of class C^k. In §8 we will define analogous notions of a C^k surface with boundary, and of a C^k surface with corners.

There is an abstraction of the notion of a surface, namely the notion of a "manifold," which is useful in more advanced studies. A treatment of calculus on manifolds can be found in [Spi].

If ϕ : O → U is a C^k coordinate chart, such as described above, and ϕ(x₀) = p, we set

(5.1)  T_p M = Range Dϕ(x₀),

a linear subspace of Rⁿ of dimension m, and we denote by N_p M its orthogonal complement. It is useful to consider the following map. Pick a linear isomorphism A : R^{n−m} → N_p M, and define

(5.2)  Φ : O × R^{n−m} → Rⁿ,  Φ(x, z) = ϕ(x) + Az.

Thus Φ is a C^k map defined on an open subset of Rⁿ. Note that

(5.3)  DΦ(x₀, 0)(v, w) = Dϕ(x₀)v + Aw,

so DΦ(x₀, 0) : Rⁿ → Rⁿ is surjective, hence bijective, so the Inverse Function Theorem applies; Φ maps some neighborhood of (x₀, 0) diffeomorphically onto a neighborhood of p ∈ Rⁿ.

Suppose there is another C^k coordinate chart, ψ : Ω → U. Since ϕ and ψ are by hypothesis one-to-one and onto, it follows that F = ψ^{−1} ∘ ϕ : O → Ω is a well defined map, which is one-to-one and onto. See Fig. 5.1. In fact, we can say more.

Lemma 5.1. Under the hypotheses above, F is a C^k diffeomorphism.

Proof. It suffices to show that F and F^{−1} are C^k on a neighborhood of x₀ and y₀, respectively, where ϕ(x₀) = ψ(y₀) = p. Let us define a map Ψ in a fashion similar to (5.2). To be precise, we set T̃_p M = Range Dψ(y₀), and let Ñ_p M be its orthogonal complement. (Shortly we will show that T̃_p M = T_p M, but we are not quite ready for that.) Then pick a linear isomorphism B : R^{n−m} → Ñ_p M and set Ψ(y, z) = ψ(y) + Bz, for (y, z) ∈ Ω × R^{n−m}. Again, Ψ is a C^k diffeomorphism from a neighborhood of (y₀, 0) onto a neighborhood of p. It follows that Ψ^{−1} ∘ Φ is a C^k diffeomorphism from a neighborhood of (x₀, 0) onto a neighborhood of (y₀, 0). Now note that, for x close to x₀ and y close to y₀,

(5.4)  Ψ^{−1} ∘ Φ(x, 0) = (F(x), 0),  Φ^{−1} ∘ Ψ(y, 0) = (F^{−1}(y), 0).


These identities imply that F and F^{−1} have the desired regularity.

Thus, when there are two such coordinate charts, ϕ : O → U, ψ : Ω → U, we have a C^k diffeomorphism F : O → Ω such that

(5.5)  ϕ = ψ ∘ F.

By the chain rule,

(5.6)  Dϕ(x) = Dψ(y) DF(x),  y = F(x).

In particular this implies that Range Dϕ(x₀) = Range Dψ(y₀), so T_p M in (5.1) is independent of the choice of coordinate chart. It is called the tangent space to M at p.

We next define an object called the metric tensor on M. Given a coordinate chart ϕ : O → U, there is associated an m × m matrix G(x) = (g_{jk}(x)) of functions on O, defined in terms of the inner product of vectors tangent to M:

(5.7)  g_{jk}(x) = Dϕ(x)e_j · Dϕ(x)e_k = ∂ϕ/∂x_j · ∂ϕ/∂x_k = Σ_{ℓ=1}^n (∂ϕ_ℓ/∂x_j)(∂ϕ_ℓ/∂x_k),

where {e_j : 1 ≤ j ≤ m} is the standard orthonormal basis of Rᵐ. Equivalently,

(5.8)  G(x) = Dϕ(x)ᵗ Dϕ(x).

We call (g_{jk}) the metric tensor of M, on U, with respect to the coordinate chart ϕ : O → U. Note that this matrix is positive-definite. From a coordinate-independent point of view, the metric tensor on M specifies inner products of vectors tangent to M, using the inner product of Rⁿ.

If we take another coordinate chart ψ : Ω → U, we want to compare (g_{jk}) with H = (h_{jk}), given by

(5.9)  h_{jk}(y) = Dψ(y)e_j · Dψ(y)e_k,  i.e., H(y) = Dψ(y)ᵗ Dψ(y).

As seen above, we have a diffeomorphism F : O → Ω such that (5.5)–(5.6) hold. Consequently,

(5.10)  G(x) = DF(x)ᵗ H(y) DF(x),

or equivalently,

(5.11)  g_{jk}(x) = Σ_{i,ℓ} (∂F_i/∂x_j)(∂F_ℓ/∂x_k) h_{iℓ}(y).

We now define the notion of surface integral on M. If f : M → R is a continuous function supported on U, we set

(5.12)  ∫_M f dS = ∫_O f ∘ ϕ(x) √g(x) dx,


where

(5.13)  g(x) = det G(x).

We need to know that this is independent of the choice of coordinate chart ϕ : O → U. Thus, if we use ψ : Ω → U instead, we want to show that (5.12) is equal to ∫_Ω f ∘ ψ(y) √h(y) dy, where h(y) = det H(y). Indeed, since f ∘ ψ ∘ F = f ∘ ϕ, we can apply the change of variable formula, Theorem 4.14, to get

(5.14)  ∫_Ω f ∘ ψ(y) √h(y) dy = ∫_O f ∘ ϕ(x) √(h(F(x))) |det DF(x)| dx.

Now, (5.10) implies that

(5.15)  √g(x) = |det DF(x)| √h(y),

so the right side of (5.14) is seen to be equal to (5.12), and our surface integral is well defined, at least for f supported in a coordinate patch. More generally, if f : M → R has compact support, write it as a finite sum of terms, each supported on a coordinate patch, and use (5.12) on each patch.

Let us pause to consider the special cases m = 1 and m = 2. For m = 1, we are considering a curve in Rⁿ, say ϕ : [a, b] → Rⁿ. Then G(x) is a 1 × 1 matrix, namely G(x) = |ϕ′(x)|². If we denote the curve in Rⁿ by γ, rather than M, the formula (5.12) becomes

(5.16)  ∫_γ f ds = ∫_a^b f ∘ ϕ(x) |ϕ′(x)| dx.

In case m = 2, let us consider a surface M ⊂ R³, with a coordinate chart ϕ : O → U ⊂ M. For f supported in U, an alternative way to write the surface integral is

(5.17)  ∫_M f dS = ∫_O f ∘ ϕ(x) |∂₁ϕ × ∂₂ϕ| dx₁ dx₂,

where u × v is the cross product of vectors u and v in R³. To see this, we compare this integrand with the one in (5.12). In this case,

(5.18)  g = det \begin{pmatrix} ∂₁ϕ·∂₁ϕ & ∂₁ϕ·∂₂ϕ \\ ∂₂ϕ·∂₁ϕ & ∂₂ϕ·∂₂ϕ \end{pmatrix} = |∂₁ϕ|²|∂₂ϕ|² − (∂₁ϕ·∂₂ϕ)².

Recall from (1.47) that |u × v| = |u||v||sin θ|, where θ is the angle between u and v. Equivalently, since u · v = |u||v| cos θ,

(5.19)  |u × v|² = |u|²|v|²(1 − cos²θ) = |u|²|v|² − (u·v)².
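The identity |∂₁ϕ × ∂₂ϕ|² = g of (5.18)–(5.19) can be checked numerically for a concrete chart. The sketch below (the spherical-angle chart on S² is just a convenient example) compares the two sides at a sample point, using centered differences for the partial derivatives:

```python
import math

# Chart on the unit sphere: phi(x1, x2) = (sin x1 cos x2, sin x1 sin x2, cos x1)
def phi(x1, x2):
    return (math.sin(x1) * math.cos(x2),
            math.sin(x1) * math.sin(x2),
            math.cos(x1))

def partial(f, x1, x2, i, h=1e-6):
    # centered difference in the i-th variable
    if i == 1:
        a, b = f(x1 + h, x2), f(x1 - h, x2)
    else:
        a, b = f(x1, x2 + h), f(x1, x2 - h)
    return tuple((u - v) / (2 * h) for u, v in zip(a, b))

def dot(u, v): return sum(a * b for a, b in zip(u, v))
def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

x1, x2 = 0.7, 1.3
d1, d2 = partial(phi, x1, x2, 1), partial(phi, x1, x2, 2)
g = dot(d1, d1) * dot(d2, d2) - dot(d1, d2) ** 2    # det G, by (5.18)
print(g, dot(cross(d1, d2), cross(d1, d2)))          # the two agree
```

For this chart one also knows g = sin² x₁ exactly, which the computation reproduces.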


Thus we see that |∂₁ϕ × ∂₂ϕ| = √g in this case, and (5.17) is equivalent to (5.12).

An important class of surfaces is the class of graphs of smooth functions. Let u ∈ C¹(Ω), for an open Ω ⊂ R^{n−1}, and let M be the graph of z = u(x). The map ϕ(x) = (x, u(x)) provides a natural coordinate system, in which the metric tensor is given by

(5.20)  g_{jk}(x) = δ_{jk} + (∂u/∂x_j)(∂u/∂x_k).

If u is C¹, we see that g_{jk} is continuous. To calculate g = det(g_{jk}) at a given point p ∈ Ω, if ∇u(p) ≠ 0, rotate coordinates so that ∇u(p) is parallel to the x₁ axis. We see that

(5.21)  √g = (1 + |∇u|²)^{1/2}.

In particular, the (n − 1)-dimensional volume of the surface M is given by

(5.22)  V_{n−1}(M) = ∫_M dS = ∫_Ω (1 + |∇u(x)|²)^{1/2} dx.

Particularly important examples of surfaces are the unit spheres S^{n−1} in Rⁿ,

(5.23)  S^{n−1} = {x ∈ Rⁿ : |x| = 1}.

Spherical polar coordinates on Rⁿ are defined in terms of a smooth diffeomorphism

(5.24)  R : (0, ∞) × S^{n−1} → Rⁿ \ 0,  R(r, ω) = rω.

Let (h_{ℓm}) denote the metric tensor on S^{n−1} (induced from its inclusion in Rⁿ) with respect to some coordinate chart ϕ : O → U ⊂ S^{n−1}. Then, with respect to the coordinate chart Φ : (0, ∞) × O → U ⊂ Rⁿ given by Φ(r, y) = rϕ(y), the Euclidean metric tensor can be written

(5.25)  (e_{jk}) = \begin{pmatrix} 1 & 0 \\ 0 & r² h_{ℓm} \end{pmatrix}.

To see that the off-diagonal terms vanish, i.e., ∂_r Φ · ∂_{x_j} Φ = 0, note that ϕ(x) · ϕ(x) = 1 ⇒ ∂_{x_j} ϕ(x) · ϕ(x) = 0. Now (5.25) yields

(5.26)  √e = r^{n−1} √h.

We therefore have the following result for integrating a function in spherical polar coordinates:

(5.27)  ∫_{Rⁿ} f(x) dx = ∫_{S^{n−1}} [ ∫_0^∞ f(rω) r^{n−1} dr ] dS(ω).
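As a numerical check of (5.27) (not part of the text), take n = 3 and a radial integrand f(x) = e^{−|x|²}. The sphere factor contributes the area A₂ = 4π of S², so the right side becomes 4π ∫₀^∞ e^{−r²} r² dr, which should reproduce π^{3/2}, by (4.69):

```python
import math

# Midpoint-rule approximation of ∫_0^∞ e^{-r^2} r^2 dr (truncated at L)
def radial_part(L=10.0, n=20000):
    h = L / n
    return sum(math.exp(-r * r) * r * r
               for r in ((i + 0.5) * h for i in range(n))) * h

lhs = 4 * math.pi * radial_part()
print(lhs, math.pi ** 1.5)   # both ≈ 5.5683
```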


We next compute the (n − 1)-dimensional area A_{n−1} of the unit sphere S^{n−1} ⊂ Rⁿ, using (5.27) together with the computation

(5.28)  ∫_{Rⁿ} e^{−|x|²} dx = π^{n/2},

from (4.69). First note that, whenever f(x) = ϕ(|x|), (5.27) yields

(5.29)  ∫_{Rⁿ} ϕ(|x|) dx = A_{n−1} ∫_0^∞ ϕ(r) r^{n−1} dr.

In particular, taking ϕ(r) = e^{−r²} and using (5.28), we have

(5.30)  π^{n/2} = A_{n−1} ∫_0^∞ e^{−r²} r^{n−1} dr = (1/2) A_{n−1} ∫_0^∞ e^{−s} s^{n/2−1} ds,

where we used the substitution s = r² to get the last identity. We hence have

(5.31)  A_{n−1} = 2π^{n/2} / Γ(n/2),

where Γ(z) is Euler's Gamma function, defined for z > 0 by

(5.32)  Γ(z) = ∫_0^∞ e^{−s} s^{z−1} ds.

We need to complement (5.31) with some results on Γ(z) allowing a computation of Γ(n/2) in terms of more familiar quantities. Of course, setting z = 1 in (5.32), we immediately get

(5.33)  Γ(1) = 1.

Also, setting n = 1 in (5.30), we have

π^{1/2} = 2 ∫_0^∞ e^{−r²} dr = ∫_0^∞ e^{−s} s^{−1/2} ds,

or

(5.34)  Γ(1/2) = π^{1/2}.

We can proceed inductively from (5.33)–(5.34) to a formula for Γ(n/2) for any n ∈ Z⁺, using the following.


Lemma 5.2. For all z > 0,

(5.35)  Γ(z + 1) = zΓ(z).

Proof. We can write

Γ(z + 1) = − ∫_0^∞ (d/ds)(e^{−s}) s^z ds = ∫_0^∞ e^{−s} (d/ds)(s^z) ds,

the last identity by integration by parts. The last expression here is seen to equal the right side of (5.35).

Consequently, for k ∈ Z⁺,

(5.36)  Γ(k) = (k − 1)!,  Γ(k + 1/2) = (k − 1/2)(k − 3/2) ⋯ (1/2) π^{1/2}.

Thus (5.31) can be rewritten

(5.37)  A_{2k−1} = 2π^k / (k − 1)!,  A_{2k} = 2π^k / [(k − 1/2)(k − 3/2) ⋯ (1/2)].
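The formula (5.31) (equivalently (5.37)) can be checked against the familiar low-dimensional values A₁ = 2π (circle), A₂ = 4π (sphere in R³), and A₃ = 2π². A quick numerical check, using the standard library's Gamma function:

```python
import math

# (5.31): A_{n-1} = 2 * pi^{n/2} / Gamma(n/2)
def area(n):  # (n-1)-dimensional area of S^{n-1} in R^n
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)

print(area(2), 2 * math.pi)        # ≈ 6.2832
print(area(3), 4 * math.pi)        # ≈ 12.566
print(area(4), 2 * math.pi ** 2)   # ≈ 19.739
```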

We discuss another important example of a smooth surface, in the space M(n) ≈ R^{n²} of real n × n matrices, namely SO(n), the set of matrices T ∈ M(n) satisfying TᵗT = I and det T > 0 (hence det T = 1). The exponential map Exp : M(n) → M(n) defined by Exp(A) = e^A has the property

(5.38)  Exp : Skew(n) → SO(n),

where Skew(n) is the set of skew-symmetric matrices in M(n). Also D Exp(0)A = A; hence

(5.39)  D Exp(0) = ι : Skew(n) ↪ M(n).

It follows from the Inverse Function Theorem that there is a neighborhood O of 0 in Skew(n) which is mapped by Exp diffeomorphically onto a smooth surface U ⊂ M(n), of dimension m = n(n − 1)/2. Furthermore, U is a neighborhood of I in SO(n). For general T ∈ SO(n), we can define maps

(5.40)  ϕ_T : O → SO(n),  ϕ_T(A) = T Exp(A),

and obtain coordinate charts in SO(n), which is consequently a smooth surface of dimension n(n − 1)/2 in M(n). Note that SO(n) is a closed bounded subset of M(n); hence it is compact.

We use the inner product on M(n) computed componentwise; equivalently,

(5.41)  ⟨A, B⟩ = Tr(BᵗA) = Tr(BAᵗ).

This produces a metric tensor on SO(n). The surface integral over SO(n) has the following important invariance property.
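The mapping property (5.38) can be tested numerically: exponentiating a skew-symmetric matrix should produce an orthogonal matrix. A plain-Python sketch for n = 3, computing e^A by its power series (the sample matrix A is arbitrary):

```python
# Check of (5.38): exp of a skew-symmetric matrix satisfies T^t T = I.
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=30):
    n = len(A)
    E = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    P = [row[:] for row in E]
    for k in range(1, terms):  # accumulate sum of A^k / k!
        P = [[v / k for v in row] for row in matmul(P, A)]
        E = [[E[i][j] + P[i][j] for j in range(n)] for i in range(n)]
    return E

A = [[0.0, 0.3, -0.5], [-0.3, 0.0, 0.7], [0.5, -0.7, 0.0]]  # skew-symmetric
T = expm(A)
Tt = [list(col) for col in zip(*T)]
TtT = matmul(Tt, T)
# T^t T should be the identity, up to truncation/rounding error
err = max(abs(TtT[i][j] - float(i == j)) for i in range(3) for j in range(3))
print(err)   # tiny
```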


Proposition 5.3. Given f ∈ C(SO(n)), if we set

(5.42)  ρ_T f(X) = f(XT),  λ_T f(X) = f(TX),

for T, X ∈ SO(n), we have

(5.43)  ∫_{SO(n)} ρ_T f dS = ∫_{SO(n)} λ_T f dS = ∫_{SO(n)} f dS.

Proof. Given T ∈ SO(n), the maps R_T, L_T : M(n) → M(n) defined by R_T(X) = XT, L_T(X) = TX are easily seen from (5.41) to be isometries. Thus they yield maps of SO(n) to itself which preserve the metric tensor, proving (5.43).

Since SO(n) is compact, its total volume V(SO(n)) = ∫_{SO(n)} 1 dS is finite. We define the integral with respect to "Haar measure"

(5.44)  ∫_{SO(n)} f(g) dg = (1 / V(SO(n))) ∫_{SO(n)} f dS.

This is used in many arguments involving "averaging over rotations."

Exercises

1. Define ϕ : [0, θ] → R² by ϕ(t) = (cos t, sin t). Show that, if 0 < θ ≤ 2π, the image of [0, θ] under ϕ is an arc of the unit circle, of length θ. Deduce that the unit circle in R² has total length 2π. This result follows also from (5.37).
Remark. Use the definition of π given in the auxiliary problem set after §3. This length formula provided the original definition of π, in ancient Greek geometry.

2. Compute the volume of the unit ball Bⁿ = {x ∈ Rⁿ : |x| ≤ 1}.
Hint. Apply (5.29) with ϕ = χ_{[0,1]}.

3. Taking the upper half of the sphere Sⁿ to be the graph of x_{n+1} = (1 − |x|²)^{1/2}, for x ∈ Bⁿ, the unit ball in Rⁿ, deduce from (5.22) and (5.29) that

A_n = 2A_{n−1} ∫_0^1 r^{n−1} / √(1 − r²) dr = 2A_{n−1} ∫_0^{π/2} (sin θ)^{n−1} dθ.

Use this to get an alternative derivation of the formula (5.37) for A_n.
Hint. Rewrite this formula as

A_n = A_{n−1} b_{n−1},  b_k = ∫_0^π sin^k θ dθ.

To analyze b_k, you can write, on the one hand,

b_{k+2} = b_k − ∫_0^π sin^k θ cos² θ dθ,

and on the other, using integration by parts,

b_{k+2} = ∫_0^π cos θ (d/dθ)(sin^{k+1} θ) dθ.

Deduce that

b_{k+2} = ((k + 1)/(k + 2)) b_k.
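The recursion b_{k+2} = ((k+1)/(k+2)) b_k from Exercise 3 is easy to confirm numerically, as a sanity check on the integration-by-parts argument:

```python
import math

# b_k = ∫_0^π sin^k θ dθ, by the midpoint rule
def b(k, n=20000):
    h = math.pi / n
    return sum(math.sin((i + 0.5) * h) ** k for i in range(n)) * h

for k in range(6):
    assert abs(b(k + 2) - (k + 1) / (k + 2) * b(k)) < 1e-6
print(b(1), 2.0)   # b_1 = 2 exactly
```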

4. Suppose M is a surface in Rⁿ of dimension 2, and ϕ : O → U ⊂ M is a coordinate chart, with O ⊂ R². Set ϕ_{jk}(x) = (ϕ_j(x), ϕ_k(x)), so ϕ_{jk} : O → R². Show that the formula (5.12) for the surface integral is equivalent to

∫_M f dS = ∫_O f ∘ ϕ(x) √( Σ_{j<k} (det Dϕ_{jk}(x))² ) dx.

… ϕ(x) > 0 for x ∈ Rⁿ \ U, and grad ϕ(x) ≠ 0 for x ∈ ∂U, so grad ϕ points out of U. Show that the natural orientation on ∂U, as defined just before Proposition 8.1, is the same as the following. The equivalence class of forms β ∈ Λ^{n−1}(∂U) defining the orientation on ∂U satisfies the property that dϕ ∧ β is a positive multiple of dx₁ ∧ ⋯ ∧ dxₙ, on ∂U.

3. Suppose U = {x ∈ Rⁿ : xₙ < 0}. Show that the orientation on ∂U described above is that of (−1)^{n−1} dx₁ ∧ ⋯ ∧ dx_{n−1}. If V = {x ∈ Rⁿ : xₙ > 0}, what orientation does ∂V inherit?

4. Extend the special case of Poincaré's Lemma given in Proposition 7.1 to the case where α is a closed 1-form on B = {x ∈ Rⁿ : |x| < 1}, i.e., from the case dim B = 2 to higher dimensions.


9. The classical Gauss, Green, and Stokes formulas

The case of (8.1) where S = Ω is a region in R² with smooth boundary yields the classical Green Theorem. In this case, we have

(9.1)  β = f dx + g dy,  dβ = (∂g/∂x − ∂f/∂y) dx ∧ dy,

and hence (8.1) becomes the following.

Proposition 9.1. If Ω is a region in R² with smooth boundary, and f and g are smooth functions on Ω, which vanish outside some compact set in Ω, then

(9.2)  ∫∫_Ω (∂g/∂x − ∂f/∂y) dx dy = ∫_{∂Ω} (f dx + g dy).

Note that, if we have a vector field X = X₁ ∂/∂x + X₂ ∂/∂y on Ω, then the integrand on the left side of (9.2) is

(9.3)  ∂X₁/∂x + ∂X₂/∂y = div X,

provided g = X₁, f = −X₂. We obtain

(9.4)  ∫∫_Ω (div X) dx dy = ∫_{∂Ω} (−X₂ dx + X₁ dy).

If ∂Ω is parametrized by arc-length, as γ(s) = (x(s), y(s)), with orientation as defined for Proposition 8.1, then the unit normal ν to ∂Ω, pointing out of Ω, is given by ν(s) = (ẏ(s), −ẋ(s)), and (9.4) is equivalent to

(9.5)  ∫∫_Ω (div X) dx dy = ∫_{∂Ω} ⟨X, ν⟩ ds.

This is a special case of Gauss' Divergence Theorem. We now derive a more general form of the Divergence Theorem. We begin with a definition of the divergence of a vector field on a surface M. Let M be a region in Rⁿ, or an n-dimensional surface in R^{n+k}, provided with a volume form

(9.6)  ω_M ∈ Λⁿ M.


Let X be a vector field on M. Then the divergence of X, denoted div X, is a function on M given by

(9.7)  (div X) ω_M = d(ω_M ⌟ X).

If M = Rⁿ, with the standard volume element

(9.8)  ω = dx₁ ∧ ⋯ ∧ dxₙ,

and if

(9.9)  X = Σ X^j(x) ∂/∂x_j,

then

(9.10)  ω ⌟ X = Σ_{j=1}^n (−1)^{j−1} X^j(x) dx₁ ∧ ⋯ ∧ \widehat{dx_j} ∧ ⋯ ∧ dxₙ.

Hence, in this case, (9.7) yields the familiar formula

(9.11)  div X = Σ_{j=1}^n ∂_j X^j,

where we use the notation

(9.12)  ∂_j f = ∂f/∂x_j.

Suppose now that M is endowed with both an orientation and a metric tensor g_{jk}(x). Then M carries a natural volume element ω_M, determined by the condition that, if one has an orientation-preserving coordinate system in which g_{jk}(p₀) = δ_{jk}, then ω_M(p₀) = dx₁ ∧ ⋯ ∧ dxₙ. This condition produces the following formula, in any orientation-preserving coordinate system:

(9.13)  ω_M = √g dx₁ ∧ ⋯ ∧ dxₙ,  g = det(g_{jk}),

by the same sort of calculations as done in (5.10)–(5.15).

We now compute div X when the volume element on M is given by (9.13). We have

(9.14)  ω_M ⌟ X = Σ_j (−1)^{j−1} X^j √g dx₁ ∧ ⋯ ∧ \widehat{dx_j} ∧ ⋯ ∧ dxₙ,

and hence

(9.15)  d(ω_M ⌟ X) = ∂_j(√g X^j) dx₁ ∧ ⋯ ∧ dxₙ.
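The coordinate formula for the divergence that follows from (9.15) can be sanity-checked in polar coordinates on R², where the metric is diag(1, r²), so √g = r and div X = (1/r)[∂_r(r X^r) + ∂_θ(r X^θ)]. For the radial field X = x ∂/∂x + y ∂/∂y, with polar components X^r = r, X^θ = 0, the Cartesian formula (9.11) gives div X = 2; the sketch below reproduces this using centered differences:

```python
# div X = (1/r) [ d/dr (r X^r) + d/dθ (r X^θ) ] in polar coordinates on R^2
def div_polar(Xr, Xth, r, th, h=1e-6):
    d_r = (r + h) * Xr(r + h, th) - (r - h) * Xr(r - h, th)
    d_th = r * Xth(r, th + h) - r * Xth(r, th - h)
    return (d_r + d_th) / (2 * h * r)

# radial field x d/dx + y d/dy: X^r = r, X^theta = 0
val = div_polar(lambda r, th: r, lambda r, th: 0.0, 1.7, 0.4)
print(val)   # ≈ 2.0
```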


Here, as below, we use the summation convention. Hence the formula (9.7) gives

(9.16)  div X = g^{−1/2} ∂_j (g^{1/2} X^j).

We now derive the Divergence Theorem, as a consequence of Stokes' formula, which we recall is

(9.17)  ∫_M dα = ∫_{∂M} α,

for an (n − 1)-form α on M, assumed to be a smooth compact oriented surface with boundary. If α = ω_M ⌟ X, formula (9.7) gives

(9.18)  ∫_M (div X) ω_M = ∫_{∂M} ω_M ⌟ X.

This is one form of the Divergence Theorem. We will produce an alternative expression for the integrand on the right before stating the result formally.

Given that ω_M is the volume form for M determined by a Riemannian metric, we can write the interior product ω_M ⌟ X in terms of the volume element ω_{∂M} on ∂M, with its induced orientation and Riemannian metric, as follows. Pick coordinates on M, centered at p₀ ∈ ∂M, such that ∂M is tangent to the hyperplane {x₁ = 0} at p₀ = 0 (with M to the left of ∂M), and such that g_{jk}(p₀) = δ_{jk}, so ω_M(p₀) = dx₁ ∧ ⋯ ∧ dxₙ. Consequently, ω_{∂M}(p₀) = dx₂ ∧ ⋯ ∧ dxₙ. It follows that, at p₀,

(9.19)  j*(ω_M ⌟ X) = ⟨X, ν⟩ ω_{∂M},

where ν is the unit vector normal to ∂M, pointing out of M, and j : ∂M ↪ M the natural inclusion. The two sides of (9.19), which are both defined in a coordinate independent fashion, are hence equal on ∂M, and the identity (9.18) becomes

(9.20)  ∫_M (div X) ω_M = ∫_{∂M} ⟨X, ν⟩ ω_{∂M}.

Finally, we adopt the notation of the sort used in §§4–5. We denote the volume element on M by dV and that on ∂M by dS, obtaining the Divergence Theorem:

Theorem 9.2. If M is a compact surface with boundary, X a smooth vector field on M, then

(9.21)  ∫_M (div X) dV = ∫_{∂M} ⟨X, ν⟩ dS,

where ν is the unit outward-pointing normal to ∂M.

The only point left to mention here is that M need not be orientable. Indeed, we can treat the integrals in (9.21) as surface integrals, as in §5, and note that all objects in (9.21) are independent of a choice of orientation. To prove the general case, just use a partition of unity supported on orientable pieces.

We obtain some further integral identities. First, we apply (9.21) with X replaced by uX. We have the following "derivation" identity:

(9.22)  div uX = u div X + ⟨du, X⟩ = u div X + Xu,

which follows easily from the formula (9.16). The Divergence Theorem immediately gives

(9.23)  ∫_M (div X)u dV + ∫_M Xu dV = ∫_{∂M} ⟨X, ν⟩u dS.

Replacing u by uv and using the derivation identity X(uv) = (Xu)v + u(Xv), we have

(9.24)  ∫_M [(Xu)v + u(Xv)] dV = − ∫_M (div X)uv dV + ∫_{∂M} ⟨X, ν⟩uv dS.

It is very useful to apply (9.23) to a gradient vector field X. If v is a smooth function on M, grad v is a vector field satisfying

(9.25)  ⟨grad v, Y⟩ = ⟨Y, dv⟩,

for any vector field Y, where the brackets on the left are given by the metric tensor on M and those on the right by the natural pairing of vector fields and 1-forms. Hence grad v = X has components X^j = g^{jk} ∂_k v, where (g^{jk}) is the matrix inverse of (g_{jk}). Applying div to grad v, we have the Laplace operator:

(9.26)  Δv = div grad v = g^{−1/2} ∂_j (g^{jk} g^{1/2} ∂_k v).

When M is a region in Rⁿ and we use the standard Euclidean metric, so div X is given by (9.11), we have the Laplace operator on Euclidean space:

(9.27)  Δv = ∂²v/∂x₁² + ⋯ + ∂²v/∂xₙ².

Now, setting X = grad v in (9.23), we have Xu = ⟨grad u, grad v⟩, and ⟨X, ν⟩ = ⟨ν, grad v⟩, which we call the normal derivative of v, and denote ∂v/∂ν. Hence

(9.28)  ∫_M u(Δv) dV = − ∫_M ⟨grad u, grad v⟩ dV + ∫_{∂M} u (∂v/∂ν) dS.

If we interchange the roles of u and v and subtract, we have

(9.29)  ∫_M u(Δv) dV = ∫_M (Δu)v dV + ∫_{∂M} [u (∂v/∂ν) − v (∂u/∂ν)] dS.


Formulas (9.28)–(9.29) are also called Green formulas. We will make further use of them in §10.

We return to the Green formula (9.2), and give it another formulation. Consider a vector field Z = (f, g, h) on a region in R³ containing the planar surface U = {(x, y, 0) : (x, y) ∈ Ω}. If we form

(9.30)  curl Z = det \begin{pmatrix} i & j & k \\ ∂_x & ∂_y & ∂_z \\ f & g & h \end{pmatrix},

we see that the integrand on the left side of (9.2) is the k-component of curl Z, so (9.2) can be written

(9.31)  ∫∫_U (curl Z) · k dA = ∫_{∂U} (Z · T) ds,

where T is the unit tangent vector to ∂U. To see how to extend this result, note that k is a unit normal field to the planar surface U.

To formulate and prove the extension of (9.31) to any compact oriented surface with boundary in R³, we use the relation between curl and exterior derivative discussed in Exercises 2–3 of §7. In particular, if we set

(9.32)  F = Σ_{j=1}^3 f_j(x) ∂/∂x_j,  ϕ = Σ_{j=1}^3 f_j(x) dx_j,

then curl F = Σ_{j=1}^3 g_j(x) ∂/∂x_j, where

(9.33)  dϕ = g₁(x) dx₂ ∧ dx₃ + g₂(x) dx₃ ∧ dx₁ + g₃(x) dx₁ ∧ dx₂.

Now suppose M is a smooth oriented (n − 1)-dimensional surface with boundary in Rⁿ. Using the orientation of M, we pick a unit normal field N to M as follows. Take a smooth function v which vanishes on M but such that ∇v(x) ≠ 0 on M. Thus ∇v is normal to M. Let σ ∈ Λ^{n−1}(M) define the orientation of M. Then dv ∧ σ = a(x) dx₁ ∧ ⋯ ∧ dxₙ, where a(x) is nonvanishing on M. For x ∈ M, we take N(x) = ∇v(x)/|∇v(x)| if a(x) > 0, and N(x) = −∇v(x)/|∇v(x)| if a(x) < 0. We call N the "positive" unit normal field to the oriented surface M, in this case. Part of the motivation for this characterization of N is that, if Ω ⊂ Rⁿ is an open set with smooth boundary M = ∂Ω, and we give M the induced orientation, as described in §8, then the positive normal field N just defined coincides with the unit normal field pointing out of Ω. Compare Exercises 2–3 of §8.

Now, if G = (g₁, . . . , gₙ) is a vector field defined near M, then

(9.34)  ∫_M (N · G) dS = ∫_M ( Σ_{j=1}^n (−1)^{j−1} g_j(x) dx₁ ∧ ⋯ ∧ \widehat{dx_j} ∧ ⋯ ∧ dxₙ ).


This result follows from (9.19). When n = 3 and G = curl F, we deduce from (9.32)–(9.33) that

(9.35)  ∫∫_M dϕ = ∫∫_M (N · curl F) dS.

Furthermore, in this case we have

(9.36)  ∫_{∂M} ϕ = ∫_{∂M} (F · T) ds,

where T is the unit tangent vector to ∂M, specified as follows by the orientation of ∂M: if τ ∈ Λ¹(∂M) defines the orientation of ∂M, then ⟨T, τ⟩ > 0 on ∂M. We call T the "forward" unit tangent vector field to the oriented curve ∂M. By the calculations above, we have the classical Stokes formula:

Proposition 9.3. If M is a compact oriented surface with boundary in R³, and F is a C¹ vector field on a neighborhood of M, then

(9.37)  ∫∫_M (N · curl F) dS = ∫_{∂M} (F · T) ds,

where N is the positive unit normal field on M and T the forward unit tangent field to ∂M.
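A simple numerical check of (9.37), in the planar case (9.31): take M to be the unit disk in the plane z = 0, so N = k, with the sample field F = (−y, x + xz, z) (chosen arbitrarily for illustration). On z = 0 the k-component of curl F is ∂_x(x + xz) − ∂_y(−y) = (1 + z) + 1 = 2:

```python
import math

# Surface side: ∫∫ (curl F · k) dA = ∫∫ 2 dA over the unit disk (polar coords)
def surface_side(n=1000):
    return sum(2.0 * ((i + 0.5) / n) for i in range(n)) * (1.0 / n) * 2 * math.pi

# Line side: boundary (cos t, sin t, 0), forward tangent T = (-sin t, cos t, 0)
def line_side(n=10000):
    h = 2 * math.pi / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        Fx, Fy = -y, x               # F restricted to z = 0
        total += Fx * (-math.sin(t)) + Fy * math.cos(t)
    return total * h

print(surface_side(), line_side(), 2 * math.pi)   # all three ≈ 6.28319
```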

Exercises

1. Given a "Hamiltonian vector field"

H_f = Σ_{j=1}^n [ (∂f/∂ξ_j) ∂/∂x_j − (∂f/∂x_j) ∂/∂ξ_j ],

calculate div H_f from (9.11).

2. Show that, if F : R³ → R³ is a linear rotation, then, for a C¹ vector field Z on R³,

(9.38)  F_#(curl Z) = curl(F_# Z).

3. Let M be the graph in R³ of a smooth function, z = u(x, y), (x, y) ∈ O ⊂ R², a bounded region with smooth boundary (maybe with corners). Show that

(9.39)  ∫∫_M (curl F · N) dS = ∫∫_O [ (∂F₃/∂y − ∂F₂/∂z)(−∂u/∂x) + (∂F₁/∂z − ∂F₃/∂x)(−∂u/∂y) + (∂F₂/∂x − ∂F₁/∂y) ] dx dy,


where ∂F_j/∂x and ∂F_j/∂y are evaluated at (x, y, u(x, y)). Show that

(9.40)  ∫_{∂M} (F · T) ds = ∫_{∂O} ( F̃₁ + F̃₃ ∂u/∂x ) dx + ( F̃₂ + F̃₃ ∂u/∂y ) dy,

where F̃_j(x, y) = F_j(x, y, u(x, y)). Apply Green's Theorem, with f = F̃₁ + F̃₃(∂u/∂x), g = F̃₂ + F̃₃(∂u/∂y), to show that the right sides of (9.39) and (9.40) are equal, hence proving Stokes' Theorem in this case.

4. Let M ⊂ Rⁿ be the graph of a function xₙ = u(x′), x′ = (x₁, . . . , x_{n−1}). If

β = Σ_{j=1}^n (−1)^{j−1} g_j(x) dx₁ ∧ ⋯ ∧ \widehat{dx_j} ∧ ⋯ ∧ dxₙ,

as in (9.34), and ϕ(x′) = (x′, u(x′)), show that

ϕ*β = (−1)ⁿ [ Σ_{j=1}^{n−1} g_j(x′, u(x′)) ∂u/∂x_j − gₙ(x′, u(x′)) ] dx₁ ∧ ⋯ ∧ dx_{n−1}
    = (−1)^{n−1} G · (−∇u, 1) dx₁ ∧ ⋯ ∧ dx_{n−1},

where G = (g₁, . . . , gₙ), and verify the identity (9.34) in this case.
Hint. For the last part, recall Exercises 2–3 of §8, regarding the orientation of M.

5. Let S be a smooth oriented 2-dimensional surface in R³, and M an open subset of S, with smooth boundary; see Fig. 9.1. Let N be the positive unit normal field to S, defined by its orientation. For x ∈ ∂M, let ν(x) be the unit vector, tangent to M, normal to ∂M, and pointing out of M, and let T be the forward unit tangent vector field to ∂M. Show that, on ∂M,

N × ν = T,  ν × T = N.

6. If M is an oriented (n − 1)-dimensional surface in Rⁿ, with positive unit normal field N, show that the volume element ω_M on M is given by ω_M = ω ⌟ N, where ω = dx₁ ∧ ⋯ ∧ dxₙ is the standard volume form on Rⁿ. Deduce that the volume element on the unit sphere S^{n−1} ⊂ Rⁿ is given by

ω_{S^{n−1}} = Σ_{j=1}^n (−1)^{j−1} x_j dx₁ ∧ ⋯ ∧ \widehat{dx_j} ∧ ⋯ ∧ dxₙ,

if S^{n−1} inherits the orientation as the boundary of the unit ball.


9B. Direct proof of the Divergence Theorem

Let Ω be a bounded open subset of Rⁿ, with a smooth boundary ∂Ω. Hence, for each p ∈ ∂Ω, there is a neighborhood U of p in Rⁿ, a rotation of coordinate axes, and a C¹ function u : O → R, defined on an open set O ⊂ R^{n−1}, such that

Ω ∩ U = {x ∈ Rⁿ : xₙ ≤ u(x′), x′ ∈ O} ∩ U,

where x = (x′, xₙ), x′ = (x₁, . . . , x_{n−1}).

We aim to prove that, given f ∈ C¹(Ω), and any constant vector e ∈ Rⁿ,

(9.41)  ∫_Ω e · ∇f(x) dx = ∫_{∂Ω} (e · N) f dS,

where dS is surface measure on ∂Ω and N(x) is the unit normal to ∂Ω, pointing out of Ω. At x = (x′, u(x′)) ∈ ∂Ω, we have

(9.42)  N = (1 + |∇u|²)^{−1/2} (−∇u, 1).

To prove (9.41), we may as well suppose f is supported in such a neighborhood U. Then we have

(9.43)  ∫_Ω ∂f/∂xₙ dx = ∫_O ( ∫_{xₙ ≤ u(x′)} ∂ₙ f(x′, xₙ) dxₙ ) dx′
      = ∫_O f(x′, u(x′)) dx′
      = ∫_{∂Ω} (eₙ · N) f dS.

The first identity in (9.43) follows from Theorem 4.9, the second identity from the Fundamental Theorem of Calculus, and the third identity from the identification

dS = (1 + |∇u|²)^{1/2} dx′,

established in (5.21). We use the standard basis {e₁, . . . , eₙ} of Rⁿ.

Such an argument works when eₙ is replaced by any constant vector e with the property that we can represent ∂Ω ∩ U as the graph of a function yₙ = ũ(y′), with the yₙ-axis parallel to e. In particular, it works for e = eₙ + a e_j, for 1 ≤ j ≤ n − 1 and for |a| sufficiently small. Thus, we have

(9.44)  ∫_Ω (eₙ + a e_j) · ∇f(x) dx = ∫_{∂Ω} (eₙ + a e_j) · N f dS.


If we subtract (9.43) from this and divide the result by a, we obtain (9.41) for e = e_j, for all j, and hence (9.41) holds in general.

Note that replacing e by e_j and f by f_j in (9.41), and summing over 1 ≤ j ≤ n, yields

(9.45)  ∫_Ω (div F) dx = ∫_{∂Ω} N · F dS,

for the vector field F = (f₁, . . . , fₙ). This is the usual statement of Gauss' Divergence Theorem, as given in Theorem 9.2 (specialized to domains in Rⁿ). Reversing the argument leading from (9.2) to (9.5), we also have another proof of Green's Theorem, in the form (9.2).


10. Holomorphic functions and harmonic functions

Let f be a complex-valued C¹ function on a region Ω ⊂ R². We identify R² with C, via z = x + iy, and write f(z) = f(x, y). We say f is holomorphic on Ω if it satisfies the Cauchy-Riemann equation:

(10.1)
∂f/∂x = (1/i) ∂f/∂y.

Note that f(z) = z^k has this property, for any k ∈ Z, via an easy calculation of the partial derivatives of (x + iy)^k. Thus z^k is holomorphic on C if k ∈ Z⁺, and on C \ {0} if k ∈ Z⁻. Another example is the exponential function, e^z = e^x e^{iy}. On the other hand, x² + y² is not holomorphic. The following is often useful for producing more holomorphic functions:

Lemma 10.1. If f and g are holomorphic on Ω, so is fg.

Proof. We have

(10.2)
∂(fg)/∂x = (∂f/∂x) g + f (∂g/∂x),   ∂(fg)/∂y = (∂f/∂y) g + f (∂g/∂y),

so if f and g satisfy the Cauchy-Riemann equation, so does fg.

We next apply Green's theorem to the line integral ∫_{∂Ω} f dz = ∫_{∂Ω} f (dx + i dy). Clearly (9.2) applies to complex-valued functions, and if we set g = if, we get

(10.3)
∫_{∂Ω} f dz = ∫∫_Ω ( i ∂f/∂x − ∂f/∂y ) dx dy.

Whenever f is holomorphic, the integrand on the right side of (10.3) vanishes, so we have the following result, known as Cauchy's Integral Theorem:

Theorem 10.2. If f ∈ C¹(Ω) is holomorphic, then

(10.4)
∫_{∂Ω} f(z) dz = 0.
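Theorem 10.2 can be illustrated numerically on the unit disk. The sketch below (exponents and grid size are illustrative choices) integrates z^k around the unit circle: for k ≥ 0 the integrand is holomorphic on the disk and the integral vanishes, while k = −1, which is not holomorphic at 0, gives 2πi instead.

```python
import cmath, math

# Contour integral of z**k over the unit circle, by the midpoint rule on
# z = e^{it}, dz = i e^{it} dt, 0 <= t <= 2*pi.  For k >= 0 the result is
# ~ 0 (Theorem 10.2); for k = -1 it is 2*pi*i.

def contour_integral(k, n=4096):
    total = 0 + 0j
    for j in range(n):
        t = 2 * math.pi * (j + 0.5) / n
        z = cmath.exp(1j * t)
        total += z**k * (1j * z) * (2 * math.pi / n)
    return total

print(abs(contour_integral(3)))               # ~ 0
print(contour_integral(-1) / (2j * math.pi))  # ~ 1
```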

Until further notice, we assume Ω is a bounded region in C, with smooth boundary. Using (10.4), we can establish Cauchy’s Integral Formula:


Theorem 10.3. If f ∈ C¹(Ω) is holomorphic and z₀ ∈ Ω, then

(10.5)
f(z₀) = (1/2πi) ∫_{∂Ω} f(z)/(z − z₀) dz.

Proof. Note that g(z) = f(z)/(z − z₀) is holomorphic on Ω \ {z₀}. Let D_r be the disk of radius r centered at z₀. Pick r so small that D_r ⊂ Ω. Then (10.4) implies

(10.6)
∫_{∂Ω} f(z)/(z − z₀) dz = ∫_{∂D_r} f(z)/(z − z₀) dz.

To evaluate the integral on the right, parametrize the curve ∂D_r by γ(θ) = z₀ + re^{iθ}. Hence dz = ire^{iθ} dθ, so the integral on the right is equal to

(10.7)
∫₀^{2π} [f(z₀ + re^{iθ})/(re^{iθ})] ire^{iθ} dθ = i ∫₀^{2π} f(z₀ + re^{iθ}) dθ.

As r → 0, this tends in the limit to 2πi f(z₀), so (10.5) is established.

Suppose f ∈ C¹(Ω) is holomorphic, z₀ ∈ D_r ⊂ Ω, where D_r is the disk of radius r centered at z₀, and suppose z ∈ D_r. Then Theorem 10.3 implies

(10.8)
f(z) = (1/2πi) ∫_{∂Ω} f(ζ)/[(ζ − z₀) − (z − z₀)] dζ.

We have the infinite series expansion

(10.9)
1/[(ζ − z₀) − (z − z₀)] = (1/(ζ − z₀)) Σ_{n=0}^∞ ((z − z₀)/(ζ − z₀))ⁿ,

valid as long as |z − z₀| < |ζ − z₀|. Hence, given |z − z₀| < r, this series is uniformly convergent for ζ ∈ ∂Ω, and we have

(10.10)
f(z) = (1/2πi) Σ_{n=0}^∞ ∫_{∂Ω} (f(ζ)/(ζ − z₀)) ((z − z₀)/(ζ − z₀))ⁿ dζ.

Thus, for z ∈ D_r, f(z) has the convergent power series expansion

(10.11)
f(z) = Σ_{n=0}^∞ aₙ (z − z₀)ⁿ,   aₙ = (1/2πi) ∫_{∂Ω} f(ζ)/(ζ − z₀)^{n+1} dζ.

Note that, when (10.5) is applied to Ω = D_r, the disk of radius r centered at z₀, the computation (10.7) yields

(10.12)
f(z₀) = (1/2π) ∫₀^{2π} f(z₀ + re^{iθ}) dθ = (1/ℓ(∂D_r)) ∫_{∂D_r} f(z) ds(z),


when f is holomorphic and C¹ on D_r, and ℓ(∂D_r) = 2πr is the length of the circle ∂D_r. This is a mean value property, which extends to harmonic functions on domains in Rⁿ, as we will see below.

Note that we can write (10.1) as (∂_x + i∂_y)f = 0; applying the operator ∂_x − i∂_y to this gives

(10.13)
∂²f/∂x² + ∂²f/∂y² = 0

for any holomorphic function. A general C² solution of (10.13) on a region Ω ⊂ R² is called a harmonic function. More generally, if O is an open set in Rⁿ, a function f ∈ C²(O) is called harmonic if ∆f = 0 on O, where, as in (9.27),

(10.14)
∆f = ∂²f/∂x₁² + · · · + ∂²f/∂xₙ².
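Equation (10.13) can be probed by finite differences. A minimal sketch, using the illustrative test function u = Re z³ = x³ − 3xy² (the function, sample point, and step size are my own choices):

```python
# Finite-difference check of (10.13): the real part of the holomorphic
# function f(z) = z**3, namely u(x, y) = x**3 - 3*x*y**2, is harmonic.

def u(x, y):
    return x**3 - 3 * x * y**2

def laplacian(g, x, y, h=1e-3):
    # Standard 5-point approximation to g_xx + g_yy.
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h) - 4 * g(x, y)) / h**2

print(laplacian(u, 0.7, -0.4))  # ~ 0 up to rounding
```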

Generalizing (10.12), we have the following, known as the mean value property of harmonic functions:

Proposition 10.4. Let Ω ⊂ Rⁿ be open, u ∈ C²(Ω) be harmonic, p ∈ Ω, and B̄_R(p) = {x ∈ Rⁿ : |x − p| ≤ R} ⊂ Ω. Then

(10.15)
u(p) = (1/A(∂B_R(p))) ∫_{∂B_R(p)} u(x) dS(x).

For the proof, set

(10.16)
ψ(r) = (1/A(S^{n−1})) ∫_{S^{n−1}} u(p + rω) dS(ω),

for 0 < r ≤ R. We have ψ(R) equal to the right side of (10.15), while clearly ψ(r) → u(p) as r → 0. Now

(10.17)
ψ′(r) = (1/A(S^{n−1})) ∫_{S^{n−1}} ω · ∇u(p + rω) dS(ω) = (1/A(∂B_r(p))) ∫_{∂B_r(p)} (∂u/∂ν) dS(x).

At this point, we establish:

Lemma 10.5. If O ⊂ Rⁿ is a bounded domain with smooth boundary and u ∈ C²(Ō) is harmonic in O, then

(10.18)
∫_{∂O} (∂u/∂ν)(x) dS(x) = 0.

Proof. Apply the Green formula (9.29), with M = O and v = 1. If ∆u = 0, every integrand in (9.29) vanishes, except the one appearing in (10.18), so this integrates to zero.


It follows from this lemma that (10.17) vanishes, so ψ(r) is constant. This completes the proof of (10.15).

We can integrate the identity (10.15) to obtain

(10.19)
u(p) = (1/V(B_R(p))) ∫_{B_R(p)} u(x) dV(x),

where u ∈ C²(B̄_R(p)) is harmonic. This is another expression of the mean value property.

The mean value property of harmonic functions has a number of important consequences. Here we mention one result, known as Liouville's Theorem.

Proposition 10.5. If u ∈ C²(Rⁿ) is harmonic on all of Rⁿ and bounded, then u is constant.

Proof. Pick any two points p, q ∈ Rⁿ. We have, for any r > 0,

(10.20)
u(p) − u(q) = (1/V(B_r(0))) [ ∫_{B_r(p)} u(x) dx − ∫_{B_r(q)} u(x) dx ].

Note that V(B_r(0)) = Cₙ rⁿ, where Cₙ is evaluated in problem 2 of §5. Thus

(10.21)
|u(p) − u(q)| ≤ (1/(Cₙ rⁿ)) ∫_{∆(p,q,r)} |u(x)| dx,

where

(10.22)
∆(p, q, r) = B_r(p) △ B_r(q) = (B_r(p) \ B_r(q)) ∪ (B_r(q) \ B_r(p)).

Note that, if a = |p − q|, then ∆(p, q, r) ⊂ B_{r+a}(p) \ B_{r−a}(p); hence

(10.23)
V(∆(p, q, r)) ≤ C(p, q) r^{n−1},  r ≥ 1.

It follows that, if |u(x)| ≤ M for all x ∈ Rⁿ, then

(10.24)
|u(p) − u(q)| ≤ M Cₙ⁻¹ C(p, q) r⁻¹,  ∀ r ≥ 1.

Taking r → ∞, we obtain u(p) − u(q) = 0, so u is constant.

We will now use Liouville's Theorem to prove the Fundamental Theorem of Algebra:

Theorem 10.6. If p(z) = aₙzⁿ + aₙ₋₁zⁿ⁻¹ + · · · + a₁z + a₀ is a polynomial of degree n ≥ 1 (aₙ ≠ 0), then p(z) must vanish somewhere in C.

Proof. Consider

(10.25)
f(z) = 1/p(z).


If p(z) does not vanish anywhere in C, then f(z) is holomorphic on all of C. On the other hand,

(10.26)
f(z) = (1/zⁿ) · 1/(aₙ + aₙ₋₁z⁻¹ + · · · + a₀z⁻ⁿ),

so

(10.27)
|f(z)| → 0, as |z| → ∞.

Thus f is bounded on C, if p(z) has no roots. By Proposition 10.5 (applied to the real and imaginary parts of f, which are bounded harmonic functions), f(z) must be constant, which is impossible, so p(z) must have a complex root.

From the fact that every holomorphic function f on O ⊂ R² is harmonic, it follows that its real and imaginary parts are harmonic. This result has a converse. Let u ∈ C²(O) be harmonic. Consider the 1-form

(10.28)
α = −(∂u/∂y) dx + (∂u/∂x) dy.

We have dα = (∆u) dx ∧ dy, so α is closed if and only if u is harmonic. Now, if O is diffeomorphic to a disk, it follows from Proposition 7.1 that α is exact on O whenever it is closed, so, in such a case,

(10.29)
∆u = 0 on O ⇒ ∃ v ∈ C¹(O) s.t. α = dv.

In other words,

(10.30)
∂u/∂x = ∂v/∂y,   ∂u/∂y = −∂v/∂x.

This is precisely the Cauchy-Riemann equation (10.1) for f = u + iv, so we have:

Proposition 10.7. If O ⊂ R² is diffeomorphic to a disk and u ∈ C²(O) is harmonic, then u is the real part of a holomorphic function on O.

The function v (which is unique up to an additive constant) is called the harmonic conjugate of u.

We close this section with a brief mention of holomorphic functions on a domain O ⊂ Cⁿ. We say f ∈ C¹(O) is holomorphic provided it satisfies

(10.31)
∂f/∂xⱼ = (1/i) ∂f/∂yⱼ,  1 ≤ j ≤ n.

Suppose z ∈ O, z = (z₁, . . . , zₙ), and suppose ζ ∈ O whenever |z − ζ| < r. Then, by successively applying Cauchy's integral formula (10.5) to each complex variable zⱼ, we have

(10.32)
f(z) = (2πi)⁻ⁿ ∫_{γₙ} · · · ∫_{γ₁} f(ζ) (ζ₁ − z₁)⁻¹ · · · (ζₙ − zₙ)⁻¹ dζ₁ · · · dζₙ,


where γⱼ is any simple counterclockwise curve about zⱼ in C with the property that |ζⱼ − zⱼ| < r/√n for all ζⱼ ∈ γⱼ.

Consequently, if p ∈ Cⁿ and O contains the "polydisc" D̄ = {z ∈ Cⁿ : |z − p| ≤ √n δ}, then, for z ∈ D, the interior of D̄, we have

(10.33)
f(z) = (2πi)⁻ⁿ ∫_{Cₙ} · · · ∫_{C₁} f(ζ) [(ζ₁ − p₁) − (z₁ − p₁)]⁻¹ · · · [(ζₙ − pₙ) − (zₙ − pₙ)]⁻¹ dζ₁ · · · dζₙ,

where Cⱼ = {ζ ∈ C : |ζ − pⱼ| = δ}. Then, parallel to (10.8)–(10.11), we have

(10.34)
f(z) = Σ_{α≥0} c_α (z − p)^α,

for z ∈ D, where α = (α₁, . . . , αₙ) is a multi-index, (z − p)^α = (z₁ − p₁)^{α₁} · · · (zₙ − pₙ)^{αₙ}, as in (1.13), and

(10.35)
c_α = (2πi)⁻ⁿ ∫_{Cₙ} · · · ∫_{C₁} f(ζ) (ζ₁ − p₁)^{−α₁−1} · · · (ζₙ − pₙ)^{−αₙ−1} dζ₁ · · · dζₙ.

Thus holomorphic functions on open domains in Cn have convergent power series expansions. We refer to [Ahl] and [Hil] for more material on holomorphic functions of one complex variable, and to [Kr] for material on holomorphic functions of several complex variables. A source of much information on harmonic functions is [Kel]. Also further material on these subjects can be found in [T].
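The coefficient formula (10.35) lends itself to direct numerical evaluation. A sketch for n = 2, using the illustrative choices f(z₁, z₂) = exp(z₁ + z₂), p = 0, and δ = 1 (all my own), for which c_α = 1/(α₁! α₂!):

```python
import cmath, math

# Iterated midpoint rule for the coefficient formula (10.35) with n = 2,
# over the circles C_j = {|zeta_j| = delta} centered at p = 0.

def coeff(f, a1, a2, delta=1.0, m=128):
    total = 0 + 0j
    for j1 in range(m):
        z1 = delta * cmath.exp(2j * math.pi * (j1 + 0.5) / m)
        dz1 = 1j * z1 * (2 * math.pi / m)
        for j2 in range(m):
            z2 = delta * cmath.exp(2j * math.pi * (j2 + 0.5) / m)
            dz2 = 1j * z2 * (2 * math.pi / m)
            total += f(z1, z2) * z1 ** (-a1 - 1) * z2 ** (-a2 - 1) * dz1 * dz2
    return total / (2j * math.pi) ** 2

c = coeff(lambda z1, z2: cmath.exp(z1 + z2), 2, 3)
print(abs(c - 1 / (math.factorial(2) * math.factorial(3))))  # tiny
```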

Exercises

1. Look again at Exercise 4 in §1.

2. Look again at Exercises 3–5 in §2.

3. Let O, Ω be open in C. If f is holomorphic on O, with range in Ω, and g is holomorphic on Ω, show that h = g ◦ f is holomorphic on O.
Hint. Chain rule.

4. If f(x) = ϕ(|x|) on Rⁿ, show that

∆f(x) = ϕ″(|x|) + ((n − 1)/|x|) ϕ′(|x|).

In particular, show that

|x|^{−(n−2)} is harmonic on Rⁿ \ 0, if n ≥ 3,

and

log |x| is harmonic on R² \ 0.

If O, Ω are open in Rⁿ, a smooth map ϕ : O → Ω is said to be conformal provided the matrix function G(x) = Dϕ(x)ᵗ Dϕ(x) is a multiple of the identity, G(x) = γ(x)I. Recall formula (5.2).

5. Suppose n = 2 and ϕ preserves orientation. Show that ϕ (pictured as a function ϕ : O → C) is conformal if and only if it is holomorphic. If ϕ reverses orientation, show that ϕ is conformal ⇔ ϕ̄ is holomorphic (we say ϕ is anti-holomorphic).

6. If O and Ω are open in R² and u is harmonic on Ω, show that u ◦ ϕ is harmonic on O, whenever ϕ : O → Ω is a smooth conformal map.
Hint. Use Exercise 3.

7. Let e^z be defined by

e^z = e^x e^{iy},  z = x + iy, x, y ∈ R,

where e^x and e^{iy} are defined by (3.20), (3.25). Show that e^z is holomorphic in z ∈ C, and

e^z = Σ_{k=0}^∞ z^k/k!.

8. For z ∈ C, set

cos z = (1/2)(e^{iz} + e^{−iz}),   sin z = (1/2i)(e^{iz} − e^{−iz}).

Show that these functions agree with the definitions of cos t and sin t given in (3.25)–(3.26), for z = t ∈ R. Show that cos z and sin z are holomorphic in z ∈ C.
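The definitions in Exercise 8 can be sanity-checked numerically for real arguments (the sample points below are arbitrary choices):

```python
import cmath, math

# Check that the complex definitions of cos and sin from Exercise 8 agree
# with the real functions when z = t is real.

def ccos(z):
    return (cmath.exp(1j * z) + cmath.exp(-1j * z)) / 2

def csin(z):
    return (cmath.exp(1j * z) - cmath.exp(-1j * z)) / 2j

for t in (0.0, 0.7, 2.5, -1.3):
    assert abs(ccos(t) - math.cos(t)) < 1e-12
    assert abs(csin(t) - math.sin(t)) < 1e-12
print("ok")
```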


11. The Brouwer fixed-point theorem

Here we make use of the calculus of differential forms to provide simple proofs of some important topological results of Brouwer. The first two results concern retractions. If Y is a subset of X, by definition a retraction of X onto Y is a map ϕ : X → Y such that ϕ(x) = x for all x ∈ Y.

Proposition 11.1. There is no smooth retraction ϕ : B → S^{n−1} of the closed unit ball B in Rⁿ onto its boundary S^{n−1}.

In fact, it is just as easy to prove the following more general result. The approach we use is adapted from [Kan].

Proposition 11.2. If M is a compact oriented n-dimensional surface with nonempty boundary ∂M, there is no smooth retraction ϕ : M → ∂M.

Proof. Pick ω ∈ Λ^{n−1}(∂M) to be the volume form on ∂M, so ∫_{∂M} ω > 0. Now apply Stokes' theorem to β = ϕ*ω. If ϕ is a retraction, then ϕ ◦ j(x) = x, where j : ∂M ↪ M is the natural inclusion. Hence j*ϕ*ω = ω, so we have

(11.1)
∫_{∂M} ω = ∫_M dϕ*ω.

But dϕ*ω = ϕ* dω = 0, so the integral (11.1) is zero. This is a contradiction, so there can be no retraction.

A simple consequence of this is the famous Brouwer Fixed-Point Theorem. We first present the smooth case.

Theorem 11.3. If F : B → B is a smooth map on the closed unit ball in Rⁿ, then F has a fixed point.

Proof. We are claiming that F(x) = x for some x ∈ B. If not, define ϕ(x) to be the endpoint of the ray from F(x) to x, continued until it hits ∂B = S^{n−1}. It is clear that ϕ would be a smooth retraction, contradicting Proposition 11.1.

Now we give the general case, using the Stone-Weierstrass theorem (discussed in Appendix E) to reduce it to Theorem 11.3.

Theorem 11.4. If G : B → B is a continuous map on the closed unit ball in Rⁿ, then G has a fixed point.

Proof. If not, then

inf_{x∈B} |G(x) − x| = δ > 0.

The Stone-Weierstrass theorem (Appendix E) implies there exists a polynomial P such that |P(x) − G(x)| < δ/8 for all x ∈ B. Set

F(x) = (1 − δ/8) P(x).


Then F : B → B and |F(x) − G(x)| < δ/2 for all x ∈ B, so

inf_{x∈B} |F(x) − x| > δ/2.

This contradicts Theorem 11.3.
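The retraction used in the proof of Theorem 11.3 is computable: ϕ(x) lies where the ray from F(x) through x meets S^{n−1}, which amounts to solving a quadratic for the ray parameter t ≥ 1. A sketch (the sample map F and the sample point are illustrative choices, not from the text):

```python
import math

# phi(x) = F(x) + t*(x - F(x)), with t >= 1 chosen so that |phi(x)| = 1.
# This is well defined whenever F maps the closed unit ball to itself and
# F(x) != x, and it fixes every x on the unit sphere.

def retraction_point(F, x):
    # Solve |F(x) + t*(x - F(x))|^2 = 1 for the root t >= 1 (a quadratic in t).
    fx = F(x)
    d = [xi - fi for xi, fi in zip(x, fx)]
    a = sum(di * di for di in d)
    b = 2 * sum(fi * di for fi, di in zip(fx, d))
    c = sum(fi * fi for fi in fx) - 1.0
    t = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    return [fi + t * di for fi, di in zip(fx, d)]

# With the constant map F = 0, phi reduces to radial projection x -> x/|x|.
p = retraction_point(lambda x: [0.0, 0.0], [0.3, 0.4])
print(p)  # [0.6, 0.8]
```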


A. Metric spaces, convergence, and compactness

A metric space is a set X, together with a distance function d : X × X → [0, ∞), having the properties

(A.1)
d(x, y) = 0 ⇔ x = y,
d(x, y) = d(y, x),
d(x, y) ≤ d(x, z) + d(y, z).

The third of these properties is called the triangle inequality. An example of a metric space is the set of rational numbers Q, with d(x, y) = |x − y|. Another example is X = Rⁿ, with

d(x, y) = ((x₁ − y₁)² + · · · + (xₙ − yₙ)²)^{1/2}.

If (xν) is a sequence in X, indexed by ν = 1, 2, 3, . . . , i.e., by ν ∈ Z⁺, one says xν → y if d(xν, y) → 0, as ν → ∞. One says (xν) is a Cauchy sequence if d(xν, xµ) → 0 as µ, ν → ∞. One says X is a complete metric space if every Cauchy sequence converges to a limit in X. Some metric spaces are not complete; for example, Q is not complete. You can take a sequence (xν) of rational numbers such that xν → √2, which is not rational. Then (xν) is Cauchy in Q, but it has no limit in Q.

If a metric space X is not complete, one can construct its completion X̂ as follows. Let an element ξ of X̂ consist of an equivalence class of Cauchy sequences in X, where we say (xν) ∼ (yν) provided d(xν, yν) → 0. We write the equivalence class containing (xν) as [xν]. If ξ = [xν] and η = [yν], we can set d(ξ, η) = lim_{ν→∞} d(xν, yν), and verify that this is well defined, and makes X̂ a complete metric space.

If the completion of Q is constructed by this process, you get R, the set of real numbers. This construction provides a good way to develop the basic theory of the real numbers.

There are a number of useful concepts related to the notion of closeness. We define some of them here. First, if p is a point in a metric space X and r ∈ (0, ∞), the set

(A.2)
B_r(p) = {x ∈ X : d(x, p) < r}

is called the open ball about p of radius r. Generally, a neighborhood of p ∈ X is a set containing such a ball, for some r > 0. A set U ⊂ X is called open if it contains a neighborhood of each of its points. The complement of an open set is said to be closed. The following result characterizes closed sets. Proposition A.1. A subset K ⊂ X of a metric space X is closed if and only if (A.3)

xj ∈ K, xj → p ∈ X =⇒ p ∈ K.

Proof. Assume K is closed, xj ∈ K, xj → p. If p ∈ / K, then p ∈ X \ K, which is open, so some Bε (p) ⊂ X \ K, and d(xj , p) ≥ ε for all j. This contradiction implies p ∈ K. Conversely, assume (A.3) holds, and let q ∈ U = X \ K. If B1/n (q) is not contained in U for any n, then there exists xn ∈ K ∩ B1/n (q), hence xn → q, contradicting (A.3). This completes the proof. The following is straightforward.


Proposition A.2. If Uα is a family of open sets in X, then ∪α Uα is open. If Kα is a family of closed subsets of X, then ∩α Kα is closed.

Given S ⊂ X, we denote by S̄ (the closure of S) the smallest closed subset of X containing S, i.e., the intersection of all the closed sets Kα ⊂ X containing S. The following result is straightforward.

Proposition A.3. Given S ⊂ X, p ∈ S̄ if and only if there exist xⱼ ∈ S such that xⱼ → p.

Given S ⊂ X, p ∈ X, we say p is an accumulation point of S if and only if, for each ε > 0, there exists q ∈ S ∩ Bε(p), q ≠ p. It follows that p is an accumulation point of S if and only if each Bε(p), ε > 0, contains infinitely many points of S. One straightforward observation is that all points of S̄ \ S are accumulation points of S.

The interior of a set S ⊂ X is the largest open set contained in S, i.e., the union of all the open sets contained in S. Note that the complement of the interior of S is equal to the closure of X \ S.

We now turn to the notion of compactness. We say a metric space X is compact provided the following property holds:

(A)

Each sequence (xk ) in X has a convergent subsequence.

We will establish various properties of compact metric spaces, and provide various equivalent characterizations. For example, it is easily seen that (A) is equivalent to: (B)

Each infinite subset S ⊂ X has an accumulation point.

The following property is known as total boundedness:

Proposition A.4. If X is a compact metric space, then

(C)  Given ε > 0, there is a finite set {x₁, . . . , x_N} such that Bε(x₁), . . . , Bε(x_N) covers X.

Proof. Take ε > 0 and pick x₁ ∈ X. If Bε(x₁) = X, we are done. If not, pick x₂ ∈ X \ Bε(x₁). If Bε(x₁) ∪ Bε(x₂) = X, we are done. If not, pick x₃ ∈ X \ [Bε(x₁) ∪ Bε(x₂)]. Continue, taking x_{k+1} ∈ X \ [Bε(x₁) ∪ · · · ∪ Bε(x_k)], if Bε(x₁) ∪ · · · ∪ Bε(x_k) ≠ X. Note that, for 1 ≤ i, j ≤ k,

i ≠ j ⇒ d(xᵢ, xⱼ) ≥ ε.

If one never covers X this way, consider S = {xⱼ : j ∈ N}. This is an infinite set with no accumulation point, so property (B) is contradicted.

Corollary A.5. If X is a compact metric space, it has a countable dense subset.

Proof. Given ε = 2^{−n}, let Sₙ be a finite set of points xⱼ such that {Bε(xⱼ)} covers X. Then C = ∪ₙ Sₙ is a countable dense subset of X.

Here is another useful property of compact metric spaces, which will eventually be generalized even further, in (E) below.


Proposition A.6. Let X be a compact metric space. Assume K1 ⊃ K2 ⊃ K3 ⊃ · · · form a decreasing sequence of closed subsets of X. If each Kn 6= ∅, then ∩n Kn 6= ∅. Proof. Pick xn ∈ Kn . If (A) holds, (xn ) has a convergent subsequence, xnk → y. Since {xnk : k ≥ `} ⊂ Kn` , which is closed, we have y ∈ ∩n Kn . Corollary A.7. Let X be a compact metric space. Assume U1 ⊂ U2 ⊂ U3 ⊂ · · · form an increasing sequence of open subsets of X. If ∪n Un = X, then UN = X for some N . Proof. Consider Kn = X \ Un . The following is an important extension of Corollary A.7. Proposition A.8. If X is a compact metric space, then it has the property: (D)

Every open cover {Uα : α ∈ A} of X has a finite subcover.

Proof. Each Uα is a union of open balls, so it suffices to show that (A) implies the following: (D’)

Every cover {Bα : α ∈ A} of X by open balls has a finite subcover.

Let C = {zⱼ : j ∈ N} ⊂ X be a countable dense subset of X, as in Corollary A.5. Each Bα is a union of balls B_{rⱼ}(zⱼ), with zⱼ ∈ C ∩ Bα, rⱼ rational. Thus it suffices to show that

(D″)

Every countable cover {Bj : j ∈ N} of X by open balls has a finite subcover.

For this, we set Uₙ = B₁ ∪ · · · ∪ Bₙ and apply Corollary A.7.

The following is a convenient alternative to property (D):

(E)  If Kα ⊂ X are closed and ∩α Kα = ∅, then some finite intersection is empty.

Considering Uα = X \ Kα , we see that (D) ⇐⇒ (E). The following result completes Proposition A.8. Theorem A.9. For a metric space X, (A) ⇐⇒ (D). Proof. By Proposition A.8, (A) ⇒ (D). To prove the converse, it will suffice to show that (E) ⇒ (B). So let S ⊂ X and assume S has no accumulation point. We claim: Such S must be closed.


Indeed, if z ∈ S̄ and z ∉ S, then z would have to be an accumulation point. Say S = {xα : α ∈ A}. Set Kα = S \ {xα}. Then each Kα has no accumulation point, hence Kα ⊂ X is closed. Also ∩α Kα = ∅. Hence, if (E) holds, there exists a finite set F ⊂ A such that ∩_{α∈F} Kα = ∅. Hence S = ∪_{α∈F} {xα} is finite, so indeed (E) ⇒ (B).

Remark. So far we have that for every metric space X,

(A) ⇔ (B) ⇔ (D) ⇔ (E) ⇒ (C).

We claim that (C) implies the other conditions if X is complete. Of course, compactness implies completeness, but (C) may hold for incomplete X, e.g., X = (0, 1) ⊂ R.

Proposition A.10. If X is a complete metric space with property (C), then X is compact.

Proof. It suffices to show that (C) ⇒ (B) if X is a complete metric space. So let S ⊂ X be an infinite set. Cover X by balls B_{1/2}(x₁), . . . , B_{1/2}(x_N). One of these balls contains infinitely many points of S, and so does its closure, say X₁ = B̄_{1/2}(y₁). Now cover X by finitely many balls of radius 1/4; their intersection with X₁ provides a cover of X₁. One such set contains infinitely many points of S, and so does its closure X₂ = B̄_{1/4}(y₂) ∩ X₁. Continue in this fashion, obtaining

X₁ ⊃ X₂ ⊃ X₃ ⊃ · · · ⊃ Xₖ ⊃ X_{k+1} ⊃ · · · ,   Xⱼ ⊂ B̄_{2^{−j}}(yⱼ),

each containing infinitely many points of S. One sees that (yⱼ) forms a Cauchy sequence. Since X is complete, it has a limit, yⱼ → z, and z is seen to be an accumulation point of S.

If Xⱼ, 1 ≤ j ≤ m, is a finite collection of metric spaces, with metrics dⱼ, we can define a Cartesian product metric space

(A.4)
X = ∏_{j=1}^m Xⱼ,   d(x, y) = d₁(x₁, y₁) + · · · + d_m(x_m, y_m).

p Another choice of metric is δ(x, y) = d1 (x1 , y1 )2 + · · · + dm (xm , ym )2 . The metrics d and δ are equivalent, i.e., there exist constants C0 , C1 ∈ (0, ∞) such that (A.5)

C0 δ(x, y) ≤ d(x, y) ≤ C1 δ(x, y),

∀ x, y ∈ X.
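For the particular constants C₀ = 1 and C₁ = √m (the constants worked out in Exercise 9 of this appendix), the inequality (A.5) can be spot-checked numerically; the random tuples below stand in for the numbers dⱼ(xⱼ, yⱼ) ≥ 0:

```python
import math, random

# Spot-check of the equivalence (A.5) between the sum metric d of (A.4)
# and the root-sum-of-squares metric delta, with C0 = 1 and C1 = sqrt(m).

random.seed(0)
m = 5
for _ in range(1000):
    parts = [random.random() for _ in range(m)]
    d = sum(parts)                                 # the metric d of (A.4)
    delta = math.sqrt(sum(p * p for p in parts))   # the metric delta
    assert delta <= d + 1e-12 and d <= math.sqrt(m) * delta + 1e-12
print("ok")
```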

A key example is Rᵐ, the Cartesian product of m copies of the real line R.

We describe some important classes of compact spaces.

Proposition A.11. If Xⱼ are compact metric spaces, 1 ≤ j ≤ m, so is X = ∏_{j=1}^m Xⱼ.

Proof. If (xν ) is an infinite sequence of points in X, say xν = (x1ν , . . . , xmν ), pick a convergent subsequence of (x1ν ) in X1 , and consider the corresponding subsequence of (xν ), which we relabel (xν ). Using this, pick a convergent subsequence of (x2ν ) in X2 . Continue. Having a subsequence such that xjν → yj in Xj for each j = 1, . . . , m, we then have a convergent subsequence in X. The following result is useful for calculus on Rn .


Proposition A.12. If K is a closed bounded subset of Rn , then K is compact. Proof. The discussion above reduces the problem to showing that any closed interval I = [a, b] in R is compact. This compactness is a corollary of Proposition A.10. For pedagogical purposes, we redo the argument here, since in this concrete case it can be streamlined. Suppose S is a subset of I with infinitely many elements. Divide I into 2 equal subintervals, I1 = [a, b1 ], I2 = [b1 , b], b1 = (a + b)/2. Then either I1 or I2 must contain infinitely many elements of S. Say Ij does. Let x1 be any element of S lying in Ij . Now divide Ij in two equal pieces, Ij = Ij1 ∪ Ij2 . One of these intervals (say Ijk ) contains infinitely many points of S. Pick x2 ∈ Ijk to be one such point (different from x1 ). Then subdivide Ijk into two equal subintervals, and continue. We get an infinite sequence of distinct points xν ∈ S, and |xν − xν+k | ≤ 2−ν (b − a), for k ≥ 1. Since R is complete, (xν ) converges, say to y ∈ I. Any neighborhood of y contains infinitely many points in S, so we are done. If X and Y are metric spaces, a function f : X → Y is said to be continuous provided xν → x in X implies f (xν ) → f (x) in Y. An equivalent condition, which the reader is invited to verify, is (A.6)

U open in Y =⇒ f −1 (U ) open in X.

Proposition A.13. If X and Y are metric spaces, f : X → Y continuous, and K ⊂ X compact, then f(K) is a compact subset of Y.

Proof. If (yν) is an infinite sequence of points in f(K), pick xν ∈ K such that f(xν) = yν. If K is compact, we have a subsequence x_{νⱼ} → p in X, and then y_{νⱼ} → f(p) in Y.

If f : X → R is continuous, we say f ∈ C(X). A useful corollary of Proposition A.13 is:

Proposition A.14. If X is a compact metric space and f ∈ C(X), then f assumes a maximum and a minimum value on X.

Proof. We know from Proposition A.13 that f(X) is a compact subset of R. Hence f(X) is bounded, say f(X) ⊂ I = [a, b]. Repeatedly subdividing I into equal halves, as in the proof of Proposition A.12, at each stage throwing out intervals that do not intersect f(X), and keeping only the leftmost and rightmost interval amongst those remaining, we obtain points α ∈ f(X) and β ∈ f(X) such that f(X) ⊂ [α, β]. Then α = f(x₀) for some x₀ ∈ X is the minimum and β = f(x₁) for some x₁ ∈ X is the maximum.

At this point, the reader might take a look at the proof of the Mean Value Theorem, given in §0, which applies this result.

If S ⊂ R is a nonempty, bounded set, Proposition A.12 implies its closure S̄ is compact. The function η : S̄ → R, η(x) = x is continuous, so by Proposition A.14 it assumes a maximum and a minimum on S̄. We set

(A.7)
sup S = max_{x∈S̄} x,   inf S = min_{x∈S̄} x,

when S is bounded. More generally, if S ⊂ R is nonempty and bounded from above, say S ⊂ (−∞, B], we can pick A < B such that S ∩ [A, B] is nonempty, and set

(A.8)
sup S = sup (S ∩ [A, B]).


Similarly, if S ⊂ R is nonempty and bounded from below, say S ⊂ [A, ∞), we can pick B > A such that S ∩ [A, B] is nonempty, and set

(A.9)
inf S = inf (S ∩ [A, B]).

If X is a nonempty set and f : X → R is bounded from above, we set

(A.10)
sup_{x∈X} f(x) = sup f(X),

and if f : X → R is bounded from below, we set

(A.11)
inf_{x∈X} f(x) = inf f(X).

If f is not bounded from above, we set sup f = +∞, and if f is not bounded from below, we set inf f = −∞.

Given a set X, f : X → R, and xₙ → x, we set

(A.11A)
lim sup_{n→∞} f(xₙ) = lim_{n→∞} ( sup_{k≥n} f(xₖ) ),

and

(A.11B)
lim inf_{n→∞} f(xₙ) = lim_{n→∞} ( inf_{k≥n} f(xₖ) ).
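The definitions (A.11A)–(A.11B) can be illustrated with the sequence aₙ = (−1)ⁿ(1 + 1/n) (an illustrative choice): the tail suprema decrease toward lim sup = 1, and the tail infima increase toward lim inf = −1.

```python
# Tail sup and tail inf for a_n = (-1)**n * (1 + 1/n), sampled at a few
# cut-off indices; finite tails approximate the sup/inf over k >= n.

a = [(-1) ** n * (1 + 1.0 / n) for n in range(1, 5001)]
tail_sups = [max(a[n:]) for n in (10, 100, 1000)]
tail_infs = [min(a[n:]) for n in (10, 100, 1000)]
print(tail_sups)  # decreasing toward 1
print(tail_infs)  # increasing toward -1
```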

We return to the notion of continuity. A function f ∈ C(X) is said to be uniformly continuous provided that, for any ε > 0, there exists δ > 0 such that (A.12)

x, y ∈ X, d(x, y) ≤ δ =⇒ |f (x) − f (y)| ≤ ε.

An equivalent condition is that f have a modulus of continuity, i.e., a monotonic function ω : [0, 1) → [0, ∞) such that δ & 0 ⇒ ω(δ) & 0, and such that (A.13)

x, y ∈ X, d(x, y) ≤ δ ≤ 1 =⇒ |f (x) − f (y)| ≤ ω(δ).

Not all continuous functions are uniformly continuous. For example, if X = (0, 1) ⊂ R, then f(x) = sin 1/x is continuous, but not uniformly continuous, on X. The following result is useful, for example, in the development of the Riemann integral in §0.

Proposition A.15. If X is a compact metric space and f ∈ C(X), then f is uniformly continuous.

Proof. If not, there exist xν, yν ∈ X and ε > 0 such that d(xν, yν) ≤ 2^{−ν} but

(A.14)

|f (xν ) − f (yν )| ≥ ε.

Taking a convergent subsequence x_{νⱼ} → p, we also have y_{νⱼ} → p. Now continuity of f at p implies f(x_{νⱼ}) → f(p) and f(y_{νⱼ}) → f(p), contradicting (A.14).

If X and Y are metric spaces, the space C(X, Y) of continuous maps f : X → Y has a natural metric structure, under some additional hypotheses. We use

(A.15)
D(f, g) = sup_{x∈X} d(f(x), g(x)).

This sup exists provided f (X) and g(X) are bounded subsets of Y, where to say B ⊂ Y is bounded is to say d : B × B → [0, ∞) has bounded image. In particular, this supremum exists if X is compact. The following result is useful in the proof of the fundamental local existence result for ODE, in §3.


Proposition A.16. If X is a compact metric space and Y is a complete metric space, then C(X, Y), with the metric (A.15), is complete.

Proof. That D(f, g) satisfies the conditions to define a metric on C(X, Y) is straightforward. We check completeness. Suppose (fν) is a Cauchy sequence in C(X, Y), so, as ν → ∞,

(A.16)
sup_{k≥0} sup_{x∈X} d(f_{ν+k}(x), fν(x)) ≤ εν → 0.

Then in particular (fν(x)) is a Cauchy sequence in Y for each x ∈ X, so it converges, say to g(x) ∈ Y. It remains to show that g ∈ C(X, Y) and that fν → g in the metric (A.15). In fact, taking k → ∞ in the estimate above, we have

(A.17)
sup_{x∈X} d(g(x), fν(x)) ≤ εν → 0,

i.e., fν → g uniformly. It remains only to show that g is continuous. For this, let xⱼ → x in X and fix ε > 0. Pick N so that ε_N < ε. Since f_N is continuous, there exists J such that j ≥ J ⇒ d(f_N(xⱼ), f_N(x)) < ε. Hence

j ≥ J ⇒ d(g(xⱼ), g(x)) ≤ d(g(xⱼ), f_N(xⱼ)) + d(f_N(xⱼ), f_N(x)) + d(f_N(x), g(x)) < 3ε.

This completes the proof.

In case Y = R, C(X, R) = C(X), introduced earlier in this appendix. The distance function (A.15) can be written

D(f, g) = ‖f − g‖_sup,   ‖f‖_sup = sup_{x∈X} |f(x)|.

‖f‖_sup is a norm on C(X). Generally, a norm on a vector space V is an assignment f ↦ ‖f‖ ∈ [0, ∞), satisfying

‖f‖ = 0 ⇔ f = 0,   ‖af‖ = |a| ‖f‖,   ‖f + g‖ ≤ ‖f‖ + ‖g‖,

given f, g ∈ V and a a scalar (in R or C). A vector space equipped with a norm is called a normed vector space. It is then a metric space, with distance function D(f, g) = ‖f − g‖. If the space is complete, one calls V a Banach space. In particular, by Proposition A.16, C(X) is a Banach space, when X is a compact metric space.

We next give a couple of slightly more sophisticated results on compactness. The following extension of Proposition A.11 is a special case of Tychonov's Theorem.

Proposition A.17. If {Xⱼ : j ∈ Z⁺} are compact metric spaces, so is X = ∏_{j=1}^∞ Xⱼ.

Here, we can make X a metric space by setting

(A.18)
d(x, y) = Σ_{j=1}^∞ 2^{−j} dⱼ(pⱼ(x), pⱼ(y)) / (1 + dⱼ(pⱼ(x), pⱼ(y))),


where pⱼ : X → Xⱼ is the projection onto the jth factor. It is easy to verify that, if xν ∈ X, then xν → y in X, as ν → ∞, if and only if, for each j, pⱼ(xν) → pⱼ(y) in Xⱼ.

Proof. Following the argument in Proposition A.11, if (xν) is an infinite sequence of points in X, we obtain a nested family of subsequences

(A.19)
(xν) ⊃ (x_ν^1) ⊃ (x_ν^2) ⊃ · · · ⊃ (x_ν^j) ⊃ · · ·

such that p_ℓ(x_ν^j) converges in X_ℓ, for 1 ≤ ℓ ≤ j. The next step is a diagonal construction. We set

(A.20)
ξν = x_ν^ν ∈ X.

Then, for each j, after throwing away a finite number N(j) of elements, one obtains from (ξν) a subsequence of the sequence (x_ν^j) in (A.19), so p_ℓ(ξν) converges in X_ℓ for all ℓ. Hence (ξν) is a convergent subsequence of (xν).

The next result is a special case of Ascoli's Theorem.

Proposition A.18. Let X and Y be compact metric spaces, and fix a modulus of continuity ω(δ). Then

(A.21)
C_ω = { f ∈ C(X, Y) : d(f(x), f(x′)) ≤ ω(d(x, x′)) ∀ x, x′ ∈ X }

is a compact subset of C(X, Y).

Proof. Let (fν) be a sequence in C_ω. Let Σ be a countable dense subset of X, as in Corollary A.5. For each x ∈ Σ, (fν(x)) is a sequence in Y, which hence has a convergent subsequence. Using a diagonal construction similar to that in the proof of Proposition A.17, we obtain a subsequence (ϕν) of (fν) with the property that ϕν(x) converges in Y, for each x ∈ Σ, say

(A.22)
ϕν(x) → ψ(x),

for all x ∈ Σ, where ψ : Σ → Y.

So far, we have not used (A.21). This hypothesis will now be used to show that ϕν converges uniformly on X. Pick ε > 0. Then pick δ > 0 such that ω(δ) < ε/3. Since X is compact, we can cover X by finitely many balls B_δ(xⱼ), 1 ≤ j ≤ N, xⱼ ∈ Σ. Pick M so large that ϕν(xⱼ) is within ε/3 of its limit for all ν ≥ M (when 1 ≤ j ≤ N). Now, for any x ∈ X, picking ℓ ∈ {1, . . . , N} such that d(x, x_ℓ) ≤ δ, we have, for k ≥ 0, ν ≥ M,

(A.23)
d(ϕ_{ν+k}(x), ϕν(x)) ≤ d(ϕ_{ν+k}(x), ϕ_{ν+k}(x_ℓ)) + d(ϕ_{ν+k}(x_ℓ), ϕν(x_ℓ)) + d(ϕν(x_ℓ), ϕν(x))
                     ≤ ε/3 + ε/3 + ε/3.

Thus (ϕν(x)) is Cauchy in Y for all x ∈ X, hence convergent. Call the limit ψ(x), so we now have (A.22) for all x ∈ X. Letting k → ∞ in (A.23) we have uniform convergence of ϕν to ψ. Finally, passing to the limit ν → ∞ in

(A.24)
d(ϕν(x), ϕν(x′)) ≤ ω(d(x, x′))


gives ψ ∈ C_ω.

We want to re-state Proposition A.18, bringing in the notion of equicontinuity. Given metric spaces X and Y, and a set of maps F ⊂ C(X, Y), we say F is equicontinuous at a point x₀ ∈ X provided

(A.25)
∀ ε > 0, ∃ δ > 0 such that ∀ x ∈ X, f ∈ F, d_X(x, x₀) < δ ⇒ d_Y(f(x), f(x₀)) < ε.

We say F is equicontinuous on X if it is equicontinuous at each point of X. We say F is uniformly equicontinuous on X provided

(A.26)
∀ ε > 0, ∃ δ > 0 such that ∀ x, x′ ∈ X, f ∈ F, d_X(x, x′) < δ ⇒ d_Y(f(x), f(x′)) < ε.
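As an illustration of (A.26), the family {x ↦ sin(x + c) : c ∈ R} (a hypothetical example, not from the text) is uniformly equicontinuous with the single modulus ω(δ) = δ, since |d/dx sin(x + c)| ≤ 1 for every c:

```python
import math

# One modulus omega(delta) = delta works uniformly over the whole family,
# which is exactly what (A.26) requires.

def check(pairs, cs, tol=1e-12):
    for x, xp in pairs:
        for c in cs:
            gap = abs(math.sin(x + c) - math.sin(xp + c))
            assert gap <= abs(x - xp) + tol  # |sin a - sin b| <= |a - b|
    return True

print(check([(0.1, 0.13), (0.5, 0.52), (0.9, 0.4)], [0.0, 1.0, 2.5, -3.0]))  # True
```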

Note that (A.26) is equivalent to the existence of a modulus of continuity ω such that F ⊂ C_ω, given by (A.21). It is useful to record the following result.

Proposition A.19. Let X and Y be metric spaces, F ⊂ C(X, Y). Assume X is compact. Then

(A.27)

F equicontinuous =⇒ F is uniformly equicontinuous.

Proof. The argument is a variant of the proof of Proposition A.15. In more detail, suppose there exist xν, x′ν ∈ X, ε > 0, and fν ∈ F such that d(xν, x′ν) ≤ 2^{−ν} but

(A.28)
d(fν(xν), fν(x′ν)) ≥ ε.

Taking a convergent subsequence x_{νⱼ} → p ∈ X, we also have x′_{νⱼ} → p. Now equicontinuity of F at p implies that there exists N < ∞ such that

(A.29)
d(g(x_{νⱼ}), g(p)) < ε/2, ∀ j ≥ N, g ∈ F,

and similarly d(g(x′_{νⱼ}), g(p)) < ε/2, contradicting (A.28).
c} is open and contains q. If X is connected, then A ∪ B cannot be all of X; so any point in its complement has the desired property.

Exercises

1. If X is a metric space, with distance function d, show that

|d(x, y) − d(x′, y′)| ≤ d(x, x′) + d(y, y′),

and hence d : X × X → [0, ∞) is continuous.

2. Let ϕ : [0, ∞) → [0, ∞) be a C² function. Assume

ϕ(0) = 0,  ϕ′ > 0,  ϕ″ < 0.

Prove that if d(x, y) is symmetric and satisfies the triangle inequality, so does δ(x, y) = ϕ(d(x, y)).
Hint. Show that such ϕ satisfies ϕ(s + t) ≤ ϕ(s) + ϕ(t), for s, t ∈ R⁺.

3. Show that the function d(x, y) defined by (A.18) satisfies (A.1).
Hint. Consider ϕ(r) = r/(1 + r).

4. Let X be a compact metric space. Assume fⱼ, f ∈ C(X) and

fⱼ(x) ↗ f(x),  ∀ x ∈ X.


Prove that fj → f uniformly on X. (This result is called Dini’s theorem.) Hint. For ε > 0, let Kj(ε) = {x ∈ X : f(x) − fj(x) ≥ ε}. Note that Kj(ε) ⊃ Kj+1(ε) ⊃ · · · . Given a metric space X and f : X → [−∞, ∞], we say f is lower semicontinuous provided f⁻¹((c, ∞]) ⊂ X is open, ∀ c ∈ R. We say f is upper semicontinuous provided f⁻¹([−∞, c)) is open, ∀ c ∈ R.

5. Show that f is lower semicontinuous ⇐⇒ f −1 ([−∞, c]) is closed, ∀ c ∈ R, and

f is upper semicontinuous ⇐⇒ f −1 ([c, ∞]) is closed, ∀ c ∈ R.

6. Show that f is lower semicontinuous ⇐⇒ xn → x implies lim inf f (xn ) ≥ f (x). Show that f is upper semicontinuous ⇐⇒ xn → x implies lim sup f (xn ) ≤ f (x). 7. Given S ⊂ X, show that χS is lower semicontinuous ⇐⇒ S is open. χS is upper semicontinuous ⇐⇒ S is closed. 8. If X is a compact metric space, show that f : X → R is lower semicontinuous =⇒ min f is achieved.

9. In the setting of (A.4), let

δ(x, y) = {d₁(x₁, y₁)² + · · · + dₘ(xₘ, yₘ)²}^(1/2).

Show that δ(x, y) ≤ d(x, y) ≤ √m δ(x, y).


10. Let X and Y be compact metric spaces. Show that if F ⊂ C(X, Y) is compact, then F is equicontinuous. (This is a converse to Proposition A.20.) 11. Recall that a Banach space is a complete normed linear space. Consider C¹(I), where I = [0, 1], with norm

‖f‖_C¹ = sup_I |f| + sup_I |f′|.

Show that C¹(I) is a Banach space. 12. Let F = {f ∈ C¹(I) : ‖f‖_C¹ ≤ 1}. Show that F has compact closure in C(I). Find a function in the closure of F that is not in C¹(I).


B. Partitions of unity In the text we have made occasional use of partitions of unity, and we include some material on this topic here. We begin by defining and constructing a continuous partition of unity on a compact metric space, subordinate to an open cover {Uj : 1 ≤ j ≤ N} of X. By definition, this is a family of continuous functions ϕj : X → R such that (B.1)

ϕj ≥ 0,

supp ϕj ⊂ Uj ,

X

ϕj = 1.

j

To construct such a partition of unity, we do the following. First, it can be shown that there is an open cover {Vj : 1 ≤ j ≤ N} of X and open sets Wj such that

(B.2) V̄j ⊂ Wj ⊂ W̄j ⊂ Uj.

Given this, let ψj(x) = dist(x, X \ Wj). Then ψj is continuous, supp ψj ⊂ W̄j ⊂ Uj, and ψj is strictly positive on V̄j. Hence Ψ = Σj ψj is continuous and strictly positive on X, and we see that (B.3)

ϕj (x) = Ψ(x)−1 ψj (x)

yields such a partition of unity. We indicate how to construct the sets Vj and Wj used above, starting with V1 and W1. Note that the set K1 = X \ (U2 ∪ · · · ∪ UN) is a compact subset of U1. Assume it is nonempty; otherwise just throw U1 out and relabel the sets Uj. Now set V1 = {x ∈ U1 : dist(x, K1) < (1/3) dist(x, X \ U1)}, and W1 = {x ∈ U1 : dist(x, K1) < (2/3) dist(x, X \ U1)}. To construct V2 and W2, proceed as above, but use the cover {U2, . . . , UN, V1}. Continue until done. Given a smooth compact surface M (perhaps with boundary), covered by coordinate patches Uj (1 ≤ j ≤ N), one can construct a smooth partition of unity on M, subordinate to this cover. The main additional tool for this is the construction of a function ψ ∈ C₀^∞(Rⁿ) such that (B.4)

ψ(x) = 1 for |x| ≤ 1/2, ψ(x) = 0 for |x| ≥ 1.

One way to get this is to start with the function on R given by (B.5)

f₀(x) = e^(−1/x) for x > 0, 0 for x ≤ 0.


It is an exercise to show that

f0 ∈ C ∞ (R).

Now the function f₁(x) = f₀(x)f₀(1/2 − x) belongs to C^∞(R) and is zero outside the interval [0, 1/2]. Hence the function

f₂(x) = ∫_{−∞}^{x} f₁(s) ds

belongs to C^∞(R), is zero for x ≤ 0, and equals some positive constant (say C₂) for x ≥ 1/2. Then

ψ(x) = (1/C₂) f₂(1 − |x|)

is a function on Rⁿ with the desired properties. With this function in hand, to construct the smooth partition of unity mentioned above is an exercise we recommend to the reader.
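The construction (B.4)–(B.5) is concrete enough to check numerically. The following is a one-dimensional sketch (the crude Riemann sum for f₂ is our own device; the function names follow the text):

```python
# Numerical sketch of the bump function psi of (B.4), built from
# f0(x) = exp(-1/x) (x > 0) as in (B.5).
import math

def f0(x):
    return math.exp(-1.0 / x) if x > 0 else 0.0

def f1(x):
    # smooth, supported in [0, 1/2]
    return f0(x) * f0(0.5 - x)

def f2(x, n=2000):
    # f2(x) = integral of f1 from -infinity to x, by a midpoint Riemann sum
    h = 0.5 / n
    return sum(f1((j + 0.5) * h) for j in range(n) if (j + 0.5) * h <= x) * h

C2 = f2(0.5)          # the constant value of f2 on [1/2, infinity)

def psi(x):
    return f2(1.0 - abs(x)) / C2

# psi = 1 for |x| <= 1/2, psi = 0 for |x| >= 1, 0 < psi < 1 in between
assert abs(psi(0.3) - 1.0) < 1e-9
assert psi(1.2) == 0.0
```

A radial version psi(|x|) then gives the function on Rⁿ used in the smooth partition of unity.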


C. Differential forms and the change of variable formula The change of variable formula for one-variable integrals,

(C.1) ∫_a^t f(ϕ(x)) ϕ′(x) dx = ∫_{ϕ(a)}^{ϕ(t)} f(x) dx,

given f continuous and ϕ of class C¹, is easily established, via the fundamental theorem of calculus and the chain rule. We recall how this was done in §0. If we denote the left side of (C.1) by A(t) and the right by B(t), we apply these results to get

(C.2) A′(t) = f(ϕ(t)) ϕ′(t) = B′(t),

and since A(a) = B(a) = 0, another application of the fundamental theorem of calculus (or simply the mean value theorem) gives A(t) = B(t). For multiple integrals, the change of variable formula takes the following form, given in Proposition 4.13: Theorem C.1. Let O, Ω be open sets in Rⁿ and ϕ : O → Ω be a C¹ diffeomorphism. Given f continuous on Ω, with compact support, we have

(C.3) ∫_O f(ϕ(x)) |det Dϕ(x)| dx = ∫_Ω f(x) dx.
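Formula (C.1) is easy to sanity-check numerically. Here is a minimal sketch, with our own choice of test data f(x) = cos x, ϕ(x) = x², a = 0, t = 1, so both sides should equal sin(1); the composite Simpson helper is ours:

```python
# Numerical check of the one-variable change of variable formula (C.1).
import math

def simpson(g, a, b, n=1000):
    # composite Simpson rule, n even
    h = (b - a) / n
    s = g(a) + g(b) + sum((4 if j % 2 else 2) * g(a + j * h) for j in range(1, n))
    return s * h / 3

phi = lambda x: x * x
dphi = lambda x: 2 * x
f = math.cos

lhs = simpson(lambda x: f(phi(x)) * dphi(x), 0.0, 1.0)   # left side of (C.1)
rhs = simpson(f, phi(0.0), phi(1.0))                     # right side of (C.1)
assert abs(lhs - rhs) < 1e-9
```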



There are many variants of Theorem C.1. In particular one wants to extend the class of functions f for which (C.3) holds, but once one has Theorem C.1 as stated, such extensions are relatively painless. See the derivation of Theorem 4.15. Let’s face it; the proof of Theorem C.1 given in §4 was a grim affair, involving careful estimates of volumes of images of small cubes under the map ϕ and numerous pesky details. Recently, P. Lax [L] found a fresh approach to the proof of the multidimensional change of variable formula. More precisely, [L] established the following result, from which Theorem C.1 is an easy consequence. Theorem C.2. Let ϕ : Rⁿ → Rⁿ be a C¹ map. Assume ϕ(x) = x for |x| ≥ R. Let f be a continuous function on Rⁿ with compact support. Then

(C.4) ∫ f(ϕ(x)) det Dϕ(x) dx = ∫ f(x) dx.

We will give a variant of the proof of [L]. One difference between this proof and that of [L] is that we use the language of differential forms. Proof of Theorem C.2. Via standard approximation arguments, it suffices to prove this when ϕ is C² and f ∈ C₀¹(Rⁿ), which we will assume from here on.


To begin, pick A > 0 such that f(x − Ae₁) is supported in {x : |x| > R}, where e₁ = (1, 0, . . . , 0). Also take A large enough that the image of {x : |x| ≤ R} under ϕ does not intersect the support of f(x − Ae₁). We can set

(C.5) F(x) = f(x) − f(x − Ae₁) = ∂ψ/∂x₁(x),

where

(C.6) ψ(x) = ∫₀^A f(x − se₁) ds, ψ ∈ C₀¹(Rⁿ).

Then we have the following identities involving n-forms:

(C.7) α = F(x) dx₁ ∧ · · · ∧ dxₙ = (∂ψ/∂x₁) dx₁ ∧ · · · ∧ dxₙ = dψ ∧ dx₂ ∧ · · · ∧ dxₙ = d(ψ dx₂ ∧ · · · ∧ dxₙ),

i.e., α = dβ, with β = ψ dx₂ ∧ · · · ∧ dxₙ a compactly supported (n − 1)-form of class C¹. Now the pull-back of α under ϕ is given by

(C.8) ϕ*α = F(ϕ(x)) det Dϕ(x) dx₁ ∧ · · · ∧ dxₙ.

Furthermore, the right side of (C.8) is equal to

(C.9) f(ϕ(x)) det Dϕ(x) dx₁ ∧ · · · ∧ dxₙ − f(x − Ae₁) dx₁ ∧ · · · ∧ dxₙ.

Hence we have

(C.10) ∫ f(ϕ(x)) det Dϕ(x) dx₁ · · · dxₙ − ∫ f(x) dx₁ · · · dxₙ = ∫ ϕ*α = ∫ ϕ*dβ = ∫ d(ϕ*β),

where we use the general identity

(C.11) ϕ*dβ = d(ϕ*β),

a consequence of the chain rule. On the other hand, a very special case of Stokes’ theorem applies to

(C.12) ϕ*β = γ = Σⱼ γⱼ(x) dx₁ ∧ · · · ∧ d̂xⱼ ∧ · · · ∧ dxₙ,

where the hat indicates that the factor dxⱼ is omitted,

with γⱼ ∈ C₀¹(Rⁿ). Namely

(C.13) dγ = Σⱼ (−1)^(j−1) (∂γⱼ/∂xⱼ) dx₁ ∧ · · · ∧ dxₙ,


and hence, by the fundamental theorem of calculus,

(C.14) ∫ dγ = 0.

This gives the desired identity (C.4), from (C.10). We make some remarks on Theorem C.2. Note that ϕ is not assumed to be one-to-one or onto. In fact, as noted in [L], the identity (C.4) implies that such ϕ must be onto, and this has important topological implications. We mention that, if one puts absolute values around det Dϕ(x) in (C.4), the appropriate formula is

(C.15) ∫ f(ϕ(x)) |det Dϕ(x)| dx = ∫ f(x) n(x) dx,

where n(x) = #{y : ϕ(y) = x}. A proof of (C.15) can be found in texts on geometric measure theory. As noted in [L], Theorem C.2 was proven in [B-D]. The proof there makes use of differential forms and Stokes’ theorem, but it is quite different from the proof given here. A crucial difference is that the proof in [B-D] requires that one knows the change of variable formula as formulated in Theorem C.1.
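The fact that ϕ need not be injective in Theorem C.2 can be seen even in dimension n = 1. The following toy instance is our own: ϕ equals the identity outside [−1, 1] but folds inside it, yet the signed formula (C.4) still holds because the extra folds cancel in sign (while (C.15), with absolute values, would pick up the multiplicity n(x)):

```python
# Theorem C.2 in dimension 1, with a non-injective phi (our own example).
def g(x):
    return 2.0 * (1 - x * x) ** 2 if abs(x) < 1 else 0.0

def dg(x):
    return -8.0 * x * (1 - x * x) if abs(x) < 1 else 0.0

phi = lambda x: x + g(x)        # C^1, equals x for |x| >= 1, not injective
dphi = lambda x: 1.0 + dg(x)    # changes sign inside (-1, 1)

f = lambda x: max(0.0, 1.0 - x * x)   # continuous, supported in [-1, 1]

def simpson(h, a, b, n=4000):
    s = h(a) + h(b) + sum((4 if j % 2 else 2) * h(a + j * (b - a) / n)
                          for j in range(1, n))
    return s * (b - a) / (3 * n)

lhs = simpson(lambda x: f(phi(x)) * dphi(x), -1.5, 1.5)
rhs = 4.0 / 3.0                 # integral of f over R
assert dphi(0.5) < 0            # phi really is folded
assert abs(lhs - rhs) < 1e-4
```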


D. Remarks on power series If a function f is sufficiently differentiable on an interval in R containing x and y, the Taylor expansion about y reads

(D.1) f(x) = f(y) + f′(y)(x − y) + · · · + (1/n!) f⁽ⁿ⁾(y)(x − y)ⁿ + Rₙ(x, y).

Here, Tₙ(x, y) = f(y) + · · · + f⁽ⁿ⁾(y)(x − y)ⁿ/n! is that polynomial of degree n in x all of whose x-derivatives of order ≤ n, evaluated at y, coincide with those of f. This prescription makes the formula for Tₙ(x, y) easy to derive. The analysis of the remainder term Rₙ(x, y) is more subtle. One useful result about this remainder is the following. Say x > y, and for simplicity assume f⁽ⁿ⁺¹⁾ is continuous on [y, x]; we say f ∈ C^(n+1)([y, x]). Then

(D.2) m ≤ f⁽ⁿ⁺¹⁾(ξ) ≤ M, ∀ ξ ∈ [y, x] =⇒ m (x − y)ⁿ⁺¹/(n + 1)! ≤ Rₙ(x, y) ≤ M (x − y)ⁿ⁺¹/(n + 1)!.

Under our hypotheses, this result is equivalent to the Lagrange form of the remainder:

(D.3) Rₙ(x, y) = (1/(n + 1)!) (x − y)ⁿ⁺¹ f⁽ⁿ⁺¹⁾(ζₙ),

for some ζₙ between x and y. There are various proofs of (D.3). One will be given below. One of our purposes here is to comment on how effective estimates on Rₙ(x, y) are in determining the convergence of the infinite series

(D.4) Σ_{k=0}^{∞} (f⁽ᵏ⁾(y)/k!) (x − y)ᵏ

to f(x). That is to say, we want to perceive that Rₙ(x, y) → 0 as n → ∞, in appropriate circumstances. Before we look at how effective the estimate (D.2) is at this job, we want to introduce another player, and along the way discuss the derivation of various formulas for the remainder in (D.1). A simple formula for Rₙ(x, y) follows upon taking the y-derivative of both sides of (D.1); we are assuming that f is at least (n + 1)-fold differentiable. When we do this (applying the Leibniz formula to those terms that are products) an enormous amount of cancellation arises, and the formula collapses to

(D.5) ∂Rₙ/∂y = −(1/n!) f⁽ⁿ⁺¹⁾(y)(x − y)ⁿ, Rₙ(x, x) = 0.

If we concentrate on Rₙ(x, y) as a function of y and look at the difference quotient [Rₙ(x, y) − Rₙ(x, x)]/(y − x), an immediate consequence of the mean value theorem is that

(D.6) Rₙ(x, y) = (1/n!) (x − y) (x − ξₙ)ⁿ f⁽ⁿ⁺¹⁾(ξₙ),


for some ξₙ between x and y. This result, known as Cauchy’s formula for the remainder, has a slightly more complicated appearance than (D.3), but as we will see it has advantages over Lagrange’s formula. The application of the mean value theorem to obtain (D.6) does not require the continuity of f⁽ⁿ⁺¹⁾, but we do not want to dwell on that point. If f⁽ⁿ⁺¹⁾ is continuous, we can apply the Fundamental Theorem of Calculus to (D.5), in the y-variable, and obtain the basic integral formula

(D.7) Rₙ(x, y) = (1/n!) ∫_y^x (x − s)ⁿ f⁽ⁿ⁺¹⁾(s) ds.

Another proof of (D.7) is indicated in Exercise 9 of §0. If we think of the integral in (D.7) as (x − y) times the mean value of the integrand, we see (D.6) as a consequence. On the other hand, if we want to bring a factor of (x − y)ⁿ⁺¹ outside the integral in (D.7), the change of variable x − s = t(x − y) gives the integral formula

(D.8) Rₙ(x, y) = (1/n!) (x − y)ⁿ⁺¹ ∫₀¹ tⁿ f⁽ⁿ⁺¹⁾(ty + (1 − t)x) dt.

If we think of this integral as 1/(n + 1) times a weighted mean value of f⁽ⁿ⁺¹⁾, we recover the Lagrange formula (D.3). From the Lagrange form (D.3) of the remainder in the Taylor series (D.1) we have the estimate (D.9)

|Rₙ(x, y)| ≤ (|x − y|ⁿ⁺¹/(n + 1)!) sup_{ζ∈I(x,y)} |f⁽ⁿ⁺¹⁾(ζ)|,

where I(x, y) is the open interval from x to y (either (x, y) or (y, x), disregarding the trivial case x = y). Meanwhile, from the Cauchy form (D.6) of the remainder we have the estimate

(D.10) |Rₙ(x, y)| ≤ (|x − y|/n!) sup_{ξ∈I(x,y)} |(x − ξ)ⁿ f⁽ⁿ⁺¹⁾(ξ)|.
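The integral formula (D.7) can itself be tested numerically. The following sketch (our own choice of test case, f = exp, y = 0, x = 1, n = 3) compares the remainder computed directly as f(x) − Tₙ(x, 0) with the value of the integral in (D.7):

```python
# Numerical check of the integral remainder formula (D.7) for f = exp.
import math

def simpson(g, a, b, m=1000):
    s = g(a) + g(b) + sum((4 if j % 2 else 2) * g(a + j * (b - a) / m)
                          for j in range(1, m))
    return s * (b - a) / (3 * m)

n, x, y = 3, 1.0, 0.0
# R_n(x, y) computed directly as f(x) minus the Taylor polynomial T_n
direct = math.e - sum(1.0 / math.factorial(k) for k in range(n + 1))
# R_n(x, y) computed from (D.7)
via_D7 = simpson(lambda s: (x - s) ** n * math.exp(s), y, x) / math.factorial(n)
assert abs(direct - via_D7) < 1e-12
```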

We now study how effective these estimates are in determining that various power series converge. We begin with a look at these remainder estimates for the power series expansion about the origin of the simple function

(D.11) f(x) = 1/(1 − x).

We have, for x ≠ 1,

(D.12) f⁽ᵏ⁾(x) = k! (1 − x)^(−k−1),

and formula (D.1) becomes

(D.13) 1/(1 − x) = 1 + x + · · · + xⁿ + Rₙ(x, 0).


Of course, everyone knows that the infinite series

(D.14) 1 + x + · · · + xⁿ + · · ·

converges to f(x) in (D.11), precisely for x ∈ (−1, 1). What we are interested in is what can be deduced from the estimate (D.9), which, for the function (D.11), takes the form

(D.15) |Rₙ(x, 0)| ≤ |x|ⁿ⁺¹ · sup_{ζ∈I(x,0)} |1 − ζ|^(−n−2).

We consider two cases. First, if x ≤ 0, then |1 − ζ| ≥ 1 for ζ ∈ I(x, 0), so

(D.16) x ≤ 0 =⇒ |Rₙ(x, 0)| ≤ |x|ⁿ⁺¹.

Thus the estimate (D.9) implies that Rₙ(x, 0) → 0 in (D.13), for all x ∈ (−1, 0]. Suppose however that x ≥ 0. What we have from (D.15) is

(D.17) x ≥ 0 =⇒ |Rₙ(x, 0)| ≤ |x|ⁿ⁺¹ sup_{0≤ζ≤x} |1 − ζ|^(−n−2) = (1/(1 − x)) (x/(1 − x))ⁿ⁺¹.

This tends to 0 as n → ∞ if and only if x < 1 − x, i.e., if and only if x < 1/2. What we have is the following: Conclusion. The estimate (D.9) implies the convergence of the Taylor series (about the origin) for the function f(x) = 1/(1 − x), only for −1 < x < 1/2. This example points to a weakness in the estimate (D.9). Now let us see how well we can do with the estimate (D.10). For the function (D.11), this takes the form

(D.18) |Rₙ(x, 0)| ≤ (n + 1) |x| sup_{ξ∈I(x,0)} |x − ξ|ⁿ/|1 − ξ|ⁿ⁺².

For −1 < x ≤ 0 one has an estimate like (D.16), with a harmless factor of (n + 1) thrown in. On the other hand, one readily verifies that

0 ≤ ξ ≤ x < 1 =⇒ (x − ξ)/(1 − ξ) ≤ x,

so we deduce from (D.18) that

(D.19) 0 ≤ x < 1 =⇒ |Rₙ(x, 0)| ≤ (n + 1) xⁿ⁺¹/(1 − x),

which does tend to 0 for all x ∈ [0, 1). One might be wondering if one could come up with some more complicated example, for which Cauchy’s form is effective only on an interval shorter than the interval of convergence.


In fact, you can’t. Cauchy’s form of the remainder is always effective in the interior of the interval of convergence. A proof of this, making use of some complex analysis, is given in [T3]. We look at some more power series, and see when convergence can be established at an endpoint of an interval of convergence, using the estimate (D.10), i.e., (D.20)

(D.20) |Rₙ(x, y)| ≤ Cₙ(x, y), Cₙ(x, y) = (|x − y|/n!) sup_{ξ∈I(x,y)} |(x − ξ)ⁿ f⁽ⁿ⁺¹⁾(ξ)|.
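The contrast between (D.17) and (D.19) is striking numerically. The following sketch (our own illustration, for f(x) = 1/(1 − x) at x = 0.7, where the actual remainder is Rₙ(x, 0) = xⁿ⁺¹/(1 − x)):

```python
# Comparing the Lagrange-based bound (D.17) with the Cauchy-based bound (D.19)
# for f(x) = 1/(1-x) at x = 0.7, n = 40.
x, n = 0.7, 40
actual = x ** (n + 1) / (1 - x)                      # R_n(x, 0) exactly
lagrange_bound = (x / (1 - x)) ** (n + 1) / (1 - x)  # (D.17): blows up for x > 1/2
cauchy_bound = (n + 1) * x ** (n + 1) / (1 - x)      # (D.19): still tends to 0
assert lagrange_bound > 1e9     # useless here
assert cauchy_bound < 1e-3      # effective
assert actual <= cauchy_bound
```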

We consider the following family of examples: (D.21)

f(x) = (1 − x)ᵃ,

a > 0.

The power series expansion has radius of convergence 1 (if a is not an integer) and, as we will see, one has convergence at both endpoints, +1 and −1, whenever a > 0. Let us see when Cn (±1, 0) → 0. We have (D.22)

f⁽ⁿ⁺¹⁾(x) = (−1)ⁿ⁺¹ a(a − 1) · · · (a − n) (1 − x)^(a−n−1).

Hence

(D.23) Cₙ(−1, 0) = (|a(a − 1) · · · (a − n)|/n!) sup_{−1<ξ<0} |−1 − ξ|ⁿ/(1 − ξ)^(n+1−a),

and one can check that Cₙ(−1, 0) → 0 whenever a > 0 in (D.21). On the other hand,

(D.24) Cₙ(1, 0) = (|a(a − 1) · · · (a − n)|/n!) sup_{0<ξ<1} (1 − ξ)^(a−1).

Proposition D.1. For a > 0, the Taylor series about the origin for the function f(x) = (1 − x)ᵃ converges absolutely and uniformly to f(x) on the closed interval [−1, 1]. Proof. As noted, the series is (D.25)

Σ_{n=0}^{∞} cₙxⁿ, cₙ = (−1)ⁿ a(a − 1) · · · (a − n + 1)/n!,

and, by an analysis parallel to (D.23),

(D.26) |cₙ| ≤ C n^(−1−a).

In more detail, if n − 1 > a,

(D.26A) cₙ = −(a/n) ∏_{1≤k≤a} (1 − a/k) ∏_{a<k≤n−1} (1 − a/k),

This was established in §D, and we will use it, with a = 1/2. From the identity x^(1/2) = (1 − (1 − x))^(1/2), we have x^(1/2) ∈ P([0, 2]). More to the point, from the identity

(E.1) |x| = (1 − (1 − x²))^(1/2),

we have |x| ∈ P([−√2, √2]). Using |x| = b⁻¹|bx|, for any b > 0, we see that |x| ∈ P(I) for any interval I = [−c, c], and also for any closed subinterval, hence for any compact interval I. By translation, we have

|x − a| ∈ P(I)

for any compact interval I. Using the identities (E.3)

max(x, y) =

1 1 (x + y) + |x − y|, 2 2

min(x, y) =

1 1 (x + y) − |x − y|, 2 2

we see that for any a ∈ R and any compact I, (E.4)

max(x, a), min(x, a) ∈ P(I).

We next note that P(I) is an algebra of functions, i.e., (E.5)

f, g ∈ P(I), c ∈ R =⇒ f + g, f g, cf ∈ P(I).

Using this, one sees that, given f ∈ P(I), with range in a compact interval J, one has h ◦ f ∈ P(I) for all h ∈ P(J). Hence f ∈ P(I) ⇒ |f | ∈ P(I), and, via (E.3), we deduce that (E.6)

f, g ∈ P(I) =⇒ max(f, g), min(f, g) ∈ P(I).

Suppose now that I 0 = [a0 , b0 ] is a subinterval of I = [a, b]. With the notation x+ = max(x, 0), we have (E.7)

¡ ¢ fII 0 (x) = min (x − a0 )+ , (b0 − x)+ ∈ P(I).

133

This is a piecewise linear function, equal to zero off I \ I 0 , with slope 1 from a0 to the midpoint m0 of I 0 , and slope −1 from m0 to b0 . Now if I is divided into N equal subintervals, any continuous function on I that is linear on each such subinterval can be written as a linear combination of such “tent functions,” so it belongs to P(I). Finally, any f ∈ C(I) can be uniformly approximated by such piecewise linear functions, so we have f ∈ P(I), proving the theorem. A far reaching extension of Theorem E.1, due to M. Stone, is the following result, known as the Stone-Weierstrass theorem. Theorem E.2. Let X be a compact metric space, A a subalgebra of CR (X), the algebra of real valued continuous functions on X. Suppose 1 ∈ A and that A separates points of X, i.e., for distinct p, q ∈ X, there exists hpq ∈ A with hpq (p) 6= hpq (q). Then the closure A is equal to CR (X). We present the proof in eight steps. Step 1. By Theorem E.1, if f ∈ A and ϕ : R → R is continuous, then ϕ ◦ f ∈ A. Step 2. Consequently, if fj ∈ A, then (E.8)

max(f1 , f2 ) =

1 1 |f1 − f2 | + (f1 + f2 ) ∈ A, 2 2

and similarly min(f1 , f2 ) ∈ A. Step 3. It follows from the hypotheses that if p, q ∈ X and p 6= q, then there exists fpq ∈ A, equal to 1 at p and to 0 at q. Step 4. Apply an appropriate continuous ϕ : R → R to get gpq = ϕ ◦ fpq ∈ A, equal to 1 on a neighborhood of p and to 0 on a neighborhood of q, and satisfying 0 ≤ gpq ≤ 1 on X. Step 5. Fix p ∈ X and let U be an open neighborhood of p. By Step 4, given q ∈ X \ U , there exists gpq ∈ A such that gpq = 1 on a neighborhood Oq of p, equal to 0 on a neighborhood Ωq of q, satisfying 0 ≤ gpq ≤ 1 on X. Now {Ωq } is an open cover of X \ U , so there exists a finite subcover Ωq1 , . . . , ΩqN . Let (E.9)

gpU = min gpqj ∈ A. 1≤j≤N

Then gpU = 1 on O = ∩N 1 Oqj , an open neighborhood of p, gpU = 0 on X \ U , and 0 ≤ gpU ≤ 1 on X. Step 6. Take K ⊂ U ⊂ X, K closed, U open. By Step 5, for each p ∈ K, there exists gpU ∈ A, equal to 1 on a neighborhood Op of p, and equal to 0 on X \ U . Now {Op } covers K, so there exists a finite subcover Op1 , . . . , Opm . Let (E.10)

gKU = max gpj U ∈ A. 1≤j≤M

134

We have (E.11)

gKU = 1 on K,

0 on X \ U,

and 0 ≤ gKU ≤ 1 on X.

Step 7. Take f ∈ CR (X) such that 0 ≤ f ≤ 1 on X. Fix k ∈ N and set n `o (E.12) K` = x ∈ X : f (x) ≥ , k so X = K0 ⊃ · · · ⊃ K` ⊃ K`+1 ⊃ · · · Kk ⊃ Kk+1 = ∅. Define open U` ⊃ K` by n n ` − 1o ` − 1o (E.13) U` = x ∈ X : f (x) > , so X \ U` = x ∈ X : f (x) ≤ . k k By Step 6, there exist ψ` ∈ A such that (E.14)

ψ` = 1 on K` ,

ψ` = 0 on X \ U` ,

and 0 ≤ ψ` ≤ 1 on X.

Let (E.15)

fk = max

0≤`≤k

` ψ` ∈ A. k

It follows that fₖ ≥ ℓ/k on K_ℓ and fₖ ≤ (ℓ − 1)/k on X \ U_ℓ, for all ℓ. Hence fₖ ≥ (ℓ − 1)/k on K_{ℓ−1} and fₖ ≤ ℓ/k on U_{ℓ+1}. In other words,

(E.16) (ℓ − 1)/k ≤ f(x) ≤ ℓ/k =⇒ (ℓ − 1)/k ≤ fₖ(x) ≤ ℓ/k,

so

(E.17) |f(x) − fₖ(x)| ≤ 1/k, ∀ x ∈ X.

Step 8. It follows from Step 7 that if f ∈ C_R(X) and 0 ≤ f ≤ 1 on X, then f ∈ Ā. It is an easy final step to see that f ∈ C_R(X) ⇒ f ∈ Ā. Theorem E.2 has a complex analogue. In that case, we add the assumption that f ∈ A ⇒ f̄ ∈ A, and conclude that Ā = C(X). This is easily reduced to the real case. Here are a couple of applications of Theorem E.2, in its real and complex forms: Corollary E.3. If X is a compact subset of Rⁿ, then every f ∈ C(X) is a uniform limit of polynomials on Rⁿ. Corollary E.4. The space of trigonometric polynomials, given by

(E.18) Σ_{k=−N}^{N} aₖe^{ikθ},

is dense in C(S¹).
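The key fact behind Theorem E.1 — that |x| is a uniform limit of polynomials — can also be seen via a classical iteration, different from the binomial-series route used in §D: the polynomials p₀ = 0, p_{k+1}(t) = p_k(t) + (t − p_k(t)²)/2 increase uniformly to √t on [0, 1], so p_k(x²) → |x| on [−1, 1]. A numerical sketch (the iteration count and error tolerance are our own choices):

```python
# Polynomial approximation of |x| on [-1, 1] via the classical iteration
# p_{k+1}(t) = p_k(t) + (t - p_k(t)^2)/2, evaluated pointwise.
def p(t, iters=60):
    v = 0.0
    for _ in range(iters):
        v = v + (t - v * v) / 2.0
    return v   # value at t of a (high-degree) polynomial in t

# p(x^2) approximates |x| uniformly on [-1, 1]
err = max(abs(p(x * x) - abs(x)) for x in [j / 100.0 - 1.0 for j in range(201)])
assert err < 0.05
```

One can show 0 ≤ √t − p_k(t) ≤ 2√t/(2 + k√t) ≤ 2/k on [0, 1], so the error above is about 2/60.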


F. Convolution approach to the Weierstrass approximation theorem If u is bounded and continuous on R and f is integrable (say f ∈ R(R)) we define the convolution f ∗ u by

(F.1) f ∗ u(x) = ∫_{−∞}^{∞} f(y)u(x − y) dy.

Clearly Z (F.2)

|f | dx = A,

|u| ≤ M on R =⇒ |f ∗ u| ≤ AM on R.

Also, a change of variable gives

(F.3) f ∗ u(x) = ∫_{−∞}^{∞} f(x − y)u(y) dy.

We want to analyze the convolution action of a sequence of integrable functions fₙ on R that satisfy the following conditions:

(F.4) fₙ ≥ 0, ∫ fₙ dx = 1, ∫_{R\Iₙ} fₙ dx = εₙ → 0,

where

(F.5) Iₙ = [−δₙ, δₙ], δₙ → 0.

Let u ∈ C(R) be supported on a bounded interval [−A, A], or more generally, assume

u ∈ C(R),

|u| ≤ M on R,

and u is uniformly continuous on R, so with δn as in (F.5), (F.7)

|x − x0 | ≤ 2δn =⇒ |u(x) − u(x0 )| ≤ ε˜n → 0.

We aim to prove the following. Proposition F.1. If fn ∈ R(R) satisfy (F.4)–(F.5) and if u ∈ C(R) is bounded and uniformly continuous (satisfying (F.6)–(F.7)), then (F.8)

un = fn ∗ u −→ u,

uniformly on R, as n → ∞.


Proof. To start, write

(F.9) uₙ(x) = ∫ fₙ(y)u(x − y) dy = ∫_{Iₙ} fₙ(y)u(x − y) dy + ∫_{R\Iₙ} fₙ(y)u(x − y) dy = vₙ(x) + rₙ(x).

|rn (x)| ≤ M εn ,

∀ x ∈ R.

Next, Z (F.11)

vn (x) − u(x) =

fn (y)[u(x − y) − u(x)] dy − εn u(x), In

so (F.12)

|vn (x) − u(x)| ≤ ε˜n + M εn ,

∀ x ∈ R,

|un (x) − u(x)| ≤ ε˜n + 2M εn ,

∀ x ∈ R,

hence (F.13) yielding (F.8). Here is a sequence of functions (fn ) satisfying (F.4)–(F.5). First, set (F.14)

Z

1 2 gn (x) = (x − 1)n , An

1

An =

(x2 − 1)n dx,

−1

and then set (F.15)

fn (x) = gn (x), 0,

|x| ≤ 1, |x| ≥ 1.

It is readily verified that such (fn ) satisfy (F.4)–(F.5). We will use this sequence in Proposition F.1 to prove the Weierstrass approximation theorem, whose statement we recall: Proposition F.2. If I = [a, b] ⊂ R is a closed, bounded interval, and f ∈ C(I), then there exist polynomials pn (x) such that (F.16)

pn −→ f,

uniformly on I.


Proof. To start, we note that by an affine change of variable, there is no loss of generality in assuming that

(F.17) I = [−1/4, 1/4].

Next, given I as in (F.17) and f ∈ C(I), it is easy to extend f to a function

(F.18) u ∈ C(R), u(x) = 0 for |x| ≥ 1/2.

Now, with fₙ as in (F.14)–(F.15), we can apply Proposition F.1 to deduce that

(F.19) uₙ(x) = ∫ fₙ(y)u(x − y) dy =⇒ uₙ → u uniformly on R.

Now

(F.20) |x| ≤ 1/2 =⇒ u(x − y) = 0 for |y| > 1 =⇒ uₙ(x) = ∫ gₙ(y)u(x − y) dy,

that is,

(F.21) |x| ≤ 1/2 =⇒ uₙ(x) = pₙ(x),

where

(F.22) pₙ(x) = ∫ gₙ(y)u(x − y) dy = ∫ gₙ(x − y)u(y) dy.

The last identity makes it clear that each pₙ(x) is a polynomial in x. Since (F.19) and (F.21) imply

(F.23) pₙ −→ u uniformly on [−1/2, 1/2],

we have (F.16).
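The concentration property (F.4)–(F.5) of the kernels (F.14)–(F.15) is easy to observe numerically. The following sketch (our own test, at the single point x = 0 with the smooth function u = cos) evaluates fₙ ∗ u(0) by a midpoint rule; the normalizing constant Aₙ cancels in the ratio:

```python
# Numerical sketch of Proposition F.1 with the kernel g_n of (F.14):
# (f_n * u)(0) -> u(0) as n grows.
import math

def landau_average(n, u, m=4000):
    # ratio of integrals of (y^2-1)^n * u(-y) and (y^2-1)^n over [-1, 1]
    h = 2.0 / m
    num = den = 0.0
    for j in range(m):
        y = -1.0 + (j + 0.5) * h
        w = (y * y - 1.0) ** n
        num += w * u(-y)
        den += w
    return num / den

err = abs(landau_average(100, math.cos) - math.cos(0.0))
assert err < 0.01
```

The mass of gₙ concentrates on an interval of width roughly n^(−1/2) around 0, which is why the error above is already small for n = 100.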


G. Fourier series – first steps We work on T¹ = R/(2πZ), which under θ ↦ e^{iθ} is equivalent to S¹ = {z ∈ C : |z| = 1}. Given f ∈ C(T¹), or more generally f ∈ R(T¹), we set, for k ∈ Z,

(G.1) f̂(k) = (1/2π) ∫₀^{2π} f(θ)e^{−ikθ} dθ.

We call f̂(k) the Fourier coefficients of f. We say

(G.2) f ∈ A(T¹) ⟺ Σ_{k=−∞}^{∞} |f̂(k)| < ∞.

We aim to prove the following. Proposition G.1. Given f ∈ C(T¹), if f ∈ A(T¹), then

(G.3) f(θ) = Σ_{k=−∞}^{∞} f̂(k)e^{ikθ}.

Proof. Given Σ |f̂(k)| < ∞, the right side of (G.3) is absolutely and uniformly convergent, defining

(G.4) g(θ) = Σ_{k=−∞}^{∞} f̂(k)e^{ikθ}, g ∈ C(T¹),

(G.5)

1 2π

Z



ei`θ dθ = 0,

if ` 6= 0,

1,

if ` = 0,

0

we get fˆ(k) = gˆ(k), for all k ∈ Z. Let us set u = f − g. We have (G.6)

u ∈ C(T1 ),

u ˆ(k) = 0,

∀ k ∈ Z.

It remains to show that this implies u ≡ 0. To prove this, we use Corollary E.4, which implies that, for each v ∈ C(T1 ), there exist trigonometric polynomials, i.e., finite linear combinations vN of {eikθ : k ∈ Z}, such that (G.7)

vN −→ v uniformly on T1 .

139

Now (G.6) implies

Z u(θ)vN (θ) dθ = 0,

∀ N,

T1

and passing to the limit, using (G.7), gives Z (G.8)

∀ v ∈ C(T1 ).

u(θ)v(θ) dθ = 0, T1

Taking v = u gives Z |u(θ)|2 dθ = 0,

(G.9) T1

forcing u ≡ 0, and completing the proof. Remark. One can extend Proposition G.1 slightly, allowing a priori that f ∈ R(T1 ) and f ∈ A(T1 ). In such a case, (G.4)–(G.8) continue to hold, and it can be deduced from (G.8) that u(θ) = 0, except perhaps on a set of upper content 0. Hence (G.3) holds, except on a set of upper content 0. However, this is not an extension worthy of emphasizing. It just means that f can be altered on a set of upper content 0 to be continuous. We seek conditions on f that imply (G.2). Integration by parts for f ∈ C 1 (T1 ) gives, for k 6= 0,

(G.10)

1 fˆ(k) = 2π

Z



f (θ) 0

1 = 2πik

Z



i ∂ −ikθ (e ) dθ k ∂θ

f 0 (θ)e−ikθ dθ,

0

hence (G.11)

|fˆ(k)| ≤

1 2π|k|

Z



|f 0 (θ)| dθ.

0

If f ∈ C 2 (T1 ), we can integrate by parts a second time, and get (G.12)

1 fˆ(k) = − 2πk 2

hence |fˆ(k)| ≤

Z

1 2πk 2



f 00 (θ)e−ikθ dθ,

0

Z 0



|f 00 (θ)| dθ.

140

In concert with Z

1 |fˆ(k)| ≤ 2π

(G.13)



|f (θ)| dθ, 0

which follows from (G.1), we have (G.14)

|fˆ(k)| ≤

Z

1 2π(k 2

+ 1)

2π h

¤ |f 00 (θ)| + |f (θ)| dθ.

0

Hence f ∈ C 2 (T1 ) =⇒

(G.15)

X

|fˆ(k)| < ∞.

We will sharpen this implication below. We start with an interesting example. Consider (G.16)

f (θ) = |θ|,

−π ≤ θ ≤ π,

and extend this to be periodic of period 2π, yielding f ∈ C(T1 ). We have 1 fˆ(k) = 2π

(G.17)

Z

π

|θ|e−ikθ dθ

−π

= −[1 − (−1)k ]

1 , πk 2

for k 6= 0, while fˆ(0) = π/2. This is clearly a summable series, so f ∈ A(T1 ), and Proposition G.1 implies that, for −π ≤ θ ≤ π, |θ| = (G.18) =

X 2 π − eikθ 2 2 πk k odd ∞ X

π 4 − 2 π

`=0

1 cos(2` + 1)θ. (2` + 1)2

Now, evaluating this at θ = 0 yields the identity (G.19)

∞ X `=0

1 π2 = . (2` + 1)2 8

Using this, we can evaluate (G.20)

∞ X 1 , S= k2 k=1


as follows. We have

(G.21) Σ_{k=1}^{∞} 1/k² = Σ_{k≥1 odd} 1/k² + Σ_{k≥2 even} 1/k² = π²/8 + (1/4) Σ_{ℓ=1}^{∞} 1/ℓ²,

hence S − S/4 = π²/8, so

(G.22) Σ_{k=1}^{∞} 1/k² = π²/6.

(G.23)

k2

C . +1

This is a special case of the following generalization of (G.15). Proposition G.2. Let f be Lipschitz continuous and piecewise C 2 on S 1 . Then (G.23) holds. Proof. Here we are assuming f is C 2 on S 1 \{p1 , . . . , p` }, and f 0 and f 00 have limits at each of the endpoints of the associated intervals in S 1 , but f is not assumed to be differentiable at the endpoints p` . We can write f as a sum of functions fν , each of which is Lipschitz on S 1 , C 2 on S 1 \ pν , and fν0 and fν00 have limits as one approaches pν from either side. It suffices to show that each fˆν (k) satisfies (G.23). Now g(θ) = fν (θ + pν − π) is singular only at θ = π, and gˆ(k) = fˆν (k)eik(pν −π) , so it suffices to prove Proposition G.2 when f has a singularity only at θ = π. In other words, f ∈ C 2 ([−π, π]), and f (−π) = f (π). In this case, we still have (G.10), since the endpoint contributions from integration by parts still cancel. A second integration by parts gives, in place of (G.12), Z π 1 i ∂ −ikθ ˆ f (k) = (e ) dθ f 0 (θ) 2πik −π k ∂θ (G.24) Z i 1 h π 00 −ikθ 0 0 =− f (θ)e dθ + f (π) − f (−π) , 2πk 2 −π which yields (G.23). We next make use of (G.5) to produce results on lowing.

R T1

|f (θ)|2 dθ, starting with the fol-

Proposition G.3. Given f ∈ A(T1 ), (G.25)

X

1 |fˆ(k)| = 2π

Z

2

|f (θ)|2 dθ. T1

142

More generally, if also g ∈ A(T1 ), X

(G.26)

1 fˆ(k)ˆ g (k) = 2π

Z f (θ)g(θ) dθ. T1

Proof. Switching order of summation and integration and using (G.5), we have 1 2π (G.27)

Z T1

Z X 1 g (k)e−i(j−k)θ dθ f (θ)g(θ) dθ = fˆ(j)ˆ 2π T1 j,k X = fˆ(k)ˆ g (k), k

giving (G.26). Taking g = f gives (G.25). We will extend the scope of Proposition G.3 below. Closely tied to this is the issue of convergence of SN f to f as N → ∞, where (G.28)

SN f (θ) =

X

fˆ(k)eikθ .

|k|≤N

Clearly f ∈ A(S 1 ) ⇒ SN f → f uniformly on T1 as N → ∞. Here, we are interested in convergence in L2 -norm, where Z 1 2 (G.29) kf kL2 = |f (θ)|2 dθ. 2π T1

Given f ∈ R(T1 ), this defines a “norm,” satisfying the following result, called the triangle inequality: (G.30)

kf + gkL2 ≤ kf kL2 + kgkL2 .

See Appendix H for details on this. Behind these results is the fact that kf k2L2 = (f, f )L2 ,

(G.31)

where, when f and g belong to R(T1 ), we set (G.32)

(f, g)L2

1 = 2π

Z f (θ)g(θ) dθ. S1

Thus the content of (G.25) is that (G.33)

X

|fˆ(k)|2 = kf k2L2 ,


and that of (G.26) is that X

(G.34)

fˆ(k)ˆ g (k) = (f, g)L2 .

The left side of (G.33) is the square norm of the sequence (fˆ(k)) in `2 . Generally, a sequence (ak ) (k ∈ Z) belongs to `2 if and only if X (G.35) k(ak )k2`2 = |ak |2 < ∞. There is an associated inner product (G.36)

((ak ), (bk )) =

X

ak bk .

As in (G.30), one has (see Appendix H) (G.37)

k(ak ) + (bk )k`2 ≤ k(ak )k`2 + k(bk )k`2 .

As for the notion of L2 -norm convergence, we say (G.38)

fν → f in L2 ⇐⇒ kf − fν kL2 → 0.

There is a similar notion of convergence in `2 . Clearly (G.39)

kf − fν kL2 ≤ sup |f (θ) − fν (θ)|. θ

In view of the uniform convergence SN f → f for f ∈ A(T1 ) noted above, we have (G.40)

f ∈ A(T1 ) =⇒ SN f → f in L2 , as N → ∞.

The triangle inequality implies ¯ ¯ ¯ ¯ (G.41) ¯kf kL2 − kSN f kL2 ¯ ≤ kf − SN f kL2 , and clearly (by Proposition G.3) (G.42)

kSN f k2L2

=

N X

|fˆ(k)|2 ,

k=−N

so (G.43)

kf − SN f kL2 → 0 as N → ∞ =⇒ kf k2L2 =

X

|fˆ(k)|2 .

We now consider more general functions f ∈ R(T1 ). With fˆ(k) and SN f defined by (G.1) and (G.28), we define RN f by (G.44)

f = SN f + RN f.


Note that

R T1

f (θ)e−ikθ dθ =

(G.45)

R

SN f (θ)e−ikθ dθ for |k| ≤ N , hence

T1

(f, SN f )L2 = (SN f, SN f )L2 ,

and hence (G.46)

(SN f, RN f )L2 = 0.

Consequently, kf k2L2 = (SN f + RN f, SN f + RN f )L2

(G.47)

= kSN f k2L2 + kRN f k2L2 .

In particular, (G.48)

kSN f kL2 ≤ kf kL2 .

We are now in a position to prove the following. Lemma G.4. Let f, fν be square integrable on S 1 . Assume (G.49)

lim kf − fν kL2 = 0,

ν→∞

and, for each ν, (G.50)

lim kfν − SN fν kL2 = 0.

N →∞

Then (G.51)

lim kf − SN f kL2 = 0.

N →∞

Proof. Writing f − SN f = (f − fν ) + (fν − SN fν ) + SN (fν − f ), and using the triangle inequality, we have, for each ν, (G.52)

kf − SN f kL2 ≤ kf − fν kL2 + kfν − SN fν kL2 + kSN (fν − f )kL2 .

Taking N → ∞ and using (G.48), we have (G.53)

lim sup kf − SN f kL2 ≤ 2kf − fν kL2 , N →∞

for each ν. Then (G.49) yields the desired conclusion (G.51). Given f ∈ C(S 1 ), we have trigonometric polynomials fν → f uniformly on T1 , and clearly (G.50) holds for each such fν . Thus Lemma G.4 yields the following. (G.54)

f ∈ C(S 1 ) =⇒ SN f → f in L2 , and X |fˆ(k)|2 = kf k2L2 .
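Parseval's identity in (G.54) can be checked numerically for the function f(θ) = |θ| of (G.16), whose Fourier coefficients were computed in (G.17): f̂(0) = π/2, f̂(k) = −2/(πk²) for odd k, and 0 for even nonzero k. The truncation point below is our own choice:

```python
# Parseval's identity for f(theta) = |theta|: sum |f^(k)|^2 = (1/2pi) int |f|^2.
import math

lhs = (math.pi / 2) ** 2 + 2 * sum((2.0 / (math.pi * k * k)) ** 2
                                   for k in range(1, 20000, 2))
rhs = math.pi ** 2 / 3    # (1/2pi) * integral of theta^2 over [-pi, pi]
assert abs(lhs - rhs) < 1e-6
```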


Lemma G.4 also applies to many discontinuous functions. Consider, for example

(G.55) f(θ) = 0 for −π < θ < 0, 1 for 0 < θ < π.

We can set, for ν ∈ N,

(G.56) f_ν(θ) = 0 for −π < θ < 0, νθ for 0 ≤ θ ≤ 1/ν, 1 for 1/ν ≤ θ < π.

Then each f_ν ∈ C(T¹). (In fact, f_ν ∈ A(T¹), by Proposition G.2.) Also, one can check that ‖f − f_ν‖²_L² ≤ 1/ν. Thus the conclusion in (G.54) holds for f given by (G.55). More generally, any piecewise continuous function on T¹ is an L² limit of continuous functions, so the conclusion of (G.54) holds for them. To go further, let us recall the class of Riemann integrable functions. A function f : T¹ → R is Riemann integrable provided f is bounded (say |f| ≤ M) and, for each δ > 0, there exist piecewise constant functions g_δ and h_δ on T¹ such that

(G.57) g_δ ≤ f ≤ h_δ, and ∫_{T¹} (h_δ(θ) − g_δ(θ)) dθ < δ.

Then

(G.58) ∫_{T¹} f(θ) dθ = lim_{δ→0} ∫_{T¹} g_δ(θ) dθ = lim_{δ→0} ∫_{T¹} h_δ(θ) dθ.

Note that we can assume |h_δ|, |g_δ| < M + 1, and so

(G.59) (1/2π) ∫_{T¹} |f(θ) − g_δ(θ)|² dθ ≤ ((M + 1)/π) ∫_{T¹} |h_δ(θ) − g_δ(θ)| dθ < ((M + 1)/π) δ,

so g_δ → f in L²-norm. A function f : T¹ → C is Riemann integrable provided its real and imaginary parts are. In such a case, there are also piecewise constant functions f_ν → f in L²-norm, so

f ∈ R(T1 ) =⇒ SN f → f in L2 , and X |fˆ(k)|2 = kf k2L2 .

This is not the end of the story. There are unbounded functions on T¹ that are square integrable, such as

(G.61) f(θ) = |θ|^(−α) on [−π, π], 0 < α < 1/2.

(H.12) ‖af‖ = |a| · ‖f‖, a ∈ C, f ∈ V,

(H.13) ‖f‖ > 0 unless f = 0,

(H.14) ‖f + g‖ ≤ ‖f‖ + ‖g‖.

The property (H.14) is called the triangle inequality. A vector space equipped with a norm is called a normed vector space. We can define a distance function on such a space by

(H.15)   d(f, g) = ‖f − g‖.

Properties (H.12)–(H.14) imply that d : V × V → [0, ∞) satisfies the properties in (A.1), making V a metric space.

If ‖f‖ is given by (H.11), from an inner product satisfying (H.6), it is clear that (H.12)–(H.13) hold, but (H.14) requires a demonstration. Note that

(H.16)   ‖f + g‖² = (f + g, f + g) = ‖f‖² + (f, g) + (g, f) + ‖g‖² = ‖f‖² + 2 Re(f, g) + ‖g‖²,

while

(H.17)   (‖f‖ + ‖g‖)² = ‖f‖² + 2‖f‖ · ‖g‖ + ‖g‖².

Thus to establish (H.14) it suffices to prove the following, known as Cauchy's inequality.


Proposition H.1. For any inner product on a vector space V, with ‖f‖ defined by (H.11),

(H.18)   |(f, g)| ≤ ‖f‖ · ‖g‖,   ∀ f, g ∈ V.

Proof. We start with

(H.19)   0 ≤ ‖f − g‖² = ‖f‖² − 2 Re(f, g) + ‖g‖²,

which implies

(H.20)   2 Re(f, g) ≤ ‖f‖² + ‖g‖²,   ∀ f, g ∈ V.

Replacing f by af for arbitrary a ∈ ℂ of absolute value 1 yields 2 Re a(f, g) ≤ ‖f‖² + ‖g‖² for all such a, hence

(H.21)   2|(f, g)| ≤ ‖f‖² + ‖g‖²,   ∀ f, g ∈ V.

Replacing f by tf and g by t⁻¹g for arbitrary t ∈ (0, ∞), we have

(H.22)   2|(f, g)| ≤ t²‖f‖² + t⁻²‖g‖²,   ∀ f, g ∈ V, t ∈ (0, ∞).

If we take t² = ‖g‖/‖f‖, we obtain the desired inequality (H.18). This assumes f and g are both nonzero, but (H.18) is trivial if f or g is 0.

An inner product space V is called a Hilbert space if it is a complete metric space, i.e., if every Cauchy sequence (f_ν) in V has a limit in V. The space ℓ² has this completeness property, but C(S¹), with the inner product (H.7), does not, nor does R(S¹). Appendix A describes a process of constructing the completion of a metric space. When applied to an incomplete inner product space, it produces a Hilbert space. When this process is applied to C(S¹), the completion is the space L²(S¹). An alternative construction of L²(S¹) uses the Lebesgue integral.
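Cauchy's inequality (H.18), and the role of the optimizing choice t² = ‖g‖/‖f‖ in (H.22), can be illustrated numerically for the inner product (f, g) = (1/2π) ∫ f ḡ dθ on trigonometric polynomials, with the integral replaced by a mean over a uniform periodic grid (exact for trigonometric polynomials of modest degree):

```python
import numpy as np

# Numerical illustration of Cauchy's inequality (H.18) for the inner product
# (f, g) = (1/2pi) ∫ f(θ) conj(g(θ)) dθ, discretized as a grid mean.

rng = np.random.default_rng(0)
M = 1024
theta = 2 * np.pi * np.arange(M) / M

def inner(f, g):
    return np.mean(f * np.conj(g))

def rand_trig(deg=5):
    # random complex trigonometric polynomial of degree <= deg
    ks = np.arange(-deg, deg + 1)
    c = rng.normal(size=ks.size) + 1j * rng.normal(size=ks.size)
    return (c[:, None] * np.exp(1j * np.outer(ks, theta))).sum(axis=0)

ok = True
for _ in range(100):
    f, g = rand_trig(), rand_trig()
    nf = float(np.sqrt(inner(f, f).real))
    ng = float(np.sqrt(inner(g, g).real))
    ok = ok and abs(inner(f, g)) <= nf * ng + 1e-12
    # the choice t^2 = ||g||/||f|| in (H.22) makes the bound 2||f|| ||g||
    t2 = ng / nf
    ok = ok and abs(t2 * nf**2 + ng**2 / t2 - 2 * nf * ng) < 1e-9
print(ok)   # True
```

The second assertion mirrors the last step of the proof: with t² = ‖g‖/‖f‖, the right side of (H.22) collapses to 2‖f‖·‖g‖.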



Index

A
algebra of sets 51
analytic function
  complex 24
  real 35
anticommutation relation 81
area of spheres 69
arcsin 43
arctan 44
Ascoli's theorem 114
autonomous system 39
averaging over rotations 71

B
basis 28, 64
Brouwer fixed-point theorem 106
boundary 51

C
Cauchy sequence 33, 108
Cauchy-Riemann equations 24, 36, 99, 103
Cauchy's integral formula 99, 104
Cauchy's integral theorem 99
chain rule 14, 16, 19, 44, 66, 105
change of variable formula 54, 56, 59, 62, 67, 77, 118
characteristic function 8, 50
closed set 108
closure 50, 109
cofactor 30
compact space 109
complete metric space 33, 108
completion of a metric space 108
conformal map 105
connected set 115
content 8, 50
continuous 6, 49


  uniformly 6, 112
Contraction Mapping Principle 33, 38
coordinate chart 65
corner 87
cos 42, 43, 105
Cramer's formula 26
cross product 31, 67
curl 84, 93, 94
cusp 88

D
Darboux's Theorem 7
derivation 92
derivative 9, 16
determinant 26, 28, 77
diagonal construction 114
diffeomorphism 32, 42, 56, 65
differential equation 38
differential form 75
  closed 83
  exact 83
div 84, 89, 90
Divergence Theorem 89, 91, 97

E
Euclidean norm 16
Euler's formula 42
exponential 41, 42
  of a matrix 70
exterior derivative 82

F
fixed point 33, 106
flow 45
Fubini's Theorem 53, 60
Fundamental Theorem of Algebra 102
Fundamental Theorem of Calculus 9, 10, 14, 17, 25, 86, 97

G
Gamma function 69
Gaussian integral 60


Gl(n, R) 54, 64
grad 84
Green's formula 93
Green's Theorem 89, 95, 99

H
Haar measure 71
harmonic conjugate 103
harmonic function 101
Hessian 20
holomorphic function 24, 36, 99

I
Implicit Function Theorem 34, 74
initial value problem 41
integral curve 45, 85
integral equation 38
integrating factor 85
integration by parts 14, 70
interior 51, 109
interior product 76
Intermediate Value Theorem 115
Inverse Function Theorem 32, 65, 70
iterated integral 25, 53

L
Laplace operator 92
Lie bracket 46
Liouville's Theorem 102
Lipschitz 9, 38, 52, 63
log 42

M
matrix 16, 21, 23, 54, 64, 66
maximum 11, 22, 112
mean value property 102
Mean Value Theorem 10, 112
metric space 108
metric tensor 66, 71
minimum 11, 22, 112
modulus of continuity 112
multi-index 19


N
neighborhood 108
Newton's method 36
nil set 51, 52
normal derivative 92
normal vector
  outward 89, 91, 96
  positive 93, 94, 96

O
open set 108
orbit 45
orientation 77, 86, 88, 90

P
partial derivative 16
partition 5, 48, 61
  of unity 86, 92, 116
permutation 28
  sign of 28
pi (π) 43, 44, 72
Picard iteration 38, 41
Poincaré lemma 83, 88
polar coordinates 59, 61, 69
positive-definite matrix 22, 32, 66
power series 25, 36, 100, 121
pull-back 75, 77, 82

R
rational numbers 7, 108
real numbers 108
Riemann integral 5, 48
row reduction 64

S
saddle point 22
sin 42, 43, 105
Skew(n) 70
SO(n) 31, 70


sphere
  area of 69
spherical polar coordinates 61, 68
Stokes formula 86, 94, 95, 106
Stone-Weierstrass theorem 128
surface 65
surface integral 67, 72

T
tan 44
tangent space 65
tangent vector
  forward 94, 96
Taylor's formula with remainder 14, 19, 122
triangle inequality 108
trigonometric functions 43
Tychonov's theorem 113

V
vector field 45, 89, 90, 95, 96
volume 48, 68, 71
volume form 90, 96

W
wedge product 81
Weierstrass theorem 127