As m > n, there is a nonzero solution to the equation Ax = 0; call it (α1, . . . , αm)^t. Thus, we have α1 v1 + · · · + αm vm = 0, with at least one of the αi nonzero. This is a contradiction.

We can now prove that any two finite bases of V have the same number of elements:

Theorem 2.6.16. Let v1, . . . , vm and w1, . . . , wn be two bases of a vector space V over F. Then m = n.

Proof. Without loss of generality, assume m > n. Since the wi's form a basis, we may write
\[
v_j = a_{1j} w_1 + \cdots + a_{nj} w_n, \qquad j = 1, 2, \ldots, m.
\]
The coefficient vectors (a1j, . . . , anj), j = 1, . . . , m, are m vectors in Fn, so they are linearly dependent by the previous lemma. This means there is a nontrivial linear combination of the coefficient vectors equal to 0. It's an easy calculation to check that the vectors v1, . . . , vm satisfy the same nontrivial linear combination, so it equals 0 as well. This contradicts the linear independence of v1, . . . , vm, so we cannot have m > n. By symmetry, we cannot have n > m either, and hence m = n.

We can now end this section by defining the dimension of a vector space:

Definition 2.6.17. Let V be a vector space over a field F. If V has a finite basis, we define the dimension of V to be the number of elements in any basis of V and say that V is finite-dimensional. Otherwise, we say that V is infinite-dimensional.

2.7. Row reduction and the general linear group: Shearing is caring. A matrix J decides to change the names of its rows to pears. After some time, J is despondent over the name change and shears with her companion matrix R. R comforts J, "Don't diss pear. For that which we call rows by any other name is just as sweet."

Suppose that we have a system of equations:
\[
\begin{aligned}
a_{11} x_1 + \cdots + a_{1m} x_m &= b_1 \\
a_{21} x_1 + \cdots + a_{2m} x_m &= b_2 \\
&\;\;\vdots \\
a_{l1} x_1 + \cdots + a_{lm} x_m &= b_l
\end{aligned}
\]
We can rewrite this system as an equation of matrices:
\[
\begin{pmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{l1} & \cdots & a_{lm} \end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_m \end{pmatrix}
=
\begin{pmatrix} b_1 \\ \vdots \\ b_l \end{pmatrix}.
\]
For fixed (aij) and (b1, . . . , bl), we wish to find a solution (x1, . . . , xm). It will be much easier to find a solution if (aij) is in a simple form or we know its inverse. That's the content of this section.

Definition 2.7.1. A matrix is in row echelon form if
• The first nonzero entry in each row (if any) is normalized to be 1. This is called the pivot (or leading 1) for the row.
• The leading 1s (if any) appear in echelon form, i.e., as we move down the rows, the leading 1s appear farther to the right.
A matrix in row echelon form is said to be in row reduced echelon form if the entries above and below each leading 1 are zero.
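For the computationally inclined reader, here is a minimal sketch (Python with SymPy, a tool these notes do not assume) that reduces a matrix to row reduced echelon form; the number of leading 1s it produces equals the rank, as recorded in Proposition 2.7.3 below.

```python
# Sketch only (assumes SymPy is installed); not part of the original notes.
from sympy import Matrix

# A coefficient matrix (a_ij); its second row is twice the first.
A = Matrix([
    [1, 2, -1],
    [2, 4, -2],
    [0, 1,  3],
])

# rref() returns the row reduced echelon form and the pivot column indices.
R, pivot_cols = A.rref()
print(R)            # Matrix([[1, 0, -7], [0, 1, 3], [0, 0, 0]])
print(pivot_cols)   # (0, 1): two leading 1s
print(A.rank())     # 2, which agrees with len(pivot_cols)
```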
Example 2.7.2. The matrix
\[ \begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \]
is in row echelon form, but not row reduced echelon form. The matrix
\[ \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & -3 \end{pmatrix} \]
is in row reduced echelon form. The matrix
\[ \begin{pmatrix} 2 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix} \]
is not in row echelon form.

To transform a matrix into row reduced echelon form, we can perform the following three row operations:
(1) Interchanging rows
(2) Adding a multiple of a row to another row
(3) Multiplying a row by a nonzero constant

Working through the columns from left to right, one can prove the following proposition:

Proposition 2.7.3. Any matrix (possibly nonsquare) can be reduced to row reduced echelon form by a finite number of the above row operations, with the number of pivots equal to the matrix's rank (the dimension of the span of its columns).

We now show that we can associate row operations with so-called elementary matrices. Recall that linear operators on Fm correspond to m × m matrices. In fact,
\[ M_m(F) \ni A \;\longleftrightarrow\; (x \mapsto Ax) \in L(F^m, F^m). \]
(See Proposition 2.2.3.) Also recall that
\[
A \begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix}
=
\begin{pmatrix} | & & | \\ A(v_1) & \cdots & A(v_n) \\ | & & | \end{pmatrix}
\]
for matrices of appropriate sizes.
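As a quick numerical sanity check (a sketch assuming NumPy, which is not part of these notes), multiplying A against a matrix whose columns are v1, . . . , vn is the same as applying A to each column separately:

```python
# Sketch only (assumes NumPy); verifies A [v1 | v2] = [A v1 | A v2].
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
v1 = np.array([1., 0.])
v2 = np.array([-1., 5.])

V = np.column_stack([v1, v2])          # matrix with columns v1, v2
columnwise = np.column_stack([A @ v1, A @ v2])
assert np.allclose(A @ V, columnwise)  # the two products agree
```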
Finally, observe the marvelous fact that when we are performing a row operation, we are actually performing the same transformation on each of the columns! Fix a matrix A with m rows. We have the following correspondence between row operations, linear operators Fm → Fm, and left multiplication by an elementary matrix E ∈ Mm(F):

1. Interchanging rows k and l ←→ switching xk and xl, keeping everything else the same ←→ left multiplication by Ikl.
2. Adding α × row l to row k ←→ replacing xk with xk + αxl, keeping everything else the same ←→ left multiplication by Rkl(α).
3. Multiplying row k by α ≠ 0 ←→ replacing xk with αxk, keeping everything else the same ←→ left multiplication by Mk(α).
For example, interchanging rows 1 and 2 corresponds to the linear operator Fm → Fm given by
\[
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_m \end{pmatrix}
\mapsto
\begin{pmatrix} x_2 \\ x_1 \\ x_3 \\ \vdots \\ x_m \end{pmatrix},
\]
which corresponds to the identity matrix with its first and second rows switched:
\[
I_{12} = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
1 & 0 & 0 & \cdots & 0 \\
0 & 0 & 1 & & \vdots \\
\vdots & & & \ddots & \\
0 & 0 & \cdots & & 1
\end{pmatrix}.
\]
The three types of matrices are written explicitly below. Recall that we define Ekl ∈ Mm(F) to be the matrix for which the entry in the kth row and the lth column is 1 and all other entries are 0. Fix A ∈ Mm×n(F). The row operations can be achieved by multiplying A on the left by the elementary matrices:
(1) Interchanging rows k and l: This can be achieved by the matrix multiplication Ikl A, where
\[ I_{kl} = E_{kl} + E_{lk} + \sum_{i \neq k,l} E_{ii} \in M_m(F). \]
(2) Multiplying row l by α and adding it to row k ≠ l: This can be achieved by the matrix multiplication Rkl(α)A, where
\[ R_{kl}(\alpha) = \mathrm{Id}_m + \alpha E_{kl} \in M_m(F). \]
(3) Multiplying row k by α ∈ F \ {0}: This can be achieved by Mk(α)A, where
\[ M_k(\alpha) = \alpha E_{kk} + \sum_{i \neq k} E_{ii} \in M_m(F). \]
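For concreteness, here is a minimal sketch (Python with NumPy, an assumed tool; the helper names are ours, not from these notes) that builds the three elementary matrices from the formulas above and checks that left multiplication performs the corresponding row operations:

```python
# Sketch only (assumes NumPy); the function names are illustrative.
import numpy as np

def E(m, k, l):
    """m x m matrix with a single 1 in row k, column l (0-indexed)."""
    out = np.zeros((m, m))
    out[k, l] = 1.0
    return out

def I_swap(m, k, l):                       # interchange rows k and l
    return E(m, k, l) + E(m, l, k) + sum(E(m, i, i) for i in range(m) if i not in (k, l))

def R_add(m, k, l, alpha):                 # add alpha * (row l) to row k
    return np.eye(m) + alpha * E(m, k, l)

def M_scale(m, k, alpha):                  # multiply row k by alpha != 0
    return alpha * E(m, k, k) + sum(E(m, i, i) for i in range(m) if i != k)

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

assert np.allclose(I_swap(3, 0, 1) @ A, A[[1, 0, 2], :])   # rows 0 and 1 swapped
B = A.copy(); B[2] += 10 * B[0]
assert np.allclose(R_add(3, 2, 0, 10.0) @ A, B)            # row 2 += 10 * row 0
C = A.copy(); C[1] *= -2
assert np.allclose(M_scale(3, 1, -2.0) @ A, C)             # row 1 scaled by -2
```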
We can similarly define elementary matrices corresponding to column operations. To perform a column operation, one multiplies a matrix on the right by an associated elementary matrix.

Remark 2.7.4. When m = n, an easy way to remember the form of these elementary matrices is by applying the above operations to A = Id.

Definition 2.7.5. The kernel of a matrix A ∈ Ml×m(F) is defined to be the kernel of T : Fm → Fl, T(x) := Ax. Equivalently, ker(A) = {x ∈ Fm : Ax = 0}.

Example 2.7.6. Let
\[
A = \begin{pmatrix} 1 & 0 & 2 \\ 0 & -1 & 3 \end{pmatrix} \in M_{2 \times 3}(\mathbb{R}).
\]
The kernel of A is the set of all (x, y, z) ∈ R3 such that
\[
\begin{pmatrix} 1 & 0 & 2 \\ 0 & -1 & 3 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0.
\]
As the rank of A is 2 and A has 3 columns, the Rank-Nullity Theorem implies the kernel of A must be 1-dimensional. We know that any (x, y, z) ∈ ker A satisfies
\[ x + 2z = 0, \qquad -y + 3z = 0. \]
This implies ker A = {(−2z, 3z, z) : z ∈ R}.

Definition 2.7.7. A matrix A ∈ Mn(F) is invertible if there is a matrix A−1 ∈ Mn(F) such that AA−1 = A−1A = Idn.

An interesting fact about invertible matrices is that the row reduced echelon form of an invertible matrix is the identity matrix. This results from the fact that the rank of any matrix equals the number of leading 1's of its row reduced echelon form.

The following proposition is an interesting consequence of Corollary 3.3.4. It will not be used in these notes so as to avoid any circular reasoning.

Proposition 2.7.8. (One-sided invertibility implies invertibility) Let A ∈ Mn(F). Then A is invertible if there exists a matrix B ∈ Mn(F) satisfying one of the following conditions:
• AB = Idn.
• BA = Idn.
In either case, B = A−1.

We now give a formal definition of a group, as we will be interested in several matrix groups throughout these notes.

Definition 2.7.9. A set G with a binary operation · : G × G → G, (a, b) ↦ a · b, is called a group if it satisfies the following three conditions: For all g1, g2, g3 ∈ G,
• (Associativity) (g1 · g2) · g3 = g1 · (g2 · g3)
• (Existence of a unit) There exists an element e ∈ G such that g1 · e = g1 = e · g1.
• (Existence of inverses) There exists an element g1−1 ∈ G such that g1−1 · g1 = e = g1 · g1−1.
We typically omit writing · for the product a · b and simply write ab.

Definition 2.7.10. For a group G and a (possibly infinite) subset S ⊂ G, we say that S generates G if each element of G can be written as a finite product of elements of S and inverses of elements of S.

Remark 2.7.11. Much like how we only consider finite linear combinations of vectors, we only consider finite products of group elements.

The collection of all invertible n × n matrices is called the general linear group and is denoted by
GLn(F) = {A ∈ Mn(F) : A is invertible}.
This collection forms a group with the operation of matrix multiplication.

Suppose a matrix A ∈ Mn(F) is invertible. Then we can find the inverse of A by setting up the augmented matrix [A | Id] and applying row operations to reduce A to Id. The inverse of A will be on the right side of the bar. We get
\[ [A \mid \mathrm{Id}] \longrightarrow [\mathrm{Id} \mid A^{-1}]. \]
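As an illustration (a sketch assuming SymPy, which is not part of these notes), one can carry out this procedure by row reducing the augmented matrix:

```python
# Sketch only (assumes SymPy); computes an inverse via row reduction of [A | Id].
from sympy import Matrix, eye

A = Matrix([[2, 1],
            [1, 1]])

augmented = A.row_join(eye(2))      # the augmented matrix [A | Id]
reduced, _ = augmented.rref()       # row reduce; left block becomes Id since A is invertible
A_inv = reduced[:, 2:]              # the right block is A^{-1}

assert A * A_inv == eye(2)          # sanity check: A A^{-1} = Id
print(A_inv)                        # Matrix([[1, -1], [-1, 2]])
```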
This leads us to the following proposition:
Proposition 2.7.12. Each elementary matrix is invertible. On the other hand, any invertible matrix can be written as a product of elementary matrices. In particular, the elementary matrices Ikl, Rkl(α), and Mk(α) (α ∈ F, with α ≠ 0 for Mk(α)) generate GLn(F).

Proof. Since each row operation can be reversed and row operations correspond with elementary matrices, each elementary matrix is invertible. More explicitly,
\[ I_{kl}^{-1} = I_{kl}, \qquad R_{kl}(\alpha)^{-1} = R_{kl}(-\alpha), \qquad M_k(\alpha)^{-1} = M_k(\alpha^{-1}). \]
For the other direction, fix an invertible matrix A. By Proposition 2.7.3, there exists a finite sequence of elementary matrices E1, E2, . . . , Er such that Er · · · E2 E1 A = Idn. Then A = E1^{-1} E2^{-1} · · · Er^{-1}. We are done since we showed earlier that the inverse of an elementary matrix is an elementary matrix.

Remark 2.7.13. This proposition can be useful because a statement about invertible matrices can sometimes be reduced to proving it for the subset of elementary matrices, then applying an inductive argument. For example, one can use this proposition to prove one of the most important properties of the determinant (which we will talk about next chapter): For any linear operator T : Rn → Rn and any "nice" subset E of Rn,
\[ \mathrm{Volume}(T(E)) = |\det(T)| \cdot \mathrm{Volume}(E). \]
Hence, linear operators uniformly scale the volume of "nice" subsets of Rn by the absolute value of their determinants. Here, a "nice" set is a Lebesgue measurable set; in particular, every open set is Lebesgue measurable. In fact, the collection of Lebesgue measurable sets is so large that one needs the Axiom of Choice to prove that non-Lebesgue-measurable sets exist! You can learn more about the topic of measurable sets in Math 540: Real Analysis.

Recall that any matrix can be reduced to row echelon form by iteratively applying row operations. Through the correspondence of row operations and invertible matrices, we obtain the following proposition:

Proposition 2.7.14. For A ∈ Mn(F), there exists P ∈ GLn(F) such that P A is upper triangular:
\[
P A = \begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
0 & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & b_{nn}
\end{pmatrix}.
\]
Moreover, ker(A) = ker(P A), and ker(A) = {0} if and only if each of the diagonal entries of P A is nonzero.

Recall Proposition 3.3.6, which states that the column rank of a matrix equals its row rank, and the Rank-Nullity Theorem (Theorem 3.3.2), which states that for a linear transformation T : Fm → Fl, m = dim(im(T)) + dim(ker(T)). These two results give us the main theorem of this section:
Theorem 2.7.15. Let A ∈ Mn(F). Then dim(ker(A)) = dim(ker(At)), where At ∈ Mn(F) is the transpose of A, defined by (At)ij := Aji.

Indeed, the Rank-Nullity Theorem gives dim(ker(A)) = n − rank(A) and dim(ker(At)) = n − rank(At), while Proposition 3.3.6 gives rank(A) = rank(At) since the columns of At are the rows of A. This result will give us that the eigenvalues of a square matrix are the same as those of its transpose, occurring with the same geometric multiplicities. This theorem will not be used before the proof of the Rank-Nullity Theorem, to avoid circular reasoning.
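A small experiment (a sketch assuming SymPy; not part of these notes) comparing the kernels of a singular matrix and its transpose:

```python
# Sketch only (assumes SymPy); checks dim ker(A) = dim ker(A^t) on an example.
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 6],     # twice the first row, so A is singular
            [0, 1, 1]])

ker_A  = A.nullspace()     # basis of ker(A)
ker_At = A.T.nullspace()   # basis of ker(A^t)

print(len(ker_A), len(ker_At))     # both dimensions equal 1
assert len(ker_A) == len(ker_At)   # dim ker(A) = dim ker(A^t)
print(ker_A[0].T, ker_At[0].T)     # the kernels themselves generally differ
```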
3. Vector spaces: But wait, there's more!

I don't know if the reader has ever been to Fat Sandwich Company, but it's a delightful little sandwich spot on E. John St. It combines a variety of exquisite cooking methods and scrumptious ingredients to create delicate delicacies, fitting of the world's finest restaurants. The thing is, the lucky consumer already knows what cheesesteak, chicken tenders, mozzarella sticks, waffle fries, mayonnaise, and an egg taste like individually. But could you imagine them dumped in a hot fryer, then stuffed together in a hoagie roll? That sounds amazing. This is what this chapter is about. We already know the definition of a vector space and a linear transformation. We will concoct new vector spaces like subspaces, direct sums of vector spaces, and quotient spaces, and consider how linear transformations fit in. How do I feel about this? Well, I'm hungry just thinking about it.

3.1. Subspaces.

Definition 3.1.1. Let V be a vector space. A nonempty subset S ⊂ V is called a subspace of V if it is closed under addition and scalar multiplication; i.e.,
• x, y ∈ S =⇒ x + y ∈ S
• x ∈ S and α ∈ F =⇒ αx ∈ S.

Example 3.1.2. Fix m ≤ n in N. Then {(x1, . . . , xm, 0, . . . , 0) ∈ Rn} is a subspace of Rn. For example, we usually view the xy-plane as a subspace of R3.

Example 3.1.3. Given n ∈ N, the collection Pn(F) of polynomials of degree at most n is a subspace of F[t].

Definition 3.1.4. (New vector spaces) Fix a vector space V. Given subspaces M, N ⊂ V, we can define the intersection of M and N:
M ∩ N := {x ∈ V : x ∈ M and x ∈ N},
and the sum of M and N:
M + N := {x + y ∈ V : x ∈ M and y ∈ N}.
These define subspaces of V.
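For readers who want to compute with these constructions, here is a minimal sketch (Python with SymPy, an assumed tool; the bases are chosen purely for illustration) that finds bases of M + N and M ∩ N for two subspaces of R^3 given by spanning vectors:

```python
# Sketch only (assumes SymPy); the spanning vectors are illustrative choices.
from sympy import Matrix

# Columns of each matrix span the subspace.
M = Matrix([[1, 0],
            [0, 1],
            [0, 0]])           # M = the xy-plane in R^3
N = Matrix([[1, 0],
            [1, 0],
            [0, 1]])           # N = span{(1,1,0), (0,0,1)}

# Sum: put all spanning vectors side by side and extract a basis.
sum_basis = M.row_join(N).columnspace()
print(len(sum_basis))          # dim(M + N) = 3

# Intersection: if M*a = N*b, then (a, b) lies in the kernel of [M | -N].
kernel = M.row_join(-N).nullspace()
intersection_basis = [M * v[:M.cols, :] for v in kernel]
print(intersection_basis)      # a single vector (1, 1, 0), so dim(M ∩ N) = 1
```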
Remark 3.1.5. We define the Grassmannian G(m, n) to be the collection of m-dimensional subspaces of Rn. This can actually be given a natural smooth structure and made into a compact manifold. By compactness, we mean that every sequence of m-dimensional subspaces of Rn has a subsequence that converges to an m-dimensional subspace of Rn. Here, given m-dimensional subspaces Vj, V, j ∈ N, we say that Vj converges to V if there exist bases Bj = {vj1, . . . , vjm} of Vj and B = {v1, . . . , vm} of V such that Bj converges to B coordinatewise as points in Rn. Equivalently, if projVj is the orthogonal projection onto Vj and projV is the orthogonal projection onto V, the operator norm of projVj − projV : Rn → Rn converges to 0 (orthogonal projections onto subspaces will be defined in Section 6.4). For readers not familiar with manifolds, a manifold is a set that locally looks like some Euclidean space. Examples of manifolds include the torus, the circle, and the sphere. For those interested in learning more about manifolds and differential geometry, I would highly recommend taking Math 518 next semester with Professor Albin.
3.2. Direct sums.

Definition 3.2.1. Let V be a vector space. Suppose M, N are subspaces of V. We say M and N are transversal if M + N = V. We say M and N are complementary if M + N = V and M ∩ N = {0}, in which case we define the direct sum of M and N as
M ⊕ N := M + N.
Equivalently, M and N are complementary if each element of V can be uniquely written as the sum of an element of M and an element of N.

Remark 3.2.2. The reader may have heard of subspaces being complementary in another context. Subspaces M and N of a vector space are called complimentary if they say really nice things about each other. That is a bit different from our sense of complementary and is written with an "i".
M: I really like your zero vector!
N: No way! I really like your zero vector!
M: Omg! I just realized! We're matching! We have the same zero vector!
N: Hooray! :)

Remark 3.2.3. Observe that if M and N are complementary,
dim(M ⊕ N) = dim(M) + dim(N).

We can extend the previous definition to multiple subspaces of V. If M1, . . . , Mk ⊂ V are subspaces, we say that V is a direct sum of M1, . . . , Mk and write V = M1 ⊕ · · · ⊕ Mk if each x ∈ V can be uniquely written as
x = x1 + · · · + xk, where xi ∈ Mi.
It isn't hard to show that for k = 2 this coincides with the original definition of the direct sum.

Example 3.2.4. Define M0 := {a : a ∈ R}, M1 := {b + bt : b ∈ R}, and M2 := {c + ct + ct^2 : c ∈ R}. Then P1(R) = M0 ⊕ M1 and P2(R) = M0 ⊕ M1 ⊕ M2.

Theorem 3.2.5. (Existence of complements) Let M ⊂ V be a subspace and assume that V = span{x1, . . . , xn}. If M ≠ V, then it is possible to choose xi1, . . . , xik such that
V = M ⊕ span{xi1, . . . , xik}.

Proof. (Sketch) Iteratively choose xij not in the span of M together with the previously chosen xi's, continuing until M and the chosen vectors together span V.

Corollary 3.2.6. If M ⊂ V is a subspace and dim(V) < ∞, then dim(M) ≤ dim(V).
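Returning to Example 3.2.4, here is a small sketch (SymPy assumed; the chosen polynomial is arbitrary) that decomposes an element of P2(R) along M0, M1, M2 by solving for the coefficients, illustrating the uniqueness in the definition of a direct sum:

```python
# Sketch only (assumes SymPy); decomposes p in P2(R) along M0, M1, M2 from Example 3.2.4.
from sympy import symbols, solve, Eq, expand

t, a, b, c = symbols('t a b c')

p = 4 + 7*t - 2*t**2                      # an arbitrary element of P2(R)
decomposition = a + (b + b*t) + (c + c*t + c*t**2)

# Matching coefficients of 1, t, t^2 gives a linear system with a unique solution.
coeffs = solve([Eq(expand(decomposition - p).coeff(t, k), 0) for k in range(3)], [a, b, c])
print(coeffs)                             # {a: -3, b: 9, c: -2}

assert expand(coeffs[a] + (coeffs[b] + coeffs[b]*t)
              + (coeffs[c] + coeffs[c]*t + coeffs[c]*t**2) - p) == 0
```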
We conclude this section by proving a correspondence between direct sums of vector spaces and projections.

Definition 3.2.7. A projection E : V → V is a linear operator satisfying E^2 = E.

Thanks to Professor Lerman, who pointed out during a Math 519 lecture that this result should be proven in Linear Algebra courses, but is often ignored.

Proposition 3.2.8. (Direct sums ↔ projections) Fix a vector space V. If E : V → V is a projection, then V = Im(E) ⊕ ker E. On the other hand, given a direct sum decomposition V = W1 ⊕ W2, there is a projection K : V → V satisfying Im(K) = W1 and ker K = W2.

Proof. First suppose E : V → V is a projection. For any x ∈ V, x = Ex + (x − Ex), and E(x − Ex) = Ex − E^2 x = 0, whence it follows V = Im(E) + ker(E). It remains to show the intersection condition for direct sums. If x = Ey ∈ Im(E) ∩ ker E, then 0 = Ex = E^2 y = Ey = x. This proves Im(E) ∩ ker E = {0}.
Now suppose V = W1 ⊕ W2 for subspaces W1, W2 of V. For any x ∈ V, note x can be uniquely written as x = v1 + v2 for v1 ∈ W1, v2 ∈ W2. Define K : V → V by K(v1 + v2) = v1. It isn't hard to show that K is a projection that works.

3.3. Linear maps and subspaces.

Definition 3.3.1. Let T : V → W be a linear transformation. We define the kernel (or nullspace) of T as
ker(T) := {x ∈ V : T(x) = 0}
and the image (or range) of T as
im(T) := {Tx ∈ W : x ∈ V}.
The kernel and image form subspaces of V and W, respectively. We define
rank(T) = dim(im(T)), nullity(T) = dim(ker(T)).

The main theorem of the section is the following:

Theorem 3.3.2. (The Rank-Nullity Theorem) Let V be finite-dimensional and T : V → W a linear transformation. Then im(T) is finite-dimensional and
dim(V) = rank(T) + nullity(T).

Proof. Choose a complement N for ker(T), so that dim(N) = dim(V) − dim(ker(T)). One can check T|N : N → im(T) is an isomorphism (cf. the First Isomorphism Theorem for groups). This implies dim(im(T)) = dim(N) = dim(V) − dim(ker(T)). The Rank-Nullity Theorem then follows, since Corollary 3.2.6 implies ker(T) is finite-dimensional.

The following two corollaries follow from the Rank-Nullity Theorem:
Corollary 3.3.3. If M is a subspace of V and dim(M ) = dim(V ) < ∞, then M = V . Proof. Apply the Rank-Nullity Theorem to the inclusion map ι : M → V .
The following is one of the most useful results about linear operators on finite-dimensional vector spaces:

Corollary 3.3.4. Suppose V is finite-dimensional. Then for a linear operator T : V → V, the following are equivalent:
• T is injective.
• T is surjective.
• T is an isomorphism.

We conclude this section with an important definition and result:

Definition 3.3.5. Let A ∈ Ml×m(F). We define the row rank (column rank) of A to be the dimension of the space spanned by the rows (columns) of A.

Proposition 3.3.6. For any matrix A, the column rank of A is equal to the row rank of A.

For a proof, see Theorem 7, page 57, of Petersen [6].

3.4. Quotient spaces. Fix a vector space V and a subspace M ⊂ V. In this section, we will construct a vector space V/M and a linear transformation π : V → V/M such that ker(π) = M. We first review equivalence relations.

Definition 3.4.1. Given a set S, a relation on S is a subset R ⊂ S × S. If (a, b) ∈ R, we write a ∼ b. We say that a relation on S is an equivalence relation if it satisfies the following three properties: for all a, b, c ∈ S,
(1) (Reflexivity) a ∼ a
(2) (Symmetry) a ∼ b ⇒ b ∼ a
(3) (Transitivity) a ∼ b and b ∼ c ⇒ a ∼ c.
For a ∈ S, we define the equivalence class of a to be [a]∼ := {b ∈ S : b ∼ a}.

Remark 3.4.2. Suppose ∼ is an equivalence relation on S. Then for any a, b ∈ S, either [a]∼ = [b]∼ or [a]∼ ∩ [b]∼ = ∅. This means that any two equivalence classes are either equal or disjoint.

Definition 3.4.3. Fix subsets S, T ⊂ V and α ∈ F. We define
αS := {αs : s ∈ S} and S + T := {s + t : s ∈ S, t ∈ T}.
For x ∈ V, we define the translate of S by x as
x + S := {x + s : s ∈ S}.

Definition 3.4.4. (Quotient Space) Let V be a vector space and M ⊆ V a subspace. We define the quotient space
V/M := {x + M : x ∈ V}.
We sometimes write [x]M for x + M. This becomes a vector space with the operations
α(x + M) := (αx) + M