June 20 Notes
Remark 5.1.5. The reader should not get this confused with when one has a bunch of day-old meat and has to determine which is the best, which is ... relatively prime.

Using the division algorithm, we have the following lemma:

Lemma 5.1.6. Given p(t) ∈ F[t] and λ ∈ F, λ is a root of p(t) if and only if (t − λ) divides p(t).

This leads us to the statement that C is algebraically closed.

Theorem 5.1.7. (Fundamental Theorem of Algebra) Every nonconstant polynomial p(t) ∈ C[t] has a root. Equivalently, every nonconstant polynomial p(t) ∈ C[t] splits, i.e., if n is the degree of p(t), there exist a, λ_1, . . . , λ_n ∈ C (not necessarily distinct) such that

p(t) = a(t − λ_1)(t − λ_2) · · · (t − λ_n).

Random Thought 5.1.8. What did 1 + x^2 say when it left R[x] for C[x]? It’s been real, but I’ve gotta split.

Definition 5.1.9. Fix a linear operator T : V → V . Define T^1 = T and inductively define T^{i+1} = T ◦ T^i for each i ∈ N. Given a polynomial p(t) = a_0 + a_1 t + · · · + a_n t^n, we define the linear operator p(T) : V → V by

p(T) = a_0 1_V + a_1 T + · · · + a_n T^n.

Given another polynomial q(t), we define the linear operator on V

p(T)q(T) := p(T) ◦ q(T).

Remark 5.1.10. Fix a linear operator T : V → V . We will frequently use the fact that if p(t), q(t) ∈ F[t], then p(T)q(T) = q(T)p(T). Moreover, if a subspace W ⊆ V is T-invariant, i.e., T x ∈ W for all x ∈ W , then p(T)x ∈ W for all p(t) ∈ F[t] and x ∈ W .

5.2. Eigenvalues. The identity operator of a vector space is special in that it sends every vector to itself. This makes it very easy to study the behavior of this operator. We are interested in analyzing more complicated operators. To do this, we aim to find special vectors of a linear operator. The treatment of eigenstuff in this section will be very typical. Vector spaces are not necessarily finite-dimensional unless specifically noted.

Definition 5.2.1. Let V be a vector space and T : V → V a linear operator. We say a scalar λ ∈ F is an eigenvalue of T if there exists a nonzero vector x ∈ V such that T(x) = λx. We call such a vector 0 ≠ x ∈ V an eigenvector of T corresponding to λ. If λ is an eigenvalue of T , we define the subspace

E_λ := {x ∈ V : T x = λx}

of V . We call E_λ the eigenspace for λ.

Given a linear operator T : V → V , note that a scalar λ is an eigenvalue of T if and only if ker(T − λId) ≠ {0}. If λ is an eigenvalue of T , note dim(E_λ) > 0 and E_λ = ker(T − λId).

If V is finite-dimensional, we can talk about matrix representations of linear operators with respect to an ordered basis.

Definition 5.2.2. Let V be a finite-dimensional vector space, and fix a basis B for V . Let T : V → V be a linear operator. We define the characteristic polynomial of T to be χ_T(t) = det([T]_B − tId).
To see this definition is well-defined, let B, C be two bases for V . Let S = [Id]_{C,B} be the change of basis matrix from B to C. Then

det([T − tId]_B) = det(S^{-1} [T − tId]_C S) = det(S^{-1}) det([T − tId]_C) det(S) = det([T − tId]_C).

Remark 5.2.3. Some define the characteristic polynomial to be det(tId − [T]_B). I define it the opposite way because I find that I am less likely to mess up signs of matrix entries this way. Also, it doesn’t really matter, as we only care about the roots of the characteristic polynomial.

This leads us to the following equivalent statements:

Proposition 5.2.4. Assume V is finite-dimensional, T : V → V is a linear operator, and λ ∈ F. The following are equivalent:
(1) λ is an eigenvalue of T .
(2) ker(T − λId) ≠ {0}.
(3) T − λId is not invertible.
(4) There exists a basis B in which [T − λId]_B is not invertible.
(5) There exists a basis B in which det([T]_B − λId_n) = det([T − λId]_B) = 0.
(6) λ is a root of the characteristic polynomial χ_T(t).

Proof. This proposition follows from Proposition 4.1.8.
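To make Proposition 5.2.4 concrete, here is a minimal computational sketch (assuming Python with sympy; the 2 × 2 matrix is an arbitrary illustration, not from the notes) checking that the roots of the characteristic polynomial are exactly the scalars λ for which T − λId has nontrivial kernel.

```python
# A sketch of Proposition 5.2.4 in sympy: the scalars lam with
# ker(T - lam*Id) != {0} are exactly the roots of chi_T(t).
# The matrix T below is an arbitrary choice for illustration.
from sympy import Matrix, eye, symbols, roots

t = symbols('t')
T = Matrix([[2, 1], [0, 3]])

chi = (T - t * eye(2)).det()           # chi_T(t) = det([T]_B - t*Id)
print(chi.expand())                    # t**2 - 5*t + 6

for lam in roots(chi, t):              # the roots 2 and 3
    kernel = (T - lam * eye(2)).nullspace()
    assert kernel                      # nontrivial kernel <=> eigenvalue
    print(lam, [list(v) for v in kernel])
```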
Example 5.2.5. In this example, we will find the eigenvalues and eigenvectors of a simple 4 × 4 matrix. Let

A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} ∈ M_4(C).

By expanding along the first column, we calculate

χ_A(t) = det(A − tI) = \det \begin{pmatrix} -t & 1 & 0 & 0 \\ -1 & -t & 0 & 0 \\ 0 & 0 & -t & 1 \\ 0 & 0 & 1 & -t \end{pmatrix}
= -t \det \begin{pmatrix} -t & 0 & 0 \\ 0 & -t & 1 \\ 0 & 1 & -t \end{pmatrix} + \det \begin{pmatrix} 1 & 0 & 0 \\ 0 & -t & 1 \\ 0 & 1 & -t \end{pmatrix}
= -t(-t(t^2 − 1)) + t^2 − 1 = t^4 − t^2 + t^2 − 1 = t^4 − 1.

The eigenvalues of A, i.e., the roots of χ_A(t), are 1, −1, i, and −i. We find

E_1 = ker(A − Id) = ker \begin{pmatrix} -1 & 1 & 0 & 0 \\ -1 & -1 & 0 & 0 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & 1 & -1 \end{pmatrix} = {(0, 0, z, z) : z ∈ C}.
Similarly,

E_{−1} = {(0, 0, z, −z) : z ∈ C},
E_i = {(z, iz, 0, 0) : z ∈ C},
E_{−i} = {(iz, z, 0, 0) : z ∈ C}.

We can use induction to prove the following important proposition:

Proposition 5.2.6. Let V be a vector space (possibly infinite-dimensional), and let T : V → V be a linear operator. Let x_1, . . . , x_k be eigenvectors of T corresponding to distinct eigenvalues λ_1, . . . , λ_k. Then {x_1, . . . , x_k} is linearly independent. Equivalently, E_{λ_1} + · · · + E_{λ_k} = E_{λ_1} ⊕ · · · ⊕ E_{λ_k}. In particular, k ≤ dim V .

Proof. This proposition is actually surprisingly difficult to prove. A cheap way to prove it is by assuming the formula for the determinant of a Vandermonde matrix. For another proof, see Petersen [6], Lemma 14, page 125. I would like to remind the reader here that my notes are heavily adapted from the notes of Petersen [6].

We conclude this section by comparing the algebraic and geometric multiplicities of eigenvalues:

Definition 5.2.7. We define the algebraic multiplicity of an eigenvalue λ to be the degree to which λ is a root of χ_T(t). We define the geometric multiplicity of an eigenvalue λ to be the dimension of the eigenspace E_λ.

Proposition 5.2.8. (AM ≥ GM) Let T : V → V be a linear operator on a finite-dimensional vector space. If λ is an eigenvalue of T , the algebraic multiplicity of λ is at least the geometric multiplicity of λ.

Proof. Let n = dim(V ), and let k be the geometric multiplicity of an eigenvalue λ. Note k ≤ n since ker(T − λId) ⊆ V is a subspace. Let {x_1, . . . , x_k} be a basis for ker(T − λId), and complete this to a basis B = {x_1, . . . , x_k, x_{k+1}, . . . , x_n} for V . Then [T]_B has the form

\begin{pmatrix} λ Id_k & * \\ 0 & A \end{pmatrix},

for some (n − k) × (n − k) matrix A. It follows from Proposition 4.1.10 that

χ_T(t) = χ_{λ Id_k}(t) χ_A(t) = (λ − t)^k χ_A(t).

As (λ − t)^k divides χ_T(t), the algebraic multiplicity of λ is at least k.
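As a sanity check, here is a sketch (again assuming sympy) that recomputes the eigendata of the matrix A from Example 5.2.5 and, along the way, illustrates the inequality of Proposition 5.2.8.

```python
# Recomputing Example 5.2.5: A has chi_A(t) = t^4 - 1 and eigenvalues 1, -1, i, -i.
from sympy import Matrix

A = Matrix([[0, 1, 0, 0],
            [-1, 0, 0, 0],
            [0, 0, 0, 1],
            [0, 0, 1, 0]])

for lam, alg_mult, basis in A.eigenvects():
    geo_mult = len(basis)              # dim E_lam
    assert alg_mult >= geo_mult        # AM >= GM (Proposition 5.2.8)
    print(lam, alg_mult, [list(v) for v in basis])
```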
5.3. The minimal polynomial.

Definition 5.3.1. Fix vector spaces V and W . We define L(V, W ) to be the vector space of linear transformations T : V → W .

Lemma 5.3.2. Suppose V, W are finite-dimensional vector spaces of dimension l, m, respectively. Then L(V, W ) is finite-dimensional with dim L(V, W ) = l · m.
Proof. Fix bases {e_1, . . . , e_l}, {f_1, . . . , f_m} of V, W , respectively. For all 1 ≤ i ≤ l and 1 ≤ j ≤ m, define

T_{i,j}(e_k) = \begin{cases} f_j, & \text{if } k = i \\ 0, & \text{otherwise.} \end{cases}

It is easy to check that {T_{i,j}} forms a basis for L(V, W ).
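In coordinates, the basis {T_{i,j}} is just the set of matrix units; a small sketch (sympy, with illustrative dimensions l = 3, m = 2 chosen by us):

```python
# Lemma 5.3.2 in coordinates: fixing bases identifies L(V, W) with m x l
# matrices, and T_{i,j} becomes the matrix unit with a 1 in row j, column i.
from sympy import zeros

l, m = 3, 2                        # illustrative dimensions
units = []
for i in range(l):
    for j in range(m):
        E = zeros(m, l)
        E[j, i] = 1                # T_{i,j}(e_i) = f_j, other e_k -> 0
        units.append(E)

print(len(units))                  # l*m = 6 = dim L(V, W)
```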
Proposition 5.3.3. (Annihilator is nonzero in finite-dimensional case) Suppose V is a finite-dimensional vector space, and T : V → V is a linear operator. Then there is a nonzero polynomial p(t) ∈ F[t] such that p(T ) = 0. In fact, if V is of dimension n, we can take p(t) to have degree at most n^2.

Proof. By the above lemma, any collection of n^2 + 1 transformations in L(V, V ) is linearly dependent. In particular, there exist constants a_0, a_1, . . . , a_{n^2}, not all zero, such that

a_0 I + a_1 T + · · · + a_{n^2} T^{n^2} = 0.

We may take p(t) = a_0 + a_1 t + · · · + a_{n^2} t^{n^2} ≠ 0.
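The proof is constructive, and a sketch of the construction in sympy (with an arbitrary 2 × 2 matrix standing in for T): flatten I, T, . . . , T^{n^2} into columns and read a dependence off the nullspace.

```python
# Proposition 5.3.3 constructively: a linear dependence among the flattened
# powers I, T, ..., T^(n^2) gives a nonzero polynomial p with p(T) = 0.
from sympy import Matrix, eye, zeros

T = Matrix([[1, 1], [0, 2]])       # arbitrary illustrative operator
n = T.rows

powers = [eye(n)]
for _ in range(n * n):             # n^2 + 1 matrices in all
    powers.append(T * powers[-1])

M = Matrix.hstack(*[P.reshape(n * n, 1) for P in powers])
coeffs = M.nullspace()[0]          # a_0, ..., a_{n^2}, not all zero

p_of_T = sum((a * P for a, P in zip(coeffs, powers)), zeros(n, n))
assert p_of_T == zeros(n, n)       # p(T) = 0
```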
Remark 5.3.4. Let T : F[t] → F[t] be the “multiplication by t” linear operator given by p(t) ↦ t · p(t). Observe for all p(t), q(t) ∈ F[t], q(T )(p(t)) = q(t) · p(t). In particular, q(T )(1) = q(t) ≠ 0 for all nonzero q(t) ∈ F[t]. Thus, T is an example of a linear operator on an infinite-dimensional vector space for which the prior proposition does not hold.

Definition 5.3.5. Let V be a finite-dimensional vector space and 0 ≠ T : V → V a linear operator. We define the minimal polynomial of T to be a monic polynomial m_T(t) ∈ F[t] of least degree such that m_T(T ) = 0.

Proposition 5.3.6. Given a finite-dimensional vector space V and a linear operator T : V → V , a minimal polynomial exists and is unique. Moreover, if q(t) ∈ F[t] satisfies q(T ) = 0, then m_T(t) divides q(t), i.e., there exists a polynomial d(t) ∈ F[t] such that q(t) = m_T(t)d(t).

Proof. By the above proposition, there exists a nonzero polynomial P(t) ∈ F[t] such that P(T ) = 0. Define

d = min{deg(q(t)) : q(T ) = 0, q(t) ≠ 0} ≥ 1.

Choose a polynomial

m̃(t) = a_0 + a_1 t + · · · + a_d t^d,  a_d ≠ 0,

with m̃(T ) = 0. Then m_T(t) = a_d^{-1} m̃(t) is a minimal polynomial for T .

We show that m_T(t) divides any polynomial q(t) ∈ F[t] with q(T ) = 0. Uniqueness of the minimal polynomial then follows. Fix such a polynomial q(t). The desired result is clear if q(t) = 0, so assume otherwise. By the division algorithm (Proposition 5.1.2), there exist polynomials d(t), r(t) ∈ F[t] such that q(t) = m_T(t)d(t) + r(t) and either deg(r(t)) < d or r(t) = 0. Then r(T ) = q(T ) − m_T(T )d(T ) = 0. This contradicts the definition of d if r(t) ≠ 0.

Random Thought 5.3.7. All beds are warm. That’s a blanket statement.
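Before the worked example below, here is a sketch (assuming sympy; the helper name minimal_polynomial_of is ours, not a sympy built-in) that computes m_T by finding the least degree at which the powers of T first become linearly dependent, then rescaling to be monic.

```python
# A sketch of computing m_T: find the least d for which I, T, ..., T^d are
# linearly dependent, then rescale the dependence so the polynomial is monic.
from sympy import Matrix, eye, symbols, Poly

def minimal_polynomial_of(T):
    t = symbols('t')
    n = T.rows
    powers = [eye(n)]
    for d in range(1, n * n + 2):
        powers.append(T * powers[-1])
        M = Matrix.hstack(*[P.reshape(n * n, 1) for P in powers])
        null = M.nullspace()
        if null:                   # first dependence: degree-d annihilator
            c = null[0] / null[0][d]   # normalize so m_T is monic
            return Poly(sum(c[k] * t**k for k in range(d + 1)), t)

T = Matrix([[1, 0], [0, 2]])
print(minimal_polynomial_of(T))    # Poly(t**2 - 3*t + 2, t, domain='QQ')
```

Note that at the first degree d where a dependence appears, the coefficient of T^d must be nonzero (otherwise the dependence would involve only lower powers), so the normalization is legitimate; this mirrors the proof of Proposition 5.3.6.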
Example 5.3.8. Define T : R^2 → R^2 by T(x, y) = (x, 2y). If S is the standard basis for R^2,

[T]_S = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.

Observe for each k ∈ N,

[T]_S^k = \begin{pmatrix} 1 & 0 \\ 0 & 2^k \end{pmatrix},

which implies for each p(t) ∈ R[t],

[p(T)]_S = \begin{pmatrix} p(1) & 0 \\ 0 & p(2) \end{pmatrix}.

We observe

p(T ) = 0 ⟺ p(1) = p(2) = 0.
It follows that the minimal polynomial of T is m_T(t) = (t − 1)(t − 2) = t^2 − 3t + 2.

5.4. Diagonalizability. Fix n ∈ N. A particularly simple subset of M_n(F) is the collection of diagonal n × n matrices. These are the matrices in M_n(F) for which every entry off the diagonal is zero. If D is diagonal:

D = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix},
then De_i = a_{ii} e_i for all i. This means that e_i is an eigenvector of D with corresponding eigenvalue a_{ii}. In particular, D has a basis of eigenvectors {e_1, . . . , e_n}.

Suppose a vector space V has a basis B = {v_1, . . . , v_n}. Recall we define the coordinates of a vector x = α_1 v_1 + · · · + α_n v_n with respect to B as

[x]_B = \begin{pmatrix} α_1 \\ \vdots \\ α_n \end{pmatrix}.

Recall for a linear operator T : V → V , we define the matrix representation of T with respect to B to be

[T]_B = \begin{pmatrix} | & & | \\ [T(v_1)]_B & \cdots & [T(v_n)]_B \\ | & & | \end{pmatrix}.

Definition 5.4.1. Let V be a (possibly infinite-dimensional) vector space. We call a linear operator T : V → V diagonalizable if there is a basis of eigenvectors for V . We call such a basis an eigenbasis for V .

Example 5.4.2. Fix λ ∈ F and define

A_λ := \begin{pmatrix} λ & 1 \\ 0 & λ \end{pmatrix}.

Then A_λ is not diagonalizable, as its only eigenvalue is λ and

ker \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}

has dimension 1.
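Example 5.4.2 can be confirmed mechanically; a sketch with sympy, taking λ = 5 as an arbitrary stand-in for the fixed scalar:

```python
# Example 5.4.2 in sympy: the Jordan block [[lam, 1], [0, lam]] has the single
# eigenvalue lam with a one-dimensional eigenspace, so it is not diagonalizable.
from sympy import Matrix

lam = 5                                # any fixed scalar works the same way
A = Matrix([[lam, 1], [0, lam]])

print(A.eigenvects())                  # [(5, 2, [Matrix([[1], [0]])])]
print(A.is_diagonalizable())           # False
```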
Example 5.4.3. Let

A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}.

We calculate

det(A − tI) = \det \begin{pmatrix} 2 − t & −1 \\ −1 & 2 − t \end{pmatrix} = t^2 − 4t + 3 = (t − 1)(t − 3).

This implies the eigenvalues of A are 1 and 3. Choosing an eigenvector corresponding to each eigenvalue, these two vectors form a basis of eigenvectors for R^2. Hence, A is an example of a nondiagonal matrix that is diagonalizable.

We leave the following as an exercise for the reader.

Proposition 5.4.4. Let T : V → V be a linear operator on a finite-dimensional vector space. Suppose λ_1, . . . , λ_k are the eigenvalues of T . Then T is diagonalizable if and only if V = E_{λ_1} ⊕ · · · ⊕ E_{λ_k}, where E_{λ_i} is the eigenspace corresponding to λ_i.

In the case of finite-dimensional vector spaces, we can also talk about matrix representations. This gives us the following proposition.

Proposition 5.4.5. Let V be a finite-dimensional vector space and T : V → V a linear operator. Then T is diagonalizable if and only if there is a basis B for which [T]_B is diagonal.

Suppose a matrix A has an eigenbasis B = {x_1, . . . , x_n} with corresponding eigenvalues λ_1, . . . , λ_n (not necessarily distinct). By the discussion in Section 2.4, A = [Id]_{S,B} [A]_B [Id]_{B,S}. Hence, A = SDS^{-1}, where
S = \begin{pmatrix} | & | & & | \\ x_1 & x_2 & \cdots & x_n \\ | & | & & | \end{pmatrix}  and  D = \begin{pmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{pmatrix}.

Using Proposition 5.2.6, we have a sufficient condition for diagonalizability of linear operators on finite-dimensional vector spaces.
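For the matrix of Example 5.4.3, the factorization A = SDS^{-1} can be produced directly; a sketch using sympy's diagonalize, which returns the pair (S, D):

```python
# Diagonalizing Example 5.4.3's matrix: diagonalize() returns (S, D) with
# A = S*D*S^(-1), the columns of S an eigenbasis and D the eigenvalues.
from sympy import Matrix

A = Matrix([[2, -1], [-1, 2]])
S, D = A.diagonalize()

print(S)                           # columns: eigenvectors for 1 and 3
print(D)                           # Matrix([[1, 0], [0, 3]])
assert A == S * D * S.inv()
```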
Theorem 5.4.6. (First characterization of diagonalizability) Let V be a finite-dimensional vector space of dimension n and T : V → V a linear operator. If λ_1, . . . , λ_k are distinct eigenvalues of T and

n = dim ker(T − λ_1 Id) + · · · + dim ker(T − λ_k Id),

then T is diagonalizable. In particular, if T has n distinct eigenvalues, then T is diagonalizable.

Proof. For each j = 1, . . . , k, choose a basis B_j for ker(T − λ_j Id). By Proposition 5.2.6, B := B_1 ∪ · · · ∪ B_k is linearly independent and has n elements. This implies B is a basis for V consisting of eigenvectors.

I leave the proof of the following theorem as an exercise to the reader.
Theorem 5.4.7. Let T : V → V be a linear operator on a vector space of dimension n. Then T is diagonalizable if and only if the minimal polynomial factors as

m_T(t) = (t − λ_1) · · · (t − λ_k),

for distinct λ_1, . . . , λ_k ∈ F.

The following theorem follows from the fact that the geometric multiplicity of an eigenvalue is at most its algebraic multiplicity (Proposition 5.2.8):

Theorem 5.4.8. Suppose F = C and T : V → V is a linear operator on a vector space of dimension n. Then T is diagonalizable if and only if the algebraic multiplicity of each eigenvalue equals its geometric multiplicity.

Proof. The sum of the algebraic multiplicities of the eigenvalues of T equals the degree of χ_T(t), which is n. If T is diagonalizable, V has an eigenbasis, and it follows from Proposition 5.2.8 that the algebraic and geometric multiplicities agree for each eigenvalue. The other direction is clear.

5.5. Cyclic subspaces. We are interested in studying the Jordan canonical form of a matrix. Before we do so, we need to study further the relationship between the minimal polynomial and the characteristic polynomial. This section concludes with the statement that we can decompose each finite-dimensional vector space into the direct sum of cyclic subspaces. This will provide us in the next section with the rational canonical form.

In this section, all vector spaces (besides F[t]) will be assumed to be finite-dimensional. Given a linear operator T , recall the inductive definition of T^k for k ∈ N. An important observation used throughout the next couple of sections is the following: given any p(t), q(t) ∈ F[t], p(T ) ◦ q(T ) = q(T ) ◦ p(T ). As noted in the preface, this section (as in most sections) will be adapted from Petersen’s notes [6].

Definition 5.5.1. Fix a vector space V and a linear operator T : V → V . Given x ∈ V , we define

C_x := span{x, T x, T^2 x, . . .}

to be the cyclic subspace T -generated by x. We say that a subspace W ⊆ V is cyclic if W = C_x for some x ∈ V . In this case, we say that W is T -generated by x.

Example 5.5.2. Define the linear operator T : F[t] → F[t] by p(t) ↦ t · p(t). Then F[t] is T -generated by the constant polynomial 1 ∈ F[t].

Fix a vector x ∈ V and a linear operator T : V → V . By Proposition 1.3.2, there is a smallest k such that T^k x ∈ span{x, T x, . . . , T^{k−1} x}. This implies that there exist scalars α_0, α_1, . . . , α_{k−1} ∈ F such that

T^k x + α_{k−1} T^{k−1} x + · · · + α_0 x = 0.

Note k ≤ dim(V ). This gives us the following lemma.

Lemma 5.5.3. Let T : V → V be a linear operator and x ∈ V . Then C_x is T -invariant and we can find k such that x, T x, . . . , T^{k−1} x form a basis for C_x. The matrix representation for