SWILA NOTES
I leave the proof of the following theorem as an exercise to the reader.

Theorem 5.4.7. Let $T : V \to V$ be a linear operator on a vector space of dimension $n$. Then $T$ is diagonalizable if and only if the minimal polynomial factors as $m_T(t) = (t - \lambda_1) \cdots (t - \lambda_k)$ for distinct $\lambda_1, \ldots, \lambda_k \in \mathbb{F}$.

The following theorem follows from the fact that the geometric multiplicity of an eigenvalue is at most its algebraic multiplicity (Proposition 5.2.8):

Theorem 5.4.8. Suppose $\mathbb{F} = \mathbb{C}$ and $T : V \to V$ is a linear operator on a vector space of dimension $n$. Then $T$ is diagonalizable if and only if the algebraic multiplicity of each eigenvalue equals its geometric multiplicity.

Proof. The sum of the algebraic multiplicities of the eigenvalues of $T$ equals the degree of $\chi_T(t)$, which is $n$. If $T$ is diagonalizable, $V$ has an eigenbasis, so the geometric multiplicities also sum to $n$. Since each geometric multiplicity is at most the corresponding algebraic multiplicity (Proposition 5.2.8), the two must agree for each eigenvalue. The other direction is clear.

5.5. Cyclic subspaces. We will soon consider the Jordan canonical form of a matrix. Before we do so, we need to study further the relationship between the minimal polynomial and the characteristic polynomial. This section concludes with the statement that we can decompose each finite-dimensional vector space into a direct sum of cyclic subspaces. This will provide us in the next section with the rational canonical form.

In this section, all vector spaces (besides $\mathbb{F}[t]$) will be assumed to be finite-dimensional. Given a linear operator $T$, recall the inductive definition of $T^k$ for $k \in \mathbb{N}$. An important observation used throughout the next couple of sections is the following: given any $p(t), q(t) \in \mathbb{F}[t]$,
\[
p(T) \circ q(T) = q(T) \circ p(T).
\]
As noted in the preface, this section (like most sections) is adapted from Petersen's notes [6].

Given a vector space $V$ and a linear operator $T : V \to V$, recall that we say that a subspace $W \subseteq V$ is $T$-invariant if $T(W) \subseteq W$.

Definition 5.5.1.
Fix a vector space $V$ and a linear operator $T : V \to V$. Given $x \in V$, we define
\[
C_x := \operatorname{span}\{x, Tx, T^2 x, \ldots\}
\]
to be the cyclic subspace $T$-generated by $x$. We call a subspace $W \subseteq V$ cyclic if $W = C_x$ for some $x \in V$. In this case, we say that $W$ is $T$-generated by $x$.

Example 5.5.2. Define the linear operator $T : \mathbb{F}[t] \to \mathbb{F}[t]$ by $p(t) \mapsto t \cdot p(t)$. Then $\mathbb{F}[t]$ is $T$-generated by the constant polynomial $1 \in \mathbb{F}[t]$.

Fix a nonzero vector $x \in V$ and a linear operator $T : V \to V$. By Proposition 1.3.2, there is a smallest $k$ such that $T^k x \in \operatorname{span}\{x, Tx, \ldots, T^{k-1} x\}$. This implies that there exist scalars $\alpha_0, \alpha_1, \ldots, \alpha_{k-1} \in \mathbb{F}$ such that
\[
T^k x + \alpha_{k-1} T^{k-1} x + \cdots + \alpha_0 x = 0.
\]
Note $k \le \dim(V)$. This gives us the following lemma.
DEREK JUNG
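Lemma 5.5.3. Let $T : V \to V$ be a linear operator and let $x \in V$ be a nonzero vector. Then $C_x$ is $T$-invariant and we can find $k$ such that $x, Tx, \ldots, T^{k-1} x$ form a basis for $C_x$. The matrix representation for $T|_{C_x}$ with respect to this basis is
\[
\begin{pmatrix}
0 & 0 & \cdots & 0 & -\alpha_0 \\
1 & 0 & \cdots & 0 & -\alpha_1 \\
0 & 1 & \cdots & 0 & -\alpha_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -\alpha_{k-1}
\end{pmatrix},
\]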
where $T^k x + \alpha_{k-1} T^{k-1} x + \cdots + \alpha_0 x = 0$.

Random Thought 5.5.4. I love airplanes and everything about them. So, I always get so disappointed while watching the first episode of a series. Darn misleading titles...

Definition 5.5.5. Given a monic polynomial $p(t) = t^n + \alpha_{n-1} t^{n-1} + \cdots + \alpha_0 \in \mathbb{F}[t]$, we define the companion matrix of $p(t)$ to be the $n \times n$ matrix
\[
A_p :=
\begin{pmatrix}
0 & 0 & \cdots & 0 & -\alpha_0 \\
1 & 0 & \cdots & 0 & -\alpha_1 \\
0 & 1 & \cdots & 0 & -\alpha_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -\alpha_{n-1}
\end{pmatrix}.
\]
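The construction behind Lemma 5.5.3 can be sketched in code. The following is a minimal illustration, not part of the notes: it assumes the operator is given as a square matrix with rational entries, and the helper names (`mat_vec`, `coefficients_in_span`, `cyclic_data`) are hypothetical. It keeps applying $T$ to a nonzero $x$ until $T^k x$ falls into the span of the earlier vectors, then reads off $\alpha_0, \ldots, \alpha_{k-1}$.

```python
from fractions import Fraction

def mat_vec(A, v):
    """Apply the matrix A (a list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def coefficients_in_span(basis, target):
    """Solve sum(c[j] * basis[j]) == target exactly; return c, or None.

    Assumes the vectors in `basis` are linearly independent.
    """
    n, m = len(target), len(basis)
    # Augmented matrix: column j is basis[j], the last column is target.
    rows = [[Fraction(basis[j][i]) for j in range(m)] + [Fraction(target[i])]
            for i in range(n)]
    pivots, r = [], 0
    for c in range(m):
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        piv = rows[r][c]
        rows[r] = [entry / piv for entry in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    # A zero row with a nonzero right-hand side means target is not in the span.
    if any(all(e == 0 for e in row[:m]) and row[m] != 0 for row in rows):
        return None
    coeffs = [Fraction(0)] * m
    for i, c in enumerate(pivots):
        coeffs[c] = rows[i][m]
    return coeffs

def cyclic_data(A, x):
    """Return (k, alphas) with T^k x + alphas[k-1] T^(k-1) x + ... + alphas[0] x = 0.

    x must be nonzero; k is the dimension of the cyclic subspace C_x.
    """
    basis = [[Fraction(v) for v in x]]
    while True:
        nxt = mat_vec(A, basis[-1])
        c = coefficients_in_span(basis, nxt)
        if c is not None:
            # T^k x = sum c_i T^i x, so alpha_i = -c_i.
            return len(basis), [-ci for ci in c]
        basis.append(nxt)

A = [[2, 0, 0], [0, 3, 0], [0, 0, 3]]
k, alphas = cyclic_data(A, [1, 1, 1])
# Here k = 2 and alphas = [6, -5], i.e. T^2 x - 5 T x + 6 x = 0.
```

In this example the vector $x = (1, 1, 1)$ generates a two-dimensional cyclic subspace, and the resulting polynomial $t^2 - 5t + 6 = (t - 2)(t - 3)$ is the one whose companion matrix represents $T|_{C_x}$.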
The companion matrix for $p(t) = t + \alpha$ is just $[-\alpha]$.

Proposition 5.5.6. The characteristic polynomial and minimal polynomial of a companion matrix $A_p$ are both $p(t)$, and all eigenspaces are one-dimensional. In particular, $A_p$ is diagonalizable if and only if $p(t)$ splits and the roots of $p(t)$ are distinct.

Proof. Fix a polynomial $p(t) = t^n + \alpha_{n-1} t^{n-1} + \cdots + \alpha_0$. By interchanging rows and adding multiples of rows to others, one can show that the determinant of
\[
t\,\mathrm{Id} - A_p =
\begin{pmatrix}
t & 0 & \cdots & 0 & \alpha_0 \\
-1 & t & \cdots & 0 & \alpha_1 \\
0 & -1 & \cdots & 0 & \alpha_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & -1 & t + \alpha_{n-1}
\end{pmatrix}
\]
is $p(t)$ (see Petersen [6], Proposition 19, page 136). More specifically, one can use these operations to reduce $t\,\mathrm{Id} - A_p$ to the upper-triangular matrix
\[
\begin{pmatrix}
-1 & t & \cdots & 0 & \alpha_1 \\
0 & -1 & \cdots & 0 & \alpha_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & -1 & t + \alpha_{n-1} \\
0 & 0 & \cdots & 0 & p(t)
\end{pmatrix}.
\]
If $\lambda$ is a root of $p(t)$, i.e., $\lambda$ is an eigenvalue of $A_p$, then
\[
\begin{pmatrix}
-1 & \lambda & \cdots & 0 & \alpha_1 \\
0 & -1 & \cdots & 0 & \alpha_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & -1 & \lambda + \alpha_{n-1} \\
0 & 0 & \cdots & 0 & 0
\end{pmatrix}
\]
has rank $n - 1$, since its first $n - 1$ rows are linearly independent. It follows that each eigenspace has dimension 1.

For the minimal polynomial, first note that $e_1, A_p e_1 = e_2, \ldots, A_p^{n-1} e_1 = e_n$ are linearly independent. This implies $A_p$ is not the root of any nonzero polynomial of degree less than $n$. On the other hand, for each $k = 1, \ldots, n$,
\[
A_p^n (e_k) = A_p^n A_p^{k-1} (e_1) = A_p^{k-1} A_p^n (e_1)
\]
and
\[
A_p^n (e_1) = -\alpha_0 e_1 - \alpha_1 e_2 - \cdots - \alpha_{n-1} e_n = -\alpha_0 e_1 - \alpha_1 A_p e_1 - \cdots - \alpha_{n-1} A_p^{n-1} e_1.
\]
This implies $p(A_p)(e_1) = 0$, and hence
\[
p(A_p)(e_k) = A_p^{k-1} \cdot p(A_p)(e_1) = 0
\]
for all $1 \le k \le n$. It follows that $p$ is the minimal polynomial of $A_p$.
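As a quick sanity check of Definition 5.5.5 and Proposition 5.5.6, here is a small sketch (an illustration only; the function names `companion`, `mat_mul`, and `poly_at_matrix` are hypothetical) that builds $A_p$ from the coefficients $\alpha_0, \ldots, \alpha_{n-1}$ and evaluates $p(A_p)$ by Horner's rule. It confirms $p(A_p) = 0$ for one example; it does not by itself show that no polynomial of smaller degree works.

```python
def companion(alphas):
    """Companion matrix of t^n + alphas[n-1] t^(n-1) + ... + alphas[0]."""
    n = len(alphas)
    A = [[0] * n for _ in range(n)]
    for i in range(1, n):
        A[i][i - 1] = 1            # ones on the subdiagonal
    for i in range(n):
        A[i][n - 1] = -alphas[i]   # last column holds -alpha_i
    return A

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def poly_at_matrix(alphas, A):
    """Evaluate the monic polynomial with lower coefficients `alphas` at A."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for a in reversed(alphas):
        # Horner step: result <- result * A + a * Id
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result

# p(t) = t^3 - 6 t^2 + 11 t - 6 = (t - 1)(t - 2)(t - 3)
alphas = [-6, 11, -6]
A_p = companion(alphas)            # [[0, 0, 6], [1, 0, -11], [0, 1, 6]]
zero = poly_at_matrix(alphas, A_p) # the 3 x 3 zero matrix
```

Since this $p(t)$ splits with distinct roots, Proposition 5.5.6 says this particular $A_p$ is diagonalizable.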
The following lemma follows from Proposition 4.1.10:

Lemma 5.5.7. Fix $A_1 \in M_k(\mathbb{F})$, $B \in M_{k \times (n-k)}(\mathbb{F})$, and $A_2 \in M_{n-k}(\mathbb{F})$. Define
\[
A = \begin{pmatrix} A_1 & B \\ 0 & A_2 \end{pmatrix}.
\]
Then $\chi_A(t) = \chi_{A_1}(t)\,\chi_{A_2}(t)$.

Using this lemma, we obtain the Cayley-Hamilton Theorem:

Theorem 5.5.8. (Cayley-Hamilton Theorem) Let $T : V \to V$ be a linear operator on an $n$-dimensional vector space $V$. Then $T$ is a root of its characteristic polynomial: $\chi_T(T) = 0$.

Proof. Fix $x \in V$. We need to show the linear operator $\chi_T(T)$ kills $x$, i.e., $\chi_T(T)(x) = 0$. This is trivial if $x = 0$, so assume $x \neq 0$. By Lemma 5.5.3, we may choose a basis $x, Tx, \ldots, T^{k-1} x$ for the cyclic subspace $C_x$ generated by $x$. Complete this to a basis $\mathcal{B}$ for $V$. Let $p(t)$ be the monic polynomial of degree $k$ such that $p(T)(x) = 0$. Then
\[
[T]_{\mathcal{B}} = \begin{pmatrix} A_p & * \\ 0 & A \end{pmatrix}
\]
for some $(n-k) \times (n-k)$ matrix $A$. By Lemma 5.5.7 and Proposition 5.5.6, $\chi_T(t) = \chi_A(t)\,p(t)$. Hence,
\[
\chi_T(T)(x) = \chi_A(T) \circ p(T)(x) = 0.
\]
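The Cayley-Hamilton Theorem can be checked numerically on a small matrix. The sketch below is an illustration, not the argument used in the proof above: it computes the coefficients of $\chi_A(t) = \det(t\,\mathrm{Id} - A)$ with the Faddeev-LeVerrier recursion (a technique swapped in for convenience), using exact rational arithmetic, and then evaluates $\chi_A(A)$. The function names are hypothetical.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(t Id - A), lowest degree first.

    Uses the Faddeev-LeVerrier recursion:
    M_k = A M_(k-1) + c_(n-k+1) Id,  c_(n-k) = -trace(A M_k) / k.
    """
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(0)] * n + [Fraction(1)]   # monic: c_n = 1
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        M = mat_mul(A, M)
        for i in range(n):
            M[i][i] += coeffs[n - k + 1]
        AM = mat_mul(A, M)
        coeffs[n - k] = -sum(AM[i][i] for i in range(n)) / k
    return coeffs

def eval_poly_at_matrix(coeffs, A):
    """Evaluate sum(coeffs[i] * A^i) by Horner's rule."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

A = [[2, 1, 0], [0, 2, 0], [0, 0, 3]]
chi = char_poly(A)                  # t^3 - 7 t^2 + 16 t - 12 = (t - 2)^2 (t - 3)
check = eval_poly_at_matrix(chi, A) # the zero matrix, as the theorem predicts
```

Exact fractions are used so that the check is an equality rather than a floating-point approximation.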
An important corollary then follows from Proposition 5.3.6. Recall that we say a polynomial $p(t)$ divides $q(t) \in \mathbb{F}[t]$ if there exists $d(t) \in \mathbb{F}[t]$ such that $q(t) = p(t) \cdot d(t)$.

Corollary 5.5.9. Let $T : V \to V$ be a linear operator. Then the minimal polynomial $m_T(t)$ divides the characteristic polynomial $\chi_T(t)$.

We now can obtain an interesting characterization of finite-dimensional vector spaces.

Theorem 5.5.10. (Cyclic subspace decomposition) Let $T : V \to V$ be a linear operator. Then $V$ can be written as the direct sum of cyclic subspaces:
\[
V = C_{x_1} \oplus \cdots \oplus C_{x_k} \quad \text{for some } x_1, \ldots, x_k \in V.
\]
In particular, $T$ has a block diagonal matrix representation where each block is a companion matrix:
\[
[T] =
\begin{pmatrix}
A_{p_1} & 0 & \cdots & 0 \\
0 & A_{p_2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & A_{p_k}
\end{pmatrix},
\]
and $\chi_T(t) = p_1(t) \cdots p_k(t)$. Moreover, the geometric multiplicity of a scalar $\lambda$ satisfies
\[
\dim(\ker(T - \lambda I_V)) = \#\{i : p_i(\lambda) = 0\}.
\]
In particular, we see that $T$ is diagonalizable if and only if each companion matrix $A_{p_i}$ is diagonalizable, i.e., each $p_i(t)$ splits and has distinct roots (Proposition 5.5.6).

Proof. This theorem is proved using induction on the dimension of $V$. Some details will be left to the reader and will be noted as such. Assume $\dim(V) = n$. If $V$ is cyclic, we are done. Assume otherwise. Let
\[
C_{x_1} = \operatorname{span}\{x_1, T x_1, \ldots, T^{m-1} x_1\}
\]
be a cyclic subspace of maximal dimension $m < n$. The goal is to show that $V$ is the direct sum of $C_{x_1}$ with a $T$-invariant subspace (then we could repeat the argument for the restriction of $T$ to this complement and apply an inductive-type argument on the dimension). Choose a linear functional $f : V \to \mathbb{F}$ such that
\[
f(T^k x_1) = 0 \quad \text{for all } 0 \le k < m - 1, \qquad f(T^{m-1} x_1) = 1.
\]
Define $K : V \to \mathbb{F}^m$ by
\[
K(x) = (f(x), f(Tx), \ldots, f(T^{m-1} x)).
\]
Define $\mathcal{B} = \{x_1, T x_1, \ldots, T^{m-1} x_1\}$ and let $\mathcal{S}$ denote the standard basis of $\mathbb{F}^m$. Then $K|_{C_{x_1}} : C_{x_1} \to \mathbb{F}^m$ is an isomorphism since
\[
[K|_{C_{x_1}}]_{\mathcal{S},\mathcal{B}} =
\begin{pmatrix}
0 & 0 & \cdots & 0 & 1 \\
0 & 0 & \cdots & 1 & * \\
\vdots & \vdots & ⋰ & \vdots & \vdots \\
0 & 1 & \cdots & * & * \\
1 & * & \cdots & * & *
\end{pmatrix}.
\]
We now show $\ker(K)$ is $T$-invariant. Fix $x \in \ker K$. This means $f(T^k x) = 0$ for all $0 \le k < m$. By the definition of $m$, $T^m x$ is a linear combination of $\{x, Tx, \ldots, T^{m-1} x\}$, so $f(T^m x) = 0$ as well. Hence $K(Tx) = (f(Tx), \ldots, f(T^m x)) = 0$, i.e., $Tx \in \ker(K)$. Since $K|_{C_{x_1}}$ is injective, $\ker(K) \cap C_{x_1} = \{0\}$. We then use the Rank-Nullity Theorem (Theorem 3.3.2) to conclude that $V = C_{x_1} \oplus \ker(K)$ (since $\dim(C_{x_1}) = \dim(\operatorname{Im}(K))$). The rest of the claims concerning eigenvalues follow from Proposition 5.5.6.
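A small worked example may help make Theorem 5.5.10 concrete. It is an illustration chosen here, not taken from the notes: take $T = \operatorname{diag}(2, 2, 3)$ on $\mathbb{F}^3$, with the particular generators $x_1 = e_1 + e_3$ and $x_2 = e_2$ picked by hand.

```latex
% Illustrative example (not from the notes): T = diag(2, 2, 3) on F^3.
\[
T = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix},
\qquad m_T(t) = (t - 2)(t - 3).
\]
% Every cyclic subspace has dimension at most deg m_T = 2, so V is not cyclic.
% A cyclic subspace of maximal dimension m = 2:
\[
x_1 = e_1 + e_3, \qquad T x_1 = 2 e_1 + 3 e_3, \qquad
T^2 x_1 = 4 e_1 + 9 e_3 = 5\, T x_1 - 6\, x_1,
\]
\[
C_{x_1} = \operatorname{span}\{e_1, e_3\}, \qquad p_1(t) = t^2 - 5t + 6 .
\]
% A T-invariant complement, itself cyclic:
\[
x_2 = e_2, \qquad T x_2 = 2 x_2, \qquad
C_{x_2} = \operatorname{span}\{e_2\}, \qquad p_2(t) = t - 2 .
\]
% Matrix of T in the basis (x_1, T x_1, x_2): a block diagonal of companion matrices.
\[
[T] = \begin{pmatrix} A_{p_1} & 0 \\ 0 & A_{p_2} \end{pmatrix}
    = \begin{pmatrix} 0 & -6 & 0 \\ 1 & 5 & 0 \\ 0 & 0 & 2 \end{pmatrix},
\qquad \chi_T(t) = p_1(t)\, p_2(t) = (t - 2)^2 (t - 3).
\]
% Geometric multiplicities match the count of p_i vanishing at each eigenvalue:
\[
\dim \ker(T - 2 I_V) = 2 = \#\{i : p_i(2) = 0\}, \qquad
\dim \ker(T - 3 I_V) = 1 = \#\{i : p_i(3) = 0\}.
\]
```

Both $p_1$ and $p_2$ split with distinct roots, which is consistent with $T$ being diagonalizable.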