Further linear algebra

M. Anthony, M. Harvey
MT2175, 2790175

2012

Undergraduate study in Economics, Management, Finance and the Social Sciences This is an extract from a subject guide for an undergraduate course offered as part of the University of London International Programmes in Economics, Management, Finance and the Social Sciences. Materials for these programmes are developed by academics at the London School of Economics and Political Science (LSE). For more information, see: www.londoninternational.ac.uk

This guide was prepared for the University of London International Programmes by: Martin Anthony, Professor of Mathematics, and Michele Harvey, Course Leader, Department of Mathematics, London School of Economics and Political Science. This is one of a series of subject guides published by the University. We regret that due to pressure of work the authors are unable to enter into any correspondence relating to, or arising from, the guide. If you have any comments on this subject guide, favourable or unfavourable, please use the form at the back of this guide.

University of London International Programmes Publications Office Stewart House 32 Russell Square London WC1B 5DN United Kingdom Website: www.londoninternational.ac.uk Published by: University of London © University of London 2012 The University of London asserts copyright over all material in this subject guide except where otherwise indicated. All rights reserved. No part of this work may be reproduced in any form, or by any means, without permission in writing from the publisher. We make every effort to contact copyright holders. If you think we have inadvertently used your copyright material, please let us know.

Contents

Preface

1 Introduction
    1.1 This subject
        1.1.1 Aims of the course
        1.1.2 Learning outcomes
        1.1.3 Topics covered
    1.2 Reading
        1.2.1 Essential reading
        1.2.2 Further reading
    1.3 Online study resources
        1.3.1 The VLE
        1.3.2 Making use of the Online Library
    1.4 Using the subject guide
    1.5 Examination
    1.6 The use of calculators

2 Diagonalisation, Jordan normal form and differential equations
    Reading (Essential reading; Further reading)
    Aims of the chapter
    2.1 Differential equations
    2.2 Linear systems of differential equations
    2.3 Solving by diagonalisation
    2.4 The Jordan normal form
    2.5 Solving systems of differential equations using Jordan normal form
    Learning outcomes
    Test your knowledge and understanding (Exercises; Comments on exercises)
    Feedback on selected activities

3 Inner products and orthogonality
    Reading (Essential reading; Further reading)
    Aims of the chapter
    3.1 The inner product of two real vectors
        3.1.1 Geometric interpretation in R2 and R3
    3.2 Inner products more generally
        3.2.1 Norms in a vector space
        3.2.2 The Cauchy-Schwarz inequality
        3.2.3 Generalised geometry
        3.2.4 Orthogonal vectors
        3.2.5 Orthogonality and linear independence
    3.3 Orthogonal matrices and orthonormal sets
    3.4 Gram-Schmidt orthonormalisation process
    Learning outcomes
    Test your knowledge and understanding
    Feedback on selected activities

4 Orthogonal diagonalisation and its applications
    Reading (Essential reading; Further reading)
    Aims of the chapter
    4.1 Orthogonal diagonalisation of symmetric matrices
    4.2 Quadratic forms
        4.2.1 Information on quadratic forms
        4.2.2 Quadratic forms in R2 – conic sections
    Learning outcomes
    Test your knowledge and understanding (Exercises; Comments on exercises)
    Feedback on selected activities

5 Direct sums and projections
    Reading (Essential reading; Further reading)
    Aims of the chapter
    5.1 The direct sum of two subspaces
        5.1.1 The sum of two subspaces
        5.1.2 Direct sums
    5.2 Orthogonal complements
        5.2.1 The orthogonal complement of a subspace
        5.2.2 Orthogonal complements of null spaces and ranges
    5.3 Projections
        5.3.1 The definition of a projection
        5.3.2 An example
        5.3.3 Orthogonal projections
    5.4 Characterising projections and orthogonal projections
        5.4.1 Projections are idempotents
    5.5 Orthogonal projection onto the range of a matrix
    5.6 Minimising the distance to a subspace
    5.7 Fitting functions to data: least squares approximation
        5.7.1 A linear algebra view
        5.7.2 An example
    Learning outcomes
    Test your knowledge and understanding
    Feedback on selected activities

6 Generalised inverses
    Reading
    Aims of the chapter
    6.1 Left and right inverses
    6.2 Weak generalised inverses
    6.3 Strong generalised inverses
    6.4 A method for calculating SGIs
    6.5 Why are SGIs useful?
    Learning outcomes
    Test your knowledge and understanding (Exercises; Comments on exercises)
    Feedback on selected activities

7 Complex matrices and vector spaces
    Reading (Essential reading; Further reading)
    Aims of the chapter
    7.1 Complex numbers
        7.1.1 Complex numbers
        7.1.2 Algebra of complex numbers
        7.1.3 Roots of polynomials
        7.1.4 The complex plane
        7.1.5 Polar form
        7.1.6 Exponential form and Euler's formula
    7.2 Complex vector spaces
    7.3 Complex matrices
    7.4 Complex inner product spaces
        7.4.1 The inner product on Cn
        7.4.2 Complex inner product in general
        7.4.3 Orthogonal vectors
    7.5 Hermitian conjugates
        7.5.1 The Hermitian conjugate
        7.5.2 Hermitian matrices
        7.5.3 Unitary matrices
    7.6 Unitary diagonalisation and normal matrices
    7.7 Spectral decomposition
    Learning outcomes
    Test your knowledge and understanding
    Feedback on selected activities

A Sample examination paper

B Commentary on the Sample examination paper

Preface

This subject guide is not a course text. It sets out a logical sequence in which to study the topics in this subject. Where coverage in the main texts is weak, it provides some additional background material. Further reading is essential. We are grateful to Dr James Ward and Dr Bob Simon for providing us with their materials on generalised inverses and Jordan normal form.


Chapter 1 Introduction

In this very brief introduction, we aim to give you an idea of the nature of this subject and to advise on how best to approach it. We give general information about the contents and use of this subject guide, and on recommended reading and how to use the textbooks.

1.1 This subject

1.1.1 Aims of the course

This subject is intended to:

- enable students to acquire further skills in the techniques of linear algebra, as well as understanding the principles underlying the subject
- prepare students for further courses in mathematics and/or related disciplines (e.g. economics, actuarial science).

As emphasised above, however, we do also want you to understand why certain methods work: this is one of the 'skills' that you should aim to acquire. The examination will test not simply your ability to perform routine calculations, but will probe your knowledge and understanding of the fundamental principles underlying the area.

1.1.2 Learning outcomes

We now state the broad learning outcomes of this course, as a whole. More specific learning outcomes can be found at the end of each chapter. At the end of this course and having completed the reading and activities you should have:

- knowledge of the concepts, terminology, methods and conventions covered in the course
- the ability to solve unseen mathematical problems involving an understanding of these concepts
- the ability to demonstrate knowledge and understanding of the underlying principles of the subject.

There are a couple of things we should stress at this point. First, note the intention that you will be able to solve unseen problems. This means simply that you will be expected to be able to use your knowledge and understanding of the material to solve problems that are not completely standard. This is not something you should worry unduly about: all mathematics topics expect this, and you will never be expected to do anything that cannot be done using the material of the course. Second, we expect you to be able to 'demonstrate knowledge and understanding' and you might well wonder how you would demonstrate this in the examination. Well, it is precisely by being able to grapple successfully with unseen, non-routine questions that you will indicate that you have a proper understanding of the topic.

1.1.3 Topics covered

Descriptions of topics to be covered appear in the relevant chapters. However, it is useful to give a brief overview at this stage. The topics we study, broadly, are:

- the application of diagonalisation to solving differential equations, and the Jordan normal form and its application to solving differential equations
- inner products and orthogonality
- orthogonal diagonalisation and its applications
- direct sums and projections
- generalised inverses
- complex matrices and complex vector spaces.

1.2 Reading

There are many books that would be useful for at least some parts of this course. We recommend one in particular, and another for additional, further reading. Neither of these books covers generalised inverses or Jordan normal form, and those topics have therefore been discussed in this subject guide in such a way that they are self-contained.

1.2.1 Essential reading

You will need a copy of the following textbook.

- Anthony, M. and M. Harvey. Linear Algebra: Concepts and Methods. (Cambridge University Press, 2012) [ISBN 9780521279482].


1.2.2 Further reading

For additional reading, we suggest the following.

- Anton, H. and C. Rorres. Elementary Linear Algebra with Supplemental Applications (International Student Version). (John Wiley & Sons (Asia) Pte Ltd, 2010) tenth edition. [ISBN 9780470561577]. (There are many editions and variants of this book. Any one is equally useful and you will not need more than one of them. You can find the relevant sections cited in this subject guide in any edition by using the index.)

Please note that as long as you read the Essential reading you are then free to read around the subject area in any text, paper or online resource. You will need to support your learning by reading as widely as possible. To help you read extensively, you have free access to the virtual learning environment (VLE) and University of London Online Library (see below). Textbooks will provide more in-depth explanations than you will find in this subject guide, and they will also provide many more examples to study and exercises to work through. The books listed are the ones we have referred to in this subject guide.

1.3 Online study resources

In addition to the subject guide and the Essential reading, it is crucial that you take advantage of the study resources that are available online for this course, including the VLE and the Online Library. You can access the VLE, the Online Library and your University of London email account via the Student Portal at: http://my.londoninternational.ac.uk

You should receive your login details in your study pack. If you have not, or you have forgotten your login details, please email [email protected] quoting your student number.

1.3.1 The VLE

The VLE, which complements this subject guide, has been designed to enhance your learning experience, providing additional support and a sense of community. It forms an important part of your study experience with the University of London and you should access it regularly. The VLE provides a range of resources for EMFSS courses:

- Self-testing activities: Doing these allows you to test your own understanding of subject material.
- Electronic study materials: The printed materials that you receive from the University of London are available to download, including updated reading lists and references.
- Past examination papers and Examiners' commentaries: These provide advice on how each examination question might best be answered.
- A student discussion forum: This is an open space for you to discuss interests and experiences, seek support from your peers, work collaboratively to solve problems and discuss subject material.
- Videos: There are recorded academic introductions to the subject, interviews and debates and, for some courses, audio-visual tutorials and conclusions.
- Recorded lectures: For some courses, where appropriate, the sessions from previous years' Study Weekends have been recorded and made available.
- Study skills: Expert advice on preparing for examinations and developing your digital literacy skills.
- Feedback forms.

Some of these resources are available for certain courses only, but we are expanding our provision all the time and you should check the VLE regularly for updates.

1.3.2 Making use of the Online Library

The Online Library contains a huge array of journal articles and other resources to help you read widely and extensively. To access the majority of resources via the Online Library you will either need to use your University of London Student Portal login details, or you will be required to register and use an Athens login: http://tinyurl.com/ollathens

The easiest way to locate relevant content and journal articles in the Online Library is to use the Summon search engine. If you are having trouble finding an article listed in a reading list, try removing any punctuation from the title, such as single quotation marks, question marks and colons. For further advice, please see the online help pages: www.external.shl.lon.ac.uk/summon/about.php

1.4 Using the subject guide

We have already mentioned that this subject guide is not a textbook. It is important that you read textbooks in conjunction with the subject guide. In particular, you will need to work with the Anthony and Harvey book. Not only does it contain further examples and explanations (including proofs), but it has many exercises for you to attempt. At the end of each chapter of the subject guide is a section in which we urge you to test your knowledge and understanding by attempting some exercises. These will be principally from the main textbook (Anthony and Harvey), but we will occasionally give some others. Full solutions to many (but not all) of the exercises in Anthony and Harvey are in that book. The solutions to the other exercises from the book are provided on the VLE area for this course. Solutions to any additional exercises given in this subject guide are provided in the subject guide.

The exercises are a very useful resource. You should try them once you think you have mastered a particular chapter. Really try them: do not just simply read the solutions provided. Make a serious attempt before consulting the solutions. It is vital that you develop and enhance your problem-solving skills and the only way to do this is to try lots of examples. Near the end of the chapters, we also provide some feedback on the activities contained in the chapter. Again, you should consult these only after you have attempted the activities.

Unless it is explicitly stated otherwise, you are expected to work through and understand the proofs in this subject guide and also those proofs in the Anthony and Harvey book to which we refer. Understanding proofs is a good way to see how the theory works to achieve results. We often use the symbol □ to denote the end of a proof, where we have finished explaining why a particular result is true. This is just to make it clear where the proof ends and the following text begins.

1.5 Examination

Important: the information and advice given here are based on the examination structure used at the time this subject guide was written. Please note that subject guides may be used for several years. Because of this we strongly advise you to always check both the current Regulations for relevant information about the examination, and the VLE where you should be advised of any forthcoming changes. You should also carefully check the rubric/instructions on the paper you actually sit and follow those instructions.

This course is assessed by a two-hour unseen written examination. A Sample examination paper is given as the final chapter to this subject guide. In addition, a commentary is provided on the sample paper. Please do not think that the questions in a real examination will necessarily be very similar to those in the Sample examination paper. An examination is designed (by definition) to test you. You will get examination questions unlike questions in this subject guide. The whole point of examining is to see whether you can apply knowledge in familiar and unfamiliar settings. The Examiners (nice people though they are) have an obligation to surprise you! For this reason, it is important that you try as many examples as possible, from the subject guide and from the textbooks. This is not so that you can cover any possible type of question the Examiners can think of! It is so that you get used to confronting unfamiliar questions, grappling with them, and finally coming up with the solution.

Do not panic if you cannot completely solve an examination question. There are many marks to be awarded for using the correct approach or method. Remember, it is important to check the VLE for:

- up-to-date information on examination and assessment arrangements for this course
- where available, past examination papers and Examiners' commentaries for the course which give advice on how each question might best be answered.

1.6 The use of calculators

You will not be permitted to use calculators of any type in the examination. This is not something that you should panic about: the Examiners are interested in assessing that you understand the key concepts, ideas, methods and techniques, and will set questions which do not require the use of a calculator.


Chapter 2 Diagonalisation, Jordan normal form and differential equations


Reading

The Jordan normal form and its applications to differential equations are not discussed in these textbooks, so we have tried to write this chapter of the subject guide in such a way that the discussion of that topic is self-contained.

Essential reading

- Anthony, M. and M. Harvey. Linear Algebra: Concepts and Methods. Chapter 9. (For revision of diagonalisation, read Chapter 8.)

Further reading

- Anton, H. and C. Rorres. Elementary Linear Algebra. Section 5.4.

Aims of the chapter

We explore an application of diagonalisation that was not discussed in MT1173 Algebra, namely the solution of systems of differential equations. You will certainly have studied differential equations if you have taken the course MT1174 Calculus. But even if you have not met the topic before, we present here the basic facts which will enable you to handle the material. Not all matrices are diagonalisable, as we know, so we then present another way in which there exists a relatively 'simple' matrix similar to a given matrix (the Jordan normal form). We then apply this to solve systems of differential equations. Before embarking on this chapter, you should ensure that you are familiar with diagonalisation. If it has been a while since you took MT1173 or a similar course, then it is advisable to re-read that material, and also read Chapter 8 of Anthony and Harvey.

2.1 Differential equations

A differential equation is, broadly speaking, an equation that involves a function and its derivatives. We are interested here only in very simple types of differential equation, and it is quite easy to summarise what you need to know, so that we do not need a lengthy discussion of calculus.

For a function $y = y(t)$, the derivative of $y$ will be denoted by $y' = y'(t)$ or $dy/dt$. We will need the following result: if $y(t)$ satisfies the differential equation $y' = ay$, then the general solution is
$$y(t) = \beta e^{at}, \qquad \text{for some } \beta \in \mathbb{R}.$$
If an initial condition $y(0)$ is given, then since $y(0) = \beta e^{0} = \beta$, we have a particular (unique) solution $y(t) = y(0)e^{at}$ to the differential equation.

Activity 2.1 Check that $y = 3e^{2t}$ is a solution of the differential equation $y' = 2y$ which satisfies the initial condition $y(0) = 3$.
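The guide itself does not use software, but if you want an independent check of Activity 2.1, the following short sketch (an illustrative addition, assuming Python with the sympy library is available) verifies it symbolically:

```python
import sympy as sp

t = sp.symbols('t')
y = 3 * sp.exp(2 * t)  # candidate solution with initial condition y(0) = 3

# Verify the differential equation y' = 2y and the initial condition.
assert sp.simplify(sp.diff(y, t) - 2 * y) == 0
assert y.subs(t, 0) == 3
print("y = 3e^(2t) solves y' = 2y with y(0) = 3")
```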

2.2 Linear systems of differential equations

We will look at systems consisting of these types of differential equations. In MT1173 Algebra, we used a change of variable technique based on diagonalisation to solve systems of difference equations. We can apply an analogous technique to solve systems of linear differential equations. In general, a (square) linear system of differential equations for the functions $y_1(t), y_2(t), \ldots, y_n(t)$ is of the form
$$\begin{aligned}
y_1' &= a_{11}y_1 + a_{12}y_2 + \cdots + a_{1n}y_n \\
y_2' &= a_{21}y_1 + a_{22}y_2 + \cdots + a_{2n}y_n \\
&\;\;\vdots \\
y_n' &= a_{n1}y_1 + a_{n2}y_2 + \cdots + a_{nn}y_n,
\end{aligned}$$
for constants $a_{ij} \in \mathbb{R}$. So such a system takes the form $\mathbf{y}' = A\mathbf{y}$, where $A = (a_{ij})$ is an $n \times n$ matrix whose entries are constants (that is, fixed numbers), and $\mathbf{y} = (y_1, y_2, \ldots, y_n)^T$, $\mathbf{y}' = (y_1', y_2', \ldots, y_n')^T$ are vectors of functions.

If $A$ is diagonal, the system $\mathbf{y}' = A\mathbf{y}$ is easy to solve. Suppose $A = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, the diagonal matrix whose diagonal entries are (in order) $\lambda_1, \ldots, \lambda_n$. Then the system is precisely $y_1' = \lambda_1 y_1$, $y_2' = \lambda_2 y_2$, $\ldots$, $y_n' = \lambda_n y_n$, and so
$$y_1 = y_1(0)e^{\lambda_1 t}, \qquad y_2 = y_2(0)e^{\lambda_2 t}, \qquad \ldots, \qquad y_n = y_n(0)e^{\lambda_n t}.$$

Since a diagonal system is so easy to solve, it would be very helpful if we could reduce our given system to a diagonal one, and this is precisely what the method will do.


2.3 Solving by diagonalisation

We will come back to the general discussion shortly, but for now we explore the method with a simple example.

Example 2.1 Suppose the functions $y_1(t)$ and $y_2(t)$ are related as follows:
$$y_1' = 7y_1 - 15y_2, \qquad y_2' = 2y_1 - 4y_2.$$
In matrix form this is $\mathbf{y}' = A\mathbf{y}$ where $A$ is the $2 \times 2$ matrix
$$A = \begin{pmatrix} 7 & -15 \\ 2 & -4 \end{pmatrix}.$$
This matrix is diagonalisable.

Activity 2.2 Show that the matrix is diagonalisable. Find an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP = D$.

You should find that if
$$P = \begin{pmatrix} 5 & 3 \\ 2 & 1 \end{pmatrix},$$
then $P$ is invertible and
$$P^{-1}AP = D = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.$$
We now use the matrix $P$ to define new functions $z_1(t), z_2(t)$ by setting $\mathbf{y} = P\mathbf{z}$ (or equivalently, $\mathbf{z} = P^{-1}\mathbf{y}$); that is,
$$\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 5 & 3 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = P\mathbf{z},$$
so that
$$y_1 = 5z_1 + 3z_2, \qquad y_2 = 2z_1 + z_2.$$
By differentiating these equations we can express $y_1'$ and $y_2'$ in terms of $z_1'$ and $z_2'$:
$$y_1' = 5z_1' + 3z_2', \qquad y_2' = 2z_1' + z_2',$$
so that $\mathbf{y}' = (P\mathbf{z})' = P\mathbf{z}'$. Then we have
$$P\mathbf{z}' = \mathbf{y}' = A\mathbf{y} = A(P\mathbf{z}) = AP\mathbf{z},$$
and hence $\mathbf{z}' = P^{-1}AP\mathbf{z} = D\mathbf{z}$. In other words,
$$\begin{pmatrix} z_1' \\ z_2' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} z_1 \\ 2z_2 \end{pmatrix}.$$
So the system for the functions $z_1, z_2$ is diagonal and hence it is easily solved. Having found $z_1, z_2$ we can then find $y_1$ and $y_2$ through the explicit connection between the two sets of functions: namely, $\mathbf{y} = P\mathbf{z}$. □
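If you have Python with NumPy available, a quick numerical check of Activity 2.2 (an optional illustrative sketch, not part of the guide) is:

```python
import numpy as np

A = np.array([[7.0, -15.0],
              [2.0,  -4.0]])
P = np.array([[5.0, 3.0],
              [2.0, 1.0]])   # columns: eigenvectors for eigenvalues 1 and 2

# P^{-1} A P should be the diagonal matrix D = diag(1, 2).
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))       # [[1. 0.] [0. 2.]]
```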


Let us now return to the general technique. Suppose we have the system $\mathbf{y}' = A\mathbf{y}$, and that $A$ can indeed be diagonalised. Then there is an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP = D$. Here,
$$P = (\mathbf{v}_1 \; \cdots \; \mathbf{v}_n), \qquad D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n),$$
where the $\lambda_i$ are the eigenvalues and the $\mathbf{v}_i$ are the corresponding eigenvectors. Let $\mathbf{z} = P^{-1}\mathbf{y}$ (or, equivalently, let $\mathbf{y} = P\mathbf{z}$). Then $\mathbf{y}' = (P\mathbf{z})' = P\mathbf{z}'$, since $P$ has constant entries.

Activity 2.3 Prove that $(P\mathbf{z})' = P\mathbf{z}'$.

Therefore $P\mathbf{z}' = A\mathbf{y} = AP\mathbf{z}$, and $\mathbf{z}' = P^{-1}AP\mathbf{z} = D\mathbf{z}$. We may now easily solve for $\mathbf{z}$, and hence $\mathbf{y}$. We illustrate with an example of a $3 \times 3$ system of differential equations, solved using this method. Note carefully how we use the initial values $y_1(0)$, $y_2(0)$ and $y_3(0)$.

Example 2.2 We find functions $y_1(t), y_2(t), y_3(t)$ such that $y_1(0) = 2$, $y_2(0) = 1$ and $y_3(0) = 1$ and such that they are related by the linear system of differential equations
$$\frac{dy_1}{dt} = 6y_1 + 13y_2 - 8y_3, \qquad \frac{dy_2}{dt} = 2y_1 + 5y_2 - 2y_3, \qquad \frac{dy_3}{dt} = 7y_1 + 17y_2 - 9y_3.$$
We can express this system in matrix form as $\mathbf{y}' = A\mathbf{y}$ where
$$A = \begin{pmatrix} 6 & 13 & -8 \\ 2 & 5 & -2 \\ 7 & 17 & -9 \end{pmatrix}.$$
The matrix $A$ is diagonalisable.

Activity 2.4 Prove that $A$ is diagonalisable.

We have $P^{-1}AP = D$ where
$$P = \begin{pmatrix} 1 & -1 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 2 \end{pmatrix}, \qquad D = \begin{pmatrix} -2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{pmatrix}.$$


We set $\mathbf{y} = P\mathbf{z}$ and substitute into the equation $\mathbf{y}' = A\mathbf{y}$ to obtain $(P\mathbf{z})' = A(P\mathbf{z})$. That is, $P\mathbf{z}' = AP\mathbf{z}$ and so $\mathbf{z}' = P^{-1}AP\mathbf{z} = D\mathbf{z}$. In other words, if $\mathbf{z} = (z_1, z_2, z_3)^T$, then
$$\begin{pmatrix} z_1' \\ z_2' \\ z_3' \end{pmatrix} = \begin{pmatrix} -2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}.$$
So $z_1' = -2z_1$, $z_2' = z_2$, $z_3' = 3z_3$. Therefore,
$$z_1 = z_1(0)e^{-2t}, \qquad z_2 = z_2(0)e^{t}, \qquad z_3 = z_3(0)e^{3t}.$$
Then, using $\mathbf{y} = P\mathbf{z}$, we have
$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 1 & -1 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 2 \end{pmatrix}\begin{pmatrix} z_1(0)e^{-2t} \\ z_2(0)e^{t} \\ z_3(0)e^{3t} \end{pmatrix}.$$
It remains to find $z_1(0), z_2(0), z_3(0)$. To do so, we use the given initial values $y_1(0) = 2$, $y_2(0) = 1$, $y_3(0) = 1$. Since $\mathbf{y} = P\mathbf{z}$, we can see that $\mathbf{y}(0) = P\mathbf{z}(0)$. We could use row operations to solve this system to determine $\mathbf{z}(0)$. Alternatively, we could use $\mathbf{z}(0) = P^{-1}\mathbf{y}(0)$. Let us take the second approach. You should calculate the inverse of $P$ and find that
$$P^{-1} = \begin{pmatrix} -1 & -3 & 2 \\ -1 & -1 & 1 \\ 1 & 2 & -1 \end{pmatrix}.$$

Activity 2.5 Do this!

Therefore,
$$\mathbf{z}(0) = \begin{pmatrix} z_1(0) \\ z_2(0) \\ z_3(0) \end{pmatrix} = P^{-1}\mathbf{y}(0) = \begin{pmatrix} -1 & -3 & 2 \\ -1 & -1 & 1 \\ 1 & 2 & -1 \end{pmatrix}\begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ -2 \\ 3 \end{pmatrix}.$$
Therefore, finally,
$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 1 & -1 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 2 \end{pmatrix}\begin{pmatrix} -3e^{-2t} \\ -2e^{t} \\ 3e^{3t} \end{pmatrix} = \begin{pmatrix} -3e^{-2t} + 2e^{t} + 3e^{3t} \\ -2e^{t} + 3e^{3t} \\ -3e^{-2t} - 2e^{t} + 6e^{3t} \end{pmatrix}.$$


The functions are:
$$y_1(t) = -3e^{-2t} + 2e^{t} + 3e^{3t}, \qquad y_2(t) = -2e^{t} + 3e^{3t}, \qquad y_3(t) = -3e^{-2t} - 2e^{t} + 6e^{3t}.$$
How can we check our solution? First of all, it should satisfy the initial conditions. If we substitute $t = 0$ into the equations we should obtain the given initial conditions.

Activity 2.6 Check this!

The real check is to look at the derivatives at $t = 0$. We can take the original system $\mathbf{y}' = A\mathbf{y}$ and use it to find $\mathbf{y}'(0)$:
$$\begin{pmatrix} y_1'(0) \\ y_2'(0) \\ y_3'(0) \end{pmatrix} = \begin{pmatrix} 6 & 13 & -8 \\ 2 & 5 & -2 \\ 7 & 17 & -9 \end{pmatrix}\begin{pmatrix} y_1(0) \\ y_2(0) \\ y_3(0) \end{pmatrix} = \begin{pmatrix} 6 & 13 & -8 \\ 2 & 5 & -2 \\ 7 & 17 & -9 \end{pmatrix}\begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 17 \\ 7 \\ 22 \end{pmatrix}.$$
And we can differentiate our solution to find $\mathbf{y}'$, and then substitute $t = 0$:
$$\begin{pmatrix} y_1'(t) \\ y_2'(t) \\ y_3'(t) \end{pmatrix} = \begin{pmatrix} 6e^{-2t} + 2e^{t} + 9e^{3t} \\ -2e^{t} + 9e^{3t} \\ 6e^{-2t} - 2e^{t} + 18e^{3t} \end{pmatrix}.$$

Activity 2.7 Substitute $t = 0$ to obtain $\mathbf{y}'(0)$ and check that it gives the same answer.

Often it is desirable to find a general solution to a system of differential equations, where no initial conditions are given. A general solution will have $n$ arbitrary constants, essentially one for each function, so that given different initial conditions later, different particular solutions can be easily obtained. We will show how this works using the system in Example 2.2.

Example 2.3 Let $y_1(t), y_2(t), y_3(t)$ be functions related by the system of differential equations
$$\frac{dy_1}{dt} = 6y_1 + 13y_2 - 8y_3, \qquad \frac{dy_2}{dt} = 2y_1 + 5y_2 - 2y_3, \qquad \frac{dy_3}{dt} = 7y_1 + 17y_2 - 9y_3.$$
Let the matrices $A$, $P$ and $D$ be exactly as before in Example 2.2, so that we still have $P^{-1}AP = D$, and setting $\mathbf{y} = P\mathbf{z}$ to define new functions $z_1(t), z_2(t), z_3(t)$, we have
$$\mathbf{y}' = A\mathbf{y} \iff P\mathbf{z}' = AP\mathbf{z} \iff \mathbf{z}' = P^{-1}AP\mathbf{z} = D\mathbf{z}.$$
So we need to solve the equations $z_1' = -2z_1$, $z_2' = z_2$, $z_3' = 3z_3$


in the absence of specific initial conditions. The general solutions are
$$z_1 = \alpha e^{-2t}, \qquad z_2 = \beta e^{t}, \qquad z_3 = \gamma e^{3t},$$
for arbitrary constants $\alpha, \beta, \gamma \in \mathbb{R}$. Therefore the general solution of the original system is
$$\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = P\mathbf{z} = \begin{pmatrix} 1 & -1 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 2 \end{pmatrix}\begin{pmatrix} \alpha e^{-2t} \\ \beta e^{t} \\ \gamma e^{3t} \end{pmatrix};$$
that is,
$$y_1(t) = \alpha e^{-2t} - \beta e^{t} + \gamma e^{3t}, \qquad y_2(t) = \beta e^{t} + \gamma e^{3t}, \qquad y_3(t) = \alpha e^{-2t} + \beta e^{t} + 2\gamma e^{3t},$$
for $\alpha, \beta, \gamma \in \mathbb{R}$.

Using the general solution, you can find particular solutions for any given initial conditions. For example, using the same initial conditions $y_1(0) = 2$, $y_2(0) = 1$ and $y_3(0) = 1$ as in Example 2.2, we can substitute $t = 0$ into the general solution to obtain
$$y_1(0) = 2 = \alpha - \beta + \gamma, \qquad y_2(0) = 1 = \beta + \gamma, \qquad y_3(0) = 1 = \alpha + \beta + 2\gamma,$$
and solve this linear system of equations for $\alpha, \beta, \gamma$. Of course this is precisely the same system $\mathbf{y}(0) = P\mathbf{z}(0)$ as before, with solution $P^{-1}\mathbf{y}(0)$:
$$\begin{pmatrix} \alpha \\ \beta \\ \gamma \end{pmatrix} = \begin{pmatrix} -1 & -3 & 2 \\ -1 & -1 & 1 \\ 1 & 2 & -1 \end{pmatrix}\begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ -2 \\ 3 \end{pmatrix}.$$

Activity 2.8 Find the particular solution of the system of differential equations in Example 2.3 which satisfies the initial conditions $y_1(0) = 1$, $y_2(0) = 1$ and $y_3(0) = 0$.
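A useful independent check of Examples 2.2 and 2.3 is the matrix-exponential form of the solution, $\mathbf{y}(t) = e^{At}\mathbf{y}(0)$ (a standard fact that the guide does not rely on). Purely as an optional sketch, assuming Python with NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[6.0, 13.0, -8.0],
              [2.0,  5.0, -2.0],
              [7.0, 17.0, -9.0]])
y0 = np.array([2.0, 1.0, 1.0])

t = 0.3
# The solution of y' = Ay with y(0) = y0 is y(t) = exp(At) y0.
y_expm = expm(A * t) @ y0

# Closed form found by diagonalisation in Example 2.2.
y_closed = np.array([-3*np.exp(-2*t) + 2*np.exp(t) + 3*np.exp(3*t),
                     -2*np.exp(t) + 3*np.exp(3*t),
                     -3*np.exp(-2*t) - 2*np.exp(t) + 6*np.exp(3*t)])

print(np.allclose(y_expm, y_closed))  # True
```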

2.4 The Jordan normal form

Of course, as we know, not every square matrix is diagonalisable. But it turns out to be the case that for any square matrix, there will be a relatively 'simple' matrix similar to it, known as a Jordan matrix. This is made precise in the following theorem, the proof of which is outside the scope of this course. (Actually, this theorem is a special case of a more general result for complex matrices. The theorem stated below applies to matrices which have real eigenvalues only. So, as stated, it does not show that every square matrix will be similar to a Jordan matrix. But be assured that there is a more general version of the theorem that applies in all cases.)

Theorem 2.1 If $A$ is an $n \times n$ matrix with characteristic polynomial
$$(x - \lambda_1)^{m_1} \cdots (x - \lambda_k)^{m_k},$$
then there will exist an invertible $n \times n$ matrix $P$ such that
$$P^{-1}AP = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_k \end{pmatrix},$$
where each $A_i$ (an $m_i \times m_i$ matrix) looks like
$$A_i = \begin{pmatrix} \lambda_i & * & 0 & \cdots & 0 & 0 \\ 0 & \lambda_i & * & \cdots & 0 & 0 \\ 0 & 0 & \lambda_i & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_i & * \\ 0 & 0 & 0 & \cdots & 0 & \lambda_i \end{pmatrix},$$
where each $*$ is either $0$ or $1$ and all other entries are zeros.

Note that the special case of this in which all $*$ entries are $0$ is nothing more than a diagonal matrix. But the point of the theorem is that even if such a matrix $A$ cannot be diagonalised, it will be similar to a Jordan matrix, which is an 'almost-diagonal' matrix.

Example 2.4 Let
$$A = \begin{pmatrix} 3 & -1 & -4 & 7 \\ 1 & 1 & -3 & 5 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 2 \end{pmatrix}.$$
If
$$P = \begin{pmatrix} 3 & -4 & 0 & 8 \\ 2 & -3 & 0 & 7 \\ 1 & -2 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$
then $P$ is invertible, with
$$P^{-1} = \begin{pmatrix} 3 & -4 & 0 & 4 \\ 2 & -3 & 0 & 5 \\ 1 & -2 & 1 & 4 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$
and (check this!)
$$P^{-1}AP = J = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix},$$
which is a Jordan matrix.


There is another way to describe Jordan matrices, which is sometimes more useful. A $k \times k$ matrix $B$ is a Jordan block if, for some $\lambda$: if $k = 1$, then $B = (\lambda)$; and, if $k \geq 2$,
$$B = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ 0 & 0 & \lambda & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{pmatrix}.$$
Then a Jordan matrix is one of the form
$$J = \begin{pmatrix} B_1 & 0 & \cdots & 0 \\ 0 & B_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & B_r \end{pmatrix},$$
where each $B_i$ is a Jordan block, $i = 1, \ldots, r$.

Example 2.5 Consider the Jordan matrix of the previous example,
$$J = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix}.$$
Then
$$J = \begin{pmatrix} B_1 & 0 \\ 0 & B_2 \end{pmatrix},$$
where $B_1$, $B_2$ are the following Jordan blocks:
$$B_1 = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad B_2 = (2).$$

If $J_1, J_2$ are two Jordan matrices similar to a given matrix $A$, then it turns out that the only way in which $J_1, J_2$ can differ is in the order of the Jordan blocks. (This is analogous to the fact that, for a diagonalisable matrix, the only diagonal matrices similar to it are those whose diagonal entries are the eigenvalues, in some order, and this order can be changed.) For this reason, we can speak of the Jordan normal form of $A$ (or the Jordan canonical form of $A$), by which we mean a Jordan matrix similar to $A$. (Although there are a number of such Jordan matrices, they differ only in the ordering of the Jordan blocks, so this is why we use the word 'the' in 'the Jordan normal form': we could instead say 'a' but the point is that there is essentially only one such matrix.)


Example 2.6 Let $A$ be the matrix
$$A = \begin{pmatrix} 2 & 0 & 1 & -3 \\ 0 & 2 & 10 & 4 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}.$$
Then it turns out that $A$ is not diagonalisable. The Jordan normal form of $A$ is
$$J = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}.$$
Note that, given what we said just a moment ago, we could equally well say that the Jordan normal form is
$$J = \begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix}.$$
It is the case that, if
$$P = \begin{pmatrix} 0 & 1 & 0 & -3 \\ 1 & 10 & 0 & 4 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$
then $P$ is invertible and $P^{-1}AP = J$. (Although $P$ is a $4 \times 4$ matrix, it is easy to see this because the last two rows of $P$ are so simple.) Just to make sure you understand the Jordan block description of a Jordan matrix, note that we can write
$$J = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} = \begin{pmatrix} B_1 & 0 & 0 \\ 0 & B_2 & 0 \\ 0 & 0 & B_3 \end{pmatrix},$$
where $B_1, B_2, B_3$ are the Jordan blocks
$$B_1 = (2), \qquad B_2 = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \qquad B_3 = (3).$$

We will be interested in using Jordan normal forms to solve systems of linear differential equations, but we will not, in this course, be concerned with how to determine the Jordan normal form. But let us make a few observations. Look at Example 2.6. What this tells us is that if
$$\mathbf{v}_1 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{v}_2 = \begin{pmatrix} 1 \\ 10 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{v}_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad \mathbf{v}_4 = \begin{pmatrix} -3 \\ 4 \\ 0 \\ 1 \end{pmatrix},$$
then these four vectors are linearly independent, and:
$$A\mathbf{v}_1 = 2\mathbf{v}_1, \quad A\mathbf{v}_2 = 2\mathbf{v}_2, \quad A\mathbf{v}_3 = \mathbf{v}_2 + 2\mathbf{v}_3, \quad A\mathbf{v}_4 = 3\mathbf{v}_4.$$

This is simply because $J$ represents the transformation corresponding to $A$ with respect to the basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$. To explain this further, recall from MT1173 Algebra (section 9.7) the following facts: (i) the matrix $M$ representing the linear transformation $T(\mathbf{x}) = A\mathbf{x}$ with respect to a basis $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ has as its $i$-th column the coordinate vector of $T(\mathbf{v}_i)$ with respect to the basis $B$; and (ii) if $P$ is the matrix with $i$-th column equal to $\mathbf{v}_i$, then $P^{-1}AP$ is the matrix $M$. So, for example, the fact that $A\mathbf{v}_3 = \mathbf{v}_2 + 2\mathbf{v}_3$ means that the third column of the matrix $J = P^{-1}AP$ is the vector $(0, 1, 2, 0)^T$, because this is the coordinate vector of $A\mathbf{v}_3$ with respect to the basis $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$. Note that $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_4$ are eigenvectors. Vector $\mathbf{v}_3$ is not, however, an eigenvector: it is what is called a generalised eigenvector corresponding to eigenvalue 2. It does not satisfy $(A - 2I)\mathbf{v} = \mathbf{0}$, as an eigenvector would, but it does satisfy $(A - 2I)^2\mathbf{v} = \mathbf{0}$.

Activity 2.9 Check that $(A - 2I)^2\mathbf{v}_3 = \mathbf{0}$ and that $(A - 2I)\mathbf{v}_3 \neq \mathbf{0}$.
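Again purely as an optional numerical illustration (assuming NumPy is available), one can confirm the claims of Example 2.6 and Activity 2.9:

```python
import numpy as np

A = np.array([[2.0, 0.0,  1.0, -3.0],
              [0.0, 2.0, 10.0,  4.0],
              [0.0, 0.0,  2.0,  0.0],
              [0.0, 0.0,  0.0,  3.0]])
P = np.array([[0.0,  1.0, 0.0, -3.0],
              [1.0, 10.0, 0.0,  4.0],
              [0.0,  0.0, 1.0,  0.0],
              [0.0,  0.0, 0.0,  1.0]])

# P^{-1} A P should be the Jordan matrix J of Example 2.6.
print(np.round(np.linalg.inv(P) @ A @ P, 10))

# v3 is a generalised eigenvector for eigenvalue 2 (Activity 2.9):
v3 = P[:, 2]
N = A - 2 * np.eye(4)
print(N @ v3)        # non-zero, so v3 is not an eigenvector
print(N @ N @ v3)    # the zero vector
```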

2.5 Solving systems of differential equations using Jordan normal form

Our interest in the Jordan normal form is to use it to solve linear systems of differential equations in cases where the underlying matrix is not diagonalisable. We illustrate this with an example.

Example 2.7 Suppose we want to find the general solution of the following system of differential equations:
$$\frac{dy_1}{dt} = y_1 + y_3, \qquad \frac{dy_2}{dt} = y_1 + y_2 - 3y_3, \qquad \frac{dy_3}{dt} = y_2 + 4y_3.$$
We can express this system in matrix form as $\mathbf{y}' = A\mathbf{y}$ where
$$A = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & -3 \\ 0 & 1 & 4 \end{pmatrix}.$$
Our approach outlined earlier would attempt to diagonalise $A$. But $A$ turns out not to be diagonalisable.

Activity 2.10 Prove that $A$ is not diagonalisable.


The next best thing is to work with the Jordan normal form. Suppose this is given: explicitly, suppose we know that $P^{-1}AP = J$ where
$$P = \begin{pmatrix} 1 & 1 & 0 \\ -2 & -3 & 0 \\ 1 & 2 & 1 \end{pmatrix}, \qquad J = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}.$$

Activity 2.11 Let $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ denote the three columns of $P$, in order. Verify that $\mathbf{v}_1$ is an eigenvector for $A$ corresponding to eigenvalue 2, and that $A\mathbf{v}_2 = \mathbf{v}_1 + 2\mathbf{v}_2$ and $A\mathbf{v}_3 = \mathbf{v}_2 + 2\mathbf{v}_3$. (Given that the three vectors are linearly independent, $J$ then represents $A$ with respect to the basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$.)

As in the situation in which the coefficient matrix is diagonalisable, we set $\mathbf{y} = P\mathbf{z}$ and substitute into the equation $\mathbf{y}' = A\mathbf{y}$ to obtain $(P\mathbf{z})' = A(P\mathbf{z})$. That is, $P\mathbf{z}' = AP\mathbf{z}$ and so $\mathbf{z}' = P^{-1}AP\mathbf{z} = J\mathbf{z}$. In other words, if $\mathbf{z} = (z_1, z_2, z_3)^T$, then
$$\begin{pmatrix} z_1' \\ z_2' \\ z_3' \end{pmatrix} = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}.$$

So,
$$z_1' = 2z_1 + z_2, \qquad z_2' = 2z_2 + z_3, \qquad z_3' = 2z_3.$$
This system is not uncoupled as it would be had we been able to diagonalise $A$. But it is easier to solve than the system we started with. We will come shortly to explain how to solve it, but for the moment let us just accept that the solutions for the $z_i$ are as follows:
$$z_3 = c_3 e^{2t}, \qquad z_2 = c_2 e^{2t} + c_3 t e^{2t}, \qquad z_1 = c_1 e^{2t} + c_2 t e^{2t} + c_3 \frac{t^2}{2} e^{2t}.$$
Then the general solution to the original system is given by
$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = P\mathbf{z} = \begin{pmatrix} z_1 + z_2 \\ -2z_1 - 3z_2 \\ z_1 + 2z_2 + z_3 \end{pmatrix},$$
so
$$\begin{aligned}
y_1 &= (c_1 + c_2)e^{2t} + (c_2 + c_3)te^{2t} + \frac{c_3}{2}t^2 e^{2t} \\
y_2 &= (-2c_1 - 3c_2)e^{2t} + (-2c_2 - 3c_3)te^{2t} - c_3 t^2 e^{2t} \\
y_3 &= (c_1 + 2c_2 + c_3)e^{2t} + (c_2 + 2c_3)te^{2t} + \frac{c_3}{2}t^2 e^{2t}.
\end{aligned}$$
This is the general solution. If we had initial values, we could determine the particular solution satisfying those initial values, just as we did earlier.
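To see that the stated $z_i$ really do satisfy $\mathbf{z}' = J\mathbf{z}$, here is an optional symbolic check with sympy (an illustrative addition, not part of the guide):

```python
import sympy as sp

t, c1, c2, c3 = sp.symbols('t c1 c2 c3')
J = sp.Matrix([[2, 1, 0],
               [0, 2, 1],
               [0, 0, 2]])

z = sp.Matrix([c1*sp.exp(2*t) + c2*t*sp.exp(2*t) + c3*t**2/2*sp.exp(2*t),
               c2*sp.exp(2*t) + c3*t*sp.exp(2*t),
               c3*sp.exp(2*t)])

# Verify that z' = Jz holds identically in t.
residual = (z.diff(t) - J*z).applyfunc(sp.simplify)
assert residual == sp.zeros(3, 1)
print("z(t) solves z' = Jz")
```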


Let us look at another example and see how we might generalise from it.

Example 2.8 We saw earlier, in Example 2.6, that if
$$A = \begin{pmatrix} 2 & 0 & 1 & -3 \\ 0 & 2 & 10 & 4 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix},$$
and if
$$P = \begin{pmatrix} 0 & 1 & 0 & -3 \\ 1 & 10 & 0 & 4 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$
then $P^{-1}AP$ is the Jordan matrix
$$J = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} = \begin{pmatrix} B_1 & 0 & 0 \\ 0 & B_2 & 0 \\ 0 & 0 & B_3 \end{pmatrix}.$$
To solve $\mathbf{y}' = A\mathbf{y}$ using the reduction to Jordan normal form, we would set $\mathbf{y} = P\mathbf{z}$ and solve $\mathbf{z}' = J\mathbf{z}$. Let us look at this system for $\mathbf{z}$. Explicitly, it will be:
$$z_1' = 2z_1, \qquad z_2' = 2z_2 + z_3, \qquad z_3' = 2z_3, \qquad z_4' = 3z_4.$$
Clearly the first, third and final equations can be solved directly just as in the diagonalisable case:
$$z_1 = c_1 e^{2t}, \qquad z_3 = c_3 e^{2t}, \qquad z_4 = c_4 e^{3t}.$$
It turns out that the solution for $z_2$ is $z_2 = c_2 e^{2t} + c_3 t e^{2t}$. We can then determine $\mathbf{y}$ by using $\mathbf{y} = P\mathbf{z}$.

Now, looking at the explicit equations for the $z_i$ in this example, we can see that really there are three separate 'sub-systems' to solve: namely $z_1' = 2z_1$, which is directly solvable; then $z_2' = 2z_2 + z_3$, $z_3' = 2z_3$, which involves only the functions $z_2$ and $z_3$; and finally $z_4' = 3z_4$, which is directly solvable. You can probably see in general that if we are attempting to solve
$$\mathbf{z}' = J\mathbf{z} = \begin{pmatrix} B_1 & 0 & \cdots & 0 \\ 0 & B_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & B_r \end{pmatrix}\mathbf{z},$$


where each $B_i$ is a Jordan block, then we can separate this into $r$ sub-systems for the $z_i$ which can be solved separately (and having solved these, we then determine the solution $\mathbf{y} = P\mathbf{z}$).

Let us focus on any particular sub-system. It will be of the form $\mathbf{w}' = B\mathbf{w}$, where $\mathbf{w}$ consists of some of the $z_i$ and where
$$B = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ 0 & 0 & \lambda & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{pmatrix}$$
is a Jordan block corresponding to some $\lambda$. (We will assume $B$ is a $k \times k$ matrix where $k \geq 2$. We don't need to consider the case $k = 1$, for that is just the directly solvable case we encountered earlier in the situation where the coefficient matrix is diagonalisable.) We have the following useful result which we can invoke when confronted with a system of the form $\mathbf{w}' = B\mathbf{w}$ where $B$ is a Jordan block.

Theorem 2.2 The general solution to the system of differential equations
$$\begin{pmatrix} w_1' \\ w_2' \\ \vdots \\ w_k' \end{pmatrix} = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{pmatrix}\begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_k \end{pmatrix}$$
is given by
$$\begin{aligned}
w_k &= c_k e^{\lambda t} \\
w_{k-1} &= c_{k-1} e^{\lambda t} + c_k t e^{\lambda t} \\
w_{k-2} &= c_{k-2} e^{\lambda t} + c_{k-1} t e^{\lambda t} + c_k \frac{t^2}{2} e^{\lambda t},
\end{aligned}$$
and, in general, for $j = 1, \ldots, k$,
$$w_j = c_j e^{\lambda t} + c_{j+1} t e^{\lambda t} + c_{j+2} \frac{t^2}{2} e^{\lambda t} + \cdots + c_k \frac{t^{k-j}}{(k-j)!} e^{\lambda t}.$$
Here, as always, $r!$ denotes the product $r(r-1)\cdots 1$.

Proof We have the equation $w_k' = \lambda w_k$, so certainly $w_k = c_k e^{\lambda t}$ for some $c_k$, and that concurs with the statement of the theorem. Consider a general value of $j$ between $1$ and $k-1$ and suppose that we already know that $w_{j+1}$ is as stated in the theorem. Now, we have $w_j' = \lambda w_j + w_{j+1}$. Multiplying both sides of this equation by $e^{-\lambda t}$ and rearranging gives
$$e^{-\lambda t} w_j' - \lambda e^{-\lambda t} w_j = e^{-\lambda t} w_{j+1}.$$


Now, the left-hand side is easily seen to be $(e^{-\lambda t} w_j)'$, so we must have that $e^{-\lambda t} w_j$ is the (indefinite) integral of $e^{-\lambda t} w_{j+1}$. Given that
$$e^{-\lambda t} w_{j+1} = c_{j+1} + c_{j+2} t + c_{j+3} \frac{t^2}{2} + \cdots + c_k \frac{t^{k-j-1}}{(k-j-1)!},$$
it follows that
$$e^{-\lambda t} w_j = \int e^{-\lambda t} w_{j+1}\, dt = c_j + c_{j+1} t + c_{j+2} \frac{t^2}{2} + \cdots + c_k \frac{t^{k-j}}{(k-j)!},$$
where $c_j$ is a constant of integration. The expression for $w_j$ then follows by multiplying both sides by $e^{\lambda t}$. This 'inductive' argument proves the theorem. □

For instance, here is the special case that applies for $k = 2$:
$$w_1 = c_1 e^{\lambda t} + c_2 t e^{\lambda t}, \qquad w_2 = c_2 e^{\lambda t}.$$
And, for $k = 3$, we have:
$$w_1 = c_1 e^{\lambda t} + c_2 t e^{\lambda t} + c_3 \frac{t^2}{2} e^{\lambda t}, \qquad w_2 = c_2 e^{\lambda t} + c_3 t e^{\lambda t}, \qquad w_3 = c_3 e^{\lambda t}.$$

Activity 2.12 Now go back to the examples given earlier where we merely presented the solutions of the differential equations, and convince yourself that the theorem just proved shows why these are indeed the solutions.
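Theorem 2.2 can be restated as a formula for the exponential of a Jordan block: the $(i,j)$ entry of $e^{Bt}$ is $\frac{t^{j-i}}{(j-i)!}e^{\lambda t}$ for $j \geq i$, and $0$ otherwise. This equivalent viewpoint is not used in the guide, but it gives an easy numerical cross-check, assuming Python with SciPy is available:

```python
import numpy as np
from scipy.linalg import expm
from math import factorial

lam, k, t = -1.0, 4, 0.7
B = lam * np.eye(k) + np.diag(np.ones(k - 1), 1)   # k x k Jordan block

# Theorem 2.2 in matrix form: entry (i, j) of exp(Bt) is
# t^(j-i)/(j-i)! * e^(lam*t) for j >= i, and 0 below the diagonal.
E = np.zeros((k, k))
for i in range(k):
    for j in range(i, k):
        E[i, j] = t**(j - i) / factorial(j - i) * np.exp(lam * t)

print(np.allclose(expm(B * t), E))  # True
```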

Learning outcomes

At the end of this chapter and the relevant reading, you should be able to:

- solve systems of differential equations in which the underlying matrix is diagonalisable, by using the change of variable method
- know what is meant by a Jordan matrix, and the Jordan normal form of a matrix
- use the Jordan normal form to solve systems of differential equations.

Test your knowledge and understanding

Exercises

You should now attempt the following Exercises from Anthony and Harvey: Exercises 9.7, 9.8, 9.9, 9.17, 9.18, 9.19 and 9.20.


Now, attempt the following exercises on the Jordan normal form and its applications to solving systems of differential equations.


Exercise 2.1 Let
$$A = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & -3 \\ 0 & 1 & 3 \end{pmatrix}.$$
Find the eigenvalues of $A$ and show that $A$ cannot be diagonalised. Let
$$\mathbf{v}_1 = \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}, \qquad \mathbf{v}_2 = \begin{pmatrix} 1 \\ -3 \\ 2 \end{pmatrix}, \qquad \mathbf{v}_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$
Verify that $\mathbf{v}_1$ is an eigenvector of $A$. Verify also that $A\mathbf{v}_2 = \mathbf{v}_1 + \mathbf{v}_2$, $A\mathbf{v}_3 = \mathbf{v}_2 + \mathbf{v}_3$. Hence write down a matrix $P$ and a Jordan matrix $J$ such that $P^{-1}AP = J$.

Exercise 2.2 Find the functions $y_1(t), y_2(t), y_3(t)$ which are such that $y_1(0) = 1$, $y_2(0) = 1$ and $y_3(0) = 1$ and which satisfy
$$y_1' = -y_1 + y_2, \qquad y_2' = -y_2 + y_3, \qquad y_3' = -y_3.$$

Exercise 2.3 Write down the general solution to the following system of differential equations:
$$\begin{pmatrix} y_1' \\ y_2' \\ y_3' \\ y_4' \\ y_5' \\ y_6' \\ y_7' \\ y_8' \end{pmatrix} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 3 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 3
\end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \end{pmatrix}.$$

Exercise 2.4 Let
$$A = \begin{pmatrix} 1 & -1 & 0 \\ 1 & 3 & 0 \\ -2 & 1 & -1 \end{pmatrix}.$$
Verify that $(0, 0, 1)^T$ is an eigenvector of $A$. Verify also that $\mathbf{u} = (-1, 1, 1)^T$ is an eigenvector of $A$ and that if $\mathbf{v} = (0, 1, 0)^T$, then $A\mathbf{v} = 2\mathbf{v} + \mathbf{u}$. Hence find an invertible matrix $P$ and a Jordan matrix $J$ such that $P^{-1}AP = J$. Hence find the general solution to the following system of differential equations:
$$\begin{pmatrix} y_1' \\ y_2' \\ y_3' \end{pmatrix} = \begin{pmatrix} 1 & -1 & 0 \\ 1 & 3 & 0 \\ -2 & 1 & -1 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}.$$

Comments on exercises

See Anthony and Harvey for solutions to Exercises 9.7, 9.8, 9.9. Solutions to Exercises 9.17, 9.18, 9.19 and 9.20 may be found on the VLE pages for this course. Here are the solutions to the remaining exercises.

Solution to exercise 2.1 There is one eigenvalue, $\lambda = 1$, but you will find that there are not three linearly independent eigenvectors. (We omit the details here.) The required verifications are straightforward. The three vectors are easily seen to form a linearly independent set, and hence a basis of $\mathbb{R}^3$. With respect to that basis, the transformation represented by $A$ with respect to the standard basis will, when represented with respect to the basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, be the matrix
$$J = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix},$$
by the theory you will already have studied in MT1173 Algebra. That means that if we take the matrix $P$ with these vectors as its columns, then $P^{-1}AP$ will equal $J$. This $P$ is
$$P = \begin{pmatrix} 1 & 1 & 0 \\ -2 & -3 & 0 \\ 1 & 2 & 1 \end{pmatrix}.$$

Solution to exercise 2.2 The system is $\mathbf{y}' = J\mathbf{y}$ where $J$ is the Jordan matrix
$$J = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{pmatrix}.$$
By Theorem 2.2, for some constants $c_1, c_2, c_3$,
$$y_1 = c_1 e^{-t} + c_2 t e^{-t} + c_3 \frac{t^2}{2} e^{-t}, \qquad y_2 = c_2 e^{-t} + c_3 t e^{-t}, \qquad y_3 = c_3 e^{-t}.$$
The initial values then easily establish that $c_1 = c_2 = c_3 = 1$. So the answer is
$$y_1 = e^{-t} + t e^{-t} + \frac{t^2}{2} e^{-t}, \qquad y_2 = e^{-t} + t e^{-t}, \qquad y_3 = e^{-t}.$$


Solution to exercise 2.3


The matrix is a Jordan matrix and there are really three Jordan block sub-systems to solve. The first is simply $y_1' = y_1$, the second is
$$\begin{pmatrix} y_2' \\ y_3' \\ y_4' \\ y_5' \end{pmatrix} = \begin{pmatrix} 2 & 1 & 0 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 2 \end{pmatrix}\begin{pmatrix} y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix},$$
and the third is
$$\begin{pmatrix} y_6' \\ y_7' \\ y_8' \end{pmatrix} = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix} y_6 \\ y_7 \\ y_8 \end{pmatrix}.$$
Daunting as this may look, it is easy if we use Theorem 2.2. We have:
$$\begin{aligned}
y_1 &= c_1 e^{t} \\
y_2 &= c_2 e^{2t} + c_3 t e^{2t} + c_4 \frac{t^2}{2} e^{2t} + c_5 \frac{t^3}{6} e^{2t} \\
y_3 &= c_3 e^{2t} + c_4 t e^{2t} + c_5 \frac{t^2}{2} e^{2t} \\
y_4 &= c_4 e^{2t} + c_5 t e^{2t} \\
y_5 &= c_5 e^{2t} \\
y_6 &= c_6 e^{3t} + c_7 t e^{3t} + c_8 \frac{t^2}{2} e^{3t} \\
y_7 &= c_7 e^{3t} + c_8 t e^{3t} \\
y_8 &= c_8 e^{3t}.
\end{aligned}$$

Solution to exercise 2.4 The verifications are straightforward and are omitted here. The first vector is an eigenvector for eigenvalue $-1$ and $\mathbf{u}$ is an eigenvector for eigenvalue $2$. Let $\mathbf{w} = (0, 0, 1)^T$. Then $\{\mathbf{u}, \mathbf{v}, \mathbf{w}\}$ is a basis of $\mathbb{R}^3$ since the vectors are easily seen to be linearly independent. The representation of $A$ with respect to this basis must be
$$J = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$
So $P^{-1}AP = J$ where
$$P = \begin{pmatrix} -1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}.$$
To solve the system of differential equations, we make the substitution $\mathbf{y} = P\mathbf{z}$. Then we will have $\mathbf{z}' = J\mathbf{z}$. That is,
$$\begin{pmatrix} z_1' \\ z_2' \\ z_3' \end{pmatrix} = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}.$$


Solving in the usual way (applying Theorem 2.2) we have
$$z_1 = c_1 e^{2t} + c_2 t e^{2t}, \qquad z_2 = c_2 e^{2t}, \qquad z_3 = c_3 e^{-t}.$$
Then the general solution for the $y_i$ is
$$\mathbf{y} = P\mathbf{z} = \begin{pmatrix} -1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} c_1 e^{2t} + c_2 t e^{2t} \\ c_2 e^{2t} \\ c_3 e^{-t} \end{pmatrix} = \begin{pmatrix} -c_1 e^{2t} - c_2 t e^{2t} \\ c_2 t e^{2t} + (c_1 + c_2) e^{2t} \\ c_1 e^{2t} + c_2 t e^{2t} + c_3 e^{-t} \end{pmatrix}.$$
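As an optional NumPy check of this exercise (an illustrative addition, not part of the guide):

```python
import numpy as np

A = np.array([[ 1.0, -1.0,  0.0],
              [ 1.0,  3.0,  0.0],
              [-2.0,  1.0, -1.0]])
P = np.array([[-1.0, 0.0, 0.0],
              [ 1.0, 1.0, 0.0],
              [ 1.0, 0.0, 1.0]])   # columns: u, v, w

# P^{-1} A P should be the Jordan matrix J of the solution.
print(np.round(np.linalg.inv(P) @ A @ P, 10))
# [[ 2.  1.  0.]
#  [ 0.  2.  0.]
#  [ 0.  0. -1.]]
```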

Feedback on selected activities

Feedback to activity 2.1 It is clear that $y(0) = 3e^{0} = 3$. Furthermore, $y' = 6e^{2t} = 2(3e^{2t}) = 2y$.

Feedback to activity 2.2 Just to review diagonalisation, we will present the solution in some detail. We have
$$A - \lambda I = \begin{pmatrix} 7 & -15 \\ 2 & -4 \end{pmatrix} - \lambda \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 7-\lambda & -15 \\ 2 & -4-\lambda \end{pmatrix}$$
and the characteristic polynomial is
$$|A - \lambda I| = \begin{vmatrix} 7-\lambda & -15 \\ 2 & -4-\lambda \end{vmatrix} = (7-\lambda)(-4-\lambda) + 30 = \lambda^2 - 3\lambda - 28 + 30 = \lambda^2 - 3\lambda + 2.$$
So the eigenvalues are the solutions of $\lambda^2 - 3\lambda + 2 = 0$. To solve this for $\lambda$, one could use either the formula for the solutions to a quadratic equation, or simply observe that the characteristic polynomial factorises. We have $(\lambda - 1)(\lambda - 2) = 0$ with solutions $\lambda = 1$ and $\lambda = 2$. Hence the eigenvalues of $A$ are 1 and 2, and these are the only eigenvalues of $A$. To find the eigenvectors for eigenvalue 1 we solve the system $(A - I)\mathbf{x} = \mathbf{0}$. We do this by putting the coefficient matrix $A - I$ into reduced echelon form:
$$(A - I) = \begin{pmatrix} 6 & -15 \\ 2 & -5 \end{pmatrix} \longrightarrow \cdots \longrightarrow \begin{pmatrix} 1 & -\frac{5}{2} \\ 0 & 0 \end{pmatrix}.$$
This system has solutions
$$\mathbf{v} = t\begin{pmatrix} 5 \\ 2 \end{pmatrix}, \qquad \text{for any } t \in \mathbb{R}.$$
There are infinitely many eigenvectors for 1: for each $t \neq 0$, $\mathbf{v}$ is an eigenvector of $A$ corresponding to $\lambda = 1$. But be careful not to think that you can choose $t = 0$; for then $\mathbf{v}$ becomes the zero vector, and this is never an eigenvector, simply by definition. To find the eigenvectors for 2, we solve $(A - 2I)\mathbf{x} = \mathbf{0}$ by reducing the coefficient matrix:
$$(A - 2I) = \begin{pmatrix} 5 & -15 \\ 2 & -6 \end{pmatrix} \longrightarrow \cdots \longrightarrow \begin{pmatrix} 1 & -3 \\ 0 & 0 \end{pmatrix}.$$
Setting the non-leading variable equal to $t$, we obtain the solutions
$$\mathbf{v} = t\begin{pmatrix} 3 \\ 1 \end{pmatrix}, \qquad t \in \mathbb{R}.$$
Any non-zero scalar multiple of the vector $(3, 1)^T$ is an eigenvector of $A$ for eigenvalue 2. It follows, then, that if
$$P = \begin{pmatrix} 5 & 3 \\ 2 & 1 \end{pmatrix},$$
then $P$ is invertible and
$$P^{-1}AP = D = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.$$

Feedback to activity 2.3 Each row of the $n \times 1$ matrix $P\mathbf{z}$ is a linear combination of the functions $z_1(t), z_2(t), \ldots, z_n(t)$. For example, row $i$ of $P\mathbf{z}$ is
$$p_{i1}z_1(t) + p_{i2}z_2(t) + \cdots + p_{in}z_n(t).$$
The rows of the matrix $(P\mathbf{z})'$ are the derivatives of these linear combinations of functions, so the $i$-th row is
$$(p_{i1}z_1(t) + p_{i2}z_2(t) + \cdots + p_{in}z_n(t))' = p_{i1}z_1'(t) + p_{i2}z_2'(t) + \cdots + p_{in}z_n'(t),$$
using the properties of differentiation, since the entries $p_{ij}$ of $P$ are constants. But $p_{i1}z_1'(t) + p_{i2}z_2'(t) + \cdots + p_{in}z_n'(t)$ is just the $i$-th row of the $n \times 1$ matrix $P\mathbf{z}'$, so these matrices are equal.

Feedback to activity 2.8 For the initial conditions $y_1(0) = 1$, $y_2(0) = 1$, $y_3(0) = 0$ the constants $\alpha, \beta, \gamma$ are
$$\begin{pmatrix} \alpha \\ \beta \\ \gamma \end{pmatrix} = \begin{pmatrix} -1 & -3 & 2 \\ -1 & -1 & 1 \\ 1 & 2 & -1 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} -4 \\ -2 \\ 3 \end{pmatrix},$$
so the solution is
$$y_1(t) = -4e^{-2t} + 2e^{t} + 3e^{3t}, \qquad y_2(t) = -2e^{t} + 3e^{3t}, \qquad y_3(t) = -4e^{-2t} - 2e^{t} + 6e^{3t}.$$


Chapter 3 Inner products and orthogonality

Reading

Essential reading

- Anthony, M. and M. Harvey. Linear Algebra: Concepts and Methods. Chapter 10.

Further reading

- Anton, H. and C. Rorres. Elementary Linear Algebra. Sections 6.1, 6.2 and 6.3.

Aims of the chapter

In this chapter we look at inner product spaces. We develop the concept of orthogonality of vectors, using our geometric intuition from $\mathbb{R}^2$ to abstract these concepts to a general vector space.

3.1 The inner product of two real vectors

For $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, the (standard) inner product (sometimes called the dot product or scalar product) is defined to be the number $\langle \mathbf{x}, \mathbf{y} \rangle$ given by
$$\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x}^T\mathbf{y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$
Here, $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)^T$. This is often referred to as the standard or Euclidean inner product.

Example 3.1 If $\mathbf{x} = (1, 2, 3)^T$ and $\mathbf{y} = (2, -1, 1)^T$ then
$$\langle \mathbf{x}, \mathbf{y} \rangle = 1(2) + 2(-1) + 3(1) = 3.$$
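In NumPy the standard inner product is the dot product; purely as an optional illustration (not part of the guide), Example 3.1 can be reproduced as:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, -1.0, 1.0])

# <x, y> = x^T y; in NumPy this is the dot product.
print(np.dot(x, y))   # 3.0
print(x @ y)          # same thing
```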

It is important to realise that the inner product is just a number, not another vector or a matrix. The inner product on $\mathbb{R}^n$ satisfies certain basic properties, as shown in the next theorem. In $\mathbb{R}^2$ and in $\mathbb{R}^3$ the inner product is closely linked with the geometric concepts of length and angle. This provides the background for generalising these concepts to any vector space $V$, as we shall see in the next section.


Theorem 3.1 The inner product
\[ \langle x, y \rangle = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n, \quad x, y \in \mathbb{R}^n, \]
satisfies the following properties for all $x, y, z \in \mathbb{R}^n$ and for all $\alpha \in \mathbb{R}$:

(i) $\langle x, y \rangle = \langle y, x \rangle$

(ii) $\alpha \langle x, y \rangle = \langle \alpha x, y \rangle = \langle x, \alpha y \rangle$

(iii) $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$

(iv) $\langle x, x \rangle \geq 0$, and $\langle x, x \rangle = 0$ if and only if $x = 0$.

Proof
We have
\[ \langle x, y \rangle = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n = y_1 x_1 + y_2 x_2 + \cdots + y_n x_n = \langle y, x \rangle, \]
which proves (i). We leave the proofs of (ii) and (iii) as an exercise. For (iv), note that $\langle x, x \rangle = x_1^2 + x_2^2 + \cdots + x_n^2$ is a sum of squares, so $\langle x, x \rangle \geq 0$, and $\langle x, x \rangle = 0$ if and only if each term $x_i^2$ is equal to zero, that is, if and only if $x$ is the zero vector, $x = 0$.

Activity 3.1 Prove properties (ii) and (iii). Show, also, that these two properties are equivalent to the single property
\[ \langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle. \]

3.1.1 Geometric interpretation in $\mathbb{R}^2$ and $\mathbb{R}^3$

We begin by looking at vectors in $\mathbb{R}^2$. A vector
\[ a = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \]
can be represented as a directed line segment in the plane, starting at the origin and going to the point $(a_1, a_2)$. As such, it is considered to be the position vector of the point $(a_1, a_2)$.

[Figure: the position vector $a$ drawn as an arrow from the origin $(0, 0)$ to the point $(a_1, a_2)$, with the horizontal side $a_1$ and vertical side $a_2$ of the corresponding right-angled triangle marked against the $x$ and $y$ axes.]


Its length, denoted by $\|a\|$, can be calculated using Pythagoras' theorem applied to the right-angled triangle shown above:
\[ \|a\| = \sqrt{a_1^2 + a_2^2}, \quad \text{so that} \quad \|a\|^2 = \langle a, a \rangle. \]
If $a, b$ are two vectors in $\mathbb{R}^2$, let $\theta$ denote the angle between them, $0 \leq \theta \leq \pi$. The vectors $a$, $b$ and $c = b - a$ form a triangle, where $c$ is the side opposite the angle $\theta$. The law of cosines applied to this triangle gives us the important relationship stated in the following theorem.

Theorem 3.2 Let $a, b \in \mathbb{R}^2$ and let $\theta$ denote the angle between them. Then
\[ \langle a, b \rangle = \|a\| \, \|b\| \cos\theta. \]
For a proof, see Anthony and Harvey.

This theorem has many geometrical consequences. Since $-1 \leq \cos\theta \leq 1$ for any real number $\theta$, the maximum value of the inner product is $\langle a, b \rangle = \|a\| \, \|b\|$. This occurs precisely when $\cos\theta = 1$ ($\theta = 0$), that is, when the vectors $a$ and $b$ are parallel and point in the same direction. If they point in opposite directions, then $\theta = \pi$ and we have $\langle a, b \rangle = -\|a\| \, \|b\|$. The inner product is positive if and only if the angle between the vectors is acute, $0 \leq \theta < \frac{\pi}{2}$. It is negative if the angle is obtuse, $\frac{\pi}{2} < \theta \leq \pi$.

The non-zero vectors $a$ and $b$ are orthogonal (or perpendicular) when the angle between them is $\theta = \frac{\pi}{2}$. Since $\cos(\frac{\pi}{2}) = 0$, this is precisely when their inner product is zero. We restate this important fact:

The vectors $a$ and $b$ are orthogonal if and only if $\langle a, b \rangle = 0$.

Everything we have said so far about the inner product and its geometric interpretation in $\mathbb{R}^2$ extends to $\mathbb{R}^3$. If $a = (a_1, a_2, a_3)^T$, then $\|a\| = \sqrt{a_1^2 + a_2^2 + a_3^2}$.

Activity 3.2 Show this. Sketch a position vector $a = (a_1, a_2, a_3)^T$ in $\mathbb{R}^3$. Drop a perpendicular to the $xy$-plane and apply Pythagoras' theorem twice to obtain the result.

The vectors $a$, $b$ and $c = b - a$ in $\mathbb{R}^3$ lie in a plane, and the law of cosines can still be applied to establish the result that $\langle a, b \rangle = \|a\| \, \|b\| \cos\theta$.
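Theorem 3.2 also gives a practical way to compute angles. Here is a small illustrative Python sketch of our own (the vectors are arbitrary examples) that recovers the angle between two vectors in $\mathbb{R}^2$:

```python
import numpy as np

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

# cos(theta) = <a, b> / (||a|| ||b||); here theta should be pi/4.
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(np.arccos(cos_theta), np.pi / 4)   # both 0.78539816...
```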


Planes and hyperplanes

In $\mathbb{R}^3$, suppose a vector $a \neq 0$ and a vector $p$ are given. If $a = (a_1, a_2, a_3)^T$, the equation $\langle a, x - p \rangle = 0$ can be written as $\langle a, x \rangle = \langle a, p \rangle$, which is
\[ a_1 x_1 + a_2 x_2 + a_3 x_3 = k, \]
where $k = \langle a, p \rangle$ is a constant. This is the equation of a plane in $\mathbb{R}^3$. This plane consists of all vectors $x$ for which $x - p$ is orthogonal to the given vector $a$. The vector $a$ is then called a normal to the plane. If $k = 0$ the plane contains the vector $0$. It has equation $\langle a, x \rangle = 0$ and defines a (two-dimensional) subspace of $\mathbb{R}^3$.


As a set of points which we can graph in $\mathbb{R}^3$, taking the endpoints $X = (x_1, x_2, x_3)$ of all the vectors $x$, we can think of $\langle a, x - p \rangle = 0$ as the plane passing through the point $P = (p_1, p_2, p_3)$ and normal to the vector $a$, meaning that the line segments from $X$ to $P$ are all perpendicular to $a$.

In general, if $a$ and $p$ are given vectors in $\mathbb{R}^n$, we define the set of all vectors $x \in \mathbb{R}^n$ which satisfy the equation $\langle a, x - p \rangle = 0$ to be the hyperplane which contains the point $P$ and for which the normal vector is $a$. It has the equation
\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = k, \quad \text{where } k = \langle a, p \rangle \text{ is a constant.} \]

In two-dimensional space, $\mathbb{R}^2$, a hyperplane is just a line, with equation of the form $ax + by = k$. In three-dimensional space, $\mathbb{R}^3$, we speak of a plane, rather than a hyperplane. In $\mathbb{R}^n$, the hyperplane given by $\langle a, x \rangle = 0$ is an $(n-1)$-dimensional subspace of $\mathbb{R}^n$.

Activity 3.3 Show that, in $\mathbb{R}^n$, the hyperplane given by $\langle a, x \rangle = 0$ is an $(n-1)$-dimensional subspace of $\mathbb{R}^n$.

3.2 Inner products more generally

There is a more general concept of inner product than the one just presented, and this is very important. (It is 'more general' in two ways: first, this definition allows us to say what we mean by an inner product on any vector space, not just $\mathbb{R}^n$; and, secondly, it allows the possibility of inner products on $\mathbb{R}^n$ that are different from the standard one.)

Definition 3.1 (Inner product) Let $V$ be a vector space (over the real numbers). An inner product on $V$ is a mapping from (or operation on) pairs of vectors $x, y$ to the real numbers, the result of which is a real number denoted $\langle x, y \rangle$, which satisfies the following properties:

(i) $\langle x, y \rangle = \langle y, x \rangle$ for all $x, y \in V$

(ii) $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$ for all $x, y, z \in V$ and all $\alpha, \beta \in \mathbb{R}$

(iii) $\langle x, x \rangle \geq 0$ for all $x \in V$, and $\langle x, x \rangle = 0$ if and only if $x = 0$, the zero vector of the vector space $V$.

Some other basic facts follow immediately from this definition: for example,
\[ \langle z, \alpha x + \beta y \rangle = \alpha \langle z, x \rangle + \beta \langle z, y \rangle. \]


Activity 3.4 Prove that $\langle z, \alpha x + \beta y \rangle = \alpha \langle z, x \rangle + \beta \langle z, y \rangle$.

It is a simple matter to check that the standard inner product on $\mathbb{R}^n$ defined in the previous section is indeed an inner product according to this more general definition. The abstract definition, though, applies to more than just the vector space $\mathbb{R}^n$, and there is some advantage in developing results in terms of the general notion of inner product. If a vector space has an inner product defined on it, we refer to it as an inner product space.

Example 3.2 (This is a deliberately strange example. It is not one you would necessarily come up with, but its purpose is to illustrate how we can define inner products in non-standard ways, which is why we have chosen it.) Suppose that $V$ is the vector space consisting of all real polynomial functions of degree at most $n$; that is, $V$ consists of all functions $p : x \mapsto p(x)$ of the form
\[ p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n, \quad a_0, a_1, \ldots, a_n \in \mathbb{R}. \]
The addition and scalar multiplication are defined pointwise. (See Section 7.1.1 of MT1173 Algebra.) Let $x_1, x_2, \ldots, x_{n+1}$ be $n+1$ fixed, different, real numbers, and define, for $p, q \in V$,
\[ \langle p, q \rangle = \sum_{i=1}^{n+1} p(x_i) q(x_i). \]

Then this is an inner product. To see this, we check the properties in the definition of an inner product. Property (i) is clear. For (iii), we have
\[ \langle p, p \rangle = \sum_{i=1}^{n+1} p(x_i)^2 \geq 0. \]
Clearly, if $p$ is the zero vector of the vector space (which is the identically-zero function), then $\langle p, p \rangle = 0$. To finish verifying (iii) we need to check that if $\langle p, p \rangle = 0$ then $p$ must be the zero function. Now, $\langle p, p \rangle = 0$ must mean that $p(x_i) = 0$ for $i = 1, 2, \ldots, n+1$. So $p(x)$ has $n+1$ different roots. But $p(x)$ has degree no more than $n$, so $p$ must be the identically-zero function. (A non-zero polynomial of degree at most $n$ has no more than $n$ distinct roots.) Part (ii) is left to you.

Activity 3.5 Prove that, for any $\alpha, \beta \in \mathbb{R}$ and any $p, q, r \in V$,
\[ \langle \alpha p + \beta q, r \rangle = \alpha \langle p, r \rangle + \beta \langle q, r \rangle. \]
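This non-standard inner product is also easy to compute. Below is a sketch of our own in Python; the evaluation points and polynomials are arbitrary choices (here $n = 2$, so we need three distinct points), and the function name inner is ours:

```python
import numpy as np

# Fixed, distinct evaluation points x_1, ..., x_{n+1} (here n = 2).
points = np.array([-1.0, 0.0, 1.0])

def inner(p_coeffs, q_coeffs):
    """<p, q> = sum_i p(x_i) q(x_i); coefficients in increasing degree."""
    p_vals = np.polyval(p_coeffs[::-1], points)   # polyval wants highest degree first
    q_vals = np.polyval(q_coeffs[::-1], points)
    return np.sum(p_vals * q_vals)

# p(x) = 1 + x, q(x) = x^2:
# <p, q> = p(-1)q(-1) + p(0)q(0) + p(1)q(1) = 0*1 + 1*0 + 2*1 = 2.
print(inner([1, 1, 0], [0, 0, 1]))   # 2.0
```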

3.2.1 Norms in a vector space

For any $x$ in an inner product space $V$, the inner product $\langle x, x \rangle$ is non-negative (by definition). Because $\langle x, x \rangle \geq 0$, we may take its square root, and this will be a real number. We define the norm, $\|x\|$, of a vector $x$ to be this real number.


Definition 3.2 (Norm) Suppose that $V$ is an inner product space and $x$ is a vector in $V$. Then the norm of $x$, denoted $\|x\|$, is
\[ \|x\| = \sqrt{\langle x, x \rangle}. \]
For example, with the standard inner product on $\mathbb{R}^n$,
\[ \langle x, x \rangle = x_1^2 + x_2^2 + \cdots + x_n^2 \]
(which is clearly non-negative, since it is a sum of squares), and the norm is the standard Euclidean length of a vector:
\[ \|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}. \]
We say that a vector $v$ is a unit vector if it has norm 1. If $v \neq 0$, then it is a simple matter to create a unit vector in the same direction as $v$: this is the vector
\[ u = \frac{v}{\|v\|}. \]
The process of constructing $u$ from $v$ is known as normalising $v$.
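For instance, normalising in Python (an illustrative snippet of our own, using an arbitrary vector):

```python
import numpy as np

v = np.array([3.0, 4.0])
u = v / np.linalg.norm(v)   # normalise: divide by ||v|| = 5

print(u)                    # [0.6 0.8]
print(np.linalg.norm(u))    # 1.0, so u is a unit vector
```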

3.2.2 The Cauchy-Schwarz inequality

This important inequality will enable us to apply the geometric intuition we have developed about angles to a completely abstract setting.

Theorem 3.3 (Cauchy-Schwarz inequality) Suppose that $V$ is an inner product space. Then
\[ |\langle x, y \rangle| \leq \|x\| \, \|y\| \quad \text{for all } x, y \in V. \]
For a proof, see Anthony and Harvey. (Note that $|\langle x, y \rangle|$ denotes the absolute value of the inner product.)

For example, if we take $V$ to be $\mathbb{R}^n$ and consider the standard inner product on $\mathbb{R}^n$, then for all $x, y \in \mathbb{R}^n$, the Cauchy-Schwarz inequality tells us that
\[ \left| \sum_{i=1}^{n} x_i y_i \right| \leq \sqrt{\sum_{i=1}^{n} x_i^2} \; \sqrt{\sum_{i=1}^{n} y_i^2}. \]
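A quick empirical check of the inequality in $\mathbb{R}^n$ (an illustration of ours, using randomly generated vectors; it demonstrates the inequality for one instance, it does not prove it):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

lhs = abs(np.dot(x, y))
rhs = np.linalg.norm(x) * np.linalg.norm(y)
print(lhs <= rhs)   # True: |<x, y>| never exceeds ||x|| ||y||
```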

3.2.3 Generalised geometry

We now begin to imitate the geometry of the previous section. We have already used Pythagoras' theorem in $\mathbb{R}^3$, which states that if $c$ is the length of the longest side of a right-angled triangle, and $a$ and $b$ are the lengths of the other two sides, then $c^2 = a^2 + b^2$. In $\mathbb{R}^2$, we can think of this triangle as having sides given by orthogonal vectors $a$ and $b$, with the hypotenuse given by the vector $c = a + b$. The generalised Pythagoras' theorem is:

Theorem 3.4 (Generalised Pythagoras' theorem) In an inner product space $V$, if $x, y \in V$ are orthogonal, then
\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2. \]


For a proof, see Anthony and Harvey.

We also have the triangle inequality for norms. This states the fact, obvious in $\mathbb{R}^2$, that the length of one side of a triangle must be less than the sum of the lengths of the other two sides.

Theorem 3.5 (Triangle inequality for norms) In an inner product space $V$, if $x, y \in V$, then
\[ \|x + y\| \leq \|x\| + \|y\|. \]
For a proof, see Anthony and Harvey.

3.2.4 Orthogonal vectors

We are now ready to extend the concept of angle to an abstract inner product space $V$. To do this we begin with the result in $\mathbb{R}^3$ that $\langle x, y \rangle = \|x\| \, \|y\| \cos\theta$ and use it to define the cosine of the angle between the vectors $x$ and $y$. That is, by definition, we set
\[ \cos\theta = \frac{\langle x, y \rangle}{\|x\| \, \|y\|}. \]
This definition will only make sense if we can show that this number $\cos\theta$ is between $-1$ and $1$. But this follows immediately from the Cauchy-Schwarz inequality, which can be stated as
\[ \left| \frac{\langle x, y \rangle}{\|x\| \, \|y\|} \right| \leq 1. \]
The usefulness of this definition is in the concept of orthogonality.

Definition 3.3 (Orthogonal vectors) Suppose that $V$ is an inner product space. Then $x, y \in V$ are said to be orthogonal if and only if $\langle x, y \rangle = 0$. We write $x \perp y$ to mean that $x, y$ are orthogonal.

Example 3.3 With the usual inner product on $\mathbb{R}^4$, the vectors $x = (1, -1, 2, 0)^T$ and $y = (-1, 1, 1, 4)^T$ are orthogonal.

Activity 3.6 Check this!
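Checking Example 3.3 takes one line in Python (illustrative only; the check is that the inner product is zero):

```python
import numpy as np

x = np.array([1, -1, 2, 0])
y = np.array([-1, 1, 1, 4])

# x and y are orthogonal exactly when their inner product is zero.
print(np.dot(x, y))   # 0
```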

3.2.5 Orthogonality and linear independence

If a set of (non-zero) vectors is pairwise orthogonal (that is, any two of them are orthogonal), then it turns out that the vectors are linearly independent:

Theorem 3.6 Suppose that $V$ is an inner product space and that the vectors $v_1, v_2, \ldots, v_k \in V$ are pairwise orthogonal ($v_i \perp v_j$ for $i \neq j$), and none is the zero vector. Then $\{v_1, v_2, \ldots, v_k\}$ is a linearly independent set of vectors.

For a proof, see Anthony and Harvey.


3.3 Orthogonal matrices and orthonormal sets

Definition 3.4 (Orthogonal matrix) An $n \times n$ matrix $P$ is said to be orthogonal if $P^T P = P P^T = I$: that is, if $P$ has inverse $P^T$.

At first it appears that this definition has little to do with the geometric concept of orthogonality. But, as we shall see, it is closely related.

If $P$ is an orthogonal matrix, then $P^T P = I$, the identity matrix. Suppose that the columns of $P$ are $x_1, x_2, \ldots, x_n$. Then the fact that $P^T P = I$ means that $x_i^T x_j = 0$ if $i \neq j$ and $x_i^T x_i = 1$. To see this, consider the case $n = 3$. Then $P = (x_1 \; x_2 \; x_3)$, and since $I = P^T P$ we have
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} x_1^T \\ x_2^T \\ x_3^T \end{pmatrix} (x_1 \; x_2 \; x_3) = \begin{pmatrix} x_1^T x_1 & x_1^T x_2 & x_1^T x_3 \\ x_2^T x_1 & x_2^T x_2 & x_2^T x_3 \\ x_3^T x_1 & x_3^T x_2 & x_3^T x_3 \end{pmatrix}. \]
But, if $i \neq j$, $x_i^T x_j = 0$ means precisely that the columns $x_i, x_j$ are orthogonal. The second statement is that $\|x_i\|^2 = 1$, which means (since $\|x_i\| \geq 0$) that $\|x_i\| = 1$; that is, $x_i$ has norm 1. The converse is also true: if $\langle x_i, x_j \rangle = 0$ for $i \neq j$ and $\langle x_i, x_i \rangle = 1$, then it follows that $P^T P = I$. This indicates the following characterisation:

A matrix $P$ is orthogonal if and only if, as vectors, its columns are pairwise orthogonal, and each has length 1.

Definition 3.5 (Orthonormal) A set of vectors $\{x_1, x_2, \ldots, x_k\}$ in an inner product space $V$ such that any two different vectors are orthogonal and each vector has length 1, that is,
\[ \langle x_i, x_j \rangle = 0 \text{ for } i \neq j \quad \text{and} \quad \|x_i\| = 1, \]

is called an orthonormal set (ONS) of vectors.

An important consequence of Theorem 3.6 is that an orthonormal set of $n$ vectors in an $n$-dimensional vector space is a basis. A basis consisting of an orthonormal set of vectors is called an orthonormal basis. If $\{v_1, v_2, \ldots, v_n\}$ is an orthonormal basis of a vector space $V$, then the coordinates of any vector $w \in V$ are easy to calculate, as shown in the following theorem.

Theorem 3.7 Let $B = \{v_1, v_2, \ldots, v_n\}$ be an orthonormal basis of a vector space $V$ and let $w \in V$. Then the coordinates $a_1, a_2, \ldots, a_n$ of $w$ in the basis $B$ are given by $a_i = \langle w, v_i \rangle$.

For a proof, see Anthony and Harvey.

If $P$ is an orthogonal matrix, then its columns are an orthonormal set of $n$ vectors in $\mathbb{R}^n$. These are linearly independent by Theorem 3.6, and hence form an orthonormal basis of $\mathbb{R}^n$. So we can restate our previous observation as follows.

Theorem 3.8 An $n \times n$ matrix $P$ is orthogonal if and only if the columns of $P$ form an orthonormal basis of $\mathbb{R}^n$.

If the matrix $P$ is orthogonal, then since $P = (P^T)^T$, the matrix $P^T$ is orthogonal too.


Activity 3.7 Show that if $P$ is orthogonal, so too is $P^T$.

It therefore follows that the above theorem is also true if 'column' is replaced by 'row':

A matrix $P$ is orthogonal if and only if the columns (or rows) of $P$ form an orthonormal basis of $\mathbb{R}^n$.
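As a small illustrative check of this characterisation (our own example; a rotation matrix is a standard instance of an orthogonal matrix):

```python
import numpy as np

theta = np.pi / 3
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a 2x2 rotation matrix

# P is orthogonal: P^T P = I, and each column is a unit vector.
print(np.allclose(P.T @ P, np.eye(2)))                     # True
print(np.linalg.norm(P[:, 0]), np.linalg.norm(P[:, 1]))    # 1.0 1.0
```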

3.4 Gram-Schmidt orthonormalisation process

Given a set of linearly independent vectors $\{v_1, v_2, \ldots, v_k\}$, the Gram-Schmidt orthonormalisation process is a way of producing $k$ vectors that span the same space as $\{v_1, v_2, \ldots, v_k\}$ and that form an orthonormal set. That is, the process produces a set $\{u_1, u_2, \ldots, u_k\}$ such that:

- $\mathrm{Lin}\{u_1, u_2, \ldots, u_k\} = \mathrm{Lin}\{v_1, v_2, \ldots, v_k\}$
- $\{u_1, u_2, \ldots, u_k\}$ is an orthonormal set.

It works as follows. First, we set
\[ u_1 = \frac{v_1}{\|v_1\|}, \]
so that $u_1$ is a unit vector and $\mathrm{Lin}\{u_1\} = \mathrm{Lin}\{v_1\}$. Then we define
\[ w_2 = v_2 - \langle v_2, u_1 \rangle u_1, \quad \text{and set} \quad u_2 = \frac{w_2}{\|w_2\|}. \]
Then $\{u_1, u_2\}$ is an orthonormal set and $\mathrm{Lin}\{u_1, u_2\} = \mathrm{Lin}\{v_1, v_2\}$.

Activity 3.8 Try to understand why this works. Show that $w_2 \perp u_1$ and conclude that $u_2 \perp u_1$. Why are the linear spans of $\{u_1, u_2\}$ and $\{v_1, v_2\}$ the same?

Next, we define
\[ w_3 = v_3 - \langle v_3, u_1 \rangle u_1 - \langle v_3, u_2 \rangle u_2 \quad \text{and set} \quad u_3 = \frac{w_3}{\|w_3\|}. \]
Then $\{u_1, u_2, u_3\}$ is an orthonormal set and $\mathrm{Lin}\{u_1, u_2, u_3\}$ is the same as $\mathrm{Lin}\{v_1, v_2, v_3\}$.

Generally, when we have $u_1, u_2, \ldots, u_i$, we let
\[ w_{i+1} = v_{i+1} - \sum_{j=1}^{i} \langle v_{i+1}, u_j \rangle u_j, \qquad u_{i+1} = \frac{w_{i+1}}{\|w_{i+1}\|}. \]
Then the resulting set $\{u_1, u_2, \ldots, u_k\}$ has the required properties.
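The process translates directly into code. Below is a minimal Python sketch of our own (the function name gram_schmidt and the sample vectors are arbitrary choices), following exactly the steps above:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise a list of linearly independent vectors.

    Each step subtracts from v its components along the u_j found
    so far, then normalises the remainder w.
    """
    us = []
    for v in vectors:
        w = v - sum(np.dot(v, u) * u for u in us)
        us.append(w / np.linalg.norm(w))
    return us

# Example: the two vectors (1, 1) and (1, 0) in R^2.
u1, u2 = gram_schmidt([np.array([1.0, 1.0]), np.array([1.0, 0.0])])
print(u1, u2, np.dot(u1, u2))   # an orthonormal pair; inner product ~ 0
```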


Example 3.4 In $\mathbb{R}^4$, let us find an orthonormal basis for the linear span of the three vectors
\[ v_1 = (1, 1, 1, 1)^T, \quad v_2 = (-1, 4, 4, -1)^T, \quad v_3 = (4, -2, 2, 0)^T. \]
First, we have
\[ u_1 = \frac{v_1}{\|v_1\|} = \frac{v_1}{\sqrt{1^2 + 1^2 + 1^2 + 1^2}} = \frac{1}{2} v_1 = (1/2, 1/2, 1/2, 1/2)^T. \]
Next, we have
\[ w_2 = v_2 - \langle v_2, u_1 \rangle u_1 = \begin{pmatrix} -1 \\ 4 \\ 4 \\ -1 \end{pmatrix} - 3 \begin{pmatrix} 1/2 \\ 1/2 \\ 1/2 \\ 1/2 \end{pmatrix} = \begin{pmatrix} -5/2 \\ 5/2 \\ 5/2 \\ -5/2 \end{pmatrix}, \]
and we set
\[ u_2 = \frac{w_2}{\|w_2\|} = \begin{pmatrix} -1/2 \\ 1/2 \\ 1/2 \\ -1/2 \end{pmatrix}. \]
(Note: to do this last step, we merely noted that a normalised vector in the same direction as $w_2$ is also a normalised vector in the same direction as $(-1, 1, 1, -1)^T$, and this second vector is easier to work with.)

Continuing, we have
\[ w_3 = v_3 - \langle v_3, u_1 \rangle u_1 - \langle v_3, u_2 \rangle u_2 = \begin{pmatrix} 4 \\ -2 \\ 2 \\ 0 \end{pmatrix} - 2 \begin{pmatrix} 1/2 \\ 1/2 \\ 1/2 \\ 1/2 \end{pmatrix} - (-2) \begin{pmatrix} -1/2 \\ 1/2 \\ 1/2 \\ -1/2 \end{pmatrix} = \begin{pmatrix} 2 \\ -2 \\ 2 \\ -2 \end{pmatrix}. \]
Then
\[ u_3 = \frac{w_3}{\|w_3\|} = (1/2, -1/2, 1/2, -1/2)^T. \]
So
\[ \{u_1, u_2, u_3\} = \left\{ \begin{pmatrix} 1/2 \\ 1/2 \\ 1/2 \\ 1/2 \end{pmatrix}, \begin{pmatrix} -1/2 \\ 1/2 \\ 1/2 \\ -1/2 \end{pmatrix}, \begin{pmatrix} 1/2 \\ -1/2 \\ 1/2 \\ -1/2 \end{pmatrix} \right\}. \]

Activity 3.9 Verify that the set $\{u_1, u_2, u_3\}$ of this example is an orthonormal set.
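For Activity 3.9, a two-line numerical verification (illustrative; the check is that $U^T U$ is the identity when the columns of $U$ are the $u_i$):

```python
import numpy as np

U = np.array([[ 0.5, -0.5,  0.5],
              [ 0.5,  0.5, -0.5],
              [ 0.5,  0.5,  0.5],
              [ 0.5, -0.5, -0.5]])   # columns are u1, u2, u3

# For an orthonormal set, U^T U is the 3x3 identity matrix.
print(np.allclose(U.T @ U, np.eye(3)))   # True
```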

Learning outcomes

At the end of this chapter and the relevant reading, you should be able to:


- explain what is meant by an inner product on a vector space
- verify that a given inner product is indeed an inner product
- compute norms in inner product spaces
- state and apply the Cauchy-Schwarz inequality, the Generalised Pythagoras' theorem, and the triangle inequality for norms
- prove that orthogonality of a set of non-zero vectors implies linear independence
- state what is meant by an orthogonal matrix
- explain what is meant by an orthonormal set of vectors
- explain why an $n \times n$ matrix is orthogonal if and only if its columns form an orthonormal basis of $\mathbb{R}^n$
- use the Gram-Schmidt orthonormalisation process.

Test your knowledge and understanding

You should now attempt the Exercises in Chapter 10 of Anthony and Harvey. Solutions to those exercises that do not have solutions in Anthony and Harvey may be found on the VLE.

Feedback on selected activities

Feedback to activity 3.1
To prove properties (ii) and (iii), apply the definition to the LHS (left-hand side) of the equation and rearrange the terms to obtain the RHS. For example, for $x, y \in \mathbb{R}^n$, using the properties of real numbers,
\[ \alpha \langle x, y \rangle = \alpha(x_1 y_1 + x_2 y_2 + \cdots + x_n y_n) = \alpha x_1 y_1 + \alpha x_2 y_2 + \cdots + \alpha x_n y_n = (\alpha x_1) y_1 + (\alpha x_2) y_2 + \cdots + (\alpha x_n) y_n = \langle \alpha x, y \rangle. \]
The single property $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$ implies property (ii) by letting $\beta = 0$ and then letting $\alpha = 0$, and property (iii) by letting $\alpha = \beta = 1$. On the other hand, if properties (ii) and (iii) hold, then
\[ \begin{aligned} \langle \alpha x + \beta y, z \rangle &= \langle \alpha x, z \rangle + \langle \beta y, z \rangle && \text{by property (iii)} \\ &= \alpha \langle x, z \rangle + \beta \langle y, z \rangle && \text{by property (ii).} \end{aligned} \]


Feedback to activity 3.3
Let $a \in \mathbb{R}^n$ be a given non-zero vector and let $V = \{x \in \mathbb{R}^n : \langle a, x \rangle = 0\}$. The components of $x \in V$ satisfy the equation $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0$. For some $i$, $a_i \neq 0$ (since $a \neq 0$), so the equation can be solved for $x_i$. We have
\[ x_i = \beta_1 x_1 + \cdots + \beta_{i-1} x_{i-1} + \beta_{i+1} x_{i+1} + \cdots + \beta_n x_n, \]
where $\beta_j = -(a_j / a_i)$. Then
\[ x = \begin{pmatrix} x_1 \\ \vdots \\ x_i \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ \beta_1 x_1 + \cdots + \beta_{i-1} x_{i-1} + \beta_{i+1} x_{i+1} + \cdots + \beta_n x_n \\ \vdots \\ x_n \end{pmatrix}, \]
so
\[ x = x_1 v_1 + \cdots + x_{i-1} v_{i-1} + x_{i+1} v_{i+1} + \cdots + x_n v_n, \]
where $v_j$ is the vector with 1 in the $j$-th place, $\beta_j$ in the $i$-th place, and zeros elsewhere. These $n-1$ vectors are linearly independent (why?) and span $V$; hence they are a basis of $V$, and the dimension of $V$ is $n-1$.

Feedback to activity 3.4
By the properties of an inner product, we have
\[ \langle z, \alpha x + \beta y \rangle = \langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle = \alpha \langle z, x \rangle + \beta \langle z, y \rangle. \]

Feedback to activity 3.5
Since $\alpha p + \beta q$ is the polynomial function $x \mapsto \alpha p(x) + \beta q(x)$, we have

\[ \begin{aligned} \langle \alpha p + \beta q, r \rangle &= \sum_{i=1}^{n+1} (\alpha p(x_i) + \beta q(x_i)) r(x_i) \\ &= \alpha \sum_{i=1}^{n+1} p(x_i) r(x_i) + \beta \sum_{i=1}^{n+1} q(x_i) r(x_i) \\ &= \alpha \langle p, r \rangle + \beta \langle q, r \rangle, \end{aligned} \]
as required.

Feedback to activity 3.6
Just check that $\langle x, y \rangle = 0$.

Feedback to activity 3.7
The matrix $P$ is orthogonal if and only if $P P^T = P^T P = I$. Since $(P^T)^T = P$, this statement can be written as $(P^T)^T P^T = P^T (P^T)^T = I$, which says that $P^T$ is orthogonal.

Feedback to activity 3.8
We have
\[ \langle w_2, u_1 \rangle = \langle v_2 - \langle v_2, u_1 \rangle u_1, \, u_1 \rangle = \langle v_2, u_1 \rangle - \langle v_2, u_1 \rangle \langle u_1, u_1 \rangle = 0, \]
as $\langle u_1, u_1 \rangle = 1$. The fact that $w_2 \perp u_1$ if and only if $u_2 \perp u_1$ follows from property (ii) of the definition of inner product, since $w_2 = \alpha u_2$ for some constant $\alpha$. The linear spans are the same because $u_1, u_2$ are linear combinations of $v_1, v_2$.


Feedback to activity 3.9
We only need to check that each $u_i$ satisfies $\|u_i\| = 1$, and that $\langle u_1, u_2 \rangle = \langle u_1, u_3 \rangle = \langle u_2, u_3 \rangle = 0$. All of this is very easily checked. (It is much harder to find the $u_i$ in the first place. But once you think you have found them, it is always fairly easy to check whether they form an orthonormal set, as they should.)