Dynamical Systems and Models in Biology Artem S. Novozhilov December 8, 2015
1
These lecture notes were prepared for the course I taught twice at NDSU1 . The material is quite standard, and many of the discussed models can be found in the existing textbooks, however there are a few that are very interesting and important from my point of view and at the same time are not a part of more or less standard courses in Mathematical Biology. While lecturing on the subject I tried to emphasize the importance of mathematical rigor while studying the models. The primary audience are the mathematics majors, who are interested in 1) dynamical systems theory, 2) mathematical modeling, and 3) in particular, models having the origin in biology. The major prerequisite is some familiarity with ordinary differential equations (ODE) as taught in most introductory classes. I collected very basic material on ODE in Appendix.
1
Contact information:
[email protected]
2
Contents 1 Mathematical models of the population growth 1.1 How many people can the Earth support? (Or a quick refresher square method) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 ODE models of population growth . . . . . . . . . . . . . . . . . . 1.3 Homework 1: Mathematical models of population growth . . . . .
6 the least . . . . . . . . . . . . . . . . . .
6 9 14
2 Basic properties of autonomous first order ODE 2.1 Definitions and basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 Mathematical models of harvesting . . . . . . . . . . . . . . . . . . . . . . . . . .
16 16 20
3 Elementary bifurcations 3.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 General discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2.1 Hyperbolic equilibria are insensitive to small perturbations 3.2.2 Universality of the typical bifurcations . . . . . . . . . . . . 3.2.3 Implicit function theorem . . . . . . . . . . . . . . . . . . . 3.3 Homework 2: Stability and Elementary Bifurcations . . . . . . . .
. . . . . .
22 22 25 25 27 28 28
4 Insect outbreak model 4.1 Analysis of the insect outbreak model . . . . . . . . . . . . . . . . . . . . . . . . 4.2 Non-dimensional variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
30 30 34
5 Alfred Lotka, Vito Volterra, and Population Cycles 5.1 Analysis of the Lotka–Volterra model . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 Homework 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
36 36 42
6 General properties of an autonomous system of two first order ODE
44
7 Planar systems of linear ODE 7.1 General theory . . . . . . . . . . . . . . . . . . 7.2 Three main matrices and their phase portraits 7.3 A little bit of linear algebra . . . . . . . . . . . 7.4 Stability of the linear system (7.2) . . . . . . . 7.5 Bifurcations in the linear systems . . . . . . . . 7.6 Homework 4: Analysis of linear planar systems
. . . . . .
49 49 54 58 60 62 64
8 Linear systems in Rd 8.1 General theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.1.1 Examples of the phase portraits in R3 . . . . . . . . . . . . . . . . . . . . 8.1.2 Routh–Hurwitz criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . .
66 66 67 69
3
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
on . . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
9 Power laws 9.1 Introduction to power laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.2 Birth–death–innovation model of protein domain evolution . . . . . . . . . . . . 9.3 Preferential attachment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
70 70 76 78
10 Back to planar nonlinear systems 10.1 Near the equilibria . . . . . . . . . . . . . . . 10.2 Outside of equilibria . . . . . . . . . . . . . . 10.3 Bifurcations of equilibria. Structural stability 10.4 Homework 5: Linearization . . . . . . . . . .
. . . .
80 80 82 83 87
11 Ecological interactions 11.1 General Lotka–Volterra model on a plane and types of ecological interactions . . 11.2 Lotka–Volterra predator–prey model with intraspecific competition . . . . . . . .
87 87 88
12 More on ecological interactions 12.1 Competition model with intraspecific 12.2 Principle of competitive exclusion . 12.3 Cooperative systems . . . . . . . . . 12.4 Midterm exam . . . . . . . . . . . .
92 92 93 96 97
. . . .
. . . .
competition . . . . . . . . . . . . . . . . . . . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
13 A thousand and one population models 14 Periodic phenomena in nature and limit 14.1 Periodic phenomena in nature . . . . . . 14.2 Limit cycles. Definitions. Stability . . . 14.3 Criteria of absence of the limit cycles . . 15 Limit sets. Lyapunov functions 15.1 Limit sets and their properties . . 15.2 Lyapunov functions and limit sets 15.3 Lyapunov functions and stability of 15.4 Homework 6: Limit cycles . . . . .
98 cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . equilibria . . . . . .
16 Gause predator–prey model
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
101 101 103 105 106 107 108 109 111 111
17 Poincar´ e–Andronov–Hopf bifurcation 114 17.1 Homework 7: Limit cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 18 Biological models with discrete time 121 18.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 18.2 Cobweb diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 19 Stability of fixed points 126 19.1 Mathematical analysis of fixed points . . . . . . . . . . . . . . . . . . . . . . . . . 126 19.2 Homework 8: Discrete dynamical systems . . . . . . . . . . . . . . . . . . . . . . 130
4
20 Periodic solutions. A first encounter with chaos
132
21 On the definition of chaos. Lyapunov exponents 135 21.1 Itineraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 21.2 Lyapunov numbers and exponents . . . . . . . . . . . . . . . . . . . . . . . . . . 137 22 Two dimensional discrete dynamical systems 139 22.1 Linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 22.2 Nonlinear systems and linearization . . . . . . . . . . . . . . . . . . . . . . . . . . 143 22.3 Final exam . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144 A Review of ODE A.1 The first encounter with the Malthus equation A.2 Notation . . . . . . . . . . . . . . . . . . . . . . A.3 Solving ODE . . . . . . . . . . . . . . . . . . . A.4 Well-posed problems. Theorem of existence and A.5 Geometric interpretation of the first order ODE
5
. . . . . . . . . . . . . . . . . . . . . uniqueness . . . . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
145 145 146 148 149 151
1 1.1
Mathematical models of the population growth How many people can the Earth support? (Or a quick refresher on the least square method)
I would like to start with a very simple and yet interesting example of biological data that cry out for mathematical analysis. Consider the following numbers: Year Population
1900 1625
1920 1813
1930 1987
1940 2213
1950 1955 2516 2752
1960 3020
1965 3336
1970 3698
1975 1980 4079 4448
1985 4851
1990 5292
1995 5700
These numbers provide (quite accurate) estimates of the total Earth population during the 20th century (and I purposely did not include any data on the years after 2000). I can also represent them graphically, as follows. æ
6000 æ æ Population
5000
æ æ æ
4000 æ æ 3000
æ æ æ
2000 æ 1900
æ æ
æ 1920
1940
1960
1980
2000
Year
Figure 1.1: The total world population during the 20th century, millions versus years Since not every year given a specific value it is very natural to ask the following question: Given the data, determine some function f , such that f (t) gives the world population at the year t. I can fix the qualitative form of function f , e.g., straight line, parabola, exponent and so on, which depends on some free parameters, and determine, given the data, the “best” possible such function (i.e., determine the actual values of these parameters). This is called approximation, and this is what I will discuss in some detail. Let f (t, a1 , a2 , . . . , am ) be a family of functions of a specific form that depend on the parameters a1 , . . . , am . To proceed I need to determine what it means “the best possible such function,” because different choices are possible. It turns out that one that is working good is to minimize the sum of squares of deviations of f from the observed values xj , j = 0, . . . , n. To make this general discussion precise, consider a specific example, e.g., f (t, a, b) = at + b, which is the equation of the straight line with parameters a, b. Also consider some abstract data set: t0 x0
··· ···
t1 x1
6
tn xn
2000 6100
At each point tj the square of the deviation is (xj − f (tj , a, b))2 , and therefore my ultimate task is to find the values of a and b that minimize the following sum g(a, b) =
n ∑
(xj − f (tj , a, b))2 =
n ∑
j=0
(xj − atj − b)2 .
j=0
To find the minimum of a function, I need to find two partial derivatives n n n n ) (∑ ∑ ∑ ∑ ∂g x j tj − a t2j − b tj , (a, b) = −2 (xj − atj − b)tj = −2 ∂a j=0
∂g (a, b) = −2 ∂b
n ∑
j=0
j=0
j=0
n n (∑ ) ∑ xj − a tj − bn , (xj − atj − b) = −2
j=0
j=0
j=0
and equal them to zero, which yields a linear algebraic system a
n ∑
t2j
+b
j=0
n ∑
tj =
j=0
a
n ∑
tj + bn =
j=0
n ∑ j=0 n ∑
xj tj , xj ,
j=0
which can be readily solved for the unknown a and b. For example for the data on the world population I find, after some calculations, that f (t) = −90402 + 48t, i.e., a = 48 and b = −90402. You can compare the data with the found best linear approximation in the figure below. You may see that agreement is not very good, and probably the straight line should be replaced by something else. It can also be seen by calculating the value of g(a, b) = 3060000. It seems that a parabola would approximate the data better: f (t, a, b, c) = at2 + bt + c. Indeed, repeating exactly the same calculations, I will find a system of three linear equations with three unknowns, which again can be solved by the standard methods. After some calculations (for which it is probably better to use a computer) I find f (t) = 0.54t2 − 2047t + 1954510. You can see in the figure that agreement now is quite good, moreover g(a, b, c) = 47273, 7
æ
6000
æ æ
5000
æ
Population
æ æ
4000 æ æ 3000
æ æ æ
2000 æ
æ
æ
æ
1000 1900
1920
1940
1960
1980
2000
Year
Figure 1.2: The total world population during the 20th century, millions versus years, and the best linear approximation f (t) = 48t − 90402 which is significantly better than for the linear case. Now I can use the found formula to interpolate the data, e.g., find the value f (1910) = 1652. Did I solve my problem? Not really. First, the choice of parabola was quite arbitrary, maybe a cubic parabola would do better. Second, if I put t = 0 (extrapolate, i.e., go beyond the given range of the data), I get f (0) = 1954510 million people at the year 0 A.D., which is clearly a ridiculous estimate. æ
6000 æ æ Population
5000
æ æ æ
4000 æ æ 3000
æ æ æ
2000 æ 1900
æ æ 1920
æ 1940
1960
1980
2000
Year
Figure 1.3: The total world population during the 20th century, millions versus years, and the best quadratic approximation f (t) = 0.54t2 − 2047 + 1954510 At this point I would like to stop discussing the data (and the method of least squares, that is used to find a best possible f among a given family of functions depending on free parameters) and switch to the ordinary differential equations (ODE).
8
1.2
ODE models of population growth
The usual process of mathematical modeling goes in several stages: First, we start with the situation at hands and formulate the main features of the considered system (physical, chemical, biological, etc), which we would like to retain in our mathematical model. At the same stage we disregard many unimportant for us details. After this first stage we formulate a mathematical model, which is built on our simplifying assumptions. If we have the model, we can forget about the original system and perform an analysis of the obtained mathematical problem. Finally, the solutions to this model should be interpreted in terms of the original system. Of course, the whole process is usually much more involved, but the outlined above line of reasoning can be found in many real world modeling situations. I will present a great number of examples in this course. As a very basic example of the modeling approach, let me introduce the so-called Malthus equation. A very simple biological process of a population growth is considered. Let N (t) denote the number of individuals in a given population (for concreteness you can think of a population of bacteria) at the time moment t. In this course the variable t will almost exclusively denote time. Now I calculate how the population number changes during a short time interval h. I have N (t + h) = N (t) + bhN (t) − dhN (t). Here I used the fact that the total population at the moment t + h can be found as the total population at the moment t plus the number of individuals born during time period h minus the number of died individuals during time period h. b and d are per capita birth and death rates respectively (i.e., the numbers of births and deaths per one individual per time unit respectively). From the last equality I find N (t + h) − N (t) = (b − d)N (t). h Next, I postulate the existence of the derivative N (t + h) − N (t) dN = lim , h→0 dt h assume for simplicity that both b and d are constant, and hence obtain an ordinary differential equation dN = (b − d)N, dt which is usually called in the biological context the Malthus equation (I will come back to Malthus). Finally, I rewrite the equation in the form N˙ = mN,
N (0) = N0 ,
(1.1)
where N (t) is the population size at the time moment t, m is the parameter of the model, m = b − d. How did Malthus arrived at this mathematical model? Obviously, the process of the population growth or decline is very intricate, which is subject to many important factors, such as weather, temperature, diseases, religion and so on. Malthus stated his simplifying assumption that the population growth is “geometric,” by this he meant that the population size increases 9
in a geometric progression, which can be described as the relation N (t + h) = wN (t), where w is the parameter of the geometric growth. In the terms of the continuous time, this means exactly the equation (1.1). So, he had a simplifying assumption, and his mathematical model — (1.1). Now I analyze the mathematical model, in this case I can simply solve it: N (t) = N0 emt ,
(1.2)
which predicts the exponential growth if m > 0. At the same time Malthus argued that the goods increase in the world linearly. Exponential population growth plus linear increase of food and similar things together mean, by Malthus, a catastrophe. Here is what Malthus wrote A man who is born into a world already possessed, if he cannot get subsistence from his parents on whom he has a just demand, and if the society do not want his labour, has no claim of right to the smallest portion of food, and, in fact, has no business to be where he is. At nature’s mighty feast there is no vacant cover for him. She tells him to be gone, and will quickly execute her own orders, if he does not work upon the compassion of some of her guests. If these guests get up and make room for him, other intruders immediately appear demanding the same favour. The report of a provision for all that come, fills the hall with numerous claimants. The order and harmony of the feast is disturbed, the plenty that before reigned is changed into scarcity; and the happiness of the guests is destroyed by the spectacle of misery and dependence in every part of the hall, and by the clamorous importunity of those, who are justly enraged at not finding the provision which they had been taught to expect. The guests learn too late their error, in counter-acting those strict orders to all intruders, issued by the great mistress of the feast, who, wishing that all guests should have plenty, and knowing she could not provide for unlimited numbers, humanely refused to admit fresh comers when her table was already full. Thomas Robert Malthus (13 February 1766 — 23 December 1834) An Essay on the Principle of Population, Second edition (this quotation was removed from the text in the subsequent editions)
It is interesting to note that actually Malthus was wrong. Ok, I hope that it is not surprising at this point that Malthus was wrong, given the number of the simplifying assumptions put in the model, but what I call “interesting” is that the human population actually grew faster than the exponential growth predicts. Consider this in some details. First, I would like to see how my exponential function approximates the data I have. In my case I have, recall the least square method I used in the first subsection in this lecture, f (t, a, b) = aebt . However, (the student must convince himself that my statement is correct) the system for the partial derivatives ∂g ∂g = 0, =0 ∂a ∂b is no longer linear! and hence cannot be easily solved without applying specific numerical procedures. One can avoid this obstacle by noticing that log f is a linear function of a and b: log f (t, a, b) = log a + bt = A + bt. 10
æ æ æ
5000
æ æ
Population
æ æ æ 3000
æ æ æ æ
2000
æ æ
æ 1900
1920
1940
1960
1980
2000
Year
Figure 1.4: The total world population during the 20th century, millions versus years, in logarithmic coordinates This transformation is also often useful for presenting the data in logarithmic coordinates, see the figure. Anyway, I find that f (t, a, b) = 1.15 × 10−9 e0.01462t ,
g(a, b) = 1600000,
which is better than the linear approximation but worth than the quadratic one, see the graphical comparison. æ æ æ
5000
æ æ
Population
æ æ æ 3000
æ æ æ æ
2000
æ æ
æ 1900
1920
1940
1960
1980
2000
Year
Figure 1.5: The total world population during the 20th century, millions versus years, and the best exponential approximation of the data, in logarithmic coordinates If I consider available estimates of the world population for the last 2000 years, not just a century, the disagreement with the exponential function is even worth. Note that I again use the logarithmic coordinates to plot the population numbers. I.e., instead of N (t) I actually plotted log N (t). If the population growth was exponential, given by N (t) = N0 emt , then in the 11
Population
5000
1000 500 ææ 100 50
æææ
æ æ æ æ æ æ æ æ æ æ æ ææ æææ æ ææææ ææ
æ æ æ æ
10 -3000
-2000
0
-1000
1000
2000
Year
Figure 1.6: World population versus time, in millions, in logarithmic coordinates logarithmic coordinates I would get the straight line. It turns out actually that a much better fit can be obtained if I assume that the function N has the hyperbolic form N (t) =
C , T −t
(1.3)
where C ≈ 2 × 1011 , T ≈ 2026 can be found by the same method of least squares (but here one must to solve a nonlinear system of equations). This formula is very precise if I consider only 400-500 years of the population estimates up to 1960. Note that when t → T , the population blows up. However, if I consider the population growth only during the last 100 years or so (the original data plus the last 15 years), I will see that the population actually stabilizes. And this æ
7000
æ æ æ æ
6000
æ
Population
æ æ æ
5000
æ æ æ æ
4000
æ æ æ æ
3000 æ æ 1950
æ
æ æ 1960
1970
1980
1990
2000
2010
Year
Figure 1.7: World population versus time, in millions, for the last 65 years conclusion should be quite obvious from a common sense. Therefore, Malthus’ interpretation of 12
his mathematical model (1.1) and its solutions is clearly wrong; however, even from such a simple and unrealistic mathematical model can be made a very far reaching conclusion, which was made by Charles Darwin, who put together the underlying law of the geometric (or exponential) growth and clear impossibility of the infinite human population. Darwin wrote: In October 1838... I happened to read for amusement Malthus on Population... it at once struck me that under these circumstances favourable variations would tend to be preserved, and unfavourable ones to be destroyed. The result of this would be the formation of new species.
Hence even very simple models can lead to very important and nontrivial conclusions! The fact that no population can grow to infinity should be included in our mathematical models if we would like to consider predictions of the population size in the future times. Very general, I can assume that the law of growth has the general form N˙ = N F (N ), where F (N ) is some function, which has to be negative for sufficiently large values of N (do you see why it is important?). If this function is smooth enough, I can represent it with the help of the Taylor formula around N = 0: F (N ) = F (0) +
F ′ (0) F ′′ (0) 2 N+ N + o(N 2 ). 1! 2!
Here the notation f (N ) = o(g(N )) when means that f (N ) = 0, N →0 g(N ) lim
and I also assume that this term is negligible when N → ∞. Note that if in the Taylor formula I keep only the constant term, I obtain exactly the Malthus equation N˙ = mN, where m = F (0). If I keep two terms, I obtain the equation ( ) N˙ = N F (N ) = N F (0) + F ′ (0)N = mN
) ( N 1− , K
where I used another parametrization (do you see how F (0) and F ′ (0) are connected to m and K?), which is the logistic equation, and the parameter K is the carrying capacity. Therefore, I presented a mechanistic argument in favor of the logistic equation as the simplest first order differential equation describing the population growth apart of the Malthus equation. I hope that at this point you already know that, given N (0) = N0 , the logistic equation has the solution N (t) =
N0 K → K, N0 + (K − N0 )e−mt
t → ∞.
Again, using the available data and the method of the least squares, I can estimate the three parameters of the logistic curve, and find, e.g., that K = 11740, that is, the world population 13
Population
10 000 8000 ææ ææ æ 6000 æ ææ æ ææ ææ æ 4000 ææ ææ æ æææ 1960
1980
2000
2020
2040
2060
2080
2100
Year
Figure 1.8: World population versus time, in millions, for the last 65 years and the best logistic fit together with prediction of the increase of the world population will stabilize at approximately 12 billion people, see the comparison of the data with the best logistic fit in the figure below. There are a great number of different population models in the literature. To give just one more example, I can consider the model of the form ) ( b N − , N˙ = mN 1 − K 1 + aN where in addition to the usual logistic equation, another mortality term is added, and a and b are positive parameters. This last equation actually describes an important ecological phenomenon of Allee’s effect, which states that the maximal per capita population growth occurs at some intermediate values of N , whereas for both large and small values of N it becomes smaller or even probably negative (for the large values we have, as we discussed, depletion of resources, and for small values of N you can think of the chance of finding a mate). Anyway, I hope at this point it is clear that it would be beneficial to treat first order ODE, since they can describe, as a first approximation, the population growth. I will also use the mathematical theory of first order ODE (which is not very complex, let me put it this way) to introduce the language of nonlinear dynamics, which we will use throughout the course. Finally, you already noted that our models depend on parameters, and if the parameters change, sometimes sudden changes in the system behavior occur. These changes are called bifurcations, and I will also introduce some bifurcation analysis, which will be very handy the whole semester.
1.3
Homework 1: Mathematical models of population growth
1. Least square method. (a) Assume that we have data points (t1 , x1 ), . . . , (tk , xk ). The method of least squares can be applied to find the straight line x = at + b such that the value k ∑ (
xi − (ati + b)
i=1
14
)2
is minimized. Show that ∑k a=
i=1 (ti − t)(xi − ∑k 2 i=1 (ti − t)
x)
,
b = x − at,
gives the solution to this problem. Here k 1∑ t= ti , k i=1
k 1∑ x= xi . k i=1
(b) Given the data 1800 5.3 1830 12.9 1860 31.4 1890 62.9 1920 105.7 1950 150.7 1980 226.5 for the population of US in millions in different years, estimate N0 and m in the assumption that the growth law is exponential: N (t) = N0 emt . (Remember that you need initially transform the data.) Sketch the data and the best exponential fit in the same coordinate system. 2. Radioactive decay. It has been experimentally observed that radioactive elements decay at the rate proportional to their mass. If x(t) is the mass at time t, then the mathematical model is x˙ = −λx, x(0) = x0 , where λ > 0 is the coefficient of proportionality, which is substance dependent, and x0 is the mass at the initial time moment. (a) Solve the IVP. (b) To determine λ the amount of the substance is usually measured at some later time τ . It is standard to use the half-life for τ , i.e., the time required for the substance to decay to half of its original size. Show that if τ is the half-life, then λ = log 2/τ . (c) The half-life of carbon-14, 14 C, is 5730 years. Compute the length of time it takes for a mass of 14 C to reduce to 20% of its original weight. (d) When a tree is alive, the amount of carbon-14 in it is at equilibrium: The amount absorbed from the atmosphere equal to the amount radiated. As soon as the tree dies, it keeps radiating carbon-14 but does not absorb any longer. Let N0 be the number of atoms of 14 C at the moment the tree dies (this number is equal to the observed one in the biosphere) and let N (t) be the number of atoms of 14 C in the
15
sample of the dead tree (can be measured). Show that the age of the tree can be estimated as 1 N0 t = log . λ N (t) (e) What is the crucial assumption of the radiocarbon dating, which is probably false? 3. Logistic equation. Using the separation of variables solve the initial value problem for the logistic equation ( ) N ˙ N = rN 1 − , N (0) = N0 , K where r, K > 0 are constants, and the initial condition N0 > 0. What is limt→∞ N (t)?
2
Basic properties of autonomous first order ODE
I start with some mathematical properties of the first order ODE and after this will show how one can put them to a good use analyzing a biological problem.
2.1
Definitions and basic properties
Definition 2.1. First order ODE x˙ = f (t, x) is called autonomous if the right hand side does not depend explicitly on t: x˙ = f (x). (2.1) I start with an example, and then generalize the properties deduced in this example to all autonomous equations. Example 2.2. Consider the simplest autonomous equation x˙ = x, which is a separable equation, and whose solution is x(t) = Cet , where C is an arbitrary constant. If I had the initial condition x(t0 ) = x0 , then my solution would be x(t; x0 ) = x0 et−t0 . If one has the explicit formula for the solution then it is easy to sketch several integral curves (i.e., the graphs of the solutions) and the direction field of this equation (see the figure). From the figure it becomes obvious that the direction field of the equation in the example, as well as any direction field of the autonomous equations, have the property that it is the same on any line parallel to t axis. Hence I can project the whole direction field onto the x-axis, without losing much information (I put an arrow that points in the positive direction if the slope is positive and an arrow that points in the negative direction if the slope is negative, what lost is the absolute values of the slopes). The picture on the x-axis is called the phase portrait and the
16
x
x
0
t
Figure 2.1: The direction field of x˙ = x together with the phase line. The dashed arrows show the projection of the direction field onto the x-axis x axis is called the phase line (again, the terminology originated in mechanics). Note that for x = 0 there is no direction and hence I mark this point on the phase line. Can I come up with the phase portrait without looking at the direction field, which was known to me because of the simplicity of the original equation? The answer is resounding “yes.” Consider the following figure. Here I look at the function f , which is simply f (x) = x in my case. Note that if x > 0 then f (x) > 0 hence x˙ > 0 hence the solutions are increasing. Similarly, if x < 0, then x˙ < 0 and solutions are decreasing, I point these facts with arrows in the graph and obtain again the same phase portrait that I already saw in the previous figure. Hence the conclusion: We do not need actual analytical (i.e., in the form of a formula) solutions to the autonomous differential equations to figure out the phase portrait of this equation. And knowledge of the phase portrait allows me to infer the asymptotic behavior of the solutions (“asymptotic” in this context means for t → ∞). In my example I see that if the initial condition x0 > 0 then x(t; x0 ) → ∞ for t → ∞, and if x0 < 0 then x(t; x0 ) → −∞. Here and throughout the course the notation x(t; x0 ) means the solution to ODE with the initial condition x0 . f (x) f (x) = x f (x) < 0 x f (x) > 0
Figure 2.2: The phase portrait of x˙ = x
17
Example 2.3. Here is another example: the logistic equation has the form ( x) , r, K > 0. x˙ = rx 1 − K Here r, K are positive parameters. I actually solved this equation and studies its solutions in the previous lecture. Here I will sketch several integral curves by studying its phase portrait. The graph of f (x) = rx(1 − x/K) is a parabola with branches pointing down, which crosses x-axis at the points x ˆ1 = 0 and x ˆ2 = K (see the figure). f (x)
x K
x
f (x) = rx 1 −
x 0
K
t
Figure 2.3: Left: The phase portrait of x˙ = rx(1 − x/K). Right: The direction field of x˙ = rx(1 − x/K). The two horizontal integral curves are x = 0 and x = K I can see that f (x) is negative when x < 0 and x > K and positive for x ∈ (0, K), hence the directions of the arrows. This means that the integral curves are growing for x ∈ (0, K) and degreasing for x < 0 and x > K. If x = 0 or x = K I have that f (x) = 0, and hence the slope of the integral curves here is zero, these points in the phase correspond to the integral curves parallel t-axis. Having just this information I can sketch several integral curves (see the figure below). You should compare it with the previous one. Now I am in the position to formulate general properties of the autonomous ODE. • The direction field is invariant with respect to translations along t-axis. This is the reason I can still get a lot of information simply from the phase portrait (i.e., from the projection of solution curves on the phase line). • A related property is that if x(t) solves the problem (2.1) then x(t + c), where c is any constant, also a solution. This means that if I know the solution with the initial condition x(0) = x0 , then any solution with the initial condition x(t0 ) = x0 can be obtained by translation. This is why I can write x(t; x0 ) without specifying the time moment at which the initial condition is prescribed. • The solutions to the autonomous equation are monotonous functions. In particular, the first order autonomous equations cannot have periodic solutions. • There are special and very important solutions, which can be found as the roots of f (x) = 0. If x ˆ is such that f (ˆ x) = 0 then x ˆ is called an equilibrium point (or stationary point, or 18
critical point, or rest point, or simply equilibrium, or fixed point). If x ˆ is an equilibrium, then x(t) = x ˆ is a solution to (2.1), which corresponds to the integral curve parallel to the t-axis (look at the examples above). • The asymptotic behavior of the solutions to the autonomous ODE (2.1) can be inferred from the phase portrait; there are only three options: Firstly, the solutions can approach the equilibria, secondly, the solutions can be equilibria themselves, and finally, the solutions can go to plus or minus infinity. • The last point can be rephrased in the following form: the phase portrait is a union of equilibrium points and orbits (intervals of R) with specific directions. The orbits are also often called trajectories. If I look again at the example with the logistic equation, I can see that there are two equilibria: x ˆ1 = 0 and x ˆ2 = K, but the behavior of the orbits around these points is manifestly different: the point x ˆ1 repels orbits, whereas x ˆ2 attracts orbits (look at the directions of the arrows). The mathematical formalization that distinguishes these points is the notion of stability. Since this notion is so important for this course, I will give a formal definition. Definition 2.4. An equilibrium x ˆ of the autonomous first order ODE (2.1) is Lyapunov stable (or simply stable) if for any ε > 0 there exists a δ(ε) > 0 such that for any initial condition x0 satisfying |x0 − x ˆ| < δ, it follows that |x(t; x0 ) − x ˆ| < ε. If, additionally, |x(t; x0 ) − x ˆ| → 0,
t → ∞,
then x ˆ is called asymptotically stable. Otherwise, x ˆ is called unstable. This definition is quite general. For the first order equations I mostly will meet asymptotically stable and unstable equilibria (can you think of an example of a Lyapunov stable equilibrium?). This definition is difficult to apply for concrete examples, since it involves actual solutions to (2.1). However, even perfunctory inspection of the phase portrait of the logistic equation should bring you the idea of a simple analytical test for stability. The proof is left as an exercise for those who would like to practice their proof writing skills. Proposition 2.5. Let x ˆ be an equilibrium of (2.1). Assume that f ∈ C (1) . If f ′ (ˆ x) > 0 then x ˆ ′ is unstable; if f (ˆ x) < 0 then x ˆ is asymptotically stable. In this proposition f ′ (ˆ x) means the derivative of f evaluated at the point x ˆ. For example, if I consider again the logistic equation ( x) x˙ = rx 1 − , K then I have
( x ) rx f ′ (x) = r 1 − − . K K 19
Therefore
f ′ (ˆ x1 ) = f ′ (0) = r > 0,
hence the origin is unstable, and f ′ (ˆ x2 ) = f ′ (K) = −r < 0, therefore x ˆ2 = K is asymptotically stable. A very good question to think about is what happens if f ′ (ˆ x) = 0. Note that this case is not covered by the proposition above. Definition 2.6. An equilibrium x ˆ of (2.1) such that f ′ (ˆ x) ̸= 0 is called hyperbolic.
2.2
Mathematical models of harvesting
Let me assume that dynamics of a fish population in a lake is governed by the logistic equation ( ) N N˙ = rN 1 − , r, K > 0, K such that in the long run I have that the population size stabilizes at N (t) = K, the carrying capacity of the lake. Now I assume that I would like to start harvesting the fish in the lake. I need to optimize two conditions: First, we would like to guarantee that the fish does not go extinct in the lake, and second, we would like to maximize the yield. For this there are two strategies: • Fixed yield. This means that I fix the quote for the time period (say, 500 pounds per year). • Proportional yield. This means that I fix the proportion of the fish population I would like to harvest during the time period (say, 25 percent of the current population per year). A biologically important question is which strategy is better? Let me start with the fixed yield strategy. The equation that governs the dynamics of population now reads ( ) N ˙ N = rN 1 − − Y0 , r, K, Y0 > 0, K where Y0 is the yield that I plan to acquire during the time unit. I can find the equation for the equilibrium points and study their stability analytically, but the geometric picture in this case is much easier to deal with. The right hand side here is ( ) N f (N ) = rN 1 − − Y0 , K ( ) N whose graph is the parabola defined by rN 1 − K and shifted down by Y0 . If Y0 is small ˆ1 > 0 and N ˆ2 < K, at that the former enough, then I still have two equilibria, let me call them N is unstable and the latter one is asymptotically stable. My task is to determine the maximum possible Y0 in terms of the population parameters r, K. It is clear that for some Y0 the parabola will touch N -axis and after that, for any Y0 bigger than that, there will be no positive equilibria in the system, which corresponds to extinction (see the figure). Hence, the maximal possible 20
f (x) Y0 = 0 0
K
(3)
Y0 = Y0
x
(1)
Y0 = Y0
(2)
Y0 = Y0
(1)
(2)
(3)
Figure 2.4: The phase portraits for the model with the fixed yield. Y0 < Y0 < Y0 . The (2) maximal possible yield corresponds to Y0 , note that in this case we have only one equilibrium. (3) For Y0 there are no positive equilibria and the population goes extinct yield corresponds to the moment when parabola exactly touches N -axis, which happens when the determinant of the quadratic polynomial ) ( rN 2 N − Y0 = − + rN − Y0 rN 1 − K K is equal to zero: rY0 rK = 0 =⇒ Y0 = . K 4 This is the maximal possible yield in this model. Now back to the proportional yield. The equation reads ( ) ( ) N rN ˙ N = rN 1 − − hN = N r − h − . K K r2 − 4
I have here two equilibria: ˆ1 = 0, N which is always unstable, and
ˆ2 = K(r − h) , N r
ˆ2 > 0, I must require that which is always asymptotically stable (check!). To guarantee that N ˆ2 , then my yield is h < r. If the population at the stationary point N ˆ2 = hN
Kh(r − h) , r
and I am free to pick any h. Let me maximize the last expression with respect to h: the maximum is attained when h = r/2 (prove this using Calculus), hence the maximal yield for this strategy is rK , 4 21
which is exactly the same as in the model with fixed yield. So is there any difference between the two approaches? The answer is “yes, there is.” The fixed yield at the maximal value leads to an inevitable catastrophe because sooner or later Y0 will be such that there will no equilibria. The population will collapse. The mathematical term for this catastrophic event here (two equilibria collide and disappear) is bifurcation. For the model with the proportional yield there is no bifurcation since if I somehow exceed the best possible value h = r/2, then nothing dramatic happens, I will harvest slightly less fish (see the figure). f (x) h=0 0
K
x
h = h(3)
h = h(1) h = h(2)
Figure 2.5: The phase portraits for the model with the proportional yield. h(1) < h(2) < h(3) < r. Note that for any h < r we still have two equilibria, the right of which is asymptotically stable
3 3.1
Elementary bifurcations Examples
We already saw in the previous lecture that mathematical models in the form of ODE often depend on parameters. Moreover, when the parameter changes, the behavior of the solutions to ODE sometimes suddenly changes as well; it is said that a bifurcation occurs. In this lecture I will consider simplest possible bifurcations in the first order autonomous ODE x˙ = f (x, α),
x ∈ U ⊆ R, α ∈ R,
(3.1)
where the explicit dependence on the parameter α is shown to emphasize that we study the behavior of solutions when the parameter values vary. Example 3.1 (Fold or saddle-node bifurcation). Consider the following ODE: x˙ = α + x2 =: f (x, α). The equilibria of this equation are determined by the equation α = −x2 , which has two solutions
√ x ˆ1,2 = ± −α, 22
(3.2)
if α < 0, has one solution x ˆ = 0, if α = 0, and no solutions if α > 0. The stability of these solutions is also easy to determine: x ˆ1 (the one with the positive sign) is unstable and x ˆ2 (the one with the negative sign) is asymptotically stable. When α = 0 the equilibrium x ˆ = 0 becomes non-hyperbolic, since in this case fx′ (0, 0) = 0 (in case when a function depends on more than one variable and I use prime to denote differentiation, the subscript indicates the variable with respect to which the derivative is taken). The whole picture can be described as “as parameter α increases two equilibria approach each other, collide, and disappear.” It is convenient to summarize the analysis in the bifurcation diagram, which is shown in the figure. In this case, when we deal with a scalar equation and the x
x
x
x=α
0
α
Figure 3.1: The bifurcation diagram of x˙ = α+x2 (fold bifurcation). The bold curve corresponds to the stable equilibrium, the dashed curve corresponds to the unstable equilibrium. Three phase portraits are shown: one with α < 0, one with α = 0, and one with α > 0. The shaded circles show the equilibria of the system, and the arrows indicate the direction of the phase flow parameter is one dimensional, the bifurcation diagram can be presented as a direct product of the parameter space and the phase space, R×R. The equation f (x, α) = 0 determines the set of equilibria (in the considered case — parabola α = −x2 ). Projection of this set on the α-axis has a singularity at the point (0, 0) at which the bifurcation occurs: two equilibria turn into one and disappear. This bifurcation is called in the literature fold, or tangent, or saddle-node bifurcation. I repeat that this bifurcation happens when λ = fx′ (0, 0) = 0, i.e., when our equilibrium in the system is non-hyperbolic. Here is another example. Example 3.2 (Pitchfork bifurcation). Consider x˙ = αx − x3 =: f (x, α),
x ∈ U ⊆ R, α ∈ R.
(3.3)
If α < 0 then there is unique asymptotically stable equilibrium x ˆ = 0, if α > 0 then additionally √ we have x ˆ1,2 = ± α, both of which are asymptotically stable, whereas x ˆ = 0 becomes unstable. When α = 0 there is only asymptotically stable x ˆ = 0, which is, however, non-hyperbolic: fx′ (0, 0) = 0. The bifurcation diagram is presented in the figure. This bifurcation has the name pitchfork bifurcation. 23
x
x
x
0
α α = x2
Figure 3.2: The bifurcation diagram of x˙ = αx − x3 (supercritical pitchfork bifurcation). The bold curves correspond to the stable equilibria, the dashed curve corresponds to the unstable equilibrium. Three phase portraits are shown: one with α < 0, one with α = 0, and one with α > 0. The shaded circles show the equilibria of the system, and the arrows indicate the direction of the phase flow The considered examples are very illuminating. However, note that I did not actually define what bifurcation is, and what are the conditions that determine one or another bifurcation. The general definitions are not actually needed at this point, therefore I will discuss the definition of a bifurcation specific to the first order ODE (3.1). First, I need to understand which ODE are considered to be different, and what I mean when I say that “the behavior of solutions changes.” That it, I need to obtain the means to compare two different ODE of the form (3.1). Comparison of any mathematical objects is based on an equivalence relation, which allows to identify classes of equivalent objects and study the relationships between these classes. Definition 3.3. Consider two ODE x˙ = f (x),
x ∈ U ⊆ R,
x˙ = g(x),
x ∈ V ⊆ R.
They are called topologically equivalent, if they have equal number of equilibria of the same stability, located in the same order on the phase line. In short, we call two ODE of the form (3.1) with fixed parameter values topologically equivalent, if they have the same structure of the phase space. The phase portraits of topologically equivalent ODE are also called topologically equivalent. Definition 3.4. Appearance of topologically non-equivalent phase portraits under variation of the parameters is called bifurcation. Note that in my case of scalar ODE, any bifurcation is associated with the change of the number or stability of the equilibria of the system. Therefore, we consider so far only bifurcations of equilibria. The exact value of the parameter at which a bifurcation occurs is called the bifurcation value or bifurcation point. 24
How to actually determine the bifurcation values, if any? If you look again at the examples, you will find that the bifurcations occurred in these systems exactly when one or another equilibrium became non-hyperbolic. It turns out that this is indeed the defining condition for a bifurcation of the equilibrium x ˆ to happen: fx′ (ˆ x, α) = 0 =⇒ α is a bifurcation value. Here is one more example to practice sketching the bifurcation diagram. Example 3.5 (Transcritical bifurcation). Consider x˙ = αx − x2 ,
x ∈ U ⊆ R, α ∈ R.
Here we always have two equilibria x ˆ = 0 and x ˆ1 = α, which “collide” at α = 0 into one nonhyperbolic equilibrium; at the point α = 0 these equilibria exchange stability. The bifurcation diagram is given in the figure below. The bifurcation value is again zero: α = 0. x
x
x x=α
0
α
Figure 3.3: The bifurcation diagram of x˙ = αx − x2 (Transcritical bifurcation). The bold curves correspond to the stable equilibria, the dashed curves correspond to the unstable equilibria. Three phase portraits are shown: one with α < 0, one with α = 0, and one with α > 0. The shaded circles show the equilibria of the system, and the arrows indicate the direction of the phase flow
3.2 3.2.1
General discussion Hyperbolic equilibria are insensitive to small perturbations
First, let me explain why exactly the necessary condition for an equilibrium bifurcation in (3.1) is non-hyperbolicity of this point. As a warm up I consider the equation x˙ = −x, which has a hyperbolic asymptotically stable equilibrium x ˆ = 0 at the origin, and its parametric perturbation in the form x˙ = α − x, 25
which coincides with the original equation if α = 0. It is straightforward to see that the perturbed equation has exactly the same equilibrium with the same stability properties, see the figure. Now I consider a general ODE of the form x˙ = f (x), and assume that f ∈ C (1) , f (0) = 0, and f ′ (0) ̸= 0, i.e., that I have an equilibrium at the origin and that this equilibrium is hyperbolic (note that the requirement to have the equilibrium at the origin is not restrictive because if one considers equilibrium x ˆ ̸= 0, it is always possible make the change of the variables, say y = x − x ˆ, and for y the equilibrium yˆ = 0 appears). Together with the ODE, I consider its perturbation x˙ = F (x, α), where F : R × R −→ R, and F ∈ C (1) function satisfying F (x, 0) = f (x),
∂F (0, 0) = f ′ (0) ̸= 0. ∂x
(3.4)
Let me investigate the equilibria of the perturbed equation if α ̸= 0. These new equilibria probably will be different from zero because generically I will have F (0, α) ̸= 0. Conditions (3.4) together with the implicit function theorem (the exact statement is given at the end of the lecture) imply that for small enough α (technically, for |α| < δ) there exist a unique C (1) function ψ(α) with ψ(0) = 0 and F (ψ(α), α) = 0. This means that under the perturbation only one equilibrium is possible, and this equilibrium is given by x ˆ = ψ(α). The stability of this equilibrium is determined by (the sign of) Fx′ (ψ(α), α). x
x
x x=α
0
α
Figure 3.4: The bifurcation diagram of x˙ = α − x (hyperbolic equilibrium is stable with respect to perturbations). The bold curves correspond to the stable equilibria, the dashed curves correspond to the unstable equilibria. Three phase portraits are shown: one with α < 0, one with α = 0, and one with α > 0. The shaded circles show the equilibria of the system, and the arrows indicate the direction of the phase flow 26
We know that Fx′ (0, 0) = f ′ (0) ̸= 0, and since Fx′ and ψ(α) are continuous, then the sign of Fx′ (ψ(α), α) coincides with the sign of f ′ (0) for small enough α. Therefore, the stability type of the perturbed equilibrium coincides with that of x˙ = f (x). Or, in words, the flow near a hyperbolic curve is insensitive to small perturbations of the equation, no bifurcation is possible. Hence to have a bifurcation, my equilibrium must be non-hyperbolic. 3.2.2
Universality of the typical bifurcations
In the first section of this lecture some examples were presented for the first order parameter dependent ODE. The natural questions are actually how long is the possible list of bifurcations, and what kind of generalizations can be made out of these examples. It turns out that actually in a generic case the only bifurcation which we can see in a system with one parameter is the fold bifurcation. At this point I would not want to go into the precise mathematical details of this statement, and confine to general words. In short, the major tool of analysis of autonomous ODE is the coordinate and parameter changes that put system in the simplest form (we need to define of course what the simplest form means in general, for now this can be a polynomial form of the lowest possible degree). This simplest form is called the normal form. It turns out that for a general function f in (3.1) its normal form coincides with the equation from the first example in this lecture. Here is the exact statement of the theorem, whose proof is actually relies heavily on the implicit function theorem and the inverse function theorem. Theorem 3.6. Any differential equation x˙ = f (x, α),
α ∈ R,
(3.1)
that for α = 0 has an equilibrium x ˆ = 0 satisfying ∂f (0, 0) = 0, ∂x ∂2f ′′ fxx (0, 0) = (0, 0) ̸= 0, ∂x2 ∂f (0, 0) ̸= 0, fα′ = ∂α λ=
by variable and parameter changes can be put in the form ( ′′ ) y˙ = β + sy 2 + o(y 2 ), s = sgn fxx (0, 0) ,
(3.5)
(3.6)
where β is a new parameter. Moreover, equation (3.6) is topologically equivalent in a small enough neighborhood of the point (β, y) = (0, 0) to one of the following normal forms y˙ = β ± y 2 .
(3.7)
Remark 3.7. • The requirements that the equilibrium is at the origin and the bifurcation value of the parameter is zero are not essential because I can always find coordinate and parameter changes to make sure they hold. 27
• The bifurcation diagram with the sign “+” is given in the first figure in this lecture. The second one is analogous (see your homework problems). • If the form of f is such that for any α f (0, α)=0 then the second condition in (3.5) does ′′ (0, 0) ̸= 0, then one obtains the normal not hold. If I exchange it for the condition fxα form of the transcritical bifurcation ( ′′ ) y˙ = βy + sy 3 , s = sgn fxx (0, 0) . • If the differential equation is symmetric with respect to the variable x: −f (−x, α) = f (x, α), i.e., if f (x, α) is odd with respect to x, then the second condition in (3.5) also ′′′ (0, 0) ̸= 0 then the normal form of the pitchfork cannot hold. If one requires that fxxx bifurcation appears: ( ) y˙ = βy + sy 3 , s = sgn fxxx (0, 0) . The normal form with the “−” sign was considered in one of the examples. This bifurcation is called supercritical (I have two stable equilibria while the origin is unstable). In case of “+”, the bifurcation is subcritical : one observes two unstable equilibria while the origin is stable. 3.2.3
Implicit function theorem
Probably the main tool in the bifurcation theory is the implicit function theorem, whose statement in the simplest case when the coordinate is x ∈ R and the parameter is α ∈ R is given here. Theorem 3.8. Suppose that F (x, α) is a C (1) function, F : R × R −→ R, satisfying ∂F (0, 0) ̸= 0. ∂x
F (0, 0) = 0,
Then there exists a unique locally defined C (1) function x = ψ(α), such that ψ(0) = 0,
F (ψ(α), α) = 0
for all α in a neighborhood of the origin. Moreover, ψα′ (0) = −
3.3
Fα′ (0, 0) . Fx′ (0, 0)
Homework 2: Stability and Elementary Bifurcations
1. Find the equilibria, determine their stability, and sketch the phase portrait and several integral curves for:
28
(a)
( N˙ = rN
(b)
( 1−
N K
)θ ) ,
r > 0, K > 0, 1 > θ > 0.
( ) N˙ = N re1−N/K − d ,
r > 0, d > 0, K > 0.
2. For which initial values x(0) does the solution x(t) of the differential equation x˙ = x(e−x − 2) approaches zero as t → ∞. 3. Gompertz’ model. Let N (t) be the population size at time t, which grows is governed by the ODE N˙ = a(t)N, where a(t) is the per capita birth rate that depends on time. It is reasonable to assume that a(t) decreases with time, in the simplest case as a˙ = −αa, where α > 0 is some constant. Assume that N (0) = N0 and a(0) = a0 . (a) Show that two introduced ODE can be reduced to one autonomous ODE of the form K N˙ = αN log , α > 0, K > 0. N What is K in terms of a0 , N0 , α? This model is called Gompertz’ equation in the literature (and is very popular to model the number of tumor cells). (b) Draw the phase portrait of the Gompertz model. (c) Find the maximal sustainable yield of the population, whose dynamics is governed by the Gompertz equation, in terms of the parameters α and K for both constant yield and constant effort, i.e., proportional effort, models (recall the two harvesting models we studied in class). 4. This exercise should convince you that anything is possible if an equilibrium is nonhyperbolic. Can you come with examples of first order ODE such that each has a non-hyperbolic equilibrium (recall, that an equilibrium x ˆ is non-hyperbolic if f ′ (ˆ x) = 0), and in one example this point is asymptotically stable, in another is unstable, and yet in another example it is semi-stable, i.e., it attracts an orbit “from one side” and repels an orbit “on the other side.” 5. Draw the bifurcation diagram for x˙ = αx, What is the bifurcation value? 29
α ∈ R.
4
Insect outbreak model
In this lecture I will put to a good use all the mathematical machinery we discussed so far.
4.1
Analysis of the insect outbreak model
Consider an insect population, which is subject to predation by birds. It is a very well known and very well documented fact that often it is possible to observe so-called outbreaks of the insect population, when the number of insects increase from almost undetectable to extremely high. Here is the question: Is it possible to suggest a minimalist mathematical model in the form of one parameter-dependent ODE, which would exhibit this phenomenon? Assume that the dynamics of the insect population is governed by the ODE of the form N˙ = F (N ) − P (N ), where N (t) is the population size at time t, F is the growth rate of the population in the absence of predators, and P describes the effect of the birds on the rate of change of the population. I am free to choose any reasonable functional form for F and P . Since I am looking for a model as simple as possible, let me take the logistic equation for F : ( ) N F (N ) = rB N 1 − , KB
P (N )
where rB > 0 is the per capita growth rate when N is close to zero, and KB is the population carrying capacity. I know that if there are no predators, then N (t) → KB as t → ∞ for any initial condition N0 > 0. To choose P , I note that it is reasonable to expect that if N → ∞ then P (N ) → B = const since the birds can eat only some amount of insect. It is usually said that function P in this case should exhibit a saturation effect. Another qualitative feature of P is that it should decrease to zero faster than the linear function. The rational for this is that if the insect population is low, then birds prefer looking for food somewhere else. Putting these two assumptions together, I obtain that the graph of P should look similar to the one shown in the figure.
N
Figure 4.1: Qualitative form of P (N ) in the insect outbreak model
30
It is possible to pick various analytical expressions for P , but one of the simplest with the desired properties is possibly BN 2 P (N ) = 2 , A + N2 where B > 0 is the saturation constant and A > 0 determines the inflection point of the graph of P (N ). Putting everything together, ( ) N BN 2 ˙ N = rB N 1 − − 2 . (4.1) KB A + N2 Model (4.1) has four parameters. It is usually quite a difficult task to fully describe the behavior of solutions depending on more than two parameters. Luckily for our case, it is frequently possible to reduce the number of model parameters by properly rescaling the variables. Let me make the change of variables and introduce new parameters x=
N , A
τ=
Bt , A
r=
ArB , B
q=
KB . A
Note that new variables and parameters are dimensionless (I will explain how to actually figure out a possible change of the variables at the end of the lecture). In the new variables I have ) ( x2 x − x˙ = rx 1 − =: f (x, r, q). (4.2) q 1 + x2 To find equilibria of (4.2) I need to solve f (x, r, q) = 0. Obviously, x ˆ = 0 is always an equilibrium, which is unstable for any parameter values r, q (actually, f ′ (0, r, q) = r > 0). Other possible equilibria, if they exist, should satisfy ) ( x x = r 1− . q 1 + x2 Finding the zeros of the last expression boils down to solving a cubic equation, which, albeit formally possible, usually leads to messy expressions, whose use is dubious at best. It is better to look for the equilibria graphically. The equilibria are the points of intersections of g1 (x) = r(1 − x/q) and g2 (x) = u/(1 + x2 ). Since g2 (x) does not have any parameters, I can plot it once and for all. g1 (x) defines a straight line, which intersect x-axis at the point q and y-axis at the point r. By changing r for a fixed q I can see that it is possible to have additionally from one to three equilibria (see the figure), which I will denote x ˆ1 , x ˆ2 , x ˆ3 . How to determine stability of these equilibria? I can use the standard technique by analyzing the sign of the derivative f ′ evaluated at the equilibria, but an easier method is to note that if all the equilibria are hyperbolic, then any two equilibria without other equilibria between them must be such that one is unstable and another one is asymptotically stable. I know that x ˆ=0 is unstable. Therefore, if there is only one additional equilibrium, say x ˆ1 and it is hyperbolic (i.e., f ′ (ˆ x1 ) ̸= 0), then it must be asymptotically stable. If there are all three equilibria then x ˆ1 and x ˆ3 are asymptotically stable and x ˆ2 , which is between them, is unstable (see the figure). In the same figure I can also see that the bifurcations in the system happen when either two equilibria “collide” and disappear, or when two equilibria appear as the parameters change. 31
Figure 4.2: Nontrivial equilibria of (4.2) given as the points of intersection of $g_1(x)$ and $g_2(x)$. It is possible to have one ($r_1$ and $r_3$ cases) or three ($r_2$ case) equilibria. For some parameter values the graph of $g_1(x)$ touches the graph of $g_2(x)$ and there are two equilibria, one of which is non-hyperbolic.
This is exactly the fold bifurcation. It is interesting to find the bifurcation values of r, q. They can be found as the solutions to the following system
$$f(x, r, q) = rx\left(1 - \frac{x}{q}\right) - \frac{x^2}{1+x^2} = 0,$$
$$\frac{df}{dx}(x, r, q) = r\left(1 - \frac{2x}{q}\right) - \frac{2x}{(1+x^2)^2} = 0.$$
The first equation here is the condition for x to be an equilibrium, and the second equation is the condition for this equilibrium to be non-hyperbolic. By solving this system with respect to r and q I get
$$r = \frac{2x^3}{(1+x^2)^2}, \qquad q = \frac{2x^3}{x^2 - 1}.$$
Figure 4.3: The case of bistability in model (4.2). Equilibria 0 and $\hat x_2$ are unstable, and $\hat x_1$ and $\hat x_3$ are asymptotically stable
The last expressions define a parametric curve in the plane (q, r). The shape of this curve can be analyzed by the usual means (I used Mathematica to plot it, see the figure). Note that there is a singularity on this curve, which is called the cusp singularity. In the region "inside" the curve the model (4.2) has three nontrivial equilibria, and "outside" the curve there is only one equilibrium different from zero. Everywhere on the curve, except for the cusp point, crossing the curve corresponds to a fold bifurcation, at which two equilibria either appear or disappear.
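If you want to reproduce Figure 4.4 without Mathematica, here is a short sketch of my own (an added illustration): it traces the parametric curve (q(x), r(x)) for x > 1 with matplotlib. Both dq/dx and dr/dx vanish at x = √3, which gives the cusp point (q, r) = (3√3, 3√3/8).

```python
import numpy as np
import matplotlib.pyplot as plt

# Parametric fold-bifurcation curve of the rescaled model (4.2):
#   r(x) = 2*x**3 / (1 + x**2)**2,   q(x) = 2*x**3 / (x**2 - 1),   x > 1.
x = np.linspace(1.001, 25.0, 4000)
r = 2 * x**3 / (1 + x**2) ** 2
q = 2 * x**3 / (x**2 - 1)

plt.plot(q, r, "k")
plt.plot(3 * np.sqrt(3), 3 * np.sqrt(3) / 8, "ro", label="cusp at x = sqrt(3)")
plt.xlim(0, 40)
plt.ylim(0, 0.8)
plt.xlabel("q")
plt.ylabel("r")
plt.legend()
plt.title("Fold bifurcation curve of model (4.2)")
plt.show()
```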
Figure 4.4: The bifurcation curve in the space of parameters (r, q) of model (4.2)
Model (4.2) (and, correspondingly, (4.1)) exhibits the phenomenon of hysteresis. Hysteresis can be defined as the dependence of the state of a system not only on the current environment but also on the history of how the system arrived at this state. To illustrate this phenomenon with the help of the model (4.2), consider the surface in the space (r, q, x) defined by the equilibrium condition f(x, r, q) = 0. This surface is shown in the figure below; for each fixed (r, q) the x-coordinates on the surface give the equilibria of (4.2). The previous figure can be obtained by projecting this surface onto the (r, q) plane. Now I assume that the parameter q is fixed, but somehow I can change the value of the parameter r in the model. I start at the point A (see the figures), where there is only one nontrivial equilibrium $\hat x_1$ (this corresponds to the case $r_1$ in Figure 4.2). This equilibrium is asymptotically stable, and the insect population settles at it. Now I assume that r starts growing, so that I reach the point B in Figure 4.4. For an observer not much has changed, since the population is still at the equilibrium $\hat x_1$, which is slightly greater than before, but quite close. For the whole system, however, a fold bifurcation has happened: in addition to $\hat x_1$, two new equilibria $\hat x_2$ and $\hat x_3$ have appeared, the former unstable and the latter asymptotically stable. As I increase r up to the point C in Figure 4.4, I observe a continuous growth of the $\hat x_1$-equilibrium, in which the population resides. At this point another bifurcation happens: the points $\hat x_1$ and $\hat x_2$ collide and disappear. What is going to happen with the population? It must find another stable state, which is actually present in the system and given by $\hat x_3$. Note, however, that the coordinate of $\hat x_3$ is much larger than that of $\hat x_1$, and at this point the system experiences the outbreak!
Figure 4.5: This surface is defined by the condition f(x, r, q) = 0 in model (4.2)
Now the insect population stays at the $\hat x_3$-equilibrium for any larger values of r (point D). The path of the population state in Figure 4.5 is given by A, lower B, lower C, upper C, D. Now assume that somehow I am able to reduce the value of r. The insect population will start declining, but it stays on the $\hat x_3$ branch; I will need to reduce the parameter value all the way down to the point B to actually make sure that the insect population is again at $\hat x_1$. Hence, the path in Figure 4.5 when we reduce the values of r is given by D, upper C, upper B, lower B, A.
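The hysteresis loop is easy to see numerically. Below is a small sketch of my own (not part of the notes; the values q = 8 and the sweep range for r are arbitrary): for a fixed q I sweep r slowly up and then back down, each time letting the population relax to its current stable equilibrium. Which branch the state tracks depends on the direction of the sweep.

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(t, x, r, q):
    return r * x * (1 - x / q) - x**2 / (1 + x**2)

def relax(x0, r, q, T=500.0):
    """Integrate (4.2) long enough that the state settles near a stable equilibrium."""
    sol = solve_ivp(f, (0.0, T), [x0], args=(r, q), rtol=1e-8, atol=1e-10)
    return sol.y[0, -1]

q = 8.0
r_up = np.linspace(0.2, 0.8, 40)      # slowly increase r ...
r_down = r_up[::-1]                   # ... and then decrease it back

x = 0.5                               # start near the low ("refuge") equilibrium
branch_up = []
for r in r_up:
    x = relax(x, r, q)                # the previous state is the new initial condition
    branch_up.append(x)

branch_down = []
for r in r_down:
    x = relax(x, r, q)
    branch_down.append(x)

# For intermediate r the two sweeps settle on different equilibria: hysteresis.
for r, xu, xd in zip(r_up, branch_up, branch_down[::-1]):
    print(f"r = {r:.3f}   up: {xu:7.3f}   down: {xd:7.3f}")
```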
4.2
Non-dimensional variables
In the mathematical model of the insect outbreak a change of variables was presented, which allowed me to reduce the number of parameters. Here is an explanation of how to actually find such a change of variables. In physics this process is frequently referred to as non-dimensionalization. As an example consider the logistic equation in the form
$$\dot N = rN\left(1 - \frac{N}{K}\right),$$
where r > 0 and K > 0 are parameters of the problem. The former is the rate of growth and the latter is the carrying capacity. N(t) is the size of the population at the time moment t. Let me assume that here I am talking about elephants, hence N(t) is the number of elephants in the population at the time moment t. What are the dimensions of the parameters in this problem? First note that on the left hand side I have the derivative $\dot N$, i.e., the rate of change, which means that
the units on the left have to be $[\dot N] = \frac{\text{elephants}}{\text{time}}$ (the square parentheses mean "dimensions"), which implies that the right hand side has to have the same units, and I have no choice other than to consider that $[K] = [\text{elephants}]$ and $[r] = \left[\frac{1}{\text{time}}\right]$. Now I make the substitutions
$$N(t) = A\,n(\tau), \qquad t = T\tau,$$
where τ is a new independent variable, n(τ) is a new dependent variable, and A and T are constants to be determined. Using the chain rule
$$\frac{dN(t)}{dt} = A\,\frac{dn(\tau)}{dt} = A\,\frac{dn(\tau)}{d\tau}\,\frac{d\tau}{dt} = \frac{A}{T}\,\frac{dn(\tau)}{d\tau},$$
hence
$$\frac{A}{T}\,\frac{dn}{d\tau} = rAn\left(1 - \frac{An}{K}\right) \;\Longrightarrow\; \frac{dn}{d\tau} = rTn\left(1 - \frac{An}{K}\right).$$
Since I am free to choose A and T, I can set
$$T = \frac{1}{r}, \quad [T] = [\text{time}], \qquad A = K, \quad [A] = [\text{elephants}],$$
and the change of variables
$$n(\tau) = \frac{N(t)}{K}, \qquad \tau = rt,$$
reduces the logistic equation to the form without any parameters:
$$\dot n = n(1-n),$$
where now $\dot n$ means the derivative with respect to τ. Note that the new variables are dimensionless: n(τ) is the proportion of the population at time τ, and time is measured in some abstract non-dimensional units. It is usually possible to suggest several alternative changes of the variables that lead to a dimensionless form. Which one to choose is up to the researcher. For example, in the insect outbreak model the choice was made so that all the remaining parameters are in the linear function, whose behavior is easy to figure out.
Here is another example for a system of two autonomous equations:
$$\dot N = aN - bNP, \qquad \dot P = -dP + cNP.$$
Here N, P are dependent variables, which depend on the independent variable t (time), and a, b, c, d are parameters, all of which are supposed to be nonnegative. Let N(t) = Ax(τ), P(t) = By(τ), t = Tτ be my change of variables such that A, B, T are some constants I would like to determine. Using again the chain rule, I find
$$A\dot x = aATx - bABTxy, \qquad B\dot y = -dTBy + cABTxy,$$
or, after canceling,
$$\dot x = aTx - bBTxy, \qquad \dot y = -dTy + cATxy.$$
Putting
$$aT = 1, \qquad bBT = 1, \qquad cAT = 1,$$
I find that the change of variables and parameters
$$x(\tau) = \frac{cN(t)}{a}, \qquad y(\tau) = \frac{bP(t)}{a}, \qquad \tau = at, \qquad \alpha = \frac{d}{a},$$
leads to
$$\dot x = x - xy, \qquad \dot y = -\alpha y + xy,$$
which has only one dimensionless parameter. You should convince yourself that the change
$$x(\tau) = \frac{cN(t)}{d}, \qquad y(\tau) = \frac{bP(t)}{a}, \qquad \tau = at, \qquad \gamma = \frac{d}{a},$$
yields
$$\dot x = x - xy, \qquad \dot y = \gamma y(x - 1),$$
which is in some cases more convenient for subsequent analysis.
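If you want to double-check such computations, a symbolic substitution does the bookkeeping for you. The following sympy sketch (my own illustration, not part of the notes) verifies the second change of variables for the system Ṅ = aN − bNP, Ṗ = −dP + cNP.

```python
import sympy as sp

t, a, b, c, d = sp.symbols("t a b c d", positive=True)
N, P = sp.Function("N"), sp.Function("P")

# Original predator-prey system.
rhs_N = a * N(t) - b * N(t) * P(t)
rhs_P = -d * P(t) + c * N(t) * P(t)

# Proposed dimensionless variables: x = c*N/d, y = b*P/a, tau = a*t, gamma = d/a.
x = c * N(t) / d
y = b * P(t) / a
gamma = d / a

# dx/dtau = (dx/dt)/a with dN/dt replaced by its right-hand side; same for y.
dx_dtau = sp.simplify((c / d) * rhs_N / a)
dy_dtau = sp.simplify((b / a) * rhs_P / a)

print(sp.simplify(dx_dtau - (x - x * y)))          # 0, i.e. x' = x - x*y
print(sp.simplify(dy_dtau - gamma * y * (x - 1)))  # 0, i.e. y' = gamma*y*(x - 1)
```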
5 Alfred Lotka, Vito Volterra, and Population Cycles
5.1 Analysis of the Lotka–Volterra model
Dr. Umberto D'Ancona entertained me several times with statistics that he was compiling about fishing during the period of the war and in periods previous to that, asking me if it would be possible to provide a mathematical explanation of the results that were obtained regarding the percentage of the various species in these different periods... This may justify my having permitted myself to publish this research, which is simple from an analytical point of view but which was new to me.
Vito Volterra, "Variazioni e fluttuazioni del numero d'individui in specie animali conviventi" (Variations and fluctuations in the number of individuals of cohabiting animal species), 1927
Actually, the statistics that Volterra writes about above can be summarized in the following way: during the war, when fishing had almost ceased, it was observed that certain predaceous species greatly increased in numbers, while the numbers of prey species declined compared to the pre-war levels. Discussing the problem with his future father-in-law, Umberto D'Ancona asked if there could be a mathematical explanation for these changes.
Here is how this problem was solved by Vito Volterra. Consider two interacting species: a prey and a predator. The relationship between them can be expressed verbally as follows: The change in the number of prey, N , per unit of time is equal to the natural increase of the prey per unit of time minus destruction of the prey by the predator per time unit. Similarly, the change in the number of predators, P , per unit of time is equal to the increase in the predator per time, as the result of ingestion of the prey minus death of the predators per unit of time. Let me translate this verbal description into mathematical formulas as follows: N˙ = F1 (N ) − G1 (N, P ), P˙ = G2 (N, P ) − F2 (P ),
(5.1)
where the meaning of F1, G1, G2, and F2 should be clear from the description above. Let me choose the simplest possible mathematical expressions for these functions. To wit, consider a particular case of (5.1):
$$\dot N = aN - bNP, \qquad \dot P = cNP - dP,$$
(5.2)
where a, b, c, d are nonnegative parameters. The interaction of the prey and the predator is described here by the bilinear function ∝ NP. System (5.2) is a particular case of a system of two first order autonomous equations, which in general can be written as
$$\dot N = f(N, P), \qquad \dot P = g(N, P),$$
(5.3)
for some suitable f, g ∈ C(1). Before turning to an analysis of (5.2), I would like to mention that exactly the same system of equations was written down by Alfred Lotka, somewhat earlier than Vito Volterra. He was studying some fictional problems from chemical kinetics. To illustrate his approach, consider first a hypothetical chemical reaction in which reagents A and B produce a new element C: A + B → C. This literally says that one molecule of A combined with one molecule of B produces one molecule of C; please remember that I speak here about molecules, not masses. The law of mass action in chemistry states that the speed of a reaction is proportional to the concentrations of the reagents. This means that for the reaction above, if I denote by the square parentheses the concentrations of the chemicals, I can write
$$\frac{d[A]}{dt} = -k[A][B],$$
where k is some constant of proportionality, which is usually written above the arrow: $\xrightarrow{k}$, and the sign "minus" is taken because the arrow points "from" A. The dimensions of k are determined by the form of the reaction; for my particular example they are concentration$^{-1}\cdot$time$^{-1}$. The same equation is true for the reagent B:
$$\frac{d[B]}{dt} = -k[A][B].$$
Finally, for C I have
$$\frac{d[C]}{dt} = k[A][B].$$
Here is an example of a reversible reaction:
$$A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} C,$$
for which, following exactly the same law of mass action, I get
$$[\dot A] = -k_1[A][B] + k_{-1}[C], \qquad [\dot B] = -k_1[A][B] + k_{-1}[C], \qquad [\dot C] = k_1[A][B] - k_{-1}[C].$$
Now consider a mechanism of the form
$$2A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} C,$$
which means that it is necessary to have two molecules of A and one molecule of B to produce one molecule of C. The law of mass action in this case implies that
$$[\dot A] = -2k_1[A]^2[B] + 2k_{-1}[C];$$
note the coefficients and the power 2 in the equation. The two other equations here are
$$[\dot B] = -k_1[A]^2[B] + k_{-1}[C], \qquad [\dot C] = k_1[A]^2[B] - k_{-1}[C];$$
note the absence of the constant 2 in the equations for B and C. Alfred Lotka considered the following hypothetical reaction:
$$A + X \xrightarrow{k_1} 2X, \qquad X + Y \xrightarrow{k_2} 2Y, \qquad Y \xrightarrow{k_3} B,$$
where X and Y are reagents, and the supply of chemicals A and B is constant. This actually means that the system is open, and an exchange of matter with the environment must be present. Invoking the law of mass action, I have
$$[\dot X] = k_1[A][X] - k_2[X][Y], \qquad [\dot Y] = k_2[X][Y] - k_3[Y].$$
Using the notation N = [X], P = [Y], a = k_1[A], b = c = k_2, and d = k_3, I obtain exactly the equations of Vito Volterra (5.2). It is very convenient to keep in mind the chemical interpretation of equations (5.3) for assessing the validity of this and similar mathematical models.
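As a quick consistency check, the sketch below (mine, not from the notes; the rate constants and the concentration of A are arbitrary) builds the mass-action right-hand side for Lotka's three reactions and confirms that, with a = k1[A], b = c = k2, d = k3, it coincides with the Lotka–Volterra right-hand side (5.2).

```python
import numpy as np

k1, k2, k3 = 0.8, 0.3, 0.5   # arbitrary rate constants (illustration only)
A_conc = 2.0                 # concentration of A, kept constant by assumption

def mass_action_rhs(X, Y):
    """Law of mass action applied to A + X -> 2X, X + Y -> 2Y, Y -> B."""
    dX = k1 * A_conc * X - k2 * X * Y
    dY = k2 * X * Y - k3 * Y
    return np.array([dX, dY])

def lotka_volterra_rhs(N, P, a, b, c, d):
    return np.array([a * N - b * N * P, c * N * P - d * P])

a, b, c, d = k1 * A_conc, k2, k2, k3
for X, Y in [(1.0, 1.0), (2.5, 0.4), (0.3, 3.0)]:
    assert np.allclose(mass_action_rhs(X, Y), lotka_volterra_rhs(X, Y, a, b, c, d))
print("Lotka's chemical scheme gives exactly the Lotka-Volterra equations (5.2).")
```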
This is actually why system (5.2) is famously known as the Lotka–Volterra model, or the Lotka–Volterra equations. Because of the ecological or chemical interpretations it is reasonable to set the initial conditions for this system as N(0) = N0 > 0 and P(0) = P0 > 0, which are positive numbers.
The Lotka–Volterra system is a particular case of the general system (5.3), the analysis of which is significantly more involved than the analysis of one autonomous equation, which we were studying in the previous lectures. However, the particular form of (5.2) actually allows one to obtain a number of results without any need of the general theory. Therefore, in this lecture I will analyze (5.2) by the available means, but it should be clear that in more general cases I will need some additional mathematical machinery, which will be our focus for the next several lectures.
The key fact to note is that if I have solutions to (5.2), then they define a curve (N(t), P(t)) in the (N, P) plane parameterized by the time t. Now, one of the basic results from Calculus states that the derivative dP/dN of such a curve can be found as
$$\frac{dP}{dN} = \frac{dP/dt}{dN/dt} = \frac{\dot P}{\dot N}.$$
Therefore, the two first order ODE in (5.2) can be replaced by one first order equation
$$\frac{dP}{dN} = \frac{P(cN - d)}{N(a - bP)},$$
(5.4)
which is actually a separable equation, and can be integrated as
$$\frac{a - bP}{P}\,dP = \frac{cN - d}{N}\,dN \;\Longrightarrow\; a\log P - bP = cN - d\log N + C,$$
where C is an arbitrary constant, which is determined by the initial conditions. Consider the function
$$H(N, P) := bP + cN - a\log P - d\log N.$$
I obtain that the solutions to (5.4) are given by the level sets of the function H: H(N, P) = C. I have
$$\frac{\partial H}{\partial N}(N, P) = c - \frac{d}{N}, \qquad \frac{\partial H}{\partial P}(N, P) = -\frac{a}{P} + b,$$
which means that the only critical point of H is
$$(\hat N, \hat P) = \left(\frac{d}{c}, \frac{a}{b}\right).$$
Also,
$$\frac{\partial^2 H}{\partial N^2}(N, P) = \frac{d}{N^2} > 0, \qquad \frac{\partial^2 H}{\partial P^2}(N, P) = \frac{a}{P^2} > 0, \qquad \frac{\partial^2 H}{\partial N\,\partial P}(N, P) = 0,$$
which means that the point $(\hat N, \hat P)$ is a strict minimum of the function H. Moreover, due to the fact that the second partial derivatives are positive (and the mixed one vanishes), H is convex at any point of the positive quadrant, which implies
Figure 5.1: The surface H(N, P) together with the level sets H(N, P) = C
that the only level sets of this function are closed curves that surround the point of minimum $(\hat N, \hat P)$ (see the figure, where the function H together with the level sets is shown).
So, what kind of information about the original functions N and P did I gain, and what was lost? I now know that the curves are closed (apart from the point $(\hat N, \hat P)$, which is actually an equilibrium point of (5.2)), but I do not know the direction of movement along these curves when t increases. This is the kind of information I lost when jumping from (5.2) to (5.4). However, it is quite straightforward to figure out the directions by looking at the vector field given by
$$f(N, P) = aN - bNP, \qquad g(N, P) = cNP - dP$$
(recall that a vector field means that at each point of the plane I have a prescribed vector). Note, e.g., that along the N-axis the direction is positive, and along the P-axis the direction is negative (though this should be clear from just looking at (5.2)). Therefore, I can put the directions on the curves as shown in the figure.
Now to one of the main conclusions: since the solutions are represented by closed curves on the plane and $(\dot N, \dot P) \neq 0$ everywhere except for the equilibrium, there exists T > 0 such that
$$\bigl(N(t; N_0), P(t; P_0)\bigr) = \bigl(N(t + T; N_0), P(t + T; P_0)\bigr),$$
and since the vector field $(\dot N, \dot P)$ does not depend on t (it does not change with t), I conclude that the solutions to (5.2) are periodic functions with period T (see their graphs in the figure below). Moreover, consider the average values of the prey and predator populations over one period:
$$\frac{1}{T}\int_0^T N(\xi)\,d\xi, \qquad \frac{1}{T}\int_0^T P(\xi)\,d\xi.$$
To find them I notice that equations (5.2) can be written as
$$\frac{d}{dt}\log N = a - bP, \qquad \frac{d}{dt}\log P = cN - d.$$
Figure 5.2: The level sets of H together with time directions specified by the vector field $(\dot N, \dot P)$. The bold points denote the nontrivial equilibrium $(\hat N, \hat P)$ and the origin, which is also an equilibrium. On the right I show the direction field (f, g)
I integrate these equalities over one period and find, since
$$\int_0^T \frac{d}{d\xi}\log N(\xi)\,d\xi = \log N(T) - \log N(0) = 0,$$
that
$$\frac{1}{T}\int_0^T N(\xi)\,d\xi = \frac{d}{c} = \hat N, \qquad \frac{1}{T}\int_0^T P(\xi)\,d\xi = \frac{a}{b} = \hat P.$$
That is, the average prey and predator populations do not depend on the initial conditions or the period of the oscillations, and coincide with the coordinates of the nontrivial equilibrium.
Now assume that I start fishing for both prey and predator populations. It means that the growth rate of the prey will decrease by some amount δ1, and the death rate of the predators will increase by some amount δ2, i.e., the new parameters will be a − δ1, d + δ2, and the new equilibrium point will be
$$\hat N_{\text{new}} = \frac{d + \delta_2}{c} > \hat N_{\text{old}}, \qquad \hat P_{\text{new}} = \frac{a - \delta_1}{b} < \hat P_{\text{old}}.$$
In other words I proved
Proposition 5.1 (Volterra's principle). If in a prey–predator system both species are destroyed uniformly and proportionally to their total amounts, then the average prey population increases and the average predator population decreases.
Another conclusion of the analysis of the Lotka–Volterra model, extremely important from the ecological point of view, is that in a prey–predator system it is possible to observe endogenous
oscillations, i.e., oscillations that are inherent to the system and not caused by any external circumstances (see the time dependent solutions to (5.2), obtained numerically, in the figure). The fact that the populations of prey and predator oscillate has been known for quite a while. Here is an example of a historical record taken over 90 years for a population of lynxes versus a population of hares (actually, these are data on sales of the hunting companies, but it is believed that the number of sales reflects the total populations). These data show that the wild populations do experience oscillations, and the Lotka–Volterra model provides an explanation for these oscillations without the need to invoke additional causes such as climate or something else. There are a lot of drawbacks of model (5.2) (can you think of any?), which I will discuss later. For now it should be clear that systems of autonomous differential equations are of great value in modeling interacting populations (or, for that matter, chemical reactions), and in the next several lectures I will present the elements of the mathematical analysis of such systems.
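To see these conclusions numerically, here is a short sketch of my own (the parameter values a = 1, b = 0.5, c = 0.2, d = 0.6 are arbitrary): it integrates (5.2), checks that H(N, P) stays essentially constant along the orbit, and verifies that the long-time averages are close to d/c and a/b.

```python
import numpy as np
from scipy.integrate import solve_ivp

a, b, c, d = 1.0, 0.5, 0.2, 0.6   # arbitrary positive parameters

def lv(t, z):
    N, P = z
    return [a * N - b * N * P, c * N * P - d * P]

def H(N, P):
    return b * P + c * N - a * np.log(P) - d * np.log(N)

t_end = 400.0                     # many periods, so boundary effects are small
t = np.linspace(0.0, t_end, 20001)
sol = solve_ivp(lv, (0.0, t_end), [1.0, 1.0], t_eval=t, rtol=1e-10, atol=1e-12)
N, P = sol.y

# The first integral H is conserved along the orbit (up to integration error).
print("max |H - H0| =", np.max(np.abs(H(N, P) - H(N[0], P[0]))))

# Long-time averages approximate the averages over one period.
print("mean N =", N.mean(), " d/c =", d / c)
print("mean P =", P.mean(), " a/b =", a / b)
```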
5.2
Homework 3
1. Non-dimensionalization.
(a) A predator–prey model for herbivore (H)–plankton (P) interaction is
$$\dot P = rP\left[(K - P) - \frac{BH}{C + P}\right], \qquad \dot H = DH\left[\frac{P}{C + P} - AH\right],$$
where r, K, A, B, C, D are positive constants. What are the units of each constant? Find the change of the variables that leads to
$$\dot p = p\left[(k - p) - \frac{h}{1 + p}\right], \qquad \dot h = dh\left[\frac{p}{1 + p} - ah\right],$$
where the derivatives are now with respect to the new time τ and k, a, d are new parameters.
Figure 5.3: Solutions to (5.2) versus time t. The prey population is in green, and the predator population is in red. Note that the prey oscillations precede the predator oscillations
Figure 5.4: Historical data on lynx and hare populations in Canada. Source: Odum, E. P., Odum, H. T., & Andrews, J. (1971). Fundamentals of ecology. Philadelphia: Saunders.
(b) Kolmogorov–Petrovski–Piskunov–Fisher equation. The famous KPP or Fisher equation is
$$\frac{\partial N}{\partial t} = rN\left(1 - \frac{N}{K}\right) + D\frac{\partial^2 N}{\partial x^2},$$
where the unknown function N(t, x) depends both on the time t and the spatial variable x, and r, K, D are positive constants. This is an example of a partial differential equation. Find the change of dependent and independent variables (now you have two independent variables!) to reduce this equation to
$$\frac{\partial u}{\partial \tau} = u(1 - u) + \frac{\partial^2 u}{\partial s^2}.$$
2. Chemical kinetics.
(a) Consider a famous reaction by Michaelis and Menten, which can be presented as
$$S + E \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} SE, \qquad SE \xrightarrow{k_2} E + P,$$
where S, E, SE, P are the substrate, enzyme, substrate–enzyme complex, and product respectively. Using the law of mass action, write down the system of four ordinary differential equations describing this reaction.
(b) Consider a hypothetical chemical reaction that involves four chemicals, two of which are kept at a constant level:
$$X \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} A, \qquad B \xrightarrow{k_2} Y, \qquad 2X + Y \xrightarrow{k_3} 3X.$$
Using the law of mass action (you may consult Section 5 of the lecture notes), write down two differential equations for the concentrations of X and Y, assuming that A and B are kept at constant concentrations.
3. SIR model. Consider a mathematical model of the spread of a disease in a population. The whole population is divided into 3 classes: susceptible S, infectious I, and recovered R. The differential equations describing the dynamics of these classes take the form S˙ = −βSI, I˙ = βSI − γI, R˙ = γI. Here β > 0 and γ > 0 are parameters. Note that the first two equations do not depend on the third variable and hence can be considered separately. Find a function H(S, I), whose level sets represent the solutions to the first two equations in the plane (S, I). By studying this function say as much as possible about the behavior of the solutions to the SIR model.
6
General properties of an autonomous system of two first order ODE
Here I embark on studying the autonomous system of two first order differential equations of the form x˙ 1 = f1 (x1 , x2 ), x˙ 2 = f2 (x1 , x2 ),
(6.1)
where f1 , f2 ∈ C (1) (U ; R), U ⊆ R2 , x1 , x2 are unknown functions, and t is the independent variable which usually denotes time. System (6.1) can be conveniently written in the vector form x˙ = f (x), x(t) ∈ U ⊆ R2 , f : U −→ R2 , (6.2) ( )⊤ where x(t) = x1 (t), x2 (t) , ⊤ denotes transposition (all the vectors are assumed to be columnvectors), f = (f1 , f2 )⊤ is a vector-function of two variables, which maps a subset U of R2 to R2 , and the bold font usually denotes vectors (please be aware that in some books vectors are written in the same font as scalar variables, therefore it is the reader’s task to figure out the dimensions of the object). I will assume throughout the lectures that for system (6.1) (or, equivalently, for system (6.2)) the theorem of uniqueness and existence of solutions is satisfied. To be precise, Theorem 6.1. Consider problem (6.2) together with the initial condition x(t0 ) = x0 ∈ U ⊆ R2
(6.3)
and assume that f ∈ C(1)(U; R2). Then there exists an ϵ > 0 such that the solution to (6.2)–(6.3) exists and is unique for t ∈ (t0 − ϵ, t0 + ϵ).
It is important to note that the theorem is local: it guarantees the existence and uniqueness of the solutions only on a small interval of t. According to the general theory I can usually extend this unique solution to some larger interval (t0 − T−, t0 + T+), but, as simple examples
show (e.g., x˙ = x2 ), T ± do not have to be infinite. In other words, solutions can blow up in finite time. In the rest of these lectures I will safely ignore this fact by tacitly assuming that T ± = ±∞, and this always will be the case for all the models I will study. Therefore, I assume that problem (6.2)–(6.3) has a unique solution, which I denote x(t; x0 ), which is defined for all t ∈ (−∞, ∞). There are a lot of notions pertaining to (6.2) that I discussed already in the context of the first order ODE. The crucial distinctions will be clear a little later. Definition 6.2. The set U ⊆ R2 in which the solutions to (6.2) are defined is called the phase space or the state space of system (6.2). By the definition of the solutions to (6.2)–(6.3) we have that they represent a curve in the phase space parameterized by the time t. These curves called orbits or trajectories of (6.2). Definition 6.3. The set γ(x0 ) = {x(t; x0 ) : t ∈ R} is called an orbit of (6.2) starting at the point x0 . Positive and negative semi-orbits are defined as γ + (x0 ) = {x(t; x0 ) : t ∈ [t0 , ∞)} and
γ − (x0 ) = {x(t; x0 ) : t ∈ (−∞, t0 ]}
respectively. Some of the orbits are quite special. For example, Definition 6.4. A point x ˆ such that γ(ˆ x) = {ˆ x} is called an equilibrium point, or stationary point, or fixed point, or rest point of system (6.2). It should be obvious that Proposition 6.5. A point x ˆ is an equilibrium if and only if f (ˆ x) = 0. A very important property of the autonomous system (6.2) is presented in the following Proposition 6.6. If ϕ(t) ∈ U ⊆ R2 is a solution to (6.2), then ϕ(t − t0 ) is also a solution for any constant t0 . Proof. I am given that
$$\frac{d}{dt}\phi(t) = f(\phi(t)),$$
and I need to show that
$$\frac{d}{dt}\phi(t - t_0) = f(\phi(t - t_0))$$
is also true. Let me make the change of the independent variable in the last expression: τ = t − t0 . I have
$$\frac{d}{dt}\phi(\tau) = \frac{d}{d\tau}\phi(\tau)\,\frac{d\tau}{dt} = f(\phi(\tau)),$$
or, since $\frac{d\tau}{dt} = 1$,
$$\frac{d}{d\tau}\phi(\tau) = f(\phi(\tau)),$$
which is exactly where I started, only with τ instead of t. This means that ϕ(τ) is a solution, and hence, by returning to τ = t − t0, that ϕ(t − t0) is also a solution.
The orbits should not be confused with the integral curves: the graphs of the solutions in the space R × R2, given parametrically as (t, x1(t), x2(t)). Here is a simple illustration: consider the system
$$\dot x_1 = x_2, \qquad \dot x_2 = -x_1.$$
I can actually find the solutions of this system in explicit form because the system is linear. Assume that I consider two different initial conditions: x1 and x2 (here is a very common source of confusion: I use subscripts both to denote the elements of a vector, i.e., x = (x1, x2), and to distinguish two vectors, i.e., x1 = (x11, x12) and x2 = (x21, x22). Try not to confuse them.) Hence I have two solutions of the system: x1(t) = x1(t; x1) and x2(t) = x2(t; x2). If I plot these solutions in the 3D space (t, x1, x2) I have the integral curves (see the figure, where the red curves are the integral curves and the red dots denote the initial conditions); if I plot them on the plane (x1, x2) then I have the orbits (the blue curves in the figure), which are the curves together with the directions specified by t (I do not put directions on the integral curves because there is a natural direction of the time increase). There is one equilibrium x̂ = (0, 0), which is shown by a blue dot, together with the corresponding integral curve, which is parallel to the t-axis. Therefore, I can conclude that the orbits are the projections of the integral curves onto the phase space.
You can see in the figure that the orbits do not intersect. This is not an obvious observation in general, but it is true due to the fact that the system is autonomous.
Proposition 6.7. Two orbits of (6.2) either do not have common points or coincide.
Proof. Assume that x0 ∈ U ⊆ R2 is a point that belongs to two orbits. This implies that there are two solutions ϕ and ψ and t1 and t2 such that ϕ(t1) = ψ(t2). Consider χ(t) = ϕ(t + (t1 − t2)).
Figure 6.1: Integral (red) and phase curves (orbits, blue) of the system ẋ1 = x2, ẋ2 = −x1. See text for details
This function, due to Proposition 6.6, is also a solution. Moreover, the orbit corresponding to χ coincides with the orbit corresponding to ϕ because of its definition (we just use a different parametrization on the orbit, shifted by the value t1 − t2). On the other hand, χ(t2) = ϕ(t1) = ψ(t2), which yields, due to the theorem of existence and uniqueness of the solutions, that χ and ψ coincide. Therefore the orbits defined by ϕ and ψ coincide.
To repeat myself, the last proposition is not true for non-autonomous systems. This is why, while studying autonomous systems, I can restrict my attention to the phase space and the orbits in this space.
Now consider the group property of the solutions to (6.2).
Proposition 6.8. If x(t; x0) is a solution to (6.2), then
$$x(t_2 + t_1; x_0) = x(t_2; x(t_1; x_0)) = x(t_1; x(t_2; x_0))$$
for any t1, t2 ∈ R.
Proof. Consider two solutions to (6.2):
ϕ1(t) = x(t; x(t1; x0)),
ϕ2 (t) = x(t + t1 ; x0 ).
By the definition of a solution, x(0; x0) = x0. I have that ϕ1(0) = ϕ2(0), which, by the theorem of uniqueness and existence, implies that they coincide. If I plug t2 instead of t into these functions, I obtain the first of the required equalities. The second one is proved in a similar way.
The last proposition in particular shows that
$$x(-t; x(t; x_0)) = x_0.$$
In the language of algebra this means that the solution to (6.2), which is often called the flow of the system, forms a group acting on the phase (or state) space U. I will come back to this interpretation in due course.
Another useful viewpoint on the system (6.2) is to recognize that the right hand side actually defines a vector field at any point x ∈ U, i.e., for each point x ∈ U there is a vector $(f_1(x), f_2(x))^\top$, which is tangent to the orbit at this particular point and points in the direction of the time increase. I illustrate this concept with the same simple system
$$\dot x_1 = x_2, \qquad \dot x_2 = -x_1.$$
Note that the only point at which the direction of the vector field is not defined is the equilibrium x̂ = (0, 0), where the vector field vanishes.
Figure 6.2: The vector field of the system ẋ1 = x2, ẋ2 = −x1 (cf. the previous figure)
Using the notion of the vector field, I obtain almost immediately
Proposition 6.9. Any orbit of (6.2) different from an equilibrium point is a smooth curve.
Here by "smooth curve" I mean a curve that has a tangent vector at any point. Moreover,
Proposition 6.10. Any orbit of (6.2) belongs to one of three types: a smooth curve without self-intersections, a closed smooth curve (cycle), or a point. The solution corresponding to a cycle is a periodic function of t.
Proof. If the orbit is not a point then, according to the previous proposition, it is a smooth curve. The smooth curve is either closed or not. If it is closed, then, because at any point of this curve the vector field is non-zero and does not depend on t, there is a constant T > 0 such that the solution corresponding to the closed orbit satisfies $x(t + T; x_0) = x(t; x(T; x_0)) = x(t; x_0)$, i.e., it is a periodic function with period T.
It is impossible to draw all the orbits in the phase space. However, I can get a general impression of the orbits' behavior by looking at several key orbits, such as equilibria, cycles, and orbits connecting equilibria.
Definition 6.11. The partitioning of the phase space into orbits is called the phase portrait.
We already drew a number of phase portraits in the case of one first order ODE. There it was just a partitioning of the x-axis into orbits. Here I have the plane as the phase space, and therefore there are many more possibilities for the mutual positioning of the orbits (and therefore this problem is more complex than the one-dimensional case). You should look at the phase portrait of the Lotka–Volterra system to make sure you understand the meanings of the many concepts introduced in this lecture.
And a final remark: I was talking about systems of two first order differential equations. However, nowhere in the proofs did I use the fact that the phase space is two dimensional. All the statements in this lecture hold for any dimension, with the obvious caveat that already in three dimensions the phase portraits are quite difficult to present (although there will be some examples), and in dimension four and higher this nice geometric interpretation is of almost no use.
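The group property of Proposition 6.8 is easy to test numerically. Below is a small sketch of my own (an added illustration; the Lotka–Volterra system (5.2) with arbitrary parameters serves as the vector field): integrating for time t1 + t2 gives the same state as integrating for t1 and then, starting from the result, for t2.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lv(t, z, a=1.0, b=0.5, c=0.2, d=0.6):
    N, P = z
    return [a * N - b * N * P, c * N * P - d * P]

def flow(x0, t):
    """Approximate x(t; x0) for the autonomous system defined by lv."""
    sol = solve_ivp(lv, (0.0, t), x0, rtol=1e-10, atol=1e-12)
    return sol.y[:, -1]

x0 = np.array([1.0, 1.0])
t1, t2 = 3.7, 5.2

lhs = flow(x0, t1 + t2)            # x(t1 + t2; x0)
rhs = flow(flow(x0, t1), t2)       # x(t2; x(t1; x0))
print(lhs, rhs, np.allclose(lhs, rhs, atol=1e-6))
```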
7
Planar systems of linear ODE
Here I restrict my attention to a very special class of autonomous ODE: linear ODE with constant coefficients. This is arguably the only class of ODE for which explicit solution can always be constructed. Linear systems considered as mathematical models of biological processes are of limited use; however, such models still can be used to describe the dynamics of the system during the stages when the interactions between the elements of the system can be disregarded. Moreover, the analysis of the linear systems is a necessary step in analysis of a local behavior of nonlinear systems (linearization of the system in a neighborhood of an equilibrium).
7.1
General theory
The linear system of first order ODE with constant coefficients on the plane has the form
$$\dot x_1 = a_{11}x_1 + a_{12}x_2, \qquad \dot x_2 = a_{21}x_1 + a_{22}x_2,$$
(7.1)
or, in the vector notations, x˙ = Ax,
x(t) ∈ R2 ,
(7.2)
where x = (x1 , x2 )⊤ , and A = (aij )2×2 is a matrix with real entries: [ ] a11 a12 A= . a21 a22 Additionally to (7.2), consider also the initial condition x(0) = x0 .
(7.3)
To present the solution to (7.2)–(7.3), I first prove Proposition 7.1. Initial value problem (7.2)–(7.3) is equivalent to the solution of the integral equation ∫ t x(t) = x0 + Ax(ξ)dξ. (7.4) 0
Proof. Assume that x solves (7.4). Then, by direct inspection I have that x(0) = x0 . Moreover, since the right-hand side is given by the integral this implies that x ∈ C (1) , therefore, I can take the derivative to find (7.2). Now other way around, by assuming that x solves (7.2)–(7.3), integrating (7.2), and evaluating the constant of integration, I recover (7.4). Now I can use (7.4) to approximate the solution to (7.2)–(7.3) by the method of successive iterations. The first approximation is of course the initial condition x0 : ∫ t x1 (t) = x0 + Ax0 dξ = x0 + Ax0 t. 0
Next,
∫ x2 (t) = x0 +
t
Ax1 (ξ)dξ = x0 + Ax0 t + 0
A2 x0 t2 . 2
Or, continuing the process, ( ) At A2 t2 A n tn xn (t) = I + + + ... + x0 . 1! 2! n! Here I is the identity matrix. Please note that what is inside the parenthesis in the last formula is actually a matrix. However, the expression is so suggestive, given that you remember the Taylor series for the ordinary exponential function, exp(t) = et = 1 +
t t2 tn + + ... + + ..., 1! 2! n!
that it is impossible to resist the temptation to make Definition 7.2. The matrix exponent of the matrix A is defined as the infinite series exp(A) = eA = I +
A A2 An + + ... + + ... 1! 2! n! 50
(7.5)
This definition suggests that the solution to the integral equation (7.4), and, therefore, to the IVP (7.2)–(7.3), is of the form x(t) = eAt x0 , note that here I have to write x0 on the right to make sure all the operations are well defined. Before continuing the analysis of the linear system and actually proving that indeed the solution is given by the presented formula, I have to make sure that the given definition makes sense, i.e., all the usual series (there are four of them if matrix A is 2 × 2) converge. Proposition 7.3. Series (7.5) converges absolutely. Proof. Let |aij | ≤ a. For the product AA = A2 I have that the elements of this product are bonded by 2a2 . Similarly, for Ak it is 2k−1 ak = 2k ak /2. Since I have that ∞
1 ∑ 2k ak 1 = e2a , 2 k! 2 k=0
therefore the series in (7.5) converges absolutely to the matrix denoted eA .
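For a concrete feel for Definition 7.2, here is a small sketch of my own comparing a truncated version of the series (7.5) with scipy's expm (which is not computed via the series, but serves here as a reference value).

```python
import numpy as np
from scipy.linalg import expm

def exp_series(A, n_terms=30):
    """Truncated matrix exponential  I + A + A^2/2! + ...  (first n_terms terms)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, n_terms):
        term = term @ A / k          # this is A^k / k!
        result = result + term
    return result

A = np.array([[1.0, 3.0],
              [1.0, -1.0]])

print(np.max(np.abs(exp_series(A) - expm(A))))   # ~1e-15: the series converges quickly
```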
Now it is quite straightforward to prove that exp(At) solves (7.2). Proposition 7.4. d At e = AeAt . dt Proof. Since the series converges absolutely, I am allowed to differentiate the series termwise: ∞
∞
∞
k=0
k=0
k=1
∑ d A k tk ∑ Ak tk−1 d ∑ A k tk = = = AeAt . dt k! dt k! (k − 1)! The last proposition indicates that the matrix exponent has some properties similar to the usual exponent. Here is a good example to be careful when dealing with the matrix exponent. Example 7.5. Consider two matrices, [ ] 0 1 , A= 0 0
[
] 0 0 B= . −1 0
I claim that eA+B ̸= eA eB . Let me prove it. I have A2 = 0, therefore
[
] 1 1 e =I +A= . 0 1 A
51
Similarly,
] 1 0 . =I +B = −1 1 [
e
B
Therefore,
[
A B
e e
] 0 1 = . −1 1 [
Now C =A+B = I have
[
] −1 0 C = = −I, 0 −1 2
Therefore,
[ eC =
] 0 1 . −1 0
[
] 0 −1 C = = −C, 1 0 3
1 − 2!1 + 4!1 + . . . 1 − −1 + 2!1 − 4!1 + . . . 1 −
1 3! 1 2!
+ +
[
] 1 0 C = = I. 0 1 4
] [ ] + ... cos 1 sin 1 = , + ... − sin 1 cos 1
1 5! 1 4!
which proves that eA+B ̸= eA eB . In the last expression I used cos t =
∞ ∑ k=0
t2k (−1) , (2k)! k
sin t =
∞ ∑
(−1)k−1
k=1
t2k−1 . (2k − 1)!
Proposition 7.6. If [A, B] = AB − BA = 0, i.e., if matrices A and B commute, then e(A+B)t = eAt eBt . Proof. For the matrices that commute the binomial theorem holds: n ( ) ∑ ∑ Ai B j n An−k B k = n! (A + B) = . k i! j! n
i+j=n
k=0
Since, by the Cauchy product (
) ∞ ∞ ∑ ∑ Bj ∑ Ai B j = , i! j! i! j!
∞ ∑ Ai i=0
n=0 i+j=n
j=0
one has A B
e e
∞ ∑ ∞ ∑ ∑ Ai B j (A + B)n = = = eA+B . i! j! n! n=0 i+j=n
n=0
I can do these formal manipulations since all the series in the question converge absolutely. 52
As an important corollary of the last proposition I have Corollary 7.7. For the matrix exponent eA(t1 +t2 ) = eAt1 eAt2 , and
eAt e−At = I.
Proof. For the first note that At1 and At2 commute. For the second put t1 = t and t2 = −t.
Now I can actually prove that the solution to (7.2)–(7.3) exists, unique, and defined for all −∞ < t < ∞. Theorem 7.8. Solution to the IVP problem (7.2)–(7.3) is unique and given by x(t; x0 ) = eAt x0 ,
−∞ < t < ∞.
(7.6)
Proof. First, due to Proposition 7.4, d At e x0 = AeAt x0 , dt and also eA0 x0 = Ix0 = x0 , which proves that (7.6) is a solution. To prove that it is unique, consider any solution x of the IVP and put y(t) = e−At x(t). I have y(t) ˙ = −AeAt x(t) + e−At x(t) ˙ = −AeAt x(t) + e−At Ax = 0. In the expression above, I used two (non-obvious) facts: First, that the usual product rule is still true for the derivative of the product of two matrices, and that e−At A = Ae−At , i.e., that matrices A and e−At commute. You should fill in the details of the proofs of these statements. Hence I have that y(t) = C, therefore, by setting t = 0, I find y(0) = x0 , which implies that any solution is given by x(t) = eAt x0 . Since the last formulae is defined for any t, therefore, the solution is defined for −∞ < t < ∞.
53
7.2
Three main matrices and their phase portraits
Consider the solutions to (7.2) and their phase portraits for three matrices: [ ] [ ] [ ] λ1 0 λ 1 α β A1 = , A2 = , A3 = , 0 λ2 0 λ −β α where λ1 , λ2 , λ, α, β are real numbers. • A1 . I have, using the definition of the matrix exponent, that [ λt ] e 1 0 A1 t e = , 0 e λ2 t therefore the general solution to (7.2) is given (which can be actually obtained directly, by noting that the equations in the system are decoupled ) [ λt ] [ λ t 0] e 1 0 e 1x x(t; x0 ) = x0 = λ2 t 10 . λ t 2 0 e e x2 If λ1 ̸= 0 and λ2 ̸= 0 then I have only one isolated equilibrium x ˆ = (0, 0), the phase curves can be found as solutions to the first order ODE λ2 x2 dx2 = , dx1 λ1 x1 which is separable equation, and the directions on the orbits are easily determined by the signs of λ1 and λ2 (i.e., if λ1 < 0 then x1 (t) → 0 as t → ∞). Consider a specific example with 0 < λ1 < λ2 . In this case I have that all the orbits are parabolas, and the direction is from the origin because both lambdas are positive. The only tricky part here is to determine which axis the orbits approach as t → −∞, this can be done by looking at the explicit equations for the orbits (you should do it) or by noting that when t → −∞ eλ1 t ≫ eλ2 t and therefore x1 component dominates in a small enough neighborhood of (0, 0) (see the figure). The obtained phase portrait is called topological node (“topological” is often dropped), and since the arrows point from the origin, it is unstable (I will come back to the discussion of the stability a little later). As another example consider the case when λ2 < 0 < λ1 . In this case (prove it) the orbits are actually hyperbolas on (x1 , x2 ) plane, and the directions on them can be identifies by noting that on x1 -axis the movement is from the origin, and on x2 -axis it is to the origin. Such phase portrait is called saddle. All the orbits leave a neighborhood of the origin for both t → ±∞ except for five special orbits: first, this is of course the origin itself, second two orbits on x1 -axis that actually approach the origin if t → −∞, and two orbits on x2 -axis, which approach the origin if t → ∞. The orbits on x1 -axis form the unstable manifold of x ˆ = (0, 0), and the orbits on x2 -axis form the stable manifold of x ˆ. These orbits are also called the saddle’s separatrices (singular, separatrix ). There are several other cases, which need to be analyzed, let me list them all: – 0 < λ1 < λ2 : unstable node (shown in figure) 54
x2
x1
Figure 7.1: Unstable node in system (7.2) with matrix A1 . The eigenvectors v 1 and v 2 coincide with the directions of x1 -axis and x2 -axis respectively x2
x1
Figure 7.2: Saddle in system (7.2) with matrix A1 . Eigenvectors v 1 and v 2 coincide with the directions of x1 -axis and x2 -axis respectively – 0 < λ2 < λ1 : unstable node – 0 < λ1 = λ2 : unstable node – λ1 < λ2 < 0: stable node – λ2 < λ1 < 0: stable node – λ1 = λ2 < 0: stable node – λ1 < 0 < λ2 : saddle 55
– λ2 < 0 < λ1 : saddle (shown in figure) You should sketch the phase portraits for each of these cases. Also keep in mind that for now I exclude cases when one or both λ’s are zero. • A2 . To find eA2 t I will use the fact that [ ] [ ] λ 0 0 1 A2 = + 0 λ 0 0 and these two matrices commute. The matrix exponent for the first matrix was found in the previous point, and for the second it is readily seen that [ ] [ ] 0 1 1 t exp t= . 0 0 0 1 Therefore, e
A2 t
[ ] 1 t =e . 0 1 λt
Assume that λ < 0 (the cases λ = 0 and λ > 0) left as exercises. Now, first, we see that x(t; x0 ) → 0 as t → ∞, moreover, dx2 →0 dx1 as t → ∞, therefore the orbits should be tangent to x1 -axis. The figure is given below, the phase portrait is sometimes called the improper stable node. x2
x1
Figure 7.3: Improper node in system (7.2) with matrix A2 . Vector v 1 coincides with the direction of x1 -axis
56
• A3 . Here I will use
] ] [ α 0 0 β + A3 = −β 0 0 α [
as two commuting matrices to find [ e
A3 t
=e
αt
] cos βt sin βt . − sin βt cos βt
Hence the flow of the system (7.2) is given by [ x(t; x0 ) = e
A3 t
x0 = e
αt
] cos βt sin βt x . − sin βt cos βt 0
To determine the phase portrait observe that if α < 0 then all the solutions will approach the origin, and if α > 0, they will go away from origin. We also have components of eA3 t which are periodic functions of t, which finally gives us the whole picture: if α < 0 and β > 0 then the orbits are the spirals approaching origin clockwise, if α > 0 and β > 0 then the orbits are spiral unwinding from the origin clockwise, and if α = 0 then the orbits are closed curves. Here is an example for α < 0 and β < 0, this phase portrait is called the stable focus (or spiral ). If I take α = 0 and β < 0 then the phase portrait is composed of the closed curves and called the center (recall the Volterra–Lotka model). See the figure. x2
x2
x1
x1
Figure 7.4: Left: Stable focus (spiral) and right: center in system (7.2) with matrix A3 To determine the direction on the orbits, I can use the original vector field. For example, in the case α = 0 β < 0 I have that for any point x1 = 0 and x2 > 0 the derivative of x2 is negative, and therefore the direction is counter-clockwise.
57
7.3
A little bit of linear algebra
So why actually did I spend so much time on studying three quite simple particular matrices A1 , A2 , A3 ? It is because the following theorem is true: Theorem 7.9. Let A be a 2 × 2 real matrix. Then there exists a real invertible 2 × 2 matrix P such that P −1 AP = J , where matrix J is one of the following three matrices in Jordan’s normal form [ ] [ ] [ ] λ1 0 λ 1 α β (a) , (b) , (c) . 0 λ2 0 λ −β α Before turning to the proof of this theorem, let me discuss how this theorem can be used for the analysis of a general system (7.2): x(t) ∈ R2 .
x˙ = Ax,
(7.2)
Consider a linear change of variables x = P y for the new unknown vector y. I have P y˙ = AP y, or
y˙ = P −1 AP y = Jy,
therefore, y(t; y 0 ) = eJt y 0 , where I already know how to calculate eJt . Returning to the original variable, I find that x(t; x0 ) = P eJt P −1 x0 , full solution to the original problem. Moreover, I also showed that eAt = P eJt P −1 , which is often used to calculate the matrix exponent. Finally, the phase portraits for x will be similar to those of y, since the linear invertible transformation amounts to scaling, rotation, and reflection, as we are taught in the course of linear algebra. The only question is how to find this linear change P . For this, let me recall Definition 7.10. A nonzero vector v is called an eigenvector of matrix A if Av = λv, where λ is called the corresponding eigenvalue.
58
Generally eigenvalues and eigenvectors can be complex. To find the eigenvalues, I need to find the roots of the characteristic polynomial P (λ) = det(A − λI), which is of the second degree if A is a 2 × 2 matrix. Once the eigenvalues are found, the corresponding eigenvectors can be found as solutions to the homogeneous system of linear algebraic equations (A − λI)v = 0. Remember that eigenvectors are not unique and determined up to a multiplicative constant. Now I am in a position to prove Theorem 7.9. The proof also gives the way to find the transformation P. Proof of Theorem 7.9. Since the characteristic polynomial has degree two, it may have either two real roots, two complex conjugate roots, or one real root multiplicity two. I assume first that I either have two distinct real roots λ1 ∈ R ̸= λ2 ∈ R with the corresponding eigenvectors v 1 ∈ R2 and v 2 ∈ R2 or a real root λ ∈ R multiplicity two which has two linearly independent eigenvectors v 1 ∈ R2 and v 2 ∈ R2 . Now I consider the matrix P , whose columns are exactly v 1 and v 2 , I will use the notation P = (v 1 |v 2 ). It is known that the eigenvector corresponding to distinct eigenvalues are linearly independent, hence P is invertible (can you prove this fact?). Now just note that AP = (Av 1 |Av 2 ) = (λ1 v 1 |λ2 v 2 ) = P J , which proves the theorem for case (a). For case (b), assume that there is one real root of characteristic polynomial with the eigenvector v 1 . Then there is another vector v 2 , which satisfies (A − λI)v 2 = v 1 , which is linearly independent of v 1 (can you prove it?). Now take P = (v 1 |v 2 ), and AP = (λv 1 |v 1 + λv 2 ) = P J , where J as in (b). Finally, in case (c) I have λ1,2 = α ± iβ as eigenvalues and the corresponding eigenvectors v 1 ± iv 2 , where v 1 , v 2 are real nonzero vectors. Let me take P = (v 1 |v 2 ). Since A(v 1 + iv 2 ) = (α + iβ)(v 1 + iv 2 ), I have Av 1 = αv 1 − βv 2 ,
Av 2 = αv 2 + βv 1 .
Now AP = (αv 1 − βv 2 |βv 1 + αv 2 ) = P J , where J as in (c). The only missing point is to prove that v 1 and v 2 are linearly independent, which is left as an exercise. 59
Example 7.11. Consider system (7.2) with ] 1 3 . A= 1 −1 [
We find that the eigenvalues and eigenvectors are λ1 = −2,
v⊤ 1 = (−1, 1),
λ2 = 2,
v⊤ 2 = (3, 1).
Therefore, the transformation P here is [ ] −1 3 P = , 1 1 and J = P −1 AP =
[
] −2 0 . 0 2
The solution to system y˙ = J y, where y = P −1 x is straightforward and given by [ −2t ] e 0 y(t; x0 ) = y 0 e2t 0 and its phase portrait has the structure of a saddle (see the figure). To see how actually the phase portrait looks in x coordinate, consider solution for x, which takes the form x = P y = (v 1 eλ1 t |v 2 eλ2 t )y 0 = C1 v 1 eλ1 t + C2 v 2 eλ2 t , where I use C1 , C2 for arbitrary constants. Note that x changing along the straight line with the direction v 1 if C2 = 0, and along the straight line v 2 when C1 = 0. The directions of the flow on these lines coincide with the directions of the flow on the y-axes for the system with the matrix in the Jordan normal form (see the figure). This is how we can see the phase portrait for the linear system of two autonomous ODE of the first order. Not taking into account the cases when one or both eigenvalues are zero, we therefore saw all possible phase portraits a linear system (7.2) can have. Now it is time discuss stability.
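Example 7.11 is easy to reproduce with numpy (a sketch of mine; note that numpy returns eigenvectors normalized to unit length, so they are scalar multiples of the eigenvectors chosen above, and the order of the eigenvalues may differ).

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [1.0, -1.0]])

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                       # eigenvalues 2 and -2 (possibly in another order)

P = eigvecs                          # columns are eigenvectors of A
J = np.linalg.inv(P) @ A @ P
print(np.round(J, 10))               # diagonal matrix with the eigenvalues of A
```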
7.4
Stability of the linear system (7.2)
In what follows I will assume that det A ̸= 0, i.e., this means that there is only one isolated equilibrium of system x˙ = Ax, x(t) ∈ R2 , (7.2) which is the origin: x ˆ = (0, 0). To define stability of this equilibrium, and therefore stability of the linear system itself, I need a notion of a neighborhood and distance in the set R2 . I will use the following generalization of the absolute value to vectors x = (x1 , x2 )⊤ ∈ R2 : ( )1/2 |x| = (x1 )2 + (x2 )2 . 60
x2
y2
v2
y1
x1
v1
Figure 7.5: Saddle point after the linear transformation (left), and the original phase portraits (right). The coordinates are connected by the relation x = P y, where P is defined in the text Then distance between two vectors x1 ∈ R2 and x2 ∈ R2 is simply |x1 − x2 |. Using this convenient notation, now I will verbatim repeat my definition of stability of equilibria of the scalar autonomous ODE. To wit, Definition 7.12. An equilibrium x ˆ is called Lyapunov stable if for any ϵ > 0 there exists a δ(ϵ) > 0 such that for any initial conditions |ˆ x − x0 | < δ, the flow of (7.2) satisfies |ˆ x − x(t; x0 )| < ϵ for any t > t0 . If, additionally, |ˆ x − x(t; x0 )| → 0, when t → ∞, then x ˆ is called asymptotically stable. If for any initial condition x0 the orbit γ(x0 ) leaves a neighborhood of x ˆ, then this point is called unstable. The analysis of linear systems is simple since I actually have the flow of the system given by x(t; x0 ) = eAt x0 . Case by case analysis from this lecture allows me to formulate the following theorem:
61
Theorem 7.13. Let det A ̸= 0. Then the isolated equilibrium point x ˆ = (0, 0) of planar system (7.2) is asymptotically stable if and only if for the eigenvalues of A it is true that Re λ1,2 < 0. If Re λ1,2 = 0 then the origin is Lyapunov stable, but not asymptotically stable (center). If for at least one eigenvalue it is true that Re λi > 0 then the origin is unstable. Therefore I can have asymptotically stable nodes, improper nodes, and foci, Lyapunov stable center, and unstable nodes, improper nodes, foci, and saddles (note that the latter are always unstable). I can summarize all the information in one parametric portrait of linear system (7.2). For this it is useful to consider the characteristic polynomial of A as P (λ) = λ2 − (a11 + a22 )λ + (a11 a22 − a12 a21 ) = λ2 + λ tr A + det A, where I use the notation tr A = a11 + a22 to denote the trace of A. I have √ tr A ± (tr A)2 − 4 det A λ1,2 = , 2 therefore the condition for the asymptotic stability becomes simply tr A < 0,
det A > 0.
Using the trace and determinant as new parameters I can actually present possible linear systems as in the following figure det A det A = stable foci
(tr A)2 4
unstable foci
stable nodes
unstable nodes tr A
saddles
saddles
Figure 7.6: The type of the linear system depending on the values of tr A and det A. The centers here are situated where det A > 0 and tr A = 0
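The classification of Figure 7.6 can be turned into a few lines of code (my own sketch; the labels are informal and the borderline cases with det A = 0 are left out).

```python
import numpy as np

def classify(A):
    """Type of the equilibrium (0, 0) of x' = Ax from tr A and det A (2x2, det != 0)."""
    tr, det = np.trace(A), np.linalg.det(A)
    if det < 0:
        return "saddle"
    disc = tr**2 - 4 * det
    if disc < 0:
        return "center" if tr == 0 else ("stable focus" if tr < 0 else "unstable focus")
    return "stable node" if tr < 0 else "unstable node"

print(classify(np.array([[-5.0, 1.0], [1.0, -5.0]])))   # stable node
print(classify(np.array([[0.0, 2.0], [-2.0, -1.0]])))   # stable focus
print(classify(np.array([[1.0, 3.0], [1.0, -1.0]])))    # saddle
```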
7.5
Bifurcations in the linear systems
I came to the main point of this lecture. I have four parameters in linear systems (7.2), four entries of matrix A. I can consider variations of these parameters, and the structure of the phase portrait of the linear system will be changing. For example, if I cross the boundary det A = 0 in 62
the last figure for negative tr A, then the saddle becomes the stable node. Is this a bifurcation in the system? Or, when crossing the curve det A = 14 (tr A)2 for positive values of tr A the unstable focus turns into the unstable node. Is this change enough to call it a bifurcation? Recall that for the first order equations the definition of bifurcation was based on the notion of topological equivalence, which identified equations with the same orbit structure as being topologically equivalent. But now I can have much richer orbit structures because I am not confined any longer to the phase line, now I am dealing with the phase plane. This is a quite complicated subject, therefore, I will mostly state results, proofs can be found elsewhere2 . First, there is a notion of topological equivalence, which includes the one that we already discussed, as a particular case. Definition 7.14. Two planar linear systems x˙ = Ax and x˙ = Bx are called topologically equivalent if there exists a homeomorphism h : R2 −→ R2 of the plane, that is, h is continuous with continuous inverse, that maps the orbits of the first system onto the orbits of the second system preserving the direction of time. It can be shown that the notion of topological equivalence is indeed an equivalence relation, i.e., it divides all possible planar linear system into distinct non-intersecting classes. The following theorem gives the topological classification of the linear planar system. It is also convenient to have Definition 7.15. An equilibrium x ˆ of x˙ = Ax is called hyperbolic if Re λ1,2 ̸= 0, where λ1,2 are the eigenvalues of A. Matrix A as well as system x˙ = Ax itself are also called hyperbolic in this case. Theorem 7.16. Two linear systems with hyperbolic equilibria are topologically equivalent if and only if the number of eigenvalues with positive real part (and hence the number of eigenvalues with negative real part) is the same for both systems. I usually have this large zoo of the equilibria: nodes, saddles, foci, but from topological point of view there are only three non-equivalent classes of hyperbolic equilibria: with two negative eigenvalues (a hyperbolic sink), one negative and one positive (a saddle), and with two positive eigenvalues (a hyperbolic source). Definition 7.17. A bifurcation is a change of the topological type of the system under parameter variation. Now I state the result that asserts that it is impossible to have bifurcations in the linear system when the equilibrium is hyperbolic. Proposition 7.18. Let A be hyperbolic. Then there is a neighborhood U of A in R4 such that for any B ∈ U system x˙ = Bx is topologically equivalent to x˙ = Ax. Proof. The eigenvalues as the roots of the characteristic polynomial depend continuously on the entries of A. Therefore, for any hyperbolic equilibrium a small enough perturbation of the matrix will lead to the eigenvalues that have the same sign of Re λ1,2 . Which implies, by Theorem 7.16, that this new system will be topologically equivalent to the original system. 2
I recommend for a thorough treatment of the subject the following textbook Hale, J. K., & Ko¸cak, H. (2012). Dynamics and bifurcations. Springer
63
Introducing another term, I can say that a property about 2 × 2 matrices is generic if the set of matrices possessing this property is dense and open in R4 . It can be proved that hyperbolicity is a generic property. I will not go into these details, but rephrase the previous as follows: almost all matrices 2 × 2 are hyperbolic. This brings an important question: Do we really need to study non-hyperbolic matrices and non-hyperbolic equilibria of linear systems? The point is that, in real applications the parameters of the matrix are the data that we collect in our experiments and observations. These data always have some noise in it, we never know it exactly. Therefore, only hyperbolic systems seem to be observed in the real life. However, this is not the case. Very often, systems under investigation may possess certain symmetries (such as conservation of energy). Another situation, which is more important for our course, that all the system we study contain parameters, whose values we do not know. It is therefore unavoidable that under a continuous parameter change the system matrix will be non-hyperbolic, and at some point we will need to cross the boundary between topologically non equivalent behaviors. Taking into account the Jordan normal forms for 2 × 2 matrices, I can consider the following three parameter dependent matrices: ] [ ] [ [ ] −1 0 µ1 1 µ 1 (a) , (b) , (c) . 0 µ µ2 µ1 −1 µ These three matrices becomes non-hyperbolic when µ = 0 or µ1 = µ2 = 0. For example in the case (a) when we perturb µ around zero the matrix with tr A < 0 and det A < 0 becomes a matrix with tr A < 0 and det A > 0, i.e., a topological sink turns into a saddle (or vice verse). In the third case I have the change from tr A > 0 and det A > 0 to tr A < 0 and det A > 0, i.e., a topological sink becomes a topological source. In the case (b) the situation is more involved since I have two parameters, and I will not discuss it here. To conclude, I say that for cases (a) and (c) it is enough to have one parameter, and a bifurcation of codimension one occurs, whereas for the case (b) I face a codimension two bifurcation.
7.6
Homework 4: Analysis of linear planar systems
1. Draw the phase portraits of each of the following systems of differential equations x˙ = Ax with (a)
(b)
(c)
[ ] −5 1 A= ; 1 −5 [
] 0 −1 A= ; 8 −6 [
] 4 −1 A= ; −2 5
(d) A=
64
[ ] −4 −1 ; 1 −6
(e)
[ A=
(f)
] 3 −1 ; 5 −3
[
] 0 2 A= ; −2 −1
(g)
[
] 1 −1 A= . 5 −3
2. The equation of motion of a spring-mass system with damping is m¨ x + cx˙ + kx = 0, where m, c, k are positive parameters (m is the mass, c is the damping coefficient, and k is the constant in Hook’s law). Convert this equation to a system of first order equations for the new variables x1 = x, x2 = x˙ 1 and draw all possible phase portraits (depending on the parameters) of the system. For each different phase portrait find the solution and sketch graph of it. Distinguish the overdamped, critically damped, and underdamped cases. What happens with the system if c < 0? 3. Lanchester’s model of combat. Let B(t) and R(t) be the numbers of “Blue” and “Red” combatants at time t — each of which is an idealized fire source — and b, r > 0 are their respective fire effectiveness. With no air power and no reinforcement, the equations take the form R˙ = −bB, B˙ = −rR, or, in words, the attrition rate of each belligerent is proportional to the size of the adversary. • Sketch the phase portrait of this system and show that for the positive initial conditions (the positive quadrant is all we care about) it is possible that either of combatants wins, depending on the values of r, b, B(0), R(0). • Find the equation for the orbits on the plane B, R for the initial conditions B0 , R0 . • Conclude that in case of B(t) = R(t) → 0 when t → ∞ (stalemate) the analysis yields the Lanchester Square Law bB02 = rR02 , √
or B0 =
r R0 , b
which implies that in order to stalemate an adversary three times as numerous, it does not suffice to be three times as effective; you must be nine times as effective! 65
4. Show that every orbit of x˙ = Ax with [
0 4 A= −9 0
]
is an ellipse.
8 8.1
Linear systems in Rd General theory
In the previous lecture I discussed a few things about planar linear autonomous ODE. However, most of this discussion can be extended to d-dimensional space Rd without much of a change. Especially simple are the formulations in the generic case, and I will stick to it to give an idea what can be expected in such systems. I consider x˙ = Ax, x(t) ∈ Rd , d ≥ 1, (8.1) with the initial condition x(0) = x0 ∈ Rd .
(8.2)
The unique solution to (8.1)–(8.2) is given by the same formula x(t; x0 ) = eAt x0 , where the matrix exponent is defined in the previous lecture. I assume that matrix A is hyperbolic. To wit, let d be the number of eigenvalues of A counting multiplicities, and let d− , d+ , and d0 denote the number of eigenvalues with negative, positive and zero real parts respectively. I have d0 + d− + d+ = d. Definition 8.1. System (8.1), as well as matrix A, as well as the equilibrium x ˆ = 0 ∈ Rd , are called hyperbolic if d0 = 0. Moreover, x ˆ is called a hyperbolic saddle if d+ d− ̸= 0. To be hyperbolic is a generic property: almost all the matrices are hyperbolic. Moreover, a hyperbolic system (8.1) can have only ˆ because in this case I have that ∏ one equilibrium x det A ̸= 0 (recall that I have det A = di=1 λi ). Another generic property is to have distinct eigenvalues (i.e., there are no multiple eigenvalues). In this case I know from the linear algebra that the list of eigenvectors (v 1 , . . . , v d ) corresponding to the eigenvalues is linearly independent and hence forms a basis of Rd (if matrix A is real then I always have the basis of real vectors in the sense that for the complex conjugate pair of eigenvalues λi and λj+1 = λj I take Re v j and Im v j as the real basis vectors). Let me denote T− , T+ , T0 the subspaces formed by the spans of the vectors corresponding to the eigenvalues with negative real part, positive real part, and zero real part respectively. They are called stable, unstable, and neutral subspaces respectively. If all the eigenvalues are distinct then Rd = T− ⊕ T+ ⊕ T0 , i.e., any element of v ∈ Rd can be uniquely represented v = v− + v+ + v0, 66
where v − ∈ T− , v + ∈ T+ , v 0 ∈ T0 . An important property of T± , T0 is their invariance with respect to the flow defined by (8.1). This means that if the initial condition, e.g., x0 ∈ T− then x(t; x0 ) ∈ T− for any t → ±∞. Proposition 8.2. Subspaces T± , T0 are invariant with respect to the flow of (8.1). Proof. Consider, e.g., x0 ∈ T− . It means that x0 = C1 v 1 + . . . + Cd− v d− , such that for each v i there is λi such that Re λi < 0. Consider vector-function ϕ(t) = C1 eλ1 t v 1 + . . . + Cd− eλd− t v d− . I have ϕ(0) = x0 and this function satisfies (8.1) (check it). Therefore x(t; x0 ) = ϕ(t) is the unique solution, and by construction ϕ(t) ∈ T− for any t. The general theory discussed so far implies Theorem 8.3. Let Rd = T− ⊕ T+ ⊕ T0 and det A ̸= 0. Then the unique equilibrium x ˆ is Lyapunov stable if and only if T+ = ∅. u ˆ is asymptotically stable if and only if T0 = T+ = ∅. The condition Rd = T− ⊕ T+ ⊕ T0 means that I have exactly d linearly independent eigenvectors of A. This is not always true in the case when there are eigenvalues of multiplicities larger than one. But even in this case I have Theorem 8.4. Consider (8.1). If for all eigenvalues of A I have Re λi < 0 then the unique equilibrium x ˆ is asymptotically stable. 8.1.1
Examples of the phase portraits in R3
Consider several examples of three dimensional hyperbolic equilibria and their phase portraits. The first example is for −1 0 0 A1 = 0 −2 0 . 0 0 1 Here I have the eigenvalues −1, −2, 1 and the corresponding eigenvectors coincide with the coordinate axes. According to the general theory I have a stable subspace T− spanned by two standard coordinate vectors e1 = (1, 0, 0) and e2 = (0, 1, 0), whereas the subspace spanned by e3 = (0, 0, 1) is unstable (see the figure). The second example is for 1/2 2 0 A2 = 8 1/2 0 , 0 0 −1 and hence the eigenvalues are 1/2 ± 4i and −1. The two eigenvectors corresponding to the complex conjugate eigenvalues can be used to form a two dimensional real subspace T+ , which is unstable. The vector e3 spans the stable subspace T− . 67
T+
T−
T−
x ˆ
T+ x ˆ
x3
x3
x2
x2
x1
x1
T−
T+
x ˆ
x ˆ
x3
x3
x2
x2
x1
x1
Figure 8.1: Three dimensional phase portraits. A hyperbolic saddle with two dimensional stable subspace T− and one dimensional unstable subspace T+ for matrix A1 (see the text for details). A hyperbolic saddle with one dimensional stable subspace T− and two dimensional unstable subspace T+ for A2 . A hyperbolic sink with three dimensional stable subspace T− = R3 for A3 . A hyperbolic source with three dimensional unstable subspace T+ = R3 for A4 For the third example I picked −1/2 2 0 −1/2 0 , A3 = 8 0 0 −1 68
and hence all three eigenvalues have negative real parts. Therefore in this case my state space R3 coincides with the stable subspace T− . Finally, take 1 0 0 A4 = 0 3 0 , 0 0 2 with the real negative eigenvalues. The phase space here coincides with the unstable subspace T+ . 8.1.2
Routh–Hurwitz criteria
The asymptotic stability of the equilibrium point of the linear system is determined by the condition Re λi < 0, where λi are the roots of P (λ) = λd + a1 λd−1 + . . . + ad−1 λ + ad . Therefore it is of great use to have a condition which, without explicit calculations of the roots, would provide us with information on the signs on the real parts of the roots. One of such conditions, and arguably most used, is the Routh–Hurwitz criterion. I just formulate it here, proofs can be found elsewhere3 . Consider a sequence of matrices [ ] a1 1 0 a1 1 H 1 = a1 , H 2 = , H 3 = a3 a2 a1 , . . . a3 a2 a5 a4 a3 a1 1 0 . . . a3 a2 a1 . . . H d = a5 a4 a3 . . . .. . 0
0
0
0 0 0 , . . . ad
where H j are the main corner minors of the last matrix H d . H d is written as follows. First I put on the main diagonal the coefficients from a1 to ad . After this I fill the columns such that the column with odd index can have only odd coefficients, and the columns with even indexes have only even coefficients. I put 1 for a0 and zero for any coefficients ak for k < 0 and k > d. Theorem 8.5 (Routh–Hurwitz). For all the roots of the characteristic polynomial P (λ) to have negative real parts it is necessary and enough that det H i > 0,
i = 1, . . . , d.
3
e.g., Gantmacher, F. R., & Brenner, J. L. (2005). Applications of the Theory of Matrices. Courier Corporation.
69
Corollary 8.6. For d = 2, 3, 4 the necessary and sufficient conditions for the characteristic polynomial to have all the roots with negative real part can be written as follows: d = 2 : a1 > 0, a2 > 0. d = 3 : a1 > 0, a3 > 0, a1 a2 > a3 . d = 4 : a1 > 0, a3 > 0, a4 > 0, a1 a2 a3 > a23 + a21 a4 . Corollary 8.7. The following necessary condition is true: If all the roots of the characteristic polynomial have negative real parts then aj > 0,
9 9.1
j = 1, . . . , d.
Power laws Introduction to power laws
The mathematical models of biological systems I considered so far are conceptual in some sense. That is, they describe qualitative phenomena, and less suitable for quantitative description of available data. This does not mean, of course, that the models I discussed can be used for a quantitative analysis (e.g., it is a fact that the exponential growth does describe the population increase when the supply is virtually unlimited, and the logistic equation can be used to estimate the population carrying capacity in some cases), but such applications should be performed with great care and understanding of a huge amount of simplifying assumptions put into these models. Sometimes, however, our models should be able to describe an observed phenomenon not only at a qualitative level, but also quantitatively. In this lecture I plan to discuss one such phenomenon and possible mathematical models explaining it. Let me start with a biological example. In biological classification species is the lowest possible rank, and species are grouped together in genera (or, singular, genus). The next taxonomic unit is a family. Therefore, I can talk about the distribution of the number of species in different genera in a given family. Such data can be collected and analyzed. An example of such data is given in the figure below (this example is borrowed from a wonderful paper by G. Yule4 ) Here the distribution of the sizes of the genera is shown for the family Chrysomelidx. The word “distribution” means that the number of genera with one species was counted, with two species, with three species, and so on, and after it these numbers were plotted against species numbers (to be more precise, for genera with more than 9 species actually intervals of number of species were considered, i.e., the number of genera with 9–11 species, with 12–14 species, with 15–20 species, and so on, the data can be found in the cited paper). Also note that both axes in the figure have logarithmic scaling, i.e., the natural logarithms of the number of genera are plotted versus the natural logarithms of the number of species in a given genera. The observations clearly follow a very simple pattern, namely, in double logarithmic coordinates they are very close to a strait line. That is, if I denote p(x) the number of genera (or sometimes 4 Yule, G.U. (1925). A mathematical theory of evolution, based on the conclusions of Dr. J.C. Willis, F.R.S. Philosophical Transactions of the Royal Society of London. Series B, Containing Papers of a Biological Character, 213(402-410), 21-87.
70
it is more convenient to talk about frequency, i.e., the number divided by the total number of observations) with x species, then I have log p(x) = a − α log x, where a and α are positive constants. Therefore, in the usual coordinates p(x) =
A , xα
A := ea .
(9.1)
Distributions of the form (9.1) are said to follow a power law. It is a very surprising fact (well, for me at least) how many quantities around us follow a power law asymptotically, starting from some x0 . The constant α is called the exponent of the power law. Here is another example. Consider a distribution of the cites in USA with population more than 40000 (I pick this number arbitrary). There are slightly over 1000 such cities, and it is clear that the number of cities with large populations is very small (I cannot have more than 40 cities of the size of New York in USA not to exceed the total population). You can see a histogram of this distribution in the figure below. By “histogram” I mean that I took the whole interval from 4×105 to 8.2×106 (the population of New York), divided into 50 intervals√or bins (50 is taken as an example, a good rule of thumb is to take the number of intervals as N , where N is the number of observations, in my case N = 1036) of equal length, and counted the number of cities in each interval, after that I plot these numbers versus the centers of the intervals. You can see that both attempts to present the data are not very successful: In the left graph the number of cities with relatively small populations dominate the figure, in the right figure (with double logarithmic coordinates) the fact that the number of cities with large populations is very small brings a lot of noise. To overcome this difficulty I can follow G.Yule and make logarithmic binning, i.e., I divide not into the intervals of equal length, but such that the length of the second interval β times bigger than the first one, where β is a suitably chosen constant, and so on. You can see the result of these manipulations in the figure below. This time the result is better, especially in 71
the logarithmic coordinates, but still the histograms contain a lot of obvious noise, especially for large cities. Let me change the number of bins (intervals) from 50 to 11. Here what I get: Therefore, based on the presented data it is quite reasonable to formulate a hypothesis that the city population distribution follows a power law. Instead of playing with the number of bins and different binning strategies, it is often more convenient to represent the data in a different way. Namely, consider again p(x), which I define now as the proportion of the quantity of interest in the interval from x to x + dx, where dx is sufficiently small. Since p(x) gives a proportion, I should have ∫ ∞ p(x)dx = 1, xmin
where xmin > 0, because otherwise the integral will diverge. In the language of probability theory, my p(x) is called the probability density function. The chance that a given randomly picked city will have a population from x1 to x2 (note that it does not make much sense to talk that a city has a population exactly 45672, I need to provide an interval to get a meaningful answer) is given then ∫ x2
P{x1 ≤ x ≤ x2 } =
p(x)dx, x1
and here I use the notation P{A} to denote the probability (chance) of event A. Now instead of p(x) consider function ( )−α+1 x A −(α−1) x = F (x) = P{Population of a city not less than x} = p(ξ)dξ = , α−1 xmin x ∫∞ which also follows the power law (here I found A such that xmin p(x)dx = 1). The advantage of F is that this function is well defined for any x given the data, I simply need to calculate the proportion (or the number) of cities, whose population is bigger than x. If F follows the power law, therefore the data should have the power law distribution. The graph of F is often called rank or frequency plot, because the value F (x) is proportional to the rank of x (see the ∫
Number of cities
Number of cities
æ 800 600 400
∞
1000 æ 500
100 æ
50
æ æ 10
æ
5
200 æ 0 ææææææææææææææææææææææææææææææææææææææææææææææææ 0 2 ´ 106 4 ´ 106 6 ´ 106 8 ´ 106
Population of cities
æ æ
1 2 ´ 105
5 ´ 105
æ 1 ´ 106
ææ æ æ
æ
æ
ææ ææææææææææææ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ 2 ´ 106 5 ´ 106
Population of cities
Figure 9.1: The distribution of the city population in USA for the cities with more than 40000 people. Left: Usual coordinates, right: double logarithmic coordinates 72
Number of cities
Number of cities
120 æ æ æ 100 æ æ 80 æ 60 æ æ æ 40
æ æ æ 20 æ æ æ æ æ æ ææ ææ æ æææææææææææææ æ æ æ æ æ æ æ æ 0 0 2 ´ 106 4 ´ 106
æ 100 æææ æ ææææ 50 ææ ææ ææ 20 æ æ 10
æææ æ æ æ
æ æ ææ æ æ æ æ ææ æ æ æ
5 2
æ æ 6 ´ 106
æ
æ
æ
1
æ ææ æ ææ ææææææ 5 ´ 105 1 ´ 106 2 ´ 106 5 ´ 106
1 ´ 105 2 ´ 105
Population of cities
æ
Population of cities
Figure 9.2: The distribution of the city population in USA for the cities with more than 40000 people. Left: Usual coordinates, right: double logarithmic coordinates. In both cases the logarithmic binning is used, number of bins 50 500 æ
æ
Number of cities
Number of cities
500 400
300 æ 200 æ 100 æ ææ
0 0
æ æ
100
æ 50 æ æ 10
æ æ
5
æ æ
1 ´ 10
æ 6
æ 6
2 ´ 10
6
3 ´ 10
æ 4 ´ 106
æ 6
5 ´ 10
6
5
6 ´ 10
1 ´ 10
5
2 ´ 10
Population of cities
5
5 ´ 10
6
1 ´ 10
2 ´ 106
æ æ 5 ´ 106
Population of cities
Figure 9.3: The distribution of the city population in USA for the cities with more than 40000 people. Left: Usual coordinates, right: double logarithmic coordinates. In both cases the logarithmic binning is used, number of bins is 11 figure). Here I see exactly the same pattern that the data in double logarithmic coordinates can be describes by a straight line, and hence the data themselves follow the power law. To impress you even more, consider now the word count is a book (I picked On the Origin of Species by Charles Darwin, but a similar picture can be observed for other texts), and consider the distribution of the counts in a text, i.e., how many words were used only once, how many words were used twice, three times, and so on. It is quite clear that there should be a significant number of words that were used only few times, and not so many words that are used in almost every paragraph, but what is the actual distribution of these numbers? Just for curious, here are 10 words that appear most often: the, of, and, in, to, a, that, have, be, as and here are several examples of the words that appear only once: last, decision, personal, drew, patiently, 1837, philosophers, mysteries, INTRODUCTION. 73
æ 1000 æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ 500 æ æ æ æ æ æ æ æ
æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ ææ æ æ æ æ ææ ææ ææ æ æ æ æ æ ææææ æææ æ æ æ æ ææ ææ æ æ æ
F (x)
100 50
10 5
æ æ æ ææ æ
æ æ æ
1
æ 1 ´ 105
2 ´ 105
5 ´ 105
1 ´ 106
2 ´ 106
5 ´ 106
Population of cities
Figure 9.4: The function F (x) = #{Cities with population not less than x} versus the city population in double logarithmic scale The function F , which gives the number of words used not less than x times can be plotted (see the figure below), and the result, which shows that is not unreasonable to put forward the hypothesis of the power law, is known in linguistics as Zipf ’s law. æ æ
F (x)
1000
100
10
ææ æææ ææææ æææææ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ ææ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æææ æ æ ææ æ ææ æ æ ææ æ ææ æ æ æ æ æææ ææ ææ æ
æ æ æ æ
1
æ 1
10
100
1000
104
Word usage
Figure 9.5: The function F (x) = #{Words used more than x times} versus x in double logarithmic scale The list of examples can be easily continued. In particular, the available data suggest that the following quantities obey the power law: citations of scientific papers, copies of books sold, magnitude of earthquakes, intensities of wars, wealth of reaches Americans, frequencies of family names, etc (for those interested to read much more about power laws I recommend this paper5 ). Here are a few points about power laws. In the literature a more general definition of the power law distribution is used. A quantity X is said to have a power law distribution with 5
Clauset, A., Shalizi, C.R., & Newman, M.E. (2009). Power law distributions in empirical data. SIAM review, 51(4), 661-703.
74
exponent α if it has a probability density function p(x) ∼ Ax−α , i.e., if p(x) behaves similarly to the exact power law asymptotically, for large x. Since one assumes that usually X takes values in the interval (xmin , ∞), then it is necessary to request that α > 1 to guarantee that the integral ∫ ∞ p(x)dx xmin
converges. Very important quantities that describe X are its moments; two most important are the expectation EX (the first moment) and the variance Var X (the second central moment), they are defined as ∫ ∞ ∫ ∞ ( )2 EX = xp(x)dx, Var X = x − E X p(x)dx. xmin
xmin
The expectation shows the average value of X and the variance shows the spread of X around E X, the bigger the variance the less predictable quantity X becomes. A very important fact is that for the variance to exist α has to be bigger than 3 and for the expectation to exist α has to be bigger than 2. It turns out that the estimates of α from available data usually show that 2 < α < 3, i.e., the observed quantities follow the power law with infinite variance. This is why such distributions usually called fat-tailed distributions. Example 9.1 (80/20 law). Let me ask the question where the majority of the distribution of X lies if X has a power law distribution. Consider for example such value m that ∫ ∞ ∫ 1 ∞ p(x)dx = p(x)dx. 2 xmin m m is called the median of the distribution. For the power law I find that m = 21/(α−1) xmin . For example, if X is the wealth, then m is such income that exactly 50% people have smaller income than m and 50% of people have bigger income than m. I can also ask how much income lies in these two halves. The fraction of money of the reacher half is given by ∫∞ α−2 m xp(x)dx ∫∞ = 2− α−1 , xmin xp(x)dx provided α > 2 and the integrals converge. Therefore, if, say α = 2.1 (estimated from data), then the fraction of 2−0.091 ≈ 0.94 of the wealth in the hands of the richest half. A generalization can be made that the fraction α−2 W (p) = p α−1 of the wealth in the hands of the richest p. Using this expression, I can see that approximately 80% of the wealth in the hands of the richest 20%, hence the famous 80/20 law. Finishing this section I would like to mention that power law distributions are also very often called scale-free distributions. I saw several different explanations in the literature to justify the use of this term, however none of them is very convincing, therefore, it would be a much better practice to stick to more informative “power law distribution”. 75
9.2
Birth–death–innovation model of protein domain evolution
Proteins consists of domains, that can be defined as structural elements of a given protein that can function and evolve independently of the rest of the protein chain. One of the main mechanisms of genome evolution is duplication, which corresponds to appearance of two copies of the same sequence of DNA in genome. If a particular DNA sequence corresponds to a protein domain, then after duplication of this sequence there will be two similar domains that belong to the same family. It turns out that the distribution of domain families follows asymptotically the power law (see the figure). æ 1000
F (x)
500
100 50
10
æ
æ æ ææ æææ æææ æææ æææææ ææ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ ææ æ ææ æ æ æ æ æ ææ ææ æ æ æ æ ææ æ æ æ ææææ æææ æ ææ æææ æ æ æ æ
5
æ ææ æ æ æ æ
1
æ 1
5
10
50
100
500
1000
Number of domains
Figure 9.6: The function F (x) = #{Families with not less than x domains} versus x in double logarithmic scale for Homo Sapience In this section I present a mathematical model of the domain evolution that produces the power law with exponent α = 1. Consider possible elementary events in genome evolution: • Domain birth. This is usually caused by duplication and forces a family with k domains becomes the family with k + 1 domains; • Domain death. This is usually caused by inactivating mutations and results in a family with k − 1 domains if there were originally k domains; • Domain innovation. This event gives birth to a new domain family with 1 domain. This can happen for several reasons, one of which mutation in one of the existing domain such that the domain stays functional, but now it is so different from the rest of his siblings that I count it as a new domain family. Assume that it is possible to have maximum d domains in a family, and let xk be the number of families with k domains, k = 1, . . . , d. Assume that the rates of duplications and inactivating mutations are constant per one domain and denote them λ and µ respectively. I have that the rate of change of the number of domain families with only one domain is given by x˙ 1 = −(λ + µ)x1 + 2µx2 + ν. 76
Here −λx1 term corresponds to duplication that makes one of such families a family with two domains, −µx1 term corresponds to possible mutation that destroys one family, 2µx2 describes possible inactivating mutations in the families with two domains, and coefficient 2 comes from the fact that each family among x2 has 2 domains, finally ν denotes a constant rate of domain innovations. For the rest of the variables I have x˙ k = (k − 1)λxk−1 − k(λ + µ)xk + (k + 1)µxk+1 ,
k = 2, . . . , d − 1,
x˙ d = (d − 1)λxd−1 − dµxd . It is convenient to write the system of ODE in the vector form x˙ = Ax + ν,
(9.2)
where the matrix A is given by −(λ + µ) 2µ 0 ... 0 λ −2(λ + µ) 3µ ... 0 . . . 0 2λ ... ... , .. .. .. . . . −(d − 1)(λ + µ) dµ 0 0 ... (d − 1)λ −dµ and the vector ν = (ν, 0, . . . , 0)⊤ . One can check that det A = (−1)d k!µd ̸= 0 if µ ̸= 0, and hence there is unique equilibrium x ˆ, which is the solution to Ax = −ν. The stability of this equilibrium is determined by the real parts of the eigenvalues of A because the change of variables x = y − A−1 ν leads to the system that we studied: y˙ = Ay. Theorem 9.2. Consider system (9.2) and let θ = λ/µ. Then equilibrium x ˆ is asymptotically stable and given by ν θk xk = , k = 1, . . . , d, λ k and has the power law form with exponent 1 if and only if θ = 1, i.e., if λ = µ. To prove this theorem I need first to establish that the coordinates of the equilibrium exactly as given and that matrix A has eigenvalues with negative real parts. For the second part I will need the following fact (without proof): Proposition 9.3. Consider matrix A with nonnegative non-diagonal elements and its principal minors Mk , k = 1, . . . , d (i.e., the determinants of the matrices obtained from A by keeping first k columns and rows). Matrix A is asymptotically stable (i.e., for all eigenvalues it is true that Re λ < 0) if and only if (−1)k M k > 0 for all k = 1, . . . , d. Proof of Theorem 9.2. First, by summing all the equations in (9.2), I find that −µx1 + ν = 0 =⇒ x ˆ1 = 77
ν . µ
Now from the second equation in (9.2) I can find x ˆ2 , from the third x ˆ3 and so on, which proves the formulae for the coordinates of the equilibrium. I have that M1 = −(λ + µ), and Md = det A = (−1)d d!µd (the last expression can be found by first adding the last row of A to row d − 1, after that row d − 1 added to d − 2 and so on, eventually I will get a diagonal matrix with (−µ, −2µ, . . . , −dµ) on the main diagonal). Now consider matrices M k such that Mk = det M k , k = 1, . . . d − 1. By using the determinant formula through the last row, I find Mk = det Ak − kλMk−1 , where Ak is k × k matrix of the same structure as A and whose det Ak = (−1)k k!µk . I know M1 , and hence can find M2 and so on. The general formula is given by Mk = (−1)k k!
λk+1 − µk+1 , λ−µ
which proves that all the eigenvalues of A have negative real parts.
The biggest question of course is how to modify the model to obtain a power law distribution with exponent other than one. For this one can assume that the birth and death rates are not constant and depend on the size of a given family. E.g., I can assume that the birth rate is λk and the death rate is µk per domain family of the size k (we retrieve the model, which was analyzed above, if we set λk = kλ, µk = kµ). The following theorem holds6 . Theorem 9.4. Birth–death–innovation model with the size dependent birth and death rates has a unique asymptotically stable equilibrium x ˆ ∈ Rd with coordinates ∏ ν i−1 λk , i = 1, . . . , d. x ˆi = ∏i k=1 k=1 µk This equilibrium distribution follows asymptotically the power law distribution with exponent α if and only if λk−1 α = 1 + + O(k −2 ). µk k For example, according to this theorem, the model with λk = λ(k + a), µk = µ(k + b) for some constants a ̸= b will produce a power law with the exponent a − b − 1 if λ = µ.
9.3
Preferential attachment
Here I present a heuristic derivation of the power law distribution via the so-called principle of preferential attachment (in this subsections I follow mainly this paper7 ). Consider the World Wide Web, with can be represented as a directed graph, where the vertices correspond to the web pages and there is an edge from vertex i to vertex j if there is a hyperlink from page i to 6
Karev, G. P., Wolf, Y. I., Rzhetsky, A. Y., Berezovskaya, F. S., & Koonin, E. V. (2002). Birth and death of protein domains: a simple model of evolution explains power law behavior. BMC evolutionary biology, 2(1), 18. 7 Mitzenmacher, M. (2004). A brief history of generative models for power law and lognormal distributions. Internet mathematics, 1(2), 226-251.
78
page j. Now each page can be characterized by the number of hyperlinks to this page, in the graph theoretic language this number is called the in-degree. Therefore, I can talk about the in-degree distribution and it turns out that this distribution can be closely approximated by a power law. Assume that a new web page is created. It is reasonable to expect that this new page will link to some popular web pages, i.e., the chance that a new web page is connected to a web-page with in-degree k should be proportional to k, and this is what is usually called the preferential attachment principle. Here is an informal argument to formalize the preferential attachment. Let xj (t) be the number of web pages with in-degree j when there are t pages total. Then, for j ≥ 1 the probability that xj (t) increases is simply α
xj−1 (t) (j − 1)xj−1 (t) + (1 − α) , t t
if I assume that new web page appears with only one link to existing pages, and this one link is chosen randomly among all t pages with probability α and with probability 1 − α this one link is chosen randomly but with probabilities proportional to the existing in-degrees (preferential attachement). Similarly, the probability that xj (t) decreases is α
xj (t) jxj (t) + (1 − α) . t t
Therefore, for j ≥ 1, ) ( α(xj−1 − xj ) + (1 − α) (j − 1)xj−1 − jxj x˙ j = . t The case j = 0 should be treated differently since each new web page has in-degree 0, and therefore αx0 . x˙ 0 = 1 − t I obtained a non-autonomous system of linear ordinary differential equations. Since the time unit in the model is appearance of one new web page, I can assume that the limiting stationary state should have the form xj (t) = cj t, where cj is a constant, which specifies which fraction of the total number of the pages with in-degree j. I have for x0 1 x˙ 0 = c0 = 1 − αc0 =⇒ c0 = . 1+α For general j cj (1 + α + j(1 − α)) = cj−1 (α + (j − 1)(1 − α)). I can determine cj exactly using the above recurrence, but for my goal it is enough to note that ( ) cj 2−α 2−α 1 =1− ∼1− . cj−1 1 + α + j(1 − α) 1−α j
79
This yields that asymptotically cj ∼ Cj − 1−α , 2−α
for some constant C. To see this, note that the last expression means cj cj−1
( ∼
j−1 j
) 2−α 1−α
( ∼1−
2−α 1−α
)
1 , j
as required.
10 10.1
Back to planar nonlinear systems Near the equilibria
Recall that I started talking about the Lotka–Volterra model as a motivation to study systems of two first order autonomous equations of the form x˙ = f (x),
x(t) ∈ U ⊆ R2 , f : U −→ R2 .
(10.1)
After this I discussed general properties of systems of the form (10.1) and formulated the main goal: Given the system how can I obtain its phase portrait? This task can be fully solved for linear systems of the form x(t) ∈ R2 , A = (aij )2×2 .
x˙ = Ax,
(10.2)
In this lecture I will show how the knowledge of the phase portraits of (10.2) can be used to obtain partial and essentially local information about phase portraits of (10.1). The general idea is quite straightforward. Assume that system (10.1) has an equilibrium x ˆ (i.e., f (ˆ x) = 0) and expand f in its Taylor series around this point: ( ) ( ) f (x) = f (ˆ x)(x − x ˆ) + f ′ (ˆ x)(x − x ˆ) + O |x − x ˆ|2 = f ′ (ˆ x)(x − x ˆ) + O |x − x ˆ|2 , where
∂f1 1 f ′ (ˆ x) = ∂x ∂f2 ∂x1
∂f1 ∂x2 ∂f2 ∂x2 (x1 ,x2 )=(ˆx1 ,ˆx2 )
is called the Jacobi (or ˆ. There is a temptation ( Jacobian) ) matrix of f evaluated at the point x 2 to drop the terms O |x − x ˆ| and make the shift of the variables y = x − x ˆ. Then for y I obtain the linear system y˙ = f ′ (ˆ x)y = Ay, A := f ′ (ˆ x), (10.3) whose phase portrait I know. System (10.3) is called the linearization of (10.1) around (or at) the equilibrium x ˆ. The answer when I can use the linearization to figure out the behavior near x ˆ is given by the following theorem. Theorem 10.1 (Grobman–Hartman). Assume that the origin of the linearization of (10.1) at x ˆ is hyperbolic. Then system (10.1) is locally topologically equivalent to its linearization (10.3). 80
Note that the theorem holds only in some neighborhood of x ˆ. To extend the terminology from the linear system, I call the equilibrium x ˆ hyperbolic, if the Jacobi matrix f ′ evaluated at this equilibrium has no eigenvalues with zero real part. In particular, denoting d− , d+ , d0 the number of eigenvalues with negative, positive, and zero real parts, I have Theorem 10.2 (Lyapunov, Poincar´e). If x ˆ hyperbolic, then it is asymptotically stable if d− = d. If d+ > 0 for any x ˆ (including non-hyperbolic equilibria) then x ˆ is unstable. If the equilibrium is non-hyperbolic in the linear system, for the analysis of the orbit structure of the nonlinear equation I will need to use some additional tools. This is in particular true when the linearization is a center. For example, for the Lotka–Volterra model the nontrivial equilibrium is a center in the linearized system (check this), and how I argued, it is also a center in the nonlinear system. However, according to the general theory, I could not conclude that the equilibrium is a center because the linearization is a center! Therefore, for hyperbolic equilibria of (10.1) exactly the same classification of the phase portraits in some neighborhood of x ˆ holds, as for linear systems. If a point is hyperbolic, then we will see familiar nodes, spirals, and saddles. If I deal with saddles, even more can be said. Let me define first the stable and unstable manifolds of x ˆ. By definition, the stable manifold of x ˆ is defined as W− (ˆ x) := {x0 ∈ U : x(t; x0 ) → x ˆ as t → ∞}, and the unstable manifold of x ˆ is defined as W+ (ˆ x) := {x0 ∈ U : x(t; x0 ) → x ˆ as t → −∞}. Theorem 10.3. Assume that x ˆ is a hyperbolic and denote T− , T+ the stable and unstable subspaces of its linearization. If T− ̸= ∅ then W− (ˆ x) exists and is tangent to T− at x ˆ, and if T+ ̸= ∅ then W+ (ˆ x) exists and is tangent to T+ at x ˆ. The last theorem means that the saddle structure of the equilibrium of the linearization is preserved in the nonlinear system, i.e., there are orbits that approach the equilibrium for t → ∞, and there are orbits that approach the equilibrium for t → −∞. To illustrate the last theorem consider a simple example x˙ = −x, y˙ = y + x2 . This system has a unique equilibrium x ˆ = (0, 0), and the linearization at this point is [ ] −1 0 x˙ = x. 0 1 (I use here and in the following, with a slight abuse of notations, the same variable to denote the linearization of the system). The linearization is a saddle point with eigenvalues λ1 = −1 and λ2 = 1 and eigenvectors v 1 = (1, 0)⊤ and v 2 = (0, 1)⊤ respectively, therefore the stable subspace T− is the x1 -axis and unstable subspace T+ is the x2 -axis. For the full nonlinear system I also
81
have the same orbit structure on x2 -axis (because if x1 = 0 then the system is linear), and the simple form of the equations allows me to find that the orbits on the plane (x, y) are given by y=−
x3 C + . 3 x
Therefore for C = 0 I have orbits approaching the origin along the parabola y=−
x3 , 3
which is the stable manifold for my equilibrium, and which is obviously tangent to T− (see the figure).
T+ x) W+ (ˆ y
y
T−
W− (ˆ x) x ˆ1
x
x
Figure 10.1: Stable and unstable manifolds for a hyperbolic equilibrium
10.2
Outside of equilibria
The previous subsection tells us a lot of possible orbit behavior near equilibria. The natural question is of course how to get an idea about what happens outside of equilibria. Right at this point I would like to say that there are no universal methods to analyze the structure of the phase portrait of a non-linear ODE system, however, by now we have quite a rich arsenal of tools to extract at least partial knowledge on the orbit behavior. For the planar system x˙ = f (x, y), y˙ = g(x, y),
(10.4)
the essential fact is that a simple curve dissects R2 into two connected regions. I can use this fact to identify the regions of R2 , where the solutions to (10.4) are monotone (i.e., the derivatives have definite signs in these regions). Such regions are found by plotting the null-clines f (x, y) = 0,
g(x, y) = 0
and using the following almost obvious proposition 82
( ) Proposition 10.4. Let ϕ(t) = x(t), y(t) be a solution to (10.4). Consider an open bounded set V ⊆ R2 . If x(t) and y(t) are strictly monotone in V then either ϕ(t) hits the boundary of V at some finite t, or ϕ(t) converges to an equilibrium (ˆ x, yˆ) ∈ V . Example 10.5. Consider the system x˙ = −x, y˙ = 1 − x2 − y 2 . The Jacobi matrix is
[
] −1 0 f (x, y) = . −2x −2y ′
The system has two equilibria: x ˆ1 = (0, 1) and x ˆ2 = (0, −1). The first one is an asymptotically stable node since [ ] −1 0 f ′ (ˆ x1 ) = , 0 −2 and the eigenvalues are −1 and −2. Therefore there exists a two-dimensional stable manifold W− (ˆ x1 ). The second equilibrium is unstable because [ ] −1 0 f ′ (ˆ x2 ) = , 0 2 with the stable manifold tangent to the line parallel to x-axis and unstable manifold tangent (actually, coincident with) to y-axis. The null-clines of the system are given by l1 = {(x, y) : x = 0},
l2 = {(x, y) : x2 + y 2 = 1},
therefore there are four regions of R2 where the solutions are monotone (see the figure). Using the proposition above, it should be clear that the only possible orbit structure is given in the figure. In general, it is not quite straightforward to guess a possible structure of the orbits, however, the null-clines are usually quite useful for obtaining at least partial information.
10.3
Bifurcations of equilibria. Structural stability
Recall that a bifurcation is the change of the topological type of the system. Equilibria can change their topological type under some parameter variation only if the linearization of the system around these equilibria is non-hyperbolic, or, in other words, a presence of a non-hyperbolic equilibria is a necessary condition for a bifurcation. This implies that bifurcations of the equilibria of the nonlinear system x˙ = f (x, α),
x(t) ∈ U ⊆ R2 ,
α ∈ R,
are possible when either one of the eigenvalues of the Jacobi matrix becomes zero, or when two eigenvalues cross the real axis. (I consider here only the case when there is one parameter in the system. In general, of course, it is possible to have that both eigenvalues of the Jacobi matrix becomes zero simultaneously, but generically for this it is necessary to have two parameters in the system.) 83
Example 10.6 (Saddle-node or fold bifurcation). Consider the planar system x˙ = α + x2 , y˙ = −y. Here the equations are decoupled and it is quite straightforward to figure out the phase portraits depending on the sign of α. If α < 0 I have two equilibria, one is asymptotically stable (node) and the other one is unstable (saddle), if α = 0 then the origin is a non-hyperbolic equilibrium (unstable, this equilibrium is called the saddle-node, hence the name for the bifurcation), and finally if α > 0 then there are no equilibria in the system, see the figure. Here in this example the bifurcation is essentially one-dimensional, which I already studied at length in the earlier lecture notes for the this course. It turns out that this situation is generic for the case when one of the eigenvalues of the Jacobi matrix becomes zero. I can have for most of the cases either fold, or transcritical, or pitchfork bifurcation, which were already studied. In the first case two equilibria of the opposite stability approach each other, collide and disappear; in the second case there is always an equilibrium for all parameter values that changes the stability properties as another equilibrium “passes” through it, and finally in the third case I should distinguish sub- and supercritical pitchfork bifurcation, this occurs in systems with certain symmetries. It is possible to specify the exact mathematical conditions for these three bifurcation types, but in practice it is usually much easier to determine the type of the bifurcation with one of the eigenvalues zero by analyzing the types and number of equilibria 1.5
x˙ > 0, y˙ < 0
x ˆ1
1.0
x˙ < 0, y˙ < 0
y
0.5
0.0
x˙ > 0, y˙ > 0
l2
x˙ < 0, y˙ > 0
-0.5
l1 x ˆ2
-1.0
-1.5 -1.5
-1.0
0.0
-0.5
0.5
1.0
1.5
x
Figure 10.2: The phase portrait of the system in the text. The red curves are null-clines of the system 84
y
α0
x
Figure 10.3: Saddle-node or fold bifurcation in a two dimensional system for the parameter values close to the bifurcation point, I will present some examples of such analysis. On the other hand, the case when two eigenvalues of the Jacobi matrix cross the imaginary axis is essentially two-dimensional. The corresponding bifurcation is called Hopf, or, more appropriately, Poincar´e–Andronov–Hopf bifurcation, and I will study it in more details later. Before finishing this section, I would like to note that bifurcations of equilibria in two dimensional systems are not the only possible changes of the phase portrait that lead to topologically non-equivalent pictures. Consider the following example. Example 10.7 (Heteroclinic bifurcation in a planar system). Consider the system depending on parameter α x˙ = 1 − x2 − αxy, y˙ = xy + α(1 − x2 ). This system always has two saddle equilibria x ˆ1 = (−1, 0) and x ˆ2 = (1, 0). At α = 0 the x-axis is invariant and approaches both equilibria for t → ±∞. Such trajectories are called heteroclinic. For α ̸= 0 this connection disappears. This is an example of a global bifurcation because to detect it I need to keep track of both equilibria. 85
y
α0
x
Figure 10.4: Global heteroclinic bifurcation There is one more important notion that pertains to bifurcations in systems of ODE. In the last example it is clear that for α = 0 the system is such that a small system perturbation would lead to a qualitatively different phase portraits. To formalize this, consider two systems defined in a closed and bounded region U (the region is chosen as closed and bounded, i.e, compact, not to deal with complications due to unboundness of the plane): x˙ = f (x),
x(t) ∈ U,
(10.5)
x˙ = g(x),
x(t) ∈ U.
(10.6)
and Now consider the C (1) distance between these two vector fields (10.5) and (10.6) in U as { } ∥f − g∥1 = sup |f (x) − g(x)|, |f ′ (x) − g ′ (x)| , x∈U
where the first norm inside the sup is the usual Euclidian norm, and the second one is the norm of a matrix A. The C (0) distance is not enough for my purposes. Using this definition of a distance in the space of all vector fields I can define a neighborhood N (f ) of the vector field f as a set ϵ-close to f with respect to the defined metric. Finally, I can state that 86
Definition 10.8. A planar differential equation x˙ = f (x) (or, equivalently, a vector field f ) is called structurally stable if there is a neighborhood N (f ) such that for any vector field g ∈ N (f ) is topologically equivalent in U to f . One of the indication of a structurally unstable system is the presence of a non-hyperbolic equilibrium. For example, the Lotka–Volterra system is structurally unstable, which is sometimes used as an indication that this system of ODE cannot be considered as a reliable candidate for a mathematical model describing predator–prey interaction.
10.4
Homework 5: Linearization
1. Consider two dimensional system in R2 : x˙ 1 = x2 (1 + x1 − x22 ), x˙ 2 = x1 (1 + x2 − x21 ). Determine the equilibria and characterize the linearized flow in a neighborhood of these points. Try to sketch the phase portrait. 2. Consider the system x˙ 1 = 16x21 + 9x22 − 25, x˙ 2 = 16x21 − 16x22 . Determine the equilibria and characterize them by linearization. Try to sketch the phase portrait. 3. In certain applications one studies the equation x ¨ + cx˙ − x(1 − x) = 0 with a special interest in solutions with the properties lim x(t) = 0, lim x(t) = 1, x(t) ˙ > 0 for − ∞ < t < ∞.
t→−∞
t→∞
Derive a necessary condition for the parameter c for such solutions to exist. (Hint: reformulate the problem as a system of two first order equations)
11 11.1
Ecological interactions General Lotka–Volterra model on a plane and types of ecological interactions
Finally I am in a position to start discussion of simple mathematical models of species interactions in a uniform fashion. For this let me start the discussion with the general form of the
87
Lotka–Volterra model for two interacting species. Let N1 (t) and N2 (t) denote the species population sizes at time t, and all the interactions are modeled using the law of mass action. This implies that I can write N˙ 1 = N1 (b1 + a11 N1 + a12 N2 ), N˙ 2 = N2 (b2 + a21 N1 + a22 N2 ), where b1 , b2 , a11 , a12 , a21 , a22 are constants (non necessarily positive). The constants b1 , b2 describe the Malthusian growth (bi > 0) or decay (bi < 0), the constants a11 and a12 refer to the intraspecific competition in the case aii < 0, and constants a12 and a21 describe the species interactions (interspecific competitions in a broad sense). In particular, two species can be in the following relationships: • Neutralism. This corresponds to the case a12 = a21 = 0, i.e., there is no direct influence of any of the species on the other one. • Amensalism is an interaction when one of the species clearly has a negative effect on another without getting any significant influence back. For the parameters of the system this means that, say, a12 < 0 and a21 = 0. • Commensalism. Here one species benefits without affecting the other. It implies that, e.g., a12 > 0 and a21 = 0 (in this case species 1 is a commensal, the one that benefits, and species 2 is a host). • Competition is a mutually detrimental interaction, a12 < 0, a21 < 0. • Antagonism. In antagonism one species benefits at the expense of the other, a12 < 0, a21 > 0. Different terms can be used in this case, e.g., consumer–resource interaction, or host– parasite interaction, or prey–predator interaction. • Mutualism leads to mutual benefit of interacting species (symbiosis, which is sometimes considered as a synonym, is a more general term, which may refer to any mutual interaction of two species), and hence a12 > 0, a21 > 0. Sometimes matrix A = (aij )2×2 of the parameters describing the intra- and interspecific interactions is called the interaction matrix. I already discussed at length the predator–prey Lotka–Volterra model, in which, b1 > 0, b2 < 0, a11 = a22 = 0, a12 < 0, a21 > 0. In this lecture I will look at the case of predator-prey interaction with a11 > 0, a22 > 0. It should be clear how to extend the system for the case of three, four, or more interacting species.
11.2
Lotka–Volterra predator–prey model with intraspecific competition
Recall that Lotka–Volterra predator–prey model that I analyzed is structurally unstable, i.e., a small perturbation of this system would lead to a topologically nonequivalent system (this phrase means “would lead to a system that possesses a topologically nonequivalent phase portrait, no matter how small perturbation is”). Consider a modification of this model that includes both
88
prey and predator intraspecific competitions: ) ( N N˙ = aN 1 − − bN P, K1 ( ) P ˙ P = −dP 1 + + cN P, K2 where a, b, c, d, K1 , K2 > 0 and N (t) and P (t) are prey and predator populations at time t respectively (actually, the predator intraspecific competition is redundant here, the same result is obtained putting formally K2 = ∞). By switching to non-dimensional variable, I find that x˙ = x(1 − αx − y),
(11.1)
y˙ = y(−γ − βy + x), where N (t) =
ax(τ ) , c
P (t) =
ay(τ ) , b
τ = at,
γ=
d , a
α=
a , cK1
β=
d . aK2
For the biologically motivated models the phase space is R2+ = {(x, y) : x ≥ 0, y ≥ 0}, therefore I do not consider the orbit structure in other parts of the plane. An important point, however, is to show that the chosen biologically realistic state space is positively invariant, i.e., if the initial conditions belong to our state space, then the positive semi-orbit starting at this point will stay in the phase space for any t → +∞ (I can similarly define a negatively invariant set). In my case it is a simple matter since the axis are invariant and consist of the orbits. This follows from the fact that x = 0 is a solution (plug it in the first equation) and y = 0 is a solution (plug it in the second equation). Since the axis are composed of the orbits and other orbits cannot intersect them I can conclude that R2+ is both positively and negatively invariant, and hence simply invariant. In R2+ system (11.1) can have up to three equilibria: ( ) ( ) 1 γ + β 1 − γα x ˆ0 = (0, 0), x ˆ1 = ,0 , x ˆ2 = , , α 1 + αβ 1 + αβ and x ˆ2 ∈ R2+ only if γα < 1. The Jacobi matrix of (11.1) has the form [ ] 1 − y − 2αx −x ′ f (x, y) = . y −γ − 2βy + x By analyzing eigenvalues of the Jacobi matrix, I find that x ˆ0 is always a saddle point, with x-axis being the unstable manifold, and y-axis being the stable manifold. For x ˆ1 one has 1 − −1 ′ α f (ˆ x1 ) = 1 − γα , 0 α therefore this point is a saddle if αγ < 1 and stable node if αγ > 1, for αγ = 1 I have one eigenvalue equal to zero, and therefore a bifurcation occurs, the exact type of this bifurcation 89
will be determined by looking at the third equilibrium. Note that when x ˆ1 is a saddle then the stable manifolds are on the x-axis, and the unstable manifold is tangent to straight line with the direction (1, γα − 1 − α), i.e., has a negative slope. Now assume that x ˆ2 = (ˆ x2 , yˆ2 ) be the coordinates of the third equilibrium in case αγ < 1. That is, they solve the system αx + y = 1, x − βy = γ, and I know that x ˆ2 > 0, yˆ2 > 0. Using the fact that, e.g., ∂f1 = (1 − αx − y) − αx, ∂x I find that
tr f ′ (ˆ x2 ) = −αˆ x2 − β yˆ2 < 0,
and
det f ′ (ˆ x2 ) = x ˆ2 yˆ2 (1 + αβ) > 0,
which implies that x ˆ2 is asymptotically stable and either stable node or stable focus depending on the exact parameter values. I also notice that if αγ = 1 then the coordinates of x ˆ2 are precisely (α−1 , 0), i.e., they coincide with the coordinates of x ˆ1 . This implies, given that both x ˆ1 and x ˆ2 exist for any parameter values (but not always in R2+ ) that the local bifurcation that occurs for αγ = 1 is exactly the transcritical bifurcation, at which these two equilibria collide and exchange the stability properties. To figure out the global behavior of the orbits I need to look at the mutual positioning of the null-clines, which are given here as x = 0,
y = 0,
l1 = {(x, y) : y = 1 − αx},
l2 = {(x, y) : βy = x − γ},
where I named only those different from the coordinate axis. I will use the proposition from the previous lecture, in which it was stated that the monotone orbits in a bounded open set either approach the boundary of this set or converge to an equilibrium. Consider first αγ > 1. l1 has a negative slope and intersect x-axis at the equilibrium with x = 1/α, l2 has a positive slope and intersects x-axis at x = γ > 1/α due to the assumption. Therefore these null-clines divide R2+ into three sets, let me call them U1 , U2 , U3 starting from left to right (see the figure). Assume that x0 ∈ U3 . According to the proposition the orbit γ(x0 ) has to cross l2 , because in this set x˙ < 0 and there are no equilibria to converge to. In U2 I have that x˙ < 0, y˙ < 0, therefore the only two possibilities are to converge to x ˆ1 or to cross l1 . Finally, in U2 there is no other option for the orbits other than converge to x ˆ1 , and I conclude that for the case αγ > 1 the equilibrium x ˆ1 is globally asymptotically stable, meaning that for ˆ1 . almost all initial conditions from R2+ I have x(t; x0 ) → x Now let αγ < 1. Then l1 and l2 intersect at x ˆ2 , and R2+ is divided into four sets U1 , U2 , U3 , U4 going in a clockwise fashion. Now I can argue that the orbits either approach x ˆ2 or cross boundaries of our sets in the order U4 → U3 → U2 → U1 → U4 . This means that there is a theoretical possibility to have a closed orbits that corresponds, as I discussed earlier for Lotka– Volterra model, to periodic oscillations in prey and predator populations. At this point I am 90
l2 U2
U2 l2
l1
y
y
U3 l1
x ˆ2 U1 U1
U3
x ˆ0
U4 x ˆ0
x ˆ1
x ˆ1 x
x
Figure 11.1: Topologically non-equivalent phase portraits of the prey–predator model with intraspecific competition. The left one corresponds to αγ > 1, and the right one corresponds to the case αγ < 1 just going to state that there are no closed orbits in this system, a proof will be given later. Therefore, I conclude that in case αγ < 1 equilibrium x ˆ2 is globally asymptotically stable. As a general conclusion I have that system (11.1) allows two structurally stable topologically non-equivalent phase portraits, the transcritical bifurcation occurs when αγ = 1, and each of these parameters can be taken as a bifurcation parameter. Parameter β does not influence the topological picture. In biological terms I have two very different outcomes: • If αγ > 1, i.e., when
d > 1, cK1 which corresponds to a high mortality rate for the predator, or a low carrying capacity for the prey, or a low effectiveness of the predator to transform prey biomass into predator biomass, then the predator goes extinct whereas the prey population stabilizes at N (t) = K1 .
• If αγ < 1, i.e., when
d < 1, cK1 which means that either prey has a large carrying capacity, or the predator has a low mortality rate, or a high effectiveness in prey consumption, then the prey and predator coexist at the equilibrium x ˆ2 , whose coordinates should be written in the original dimensional parameters.
Finally, it is not necessary for this example, but usually very useful to do is to sketch in the parameter space the domains of topologically non-equivalent behaviors. Here is how it look for 91
our case, where β does not change anything, and hence there are only two parameters changes in which lead to bifurcations:
γ
αγ > 1
αγ < 1
α
Figure 11.2: Parametric portrait of the predator–prey system with intraspecific competition. There is only one bifurcation boundary on which transcritical bifurcation of x ˆ1 occurs The parametric portrait as shown above together with phase portraits for each domain in the parametric portrait as shown in the previous figure constitute together a bifurcation diagram of the system, obtaining which is the ultimate goal of analysis of nonlinear parameter dependent autonomous ODE systems. Unfortunately in many many cases, only partial information about the corresponding bifurcation diagram is available.
12 12.1
More on ecological interactions Competition model with intraspecific competition
Consider a mathematical model that describes interactions of two populations competing for the same resource, assuming that also intraspecific competition is at play: ( ) N1 ˙ N1 = r1 N1 1 − − bN1 N2 , K1 ( ) N2 ˙ N2 = r2 N2 1 − − cN1 N2 , K2 note mutually negative influence of the populations on each other. By passing to dimensionless variables (you should write down the change of variables) I arrive at x˙ = x(1 − y − αx), y˙ = y(γ − x − βy), which always has three equilibria in R2+ : x ˆ0 = (0, 0),
x ˆ1 = (1/α, 0),
x ˆ2 = (0, γ/β).
Math 484/684: Mathematical modeling of biological processes by Artem Novozhilov e-mail:
[email protected]. Fall 2015
92
(12.1)
Additionally, if αγ > 1, β > γ or αγ < 1, β < γ there is an internal equilibrium ( ) β − γ αγ − 1 x ˆ3 = (ˆ x3 , yˆ3 ) = , . αβ − 1 αβ − 1 Point x ˆ0 is an unstable node with eigenvalues 1 and γ, x ˆ1 is a saddle if αγ > 1 and an asymptotically stable node if αγ < 1, x ˆ1 is a saddle if β > γ and asymptotically stable node if β < γ. Finally, if x ˆ3 ∈ R2+ , then tr f ′ (ˆ x3 ) = −αˆ x3 − β yˆ3 < 0,
det f ′ (ˆ x3 ) = (αβ − 1)ˆ x3 yˆ3 ,
which implies that x ˆ3 is stable if αβ > 1 and is a saddle if αβ < 1 if x ˆ3 ∈ R2+ . The last condition is equivalent to αγ < 1, β < γ and αγ > 1, β > γ respectively. The null-clines of interest here are l1 = {(x, y) : y = 1 − αx},
l2 = {(x, y) : βy = γ − x},
which can be in four different mutual positions (see the figure below). If β < γ and αγ > 1 then x̂2 is stable, x̂1 is unstable, x̂3 ∉ R2+, and the analysis of the null-clines implies that the phase portrait looks like portrait (1) in the figure below: Species 2 outcompetes Species 1. If β > γ and αγ < 1 then the situation is similar, except that now species 1 wins the race: x̂1 is a globally asymptotically stable equilibrium. Now assume that αγ > 1 and β > γ; then x̂3 ∈ R2+ and is asymptotically stable, all other points are unstable, and the analysis of the null-clines yields that all the orbits converge to x̂3; in this case we have species coexistence (panel (3) in the figure). Finally, if αγ < 1 and β < γ then x̂3 ∈ R2+ and is unstable, whereas both equilibria on the axes are stable (see case (4) in the figure). This is the so-called bistable situation, when the final outcome of the competition is determined not only by the parameter values, but also by the initial conditions (a numerical sketch of this case is given below). I can summarize all the information in the parametric portrait of the model, which is now actually three-dimensional, since all three parameters play a role in determining the dynamics of the system. In such cases it is usually convenient to fix one parameter and consider "slices" of the full parametric portrait for this fixed value. Here it is very convenient to fix γ; then I obtain the following parametric portrait in the space (α, β). You should represent the same parametric portrait for α fixed in the (β, γ) space, and for β fixed in the (α, γ) space. Recall that the parametric portrait together with the phase portraits for each topologically non-equivalent domain is called a bifurcation diagram.
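Here is a minimal numerical sketch (my addition) of the bistable case (4); the parameter values and initial conditions are arbitrary illustrative choices.

# Sketch of the bistable regime of (12.1): alpha*gamma < 1 and beta < gamma.
import numpy as np
from scipy.integrate import solve_ivp

alpha, beta, gamma = 0.5, 0.5, 1.0          # alpha*gamma = 0.5 < 1, beta = 0.5 < gamma

def rhs(t, z):
    x, y = z
    return [x * (1 - y - alpha * x), y * (gamma - x - beta * y)]

for x0 in ([0.9, 0.3], [0.3, 0.9]):
    sol = solve_ivp(rhs, (0, 300), x0, rtol=1e-8)
    print(x0, "->", np.round(sol.y[:, -1], 3))
# The first orbit approaches (1/alpha, 0) = (2, 0), so species 1 wins; the second
# approaches (0, gamma/beta) = (0, 2), so species 2 wins: the outcome depends on the
# initial condition.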
12.2
Principle of competitive exclusion
Which (biological) conclusions can be obtained from the (mathematical) analysis of the model? This model was actually used to predict the fate of a mosquito imported to the US, but I personally would be quite careful with such quantitative predictions (here is a reference for the paper8; you should totally read it, it is only two pages long). On the other hand, in three of the four domains in the parametric portrait only one species survives the competition. This modeling 8
Livdahl, T., & Willey, M. (1991). Prospects for an Invasion: Competition Between Aedes albopictus and Native Aedes triseriatus. Science, 253, 189-191
Figure 12.1: Four topologically non-equivalent phase portraits of the model (12.1)
Figure 12.2: Parametric portrait of the model (12.1). The numbers of the domains coincide with the numbers of the topologically non-equivalent phase portraits in the previous figure.

conclusion was tested experimentally by Georgy Gause, who published his findings in the book
The Struggle for Existence9. Gause formulated the principle of competitive exclusion, which states that two species competing for the same resources cannot coexist if other ecological factors are constant. Here is a mathematical incarnation of this principle when the species depend linearly on available resources (the principle need not be valid when the dependence is nonlinear).

Proposition 12.1. If d bounded populations depend linearly on k resources, and k < d, then at least one population dies out.

Proof. The assumption on the linear dependence translates into the equations

ẋi / xi = bi1 R1 + . . . + bik Rk − αi,
i = 1, . . . , d,
where xi is the population size of the i-th population, αi are the death rates, Rj is the abundance of the j-th resource, and bij describes the effectiveness of consumption of the j-th resource by the i-th population. Since d > k then the system of equations b1j c1 + . . . + bdj cd = 0,
j = 1, . . . , k,
has a nontrivial solution, which I denote c = (c1, . . . , cd)⊤. Let (assuming a generic case) α = c1 α1 + . . . + cd αd ≠ 0, such that α > 0 (I can always guarantee this because together with c, −c is also a solution). I have

c1 ẋ1/x1 + . . . + cd ẋd/xd = −α,

which, by using the fact that ẋi/xi = d(log xi)/dt and integrating from 0 to some constant T, gives

x1(T)^c1 · . . . · xd(T)^cd = C e^(−αT)

for some constant C ∈ R. For T → ∞ I have that the right-hand side tends to zero, hence for at least one of the populations, say xi, I must have

lim inf_{T→∞} xi(T) = 0,
which means that the i-th population goes extinct.
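The key step of the proof, the decay of the combination x1^c1 · . . . · xd^cd, can be checked numerically. In the sketch below (not part of the notes) the matrix b, the death rates, and the resource functions R1, R2 are arbitrary illustrative choices, since the argument does not depend on them.

# Numerical check of the conserved combination in the proof of Proposition 12.1.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import null_space

b = np.array([[1.0, 0.5],       # d = 3 populations, k = 2 resources
              [0.4, 1.2],
              [0.8, 0.9]])
death = np.array([0.6, 0.7, 0.5])

def R(x):                        # resource abundances (any dependence will do)
    return np.array([1.0 / (1 + x.sum()), 2.0 / (1 + 2 * x[0] + x[1])])

def rhs(t, x):
    return x * (b @ R(x) - death)

c = null_space(b.T)[:, 0]        # nontrivial solution of b^T c = 0
alpha = c @ death
if alpha < 0:                    # use -c if necessary so that alpha > 0
    c, alpha = -c, -alpha

sol = solve_ivp(rhs, (0, 50), [0.3, 0.4, 0.2], dense_output=True, rtol=1e-10)
t = np.linspace(0, 50, 6)
vals = [c @ np.log(sol.sol(ti)) + alpha * ti for ti in t]
print(np.round(vals, 6))         # c1*log x1 + ... + cd*log xd + alpha*t stays constant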
From a biological point of view the principle of competitive exclusion is still debatable, because there are quite a few instances in nature when two or more populations competing for the same resources are able to survive in the long term. 9
Gause, G. F. (2003). The struggle for existence. Courier Dover Publications; also available online at http://www.ggause.com/Contgau.htm
12.3
Cooperative systems
I would like to be as general as possible in this section. Analysis of many concrete mathematical models is left as an exercise. Definition 12.2. A system x˙ = f (x), x(t) ∈ U ⊆ Rd , is called cooperative in U if ∂fi (x) ≥ 0 ∂xj for all x ∈ U and j ̸= i. It is called strictly cooperative in U if ∂fi (x) > 0 ∂xj holds for all x ∈ U and j ̸= i. This means that the growth rate of each population does not decrease when the population sizes of other species increase. This clearly describes the mutualistic interaction, especially in the strict case. Proposition 12.3. Let two-dimensional system x˙ = f (x),
x(t) ∈ U ⊆ R2 , f : U −→ R2
be strictly cooperative in U . Then all the orbits converge either to an equilibrium or to a boundary of U (including ∞). Proof. Denote Ui , i = 1, 2, 3, 4 the quadrants of R2 in the usual counterclockwise fashion. If x˙ ∈ U 1 for some t = t0 , where U 1 means the closure of the set, then x˙ ∈ U1 for all t > t0 . This can be seen from the fact that x ¨1 =
(∂f1/∂x1)(x) ẋ1 + (∂f1/∂x2)(x) ẋ2,
assumption of being strictly cooperative, and the fact that if both x˙ 1 = x˙ 2 = 0 then we are at an equilibrium. Similarly for U3 . If x˙ ∈ U2 or x˙ ∈ U4 then the sign of derivatives can change only once, and thus both of the components are eventually monotone. Note that the time reversal will turn a cooperative system into a competitive one: Definition 12.4. A system x˙ = f (x), x(t) ∈ U ⊆ Rd , is called competitive in U if ∂fi (x) ≤ 0 ∂xj for all x ∈ U and j ̸= i. It is called strictly competitive in U if ∂fi (x) < 0 ∂xj holds for all x ∈ U and j ̸= i. 96
This implies that Proposition 12.5. Let two-dimensional system x˙ = f (x),
x(t) ∈ U ⊆ R2 , f : U −→ R2
be strictly competitive in U . Then all the orbits converge either to an equilibrium or to a boundary of U (including ∞). In particular, if one assumes a limited growth (i.e., f1 (x) < 0 for large enough x1 and f2 (x) < 0 for large enough x2 ) then every solution of a strictly cooperative system converges to a fixed point. Cooperative and competitive systems are examples of monotone dynamical systems, for which an extensive theory exists.
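A quick symbolic check of Definition 12.2 (my addition) for an illustrative mutualistic Lotka–Volterra model, which is not taken from the notes:

# Check that an illustrative mutualism model is cooperative in the first quadrant.
import sympy as sp

x, y, a, b = sp.symbols('x y a b', positive=True)
f1 = x * (1 - x + a * y)
f2 = y * (1 - y + b * x)

print(sp.diff(f1, y))   # a*x, nonnegative for x >= 0
print(sp.diff(f2, x))   # b*y, nonnegative for y >= 0
# Both off-diagonal entries of the Jacobi matrix are nonnegative in the closed first
# quadrant, so the system is cooperative there (strictly so in the open quadrant).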
12.4
Midterm exam
1. Allee’s effect. Allee’s effect usually refers to the case when for low population density the per capita growth rate decreases and even becomes negative. A possible model is given by ( ) N a N˙ = rN 1 − − , N (0) > 0. K 1 + bN Here r, a, b, K are positive parameters. a is used to refer to the Allee effect. If a = 0 then we have the simple logistic equation. (a) Find the change of variables to put the equation into the dimensionless form ( ) x a x˙ = x 1 − − , x(0) > 0. M 1+x (b) (c)
(d) (e) (f) (g)
Assume that M > 1. For 0 < a < 1 show that there is a unique positive equilibrium x ˆ < M and determine its stability. For 1 < a < (M + 1)2 /(4M ) show that there are two positive equilibria, 0 < x ˆ1 < x ˆ2 and determine their stability. x ˆ1 defines a breakpoint density in the sense that it divides the solutions into two types. When the initial condition is below x ˆ1 , the solutions approach zero; if x(0) > x ˆ1 then the solutions approach x ˆ2 . 2 For a > (M + 1) /(4M ) find limt→∞ x(t). Fix M = 8 and consider a as a bifurcation parameter. Draw the bifurcation diagram in this case. Argue that there is a transcritical bifurcation when a = 1. Argue that a fold bifurcation occurs in the system and find its bifurcation value.
2. A mathematical model of interaction between pollutant and environment. Consider a mathematical model of interaction between a pollutant and environment assuming that the total level of pollution and the state of the environment at time t can be described by numerical values P (t) and E(t) respectively. This is an interaction that is closely related to the competitive type model, because both variables have a negative effect on one another. 97
• Assuming that there is a constant source of pollutant emission, describe the dynamics of the pollutant without the environment as P˙ = a − bP, where a is the source intensity and b is the degradation constant. If t → ∞, what happens with P (t)? • Assuming that the pollutant influence on the environment is negative, that there should be a function that describes the dynamics of the biomass when there is no pollution, and that living nature actually absorbs to some extend the pollutant we end up with the system P˙ = a − bP − f (E, P ), E˙ = g(E) − h(E, P ), where f (E, P ) ≥ 0, h(E, P ) ≥ 0 for (P, E) ∈ R2+ . • Take f (E, P ) = cEP, h(E, P ) = dEP , and for g(E) take the standard logistic equation, and obtain P˙ = a − bP − cEP, ( ) E ˙ E = rE 1 − − dEP, K where r, a, b, c, K, d are positive parameters. Prove that R2+ is forward invariant for this system (you are asked to prove that the orbits do not leave the positive orthant of the plane for t → ∞). • Find the change of variables to put the equations into the dimensionless form x˙ = α − x − xy, y˙ = y(β − x) − γy 2 . • Analyze the obtained system of two equations. The result should be in the form of a bifurcation diagram. • Interpret the results of analysis in the previous point in biological terms. What is predicted by this model?
13
A thousand and one population models
The title of this section is inspired by (or borrowed from) Hethcote, H. W. (1994). A thousand and one epidemic models. In Frontiers in mathematical biology (pp. 504-515). Springer Berlin Heidelberg, which I very much recommend for reading. In Section 5 we analyzed the classical predator–prey Lotka–Volterra model that predicts periodic oscillations of the prey and predator population sizes. The model was written in a way to present the simplest possible mathematical expressions that describe the interactions between two populations. A very similar approach was used in the last lectures, when a predator–prey and competition models with intraspecific competition were analyzed. In these models all the interactions are described by linear function, which can be summarized in 98
Definition 13.1. The Lotka–Volterra model for two interacting species is the model of the form x˙ = x(b1 + a11 x + a12 y), y˙ = y(b2 + a21 x + a22 y), for some constants b1 , b2 , a11 , a12 , a21 , a22 . In particular, for the classical predator–prey model I have b1 > 0, b2 < 0, a11 = a22 = 0, a12 < 0, a21 > 0: N˙ = aN − bN P, P˙ = cN P − dP.
(13.1)
A next possible step in formulating mathematical models of two interacting populations is to consider nonlinear expressions for the description of intra- and interspecific interaction. Here, I will follow A.D. Bazykin’s approach10 and consider the example of the prey–predator model. I start with a generalization of (13.1): N˙ = A(N ) − B(N, P ), P˙ = C(N, P ) − D(P ),
(13.2)
where N, P are the populations of prey and predator respectively, A(N ) is the function describing the prey dynamics when the predator is absent, D(P ) is the function describing the extinction of the predator when the prey is absent, function B(N, P ) gives the rate of consumption of the prey by the predator, and C(N, P ) is the effectiveness of consumption of prey. For A(N ) and D(P ) I used either the Malthus law of exponential growth or decay, or the logistic equation, which takes into account intraspecific competitions of the individuals in the population. Additionally, I can consider the case when the rate of growth of a population is low (or even negative), when the population size is small (think of a species where the individuals cannot find a mate), for example, I can take aN 2 A(N ) = A+N for some constants a, A > 0. Next assumption is to consider B(N, P ) = B1 (N )B2 (P ),
C(N, P ) = C1 (N )C2 (P ).
For instance, function B1 (N ) is called the trophic predation function or the functional reaction of the predator to the prey population density. In (13.2) I simply had linear function B1 (N ) = bN , which is equivalent to the absence of any predation saturation as the prey population grows (I already used similar reasonings when discussing the model of the insect outbreak). In general, there are three most common trophic functions: linear function (or piecewise linear, as indicated in the figure, function I). The second type is a function that display the saturation effect: limN →∞ B1 (N ) = b > 0. For example I can take (see curve II in the figure) 10
Bazykin, A. D. (1998). Nonlinear dynamics of interacting populations (Vol. 11). World Scientific
Figure 13.1: Three types of trophic functions
B1(N) = bN/(1 + αN).
A similar dependence is given by B1(N) = b(1 − e^(−αN)). The third type of the trophic predation function takes into account two elementary factors: the saturation effect of the predator, and the nonlinear character of the prey consumption by the predator (the predator may look for another source of energy if the prey population is scarce). In the figure you can see the qualitative character of this curve (III); an analytical formula can be taken as

B1(N) = bN²/(1 + αN²),

or, even more generally,

B1(N) = bN²/(1 + α1 N + α2 N²).

Now let me turn the attention to B2(P), which is ∝ P in (13.2) and in many other mathematical models, which can be interpreted as the absence of the competition among the predator individuals for prey. If I'd like to take this competition into account, I can take

B2(P) = P/(1 + βP).
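For orientation, here is a small plotting sketch (my addition) of the three types of trophic functions; the parameter values and the cap used for the piecewise linear curve are arbitrary.

# Plot the three types of trophic functions (illustrative parameter values).
import numpy as np
import matplotlib.pyplot as plt

N = np.linspace(0, 10, 400)
b, alpha = 1.0, 0.5
type_I   = np.minimum(b * N, 4.0)            # linear, capped (piecewise linear)
type_II  = b * N / (1 + alpha * N)           # saturating
type_III = b * N**2 / (1 + alpha * N**2)     # saturating with a slow start

for curve, label in [(type_I, 'I'), (type_II, 'II'), (type_III, 'III')]:
    plt.plot(N, curve, label=f'type {label}')
plt.xlabel('prey density N'); plt.ylabel('B1(N)'); plt.legend(); plt.show()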
Therefore, I already have six choices for function B(N, P ). A similar discussion can be made about C(N, P ), which often takes the form C(N, P ) = B1 (N )C2 (P ). For instance, if one would like to take into account nonlinear predator birth at small population sizes, he or she can take C(N, P ) = C1 (N )
cP/(C + P),
for some constants c, C > 0. Let me list the elementary factors that can be taken into account at generalizing (13.2):
• Nonlinear character of the prey growth rate for small N (A(N ) = aN 2 /(A + N )). • Intraspecific prey competition (A(n) = aN (1 − N/K1 )). • Predator saturation (B1 (N ) = bN/(1 + αN )). • Nonlinear character of the prey consumption by the predator (B1 (N ) = bN 2 /(1 + αN )). • Predator competition for prey (B2 (P ) = P/(1 + βP )). • Predator intraspecific competition D(P ) = P (1 + P/K2 ). • Nonlinear dependence of the predator birth rate at small P (C2 (P ) = dP/(C + P )). Now I can combine these elementary factors in different ways to obtain a particular example of the predator–prey model (13.1). For example, a model with prey competition, predator competition for prey, predator saturation takes the form ) ( bN P N ˙ − , N = aN 1 − K1 (1 + αN )(1 + βP ) dN P P˙ = −dP + . 1 + αN This model has 7 parameters. This number can be reduced to 4 by a change of the variables. Taking into account that each of the elementary factors given above means adding one more dimensionless parameter to (13.2), which in its turn has one dimensionless parameter. It is well known that analysis of the ODE systems with the number of parameters exceeding three is quite involved, and, more importantly, the results of such analysis are quite difficult to interpret. Therefore, it becomes crucial to be able to identify the key factors for each particular situation. The same reasonings are applicable to other ecological interactions, including interspecies competition and mutualism.
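As an illustration (not from the notes), here is a simulation sketch of a model of exactly this combined type; since the extracted text leaves the prey-to-predator conversion coefficient ambiguous, it is written below as a separate parameter c, and all numerical values are arbitrary.

# Sketch of a predator-prey model with prey competition, predator saturation and
# predator competition for prey. The conversion coefficient c and all numbers are
# my own illustrative choices, not values from the notes.
import numpy as np
from scipy.integrate import solve_ivp

a, K1, b, alpha, beta, c, d = 1.0, 10.0, 1.0, 0.2, 0.1, 0.5, 0.4

def rhs(t, z):
    N, P = z
    dN = a * N * (1 - N / K1) - b * N * P / ((1 + alpha * N) * (1 + beta * P))
    dP = -d * P + c * N * P / (1 + alpha * N)
    return [dN, dP]

sol = solve_ivp(rhs, (0, 200), [2.0, 1.0], max_step=0.1)
print("final state:", np.round(sol.y[:, -1], 3))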
14 Periodic phenomena in nature and limit cycles

14.1 Periodic phenomena in nature
As it was discussed while I talked about the Lotka–Volterra model, a great deal of natural phenomena show periodic behavior. Probably one of the best studied and advertised examples is the data from Hudson Bay Company that recorded the numbers of lynx and hare pelts that were bought by the company from hunters in the nineteenth and twentieth century. A canonical in some sense representation of these data is given in Figure 14.1, where the numbers of acquired skins of lynx (circles) and snowshoe hare (squares) are shown. The data show indisputable 10 year cycle in both the prey and predator numbers. Before focusing our attention on the mathematical side of the description of the periodic phenomena, I would like to make a few remarks about these data. This figure is originally from a very respectful book by Odum “Fundamentals of Ecology.” Odum says that his graph is taken from MacLulich’s “Fluctuations in the numbers of varying hare,”1937, which is not widely available. Some authors caution that this data are actually a composition of several time series, 101
Figure 14.1: Data on lynx and hare in Canada from the Hudson Bay Company. Circles show lynx data, and squares provide snowshoe hare data; the numbers are given in thousands.

and should probably not be analyzed as a whole. A great example of misinterpreting these data is given by Gilpin11, see the figure from the cited paper. In this figure Gilpin uses data from
Hudson Bay Company, which are, however, different from the data in Odum, to argue that the direction of the data change in the phase plane (hare, lynx) is clearly clockwise, whereas our simple mathematical models (and the classical Lotka–Volterra model in the first place) show counter clockwise movement, in which case the maxima of the prey population precede the maxima of the predator population. Can we discard our mathematical models on the grounds of these data? Probably not, since there are so many issues with collecting these data, including 11
Gilpin, M. E. Do hares eat lynx? American Naturalist (1973): 727–730.
the obvious fact that these are not actual population numbers, but the number of traded pelts, which can reflect many other things. Much more on this particular example, and other examples of periodic data in biology can be found in a book by Peter Turchin12 .
14.2
Limit cycles. Definitions. Stability
The simplest type of an asymptotic behavior of solutions to x˙ = f (x),
x(t) ∈ U ⊆ R2
(14.1)
is arguably the equilibrium points. As I already presented in the example of the Lotka–Volterra this is not the only possibly behavior: A closed curve that corresponds to a periodic solution can represent an asymptotic behavior. I also discussed that for autonomous systems (14.1) any periodic solution corresponds to a closed curve in the phase space (which is quite trivial), and, in the opposite direction (which is less trivial and not true for non-autonomous system), a closed curve in the phase plane implies existence of a periodic solution, i.e., of the solution x(t; x0 ) such that x(t + T ; x0 ) = x(t; x0 ) for any t ∈ R, and here T > 0 is the minimal such real number. The examples of the closed curves were in the Lotka–Volterra model, which is structurally unstable (i.e., the behavior of the orbits can be destroyed by any, no matter how small, generic changes of the right hand sides of the system). Moreover, any system that possesses a family of closed curves filling a whole domain is structurally unstable. I am interested, however, in the properties of the models that persist under small changes of the equations. It turns out that a closed curve that is structurally stable has to be isolated. Therefore, I have the following definition. Definition 14.1. A closed orbit γ of (14.1) is called a limit cycle if it is isolated, i.e., there are no other closed curves is a small enough neighborhood of γ. In Figure 14.2 an example of a limit cycle of (14.1) is shown. Example 14.2. Here is a basic example, which shows that limit cycles are stable under small system’s perturbations. Consider an ODE system in polar coordinates r˙ = r(1 − r2 ), θ˙ = 1, where r is the distance to the origin, and θ is the polar angle. The equations here are decoupled and can be easily analyzed. For r I find that there are two equilibria rˆ0 = 0 and rˆ1 = 1 (−1 is not an equilibrium because of the definition that r ≥ 0). The former equilibrium is unstable and the latter is asymptotically stable, and both equilibria are hyperbolic, and this implies that they will persist under small changes in the equations. Therefore, for any initial condition different from zero, r(t) → 1 as t → ∞. Geometrically r = 1 is a circle of the radius one with the center 12
Turchin, P. (2003). Complex population dynamics: a theoretical/empirical synthesis (Vol. 35). Princeton University Press.
Figure 14.2: An asymptotically stable limit cycle

at the origin. For θ I have that it is monotonously increasing for any t. The superposition of these two movements results in the phase portrait shown in Figure 14.3. In case of the first equation

ṙ = −r(1 − r²)

we will have an unstable limit cycle (the student is invited to make a graph).
Figure 14.3: An asymptotically stable limit cycle in the system ṙ = r(1 − r²), θ̇ = 1

The notions of stability and instability of the limit cycles are intuitively clear and can be formalized by using a distance function d(A, B) between sets A and B. To wit, a limit cycle γ is called stable (or Lyapunov stable) if for any ϵ > 0 there exists a δ(ϵ) > 0 such that for any initial condition x0 with d(x0, γ) < δ, I have that d(x(t; x0), γ) < ϵ for all t > 0. A limit cycle γ is called unstable if it is not stable. A limit cycle γ is called asymptotically stable if it is stable and, additionally, d(x(t; x0), γ) → 0 as t → ∞. These definitions, however, do not provide any means to determine the stability of a limit cycle analytically. I will return to the notion of stability of the limit cycle in later lectures.
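Here is a small numerical sketch (not from the original notes) of Example 14.2; the initial radii are chosen arbitrarily.

# Orbits of r' = r(1 - r^2), theta' = 1 converge to the limit cycle r = 1.
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, z):
    r, theta = z
    return [r * (1 - r**2), 1.0]

for r0 in (0.1, 0.5, 2.0):
    sol = solve_ivp(rhs, (0, 10), [r0, 0.0], rtol=1e-9)
    print(f"r(0) = {r0}:  r(10) = {sol.y[0, -1]:.4f}")
# Each printed value is close to 1, as predicted by the one-dimensional analysis of r.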
14.3
Criteria of absence of the limit cycles
There are no regular methods to study limit cycles of ODE. Sometimes it is possible to prove that limit cycles do not exist in some domain G. Here is one of the most useful criteria:

Proposition 14.3 (Dulac's criterion). Consider (14.1) and assume that G ⊆ U is simply connected, and that B(x) ∈ C(1)(G; R) is such that the expression

∂/∂x1 (Bf1) + ∂/∂x2 (Bf2)

is sign definite (i.e., positive or negative everywhere in G). Then there are no limit cycles in G.

As a quick note, the expression ∂/∂x1 (Bf1) + ∂/∂x2 (Bf2) can be concisely written as div Bf or ∇ · Bf for the del operator ∇ = (∂x1, ∂x2).
Proof. Assume that there is a limit cycle γ ⊂ G. Consider the line integral

I := ∮γ (−Bf2) dx1 + (Bf1) dx2.

This integral, due to the assumption that γ is an orbit of (14.1), has to be zero:

I = ∫0T ((−Bf2) ẋ1 + (Bf1) ẋ2) dt = 0.

On the other hand, due to Green's theorem,

I = ∫∫D ∇ · (Bf) dx,
which cannot be zero because of the sign definiteness of the expression under the integral sign. Here D is the domain confined by γ. Therefore I arrived at a contradiction, which implies that there are no limit cycles in G. Remark 14.4. • This proposition is true only for the plane, d = 2, and the reader is invited to think of a counterexample in dimension d ≥ 3. • Originally this proposition is due to Bendixson, who considered the case B(x) ≡ 1. In this case the condition of the absence of the limit cycles takes the simple form that the expression ∇·f is sign definite. This is why this criterion is often referred as Bendixson–Dulac theorem. • A similar, but slightly more tedious proof shows that, if in G the expression ∇ · Bf is sign definite for some function B, then there are no simple closed curves in G, composed by the orbits. This means that not only the criterion provides the conditions for the absence of the limit cycles, but also guarantees absence of the homo- and/or heteroclinic curves composed in a closed curve. 105
• The condition that G is simply connected (i.e., it does not have any holes, and any two points in G can be connected) is essential. It can be shown that if G is an annular region in which ∇ · Bf is sign definite, then G cannot contain more than one limit cycle.

Let me use Dulac's criterion to show that the general Lotka–Volterra system on the plane cannot have limit cycles.

Example 14.5. The general Lotka–Volterra model

ẋ = x(b1 + a11 x + a12 y),
ẏ = y(b2 + a21 x + a22 y),

cannot have limit cycles in R2 if a11 a22 − a12 a21 ≠ 0. First note that the axes x = 0 and y = 0 consist of orbits; hence the limit cycles, if they exist, should lie in one of the quadrants. Consider B(x, y) = x^(α−1) y^(β−1), where α and β are to be determined. I calculate

∇ · Bf = B(x, y) ((αa11 + βa21 + a11)x + (αa12 + βa22 + a22)y + αb1 + βb2).

By choosing α and β such that αa11 + βa21 + a11 = 0 and αa12 + βa22 + a22 = 0 (this always can be done due to the assumption), I have ∇ · Bf = B(x, y)(αb1 + βb2). If αb1 + βb2 ≠ 0 then by applying Dulac's criterion I obtain the conclusion. If αb1 + βb2 = 0 then the system admits the integrating factor B(x, y), can be integrated, and the nontrivial equilibrium will be surrounded by a family of closed curves (as in the case of the classical predator–prey Lotka–Volterra model).
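As a complementary sketch (my own addition), Dulac's criterion can also be checked symbolically for the predator–prey model with intraspecific competition used earlier; the Dulac function B(x, y) = 1/(xy) is my choice and is not taken from the notes.

# Dulac's criterion with B = 1/(x*y) for dx/dt = x(1 - a*x - y), dy/dt = y(-g - b*y + x).
import sympy as sp

x, y, a, b, g = sp.symbols('x y a b g', positive=True)
f1 = x * (1 - a * x - y)
f2 = y * (-g - b * y + x)
B = 1 / (x * y)

div_Bf = sp.simplify(sp.diff(B * f1, x) + sp.diff(B * f2, y))
print(div_Bf)
# Prints an expression equal to -a/y - b/x, which is negative everywhere in the open
# first quadrant, so by Dulac's criterion there are no limit cycles there.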
15
Limit sets. Lyapunov functions
By this point, considering the solutions to x˙ = f (x),
x(t) ∈ U ⊆ R2 ,
(15.1)
I was mostly interested in the behavior of solutions when t → ∞ (sometimes, this is called asymptotic behavior of the solutions). It does not mean that the transient behavior of the solutions is of no importance, but at this point I would like to concentrate on the case t → ∞. From the previous discussion it should be clear that an equilibrium or a limit cycle can be an ultimate outcome of the system’s dynamics. Can we have anything else? How to describe the asymptotic behavior in most general terms? In this lecture I will try to answer these and some other questions by studying the limit sets of the flow of (15.1). I will also introduce the notion of the Lyapunov function, an extremely powerful device to analyze behavior of (15.1). Math 484/684: Mathematical modeling of biological processes by Artem Novozhilov e-mail:
[email protected]. Fall 2015.
15.1
Limit sets and their properties
Recall that x(t; x0 ) denotes the solution to (15.1) at time t with the initial condition x(0) = x0 . Considered as the mapping of U to U parameterized by the time t, it is called the flow of system (15.1). The corresponding geometric object that I study is the orbit γ(x0 ). Definition 15.1. Point x ¯ is called an omega limit point of solution x(t; x0 ) or orbit γ(x0 ) if there exists a sequence t1 , . . . , tk , . . . of the time moments such that tk → ∞ as k → ∞, for which x(tk ; x0 ) → x ¯, k → ∞ holds. The set of all such points of x(t; x0 ) is called the ω-limit (or forward) set of x(t; x0 ) or orbit γ(x0 ) and denoted ω(x0 ). Similarly, for tk → −∞ an alpha limit point and α-limit (or backward) set α(x0 ) are defined. Example 15.2. If x ˆ is an equilibrium of (15.1) then ω(ˆ x) = α(ˆ x) = x ˆ. Example 15.3. An asymptotically stable limit cycle is the ω-limit set for at leat some initial conditions (can you think of a sequence of points such that for any point on the limit cycle there is a convergent sequence x(t; x0 )?). By the way, using the limit sets it is possible to define the limit cycle as the closed curve, which is a limit set for some initial conditions that do not belong to the curve itself. Example 15.4. Consider x˙ = −x,
x(t) ∈ R.
For the equilibrium x ˆ = 0 I have that α(0) = ω(0) = 0. For any other point x ∈ R \ {0} I obviously have ω(x) = 0, α(x) = ∅. Example 15.5. Here is a slightly more involved example: r˙ = r(a − r), θ˙ = sin2 θ + (r − a)2 . ˆ = (a, 0), and (ˆ ˆ = (a, π). All three of them Here I have three equilibria rˆ0 = 0, (ˆ r, θ) r, θ) are unstable. The analysis of the phase portrait shows that the ω-limit set for most of the initial conditions is given by the closed curve composed by two equilibria on the circle, and the heteroclinic trajectories connecting them (recall, that an orbit is called heteroclinic if it connects two equilibria). Therefore, in the examples I presented it is possible for the limit set of a planar system to be an equilibrium, a limit cycle, or a closed curve that is composed of equilibria and orbits. Can it be anything else? Limit sets have a lot of nice properties, which I summarize in the following theorem.
Theorem 15.6. Limit sets are closed and invariant. If x(t; x0 ) is bounded (i.e., |x(t; x0 )| < M for any t for some constant M which may depend on x0 ), then ω(x0 ) and α(x0 ) are non-empty and connected. I will leave the proof of this theorem to the reader. Here are a few remarks. • The limit set is closed means that for any convergent sequence x ¯k belonging to, e.g., ω(x0 ) the limit limk→∞ x ¯k = x ¯ is also in ω(x0 ). • The invariance property means that if x ¯ ∈ ω(x0 ) then the whole orbit x(t; x ¯) ∈ ω(x0 ). • The fact that a bounded orbit has a non-empty limit set follows immediately from the fact that any bounded sequence has a convergent subsequence. It also holds that ω(x0 ) is empty if and only if |x(t; x0 )| → ∞. • Connectedness of ω(x0 ) can be proved by contradiction. The celebrated Poincar´e–Bendixson Theorem classifies the possible limit sets of planar systems. Theorem 15.7 (Poincar´e–Bendixson). Consider planar system (15.1), for which the equilibria are isolated. If the positive orbit γ + (x0 ) (i.e., for t > 0) is bounded then 1. ω(x0 ) is an equilibrium, or 2. ω(x0 ) is a closed orbit, or 3. For each x ¯ ∈ ω(x0 ), α(¯ x) and ω(¯ x) are equilibria.
15.2
Lyapunov functions and limit sets
In the previous subsection I discussed the limit sets. But how actually can I find them? For this I will need the notion of the Lyapunov function. In particular, the following theorem holds. Theorem 15.8. Consider an autonomous system of ODE x˙ = f (x),
x(t) ∈ U ⊆ Rd ,
( ) and let G ⊆ U . Consider function V ∈ C (1) (G; R) and the mapping t 7→ V x(t; x0 ) along the solution x(t; x0 ). If the derivative of this mapping V˙ is sign definite (i.e., V˙ ≤ 0 or V˙ ≥ 0) then ω(x0 ) ∩ G (and α(x0 ) ∩ G) contained in the set {x : V˙ (x) = 0}. Proof. Assume that V˙ ≥ 0 (the other case is treated similarly). Let x ¯ ∈ ω(x0 ). This means such that x(t ; x ) → x ¯ . This means, by continuity of V˙ , that that there is sequence (tk )∞ 0 k k=1 V˙ (¯ x) ≥ 0. If V˙ (¯ x) = 0 then the theorem is proved. Therefore assume that V˙ (¯ x) > 0. ( ) This implies that V x(t; x ¯) > V (¯ x). On the other hand, since V is non-decreasing along the trajectories, ( ) V x(t; x0 ) ≤ V (¯ x). 108
Consider sequence (t + tk )∞ x(t + tk ; x0 ) = k=1 for some fixed t. By the properties of( the solutions ) x(t; x ¯), which together with the previous yields a contradiction V x(t; x ¯) ≤ V (¯ x). Therefore, ˙ V (¯ x) = 0. Function V as in the last theorem is called a Lyapunov function (but see the next subsection). The derivative V˙ is often called the derivative of V along the vector field f , or Lie derivative. Example 15.9. Consider again the predator–prey model with intraspecific competition, in dimensionless variables x˙ = x(1 − αx − y), y˙ = y(−γ − βy + x). I know from the previous analysis that for some parameter values (αγ < 1) it is possible to have a nontrivial equilibrium x ˆ2 = (ˆ x, yˆ), which corresponds to the mutual species coexistence. I also found that this equilibrium, when present in R2+ is asymptotically stable. In my analysis I was not able to completely discard the possibility of having a closed curve. This example fills this gap. Consider function, whose form is suggested by the integral of the classical Lotka–Volterra predator–prey model V (x, y) = x ˆ log x − x + yˆ log y − y and find its derivative along our vector field: V˙ = (ˆ x − x)(1 − αx − y) + (ˆ y − y)(−γ − βy + x) = α(ˆ x − x)2 + β(ˆ y − y)2 . For any (x, y) ∈ R2+ V˙ is non-negative and is equal to zero only at (ˆ x, yˆ). Therefore, this point is the only candidate for the ω-limit set, and therefore it is globally asymptotically stable for the system. Sometimes the set of zeros of V˙ is large, and I would like to identify those points which are actually in the limit sets. This can be done with ˆ = {x ∈ G : V˙ (x) = 0} be the set of zeros of the derivative of Theorem 15.10 (LaSalle). Let G a Lyapunov function along given vector field f , and let G be positive invariant. Then the orbits ˆ of the vector field tend to the maximal invariant subset of G.
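The computation in Example 15.9 can also be verified symbolically; the following sketch (not part of the notes) checks the identity for the derivative of V, with the coexistence equilibrium expressed through the parameters.

# Verify that dV/dt along the flow equals a*(xh - x)^2 + b*(yh - y)^2.
import sympy as sp

x, y, a, b, g = sp.symbols('x y a b g', positive=True)
xh = (g + b) / (1 + a * b)          # coexistence equilibrium: 1 - a*xh - yh = 0,
yh = (1 - a * g) / (1 + a * b)      #                          -g - b*yh + xh = 0
f1 = x * (1 - a * x - y)
f2 = y * (-g - b * y + x)
V = xh * sp.log(x) - x + yh * sp.log(y) - y

Vdot = sp.diff(V, x) * f1 + sp.diff(V, y) * f2
print(sp.simplify(Vdot - (a * (xh - x)**2 + b * (yh - y)**2)))   # prints 0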
15.3
Lyapunov functions and stability of equilibria
In the previous subsection a function V was called a Lyapunov function for vector field f in the domain G if it is sign definite in G along the orbits of f . A canonical, and slightly more stringent, definition of Lyapunov function is as follows. Definition 15.11. Let x ˆ be an equilibrium of x˙ = f (x), x(t) ∈ U ⊆ Rd . A function V ∈ (1) C (G; R), where G is a neighborhood of x ˆ, is called a Lyapunov function for x ˆ in G if: • V (ˆ x) = 0 and V (x) > 0 for all x ∈ G \ {ˆ x}. • V˙ (x) ≤ 0 for all x ∈ G. 109
If instead of the second condition, the condition • V˙ (x) < 0 for all x ∈ G \ {ˆ x} holds, then V is called a strict Lyapunov function. Using this definition, it is possible to prove Theorem 15.12 (Lyapunov). Let x ˆ be an equilibrium of x˙ = f (x), x(t) ∈ U ⊆ Rd . If there exists a Lyapunov function for x ˆ then x ˆ is stable, if there exists a strict Lyapunov function for x ˆ than x ˆ is asymptotically stable. There are no universal methods to construct Lyapunov functions, though. Example 15.13. Consider first the classical Lotka–Volterra model N˙ = aN − bP N, P˙ = cP N − dP. Recall that I was able to integrate this system, and found that the solutions are the level sets of H(N, P ) = bP + cN − a log P − d log N. If one considers a function ˆ , Pˆ ), V (N, P ) = H(N, P ) − H(N
ˆ = d , Pˆ = a N c b
ˆ , Pˆ ). Hence the equilibrium, as we then it can be checked that V is a Lyapunov function for (N ˙ already saw, Lyapunov stable. Here, actually, V ≡ 0. Consider again the predator–prey model with intraspecific competition, in dimensionless variables x˙ = x(1 − αx − y), y˙ = y(−γ − βy + x). And let H(x, y) = x ˆ log x − x + yˆ log y − y. One can check that V (x, y) = H(ˆ x, yˆ) − H(x, y) is a strict Lyapunov function for (ˆ x, yˆ). It is important to mention that Lyapunov functions allow to infer global aspects of the behavior of the orbits. For example, the following is true. Consider again the same system x˙ = f (x), x(t) ∈ U ⊆ Rd with an asymptotically stable equilibrium x ˆ. By definition, the basin of attraction B(ˆ x) of x ˆ is the set of initial conditions such that x(t; x0 ) → x ˆ for t → ∞. It can be shown that if V is a strict Lyapunov function for x ˆ, then the sets Gc = {x ∈ G : V (x) ≤ c} are in the basin of attraction of x ˆ. In the last example, x, yˆ). the whole R2+ is in the basin of attraction of (ˆ 110
15.4
Homework 6: Limit cycles
1. Can the system x˙ = y(1 + x − y 2 ) y˙ = x(1 + y − x2 )
have limit cycles in R2+ ? 2. Argue that if in some domain G expression ∇ · f is sign definite, then the system x˙ = f (x), x(t) ∈ R2 cannot have two nodes in G such that one is stable and another one is unstable. 3. Show that the system y , 1 + x2 −x + y(1 + x2 + x4 ) y˙ = , 1 + x2
x˙ =
has no limit cycles in R2 . 4. Consider the predator-prey model ( x) 2x y, x˙ = rx 1 − − 2 1+x 2x y˙ = −y + y, 1+x where r > 0, (x, y) ∈ R2+ . Prove that this system has no closed orbits by invoking Dulac’s criterion with the function B(x, y) = y α−1 (1 + x)/x for a suitable choice of α.
16
Gause predator–prey model
Recall that the Poincar´e–Bendixson theorem states, for the planar system x˙ = f (x),
x(t) ∈ U ⊆ R2 ,
(16.1)
the following trichotomy holds: Theorem 16.1 (Poincar´e–Bendixson). Consider planar system (16.1), for which the equilibria are isolated. If the positive orbit γ + (x0 ) (i.e., for t > 0) is bounded then 1. ω(x0 ) is an equilibrium, or 2. ω(x0 ) is a closed orbit, or 3. For each x ¯ ∈ ω(x0 ), α(¯ x) and ω(¯ x) are equilibria. 111
An immediate corollary is Corollary 16.2. Consider system (16.1) and assume that G ⊆ U is closed, bounded, and forward invariant. If G does not contain equilibria, then there is at least one closed curve in G. To illustrate the application of this theorem, consider a predator–prey model, which is given in a qualitative form: x˙ = xg(x) − yp(x), ( ) y˙ = y −d + q(x) ,
(16.2)
where d > 0 is the death rate of the predator y at the absence of prey x; g, p, q are C (1) functions satisfying the following constraints: • There exists a K > 0 (prey carrying capacity), such that g(x) > 0 if 0 < x < K, g(K) = 0, and g(x) < 0 if x > K; therefore, g(x) is the per capita prey growth rate if there is no predator and the prey population size is x. • p is the predator trophic function, which describes the number of prey killed by one predator, p(0) = 0 and p(x) > 0 for positive x; additionally, it is reasonable to have p(x) → A is x → ∞. • q gives the effectiveness of the prey consumption by predators, q(0) = 0, q(x) > 0, q ′ (x) > 0 if x > 0. System (16.2) was suggested by Gause13 as a more flexible alternative to the structurally unstable predator–prey model by Lotka and Volterra. I start my analysis of (16.2) with the null clines. Null clines for y are y = 0,
l1 = {(x, y) : x = x ˆ, q(ˆ x) = d},
and due to the conditions on q there is a unique such point x ˆ. For the prey population, I have { } xg(x) x = 0, l2 = (x, y) : y = . p(x) Due to the fact the axes are composed of the orbits, R2+ is invariant. Two cases are possible. In the first case, assume that x ˆ < K, then in R2+ there are three equilibria x ˆ0 = (0, 0), x ˆ1 = (K, 0), x ˆ2 = (ˆ x, yˆ), where yˆ =
x ˆg(ˆ x) . p(ˆ x)
In the case x ˆ > K only x ˆ0 and x ˆ1 belong to R2+ . The Jacobi matrix of (16.2) is [ ] g(x) + xg ′ (x) − yp′ (x) −p(x) ′ f (x, y) = . yq ′ (x) −d + q(x) 13
Gause, G. F. (2003). The struggle for existence. Courier Dover Publications.
Figure 16.1: Positive invariant set in the Gause model I have
[
] g(0) 0 f (ˆ x0 ) = , 0 −d ′
therefore, x ˆ0 is always a saddle, with the stable manifold on y-axis and unstable manifold on x-axis. Point x ˆ1 can be either stable or unstable. Indeed, [ ′ ] Kg (K) −p(K) ′ f (ˆ x1 ) = . 0 −d + q(K) I have, due to the assumptions, that g ′ (K) < 0, therefore, if q(K) > d then x ˆ1 is a saddle, and if q(K) < d then x ˆ1 is a stable node (in the latter case the predator goes extinct and the prey settles at its carrying capacity). If x ˆ1 is a saddle, then, according to the analysis, we also have x ˆ2 ∈ R2+ . [ ] [ ] g(ˆ x) + x ˆg ′ (ˆ x) − yˆp′ (ˆ x) −p(ˆ x) µ(ˆ x) −p(ˆ x) ′ f (ˆ x2 ) = = , yˆq ′ (ˆ x) 0 yˆq ′ (ˆ x) 0 (
where µ(ˆ x) = p(x) I have
tr f ′ (ˆ x2 ) = µ(ˆ x),
xg(x) p(x)
)′
.
x=ˆ x
det f ′ (ˆ x2 ) = yˆq ′ (ˆ x)p(ˆ x) > 0.
The sing of tr f ′ (ˆ x2 ) is determined by the slope of the tangent line to l2 at x ˆ. If this slope is positive, then x ˆ2 is unstable node or focus. This implies that there exists a small neighborhood N (ˆ x2 ) of x ˆ2 which is negative invariant (the orbits leave this neighborhood with t > 0) (see the figure). Now consider the unstable set (separatrix ) of x ˆ1 . It emanates from x ˆ1 in the region where x˙ < 0, y˙ > 0 and therefore is doomed to reach the null-cline x = x ˆ, say at the point B1 . After B1 I have x˙ < 0 and y˙ < 0, therefore, the orbit has to cross l2 , at the point B2 . Now I am in the 113
domain where x˙ > 0, y˙ < 0, which means that the orbit has to cross again l1 at the point B3 . Let B4 = (ˆ u, 0). Consider the closed path x ˆ 1 , B1 , B 2 , B3 , B4 , x ˆ1 . Define G as the domain confined by this path without N (ˆ x2 ). G is positive invariant as the set whose boundaries either orbits, or the line on which the direction of the phase flow points “inwards” of G. By construction, G does not possess equilibria and hence, by the Poincar´e–Bendixson theory, must contain at least one closed curve, which has to be an ω-limit set for the orbits entering G. How does this closed curve (the limit cycle) appear in the system? Assume that x ˆ is such that µ(ˆ x) < 0. Then x ˆ2 is stable. If I decrease x ˆ to the point where the slope of l2 is zero, then µ(ˆ x) = 0, and I have non-hyperbolic equilibrium x ˆ2 with two purely imaginary eigenvalues. I know that in this case I cannot use the linearization to judge the behavior of the orbits of the nonlinear system. In the terminology I discussed earlier in the course, this point corresponds to a bifurcation of x ˆ2 , and since a necessary condition for this bifurcation is α(ˆ x) = 0, where α(ˆ x) is the real part of the eigenvalues of the Jacobi matrix at x ˆ2 , then this bifurcation is inherently two dimensional. This bifurcation is called the Poincar´e–Andronov–Hopf bifurcation, or the bifurcation of the birth of a limit cycle, and I will discuss it in the next lecture. Example 16.3. Consider a predator–prey model that takes into account intraspecific prey competition and predator saturation: ) ( CN P N ˙ − , N =N 1− K A+N ( ) (16.3) BN P˙ = P −d + , A+N which is clearly belongs to the class of the Gause model with g(N ) = 1 −
N , K
p(N ) =
CN , A+N
q(N ) =
BN . A+N
ˆ < K (this is true for B > d) then there is x ˆ , Pˆ ) ∈ R2+ , where If 0 < N ˆ2 = (N ˆ )(A + N ˆ) (K − N . Pˆ = CK Since
(
N g(N ) p(N )
)′ =
K − 2N − A , CK
then the condition for the existence of the limit cycle takes the form ˆ < K −A. N 2 An example of the limit cycle in the system is given in the figure
17
Poincar´ e–Andronov–Hopf bifurcation
I already mentioned that there are no regular methods to study the limit cycles of the systems on the plane. Probably, one of the most important approaches, together with the Poincar´e– 114
Figure 16.2: The limit cycle in Example 16.3 Bendixson theory, is the Poincar´e–Andronov–Hopf bifurcation14 , which is the only genuinely two dimensional bifurcation (i.e., it cannot be observed in systems of dimension 1), which can occur in generic two dimensional autonomous systems depending on one parameter (co-dimension one bifurcation). Consider the system x˙ = f (x, α), x(t) ∈ U ⊆ R2 (17.1) that depends on a scalar parameter α ∈ R. Definition 17.1. A bifurcation of an equilibrium of system (17.1), for which a pair of purely imaginary eigenvalues λ1,2 = ±iω0 , ω0 > 0, appears, is called the Poincar´e–Andronov–Hopf bifurcation, or the bifurcation of the birth of a limit cycle. Example 17.2. Consider the system x˙ = αx − y − x(x2 + y 2 ), y˙ = x + αy − y(x2 + y 2 ).
(17.2)
This system has the equilibrium x ˆ0 = (0, 0) for all parameter values α, and the Jacobi matrix of (17.2) evaluated at x ˆ0 is [ ] α −1 ′ A := f (ˆ x0 ) = . 1 α The eigenvalues of A are λ1,2 (α) = α ± i. Introducing complex variable z = x + iy, (17.2) can be rewritten in the complex form z˙ = (α + i)z − z|z|2 . 14
V.I. Arnold writes in his textbook “Geometrical methods in the theory of ordinary differential equations”: “Considered above theorem was known essentially to Poincar´e. An explicit formulation and proof were given by A.A. Andronov. [...] R.Tom, who I thought this theory in 1965, was advocating it under the name Hopf bifurcation.”
Using the exponential form of the complex numbers z = reiθ , I can rewrite (17.2) in the polar coordinates as r˙ = r(α − r2 ), θ˙ = 1. The last system can be easily analyzed since the equations are decoupled. The first equation √ always has the equilibrium rˆ0 = 0, and, if α > 0, another equilibrium rˆ1 = α. The linear analysis shows that rˆ0 = 0 is asymptotically stable if α < 0 and unstable if α > 0. Note that if α = 0 we cannot analyze the stability of rˆ0 = 0 by the linear approximation, however, the trivial equilibrium of r˙ = −r3 is asymptotically stable. For α > 0 another equilibrium rˆ1 appears. The second equation describes the counterclockwise rotation with constant speed. Superposition of these two behaviors yields the bifurcation diagram of system (17.2) (see the figure), which shows that for α > 0 an asymptotically stable unique limit cycle appears.
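A quick numerical sketch (my addition) confirms this picture: for a positive value of α the orbits of (17.2) settle on a circle of radius √α; the value α = 0.25 is arbitrary.

# Orbits of the supercritical normal form (17.2) approach a circle of radius sqrt(alpha).
import numpy as np
from scipy.integrate import solve_ivp

alpha = 0.25

def rhs(t, z):
    x, y = z
    r2 = x**2 + y**2
    return [alpha * x - y - x * r2, x + alpha * y - y * r2]

sol = solve_ivp(rhs, (0, 100), [1.0, 0.0], rtol=1e-9)
print("final radius:", np.hypot(*sol.y[:, -1]), " sqrt(alpha):", np.sqrt(alpha))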
Figure 17.1: Supercritical Poincar´e–Andronov–Hopf bifurcation System x˙ = αx − y + x(x2 + y 2 ), y˙ = x + αy + y(x2 + y 2 ), 116
(17.3)
can be analyzed in a similar way. This system also has Poincar´e–Andronov–Hopf bifurcation for α = 0, the difference is that the limit cycle exists for α < 0 and it is unstable. For α > 0 there is no limit cycle, and rˆ0 = 0 is unstable. Note that for α = 0 non-hyperbolic equilibrium rˆ0 is unstable (see the figure).
Figure 17.2: Subcritical Poincar´e–Andronov–Hopf bifurcation Bifurcation in (17.2) is called supercritical, the limit cycle exists for positive parameter values (“after” the bifurcation), whereas the bifurcation in (17.3) is called subcritical. In the former case the unique stable equilibrium is replaced with a unique asymptotically stable limit cycle of √ a small amplitude α, and the system stays in a neighborhood of rˆ0 = 0. This is so-called soft or non-catastrophic loss of stability. In the latter case, the basin of attraction of rˆ0 is bounded by the unstable limit cycle for negative α, and if α becomes positive, the system leaves any neighborhood of the origin. This is so-called hard or catastrophic loss of stability. The type of the Poincar´e–Andronov–Hopf bifurcation (soft or hard) is determined by the stability of the trivial equilibrium at the bifurcation parameter value. It turns out that the situation in the example above appears in many different systems of the form (17.1). Here is a general statement without proof. Theorem 17.3. Any system (17.1) that has an equilibrium x ˆ for the parameter values |α − αb | < ϵ for some ϵ > 0, whose linearization has eigenvalues λ1,2 (α) = µ(α) ± iω(α) such that 117
µ(αb ) = 0, ω(αb ) = w0 > 0, and satisfying the following conditions dµ (α) ̸= 0, dα α=αb
(17.4)
and L1 (αb ) ̸= 0,
(17.5)
experiences the Poincar´e–Andronov–Hopf bifurcation. The bifurcation is supercritical if L1 (αb ) < 0 and subcritical if L1 (αb ) > 0. Remark 17.4. The statement of the theorem includes the condition L1 (αb ) ̸= 0 for the first Lyapunov values L1 at the bifurcation parameter value αb . Moreover, the sign of L1 (αb ) actually determines the type of the bifurcation. However, I never explained how to actually calculate L1 (αb ). This is a nontrivial computational problem and I skip it in these lectures. The interested reader should consult an extremely readable account in Kuznetsov’s textbook15 . The type of the Poincar´e–Andronov–Hopf bifurcation can be inferred if the stability of the equilibrium x ˆ at the bifurcation value α = αb can be analyzed (e.g., with the help of the Lyapunov functions). Anyway, appearance of purely imaginary eigenvalues that cross the imaginary axis with non-zero speed should indicate that it is possible to have a limit cycle somewhere close. Example 17.5. To illustrate this theorem, consider the following predator–prey system ( )( ) N N N˙ = rN −1 1− − aN P, Lp Kp P˙ = −cP + dN P, where all the parameters are assumed to be positive, and Lp < Kp . This system can be put in dimensionless form as x˙ = x(x − l)(K − x) − xy, y˙ = −γy + xy, where 0 < l < K and γ > 0. It is possible to have up to four equilibria: ( ) x ˆ0 = (0, 0), x ˆ1 = (l, 0), x ˆ2 = (K, 0), x ˆ3 = γ, (γ − l)(K − γ) , and x ˆ3 ∈ R2+ if and only if l < γ < K. The Jacobi matrix for our system is [ ] (2x − l)(K − x) − x(x − l) − y −x ′ f (x) = . y x−γ Analysis of the eigenvalues yields that x ˆ0 is an asymptotically stable node, x ˆ1 is an unstable node if l > γ, otherwise it is a saddle with the unstable manifolds on x-axis, x ˆ2 is an asymptotically stable node if K < γ, otherwise it is a saddle with stable manifolds on x-axis. The Jacobi matrix at x ˆ3 takes the form [ ] γ(K + l − 2γ) −γ ′ f (ˆ x3 ) = , γ(K + l − γ) − Kl 0 15
Kuznetsov, Yu. A. (1998). Elements of applied bifurcation theory (Vol. 112). Springer.
which implies that ( ) det f ′ (ˆ x3 ) = γ γ(K − γ + l) − lK .
tr f ′ (ˆ x3 ) = γ(K − 2γ + l),
Therefore, if γ > (l + K)/2 then x ˆ3 is asymptotically stable. If γ = (l + K)/2 =: γb , then tr f ′ (ˆ x3 ) = 0,
det f ′ (ˆ x3 ) =
(K + l)(K − l)2 , 8
and I have a non-hyperbolic equilibrium with purely imaginary eigenvalues. It is easy to check the first condition for the Poincar´e–Andronov–Hopf bifurcation: dµ x3 ) 1 d tr f ′ (ˆ = −K − l ̸= 0. (γ) = dγ 2 dγ γ=γb γ−γb Somewhat tedious calculations lead to
√ √ 1 2 K +l L1 (γb ) = − , ω0 4(K − l)
which proves that the bifurcation is supercritical (note that ω0 = det f ′ ), with appearance of a unique stable limit cycle (see the figure).
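The eigenvalue crossing can also be checked numerically; in the sketch below (not from the notes) the values l = 1 and K = 3 are arbitrary illustrative choices.

# Eigenvalues at x3 of Example 17.5 change from unstable to stable as gamma crosses
# gamma_b = (l + K)/2.
import numpy as np

l, K = 1.0, 3.0
gamma_b = (l + K) / 2

def jacobian_at_x3(gamma):
    x, y = gamma, (gamma - l) * (K - gamma)
    return np.array([[(2 * x - l) * (K - x) - x * (x - l) - y, -x],
                     [y, x - gamma]])

for gamma in (gamma_b - 0.1, gamma_b, gamma_b + 0.1):
    eig = np.linalg.eigvals(jacobian_at_x3(gamma))
    print(f"gamma = {gamma:.2f}:  eigenvalues = {np.round(eig, 4)}")
# Below gamma_b the real parts are positive (unstable focus), at gamma_b the eigenvalues
# are purely imaginary, and above gamma_b the real parts are negative, in agreement with
# the Hopf scenario.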
Figure 17.3: The limit cycle in Example 17.5
17.1
Homework 7: Limit cycles
1. What is the simplest chemical reaction to produce periodic oscillations? Here is an example. • Consider a hypothetical chemical reaction that involves four chemicals, two of which kept at a constant level: k1
X A, k−1
k
2 B −→ Y,
119
k
3 2X + Y −→ 3X.
Using the Law of Mass Action (you may consult Section 5 of the lecture notes), write down two differential equations for concentrations of X and Y assuming that A and B are kept at constant concentrations. • Find an appropriate dimensionless variables and parameters to end up with the system x˙ = a − x + x2 y, y˙ = b − x2 y, where a > 0, b > 0 • Show that R2+ is forward invariant for this system. • Find the only equilibrium (ˆ x, yˆ) of this system and sketch the null-clines. • Argue that there is a bounded subset of R2+ containing (ˆ x, yˆ) which is forward invariant. • By analyzing the stability of (ˆ x, yˆ) prove that for some parameter values it is possible to have a limit cycle in this system (you will need to invoke the Poincar´e–Bendixson theory). • Sketch the area in the space on parameters (a, b) for which periodic solutions are possible. Hint: You may what to consider x ˆ as a parameter to find a parametric equation for the bifurcation boundary. 2. Check that each of the following systems has an equilibrium that exhibits the Poincar´e– Andronov–Hopf bifurcation, check one of the bifurcation conditions, and find the bifurcation value of the bifurcation parameter. • Van der Pol’s oscillator (you will need to rewrite it as a system): y¨ − (α − y 2 )y˙ + y = 0, • Bautin’s example: x˙ = y, y˙ = −x + αy + x2 + xy + y 2 . • Brusselator, which is a hypothetical chemical reaction: k
1 A −→ X
k
2 B + X −→ Y +D
k
3 2X + Y −→ 3X
k
4 X −→ E
where X and Y are reactants and A, B, D, E are assumed to be constant.
18
Biological models with discrete time The most important applications, however, may be pedagogical. The elegant body of mathematical theory pertaining to linear systems (Fourier analysis, orthogonal functions, and so on), and its successful application to many fundamentally linear problems in the physical sciences, tends to dominate even moderately advanced University courses in mathematics and theoretical physics. The mathematical intuition so developed ill equips the student to confront the bizarre behavior exhibited by the simplest of discrete nonlinear systems, such as equation (3)16 . Yet such nonlinear systems are surely the rule, not the exception, outside the physical sciences. I would therefore urge that people be introduced to, say, equation (3) early in their mathematical education. This equation can be studied phenomenologically by iterating it on a calculator, or even by hand. Its study does not involve as much conceptual sophistication as does elementary calculus. Such study would greatly enrich the student’s intuition about nonlinear systems. Not only in research, but also in the everyday world of politics and economics, we would all be better off if more people realized that simple nonlinear systems do not necessarily possess simple dynamical properties. Robert M. May Simple mathematical models with very complicated dynamics. Nature, 261(5560), 1976, 459-467 A certain man put a pair of rabbits in a place surrounded on all sides by a wall. How many pairs of rabbits can be produced from that pair in a year if it is supposed that every month each pair begets a new pair which from the second month on becomes productive? Leonardo Pisano Bigollo (1170 – 1250), known as Fibonacci Liber Abaci, 1202
18.1
Introduction
Up till now I discussed mathematical models of biological process that are characterized by continuous time; this means that at every time instant it is possible to have only one elementary event, and the parameters of my models specify the rates of the events in the system, i.e., the number of events per time unit. For population models this, for instance, means that the generations of our models are overlapping, and birth and death events can occur at every time instant. A number of biological systems, however, can be characterized by discrete time, which means that there are specific time moments at which the elementary events in our system can occur, and it is not required that at these discrete time instants only a unique event happens. For example, such discrete time is reasonable to introduce in some models of fish populations, which reproduce at specific time moments, or for insect populations, for which quite often non-overlapping populations are what is actually observed in reality. Anyway, for many situations (think also about observable time series) I need a modeling tool to describe sequences of the variables that are of interest to me, and hance I am naturally led 16
Equation (3) is given by Xt+1 = aXt (1 − Xt )
to consider discrete maps or discrete dynamical systems in the form Nt+1 = f (Nt ),
f : U −→ U,
Nt ∈ U ⊆ R,
t ∈ Z,
(18.1)
where the index notation Nt emphasizes the discrete character of the time variable in the system. Sometimes, however, I will use the usual nutation N (t), when it is more convenient. There is an equivalent notation for system (18.1): N 7→ f (N ),
N ∈ U ⊆ R.
(18.2)
Quite often the maps that I consider are non-invertible, and in this case t ∈ Z+ = {0, 1, 2, . . .}. If I am given an initial condition N0 then discrete dynamical system (18.1) defines an orbit γ(N0 ) = {N0 , f (N0 ), f 2 (N0 ), . . .}, where I use the notation f k := f ◦ f ◦ . . . ◦ f ◦ f, for k times composition of function f (i.e., k times successive applications of f ). I.e., f 2 (x) = f (f (x)) and f 4 (x) = f (f (f (f (x)))). Example 18.1. Consider a simple example of a population growth. By definition, a relative population growth at time moment t is defined by rt :=
Nt+1 − Nt , Nt
where Nt is the population size at time t. Assuming that rt = r = const for any t, I find Nt+1 = (1 + r)Nt = wNt ,
w := 1 + r.
This equation linear and can be easily solved explicitly: Nt = wt N0 = (1 + r)t N0 . Therefore, I obtain an important conclusion that the population is growing exponentially without bounds if w > 1 (r > 0), declining to zero if 0 < w < 1 (−1 < r < 0) and stays constant if w = 1 (r = 0). It is instructive to compare these three phases of population behavior with the solutions of the continuous time Malthus model N˙ = mN . In general, of course, linear models cannot be used on long time intervals, since they predict either unbounded growth or extinction. To guarantee that the orbit is bounded, I should consider a nonlinear model. Example 18.2 (Discrete logistic equation). Consider the discrete dynamical system ( ) Nt Nt+1 = rNt 1 − , K where r, K are positive parameters. Now the model is nonlinear, and the population cannot grow to infinity. However, there is another drawback of this model: For Nt exceeding K, Nt+1 < 0, which contradicts biological interpretation of the model. 122
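The drawback just mentioned is easy to see numerically; the following tiny sketch (my addition) uses arbitrary illustrative values with r > 4.

# For the discrete logistic map an iterate that exceeds K is mapped to a negative value.
r, K, N = 4.5, 1.0, 0.5
for t in range(4):
    N = r * N * (1 - N / K)
    print(f"N_{t + 1} = {N:.4f}")
# Output: N_1 = 1.1250 (exceeds K), N_2 = -0.6328 (negative), after which the iterates
# are meaningless as population sizes.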
Example 18.3 (Ricker's equation). To make sure that the population is bounded for all t and at the same time is nonnegative, I can consider the so-called Ricker model, which is widely used for modeling fish populations:

Nt+1 = Nt exp(r (1 − Nt/K))
.
Here, obviously, N_t ≥ 0 for all t > 0 if N_0 ≥ 0.

Example 18.4. Concluding this short section, consider the second epigraph to this lecture, which mathematically can be formulated as N_{t+1} = N_t + N_{t−1}, where N_t is the number of pairs of rabbits capable of reproduction in the t-th month. Given the initial conditions N_0 = 0, N_1 = 1, it is easy to see that the solution should be given by the sequence of Fibonacci numbers 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ..., but the question is how to find a general solution to this equation. Another thing to point out here is that the equation written in this example is not a discrete dynamical system according to my definition, since the population size at t + 1 is determined through two prior points. This can, however, be fixed by considering an additional variable M_t = N_{t−1}; then

M_{t+1} = N_t,   N_{t+1} = N_t + M_t,

which is a two-dimensional discrete dynamical system. In general, a d-dimensional discrete dynamical system is defined as a map of a subset U of R^d to itself:

x ↦ f(x),   x ∈ U ⊆ R^d,   f : U → U.
18.2 Cobweb diagram
There is a simple and efficient way to get an idea of the general behavior of the orbits of (18.1) by looking at the graph of the function f. Before describing this method, I note that a point N̂ is called a fixed point of (18.1) if f(N̂) = N̂. Geometrically this means that the fixed points are the points of intersection of the graph of f with the bisectrix of the first and third quadrants (the diagonal N_{t+1} = N_t) on the coordinate plane. Now consider a discrete map with f shown in the figure below. Let N_0 be an initial point; then N_1 = f(N_0) is the ordinate of the point of intersection of the graph of f with the vertical line passing through N_0. Now I can use the diagonal N_{t+1} = N_t to find the location of N_1 on the N-axis: this can be done simply by finding the intersection of the horizontal line with ordinate N_1 and the diagonal, and projecting it onto the N-axis. Then N_2 = f(N_1), and so on. The whole orbit is simply a series of reflections from the diagonal (see the figure); the picture I obtain is sometimes called a cobweb diagram.
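The same procedure is easy to carry out numerically. Below is a minimal Python sketch (the particular map, the initial condition, and the number of steps are arbitrary illustrative choices, not taken from the figures) that produces the sequence of points N_0, N_1 = f(N_0), N_2 = f(N_1), ... which the cobweb diagram visualizes:

    # Iterate a one-dimensional map and list the orbit points that the
    # cobweb diagram connects; f and N0 below are illustrative choices only.
    def f(N):
        return 3.2 * N * (1.0 - N)   # an example map of logistic type

    def orbit(N0, steps):
        points = [N0]
        for _ in range(steps):
            points.append(f(points[-1]))
        return points

    for t, N in enumerate(orbit(0.1, 10)):
        print(t, round(N, 4))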
Figure 18.1: Cobweb diagram

Consider now the situation shown in Figure 18.2. The large black dots show the locations of the fixed points, the small black dot is the initial condition, and together with the cobweb diagram I present the time series (N_t)_{t=0}^k on the right. A little playing with the cobweb diagram and choosing different initial conditions should convince you that the picture presented in the right panel of Fig. 18.2 is universal: for any initial condition the orbit approaches the fixed point N̂, and the convergence to this point (except perhaps for the first few steps) is monotone. It is natural to call such a fixed point asymptotically stable, or a sink.
Figure 18.2: Cobweb diagram

Using the cobweb diagram I can get an idea of what kind of phenomena can be expected in discrete dynamical systems. For example, in Figure 18.3 one can see that an asymptotically stable fixed point can attract orbits in an oscillatory way. This fact alone should convince you that one-dimensional discrete dynamical systems possess richer behavior than scalar ordinary differential equations: since the state space of a scalar ODE is one dimensional and orbits cannot intersect, any non-monotone behavior of solutions is impossible there. In scalar discrete dynamical systems periodic solutions can also be observed (see Fig. 18.4).
Figure 18.3: Cobweb diagram. Non-monotone convergence

It can be seen in Fig. 18.4 that there are two points N_1 and N_2 such that N_2 = f(N_1) and N_1 = f(N_2), hence N_1 = f^2(N_1) and N_2 = f^2(N_2). The orbit {N_1, N_2} is called a 2-periodic orbit, or an orbit with period 2.
Figure 18.4: Cobweb diagram. Periodic solutions

A remarkable fact is that if system (18.1) has a 3-periodic solution then it has a k-periodic solution for any k. An example of a 3-periodic solution is given in Fig. 18.5.
Figure 18.5: Cobweb diagram. A 3-periodic solution

Finally, aperiodic orbits can be observed. In Fig. 18.6 an example of such an orbit is given. In
the top row the first 50 (left panel) and 250 (right panel) points of the orbit are shown. Such orbits are called chaotic, and the system itself is called chaotic. I will need a few mathematical preliminaries to define what "chaotic" means exactly, but here you can imagine writing down 0 every time the coordinate of the orbit is below 0.5, and 1 when the coordinate is above 0.5. As a result you will get a sequence like 00111010011011000110... You can produce a similar sequence by tossing a coin and recording 1 for heads and 0 for tails. Both sequences look (whatever this means) random. Moreover, there exists no statistical test to determine which sequence is produced by a random experiment (tossing a coin) and which by a deterministic discrete dynamical system of the form (18.1).
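A short Python sketch of this symbolic bookkeeping (the map x ↦ 4x(1 − x) and the initial condition are illustrative choices, consistent with the chaotic regime discussed in a later lecture):

    # Record 0 when the orbit point is below 0.5 and 1 otherwise,
    # as described in the text.
    def logistic(x):
        return 4.0 * x * (1.0 - x)

    x = 0.123   # an arbitrary initial condition
    symbols = []
    for _ in range(60):
        symbols.append('1' if x > 0.5 else '0')
        x = logistic(x)
    print(''.join(symbols))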
Figure 18.6: Cobweb diagram. A chaotic orbit
19 Stability of fixed points

19.1 Mathematical analysis of fixed points
In this lecture I will present the basics of the mathematical analysis of one-dimensional discrete dynamical systems. The exposition is very concise, since all the notions discussed here were already treated in the realm of ordinary differential equations. An important warning should be made: although the discussion closely follows that of ODE, there is a lot that distinguishes maps with discrete time from autonomous ODE with continuous time, so the student should pay careful attention.
A one-dimensional discrete dynamical system, or a scalar dynamical system with discrete time, or simply a map, is a pair {f, U}, where U ⊆ R is such that for any x ∈ U the mapping x ↦ f(x) ∈ U is defined. U is the phase or state space, and f is the evolutionary operator. It is customary to write the discrete dynamical system as

x_{t+1} = f(x_t),   t ∈ T,   (19.1)

or

x(t + 1) = f(x(t)),   t ∈ T,   (19.2)

or

x ↦ f(x),   (19.3)

where T = Z if the dynamical system is invertible, or T = Z_+ if the dynamical system is non-invertible. For an invertible dynamical system one must have that f is one-to-one and onto (a bijection). The orbit starting at a point x_0 ∈ U is defined as the set of points

γ(x_0) = {x ∈ U : there is k ∈ T such that f^k(x_0) = x},

where f^k = f ∘ ... ∘ f (k times). If f is invertible, then f^k is defined also for negative k, and f^{−k} = f^{−1} ∘ ... ∘ f^{−1} (k times).
If the dynamical system is invertible then I can define orbits, forward orbits, and backward orbits, whereas for non-invertible systems only forward orbits make sense. With a slight abuse of language, I will call forward orbits of non-invertible dynamical systems simply orbits. Note that if f is invertible then f^l ∘ f^k = f^{l+k} for any l, k ∈ Z. If f is non-invertible then this is true only for l, k ∈ Z_+. If an orbit consists of a unique point x̂, γ(x̂) = {x̂}, then this point is called a fixed point of (19.1). Different terms can be used, such as equilibrium, rest point, critical point, stationary point, etc., but I will try to be consistent and use the term "equilibrium" for systems with continuous time and the term "fixed point" for discrete dynamical systems. A quite straightforward fact is that x = x̂ is a fixed point of (19.1) if and only if x̂ is a root of f(x) = x. I am mostly interested in asymptotically stable fixed points, i.e., those fixed points that attract nearby orbits. Let me call such fixed points sinks. The fixed points that repel nearby orbits are called sources. Let me prove

Proposition 19.1. Consider a discrete dynamical system x ↦ f(x), and let x̂ be a fixed point. Then x̂ is a sink if |f′(x̂)| < 1 and a source if |f′(x̂)| > 1.

Proof. Let |f′(x̂)| < 1 and choose a constant a with |f′(x̂)| < a < 1. Then, by the mean value theorem and the continuity of f′, |f(x) − f(x̂)| ≤ a|x − x̂| for all x sufficiently close to x̂. Since f(x̂) = x̂, each application of f contracts the distance to x̂ at least by the factor a, hence |f^k(x) − x̂| ≤ a^k|x − x̂| → 0 as k → ∞, and x̂ is a sink. The case |f′(x̂)| > 1 is one of the homework problems. □

The number µ = f′(x̂) is called the multiplier of the fixed point x̂. If |µ| ≠ 1 the fixed point is called hyperbolic, and non-hyperbolic otherwise.

Example 19.2. Consider the discrete logistic map x ↦ rx(1 − x) =: f(x), r > 0,
which has two fixed points x̂_0 = 0 and x̂_1 = (r − 1)/r. I have f′(x) = r − 2rx; therefore f′(x̂_0) = r, and hence x̂_0 is a sink if and only if 0 < r < 1 and a source if r > 1. Next, f′(x̂_1) = 2 − r, and this point is a sink if 1 < r < 3 and a source if r > 3. At r = 1 and r = 3 I have non-hyperbolic fixed points.

The last example shows that, similarly to ODE, when a parameter changes, the number and type of fixed points in the system can change; hence we observe a bifurcation. To define a bifurcation, I first define which two discrete dynamical systems I consider to be qualitatively the same: a dynamical system {f, U} is called topologically equivalent to a dynamical system {g, U} if there exists a homeomorphism h that maps orbits of the first dynamical system onto the orbits of the second dynamical system, preserving the direction of time. In contrast with the case of ODE, for two topologically equivalent discrete dynamical systems it is possible to write down an explicit relationship. Indeed, let x ↦ f(x) and y ↦ g(y) be topologically equivalent. This means that there is a homeomorphism h that maps the orbit ..., f^{−2}(x), f^{−1}(x), x, f(x), f^2(x), ... onto ..., g^{−2}(y), g^{−1}(y), y, g(y), g^2(y), ..., which is equivalent to

h(x) = y,   h(f(x)) = g(y),
or
h(f(x)) = g(h(x)),
or
f = h−1 ◦ g ◦ h.
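The relation h ∘ f = g ∘ h can be tested numerically. As an illustration of conjugacy that is not taken from these notes, it is well known that the tent map T(y) = 1 − |1 − 2y| on [0, 1] is conjugate to the logistic map g(x) = 4x(1 − x) via the homeomorphism h(y) = sin²(πy/2); a minimal Python sketch checking h(T(y)) = g(h(y)) at a few random points:

    import math, random

    def tent(y):                # T(y) = 1 - |1 - 2y| on [0, 1]
        return 1.0 - abs(1.0 - 2.0 * y)

    def logistic(x):            # g(x) = 4x(1 - x)
        return 4.0 * x * (1.0 - x)

    def h(y):                   # conjugating homeomorphism h(y) = sin^2(pi y / 2)
        return math.sin(math.pi * y / 2.0) ** 2

    # h(T(y)) should coincide with g(h(y)) for every y in [0, 1]
    for _ in range(5):
        y = random.random()
        print(abs(h(tent(y)) - logistic(h(y))))   # differences of order 1e-16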
Two such maps f and g are called conjugate. If a parameter in the system is varied, then it is possible to obtain topologically non-equivalent dynamical systems. The appearance of topologically non-equivalent dynamical systems is called a bifurcation. Exactly for the same reasons as for the equilibria of ODE, bifurcations of fixed points in discrete dynamical systems occur precisely when the multiplier evaluated at the fixed point is either −1 or 1 (the fixed point is non-hyperbolic).

Example 19.3 (Fold or saddle-node bifurcation). Consider the map x ↦ α + x + x²,
α ∈ R.
The set of fixed points is given by α = −x²,
and hence if α > 0 there are no fixed points, and if α < 0 there are two fixed points ±√(−α), one of which is a sink and the other a source (check this). α = 0 is a bifurcation parameter value, for which a non-hyperbolic fixed point x̂ = 0 exists with f′(x̂) = 1. When α changes from negative to positive values, the two fixed points "approach" each other, "collide" at the bifurcation value, and disappear. The bifurcation diagram is identical to the bifurcation diagram of the scalar ODE ẋ = α + x², which I drew in one of the first lectures. Two other cases with f′(x̂) = 1 correspond to the pitchfork and transcritical bifurcations (see homework problems). In contrast with the ODE case, there is one more generic bifurcation of scalar maps, which corresponds to the appearance of a non-hyperbolic fixed point with multiplier −1.

Example 19.4 (Flip or period doubling bifurcation). Consider the map x ↦ −(1 + α)x + x³ =: f(x, α). I have the fixed point x̂ = 0 for all values of α, and its multiplier is µ = −(1 + α). Therefore, for small |α|, x̂ is a sink if α < 0 and a source if α > 0. If α = 0 then x̂ is non-hyperbolic and µ = −1. To see what actually happens, consider the second iterate of f(x, α):

f²(x, α) = f(f(x, α), α) = −(1 + α)(−(1 + α)x + x³) + (−(1 + α)x + x³)³
         = (1 + α)²x − (1 + α)x³ − (1 + α)³x³ + o(x³)
         = (1 + α)²x − (2 + 4α + 3α² + α³)x³ + o(x³).

As expected, f² has the fixed point x̂ = 0. However, for small enough α > 0 I also have the fixed points x̂_{1,2} = ±√α + o(√α) ≈ ±√α. This means, in terms of the original map f, that

x̂_1 = f(x̂_2, α),   x̂_2 = f(x̂_1, α).

In terms of the already discussed bifurcations, the second iterate of f experiences a pitchfork bifurcation. Direct calculations confirm that x̂_{1,2} are stable fixed points of f², and hence form a stable 2-periodic
orbit of the original map f. Such a bifurcation, with the appearance of a stable 2-periodic solution, is called a flip or period doubling bifurcation (see Figure 19.1 for an illustration). The map x ↦ −(1 + α)x − x³ can be treated similarly (left as an exercise).

Example 19.5. Let me return to the discrete logistic equation in the form x ↦ rx(1 − x), r > 0. I found that x̂_1 = (r − 1)/r at r = 3 becomes non-hyperbolic with multiplier µ = −1. Since for r = 3 the point x̂_1 is still attracting (nonlinearly), a stable 2-periodic solution appears in the system for r slightly above 3. It can be checked that the coordinates of the 2-periodic solution are

x̂_{1,2} = (1 + r ± √(r² − 2r − 3)) / (2r),

and they have the same multiplier µ = 4 + 2r − r², which satisfies |µ| < 1 if 3 < r < 1 + √6 and µ = −1 at r = 1 + √6. This means that at r = 1 + √6 another flip bifurcation occurs, which results in a stable 4-periodic solution.
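These formulas are easy to verify numerically. A minimal Python sketch (the value r = 3.2 is an arbitrary choice in the interval (3, 1 + √6)):

    import math

    r = 3.2                          # an arbitrary value in (3, 1 + sqrt(6))
    f = lambda x: r * x * (1.0 - x)

    # iterate long enough to settle on the attracting 2-cycle
    x = 0.1
    for _ in range(2000):
        x = f(x)
    obs = sorted([x, f(x)])
    print("observed cycle:", [round(v, 6) for v in obs])

    # explicit expressions from the text
    s = math.sqrt(r * r - 2.0 * r - 3.0)
    x1, x2 = (1.0 + r + s) / (2.0 * r), (1.0 + r - s) / (2.0 * r)
    print("from formulas :", [round(v, 6) for v in sorted([x1, x2])])
    print("multiplier    :", 4.0 + 2.0 * r - r * r)   # should lie in (-1, 1)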
19.2 Homework 8: Discrete dynamical systems

1. Consider the following maps. For all of them find the fixed points and determine the ranges of parameter values for which each point is a sink or a source:

• N_{t+1} = N_t e^{r(1 − N_t/K)},   r, K > 0.

• N_{t+1} = aN_t / (b + N_t)^k,   a, b, k > 0.

• N_{t+1} = rN_t^{1−b} if N_t > K, and N_{t+1} = rN_t if N_t < K,   r, b > 0.

• N_{t+1} = (1 + r)N_t / (1 + rN_t),   r > 0.
2. Consider the discrete dynamical system

N_{t+1} = R_0 N_t / (1 + N_t/M),   R_0, M > 0.

Use the cobweb diagram to determine the asymptotic behavior of the orbits. Solve this equation by noting that for the variable 1/N_t it becomes linear.
Figure 19.1: Flip bifurcation (see text for details)
3. Consider the map

x ↦ (3x − x³)/2.

Find the fixed points and determine their stability. What can you say about the basins of attraction of each sink? (This example is actually not a simple one, and you should be satisfied with a partial answer.)
4. Prove that if |f′(x̂)| > 1 for a fixed point x̂ of the map x ↦ f(x) then x̂ is a source (an unstable fixed point).

5. Build a bifurcation diagram of the map x ↦ x + αx + x³.

6. Build a bifurcation diagram of the map x ↦ x + αx − x².

7. Consider the map x ↦ r − x². Find the fixed points and determine their stability. Show that there is a flip bifurcation and find the bifurcation value. Find the coordinates of the periodic orbit with period two and determine the interval of the parameter r for which this orbit is stable.
20 Periodic solutions. A first encounter with chaos
I already showed that discrete dynamical systems, or maps, can have periodic orbits. Here I will discuss them at some length.

Definition 20.1. Consider a discrete dynamical system x ↦ f(x), x ∈ U ⊆ R. A point x̂ is called k-periodic, or a periodic point with period k, if f^k(x̂) = x̂. Recall that f^k = f ∘ ... ∘ f (k times).
Therefore a k-periodic point is a fixed point of the k-th iterate of f, and a 1-periodic point is simply a fixed point of the map f. The orbit that starts at x̂ and consists of exactly k points is called a periodic orbit. Note that every point of this orbit is k-periodic. I am interested in stable periodic orbits. This is equivalent to studying the stability of a fixed point of the k-th iterate of f. The only thing to check is that the stability of the periodic orbit does not depend on the choice of a particular point. To wit, let {x_1, x_2, ..., x_k} be a periodic orbit and consider

µ_i = (f^k)′(x_i),   i = 1, ..., k.

Since x_i = f(x_{i−1}) for 2 ≤ i ≤ k and x_1 = f(x_k), the chain rule gives

µ_i = (f^k)′(x_i) = f′(f^{k−1}(x_i)) (f^{k−1})′(x_i) = f′(x_{i−1}) (f^{k−1})′(x_i) = ... = f′(x_{i−1}) f′(x_{i−2}) ... f′(x_i),
and this product does not depend on i. Therefore the stability condition for a periodic orbit takes the form

|f′(x_1) ... f′(x_k)| < 1.

In general, if a discrete map x ↦ f(x) is non-monotone then it may have a quite intricate structure of periodic and non-periodic points. To formulate one of the most famous results, consider the following ordering of all natural numbers:

3 ≻ 5 ≻ 7 ≻ ... ≻ (all odd numbers except 1) ≻ 2·3 ≻ 2·5 ≻ ... ≻ (all odd numbers except 1, multiplied by 2) ≻ 2²·3 ≻ 2²·5 ≻ ... ≻ (all odd numbers except 1, multiplied by 2²) ≻ ... ≻ 2³ ≻ 2² ≻ 2 ≻ 1.

Theorem 20.2 (Sharkovsky^17). Consider a continuous map x ↦ f(x) of the interval U into itself and assume that it has a k-periodic point. Then f has m-periodic points for all m such that k ≻ m. In particular, if f has a 3-periodic point, it has orbits of any period.

Sharkovsky's theorem was rediscovered in a famous paper by Li and Yorke, in which the term "chaos" was used for the first time to indicate that the behavior of the orbits is very irregular. The existence of periodic orbits of all possible periods is clearly an indication of complicated behavior; however, more was shown in the paper:

Theorem 20.3 (Li and Yorke^18). Consider a continuous map x ↦ f(x) of the interval U into itself and assume that it has a 3-periodic point. Then there is an uncountable subset S of U such that every orbit starting in S is aperiodic and unstable.

Example 20.4. Consider again the logistic family x ↦ rx(1 − x) =: f_r(x), which maps [0, 1] → [0, 1] if 0 ≤ r ≤ 4. I know that it admits periodic solutions of periods 2 and 4. Let me show that there is r ∈ [0, 4] for which f_r has a 3-periodic orbit. I need to find x_1, x_2, x_3 such that
x_2 = f_r(x_1),   x_3 = f_r(x_2),   x_1 = f_r(x_3),

and hence each of these points is a fixed point of f_r^3. The value of r at which such fixed points can first appear corresponds to the case df_r^3/dx (x) = 1 (see the figure). Therefore, I have a system of two equations with two unknowns,

f_r^3(x) = x,   df_r^3/dx (x) = 1,

which can be solved numerically^19, yielding

r ≈ 3.8284,   x_1 ≈ 0.1599,   x_2 ≈ 0.5144,   x_3 ≈ 0.9563.
You can also see this orbit of period three in the figure, where the cobweb diagram is shown (right panel).
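A quick numerical check of these values (a minimal sketch using the value r = 1 + √8 quoted in the footnote and the approximate coordinate x_1 from above):

    import math

    r = 1.0 + math.sqrt(8.0)          # the parameter value quoted in footnote 19
    f = lambda x: r * x * (1.0 - x)

    x1 = 0.1599                       # approximate coordinate from the text
    x2, x3 = f(x1), f(f(x1))
    print(round(x2, 4), round(x3, 4))  # approximately 0.5144 and 0.9563
    print(round(f(x3), 4))             # f^3(x1) returns close to x1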
17 Sharkovsky, A. N. (1965). On cycles and the structure of a continuous mapping. Ukrainian Mathematical Journal, 17(3), 104–111.
18 Li, T. Y., & Yorke, J. A. (1975). Period three implies chaos. American Mathematical Monthly, 985–992.
19 It turns out that the exact value of r is 1 + √8, but to prove that this is exactly the parameter value at which the 3-periodic point appears is not a simple problem.
Figure 20.1: 3-periodic point in the logistic map x ↦ rx(1 − x) for r = 1 + √8 ≈ 3.83
Example 20.5. Consider Ricker's map x ↦ rxe^{−x}, r > 0,
which maps R_+ to R_+. There is always the fixed point x̂_0 = 0; if r > 1 another fixed point appears, x̂_1 = log r. The multiplier of x̂_1 is µ_1 = 1 − log r, therefore x̂_1 is a sink if 1 < r < e² and a source if r > e²; when r = e² =: r_1 I have µ_1 = −1 and I observe a flip bifurcation, which, as can be shown, is accompanied by the appearance of a stable 2-periodic point. It can be checked numerically that at r_2 ≈ 12.51 the 2-periodic point loses stability via a flip bifurcation with the appearance of a stable 4-periodic point. Next, at r_4 ≈ 14.24 a stable 8-periodic point appears, and at r_8 ≈ 14.65 a stable 16-periodic point is born via the same bifurcation (see the figure).
Figure 20.2: Period doubling or flip bifurcations in Ricker's map x ↦ rxe^{−x}

It is natural to assume that there is an infinite sequence of bifurcation parameter values r_{2^k}, k = 0, 1, 2, .... It can be shown, actually, that the ratio

(r_{2^k} − r_{2^{k−1}}) / (r_{2^{k+1}} − r_{2^k})
converges to a constant µ_F ≈ 4.6692, called Feigenbaum's constant. Moreover, the same constant appears in different maps and is therefore universal. The sequence of flip, or period doubling, bifurcations appears over and over again in discrete dynamical systems and was called the period doubling route to chaos (keep in mind that I did not even try to define what chaos means). In particular, it should be clear that the sequence r_{2^k} converges quite fast to a limiting value r_∞ (for the logistic equation r_∞ ≈ 3.5699). This period doubling route to chaos can be visualized as a bifurcation diagram, which is shown in the following figure for the logistic equation.
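Such diagrams are usually produced by brute force: for each value of r one discards a transient and records the points the orbit visits afterwards. A minimal Python sketch of this procedure (the parameter range, the transient length, and the number of recorded points are arbitrary choices):

    import numpy as np

    # For each r, discard a transient and record the points the orbit visits;
    # plotting the (r, x) pairs reproduces a diagram like Figure 20.3.
    r_values = np.linspace(2.5, 4.0, 600)
    data = []
    for r in r_values:
        x = 0.5
        for _ in range(500):          # transient, not recorded
            x = r * x * (1.0 - x)
        for _ in range(100):          # recorded points
            x = r * x * (1.0 - x)
            data.append((r, x))
    print(len(data), "points computed")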
Figure 20.3: Bifurcation diagram for the logistic map x ↦ rx(1 − x)

For comparison, I also present the full bifurcation diagram for Ricker's map (Figure 20.4).
21 On the definition of chaos. Lyapunov exponents

21.1 Itineraries
I stated and proved in class that the logistic map

x ↦ 4x(1 − x),   x ∈ [0, 1],   (21.1)
has k-periodic orbits for any k ∈ N. Moreover, we discussed the fascinating result that if a continuous map f : I → I has a 3-periodic point then it immediately follows that it has k-periodic points for any k. Is this enough to call something chaotic? Actually, no, although the presence of periodic points of any period is an indication of complex behavior. A natural sign of chaos is the loss of information with time. Depending on what we would call information and how we would describe it, it is possible to have different (and non-equivalent) definitions
Figure 20.4: Bifurcation diagram for Ricker's map x ↦ rxe^{−x}

of chaos^20. I will give only one, which is based on the intuitively simple notion of sensitive dependence on initial conditions.

Definition 21.1. Let f : R → R. A point x_0 has sensitive dependence on initial conditions if there exists d > 0 such that any neighborhood of x_0 contains a point x such that |f^k(x) − f^k(x_0)| ≥ d for some nonnegative k.

I start with the proof that the logistic map (21.1) exhibits sensitive dependence on initial conditions at any point of [0, 1]. For this I will introduce^21 the so-called itineraries. Divide the interval [0, 1] into two halves, [0, 1/2] and [1/2, 1], and call them L and R. Start an orbit at some initial point x_0 and record to which of the sets L or R the orbit belongs at each iteration; the corresponding itinerary may look like LLRRLRL..., meaning that x_0 ∈ L, x_1 ∈ L, x_2 ∈ R, and so on. Of course, initially we do not know whether every itinerary can be realized by some orbit of the logistic map. For example, for the initial point x_0 = 1/3 I have the orbit 1/3, 8/9, 32/81, ... and hence the itinerary LRL.... For x_0 = 1/4 the orbit is 1/4, 3/4, 3/4, 3/4, ... and hence the itinerary is LRRRR... (L followed by R repeated). Finally, for x_0 = 1/2 the orbit is 1/2, 1, 0, 0, ..., so that I have some ambiguity, since both RRL... and LRL... satisfy my definition. However, if 1/2 is not in the orbit, then the corresponding itinerary is uniquely defined. I start with identifying the sets of points whose itineraries start with a given sequence. For example, the set of initial conditions whose itineraries start with LL forms a subinterval of the unit interval. A little experimenting shows that this is actually the interval [0, 0.146].
20 An interested reader can find much more in Brown, R., & Chua, L. O. (1996). Clarifying chaos: Examples and counterexamples. International Journal of Bifurcation and Chaos, 6(02), 219–249.
21 I follow a very nice book: Alligood, K. T., Sauer, T. D., & Yorke, J. A. (1997). Chaos. Springer Berlin Heidelberg.
(Can you figure out how to find these coordinates?) Moreover, the unit interval is subdivided into the sets LL, LR, RR and RL, exactly in this order. The sets with three letters are LLL, LLR, LRR, LRL, RRL, RRR, RLR, RLL (see the figure below). Note that for any set consisting of two letters it is possible to have both L and R at the end (the exact rule is actually as follows: if the given set has an odd number of Rs then it is divided into two other sets such that the left one ends with R and the right one ends with L; the situation is the opposite for a set having an even number of Rs). This means that any itinerary is possible for orbits of (21.1). This can also be seen by noticing that the image of L contains both L and R, and the same is true about the image of R. Now consider an itinerary starting with k symbols S_1 S_2 ... S_k, S_i ∈ {L, R}. There are 2^k such distinct sets, each of which has a very short length; actually it can be proved that the length of any such interval is less than π/2^{k+1}. Now this interval contains the subintervals S_1 S_2 ... S_k LL, S_1 S_2 ... S_k LR, S_1 S_2 ... S_k RR, S_1 S_2 ... S_k RL. The sets LL and RL are at least 1/4 unit from one another, which means that there exist two points in the interval S_1 S_2 ... S_k whose orbits after sufficiently many iterations will be 1/4 units apart, which proves the sensitive dependence on initial conditions. Thus any point in [0, 1] has sensitive dependence on initial conditions. As an illustration take k = 1000: there are two points which are about 2^{−1000} ≈ 10^{−300} units apart, but whose orbits after 1000 iterations will be 1/4 units apart.
21.2 Lyapunov numbers and exponents
Ok, I was able to give a convincing argument that the logistic map (21.1) is chaotic. My proof was so smooth because I used the fact that for the parameter value r = 4 the logistic map f : [0, 1] → [0, 1] is surjective. What about other values of r? Can we come up with a relatively straightforward way to check, at least numerically, that something is chaotic? The answer is "yes." Recall that the stability of a fixed point x̂ is determined by the multiplier f′(x̂), and the stability of a k-periodic point x̂_1 by the product f′(x̂_1) ... f′(x̂_k). I would like to generalize these concepts to an orbit that is different from, e.g., a fixed or k-periodic point.

Definition 21.2. Let f be a smooth map on R. The Lyapunov number of the orbit γ(x_1) is

l(x_1) = lim_{t→∞} (|f′(x_1)| ... |f′(x_t)|)^{1/t},

if the limit exists. The Lyapunov exponent of the same orbit is

h(x_1) = log l(x_1) = lim_{t→∞} (log|f′(x_1)| + ... + log|f′(x_t)|) / t,

if this limit exists.
One should check that if the orbit consists of k-periodic points, these definitions boil down to the usual stability criterion mentioned above. Intuitively, Lyapunov numbers and exponents show how much, on average, nearby orbits separate from each other. Hence one should expect that if l(x_1) > 1, or h(x_1) > 0, then we will have sensitive dependence on initial conditions, which is a hallmark of chaos.
Figure 21.1: Itineraries for the logistic map

To be technically correct I will also need the notion of an asymptotically periodic orbit: an orbit γ(x_1) is called asymptotically periodic if there exists a periodic orbit γ(y_1) such that

lim_{t→∞} |x_t − y_t| = 0.

Definition 21.3. The orbit γ(x_1) is chaotic if it is not asymptotically periodic and h(x_1) > 0.

Example 21.4. Consider the map f(x) = 2x (mod 1). This map is discontinuous, but clearly for almost all initial conditions we will have

lim_{t→∞} (1/t) Σ_{i=1}^{t} log|f′(x_i)| = lim_{t→∞} (1/t) Σ_{i=1}^{t} log 2 = log 2 > 0.
Figure 21.2: Graph of the map f(x) = 2x (mod 1)

I will leave it as an exercise to check that the set of asymptotically periodic points is countable, and hence for this map chaotic orbits exist. I cannot do the same straightforward computation for the logistic equation; however, it can always be done numerically, see Figure 21.3.
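A minimal Python sketch of such a numerical computation for the logistic map (the initial condition, the number of iterations, and the sample values of r are arbitrary choices; for r = 4 the result should be close to log 2 ≈ 0.693):

    import math

    def lyapunov(r, x0=0.3, n=100000):
        # average of log|f'(x_t)| along the orbit, f(x) = r x (1 - x)
        x, total = x0, 0.0
        for _ in range(n):
            total += math.log(abs(r * (1.0 - 2.0 * x)))
            x = r * x * (1.0 - x)
        return total / n

    for r in (2.8, 3.2, 3.5, 3.9, 4.0):
        print(r, round(lyapunov(r), 4))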
Figure 21.3: Numerically calculated Lyapunov exponents for the logistic map f(x) = rx(1 − x) depending on r

It is instructive to compare this figure with the bifurcation diagram of the logistic map.
22 Two dimensional discrete dynamical systems

22.1 Linear systems
Let us now move one dimension up and consider two-dimensional discrete dynamical systems of the form {f, U}, where now f : U → U and U ⊆ R². In coordinates I have

x_1(t + 1) = f_1(x_1(t), x_2(t)),
x_2(t + 1) = f_2(x_1(t), x_2(t)),

where (here and below) t ∈ Z or t ∈ Z_+. It is quite clear, given the complexity of one-dimensional maps, that two- (or more-) dimensional maps can produce a wealth of different dynamical regimes. Therefore, to begin with, I will restrict myself to linear systems

x_1(t + 1) = a_11 x_1(t) + a_12 x_2(t),
x_2(t + 1) = a_21 x_1(t) + a_22 x_2(t),   (22.1)

or, in matrix notation,

x(t + 1) = Ax(t),

or x ↦ Ax, x ∈ U ⊆ R².
Obviously, the fixed points are the solutions to Ax = x, or (A − I)x = 0, which, assuming that 1 is not an eigenvalue of A (as I do in the following), has only the trivial solution x̂ = (0, 0). How to solve (22.1)? Very easily: given the initial condition x(0) = x_0, one has x(t) = A^t x_0, and the only problem is to find a simple way to compute A^t = AA...A. Recall from our discussion of 2 × 2 matrices that any such matrix is similar to one of three possible Jordan normal forms:

J_1 = [µ_1 0; 0 µ_2],   J_2 = [µ 1; 0 µ],   J_3 = [α β; −β α].

This means that there exists an invertible matrix P such that

J_i = P^{−1} A P,   i = 1, 2, or 3,
and we also know how to find P. Now we only need

Proposition 22.1. Let A have Jordan normal form J with the transformation matrix P. Then A^t = P J^t P^{−1}.

The proof of this proposition is left as an exercise. Now we just need to find the t-th powers of the Jordan normal forms. For J_1 it is immediate that

J_1^t = [µ_1^t 0; 0 µ_2^t],

and recall that µ_1 and µ_2 are the real eigenvalues of A (recall also that they are also called the multipliers of A).
For J_2 a simple induction argument shows that

J_2^t = [µ^t tµ^{t−1}; 0 µ^t],

and here µ is the only eigenvalue of A, of multiplicity 2, and A is different from the scalar matrix µI. For J_3, where A has two complex eigenvalues µ_{1,2} = α ± iβ, note that

[α β; −β α] = √(α² + β²) [cos θ sin θ; −sin θ cos θ]

for some 0 ≤ θ < 2π. From this representation we see that multiplication by J_3 rotates a point around the origin by the angle θ and multiplies the distance to the origin by √(α² + β²); therefore

J_3^t = ρ^t [cos tθ sin tθ; −sin tθ cos tθ],   where ρ = √(α² + β²).
Example 22.2 (Fibonacci numbers). As an example, consider the Fibonacci numbers, defined as n_{t+1} = n_t + n_{t−1} with the initial conditions n_0 = 0, n_1 = 1. Reformulate this problem as a two-dimensional discrete system (I use x_1(t) = n_t and x_2(t) = n_{t−1}):

x_1(t + 1) = x_1(t) + x_2(t),
x_2(t + 1) = x_1(t),

with the initial condition x_1(1) = 1, x_2(1) = 0. In this example I have

A = [1 1; 1 0],

and the general solution is given by x(t) = A^{t−1} x(1). The eigenvalues of A are µ_{1,2} = (1 ± √5)/2 and the corresponding eigenvectors are ((1 + √5)/2, 1)^⊤ and ((1 − √5)/2, 1)^⊤; therefore I have

P = [(1 + √5)/2 (1 − √5)/2; 1 1],   J = [(1 + √5)/2 0; 0 (1 − √5)/2],   P^{−1} = (1/√5)[1 (√5 − 1)/2; −1 (1 + √5)/2],

which implies

x(t) = P J^{t−1} P^{−1} x(1),

and after some algebra yields

n_t = (1/√5) ((1 + √5)/2)^t − (1/√5) ((1 − √5)/2)^t

for the t-th Fibonacci number. Since |1 − √5|/2 is less than 1, for sufficiently large t

n_t ≈ (1/√5) ((1 + √5)/2)^t   and   n_{t+1}/n_t ≈ (1 + √5)/2,

i.e., the ratio of two consecutive Fibonacci numbers is approximately equal to the golden ratio φ (recall that two positive quantities are in the golden ratio if the ratio of their sum to the bigger quantity is equal to the ratio of the bigger quantity to the smaller one; deduce from here that φ = (1 + √5)/2).
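A short Python sketch verifying this numerically (it iterates x(t + 1) = Ax(t) and compares the result with the closed form; the number of steps is an arbitrary choice):

    import numpy as np

    A = np.array([[1, 1], [1, 0]])
    x = np.array([1, 0])              # x(1) = (n_1, n_0) = (1, 0)

    phi = (1.0 + np.sqrt(5.0)) / 2.0
    for t in range(2, 12):
        x = A @ x                     # x(t) = A^{t-1} x(1)
        n_t, n_prev = x
        print(t, n_t, round(n_t / n_prev, 6))   # the ratio approaches phi

    # closed form n_t = (phi^t - psi^t)/sqrt(5) reproduces the same value
    psi = (1.0 - np.sqrt(5.0)) / 2.0
    t = 11
    print(round((phi**t - psi**t) / np.sqrt(5.0)))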
nt = C1 (µ1 )t + C2 (µ2 )2 , and the constants can be determined by using the initial conditions n0 = 0, n1 = 1. The calculations are much simpler if we take this rout. Note that if there is only one real root of the characteristic polynomial then the general solution is C1 µt + C2 tµ2 , if the roots are complex conjugate re±iθ then the general solution is C1 rt cos(tθ) + C2 rt sin(tθ). Now using the explicit representation of At , I can state the following important stability result:
Theorem 22.4. Consider (22.1) and assume that x̂ = (0, 0) is isolated. Then x̂ is asymptotically stable (a sink) if and only if all the eigenvalues of A satisfy |µ_{1,2}| < 1. The fixed point x̂ is unstable if there is at least one eigenvalue of A with |µ_i| > 1. It is a source if |µ_{1,2}| > 1, and it is a saddle if |µ_1| < 1 and |µ_2| > 1; in the latter case there exist stable and unstable subspaces of R². Finally, x̂ is Lyapunov stable if A has complex conjugate eigenvalues with |µ_{1,2}| = 1.

In the following it is natural to call the matrix A hyperbolic if it does not have eigenvalues with absolute value equal to 1. Hyperbolicity is a generic property in the sense that almost all matrices are hyperbolic.
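A minimal Python sketch applying this classification to a given matrix (the example matrices are arbitrary illustrations, not taken from the text):

    import numpy as np

    def classify(A):
        # classify the fixed point (0, 0) of x -> A x by the moduli of the eigenvalues
        moduli = sorted(abs(np.linalg.eigvals(A)))
        if moduli[1] < 1:
            return "sink"
        if moduli[0] > 1:
            return "source"
        if moduli[0] < 1 < moduli[1]:
            return "saddle"
        return "non-hyperbolic"

    print(classify(np.array([[0.5, 0.1], [0.0, 0.3]])))   # sink
    print(classify(np.array([[1.5, 0.0], [0.0, 0.5]])))   # saddle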
22.2 Nonlinear systems and linearization
Similarly to the case of differential equations, the linearization of a nonlinear system around a fixed point can be used to infer the stability properties, provided that the Jacobi matrix f′(x̂) is hyperbolic. Formally,

Theorem 22.5. Consider the discrete dynamical system x ↦ f(x), x ∈ U ⊆ R^d, where f ∈ C^{(1)}(U; U), and let x̂ be a fixed point: f(x̂) = x̂. Then x̂ is asymptotically stable if all the multipliers of the Jacobi matrix f′(x̂) lie within the unit circle in the complex plane. If at least one multiplier lies outside the unit circle then x̂ is unstable.

Example 22.6 (A delayed Ricker's equation). Consider the difference equation

N_{t+1} = N_t e^{r(1 − N_{t−1}/K)},

which, using the new variable u_t = N_t/K, can be reduced to

u_{t+1} = u_t e^{r(1 − u_{t−1})},   r > 0.
This equation is not a discrete dynamical system, since u_{t+1} depends on two time moments: the present t and the past t − 1, which is quite often a reasonable assumption. However, using the new notation x_1(t) = u_t, x_2(t) = u_{t−1}, this equation can be rewritten as

x_1(t + 1) = x_1(t) e^{r(1 − x_2(t))},
x_2(t + 1) = x_1(t).

The fixed points are found as the solutions to

x_1 e^{r(1 − x_2)} = x_1,   x_1 = x_2,

and therefore I have x̂_0 = (0, 0) and x̂_1 = (1, 1). The Jacobi matrix of the map is

f′(x) = [e^{r(1 − x_2)}  −r x_1 e^{r(1 − x_2)}; 1  0].
Therefore, f′(x̂_0) has the two multipliers e^r and 0, and hence x̂_0 is always unstable (a hyperbolic saddle). The multipliers of x̂_1 are found by solving µ² − µ + r = 0, and therefore

µ_{1,2} = (1 ± √(1 − 4r))/2,   r ≤ 1/4,

and

µ_{1,2} = ρ e^{±iθ},   ρ = √r,   θ = arctan √(4r − 1),   r > 1/4.
If 0 < r ≤ 1/4 then the multipliers are real and less than 1 in absolute value, hence x̂_1 is asymptotically stable. If r > 1/4 then the multipliers are complex conjugate and µ_1 µ_2 = |µ_1|² = r; therefore, for r < 1 this fixed point is still asymptotically stable (but the convergence is oscillatory). At r = 1,

µ_{1,2} = e^{±iπ/3},

and both multipliers have absolute value 1 simultaneously; the linearization at x̂_1 is non-hyperbolic, and a bifurcation occurs in the full nonlinear system. It can be proved that in this particular case an attracting invariant closed curve is born around the fixed point. The full theory is somewhat similar to the Poincaré–Andronov–Hopf bifurcation and can be found elsewhere. The name of the bifurcation is the Neimark–Sacker bifurcation. This situation is generic when there are two complex conjugate multipliers, both having absolute value 1.
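A minimal Python simulation sketch illustrating this transition (the initial condition, the number of iterations, and the two sample values of r are arbitrary choices): for r slightly below 1 the recorded points collapse onto the fixed point (1, 1), while for r slightly above 1 they spread over a closed invariant curve.

    import math

    def last_points(r, steps=4000, keep=300):
        x1, x2 = 0.9, 0.9          # (u_t, u_{t-1}); an arbitrary start near (1, 1)
        pts = []
        for i in range(steps):
            x1, x2 = x1 * math.exp(r * (1.0 - x2)), x1
            if i >= steps - keep:
                pts.append((x1, x2))
        return pts

    for r in (0.9, 1.1):
        xs = [p[0] for p in last_points(r)]
        print(r, round(min(xs), 3), round(max(xs), 3))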
22.3 Final exam
1. Allee's effect in discrete models. Consider the map

x_{t+1} = a x_t² / (b² + x_t²),   a > 0.
Determine the equilibria and show that if a² > 4b² it is possible for the population to be driven to extinction if it becomes less than a critical size, which you should find.

2. Consider a mathematical model of an insect population that includes sterile insects as a measure to control the population size:

N_{t+1} = R N_t² / ((R − 1) N_t²/M + N_t + S),
where R > 1 and M > 0 are constant parameters, and S is the constant sterile insect population. Determine the steady states and discuss their linear stability, noting whether any type of bifurcation is possible. Find the critical value S_c of the sterile population in terms of R and M such that if S > S_c the insect population is eradicated. Determine the possible solution behavior if 0 < S < S_c (you may want to draw cobweb diagrams).
3. For the planar linear ODE with constant coefficients we had a bifurcation diagram in terms of tr A and det A (see Lecture 7, Figure 7). Draw an analogous bifurcation diagram for the planar linear discrete dynamical system x ↦ Ax, A = (a_ij)_{2×2}. Here x = (x_1, x_2) ∈ R².

4. Plants produce seeds at the end of their growth season, after which they die. A fraction of these seeds survive the winter, and some of these germinate at the beginning of the season, giving rise to the new generation of plants. The fraction that germinates depends on the age of the seeds; no seeds older than 2 years germinate. If p_t denotes the number of plants in generation t, then the governing equation takes the form

p_t = ασγ p_{t−1} + βσ(1 − α)σγ p_{t−2}.

Give a biological interpretation of each of the terms in the equation and of the parameters. What are the ranges of the parameters? Determine a general condition that ensures growth of the plant population.

5. One of the first mathematical models with discrete time to describe an interaction of host–parasitoid type was the following two-dimensional discrete dynamical system:

N_{t+1} = rN_t e^{−aP_t},
P_{t+1} = N_t (1 − e^{−aP_t}),

where a > 0 and r > 1. Study this model (i.e., find out as much as you can about its behavior; it may be useful to know that log r < r − 1 if r > 1). Are the results of your analysis biologically realistic? Can you improve the model?
A Review of ODE
This first section will serve as a quick review of the pertinent material from Math 266: Introduction to Ordinary Differential Equations, which will be of significant importance for the rest of the course. More material will be reviewed as the course progresses. I start with a motivation why differential equations are useful for describing biological processes.
A.1 The first encounter with the Malthus equation
One of the many possible mathematical tools for modeling biological processes is ordinary differential equations (which I will abbreviate as ODE). Here is a basic example of how an ODE can appear in a biologically motivated problem. Consider a very simple biological process of population growth. Let N(t) denote the number of individuals in a given population (for concreteness you can think of a population of bacteria) at the time moment t. In this course the variable t will almost exclusively denote time. Now I calculate how the population number changes during a short time interval h. I have

N(t + h) = N(t) + bhN(t) − dhN(t).
Here I used the fact that the total population at the moment t + h can be found as the total population at the moment t plus the number of individuals born during the time period h minus the number of individuals that died during the time period h; b and d are the per capita birth and death rates respectively (i.e., the numbers of births and deaths per individual per unit of time). From the last equality I find

(N(t + h) − N(t)) / h = (b − d)N(t).

Next, I postulate the existence of the derivative

dN/dt = lim_{h→0} (N(t + h) − N(t)) / h,

assume for simplicity that both b and d are constant, and hence obtain an ordinary differential equation

dN/dt = (b − d)N,

which in the biological context is usually called the Malthus equation (I will come back to Malthus). This example shows at least two things: first, that ODE naturally arise when models of biological systems are built; second, that ODE mathematical models are quite crude, in the sense that to obtain a mathematical model in the form of an ODE I have to make very strong simplifying assumptions. At this point I would not want to discuss the approximations I made while deriving the Malthus equation, but it is a good moment to pause and spend a few seconds thinking about how closely our mathematical model (the Malthus equation) actually describes the process I claim it models (the population growth).
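A minimal Python sketch of the bookkeeping identity above (the numerical values of b, d, N_0, h, and T are arbitrary illustrative choices): for a small step h the step-by-step update stays close to the exponential solution of the Malthus equation.

    import math

    b, d, N0 = 0.3, 0.1, 100.0     # illustrative per capita rates and initial size
    h, T = 0.01, 5.0               # small time step and final time (also illustrative)

    N = N0
    for _ in range(int(T / h)):
        N = N + b * h * N - d * h * N   # the bookkeeping identity from the text

    print(round(N, 2))                             # step-by-step update
    print(round(N0 * math.exp((b - d) * T), 2))    # exponential solution of the Malthus ODE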
A.2 Notation
In the first part of the course we deal with ODE. This subsection serves to fix the notation.

Definition A.1. An ODE of the k-th order is an expression of the form

d^k x/dt^k = f(t, x, dx/dt, ..., d^{k−1}x/dt^{k−1}).   (A.1)

Here I use t for the independent variable and x for the dependent variable. A function φ is called a solution to (A.1) on the interval I = (a, b) if this function is k times continuously differentiable on I (this is usually denoted φ ∈ C^{(k)}(I; R); in words: the function φ with domain I and range R belongs to the space of k times continuously differentiable functions) and plugging this function into (A.1) turns the equation into the identity

d^k φ/dt^k (t) ≡ f(t, φ(t), dφ/dt(t), ..., d^{k−1}φ/dt^{k−1}(t)),   t ∈ I.

For example, for the Malthus equation the function N(t) = Ce^{(b−d)t}
is a solution for any constant C ∈ R (check this) and for any interval I ⊆ R. The order of an ODE is the order of the highest derivative. Hence the first order ODE is

dx/dt = f(t, x).   (A.2)
The solution formula for the Malthus equation actually gives infinitely many solutions, and this is true for most ODE: in a general situation the solution to an ODE depends on arbitrary constants, whose number coincides with the order of the equation.

Exercise 1. Can you solve the following ODE:

d²x/dt² + λx = 0,

where λ is a parameter? Note that the solution depends on the sign of λ. In any case your solution must depend on two arbitrary constants.

In the context of mathematical models it is meaningless to have infinitely many solutions (think of population growth: we cannot have infinitely many functions describing it). Therefore, we need some additional conditions to choose the solution that is of interest to us. This is done with the help of the initial conditions. The initial conditions for equation (A.1) take the form

x(t_0) = x_0,   dx/dt(t_0) = x_1,   ...,   d^{k−1}x/dt^{k−1}(t_0) = x_{k−1},   (A.3)

where x_0, ..., x_{k−1} are given numbers, and the notation d^k x/dt^k(t_0) means the k-th derivative of the function x evaluated at the point t_0. Note that the number of initial conditions is equal to the order of the ODE.

Definition A.2. ODE (A.1) together with the initial conditions (A.3) is called an initial value problem (abbreviated IVP) or Cauchy's problem.

For the Malthus equation I hence need only one initial condition (the population size at some time moment t_0). Check that if the initial condition is N(t_0) = N_0, then the solution to the corresponding Cauchy problem is given by N(t) = N_0 e^{(b−d)(t−t_0)}.

Any ODE of the form (A.1) can be written as a system of k first-order equations. Moreover, writing an ODE as a system is theoretically the way ODE should be treated. Hence I will usually write ODE as first order systems. To rewrite (A.1) as a system, I introduce the new variables

x_1(t) = x(t),   x_2(t) = dx/dt(t),   ...,   x_k(t) = d^{k−1}x/dt^{k−1}(t).
Using these new variables I have

dx_1/dt = x_2,
dx_2/dt = x_3,
...
dx_k/dt = f(t, x_1, x_2, ..., x_k).   (A.4)

Note that the initial conditions for (A.4) become

x_1(t_0) = x_0, ..., x_k(t_0) = x_{k−1}.   (A.5)
From now on, for the first order derivatives I am going to use Newton's dot notation (which is widely used in classical mechanics): ẋ := dx/dt.

A.3 Solving ODE
Most of the time in your Math 266 course was devoted to finding analytical solutions to ODE. For example, here is how you can find the solution to the Malthus equation, which I wrote above. Example A.3. Consider again the Malthus equation N˙ = mN,
N (t0 ) = N0 ,
where I introduced the new constant m := b − d. This equation is an example of a separable ODE, of the form ẋ = f_1(t)f_2(x), where one can separate the variables. To solve the Malthus equation, I write

Ṅ = mN  ⟹  ∫ dN/N = ∫ m dt  ⟹  log|N| = mt + C  ⟹  N(t) = Ce^{mt}.

Here I used log to denote the natural logarithm, and C is an arbitrary constant, which can be different from line to line. Using the initial condition, I find the solution N(t) = N_0 e^{m(t−t_0)}.

A word of caution is in order here, as the next example shows.

Example A.4. Consider the following IVP: ẋ = 4tx²,
x(1) = 0.
The usual way students solve this problem is as follows: they separate the variables,

dx/x² = 4t dt  ⟹  −1/x = 2t² + C  ⟹  x = 1/(C − 2t²).

Now they try to find the constant C such that

0 = 1/(C − 2),

and here the problems start, since there is no such finite C. Can you figure out the right way to solve this problem? Actually, I notice that while integrating this equation I divided by x². This expression turns into 0 for x = 0. This means that x = 0 is a solution to our equation. Moreover, this solution satisfies the given initial condition. Therefore, the answer to this problem is x(t) = 0.

Anyway, I will not need much of this technique in our course. However, it is useful to remember the basic methods of solving first order ODE. It is also useful to know that most ODE cannot be solved in terms of elementary functions and a finite number of integrals of them (the only large class of ODE that can be solved is the linear ODE with constant coefficients, and you studied this class of ODE extensively in Math 266). For example, the innocent-looking equation ẋ = x² − t cannot be solved in terms of elementary functions. Note that this does not mean that there is no solution; it means that the usual repertoire of polynomial, logarithmic, exponential, and trigonometric functions, the four arithmetic operations, and integrals is not enough to write down a solution to this equation (this deep result bears the name of Liouville's theorem and is studied in differential algebra). Therefore, we will need other means to study ODE. What we will study most in this course is the qualitative or geometric analysis of ODE. This means that we will study the properties of solutions of ODE without actually having a formula for these solutions. Another useful approach to ODE is numerical methods; however, I will not discuss them in this course (look for a Numerical Analysis course).
A.4 Well-posed problems. Theorem of existence and uniqueness
I consider ODE as mathematical models of some biological systems, and as such they should possess certain desirable properties. In particular, one such property is to be well posed (do not think that if a problem is ill posed, it is impossible to consider it as a mathematical model; actually, a good deal of ill posed problems are of great interest in applications, but this does not belong to this course). The notion of a well posed problem was introduced by the French mathematician Jacques Salomon Hadamard. A mathematical problem is well posed if

• Its solution exists.
• Its solution is unique.
• Its solution depends continuously on the initial data.

It turns out that the mathematical problems (A.1), (A.3) or (A.4), (A.5) are well posed. Here I would like to provide the exact mathematical statements for the first order ODE. Consider the following IVP for a first order ODE:

ẋ = f(t, x),   x(t_0) = x_0.   (A.6)
The following general result is usually discussed in Math 266 without proof.

Theorem A.5. Consider the IVP (A.6) and assume that the function f is continuous in t and continuously differentiable in x for (t, x) ∈ (a, b) × (c, d) for some constants a, b, c, d, and assume that (t_0, x_0) ∈ (a, b) × (c, d). Then there exists an ε > 0 such that the solution φ to (A.6) exists and is unique for t ∈ (t_0 − ε, t_0 + ε).

Note that Theorem A.5 is local, i.e., it guarantees that the solution exists and is unique on some, in general smaller, interval (t_0 − ε, t_0 + ε) ⊆ (a, b). This is important because solutions to ODE can blow up, i.e., approach infinity for finite t.

Example A.6. Consider the ODE ẋ = 1 + x² = f(t, x). The right-hand side is a polynomial for any (t, x) ∈ R × R. Its solution is given by (check this) x(t) = tan(t + C), and hence for each fixed C it is defined only on the interval (−π/2 − C, π/2 − C) (you should sketch several solutions).

Here is an example of how non-uniqueness can appear.

Example A.7. Consider

ẋ = √x,   x(0) = x_0,   x ≥ 0.
One solution, as can be found by separation of variables, is given by x(t) = (t + 2√x_0)²/4. On the other hand, if x_0 = 0 then there is also the solution which is identically zero for all t. Therefore, through the point (t_0, x_0) = (0, 0) more than one solution passes (actually, infinitely many).

To formulate the result on the dependence on the initial conditions, consider, together with (A.6), the IVP

ẋ = f(t, x),   x(t_0) = x_1.   (A.7)

Theorem A.8. Let the IVP (A.6) and (A.7) satisfy the conditions of Theorem A.5, and let x_0(t) and x_1(t) be the solutions to (A.6) and (A.7) respectively at the same time t. Then

|x_1(t) − x_0(t)| ≤ |x_1 − x_0| e^{L|t − t_0|},

where L is a constant that depends on f.

Theorem A.8 shows that the solution to a first order ODE depends continuously on the initial condition. This means that if the initial condition is known only approximately, I can still rely on the solution of the IVP as an approximation of the unknown solution.
A.5 Geometric interpretation of the first order ODE
The first order ODE ẋ = f(t, x) has a very useful and transparent geometric interpretation. Recall that the derivative of a function at a given point gives the slope of the tangent line to the graph of this function. Since the right-hand side of the first order ODE gives the value of the derivative ẋ at the point (t, x), I know the slope at this particular point, and hence I also know the slope at any point at which f(t, x) is defined. I can depict these slopes as small line segments at each point. All together these line segments form a direction (or slope) field. Consider a curve that is tangent to a given slope field at every point and call it an integral curve. The following theorem connects the integral curves of the direction field f and the solutions to the ODE ẋ = f(t, x).

Theorem A.9. The graphs of the solutions to the first order ODE are the integral curves of the corresponding direction field.

Exercise 2. Prove the last theorem.

Now, using this geometric interpretation, I can say that to solve the IVP (A.6) geometrically means to find an integral curve belonging to the corresponding direction field which passes through the point (t_0, x_0). Theorem A.5 can now be restated in the following way: through each point (t_0, x_0) ∈ U, where U is the domain in which f is continuous in t and continuously differentiable in x, passes one and only one integral curve.

Example A.10. Consider the ODE ẋ = x² − t.
Its direction field is shown in Figure A.1. The green curves represent the integral curves, i.e., the graphs of the solution to the ODE. Can you figure out how you can draw the direction field without, say, a computer?
Figure A.1: The direction field of x˙ = x2 − t. The green curves are the integral curves