Time Series Analysis

Henrik Madsen
[email protected]
Informatics and Mathematical Modelling, Technical University of Denmark, DK-2800 Kgs. Lyngby

Based on: H. Madsen, Time Series Analysis, Chapman & Hall
Outline of the lecture

Identification of univariate time series models:
- Introduction, Sec. 6.1
- Estimation of auto-covariance and -correlation, Sec. 6.2.1 (and the intro to 6.2)
- Using the SACF, SPACF, and SIACF for suggesting model structure, Sec. 6.3
- Estimation of model parameters, Sec. 6.4
- Examples

Cursory material: the extended linear model class in Sec. 6.4.2 (we'll come back to the extended model class later)
Model building in general

1. Identification (specifying the model order), based on data, theory, and physical insight
2. Estimation (of the model parameters)
3. Model checking: is the model OK?
   - No: go back to step 1
   - Yes: apply the model (prediction, simulation, etc.)
Identification of univariate time series models

[Figure: plot of the time series at hand]

What ARIMA structure would be appropriate for the data at hand? (If any)

Given the structure we will then consider how to estimate the parameters (next lecture). What do we know about ARIMA models which could help us?
Estimation of the autocovariance function

Estimate of $\gamma(k)$:

$$C_{YY}(k) = C(k) = \hat{\gamma}(k) = \frac{1}{N}\sum_{t=1}^{N-|k|} (Y_t - \bar{Y})(Y_{t+|k|} - \bar{Y})$$

By symmetry it is enough to consider $k \ge 0$.

S-PLUS: acf(x, type = "covariance")
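The same estimate can be computed by hand and checked against the built-in function. A minimal sketch in R (the open-source successor to the S-PLUS dialect used on these slides):

# Compute gamma-hat(k) with the 1/N normalisation used above and
# compare with acf(..., type = "covariance"), which uses the same.
set.seed(1)
y <- arima.sim(model = list(ar = 0.8), n = 200)
gamma.hat <- function(y, k) {
  N    <- length(y)
  ybar <- mean(y)
  sum((y[1:(N - k)] - ybar) * (y[(1 + k):N] - ybar)) / N
}
sapply(0:5, gamma.hat, y = y)                               # by hand
acf(y, lag.max = 5, type = "covariance", plot = FALSE)$acf  # built-in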
Some properties of C(k)

The estimate is a non-negative definite function (as $\gamma(k)$).

The estimator is non-central (biased):

$$E[C(k)] = \frac{1}{N}\sum_{t=1}^{N-|k|} \gamma(k) = \left(1 - \frac{|k|}{N}\right)\gamma(k)$$

Asymptotically central (consistent) for fixed $k$: $E[C(k)] \to \gamma(k)$ for $N \to \infty$.

The estimates are themselves autocorrelated (don't trust apparent correlation at high lags too much).
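The bias factor $(1 - |k|/N)$ can be illustrated by simulation. A small sketch in R, using an AR(1) where $\gamma(k) = \phi^k/(1-\phi^2)$ for unit innovation variance (estimating the mean adds a further $O(1/N)$ bias, so the agreement is only approximate):

# Monte-Carlo check of E[C(k)] ~ (1 - |k|/N) * gamma(k) for an AR(1).
set.seed(2)
N <- 50; phi <- 0.7; k <- 5
gamma.k <- phi^k / (1 - phi^2)   # theoretical gamma(k)
C.k <- replicate(5000, {
  y <- arima.sim(model = list(ar = phi), n = N)
  ybar <- mean(y)
  sum((y[1:(N - k)] - ybar) * (y[(1 + k):N] - ybar)) / N
})
mean(C.k)               # simulated E[C(k)]
(1 - k / N) * gamma.k   # the slide's approximation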
How does C(k) behave for non-stationary series?

$$C(k) = \frac{1}{N}\sum_{t=1}^{N-|k|} (Y_t - \bar{Y})(Y_{t+|k|} - \bar{Y})$$
[Figure: a realisation of arima.sim(model = list(ar = 0.9, ndiff = 1), n = 500) together with its SACF; for this non-stationary series the SACF decays only very slowly with the lag]
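The figure can be reproduced in R, where the S-PLUS argument ndiff = 1 is expressed through the order component. A sketch:

# Simulate an ARIMA(1,1,0) series (R equivalent of the S-PLUS call
# in the figure title) and inspect its slowly decaying SACF.
set.seed(3)
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.9), n = 500)
plot(y)
acf(y)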
Autocorrelation and Partial Autocorrelation

Sample autocorrelation function (SACF): $\hat{\rho}(k) = r_k = C(k)/C(0)$

For white noise and $k \neq 0$ it holds that $E[\hat{\rho}(k)] \simeq 0$ and $V[\hat{\rho}(k)] \simeq 1/N$; this gives the bounds $\pm 2/\sqrt{N}$ for deciding when it is not possible to distinguish a value from zero.

S-PLUS: acf(x)

Sample partial autocorrelation function (SPACF): use the Yule-Walker equations on $\hat{\rho}(k)$ (exactly as for the theoretical relations).

It turns out that $\pm 2/\sqrt{N}$ is also appropriate for deciding when the SPACF is zero (more in the next lecture).

S-PLUS: acf(x, type="partial")
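In R the same plots, with the $\pm 2/\sqrt{N}$ guide lines drawn automatically, are obtained as follows (a sketch; strictly, R draws $\pm 1.96/\sqrt{N}$):

# SACF and SPACF of a simulated AR(2); the dashed lines are the
# approximate 95% bounds discussed above.
set.seed(4)
y <- arima.sim(model = list(ar = c(1.5, -0.7)), n = 200)
acf(y)    # SACF  (S-PLUS: acf(x))
pacf(y)   # SPACF (S-PLUS: acf(x, type = "partial"))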
What would be an appropriate structure?

[Figure: example series 1 with its SACF and SPACF]
What would be an appropriate structure?

[Figure: example series 2 with its SACF and SPACF]
What would be an appropriate structure?

[Figure: example series 3 with its SACF and SPACF]
What would be an appropriate structure?

[Figure: example series 4 with its SACF and SPACF]
What would be an appropriate structure?

[Figure: example series 5 with its SACF and SPACF]
What would be an appropriate structure?

[Figure: example series 6 with its SACF and SPACF]
What would be an appropriate structure?

[Figure: example series 7 with its SACF and SPACF]
Example of data from an MA(2)-process

[Figure: the MA(2) series with its SACF and SPACF]
Example of data from a non-stationary process

[Figure: the series with its SACF and SPACF; the SACF dies out only slowly]
Same series; analysing $\nabla Y_t = (1 - B)Y_t = Y_t - Y_{t-1}$

[Figure: the differenced series with its SACF and SPACF]
Identification of the order of differencing

- Select the order of differencing $d$ as the first order for which the autocorrelation decreases sufficiently fast towards 0
- In practice $d$ is 0, 1, or maybe 2
- Sometimes a periodic difference is required, e.g. $Y_t - Y_{t-12}$
- Remember to consider the practical application... it may be that the system is stationary, but you measured over too short a period
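In R this procedure is a matter of differencing and re-inspecting the SACF; a minimal sketch:

# Difference until the SACF dies out quickly: diff(y) is (1 - B)y_t,
# diff(y, lag = 12) is the periodic difference y_t - y_{t-12}.
set.seed(5)
y <- arima.sim(model = list(order = c(0, 1, 1), ma = 0.6), n = 300)
acf(y)        # slow, almost linear decay suggests differencing
acf(diff(y))  # SACF of (1 - B)y_t dies out quickly, so d = 1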
Stationarity vs. length of measuring period

[Figure: US/CA 30 day interest rate differential, 1976-1996]

[Figure: US/CA 30 day interest rate differential, 1990-1996 (the last part of the same series, by quarter)]
Identification of the ARMA-part

Characteristics of the autocorrelation functions:

Process      | ACF $\rho(k)$                                              | PACF $\phi_{kk}$
AR(p)        | Damped exponential and/or sine functions                   | $\phi_{kk} = 0$ for $k > p$
MA(q)        | $\rho(k) = 0$ for $k > q$                                  | Dominated by damped exponential and/or sine functions
ARMA(p,q)    | Damped exponential and/or sine functions after lag $q - p$ | Dominated by damped exponential and/or sine functions after lag $p - q$

The IACF is similar to the PACF; see the book page 133.
Behaviour of the SACF $\hat{\rho}(k)$ (based on N obs.)

If the process is white noise then

$$\pm 2\sqrt{\frac{1}{N}}$$

is an approximate 95% confidence interval for the SACF at lags different from 0.

If the process is an MA(q)-process then

$$\pm 2\sqrt{\frac{1 + 2(\hat{\rho}^2(1) + \cdots + \hat{\rho}^2(q))}{N}}$$

is an approximate 95% confidence interval for the SACF at lags larger than q.
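The MA(q) bound is easily computed from the SACF itself. A sketch in R, under the assumption that the data really are MA(2):

# Approximate 95% interval for the SACF at lags > q for an MA(q).
set.seed(6)
y <- arima.sim(model = list(ma = c(0.8, 0.4)), n = 200)   # an MA(2)
N <- length(y); q <- 2
r <- acf(y, lag.max = 20, plot = FALSE)$acf[-1]   # rho-hat(1), rho-hat(2), ...
bound <- 2 * sqrt((1 + 2 * sum(r[1:q]^2)) / N)    # the formula above
r[(q + 1):20]   # SACF at lags > q should mostly fall inside ...
bound           # ... +/- this (wider than 2/sqrt(N))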
Behaviour of the SPACF $\hat{\phi}_{kk}$ (based on N obs.)

If the process is an AR(p)-process then

$$\pm 2\sqrt{\frac{1}{N}}$$

is an approximate 95% confidence interval for the SPACF at lags larger than p.
Model building in general (recap)

1. Identification (specifying the model order), based on data, theory, and physical insight
2. Estimation (of the model parameters)
3. Model checking: is the model OK?
   - No: go back to step 1
   - Yes: apply the model (prediction, simulation, etc.)
Estimation

We have an appropriate model structure AR(p), MA(q), ARMA(p,q), or ARIMA(p,d,q) with p, d, and q known.

Task: based on the observations, find appropriate values of the parameters.

The book describes many methods:
- Moment estimates
- LS estimates
- Prediction error estimates (conditioned or unconditioned)
- ML estimates (conditioned or unconditioned (exact))
Example

Using the autocorrelation functions we agreed that

$$\hat{y}_{t+1|t} = a_1 y_t + a_2 y_{t-1}$$

and we would select $a_1$ and $a_2$ so that the sum of the squared prediction errors becomes as small as possible when using the model on the data at hand.
To comply with the notation of the book we will write the 1-step forecasts as

$$\hat{y}_{t+1|t} = -\phi_1 y_t - \phi_2 y_{t-1}$$
The errors given the parameters ($\phi_1$ and $\phi_2$)

Observations: $y_1, y_2, \ldots, y_N$

Errors: $e_{t+1|t} = y_{t+1} - \hat{y}_{t+1|t} = y_{t+1} - (-\phi_1 y_t - \phi_2 y_{t-1})$

$$e_{3|2} = y_3 + \phi_1 y_2 + \phi_2 y_1$$
$$e_{4|3} = y_4 + \phi_1 y_3 + \phi_2 y_2$$
$$e_{5|4} = y_5 + \phi_1 y_4 + \phi_2 y_3$$
$$\vdots$$
$$e_{N|N-1} = y_N + \phi_1 y_{N-1} + \phi_2 y_{N-2}$$

In matrix form:

$$\begin{bmatrix} y_3 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} -y_2 & -y_1 \\ \vdots & \vdots \\ -y_{N-1} & -y_{N-2} \end{bmatrix} \begin{bmatrix} \phi_1 \\ \phi_2 \end{bmatrix} + \begin{bmatrix} e_{3|2} \\ \vdots \\ e_{N|N-1} \end{bmatrix}$$

Or just: $Y = X\theta + \varepsilon$
Solution

To minimize the sum of the squared 1-step prediction errors $\varepsilon^T \varepsilon$ we use the result for the General Linear Model from Chapter 3:

$$\hat{\theta} = (X^T X)^{-1} X^T Y$$

with

$$X = \begin{bmatrix} -y_2 & -y_1 \\ \vdots & \vdots \\ -y_{N-1} & -y_{N-2} \end{bmatrix} \quad \text{and} \quad Y = \begin{bmatrix} y_3 \\ \vdots \\ y_N \end{bmatrix}$$

The method is called the LS estimator for dynamical systems. It is also in the class of prediction error methods since it minimizes the sum of the squared 1-step prediction errors.

How does it generalize to AR(p)-models?
Small illustrative example using S-PLUS

> obs
[1] -3.51 -3.81 -1.85 -2.02 -1.91 -0.88
> N <- length(obs)                          # reconstructed; lost in extraction
> X <- cbind(-obs[2:(N-1)], -obs[1:(N-2)])  # reconstructed; consistent with X below
> X
     [,1] [,2]
[1,] 3.81 3.51
[2,] 1.85 3.81
[3,] 2.02 1.85
[4,] 1.91 2.02
> Y <- obs[3:N]                             # reconstructed
> solve(t(X) %*% X, t(X) %*% Y)  # Estimates
           [,1]
[1,] -0.1474288
[2,] -0.4476040
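To answer the generalization question: for an AR(p) the X matrix simply gets p columns of lagged values. A sketch in R (the helper name lsAR is illustrative only):

# LS estimation of an AR(p) in the book's sign convention
# (y_t + phi_1 y_{t-1} + ... + phi_p y_{t-p} = e_t).
lsAR <- function(y, p) {
  N <- length(y)
  Y <- y[(p + 1):N]
  X <- sapply(1:p, function(j) -y[(p + 1 - j):(N - j)])
  solve(t(X) %*% X, t(X) %*% Y)   # theta-hat = (X'X)^{-1} X'Y
}
obs <- c(-3.51, -3.81, -1.85, -2.02, -1.91, -0.88)
lsAR(obs, 2)   # reproduces the estimates above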
Maximum likelihood estimates

ARMA(p,q)-process:

$$Y_t + \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$

Notation:

$$\theta^T = (\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q)$$
$$Y_t^T = (Y_t, Y_{t-1}, \ldots, Y_1)$$

The likelihood function is the joint probability distribution function of all observations for given values of $\theta$ and $\sigma_\varepsilon^2$:

$$L(Y_N; \theta, \sigma_\varepsilon^2) = f(Y_N | \theta, \sigma_\varepsilon^2)$$

Given the observations $Y_N$ we estimate $\theta$ and $\sigma_\varepsilon^2$ as the values for which the likelihood is maximized.
The likelihood function for ARMA(p,q)-models

The random variable $Y_N | Y_{N-1}$ only contains $\varepsilon_N$ as a random component. $\varepsilon_N$ is white noise at time $N$ and therefore does not depend on anything in the past.

We therefore know that the random variables $Y_N | Y_{N-1}$ and $Y_{N-1}$ are independent, hence:

$$f(Y_N | \theta, \sigma_\varepsilon^2) = f(Y_N | Y_{N-1}, \theta, \sigma_\varepsilon^2) f(Y_{N-1} | \theta, \sigma_\varepsilon^2)$$

Repeating these arguments:

$$L(Y_N; \theta, \sigma_\varepsilon^2) = \left( \prod_{t=p+1}^{N} f(Y_t | Y_{t-1}, \theta, \sigma_\varepsilon^2) \right) f(Y_p | \theta, \sigma_\varepsilon^2)$$
The conditional likelihood function

Evaluation of $f(Y_p | \theta, \sigma_\varepsilon^2)$ requires special attention.

It turns out that the conditional likelihood function

$$L(Y_N; \theta, \sigma_\varepsilon^2) = \prod_{t=p+1}^{N} f(Y_t | Y_{t-1}, \theta, \sigma_\varepsilon^2)$$

results in the same estimates as the exact likelihood function when many observations are available. For small samples there can be some difference.

Software: the S-PLUS function arima.mle calculates conditional estimates; the R function arima calculates exact estimates.
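For routine work one therefore just calls the built-in fitter. A sketch in R (note that R's arima.sim and arima use AR signs opposite to the book's $\phi$'s):

# Fit an ARMA(1,1): exact ML (default) and conditional ("CSS").
set.seed(8)
y <- arima.sim(model = list(ar = 0.7, ma = 0.4), n = 500)
arima(y, order = c(1, 0, 1), include.mean = FALSE)                  # exact ML
arima(y, order = c(1, 0, 1), include.mean = FALSE, method = "CSS")  # conditional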
Evaluating the conditional likelihood function

Task: find the conditional densities given specified values of the parameters $\theta$ and $\sigma_\varepsilon^2$.

The mean of the random variable $Y_t | Y_{t-1}$ is the 1-step forecast $\hat{Y}_{t|t-1}$. The prediction error $\varepsilon_t = Y_t - \hat{Y}_{t|t-1}$ has variance $\sigma_\varepsilon^2$.

We assume that the process is Gaussian:

$$f(Y_t | Y_{t-1}, \theta, \sigma_\varepsilon^2) = \frac{1}{\sigma_\varepsilon \sqrt{2\pi}} e^{-(Y_t - \hat{Y}_{t|t-1}(\theta))^2 / 2\sigma_\varepsilon^2}$$

And therefore:

$$L(Y_N; \theta, \sigma_\varepsilon^2) = (2\pi\sigma_\varepsilon^2)^{-\frac{N-p}{2}} \exp\left( -\frac{1}{2\sigma_\varepsilon^2} \sum_{t=p+1}^{N} \varepsilon_t^2(\theta) \right)$$
ML estimates

The (conditional) ML estimate $\hat{\theta}$ is a prediction error estimate since it is obtained by minimizing

$$S(\theta) = \sum_{t=p+1}^{N} \varepsilon_t^2(\theta)$$

By differentiating w.r.t. $\sigma_\varepsilon^2$ it can be shown that the ML estimate of $\sigma_\varepsilon^2$ is

$$\hat{\sigma}_\varepsilon^2 = S(\hat{\theta})/(N - p)$$

The estimate $\hat{\theta}$ is asymptotically "good" and its variance-covariance matrix is approximately $2\sigma_\varepsilon^2 H^{-1}$, where $H$ contains the 2nd order partial derivatives of $S(\theta)$ at the minimum.
Finding the ML estimates using the PE method

1-step predictions:

$$\hat{Y}_{t|t-1} = -\phi_1 Y_{t-1} - \cdots - \phi_p Y_{t-p} + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$

If we use $\varepsilon_p = \varepsilon_{p-1} = \cdots = \varepsilon_{p+1-q} = 0$ we can find

$$\hat{Y}_{p+1|p} = -\phi_1 Y_p - \cdots - \phi_p Y_1 + \theta_1 \varepsilon_p + \cdots + \theta_q \varepsilon_{p+1-q}$$

which gives us $\varepsilon_{p+1} = Y_{p+1} - \hat{Y}_{p+1|p}$; we can then calculate $\hat{Y}_{p+2|p+1}$ and $\varepsilon_{p+2}$, and so on until we have all the 1-step prediction errors we need.

We use numerical optimization to find the parameters which minimize the sum of squared prediction errors.
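A minimal sketch of this recursion for an ARMA(1,1) in R, with the initial prediction error set to zero as above (the names condSS and opt are illustrative):

# Conditional sum of squares S(theta) for an ARMA(1,1) in the book's
# convention Y_t + phi Y_{t-1} = e_t + theta e_{t-1}, minimised
# numerically; the simulated series has phi = 0.7, theta = 0.4 in
# this convention (R's AR sign differs).
condSS <- function(par, y) {
  phi <- par[1]; theta <- par[2]
  N <- length(y)
  e <- numeric(N)                                # e[1] = 0: the conditioning
  for (t in 2:N) {
    yhat <- -phi * y[t - 1] + theta * e[t - 1]   # 1-step forecast
    e[t] <- y[t] - yhat
  }
  sum(e[2:N]^2)
}
set.seed(9)
y <- arima.sim(model = list(ar = -0.7, ma = 0.4), n = 500, sd = 0.25)
opt <- optim(c(0, 0), condSS, y = y, hessian = TRUE)
opt$par                            # estimates of (phi, theta)
s2 <- opt$value / (length(y) - 1)  # sigma-hat^2 = S(theta-hat)/(N - p)
2 * s2 * solve(opt$hessian)        # approximate variance-covariance matrix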
$S(\theta)$ for $(1 + 0.7B)Y_t = (1 - 0.4B)\varepsilon_t$ with $\sigma_\varepsilon^2 = 0.25^2$

Data: arima.sim(model=list(ar=-0.7, ma=0.4), n=500, sd=0.25)

[Figure: contour plot of $S(\theta)$ over the AR parameter (vertical axis) and the MA parameter (horizontal axis), with contour levels from about 30 to 45 and the minimum near the true parameters]
Moment estimates

Given the model structure:
- Find formulas for the theoretical autocorrelation or autocovariance as functions of the parameters in the model
- Estimate them, e.g. calculate the SACF
- Solve the equations, using the lowest lags necessary

Complicated! General properties of the estimator unknown!
Moment estimates for AR(p)-processes

In this case moment estimates are simple to find due to the Yule-Walker equations. We simply plug the estimated autocorrelation function at lags 1 to p into

$$\begin{bmatrix} \hat{\rho}(1) \\ \hat{\rho}(2) \\ \vdots \\ \hat{\rho}(p) \end{bmatrix} = \begin{bmatrix} 1 & \hat{\rho}(1) & \cdots & \hat{\rho}(p-1) \\ \hat{\rho}(1) & 1 & \cdots & \hat{\rho}(p-2) \\ \vdots & & & \vdots \\ \hat{\rho}(p-1) & \hat{\rho}(p-2) & \cdots & 1 \end{bmatrix} \begin{bmatrix} -\phi_1 \\ -\phi_2 \\ \vdots \\ -\phi_p \end{bmatrix}$$

and solve w.r.t. the $\phi$'s.

The function ar in S-PLUS does this.
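A sketch in R, solving the Yule-Walker system directly and comparing with the built-in ar (whose coefficients equal the book's $-\phi$'s):

# Yule-Walker (moment) estimates for an AR(2).
set.seed(10)
y <- arima.sim(model = list(ar = c(1.5, -0.7)), n = 500)
p <- 2
r <- acf(y, lag.max = p, plot = FALSE)$acf[-1]   # rho-hat(1), ..., rho-hat(p)
R <- toeplitz(c(1, r[1:(p - 1)]))                # correlation matrix above
solve(R, r)                                      # = -phi-hat (book's signs)
ar(y, order.max = p, method = "yule-walker", aic = FALSE)$ar   # built-in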