Introduction to Bootstrap Methods
Miguel Sarzosa
Department of Economics, University of Maryland

Econ626: Empirical Microeconomics, 2012

Outline
1. Recreating the Universe
2. Bootstrap Estimates
3. The Jackknife
4. Applications
5. Now we go to Stata!

Why is Bootstrap Important?

Only in specific instances are we able to infer population parameters from a data set:
- Data sets are often samples of the population
- Most of the estimators we use for inference rely on asymptotic results (CLT), a setting that is impossible to have in the data unless we use an approximation like the bootstrap

Bootstrap is used for two things:
1. Computation of SEs and CIs (practical)
2. Asymptotic behavior of estimators (theory)

How does it work?

Bootstrap views the sample you have in your data set as the population of interest. We obtain estimates of characteristics of the distribution of a given estimator $\hat{\mu}$ by drawing B subsamples of size N with replacement. Then we get B estimates of $\hat{\mu}$ (i.e., $\hat{\mu}_b$), from which we can obtain moments like the mean and the variance:

$$\hat{\mu} \sim \left( B^{-1} \sum_{b=1}^{B} \hat{\mu}_b ,\;\; (B-1)^{-1} \sum_{b=1}^{B} \Big( \hat{\mu}_b - B^{-1} \sum_{b=1}^{B} \hat{\mu}_b \Big)^{2} \right)$$

Algorithm

1. Take a subsample of size N with replacement
2. Calculate the desired statistic on the subsample
3. Repeat 1. and 2. B times, where B is a large number
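
As a concrete illustration, here is a minimal by-hand version of this algorithm in Stata, bootstrapping the mean of price in the built-in auto dataset. The dataset, the variable, and B = 400 are illustrative choices, not part of the original slides:

sysuse auto, clear
set seed 10101
tempname sim
postfile `sim' meanprice using bs_results, replace
forvalues b = 1/400 {
    preserve
    bsample                     // step 1: draw N observations with replacement
    quietly summarize price     // step 2: compute the statistic on the subsample
    post `sim' (r(mean))
    restore
}                               // step 3: repeat B = 400 times
postclose `sim'
use bs_results, clear
summarize meanprice             // mean and SD of the B bootstrap estimates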

How Many Repetitions?

In most cases, the more the better. However, bootstrapping can be computationally intensive. Andrews and Buchinsky (2000) came up with the rule that you should take $B = 384\omega$ repetitions, where $\omega$ varies with the statistic of interest:

SE: $\omega = (2 + \gamma_4)/4$, where $\gamma_4$ is the excess kurtosis, so fatter tails mean a higher B
- For no excess kurtosis, B = 192
- For $\gamma_4 = 8$, B = 960

Two-sided CI: depends on the critical value
- For $\alpha = 0.05$, B = 348
- For $\alpha = 0.01$, B = 685

One-sided CI: depends on the critical value
- For $\alpha = 0.05$, B = 634
- For $\alpha = 0.01$, B = 989
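
The SE figures above follow directly from the rule; a quick arithmetic check in Stata, using nothing beyond the formula on this slide:

display 384 * (2 + 0) / 4    // no excess kurtosis: B = 192
display 384 * (2 + 8) / 4    // excess kurtosis of 8: B = 960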

SE Estimation

One of the main uses of bootstrap is to calculate the correct SE. SE estimation through bootstrap is very useful, for example, in two-step estimations (e.g., IV) whose SEs are difficult to compute. The variance of an estimate $\hat{\theta}$ calculated using bootstrap is given by

$$S_{\hat{\theta}}^2 = (B-1)^{-1} \sum_{b=1}^{B} \left( \hat{\theta}_b^{*} - \bar{\hat{\theta}} \right)^{2} \qquad (1)$$

where

$$\bar{\hat{\theta}} = B^{-1} \sum_{b=1}^{B} \hat{\theta}_b^{*}$$

$S_{\hat{\theta}}$ is consistent; therefore it can be used to obtain CIs and to test hypotheses.
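
As a sketch of the two-step use case, Stata's bootstrap prefix re-runs the full procedure (both stages) on every resample; the dataset, instrument, and B = 400 below are illustrative assumptions, not from the slides:

sysuse auto, clear
* Re-estimate the whole 2SLS procedure on each of B = 400 resamples
bootstrap _b, reps(400) seed(10101): ivregress 2sls price (mpg = weight)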

Bias Estimation

From (1) it is easy to see that the bias of the estimate $\hat{\theta}$ is given by

$$\text{Bias}_{\hat{\theta}} = \bar{\hat{\theta}} - \hat{\theta}$$

This allows us to correct for the bias in an estimate. The bias-corrected estimate is given by

$$\hat{\theta}_{corr} = \hat{\theta} - \text{Bias}_{\hat{\theta}} = \hat{\theta} - \left( \bar{\hat{\theta}} - \hat{\theta} \right) = 2\hat{\theta} - \bar{\hat{\theta}}$$
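
In Stata, the estimated bias is reported after any bootstrap run; a minimal illustration (the dataset and model are placeholders):

sysuse auto, clear
bootstrap _b, reps(400) seed(10101): regress price mpg
estat bootstrap, bc    // reports the bootstrap bias and bias-corrected CIs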

The Jackknife

Uses N subsamples of size N − 1 (drops one observation at a time). Let $\hat{\theta}_{(i)}$ denote the estimate obtained when observation i is dropped. Then the jackknife estimate is given by

$$\bar{\hat{\theta}} = N^{-1} \sum_{i=1}^{N} \hat{\theta}_{(i)}$$

Then the bias is

$$\text{Bias}_{\hat{\theta}} = (N-1) \left( \bar{\hat{\theta}} - \hat{\theta} \right)$$

and the corrected estimate is

$$\hat{\theta}_{corr} = \hat{\theta} - \text{Bias}_{\hat{\theta}} = N\hat{\theta} - (N-1)\bar{\hat{\theta}}$$

The jackknife SE is given by

$$S_{\hat{\theta}} = \left[ \frac{N-1}{N} \sum_{i=1}^{N} \left( \hat{\theta}_{(i)} - \bar{\hat{\theta}} \right)^{2} \right]^{1/2}$$
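
Stata has a jackknife prefix analogous to bootstrap; a minimal sketch with an illustrative dataset and model:

sysuse auto, clear
jackknife _b: regress price mpg weight    // runs N leave-one-out regressions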

Heteroskedastic Errors

In the presence of heteroskedasticity we use heteroskedasticity-robust (Eicker-Huber-White) SEs, but they perform very poorly in small samples. Bootstrap can be a better choice.

Paired Bootstrap: Obtain samples of $(y_i, x_i)$ and estimate (i.e., regress). Assuming that each draw is i.i.d., we are able to do valid inference because we are still allowing $\text{Var}[u_i | x_i]$ to vary with $x_i$. In Stata: vce(bootstrap)

Wild Bootstrap: Obtain samples of $(y_i^{*}, x_i)$ where $y_i^{*} = x_i \hat{\beta} + \hat{u}_i^{*}$ and

$$\hat{u}_i^{*} = \begin{cases} \dfrac{1-\sqrt{5}}{2}\,\hat{u}_i & \text{with probability } \dfrac{1+\sqrt{5}}{2\sqrt{5}} \\[1ex] \dfrac{1+\sqrt{5}}{2}\,\hat{u}_i & \text{with probability } 1 - \dfrac{1+\sqrt{5}}{2\sqrt{5}} \end{cases}$$
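
A by-hand sketch of this wild bootstrap in Stata, using the two-point weights above; the dataset, regressor, and B = 400 are illustrative assumptions:

sysuse auto, clear
set seed 10101
regress price mpg
predict double xb, xb                      // fitted values x_i * beta-hat
predict double uhat, residuals             // residuals u-hat_i
local p = (1 + sqrt(5)) / (2*sqrt(5))      // probability of the first weight
tempname sim
postfile `sim' b_mpg using wild_results, replace
forvalues b = 1/400 {
    quietly {
        gen double ustar = cond(runiform() < `p', ///
            (1 - sqrt(5))/2 * uhat, (1 + sqrt(5))/2 * uhat)
        gen double ystar = xb + ustar      // y*_i = x_i*beta-hat + u*_i
        regress ystar mpg
        post `sim' (_b[mpg])
        drop ustar ystar
    }
}
postclose `sim'
use wild_results, clear
summarize b_mpg                            // SD = wild-bootstrap SE of the slope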

Panel Data and Clustered Data

Note that in the paired bootstrap we assumed the $(y_i, x_i)$ draws were i.i.d. When we cannot claim that because the observations are not independently distributed (i.e., panel or clustered data), we use the panel bootstrap.

Suppose a panel has two dimensions, i and t. In the panel bootstrap we resample over i and not over t. That is, when we bootstrap we choose the i's that will appear in the subsample and keep all the t observations of the chosen i's. A key assumption is that the data are independent over i.

The same procedure is used when the data are clustered: we resample over the clusters and then keep all observations belonging to each sampled cluster. We need a large number of clusters.

In Stata: vce(bootstrap, cluster(varlist))
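
A minimal sketch with Stata's example panel dataset (nlswork); the model is purely illustrative:

webuse nlswork, clear
set seed 10101
* Resample whole individuals (all years of each sampled idcode)
regress ln_wage tenure, vce(bootstrap, cluster(idcode) reps(400))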

Implementation Issues

Remember that the assumption of independence across observations, or across clusters of observations, is crucial.
In some cases the second moment might not exist, even asymptotically, so the bootstrap results will be misleading.
The basic bootstrap assumes the estimator is a smooth estimator, $\sqrt{N}$-consistent and asymptotically normal.

Implementation in Stata

Stata users can perform bootstrapped estimations in two ways:
1. Some commands incorporate the bootstrap option directly: type vce(bootstrap) or vce(bootstrap, cluster(varlist))
2. Using the bootstrap command, Stata users can also incorporate bootstrap estimation in a large number of commands, including:
   1. those without the vce(bootstrap) option,
   2. non-estimation commands (e.g., summarize),
   3. user-written commands

The bootstrap Command

The syntax of bootstrap differs from the main style of Stata commands. It requires first specifying the estimate that is going to be bootstrapped, then the bootstrap options, and then the command that is going to be bootstrapped:

bootstrap exp_list [, options eform_option ] : command

where exp_list specifies the estimates that will be bootstrapped (e.g., _b, _b[x1], or _se). Among the most important options we have:
- reps(#)
- seed(#)
- cluster(varlist)
- strata(varlist)
- size(#)

Example

bootstrap _b, reps(100) seed(10101) cluster(clusvar): proIVpro if desocupa==0 & year==2000

Now we go to Stata!