INTRO TO R. MONTECARLO SIMULATIONS
DR. PABLO GOMEZ

1. First problem (Level of difficulty: low)

You can use the following commands to generate random numbers from exponential, uniform, and normal distributions, respectively. All you need to do is set n to the desired sample size and adjust the parameters to change the characteristics of the distribution.

1.0.1. R toolbox.

rexp(n, rate=1)
runif(n, min=0, max=1)
rnorm(n, mean = 0, sd = 1)

So, for example, you can get 20 numbers from a normal distribution with µ = 100, σ = 15, store those numbers in an array x, and then obtain the sample mean x̄ and the sample standard deviation s with:

x=rnorm(20,100,15)
mean(x)
sd(x)

1.1. Manipulating distributions.

(1) Why is mean(x) not equal to 100, and sd(x) not equal to 15?
(2) What do you have to do to the vector (or array) x to accomplish the following effects? (IMPORTANT: It does not involve resampling)
(3) increase the mean by 100, without changing the sd;
(4) increase the sd by 15, without changing the mean;
(5) decrease both the mean and the sd by half?

2. Your first Montecarlo simulation

Suppose that you are running an experiment in which the null hypothesis is correct. In other words, God knows that µ1 = µ2. But because you are a stubborn student, you decide to run the experiment 1000 times. In how many of those 1000 runs would you expect to reject the null hypothesis? Of course, as a graduate student you should know the answer to this question, but often you need to show it to yourself. This is easy to do in R.
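Because every call to these generators draws fresh random numbers, your output will differ from run to run. The sketch below is an addition to the handout, not part of the original exercises: it uses set.seed() so that the draws can be reproduced while you experiment. The seed value 1 and the variable names are arbitrary assumptions made for illustration.

```r
# Sketch only: the seed value 1 is arbitrary, chosen for illustration.
set.seed(1)
x_exp  = rexp(5, rate = 1)              # 5 draws from an exponential, rate 1
x_unif = runif(5, min = 0, max = 1)     # 5 draws from a uniform on [0, 1]
x_norm = rnorm(20, mean = 100, sd = 15) # 20 draws from a normal, mean 100, sd 15

# Resetting the seed and repeating the first call reproduces the same draws.
set.seed(1)
x_exp_again = rexp(5, rate = 1)
identical(x_exp, x_exp_again)

# Sample statistics of the 20 normal draws.
mean(x_norm)
sd(x_norm)
```

Drop the set.seed() calls when you want a new sample on every run.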

2.1. Simple example.

To simplify our problem even more, let's play with a one-sample t-test. Let's suppose that our n = 20 sample comes from a normal distribution with mean=100, sd=15; that is, the x_i are i.i.d. N(100, 15).

2.1.1. R toolbox.

The rnorm(n, µ, σ) function returns n numbers from a normal distribution with a given µ and σ. Try the following commands in R and then play around with the parameters of the functions.

rnorm(1000,0,1)
x=rnorm(1000,0,1)
hist(x)
plot(x)
mean(x)
var(x)
summary(x)

2.1.2. Going back to our example.

We can obtain 20 observations i.i.d. N(100, 15) with:

x=rnorm(20,100,15)

Then, we can run a one-sample t-test in which H0 : µ = 100 with:

result=t.test(x, mu=100)
result
result$p.value
result$statistic
result$method

To find help on the t-test function just type help(t.test). It is unlikely that your p.value is less than .05, but if you do it enough times, you will eventually find a "significant" result. So let's repeat this test 10,000 times.

reps=10000
result=array(dim=reps)
for(i in 1:reps){
  x=rnorm(20,100,15)
  result[i]=t.test(x, mu=100)$p.value
}

How many of these simulations were significant? We can look at the histogram of p.values. How do you think they are distributed? Try the following methods to look at the results:

hist(result) #

plot(density(result, from=0, to=1)) # table(result
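
One way to answer "how many of these simulations were significant?" is to count the p-values below the chosen cutoff. The sketch below is an addition to the handout and makes no claim about what the original listing went on to do: it repeats the simulation loop so that it runs on its own, and the seed, the cutoff name alpha, and the other variable names are assumptions made for illustration.

```r
# Sketch only: seed, alpha, and variable names are illustrative assumptions.
set.seed(123)
reps  = 10000
alpha = 0.05

# Same simulation as in the handout: 20 draws from N(100, 15), tested against mu = 100.
p_values = numeric(reps)
for (i in 1:reps) {
  x = rnorm(20, 100, 15)
  p_values[i] = t.test(x, mu = 100)$p.value
}

# A logical comparison yields TRUE/FALSE; sum() counts the TRUEs, mean() gives their share.
n_significant    = sum(p_values < alpha)
prop_significant = mean(p_values < alpha)
n_significant
prop_significant
```

Compare prop_significant with the answer you expected in Section 2, then change alpha or the sample size and see what happens.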