Estimation: Chapter 8

The Problem: Given some data:
• Estimate an unknown quantity.
• Assess accuracy.
Example: Estimate population characteristics from a sample.

Brief Outline
• Statistical Models
• Methods of Estimation: MOM and MLE
• Properties: Bias, Variance, and MSE

Statistical Models
Data: A random variable (or vector) X has a (joint) distribution.
The Unknowns: The joint distribution is not completely known.
Example: Sampling. If $X_1, \dots, X_n$ are a SRS with replacement from a population, then $X_1, \dots, X_n \sim_{\text{iid}} F$, where F is the population distribution function.
• Non-parametric models: very little is known about F.
• Parametric models: the shape of F is known; for example, F is $\text{Normal}(\mu, \sigma^2)$ for some unknown $\mu$ and $\sigma^2$.

Another Example: Time Series
AR(1) Processes: Many economic indices evolve according to an equation of the form
$$Y_k = \alpha Y_{k-1} + Z_k, \quad k = 1, 2, \dots,$$
where the $Z_k$ are i.i.d. For example, $Y_k$ might be the number of employed individuals in month k. Here:
• The $Y_k$ are observed, but the $Z_k$ are not.
• The unknowns are $\alpha$ and the distribution of the $Z_k$.

Estimation
Statistical Model: $X \sim f$, where f is unknown. [X may be a vector.]
Estimators: Let $\theta$ denote an unknown (real) characteristic of f; for example, the mean, median, or standard deviation. An estimator of $\theta$ is a function $\hat\theta = \hat\theta(X)$.
Notes
1. A very general definition.
2. Need methods for constructing estimators.
3. Also need criteria for comparing them.
Example: Sample-analogue estimation; e.g., estimate the population mean $\mu$ by the sample mean $\bar X$, as in the sketch below.
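A minimal Python sketch of sample-analogue estimation (the Exponential population, its parameters, and the sample size are illustrative assumptions, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: Exponential with mean 2.0 (an assumption).
sample = rng.exponential(scale=2.0, size=100)

# Sample analogues: plug the sample into the definition of each
# population characteristic.
mean_hat = sample.mean()         # estimates the population mean
median_hat = np.median(sample)   # estimates the population median
sd_hat = sample.std(ddof=1)      # estimates the population std. deviation
print(mean_hat, median_hat, sd_hat)
```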

The Method of Moments: An Example
Suppose
$$X_1, \dots, X_n \sim_{\text{iid}} \text{Exponential}(\lambda).$$
Then
$$\mu = \frac{1}{\lambda}.$$
Estimate $\mu$ by $\bar X$ and $\lambda$ by
$$\hat\lambda = \frac{1}{\bar X} = \frac{n}{X_1 + \cdots + X_n}.$$

More on MOM
Moments: For a sample $X_1, \dots, X_n \sim_{\text{iid}} F$, let
$$\mu_k = E(X_1^k), \quad k = 1, 2, \dots$$
(assumed finite), and
$$\hat\mu_k = \frac{1}{n}\sum_{i=1}^n X_i^k.$$
The Procedure: If $X_1, \dots, X_n$ are i.i.d. and $\theta$ is the only unknown, let $\mu(\theta) = E(X_i)$ and estimate $\theta$ by solving $\mu(\hat\theta) = \bar X$.
Estimation: If the unknowns are $\theta$ and $\eta = (\eta_1, \dots, \eta_m)$, solve
$$\mu_k(\theta, \eta) = \hat\mu_k, \quad k = 1, 2, \dots$$
for $\hat\theta$ and $\hat\eta$.
Notes
1. Use as many equations as are needed to get a unique solution, usually m + 1.
2. Doesn't always work.
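A minimal sketch of the exponential MOM estimator above (the true rate 0.5 and the sample size are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
lam_true = 0.5   # assumed true rate, for illustration only
x = rng.exponential(scale=1 / lam_true, size=1000)

# MOM: mu(lambda) = 1/lambda, so solve 1/lambda_hat = xbar.
lam_hat = 1 / x.mean()
print(lam_hat)   # close to lam_true
```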

Likelihood: An Example
The Story:
• Experimental drug.
• Unknown probability of success, p.
• Tried on n patients.
• $X = \#\text{Successes} \sim \text{Binomial}(n, p)$; that is,
$$P[X = x] = \binom{n}{x} p^x (1 - p)^{n - x}.$$
The Likelihood Function: If n = 9 and X = 5, say, let
$$L(p \mid X = 5) = \binom{9}{5} p^5 (1 - p)^{9 - 5}.$$

[Figure: Binomial likelihood for n = 9 and X = 5.]
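A quick way to see the likelihood as a function of p is to evaluate it on a grid; a minimal sketch using scipy (the grid spacing is an arbitrary choice):

```python
import numpy as np
from scipy.stats import binom

n, x = 9, 5
p_grid = np.linspace(0.01, 0.99, 99)
L = binom.pmf(x, n, p_grid)    # L(p | X = 5) for each p on the grid

# The grid maximizer is near the analytic MLE x/n = 5/9.
print(p_grid[np.argmax(L)])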

The Likelihood Function: The Definition
If X has a PMF (or density)
$$P[X = x] = f(x; \theta)$$
depending on an unknown parameter $\theta$, then
$$L(\theta \mid x) = f(x; \theta)$$
is called the likelihood function.
Interpretation: The a priori probability of observing x, viewed as a function of $\theta$.
Note: X and/or $\theta$ may be a vector here.

Maximum Likelihood Estimation
Def: The MLE $\hat\theta = \hat\theta(x)$ is defined by
$$L(\hat\theta \mid x) = \max_\theta L(\theta \mid x).$$
Equivalently,
$$\ell(\hat\theta \mid x) = \max_\theta \ell(\theta \mid x),$$
where
$$\ell(\theta \mid x) = \log[L(\theta \mid x)].$$
Note: $\ell$ is called the log-likelihood function.
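In practice the maximization is often done numerically by minimizing the negative log-likelihood; a minimal sketch for the binomial example above (scipy's bounded scalar minimizer is one arbitrary choice of optimizer):

```python
from scipy.optimize import minimize_scalar
from scipy.stats import binom

n, x = 9, 5

def neg_loglik(p):
    # -l(p | x); minimizing this maximizes the log-likelihood.
    return -binom.logpmf(x, n, p)

res = minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(res.x, 5 / 9)   # numeric MLE vs. the analytic answer
```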

Example: Binomial (n = 9, x = 5)
Here
$$L(p \mid x) = \binom{9}{5} p^5 (1 - p)^4,$$
$$\ell(p \mid x) = 5 \log(p) + 4 \log(1 - p) + \log\binom{9}{5},$$
and
$$\ell'(p \mid x) = \frac{5}{p} - \frac{4}{1 - p}.$$
So $\ell'(p) = 0$ iff
$$\frac{5}{p} = \frac{4}{1 - p},$$
iff $5(1 - p) = 4p$, iff
$$\hat p = \frac{5}{9} = \frac{x}{n}.$$

Example: Exponential
If
$$X_1, \dots, X_n \sim_{\text{iid}} \text{Exponential}(\lambda),$$
then
$$f_\lambda(x_1, \dots, x_n) = \prod_{k=1}^n \lambda e^{-\lambda x_k} = \lambda^n e^{-\lambda(x_1 + \cdots + x_n)}.$$
So
$$L(\lambda \mid x) = \lambda^n e^{-\lambda y}, \quad \text{where } y = x_1 + \cdots + x_n,$$
and
$$\ell(\lambda \mid x) = n \log(\lambda) - \lambda y.$$

[Figure: Exponential likelihood with y = 2.]

The MLE
Recall: $\ell(\lambda \mid x) = n \log(\lambda) - \lambda y$. So
$$\ell'(\lambda \mid x) = \frac{n}{\lambda} - y,$$
and $\ell'(\lambda \mid x) = 0$ iff
$$\frac{n}{\lambda} = y, \quad \text{iff } n = \lambda y.$$
So
$$\hat\lambda = \frac{n}{y} = \frac{n}{x_1 + \cdots + x_n}.$$
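A minimal simulation check of the exponential MLE (the true rate 2.0 and the sample size are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
lam_true = 2.0   # assumed, for illustration
x = rng.exponential(scale=1 / lam_true, size=500)

# MLE from the derivation: lambda_hat = n / (x_1 + ... + x_n) = 1 / xbar.
lam_hat = len(x) / x.sum()
print(lam_hat)   # close to lam_true
```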


Other Examples of Maximum Likelihood
Exponential: If $X_1, \dots, X_n \sim_{\text{iid}} \text{Exponential}(\lambda)$, then $\hat\lambda = 1/\bar X$.
Poisson: If $X_1, \dots, X_n \sim_{\text{iid}} \text{Poisson}(\lambda)$, then $\hat\lambda = \bar X$.
Normal: If $X_1, \dots, X_n \sim_{\text{iid}} \text{Normal}(\mu, \sigma^2)$, then
$$\hat\mu = \bar X, \qquad \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2.$$
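A minimal sketch of the normal MLEs (the true parameters are assumptions for illustration); note that the MLE of $\sigma^2$ divides by n, not n − 1:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=10.0, scale=3.0, size=1000)   # assumed mu = 10, sigma = 3

mu_hat = x.mean()                        # MLE of mu
sigma2_hat = ((x - mu_hat) ** 2).mean()  # MLE of sigma^2 (divides by n)
print(mu_hat, sigma2_hat)                # near 10 and 9
```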

Example: Uniform Distributions
The Model: Suppose that $X_1, \dots, X_n \sim_{\text{iid}} \text{Unif}(0, \theta)$, where $\theta > 0$.
MOM: Then $\mu(\theta) = E(X_i) = \theta/2$, so that
$$\hat\theta_{\text{MOM}} = 2\bar X.$$
The MLE: The density of each $X_i$ is
$$\frac{1}{\theta}\,\mathbf{1}_{[0,\theta]}(x) = \begin{cases} 1/\theta & \text{if } 0 \le x \le \theta \\ 0 & \text{otherwise} \end{cases}$$
So the likelihood function is
$$L(\theta \mid x) = \prod_{i=1}^n \frac{1}{\theta}\,\mathbf{1}_{[0,\theta]}(x_i) = \begin{cases} \theta^{-n} & \text{if } \max[x_1, \dots, x_n] \le \theta \\ 0 & \text{otherwise} \end{cases}$$
So
$$\hat\theta_{\text{MLE}} = \max[X_1, \dots, X_n].$$
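A minimal sketch comparing the two uniform estimators on one simulated sample (the true $\theta = 5$ and the sample size are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
theta_true = 5.0   # assumed, for illustration
x = rng.uniform(0, theta_true, size=50)

theta_mom = 2 * x.mean()   # method of moments
theta_mle = x.max()        # maximum likelihood (always <= theta_true)
print(theta_mom, theta_mle)
```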

Interlude: The Distribution of the Maximum
If $X_1, \dots, X_n \sim_{\text{iid}} F$, then $Y := \max[X_1, \dots, X_n]$ has distribution function
$$F_Y(y) = P[Y \le y] = P[X_1 \le y, \dots, X_n \le y] = P[X_1 \le y] \times \cdots \times P[X_n \le y] = F(y)^n;$$
and if F has density $f = F'$, then Y has density
$$f_Y(y) = \frac{d}{dy} F_Y(y) = n F(y)^{n-1} f(y).$$
Example: Light globes have lifetimes $X_1, \dots, X_n$; the room goes dark at time Y.
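A minimal simulation check that $P[Y \le y] = F(y)^n$ for the maximum of uniforms (n, $\theta$, the test point, and the replication count are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
n, theta, reps = 10, 1.0, 100_000

# Y = max of n Unif(0, theta) draws, repeated many times.
y = rng.uniform(0, theta, size=(reps, n)).max(axis=1)

y0 = 0.8
print((y <= y0).mean(), (y0 / theta) ** n)   # both near 0.107
```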

Example: If $X_1, \dots, X_n \sim_{\text{iid}} \text{Unif}(0, \theta)$, then the density is
$$f(x) = \frac{1}{\theta}\,\mathbf{1}_{[0,\theta]}(x)$$
and the distribution function is
$$F(x) = \begin{cases} 0 & \text{if } x \le 0 \\ x/\theta & \text{if } 0 \le x \le \theta \\ 1 & \text{if } x > \theta \end{cases}$$
So
$$F_Y(y) = F(y)^n = \frac{y^n}{\theta^n}$$
and
$$f_Y(y) = \frac{n y^{n-1}}{\theta^n}$$
for $0 \le y \le \theta$.

Properties of Estimators
Notation: Let
$$T = T(X_1, \dots, X_n)$$
be an estimator of a parameter $\theta$; for example, the sample and population means.
Bias: The bias of T is

bT (θ) = E(T ) − θ; and T is said to be unbiased if bT (θ) ≡ 0. Example. The sample mean and variance are unbiased, since ¯ = µ, E(X) E(S 2 ) = σ 2 .

Example: Uniform Distributions
If
$$X_1, \dots, X_n \sim_{\text{iid}} \text{Unif}[0, \theta],$$
then
$$\hat\theta_{\text{MOM}} = 2\bar X$$
is unbiased, since $E(\bar X) = \mu = \theta/2$.
The MLE is
$$Y = \max[X_1, \dots, X_n].$$
Here Y has density
$$f_Y(y) = \frac{n y^{n-1}}{\theta^n}$$
for $0 \le y \le \theta$, and
$$E(Y) = \int_0^\theta y\,\frac{n y^{n-1}}{\theta^n}\,dy = \frac{n}{\theta^n}\int_0^\theta y^n\,dy = \frac{n}{\theta^n}\,\frac{y^{n+1}}{n+1}\Big|_{y=0}^\theta = \frac{n\theta}{n+1}.$$
So
$$b_Y(\theta) = E(Y) - \theta = -\frac{\theta}{n+1}.$$
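A minimal simulation check of the bias formula for the maximum (n = 10, $\theta = 1$, and the replication count are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
n, theta, reps = 10, 1.0, 200_000

y = rng.uniform(0, theta, size=(reps, n)).max(axis=1)
print(y.mean(), n * theta / (n + 1))        # E(Y) vs. n*theta/(n+1)
print(y.mean() - theta, -theta / (n + 1))   # bias vs. -theta/(n+1)
```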

Variance and MSE
Defs: Let T be an estimator of $\theta$. The variance of T is
$$\sigma_T^2(\theta) = \text{Var}(T),$$
and the mean squared error is
$$\text{MSE}_T(\theta) = E[(T - \theta)^2].$$
Important Relation:
$$\text{MSE}_T(\theta) = \sigma_T^2(\theta) + b_T(\theta)^2.$$
Comparison: $T_1$ is better than $T_2$ if $\text{MSE}_{T_1}(\theta) \le \text{MSE}_{T_2}(\theta)$ always, with strict < sometimes.

Bias Correction
In the uniform example, let
$$\tilde Y = \Big(\frac{n+1}{n}\Big) Y.$$
Then
$$E(\tilde Y) = \Big(\frac{n+1}{n}\Big) E(Y) = \Big(\frac{n+1}{n}\Big)\frac{n\theta}{n+1} = \theta.$$
So $\tilde Y$ is unbiased.
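A minimal simulation check of the relation MSE = variance + bias², using the uniform maximum Y (the parameters are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(8)
n, theta, reps = 10, 1.0, 200_000

y = rng.uniform(0, theta, size=(reps, n)).max(axis=1)

mse = ((y - theta) ** 2).mean()
var_plus_bias2 = y.var() + (y.mean() - theta) ** 2
print(mse, var_plus_bias2)   # agree up to Monte Carlo error
```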

Example: Uniform Distributions
Let
$$X_1, \dots, X_n \sim_{\text{iid}} \text{Unif}[0, \theta], \qquad Y = \max[X_1, \dots, X_n], \qquad \tilde Y = \Big(\frac{n+1}{n}\Big) Y,$$
and
$$T = \hat\theta_{\text{MOM}} = 2\bar X.$$
Then
$$E|T - \theta|^2 = \text{Var}(T) = 4\,\text{Var}(\bar X) = 4\,\frac{\theta^2}{12n} = \frac{\theta^2}{3n}.$$
So
$$\text{MSE}_T(\theta) = \frac{\theta^2}{3n}.$$

The Variance of $\tilde Y$: First,
$$E(Y^2) = \int_0^\theta y^2\,\frac{n y^{n-1}}{\theta^n}\,dy = \frac{n}{\theta^n}\int_0^\theta y^{n+1}\,dy = \frac{n}{\theta^n}\,\frac{y^{n+2}}{n+2}\Big|_{y=0}^\theta = \frac{n\theta^2}{n+2}.$$
Next,
$$\text{Var}(Y) = E(Y^2) - E(Y)^2 = \frac{n\theta^2}{n+2} - \Big(\frac{n\theta}{n+1}\Big)^2 = \cdots = \frac{n\theta^2}{(n+1)^2(n+2)}.$$
So
$$\text{Var}(\tilde Y) = \Big(\frac{n+1}{n}\Big)^2 \text{Var}(Y) = \Big(\frac{n+1}{n}\Big)^2 \frac{n\theta^2}{(n+1)^2(n+2)} = \frac{\theta^2}{n(n+2)},$$
and
$$\text{MSE}_{\tilde Y}(\theta) = \text{Var}(\tilde Y) = \frac{\theta^2}{n(n+2)}.$$

Finally,
$$\text{MSE}_Y(\theta) = \frac{n\theta^2}{(n+1)^2(n+2)} + \frac{\theta^2}{(n+1)^2} = \cdots = \frac{2\theta^2}{(n+1)(n+2)}.$$

Comparisons:
$$\frac{\text{MSE}_{\tilde Y}(\theta)}{\text{MSE}_Y(\theta)} = \frac{\theta^2/n(n+2)}{2\theta^2/(n+1)(n+2)} = \frac{n+1}{2n} \to \frac{1}{2}$$
as $n \to \infty$.
More Dramatically:
$$\frac{\text{MSE}_{\tilde Y}(\theta)}{\text{MSE}_T(\theta)} = \frac{\theta^2/n(n+2)}{\theta^2/3n} = \frac{3}{n+2},$$
and
$$\lim_{n \to \infty} \frac{\text{MSE}_{\tilde Y}(\theta)}{\text{MSE}_T(\theta)} = 0.$$
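A Monte Carlo comparison of the three estimators against the closed-form MSEs above (n = 10, $\theta = 1$, and the replication count are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(9)
n, theta, reps = 10, 1.0, 200_000

x = rng.uniform(0, theta, size=(reps, n))
T = 2 * x.mean(axis=1)           # MOM
Y = x.max(axis=1)                # MLE
Y_tilde = (n + 1) / n * Y        # bias-corrected MLE

for name, est, formula in [
    ("T", T, theta**2 / (3 * n)),
    ("Y", Y, 2 * theta**2 / ((n + 1) * (n + 2))),
    ("Y~", Y_tilde, theta**2 / (n * (n + 2))),
]:
    # Empirical MSE vs. the closed-form expression from the notes.
    print(name, ((est - theta) ** 2).mean(), formula)
```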