Chapter 4: CONTINUOUS RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS
Part 4: Gamma Distribution, Weibull Distribution, Lognormal Distribution
Sections 4-9 through 4-11
Another exponential distribution example first...
• Example: Magnitude of Earthquakes
The magnitude of earthquakes in a region can be modeled as having an exponential distribution where the mean of the distribution is 2.4, as measured on the Richter scale. Let X represent the magnitude of an earthquake. Then, X ∼ exponential(λ).
First, determine λ:
$$E(X) = \frac{1}{\lambda} = 2.4 \;\Rightarrow\; \lambda = \frac{1}{2.4} \approx 0.4167$$
In this problem, X is not a ‘wait time’ and is not related to the Poisson process. Here, λ is not a rate parameter, but is simply a parameter that tells you the shape of the distribution of earthquake magnitudes.
[Figure: f(x) versus x, for x from 0 to 10.]
Find the probability that an earthquake striking this region will...
a) exceed 3.0 on the Richter scale.
b) fall between 2.0 and 3.0 on the Richter scale.
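Both probabilities follow from the exponential survival function, P(X > x) = e^{−λx}. A minimal Python sketch of the calculation, using only the standard library; the function name `exp_survival` is illustrative and does not come from the slides:

```python
import math

MEAN_MAGNITUDE = 2.4            # E(X) given in the example
lam = 1.0 / MEAN_MAGNITUDE      # lambda = 1 / E(X), about 0.4167


def exp_survival(x, lam):
    """P(X > x) for an exponential random variable with parameter lam."""
    return math.exp(-lam * x)


# a) P(X > 3.0)
p_exceeds_3 = exp_survival(3.0, lam)

# b) P(2.0 < X < 3.0) = P(X > 2.0) - P(X > 3.0)
p_between_2_and_3 = exp_survival(2.0, lam) - exp_survival(3.0, lam)

print(f"a) P(X > 3)     = {p_exceeds_3:.4f}")
print(f"b) P(2 < X < 3) = {p_between_2_and_3:.4f}")
```

Under this model the answers come out near 0.287 for part a) and 0.148 for part b).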
Gamma Distribution
Section 4-9
Another continuous distribution on x > 0 is the gamma distribution.
• Gamma Distribution
The random variable X with probability density function
$$f(x) = \frac{\lambda^r x^{r-1} e^{-\lambda x}}{\Gamma(r)} \quad \text{for } x > 0$$
is a gamma random variable with parameters λ > 0 and r > 0.
• Mean and Variance
For a gamma random variable with parameters λ and r,
$$\mu = E(X) = \frac{r}{\lambda} \quad \text{and} \quad \sigma^2 = V(X) = \frac{r}{\lambda^2}$$
• The Gamma function: Γ(r)
The value in the denominator of f(x) is a constant dependent on r. This value is
$$\Gamma(r) = \int_0^\infty x^{r-1} e^{-x}\,dx \quad \text{for } r > 0$$
and this is a finite integral.
You can think of Γ(r) as a necessary constant in f(x) to make sure the area under f(x) is 1.0.
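The role of Γ(r) as a normalizing constant can be checked numerically. The sketch below is only an illustration: the parameter values r = 5 and λ = 1, the integration limit of 60, and the helper names `gamma_pdf` and `midpoint_integral` are all assumptions made for this example, and the midpoint rule stands in for exact integration.

```python
import math


def gamma_pdf(x, r, lam):
    """Gamma density f(x) = lam^r x^(r-1) e^(-lam x) / Gamma(r), for x > 0."""
    if x <= 0:
        return 0.0
    return lam**r * x ** (r - 1) * math.exp(-lam * x) / math.gamma(r)


def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


r, lam = 5.0, 1.0      # illustrative parameter choices
upper = 60.0           # far enough out that the remaining tail is negligible

area = midpoint_integral(lambda x: gamma_pdf(x, r, lam), 0.0, upper)
mean = midpoint_integral(lambda x: x * gamma_pdf(x, r, lam), 0.0, upper)
second_moment = midpoint_integral(lambda x: x * x * gamma_pdf(x, r, lam), 0.0, upper)
variance = second_moment - mean**2

print(f"area     = {area:.4f}  (should be close to 1)")
print(f"mean     = {mean:.4f}  (r / lam   = {r / lam})")
print(f"variance = {variance:.4f}  (r / lam^2 = {r / lam**2})")
```

With these choices the numerical area should be close to 1, and the numerical mean and variance should both be close to 5, matching r/λ and r/λ².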
The gamma family (expressed as choices of λ and r) is very flexible:
[Figure: Gamma distributions with fixed scale parameter (lambda = 1); f(x) versus x for r = 0.2, r = 1, and r = 5.]
λ is called the scale parameter as it most influences the spread.
r is called the shape parameter as it most influences the peaked-ness of the distribution.
• Example: Gene expression data
As technology progresses, so does the kind of data we can collect. We can now gather information on the amount of protein (or mRNA) produced by a certain gene in an organism. Below is a cDNA microarray slide showing the amount of mRNA (as intensity of fluorescence) for each of thousands of genes for a single organism.
[Figure: cDNA microarray slide.]
If we consider the expression values from many individuals for a single gene:
[Figure: Gene 4 expression values; density versus expression.]
We can model these expression values all coming from a specific gene with a gamma distribution...
[Figure: Gene 4 expression values; density versus expression, with a fitted gamma curve overlaid.]
The solid curve is the ‘best fitting’ gamma distribution to the observed data.
Modeling the observed data with a common distribution allows us to compute theoretical probabilities, and compare different groups (such as healthy patients vs. cancer patients).
• Gamma distribution modeling examples:
– Gene expression data
– Climatology models for monthly precipitation
– The sum of k independent exponential random variables (with a common parameter λ)
The integration for gamma probabilities would come from tables (like we saw for the normal distribution)... your book does not include these.
Instead... book homework problems are about recognizing the gamma probability density function, setting up f(x), and recognizing the mean µ and variance σ² (which can be computed from λ and r), and seeing the connection of the gamma to the exponential and the Poisson process.
• Example: The time between failures of a laser machine is exponentially distributed with a mean of 25,000 hours.
a) What is the expected time until the second failure?
b) What is the probability that the time until the third failure exceeds 50,000 hours?
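One possible way to work both parts, sketched in Python with the standard library only. It assumes failures follow a Poisson process with rate λ = 1/25,000 per hour, so the time until the rth failure is gamma(r, λ) with integer r, and X > x exactly when at most r − 1 failures occur in (0, x]. The function name `erlang_survival` is illustrative.

```python
import math

MEAN_TIME_BETWEEN_FAILURES = 25_000.0        # hours, from the example
lam = 1.0 / MEAN_TIME_BETWEEN_FAILURES       # failures per hour


def erlang_survival(x, r, lam):
    """P(X > x) for X ~ gamma(r, lam) with a positive integer r.

    X > x exactly when at most r - 1 failures occur in (0, x], and that
    count is Poisson with mean lam * x.
    """
    m = lam * x
    return sum(math.exp(-m) * m**k / math.factorial(k) for k in range(r))


# a) Expected time until the second failure: E(X) = r / lam with r = 2.
expected_second_failure = 2 / lam

# b) P(time until the third failure > 50,000 hours), so r = 3.
p_third_after_50000 = erlang_survival(50_000.0, 3, lam)

print(f"a) E(X) = {expected_second_failure:.0f} hours")
print(f"b) P(X > 50000) = {p_third_after_50000:.4f}")
```

Part a) gives 50,000 hours. Part b) reduces to 5e^{−2}, roughly 0.677, because λx = 2 and the Poisson terms for k = 0, 1, 2 sum to e^{−2}(1 + 2 + 2).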
• Thus, we have another gamma distribution modeling example:
– Time until the rth failure in a Poisson process with rate parameter λ is distributed gamma(r, λ).
• Some comments on the gamma(r, λ) distribution:
– When r = 1, f(x) is an exponential distribution with parameter λ. The exponential distribution is a special case of the gamma distribution.
– If r is a positive integer, the distribution is called an Erlang distribution.
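– Some relationships:
$$\Gamma(1) = 0! = 1$$
$$\Gamma(r+1) = r\,\Gamma(r)$$
$$\Gamma(r+1) = r! \text{ for a nonnegative integer } r \quad \{\Gamma(4) = 3\cdot 2\cdot 1 = 6\}$$
$$\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$$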
– For the gamma(r, λ) distribution, when λ = 1/2 and r = p/2 where p is a positive integer, then we have a chi-squared distribution with parameter p, another special case of the gamma.
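These relationships and special cases can be spot-checked numerically. The sketch below uses Python's `math.gamma`; the test points (r = 2.7, x = 1.8, p = 3, λ = 0.4) and the helper names are arbitrary choices for illustration, and the chi-squared density is written in its usual textbook form, which is not shown on the slides.

```python
import math


def gamma_pdf(x, r, lam):
    """Gamma density f(x) = lam^r x^(r-1) e^(-lam x) / Gamma(r), for x > 0."""
    return lam**r * x ** (r - 1) * math.exp(-lam * x) / math.gamma(r)


def chi_squared_pdf(x, p):
    """Chi-squared density with parameter p, in its usual textbook form."""
    return x ** (p / 2 - 1) * math.exp(-x / 2) / (2 ** (p / 2) * math.gamma(p / 2))


# Gamma(1) = 0! = 1
gamma_one = math.gamma(1)

# Gamma(r + 1) = r * Gamma(r), checked at an arbitrary non-integer r
r = 2.7
recursion_gap = abs(math.gamma(r + 1) - r * math.gamma(r))

# Gamma(r + 1) = r! for a nonnegative integer r, e.g. Gamma(4) = 3! = 6
gamma_four = math.gamma(4)

# Gamma(1/2) = sqrt(pi)
gamma_half = math.gamma(0.5)

# r = 1 recovers the exponential density lam * exp(-lam * x)
x, lam = 1.8, 0.4
exponential_gap = abs(gamma_pdf(x, 1, lam) - lam * math.exp(-lam * x))

# lam = 1/2 and r = p/2 recovers the chi-squared density with parameter p
p = 3
chi_squared_gap = abs(gamma_pdf(x, p / 2, 0.5) - chi_squared_pdf(x, p))

print(gamma_one, gamma_four, gamma_half, math.sqrt(math.pi))
print(recursion_gap, exponential_gap, chi_squared_gap)
```

Each "gap" value should be zero up to floating-point rounding.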
Weibull Distribution
Section 4-10
Another continuous distribution for x > 0. It can be used to model a situation where the number of failures increases with time, decreases with time, or remains constant with time. (So, it’s used for more complicated situations than a Poisson process.)
• Weibull Distribution
The random variable X with probability density function
$$f(x) = \frac{\beta}{\delta}\left(\frac{x}{\delta}\right)^{\beta-1} \exp\left[-\left(\frac{x}{\delta}\right)^{\beta}\right] \quad \text{for } x > 0$$
is a Weibull random variable with scale parameter δ > 0 and shape parameter β > 0.
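• Mean and Variance
For a Weibull random variable with parameters β and δ,
$$\mu = E(X) = \delta\,\Gamma\left(1 + \frac{1}{\beta}\right)$$
and
$$\sigma^2 = V(X) = \delta^2\,\Gamma\left(1 + \frac{2}{\beta}\right) - \delta^2\left[\Gamma\left(1 + \frac{1}{\beta}\right)\right]^2$$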
The Weibull family is also flexible: different choices of δ and β give densities with quite different shapes.
• Cumulative Distribution Function
If X has a Weibull distribution with parameters δ and β, then the cumulative distribution function of X is
$$F(x) = 1 - e^{-(x/\delta)^{\beta}}$$
• A comment on the Weibull(δ, β) distribution:
– When β = 1, f(x) is an exponential distribution with parameter 1/δ. The exponential is a special case of the Weibull.
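The Weibull formulas and the β = 1 special case can be sketched in Python as follows. This is an illustration under assumed values (δ = 4 and x = 2.5 are arbitrary), and the function names are hypothetical helpers, not part of the course material.

```python
import math


def weibull_pdf(x, delta, beta):
    """Weibull density with scale delta and shape beta, for x > 0."""
    if x <= 0:
        return 0.0
    return (beta / delta) * (x / delta) ** (beta - 1) * math.exp(-((x / delta) ** beta))


def weibull_cdf(x, delta, beta):
    """F(x) = 1 - exp(-(x / delta)^beta) for x > 0, and 0 otherwise."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / delta) ** beta))


def weibull_mean(delta, beta):
    """E(X) = delta * Gamma(1 + 1/beta)."""
    return delta * math.gamma(1 + 1 / beta)


def weibull_variance(delta, beta):
    """V(X) = delta^2 Gamma(1 + 2/beta) - delta^2 [Gamma(1 + 1/beta)]^2."""
    return delta**2 * math.gamma(1 + 2 / beta) - delta**2 * math.gamma(1 + 1 / beta) ** 2


# With beta = 1 the Weibull should match an exponential with parameter 1/delta.
delta = 4.0            # arbitrary illustrative scale
lam = 1.0 / delta
x = 2.5                # arbitrary evaluation point

pdf_gap = abs(weibull_pdf(x, delta, 1.0) - lam * math.exp(-lam * x))
cdf_gap = abs(weibull_cdf(x, delta, 1.0) - (1.0 - math.exp(-lam * x)))
mean_beta_1 = weibull_mean(delta, 1.0)          # exponential mean 1/lam
variance_beta_1 = weibull_variance(delta, 1.0)  # exponential variance 1/lam^2

print(pdf_gap, cdf_gap, mean_beta_1, variance_beta_1)
```

For β = 1 the mean and variance reduce to δ and δ², which agree with the exponential values 1/λ and 1/λ² when λ = 1/δ.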
• Example: Manufacture of semiconductor
In an industrial engineering article, the authors suggest using a Weibull distribution to model the duration of a bake step in the manufacture of a semiconductor. Let T represent the duration in hours of the bake step for a randomly chosen lot. Suppose T ∼ Weibull(δ = 10, β = 0.3).
a) What is the probability that the bake step takes longer than four hours?
b) What is the probability that the bake step takes between two and seven hours?
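Both parts can be answered directly from the Weibull cumulative distribution function, F(t) = 1 − exp[−(t/δ)^β]. A short Python sketch of that calculation, using only the standard library; the function name `weibull_cdf` is an illustrative helper:

```python
import math

DELTA = 10.0   # scale parameter from the example
BETA = 0.3     # shape parameter from the example


def weibull_cdf(t, delta, beta):
    """F(t) = 1 - exp(-(t / delta)^beta) for t > 0."""
    return 1.0 - math.exp(-((t / delta) ** beta))


# a) P(T > 4) = 1 - F(4)
p_longer_than_4 = 1.0 - weibull_cdf(4.0, DELTA, BETA)

# b) P(2 < T < 7) = F(7) - F(2)
p_between_2_and_7 = weibull_cdf(7.0, DELTA, BETA) - weibull_cdf(2.0, DELTA, BETA)

print(f"a) P(T > 4)     = {p_longer_than_4:.4f}")
print(f"b) P(2 < T < 7) = {p_between_2_and_7:.4f}")
```

With δ = 10 and β = 0.3 the results are about 0.468 for part a) and about 0.132 for part b).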
Lognormal Distribution
Section 4-11
The last continuous distribution we will consider is also for x > 0.
Let W be a normally distributed random variable. Suppose we create a new random variable X with the transformation X = exp(W). Then, X is a lognormal random variable.
The name follows from the fact that ln(X) = W, so we have ln(X) being normally distributed.
W can take on values from −∞ to ∞, but the set of possible values (the range) of X is the positive real numbers.
• Lognormal Distribution
Let W have a normal distribution with mean θ and variance ω². Then X = exp(W) is a lognormal random variable with probability density function
$$f(x) = \frac{1}{x\,\omega\sqrt{2\pi}} \exp\left[-\frac{(\ln x - \theta)^2}{2\omega^2}\right] \quad \text{for } x > 0.$$
• Mean and Variance
For a lognormal distribution with parameters θ and ω²,
$$\mu = E(X) = e^{\theta + \omega^2/2}$$
and
$$\sigma^2 = V(X) = e^{2\theta + \omega^2}\left(e^{\omega^2} - 1\right)$$
Note that for the lognormal r.v. X, the mean and variance are µ and σ², and these are functions of θ and ω², which are the mean and variance of W, a normal random variable such that X = exp(W). So, θ and ω² show up in W ∼ N(θ, ω²).
• Example: Component lifetimes
Lifetimes (in days) of a randomly chosen component are lognormally distributed with parameters θ = 1 and ω = 0.5.
a) Find the mean lifetime of these components.
b) Find the standard deviation of the lifetimes.
c) Find the probability that a component lasts longer than 4 days.
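All three parts can be worked from the lognormal mean and variance formulas together with the fact that ln(X) is normal with mean θ and standard deviation ω. A Python sketch under those assumptions; `standard_normal_cdf` is a hypothetical helper that computes Φ(z) from the standard library's error function rather than from a normal table:

```python
import math

THETA = 1.0    # mean of W = ln(X)
OMEGA = 0.5    # standard deviation of W = ln(X)


def standard_normal_cdf(z):
    """Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# a) Mean lifetime: E(X) = exp(theta + omega^2 / 2)
mean_lifetime = math.exp(THETA + OMEGA**2 / 2)

# b) Standard deviation: square root of
#    V(X) = exp(2 theta + omega^2) * (exp(omega^2) - 1)
variance = math.exp(2 * THETA + OMEGA**2) * (math.exp(OMEGA**2) - 1)
sd_lifetime = math.sqrt(variance)

# c) P(X > 4) = P(W > ln 4) = 1 - Phi((ln 4 - theta) / omega)
z = (math.log(4.0) - THETA) / OMEGA
p_longer_than_4 = 1.0 - standard_normal_cdf(z)

print(f"a) mean = {mean_lifetime:.4f} days")
print(f"b) sd   = {sd_lifetime:.4f} days")
print(f"c) P(X > 4) = {p_longer_than_4:.4f}  (z = {z:.4f})")
```

The mean is about 3.08 days, the standard deviation about 1.64 days, and the probability in part c) about 0.22, with z close to 0.77.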