Topic 4 - Continuous distributions
• Basics of continuous distributions
• Uniform distribution
• Normal distribution
• Gamma distribution
Continuous Random Variables
• A continuous random variable can take on values from
an entire interval of the real line.
• The probability density function (pdf) of a continuous
random variable, X, is a function f(x) such that for a < b
$P(a \le X \le b) = \int_a^b f(x)\,dx$
• The cumulative distribution function (cdf) of X is defined as
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
Some relationships
• What is the relationship between the pdf (f) and the cdf
(F)?
$f(x) = F'(x)$
– You integrate the pdf to get the cdf
– You take the derivative of the cdf to get the pdf
• P(a ≤ X ≤ b) = F(b) – F(a)
• P(X = a) = P(a ≤ X ≤ a) = F(a) – F(a) = 0
Pipeline example
• A pipeline is 100 miles long and every location along the
pipeline is equally likely to break
• Let X be the distance measured in miles from the pipeline
origin where a break occurs
• What is the cdf for X?
$P(X \le x) = \frac{x}{100}$, for $0 \le x \le 100$
• What is the pdf for X?
$f(x) = F'(x) = \frac{1}{100}$, for $0 \le x \le 100$, 0 otherwise
• What is P(30 ≤ X ≤ 50)?
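As a quick check on the pipeline questions, the cdf and pdf can be written directly in Python. This is a minimal sketch; the function names `pipeline_cdf` and `pipeline_pdf` are illustrative and not part of the course materials.

```python
def pipeline_cdf(x: float) -> float:
    """cdf of the break location X (miles) on a 100-mile pipeline."""
    if x < 0:
        return 0.0
    if x > 100:
        return 1.0
    return x / 100


def pipeline_pdf(x: float) -> float:
    """pdf of X: constant 1/100 on [0, 100], 0 otherwise."""
    return 1 / 100 if 0 <= x <= 100 else 0.0


# P(30 <= X <= 50) = F(50) - F(30)
p_30_to_50 = pipeline_cdf(50) - pipeline_cdf(30)
print(p_30_to_50)
```

The result is about 0.2: the 20-mile stretch is one fifth of the pipeline.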
Requirements of a pdf
• A pdf must satisfy the following two requirements:
$f(x) \ge 0$ for all x
$\int_{-\infty}^{\infty} f(x)\,dx = 1$
• Does the pipeline pdf satisfy these requirements?
$f(x) = \frac{1}{100}$ for $0 \le x \le 100$, and $f(x) = 0$ otherwise
$\int_0^{100} \frac{1}{100}\,dx = \left.\frac{x}{100}\right|_0^{100} = \frac{100}{100} = 1$
Mean and variance of a continuous random variable
$\mu_X = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$, the mean of X or expected value of X
$\sigma_X^2 = E[(X - \mu_X)^2]$, the variance of X
$\sigma_X^2 = E(X^2) - \mu_X^2$, where $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$
$E(h(X)) = \int_{-\infty}^{\infty} h(x)\,f(x)\,dx$, the expected value of h(X)
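The expected-value integrals can be approximated numerically. The sketch below applies a midpoint rule to the pipeline pdf from the earlier example; the helper name `expect` and the choice of 100,000 subintervals are assumptions made for illustration.

```python
def expect(h, f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of h(x) * f(x) over [lo, hi]."""
    width = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * width
        total += h(x) * f(x)
    return total * width


def pipeline_pdf(x):
    """pdf of the pipeline break location: 1/100 on [0, 100], 0 otherwise."""
    return 1 / 100 if 0 <= x <= 100 else 0.0


mean = expect(lambda x: x, pipeline_pdf, 0, 100)               # E(X)
second_moment = expect(lambda x: x * x, pipeline_pdf, 0, 100)  # E(X^2)
variance = second_moment - mean ** 2                           # E(X^2) - mu^2
print(mean, variance)
```

For the pipeline this gives a mean of about 50 miles and a variance of about 833.3 square miles.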
Comparisons: discrete to continuous
Uniform distribution
• A uniform distribution on the interval from A to B, U(A,B),
is defined by a pdf of the form
$f(x) = \frac{1}{B - A}$ for $A \le x \le B$
• Does f(x) meet requirements?
$f(x) \ge 0$ for all x, and
$\int_A^B \frac{1}{B - A}\,dx = \frac{B - A}{B - A} = 1$
• What is the cdf for the Uniform distribution?
$F(x) = 0$ for $x < A$
$F(x) = \int_A^x \frac{1}{B - A}\,dt = \frac{x - A}{B - A}$ for $A \le x \le B$
$F(x) = 1$ for $x > B$
Uniform, etc.
• What is the mean of the Uniform distribution?
$\mu_X = \int_A^B x \cdot \frac{1}{B - A}\,dx = \frac{B + A}{2}$
• What is the variance of the Uniform distribution?
– Using the same methodology as outlined above….
$E(X^2) = \frac{B^2 + AB + A^2}{3}$
$\sigma_X^2 = \frac{(B - A)^2}{12}$
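For reference, the variance of U(A,B) can be worked out step by step from the definitions of E(X²) and σ². This is the standard derivation, filled in here because the slide only states the results.

```latex
\begin{aligned}
E(X^2) &= \int_A^B \frac{x^2}{B-A}\,dx
        = \frac{B^3 - A^3}{3(B-A)}
        = \frac{B^2 + AB + A^2}{3} \\
\sigma_X^2 &= E(X^2) - \mu_X^2
        = \frac{B^2 + AB + A^2}{3} - \frac{(A+B)^2}{4} \\
       &= \frac{4B^2 + 4AB + 4A^2 - 3A^2 - 6AB - 3B^2}{12}
        = \frac{(B-A)^2}{12}
\end{aligned}
```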
Gamma distribution
• The gamma distribution, G(α, β), is defined by the
following pdf
$f(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\,x^{\alpha - 1} e^{-x/\beta}$, $x \ge 0$, $\alpha > 0$, $\beta > 0$
where
$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx$ for $\alpha > 0$.
• This is more for background purposes. We will not be
doing Gamma calculations by hand.
• Properties of the gamma function, Γ(α)
– For α > 1, Γ(α) = (α − 1) Γ(α − 1)
– If α is a positive integer, Γ(α) = (α − 1)!
– Γ(1/2) = √π
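The three gamma function properties can be spot-checked with Python's standard `math.gamma`. The test values 4.5 and 5 are arbitrary choices for illustration.

```python
import math

# Recurrence: Gamma(a) = (a - 1) * Gamma(a - 1) for a > 1
a = 4.5
print(math.gamma(a), (a - 1) * math.gamma(a - 1))

# Positive integer: Gamma(n) = (n - 1)!
n = 5
print(math.gamma(n), math.factorial(n - 1))

# Gamma(1/2) = sqrt(pi)
print(math.gamma(0.5), math.sqrt(math.pi))
```

Each printed pair should agree up to floating-point rounding.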
Properties of the Gamma distribution
• The Gamma pdf is valid: f(x) is at least 0 everywhere
and its integral across all values of X is 1.
• An example without calculus is not possible, but the mean and variance are
$\mu = \alpha\beta$ and $\sigma^2 = \alpha\beta^2$
• Brief “proofs” of these results are in the Moment Generating
Functions section of the Topic 3 files.
More on the gamma distribution
• The gamma distribution is used as a probability model for the
time or space before the αth event in a Poisson process where
events occur at rate λ, with β = 1/λ. The number of events is fixed
and the interval is varied, the opposite of the Poisson.
– α is called the shape parameter. It is normally listed as a
specific number of events in the problem.
– β is called the scale parameter, β = 1/λ, where λ is the rate
defined by the Poisson.
• The exponential distribution is a special case of the gamma
with α = 1. This distribution is used to model the time
“between” events that are Poisson distributed (the “next”
occurrence).
Uses of the Gamma Distribution
• Some examples of use for the Gamma include:
– Queuing models, the flow of items through manufacturing and
distribution processes and the load on web servers and the varied
forms of telecommunications systems.
• These are based on a series of exponentially distributed values,
which is a simplified Gamma distribution (mean time between
arrivals).
– Due to its moderately skewed profile, it can be used as a model in
a range of disciplines.
• Climatology, where it is a workable model for rainfall
• Financial services where it has been used for modelling
insurance claims and the size of loan defaults and as such has
been used in probability of ruin and value at risk calculations.
Back to the clunker car
• Recall that my car breaks down once a week on average. If
the breakdowns occur as events in a Poisson process, then
what is the probability that less than a week passes before my
first breakdown? Gamma or Poisson?
Which calculator?
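One way to set up the clunker car question: the wait for the first breakdown is a gamma with α = 1 (an exponential), with λ = 1 per week and β = 1/λ = 1 week. The sketch below uses the exponential cdf; the variable names are illustrative.

```python
import math

rate_per_week = 1.0          # lambda: one breakdown per week on average
scale = 1 / rate_per_week    # beta = 1 / lambda
t = 1.0                      # one week

# Gamma with alpha = 1 is the exponential, whose cdf is 1 - exp(-t / beta)
p_first_breakdown_within_week = 1 - math.exp(-t / scale)
print(round(p_first_breakdown_within_week, 4))  # 0.6321
```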
Pipe example
• Defects along a piece of pipe occur as events in a Poisson
process with an average of 2 defects every 10 feet. What is the
probability that the third defect will occur at least 20 feet from
the beginning of the pipe?
– What are α, β and λ?
– How do you define the distribution?
– How do you write the probability statement in this case?
Which calculator?
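A possible setup for the pipe example, assuming distance is measured in feet: λ = 2/10 = 0.2 defects per foot, α = 3 and β = 1/λ = 5 feet, so the question is P(X ≥ 20) for X ~ G(3, 5). The sketch uses the link back to the Poisson rather than a gamma calculator: the third defect is at least 20 feet out exactly when the first 20 feet hold at most 2 defects.

```python
import math

rate_per_foot = 2 / 10       # lambda: 2 defects every 10 feet
alpha = 3                    # waiting for the 3rd defect
beta = 1 / rate_per_foot     # scale parameter, 5 feet
distance = 20.0

# The 3rd defect lies at least 20 ft out exactly when the first 20 ft
# contain at most alpha - 1 = 2 defects, a Poisson count with mean distance / beta.
mean_count = distance / beta
p_at_least_20ft = sum(
    math.exp(-mean_count) * mean_count ** k / math.factorial(k)
    for k in range(alpha)
)
print(round(p_at_least_20ft, 4))  # 0.2381
```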
Check out counter example
• Customers arrive at a supermarket checkout counter
as a Poisson process at a rate of one every two minutes.
What is the probability that at most 10
minutes pass before a 3rd customer arrives in the line?
– What’s lambda?
• Occurrences over time: 1 per 2 minutes or 5 per 10 minutes,
so lambda is 5/10 = 0.5 per minute
– What’s alpha?
• Shape parameter: the number of the event being waited for,
in this case 3 for the 3rd customer.
– What’s beta?
• Scale parameter: always 1 divided by lambda, so in this case,
it’d be 2.
Which calculator?
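With α = 3 and β = 2 minutes, the checkout question is P(X ≤ 10) for X ~ G(3, 2). The sketch below is one way to evaluate it without a calculator: a closed-form cdf that holds when α is a positive integer, cross-checked by integrating the gamma pdf numerically. The function names and the midpoint rule are illustrative choices, not the course's method.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma pdf with shape alpha and scale beta, for x >= 0."""
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))


def erlang_cdf(t, alpha, beta):
    """Gamma cdf for a positive integer shape alpha (closed form)."""
    m = t / beta
    return 1 - sum(math.exp(-m) * m ** k / math.factorial(k) for k in range(alpha))


alpha, beta, t = 3, 2.0, 10.0   # 3rd customer, beta = 1 / 0.5, 10 minutes

# Cross-check the closed form by integrating the pdf with the midpoint rule.
n = 100_000
width = t / n
numeric = sum(gamma_pdf((i + 0.5) * width, alpha, beta) for i in range(n)) * width

p_closed = erlang_cdf(t, alpha, beta)
print(round(p_closed, 4), round(numeric, 4))  # 0.8753 0.8753
```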
Normal distribution
• The normal distribution, N(μ, σ²), has a pdf given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x - \mu)^2}{2\sigma^2}}$, $-\infty < x < \infty$
• The normal distribution is always bell shaped.
• The normal distribution is defined in terms of its mean
and standard deviation, since those parameters are in the
same units and are directly comparable.
• Again, we will not be doing hand calculations of
probabilities using this function, but you could
approximate them by numerically integrating the pdf
between the stated limits.
Empirical rule
• This is a rough approximation or “back of the envelope” guide
to the areas under the Normal.
• What is the “approximate” or “rough” probability …..
– a normal falls within one standard deviation of the mean?
– a normal falls within two standard deviations of the mean?
– a normal falls within three standard deviations of the mean?
Empirical rule (cont)
• What is the “approximate” or “rough” probability for a normal
distribution that a randomly selected value
– Falls within +/- 1 standard deviation of the mean?
– Falls within +/- 2 standard deviations of the mean?
– Falls within +/- 3 standard deviations of the mean?
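The Empirical Rule figures (roughly 68%, 95% and 99.7%) can be compared with the exact normal areas. This sketch uses `statistics.NormalDist` from the Python standard library in place of the StatCrunch calculator.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, mean 0 and standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}
for k, p in within.items():
    print(f"within {k} standard deviation(s) of the mean: {p:.4f}")
```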
Side effects of anti-depressant meds
• The weight gain associated with an antidepressant is
normally distributed with a mean of 6 lbs and a standard
deviation of 3 lbs. Using the Empirical Rule:
– What is the approximate probability of weight loss?
– What is the approximate probability of a weight gain between 0
and 12 pounds?
– What is the approximate probability of a weight gain between 9
and 12 pounds?
Weight gain example – more exact
• The weight gain associated with an antidepressant is normally
distributed with a mean of 6 lbs and a standard deviation of 3
lbs.
• What is the probability of weight gain?
• What is the probability of gaining between 0 and 12 lbs?
Normal calculator
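A sketch of the more exact weight gain calculation, using `statistics.NormalDist` as a stand-in for the Normal calculator. The 9 to 12 lb interval from the Empirical Rule version of the example is included for comparison.

```python
from statistics import NormalDist

weight_gain = NormalDist(mu=6, sigma=3)  # lbs

p_gain = 1 - weight_gain.cdf(0)                       # P(X > 0)
p_0_to_12 = weight_gain.cdf(12) - weight_gain.cdf(0)  # P(0 <= X <= 12)
p_9_to_12 = weight_gain.cdf(12) - weight_gain.cdf(9)  # P(9 <= X <= 12)
print(round(p_gain, 4), round(p_0_to_12, 4), round(p_9_to_12, 4))
```

The exact values land close to the Empirical Rule approximations of about 0.975, 0.95 and 0.135.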
Cement Production Example
• ImanAggie Redimix has a contract with TxDot to supply
concrete for one of the overpasses on Hwy 6 just south of
town.
• Assume that the strength of the company’s production is normally
distributed with a mean of 3,200 psi and a standard
deviation of 250 psi. What’s the probability that a specific
batch will have a strength below 2,700 psi?
• Why is that important?
• Demonstrate on the next slide.
Cement Production Example (cont)
• The distribution of production values looks like this..
Normal calculator
Standard normal distribution
• If X has a N(μ, σ²) distribution, then Z = (X − μ)/σ has a
standard normal distribution, N(0,1).
• The standard normal is an important reference distribution.
• P(X ≤ x) = P(Z ≤ (x − μ)/σ) = Φ((x − μ)/σ)
• The cdf of a standard normal, Φ(z), is tabled in many
textbooks.
• Standardized values, (x − μ)/σ, indicate how far in standard
deviations the value x is from μ
• For any normal distribution, probabilities can be phrased in
terms of standardized values
Examples of converting x to Z
and from Z to x
• It’s important to be able to make this conversion. The
cement foreman doesn’t know a Z from a hole in the
ground, but he knows his production (which is all x
values)
• Given X ~ N(3200, 250²), what’s P(X < 2700)?
$Z = (x - \mu)/\sigma$
$Z = (2700 - 3200)/\sqrt{62500} = (2700 - 3200)/250 = -2$
$\Phi(-2) = .0228$
• Remember that Phi of -2, $\Phi(-2) = .0228$, is the area to the
left of the point 2 standard deviations below the mean on the
Normal.
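The x-to-Z conversion for the cement example can be checked in a few lines. This sketch uses `statistics.NormalDist` in place of a Z table, and also shows that working directly in psi gives the same probability.

```python
from statistics import NormalDist

mu, sigma = 3200, 250   # psi
x = 2700

z = (x - mu) / sigma                        # standardized value
p_from_z = NormalDist().cdf(z)              # Phi(z), standard normal cdf
p_direct = NormalDist(mu, sigma).cdf(x)     # same probability without standardizing
print(z, round(p_from_z, 4), round(p_direct, 4))  # -2.0 0.0228 0.0228
```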
….and conversely
• Suppose that ImanAggie wants to make the marketing
claim that less than 1% of their product falls below a certain
psi level. How do you determine that level of psi?
$\Phi(Z) = .01$, $Z \approx -2.326$
$Z = (x - \mu)/\sigma$
$-2.326 = (x - 3200)/250$
solve for x: $x = 2618.5$
• What’s the value of psi, such that 5% of the overall
production is below that level?
$\Phi(Z) = .05$, $Z \approx -1.645$
$Z = (x - \mu)/\sigma$
$-1.645 = (x - 3200)/250$
solve for x: $x = 2788.75$
• This type of calculation shows up a lot in Statistics.
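Going from a probability back to a psi level is an inverse-cdf lookup. The sketch below uses `NormalDist.inv_cdf` from the Python standard library; the dictionary name `cutoffs` is illustrative.

```python
from statistics import NormalDist

strength = NormalDist(mu=3200, sigma=250)  # psi

cutoffs = {}
for tail in (0.01, 0.05):
    z = NormalDist().inv_cdf(tail)   # standardized cutoff
    x = strength.inv_cdf(tail)       # cutoff in psi, equal to mu + z * sigma
    cutoffs[tail] = (z, x)
    print(f"{tail:.0%} below: z = {z:.3f}, x = {x:.1f} psi")
```

The psi values differ slightly from 2618.5 and 2788.75 because the hand calculation rounds Z to three decimals.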
Comparison of X and Z values
Is my data normal?
• In StatCrunch, a quantile-quantile plot (QQ plot) plots ordered
data values versus quantiles of a standard normal distribution.
• If the data are from a normal distribution, the points should lie
approximately on a straight line.
• There are BIG problems associated with
assuming you have a normal
when the data’s not really normal.
QQ Plot Example
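The points behind a QQ plot can be built by hand. The sketch below is not StatCrunch's implementation: the plotting positions (i + 0.5)/n are one common convention and an assumption here, the sample is simulated purely for illustration, and the correlation is only a rough numeric stand-in for judging whether the points lie on a straight line.

```python
import random
from statistics import NormalDist

random.seed(0)
# Hypothetical sample for illustration only: 200 draws from N(50, 10^2).
data = sorted(random.gauss(50, 10) for _ in range(200))

n = len(data)
std_normal = NormalDist()
# Standard normal quantiles at plotting positions (i + 0.5) / n.
theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


qq_points = list(zip(theoretical, data))  # the pairs a QQ plot would display
r = pearson_r(theoretical, data)
print(round(r, 3))
```

A correlation near 1 is consistent with normal data; clearly non-normal data tend to bend away from the line and pull the correlation down.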
Other distributions
• The Weibull distribution and the log normal
distribution are used to model time to failure, so they
are used a lot in reliability studies.
• The beta distribution is used to model proportions.
• There are many other distributions out there. Choose the
one that serves as the best probability model for your
setting.
• Waaay out of scope for STAT 211…
The additional file for Topic 4 has worked out examples (some with
discussion) for a variety of continuous distribution applications.