Theoretical distributions:
the other distributions
The Aim
By the end of this lecture, the
students will be aware of other
theoretical distributions
The Goals
• List the important properties of the
t-, Chi-squared, F- and Lognormal distributions
• Explain when each of these distributions is
particularly useful
• List the important properties of the Binomial
and Poisson distributions
• Explain when the Binomial and Poisson
distributions are each particularly useful
Other theoretical distributions
A-Continuous probability distributions
-t-distribution
-Chi-squared (χ²) distribution
-F-distribution
-Lognormal distribution
B-Discrete probability distributions
-Binomial distribution
-Poisson distribution
More continuous probability
distributions
•These distributions are based on continuous
random variables.
•Often it is not a measurable variable that
follows such a distribution but a statistic derived
from the variable.
•The total area under the probability density
function represents the probability of all possible
outcomes, and is equal to one.
The t-distribution
-Derived by W.S. Gosset, who published under
the pseudonym 'Student'; it is often called Student's
t-distribution.
-The parameter that characterizes the t-distribution
is the degrees of freedom (df = n - 1), so we can draw
the probability density function if we know
*the equation of the t-distribution and
*its degrees of freedom.
Note that the degrees of freedom are often closely related to the sample
size.
The t-distribution
-Its shape is similar to that of the Standard
Normal distribution, but it is more spread out,
with longer tails. Its shape approaches Normality
as the degrees of freedom increase.
-It is particularly useful for calculating
confidence intervals for and testing hypotheses
about one or two means.
Sec. 10.1
t Distribution for Inferences about a Mean
• The following diagram is a comparison between the standard
normal distribution and two different t distributions of sample
size n = 3 and n = 12
– As you can see, they are very similar in shape, and as the sample size
increases, the t distribution becomes more and more normal
• The t statistic is
$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$
where s is the estimated (sample) standard deviation and n is the sample size.
The test statistic has a t distribution (assuming the underlying population
really is normally distributed).
The distribution has n - 1 degrees of freedom.
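The t statistic above can be computed directly from a sample. The following is a minimal sketch using only the Python standard library; the function name `t_statistic`, the hypothesised mean `mu0` and the data values are illustrative assumptions, not part of the lecture.

```python
import math
import statistics


def t_statistic(sample, mu0):
    """Return (t, df) for a one-sample t statistic against the mean mu0."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)           # sample SD (divisor n - 1)
    t = (xbar - mu0) / (s / math.sqrt(n))  # t = (xbar - mu) / (s / sqrt(n))
    return t, n - 1                        # degrees of freedom = n - 1


# Hypothetical measurements, made up purely for illustration.
sample = [5.1, 4.9, 5.6, 5.8, 5.0, 5.4]
t, df = t_statistic(sample, mu0=5.0)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting value would then be compared with a t table for the same degrees of freedom.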
Use of the t-distribution
• The t is often thought of as a small-sample
technique
• But, STRICTLY SPEAKING, the t should be
used whenever the population standard
deviation σ is NOT KNOWN
• Some practitioners use z whenever the sample
is large
– Central Limit Theorem
– There isn’t much difference between t and z
Student’s t Distribution
Note: t → Z as n increases
t-distributions are bell-shaped and symmetric, but
have ‘fatter’ tails than the normal.
[Figure: density curves centered at 0 for the standard normal (t with df = ∞), t (df = 13) and t (df = 5); horizontal axis t.]
from “Statistics for Managers” Using Microsoft® Excel 4th Edition, Prentice-Hall 2004
t distribution values
With comparison to the Z value

Confidence   t           t           t           Z
Level        (10 d.f.)   (20 d.f.)   (30 d.f.)
.80          1.372       1.325       1.310       1.28
.90          1.812       1.725       1.697       1.64
.95          2.228       2.086       2.042       1.96
.99          3.169       2.845       2.750       2.58

Note: t → Z as n increases
from “Statistics for Managers” Using Microsoft® Excel 4th Edition, Prentice-Hall 2004
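The Z column of the table can be reproduced with the Python standard library, as sketched below. The helper name `z_critical` is an assumption made for this example. The standard library has no quantile function for the t distribution, so the t columns are not reproduced here; a statistics package would be needed for those.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value of the standard normal for a confidence level."""
    # A central area of `confidence` leaves (1 - confidence) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + confidence / 2)


for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.2f}  {z_critical(level):.3f}")
```

The printed values agree with the Z column to within the rounding used in the table.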
The Chi-squared (χ²) distribution
• It is a right-skewed distribution taking positive
values.
• It is characterized by its degrees of freedom.
• Its shape depends on the degrees of freedom; it
becomes more symmetrical and approaches
Normality as the degrees of freedom increase.
• It is particularly useful for analyzing
categorical data.
The Chi-Square Distribution
k - 1 degrees of freedom
(where k = the number of categories).
See Table P.495
[Figure: Chi-square density curves for df = 3, df = 5 and df = 10; horizontal axis χ².]
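As an illustration of the k - 1 degrees of freedom for categorical data, the sketch below computes the usual Pearson statistic, the sum of (observed - expected)² / expected over the k categories. The slides do not state this formula, so treat it as an assumption about the intended use; the function name and the die-roll counts are hypothetical.

```python
def chi_squared_statistic(observed, expected):
    """Return (statistic, df) with statistic = sum((O - E)^2 / E), df = k - 1."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


# Hypothetical counts from 60 rolls of a die (k = 6 categories).
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6  # a fair die would give 10 per face on average
stat, df = chi_squared_statistic(observed, expected)
print(f"chi-squared = {stat:.2f} with {df} degrees of freedom")
```

The statistic would then be compared with a Chi-squared table at the stated degrees of freedom.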
The F-distribution
• It is skewed to the right.
• It is defined by a ratio. The distribution of a ratio
of two estimated variances calculated from
Normal data approximates the F-distribution.
• The two parameters which characterize it are the
degrees of freedom of the numerator and the
denominator of the ratio.
• The F-distribution is particularly useful for
comparing two variances, and more than two
means using the analysis of variance (ANOVA).
The F-distribution
-Let $s_1^2$ and $s_2^2$ represent the sample
variances of two different populations.
-If both populations are normal and the
population variances $\sigma_1^2$ and $\sigma_2^2$ are equal,
then the sampling distribution of
$$F = \frac{s_1^2}{s_2^2}$$
is called an F-distribution.
Larson/Farber 4th ed
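The variance ratio can be computed from two samples as sketched below. The function name `f_ratio` and the data are illustrative assumptions; the degrees of freedom returned are n - 1 for each sample, matching the numerator and denominator of the ratio.

```python
import statistics


def f_ratio(sample1, sample2):
    """Return (F, df_numerator, df_denominator) for F = s1^2 / s2^2."""
    s1_sq = statistics.variance(sample1)  # sample variance, divisor n - 1
    s2_sq = statistics.variance(sample2)
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1


# Hypothetical samples, made up for illustration.
sample1 = [1, 2, 3, 4, 5]
sample2 = [2, 4, 6, 8, 10]
F, df_n, df_d = f_ratio(sample1, sample2)
print(f"F = {F:.2f} with d.f.N = {df_n} and d.f.D = {df_d}")
```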
Properties of the F-distribution
-F-values are always greater than or equal to 0.
-For F-distributions with d.f.D > 2, the mean value of F is
d.f.D / (d.f.D - 2), which is approximately equal to 1 when d.f.D is large.
[Figure: F-distribution density curves for (d.f.N = 1, d.f.D = 8), (d.f.N = 8, d.f.D = 26), (d.f.N = 16, d.f.D = 7) and (d.f.N = 3, d.f.D = 11); horizontal axis F, marked from 1 to 4. Source: Larson/Farber 4th ed]
The Lognormal distribution
-It is the probability distribution of a random
variable whose log (e.g. to base 10 or e) follows
the Normal distribution.
-It is highly skewed to the right (Fig. 8.3a).
-If, when we take logs of our raw data that are
skewed to the right, we produce an empirical
distribution that is nearly Normal (Fig. 8.3b),
our data approximate the Lognormal
distribution.
The Lognormal distribution
-Many variables in medicine follow a
Lognormal distribution. We can use the
properties of the Normal distribution to
make inferences about these variables after
transforming the data by taking logs.
-If a data set has a Lognormal distribution,
we can use the geometric mean as a
summary measure of location.
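The link between the log transformation and the geometric mean can be sketched as follows: take logs, average them, and transform back. The function name and data are assumptions made for this example; `statistics.geometric_mean` (Python 3.8+) is used only as a cross-check.

```python
import math
import statistics


def geometric_mean_via_logs(values):
    """Back-transform the arithmetic mean of the natural logs."""
    logs = [math.log(v) for v in values]    # requires strictly positive values
    return math.exp(statistics.mean(logs))


# Hypothetical right-skewed data, made up for illustration.
data = [1, 10, 100]
gm = geometric_mean_via_logs(data)
print(gm)                               # geometric mean via logs
print(statistics.geometric_mean(data))  # standard-library cross-check
print(statistics.mean(data))            # arithmetic mean, pulled up by the tail
```

For these values the geometric mean is close to 10, while the arithmetic mean is 37, which shows how the long right tail inflates the arithmetic mean.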
Other theoretical distributions
A-Continuous probability distributions
-t-distribution
-Chi-squared (χ²) distribution
-F-distribution
-Lognormal distribution
B-Discrete probability distributions
-Binomial distribution
-Poisson distribution
Discrete probability distributions
•The random variable that defines the
probability distribution is discrete.
•The sum of the probabilities of all possible
mutually exclusive events is one.
The Binomial distribution
• Suppose, in a given situation, there are only two outcomes,
'success' and 'failure'.
• For example, we may be interested in whether a woman
conceives (a success) or does not conceive (a failure) after
in vitro fertilization (IVF).
• If we look at n = 100 unrelated women undergoing IVF
(each with the same probability of conceiving), the Binomial
random variable is the observed number of conceptions
(successes).
• Often this concept is explained in terms of n independent
repetitions of a trial (e.g. 100 tosses of a coin) in which the
outcome is either success (e.g. head) or failure.
The Binomial distribution
• The two parameters that describe the
Binomial distribution are:
-n, the number of individuals in the sample
(or repetitions of a trial), and
-p, the true probability of success for each
individual (or in each trial).
• Its mean (the value for the random variable that
we expect if we look at n individuals, or repeat the
trial n times) is np. Its variance is np(1 - p).
The Binomial distribution
• When n is small,
-if p < 0.5, the distribution is skewed to the right
-if p > 0.5, the distribution is skewed to the left.
• The distribution becomes more symmetrical as
the sample size increases (Fig. 8.4) and
approximates the Normal distribution if both np
and n(1 - p) are greater than 5.
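The mean np, the variance np(1 - p) and the rule of thumb for the Normal approximation can be checked with a few lines of code. This is a minimal sketch; the function name `binomial_summary` and the example values of n and p are assumptions for illustration.

```python
def binomial_summary(n, p):
    """Return (mean, variance, normal_ok) for a Binomial(n, p) variable."""
    mean = n * p
    variance = n * p * (1 - p)
    # Rule of thumb from the slide: both np and n(1 - p) greater than 5.
    normal_ok = n * p > 5 and n * (1 - p) > 5
    return mean, variance, normal_ok


print(binomial_summary(100, 0.5))  # e.g. 100 tosses of a fair coin
print(binomial_summary(10, 0.1))   # small n and small p: rule not satisfied
```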
The Binomial distribution
• We can use the properties of the
Binomial distribution when making
inferences about proportions.
• In particular, we often use the Normal
approximation to the Binomial
distribution when analyzing proportions.
Notation for Binomial Experiments

Symbol      Description
n           The number of times a trial is repeated.
p = P(S)    The probability of Success in a single trial.
q = P(F)    The probability of Failure in a single trial (q = 1 - p).
x           The random variable represents a count of the number of
            successes in n trials: x = 0, 1, 2, 3, . . . , n.
Binomial Probabilities
There are several ways to find the probability of x
successes in n trials of a binomial experiment. One way
is to use the binomial probability formula.
Binomial Probability Formula
In a binomial experiment, the probability of exactly x
successes in n trials is:
$$P(x) = {}_nC_x\, p^x q^{n-x} = \frac{n!}{(n-x)!\,x!}\, p^x q^{n-x}$$
Ex: Finding Binomial Probabilities
A six-sided die is rolled 3 times. Find the probability of
rolling exactly one 6.
You could use a tree diagram over Roll 1, Roll 2 and Roll 3. In the
Frequency column each factor counts the faces that give that roll's
branch: 1 for a six, 5 for any other number.

Frequency          # of 6’s   Probability
(1)(1)(1) = 1      3          1/216
(1)(1)(5) = 5      2          5/216
(1)(5)(1) = 5      2          5/216
(1)(5)(5) = 25     1          25/216
(5)(1)(1) = 5      2          5/216
(5)(1)(5) = 25     1          25/216
(5)(5)(1) = 25     1          25/216
(5)(5)(5) = 125    0          125/216
Ex: Finding Binomial Probabilities
There are three outcomes that have exactly one six, and
each has a probability of 25/216. So, the probability of
rolling exactly one six is 3(25/216) ≈ 0.347. Another way
to answer the question is to use the binomial probability
formula. In this binomial experiment, rolling a 6 is a success
while rolling any other number is a failure. The values for n,
p, q, and x are n = 3, p = 1/6, q = 5/6 and x = 1. The
probability of rolling exactly one 6 is:
P( x) n C x p q
x
Or you could use the binomial
probability formula
n x
n!

p x q n x
(n  x)! x!
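Ex: Finding Binomial Probabilities
$$\begin{aligned}
P(1) &= \frac{3!}{(3-1)!\,1!}\left(\frac{1}{6}\right)^{1}\left(\frac{5}{6}\right)^{3-1} \\
&= 3\left(\frac{1}{6}\right)\left(\frac{5}{6}\right)^{2} \\
&= 3\left(\frac{1}{6}\right)\left(\frac{25}{36}\right) \\
&= 3\left(\frac{25}{216}\right) \\
&= \frac{25}{72} \approx 0.347
\end{aligned}$$
By listing the possible values of x with the corresponding
probability of each, you can construct a binomial probability
distribution.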
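The die example can be checked in code in both ways described: with the binomial probability formula and by enumerating all 216 outcomes, as the tree diagram does. This is a sketch using exact fractions; the names `binomial_pmf`, `by_formula` and `by_enumeration` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product
from math import comb


def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p_six = Fraction(1, 6)
by_formula = binomial_pmf(1, 3, p_six)

# Brute force: list every (roll 1, roll 2, roll 3) and count exactly one 6.
outcomes = list(product(range(1, 7), repeat=3))
favourable = sum(1 for rolls in outcomes if rolls.count(6) == 1)
by_enumeration = Fraction(favourable, len(outcomes))

print(by_formula, float(by_formula))
print(by_enumeration)

# The full binomial probability distribution for x = 0, 1, 2, 3.
distribution = {x: binomial_pmf(x, 3, p_six) for x in range(4)}
print(distribution)
```

Using `Fraction` keeps the arithmetic exact, so the two approaches can be compared for equality rather than to within rounding.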
The Poisson distribution
• The Poisson random variable is the count of the
number of events that occur independently and
randomly in time or space at some average rate,
µ.
• For example, the number of hospital admissions
per day typically follows the Poisson distribution.
• We can use our knowledge of the Poisson
distribution to calculate the probability of a
certain number of admissions on any particular
day.
The Poisson distribution
• The parameter that describes the Poisson
distribution is the mean, i.e. the average rate, µ.
• The mean equals the variance in the
Poisson distribution.
• It is a right skewed distribution if the mean is
small, but becomes more symmetrical as the
mean increases, when it approximates a
Normal distribution.
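The statement that the mean equals the variance can be checked numerically by summing over the Poisson probabilities. The sketch below assumes the usual probability function µ^x e^(-µ) / x! and truncates the infinite sum at a hypothetical cut-off `upper`; both function names are assumptions for this example.

```python
import math


def poisson_pmf(x, mu):
    """Probability of exactly x events when the average rate is mu."""
    return mu**x * math.exp(-mu) / math.factorial(x)


def poisson_mean_and_variance(mu, upper=100):
    """Approximate the mean and variance by summing over x = 0 .. upper - 1."""
    # The cut-off is safe for small mu because the tail beyond it is negligible.
    probs = [poisson_pmf(x, mu) for x in range(upper)]
    mean = sum(x * p for x, p in enumerate(probs))
    variance = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, variance


print(poisson_mean_and_variance(3))
print(poisson_mean_and_variance(0.5))
```

Both numbers in each pair should agree with µ up to floating-point error.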
The Poisson Distribution
The Poisson distribution is defined by:
$$f(x) = \frac{\mu^{x} e^{-\mu}}{x!}$$
where f(x) is the probability of x occurrences in an
interval,
μ is the expected value or mean number of occurrences
within an interval, and
e is the base of the natural logarithm, e ≈ 2.71828.
Poisson Distribution, example
The Poisson distribution models counts, such as the number of new cases
of SARS that occur in women in New England next month.
The distribution tells you the probability of all possible numbers of new
cases, from 0 to infinity.
If X = # of new cases next month and X ~ Poisson(μ), then the probability
that X = k (a particular count) is:
$$p(X = k) = \frac{\mu^{k} e^{-\mu}}{k!}$$
Example: Mercy Hospital
• Poisson Probability Function
Patients arrive at the emergency room of Mercy
Hospital at the average rate of 6 per hour on
weekend evenings.
What is the probability of 4 arrivals in
30 minutes on a weekend evening?
Example: Mercy Hospital
• Poisson Probability Function
μ = 6/hour = 3/half-hour, x = 4
$$f(4) = \frac{\mu^{x} e^{-\mu}}{x!} = \frac{3^{4}(2.71828)^{-3}}{4!} = 0.1680$$
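The Mercy Hospital calculation can be reproduced in a few lines. This is a minimal sketch; the function and variable names are assumptions for this example. The only step beyond the formula is rescaling the hourly rate to the 30-minute interval.

```python
import math


def poisson_pmf(x, mu):
    """f(x) = mu^x * e^(-mu) / x!"""
    return mu**x * math.exp(-mu) / math.factorial(x)


rate_per_hour = 6
interval_hours = 0.5
mu = rate_per_hour * interval_hours  # 3 expected arrivals per half-hour

p4 = poisson_pmf(4, mu)
print(f"P(4 arrivals in 30 minutes) = {p4:.4f}")
```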
Using Excel to Compute
Poisson Probabilities
• Formula Worksheet

      A                          B
1     3                          = Mean No. of Occurrences (μ)
2
3     Number of Arrivals (x)     Probability f(x)
4     0                          =POISSON(A4,$A$1,FALSE)
5     1                          =POISSON(A5,$A$1,FALSE)
6     2                          =POISSON(A6,$A$1,FALSE)
7     3                          =POISSON(A7,$A$1,FALSE)
8     4                          =POISSON(A8,$A$1,FALSE)
9     5                          =POISSON(A9,$A$1,FALSE)
10    6                          =POISSON(A10,$A$1,FALSE)
      … and so on                … and so on
Using Excel to Compute
Poisson Probabilities
• Value Worksheet

      A                          B
1     3                          = Mean No. of Occurrences (μ)
2
3     Number of Arrivals (x)     Probability f(x)
4     0                          0.0498
5     1                          0.1494
6     2                          0.2240
7     3                          0.2240
8     4                          0.1680
9     5                          0.1008
10    6                          0.0504
      … and so on                … and so on
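The worksheet values can be reproduced outside Excel. The sketch below defines a hypothetical Python helper `poisson(x, mean, cumulative)` that mirrors the argument order of the worksheet function: with the third argument False it returns the probability of exactly x arrivals, and with True it returns the cumulative probability of x or fewer.

```python
import math


def poisson(x, mean, cumulative):
    """Mimic the worksheet call POISSON(x, mean, cumulative)."""
    def pmf(k):
        return mean**k * math.exp(-mean) / math.factorial(k)

    if cumulative:
        return sum(pmf(k) for k in range(x + 1))  # P(X <= x)
    return pmf(x)                                 # P(X = x)


mean_arrivals = 3  # the value held in cell A1 of the worksheet
table = [(x, round(poisson(x, mean_arrivals, False), 4)) for x in range(7)]
for x, prob in table:
    print(f"{x:2d}  {prob:.4f}")

print(f"P(X <= 6) = {poisson(6, mean_arrivals, True):.4f}")
```

The printed probabilities for x = 0 to 6 match column B of the value worksheet.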
Example: Mercy Hospital
• Poisson Distribution of Arrivals
[Figure: bar chart of Poisson probabilities; vertical axis Probability (0.00 to 0.25), horizontal axis Number of Arrivals in 30 Minutes (0 to 10). Annotation: actually, the sequence continues: 11, 12, …]
Summary
Other theoretical distributions
A-Continuous probability distributions
-t-distribution
-Chi-squared (χ²) distribution
-F-distribution
-Lognormal distribution
B-Discrete probability distributions
-Binomial distribution
-Poisson distribution