Life is a school of probability...
Walter Bagehot (English Economist)
I don't believe in providence and fate, as a
technologist I am used to reckoning with the
formulae of probability…
Max Frisch (Swiss Architect and Novelist)
AP Statistics
Sampling Distributions?
§ Review of Statistical Terms
– Population, from a statistical point of view, is
considered as a set of measurements or
counts, existing or conceptual
– Sample is a subset of measurements from
the population. Random Samples are
considered for this section
Braser–Braser Chapter 7.1
– Parameter is a numerical descriptive measure
of a population. In statistical practice the value
of a parameter is usually not known, because it is
rarely possible to examine the entire population
– Statistic is a numerical descriptive measure
of a sample that does not depend on any unknown
parameter. A statistic is used to estimate an
unknown parameter
§ Common Statistics and Parameters
Measure              Statistic     Parameter
Mean                 $\bar{x}$     $\mu$
Variance             $s^2$         $\sigma^2$
Standard Deviation   $s$           $\sigma$
Proportion           $\hat{p}$     $p$
§ Why Sample?
At times, we’d like to know something about the
population, but because our time, resources,
and efforts are limited, we can take just a
sample to learn about the population
Ex: Take a sample of voters to learn about
probable election results (before the final
count).
So we must use measurements from a sample
instead. In such cases, we will use a statistic ($\bar{x}$,
$s$, or $\hat{p}$) to make inferences about the corresponding
population parameters ($\mu$, $\sigma$, or $p$)
Inference is to draw conclusions for an entire
population from the information of a sample
§ Types of Inference
Estimation: In this case, we estimate or
approximate the value of a population parameter
Testing: In this case, we formulate a decision
about a population parameter
Regression: In this case, we make predictions
or forecasts about the value of a statistical
variable
§ Are Inferences Reliable?
To evaluate the reliability of our inference, we
need to know about the probability distribution
of the statistic we are using
Typically, we are interested in the sampling
distributions for sample means and sample
proportions
§ Sampling Distributions
A Sampling Distribution is a probability
distribution of a sample statistic based on all
possible simple random samples of the same
size from the same population
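The definition can be made concrete with a population small enough to list every possible sample. The sketch below is only an illustration: the five-value population and the sample size of 2 are hypothetical and do not come from the textbook.

```python
from itertools import combinations
from collections import Counter
from fractions import Fraction

# Hypothetical population of five measurements (not from the textbook).
population = [6, 7, 8, 9, 10]
n = 2  # sample size

# Every simple random sample of size n (without replacement) is equally likely.
samples = list(combinations(population, n))
means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution of the sample mean: each possible value with its probability.
counts = Counter(means)
sampling_distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for mean, prob in sampling_distribution.items():
    print(f"x-bar = {float(mean):4.1f}   P = {prob}")

# Mean of the sampling distribution, to compare with the population mean.
mean_of_means = sum(m * p for m, p in sampling_distribution.items())
population_mean = Fraction(sum(population), len(population))
print("mean of x-bar:", mean_of_means, " population mean:", population_mean)
```

With a realistic population the list of all samples is far too long to enumerate, which is why the later sections rely on the Central Limit Theorem instead.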
§ Sampling Distributions Example
In a rural community with a children’s fishing pond,
posted rules state that all fish under 6 inches
must be returned to the pond and that at most five fish
per day may be kept. One hundred random samples of five
trout each are taken, and the lengths of the five trout
in each sample are recorded. What is the average (mean)
length of a trout taken from the pond? (Textbook p. 362, Table 7-1)
Checking for Understanding
HM STAT Space (7.1)
Practice
• Textbook Section 7.1 Problems: p. 365
Information: the negative reciprocal value
of probability…
Claude Shannon (American Mathematical Engineer)
From principles is derived probability, but
truth or certainty is obtained only from facts.
Tom Stoppard (English Playwright)
There is an old saying: All roads lead to
Rome. In Statistics we can recast this
saying: the sample means of nearly all probability
distributions average out to the Normal distribution
as the sample size increases
AP Statistics
The Central Limit Theorem
§ The Central Limit Theorem (Normal)
For a Normal Probability Distribution, let x be a random
variable with a normal distribution whose mean is $\mu$
and whose standard deviation is $\sigma$. Let $\bar{x}$ be the sample
mean corresponding to random samples of size n taken
from the x distribution. Then the following is true:
– The $\bar{x}$ distribution is a normal distribution
– The mean of the $\bar{x}$ distribution is $\mu$
– The standard deviation of the $\bar{x}$ distribution is $\sigma/\sqrt{n}$
Braser–Braser Chapter 7.2
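§ Central Limit Theorem. Converting $\bar{x}$ to z
We can convert the $\bar{x}$ distribution to the standard
normal z distribution using the following formulas
$\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
n is the sample size
$\mu$ is the mean of the x distribution
$\sigma$ is the standard deviation of the x distribution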
§ Sample Size Considerations
For the Central Limit Theorem (CLT) to be applicable:
– If the x distribution is symmetric or reasonably
symmetric, n ≥ 30 should suffice
– If the x distribution is highly skewed or unusual,
even larger sample sizes will be required
– If possible, make a graph to visualize how the
sampling distribution is behaving
§ Central Limit Theorem. (Any Distribution)
If x possesses any distribution with mean $\mu$ and
standard deviation $\sigma$, then the sample mean $\bar{x}$
based on a random sample of size n will have a
distribution that approaches the
distribution of a normal random variable with
mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, as n
increases without limit
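A small simulation can make this statement tangible. The sketch below assumes an exponential population with rate 1 (strongly right-skewed, with mean 1 and standard deviation 1); that choice, the sample sizes, and the number of repetitions are all illustrative assumptions rather than anything taken from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, which is strongly right-skewed
# and has mean mu = 1 and standard deviation sigma = 1.
mu, sigma = 1.0, 1.0

def sample_mean(n):
    """Mean of one random sample of size n from the exponential population."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

results = {}
for n in (2, 30, 200):
    # Repeat the sampling many times to approximate the x-bar distribution.
    means = [sample_mean(n) for _ in range(5000)]
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"n = {n:3d}: mean of x-bar = {results[n][0]:.3f}, "
          f"sd of x-bar = {results[n][1]:.3f}, sigma/sqrt(n) = {sigma / n ** 0.5:.3f}")
```

The simulated mean of the sample means should stay close to the population mean, and their standard deviation should track $\sigma/\sqrt{n}$. A histogram of `means` for each n would show the skew fading as n grows.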
§ Finding Probabilities Using Central Limit Theorem
Given a probability distribution of x values with
sample size n, mean $\mu$, and standard deviation $\sigma$:
– If the x distribution is normal, then the
$\bar{x}$ distribution is normal
– Even if the x distribution is not normal, if the
sample size is n ≥ 30, then by the CLT, the
$\bar{x}$ distribution is approximately normal
– Convert $\bar{x}$ to z using the formula:
$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
– Use the standard normal distribution to find the
corresponding probability for the events
regarding $\bar{x}$
§ Central Limit Theorem. Example
The heights of 18-year-old men are approximately
normally distributed, with a mean $\mu = 68$ inches
and a standard deviation $\sigma = 3$ inches
a. What is the probability that a randomly
selected man is taller than 72 inches?
Get the z score: $z = \frac{72 - 68}{3} = 1.33$
Find Probability: $P(x > 72) = P(z > 1.33) = 1 - P(z < 1.33) = 0.0918$
b. What’s the probability that the average height of
2 randomly selected men is greater than 72 in?
Using CLT with n = 2: $\mu_{\bar{x}} = \mu = 68$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{2}} = 2.121$
$z = \frac{72 - 68}{2.121} = 1.89$
$P(\bar{x} > 72) = P(z > 1.89) = 0.0294$
c. What is the probability that the average height of
16 randomly selected men is greater than 72 in?
Using CLT with n = 16: $\mu_{\bar{x}} = \mu = 68$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{16}} = 0.75$
$z = \frac{72 - 68}{0.75} = 5.33$
$P(\bar{x} > 72) = P(z > 5.33) \approx 0.000$
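The three parts of the height example can be checked numerically. This is a minimal sketch using Python's standard-library `NormalDist`; the helper name `prob_mean_exceeds` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 68.0, 3.0   # population mean and standard deviation (inches)
cutoff = 72.0

def prob_mean_exceeds(cutoff, mu, sigma, n):
    """P(x-bar > cutoff) for samples of size n from a normal population."""
    sigma_xbar = sigma / sqrt(n)           # standard deviation of x-bar
    z = (cutoff - mu) / sigma_xbar         # convert to a z score
    return z, 1.0 - NormalDist().cdf(z)    # upper-tail area

for n in (1, 2, 16):
    z, p = prob_mean_exceeds(cutoff, mu, sigma, n)
    print(f"n = {n:2d}: z = {z:.2f}, P(x-bar > 72) = {p:.4f}")
```

Because the code keeps the unrounded z score, its probabilities can differ in the third or fourth decimal place from table lookups that round z to two decimals first.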
Checking for Understanding
HM STAT Space (7.2)
Practice
• Textbook Section 7.2 Problems: pp. 373 – 379
The scientific imagination always restrains
itself within the limits of probability...
Thomas Huxley (English Biologist)
A property in the 100-year floodplain has a 96
percent chance of being flooded in the next
hundred years without global warming. The
fact that several years go by without a flood
does not change that probability...
Earl Blumenauer (Oregon Representative)
Many issues in life come down to success
or failure. In most cases, we will not be
successful all the time, so proportions of
successes are very important. What are the
sampling distributions for proportions?…
The annual crime rate in the Capitol Hill neighborhood of Denver is
111 victims per 1000 residents (111 out of 1000 residents have been
the victim of at least one crime). These crimes range from minor crimes
(stolen hubcaps or purse snatching) to major crimes (murder).
The Arms is an apartment building in this neighborhood that has
50 year-round residents. Consider each of the n = 50 residents
as a binomial trial. The random variable r (0 ≤ r ≤ 50)
represents the number of victims of at least one crime next year.
(a) What is the population probability p that a resident in the
Capitol Hill neighborhood will be a victim of a crime? What is
the probability q that a resident will not be a victim?
(b) What is the probability that between 10% and 20% of the
Arms residents will be victims of a crime next year?
Hint: Use the binomial distribution. Use the normal approximation to the binomial.
Compare answers
AP Statistics
Sampling Distributions for Proportions
§ Sampling Distribution for the Proportion $\hat{p} = r/n$
Given: n = number of binomial trials (constant)
r = number of successes
p = probability of success on each trial
q = 1 – p = probability of failure on each trial
If np > 5 and nq > 5, then the random variable $\hat{p} = r/n$
can be approximated by a normal random variable x with
$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$
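The rule of thumb and the two formulas can be wrapped in a small helper. This is a sketch only: the function name `phat_normal_parameters` is hypothetical, and raising an error when the conditions fail is a design choice made here for illustration.

```python
from math import sqrt

def phat_normal_parameters(n, p):
    """Return (mu, sigma) of the normal approximation to p-hat = r/n.

    Raises ValueError when the np > 5 and nq > 5 rule of thumb is not met.
    """
    q = 1 - p
    if not (n * p > 5 and n * q > 5):
        raise ValueError(f"np = {n * p:.2f}, nq = {n * q:.2f}: "
                         "normal approximation not advised")
    return p, sqrt(p * q / n)

# Values from the apartment-building problem: n = 50 residents, p = 0.111.
mu_phat, sigma_phat = phat_normal_parameters(50, 0.111)
print(f"mu = {mu_phat:.3f}, sigma = {sigma_phat:.3f}")
```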
Braser–Braser Chapter 7.3
§ Continuity Corrections
Since p̂ is discrete, but x is continuous, we have to
make a continuity correction; for a small n, the
correction is meaningful
How to make corrections to p̂ intervals
1. If r/n is the right end point of a p̂ interval, we
add 0.5/n to get the corresponding right end
point of the x interval
2. If r/n is the left end point of a p̂ interval, we
subtract 0.5/n to get the corresponding left end
point of the x interval
§ Proportion Sampling Distribution. Example
Suppose the annual crime rate in Denver is p = 0.111
If 50 people live in an apartment complex, what is the
probability that between 10% and 20% of the residents
will be victims of crimes next year?
n = 50, p = 0.111, q = 1 – p = 1 – 0.111 = 0.889
Checking conditions (np >5, nq > 5):
np = (50)(.111) = 5.55
nq = (50)(.889) = 44.45
p̂ can be approximated with a normal distribution
$\mu_{\hat{p}} = p = 0.111$
$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.111)(0.889)}{50}} \approx 0.044$
Continuity Correction (0.5/n): 0.5/50 = 0.01
$P(0.10 \le \hat{p} \le 0.20) \approx P(0.09 \le x \le 0.21)$
Using z-scores with $\mu_{\hat{p}} = 0.111$ and $\sigma_{\hat{p}} = 0.044$:
$z_1 = \frac{0.09 - 0.111}{0.044} = -0.48$
$z_2 = \frac{0.21 - 0.111}{0.044} = 2.25$
$P(0.09 \le x \le 0.21) = P(-0.48 \le z \le 2.25) = 0.6722$
$P(0.10 \le \hat{p} \le 0.20) \approx 0.6722$
Thus, there is about a 67% chance that between 10% and
20% of the residents will be victims of a crime next year.
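The hint in the problem statement suggests comparing the exact binomial answer with the normal approximation. The sketch below does that with the standard library, assuming that "between 10% and 20%" means r = 5 through r = 10 inclusive; variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.111
q = 1 - p

# Exact binomial probability that 10% to 20% of residents are victims,
# i.e. r = 5, 6, ..., 10 out of n = 50.
r_lo, r_hi = 5, 10
exact = sum(comb(n, r) * p**r * q**(n - r) for r in range(r_lo, r_hi + 1))

# Normal approximation to p-hat with the 0.5/n continuity correction.
sigma_phat = sqrt(p * q / n)
approx_dist = NormalDist(mu=p, sigma=sigma_phat)
x_lo = r_lo / n - 0.5 / n   # left end point: subtract 0.5/n
x_hi = r_hi / n + 0.5 / n   # right end point: add 0.5/n
approx = approx_dist.cdf(x_hi) - approx_dist.cdf(x_lo)

print(f"exact binomial      : {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

The approximation here uses unrounded z scores, so it will not match the table-based 0.6722 exactly. Since np is only 5.55, just above the rule-of-thumb threshold, some gap between the two answers should be expected.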
§ Control Charts for Proportions
Used to examine an attribute or quality of an
observation (rather than a measurement).
How to use it:
– Select a fixed sample size, n, at fixed time
intervals, and determine the sample proportions
at each interval
– Then use the normal approximation of the
sample proportion to determine the control limits
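§ How to Make a P-Chart
1. Estimate $\bar{p}$, the overall proportion of successes:
$\bar{p} = \frac{\text{Total number of observed successes in all samples}}{\text{Total number of trials in all samples}}$
2. Take the center line of the control chart as: $\mu_{\hat{p}} = \bar{p}$
3. Control limits are located at:
$\bar{p} \pm 2\sqrt{\frac{\bar{p}\,\bar{q}}{n}}$ and $\bar{p} \pm 3\sqrt{\frac{\bar{p}\,\bar{q}}{n}}$, where $\bar{q} = 1 - \bar{p}$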
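The three steps can be sketched as a short function. The function name and the sample counts below are hypothetical and chosen only for illustration. The clamping step follows the slides' convention of rounding negative limits to 0 and limits above 1 to 1.

```python
from math import sqrt

def p_chart_limits(successes, n):
    """Center line and 2- and 3-sigma control limits for a P-chart.

    successes: number of successes observed in each sample
    n: fixed size of every sample
    """
    p_bar = sum(successes) / (len(successes) * n)   # step 1: pooled proportion
    q_bar = 1 - p_bar
    sigma = sqrt(p_bar * q_bar / n)

    def clamp(value):
        # Convention: limits below 0 become 0, limits above 1 become 1.
        return min(1.0, max(0.0, value))

    return {
        "center": p_bar,                                             # step 2
        "2-sigma": (clamp(p_bar - 2 * sigma), clamp(p_bar + 2 * sigma)),  # step 3
        "3-sigma": (clamp(p_bar - 3 * sigma), clamp(p_bar + 3 * sigma)),
    }

# Hypothetical counts from four samples of n = 50 (illustrative only).
limits = p_chart_limits([6, 9, 7, 8], n=50)
print(limits)
```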
§ P-Chart. Out of Control Signals
Signal 1. Any point beyond a $\bar{p} \pm 3\sqrt{\frac{\bar{p}\,\bar{q}}{n}}$ control limit
Signal 2. Run of nine consecutive points on one
side of the center line $\mu_{\hat{p}} = \bar{p}$
Signal 3. At least two out of three consecutive
points are beyond the control limits $\bar{p} \pm 2\sqrt{\frac{\bar{p}\,\bar{q}}{n}}$
If no out-of-control signals occur, we say that the
process is in control, while keeping a watchful
eye on what occurs next
In some P-Charts the value of p may be near 0 or 1
In this case, the control limits may drop below 0 or
rise above 1. If this happens, follow the convention
of rounding negative control limits to 0 and control
limits above 1 to 1
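A rough way to scan a series of sample proportions for the three signals is sketched below. The function name and the example series are hypothetical. Signal 3 is implemented exactly as worded on the slide, so a point beyond either 2-sigma limit counts; some textbooks additionally require the points to fall on the same side of the center line, which this sketch does not check.

```python
def out_of_control_signals(p_hats, center, two_sigma, three_sigma):
    """Return the set of signal numbers (1, 2, 3) found in a P-chart series.

    p_hats: sample proportions in time order
    center: center line
    two_sigma, three_sigma: (lower, upper) control limits
    """
    signals = set()

    # Signal 1: any point beyond a 3-sigma control limit.
    if any(p < three_sigma[0] or p > three_sigma[1] for p in p_hats):
        signals.add(1)

    # Signal 2: nine consecutive points on one side of the center line.
    run, last_side = 0, 0
    for p in p_hats:
        side = (p > center) - (p < center)   # +1 above, -1 below, 0 on the line
        if side != 0 and side == last_side:
            run += 1
        else:
            run = 1 if side != 0 else 0
        last_side = side
        if run >= 9:
            signals.add(2)

    # Signal 3: at least two of three consecutive points beyond the 2-sigma
    # limits (read literally from the slide; either side counts here).
    beyond = [p < two_sigma[0] or p > two_sigma[1] for p in p_hats]
    if any(sum(beyond[i:i + 3]) >= 2 for i in range(len(beyond) - 2)):
        signals.add(3)

    return signals

# Hypothetical series checked against limits of 0.175 +/- 2(0.049) and +/- 3(0.049).
series = [0.17, 0.20, 0.15, 0.35, 0.18]
print(out_of_control_signals(series, 0.175, (0.077, 0.273), (0.028, 0.322)))
```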
§ Control Charts for Proportions. Example (p. 384)
(a) Estimate the overall proportion of successes
$\bar{p} = \frac{\text{Total number of observed successes in all samples}}{\text{Total number of trials in all samples}} = \frac{9 + 12 + 8 + \dots + 10}{14(60)} = \frac{147}{840} = 0.175$
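(b) Calculate $\mu_{\hat{p}}$ and $\sigma_{\hat{p}}$
$\mu_{\hat{p}} = p \approx \bar{p} = 0.175$
$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \approx \sqrt{\frac{\bar{p}\,\bar{q}}{n}} = \sqrt{\frac{(0.175)(0.825)}{60}} \approx 0.049$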
(c) Estimate np and nq
$np \approx n\bar{p} = 60(0.175) = 10.5$
$nq \approx n\bar{q} = 60(0.825) = 49.5$
Both are greater than 5, so the normal approximation
should be reasonably good
(d) Estimate the control limits of the P-Chart
$\bar{p} \pm 2\sqrt{\frac{\bar{p}\,\bar{q}}{n}} = 0.175 \pm 2(0.049)$, or 0.077 and 0.273
$\bar{p} \pm 3\sqrt{\frac{\bar{p}\,\bar{q}}{n}} = 0.175 \pm 3(0.049)$, or 0.028 and 0.322
Out of control signals
Signal 1. Semester 12 above the $3\sigma$ level (Very good class!)
Signal 2. Not present
Signal 3. Not present
The proportion of A’s given in class is in statistical
control, with the exception of the one unusually good class
two semesters ago
Checking for Understanding
HM STAT Space (7.3)
Practice
• Textbook Section 7.3 Problems: pp. 387 – 389
AP Statistics
Normal Approximation to Binomial Distribution
§ Normal Approximation to Binomial Error
The error of the normal approximation to the binomial
distribution decreases and becomes negligible as the
number of trials n increases
However, if the number of trials is not large, the error in
this approximation cannot be ignored…
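One way to see how the error behaves is to compute the exact binomial probability and its normal approximation side by side for several n. The sketch below reuses p = 0.111 and the 10% to 20% interval from the earlier example; the particular values of n and the helper names are assumptions made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_prob(n, p, lo, hi):
    """Exact P(lo <= r <= hi) for a binomial with n trials and success probability p."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(lo, hi + 1))

def normal_approx(n, p, lo, hi):
    """Normal approximation to P(lo <= r <= hi) with a 0.5 continuity correction."""
    dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return dist.cdf(hi + 0.5) - dist.cdf(lo - 0.5)

p = 0.111
errors = {}
for n in (50, 200, 800):
    lo, hi = round(0.10 * n), round(0.20 * n)   # counts matching 10% to 20%
    exact = binomial_prob(n, p, lo, hi)
    approx = normal_approx(n, p, lo, hi)
    errors[n] = abs(exact - approx)
    print(f"n = {n:3d}: exact = {exact:.4f}, normal = {approx:.4f}, error = {errors[n]:.4f}")
```

The printed error column should shrink as n grows, in line with the statement above. Much larger n would need a different way of computing the exact sum, because the binomial coefficients eventually become too large to convert to floating point.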
Braser–Braser Chapter 7.4
$P(5 \le r \le 10) = ?$
n = 50, p = 0.111, q = 0.889, $\mu = np = 5.55$, $\sigma = \sqrt{npq} = 2.221$
[Figure: binomial probability compared with the normal approach plus continuity correction, for r from 5 to 10]
$P(0.10 \le \hat{p} \le 0.20) = ?$
n = 50, $\hat{p} = r/n$, p = 0.111, q = 0.889, $\mu_{\hat{p}} = 0.111$, $\sigma_{\hat{p}} = 0.044$
[Figure: binomial probability compared with the normal approach plus continuity correction, for p̂ from 0.1 to 0.2]
§ Normal Approximation to Binomial
Continuity Correction for $\hat{p} = r/n$
Step 1. If $\hat{p}$ is the left end point of an interval, subtract 0.5/n
to obtain the corresponding random variable x:
$x = \frac{r - 0.5}{n} = \hat{p} - \frac{0.5}{n}$
Step 2. If $\hat{p}$ is the right end point of an interval, add 0.5/n to
obtain the corresponding random variable x:
$x = \frac{r + 0.5}{n} = \hat{p} + \frac{0.5}{n}$