Unit 6 – Central Limit Theorem (CLT)

Sampling Distributions and CLT Day 5
*****CENTRAL LIMIT THEOREM (CLT)*****
(One of the MOST important theorems in Statistics - KNOW
AND UNDERSTAND THIS!!!!!!)
Draw an SRS of size n from ANY population with mean μ
and standard deviation σ. When n is LARGE, the
sampling distribution of the sample mean x-bar is
approximately NORMAL!!! with mean μ and standard
deviation σ/√n.
FOR EXAMPLE:
http://onlinestatbook.com/stat_sim/sampling_dist/index.html
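If the applet is not available, the same idea can be sketched in a few lines of Python. This is only an illustration: the exponential population (mean 1, standard deviation 1, skewed right), the sample size of 40, and the 5000 repetitions are all assumptions chosen for the demo, not part of the notes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    # Assumed population: exponential with rate 1 (mean 1, sd 1, skewed right)
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

n = 40
means = [sample_mean(n) for _ in range(5000)]

center = statistics.fmean(means)      # should land near mu = 1
spread = statistics.stdev(means)      # should land near sigma / sqrt(n)
predicted_spread = 1.0 / n ** 0.5     # sigma / sqrt(n) with sigma = 1

print(f"mean of sample means: {center:.3f}")
print(f"sd of sample means:   {spread:.3f}")
print(f"sigma / sqrt(n):      {predicted_spread:.3f}")
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly skewed.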
KEEP IN MIND:
1. You MUST have a sample that is large enough to
use the Central Limit Theorem (n ≥ 30 is usually good
- we will spend more time on this later...).
2. Normal distributions are GREAT for calculating
probabilities (like we have seen many times before)
so the CLT is a VERY powerful tool.
Central Limit Theorem Recap:
If you take large enough sample sizes, then the sampling
distribution of x-bar will be approximately normal in shape.
1) If the population is normal (or roughly normal), then you can
take samples of any size
2) If the population is not normal, then you will need to take a
large enough sample (generally n≥30 is enough)
FOR EXAMPLE:
An insurance company knows that the mean loss from fire for the entire
population of homeowners is μ = $250 and the standard deviation of loss is
σ = $1000. The distribution of losses is strongly skewed to the right: many
policies have $0 loss, but a few have very large losses.
a) What is the approximate probability the loss of one policy will be
greater than $275?
b) If the company sells 10,000 policies, what is the approximate probability
that the average loss will be greater than $275?
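One possible way to check part (b) with Python's standard library is sketched below. For part (a) a Normal calculation is not appropriate, because a single loss follows the strongly skewed population distribution. The variable names are just for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 250, 1000, 10_000

# Standard deviation of x-bar: sigma / sqrt(n) = 1000 / 100 = 10
se = sigma / sqrt(n)

# CLT: x-bar is approximately Normal(mu, se) because n is very large
p_mean_over_275 = 1 - NormalDist(mu, se).cdf(275)

print(f"sd of x-bar = {se:.1f}")
print(f"P(x-bar > 275) is about {p_mean_over_275:.4f}")
```

Here 275 sits 2.5 standard deviations above the mean of x-bar, so the probability is small.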
Example 2: The scores of students on the ACT college entrance examination in
2001 had a mean score of 21 with standard deviation of 4.7. The distribution of
scores is only roughly Normal.
a) What is the approximate probability that a single student randomly chosen
from all those taking the test scores 23 or lower?
b) Now take an SRS of 50 students who took the test. What are the mean and
standard deviation of the sample mean score x-bar of these 50 students? What is
the approximate probability that the mean score of these students is 23 or
lower?
c) Which of your two Normal probability calculations in (a) and (b) is more
accurate? Why?
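A quick sketch of the arithmetic for (a) and (b), using Python's statistics.NormalDist. Treating a single score as Normal in (a) is only as good as the "roughly Normal" population, which is the point of part (c).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 21, 4.7, 50

# (a) one student: relies on the population being roughly Normal
p_single = NormalDist(mu, sigma).cdf(23)

# (b) mean of 50 students: mean stays 21, sd shrinks to sigma / sqrt(n)
se = sigma / sqrt(n)
p_mean = NormalDist(mu, se).cdf(23)

print(f"(a) P(X <= 23) is about {p_single:.3f}")
print(f"(b) sd of x-bar = {se:.4f}, P(x-bar <= 23) is about {p_mean:.4f}")
```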
Example 3: The gypsy moth is a serious threat to oak and aspen trees. A
state agriculture department places traps throughout the state to detect the
moths. When traps are checked periodically, the mean number of moths
trapped is only 0.5 moths, but some traps have several moths. The
distribution of moth counts is discrete and strongly skewed, with standard
deviation of 0.7 moths.
a) What are the mean and standard deviation of the average number of
moths in 50 traps?
b) Find the probability that the average number of moths in 50 traps is
greater than 0.6 moths?
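The calculation can be sketched as follows. The Normal approximation is an assumption justified by the CLT, since n = 50 traps is fairly large even though the counts themselves are skewed.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 0.5, 0.7, 50

# (a) mean of x-bar is mu; sd of x-bar is sigma / sqrt(n)
se = sigma / sqrt(n)

# (b) CLT approximation for the average count over 50 traps
p_over = 1 - NormalDist(mu, se).cdf(0.6)

print(f"(a) mean = {mu}, sd = {se:.4f}")
print(f"(b) P(x-bar > 0.6) is about {p_over:.3f}")
```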
Example 4: In a recent year, there were about 4.8 million parents in the U.S. who
received child support payments. The mean number of children in families receiving
child support was 1.7 and the standard deviation was 0.8
a) What is the probability that a randomly selected parent has fewer than 2
children receiving child support payments?
b) You randomly select 15 parents who receive child support and ask how many
children in their custody are receiving child support payments. What is the
probability that the mean of the sample is less than 2 children receiving child support
payments?
Example 5: Suppose that we randomly select a sample of 64 measurements from a
population having μ = 20 and σ = 4
a) Describe the shape of the sampling distribution of the sample mean, x-bar. Do
we need to know anything about the shape of the population? Why or why not?
b) Find the mean and standard deviation of the sampling distribution of the
sample mean.
c) Can you find the probability that a single measurement will be greater than
21? If so, what is the probability?
d) Calculate the probability that we will obtain a sample mean greater than 21.
e) 94% of the sample means will be below what value?
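Parts (b), (d) and (e) can be checked with a short sketch like this one. Part (c) is left out on purpose: without knowing the shape of the population, the probability for a single measurement cannot be found.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 20, 4, 64

# (b) mean of x-bar is 20; sd of x-bar is 4 / 8 = 0.5
se = sigma / sqrt(n)
xbar = NormalDist(mu, se)  # CLT approximation, n = 64 is large

# (d) probability of a sample mean above 21 (2 sd above the mean)
p_over_21 = 1 - xbar.cdf(21)

# (e) value with 94% of sample means below it (94th percentile)
cutoff_94 = xbar.inv_cdf(0.94)

print(f"(b) sd of x-bar = {se}")
print(f"(d) P(x-bar > 21) is about {p_over_21:.4f}")
print(f"(e) 94% of sample means fall below about {cutoff_94:.3f}")
```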
Example 6: David's iPod has about 10,000 songs. The distribution of the play
times for these songs is heavily skewed to the right with a mean of 225
seconds and a standard deviation of 60 seconds.
a) Suppose we select a song at random from this population. Can we
calculate the probability that the song is more than 4 minutes (240 seconds)?
If so, calculate the probability.
b) Suppose we selected an SRS of 10 songs from this population. Can we
safely calculate the probability that the mean play time x-bar is more than 4
minutes (240 seconds)? If so, calculate the probability.
c) Suppose we selected an SRS of 36 songs from this population.
Can we safely calculate the probability that the mean play time x-bar
is more than 4 minutes (240 seconds)? If so, calculate the probability.
d) Using a sample size of 36, the middle 70% of sample means will be in
what range?
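Only parts (c) and (d) are sketched here. For (a) and (b) the population is heavily skewed and the sample is either a single song or only 10 songs, so a Normal calculation would not be safe. The "middle 70%" is taken to mean the interval between the 15th and 85th percentiles.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 225, 60, 36

se = sigma / sqrt(n)       # 60 / 6 = 10 seconds
xbar = NormalDist(mu, se)  # CLT approximation, since n = 36 >= 30

# (c) probability the mean play time exceeds 240 seconds
p_over_240 = 1 - xbar.cdf(240)

# (d) middle 70%: cut 15% off each tail
low, high = xbar.inv_cdf(0.15), xbar.inv_cdf(0.85)

print(f"(c) P(x-bar > 240) is about {p_over_240:.4f}")
print(f"(d) middle 70% of sample means: {low:.1f} to {high:.1f} seconds")
```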
Example 7: A machine used to fill half-gallon-sized milk containers is
regulated so that the amount of milk dispensed has a mean of 64 ounces
and a standard deviation of 0.11 ounces.
You randomly select 45 containers and carefully measure the contents.
The sample mean of the containers is 64.05 ounces. Does the
machine need to be reset? Explain.
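One way to approach this is to ask how unusual a sample mean of 64.05 ounces would be if the machine were working as regulated. The sketch below computes that; where to draw the line for "unusual" is a judgment call and is not set by the notes.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 64, 0.11, 45
observed = 64.05

se = sigma / sqrt(n)        # sd of x-bar for 45 containers
z = (observed - mu) / se    # how many sd's the observed mean is above 64

# Chance of a sample mean this high or higher if the machine is on target
p_this_high = 1 - NormalDist(mu, se).cdf(observed)

print(f"sd of x-bar = {se:.4f}, z = {z:.2f}")
print(f"P(x-bar >= 64.05) is about {p_this_high:.4f}")
```

A sample mean about 3 standard deviations above 64 would be very rare for a properly set machine, which supports resetting it.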
Example 8: The number of accidents per week at a hazardous intersection varies with
μ = 2.2 and σ = 1.4. This distribution is not normal.
a) Can we find the probability that a single week has fewer than 2 accidents? Why or
why not?
b) Let x-bar be the mean number of accidents per week at the intersection during a
year (52 weeks). What is the approximate distribution (mean and standard deviation)
of x-bar according to the Central Limit Theorem?
c) What is the approximate probability that x-bar is less than 2?
d) What is the approximate probability that there are fewer than 100 accidents at the
intersection in a year? (Hint: restate this event in terms of x-bar)
e) 90% of the sample means for accidents will be in what range?
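Parts (b) through (e) can be sketched as below. Two assumptions are made: for (d), "fewer than 100 accidents in a year" is restated as x-bar < 100/52, following the hint, and for (e), the range is taken to be the middle 90% of sample means.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.2, 1.4, 52

# (b) x-bar is approximately Normal with mean 2.2 and sd sigma / sqrt(52)
se = sigma / sqrt(n)
xbar = NormalDist(mu, se)

# (c) probability the weekly average is below 2
p_under_2 = xbar.cdf(2)

# (d) fewer than 100 accidents in 52 weeks means x-bar < 100 / 52
p_under_100_total = xbar.cdf(100 / n)

# (e) middle 90%: cut 5% off each tail
low, high = xbar.inv_cdf(0.05), xbar.inv_cdf(0.95)

print(f"(b) mean = {mu}, sd = {se:.4f}")
print(f"(c) P(x-bar < 2) is about {p_under_2:.4f}")
print(f"(d) P(total < 100) is about {p_under_100_total:.4f}")
print(f"(e) middle 90% of sample means: {low:.2f} to {high:.2f}")
```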
Homework:
p. 275 #17, 25, 29, 33, 37, 42
Note: standard error of the mean =
standard deviation of the sampling
distribution of sample means