Math 4, Unit 1, Central Limit Thm/Confidence Intervals
Wksht. 8.04 – Discovering the Central Limit Theorem
Name: _________________________
Date: ____________
This unit is extremely important because it presents the central limit theorem, which forms the foundation for
estimating population parameters and hypothesis testing – topics studied at length in Statistics and AP Statistics.
The central limit theorem (CLT) is essential for inferential statistics. The goal of inferential statistics is to use a
sample to make an inference about a population.
The CLT states that if n is sufficiently large, the means of random samples of size n
from a population with mean $\mu$ and standard deviation $\sigma$ are approximately normally distributed with
mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
One way to simulate the CLT is to use the last four digits of your social security number (these digits are effectively random).
On the (0-9) table on the board, put a tally mark under each digit that appears in the last four digits of your
social security number. If a digit appears more than once in your number, put one tally mark for each
occurrence. Then fill in the table below with the total count for each digit.
Digit:        0    1    2    3    4    5    6    7    8    9
# of Digits:  ___  ___  ___  ___  ___  ___  ___  ___  ___  ___
1. To the right, sketch and label the graph of
the boxplot with minimum, Q1, median,
Q3, and maximum.
2. Describe the distribution:
If we were able to enter, say, a million more social security numbers, picture what the
distribution would look like.
Calculate the mean of the last four digits of your social security number and record it on the board.
3. Use the means to create a histogram.
4. Describe the distribution:
One key element of the central limit theorem tells us: if the sample size is large enough, the distribution of
sample means can be approximated by a normal distribution, even if the original population is not
normally distributed.
5. Find the mean and standard deviation of the data in the table.
What is the theoretical mean & standard deviation?
Compare these values to the CLT values (remember, $\sigma_{\bar{x}} = \sigma/\sqrt{n}$).
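The class activity above can also be tried on a computer. The following is a minimal sketch, assuming each of the digits 0-9 is equally likely and each "student" contributes a sample of n = 4 digits; the function name, the number of simulated students, and the seed are illustrative choices, not part of the worksheet.

```python
import random
import statistics

def simulate_digit_means(num_samples=10_000, sample_size=4, seed=0):
    """Return the mean of `sample_size` random digits, repeated `num_samples` times."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.randrange(10) for _ in range(sample_size))
        for _ in range(num_samples)
    ]

# Theoretical values for one digit drawn uniformly from 0-9 (assumed population).
digits = range(10)
mu = statistics.fmean(digits)       # population mean
sigma = statistics.pstdev(digits)   # population standard deviation

n = 4
means = simulate_digit_means(sample_size=n)

print("population mean:", mu)
print("population std dev:", round(sigma, 3))
print("mean of sample means:", round(statistics.fmean(means), 3))
print("std dev of sample means:", round(statistics.stdev(means), 3))
print("CLT prediction sigma/sqrt(n):", round(sigma / n ** 0.5, 3))
```

The mean of the simulated sample means should land close to the population mean, while their standard deviation should land close to sigma/sqrt(n) rather than sigma.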
The Central Limit Theorem and the Sampling Distribution of $\bar{x}$
Given:
1. The random variable x has a distribution (which may or may not be normal) with mean $\mu$ and
standard deviation $\sigma$.
2. Simple random samples all of the same size n are selected from the population.
Conclusions:
1. The distribution of sample means $\bar{x}$ will, as the sample size increases, approach a normal
distribution.
2. The mean of all sample means is the population mean $\mu$. (i.e. the normal distribution from
conclusion 1 has mean $\mu$.)
3. The standard deviation of all sample means is $\sigma/\sqrt{n}$. (i.e. the normal distribution from
conclusion 1 has standard deviation $\sigma/\sqrt{n}$.)
Practical Rules Commonly Used:
1. If the original population is not itself normally distributed, here is a common guideline: For
samples of size n greater than 30, the distribution of the sample means can be approximated
reasonably well by a normal distribution. (There are rare exceptions.) As n gets larger, the
approximation gets better.
2. If the original population is normally distributed, then (of course) the sample means will be
normally distributed (n can be any value).
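To see the first practical rule in action, here is a rough sketch that draws sample means from a strongly right-skewed population and measures how lopsided their distribution is. The exponential population, the helper names, and the sample counts are assumptions chosen for illustration; the worksheet does not specify a population.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score: near 0 for a symmetric (e.g. normal) shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def sample_means(n, num_samples=5000, seed=1):
    """Means of `num_samples` samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]

# Compare a very small sample size with the n = 30 guideline.
skew_by_n = {n: skewness(sample_means(n)) for n in (2, 30)}
for n, skew in skew_by_n.items():
    print(f"n = {n:>2}: skewness of sample means = {skew:.2f}")
```

The skewness for n = 30 should be much closer to 0 than for n = 2, which is what "approximated reasonably well by a normal distribution" looks like numerically.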
Common Notations:
As you know, we use $\mu$ for the mean of a population. We use $\mu_{\bar{x}}$ for the mean of the sample means.
We use (you guessed it) $\sigma_{\bar{x}}$ for the standard deviation of the sample means.
So our formulas are: $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
$\sigma_{\bar{x}}$ is often called the standard error of the mean.
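As a quick illustration of the standard error formula, the sketch below shows how the standard deviation of the sample means shrinks as n grows. The function name `standard_error` is made up for this example, and sigma = 29 is borrowed from the men's weights used in the example problem on this worksheet.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    return sigma / math.sqrt(n)

sigma = 29  # population standard deviation, in pounds
for n in (1, 12, 36, 64):
    print(f"n = {n:>2}: standard error = {standard_error(sigma, n):.2f} lb")
```

Quadrupling the sample size only halves the standard error, because n sits under a square root.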
EXAMPLE: The Sky Lift at Six Flags carries
patrons from one end of the park to the other. The
car, called a gondola, bears a plaque stating that
the maximum capacity is 12 people or 2004
pounds. Because men tend to weigh more than
women, a “worst-case” scenario involves 12
passengers who are all men. Men have weights
that are normally distributed with a mean of 172
lb. and standard deviation of 29 lb.
a) Find the probability that if an individual man is randomly selected, his weight will be
greater than 167 lb. (Why 167?) (This is an old z-score problem.)
b) Find the probability that 12 randomly selected men will have a mean weight greater than 167 lb.
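Parts (a) and (b) differ only in which standard deviation is used. The sketch below, one possible way to check hand work, uses Python's standard-library `NormalDist`; the variable names are illustrative. Because it does not round z to two decimal places, its results may differ from z-table answers in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 172, 29, 12   # men's weights (lb) and the gondola capacity
limit = 167                  # weight per person being tested

# (a) One man: use the population standard deviation sigma.
individual = NormalDist(mu, sigma)
p_individual = 1 - individual.cdf(limit)

# (b) Mean of 12 men: use the standard error sigma / sqrt(n).
sample_mean = NormalDist(mu, sigma / sqrt(n))
p_mean = 1 - sample_mean.cdf(limit)

print(f"P(one man weighs more than {limit} lb)     = {p_individual:.4f}")
print(f"P(mean of {n} men is more than {limit} lb) = {p_mean:.4f}")
```

The second probability is larger because sample means cluster more tightly around 172 lb than individual weights do.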
EXERCISES:
For items 1 – 3, use the example problem.
1. If 36 men are randomly selected, find the probability that they have a mean weight less than 167.
2. If 64 men are randomly selected, find the probability that they will have a mean weight between
170 and 175.
3. a) If 25 men are randomly selected, find the probability that they will have a mean weight
between 160 and 180.
b) Why can the central limit theorem be used in part (a), even though the sample size does not
exceed 30?
4. Assume that cans of soda are filled so that the actual amounts have a mean of 12.00 oz.
and a standard deviation of 0.11 oz.
a) Find the probability that a sample of 36 cans will have a mean amount of at
least 12.05 oz.
b) Based on the result in part (a), is it reasonable to believe that the cans are
actually filled with a mean of 12 oz? If the mean is not 12 oz, are customers
being cheated?
5. The manager of an electronics store is concerned that his suppliers have been selling him
TV sets with lower than average quality. Research shows that replacement times for TV
sets have a mean of 8.2 years and a standard deviation of 1.1 years. He randomly selects
50 of the TV sets that he sold and finds that the mean replacement time was 7.8 years.
a) Find the probability that 50 randomly selected TV sets will have a mean
replacement time of 7.8 years or less.
b) Based on the result in part (a), does it appear that the electronics company has
been selling TV sets that have lower than average quality?
6. Scores for men on the SAT verbal portion are normally distributed with a mean of 509
and a standard deviation of 112.
a) If 16 men take the test, find the probability that the mean of these 16 is 590 or
more.
b) If those 16 had taken a course to improve verbal SAT scores, is there evidence
that the course was effective?
Answers (notes portion):
1. Answers vary
2. evenly distributed; a rectangle
3. Answers vary
4. normally distributed
5. means should be the same, std dev different
EXAMPLE: a) 0.5675;
b) Approach: Use the CLT because we are dealing with the mean of 12 men, not just one man. Even though the sample size
is not greater than 30, we can use the CLT because weights are normally distributed.
$\mu_{\bar{x}} = \mu = 172$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{29}{\sqrt{12}} \approx 8.37$
$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{167 - 172}{29/\sqrt{12}} = \frac{-5}{8.37} \approx -0.60$
So, the left shaded area is 0.2743 & the right is 1 – 0.2743 = 0.7257
Answers:
1. 0.1515
2. 0.5055
3. a) 0.8970; b) original population is normally distributed
4. a) 0.0032; b) the cans actually have more than 12 oz., so the customers are not being cheated.
5. a) 0.0051; b) yes, the probability of 8.2 being the actual mean is very low.
6. a) 0.0019; b) yes; c) original population is normally distributed
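The numerical answers above can be cross-checked with a short script. This is only a sketch: the helper `prob_mean_between` is a hypothetical name, and the tolerance of 0.003 is an arbitrary allowance for the fact that the worksheet answers come from a z-table with z rounded to two decimals.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_between(mu, sigma, n, low=None, high=None):
    """P(low < sample mean < high) for samples of size n; None means unbounded."""
    dist = NormalDist(mu, sigma / sqrt(n))  # sampling distribution of the mean
    upper = 1.0 if high is None else dist.cdf(high)
    lower = 0.0 if low is None else dist.cdf(low)
    return upper - lower

# (computed probability, worksheet answer) for each numerical exercise
checks = {
    "1":  (prob_mean_between(172, 29, 36, high=167), 0.1515),
    "2":  (prob_mean_between(172, 29, 64, low=170, high=175), 0.5055),
    "3a": (prob_mean_between(172, 29, 25, low=160, high=180), 0.8970),
    "4a": (prob_mean_between(12.00, 0.11, 36, low=12.05), 0.0032),
    "5a": (prob_mean_between(8.2, 1.1, 50, high=7.8), 0.0051),
    "6a": (prob_mean_between(509, 112, 16, low=590), 0.0019),
}

for item, (computed, listed) in checks.items():
    status = "ok" if abs(computed - listed) < 0.003 else "check"
    print(f"{item:>2}: computed {computed:.4f}, listed {listed:.4f} -> {status}")
```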