Section 6.3: How to See the Future
Goal: To understand how sample
means vary in repeated samples.
How do sample means vary?
 Here’s a graph of the survival time of
72 guinea pigs, each injected with a drug.
 Note: The mean and std. dev. for the
population are: $\mu = 141.8$, $\sigma = 109.2$
[Figure: histogram of survival time (days), 0 to 600, with frequency (0 to 30) on the vertical axis]
Understanding the POPULATION Data
 The survival times for the population of
guinea pigs are skewed to the right.
 The mean, $\mu$, is 141.8
 The standard deviation, $\sigma = 109.2$, is large!
Distribution of Sample Means
 Suppose that lots and lots of us each
randomly sampled 12 guinea pigs and found
the average survival time for each set of 12
guinea pigs. We would each have a sample
mean, $\bar{x}$.
 Would all of our $\bar{x}$ values be the same?
 Use yesterday’s simulation data (consisting of 106
sample means) to see if there are any patterns.
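The repeated-sampling experiment described above can be sketched in Python. The actual 72 survival times are not reproduced in these slides, so this sketch assumes a synthetic right-skewed stand-in (exponential draws with a mean of roughly 140 days). The names `population`, `sample_means`, and `simulate_sample_means` are illustrative only, not part of the course materials.

```python
import random
import statistics

def simulate_sample_means(population, sample_size, num_samples, seed=0):
    """Draw num_samples random samples (without replacement) and return their means."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, sample_size))
            for _ in range(num_samples)]

# ASSUMPTION: synthetic stand-in for the 72 survival times (right-skewed).
rng = random.Random(1)
population = [rng.expovariate(1 / 140) for _ in range(72)]

# 106 samples of 12 animals each, mirroring the class simulation.
sample_means = simulate_sample_means(population, sample_size=12, num_samples=106)

print("population mean:       ", round(statistics.mean(population), 1))
print("mean of sample means:  ", round(statistics.mean(sample_means), 1))
print("population std. dev.:  ", round(statistics.pstdev(population), 1))
print("std. dev. of the means:", round(statistics.stdev(sample_means), 1))
```

The sample means differ from one another, but they cluster around the population mean and spread out far less than the individual survival times do.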
Graph of 106 sample means
[Figure: histogram of the 106 sample means, with count on the vertical axis]
Graph of the Sample Means
 What is the mean of the sample
means, $\mu_{\bar{x}}$?
 What is the standard deviation of the
sample means, $\sigma_{\bar{x}}$?
 What is the shape of the graph of the
sample means?
Graph of the Sample Means
 What is the mean of the sample
means, $\mu_{\bar{x}}$? 142.01
 What is the standard deviation of
the sample means, $\sigma_{\bar{x}}$? 27.94
 What is the shape of the graph of
the sample means? Roughly bell-shaped
Comparing Population Data with
Sample Mean Data
 $\mu_{\bar{x}} \approx \mu$ (142.01 vs. 141.8). What does this say in
words?
 $\sigma_{\bar{x}} < \sigma$ (27.94 vs. 109.2). What does this say in
words?
 The graph of the population data is
skewed but the graph of the sample mean
data is bell-shaped.
Standard Error
 Since “standard deviation of the sample means” is a
mouthful, we’ll instead call this quantity standard
error.
 Remember, we have two standard deviations floating
around – the first is the population standard
deviation and the second is the standard error.
 The first describes how much spread there is in the
population. The second describes how much spread
there is in the sample means.
Understanding Standard Error
 How does the population standard
deviation relate to the standard error?
 $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
 What does this formula say?
 When the sample size is large, the standard
error is small.
 When the sample size is small, the standard
error is large.
 When n=1, the two values are equal. Why?
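To see how the standard error responds to sample size, here is a minimal sketch that divides the population standard deviation by the square root of n. The value 109.2 is the population standard deviation quoted for the guinea pig data. The function name `standard_error` and the list of sample sizes are illustrative choices.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    return sigma / math.sqrt(n)

sigma = 109.2  # population standard deviation quoted in the slides

# The standard error shrinks as the sample size grows.
for n in (1, 12, 30, 100):
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n):.2f}")
```

For n = 12 the formula gives about 31.5, in the same neighborhood as the 27.94 observed in the 106 simulated sample means. For n = 1 each "sample mean" is just a single observation, so the two standard deviations coincide.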
Ex 2: Coin Problem
 Imagine you go home, collect all of
the coins in your home, and make a
graph of the age of each coin.
 This graph represents the graph of
the population data.
 What do you expect its shape to be?
Sample means of coins
 Take repeated samples, each of size 5
coins, and find the mean age of the
coins. If you were to make a graph
of the sample means, what would you
expect it to look like?
 How about if instead you took
samples of size 10? Or of size 25?
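The coin experiment can be tried on a computer before collecting real coins. The sketch below assumes a made-up, right-skewed population of 500 coin ages (most coins fairly new, a few very old). The variable names and the choice of 2000 repeated samples are illustrative, not taken from the slides.

```python
import random
import statistics

rng = random.Random(42)

# ASSUMPTION: a made-up, right-skewed population of 500 coin ages in years.
coin_ages = [int(rng.expovariate(1 / 12)) for _ in range(500)]
sigma = statistics.pstdev(coin_ages)

spread_by_size = {}
for n in (5, 10, 25):
    # 2000 repeated samples of n coins each, drawn with replacement.
    means = [statistics.fmean(rng.choices(coin_ages, k=n)) for _ in range(2000)]
    spread_by_size[n] = statistics.stdev(means)
    print(f"n = {n:2d}  spread of sample means = {spread_by_size[n]:.2f}"
          f"  sigma/sqrt(n) = {sigma / n ** 0.5:.2f}")
```

Larger samples give sample means that are packed more tightly around the population mean, and the observed spread tracks the population standard deviation divided by the square root of n.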
Guinea Pigs and Coins
 In both situations the population graphs
were severely skewed, yet the graph of the
sample mean data was bell-shaped.
 In both cases the graph of the sample
mean data is centered at the population
mean.
 In both cases the standard error is the
population standard deviation divided by
the square root of the sample size.
 Hmmmmm…..
Coincidence?
 No, we couldn’t be that lucky! In
fact, this is the Central Limit Theorem
in action.
Central Limit Theorem
 Suppose that a random sample of size n is
taken from a large population in which the
variable you are measuring has mean $\mu$ and
standard deviation $\sigma$.
Then, provided n is at least 30, the
sampling distribution of the sample
means is roughly bell-shaped, centered at
the population mean, $\mu$, with standard
error equal to $\sigma / \sqrt{n}$.
Since the graph of sample means
will be roughly bell-shaped….
 About 68% of the sample means should
come within one standard error of the
center (population mean).
 About 95% of the sample means should
come within two standard errors of
the center (population mean).
 About 99.7% of the sample means should
come within three standard errors of
the center (population mean).
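These three percentages can be checked by simulation. The sketch below assumes a synthetic right-skewed population (exponential draws with a mean of roughly 140), takes 5000 samples of size 30, and counts how many sample means land within one, two, and three standard errors of the population mean. All names and sizes here are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(7)

# ASSUMPTION: synthetic right-skewed population with a mean of roughly 140.
population = [rng.expovariate(1 / 140) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 30  # sample size; the theorem as stated asks for n of at least 30
std_error = sigma / n ** 0.5

# 5000 repeated samples of size n, drawn with replacement.
means = [statistics.fmean(rng.choices(population, k=n)) for _ in range(5_000)]

coverage = {}
for k in (1, 2, 3):
    inside = sum(abs(m - mu) <= k * std_error for m in means)
    coverage[k] = inside / len(means)
    print(f"within {k} standard error(s): {coverage[k]:.1%}")
```

Because the bell shape is only approximate, the simulated fractions should land near 68%, 95%, and 99.7% rather than match them exactly.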
Confidence Intervals to Estimate $\mu$
 What is the average number of hours
Ship students sleep per night during
final exam week?
 Who is the population?
 What is the parameter we are interested
in estimating?
Confidence Intervals, cont’d.
 Estimate the population mean by
taking a random sample of 50 college
students and finding a sample mean.
 Suppose you find that the sample mean
is 5.8 hours.
Confidence Intervals, cont’d.
 Can we say that $\mu$ is 5.8?
 If the sample was indeed random, is it
reasonable to believe $\mu$ is close to 5.8?
 Reason? Central Limit Theorem.
 So the 95% confidence interval to estimate $\mu$ is:
$\bar{x} \pm 2 \dfrac{\sigma}{\sqrt{n}}$
 What is the formula for a 68% confidence
interval?
 How about a 99.7% confidence interval?
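As a rough numerical illustration, the sketch below computes the interval for the sleep example, using the sample mean of 5.8 and the sample size of 50 from the slides. The slides do not give a population standard deviation, so the value `sigma = 1.5` is a purely hypothetical assumption, and `confidence_interval` is an illustrative helper name. The multipliers 1, 2, and 3 follow the 68%, 95%, and 99.7% rule stated earlier.

```python
import math

def confidence_interval(sample_mean, sigma, n, num_std_errors=2):
    """Return (low, high) = sample_mean -/+ num_std_errors * sigma / sqrt(n)."""
    margin = num_std_errors * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

sample_mean = 5.8  # hours, from the slides
n = 50             # sample size, from the slides
sigma = 1.5        # ASSUMPTION: hypothetical population std. dev.; the slides give none

for level, k in (("68%", 1), ("95%", 2), ("99.7%", 3)):
    low, high = confidence_interval(sample_mean, sigma, n, k)
    print(f"{level:>5} interval: {low:.2f} to {high:.2f} hours")
```

Asking for more confidence widens the interval, while a larger sample narrows it.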