Transcript
```
Section 6.3: How to See the Future
Goal: To understand how sample
means vary in repeated samples.
How do sample means vary?
 Here’s a graph of the survival time of
72 guinea pigs, each injected with a drug.
 Note: The mean and std. dev. for the
population are: μ = 141.8, σ = 109.2
[Figure: histogram of survival time (days), 0 to 600, with frequency from 0 to 30]
Understanding the POPULATION Data
 The survival times for the population of
guinea pigs are skewed to the right.
 The mean, μ, is 141.8.
 The standard deviation, σ, is large (109.2)!
Distribution of Sample Means
 Suppose that lots and lots of us each
randomly sampled 12 guinea pigs and found
the average survival time for each set of 12
guinea pigs. We would each have a sample
mean, x̄.
 Would all of our x̄ values be the same?
 Use yesterday’s simulation data (consisting of 106
sample means) to see if there are any patterns.
Graph of 106 sample means
[Figure: histogram of the 106 sample means, 100 to 200, with count from 0 to 20]
Graph of the Sample Means
 What is the mean of the sample
means, μ_x̄?
 What is the standard deviation of the
sample means, σ_x̄?
 What is the shape of the graph of the
sample means?
Graph of the Sample Means
 What is the mean of the sample
means, μ_x̄? 142.01
 What is the standard deviation of
the sample means, σ_x̄? 27.94
 What is the shape of the graph of
the sample means? Roughly bell-shaped
Comparing Population Data with
Sample Mean Data
 μ_x̄ ≈ μ   What does this say in words?
 σ_x̄ < σ   What does this say in words?
 The graph of the population data is
skewed but the graph of the sample mean
data is bell-shaped.
Standard Error
 Since “standard deviation of the sample means” is a
mouthful, we’ll instead call this quantity standard
error.
 Remember, we have two standard deviations floating
around – the first is the population standard
deviation and the second is the standard error.
 The first describes how much spread there is in the
population. The second describes how much spread
there is in the sample means.
Understanding Standard Error
 How does the population standard
deviation relate to the standard error?
 σ_x̄ = σ / √n
 What does this formula say?
 When the sample size is large, the standard
error is small.
 When the sample size is small, the standard
error is large.
 When n=1, the two values are equal. Why?
Ex 2: Coin Problem
 Imagine you go home, collect all of
the coins in your home, and make a
graph of the age of each coin.
 This graph represents the graph of
the population data.
 What do you expect its shape to be?
Sample means of coins
 Take repeated samples, each of size 5
coins, and find the mean age of the
coins. If you were to make a graph
of the sample means, what would you
expect it to look like?
 What about samples of size 10? Or of size 25?
Guinea Pigs and Coins
 In both situations the population graphs
were severely skewed, yet the graph of the
sample mean data was bell-shaped.
 In both cases the graph of the sample
mean data is centered at the population
mean.
 In both cases the standard error is the
population standard deviation divided by
the square root of the sample size.
 Hmmmmm…..
Coincidence?
 No, we couldn’t be that lucky! In
fact, this is the Central Limit Theorem
in action.
Central Limit Theorem
 Suppose that a random sample of size n is
taken from a large population in which the
variable you are measuring has mean μ and
standard deviation σ.
Then, provided n is at least 30, the
sampling distribution of the sample
means is roughly bell-shaped, centered at
the population mean, μ, with standard
error equal to σ/√n.
Since the graph of sample means
will always be bell-shaped….
 68% of the sample means should
come within one standard error of the
center (population mean).
 95% of the sample means should
come within two standard errors of
the center (population mean).
 99.7% of the sample means should
come within three standard errors of
the center (population mean).
Confidence Intervals to Estimate μ
 What is the average number of hours
Ship students sleep per night during
final exam week?
 Who is the population?
 What is the parameter we are interested
in estimating?
Confidence Intervals, Cont’d
 Estimate the population mean by
taking a random sample of 50 college
students and finding a sample mean.
 Suppose you find that the sample mean
is 5.8 hours.
Confidence Intervals, cont’d.
 Can we say that μ is 5.8?
 If the sample was indeed random, is it
reasonable to believe μ is close to 5.8?
 Reason? Central Limit Theorem.
 So the 95% confidence interval to estimate μ is:
x̄ ± 2σ/√n
 What is the formula for a 68% confidence
interval?
 How about a 99.7% confidence interval?
```
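
The ideas in these slides can be tried out with a short simulation. The sketch below is only an illustration under stated assumptions: the population is synthetic (right-skewed exponential data with a mean near 140, not the actual guinea pig data), the function names are made up for this example, and the value σ = 1.5 hours in the sleep example is hypothetical, since the slides give no population standard deviation for it.

```python
import math
import random
import statistics


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def simulate_sample_means(population, n, repeats, seed=0):
    """Draw `repeats` random samples of size n and return each sample's mean."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]


def confidence_interval(sample_mean, sigma, n, multiplier=2):
    """Interval sample_mean +/- multiplier * sigma / sqrt(n).

    multiplier=2 gives the slides' 95% interval; 1 and 3 give 68% and 99.7%.
    """
    half_width = multiplier * standard_error(sigma, n)
    return sample_mean - half_width, sample_mean + half_width


# ASSUMPTION: synthetic right-skewed population standing in for survival times.
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1 / 140) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Repeated samples of size 12, as in the guinea pig activity.
means = simulate_sample_means(population, n=12, repeats=2_000)
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

print("population mean:", round(mu, 1), "| mean of sample means:", round(mean_of_means, 1))
print("sigma / sqrt(n):", round(standard_error(sigma, 12), 1),
      "| std. dev. of sample means:", round(sd_of_means, 1))

# Standard error predicted by the slides' guinea pig numbers (sigma = 109.2, n = 12).
guinea_pig_se = standard_error(109.2, 12)

# Sleep example: sample mean 5.8 and n = 50 come from the slides;
# sigma = 1.5 is a HYPOTHETICAL value chosen only to make the formula concrete.
low, high = confidence_interval(5.8, sigma=1.5, n=50)
print("95% interval for mean hours of sleep:", round(low, 2), "to", round(high, 2))
```

With the slides' guinea pig numbers, the formula predicts a standard error of about 31.5, which is in the same range as the 27.94 observed from the 106 simulated sample means. Changing `multiplier` to 1 or 3 answers the last two questions on the final slide.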