Transcript
```9.3: Sample Means
Warm Up = HW Pass

With regard to a particular gene, the percentages of
genotypes AA, Aa, and aa in a particular population are
60%, 30%, and 10%, respectively. Furthermore, the
percentages of these genotypes that contract a certain
disease are 1%, 5%, and 20%, respectively. If a person
does contract the disease, what is the probability that the
person is genotype AA?
a) .006
b) .010
c) .041
d) .146
e) .600
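A sketch of the solution using Bayes' rule, where D (a label introduced
here, not on the original slide) is the event that a person contracts the disease:
$P(AA \mid D) = \frac{P(AA) P(D \mid AA)}{P(AA) P(D \mid AA) + P(Aa) P(D \mid Aa) + P(aa) P(D \mid aa)}$
$= \frac{(0.60)(0.01)}{(0.60)(0.01) + (0.30)(0.05) + (0.10)(0.20)} = \frac{0.006}{0.041} \approx 0.146$
This corresponds to choice d).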
Sample Proportions vs.
Sample Means

Sample proportions (Section 9.2) are most
often used when we are interested in
categorical variables.


For example, “yes” or “no” questions.
Sample means are most often used when we
are interested in numerical or quantitative
variables.

For example, the average rate of return on stocks
in the stock market.
The distribution of returns for 1,815 New York
Stock Exchange common stocks in 1987
The mean return for the 1,815
stocks is μ = -3.5%, and the
distribution shows a very wide
spread. Each stock represents an individual
sample, so the sample size is n = 1.
The distribution of returns for all possible
portfolios that invested equal amounts in
each of five stocks in 1987
A portfolio is just a sample of 5 stocks,
and its return is the average return for the
5 stocks chosen.
The mean return for all
portfolios is still -3.5%, but the
variation among portfolios is much less than the
variation among individual
stocks. The sample size is n =
5. Note that a larger sample
size implies less variability.
Life Lessons Learned from the
Stock Market
1) Averages are less variable than individual
observations.
2) Averages are more normal than individual
observations.
These two facts contribute to the popularity
of sample means in statistical inference.
Statistical inference is the process of using
samples to answer specific questions with
a known degree of confidence.
Mean and Standard Deviation
of a Sample Mean


Suppose that x-bar is the mean of an SRS of size
n drawn from a large population with mean μ
and standard deviation σ.
Then the mean of the sampling distribution of
x-bar is
$\mu_{\bar{x}} = \mu$

The standard deviation is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Sample Mean Basics




The sample mean x-bar is an unbiased
estimator of the population mean μ.
The values of x-bar are less spread out for
larger samples.
You should only use the recipe for standard
deviation of x-bar when the population is at
least 10 times as large as the sample.
These facts are true no matter what the
shape of the population distribution.
Ex 1: Young Women’s Heights



The height of young women varies
approximately according to the N(64.5, 2.5)
distribution.
We could safely say that if we repeatedly
select one woman at random, the heights we
get will also follow this distribution.
But, what will happen if we begin choosing
samples of 10 women at random?

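What will be the mean and standard deviation of
the sampling distribution of the sample mean height x-bar?
$\mu_{\bar{x}} = \mu = 64.5$ inches
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{10}} \approx 0.79$ inch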
Sampling Distribution of a Sample
Mean from a Normal Population

Draw an SRS of size n from a population that
has the normal distribution with mean μ and
standard deviation σ. Then the sample mean
x-bar has the normal distribution
$N(\mu, \sigma / \sqrt{n})$
Ex 2: More on Young Women’s Heights


What is the probability that a randomly
selected young woman is taller than 66.5
inches?
What about the probability that the mean
height of an SRS of 10 young women is
greater than 66.5 inches?
• Take a few minutes to try and answer these
questions using normal calculations. (start
with converting to z-scores)
What is the probability that a randomly
selected young woman is taller than 66.5
inches?
$z = \frac{66.5 - 64.5}{2.5} = 0.80$
$P(X > 66.5) = P(z > 0.80) = 1 - 0.7881 = 0.2119$
The probability of
choosing a young
woman at random whose
height exceeds 66.5
inches is about 0.21.
What is the probability that the mean
height of an SRS of 10 young women is
greater than 66.5 inches?
$z = \frac{66.5 - 64.5}{0.79} = 2.53$
$P(\bar{x} > 66.5) = P(z > 2.53) = 1 - 0.9943 = 0.0057$
It is very unlikely (< 1% chance) that we would
draw an SRS of 10 young women whose
average height exceeds 66.5 inches.
The sampling distribution of the mean height x-bar of
10 young women compared with the distribution of
the height of a single woman chosen at random
What have we learned?

The average of n results (the sample mean x-bar)
is less variable than a single measurement.
Question…

Does x-bar still have a normal
distribution even when the population
distribution is not normal?
Central Limit Theorem

Draw an SRS of size n from any population
whatsoever with mean μ and standard
deviation σ. When n is large, the sampling
distribution of the sample mean x-bar is very
close to the Normal distribution
$N(\mu, \sigma / \sqrt{n})$
How large is large enough?

How large a sample size n is needed for x-bar
to be close to Normal depends on the
population distribution.

More observations are required if the shape
of the population distribution is far from
Normal.
Three scenarios to consider…
1) The population has a Normal distribution –
shape of sampling distribution: Normal,
regardless of sample size.
2) Any population shape, small n – shape of
sampling distribution: similar to shape of
the parent population.
3) Any population shape, large n – shape of
sampling distribution: close to Normal
(Central Limit Theorem)
The sampling distribution is normal if the
population distribution is normal. It will be
approximately normal for large samples
regardless of the shape of the population
distribution.
```
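The calculations in Ex 1 and Ex 2, and the Central Limit Theorem claim, can be checked numerically. The following is a minimal sketch using only Python's standard library. The function names, the choice of an exponential population with mean 1 and standard deviation 1, and the sample size of 50 with 5,000 repetitions are all assumptions made for illustration; none of them come from the slides.

```python
import random
import statistics
from math import sqrt
from statistics import NormalDist

# Values taken from Ex 1 and Ex 2: heights follow N(64.5, 2.5), samples of 10.
MU, SIGMA, N = 64.5, 2.5, 10


def sd_of_sample_mean(sigma, n):
    """Standard deviation of x-bar for an SRS of size n: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_above(threshold, mu, sigma):
    """P(X > threshold) when X follows the Normal distribution N(mu, sigma)."""
    return 1.0 - NormalDist(mu, sigma).cdf(threshold)


def simulate_sample_means(draw, n, reps, seed=0):
    """Return `reps` sample means, each computed from n independent draws.

    `draw` is a hypothetical callback that takes a random.Random instance
    and returns one observation from the population.
    """
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]


# Ex 1: spread of the mean height of 10 women.
sd_xbar = sd_of_sample_mean(SIGMA, N)

# Ex 2: one woman taller than 66.5 in. versus a mean of 10 above 66.5 in.
p_single = prob_above(66.5, MU, SIGMA)
p_mean = prob_above(66.5, MU, sd_xbar)

# CLT illustration (assumed setup): a strongly skewed exponential population
# with mean 1 and standard deviation 1, sampled 50 observations at a time.
means = simulate_sample_means(lambda rng: rng.expovariate(1.0), n=50, reps=5000)

print(f"sd of x-bar:          {sd_xbar:.4f}")
print(f"P(X > 66.5):          {p_single:.4f}")
print(f"P(x-bar > 66.5):      {p_mean:.4f}")
print(f"mean of sample means: {statistics.fmean(means):.4f}")
print(f"sd of sample means:   {statistics.stdev(means):.4f}  (theory: {1 / sqrt(50):.4f})")
```

Because the code does not round the standard deviation of x-bar to 0.79 before computing the z-score, its probabilities can differ from the table-based values in the slides in the later decimal places. In the simulation, the mean of the sample means should sit near the population mean of 1 and their standard deviation near 1/√50, even though the population itself is far from Normal.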