27 October 2003
6.1 Estimating with Confidence
Sampling
We have a known population.
We ask “what would happen if I drew lots and lots of random samples from this population?”
Inference
We have a known sample.
We ask “what kind of population might this sample have been drawn from?”
Looking ahead
Chapter 6: Estimating mu from sample data
Chapter 7: Estimating mu and sigma from sample data
Chapter 8: Estimating a population proportion from sample data
The Central Limit Theorem
If you draw simple random samples of size n from a population with mean mu and variance sigma^2, then
• the expected mean of x-bar is mu
• the expected variance of x-bar is sigma^2 / n
• the expected histogram of x-bar is approximately normal
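The three claims above can be checked with a small simulation. This is only a sketch: the skewed (exponential) population with mean 1 and variance 1, the sample size of 50, and the function name are assumptions chosen for illustration, not values from the slides.

```python
import random
import statistics

def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples random samples of size n from an exponential
    population (mean 1, variance 1) and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]

n = 50
means = simulate_sample_means(n=n, num_samples=4000)

mean_of_means = statistics.fmean(means)      # should be close to mu = 1
var_of_means = statistics.variance(means)    # should be close to sigma^2 / n = 0.02
sd_of_means = var_of_means ** 0.5

# Share of sample means within 2 standard deviations of the overall mean.
# For an approximately normal histogram this is roughly 95%.
within_2sd = sum(abs(m - mean_of_means) < 2 * sd_of_means for m in means) / len(means)

print(round(mean_of_means, 3), round(var_of_means, 4), round(within_2sd, 3))
```

Even though the population itself is strongly skewed, the sample means should cluster symmetrically around mu.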
Estimating mu from sample data
estimated mu = sample mean
Why?
• Because the Central Limit Theorem tells us that, if we drew lots and lots of samples, the sample means would average out to mu.
• (The sample mean is an unbiased estimator of mu.)
Estimating mu from sample data
Is this true?
mu = sample mean
Why not?
• Because the Central Limit Theorem tells us that, if we drew lots and lots of samples, the sample means would vary. Some are bigger than mu and others are smaller than mu.
Estimating mu from sample data
What about this?
mu = somewhere in the neighborhood of the sample mean
But how do we define neighborhood?
Example 6.1
• We have a sample of 500 high-school seniors, selected at random from the population of all high-school seniors in California. For the 500 kids in the sample, their average score on the math section of the SAT is 461.
• Known: sample mean is 461
• Unknown: population mean
• Assumed: population sigma is 100
The Central Limit Theorem
If you draw simple random samples of size 500 from a population with mean mu and standard deviation of 100, then
• the expected mean of x-bar is mu
• the expected st dev of x-bar is about 4.5 (that is, 100 / sqrt(500))
• the expected histogram of x-bar is approximately normal
Table A tells us...
...about 68% of sample means should fall within 4.5 points of mu
...about 95% of sample means should fall within 9 points of mu
...about 99.7% of sample means should fall within 13.5 points of mu
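The numbers on this slide can be reproduced directly. A minimal sketch, using Python's standard normal distribution in place of Table A; the variable names are illustrative. The slides round the standard deviation of x-bar to 4.5 before multiplying, so they show 9 and 13.5 where the unrounded values are slightly smaller.

```python
import math
from statistics import NormalDist

sigma = 100   # assumed population standard deviation (Example 6.1)
n = 500       # sample size (Example 6.1)

# Standard deviation of the sample mean x-bar.
se = sigma / math.sqrt(n)
print(round(se, 2))  # 4.47

# Share of a normal distribution within 1, 2, and 3 standard deviations.
standard_normal = NormalDist()
shares = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    # k * se is the distance from mu, in SAT points.
    print(k, round(k * se, 1), round(share, 4))
```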
About 95% of sample means should fall within 9 points of mu
[Figure: distribution of sample means, with mu-9, mu, and mu+9 marked on the axis]
[Figure: distribution of sample means if mu is 452; axis "Sample Means", 435 to 485]
[Figure: distribution of sample means if mu is 470; axis "Sample Means", 435 to 485]
The 95% Confidence Interval
• If mu is any number less than 452, then our sample mean would be surprisingly large.
• If mu is any number greater than 470, then our sample mean would be surprisingly small.
• Therefore, the 95% confidence interval for mu is the range from 452 to 470.
• If mu is inside this range, then our sample is not unusual (according to the 95% rule).
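The interval from 452 to 470 is the sample mean plus or minus two standard deviations of x-bar. A short sketch of that arithmetic follows; the function name z_interval is hypothetical, and the multiplier 2 follows the slides' 95% rule (a normal table gives the slightly smaller 1.96).

```python
import math

def z_interval(sample_mean, sigma, n, z_star):
    """Return (low, high) for sample_mean +/- z_star * sigma / sqrt(n)."""
    margin = z_star * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Example 6.1: sample mean 461, assumed sigma 100, n = 500.
low, high = z_interval(sample_mean=461, sigma=100, n=500, z_star=2)
print(round(low), round(high))  # 452 470
```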
Other confidence intervals
If we suppose that the sample mean is within 1.645 standard deviations (of x-bar) of mu, then we get a 90% confidence interval.
If we suppose that the sample mean is within 2.576 standard deviations (of x-bar) of mu, then we get a 99% confidence interval.
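The multipliers 1.645 and 2.576 come from the standard normal distribution. As a rough sketch, they can be recovered with the inverse normal CDF and applied to Example 6.1; the function name critical_value is made up for this illustration.

```python
import math
from statistics import NormalDist

def critical_value(confidence):
    """z* such that the central `confidence` share of a standard normal
    distribution lies between -z* and +z*."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

se = 100 / math.sqrt(500)   # standard deviation of x-bar in Example 6.1

intervals = {}
for confidence in (0.90, 0.95, 0.99):
    z_star = critical_value(confidence)
    margin = z_star * se
    intervals[confidence] = (461 - margin, 461 + margin)
    print(confidence, round(z_star, 3), round(461 - margin, 1), round(461 + margin, 1))
```

Higher confidence requires a larger multiplier, so the 99% interval is wider than the 90% interval for the same data.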
Effect of sample size on the confidence interval
As n gets larger, the expected variability of the sample means gets smaller.
• Larger sample sizes produce narrower confidence intervals (other things equal).
• Smaller sample sizes produce wider confidence intervals (other things equal).
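To see how quickly the interval narrows, the margin of error can be computed for several sample sizes. This sketch keeps sigma = 100 from Example 6.1 and the multiplier 2 from the 95% rule; the sample sizes 125 and 2000 are assumptions added for comparison.

```python
import math

def margin_of_error(sigma, n, z_star=2.0):
    """Half-width of the confidence interval: z_star * sigma / sqrt(n)."""
    return z_star * sigma / math.sqrt(n)

margins = {n: margin_of_error(100, n) for n in (125, 500, 2000)}
for n, margin in margins.items():
    print(n, round(margin, 1))
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.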
Some cautions
• The data must be a simple random sample from the population.
• The sample mean, and therefore the confidence interval, may be too heavily influenced by one or more outliers.
• If the sample size is small and the population is not approximately normal, then the CLT doesn’t promise an approximately normal distribution for the sample means.
One more caution
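It is tempting to say: there is a 95% chance that mu lies in the confidence interval.
In Example 6.1:
P(452 < mu < 470) = .95
That reading is not correct. mu is a fixed number, so it either is or is not between 452 and 470. The 95% describes the method: about 95% of the intervals built this way, from repeated random samples, contain mu.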
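One way to read the 95% figure is as a property of the procedure rather than of any single interval. The simulation below is a sketch under stated assumptions: it pretends the true mean is 455 (a made-up value, since mu is unknown in practice), draws many samples of size 500 from a normal population with sigma = 100, and counts how often the interval x-bar plus or minus 2 standard deviations of x-bar contains that true mean.

```python
import math
import random
import statistics

def coverage_rate(true_mu, sigma, n, z_star, reps, seed=0):
    """Fraction of simulated intervals x-bar +/- z_star * sigma / sqrt(n)
    that contain true_mu."""
    rng = random.Random(seed)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        x_bar = statistics.fmean(rng.gauss(true_mu, sigma) for _ in range(n))
        if x_bar - margin < true_mu < x_bar + margin:
            hits += 1
    return hits / reps

# true_mu=455 is hypothetical, chosen only so the simulation has a target.
rate = coverage_rate(true_mu=455, sigma=100, n=500, z_star=2, reps=1000)
print(rate)  # typically close to 0.95
```

Each simulated interval either contains 455 or it does not; the 95% only shows up as the long-run share of intervals that do.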
