Sampling
• We have a known population. We ask “what would happen if
I drew lots and lots of random samples from this population?”
Inference
• We have a known sample. We ask “what kind of population
might this sample have been drawn from?”
The Central Limit Theorem
• If you draw simple random samples of size n from a
population with mean mu and variance sigma^2, then
  ◦ the expected mean of x-bar is mu
  ◦ the expected variance of x-bar is sigma^2 / n
  ◦ the expected histogram of x-bar is approximately normal
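The three claims above can be checked by simulation. The sketch below is only an illustration: the exponential population (mean 1, variance 1), the sample size of 50, and the 5000 repetitions are assumptions chosen for the demo, not part of the slides.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` simple random samples of size `n`; return the list of sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        # Assumed population: exponential with mean 1 and variance 1 (clearly not normal).
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

n = 50
means = simulate_sample_means(n=n, reps=5000)
mean_of_means = statistics.fmean(means)     # should be close to mu = 1
var_of_means = statistics.pvariance(means)  # should be close to sigma^2 / n = 0.02

print("mean of x-bar:", round(mean_of_means, 3))
print("variance of x-bar:", round(var_of_means, 4), "vs sigma^2 / n =", 1 / n)
```

A histogram of `means` should look roughly bell-shaped even though the assumed population is strongly skewed.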
Estimating mu from sample data
• Is this true?
• mu = sample mean
• Why not?
  ◦ Because the Central Limit Theorem tells us that, if we
drew lots and lots of samples, the sample means would vary.
Some are bigger than mu and others are smaller than mu.
Estimating mu from sample data
• What about this?
• mu = somewhere in the neighborhood of the sample mean
• But how do we define neighborhood?
Example 6.1
We have a sample of 500 high-school seniors, selected at
random from the population of all high-school seniors in
California. For the 500 kids in the sample, their average
score on the math section of the SAT is 461.
• Known: sample mean is 461
• Unknown: population mean
• Assumed: population sigma is 100
The Central Limit Theorem
• If you draw simple random samples of size 500 from a
population with mean mu and standard deviation of 100,
then
  ◦ the expected mean of x-bar is mu
  ◦ the expected standard deviation of x-bar is about 4.5
  ◦ the expected histogram of x-bar is approximately normal
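The figure of about 4.5 is the population standard deviation divided by the square root of the sample size. A quick check of that arithmetic, with variable names chosen here for illustration:

```python
import math

sigma = 100  # assumed population standard deviation (Example 6.1)
n = 500      # sample size (Example 6.1)

# Standard deviation of the sample mean: sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)
print(round(standard_error, 2))  # 4.47
```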
Table A tells us...
• ...about 68% of sample means should fall within 4.5 points
of mu
• ...about 95% of sample means should fall within 9 points
of mu
• ...about 99.7% of sample means should fall within 13.5
points of mu
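The distances 4.5, 9, and 13.5 points are 1, 2, and 3 standard deviations of the sample mean. The normal-curve areas behind these rounded percentages can be recomputed without a printed table. This is a sketch using Python's standard library in place of Table A; the helper name `coverage` is made up for the example.

```python
from statistics import NormalDist

def coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(coverage(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```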
About 95% of sample means should
fall within 9 points of mu
[Figure: axis "Sample Means" marked at mu-9, mu, and mu+9]
[Figures: panels titled "mu is 452" and "mu is 470", each with axis "Sample Means" running from 435 to 485]
The 95% Confidence Interval
• If mu is any number less than 452, then our
sample mean would be surprisingly large.
• If mu is any number greater than 470, then
our sample mean would be surprisingly small.
• Therefore, the 95% confidence interval for mu
is the range from 452 to 470.
• If mu is inside this range, then our sample is
not unusual (according to the 95% rule).
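The interval can be reproduced as sample mean plus or minus a multiple of the standard deviation of the sample mean. The sketch below assumes the known-sigma setup of Example 6.1 and uses the exact normal multiplier (about 1.96) rather than the rounded "2 standard deviations" rule. The function name `confidence_interval` is illustrative only.

```python
import math
from statistics import NormalDist

def confidence_interval(sample_mean, sigma, n, level=0.95):
    """Normal-based interval for mu when the population sigma is assumed known."""
    standard_error = sigma / math.sqrt(n)
    # Multiplier that leaves (1 - level) / 2 in each tail of the standard normal
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

low, high = confidence_interval(sample_mean=461, sigma=100, n=500)
print(round(low, 1), round(high, 1))  # 452.2 469.8
```

Rounded to whole points, this matches the range from 452 to 470.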
Other confidence intervals
• If we suppose that the sample mean is within 1.645
standard deviations of mu, then we get a 90% confidence
interval.
• If we suppose that the sample mean is within 2.576
standard deviations of mu, then we get a 99% confidence
interval.
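The multipliers 1.645 and 2.576 are the points on the standard normal curve that leave 5% and 0.5% in each tail. A minimal check, assuming Python's standard library is an acceptable stand-in for a normal table (`critical_value` is a hypothetical helper name):

```python
from statistics import NormalDist

def critical_value(level):
    """Number of standard deviations that captures `level` of a normal distribution."""
    return NormalDist().inv_cdf(0.5 + level / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(critical_value(level), 3))
# 0.9 1.645
# 0.95 1.96
# 0.99 2.576
```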
Effect of sample size on the confidence interval
• As n gets larger, the expected variability of the sample
means gets smaller.
• Larger sample sizes produce narrower confidence intervals
(other things equal).
• Smaller sample sizes produce wider confidence intervals
(other things equal).
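To see the size of the effect, the half-width of the interval can be computed for a few sample sizes. This sketch assumes sigma = 100 as in Example 6.1 and the "2 standard deviations" rule for 95%. The sample sizes 125 and 2000 are made up for comparison.

```python
import math

def margin_of_error(sigma, n, z=2.0):
    """Half-width of the interval: z standard deviations of the sample mean."""
    return z * sigma / math.sqrt(n)

for n in (125, 500, 2000):
    print(n, round(margin_of_error(100, n), 1))
# 125 17.9
# 500 8.9
# 2000 4.5
```

Quadrupling the sample size halves the width, because the standard deviation of the sample mean shrinks with the square root of n.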
Some cautions
• The data must be a simple random sample from the
population.
• The sample mean, and therefore the confidence interval,
may be too heavily influenced by one or more outliers.
• If the sample size is small and the population is not
approximately normal, then the CLT does not promise an
approximately normal distribution for the sample means.