Chapter 9 Notes Answers

Chapter 9: Sampling Distributions
9.1: Sampling Distributions
IDEA: How often would a given method of sampling give a correct answer if it was repeated many times?
That is, if you took repeated samples (MANY repeated samples), how often would the sample reflect the true
distribution of the population you are sampling from? This is the basis of statistical inference which we will
study in future chapters. The purpose of this chapter is to prepare us to answer those questions.
Parameter: a ______________ that describes a ______________________. A parameter has
_______ (and only one) value—we just don't know what it is. The most important parameters
for us are the __________________________ (μ) and population ________________ (p or π).
P: Population, Parameter
S: Sample, Statistic
Statistic: a ______________ that can be ___________________ from a sample without making use of any
unknown _____________________. In practice we will use _____________________ to establish unknown
parameters.
Sampling Distribution: the ______________________________________ of a __________________ is the
distribution of the values taken by the _____________________ in all ______________________ samples of
the same size from the same population.
We can view our ____________________________________ as a __________________
_______________________. That is, we have NO WAY of predicting ________________
what value we will get from a _________________________________.
EXAMPLE: (p.570/#5) Sampling Test Scores, I
Let us illustrate the idea of a sampling distribution of x̄ in the case of a very small sample from a very small
population. The population is the scores of 10 students on an exam:
Student #:   0   1   2   3   4   5   6   7   8   9
Score:      82  62  80  58  72  73  65  66  74  62
The parameter of interest is the mean score in this population, which is 69.4. The sample is an SRS drawn from
the population. Because the students are labeled 0 to 9, a single random digit from Table B chooses 1 student
from the population.
(a) Use Table B to draw an SRS of size n=4 from this population. Write the 4 scores in your sample and
calculate the mean x̄ of the sample scores. This __________________ is an ________________ of the
___________________________________________________.
Sample: _____ _____ _____ _____
sample mean = _____
ACT
Scores: _____ _____ _____ _____
(b) Repeat this process 10 times. That is, you will take 10 more samples of size 4 and compute each sample’s
mean. Make a histogram of the 10 values of x̄. You are constructing the sampling distribution of x̄. Is the
center close to 69.4?
x̄1: _____    x̄6: _____
x̄2: _____    x̄7: _____
x̄3: _____    x̄8: _____
x̄4: _____    x̄9: _____
x̄5: _____    x̄10: _____
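The repeated sampling in parts (a) and (b) can also be simulated instead of read from Table B. The sketch below is only an illustration: it swaps in Python's `random.sample` for the random digit table, and the names `scores` and `draw_sample_mean` and the seed value are assumptions made for this example.

```python
import random
from statistics import mean

# Population from the example: student label -> exam score
scores = {0: 82, 1: 62, 2: 80, 3: 58, 4: 72, 5: 73, 6: 65, 7: 66, 8: 74, 9: 62}

def draw_sample_mean(rng, n=4):
    """Draw an SRS of n students (no repeats) and return the mean of their scores."""
    chosen = rng.sample(sorted(scores), n)   # n distinct labels, like n distinct random digits
    return mean(scores[label] for label in chosen)

rng = random.Random(9)   # fixed seed only so that a rerun gives the same samples
sample_means = [draw_sample_mean(rng) for _ in range(10)]

print("population mean:", mean(scores.values()))
print("10 sample means:", sample_means)
print("mean of the 10 sample means:", mean(sample_means))
```

With only 10 repetitions the center of the simulated values may land noticeably above or below 69.4; raising the repetition count gives a better picture of the sampling distribution.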
(c) Ten repetitions give a very crude approximation of the sampling distribution. Now pool your data with
that of other students. Use your calculator’s list functions to organize and sort the data, and to construct a
new histogram of the sample means. Copy your histogram below. Describe the shape of the distribution. Is
the center close to 69.4? Is this histogram a better approximation of the sampling distribution?
Describe the histogram you just drew.
Shape:
Center:
Spread:
Unusual Values:
Note: We have no way of knowing whether or not OUR STATISTIC is _____________ to the parameter we
are trying to ______________________. We must be aware of _________ and ______________________.
Unbiased Statistic/Unbiased Estimator: A statistic used to estimate a parameter is _______________ if
the __________ of its sampling distribution is ______________ to the _______________________ of the
parameter being estimated. The statistic is called an ______________________________________ of the
parameter.
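For the 10-student population in the example, unbiasedness can be checked directly, because the population is small enough to list every possible SRS of size 4. The sketch below is an illustration under that assumption (the variable names are made up for this example); it builds the exact sampling distribution of the sample mean and compares its center with the population mean.

```python
from itertools import combinations
from statistics import mean, pstdev

scores = [82, 62, 80, 58, 72, 73, 65, 66, 74, 62]   # the 10 exam scores

# Every possible SRS of size 4 is equally likely, so listing them all
# gives the exact sampling distribution of the sample mean.
all_sample_means = [mean(sample) for sample in combinations(scores, 4)]

population_mean = mean(scores)
center_of_sampling_dist = mean(all_sample_means)

print("number of possible samples:", len(all_sample_means))
print("population mean:", population_mean)
print("mean of all sample means:", center_of_sampling_dist)
print("SD of all sample means:", pstdev(all_sample_means))
```

The center of the sampling distribution matches the population mean, which is what it means for the sample mean to be an unbiased estimator here.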
Variability of a Statistic: the variability of a statistic is described by the _____________ of the sampling
distribution. The spread is determined by the sampling _________________ and the ____________ of the
sample.
Larger samples give __________________ spread.
If the population is much larger than the sample (at least 10 times as large), the spread of the
_____________________________ is approximately the same for any _____________ size.
So, the ______________ of a sampling distribution depends ONLY on _________________
and NOT on the size of the _____________________.
This means that a survey of, say, 1,200 people will have the same VARIABILITY (or margin of error)
whether the population being sampled is the city of Fullerton or the entire United States. Although not
always intuitive, this concept will be shown throughout this and future chapters.
Suppose you wanted to estimate the distribution of colors of regular M&M’s. You decide to take a sample
of M&M’s. As long as the M&M’s are well mixed, the sample doesn’t know whether it is coming from a
single serving size bag of M&M’s, a Costco size bag of M&M’s, or a large bucket of M&M’s! If any one
sample taken is a SRS, the variability of the result depends only on the size of the sample.
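The claim that population size barely matters can be checked numerically. The sketch below is an illustration, not part of the notes: it uses the finite population correction factor, a standard result the notes do not state, and the two population sizes and the value p = 0.5 are round, hypothetical figures chosen for the example.

```python
from math import sqrt

def sd_of_p_hat(p, n, N):
    """SD of a sample proportion for an SRS of size n from a population of size N.

    Includes the finite population correction sqrt((N - n) / (N - 1)),
    a standard result that these notes do not state.
    """
    return sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))

n = 1200
p = 0.5   # hypothetical population proportion
populations = {
    "city (about 140,000 people, illustrative)": 140_000,
    "country (about 300,000,000 people, illustrative)": 300_000_000,
}

for name, N in populations.items():
    print(f"{name}: SD of p-hat = {sd_of_p_hat(p, n, N):.5f}")

# The simpler formula that ignores population size entirely
print(f"sqrt(p(1-p)/n) = {sqrt(p * (1 - p) / n):.5f}")
```

Both populations are far more than 10 times the sample size, so the correction factor is very close to 1 and the two standard deviations differ by well under one percent.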
OUR GOAL: we want to have ______________________ AND ________________________________.
______ Bias, ______ Variability
______ Bias, ______ Variability
______ Bias, ______ Variability
______ Bias, ______ Variability
9.2 Sample Proportions
The objective of some statistical applications is to reach a conclusion about a population proportion, p.
For example, we may try to estimate an approval rating through a survey, or test a claim about the
proportion of defective light bulbs in a shipment based on a random sample.
Since p is unknown to us, we must base our conclusion on a sample proportion, p̂. However, we know
that the value of p̂ will vary from sample to sample. The amount of variability will depend on the size of
our sample.
Our estimator is the proportion of success:
$$\hat{p} = \frac{\text{count of "successes" in sample}}{\text{size of sample}} = \frac{X}{n}$$
Note: the values of X and p̂ will vary in repeated samples; both X and p̂ are ______________________.
Something to think about…
Proportions are just another way of looking at counts. For example, I can talk about how many male
students I have in this class, or I can talk about the proportion of males in the class. These are two
different ways of looking at the same information. So don’t be too surprised if we find that much of what
we learn about ______________________ is based on what we already know about _________________.
Sampling Distribution of a Sample Proportion:
Choose an SRS of size n from a large population with population proportion p having some characteristic
of interest. Let p̂ be the proportion of the sample having the characteristic. Then:
The __________ of the sampling distribution of p̂ is ______________ p.
The __________________________________ of the sampling distribution of p̂ is $\sqrt{\dfrac{p(1-p)}{n}}$.
RULE OF THUMB #1:
Use the recipe for standard deviation of p̂ ONLY when the population is at least _______________ as
large as the sample; that is, when _________________. Where ____ is the size of the ________________
and ____ is the size of the _________________.
Note: we will use this rule throughout the rest of the year whenever our interest is drawing a sample to
make inferences about a population. We are interested in sampling only when the population is large
enough to make taking a census impractical.
RULE OF THUMB #2:
Use the Normal approximation to the sampling distribution of p̂ for values of n and p that satisfy
_____________ and ____________________.
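The two rules of thumb can be written as simple checks. The sketch below is an illustration under stated assumptions: Rule 1 uses the "at least 10 times as large" condition mentioned earlier in these notes, while the Rule 2 thresholds (np ≥ 10 and n(1 − p) ≥ 10) are the commonly used textbook values and are assumed here, since the notes leave them blank. The function names and the population size are hypothetical.

```python
from math import sqrt

def check_rules_of_thumb(n, p, N, threshold=10):
    """Return (rule1_ok, rule2_ok) for a sample of size n from a population of size N.

    Rule 1: the population is at least 10 times the sample (N >= 10n).
    Rule 2 (assumed thresholds): n*p >= 10 and n*(1 - p) >= 10.
    """
    rule1_ok = N >= 10 * n
    rule2_ok = n * p >= threshold and n * (1 - p) >= threshold
    return rule1_ok, rule2_ok

def sd_of_p_hat(p, n):
    """Standard deviation of p-hat from the notes: sqrt(p(1 - p) / n)."""
    return sqrt(p * (1 - p) / n)

# Hypothetical survey: 1,500 adults from a population of 200 million, p = 0.11
print(check_rules_of_thumb(n=1500, p=0.11, N=200_000_000))
print(sd_of_p_hat(0.11, 1500))

# A sample too small for the Normal approximation: n*p = 2.2, which is below 10
print(check_rules_of_thumb(n=20, p=0.11, N=200_000_000))
```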
EXAMPLE: Based on Census data, we know 11% of US adults are Black. Therefore, p = 0.11. We would
expect a sample to contain roughly 11% Black representation. Suppose a sample of 1500 adults contains
138 Black individuals. Should we suspect ‘undercoverage’ in the sampling method?
Note, $\hat{p} = \dfrac{138}{1500} = 0.092$.
Is this lower than what would be expected by chance? That is, we know it is possible that a sample could
contain 9.2% Black representation…but is it likely that would happen due to natural variation in a random
sampling method?
Check assumptions:
Rule of thumb #1:
Rule of thumb #2:
Find the mean and standard deviation:
Calculate the probability:
Interpret in context:
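One way to carry out the calculation steps is sketched below. It is an illustration that assumes both rules of thumb are satisfied, so that the Normal approximation applies; it uses `statistics.NormalDist` from the Python standard library in place of a Normal table, and the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

p = 0.11          # population proportion from the example
n = 1500          # sample size
p_hat = 138 / n   # observed sample proportion

mean_p_hat = p                      # center of the sampling distribution
sd_p_hat = sqrt(p * (1 - p) / n)    # spread of the sampling distribution

# Normal approximation: chance of a sample proportion this low or lower
z = (p_hat - mean_p_hat) / sd_p_hat
prob = NormalDist(mean_p_hat, sd_p_hat).cdf(p_hat)

print(f"p-hat = {p_hat:.3f}")
print(f"mean = {mean_p_hat:.2f}, sd = {sd_p_hat:.5f}")
print(f"z = {z:.2f}, P(p-hat <= {p_hat:.3f}) = {prob:.4f}")
```

Under these assumptions the probability is roughly 0.013, so a sample proportion this low would be unusual for a truly random sample, which is grounds to suspect undercoverage.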
9.3 Sample Means
When the objective of a statistical application is to reach a conclusion about a population mean, µ, we
must consider a sample mean, x̄. However, as we have noted, we know that the value of x̄ will vary
from sample to sample. The amount of variability will depend on n, the size of our sample.
Mean and Standard Deviation of a Sample Mean:
Suppose that x̄ is the mean of an SRS of size n drawn from a large population with mean μ and standard
deviation σ. Then:
The __________ of the sampling distribution of x̄ is:
The ________________________ of the sampling distribution of x̄ is:
EXAMPLE: ACT Scores
The scores of individual students on the American College Testing (ACT) composite college entrance
examination have a Normal distribution with mean 18.6 and standard deviation 5.9.
(a) What is the probability that a single student randomly chosen from all those taking the test scores 21
or higher?
(b) Now take a SRS of 50 students who took the test. What are the mean and standard deviation of the
average (sample mean) score of the 50 students?
(c) Do your results depend on the fact that individual scores have a Normal distribution?
(d) What is the probability that the mean score x̄ of the students is 21 or higher?
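The numerical parts of the ACT example can be worked as in the sketch below. It is an illustration only: it uses `statistics.NormalDist` from the Python standard library in place of a Normal table, takes the standard deviation of x̄ to be σ/√n, and the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 18.6, 5.9, 50

# (a) One randomly chosen student: score ~ N(18.6, 5.9)
individual = NormalDist(mu, sigma)
prob_single = 1 - individual.cdf(21)

# (b) Mean of an SRS of 50 students: mean mu, standard deviation sigma / sqrt(n)
mean_xbar = mu
sd_xbar = sigma / sqrt(n)

# (d) The population is Normal, so x-bar ~ N(mu, sigma / sqrt(n))
sample_mean_dist = NormalDist(mean_xbar, sd_xbar)
prob_mean = 1 - sample_mean_dist.cdf(21)

print(f"(a) P(X >= 21)     = {prob_single:.4f}")
print(f"(b) mean of x-bar  = {mean_xbar}, sd of x-bar = {sd_xbar:.4f}")
print(f"(d) P(x-bar >= 21) = {prob_mean:.4f}")
```

For part (c), the mean and standard deviation in (b) do not rely on the Normal shape of the population; the probability in (d) does rely on x̄ being at least approximately Normal.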
CENTRAL LIMIT THEOREM:
Draw a SRS of size n from ________ population whatsoever with mean μ and finite standard deviation σ.
When n is large, the _____________________________________ of the sample mean x̄ is close to the
_______________ distribution $N(\mu, \sigma/\sqrt{n})$ with mean μ and standard deviation $\sigma/\sqrt{n}$.
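A small simulation can make the theorem concrete. The sketch below is an illustration under assumptions that do not come from the notes: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly right-skewed shape), and the sample size, repetition count, and seed are arbitrary choices for the example.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(2024)   # fixed seed only so that a rerun gives the same numbers

mu, sigma = 1.0, 1.0        # exponential population with rate 1: mean 1, SD 1
n = 40                      # size of each sample
reps = 5000                 # number of simulated samples

# Each entry is the mean of one sample of n draws from the skewed population
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

theory_sd = sigma / sqrt(n)
# For a Normal distribution, about 95% of values fall within 2 SDs of the mean
within_2_sd = sum(abs(m - mu) <= 2 * theory_sd for m in sample_means) / reps

print(f"mean of sample means: {mean(sample_means):.3f} (theory: {mu})")
print(f"SD of sample means:   {stdev(sample_means):.3f} (theory: {theory_sd:.3f})")
print(f"share within 2 SDs:   {within_2_sd:.3f}")
```

Even though individual draws are far from Normal, the simulated sample means cluster around μ with spread close to σ/√n, and the share within two standard deviations comes out near the Normal benchmark of about 95%.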
Note: the CLT discusses the ______________ (and only the shape) of the sampling distribution of x̄ when n
is sufficiently large. If n is not large, the shape of the distribution more closely resembles the shape of the
original population. Thus, there are three situations to consider when discussing the shape of the sampling
distribution:
Shape of Population | Shape of Sampling Distribution