Section 8 - Palisades School District

Section 8.1 ~ Sampling Distributions
Objective: After this section you will understand the fundamental ideas of sampling
distributions and how the distribution of sample means and distribution of sample
proportions are formed. You will also learn the notation used to represent sample
means and proportions.
How does someone come up with these statistics?
~ The mean daily protein consumption by Americans is 67 grams
~ Nationwide, the mean hospital stay after delivery of a baby decreased from 3.2
days in 1980 to the current mean of 2.0 days
~ Thirty percent of high school girls in this country believe they would be happier
being married than not married
~ About 5% of all American children live with a grandparent
What is the difference between the first two statements and the last two statements?
Notation review:
~ n
~ μ
~ z
~ r
~ x
~ x̄
~ σ
~ P(x)
~ s
Distribution of sample means:
Distribution of Sample Means Activity
Complete the following activity to discover how the distribution of sample means
relates to the population mean.
In order to discover this relationship without an unmanageable amount of numbers, we
are going to be working with a small population of 5.
Suppose we are interested in knowing the average weight of the 5 starting players on a
professional basketball team. For convenience, we will name the players A, B, C, D, and
E. We can find the population mean very easily since there are only 5 players in the
entire population, but our purpose is to discover how the distribution of sample means
relates to the population mean so we can later apply that relationship to large populations.
What is the population mean?
Suppose you are going to choose samples of size n = 1. There would be 5 possible
samples (each player would be considered a sample) and the distribution of sample
means would be as follows:
What is the mean of the 5 sample means?
Now suppose you are going to choose samples of size n = 2. There would be 10 possible
samples and the distribution of sample means would be as follows:
What is the mean of the 10 sample means?
Now suppose you are going to choose samples of size n = 3. Again, there would be 10
possible samples and the distribution of sample means would be as follows:
What is the mean of the 10 sample means?
Now suppose you are going to choose samples of size n = 4. There would be 5 possible
samples and the distribution of sample means would be as follows:
What is the mean of the 5 sample means?
Lastly, suppose you are going to choose samples of size n = 5. In this case, there is only
1 possible sample (all of the players) which happens to be the population. The
distribution of sample means is as follows:
What is the mean of this 1 sample mean?
What do you notice about the mean of the distribution of sample means in comparison to
the population mean (242.4)?
What do you notice about the histogram?
When you work with ALL possible samples of a given size from a population, the mean of
the distribution of sample means is always the ______________________.
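As a rough check of the activity above, the short Python sketch below lists every possible sample of each size and averages the sample means. The five weights are hypothetical (the worksheet's table of player weights is not reproduced here); they were chosen only so that their mean equals the 242.4 quoted above.

```python
from itertools import combinations
from statistics import mean

# Hypothetical weights (in pounds) for players A-E. These are NOT the
# worksheet's values; they are picked only so the population mean is 242.4.
weights = {"A": 215, "B": 242, "C": 225, "D": 215, "E": 315}

population_mean = mean(weights.values())


def mean_of_sample_means(n):
    """Return (number of samples, mean of the sample means) for size n.

    Samples are drawn without replacement and order does not matter,
    so there are C(5, n) possible samples.
    """
    sample_means = [mean(sample) for sample in combinations(weights.values(), n)]
    return len(sample_means), mean(sample_means)


results = {n: mean_of_sample_means(n) for n in range(1, 6)}

print(f"population mean = {population_mean:.1f}")
for n, (count, grand_mean) in results.items():
    print(f"n = {n}: {count:2d} samples, mean of sample means = {grand_mean:.1f}")
```

Swapping in any other five weights changes the individual sample means but not the pattern the activity asks you to notice.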
Typically, the population size is too large to calculate the means for all possible
samples, so we calculate the mean of a sample, x̄, to estimate the population
mean, μ.
When you are working with very large populations, as your sample size increases, the
distribution will look more and more like a _____________________ and the
distribution of sample means will ________________________________.
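The simulation below is a minimal sketch of this idea and is not part of the original notes. It assumes a made-up, right-skewed population of 100,000 values and draws repeated random samples of several sizes; the population, the seed, the sample sizes, and the function name `simulate_sample_means` are all illustrative choices.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical large population with a right-skewed shape (mean near 4).
population = [random.expovariate(1 / 4) for _ in range(100_000)]
population_mean = fmean(population)


def simulate_sample_means(n, trials=2_000):
    """Draw `trials` random samples of size n and return their means."""
    return [fmean(random.sample(population, n)) for _ in range(trials)]


centers = {}  # mean of the simulated sample means, per sample size
spread = {}   # standard deviation of the simulated sample means

for n in (2, 10, 50):
    sample_means = simulate_sample_means(n)
    centers[n] = fmean(sample_means)
    spread[n] = pstdev(sample_means)
    print(f"n = {n:2d}: center = {centers[n]:.2f}, spread = {spread[n]:.2f}")

print(f"population mean = {population_mean:.2f}")
```

Plotting a histogram of `sample_means` for each n would show the shape as well as the spread; the printed numbers only summarize center and spread.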
Sampling error:
The more samples that you gather, the better your estimate will be, but if you can only
gather one sample, the mean of that sample is your best estimate.
Ex. ~ The following values are results from a survey of 400 students who were asked
how many hours they spend per week using a search engine on the Internet.
n = 400
μ = 3.88
σ = 2.40
Suppose these were the values that were randomly selected to obtain a sample of
32 students:
1.1  2.7  2.5  7.8  2.6  7.8  6.8  1.4
4.9  7.1  3.0  5.5  6.5  3.1  5.2  5.0
2.2  6.8  5.1  6.5  3.4  1.7  4.7  2.1
7.0  1.2  3.8  0.3  5.7  0.9  6.5  2.4
The mean of the sample is x̄ = 4.17.
Note ~ We say that x̄ is a sample statistic because it comes from a sample of
the entire population.
Now suppose a different sample of 32 students was selected from the 400:
1.8  1.2  0.9  0.4  5.4  4.0  4.0  5.7
2.4  7.2  0.8  5.1  6.2  3.2  0.8  3.1
6.6  5.0  5.7  7.9  2.5  3.6  5.2  3.1
0.5  3.9  3.1  5.8  5.7  6.5  2.9  7.2
The mean of the sample is x̄ = 3.98.
Now you have two sample means that don’t agree with each other (4.17 and 3.98,
respectively), and neither one agrees with the true population mean (3.88). This is
an example of sampling error.
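The two sample means can be re-derived from the listed values. The sketch below is only an illustration; it takes the sampling error to be the sample mean minus the population mean, which is one common convention and an assumption here, and the variable names are arbitrary.

```python
from statistics import mean

population_mean = 3.88  # mu for all 400 students, as given above

# The two samples of 32 students, copied from the lists above.
sample_1 = [
    1.1, 2.7, 2.5, 7.8, 2.6, 7.8, 6.8, 1.4,
    4.9, 7.1, 3.0, 5.5, 6.5, 3.1, 5.2, 5.0,
    2.2, 6.8, 5.1, 6.5, 3.4, 1.7, 4.7, 2.1,
    7.0, 1.2, 3.8, 0.3, 5.7, 0.9, 6.5, 2.4,
]
sample_2 = [
    1.8, 1.2, 0.9, 0.4, 5.4, 4.0, 4.0, 5.7,
    2.4, 7.2, 0.8, 5.1, 6.2, 3.2, 0.8, 3.1,
    6.6, 5.0, 5.7, 7.9, 2.5, 3.6, 5.2, 3.1,
    0.5, 3.9, 3.1, 5.8, 5.7, 6.5, 2.9, 7.2,
]

x_bar_1 = mean(sample_1)
x_bar_2 = mean(sample_2)

for label, x_bar in (("first", x_bar_1), ("second", x_bar_2)):
    # Assumed convention: sampling error = sample mean - population mean.
    error = x_bar - population_mean
    print(f"{label} sample: x-bar = {x_bar:.2f}, sampling error = {error:+.2f}")
```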
In summary, when including all possible samples of size n, the characteristics of the
distribution of sample means are as follows:



Example 1:
Texas has roughly 225,000 farms, more than any other state in the United States. The
actual mean farm size is μ = 582 acres and the standard deviation is σ = 150 acres. For
random samples of n = 100 farms, find the mean and standard deviation of the
distribution of sample means. What is the probability of selecting a random sample of
100 farms with a mean greater than 600 acres?
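One way to set up Example 1 is sketched below. It assumes the standard results that the distribution of sample means has mean μ and standard deviation σ/√n, and that with n = 100 the distribution is close enough to normal to use a z-score. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu = 582      # population mean farm size (acres)
sigma = 150   # population standard deviation (acres)
n = 100       # sample size

# Mean and standard deviation of the distribution of sample means.
mean_of_sample_means = mu
std_of_sample_means = sigma / sqrt(n)

# z-score of a sample mean of 600 acres within that distribution.
z = (600 - mean_of_sample_means) / std_of_sample_means

# Probability of a sample mean greater than 600 (upper tail), assuming
# the distribution of sample means is approximately normal.
prob_above_600 = 1 - NormalDist().cdf(z)

print(f"mean = {mean_of_sample_means}, standard deviation = {std_of_sample_means}")
print(f"z = {z:.2f}, P(x-bar > 600) = {prob_above_600:.4f}")
```

A printed z-table would be used the same way: look up the area to the left of z and subtract it from 1.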
Population proportion (p):
Ex. ~ Suppose instead of being interested in knowing how many hours per week
students spend using search engines, we took those same 400 students and asked them
a simple Yes or No question, “Do you own a car?” (refer to the raw data on P.341)
Sample proportion ( p̂ ):
Example 2:
Consider the distribution of sample proportions shown on P.341. Assume that its
population proportion is p = 0.6 and its standard deviation is 0.1. Suppose you randomly
select the following sample of 32 responses:
YYNYYYYNYYYYYYNYYNYYYNYYNYYNYNYY
Compute the sample proportion, p̂, of Y’s in this sample. How far does it
lie from the population proportion? What is the probability of selecting another sample
with a proportion greater than the one you selected?
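Example 2 can be approached the same way as Example 1. The sketch below counts the Y responses in the string above and, assuming the distribution of sample proportions is approximately normal with the stated mean 0.6 and standard deviation 0.1, converts the result to a z-score. The variable names are illustrative.

```python
from statistics import NormalDist

responses = "YYNYYYYNYYYYYYNYYNYYYNYYNYYNYNYY"

p = 0.6              # population proportion, as stated in the example
sigma_p_hat = 0.1    # standard deviation of the sample proportions, as stated

# Sample proportion of Y responses.
yes_count = responses.count("Y")
p_hat = yes_count / len(responses)

# Distance from the population proportion, in raw units and as a z-score.
distance = p_hat - p
z = distance / sigma_p_hat

# Probability that another sample has a larger proportion (upper tail),
# assuming an approximately normal distribution of sample proportions.
prob_greater = 1 - NormalDist().cdf(z)

print(f"{yes_count} of {len(responses)} responses are Y, p-hat = {p_hat}")
print(f"distance = {distance:.2f}, z = {z:.2f}, P(greater) = {prob_greater:.4f}")
```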