Section 6-4
Sampling Distributions and Estimators
Slide 1
The height of young women varies approximately according to the N(64.5, 2.5) distribution.
The random variable measured (X) is the height of a randomly selected young
woman. In this Activity you will use the TI-83 to sample from this distribution and then
use Post-it notes to construct a distribution of averages.
1. If we choose one woman at random, the heights we get in repeated choices follow the
N(64.5, 2.5) distribution. On your calculator, go into the Stats/List editor and clear L1.
Simulate the heights of 100 randomly selected young women and store these heights
in L1:
* Place your cursor at the top of L1. Press MATH, select PRB, choose 6: randNorm( and
complete the command: randNorm(64.5, 2.5, 100), then press ENTER.
2. Plot a histogram of the 100 heights. First deselect any active functions in the Y= window and turn
off all STAT PLOTS. Set WINDOW dimensions to X[57,72] (Xscl: 2.5) and Y[-10,45]
(Yscl: 5) so that the window extends 3 standard deviations on either side of the mean. Define Plot 1 to be a histogram using
the heights in L1. Press GRAPH to plot the histogram. Is it fairly symmetric or clearly
skewed?
3. Use 1-Var Stats to find the mean, median, and standard deviation for your data. Compare
x-bar with the population mean of 64.5. Compare the sample standard deviation with
the population std. dev. of 2.5. How do the mean and median for your 100 heights
compare? Recall that the closer the mean and the median are, the more symmetric the
distribution.
4. Define Plot2 to be a boxplot using L1, and press GRAPH. The boxplot will be plotted
above the histogram. Does the boxplot appear symmetric? How close is the median in
the boxplot to the mean of the histogram? Based on the appearance of the histogram
and the boxplot, and a comparison of the mean and median, would you say that the
distribution is nonsymmetric, moderately symmetric, or very symmetric?
5. Write the mean for your sample on a Post-it note. Put your Post-it note at the appropriate
location on the graph on the board. When the "Post-it" histogram is complete:
* What is the approximate shape of the distribution of x-bar?
* What are the center and std. dev. of the distribution of x-bar? How do these compare with the mean and
std. dev. for the heights of all young women?
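The Post-it activity can also be imitated in software. The sketch below is an illustration only and is not part of the original activity: it assumes Python's standard library in place of the TI-83, a hypothetical class of 500 students (STUDENTS), and a fixed random seed. Each simulated student draws 100 heights and reports one sample mean. The last line prints sigma / sqrt(n), the standard theoretical value for the std. dev. of x-bar, which the slides themselves do not state.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable; any seed works

MU, SIGMA = 64.5, 2.5   # population mean and std. dev. from the activity
N = 100                 # heights per student, as in step 1
STUDENTS = 500          # hypothetical class size (assumption, not from the slides)

def sample_mean(n):
    """Mean of n simulated heights, mirroring randNorm(64.5, 2.5, n)."""
    return statistics.fmean(random.gauss(MU, SIGMA) for _ in range(n))

# One Post-it note (one sample mean) per simulated student
xbars = [sample_mean(N) for _ in range(STUDENTS)]

center = statistics.fmean(xbars)
spread = statistics.stdev(xbars)
print(f"mean of the sample means:      {center:.3f}")
print(f"std. dev. of the sample means: {spread:.3f}")
print(f"sigma / sqrt(n):               {SIGMA / N ** 0.5:.3f}")
```

In a run like this the sample means should cluster near 64.5 with a spread far smaller than 2.5, which is the pattern the Post-it histogram is meant to reveal.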
Slide 2
Key Concept
The main objective of this section is to
understand the concept of a sampling
distribution of a statistic, which is the
distribution of all values of that statistic
when all possible samples of the same size
are taken from the same population.
We will also see that some statistics are
better than others for estimating population
parameters.
Slide 3
Definitions
* The sampling distribution of a statistic (such as the
sample proportion or sample mean) is the distribution of
all values of the statistic when all possible samples of
the same size n are taken from the same population.
* The value of a statistic, such as the sample mean x-bar,
depends on the particular values included in the sample,
and generally varies from sample to sample. This
variability of a statistic is called sampling variability.
Slide 4
The sampling distribution of a proportion is
the distribution of sample proportions, with
all samples having the same sample size n
taken from the same population.
Slide 5
* The sampling distribution of the mean is the
distribution of sample means, with all samples
having the same sample size n taken from the
same population. (The sampling distribution of
the mean is typically represented as a probability
distribution in the format of a table, probability
histogram, or formula.)
Slide 6
Ex. 1
* Take a sample of 20 high school statistics students
and calculate the percentage of students who got a
B or higher on the Chapter 1 test. Take another sample of 20
statistics students, and calculate the percentage of
students who got a B or higher on the Chapter 1 test.
Keep going until we have taken all possible samples of 20
from the high school population in the USA (this
percentage will vary from sample to sample).
* Draw a histogram of the distribution of proportions.
* The mean of these proportions will equal the USA
percentage of students who got a B or higher on the
Chapter 1 test.
Slide 7
Ex. 2
* Take a sample of 20 high school statistics students
and calculate the mean height
of the students. Take another sample of 20 statistics students,
and calculate the mean height of the students. Keep
going until we have taken all possible samples of 20
from the high school population in the USA (this
mean will vary from sample to sample).
* Draw a histogram of the distribution of means.
* The mean of this distribution will equal the USA
mean of student heights.
Slide 8
Example
When two births are randomly selected, the sample space is:
{bb, bg, gb, gg}
Those four equally likely outcomes suggest the following probability
distribution for the number of girls from 2 births:
x        0       1       2
P(x)     0.25    0.50    0.25
Here is the sampling distribution for the proportion of girls:
Proportion of girls from 2 births    0/2 = 0    1/2 = 0.5    2/2 = 1
P(x)                                 0.25       0.50         0.25
We usually describe a sampling distribution using a table that lists values
of the sample statistic along with their corresponding probabilities.
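The two-birth table can be rebuilt by listing the sample space and counting. The following is a minimal sketch in Python (a tool choice made here, not one the slides use), assuming the four outcomes are equally likely as stated above; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two births: bb, bg, gb, gg (assumed equally likely)
outcomes = ["".join(pair) for pair in product("bg", repeat=2)]

# Count how many outcomes produce each proportion of girls
counts = Counter(Fraction(o.count("g"), len(o)) for o in outcomes)

# Turn counts into probabilities: the sampling distribution of the proportion
dist = {prop: Fraction(c, len(outcomes)) for prop, c in sorted(counts.items())}

for prop, prob in dist.items():
    print(f"proportion of girls = {float(prop):.1f}, P = {float(prob):.2f}")
```

The same pattern (enumerate every possible sample, compute the statistic, tally) works for any small sample space.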
Slide 9
Example
Consider the genders of the Senators in the 107th
Congress. There are only 100 members: 13
females and 87 males.
The population proportion of females is: p = 13/100 = 0.13
Usually, we don’t know all of the members of the
population, so we must estimate the proportion from a sample.
Sample 1: M F M M F M M M M M
Sample 2: M F M M M M M M M M
Sample 3: M M M M M M F M M M
Sample 4: M M M M M M M M M M
Sample 5: M M M M M M M M F M
Slide 10
Example (cont'd)
Suppose we take another 95
samples (for a total of 100).
Combining these additional
samples with the first 5, we get
100 samples, summarized
here:
Proportion of Female Senators    Frequency    P(x)
0.0                              26           26/100
0.1                              41           41/100
0.2                              24           24/100
0.3                              7            7/100
0.4                              1            1/100
0.5                              1            1/100
Mean of the sample proportions: 0.119
Std. dev. of the sample proportions: 0.100
* If we were to include all
other possible samples
of size 10 (all
100,000,000,000,000,000,000
of them!), the mean of the
sample proportions would
equal 0.13!
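The summary values for the 100 samples follow directly from the frequency table. This is a small sketch in Python, assuming the six proportion/frequency pairs shown for this example and treating the 100 sample proportions as the complete data set (dividing by 100 rather than 99, a choice made here; either gives 0.100 to three decimals).

```python
# Frequency table from the slide: sample proportion -> number of samples
freq = {0.0: 26, 0.1: 41, 0.2: 24, 0.3: 7, 0.4: 1, 0.5: 1}

total = sum(freq.values())  # number of samples

# Weighted mean of the sample proportions
mean = sum(p * f for p, f in freq.items()) / total

# Variance with divisor `total` (assumption noted in the lead-in), then std. dev.
var = sum(f * (p - mean) ** 2 for p, f in freq.items()) / total
sd = var ** 0.5

print(f"samples: {total}")
print(f"mean of sample proportions:      {mean:.3f}")
print(f"std. dev. of sample proportions: {sd:.3f}")
```

The mean of these 100 samples, 0.119, is close to but not equal to the population proportion 0.13, because only 100 of the possible samples are included.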
Slide 11
Properties
* Sample proportions tend to target the value
of the population proportion. (That is, the
mean of all possible sample proportions equals
the population proportion.)
* Under certain conditions, the distribution of
the sample proportion can be approximated
by a normal distribution.
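The first property can be checked exactly for the Senate example without listing every sample. The sketch below is one way to do it in Python, under an assumption the slides do not spell out: the 10 senators in each sample are drawn with replacement, so the number of females in a sample follows a binomial distribution with n = 10 and p = 0.13.

```python
from math import comb

n, p = 10, 0.13  # sample size and population proportion from the Senate example

# P(k females in a sample of n), assuming draws with replacement (binomial model)
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean and std. dev. of the sample proportion k/n over all possible samples
mean_phat = sum((k / n) * prob for k, prob in enumerate(pmf))
sd_phat = sum(((k / n) - mean_phat) ** 2 * prob for k, prob in enumerate(pmf)) ** 0.5

print(f"mean of all possible sample proportions: {mean_phat:.3f}")
print(f"std. dev. of the sample proportion:      {sd_phat:.3f}")
```

Under this model the mean of the sample proportion comes out equal to p, which is what "targets the population proportion" means.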
Slide 12
Estimators
Some statistics work much better than
others as estimators of population parameters.
The example that follows shows this.
Slide 13
Example - Sampling Distributions
A population consists of the values 1, 2, and 5. We
randomly select samples of size 2 with replacement.
There are 9 possible samples.
What are they?
a. For each sample, find the mean, median, range,
variance, and standard deviation.
b. For each statistic, find the mean of the values from part (a).
Slide 14
Fill in the table (part a) (For each sample, find
the mean, median, range, variance, and standard deviation).
Sample    Mean    Median    Range    Variance    Std. Dev.    Prop. of odd #s    Prob.
1,1
1,2
1,5
2,1
2,2
2,5
5,1
5,2
5,5
Slide 15
Fill in the bottom row (part b): (For each statistic, find the
mean from part (a))
Mean of statistics: ___ ___ ___ ___ ___ ___
Population parameter: ___ ___ ___ ___ ___ ___
Does the sample statistic target the population parameter? ___ ___ ___ ___ ___ ___
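Once the table has been filled in by hand, the work can be checked by enumeration. The sketch below uses Python's standard library and rests on one assumption the slides do not state: sample variance and standard deviation divide by n - 1, while the population versions divide by N. The names `prop_odd`, `statistics_table`, and `results` are illustrative.

```python
from itertools import product
from statistics import mean, median, pstdev, pvariance, stdev, variance

population = [1, 2, 5]
samples = list(product(population, repeat=2))  # 9 ordered samples, with replacement

def prop_odd(values):
    """Proportion of odd numbers among the values."""
    return sum(v % 2 for v in values) / len(values)

# Each sample statistic is paired with the population parameter it estimates.
# Assumption: sample variance/std. dev. use n - 1; population versions use N.
statistics_table = {
    "mean": (mean, mean(population)),
    "median": (median, median(population)),
    "range": (lambda s: max(s) - min(s), max(population) - min(population)),
    "variance": (variance, pvariance(population)),
    "std. dev.": (stdev, pstdev(population)),
    "prop. odd": (prop_odd, prop_odd(population)),
}

results = {}
for name, (stat, parameter) in statistics_table.items():
    # Part (b): average the statistic over all 9 equally likely samples
    mean_of_stat = mean(stat(s) for s in samples)
    results[name] = (mean_of_stat, parameter)
    print(f"{name:10s} mean of statistic = {mean_of_stat:.3f}, parameter = {parameter:.3f}")
```

A statistic targets its parameter when the two printed numbers agree; comparing them row by row answers the last question in the table.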
Slide 16
Slide 17
Interpretation of Sampling Distributions
We can see that when using a sample statistic to
estimate a population parameter, some statistics are
good in the sense that they target the population
parameter and are therefore likely to yield good
results. Such statistics are called unbiased
estimators.
Statistics that target population parameters: mean,
variance, proportion
Statistics that do not target population parameters:
median, range, standard deviation
Slide 18
Recap
In this section we have discussed:
* Sampling distribution of a statistic.
* Sampling distribution of a proportion.
* Sampling distribution of the mean.
* Sampling variability.
* Estimators.
Slide 19