Where Are You?
Children
Adults
Section 7.1
Generating Sampling
Distributions
Why Statistics?
Statistics allow us to make inferences
(or draw conclusions) about a
population.
Population
How do we describe populations?
If we can graph data about the population,
we can use shape, center, and spread.
We may use summary numbers such as the
mean (μx), standard deviation (σx),
median, quartiles, etc.
Population
What do we call a summary number that
describes a population or a probability
distribution?
parameter
Summary Statistic
What is a summary statistic?
Summary Statistic
Summary statistic: a summary number
calculated from a sample taken from a
population
Summary Statistic
Examples:
sample mean x̄,
sample standard deviation sx,
5-number summary
Sample: statistic
Population: parameter
Sampling Distribution
Suppose we take a random sample of a
fixed size n from our population and
compute a summary statistic.
Then, suppose we repeat this process many
times.
Sampling Distribution
Sampling distribution: distribution of
summary statistics you get from taking
repeated random samples
Answers the question “How does my
summary statistic behave when I repeat
the process many times?”
(Think Law of Large Numbers).
Sampling Distribution
Two types of sampling distributions:
• Exact Sampling Distribution
• Approximate Sampling Distribution
Exact Sampling Distributions
When the population size is very small, you
can construct the sampling distribution
exactly by listing the value of the
summary statistic for every possible
sample of a fixed size n.
Exact Sampling Distributions
Utah has five national parks. Your company
has been hired to make maps of two of
these parks, which will be selected at
random.
1. Construct the sampling distribution for the
total number of square miles you would
map.
2. Find P(map > 600 mi²)
Exact Sampling Distributions
Utah has five national parks. Your company
has been hired to make maps of two of
these parks, which will be selected at
random.
How many possible samples of size n = 2
are there? 5C2 = 10 samples
P(map > 600 mi²) = 4/10
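The enumeration behind this answer can be sketched in a few lines of Python. The park areas are not given in this transcript, so the figures below are approximate values assumed for illustration only; the variable names are likewise hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# ASSUMPTION: approximate areas in square miles. The transcript does not
# list them, so treat these numbers as illustrative.
park_areas = {
    "Arches": 119,
    "Bryce Canyon": 56,
    "Canyonlands": 527,
    "Capitol Reef": 378,
    "Zion": 229,
}

# List every possible sample of size n = 2 together with its total area.
totals = {
    pair: park_areas[pair[0]] + park_areas[pair[1]]
    for pair in combinations(park_areas, 2)
}

for pair, total in sorted(totals.items(), key=lambda item: item[1]):
    print(f"{pair[0]} + {pair[1]}: {total} sq mi")

# All samples are equally likely, so the probability is a proportion.
p_over_600 = Fraction(sum(t > 600 for t in totals.values()), len(totals))
print(p_over_600)  # 2/5, that is, 4 of the 10 samples
```

With these assumed areas, four of the ten pairs total more than 600 square miles, which agrees with the 4/10 on the slide.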
Exact Sampling Distributions
Should we always use exact sampling
distributions?
Always use Exact Sampling
Distribution?
Suppose you have a population of 100
rectangles with varying dimensions and
you had to construct a sampling
distribution for a sample of size 5.
How many ways can you choose 5
rectangles at a time from the population of
100? 100C5 = 75,287,520
Would you construct an exact sampling
distribution here?
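As a quick check on the two counts quoted so far, Python's standard library can compute the number of combinations directly. This is only a verification sketch, not part of the original slides.

```python
from math import comb

# Number of ways to choose 2 parks out of 5.
print(comb(5, 2))    # 10

# Number of ways to choose 5 rectangles out of 100.
print(comb(100, 5))  # 75287520
```

Listing ten samples by hand is easy; listing more than 75 million is not, which is the motivation for simulating instead.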
Approximate Sampling Distribution
An approximate sampling distribution is also
known as a simulated sampling distribution.
4-step process:
1. Take a random sample of fixed size n from
the population
2. Compute the summary statistic of interest
(for example: mean, median, min, or max)
3. Repeat steps 1 and 2 many times
4. Display the distribution of the summary
statistic
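The four steps can be sketched as a short simulation. The function name `simulate_sampling_distribution` and the population used here are hypothetical: the rectangle data from the textbook are not in this transcript, so a made-up right-skewed population of 100 values stands in for them.

```python
import random
import statistics
from collections import Counter


def simulate_sampling_distribution(population, n, statistic,
                                   repetitions=1000, seed=0):
    """Return `repetitions` values of `statistic`, one per random sample."""
    rng = random.Random(seed)
    results = []
    for _ in range(repetitions):                # step 3: repeat many times
        sample = rng.sample(population, n)      # step 1: random sample of size n
        results.append(statistic(sample))       # step 2: compute the statistic
    return results


# ASSUMPTION: a made-up right-skewed population of 100 values.
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1 / 7.4) for _ in range(100)]

sample_means = simulate_sampling_distribution(population, 5, statistics.mean)

# Step 4: display the distribution as a crude text histogram (bin width 2,
# one star per 10 samples).
bins = Counter(int(m // 2) * 2 for m in sample_means)
for left in sorted(bins):
    print(f"{left:>3}-{left + 2:<3} {'*' * (bins[left] // 10)}")
```

Passing `statistics.median`, `min`, or `max` instead of `statistics.mean` gives the simulated sampling distribution of those statistics.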
Display 7.2, p. 411
Each dot represents 1 rectangle.
Shape: skewed right
Center: μx = 7.4
Spread: σx = 5.2
Now,
• 5 rectangles were selected at random
• Mean area of these five was calculated
• This was repeated 1000 times
Display 7.3, p. 411
Shape: approx. normal
Center: μx̄ = 7.4
Spread: σx̄ = 2.3
          Population      Sampling Dist.
Shape:    skewed right    approx. normal
Center:   μx = 7.4        μx̄ = 7.4
Spread:   σx = 5.2        σx̄ = 2.3
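The smaller spread of the sampling distribution is consistent with a standard result for the sample mean that these slides do not state: the standard deviation of the sample mean is roughly the population standard deviation divided by the square root of the sample size. As a rough check with n = 5, ignoring the small correction for sampling without replacement from a finite population:

```latex
\sigma_{\bar{x}} \approx \frac{\sigma_x}{\sqrt{n}} = \frac{5.2}{\sqrt{5}} \approx 2.3
```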
Reasonably Likely vs Rare Events
What are these?
Reasonably Likely vs Rare Events
Reasonably likely events: values that lie in
the middle 95% of sampling distribution
Rare events: values that lie in outer 5% of
sampling distribution
Display 7.3, p. 411
Reasonably Likely vs Rare Events
In a normal distribution, rare events lie more
than approximately 2 standard deviations
from the mean
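One way to see the two definitions side by side is to take a simulated sampling distribution and compare its middle 95% with the "mean plus or minus 2 standard deviations" rule. The sketch below again uses a made-up right-skewed population, since the rectangle data are not in this transcript; `is_rare` is a hypothetical helper.

```python
import random
import statistics

rng = random.Random(0)

# ASSUMPTION: a made-up right-skewed population of 100 values.
population = [rng.expovariate(1 / 7.4) for _ in range(100)]

# Simulated sampling distribution of the mean for samples of size 5.
sample_means = [statistics.mean(rng.sample(population, 5)) for _ in range(1000)]

# Middle 95%: cut points at 2.5%, 5%, ..., 97.5%; keep the first and last.
cuts = statistics.quantiles(sample_means, n=40)
lower, upper = cuts[0], cuts[-1]

center = statistics.mean(sample_means)
spread = statistics.pstdev(sample_means)

print(f"middle 95% of simulated means: {lower:.2f} to {upper:.2f}")
print(f"mean -/+ 2 SD:                 {center - 2 * spread:.2f} to {center + 2 * spread:.2f}")


def is_rare(value):
    """A value is a rare event if it falls outside the middle 95%."""
    return value < lower or value > upper
```

The two intervals agree only approximately here, because with n = 5 the simulated distribution is still somewhat skewed rather than exactly normal.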
Vocabulary
Population standard deviation is the
standard deviation of the population, σ
Point Estimators
Point estimator: a statistic from a sample
that provides a single point (number) as a
plausible value of a population parameter
Sampling Bias
Recall, when we discussed sampling bias.
What happens if you have biased results?
The estimate from the sample is larger or
smaller, on average, than the population
parameter being estimated
Point Estimators
A summary statistic is a biased estimator
of a population parameter if it gives
results that are too large or too small on
average
Biased or Unbiased Estimators?
For a sampling distribution, are these biased
or unbiased estimators?
1) Sample mean
2) Sample median
3) Sample maximum
4) Sample minimum
5) Sample range
6) Sample standard deviation
Estimators
Sample mean is unbiased estimator of the
population mean because the mean of
the sampling distribution of the sample
mean is equal to the population mean
Sample median?
Estimators
Sample median is nearly unbiased
estimator of population median for large
samples
Sample maximum?
Estimators
Sample maximum is biased estimator of
population maximum and is biased in
direction of being too small
Sample minimum?
Estimators
Sample minimum is biased estimator of
population minimum and tends to be too
large
Sample range?
Estimators
Sample range is biased estimator of
population range and tends to be too
small
Sample standard deviation?
Estimators
Sample standard deviation is biased
estimator of population standard deviation
and tends to be too small
Biased or Unbiased Estimators?
For a sampling distribution, are these
biased or unbiased estimators?
1) Sample mean: unbiased
2) Sample median: nearly unbiased
3) Sample maximum: biased (too small)
4) Sample minimum: biased (too large)
5) Sample range: biased (too small)
6) Sample standard deviation: biased (too small)
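These claims can be checked by simulation: draw many samples, average each statistic over the samples, and compare that average with the corresponding population parameter. The sketch below is illustrative only; the population is made up, and the names `estimators`, `parameters`, and `averages` are hypothetical.

```python
import random
import statistics

rng = random.Random(0)

# ASSUMPTION: a made-up right-skewed population of 100 values.
population = [rng.expovariate(1 / 7.4) for _ in range(100)]

estimators = {
    "mean": statistics.mean,
    "median": statistics.median,
    "maximum": max,
    "minimum": min,
    "range": lambda xs: max(xs) - min(xs),
    "standard deviation": statistics.stdev,   # sample SD (n - 1 divisor)
}

# Population parameters: the same summaries computed on the whole population,
# except that the population SD uses the N divisor.
parameters = {name: f(population) for name, f in estimators.items()}
parameters["standard deviation"] = statistics.pstdev(population)

# Many random samples of size 5, drawn without replacement.
samples = [rng.sample(population, 5) for _ in range(5000)]

# Center of each simulated sampling distribution.
averages = {
    name: statistics.mean([f(s) for s in samples])
    for name, f in estimators.items()
}

for name in estimators:
    print(f"{name:<20} parameter = {parameters[name]:6.2f}   "
          f"average estimate = {averages[name]:6.2f}")
```

An unbiased estimator shows an average estimate close to its parameter; a biased one sits consistently above or below it.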
Questions?