Sampling Distributions, the CLT,
and Estimation
Carolyn J. Anderson
EdPsych 580
Fall 2005
Sampling Distributions, the CLT, and Estimation – p. 1/63
Sampling and Estimation
• Sampling Distributions
• Normal distribution & Central Limit Theorem
• Estimators and estimates
• Statistical Inference (interval estimation)
Recall The Big Picture
[Diagram: Population (size N, mean µ, s.d. σ) −→ select a subset −→ Sample (size n) −→ make inferences back to the population.]
Population
A population (or “sample space”) consists of elementary events.
• All potential units that could be observed.
• If finite, then the number of units is countable. If infinite, then the number of potential observations is infinite. If “virtually infinite”, then the number is very, very, very large.
• Real or hypothetical.
  • All college students in the U.S.
  • All possible mean SAT scores from samples drawn from all college students in the U.S.
Random Variables
• A Random Variable is a number assigned to any particular member of the population. This set of numbers has a distribution.
• The Population Distribution is the (frequency) distribution of these random variables. It has some form with mean µ and variance σ².
• Population distributions are almost always treated as (theoretical) probability distributions.
• Random sampling with replacement −→ the long-run relative frequency of a value is the same as the probability of that value.
Parameters
Parameters of populations (“true values”) are values that summarize (define) the distribution.
• Mean
• Variance
• others
Sample
• A Sample is a subset of n units from the population.
• Quantities or values computed from a sample of observations of random variables are Statistics.
• Examples:
  • Mean: X̄ = (1/n) Σᵢ₌₁ⁿ Xᵢ
  • Variance: s²ₙ = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)²
  • 2nd observation on X: X₂
  • Range: (Xmax − Xmin)
Sampling Distributions
**Key Concept**
A “conceptual experiment”:
• Imagine randomly sampling n individuals from a population and computing some statistic based on the sample.
• Repeat this (independently) many times.
• Result: many values of the sample statistic −→ the sampling distribution of the sample statistic.
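This conceptual experiment is easy to run numerically. A minimal sketch (the Uniform(0, 1) population, the sample size, and the number of repetitions are my illustrative choices, not from the slides):

```python
import numpy as np

# A numeric sketch of the conceptual experiment above.  The
# Uniform(0, 1) population is an illustrative assumption; any
# population with a mean and variance would do.
rng = np.random.default_rng(0)

n = 25          # sample size
reps = 100_000  # number of repeated samples

# Each row is one random sample; each row mean is one draw from the
# sampling distribution of the sample mean.
means = rng.uniform(0.0, 1.0, size=(reps, n)).mean(axis=1)

mu = 0.5                     # population mean of Uniform(0, 1)
sigma = np.sqrt(1.0 / 12.0)  # population s.d. of Uniform(0, 1)

print(means.mean())  # close to mu
print(means.std())   # close to sigma / sqrt(n)
```

The collected means cluster around µ with spread near σ/√n, previewing the results derived on the following slides.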
Sampling Distributions (continued)
From Hays:
A Sampling Distribution is a theoretical probability
distribution that shows the functional relation
between possible values of a given statistic
based on a sample of n cases and the probability
(density) associated with each value, for all
possible samples of size n drawn from a
particular population.
Sampling Distributions (continued)
• In general, the sampling distribution will not be the same as the population distribution.
• We describe sampling distributions the same way that we describe population (or sample) distributions, i.e., mean, variance, standard deviation, shape, etc.
Characteristics of Sampling Dist.
If the population distribution has mean µ and variance σ², then the sampling distribution of the sample mean (for samples of size n) has
• Mean equal to the population mean, µ.
• Variance equal to the population variance divided by the sample size, σ²/n.
• Standard Deviation, the “standard error of the mean”, equal to σ/√n.
Characteristics of Sampling Dist.
The statements on the previous slide regarding the mean, variance, and standard deviation of the sampling distribution of the mean hold regardless of the shape of the parent/population distribution.
Eg: Sampling Dist of the Mean
• Population: Y is a random variable with mean µ and variance σ².
• Sample: a random (independent) sample from the population: Y₁, Y₂, . . . , Yₙ.
• The sample mean Ȳ = (1/n) Σᵢ₌₁ⁿ Yᵢ.
• Expected value of Ȳ (E(Ȳ), the mean of the sampling distribution of Ȳ) . . .
Expected value of Ȳ
The mean of the sampling distribution of Ȳ . . .
E[Ȳ] = E[(1/n)(Y₁ + Y₂ + . . . + Yₙ)]
     = (1/n) E[Y₁ + Y₂ + . . . + Yₙ]
     = (1/n) (E[Y₁] + E[Y₂] + . . . + E[Yₙ])
     = (1/n) (µ + µ + . . . + µ)
     = (1/n) Σᵢ₌₁ⁿ µ = µ
Variance of Ȳ
• Recall that σ² = E[(Y − µ)²] = E(Y²) − µ².
• var(Ȳ) = E[(Ȳ − µ)²] = E[Ȳ²] − µ².
• Square the sample mean:
  Ȳ² = (Y₁ + Y₂ + . . . + Yₙ)² / n²
     = (Y₁² + . . . + Yₙ² + 2Y₁Y₂ + 2Y₁Y₃ + . . . + 2Y₍ₙ₋₁₎Yₙ) / n²
• If two random variables, e.g. Y₁ and Y₂, are independent, then
  E(Y₁Y₂) = E(Y₁)E(Y₂) = µµ = µ²
Variance of Ȳ (continued)
E[Ȳ²] = E[Y₁² + . . . + Yₙ² + 2Y₁Y₂ + 2Y₁Y₃ + . . . + 2Y₍ₙ₋₁₎Yₙ] / n²
      = (Σᵢ₌₁ⁿ E[Yᵢ²] + 2 Σᵢ>ⱼ E[YᵢYⱼ]) / n²
      = (Σᵢ₌₁ⁿ (σ² + µ²) + 2 Σᵢ>ⱼ µ²) / n²
      = (n(σ² + µ²) + 2 ((n − 1)n/2) µ²) / n²
      = (σ² + nµ²) / n
      = σ²/n + µ²
Variance of Ȳ (continued)
var(Ȳ) = E[(Ȳ − µ)²]
       = E[Ȳ²] − µ²
       = (σ²/n + µ²) − µ²
       = σ²/n
We made no assumptions regarding the nature of the population distribution, except that the mean equals µ and the variance equals σ²!
Variance of Ȳ (continued)
var(Ȳ) = σȲ² = σ²/n
As n increases, var(Ȳ) decreases (i.e., the precision of Ȳ as an estimate of µ increases).
Normal Distribution and the C.L.T.
• The normal distribution is a particular probability distribution for continuous variables.
• The “Bell Curve”.
• Why is it so important?
  • It’s a good approximation of the (population) distribution of many measured variables.
  • Many statistical procedures are based on the assumption of a normal distribution (e.g., sampling distributions of statistics).
  • It has lots of nice mathematical properties.
The Normal Distribution
Formal definition: The family of normal distributions is a set of symmetric, bell-shaped curves, each characterized by its µ and σ². The formula for the normal p.d.f. is
f(x) = (1/√(2πσ²)) e^(−(1/2)((x − µ)/σ)²)
where
• e = 2.71828 . . . (base of the natural log).
• π = 3.14159 . . . (circumference/diameter).
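As a quick check, the p.d.f. formula can be typed in directly (a sketch; the function name `normal_pdf` is mine):

```python
import math

def normal_pdf(x, mu=0.0, sigma2=1.0):
    # f(x) = (1/sqrt(2*pi*sigma^2)) * e^(-(1/2)((x - mu)/sigma)^2)
    sigma = math.sqrt(sigma2)
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / math.sqrt(2 * math.pi * sigma2)

# Height at the mean for the standard normal: 1/sqrt(2*pi) ≈ .3989
print(normal_pdf(0.0))
```

Evaluating it at a few points (e.g., the peak height shrinking as σ² grows from 1 to 4) reproduces the pictures on the next slides.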
Normal: µ = 0 and σ² = 4
Normal: σ² = 1, µ = 0, 5, 10
Normal: µ = 0, σ² = 1, 4, 16
Normal: A bunch of different ones
The Standard Normal Distribution
The Standard Normal Distribution
• You can transform any normally distributed variable into a standard normal one:
  z–score = (Y − µ)/σ
• A z-score equals how many standard deviations a value of Y is from its mean:
  zσ = Y − µ
• Use z-scores to find probabilities for continuous variables from tabled values or computer programs for the standard normal distribution.
• z ∼ N(0, 1) (a special case of x ∼ N(µ, σ²)).
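A sketch of the transformation and the table lookup in code (the helper names and the N(500, 100²) example numbers are mine; the CDF uses the error function in place of a printed table):

```python
import math

def z_score(y, mu, sigma):
    # Number of standard deviations y lies from its mean.
    return (y - mu) / sigma

def std_normal_cdf(z):
    # P(Z <= z) for Z ~ N(0, 1), via the error function -- the same
    # value a z-table would give.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Illustrative numbers (not from the slides): Y ~ N(500, 100^2).
z = z_score(600, 500, 100)   # 1.0
print(std_normal_cdf(z))     # about .8413
```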
The Standard Normal Distribution
Finding areas/probabilities for the standard normal distribution:
• Course web-site — downloadable program, pvalue.exe
• UCLA web-site
• SAS function “probnorm” (default is N(0, 1), but can ask for others).
The Central Limit Theorem
Version 1 (sums): Consider a random sample from a population distribution having mean µ and variance σ². If n is sufficiently large, then the sampling distribution of Σᵢ₌₁ⁿ Yᵢ is approximately normal with mean nµ and variance nσ².
Version 2 (means): Consider a random sample from a population distribution having mean µ and variance σ². If n is sufficiently large, then the sampling distribution of Ȳ is approximately normal with mean µ and variance σȲ² = σ²/n.
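A simulation sketch of Version 1 with a deliberately skewed parent (Exponential(1), an illustrative choice with µ = 1 and σ² = 1; sums of n draws should then be approximately normal with mean n·µ and variance n·σ²):

```python
import numpy as np

# Sums of n skewed draws: the sampling distribution of the sum is
# approximately N(n*mu, n*sigma^2) for large n.
rng = np.random.default_rng(1)

n, reps = 50, 100_000
sums = rng.exponential(1.0, size=(reps, n)).sum(axis=1)

print(sums.mean())  # close to n * mu = 50
print(sums.std())   # close to sqrt(n * sigma^2) = sqrt(50)
```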
Example: Normal (0,1) “Parent”
Parent N(0, 1) =⇒ the sampling distribution of Ȳ is N(0, 1/n).
Uniform Parent (µ = .5)
Pink is the “kernel” density and red is the normal.
Need more than n = 10 for this one. . .
Skewed Parent (µ = 1)
Pink is the “kernel” density and red is the normal.
Need more than n = 10 for this one. . .
Skewed Parent (µ = 1) (continued)
Dice Rolling (“Multinomial”)
Dice Rolling (continued)
Look pretty normal?
Example: Dice Rolling (continued)
Example: Dice Rolling (continued)
Population: µ = 3.5, σ² = 2.92, σ = 1.71

The MEANS Procedure

Variable   N   Mean  Std Dev   Std Dev should be
spot1       1  3.5   1.71      1.71/√1 = 1.71
mean2       2  3.5   1.21      1.71/√2 = 1.21
mean5       5  3.5   0.76      1.71/√5 = .76
mean20     20  3.5   0.38      1.71/√20 = .38
mean50     50  3.5   0.24      1.71/√50 = .24
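The “should be” column can be reproduced from the population values for a fair die (a quick check, not the simulation itself):

```python
import math

# Population mean and s.d. of a single fair-die roll, then the
# standard error sigma / sqrt(n) for each sample size in the table.
spots = [1, 2, 3, 4, 5, 6]
mu = sum(spots) / 6                                       # 3.5
sigma = math.sqrt(sum((s - mu) ** 2 for s in spots) / 6)  # about 1.71

for n in [1, 2, 5, 20, 50]:
    print(n, round(sigma / math.sqrt(n), 2))
```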
Example: Dice Rolling (continued)
Another Discrete Distribution (Bernoulli)
P (Y = 0) = P (Y = 1) = .5, µ = .5, σ 2 = .25
Another Discrete Distribution (Bernoulli)
P(Y = 0) = P(Y = 1) = .5, µ = .5, σ² = .25

Variable      n     Mean  Std Dev   Should be
x1                1  .50   .50      √(.25/1) = .5
mean2             2  .50   .35      √(.25/2) = .35
mean5             5  .50   .22      √(.25/5) = .22
mean50           50  .50   .07      √(.25/50) = .07
mean100         100  .50   .05      √(.25/100) = .05
mean500         500  .50   .02      √(.25/500) = .02
mean5000      5,000  .50   .01      √(.25/5000) = .01
Another Discrete Distribution (Bernoulli)
P (Y = 0) = .99, P (Y = 1) = .01: µ = .01,
σ 2 = .0099
Another Discrete Distribution (Bernoulli)
P (Y = 0) = .99, P (Y = 1) = .01, µ = .01,
σ 2 = .0099
How about n = 500 (left) and n = 5, 000 (right)?
Another Discrete Distribution (Bernoulli)
µ = .01 and σ² = .0099

Variable      n     Mean  Std Dev
x1                1  .01   .1028864
mean2             2  .01   .0713202
mean5             5  .01   .0444879
mean50           50  .01   .0140665
mean100         100  .01   .0099457
mean500         500  .01   .0044401
mean5000       5000  .01   .0014053

The simulated Std Dev is not exactly equal to √(σ²/n) = √(.0099/n) because we would need more than 100,000 sample means.
Implication of C.L.T. or NOT?
• As n increases, σȲ² decreases; the sampling error in estimating µ decreases when sample size increases. NOT
• Sampling distributions of (most) statistics are approximately normal regardless of the shape of the parent (population) distribution. YES
• Sampling distributions of statistics take on more normal shapes as n increases. Usually with n as small as 25 to 30, the sampling distribution is well approximated by the normal. YES
If the population distribution is “well behaved”, then the normal distribution is good for almost all sample sizes.
C.L.T: Summary of Implications
• Since the sampling distribution of Ȳ is approximately N(µ, σ²/n), we can use the tabled probabilities of the standard normal distribution to compute interval estimates of µ and do statistical tests (i.e., make statistical inferences about the degree of uncertainty). . . . more later
• n = 25 or 30 does not imply that we have sufficient precision. We may require much larger n’s to detect small effects. n = 30 means only that the sampling distribution of Ȳ is often approximately normal.
C.L.T: Summary of Implications
• The sampling distribution of Ȳ always has mean µ and variance σ²/n. The shape of the sampling distribution of Ȳ is normal for small n only if the population distribution of Y is normal.
• n = 30 does not ensure that the sampling distribution of a statistic will be even approximately normal. There are cases where much larger samples are required. These cases usually are ones where the statistic equals the sum of values that are discrete (e.g., Y = 0, 1) and the probability of (say) Y = 1 is very, very small.
• The C.L.T. can be proven mathematically.
Estimators and Estimates
• An estimator is a formula for computing an estimate.
• An estimator is a random variable whose value depends on your sample.
• An estimate is a particular value of an estimator.
• Examples: . . .
Estimators and Estimates (continued)
• Examples of estimators and estimates:
  • The sample mean and variance are estimators:
    Ȳ = (1/n) Σᵢ₌₁ⁿ Yᵢ
    s²ₙ = (1/n) Σᵢ₌₁ⁿ (Yᵢ − Ȳ)²
  • Given data from a sample, the estimates are, e.g., for HSB reading scores:
    Ȳ = 55.89   s²ₙ = 80.00
• The above estimates are point estimates.
Properties of Estimators
• Bias
• Consistency
• Relative efficiency
• Sufficiency
• Maximum likelihood
Properties of Estimators: Bias
• An estimator is unbiased if its expected value equals the population value.
• An estimator is biased if its expected value does not equal the population value.
• The sample mean Ȳ = (1/n) Σᵢ₌₁ⁿ Yᵢ is an unbiased estimator of µ:
  E(Ȳ) = µ
• If the parent population is normal, then the median and mode are unbiased estimators of µ:
  E(median) = E(mode) = µ
Properties of Estimators: Bias (continued)
• The sample variance s²ₙ = (1/n) Σᵢ₌₁ⁿ (Yᵢ − Ȳ)² is a biased estimator of σ²:
  E(s²ₙ) = σ² − (1/n)σ²
  It’s a little too small.
• The unbiased estimator of σ² is s² = (1/(n − 1)) Σᵢ₌₁ⁿ (Yᵢ − Ȳ)²:
  E(s²) = σ²
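The bias is easy to see in a simulation sketch (the N(0, 1) parent, with σ² = 1, and n = 5 are illustrative choices):

```python
import numpy as np

# Dividing by n underestimates sigma^2 by sigma^2/n on average;
# dividing by n - 1 removes the bias.
rng = np.random.default_rng(2)

n, reps = 5, 200_000
samples = rng.normal(0.0, 1.0, size=(reps, n))

s2n = samples.var(axis=1, ddof=0)  # divide by n     (biased)
s2 = samples.var(axis=1, ddof=1)   # divide by n - 1 (unbiased)

print(s2n.mean())  # close to sigma^2 - sigma^2/n = 0.8
print(s2.mean())   # close to sigma^2 = 1.0
```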
Consistency & Efficiency
• Consistency: As the sample size n increases, the sample statistic “converges in probability” to the population value.
  • The sample mean Ȳ is a consistent estimator of µ.
  • The 2nd observation in a sample is not a consistent estimator of µ.
• Relative Efficiency: An estimator is more efficient if the variance of its sampling distribution is less than the variance of another estimator. E.g., for normal Y, Ȳ is more efficient than the median.
Sufficiency
• A statistic is sufficient if it contains all the information in the data about the population parameter.
  E.g., Ȳ is sufficient for µ, and Σᵢ₌₁ⁿ Yᵢ is sufficient for µ.
• Sufficient statistics don’t always exist.
• In some population distributions, you may need more than 1 parameter to completely specify the distribution.
  E.g., a Bernoulli needs only the mean (or probability); a normal distribution needs Ȳ and s².
Maximum Likelihood
A maximum likelihood estimator (M.L.E.) maximizes the likelihood (probability) of obtaining the sample you got.
• Ȳ is the M.L.E. of µ (it’s also consistent, efficient, and unbiased).
• s²ₙ is the M.L.E. of σ² but is biased.
• s² is not the M.L.E. but is unbiased.
Interval Estimates & Statistical Inference
• So far, we’ve just considered “point estimates” (a “best guess”).
• We might want a range of possible values.
• A range of values that has a high probability of containing the true population value.
• Confidence Interval Estimate
Confidence Interval for µ
• We know
  • E(Ȳ) = µ
  • σȲ² = σ²/n
• We assume that the sampling distribution of Ȳ is normal (i.e., that n is “large enough”); that is,
  Ȳ ≈ N(µ, σ²/n)
Sampling Distribution of Ȳ
E(Ȳ ) = µ and σȲ2 = σ 2 /n
Confidence Interval for µ
• Our best point estimate is Ȳ, so an interval estimate should be centered around Ȳ.
• We add and subtract an amount c such that
  Prob[(Ȳ − c) ≤ µ ≤ (Ȳ + c)] = 1 − α
• To find the value of c, transform Ȳ to z-scores; that is,
  z = (Ȳ − µ)/σȲ
  where σȲ = √(σ²/n).
Transform to z-Scores
z = (Ȳ − µ)/σȲ
Confidence Interval for µ
• Before we look at data,
  Prob(−zα/2 ≤ (Ȳ − µ)/σȲ ≤ zα/2) = 1 − α
  Prob(Ȳ − zα/2 σȲ ≤ µ ≤ Ȳ + zα/2 σȲ) = 1 − α
• Once you get data, an interval estimate of µ is
  Ȳ ± zα/2 σȲ
• The probability that µ is in this interval is NOT 1 − α.
Correct Interpretation of CI
• Consider repeating the process:
  1. Draw/take a sample of size n.
  2. Compute the (1 − α) × 100% confidence interval.
• (1 − α) × 100 percent of the time, the interval would contain µ.
• Note: later we’ll consider the more realistic situation where we estimate σ.
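This repeated-sampling interpretation can itself be simulated (a sketch; the known σ, the N(50, 10²) parent, n = 25, and α = .05 are illustrative assumptions):

```python
import numpy as np

# Build many 95% intervals with known sigma and count how often
# they contain mu.
rng = np.random.default_rng(3)

mu, sigma, n, reps = 50.0, 10.0, 25, 20_000
z = 1.96                 # z_{alpha/2} for alpha = .05
se = sigma / np.sqrt(n)  # sigma_Ybar = sigma / sqrt(n)

ybar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
covered = (ybar - z * se <= mu) & (mu <= ybar + z * se)

print(covered.mean())  # close to 1 - alpha = .95
```

Any single interval either contains µ or it doesn’t; the .95 describes the long-run behavior of the procedure, not one interval.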
HSB Reading Scores for Academic
• Sample statistics for students attending an academic/prep school on the variable “RDG” (reading achievement in T-scores):
  n = 308, Ȳ = 55.89, s² = 87.15, s = 9.34
• Standard error of the mean = 9.34/√308 = .53
• The sampling distribution of Ȳ should be very well approximated by the normal distribution because n is large and the distribution of RDG scores is “nice”.
HSB Reading Scores for Academic
68% CI: 55.89 ± 1.00(.53) −→ (55.36, 56.42)
90% CI: 55.89 ± 1.645(.53) −→ (55.02, 56.77)
95% CI: 55.89 ± 1.96(.53) −→ (54.85, 56.93)
99% CI: 55.89 ± 2.58(.53) −→ (54.52, 57.26)
Higher confidence levels (smaller α) −→ wider intervals.
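These intervals can be recomputed directly as Ȳ ± z·SE, using the slide’s rounded SE of .53:

```python
# Recompute the HSB confidence intervals: ybar ± z * SE.
ybar, se = 55.89, 0.53

intervals = {}
for z in [1.00, 1.645, 1.96, 2.58]:
    intervals[z] = (round(ybar - z * se, 2), round(ybar + z * se, 2))
    print(z, intervals[z])
```

The 90% interval’s endpoints can differ by ±.01 from the slide depending on whether the SE is rounded to .53 before multiplying.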