Outline for Class Meeting 2 (1/23/06)
Design-Based Estimation and the Horvitz-Thompson Estimator
I. An unbiased estimator for any design
A. If you know the first order selection probabilities $\pi_i$ for any design, you can construct an
unbiased estimator of $t$ as
$$\hat{t} = \sum_{i \in S} \frac{y_i}{\pi_i}.$$
This is called a Horvitz-Thompson estimator. To prove that this estimator is unbiased,
notice that $\hat{t}$ can be rewritten as
$$\hat{t} = \sum_{i \in U} \frac{Z_i y_i}{\pi_i},$$
where $Z_i = 1$ if $i \in S$ and $Z_i = 0$ otherwise. Note that $Z_i$ is Bernoulli($\pi_i$). Then
$$E(\hat{t}) = \sum_{i \in U} \frac{E(Z_i) y_i}{\pi_i} = \sum_{i \in U} y_i = t.$$
An unbiased estimator of the population mean is then $\hat{t} / N$.
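The unbiasedness argument above is easy to check by simulation. The sketch below (hypothetical y values and selection probabilities, assumed purely for illustration) uses Poisson sampling, i.e. each unit enters the sample independently with its own probability pi_i, so that E(Z_i) = pi_i holds exactly as in the proof:

```python
import random

# Hypothetical five-unit population (values assumed for illustration).
y = [3.0, 7.0, 11.0, 5.0, 9.0]
pi = [0.2, 0.4, 0.6, 0.3, 0.5]   # first-order selection probabilities
t = sum(y)                        # true total, 35

def ht_estimate(sample):
    """Horvitz-Thompson estimate: sum of y_i / pi_i over sampled units."""
    return sum(y[i] / pi[i] for i in sample)

# Each unit is selected independently with probability pi_i, so Z_i is
# Bernoulli(pi_i), matching the indicator representation in the proof.
random.seed(1)
reps = 200_000
total = 0.0
for _ in range(reps):
    sample = [i for i in range(len(y)) if random.random() < pi[i]]
    total += ht_estimate(sample)
print(total / reps)   # averages to roughly t = 35
```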
B. Weights
1. $1/\pi_i$ is often described as a “weight” attached to the ith unit. Units that have a high
(relative) probability of selection are “weighted down” and those with a low (relative)
probability of selection are “weighted up”.
2. A good check of your computation of selection probability is that the sum of the
selection probabilities must equal sample size (if fixed) or expected sample size (if
random). Why?
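A quick numerical illustration of this check (the probabilities are hypothetical): since the realized sample size is n = sum of the Z_i, linearity gives E(n) = sum of the pi_i.

```python
import random

pi = [0.2, 0.4, 0.6, 0.3, 0.5]   # hypothetical selection probabilities
print(sum(pi))                    # expected sample size: 2.0

# Monte Carlo confirmation under independent (Poisson) selection:
random.seed(2)
reps = 100_000
avg_n = sum(sum(random.random() < p for p in pi) for _ in range(reps)) / reps
print(avg_n)                      # close to 2.0
```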
C. Example: Bernoulli design.
1. Consider a design in which each unit is selected into the sample independently and
with probability $\pi$. Then an unbiased estimator of total is
$$\hat{t} = \sum_{i \in S} \frac{y_i}{\pi}.$$
2. Is the sample mean an unbiased estimator of population mean?
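A sketch of the Bernoulli-design estimator (pi and the y values are assumed for illustration). Note that the H-T estimator divides the realized sample sum by the fixed constant pi, whereas the sample mean divides by the random realized sample size, which is the issue question 2 is probing:

```python
import random

y = [3.0, 7.0, 11.0, 5.0, 9.0]   # hypothetical population values
p = 0.4                           # common selection probability pi
t = sum(y)                        # true total, 35

random.seed(3)
reps = 200_000
total = 0.0
for _ in range(reps):
    sample = [yi for yi in y if random.random() < p]
    total += sum(sample) / p      # H-T estimator for the Bernoulli design
print(total / reps)               # averages to roughly t = 35
```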
II. Variance
A. To find the variance of the estimator above, note that
$$V(\hat{t}) = \sum_{i \in U} \sum_{j \in U} \left( \frac{y_i}{\pi_i} \right) \left( \frac{y_j}{\pi_j} \right) \mathrm{Cov}(Z_i, Z_j),$$
where $\mathrm{Cov}(Z_i, Z_j) = \pi_i (1 - \pi_i)$ when $i = j$ and $\mathrm{Cov}(Z_i, Z_j) = \pi_{ij} - \pi_i \pi_j$ when $i \neq j$.
Thus
$$V(\hat{t}) = \sum_{i \in U} y_i^2 \, \frac{1 - \pi_i}{\pi_i} + \sum_{i \neq j \in U} y_i y_j \, \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j}.$$
This means that if the first and second order selection probabilities are known, you know
what the variance of the estimator looks like. That is why it is useful to know how to
compute second order selection probabilities.
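As a sanity check of the variance formula, the sketch below evaluates it for simple random sampling without replacement, where the selection probabilities are pi_i = n/N and pi_ij = n(n-1)/(N(N-1)) and a familiar closed form is also available; the y values are hypothetical:

```python
# SRS without replacement: pi_i = n/N, pi_ij = n(n-1)/(N(N-1)).
y = [3.0, 7.0, 11.0, 5.0, 9.0]   # hypothetical population values
N, n = len(y), 3
pi = n / N
pij = n * (n - 1) / (N * (N - 1))

# First term: sum over i in U of y_i^2 (1 - pi_i) / pi_i
v = sum(yi**2 * (1 - pi) / pi for yi in y)
# Second term: sum over i != j in U of y_i y_j (pi_ij - pi_i pi_j)/(pi_i pi_j)
for i in range(N):
    for j in range(N):
        if i != j:
            v += y[i] * y[j] * (pij - pi * pi) / (pi * pi)

# Closed form for SRSWOR: N^2 (1 - n/N) S^2 / n
ybar = sum(y) / N
s2 = sum((yi - ybar)**2 for yi in y) / (N - 1)
v_closed = N**2 * (1 - n / N) * s2 / n
print(v, v_closed)   # the two agree (100/3 for these values)
```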
B. To unbiasedly estimate the variance of the H-T estimator for designs in which $\pi_{ij} > 0$
for all $i, j$, just apply the indicator “trick” again:
$$\hat{V}(\hat{t}) = \sum_{i \in U} \sum_{j \in U} \left( \frac{y_i}{\pi_i} \right) \left( \frac{y_j}{\pi_j} \right) \mathrm{Cov}(Z_i, Z_j) \, \frac{Z_i Z_j}{\pi_{ij}} = \sum_{i \in S} y_i^2 \, \frac{1 - \pi_i}{\pi_i^2} + \sum_{i \neq j \in S} y_i y_j \left( \frac{1}{\pi_i \pi_j} - \frac{1}{\pi_{ij}} \right).$$
C. Example (con’t): Bernoulli design
For the Bernoulli design, $\mathrm{Cov}(Z_i, Z_j) = 0$ when $i \neq j$. Thus
$$V(\hat{t}) = \frac{1 - \pi}{\pi} \sum_{i \in U} y_i^2$$
and
$$\hat{V}(\hat{t}) = \frac{1 - \pi}{\pi^2} \sum_{i \in S} y_i^2.$$
D. Is the H-T estimator the “best”?
III. The R.N. Example
Compare the H-T estimator with the “expected” estimator ($N \bar{y}_S$) of total for the R.N.
shift total example.
(a) Is the expected estimator (sample mean) unbiased?
(b) Is the expected estimator ever best?
(c) When does weighting have the most effect?