Download Binary matched pairs

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

History of statistics wikipedia , lookup

Statistics wikipedia , lookup

Transcript
Binary matched pairs
Christiana Kartsonaki
18 July 2014
Introduction
Individuals which are paired, usually such that the two individuals in any one
pair tend to be similar.
In each pair one individual is assigned at random to group 0, the other to
group 1.
On each individual a binary response is observed. Let n be the number of
pairs.
For the i th pair, the observations are represented by random variables
(Yi0 , Yi1 ), i = 1, . . . , n.
Hence the possible observations on a pair, that for group 0 being written
first, are: (0, 0), (0, 1), (1, 0), (1, 1).
R 00 , R 01 , R 10 , R 11 : numbers of pairs with the four types of response.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
2 / 16
Introduction
group 0
group 1
n
n
0
1
The usual χ2 significance test for such a table ignores the correlation induced
by pairing (McNemar, 1947).
The significance of the difference between groups 0 and 1 should be tested
using McNemar’s test, that is, by rejecting the pairs (0, 0) and (1, 1), and by
examining whether the proportion of (1, 0)’s among the remaining discordant
(‘mixed’) pairs (0, 1) and (1, 0) is consistent with binomial variation with
probability 12 (Cox, 1958).
Matched pair designs provide an effective method to control for potential
confounding effects of covariates in studies of the effect of a binary
explanatory variable.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
3 / 16
Conditional analysis
Consider n binary matched pairs (Yi0 , Yi1 ) such that for the i th pair
P(Yi0 = 1) = L(αi ),
P(Yi1 = 1) = L(αi + θ),
(1)
where L(x) = e x /(1 + e x ) is the logistic function.
The parameter θ is estimated using only the discordant pairs (0, 1)
and (1, 0).
αi : a nuisance parameter characteristic of the i th pair
θ: a treatment effect assumed constant on the logistic scale
Christiana Kartsonaki
Binary matched pairs
18 July 2014
4 / 16
Conditional analysis
Because of the large number of nuisance parameters a conditional likelihood
approach is used.
The statistics associated with the αi s, and hence used for conditioning, are
the pair totals Yi0 + Yi1 , i = 1, . . . , n, and the statistic associated with the
parameter θ is the total number of successes for group 1, T = R 01 + R 11 .
Only discordant pairs, for which Yi0 + Yi1 = 1, contribute to T a random
amount. Therefore the conditional distribution considered is that of the
number R 01 of pairs (0, 1), given that R 01 + R 10 = m, the total number of
discordant pairs.
Thus the conditional probability that the i th pair contributes one to R 01 ,
given that it is discordant, is given by
φ = P(Yi0 = 0, Yi1 = 1 | Yi0 + Yi1 = 1) =
eθ
.
1 + eθ
01
This is the
sameθ for
all pairs, thus the conditional distribution of R is
e
Binomial m, 1+e θ .
Christiana Kartsonaki
Binary matched pairs
18 July 2014
5 / 16
Conditional analysis
The probability that a pair is discordant is,
treating αi as a random variable A,
πd = EA {L0 (A)L(A + θ) + L(A)L0 (A + θ)} ,
where L0 (x) = 1 − L(x).
Let µ = E(A) and σ 2 = var(A).
Christiana Kartsonaki
Binary matched pairs
18 July 2014
6 / 16
Conditional analysis
●
●
●
0.7
●
θ=0
θ = 0.5
θ=1
θ = 1.5
θ=2
●●
●●
0.6
●
●
0.5
●●
●
●
●
●
●
●●
●
●
●
0.2
where L0 (x) = 1 − L(x).
0.4
πd
●
0.3
πd = EA {L0 (A)L(A + θ) + L(A)L0 (A + θ)} ,
●
●
●
●
●
●
−2
●
●
●
●
●●●
●
●
●●
●
●
●
●●
●
●
●●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●●●
●
●●
●
●
●●
●
●●
●
●
●
●●
● ●
●
●●
●●●
●●●●● ●●●●●
●●
●●
●
●
●
●
●
●
●
●
●
●● ●
●
●●● ●
●
● ● ●
●
● ●●
●
●
●
●
●
●
●●
●
●●●
●
●
●●●
●
●●
●●
●●
●
●●
●
●
●
●
●●
●
●●
●●●
●●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
−1
0
1
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
1.0
σ
●
The probability that a pair is discordant is,
treating αi as a random variable A,
0.8
0.6
0.4
0.2
0.0
2
ν
Let µ = E(A) and σ 2 = var(A).
Christiana Kartsonaki
Figure: Scatterplot of πd against ν and
σ, where ν = µ + 21 θ; colours represent
different values of θ.
Binary matched pairs
18 July 2014
6 / 16
Conditional analysis
The estimate of θ from the conditional analysis is
θ̂C = log
R 01
R 10
and asymptotically
var(θ̂C ) =
Christiana Kartsonaki
1 (1 + e θ )2
.
nπd
eθ
Binary matched pairs
(2)
18 July 2014
7 / 16
Conditional analysis
The estimate of θ from the conditional analysis is
θ̂C = log
R 01
R 10
and asymptotically
var(θ̂C ) =
1 (1 + e θ )2
.
nπd
eθ
(2)
An alternative is to use an unconditional analysis, in which the probabilities of
success of each group are averaged over the observations in the group, that is, the
matching is ignored.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
7 / 16
Unconditional analysis
Suppose that the pairing is ignored, or equivalently that individuals
are randomized to two groups, 0 and 1, with probabilities of success
P(Yi0 = 1) = E {L(A)} ,
P(Yi1 = 1) = E {L(A + θ)} .
(3)
Here all pairs are used.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
8 / 16
Unconditional analysis
Suppose that the pairing is ignored, or equivalently that individuals
are randomized to two groups, 0 and 1, with probabilities of success
φ0 = P(Yi0 = 1) = E {L(A)} ,
φ1 = P(Yi1 = 1) = E {L(A + θ)} .
(3)
Here all pairs are used.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
8 / 16
Unconditional analysis
The probabilities of success for an individual in each group are approximately
µ
µ+θ
φ0 ' L √
,
φ1 ' L √
,
1 + k 2 σ2
1 + k 2 σ2
where k = 0.607. Then
θU '
Christiana Kartsonaki
p
1 + k 2 σ2
log
φ1
φ0
− log
1 − φ1
1 − φ0
Binary matched pairs
.
18 July 2014
9 / 16
Unconditional analysis
The probabilities of success for an individual in each group are approximately
µ
µ+θ
φ0 ' L √
,
φ1 ' L √
,
1 + k 2 σ2
1 + k 2 σ2
where k = 0.607. Then
θU '
p
1 + k 2 σ2
log
φ1
φ0
− log
1 − φ1
1 − φ0
.
The variance of the estimate of the treatment effect θ in the unconditional
analysis is, assuming σ 2 known,

1+k σ 
var(θ̂U ) '
L √
n
2 2
1
µ
1+k 2 σ 2
L0
√ µ
1+k 2 σ 2
+


1
L
√ µ+θ
1+k 2 σ 2
L0
√ µ+θ
1+k 2 σ 2
.

(4)
Christiana Kartsonaki
Binary matched pairs
18 July 2014
9 / 16
Comparison of the efficiency of the conditional and
unconditional analysis
●
●
●
●
θ=0
θ = 0.5
θ=1
θ = 1.5
θ=2
● ●
● ●
●
1.6
●
●
●
1.4
● ● ●
● ● ●
●
● ●
●
●
●
0.8
1.2
●
1.0
●
● ●
●
●●
●
●
var(θ^C ) / var(θ^U )
● ●
● ●
● ●
●
●
●●
●
●
●
●
●
● ●● ● ●● ● ● ● ●●
●
● ● ● ●●
●
●
● ●
● ●● ●●
●
●
●
●● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
● ● ●
● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●● ● ●
●
●● ●
●● ● ● ●
● ●● ●
●
● ●●
● ●●●
● ●●
●● ●
● ●
●
●
●
●
● ●
●● ●
●
● ●● ● ● ●
● ●
●● ●
●
● ●
● ● ● ●● ● ●
●● ●
● ●
●
● ●
●
●
●
●
●
● ●●
●
●
● ● ●
● ● ●
●
●
● ● ●
●
●●
●
● ● ●
● ● ●
● ●
●
●
●
● ●
● ● ●
●
●
●
●
●
● ●
●
●
● ● ●
●
● ●
●
●
●
●
●
● ●
●
●
●●
●
●
●
● ●●
●
●
●
●
●
● ●
●
●
● ● ●
●
2
ν
●
1
0
●
●
0.6
−1
−2
0.0
0.2
0.4
0.6
0.8
1.0
σ
Figure: Scatterplot of var(θ̂C )/var(θ̂U ) against σ and ν, where ν = µ + 12 θ; colours
represent different values of θ.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
10 / 16
Comparison of the efficiency of the conditional and
unconditional analysis
The ratio var(θ̂C )/var(θ̂U ) is equal to one only in the trivial case of
θ = σ = 0.
When θ = 0, var(θ̂C )/var(θ̂U ) ≤ 1, that is, the conditional analysis yields a
more precise estimate than the unconditional. As σ increases the ratio
decreases.
As ν (or equivalently µ) increases in absolute value, the conditional analysis
becomes more precise, although πd decreases with increasing |ν|.
As θ increases, the ratio becomes larger, especially when σ and |ν| are small.
When θ = 2, the unconditional analysis is almost always more efficient than
the conditional.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
11 / 16
Testing the hypothesis of no treatment effect
In the conditional analysis the statistic
TC = log
R 01
R 10
,
interpreted as the logit difference for the two groups, has expected value
E(TC ) = θ.
Pitman efficacy (Cox and Hinkley, 1974) for testing the hypothesis that θ = 0:
2
∂E(TC )/∂θθ=0
πd
=
.
EC =
4
n var(TC )θ=0
Christiana Kartsonaki
Binary matched pairs
18 July 2014
12 / 16
Testing the hypothesis of no treatment effect
Under the null hypothesis,
1 2
1
EC ' L0 (µ)L(µ) 1 + σ (1 − 6L0 (µ)L(µ)) .
2
2
(5)
In the unmatched analysis the logit difference for the two groups is
1
TU ' θ + σ 2 {L0 (µ + θ) − L(µ + θ) − L0 (µ) + L(µ)} .
2
Pitman efficacy:
1
1 2
EU ' L0 (µ)L(µ) 1 + σ (1 − 8L0 (µ)L(µ)) .
2
2
Christiana Kartsonaki
Binary matched pairs
18 July 2014
(6)
13 / 16
Testing the hypothesis of no treatment effect
Under the null hypothesis,
1 2
1
EC ' L0 (µ)L(µ) 1 + σ (1 − 6L0 (µ)L(µ)) .
2
2
(5)
In the unmatched analysis the logit difference for the two groups is
1
TU ' θ + σ 2 {L0 (µ + θ) − L(µ + θ) − L0 (µ) + L(µ)} .
2
Pitman efficacy:
1
1 2
EU ' L0 (µ)L(µ) 1 + σ (1 − 8L0 (µ)L(µ)) .
2
2
(6)
Therefore to assess the relative efficiency for θ = 0, we compare (5) and (6).
Christiana Kartsonaki
Binary matched pairs
18 July 2014
13 / 16
Testing the hypothesis of no treatment effect
Under the null hypothesis,
1 2
1
EC ' L0 (µ)L(µ) 1 + σ (1 − 6L0 (µ)L(µ)) .
2
2
(5)
In the unmatched analysis the logit difference for the two groups is
1
TU ' θ + σ 2 {L0 (µ + θ) − L(µ + θ) − L0 (µ) + L(µ)} .
2
Pitman efficacy:
1
1 2
EU ' L0 (µ)L(µ) 1 + σ (1 − 8L0 (µ)L(µ)) .
2
2
(6)
Therefore to assess the relative efficiency for θ = 0, we compare (5) and (6).
EC ≥ EU
Christiana Kartsonaki
Binary matched pairs
18 July 2014
13 / 16
Testing the hypothesis of no treatment effect
Under the null hypothesis,
1 2
1
EC ' L0 (µ)L(µ) 1 + σ (1 − 6L0 (µ)L(µ)) .
2
2
(5)
In the unmatched analysis the logit difference for the two groups is
1
TU ' θ + σ 2 {L0 (µ + θ) − L(µ + θ) − L0 (µ) + L(µ)} .
2
Pitman efficacy:
1
1 2
EU ' L0 (µ)L(µ) 1 + σ (1 − 8L0 (µ)L(µ)) .
2
2
(6)
Therefore to assess the relative efficiency for θ = 0, we compare (5) and (6).
EC ≥ EU
⇒ near θ = 0 the matched design tends to be slightly more efficient.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
13 / 16
Testing the hypothesis of no treatment effect
When L0 (µ)L(µ) ' 1/4 (near µ = 0),
1
EC '
8
1
EU '
8
1
1 − σ2
4
and
1 2
1− σ .
2
Thus for testing the hypothesis of no treatment effect the conditional analysis is
slightly better than the unconditional analysis, depending on the amount of
variability between pairs.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
14 / 16
Discussion
The parameter θ describing the contrast of log odds between the two groups
in the conditional analysis is defined conditionally on the features implied by
the matching variables.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
15 / 16
Discussion
The parameter θ describing the contrast of log odds between the two groups
in the conditional analysis is defined conditionally on the features implied by
the matching variables.
The contrast of log odds from the unconditional analysis without the
correction term is not the same as the contrast of log odds from the
conditional analysis of the same data and it is likely to be different, perhaps
seriously so. Thus comparison of the conclusions from two different studies,
one matched and one unmatched requires care.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
15 / 16
Discussion
The parameter θ describing the contrast of log odds between the two groups
in the conditional analysis is defined conditionally on the features implied by
the matching variables.
The contrast of log odds from the unconditional analysis without the
correction term is not the same as the contrast of log odds from the
conditional analysis of the same data and it is likely to be different, perhaps
seriously so. Thus comparison of the conclusions from two different studies,
one matched and one unmatched requires care.
The unconditional analysis seems to be more efficient than the conditional
analysis in many cases, in particular when the treatment effect is large. When
the treatment effect is close to zero, the conditional analysis is more efficient.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
15 / 16
Discussion
The parameter θ describing the contrast of log odds between the two groups
in the conditional analysis is defined conditionally on the features implied by
the matching variables.
The contrast of log odds from the unconditional analysis without the
correction term is not the same as the contrast of log odds from the
conditional analysis of the same data and it is likely to be different, perhaps
seriously so. Thus comparison of the conclusions from two different studies,
one matched and one unmatched requires care.
The unconditional analysis seems to be more efficient than the conditional
analysis in many cases, in particular when the treatment effect is large. When
the treatment effect is close to zero, the conditional analysis is more efficient.
However, matching plus randomization controls for unobserved confounders,
a different aspect from variance comparison.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
15 / 16
References
Cox, D. R. (1958). Two further applications of a model for binary regression.
Biometrika, 45, 562–565.
Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman and
Hall / CRC, London.
McNemar, Q. (1947). Note on the sampling error of the difference between
correlated proportions or percentages. Psychometrika, 12 (2), 153–157.
Christiana Kartsonaki
Binary matched pairs
18 July 2014
16 / 16