Binary matched pairs
Christiana Kartsonaki
18 July 2014

Introduction

Individuals are paired, usually such that the two individuals in any one pair tend to be similar. In each pair one individual is assigned at random to group 0, the other to group 1. On each individual a binary response is observed.

Let $n$ be the number of pairs. For the $i$th pair, the observations are represented by random variables $(Y_{i0}, Y_{i1})$, $i = 1, \ldots, n$. Hence the possible observations on a pair, that for group 0 being written first, are (0, 0), (0, 1), (1, 0), (1, 1).

$R^{00}, R^{01}, R^{10}, R^{11}$: numbers of pairs with the four types of response.

[Table: response (0, 1) by group (group 0, group 1); each group total is $n$.]

The usual $\chi^2$ significance test for such a table ignores the correlation induced by pairing (McNemar, 1947). The significance of the difference between groups 0 and 1 should be tested using McNemar's test, that is, by rejecting the pairs (0, 0) and (1, 1), and by examining whether the proportion of (1, 0)'s among the remaining discordant ('mixed') pairs (0, 1) and (1, 0) is consistent with binomial variation with probability $\tfrac{1}{2}$ (Cox, 1958).

Matched pair designs provide an effective method to control for potential confounding effects of covariates in studies of the effect of a binary explanatory variable.

Conditional analysis

Consider $n$ binary matched pairs $(Y_{i0}, Y_{i1})$ such that for the $i$th pair

$$P(Y_{i0} = 1) = L(\alpha_i), \qquad P(Y_{i1} = 1) = L(\alpha_i + \theta), \tag{1}$$

where $L(x) = e^x/(1 + e^x)$ is the logistic function. The parameter $\theta$ is estimated using only the discordant pairs (0, 1) and (1, 0).

$\alpha_i$: a nuisance parameter characteristic of the $i$th pair
$\theta$: a treatment effect assumed constant on the logistic scale

Because of the large number of nuisance parameters, a conditional likelihood approach is used.
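As an illustration of model (1), the following is a minimal simulation sketch that generates matched pairs and tabulates the four response types. It assumes, purely for illustration, that the nuisance parameters $\alpha_i$ are drawn from a Normal($\mu$, $\sigma^2$) distribution; the slides do not specify this. The function name `simulate_pair_counts` and its arguments are hypothetical.

```python
import math
import random


def logistic(x):
    """L(x) = e^x / (1 + e^x)."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_pair_counts(n, theta, mu=0.0, sigma=1.0, seed=0):
    """Simulate n matched pairs under model (1) and count the four outcomes.

    ASSUMPTION: alpha_i ~ Normal(mu, sigma^2); this choice is illustrative only.
    Returns a dict keyed by (y0, y1), the group 0 response written first.
    """
    rng = random.Random(seed)
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for _ in range(n):
        alpha = rng.gauss(mu, sigma)                      # pair-specific nuisance parameter
        y0 = int(rng.random() < logistic(alpha))          # group 0 response
        y1 = int(rng.random() < logistic(alpha + theta))  # group 1 response
        counts[(y0, y1)] += 1
    return counts


counts = simulate_pair_counts(n=500, theta=1.0)
r01, r10 = counts[(0, 1)], counts[(1, 0)]
print(counts, "discordant pairs:", r01 + r10)
```

Only `r01` and `r10` enter the conditional analysis; the concordant counts carry no information about $\theta$ once the pair totals are conditioned on.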
The statistics associated with the $\alpha_i$, and hence used for conditioning, are the pair totals $Y_{i0} + Y_{i1}$, $i = 1, \ldots, n$, and the statistic associated with the parameter $\theta$ is the total number of successes for group 1, $T = R^{01} + R^{11}$. Only discordant pairs, for which $Y_{i0} + Y_{i1} = 1$, contribute to $T$ a random amount. Therefore the conditional distribution considered is that of the number $R^{01}$ of pairs (0, 1), given that $R^{01} + R^{10} = m$, the total number of discordant pairs.

Thus the conditional probability that the $i$th pair contributes one to $R^{01}$, given that it is discordant, is

$$\varphi = P(Y_{i0} = 0, Y_{i1} = 1 \mid Y_{i0} + Y_{i1} = 1) = \frac{e^{\theta}}{1 + e^{\theta}}.$$

This is the same for all pairs, thus the conditional distribution of $R^{01}$ is $\mathrm{Binomial}\left(m, \frac{e^{\theta}}{1 + e^{\theta}}\right)$.

The probability that a pair is discordant is, treating $\alpha_i$ as a random variable $A$,

$$\pi_d = E_A\{L_0(A)L(A + \theta) + L(A)L_0(A + \theta)\},$$

where $L_0(x) = 1 - L(x)$. Let $\mu = E(A)$ and $\sigma^2 = \mathrm{var}(A)$.

Figure: Scatterplot of $\pi_d$ against $\nu$ and $\sigma$, where $\nu = \mu + \tfrac{1}{2}\theta$; colours represent different values of $\theta$ ($\theta$ = 0, 0.5, 1, 1.5, 2).
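The binomial result above can be turned into a small calculation from the two discordant counts. The sketch below estimates $\varphi$ by $R^{01}/m$, takes its logit as the estimate of $\theta$, and computes an exact two-sided p-value for $\theta = 0$ under Binomial($m$, 1/2), which is the exact form of the McNemar comparison. The function name `conditional_analysis` is hypothetical, and doubling the smaller tail is one common convention for a two-sided exact p-value rather than something stated in the slides.

```python
import math


def conditional_analysis(r01, r10):
    """Conditional estimate of theta and exact test of theta = 0.

    r01, r10: numbers of (0, 1) and (1, 0) pairs; concordant pairs are not used.
    Returns (phi_hat, theta_hat, p_value).
    """
    m = r01 + r10
    if m == 0:
        raise ValueError("no discordant pairs")
    phi_hat = r01 / m
    # logit(phi_hat) = log(r01 / r10); infinite when one of the counts is zero
    if r01 == 0:
        theta_hat = -math.inf
    elif r10 == 0:
        theta_hat = math.inf
    else:
        theta_hat = math.log(r01 / r10)
    # Under theta = 0, R01 ~ Binomial(m, 1/2), which is symmetric:
    # double the probability of the smaller tail and cap at 1.
    tail = sum(math.comb(m, j) for j in range(min(r01, r10) + 1)) / 2 ** m
    p_value = min(1.0, 2.0 * tail)
    return phi_hat, theta_hat, p_value


phi_hat, theta_hat, p_value = conditional_analysis(r01=15, r10=5)
print(phi_hat, theta_hat, p_value)  # 0.75, log(3) ~ 1.0986, p ~ 0.0414
```

The counts 15 and 5 are made-up numbers chosen only to show the calculation.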
The estimate of $\theta$ from the conditional analysis is

$$\hat{\theta}_C = \log\frac{R^{01}}{R^{10}}$$

and asymptotically

$$\mathrm{var}(\hat{\theta}_C) = \frac{1}{n\pi_d}\,\frac{(1 + e^{\theta})^2}{e^{\theta}}. \tag{2}$$

An alternative is to use an unconditional analysis, in which the probabilities of success of each group are averaged over the observations in the group, that is, the matching is ignored.

Unconditional analysis

Suppose that the pairing is ignored, or equivalently that individuals are randomized to two groups, 0 and 1, with probabilities of success

$$\varphi_0 = P(Y_{i0} = 1) = E\{L(A)\}, \qquad \varphi_1 = P(Y_{i1} = 1) = E\{L(A + \theta)\}. \tag{3}$$

Here all pairs are used.

The probabilities of success for an individual in each group are approximately

$$\varphi_0 \simeq L\left(\frac{\mu}{\sqrt{1 + k^2\sigma^2}}\right), \qquad \varphi_1 \simeq L\left(\frac{\mu + \theta}{\sqrt{1 + k^2\sigma^2}}\right),$$

where $k = 0.607$. Then

$$\theta_U \simeq \sqrt{1 + k^2\sigma^2}\left\{\log\frac{\varphi_1}{1 - \varphi_1} - \log\frac{\varphi_0}{1 - \varphi_0}\right\}.$$

The variance of the estimate of the treatment effect $\theta$ in the unconditional analysis is, assuming $\sigma^2$ known,

$$\mathrm{var}(\hat{\theta}_U) \simeq \frac{1 + k^2\sigma^2}{n}\left\{\frac{1}{L\left(\frac{\mu}{\sqrt{1 + k^2\sigma^2}}\right)L_0\left(\frac{\mu}{\sqrt{1 + k^2\sigma^2}}\right)} + \frac{1}{L\left(\frac{\mu + \theta}{\sqrt{1 + k^2\sigma^2}}\right)L_0\left(\frac{\mu + \theta}{\sqrt{1 + k^2\sigma^2}}\right)}\right\}. \tag{4}$$

Comparison of the efficiency of the conditional and unconditional analysis

Figure: Scatterplot of $\mathrm{var}(\hat{\theta}_C)/\mathrm{var}(\hat{\theta}_U)$ against $\sigma$ and $\nu$, where $\nu = \mu + \tfrac{1}{2}\theta$; colours represent different values of $\theta$ ($\theta$ = 0, 0.5, 1, 1.5, 2).

The ratio $\mathrm{var}(\hat{\theta}_C)/\mathrm{var}(\hat{\theta}_U)$ is equal to one only in the trivial case of $\theta = \sigma = 0$.

When $\theta = 0$, $\mathrm{var}(\hat{\theta}_C)/\mathrm{var}(\hat{\theta}_U) \le 1$, that is, the conditional analysis yields a more precise estimate than the unconditional. As $\sigma$ increases the ratio decreases.

As $\nu$ (or equivalently $\mu$) increases in absolute value, the conditional analysis becomes more precise, although $\pi_d$ decreases with increasing $|\nu|$.

As $\theta$ increases, the ratio becomes larger, especially when $\sigma$ and $|\nu|$ are small.

When $\theta = 2$, the unconditional analysis is almost always more efficient than the conditional.

Testing the hypothesis of no treatment effect

In the conditional analysis the statistic $T_C = \log(R^{01}/R^{10})$, interpreted as the logit difference for the two groups, has expected value $E(T_C) = \theta$.

Pitman efficacy (Cox and Hinkley, 1974) for testing the hypothesis that $\theta = 0$:

$$E_C = \frac{\left\{\partial E(T_C)/\partial\theta\,\big|_{\theta=0}\right\}^2}{n\,\mathrm{var}(T_C)\big|_{\theta=0}} = \frac{\pi_d}{4}.$$

Under the null hypothesis,

$$E_C \simeq \frac{1}{2}L_0(\mu)L(\mu)\left\{1 + \frac{1}{2}\sigma^2\left(1 - 6L_0(\mu)L(\mu)\right)\right\}. \tag{5}$$

In the unmatched analysis the logit difference for the two groups is

$$T_U \simeq \theta + \frac{1}{2}\sigma^2\left\{L_0(\mu + \theta) - L(\mu + \theta) - L_0(\mu) + L(\mu)\right\}.$$

Pitman efficacy:

$$E_U \simeq \frac{1}{2}L_0(\mu)L(\mu)\left\{1 + \frac{1}{2}\sigma^2\left(1 - 8L_0(\mu)L(\mu)\right)\right\}. \tag{6}$$

Therefore to assess the relative efficiency for $\theta = 0$, we compare (5) and (6).

$E_C \ge E_U$ $\Rightarrow$ near $\theta = 0$ the matched design tends to be slightly more efficient.
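The two comparisons made so far, the variance ratio built from (2) and (4) and the efficacies (5) and (6), can be explored numerically. The sketch below is one way to do so under a stated assumption: $A$ is taken to be Normal($\mu$, $\sigma^2$), which the slides do not specify, and the expectation defining $\pi_d$ is evaluated by Simpson's rule. All function names are hypothetical. Subtracting (6) from (5) gives $E_C - E_U \simeq \tfrac{1}{2}\sigma^2\{L_0(\mu)L(\mu)\}^2 \ge 0$, which the last function makes easy to check.

```python
import math

K = 0.607  # the constant k quoted in the slides


def logistic(x):
    """L(x) = e^x / (1 + e^x), arranged to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)


def pi_d(mu, sigma, theta, steps=2000):
    """Probability that a pair is discordant.

    ASSUMPTION: A ~ Normal(mu, sigma^2). The expectation over A is computed by
    composite Simpson's rule on mu +/- 8 sigma (steps must be even).
    """
    def discordant(a):
        p0, p1 = logistic(a), logistic(a + theta)
        return (1.0 - p0) * p1 + p0 * (1.0 - p1)

    if sigma == 0:
        return discordant(mu)
    lo, width = mu - 8.0 * sigma, 16.0 * sigma / steps
    total = 0.0
    for j in range(steps + 1):
        a = lo + j * width
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        density = math.exp(-0.5 * ((a - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += weight * discordant(a) * density
    return total * width / 3.0


def variance_ratio(mu, sigma, theta):
    """var(theta_hat_C) / var(theta_hat_U) from (2) and (4); n cancels."""
    n_var_c = (1.0 + math.exp(theta)) ** 2 / (pi_d(mu, sigma, theta) * math.exp(theta))
    c2 = 1.0 + K ** 2 * sigma ** 2
    u0, u1 = mu / math.sqrt(c2), (mu + theta) / math.sqrt(c2)
    n_var_u = c2 * (1.0 / (logistic(u0) * (1.0 - logistic(u0)))
                    + 1.0 / (logistic(u1) * (1.0 - logistic(u1))))
    return n_var_c / n_var_u


def pitman_efficacies(mu, sigma):
    """Approximations (5) and (6) at theta = 0; returns (E_C, E_U)."""
    g = logistic(mu) * (1.0 - logistic(mu))  # L_0(mu) L(mu)
    e_c = 0.5 * g * (1.0 + 0.5 * sigma ** 2 * (1.0 - 6.0 * g))
    e_u = 0.5 * g * (1.0 + 0.5 * sigma ** 2 * (1.0 - 8.0 * g))
    return e_c, e_u


# Ratio along nu = mu + theta / 2 = 0 with sigma = 0.5, for the theta values in the figure
for theta in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(theta, variance_ratio(mu=-theta / 2.0, sigma=0.5, theta=theta))
print(pitman_efficacies(mu=0.0, sigma=0.5))
```

Because (4), (5) and (6) are themselves approximations, the printed ratios are indicative rather than exact, and a different distribution for $A$ would change $\pi_d$.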
When $L_0(\mu)L(\mu) \simeq 1/4$ (near $\mu = 0$),

$$E_C \simeq \frac{1}{8}\left(1 - \frac{1}{4}\sigma^2\right) \qquad \text{and} \qquad E_U \simeq \frac{1}{8}\left(1 - \frac{1}{2}\sigma^2\right).$$

Thus for testing the hypothesis of no treatment effect the conditional analysis is slightly better than the unconditional analysis, depending on the amount of variability between pairs.

Discussion

The parameter $\theta$ describing the contrast of log odds between the two groups in the conditional analysis is defined conditionally on the features implied by the matching variables.

The contrast of log odds from the unconditional analysis without the correction term is not the same as the contrast of log odds from the conditional analysis of the same data and it is likely to be different, perhaps seriously so. Thus comparison of the conclusions from two different studies, one matched and one unmatched, requires care.

The unconditional analysis seems to be more efficient than the conditional analysis in many cases, in particular when the treatment effect is large.
When the treatment effect is close to zero, the conditional analysis is more efficient.

However, matching plus randomization controls for unobserved confounders, a different aspect from variance comparison.

References

Cox, D. R. (1958). Two further applications of a model for binary regression. Biometrika, 45, 562–565.
Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman and Hall / CRC, London.
McNemar, Q. (1947). Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika, 12 (2), 153–157.