mcnemar exercise
roger
April 4, 2016
Pasted from the table in document N16:
printDetails <- FALSE
pastedData <- "TRUTH: disease present
Test B is positive
Test B is negative
Test A is positive
100
17
Test A is negative
8
15
TRUTH: disease absent
Test B is positive
Test B is negative
Test A is positive
14
16
Test A is negative
22
200"
mcdata.orig <- strsplit(split = "\n", pastedData) [[1]]
Note that strsplit() always returns a LIST. Thus the [[1]].
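A quick way to see this, using a made-up string (toy and pieces are illustration names only, not part of the exercise data):

```r
# strsplit() returns a list with one element per input string,
# so even a single string has to be unwrapped with [[1]].
toy <- "a\nb\nc"                       # hypothetical stand-in for pastedData
pieces <- strsplit(toy, split = "\n")

print(is.list(pieces))   # the result is a list ...
print(length(pieces))    # ... with one element, because toy is one string
print(pieces[[1]])       # the character vector we actually want
```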
Now we make it into a data array. (We need the abind package, because base R does not
include an abind() function. along=3, or better rev.along=0, makes sure each component is
treated as a slice.)
mcdata.character.matrix <- matrix(mcdata.orig, ncol=3, byrow=T)
library(abind)
mcdata.character.array <- abind(mcdata.character.matrix[1:3, ],
mcdata.character.matrix[4:6, ],
along=3)
if(printDetails) print(mcdata.character.array)
mcdata.array <- mcdata.character.array[ -1, -1, ]
# as.numeric() called directly does not work here (it drops the dimensions)! 8-(
mcdata.array <- apply(mcdata.array, MARGIN = 1:3, as.numeric)
dimnames(mcdata.array) <- list(
A=c("Apos","Aneg"), # ROW
B=c("Bpos","Bneg"), # COLUMN
TRUTH=c("present","absent") # SLICE
)
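Why the apply() detour is needed: a small sketch on a toy character array (chr, num1 and num2 are made-up names). It also shows storage.mode<- as one possible base-R alternative that keeps the dimensions; that alternative is a suggestion, not what the exercise uses.

```r
# Toy 2 x 2 x 2 character array standing in for the pasted counts.
chr <- array(as.character(1:8), dim = c(2, 2, 2))

# as.numeric() returns a plain vector: the dim attribute is gone.
print(dim(as.numeric(chr)))

# apply() over all three margins converts cell by cell and keeps the shape.
num1 <- apply(chr, MARGIN = 1:3, as.numeric)
print(dim(num1))

# Possible alternative: change the storage mode in place, which keeps dim.
num2 <- chr
storage.mode(num2) <- "double"
print(all(num1 == num2))
```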
Possible data modifications go in this chunk!
#mcdata.array[mcdata.array==16] = 36
mcdata.array["Apos", "Bneg", "absent"] = 36 # originally 16
mcdata.array["Apos", "Bneg", "present"] = 37 # originally 17
Now we will reshape it into a "long skinny" data frame. That will be handy for some analyses.
mcdata.df <- expand.grid(dimnames(mcdata.array))
mcdata.df$count <- c(mcdata.array)
if(printDetails) print(mcdata.df)
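The same reshaping can probably also be done with as.data.frame(as.table(...)). The sketch below rebuilds the originally pasted counts by hand in an array called arr (a made-up name) and compares the two routes; it assumes the cell order shown in the pasted table.

```r
# Originally pasted counts, typed in column-major order (A varies fastest).
arr <- array(c(100, 8, 17, 15,   14, 22, 16, 200),
             dim = c(2, 2, 2),
             dimnames = list(A = c("Apos", "Aneg"),
                             B = c("Bpos", "Bneg"),
                             TRUTH = c("present", "absent")))

# Route used in this exercise: expand.grid() plus the flattened counts.
long1 <- expand.grid(dimnames(arr))
long1$count <- c(arr)

# Alternative route: treat the array as a contingency table.
long2 <- as.data.frame(as.table(arr), responseName = "count")

print(long2)
print(all(long1$count == long2$count))
```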
Now we can use mcdata.array and mcdata.df to test hypotheses.
Hypothesis: A is more likely to say "positive" than B. One approach: Just collapse over
TRUTH:
AposBneg <- mcdata.array["Apos", "Bneg", "present"] +
mcdata.array["Apos", "Bneg", "absent"]
if(printDetails) print(AposBneg <- ### Alternative
sum(mcdata.df [mcdata.df$A=="Apos" &
mcdata.df$B=="Bneg", "count"]))
AposBneg <- sum(mcdata.array["Apos", "Bneg", ])
Here is the other marginalized diagonal:
AnegBpos <- sum(mcdata.array["Aneg", "Bpos", ])
McNemar test, using the binomial method:
cat("McNemar Ptest_AMorePos = ",
Ptest_AMorePos <- round(digits=3,
pbinom(AnegBpos, AposBneg+AnegBpos, 1/2)),
"\n")
## McNemar Ptest_AMorePos =  0
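As a cross-check on the binomial calculation, here is a small sketch using binom.test() with alternative = "less", which should give the same lower tail as pbinom(). The counts are typed in by hand from the modified table (37 + 36 and 8 + 22), so this assumes the data modifications above were applied.

```r
# Discordant counts, collapsed over TRUTH (modified data).
AposBneg <- 37 + 36   # A positive, B negative
AnegBpos <- 8 + 22    # A negative, B positive

# Lower tail of Binomial(n, 1/2): small if A says "positive" more often than B.
p_pbinom <- pbinom(AnegBpos, AposBneg + AnegBpos, 1/2)

# The same one-sided exact test via binom.test().
p_binom <- binom.test(AnegBpos, AposBneg + AnegBpos,
                      p = 1/2, alternative = "less")$p.value

print(c(pbinom = p_pbinom, binom.test = p_binom))
print(round(p_pbinom, 3))   # rounding to 3 digits hides a small nonzero P
```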
Compare with mcnemar.test(), but be careful: it is TWO-SIDED. The apply() call is collapsing
over the 3rd dimension (TRUTH).
test_AMoreOrLessPos <- mcnemar.test(apply(mcdata.array, 1:2, sum) )
Ptest_AMoreOrLessPos <- test_AMoreOrLessPos$p.value
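To see what mcnemar.test() does with the collapsed table, the continuity-corrected statistic can be computed by hand from the two discordant counts. This is a sketch with the collapsed counts typed in from the modified data; n12, n21 and collapsed are made-up names.

```r
# Collapsed 2 x 2 table (rows = A, columns = B), modified data.
collapsed <- matrix(c(114, 30, 73, 215), nrow = 2,
                    dimnames = list(A = c("Apos", "Aneg"),
                                    B = c("Bpos", "Bneg")))
n12 <- collapsed["Apos", "Bneg"]   # 73
n21 <- collapsed["Aneg", "Bpos"]   # 30

# McNemar chi-squared with continuity correction, 1 degree of freedom.
chisq_by_hand <- (abs(n12 - n21) - 1)^2 / (n12 + n21)
p_by_hand <- pchisq(chisq_by_hand, df = 1, lower.tail = FALSE)

print(c(chisq = chisq_by_hand, p = p_by_hand))
print(mcnemar.test(collapsed)$p.value)   # should agree with p_by_hand
```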
This is very limited, though. One test might be much better at agreeing with the truth when
it's "present" but not when it's "absent". Another approach: control for TRUTH by
combining evidence.
Hypothesis: A is more likely to be CORRECT. There are several approaches.
mcdata.combined <- mcdata.array[ , , 1] + mcdata.array[ 2:1, 2:1, 2]
dimnames(mcdata.combined) <- list(c("Aright", "Awrong"), c("Bright",
"Bwrong"))
mcdata.combined
##        Bright Bwrong
## Aright    300     59
## Awrong     44     29
Pvalue.combined = print(mcnemar.test(mcdata.combined))$p.value
##
## McNemar's Chi-squared test with continuity correction
##
## data: mcdata.combined
## McNemar's chi-squared = 1.9029, df = 1, p-value = 0.1678
We can also use our data frame structure:
mcdata.df$Acorrect <- (mcdata.df$A=="Apos" & mcdata.df$TRUTH=="present") |
(mcdata.df$A=="Aneg" & mcdata.df$TRUTH=="absent")
mcdata.df$Bcorrect <- (mcdata.df$B=="Bpos" & mcdata.df$TRUTH=="present") |
(mcdata.df$B=="Bneg" & mcdata.df$TRUTH=="absent")
### Collapsing over TRUTH:
AcorrectBincorrect <- mcdata.df$Acorrect &
! mcdata.df$Bcorrect
nAcorrectBincorrect <- sum(mcdata.df$count[AcorrectBincorrect])
ABdisagree <- mcdata.df$Acorrect != mcdata.df$Bcorrect
nABdisagree <- sum(mcdata.df$count[ABdisagree])
cat("(collapsing) P = ",
1 - pbinom(nAcorrectBincorrect - 1, nABdisagree, 1/2), "\n")
## (collapsing) P =  0.08372217
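A sketch of how this exact tail probability relates to a two-sided exact P. The counts 59 and 44 are typed in from the combined table above; because the null is p = 1/2 (a symmetric distribution), the two-sided P from binom.test() should simply be double the upper tail. The names n_A_only and n_disagree are made up for this example.

```r
n_A_only   <- 59        # A correct, B incorrect
n_disagree <- 59 + 44   # all discordant pairs

# Upper tail P(X >= 59) under Binomial(103, 1/2).
p_one <- pbinom(n_A_only - 1, n_disagree, 1/2, lower.tail = FALSE)

# Two-sided exact binomial test on the same counts.
p_two <- binom.test(n_A_only, n_disagree, p = 1/2)$p.value

print(c(one_sided = p_one, two_sided = p_two, doubled = 2 * p_one))
```

The doubled value lands close to the chi-square P of 0.1678 reported by mcnemar.test() above.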
This P is ONE-sided. Alternatively, using the array structure:
mcdata.correctness <- mcdata.array[ , , 1] + t(mcdata.array[ , , 2])
dimnames(mcdata.correctness) <- list(A=c("correct", "incorrect"),
B=c("correct", "incorrect"))
mcdata.correctness
##            B
## A           correct incorrect
##   correct       114        59
##   incorrect      44       215
mcnemar.test(mcdata.correctness)
##
##  McNemar's Chi-squared test with continuity correction
##
## data:  mcdata.correctness
## McNemar's chi-squared = 1.9029, df = 1, p-value = 0.1678
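Note the use of the transpose function t() to reflect the "absent" part of the array. The transpose puts the off-diagonal (discordant) counts in the right cells, and those are all that mcnemar.test() uses, so the test result matches the one above. The diagonal cells (114 and 215), however, are not the "both correct" and "both incorrect" counts; the [2:1, 2:1] reflection used for mcdata.combined gives those (300 and 29).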
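A small check of the two reflections side by side, with the two TRUTH slices typed in by hand from the modified data (present, absent, via_flip and via_t are made-up names):

```r
dn <- list(A = c("Apos", "Aneg"), B = c("Bpos", "Bneg"))
present <- matrix(c(100, 8, 37, 15), nrow = 2, dimnames = dn)
absent  <- matrix(c(14, 22, 36, 200), nrow = 2, dimnames = dn)

# Reversing rows and columns maps "negative" onto "correct" when disease is absent.
via_flip <- present + absent[2:1, 2:1]

# The transpose only swaps the two off-diagonal cells.
via_t <- present + t(absent)

print(via_flip)   # diagonal = both right / both wrong
print(via_t)      # same off-diagonal cells, different diagonal

# McNemar's test only looks at the off-diagonal cells, so both give the same P.
print(c(flip = mcnemar.test(via_flip)$p.value,
        t    = mcnemar.test(via_t)$p.value))
```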
If there is an imbalance in the marginal for TRUTH (and there is), we should try controlling for
TRUTH. The worry: if most cases are "absent", then B might be right more often just
because B says "negative" more often.
pos vs neg
           \
            ------ A vs B
           /
present vs absent
To define and predict "correctness", adjusting for propensity to say "positive", we need to
make a data set double the size. Then we will use Poisson regression, via the function glm().
mcdata2 <- rbind(mcdata.df, mcdata.df)
mcdata2$test <- rep(c('A','B'), each=8)
(mcdata2$correct <- ifelse(mcdata2$test=="A",
mcdata2$Acorrect, mcdata2$Bcorrect))
##  [1]  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE
## [12] FALSE FALSE FALSE  TRUE  TRUE
mcdata2 <- mcdata2[c('TRUTH','count','test','correct')]
require("MASS")
## Loading required package: MASS
(glm.out.1 <- glm(data=mcdata2,
correct ~ TRUTH + test,
weights=count,
family=poisson) )
##
## Call:  glm(formula = correct ~ TRUTH + test, family = poisson, data = mcdata2,
##     weights = count)
##
## Coefficients:
## (Intercept)  TRUTHabsent        testB
##    -0.24595      0.09498     -0.04268
##
## Degrees of Freedom: 15 Total (i.e. Null);  13 Residual
## Null Deviance:       289.9
## Residual Deviance: 288.2     AIC: 1700
summary(glm.out.1)
##
## Call:
## glm(formula = correct ~ TRUTH + test, family = poisson, data = mcdata2,
##     weights = count)
##
## Deviance Residuals:
##    Min      1Q  Median      3Q     Max
## -7.868  -4.859  -1.423   1.599   2.754
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.24595    0.07379  -3.333 0.000859 ***
## TRUTHabsent  0.09498    0.07915   1.200 0.230138
## testB       -0.04268    0.07545  -0.566 0.571603
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
##     Null deviance: 289.94  on 15  degrees of freedom
## Residual deviance: 288.17  on 13  degrees of freedom
## AIC: 1700.2
##
## Number of Fisher Scoring iterations: 5
Now we include the interaction term.
(glm.out.2 <- glm(data=mcdata2,
correct ~ TRUTH*test, weights=count,
family=poisson) )
##
## Call:  glm(formula = correct ~ TRUTH * test, family = poisson, data = mcdata2,
##     weights = count)
##
## Coefficients:
##       (Intercept)        TRUTHabsent              testB
##          -0.15519           -0.04793           -0.23785
## TRUTHabsent:testB
##           0.29900
##
## Degrees of Freedom: 15 Total (i.e. Null);  12 Residual
## Null Deviance:       289.9
## Residual Deviance: 284.6     AIC: 1699
summary(glm.out.2)
##
## Call:
## glm(formula = correct ~ TRUTH * test, family = poisson, data = mcdata2,
##     weights = count)
##
## Deviance Residuals:
##    Min      1Q  Median      3Q     Max
## -7.666  -4.964  -1.435   1.161   3.689
##
## Coefficients:
##                   Estimate Std. Error z value Pr(>|z|)
## (Intercept)       -0.15519    0.08544  -1.816   0.0693 .
## TRUTHabsent       -0.04793    0.10865  -0.441   0.6591
## testB             -0.23785    0.12868  -1.848   0.0645 .
## TRUTHabsent:testB  0.29900    0.15906   1.880   0.0601 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
##     Null deviance: 289.94  on 15  degrees of freedom
## Residual deviance: 284.62  on 12  degrees of freedom
## AIC: 1698.6
##
## Number of Fisher Scoring iterations: 5
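One way to weigh the interaction term as a whole is to compare the residual deviances of the two fits. The sketch below is self-contained: it rebuilds the modified counts by hand and refits both models under the same Poisson working model, so cells, long, fit1 and fit2 are made-up names standing in for mcdata.df, mcdata2, glm.out.1 and glm.out.2. Treat it as an illustration under those assumptions, not as a definitive analysis.

```r
# Modified counts, column-major order (A varies fastest, then B, then TRUTH).
arr <- array(c(100, 8, 37, 15,   14, 22, 36, 200),
             dim = c(2, 2, 2),
             dimnames = list(A = c("Apos", "Aneg"),
                             B = c("Bpos", "Bneg"),
                             TRUTH = c("present", "absent")))
cells <- expand.grid(dimnames(arr))
cells$count <- c(arr)

# A test is correct when "positive" coincides with "present".
cells$Acorrect <- (cells$A == "Apos") == (cells$TRUTH == "present")
cells$Bcorrect <- (cells$B == "Bpos") == (cells$TRUTH == "present")

# Double the data: one copy scored for test A, one for test B.
long <- rbind(cells, cells)
long$test <- rep(c("A", "B"), each = nrow(cells))
long$correct <- as.numeric(ifelse(long$test == "A",
                                  long$Acorrect, long$Bcorrect))

fit1 <- glm(correct ~ TRUTH + test, family = poisson,
            weights = count, data = long)
fit2 <- glm(correct ~ TRUTH * test, family = poisson,
            weights = count, data = long)

# Likelihood-ratio style comparison: drop in deviance on 1 df.
lr    <- deviance(fit1) - deviance(fit2)
lr_df <- df.residual(fit1) - df.residual(fit2)
p_lr  <- pchisq(lr, df = lr_df, lower.tail = FALSE)
cat("Deviance drop =", round(lr, 2), "on", lr_df, "df, P =",
    signif(p_lr, 3), "\n")
```

The deviance drop is the difference between the two residual deviances printed above (288.17 and 284.62), and the resulting P should be close to the Wald P for the TRUTHabsent:testB coefficient.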
SUMMARY
The data:
mcdata.array
## , , TRUTH = present
##
##       B
## A      Bpos Bneg
##   Apos  100   37
##   Aneg    8   15
##
## , , TRUTH = absent
##
##       B
## A      Bpos Bneg
##   Apos   14   36
##   Aneg   22  200
P value for HA:"A says positive more", via binomial test (rounded to 3 digits, so a very small P prints as 0):
Ptest_AMorePos
## [1] 0
P value for HA:"A or B says positive more", via chi-square test:
Ptest_AMoreOrLessPos
## [1] 3.497622e-05
P value for HA:"A and B differ in correctness":
Pvalue.combined
## [1] 0.1677527
Model fit controlling for Prob(positive | test):
summary(glm.out.1)$coef
##                Estimate Std. Error    z value     Pr(>|z|)
## (Intercept) -0.24595011 0.07378803 -3.3331981 0.0008585379
## TRUTHabsent  0.09498272 0.07915200  1.2000041 0.2301377443
## testB       -0.04268073 0.07544861 -0.5656928 0.5716026459
Model fit controlling for Prob(positive | test) + interaction:
summary(glm.out.2)$coef
##                     Estimate Std. Error    z value   Pr(>|z|)
## (Intercept)       -0.1551929 0.08543577 -1.8164862 0.06929582
## TRUTHabsent       -0.0479318 0.10864518 -0.4411774 0.65908459
## testB             -0.2378497 0.12867995 -1.8483820 0.06454710
## TRUTHabsent:testB  0.2990041 0.15906075  1.8798109 0.06013386