mcnemar exercise
roger
April 4, 2016

Pasted from the table in document N16:

    printDetails <- FALSE
    pastedData <- "TRUTH: disease present
    Test B is positive
    Test B is negative
    Test A is positive
    100
    17
    Test A is negative
    8
    15
    TRUTH: disease absent
    Test B is positive
    Test B is negative
    Test A is positive
    14
    16
    Test A is negative
    22
    200"
    mcdata.orig <- strsplit(split = "\n", pastedData)[[1]]

Note that strsplit() always returns a LIST. Thus the [[1]].

Now we make it into a data array. (We need the package abind, because R base forgot to include the abind() function. along=3, or better rev.along=0, makes sure each component is treated as a slice.)

    mcdata.character.matrix <- matrix(mcdata.orig, ncol=3, byrow=TRUE)
    library(abind)
    mcdata.character.array <- abind(mcdata.character.matrix[1:3, ],
                                    mcdata.character.matrix[4:6, ],
                                    along=3)
    if(printDetails) print(mcdata.character.array)
    # Drop the label row and label column, keeping only the counts.
    mcdata.array <- mcdata.character.array[ -1, -1, ]
    mcdata.array <- apply(mcdata.array, MARGIN = 1:3, as.numeric)
    # Plain as.numeric() does not work here (it drops the dimensions)! 8-(
    dimnames(mcdata.array) <- list(
      A=c("Apos","Aneg"),          # ROW
      B=c("Bpos","Bneg"),          # COLUMN
      TRUTH=c("present","absent")  # SLICE
    )

Possible data modifications go in this chunk!

    #mcdata.array[mcdata.array==16] = 36
    mcdata.array["Apos", "Bneg", "absent"] = 36   # originally 16
    mcdata.array["Apos", "Bneg", "present"] = 37  # originally 17

Now we will reshape it into a "long skinny" data frame. That will be handy for some analyses.

    mcdata.df <- expand.grid(dimnames(mcdata.array))
    mcdata.df$count <- c(mcdata.array)
    if(printDetails) print(mcdata.df)

Now we can use mcdata.array and mcdata.df to test hypotheses.

Hypothesis: A is more likely to say "positive" than B.
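Stated formally (the notation here is introduced for illustration and is not from the original), let $p_{ab}$ be the probability that test A gives result $a$ and test B gives result $b$, with $a, b \in \{+, -\}$. The hypothesis then involves only the two discordant cells:

```latex
\begin{aligned}
P(A{+}) - P(B{+}) &= (p_{++} + p_{+-}) - (p_{++} + p_{-+}) = p_{+-} - p_{-+}, \\
H_0 &: p_{+-} = p_{-+}, \qquad H_A : p_{+-} > p_{-+}.
\end{aligned}
```

The concordant cells (both positive, or both negative) cancel out of the comparison, which is why only the "Apos, Bneg" and "Aneg, Bpos" counts are needed.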
One approach: Just collapse over TRUTH:

    AposBneg <- mcdata.array["Apos", "Bneg", "present"] +
      mcdata.array["Apos", "Bneg", "absent"]
    if(printDetails)
      print(AposBneg <-    ### Alternative
        sum(mcdata.df[mcdata.df$A=="Apos" & mcdata.df$B=="Bneg", "count"]))
    AposBneg <- sum(mcdata.array["Apos", "Bneg", ])

Here is the other marginalized off-diagonal cell:

    AnegBpos <- sum(mcdata.array["Aneg", "Bpos", ])

McNemar test, using the binomial method:

    cat("McNemar Ptest_AMorePos = ",
        Ptest_AMorePos <- round(digits=3,
          pbinom(AnegBpos, AposBneg+AnegBpos, 1/2)),
        "\n")

    ## McNemar Ptest_AMorePos = 0

Compare with mcnemar.test, but be careful: it is TWO-SIDED. The apply() call is collapsing over the 3rd dimension (TRUTH).

    test_AMoreOrLessPos <- mcnemar.test(apply(mcdata.array, 1:2, sum))
    Ptest_AMoreOrLessPos <- test_AMoreOrLessPos$p.value

This is very limited, though. One test might be much better at agreeing with the truth when it's "present" but not when it's "absent".

Another approach: control for TRUTH by combining evidence.

Hypothesis: A is more likely to be CORRECT.

There are several approaches.

    # Reverse both margins of the "absent" slice so that cell [1, 1] means
    # "both correct" in each slice, then add the slices.
    mcdata.combined <- mcdata.array[ , , 1] + mcdata.array[ 2:1, 2:1, 2]
    dimnames(mcdata.combined) <- list(c("Aright", "Awrong"),
                                      c("Bright", "Bwrong"))
    mcdata.combined

    ##        Bright Bwrong
    ## Aright    300     59
    ## Awrong     44     29

    Pvalue.combined = print(mcnemar.test(mcdata.combined))$p.value

    ##
    ##  McNemar's Chi-squared test with continuity correction
    ##
    ## data:  mcdata.combined
    ## McNemar's chi-squared = 1.9029, df = 1, p-value = 0.1678

We can also use our data frame structure:

    mcdata.df$Acorrect <-
      (mcdata.df$A=="Apos" & mcdata.df$TRUTH=="present") |
      (mcdata.df$A=="Aneg" & mcdata.df$TRUTH=="absent")
    mcdata.df$Bcorrect <-
      (mcdata.df$B=="Bpos" & mcdata.df$TRUTH=="present") |
      (mcdata.df$B=="Bneg" & mcdata.df$TRUTH=="absent")
    ### Collapsing over TRUTH:
    AcorrectBincorrect <- mcdata.df$Acorrect & ! mcdata.df$Bcorrect
    nAcorrectBincorrect <- sum(mcdata.df$count[AcorrectBincorrect])
    ABdisagree <- mcdata.df$Acorrect != mcdata.df$Bcorrect
    nABdisagree <- sum(mcdata.df$count[ABdisagree])
    cat("(collapsing) P = ",
        1 - pbinom(nAcorrectBincorrect - 1, nABdisagree, 1/2),
        "\n")

    ## (collapsing) P = 0.08372217

This P is ONE-sided.

Alternatively, using the array structure:

    mcdata.correctness <- mcdata.array[ , , 1] + t(mcdata.array[ , , 2])
    dimnames(mcdata.correctness) <- list(A=c("correct", "incorrect"),
                                         B=c("correct", "incorrect"))
    mcdata.correctness

    ##            B
    ## A           correct incorrect
    ##   correct       114        59
    ##   incorrect      44       215

    mcnemar.test(mcdata.correctness)

    ##
    ##  McNemar's Chi-squared test with continuity correction
    ##
    ## data:  mcdata.correctness
    ## McNemar's chi-squared = 1.9029, df = 1, p-value = 0.1678

Note the use of the transpose function t() to reflect the "absent" part of the array. The transpose swaps only the off-diagonal cells, so the diagonal entries (114 and 215) mix "both correct" with "both incorrect" counts and differ from the 300 and 29 in mcdata.combined. McNemar's test uses only the off-diagonal cells (59 and 44), which is why the result is identical.

If there is an imbalance in the marginal for TRUTH (and there is), we should try controlling for TRUTH. The worry: if most cases are "absent", then B might be right more often just because B says "negative" more often.

    pos vs neg \\
                 ------ A vs B
               /
    present vs absent

To define and predict "correctness", adjusting for propensity to say "positive", we need to make a data set double the size. Then we will use Poisson regression, via the function glm().
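Before fitting a model, the worry can be made concrete by computing each test's accuracy separately within each TRUTH stratum. The following is a minimal sketch, assuming the modified counts (37 and 36 in the "Apos, Bneg" cells); the names `arr`, `n_truth`, `A_by_truth`, `B_by_truth`, and `accuracy` are illustrative and not part of the original analysis.

```r
# Counts entered in column-major order: A varies fastest, then B, then TRUTH.
arr <- array(
  c(100, 8, 37, 15,     # TRUTH = present
    14, 22, 36, 200),   # TRUTH = absent
  dim = c(2, 2, 2),
  dimnames = list(A = c("Apos", "Aneg"),
                  B = c("Bpos", "Bneg"),
                  TRUTH = c("present", "absent"))
)

# Number of cases in each TRUTH stratum.
n_truth <- apply(arr, 3, sum)

# Marginal result of each test within each stratum.
A_by_truth <- apply(arr, c(1, 3), sum)   # rows: Apos, Aneg
B_by_truth <- apply(arr, c(2, 3), sum)   # rows: Bpos, Bneg

# A test is correct when it says "positive" for present, "negative" for absent.
accuracy <- rbind(
  A = c(present = A_by_truth["Apos", "present"],
        absent  = A_by_truth["Aneg", "absent"]) / n_truth,
  B = c(present = B_by_truth["Bpos", "present"],
        absent  = B_by_truth["Bneg", "absent"]) / n_truth
)
print(round(accuracy, 3))
```

In this sketch A is the more accurate test when the disease is present and B is the more accurate when it is absent, which is the kind of pattern a single collapsed comparison can hide. The regression that follows addresses the same question within one model.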
    mcdata2 <- rbind(mcdata.df, mcdata.df)
    mcdata2$test <- rep(c('A','B'), each=8)
    (mcdata2$correct <- ifelse(mcdata2$test=="A",
                               mcdata2$Acorrect, mcdata2$Bcorrect))

    ##  [1]  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE
    ## [12] FALSE FALSE FALSE  TRUE  TRUE

    mcdata2 <- mcdata2[c('TRUTH','count','test','correct')]
    require("MASS")

    ## Loading required package: MASS

    (glm.out.1 <- glm(data=mcdata2, correct ~ TRUTH + test,
                      weights=count, family=poisson) )

    ##
    ## Call:  glm(formula = correct ~ TRUTH + test, family = poisson, data = mcdata2,
    ##     weights = count)
    ##
    ## Coefficients:
    ## (Intercept)  TRUTHabsent        testB
    ##    -0.24595      0.09498     -0.04268
    ##
    ## Degrees of Freedom: 15 Total (i.e. Null);  13 Residual
    ## Null Deviance:      289.9
    ## Residual Deviance: 288.2     AIC: 1700

    summary(glm.out.1)

    ##
    ## Call:
    ## glm(formula = correct ~ TRUTH + test, family = poisson, data = mcdata2,
    ##     weights = count)
    ##
    ## Deviance Residuals:
    ##    Min      1Q  Median      3Q     Max
    ## -7.868  -4.859  -1.423   1.599   2.754
    ##
    ## Coefficients:
    ##             Estimate Std. Error z value Pr(>|z|)
    ## (Intercept) -0.24595    0.07379  -3.333 0.000859 ***
    ## TRUTHabsent  0.09498    0.07915   1.200 0.230138
    ## testB       -0.04268    0.07545  -0.566 0.571603
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ##
    ## (Dispersion parameter for poisson family taken to be 1)
    ##
    ##     Null deviance: 289.94  on 15  degrees of freedom
    ## Residual deviance: 288.17  on 13  degrees of freedom
    ## AIC: 1700.2
    ##
    ## Number of Fisher Scoring iterations: 5

Now we include the interaction term.

    (glm.out.2 <- glm(data=mcdata2, correct ~ TRUTH*test,
                      weights=count, family=poisson) )

    ##
    ## Call:  glm(formula = correct ~ TRUTH * test, family = poisson, data = mcdata2,
    ##     weights = count)
    ##
    ## Coefficients:
    ##       (Intercept)        TRUTHabsent              testB
    ##          -0.15519           -0.04793           -0.23785
    ## TRUTHabsent:testB
    ##           0.29900
    ##
    ## Degrees of Freedom: 15 Total (i.e. Null);  12 Residual
    ## Null Deviance:      289.9
    ## Residual Deviance: 284.6     AIC: 1699

    summary(glm.out.2)

    ##
    ## Call:
    ## glm(formula = correct ~ TRUTH * test, family = poisson, data = mcdata2,
    ##     weights = count)
    ##
    ## Deviance Residuals:
    ##    Min      1Q  Median      3Q     Max
    ## -7.666  -4.964  -1.435   1.161   3.689
    ##
    ## Coefficients:
    ##                   Estimate Std. Error z value Pr(>|z|)
    ## (Intercept)       -0.15519    0.08544  -1.816   0.0693 .
    ## TRUTHabsent       -0.04793    0.10865  -0.441   0.6591
    ## testB             -0.23785    0.12868  -1.848   0.0645 .
    ## TRUTHabsent:testB  0.29900    0.15906   1.880   0.0601 .
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ##
    ## (Dispersion parameter for poisson family taken to be 1)
    ##
    ##     Null deviance: 289.94  on 15  degrees of freedom
    ## Residual deviance: 284.62  on 12  degrees of freedom
    ## AIC: 1698.6
    ##
    ## Number of Fisher Scoring iterations: 5

SUMMARY

The data:

    mcdata.array

    ## , , TRUTH = present
    ##
    ##       B
    ## A      Bpos Bneg
    ##   Apos  100   37
    ##   Aneg    8   15
    ##
    ## , , TRUTH = absent
    ##
    ##       B
    ## A      Bpos Bneg
    ##   Apos   14   36
    ##   Aneg   22  200

P value for HA:"A says positive more", via binomial test:

    Ptest_AMorePos

    ## [1] 0

P value for HA:"A or B says positive more", via chi-square test:

    Ptest_AMoreOrLessPos

    ## [1] 3.497622e-05

P value for HA:"A and B differ in correctness":

    Pvalue.combined

    ## [1] 0.1677527

Model fit controlling for Prob(positive | test):

    summary(glm.out.1)$coef

    ##                Estimate Std. Error    z value     Pr(>|z|)
    ## (Intercept) -0.24595011 0.07378803 -3.3331981 0.0008585379
    ## TRUTHabsent  0.09498272 0.07915200  1.2000041 0.2301377443
    ## testB       -0.04268073 0.07544861 -0.5656928 0.5716026459

Model fit controlling for Prob(positive | test) + interaction:

    summary(glm.out.2)$coef

    ##                     Estimate Std. Error    z value   Pr(>|z|)
    ## (Intercept)       -0.1551929 0.08543577 -1.8164862 0.06929582
    ## TRUTHabsent       -0.0479318 0.10864518 -0.4411774 0.65908459
    ## testB             -0.2378497 0.12867995 -1.8483820 0.06454710
    ## TRUTHabsent:testB  0.2990041 0.15906075  1.8798109 0.06013386
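As a cross-check on the P values listed above, the McNemar results can be recomputed from the discordant counts alone. This is a minimal sketch, assuming the modified counts in the data table; the helper `mcnemar_chisq` and the variable names are illustrative and not part of the original analysis.

```r
# Continuity-corrected McNemar statistic from the two discordant counts.
mcnemar_chisq <- function(n12, n21) (abs(n12 - n21) - 1)^2 / (n12 + n21)

# "Says positive" comparison: discordant counts summed over TRUTH.
AposBneg <- 37 + 36   # A positive, B negative
AnegBpos <- 8 + 22    # A negative, B positive

# One-sided exact (binomial) P value for "A says positive more".
p_one_sided <- pbinom(AnegBpos, AposBneg + AnegBpos, 1/2)

# Two-sided chi-squared P value for "A or B says positive more".
p_two_sided <- pchisq(mcnemar_chisq(AposBneg, AnegBpos), df = 1,
                      lower.tail = FALSE)

# "Correctness" comparison: discordant counts after aligning on TRUTH.
ArightBwrong <- 37 + 22   # present: Apos/Bneg; absent: Aneg/Bpos
AwrongBright <- 8 + 36    # present: Aneg/Bpos; absent: Apos/Bneg

stat_correct <- mcnemar_chisq(ArightBwrong, AwrongBright)
p_correct <- pchisq(stat_correct, df = 1, lower.tail = FALSE)

print(c(one_sided = p_one_sided, two_sided = p_two_sided,
        correctness = p_correct))
```

The correctness statistic works out to 196/103, about 1.9029, which agrees with the mcnemar.test() output shown earlier.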