Psych 613 Fall 2016 Problem Set 3 Solutions

1) Continue the cancer example from the previous problem set

More detail on the data file: Patients with advanced cancers of the stomach, bronchus, colon, ovary or breast were treated with vitamin C (ascorbate). The purpose of the study was to determine whether patient survival differed depending on the organ affected by the cancer. Adapted from Cameron, E. and Pauling, L. (1978) Supplemental ascorbate in the supportive treatment of cancer: re-evaluation of prolongation of survival times in terminal human cancer. Proceedings of the National Academy of Sciences, 75, 4538-42.

(a) Present a graph of the five cell means with 1 standard error around each mean.

SPSS Syntax

    graph /errorbar(sterror 1) = survival by cantype.

R Syntax

    install.packages("gplots")
    library(gplots)
    # p = .68 draws bars of roughly +/- 1 standard error around each mean
    plotmeans(formula = survival ~ cantype, data = cancer, p = .68,
              ylab = "Survival Time in Days",
              xlab = "Five Types of Cancer Patients")

(b) Test the research hypothesis that there is a difference among the five means. Use α = 0.05. Interpret the result in words; you can refer to the graph in part 1a.

We reject the null hypothesis that there is no difference among the five cell means, F(4, 59) = 6.43, p = .0002.

SPSS Syntax

    oneway survival by cantype.

R Syntax

    aov1 <- aov(formula = survival ~ cantype, data = cancer)
    summary(aov1)

                Df   Sum Sq Mean Sq F value   Pr(>F)
    cantype      4 11535761 2883940   6.433 0.000229 ***
    Residuals   59 26448144  448274

(c) Using only the F test for the ANOVA you ran above, can the researchers claim support for the hypothesis that ascorbates will have the best effect on survival rate for breast cancer patients? What else in addition to the F test is needed to make this claim? No computation is necessary, just explain.

No. The omnibus F test only indicates that the five means are not all equal; it does not test the hypothesis that ascorbates will have the best effect on survival rate for breast cancer patients (see answer 1b).
One could compare survival time among the five groups, say with a Tukey test or a contrast, in order to determine whether ascorbates have the strongest effect on breast cancer patients.

(d) Examine the statistical assumptions for these data. Use boxplots and normal probability plots. Do you think assumptions are violated in the data set? Explain your conclusions and use the graphs to support your claims. No need to perform a transformation or take remedial measures for this part.

The boxplots for ovary and breast cancer patients look quite different from those of the other patient groups. For example, the breast cancer group has a few outliers and the ovary cancer group's median is not centered within the interquartile range, which suggests that the normality and equal-variance assumptions may be violated.

SPSS Syntax

    examine variables = survival by cantype
      /plot boxplot npplot spreadlevel
      /statistics = none
      /nototal.

R Syntax

    boxplot(formula = survival ~ cantype, data = cancer,
            ylab = "Patient Survival", xlab = "Organ Affected by Cancer")

    # One normal Q-Q plot per cancer type (cantype must be a factor)
    for (lev in levels(cancer$cantype)) {
      y <- cancer[cancer$cantype == lev, "survival"]
      qqnorm(y, main = paste0("Normal Q-Q Plot for ", lev, " Cancer Survival"))
      qqline(y)
    }

    install.packages("car")
    library(car)
    spreadLevelPlot(survival ~ cantype, data = cancer)

(e) Using the spread and level plot you computed in the previous problem set, perform the power transformation suggested by the plot (round up or down to the nearest rung on the ladder of power transformations, e.g., square, identity, sqrt, log, reciprocal). Redo the boxplot but this time on the transformed variable. Does the boxplot look better, the same, worse? Explain.

The boxplot looks better: fewer points fall outside the whiskers, and the interquartile ranges are more similar across groups. Note that the variable named survival.cubed in the syntax holds the cube root of survival (power 1/3), not its cube.

SPSS Syntax

    compute survival.cubed = survival ** (1/3).
    examine variables = survival.cubed by cantype
      /plot boxplot
      /statistics = none
      /nototal.
R Syntax

    # Cube-root transform (the variable name says "cubed", but the power is 1/3)
    cancer$survival.cubed <- with(cancer, survival^(1/3))
    boxplot(formula = survival.cubed ~ cantype, data = cancer,
            ylab = "Survival Time in Days (cube root)",
            xlab = "Five Types of Cancer Patients")

(f) Recompute the ANOVA F on the transformed variable from part 1e. Did the F change much?

No, the F changed little across the transformations:

    Raw survival time:                             F(4, 59) = 6.43
    Square-root survival time (survival.squared):  F(4, 59) = 6.48
    Cube-root survival time (survival.cubed):      F(4, 59) = 5.93

SPSS Syntax

    oneway survival survival.cubed survival.squared by cantype.

R Syntax

    # Assumption: the definition of survival.squared is not shown in this solution;
    # the scale of its sums of squares below matches a square-root transform.
    cancer$survival.squared <- with(cancer, sqrt(survival))

    lapply(X = c("survival", "survival.squared", "survival.cubed"),
           FUN = function(x) {
             summary(aov(formula = as.formula(paste0(x, " ~ cantype")), data = cancer))
           })

    survival
                Df   Sum Sq Mean Sq F value   Pr(>F)
    cantype      4 11535761 2883940   6.433 0.000229 ***
    Residuals   59 26448144  448274

    survival.squared
                Df Sum Sq Mean Sq F value   Pr(>F)
    cantype      4   3295   823.8   6.484 0.000215 ***
    Residuals   59   7495   127.0

    survival.cubed
                Df Sum Sq Mean Sq F value   Pr(>F)
    cantype      4  169.2   42.31   5.939 0.000438 ***
    Residuals   59  420.3    7.12

(g) The researcher is interested in testing whether breast cancer patients who were given ascorbate lived longer than stomach cancer patients given ascorbate. What contrast would the researcher perform to test this research question? Calculate the contrast value using the original untransformed data and test the null hypothesis that the population contrast value is 0 (using α = 0.05). Also, calculate the 95% CI around the contrast value.

If the levels are ordered Stomach, Bronchus, Colon, Ovary, and Breast, then we can set the contrast weights to -1, 0, 0, 0, 1, which compares survival days between breast and stomach cancer patients (breast minus stomach). The contrast value is 1109.91 days (SE = 274.29). When given ascorbate, breast cancer patients tended to live longer than stomach cancer patients: the pooled-variance test gives t(59) = 4.05, p = .0002, and the test reported as t(11.33) = 2.88, whose fractional degrees of freedom indicate the version that does not assume equal variances, leads to the same conclusion. The 95% CI for breast minus stomach is (561.057, 1658.761); the R output lists it as (-1658.761, -561.057) because the R call uses the reversed weights 1, 0, 0, 0, -1 (stomach minus breast).

SPSS Syntax

    oneway survival by cantype
      /statistics all
      /contrasts = -1, 0, 0, 0, 1.
R Syntax

    install.packages("gmodels")
    library(gmodels)
    # Weights (1, 0, 0, 0, -1) estimate stomach minus breast, so the signs
    # are reversed relative to the SPSS contrast (-1, 0, 0, 0, 1)
    fit.contrast(model = aov1, varname = "cantype",
                 coeff = c(1, 0, 0, 0, -1), conf.int = .95, df = TRUE)

                              Estimate Std. Error   t value     Pr(>|t|) DF  lower CI upper CI
    cantype c=( 1 0 0 0 -1 ) -1109.909   274.2895 -4.046488 0.0001533048 59 -1658.761 -561.057

(h) Perform a Tukey test on all pairwise means. Summarize in words your findings from the Tukey test.

In the output below, the breast cancer group survived significantly longer than the stomach group (p = .002) and the bronchus group (p = .0005). The breast vs. colon difference was marginal (p = .062), and no other pairwise difference was significant.

SPSS Syntax

    oneway survival by cantype
      /statistics all
      /ranges=tukey.

R Syntax

    # Assumption: aov3 is not defined earlier in this solution; the size of the
    # differences below suggests it is the ANOVA on the cube-root variable.
    aov3 <- aov(formula = survival.cubed ~ cantype, data = cancer)
    TukeyHSD(aov3)

                           diff         lwr      upr     p adj
    Bronchus-Stomach -0.2213573 -2.98852004 2.545805 0.9994136
    Colon-Stomach     1.4429273 -1.32423543 4.210090 0.5874049
    Ovary-Stomach     2.6847237 -1.02208101 6.391528 0.2610615
    Breast-Stomach    4.2551424  1.17828191 7.332003 0.0022990
    Colon-Bronchus    1.6642846 -0.91180367 4.240373 0.3732192
    Ovary-Bronchus    2.9060810 -0.66035172 6.472514 0.1616993
    Breast-Bronchus   4.4764998  1.57028022 7.382719 0.0005371
    Ovary-Colon       1.2417964 -2.32463633 4.808229 0.8632854
    Breast-Colon      2.8122151 -0.09400439 5.718435 0.0624480
    Breast-Ovary      1.5704187 -2.24131623 5.382154 0.7741468

(i) One of the Tukey pairwise tests involves comparing the breast cancer group to the stomach cancer group, as you did earlier. Explain the underlying difference between the contrast you did earlier and the Tukey test that compared the means of the same two groups.

When one plans to test a single difference between groups before the data are collected, the probability of a false positive is 5%. However, when one can search the data for the greatest possible difference (the greatest vs. the smallest mean), the probability of a false positive exceeds 5%. The Tukey procedure uses the sampling distribution of the range (i.e., it allows for snooping and comparing the smallest vs. the greatest mean), not the sampling distribution of a single mean difference (as in a planned contrast). Tukey's range distribution produces larger critical values, making it harder to reject the null.
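To make the last point concrete, the two critical values can be compared directly. The following is a minimal sketch, assuming the design in this problem (5 groups, 59 error degrees of freedom, α = .05); the variable names are illustrative and not taken from the lecture notes. It uses only base R's qt() and qtukey().

```r
# Critical values for one pairwise comparison, both expressed on the t scale.
alpha    <- 0.05
k        <- 5    # number of groups
df.error <- 59   # residual degrees of freedom from the ANOVA

# Planned contrast: ordinary two-sided t critical value
t.crit <- qt(1 - alpha / 2, df.error)

# Tukey: studentized range critical value, divided by sqrt(2)
# so that it is comparable to a t statistic for a pairwise difference
tukey.crit <- qtukey(1 - alpha, nmeans = k, df = df.error) / sqrt(2)

print(c(planned = t.crit, tukey = tukey.crit))
```

The Tukey value is the larger of the two, which is why the same breast vs. stomach difference faces a stricter criterion under the Tukey test than under the planned contrast.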
(j) Calculate the contrast you performed earlier between the means of the breast and stomach cancer groups using Scheffe's procedure at α = 0.05. Also, calculate the 95% confidence interval for the contrast using the Scheffe procedure. Note: compute this part using the equations presented in the lecture notes, as in general Scheffe tests on contrasts are not supported by statistical packages, so in real situations you'll need to do them by hand if you use SPSS, or you can write your own functions in R.

Scheffe (stomach minus breast, as in the R contrast output): Estimate = -1109.9, S = 872.2, Scrit = 3.18, df = 59, 95% CI (-1982.1, -237.7). The interval excludes 0 (equivalently, |t| = 4.05 exceeds Scrit = 3.18), so the contrast is still significant under the Scheffe criterion.

R Syntax

    # From the previous output we know |estimate| = 1109.91, SE = 274.29, df = 59.
    # Using the positive estimate (breast minus stomach) gives the interval
    # (237.7, 1982.1), the mirror image of the one reported above.
    Scrit <- sqrt((5 - 1) * qf(1 - .05, 5 - 1, 59))
    S <- Scrit * 274.29
    Lower <- 1109.91 - S
    Upper <- 1109.91 + S

For those of you who want more general-purpose code in R:

    scheffe <- function(ihat, se.ihat, ngroups, df.error, alpha = .05) {
      # Scheffe critical value: (k - 1) times the omnibus F critical value
      ScheffeFcritical <- (ngroups - 1) * qf(1 - alpha, ngroups - 1, df.error)
      Scheffetcritical <- sqrt(ScheffeFcritical)
      S <- Scheffetcritical * se.ihat
      result <- data.frame(ihat = ihat, se.ihat = se.ihat, df.error = df.error,
                           Scheffetcritical = Scheffetcritical, S = S,
                           lowerCI = ihat - S, upperCI = ihat + S)
      return(result)
    }

    # Contrast estimate from the group means (stomach minus breast)
    ihat <- sum(with(cancer, tapply(survival, cantype, mean)) * c(1, 0, 0, 0, -1))

    # Standard error and error df from the pooled-variance contrast test
    fc <- fit.contrast(model = aov1, varname = "cantype",
                       coeff = c(1, 0, 0, 0, -1), conf.int = .95, df = TRUE)
    se.ihat  <- fc[1, "Std. Error"]
    df.error <- fc[1, "DF"]

    scheffe(ihat = ihat, se.ihat = se.ihat, ngroups = 5,
            df.error = df.error, alpha = .05)

         ihat  se.ihat df.error Scheffetcritical        S   lowerCI   upperCI
    -1109.909 274.2895       59         3.179878 872.2073 -1982.116 -237.7018

(k) Explain the underlying difference between the contrast you did earlier between the breast and stomach cancer groups and the Scheffe test that compared the means of the same two groups.
To determine the critical value for an inferential test, the Scheffe procedure uses the sampling distribution of F maximum, that is, the F for the contrast with the greatest possible sum of squares (SSC) among all contrasts on the group means. Its critical value is built from the F critical value for the omnibus test, which makes the critical value for any given contrast harder to "beat" than the critical value for a planned contrast.
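As a worked illustration with the numbers from part (j), the Scheffe criterion can be written out as follows. The symbols k (number of groups) and \hat{I} (contrast estimate) are notation assumed here, and the F critical value of about 2.53 is back-calculated from the reported Scrit = 3.18 rather than taken from the original output.

```latex
\begin{aligned}
S_{\text{crit}} &= \sqrt{(k-1)\,F_{\alpha;\,k-1,\,df_{\text{error}}}}
                 = \sqrt{4 \times F_{.05;\,4,\,59}}
                 \approx \sqrt{4 \times 2.53} \approx 3.18 \\
S &= S_{\text{crit}} \times SE(\hat{I}) \approx 3.18 \times 274.29 \approx 872.2 \\
95\%\ \text{CI} &= \hat{I} \pm S = -1109.9 \pm 872.2 = (-1982.1,\ -237.7)
\end{aligned}
```

For comparison, the planned contrast with 59 error degrees of freedom uses a two-sided t critical value of about 2.00, so the Scheffe interval is wider than the planned-contrast interval for the same estimate and standard error.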
