Study Guide for Exam 3: PubH 6414 Biostatistical Methods I

Concepts to Review for Exam 3

Review: Hypothesis Testing
• Given the description of a study, be able to identify the correct null and/or alternative hypotheses from a selection of choices.
  o The null hypothesis claims that the parameter of interest equals some specified value or, for hypotheses about two parameters, that the two parameters are equal, which is the same as stating that the difference between the parameters is 0.
  o Alternative hypotheses can be either one-sided or two-sided.
    ▪ Choose one-sided if the research question is only interested in a difference in one specified direction.
      • These will be in the form of < or > the value specified in the null hypothesis.
    ▪ Choose two-sided if the research question indicates interest in ‘any difference’.
      • These will be in the form of ‘not equal’ to the value specified in the null hypothesis, or that the two parameters are not equal to each other.
• Know the definition of Type I and Type II errors.
  o Alpha = probability of Type I error = probability that a true null hypothesis is rejected
  o Beta = probability of Type II error = probability that a false null hypothesis is not rejected
• Understand p-values.
  o Given an alpha level for a test, be able to identify p-values that will result in the decision to reject the null hypothesis.
    ▪ If the p-value is < alpha level then you reject the null hypothesis.
  o Understand the relationship between test statistic and p-value: the p-value is the probability of having a test statistic as extreme or more extreme than the calculated test statistic if the null hypothesis is true.
    ▪ For a two-sided test, the p-value is the sum of the area to the right of the positive test statistic and the area to the left of the negative test statistic.
    ▪ Large test statistics (in absolute value) have small p-values.
• Understand the relationships between alpha level, rejection regions, critical values, and p-values – draw pictures to illustrate the relationships.
  o The rejection region is the tail area under the distribution of the test statistic that is equal to the alpha level.
  o The critical value is the value that separates the rejection region from the rest of the area under the distribution.
  o For a two-sided test, the rejection region is in the two tails of the distribution (equal area in both tails).
  o For a one-sided test, the rejection region is in one tail of the distribution.
  o If the p-value < alpha level then the test statistic is in the rejection region.

Hypothesis Tests and Confidence Intervals of Means: Paired and two sample
• Given a description of a study involving means be able to identify:
  o the correct null and alternative hypotheses
  o the correct test for the study
    ▪ one-sample t-test (know when this is a z-test instead of a t-test)
    ▪ paired t-test: when data are paired and the difference is calculated
    ▪ two-sample t-test
      • there are two choices for this test: equal variance and unequal variance
        o Use the F-test of variances or the “rule of thumb” (largest SD / smallest SD < 2 supports equal variance) to decide
        o if the F-test is significant (p < 0.05), use unequal variance
        o if the F-test is not significant (p > 0.05), use equal variance
  o the correct t-test statistic for each of the above tests
    ▪ know the SE for each of the tests
  o the correct degrees of freedom for each of the above t-statistics
  o know how to use TINV to find the critical values of the test
• Given a study description, alpha level for the test and the results of the study (i.e. p-value and test-statistic), be able to identify the correct hypothesis test decision and interpretation.
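
The choice between the equal-variance and unequal-variance two-sample t-test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `two_sample_t` and the sample data are hypothetical, the decision uses the SD-ratio rule of thumb rather than the F-test, and the degrees of freedom in the unequal-variance branch use the Satterthwaite approximation, which the guide does not spell out. Both SDs are assumed to be greater than zero.

```python
import math
import statistics


def two_sample_t(x, y):
    """Return (t statistic, degrees of freedom, variance assumption used)."""
    n1, n2 = len(x), len(y)
    m1, m2 = statistics.mean(x), statistics.mean(y)
    s1, s2 = statistics.stdev(x), statistics.stdev(y)  # sample SDs (n - 1 denominator)

    # Rule of thumb: largest SD / smallest SD < 2 -> treat variances as equal.
    equal_var = max(s1, s2) / min(s1, s2) < 2

    if equal_var:
        # Pooled variance; df = n1 + n2 - 2.
        sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Separate variances; Satterthwaite df (an assumption, not from the guide).
        v1, v2 = s1**2 / n1, s2**2 / n2
        se = math.sqrt(v1 + v2)
        df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

    t = (m1 - m2) / se
    return t, df, "equal" if equal_var else "unequal"


# Hypothetical data: two groups of five observations each.
print(two_sample_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5]))
```

The returned t statistic would then be compared with the critical value for the returned degrees of freedom (for example, from TINV in Excel).
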

Confidence Interval of Mean Difference
• Used for paired data
• Calculate the difference between the observations
• Estimate the mean difference and SE of the mean difference
• Know the correct degrees of freedom for the t-coefficient: number of paired observations minus 1
• Construct confidence interval

Confidence Interval for the difference between two independent means
• Use when you are comparing two independent (not paired) groups
• Calculate the sample mean for each group
• Calculate the sample difference of the means and the SE of the difference of the means
• Know the correct degrees of freedom for the t-coefficient: sample size for group 1 + sample size for group 2 minus 2
• Construct confidence interval

Hypothesis Tests and CI of Proportions and Hypothesis Tests for Contingency Tables
• Given a description of a study involving proportions be able to identify:
  o the correct null and alternative hypotheses
  o the correct test for the study
    ▪ one-sample z-test of proportions
    ▪ two-sample z-test of proportions
  o the correct test statistic for each of these tests (i.e. be able to recognize the standard error for each of these)
    ▪ t-statistics are not used for tests of proportions
• Confidence Interval for the difference of population proportions
  o know the formula for the SE of the difference of two proportions: the CI uses the two sample proportions separately, whereas the overall (pooled) proportion is used in the SE of the two-sample z-test
• Chi-square test
  o Know how to find the expected cell counts given the observed counts in a contingency table
  o Know the correct degrees of freedom for the Chi-square test
    ▪ the Chi-square distribution changes shape depending on the df
  o Chi-square test is always a two-sided test
  o Chi-square test can be used to test for independence or to compare two proportions
    ▪ the p-value for the Chi-square test of two proportions is equal to the p-value for the two-sample z-test of two proportions.
• McNemar Chi-square test
  o This is only used to compare proportions for paired data
  o Know the formula for calculating the McNemar Chi-square test statistic
• Fisher’s Exact test – know when this is more appropriate than the Chi-square test (no need to calculate this statistic)
• Given a study description, alpha level for the test and the results of the study (i.e. p-value and test-statistic), be able to identify the correct hypothesis test decision and interpretation.

ANOVA: Analysis of Variance
• Know when you would use ANOVA.
• Be able to recognize the formula for the F-statistic for ANOVA.
• Know the degrees of freedom for an ANOVA F-statistic
  o Numerator df and denominator df
• Understand the relationship between SST, SSE, SSA, MSE, MSA.
• Know the null hypothesis and alternative hypothesis for ANOVA.
• Correctly interpret ANOVA results.
  o What does a significant F-test tell you?
• Understand how the significance level is adjusted for multiple comparisons using Bonferroni (post-hoc tests)

Confidence Intervals for OR and RR

Confidence Interval for Odds Ratio
• Sampling distribution of OR is not normal but sampling distribution of ln(OR) is approximately normal.
• Construct CI on ln (natural log) scale
• SE of ln(OR) = $\sqrt{1/a + 1/b + 1/c + 1/d}$
  o where a, b, c, d are cell counts from a 2 X 2 table
• Use z-coefficients for CI of ln(OR)
• After constructing the CI on the ln scale, exponentiate the endpoints so that you have a confidence interval for the OR
  o your estimate of the OR should be contained in the confidence interval
• Know how to use the LN and EXP functions in Excel when constructing the confidence interval for Odds Ratio.
• If the 95% CI for the OR contains the value 1.0, you cannot conclude that there is a statistical relationship (at alpha = 0.05) between the variables.
• If the 95% CI for the OR does not contain the value 1.0, this is evidence of a statistically significant relationship (at alpha = 0.05) between the two groups.
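
The odds ratio steps (estimate, SE on the ln scale, z-coefficient, exponentiate) can be sketched in Python as follows. The function name `odds_ratio_ci` and the cell counts are hypothetical, and the sketch assumes a 2 X 2 table laid out with a and b in the first row and c and d in the second, so that OR = ad / bc; `math.log` and `math.exp` play the role of Excel's LN and EXP. All four counts are assumed to be greater than zero.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Return (OR estimate, lower limit, upper limit) for a 2 X 2 table."""
    or_hat = (a * d) / (b * c)

    # SE of ln(OR) = sqrt(1/a + 1/b + 1/c + 1/d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    # z-coefficient for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)

    # Build the interval on the ln scale, then exponentiate the endpoints.
    log_or = math.log(or_hat)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return or_hat, lower, upper


# Hypothetical counts: a=20, b=80, c=10, d=90.
or_hat, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(or_hat, lower, upper)
print("CI contains 1.0:", lower <= 1.0 <= upper)
```

Because the interval is symmetric on the ln scale, the OR estimate always falls inside the exponentiated interval, although not at its midpoint.
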

Confidence Interval for Relative Risk
• Sampling distribution of RR is not normal but sampling distribution of ln(RR) is approximately normal
• Construct CI on ln (natural log) scale
• SE of ln(RR) = $\sqrt{\dfrac{b}{a(a+b)} + \dfrac{d}{c(c+d)}}$
  o where a, b, c, d are cell counts from a 2 X 2 table
• Use z-coefficients for CI of ln(RR)
• After constructing the CI on the ln scale, exponentiate the endpoints so that you have a confidence interval for the RR
  o your estimate of the RR should be contained in the confidence interval
• Know how to use the LN and EXP functions in Excel when constructing the confidence interval for Relative Risk.
• If the 95% CI for the RR contains the value 1.0, you cannot conclude that there is a statistical relationship (at alpha = 0.05) between the variables.
• If the 95% CI for the RR does not contain the value 1.0, this is evidence of a statistically significant relationship (at alpha = 0.05) between the two groups.
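
A matching sketch for the relative risk interval is given below. It assumes the same hypothetical 2 X 2 layout (a and b in the first row, c and d in the second), so that RR = [a / (a + b)] / [c / (c + d)] and the SE of ln(RR) is the square root of b / (a(a + b)) + d / (c(c + d)). The function name `relative_risk_ci` and the counts are illustrative only, and a and c are assumed to be greater than zero.

```python
import math
from statistics import NormalDist


def relative_risk_ci(a, b, c, d, level=0.95):
    """Return (RR estimate, lower limit, upper limit) for a 2 X 2 table."""
    risk1 = a / (a + b)  # risk in the first row
    risk2 = c / (c + d)  # risk in the second row
    rr = risk1 / risk2

    # SE of ln(RR) = sqrt( b / (a(a+b)) + d / (c(c+d)) )
    se = math.sqrt(b / (a * (a + b)) + d / (c * (c + d)))

    # z-coefficient for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)

    # Build the interval on the ln scale, then exponentiate the endpoints.
    log_rr = math.log(rr)
    return rr, math.exp(log_rr - z * se), math.exp(log_rr + z * se)


# Hypothetical counts: a=40, b=60, c=10, d=90.
rr, lower, upper = relative_risk_ci(40, 60, 10, 90)
print(rr, lower, upper)
print("CI excludes 1.0:", not (lower <= 1.0 <= upper))
```
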