Study Guide for Exam 3: PubH 6414 Biostatistical Methods I
Concepts to Review for Exam 3
Review: Hypothesis Testing
• Given the description of a study be able to identify the correct null and / or alternative
hypotheses from a selection of choices.
o The null hypothesis claims that the parameter of interest = some specified value or, for
hypotheses about two parameters, that the two parameters are = which is the same as
stating that the difference between the parameters = 0.
o Alternative hypotheses can either be one-sided or two-sided
▪ Choose one-sided if the research question is only interested in a difference in one
specified direction
• These will be in the form of < or > the value specified in the null
hypothesis
▪ Choose two-sided if the research question indicates interest in ‘any difference’
• These will be in the form of ‘not equal’ to the value specified in the null
hypothesis or that the two parameters are not equal to each other.
• Know the definition of Type I and Type II errors.
o Alpha = probability of Type I error = probability that a true null hypothesis is rejected
o Beta = probability of Type II error = probability that a false null hypothesis is not
rejected
• Understand p-values.
o Given an alpha level for a test, be able to identify p-values that will result in the decision
to reject the null hypothesis
▪ If the p-value is < alpha level then you reject the null hypothesis
o Understand the relationship between test statistic and p-value: the p-value is the
probability of having a test statistic as extreme or more extreme than the calculated test
statistic if the null hypothesis is true. For a two-sided test, the p-value is the sum of the area
to the right of the positive test statistic + the area to the left of the negative test statistic
▪ Test statistics that are large in absolute value have small p-values
• Understand the relationships between alpha level, rejection regions, critical values, and p-values
– draw pictures to illustrate the relationships.
o The rejection region is the tail area under the distribution of the test statistic that is equal
to the alpha level.
o The critical value is the value that separates the rejection region from the rest of the area
under the distribution.
o For a two-sided test, the rejection region is in the two tails of the distribution (equal area
in both tails).
o For a one-sided test, the rejection region is in one tail of the distribution.
o If the p-value < alpha level then the test statistic is in the rejection region.
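The relationships among alpha, critical value, rejection region, and p-value can be checked numerically. The following is a minimal Python sketch, assuming a two-sided z-test (standard normal test statistic). The function names and the example value z = 2.10 are illustrative assumptions, not part of the course material.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)

def two_sided_p_value(z):
    """Area to the right of |z| plus area to the left of -|z|."""
    upper_tail = 1.0 - STD_NORMAL.cdf(abs(z))
    return 2.0 * upper_tail

def two_sided_critical_value(alpha):
    """Positive critical value: alpha/2 of the area lies in each tail."""
    return STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)

def reject_null(z, alpha):
    """Reject the null hypothesis when the p-value is below alpha."""
    return two_sided_p_value(z) < alpha

alpha = 0.05
z = 2.10  # hypothetical calculated test statistic
p = two_sided_p_value(z)
crit = two_sided_critical_value(alpha)
print(f"p-value = {p:.4f}, critical value = {crit:.3f}")
# Both decision rules should agree: p < alpha exactly when |z| > critical value.
print(reject_null(z, alpha), abs(z) > crit)
```

For a t-test the same logic applies, with the t distribution (and its degrees of freedom) in place of the standard normal.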
Hypothesis Tests and Confidence Intervals of Means: Paired and two sample
• Given a description of a study involving means be able to identify:
o the correct null and alternative hypotheses
o the correct test for the study
▪ one-sample t-test (know when this is a z-test instead of a t-test)
▪ paired t-test: when data are paired and the difference is calculated
▪ two-sample t-test
• there are two choices for this test: equal variance and unequal variance
o Use the F-test of variances or the “rule of thumb” (treat the variances as equal if
largest SD / smallest SD < 2) to decide
o if the F-test is significant (p < 0.05), use unequal variance
o if the F-test is not significant (p > 0.05), use equal variance
o the correct t-test statistic for each of the above tests
▪ know the SE for each of the tests
o the correct degrees of freedom for each of the above t-statistics
o know how to use TINV to find the critical values of the test
• Given a study description, alpha level for the test and the results of the study (i.e. p-value and
test-statistic), be able to identify the correct hypothesis test decision and interpretation.
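As a study aid, the SE, t-statistic, and degrees of freedom for the paired and equal-variance two-sample tests can be sketched in Python. This is a minimal illustration under stated assumptions: the function names and data are hypothetical, the null hypothesis is a difference of 0, and the unequal-variance version is omitted.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t-test of H0: mean difference = 0. Returns (t, SE, df)."""
    diffs = [s - f for f, s in zip(first, second)]
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)          # SE of the mean difference
    return mean(diffs) / se, se, n - 1   # df = number of pairs - 1

def pooled_two_sample_t(x, y):
    """Equal-variance two-sample t-test of H0: mean1 = mean2. Returns (t, SE, df)."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * stdev(x) ** 2 + (n2 - 1) * stdev(y) ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))  # SE of the difference of means
    return (mean(x) - mean(y)) / se, se, df

def sd_ratio(x, y):
    """Rule-of-thumb check: largest SD divided by smallest SD."""
    s1, s2 = stdev(x), stdev(y)
    return max(s1, s2) / min(s1, s2)

# Hypothetical data for illustration only
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 14]
print(paired_t(before, after))

group1 = [5, 7, 9]
group2 = [2, 4, 6]
print(sd_ratio(group1, group2), pooled_two_sample_t(group1, group2))
```

The resulting t-statistic is then compared with a critical value (for example from TINV) at the matching degrees of freedom.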
Confidence Interval of Mean Difference
• Used for paired data
• Calculate the difference between the observations
• Estimate the mean difference and SE of the mean difference
• Know the correct degrees of freedom for the t-coefficient: number of paired
observations minus 1
• Construct confidence interval
Confidence Interval for the difference between two independent means
• Use when you are comparing two independent (not paired) groups
• Calculate the sample mean for each group
• Calculate the sample difference of the means and the SE of the difference of
the means
• Know the correct degrees of freedom for t-coefficient: sample size for group
1 + sample size for group 2 minus 2
• Construct confidence interval
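The two confidence intervals above follow the same pattern: estimate ± t-coefficient × SE. Below is a minimal Python sketch, assuming the t-coefficient is looked up separately (for example with Excel's TINV) and passed in. The function names and data are hypothetical, and the two-group interval assumes equal variances.

```python
from math import sqrt
from statistics import mean, stdev

def ci_mean_difference(diffs, t_crit):
    """CI for the mean of paired differences; t_crit must use df = n - 1."""
    se = stdev(diffs) / sqrt(len(diffs))
    center = mean(diffs)
    return center - t_crit * se, center + t_crit * se

def ci_two_independent_means(x, y, t_crit):
    """Equal-variance CI for mean(x) - mean(y); t_crit must use df = n1 + n2 - 2."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * stdev(x) ** 2 + (n2 - 1) * stdev(y) ** 2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    center = mean(x) - mean(y)
    return center - t_crit * se, center + t_crit * se

# Two-sided 95% t-coefficient for 4 df, as returned by Excel =TINV(0.05, 4)
T_CRIT_DF4 = 2.776

# Hypothetical data: 5 paired differences (df = 4) and two groups of 3 (df = 4)
print(ci_mean_difference([2, 3, 1, 3, 1], T_CRIT_DF4))
print(ci_two_independent_means([5, 7, 9], [2, 4, 6], T_CRIT_DF4))
```

If the interval for a difference contains 0, the corresponding two-sided test at the same alpha level does not reject the null hypothesis.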
Hypothesis Tests and CI of Proportions and Hypothesis Tests for Contingency Tables
• Given a description of a study involving proportions be able to identify:
o the correct null and alternative hypotheses
o the correct test for the study
▪ one-sample z-test of proportions
▪ two-sample z-test of proportions
o The correct test statistic for each of these tests (i.e. be able to recognize the standard
error for each of these)
▪ t-statistics are not used for tests of proportions
• Confidence Interval for the difference of population proportions
• know the formula for the SE of the difference of two proportions: the CI uses the two
separate sample proportions, while the overall (pooled) proportion is used in the SE
for the two-sample z-test
• Chi-square test
o Know how to find the expected cell counts given the observed counts in a contingency
table
o Know the correct degrees of freedom for the Chi-square test
▪ the Chi-square distribution changes shape depending on the df
o Chi-square test is always a two-sided test
o Chi-square test can be used to test for independence or to compare two proportions
o the p-value for the Chi-square test of two proportions is equal to the p-value for
the two-sided two-sample z-test of two proportions.
• McNemar Chi-square test
o This is only used to compare proportions for paired data
o Know the formula for calculating the McNemar Chi-square test statistic
• Fisher’s Exact test – know when this is more appropriate than the Chi-square test (no need to
calculate this statistic)
• Given a study description, alpha level for the test and the results of the study (i.e. p-value and
test-statistic), be able to identify the correct hypothesis test decision and interpretation.
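The link between the Chi-square test and the two-sample z-test of proportions can be illustrated with a small calculation. The Python sketch below assumes a 2 X 2 table, a z-statistic whose SE uses the overall (pooled) proportion, and a Chi-square statistic without continuity correction. The function names and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Two-sample z statistic using the overall (pooled) proportion in the SE."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def expected_counts(table):
    """Expected count = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def chi_square(table):
    """Pearson Chi-square statistic (no continuity correction) and its df."""
    expected = expected_counts(table)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical 2 X 2 table: rows = groups, columns = (event, no event)
table = [[30, 70],
         [20, 80]]
z = two_proportion_z(30, 100, 20, 100)
chi2, df = chi_square(table)
p_z = 2 * (1 - NormalDist().cdf(abs(z)))         # two-sided z-test p-value
p_chi2 = 2 * (1 - NormalDist().cdf(sqrt(chi2)))  # Chi-square p-value, valid for df = 1 only
print(f"z^2 = {z**2:.4f}, chi-square = {chi2:.4f}, df = {df}")
print(f"p (z-test) = {p_z:.4f}, p (chi-square) = {p_chi2:.4f}")
```

With 1 degree of freedom the Chi-square statistic is the square of the z-statistic, which is why the two p-values match.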
ANOVA: Analysis of Variance
• Know when you would use ANOVA.
• Be able to recognize the formula for the F-statistic for ANOVA.
• Know the degrees of freedom for an ANOVA F-statistic
o Numerator df and denominator df
• Understand the relationship between SST, SSE, SSA, MSE, MSA.
• Know the null hypothesis and alternative hypothesis for ANOVA.
• Correctly interpret ANOVA results.
o What does a significant F-test tell you?
• Understand how the significance level is adjusted for multiple comparisons using
Bonferroni (post-hoc tests)
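The relationship between SST, SSE, SSA, MSE, MSA and the F-statistic can be traced through a small worked example. The Python sketch below assumes one-way ANOVA, reads SSA as the among-groups sum of squares and SSE as the within-groups (error) sum of squares, and applies the Bonferroni adjustment as alpha divided by the number of comparisons. The function names and data are hypothetical.

```python
from statistics import mean

def one_way_anova(groups):
    """Sums of squares, mean squares, df, and F for one-way ANOVA."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)  # among groups
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)         # within groups
    sst = sum((x - grand_mean) ** 2 for x in all_values)             # total
    df_num, df_den = k - 1, n - k
    msa, mse = ssa / df_num, sse / df_den
    return {"SSA": ssa, "SSE": sse, "SST": sst, "MSA": msa, "MSE": mse,
            "F": msa / mse, "df_num": df_num, "df_den": df_den}

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni adjustment."""
    return alpha / n_comparisons

# Hypothetical data: three groups of three observations
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
# Three groups give 3 pairwise comparisons
print(bonferroni_alpha(0.05, 3))
```

Note that SST = SSA + SSE, and that a significant F-test only indicates that at least one group mean differs; the post-hoc comparisons identify which ones.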
Confidence Intervals for OR and RR
Confidence Interval for Odds Ratio
• Sampling distribution of OR is not normal but sampling distribution of ln(OR) is approximately normal.
• Construct CI on ln (natural log) scale
• SE of ln(OR) = sqrt( 1/a + 1/b + 1/c + 1/d )
o where a, b, c, d are cell counts from a 2 X 2 table
• Use z-coefficients for CI of ln(OR)
• After constructing the CI on the ln scale, exponentiate the endpoints so that you have a
confidence interval for the OR
o your estimate of the OR should be contained in the confidence interval
• Know how to use the LN and EXP functions in Excel when constructing the confidence
interval for Odds Ratio.
o If the 95% CI for the OR contains the value 1.0, you cannot conclude that there is a
statistical relationship (at alpha = 0.05) between the variables.
o If the 95% CI for the OR does not contain the value 1.0, this is evidence of a statistically
significant relationship (at alpha = 0.05) between the two groups.
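The steps for the odds ratio interval can be sketched in Python, where log and exp play the role of Excel's LN and EXP. This is a minimal illustration assuming the 2 X 2 table is laid out with a, b in the first row and c, d in the second (so OR = ad / bc), that the SE is the square root of the sum of reciprocal cell counts, and that z = 1.96 for a 95% interval. The function name and counts are hypothetical.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a CI built on the ln scale and exponentiated back."""
    odds_ratio = (a * d) / (b * c)
    se_ln_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ln_or = log(odds_ratio)
    lower = exp(ln_or - z * se_ln_or)
    upper = exp(ln_or + z * se_ln_or)
    return odds_ratio, lower, upper

# Hypothetical cell counts for illustration only
or_hat, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("CI contains 1.0:", lower <= 1.0 <= upper)
```

The interval is symmetric around ln(OR) on the ln scale but not around the OR itself after exponentiating.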
Confidence Interval for Relative Risk
• Sampling distribution of RR is not normal but sampling distribution of ln(RR) is approximately normal
• Construct CI on ln (natural log) scale
• SE of ln(RR) = sqrt( b / (a(a + b)) + d / (c(c + d)) )
o where a, b, c, d are cell counts from a 2 X 2 table
• Use z-coefficients for CI of ln(RR)
• After constructing the CI on the ln scale, exponentiate the endpoints so that you have a
confidence interval for the RR
o your estimate of the RR should be contained in the confidence interval
• Know how to use the LN and EXP functions in Excel when constructing the confidence
interval for Relative Risk.
o If the 95% CI for the RR contains the value 1.0, you cannot conclude that there is a
statistical relationship (at alpha = 0.05) between the variables.
o If the 95% CI for the RR does not contain the value 1.0, this is evidence of a statistically
significant relationship (at alpha = 0.05) between the two groups.
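The relative risk interval follows the same ln-scale steps as the odds ratio interval, with a different SE. The Python sketch below assumes a, b are the first row of the 2 X 2 table and c, d the second, so that the two risks are a / (a + b) and c / (c + d), and uses z = 1.96 for a 95% interval. The function name and counts are hypothetical.

```python
from math import exp, log, sqrt

def relative_risk_ci(a, b, c, d, z=1.96):
    """Relative risk with a CI built on the ln scale and exponentiated back."""
    risk_1 = a / (a + b)
    risk_2 = c / (c + d)
    rr = risk_1 / risk_2
    se_ln_rr = sqrt(b / (a * (a + b)) + d / (c * (c + d)))
    ln_rr = log(rr)
    lower = exp(ln_rr - z * se_ln_rr)
    upper = exp(ln_rr + z * se_ln_rr)
    return rr, lower, upper

# Hypothetical cell counts for illustration only
rr_hat, lower, upper = relative_risk_ci(20, 80, 10, 90)
print(f"RR = {rr_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("CI contains 1.0:", lower <= 1.0 <= upper)
```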