Confidence Intervals
J.D. Bramble, Ph.D.
Creighton University Medical Center
Med 483 -- Fall 2005

Confidence Intervals
Data can be described by point estimates (mean, standard deviation, etc.).
Point estimates from a sample are not always equal to the population parameters.
Data can also be described by interval estimates, which show the variability of the estimate.
Using the standard error, we can see how much the estimate is expected to vary from the true value.

Confidence Intervals
Interval estimates are called confidence intervals (CI).
A CI defines an upper limit and a lower limit associated with a known probability.
These limits are known as confidence limits.
The associated probability of the CI is most commonly 95%, but may be 99% or 90%.

Confidence Intervals
Confidence limits set the boundaries that are likely to include the population mean.
Thus, in general, we are 95% confident that the true mean of the population lies within these limits.

Standard Error
The standard error is defined as
$$SE = \frac{s}{\sqrt{n}}$$
We expect the sample mean $\bar{x}$ to be within one standard error of $\mu$ quite often.
SE is a measure of the precision of $\bar{x}$ as an estimate of $\mu$.
The smaller the SE, the more precise the estimate.
SE includes the two factors that affect the precision of the estimate: the sample size n and the standard deviation s.

Standard Deviation vs Standard Error
Standard deviation describes the dispersion of the data: the variability from one data point to the next.
Standard error (SE) describes the uncertainty in the mean of the data that results from sampling error: the variability associated with the sample mean.

Calculating Confidence Intervals
Recall that 95% of the area under the standard normal curve lies between z = ±1.96.
[Figure: standard normal curve with the central area between z = -1.96 and z = 1.96 marked]
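The relationship between standard deviation, standard error, and the z = ±1.96 cutoff can be checked numerically. The sketch below uses only the Python standard library; the sample values are hypothetical and serve only to contrast SD with SE.

```python
import math
from statistics import NormalDist, stdev

def standard_error(sample):
    """Standard error of the mean: s / sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))

# z value that leaves 2.5% in each tail of the standard normal curve
z_95 = NormalDist().inv_cdf(0.975)

# Hypothetical sample (not from the slides), used only to compare SD and SE
sample = [4.1, 5.0, 4.7, 5.3, 4.4, 4.9, 5.1, 4.6]
sd = stdev(sample)           # spread of the individual data points
se = standard_error(sample)  # uncertainty of the sample mean

print(f"z for 95%: {z_95:.2f}")
print(f"SD = {sd:.3f}, SE = {se:.3f}")
```

Because SE divides s by the square root of n, it is always smaller than SD for n > 1 and shrinks as the sample grows.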
Calculating Confidence Intervals
The general formula is:
$$\bar{x} \pm z\,\frac{\sigma}{\sqrt{n}}$$
$$P\left(\bar{x} - z\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z\,\frac{\sigma}{\sqrt{n}}\right) = 0.95$$
Lower limit = $\bar{x} - 1.96\,(\sigma/\sqrt{n})$
Upper limit = $\bar{x} + 1.96\,(\sigma/\sqrt{n})$

Calculating CI of two samples
We use the t distribution.
The t distribution describes the distribution of the sample mean when the variance is also estimated from sample data.
Thus, the formula for the CI in these cases is:
$$\bar{x} \pm t^{*}\,\frac{s}{\sqrt{n}}$$

Example: Problem
To assess the effectiveness of hormone replacement therapy on bone mineral density, 94 women between the ages of 45 and 64 were given estrogen medication. After taking the medication for 36 months, the bone mineral density was measured for each of the women in the study. The average density was 0.878 g/cm² with a standard deviation of 0.126 g/cm². Calculate a 95% CI for the bone mineral density of this population.

Example: SE and t
Recall that SE is:
$$SE = \frac{s}{\sqrt{n}} = \frac{0.126}{\sqrt{94}} = 0.013$$
$t(\alpha, df) = t(0.025, 93) = 1.990$

Example: Calculations
$$0.878 \pm 1.99\,\frac{0.126}{\sqrt{94}} = 0.878 \pm 0.0259$$
0.852 to 0.904

Example: Conclusion
The 95% confidence limits are: lower: 0.852 g/cm²; upper: 0.904 g/cm².
We are 95% confident that the average bone density of all women aged 45 to 64 who take this hormone replacement medication is between 0.852 g/cm² and 0.904 g/cm².

Example: Conclusions (cont'd)
For a 95% confidence interval, we expect that about 95% of the intervals constructed from samples drawn from the population would contain the true population mean.

Other Confidence Limits
For a 99% or 90% CI the calculations and interpretations are similar.
Which CI is going to give the widest or the narrowest interval?
A CI can be established for any parameter: mean, proportion, relative risk, odds ratio, etc.
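The bone density calculation can be reproduced in a few lines. This is a minimal sketch: `mean_ci` is a hypothetical helper name, the critical value 1.990 is the tabled t value quoted in the example, and the 90/95/99% comparison uses z critical values from the standard library as an approximation, only to show how the width changes with the confidence level.

```python
import math
from statistics import NormalDist

def mean_ci(mean, sd, n, crit):
    """Return (lower, upper) limits: mean +/- crit * sd / sqrt(n)."""
    margin = crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Bone density example; 1.990 is the tabled t value used in the example
lower, upper = mean_ci(0.878, 0.126, 94, 1.990)
print(f"95% CI: {lower:.3f} to {upper:.3f} g/cm^2")

# Interval width at three confidence levels (z-based approximation)
widths = {}
for level in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lo, hi = mean_ci(0.878, 0.126, 94, z)
    widths[level] = hi - lo
    print(f"{level:.0%}: width = {widths[level]:.4f}")
```

The widths increase with the confidence level, so the 99% interval is the widest and the 90% interval the narrowest.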
Using CI to Test Hypotheses
Diastolic blood pressure of 12 people before and after administration of a new drug.
Paired t-test hypotheses: $H_0: \mu_d \ge 0$; $H_a: \mu_d < 0$
$\bar{x}_d = -3.1$, $s_d = 4.1$

Using CI to Test Hypotheses
$$\bar{x}_d \pm t_{(\alpha/2,\,n-1)}\,\frac{s_d}{\sqrt{n}}$$
$$-3.1 \pm (1.795)\,\frac{4.1}{\sqrt{12}} = -3.1 \pm 2.12$$
$$-5.22 \le \mu_d \le -0.98$$
Here t = 1.795 is the tabled value for an upper-tail area of 0.05 with 11 degrees of freedom, in line with the one-sided alternative hypothesis.

Using CI to Test Hypotheses
Conclusion: since zero does not fall within the interval, we can conclude with 95% certainty that there is a significant decrease in blood pressure after taking the new drug.
If we did a paired t-test, the conclusion would be the same.

Visual Representation of CI
[Figure: confidence intervals computed from 11 different samples, drawn as brackets against the true population mean (μ)]

ANOVA: Analysis of Variance
Single Factor

Objectives
Know the assumptions for an ANOVA.
Know when ANOVA is used rather than a t-test.
Set up ANOVA tables and understand the relationships between the values within the table.
Compute the F-ratio and the appropriate degrees of freedom.
Know how and when to use a two factor ANOVA.
Apply Tukey's multiple comparison procedure.

ANOVA vs. t-test
ANOVA is a statistical method of comparing the means of different groups.
A single factor ANOVA for two groups produces the same p-value as an independent t-test.
The t-test is inappropriate for more than two groups because it increases the probability of a Type I error.
Using a t-test to test the means of each pair leads to problems regarding the proper level of significance.

ANOVA vs. t-test
ANOVA is not limited to two groups.
It can appropriately handle comparisons of several means from several groups.
Thus, ANOVA overcomes the difficulty of doing multiple t-tests.
The sampling distribution used is the F distribution.
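The inflation of the Type I error rate under repeated pairwise t-tests can be illustrated with a short calculation. The sketch below assumes the pairwise tests are independent, which is a simplification (tests that share a group are correlated), so the figures are indicative rather than exact. `familywise_error` is a hypothetical helper name.

```python
from math import comb

def familywise_error(alpha, n_groups):
    """Chance of at least one Type I error across all pairwise tests,
    assuming (as a simplification) that the tests are independent."""
    n_tests = comb(n_groups, 2)  # number of distinct pairs of groups
    return n_tests, 1 - (1 - alpha) ** n_tests

for groups in (2, 3, 4, 5):
    tests, rate = familywise_error(0.05, groups)
    print(f"{groups} groups: {tests} t-tests, familywise error ~ {rate:.3f}")
```

With three groups, the three pairwise tests at α = 0.05 already push the overall error rate to roughly 0.14 under this assumption, which is the problem a single ANOVA F-test avoids.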
ANOVA: assumptions
The observations are independent: one observation is not correlated with another observation.
The variances of the various groups are homogeneous.
ANOVA is a robust test that is not very sensitive to departures from normality and homogeneity, especially when sample sizes are large and nearly equal for each group.

ANOVA: Characteristics
ANOVA analyzes the variance of the groups to evaluate differences in the means.
Within group
  Measures the variance of observations within each group
  Variance due to "chance"
Between groups
  Measures the variance between the groups
  Variance due to treatment or chance

ANOVA: Characteristics
It can be shown that when the means of each group are equal, the within-group and the between-group variances are equal.
$$F = \frac{\text{treatment} + \text{chance}}{\text{chance}}$$
The F-statistic is the ratio of the estimated variances.

ANOVA: the F distribution
The ratio follows an F distribution.
The F statistic has two sets of degrees of freedom.
For between groups: (I - 1), where I is the number of groups.
For within groups: I(J - 1), where J is the number of observations in each group.

ANOVA: single factor
Let I = the number of population samples.
Let J = the number of observations in each sample.
Thus the data consist of IJ observations.
The overall or grand mean is:
$$\bar{x}_{..} = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J} x_{ij}}{IJ}$$

ANOVA: single factor
Now it is necessary to compute the sums of squares for the treatment, SSTr (between group); the error, SSE (within group); and the total, SST.
SSTr is the sum of the squared deviations between groups.
The total sum of squares measures the amount of variation about the grand mean.
With algebraic manipulation we find that: SST = SSTr + SSE
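The algebra behind SST = SSTr + SSE can be sketched as follows. The notation $\bar{x}_{i.}$ for the mean of group i and $\bar{x}_{..}$ for the grand mean is assumed here for illustration. Each deviation from the grand mean is split into a within-group part and a between-group part:

```latex
\begin{aligned}
SST &= \sum_{i=1}^{I}\sum_{j=1}^{J}\left(x_{ij}-\bar{x}_{..}\right)^2
     = \sum_{i=1}^{I}\sum_{j=1}^{J}\left[\left(x_{ij}-\bar{x}_{i.}\right)+\left(\bar{x}_{i.}-\bar{x}_{..}\right)\right]^2 \\
    &= \underbrace{\sum_{i=1}^{I}\sum_{j=1}^{J}\left(x_{ij}-\bar{x}_{i.}\right)^2}_{SSE}
     + \underbrace{J\sum_{i=1}^{I}\left(\bar{x}_{i.}-\bar{x}_{..}\right)^2}_{SSTr}
     + 2\sum_{i=1}^{I}\left(\bar{x}_{i.}-\bar{x}_{..}\right)\sum_{j=1}^{J}\left(x_{ij}-\bar{x}_{i.}\right)
\end{aligned}
```

The last term vanishes because the deviations within each group sum to zero, $\sum_{j}(x_{ij}-\bar{x}_{i.}) = 0$, which leaves SST = SSE + SSTr.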
ANOVA: sums of squares
$$SSTr = \sum_{i=1}^{I}\sum_{j=1}^{J}\left(\bar{x}_{i.} - \bar{x}_{..}\right)^2 = \frac{1}{J}\sum_{i=1}^{I} x_{i.}^2 - \frac{1}{IJ}\,x_{..}^2$$
$$SST = \sum_{i=1}^{I}\sum_{j=1}^{J}\left(x_{ij} - \bar{x}_{..}\right)^2 = \sum_{i=1}^{I}\sum_{j=1}^{J} x_{ij}^2 - \frac{1}{IJ}\,x_{..}^2$$
Here $x_{i.}$ is the total of group i and $x_{..}$ is the grand total of all IJ observations.
When completing the ANOVA table, usually only SSTr and SST are calculated. SSE is found by SSE = SST - SSTr.

ANOVA: mean sums of squares
After calculating the sums of squares, F is simply the ratio of the mean squares of the treatment and the error.
A mean square is a sum of squares divided by the appropriate degrees of freedom.
$$MSTr = \frac{SSTr}{I-1} \qquad MSE = \frac{SSE}{I(J-1)} \qquad F = \frac{MSTr}{MSE}$$

ANOVA: single factor table
| Source of variation | Degrees of freedom | SS | MS | F |
| --- | --- | --- | --- | --- |
| Treatment | I - 1 | SSTr | MSTr | MSTr / MSE |
| Error | I(J - 1) | SSE | MSE | |
| Total | IJ - 1 | SST | | |

ANOVA: Example
An experiment was conducted to examine various modes of medication delivery. A total of 15 subjects diagnosed with the flu were enrolled, and the length of time until alleviation of major symptoms was measured for three groups: Group A received an inhaled version, Group B received an injection, and Group C received an oral dose.

Single factor example
| Group | A | B | C |
| --- | --- | --- | --- |
| Time (min) | 56 | 62 | 72 |
| | 102 | 58 | 100 |
| | 90 | 78 | 117 |
| | 87 | 68 | 109 |
| | 94 | 87 | 103 |
| Average | 85.8 | 70.6 | 100.2 |
Grand mean: $\bar{x}_{..} = 1283/15 = 85.5$

Single factor example: set up
$H_0$: all three means are equal, or $\mu_1 = \mu_2 = \mu_3$
$H_a$: at least one mean is different
$\alpha = 0.05$
Critical value: F(α, df); given I - 1 = 2 and I(J - 1) = 12, F(0.05, 2, 12) = 3.89

Single factor example: calculating sums of squares
$$SST = 114{,}893 - \frac{(1283)^2}{15} = 114{,}893 - 109{,}739.3 = 5{,}153.7$$
$$SSTr = \frac{1}{5}\left[(429)^2 + (353)^2 + (501)^2\right] - \frac{1}{15}(1283)^2 = 2{,}190.9$$
$$SSE = 5{,}153.7 - 2{,}190.9 = 2{,}962.8$$
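The sums of squares for the flu example can be verified with a short script. This is a minimal sketch of the computational formulas for equal group sizes; `one_way_anova` is a hypothetical helper name, and the critical value 3.89 is the tabled F(0.05, 2, 12) quoted in the example.

```python
def one_way_anova(groups):
    """Single-factor ANOVA for equal-sized groups.
    groups: dict mapping group label -> list of observations."""
    I = len(groups)                            # number of groups
    J = len(next(iter(groups.values())))       # observations per group
    all_obs = [x for obs in groups.values() for x in obs]
    correction = sum(all_obs) ** 2 / (I * J)   # (grand total)^2 / IJ

    sst = sum(x * x for x in all_obs) - correction
    sstr = sum(sum(obs) ** 2 for obs in groups.values()) / J - correction
    sse = sst - sstr

    mstr = sstr / (I - 1)
    mse = sse / (I * (J - 1))
    return {"SST": sst, "SSTr": sstr, "SSE": sse,
            "MSTr": mstr, "MSE": mse, "F": mstr / mse}

# Time (min) until alleviation of major flu symptoms, from the example
flu = {
    "A": [56, 102, 90, 87, 94],    # inhaled
    "B": [62, 58, 78, 68, 87],     # injection
    "C": [72, 100, 117, 109, 103]  # oral
}
result = one_way_anova(flu)
F_CRIT = 3.89  # tabled F(0.05, 2, 12), taken from the example

for name, value in result.items():
    print(f"{name}: {value:.2f}")
print("reject H0" if result["F"] > F_CRIT else "fail to reject H0")
```

The treatment sum of squares comes out near 2,190.9 and the error sum of squares near 2,962.8, giving an F ratio of about 4.44.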
Single factor example: completing the table
| Source of variation | Degrees of freedom | SS | MS | F |
| --- | --- | --- | --- | --- |
| Treatments | 2 | 2,190.9 | 1,095.5 | 4.44 |
| Error | 12 | 2,962.8 | 246.9 | |
| Total | 14 | 5,153.7 | | |

Single factor example: decision and conclusions
Compare Fstat to Fcrit: 4.44 > 3.89, therefore reject H0.
The evidence suggests that the time it takes to alleviate major flu symptoms differed significantly due to the mode of medication delivery.

Where is the Difference?
Recall the hypotheses of the ANOVA:
  H0 is that all the means are equal.
  Ha is that at least one is not.
If we fail to reject H0, the analysis is complete.
What does it mean when H0 is rejected?
  At least one mean is different.
Which μ's are different from one another?
  With only two treatment levels, the answer is immediate.
  With three or more treatment levels, further analysis is needed.

Finding the difference
We must do a post hoc analysis: a test that is done after the ANOVA.
The purpose is to determine the location of the difference.
Different post hoc tests are available and are discussed in the text.
These tests include Bonferroni, Scheffé, Student-Newman-Keuls, and Tukey's HSD.

Tukey's HSD
$$w = Q_{\alpha,\,I,\,I(J-1)}\,\sqrt{\frac{MSE}{J}}$$
Where,
α = significance level
I = number of groups
J = number of observations per treatment
MSE = mean square error (or within-group MS)

Using the Tukey's HSD
All the information needed to find w, except Q, is located in the ANOVA table.
Q is determined from the studentized range distribution using α, I, and df within.
Once w is determined, arrange all treatment level means in ascending order.
Underline those values that differ by less than w.
Pairs of treatment means not joined by a common underline correspond to treatments that are significantly different.
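The Tukey procedure described above can be sketched in code. The numbers are those of the flu example in these notes (MSE = 246.9, J = 5, and the tabled Q(0.05, 3, 12) = 3.77); `tukey_w` and `significant_pairs` are hypothetical helper names, and Q is passed in as a table lookup instead of being computed.

```python
import math
from itertools import combinations

def tukey_w(q, mse, n_per_group):
    """Tukey's w = Q * sqrt(MSE / J)."""
    return q * math.sqrt(mse / n_per_group)

def significant_pairs(means, w):
    """Order the means ascending and return the pairs that differ by more than w."""
    ordered = sorted(means.items(), key=lambda kv: kv[1])
    return [(a, b) for (a, mean_a), (b, mean_b) in combinations(ordered, 2)
            if mean_b - mean_a > w]

# Group means from the flu example; Q is the tabled value, not computed here
means = {"A": 85.8, "B": 70.6, "C": 100.2}
w = tukey_w(q=3.77, mse=246.9, n_per_group=5)
pairs = significant_pairs(means, w)

print(f"w = {w:.1f}")
print("significantly different pairs:", pairs)
```

Listing the pairs that exceed w is the complement of the underlining step: any pair that would share an underline is simply absent from the result.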
Example
Using the previous example, we now want to find which form(s) of medication really differ from the others.
To start, we order the means:
| Group | Average |
| --- | --- |
| B | 70.6 |
| A | 85.8 |
| C | 100.2 |

The ANOVA
Do these data indicate that the amount of time it takes a student to nod off depends on the statistical topic being studied?
| Source | df | SS | MS | F |
| --- | --- | --- | --- | --- |
| Treatment (i.e., between) | 3 | 5882.4 | 1960.8 | 21.09 |
| Error (i.e., within) | 16 | 1487.4 | 93 | |
| Total | 19 | 7369.8 | | |
Since the computed F-statistic of 21.09 is greater than the critical value of F(0.05, 3, 16) = 3.24, we reject H0.

Computing the Tukey's
For the medication example, there are I = 3 treatments and the degrees of freedom for the error is 12; thus, from the table, Q(0.05, 3, 12) = 3.77.
Computing the Tukey value we get:
$$w = Q_{\alpha,\,I,\,I(J-1)}\,\sqrt{\frac{MSE}{J}} = 3.77\,\sqrt{\frac{246.9}{5}} = 26.5$$

And the Difference is…
Ordering the treatment level means and underscoring those that differ by less than 26.5:
| Group | Average |
| --- | --- |
| B | 70.6 |
| A | 85.8 |
| C | 100.2 |
B and A differ by 15.2 and A and C differ by 14.4, so each of these pairs shares an underline; B and C differ by 29.6.
We conclude that the only significant difference is between group B (injection) and group C (oral).

ANOVA: Analysis of Variance
Two Factor

Two-Factor ANOVA
Single factor ANOVAs
  Subjects or treatments are categorized in only one way (i.e., type of treatment).
Two factor ANOVAs
  Subjects or treatments are categorized in two ways (i.e., type of treatment and gender).
Two factor ANOVAs test the influence of both factors.

Examples of Two Factor Designs
An experiment is designed to test whether there is a difference in how fast 3 different antacid brands (Acid Eater, Relieve the Burn, and Blah Stomach) dissolve in male and female stomachs.
What type of study technique (1 hour every day, 3 hours once a week, an "all-nighter" prior to the exam, a late night party before the exam) results in better test scores while controlling for the person's age (<17, 18-20, 21-23, 24-26, 27>)?

Advantages of Two-way ANOVAs
Economy
In a two-factor analysis we can test interactions.
Testing for an interaction allows us to determine whether the effect of the treatment varies with the conditions under which the treatment is applied.

Example
| Ability | Computer instruction | Classroom instruction | Means |
| --- | --- | --- | --- |
| Whiz | 90 | 82 | 86 |
| Novice | 80 | 88 | 84 |
| Means | 85 | 85 | |

Three Research Hypotheses
Is there a significant difference between those taught by computer and those taught in the classroom?
Is there a significant difference between computer whizzes and computer novices?
Is there a significant interaction between the type of instruction and the computer ability of the subject?

Two Factor ANOVA Example
Researchers are interested in the effect of caffeine on performance. Controlling for the students' academic program, subjects were given no caffeine or one of 3 different levels of caffeine (low, medium, high) for two weeks prior to taking a standard aptitude test. The recorded test scores are below.
| Caffeine | Undergrad | Med | Pharm | Nur | Law |
| --- | --- | --- | --- | --- | --- |
| None | 76 | 67 | 81 | 56 | 51 |
| Low | 82 | 69 | 96 | 59 | 70 |
| Med | 68 | 59 | 67 | 54 | 42 |
| High | 63 | 56 | 64 | 58 | 37 |

Writing the hypothesis
Hypotheses are written the same as for a single factor ANOVA, with the exception of adding a second set of hypotheses for the second factor.
For Factor A (caffeine level) (I = number of treatment levels)
  $H_0$: $\mu_{none} = \mu_{low} = \mu_{med} = \mu_{high}$
  $H_a$: at least one μ is different
For Factor B (program) (J = number of treatment levels)
  $H_0$: $\mu_{undergrad} = \mu_{med} = \mu_{pharm} = \mu_{nur} = \mu_{law}$
  $H_a$: at least one μ is different
Critical Values
Critical values for a two factor ANOVA are found by looking on an F-table at the appropriate degrees of freedom.
Degrees of freedom for a two factor ANOVA are found for all sources of variation and the total:
For factor A: I - 1
For factor B: J - 1
For error: (I - 1)(J - 1)
For total: IJ - 1

Calculating Degrees of Freedom
For our example I = 4 and J = 5; thus, the df are:
df caffeine = 4 - 1 = 3
df program = 5 - 1 = 4
df error = 3 * 4 = 12
df total = (4 * 5) - 1 = 19
Notice that the relationship between the degrees of freedom is the same as in a single factor ANOVA:
df Factor A + df Factor B + df error = df total

Calculating Critical Values
With the df known, we can now find the critical values.
The critical values are found by looking on an F-table at the appropriate alpha and degrees of freedom for each factor: one for Factor A and one for Factor B.
For Factor A: F(α, df Factor A, df error)
For Factor B: F(α, df Factor B, df error)
For our example:
Factor A is F(0.05, 3, 12) = 3.49
Factor B is F(0.05, 4, 12) = 3.26

Computing the Test Statistic
First compute the sums of squares for the different sources of variation.
SST: sum of squares for the total
SSA: sum of squares for factor A
SSB: sum of squares for factor B
SSE: sum of squares for the error
The relationship still holds that adding all the sums of squares gives the total sum of squares.
Thus, SST = SSA + SSB + SSE

Computing the Test Statistic
The mean squares for Factor A, Factor B, and the error are computed by dividing each sum of squares by the appropriate degrees of freedom.
The F-statistic for each factor is calculated by dividing that factor's mean square by the mean square of the error.
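The steps above can be sketched for the caffeine example. This is a minimal illustration for a design with one observation per cell, so no interaction term is estimated; `two_factor_anova` is a hypothetical helper name, and the data are the test scores from the example table (rows are caffeine levels, columns are programs).

```python
def two_factor_anova(table):
    """Two-factor ANOVA with one observation per cell (no interaction term).
    table: list of rows (factor A levels), each a list over factor B levels."""
    I, J = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (I * J)

    sst = sum(x * x for row in table for x in row) - correction
    ssa = sum(sum(row) ** 2 for row in table) / J - correction        # rows
    ssb = sum(sum(col) ** 2 for col in zip(*table)) / I - correction  # columns
    sse = sst - ssa - ssb

    df_a, df_b, df_e = I - 1, J - 1, (I - 1) * (J - 1)
    msa, msb, mse = ssa / df_a, ssb / df_b, sse / df_e
    return {"SSA": ssa, "SSB": ssb, "SSE": sse, "SST": sst,
            "MSA": msa, "MSB": msb, "MSE": mse,
            "F_A": msa / mse, "F_B": msb / mse}

# Rows: None, Low, Med, High caffeine; columns: Undergrad, Med, Pharm, Nur, Law
scores = [
    [76, 67, 81, 56, 51],
    [82, 69, 96, 59, 70],
    [68, 59, 67, 54, 42],
    [63, 56, 64, 58, 37],
]
anova = two_factor_anova(scores)
for name, value in anova.items():
    print(f"{name}: {value:.2f}")
```

Comparing F_A with F(0.05, 3, 12) = 3.49 and F_B with F(0.05, 4, 12) = 3.26 then gives the decision for each factor.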
The ANOVA Table
| Source of variation | df | SS | MS | F |
| --- | --- | --- | --- | --- |
| Factor A | I - 1 | SSA | MSA | MSA / MSE |
| Factor B | J - 1 | SSB | MSB | MSB / MSE |
| Error | (I - 1)(J - 1) | SSE | MSE | |
| Total | IJ - 1 | SST | | |

| Source of variation | df | SS | MS | F |
| --- | --- | --- | --- | --- |
| Factor A (caffeine level) | 3 | 1182.95 | 394.32 | 10.72 |
| Factor B (program) | 4 | 1947.5 | 486.88 | 13.24 |
| Error | 12 | 441.3 | 36.78 | |
| Total | 19 | 3571.75 | | |

Decision and Conclusion
The rule for making the decision is the same.
For factor A, F(0.05, 3, 12) = 3.49 and Fstat = 10.72. Since Fstat > Fcrit, we reject H0.
For factor B, F(0.05, 4, 12) = 3.26 and Fstat = 13.24. Since Fstat > Fcrit, we again reject H0.

Where is the difference?
Look on the table to get Q.
For factor A: Q(α, I, (I - 1)(J - 1)). (Notice that I is the number of levels for Factor A and (I - 1)(J - 1) is the df of the error.)
Thus, Q for factor B: Q(α, J, (I - 1)(J - 1)).
The formula for w is:
For factor A: $w = Q_{\alpha,\,I,\,(I-1)(J-1)}\,\sqrt{MSE/J}$
For factor B: $w = Q_{\alpha,\,J,\,(I-1)(J-1)}\,\sqrt{MSE/I}$

Computing the Tukey's
For factor A: ordering the means and underscoring all the pairs that differ by less than w = 11.39:
| Caffeine | Mean |
| --- | --- |
| High | 55.6 |
| Med | 58 |
| None | 66.2 |
| Low | 75.2 |
The low-caffeine mean differs from the high and medium means by more than w (19.6 and 17.2), so there is a significant difference in test scores between the low caffeine group and both the high and medium caffeine groups. No other pair of means differs by more than w.

Two Factor ANOVA: Repeated Measures
Measurements are made repeatedly on each subject (before, during, and after the intervention).
Subjects are recruited as matched sets on variables such as age or diagnosis.
A laboratory experiment is run several times, each time with several parallel treatments.
When appropriate, the repeated measures ANOVA is usually more powerful than ordinary ANOVA.

Two Factor ANOVA Example
How do various types of music affect agitation in Alzheimer's patients?
| Group | Piano | Mozart | Easy Listening |
| --- | --- | --- | --- |
| Early | 21, 24, 22, 18, 20 | 9, 12, 10, 5, 9 | 29, 26, 30, 24, 26 |
| Middle | 22, 20, 25, 18, 20 | 14, 18, 11, 9, 13 | 15, 18, 20, 13, 19 |
Writing the hypothesis
For Factor A (Music)
  $H_0$: $\mu_{piano} = \mu_{mozart} = \mu_{easy\ listening}$
  $H_a$: at least one μ is different
For Factor B (Stage)
  $H_0$: $\mu_{early} = \mu_{middle}$
  $H_a$: at least one μ is different
For the Interaction
  $H_0$: no interaction between music and stage on agitation level
  $H_a$: there is an interaction

Two Factor Repeated Measures ANOVA
Compute SS for the total (SST), error (SSE), factor A (SSA), factor B (SSB), and the interaction of AB (SSAB).
SST = SSA + SSB + SSAB + SSE
Each SS has associated degrees of freedom:
SST: IJK - 1
SSE: IJ(K - 1)
SSA: I - 1
SSB: J - 1
SSAB: (I - 1)(J - 1)

Two Factor Repeated Measures ANOVA
Each MS is computed as the appropriate SS/df.
Each test statistic is the appropriate MS divided by MSE.
| Hypotheses | Test statistic | Critical value |
| --- | --- | --- |
| H01 vs. Ha1 | MSA / MSE | F(α, I - 1, IJ(K - 1)) |
| H02 vs. Ha2 | MSB / MSE | F(α, J - 1, IJ(K - 1)) |
| H012 vs. Ha12 | MSAB / MSE | F(α, (I - 1)(J - 1), IJ(K - 1)) |

Repeated Measures ANOVA Table
| Source | df | SS | MS | F |
| --- | --- | --- | --- | --- |
| Factor 1 | I - 1 | SS1 | SS1 / df1 | MS1 / MSE |
| Factor 2 | J - 1 | SS2 | SS2 / df2 | MS2 / MSE |
| Interaction | (I - 1)(J - 1) | SS1x2 | SS1x2 / df1x2 | MS1x2 / MSE |
| Within (Error) | IJ(K - 1) | SSE | SSE / dfE | |
| Total | IJK - 1 | SST | | |

The ANOVA Table
| Source | df | SS | MS | F |
| --- | --- | --- | --- | --- |
| Music | 2 | 740 | 370 | 49.89 |
| Stage | 1 | 30 | 30 | 4.05 |
| Music x Stage | 2 | 260 | 130 | 17.53 |
| Error | 24 | 178 | 7.42 | |
| Total | 29 | 1208 | | |

Repeated Measures: Tukey's
When no significant interaction is found:
For comparing levels of factor A, obtain Q(α, I, IJ(K - 1)).
For comparing levels of factor B, obtain Q(α, J, IJ(K - 1)).
$w = Q\,\sqrt{MSE/(JK)}$ for factor 1 comparisons
$w = Q\,\sqrt{MSE/(IK)}$ for factor 2 comparisons
Arrange the sample means in increasing order and underscore pairs differing by less than w.
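The sums of squares for the music example can be reproduced with the formulas above. This is a minimal sketch that follows the slides in using MSE as the denominator of every F ratio; it does not model subject-level correlation the way a fuller repeated-measures analysis might. `two_factor_replicated_anova` is a hypothetical helper name, with music as factor A (I = 3), stage as factor B (J = 2), and K = 5 observations per cell.

```python
def two_factor_replicated_anova(cells):
    """Two-factor ANOVA with K observations per cell.
    cells[i][j] is the list of observations for level i of A and level j of B."""
    I, J, K = len(cells), len(cells[0]), len(cells[0][0])
    all_obs = [x for row in cells for cell in row for x in cell]
    correction = sum(all_obs) ** 2 / (I * J * K)

    sst = sum(x * x for x in all_obs) - correction
    ssa = sum(sum(sum(cell) for cell in row) ** 2 for row in cells) / (J * K) - correction
    ssb = sum(sum(sum(cells[i][j]) for i in range(I)) ** 2
              for j in range(J)) / (I * K) - correction
    ss_cells = sum(sum(cell) ** 2 for row in cells for cell in row) / K - correction
    ssab = ss_cells - ssa - ssb   # interaction
    sse = sst - ss_cells          # within-cell (error)

    mse = sse / (I * J * (K - 1))
    return {"SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse, "SST": sst,
            "F_A": (ssa / (I - 1)) / mse,
            "F_B": (ssb / (J - 1)) / mse,
            "F_AB": (ssab / ((I - 1) * (J - 1))) / mse}

# Agitation scores: each music type holds [early-stage, middle-stage] observations
music = [
    [[21, 24, 22, 18, 20], [22, 20, 25, 18, 20]],  # Piano
    [[9, 12, 10, 5, 9],    [14, 18, 11, 9, 13]],   # Mozart
    [[29, 26, 30, 24, 26], [15, 18, 20, 13, 19]],  # Easy Listening
]
rm = two_factor_replicated_anova(music)
for name, value in rm.items():
    print(f"{name}: {value:.2f}")
```

With these data the music F ratio works out to 370 / 7.42, about 49.89, and the stage and interaction ratios to about 4.05 and 17.53.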
Multivariate Analysis of Variance
Referred to as MANOVA.
Used when there are multiple dependent variables.
The dependent variables are usually related to one another.
MANOVA helps to determine the effect of the treatment (IV) on any one outcome (DV).

MANOVA Example
Do sex, race, and educational level affect how well people deal with the pressure of a terminal disease?
IV = sex (2), race (4), education (4)
DV = coping strategies (5)
MANOVA can estimate the effects of the IVs (sex, race, and education) for each of the five scales of coping strategies, independent of one another.

Analysis of Covariance
Referred to as ANCOVA.
Allows researchers to adjust for or equalize baseline differences between groups.
In addition to the DV and IV, a covariate is entered into the model.
The covariate is a variable that is known to have an effect on the DV.

ANCOVA example
Wood et al. (2002) tested an educational intervention to promote breast self examination (BSE).
Quasi-experimental design.
The groups differed initially in knowledge and skill related to BSE.
Entering these covariates into the model is essential to determine whether the difference is due to the intervention or to the initial difference.