Chapter 14: Analysis of Variance

Introduction
Analysis of variance helps compare two or more populations of quantitative data. Specifically, we are interested in the relationships among the population means (are they equal or not). The procedure works by analyzing the sample variances.

14.1 One-Way Analysis of Variance
The analysis of variance is a procedure that tests whether differences exist among two or more population means. To do this, the technique analyzes the sample variances.

One-Way Analysis of Variance: Example 1
– An apple juice manufacturer is planning to develop a new product: a liquid concentrate.
– The marketing manager has to decide how to market the new product.
– Three strategies are considered:
  • Emphasize the convenience of using the product.
  • Emphasize the quality of the product.
  • Emphasize the product's low price.

Example 1 – continued
– An experiment was conducted as follows:
  • In three cities an advertisement campaign was launched.
  • In each city only one of the three characteristics (convenience, quality, and price) was emphasized.
  • The weekly sales were recorded for twenty weeks following the beginning of the campaigns.

One-Way Analysis of Variance: Data
[Table: 20 weekly sales figures for each strategy (Convenience, Quality, Price); see file Xm1.xls.]

One-Way Analysis of Variance: Solution
– The data is quantitative.
– Our problem objective is to compare sales in three cities.
– We hypothesize on the relationships among the three mean weekly sales:

Defining the Hypotheses
  H0: μ1 = μ2 = μ3
  H1: At least two means differ
To build the statistic needed to test the hypotheses, use the following notation.

Notation
Independent samples are drawn from k populations (treatments):
  Sample 1: x11, x21, ..., x_{n1,1}; sample size n1; sample mean x̄1
  Sample 2: x12, x22, ..., x_{n2,2}; sample size n2; sample mean x̄2
  ...
  Sample k: x1k, x2k, ..., x_{nk,k}; sample size nk; sample mean x̄k
X is the "response variable". The variable's values are called "responses".

Terminology
In the context of this problem:
  Response variable – weekly sales.
  Responses – actual sales values.
  Experimental unit – the weeks in the three cities when we record sales figures.
  Factor – the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy.
  Factor levels – the population (treatment) names. In this problem the factor levels are the three marketing strategies.

The Rationale of the Test Statistic
Two types of variability are employed when testing for the equality of the population means.

Graphical demonstration: employing two types of variability
[Charts: with small variability within the samples it is easy to draw a conclusion about the population means; with the same sample means but larger within-sample variability it is harder to draw a conclusion about the population means.]

The rationale behind the test statistic – I
If the null hypothesis is true, we would expect all the sample means to be close to one another (and, as a result, close to the grand mean). If the alternative hypothesis is true, at least some of the sample means would lie far from one another. Thus, we measure the variability among the sample means.
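The two-types-of-variability idea can be made concrete with a short Python sketch. The data below are made-up toy numbers (not the apple-juice data): both data sets have the same three sample means, but the second has much larger within-sample spread.

```python
# Toy illustration (hypothetical data): same sample means, different
# within-sample variability, very different F statistics.

def one_way_anova(samples):
    """Return the one-way ANOVA F statistic for a list of samples."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    means = [sum(s) / len(s) for s in samples]
    grand = sum(sum(s) for s in samples) / n
    # SST: variability among sample means; SSE: variability within samples
    sst = sum(len(s) * (m - grand) ** 2 for s, m in zip(samples, means))
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return mst / mse

tight = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]   # means 10, 15, 20
wide = [[1, 10, 19], [6, 15, 24], [11, 20, 29]]     # same means, spread out

print(one_way_anova(tight))  # 75.0
print(one_way_anova(wide))   # ≈ 0.93 — hard to distinguish the means
```

With small within-sample variability the same mean differences yield F = 75; with large within-sample variability F drops below 1, so the evidence against equal means vanishes.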
Variability Among Sample Means
The variability among the sample means is measured as the sum of squared distances between each mean and the grand mean. This sum is called the Sum of Squares for Treatments (SST). In our example the treatments are represented by the different advertising strategies.

Sum of Squares for Treatments (SST)
  SST = Σ_{j=1}^{k} n_j (x̄_j − x̄)²
where there are k treatments, n_j is the size of sample j, and x̄_j is the mean of sample j.
Note: When the sample means are close to one another, their distances from the grand mean are small, leading to a small SST. Thus, a large SST indicates large variation among the sample means, which supports H1.

Sum of Squares for Treatments (SST): Solution – continued
Calculate SST:
  x̄1 = 577.55, x̄2 = 653.00, x̄3 = 608.65
The grand mean is calculated by
  x̄ = (n1·x̄1 + n2·x̄2 + ... + nk·x̄k) / (n1 + n2 + ... + nk) = 613.07
Then
  SST = 20(577.55 − 613.07)² + 20(653.00 − 613.07)² + 20(608.65 − 613.07)² = 57,512.23
Is SST = 57,512.23 large enough to favor H1? See next.

The rationale behind the test statistic – II
Large variability within the samples weakens the "ability" of the sample means to represent their corresponding population means. Therefore, even though the sample means may markedly differ from one another, a large SST must be judged relative to the "within-samples variability".

Within-Samples Variability
The variability within samples is measured by adding all the squared distances between observations and their sample means. This sum is called the Sum of Squares for Error (SSE). In our example this is the sum of all squared differences between sales in city j and the sample mean of city j (over all three cities).
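The SST calculation above can be reproduced in a few lines of Python, using the sample means and sizes given in the example:

```python
# Sum of Squares for Treatments (SST) for the apple-juice example.
# Sample means and sizes are the values given in the text.
sample_means = [577.55, 653.00, 608.65]   # x̄1, x̄2, x̄3
sample_sizes = [20, 20, 20]               # n1, n2, n3

# Grand mean: weighted average of the sample means.
grand_mean = (sum(n * m for n, m in zip(sample_sizes, sample_means))
              / sum(sample_sizes))

# SST = sum over treatments of n_j * (x̄_j - x̄)^2
sst = sum(n * (m - grand_mean) ** 2
          for n, m in zip(sample_sizes, sample_means))

print(round(grand_mean, 2))  # 613.07
print(round(sst, 2))         # 57512.23
```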
Sum of Squares for Errors (SSE): Solution – continued
Calculate SSE:
  s1² = 10,775.00, s2² = 7,238.11, s3² = 8,670.24
  SSE = Σ_{j=1}^{k} Σ_{i=1}^{n_j} (x_ij − x̄_j)² = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3²
      = (20 − 1)(10,775.00) + (20 − 1)(7,238.11) + (20 − 1)(8,670.24) = 506,983.50

Sum of Squares for Errors (SSE)
• Note: If SST is small relative to SSE, we cannot infer that the treatments are the cause of different average performance.
• Is SST = 57,512.23 large enough relative to SSE = 506,983.50 to argue that the means ARE different?

The Mean Sums of Squares
To perform the test we need to calculate the mean sums of squares as follows:
  Mean Square for Treatments: MST = SST / (k − 1) = 57,512.23 / (3 − 1) = 28,756.12
  Mean Square for Error: MSE = SSE / (n − k) = 506,983.50 / (60 − 3) = 8,894.45

Calculation of the Test Statistic
We assume:
1. The populations tested are normally distributed.
2. The variances of all the populations tested are equal.
(For the honors class: testing normality and testing equal variances.)
  F = MST / MSE = 28,756.12 / 8,894.45 = 3.23
with the following degrees of freedom: ν1 = k − 1 and ν2 = n − k.

The F Test: Rejection Region
And finally the hypothesis test:
  H0: μ1 = μ2 = ... = μk
  H1: At least two means differ
  Test statistic: F = MST / MSE
  Rejection region: F > F_{α, k−1, n−k}

The F Test
  H0: μ1 = μ2 = μ3
  H1: At least two means differ
  Test statistic: F = MST / MSE = 28,756.12 / 8,894.45 = 3.23
  Rejection region: F > F_{α, k−1, n−k} = F_{0.05, 3−1, 60−3} ≈ 3.16
Since 3.23 > 3.16, there is sufficient evidence to reject H0 in favor of H1 and argue that at least one of the mean sales figures differs from the others.
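The SSE, mean squares, and F statistic can be checked the same way; the sample variances below are the ones reported in the Excel printout shown later in the chapter:

```python
# SSE from the three sample variances, then MST, MSE, and F.
sample_vars = [10774.997, 7238.1053, 8670.2395]  # s1^2, s2^2, s3^2
n_j = 20        # common sample size
k = 3           # number of treatments
n = k * n_j     # total number of observations

# SSE = sum of (n_j - 1) * s_j^2 over the samples
sse = sum((n_j - 1) * s2 for s2 in sample_vars)

sst = 57512.23          # from the SST calculation above
mst = sst / (k - 1)     # mean square for treatments
mse = sse / (n - k)     # mean square for error
f_stat = mst / mse

print(round(sse, 1))    # 506983.5
print(round(f_stat, 2)) # 3.23
```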
The F Test: p-Value
Use Excel to find the p-value:
  =FDIST(3.23,2,57) = .0467
  p-value = P(F > 3.23) = .0467
[Chart: F density with the p-value area shaded to the right of 3.23.]

Excel Single-Factor Printout (see file Xm1.xls)
Anova: Single Factor

SUMMARY
  Groups    Count   Sum     Average   Variance
  Convnce   20      11551   577.55    10774.997
  Quality   20      13060   653       7238.1053
  Price     20      12173   608.65    8670.2395

ANOVA
  Source of Variation   SS          df   MS          F           P-value    F crit
  Between Groups        57512.233   2    28756.117   3.2330414   0.046773   3.1588456
  Within Groups         506983.5    57   8894.4474
  Total                 564495.73   59

Note: SS(Total) = SST + SSE.

14.2 Multiple Comparisons
If the single-factor ANOVA leads us to conclude that at least two means differ, we often want to know which ones. Two means are considered different if the difference between the corresponding sample means is larger than a critical number. The larger sample mean is believed to be associated with a larger population mean.

Fisher's Least Significant Difference
Fisher's Least Significant Difference (LSD) method is one procedure designed to determine which mean differences are significant. The hypotheses are:
  H0: μi − μj = 0
  H1: μi − μj ≠ 0
  The statistic: x̄i − x̄j
This method builds on the equal-variances t-test of the difference between two means. The test statistic is improved by using MSE rather than sp². We can conclude that μi and μj differ (at the α significance level) if |x̄i − x̄j| > LSD, where
  LSD = t_{α/2} √(MSE(1/ni + 1/nj)),  d.f. = n − k

Experimentwise Type I Error Rate (αE)
Fisher's method may result in an increased probability of committing a type I error. The probability of committing at least one type I error in a series of C hypothesis tests, each at significance level α, increases with C. This probability is called the experimentwise type I error rate (αE).
It is calculated by
  αE = 1 − (1 − α)^C
where C is the number of pairwise comparisons (C = k(k − 1)/2, and k is the number of treatments).
The Bonferroni adjustment determines the required type I error probability per pairwise comparison (α) to secure a predetermined overall αE.

The Bonferroni Adjustment
The procedure:
– Compute the number of pairwise comparisons C = k(k − 1)/2, where k is the number of populations/treatments.
– Set α = αE/C, where the value of αE is predetermined.
– We can conclude that μi and μj differ (at the αE/C significance level) if
  |x̄i − x̄j| > t_{αE/(2C)} √(MSE(1/ni + 1/nj)),  d.f. = n − k

The Fisher and Bonferroni Methods: Example 1 – continued
– Rank the effectiveness of the marketing strategies (based on mean weekly sales).
– Use Fisher's method and the Bonferroni adjustment method.
Solution (Fisher's method)
– The sample mean sales were 577.55, 653.00, and 608.65. Then
  |x̄1 − x̄2| = |577.55 − 653.00| = 75.45
  |x̄1 − x̄3| = |577.55 − 608.65| = 31.10
  |x̄2 − x̄3| = |653.00 − 608.65| = 44.35
  LSD = t_{.05/2} √(MSE(1/ni + 1/nj)) = t_{.025,57} √(8894(1/20 + 1/20)) = 59.71
The significant difference is between μ1 and μ2.
Solution (the Bonferroni adjustment)
– We calculate C = k(k − 1)/2 = 3(2)/2 = 3.
– We set α = .05/3 = .0167, thus t_{.0167/2, 60−3} = 2.467 (Excel).
  LSD = 2.467 √(8894(1/20 + 1/20)) = 73.54
Again, the significant difference is between μ1 and μ2.

The Tukey Multiple Comparisons
The test procedure:
– Find a critical number ω as follows:
  ω = q_α(k, ν) √(MSE / ng)
If the sample sizes are not extremely different, we can use the above procedure with ng calculated as the harmonic mean of the sample sizes.
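The three multiple-comparison rules can be sketched together in Python. The MSE and the critical t values (2.002 and 2.467) are the ones quoted in the example; the studentized-range value q_.05(3, 60) = 3.40 is an assumed table reading consistent with the ω = 71.70 reported later.

```python
from math import sqrt

# Pairwise comparisons for Example 1: Fisher's LSD, the Bonferroni
# adjustment, and Tukey's omega.  Critical values are taken from the
# text (t) and from a studentized range table (q, assumed 3.40).
means = {1: 577.55, 2: 653.00, 3: 608.65}
mse, n_j, k = 8894.45, 20, 3

# Unadjusted experimentwise error rate over C pairwise tests
C = k * (k - 1) // 2                      # 3 comparisons
alpha_e = 1 - (1 - 0.05) ** C             # ≈ 0.1426

se = sqrt(mse * (1 / n_j + 1 / n_j))      # std. error of x̄i - x̄j
lsd_fisher = 2.002 * se                   # t_{.025,57} -> ≈ 59.71
lsd_bonf = 2.467 * se                     # t_{.0167/2,57} -> ≈ 73.6
omega = 3.40 * sqrt(mse / n_j)            # q_.05(3,60) -> ≈ 71.70

pairs = [(1, 2), (1, 3), (2, 3)]
diffs = {p: abs(means[p[0]] - means[p[1]]) for p in pairs}
for crit in (lsd_fisher, lsd_bonf, omega):
    sig = [p for p in pairs if diffs[p] > crit]
    print(round(crit, 2), sig)  # only (1, 2) is significant each time
```

All three criteria flag only the convenience-vs-quality pair (cities 1 and 2), matching the conclusions in the example.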
where
  k = the number of samples
  ν = degrees of freedom = n − k
  ng = number of observations per sample (recall, all the sample sizes are the same)
  α = significance level
  q_α(k, ν) = a critical value obtained from the studentized range table.

If the sample sizes are not the same but do not differ much from one another, we can use the harmonic mean of the sample sizes for ng:
  ng = k / (1/n1 + 1/n2 + ... + 1/nk)

The Tukey procedure:
– Select a pair of means. Calculate the difference between the larger and the smaller mean, x̄max − x̄min.
– If x̄max − x̄min > ω, conclude that there is sufficient evidence that μmax > μmin.
– Repeat this procedure for each pair of samples. Rank the means if possible.

The Tukey Multiple Comparisons: Example 1 – continued
We had three populations (three marketing strategies): k = 3. The sample sizes were equal: n1 = n2 = n3 = 20, so ν = n − k = 60 − 3 = 57 and MSE = 8894. Take q_.05(3,60) from the table:
  ω = q_α(k, ν) √(MSE / ng) = q_.05(3,60) √(8894 / 20) = 71.70
Population mean sales: City 1: 577.55; City 2: 653.00; City 3: 608.65.
Compare x̄max − x̄min with ω:
  City 1 vs. City 2: 653.00 − 577.55 = 75.45
  City 1 vs. City 3: 608.65 − 577.55 = 31.10
  City 2 vs.
City 3: 653.00 − 608.65 = 44.35
Only the City 1 vs. City 2 difference (75.45) exceeds ω = 71.70.

Excel – Tukey and Fisher LSD Methods (Xm15-1.xls)
Fisher's LSD Multiple Comparisons (α = .05); Omega = 71.7007
  Variable   Variable   Difference   LSD
  1          2          -75.45       59.72067
  1          3          -31.10       59.72067
  2          3           44.35       59.72067
Bonferroni adjustment (type α = .05/3 = .0167); Omega = 71.7007
  Variable   Variable   Difference   LSD
  1          2          -75.45       73.54176
  1          3          -31.10       73.54176
  2          3           44.35       73.54176

14.3 Randomized Blocks Design
The purpose of designing a randomized block experiment is to reduce the within-treatments variation, thus increasing the relative amount of among-treatments variation. This helps in detecting differences among the treatment means more easily.

Randomized Blocks
[Diagram: treatments 1–4 applied within each block (the block of greyish pinks, the block of bluish purples, the block of dark blues).]

Partitioning the Total Variability
The sum of squares total is partitioned into three sources of variation:
– Treatments
– Blocks
– Within samples (Error)
  SS(Total) = SST + SSB + SSE
(Recall: for the independent-samples design we had SS(Total) = SST + SSE.)
Here SST is the sum of squares for treatments, SSB the sum of squares for blocks, and SSE the sum of squares for error.

The Mean Sums of Squares
To perform hypothesis tests for treatments and blocks we need:
  Mean square for treatments: MST = SST / (k − 1)
  Mean square for blocks: MSB = SSB / (b − 1)
  Mean square for error: MSE = SSE / ((k − 1)(b − 1))

The Test Statistics for the Randomized Block Design ANOVA
  Test statistic for treatments: F = MST / MSE
  Test statistic for blocks: F = MSB / MSE

The F Test: Rejection Regions
  Testing the mean responses for treatments: F > F_{α, k−1, (k−1)(b−1)}
  Testing the mean responses for blocks: F > F_{α, b−1, (k−1)(b−1)}

Randomized Blocks ANOVA – Example 2
– Are there differences in the effectiveness of cholesterol-reduction drugs?
– To answer this question, the following experiment was organized: 25 groups of men with high cholesterol were matched by age and weight. Each group consisted of 4 men.
Each person in a group received a different drug. The cholesterol level reduction in two months was recorded.
– Can we infer from the data in Xm2.xls that there are differences in mean cholesterol reduction among the four drugs?

Randomized Blocks ANOVA – Example 2: Solution
– Each drug can be considered a treatment.
– Each set of 4 records (per group) can be blocked, because they are matched by age and weight.
– This procedure eliminates the variability in cholesterol reduction related to different combinations of age and weight.
– This helps detect differences in the mean cholesterol reduction attributed to the different drugs.

ANOVA
  Source of Variation          SS         df   MS         F          P-value    F crit
  Rows (blocks, b − 1)         3848.657   24   160.3607   10.10537   9.7E-15    1.669456
  Columns (treatments, k − 1)  195.9547   3    65.31823   4.116127   0.009418   2.731809
  Error                        1142.558   72   15.86886
  Total                        5187.169   99
(F for treatments = MST / MSE; F for blocks = MSB / MSE.)
Conclusion: At the 5% significance level there is sufficient evidence to infer that the mean cholesterol reduction gained by at least two drugs differs.

14.2 Multiple Comparisons (continued)
The rejection region:
  |x̄i − x̄j| > t_{α/2, n−k} √(MSE(1/ni + 1/nj))
Example 1 – continued
Calculating LSD: MSE = 8894.44; n1 = n2 = n3 = 20; t_{.05/2, 60−3} = TINV(.05, 57) = 2.002.
  LSD = (2.002)[8894.44(1/20 + 1/20)]^.5 = 59.72
Testing the differences:
  |x̄1 − x̄2| = 75.45 > 59.72
  |x̄1 − x̄3| = 31.10 < 59.72
  |x̄2 − x̄3| = 44.35 < 59.72
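The MS and F columns of the randomized-block ANOVA table for Example 2 can be rebuilt from the SS and df values in the printout:

```python
# Rebuild the MS and F columns of the randomized-block ANOVA table
# for Example 2 from the SS and df values in the printout.
ss_blocks, df_blocks = 3848.657, 24        # rows: b - 1 = 25 - 1
ss_treat, df_treat = 195.9547, 3           # columns: k - 1 = 4 - 1
ss_error, df_error = 1142.558, 72          # (k - 1)(b - 1)

ms_blocks = ss_blocks / df_blocks          # MSB
ms_treat = ss_treat / df_treat             # MST
ms_error = ss_error / df_error             # MSE

f_treat = ms_treat / ms_error              # tests the drug (treatment) means
f_blocks = ms_blocks / ms_error            # tests the block means

print(round(f_treat, 4), round(f_blocks, 4))  # 4.1161 10.1054
```

Both F values reproduce the printout; with p-values of 0.009418 (treatments) and 9.7E-15 (blocks), both null hypotheses are rejected at the 5% level.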