Comparison of groups

The purpose of the analysis is to compare two or more population means by analyzing sample means and variances. One-way analysis of variance is used with data categorized by one treatment (or factor), which is a characteristic that allows us to distinguish the different populations from one another.

Example: A headline in USA Today proclaimed that “Men, women are equal talkers.” That headline referred to a study of the numbers of words that samples of men and women spoke in a day. Given below are the results from the study. Does there appear to be a difference?

Example: Weights of college students in September and April of their freshman year were measured. The following table lists a small portion of those sample values. (Here we use only a small portion of the available data so that we can better illustrate the problem.) Can you claim some change in weight from September to April?

Independence assumption

Two or more samples are independent if the sample values selected from one group are not related to, paired with, or matched with the sample values from the other groups. Two groups are dependent if the sample values are paired: either each pair consists of two measurements from the same subject (such as before/after data), or each pair consists of matched subjects (such as husband/wife data), where the matching is based on some inherent relationship.

Identifying Means That Are Different

Informal methods for comparing means

1. Use the same scale for constructing boxplots of the data sets to see if one or more of the data sets are very different from the others.
2. Calculate the mean for each group, then compare those means to see if one or more of them are significantly different from the others.

Analysis of Variance Fundamental Concepts

Estimate the common value of $\sigma^2$:

1. The variance between groups (also called variation due to treatment) is an estimate of the common population variance $\sigma^2$ that is based on the variability among the sample means.
2. The variance within groups (also called variation due to error) is an estimate of the common population variance $\sigma^2$ that is based on the sample variances.

Analysis of Variance Requirements

1. The populations have approximately normal distributions.
2. The populations have the same variance $\sigma^2$ (or standard deviation $\sigma$).
3. The samples are independent of each other.
4. The different samples are from populations that are categorized in only one way.

Key Components of Analysis of Variance

In the formulas below, $\bar{x}_i$, $s_i^2$, and $n_i$ are the mean, variance, and size of sample $i$, $k$ is the number of groups, and $\bar{\bar{x}}$ is the mean of all the sample values combined.

SS(total), or total sum of squares, is a measure of the total variation (around $\bar{\bar{x}}$) in all the sample data combined.

$$SS(\text{total}) = \sum (x - \bar{\bar{x}})^2$$

SS(treatment), also referred to as the sum of squares between groups, is a measure of the variation between the sample means.

$$SS(\text{treatment}) = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2 = \sum n_i(\bar{x}_i - \bar{\bar{x}})^2$$

SS(error), also referred to as the sum of squares within groups, is a sum of squares representing the variability that is assumed to be common to all the populations being considered.

$$SS(\text{error}) = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2 = \sum (n_i - 1)s_i^2$$

Given the previous expressions for SS(total), SS(treatment), and SS(error), the following relationship will always hold.

$$SS(\text{total}) = SS(\text{treatment}) + SS(\text{error})$$

Mean Squares (MS)

MS(treatment) is a mean square for treatment, obtained as follows:

$$MS(\text{treatment}) = \frac{SS(\text{treatment})}{k - 1}$$

MS(error) is a mean square for error, obtained as follows:

$$MS(\text{error}) = \frac{SS(\text{error})}{N - k}$$

N = total number of values in all samples combined

Identifying Means That Are Different

$$F = \frac{MS(\text{treatment})}{MS(\text{error})}$$

After checking the significance of the ratio F of mean squares (MS), we might conclude that the population means are not all equal.
However, this test cannot show which particular mean is different from the others.
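The sums of squares, mean squares, and F ratio defined above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original material: the function name `one_way_anova` and the three small groups of values are hypothetical, made up so that the arithmetic is easy to follow by hand. The sketch only computes F. Judging its significance would additionally require the F distribution with k - 1 and N - k degrees of freedom, which is not shown here.

```python
from statistics import mean, variance


def one_way_anova(groups):
    """Compute the one-way ANOVA components for a list of samples.

    `groups` is a list of lists, one list of values per group.
    """
    k = len(groups)                      # number of groups
    N = sum(len(g) for g in groups)      # total number of values
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)        # mean of all values combined

    # SS(total): variation of every value around the grand mean
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # SS(treatment): variation of the group means around the grand mean
    ss_treatment = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SS(error): pooled variation inside the groups (sample variances)
    ss_error = sum((len(g) - 1) * variance(g) for g in groups)

    ms_treatment = ss_treatment / (k - 1)
    ms_error = ss_error / (N - k)
    return {
        "ss_total": ss_total,
        "ss_treatment": ss_treatment,
        "ss_error": ss_error,
        "ms_treatment": ms_treatment,
        "ms_error": ms_error,
        "F": ms_treatment / ms_error,
    }


# Hypothetical data: three groups of three values each (not from the study)
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
result = one_way_anova(example_groups)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

For these made-up values, SS(treatment) and SS(error) add up to SS(total), as the relationship above requires. A large F suggests that the variation between the group means is large compared with the variation within the groups.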