Quantitative Methods II
WELCOME!
Lecture 13
Thommy Perlinger

Parametric tests (tests for the mean)

Nature and number of variables: one-way vs. two-way ANOVA
One-way ANOVA
• Y1: one dependent variable (metric)
• X1: one independent/explanatory variable (categorical, e.g. different groups)
Two-way ANOVA
• Y1: one dependent variable (metric)
• X1, X2: two independent/explanatory variables = factors (categorical)

One-way Analysis of Variance (ANOVA)
H0: µ1 = µ2 = … = µk (all k population means are equal)
Ha: The population means are not all equal
Assumptions:
• Independent random samples
• The variable is Normally distributed in each group
• Normally distributed residuals
• Equal variance in all groups (homogeneity)
Advantages:
• Robust to slight deviations from Normality of the variable when the groups are of equal sizes
• Robust to slight deviations from equality of variances
Disadvantage:
• Not appropriate if the variability is large (use a nonparametric test instead)

The basic principles of ANOVA
Suppose we have three fertilisers and wish to compare their efficacy. This could be done by a field experiment: each fertiliser is applied to 10 plots, the 30 plots are later harvested, and the crop yield is calculated for each plot.

We now have three groups, with ten values (crop yields) in each group, and we wish to know if there are any differences between these groups. The fertilisers do differ in the amount of yield produced, but there is also a lot of variation between plots given the same fertiliser.

The variability quantifies the spread of the data points around the mean.

Variability: sum of squares
To measure the variability, first calculate the mean, then the deviation of each point from the mean. The deviations are squared and then summed, and this sum is a useful measure of variability. The greater the scatter of the data points around the mean, the larger the sum.
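In symbols, writing y_1, …, y_n for the n observations and ȳ for their mean (this notation is assumed here for illustration and does not come from the slides), the two steps just described are:

```latex
\[
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i ,
\qquad
\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2
\]
```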
This quantity is referred to as a sum of squares (SS), and is central to our analysis.

Variance
The SS, however, cannot be used as a comparative measure between groups, since it is clearly influenced by the number of data points in the group. Instead, this quantity is converted to a variance by dividing by n − 1. The variance is therefore a measure of variability that takes account of the size of the dataset.

But why don't we divide by the actual size, n? Because we do not actually have n independent pieces of information about the variance.

Degrees of freedom: number of independent pieces of information
The first step was to calculate a mean (from the n independent pieces of data collected). The second step is to calculate a variance with reference to that mean. Once n − 1 deviations have been calculated, the final deviation is known, because the deviations must add up to zero by definition. So we have only n − 1 independent pieces of information on the variability about the mean. The number of independent pieces of information contributing to a statistic is referred to as the degrees of freedom (df).

Partitioning the variance
In ANOVA, it is useful to keep the measure of variability in its two components: a sum of squares, and the degrees of freedom associated with that sum of squares. The variance is also partitioned, i.e. divided into two parts:
• the variance due to the event we are interested in (e.g. different fertilisers), and
• the variance due to any other factor
To illustrate the principle behind partitioning the variability, first consider two extreme datasets.

Grand mean
If there were almost no variation between the plots due to any of the other factors, and nearly all variation were due to the application of the three fertilisers, the data would follow the pattern shown in the slide's figure.

Grand mean vs. group means
The first step would be to calculate a grand mean (as on the previous slide), and there is considerable variation around this mean.
The second step is to calculate the three group means that we wish to compare: that is, the means for the plots given fertilisers A, B and C.

Group means
Once these means are fitted, little variation is left around the group means.

Amount of variability explained
In other words, fitting the group means has removed, or explained, nearly all the variability in the data. This has happened because the three means are distinct.

Now consider the other extreme, in which the three fertilisers are, in fact, identical.

Grand mean
Once again, the first step is to fit a grand mean and calculate the sum of squares.

Group means
Second, three group means are fitted, only to find that there is almost as much variability as before.

Amount of variability explained
Little variability has been explained. This has happened because the three means are relatively close to each other (compared to the scatter of the data).

The amount of variability that has been explained can be quantified directly by measuring the scatter of the group means around the grand mean. In the first example, the deviations of the group means from the grand mean are considerable. In the second example, these deviations are relatively small.

Group means
Now consider a third example, an intermediate situation, in which it is not immediately obvious whether the fertilisers have had an influence on yield.

Significant amount of variability explained?
There is an obvious reduction in variability around the three means (compared to the one mean). But at what point do we decide that the amount of variation explained by fitting the three means is significant? The word significant, in this context, has a technical meaning: 'When is the variability between the group means greater than we would expect by chance alone?'

Three measures of variability
SSB = Sum of Squares between groups
Sum of squared deviations of the group means from the grand mean, with each group mean counted once for every observation in its group.
A measure of the variation between the different groups (e.g. the variation between plots given different fertilisers).
SSW = Sum of Squares within groups
Sum of squared deviations of the data around the separate group means. A measure of the variation within each group (e.g. the variation between the different plots that are given the same fertiliser).
SST = Total Sum of Squares
Sum of squared deviations of the data around the grand mean. A measure of the total variability in the dataset.
SST = SSB + SSW

Partitioning the variability
In the first example, SSW was small (small variation within the groups) and SSB was large (large variation between the groups).
In the second example, SSW was large (large variation within the groups) and SSB was small (small variation between the groups).

Significant amount of variability explained?
So, if the variability between the group means is greater than we would expect by chance alone, there is a significant difference between the group means. For a valid comparison between the two sources of variability, we need to compare the variability adjusted for the degrees of freedom, i.e. the variances.

Partitioning the degrees of freedom
The first step in any analysis of variance is to calculate SST. This has n − 1 degrees of freedom.
The second step is to calculate the three group means. When the deviations of two of the three treatment means from the grand mean have been calculated, the third is predetermined. Therefore SSB (the deviation of the group means from the grand mean) has 2 df associated with it, or more formally k − 1 df (k = number of groups).
Finally, SSW measures variation around the different group means.
Within each of these groups, the deviations sum to zero, so for any number of deviations within a group the last is always predetermined. One degree of freedom is therefore lost per group, and SSW has n − k df associated with it (k = number of groups).

Mean sum of squares
Combining the information from the sums of squares and the degrees of freedom, we get the mean squares.
MST = Total Mean Square = SST / (n − 1)
The total variance in the dataset.
MSB = Mean Square between groups = SSB / (k − 1)
The variance between the different groups (e.g. the variation between plots given different fertilisers, adjusted for the degrees of freedom).
MSW = Mean Square within groups = SSW / (n − k)
The variance within each group (e.g. the variation between the different plots that are given the same fertiliser, adjusted for the degrees of freedom).

F-ratio
If none of the fertilisers influenced yield, then the variation between plots treated with the same fertiliser would be much the same as the variation between plots given different fertilisers. In terms of mean squares: the mean square within groups (MSW) would be about the same as the mean square between groups (MSB).
F = MSB / MSW
The F-ratio is expected to be close to 1 if the population group means are the same.
F-ratio > 1 means that the between-group variance is larger than the within-group variance, which suggests that the group means differ.
F-ratio < 1 means that the within-group variance is larger than the between-group variance, and the relatively large spread within the groups makes it difficult to say that the group means are different.
The F-ratio is compared to the F-distribution to calculate an appropriate P-value.

F-distribution
The shape of the F-distribution depends on the df (both between-group and within-group).
[Figure: F-distributions with 2 and 27 df and with 10 and 57 df]

Example: Eyes and ad response
Research from a variety of fields has found significant effects of eye gaze and eye color on emotions and perceptions such as arousal, attractiveness, and honesty.
These findings suggest that a model's eyes may play a role in a viewer's response to an ad.

In a recent study, 222 randomly chosen students at a certain university were each presented with one of four portfolios. Each portfolio contained a target ad for a fictional product, Sparkle Toothpaste. The students were asked to view the ad and then respond to questions concerning their attitudes and emotions about the ad and product. The variable of main interest is the viewer's "attitudes towards the brand", an average of 10 survey questions, on a 7-point scale.

The only difference between three of the ads was the model's eyes, which were made to be either brown, blue, or green. In the fourth ad, the model is in the same pose but looking downward, so the eyes are not visible.

Group    n    Mean   Std.dev
Blue     67   3.19   1.75
Brown    37   3.72   1.73
Green    77   3.86   1.67
Down     41   3.11   1.53

In SPSS: Graphs >> Legacy Dialogs >> Boxplot. Choose "Simple".

H0: µblue = µbrown = µgreen = µdown (all 4 mean attitudes are equal in the population)
Ha: The population mean attitudes are not all equal
Assumptions:
• Independent random samples
• The attitude score is Normally distributed in all groups (to be checked by Normality plots and tests)
• Normally distributed residuals (save the residuals from the analysis and investigate Normality plots and tests)
• Equal variance in all groups?

Equal variances in ANOVA
Using formal tests for the equality of variances in several groups is not recommended (they are strongly affected by, e.g., deviations from Normality). Since ANOVA is robust to slight deviations from equality of variances, the following rule of thumb can be used: if the largest standard deviation is less than twice the smallest standard deviation, the results from the ANOVA will be approximately correct.
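The rule of thumb can be checked directly against the standard deviations in the group table above. A minimal Python sketch follows; the dictionary and function names are illustrative and not part of the lecture.

```python
# Standard deviations of "attitude towards the brand" from the group summary table.
std_devs = {"Blue": 1.75, "Brown": 1.73, "Green": 1.67, "Down": 1.53}


def equal_variance_rule_ok(sds, max_ratio=2.0):
    """Rule of thumb: the largest SD must be less than max_ratio times the smallest SD.

    Returns (rule satisfied?, ratio of largest to smallest SD).
    """
    largest = max(sds.values())
    smallest = min(sds.values())
    return largest < max_ratio * smallest, largest / smallest


ok, ratio = equal_variance_rule_ok(std_devs)
print(f"largest/smallest SD = {ratio:.2f}, rule satisfied: {ok}")
```

The ratio of the largest to the smallest standard deviation is well below 2 here, so the rule of thumb raises no concern about unequal variances.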
OK to use ANOVA if s_largest < 2 × s_smallest

Example: Eyes and ad response
None of the four group standard deviations (1.75, 1.73, 1.67 and 1.53) is twice as large as any other.

H0: µblue = µbrown = µgreen = µdown (all 4 mean attitudes are equal in the population)
Ha: The population mean attitudes are not all equal
Assumptions:
• Independent random samples
• The attitude score is Normally distributed in all groups (to be checked by Normality plots and tests)
• Normally distributed residuals (save the residuals from the analysis and investigate Normality plots and tests)
• Equal variance in all groups

Significance level?
Wrongly rejecting the null hypothesis would mean claiming that the mean attitude towards the brand differs depending on the eyes in the ad when the attitudes are in fact the same (on average). This is not a serious consequence, so the standard 5% is fine to use.

ANOVA (dependent variable: Score)
                 Sum of Squares   df    Mean Square   F       Sig.
Between Groups   24.420           3     8.140         2.894   .036
Within Groups    613.139          218   2.813
Total            637.558          221

F-ratio: > 1 if the between-group variance is larger than the within-group variance (which suggests that the group means differ).
P-value (Sig.): tells us if the group means are significantly different.
In SPSS: Analyze >> Compare Means >> One-way ANOVA. Choose your variable of interest under "Dependent List", and the grouping variable under "Factor".

Tests of Between-Subjects Effects (dependent variable: Score)
Source                  Type III Sum of Squares   df    Mean Square   F         Sig.
Corrected Model         24.420 (a)                3     8.140         2.894     .036
Intercept               2430.423                  1     2430.423      864.131   .000
group (between-group)   24.420                    3     8.140         2.894     .036
Error (within-group)    613.139                   218   2.813
Total                   3352.860                  222
Corrected Total         637.558                   221
a.
R Squared = .038 (Adjusted R Squared = .025)

R²: the fraction of the overall variability (pooling all the groups) attributable to differences among the group means.
P-value (Sig.): tells us if the group means are significantly different.
In SPSS: Analyze >> General Linear Model >> Univariate. Choose your explanatory variable as "Fixed factors".

The P-value is to be compared to the significance level:
0.036 < 0.05, so H0 is rejected (significant result).
Conclusion: The test result indicates that the mean attitude towards the brand differs depending on the eyes in the ad (eye color, or not seeing the eyes).
But which eye colors? Again, pairwise tests can be used to find which groups differ.

Recap: Pairwise tests can be performed after a multigroup test
If you find a significant difference using a multigroup test, you can perform pairwise tests to find which groups/occasions differ.
It is, however, very important not to start with the pairwise testing, due to multiplicity issues. First use e.g. ANOVA to find out if there are any significant differences at all, then perform pairwise tests to find where the differences are located. This way you don't have to adjust the significance level for multiplicity, and can use e.g. 5% in every pairwise comparison.

Two-way Analysis of Variance (ANOVA)
There are three sets of hypothesis tests with the two-way ANOVA:
H01: µ1_F1 = µ2_F1 = … = µk_F1 (all k population means of the first factor are equal)
H02: µ1_F2 = µ2_F2 = … = µk_F2 (all k population means of the second factor are equal)
H03: There is no interaction between the two factors
The two explanatory variables in a two-way ANOVA are called factors (categorical variables).
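One common way to write the model behind these three hypotheses is the two-way effects model sketched below. The notation (α, β, (αβ), ε, and a and b for the number of levels of the first and second factor) is an assumption made for illustration; it is not the lecture's own notation.

```latex
\begin{align*}
Y_{ijk} &= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
  \qquad i = 1,\dots,a,\quad j = 1,\dots,b \\
H_{01} &: \alpha_1 = \alpha_2 = \dots = \alpha_a = 0 \\
H_{02} &: \beta_1 = \beta_2 = \dots = \beta_b = 0 \\
H_{03} &: (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
\end{align*}
```

In this form, Y_ijk is observation k in the cell with level i of the first factor and level j of the second, α_i and β_j describe the two factors separately, and (αβ)_ij captures any interaction between them.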
Two-way Analysis of Variance (ANOVA)
Assumptions:
• Independent random samples
• The variable is Normally distributed in each group
• Normally distributed residuals
• Equal variance in all groups (homogeneity)
Advantages:
• Robust to slight deviations from Normality when the groups are of equal size
• Robust to slight deviations from equality of variances
Disadvantage:
• Not appropriate if the variability is large (use a nonparametric test instead)

Example: Cardiovascular risk factors
A study of cardiovascular (heart disease) risk factors compared runners who averaged at least 15 miles/week with a non-exercising control group. Both men and women were included in the study.
Y = heart rate after 6 min of exercise
Factor 1 = exercise group (runners/control)
Factor 2 = gender

Heart rate in beats per minute (b/m):
         Runners                 Control
Women    Mean 116, Std 16.0      Mean 148, Std 16.3
Men      Mean 104, Std 12.5      Mean 130, Std 17.1

None of the standard deviations is twice as large as any other.
In SPSS: Graphs >> Legacy Dialogs >> Boxplot. Choose "Clustered".

[Figure: distribution of heart rate in each of the four groups (Female/Male by Runners/Control)]
Heart rate seems to be Normally distributed in all groups.
In SPSS: Data >> Split File. Mark "Organize output by groups" and add the two factor variables. Then Analyze >> Descriptive Statistics >> Explore. Add the dependent variable to "Dependent List" and the two factor variables. Click "Plots" and mark "Normality plots with tests".

Residuals seem to be Normally distributed.
In SPSS: Analyze >> General Linear Model >> Univariate. Click "Save" and mark Standardized residuals. Then check the Normality of the saved residuals.

H01: µrunners = µcontrol
H02: µfemale = µmale
H03: There is no interaction effect on heart rate between exercise (running) and gender.
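What H03 would mean for the four cell means in the table above can be sketched as a difference of differences: with no interaction, the gap between control and runners would be the same for women and men. The Python snippet below only describes the sample means; it is not a significance test, and the variable and function names are illustrative.

```python
# Mean heart rate (beats per minute) after 6 min of exercise, from the table of cell means.
cell_means = {
    ("Women", "Runners"): 116,
    ("Women", "Control"): 148,
    ("Men", "Runners"): 104,
    ("Men", "Control"): 130,
}


def exercise_gap(gender):
    """Control mean minus runner mean for one gender."""
    return cell_means[(gender, "Control")] - cell_means[(gender, "Runners")]


gap_women = exercise_gap("Women")
gap_men = exercise_gap("Men")

# Under "no interaction" the two gaps would be equal, so this contrast would be 0.
interaction_contrast = gap_women - gap_men

print(f"Gap for women: {gap_women} b/m")
print(f"Gap for men:   {gap_men} b/m")
print(f"Difference of differences: {interaction_contrast} b/m")
```

Whether a non-zero contrast in the sample reflects a real interaction in the population is what the test of H03 is meant to decide.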
Assumptions:
• Independent random samples
• Heart rate is Normally distributed in all groups
• Normally distributed residuals
• Equal variance in all groups

Two-way Analysis of Variance (ANOVA)
There are two different kinds of effects measured with two-way ANOVA:
• Main effect
• Interaction effect

Main effect
The main effect describes the effect of the explanatory variables one at a time. The interaction is ignored for this part. This is the part that is similar to the one-way analysis of variance: each of the variances calculated to analyze the main effects is like a between-group variance. So for two variables, there will be two main effects.

Interaction effect
The interaction effect describes how the effect of one factor on the dependent variable depends on the level of the other factor. For two variables, there will be one interaction effect.

Example: Cardiovascular risk factors
Tests of Between-Subjects Effects (dependent variable: Heart rate)
Source                         Type III Sum of Squares   df    Mean Square    F           Sig.
Corrected Model                215256.090 (a)            3     71752.030      296.345     .000
Intercept                      12398208.080              1     12398208.080   51206.259   .000
group (main effect)            168432.080                1     168432.080     695.647     .000
gender (main effect)           45030.005                 1     45030.005      185.980     .000
group * gender (interaction)   1794.005                  1     1794.005       7.409       .007
Error                          192729.830                796   242.123
Total                          12806194.000              800
Corrected Total                407985.920                799
a. R Squared = .528 (Adjusted R Squared = .526)

R²: the fraction of the overall variability (pooling all the groups) attributable to differences among the group means.
In SPSS: Analyze >> General Linear Model >> Univariate. Choose your explanatory variables as "Fixed factors".

Example: Cardiovascular risk factors
If the lines in the plot are parallel, there is no interaction effect.
In SPSS: Analyze >> General Linear Model >> Univariate. Click "Plots".
Add one of the factors to "Horizontal axis" and the other to "Separate Lines", then click "Add".

Example: Cardiovascular risk factors
Conclusion: There is a significant interaction effect between exercise (running) and gender on heart rate (P-value 0.007). Exercise has a larger effect on heart rate for females than for males, on average.
Main effects analysis shows that both exercise and gender, separately, also have significant effects on average heart rate (both P-values are reported as .000, i.e. below 0.001). Men have on average a lower heart rate than women, and the running group has on average a lower heart rate than the control group.

Recap: Nature and number of variables
Analysis of variance (ANOVA/MANOVA)
ANOVA
• Y1: one dependent variable (metric)
• X1, X2, X3, ..., Xn: several independent/explanatory variables (categorical)
MANOVA
• Y1, Y2, Y3, ..., Yn: several dependent variables (metric)
• X1, X2, X3, ..., Xn: several independent/explanatory variables (categorical)

Recap: Dependence techniques
Several dependent variables in a single relationship; the technique depends on the measurement scales:
• Dependent variables metric, predictor variables metric: canonical correlation analysis (not included in this course)
• Dependent variables metric, predictor variables nonmetric: multivariate analysis of variance (MANOVA)
• Dependent variables nonmetric: canonical correlation analysis with dummy variables (not included in this course)
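As a closing worked check on the one-way example, the ANOVA table for the eyes and ad response data can be approximately rebuilt from the group summaries (n, mean, standard deviation) alone, using SSB, SSW, the mean squares and the F-ratio as defined in the lecture. This is a sketch under the assumption that only the rounded summaries are available, so it will not match the SPSS output exactly; the function and variable names are illustrative.

```python
# Group summaries from the eyes and ad response example: (n, mean, standard deviation).
groups = {
    "Blue": (67, 3.19, 1.75),
    "Brown": (37, 3.72, 1.73),
    "Green": (77, 3.86, 1.67),
    "Down": (41, 3.11, 1.53),
}


def one_way_anova_from_summaries(summaries):
    """Rebuild the one-way ANOVA quantities from per-group n, mean and SD."""
    n_total = sum(n for n, _, _ in summaries.values())
    k = len(summaries)

    # Grand mean = weighted average of the group means.
    grand_mean = sum(n * mean for n, mean, _ in summaries.values()) / n_total

    # SSB: squared deviation of each group mean from the grand mean,
    # counted once per observation in the group.
    ssb = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in summaries.values())

    # SSW: each group's variance times its degrees of freedom (n - 1).
    ssw = sum((n - 1) * sd ** 2 for n, _, sd in summaries.values())

    df_between = k - 1
    df_within = n_total - k
    msb = ssb / df_between
    msw = ssw / df_within
    return {
        "SSB": ssb, "SSW": ssw, "SST": ssb + ssw,
        "df_between": df_between, "df_within": df_within,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }


result = one_way_anova_from_summaries(groups)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these rounded inputs the sketch gives SSB of about 24.4, SSW of about 615.5 and F of about 2.89 on 3 and 218 df, close to the SPSS values of 24.420, 613.139 and 2.894. The small differences come from the rounding of the means and standard deviations.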