
Research in business studies

Department of Business Administration
SPRING 2009-10

Quantitative and Qualitative Data Analysis
by Assoc. Prof. Sami Fethi

Research Methods in Business Studies © 2009/10, Sami Fethi, EMU, All Right Reserved, Pearson Education, 2005, 3. Ed.

Quantitative data analysis
o Examining differences
o Relationship between variables
o Explaining and predicting relationship between variables
o Data reduction, structure and dimension
o Additional methods
o Characteristic of qualitative research
o Qualitative data
o Analytical procedure
o Interpretation
o Strategies for qualitative analysis
o Quantify qualitative data
o Validity in qualitative research

Examining differences

Hypotheses about one mean
o In research we often have to make statements about the mean. When the population variance is unknown, the standard error of the mean is also unknown and must be estimated from sample data, e.g.

$SD_{\bar{X}} = \frac{SD'}{\sqrt{N}}$

where $SD_{\bar{X}}$ = standard error of the mean, $SD'$ = estimated standard deviation and $N$ = sample size, with

$SD' = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{X})^2}{N - 1}}$

and $N - 1$ the degrees of freedom.

o Example 1: For a supermarket chain to add a new product, at least 100 units must be sold per week. The new product is tested in ten randomly selected stores for a limited time. Apply a one-tailed t-test to answer the question: will the new product sell more than 100 units per week?
a) Construct the hypotheses.
b) Calculate the mean and standard deviation if they are not given.
c) Calculate the standard error of the mean.
d) Find the t-value.

Answer:
a) $H_0: \mu \le 100$, $H_1: \mu > 100$
b) The mean and standard deviation are given: $\bar{X} = 109.4$ and $SD = 14.90$.
c) $SD_{\bar{X}} = 14.90 / \sqrt{10 - 1} = 4.55$
d) $t = (\bar{X} - \mu) / SD_{\bar{X}} = (109.4 - 100) / 4.55 = 2.07$. The critical value from the t-table is 1.83 at the 5% significance level. Since 2.07 > 1.83, we reject the null hypothesis.

Hypotheses about two means
o This is usually associated with a question such as: are the tastes in region A different from the tastes in region B? For example,

$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{SD_{\bar{X}_1 - \bar{X}_2}}$

where $\bar{X}_1$ = sample mean for the first sample, $\bar{X}_2$ = sample mean for the second sample, $SD_{\bar{X}_1 - \bar{X}_2}$ = the standard error of the difference in means, and $\mu_1$ and $\mu_2$ are the unknown population means. The general estimate of the standard error is

$SD_{\bar{X}_1 - \bar{X}_2} = \sqrt{SD_{\bar{X}_1}^2 + SD_{\bar{X}_2}^2} = \sqrt{\frac{SD_1^2}{N_1} + \frac{SD_2^2}{N_2}}$

o Assuming the two population variances are equal, the common population variance can be estimated by pooling the samples. When the variances are unknown and the standard errors of the means must be estimated, t is an adequate test statistic, distributed with $v = N_1 + N_2 - 2$ degrees of freedom.

o Example 2: A manufacturer has developed a new product and wonders whether the label of the package should be red or blue. The new product with the two different labels is tested in ten randomly selected stores. The mean sales are 403.0 for the red package and 390.3 for the blue package. The standard error of the difference in means is 8.15.
a) Construct the hypotheses.
b) Find the t-value.

Answer:
a) $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$ (two-tailed), or $H_0: \mu_1 - \mu_2 \le 0$, $H_1: \mu_1 - \mu_2 > 0$ (one-tailed)
b) $t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{SD_{\bar{X}_1 - \bar{X}_2}} = \frac{(403.0 - 390.3) - 0}{8.15} = 1.56$

With $v = 10 + 10 - 2 = 18$ degrees of freedom at the 5% level, the critical value from the t-table is 2.101. Since 1.56 < 2.101, the null hypothesis $H_0: \mu_1 - \mu_2 = 0$ cannot be rejected.
This means that the two unknown population means are assumed to be the same.

Useful alternative tests
o In problems involving one or two population means, t-methods are usually appropriate, but non-parametric methods are often good alternatives.
o Non-parametric methods have the advantage of requiring fewer assumptions, but they are less powerful than t-methods (see Siegel and Castellan, 1998).
o The main difference between them is that t-methods deal with means, while non-parametric methods are concerned with medians.
o ANOVA (analysis of variance) compares more than two groups simultaneously. The method rests on the ratio of systematic variance to unsystematic variance. In ANOVA, the following are computed:
  - the total variation, by comparing each observation with the grand mean;
  - the between-group variation, by comparing the treatment means with the grand mean;
  - the within-group variation, by comparing each score in a group with the group mean.
o Recall MANOVA (multivariate analysis of variance), which, unlike ANOVA, has more than one dependent variable.

Comparison of more than two groups
o Example 3: Three advertising campaigns are tested in 24 randomly selected cities comparable in size and demographics. The ANOVA output is as follows:

Source          Sum of squares   Degrees of freedom   Mean square   F-ratio
Between group   49.0             2                    24.5          5.88
Within group    87.5             21                   4.17
Total           136.5            23
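As a quick arithmetic check, the test statistics of Examples 1 to 3 can be recomputed from the summary figures quoted in the slides. The following is a minimal Python sketch rather than part of the original material: the helper names `t_statistic` and `f_ratio` are illustrative, the standard errors (4.55 and 8.15) are taken as stated in the examples, and the critical values are the table values cited in the text.

```python
def t_statistic(estimate, hypothesized, standard_error):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / standard_error


def f_ratio(ss_between, df_between, ss_within, df_within):
    """F = between-group mean square / within-group mean square."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within


# Example 1: one mean, one-tailed test (standard error as stated on the slide).
t1 = t_statistic(109.4, 100, 4.55)
reject_example1 = t1 > 1.83          # 1.83: one-tailed 5% critical value, 9 d.f.

# Example 2: difference between two means, two-tailed test.
t2 = t_statistic(403.0 - 390.3, 0, 8.15)
reject_example2 = abs(t2) > 2.101    # 2.101: two-tailed 5% critical value, 18 d.f.

# Example 3: ANOVA F-ratio from the sums of squares and degrees of freedom.
f3 = f_ratio(49.0, 2, 87.5, 21)
reject_example3 = f3 > 3.47          # 3.47: 5% critical value of F with (2, 21) d.f.

print(round(t1, 2), reject_example1)
print(round(t2, 2), reject_example2)
print(round(f3, 2), reject_example3)
```

Computing the mean squares from the sums of squares, instead of typing them in, also makes it easy to cross-check every cell of the ANOVA table.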
Example 3
a) Construct the hypotheses.
b) Find the F-value and decide whether it is significant.
c) Comment on the F-value.

Answer:
a) $H_0: G_1 = G_2 = G_3$ (the group means are equal), $H_1$: at least one group mean differs. Total d.f. = 24 - 1 = 23, between-group d.f. = 3 - 1 = 2, within-group d.f. = 23 - 2 = 21.
b) $F_{calculated} = 24.5 / 4.17 = 5.88$. The critical value has $(k - 1, n - k) = (3 - 1, 24 - 3) = (2, 21)$ degrees of freedom. From the F-distribution, $F_{critical} = 3.47$ at the 5% level.
c) Since 5.88 is greater than 3.47, we reject the null hypothesis that the group means are equal and accept the alternative hypothesis that the advertising campaigns vary in effectiveness.

Relationship between variables
o In research, we are often preoccupied with whether there is a relationship between variables, that is, whether two or more variables covary.
o Correlation coefficient: based on the Pearson criterion, it examines the strength of the linear relationship between two variables, for example x and y.
o Theoretically, the correlation coefficient can take values from -1 to 1. A coefficient of 1 tells us that two variables covary perfectly and positively, whereas -1 shows that they are perfectly inversely related. A value close to 0 indicates that the variables are not linearly related.
o The formula for the correlation coefficient is as follows:

$r_{XY} = \frac{\sum (x_i - \bar{X})(y_i - \bar{Y})}{\sqrt{\sum (x_i - \bar{X})^2 \sum (y_i - \bar{Y})^2}}$

where $\bar{X}$ and $\bar{Y}$ represent the sample means of X and Y.

o A correlation coefficient shows covariation between two variables, not that the variables are causally related. The square of the correlation coefficient is the coefficient of determination.
$R^2 = \text{Explained variation} / \text{Total variation}$

o Example 4 (partial correlation): Using Table 1, calculate the relationship between advertisement recognition, appeal and sex. In other words, is the relationship between advertisement recognition and appeal influenced by controlling for sex?

o This is a partial correlation. With ad recognition as variable 1, appeal as variable 2 and sex as variable 3, the partial correlation coefficient $r_{12.3}$ is:

$r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{13}^2} \sqrt{1 - r_{23}^2}}$

$r_{\text{ad rec, appeal} \cdot \text{sex}} = \frac{0.24 - (-0.33)(0.09)}{\sqrt{1 - (-0.33)^2} \sqrt{1 - (0.09)^2}} = 0.29$

o This shows that, controlling for sex, the observed relationship between ad recognition and appeal remains positive and is strengthened (from 0.24 to 0.29).

Explaining and predicting relationship between variables
o Explaining and predicting relationships between variables are important tasks in business research. One of the most widely applied and useful approaches to examining relationships between variables is regression analysis. In regression analysis we fit the model that best describes the data by applying the method of least squares. More precisely, we fit the straight line that minimizes the sum of squared vertical deviations from that line, as shown in the following figure.
o Single linear regression:

$Y_i = a_0 + a_1 x_i + e_i$

where Y = the outcome variable, x = the predictor variable, $a_1$ = the slope of the straight line fitted to the data, $a_0$ = the intercept of the line, and $e_i$ = the difference between the score predicted and the score actually obtained, called the residual.
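The least-squares fit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six data points are hypothetical (they are not the data from the slides), and `fit_line` and `r_squared` are illustrative names. The slope and intercept use the standard closed-form least-squares estimates for a single predictor.

```python
from statistics import mean


def fit_line(x, y):
    """Return (a0, a1) minimizing the sum of squared residuals of y = a0 + a1*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    a1 = sxy / sxx               # slope
    a0 = y_bar - a1 * x_bar      # intercept
    return a0, a1


def r_squared(x, y, a0, a1):
    """Coefficient of determination: explained variation / total variation."""
    y_bar = mean(y)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_resid / ss_total


# Hypothetical observations, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2, 3, 5, 4, 6, 7]

a0, a1 = fit_line(x, y)
residuals = [yi - (a0 + a1 * xi) for xi, yi in zip(x, y)]  # the e_i terms
r2 = r_squared(x, y, a0, a1)

print(f"intercept a0 = {a0:.3f}, slope a1 = {a1:.3f}, R^2 = {r2:.3f}")
```

With an intercept in the model the residuals sum to zero, which is a convenient sanity check on any hand calculation.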
Single linear regression

Figure 1 The linear model

Example 5
o Assume that a car dealer collects data for six months on four variables: TV advertising, print advertising, competitors' advertising and sales. Y is sales. The car dealer expects car sales to be positively correlated with TV-ads and print-ads and negatively correlated with competitors' ads.

Table 2 Data matrix

Simple linear regression output
o Based on the information in Table 3, comment on the estimated coefficient, the t-ratio and the R² for TV-ads.

Table 3 Simple linear regression output

Example 5: Answer
o The estimated constant term of 0.7 shows that if the dealer does not use TV-ads at all (TV-ads = 0), the estimated expected value of car sales is 0.7 units, that is, 7 cars. The estimated regression coefficient of sales on TV-ads is 0.9. This coefficient shows that if TV-ads is increased by 1 unit, the estimated expected value of car sales increases by 0.9 units, that is, nine cars. The R² of 85.3 percent means that the sample coefficient of determination is equal to 0.853.
Practically speaking, this means that the variation in TV-ads explains 85.3 percent of the variation in the dependent variable, car sales. The estimated t-value on TV-ads is 4.81, which is greater than 2 (the rule-of-thumb value from the t-distribution), so it is significant at the 5% and 1% levels. We can therefore reject the null hypothesis that the corresponding population regression coefficient is equal to zero. The conclusion is that TV-ads and sales are significantly related to each other, or that TV-ads have a positive impact on sales.

Assumptions in regression analysis
o The expected value of the error term is zero.
o The variance of the error term is constant for each X. This is termed homoscedasticity. If the variance of e varies with X, this is termed heteroscedasticity.
o The errors for the observations are uncorrelated.
o e should be normally distributed for each X.
o The error term should not be correlated with x: corr(e, x) = 0.
o It is also a common assumption that the regression model is linear in its parameters.

Correlation coefficients output

Example 6
o Consider again the car dealer who collects data for six months on TV advertising, print advertising, competitors' advertising and sales (Y), and who expects car sales to be positively correlated with TV-ads and print-ads and negatively correlated with competitors' ads. Use the concept of the correlation coefficient to explain the relationships between the variables under inspection, based on the information given in Table 4.

Table 4 Correlation coefficients output
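For readers who want to see how pairwise and partial correlation coefficients of this kind are obtained, here is a minimal Python sketch. It rests on several assumptions: the `tv_ads` and `sales` series are hypothetical stand-ins (the data matrix itself is not reproduced in this text), `pearson_r` and `partial_r` are illustrative names, and in the Example 4 check the negative sign on the 0.33 correlation is an assumption, chosen because the reported result of 0.29 is only reproduced when the product of the two control correlations is negative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: co-deviation of x and y over the root of the product of squared deviations."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def partial_r(r12, r13, r23):
    """Correlation between variables 1 and 2, controlling for variable 3."""
    return (r12 - r13 * r23) / (sqrt(1 - r13 ** 2) * sqrt(1 - r23 ** 2))


# Hypothetical six-month series, for illustration only.
tv_ads = [1, 2, 3, 4, 5, 6]
sales = [2, 3, 5, 4, 6, 7]
r_tv_sales = pearson_r(tv_ads, sales)

# Example 4 figures; the sign of -0.33 is an assumption (see the note above).
r_example4 = partial_r(0.24, -0.33, 0.09)

print(f"r(tv_ads, sales) = {r_tv_sales:.3f}")
print(f"partial r (Example 4) = {r_example4:.2f}")
```

Applying `pearson_r` to every pair of columns in a data matrix gives a full correlation table of the kind referred to in Example 6.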
Example 6: Answer
o The correlations between car sales (the dependent variable) and TV advertising, print advertising and competitors' advertising (the explanatory variables) are expected to be high. The correlations among the explanatory variables themselves are expected to be low. A high correlation coefficient between, for example, TV advertising and print advertising indicates a high degree of multicollinearity, which distorts the estimation results. To remedy this, the relevant variable can be dropped from the regression equation. For example, the correlation between sales and TV-ads is 0.92, which is a high score, while the correlation between sales and comp-ads is 0.155, which is a very low score.

Multiple regression
o In multiple regression, two or more independent (explanatory) variables are used to explain or predict the dependent variable. The purpose is to make the model more realistic, control for other variables, explain more of the variance in the dependent variable, and reduce the residuals. The following is a typical example of multiple regression output.

Table 5 Multiple regression output

Dummy variables
o In a multiple regression, a dummy variable can be used in two ways. It can be the dependent variable, taking the value 1 or 0 (also called dichotomous), or it can be an independent variable taking the value 0 or 1. A dummy variable is used in an analysis when a variable does not exist as numerical values.
For example, the seasons in the following table form a nominally scaled variable that cannot be ranked, so to be used in a regression analysis the seasons need to be assigned numbers.

Table 6 Coding of dummy variable

Example 7
o In Table 6 there are three new variables, A, B and C, which represent the four seasons as different combinations of zeros and ones. Assume that the following regression model for sales of women's clothing, in which the price (P) is also included, has been estimated:

Sale = 1000 - 0.5P + 100A - 20B - 50C

a) Calculate the sales in the summer, taking the dummy variables into account (P = $200).
b) Calculate the sales in the autumn, taking the dummy variables into account (P = $200).
c) Compare the sales in winter and spring at the same price.

Example 7: Answer
a) Summer: Sale = 1000 - 0.5(200) + 100(1) - 20(0) - 50(0) = $1000
b) Autumn: Sale = 1000 - 0.5(200) + 100(0) - 20(1) - 50(0) = $880
c) Winter and spring at the same price:
Winter: Sale = 1000 - 0.5(200) + 100(0) - 20(0) - 50(1) = $850
Spring: Sale = 1000 - 0.5(200) + 100(0) - 20(0) - 50(0) = $900
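The calculations in Example 7 amount to evaluating the estimated equation once per season. The sketch below does this in Python; it is an illustration only. The coding in `SEASON_DUMMIES` is inferred from the worked answers (summer A = 1, autumn B = 1, winter C = 1, spring as the baseline with all dummies at zero), and the function name `predicted_sales` is hypothetical.

```python
# (A, B, C) dummy coding per season, inferred from the worked answers.
SEASON_DUMMIES = {
    "summer": (1, 0, 0),
    "autumn": (0, 1, 0),
    "winter": (0, 0, 1),
    "spring": (0, 0, 0),  # baseline season: all dummies are zero
}


def predicted_sales(price, season):
    """Evaluate Sale = 1000 - 0.5P + 100A - 20B - 50C for one season."""
    a, b, c = SEASON_DUMMIES[season]
    return 1000 - 0.5 * price + 100 * a - 20 * b - 50 * c


sales_by_season = {s: predicted_sales(200, s) for s in SEASON_DUMMIES}
winter_vs_spring = sales_by_season["winter"] - sales_by_season["spring"]

for season, value in sales_by_season.items():
    print(f"{season}: {value:.0f}")
print(f"winter minus spring: {winter_vs_spring:.0f}")
```

Because spring is the baseline, each dummy coefficient is the difference in sales between its season and spring at the same price; the winter and spring comparison therefore equals the coefficient on C.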
