SPSS EXERCISES
For Marketing Research Essentials, 4e, by Carl McDaniel and Roger Gates
SPSS Exercises prepared by Joe Cangelosi, University of Central Arkansas

INTRODUCTION TO STUDENTS

SPSS is recognized as one of the leading software packages for statistical analysis. For about the last 5-7 years, it has been packaged with marketing research texts as an ancillary resource. However, there has not been an organized attempt to integrate SPSS with the marketing research course. The objective of these SPSS Exercises is to do just that: integrate the use of SPSS into the Marketing Research course, resulting in a significant data management component.

The need for a significant data management component came from several sources. First, the University of Central Arkansas College of Business has a Business Advisory Council, which advises the college on curriculum, and it recommended that graduates have stronger data management skills. Second, as Faculty Advisor for the UCA Marketing Club and a member of the Central Arkansas Professional Chapter of the American Marketing Association, I am hearing from marketing professionals that marketing majors need better data management skills.

Given the preceding, I decided to develop computer-based exercises using SPSS, with the objective of increasing student proficiency in data analysis and data management. The SPSS Exercises correspond to the data analysis chapters in McDaniel & Gates, Marketing Research Essentials, 4e. The SPSS software is very user-friendly, and the mechanics of its use can be integrated into daily class sessions. Lastly, SPSS provides a number of learning resources. Their web site address is www.spss.com.

Soft Drink & Beverage Consumption Questionnaire

1. Do you drink soft drinks? _____YES(1) _____NO(0)
   If "NO", go to number 11.

2. What percent of your soft drink consumption is:
   a. Drinks with sugar _____%
   b. Drinks without sugar (diet) _____%

3. What percent of your soft drink consumption is:
   a. Drinks with caffeine _____%
   b. Drinks without caffeine _____%

4. What percent of your soft drink consumption is:
   a. Your favorite soft drink _____%
   b. Your 2nd favorite soft drink _____%
   c. Other brands of soft drink _____%

5. On the average, how many soft drinks do you consume weekly? (Use the equivalent of 12 oz. cans.) ___________ 12 oz. cans

6. What is your favorite soft drink? ________________________________

7. Indicate the extent to which you agree or disagree with each of the following statements using the scale below:

   1 = strongly disagree   2 = disagree   3 = indifferent   4 = agree   5 = strongly agree

   ___a. Soft drinks really give me a lift during the day.
   ___b. I am hooked on soft drinks.
   ___c. Diet soft drinks give me a headache.
   ___d. When I was last on a diet, one of the lifestyle changes I made was not to drink soft drinks with sugar.
   ___e. Advertising has nothing to do with my choice of soft drink.
   ___f. I prefer ice tea to soft drinks.
   ___g. Soft drink TV commercials have gotten funnier over the past five years.
   ___h. I think the beer TV commercials are better than the soft drink TV commercials.
   ___i. Soft drinks are bad for a person's health.
   ___j. On an average day, I consume more ounces of soft drinks than water.
   ___k. In general, soft drinks taste better than beer.

8. Indicate which of the following beverages you would prefer to consume for each of the following occasions:

   1=soft drink   2=water   3=beer   4=Gatorade or equivalent   5=ice tea   6=mixed drink   7=coffee   8=other   9=not applicable

   _____ a. You just mowed the grass.
   _____ b. After working out with weights.
   _____ c. After jogging or running.
   _____ d. Eating at a formal restaurant.
   _____ e. Eating fried catfish, shrimp or other seafood.
   _____ f. Eating lobster, crab legs or crawfish.
   _____ g. Discussing life with your girl/boy friend or spouse or (to be politically correct) significant other.
   _____ h. Having an intellectual discussion about religious beliefs.
   _____ i. Having a passionate discussion about passionate things.
   _____ j. You just got in from work and want to relax.

9. Using a scale of 1=very inactive, 2=somewhat inactive, 3=somewhat active and 4=very active, how physically active do you consider yourself? ________

10. Using a scale of 1=very inactive, 2=somewhat inactive, 3=somewhat active and 4=very active, how socially active do you consider yourself? ________

DEMOGRAPHIC/CLASSIFICATION INFORMATION

11. Ethnic Group: ____Caucasian(1) ____African-American(2) ____Asian(3) ____European(4) ____Other(5)

12. Gender: _____Female(1) _____Male(0)

13. Classification: ___Freshman(1) ___Sophomore(2) ___Junior(3) ___Senior(4) ___Grad Student(5)

14. Age: ___0-18(1) ___19-20(2) ___21-22(3) ___23-25(4) ___26-30(5) ___over 30(6)

THANK YOU VERY MUCH FOR YOUR COOPERATION

Instructions for having the soft drink questionnaire filled out correctly:

1. Fill out all questions.
2. Percentages in questions 2, 3, and 4 should equal 100%, i.e., if 90% of your soft drink consumption is with sugar, then 10% is without sugar, so the two percentages total 100%.
3. Question 5 simply requests the number of equivalent 12-ounce cans of soft drink consumed in an average week.
4. Question 6 simply requests the name of your favorite soft drink. Give only one name.
5. Question 7: just put the number corresponding to the scale in the blank to the left of each statement.
6. Question 8: simply choose your preference from the possibilities listed. Please indicate only one choice for each occasion.
7. Questions 9 & 10 are scale questions. The scales range from 1 (very inactive) to 4 (very active). Simply enter a number from 1 to 4 that indicates "how active" you perceive yourself physically (Q9) and socially (Q10).
8. For the demographic questions (Q11-Q14), simply check one of the choices in each question.
The SPSS Program Template:

Students should understand that there is nothing sacred about the variable names, labels or value labels in the template below. The reason for providing the template is so that when the professor merges the student databases into one large database, there will be consistency in how the database is set up. This is especially critical regarding the computer coding of soft drinks in Question #6, which is an open-ended question. Provide students with the following template for developing an SPSS database:

Variable Name: Variable Label [Value Labels]

Q1: Do you drink soft drinks [1=yes, 0=no]
Q2a: % soft drinks with sugar
Q2b: % soft drinks without sugar
Q3a: % soft drinks with caffeine
Q3b: % soft drinks without caffeine
Q4a: % of soft drinks consumed are favorite soft drink
Q4b: % of soft drinks consumed are 2nd favorite soft drink
Q4c: % of soft drinks consumed are not 1st or 2nd favorite
Q5: Average weekly consumption of soft drinks (12 oz cans)
Q6: Name of favorite soft drink [1=coca cola, 2=diet coke, 3=pepsi, 4=diet pepsi, 5=dr pepper, 6=diet dr pepper, 7=sprite, 8=diet sprite, 9=7up, 10=diet 7up, 11=mountain dew, 12=diet mountain dew, 13=root beer, 14=diet root beer, 15=orange soda, 16=grape soda, 17=other]

Q7a: Soft drinks really give me a lift during the day
Q7b: I am hooked on soft drinks
Q7c: Diet soft drinks give me a headache
Q7d: Last diet - quit sugar soft drinks
Q7e: Advertising doesn't affect my choice of soft drink
Q7f: I prefer ice tea to soft drinks
Q7g: Soft drink commercials have gotten funnier - past 5 years
Q7h: Beer TV commercials better than soft drink TV commercials
Q7i: Soft drinks are bad for a person's health
Q7j: On an average day, I consume more ounces of soft drink than water
Q7k: In general, soft drinks taste better than beer
All of the questions in #7 use the same value labels: [1=strongly disagree, 2=disagree, 3=indifferent, 4=agree, 5=strongly agree]

Q8a: You just mowed the grass
Q8b: After working out with weights
Q8c: After jogging or running
Q8d: Eating at a formal restaurant
Q8e: Eating fried catfish, shrimp or other seafood
Q8f: Eating lobster, crab, or crawfish
Q8g: Discussing life with your significant other
Q8h: Having an intellectual discussion about religious beliefs
Q8i: Having a passionate discussion about passionate things
Q8j: Just got in from work and want to relax
All of the questions in #8 use the same value labels: [1=soft drink, 2=water, 3=beer, 4=Gatorade type, 5=ice tea, 6=mixed drink, 7=coffee, 8=other]

Q9: How physically active [1=very inactive, 2=somewhat inactive, 3=somewhat active, 4=very active]
Q10: How socially active [1=very inactive, 2=somewhat inactive, 3=somewhat active, 4=very active]
Q11: Ethnic group [1=Caucasian, 2=African-American, 3=Asian, 4=European, 5=other]
Q12: Gender [1=female, 0=male]
Q13: Classification [1=freshman, 2=sophomore, 3=junior, 4=senior, 5=graduate student]
Q14: Age [1=0 to 18, 2=19 to 20, 3=21 to 22, 4=23 to 25, 5=26 to 30, 6=over 30]

SPSS Exercise #1

OBJECTIVE: Machine Cleaning Data. To get students to correct errors made by incorrect entries into the database.

Textbook Reference: Pages 329 to 330 and 331 to 333.

Instructions: Using the analyze/descriptive statistics/frequencies sequence, produce one-way frequency tables for all of the variables in the database except QNO (questionnaire number). Inspect the output in each table very closely.

a. Are any of the values in the tables not consistent with the computer coding in the questionnaire?
b. Do the percentage totals for questions 2, 3 and 4 equal 100%? You can use the transform/compute sequence to create arithmetic variables for questions 2, 3, and 4 (Q2a + Q2b, Q3a + Q3b, and Q4a + Q4b + Q4c).
c. Are the value labels correctly indicated in the output? (You will have value labels for questions 1, 6, 7a-7k, 8a-8j, 9, 10, and all of the demographic questions, 11-14.) I suggest double-checking Q6, as these are open-ended codes.
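For readers who want to see the logic of checks (a) and (b) outside of SPSS, the following is a minimal Python sketch, not part of the SPSS exercises themselves. The three cases are invented for illustration, the function names are hypothetical, and only Q2a, Q2b and Q12 are included; the check-variable name q2check is borrowed from the helpful hints for this exercise.

```python
from collections import Counter

# Hypothetical mini-database: each dict is one questionnaire (case).
# Variable names follow the template; the values are invented for illustration.
cases = [
    {"QNO": 1, "Q2a": 90, "Q2b": 10, "Q12": 1},
    {"QNO": 2, "Q2a": 50, "Q2b": 50, "Q12": 0},
    {"QNO": 3, "Q2a": 70, "Q2b": 20, "Q12": 2},  # two deliberate entry errors
]

# Allowed codes per the template: Q12 is 1=female, 0=male.
VALID_CODES = {"Q12": {0, 1}}


def frequency_table(cases, variable):
    """One-way frequency table: value -> (count, percent of cases)."""
    counts = Counter(case[variable] for case in cases)
    n = len(cases)
    return {value: (count, 100.0 * count / n) for value, count in sorted(counts.items())}


def invalid_code_errors(cases, valid_codes):
    """List (QNO, variable, incorrect value) for codes outside the template."""
    return [
        (case["QNO"], var, case[var])
        for case in cases
        for var, allowed in valid_codes.items()
        if case[var] not in allowed
    ]


def percent_sum_errors(cases, parts, check_name):
    """List (QNO, check variable, total) where the percentage parts do not total 100."""
    errors = []
    for case in cases:
        total = sum(case[part] for part in parts)
        if total != 100:
            errors.append((case["QNO"], check_name, total))
    return errors


print(frequency_table(cases, "Q12"))
print(invalid_code_errors(cases, VALID_CODES))
print(percent_sum_errors(cases, ["Q2a", "Q2b"], "q2check"))
```

Each error is reported as a (questionnaire number, variable, incorrect value) triple, which mirrors the three columns of the error-compilation table used in this exercise.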
Use a table like the one below as an instrument to compile input errors, so that corrections can be made.

Observation/questionnaire number    Variable containing error    Incorrect value

HELPFUL HINTS FOR SPSS Exercise #1:

Instructions concerning machine cleaning using either database alternative. The database will contain 2 types of errors:

1. The first is the data entry mistake. For instance, a "2" is entered for male instead of a "0", or some similar input typo. These types of mistakes are easily found upon examination of the frequency distributions. Then, by going back to the database, positioning the cursor on the variable with the problem, keying CTRL-F and entering the error value, the student will be able to locate the case containing the error. The professor will have all of the original questionnaires, and be able to provide the correct responses for students. If using the Website database, the professor can provide the correct responses once students have identified the errors. Note the table above that can be used for error correction.
2. The most common error will be found in questions 2, 3 and 4, where the percentages do not add up to 100%. Again, once students find such errors, the professor can provide the correct responses (website database) or discuss how to handle such an input error (student-created database). To find such errors, students will need to use the Transform/compute sequence to create an additional variable. For instance, for question 2, create the variable q2check, which is the result of adding q2a + q2b. If the total equals 100, the question was answered correctly. Otherwise, there is an error.

SPSS Exercise #2

OBJECTIVE: To get students to answer questions based on the results from the frequency distributions generated from SPSS Exercise #1.

Textbook Reference: Pages 331 to 333.

Answer each of the following questions:

1. What percentage of all respondents drink soft drinks? __________%
2. Produce a table indicating the top 5 favorite brands of soft drink with the percentage of respondents drinking each. Always express the results of your tables in descending order. For example:

   Brand of Soft Drink              Percentage of Respondents
   Dr. Pepper                       21.9%
   2nd favorite soft drink, etc.    18.2%

3. What percentage of respondents "strongly agree" with question 7a? __________%
4. What percentage of respondents "strongly disagree" with question 7k? __________%
5. Produce a table indicating the most popular beverage for each of the occasions in question 8. Also indicate the percentage of respondents preferring that particular beverage. For example (your table will have a most popular beverage for each of the 10 occasions):

   Question                          Most popular beverage    Percent preferring
   You just mowed the grass          Beer                     75%
   After working out with weights    Gatorade                 82%

6. Which is the second most consumed beverage after:
   a. Mowing the grass ___________________________   % Preferring __________%
   b. Working out with weights _____________________   % Preferring __________%
   c. Jogging or running ___________________________   % Preferring __________%

SPSS Exercise #3

OBJECTIVE: To perform an analysis of the demographic characteristics of your database using frequency distributions generated in SPSS Exercise #1.

Textbook References: Pages 331 to 333, 335 to 338.

Instructions:

1. Evaluate questions #11 through #14. These 4 questions constitute the demographics of the survey.
2. Display the demographic data in a user-friendly format such as tables.
3. For each demographic variable, illustrate the table results using some type of graphic representation of the data (charts, graphs, etc.).

SPSS Exercise #4

OBJECTIVE: This exercise deals with crosstabulation analysis. The objectives are to get students to:
a. perform crosstabulation analysis,
b. correctly read data from the crosstabulation matrix,
c. determine whether or not the sample results can be generalized to the population under study via the use of the chi-square test for independent samples.

Textbook Reference: Pages 333 to 335, and 342 to 353.

Instructions:

1. Use the analyze/descriptive statistics/crosstab sequence to obtain crosstab results. In addition, click on the "cell" button and make sure the observed, expected, total, row, and column boxes are checked. Then, click on the "statistics" button and check the chi-square box. Once you run the analysis, on the output for the chi-square analysis, you will only need the Pearson chi-square statistic to assess whether or not the results of the crosstab are statistically significant.
2. In this exercise we are assessing whether or not persons who drink soft drinks are different from those who don't drink soft drinks regarding demographic characteristics. Invoke the crosstab analysis for the following pairs of variables:
   a. Q1 & Q11
   b. Q1 & Q12
   c. Q1 & Q13
   d. Q1 & Q14

Answer the following questions:

1. What % of males don't drink soft drinks? _________%
2. What % of all respondents are female and drink soft drinks? _________%
3. What % of persons not drinking soft drinks are female? _________%
4. Which classification group drinks soft drinks the most? ________________________
5. Which age group drinks soft drinks the most? ________________________
6. Evaluate the chi-square statistic in each of your crosstab tables. Construct a table to summarize the results, similar to the example below.

   Variables    Pearson Chi-Square    Degrees of Freedom    Explanation
   Q1 & Q11     1.67                  3                     Based on our sample results, we cannot conclude that in the population under study the tendency to drink or not drink soft drinks varies significantly by ethnic orientation.
   Q1 & Q12     2.84                  1                     We can be 90% confident, based on our sample results, that in the population under study males differ significantly from females in their tendency to consume or not consume soft drinks.
   Q1 & Q13
   Q1 & Q14

Notes on the Chi-Square Test for Independent Samples:

Note the SPSS chi-square output below.

do you drink soft drinks? * gender Crosstabulation

                                                  gender
do you drink soft drinks?                         male      female    Total
no      Count                                     21        43        64
        % within do you drink soft drinks?        32.8%     67.2%     100.0%
        % within gender                           9.3%      14.1%     12.1%
        % of Total                                4.0%      8.1%      12.1%
yes     Count                                     204       262       466
        % within do you drink soft drinks?        43.8%     56.2%     100.0%
        % within gender                           90.7%     85.9%     87.9%
        % of Total                                38.5%     49.4%     87.9%
Total   Count                                     225       305       530
        % within do you drink soft drinks?        42.5%     57.5%     100.0%
        % within gender                           100.0%    100.0%    100.0%
        % of Total                                42.5%     57.5%     100.0%

Chi-Square Tests

                                Value     Df    Asymp. Sig.    Exact Sig.    Exact Sig.
Pearson Chi-Square              2.769(b)  1     .096
Continuity Correction(a)        2.338     1     .126
Likelihood Ratio                2.835     1     .092
Fisher's Exact Test                                            .106          .062
Linear-by-Linear Association    2.764     1     .096
N of Valid Cases                530

a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 27.17.

Managerial interpretation of SPSS Chi-Square output: for purposes of determining significant differences in a crosstabulation analysis via the Chi-Square test, use only the Pearson Chi-Square row in the above table, i.e., Pearson Chi-Square/Value/Df/Asymp. Sig.

Purpose: The purpose of the Chi-Square test for "K" independent samples for crosstabulation analysis is to determine if significant differences exist. For example, in the crosstab table above, the question is "based on these sample results, can we generalize to the population under study that males differ significantly from females by likelihood to drink or not drink soft drinks?"

Interpretation: If the Chi-Square test is significant (Asymp. Sig. is not greater than .10 or .05, i.e., 90% or 95% confidence that a significant relationship exists in the population), then the analyst can use the percentages in the crosstab matrix to determine how much more or less likely males are than females to drink or not drink soft drinks.
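To show where the Pearson Chi-Square row comes from, the sketch below recomputes it in plain Python from the four observed counts in the crosstab above. This is an illustrative calculation, not SPSS's own procedure; the function names are hypothetical, and the p-value shortcut used here is valid only for 1 degree of freedom (a 2x2 table).

```python
import math

# Observed counts from the "do you drink soft drinks? * gender" crosstab.
# Rows: no, yes. Columns: male, female.
observed = [[21, 43],
            [204, 262]]


def pearson_chi_square(table):
    """Return (chi-square, degrees of freedom, expected counts) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count for each cell = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


chi2, df, expected = pearson_chi_square(observed)
sig = p_value_df1(chi2)
min_expected = min(min(row) for row in expected)

print(f"Pearson chi-square = {chi2:.3f}, df = {df}, Asymp. Sig. = {sig:.3f}")
print(f"Minimum expected count = {min_expected:.2f}")
```

Run on these counts, the sketch reproduces the values in the SPSS output: a Pearson chi-square of 2.769 with 1 degree of freedom, Asymp. Sig. of .096, and a minimum expected count of 27.17. Since .096 is not greater than .10, the result is significant at the 90% confidence level but not at 95%.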
SPSS Exercise #5

OBJECTIVE: To invoke the t-test to evaluate differences in the consumption patterns of males versus females. The t-test (or z-test) compares the means by category groupings, for example males versus females or high income versus low income, and computes the probability that the sample results can be generalized to the population from which the sample was drawn. SPSS calls the categories groupings.

OBJECTIVE: To invoke the 1-Way Analysis of Variance Test (ANOVA) to evaluate differences in the consumption patterns by Classification (Freshman, Sophomore, Junior, Senior, Graduate Student, Other). The ANOVA test evaluates for significant differences in consumption patterns for more than two categories or groupings. SPSS calls them FACTORS.

Textbook Reference - T/Z-Test: Pages 342 to 353 (See Table 12.12 on page 346)

Instructions: T/Z-Test

Note: In statistics, if a sample has fewer than 30 observations or cases, then we invoke a T-test. If there are 30 or more cases, then we invoke a Z-test. SPSS calls both a T-test.

Use the analyze/compare means/independent samples t-test sequence to invoke the t-test. For this exercise we are going to compare male and female soft drink consumption for each of the following variables:

Q2A - % of soft drink consumption with sugar
Q3A - % of soft drink consumption with caffeine
Q4A - % of soft drink consumption with favorite soft drink
Q5 - weekly soft drink consumption
Q9 - self-perception of how physically active
Q10 - self-perception of how socially active

SPSS calls the variables being analyzed "test variables." The variable we are using to compare responses is the "grouping variable," which in our analysis is gender. Under grouping variable, you will need to input the values for male (0) and female (1). On the output page, read across for each variable the line that says "equal variances assumed." Notice the significance (Sig.) associated with the "F" test (variances) and the "T" test (means). If a Sig. value is less than or equal to .10 (90% confidence), then we conclude that the corresponding variances or means are significantly different.

Questions To Answer: T-Test

1. Produce a table to summarize the T-test results. An example is as follows:

   Variables       Variance Prob of Sig diff    Means Prob of Sig diff    Interpretation of Results
   Q2A & Gender    .000                         .003                      Over 99% confident, based on our sample results, that in the population under study males differ significantly from females concerning the % of soft drinks they drink with sugar.
   Q3A & Gender    .176                         .833                      We cannot conclude from our sample results that in the population under study males and females differ regarding the % of soft drinks consumed with caffeine.

2. Summarize in a sentence or two the results of your table. What can you say about males versus females?
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________

Instructions: ANOVA Test

Use the analyze/compare means/one-way ANOVA sequence to invoke the One Way ANOVA test. The dependent variables will be Q2a, Q3a, Q4a, Q5, Q9, and Q10. The FACTOR will be Q13 (classification). On the output page notice "F" and "Sig.", which are the computed F-value and the probability of insignificance. 1 - Sig. is the level of confidence that the relationships found in the sample results will also be found in the population under study. Remember, in marketing research we must be at least 90% confident of the results. Hence, if Sig. is greater than .10, then the differences in means for the factors in question will not be significant in the population.

Questions to Answer: ANOVA Test

1. Construct a table patterned after the one below to summarize your ANOVA results:

   Variables                                              Degrees of Freedom    F-Value    Probability of Insignificance    Interpretation of Results
   Q2a (% soft drinks with sugar) & Q13 (classification)  4, 261                2.434      .048                             95.2% confident, based on the sample results, that in the population under study students differ significantly by classification concerning the percentage of the soft drinks they drink with sugar.

2. Summarize in a sentence or two the results of your table. What can you say about soft drink consumption by classification?
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________

Technical Notes on the T-Test:

What is the objective of the T-Test? The t-test is an inferential statistical test. The results of a t-test yield a probability that the differences observed in the sample data can be generalized to the population from which the data was drawn. The t-test measures for significant differences across 2 categories for a parametric mean or proportion. In other words, suppose we wanted to know if males and females responded differently to a Likert scale question. The t-test for means would compute a mean and standard deviation for males and for females. Then it would compare them. The resulting probability would be the likelihood that the differences observed in the sample would also be observed in the population from which the sample was drawn.

Assumptions of the T-Test for Independent Samples:

That the groups being compared are independent of each other. Observations are independent when information about one is unrelated to the other.
The test variables must be interval or ratio scale.
The grouping variable must be discrete, i.e., discrete categories such as male & female, drink soft drinks or do not drink soft drinks, etc.

SPSS Output Example for T-Test:

Group Statistics: % soft drinks with sugar

gender    N      Mean     Std. Deviation    Std. Error Mean
male      204    79.78    31.816            2.228
female    262    66.87    40.071            2.476

Levene's Test for Equality of Variances: % soft drinks with sugar & gender

                               F         Sig.
equal variances assumed        46.149    .000
equal variances not assumed

t-test for Equality of Means: % soft drinks with sugar & gender

                               t        df         Sig. (2-tailed)    Mean Difference
equal variances assumed        3.771    464        .000               12.92
equal variances not assumed    3.879    463.815    .000               12.92

Interpretation of SPSS T-Test Output:

Levene's Test for Equality of Variances: this test measures the variance among responses within each grouping category. For example, was there more variance among male or female respondents? In the example above, Levene's test indicates that variation among male respondents was significantly different from variation among female respondents.

T-Test for Equality of Means: this test measures whether or not it can be assumed that in the population under study the mean response by one of the groupings (males) is significantly different from the mean response by the other grouping (females). In the example above, Sig. is reported as .000, so we can be more than 99.9% confident that in the population the % of soft drinks consumed with sugar for male respondents will be significantly different from that for female respondents.

Technical Notes on the ANOVA Test:

What is the objective of the ANOVA Test? 1-Way ANOVA is an inferential statistical test which tests for significant differences across the means of 3 or more groupings, compared to the 2 groupings compared in the T-test. The results of the ANOVA test yield a probability that the differences observed in the sample data can be generalized to the population from which the data was drawn. In other words, suppose we wanted to know if average weekly soft drink consumption varies by age group. The ANOVA test computes a mean consumption and variance for each age group (factor) for the dependent variable, Q5, average weekly consumption.

Assumptions of the 1-WAY ANOVA test:

That the groups (factors) being compared are independent of each other.
Observations are independent.
The dependent variables must be interval or ratio scale.
The factor variable must be discrete, i.e., discrete categories such as age categories, income categories, or some limited number of categories to be compared.

SPSS Output Example for ANOVA:

Average Weekly Consumption of 12 oz. Soft Drinks

                  Sum of Squares    df     Mean Square    F        Sig.
Between Groups    371.580           5      74.316         1.011    .410
Within Groups     33805.238         460    73.490
Total             34176.818         465

In the example above, the result of the ANOVA test is that we cannot conclude that soft drink consumption will differ significantly by age category in the population under study. Sig. (the probability of insignificance) is .410 or 41%, far above the .10 maximum that corresponds to the minimum level of confidence for significance in marketing research (90% is the minimum; many consultants prefer at least 95% confidence).

SPSS Exercise #6

OBJECTIVE: To run a correlation analysis. Correlation analysis is a very valuable tool for summarizing the relationship between two variables.

OBJECTIVE: This exercise requires the use of bivariate regression analysis, which involves determining how much of the variation in the dependent variable is explained by the independent variable.

Textbook Reference - Correlation Analysis: Pages 372 to 373 (See additional notes on correlation analysis at the end of this exercise.)

Textbook Reference - Bivariate Regression Analysis: Pages 360 to 371.

Instructions - Correlation Analysis:

1. Use the analyze/correlate/bivariate sequence to obtain correlation results. All of the variables in this correlation analysis utilize at least interval scale data. Hence, use the Pearson's option, which is the default correlation method.

2. Explanation of Correlation Results: In the SPSS output, the top number is the correlation coefficient, which measures the strength and direction of the relationship between the two correlated variables. The second number is the probability that whatever relationship exists between the two correlated variables, that relationship will not be significant in the population under study. The bottom number is simply the number of cases in the analysis in question.

                                                                (Q5) Average weekly consumption    (Q7d) Last diet - quit
                                                                of soft drinks (12 oz cans)        sugar soft drinks
   (Q5) Average weekly consumption of soft drinks (12 oz cans)  1.000                              -.106
                                                                .                                  .022
                                                                466                                466
   (Q7d) Last diet - quit sugar soft drinks                     -.106                              1.000
                                                                .022                               .
                                                                466                                466

   In the results above, there is an inverse correlation between Q5 and Q7d, as indicated by the correlation coefficient of -.106. The probability of insignificance is .022 or 2.2%. Hence, we have a statistically significant (though weak) relationship between the two correlated variables. The bottom number, 466, indicates that the correlation analysis involved 466 pairs of observations.

3. Questions to Answer: In this exercise we are correlating average weekly soft drink consumption with several variables. Hence, correlate the following pairs of variables:
   a. Q2A & Q5
   b. Q3A & Q5
   c. Q5 & Q4a
   d. Q5 & Q7a
   e. Q5 & Q7i
   f. Q5 & Q7k

4. Summarize the results of your correlation analysis using the table below. The information already in the table is only an example.

   Variables    Level of Confidence    Interpret the results if the Correlation is Significant
   Q7d & Q5     97.8%                  We can be 97.8% confident, based on our sample results, that in the population under study persons drinking more soft drinks tended to disagree with the statement, "when I was last on a diet, one of the lifestyle changes I made was not drinking soft drinks with sugar," and vice-versa.

Instructions - Bivariate Regression Analysis:

1. Use the analyze/regression/linear sequence to invoke the bivariate regression procedure.

2. Explanation of SPSS Regression Output:

   a. The Strength of Association: R² (the coefficient of determination)

      R² = explained variation (SSR) / total variation (SST), or
      R² = 1 - unexplained variation (SSE) / total variation (SST)

      SPSS Output - Bivariate Regression

      ANOVA(b)
      Model               Sum of Squares    Df     Mean Square (MS)    F        Sig.
      Regression (SSR)    403.690           1      403.690             5.546    .019(a)
      Residual (SSE)      33773.13          464    72.787
      Total (SST)         34176.82          465
      a. Predictors: (Constant), % soft drinks with sugar
      b. Dependent Variable: Average weekly consumption of soft drinks (12 oz cans)

      Model Summary
      Model    R          R²      Adjusted R²    Std. Error of the Estimate
      1        .109(a)    .012    .010           8.53
      a. Predictors: (Constant), % soft drinks with sugar
      b. Dependent Variable: Average weekly consumption of soft drinks (12 oz cans)

   b. Statistical Significance of Regression Results

      F = MSR/MSE = 403.69/72.787 = 5.546
      Sig. = .019 (We can be 98.1% confident that the regression results from our sample will be statistically significant in the population under study.)

   c. The Regression Line

      Coefficients(a)
                                    Unstandardized Coefficients    Standardized Coefficients
      Model 1                       B           Std. Error         Beta                         t         Sig.
      (Constant)                    11.376      .867                                            13.128    .000
      % soft drinks with sugar      -2.5E-02    .011               -.109                        -2.355    .019
      a. Dependent Variable: Average weekly consumption of soft drinks (12 oz cans)

      Regression Line: Y = 11.376 - .025x

      If a person consumed 90% of their soft drinks with sugar, based on our regression analysis, we would expect that they would consume a little over 9 soft drinks per week (11.376 - .025 x 90 = 9.126).

   Evaluation: The computed F value reveals a significant regression model. However, an evaluation of R² reveals that % soft drinks consumed with sugar only explains 1.2% of the variation in weekly soft drink consumption. Hence, in the context of bivariate regression analysis, we need to look for a better predictor variable.

3. Regression Problem for Analysis: Invoke a bivariate regression analysis for the following pairs of variables:
   a. Q5 & Q3a
   b. Q5 & Q4a
   c. Q5 & Q2b

   Which of the regression models from a, b, or c does the best job of explaining the variation in average weekly soft drink consumption? Summarize your results in a table similar to the one below:

   Variables    R²    F
   Q5 & Q3a
   Q5 & Q4a
   Q5 & Q2b

   d. Briefly discuss your evaluation. _____________________________________________
   _______________________________________________________________________
   _______________________________________________________________________
   e. Using the regression line in "a" above, compute Y (Q5) if Q3a = 100. _______________
   f. Using the regression line in "b" above, compute Y (Q5) if Q4a = 100. _______________
   g. Using the regression line in "c" above, compute Y (Q5) if Q2b = 100. _______________

Additional Notes on Correlation Analysis:

Correlation Analysis: both Pearson's & Spearman's

1. Measures the degree to which changes in one variable (sometimes we call it the dependent variable) are associated with changes in another variable.
2. Our analysis will only be bivariate correlation analysis.
3. The correlation coefficient (Pearson's & Spearman's):
   a. Is a relative measure of the
      i. Direction &
      ii. Strength
      of the relationship between two variables, or vectors of data in a database.
   b. The coefficient can range from -1.0 to +1.0. A value close to +/- 1.0 indicates a strong correlation, while a +/- value close to zero would indicate a relatively weak correlation or association between two variables.
4. Correlation is only a descriptive analysis; hence, strong correlations do not necessarily mean there is a cause-effect relationship between two variables. However, correlation is one of three preconditions for a cause-effect relationship.
5. Given the following correlation matrix, note the interpretation below.
           Age      Inc      Edu
   Age     1.00     -.05     -.49
            .00      .88      .09
   Inc     -.05     1.00      .84
            .88      .00      .03
   Edu     -.49      .84     1.00
            .09      .03      .00

Interpretation of the Correlation Matrix:

The top number, for age & edu, -.49, indicates the relative strength and direction of the correlation/association between the two variables in question.

The bottom number, for age & edu, .09, indicates that, given these sample results, there is a 9% chance that a significant association does NOT exist in the population. A simpler way to say it: we are 91% confident, based on these sample results, that there is a significant correlation/association between the two variables in the population.

If the probability of insignificance is ever over .10 (> 10% chance of insignificance in the population), we ALWAYS conclude H0, that there is no correlation between the two variables in question, no matter what the top number, the correlation coefficient, turns out to be.

Hence, when examining a correlation matrix, always look at the bottom number first!! If the relationship is significant, then look at the top number, the correlation coefficient, to evaluate the strength and direction of the relationship.

The difference between Spearman's and Pearson's:

Spearman's can handle categorical data if the response is dichotomous; otherwise, Spearman's is a non-parametric test used to evaluate "ranked" or ordinal data. If your analysis of two variables involves at least one ordinal variable, then you must use Spearman's!!

Pearson's can also handle categorical data from a dichotomous variable; otherwise, it requires that BOTH variables be of at least interval scale (interval or ratio).

SPSS Exercise #7

OBJECTIVE: To obtain results from a traditional descriptive statistics analysis. This exercise asks students to invoke basic analytical tools (means and standard deviations), interpret the output, and answer the questions.

Textbook Reference: Pages 339 to 342.

Instructions:

1. Use the analyze/descriptive statistics/descriptives sequence to obtain results for this exercise. On the questionnaire, Question #7 utilizes a 5-point Likert scale. This scale is balanced and can be assumed to yield interval scale/metric data. Given the preceding, invoke SPSS to calculate the mean and standard deviation for variables Q7a-Q7k.

2. Answer each of the questions below:
   a. Using only the mean for each of the variables, for which question was there the greatest amount of agreement? __________________
   b. Again, using only the mean for each of the variables, for which question was there the greatest amount of disagreement? __________________
   c. Using only the standard deviation for each of the variables, for which question was there the greatest amount of agreement? __________________
   d. Again, using only the standard deviation for each of the variables, for which question was there the greatest amount of disagreement? __________________
   e. Explain the difference in the results of questions a & b versus c & d.
      ________________________________________________________________
      ________________________________________________________________
      ________________________________________________________________
      ________________________________________________________________

SPSS Exercise #8

OBJECTIVE: This exercise requires students to use only parts of the database, depending upon the information to be obtained. In effect, students will select only certain cases in the database depending upon the specifications of the problem.

Textbook Reference: No specific data analysis reference. Suggestions for summarizing data can be found on pages 335 to 338.

Instructions:

1. From our soft drink database, suppose we want to compare persons who prefer their favorite soft drink 75% of the time or less to those preferring their favorite soft drink more than 75% of the time (76% to 100%).
   We want to compare these two groups on the following characteristics: gender (Q12), ethnic orientation (Q11), average weekly consumption (Q5), and how physically active the respondents perceive themselves to be (Q9).

2. Follow this sequence of steps:

      data/select cases/
      if condition is satisfied (black dot)
      if (click)
      Q4a <= 75
      continue

   This sequence of steps restricts the database to only those respondents who consume their favorite soft drink 75% of the time or less.

3. Utilize the analyze/descriptive statistics/frequencies sequence to obtain frequency distributions for the following variables: Q4a, Q5, Q9, Q11, Q12

4. Follow the same procedure as in #2 above to obtain a database for only those respondents who consume their favorite soft drink more than 75% of the time.

      data/select cases/
      if condition is satisfied (black dot)
      if (click)
      Q4a > 75
      continue

   This sequence of steps restricts the database to only those respondents who consume their favorite soft drink more than 75% of the time.

5. Utilize the analyze/descriptive statistics/frequencies sequence to obtain frequency distributions for the following variables: Q4a, Q5, Q9, Q11, Q12

6. Using tables, graphs or charts, compare the consumption of the two groups of respondents by variables Q5, Q9, Q11, and Q12.

7. Summarize the results of your tables/graphs/charts in a paragraph or less.

SPSS Exercise #9

OBJECTIVE: To develop categories for variables with continuous data. For the variable Q5 (average weekly soft drink consumption), develop 3 discrete categories of consumption: LOW, MEDIUM, and HIGH. Your earlier frequency analysis should help you establish the 3 discrete categories of average weekly soft drink consumption. The ranges used below are just examples. Use your judgment to establish ranges for your database.

Textbook Reference: None. This is an SPSS-specific exercise.

Instructions:

1. Assume, based on analysis of our frequency distribution for Q5, that we establish the following discrete ranges for average weekly soft drink consumption:

      Consumption Category    Range of Consumption (just examples; you establish ranges)
      LOW                     1 to 4
      MEDIUM                  5 to 10
      HIGH                    over 10

2. Utilize the following sequence of steps to establish discrete numerical ranges of soft drink consumption (categories 1, 2, and 3).

      Transform
      Recode
      Into different variables
      Click Q5 (numeric variable)
      Name output variable Q5c
      Change
      Label (categories of soft drink consumption)
      Old Values & New values
         Range 0 through 4           value 1    Add
         Range 5 through 10          value 2    Add
         Range 10 through highest    value 3    Add
         (These ranges are just examples!! You will need to develop ranges based on your database results.)
      Continue
      OK (you should see the variable Q5c after Q9)

3. Now go to the variable view screen and find your new Q5c variable. Go to the VALUES column and give the values labels:
      1=low
      2=medium
      3=high

4. Go back to the data view screen and inspect the new variable, Q5c.

5. Use the analyze/descriptive statistics/frequencies sequence to obtain a frequency distribution for the new variable, Q5c.

6. Print out the frequency table for Q5c.

7. Crosstabulate variables Q12 (gender) and Q5c (categories of soft drink consumption). Be sure to invoke a chi-square analysis and interpret the chi-square statistic. Using the results of your crosstabulation analysis, does soft drink consumption vary by gender? _______________

SPSS Exercise #10

OBJECTIVE: Additional Data Analysis Questions

The preceding 9 exercises covered many aspects of the basics of using SPSS to analyze data from a marketing research project. Exercise #10 is purely an application exercise, in which students will use what they have learned in the previous exercises to answer the subsequent questions. There may be more than one method of answering a particular question.
In the real world, students will have to make decisions concerning the selection of the best techniques for analyzing the data for a given problem. That is precisely the objective of this exercise, although there are some hints concerning appropriate methods of analysis.

Instructions: Answer the following questions using user-friendly explanations, tables, charts, graphs, numbers or whatever means you deem appropriate.

1. Compare the level of loyalty toward favorite soft drink by gender.
   a. Do males or females (Q12) show more loyalty (Q4a) to their favorite soft drink? __________________
   b. What method of analysis did you use? (chi-square, correlation, or t-test?) ________________________

2. Compare the average weekly consumption of soft drinks (Q5) by classification (Q13).
   a. Do students increase or decrease their soft drink consumption as they progress in their college experience?
      ____________________________________________________________________
   b. What method of analysis did you use? _________________________

3. Compare age categories of respondents (Q14) and average weekly soft drink consumption (Q5).
   a. Correlate age (Q14) and soft drink consumption (Q5). Does soft drink consumption increase or decrease with age?
      ______________________________________________________________
   b. Is this a Pearson's or Spearman's correlation analysis? ___________________________
   c. Run an ANOVA using age (Q14) categories as factors and soft drink consumption (Q5). Does soft drink consumption differ significantly across age categories?
      _______________________________________________________________________
   d. Do your correlation and ANOVA results agree? YES ____ NO ____

4. Evaluate the results in Question 8 (Q8a-Q8j) by gender (Q12). Organize and discuss the results.
   HINT: Do NOT use crosstabs here. Use the select cases procedure so that you run frequencies for Q8 for males and then for females. Now construct a table indicating preferences by gender.

   Example:

      Question                     Beverage Preference - Males    Beverage Preference - Females
      you just mowed the grass     beer (36%)                     Gatorade (41%)

5. Do males or females consider themselves:
   a. more socially active? ______________________________________________________
   b. more physically active? ____________________________________________________
   c. What method of analysis did you use to answer 5a & 5b? (chi-square, t-test or correlation)
      _______________________________________________________________________
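
For readers who want to check their logic outside SPSS, the select-cases-then-frequencies approach suggested in the hint for question 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the gender labels, and the helper names frequency_table and top_preference are all hypothetical, and the values are invented rather than taken from the soft drink database. The idea is simply to filter the cases for one gender, tabulate the responses to one Q8 situation as percentages, and report the most frequent beverage.

```python
from collections import Counter

# Hypothetical records: (gender, beverage chosen for one Q8 situation).
# These values are invented for illustration; they are NOT survey results.
responses = [
    ("male", "beer"), ("male", "beer"), ("male", "water"),
    ("male", "soft drink"), ("female", "gatorade"), ("female", "gatorade"),
    ("female", "water"), ("female", "beer"), ("female", "gatorade"),
]


def frequency_table(records, gender):
    """Select the cases for one gender and return {beverage: percent of that group}."""
    selected = [beverage for g, beverage in records if g == gender]
    counts = Counter(selected)
    total = len(selected)
    return {beverage: 100.0 * n / total for beverage, n in counts.items()}


def top_preference(records, gender):
    """Return the most frequent beverage for one gender and its rounded percentage."""
    table = frequency_table(records, gender)
    beverage = max(table, key=table.get)
    return beverage, round(table[beverage])


# One row of the preference-by-gender table, printed one gender at a time.
for gender in ("male", "female"):
    beverage, pct = top_preference(responses, gender)
    print(f"{gender}: {beverage} ({pct}%)")
```

Repeating the same two steps for each of Q8a through Q8j would fill in one row of the table per question, which mirrors running frequencies in SPSS once with only males selected and once with only females selected.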