SPSS EXERCISES
For
Marketing Research Essentials, 4e
Carl McDaniel and Roger Gates
SPSS Exercises prepared by
Joe Cangelosi
University of Central Arkansas
INTRODUCTION TO STUDENTS
SPSS is recognized as one of the leading software packages for statistical analysis. For about the last 5-7
years, it has been packaged with marketing research texts as an ancillary resource. However, there has not
been an organized attempt to integrate SPSS with the marketing research course. The objective of these
SPSS Exercises is to do just that: integrate the use of SPSS into the Marketing Research course, resulting
in a significant data management component.
The need for a significant data management component came from several sources. First, the Business
Advisory Council of the University of Central Arkansas College of Business, which advises the college with
regard to curriculum, recommended that graduates have stronger data management skills. Second, as
Faculty Advisor for the UCA Marketing Club, and a member of the Central Arkansas Professional Chapter
of the American Marketing Association, I am hearing from marketing professionals that marketing
majors need better data management skills. Given the preceding, I decided to develop computer-based
exercises using SPSS, with the objective of increasing student proficiency in data analysis and data
management.
The SPSS Exercises correspond to the data analysis chapters in McDaniel & Gates, Marketing Research
Essentials, 4e. The SPSS Software is very user-friendly, and the mechanics of its use can be integrated
into daily class sessions. Lastly, SPSS provides a number of learning resources. Their web site address is
www.spss.com.
Soft Drink & Beverage Consumption Questionnaire
1. Do you drink soft drinks? _____YES(1) _____NO(0)
   If "NO" go to number 11.
2. What percent of your soft drink consumption is:
   a. Drinks with sugar _____%
   b. Drinks without sugar (diet) _____%
3. What percent of your soft drink consumption is:
   a. Drinks with caffeine _____%
   b. Drinks without caffeine _____%
4. What percent of your soft drink consumption is:
   a. Your favorite soft drink _____%
   b. Your 2nd favorite soft drink _____%
   c. Other brands of soft drink _____%
5. On the average how many soft drinks do you consume weekly?
   (Use the equivalent of 12 oz. Cans.)
   ___________ 12 oz. Cans
6. What is your favorite soft drink? ________________________________
7. Indicate the extent to which you agree or disagree with each of the following statements using the
   scale below:
   strongly disagree | disagree | indifferent | agree | strongly agree
           1         |    2     |      3      |   4   |       5
   ___a. Soft drinks really give me a lift during the day.
   ___b. I am hooked on soft drinks.
   ___c. Diet soft drinks give me a headache.
   ___d. When I was last on a diet, one of the lifestyle changes I made was not
         to drink soft drinks with sugar.
   ___e. Advertising has nothing to do with my choice of soft drink.
   ___f. I prefer ice tea to soft drinks.
   ___g. Soft drink TV commercials have gotten funnier over the past five years.
   ___h. I think the beer TV commercials are better than the soft drink TV commercials.
   ___i. Soft drinks are bad for a person's health.
   ___j. On an average day, I consume more ounces of soft drinks than water.
   ___k. In general, soft drinks taste better than beer.
8. Indicate which of the following beverages you would prefer to consume for each of the following
   occasions:
   1=soft drink  2=water  3=beer  4=Gatorade or equivalent  5=ice tea
   6=mixed drink  7=coffee  8=other  9=not applicable
   _____ a. You just mowed the grass.
   _____ b. After working out with weights.
   _____ c. After jogging or running.
   _____ d. Eating at a formal restaurant.
   _____ e. Eating fried catfish, shrimp or other seafood.
   _____ f. Eating lobster, crab legs or crawfish.
   _____ g. Discussing life with your girl/boy friend or spouse or (to be politically correct)
            significant other.
   _____ h. Having an intellectual discussion about religious beliefs.
   _____ i. Having a passionate discussion about passionate things.
   _____ j. You just got in from work and want to relax.
9. Using a scale of 1=very inactive, 2=somewhat inactive, 3=somewhat active and 4=very
   active, how physically active do you consider yourself? ________
10. Using a scale of 1=very inactive, 2=somewhat inactive, 3=somewhat active and 4=very
    active, how socially active do you consider yourself? ________
DEMOGRAPHIC/CLASSIFICATION INFORMATION
11. Ethnic Group: ____Caucasian(1) ____African-American(2) ____Asian(3) ____European(4) ____Other(5)
12. Gender: _____Female(1) _____Male(0)
13. Classification: ___Freshman(1) ___Sophomore(2) ___Junior(3) ___Senior(4) ___Grad Student(5)
14. Age: ___0-18(1) ___19-20(2) ___21-22(3) ___23-25(4) ___26-30(5) ___over 30(6)
THANK YOU VERY MUCH FOR YOUR COOPERATION
Instructions for having the soft drink questionnaire filled out correctly.
1. Fill out all questions.
2. Percentages in questions 2, 3, and 4 should total 100%, i.e., if 90% of your soft drink
   consumption is with sugar, then 10% is without sugar, hence the two percentages total 100%.
3. Question 5 simply requests the number of equivalent 12-ounce cans of soft drink
   consumed in an average week.
4. Question 6 simply requests the name of your favorite soft drink. Give only one name.
5. Question 7: just put the number corresponding to the scale in the blank to the left of each
   statement.
6. Question 8: simply choose your preference from the possibilities listed. Please indicate only
   one choice for each occasion.
7. Questions 9 & 10 are scale questions. The scales range from 1 (very inactive) to 4 (very active).
   Simply enter a number from 1 to 4 that indicates "how active" you perceive yourself physically
   (Q9) and socially (Q10).
8. For the demographic questions (Q11-Q14), simply check one of the choices in each question.
The SPSS Program Template:
Students should understand that there is nothing sacred about the variable names, labels or value labels in
the template below. The reason for providing the template is so that when the professor merges the student
databases into one large database, there will be consistency in how the database is set up. This is
especially critical regarding the computer coding of soft drinks in Question #6, which is an open-ended
question.
Provide students with the following template for developing an SPSS database:
Variable Name | Variable Label | Value Labels
Q1 | Do you drink soft drinks | 1=yes, 0=no
Q2a | % soft drinks with sugar |
Q2b | % soft drinks without sugar |
Q3a | % soft drinks with caffeine |
Q3b | % soft drinks without caffeine |
Q4a | % of soft drinks consumed are favorite soft drink |
Q4b | % of soft drinks consumed are 2nd favorite soft drink |
Q4c | % of soft drinks consumed are not 1st or 2nd favorite |
Q5 | Average weekly consumption of soft drinks (12 oz cans) |
Q6 | Name of favorite soft drink | 1=coca cola, 2=diet coke, 3=pepsi, 4=diet pepsi, 5=dr pepper, 6=diet dr pepper, 7=sprite, 8=diet sprite, 9=7up, 10=diet 7up, 11=mountain dew, 12=diet mountain dew, 13=root beer, 14=diet root beer, 15=orange soda, 16=grape soda, 17=other
Q7a | Soft drinks really give me a lift during the day | All of the questions in #7 use the same value labels, as follows: 1=strongly disagree, 2=disagree, 3=indifferent, 4=agree, 5=strongly agree
Q7b | I am hooked on soft drinks |
Q7c | Diet soft drinks give me a headache |
Q7d | Last diet – quit sugar soft drinks |
Q7e | Advertising doesn’t affect my choice of soft drink |
Q7f | I prefer ice tea to soft drinks |
Q7g | Soft drink commercials have gotten funnier – past 5 years |
Q7h | Beer TV commercials better than soft drink TV commercials |
Q7i | Soft drinks are bad for a person’s health |
Q7j | On an average day, I consume more ounces of soft drink than water |
Q7k | In general, soft drinks taste better than beer |
Q8a | You just mowed the grass | All of the questions in #8 use the same value labels, as follows: 1=soft drink, 2=water, 3=beer, 4=Gatorade type, 5=ice tea, 6=mixed drink, 7=coffee, 8=other
Q8b | After working out with weights |
Q8c | After jogging or running |
Q8d | Eating at a formal restaurant |
Q8e | Eating fried catfish, shrimp or other seafood |
Q8f | Eating lobster, crab, or crawfish |
Q8g | Discussing life with your – significant other |
Q8h | Having an intellectual discussion about religious beliefs |
Q8i | Having a passionate discussion about passionate things |
Q8j | Just got in from work and want to relax |
Q9 | How physically active | 1=very inactive, 2=somewhat inactive, 3=somewhat active, 4=very active
Q10 | How socially active | 1=very inactive, 2=somewhat inactive, 3=somewhat active, 4=very active
Q11 | Ethnic group | 1=Caucasian, 2=African-American, 3=Asian, 4=European, 5=other
Q12 | Gender | 1=female, 0=male
Q13 | Classification | 1=freshman, 2=sophomore, 3=junior, 4=senior, 5=graduate student
Q14 | Age | 1=0 to 18, 2=19 to 20, 3=21 to 22, 4=23 to 25, 5=26 to 30, 6=over 30
SPSS Exercise #1
OBJECTIVE:
Machine Cleaning Data – to get students to correct errors made by incorrect entries into
the database.
Textbook Reference: Pages 329 to 330 and 331 to 333.
Instructions:
Using the analyze/descriptive statistics/frequencies sequence, produce one-way frequency tables for all
of the variables in the database except QNO (questionnaire number).
Inspect very closely the output in each table.
a. Are any of the values in the tables not consistent with the computer coding in the questionnaire?
b. Do the percentage totals for questions 2, 3 and 4 equal 100%? You can use the
   transform/compute sequence to create arithmetic variables for questions 2, 3, and 4 (Q2a +
   Q2b, Q3a + Q3b, and Q4a + Q4b + Q4c).
c. Are the value labels correctly indicated in the output? You will have value labels for questions
   1, 6, 7a-7k, 8a-8j, 9, 10, and all of the demographic questions (questions 11-14). I suggest
   double-checking Q6, as these are open-ended codes.
Use a table like the one below as an instrument to compile input errors, so that corrections can be made.
Observation/questionnaire number | Variable containing error | Incorrect value
HELPFUL HINTS FOR SPSS Exercise #1:
Instructions concerning machine cleaning using either database alternative:
This questionnaire will have 2 types of errors:
1. The first is the data entry mistake. For instance, a “2” is entered for male instead of a “0”, or some
   similar typing mistake is made. These types of mistakes are easily found upon examination of
   frequency distributions. Then, by going back to the database, positioning the cursor on the
   variable with the problem, keying CTRL-F and entering the error value, the student will be able to
   locate the case containing the error. The professor will have all of the original questionnaires, and
   be able to provide the correct responses for students. If using the Website database, the professor
   can provide the correct responses once students have identified the errors. Note the table above
   that can be used for error correction.
2. The most common error will be found in questions 2, 3 and 4, where the percentages do not add
   up to 100%. Again, once students find such errors, the professor can provide the correct responses
   (website database) or discuss how to handle such an input error (student-created database). To
   find such errors, students will need to use the Transform/compute sequence to create an
   additional variable. For instance, for question 2, create the variable q2check, which is the result of
   adding q2a + q2b. If the total equals 100, the question was answered correctly. Otherwise, there
   is an error.
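The two checks described in these hints can also be sketched outside SPSS. The Python fragment below is only an illustration and is not part of the SPSS exercise: the three sample records, the VALID_CODES table, and the function names are hypothetical, and only a few variables from the template are shown.

```python
# Hypothetical sample records keyed by the template's variable names.
# QNO is the questionnaire number; the values are invented for illustration.
records = [
    {"QNO": 1, "Q1": 1, "Q2a": 90, "Q2b": 10, "Q12": 0},
    {"QNO": 2, "Q1": 1, "Q2a": 60, "Q2b": 30, "Q12": 1},  # Q2a + Q2b = 90
    {"QNO": 3, "Q1": 1, "Q2a": 50, "Q2b": 50, "Q12": 2},  # 2 is not a gender code
]

# Valid codes taken from the template (only two variables shown here).
VALID_CODES = {"Q1": {0, 1}, "Q12": {0, 1}}


def find_invalid_codes(rows, valid_codes):
    """Error type 1: values that are not among the coded responses."""
    errors = []
    for row in rows:
        for variable, allowed in valid_codes.items():
            if row[variable] not in allowed:
                errors.append((row["QNO"], variable, row[variable]))
    return errors


def find_bad_percent_totals(rows, parts, expected=100):
    """Error type 2: percentage parts that do not add up to 100."""
    errors = []
    for row in rows:
        total = sum(row[p] for p in parts)  # same idea as q2check = q2a + q2b
        if total != expected:
            errors.append((row["QNO"], "+".join(parts), total))
    return errors


code_errors = find_invalid_codes(records, VALID_CODES)
total_errors = find_bad_percent_totals(records, ["Q2a", "Q2b"])

# Each tuple is (questionnaire number, variable containing error, incorrect value).
for entry in code_errors + total_errors:
    print(entry)
```

Each tuple lines up with one row of the error-compilation table from Exercise #1, so the corrections can still be requested from the professor in the same way.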
SPSS Exercise #2
OBJECTIVE:
To get students to answer questions based on the results from the frequency
distributions generated from SPSS Exercise #1.
Textbook Reference: Pages 331 to 333.
Answer each of the following questions:
1. What percentage of all respondents drink soft drinks? __________%
2. Produce a table indicating the top 5 favorite brands of soft drink with the percentage of respondents
   drinking each. Always express the results of your tables in descending order.
   For Example:
   Brand of Soft Drink | Percentage of Respondents
   Dr. Pepper | 21.9%
   2nd favorite soft drink, etc. | 18.2%
3. What percentage of respondents “strongly agree” with question 7a? __________%
4. What percentage of respondents “strongly disagree” with question 7k? __________%
5. Produce a table indicating the most popular beverage for each of the questions in question 8. Also
   indicate the percentage of respondents preferring that particular beverage.
   For Example: (Your table will have a most popular beverage for each of the 10 questions.)
   Question | Most popular beverage | Percent preferring
   You just mowed the grass | Beer | 75%
   After working out with weights | Gatorade | 82%
6. Which is the second most consumed beverage after:
   Occasion | Second most consumed beverage | % Preferring
   a. Mowing the grass | ___________________________ | __________%
   b. Working out with weights | _____________________ | __________%
   c. Jogging or running | ___________________________ | __________%
SPSS Exercise #3
OBJECTIVE:
To perform an analysis of the demographic characteristics of your database using
frequency distributions generated in SPSS Exercise #1.
Textbook References: Pages 331 to 333, 335 to 338.
Instructions:
1. Evaluate questions #11 through #14. These 4 questions constitute the demographics of the survey.
2. Display the demographic data in a user-friendly format such as tables.
3. For each demographic variable, illustrate the table results using some type of graphic
   representation of the data (charts, graphs, etc.)
SPSS Exercise #4
OBJECTIVE:
This exercise deals with crosstabulation analysis. The objectives are to get students to:
a. perform crosstabulation analysis,
b. correctly read data from the crosstabulation matrix,
c. determine whether or not the sample results can be generalized to the population
   under study via the use of the chi-square test for independent samples.
Textbook Reference: Pages 333 to 335, and 342 to 353
Instructions:
1. Use the analyze/descriptive statistics/crosstab sequence to obtain crosstab results. In addition,
   click on the “cell” icon and make sure the observed, expected, total, row, and column boxes are
   checked. Then, click on the “statistics” icon and check the chi-square box. Once you run the
   analysis, on the output for the chi-square analysis, you will only need the Pearson chi-square
   statistic to assess whether or not the results of the crosstab are statistically significant.
2. In this exercise we are assessing whether or not persons who drink soft drinks are different from
   those that don’t drink soft drinks regarding demographic characteristics. Invoke the crosstab
   analysis for the following pairs of variables:
   a. Q1 & Q11
   b. Q1 & Q12
   c. Q1 & Q13
   d. Q1 & Q14
Answer the following questions:
1. What % of males don't drink soft drinks? _________%
2. What % of all respondents are female and drink soft drinks? _________%
3. What % of persons not drinking soft drinks are female? _________%
4. Which classification group drinks soft drinks the most? ________________________
5. Which age group drinks soft drinks the most? ________________________
6. Evaluate the chi-square statistic in each of your crosstab tables. Construct a table to summarize
   the results, similar to the example below.
Variables | Pearson Chi-Square | Degrees of Freedom | Explanation
Q1 & Q11 | 1.67 | 3 | Based on our sample results, we cannot conclude that in the population under study the tendency to drink or not drink soft drinks varies significantly by ethnic orientation.
Q1 & Q12 | 2.84 | 1 | We can be 90% confident, based on our sample results, that in the population under study males differ significantly from females in their tendency to consume or not consume soft drinks.
Q1 & Q13 | | |
Q1 & Q14 | | |
Notes on the Chi-Square Test for Independent Samples: Note the SPSS chi-square output below.

do you drink soft drinks? * gender Crosstabulation
do you drink soft drinks? | | gender: male | gender: female | Total
no | Count | 21 | 43 | 64
no | % within do you drink soft drinks? | 32.8% | 67.2% | 100.0%
no | % within gender | 9.3% | 14.1% | 12.1%
no | % of Total | 4.0% | 8.1% | 12.1%
yes | Count | 204 | 262 | 466
yes | % within do you drink soft drinks? | 43.8% | 56.2% | 100.0%
yes | % within gender | 90.7% | 85.9% | 87.9%
yes | % of Total | 38.5% | 49.4% | 87.9%
Total | Count | 225 | 305 | 530
Total | % within do you drink soft drinks? | 42.5% | 57.5% | 100.0%
Total | % within gender | 100.0% | 100.0% | 100.0%
Total | % of Total | 42.5% | 57.5% | 100.0%

 | Value | Df | Asymp. Sig. | Exact Sig. | Exact Sig.
Pearson Chi-Square | 2.769 (b) | 1 | .096 | |
Continuity Correction (a) | 2.338 | 1 | .126 | |
Likelihood Ratio | 2.835 | 1 | .092 | |
Fisher’s Exact Test | | | | .106 | .062
Linear-by-Linear Association | 2.764 | 1 | .096 | |
N of Valid Cases | 530 | | | |
a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 27.17.
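The three percentages that SPSS reports for each cell of the crosstabulation can be reproduced from the counts alone. The short Python sketch below is only an illustration of that arithmetic; the dictionary layout and the function name cell_percentages are assumptions made for this example and are not SPSS features.

```python
# Observed counts from the crosstabulation output.
counts = {
    "no": {"male": 21, "female": 43},
    "yes": {"male": 204, "female": 262},
}

grand_total = sum(sum(row.values()) for row in counts.values())
row_totals = {answer: sum(row.values()) for answer, row in counts.items()}
col_totals = {
    gender: sum(counts[answer][gender] for answer in counts)
    for gender in ("male", "female")
}


def cell_percentages(answer, gender):
    """Return (% within row, % within column, % of total) for one cell."""
    n = counts[answer][gender]
    return (
        round(100 * n / row_totals[answer], 1),   # % within do you drink soft drinks?
        round(100 * n / col_totals[gender], 1),   # % within gender
        round(100 * n / grand_total, 1),          # % of Total
    )


print(cell_percentages("no", "male"))
print(cell_percentages("yes", "female"))
```

Which percentage answers a question depends on its base: "What % of males don't drink soft drinks?" uses the % within gender, while "What % of all respondents are female and drink soft drinks?" uses the % of Total.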
Managerial interpretation of SPSS Chi-Square output: for purposes of determining significant
differences in a crosstabulation analysis via the Chi-Square test, use only the highlighted
information in the above table, i.e., Pearson Chi-Square/Value/Df/Asymp. Sig.
Purpose:
The purpose of the Chi-Square test for “K” independent samples for crosstabulation
analysis is to determine if significant differences exist. For example, in the crosstab table
above, the question is “based on these sample results, can we generalize to the population
under study that males differ significantly from females by likelihood to drink or not drink
soft drinks?”
Interpretation: If the Chi-Square test is significant (Asymp. Sig. is no greater than .10 for 90%
confidence, or no greater than .05 for 95% confidence, that a significant relationship exists in the
population), then the analyst can use the percentages in the crosstab matrix to determine how much
more or less likely males are than females to drink or not drink soft drinks.
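For readers who want to see where the Pearson Chi-Square value comes from, the following Python sketch recomputes it from the observed counts in the crosstab. It is a hand calculation for illustration, not the SPSS procedure itself; the variable names are assumptions, and the closed-form probability used here is valid only for 1 degree of freedom.

```python
import math

# Observed counts: rows = drink soft drinks (no, yes), columns = (male, female).
observed = [[21, 43], [204, 262]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
min_expected = float("inf")
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count if drinking soft drinks were unrelated to gender.
        expected = row_totals[i] * col_totals[j] / n
        min_expected = min(min_expected, expected)
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 1 only, the upper-tail chi-square probability equals
# erfc(sqrt(x / 2)); larger tables need a chi-square table or a statistics library.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.3f}, df = {df}, sig = {p_value:.3f}")
print(f"minimum expected count = {min_expected:.2f}")
```

The printed values can be compared with the Pearson Chi-Square row (Value, Df, Asymp. Sig.) and with footnote b of the SPSS output.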
SPSS Exercise #5
OBJECTIVE:
To invoke the t-test to evaluate differences in the consumption patterns of males
versus females. The t-test (or z-test) compares the means by category groupings, for
example males versus females or high income versus low income, and computes the
probability that the sample results can be generalized to the population from which the
sample was drawn. SPSS calls the categories groupings.
OBJECTIVE:
To invoke the 1-Way Analysis of Variance Test (ANOVA) to evaluate differences in
the consumption patterns by Classification (Freshman, Sophomore, Junior, Senior,
Graduate Student, Other). The ANOVA test evaluates for significant differences in
consumption patterns for more than two categories or groupings. SPSS calls them
FACTORS.
Textbook Reference – T/Z-Test:
Pages 342 to 353 (See Table 12.12 on page 346)
Instructions: T/Z-Test
Note: In statistics, if a sample has fewer than 30 observations or cases, then we invoke a T-test. If
there are 30 or more cases, then we invoke a Z-test. SPSS calls both a T-test.
Use the analyze/compare means/independent samples t-test sequence to invoke the t-test.
For this exercise we are going to compare male and female soft drink consumption for each of
the following variables:
Q2A - % of soft drink consumption with sugar
Q3A - % of soft drink consumption with caffeine
Q4A - % of soft drink consumption with favorite soft drink
Q5 – weekly soft drink consumption on
Q9 – self-perception of how physically active
Q10 – self-perception of how socially active
SPSS calls the variables being analyzed “test variables.”
The variable we are using to compare responses is the “grouping variable,” in our
analysis gender. Under grouping variable, you will need to input the values for male (0)
and female (1).
On the output page read across for each variable the line that says “equal variances
assumed.” Notice the significance (Sig.) associated with the “F” test (variances) and “T”
test (means). If this value is less than or equal to .10 (90% confidence), then we conclude
that either the means and/or variances are significantly different.
Questions To Answer: T-Test
1. Produce a table to summarize the T-test results. An example is as follows:
Variables | Variance: Prob of Sig diff | Means: Prob of Sig diff | Interpretation of Results
Q2A & Gender | .000 | .003 | Over 99% confident, based on our sample results, that in the population under study males differ significantly from females concerning the % of soft drinks they drink with sugar.
Q3A & Gender | .176 | .833 | We cannot conclude from our sample results that in the population under study males and females differ regarding the % of soft drinks consumed with caffeine.
2. Summarize in a sentence or two the results of your table. What can you say about males
   versus females?
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
Instructions: ANOVA Test:
Use the analyze/compare means/one-way ANOVA sequence to invoke the One Way ANOVA test.
The dependent variables will be Q2a, Q3a, Q4a, Q5, Q9, and Q10. The FACTOR will be Q13
(classification).
On the output page notice “F” and “Sig.”, which are the computed F-value and the probability of
insignificance. 1 – Sig. = the probability that, based on the sample results, the relationships found in
the sample will also be found in the population under study.
Remember, in marketing research we must be at least 90% confident of the results. Hence, if Sig. is
greater than .10, then the differences in means for the factors in question are not considered
significant in the population.
Questions to Answer: ANOVA Test
1. Construct a table patterned after the one below to summarize your ANOVA results:
Variables | Degrees of Freedom | F-Value | Probability of Insignificance | Interpretation of Results
Q2a (% soft drinks with sugar) & Q13 (classification) | 4, 261 | 2.434 | .048 | 95.2% confident, based on the sample results, that in the population under study students differ significantly by classification concerning the percentage of the soft drinks they drink with sugar.
2. Summarize in a sentence or two the results of your table. What can you say about soft
   drink consumption by classification?
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
Technical Notes on the T-Test:
What is the objective of the T-Test?
The t-test is an inferential statistical test. The results of a t-test yield a probability that the differences
observed in the sample data can be generalized to the population from which the data was drawn. The t-test
tests for significant differences across 2 categories for a parametric mean or proportion. In other
words, suppose we wanted to know if males and females responded differently to a Likert scale question.
The t-test for means would compute a mean and standard deviation for males and for females. Then it would
compare them. The resulting probability would be the likelihood that the differences observed in the
sample would also be observed in the population from which the sample was drawn.
Assumptions of the T-Test for Independent Samples
• That the groups being compared are independent of each other.
• Observations are independent when information about one is unrelated to the other.
• The test variables must be interval or ratio scale.
• The grouping variable must be discrete, i.e., discrete categories such as male & female, drink soft
  drinks or do not drink soft drinks, etc.
SPSS Output Example for T-Test:

Group Statistics
 | gender | N | Mean | Std. Deviation | Std. Error Mean
% soft drinks with sugar | male | 204 | 79.78 | 31.816 | 2.228
% soft drinks with sugar | female | 262 | 66.87 | 40.071 | 2.476

Levene’s Test for Equality of Variances
% soft drinks with sugar & gender | F | Sig.
equal variances assumed | 46.149 | .000
equal variances not assumed | |

t-test for Equality of Means
% soft drinks with sugar & gender | t | df | Sig.(2-tailed) | Mean Difference
equal variances assumed | 3.771 | 464 | .000 | 12.92
equal variances not assumed | 3.879 | 463.815 | .000 | 12.92
Interpretation of SPSS T-Test Output:
Levene’s Test for Equality of Variances – this test compares the variance among responses within each
grouping category. For example, was there more variance among male or female
respondents? In the example above, Levene’s test (Sig. = .000) indicates that variation among male
respondents was significantly different from variation among female respondents. Strictly,
a significant Levene’s test points to the “equal variances not assumed” line of the t-test;
in this example both lines lead to the same conclusion.
T-Test for Equality of Means – this test measures whether or not it can be assumed that in the population
under study the mean response by one of the groupings (males) is significantly
different from the mean response by the other grouping (females). In the example above
(Sig. = .000, i.e., less than .0005), we can be more than 99.9% confident that in the population
the % of soft drinks consumed with sugar for male respondents is significantly different
from that for female respondents.
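The t values in the output example can be approximated by hand from the Group Statistics table. The Python sketch below is an illustration under that assumption: it uses only the rounded N, mean, and standard deviation shown in the output, so the results may differ from SPSS in the third decimal place, and the function names are invented for this example.

```python
import math

# Group statistics from the SPSS output: (N, mean, standard deviation).
male = (204, 79.78, 31.816)
female = (262, 66.87, 40.071)


def pooled_t(group1, group2):
    """t statistic and df for the 'equal variances assumed' line."""
    n1, mean1, sd1 = group1
    n2, mean2, sd2 = group2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


def welch_t(group1, group2):
    """t statistic and df for the 'equal variances not assumed' line."""
    n1, mean1, sd1 = group1
    n2, mean2, sd2 = group2
    v1, v2 = sd1**2 / n1, sd2**2 / n2  # squared standard errors of each mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


t_equal, df_equal = pooled_t(male, female)
t_unequal, df_unequal = welch_t(male, female)
print(f"equal variances assumed:     t = {t_equal:.2f}, df = {df_equal}")
print(f"equal variances not assumed: t = {t_unequal:.2f}, df = {df_unequal:.1f}")
```

The sketch stops at the t statistic and degrees of freedom; the Sig. (2-tailed) column still has to be read from the SPSS output or a t table.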
Technical Notes on the ANOVA Test:
What is the objective of the ANOVA Test?
1-Way ANOVA is an inferential statistical test which tests for significant differences across the means of
3 or more groupings, compared to the 2 groupings compared in the T-test. The results of the ANOVA
test yield a probability that the differences observed in the sample data can be generalized to the
population from which the data was drawn. In other words, suppose we wanted to know if average weekly
soft drink consumption varies by age group. The ANOVA test computes a mean consumption and
variance for each age group (factor) for the dependent variable, Q5, average weekly consumption.
Assumptions of the 1-WAY ANOVA test:
• That the groups (factors) being compared are independent of each other.
• Observations are independent.
• The dependent variable must be interval or ratio scale.
• The factor variable must be discrete, i.e., discrete categories such as age categories,
  income categories, or some limited number of categories to be compared.
SPSS Output Example for ANOVA:
Average Weekly Consumption of 12 oz. Soft Drinks
 | Sum of Squares | df | Mean Square | F | Sig.
Between Groups | 371.580 | 5 | 74.316 | 1.011 | .410
Within Groups | 33805.238 | 460 | 73.490 | |
Total | 34176.818 | 465 | | |
In the example above, the result of the ANOVA test is that we cannot conclude that soft drink
consumption differs significantly by age category in the population under study. Sig. (the
probability of insignificance) is .410 or 41%, so we are only 59% confident, which is far below the
minimum level of confidence for significance in marketing research (90% is the minimum; many
consultants prefer at least 95% confidence).
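The F value in the ANOVA example follows directly from the sums of squares and degrees of freedom in the table. The Python sketch below shows that arithmetic as an illustration only; Sig. is copied from the SPSS output rather than computed, and the variable names are assumptions made for this example.

```python
# Sums of squares and degrees of freedom from the ANOVA output example.
ss_between, df_between = 371.580, 5
ss_within, df_within = 33805.238, 460

ms_between = ss_between / df_between  # mean square between groups
ms_within = ss_within / df_within     # mean square within groups
f_value = ms_between / ms_within

sig = 0.410                   # taken from the SPSS output, not computed here
confidence = 1 - sig          # the "level of confidence" used in these exercises
is_significant = sig <= 0.10  # the 90% minimum used in these exercises

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}")
print(f"F = {f_value:.3f}, confidence = {confidence:.1%}, significant: {is_significant}")
```

The same three steps (mean squares, their ratio, and the comparison of Sig. with .10) apply to each dependent variable in Exercise #5.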
SPSS Exercise #6
OBJECTIVE:
To run a correlation analysis. Correlation analysis is a very valuable tool for
summarizing the relationship between two variables.
OBJECTIVE:
This exercise requires the use of bivariate regression analysis, which involves
determining how much of the variation in the dependent variable is explained by the
independent variable.
Textbook Reference – Correlation Analysis: Pages 372 to 373 (See additional notes on correlation
analysis at the end of this exercise.)
Textbook Reference – Bivariate Regression Analysis: Pages 360 to 371.
Instructions – Correlation Analysis:
1. Use the analyze/correlate/bivariate sequence to obtain correlation results. All of the variables in
   this correlation analysis utilize at least interval scale data. Hence, use the Pearson’s option, which
   is the default correlation method.
2. Explanation of Correlation Results: In the SPSS output, the top number is the correlation
   coefficient, which measures the strength and direction of the relationship between the two
   correlated variables. The second number is the probability that whatever relationship exists
   between the two correlated variables, that relationship will not be significant in the population
   under study. The bottom number is simply the number of cases in the analysis in question.

 | (Q5) Average weekly consumption of soft drinks (12 oz cans) | (Q7d) Last diet – quit sugar soft drinks
(Q5) Average weekly consumption of soft drinks (12 oz cans) | 1.000 | -.106
 | . | .022
 | 466 | 466
(Q7d) Last diet – quit sugar soft drinks | -.106 | 1.000
 | .022 | .
 | 466 | 466

   In the results above, there is an inverse correlation between Q5 and Q7d, as indicated by the
   correlation coefficient of -.106. The probability of insignificance is .022 or 2.2%. Hence, the
   relationship between the two correlated variables is statistically significant, although a
   coefficient this close to zero indicates a weak association. The bottom number, 466,
   indicates that the correlation analysis involved 466 pairs of observations.
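To make the top number in each cell more concrete, here is a minimal Python sketch of the Pearson correlation coefficient. It is an illustration only: the five response pairs are invented for this example and are not taken from the survey database, and pearson_r is a hypothetical helper name rather than an SPSS function.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 pairs)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical responses for five people (invented, not from the survey data).
weekly_cans = [2, 5, 8, 12, 20]         # stands in for Q5
quit_sugar_agreement = [5, 4, 4, 2, 1]  # stands in for Q7d (1-5 scale)

r = pearson_r(weekly_cans, quit_sugar_agreement)
print(f"r = {r:.3f}")  # a negative r means an inverse relationship
```

A negative coefficient here has the same reading as the -.106 in the SPSS output: people who drink more soft drinks tend to agree less with the statement, and vice-versa. The sketch does not compute the probability of insignificance, which still comes from SPSS.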
3. Questions to Answer: In this exercise we are correlating average weekly soft drink consumption
   with several variables. Hence, correlate the following pairs of variables:
   a. Q2A & Q5
   b. Q3A & Q5
   c. Q5 & Q4a
   d. Q5 & Q7a
   e. Q5 & Q7i
   f. Q5 & Q7k
4. Summarize the results of your correlation analysis using the table below. The information already
   in the table is only an example.
Variables | Level of Confidence | Interpret the results if the Correlation is Significant
Q7d & Q5 | 97.8% | We can be 97.8% confident, based on our sample results, that in the population under study persons drinking more soft drinks tended to disagree with the statement, “when I was last on a diet, one of the lifestyle changes I made was not drinking soft drinks with sugar,” and vice-versa.
Instructions – Bivariate Regression Analysis:
1. Use the analyze/regression/linear sequence to invoke the bivariate regression procedure.
2. Explanation of SPSS Regression Output:
a. The Strength of Association: R² – The coefficient of determination
   R² = explained variation (SSR)/total variation (SST) or
   R² = 1 – unexplained variation (SSE)/total variation (SST)

   SPSS Output – Bivariate Regression
   ANOVA (b)
   Model | Sum of Squares | Df | Mean Square (MS) | F | Sig.
   Regression (SSR) | 403.690 | 1 | 403.690 | 5.546 | .019 (a)
   Residual (SSE) | 33773.13 | 464 | 72.787 | |
   Total (SST) | 34176.82 | 465 | | |
   a. Predictors: (Constant), % soft drinks with sugar
   b. Dependent Variable: Average weekly consumption of soft drinks (12 oz cans)

   Model | R | R² | Adjusted R² | Std. Error of the Estimate
   1 | .109 (a) | .012 | .010 | 8.53
   a. Predictors: (Constant), % soft drinks with sugar
   b. Dependent Variable: Average weekly consumption of soft drinks (12 oz cans)
b. Statistical Significance of Regression Results
   F = MSR/MSE = 403.69/72.787 = 5.546
   Sig. = .019 (We can be 98.1% confident that our regression results from our sample will
   be statistically significant in the population under study.)
c. The Regression Line
   Coefficients (a)
   Model 1 | Unstandardized Coefficients: B | Unstandardized Coefficients: Std. Error | Standardized Coefficients: Beta | t | Sig.
   (Constant) | 11.376 | .867 | | 13.128 | .000
   % soft drinks with sugar | -2.5E-02 | .011 | -.109 | -2.355 | .019
   a. Dependent Variable: Average weekly consumption of soft drinks (12 oz cans)

   Regression Line: Y = 11.376 - .025x
   If a person consumed 90% of their soft drinks with sugar, based on our regression
   analysis, we would expect that they would consume a little over 9 soft drinks per
   week (9.126).
   Evaluation: The computed F value reveals a significant regression model. However, an
   evaluation of R² reveals that % soft drinks consumed with sugar only explains
   1.2% of the variation in weekly soft drink consumption. Hence, in the context of
   bivariate regression analysis, we need to look for a better predictor variable.
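The three quantities discussed in this explanation (the coefficient of determination, F, and a prediction from the regression line) can be recomputed from the numbers in the SPSS output. The Python sketch below is an illustration of that arithmetic only; the variable and function names are assumptions made for this example.

```python
# Values taken from the SPSS regression output example.
ssr, sse, sst = 403.690, 33773.13, 34176.82
df_regression, df_residual = 1, 464
intercept, slope = 11.376, -0.025  # the slope appears as -2.5E-02 in the output

r_squared = ssr / sst          # explained variation / total variation
r_squared_alt = 1 - sse / sst  # 1 - unexplained variation / total variation
f_value = (ssr / df_regression) / (sse / df_residual)  # MSR / MSE


def predict_weekly_cans(percent_with_sugar):
    """Predicted Q5 (weekly 12 oz cans) from Q2a (% of soft drinks with sugar)."""
    return intercept + slope * percent_with_sugar


print(f"R squared = {r_squared:.3f} (alternative formula: {r_squared_alt:.3f})")
print(f"F = {f_value:.3f}")
print(f"Predicted weekly cans at 90% with sugar: {predict_weekly_cans(90):.3f}")
```

The same function form, with the B coefficients from your own output, can be used for the "compute Y" items in the regression problem.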
Regression Problem for Analysis:
Invoke a bivariate regression analysis for the following pairs of variables:
a. Q5 & Q3a
b. Q5 & Q4a
c. Q5 & Q2b
Which of the regression models from a, b, or c does the best job of explaining the variation in
average weekly soft drink consumption? Summarize your results in a table similar to the one
below:
Variables     R2          F
Q5 & Q3a      ________    ________
Q5 & Q4a      ________    ________
Q5 & Q2b      ________    ________
d. Briefly discuss your evaluation. _____________________________________________
_______________________________________________________________________
_______________________________________________________________________
e. Using the regression line in “a” above, compute Y (Q5) if Q3a = 100. _______________
f. Using the regression line in “b” above, compute Y (Q5) if Q4a = 100. _______________
g. Using the regression line in “c” above, compute Y (Q5) if Q2b = 100. _______________
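For readers who want to see where R2, F, and the predicted Y come from, here is a minimal sketch of a bivariate least-squares fit. The data are hypothetical and are NOT taken from the soft drink database; SPSS remains the tool the exercise expects you to use.

```python
def bivariate_regression(x, y):
    """Least-squares fit of y = a + b*x; returns intercept, slope, R2, and F."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_error = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_regression = ss_total - ss_error
    r_squared = ss_regression / ss_total
    # With one predictor, MSR has 1 degree of freedom and MSE has n - 2.
    f_value = (ss_regression / 1) / (ss_error / (n - 2))
    return intercept, slope, r_squared, f_value


# Hypothetical responses, for illustration only.
predictor = [10, 20, 30, 40, 50]   # stands in for Q3a, Q4a, or Q2b
q5 = [3, 5, 4, 8, 10]              # average weekly consumption

a, b, r2, f = bivariate_regression(predictor, q5)
print(f"Y = {a:.3f} + {b:.3f}x, R2 = {r2:.3f}, F = {f:.3f}")
print("Predicted Q5 when x = 100:", round(a + b * 100, 3))
```

Running the same function on each pair of variables gives the R2 and F values needed to fill in the comparison table.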
Additional Notes on Correlation Analysis:
Correlation Analysis: both Pearson’s & Spearman’s
1. Measures the degree to which changes in one variable (sometimes we call it the dependent
variable) are associated with changes in another variable.
2. Our analysis will only be bivariate correlation analysis.
3. The correlation coefficient: Pearson’s & Spearman’s
a. Is a relative measure of the
i. Direction &
ii. Strength
of the relationship between two variables, or vectors of data in a database.
b. The coefficient can range from -1.0 to +1.0. A value close to +/- 1.0 indicates a strong
correlation, while a value close to zero would indicate a relatively weak correlation or
association between two variables.
4. Correlation is only a descriptive analysis; hence, strong correlations do not necessarily mean there
is a cause-effect relationship between two variables. However, correlation is one of three
preconditions for a cause-effect relationship.
5. Given the following correlation matrix, note the interpretation below.
         Age      Inc      Edu
Age      1.00     -.05     -.49
         .00      .88      .09
Inc      -.05     1.00     .84
         .88      .00      .03
Edu      -.49     .84      1.00
         .09      .03      .00
Interpretation of the Correlation Matrix:
The top number for age & edu, -.49, indicates the relative strength and direction of the correlation/
association between the two variables in question. The bottom number for age & edu, .09, is the
significance level: if there were no association in the population, a correlation this strong would
appear in about 9% of samples. A simpler, informal way to say it is that we are 91% confident, based on
these sample results, that there is a significant correlation/association between the two variables in
the population.
If the bottom number is ever over .10 (> 10% chance of insignificance in the population) we
ALWAYS conclude Ho, that there is no correlation between the two variables in question, no matter
what the top number, the correlation coefficient, turns out to be.
Hence, when examining a correlation matrix always look at the bottom number first!! If the relationship is
significant, then look at the top number, the correlation coefficient, to evaluate the strength and direction of
the relationship.
The difference between Spearman’s and Pearson’s:
Spearman’s can handle categorical data if the response is dichotomous; otherwise we say Spearman’s is a
non-parametric test used to evaluate “ranked” or ordinal data. If your analysis of two variables involves at
least one ordinal variable, then you must use Spearman’s!! Pearson’s can also handle categorical data from
a dichotomous variable; otherwise it requires that BOTH variables be of at least interval scale (interval or
ratio).
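The relationship between the two coefficients can be sketched in a few lines: Spearman’s coefficient is Pearson’s coefficient computed on the ranks of the data rather than on the raw values. The data below are hypothetical, and the function names are illustrative, not SPSS commands.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranked data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: consumption rises with age, but not in a straight line.
age = [18, 19, 20, 21, 22]
cans = [1, 2, 4, 8, 16]
print("Pearson: ", round(pearson(age, cans), 3))
print("Spearman:", round(spearman(age, cans), 3))
```

Because the hypothetical relationship is perfectly monotonic but curved, Spearman’s coefficient reaches 1.0 while Pearson’s stays below it.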
SPSS Exercise #7
OBJECTIVE:
To obtain results from a traditional descriptive statistics analysis. This exercise gets
students to invoke basic analytical tools (means and standard deviations),
interpret the output, and answer the questions.
Textbook Reference: Pages 339 to 342.
Instructions:
1. Use the analyze/descriptive statistics/descriptives sequence to obtain results for this exercise.
On the questionnaire, Question #7 utilizes a 5-point Likert scale. This scale is balanced and can
be assumed to yield interval scale/metric data. Given the preceding, invoke SPSS to calculate the
mean and standard deviation for variables Q7a-Q7k.
2. Answer each of the questions below:
a. Using only the mean for each of the variables, for which question was there the greatest
amount of agreement? __________________
b. Again, using only the mean for each of the variables, for which question was there the
greatest amount of disagreement? __________________
c. Using only the standard deviation for each of the variables, for which question was
there the greatest amount of agreement? __________________
d. Again, using only the standard deviation for each of the variables, for which question
was there the greatest amount of disagreement? __________________
e. Explain the difference in the results of questions a & b versus c & d.
________________________________________________________________
________________________________________________________________
________________________________________________________________
________________________________________________________________
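The distinction between what the mean and the standard deviation tell you can be illustrated with a small sketch. The responses are hypothetical, and the coding (5 = strongly agree) is an assumption made only for this illustration; check the questionnaire for the actual coding of Question #7.

```python
from statistics import mean, stdev

# Hypothetical 5-point Likert responses (assumed coding: 5 = strongly agree).
responses = {
    "Q7a": [5, 5, 4, 5, 4],
    "Q7b": [1, 5, 1, 5, 3],
    "Q7c": [3, 3, 3, 3, 3],
}

# stdev() is the sample standard deviation (n - 1 in the denominator).
summary = {q: (mean(v), stdev(v)) for q, v in responses.items()}
for q, (m, s) in summary.items():
    print(f"{q}: mean = {m:.2f}, std. dev. = {s:.2f}")

highest_mean = max(summary, key=lambda q: summary[q][0])
smallest_sd = min(summary, key=lambda q: summary[q][1])
largest_sd = max(summary, key=lambda q: summary[q][1])
print("Highest mean:", highest_mean)
print("Smallest std. dev.:", smallest_sd)
print("Largest std. dev.:", largest_sd)
```

In this made-up data Q7b and Q7c share the same mean, yet respondents are unanimous on Q7c and sharply split on Q7b. The mean shows where responses sit on the scale, while the standard deviation shows how much respondents agree with one another.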
SPSS Exercise #8
OBJECTIVE:
This exercise requires students to use only parts of the database, depending upon the
information to be obtained. In effect, students will select only certain cases in the
database depending upon the specifications of the problem.
Textbook Reference: No specific data analysis reference. Suggestions for summarizing data can be
found on pages 335 to 338.
Instructions:
1. From our soft drink database suppose we want to compare persons who prefer their favorite soft
drink 75% of the time or less to those preferring their favorite soft drink more than 75% of the
time (76% to 100%). We want to compare consumption via the following demographic
characteristics: gender (Q12), ethnic orientation (Q11), average weekly consumption (Q5), and
how physically active the respondents perceive themselves (Q9).
2. Follow this sequence of steps:
data/select cases/
if condition is satisfied (black dot)
if (click) Q4a <= 75
continue
This sequence of steps will get the database including only respondents consuming their
favorite soft drink 75% of the time or less.
3. Utilize the analyze/descriptive statistics/frequencies sequence to obtain frequency distributions
for the following variables:
Q4a, Q5, Q9, Q11, Q12
4. Follow the same procedure as in #2 above to obtain a database for only those respondents who
consume their favorite soft drink more than 75% of the time.
data/select cases/
if condition is satisfied (black dot)
if (click) Q4a > 75
continue
This sequence of steps will get the database including only respondents consuming their
favorite soft drink more than 75% of the time.
5. Utilize the analyze/descriptive statistics/frequencies sequence to obtain frequency distributions
for the following variables:
Q4a, Q5, Q9, Q11, Q12
6. Using tables, graphs or charts, compare the consumption of the two groups of respondents by
variables Q5, Q9, Q11, and Q12.
7. Summarize the results of your tables/graphs/charts in a paragraph or less.
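The logic of the select cases and frequencies steps can be sketched outside SPSS as a filter followed by a count. The respondents below are hypothetical, and coding gender as text is an assumption made for readability.

```python
from collections import Counter

# Hypothetical respondents, not taken from the soft drink database.
respondents = [
    {"Q4a": 50, "Q5": 4, "Q12": "male"},
    {"Q4a": 75, "Q5": 10, "Q12": "female"},
    {"Q4a": 80, "Q5": 12, "Q12": "female"},
    {"Q4a": 100, "Q5": 4, "Q12": "male"},
    {"Q4a": 90, "Q5": 7, "Q12": "male"},
]


def select_cases(cases, condition):
    """Keep only the cases for which the condition is true."""
    return [case for case in cases if condition(case)]


def frequencies(cases, variable):
    """Frequency table: value -> (count, percent of the selected cases)."""
    counts = Counter(case[variable] for case in cases)
    total = len(cases)
    return {value: (n, 100 * n / total) for value, n in sorted(counts.items())}


less_loyal = select_cases(respondents, lambda case: case["Q4a"] <= 75)
more_loyal = select_cases(respondents, lambda case: case["Q4a"] > 75)

print("Q4a <= 75:", frequencies(less_loyal, "Q12"))
print("Q4a > 75: ", frequencies(more_loyal, "Q12"))
```

Note that the two conditions are complementary, so every respondent with a valid Q4a value falls into exactly one of the two groups.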
SPSS Exercise #9
OBJECTIVE:
To develop categories for variables with continuous data. For the variable Q5 (average
weekly soft drink consumption), develop 3 discrete categories of consumption: LOW,
MEDIUM, and HIGH. Your earlier frequency analysis should help you establish the 3
discrete categories of average weekly soft drink consumption. The ranges used below
are just examples. Use your judgment to establish ranges for your database.
Textbook Reference: None. This is a SPSS-specific exercise.
Instructions:
1. Assume based on analysis of our frequency distribution for Q5 that we establish the following
discrete ranges for average weekly soft drink consumption:
Consumption Category     Range of Consumption (just examples; you establish ranges)
LOW                      1 to 4
MEDIUM                   5 to 10
HIGH                     over 10
2. Utilize the following sequence of steps to establish discrete numerical ranges of soft drink
consumption (categories 1, 2, and 3).
Transform
Recode
Into different variables
Click Q5 (numeric variable)
Name output variable Q5c
Change
Label (categories of soft drink consumption)
Old Values & New values
Range 0 through 4          value 1   Add
Range 5 through 10         value 2   Add
Range 10 through highest   value 3   Add
(These ranges are just examples!! You will need to develop ranges based on your database results.)
Continue
OK (you should see the variable Q5c after Q9)
3. Now, go to your new Q5c variable and go to the variable view screen. Go to the
VALUES column and give the values labels:
1=low 2=medium 3=high
4. Go back to the data view screen and inspect the new variable, Q5c.
5. Use the analyze/descriptive statistics/frequencies sequence to obtain a frequency distribution
for the new variable, Q5c.
6. Print out the frequency table for Q5c.
7. Crosstabulate variables Q12 (gender) and Q5c (categories of soft drink consumption). Be sure to
invoke a chi-square analysis and interpret the chi-square statistic. Using the results of your
crosstabulation analysis, does soft drink consumption vary by gender? _______________
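The recode and crosstabulation steps can be sketched as follows. This is an illustration under stated assumptions: the cut-offs follow the example ranges above (up to 4 = low, 5 to 10 = medium, over 10 = high), the data are hypothetical, and only the chi-square statistic is computed. SPSS also reports the significance level, which this sketch does not.

```python
def recode_q5(cans):
    """Map weekly consumption to 1 = low, 2 = medium, 3 = high (example ranges)."""
    if cans <= 4:
        return 1
    if cans <= 10:
        return 2
    return 3


def crosstab(rows, cols):
    """Count how often each (row value, column value) pair occurs."""
    table = {}
    for r, c in zip(rows, cols):
        table.setdefault(r, {}).setdefault(c, 0)
        table[r][c] += 1
    return table


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a crosstab."""
    row_keys = sorted(table)
    col_keys = sorted({c for row in table.values() for c in row})
    row_total = {r: sum(table[r].values()) for r in row_keys}
    col_total = {c: sum(table[r].get(c, 0) for r in row_keys) for c in col_keys}
    n = sum(row_total.values())
    statistic = 0.0
    for r in row_keys:
        for c in col_keys:
            expected = row_total[r] * col_total[c] / n
            statistic += (table[r].get(c, 0) - expected) ** 2 / expected
    return statistic, (len(row_keys) - 1) * (len(col_keys) - 1)


# Hypothetical data, for illustration only.
gender = ["male"] * 6 + ["female"] * 6
q5 = [2, 3, 6, 12, 15, 20, 1, 2, 4, 7, 8, 11]

q5c = [recode_q5(value) for value in q5]
table = crosstab(gender, q5c)
statistic, df = chi_square(table)
print(table)
print(f"chi-square = {statistic:.3f}, df = {df}")
```

A sample this small leaves expected counts below 5 in every cell, so the statistic here only demonstrates the calculation; it is not a basis for a conclusion.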
SPSS Exercise #10
OBJECTIVE:
Additional Data Analysis Questions - The preceding 9 exercises covered many aspects of
the basics of using SPSS to analyze data from a marketing research project. Exercise #10
is purely an application exercise, in which students will use what they have learned in the
previous exercises to answer the subsequent questions. There may be more than one
method of answering a particular question. In the real world, students will have to make
decisions concerning the selection of the best techniques for analyzing the data for a given
problem. That is precisely the objective of this exercise, although there are some hints
concerning appropriate methods of analysis.
Instructions:
Answer the following questions using user-friendly explanations, tables, charts, graphs,
numbers or whatever means you deem appropriate.
1. Compare the level of loyalty toward favorite soft drink by gender.
a. Do males or females (Q12) show more loyalty (Q4a) to their favorite soft drink?
__________________
b. What method of analysis did you use? ________________________ (chi-square,
correlation, or t-test?)
2. Compare the average weekly consumption of soft drinks (Q5) by classification (Q13).
a. Do students increase or decrease their soft drink consumption as they progress in their
college experience?
____________________________________________________________________
b. What method of analysis did you use? _________________________
3. Compare age categories of respondents (Q14) and average weekly soft drink consumption (Q5).
a. Correlate age (Q14) and soft drink consumption (Q5). Does soft drink consumption
increase or decrease with age?
______________________________________________________________
b. Is this a Pearson’s or Spearman’s correlation analysis? ___________________________
c. Run an ANOVA using age (Q14) categories as factors and soft drink consumption (Q5).
Does soft drink consumption differ significantly across age categories?
_______________________________________________________________________
d. Do your correlation and ANOVA results agree? YES ____ NO ____
4. Evaluate the results in Question 8 (Q8a-Q8j) by gender (Q12). Organize and discuss the results.
HINT: Do NOT use crosstabs here. Use the select cases procedure such that you run frequencies
for Q8 for males and then for females. Now construct a table indicating preferences by gender.
Example:
Question                    Beverage Preference - Males    Beverage Preference - Females
you just mowed the grass    beer (36%)                     Gatorade (41%)
5. Do males or females consider themselves:
a. more socially active. ______________________________________________________
b. more physically active. ____________________________________________________
c. What method of analysis did you use to answer 5a & 5b? (chi-square, t-test or correlation)
_______________________________________________________________________
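As background for question 3c, the F statistic that a one-way ANOVA reports can be sketched as the ratio of between-group to within-group variation. The age category labels and consumption values below are hypothetical, and the sketch stops at the F statistic; SPSS additionally reports its significance level.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA, with its two degrees of freedom."""
    all_values = [value for group in groups for value in group]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(group) / len(group) for group in groups]
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    ss_within = sum(
        (value - m) ** 2 for group, m in zip(groups, group_means) for value in group
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k


# Hypothetical weekly consumption (Q5) by hypothetical age category (Q14).
consumption_by_age = {
    "18-20": [10, 12, 14],
    "21-23": [7, 9, 11],
    "24 and over": [4, 6, 8],
}

f_value, df_between, df_within = one_way_anova_f(list(consumption_by_age.values()))
print(f"F = {f_value:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F means the category means differ by more than the spread within categories would suggest, which is the same question the correlation in 3a approaches from a different direction.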