DATA ANALYSIS II
MKT 525
CROSS-TABULATIONS-1
                       Numbers of Cars
Income                 1 or none   2 or more   Total
Less than $37,500          48           6         54
= or > $37,500             27          19         46
Total                      75          25        100
CROSS-TABULATIONS-2
                       Numbers of Cars
Income                 1 or none   2 or more   Total
Less than $37,500         89%         11%       100%
= or > $37,500            59%         41%       100%
CROSS-TABULATIONS-3
                       Numbers of Cars
Income                 1 or none   2 or more
Less than $37,500         64%         24%
= or > $37,500            36%         76%
Total                    100%        100%
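The row percentages in CROSS-TABULATIONS-2 and the column percentages in CROSS-TABULATIONS-3 can be reproduced from the counts in CROSS-TABULATIONS-1. The following is a minimal Python sketch; the function and variable names are illustrative and not part of the course materials.

```python
# Counts from CROSS-TABULATIONS-1: rows = income group, columns = number of cars.
counts = {
    "Less than $37,500": {"1 or none": 48, "2 or more": 6},
    "= or > $37,500": {"1 or none": 27, "2 or more": 19},
}


def row_percentages(table):
    """Percent of each row (income group) that falls in each column."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: round(100 * n / row_total) for col, n in cells.items()}
    return result


def column_percentages(table):
    """Percent of each column (car category) that falls in each row."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        row: {col: round(100 * n / col_totals[col]) for col, n in cells.items()}
        for row, cells in table.items()
    }


row_pct = row_percentages(counts)     # matches CROSS-TABULATIONS-2
col_pct = column_percentages(counts)  # matches CROSS-TABULATIONS-3
print(row_pct)
print(col_pct)
```

Row percentages answer "what share of each income group owns 2 or more cars?", while column percentages answer "what share of each car-ownership group falls in each income band?".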
CROSS-TABULATIONS-4
                       Numbers of Cars
Family size            1 or none   2 or more   Total
4 or less                 90%         10%       100%
5 or more                 23%         77%       100%
CROSS-TABULATIONS-5
                       Family: 4 members or less     Family: 5 or more
No. Cars               1 or none    2 or more        1 or none    2 or more
Income:
Less than $37,500         96%           4%              50%          50%
= or > $37,500            81%          19%               7%          93%
CROSS-TABULATIONS-6
(Cell entries: percent of households with 2 or more cars)
                       Size of family:
Income:                4 or less   5 or more   Total
Less than $37,500          4%         50%       11%
= or > $37,500            19%         93%       41%
Chi Square Test
• Does an observed set of frequencies match an
expected pattern?
• Requirements:
– samples in cells must be independent
– Expected frequency must be 5 or more
• SPSS: Nonparametric - Chi-square - test variable - expected value
Chi Square: Compare Two Classifications
              Men             Women
Freq.      Obs.   Exp.     Obs.   Exp.     Total     M       W
0           13    16.5      20    16.5       33     33%     50%
1-3         12    12        12    12         24     30%     30%
4-7         15    11.5       8    11.5       23     37%     20%
Total       40              40               80    100%    100%
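The expected frequencies in the table above follow from the row and column totals (expected = row total × column total / grand total), and the chi-square statistic sums (observed - expected)² / expected over all cells. A minimal Python sketch using the observed counts from the slide; the variable names are illustrative, and the critical value 5.991 (df = 2, p = .05) comes from a standard chi-square table rather than from the slide.

```python
# Observed counts from the slide: rows = frequency category, columns = sex.
observed = {
    "0": {"Men": 13, "Women": 20},
    "1-3": {"Men": 12, "Women": 12},
    "4-7": {"Men": 15, "Women": 8},
}

# Marginal totals.
row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for col, n in cells.items():
        col_totals[col] = col_totals.get(col, 0) + n
grand_total = sum(row_totals.values())

# Expected count under independence: row total * column total / grand total.
expected = {
    row: {col: row_totals[row] * col_totals[col] / grand_total for col in cells}
    for row, cells in observed.items()
}

# Chi-square statistic: sum of (O - E)^2 / E over every cell.
chi_square = sum(
    (observed[row][col] - expected[row][col]) ** 2 / expected[row][col]
    for row in observed
    for col in observed[row]
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

CRITICAL_05_DF2 = 5.991  # standard table value, not from the slide
print(round(chi_square, 3), df, chi_square > CRITICAL_05_DF2)
```

With these counts the statistic is about 3.62 on 2 degrees of freedom, which is below 5.991, so under these assumptions the difference between men and women would not be significant at p = .05.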
Compare means from two samples: t-test
• Assume the 2 samples are independent
Is there a difference in number of years to pay back a home improvement loan between S&Ls and other financial institutions?
                 S&L     Other
Mean years       8.7      7.7
Variance          .5       .6
N                100       64
t = (8.7 - 7.7)/.1175 = 8.51
At p = .05, df = n1 + n2 - no. groups = 100 + 64 - 2 = 162, critical t = 1.96
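The slide does not show where the standard error of .1175 comes from. It matches the pooled-variance standard error for two independent samples, so the sketch below assumes that formula; the function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic using a pooled variance estimate (assumed method)."""
    df = n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two means.
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / std_error
    return t, std_error, df


# Values from the slide: S&L vs. other financial institutions.
t_stat, std_error, df = pooled_t(8.7, 0.5, 100, 7.7, 0.6, 64)
print(round(std_error, 4), round(t_stat, 2), df)  # 0.1175 8.51 162
```

Since 8.51 is far above the critical t of 1.96, the difference in mean payback years is significant at p = .05.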
Compare proportions from two samples: t-test
Do younger women use bubble baths less than older women?
                 Women under 35     Women 35-64
p(use bb)             .13               .23
Std. Dev.             .04               .04
N                     144               169
Average p(use bb), weighted mean = .184
Std. error of diff. of proportions = .0439
t = ((.13 - .23) - 0)/.0439 = -2.28
df = n - no. gps. = 144 + 169 - 2 = 311, critical t = 1.96
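The weighted mean and the standard error on this slide can be reproduced as follows. The sketch assumes the usual pooled-proportion formula, SE = sqrt(p(1 - p)(1/n1 + 1/n2)), which the slide does not spell out but which yields the same .0439; the variable names are illustrative.

```python
import math

# Values from the slide.
p_young, n_young = 0.13, 144   # women under 35
p_older, n_older = 0.23, 169   # women 35-64

# Weighted (pooled) proportion across both samples.
p_pooled = (n_young * p_young + n_older * p_older) / (n_young + n_older)

# Standard error of the difference of proportions (assumed pooled formula).
std_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_young + 1 / n_older))

# Test statistic against a hypothesized difference of 0.
t_stat = ((p_young - p_older) - 0) / std_error
df = n_young + n_older - 2

print(round(p_pooled, 3), round(std_error, 4), round(t_stat, 2), df)
# 0.184 0.0439 -2.28 311
```

The absolute value 2.28 exceeds the critical t of 1.96, so the difference in bubble bath use between the two age groups is significant at p = .05.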
SPSS
• Chi Square - compare two classifications:
– Analyze - Descriptive Statistics - Crosstabs - Statistics - Chi-square
• t-test - compare mean with expectation
– Analyze - Compare Means - One sample t-test - test variable - test value
• t-test - compare independent samples
– Analyze - Compare Means - Independent samples - test variable - grouping variable - levels of 2 groups
Correlation
• Is there an association between two variables?
• If so, how strong is it?
• What is the form of the association?
Correlation = measure of relationship between
two variables
Correlation-2
• CORRELATION DOES NOT MEAN CAUSATION!
• A measure of relationship; NOT a proportion!
• Reflects a linear relationship.
• Can range from -1.00 to +1.00
• Correlation is high if the points lie close to a straight line and low if they are scattered far from it.
• Correlation coefficient is standardized and
dimensionless.
• Value of correlation = degree of relationship
• Sign of correlation = direction of relationship
SPSS for Correlation
• Both continuous:
– Analyze-correlation-bivariate-enter 2 variables
• One dichotomous and one continuous:
– Analyze-correlation-bivariate-enter 2 variables
• Both dichotomous:
– Analyze-crosstabs-statistics-phi & Cramer's V
• Both ranks:
– Analyze-correlation-Spearman-enter 2 variables
Simple linear regression
• Want to predict value of one variable (DV) from
another variable (IV)
• Y = bX + a
• For each unit increase in X there is a b increase in Y
• r² = coefficient of determination = proportion of
variance accounted for by regression model.
• Relation between correlation coefficient (r) and b:
b = r (s.d.y/s.d.x)
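The relation b = r (s.d.y/s.d.x) can be checked numerically. The sketch below fits a least-squares line to a small made-up data set (the x and y values are hypothetical, chosen only for illustration) and compares the slope with r times the ratio of standard deviations.

```python
import math
import statistics

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = statistics.mean(x)
mean_y = statistics.mean(y)

# Sums of squares and cross-products around the means.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

# Least-squares slope and intercept for Y = bX + a.
b = sxy / sxx
a = mean_y - b * mean_x

# Pearson correlation coefficient and coefficient of determination.
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Slope recovered from the correlation: b = r * (s.d. of y / s.d. of x).
b_from_r = r * statistics.stdev(y) / statistics.stdev(x)

print(round(b, 3), round(a, 3), round(r_squared, 3), round(b_from_r, 3))
```

Both routes give the same slope, which is why r and b always share the same sign: the ratio of standard deviations only rescales r into the units of Y per unit of X.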
Simple regression: SPSS
• Analyze- regression - linear - dependent
variable name - independent (predictor)
variable name
Case
• A baking company found a correlation of .70 between
the number of persons in a HH and the consumption
of bread. They also found a correlation of -.35
between HH income and bread consumption.
• How would you interpret these findings?
• How much variance in HH bread consumption is
explained by a linear regression model using the
number of persons in the HH as the predictor
variable?
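As a starting point for the last question, the coefficient of determination is the square of the correlation. A short sketch using the two correlations given in the case; the variable names are illustrative.

```python
# Correlations reported in the case.
r_household_size = 0.70   # persons in HH vs. bread consumption
r_income = -0.35          # HH income vs. bread consumption

# Proportion of variance accounted for by a simple linear regression
# on each predictor separately: r squared.
r2_household_size = round(r_household_size ** 2, 4)
r2_income = round(r_income ** 2, 4)

print(r2_household_size, r2_income)  # 0.49 0.1225
```

Household size alone would account for about 49% of the variance in bread consumption, and income alone for about 12%. The negative sign on the income correlation indicates direction only and does not affect the proportion of variance explained.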