DATA ANALYSIS II
MKT 525

CROSS-TABULATIONS-1 (counts)

                       Number of Cars
Income                 1 or none   2 or more   Total
Less than $37,500         48           6         54
= or > $37,500            27          19         46
Total                     75          25        100

CROSS-TABULATIONS-2 (row percentages)

                       Number of Cars
Income                 1 or none   2 or more   Total
Less than $37,500         89%         11%       100%
= or > $37,500            59%         41%       100%

CROSS-TABULATIONS-3 (column percentages)

                       Number of Cars
Income                 1 or none   2 or more
Less than $37,500         64%         24%
= or > $37,500            36%         76%
Total                    100%        100%

CROSS-TABULATIONS-4

                       Number of Cars
Family size            1 or none   2 or more   Total
4 or less                 90%         10%       100%
5 or more                 23%         77%       100%

CROSS-TABULATIONS-5

                       Family: 4 members or less    Family: 5 or more
                       No. of Cars                  No. of Cars
Income                 1 or none   2 or more        1 or none   2 or more
Less than $37,500         96%          4%              50%         50%
= or > $37,500            81%         19%               7%         93%

CROSS-TABULATIONS-6 (percent of households with 2 or more cars)

                       Size of family
Income                 4 or less   5 or more   Total
Less than $37,500          4%         50%       11%
= or > $37,500            19%         93%       41%

Chi Square Test
• Does an observed set of frequencies match an expected pattern?
• Requirements:
  – Samples in cells must be independent
  – Expected frequency in each cell must be 5 or more
• SPSS: Nonparametric - Chi-square - test variable - expected value

Chi Square: Compare Two Classifications

           Women          Men
Freq.    Obs.   Exp.    Obs.   Exp.    Total     M       W
0         13    16.5     20    16.5     33      33%     50%
1-3       12    12       12    12       24      30%     30%
4-7       15    11.5      8    11.5     23      37%     20%
Total     40             40             80     100%    100%

Compare means from two samples: t-test
• Assume the 2 samples are independent
• Is there a difference in the number of years to pay back a home improvement loan between S&Ls and other financial institutions?

              S&L     Other
Mean years    8.7     7.7
Variance      .5      .6
N             100     64

t = (8.7 - 7.7)/.1175 = 8.51
At p = .05, df = n1 + n2 - no. of groups = 100 + 64 - 2 = 162, critical t = 1.96

Compare proportions from two samples: t-test
• Do younger women use bubble baths less than older women?

              Women under 35   Women 35-64
p(use bb)          .13             .23
Std. Dev.          .04             .04
N                  144             169

Average p(use bb), weighted mean = .184
Std. error of diff. of proportions = .0439
t = [(.13 - .23) - 0]/.0439 = -2.28
df = n - no. of groups = 144 + 169 - 2 = 311
Critical t = 1.96

SPSS
• Chi square, compare two classifications:
  – Analyze - Descriptive Statistics - Crosstabs - Statistics - Chi-square
• t-test, compare mean with expectation:
  – Analyze - Compare Means - One-sample t-test - test variable - test value
• t-test, compare independent samples:
  – Analyze - Compare Means - Independent samples - test variable - grouping variable - levels of 2 groups

Correlation
• Is there an association between two variables?
• If so, how strong is it?
• What is the form of the association?
Correlation = measure of relationship between two variables

Correlation-2
• CORRELATION DOES NOT MEAN CAUSATION!
• A measure of relationship; NOT a proportion!
• Reflects a linear relationship.
• Can range from -1.00 to +1.00
• Correlation is high if the points lie close to a line and low if the points are scattered far from it.
• Correlation coefficient is standardized and dimensionless.
• Value of correlation = degree of relationship
• Sign of correlation = direction of relationship

SPSS for Correlation
• Both continuous:
  – Analyze - correlation - bivariate - enter 2 variables
• One dichotomous and one continuous:
  – Analyze - correlation - bivariate - enter 2 variables
• Both dichotomous:
  – Analyze - crosstabs - statistics - phi & Cramer's V
• Both ranks:
  – Analyze - correlation - Spearman - enter 2 variables

Simple linear regression
• Want to predict the value of one variable (DV) from another variable (IV)
• Y = bX + a
• For each unit increase in X there is a b increase in Y
• r² = coefficient of determination = proportion of variance accounted for by the regression model.
• Relation between correlation coefficient (r) and b: b = r (s.d.y/s.d.x)

Simple regression: SPSS
• Analyze - regression - linear - dependent variable name - independent (predictor) variable name

Case
• A baking company found a correlation of .70 between the number of persons in a household (HH) and the consumption of bread. They also found a correlation of -.35 between HH income and bread consumption.
• How would you interpret these findings?
• How much variance in HH bread consumption is explained by a linear regression model using the number of persons in the HH as the predictor variable?
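
The row and column percentages in CROSS-TABULATIONS-2 and CROSS-TABULATIONS-3 can be reproduced from the counts in CROSS-TABULATIONS-1. The following is a minimal Python sketch of that arithmetic; the variable and function names are illustrative and are not part of the slides.

```python
# Counts from CROSS-TABULATIONS-1.
# Rows: income (< $37,500, >= $37,500); columns: cars (1 or none, 2 or more).
counts = [
    [48, 6],
    [27, 19],
]

def row_percentages(table):
    """Each cell as a rounded percentage of its row total."""
    return [[round(100 * cell / sum(row)) for cell in row] for row in table]

def column_percentages(table):
    """Each cell as a rounded percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [round(100 * cell / total) for cell, total in zip(row, col_totals)]
        for row in table
    ]

row_pct = row_percentages(counts)
col_pct = column_percentages(counts)

print(row_pct)  # [[89, 11], [59, 41]]
print(col_pct)  # [[64, 24], [36, 76]]
```

Row percentages answer "given income, how many cars?", while column percentages answer "given car ownership, what income?"; the slides show both directions for the same counts.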
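
The "Chi Square: Compare Two Classifications" slide lists observed and expected counts but does not show the test statistic. The sketch below, in plain Python, derives the expected counts from the row and column totals and sums (observed - expected)² / expected over the cells. The 5.991 critical value is the standard chi-square table value for df = 2 at the .05 level and is an addition here, not a number from the slides; the two groups are kept in the order the slide lists them.

```python
# Observed counts from the chi-square slide.
# Rows: frequency categories 0, 1-3, 4-7; columns: the two groups
# in the order they appear on the slide.
observed = [
    [13, 20],
    [12, 12],
    [15, 8],
]

def chi_square(table):
    """Return (statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

stat, df, expected = chi_square(observed)

# Standard table value for df = 2, alpha = .05 (not taken from the slides).
CRITICAL_05_DF2 = 5.991

print(expected)        # [[16.5, 16.5], [12.0, 12.0], [11.5, 11.5]]
print(round(stat, 2))  # 3.62
print(df)              # 2
print(stat > CRITICAL_05_DF2)  # False
```

Under these assumptions the statistic falls below the critical value, so the two distributions would not be judged different at the .05 level. Every expected count is above 5, which satisfies the requirement stated on the Chi Square Test slide.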
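
The two t-test slides give the standard errors (.1175 and .0439) without showing how they were obtained. A sketch that reproduces both figures follows, assuming a pooled-variance standard error for the difference in means and a pooled (weighted-mean) proportion for the difference in proportions. These assumptions match the slide values to the precision shown, but the slides do not state the formulas, and the function names are illustrative.

```python
import math

def pooled_t_means(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t for means using a pooled variance (assumed method)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, se, df

def pooled_t_proportions(p1, n1, p2, n2):
    """Two-sample t for proportions using the weighted-mean proportion."""
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se, se, p_bar

# Loan payback example: S&Ls vs. other financial institutions.
t_means, se_means, df_means = pooled_t_means(8.7, 0.5, 100, 7.7, 0.6, 64)
print(round(se_means, 4), round(t_means, 2), df_means)  # 0.1175 8.51 162

# Bubble bath example: women under 35 vs. women 35-64.
t_prop, se_prop, p_bar = pooled_t_proportions(0.13, 144, 0.23, 169)
print(round(p_bar, 3), round(se_prop, 4), round(t_prop, 2))  # 0.184 0.0439 -2.28
```

Both absolute t values exceed the critical value of 1.96 given on the slides, so each difference would be judged significant at p = .05.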
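
For the variance question in the Case, the Simple linear regression slide defines r² as the proportion of variance accounted for, so with a single predictor the answer follows directly from the correlation. The sketch below applies that definition to both correlations in the Case and illustrates the slide's relation b = r (s.d.y/s.d.x). The standard deviations passed to `slope_from_r` are hypothetical placeholders, since the Case does not report any.

```python
def r_squared(r):
    """Coefficient of determination for a one-predictor linear regression."""
    return r * r

def slope_from_r(r, sd_y, sd_x):
    """Regression slope from the correlation: b = r * (s.d.y / s.d.x)."""
    return r * sd_y / sd_x

# Correlations reported in the Case.
r_persons = 0.70   # persons in household vs. bread consumption
r_income = -0.35   # household income vs. bread consumption

print(round(r_squared(r_persons), 4))  # 0.49
print(round(r_squared(r_income), 4))   # 0.1225

# Hypothetical standard deviations, for illustration only.
print(round(slope_from_r(r_persons, sd_y=2.0, sd_x=1.0), 2))  # 1.4
```

So household size would account for about 49% of the variance in bread consumption, and income for about 12%. The negative sign on the income correlation indicates direction only, and neither figure establishes causation.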