Essentials of Marketing Research (Second Edition)
Kumar, Aaker & Day
Instructor's Presentation Slides

Chapter Thirteen
Hypothesis Testing: Basic Concepts and Tests of Association, Means and Proportion

Hypothesis Testing: Basic Concepts
A hypothesis is an assumption made about a population parameter (not a sample statistic).
Purpose of hypothesis testing: to make a judgement about the difference between two sample statistics, or between a sample statistic and a hypothesized population parameter.
Evidence has to be evaluated statistically before arriving at a conclusion regarding the hypothesis.

Hypothesis Testing
The null hypothesis (Ho) is tested against the alternative hypothesis (Ha).
At least the null hypothesis is stated.
Decide upon the criteria to be used in making the decision whether to "reject" or "not reject" the null hypothesis.

The Logic of Hypothesis Testing
Evidence has to be evaluated statistically before arriving at a conclusion regarding the hypothesis.
The evaluation depends on whether the information generated from the sample is based on fewer or more observations.

[Flowchart: steps in hypothesis testing]
Problem definition
Clearly state the null and alternative hypotheses
Choose the relevant test and the appropriate probability distribution
Determine the significance level; determine the degrees of freedom; decide if one- or two-tailed test
Choose the critical value
Compute the relevant test statistic
Compare the test statistic and the critical value
Does the test statistic fall in the critical region? If no, do not reject the null; if yes, reject the null.

Basic Concepts of Hypothesis Testing (Contd.)
The three criteria used are:
  Significance level
  Degrees of freedom
  One- or two-tailed test

Significance Level
Indicates the percentage of sample means that is outside the cut-off limits (critical value).
The higher the significance level (α) used for testing a hypothesis, the higher the probability of rejecting a null hypothesis when it is true (Type I error).
Accepting a null hypothesis when it is false is called a Type II error, and its probability is β.

Significance Level (Contd.)
When choosing a level of significance, there is an inherent tradeoff between these two types of errors.
Power of hypothesis test: (1 - β)
A good test of hypothesis ought to reject a null hypothesis when it is false.
(1 - β) should be as high a value as possible.

Degrees of Freedom
The number, or bits, of "free" or unconstrained data used in calculating a sample statistic or test statistic.
A sample mean (X̄) has n degrees of freedom.
A sample variance (s²) has (n - 1) degrees of freedom.

One- or Two-Tailed Test
One-tailed hypothesis test
  Determines whether a particular population parameter is larger or smaller than some predefined value.
  Uses one critical value of the test statistic.
Two-tailed hypothesis test
  Determines the likelihood that a population parameter is within certain upper and lower bounds.
  May use one or two critical values.

Basic Concepts of Hypothesis Testing (Contd.)
Select the appropriate probability distribution based on two criteria:
  Size of the sample
  Whether the population standard deviation is known or not

Hypothesis Testing: Data Analysis Outcome
In population            Accept null hypothesis   Reject null hypothesis
Null hypothesis true     Correct decision         Type I error
Null hypothesis false    Type II error            Correct decision

Hypothesis Testing: Tests in This Class
Tests in this class       Statistical test
Frequency distributions   χ²
Means (one)               z (if σ is known); t (if σ is unknown)
Means (two)               t
Means (more than two)     ANOVA

Cross-tabulation and Chi Square
In marketing applications, the chi-square statistic is used as:
Test of independence
  Are there associations between two or more variables in a study?
Test of goodness of fit
  Is there a significant difference between an observed frequency distribution and a theoretical frequency distribution?
Statistical independence
  Two variables are statistically independent if a knowledge of one would offer no information as to the identity of the other.

Chi-Square As a Test of Independence
Null hypothesis Ho: two (nominally scaled) variables are statistically independent.
Alternative hypothesis Ha: the two variables are not independent.
Use the chi-square distribution to test.

Chi-square As a Test of Independence (Contd.)
Chi-square distribution
  A probability distribution
  Total area under the curve is 1.0
  A different chi-square distribution is associated with different degrees of freedom

Chi-square As a Test of Independence (Contd.)
Degrees of freedom: v = (r - 1) * (c - 1)
  r = number of rows in the contingency table
  c = number of columns
Mean of the chi-square distribution = degrees of freedom (v)
Variance = 2v

Chi-square Statistic (χ²)
Measures the difference between the actual number observed in cell i (O_i) and the number expected (E_i) under independence if the null hypothesis were true:
  $\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
with (r - 1) * (c - 1) degrees of freedom, where r = number of rows and c = number of columns. In the sum, n is the number of cells.
Expected frequency in each cell: E_i = p_c * p_r * n
where p_c and p_r are the column and row proportions for the independent variables and n is the total number of observations.

Chi-square Step-by-Step
1) Formulate hypotheses
2) Calculate row and column totals
3) Calculate row and column proportions
4) Calculate expected frequencies (E_i)
5) Calculate the χ² statistic
6) Calculate degrees of freedom
7) Obtain the critical value from the table
8) Make a decision regarding the null hypothesis

Example of Chi-square as a Test of Independence
A cross-tabulation of Class (1, 2) by Grade (A, B, C, D, E); each count in the table is a "cell".
Counts as listed for grades A, B, C: 10 20 45 8 16 18
Counts as listed for grades D, E: 16 6 9 2

Chi-square As a Test of Independence: Exercise
Own expensive automobile (Yes, No) by income (Low, Middle, High).
Counts as listed: 45 52 34 53 55 27
Task: Make a decision whether the two variables are independent!

The Chi-square Distribution
[Figure: density f(χ²) for df = 4, with the critical value 9.49 marked; the area under the curve to the right of the critical value is 5% (α = .05).]
Chi-square distributions are probability distributions that are continuous, have one mode, and are skewed to the right. The exact shape varies according to the number of degrees of freedom.
The critical value of a test statistic in a chi-square distribution is determined by specifying a significance level and the degrees of freedom.
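The step-by-step procedure above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `chi_square_independence` is hypothetical, the counts are the income-by-confidence-in-television cross-tabulation that appears in these slides, and the critical value 9.49 (df = 4, α = .05) is taken from the slide rather than computed.

```python
def chi_square_independence(observed):
    """Chi-square statistic for a contingency table given as a list of rows.

    Returns (chi2, df, expected). Hypothetical helper for illustration only.
    """
    # Step 2: row and column totals
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Steps 3-4: E = p_r * p_c * n, which simplifies to row_total * col_total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Step 5: sum of (O - E)^2 / E over all cells
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Step 6: degrees of freedom = (r - 1) * (c - 1)
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Rows: a great deal / only some / hardly any confidence in television.
# Columns: income under $10,000 / $10,000-20,000 / over $20,000.
tv_confidence = [
    [95, 57, 39],
    [272, 274, 214],
    [140, 163, 148],
]

chi2, df, expected = chi_square_independence(tv_confidence)

# Step 7: critical value quoted on the slide for df = 4 and alpha = .05
critical_value = 9.49

# Step 8: reject H0 (independence) when the statistic exceeds the critical value
reject_h0 = chi2 > critical_value
print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
```

Because the expected frequencies are not rounded here, the statistic comes out slightly above the 22.15 reported on the calculation slide; the decision to reject H0 is the same.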
Ex: Significance level = .05, degrees of freedom = 4, CVχ² = 9.49
The decision rule when testing hypotheses by means of the chi-square distribution is:
  If χ² <= CVχ², accept H0
  If χ² > CVχ², reject H0
Thus, for 4 df and α = .05:
  If χ² <= 9.49, accept H0
  If χ² > 9.49, reject H0

Cross Tabulation Example
In a nationwide study of 1,402 adults a question was asked about institutions: "I am going to name some institutions in this country. As far as the people running these institutions are concerned, would you say you have a great deal of confidence, only some confidence, or hardly any confidence at all in them?" One of the institutions was television. Answers to the question about television are cross-tabulated with three levels of income below.

Amount of confidence     Annual family income
in television            Under $10,000   $10,000 - 20,000   Over $20,000   Total
A great deal                   95               57                39         191
Only some                     272              274               214         760
Hardly any                    140              163               148         451
Total                         507              494               401       1,402

Calculations for income-confidence data
Cell      Observed   Expected   Contribution (O_ij - E_ij)² / E_ij
Cell 11      95        69.1        9.71
Cell 12      57        67.3        1.58
Cell 13      39        54.6        4.46
Cell 21     272       274.8         .03
Cell 22     274       267.8         .14
Cell 23     214       217.4         .05
Cell 31     140       163.1        3.27
Cell 32     163       158.9         .11
Cell 33     148       129.0        2.80
χ²ts = 22.15

α = .05, df = 4 [(r - 1)(c - 1)], n = 1402, χ²cv = 9.5, χ²ts = 22.15
[Figure: density f(χ²) for df = 4, with χ²cv = 9.5 cutting off 5% of the area under the curve (α = .05) and χ²ts = 22.15 falling to its right.]
Since χ²ts = 22.15 exceeds χ²cv = 9.5, the null hypothesis of independence is rejected.

Strength of Association
Measured by the contingency coefficient:
  $C = \sqrt{\frac{\chi^2}{\chi^2 + n}}$, with 0 < C < 1
A value of 0 means no association (i.e. the variables are statistically independent).
The maximum value depends on the size of the table, so compare only tables of the same size.

Limitations As an Association Measure
It is basically proportional to sample size
  Difficult to interpret in an absolute sense and to compare cross-tabs of unequal size
It has no upper bound
  Difficult to obtain a feel for its value
It does not indicate how two variables are related

Chi-square Goodness of Fit
Used to investigate how well the observed pattern fits the expected pattern.
The researcher may determine whether the population distribution corresponds to a normal, Poisson or binomial distribution.

Chi-square Degrees of Freedom
Employ the (k - 1) rule.
Subtract an additional degree of freedom for each population parameter that has to be estimated from the sample data.

Goodness-of-Fit Test
Suppose a researcher is investigating preferences for four possible names of a new lightweight brand of sandals: Camfo, Kenilay, Nemlads, and Dics. Since the names are generated from random combinations of syllables, the researcher expects preferences will be equally distributed across the four names (that is, each name will receive 25 percent of the available preferences). After sampling 300 people at random and asking them which one of the four names was most preferred, the following distribution resulted (each expected value is 300 * .25 = 75).

Possible name   Observed preferences   Expected preferences
Camfo                   30                     75
Kenilay                 80                     75
Nemlads                120                     75
Dics                    70                     75

Goodness-of-Fit Test (cont.)
There are (k - 1), or three, degrees of freedom in this instance.
If α is specified as 0.01, the critical value is 11.345 from Statistical Appendix Table 3.18.
Given this information, the hypothesis to be tested can be stated as:
  H0: preferences are equal for the names
  Ha: preferences are not equal for the names
And the decision rule is:
  If χ² <= 11.345, accept H0.
  If χ² > 11.345, reject H0.
The test statistic is calculated as
  $\chi^2 = \frac{(30-75)^2}{75} + \frac{(80-75)^2}{75} + \frac{(120-75)^2}{75} + \frac{(70-75)^2}{75} = 27.00 + .33 + 27.00 + .33 = 54.66$
Since 54.66 is greater than the critical value, H0 is rejected: preferences are not equal across the names.

Hypothesis Testing For Differences Between Means
Commonly used in experimental research.
The statistical technique used is analysis of variance (ANOVA).
Hypothesis testing criteria depend on:
  Whether the samples are obtained from different or related populations
  Whether the population standard deviation is known or not known
  If the population standard deviation is not known, whether the standard deviations can be assumed to be equal or not

The Probability Values (P-value) Approach to Hypothesis Testing
The p-value provides the researcher with an alternative method of testing a hypothesis without pre-specifying α.
It is the largest level of significance at which we would not reject Ho.
Difference between using α and the p-value
Hypothesis testing with a pre-specified α
  The researcher is trying to determine, "is the probability of what has been observed less than α?"
  Reject or fail to reject Ho accordingly.

The Probability Values (P-value) Approach to Hypothesis Testing (Contd.)
Using the p-value
  The researcher can determine "how unlikely is the result that has been observed?"
  Decide whether to reject or fail to reject Ho without being bound by a pre-specified significance level.
  In general, the smaller the p-value, the greater is the researcher's confidence in the sample findings.
  The p-value is generally sensitive to sample size: for a given observed difference, a larger sample yields a lower p-value.
  The p-value can report the impact of the sample size on the reliability of the results.

Hypothesis Testing About a Single Mean - Step-by-Step
1) Formulate hypotheses
2) Select the appropriate formula
3) Select the significance level
4) Calculate the z or t statistic
5) Calculate degrees of freedom (for t-test)
6) Obtain the critical value from the table
7) Make a decision regarding the null hypothesis

Hypothesis Testing About a Single Mean - Example 1
Ho: μ = 5000 (hypothesized value of population)
Ha: μ ≠ 5000 (alternative hypothesis)
n = 100, X̄ = 4960, σ = 250, α = 0.05
Rejection rule: if |z_calc| > z_{α/2} then reject Ho.

Hypothesis Testing About a Single Mean - Example 2
Ho: μ = 1000 (hypothesized value of population)
Ha: μ ≠ 1000 (alternative hypothesis)
n = 12, X̄ = 1087.1, s = 191.6, α = 0.01
Rejection rule: if |t_calc| > t_{df, α/2} then reject Ho.

Hypothesis Testing About a Single Mean - Example 3
Ho: μ <= 1000 (hypothesized value of population)
Ha: μ > 1000 (alternative hypothesis)
n = 12, X̄ = 1087.1, s = 191.6, α = 0.05
Rejection rule: if t_calc > t_{df, α} then reject Ho.

Confidence Intervals
Hypothesis testing and confidence intervals are two sides of the same coin.
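To make the "two sides of the same coin" remark concrete, the following sketch runs the two-tailed z-test of Example 1 and builds the matching (1 - α) confidence interval from the same numbers. It is an illustration under stated assumptions, not the authors' worked solution: the function name `z_test_and_interval` is hypothetical, and the critical value is computed with `statistics.NormalDist` from the Python standard library rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_and_interval(sample_mean, mu0, sigma, n, alpha):
    """Two-tailed z-test of H0: mu = mu0 plus the (1 - alpha) confidence interval.

    Hypothetical helper for illustration; assumes sigma is the known
    population standard deviation.
    """
    standard_error = sigma / sqrt(n)
    z_calc = (sample_mean - mu0) / standard_error

    # Critical value z_{alpha/2} from the standard normal distribution
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    # Rejection rule from the slide: reject H0 if |z_calc| > z_{alpha/2}
    reject = abs(z_calc) > z_crit

    # Interval estimate: sample mean +/- z_{alpha/2} * sigma / sqrt(n)
    interval = (
        sample_mean - z_crit * standard_error,
        sample_mean + z_crit * standard_error,
    )
    return z_calc, z_crit, reject, interval


# Example 1 from the slides: H0: mu = 5000, n = 100, mean = 4960, sigma = 250
z_calc, z_crit, reject, (low, high) = z_test_and_interval(
    sample_mean=4960, mu0=5000, sigma=250, n=100, alpha=0.05
)

# The test fails to reject H0 exactly when mu0 lies inside the interval
mu0_inside = low <= 5000 <= high
print(f"z = {z_calc:.2f}, critical = {z_crit:.2f}, reject H0: {reject}")
print(f"95% interval: ({low:.1f}, {high:.1f}), contains 5000: {mu0_inside}")
```

With these inputs the standard error is 25, so z_calc is -1.6, which lies inside ±1.96; Ho is not rejected, and the interval of roughly 4911 to 5009 contains the hypothesized 5000. The two statements are the same decision expressed in different ways.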
  $t = \frac{\bar{X} - \mu}{s_{\bar{X}}}$
  $\bar{X} \pm t \, s_{\bar{X}}$ = interval estimate of μ

Confidence Interval Estimation
  $\bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$
If (1 - α) = .95, then
  $P\left(\bar{X} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}\right) = .95$
Problem: α = .01, n = 75
Since the CI is for both sides, the z-value is obtained for α/2 = .005: Z_{α/2} = 2.58
σ/√n = 15/√75
  $P\left(290 - 2.58\,\frac{15}{\sqrt{75}} \le \mu \le 290 + 2.58\,\frac{15}{\sqrt{75}}\right) = .99$
  $P(285.54 \le \mu \le 294.46) = .99$

Test the hypothesis that the true mean weight of the Hawkeyes football team is greater than or equal to 300 pounds with α = .05
H0: μ_W >= 300
H1: μ_W < 300
At α = 0.05, CV_Z = -1.645 (for a one-tailed test)
Since Z_ts falls in the critical region, we ______________________ the null hypothesis

Test the hypothesis that the true mean weight of the Hawkeyes football team is equal to 286 pounds with α = 0.01
H0: μ_W = 286
H1: μ_W ≠ 286
At α = .01, CV_Z = 2.58
Since Z_ts < CV_Z, we __________________ the null hypothesis

Chain   n    Proportion of stores open for 24 hours
A       40   .45
B       75   .40
H0: P_A = P_B
HA: P_A not equal to P_B

α = .05
df = (n1 - 1) + (n2 - 1) = n1 + n2 - 2 = 113
p̂ = weighted average of sample proportions
Computation of t_ts would proceed as follows:
  $\hat{p} = \frac{40(.45) + 75(.40)}{40 + 75} = \frac{18 + 30}{115} = .42$
[Figure: two-tailed test with critical values -1.96 and +1.96 and an area of .025 in each tail.]

Descriptive statistics for two samples of students, liberal arts majors (n = 317) and engineering majors (n = 592), include:
                       X̄      S
Liberal arts majors    2.59   1.00
Engineering majors     2.29   1.10
The smaller the mean, the more students agree with the statement.
The formula for a t-test of mean differences for independent samples is
  $t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}$
with $s_{\bar{X}_1 - \bar{X}_2} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ being the standard error of the mean difference, where
  $s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
is a weighted average of sample standard deviations.
In this situation the hypothesis: [hypothesis statement missing from the transcript]

Pooled std. dev. = 1.07
t_ts = (2.59 - 2.29) / .07 = .30 / .07 = 4.29

Statistical Techniques
Analysis of Variance (ANOVA)
Correlation Analysis
Regression Analysis

Analysis of Variance
• ANOVA is mainly used for analysis of experimental data
• Ratio of "between-treatment" variance and "within-treatment" variance

Analysis of Variance (ANOVA)
Response variable - dependent variable (Y)
Factor(s) - independent variables (X)
Treatments - different levels of factors (r1, r2, r3, …)

One-Factor Analysis of Variance
Studies the effect of r treatments on one response variable.
Determine whether or not there are any statistically significant differences between the treatment means μ1, μ2, ..., μr.
Ho: all treatments have the same effect on mean responses
H1: at least 2 of μ1, μ2, ..., μr are different

Example (Book p.495)
Product sales by price level (stores 1 to 5)
Price level    1    2    3    4    5   Total   X̄_p
39¢            8   12   10    9   11     50     10
44¢            7   10    6    8    9     40      8
49¢            4    8    7    9    7     35      7
Overall sample mean: X̄ = 8.333
Overall sample size: n = 15
No. of observations per price level: n_p = 5

One-Factor ANOVA - Intuitively
If the ratio
  F = between-treatment variance / within-treatment variance
is large, then there are differences between treatments;
is small, then there are no differences between treatments.
To test the hypothesis, compute the ratio between the "between-treatment" variance and the "within-treatment" variance.

One-Factor ANOVA Table
Source of variation       Variation (SS)   Degrees of freedom   Mean sum of squares    F-ratio
Between (price levels)    SSr              r - 1                MSSr = SSr/(r - 1)     MSSr/MSSu
Within (price levels)     SSu              n - r                MSSu = SSu/(n - r)
Total                     SSt              n - 1

One-Factor Analysis of Variance
Between-treatment variation:
  $SS_r = \sum_{p=1}^{r} n_p (\bar{X}_p - \bar{X})^2 = 23.3$
Within-treatment variation:
  $SS_u = \sum_{p=1}^{r} \sum_{i=1}^{n_p} (X_{ip} - \bar{X}_p)^2 = 34$
where
  SSr = treatment sum of squares
  r = number of groups
  n_p = sample size in group p
  X̄_p = mean of group p
  X̄ = overall mean
  X_ip = sales at store i at level p

One-Factor Analysis of Variance
Between variance estimate (MSSr): MSSr = SSr/(r - 1) = 23.3/2 = 11.65
Within variance estimate (MSSu): MSSu = SSu/(n - r) = 34/12 = 2.8
where n = total sample size and r = number of groups

One-Factor Analysis of Variance
Total variation (SSt): SSt = SSr + SSu = 23.3 + 34 = 57.3
F-statistic: F = MSSr / MSSu = 11.65/2.8 = 4.16
DF: (r - 1), (n - r) = 2, 12
Critical value from table: CV(α, df) = 3.89
Since F = 4.16 exceeds the critical value of 3.89, Ho is rejected: at least two of the price-level means differ.
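The one-factor ANOVA arithmetic above can be checked with a short Python sketch. This is a minimal illustration rather than the book's own procedure: the function name `one_factor_anova` is hypothetical, the data are the product-sales figures from the price-level example, and the critical value 3.89 is taken from the slide instead of being computed.

```python
def one_factor_anova(groups):
    """One-factor ANOVA sums of squares and F-ratio for a list of samples.

    Hypothetical helper for illustration; each inner list is one treatment.
    """
    r = len(groups)                      # number of treatments
    n = sum(len(g) for g in groups)      # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Between-treatment sum of squares: sum of n_p * (group mean - grand mean)^2
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )

    # Within-treatment sum of squares: squared deviations from each group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between, df_within = r - 1, n - r
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    f_ratio = ms_between / ms_within
    return ss_between, ss_within, df_between, df_within, f_ratio


# Product sales at five stores for each price level (39, 44 and 49 cents)
sales_by_price = [
    [8, 12, 10, 9, 11],
    [7, 10, 6, 8, 9],
    [4, 8, 7, 9, 7],
]

ss_r, ss_u, df_r, df_u, f_ratio = one_factor_anova(sales_by_price)

critical_value = 3.89  # from the slide, for (2, 12) degrees of freedom
reject_h0 = f_ratio > critical_value
print(f"SSr = {ss_r:.2f}, SSu = {ss_u:.2f}, F = {f_ratio:.2f}, reject H0: {reject_h0}")
```

Without intermediate rounding the F-ratio comes out near 4.12 rather than the 4.16 shown on the slide, which divides the rounded values 11.65 and 2.8; either figure exceeds 3.89, so the conclusion is unchanged.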