Essentials of Marketing Research
(Second Edition)
Kumar, Aaker & Day
Instructor’s Presentation Slides
Essentials of Marketing Research, Second Edition
Kumar, Aaker & Day
Chapter Thirteen
Hypothesis Testing:
Basic Concepts and Tests of Association,
Means and Proportion
Hypothesis Testing:
Basic Concepts

Assumption (hypothesis) made about a population
parameter (not a sample statistic)

Purpose of Hypothesis Testing


To make a judgement about the difference between
two sample statistics or the sample statistic and a
hypothesized population parameter
Evidence has to be evaluated statistically before arriving at
a conclusion regarding the hypothesis.
Hypothesis Testing

The null hypothesis (Ho) is tested against the alternative
hypothesis (Ha).

At least the null hypothesis is stated.

Decide upon the criteria to be used in making the decision
whether to “reject” or "not reject" the null hypothesis.
The Logic of Hypothesis Testing

Evidence has to be evaluated statistically before arriving at
a conclusion regarding the hypothesis

The strength of the evidence depends on whether the sample
information is based on fewer or more observations
Problem Definition
1) Clearly state the null and alternative hypotheses
2) Choose the relevant test and the appropriate probability distribution
3) Choose the critical value, which requires:
   • Determine the significance level
   • Determine the degrees of freedom
   • Decide if one- or two-tailed test
4) Compute relevant test statistic
5) Compare test statistic and critical value
6) Does the test statistic fall in the critical region?
   • No: do not reject null
   • Yes: reject null
Basic Concepts of Hypothesis Testing
(Contd.)
The Three Criteria Used Are

Significance Level

Degrees of Freedom

One or Two Tailed Test
Significance Level

Indicates the percentage of sample means that is outside
the cut-off limits (critical value)

The higher the significance level (α) used for testing a
hypothesis, the higher the probability of rejecting a null
hypothesis when it is true (Type I error)

Accepting a null hypothesis when it is false is called a
Type II error and its probability is (β)
Significance Level (Contd.)




When choosing a level of significance, there is an inherent
tradeoff between these two types of errors
Power of hypothesis test (1 - β)
A good test of hypothesis ought to reject a null hypothesis
when it is false
1 - β should be as high a value as possible
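The tradeoff between the two error types can be illustrated numerically. The sketch below computes β and the power (1 - β) of an upper-tailed z-test for several significance levels. All figures (mu0 = 100, mu1 = 105, sigma = 20, n = 25) and the function name are hypothetical, chosen only for illustration; they do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tailed_z(mu0, mu1, sigma, n, alpha):
    """Power (1 - beta) of an upper-tailed z-test of Ho: mu = mu0
    when the true population mean is mu1 (hypothetical helper)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # cut-off for the chosen alpha
    shift = (mu1 - mu0) / (sigma / sqrt(n))     # true mean in standard-error units
    beta = std_normal.cdf(z_crit - shift)       # P(do not reject Ho | Ho false)
    return 1 - beta


# Hypothetical illustration values, not taken from the slides.
powers = {}
for alpha in (0.01, 0.05, 0.10):
    powers[alpha] = power_upper_tailed_z(mu0=100, mu1=105, sigma=20, n=25, alpha=alpha)
    print(f"alpha = {alpha:.2f}  beta = {1 - powers[alpha]:.3f}  power = {powers[alpha]:.3f}")
```

Raising α lowers β (and raises the power), which is the tradeoff described above.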
Degree of Freedom

The number (or bits) of "free" or unconstrained data used in
calculating a sample statistic or test statistic

A sample mean (X̄) has n degrees of freedom

A sample variance (s²) has (n-1) degrees of freedom
One or Two-tail Test


One-tailed Hypothesis Test

Determines whether a particular population
parameter is larger or smaller than some predefined
value

Uses one critical value of test statistic
Two-tailed Hypothesis Test

Determines the likelihood that a population
parameter is within certain upper and lower bounds

May use one or two critical values
Basic Concepts of Hypothesis Testing
(Contd.)

Select the appropriate probability distribution based on two
criteria

Size of the sample

Whether the population standard deviation is known or not
Hypothesis Testing
DATA ANALYSIS OUTCOME

In Population            Accept Null Hypothesis    Reject Null Hypothesis
Null Hypothesis True     Correct Decision          Type I Error
Null Hypothesis False    Type II Error             Correct Decision
Hypothesis Testing
Tests in this class

Data                       Statistical Test
Frequency Distributions    χ²
Means (one)                z (if σ is known); t (if σ is unknown)
Means (two)                t
Means (more than two)      ANOVA
Cross-tabulation and Chi Square
In Marketing Applications, Chi-square Statistic Is
Used As
Test of Independence

Are there associations between two or more variables in a study?
Test of Goodness of Fit

Is there a significant difference between an observed frequency
distribution and a theoretical frequency distribution?
Statistical Independence

Two variables are statistically independent if a knowledge of one
would offer no information as to the identity of the other
Chi-Square As a Test of Independence
Null Hypothesis Ho

Two (nominally scaled) variables are statistically
independent
Alternative Hypothesis Ha

The two variables are not independent
Use Chi-square distribution to test
Chi-square As a Test of Independence
(Contd.)
Chi-square Distribution

A probability distribution

Total area under the curve is 1.0

A different chi-square distribution is associated with
different degrees of freedom
Chi-square As a Test of Independence
(Contd.)
Degree of Freedom
v = (r - 1) * (c - 1)
r = number of rows in contingency table
c = number of columns

Mean of chi-squared distribution
= Degree of freedom (v)

Variance = 2v
Chi-square Statistic (χ²)

Measures the difference between the actual number observed
in cell i (O_i) and the number expected (E_i) under independence if the
null hypothesis were true

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where k is the number of cells in the table
With (r-1)*(c-1) degrees of freedom
r = number of rows, c = number of columns

Expected frequency in each cell: E_i = p_c * p_r * n
Where p_c and p_r are the column and row proportions for the independent
variables and n is the total number of observations
Chi-square Step-by-Step
1) Formulate Hypotheses
2) Calculate row and column totals
3) Calculate row and column proportions
4) Calculate expected frequencies (Ei)
5) Calculate χ² statistic
6) Calculate degrees of freedom
7) Obtain critical value from table
8) Make decision regarding the null hypothesis
Example of Chi-square as a Test of
Independence
Class
Grade
1
2
A
B
C
10
20
45
8
16
18
D
E
16
6
9
2
This is a ‘Cell’
Chi-square As a Test of Independence Exercise
Own
Expensive
Automobile
Yes
No
Low
Income
Middle
High
45
52
34
53
55
27
Task: Make a decision whether the two variables
are independent!
The chi-square distribution

[Figure: chi-square density f(χ²) for df = 4; the critical value 9.49 cuts off 5% of the area under the curve in the right tail (α = .05)]

Chi-square distributions are continuous probability distributions that have one mode and are skewed to the right.
Exact shape varies according to the number of degrees of freedom.
The critical value of a test statistic in a chi-square distribution is determined by specifying a
significance level and the degrees of freedom.

Ex: Significance level = .05
Degrees of freedom = 4
CVχ² = 9.49
The decision rule when testing hypotheses by means of the chi-square distribution is:
If χ² ≤ CVχ², accept H0
If χ² > CVχ², reject H0
Thus, for 4 df and α = .05:
If χ² ≤ 9.49, accept H0
If χ² > 9.49, reject H0
Cross Tabulation Example

In a nationwide study of 1,402 adults a question was asked about institutions:
“I am going to name some institutions in this country. As far as the people
running these institutions are concerned, would you say you have a great deal of
confidence, only some confidence, or hardly any confidence at all in them?”
One of the institutions was television.
Answers to the question about television are cross-tabulated with three levels of
income below.
Amount of confidence     Annual Family Income
in television            Under $10,000   $10,000 – 20,000   Over $20,000   Total
A great deal             95              57                 39             191
Only some                272             274                214            760
Hardly any               140             163                148            451
Total                    507             494                401            1,402
Calculations for income-confidence data

Cell      Observed   Expected   Contribution (O_i – E_i)² / E_i
Cell11    95         69.1       9.71
Cell12    57         67.3       1.58
Cell13    39         54.6       4.46
Cell21    272        274.8      .03
Cell22    274        267.8      .14
Cell23    214        217.4      .05
Cell31    140        163.1      3.27
Cell32    163        158.9      .11
Cell33    148        129.0      2.80

χ²ts = 22.15
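The calculation above can be reproduced from the observed counts alone. The following is a minimal sketch, assuming expected frequencies are computed as (row total × column total) / n; the function name is hypothetical and the critical value 9.49 is the table value quoted in these slides rather than something the code computes. Working with unrounded expected frequencies gives roughly 22.18, slightly above the slide's 22.15, which sums rounded contributions.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency
    table given as a list of rows of observed counts (hypothetical helper)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # E under independence
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: a great deal / only some / hardly any confidence in television.
# Columns: income under $10,000 / $10,000-20,000 / over $20,000.
observed_counts = [
    [95, 57, 39],
    [272, 274, 214],
    [140, 163, 148],
]
chi2_ts, df = chi_square_independence(observed_counts)

CRITICAL_VALUE = 9.49  # table value for df = 4, alpha = .05, as quoted in the slides
reject_h0 = chi2_ts > CRITICAL_VALUE
print(f"chi-square = {chi2_ts:.2f}, df = {df}, reject Ho: {reject_h0}")
```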
α = .05
df = 4 [(r-1)(c-1)]
n = 1402
χ²cv = 9.5
χ²ts = 22.15
[Figure: chi-square density f(χ²) for df = 4; χ²cv = 9.5 cuts off 5% of the area under the curve (α = .05), and χ²ts = 22.15 lies beyond it]
Strength of Association

Measured by the contingency coefficient

$$C = \sqrt{\frac{\chi^2}{\chi^2 + n}}, \qquad 0 \le C < 1$$

0 - no association (i.e., variables are statistically
independent)

Maximum value depends on the size of the table; compare
only tables of the same size
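As a quick illustration, the contingency coefficient can be computed for the income-confidence cross-tabulation shown earlier in the deck (χ² = 22.15, n = 1,402). This is a sketch only; the function name is hypothetical.

```python
from math import sqrt


def contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (chi2 + n)); 0 means no association."""
    return sqrt(chi2 / (chi2 + n))


# Values taken from the income-confidence example in the slides.
c_value = contingency_coefficient(chi2=22.15, n=1402)
print(f"C = {c_value:.3f}")
```

The result is small (about 0.12) even though the chi-square test on the same table rejects independence, which shows why strength of association is reported separately from significance.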
Limitations As an Association Measure
It Is Basically Proportional to Sample Size

Difficult to interpret in absolute sense and compare
cross-tabs of unequal size
It Has No Upper Bound

Difficult to obtain a feel for its value

Does not indicate how two variables are related
Chi-square Goodness of Fit

Used to investigate how well the observed pattern fits the
expected pattern

Researcher may determine whether population distribution
corresponds to either a normal, Poisson or binomial
distribution
Chi-square Degrees of Freedom

Employ (k-1) rule

Subtract an additional degree of freedom for each
population parameter that has to be estimated from the
sample data
Goodness-of-Fit Test
Suppose a researcher is investigating preferences for four possible names of a new
lightweight brand of sandals: Camfo, Kenilay, Nemlads, and Dics. Since the names
are generated from random combinations of syllables, the researcher expects
preferences will be equally distributed across the four names (that is, each name will
receive 25 percent of the available preferences). After sampling 300 people at
random and asking them which one of the four names was most preferred, the
following distribution resulted (each expected value is 300 * .25 = 75).

Possible Name   Observed Preferences   Expected Preferences
Camfo           30                     75
Kenilay         80                     75
Nemlads         120                    75
Dics            70                     75
Goodness-of-Fit Test (cont.)
There are (k – 1) or three degrees of freedom in this instance. If α is
specified as 0.01, the critical value is 11.345 from Statistical Appendix
Table 3.18. Given this information, the hypothesis to be tested can be
stated as:
H0: preferences are equal for the names
Ha: preferences are not equal for the names
And the decision rule is
If χ² ≤ 11.345, accept H0.
If χ² > 11.345, reject H0.
The test statistic is calculated as
χ² = (30-75)² / 75 + (80-75)² / 75 + (120-75)² / 75 + (70-75)² / 75
= 27.00 + .33 + 27.00 + .33
= 54.66
Since 54.66 exceeds the critical value, H0 is rejected.
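The arithmetic above can be checked with a few lines of code. This is a sketch of the equal-preference goodness-of-fit calculation; the critical value is entered as a constant (11.345 is the standard chi-square table value for 3 degrees of freedom at the .01 level) rather than computed. Without rounding the individual terms, the statistic comes to about 54.67.

```python
# Observed name preferences from the sandal-name example (n = 300).
observed = {"Camfo": 30, "Kenilay": 80, "Nemlads": 120, "Dics": 70}

total = sum(observed.values())
expected = total / len(observed)          # equal preference: 300 * .25 = 75

# Sum of (O - E)^2 / E over the four names.
chi2_gof = sum((count - expected) ** 2 / expected for count in observed.values())
df_gof = len(observed) - 1                # (k - 1) rule, no parameters estimated

CRITICAL_VALUE_01 = 11.345                # standard table value for df = 3, alpha = .01
reject_h0_gof = chi2_gof > CRITICAL_VALUE_01
print(f"chi-square = {chi2_gof:.2f}, df = {df_gof}, reject H0: {reject_h0_gof}")
```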

Hypothesis Testing For Differences
Between Means

Commonly used in experimental research

Statistical technique used is analysis of variance
(ANOVA)
Hypothesis Testing Criteria Depend on

Whether the samples are obtained from different or related populations

Whether the population standard deviation is known or not known

If the population standard deviation is not known, whether the standard deviations can be
assumed to be equal or not
The Probability Values (P-value) Approach
to Hypothesis Testing

P-value provides researcher with an alternative method of testing
hypotheses without pre-specifying α

Largest level of significance at which we would not reject Ho
Difference Between Using α and p-value

Hypothesis testing with a pre-specified α

Researcher is trying to determine, "is the probability
of what has been observed less than α?"

Reject or fail to reject Ho accordingly
The Probability Values (P-value) Approach
to Hypothesis Testing (Contd.)
Using the p-Value

Researcher can determine "how unlikely is the result that has been
observed?"

Decide whether to reject or fail to reject Ho without being bound by a
pre-specified significance level

In general, the smaller the p-value, the greater is the researcher's
confidence in sample findings

P-value is generally sensitive to sample size

For a given observed difference, a larger sample yields a lower p-value
P-value can report the impact of the sample size on the reliability of
the results
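To make the contrast with a pre-specified α concrete, the sketch below computes a two-sided p-value for a z-test and then compares it with several candidate significance levels. The figures (n = 100, X̄ = 4960, σ = 250, hypothesized mean 5000) are borrowed from the single-mean example later in this deck; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(sample_mean, mu0, sigma, n):
    """Return (z, p) for a two-tailed z-test of Ho: mu = mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # area in both tails beyond |z|
    return z, p


z_value, p_value = two_sided_p_value(sample_mean=4960, mu0=5000, sigma=250, n=100)
print(f"z = {z_value:.2f}, p-value = {p_value:.4f}")

# The same p-value supports a decision at any alpha the reader prefers.
for alpha in (0.01, 0.05, 0.10, 0.15):
    decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
    print(f"alpha = {alpha:.2f}: {decision}")
```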
Hypothesis Testing About
a Single Mean - Step-by-Step
1) Formulate Hypotheses
2) Select appropriate formula
3) Select significance level
4) Calculate z or t statistic
5) Calculate degrees of freedom (for t-test)
6) Obtain critical value from table
7) Make decision regarding the null hypothesis
Hypothesis Testing About
a Single Mean - Example 1






Ho: μ = 5000 (hypothesized value of population)
Ha: μ ≠ 5000 (alternative hypothesis)
n = 100
X̄ = 4960
σ = 250
α = 0.05
Rejection rule: if |z_calc| > z_{α/2} then reject Ho.
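Example 1 can be worked through numerically. This is a minimal sketch of the two-tailed z-test using the figures on the slide; the critical value is obtained from the standard normal distribution in Python's `statistics` module rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

# Figures from Example 1.
n, sample_mean, sigma, mu0, alpha = 100, 4960, 250, 5000, 0.05

standard_error = sigma / sqrt(n)                    # 250 / 10 = 25
z_calc = (sample_mean - mu0) / standard_error       # (4960 - 5000) / 25
z_crit = NormalDist().inv_cdf(1 - alpha / 2)        # two-tailed cut-off, about 1.96

reject_h0_ex1 = abs(z_calc) > z_crit
print(f"z_calc = {z_calc:.2f}, z_crit = {z_crit:.2f}, reject Ho: {reject_h0_ex1}")
```

Because |z_calc| = 1.60 is below 1.96, the null hypothesis is not rejected at the .05 level.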
Hypothesis Testing About
a Single Mean - Example 2






Ho: μ = 1000 (hypothesized value of population)
Ha: μ ≠ 1000 (alternative hypothesis)
n = 12
X̄ = 1087.1
s = 191.6
α = 0.01
Rejection rule: if |t_calc| > t_{df, α/2} then reject Ho.
Hypothesis Testing About
a Single Mean - Example 3






Ho: μ ≤ 1000 (hypothesized value of population)
Ha: μ > 1000 (alternative hypothesis)
n = 12
X̄ = 1087.1
s = 191.6
α = 0.05
Rejection rule: if t_calc > t_{df, α} then reject Ho.
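Examples 2 and 3 use the same sample, so one t statistic serves both. The sketch below computes it and applies each rejection rule. The critical values are entered as constants from a standard t table for 11 degrees of freedom (3.106 for a two-tailed test at α = .01, 1.796 for a one-tailed test at α = .05), since the Python standard library has no t distribution.

```python
from math import sqrt

# Figures shared by Examples 2 and 3.
n, sample_mean, s, mu0 = 12, 1087.1, 191.6, 1000

df_t = n - 1                                   # 11 degrees of freedom
t_calc = (sample_mean - mu0) / (s / sqrt(n))   # about 1.57

# Standard t-table values for df = 11 (entered by hand, not computed).
T_CRIT_TWO_TAILED_01 = 3.106   # Example 2: alpha = .01, two-tailed
T_CRIT_ONE_TAILED_05 = 1.796   # Example 3: alpha = .05, one-tailed

reject_example_2 = abs(t_calc) > T_CRIT_TWO_TAILED_01
reject_example_3 = t_calc > T_CRIT_ONE_TAILED_05
print(f"t_calc = {t_calc:.3f}, df = {df_t}")
print(f"Example 2 reject Ho: {reject_example_2}")
print(f"Example 3 reject Ho: {reject_example_3}")
```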
Confidence Intervals

Hypothesis testing and Confidence Intervals are two sides of
the same coin.

$$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}$$

$\bar{X} \pm t\, s_{\bar{X}}$ = interval estimate of μ
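Confidence Interval Estimation

$$\bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$$

If the confidence level = .95 then,

$$P\left(\bar{X} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}\right) = .95$$

Problem:
α = .01
n = 75
Since the CI is two-sided, the z-value is obtained for α/2 = .005
Z_{α/2} = 2.58

$$\frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{75}}$$

$$P\left(290 - 2.58\left(\frac{15}{\sqrt{75}}\right) \le \mu \le 290 + 2.58\left(\frac{15}{\sqrt{75}}\right)\right) = .99$$

$$P(285.54 \le \mu \le 294.46) = 0.99$$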
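The interval can be recomputed directly. This sketch uses the figures from the problem (X̄ = 290, σ = 15, n = 75, α = .01) and takes the z-value from the standard normal distribution instead of a table, so the multiplier is about 2.576 rather than the rounded 2.58.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, sigma, n, alpha = 290, 15, 75, 0.01

z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 2.576
margin = z_half_alpha * sigma / sqrt(n)              # half-width of the interval

ci_lower = sample_mean - margin
ci_upper = sample_mean + margin
print(f"99% CI for the mean: ({ci_lower:.2f}, {ci_upper:.2f})")
```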
Test the hypothesis that the true mean weight of the Hawkeyes football team is greater than or
equal to 300 pounds with α = .05
H0: μW ≥ 300
H1: μW < 300
At α = 0.05, CVZ = -1.645 (for a one-tailed test)
Since Zts falls in the critical region
We ______________________ the null hypothesis

Test the hypothesis that the true mean weight of the Hawkeyes
football team is equal to 286 pounds with α = 0.01
H0: μW = 286
H1: μW ≠ 286
At α = .01
CVZ = 2.58
Since Zts < CVZ we __________________ the null hypothesis
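The two test statistics for the Hawkeyes questions are not shown on the slides. The sketch below computes them under the assumption that the sample figures from the confidence-interval problem (X̄ = 290, σ = 15, n = 75) carry over to these tests; that carry-over is an inference from context, not something the slides state. The critical values are the ones quoted on the slides.

```python
from math import sqrt

# Assumed to carry over from the confidence-interval problem.
sample_mean, sigma, n = 290, 15, 75
standard_error = sigma / sqrt(n)

# Test 1: H0 mean >= 300 vs H1 mean < 300, alpha = .05, lower-tailed.
z_ts_300 = (sample_mean - 300) / standard_error
CV_Z_ONE_TAILED = -1.645                      # from the slide
in_critical_region_300 = z_ts_300 < CV_Z_ONE_TAILED

# Test 2: H0 mean = 286 vs H1 mean != 286, alpha = .01, two-tailed.
z_ts_286 = (sample_mean - 286) / standard_error
CV_Z_TWO_TAILED = 2.58                        # from the slide
in_critical_region_286 = abs(z_ts_286) > CV_Z_TWO_TAILED

print(f"Test 1: Zts = {z_ts_300:.2f}, in critical region: {in_critical_region_300}")
print(f"Test 2: Zts = {z_ts_286:.2f}, in critical region: {in_critical_region_286}")
```

Under this assumption the results agree with the slides' wording: the first statistic falls in the critical region and the second is below the critical value.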
Chain   N    Proportion of Stores Open for 24 Hours
A       40   .45
B       75   .40

H0: PA = PB
HA: PA not equal to PB
And
df = n1 + n2 - 2 = (n1 - 1) + (n2 - 1)
α = .05
df = 113
p̂ = weighted average of sample proportions
Computation of tts would proceed as follows:
$$\hat{p} = \frac{40(.45) + 75(.40)}{40 + 75} = \frac{18 + 30}{115} = .42$$
Since
then
and
[Figure: two-tailed rejection regions below -1.96 and above +1.96, with an area of .025 in each tail]
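The intermediate formulas for this example did not survive in the transcript, so the sketch below substitutes the standard pooled two-proportion test statistic, (pA - pB) / sqrt(p̂(1 - p̂)(1/nA + 1/nB)). Treat that formula as an assumption about what the slides showed. The ±1.96 cut-off matches the figure; with 113 degrees of freedom the t and normal critical values are nearly identical.

```python
from math import sqrt
from statistics import NormalDist

# Store counts and proportions open 24 hours, from the slide.
n_a, p_a = 40, 0.45
n_b, p_b = 75, 0.40
alpha = 0.05

# Weighted average of the sample proportions (p-hat).
p_pooled = (n_a * p_a + n_b * p_b) / (n_a + n_b)

# Standard error of the difference under H0: PA = PB (assumed standard formula).
std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
test_stat = (p_a - p_b) / std_error

critical = NormalDist().inv_cdf(1 - alpha / 2)      # about 1.96
reject_h0_prop = abs(test_stat) > critical
print(f"p-hat = {p_pooled:.2f}, test statistic = {test_stat:.2f}, reject H0: {reject_h0_prop}")
```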
Descriptive Statistics for two samples of students, liberal arts majors (n = 317) and
engineering majors (n = 592) include
                       X̄      S
Liberal arts majors    2.59   1.00
Engineering majors     2.29   1.10
The smaller the mean, the more students agree with the statement. The formula for a t-test of
mean differences for independent samples is
With
being the standard error of the mean difference
Where
Is a weighted average of sample standard deviations. In this situation the hypothesis:
Pooled Std. dev = 1.07
tts = (2.59 - 2.29) / .07 = .30 / .07 = 4.29
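The slide's formulas for this test were lost in extraction, so the sketch below uses the standard pooled-variance t-test for independent samples as a stand-in; that choice is an assumption. It reproduces the pooled standard deviation of about 1.07. Carrying full precision gives a standard error near .074 and t near 4.04, whereas the slide rounds the standard error to .07 and reports 4.29.

```python
from math import sqrt

# Descriptive statistics from the slide.
n_lib, mean_lib, s_lib = 317, 2.59, 1.00     # liberal arts majors
n_eng, mean_eng, s_eng = 592, 2.29, 1.10     # engineering majors

# Pooled standard deviation: weighted average of the sample variances.
pooled_var = ((n_lib - 1) * s_lib**2 + (n_eng - 1) * s_eng**2) / (n_lib + n_eng - 2)
pooled_sd = sqrt(pooled_var)

# Standard error of the mean difference, then the t statistic.
se_diff = pooled_sd * sqrt(1 / n_lib + 1 / n_eng)
t_ts = (mean_lib - mean_eng) / se_diff
df_pooled = n_lib + n_eng - 2

print(f"pooled sd = {pooled_sd:.2f}, se = {se_diff:.3f}, t = {t_ts:.2f}, df = {df_pooled}")
```

Either way the statistic is far beyond the usual two-tailed cut-off of about 1.96, so the rounding does not change the conclusion.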
Statistical techniques
Analysis of Variance (ANOVA)
Correlation Analysis
Regression Analysis
Analysis of Variance
• ANOVA mainly used for analysis of
experimental data
• Ratio of “between-treatment” variance
and “within-treatment” variance
Analysis of Variance (ANOVA)

Response variable - dependent variable (Y)

Factor(s) - independent variables (X)

Treatments - different levels of factors
(r1, r2, r3, …)
One - Factor Analysis of Variance

Studies the effect of 'r' treatments on one response variable

Determine whether or not there are any statistically significant
differences between the treatment means μ1, μ2, ..., μr

Ho: all treatments have same effect on mean responses

H1: At least 2 of μ1, μ2, ..., μr are different
Example (Book p.495)
Product Sales

Price Level   Store 1   2    3    4    5    Total   X̄p
39¢           8         12   10   9    11   50      10
44¢           7         10   6    8    9    40      8
49¢           4         8    7    9    7    35      7

Overall sample mean: X̄ = 8.333
Overall sample size: n = 15
No. of observations per price level: np = 5
One - Factor ANOVA - Intuitively
Ratio = Between Treatment Variance / Within Treatment Variance
If the ratio is large, then there are differences between treatments
If the ratio is small, then there are no differences between treatments

To Test Hypothesis, Compute the Ratio Between the
"Between Treatment" Variance and "Within Treatment"
Variance
One - Factor ANOVA Table
Source of Variation      Variation (SS)   Degrees of Freedom   Mean Sum of Squares   F-ratio
Between (price levels)   SSr              r-1                  MSSr = SSr/(r-1)      MSSr/MSSu
Within (price levels)    SSu              n-r                  MSSu = SSu/(n-r)
Total                    SSt              n-1
One - Factor Analysis of Variance

Between Treatment Variance

$$SS_r = \sum_{p=1}^{r} n_p (\bar{X}_p - \bar{X})^2 = 23.3$$

Within-treatment variance

$$SS_u = \sum_{p=1}^{r} \sum_{i=1}^{n_p} (X_{ip} - \bar{X}_p)^2 = 34$$

Where
SSr = treatment sums of squares
r = number of groups
np = sample size in group ‘p’
X̄p = mean of group p
X̄ = overall mean
Xip = sales at store i at level p
One - Factor Analysis of Variance

Between variance estimate (MSSr)
MSSr = SSr/(r-1) = 23.3/2 = 11.65

Within variance estimate (MSSu)
MSSu = SSu/(n-r) = 34/12 = 2.8
Where
n = total sample size
r = number of groups
One - Factor Analysis of Variance

Total variation (SSt): SSt = SSr + SSu = 23.3+34 = 57.3

F-statistic: F = MSSr / MSSu = 11.65/2.8 = 4.16

DF: (r-1), (n-r) = 2, 12

Critical value from table: CV(α, df) = 3.89
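The whole one-factor ANOVA can be recomputed from the price-level sales data in the example. This is a minimal sketch; the dictionary keys are labels invented for the code, and the critical value 3.89 is the table value quoted on the slide (it corresponds to α = .05 with 2 and 12 degrees of freedom), not something the code derives. Without intermediate rounding the F-ratio is about 4.12 rather than the slide's 4.16.

```python
# Unit sales in five stores at each price level (from the example table).
sales = {
    "39c": [8, 12, 10, 9, 11],
    "44c": [7, 10, 6, 8, 9],
    "49c": [4, 8, 7, 9, 7],
}

all_obs = [x for obs in sales.values() for x in obs]
n_total = len(all_obs)                     # 15 observations
r_groups = len(sales)                      # 3 price levels
grand_mean = sum(all_obs) / n_total
group_means = {level: sum(obs) / len(obs) for level, obs in sales.items()}

# Between-treatment and within-treatment sums of squares.
ss_between = sum(len(obs) * (group_means[level] - grand_mean) ** 2
                 for level, obs in sales.items())
ss_within = sum((x - group_means[level]) ** 2
                for level, obs in sales.items() for x in obs)

ms_between = ss_between / (r_groups - 1)   # MSSr
ms_within = ss_within / (n_total - r_groups)   # MSSu
f_ratio = ms_between / ms_within

F_CRITICAL = 3.89                          # table value quoted on the slide
reject_h0_anova = f_ratio > F_CRITICAL
print(f"SSr = {ss_between:.1f}, SSu = {ss_within:.1f}, F = {f_ratio:.2f}, "
      f"reject Ho: {reject_h0_anova}")
```

Since the F-ratio exceeds 3.89, the hypothesis that all price levels produce the same mean sales is rejected at that level.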