MBA 7025
Statistical Business Analysis
Hypothesis Testing
Jan 27, 2015
Georgia State University - Confidential
MBA7025_Hypothesis_Testing.ppt/Jan 27, 2015/Page 1
Agenda
• Hypothesis Testing
• One-sample Hypothesis Test for the Mean
• Chi-Squared Tests
Introduction
• Attempt to prove (or disprove) some assumption
Setup:
• Alternate hypothesis: What you wish to prove
  Example: Person is guilty of crime
• Null hypothesis: Assume the opposite of what is to be proven. The null is always stated as an equality.
  Example: Person is innocent
Hypothesis Testing
• Take a sample, compute statistic of interest.
  The evidence gathered against defendant.
• How likely is it that, if the null were true, you would get such a statistic? (the p-value)
  How likely is it that an innocent person would be found at the scene of the crime, with gun in hand, etc.?
• If very unlikely, reject the null: the alternate is taken as proven beyond reasonable doubt.
• If quite likely, then the null may be true, so there is not enough evidence to discard it in favor of the alternate.
Types of Errors
Decision                                                   Null is really True                  Null is really False
Reject null, assume alternate is proven                    Type I Error (convict the innocent)  Good Decision
Do not reject null, evidence for alternate not strong enough  Good Decision                     Type II Error (let guilty go free)
Hypothesis Testing Roadmap
Hypothesis Testing
Continuous
Normal,
Interval Scaled
Attribute
Non-Normal,
Ordinal Scaled
χ² Contingency
Tables
Means
Variance
Medians
Variance
Correlation
Z-tests
χ²
Correlation
Levene’s
t-tests
F-test
Sign Test
Same tests as
Non-Normal
Medians
ANOVA
Bartlett’s
Wilcoxon
Correlation
Kruskal-Wallis
Regression
Mood’s
Friedman’s
Parametric Tests
Use parametric tests when:
• The data are normally distributed
• The variances of populations (if more than one is sampled from) are equal
• The data are at least interval scaled
1) One sample z - test
• Used when testing to see if a sample comes from a known population. A sample of 25 measurements shows a mean of 17. Test whether this is significantly different from the hypothesized mean of 15, assuming the population standard deviation is known to be 4.
One-Sample Z
Test of mu = 15 vs not = 15
The assumed standard deviation = 4
 N     Mean  SE Mean              95% CI     Z      P
25  17.0000   0.8000  (15.4320, 18.5680)  2.50  0.012
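The Minitab figures for the one-sample z-test can be checked by hand. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the original slides.

```python
import math
from statistics import NormalDist

# Values taken from the slide: n = 25, sample mean 17, hypothesized mean 15,
# known population standard deviation 4.
n, sample_mean, mu0, sigma = 25, 17.0, 15.0, 4.0

std_error = sigma / math.sqrt(n)                 # SE Mean
z = (sample_mean - mu0) / std_error              # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # two-sided test (mu not = 15)

# 95% confidence interval for the mean
z_crit = NormalDist().inv_cdf(0.975)
ci = (sample_mean - z_crit * std_error, sample_mean + z_crit * std_error)

print(f"SE = {std_error:.4f}, Z = {z:.2f}, P = {p_value:.3f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The two-sided p-value doubles the upper-tail area because the alternate is "not = 15".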
2) z – test for proportions
• 70% of 200 customers surveyed say they prefer the taste of Brand X over competitors. Test the hypothesis that more than 66% of people in the population prefer Brand X.
Test and CI for One Proportion
Test of p = 0.66 vs p > 0.66
Sample    X    N  Sample p  95% Lower Bound  Z-Value  P-Value
1       140  200  0.700000         0.646701     1.19    0.116
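As a rough check of the proportion test, the z-value and one-sided p-value can be recomputed with the normal approximation. This is a sketch under the assumption that the standard error is taken under the null (p = 0.66), which reproduces the Z-Value shown; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values from the slide: 140 of 200 customers prefer Brand X; null p = 0.66.
x, n, p0 = 140, 200, 0.66

p_hat = x / n
se_null = math.sqrt(p0 * (1 - p0) / n)   # standard error assuming the null is true
z = (p_hat - p0) / se_null
p_value = 1 - NormalDist().cdf(z)        # upper tail, since the alternate is p > 0.66

print(f"Sample p = {p_hat:.3f}, Z = {z:.2f}, P = {p_value:.3f}")
```

Since the p-value exceeds 0.05, the sample does not show that more than 66% of the population prefers Brand X.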
3) One sample t-test
• The data show reductions in Blood Pressure in a sample of 17 people after a certain treatment. We wish to test whether the average reduction in BP was at least 13%, a benchmark set by some other treatment that we wish to match or better.
[Figure: Probability Plot of BP Reduction (Normal, 95% CI). Mean = 13.82, StDev = 3.925, N = 17, AD = 0.204, P-Value = 0.850]
BP Reduction (%): 10, 12, 9, 8, 7, 12, 14, 13, 15, 16, 18, 12, 18, 19, 20, 17, 15
3) One sample t-test
• The p-value of 0.20 is greater than 0.05, so the reduction in BP could not be shown to be greater than 13%. If the true mean reduction were exactly 13%, a sample mean of 13.82 or more would occur with probability 0.20.
One-Sample T: BP Reduction
Test of mu = 13 vs > 13
Variable       N     Mean   StDev  SE Mean  95% Lower Bound     T      P
BP Reduction  17  13.8235  3.9248   0.9519          12.1616  0.87  0.200
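The one-sample t-test above can be reproduced from the 17 observations. The sketch below uses only the standard library; because the library has no t-distribution, the upper-tail probability is approximated by integrating the t density with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are hypothetical and not from the slides.

```python
import math
from statistics import mean, stdev

# BP reduction (%) for the 17 people, as listed on the slide.
bp_reduction = [10, 12, 9, 8, 7, 12, 14, 13, 15, 16, 18, 12, 18, 19, 20, 17, 15]
mu0 = 13.0  # benchmark to beat

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=20000):
    """P(T > t), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

n = len(bp_reduction)
std_error = stdev(bp_reduction) / math.sqrt(n)
t_stat = (mean(bp_reduction) - mu0) / std_error
p_value = t_upper_tail(t_stat, n - 1)   # one-sided: alternate is mu > 13

print(f"Mean = {mean(bp_reduction):.4f}, SE = {std_error:.4f}")
print(f"T = {t_stat:.2f}, P = {p_value:.3f}")
```

In practice a statistics package would supply the t-distribution directly; the numerical integration here only keeps the example dependency-free.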
4) Two sample t-test
• You realize that though the overall reduction is not proven to be more than 13%, there seems to be a difference between how men and women react to the treatment. You separate the 17 observations by gender, and wish to test whether there is in fact a significant difference between genders.
[Figure: Test for Equal Variances for BP Reduction, by Gender (95% Bonferroni Confidence Intervals for StDevs; boxplots of BP Reduction)]
F-Test: Test Statistic = 0.96, P-Value = 0.941
Levene's Test: Test Statistic = 0.14, P-Value = 0.716
BP Reduction (%) by Gender:
M: 10, 12, 9, 8, 7, 12, 14, 13
F: 15, 16, 18, 12, 18, 19, 20, 17, 15
4) Two sample t-test
• The test for equal variances shows no significant difference between the variances of the 2 samples. Thus a pooled 2-sample t-test may be conducted. The results are shown below. The p-value (reported as 0.000, i.e. below 0.0005) indicates there is a significant difference between the genders in their reaction to the treatment.
Two-sample T for BP Reduction M vs BP Reduction F
          N   Mean  StDev  SE Mean
BP Red M  8  10.63   2.50     0.89
BP Red F  9  16.67   2.45     0.82
Difference = mu (BP Red M) - mu (BP Red F)
Estimate for difference: -6.04167
95% CI for difference: (-8.60489, -3.47844)
T-Test of difference = 0 (vs not =): T-Value = -5.02  P-Value = 0.000  DF = 15
Both use Pooled StDev = 2.4749
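The pooled two-sample t statistic can be recomputed from the gender-split data. This is a minimal sketch with illustrative variable names; it reproduces the pooled standard deviation, degrees of freedom, and T-Value, and leaves the p-value to the Minitab output.

```python
import math
from statistics import mean, variance

# BP reduction (%) split by gender, as listed on the earlier slide.
men = [10, 12, 9, 8, 7, 12, 14, 13]
women = [15, 16, 18, 12, 18, 19, 20, 17, 15]

n_m, n_f = len(men), len(women)
df = n_m + n_f - 2

# Pooled variance: weighted average of the two sample variances
# (appropriate because the equal-variance test found no difference).
pooled_var = ((n_m - 1) * variance(men) + (n_f - 1) * variance(women)) / df
pooled_sd = math.sqrt(pooled_var)

diff = mean(men) - mean(women)
std_error = pooled_sd * math.sqrt(1 / n_m + 1 / n_f)
t_stat = diff / std_error

print(f"Difference = {diff:.5f}, Pooled StDev = {pooled_sd:.4f}")
print(f"T = {t_stat:.2f}, DF = {df}")
```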
Basics of ANOVA
• Analysis of Variance, or ANOVA, is a technique used to test the hypothesis that there is a difference between the means of two or more populations. It is used in Regression, as well as to analyze a factorial experiment design, and in Gauge R&R studies.
• The basic premise of ANOVA is that differences in the means of 2 or more groups can be seen by partitioning the Sum of Squares. Sum of Squares (SS) is simply the sum of the squared deviations of the observations from their means. Consider the following example with two groups. The measurements show the thumb lengths in centimeters of two types of primates.
Obs.  Type A  Type B
1          2       6
2          3       7
3          4       8
Mean       3       7
SS         2       2
Overall: Mean = 5, SS = 28
Basics of ANOVA
• Total variation (SS) is 28, of which only 4 (2+2) is within the two groups. Thus 24 of the 28 is due to the differences between the groups. This partitioning of SS into ‘between’ and ‘within’ is used to test the hypothesis that the groups are in fact different from each other.
• See www.statsoft.com for more details
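The partitioning of the Sum of Squares for the thumb-length example can be sketched as follows. The helper `sum_sq` is a hypothetical name; the F-value is formed as MS ‘Between’ divided by MS ‘Within’, where each MS is an SS divided by its degrees of freedom.

```python
# Thumb lengths (cm) for the two primate types from the example table.
groups = {"Type A": [2, 3, 4], "Type B": [6, 7, 8]}

def sum_sq(values, center):
    """Sum of squared deviations of the values from a given center."""
    return sum((v - center) ** 2 for v in values)

all_values = [v for g in groups.values() for v in g]
grand_mean = sum(all_values) / len(all_values)

ss_total = sum_sq(all_values, grand_mean)
ss_within = sum(sum_sq(g, sum(g) / len(g)) for g in groups.values())
ss_between = ss_total - ss_within

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_value = (ss_between / df_between) / (ss_within / df_within)

print(f"SS total = {ss_total}, within = {ss_within}, between = {ss_between}")
print(f"F = {f_value}")
```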
5) One-Way ANOVA
• The results of running an ANOVA on the sample data from the previous slide are shown here. The hypothesis test computes the F-value as the ratio of MS ‘Between’ to MS ‘Within’. The greater the value of F, the stronger the evidence that there is in fact a difference between the groups. Looking it up in an F-distribution table shows a p-value of 0.008: if the group means were equal in the population, an F-value this large would occur in only 0.8% of samples. The difference is therefore judged to be real (it exists in the Population, not just in the sample).
One-way ANOVA: Type A, Type B
Source  DF     SS     MS      F      P
Factor   1  24.00  24.00  24.00  0.008
Error    4   4.00   1.00
Total    5  28.00
S = 1   R-Sq = 85.71%   R-Sq(adj) = 82.14%
Minitab: Stat/ANOVA/One-Way (unstacked)
6) Two-Way ANOVA
• Is the strength of steel produced different for different temperatures to which it is heated and the speed with which it is cooled? Here 2 factors (speed and temp) are varied at 2 levels each, and strengths of 3 parts produced at each combination are measured as the response variable.
Strength  Temp  Speed
20.0      Low   Slow
22.0      Low   Slow
21.5      Low   Slow
23.0      Low   Fast
24.0      Low   Fast
22.0      Low   Fast
25.0      High  Slow
24.0      High  Slow
24.5      High  Slow
17.0      High  Fast
18.0      High  Fast
17.5      High  Fast
The results show significant main effects as well as an interaction effect.
Two-way ANOVA: Strength versus Temp, Speed
Source       DF       SS       MS      F      P
Temp          1   3.5208   3.5208   5.45  0.048
Speed         1  20.0208  20.0208  31.00  0.001
Interaction   1  58.5208  58.5208  90.61  0.000
Error         8   5.1667   0.6458
Total        11  87.2292
S = 0.8036   R-Sq = 94.08%   R-Sq(adj) = 91.86%
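The sums of squares in the two-way ANOVA table can be recomputed from the 12 strength measurements. The sketch below assumes a balanced design (equal observations per cell), as in this data set; the helper `level_ss` and the other names are illustrative only.

```python
from statistics import mean

# (strength, temp, speed) for the 12 parts listed on the slide.
data = [
    (20.0, "Low", "Slow"), (22.0, "Low", "Slow"), (21.5, "Low", "Slow"),
    (23.0, "Low", "Fast"), (24.0, "Low", "Fast"), (22.0, "Low", "Fast"),
    (25.0, "High", "Slow"), (24.0, "High", "Slow"), (24.5, "High", "Slow"),
    (17.0, "High", "Fast"), (18.0, "High", "Fast"), (17.5, "High", "Fast"),
]

grand_mean = mean(row[0] for row in data)

def level_ss(index):
    """Main-effect SS for the factor stored at the given tuple index."""
    total = 0.0
    for level in {row[index] for row in data}:
        ys = [row[0] for row in data if row[index] == level]
        total += len(ys) * (mean(ys) - grand_mean) ** 2
    return total

# Group observations by (temp, speed) cell to get the within-cell (error) SS.
cells = {}
for strength, temp, speed in data:
    cells.setdefault((temp, speed), []).append(strength)

ss_total = sum((row[0] - grand_mean) ** 2 for row in data)
ss_error = sum((y - mean(ys)) ** 2 for ys in cells.values() for y in ys)
ss_temp = level_ss(1)
ss_speed = level_ss(2)
ss_interaction = ss_total - ss_temp - ss_speed - ss_error

# Each effect has 1 degree of freedom here (2 levels per factor),
# so its mean square equals its sum of squares.
ms_error = ss_error / (len(data) - len(cells))
f_values = {
    "Temp": ss_temp / ms_error,
    "Speed": ss_speed / ms_error,
    "Interaction": ss_interaction / ms_error,
}

for name, f in f_values.items():
    print(f"{name}: F = {f:.2f}")
```

The interaction SS is obtained by subtraction, which is valid only because the design is balanced.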
6) Two-Way ANOVA
• The box plots give an indication of the interaction effect. The effect of speed on the response is different for different levels of temperature. Thus, there is an interaction effect between temperature and speed.
[Figure: Boxplot of Strength by Temp, Speed. Vertical axis: Strength (16 to 25); groups: High/Fast, High/Slow, Low/Fast, Low/Slow]
Hypothesis Testing Example
Gas Price
• You believe that the current price of unleaded regular gasoline is less than $4.00 on average nationwide, and wish to prove it.
• Set up the hypothesis and test it.
i) Null and Alternate Hypotheses
• What we wish to prove is called the Alternate Hypothesis. The opposite of that is the Null, which must be assumed and shown to be unlikely, based on sample data.
  H0: μ = 4.00
  Ha: μ < 4.00
What constitutes proof?
• Any conclusion based on a sample may be wrong. What probability (at most) of being wrong is acceptable to you?
• This is called α (alpha), or the acceptable Type I Error.
• Let α = 0.05 (or 5%)
ii) The Sample Data
• A sample of 49 gas stations nationwide shows an average price of unleaded of $3.87 and a standard deviation of $0.15.
• Could this sample have come from a population where the Mean was in fact $4.00 (or greater)?
• Assume the null is true, and this sample did in fact come from such a population.
iii) Sampling Distribution if H0 True
• What would the distribution of sample means from such a population look like? From the Central Limit Theorem, we have the following:
  μ_x̄ = μ = $4.00
  σ_x̄ = s/√n = 0.15/√49 = $0.02143
iv) The Test Statistic
• How far from the assumed mean of 4.00 is the observed sample mean of 3.87?
• Measured in Standard Errors, this is the t-statistic.
• One-sample t-test
• t = (Sample Mean – Population Mean) / Standard Error
• t = (3.87 - 4.00)/0.02143 = -6.06
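The standard error and t-statistic for the gas price example can be reproduced in a few lines. This is a minimal sketch with illustrative variable names; at full precision the statistic is about -6.07, so the slide's -6.06 appears to reflect truncation rather than a different method.

```python
import math

# Values from the example: 49 stations, mean $3.87, standard deviation $0.15,
# hypothesized population mean $4.00.
n, sample_mean, sample_sd, mu0 = 49, 3.87, 0.15, 4.00

std_error = sample_sd / math.sqrt(n)       # s / sqrt(n)
t_stat = (sample_mean - mu0) / std_error   # distance from mu0 in standard errors
df = n - 1                                 # degrees of freedom for the t-test

print(f"SE = {std_error:.5f}, t = {t_stat:.4f}, df = {df}")
```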
v) p-value
• The probability that a value would be as extreme as (or more extreme than) 6.06 SEs below the Mean is about 0.0000001.
• [In Excel, =TDIST(6.06,48,1)]
• This is called the p-value of the hypothesis test.
vi) Conclusion
• To determine if a result is statistically significant, a researcher calculates a p-value, which is the probability of observing an effect at least as extreme as the one in the sample, given that the null hypothesis is true. The null hypothesis is rejected if the p-value is less than α = 0.05 (5%).
• If the null were true (the average price were in fact 4.00), there is only a 0.0000001 probability that you would pick a sample with a mean of 3.87 or smaller from such a population. Therefore, either the null must be false (and therefore you proved your case) or you picked an extremely rare sample.
• You can conclude that the sample is very unlikely to have come from a population with Mean = 4.00 as assumed, and instead came from one with Mean < 4.00.
• The risk of a Type I Error is below 5%, your tolerance level. In other words, p < α, hence you proved the case beyond reasonable doubt.
Goodness-of-fit Test
A managed forest has the following distribution of trees:
  Douglas Fir     54%
  Ponderosa Pine  40%
  Grand Fir        5%
  Western Larch    1%
Mannan & Meslow (1984) made 156 observations of foraging by red-breasted nuthatches and found the following:
  Douglas Fir     70
  Ponderosa Pine  79
  Grand Fir        3
  Western Larch    4
Mannan, R.W., and E.C. Meslow. 1984. “Bird populations and vegetation characteristics in managed and old-growth forests, northeastern Oregon.” J. Wildl. Manage. 48: 1219-1238.
Hypotheses
• Do the birds forage randomly, without regard to what species of tree they are in? If so, the observed and expected distributions should be alike.
• Null: The distributions are alike (good fit, meaning birds forage randomly)
• Alternate: The distributions are different (lack of fit, or birds prefer certain vegetation)
Expected Values
• Based on the percentage distribution of trees, the expected counts for each type (out of 156) are:
  Douglas Fir     84.24  (54% of 156 = 84.24)
  Ponderosa Pine  62.40
  Grand Fir        7.80
  Western Larch    1.56
Chi-Square Statistics
                Expected  Observed     o-e  (o-e)Sq  (o-e)Sq/e
Douglas Fir        84.24        70  -14.24   202.78       2.41
Ponderosa Pine     62.40        79   16.60   275.56       4.42
Grand Fir           7.80         3   -4.80    23.04       2.95
Western Larch       1.56         4    2.44     5.95       3.82
Total             156.00       156
Chi-square = 13.593
p-value = 0.003514
For p-value in Excel, type =CHIDIST(13.593,3), for 3 degrees of freedom (number of groups - 1)
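The goodness-of-fit calculation can be sketched in Python as a cross-check on the table. The helper `chi2_upper_tail_df3` is a hypothetical name; it uses a closed-form expression for the chi-square upper tail that holds only for 3 degrees of freedom, chosen here to avoid external libraries.

```python
import math

# Share of each tree species in the forest and observed foraging counts.
tree_share = {"Douglas Fir": 0.54, "Ponderosa Pine": 0.40,
              "Grand Fir": 0.05, "Western Larch": 0.01}
observed = {"Douglas Fir": 70, "Ponderosa Pine": 79,
            "Grand Fir": 3, "Western Larch": 4}

n_obs = sum(observed.values())
expected = {species: share * n_obs for species, share in tree_share.items()}

# Chi-square statistic: sum of (o - e)^2 / e over all categories.
chi_square = sum((observed[s] - expected[s]) ** 2 / expected[s] for s in observed)
df = len(observed) - 1

def chi2_upper_tail_df3(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_upper_tail_df3(chi_square)

print(f"Chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.6f}")
```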
Conclusion
• Hypotheses:
  – Null: The distributions are alike (good fit, meaning birds forage randomly)
  – Alternate: The distributions are different (lack of fit, or birds prefer certain vegetation)
• To determine if a result is statistically significant, a researcher calculates a p-value, which is the probability of observing an effect at least as extreme as the one in the sample, given that the null hypothesis is true. The null hypothesis is rejected if the p-value is less than 0.05 (5%).
• Given the small p-value (0.0035 < 0.05), we reject the null. These birds are not foraging randomly: they prefer certain types of trees.
Test of Independence
        No Dog  Have Dog
Female      29        23
Male        35        24
• Demographic data on 111 students is available. We wish to study gender differences, in this case pertaining to dog ownership.
• Data Set: Student
• Variables: Gender, Dog (Yes/No)
• Are Gender and Dog Ownership independent of each other?
Hypotheses
• Null: The two variables are independent of each other (the occurrence of one does not influence the probability of the occurrence of the other.)
• Alternate: They are not independent (one influences the other)
Chi-Square Statistics
Tabulated statistics: Gender, Dog
Rows: Gender   Columns: Dog
           No    Yes     All
Female     29     23      52
        29.98  22.02   52.00
Male       35     24      59
        34.02  24.98   59.00
All        64     47     111
        64.00  47.00  111.00
Cell Contents: Count
               Expected count
Pearson Chi-Square = 0.143, DF = 1, P-Value = 0.705
Likelihood Ratio Chi-Square = 0.143, DF = 1, P-Value = 0.705
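The expected counts and the Pearson chi-square for the Gender by Dog table can be recomputed as a check. In this sketch the p-value uses the identity P(X > x) = erfc(sqrt(x/2)), which holds only for 1 degree of freedom; the variable names are illustrative.

```python
import math

# Observed counts from the Gender x Dog table.
table = {
    "Female": {"No": 29, "Yes": 23},
    "Male": {"No": 35, "Yes": 24},
}

row_totals = {g: sum(cols.values()) for g, cols in table.items()}
col_totals = {}
for cols in table.values():
    for answer, count in cols.items():
        col_totals[answer] = col_totals.get(answer, 0) + count
grand_total = sum(row_totals.values())

# Under independence, expected count = row total * column total / grand total.
expected = {
    (g, a): row_totals[g] * col_totals[a] / grand_total
    for g in table for a in col_totals
}

chi_square = sum(
    (table[g][a] - expected[(g, a)]) ** 2 / expected[(g, a)]
    for g in table for a in col_totals
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Upper-tail probability; this shortcut is valid only when df == 1.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"Pearson Chi-Square = {chi_square:.3f}, DF = {df}, P-Value = {p_value:.3f}")
```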
Conclusion
• Hypotheses:
  – Null: The two variables are independent of each other (the occurrence of one does not influence the probability of the occurrence of the other.)
  – Alternate: They are not independent (one influences the other)
• To determine if a result is statistically significant, a researcher calculates a p-value, which is the probability of observing an effect at least as extreme as the one in the sample, given that the null hypothesis is true. The null hypothesis is rejected if the p-value is less than 0.05 (5%).
• Given the p-value > 0.05, we fail to reject the null hypothesis. There is no evidence that Gender and Dog Ownership depend on each other; the data are consistent with gender having no influence on dog ownership.