Six Sigma Greenbelt Training
Hypothesis Testing
Dave Merritt
12/6/16
Hypothesis Testing Flowchart
[Flowchart: Hypothesis Testing branches by data type]
Variable Data
  Variance Tests
    - Chi Square Test: compares a distribution to a specification or target
    - Test for Equal Variances: compares the variances of 2 or more distributions
    - F Test: compares the variances of 2 distributions
  Mean Tests
    - 1 Sample t Test: compares one group mean to a target
    - 2 Sample t Test: compares the means of two distributions
    - Paired t Test: compares the paired differences of two distributions
    - One-way ANOVA: compares the means of two or more distributions
Attribute Data
  - Contingency Table: evaluates if methods or classifications are independent
Hypothesis Testing Flowchart
[Flowchart repeated, highlighting the Attribute Data branch: Contingency Table]
Contingency Tables
This tool is used to test the relationship between two sources of variation. In statistics, the relationship can be one of two kinds:
- Independent: there is no relationship at all (two different populations).
- Dependent: there is a common relationship between them (same population).
This tool tells us which of the relationships is statistically valid. It is important to note that it won't tell us whether the data are good or bad, only whether there is a difference.
Contingency Table

The statistic to use is Chi-Square (χ²):

    χ² = Σ (Oij - Eij)² / Eij

    degrees of freedom = (r-1)(c-1)

where:
- O is the observed value
- E is the expected value (Minitab will calculate this for you)
- r = number of rows
- c = number of columns
Contingency Table

Chi-Square (χ²): The Chi-Square Distribution
[Figure: Chi-Square distribution curves for 2, 4, and 8 degrees of freedom. The shape of the distribution changes with the number of degrees of freedom, and the probability decreases as Chi-Square increases.]
Contingency Tables
Example: To illustrate the use and analysis of contingency tables, let’s
use an evaluation of vehicle color preference vs. vehicle type.
Is the color preference dependent on the type?
              Red   Green
  Sports Car  201     45
  Sedan       183     58
  Truck       178     64
Solution:
Step 1
Null Hypothesis Ho: Preference for Vehicle color is independent of
vehicle type
Step 2
Alternate Hypothesis Ha: Color preference is not independent
of vehicle type
Step 3
To determine the critical value of the chi-square, we need to know df, the number of degrees of freedom involved.
Step 4
We will use Minitab to find both df and the chi-square statistic, and to calculate the expected values.
Contingency Tables
Let's go to Minitab (HYPTEST.mpj) to set up the Chi-Square test/Contingency Table for this Color vs. Vehicle Type example.
1) Set up the following table in Minitab:

       C1      C2           C3     C4
       Color   Sports Car   Sedan  Truck
   1   Red     201          183    178
   2   Green   45           58     64

2) Minitab Stat - Tables - Chi-Square Test for Association
3) Select all Columns C2, C3, C4
4) Select OK
Contingency Tables
Minitab Results

Chi-Square Test
Expected counts are printed below observed counts

        Sports C    Sedan    Truck    Total
   1        201       183      178      562
         189.65    185.79   186.56
   2         45        58       64      167
          56.35     55.21    55.44
Total       246       241      242      729

Chi-Sq = 0.680 + 0.042 + 0.393 +
         2.288 + 0.141 + 1.322 = 4.866
DF = 2, P-Value = 0.088
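The same analysis can be reproduced outside Minitab. Below is a minimal sketch using Python's scipy, offered as an illustrative alternative rather than part of the original training workflow; the observed counts are taken from the color vs. vehicle type table above.

# Sketch: contingency-table chi-square test in Python/scipy.
from scipy.stats import chi2_contingency

observed = [
    [201, 183, 178],  # Red:   Sports Car, Sedan, Truck
    [45,  58,  64],   # Green: Sports Car, Sedan, Truck
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"Chi-Sq = {chi2:.3f}, DF = {dof}, P-Value = {p:.3f}")
# Matches Minitab: Chi-Sq = 4.866, DF = 2, P-Value = 0.088
print(expected)  # expected counts, e.g. 189.65 for the Red / Sports Car cell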
Contingency Tables - Detail

Chi-Square Test
Expected counts are printed below observed counts

        Sports C    Sedan    Truck    Total
   1        201       183      178      562
         189.65    185.79   186.56
   2         45        58       64      167
          56.35     55.21    55.44
Total       246       241      242      729

Chi-Sq = 0.680 + 0.042 + 0.393 +
         2.288 + 0.141 + 1.322 = 4.866
DF = 2, P-Value = 0.088

Reading the output:
- Observed counts: the top number in each cell.
- Expected count: the product of the ith row total and the jth column total divided by the total of all cells. For the Red / Sports Car cell: (562*246)/729 = 189.65.
- Column totals are totals of the observed data; first column: 201 + 45 = 246.
- Row totals: first row: 201 + 183 + 178 = 562. Grand total in the end column: 562 + 167 = 729.
- Contribution to Chi-Sq: each term in the sum is [(Obs - Exp)^2]/Exp.
- Degrees of freedom: (rows-1)*(columns-1) = (2-1)*(3-1) = 2.
- P-Value: the probability that you would have obtained the observed counts if the variables were independent of each other. If this value is less than or equal to your alpha level, you can say the variables are dependent.
Contingency Table
5) Determine the degrees of freedom: (r-1)(c-1) = (2-1)(3-1) = 2
6) Determine the critical value using Minitab:
   Calc - Probability Distributions - Chi-Square - Inverse Cumulative Probability
   Degrees of freedom = 2
   Input Constant = 0.95  (1 - α = 1.0 - 0.05 = 0.95)
7) OK
   (Or, use the tables in the appendix.)

Inverse Cumulative Distribution Function
Chi-Square with 2 DF:
P( X <= x )       x
     0.9500  5.9915
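The same critical value can be looked up outside Minitab; a minimal sketch using scipy (an assumption of tooling, not part of the original material):

# Sketch: chi-square critical value for alpha = 0.05 and 2 degrees of freedom.
from scipy.stats import chi2

critical_value = chi2.ppf(0.95, df=2)   # inverse CDF evaluated at 1 - alpha
print(round(critical_value, 4))          # 5.9915, matching the Minitab output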
8) Conclusion
Chi-Square value = 4.866, which is less than the critical value of 5.9915.
p-value = 0.088, which is > 0.05.
Accept the null hypothesis: color preference and vehicle type are independent.
Chi-Squared Distribution
[Figure: Chi-Square distribution with df = 2 and alpha = 0.05. The test statistic Chi-Sq = 4.866 falls below the critical value of 5.99, in the "accept the null hypothesis" region.]
Hypothesis Testing Flowchart
[Flowchart repeated, highlighting the Variance Tests branch: Variable Data - Test for Equal Variances (compares the variances of 2 or more distributions) and F Test (compares the variances of 2 distributions); Attribute Data - Contingency Table (evaluates if methods or classifications are independent).]
Test for Equal Variances

- When comparing two distributions using variable data, we must first decide whether there is a statistical difference in the variances. This is important since it affects the formula used to perform the test on the means.
- We also need to know if the distributions are normally distributed, since this can affect the type of homogeneity of variance test used.
- Our first step will be to plot the data using the normal probability plot in Minitab (Stat - Basic Statistics - Normality Test).
- The results of this step will determine how you proceed.
Test for Equal Variances

If the normal probability plot indicates we are dealing with normally distributed data, then we can use one of two types of tests:
- F test: only for use if there are two distributions. (Minitab will perform this test under Homogeneity of Variance.)
- Bartlett's test: this can be used for two or more distributions. (Minitab performs this test under Homogeneity of Variance.)
If the data are not normally distributed, then we have only one option:
- Levene's test: this may be used on two or more distributions. (Minitab performs this test under Homogeneity of Variance.)
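For readers working outside Minitab, here is a minimal sketch of the same three tests using Python's scipy. The data are the Reactor 1 / Reactor 2 yields from the example that follows; the manual F ratio is an illustrative construction, since scipy has no single two-sample variance F test function.

# Sketch: equal-variance tests outside Minitab, using scipy.
import numpy as np
from scipy import stats

sample1 = np.array([89.7, 81.4, 84.5, 84.8, 87.3, 79.7, 85.1, 81.7, 83.7, 84.5])
sample2 = np.array([84.7, 86.1, 83.2, 91.9, 86.3, 79.3, 82.6, 89.1, 83.7, 88.5])

# Bartlett's test (normal data, 2 or more groups)
stat_b, p_b = stats.bartlett(sample1, sample2)

# Levene's test (robust to non-normal data, 2 or more groups)
stat_l, p_l = stats.levene(sample1, sample2)

# F test (normal data, exactly 2 groups): ratio of the sample variances
f = sample1.var(ddof=1) / sample2.var(ddof=1)
df1, df2 = len(sample1) - 1, len(sample2) - 1
p_f = 2 * min(stats.f.cdf(f, df1, df2), stats.f.sf(f, df1, df2))  # two-sided

print(p_b, p_l, p_f)  # all p > 0.05 for these data: no evidence the variances differ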
Test for Equal Variances
When performing the Homogeneity of Variance test we will use Minitab:
- Stat - ANOVA - Test for Equal Variances
- Enter the data in the stacked format
- Click OK

The test indicates a significant difference if the calculated p-value is less than the specified alpha value. (95% confidence has an alpha of 0.05 or 5%; therefore, a calculated p-value of less than 0.05 indicates a significant difference between the two distributions.)
It is important to note that the test only indicates a significant difference. It cannot determine goodness or badness. Your knowledge of the process must be used to evaluate this condition.
Test for Equal Variances

The calculated F value should be compared to the critical F value. If the calculated value is larger than the critical value, then there is a significant difference between the two distributions. If it is the same or smaller, then statistically there is no difference between the two distributions; they represent the same population.
There are two important issues to note:
1. With the F value itself, a larger number indicates a difference. The F, Bartlett's, and Levene's tests in Test for Equal Variances report a probability statistic (p-value), so there a smaller number indicates significance (< 0.05).
2. These tests indicate only a difference, not goodness or badness.
Test for Equal Variances - Example 1
Example using data in file HYPTEST.MTW.
This data is for two different reactors.

Reactor1   Reactor2
  89.7       84.7
  81.4       86.1
  84.5       83.2
  84.8       91.9
  87.3       86.3
  79.7       79.3
  85.1       82.6
  81.7       89.1
  83.7       83.7
  84.5       88.5

This is unstacked data and must be stacked to use Minitab's Test for Equal Variances.
Test for Equal Variances - Example 1
Stack Command

Now we'll change the data from unstacked to stacked.

Minitab: Data - Stack - Columns
  Stack columns C1 and C2
  Store stacked data in Yield
  Store subscripts into Reactor
  OK

Yield   Reactor
 89.7      1
 81.4      1
 84.5      1
 84.8      1
 87.3      1
 79.7      1
 85.1      1
 81.7      1
 83.7      1
 84.5      1
 84.7      2
 86.1      2
 83.2      2
 91.9      2
 86.3      2
 79.3      2
 82.6      2
 89.1      2
 83.7      2
 88.5      2

Now we have our Output variable (Yield) in C3 and our Input variable (Reactor) in C4. Minitab automatically assigns sequential values to each column when it is stacked.
Test for Equal Variances - Example 1

Now we'll test the data for normality.
Minitab: Stat - Basic Statistics - Normality Test, Variable: Yield
p-value > 0.05, so the data are normal.
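Minitab's normality test defaults to the Anderson-Darling statistic. As a rough cross-check outside Minitab, the sketch below substitutes the Shapiro-Wilk test from scipy, so its p-value will differ somewhat from Minitab's; the data are the stacked Yield values above.

# Sketch: normality check on the stacked Yield data using Shapiro-Wilk
# (a substitute for Minitab's Anderson-Darling normality test).
from scipy import stats

yield_data = [89.7, 81.4, 84.5, 84.8, 87.3, 79.7, 85.1, 81.7, 83.7, 84.5,
              84.7, 86.1, 83.2, 91.9, 86.3, 79.3, 82.6, 89.1, 83.7, 88.5]

stat, p = stats.shapiro(yield_data)
print(f"W = {stat:.3f}, p = {p:.3f}")
# A p-value above 0.05 indicates no evidence against normality.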
Test for Equal Variances - Example 1
Now perform the Test for Equal Variances:
Stat - ANOVA - Test for Equal Variances; Response: Yield, Factor: Reactor, 95% Confidence

Test for Equal Variances: Yield versus Reactor

Method
Null hypothesis         All variances are equal
Alternative hypothesis  At least one variance is different
Significance level      α = 0.05

95% Bonferroni Confidence Intervals for Standard Deviations
Reactor    N    StDev   CI
Reactor1  10  2.90180  (1.65970, 6.53916)
Reactor2  10  3.65033  (2.18733, 7.85175)
Individual confidence level = 97.5%

Tests
                        Test
Method               Statistic  P-Value
Multiple comparisons      0.48    0.487
Levene                    0.78    0.390

p-value > 0.05: accept the null hypothesis. There is no difference in the variances.
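A hedged cross-check outside Minitab: scipy's Levene test uses median centering by default (the Brown-Forsythe variant), so its statistic and p-value may differ slightly from Minitab's, but the conclusion is the same for these data.

# Sketch: Levene's test on the two reactors' yields.
from scipy.stats import levene

reactor1 = [89.7, 81.4, 84.5, 84.8, 87.3, 79.7, 85.1, 81.7, 83.7, 84.5]
reactor2 = [84.7, 86.1, 83.2, 91.9, 86.3, 79.3, 82.6, 89.1, 83.7, 88.5]

stat, p = levene(reactor1, reactor2)   # center='median' by default
print(f"statistic = {stat:.2f}, p = {p:.3f}")
# p well above 0.05: no evidence the variances differ, agreeing with Minitab.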
Test for Equal Variances - Example 1
Graphical Results
[Figure: Test for Equal Variances: Yield vs Reactor. Multiple comparison intervals for the standard deviation, α = 0.05, for Reactor1 and Reactor2. Multiple Comparisons P-Value = 0.487; Levene's Test P-Value = 0.390. If the intervals do not overlap, the corresponding standard deviations are significantly different.]
Test for Equal Variances - Example 2
Example 2 uses data in file HYPTEST.MTW.
This data is for two different trimmers.
This example compares the variances between the output of two different trim machines.
The response is the trimmed OD dimension.

Test for Equal Variances - Example 2
1) Stack the data:
   Minitab: Manip - Stack/Unstack
   Stack columns C7 and C8
   Store stacked data in OD
   Store subscripts into Trimmer
   OK
2) Test the data for normality:
   Minitab: Stat - Basic Statistics - Normality Test, Variable: OD
   OK
Test for Equal Variances - Example 2
[Normal probability plot of OD]
p-value < 0.05, so the data are not normal.
Test for Equal Variances - Example 2
Now perform the homogeneity of variance test:
Stat - ANOVA - Test for Equal Variances; Response: OD, Factor: Trimmer, 95% Confidence

Test for Equal Variances: OD versus Trimmer

Method
Null hypothesis         All variances are equal
Alternative hypothesis  At least one variance is different
Significance level      α = 0.05

95% Bonferroni Confidence Intervals for Standard Deviations
Trimmer     N      StDev   CI
Trimmer 1  20  0.0099852  (0.0074127, 0.015148)
Trimmer 2  20  0.0842441  (0.0654102, 0.122195)
Individual confidence level = 97.5%

Tests
                        Test
Method               Statistic  P-Value
Multiple comparisons     56.31    0.000
Levene                   22.88    0.000

p-value < 0.05: reject the null hypothesis. There is a difference in the variances.
Test for Equal Variances - Example 2
Graphical Results
[Figure: Test for Equal Variances: OD vs Trimmer. Multiple comparison intervals for the standard deviation, α = 0.05, for Trimmer 1 and Trimmer 2. Multiple Comparisons P-Value = 0.000; Levene's Test P-Value = 0.000. If the intervals do not overlap, the corresponding standard deviations are significantly different.]
Hypothesis Testing Flowchart
[Flowchart repeated, highlighting the Mean Tests branch: Variable Data - 1 Sample t Test (compares one group mean to a target), 2 Sample t Test (compares the means of two distributions), Paired t Test (compares the paired differences of two distributions); plus the Variance Tests and the Attribute Data Contingency Table as before.]
T-tests
- Single mean compared to a target value
- Comparison of two independent group means
- Comparison of paired data from two groups
Single Mean Compared to Target - Example 1
- Example using file Bhh73.mtw.
- The example includes 10 measures of specific gravity from an alloy.
- The question is: Is the mean of the sample representative of a target value of 84.12?
- The hypotheses:
    Ho: μ = 84.12
    Ha: μ ≠ 84.12
- Ho can be rejected if p < 0.05.
Single Mean Compared to Target - Example 1
Minitab: Stat - Basic Statistics - 1-Sample t
  Variables: C2-Sample
  Hypothesized mean: 84.12
  Alternative: not equal
  OK

Test of μ = 84.12 vs ≠ 84.12

Variable   N   Mean  StDev  SE Mean       95% CI          T      P
Reactor2  10  85.54   3.65     1.15  (82.93, 88.15)    1.23  0.250

Notice that 84.12 is included in this interval.
p-value > 0.05: accept the null hypothesis. The sample mean is representative of the target.
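A hedged cross-check in Python: the ten values below are the Reactor2 yields from the earlier example, which match the N, mean, and StDev printed in the output above; scipy is used here as an illustrative alternative to Minitab.

# Sketch: one-sample t test of the sample mean against the 84.12 target.
from scipy.stats import ttest_1samp

sample = [84.7, 86.1, 83.2, 91.9, 86.3, 79.3, 82.6, 89.1, 83.7, 88.5]

t, p = ttest_1samp(sample, popmean=84.12)
print(f"T = {t:.2f}, P = {p:.3f}")  # T = 1.23, P = 0.250 -> cannot reject Ho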
Comparison of Two Independent Sample Means - Example 1
- Now we will make a comparison between two group means. This is really our first experimental design: one attribute Factor (Input) and one quantitative Output.
- We'll use data file Bhh77.mtw. Let's change the scenario to comparing Reactor 1 to Reactor 2 on chemical yield.
- There are two ways to enter the data:
  - Enter Reactor 1 yields in C1 and Reactor 2 yields in C2. This is called the "unstacked approach".
  - Enter all yields in C1 and enter the Reactor number in C2. Minitab calls C2 a subscript variable; this is the "stacked approach".
- The second method is preferred; we always want one column for each Input variable and one column for each Output variable.
- Let's start with the unstacked data and then we'll use the stacked data.
Two Sample t Test (Unstacked) - Example 1
Minitab: Stat - Basic Statistics - 2-Sample t
  Samples in different columns
  First: Reactor 1
  Second: Reactor 2
  Alternative: not equal
  OK

Two Sample T-Test and Confidence Interval

Two sample T for Reactor1 vs Reactor2
           N   Mean  StDev  SE Mean
Reactor1  10  84.24   2.90     0.92
Reactor2  10  85.54   3.65     1.2

95% CI for mu Reactor1 - mu Reactor2: (-4.40, 1.8)
T-Test mu Reactor1 = mu Reactor2 (vs not =): T = -0.88  P = 0.39  DF = 18
Both use Pooled StDev = 3.30

p-value > 0.05: accept the null hypothesis. The reactors appear to have the same yield.
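The same pooled-variance comparison can be sketched in scipy using the reactor yields listed earlier (again an illustrative alternative, not part of the Minitab workflow):

# Sketch: two-sample (pooled-variance) t test on the reactor yields.
from scipy.stats import ttest_ind

reactor1 = [89.7, 81.4, 84.5, 84.8, 87.3, 79.7, 85.1, 81.7, 83.7, 84.5]
reactor2 = [84.7, 86.1, 83.2, 91.9, 86.3, 79.3, 82.6, 89.1, 83.7, 88.5]

t, p = ttest_ind(reactor1, reactor2, equal_var=True)  # pooled StDev, DF = 18
print(f"T = {t:.2f}, P = {p:.2f}")  # T = -0.88, P = 0.39 -> no difference detected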
Two Sample t Test (Stacked) - Example 1
Minitab: Stat - Basic Statistics - 2-Sample t
  Samples in one column
  Samples: Yield
  Subscripts: Reactor
  Alternative: not equal
  OK

Two Sample T-Test and Confidence Interval

Two sample T for Yield
Reactor   N   Mean  StDev  SE Mean
1        10  84.24   2.90     0.92
2        10  85.54   3.65     1.2

95% CI for mu (1) - mu (2): (-4.40, 1.8)
T-Test mu (1) = mu (2) (vs not =): T = -0.88  P = 0.39  DF = 18
Both use Pooled StDev = 3.30
Two Sample t Test - Example 2
- We'll use data file Bhh77.mtw. Let's change the scenario to comparing the means of a molding process before and after a process change made to eliminate nonfills.
- Response = % Nonfill Scrap / Heat
- Use the stacked and unstacked methods.
Two Sample t Test (Unstacked) - Example 2
Minitab: Stat - Basic Statistics - 2-Sample t
  Samples in different columns
  First: Before
  Second: After
  Alternative: not equal
  OK

Two Sample T-Test and Confidence Interval

Two sample T for Before vs After
          N      Mean     StDev   SE Mean
Before  100   0.05011   0.00109   0.00011
After   100  0.004983  0.000103  0.000010

95% CI for mu Before - mu After: (0.04491, 0.045347)
T-Test mu Before = mu After (vs not =): T = 410.87  P = 0.0000  DF = 198
Both use Pooled StDev = 0.000777

p-value < 0.05: reject the null hypothesis. The process change appears to have reduced the nonfills.
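We don't have the 100 raw values per group here, but the same test can be reproduced from the summary statistics alone. A sketch using scipy's ttest_ind_from_stats; the small difference from Minitab's T value comes from rounding of the printed means and standard deviations.

# Sketch: pooled two-sample t test from the summary statistics in the output above.
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(mean1=0.05011,  std1=0.00109,  nobs1=100,
                            mean2=0.004983, std2=0.000103, nobs2=100,
                            equal_var=True)
print(f"T = {t:.1f}, P = {p:.4f}")  # T approximately 411, P = 0.0000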
Two Sample t Test (Stacked) - Example 2
Minitab: Stat - Basic Statistics - 2-Sample t
  Samples in one column
  Samples: %Scrap
  Subscripts: B/F
  Alternative: not equal
  OK

Two Sample T-Test and Confidence Interval

Two sample T for %Scrap
B/F    N      Mean     StDev   SE Mean
1    100   0.05011   0.00109   0.00011
2    100  0.004983  0.000103  0.000010

95% CI for mu (1) - mu (2): (0.04491, 0.045347)
T-Test mu (1) = mu (2) (vs not =): T = 410.87  P = 0.0000  DF = 198
Both use Pooled StDev = 0.000777
Paired Comparisons
- This is a case where we can pair observations. A good example is where we compare measurements made by an on-line system to measurements made in a lab using the same samples.
- This can also be used in measurement system studies to see if operators are getting the same mean value across the same set of samples.
- Let's look at the example in file Paircomp.mtw.
- We are testing shoe material. We have a sample of 10 boys, and we'll have each boy wear one shoe made from each material.
Paired Comparisons - Example 1
Use data in file Paircomp.mtw.
Material A: 13.2, 8.2, 10.9, 14.3, 10.7, 6.6, 9.5, 10.8, 8.8, 13.3
Material B: 14.0, 8.8, 11.2, 14.2, 11.8, 6.4, 9.8, 11.3, 9.3, 13.6

BOY   MAT A   MAT B   Delta d
  1    13.2    14.0    -0.8
  2     8.2     8.8    -0.6
  3    10.9    11.2    -0.3
  4    14.3    14.2     0.1
  5    10.7    11.8    -1.1
  6     6.6     6.4     0.2
  7     9.5     9.8    -0.3
  8    10.8    11.3    -0.5
  9     8.8     9.3    -0.5
 10    13.3    13.6    -0.3

Our new Output variable is Delta (d), where d = x(Material A) - x(Material B).

Hypotheses:
  Ho: d = 0
  Ha: d ≠ 0
Paired Comparisons - Example 1
Minitab: Stat - Basic Statistics - Paired t
  Sample 1: Material A
  Sample 2: Material B
  OK

Paired T for Material A - Material B

              N    Mean   StDev  SE Mean
Material A   10  10.630   2.451    0.775
Material B   10  11.040   2.518    0.796
Difference   10  -0.410   0.387    0.122

95% CI for mean difference: (-0.687, -0.133)
T-Test of mean difference = 0 (vs ≠ 0): T-Value = -3.35  P-Value = 0.009

p-value < 0.05: reject the null hypothesis. The Delta d does not equal 0.
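A hedged cross-check of the paired test in scipy, using the Material A / Material B values listed above:

# Sketch: paired t test on the shoe-material wear data.
from scipy.stats import ttest_rel

material_a = [13.2, 8.2, 10.9, 14.3, 10.7, 6.6, 9.5, 10.8, 8.8, 13.3]
material_b = [14.0, 8.8, 11.2, 14.2, 11.8, 6.4, 9.8, 11.3, 9.3, 13.6]

t, p = ttest_rel(material_a, material_b)
print(f"T = {t:.2f}, P = {p:.3f}")  # T = -3.35, P = 0.009 -> the materials differ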
Paired Comparisons - Example 1
Equivalently, run a 1-sample t test on the Delta column:
Minitab: Stat - Basic Statistics - 1-Sample t
  Variable: Delta
  Test mean: 0.0
  OK

T-Test of the Mean

Test of mu = 0.000 vs mu not = 0.000

Variable   N    Mean  StDev  SE Mean      T      P
Delta     10  -0.410  0.387    0.122  -3.35  0.009

p-value < 0.05: reject the null hypothesis. The Delta d does not equal 0.
Doing the "Wrong" Analysis
We'll use the same data and analyze it with the two independent sample comparison.
Minitab: Stat - Basic Statistics - 2-Sample t
  Samples in different columns
  First: Material A
  Second: Material B
  Alternative: not equal
  Assume Equal Variances
  OK

Two Sample T-Test and Confidence Interval

Two sample T for Material A vs Material B
             N   Mean  StDev  SE Mean
Material A  10  10.63   2.45     0.78
Material B  10  11.04   2.52     0.80

95% CI for mu Material A - mu Material B: (-2.74, 1.92)
T-Test mu Material A = mu Material B (vs not =): T = -0.37  P = 0.72  DF = 18
Both use Pooled StDev = 2.49
Why is one analysis significant and one not significant?
Doing the "Wrong" Analysis
Performing the 2-sample t test compares the two distributions without regard to pairing.

BOY   MAT A   MAT B
  1    13.2    14.0
  2     8.2     8.8
  3    10.9    11.2
  4    14.3    14.2
  5    10.7    11.8
  6     6.6     6.4
  7     9.5     9.8
  8    10.8    11.3
  9     8.8     9.3
 10    13.3    13.6

The test is designed to compare the wear of the materials under equal conditions. That's why each boy wears one shoe of each material. However, each boy will cause a different amount of wear on his pair of shoes. For example, Boy #1 has worn his shoes 2 times the amount of Boy #6.
The 2-sample t test did not detect a significant difference:
p-value = 0.72 > 0.05
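The contrast can be seen directly by running both tests on the same data. A sketch in scipy (again an illustrative alternative to Minitab):

# Sketch: the same data analyzed with and without pairing.
from scipy.stats import ttest_ind, ttest_rel

material_a = [13.2, 8.2, 10.9, 14.3, 10.7, 6.6, 9.5, 10.8, 8.8, 13.3]
material_b = [14.0, 8.8, 11.2, 14.2, 11.8, 6.4, 9.8, 11.3, 9.3, 13.6]

t_ind, p_ind = ttest_ind(material_a, material_b, equal_var=True)  # ignores pairing
t_rel, p_rel = ttest_rel(material_a, material_b)                  # uses pairing

print(f"2-sample: T = {t_ind:.2f}, P = {p_ind:.2f}")  # about -0.37, 0.72 -> not significant
print(f"paired:   T = {t_rel:.2f}, P = {p_rel:.3f}")  # about -3.35, 0.009 -> significant
# Pairing removes the boy-to-boy variation, so the small consistent
# difference between the materials becomes detectable.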
Hypothesis Testing Flowchart
[Full flowchart summary]
Variable Data
  Variance Tests
    - Chi Square Test: compares a distribution to a specification or target
    - Homogeneity of Variance (Test for Equal Variances): compares the variances of 2 or more distributions
    - F Test: compares the variances of 2 distributions
  Mean Tests
    - 1 Sample t Test: compares one group mean to a target
    - 2 Sample t Test: compares the means of two distributions
    - Paired t Test: compares the paired differences of two distributions
    - One-way ANOVA: compares the means of two or more distributions
Attribute Data
  - Contingency Table: evaluates if methods or classifications are independent