Download Null Hypothesis - Wright State engineering

Statistical Hypotheses
&
Hypothesis Testing
Statistical Hypotheses
There are two types of statistical hypotheses.
Null Hypothesis
The null hypothesis, denoted by H0, assumes the sample observations
result purely from chance.
Alternative Hypothesis
The alternative hypothesis, denoted by H1, states the counter-assumption
that sample observations are influenced by some non-random cause.
Note:
The Alternate Hypothesis is always the logical opposite of the Null Hypothesis.
Example: Suppose we wanted to determine whether a coin was fair and balanced.
A null hypothesis might be that half the flips would be Heads and
half of the flips would be Tails.
The alternative hypothesis would be that the number (percent) of Heads and
Tails would be very different.
Symbolically, these hypotheses would be expressed as:
H0: p = 0.5 where p = Probability of Heads
H1: p ≠ 0.5
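Note: As a minimal sketch of how the coin hypotheses could be checked, the code below applies a normal-approximation Z test to a proportion. The data (60 heads in 100 flips) and the function name coin_z_test are hypothetical and are not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist


def coin_z_test(heads: int, flips: int, p0: float = 0.5) -> tuple[float, float]:
    """Return (z, two-sided p-value) for H0: p = p0 via the normal approximation."""
    p_hat = heads / flips                 # observed proportion of Heads
    se = sqrt(p0 * (1 - p0) / flips)      # standard error of p_hat under H0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data, assumed for illustration only.
z, p_value = coin_z_test(heads=60, flips=100)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value would count as evidence against H0: p = 0.5; the normal approximation is only reasonable when the number of flips is fairly large.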
Hypothesis Testing
Hypothesis testing is a decision making process about
accepting or rejecting a statement (assumption) regarding a
population parameter.
Frequently, hypothesis testing is applied to an assumption
about a population mean.
For example, test the assumption that the population
mean μ is equal to 120 versus μ is not equal to 120;
i.e., H0: μ = 120 versus H1: μ ≠ 120.
References
David Harper - Bionic Turtle
http://www.bionicturtle.com/learn/article/type_i_versus_type_ii_errors_9_minute_tutorial/
http://www.bionicturtle.com/learn/article/hypothesis_testing_9_minute_screencast/
Null and Alternate Hypothesis
http://www.ganesha.org/spc/hyptest.html#hypothesis
Hypothesis Testing
Suppose we believe the average systolic blood pressure of healthy
adults is normally distributed
with mean μ = 120 and variance σ² = 50.
To test this assumption, we sample the blood pressure of 42
randomly selected adults. Sample statistics are
Mean x̄ = 122.4
Variance s² = 50.3
Standard Deviation s = √50.3 = 7.09
Standard Error = s / √n = 7.09 / √42 = 1.09
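Note: The arithmetic for the sample standard deviation and standard error can be reproduced with a short sketch. The variable names are illustrative; the values come from the example above.

```python
from math import sqrt

n = 42                    # sample size
sample_variance = 50.3    # s^2 from the sample

s = sqrt(sample_variance)         # sample standard deviation
standard_error = s / sqrt(n)      # standard error of the sample mean

print(f"s = {s:.2f}, standard error = {standard_error:.2f}")
```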
Central Limit Theorem
The distribution of all sample means of sample size n from a
Normal Distribution (μ, σ²) is itself normally distributed with
Mean = μ
Variance = σ² / n
For our case:
Mean μ = 120
Variance σ² / n = 50 / 42 = 1.19
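Note: The claim about the distribution of sample means can be checked empirically. The simulation below is a sketch, not part of the original slides; the seed and the number of repeated samples (5000) are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import fmean, variance

random.seed(0)                      # fixed seed so the sketch is repeatable
MU, SIGMA2, N = 120.0, 50.0, 42     # population mean, variance, sample size
NUM_SAMPLES = 5000                  # assumed number of repeated samples

sigma = sqrt(SIGMA2)
# Draw NUM_SAMPLES samples of size N and record each sample mean.
sample_means = [
    sum(random.gauss(MU, sigma) for _ in range(N)) / N
    for _ in range(NUM_SAMPLES)
]

mean_of_means = fmean(sample_means)
var_of_means = variance(sample_means)
print(f"mean of sample means     = {mean_of_means:.2f} (theory: {MU})")
print(f"variance of sample means = {var_of_means:.2f} (theory: {SIGMA2 / N:.2f})")
```

The simulated values should land close to μ = 120 and σ² / n = 1.19, with small differences due to sampling noise.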
Note: Theoretically we can test the hypothesis regarding the
mean and the hypothesis regarding the variance; however one
usually presumes the sample variances are stable from sample to
sample and any one sample variance is an unbiased estimator of
the population variance. As such, hypothesis testing is most
frequently associated with testing assumptions regarding the
population mean.
Hypothesis Testing
Test the assumption
H0: μ = 120 vs. H1: μ ≠ 120
using a level of significance α = 5%
Note: If our sample came from the assumed population with
mean μ = 120, then we would expect 95% of all sample
means of sample size n = 42 to be within ± Zα/2 = ± 1.96
standard errors of μ.
[Figure: standard normal curve. Confidence Interval 95%, Level of Significance α = 5%; α/2 = 2.5% in each tail beyond -Zα/2 = -1.96 and +Zα/2 = +1.96.]
Calculate Upper and Lower Bounds on x̄
x̄ Lower = μ – Zα/2 (s /√n) = 120 – 1.96(1.09) = 117.9
x̄ Upper = μ + Zα/2 (s /√n) = 120 + 1.96(1.09) = 122.1
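Note: The bounds can be reproduced with Python's standard library, as sketched below. Here the critical value is obtained from NormalDist().inv_cdf rather than a Z table, and the standard error is left unrounded, so intermediate values differ slightly from the hand calculation before rounding.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 120.0        # hypothesized population mean
s2 = 50.3          # sample variance
n = 42             # sample size
alpha = 0.05       # level of significance
x_bar = 122.4      # observed sample mean

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # approximately 1.96
se = sqrt(s2 / n)                              # standard error of the mean

lower = mu0 - z_crit * se
upper = mu0 + z_crit * se
outside = not (lower <= x_bar <= upper)

print(f"lower = {lower:.1f}, upper = {upper:.1f}, sample mean outside: {outside}")
```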
[Figure: normal curve centered at μ = 120. Confidence Interval 95%, Level of Significance α = 5%; α/2 = 2.5% in each tail; -Zα/2 = -1.96 corresponds to x̄ Lower = 117.9 and +Zα/2 = +1.96 corresponds to x̄ Upper = 122.1.]
Hypothesis Testing Comparisons
Compare our sample mean x̄ = 122.4
to the Upper and Lower Limits.
[Figure: normal curve centered at μ = 120. Confidence Interval 95%, Level of Significance α = 5%; x̄ Lower = 117.9 at -Zα/2 = -1.96, x̄ Upper = 122.1 at +Zα/2 = +1.96; the sample mean x̄ = 122.4 lies beyond the upper bound.]
Hypothesis Testing Conclusions
Note:
Sample mean x̄ = 122.4 falls outside of the 95% Confidence Interval.
We can reach one of two logical conclusions:
One, that we drew one of the 2.5% of samples from a population
with mean μ = 120 whose sample mean exceeds the upper bound.
Two, our sample came from a population with a mean μ ≠ 120.
Since 2.5% = 1/40 is a rather “rare” event, we opt for the
conclusion that our original null hypothesis is false:
we reject H0: μ = 120 and therefore accept H1: μ ≠ 120.
[Figure: Confidence Interval 95%, Level of Significance α = 5%; the sample mean x̄ = 122.4 falls in the upper rejection region. Conclude μ ≠ 120.]
Alternate Method
Rather than compare the sample mean to the 95% lower and
upper bounds, one can use the Z Transformation for the sample
mean and compare the results with ± Zα/2.
Z0 = ( x̄ – μ ) / (s / √n) = (122.4 – 120) / 1.09 = 2.20
[Figure: standard normal curve. Confidence Interval 95%, Level of Significance α = 5%; α/2 = 2.5% in each tail; Z0 = 2.20 lies beyond +Zα/2 = +1.96.]
Alternate Method
Note: Since Z0 = 2.20 exceeds Zα/2 = 1.96, we
reach the same conclusion as before:
Reject H0: μ = 120 and Accept H1: μ ≠ 120.
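Note: A sketch of the Z Transformation comparison is given below. It keeps the standard error unrounded, so the statistic comes out near 2.19 rather than the 2.20 obtained by hand with the rounded value 1.09; the decision is the same. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0 = 122.4, 120.0   # sample mean and hypothesized mean
s2, n = 50.3, 42            # sample variance and sample size
alpha = 0.05                # level of significance

se = sqrt(s2 / n)                              # standard error of the mean
z0 = (x_bar - mu0) / se                        # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value

reject_h0 = abs(z0) > z_crit
print(f"Z0 = {z0:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```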
Alternate Method - Extended
We can quantify the probability (p-Value) of
obtaining a test statistic Z at least as extreme as our sample Z0.
P( |Z| > |Z0| ) = 2[1 - Φ(|Z0|)]
p-Value = P( |Z| > 2.20 ) = 2[1 - Φ(2.20)]
p-Value = 2(1 – 0.9861) = 0.0278 = 2.8%
Compare p-Value to Level of Significance
If p-Value < α, then reject null hypothesis
Since 2.8% < 5%, Reject H0: μ = 120 and conclude μ ≠ 120.
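Note: The two-sided p-Value can be computed from the standard normal CDF Φ, as sketched below using NormalDist from Python's standard library. The input Z0 = 2.20 is the rounded value from the example.

```python
from statistics import NormalDist

z0 = 2.20        # test statistic from the example
alpha = 0.05     # level of significance

phi = NormalDist().cdf(abs(z0))     # Phi(|Z0|)
p_value = 2 * (1 - phi)             # two-sided p-value

print(f"Phi(|Z0|) = {phi:.4f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```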
Hypothesis Testing Errors
Hypothesis Testing
Suppose we believe the average systolic blood pressure of healthy
adults is normally distributed with mean μ = 120 and variance σ² = 50.
To test this assumption, we sample the blood pressure of 42
randomly selected adults. Sample statistics are
Mean x̄ = 122.4
Variance s² = 50.3
Standard Deviation s = √50.3 = 7.09
Standard Error = s / √n = 7.09 / √42 = 1.09
Z0 = ( x̄ – μ ) / (s / √n) = (122.4 – 120) / 1.09 = 2.20
[Figure: standard normal curve. Confidence Interval 95%, Level of Significance α = 5%; α/2 = 2.5% in each tail; Z0 = 2.20 lies beyond +Zα/2 = +1.96.]
Conclusion (Critical Value)
Since Z0= 2.20 exceeds Zα/2 = 1.96,
Reject H0: μ = 120 and Accept H1: μ ≠ 120.
Conclusion (p-Value)
We can quantify the probability (p-Value) of
obtaining a test statistic Z at least as extreme as our sample Z0.
P( |Z| > |Z0| ) = 2[1 - Φ(|Z0|)]
p-Value = P( |Z| > 2.20 ) = 2[1 - Φ(2.20)]
p-Value = 2(1 – 0.9861) = 0.0278 = 2.8%
Compare p-Value to Level of Significance
If p-Value < α, then reject null hypothesis
Since 2.8% < 5%, Reject H0: μ = 120 and conclude μ ≠ 120.
Confidence Interval = 99%
Level of Significance α = 1%
Z0 = ( x̄ – μ ) / (s / √n) = (122.4 – 120) / 1.09 = 2.20
Zα/2 = +2.58
[Figure: standard normal curve. Confidence Interval 99%, Level of Significance α = 1%; α/2 = 0.5% in each tail; Z0 = 2.20 lies between -Zα/2 = -2.58 and +Zα/2 = +2.58.]
Conclusion (Critical Value)
Since Z0= 2.20 is less than Zα/2 =2.58,
Fail to Reject H0: μ = 120 and conclude
there is insufficient evidence to say H1: μ ≠ 120.
Conclusion (p-Value)
We can quantify the probability (p-Value) of
obtaining a test statistic Z at least as extreme as our sample Z0.
P( |Z| > |Z0| ) = 2[1 - Φ(|Z0|)]
p-Value = P( |Z| > 2.20 ) = 2[1 - Φ(2.20)]
p-Value = 2(1 – 0.9861) = 0.0278 = 2.8%
Compare p-Value to Level of Significance
If p-Value < α, then reject null hypothesis
Since 2.8% > 1%, Fail to Reject H0: μ = 120 and conclude
there is insufficient evidence to say H1: μ ≠ 120.
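Note: The same test statistic leads to different decisions at α = 5% and α = 1%. The sketch below makes the comparison explicit; the helper name decide is hypothetical and is not from the original slides.

```python
from statistics import NormalDist


def decide(z0: float, alpha: float) -> tuple[float, float, bool]:
    """Return (critical value, two-sided p-value, reject H0?) for a Z test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(z0)))
    return z_crit, p_value, p_value < alpha


z0 = 2.20  # test statistic from the example
for alpha in (0.05, 0.01):
    z_crit, p_value, reject = decide(z0, alpha)
    verdict = "Reject H0" if reject else "Fail to reject H0"
    print(f"alpha = {alpha:.2f}: Z crit = {z_crit:.2f}, "
          f"p-value = {p_value:.4f} -> {verdict}")
```

For a two-sided Z test, comparing the p-Value to α and comparing |Z0| to Zα/2 always give the same decision.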
Hypothesis Testing Conclusions
As can be seen in the previous example, our conclusions
regarding the null and alternate hypotheses are
dependent upon the sample data and the level of
significance.
Given different values of sample mean and the sample
variance or given a different level of significance,
we may come to a different conclusion.
Null Hypotheses
And
Alternate Hypotheses
Hypothesis Testing
Hypotheses are always about the population and never about the sample.
The truth of a hypothesis can never be known or confirmed with certainty.
Conclusions regarding hypotheses are never absolute and as such are
susceptible to some degree of definable/calculable risk of error.
Type I Error: Rejecting H0 when H0 is True
Probability of Type I Error = α
Type II Error: Failing to Reject H0 when H0 is False
Probability of Type II Error = β
Power of the Test
Probability of Correctly Rejecting a False Null Hypothesis = 1 - β
Probability of Correctly Rejecting H0 when H1 is true = 1 - β
Probability of Rejecting H0 when H0 is False = 1 - β
Probability of Accepting H1 when H1 is True = 1 - β
Probability of Type I and Type II Errors
The Level of Significance α establishes the Probability of a Type I Error.
The Probability of a Type II Error depends on the magnitude of the
true mean and the sample size.
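Note: The statement that α sets the Probability of a Type I Error can be illustrated by simulation. The sketch below assumes H0 is true (μ = 120, σ² = 50, n = 42) and counts how often a two-sided Z test rejects anyway; the seed and trial count are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)                          # fixed seed so the sketch is repeatable
MU0, SIGMA2, N, ALPHA = 120.0, 50.0, 42, 0.05
TRIALS = 20000                          # assumed number of simulated samples

z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)
se = sqrt(SIGMA2 / N)                   # standard deviation of the sample mean

rejections = 0
for _ in range(TRIALS):
    # Draw the sample mean directly from its sampling distribution under H0.
    x_bar = random.gauss(MU0, se)
    z0 = (x_bar - MU0) / se
    if abs(z0) > z_crit:
        rejections += 1                 # H0 is true, so this is a Type I Error

type_i_rate = rejections / TRIALS
print(f"observed Type I Error rate = {type_i_rate:.3f} (alpha = {ALPHA})")
```

The observed rate should be close to α = 0.05, up to simulation noise.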
Probability of Type II Errors
Consider
H0: μ = μ0
H1: μ ≠ μ0
Suppose the null hypothesis is false and the true magnitude of the
mean is μ = μ0 + δ.
\[
Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}
    = \frac{\bar{X} - (\mu_0 + \delta)}{\sigma/\sqrt{n}} + \frac{\delta\sqrt{n}}{\sigma}
\]
and therefore
\[
Z_0 \sim N\!\left(\frac{\delta\sqrt{n}}{\sigma},\ 1\right)
\]
that is to say,
Z0 is normally distributed with mean δ√n/σ and variance 1.

Probability of Type II Error
\[
\beta = \Phi\!\left(Z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right)
      - \Phi\!\left(-Z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right)
\]
Applied Statistics and Probability for Engineers, 3ed, Montgomery & Runger, Wiley 2003
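Note: The Probability of a Type II Error for the two-sided Z test, β = Φ(Zα/2 − δ√n/σ) − Φ(−Zα/2 − δ√n/σ), can be evaluated numerically. The sketch below uses a hypothetical shift δ = 2.4 with σ² = 50 and n = 42; the function name type_ii_error and the chosen δ are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(delta: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Beta for a two-sided Z test when the true mean is mu0 + delta."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma      # mean of Z0 under the alternative
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Hypothetical true shift of 2.4 units from mu0 (assumed, not from the slides).
beta = type_ii_error(delta=2.4, sigma=sqrt(50), n=42)
power = 1 - beta
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

With δ = 0 the expression reduces to 1 − α, and β shrinks as either δ or n grows, which matches the statement that β depends on the true mean and the sample size.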