HYPOTHESIS TESTING
A hypothesis is a claim. Usually there are two
rival claims.
Shower hypothesis: the water temperature is
acceptable / not acceptable.
How to Develop a Hypothesis?
We need to translate a problem into a statement
involving a statistical measure. The measure, a
parameter like μ or p, is then used in the
derivation of the hypothesis. For example, the
CLAIM may be that education increases
earnings, so μ = average earnings of the educated.
Suppose the average earnings of the entire
population are known to be $30,000.
Now the two rival claims are:
H0: μ ≤ 30000 versus Ha: μ > 30000.
H0 is called the null hypothesis. It says that the
educated earn no better than others.
Ha is called the alternative hypothesis, which
says that the educated earn more.
In this example, the alternative is one-sided. If
the claim were that the educated earn LESS than
others, then the alternative would be set up as
Ha: μ < 30000.
Hypothesis testing is a scientific method of
choosing between two claims H0 and Ha
Some principles:
1) H0 is presumed true unless overwhelming
evidence rejects it. E.g., H0: defendant is not
guilty!
2) Sample data give test statistics for μ, p or σ².
3) We reject the null H0 if the test statistic falls
in the rejection region.
4) Null is usually a zero value (hence the name
null).
5) Instead of saying ACCEPT one says FAILS
TO REJECT.
6) Absolute certainty does not exist.
If there is a standard value for μ, the null is H0: μ
= std value (or true value), e.g. H0: μ = 10, and the
two-sided alternative is Ha: μ ≠ 10, where μ
could be larger than 10 or smaller than 10.
Skill is needed to decide (1) the appropriate
statistical parameter (μ, p, etc.), (2) the appropriate
null, (3) one-sided or two-sided.
When one of the alternatives is selected, there
can be error.
Type I (α) and Type II (β) errors
Definition: Type I error: Selecting Ha when H0
should be selected.
H0 DOGS are dead, Ha DOGS are alive
Truth is DOGS are dead, still selecting Ha is
Type I error
Definition: Type II error: Selecting H0 when Ha
should be selected.
H0 DOGS are dead, Ha DOGS are alive
Truth is DOGS are alive, still selecting H0
means Type II error.
Definition: α denotes the probability of a Type I
error. This is also called the “level of the test.”
Definition: β denotes the probability of a Type II
error. This is usually hard to determine since it
depends on the unknown parameter μ itself.
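A small simulation can make the meaning of the level concrete. The sketch below is only an illustration: the values of mu0, sigma, n and the number of repetitions are assumptions, not numbers from these notes. It generates data for which H0 is true and counts how often a two-sided z test at the 5% level rejects anyway.

```r
# Simulation sketch: when H0 is true, a level-alpha test should reject
# in roughly a fraction alpha of repeated samples (Type I errors).
# All numbers below are illustrative assumptions.
set.seed(42)
mu0   <- 10      # value of mu under H0 (also the true mean here)
sigma <- 2       # assumed known population sd
n     <- 50      # sample size
alpha <- 0.05    # level of the test
reps  <- 10000   # number of simulated samples

rejections <- replicate(reps, {
  y <- rnorm(n, mean = mu0, sd = sigma)       # data generated under H0
  z <- (mean(y) - mu0) / (sigma / sqrt(n))    # z statistic
  abs(z) > qnorm(1 - alpha / 2)               # TRUE = a Type I error
})
mean(rejections)   # observed Type I error rate, expected to be near alpha
```

The observed rate varies a little with the random seed, but it should stay close to 0.05.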
It is desirable to formulate the hypotheses so that
the Type I error has the most serious consequence.
The two types of errors are inversely related. There
is a trade-off between α and β. The smaller we
make α, the larger the β we have to accept.
Hence one usually chooses the largest tolerable α!
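The trade-off can be seen numerically. The following R sketch assumes a right-sided z test of H0: μ ≤ 30000 and, purely for illustration, a hypothetical known σ = 8000, n = 64 and a hypothetical true mean of 32000. None of these three values come from the notes, and beta_for is a made-up helper name.

```r
# Illustrative assumptions (not from the notes): sigma, n and the true mean.
mu0     <- 30000   # value of mu under H0
mu_true <- 32000   # hypothetical true mean under Ha
sigma   <- 8000    # hypothetical known population sd
n       <- 64      # hypothetical sample size
se      <- sigma / sqrt(n)   # standard error of the sample mean

# beta for a right-sided z test: the probability that the sample mean
# stays below the critical value even though mu = mu_true.
beta_for <- function(alpha) {
  crit <- mu0 + qnorm(1 - alpha) * se   # critical value for the sample mean
  pnorm(crit, mean = mu_true, sd = se)
}

alphas <- c(0.10, 0.05, 0.01)
betas  <- sapply(alphas, beta_for)
print(data.frame(alpha = alphas, beta = round(betas, 3)))
```

Reading down the printed table, β rises as α is made smaller, which is the inverse relation described above.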
Steps in the Test of Hypothesis
1. Define the hypothesis to be tested in plain English.
2. Select the appropriate statistical measure (such
as μ, p, σ²) to rephrase the hypothesis.
3. Determine whether the hypothesis should be one- or
two-sided.
4. State the hypothesis using the statistical
measure selected in step 2.
5. Specify α, the “level” of the test.
6. Select the appropriate test statistic, based on
the information at hand and the assumptions
you are willing to make.
7. Determine the critical value of the test statistic.
Three factors for critical value:
a. the type of alternative hypothesis,
(1) two-sided (2) one-sided left (3) one-sided right
b. the specification of α, the level of the test,
c. the distribution of the test statistic.
8. Collect sample data and compute the value of
the test statistic.
9. Make the decision. Is the value of the test
statistic in the rejection region?
a. If yes, reject the null hypothesis in favor of
the alternative.
b. If no, do not reject the null hypothesis.
10. State the conclusion in terms of the original
question.
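Step 7 can be sketched in R for the case where the test statistic is standard normal. The function name critical_z and its argument names are hypothetical, introduced here only for illustration. The function shows how the type of alternative and the level α together fix the critical value.

```r
# Critical value(s) of a standard normal test statistic for each type of
# alternative. The function and argument names are hypothetical.
critical_z <- function(alpha, alternative = c("two.sided", "left", "right")) {
  alternative <- match.arg(alternative)
  switch(alternative,
         two.sided = c(-1, 1) * qnorm(1 - alpha / 2),  # reject in both tails
         left      = qnorm(alpha),                     # reject in the left tail
         right     = qnorm(1 - alpha))                 # reject in the right tail
}

critical_z(0.05, "two.sided")   # about -1.96 and 1.96
critical_z(0.05, "left")        # about -1.645
critical_z(0.05, "right")       # about  1.645
```

For a statistic that follows Student's t distribution, qt() with the appropriate degrees of freedom would take the place of qnorm().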
11-5 Testing a Hypothesis about a Population
Mean μ
An example illustrates the hypothesis testing
procedure.
A local school board member wants to know if
sophomore students at Lincoln High School
have approximately the same reading level as
the state average for tenth graders. The state
average is 150 words per minute with a standard
deviation of 15. The level of the test is to be set
at .05. A random sample of 100 tenth
graders has been drawn, and the resulting
average is 157 words per minute.
Step 1 (Define the hypothesis in plain English.)
The hypothesis is straightforward:
• Lincoln High School tenth graders are
reading at the state average.
• They are not.
Step 2 (Select statistical measure) μ = average
number of words read per minute by
Lincoln High sophomores.
Step 3 (One- or two-sided) They want to know
either way, above or below average, so two-sided.
Step 4 (State H0 and Ha) H0: μ = 150, the std. value,
and Ha: μ ≠ 150.
Step 5 (Level of test?) The problem statement
says α = 0.05, or a Type I error probability of 5%.
Step 6 (Select statistic) The choice depends on
the answers to 3 questions: (i) Is σ known? (ii) Is the
distribution Normal? (iii) Is n large?
Case 1: If σ is known, n is large, and the
distribution is Normal, use Z = (ȳ − μ0) / σ_ȳ,
where σ_ȳ = σ / √n.
Case 2: If σ is known, n is large, but the
distribution may not be Normal. Still, since
n > 30, the Central Limit Theorem suggests the same Z
statistic.
Case 3: If σ is unknown, Normality is not
assumed and n > 30, we can simply use s instead
of σ in the above formula.
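The three cases share one formula, which the following R sketch puts into a single function. The name z_stat is hypothetical, and the data in the usage lines are simulated (made up), not the school board's sample. When sigma is not supplied, the function falls back on the sample standard deviation s, as in Case 3.

```r
# z statistic for a test about a mean. If sigma is not supplied (Case 3),
# the sample standard deviation s is used in its place, which the notes
# allow only when n > 30. The function name z_stat is hypothetical.
z_stat <- function(y, mu0, sigma = NULL) {
  n <- length(y)
  if (is.null(sigma)) {
    stopifnot(n > 30)      # Case 3 relies on a large sample
    sigma <- sd(y)         # estimate sigma by s
  }
  (mean(y) - mu0) / (sigma / sqrt(n))
}

# Usage with simulated (made-up) data: 100 draws with mean 157 and sd 15.
set.seed(1)
y <- rnorm(100, mean = 157, sd = 15)
z_stat(y, mu0 = 150, sigma = 15)   # Cases 1 and 2: sigma known
z_stat(y, mu0 = 150)               # Case 3: sigma replaced by s
```

With a sample this large, s is usually close to σ, so the two calls give similar values.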
Step 7 (Determine critical value) For the 5% level,
z = 1.96. The rejection region is in the tails. The
“Fail to Reject” region is at the center. Why?
Recall H0: μ = 150 is the null. If true, the
difference (ȳ − μ) should be near zero.
Since ȳ is likely to be normal, its z-transform
is N(0,1), the standard normal variable. Hence
from Normal tables we can decide that if Z >
1.96 or Z < −1.96 it belongs in the tails, where we
decide to reject the null of zero difference.
Step 8 (Compute the test statistic) Since the sample
mean is ȳ = 157, we have Z = (157 − 150) / SE,
where SE = σ / √n = 15/10 = 1.5. Now Z = 4.67.
This means that ȳ is 4.67 standard errors
away from the hypothesized value of 150. This
seems too far away from the value of the null. Is
it too far from a formal statistical viewpoint?
Step 9 (Make a decision) The critical value is 1.96 and
the test statistic, 4.67, is much larger than that.
The decision must be to reject the null H0. The
difference between ȳ and μ is very unlikely to have
been caused by mere sampling variation. We decide
that the observed difference is statistically
significant. That is, a difference this large would be
rare if the null were true, rarer than the 5% level allows.
Step 10: (State conclusion) There is significant
evidence at the 5% level that tenth graders at
Lincoln High do not read at the state average. It
would be tempting to conclude that they read at
a higher level, but we did not set up our test as a
one-sided test. The conclusions must be
consistent with the hypothesis.
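Steps 6 through 9 of this example can be reproduced in a few lines of R. The numbers are those given in the example; the variable names are chosen here for readability.

```r
# Numbers from the Lincoln High example in the notes.
ybar  <- 157    # sample mean (words per minute)
mu0   <- 150    # state average under H0
sigma <- 15     # state standard deviation
n     <- 100    # sample size
alpha <- 0.05   # level of the test

se   <- sigma / sqrt(n)            # standard error: 15 / 10 = 1.5
z    <- (ybar - mu0) / se          # test statistic: 7 / 1.5, about 4.67
crit <- qnorm(1 - alpha / 2)       # two-sided critical value, about 1.96

reject <- abs(z) > crit            # TRUE means z falls in a tail
cat("z =", round(z, 2), "; critical value =", round(crit, 2),
    "; reject H0:", reject, "\n")
```

Using abs(z) keeps the comparison two-sided, in line with the way the hypotheses were set up in Step 3.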
P-Values
Definition: The probability, computed assuming
the null hypothesis is true, of observing a
value as extreme as or more extreme than the
observed value.
It is computed as the tail area with reference
to the observed test statistic. For a one-sided
test, it is the left or the right tail. If left tail,
and the test statistic is z1 = −1, then the p-value
is given by the R command
> pnorm(-1)
[1] 0.1586553
If right tail and z1 = 1.5, say, then the p-value is
given by the R command:
> pnorm(1.5, lower.tail=FALSE)
[1] 0.0668072
If population variance (or standard
deviation) is unknown and therefore the
variance (or sd) is estimated from the
sample, then Student's t distribution should
be used instead of Normal density.
For example, if the t statistic is t1=(-1.21)
one-sided left tail with degrees of freedom
20 then the p value is obtained by the
following R command:
> pt(-1.21, df=20)
[1] 0.1201933
For a right-sided test with t1 = 2.1 and df = 33,
use the following R command:
> pt(2.1, df=33, lower.tail=FALSE)
[1] 0.02172731
For two-sided test P-value is computed as
the sum of the two tail areas or by simply
doubling the above computations since z and
t distributions are symmetric.
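The doubling rule can be written once for each distribution. In the sketch below the helper names p_two_sided_z and p_two_sided_t are hypothetical. Taking -abs() of the statistic means the same line works whether the observed value is positive or negative.

```r
# Two-sided p-values: double the area in the tail beyond |statistic|.
# Both helper names are hypothetical.
p_two_sided_z <- function(z) 2 * pnorm(-abs(z))
p_two_sided_t <- function(t, df) 2 * pt(-abs(t), df = df)

p_two_sided_z(1.5)              # twice the right-tail area of z = 1.5
p_two_sided_t(-1.21, df = 20)   # twice the left-tail area of t = -1.21
```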
Decisions: If α > P-value, reject the null.
For example, for a test with α = 0.05 (95%
confidence) we reject the null if the P-value is
smaller than α, say 0.04. When the P-value is below
α, the observed test statistic lies in the rejection region.
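As a check on this rule, the R sketch below compares the p-value decision with the rejection-region decision for a two-sided z test at a few trial values of z. The helper name decide_by_p and the trial values are assumptions made for illustration; 4.67 is the statistic from the reading example.

```r
# Decision by p-value for a two-sided z test.
# decide_by_p is a hypothetical helper, not a built-in R function.
decide_by_p <- function(z, alpha = 0.05) {
  p <- 2 * pnorm(-abs(z))           # two-sided p-value
  list(p_value = p, reject = p < alpha)
}

# The p-value rule and the rejection-region rule give the same decision.
for (z in c(0.8, 1.9, 2.1, 4.67)) {
  by_p      <- decide_by_p(z)$reject
  by_region <- abs(z) > qnorm(0.975)
  stopifnot(by_p == by_region)
}
decide_by_p(4.67)   # the reading example: tiny p-value, reject H0
```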