Hypothesis Testing
• Test for one and two means
• Test for one and two proportions
Hypothesis Tests: An Introduction
• We test a given theory or belief about a population parameter.
• Using sample information, we may want to find out whether or not a
given claim or statement about a population is true.
Hypothesis and Test Procedures
A statistical test of hypothesis consists of:
1. The Null hypothesis,
2. The Alternative hypothesis,
3. The test statistic and its p-value
4. The rejection region
5. The conclusion

Definitions
Hypothesis Testing:
It is a process of using sample data and statistical procedures to
decide whether to reject or not to reject the hypothesis (statement)
about a population parameter value (or about its distribution
characteristics).
Null Hypothesis, $H_0$:
A null hypothesis is a claim (or statement) about a population
parameter that is assumed to be true.
(The null hypothesis is either rejected or fails to be rejected.)
Alternative Hypothesis, $H_1$:
An alternative hypothesis is a claim about a population parameter
that will be true if the null hypothesis is false.
The rejection of the null hypothesis will imply the acceptance of
this alternative hypothesis.
Test Statistic:
It is a function of the sample data on which the decision is to be
based.
Critical/ Rejection region:
It is the set of values of the test statistic for which the null
hypothesis will be rejected.
Critical point:
It is the first (or boundary) value in the critical region.
P-value:
The probability, computed assuming $H_0$ is true, of obtaining a test
statistic at least as extreme as the one observed. The smaller the
p-value is, the more contradictory the data are to $H_0$.
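To make the p-value definition concrete, here is a minimal sketch for a test statistic that follows the standard normal distribution. The function name `z_p_value` and its `tail` argument are illustrative choices, not part of the slides.

```python
from statistics import NormalDist


def z_p_value(z, tail="two-sided"):
    """Return the p-value of a standard normal test statistic z.

    tail is "right" for H1: parameter > a, "left" for H1: parameter < a,
    and "two-sided" for H1: parameter != a.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return 1.0 - std_normal.cdf(z)
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "two-sided":
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left' or 'two-sided'")


p_right = z_p_value(1.96, tail="right")
p_two = z_p_value(1.96)
print(round(p_right, 3), round(p_two, 3))
```

A p-value below the chosen significance level corresponds to a test statistic inside the rejection region.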
Procedure for hypothesis testing
1. Define the question to be tested and formulate the hypotheses
that state the problem, where $\theta$ denotes the population
parameter and $a$ the hypothesized value:
$H_0: \theta = a$
$H_1: \theta \ne a$ or $\theta < a$ or $\theta > a$
2. Choose the appropriate test statistic and calculate its value
from the sample. The choice of test statistic depends on the
probability distribution of the random variable involved in the
hypothesis.
3. Establish the test criterion by determining the critical value
and critical region.
4. Draw a conclusion: decide whether to reject or not to reject the
null hypothesis.
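Steps 3 and 4 can be sketched for a Z test as follows. This is only an illustration under the assumption that the test statistic is standard normal under the null hypothesis; the helper names `z_critical_value` and `reject_h0` are hypothetical.

```python
from statistics import NormalDist


def z_critical_value(alpha, tail):
    """Critical point of a Z test at significance level alpha."""
    if tail == "two-sided":
        return NormalDist().inv_cdf(1.0 - alpha / 2.0)
    if tail in ("right", "left"):
        return NormalDist().inv_cdf(1.0 - alpha)
    raise ValueError("tail must be 'right', 'left' or 'two-sided'")


def reject_h0(z, alpha, tail):
    """Return True when z falls in the rejection region."""
    crit = z_critical_value(alpha, tail)
    if tail == "two-sided":
        return abs(z) > crit
    if tail == "right":
        return z > crit
    return z < -crit  # left-tailed test


print(round(z_critical_value(0.05, "two-sided"), 2))
print(reject_h0(2.5, 0.05, "right"))
```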
How to Decide Whether to Reject or Accept $H_0$?
Example:
The average monthly earnings for women in managerial and
professional positions is RM 2400. Do men in the same positions have
average monthly earnings that are higher than those for women? A
random sample of n = 40 men in managerial and professional
positions showed $\bar{x}$ = RM 3600 and s = RM 400. Test the appropriate
hypothesis using α = 0.01.
Solution:
The hypotheses to be tested are:
$H_0: \mu = 2400$
$H_1: \mu > 2400$ (claim)
We use the normal distribution because n > 30.
Test statistic:
$$Z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{3600 - 2400}{400/\sqrt{40}} = 18.97$$
Rejection region: $Z > z_{\alpha}$, where
$z_{\alpha} = z_{0.01} = 2.33$ (from the normal distribution table).
Since 18.97 > 2.33, the test statistic falls in the rejection region. We
reject $H_0$ and conclude that average monthly earnings for men in
managerial and professional positions are significantly higher than those
for women.
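The arithmetic of this example can be checked with a short script. It is a sketch using only the numbers given above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the example.
n, x_bar, s = 40, 3600.0, 400.0
mu_0 = 2400.0   # hypothesized mean under H0
alpha = 0.01

# Large-sample Z statistic with s used in place of sigma.
z = (x_bar - mu_0) / (s / sqrt(n))

# Right-tailed critical point z_alpha.
z_crit = NormalDist().inv_cdf(1.0 - alpha)

reject = z > z_crit
print(round(z, 2), round(z_crit, 2), reject)
```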
Example:
Aisyah makes “kerepek ubi” and sells them in packets of 100 g each. Twelve
randomly selected packets of “kerepek ubi” are taken and their weights in
grams are recorded as follows:
98, 102, 98, 100, 96, 91, 97, 97, 100, 94, 101, 97
Perform the required hypothesis test at the 5% significance level to check
whether the mean weight per packet of “kerepek ubi” is not equal to 100 g.
Solution:
The hypotheses to be tested are:
$H_0: \mu = 100$
$H_1: \mu \ne 100$
We use the t distribution because $\sigma$ is unknown and n = 12 < 30.
Test statistic:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{97.5833 - 100}{3.0588/\sqrt{12}} = -2.737$$
Rejection region: $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
From the t-table (two-tailed test): $\alpha/2 = 0.025$, so
$-t_{0.025, 11} = -2.201$ and $t_{0.025, 11} = 2.201$.
Since -2.737 < -2.201, the test statistic falls in the rejection region.
We reject $H_0$ and conclude that the mean weight per packet of
“kerepek ubi” is not equal to 100 g.
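The sample mean, sample standard deviation and t statistic of this example can be reproduced as below. The critical value 2.201 is copied from the t-table value quoted in the solution, because the Python standard library has no t distribution; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

weights = [98, 102, 98, 100, 96, 91, 97, 97, 100, 94, 101, 97]
mu_0 = 100       # hypothesized mean weight in grams
t_crit = 2.201   # t_{0.025, 11}, taken from the t-table

n = len(weights)
x_bar = mean(weights)
s = stdev(weights)   # sample standard deviation (divides by n - 1)

t = (x_bar - mu_0) / (s / sqrt(n))

# Two-tailed test: reject when t is beyond either critical point.
reject = abs(t) > t_crit
print(round(x_bar, 4), round(s, 4), round(t, 3), reject)
```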
Exercise:
A teacher claims that the students in Class A put in more hours studying
than other students. The mean number of hours spent studying per week
is 25 hours with a standard deviation of 3 hours per week. A sample of
27 Class A students was selected at random and the mean number of hours
spent studying per week was found to be 26 hours. Can the teacher’s
claim be accepted at the 5% significance level?
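Hypothesis Testing for the Difference
between Two Population Means, $\mu_1 - \mu_2$
Test hypothesis:
Null hypothesis: $H_0: \mu_1 - \mu_2 = 0$
Alternative hypothesis and rejection region:
• $H_1: \mu_1 - \mu_2 \ne 0$; reject $H_0$ if $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$
• $H_1: \mu_1 - \mu_2 > 0$; reject $H_0$ if $Z > z_{\alpha}$
• $H_1: \mu_1 - \mu_2 < 0$; reject $H_0$ if $Z < -z_{\alpha}$
Test statistics:
• For two large and independent samples,
and $\sigma_1$ and $\sigma_2$ are known:
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
• For two large and independent samples,
and $\sigma_1$ and $\sigma_2$ are unknown (assume $\sigma_1 \ne \sigma_2$):
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
• For two large and independent samples,
and $\sigma_1$ and $\sigma_2$ are unknown (assume $\sigma_1 = \sigma_2$):
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}
\quad \text{with} \quad
S_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$$
• For two small and independent samples taken from two normally
distributed populations, and $\sigma_1$ and $\sigma_2$ are unknown
(assume $\sigma_1 = \sigma_2$):
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\quad v = n_1 + n_2 - 2$$
• For two small and independent samples taken from two normally
distributed populations, and $\sigma_1$ and $\sigma_2$ are unknown
(assume $\sigma_1 \ne \sigma_2$):
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}},
\quad
v = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{1}{n_1 - 1} \left( \frac{s_1^2}{n_1} \right)^2 + \frac{1}{n_2 - 1} \left( \frac{s_2^2}{n_2} \right)^2}$$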
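The two-sample statistics listed in this section share the same numerator and differ only in the standard error and the degrees of freedom. A minimal sketch follows; the function names `pooled_sd`, `unequal_variance_df` and `two_sample_statistic` are hypothetical, and the `equal_var` flag is an illustrative way to switch between the pooled and unpooled standard error.

```python
from math import sqrt


def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation S_p, used when the variances are assumed equal."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def unequal_variance_df(s1, n1, s2, n2):
    """Degrees of freedom v for the t test when the variances are not assumed equal."""
    a = s1**2 / n1
    b = s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


def two_sample_statistic(x1, s1, n1, x2, s2, n2, diff_0=0.0, equal_var=False):
    """Test statistic for H0: mu1 - mu2 = diff_0.

    Pass population standard deviations as s1 and s2 when they are known.
    Whether to compare the result with a Z or a t critical value depends on
    the sample sizes, as described in the text.
    """
    if equal_var:
        se = pooled_sd(s1, n1, s2, n2) * sqrt(1 / n1 + 1 / n2)
    else:
        se = sqrt(s1**2 / n1 + s2**2 / n2)
    return ((x1 - x2) - diff_0) / se


print(round(pooled_sd(8, 45, 7, 55), 4))
print(round(unequal_variance_df(2, 10, 2, 10), 1))
```

With equal sample sizes and equal sample standard deviations, the unequal-variance degrees of freedom reduce to $n_1 + n_2 - 2$, which is a quick sanity check on the formula.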
Example:
The mean lifetime of 30 bulbs produced by company A is 50 hours and the
mean lifetime of 35 bulbs produced by company B is 48 hours. If the standard
deviation of all bulbs produced by company A is 3 hours and the standard
deviation of all bulbs produced by company B is 3.5 hours, test at the 1%
significance level that the mean lifetime of bulbs produced by company A is
longer than that of company B.
Solution:
$H_0: \mu_A - \mu_B = 0$
$H_1: \mu_A - \mu_B > 0$
$$Z_{test} = \frac{(50 - 48) - 0}{\sqrt{\frac{3^2}{30} + \frac{3.5^2}{35}}} = 2.4807$$
$Z_{test} = 2.4807 > 2.3263 = z_{0.01}$
We reject $H_0$. The mean lifetime of bulbs produced by company A is longer
than that of company B at the 1% significance level.
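As a check on this example, the sketch below recomputes the Z statistic and also reports the right-tailed p-value, which leads to the same decision as the critical-value comparison. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the example (population standard deviations are known).
x_a, sigma_a, n_a = 50.0, 3.0, 30
x_b, sigma_b, n_b = 48.0, 3.5, 35
alpha = 0.01

z = ((x_a - x_b) - 0) / sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)

z_crit = NormalDist().inv_cdf(1.0 - alpha)   # right-tailed critical point
p_value = 1.0 - NormalDist().cdf(z)          # right-tailed p-value

reject = z > z_crit
print(round(z, 4), round(z_crit, 4), reject)
```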
Example:
A mathematics placement test was given to two classes of 45 and 55 students
respectively. In the first class the mean grade was 75 with a standard
deviation of 8, while in the second class the mean grade was 80 with a
standard deviation of 7. Is there a significant difference between the
performances of the two classes at the 5% level of significance? Assume the
population variances are equal.
Solution:
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \ne 0$
$$S_p = \sqrt{\frac{(44)(8^2) + (54)(7^2)}{45 + 55 - 2}} = 7.4656$$
$$Z_{test} = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}
= \frac{(75 - 80) - 0}{7.4656 \sqrt{\frac{1}{45} + \frac{1}{55}}} = -3.3319$$
Since $Z_{test} = -3.3319 < -1.96 = -z_{0.025}$, we reject $H_0$.
So there is a significant difference between the performances of
the two classes at the 5% level of significance.
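The pooled standard deviation and the test statistic in this example can be verified with the following sketch, which uses only the numbers stated in the problem. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the example.
n1, mean1, s1 = 45, 75.0, 8.0
n2, mean2, s2 = 55, 80.0, 7.0
alpha = 0.05

# Pooled standard deviation under the equal-variance assumption.
s_p = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

z = ((mean1 - mean2) - 0) / (s_p * sqrt(1 / n1 + 1 / n2))

# Two-tailed test: compare |z| with z_{alpha/2}.
z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
reject = abs(z) > z_crit
print(round(s_p, 4), round(z, 4), reject)
```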
Exercise:
A sample of 60 maids from country A earns an average of
RM300 per week with a standard deviation of RM16, while a
sample of 60 maids from country B earns an average of RM250
per week with a standard deviation of RM18. Test at the 5%
significance level whether the average earnings of country A maids
exceed those of country B maids by more than RM40 per week.
Example:
A manufacturer of a detergent claimed that his detergent is at least 95%
effective in removing tough stains. In a sample of 300 people who had used the
detergent, 279 people claimed that they were satisfied with the result.
Determine whether the manufacturer’s claim is true at the 1% significance level.
Solution:
$H_0: p \ge 0.95$
$H_1: p < 0.95$
The sample proportion is $\hat{p} = 279/300 = 0.93$.
$$Z_{test} = \frac{0.93 - 0.95}{\sqrt{\frac{(0.95)(0.05)}{300}}} = -1.5894$$
Since $Z_{test} = -1.5894 > -2.3263 = -z_{0.01}$, we do not reject $H_0$.
There is not enough evidence to reject the manufacturer’s claim at the
1% significance level.
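The one-proportion Z statistic in this example can be recomputed as below. The sketch uses the hypothesized proportion 0.95 in the standard error, as the solution does; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the example.
satisfied, n = 279, 300
p_0 = 0.95      # proportion claimed by the manufacturer
alpha = 0.01

p_hat = satisfied / n

# Standard error is computed under H0, using p_0.
z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)

# Left-tailed test: reject when z is below -z_alpha.
z_crit = NormalDist().inv_cdf(1.0 - alpha)
reject = z < -z_crit
print(round(p_hat, 2), round(z, 4), reject)
```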
Exercise:
When working properly, a machine that is used to make chips for calculators
produces 4% defective chips. Whenever the machine produces more than 4%
defective chips it needs an adjustment. To check if the machine is working
properly, the quality control department at the company often takes samples
of chips and inspects them to determine if they are good or defective. One
such random sample of 200 chips taken recently from the production line
contained 14 defective chips. Test at the 5% significance level whether or not
the machine needs an adjustment.
Rejection Region:
Example:
In an effort to reduce the number of deaths due to dengue fever, samples of
150 people who had developed symptoms of the fever were taken from each of
two districts, district A and district B. The people in district A were given
a new medication in addition to the usual ones, but the people in district B
were given only the usual medication. It was found that 120 people from
district A and 90 people from district B recovered from the fever. Test the
hypothesis that the new medication helps to cure the fever using a level of
significance of 5%.
Solution:
$H_0: p_A - p_B = 0$
$H_1: p_A - p_B > 0$
The sample proportions are $\hat{p}_A = 120/150 = 0.8$ and
$\hat{p}_B = 90/150 = 0.6$, and the pooled proportion is
$$\hat{p} = \frac{120 + 90}{150 + 150} = 0.7$$
$$Z_{test} = \frac{(0.8 - 0.6) - 0}{\sqrt{(0.7)(0.3) \left( \frac{1}{150} + \frac{1}{150} \right)}} = 3.7796$$
Since $Z_{test} = 3.7796 > 1.6449 = z_{0.05}$, we reject $H_0$.
The new medication helps to cure the fever at the 5% significance level.
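The two-proportion Z statistic with a pooled proportion, as used in this example, can be checked with the sketch below. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the example.
recovered_a, n_a = 120, 150
recovered_b, n_b = 90, 150
alpha = 0.05

p_hat_a = recovered_a / n_a
p_hat_b = recovered_b / n_b

# Pooled proportion under H0: p_A = p_B.
p_pooled = (recovered_a + recovered_b) / (n_a + n_b)

z = ((p_hat_a - p_hat_b) - 0) / sqrt(
    p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b)
)

# Right-tailed test: reject when z exceeds z_alpha.
z_crit = NormalDist().inv_cdf(1.0 - alpha)
reject = z > z_crit
print(round(p_pooled, 2), round(z, 4), reject)
```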
Exercise:
A researcher wanted to estimate the difference between the percentages of
users of two toothpastes who will never switch to another toothpaste. In a
sample of 500 users of toothpaste A taken by the researcher, 100 said that
they will never switch to another toothpaste. In another sample of 400 users
of toothpaste B taken by the same researcher, 68 said that they will never
switch to another toothpaste. At the 1% significance level, can we conclude
that the proportion of users of toothpaste A who will never switch to another
toothpaste is higher than the proportion of users of toothpaste B who will
never switch to another toothpaste?