Download Chapter 2 - UniMAP Portal

CHAPTER 2
Hypothesis Testing
- Test for one mean
- Test for two means
- Test for one and two proportions

WHY DO WE TEST HYPOTHESES?

To make decisions about populations based on sample information.
Example: we wish to know whether a medicine is really effective in
curing a disease. We take a sample of patients, record data on the
effect of the medicine, and make a decision from those data.
To reach such decisions, it is useful to make assumptions about
the populations. Such assumptions, which may or may not be true, are
called statistical hypotheses.
Type I and Type II error
Four outcomes of a decision:
There are two possible correct outcomes and two possible
incorrect outcomes.
Type I error
- Occurs if you reject the null hypothesis when it is true.
- The level of significance is the maximum probability
of committing a Type I error ($\alpha$).
Type II error
- Occurs if you do not reject the null hypothesis
when it is false.
- The probability of a Type II error is symbolized by $\beta$.
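The meaning of $\alpha$ can be illustrated with a small simulation. The sketch below is not part of the original slides: it assumes a hypothetical standard normal population for which the null hypothesis is true, runs a two-tailed z-test many times, and counts how often the test rejects anyway. The function name `type_i_error_rate` and all parameter values are illustrative assumptions.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=4000, seed=1):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    mu0, sigma = 0.0, 1.0                          # hypothetical population; H0: mu = mu0 is true
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        z = (xbar - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1                        # Type I error: a true H0 was rejected
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

With enough trials the estimated rate should settle close to the chosen level of significance, which is what "probability of committing a Type I error" means in practice.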
Example Hypothesis Testing and Jury Trial:
Definitions
Hypothesis Test:
It is a process of using sample data and statistical procedures to
decide whether to reject or not to reject the hypothesis (statement)
about a population parameter value (or about its distribution
characteristics).
Null Hypothesis ($H_0$):
States that there is no difference between a parameter and a specific
value, or that there is no difference between two parameters.
Alternative Hypothesis ($H_1$):
States the existence of a difference between a parameter and a specific
value, or states that there is a difference between two parameters.
Test Statistic:
It is a function of the sample data on which the decision is to be
based.
Critical/Rejection region:
It is the set of values of the test statistic for which the null
hypothesis will be rejected.
Critical point:
It is the first (or boundary) value in the critical region.
P-value:
The probability, calculated assuming $H_0$ is true, of obtaining a test
statistic at least as extreme as the one observed. The smaller the
p-value, the more the data contradict $H_0$.
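As a rough illustration of how a p-value follows from a test statistic, the sketch below converts a z value into a p-value using the standard normal distribution from Python's standard library. The helper name `z_p_value` and the sample z value are assumptions made for illustration only.

```python
from statistics import NormalDist

def z_p_value(z, tail="two"):
    """p-value of a z statistic for a left-, right-, or two-tailed test."""
    std_normal = NormalDist()              # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")

p = z_p_value(-1.4212)                     # illustrative z value, two-tailed
print(f"two-tailed p-value: {p:.4f}")
```

A p-value larger than the level of significance leads to the same decision as a test statistic that falls outside the critical region: do not reject the null hypothesis.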
Procedure for hypothesis testing
1. Define the question to be tested and formulate the hypotheses
stating the problem.
2. Determine the critical value and critical region.
3. Choose the appropriate test statistic and calculate its value from
the sample. The choice of test statistic depends on the probability
distribution of the random variable involved in the hypothesis.
4. Decide whether or not to reject the null hypothesis.
5. State the conclusion.

Hypothesis Testing for One Population Mean
1. Formulate the hypotheses and identify the claim.
Two-Tailed Test: $H_0: \mu = k$, $H_1: \mu \neq k$
Right-Tailed Test: $H_0: \mu = k$ or $\mu \le k$, $H_1: \mu > k$
Left-Tailed Test: $H_0: \mu = k$ or $\mu \ge k$, $H_1: \mu < k$
2. Find the critical value from the statistical book (page 26 or
page 28).
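3. Test Statistic
$\sigma$ known (normal population):
$Z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
$\sigma$ unknown, $n \ge 30$:
$Z = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
$\sigma$ unknown, $n < 30$:
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$, with degrees of freedom $v = n - 1$
Here $\mu$ takes the hypothesized value $k$.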
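The choice among the three test statistics can be written as a small decision function. This is a minimal sketch, not the slides' own method: the function name `one_mean_test_statistic` and its argument names are hypothetical, and it assumes the population is approximately normal whenever the t statistic is used.

```python
from math import sqrt

def one_mean_test_statistic(xbar, k, n, sigma=None, s=None):
    """Return (name, value, degrees_of_freedom) for a one-mean test.

    sigma: population standard deviation, if known
    s:     sample standard deviation, used when sigma is unknown
    """
    if sigma is not None:                       # sigma known: Z statistic
        return "z", (xbar - k) / (sigma / sqrt(n)), None
    if s is None:
        raise ValueError("provide sigma (known) or s (sample standard deviation)")
    value = (xbar - k) / (s / sqrt(n))
    if n >= 30:                                 # sigma unknown, large sample: Z statistic
        return "z", value, None
    return "t", value, n - 1                    # sigma unknown, small sample: t with n - 1 df

name, value, df = one_mean_test_statistic(xbar=7.7, k=8, n=10, s=1.7029)
print(name, round(value, 4), df)
```

The call at the end uses a small sample with unknown $\sigma$, so the function selects the t statistic with $n - 1$ degrees of freedom.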
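4. Rejection Rule
$H_1: \mu \neq k$: reject $H_0$ if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
$H_1: \mu > k$: reject $H_0$ if $z > z_{\alpha}$
$H_1: \mu < k$: reject $H_0$ if $z < -z_{\alpha}$
5. Conclusion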
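The rejection rule for the three forms of $H_1$ can be sketched in code as follows. The function name `reject_h0_z` is hypothetical, and the critical value is computed with the standard library's normal distribution rather than read from the statistical tables. For a t statistic the same comparisons apply with a critical value taken from the t table.

```python
from statistics import NormalDist

def reject_h0_z(z, alpha, tail):
    """Apply the z rejection rule for a 'two', 'right' or 'left' tailed test."""
    std_normal = NormalDist()
    if tail == "two":
        z_half = std_normal.inv_cdf(1 - alpha / 2)   # z_{alpha/2}
        return z < -z_half or z > z_half
    z_alpha = std_normal.inv_cdf(1 - alpha)          # z_{alpha}
    if tail == "right":
        return z > z_alpha
    if tail == "left":
        return z < -z_alpha
    raise ValueError("tail must be 'two', 'right' or 'left'")

print(reject_h0_z(2.5, 0.05, "right"))   # statistic in the upper tail
print(reject_h0_z(-1.0, 0.05, "two"))    # statistic between the two critical points
```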
Example 2.1:
A sample of 50 Internet shoppers was asked how much they
spent per year on the Internet. From this sample, the mean expense
per year on the Internet is RM30460 and the sample standard deviation
is RM10151. Test whether the mean expense is RM32500 per year
or not, at $\alpha = 0.05$.
Solution:
1. The hypotheses tested are:
$H_0: \mu = 32500$ (claim)
$H_1: \mu \neq 32500$
2. Critical Value: the test is two-tailed ($\neq$), so $\alpha$ is divided
by two: $\alpha/2 = 0.05/2 = 0.025$
$\pm Z_{0.025} = \pm 1.96$, so $H_0$ is not rejected for $Z$ in $(-1.96, 1.96)$
3. Test Statistic:
$Z = \dfrac{30460 - 32500}{10151/\sqrt{50}} = \dfrac{-2040}{1435.5682} = -1.4210$
4. Rejection Region:
$Z_{test} = -1.4210 > -Z_{0.025} = -1.96$
Do not reject $H_0$.
5. Conclusion: At $\alpha = 0.05$ there is not enough evidence to reject
the claim that Internet shoppers spend a mean of RM32500 per year on
the Internet.
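The arithmetic of Example 2.1 can be checked with a few lines of Python. This is a sketch under the figures given in the problem statement (n = 50, sample mean 30460, sample standard deviation 10151); the variable names are illustrative, and the critical value is computed rather than looked up in the tables.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 50, 30460, 10151        # sample size, sample mean, sample standard deviation
k, alpha = 32500, 0.05               # hypothesized mean and level of significance

standard_error = s / sqrt(n)
z = (xbar - k) / standard_error                  # n >= 30, so the Z statistic is used
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # two-tailed critical value, about 1.96
reject = abs(z) > z_crit

print(f"z = {z:.4f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

The statistic falls between the two critical points, so the decision is not to reject $H_0$.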
Example 2.2:
A random sample of 10 individuals who listen to radio was
selected and the hours per week that each listens to radio
were determined. The data are as follows:
9 8 7 4 8 6 8 8 9 10
Test the hypothesis that the mean hours individuals listen to radio
is less than 8 hours at $\alpha = 0.01$.
*Hint: Find $\bar{x}$ and $s$
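Solution:
1. The hypotheses tested are:
$H_0: \mu \ge 8$
$H_1: \mu < 8$ (claim)
2. Critical Value:
$t_{0.01,\,10-1} = t_{0.01,\,9} = 2.8214$; the test is left-tailed, so the critical value is $-2.8214$
3. Test Statistic: $n < 30$
$\bar{x} = 7.7$, $s = 1.7029$
$t_{test} = \dfrac{7.7 - 8}{1.7029/\sqrt{10}} = \dfrac{-0.3}{0.5385} = -0.5571$
4. Rejection Region: $t_{test} = -0.5571 > -t_{0.01,9} = -2.8214$
Do not reject $H_0$.
5. Conclusion:
At $\alpha = 0.01$ there is not enough evidence to support the claim that
the mean hours individuals listen to radio is less than 8 hours.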
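The hint in Example 2.2 (find the sample mean and sample standard deviation) can be followed directly in Python. The sketch below uses only the standard library, so the critical value 2.8214 is taken from the solution above rather than computed. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

hours = [9, 8, 7, 4, 8, 6, 8, 8, 9, 10]   # data from Example 2.2
k = 8                                      # hypothesized mean
t_crit = 2.8214                            # t_{0.01, 9}, table value quoted in the solution

n = len(hours)
xbar = mean(hours)                         # sample mean
s = stdev(hours)                           # sample standard deviation (divides by n - 1)
t = (xbar - k) / (s / sqrt(n))             # n < 30, so the t statistic is used
reject = t < -t_crit                       # left-tailed rejection rule

print(f"xbar = {xbar}, s = {s:.4f}, t = {t:.4f}, reject H0: {reject}")
```

Note that `stdev` divides by $n - 1$, which matches the sample standard deviation used in the t statistic.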
Exercise 2.1:
1. From the following data, test the null hypothesis that the population
mean is less than or equal to 100 at the 5% significance level.
105 108 112 121 100 105 99 107 112 122 118 105
Ans: t = 4.3043, Reject
2. A paint manufacturing company claims that the mean drying time for its
paint is at most 45 minutes. A random sample of 35 trials was tested. It is
found that the sample mean drying time is 49.50 minutes with standard
deviation 3 minutes. Assume that the drying times follow a normal
distribution. At the 1% significance level, is there sufficient evidence to
support the company's claim? Ans: z = 8.8741, Reject
3. The quality control manager claims that the daily yield at a
local chemical plant has averaged less than 80 tons for the last
several years. A random sample of 50 days gives an average yield of
71 tons with a standard deviation of 21 tons.
(a) Construct a 95% confidence interval for the mean yield at the local
chemical plant. Ans: [65.1791, 76.8209]
(b) Construct a 99% confidence interval for the mean yield at the local
chemical plant. Ans: [63.3503, 78.6497]
(c) Calculate the widths of the confidence intervals in (a) and (b). Explain
the difference in width. Ans: 11.6418, 15.2994
(d) Set up a hypothesis test of whether the quality control manager's claim
is true or not at the 5% significance level. Ans: Z = -3.0305, Reject
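The stated answers for question 3 can be checked with a short script. This is only a sketch for self-checking, assuming the large-sample z interval $\bar{x} \pm z_{\alpha/2}\, s/\sqrt{n}$; the helper name `z_confidence_interval` is hypothetical. Small differences in the fourth decimal place can arise because the tables round the critical values.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, s, n, level):
    """Large-sample confidence interval for a mean at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # e.g. about 1.96 for level = 0.95
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin

lo95, hi95 = z_confidence_interval(71, 21, 50, 0.95)
lo99, hi99 = z_confidence_interval(71, 21, 50, 0.99)
z_test = (71 - 80) / (21 / sqrt(50))                # test statistic for part (d)

print(f"95% CI: [{lo95:.4f}, {hi95:.4f}], width {hi95 - lo95:.4f}")
print(f"99% CI: [{lo99:.4f}, {hi99:.4f}], width {hi99 - lo99:.4f}")
print(f"z = {z_test:.4f}")
```

The 99% interval is wider because a higher confidence level requires a larger critical value for the same standard error.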