LESSON 20: HYPOTHESIS TESTING
Outline
• Hypothesis testing
• Tests for the mean
• Tests for the proportion
HYPOTHESIS TESTING
THE CONTEXT
• Example 1: A supervisor of a production line wants to
determine if the production time of a critical part is the same
as its design time, say 100 seconds. A random sample of
parts is taken and their production times are measured.
Does the sample information provide enough evidence that
the mean production time of the part differs from 100 seconds? Of course,
both of the following are important:
– The production times sampled and
– The size of the sample
• In the above context, hypothesis testing provides a
technique to conclude whether the mean production time of the part
differs from 100 seconds.
HYPOTHESIS TESTING
THE CONTEXT
• Example 2: Suppose that a manager wants to produce a
new product if more than 10% of potential customers would buy the
product. A random sample of potential customers is asked
whether they would buy the product. Does the sample
information provide enough evidence that more than 10% of
potential customers will buy the new product? Of course,
both of the following are important:
– The yes/no answers provided by the respondents and
– The size of the sample
• In the above context, hypothesis testing provides a
technique to conclude whether more than 10% of potential
customers will buy the new product.
HYPOTHESIS TESTING
THE CONTEXT
• Example 3: Suppose that a quality control inspector wants
to determine if fewer than 2% of items are defective. A random
sample of items is inspected. Does the
sample information provide enough evidence that fewer than
2% of items are defective? Of course, both of the following are
important:
– The proportion of defectives observed in the sample and
– The size of the sample
• In the above context, hypothesis testing provides a
technique to conclude whether fewer than 2% of items are defective.
HYPOTHESIS TESTING
SOME TERMS
• Null Hypothesis, $H_O$
– The null hypothesis always specifies a single value. For
example, suppose that it is required to determine if the
population mean is 100. Then, the null hypothesis is
$H_O: \mu = 100$
– Note that since the null hypothesis always specifies a
single value, none of the below may be a null hypothesis:
$H_O: \mu \neq 100$
$H_O: \mu > 100$
$H_O: \mu < 100$
$H_O: 80 \le \mu \le 120$
HYPOTHESIS TESTING
SOME TERMS
• Alternative Hypothesis, $H_A$
– The alternative hypothesis is very important because the
conclusion of the hypothesis test is stated in terms of the
alternative hypothesis.
– Hypothesis testing provides a technique to determine if
there is enough statistical evidence that the alternative
hypothesis is true.
– There are three forms of alternative hypothesis:
• Two-tail test (Example 1): $H_O: \mu = 100$, $H_A: \mu \neq 100$
• One-tail (right-tail) test (Example 2): $H_O: p = 0.10$, $H_A: p > 0.10$
• One-tail (left-tail) test (Example 3): $H_O: p = 0.02$, $H_A: p < 0.02$
HYPOTHESIS TESTING
SOME TERMS
• Alternative Hypothesis, HA (One- and Two-Tail Tests)
– It is very important to choose the right form of the
alternative hypothesis. The form depends on the context.
– In Example 1, the supervisor wants to know if the mean
is 100 or different from 100. Both the too large and too
small values are equally undesirable. It is appropriate to
reject the claim if the sample mean is much different
from 100. So, the most appropriate test is the two-tail
test.
– In Examples 2 and 3 too small and too large
observations do not lead to the same action. When this
happens, a one-tail test is used.
HYPOTHESIS TESTING
SOME TERMS
• Alternative Hypothesis, HA (Left- and Right-Tail Tests)
– Choose between the left- and right-tail tests carefully.
– In Example 2 the manager wants to know if the
proportion is more than 0.10. So, $H_A: p > 0.10$ and the
most appropriate test is a right-tail test.
– In Example 3 the inspector wants to know if the
proportion is less than 0.02. So, $H_A: p < 0.02$ and the
most appropriate test is a left-tail test.
HYPOTHESIS TESTING
SOME TERMS
• Test Statistic and Rejection Region
– The test statistic is computed from the sample data.
– The test statistic is different for different tests. For
example, the test statistic for the z-test of mean is
$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
– The test statistic is the same for both one-tail and two-tail
tests. The rejection regions for one-tail and two-tail
tests are different.
– If the test statistic lies in the rejection region, the null
hypothesis is rejected, else the null hypothesis is not
rejected (Beware: not rejected $\neq$ accepted)
HYPOTHESIS TESTING
SOME TERMS
• Rejection Region and Level of Significance, $\alpha$
– Conclusions drawn from sample measurements are
usually expected to contain some errors
– Type I error
• To reject the null hypothesis when it’s actually true!
– Type II error
• To fail to reject the null hypothesis when it’s actually false!
– Level of significance, $\alpha$, specifies a limit on the probability
of committing a Type I error
– The rejection region is different for a different value of $\alpha$
HYPOTHESIS TESTING
SOME TERMS
• Rejection Region
– If the test statistic lies in the rejection region, the null
hypothesis is rejected, else the null hypothesis is not
rejected (not rejected $\neq$ accepted)
– The rejection regions for z-test are shown below:
• Two-tail test: reject the null hypothesis if $|z| > z_{\alpha/2}$
• Right-tail test: reject the null hypothesis if $z > z_{\alpha}$
• Left-tail test: reject the null hypothesis if $z < -z_{\alpha}$
– Where $z$ is the test statistic and $\alpha$ is the level of significance.
– Recall that $z_{\alpha}$ is that value of $z$ for which the area on the right
is $\alpha$.
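The critical values used in these rejection regions can be looked up numerically instead of from a table. The following is a minimal sketch using Python's standard library; the helper name `z_alpha` is an assumption of this sketch, not notation from the lecture.

```python
from statistics import NormalDist


def z_alpha(alpha: float) -> float:
    """Return the z value with area `alpha` to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1.0 - alpha)


# Print critical values for a few commonly used significance levels.
for alpha in (0.10, 0.05, 0.025, 0.01):
    print(f"alpha = {alpha:<5}  z_alpha = {z_alpha(alpha):.3f}")
```

For a two-tail test at level α, the relevant value is `z_alpha(alpha / 2)`, since the area α is split between the two tails.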
HYPOTHESIS TESTING
SOME TERMS
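• Rejection Region for two-tail z-test of mean
[Figure: sampling distribution f(x) of the sample mean, with standard deviation $\sigma/\sqrt{n}$; the two-tail rejection region consists of an area of $\alpha/2$ below $-z_{\alpha/2}$ and an area of $\alpha/2$ above $z_{\alpha/2}$.]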
HYPOTHESIS TESTING
SOME TERMS
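• Rejection Region for one-tail z-test of mean
[Figure: two panels showing the sampling distribution f(x) of the sample mean, with standard deviation $\sigma/\sqrt{n}$; the left-tail rejection region is the area $\alpha$ below $-z_{\alpha}$ and the right-tail rejection region is the area $\alpha$ above $z_{\alpha}$.]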
HYPOTHESIS TESTING
SOME TERMS
• Type I Error
– Example: Suppose that a manufacturer of packaged
cereals produces cereal boxes. Each box is expected to
have a net weight of 100 gm. Periodically, samples are
collected and the average weight of the sample is
measured. It is possible that although the system is
producing cereal boxes as usual, just because of some
random variation, a sample may contain all boxes with
weights less than 100 gm. Then, the manufacturer may
be tempted to assume some problem with the system,
stop the production and search for the problem. In this
case, the sample data provides a false alarm and a Type
I error is committed.
HYPOTHESIS TESTING
SOME TERMS
• Type II Error
– Not to reject the null hypothesis when the null hypothesis
is false! (the opposite of the Type I error). The probability
of committing a Type II error is denoted by $\beta$.
– Example: consider the manufacturer of the packaged
cereal again. Each cereal box is expected to have a net
weight of 100 gm. But, due to some problems in the
production system, the average weight is shifted to 98
gm. A Type II error is committed if a sample is collected
with average weight nearly 100 gm. Notice that in such a
case, the problem with the production system will not be
detected by the sample!
TESTING THE POPULATION MEAN WHEN
THE POPULATION VARIANCE IS KNOWN
A z-test is used in the following context:
• The measurements are normally distributed
• The population standard deviation, $\sigma$, is known
• It is desirable to know if the population mean is
– different from a given value (two-tail test)
– less than a given value (left-tail test)
– more than a given value (right-tail test)
TESTING THE POPULATION MEAN WHEN
THE POPULATION VARIANCE IS KNOWN
The test statistic and rejection region for the z-test are:
• Test statistic:
$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
where $\bar{X}$ is the sample mean, $\mu$ is the population mean
stated in the null hypothesis, $\sigma$ is the population standard
deviation and $n$ is the sample size.
• Rejection region:
– Two-tail test: reject the null hypothesis if $|z| > z_{\alpha/2}$
– Right-tail test: reject the null hypothesis if $z > z_{\alpha}$
– Left-tail test: reject the null hypothesis if $z < -z_{\alpha}$
where $\alpha$ is the level of significance.
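The test statistic and the three rejection rules can be combined into one small routine. This is only a sketch: the function name `z_test`, the tail labels, and the sample numbers are hypothetical and are not taken from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu_0, sigma, n, alpha, tail):
    """Return (z, reject) for a z-test of the mean with known sigma.

    `tail` is "two", "right", or "left" (labels chosen for this sketch).
    """
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "left":
        reject = z < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return z, reject


# Hypothetical data: sample mean 52 from n = 64, testing mu = 50 with sigma = 8.
z, reject = z_test(x_bar=52.0, mu_0=50.0, sigma=8.0, n=64, alpha=0.05, tail="two")
print(f"z = {z:.2f}, reject H_O: {reject}")
```

Note that the statistic is computed the same way for every tail; only the comparison against the critical value changes.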
TESTING THE POPULATION MEAN WHEN
THE POPULATION VARIANCE IS KNOWN
• For small population, $N < 10n$,
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$
and the test statistic for the z-test is:
$$z = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}$$
• For a single-decision procedure, the rejection regions are
stated in terms of the z-statistic.
• However, for a recurring-decision procedure, the rejection
regions are stated in terms of the actual measurements,
e.g., $\bar{X}$, $p$
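To see how much the finite population correction changes the standard error of the sample mean, the sketch below computes it with and without the correction. The function name `standard_error` and the numbers are chosen for illustration only.

```python
from math import sqrt


def standard_error(sigma, n, N=None):
    """Standard error of the sample mean.

    If a population size N is given, apply the finite population
    correction sqrt((N - n) / (N - 1)).
    """
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se


# Illustrative values: sigma = 160, n = 100, population of N = 500.
se_large = standard_error(sigma=160, n=100)          # no correction
se_small = standard_error(sigma=160, n=100, N=500)   # with correction
print(f"without correction: {se_large:.3f}")
print(f"with correction:    {se_small:.3f}")
```

The correction factor is always below 1 when n > 1, so it shrinks the standard error and makes the z-statistic larger in magnitude for the same sample mean.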
TESTING THE POPULATION MEAN WHEN
THE POPULATION VARIANCE IS KNOWN
Example 4: A machine that produces ball bearings is set so
that the average diameter is 0.60 inch. In a sample of 49 ball
bearings, the mean diameter was found to be 0.61 inch.
Assume that the standard deviation is 0.035. Does this
statistic provide sufficient evidence at the 5% significance
level to infer that the mean diameter is not 0.60 inch?
HO :
HA :
Test statistic:
Rejection region:
Conclusion:
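One way to check the arithmetic for Example 4 is sketched below. It assumes the two-tail form ($H_O: \mu = 0.60$, $H_A: \mu \neq 0.60$), since the question asks whether the mean diameter is "not 0.60 inch"; the variable names are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist

# Values given in Example 4.
mu_0, x_bar, sigma, n, alpha = 0.60, 0.61, 0.035, 49, 0.05

# z-statistic for the test of the mean with known sigma.
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-tail test: compare |z| with z_{alpha/2}.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit

print(f"z = {z:.2f}, z_crit = {z_crit:.3f}, reject H_O: {reject}")
```

Under these assumptions the statistic falls just inside the rejection region, so there is enough evidence at the 5% level to infer that the mean diameter is not 0.60 inch.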
HYPOTHESIS TESTING
INTERPRETATION
• If the null hypothesis is rejected
– Conclude that there is enough statistical evidence to
infer that the alternative hypothesis is true
• If the null hypothesis is not rejected
– Conclude that there is not enough statistical evidence to
infer that the alternative hypothesis is true
TESTING THE POPULATION MEAN WHEN
THE POPULATION VARIANCE IS KNOWN
Example 5: A random sample of 100 observations from a
normal population whose standard deviation is 50 produced a
mean of 145. Does this statistic provide sufficient evidence at
the 5% significance level to infer that the population mean is
more than 140?
HO :
HA :
Test statistic:
Rejection region:
Conclusion:
TESTING THE POPULATION MEAN WHEN
THE POPULATION VARIANCE IS KNOWN
Example 6: A random sample of 100 observations from a
normal population whose standard deviation is 50 produced a
mean of 145. Does this statistic provide sufficient evidence at
the 5% significance level to infer that the population mean is less
than 150?
HO :
HA :
Test statistic:
Rejection region:
Conclusion:
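Examples 5 and 6 use the same sample but different tails, which makes them convenient to compare side by side. The sketch below assumes a right-tail test against 140 for Example 5 and a left-tail test against 150 for Example 6; the variable names are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist

# Sample information shared by Examples 5 and 6.
x_bar, sigma, n, alpha = 145, 50, 100, 0.05
std_err = sigma / sqrt(n)
z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tail critical value z_alpha

# Example 5: right-tail test, H_A: mu > 140.
z5 = (x_bar - 140) / std_err
reject5 = z5 > z_crit

# Example 6: left-tail test, H_A: mu < 150.
z6 = (x_bar - 150) / std_err
reject6 = z6 < -z_crit

print(f"Example 5: z = {z5:.2f}, reject H_O: {reject5}")
print(f"Example 6: z = {z6:.2f}, reject H_O: {reject6}")
```

In both cases the statistic is one standard error away from the hypothesized mean, which is not far enough to reach the rejection region at the 5% level.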
TESTING THE POPULATION MEAN WHEN
THE POPULATION VARIANCE IS KNOWN
Example 7: The following hypotheses are to be tested, with
$\mu_{H_O} = 110$:
$H_O: \mu = \mu_{H_O}$
$H_A: \mu > \mu_{H_O}$
Assume σ = 25 and n = 100. The
following decision rule applies:
Reject $H_O$ if $\bar{X} > 114$
Accept $H_O$ if $\bar{X} \le 114$
Compute Type I error probability
when μ=110 and Type II error
probability when μ=115.
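The two error probabilities in Example 7 are areas under the sampling distribution of the sample mean, on either side of the cutoff 114. A minimal sketch of the calculation, assuming the sample mean is normally distributed with standard deviation σ/√n (variable names are assumptions of this sketch):

```python
from math import sqrt
from statistics import NormalDist

sigma, n, cutoff = 25, 100, 114
std_err = sigma / sqrt(n)  # standard deviation of the sample mean

# Type I error: rejecting H_O (sample mean > 114) when mu is really 110.
alpha = 1 - NormalDist(mu=110, sigma=std_err).cdf(cutoff)

# Type II error: not rejecting H_O (sample mean <= 114) when mu is really 115.
beta = NormalDist(mu=115, sigma=std_err).cdf(cutoff)

print(f"P(Type I error)  = {alpha:.4f}")
print(f"P(Type II error) = {beta:.4f}")
```

Moving the cutoff to the right would lower the Type I error probability but raise the Type II error probability, which is the usual trade-off for a fixed sample size.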
INFERENCE ABOUT POPULATION MEAN WHEN
THE POPULATION VARIANCE IS UNKNOWN
• If the population variance, $\sigma^2$, is not known, we cannot
compute the z-statistic. However, we may compute a similar
statistic, the t-statistic, that uses the sample standard
deviation $s$ in place of the population standard deviation $\sigma$:
$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$
• $t_{\alpha}$ is that value of $t$ for which the area to its right under the
Student t-curve equals $\alpha$.
• $t_{\alpha,df}$ is that value of $t$ for which the area to its right under the
Student t-curve for degrees of freedom = df equals $\alpha$.
• The value $t_{\alpha,df}$ is obtained from Table G, Appendix A, p. 541.
INFERENCE ABOUT POPULATION MEAN WHEN
THE POPULATION VARIANCE IS UNKNOWN
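• Test of Hypothesis about a population mean when the
population variance is unknown:
– Test statistic:
$$t = \frac{\bar{X} - \mu_{H_O}}{s / \sqrt{n}} \quad \text{or} \quad t = \frac{\bar{X} - \mu_{H_O}}{\dfrac{s}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}} \ \text{for small } N$$
– Rejection region:
• Two-tail test: $|t| > t_{\alpha/2,\,n-1}$
• Right-tail test: $t > t_{\alpha,\,n-1}$
• Left-tail test: $t < -t_{\alpha,\,n-1}$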
INFERENCE ABOUT POPULATION MEAN WHEN
THE POPULATION VARIANCE IS UNKNOWN
Example 8: Let $n = 100$, $N = 500$, $\sigma = 160$
$H_O: \mu = 5{,}000$
$H_A: \mu > 5{,}000$
Suppose that $\bar{X}$ will serve as the test statistic.
(a) Assuming that a significance level of α=0.01 is desired, find
the critical value for the sample mean and determine the
decision rule.
(b) Should H0 be accepted or rejected if the computed sample
mean turns out to be $\bar{X} = 5{,}050$?
INFERENCE ABOUT POPULATION MEAN WHEN
THE POPULATION VARIANCE IS UNKNOWN
Example 9: A random sample of 9 observations was drawn
from a large population. These are: 11, 9, 5, 7, 1, 2, 10, 6, 3. Test
to determine if we can infer at the 5% significance level that
the population mean is less than 10.
HO :
HA :
Test statistic:
Rejection region:
Conclusion:
Answer:
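The calculation for Example 9 can be sketched as follows, assuming a left-tail test ($H_O: \mu = 10$, $H_A: \mu < 10$) with $n - 1 = 8$ degrees of freedom. Python's standard library has no Student t quantile function, so the critical value 1.860 is hard-coded here as read from a standard t table; that constant and the variable names are assumptions of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

data = [11, 9, 5, 7, 1, 2, 10, 6, 3]
mu_0 = 10

n = len(data)
x_bar = mean(data)
s = stdev(data)  # sample standard deviation (divides by n - 1)

# t-statistic with n - 1 degrees of freedom.
t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Assumed table value: t_{0.05, 8} = 1.860 (area 0.05 to the right, df = 8).
T_CRIT = 1.860
reject = t_stat < -T_CRIT

print(f"x_bar = {x_bar}, s = {s:.3f}, t = {t_stat:.3f}, reject H_O: {reject}")
```

Under these assumptions the statistic lies well inside the left-tail rejection region, so the sample supports the inference that the population mean is less than 10.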
INFERENCE ABOUT
A POPULATION PROPORTION
• Let
π = the population proportion
p = the sample proportion
n = the sample size
Then,
The expected number of successes = $n\pi$
Standard deviation of successes = $\sqrt{n\pi(1 - \pi)}$
The expected sample proportion: $E(p) = \pi$
Standard deviation of the sample proportion:
$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$$
INFERENCE ABOUT
A POPULATION PROPORTION
• Test of Hypothesis about a population proportion:
– Test statistic:
$$z = \frac{p - \pi_{H_O}}{\sqrt{\pi_{H_O}(1 - \pi_{H_O}) / n}}$$
– Rejection region:
• Two-tail test: $|z| > z_{\alpha/2}$
• Right-tail test: $z > z_{\alpha}$
• Left-tail test: $z < -z_{\alpha}$
INFERENCE ABOUT
A POPULATION PROPORTION
Example 10: Test the following hypothesis:
$H_O: \pi = 0.40$
$H_A: \pi \neq 0.40$
$\alpha = 0.05,\ n = 100,\ p = 0.35$
Test statistic:
Rejection region:
Conclusion:
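Because the test statistic is the same for one-tail and two-tail tests, it can be computed once for the Example 10 values (hypothesized proportion 0.40, n = 100, sample proportion 0.35, 5% level) and then compared with each of the three rejection regions. The following is a sketch; the variable names and the dictionary of tail labels are assumptions.

```python
from math import sqrt
from statistics import NormalDist

pi_0, n, p_hat, alpha = 0.40, 100, 0.35, 0.05

# z-statistic for a proportion, using the hypothesized value in the standard error.
z = (p_hat - pi_0) / sqrt(pi_0 * (1 - pi_0) / n)

std_normal = NormalDist()
decisions = {
    "two-tail": abs(z) > std_normal.inv_cdf(1 - alpha / 2),
    "right-tail": z > std_normal.inv_cdf(1 - alpha),
    "left-tail": z < -std_normal.inv_cdf(1 - alpha),
}

print(f"z = {z:.4f}")
for tail, reject in decisions.items():
    print(f"{tail}: reject H_O: {reject}")
```

With these numbers the statistic is only about one standard error below the hypothesized proportion, so it does not reach the rejection region under any of the three forms.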
INFERENCE ABOUT
A POPULATION PROPORTION
Example 11: In a television commercial, the manufacturer of a
toothpaste claims that more than seven out of 10 dentists
recommend the ingredients in his product. To test the claim, a
consumer protection group randomly samples 400 dentists
and asks each one whether he or she recommends a
toothpaste that contains the ingredients. A total of 290
dentists answered “Yes.” At the 5% significance level, can the
consumer group infer that the claim is true?
HO :
HA :
Test statistic:
Rejection region:
Conclusion:
Answer:
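A sketch of the arithmetic for Example 11, assuming a right-tail test of the claim "more than seven out of 10" ($H_O: \pi = 0.70$, $H_A: \pi > 0.70$); the variable names are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist

pi_0, n, yes_count, alpha = 0.70, 400, 290, 0.05

p_hat = yes_count / n  # sample proportion

# z-statistic for a proportion, using the hypothesized value in the standard error.
z = (p_hat - pi_0) / sqrt(pi_0 * (1 - pi_0) / n)

# Right-tail test: reject H_O if z > z_alpha.
z_crit = NormalDist().inv_cdf(1 - alpha)
reject = z > z_crit

print(f"p = {p_hat:.3f}, z = {z:.3f}, z_crit = {z_crit:.3f}, reject H_O: {reject}")
```

Under these assumptions the sample proportion is above 0.70 but not by enough to reject the null hypothesis, so there is not enough statistical evidence at the 5% level to infer that the claim is true.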
INFERENCE ABOUT
A POPULATION PROPORTION
Example 12: Test the following hypothesis:
$H_O: \pi = 0.20$
$H_A: \pi < 0.20$
$\alpha = 0.10,\ n = 900,\ p = 0.18$
Test statistic:
Rejection region:
Conclusion:
READING AND EXERCISES
Lesson 20
Reading:
Sections 11-1 to 11-3, pp. 344-389
Exercises:
11-4, 11-12, 11-14, 11-23, 11-31