CHAPTER 11
MORE ON CONFIDENCE INTERVAL
AND HYPOTHESIS TESTING
Outline
• The t-distribution
– Confidence interval
– Hypothesis testing
• Inference about proportions
– Confidence interval
– Hypothesis testing
INFERENCE ABOUT POPULATION MEAN WHEN
THE POPULATION VARIANCE IS UNKNOWN
• Chapters 9 and 10 discuss how to estimate a confidence
interval for the population mean, and how to test hypotheses
about it, when the population variance is known.
• If the population variance, $\sigma^2$, is not known, we cannot
compute the z-statistic as
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
• However, we may compute a similar statistic, the t-statistic,
that uses the sample standard deviation s in place of the
population standard deviation $\sigma$:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
• If the sampled population is normally distributed, the t-statistic
follows what is called the Student t-distribution.
• The Student t-distribution is similar to the normal distribution.
The Student t-distribution is
– symmetrical about zero
– mound-shaped, whereas the normal distribution is bell-shaped
– more spread out than the normal distribution.
• The difference between the t-distribution and the normal
distribution depends on the degrees of freedom, d.f. = n-1. For
small d.f., the difference is larger. For large d.f., the
t-distribution approaches the normal distribution. (See next)
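The comparison above can be checked numerically. The sketch below uses only Python's standard library and evaluates the Student t density at 0 (the peak) and at 3 (out in the tail) for several degrees of freedom, next to the standard normal density. The helper name `t_pdf` is an illustrative choice, not something defined in the chapter.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the Student t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

normal = NormalDist()  # standard normal: mean 0, standard deviation 1

print("d.f.   peak f(0)   tail f(3)")
for df in (1, 2, 3, 4, 5, 30):
    print(f"{df:>4}   {t_pdf(0, df):.4f}      {t_pdf(3, df):.4f}")
print(f"norm   {normal.pdf(0):.4f}      {normal.pdf(3):.4f}")
```

As d.f. grows, the peak rises and the tail thins toward the normal values, which is the sense in which the t-distribution is more spread out for small d.f. and approaches the normal distribution for large d.f.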
[Figures: Student t-distribution curves for d.f. = 1, 2, 3, 4, 5, and 30]
• We shall use the Student t-distribution to make inferences
about the population mean when the population variance is
unknown. The inferences include:
– Estimating a confidence interval for the population
mean
– Testing a hypothesis about the population mean
• Notation:
– $t_A$ is the value of t for which the area to its right under
the Student t-curve equals A.
– $t_{A,df}$ is the value of t for which the area to its right under
the Student t-curve with df degrees of freedom equals A.
• The value of $t_{A,df}$ is obtained from Table 11.1 on p. 373 and
Appendix B, Table 4 on p. 838.
• The table provides t-values for given areas. Hence, the
table is useful for estimating confidence intervals and testing
hypotheses. However, the table does not give areas for all
t-values, so it cannot be used to find p-values. Excel
may be used to find p-values. See next.
• Some of the Excel functions for the Student t-distribution are:
– TDIST(t, df, number of tails): Given t, df, and the number of
tails, finds the area in the tail(s).
• For example, TDIST(2,60,1) = 0.025
– This means that for t=2 and degrees of
freedom = 60, the area to the right of t=2 is 0.025.
– Also, for t=-2 and degrees of freedom = 60, the area
to the left of t=-2 is 0.025.
• To get a p-value, use the TDIST function.
– TINV(tail area, df): Assumes two tails. Given the area in
two tails and df, finds t. To get t for one tail, multiply the
area by 2.
• For example, TINV(0.1,35) = 1.69.
– This means that for area in two tails = 0.10 and
degrees of freedom = 35, t=1.69.
– This also means that for area in the right tail =
0.10/2 = 0.05 and degrees of freedom = 35,
t=1.69.
– And for area in the left tail = 0.10/2 = 0.05 and
degrees of freedom = 35, t=-1.69.
• So, to get $t_{A,df}$, use TINV(2A, df). Note
that the area is multiplied by 2.
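For readers without Excel, the two functions can be approximated in a few lines. The sketch below is an assumption-laden stand-in, not Excel's implementation: it integrates the t density numerically with Simpson's rule and inverts the tail area by bisection. The names `tdist` and `tinv` only mirror the Excel names, and the code handles t >= 0 only. It evaluates the two worked calls from the text, TDIST(2,60,1) and TINV(0.1,35).

```python
import math

def t_pdf(x, df):
    """Density of the Student t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def right_tail_area(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def tdist(t, df, tails):
    """Stand-in for TDIST(t, df, tails): tail area for t >= 0, tails in {1, 2}."""
    return tails * right_tail_area(t, df)

def tinv(two_tail_area, df):
    """Stand-in for TINV(area, df): t > 0 whose two-tail area equals the input.

    two_tail_area must lie strictly between 0 and 1.
    """
    lo, hi = 0.0, 1.0
    while tdist(hi, df, 2) > two_tail_area:  # widen until the answer is bracketed
        hi *= 2
    for _ in range(60):                      # bisection on the tail area
        mid = (lo + hi) / 2
        if tdist(mid, df, 2) > two_tail_area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"TDIST(2, 60, 1) is about {tdist(2, 60, 1):.3f}")
print(f"TINV(0.1, 35)   is about {tinv(0.10, 35):.2f}")
```

To get $t_{A,df}$ for a one-tail area A, call `tinv(2 * A, df)`, matching the doubling rule described above.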
• Confidence interval estimator of the population mean when
the population variance is unknown:
$$\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$$
where
• $\bar{x}$ is the sample mean
• s is the sample standard deviation
• n is the sample size
• $t_{\alpha/2,\,n-1}$ is the value of t for which the area to the right of t is $\alpha/2$
when the degrees of freedom is n-1
• Test of hypothesis about a population mean when the
population variance is unknown:
– Test statistic: $t = \frac{\bar{x} - \mu_{H_O}}{s/\sqrt{n}}$
– Rejection region:
• Two-tail test: $|t| > t_{\alpha/2,\,n-1}$
• Right-tail test: $t > t_{\alpha,\,n-1}$
• Left-tail test: $t < -t_{\alpha,\,n-1}$
Example 1: A random sample of 9 observations was drawn
from a large population. These are: 11, 9, 5, 7, 1, 2, 10, 6, 3.
Estimate the population mean with 90% confidence.
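One way to work Example 1 is to apply the confidence interval estimator $\bar{x} \pm t_{\alpha/2,\,n-1}\, s/\sqrt{n}$ directly. The sketch below is illustrative rather than the instructor's worked answer. It assumes the critical value $t_{0.05,8} = 1.860$ is read from a standard t-table, as the chapter suggests, and the variable names are ours.

```python
import math
from statistics import mean, stdev

data = [11, 9, 5, 7, 1, 2, 10, 6, 3]
n = len(data)
x_bar = mean(data)   # sample mean
s = stdev(data)      # sample standard deviation (divides by n - 1)

# 90% confidence means alpha = 0.10, so the critical value is t_{0.05, 8}.
# Assumption: 1.860 is taken from a standard t-table (Excel: TINV(0.10, 8)).
t_crit = 1.860

half_width = t_crit * s / math.sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width
print(f"mean = {x_bar}, s = {s:.3f}")
print(f"90% confidence interval: ({lower:.2f}, {upper:.2f})")
```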
Example 2: A random sample of 9 observations was drawn
from a large population. These are: 11, 9, 5, 7, 1, 2, 10, 6, 3. Test
to determine if we can infer at the 5% significance level that
the population mean is not equal to 5.
HO :
HA :
Rejection region:
Test statistic:
Conclusion:
Answer:
Example 3: A random sample of 9 observations was drawn
from a large population. These are: 11, 9, 5, 7, 1, 2, 10, 6, 3. Test
to determine if we can infer at the 5% significance level that
the population mean is greater than 3.
HO :
HA :
Rejection region:
Test statistic:
Conclusion:
Answer:
Example 4: A random sample of 9 observations was drawn
from a large population. These are: 11, 9, 5, 7, 1, 2, 10, 6, 3. Test
to determine if we can infer at the 5% significance level that
the population mean is less than 10.
HO :
HA :
Rejection region:
Test statistic:
Conclusion:
Answer:
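Examples 2, 3, and 4 use the same sample and differ only in the hypothesized mean and the tail of the test, so they can be sketched together. The code below is one possible working, not the official answer key. It assumes the critical values for 8 degrees of freedom are read from a standard t-table, and the function and variable names are illustrative.

```python
import math
from statistics import mean, stdev

data = [11, 9, 5, 7, 1, 2, 10, 6, 3]
n = len(data)
std_error = stdev(data) / math.sqrt(n)   # s / sqrt(n)

# Assumption: critical values for 8 degrees of freedom from a standard t-table.
T_025_8 = 2.306   # t_{0.025, 8}, two-tail test at alpha = 0.05
T_05_8 = 1.860    # t_{0.05, 8}, one-tail test at alpha = 0.05

def t_statistic(mu_0):
    """t = (x_bar - mu_0) / (s / sqrt(n)) for the hypothesized mean mu_0."""
    return (mean(data) - mu_0) / std_error

# Example 2: HA: mu != 5 (two-tail). Reject HO if |t| > t_{0.025, 8}.
t2 = t_statistic(5)
reject2 = abs(t2) > T_025_8

# Example 3: HA: mu > 3 (right-tail). Reject HO if t > t_{0.05, 8}.
t3 = t_statistic(3)
reject3 = t3 > T_05_8

# Example 4: HA: mu < 10 (left-tail). Reject HO if t < -t_{0.05, 8}.
t4 = t_statistic(10)
reject4 = t4 < -T_05_8

for label, t, reject in [("Example 2", t2, reject2),
                         ("Example 3", t3, reject3),
                         ("Example 4", t4, reject4)]:
    print(f"{label}: t = {t:.3f}, reject HO: {reject}")
```

With these table values, the sketch rejects HO in Examples 3 and 4 but not in Example 2.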
INFERENCE ABOUT
A POPULATION PROPORTION
• If the data are qualitative, we cannot compute a mean. So, the
techniques described for inference about the mean do not
help.
• For qualitative data, we can only count the number of times
each value of the variable occurs. From these counts, we
can compute proportions. Thus, the parameter of interest
is the population proportion, p.
• Let
p = the population proportion
n = the sample size
Then,
The expected number of successes = np
Standard deviation of the number of successes = $\sqrt{np(1-p)}$
The expected sample proportion: $E(\hat{p}) = p$
Standard deviation of the sample proportion: $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
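The expected value and standard deviation of the sample proportion can be checked by simulation. The sketch below draws many samples and compares the observed mean and standard deviation of the sample proportions with the formulas above. The values p = 0.3, n = 100, the number of repetitions, and the seed are hypothetical choices made only for illustration.

```python
import math
import random
from statistics import mean, pstdev

random.seed(1)        # fixed seed so the run is repeatable
p, n = 0.3, 100       # hypothetical population proportion and sample size
repetitions = 5000

# Each repetition draws n success/failure trials and records the sample proportion.
p_hats = [sum(random.random() < p for _ in range(n)) / n
          for _ in range(repetitions)]

theory_sd = math.sqrt(p * (1 - p) / n)
print(f"mean of sample proportions: {mean(p_hats):.4f}   (theory: {p})")
print(f"s.d. of sample proportions: {pstdev(p_hats):.4f}   (theory: {theory_sd:.4f})")
```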
• Confidence interval estimator of the population proportion:
$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
where
• $\hat{p}$ is the sample proportion of successes
• n is the sample size
• $z_{\alpha/2}$ is the value of z for which the area to the right of z is $\alpha/2$
• Test of hypothesis about a population proportion:
– Test statistic: $z = \frac{\hat{p} - p_{H_O}}{\sqrt{p_{H_O}(1-p_{H_O})/n}}$
– Rejection region:
• Two-tail test: $|z| > z_{\alpha/2}$
• Right-tail test: $z > z_{\alpha}$
• Left-tail test: $z < -z_{\alpha}$
• The p-value of a test of hypothesis:
– Test statistic: $z = \frac{\hat{p} - p_{H_O}}{\sqrt{p_{H_O}(1-p_{H_O})/n}}$
– The p-value:
• Two-tail test: the area to the right of |z| plus the area
to the left of -|z|
• Right-tail test: the area to the right of z
• Left-tail test: the area to the left of z
Example 5: In a random sample of 300, we found 80
successes. Estimate the population proportion with 99%
confidence.
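Example 5 can be worked by plugging the sample values into $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The sketch below is one possible working, with illustrative variable names. It obtains $z_{\alpha/2}$ from Python's `statistics.NormalDist` rather than from a printed normal table.

```python
import math
from statistics import NormalDist

successes, n = 80, 300
p_hat = successes / n                          # sample proportion

alpha = 0.01                                   # 99% confidence
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 2.576

half_width = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - half_width, p_hat + half_width
print(f"p-hat = {p_hat:.4f}, z = {z_crit:.3f}")
print(f"99% confidence interval: ({lower:.3f}, {upper:.3f})")
```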
Example 6: Test the following hypothesis:
$H_O$: p = 0.40
$H_A$: p [relation symbol missing] 0.40
$\alpha = 0.05$, n = 100, $\hat{p} = 0.35$
Rejection region:
Test statistic:
Conclusion:
Example 7: In a television commercial, the manufacturer of a
toothpaste claims that more than seven out of 10 dentists
recommend the ingredients in his product. To test the claim, a
consumer protection group randomly samples 400 dentists
and asks each one whether he or she recommends a
toothpaste that contains the ingredients. A total of 290
dentists answered “Yes.” At the 5% significance level, can the
consumer group infer that the claim is true?
HO :
HA :
Rejection region:
Test statistic:
Conclusion:
Answer:
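A possible working of Example 7 follows. It reads the claim "more than seven out of 10" as the alternative hypothesis, so the test is right-tailed with HO: p = 0.70 and HA: p > 0.70. That reading, and the variable names, are our assumptions. The critical value comes from `statistics.NormalDist` instead of a normal table.

```python
import math
from statistics import NormalDist

n, yes = 400, 290
p_hat = yes / n      # sample proportion, 0.725
p_0 = 0.70           # value of p under HO
alpha = 0.05

# Right-tail test: HO: p = 0.70 against HA: p > 0.70.
z = (p_hat - p_0) / math.sqrt(p_0 * (1 - p_0) / n)
z_crit = NormalDist().inv_cdf(1 - alpha)   # z_alpha, about 1.645

print(f"z = {z:.3f}, critical value = {z_crit:.3f}")
print("reject HO" if z > z_crit else "do not reject HO")
```

Under this setup z falls short of the critical value, so the sample does not provide enough evidence at the 5% level to infer that the claim is true.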
Example 8: Test the following hypothesis:
$H_O$: p = 0.20
$H_A$: p [relation symbol missing] 0.20
$\alpha = 0.10$, n = 900, $\hat{p} = 0.18$
Rejection region:
Test statistic:
Conclusion:
Example 9: Find the p-values of the tests in Examples 6, 7, and 8.
6. Test:
Test statistic:
The p-value:
7. Test:
Test statistic:
The p-value:
8. Test:
Test statistic:
The p-value:
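The p-value rules for proportion tests can be sketched as a small helper. The code below computes the z-statistic for Examples 6, 7, and 8 from the sample values given there and prints the p-value for each kind of tail, so the one matching each example's HA can be read off. The function names and the `tail` argument are illustrative, and the normal areas come from `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def z_statistic(p_hat, p_0, n):
    """z = (p_hat - p_0) / sqrt(p_0 (1 - p_0) / n)."""
    return (p_hat - p_0) / math.sqrt(p_0 * (1 - p_0) / n)

def p_value(z, tail):
    """p-value under the standard normal for tail in {'two', 'right', 'left'}."""
    cdf = NormalDist().cdf
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))   # area right of |z| plus area left of -|z|
    if tail == "right":
        return 1 - cdf(z)              # area to the right of z
    if tail == "left":
        return cdf(z)                  # area to the left of z
    raise ValueError("tail must be 'two', 'right', or 'left'")

# (p_hat, p_0, n) as given in Examples 6, 7, and 8.
examples = {6: (0.35, 0.40, 100), 7: (290 / 400, 0.70, 400), 8: (0.18, 0.20, 900)}

for number, (p_hat, p_0, n) in examples.items():
    z = z_statistic(p_hat, p_0, n)
    print(f"Example {number}: z = {z:.3f}")
    for tail in ("two", "right", "left"):
        print(f"    {tail}-tail p-value: {p_value(z, tail):.4f}")
```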
READING AND EXERCISES
• Sections 11.1-11.2:
– Reading: pp. 369-385
– Exercises: 11.2, 11.4
• Section 11.4:
– Reading: pp. 398-405 (selecting the sample size to estimate a
population proportion is not included)
– Exercises: 22, 24, 26