STT 315, Summer 2006
Lecture 6
Materials Covered: Chapter 7
Suggested Exercises: 7.1---7.5, 7.11, 7.16, 7.21, 7.23, 7.39, 7.59, 7.60.
1. Hypothesis testing
Confidence intervals are used for estimation. Hypothesis testing is used for making decisions. We'll
learn about hypothesis tests via an example related to ESP (Extrasensory Perception).
2. Basic concepts (Section 7-2)
Example 7.1 (Rhine's ESP experiments): In the 1930s J.B. Rhine and others conducted experiments to
test whether a person had ESP.
A: The procedure
--- A deck of cards with 5 designs (square, circle, star, plus sign, wavy lines) was used.
--- The cards were shuffled thoroughly by the experimenter.
--- A card was turned over each minute, and the subject had to write down the design without
seeing the card.
B: The hypotheses, informally
We are to decide between the two competing claims:
--- The subject does not have ESP
--- The subject has ESP
We want to rewrite these in the context of a probability model. Let p stand for the probability
that the subject correctly identifies a card.
If the subject does not have ESP, we'd expect p = 0.2.
If the subject does have ESP, we'd expect p > 0.2.
C. The hypotheses, formally
We are to decide between p = 0.2 and p > 0.2 based on the data.
We'll call the claim p = 0.2 the null hypothesis, denoted H0 .
We'll call the claim p > 0.2 the alternative hypothesis, denoted H1 .
More concisely, the hypotheses are
H0 : p = 0.2
vs
H1 : p > 0.2
In this problem we are not interested in cases where p < 0.2, since this wouldn't provide evidence for
ESP. But it is sometimes convenient to include these cases in the null hypothesis, so that we write
H0 : p ≤ 0.2.
The way we'll test hypotheses, only the parameter value in H0 that is closest to the alternative
hypothesis H1 influences the test, so we'll get the same results whichever way we state the null
hypothesis. So we'll write the null hypothesis in the form that makes most sense in the context of the
problem.
Note that the burden is on the experimenter to disprove H0. We'll always try to set up hypotheses this
way, where we'll stick with H0 unless there's strong evidence against it.
It is useful to think of these in legal terms:
H0 : the defendant is innocent vs.
H1 : the defendant is guilty
In our legal system, a defendant is presumed innocent until proven guilty.
D. Possible errors
In making a decision, we risk two possible errors.
Deciding against H0 when it is in fact true is called a “Type I error”.
In the legal analogy, a Type I error means convicting an innocent person.
Deciding to stick with H0 when H1 is in fact true is called a “Type II error”.
In the legal analogy, a Type II error means acquitting a guilty person.
E. The test statistic
We need to use the data to make a decision. As usual, let p̂ stand for the sample proportion of
correct answers. Intuitively, if p̂ is significantly greater than 0.2, we'll decide in favor of H1.
Define

$$Z = \frac{\hat{p} - 0.2}{\sqrt{0.2(1 - 0.2)/n}},$$

where n is the number of trials (cards). Saying that p̂ is significantly greater than 0.2 is the same as
saying that Z is significantly greater than 0. (As long as we define “significantly” properly in both
cases.) It's more convenient to work with Z, because we know that if H0 is true, then Z
(approximately, for large n) has a standard normal distribution. We call Z the test statistic.
F. The p value
Important note: Don't confuse the p value with the parameter p.
How do we decide whether Z is significantly larger than 0?
We'll assume H0 is true, and see how likely it is that we'd get a value of Z as large as or larger
than the one we got in the experiment. The answer is the p-value.
G. Computing the p-value
A specific experiment of the type described was performed in 1938.
A large number of students were used as subjects. There were a total of 60000 cards used. The
subjects got 12489 of the 60000 correct, which is a proportion of 0.20815 correct. The observed
proportion p̂ = 0.20815 is bigger than 0.2, but is it large enough to choose H1? To answer this, we
compute the observed value of the Z statistic:

$$z_0 = \frac{0.20815 - 0.20}{\sqrt{0.20 \times 0.80 / 60000}} \approx 4.99$$

The p-value is the probability that a standard normal random variable is greater than 4.99, which is a
very small number (about 0.0000003 from the z-table).
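
The arithmetic above can be checked with a short Python sketch. It uses only the numbers quoted in this section; the variable names are illustrative, and the standard library's NormalDist stands in for the z-table.

```python
from math import sqrt
from statistics import NormalDist

# Data quoted in the lecture for the 1938 experiment.
n = 60000          # total number of cards
correct = 12489    # number of correct identifications
p0 = 0.2           # value of p under H0

p_hat = correct / n                               # sample proportion
z_obs = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)    # observed test statistic

# Right-tail area P(Z >= z_obs); by symmetry this equals P(Z <= -z_obs),
# which is numerically more accurate than 1 - cdf(z_obs) for tiny tails.
p_value = NormalDist().cdf(-z_obs)

print(f"p_hat = {p_hat:.5f}, z = {z_obs:.2f}, p-value = {p_value:.1e}")
```

The printed z should agree with the 4.99 computed by hand, and the p-value should be of the same order as the table value.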
H. Drawing conclusions
Such a small p-value (0.0000003) provides strong evidence against H0, so we would probably
choose the alternative hypothesis H1, that p > 0.2. Based on the data, I would be comfortable
concluding that the true probability of correctly identifying the card is greater than 0.2.
Typically we decide on a cutoff value, denoted by α, before the data are collected. If the calculated
p-value is less than α, we reject H0. If the calculated p-value is not less than α, we don't reject H0.
Smaller cutoffs α provide more protection against Type I errors, but less protection against Type II
errors.
The rejection region consists of those values of the test statistic that will lead to the rejection of the
null hypothesis. The size of the rejection region (the probability, when H0 is true, that the test
statistic falls in it) is called the level of significance (exactly α). It
determines how small the p-value should be before we reject the null hypothesis.
Policy: If a level of significance α is specified,
reject H0 if p-value < α.
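
To make the link between the cutoff and the rejection region concrete, here is a minimal sketch for the right-sided test only. The function names are illustrative assumptions, not from the lecture. It shows that "reject if the p-value is below α" and "reject if the test statistic exceeds the critical value" are the same rule, and that a smaller α pushes the critical value further out.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def critical_value(alpha):
    """z value with area alpha to its right under the standard normal curve."""
    return STD_NORMAL.inv_cdf(1.0 - alpha)


def reject_by_p_value(z_obs, alpha):
    """Right-sided test: reject H0 when the p-value P(Z >= z_obs) is below alpha."""
    p_value = STD_NORMAL.cdf(-z_obs)
    return p_value < alpha


def reject_by_region(z_obs, alpha):
    """Right-sided test: reject H0 when z_obs falls in the rejection region."""
    return z_obs > critical_value(alpha)


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: reject H0 when z > {critical_value(alpha):.3f}")
```

For any observed z, the two functions return the same decision; which one to use is a matter of convenience.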
Basic steps in testing hypotheses about proportions:
1) State the hypotheses.
2) Decide an appropriate cutoff value α, if desired.
3) Collect data and compute the test statistic

$$z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}},$$

here p0 is the value of p specified by the null hypothesis.
4) Calculate the p-value.
The way we compute the p-value depends on H1.
(i). For H1: p > p0 (right-sided test), the p-value is the area to the right of z0 under a
standard normal density, i.e. p-value = P(Z ≥ z0);
(ii). For H1: p < p0 (left-sided test), the p-value is the area to the left of z0 under a standard
normal density, i.e. p-value = P(Z ≤ z0);
(iii). For H1: p ≠ p0 (two-sided test), the p-value is twice the area to the right of z0 under a
standard normal density if z0 is positive, i.e. p-value = 2 × P(Z ≥ z0) if z0 > 0. If z0 < 0,
p-value = 2 × P(Z ≤ z0). In both cases, p-value = 2 × P(Z ≥ |z0|).
5) Interpret the results.
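
Steps 3 and 4 can be sketched as one small Python function. This is an illustration under the large-sample normal approximation used in this lecture. The function name and the labels "greater", "less" and "two-sided" are assumptions made for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0, alternative):
    """Return (z0, p_value) for a large-sample test of H0: p = p0.

    alternative is "greater" (H1: p > p0), "less" (H1: p < p0)
    or "two-sided" (H1: p != p0).
    """
    p_hat = successes / n
    z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf
    if alternative == "greater":
        p_value = cdf(-z0)            # area to the right of z0
    elif alternative == "less":
        p_value = cdf(z0)             # area to the left of z0
    elif alternative == "two-sided":
        p_value = 2 * cdf(-abs(z0))   # twice the area beyond |z0|
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z0, p_value


# Usage with the ESP data from Example 7.1 (right-sided test).
z0, p_value = proportion_z_test(12489, 60000, 0.2, "greater")
print(f"z0 = {z0:.2f}, p-value = {p_value:.1e}")
```

Step 5 is then a comparison of the returned p-value with the chosen cutoff α.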
Example 7.2 In 1999, 17% of high school students smoked frequently (20 or more days a month).
An education campaign aimed at reducing teen smoking was instituted. To determine whether it was
effective, a new study interviewed 500 high-school students. Of these, 80 smoked frequently. Note
that p̂ = 80/500 = 0.16, so the sample proportion is less than 0.17. Our job is to decide whether
this is attributable to a real drop in smoking or can be attributed to the fact that we've only looked at a
sample.
Hypotheses: H0: p ≥ 0.17
vs.
H1: p < 0.17
Here p stands for the proportion of frequent smokers among all high-school students. In words, the
null hypothesis specifies that the campaign was not effective, and the alternative specifies that it was.
We'll use α = 0.10 as our cutoff (level of significance).
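
The lecture leaves the computation for this example to the reader. One way to carry it out is sketched below, using the left-sided test with the boundary value p0 = 0.17 and the data stated above. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 500        # students interviewed
smokers = 80   # frequent smokers in the sample
p0 = 0.17      # boundary value of p under H0
alpha = 0.10   # cutoff chosen in the example

p_hat = smokers / n
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Left-sided test: the p-value is the area to the left of z0.
p_value = NormalDist().cdf(z0)
reject_h0 = p_value < alpha

print(f"p_hat = {p_hat:.2f}, z0 = {z0:.3f}, p-value = {p_value:.3f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

With these numbers z0 is roughly -0.6 and the p-value is well above 0.10, so the sample does not provide strong evidence that the campaign reduced frequent smoking.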
Example 7.3 A mathematician (John Kerrich) tossed a coin 10000 times to determine whether it was
fair! Hypotheses: H0: p = 0.5 vs. H1: p ≠ 0.5. Here p stands for the true probability of HEADS for
the coin. In words, the null hypothesis specifies that the coin is fair, while the alternative hypothesis
specifies that the coin is not fair. We'll use a cutoff of α = 0.05. What’s your decision?
3. Testing hypotheses about means
Testing hypotheses about means is similar to testing hypotheses about proportions.
1) State the hypotheses (in terms of μ)
2) Decide an appropriate cutoff value α, if desired
3) Collect data and compute the test statistic
A. If the population standard deviation σ is known.

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}.$$

B. If the population standard deviation is unknown.

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}.$$

Here μ0 is the value of μ specified by the null hypothesis.
4) Calculate the p-value.
A. If the population standard deviation is known, using the z table.
------- Left-tailed test: H1 : μ < μ0, then p-value = P(Z ≤ z)
------- Right-tailed test: H1 : μ > μ0, then p-value = P(Z ≥ z)
------- Two-tailed test: H1 : μ ≠ μ0, then p-value = 2 × P(Z ≥ |z|)
B. If the population standard deviation is unknown, using the t table with n − 1 degrees of freedom.
------- Left-tailed test: H1 : μ < μ0, then p-value = P(T ≤ t)
------- Right-tailed test: H1 : μ > μ0, then p-value = P(T ≥ t)
------- Two-tailed test: H1 : μ ≠ μ0, then p-value = 2 × P(T ≥ |t|)
5) Interpret the results.
(If a level of significance α is specified, reject H0 if p-value < α.)
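
The steps for means can be sketched in Python as follows. The standard library has no t distribution, so this sketch replaces the t table with a numerical integral of the t density (Simpson's rule), which is a substitute chosen for illustration and not the lecture's method. The names t_cdf and mean_test, and the numbers in the usage lines, are hypothetical.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t with df degrees of freedom.

    Integrates the t density from 0 to |t| with Simpson's rule
    (steps must be even) and uses the symmetry of the density about 0.
    """
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # area under the density between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area


def mean_test(xbar, sd, n, mu0, alternative, sigma_known):
    """Return (test statistic, p-value) for a test about a mean.

    sd is sigma when sigma_known is True, otherwise the sample standard
    deviation s. alternative is "less", "greater" or "two-sided".
    """
    stat = (xbar - mu0) / (sd / sqrt(n))
    if sigma_known:
        cdf = NormalDist().cdf               # z table
    else:
        cdf = lambda x: t_cdf(x, n - 1)      # t table, n - 1 degrees of freedom
    if alternative == "less":
        p_value = cdf(stat)
    elif alternative == "greater":
        p_value = cdf(-stat)                 # symmetry: right tail = left tail of -stat
    elif alternative == "two-sided":
        p_value = 2 * cdf(-abs(stat))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return stat, p_value


# Hypothetical numbers, not from the lecture: xbar = 52, sd = 5, n = 25, mu0 = 50.
t_stat, t_p = mean_test(52.0, 5.0, 25, 50.0, "greater", sigma_known=False)
z_stat, z_p = mean_test(52.0, 5.0, 25, 50.0, "greater", sigma_known=True)
print(f"t = {t_stat:.2f}, p-value (t, df = 24) = {t_p:.4f}")
print(f"z = {z_stat:.2f}, p-value (z)          = {z_p:.4f}")
```

The same statistic gives a larger p-value under the t distribution than under the normal, because the t density has heavier tails when the standard deviation has to be estimated.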
Example 7.4 Question: How accurate are radon detectors?
Study: Twelve radon detectors were placed in a chamber which exposed them to 105 picocuries/liter
of radon. The mean of the 12 readings was 101.13 and the standard deviation of the 12 readings was
9.40. We want to test H0: μ = 105 versus H1: μ ≠ 105. Choose α = 0.10.
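
A sketch of the calculation for this example, assuming the alternative is two-sided (μ ≠ 105) and that the readings are roughly normal. Instead of an exact p-value it uses the standard t-table critical value 1.796 (11 degrees of freedom, upper-tail area 0.05). For a two-sided test at α = 0.10, the p-value is below α exactly when |t| exceeds this value.

```python
from math import sqrt

n = 12          # number of detectors
xbar = 101.13   # mean of the readings
s = 9.40        # standard deviation of the readings
mu0 = 105.0     # true radon level in the chamber (value under H0)

t_obs = (xbar - mu0) / (s / sqrt(n))

# t table: df = n - 1 = 11, upper-tail area alpha / 2 = 0.05.
T_CRIT = 1.796
reject_h0 = abs(t_obs) > T_CRIT

print(f"t = {t_obs:.3f}, critical value = {T_CRIT}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Here |t| is about 1.43, which is below 1.796, so at the 0.10 level these data do not show that the detectors are inaccurate on average.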
Example 7.5 To justify raising its rates, an insurance company claims that the mean medical expense
for all middle-class families is at least $700 per year. A survey of 100 randomly selected middle-class
families found that the mean medical expense for the year was $670 and the standard deviation was
$140. Assuming that the tails of the distribution of medical expenses are not unusually long, is there any
evidence that the insurance company is misinformed?
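
One possible reading of this example, stated here as an assumption, is H0: μ ≥ 700 versus H1: μ < 700, since the company claims the mean is at least $700. The sketch below computes the t statistic and approximates the left-tail p-value with the standard normal distribution, which is close to the t distribution with 99 degrees of freedom. A t table would give a slightly larger p-value.

```python
from math import sqrt
from statistics import NormalDist

n = 100        # families surveyed
xbar = 670.0   # sample mean expense in dollars
s = 140.0      # sample standard deviation in dollars
mu0 = 700.0    # boundary value of the company's claim

t_obs = (xbar - mu0) / (s / sqrt(n))

# Left-sided test; normal approximation to the t distribution (df = 99).
p_value_approx = NormalDist().cdf(t_obs)

print(f"t = {t_obs:.3f}, approximate p-value = {p_value_approx:.4f}")
```

The statistic is about -2.14 and the approximate p-value is under 0.02, so under this reading the survey gives fairly strong evidence that the mean expense is below $700.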