252grass2-052 10/4/05 (Open this document in 'Page Layout' view!)
Solution to Graded Assignment 2
1. Which of the following could be null hypotheses? Which could be an alternate hypothesis? Which could
be neither? Why? If it is H 0 , what is H 1 ? If it is H 1 , what is H 0 ? (i) p  .85, (ii) p  213, (iii) p  .25
(iv) x  0.5, (v)   7, (vi) s  5, (vii)   5, (viii)   5, (ix)   5 , (x) p  0.37 .
Solution: Remember the following:
α) Only parameters of the population, such as $\mu$, $p$, $\sigma^2$ and $\sigma$ (the population mean, proportion,
variance and standard deviation) and the population median, can be in a hypothesis;
$\bar{x}$, $\bar{p}$, $s^2$, $s$ and $x_{.50}$ (the sample mean, proportion, variance, standard deviation
and median) are statistics computed from sample data and cannot be in a hypothesis because
a hypothesis is a statement about a population;
β) The null hypothesis must contain an equality;
γ) p must be between zero and one;
δ) A variance or standard deviation cannot be negative.
(i) p  .85 could be $H_0$ since it contains a parameter and an equality. $H_1$ would be p  .85 .
(ii) p  213 can’t be either $H_0$ or $H_1$ since a proportion can’t be above 1.
(iii) p  .25 could be $H_0$ since it contains a parameter and an equality. $H_1$ would be p  .25 .
(iv) x  0.5 can’t be either $H_0$ or $H_1$ since the sample mean is not a parameter.
(v)   7 could be $H_0$ since it contains a parameter and an equality. $H_1$ would be   7 .
(vi) s  5 can’t be either since the sample standard deviation is not a parameter (and a standard deviation
can’t be below zero).
(vii)   5 could be $H_0$ since it contains a parameter and an equality. $H_1$ would be   5 .
(viii)   5 can’t be either since a standard deviation can’t be below zero.
(ix)   5 could be $H_0$ since it contains a parameter and an equality. $H_1$ would be   5 .
(x) p  0.37 can’t be either $H_0$ or $H_1$ since the sample proportion is not a parameter.
The next 3 problems are based on problems given in Larry J. Stephens, Advanced Statistics Demystified.
NY, McGraw – Hill, 2004.
2. Make sure that I know what formulas you are using.
A doctor asserts that her average patient gets less than 5 hours of exercise a week. A sample of 10 appears
below. Personalize the data as follows. Take the second to last digit of your student number. If that number
is k , subtract 0.5 from xk . If the number is zero, do not change the data. For example Seymour Butz’s
student number is 101321, so he changes x2 to 3.5. Use a 10% significance level. Is the doctor right?
 x1   x2   x3   x4   x5   x6   x7   x8   x9   x10
6.5  4.0  4.0  2.5  4.5  8.5  2.0  5.0  1.5  9.5
a) State your null and alternative hypotheses.
b) Find critical values for the sample mean and test the hypothesis.
c) Find a confidence interval for the population mean and test the hypothesis.
d) Use a test ratio for a test of the mean
e) Find an approximate p-value for the test ratio using the t table and use the p-value to test the
hypothesis.
f) Suppose that the null hypothesis was $H_0$: median $\le 5$. If $p$ is the proportion of numbers above 5,
what is the null hypothesis involving $p$ that you actually use? (Extra credit: find a p-value for the
null hypothesis and explain why you reject or do not reject the null hypothesis at the 10% level.)
252grass2-052 2/28/05
Solution: Make sure that I know what formulas you are using.
A doctor asserts that her average patient gets less than 5 hours of exercise a week. A sample of 10 appears
below. Personalize the data as follows. Take the second to last digit of your student number. If that number
is k , subtract 0.5 from xk . If the number is zero, do not change the data. For example Seymour Butz’s
student number is 101321, so he changes x2 to 3.5. Use a 10% significance level. Is the doctor right?
These are the personalized data. The last column is the original data.

Row    x1    x2    x3    x4    x5    x6    x7    x8    x9    x0
  1   6.0   6.5   6.5   6.5   6.5   6.5   6.5   6.5   6.5   6.5
  2   4.0   3.5   4.0   4.0   4.0   4.0   4.0   4.0   4.0   4.0
  3   4.0   4.0   3.5   4.0   4.0   4.0   4.0   4.0   4.0   4.0
  4   2.5   2.5   2.5   2.0   2.5   2.5   2.5   2.5   2.5   2.5
  5   4.5   4.5   4.5   4.5   4.0   4.5   4.5   4.5   4.5   4.5
  6   8.5   8.5   8.5   8.5   8.5   8.0   8.5   8.5   8.5   8.5
  7   2.0   2.0   2.0   2.0   2.0   2.0   1.5   2.0   2.0   2.0
  8   5.0   5.0   5.0   5.0   5.0   5.0   5.0   4.5   5.0   5.0
  9   1.5   1.5   1.5   1.5   1.5   1.5   1.5   1.5   1.0   1.5
 10   9.5   9.5   9.5   9.5   9.5   9.5   9.5   9.5   9.5   9.5

Minitab gave me the means and standard deviations.

Variable    n      x̄    s_x̄       s     Σx      Σx²
x1         10  4.750  0.834   2.638   47.5   288.25
x2         10  4.750  0.851   2.690   47.5   290.75
x3         10  4.750  0.851   2.690   47.5   290.75
x4         10  4.750  0.860   2.721   47.5   292.25
x5         10  4.750  0.847   2.680   47.5   290.25
x6         10  4.750  0.821   2.595   47.5   286.25
x7         10  4.750  0.864   2.731   47.5   292.75
x8         10  4.750  0.844   2.669   47.5   289.75
x9         10  4.750  0.867   2.741   47.5   293.25
x0         10  4.800  0.844   2.669   48.0   294.50
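Given the nature of the data, it seems that we need to test the mean, so we look at our formula table under
means.

Interval for: Mean ($\sigma$ known)
   Confidence Interval: $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$
   Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
   Test Ratio: $z = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$
   Critical Value: $\bar{x}_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar{x}}$

Interval for: Mean ($\sigma$ unknown)
   Confidence Interval: $\mu = \bar{x} \pm t_{\alpha/2}\,s_{\bar{x}}$, with $DF = n - 1$
   Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
   Test Ratio: $t = \dfrac{\bar{x} - \mu_0}{s_{\bar{x}}}$
   Critical Value: $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}\,s_{\bar{x}}$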
a) State your null and alternative hypotheses. I will do the problem for x1 above.
What the problem says: ‘A doctor asserts that her average patient gets less than 5 hours of exercise a
week. Is the doctor right?’ This definitely implies that "$\mu < 5$" goes somewhere, but it does not
contain an equality, so it must be an alternate hypothesis. So the null hypothesis is $H_0: \mu \ge 5$ and the
alternate hypothesis is $H_1: \mu < 5$. (If you had said that the null hypothesis was '$\mu \le 5$', the alternate would
have been '$\mu > 5$', and neither would really contain ‘less than.’)
What else the problem says: A sample of $n = 10$ patients is taken and $\alpha = .10$. I used x1.
b) Find critical values for the sample mean and test the hypothesis.
Remember $H_0: \mu \ge 5$, $H_1: \mu < 5$, $n = 10$ and $\alpha = .10$. The degrees of freedom are $n - 1 = 9$. We must
use $t$ because we do not know $\sigma$. Since this is a one-sided test, we use $t_{\alpha}^{(n-1)} = t_{.10}^{(9)} = 1.383$. If we use the
Minitab calculations for version 1 of the problem: $n = 10$, $\bar{x} = 4.750$, $s_{\bar{x}} = 0.834$ and $s = 2.638$.
Because the alternate hypothesis is $H_1: \mu < 5$, we want a single critical value below 5. The two-sided
formula, $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$, becomes $\bar{x}_{cv} = \mu_0 - t_{\alpha}^{(n-1)} s_{\bar{x}} = 5 - 1.383(0.834) = 5 - 1.153 = 3.847$.
Make a diagram showing an almost Normal curve with a mean at 5 and a shaded 'reject' zone below 3.847.
Since $\bar{x} = 4.750$ is not below 3.847, we do not reject $H_0$.
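The critical-value calculation in part b) can be checked with a short script. This is only a sketch, not part of the original solution: the variable names are made up for illustration, the data are version 1 (x1) from the table above, and the value 1.383 is typed in from the t table rather than computed.

```python
import math
import statistics

# Version 1 (x1) of the personalized data, copied from the table in the solution.
x1 = [6.0, 4.0, 4.0, 2.5, 4.5, 8.5, 2.0, 5.0, 1.5, 9.5]

mu0 = 5.0        # mean under the null hypothesis
t_crit = 1.383   # one-sided t value for alpha = .10 with 9 DF, read from the t table

n = len(x1)
xbar = statistics.mean(x1)
s = statistics.stdev(x1)      # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)         # standard error of the mean

# Left-tail test: a single critical value below mu0.
x_cv = mu0 - t_crit * se
reject = xbar < x_cv

print(f"xbar = {xbar:.3f}, s = {s:.3f}, standard error = {se:.3f}")
print(f"critical value = {x_cv:.3f}, reject H0: {reject}")
```

Because the script carries the unrounded standard error, its critical value can differ from the hand calculation in the third decimal place; the decision is the same.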
c) Find a confidence interval for the population mean and test the hypothesis.
Because the alternate hypothesis is $H_1: \mu < 5$ we need a '$\le$' confidence interval. $\mu = \bar{x} \pm t_{\alpha/2}\,s_{\bar{x}}$ becomes
$\mu \le \bar{x} + t_{\alpha}\,s_{\bar{x}} = 4.750 + 1.383(0.834)$ or $\mu \le 5.903$. Make a diagram showing an almost Normal curve
with a mean at $\bar{x} = 4.750$, and the confidence interval below 5.903 shaded. Since $\mu_0 = 5$ is below 5.903 and
thus in the confidence interval, we do not reject $H_0$. Better, shade the area above 5 to indicate the null
hypothesis and note that, since the two shaded areas overlap, both the null hypothesis and the confidence
interval can be true at the same time.
d) Use a test ratio for a test of the mean. Remember $H_0: \mu \ge 5$, $H_1: \mu < 5$.
The test ratio is $t_{calc} = \dfrac{\bar{x} - \mu_0}{s_{\bar{x}}} = \dfrac{4.750 - 5}{0.834} = -0.299$. Since this is a one-sided, left-tail test, pick
$-t_{\alpha}^{(n-1)} = -t_{.10}^{(9)} = -1.383$ from the t table. The ‘reject’ zone is the area below -1.383. Make a diagram
showing an almost Normal curve with a mean at 0 and a shaded 'reject' zone below -1.383. Since
$t_{calc} = -0.299$ is not below -1.383, we do not reject $H_0$.
e) Find an approximate p-value for the test ratio using the t table and use the p-value to test the
hypothesis.
On the 9 df line of the t-table, note that 0.299 is between 0.261 and 0.398. Because 0.261 is in the .40
column and 0.398 is in the .35 column, the table is telling us that $P(t > 0.261) = .40$ and $P(t > 0.398) = .35$.
But this implies, since the t distribution is symmetrical, that $P(t < -0.261) = .40$ and $P(t < -0.398) = .35$.
Since -0.299 is between these two values of t, the probability below it must be between the two
probabilities. So the p-value must be between .35 and .40: $.35 < P(t < -0.299) < .40$, which means
$.35 < p\text{-value} < .40$. Since these p-values are above the 10% significance level, do not reject $H_0$.
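The bracketing from the t table can be cross-checked by computing the tail area directly. The sketch below is not the method used in this solution (which relies on the table and on Minitab); it integrates the t density numerically with Simpson's rule, and the helper names `t_pdf` and `t_cdf` are invented for the example.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using symmetry about zero and Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 <= T <= |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

xbar, mu0, se, df = 4.750, 5.0, 0.83417, 9   # values reported for version x1
t_calc = (xbar - mu0) / se
p_value = t_cdf(t_calc, df)   # left-tail test: p-value is the area below t_calc

print(f"t = {t_calc:.3f}, p-value = {p_value:.3f}")
print("reject H0 at 10%" if p_value < 0.10 else "do not reject H0 at 10%")
```

The result should fall inside the .35 to .40 bracket found from the table.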
Minitab Results
————— 10/4/2005 3:19:59 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\2gr2-052.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My
Documents\Minitab\2gr2-052.MTW'
Worksheet was saved on Fri Sep 23 2005
Results for: 2gr2-052.MTW
MTB > Onet C3;
SUBC>   Test 5;
SUBC>   Confidence 90.0;
SUBC>   Alternative -1.

One-Sample T: C3

Test of mu = 5 vs < 5

                                             90%
                                           Upper
Variable   N     Mean    StDev  SE Mean    Bound      T      P
C3        10  4.75000  2.63787  0.83417  5.90368  -0.30  0.386
The computations above were run on Minitab for all ten versions of the problem with the following results.
In the notation of the formula table, N is $n$, Mean is $\bar{x}$, StDev is $s$, SE Mean is $s_{\bar{x}}$, Bound is the upper
limit of the interval $\mu \le \bar{x} + t_{\alpha}\,s_{\bar{x}}$, T is $t_{calc} = \dfrac{\bar{x} - \mu_0}{s_{\bar{x}}}$ and P is the p-value.

Variable    N     Mean    StDev  SE Mean    Bound      T      P
C3  x1     10  4.75000  2.63787  0.83417  5.90368  -0.30  0.386
C4  x2     10  4.75000  2.69000  0.85065  5.92648  -0.29  0.388
C5  x3     10  4.75000  2.69000  0.85065  5.92648  -0.29  0.388
C6  x4     10  4.75000  2.72080  0.86039  5.93995  -0.29  0.389
C7  x5     10  4.75000  2.67966  0.84738  5.92195  -0.30  0.387
C8  x6     10  4.75000  2.59540  0.82074  5.88510  -0.30  0.384
C9  x7     10  4.75000  2.73099  0.86362  5.94441  -0.29  0.389
C10 x8     10  4.75000  2.66927  0.84410  5.91741  -0.30  0.387
C11 x9     10  4.75000  2.74115  0.86683  5.94885  -0.29  0.390
C12 x0     10  4.80000  2.66875  0.84393  5.96718  -0.24  0.409
Note for the few of you who totally blew the ‘reject’ regions:
(i) If you are using a critical value, if $\bar{x}$ takes the same value as $\mu_0$, it must not be in your ‘reject’ zone!
(ii) If you are using a 1-sided confidence interval, $\bar{x}$ had better be inside it!
(iii) If you are using a t or z ratio, zero cannot be in your ‘reject’ zone!
Note that the vast majority of you did not tell me what your rejection zone was!
Learn to make $\mu$ and call it ‘mu.’ It’s not a ‘u’ and you are too young to be unable
to conceive of a Greek letter!
Never use z with s unless degrees of freedom are high! Never use t with $\sigma$ or $p$!
Check the end of this document.
f) Suppose that the null hypothesis was $H_0$: median $\le 5$. If $p$ is the proportion of numbers above 5,
what is the null hypothesis involving $p$ that you actually use? (Extra credit: find a p-value for the
null hypothesis and explain why you reject or do not reject the null hypothesis at the 10% level.)
According to the outline:

Hypotheses about a median        Hypotheses about a proportion
                                 If p is the proportion         If p is the proportion
                                 above the hypothesized value   below the hypothesized value

H0: median = hypothesized value  H0: p = .5                     H0: p = .5
H1: median ≠ hypothesized value  H1: p ≠ .5                     H1: p ≠ .5

H0: median ≥ hypothesized value  H0: p ≥ .5                     H0: p ≤ .5
H1: median < hypothesized value  H1: p < .5                     H1: p > .5

H0: median ≤ hypothesized value  H0: p ≤ .5                     H0: p ≥ .5
H1: median > hypothesized value  H1: p > .5                     H1: p < .5

The last row says that if $p$ is the proportion of numbers above 5, then $H_0$: median $\le 5$, $H_1$: median $> 5$
corresponds to $H_0: p \le .5$, $H_1: p > .5$. If we look at our data and star every number above 5, drop every
number equal to 5 and then use $x^*$ to represent the number of items above 5, $n^*$ to represent our count of
items that are not 5, and $\bar{p}$ to represent the ratio, we get the results below.
Row    x1    x2    x3    x4    x5    x6    x7    x8    x9    x0
  1   6.0*  6.5*  6.5*  6.5*  6.5*  6.5*  6.5*  6.5*  6.5*  6.5*
  2   4.0   3.5   4.0   4.0   4.0   4.0   4.0   4.0   4.0   4.0
  3   4.0   4.0   3.5   4.0   4.0   4.0   4.0   4.0   4.0   4.0
  4   2.5   2.5   2.5   2.0   2.5   2.5   2.5   2.5   2.5   2.5
  5   4.5   4.5   4.5   4.5   4.0   4.5   4.5   4.5   4.5   4.5
  6   8.5*  8.5*  8.5*  8.5*  8.5*  8.0*  8.5*  8.5*  8.5*  8.5*
  7   2.0   2.0   2.0   2.0   2.0   2.0   1.5   2.0   2.0   2.0
  8   5.0?  5.0?  5.0?  5.0?  5.0?  5.0?  5.0?  4.5   5.0?  5.0?
  9   1.5   1.5   1.5   1.5   1.5   1.5   1.5   1.5   1.0   1.5
 10   9.5*  9.5*  9.5*  9.5*  9.5*  9.5*  9.5*  9.5*  9.5*  9.5*
 x*     3     3     3     3     3     3     3     3     3     3
 n*     9     9     9     9     9     9     9    10     9     9
 p̄    .33   .33   .33   .33   .33   .33   .33   .30   .33   .33
This is a right sided test and the easiest way to do it is to use the binomial table, so that the p-value for all
but x8 is $P(x^* \ge 3) = 1 - P(x \le 2) = 1 - .08984 = .91016$. For x8 it is $P(x^* \ge 3) = 1 - P(x \le 2) = 1 - .05469
= .94531$. Both of these are well above the significance level, so we cannot reject the null hypothesis. See
end for use of z.
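The binomial table lookups used here can be reproduced exactly, since under the null hypothesis the count of values above 5 is binomial with $p = .5$. The following is a small sketch; the function names are hypothetical and it uses only the Python standard library.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def right_tail_p_value(x_star, n_star, p0=0.5):
    """P(X >= x_star) under the null value p0, the p-value for a right-sided test."""
    return 1 - binom_cdf(x_star - 1, n_star, p0)

# All versions except x8: 3 values above 5 out of 9 values not equal to 5.
p_most = right_tail_p_value(3, 9)
# Version x8: 3 values above 5 out of 10 values not equal to 5.
p_x8 = right_tail_p_value(3, 10)

print(f"n* = 9:  p-value = {p_most:.5f}")
print(f"n* = 10: p-value = {p_x8:.5f}")
```

Both p-values are far above a 10% significance level, matching the conclusion from the table.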




Minitab computation of mean and sample variance for x1
MTB > let c14=c3*c3   #Original
MTB >                 #numbers were in Column 3,
MTB >                 #squares in Column 14.
MTB > print c3 c14

Data Display
Row    x1     C14
     (x1)   (x1²)
  1   6.0   36.00
  2   4.0   16.00
  3   4.0   16.00
  4   2.5    6.25
  5   4.5   20.25
  6   8.5   72.25
  7   2.0    4.00
  8   5.0   25.00
  9   1.5    2.25
 10   9.5   90.25

So $n = 10$, $\sum x = 47.5$ and $\sum x^2 = 288.25$, so that $\bar{x} = \dfrac{\sum x}{n} = \dfrac{47.5}{10} = 4.75$,
$s_x^2 = \dfrac{\sum x^2 - n\bar{x}^2}{n-1} = \dfrac{288.25 - 10(4.75)^2}{9} = \dfrac{62.625}{9} = 6.958333$ and
$s = \sqrt{6.958333} = 2.63787$. This implies that
$s_{\bar{x}} = \sqrt{\dfrac{s_x^2}{n}} = \sqrt{\dfrac{6.958333}{10}} = \sqrt{0.6958333} = 0.8342$.

MTB > sum c3
Sum of x1
Sum of x1 = 47.5
MTB > sum c14
Sum of C14
Sum of C14 = 288.25
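The same computational (shortcut) formula can be followed step by step in a few lines of code. This is an illustrative sketch using the x1 data; the variable names are my own and nothing here comes from Minitab.

```python
import math

# Version x1 of the data (Column 3 in the Minitab worksheet).
x1 = [6.0, 4.0, 4.0, 2.5, 4.5, 8.5, 2.0, 5.0, 1.5, 9.5]

n = len(x1)
sum_x = sum(x1)                      # corresponds to "sum c3"
sum_x2 = sum(v * v for v in x1)      # corresponds to "sum c14"

xbar = sum_x / n
var = (sum_x2 - n * xbar**2) / (n - 1)   # shortcut formula for the sample variance
s = math.sqrt(var)                       # sample standard deviation
se = math.sqrt(var / n)                  # standard error of the mean

print(f"sum x = {sum_x}, sum x^2 = {sum_x2}")
print(f"xbar = {xbar}, variance = {var:.6f}, s = {s:.5f}, standard error = {se:.4f}")
```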
3. Assume, in problem 2, that the population standard deviation was 1 hour. State the test ratio and find its
p-value. Using this p-value, would we reject the null hypothesis a) If the significance level is 1%, b) If the
significance level is 5% and c) If the significance level is 10%?
Solution:
Note!!! The only thing that has changed from problem 2 is that the sample standard
deviation has been replaced by the population standard deviation of 1, not the sample mean, not the
hypotheses. The only reason for this section is that some find e) above difficult, and this is
easier since you can use the Normal table.
We still have the null hypothesis $H_0: \mu \ge 5$ and the alternate hypothesis $H_1: \mu < 5$.
What else the problem says: Assume, in problem 2, that 1 was a population standard deviation.
Everything else is unchanged, except that there is no significance level at first. So $\mu_0 = 5$, $n = 10$ and
$\bar{x} = 4.75$. But $\sigma_x = 1$ and $\sigma_{\bar{x}} = \sqrt{\dfrac{\sigma_x^2}{n}} = \sqrt{\dfrac{1^2}{10}} = \sqrt{0.100} = 0.3162$. We still have a left-tailed test. Table 3
says that the test ratio is $z = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \dfrac{4.75 - 5}{0.3162} = -0.79$. So if we use the Normal table, we find
$p\text{-value} = P(\bar{x} \le 4.75) = P(z \le -0.79) = .5 - .2852 = .2148$. To do the last part of this problem, make a
diagram of the Normal distribution with a mean of zero and shade the area below -0.79.
a) If $\alpha = .01$, since the p-value of .2148 is not below the significance level, we do not reject $H_0$.
b) If $\alpha = .05$, since the p-value of .2148 is not below the significance level, we do not reject $H_0$.
c) If $\alpha = .10$, since the p-value of .2148 is not below the significance level, we do not reject $H_0$.
You did not answer this question if you did the problem 3 different ways and never found the p-value!
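For readers who want to check the Normal-table lookup, the z ratio and its left-tail p-value can be computed directly. This sketch assumes the standard Normal CDF is obtained from `math.erf`; the helper `normal_cdf` is not part of the original solution. Because it does not round z to two decimals first, the p-value may differ from the table-based figure in the fourth decimal place.

```python
import math

def normal_cdf(z):
    """P(Z <= z) for a standard Normal variable, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

xbar, mu0, sigma, n = 4.75, 5.0, 1.0, 10

sigma_xbar = sigma / math.sqrt(n)      # standard error with sigma known
z = (xbar - mu0) / sigma_xbar
p_value = normal_cdf(z)                # left-tail test: area below z

print(f"sigma_xbar = {sigma_xbar:.4f}, z = {z:.2f}, p-value = {p_value:.4f}")
for alpha in (0.01, 0.05, 0.10):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```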
Minitab Results
————— 10/4/2005 4:20:39 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\2gr2-052.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My
Documents\Minitab\2gr2-052.MTW'
Worksheet was saved on Fri Sep 23 2005
Results for: 2gr2-052.MTW
MTB > OneZ C3;
SUBC>   Sigma 1;
SUBC>   Test 5;
SUBC>   Alternative -1.

One-Sample Z: C3

Test of mu = 5 vs < 5
The assumed standard deviation = 1

                                             95%
                                           Upper
Variable   N     Mean    StDev  SE Mean    Bound      Z      P
C3        10  4.75000  2.63787  0.31623  5.27015  -0.79  0.215

MTB > OneZ C3 C4 C5 C6 C7 C8 C9 C10 C11 C12;
SUBC>   Sigma 1;
SUBC>   Test 5;
SUBC>   Alternative -1.

One-Sample Z: C3, C4, C5, C6, C7, C8, C9, C10, C11, C12

Test of mu = 5 vs < 5
The assumed standard deviation = 1

In the notation of the formula table, N is $n$, Mean is $\bar{x}$, StDev is $s$, SE Mean is $\sigma_{\bar{x}}$, Bound is the upper
limit of the one-sided interval for $\mu$, Z is $z_{calc} = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$ and P is the p-value.

Variable    N     Mean    StDev  SE Mean    Bound      Z      P
C3  x1     10  4.75000  2.63787  0.31623  5.27015  -0.79  0.215
C4  x2     10  4.75000  2.69000  0.31623  5.27015  -0.79  0.215
C5  x3     10  4.75000  2.69000  0.31623  5.27015  -0.79  0.215
C6  x4     10  4.75000  2.72080  0.31623  5.27015  -0.79  0.215
C7  x5     10  4.75000  2.67966  0.31623  5.27015  -0.79  0.215
C8  x6     10  4.75000  2.59540  0.31623  5.27015  -0.79  0.215
C9  x7     10  4.75000  2.73099  0.31623  5.27015  -0.79  0.215
C10 x8     10  4.75000  2.66927  0.31623  5.27015  -0.79  0.215
C11 x9     10  4.75000  2.74115  0.31623  5.27015  -0.79  0.215
C12 x0     10  4.80000  2.66875  0.31623  5.32015  -0.63  0.264
4. According to USA today only 5% of workers make more than 1000 copies a week. To test that statement,
you look at a sample of 150 workers and find that 12 made more than 1000 copies last week. Is this result
significantly different from the result reported by USA today?
a) State your null and alternative hypotheses.
b) Find a test ratio for a test of the proportion
c) Find a p-value for the test ratio and use the p-value to test the hypothesis at the 1%
significance level.
d) Restate the problem as the one-sided hypothesis that the proportion is less than 5%, find a p-value for the null hypothesis and use it to test your hypothesis at a 1% significance level.
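Solution: Since 5% is a proportion, we look up the test for a proportion in the formula table.

Interval for: Proportion
   Confidence Interval: $p = \bar{p} \pm z_{\alpha/2}\,s_p$, where $s_p = \sqrt{\dfrac{\bar{p}\bar{q}}{n}}$ and $\bar{q} = 1 - \bar{p}$
   Hypotheses: $H_0: p = p_0$, $H_1: p \ne p_0$
   Test Ratio: $z = \dfrac{\bar{p} - p_0}{\sigma_p}$, where $\sigma_p = \sqrt{\dfrac{p_0 q_0}{n}}$ and $q_0 = 1 - p_0$
   Critical Value: $p_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_p$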
a) State your null and alternative hypotheses.
What the problem says: ‘Only 5% of workers make more than 1000 copies a week.’ 5% is a proportion.
There is nothing here about the mean. Though the word ‘only’ is used, there is no implication that the
proportion is above or below 5%. We thus have $p = .05$. There is an equality in this statement, so it must
be a null hypothesis. So the null hypothesis is $H_0: p = .05$ and the alternate hypothesis is $H_1: p \ne .05$.
What else the problem says: A sample of 150 workers found that 12 made more than 1000 copies. So
$p_0 = .05$, $n = 150$ and $x = 12$. This implies that $q_0 = 1 - p_0 = 1 - .05 = .95$,
$\bar{p} = \dfrac{x}{n} = \dfrac{12}{150} = .08$, and $\bar{q} = 1 - \bar{p} = 1 - .08 = .92$.
b) Find a test ratio for a test of the proportion
The table says $z = \dfrac{\bar{p} - p_0}{\sigma_p}$ and $\sigma_p = \sqrt{\dfrac{p_0 q_0}{n}} = \sqrt{\dfrac{.05(.95)}{150}} = \sqrt{0.00031667} = 0.017795$. This means that
$z = \dfrac{\bar{p} - p_0}{\sigma_p} = \dfrac{.08 - .05}{.017795} = 1.686$. (It makes no sense to use the test ratio, $z = \dfrac{\bar{p} - p_0}{\sigma_p}$, if you have already said the null
hypothesis is about the mean.) This is a 2-sided 1% test, so that we use $z_{.005} = 2.576$. We do not reject
the null hypothesis because 1.686 lies between -2.576 and +2.576.
c) Find a p-value for the test ratio and use the p-value to test the hypothesis at the 1%
significance level.
This is a 2 sided test and $p\text{-value} = 2P(\bar{p} \ge .08) = 2P(z \ge 1.69) = 2(.5 - .4545) = .0910$. This is above
$\alpha = .01$ so we cannot reject the null hypothesis. Note: For a 2-sided hypothesis, if t or z is negative,
the p-value is twice the area below it. If t or z is positive, the p-value is twice the area above it.
d) Restate the problem as the one-sided hypothesis that the proportion is less than 5%, find a p-value for the null hypothesis and use it to test your hypothesis at a 1% significance level.
The hypothesis $p < .05$ does not contain an equality, so it must be an alternate hypothesis. So the null
hypothesis is $H_0: p \ge .05$ and the alternate hypothesis is $H_1: p < .05$. This is tricky, because it is a left-sided
hypothesis, which means that $p\text{-value} = P(\bar{p} \le .08) = P(z \le 1.69) = .5 + .4545 = .9545$. This is well
above any significance level that we might use, so we cannot reject the null hypothesis.
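The test ratio for the proportion and both p-values (two-sided for part c, left-sided for part d) can be reproduced as follows. This is a sketch under the Normal approximation used in the solution; `normal_cdf` is a helper written for this example, and since z is not rounded to 1.69 first, the p-values may differ slightly from the table-based ones.

```python
import math

def normal_cdf(z):
    """P(Z <= z) for a standard Normal variable, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p0, n, x = 0.05, 150, 12
alpha = 0.01

p_bar = x / n
sigma_p = math.sqrt(p0 * (1 - p0) / n)   # standard error under the null hypothesis
z = (p_bar - p0) / sigma_p

p_two_sided = 2 * (1 - normal_cdf(abs(z)))   # part c): H1 is p != .05
p_left = normal_cdf(z)                       # part d): H1 is p < .05

print(f"p_bar = {p_bar:.2f}, sigma_p = {sigma_p:.6f}, z = {z:.3f}")
print(f"two-sided p-value = {p_two_sided:.4f}, reject at 1%: {p_two_sided < alpha}")
print(f"left-sided p-value = {p_left:.4f}, reject at 1%: {p_left < alpha}")
```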
5. (Extra credit)
a) Finish 4 by testing the original hypothesis using a critical value for the proportion and an
appropriate confidence interval.
b) Were we reasonable assuming that $\sigma = 1$ in question 3? Use the sample standard deviation that
you found in question 3 to test this using a test ratio and a confidence interval.
c) Use Minitab to do this problem. In column 1 put 12 yeses and 138 noes in any order so that you
fill 150 lines. (You only need a ‘y’ for yes and an ‘n’ for no.) You may give column 1 a name and
use the name in single quotes in place of c1. Remember that all multiline commands must end with
a period. ‘Pone’ is the command for a 1-sample test of a proportion.
Do the problem four different ways as follows.
Pone c1;
Conf 99.0;
Test 0.5;
Usez.
Pone c1;
Conf 99.0;
Test 0.5.
Pone 150 12;
Conf 99.0;
Test 0.5;
Usez.
Pone 150 12;
Conf 99.0;
Test 0.5.
Compare the results to the results you got by working by hand. Which ones are closest? What is
the difference between what the four instruction sets do? Can you use the Stat > Basic Statistics > 1
Proportion pull-down menu with options to get the same results?
a) Finish 4 by testing the original hypothesis using a critical value for the proportion and an
appropriate confidence interval. Will you be able to do it on the exam?
Solution: Note: All of you should be able to do this problem using a critical value for $\bar{p}$ or a confidence
interval. $\alpha = .01$. The null hypothesis is $H_0: p = .05$ and the alternate hypothesis is $H_1: p \ne .05$.
$p_0 = .05$, $n = 150$ and $x = 12$. This implies that $q_0 = 1 - p_0 = 1 - .05 = .95$, $\bar{p} = \dfrac{x}{n} = \dfrac{12}{150} = .08$, and
$\bar{q} = 1 - \bar{p} = 1 - .08 = .92$.
Critical Value: $\sigma_p = \sqrt{\dfrac{p_0 q_0}{n}} = \sqrt{\dfrac{.05(.95)}{150}} = \sqrt{0.00031667} = 0.017795$. The table says $p_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_p$, and
$z_{\alpha/2} = z_{.005} = 2.576$, so we use $p_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_p = .05 \pm 2.576(.017795) = .05 \pm .046$. Make a diagram
showing a Normal curve centered at $p_0 = .05$, with a shaded 'reject' zone below .05 - .046 = .004
and above .05 + .046 = .096. Since $\bar{p} = .08$ is not in the ‘reject’ zone, we cannot reject $H_0$.
Confidence Interval: The table says $p = \bar{p} \pm z_{\alpha/2}\,s_p$, where, because $\bar{q} = 1 - \bar{p} = .92$ and $n = 150$,
$s_p = \sqrt{\dfrac{\bar{p}\bar{q}}{n}} = \sqrt{\dfrac{.08(.92)}{150}} = \sqrt{.00049067} = .02215$. Since the alternate hypothesis $H_1: p \ne .05$ implies a 2-sided
test, we use a 2-sided confidence interval. $p = .08 \pm 2.576(.02215) = .08 \pm .057$ or .023 to .137. Make
a diagram showing a Normal curve centered at .08 with the confidence interval between .023 and .137
shaded. Since $p_0 = .05$ is within the confidence interval, we cannot reject $H_0$.
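As a check on the arithmetic, the two-sided critical values and the confidence interval can be recomputed from the formulas $p_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_p$ and $p = \bar{p} \pm z_{\alpha/2}\,s_p$. The sketch below takes $z_{.005} = 2.576$ from the table as a typed-in constant; the variable names are illustrative only.

```python
import math

p0, n, x = 0.05, 150, 12
z_crit = 2.576                 # z for alpha/2 = .005, read from the Normal table

p_bar = x / n
sigma_p = math.sqrt(p0 * (1 - p0) / n)        # uses the null value p0
s_p = math.sqrt(p_bar * (1 - p_bar) / n)      # uses the sample proportion

# Critical values for the sample proportion, centered at p0.
cv_low, cv_high = p0 - z_crit * sigma_p, p0 + z_crit * sigma_p
reject_by_cv = p_bar < cv_low or p_bar > cv_high

# Two-sided confidence interval, centered at p_bar.
ci_low, ci_high = p_bar - z_crit * s_p, p_bar + z_crit * s_p
reject_by_ci = not (ci_low <= p0 <= ci_high)

print(f"reject zone: below {cv_low:.4f} or above {cv_high:.4f}; reject H0: {reject_by_cv}")
print(f"99% interval: {ci_low:.4f} to {ci_high:.4f}; reject H0: {reject_by_ci}")
```

Both approaches lead to the same decision here, as they should for a two-sided test at the same significance level.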
b) Were we reasonable assuming that $\sigma = 1$ in question 3? Use the sample standard deviation that
you found in question 3 to test this using a test ratio and a confidence interval.
Go back to the formula table. This is a 2-sided test of $H_0: \sigma = 1$ and $H_1: \sigma \ne 1$ or $H_0: \sigma^2 = 1$ and
$H_1: \sigma^2 \ne 1$. Not trying this was cheating yourself!

Interval for: Variance, Small Sample
   Confidence Interval: $\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
   Hypotheses: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$
   Test Ratio: $\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2}$
   Critical Value: $s^2_{cv} = \dfrac{\chi^2_{1-\alpha/2}\,\sigma_0^2}{n-1}$ and $s^2_{cv} = \dfrac{\chi^2_{\alpha/2}\,\sigma_0^2}{n-1}$

Interval for: Variance, Large Sample
   Confidence Interval: $\sigma = \dfrac{s\sqrt{2DF}}{\pm z_{\alpha/2} + \sqrt{2DF}}$
   Hypotheses: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$
   Test Ratio: $z = \sqrt{2\chi^2} - \sqrt{2DF - 1}$
   Critical Value: $s_{cv} = \dfrac{\sigma_0\sqrt{2DF}}{\pm z_{\alpha/2} + \sqrt{2DF}}$

Test Ratio: Recall that $n = 10$, and that, for x1, $\bar{x} = 4.75$ and $s_x^2 = \dfrac{\sum x^2 - n\bar{x}^2}{n-1} = 6.958333$. We should
probably be assuming that $\alpha = .01$. Degrees of freedom are $DF = n - 1 = 10 - 1 = 9$.
$\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2} = \dfrac{9(6.958333)}{1^2} = 62.625$. To be in the central 99% of values, this ratio should fall between
$\chi^{2(9)}_{.995} = 1.7349$ and $\chi^{2(9)}_{.005} = 23.5893$. Since it does not fall between these values, reject $H_0$.
Confidence Interval: Use the formula $\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$, with the two values of chi-squared used with the
test ratio. This means $\dfrac{9(6.958333)}{23.5893} \le \sigma^2 \le \dfrac{9(6.958333)}{1.7349}$. This gives us $2.655 \le \sigma^2 \le 36.097$ or
$1.629 \le \sigma \le 6.008$, a comparatively gigantic confidence interval which still does not include $H_0: \sigma = 1$,
so assuming that the population standard deviation was 1 was not a good idea.
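The chi-squared test ratio and the interval for the variance are easy to recompute once the two table values are known. In this sketch the chi-squared values 23.5893 and 1.7349 (9 degrees of freedom) are copied from the solution rather than computed, and the variable names are my own.

```python
import math

n = 10
s2 = 6.958333          # sample variance for version x1
sigma0_sq = 1.0        # variance under the null hypothesis (sigma = 1)

# Chi-squared table values for 9 degrees of freedom, as quoted in the solution.
chi2_upper = 23.5893
chi2_lower = 1.7349

df = n - 1
chi2_calc = df * s2 / sigma0_sq
reject = not (chi2_lower <= chi2_calc <= chi2_upper)

# Confidence interval for the variance, then for the standard deviation.
var_low, var_high = df * s2 / chi2_upper, df * s2 / chi2_lower
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"chi-squared = {chi2_calc:.3f}, reject H0: {reject}")
print(f"variance interval: {var_low:.3f} to {var_high:.3f}")
print(f"standard deviation interval: {sd_low:.3f} to {sd_high:.3f}")
```

The hypothesized value $\sigma = 1$ lies below the lower end of the interval, which agrees with the rejection from the test ratio.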
c) Use Minitab to do this problem. In column 1 put 12 yeses and 138 noes in any order so that you
fill 150 lines. (You only need a ‘y’ for yes and an ‘n’ for no.) You may give column 1 a name and
use the name in single quotes in place of c1. Remember that all multiline commands must end with
a period. ‘Pone’ is the command for a 1-sample test of a proportion.
Do the problem four different ways as follows.
Pone c1;
Conf 99.0;
Test 0.05;
Usez.
Pone c1;
Conf 99.0;
Test 0.05.
Pone 150 12;
Conf 99.0;
Test 0.05;
Usez.
Pone 150 12;
Conf 99.0;
Test 0.05.
Compare the results to the results you got by working by hand. Which ones are closest? What is
the difference between what the four instruction sets do? Can you use the Stat > Basic Statistics > 1
Proportion pull-down menu with options to get the same results?
There was an error in this part of the assignment – Instead of ‘test 0.05,’ the third instruction in all
of the commands originally read ‘test 0.5!’ If you followed my mistake, the results would be as below.
Results for: 2gr2-052a.MTW
MTB > pone c1;
SUBC> conf 99;
SUBC> test 0.5;
SUBC> usez.

Test and CI for One Proportion: C1
Test of p = 0.5 vs p not = 0.5
Event = y
Variable   X    N  Sample p                99% CI  Z-Value  P-Value
C1        12  150  0.080000  (0.022943, 0.137057)   -10.29    0.000

MTB > pone c1;
SUBC> conf 99;
SUBC> test 0.5.

Test and CI for One Proportion: C1
Test of p = 0.5 vs p not = 0.5
Event = y
                                                     Exact
Variable   X    N  Sample p                99% CI  P-Value
C1        12  150  0.080000  (0.033643, 0.154500)    0.000

MTB > pone 150 12;
SUBC> conf 99;
SUBC> test 0.5;
SUBC> usez.

Test and CI for One Proportion
Test of p = 0.5 vs p not = 0.5
Sample     X    N  Sample p                99% CI  Z-Value  P-Value
1         12  150  0.080000  (0.022943, 0.137057)   -10.29    0.000

MTB > pone 150 12;
SUBC> conf 99;
SUBC> test 0.5.

Test and CI for One Proportion
Test of p = 0.5 vs p not = 0.5
                                                     Exact
Sample     X    N  Sample p                99% CI  P-Value
1         12  150  0.080000  (0.033643, 0.154500)    0.000
With the corrected instructions, you should have gotten the following.
————— 10/4/2005 6:15:27 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\2gr2-052a.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My
Documents\Minitab\2gr2-052a.MTW'
Worksheet was saved on Tue Oct 04 2005
Results for: 2gr2-052a.MTW
MTB > print c1

Data Display
C1
n y n n n n n n y n n n n n n
n n y n n n n n n y n n n n n
n n n y n n n n n n y n n n n
n n n n n n n n n n n y n n n
n n n n n n n n n n n n y n n
n n n n n n n n n n n n n y n
n n n n n n n n n n n n n n y
n n n n n n n n n n n n n n n
y n n n n n n n n n n n n n n
n y n n n n n n n n n n n n n

MTB > pone c1;          #Test 1
SUBC> conf 99;
SUBC> test 0.05;
SUBC> usez.

Test and CI for One Proportion: C1
Test of p = 0.05 vs p not = 0.05
Event = y
Variable   X    N  Sample p                99% CI  Z-Value  P-Value
C1        12  150  0.080000  (0.022943, 0.137057)     1.69    0.092

MTB > pone c1;          #Test 2
SUBC> conf 99;
SUBC> test 0.05.

Test and CI for One Proportion: C1
Test of p = 0.05 vs p not = 0.05
Event = y
                                                     Exact
Variable   X    N  Sample p                99% CI  P-Value
C1        12  150  0.080000  (0.033643, 0.154500)    0.129

MTB > pone 150 12;      #Test 3
SUBC> conf 99;
SUBC> test 0.05;
SUBC> usez.

Test and CI for One Proportion
Test of p = 0.05 vs p not = 0.05
Sample     X    N  Sample p                99% CI  Z-Value  P-Value
1         12  150  0.080000  (0.022943, 0.137057)     1.69    0.092

MTB > pone 150 12;      #Test 4
SUBC> conf 99;
SUBC> test 0.05.

Test and CI for One Proportion
Test of p = 0.05 vs p not = 0.05
                                                     Exact
Sample     X    N  Sample p                99% CI  P-Value
1         12  150  0.080000  (0.033643, 0.154500)    0.129
Comment: Obviously tests 1 and 3 are identical, as are tests 2 and 4. In tests 1 and 2 Minitab counted the
yeses, while in tests 3 and 4, the values of n and x were supplied by the user. Tests 1 and 3 use the Normal
distribution and are almost identical to the hand-calculated results. Tests 2 and 4 use the binomial
distribution and thus are more accurate. I was amazed at the difference in results. Note that though we
would reject the null hypothesis at the 10% significance level using the Normal distribution (p-value .092),
we would not reject it using the more accurate binomial distribution (p-value .129).
In the outline it says the following.
b. Continuity Correction.
The continuity correction acts to expand the 'accept' interval by $\frac{1}{2}$ (in units of $x$) in each direction. It should be
used if $npq < 9$.
i. Test Ratio: $z = \dfrac{\bar{p} \mp \frac{.5}{n} - p_0}{\sigma_p}$, $\sigma_p = \sqrt{\dfrac{p_0 q_0}{n}}$. Use $-$ if $\bar{p} > p_0$ and $+$ if $\bar{p} < p_0$.
This is the same as testing $z$ against $\pm\left(z_{\alpha/2} + \dfrac{.5}{n\sigma_p}\right)$.
ii. Critical Value: $p_{cv} = p_0 \pm \left(\dfrac{1}{2n} + z_{\alpha/2}\,\sigma_p\right)$
iii. Confidence Interval: $p = \bar{p} \pm \left(\dfrac{1}{2n} + z_{\alpha/2}\,s_p\right)$

Remember that the null hypothesis was $H_0: p = .05$ and the alternate hypothesis was $H_1: p \ne .05$.
$p_0 = .05$, $n = 150$ and $x = 12$. This implies that $q_0 = 1 - p_0 = 1 - .05 = .95$,
$\sigma_p = \sqrt{\dfrac{p_0 q_0}{n}} = \sqrt{\dfrac{.05(.95)}{150}} = \sqrt{0.00031667} = 0.017795$, $\bar{p} = \dfrac{x}{n} = \dfrac{12}{150} = .08$, and $\bar{q} = 1 - \bar{p} = 1 - .08 = .92$. If
we use the continuity correction, we are effectively using $x = 11.5$ instead of $x = 12$. So
$z = \dfrac{\bar{p} - \frac{.5}{n} - p_0}{\sigma_p} = \dfrac{.08 - \frac{0.5}{150} - .05}{0.017795} = \dfrac{.0766667 - .05}{0.017795} = 1.50$ and $p\text{-value} = 2P(\bar{p} \ge .08) = 2P(z \ge 1.50)
= 2(.5 - .4332) = .1336$. This is much closer to the ‘exact’ p-value. Note that $npq = 150(.05)(.95) = 7.125$,
which is less than 9. The continuity correction seems to be needed here but it is ignored by Minitab.
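The continuity-corrected test ratio can be compared with the uncorrected one in a few lines. This is a sketch only: `normal_cdf` is a helper defined for the example, and the sign of the correction follows the rule in the outline (subtract when the sample proportion is above $p_0$, add when it is below).

```python
import math

def normal_cdf(z):
    """P(Z <= z) for a standard Normal variable, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p0, n, x = 0.05, 150, 12

p_bar = x / n
sigma_p = math.sqrt(p0 * (1 - p0) / n)
npq = n * p0 * (1 - p0)            # the outline suggests the correction when this is below 9

correction = 0.5 / n
# Move the sample proportion half a count toward p0 (this example has p_bar > p0).
p_adj = p_bar - correction if p_bar > p0 else p_bar + correction

z_plain = (p_bar - p0) / sigma_p
z_cc = (p_adj - p0) / sigma_p

p_plain = 2 * (1 - normal_cdf(abs(z_plain)))
p_cc = 2 * (1 - normal_cdf(abs(z_cc)))

print(f"npq = {npq:.3f}")
print(f"without correction: z = {z_plain:.2f}, p-value = {p_plain:.4f}")
print(f"with correction:    z = {z_cc:.2f}, p-value = {p_cc:.4f}")
```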
Note on 2f
We have $H_0: p \le .5$, $H_1: p > .5$. If this were a large sample (it’s not!) we could use
$z = \dfrac{\bar{p} - p_0}{\sigma_p} = \dfrac{\bar{p} - .5}{\sqrt{\dfrac{.5(.5)}{n}}}$ with
$\bar{p} = .33$, but most of you thought $\bar{p} = .5$. Why?
Greek letters that we use and their names.
Americans pronounce the e’s in beta and theta like long a’s. The British pronounce them like long e’s. Chi
is pronounced ‘kie’ to rhyme with pie. Mu is pronounced like a cat’s mew. Rho is pronounced like fish roe.

Alpha     α
Beta      β
Chi       χ
Delta     δ, Δ
Epsilon   ε
Gamma     γ, Γ
Lambda    λ
Mu        μ
Nu        ν
Pi        π
Theta     θ
Rho       ρ
Sigma     σ, Σ