252solngr1 9/16/05
(Open this document in 'Page Layout' view!)
Graded Assignment 1
Please show your work! Neatness and whether the papers are stapled may affect your grade.
1. A magazine wants to estimate the mean leisure time in hours enjoyed weekly by managers. Data taken
from a sample of 21 managers follows:
15 12 18 23 11 21 16 13  9 19 26
11  7 18 11 15 23 26 10  8 17
Compute the sample standard deviation using the computational formula. Use this sample standard
deviation to compute a 98% confidence interval for the mean. Does the mean differ significantly from 19
hours? Why?
2. How would your results change if the sample of 21 had been taken from a population of 100?
3. Assume that the population standard deviation is 6.00 (and that the sample of 21 is taken from a very
large population). Find $z_{.0005}$ using the Normal table (if you have several values of z that you can use,
pick the average of the extreme ones) and use it to compute a 99.9% confidence interval. Does the mean differ
significantly from 19 hours? Why?
Solution:
1)
index     x    x^2
  1      15    225
  2      12    144
  3      18    324
  4      23    529
  5      11    121
  6      21    441
  7      16    256
  8      13    169
  9       9     81
 10      19    361
 11      26    676
 12      11    121
 13       7     49
 14      18    324
 15      11    121
 16      15    225
 17      23    529
 18      26    676
 19      10    100
 20       8     64
 21      17    289
sum     329   5825

So $n = 21$, $\sum x = 329$, $\sum x^2 = 5825$ and

$$\bar{x} = \frac{\sum x}{n} = \frac{329}{21} = 15.6667 .$$

The formula for the sample standard deviation is in Table 20 of the Supplement.

$$s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{5825 - 21(15.6667)^2}{20} = \frac{670.64473}{20} = 33.532237$$

$$s_x = \sqrt{33.532237} = 5.7907 .$$

$$s_{\bar{x}} = \frac{s_x}{\sqrt{n}} = \sqrt{\frac{s_x^2}{n}} = \sqrt{\frac{33.532237}{21}} = \sqrt{1.59677} = 1.2636$$

$1 - \alpha = .98$, so $\alpha = .02$, $\frac{\alpha}{2} = .01$ and $t_{\alpha/2}^{(n-1)} = t_{.01}^{(20)} = 2.528$.

From Table 3, $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$ is the formula for a two-sided confidence interval when the population
standard deviation is unknown. $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}} = 15.6667 \pm 2.528(1.2636) = 15.6667 \pm 3.1944$, or 12.472
to 18.861. If we ask if the mean is significantly different from 19, our null hypothesis is $H_0: \mu = 19$, and
since 19 is not between the top and the bottom of the confidence interval, we reject $H_0$ and say that the mean
is significantly different from 19. (But it is not significantly different from 18!)
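The arithmetic in Problem 1 can be retraced with a short Python sketch. It is only an illustration: the variable names are made up here, and the critical value 2.528 is copied from the t table rather than computed. Because the sketch carries the unrounded mean, it gives a standard deviation of about 5.7908 (the figure Minitab reports) rather than the hand-computed 5.7907.

```python
import math

# Leisure-time data from Problem 1 (hours per week, n = 21).
x = [15, 12, 18, 23, 11, 21, 16, 13, 9, 19, 26, 11,
     7, 18, 11, 15, 23, 26, 10, 8, 17]

n = len(x)
sum_x = sum(x)                     # sum of x
sum_x2 = sum(v * v for v in x)     # sum of x squared
mean = sum_x / n

# Computational formula: s^2 = (sum(x^2) - n * mean^2) / (n - 1)
var = (sum_x2 - n * mean ** 2) / (n - 1)
s = math.sqrt(var)
se = s / math.sqrt(n)              # standard error of the mean

T_CRIT = 2.528                     # t(.01) with 20 df, taken from the t table
half_width = T_CRIT * se
lower, upper = mean - half_width, mean + half_width

print(f"sum x = {sum_x}, sum x^2 = {sum_x2}, mean = {mean:.4f}")
print(f"s = {s:.4f}, standard error = {se:.4f}")
print(f"98% CI: {lower:.3f} to {upper:.3f}")
print("19 inside the interval:", lower <= 19 <= upper)
```

If 19 falls outside the printed interval, the null hypothesis that the mean is 19 is rejected at the 2% level, which is the same decision rule used in the solution.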
2) If $N = 100$, the sample of 21 is more than 5% of the population, so use the finite population correction:

$$s_{\bar{x}} = \frac{s_x}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = 1.2636 \sqrt{\frac{100-21}{100-1}} = 1.2636\sqrt{0.7980} = 1.2636(0.89330) = 1.1288 .$$

Recall that $\bar{x} = 15.6667$, $\alpha = .02$, $\frac{\alpha}{2} = .01$ and $t_{\alpha/2}^{(n-1)} = t_{.01}^{(20)} = 2.528$.
Confidence interval: $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$ is the formula for a two-sided interval.
$\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}} = 15.6667 \pm 2.528(1.1288) = 15.6667 \pm 2.8536$, or 12.813 to 18.520. The interval is
smaller, but it doesn't change anything: the mean is still significantly different from 19 (but not 18).
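As a rough check on Problem 2, the sketch below applies the finite population correction to the standard error from Problem 1. The names are illustrative, and both 1.2636 and 2.528 are taken from the solution and the t table rather than recomputed.

```python
import math

N, n = 100, 21                # population size and sample size
mean = 329 / 21               # sample mean from Problem 1
se = 1.2636                   # uncorrected standard error from Problem 1
T_CRIT = 2.528                # t(.01) with 20 df, taken from the t table

# Finite population correction factor: sqrt((N - n) / (N - 1))
fpc = math.sqrt((N - n) / (N - 1))
se_fpc = se * fpc             # corrected standard error

half_width = T_CRIT * se_fpc
lower, upper = mean - half_width, mean + half_width

print(f"correction factor = {fpc:.5f}, corrected standard error = {se_fpc:.4f}")
print(f"98% CI: {lower:.3f} to {upper:.3f}")
print("19 inside the interval:", lower <= 19 <= upper)
```

The correction factor is below 1 whenever the sample is a noticeable share of the population, so the interval can only get narrower.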
3) a) Find $z_{.0005}$.
Make a diagram! The diagram for z will be a Normal curve centered at zero and will show one point,
$z_{.0005}$, which has 0.05% above it (and 99.95% below it!) and is above zero because zero has 50% below it.
Since zero has 50% above it, the diagram will show 49.95% between zero and $z_{.0005}$.
From the diagram, we want one point $z_{.0005}$ so that $P(z \ge z_{.0005}) = .0005$ or $P(0 \le z \le z_{.0005}) = .4995$.
On the interior of the Normal table we can find .4995 exactly. In fact, it says $P(0 \le z \le z_0) = .4995$ for
any value of $z_0$ between 3.27 and 3.32. A compromise seems to be required, so note that the halfway point
between these numbers is 3.295. This means that we will say $z_{.0005} = 3.295$.
Check: $P(z \ge 3.295) = P(z \ge 0) - P(0 \le z \le 3.295) = .5 - .4995 = .0005$. This is verified graphically
below.
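For readers without the table or the macro at hand, the table lookup can be compared against Python's standard library. This is a side check under the assumption that `statistics.NormalDist` is available (Python 3.8 or later); it is not part of the original assignment.

```python
from statistics import NormalDist

std_normal = NormalDist()                 # mean 0, standard deviation 1

# Point with .0005 of the area above it, i.e. .9995 below it.
z_exact = std_normal.inv_cdf(1 - 0.0005)

# Area above the compromise value read from the Normal table.
tail_at_table_value = 1 - std_normal.cdf(3.295)

print(f"z with .0005 above it: {z_exact:.4f}")
print(f"area above 3.295:      {tail_at_table_value:.6f}")
```

The computed point is a little below 3.295 because the four-decimal table rounds every area from 3.27 to 3.32 to .4995; the compromise value is still within table accuracy.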

b) We know that $\bar{x} = 15.6667$, $n = 21$ and $\sigma = 6.00$. So

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{6.00}{\sqrt{21}} = \sqrt{\frac{6^2}{21}} = \sqrt{1.7143} = 1.3093 .$$

The 99.9% confidence interval has $1 - \alpha = .999$ or $\alpha = .001$, so $z_{\alpha/2} = z_{.0005} = 3.295$. The
confidence interval is $\mu = \bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}} = 15.6667 \pm 3.295(1.3093) = 15.6667 \pm 4.3141$, or 11.35 to 19.98. If
we test the null hypothesis $H_0: \mu = 19$ against the alternative hypothesis $H_1: \mu \ne 19$, since 19 is in the
confidence interval, we do not reject the null hypothesis.
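The interval in part b) can be recomputed in a few lines. The sketch assumes the table-based value $z_{.0005} = 3.295$ from part a); the variable names are illustrative only.

```python
import math

mean = 329 / 21               # sample mean from Problem 1
n = 21
sigma = 6.00                  # population standard deviation (given)
Z_CRIT = 3.295                # z(.0005) as read from the Normal table in part a)

se = sigma / math.sqrt(n)     # standard error of the mean when sigma is known
half_width = Z_CRIT * se
lower, upper = mean - half_width, mean + half_width

print(f"standard error = {se:.4f}, half-width = {half_width:.4f}")
print(f"99.9% CI: {lower:.2f} to {upper:.2f}")
print("19 inside the interval:", lower <= 19 <= upper)
```

Minitab's interval later in the document differs in the second decimal place because it uses an unrounded critical value rather than 3.295.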
Extra Credit:
4) a. Use the data above to compute a 98% confidence interval for the population standard deviation.
Solution: From the supplement pg 1,

$$\frac{(n-1)s^2}{\chi^{2\,(n-1)}_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^{2\,(n-1)}_{1-\alpha/2}} .$$

We know $s_x^2 = 33.532237$, $s_x = 5.7907$, $n = 21$, $\alpha = .02$ and $\frac{\alpha}{2} = .01$. We use
$\chi^{2\,(n-1)}_{\alpha/2} = \chi^{2\,(20)}_{.01} = 37.5662$ and $\chi^{2\,(n-1)}_{1-\alpha/2} = \chi^{2\,(20)}_{.99} = 8.2604$. The formula becomes

$$\frac{20(33.532237)}{37.5662} \le \sigma^2 \le \frac{20(33.532237)}{8.2604}$$

or $17.8523 \le \sigma^2 \le 81.1879$. If we take square roots, we get $4.225 \le \sigma \le 9.010$.
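The chi-squared interval can be retraced as follows. This is a sketch with illustrative names; the two chi-squared values are copied from the table, not computed.

```python
import math

n = 21
df = n - 1
s2 = 33.532237                # sample variance from Problem 1

CHI2_UPPER = 37.5662          # chi-squared(.01), 20 df, from the table
CHI2_LOWER = 8.2604           # chi-squared(.99), 20 df, from the table

# (n-1)s^2 / chi2(alpha/2) <= sigma^2 <= (n-1)s^2 / chi2(1 - alpha/2)
var_lower = df * s2 / CHI2_UPPER
var_upper = df * s2 / CHI2_LOWER
sd_lower, sd_upper = math.sqrt(var_lower), math.sqrt(var_upper)

print(f"98% CI for the variance: {var_lower:.4f} to {var_upper:.4f}")
print(f"98% CI for the standard deviation: {sd_lower:.3f} to {sd_upper:.3f}")
```

Note that the larger table value goes in the denominator of the lower limit, which is why the interval is not symmetric about the sample variance.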
b. Assume that you got the sample standard deviation above from a sample of 45, and repeat a.
Solution: From the supplement pg 2,

$$\frac{s\sqrt{2DF}}{z_{\alpha/2} + \sqrt{2DF}} \le \sigma \le \frac{s\sqrt{2DF}}{-z_{\alpha/2} + \sqrt{2DF}} .$$

We now have $s_x = 5.7907$, $n = 45$, $\alpha = .02$ and $\frac{\alpha}{2} = .01$. We use $z_{.01} = 2.327$ and
$\sqrt{2DF} = \sqrt{2(44)} = \sqrt{88} = 9.3808$. The formula becomes

$$\frac{5.7907(9.3808)}{2.327 + 9.3808} \le \sigma \le \frac{5.7907(9.3808)}{-2.327 + 9.3808}$$

or $4.640 \le \sigma \le 7.701$.
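The large-sample formula is easy to mistype, so here is a minimal sketch of the same calculation. The names are illustrative, and $z_{.01} = 2.327$ is the table value used in the solution.

```python
import math

s = 5.7907                    # sample standard deviation from Problem 1
n = 45
df = n - 1
Z_CRIT = 2.327                # z(.01), as used in the solution

root = math.sqrt(2 * df)      # sqrt(2 DF)

# s*sqrt(2DF) / (z + sqrt(2DF)) <= sigma <= s*sqrt(2DF) / (-z + sqrt(2DF))
sd_lower = s * root / (Z_CRIT + root)
sd_upper = s * root / (-Z_CRIT + root)

print(f"sqrt(2 DF) = {root:.4f}")
print(f"98% CI for the standard deviation: {sd_lower:.3f} to {sd_upper:.3f}")
```

With 44 degrees of freedom the interval is noticeably tighter than the one from 20 degrees of freedom in part a.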
c. Fool around with the method for getting a confidence interval for a median and try to come close to a
99% confidence interval for the median.
The numbers in order are

x1  x2  x3  x4  x5  x6  x7  x8  x9  x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21
 7   8   9  10  11  11  11  12  13  15  15  16  17  18  18  19  21  23  23  26  26

It says on the outline that $\alpha = 2P(x \le k-1)$. We want $\alpha$ to be 1% or lower, which means
$P(x \le k-1) \le .005$. Unfortunately, our Binomial table only goes to $n = 20$ and $n = 25$, and if we look
where $p = .50$, we find that the first quantity at or below 0.5% is $P(x \le 3) = .00129$ for $n = 20$ and
$P(x \le 5) = .00204$ for $n = 25$. So we can only guess that k is between 4 and 6. To be more accurate, use
the formula

$$k = \frac{n + 1 - z_{\alpha/2}\sqrt{n}}{2} = \frac{21 + 1 - 2.576\sqrt{21}}{2} = \frac{22 - 11.8047}{2} = 5.10 .$$

It tells us to use the 5th number from the end, since, if we want to be conservative, we round the answer down.
I can check my results two ways. I am using a continuity correction, which adds 0.5 to each interval.

For $k - 1 = 4$:
$$P(x \le 4.5) = P\left(z \le \frac{4.5 - 21(.5)}{\sqrt{21(.5)(.5)}}\right) = P\left(z \le \frac{4.5 - 10.5}{2.2912878}\right) = P(z \le -2.62) = .5 - .4956 = .0044$$

For $k - 1 = 5$:
$$P(x \le 5.5) = P\left(z \le \frac{5.5 - 21(.5)}{\sqrt{21(.5)(.5)}}\right) = P\left(z \le \frac{5.5 - 10.5}{2.2912878}\right) = P(z \le -2.18) = .5 - .4854 = .0146$$

Since .0044 is below .005 and .0146 is not, $k - 1 = 4$ and $k = 5$.
I also checked it by generating a binomial table for $n = 21$ and $p = .5$ on Minitab. I put the numbers 0
through 7 in C1, used the Calc menu, then Probability Distributions and then Binomial. In the dialog box
I picked Cumulative probability, Number of trials = 21, Probability of success = .5, Input column =
C1 and Optional storage = C2. The equivalent command is:
MTB > CDF c1 c2;
SUBC> Binomial 21 0.5.
The results, in C1 and C2, are:
x0    P(x <= x0)
0     0.0000005
1     0.0000105
2     0.0001106
3     0.0007448
4     0.0035987
5     0.0133018
6     0.0391769
7     0.0946236
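The same cumulative binomial probabilities can be generated without Minitab. The sketch below is an illustration, not the author's method: it computes exact values of $P(x \le m)$ for $n = 21$, $p = .5$, picks the largest $k$ with $2P(x \le k-1) \le .01$, and then, assuming the interval runs from the k-th smallest to the k-th largest observation, reads the limits off the ordered data.

```python
from math import comb

times = [15, 12, 18, 23, 11, 21, 16, 13, 9, 19, 26, 11,
         7, 18, 11, 15, 23, 26, 10, 8, 17]
ordered = sorted(times)
n = len(ordered)
alpha = 0.01


def binom_cdf(m, n, p=0.5):
    """Exact P(x <= m) for a Binomial(n, p) count."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(m + 1))


# Largest m = k - 1 whose cumulative probability is still at most alpha / 2.
m = -1
while binom_cdf(m + 1, n) <= alpha / 2:
    m += 1
k = m + 1

# k-th value from the bottom and k-th value from the top of the ordered data.
lower, upper = ordered[k - 1], ordered[n - k]
confidence = 1 - 2 * binom_cdf(k - 1, n)

for x0 in range(8):
    print(x0, f"{binom_cdf(x0, n):.7f}")
print(f"k = {k}, interval {lower} to {upper}, actual confidence {confidence:.4f}")
```

Because the binomial distribution is discrete, the achieved confidence level is somewhat above 99% rather than exactly 99%.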
How I got these results
‘MTB >’ is the Minitab prompt. The retrieval is done using the ‘file’ pull-down menu and the ‘open worksheet’ command followed
by finding where I put the data. Other instructions were typed in the ‘session’ window.
I put the data in column 1 in Minitab and used the ‘Gsummary’ and ‘Describe’ commands to get the mean
and standard deviation.
Session to get confidence intervals
————— 9/16/2005 5:34:34 PM ————————————————————
Welcome to Minitab, press F1 for help.
Results for: 2gr1-052.MTW
MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\2gr1-052.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My
Documents\Minitab\2gr1-052.MTW'
Worksheet was saved on Wed Sep 07 2005
MTB > Describe c1
Descriptive Statistics: time
Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
time      21   0  15.67     1.26   5.79     7.00  11.00   15.00  20.00    26.00
MTB > Gsummary c1;
SUBC> confidence 98.
Summary for time
MTB > let c2 = c1*c1
I computed the square of C1 in C2.
MTB > sum c1
Sum of time
Sum of time = 329
MTB > sum c2
Sum of C2
Sum of C2 = 5825
I got the sums for computing the variance.
MTB > print c1 c2
Data Display
Row  time   C2
  1    15  225
  2    12  144
  3    18  324
  4    23  529
  5    11  121
  6    21  441
  7    16  256
  8    13  169
  9     9   81
 10    19  361
 11    26  676
 12    11  121
 13     7   49
 14    18  324
 15    11  121
 16    15  225
 17    23  529
 18    26  676
 19    10  100
 20     8   64
 21    17  289
MTB > Onet c1;
SUBC> test 19;
SUBC> confidence 98.
This does a 98% confidence interval and a test for a mean of 19 using s.
One-Sample T: time
Test of mu = 19 vs not = 19
Variable   N     Mean   StDev  SE Mean              98% CI      T      P
time      21  15.6667  5.7908   1.2637  (12.4722, 18.8612)  -2.64  0.016
MTB > Onez c1;
SUBC> sigma 6;
SUBC> confidence 99.9.
This does a 99.9% confidence interval and a test for a mean of 19 using sigma.
One-Sample Z: time
The assumed standard deviation = 6
Variable   N     Mean   StDev  SE Mean            99.9% CI
time      21  15.6667  5.7908   1.3093  (11.3584, 19.9750)
MTB > let c5=c1
MTB > sort c5 c5;
SUBC> by c5.
This sorts c1, which I moved to c5.
MTB > print c5
Data Display
times
 7   8   9  10  11  11  11  12  13  15  15  16  17  18  18
19  21  23  23  26  26
Session to verify value of z
Welcome to Minitab, press F1 for help.
MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\notmuch.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My
Documents\Minitab\notmuch.MTW'
Worksheet was saved on Thu Apr 14 2005
This worksheet is nonsense and not actually used. It gets Minitab to look in the same place for normarea.
Results for: notmuch.MTW
MTB > %normarea5a
This does the graph shown below. It prompts you to put in the data.
Executing from file: normarea5a.MAC
Graphic display of normal curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K116)
Enter the mean and standard deviation of the normal curve.
DATA> 0
DATA> 1
Do you want the area to the left of a value? (Y or N)
no
Do you want the area to the right of a value? (Y or N)
yes
Enter the value for which you want the area to the right.
DATA> 3.295
...working...
Normal Curve Area
Extra Credit Minitab Problem
5. Check some numbers in the t, Chi-Squared or F tables using the new set of Minitab routines that I have
prepared. To use the new set of routines, set up a file to hold your work. Then go to
http://courses.wcupa.edu/rbove , open the Minitab folder and download any of the following:
Normal Distribution Area programs:
NormArea5A.txt
or NormArea5C.txt and NormArea5.txt.
t Distribution Area programs:
tAreaA.txt
or tAreaC.txt and tArea.txt
Chi-squared Distribution Area programs:
ChiAreaA.txt
or ChiAreaC.txt and ChiArea.txt
F Distribution area programs:
FAreaA.txt
or FAreaC.txt and FArea.txt
Use Notepad (under ‘tools’ in Minitab) to convert their extensions from .txt back to .mac. To see how they
are used, look at http://courses.wcupa.edu/rbove/Minitab/Area.doc.
Routines like tAreaA are self prompting. To use routines like tAreaC, you need to set up your data in
advance. If you want to use one of the worksheets that are mentioned in
http://courses.wcupa.edu/rbove/Minitab/Area.doc, click on ‘File’ and then ‘Open Worksheet.’ Copy a URL
like the ones below into ‘File Name.’
http://courses.wcupa.edu/rbove/Minitab/252PrA1d-f.MTW
http://courses.wcupa.edu/rbove/Minitab/tEx1.MTW
http://courses.wcupa.edu/rbove/Minitab/ChiEx1.MTW
http://courses.wcupa.edu/rbove/Minitab/FEx1.MTW
Addendum: To get graphs into a document do the following. While you are in Minitab, click on the graph
and use the ‘File’ menu; choose ‘Save Graph as….’ A menu will appear. Pick a type (like ‘.jpg’ for color or
‘.png’ for black and white) under ‘Save as type …’ give the graph a name like ‘graph1.jpg’ and the graph
will be placed in the same file that contains your data and macro. You can now use the ‘Insert’ menu in
Word to insert a picture.
10
10
10
 1.372 , z.10  1.282 ,  2 .10  15.9872 ,  2.90  4.8650 ,
Results: I looked at the tables and found t.10
10,10  2.32 and F 10,10  1
F.10
 0.431 . For the numbers with .10 as a subscript, I checked that the
.90
2.32
probability above them was .10, for the numbers with .90 as a subscript, I checked that the probability
below them was .10.
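Some of these table values can also be spot-checked with nothing but Python's standard library. The sketch below is a partial check under stated assumptions: it covers only the Normal and chi-squared values plus the F reciprocal, since the standard library has no t or F distribution, and the helper `chi2_upper_tail` is a made-up name that relies on the closed-form tail available when the degrees of freedom are even.

```python
import math
from statistics import NormalDist


def chi2_upper_tail(x, df):
    """P(chi-squared > x) for even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!."""
    if df % 2 != 0:
        raise ValueError("this closed form only holds for even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))


z_tail = 1 - NormalDist().cdf(1.282)              # area above z(.10)
chi_upper = chi2_upper_tail(15.9872, 10)          # area above chi-squared(.10), 10 df
chi_lower = 1 - chi2_upper_tail(4.8650, 10)       # area below chi-squared(.90), 10 df
f_lower_point = 1 / 2.32                          # F(.90) as reciprocal of F(.10), equal df

print(f"area above z = 1.282:        {z_tail:.4f}")
print(f"area above chi2 = 15.9872:   {chi_upper:.4f}")
print(f"area below chi2 = 4.8650:    {chi_lower:.4f}")
print(f"1 / 2.32 =                   {f_lower_point:.3f}")
```

Each printed area should come out close to .10, matching what the Minitab area macros display graphically.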
————— 9/19/2005 5:33:43 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\notmuch.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My
Documents\Minitab\notmuch.MTW'
Worksheet was saved on Thu Apr 14 2005
Results for: notmuch.MTW
MTB > %tarea6a
Executing from file: tarea6a.MAC
Graphic display of t curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K120)
Enter the degrees of freedom.
DATA> 10
Do you want the area to the left of a value? (Y or N)
n
Do you want the area to the right of a value? (Y or N)
y
Enter the value for which you want the area to the right.
DATA> 1.372
...working...
t Curve Area
Data Display
mode  median
   0       0
MTB > %normarea6a
Executing from file: normarea6a.MAC
Graphic display of normal curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K116)
Enter the mean and standard deviation of the normal curve.
DATA> 0
DATA> 1
Do you want the area to the left of a value? (Y or N)
n
Do you want the area to the right of a value? (Y or N)
y
Enter the value for which you want the area to the right.
DATA> 1.282
...working...
Normal Curve Area
MTB > %chiarea6a
Executing from file: chiarea6a.MAC
Graphic display of chi square curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K120)
Enter the degrees of freedom.
DATA> 10
Do you want the area to the left of a value? (Y or N)
n
Do you want the area to the right of a value? (Y or N)
y
Enter the value for which you want the area to the right.
DATA> 15.9872
...working...
ChiSquare Curve Area
Data Display
std_dev     mode   median
4.47214  8.00000  9.33333
MTB > %chiarea6a
Executing from file: chiarea6a.MAC
Graphic display of chi square curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K120)
Enter the degrees of freedom.
DATA> 10
Do you want the area to the left of a value? (Y or N)
l
Please answer Yes or No.
y
Enter the value for which you want the area to the left.
DATA> 4.8650
...working...
Chi Squared Curve Area
Data Display
std_dev     mode   median
4.47214  8.00000  9.33333
MTB > %farea6a
Executing from file: farea6a.MAC
Graphic display of F curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K120)
Enter the degrees of freedom.DF2 must be above 4.
DATA> 10
DATA> 10
Do you want the area to the left of a value? (Y or N)
n
Do you want the area to the right of a value? (Y or N)
y
Enter the value for which you want the area to the right.
DATA> 2.32
...working...
F Curve Area
Data Display
mode      std dev
0.818182  0.968246
MTB > %farea6a
Executing from file: farea6a.MAC
Graphic display of F curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K120)
Enter the degrees of freedom.DF2 must be above 4.
DATA> 10
DATA> 10
Do you want the area to the left of a value? (Y or N)
y
Enter the value for which you want the area to the left.
DATA> .431
...working...
F Curve Area
Data Display
mode      std dev
0.818182  0.968246
MTB >