252solngr1 9/16/05
(Open this document in 'Page Layout' view!)
Graded Assignment 1
Please show your work! Neatness and whether the papers are stapled may affect your grade.
1. A Psychiatrist is treating a group of aborigines who are suffering from depression. Whether justifiably or
not, she considers this group a random sample of 15 taken from a very large number of depressed
individuals. The numbers below represent the measurement of the sample’s level of depression an hour after
taking the pill using a commonly used (Coolidge Axis II) scale for measuring depression. Personalize the
data as follows: add the digits of your student number to the last six numbers. Example: Ima Badrisk has the
student number 123456; so the last six numbers become {51, 52, 53, 54, 55, 56}.
52  53  58  50  53  58  55  66  53  50  50  50  50  50  50
1. Compute the sample standard deviation using the computational formula. Use this sample standard
deviation to compute a 99% confidence interval for the mean. The doctor believes that subjects fed a sugar
pill will have an average score on the same scale of 58.73. Does the mean from your sample differ
significantly from 58.73? Why?
2. How would these results change if these individuals were a random sample of 15 taken from the 150
members of the tribe that are depressed?
3. Assume that the population standard deviation is 4.50 (and that the sample of 15 is taken from a very
large population). Find $z_{.0025}$ using the Normal table (if you have several values of z that you can use,
pick the average of the extreme ones) and use it to compute a 99.5% confidence interval. Does the mean differ
significantly from 58.73 now? Why?
Solution: There are two basic observations. 1) You can’t answer a question you haven’t read. It says
‘computational formula’ in the first part. If you don’t know what that means, find out! 2) You can’t do an
assignment based on problems if you haven’t looked at the problems. The first 3 problems were based on
Problems A1, A2 and 8.20. If you had made these your own, there was no chance of error.
1) $\sum x = 819$, $\sum x^2 = 44931$, $n = 15$

index     x     x^2
    1    52    2704
    2    53    2809
    3    58    3364
    4    50    2500
    5    53    2809
    6    58    3364
    7    55    3025
    8    66    4356
    9    53    2809
   10    51    2601
   11    52    2704
   12    53    2809
   13    54    2916
   14    55    3025
   15    56    3136
  Sum   819   44931
$\bar{x} = \frac{\sum x}{n} = \frac{819}{15} = 54.6$

$s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{44931 - 15(54.6)^2}{14} = \frac{213.6}{14} = 15.2571$

The formula for the sample standard deviation is in Table 20 of the Supplement.

$s_x = \sqrt{15.2571} = 3.9060$.

$s_{\bar{x}} = \frac{s_x}{\sqrt{n}} = \sqrt{\frac{s_x^2}{n}} = \sqrt{\frac{15.2571}{15}} = \sqrt{1.0171} = 1.0085$

$1 - \alpha = .99$, $\alpha = .01$, $\alpha/2 = .005$, $t^{n-1}_{\alpha/2} = t^{14}_{.005} = 2.977$

From Table 3, $\mu = \bar{x} \pm t^{n-1}_{\alpha/2}\, s_{\bar{x}}$ is the formula for a two sided confidence interval when the population
standard deviation is unknown. $\mu = \bar{x} \pm t^{n-1}_{\alpha/2}\, s_{\bar{x}} = 54.600 \pm 2.977(1.0085) = 54.600 \pm 3.002$ or 51.598 to
57.602.
If we ask if the mean is significantly different from 58.73, our null hypothesis is $H_0: \mu = 58.73$ and since
58.73 is not between the top and the bottom of the confidence interval, reject $H_0$ and say that the mean is
significantly different from 58.73. (But it is not significantly different from 57!)
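As a cross-check on the arithmetic in part 1, here is a minimal Python sketch of the computational formula and the t interval. The variable names are illustrative and not from the assignment, and the critical value 2.977 is simply copied from the t table (14 degrees of freedom, .005 in the upper tail) rather than computed.

```python
import math

# Personalized data for the example student number 123456.
x = [52, 53, 58, 50, 53, 58, 55, 66, 53, 51, 52, 53, 54, 55, 56]
n = len(x)

sum_x = sum(x)                    # 819
sum_x2 = sum(v * v for v in x)    # 44931
mean = sum_x / n

# Computational formula: s^2 = (sum of x^2 - n * mean^2) / (n - 1)
var = (sum_x2 - n * mean ** 2) / (n - 1)
s = math.sqrt(var)
se = s / math.sqrt(n)             # standard error of the mean

t_crit = 2.977                    # assumed table value: t, 14 df, .005 upper tail
half_width = t_crit * se
ci = (mean - half_width, mean + half_width)

# The hypothesized mean is rejected when it falls outside the interval.
mu0 = 58.73
reject = not (ci[0] <= mu0 <= ci[1])
print(f"mean={mean:.3f} s={s:.4f} se={se:.4f} "
      f"CI=({ci[0]:.3f}, {ci[1]:.3f}) reject={reject}")
```

Changing `mu0` to 57 makes `reject` false, which is the point of the parenthetical remark above.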
2) If $N = 150$, the sample of 15 is more than 5% of the population, so use
$s_{\bar{x}} = \frac{s_x}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 1.0085\sqrt{\frac{150-15}{150-1}} = 1.0085\sqrt{0.9060} = 1.0085(0.9519) = 0.9600$.
Recall that $\bar{x} = 54.600$, $\alpha = .01$, $\alpha/2 = .005$, $t^{n-1}_{\alpha/2} = t^{14}_{.005} = 2.977$ and that $\mu = \bar{x} \pm t^{n-1}_{\alpha/2}\, s_{\bar{x}}$ is the formula
for a two sided interval. $\mu = \bar{x} \pm t^{n-1}_{\alpha/2}\, s_{\bar{x}} = 54.600 \pm 2.977(0.9600) = 54.600 \pm 2.858$ or 51.742 to 57.458.
The interval is smaller, but it doesn't change anything: the mean is still significantly different from 58.73
(but not from 57).
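The finite population correction can be sketched in a few lines of Python. This assumes the same sample standard deviation, mean and t table value as in part 1; the variable names are illustrative only. Because the correction factor is below 1, the standard error and the interval can only shrink.

```python
import math

n, N = 15, 150
s = 3.9060        # sample standard deviation from part 1
mean = 54.6
t_crit = 2.977    # assumed table value: t, 14 df, .005 upper tail

se_infinite = s / math.sqrt(n)          # standard error ignoring population size
fpc = math.sqrt((N - n) / (N - 1))      # finite population correction factor
se_finite = se_infinite * fpc

half_width = t_crit * se_finite
ci = (mean - half_width, mean + half_width)
print(f"fpc={fpc:.4f} se={se_finite:.4f} CI=({ci[0]:.3f}, {ci[1]:.3f})")
```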
3) a) Find $z_{.0025}$ and compute a 99.5% confidence interval for the population mean.
Make a diagram! The diagram for z will be a Normal curve centered at zero and will show one point,
$z_{.0025}$, which has 0.25% above it (and 99.75% below it!) and is above zero because zero has 50% below it.
Since zero has 50% above it, the diagram will show 49.75% between zero and $z_{.0025}$.
From the diagram, we want one point $z_{.0025}$ so that $P(z \ge z_{.0025}) = .0025$ or $P(0 \le z \le z_{.0025}) = .4975$.
On the interior of the Normal table we can find .4975 exactly. In fact, it says $P(0 \le z \le z_0) = .4975$ for
$z_0 = 2.81$. This means that we will say $z_{.0025} = 2.81$.
Check: $P(z \ge 2.81) = P(z \ge 0) - P(0 \le z \le 2.81) = .5 - .4975 = .0025$. This is verified graphically below.
b) We know that $\bar{x} = 54.600$, $n = 15$ and $\sigma = 4.50$. So
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4.50}{\sqrt{15}} = \sqrt{\frac{4.50^2}{15}} = \sqrt{1.350} = 1.1619$.
The 99.5% confidence interval has $1 - \alpha = .995$ or $\alpha = .005$, so $z_{\alpha/2} = z_{.0025} = 2.81$. The
confidence interval is $\mu = \bar{x} \pm z_{\alpha/2}\, \sigma_{\bar{x}} = 54.600 \pm 2.81(1.1619) = 54.60 \pm 3.26$ or 51.34 to 57.86. If we test
the null hypothesis $H_0: \mu = 58.73$ against the alternative hypothesis $H_1: \mu \ne 58.73$, since 58.73 is not on
the confidence interval, we reject the null hypothesis and say that our results indicate that the mean is
significantly different from 58.73.
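The z interval can also be sketched with Python's standard library, which computes the Normal quantile directly instead of reading it from a table. The variable names are illustrative; the only inputs are the values stated in the problem. The computed quantile is about 2.807, which the table rounds to 2.81, so the endpoints differ from the hand calculation in the third decimal place.

```python
import math
from statistics import NormalDist

mean, n, sigma = 54.6, 15, 4.50
confidence = 0.995
alpha = 1 - confidence

# z with alpha/2 = .0025 in the upper tail.
z = NormalDist().inv_cdf(1 - alpha / 2)
se = sigma / math.sqrt(n)          # standard error with known sigma
ci = (mean - z * se, mean + z * se)

mu0 = 58.73
reject = not (ci[0] <= mu0 <= ci[1])
print(f"z={z:.4f} se={se:.4f} CI=({ci[0]:.4f}, {ci[1]:.4f}) reject={reject}")
```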
Check of results in 1 and 3 using Minitab.
————— 9/16/2006 3:19:47 AM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\2gr1-060.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My
Documents\Minitab\2gr1-060.MTW'
Worksheet was saved on Wed Sep 13 2006
Results for: 2gr1-060.MTW
MTB > print c5
Data Display
drug3
   52   53   58   50   53   58   55   66   53   51   52   53   54   55   56
MTB > Onet 'drug3';
SUBC>   Confidence 99.0.
One-Sample T: drug3
Variable   N     Mean   StDev  SE Mean  99% CI
drug3     15  54.6000  3.9060   1.0085  (51.5977, 57.6023)
MTB > OneZ 'drug3';
SUBC>   Sigma 4.5;
SUBC>   Confidence 99.5.
One-Sample Z: drug3
The assumed standard deviation = 4.5
Variable   N     Mean   StDev  SE Mean  99.5% CI
drug3     15  54.6000  3.9060   1.1619  (51.3385, 57.8615)
MTB > Stop.
First Extra Credit Problem
4. a. Use the data above to compute a 98% confidence interval for the population standard deviation.
b. Assume that you got the sample standard deviation that you got above from a sample of 45, repeat a.
c. Fool around with the method for getting a confidence interval for a median and try to come close to a
99% confidence interval for the median.
Solution: a. Recall that $s_x^2 = 15.2571$, $s_x = \sqrt{15.2571} = 3.9060$ and $n = 15$. The problem says that
$\alpha = .02$ and $\alpha/2 = .01$. From the supplement pg 1 (or Table 3),
$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$.
We use $\chi^{2\,(14)}_{.01} = 29.1413$ and $\chi^{2\,(14)}_{.99} = 4.6604$. The formula becomes
$\frac{14(15.2571)}{29.1413} \le \sigma^2 \le \frac{14(15.2571)}{4.6604}$
or $7.3298 \le \sigma^2 \le 45.8328$. If we take square roots, we get $2.707 \le \sigma \le 6.770$.
b. We will repeat a) with $n = 45$. Recall $s_x = 3.9060$. Now $DF = n - 1 = 44$. From the supplement pg 2 (or
Table 3), the formula for large samples is
$\frac{s\sqrt{2DF}}{z_{\alpha/2} + \sqrt{2DF}} \le \sigma \le \frac{s\sqrt{2DF}}{-z_{\alpha/2} + \sqrt{2DF}}$.
Since the $\chi^2$ table has no values for 44 degrees of freedom, we must use the large sample formula. We use
$z_{.01} = 2.327$ and $\sqrt{2DF} = \sqrt{2(44)} = \sqrt{88} = 9.3808$. The formula becomes
$\frac{3.9060(9.3808)}{2.327 + 9.3808} \le \sigma \le \frac{3.9060(9.3808)}{-2.327 + 9.3808}$
or $\frac{36.6414}{11.7078} \le \sigma \le \frac{36.6414}{7.0538}$ or $3.130 \le \sigma \le 5.195$.
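Both standard deviation intervals reduce to a few lines of arithmetic, sketched below in Python. The chi-squared and z critical values are assumptions in the sense that they are copied from the tables cited above rather than computed; the variable names are illustrative.

```python
import math

s2 = 15.2571             # sample variance from problem 1
s = math.sqrt(s2)

# a) Small sample (n = 15): chi-squared table values for 14 degrees of freedom.
n = 15
chi2_upper = 29.1413     # assumed table value: .01 in the upper tail
chi2_lower = 4.6604      # assumed table value: .99 in the upper tail
var_lo = (n - 1) * s2 / chi2_upper
var_hi = (n - 1) * s2 / chi2_lower
sd_small = (math.sqrt(var_lo), math.sqrt(var_hi))

# b) Large sample (n = 45): Normal approximation with z.01 = 2.327.
df = 45 - 1
z = 2.327                # assumed table value
root = math.sqrt(2 * df)
sd_large = (s * root / (z + root), s * root / (-z + root))

print(f"a) {sd_small[0]:.3f} <= sigma <= {sd_small[1]:.3f}")
print(f"b) {sd_large[0]:.3f} <= sigma <= {sd_large[1]:.3f}")
```

The larger sample gives a visibly tighter interval around the same sample standard deviation.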
c. We fool around with the method for getting a confidence interval for a median and try to come close to
a 99% confidence interval for the median.
The numbers in order are

 x1   x2   x3   x4   x5   x6   x7   x8   x9  x10  x11  x12  x13  x14  x15
 50   51   52   52   53   53   53   53   54   55   55   56   58   58   66

It says on the outline that, if we use the k th numbers from the end, $\alpha = 2P(x \le k-1)$. We want $\alpha$ to be 1%
or lower which means $P(x \le k-1) \le .005$. There are two ways to do this. If we take the easy way out and
use a Normal approximation,
$k = \frac{n + 1 - z_{\alpha/2}\sqrt{n}}{2} = \frac{15 + 1 - 2.576\sqrt{15}}{2} = \frac{16 - 9.9768}{2} = 3.0116$.
This seems to be telling us to use the numbers that are 3 from each end or $x_3 = 52$ and $x_{13} = 58$. (To be conservative,
round the result down.)
To be more precise, use the Binomial table with $n = 15$. Possible intervals are $x_1$ to $x_{15}$, $x_2$ to $x_{14}$ etc.
Let's try a few intervals.

Interval                              k    $\alpha = 2P(x \le k-1)$
$x_1$ to $x_{15}$ or 50 to 66         1    $2P(x \le 0) = 2(.00003) = .00006$
$x_2$ to $x_{14}$ or 51 to 58         2    $2P(x \le 1) = 2(.00049) = .00098$
$x_3$ to $x_{13}$ or 52 to 58         3    $2P(x \le 2) = 2(.00369) = .00738$
$x_4$ to $x_{12}$ or 52 to 56         4    $2P(x \le 3) = 2(.01758) = .03516$

Notice that we could have answered the question by finding the largest value of k with $P(x \le k-1) \le .005$.
Since the smallest interval with a significance level below 1% is 52 to 58, this is the best that we can do.
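The search over k can be sketched in Python with exact Binomial probabilities in place of the table. The function names `binom_cdf` and `median_interval` are hypothetical helpers written for this illustration, not part of the outline.

```python
from math import comb

data = sorted([52, 53, 58, 50, 53, 58, 55, 66, 53, 51, 52, 53, 54, 55, 56])

def binom_cdf(c, n, p=0.5):
    """P(x <= c) for a Binomial(n, p) count."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(c + 1))

def median_interval(values, max_alpha):
    """Largest k whose interval x_k to x_(n-k+1) has 2P(x <= k-1) <= max_alpha.

    Returns (k, lower, upper, alpha), or None if no k qualifies.
    """
    n = len(values)
    best = None
    for k in range(1, n // 2 + 1):
        alpha = 2 * binom_cdf(k - 1, n)
        if alpha <= max_alpha:
            # alpha grows with k, so the last qualifying k is the largest.
            best = (k, values[k - 1], values[n - k], alpha)
    return best

k, low, high, alpha = median_interval(data, 0.01)
print(f"k={k}: interval {low} to {high}, alpha={alpha:.5f}")
```

The exact probabilities agree with the table entries to the five decimals shown there.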
We can check our results using the Normal distribution. The outline says, using a continuity correction,
$P(x \le k-1) \approx P\left(x \le k - 1 + \tfrac{1}{2}\right) = P\left(z \le \frac{k - 1 + \frac{1}{2} - np}{\sqrt{npq}}\right) = P\left(z \le \frac{k - .5 - .5n}{.5\sqrt{n}}\right)$.

$k = 3$: $P(x \le 2.5) = P\left(z \le \frac{3 - 1 + .5 - 15(.5)}{\sqrt{15(.5)(.5)}}\right) = P\left(z \le \frac{2.5 - 7.5}{1.9365}\right) = P(z \le -2.58) = .5 - .4951 = .0049$

$k = 4$: $P(x \le 3.5) = P\left(z \le \frac{4 - 1 + .5 - 15(.5)}{\sqrt{15(.5)(.5)}}\right) = P\left(z \le \frac{3.5 - 7.5}{1.9365}\right) = P(z \le -2.07) = .5 - .4808 = .0192$

Since we need $P(x \le k-1) \le .005$, $k = 3$ was correct.
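The continuity-corrected approximation is easy to evaluate numerically. The sketch below uses Python's standard library; `approx_tail` is a hypothetical helper name. With the values in this problem, $(2.5 - 7.5)/1.9365$ works out to about $-2.58$ and $(3.5 - 7.5)/1.9365$ to about $-2.07$.

```python
import math
from statistics import NormalDist

def approx_tail(k, n, p=0.5):
    """Normal approximation to P(x <= k-1), using a continuity correction."""
    z = (k - 1 + 0.5 - n * p) / math.sqrt(n * p * (1 - p))
    return z, NormalDist().cdf(z)

for k in (3, 4):
    z, prob = approx_tail(k, 15)
    verdict = "ok" if prob <= 0.005 else "too large"
    print(f"k={k}: z={z:.2f}, P(x <= {k - 1}) is about {prob:.4f} ({verdict})")
```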
Extra Credit Minitab Problem
5. Check some numbers in the Normal, t, Chi-Squared or F tables using the new set of Minitab routines that
I have prepared. To use the new set of routines, follow the instructions in Areadoc1. There are several
things that you can do. For the Normal distribution use the computer to check the answers to Examples 6.1-6.4
on pp 198-200 in the text. For the t-table pick a number of degrees of freedom and show that for that
number of degrees of freedom, the probability above, say, $t_{.20}$ is 20%. You can do the same for the F and
chi-squared tables in your book of tables. A good answer will explain what you did and contain the
command dialog and graphs.
10
10
10
Results: I looked at the tables and found t.10
 1.372 , z.10  1.282 ,  2 .10  15.9872 ,  2.90  4.8650 ,
10,10  2.32 and F 10,10  1
F.10
 0.431 . For the numbers with .10 as a subscript, I checked that the
.90
2.32
probability above them was .10, for the numbers with .90 as a subscript, I checked that the probability
below them was .10.
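Some of these table entries can also be checked without the macros. The Python sketch below covers only the Normal and chi-squared values, using the standard library; it relies on a closed form for the chi-squared upper tail that holds for even degrees of freedom, which is a standard identity and not something taken from the assignment. The t and F entries would need a statistics package and are not checked here apart from the reciprocal. `chi2_upper_tail` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def chi2_upper_tail(x, df):
    """P(chi-squared > x), closed form valid only for even degrees of freedom."""
    if df % 2 != 0:
        raise ValueError("this closed form only covers even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half**j / math.factorial(j) for j in range(df // 2))

z_tail = 1 - NormalDist().cdf(1.282)        # area above the z.10 table value
chi_hi = chi2_upper_tail(15.9872, 10)       # area above the .10 chi-squared value
chi_lo = 1 - chi2_upper_tail(4.8650, 10)    # area below the .90 chi-squared value
f_lo = 1 / 2.32                             # reciprocal rule for F.90 with equal df

print(f"above z=1.282: {z_tail:.4f}")
print(f"above chi2=15.9872: {chi_hi:.4f}")
print(f"below chi2=4.8650: {chi_lo:.4f}")
print(f"1/2.32 = {f_lo:.3f}")
```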
————— 9/19/2005 5:33:43 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\notmuch.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My
Documents\Minitab\notmuch.MTW'
Worksheet was saved on Thu Apr 14 2005
Results for: notmuch.MTW
MTB > %tarea6a
Executing from file: tarea6a.MAC
Graphic display of t curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K120)
Enter the degrees of freedom.
DATA> 10
Do you want the area to the left of a value? (Y or N)
n
Do you want the area to the right of a value? (Y or N)
y
Enter the value for which you want the area to the right.
DATA> 1.372
...working...
t Curve Area
Data Display
mode   median
   0        0
MTB > %normarea6a
Executing from file: normarea6a.MAC
Graphic display of normal curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K116)
Enter the mean and standard deviation of the normal curve.
DATA> 0
DATA> 1
Do you want the area to the left of a value? (Y or N)
n
Do you want the area to the right of a value? (Y or N)
y
Enter the value for which you want the area to the right.
DATA> 1.282
...working...
Normal Curve Area
MTB > %chiarea6a
Executing from file: chiarea6a.MAC
Graphic display of chi square curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K120)
Enter the degrees of freedom.
DATA> 10
Do you want the area to the left of a value? (Y or N)
n
Do you want the area to the right of a value? (Y or N)
y
Enter the value for which you want the area to the right.
DATA> 15.9872
...working...
ChiSquare Curve Area
Data Display
std_dev     mode   median
4.47214  8.00000  9.33333
MTB > %chiarea6a
Executing from file: chiarea6a.MAC
Graphic display of chi square curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K120)
Enter the degrees of freedom.
DATA> 10
Do you want the area to the left of a value? (Y or N)
l
Please answer Yes or No.
y
Enter the value for which you want the area to the left.
DATA> 4.8650
...working...
Chi Squared Curve Area
Data Display
std_dev     mode   median
4.47214  8.00000  9.33333
MTB > %farea6a
Executing from file: farea6a.MAC
Graphic display of F curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K120)
Enter the degrees of freedom.DF2 must be above 4.
DATA> 10
DATA> 10
Do you want the area to the left of a value? (Y or N)
n
Do you want the area to the right of a value? (Y or N)
y
Enter the value for which you want the area to the right.
DATA> 2.32
...working...
F Curve Area
Data Display
    mode   std dev
0.818182  0.968246
MTB > %farea6a
Executing from file: farea6a.MAC
Graphic display of F curve areas
Finds and displays areas to the left or right of a given value
or between two values. (This macro uses C100-C116 and K100-K120)
Enter the degrees of freedom.DF2 must be above 4.
DATA> 10
DATA> 10
Do you want the area to the left of a value? (Y or N)
y
Enter the value for which you want the area to the left.
DATA> .431
...working...
F Curve Area
Data Display
    mode   std dev
0.818182  0.968246
MTB >