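252soln0 2/3/00

PROBLEM A2. If $n = 64$ and $\bar{x} = 11.50$, find 95% confidence intervals for the mean under the following circumstances:
a. $\sigma = 6.30$, $N = 3000$
b. $\sigma = 6.30$, $N = 300$
c. $s = 6.30$, $N = 3000$
d. $s = 6.30$, $N = 300$

SOLUTION: Use the formulas from Table 3 of the syllabus supplement or from the outline.

a. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{6.30}{\sqrt{64}} = 0.7875$ and $z_{\alpha/2} = z_{.025} = 1.960$, so
$\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(0.7875) = 11.50 \pm 1.54$, or 9.96 to 13.04.

b. The sample is a large fraction of the population (64 out of 300), so the finite population correction is applied:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{6.30}{\sqrt{64}}\sqrt{\frac{300-64}{300-1}} = 0.7875\sqrt{\frac{236}{299}} = 0.7875(0.8884) = 0.6996$ and $z_{\alpha/2} = z_{.025} = 1.960$, so
$\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(0.6996) = 11.50 \pm 1.37$, or 10.13 to 12.87.

c. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{6.30}{\sqrt{64}} = 0.7875$ and $t^{(n-1)}_{\alpha/2} = t^{(63)}_{.025} = 1.998$, so
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 11.50 \pm 1.998(0.7875) = 11.50 \pm 1.57$, or 9.93 to 13.07.

d. $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{6.30}{\sqrt{64}}\sqrt{\frac{300-64}{300-1}} = 0.6996$ and $t^{(n-1)}_{\alpha/2} = t^{(63)}_{.025} = 1.998$, so
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 11.50 \pm 1.998(0.6996) = 11.50 \pm 1.40$, or 10.10 to 12.90.

PROBLEM A3. In a study of a grain market in an African country we want to figure out how large a sample we must take to find a daily average price for a grain transaction. (Assume a standard deviation of 5 cents.)
a. We want a 99% confidence interval for the mean with an error of ±1 cent.
b. What if the error is to be ±1/2 cent?

SOLUTION: We use the formula $n = \frac{z^2\sigma^2}{e^2}$, where $z = z_{\alpha/2} = z_{.005}$ since $\alpha = .01$.

a. We are told that the maximum error must be $e = 1$ (or $e = .01$) and that $\sigma = 5$ (or $\sigma = .05$). From the t table, $z_{.005} = 2.576$, so that $n = \frac{z^2\sigma^2}{e^2} = \frac{(2.576)^2(5)^2}{1^2} = 165.89$. Since we always round this quantity up, use a sample size of at least 166. Note that if we use $n = 165$ and assume that $\bar{x} = 90$, we find that $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 90 \pm 2.576\frac{5}{\sqrt{165}} = 90 \pm 1.003$, so the error term will be slightly above 1. However, if we use $n = 166$, $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 90 \pm 2.576\frac{5}{\sqrt{166}} = 90 \pm 1.000$.

b. This time the maximum allowable error is $e = 0.5$, so $n = \frac{z^2\sigma^2}{e^2} = \frac{(2.576)^2(5)^2}{(0.5)^2} = 663.57$ and we must use a sample size of 664. Note that this sample size is four times the size in part a.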
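The interval and sample-size formulas used in Problems A2 and A3 can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `mean_ci` and `sample_size` are illustrative and do not come from the syllabus supplement or the outline. The t value 1.998 is taken from the solution above rather than computed, and the z values come from `statistics.NormalDist`, so results may differ from table-based figures in the last digit.

```python
import math
from statistics import NormalDist


def mean_ci(xbar, sd, n, crit, N=None):
    """Return (low, high) for xbar +/- crit * standard error.

    If a population size N is supplied, the finite population
    correction sqrt((N - n) / (N - 1)) is applied (hypothetical helper).
    """
    se = sd / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    half_width = crit * se
    return xbar - half_width, xbar + half_width


def sample_size(sd, e, conf):
    """Smallest n with z * sd / sqrt(n) <= e, i.e. n = z^2 sd^2 / e^2 rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z * sd / e) ** 2)


z = NormalDist().inv_cdf(0.975)                 # about 1.960
ci_a = mean_ci(11.50, 6.30, 64, z)              # A2 part a, no correction
ci_b = mean_ci(11.50, 6.30, 64, z, N=300)       # A2 part b, with correction
ci_d = mean_ci(11.50, 6.30, 64, 1.998, N=300)   # A2 part d, t value from the table
n_a = sample_size(5, 1, 0.99)                   # A3 part a
n_b = sample_size(5, 0.5, 0.99)                 # A3 part b

print(ci_a, ci_b, ci_d, n_a, n_b)
```

Halving the allowable error multiplies the unrounded sample size by four, because $e$ enters the formula squared.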
PROBLEM A4. If $s = 15$, find a 95% confidence interval for $\sigma$ if a) $n = 26$, b) $n = 99$.

SOLUTION: Use the formulas from Table 3 of the syllabus supplement or from the outline.

a. This is a small sample since $n \le 31$, so use $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$. Since the degrees of freedom are $n - 1 = 26 - 1 = 25$, $\chi^{2(25)}_{.025} = 40.6466$ and $\chi^{2(25)}_{.975} = 13.1197$, the interval becomes $\frac{25(15)^2}{40.6466} \le \sigma^2 \le \frac{25(15)^2}{13.1197}$, or $138.388 \le \sigma^2 \le 428.745$. Since an interval for the standard deviation was requested, take the square root of both sides: $11.76 \le \sigma \le 20.71$.

b. Since the degrees of freedom are $n - 1 = 99 - 1 = 98$ and are too large for the chi-square table, use $\frac{s\sqrt{2DF}}{z_{\alpha/2} + \sqrt{2DF}} \le \sigma \le \frac{s\sqrt{2DF}}{-z_{\alpha/2} + \sqrt{2DF}}$. Since $\sqrt{2DF} = \sqrt{2(98)} = \sqrt{196} = 14$ and $z_{\alpha/2} = z_{.025} = 1.960$, the formula becomes $\frac{15(14)}{1.960 + 14} \le \sigma \le \frac{15(14)}{-1.960 + 14}$, or $13.158 \le \sigma \le 17.442$. Note that due to the larger sample size, this interval is smaller than the one in a.

PROBLEM A5.
a. Find the confidence level for an interval for the median using binomial tables, if from a sample of 12 we take the third observation from both ends.
b. Do the same for the 19th observation from both ends in a sample of 50.
c. Do the same for an interval using the 10th observation from both ends in a sample of 40, using the normal approximation to the binomial distribution.
d. In part c, try to find a 95% confidence interval for the median.

SOLUTION:
a) If we take the third number from both the bottom and the top of the data, we get the interval $x_3 \le \text{median} \le x_{10}$ from the ordered numbers $x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9, x_{10}, x_{11}, x_{12}$. For example, if the numbers are 13.2, 17.1, 18.5, 21.3, 21.4, 22.0, 27.1, 27.7, 28.9, 29.2, 35.4, 35.9, we would say that the interval is $18.5 \le \text{median} \le 29.2$. To find the confidence level, first find the significance level $\alpha$, the probability that the interval is wrong. The interval will be wrong if (i) $x_3$ through $x_{10}$ are all below the median or (ii) $x_3$ through $x_{10}$ are all above the median.
The probabilities of these two events are the same, so we can figure out the probability that $x_3$ through $x_{10}$ are all below the median and double it. The probability of any given number being below (or above) the median is 0.5, and the probability that $x_3$ through $x_{10}$ are all below the median is the probability that the first 10 or more numbers are all below the median, which is the same as the probability of getting ten or more heads in twelve flips of a coin.

From the binomial table for $n = 12$ and $p = .5$, we find that $P(x \ge 10) = 1 - P(x \le 9) = 1 - .98071 = .01929$. But note that the binomial distribution with $p = .5$ is symmetrical, so that $P(x \ge 10) = P(x \le 2)$. Also remember that we stated in the previous paragraph that to get the significance level we must double this probability, so that $\alpha = 2(.01929) = .03858$. Thus the confidence level is $1 - \alpha = 1 - 2(.01929) = .96142$. More generally, if $k$ is the index of the number at the bottom of the confidence interval (in the case we just did, $k = 3$), the confidence level is $1 - \alpha = 1 - 2P(x \le k - 1)$.

b) If we take a sample of $n = 50$, put it in order, and then pick the 19th number ($k = 19$) from both the top and the bottom, so that the confidence interval is $x_{19} \le \text{median} \le x_{32}$, the confidence level is $1 - \alpha = 1 - 2P(x \le k - 1) = 1 - 2P(x \le 19 - 1) = 1 - 2P(x \le 18) = 1 - 2(.03245) = .93510$.

This can also be done using the Normal distribution. If we ignore the continuity correction, and recall that for the binomial distribution with $p = .5$ and $q = 1 - p = .5$, $\mu = np = .5n$ and $\sigma^2 = npq = n(.5)(.5) = .25n$, then
$P(x \le k - 1) \approx P\left(z \le \frac{k - 1 - \mu}{\sigma}\right) = P\left(z \le \frac{k - 1 - .5n}{\sqrt{.25n}}\right) = P\left(z \le \frac{18 - .5(50)}{.5\sqrt{50}}\right) = P(z \le -1.98) = .5 - .4761 = .0239$
and $1 - \alpha = 1 - 2P(x \le k - 1) = 1 - 2(.0239) = .9522$. This looks way off, so try the same problem with a continuity correction:
$P(x \le k - 1) = P(x \le 18) \approx P\left(z \le \frac{18 + .5 - .5(50)}{.5\sqrt{50}}\right) = P(z \le -1.84) = .5 - .4671 = .0329$
and the confidence level is $1 - \alpha = 1 - 2P(x \le k - 1) = 1 - 2(.0329) = .9342$.
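The rule $1 - \alpha = 1 - 2P(x \le k - 1)$ from parts a and b can be evaluated exactly without a binomial table. This is a small sketch using `math.comb` from the Python standard library; the function name `median_ci_level` is made up for illustration.

```python
from math import comb


def median_ci_level(n, k):
    """Confidence level of the interval from the k-th smallest to the
    k-th largest of n observations: 1 - 2 * P(X <= k - 1),
    where X ~ Binomial(n, 0.5)."""
    # P(X <= k - 1) = sum of C(n, i) for i = 0 .. k-1, divided by 2^n
    lower_tail = sum(comb(n, i) for i in range(k)) / 2 ** n
    return 1 - 2 * lower_tail


level_a = median_ci_level(12, 3)    # part a: third observation, n = 12
level_b = median_ci_level(50, 19)   # part b: 19th observation, n = 50

print(round(level_a, 5), round(level_b, 4))
```

For part a the lower tail is $(1 + 12 + 66)/4096 = 79/4096$, which reproduces the table value .01929.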
c) If $n = 40$ and $k = 10$ we have no binomial table, so use the normal approximation to the binomial distribution with a continuity correction:
$P(x \le k - 1) \approx P\left(z \le \frac{k - 1 + .5 - \mu}{\sigma}\right) = P\left(z \le \frac{k - 1 + .5 - .5n}{\sqrt{.25n}}\right) = P\left(z \le \frac{9 + .5 - .5(40)}{.5\sqrt{40}}\right) = P(z \le -3.32) = .5 - .4995 = .0005$
and the confidence level is $1 - \alpha = 1 - 2P(x \le k - 1) = 1 - 2(.0005) = .9990$.

d) If we want a 95% confidence interval and $n = 40$, we require that $1 - \alpha = 1 - 2P(x \le k - 1) = 1 - 2(.025) = .95$. This means that $P(x \le k - 1) \approx P\left(z \le \frac{k - 1 + .5 - .5n}{\sqrt{.25n}}\right) = .025$. But since $z_{.025} = 1.960$, we know that $P(z \le -1.960) = .025$. So we can say that $\frac{k - 1 + .5 - .5n}{\sqrt{.25n}} = -1.960$. Solve this with $n = 40$, or note that $k - 1 + .5 - .5n = -1.960\sqrt{.25n}$ and, solving for $k$, we find $k = .5 + .5n - 1.960\sqrt{.25n}$. If we substitute $n = 40$, $k = .5 + .5(40) - 1.960\sqrt{.25(40)} = 20.5 - 1.960\sqrt{10} = 20.5 - 6.20 = 14.30$. We could also follow the formula in the outline that says $k = \frac{n + 1 - z_{\alpha/2}\sqrt{n}}{2} = \frac{40 + 1 - 1.960\sqrt{40}}{2} = 14.30$. Obviously $k$ must be a whole number, and the more conservative choice would be to round it down, so that the interval is $x_{14} \le \text{median} \le x_{27}$.

PROBLEM B.1 A firm claims that its median wage is $32000. The union claims that the median is lower. A random sample of 100 employees shows that 40% are above $32000. Set this up as two hypotheses and test with a significance level of 5%.

SOLUTION: We always replace a hypothesis about a median with a hypothesis about a proportion in the sign test. The statement implicit in the above is that the median is at least $32000. Since we have the number over $32000, let $p$ be the proportion over $32000. Then
$H_0: \text{median} \ge 32000$, $H_1: \text{median} < 32000$ becomes $H_0: p \ge .5$, $H_1: p < .5$.
Note that $\alpha = .05$, $n = 100$, $x = 40$ and $\bar{p} = \frac{x}{n} = .40$, so that $\sigma_p = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.5(.5)}{100}} = .05$. Note that a continuity correction has been added to all of these solutions. It has the effect of making the "accept" region larger by $x \pm 0.5$ or $\bar{p} \pm \frac{.5}{n}$.

(i) Critical Value Method: $p_{cv} = p_0 - z_\alpha \sigma_p - \frac{.5}{n} = .5 - 1.645(.05) - \frac{.5}{100} = .5 - .08225 - .005 = .41275$. If $\bar{p}$ is below this critical value we reject $H_0$. Since .40 is below .41275, reject $H_0$.

(ii) Test Ratio Method: There are three possible versions. In all those below, the rule frequently used is: where $\pm$ appears, use $+$ if $x < \frac{n}{2}$ and $-$ if $x > \frac{n}{2}$.
With $z = \frac{\bar{p} \pm \frac{.5}{n} - p_0}{\sigma_p}$: $P(\bar{p} \le .40) = P\left(z \le \frac{.40 + \frac{0.5}{100} - .5}{.05}\right) = P(z \le -1.90) = .5 - .4713 = .0287$.
With $z = \frac{x \pm .5 - np_0}{\sqrt{np_0 q_0}}$: $P(x \le 40) = P\left(z \le \frac{40 + 0.5 - 100(.5)}{\sqrt{100(.5)(.5)}}\right) = P(z \le -1.90) = .0287$.
With $z = \frac{2x \pm 1 - n}{\sqrt{n}}$: $P(x \le 40) = P\left(z \le \frac{2(40) + 1 - 100}{\sqrt{100}}\right) = P(z \le -1.90) = .0287$.
In each case, the p-value is .0287. Since $\alpha = .05$, p-value $< \alpha$ and we reject $H_0$.

PROBLEM B.2 We are testing that the median is 14. Let $x$ be the number of items above 14. From a sample of size $n = 30$, we find $x = 25$. Use $p$ for the proportion of the population over 14 and $\bar{p}$ for the proportion of the sample over 14.
a) Test median = 14
b) Test median > 14
c) Test median < 14

SOLUTION: Note that $\bar{p} = \frac{25}{30} = .8333$. Assume $\alpha = .05$.

a) $H_0: \text{median} = 14$, $H_1: \text{median} \ne 14$ becomes $H_0: p = .5$, $H_1: p \ne .5$. If we use the critical value method,
$p_{cv} = p_0 \pm \left(z_{\alpha/2}\sqrt{\frac{p_0 q_0}{n}} + \frac{0.5}{n}\right) = .5 \pm \left(1.96\sqrt{\frac{.5(.5)}{30}} + \frac{.5}{30}\right) = .5 \pm (.179 + .017) = .5 \pm 0.196$,
or .304 to .696. Since .8333 is not in this interval, reject $H_0$. We are probably better off using the test ratio method, with $z = \frac{x \pm .5 - np_0}{\sqrt{np_0 q_0}}$. Here $np_0 = 30(.5) = 15$ and $np_0 q_0 = 15(.5) = 7.5$. So
p-value $= 2P(x \ge 25) = 2P\left(z \ge \frac{24.5 - 15}{\sqrt{7.5}}\right) = 2P(z \ge 3.47) = 2(.5 - .4997) = 2(.0003) = .0006$.
Since this is below the significance level, reject $H_0$.

b) $H_0: \text{median} \le 14$, $H_1: \text{median} > 14$ becomes $H_0: p \le .5$, $H_1: p > .5$. In this case p-value $= P(x \ge 25) = P\left(z \ge \frac{24.5 - 15}{\sqrt{7.5}}\right) = .0003$. Since this is below the significance level, reject $H_0$.

c) $H_0: \text{median} \ge 14$, $H_1: \text{median} < 14$ becomes $H_0: p \ge .5$, $H_1: p < .5$. In this case, it is possible to have many items over 14 and for $H_0$ still to be true. p-value $= P(x \le 25) = P\left(z \le \frac{25.5 - 15}{\sqrt{7.5}}\right) = P(z \le 3.83) = .5 + .4999 = .9999$. Since this is above the significance level, accept $H_0$.
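The test ratio version $z = \frac{x \pm .5 - np_0}{\sqrt{np_0 q_0}}$ with $p_0 = .5$ lends itself to a short numerical check. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `sign_test_pvalue` and its `alternative` argument are hypothetical conveniences, not notation from the outline. Because it uses the exact Normal cdf instead of a two-decimal z table, its p-values differ slightly from the table-based figures in Problems B.1 and B.2.

```python
import math
from statistics import NormalDist


def sign_test_pvalue(x, n, alternative):
    """Sign-test p-value via the Normal approximation to Binomial(n, .5)
    with a continuity correction.

    alternative is the direction of H1 for p: 'less', 'greater' or 'two-sided'.
    """
    mean = 0.5 * n
    sd = math.sqrt(0.25 * n)
    nd = NormalDist()
    lower = nd.cdf((x + 0.5 - mean) / sd)        # approximates P(X <= x)
    upper = 1 - nd.cdf((x - 0.5 - mean) / sd)    # approximates P(X >= x)
    if alternative == "less":
        return lower
    if alternative == "greater":
        return upper
    if alternative == "two-sided":
        return min(1.0, 2 * min(lower, upper))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


p_b1 = sign_test_pvalue(40, 100, "less")         # Problem B.1
p_b2a = sign_test_pvalue(25, 30, "two-sided")    # Problem B.2 a
p_b2b = sign_test_pvalue(25, 30, "greater")      # Problem B.2 b
p_b2c = sign_test_pvalue(25, 30, "less")         # Problem B.2 c

print(p_b1, p_b2a, p_b2b, p_b2c)
```

In each case the decision is made by comparing the returned p-value with the significance level, as in the solutions above.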
PROBLEM B.3 A bank's average default rate on loans is supposedly 6 per month. In the first month there are 12 defaults. Test the first assertion assuming a Poisson distribution. Use a two-sided test with a 5% significance level.

SOLUTION: $H_0$: Poisson(6), $H_1$: not Poisson(6). Though it is possible to put together a rejection region, the easiest way to do this is to use the Poisson(6) table and a p-value approach. If we look up the probability that $x$ is 12 or larger we find: p-value $= 2P(x \ge 12) = 2(1 - P(x \le 11)) = 2(1 - .9799) = 2(.0201) = .0402$. Since p-value $< \alpha$, reject $H_0$.

PROBLEM B.4
a. I claim that $x$ is binomially distributed with $p = .01$. Test this assertion using a 2-sided 5% test if there are 3 successes in 10 trials.
b. Test for a binomial distribution with $p = .10$ when $n = 10$ and $x = 4$.
c. If $n = 100$ and $x = 9$, test to see if $p$ is at least 0.4.
d. Calls coming into a switchboard in an hour presumably have a Poisson distribution with a mean of 144. Test this hypothesis if, in a given hour, 200 calls come in.

SOLUTION:
a. If we assume that $x$ has the Binomial distribution, our hypotheses are $H_0$: Binomial, $p = .01$ and $H_1$: not Binomial, $p = .01$. If we have a Binomial table for $p = .01$, note that $\mu = np = 10(.01) = .1$, so that our value of $x$ is too large. p-value $= 2P(x \ge 3) = 2(1 - P(x \le 2)) = 2(1 - .99989) = .00022$. This is below the significance level, so reject $H_0$.

b. If $p = .10$ and $n = 10$, $\mu = np = 1$, so that $x = 4$ is too large. p-value $= 2P(x \ge 4) = 2(1 - P(x \le 3)) = 2(1 - .9872) = 2(.0128) = .0256$. This is below the significance level, so reject $H_0$.

c. Our hypotheses are now $H_0$: Binomial, $p \ge .4$ and $H_1$: Binomial, $p < .4$. Since $n = 100$ and $x = 9$, $\mu = np = 40$ and $x$ is too small. From the binomial table for $p = .4$, p-value $= P(x \le 9) = .00000$, so reject $H_0$. If a table with $n = 100$ is unavailable, use the Normal approximation:
p-value $= P(x \le 9) = P(\bar{p} \le .09) \approx P\left(z \le \frac{.09 + \frac{.5}{100} - .4}{\sqrt{\frac{.4(.6)}{100}}}\right) = P\left(z \le \frac{.09 + .005 - .4}{\sqrt{.0024}}\right) = P(z \le -6.23) \approx 0$, so reject $H_0$.

d. This is a Poisson problem, but a table for Poisson(144) is not available. Fortunately, for large values of $m$, the Poisson mean, $x \sim N(m, \sqrt{m})$. Since there are no specific requirements, assume that a 2-sided 95% test is wanted. $H_0$: Poisson(144), $H_1$: not Poisson(144). Then $z = \frac{x - m}{\sqrt{m}} = \frac{200 - 144}{\sqrt{144}} = 4.67$. Since $z_{\alpha/2} = z_{.025} = 1.96$, and our test ratio is not between $\pm 1.96$, reject $H_0$.

PROBLEM B.5 If $\sum (x - \bar{x})^2 = \sum x^2 - n\bar{x}^2 = 40$ and the confidence level is 95%, test if it is true that the variance is 2 when a) $n = 10$, b) $n = 20$, c) $n = 40$.

SOLUTION: We are testing $H_0: \sigma^2 = 2$ against $H_1: \sigma^2 \ne 2$. From the outline, since $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$, $\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} = \frac{\sum (x - \bar{x})^2}{\sigma_0^2} = \frac{40}{2} = 20$ in all cases. Since the confidence level is 95%, all we really need to do is find out whether our value of $\chi^2$ falls between $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$, in this case $\chi^2_{.975}$ and $\chi^2_{.025}$.

a. $n = 10$ implies 9 degrees of freedom. $\chi^{2(9)}_{.975} = 2.700$ and $\chi^{2(9)}_{.025} = 19.023$. Since our value of $\chi^2$ does not fall between them, reject $H_0$.

b. $n = 20$ implies 19 degrees of freedom. $\chi^{2(19)}_{.975} = 8.907$ and $\chi^{2(19)}_{.025} = 32.852$. Since our value of $\chi^2$ falls between them, do not reject $H_0$.

c. $n = 40$ implies 39 degrees of freedom. Because we are beyond the $\chi^2$ table, we must use the approximation $z = \sqrt{2\chi^2} - \sqrt{2DF - 1}$. We already know that $\chi^2 = 20$, so that $z = \sqrt{2(20)} - \sqrt{2(39) - 1} = 6.32 - 8.77 = -2.45$. For a confidence level of 95%, $z$ must be between $-1.96$ and $1.96$. Since this value of $z$ is not, reject $H_0$.

PROBLEM C.1 Assume that $\sigma = 4$ and $n = 70$. Find the critical values, power function and operating characteristic curve for $H_0: \mu \ge 50$, $H_1: \mu < 50$. Use a significance level of 5 percent.

SOLUTION:
a) First, state the problem and find a critical value or values. $H_0: \mu \ge 50$, $H_1: \mu < 50$, $\sigma = 4$, $n = 70$, $\alpha = .05$, so $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{70}} = 0.47809$. Since this is a one-sided test, the formula for a two-sided critical value $\bar{x}_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar{x}}$ becomes $\bar{x}_{cv} = \mu_0 - z_\alpha\,\sigma_{\bar{x}}$, so that $\bar{x}_{cv} = 50 - 1.645(0.47809) = 49.2135$. So we will not reject $H_0$ if the sample mean $\bar{x}$ is greater than or equal to 49.2135.

b) Decide on what values of $\mu_1$ to use to compute $\beta$, the probability of a type II error. The usual set of values includes the mean from the null hypothesis, the critical value, a point about midway between these values and two points, one further out beyond the critical value by a distance equal to the distance between the null hypothesis mean and the critical value, and another halfway between this point and the critical value. We thus choose 50, 49.2135, and 49.6, which is about halfway between them. Since the difference between 50 and 49.2135 is about 0.8, the lowest value of $\mu_1$ that we use is 48.4, and a point about halfway between 48.4 and 49.2135 is 48.8.

c) Compute $\beta$ for each value of $\mu_1$. Since a type II error is wrongly 'accepting' the null hypothesis, we compute the probability that the sample mean will be above or equal to the critical value for each value of $\mu_1$. Our computations are below. Note that, in general, for this one-sided hypothesis $\beta = P\left(z \ge \frac{\bar{x}_{cv} - \mu_1}{\sigma_{\bar{x}}}\right)$.

$\mu_1 = 50$: $\beta = P(\bar{x} \ge 49.2135 \mid \mu = 50) = P\left(z \ge \frac{49.2135 - 50}{.47809}\right) = P(z \ge -1.645) = .95$; power $= 1 - \beta = .05$.
$\mu_1 = 49.6$: $\beta = P(\bar{x} \ge 49.2135 \mid \mu = 49.6) = P\left(z \ge \frac{49.2135 - 49.6}{.47809}\right) = P(z \ge -0.81) = .2910 + .5 = .7910$; power $= 1 - \beta = .2090$.
$\mu_1 = 49.2135$: $\beta = P(\bar{x} \ge 49.2135 \mid \mu = 49.2135) = P\left(z \ge \frac{49.2135 - 49.2135}{.47809}\right) = P(z \ge 0) = .5000$; power $= 1 - \beta = .5000$.
$\mu_1 = 48.8$: $\beta = P(\bar{x} \ge 49.2135 \mid \mu = 48.8) = P\left(z \ge \frac{49.2135 - 48.8}{.47809}\right) = P(z \ge 0.86) = .5 - .3051 = .1949$; power $= 1 - \beta = .8051$.
$\mu_1 = 48.4$: $\beta = P(\bar{x} \ge 49.2135 \mid \mu = 48.4) = P\left(z \ge \frac{49.2135 - 48.4}{.47809}\right) = P(z \ge 1.70) = .5 - .4554 = .0446$; power $= 1 - \beta = .9554$.

PROBLEM C.2 A hardware firm charges a flat rate for mailing of small tools based on an average weight of 20 oz. with a standard deviation of 3.60 oz.
A consultant challenges this assumption and a sample of 100 packages is taken. Find critical values for a significance level of 1% and compute the power function and operating characteristic curve.

SOLUTION:
a) First, state the problem and find a critical value or values. $H_0: \mu = 20$, $H_1: \mu \ne 20$, $\sigma = 3.60$, $n = 100$, $\alpha = .01$, so $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3.60}{\sqrt{100}} = 0.360$. Since this is a two-sided test, the formula for a critical value is $\bar{x}_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar{x}}$, so that $\bar{x}_{cv} = 20 \pm 2.576(0.360) = 20 \pm 0.927$. So we will not reject $H_0$ if the sample mean $\bar{x}$ is between 19.073 and 20.927.

b) Decide on what values of $\mu_1$ to use to compute $\beta$, the probability of a type II error. The usual set of values includes the mean from the null hypothesis, the critical values, a point about midway between these values and two points, one further out beyond the critical value by a distance equal to the distance between the null hypothesis mean and the critical value, and another halfway between this point and the critical value. We thus choose the null hypothesis mean, 20, and the two critical values 19.073 and 20.927. The points 19.5 and 20.5 are about halfway between 20 and the critical values. Since the difference between 20 and the critical values is about 1.0, the lowest value of $\mu_1$ that we use is 18.0 and the highest is 22.0. Points about halfway between these numbers and the critical values are 18.5 and 21.5.

c) Compute $\beta$ for each value of $\mu_1$. Since a type II error is wrongly 'accepting' the null hypothesis, we compute the probability that the sample mean will be between the critical values for each value of $\mu_1$. Our computations are below. Note that, in general, for a two-sided hypothesis $\beta = P\left(\frac{\bar{x}_{cv1} - \mu_1}{\sigma_{\bar{x}}} \le z \le \frac{\bar{x}_{cv2} - \mu_1}{\sigma_{\bar{x}}}\right)$.

$\mu_1 = 20$: $\beta = P(19.073 \le \bar{x} \le 20.927 \mid \mu = 20) = P\left(\frac{19.073 - 20}{0.360} \le z \le \frac{20.927 - 20}{0.360}\right) = P(-2.575 \le z \le 2.575) = 2(.4950) = .99$.
$\mu_1 = 20.5$ or 19.5: $\beta = P(19.073 \le \bar{x} \le 20.927 \mid \mu = 20.5) = P\left(\frac{19.073 - 20.5}{0.360} \le z \le \frac{20.927 - 20.5}{0.360}\right) = P(-3.96 \le z \le 1.19) = .5 + .3830 = .8830$.
$\mu_1 = 20.927$ or 19.073: $\beta = P(19.073 \le \bar{x} \le 20.927 \mid \mu = 20.927) = P\left(\frac{19.073 - 20.927}{0.360} \le z \le \frac{20.927 - 20.927}{0.360}\right) = P(-5.15 \le z \le 0.00) = .5000$.
$\mu_1 = 21.5$ or 18.5: $\beta = P(19.073 \le \bar{x} \le 20.927 \mid \mu = 21.5) = P\left(\frac{19.073 - 21.5}{0.360} \le z \le \frac{20.927 - 21.5}{0.360}\right) = P(-6.74 \le z \le -1.59) = .5 - .4441 = .0559$.
$\mu_1 = 22.0$ or 18.0: $\beta = P(19.073 \le \bar{x} \le 20.927 \mid \mu = 22.0) = P\left(\frac{19.073 - 22.0}{0.360} \le z \le \frac{20.927 - 22.0}{0.360}\right) = P(-8.13 \le z \le -2.98) = .5 - .4986 = .0014$.

If we round these results, we get the following values for the operating characteristic ($\beta$) and power:

mu_1     22.0   21.5   20.9   20.5   20.0   19.5   19.1   18.5   18.0
beta      .00    .06    .50    .88    .99    .88    .50    .06    .00
power    1.00    .94    .50    .12    .01    .12    .50    .94   1.00
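The $\beta$ computations in Problems C.1 and C.2 follow one pattern: find the critical value(s) under $H_0$, then find the probability that $\bar{x}$ lands in the "accept" region when the true mean is $\mu_1$. A minimal sketch of that pattern, using `statistics.NormalDist` from the Python standard library, is given here; the function names `beta_lower_tail` and `beta_two_sided` are illustrative only. Since the exact Normal cdf is used instead of a two-decimal z table, the values agree with the hand computations only to about two or three decimal places.

```python
import math
from statistics import NormalDist

ND = NormalDist()


def beta_lower_tail(mu1, mu0, sigma, n, alpha):
    """Type II error probability for H0: mu >= mu0 against H1: mu < mu0."""
    se = sigma / math.sqrt(n)
    cv = mu0 - ND.inv_cdf(1 - alpha) * se        # single lower critical value
    return 1 - ND.cdf((cv - mu1) / se)           # P(xbar >= cv | mu = mu1)


def beta_two_sided(mu1, mu0, sigma, n, alpha):
    """Type II error probability for H0: mu = mu0 against H1: mu != mu0."""
    se = sigma / math.sqrt(n)
    z = ND.inv_cdf(1 - alpha / 2)
    lo, hi = mu0 - z * se, mu0 + z * se          # the two critical values
    return ND.cdf((hi - mu1) / se) - ND.cdf((lo - mu1) / se)


# Problem C.1: sigma = 4, n = 70, alpha = .05
c1 = {mu1: beta_lower_tail(mu1, 50, 4, 70, 0.05)
      for mu1 in (50, 49.6, 48.8, 48.4)}

# Problem C.2: sigma = 3.60, n = 100, alpha = .01
c2 = {mu1: beta_two_sided(mu1, 20, 3.60, 100, 0.01)
      for mu1 in (18.0, 18.5, 19.5, 20.0, 20.5, 21.5, 22.0)}

for mu1, beta in c2.items():
    print(mu1, round(beta, 4), round(1 - beta, 4))   # mu_1, beta, power
```

Plotting $\beta$ against $\mu_1$ gives the operating characteristic curve, and plotting $1 - \beta$ gives the power function.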