The Central Limit Theorem (CLT)
Given a large random sample of size n from a population with mean μ and standard deviation σ, the sample mean X̄ is approximately normally distributed with mean and standard deviation given by
μ_X̄ = μ and σ_X̄ = σ/√n
Note: 1. n > 30 is usually large enough for the CLT to apply.
2. If the population from which we sample is normal, then X̄ is exactly normally distributed with the mean and standard deviation above for any sample size.
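For readers who want to see the CLT at work outside Minitab, here is a minimal simulation sketch in Python (NumPy assumed available; the exponential population and the sample size are illustrative choices, not from the notes):

import numpy as np

rng = np.random.default_rng(1)

# A deliberately non-normal population: exponential with mean 2 (assumed for illustration).
population_mean, n, reps = 2.0, 40, 10_000

# Draw many samples of size n and record each sample mean.
sample_means = rng.exponential(scale=population_mean, size=(reps, n)).mean(axis=1)

# By the CLT the means should be roughly normal with mean mu and sd sigma/sqrt(n).
print("mean of X-bar:", sample_means.mean())        # close to 2
print("sd of X-bar:  ", sample_means.std(ddof=1))   # close to 2/sqrt(40), about 0.316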
Empirical Rule for X
Consider a sample of size n from a population with mean  and standard deviation
. Suppose X is normal ( or approximately normal), with  =  and  = /n
X
X
(This would be the case if the population is normal or if the sample size is large).
Find the probability that X will be within (a) 2 of  (b) 3
of  .
X
X
(a) P( X will be within 2
of  )
X
=
(2)
(b) P( X will be within 3 of  )
X
=
In general the statement “X will be within k of  “ means that X lies between
X
-k
X
and
 +k 
X
If X is normal ( or approximately normal), then
P( X will be within k of  ) = P(-k < Z <k)
X
(3)
Z Confidence Interval
Suppose we are given the following:
Normal Population: Scores on a standardized test.
Population Mean: μ (unknown)
Population S.D.: σ = 1.5
To estimate μ we will take an srs of size n = 25 and use X̄ as our estimator. Recall that since the population is normal, X̄ is normally distributed with μ_X̄ = μ and σ_X̄ = σ/√n = 1.5/5 = .3.
We would like to be able to express this estimate in the form X̄ ± E or (X̄ − E, X̄ + E). Here E is some error which determines the accuracy of our estimate. Let's take E = 2σ_X̄ for now.
Thus we have
For any given sample this interval may or may not contain the true mean μ. It would be useful to know what the probability is that this interval covers μ.
If the interval covers the true mean μ, then μ is somewhere in the interval, so that X̄ is in fact within 2σ_X̄ (= 0.6) of μ.
Thus P[(X̄ − 2σ_X̄, X̄ + 2σ_X̄) covers μ]
= P(X̄ is within 2σ_X̄ of μ)
=
=
To make the probability above a nice number, .95, we should replace 2 by 1.96. Thus we can say:
"For 95% of all samples of size n = 25, the interval (X̄ − 1.96σ_X̄, X̄ + 1.96σ_X̄) will cover the true value of μ."
Or,
"For 95% of all samples of size n = 25, X̄ will be within 1.96σ_X̄ of the true population mean μ."
The 95% value is called the LEVEL OF CONFIDENCE. This tells us the probability the interval will cover μ.
The 1.96σ_X̄ = .588 is called the margin of error. This tells us how accurate X̄ is (i.e. how close X̄ will be to μ for 95% of all samples).
The interval (X̄ − 1.96σ_X̄, X̄ + 1.96σ_X̄) is called a 95% Z-CONFIDENCE INTERVAL.
The simulation below will illustrate how confidence intervals work.
MTB > random 25 c1-c40;
SUBC> norm 10 1.5.
MTB > zint 95 1.5 c1-c40.
[The first two command lines select 40 random samples, each of size n = 25, from a normal distribution with μ = 10 and σ = 1.5. The third command line forms the 95% Z-confidence interval for each sample.]
Confidence Intervals (The assumed sigma = 1.5)

Variable   N    Mean   StDev   SE Mean          95.0% CI
C1        25  10.459   1.661     0.300  ( 9.871, 11.047)
C2        25   9.826   1.486     0.300  ( 9.238, 10.414)
C3        25  10.388   1.600     0.300  ( 9.800, 10.976)
C4        25   9.741   1.297     0.300  ( 9.153, 10.329)
C5        25  10.441   1.766     0.300  ( 9.853, 11.029)
C6        25  10.331   1.637     0.300  ( 9.743, 10.919)
C7        25   8.941   1.264     0.300  ( 8.353,  9.529)
C8        25  10.205   1.627     0.300  ( 9.617, 10.793)
C9        25  10.163   1.560     0.300  ( 9.575, 10.751)
C10       25  10.009   1.619     0.300  ( 9.421, 10.597)
C11       25  10.455   1.787     0.300  ( 9.867, 11.043)
C12       25  10.365   1.220     0.300  ( 9.777, 10.953)
C13       25  10.626   1.475     0.300  (10.038, 11.214)
C14       25  10.090   1.677     0.300  ( 9.502, 10.678)
C15       25  10.339   1.103     0.300  ( 9.751, 10.927)
C16       25  10.208   1.480     0.300  ( 9.620, 10.796)
C17       25  10.356   1.508     0.300  ( 9.768, 10.944)
C18       25   9.943   1.388     0.300  ( 9.355, 10.531)
C19       25  10.015   1.318     0.300  ( 9.427, 10.603)
C20       25   9.924   1.473     0.300  ( 9.336, 10.512)
C21       25  10.037   1.271     0.300  ( 9.449, 10.625)
C22       25   9.490   1.345     0.300  ( 8.902, 10.078)
C23       25   9.972   1.484     0.300  ( 9.384, 10.560)
C24       25  10.330   1.644     0.300  ( 9.742, 10.918)
C25       25   9.635   1.609     0.300  ( 9.047, 10.223)
C26       25   9.292   1.558     0.300  ( 8.704,  9.880)
C27       25  10.053   1.072     0.300  ( 9.465, 10.641)
C28       25   9.484   1.726     0.300  ( 8.896, 10.072)
C29       25  10.666   1.402     0.300  (10.078, 11.254)
C30       25   9.896   1.640     0.300  ( 9.308, 10.484)
C31       25   9.942   1.583     0.300  ( 9.354, 10.530)
C32       25  10.100   1.657     0.300  ( 9.512, 10.688)
C33       25   9.483   1.496     0.300  ( 8.895, 10.071)
C34       25   9.691   1.623     0.300  ( 9.103, 10.279)
C35       25  10.390   1.369     0.300  ( 9.802, 10.978)
C36       25  10.569   1.178     0.300  ( 9.981, 11.157)
C37       25   9.813   1.326     0.300  ( 9.225, 10.401)
C38       25   9.905   1.489     0.300  ( 9.317, 10.493)
C39       25  10.442   1.405     0.300  ( 9.854, 11.030)
C40       25   9.945   1.919     0.300  ( 9.357, 10.533)
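The same simulation can be reproduced without Minitab. The sketch below (Python with NumPy, assumed available) draws 40 samples of size 25 from N(10, 1.5), forms the 95% z-interval X̄ ± 1.96 σ/√n for each, and counts how many cover μ = 10:

import numpy as np

rng = np.random.default_rng(2024)
mu, sigma, n, samples = 10.0, 1.5, 25, 40

covered = 0
for _ in range(samples):
    x = rng.normal(mu, sigma, size=n)
    margin = 1.96 * sigma / np.sqrt(n)        # 1.96 * 0.3 = 0.588
    lo, hi = x.mean() - margin, x.mean() + margin
    covered += (lo < mu < hi)

print(f"{covered} of {samples} intervals cover mu = {mu}")   # typically about 38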
QUESTIONS
1. (a) In theory, how many of the above intervals would you expect to cover the true population mean μ (= 10)?
(b) In fact, how many actually do?
2. Suppose you selected 40 samples of size n = 25 from a real population (where typically the population mean and standard deviation are unknown).
(a) Could you form a 95% Z-confidence interval for each sample? Explain.
(b) If we knew σ and formed forty 95% Z-confidence intervals, then we would expect 38 intervals to cover the population mean μ. Could you tell which intervals will cover μ? Explain.
Note: (i) 100(1-)% Z-confidence interval of  is given by
X  Z/2  ; where
X
 =  /n
X
(ii) For 95% Z –confidence interval ,  = .05. hence 95% Z-confidence interval of 
is
X  1.96 
;
X
where  =  /n
X
(iii) 99% Z-confidence interval of  is
X  2.576 
;
X
where  =  /n
X
(iv) 90% Z-Confidence Interval of  is
X  1.645 
;
X
where  =  /n
X
(8)
Sample Size for Desired Margin of Error
Problem: Suppose you wish to estimate a population mean μ with a specified margin of error m and level of confidence 100(1−α)%. What sample size should be used?
Solution: We know that m = z_{α/2} σ/√n. Solving this equation for n gives
n = [z_{α/2} σ/m]²
Note: In practice, we will round the sample size up to the next whole number.
Example 6.5; Page 425: Tim Kelley of Example 6.2 has decided that he wants his estimate of his monthly weight accurate to within 2 or 3 pounds with 95% confidence. How many measurements must he take to achieve these margins of error?
For this example it is known that σ ≈ 3.
(i) For 95% confidence and a margin of error of 2 pounds we have
n = [z_{α/2} σ/m]² = [(1.96)(3)/2]² = 8.6 ≈ 9 (always round up to the next whole number)
(ii) For 95% confidence and a margin of error of 3 pounds we have
n = [z_{α/2} σ/m]² = [(1.96)(3)/3]² = 3.8 ≈ 4.
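A small sketch of the same arithmetic in Python (standard library only), reproducing the two margins of error from Example 6.5:

import math

def sample_size(sigma, m, z=1.96):
    # n = (z*sigma/m)^2, rounded up to the next whole number.
    return math.ceil((z * sigma / m) ** 2)

print(sample_size(sigma=3, m=2))   # 9   (8.64 rounded up)
print(sample_size(sigma=3, m=3))   # 4   (3.84 rounded up)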
Problem 6.16; page 432: You are planning a survey of starting salaries for recent
liberal arts major graduates from your college. From a pilot study you estimate that
the standard deviation is about $8000. What sample size do you need to have a
margin of error equal to $ 500 with 95% confidence?
STATISTICAL INFERENCE
Let us begin with a review of some basic definitions.
POPULATION: The set of all measurements or objects of interest in a particular
study.
If the entire population were available for analysis we would know everything about
it. However, in practice one cannot know the entire population because it is either
too expensive, or simply impossible or impractical to examine each member. Thus a
sample from the population is used to obtain information about the population.
SAMPLE: A subset of the population.
The sample picked should be “representative” of the population from which it
comes and should avoid any bias which might skew our view of the population. One
way to achieve this is to use a SIMPLE RANDOM SAMPLE (srs) i.e. a sample
chosen in such a way that each member of the population has an equal chance of
being chosen.
INFERENTIAL STATISTICS: deals with procedures which use the sample to draw
conclusions about the population (from which it was drawn). The procedures of
interest to us are CONFIDENCE INTERVALS and HYPOTHESIS TESTS.
In particular we will be interested in drawing conclusions about certain
characteristics of the population. Such characteristics are known as POPULATION PARAMETERS. Examples of such characteristics are a POPULATION MEAN (denoted by the Greek letter μ) and a POPULATION PROPORTION (denoted by the letter p).
EXAMPLE: Consider the population of weights ( in kg) of all newborn babies in
Canada for a particular year. In this case, the POPULATION MEAN  is the
average weight of all newborns in the population. An investigator may want to use a
simple random sample of these weights to determine if there is sufficient evidence to
answer questions like:
Is  > 3.2 kg? or Is  < 3.2 kg? or Is   3.2 kg?
EXAMPLE: Consider the population of all lakes in Nova Scotia. A biologist may be
interested in the following POPULATION PROPORTION:
p = the proportion of all lakes in Nova Scotia that are seriously affected by acid rain.
She may want to use a simple random sample of lakes from this population to
determine if there is sufficient evidence to answer questions like:
Is p > .7? or, Is p < .7? or, Is p ≠ .7?
When drawing conclusions about a population using information from a sample it is
important to realize that one can NEVER be absolutely certain the conclusion is
correct. This is because a sample, though it may be “representative” of the
population, only contains part of all the information contained in the population.
HYPOTHESIS TESTING
Example: A graduate student claims that over 70% of the lakes in Nova Scotia have
been seriously affected by acid rain. To justify this claim she proposes the following
`test`.
“ Choose a simple random sample of 15 lakes in Nova Scotia. If 11 or more of the
sampled lakes are seriously affected by acid rain, the claim is justified.”
Formally, we set up this test as follows.
First notice that the population of interest to this graduate student is the set of all
lakes in Nova Scotia. The parameter of interest in her investigation is
p=the true proportion of all lakes in Nova Scotia affected by the acid rain [p=the
unknown population proportion].
NULL HYPOTHESIS: What we want to reject; the viewpoint opposite to Ha.
ALTERNATIVE HYPOTHESIS: The research hypothesis; what we want to prove.
H0:
Ha:
TEST STATISTIC ( evidence from the sample used to make a decision)
X=
Distribution of X :
Now in conducting this test we should make use of the fact that large values of X would be consistent with the _______________________ hypothesis that p > .7.
How large an X? Let's pick some number c and decide that if X ≥ c we conclude that _____________. Thus if X < c we must conclude that ___________. The value c is called a CRITICAL VALUE. In this example, the graduate student has decided to use c = 11. Her method for making a decision can be described as follows.
REJECTION OR CRITICAL REGION (rule for making a decision)
Now suppose she conducts her study and observes X ≥ 11. Then she would claim to have shown that Ha: p > .7 is true. If you had to use her study to make a policy decision, the first question you should ask is:
"What is the probability that her claim is wrong? That is, what is the probability of getting X ≥ 11 when in fact H0: p ≤ .7 is true?"
Let's find out by doing the calculations below.

Suppose that H0: p ≤ .7 is true    Probability of a wrong decision, P(Reject H0 | H0 is true)
p = .5                             P(X ≥ 11 | p = .5) = 1 − P(X ≤ 10) =
p = .6                             P(X ≥ 11 | p = .6) = 1 − P(X ≤ 10) =
p = .7                             P(X ≥ 11 | p = .7) = 1 − P(X ≤ 10) =

The error of rejecting H0 when in fact H0 is true is called a TYPE I ERROR. Notice that in this example the largest probability of making a Type I error is _____________ and that it occurs when the value of p is _____________ (that is, on the boundary between H0 and Ha). The largest probability of making a Type I error is called the LEVEL OF SIGNIFICANCE or TYPE I ERROR RATE of the test and is denoted by the Greek letter α.
Conversely, suppose that the graduate student observed X < 11 (i.e. X ≤ 10), thus leading to the claim that H0: p ≤ .7 is true. In this case you should ask:
"What is the probability that her claim is wrong? That is, what is the probability of getting X < 11 when in fact Ha: p > .7 is true?"
Let's find out by doing the calculations below.

Suppose that Ha: p > .7 is true    Probability of a wrong decision, P(Accept H0 | Ha true)    Probability of a correct decision, P(Reject H0 | Ha true)
p = .8                             P(X < 11 | p = .8) = P(X ≤ 10) =                            P(X ≥ 11 | p = .8) = 1 − P(X ≤ 10) =
p = .9                             P(X < 11 | p = .9) = P(X ≤ 10) =                            P(X ≥ 11 | p = .9) = 1 − P(X ≤ 10) =

The error of accepting H0 when in fact Ha is true is called a TYPE II ERROR. For a particular value of p, say p1, in the alternative (i.e. p1 > .7), the probability of making a Type II error is called the TYPE II ERROR RATE evaluated at p = p1. This probability is denoted by β(p1). Thus,
β(p1) = P(Accept H0 | p = p1 in Ha)
Also, for a particular value of p, say p1, in the alternative (i.e. p1 > .7) we can calculate the probability of a correct decision (see the last column of the table above). The probability of making a correct decision, that is, rejecting H0 when in fact Ha is true, is called the POWER OF THE TEST AGAINST THE ALTERNATIVE p1 in Ha and is denoted K(p1). Thus
K(p1) = P(Reject H0 | p = p1 in Ha)
Notice that K(p1) and β(p1) are related by K(p1) = 1 − β(p1). If in fact Ha is true, power is a measure of a test's ability to detect this. For example, if in fact p were actually .8 (.9), this test will detect this with probability ___________ (__________).
A good test, that is one in whose results we can be confident, will be one in which the probabilities of the Type I and Type II errors are small.
The ideas discussed above refer to the ERROR STRUCTURE of a test. A summary is provided below.

                                    ACTUAL SITUATION
DECISION                            H0 is True          Ha is True (H0 is false)
Accept H0 (Do not reject H0)        Correct Decision    Type II Error
Reject H0 (Accept Ha)               Type I Error        Correct Decision
QUESTION: For the student’s test above, state in words the consequence of
making a
(a) Type I Error:
(b) Type II Error:
ERROR RATES AND POWER OF A TEST

TYPE I ERROR: Reject H0 when H0 is true.
P(Type I Error) = P(Reject H0 | H0 true). The largest possible probability of a Type I error is denoted by α and is called the LEVEL OF SIGNIFICANCE or TYPE I ERROR RATE of the test. In calculating α = P(Reject H0 | H0 true), use the value of p right on the boundary between H0 and Ha.

TYPE II ERROR: Accept H0 when Ha is true.
β(p1) = P(Type II Error) = P(Accept H0 | p = p1 in Ha)

POWER AGAINST THE ALTERNATIVE p1:
K(p1) = P(Reject H0 | p = p1 in Ha) = 1 − β(p1)
In the case that Ha is true, power is a measure of the sensitivity of the test, i.e. the ability of the test to detect that Ha is true.
Changing the Rejection Region
Question: If we use the same sample size, how can we modify this test in order to reduce the Type I error rate α?
Suppose we take c = 14, so we reject H0 if X ≥ 14. What is α?
In this case what will happen to the Type II error rate β(p1) and the power K(p1)?
NOTE: Ideally, we would like α and β(p) to be zero and K(p) to be 1; but for fixed n, decreasing α causes β(p) to increase and K(p) to decrease.
NOTE: The only way to decrease both α and β(p) is to increase the sample size.
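The error rates and power above come straight from cumulative binomial probabilities. A sketch of the calculation (Python with scipy.stats, assumed available) for the lake test with n = 15 and the two critical values discussed, c = 11 and c = 14:

from scipy.stats import binom

n = 15

def alpha(c, p0=0.7):
    # Type I error rate: P(X >= c | p = p0), with p0 on the H0/Ha boundary.
    return 1 - binom.cdf(c - 1, n, p0)

def power(c, p1):
    # K(p1) = P(X >= c | p = p1); beta(p1) = 1 - power.
    return 1 - binom.cdf(c - 1, n, p1)

for c in (11, 14):
    print(f"c = {c}: alpha = {alpha(c):.4f}, "
          f"power at p=0.8 = {power(c, 0.8):.4f}, "
          f"power at p=0.9 = {power(c, 0.9):.4f}")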
The P-value
Consider the test: H0: p  .70, Ha: p > .7, n=30; Reject H0 if X  26.
Suppose we conduct the test and observe X to be x0 = 28. According to the rejection
region we would reject H0 . We would in fact have rejected H0 even if our critical
value had been 28. But with a critical value of 28, the type I error rate would be
smaller.
The P-value is the smallest type I error rate at which one can reject H0 on the basis
of the observed outcome x0 . It is obtained by replacing the critical value ‘c’ by x0 in
the calculation of the type I error rate.
P-value = P (X  x0  H0 is true)
For example, consider the cases where x0 is 28 and x0 is 24.
Type I error rate 
P(X26 p =.7)
P-value when x0 = 28
P ( X  28  p =.7)
P-value when x0 = 24
P(X  24  p =.7)
= 1 – P (X  25  p = .7)
= 1 – P (X  27  p =.7)
= 1 – P ( X  23  p =.7)
= 1 - .9698
= 1- .9979
= 1 - .8405
=.0302
=.0021
= .1595
Notice
If x0 is in the rejection region the p-value   .
If x0 is not in the rejection region then the p-value is >  .
Thus it is clear that we can conduct our test at  = .03 without using a rejection
region. We just have to calculate the P-value and use the following rule.
If the P-value   then reject H0 .
If the P-value >  then do not reject H0.
(17)
Summary: Hypothesis Testing
Concept                         Left-Tailed Test                     Right-Tailed Test
Hypotheses                      H0: p ≥ p0, Ha: p < p0               H0: p ≤ p0, Ha: p > p0
Critical Region                 Reject H0 if X ≤ c                   Reject H0 if X ≥ c
Type I Error Rate α
  = P(Reject H0 | H0 true)      P(X ≤ c | p = p0)                    P(X ≥ c | p = p0)
Type II Error Rate β(p1)
  = P(Accept H0 | p = p1 in Ha) P(X > c | p = p1)                    P(X < c | p = p1)
Power K(p1)
  = P(Reject H0 | p = p1 in Ha) P(X ≤ c | p = p1), or 1 − β(p1)      P(X ≥ c | p = p1), or 1 − β(p1)
P-Value                         P(X ≤ x0 | p = p0)                   P(X ≥ x0 | p = p0)
P-value Decision Rule           Reject H0 if the P-value ≤ α
Note: A similar theory also applies to a two-tailed test, i.e., a test of
H0: p = p0, Ha: p ≠ p0
While we will conduct such tests in our applications, we will not discuss the theory here.
An analogy for statistical hypotheses
In practice we use α = .01 or α = .05. Thus to reject H0 we need strong evidence. In our judicial system, we use the phrase "innocent until proven guilty beyond a reasonable doubt." We may define the null and alternative hypotheses as follows:
H0: the defendant is innocent
Ha: the defendant is guilty.
To prove the defendant is guilty we need strong evidence.
CUMULATIVE BINOMIAL PROBABILITIES: P(X ≤ x)

n = 15
 x    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
 0  .2059  .0352  .0047  .0005  .0000  .0000  .0000  .0000  .0000
 1  .5490  .1671  .0353  .0052  .0005  .0000  .0000  .0000  .0000
 2  .8159  .3980  .1268  .0271  .0037  .0003  .0000  .0000  .0000
 3  .9444  .6482  .2969  .0905  .0176  .0019  .0001  .0000  .0000
 4  .9873  .8358  .5155  .2173  .0592  .0093  .0007  .0000  .0000
 5  .9978  .9389  .7216  .4032  .1509  .0338  .0037  .0001  .0000
 6  .9997  .9819  .8689  .6098  .3036  .0950  .0152  .0008  .0000
 7  1.000  .9958  .9500  .7869  .5000  .2131  .0500  .0042  .0000
 8  1.000  .9992  .9848  .9050  .6964  .3902  .1311  .0181  .0003
 9  1.000  .9999  .9963  .9662  .8491  .5968  .2784  .0611  .0022
10  1.000  1.000  .9993  .9907  .9408  .7827  .4845  .1642  .0127
11  1.000  1.000  .9999  .9981  .9824  .9095  .7031  .3518  .0556
12  1.000  1.000  1.000  .9997  .9963  .9729  .8732  .6020  .1841
13  1.000  1.000  1.000  1.000  .9995  .9948  .9647  .8329  .4510
14  1.000  1.000  1.000  1.000  1.000  .9995  .9953  .9648  .7941
15  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000

n = 20
 x    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
 0  .1216  .0115  .0008  .0000  .0000  .0000  .0000  .0000  .0000
 1  .3917  .0692  .0076  .0005  .0000  .0000  .0000  .0000  .0000
 2  .6769  .2061  .0355  .0036  .0002  .0000  .0000  .0000  .0000
 3  .8670  .4114  .1071  .0160  .0013  .0000  .0000  .0000  .0000
 4  .9568  .6296  .2375  .0510  .0059  .0003  .0000  .0000  .0000
 5  .9887  .8042  .4164  .1256  .0207  .0016  .0000  .0000  .0000
 6  .9976  .9133  .6080  .2500  .0577  .0065  .0003  .0000  .0000
 7  .9996  .9679  .7723  .4159  .1316  .0210  .0013  .0000  .0000
 8  .9999  .9900  .8867  .5956  .2517  .0565  .0051  .0001  .0000
 9  1.000  .9974  .9520  .7553  .4119  .1275  .0171  .0006  .0000
10  1.000  .9994  .9829  .8725  .5881  .2447  .0480  .0026  .0000
11  1.000  .9999  .9949  .9435  .7483  .4044  .1133  .0100  .0001
12  1.000  1.000  .9987  .9790  .8684  .5841  .2277  .0321  .0004
13  1.000  1.000  .9997  .9935  .9423  .7500  .3920  .0867  .0024
14  1.000  1.000  1.000  .9984  .9793  .8744  .5836  .1958  .0113
15  1.000  1.000  1.000  .9997  .9941  .9490  .7625  .3704  .0432
16  1.000  1.000  1.000  1.000  .9987  .9840  .8929  .5886  .1330
17  1.000  1.000  1.000  1.000  .9998  .9964  .9645  .7939  .3231
18  1.000  1.000  1.000  1.000  1.000  .9995  .9924  .9308  .6083
19  1.000  1.000  1.000  1.000  1.000  1.000  .9992  .9885  .8784
20  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000

n = 30
 x    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
 0  .0424  .0012  .0000  .0000  .0000  .0000  .0000  .0000  .0000
 1  .1837  .0105  .0003  .0000  .0000  .0000  .0000  .0000  .0000
 2  .4114  .0442  .0021  .0000  .0000  .0000  .0000  .0000  .0000
 3  .6474  .1227  .0093  .0003  .0000  .0000  .0000  .0000  .0000
 4  .8245  .2552  .0302  .0015  .0000  .0000  .0000  .0000  .0000
 5  .9628  .4275  .0766  .0057  .0002  .0000  .0000  .0000  .0000
 6  .9742  .6070  .1595  .0172  .0007  .0000  .0000  .0000  .0000
 7  .9922  .7608  .2814  .0435  .0026  .0000  .0000  .0000  .0000
 8  .9980  .8713  .4315  .0940  .0081  .0002  .0000  .0000  .0000
 9  .9995  .9389  .5888  .1763  .0214  .0009  .0000  .0000  .0000
10  .9999  .9744  .7304  .2915  .0494  .0029  .0000  .0000  .0000
11  1.000  .9905  .8407  .4311  .1002  .0083  .0002  .0000  .0000
12  1.000  .9969  .9155  .5785  .1808  .0212  .0006  .0000  .0000
13  1.000  .9991  .9599  .7145  .2923  .0481  .0021  .0000  .0000
14  1.000  .9998  .9831  .8246  .4278  .0971  .0064  .0001  .0000
15  1.000  .9999  .9936  .9029  .5722  .1754  .0169  .0002  .0000
16  1.000  1.000  .9979  .9519  .7077  .2855  .0401  .0009  .0000
17  1.000  1.000  .9994  .9788  .8192  .4215  .0845  .0031  .0000
18  1.000  1.000  .9998  .9917  .8998  .5689  .1593  .0095  .0000
19  1.000  1.000  1.000  .9971  .9506  .7085  .2696  .0256  .0001
20  1.000  1.000  1.000  .9991  .9786  .8237  .4112  .0611  .0005
21  1.000  1.000  1.000  .9998  .9919  .9060  .5685  .1287  .0020
22  1.000  1.000  1.000  1.000  .9974  .9565  .7186  .2392  .0078
23  1.000  1.000  1.000  1.000  .9993  .9828  .8405  .3930  .0258
24  1.000  1.000  1.000  1.000  .9998  .9943  .9234  .5725  .0732
25  1.000  1.000  1.000  1.000  1.000  .9985  .9698  .7448  .1755
26  1.000  1.000  1.000  1.000  1.000  .9997  .9907  .8773  .3526
27  1.000  1.000  1.000  1.000  1.000  1.000  .9979  .9558  .5886
28  1.000  1.000  1.000  1.000  1.000  1.000  .9997  .9895  .8163
z Test for a Population Mean
To test the hypothesis H0 :  = 0 based on an SRS of size n from a population with
unknown mean  and known standard deviation  , compute the test statistic
_
z =
( x  0 )
/ n
In terms of a standard normal random variable Z, the P-value for a test of H0
against
Ha :  > 0 is P(Z  z)
Ha :  < 0 is P(Z  z)
Ha :   0 is 2P(Z  z)
These P-values are exact if the population distribution is normal and are
approximately correct for large n in other cases (Page 445-Text Book).
Confidence Intervals and Two-Sided Tests: A level  two-sided significance test
rejects a hypothesis H0: =0 exactly when the value 0 falls outside a level 1-
confidence interval for .
(20)
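A sketch of this computation in Python (scipy.stats assumed available), shown with the SSHA numbers from the problem below; the z value of about 3.01 already appears in the worksheet, so only the tail probability is new here:

from math import sqrt
from scipy.stats import norm

def z_test(xbar, mu0, sigma, n, alternative="greater"):
    # One-sample z statistic and P-value for H0: mu = mu0 (sigma known).
    z = (xbar - mu0) / (sigma / sqrt(n))
    if alternative == "greater":
        p = norm.sf(z)              # P(Z >= z)
    elif alternative == "less":
        p = norm.cdf(z)             # P(Z <= z)
    else:
        p = 2 * norm.sf(abs(z))     # two-sided
    return z, p

# Problem 6.45: n = 20, x-bar = 135.2, sigma = 30, H0: mu = 115 vs Ha: mu > 115.
z, p = z_test(135.2, 115, 30, 20, "greater")
print(round(z, 2), round(p, 4))     # z about 3.01, P-value about 0.0013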
To illustrate the test we consider the following problem.
Problem 6.45 (Page 455): The Survey of Study Habits and Attitudes (SSHA) is a
psychological test that measures the motivation, attitude toward school, and study
habits of students. Scores range from 0 to 200. The mean score for U.S. college
students is about 115, and the standard deviation is about 30. A teacher who
suspects that older students have better attitudes toward school gives the SSHA to
20 students who are at least 30 years of age. Their mean score is x̄ = 135.2.
(a) Assuming that σ = 30 for the population of older students, carry out a test of
H0: μ = 115, Ha: μ > 115.
Report the P-value of your test, and state your conclusion clearly.
(b) Your test in (a) required two important assumptions in addition to the assumption that the value of σ is known. What are they? Which of these assumptions is most important to the validity of your conclusion in (a)?
Solution: Given: n = , x̄ = , σ = . Assume α = .05.
Ha: μ > 115; therefore this is a right-sided test.
(a) (i)
(ii) The test statistic in this case is z =
(iii) p-value = P(Z ≥ 3.01) =
(iv) Decision:
(v) Concluding Sentence:
(b) Assumptions: (i)
(ii)
Example 6.16; Page 450: Bottles of a popular cola drink are supposed to contain 300
milliliters (ml) of cola. There is some variation from bottle to bottle because the
filling machinery is not perfectly precise. The distribution of the contents is normal
with standard deviation  = 3 ml. A student who suspects that the bottler is
underfilling measures the contents of six bottles. The results are
299.4 297.7 301.0 298.9 300.2 297.0
Is this convincing evidence that the mean contents of cola bottles is less than the
advertised 300 ml? The hypotheses are
H0 :  = 300,
Ha :  < 300
(22)
Problem 6.46; Page 456: The mean yield of corn in the United States is about 120 bushels per acre. A survey of 40 farmers this year gives a sample mean yield of x̄ = 123.8 bushels per acre. We want to know whether this is good evidence that the national mean this year is not 120 bushels per acre. Assume that the farmers surveyed are an srs from the population of all commercial growers and that the standard deviation of the yield in this population is σ = 10 bushels per acre. Give the P-value for the test of
H0: μ = 120, Ha: μ ≠ 120.
Are you convinced that the population mean is not 120 bushels per acre? Is your conclusion correct if the distribution of corn yields is somewhat non-normal? Why?
Problem 6.87; Page 481: You have an srs of size n = 16 from a normal distribution with σ = 1. You wish to test
H0: μ = 0, Ha: μ > 0.
You decide to reject H0 if x̄ > 0 and to accept H0 otherwise.
(a) Find the probability of a Type I error, that is, the probability that your test rejects H0 when in fact μ = 0.
(b) Find the probability of a Type II error when μ = 0.2. This is the probability that your test accepts H0 when in fact μ = 0.2.
(c) Find the probability of a Type II error when μ = 0.6.
Problem 6.84; Page 480: Example 6.16 discusses a test about the mean contents of
cola bottles. The hypotheses are
H0 :  = 300
Ha :  < 300 .
The sample size is n =6, and the population is assumed to have a normal distribution
with  = 3. A 5% significance test rejects H0 if z  -1.645, where the test statistic z is
z=
x  300
.
3
6
Power calculations help us see how large a shortfall in the bottle contents the test
can be expected to detect.
(a) Find the power of this test against the alternative  = 298.
(b) Find the power of this test against the alternative  = 294.
(c) Is the power against  = 296 higher or lower than the value you
found in (b) ? Explain why this result makes sense?
(25)
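One way to approach the power calculation: the test rejects when x̄ ≤ 300 − 1.645·(3/√6), and the power against an alternative mean is the probability of that event when X̄ is centred at the alternative. A sketch under those assumptions (Python, scipy.stats assumed):

from math import sqrt
from scipy.stats import norm

mu0, sigma, n, z_crit = 300, 3, 6, -1.645
se = sigma / sqrt(n)                      # about 1.2247
xbar_cutoff = mu0 + z_crit * se           # reject H0 when x-bar <= this value

def power(mu_alt):
    # P(reject H0 | true mean = mu_alt)
    return norm.cdf((xbar_cutoff - mu_alt) / se)

print(round(power(298), 3))   # part (a), roughly 0.50
print(round(power(294), 3))   # part (b), close to 1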
The t-distribution
The t-distribution depends on a single parameter, called its degrees of freedom (df). If sampling is done from a normal distribution whose mean is μ and standard deviation is σ, then
Z = (X̄ − μ) / (σ/√n)
follows the standard normal distribution. Since σ in practice is usually unknown, we replace it by its estimate s. The random variable
T = (X̄ − μ) / (S/√n)
follows the t-distribution with n − 1 degrees of freedom.
Sketch of the t-distribution: In comparison with the standard normal distribution, the t-distribution has more area in the tails, while the standard normal distribution has more area in the middle. The t-curve approaches the Z-curve as the df becomes large.
T-Interval: Confidence Interval for the Mean μ of a Normal Population (σ unknown)
If a random sample X1, X2, ..., Xn is chosen from a normal distribution, then the 100(1−α)% confidence interval for μ is
X̄ ± t_{α/2} SE
where:
df for t is n − 1,
SE = s/√n = standard error of X̄ (the estimated sd of X̄),
X̄ = , s² = , s =
Margin of Error: E = t_{α/2} SE = t_{α/2} s/√n
Level of Confidence (Reliability): 100(1−α)%
Notes: 1. For all n, t_{α/2} > z_{α/2}.
2. For df = ∞, t_{α/2} = z_{α/2}; these are the entries at the bottom of the t-table.
3. For large n (n > 30), the normality assumption may be ignored because of the Central Limit Theorem.
4. The estimate of μ, X̄, is the midpoint of the CI and the margin of error is one half the width of the CI. Thus, if the interval runs from L to U,
X̄ = (L + U)/2 and E = (U − L)/2
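A sketch of the t-interval computation in Python (scipy.stats assumed available); the example call uses the birth-weight summary from the example that follows:

from math import sqrt
from scipy.stats import t

def t_interval(xbar, s, n, conf=0.95):
    # X-bar +/- t_{alpha/2, n-1} * s/sqrt(n)
    se = s / sqrt(n)
    tcrit = t.ppf(1 - (1 - conf) / 2, df=n - 1)
    e = tcrit * se                      # margin of error
    return xbar - e, xbar + e, e

# Birth-weight example: n = 100, x-bar = 3.21, s = 0.71, 90% confidence.
lo, hi, e = t_interval(3.21, 0.71, 100, conf=0.90)
print(round(lo, 3), round(hi, 3), round(e, 3))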
Example: In a health study the birth weights of a random sample of 100 newborns from mothers with a low socioeconomic status in a large US city were recorded. The
sample yielded a mean of 3.21 kg with a standard deviation of 0.71 kg.
(a) Find a 90% confidence interval for the true mean birth weight of newborns
from mothers with a low socioeconomic status.
(b) Interpret the confidence interval.
Solution: Here we wish to estimate
μ = mean birth weight of all newborns from mothers with a low socioeconomic status in this US city.
Given: n = , x̄ = [estimate of μ], s = [estimate of σ]
Since n > 30, it is not necessary that the population be normal (due to the CLT).
For a 90% CI, t_{α/2} = , df = n − 1 = 99
x̄ ± t_{α/2} s/√n = = or,
(c) x̄ = _________ estimates the true population mean μ with margin of error E = ____________ and level of confidence (Reliability) ____________.
The level of confidence gives the proportion of intervals found this way that would cover μ.
Note: The interpretation of a confidence interval as given in the example above
is the popular interpretation often heard on television or reported in
newspapers. A mathematically precise interpretation of the confidence interval
for this example would be “ Prior to sampling there was a .90 probability that
the confidence interval to be formed would contain the true population mean μ."
Example: For the data in the example above, find a 95% confidence interval for
the true mean birth weight of newborns from mothers with a low socioeconomic
status.
Solution: Recall, n = 100, x̄ = 3.21, s = 0.71.
For a 95% CI, t_{α/2} = , df = n − 1 = 99
x̄ ± t_{α/2} s/√n = = or,
Interpretation:
x̄ = _________ estimates the true population mean μ with margin of error E = ____________ and level of confidence (Reliability) ____________.
Example: For the data in the example above, find a 99% confidence interval for the
true mean birth weight of newborns from mothers with a low socioeconomic status.
Solution: Recall, n = 100, x̄ = 3.21, s = 0.71.
For a 99% CI, t_{α/2} = , df = n − 1 = 99
x̄ ± t_{α/2} s/√n = = or,
Interpretation:
x̄ = _________ estimates the true population mean μ with margin of error E = ____________ and level of confidence (Reliability) ____________.
Question: Considering these three examples, if the level of confidence is increased and all other things remain the same, the width of the confidence interval will _______________.
Example: A study was conducted to determine the effect of acid rain on the lake
water in an industrial region of the country. The data below gives the pH levels
from a random sample of 10 lakes from this region. ( It was assumed that the
sample came from a normal distribution). Minitab was used to find a 95%
confidence interval for the mean pH level for all lakes in this region.
C1: 6.6 7.1 7.3 6.7 6.8 6.2 6.5 5.9 6.9 6.3

MTB > tint 95 c1

One-Sample T: C1

Variable   N    Mean   StDev   SE Mean        95.0% CI
C1        10   6.630   0.424     0.134  (6.326, 6.934)
From the Minitab output answer the following:
(a) What is the 95% confidence interval for μ?
(b) What is the estimate of μ and the estimated standard deviation of this estimate?
(c) What is the margin of error E and level of confidence (reliability) for the estimate of μ?
The One-Sample t Test
Suppose that an SRS of size n is drawn from a population having unknown mean μ. To test the hypothesis H0: μ = μ0 based on an SRS of size n, compute the one-sample t statistic
t = (x̄ − μ0) / (s/√n)
In terms of a random variable T having the t(n−1) distribution, the P-value for a test of H0 against
Ha: μ > μ0 is P(T ≥ t)
Ha: μ < μ0 is P(T ≤ t)
Ha: μ ≠ μ0 is 2P(T ≥ |t|)
These P-values are exact if the population distribution is normal and are approximately correct for large n in other cases (Page 496, textbook).
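A sketch of the one-sample t test from summary statistics (Python, scipy.stats assumed available); the example call uses the IQ numbers from the example that follows:

from math import sqrt
from scipy.stats import t

def one_sample_t(xbar, mu0, s, n, alternative="greater"):
    # t statistic and P-value for H0: mu = mu0.
    tstat = (xbar - mu0) / (s / sqrt(n))
    df = n - 1
    if alternative == "greater":
        p = t.sf(tstat, df)
    elif alternative == "less":
        p = t.cdf(tstat, df)
    else:
        p = 2 * t.sf(abs(tstat), df)
    return tstat, p

# IQ example: n = 120, x-bar = 103.21, s = 16.18, H0: mu = 100 vs Ha: mu > 100.
print(one_sample_t(103.21, 100, 16.18, 120, "greater"))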
To illustrate the test we consider the following example.
Example: A random sample of 120 high school graduates was given an IQ test. The sample mean IQ was 103.21 with a standard deviation of 16.18. Test at α = .10 whether there is sufficient evidence to conclude that the mean of the population from which the sample comes exceeds 100.
Solution: Given: n = 120, x̄ = 103.21, s = 16.18; α = .10
(i) Ha: μ > 100
(ii) t =
(iii) p-value =
(iv) Decision:
(v) Concluding Sentence:
Example: A psychological test, used to assess an individual’s ability to appraise
other people, was given to a random sample of 12 supervisors in a large corporation.
Their scores are given below.
64 97 73 71 68 74 60 78 60 74 73 75
Is there sufficient evidence at α = .05 to conclude that the mean score for the population of supervisors is below 75?
Solution: Given: n = , x̄ = , s = , α = .05
(i) Ha: μ < 75; therefore, this is a left-sided test.
(ii) t =
(iii) p-value =
(iv) Decision:
(v) Concluding Sentence:
Example: A manufacturing process is supposed to produce ball bearings for use in
industry with a diameter of 2cm. A random sample of 40 ball bearings was chosen
and their diameters were measured. The mean and standard deviation of this random sample are given below:
n =40 , x = 1.9991, s = .0089.
Test the hypothesis Ha :   2 at  = .05.
(i) Ha :   2; therefore, this is a two sided test.
(ii) t =
(iii)p-value =
(iv) Decision:
(v) Concluding Sentence:
(35)
PAIRED SAMPLE DESIGN
Example: A consumer group wishes to compare two brands of tire, brand A and
brand B with respect to tire wear.
- One of each brand of tire was randomly assigned to the rear wheels of 8 cars.
- The cars were then driven a specified number of miles
- The amount of wear on each tire was recorded.
Amount of wear

Car             1     2     3     4     5     6     7     8
A (xi)        9.4  11.8   9.1   8.3   9.0  10.6   9.0   9.1
B (yi)        9.3  13.0   9.7   8.8   9.0  10.4  10.0  10.1
di = xi − yi  0.1  −1.2  −0.6  −0.5   0.0   0.2  −1.0  −1.0
In this design the brands of tire were paired ( matched or blocked) according to car,
driver and route driven. Thus the differences ‘di’ in tire wear are most likely due to
differences in brand and not in these other variables.
Matched pairs t procedures
Suppose, for example, using our tire data we wish to test at α = .05 if "on average" brand B wears more than brand A. The underlying assumption for the t-test for matched pairs is that the di's form an srs from a normal distribution.

Car             1     2     3     4     5     6     7     8
di = xi − yi  0.1  −1.2  −0.6  −0.5   0.0   0.2  −1.0  −1.0
di²
(i) H0: μ = 0, Ha: μ < 0; α = .05.
(ii) t = (d̄ − 0) / (s/√n) =
(iii) p-value =
(iv) Decision:
(v) There is sufficient evidence at α = .05 to conclude that tire B wears more than tire A.
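A sketch of the matched-pairs calculation on the tire data (Python with NumPy and scipy.stats, assumed available); the differences are taken as brand A minus brand B, as in the table above:

import numpy as np
from scipy.stats import t

a = np.array([9.4, 11.8, 9.1, 8.3, 9.0, 10.6, 9.0, 9.1])    # brand A wear
b = np.array([9.3, 13.0, 9.7, 8.8, 9.0, 10.4, 10.0, 10.1])  # brand B wear
d = a - b                                                    # paired differences

n = len(d)
tstat = d.mean() / (d.std(ddof=1) / np.sqrt(n))   # (d-bar - 0) / (s_d / sqrt(n))
pvalue = t.cdf(tstat, df=n - 1)                   # left-sided test: Ha: mu_d < 0

print(round(d.mean(), 3), round(tstat, 3), round(pvalue, 4))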
PROBLEM 7.42 ; PAGE 522:The table below gives the pretest and posttest score on
the MLA listening test in Spanish for 20 high school Spanish teachers who attended
an intensive summer course in Spanish. The setting is identical to the one described
in Example 7.7.
Subject    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
Pretest   30  28  31  26  20  30  34  15  28  20  30  29  31  29  34  20  26  25  31  29
Posttest  29  30  32  30  16  25  31  18  33  25  32  28  34  32  32  27  28  29  32  32
(a) We hope to show that attending the institute improves listening skills. State an appropriate H0 and Ha. Be sure to identify the parameters appearing in the hypotheses.
(b) Make a graphical check for outliers or strong skewness in the data that you will use in your statistical test, and report your conclusions on the validity of the test.
(c) Carry out a test. Can you reject H0 at the 5% significance level? At the 1% significance level?
(d) Give a 90% confidence interval for the mean increase in listening score due to attending the summer institute.
MINITAB OUTPUT
—————
12/11/03 11:08:46 AM
————————————————————
Welcome to Minitab, press F1 for help.
MTB > Retrieve "D:\PCDataSets\MINITAB\Ch07\Ex07_042.mtp";
SUBC>
Portable.
Retrieving worksheet from file: D:\PCDataSets\MINITAB\Ch07\Ex07_042.mtp
# Worksheet was saved on Wed Apr 24 2002
Results for: Ex07_042.mtp
MTB > let c4= c3-c2
MTB > prin c1-c4
Data Display

Row  Student  Pretest  Posttest  Posttest-Pretest
  1        1       30        29                -1
  2        2       28        30                 2
  3        3       31        32                 1
  4        4       26        30                 4
  5        5       20        16                -4
  6        6       30        25                -5
  7        7       34        31                -3
  8        8       15        18                 3
  9        9       28        33                 5
 10       10       20        25                 5
 11       11       30        32                 2
 12       12       29        28                -1
 13       13       31        34                 3
 14       14       29        32                 3
 15       15       34        32                -2
 16       16       20        27                 7
 17       17       26        28                 2
 18       18       25        29                 4
 19       19       31        32                 1
 20       20       29        32                 3
MTB > gstd
* NOTE * Character graphs are obsolete.
* NOTE * Standard Graphics are enabled. Professional Graphics are disabled.
Use the GPRO command to enable Professional Graphics.
MTB > hist c4

Histogram of Posttest-Pretest   N = 20

Midpoint  Count
      -5      1  *
      -4      1  *
      -3      1  *
      -2      1  *
      -1      2  **
       0      0
       1      2  **
       2      3  ***
       3      4  ****
       4      2  **
       5      2  **
       6      0
       7      1  *
MTB > stem c4;
SUBC> trim.

Stem-and-Leaf Display: Posttest-Pretest

Stem-and-leaf of Posttest-Pretest   N = 20
Leaf Unit = 1.0

  2   -0  54
  4   -0  32
  6   -0  11
  8    0  11
 (7)   0  2223333
  5    0  4455
  1    0  7

MTB > boxp c4

[Character boxplot of Posttest-Pretest, plotted on an axis from about -5.0 to 7.5]
MTB > ttes 0 c4;
SUBC> alte 1.

One-Sample T: Posttest-Pretest

Test of mu = 0 vs mu > 0

Variable       N   Mean  StDev  SE Mean
Posttest-Pre  20  1.450  3.203    0.716

Variable      95.0% Lower Bound     T      P
Posttest-Pre             0.211   2.02  0.029

MTB > tint 90 c4

One-Sample T: Posttest-Pretest

Variable       N   Mean  StDev  SE Mean        90.0% CI
Posttest-Pre  20  1.450  3.203    0.716  (0.211, 2.689)
The Two-Sample t Significance Test
Suppose that an srs of size n1 is drawn from a normal population with unknown mean μ1 and that an independent srs of size n2 is drawn from another normal population with unknown mean μ2. To test H0: μ1 = μ2, compute the two-sample t statistic
t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)
and use P-values or critical values for the t(k) distribution, where the degrees of freedom k are either approximated by software or are the smaller of n1 − 1 and n2 − 1. The P-values for a test of H0 against
(i) Ha: μ1 > μ2 is P(T ≥ t)
(ii) Ha: μ1 < μ2 is P(T ≤ t)
(iii) Ha: μ1 ≠ μ2 is 2P(T ≥ |t|)
A 100(1−α)% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) ± t_{α/2} √(s1²/n1 + s2²/n2)

Software approximation for the degrees of freedom
For the two-sample t significance test, statistical software uses the following formula to obtain the degrees of freedom:
df = (s1²/n1 + s2²/n2)² / [ (1/(n1−1))(s1²/n1)² + (1/(n2−1))(s2²/n2)² ]
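A sketch of the two-sample t statistic and the software df approximation from summary statistics (Python, scipy.stats assumed available); the example call uses the birth-weight summaries from Problem 7.70 below:

from math import sqrt
from scipy.stats import t

def two_sample_t(x1, s1, n1, x2, s2, n2):
    # Unpooled (Welch) two-sample t statistic and software df approximation.
    v1, v2 = s1**2 / n1, s2**2 / n2
    tstat = (x1 - x2) / sqrt(v1 + v2)
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return tstat, df

# Problem 7.70: positive-test group vs "other" group, Ha: mu1 < mu2.
tstat, df = two_sample_t(2733, 599, 134, 3118, 672, 5974)
print(round(tstat, 2), round(df, 1), t.cdf(tstat, df))   # left-sided P-value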
Problem 7.70; Page 548: Does cocaine use by pregnant women cause their babies to
have low birth weight? To study this question, birth weights of babies of women
who tested positive for cocaine/crack during a drug-screening test were compared
with the birth weights for women who either tested negative or were not tested, a
group we call “other”. Here are the summary statistics. The birth weights are
measured in grams.
Group            n     x̄    s
Positive test  134  2733  599
Other         5974  3118  672
(a) Formulate appropriate hypotheses and carry out the test of significance for
these data.
(b) Give a 95% confidence interval for the mean difference in birth weights.
(c) Discuss the limitations of the study design. What do you believe can be
concluded from this study?
(a) We assume α = 0.05. Let μ1 be the mean of the Positive group and μ2 be the mean of the Other group.
(i) H0: μ1 = μ2; Ha: μ1 < μ2.
(ii) t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
(b) The 95% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) ± t_{α/2} √(s1²/n1 + s2²/n2) =
(c)
Example 7.14; Page 530: An educator believes that new reading activities in the
classroom will help elementary school pupils improve some aspects of their reading
ability. She arranges for a third-grade class of 21 students to take part in these
activities for an eight week period. A control classroom of 23 third-graders follows
the same curriculum without the activities. At the end of the eight weeks, all
students are given a Degree of Reading Power (DRP) test, which measures the
aspects of reading ability that the treatment is designed to improve. The data appear
in Table 7.3
TABLE 7.3 DRP scores for third-graders
Treatment Group
24 61 59 46 43 44 52 43
58 67 62 57 71 49 54 43
53 57 49 56 33
Control Group
42 33 46 37 43 41 10 42
55 19 17 55 26 54 60 28
62 20 53 48 37 85 42
The summary statistics are

Group       n     x̄      s
Treatment  21  51.48  11.01
Control    23  41.52  17.15

Because we hope to show that the treatment (Group 1) is better than the control (Group 2), the hypotheses are
(i) H0: μ1 = μ2; Ha: μ1 > μ2. Assume α = 0.05.
(ii) t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
MINITAB OUTPUT
————— 12/16/03 10:27:45 AM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > twos c1 c2;
SUBC> alte 1.

Two-Sample T-Test and CI: Treatment, Control

Two-sample T for Treatment vs Control

           N  Mean  StDev  SE Mean
Treatmen  21  51.5   11.0      2.4
Control   23  41.5   17.1      3.6

Difference = mu Treatment - mu Control
Estimate for difference: 9.95
95% lower bound for difference: 2.69
T-Test of difference = 0 (vs >): T-Value = 2.31  P-Value = 0.013  DF = 37

MTB > twos 95 c1 c2

Two-Sample T-Test and CI: Treatment, Control

Two-sample T for Treatment vs Control

           N  Mean  StDev  SE Mean
Treatmen  21  51.5   11.0      2.4
Control   23  41.5   17.1      3.6

Difference = mu Treatment - mu Control
Estimate for difference: 9.95
95% CI for difference: (1.23, 18.68)
T-Test of difference = 0 (vs not =): T-Value = 2.31  P-Value = 0.027  DF = 37

MTB > twos 90 c1 c2

Two-Sample T-Test and CI: Treatment, Control

Two-sample T for Treatment vs Control

           N  Mean  StDev  SE Mean
Treatmen  21  51.5   11.0      2.4
Control   23  41.5   17.1      3.6

Difference = mu Treatment - mu Control
Estimate for difference: 9.95
90% CI for difference: (2.69, 17.22)
T-Test of difference = 0 (vs not =): T-Value = 2.31  P-Value = 0.027  DF = 37
Example 7.16; Page 533: The Chapin Insight Test is a psychological test designed to
measure how accurately the subject appraises other people. The possible scores on the
test range from 0 to 41. During the development of the Chapin test, it was given to
several different groups of people. Here are the results for male and female college
students majoring in the liberal arts:
Group  Sex      n     x̄     s
1      Male    133  25.34  5.05
2      Female  162  24.94  5.44

Do these data support the contention that female and male students differ in average social insight?
We assume α = 0.05.
(i) H0: μ1 = μ2; Ha: μ1 ≠ μ2.
(ii) t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
The Pooled Two-Sample t Procedures
Suppose that an srs of size n1 is drawn from a normal population with unknown mean μ1 and that an independent srs of size n2 is drawn from another normal population with unknown mean μ2. Suppose also that the two populations have the same standard deviation. A 100(1−α)% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) ± t_{α/2} s_p √(1/n1 + 1/n2)
The degrees of freedom for the t density curve is n1 + n2 − 2, and the pooled variance is
s_p² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)
To test the hypothesis H0: μ1 = μ2, compute the two-sample t statistic
t = (x̄1 − x̄2) / (s_p √(1/n1 + 1/n2))
In terms of a random variable T having the t(n1 + n2 − 2) distribution, the P-value for a test of H0 against
(i) Ha: μ1 > μ2 is P(T ≥ t)
(ii) Ha: μ1 < μ2 is P(T ≤ t)
(iii) Ha: μ1 ≠ μ2 is 2P(T ≥ |t|)
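A sketch of the pooled procedure from summary statistics (Python, scipy.stats assumed available); the example call uses the alloy numbers from the example that follows:

from math import sqrt
from scipy.stats import t

def pooled_t(x1, s1sq, n1, x2, s2sq, n2):
    # Pooled two-sample t statistic and its degrees of freedom.
    sp2 = ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)
    se = sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
    return (x1 - x2) / se, n1 + n2 - 2

# Alloy example: n1 = 15, x1-bar = 83.40, s1^2 = 7.4; n2 = 20, x2-bar = 79.85, s2^2 = 7.187.
tstat, df = pooled_t(83.40, 7.4, 15, 79.85, 7.187, 20)
print(round(tstat, 2), df, round(t.sf(tstat, df), 4))   # one-sided P-value for Ha: mu1 > mu2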
Example: In an experiment to compare the breaking strengths of two alloys “X”
and “Y”, random samples of beams of each alloy were chosen and a strength test
applied. The data is listed below.
Alloy X: 81 87 81 85 83 83 83 86 88 78 83 80 83 86 84
Alloy Y: 76 81 81 80 85 84 79 82 74 82 79 78 81 81 80 79 78 76 79 82
Is there sufficient evidence that on average the breaking strength of alloy X exceeds
that of alloy Y? Assume the samples come from normal distributions with equal variances (assume α = 0.05).
n1 = 15, x̄1 = 83.40, s1² = 7.4
n2 = 20, x̄2 = 79.85, s2² = 7.187
s_p² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2) =
s_p √(1/n1 + 1/n2) =
(i) H0: μ1 = μ2, Ha: μ1 > μ2 (α = 0.05)
(ii) t = (x̄1 − x̄2) / (s_p √(1/n1 + 1/n2)) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
Note: We obtain a 99% confidence interval for μ1 − μ2.
df = 33 ≈ 30. t_{α/2} = t_{0.005} =
Therefore, the 99% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) ± t_{α/2} s_p √(1/n1 + 1/n2)
Minitab Output
————— 12/16/03 10:38:27 AM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > twos c1 c2;
SUBC> alte 1;
SUBC> pooled.

Two-Sample T-Test and CI: Alloy X, Alloy Y

Two-sample T for Alloy X vs Alloy Y

          N   Mean  StDev  SE Mean
Alloy X  15  83.40   2.72     0.70
Alloy Y  20  79.85   2.68     0.60

Difference = mu Alloy X - mu Alloy Y
Estimate for difference: 3.550
95% lower bound for difference: 1.991
T-Test of difference = 0 (vs >): T-Value = 3.85  P-Value = 0.000  DF = 33
Both use Pooled StDev = 2.70

MTB > twos 99 c1 c2;
SUBC> pooled.

Two-Sample T-Test and CI: Alloy X, Alloy Y

Two-sample T for Alloy X vs Alloy Y

          N   Mean  StDev  SE Mean
Alloy X  15  83.40   2.72     0.70
Alloy Y  20  79.85   2.68     0.60

Difference = mu Alloy X - mu Alloy Y
Estimate for difference: 3.550
99% CI for difference: (1.032, 6.068)
T-Test of difference = 0 (vs not =): T-Value = 3.85  P-Value = 0.001  DF = 33
Both use Pooled StDev = 2.70
Example: A cigarette manufacturer claims that “on average” his cigarettes (
brand A) have lower tar content than those of his nearest competitor (brand B) .
An experiment comparing the tar content of two brands (in mg) yielded the
following results.
                 Brand A   Brand B
Sample Size           20        20
Sample Mean         9.57     10.16
Sample St. Dev.     1.01      1.11
Is there sufficient evidence at  = 0.05 to support the manufacturer’s claim?

Brand A :
Brand B:
x1 
n1 =
x2 
n2 =
(n1  1) s12  (n2  1) s 22
=
s 
n1  n2  2
2
p
sp
1
1
=

n1 n2
(i) H0 : 1 = 2, Ha: 1 < 2 (  =0.05)
(iv)
x1  x 2
t=
(v)
1
1

n1 n2
P-value =
(vi)
Decision:
sp
(v) Concluding Sentence:
(52)
s12 
s 22 
Note: For the above problem we now test the hypothesis that the two brands have different tar content.
(i) H0: μ1 = μ2, Ha: μ1 ≠ μ2 (α = 0.05)
(ii) t = (x̄1 − x̄2) / (s_p √(1/n1 + 1/n2)) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
NORMAL APPROXIMATION FOR COUNTS AND PROPORTIONS
An srs of size n is drawn from a population having population proportion p of successes. Let X be the number of successes in the sample and p̂ = X/n be the sample proportion of successes. If n is large, then
(i) X is approximately N(np, √(np(1 − p))).
(ii) p̂ is approximately N(p, √(p(1 − p)/n)). [This result is on Page 376 in the textbook.]

100(1−α)% CONFIDENCE INTERVAL FOR p
For large n, p̂ is approximately N(p, √(p(1 − p)/n)). Therefore,
P( −z_{α/2} < (p̂ − p)/√(p(1 − p)/n) < z_{α/2} ) = 1 − α
A simple calculation shows that the above equation is equivalent to the following:
P( p̂ − z_{α/2} √(p(1 − p)/n) < p < p̂ + z_{α/2} √(p(1 − p)/n) ) = 1 − α
The standard deviation of p̂ is given by
σ_p̂ = √(p(1 − p)/n)
Since p in practice is unknown, we replace it by its estimate p̂ and define the standard error of the sample proportion as follows:
SE_p̂ = √(p̂(1 − p̂)/n)
An approximate 100(1-)% confidence interval for p is given by
p̂  z/2 SE p̂
This is the traditional confidence interval of p. Unfortunately, modern computer
studies reveal that confidence intervals based on this approach can be quite
inaccurate, even for large samples. Therefore, we will use a simple adjustment that
works very well in practice. An estimate of p is defined by
~p = X  2 .
n4
We call it the Wilson estimate. This estimate was first suggested by Edwin Bidwell
Wilson in 1927. It can be shown that the distribution of ~p is close to the normal
p(1  p)
. To get a confidence
n4
interval, we estimate p by ~p in this standard deviation to get the standard error of
~p . Here is the final result.
distribution with mean p and standard deviation
An approximate 100(1-)% confidence interval for p is
~p  z/2 SE ~p
X 2
where, ~p =
, and
n4
SE ~p =
~
p (1  ~
p)
n4
The margin of error is m= z/2 SE ~p . In practice we will use this confidence interval
when the sample size is at least n =5 and the confidence level is 90%, 95% or 99%.
(55)
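A sketch of the Wilson ("plus four") interval in Python (scipy.stats assumed available); the example call uses the binge-drinking counts from Example 8.1 below:

from math import sqrt
from scipy.stats import norm

def wilson_interval(x, n, conf=0.95):
    # Plus-four (Wilson) estimate and confidence interval for a proportion.
    p_tilde = (x + 2) / (n + 4)
    se = sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    z = norm.ppf(1 - (1 - conf) / 2)
    m = z * se                                   # margin of error
    return p_tilde, p_tilde - m, p_tilde + m

# Example 8.1: 3314 frequent binge drinkers out of 17096 students.
print(wilson_interval(3314, 17096))              # estimate about 0.194, margin about 0.006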
Example 8.1; Page 574: Alcohol abuse has been described by college presidents as
the number one problem on campus and it is an important cause of death in young
adults. How common is it? A survey of 17,096 students in U.S. four-year colleges
collected information on drinking behaviour and alcohol related problems. The
researchers defined “frequent binge drinking” as having five or more drinks in a
row three or more times in the past two weeks. According to this definition, 3314
students were classified as frequent binge drinkers. The Wilson estimate of the proportion of drinkers is
p̃ = (3314 + 2)/(17096 + 4) ≈ 0.194
SE_p̃ = √(p̃(1 − p̃)/(n + 4)) =
Therefore, the 95% confidence interval for p is
p̃ ± z_{α/2} SE_p̃ =
Interpretation: p̃ ≈ 0.194 estimates the true proportion p with margin of error 0.006 and level of confidence (reliability) 95%.
Sample Size for Desired Margin of Error
The 100(1−α)% confidence interval for p is p̃ ± z_{α/2} SE_p̃. Therefore, the margin of error is
m = z_{α/2} SE_p̃ = z_{α/2} √(p̃(1 − p̃)/(n + 4)).
A simple calculation gives
n + 4 = (z_{α/2}/m)² p̃(1 − p̃).
The value of p̃ is not known until we gather the data. Therefore, we must guess a value to use in the calculations. We call the guessed value p*. The sample size formula is then given as follows:
n + 4 = (z_{α/2}/m)² p*(1 − p*)
If a guessed value is not available, then we may use p* = 1/2. The margin of error will be less than or equal to m if p* is chosen to be 0.5. The sample size formula in this case is given by
n + 4 = (z_{α/2}/(2m))².
Problem 8.25, Page 586: Land’s Beginning is a company that sells its merchandise
through the mail. It is considering buying a list of addresses from a magazine. The
magazine claims that at least 25% of its subscribers have high incomes ( they define
this to be household income in excess of $100,000). Land’s Beginning would like to
estimate the proportion of high-income people on the list. Checking income is very
difficult and expensive but another company offers this service. Land’s Beginning
will pay to find incomes for an srs of people on the magazine’s list. They would like
the margin of error of the 95% confidence interval for the proportion to be 0.05 or
less. Use the guessed value p* = 0.25 to find the required sample size.
Large-Sample Significance test for a Population Proportion
Draw an SRS of size n from a large population with unknown proportion p of successes. To test the hypothesis H0: p = p0, compute the z statistic
z = (p̂ − p0) / √(p0(1 − p0)/n)
In terms of a standard normal random variable Z, the approximate P-value for a test of H0 against
Ha: p > p0 is P(Z ≥ z)
Ha: p < p0 is P(Z ≤ z)
Ha: p ≠ p0 is 2P(Z ≥ |z|)
In practice we will use this test if np0 > 10 and n(1 − p0) > 10. This test is given on Page 575 in the textbook.
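A sketch of the large-sample z test for a proportion (Python, scipy.stats assumed available); the example call uses the residence example below (22 of 30 students, H0: p = .6 vs Ha: p > .6):

from math import sqrt
from scipy.stats import norm

def prop_z_test(x, n, p0, alternative="greater"):
    # z statistic and P-value for H0: p = p0.
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    if alternative == "greater":
        p = norm.sf(z)
    elif alternative == "less":
        p = norm.cdf(z)
    else:
        p = 2 * norm.sf(abs(z))
    return z, p

# Residence example: 22 of 30 students live on campus.
print(prop_z_test(22, 30, 0.6, "greater"))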
Problem 8.13; Page 584: In each of the following cases state whether or not the
normal approximation to the binomial should be used for a significance test on the
population proportion:
(a) n =10, H0: p =0.6
(b) n =100, H0: p =0.4
(c) n = 2000, H0: p =0.996
(d) n = 500, H0 ; p =0.25.

Example: In a random sample of 30 students from a university it was found that
22 lived in residence. Test at α = .05 if there is sufficient evidence to conclude
that over 60% of the student body lives on campus.
Here, Population = the student body at a university.
p = the true proportion of the student body that lives on campus.
(i) H0: p ≤ .6, Ha: p > .6
n = 30, p0 = .6; Check: np0 = , n(1 − p0) =
(ii) z = (p̂ − p0) / √(p0(1 − p0)/n) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
Example: The article "Statistical Evidence of Discrimination" (J. Amer. Stat. Assoc. (1982): 773-83) discussed the court case 'Swain vs. Alabama' (1965), in which it was alleged that there was discrimination against blacks in grand jury selection. Census data suggested that 25% of those eligible for grand jury selection were black, yet a random sample of 1050 called to appear for possible duty yielded only 177 blacks. Test at α = .01 if these data strongly support a case for discrimination.
Here, Population = all individuals in Alabama who are eligible for grand jury duty.
p = the true proportion of blacks who will be selected for grand jury duty from this population.
(i) H0: p , Ha: p
n = 1050, p0 = . ; Check: np0 = , n(1 − p0) =
(ii) z = (p̂ − p0) / √(p0(1 − p0)/n) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
MINITAB OUTPUT
————— 12/22/03 5:17:36 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > POne 1050 177;
SUBC> Test 0.25;
SUBC> Alternative -1;
SUBC> UseZ.

Test and CI for One Proportion

Test of p = 0.25 vs p < 0.25

Sample    X     N  Sample p  95.0% Upper Bound  Z-Value  P-Value
1       177  1050  0.168571           0.187575    -6.09    0.000

MTB > POne 1050 177;
SUBC> Test 0.25;
SUBC> UseZ.

Test and CI for One Proportion

Test of p = 0.25 vs p not = 0.25

Sample    X     N  Sample p              95.0% CI  Z-Value  P-Value
1       177  1050  0.168571  (0.145927, 0.191216)    -6.09    0.000

MTB > POne 1050 177;
SUBC> Confidence 99.0;
SUBC> Test 0.25;
SUBC> UseZ.

Test and CI for One Proportion

Test of p = 0.25 vs p not = 0.25

Sample    X     N  Sample p              99.0% CI  Z-Value  P-Value
1       177  1050  0.168571  (0.138812, 0.198331)    -6.09    0.000
Example : People often claim that the probability a newborn child will be a male
differs significantly from 0.50 . In a sample of 812 births recorded at Hotel Dieu
Hospital in Windsor, Ontario, 403 were found to be male. Test at α = .05 if there is
sufficient evidence to support this claim.
Here: Population = all infants born at Hotel Dieu Hospital in Windsor, Ontario.
p = the true probability a newborn will be a male.
(i) H0: p , Ha: p
n = 812, p0 = . ; Check: np0 = , n(1 − p0) =
(ii) z = (p̂ − p0) / √(p0(1 − p0)/n) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
Confidence Intervals for Comparing Two Proportions
Choose an srs of size n1 from a large population having proportion p1 of successes and an independent srs of size n2 from another population having proportion p2 of successes. An approximate 100(1−α)% confidence interval for p1 − p2 is
(p̃1 − p̃2) ± z_{α/2} SE_D̃
where
p̃1 = (X1 + 1)/(n1 + 2), p̃2 = (X2 + 1)/(n2 + 2)
are the Wilson estimates of the population proportions and the standard error of the difference is
SE_D̃ = √( p̃1(1 − p̃1)/(n1 + 2) + p̃2(1 − p̃2)/(n2 + 2) )
The margin of error is
m = z_{α/2} SE_D̃
Use this method when both sample sizes are at least 10 and the confidence level is 90%, 95%, or 99%.
Example 8.8; Page 589: In the binge-drinking study, data were also summarized by
gender
Population     n      X
1 (men)      7180   1630
2 (women)    9916   1684
Total       17096   3314

We obtain a 95% confidence interval for p1 − p2.
Wilson estimates: p̃1 ≈ 0.227, p̃2 ≈ 0.170
SE_D̃ = √( p̃1(1 − p̃1)/(n1 + 2) + p̃2(1 − p̃2)/(n2 + 2) )
     = √( (0.227)(0.773)/(7180 + 2) + (0.170)(0.830)/(9916 + 2) ) = 0.00622
The 95% confidence interval for p1 − p2 is
Significance Tests for Comparing Two Proportions
To test the hypothesis H0: p1 = p2, compute the z statistic
z = (p̂1 − p̂2) / SE_Dp
where the pooled standard error is
SE_Dp = √( p̂(1 − p̂)(1/n1 + 1/n2) )
and where
p̂ = (X1 + X2)/(n1 + n2)
In terms of a standard normal random variable Z, the P-value for a test of H0 against
Ha: p1 > p2 is P(Z ≥ z)
Ha: p1 < p2 is P(Z ≤ z)
Ha: p1 ≠ p2 is 2P(Z ≥ |z|)
This z test is based on the normal approximation to the binomial distribution. As a general rule, we will use it when the number of successes and the number of failures in each of the samples is at least 5.
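A sketch of the pooled two-proportion z test (Python, scipy.stats assumed available); the example call uses the Timolol counts from the example that follows:

from math import sqrt
from scipy.stats import norm

def two_prop_z(x1, n1, x2, n2):
    # Pooled z statistic for H0: p1 = p2.
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)                        # pooled estimate
    se = sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))   # pooled standard error
    return (p1_hat - p2_hat) / se

# Timolol example: 59 of 210 angina-free on the drug, 25 of 190 on the placebo.
z = two_prop_z(59, 210, 25, 190)
print(round(z, 2), round(norm.sf(z), 4))   # one-sided P-value for Ha: p1 > p2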
Example : In a medical study of the effectiveness of the drug Timolol in preventing
angina attacks, 400 patients were randomly allocated to receive a daily dosage of the
drug Timolol or a placebo for 28 weeks. The number of angina-free patients was
then recorded for each group.
                              Drug      Placebo
Sample Size                   210 (n1)  190 (n2)
No. of angina-free patients    59 (x1)   25 (x2)

Test at α = .01 if a significantly higher proportion of patients become angina free with the drug Timolol than with the placebo.
Let p1 = the true proportion of patients who become angina free with the drug Timolol.
p2 = the true proportion of patients who become angina free with the placebo.
(i) H0: p1 = p2, Ha: p1 > p2 (α = .01)
(ii) p̂1 = , p̂2 = , p̂ = (X1 + X2)/(n1 + n2) =
SE_Dp = √( p̂(1 − p̂)(1/n1 + 1/n2) ) =
z = (p̂1 − p̂2) / SE_Dp =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
MINITAB OUTPUT

————— 12/19/03 3:09:30 PM ————————————————————
Welcome to Minitab, press F1 for help.

MTB > PTwo 210 59 190 25;
SUBC>   Confidence 95;
SUBC>   Alternative 1;
SUBC>   Pooled.

Test and CI for Two Proportions

Sample    X    N   Sample p
1        59  210   0.280952
2        25  190   0.131579

Estimate for p(1) - p(2): 0.149373
95% lower bound for p(1) - p(2): 0.0843364
Test for p(1) - p(2) = 0 (vs > 0): Z = 3.66   P-Value = 0.000

MTB > PTwo 210 59 190 25;
SUBC>   Confidence 99.

Test and CI for Two Proportions

Sample    X    N   Sample p
1        59  210   0.280952
2        25  190   0.131579

Estimate for p(1) - p(2): 0.149373
99% CI for p(1) - p(2): (0.0475258, 0.251221)
Test for p(1) - p(2) = 0 (vs not = 0): Z = 3.78   P-Value = 0.000

MTB > PTwo 210 59 190 25;
SUBC>   Confidence 95.

Test and CI for Two Proportions

Sample    X    N   Sample p
1        59  210   0.280952
2        25  190   0.131579

Estimate for p(1) - p(2): 0.149373
95% CI for p(1) - p(2): (0.0718770, 0.226870)
Test for p(1) - p(2) = 0 (vs not = 0): Z = 3.78   P-Value = 0.000
(69)
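The pooled z test in the first Minitab run above can be reproduced with a few lines of Python (an added sketch, not part of the original notes; it assumes NumPy and SciPy are available):

import numpy as np
from scipy.stats import norm

# Timolol example: number of angina-free patients and sample sizes
x1, n1 = 59, 210    # drug
x2, n2 = 25, 190    # placebo

p1_hat = x1 / n1
p2_hat = x2 / n2
p_hat  = (x1 + x2) / (n1 + n2)           # pooled estimate used under H0: p1 = p2

se_pooled = np.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se_pooled        # about 3.66, as in the Minitab output
p_value = norm.sf(z)                     # one-sided P(Z >= z) for Ha: p1 > p2

print(round(z, 2), round(p_value, 4))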
Example: Two telephone polls conducted by a news organization in September of
1993 and 1994 asked a random sample of American adults the following question:
"Do you favour a proposal to stop providing government health benefits and public
education to illegal immigrants and their children?" (Source: Time Magazine,
October 1994)
                      September 1993     September 1994
Sample Size               800 (n1)           800 (n2)
Number in favour          376 (x1)           440 (x2)

Test at α = .01 if the proportion of American adults in favour of this proposal has
increased from September 1993 to September 1994.

Solution:
(i)   H0 : p1 = p2
      Ha : p1 < p2      (α = .01)
(ii)  p̂1 =               p̂2 =

      p̂ = (X1 + X2)/(n1 + n2) =

      SE_Dp = √[ p̂(1 - p̂)(1/n1 + 1/n2) ] =

      z = (p̂1 - p̂2) / SE_Dp =
(iii) P-value =
(iv)  Decision:
(v)   Concluding Sentence:
(70)
Example 8.9; Page 592: Are men and women college students equally likely to be
frequent binge drinkers? We examine the survey data in Example 8.8 (Page 589) to
answer this question. Here is the summary data
Population        n        X
1 (men)        7180     1630
2 (women)      9916     1684
Total         17096     3314

Solution:
(i)   H0 : p1 = p2
      Ha : p1 ≠ p2      (α = .01)
(ii)  p̂1 = 0.227         p̂2 = 0.170

      p̂ = (X1 + X2)/(n1 + n2) = 0.194

      SE_Dp = √[ p̂(1 - p̂)(1/n1 + 1/n2) ] =

      z = (p̂1 - p̂2) / SE_Dp =
(iii) P-value =
(iv)  Decision:
(v)   Concluding Sentence:
(71)
Data Analysis for Two-way Tables
Example 9.1, Page 612: Here is a summary from Example 8.8 (page 589), where we
compared frequent binge drinking of men and women by examining the proportions
of each gender who engage in this activity.
Population        n        X
1 (men)        7,180    1,630
2 (women)      9,916    1,684
Total         17,096    3,314
In this chapter we use a different summary of the data. Rather than recording just
the count of binge drinkers, we record counts for all outcomes in a two-way table.
Example 9.2; Page 612: Here is the two-way table classifying students by gender
and whether or not they are binge drinkers:

Two-way table for frequent binge drinking and gender

                                  Gender
Frequent binge drinker       Men       Women       Total
Yes                        1,630       1,684       3,314
No                         5,550       8,232      13,782
Total                      7,180       9,916      17,096
Example 9.3; Page 613: For the above example, the joint distribution of binge
drinking and gender is as follows:
Joint distribution of frequent binge drinking and gender

                                  Gender
Frequent binge drinker       Men       Women       Total
Yes
No
Total
Because this is a distribution, the sum of the proportions should be 1. For this
example the sum is 1.001. The difference is due to roundoff error.
(72)
Marginal distributions:
Example 9.4; Page 614:
Marginal distribution of gender
                  Men      Women
Proportion

Often we prefer to use percents rather than proportions. Here is the marginal
distribution of gender described with percents:

Marginal distribution of gender
                  Men      Women
Percent

Example 9.5, Page 614: The following table gives the marginal distribution of
frequent binge drinking:

Marginal distribution of frequent binge drinking
                  Yes      No
Percent
(73)
Conditional distributions
Example 9.7; Page 616: For women, the conditional distribution of frequent binge
drinking in terms of percents is

Conditional distribution of frequent binge drinking for women
                  Yes      No
Percent

Conditional distribution of frequent binge drinking for men
                  Yes      No
Percent
Comparing the conditional distributions reveals the nature of the association
between gender and frequent binge drinking. In this set of data the men are more
likely to be frequent binge drinkers than the women.
The following bar graphs compare the women and men percents.
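(Added illustration, not part of the original notes; it assumes NumPy is available.) The joint, marginal and conditional distributions discussed in Examples 9.3 to 9.7 can all be computed directly from the two-way table:

import numpy as np

# Two-way table from Example 9.2: rows = frequent binge drinker (Yes, No), columns = (men, women)
counts = np.array([[1630, 1684],
                   [5550, 8232]])
n = counts.sum()                                   # 17,096 students in all

joint             = counts / n                     # each cell divided by the grand total
marginal_gender   = counts.sum(axis=0) / n         # column totals / n  (men, women)
marginal_binge    = counts.sum(axis=1) / n         # row totals / n     (yes, no)
cond_given_gender = counts / counts.sum(axis=0)    # each column divided by its own total

print(joint.round(3))                 # men-yes cell is about 0.095
print(marginal_gender.round(3))       # about 0.420 (men) and 0.580 (women)
print(cond_given_gender.round(3))     # yes row: about 0.227 for men, 0.170 for women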
(74)
Problem 9.1; Page 637: The Census Bureau collects data on years of school
completed by Americans of different ages. The following table gives the years of
education for three different age groups. People under the age of 25 are not included
because many have not yet completed their education. Note that the unit of measure
for each entry in the table is thousands of persons.

Years of school completed, by age (thousands of persons)

                                              Age Group
Education                          25 to 34    35 to 54    55 and over      Total
Did not complete high school          5,325       9,152         16,035     30,512
Completed high school                14,061      24,070         18,320     56,451
College, 1 to 3 years                11,659      19,926          9,662     41,247
College, 4 or more years             10,342      19,878          8,005     38,225
Total                                41,338      73,028         52,022    166,438
(a) Give the joint distribution of education and age for this table.
(b) What is the marginal distribution of age?
(c) What is the marginal distribution of education?

Solution:
(a) The joint distribution in percents is given as follows:

                                              Age Group
Education                          25 to 34    35 to 54    55 and over
Did not complete high school
Completed high school
College, 1 to 3 years
College, 4 or more years

(b) The marginal distribution of age in percents is

              25-34      35-54      55 and over

(c) The marginal distribution of education in percents is

              Did not complete high school
              Completed high school
              College, 1 to 3 years
              College, 4 or more years
              Total
(75)
Problem 9.2; Page 638: Refer to the previous exercise. Find the conditional
distribution of education for each of the three age categories. Make a bar graph for
each distribution and summarize their differences and similarities.

Solution: The conditional distributions in percents are given as follows:

                                              Age Group
Education                          25 to 34    35 to 54    55 and over
Did not complete high school
Completed high school
College, 1 to 3 years
College, 4 or more years

Bar Graphs:
(76)
Problem 9.3; Page 638: Refer to the previous exercise. Compute the conditional
distribution of age for each of the four education categories. Summarize the
distributions graphically and write a short paragraph describing the distributions
and how they differ.

Solution: The conditional distributions in percents are given as follows:

                                              Age Group
Education                          25 to 34    35 to 54    55 and over
Did not complete high school
Completed high school
College, 1 to 3 years
College, 4 or more years
Bar Graphs:
Simpson's Paradox: An association or comparison that holds for all of several
groups can be reversed when the data are combined to form a single group. This
reversal is called Simpson's Paradox.
(77)
χ² (Chi-Square) Test for Independence

Assumption: An srs of size n is chosen from a population. Two categorical variables
are measured for each individual and the sample data is classified in an r×c
contingency table.

H0 : Row and column categories are independent.
Ha : Row and column categories are dependent.

TEST STATISTIC:

χ² = Σ (Oij - Eij)² / Eij ;        Eij = (ri)(cj)/n

χ²0 is the observed value of χ².
The statistic follows a chi-square distribution with df = (r-1)(c-1).

P-VALUE
p-value = P(χ² ≥ χ²0)

NOTE: The chi-square distribution is a right-skewed distribution and has a single
parameter called its degrees of freedom.
The right-tail probabilities of the chi-square distribution are given in Table F.
(78)
Example: In a study on gender bias, a simple random sample of 300 new employees
in the banking industry were classified according to “Gender” and as to “whether
or not they received a promotion after an appropriate period of time”
                           GENDER
PROMOTION             MALE           FEMALE          TOTAL
YES                140 (O11)        48 (O12)       188 (r1)
NO                  70 (O21)        42 (O22)       112 (r2)
TOTAL              210 (c1)         90 (c2)        300 (n)

We wish to test
H0 : Gender and promotion are independent.
Ha : gender and promotion are dependent.      (α = .05)

We will calculate the expected frequencies under H0 (i.e. under independence).
To see how the Eij's are obtained, consider E12, which is the expected number of female
employees who were promoted.

E12 = nP(a female is promoted)
    = nP(Y and F)
    = nP(Y)P(F)                    [by independence]
    = n(r1/n)(c2/n)
    = (r1)(c2)/n

In general,  Eij = (ri)(cj)/n
(79)
χ²0 = (140 - 131.6)²/131.6 + (48 - 56.4)²/56.4 + (70 - 78.4)²/78.4 + (42 - 33.6)²/33.6
    = 0.536 + 1.251 + 0.900 + 2.100
    = 4.7873

degrees of freedom (df) = (2-1)(2-1) = 1
Therefore, .025 < p-value < .05.
Decision: p-value < .05, reject H0 .
There is sufficient evidence at  = .05 to conclude that gender and promotion are
not independent.
Note: The chi-square approximation is adequate for practical use when the average
expected cell count is 5 or greater and all individual expected counts are 1 or
greater, except in the case of 2×2 contingency tables. All four expected counts in a
2×2 table should be 5 or greater.
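The same test can be cross-checked in Python (an added sketch, not part of the original notes; it assumes SciPy is installed). Note that correction=False is needed, because by default SciPy applies Yates' continuity correction to 2×2 tables, whereas the notes use the plain chi-square statistic:

import numpy as np
from scipy.stats import chi2_contingency

# Observed counts: rows = promoted (yes, no), columns = (male, female)
observed = np.array([[140, 48],
                     [ 70, 42]])

chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)

print(expected)                 # 131.6, 56.4, 78.4, 33.6 -- the Eij found above
print(round(chi2, 4))           # about 4.7873
print(dof, round(p_value, 4))   # df = 1, p-value about 0.029 (between .025 and .05)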
(80)
EXAMPLE: A random sample of 200 individuals from a community was cross-classified
according to eye and hair colour.

                             HAIR COLOUR
EYE COLOUR        Black    Brunette    Blond    Red
Brown                25          41        9      3
Blue                  4          38        3     34
Other                 3          22       10      8

(a) Test at α = .05 if eye colour and hair colour are related.
(b) If H0 is rejected in (a), determine which cell contributes most to the decision.
    Interpret this cell.
Minitab Output
EXAMPLE: Consider the eye and hair colour example. Its Minitab output is given
below.
MTB > chis c1-c4
Chi-Square Test: BLACK, BRUNETTE, BLOND, RED

Expected counts are printed below observed counts

          BLACK   BRUNETTE   BLOND     RED   Total
    1        25         41       9       3      78
          12.48      39.39    8.58   17.55
    2         4         38       3      34      79
          12.64      39.90    8.69   17.77
    3         3         22      10       8      43
           6.88      21.72    4.73    9.68
Total        32        101      22      45     200

Chi-Sq = 12.560 + 0.066 + 0.021 + 12.063 +
          5.906 + 0.090 + 3.726 + 14.810 +
          2.188 + 0.004 + 5.872 +  0.290  = 57.594
DF = 6, P-Value = 0.000
1 cells with expected counts less than 5.0

(i)  H0 :
     Ha :
(81)
Continued:
(82)
Chi-Square Test for Comparing Several Populations
Assumption: Select independent srs’s from each of c populations of size n1 , n2 , . . .
nc . Classify each individual in a sample according to a categorical response variable
with r possible values. There are c different probability distributions, one for each
population.
H0 : The distributions of the response variable are the same in all c populations.
Ha : These c distributions are not all the same.

TEST STATISTIC:

χ² = Σ (Oij - Eij)² / Eij ;        Eij = (ri)(cj)/n

χ²0 is the observed value of χ².
The statistic follows a chi-square distribution with df = (r-1)(c-1).

P-VALUE
p-value = P(χ² ≥ χ²0)
(83)
Example: Suppose an investigator wishes to determine if there is a difference in the
proportion of male and female adults in Canada who favour gun control. He selects
independent simple random samples from each population and obtains the
following results:
                           GENDER
OPINION               MALE           FEMALE          TOTAL
YES                 60 (O11)       105 (O12)       165 (r1)
NO                  40 (O21)        45 (O22)        85 (r2)
TOTAL              100 (c1)        150 (c2)        250 (n)
We apply the chi-square test for homogeneity.
H0 : The two populations are homogeneous with respect to opinion on gun control.
Ha : The two populations are not homogeneous with respect to opinion on gun control.

We find the expected frequencies when H0 is true.
To see how the Eij are calculated, let's look at E12, which is the expected number of
females who favour gun control.

E12 = (sample size from the female population) × (estimated common proportion in favour)
    ≈ c2 (r1/n)
    = (r1)(c2)/n

Notice that E12 = (r1)(c2)/n.  In general, Eij = (ri)(cj)/n.
(84)
χ²0 =
There is insufficient evidence at  = .05 to conclude that the male and female
populations are not homogeneous with respect to opinion on gun control.
(85)
Note: The above problem can also be done using the test on two proportions.
                 MALE    FEMALE
YES                60       105
NO                 40        45
TOTAL             100       150

H0 : p1 = p2        Ha : p1 ≠ p2

p̂1 =           p̂2 =           p̂ =

SE_Dp = √[ p̂(1 - p̂)(1/n1 + 1/n2) ] =

z = (p̂1 - p̂2)/SE_Dp = (.6 - .7)/.0612 = -1.63

p-value = 2 P(Z ≥ 1.63) =

p-value > .05, do not reject H0.
There is insufficient evidence at α = .05 to conclude that the proportions of males and
females in favour of gun control are not the same.
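(Added aside, not part of the original notes; it assumes NumPy and SciPy.) For a 2×2 table the uncorrected chi-square statistic equals the square of the pooled z statistic, so the homogeneity test and the two-proportion test above give the same p-value:

import numpy as np
from scipy.stats import chi2_contingency, norm

# Gun-control example: rows = opinion (yes, no), columns = (male, female)
observed = np.array([[60, 105],
                     [40,  45]])
chi2, p_chi, dof, expected = chi2_contingency(observed, correction=False)

# Pooled two-proportion z test on the same data
x1, n1, x2, n2 = 60, 100, 105, 150
p1, p2 = x1 / n1, x2 / n2
p = (x1 + x2) / (n1 + n2)
z = (p1 - p2) / np.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
p_z = 2 * norm.sf(abs(z))

print(round(z, 2), round(chi2, 2))       # about -1.63 and 2.67; chi2 equals z**2 (up to rounding)
print(round(p_z, 3), round(p_chi, 3))    # both p-values are about 0.10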
(86)
EXAMPLE: Students often claim that the distribution of grades varies depending
on the subject. From past records a university registrar chose random samples of
grades from selected first year courses in English, Mathematics and Biology. Care
was taken to exclude any student who may have taken two or more of these courses.
The distribution of grades for each sample is given below.
                ENGLISH    MATHEMATICS    BIOLOGY
GRADE
A                     5              8         13
B                    33             17         30
C                    42             10         24
D                    15              8          7
F                     5              8          6

(a) Test at α = .05 if the distribution of grades differs among the three subjects.
(b) If H0 is rejected in (a), determine which cell contributes most to the decision.
    Interpret this cell.
Minitab Output
MTB > chis c1-c3
Chi-Square Test: ENGLISH, MATHEMATICS, BIOLOGY

Expected counts are printed below observed counts

         ENGLISH   MATHEMAT   BIOLOGY   Total
    1         5          8        13       26
          11.26       5.74      9.00
    2        33         17        30       80
          34.63      17.66     27.71
    3        42         10        24       76
          32.90      16.78     26.32
    4        15          8         7       30
          12.99       6.62     10.39
    5         5          8         6       19
           8.23       4.19      6.58
Total       100         51        80      231

Chi-Sq =  3.477 + 0.890 + 1.773 +
          0.077 + 0.025 + 0.190 +
          2.517 + 2.739 + 0.205 +
          0.312 + 0.286 + 1.106 +
          1.265 + 3.452 + 0.051  = 18.363
DF = 8, P-Value = 0.019
1 cells with expected counts less than 5.0
(87)
Continued:
(i)  H0:
     Ha:
(88)
ONE-WAY ANALYSIS OF VARIANCE (CHAPTER 12)

Formula for sample variance (Review)

If x1, x2, ..., xn is a set of n observations, then the sample variance s² is given by

s² = Σ (xi - x̄)² / (n - 1)
   = [ (x1 - x̄)² + (x2 - x̄)² + ... + (xn - x̄)² ] / (n - 1)
   = SSTO / (n - 1)

SSTO is called the total sum of squares, and an alternative formula for SSTO is

SSTO = Σ (xi - x̄)²  =  Σ xi² - (Σ xi)²/n

The following sums of squares are used in One-way Analysis of Variance:
(i)   SST (or SSTO) = Total Sum of Squares
(ii)  SSG = Group Sum of Squares
(iii) SSE = Error Sum of Squares
To explain these SS’s we consider the following example.
EXAMPLE: A school of education conducted a study to compare four methods of
teaching reading comprehension (method A is the standard method). Twenty-four
grade 5 students of similar ability were randomly assigned to the four methods.
After four months of instruction, a standardized reading test (SRT) was given to
the students. Their scores are given below.

METHOD A: 42  42  43  45
METHOD B: 44  45  46  46  47  48
METHOD C: 46  47  48  48  48  51
METHOD D: 40  41  42  43  44  44  45  45
(89)

Solution: We call the above samples Sample 1, Sample 2, Sample 3 and Sample 4.

Sample 1: 42  42  43  45
n1 = 4,  Σx = 172,  x̄1 = 172/4 = 43,  SSTO1 = 6,  s1² = 6/3 = 2

Sample 2: 44  45  46  46  47  48
n2 = 6,  Σx = 276,  x̄2 = 276/6 = 46,  SSTO2 = 10,  s2² = 10/5 = 2

Sample 3: 46  47  48  48  48  51
n3 = 6,  Σx = 288,  x̄3 = 288/6 = 48,  SSTO3 = 14,  s3² = 14/5 = 2.8

Sample 4: 40  41  42  43  44  44  45  45
n4 = 8,  Σx = 344,  x̄4 = 344/8 = 43,  SSTO4 = 24,  s4² = 24/7 = 3.4286

COMBINED SAMPLE:
n = 24,  Σx = 1080,  x̄ = 1080/24 = 45,  SSTO = 162,  s² = 162/23 = 7.0435
In the above example

SSTO (or SST) = 162
This is called the total sum of squares and it measures the total variation of the sample.

SSG = n1(x̄1 - x̄)² + n2(x̄2 - x̄)² + n3(x̄3 - x̄)² + n4(x̄4 - x̄)²
    = 4(43 - 45)² + 6(46 - 45)² + 6(48 - 45)² + 8(43 - 45)²
    = 108
This is called the Group Sum of Squares and it measures the variation between groups.
(90)
SSE = SSTO1 + SSTO2 + SSTO3 + SSTO4
    = 6 + 10 + 14 + 24
    = 54
This is called the Error Sum of Squares and it measures the variation within groups.

Note: In one-way ANOVA the following result always holds:
SST (or SSTO) = SSG + SSE
In the above example SST = 162, SSG = 108, SSE = 54. Thus, 108 + 54 = 162.
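The sums of squares above can be verified with a short NumPy sketch (an added illustration, not part of the original notes):

import numpy as np

groups = [np.array([42, 42, 43, 45]),                     # Method A
          np.array([44, 45, 46, 46, 47, 48]),             # Method B
          np.array([46, 47, 48, 48, 48, 51]),             # Method C
          np.array([40, 41, 42, 43, 44, 44, 45, 45])]     # Method D

combined = np.concatenate(groups)
grand_mean = combined.mean()                                        # 45

sst = ((combined - grand_mean) ** 2).sum()                          # 162
ssg = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)    # 108
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)              # 54

print(sst, ssg, sse)
print(np.isclose(sst, ssg + sse))       # the decomposition SST = SSG + SSE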
DEGREES OF FREEDOM: The above SS's have associated degrees of freedom,
denoted by DFT, DFG and DFE. If the total number of values in the combined sample
is N and the number of groups is I, then

DFT = N - 1,    DFG = I - 1,    DFE = N - I

The following result always holds:  DFT = DFG + DFE

MEAN SQUARES: We will need the following mean squares in one-way ANOVA:

(i)  MSE = SSE/DFE          (ii)  MSG = SSG/DFG
(91)
TESTING HYPOTHESES IN ONE-WAY ANOVA

ASSUMPTION: The data arise as I independent simple random samples from I
normal distributions with equal variances σ².

HYPOTHESES:
H0 : μ1 = μ2 = ... = μI
Ha : not all of the μi are the same.

TEST STATISTIC: is given by the following analysis of variance table.

SOURCE     DEGREES OF FREEDOM    SUM OF SQUARES    MEAN SQUARE          F
GROUPS            I - 1                SSG              MSG        F = MSG/MSE
ERROR             N - I                SSE              MSE
TOTAL             N - 1                SST

The statistic F = MSG/MSE has an F-distribution with df = (I-1, N-I).

P-VALUE
p-value = P(F ≥ F0), where F0 is the observed value of the F statistic.
The upper-tail probabilities of the F distribution are given in Table E.
(92)
EXAMPLE: The analysis of variance table for the previous example is

SOURCE     DEGREES OF FREEDOM    SUM OF SQUARES    MEAN SQUARE             F
GROUPS             3               SSG = 108         MSG = 36     F = 36/2.7 = 13.33
ERROR             20               SSE = 54          MSE = 2.7
TOTAL             23               SST = 162

df = (3, 20)

F-CURVE WITH df = (3, 20):  P(F ≥ 8.10) = .001
Therefore, p-value = P(F ≥ 13.33) < .001
Decision: Reject H0.
There is sufficient evidence at α = .05 to conclude that the mean SRT scores differ
for students taught by the different teaching methods.
(93)
Minitab Output

MTB > stac c1-c4 c5;
SUBC> subs c6.
MTB > onew c5 c6

One-way ANOVA: C5 versus C6

Analysis of Variance for C5
Source     DF        SS       MS        F        P
C6          3    108.00    36.00    13.33    0.000
Error      20     54.00     2.70
Total      23    162.00

Level    N      Mean    StDev
1        4    43.000    1.414
2        6    46.000    1.414
3        6    48.000    1.673
4        8    43.000    1.852
Pooled StDev = 1.643

[Individual 95% CIs for each level mean, based on the pooled StDev, are plotted on a
scale from 42.5 to 50.0]
(94)
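The same one-way ANOVA can be reproduced outside Minitab. The Python sketch below (an added illustration, not part of the original notes; it assumes SciPy is installed) runs the test on the four method samples:

from scipy.stats import f_oneway

method_a = [42, 42, 43, 45]
method_b = [44, 45, 46, 46, 47, 48]
method_c = [46, 47, 48, 48, 48, 51]
method_d = [40, 41, 42, 43, 44, 44, 45, 45]

# One-way ANOVA of H0: all four method means are equal
result = f_oneway(method_a, method_b, method_c, method_d)
print(round(result.statistic, 2))   # about 13.33, i.e. MSG/MSE = 36/2.7
print(round(result.pvalue, 4))      # well below .001, matching P = 0.000 above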
EXAMPLE: Below is a partially completed One-Way ANOVA table.

SOURCE     DF     SS     MS     F
GROUPS      -    130     65
ERROR       -    580     29
TOTAL       -      -

(a) Fill in the missing values.
(b) State H0 and Ha.
(c) Conduct a test of H0 and Ha at α = .05.

Solution:

The coefficient of determination: We define the coefficient of determination as

R² = SSG/SST

For the above problem the value of R² or R-square is 0.183. This result says that
the FIT part of the model (that is, differences among the means of the groups) accounts
for 18.3% of the total variation in the data.
(95)
EXAMPLE: Below is a partially completed One-Way ANOVA table.

SOURCE          DF      SS     MS     F
TREATMENTS       -       -      -     -
ERROR            -     600     25
TOTAL           27    1500

(a) Fill in the missing entries in the table.
(b) State H0 and Ha.
(c) Conduct a test of H0 and Ha at α = .01.
(96)
PROBLEM 12.23(b); PAGE 787: Many studies have suggested that there is a link between
exercise and healthy bones. Exercise stresses the bones and this causes them to get
stronger. One study examined the effect of jumping on the bone density of growing
rats. There were three treatments: a control with no jumping, a low-jump condition
(the jump height was 30 centimeters), and a high-jump condition (60 centimeters).
After 8 weeks of 10 jumps per day, 5 days per week, the bone density of the rats
(expressed in mg/cm3) was measured. Here are the data:

GROUP                      BONE DENSITY (mg/cm3)
CONTROL     611  621  614  593  593  653  600  554  603  569
LOW JUMP    635  605  638  594  599  632  631  588  607  596
HIGH JUMP   650  622  626  626  631  622  643  674  643  650
Run the analysis of variance. Report the F statistic with its degrees of freedom and
the p-value. What do you conclude?
Results for: Ex12_023.mtp
MTB > onew c4 c5

One-way ANOVA: Bone Density versus Treatment

Analysis of Variance for Bone Density
Source        DF       SS      MS       F       P
Treatment      2     7434    3717    7.98   0.002
Error         27    12580     466
Total         29    20013

Level    N      Mean    StDev
1       10    601.10    27.36
2       10    612.50    19.33
3       10    638.70    16.59
Pooled StDev = 21.58

[Individual 95% CIs for each treatment mean, based on the pooled StDev, are plotted on
a scale from 600 to 640]
(97)
EXAMPLE: The data below gives the strength measurements for a type of concrete
when cured at three different temperatures. The water/cement ratio was held
constant throughout the experiment.
Temperature A 46.3 43.7 51.2 49.6 48.8
Temperature B 48.6 52.3 50.9 53.6 55.7
Temperature C 45.1 46.7 41.8 40.4 42.6
To compare the average strength at the various temperatures, a One-Way ANOVA
was conducted using Minitab. Use the following output to answer the questions
below.
MTB > stac c1-c3 c4;
SUBC> subs c5.
MTB > onew c4 c5
One-way ANOVA: C4 versus C5

Analysis of Variance for C4
Source      DF        SS       MS        F        P
C5           2    198.10    99.05    13.27    0.001
Error       12     89.60     7.47
Total       14    287.70

Level    N      Mean    StDev
1        5    47.920    2.949
2        5    52.220    2.686
3        5    43.320    2.547
Pooled StDev = 2.733

[Individual 95% CIs for each level mean, based on the pooled StDev, are plotted on a
scale from 44.0 to 52.0]

(a) State H0 and Ha.
(b) State the p-value and use it to conduct the test. (α = .05)
(c) Find the MSE and the pooled estimate of σ. How are these two numbers related?
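As a hint for part (c), the Python sketch below (an added illustration, not part of the original notes; it assumes NumPy and SciPy) reproduces the F test and shows that the pooled StDev reported by Minitab is the square root of the MSE:

import numpy as np
from scipy.stats import f_oneway

temp_a = [46.3, 43.7, 51.2, 49.6, 48.8]
temp_b = [48.6, 52.3, 50.9, 53.6, 55.7]
temp_c = [45.1, 46.7, 41.8, 40.4, 42.6]

res = f_oneway(temp_a, temp_b, temp_c)
print(round(res.statistic, 2), round(res.pvalue, 3))    # about 13.27 and 0.001

# MSE is the pooled sample variance; the pooled estimate of sigma is its square root
groups = [np.array(g) for g in (temp_a, temp_b, temp_c)]
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
dfe = sum(len(g) - 1 for g in groups)                   # 12
mse = sse / dfe                                         # about 7.47
print(round(mse, 2), round(np.sqrt(mse), 3))            # 7.47 and 2.733 (= Pooled StDev)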
(98)
Example: We use one-way ANOVA to compare the average strength at the various
temperatures for the following three random samples:
Temperature A: 46.3  47.3
Temperature B: 48.6  52.3
Temperature C: 45.1  46.7

Solution:
SSTO1 =
SSTO2 =
SSTO3 =
SSE =
SSG =
SST=
(99)
Continued:
ANOVA TABLE
Concluding Sentence:
(100)