Download required sample size to estimate mu, alpha

Chapter 20 Inference for Means: Part 2
Required Sample Size, Type II Error Probabilities

Required Sample Size To Estimate a Population Mean \(\mu\)
• If you desire a C% confidence interval for a population mean \(\mu\) with an accuracy specified by you, how large does the sample size need to be?
• We will denote the accuracy by ME, which stands for Margin of Error.

Example: Sample Size to Estimate a Population Mean \(\mu\)
• Suppose we want to estimate the unknown mean height \(\mu\) of male students at NC State with a confidence interval.
• We want to be 95% confident that our estimate is within .5 inch of \(\mu\).
• How large does our sample size need to be?

Confidence Interval for \(\mu\)
• In terms of the margin of error ME, the CI for \(\mu\) can be expressed as \(\bar{x} \pm ME\).
• The confidence interval for \(\mu\) is
\[ \bar{x} \pm t^*_{n-1}\,\frac{s}{\sqrt{n}}, \quad \text{so} \quad ME = t^*_{n-1}\,\frac{s}{\sqrt{n}} \]
• So we can find the sample size by solving this equation for n:
\[ ME = t^*_{n-1}\,\frac{s}{\sqrt{n}} \quad \text{which gives} \quad n = \left(\frac{t^*_{n-1}\, s}{ME}\right)^2 \]
• Good news: we have an equation.
• Bad news:
  1. We need to know s.
  2. We don't know n, so we don't know the degrees of freedom to find \(t^*_{n-1}\).

A Way Around this Problem: Approximate by Using the Standard Normal
• Use the corresponding z* from the standard normal to form the equation
\[ ME = z^*\,\frac{s}{\sqrt{n}} \]
• Solve for n:
\[ n = \left(\frac{z^*\, s}{ME}\right)^2 \]

Sampling distribution of \(\bar{x}\)
• At confidence level .95, \(\bar{x}\) falls within \(1.96\,\sigma/\sqrt{n}\) of \(\mu\).
• Set \(ME = 1.96\,\sigma/\sqrt{n}\) and solve for n (estimate \(\sigma\) with s):
\[ n = \left(\frac{1.96\,\sigma}{ME}\right)^2 \]

Estimating s
• Use previously collected data or prior knowledge of the population.
• If the population is normal or near-normal, then s can be conservatively estimated by \(s \approx \text{range}/6\).
• 99.7% of observations fall within \(3\sigma\) of the mean.

Example: sample size to estimate mean height \(\mu\) of NCSU undergrad. male students
\[ n = \left(\frac{z^*\, s}{ME}\right)^2 \]
• We want to be 95% confident that we are within .5 inch of \(\mu\), so ME = .5; z* = 1.96.
• Suppose previous data indicates that s is about 2 inches.
• \(n = [(1.96)(2)/(.5)]^2 = 61.47\)
• We should sample 62 male students.

Example: Sample Size to Estimate a Population Mean – Textbooks
• Suppose the financial aid office wants to estimate the mean NCSU semester textbook cost \(\mu\) within ME = $25 with 98% confidence. How many students should be sampled? Previous data shows \(\sigma\) is about $85.
\[ n = \left(\frac{z^*\sigma}{ME}\right)^2 = \left(\frac{(2.33)(85)}{25}\right)^2 = 62.76 \]
• Round up to n = 63.

Example: Sample Size to Estimate a Population Mean – NFL footballs
• The manufacturer of NFL footballs uses a machine to inflate new footballs.
• The mean inflation pressure is 13.0 psi, but random factors cause the final inflation pressure of individual footballs to vary from 12.8 psi to 13.2 psi.
• After throwing several interceptions in a game, Tom Brady complains that the balls are not properly inflated. The manufacturer wishes to estimate the mean inflation pressure to within .025 psi with a 99% confidence interval. How many footballs should be sampled?
• 99% confidence gives z* = 2.58; ME = .025.
• \(\sigma\) = ? Inflation pressures range from 12.8 to 13.2 psi, so range = 13.2 – 12.8 = .4 and \(\sigma \approx \text{range}/6 = .4/6 = .067\).
\[ n = \left(\frac{z^*\sigma}{ME}\right)^2 = \left(\frac{(2.58)(.067)}{.025}\right)^2 = 47.8, \quad \text{round up to } n = 48 \]

Significance Levels and Rejection Regions: Hypothesis Tests for \(\mu\)

Levels and Rejection Regions, Right-Tail; n = 26 (df = 25)
\[ H_0: \mu = \mu_0, \qquad t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} \]

  \(\alpha\)     Rejection region
  .10   t > 1.316
  .05   t > 1.708
  .01   t > 2.485

• If \(H_A: \mu > \mu_0\) and \(\alpha\) = .10, then RR = {t: t > 1.316}.
• If \(H_A: \mu > \mu_0\) and \(\alpha\) = .05, then RR = {t: t > 1.708}.
• If \(H_A: \mu > \mu_0\) and \(\alpha\) = .01, then RR = {t: t > 2.485}.

Hypothesis Testing for \(\mu\), Type II Error Probabilities (Right-tail example)
• Example
– A new billing system for a department store will be cost-effective only if the mean monthly account is more than $170.
– A sample of 401 accounts has a mean of $174 and s = $65.
– Can we conclude that the new system will be cost-effective?

Right-tail example: hypotheses, significance level
• Hypotheses
– The population of interest is the credit accounts at the store.
– We want to know whether the mean account for all customers is greater than $170.
\[ H_0: \mu = 170 \qquad H_A: \mu > 170 \]
– Here \(\mu\) is the mean account value for all customers.
– We will choose significance level \(\alpha\) = .05.

A Right-Tail Test: Rejection Region
• The rejection region: reject \(H_0\) if the test statistic t satisfies \(t > t_{.05,\,n-1} = t_{.05,\,400} = 1.649\).
• That is, we will reject \(H_0\) if the value of the test statistic t is greater than 1.649.
• Results from the n = 401 randomly selected customers: \(\bar{x} = \$174\), \(s = \$65\).

Right-tail example: test statistic and conclusion
• Hypotheses: \(H_0: \mu = 170\), \(H_A: \mu > 170\); data: \(\bar{x} = 174\), \(s = 65\).
• Test statistic:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{174 - 170}{65/\sqrt{401}} = 1.23 \]
• Recall that the rejection region is \(t > t_{\alpha,\,n-1} = t_{.05,\,400} = 1.649\).
• Since the test statistic is t = 1.23, and 1.23 < 1.649, we do not reject the null hypothesis \(H_0: \mu = 170\).

Right-tail example: P-value and conclusion
• P-value: the probability of observing a value of the test statistic as extreme or more extreme than t = 1.23, given that \(\mu = 170\), is
\[ \text{P-value} = P(t_{400} > 1.23) = .1097 \]
• Since the P-value > .05, we conclude that there is not sufficient evidence to reject \(H_0: \mu = 170\). A Type II error is possible.

Calculating \(\beta\), the Probability of a Type II Error
• Calculating \(\beta\) for the t test is not at all straightforward and is beyond the level of this course.
– The distribution of the test statistic t is quite complicated when \(H_0\) is false and \(H_A\) is true.
– However, we can obtain very good approximate values for \(\beta\) using z (the standard normal) in place of t.
• We need to
  1. specify an appropriate significance level \(\alpha\);
  2. determine the rejection region in terms of z;
  3. then calculate the probability of not being in the rejection region when \(\mu = \mu_1\), where \(\mu_1\) is a value of \(\mu\) that makes \(H_A\) true.

Example (cont.): calculating \(\beta\)
• \(H_0: \mu = 170\), \(H_A: \mu > 170\); choose \(\alpha\) = .05.
• Rejection region in terms of z: \(z > z_{.05} = 1.645\).
• Rejection region in terms of \(\bar{x}\) (the calculation rounds n = 401 to 400):
\[ z = \frac{\bar{x} - 170}{65/\sqrt{400}} > 1.645 \quad \Longrightarrow \quad \bar{x} > 170 + 1.645\,\frac{65}{\sqrt{400}} = 175.34 \]
• So the rejection region with \(\alpha\) = .05, expressed directly rather than in standardized terms, is \(\bar{x} > 175.34\). We do not reject \(H_0\) when \(\bar{x} \le 175.34\).
• Specify the alternative value under \(H_A\): let the alternative value be \(\mu = 180\) (rather than just \(\mu > 170\)), so we compare \(H_0: \mu = 170\) with \(H_A: \mu = 180\).
• A Type II error occurs when a false \(H_0\) is not rejected. Suppose \(\mu = 180\), that is, \(H_0\) is false. Then
\[ \beta(180) = P(\bar{x} < 175.34 \mid H_0 \text{ is false}) = P(\bar{x} < 175.34 \mid \mu = 180) = P\!\left(z < \frac{175.34 - 180}{65/\sqrt{400}}\right) = .0764 \]
• Power when \(\mu = 180\): \(1 - \beta(180) = .9236\).

Effects on \(\beta\) of changing \(\alpha\)
• Increasing the significance level \(\alpha\) decreases the value of \(\beta\), and vice versa.
[Figure: sampling distributions of \(\bar{x}\) centered at \(\mu = 170\) and \(\mu = 180\); moving the cutoff trades \(\alpha\) against \(\beta\).]

Judging the Test
• A hypothesis test is effectively defined by the significance level \(\alpha\) and by the sample size n.
• If the probability of a Type II error \(\beta\) is judged to be too large, we can reduce it by
– increasing \(\alpha\), and/or
– increasing the sample size.

Judging the Test
• Increasing the sample size reduces \(\beta\).
• Recall the rejection region:
\[ z = \frac{\bar{x} - \mu}{s/\sqrt{n}} > z_\alpha, \quad \text{or} \quad \bar{x} > \mu + z_\alpha\,\frac{s}{\sqrt{n}} \]
• By increasing the sample size, the standard deviation of the sampling distribution of the mean decreases. Thus, the cutoff value of \(\bar{x}\) for the rejection region decreases.
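
The three-step recipe above (pick \(\alpha\), express the rejection region in terms of \(\bar{x}\), then find the probability of missing it under the alternative) can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `beta_right_tail` is a hypothetical helper chosen for illustration, and it uses the exact normal quantile rather than the rounded 1.645, so the results differ from the slide values in the third decimal.

```python
from math import sqrt
from statistics import NormalDist


def beta_right_tail(mu0, mu_a, sigma, n, alpha):
    """Approximate Type II error probability for H0: mu = mu0 vs HA: mu > mu0.

    Uses the standard normal in place of t, as the slides suggest.
    Returns (cutoff for x-bar, beta at the alternative mu_a).
    """
    se = sigma / sqrt(n)                          # standard error of x-bar
    z_alpha = NormalDist().inv_cdf(1 - alpha)     # upper-alpha normal quantile
    cutoff = mu0 + z_alpha * se                   # reject H0 when x-bar > cutoff
    beta = NormalDist(mu_a, se).cdf(cutoff)       # P(x-bar < cutoff | mu = mu_a)
    return cutoff, beta


# Billing example: mu0 = 170, alternative 180, s = 65, n = 400
cutoff_05, beta_05 = beta_right_tail(170, 180, 65, 400, alpha=0.05)
cutoff_10, beta_10 = beta_right_tail(170, 180, 65, 400, alpha=0.10)

print(f"alpha=.05: cutoff={cutoff_05:.2f}, beta={beta_05:.4f}")  # cutoff about 175.35, beta about 0.076
print(f"alpha=.10: cutoff={cutoff_10:.2f}, beta={beta_10:.4f}")  # larger alpha, smaller beta
```

Running it with \(\alpha\) = .10 instead of .05 lowers the cutoff and therefore lowers \(\beta\), which is the trade-off described under "Effects on \(\beta\) of changing \(\alpha\)".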
Judging the Test
• Increasing the sample size reduces \(\beta\).
• Note what happens when n increases: \(\alpha\) does not change, but \(\beta\) becomes smaller, because the cutoff \(\bar{x}_L = \mu + z_\alpha\, s/\sqrt{n}\) moves closer to \(\mu = 170\).
• In the example, suppose n increases from 400 to 1000:
\[ \bar{x}_L = \mu + z_\alpha\,\frac{s}{\sqrt{n}} = 170 + 1.645\,\frac{65}{\sqrt{1000}} = 173.38 \]
\[ \beta = P\!\left(Z < \frac{173.38 - 180}{65/\sqrt{1000}}\right) = P(Z < -3.22) \approx 0 \]
• \(\alpha\) remains 5%, but the probability of a Type II error drops dramatically.

A Left-Tail Test
• Self-Addressed Stamped Envelopes.
– The chief financial officer at FedEx believes that including a self-addressed stamped envelope in the monthly invoice sent to customers will decrease the amount of time it takes for customers to pay their monthly bills.
– Currently, customers return their payments in 24 days on average, with a standard deviation of 6 days.
– Self-addressed stamped envelopes are included with the bills for 76 randomly selected customers. The number of days until they return their payment is recorded.

A Left-Tail Test: Hypotheses
• The parameter tested is the population mean payment period (\(\mu\)) for customers who receive self-addressed stamped envelopes with their bill.
• The hypotheses are: \(H_0: \mu = 24\), \(H_A: \mu < 24\).
• Use \(\alpha\) = .05; n = 76.

A Left-Tail Test: Rejection Region
• The rejection region: reject \(H_0\) if the test statistic t satisfies \(t < -t_{.05,\,75} = -1.665\).
• That is, we will reject \(H_0\) if the value of the test statistic t is less than -1.665.
• Results from the 76 randomly selected customers: \(\bar{x} = 22.95\) days, \(s = 6\) days.

A Left-Tail Test: Test Statistic
• The value of the test statistic t is:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{22.95 - 24}{6/\sqrt{76}} = -1.52 \]
• The rejection region is \(t < -t_\alpha = -t_{.05} = -1.665\).
• Since the test statistic is t = -1.52, and -1.52 > -1.665, we do not reject the null hypothesis.
• Note that the P-value = \(P(t_{75} < -1.52)\) = .066 > .05.
• Since our decision is to not reject the null hypothesis, a Type II error is possible.
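
As a quick arithmetic check of the left-tail test statistic above, here is a small sketch in standard-library Python. The helper name `t_statistic` is hypothetical, and the critical value -1.665 is copied from the slides rather than computed, because the standard library has no t distribution.

```python
from math import sqrt


def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic: (x-bar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / sqrt(n))


# Envelope example: x-bar = 22.95 days, mu0 = 24, s = 6, n = 76
t = t_statistic(22.95, 24, 6, 76)

# Critical value -t_{.05,75} taken from the slides (assumption: not recomputed here)
t_crit = -1.665

# Left-tail test: reject H0 only when t falls below the critical value
reject = t < t_crit

print(f"t = {t:.3f}, reject H0: {reject}")  # t is about -1.526 (the slides report -1.52), so H0 is not rejected
```

The statistic sits above the cutoff, so the sketch reaches the same decision as the slides: do not reject \(H_0\).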
Left-Tail Test: Calculating \(\beta\), the Probability of a Type II Error
• The CFO thinks that a decrease of one day in the average payment return time will cover the costs of the envelopes since customer checks can be deposited earlier.
• What is \(\beta(23)\), the probability of a Type II error when the true mean payment return time \(\mu\) is 23 days?

Left-tail test: calculating \(\beta\) (cont.)
• \(H_0: \mu = 24\), \(H_A: \mu < 24\); choose \(\alpha\) = .05.
• Rejection region in terms of z: \(z < -z_{.05} = -1.645\).
• Rejection region in terms of \(\bar{x}\) (the slides use 75 under the square root rather than n = 76, which barely changes the cutoff):
\[ z = \frac{\bar{x} - 24}{6/\sqrt{75}} < -1.645 \quad \Longrightarrow \quad \bar{x} < 24 - 1.645\,\frac{6}{\sqrt{75}} = 22.86 \]
• So the rejection region with \(\alpha\) = .05, expressed directly rather than in standardized terms, is \(\bar{x} < 22.86\). We do not reject \(H_0\) when \(\bar{x} \ge 22.86\).
• Specify the alternative value under \(H_A\): let the alternative value be \(\mu = 23\) (rather than just \(\mu < 24\)), so we compare \(H_0: \mu = 24\) with \(H_A: \mu = 23\).
\[ \beta(23) = P(\bar{x} > 22.86 \mid H_0 \text{ is false}) = P(\bar{x} > 22.86 \mid \mu = 23) = P\!\left(z > \frac{22.86 - 23}{6/\sqrt{75}}\right) = P(z > -.20) = .58 \]
• Power when \(\mu = 23\): \(1 - \beta(23) = .42\).

A Two-Tail Test for \(\mu\)
• The Federal Communications Commission (FCC) wants competition between phone companies. The FCC wants to investigate whether AT&T rates differ from their competitor's rates.
• According to data from the FCC, the mean monthly long-distance bill for all AT&T residential customers is $17.09.

A Two-Tail Test (cont.)
• A random sample of 100 AT&T customers is selected and their bills are recalculated using a leading competitor's rates.
• The mean and standard deviation of the bills using the competitor's rates are \(\bar{x} = \$17.55\), \(s = \$3.87\).
• Can we infer that there is a difference between AT&T's bills and the competitor's bills (on average)?

A Two-Tail Test (cont.)
• Is the mean different from 17.09?
\[ H_0: \mu = 17.09 \qquad H_A: \mu \ne 17.09 \]
• n = 100; use \(\alpha\) = .05.

A Two-Tail Test (cont.)
• Rejection region: \(t < -t_{.025,\,99}\) or \(t > t_{.025,\,99}\), that is, \(t < -1.9842\) or \(t > 1.9842\) (area \(\alpha/2 = 0.025\) in each tail of the \(t_{99}\) distribution).
• Test statistic:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{17.55 - 17.09}{3.87/\sqrt{100}} = 1.19 \]

A Two-Tail Test: Conclusion
• Since t = 1.19 lies between -1.9842 and 1.9842, there is insufficient evidence to conclude that there is a difference between the bills of AT&T and the competitor.
• Also, by the P-value approach: the P-value = P(t < -1.19) + P(t > 1.19) = 2(.1184) = .2368 > .05.
• A Type II error is possible.

Two-Tail Test: Calculating \(\beta\), the Probability of a Type II Error
• The FCC would like to detect a decrease of $1.50 in the average competitor's bill (17.09 - 1.50 = 15.59).
• What is \(\beta(15.59)\), the probability of a Type II error when the true mean competitor's bill \(\mu\) is $15.59?

Two-Tail Test: Calculating \(\beta\) (cont.)
• Rejection region in terms of z: \(z < -z_{.025}\) or \(z > z_{.025}\), that is, \(z < -1.96\) or \(z > 1.96\).
• Rejection region in terms of \(\bar{x}\):
\[ z = \frac{\bar{x} - 17.09}{3.87/\sqrt{100}} < -1.96 \quad \Longrightarrow \quad \bar{x} < 17.09 - 1.96\,\frac{3.87}{\sqrt{100}} = 16.33 \]
\[ z = \frac{\bar{x} - 17.09}{3.87/\sqrt{100}} > 1.96 \quad \Longrightarrow \quad \bar{x} > 17.09 + 1.96\,\frac{3.87}{\sqrt{100}} = 17.85 \]
• Reject \(H_0\) if \(\bar{x} < 16.33\) or \(\bar{x} > 17.85\); do not reject \(H_0\) if \(16.33 \le \bar{x} \le 17.85\).

Two-Tail Test: Calculating \(\beta\) (cont.)
• Compare \(H_0: \mu = 17.09\) with \(H_A: \mu = 15.59\) at \(\alpha\) = .05:
\[ \beta(15.59) = P(16.33 < \bar{x} < 17.85 \mid \mu = 15.59) = P\!\left(\frac{16.33 - 15.59}{3.87/\sqrt{100}} < \frac{\bar{x} - 15.59}{3.87/\sqrt{100}} < \frac{17.85 - 15.59}{3.87/\sqrt{100}}\right) = P(1.912 < z < 5.84) = .028 \]
• Power when \(\mu = 15.59\): \(1 - \beta(15.59) = .972\).

General formula: Type II Error Probability \(\beta(\mu_A)\) for a Level \(\alpha\) Test
• \(H_A: \mu > \mu_0\):
\[ \beta(\mu_A) = P\!\left(z < z_\alpha + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) \]
• \(H_A: \mu < \mu_0\):
\[ \beta(\mu_A) = 1 - P\!\left(z < -z_\alpha + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) \]
• \(H_A: \mu \ne \mu_0\):
\[ \beta(\mu_A) = P\!\left(z < z_{\alpha/2} + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) - P\!\left(z < -z_{\alpha/2} + \frac{\mu_0 - \mu_A}{\sigma/\sqrt{n}}\right) \]

Sample Size n for which a level \(\alpha\) test also has \(\beta(\mu_A) = \beta\)
• For a 1-tailed (right or left) test:
\[ n = \left[\frac{\sigma\,(z_\alpha + z_\beta)}{\mu_0 - \mu_A}\right]^2 \]
• For a 2-tailed test (approximate solution):
\[ n = \left[\frac{\sigma\,(z_{\alpha/2} + z_\beta)}{\mu_0 - \mu_A}\right]^2 \]
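
The general formulas above translate almost line for line into code. The sketch below, in standard-library Python, is one possible rendering; the function names `type2_error` and `sample_size_for_beta` and the `tail` argument are hypothetical conventions introduced here, and the billing-system target of \(\beta\) = .05 in the last call is an illustrative assumption, not a value from the slides. Exact normal quantiles are used, so results differ slightly from values obtained with rounded table entries.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal


def type2_error(mu0, mu_a, sigma, n, alpha, tail):
    """beta(mu_a) for a level-alpha z test; tail is 'right', 'left' or 'two'."""
    shift = (mu0 - mu_a) / (sigma / sqrt(n))
    if tail == "right":                       # HA: mu > mu0
        return Z.cdf(Z.inv_cdf(1 - alpha) + shift)
    if tail == "left":                        # HA: mu < mu0
        return 1 - Z.cdf(-Z.inv_cdf(1 - alpha) + shift)
    if tail == "two":                         # HA: mu != mu0
        z_half = Z.inv_cdf(1 - alpha / 2)
        return Z.cdf(z_half + shift) - Z.cdf(-z_half + shift)
    raise ValueError("tail must be 'right', 'left' or 'two'")


def sample_size_for_beta(mu0, mu_a, sigma, alpha, beta, tail):
    """Smallest n (rounded up) giving a level-alpha test with beta(mu_a) <= beta.

    Exact for one-tailed tests; an approximation for the two-tailed test.
    """
    z_alpha = Z.inv_cdf(1 - (alpha / 2 if tail == "two" else alpha))
    z_beta = Z.inv_cdf(1 - beta)
    return ceil((sigma * (z_alpha + z_beta) / (mu0 - mu_a)) ** 2)


# The three worked examples from the slides
beta_right = type2_error(170, 180, 65, 400, 0.05, "right")     # about 0.076
beta_left = type2_error(24, 23, 6, 75, 0.05, "left")           # about 0.58
beta_two = type2_error(17.09, 15.59, 3.87, 100, 0.05, "two")   # about 0.028

# Illustrative assumption: how many accounts would give beta(180) = .05?
n_needed = sample_size_for_beta(170, 180, 65, 0.05, 0.05, "right")

print(beta_right, beta_left, beta_two, n_needed)
```

Feeding `n_needed` back into `type2_error` is a simple way to confirm that the resulting \(\beta\) does not exceed the target.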
