STAT 200 FINAL PRACTICE TEST (SOLUTIONS)

1.
A) $\frac{15}{56} + \frac{6}{56} = .375$
B) $\frac{3}{8} = .375$
C) $\frac{15}{56} = .2679$
D) $P(G1 \mid B2) = \frac{P(G1 \cap B2)}{P(B2)} = \frac{15/56}{35/56} = \frac{15}{35} = .4286$ (note $\cap$ means "and")

2.
A) $\frac{15}{64} + \frac{9}{64} = .375$
B) $\frac{3}{8} = .375$
C) $\frac{15}{64} = .2344$
D) $P(G1 \mid B2) = \frac{P(G1 \cap B2)}{P(B2)} = \frac{15/64}{40/64} = \frac{15}{40} = .375$
E) $1 - \left[\binom{10}{0}(.375)^0(.625)^{10} + \binom{10}{1}(.375)^1(.625)^9\right] = .9363$
F) $\binom{10}{2}(.375)^2(.625)^8 = .1473$
G) $\mu = np = 100(.625) = 62.5$, $\sigma = \sqrt{npq} = \sqrt{100(.625)(.375)} = 4.841$
H) $z = \frac{59.5 - 62.5}{4.841} = -.62$, $A = .5000 + .2324 = .7324$

3A. $\sum x = 3780$, $\sum x^2 = 3584600$, $n = 4$

$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{3584600 - \frac{3780^2}{4}}{3} = 4166.67$

CI for variance or standard deviation use: Confidence interval for population variance: Use $\chi^2 = \frac{df \cdot s^2}{\sigma^2}$, get two values from the $\chi^2$ table and solve for $\sigma^2$ twice. (Take the square root if you want $\sigma$.)

df = 3 and the two $\chi^2$ table numbers are .216 and 9.348.

$.216 = \frac{3(4166.67)}{\sigma^2}$ so $\sigma^2 = \frac{3(4166.67)}{.216} = 57870.4$, and $\sigma^2 = \frac{3(4166.67)}{9.348} = 1337.18$

The interval for $\sigma^2$ is 1337.18 to 57870.4.

3B. Sample size needed for CI for mean use: $n = \left(\frac{z\sigma}{E}\right)^2$

$n = \left(\frac{1.960(80)}{10}\right)^2 = 245.86$, so $n = 246$

3C. Sample size needed for CI for proportion use: $n = \frac{z^2 pq}{E^2}$ (use $p = q = .5$ to guarantee the sample size is large enough, use $p'$ in place of $p$ to get a reasonable estimate)

$n = \frac{z^2 pq}{E^2} = \frac{1.960^2(.5)(.5)}{.015^2} = 4268.4$, so $n = 4269$

3D. Difference of three means use ANOVA: Collect data from SRS's from $m$ different groups. Assume the populations are normal and the data are collected independently. Then the F statistic for rejecting $H_0$: All population means are equal in favor of $H_a$: There is some difference in the population means is found as follows (the sums are over each of the $S$ sources):

$F_{data} = \frac{s^2_{factor}}{s^2_{error}}$

where

$s^2_{factor} = \frac{\sum n_i(\bar{x}_i - \bar{x})^2}{df_{factor}}$, $df_{factor} = S - 1$

$s^2_{error} = \frac{\sum (n_i - 1)s_i^2}{df_{error}}$, $df_{error} = \sum n_i - S$

$\bar{x} = \frac{\sum n_i \bar{x}_i}{n}$ is the mean of all the data, $n = \sum n_i$, $n_i$ is how many pieces of data come from source $i$, $\bar{x}_i$ is the sample mean of all the data from source $i$, and $s_i^2$ is the sample variance of all the data from source $i$.
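The ANOVA formulas above can be sketched in a few lines of Python. This is only an illustrative check, not part of the original solutions: the function name `one_way_anova_f` is made up here, and the input is the set of three bulb-lifetime samples from the worked example that follows.

```python
from statistics import mean, variance


def one_way_anova_f(groups):
    """Return (F_data, df_factor, df_error) for a list of samples.

    Hypothetical helper that follows the formulas in 3D term by term.
    """
    sizes = [len(g) for g in groups]            # n_i
    means = [mean(g) for g in groups]           # x-bar_i
    variances = [variance(g) for g in groups]   # s_i^2 (divides by n_i - 1)

    n_total = sum(sizes)                        # n = sum of n_i
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

    df_factor = len(groups) - 1                 # S - 1
    df_error = n_total - len(groups)            # sum of n_i, minus S

    s2_factor = sum(
        n * (m - grand_mean) ** 2 for n, m in zip(sizes, means)
    ) / df_factor
    s2_error = sum(
        (n - 1) * v for n, v in zip(sizes, variances)
    ) / df_error

    return s2_factor / s2_error, df_factor, df_error


# Bulb lifetimes for Company 1, 2 and 3 from the worked example.
bulbs = [[990, 1010, 900, 880], [1000, 900, 1200], [850, 800, 1000]]
f_data, df_factor, df_error = one_way_anova_f(bulbs)
print(round(f_data, 3), df_factor, df_error)
```

The resulting F value is then compared with the right-tail critical value from the F table for (df_factor, df_error), since the test is one-tailed to the right.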
The ANOVA test is always a one-tail test to the right.

                          Company 1             Company 2          Company 3
Number of bulbs studied   4                     3                  3
Lifetimes                 990, 1010, 900, 880   1000, 900, 1200    850, 800, 1000
$\sum x$                  3780                  3100               2650
$\sum x^2$                3584600               3250000            2362500
Sample mean               945                   1033.33            883.33
Sample variance           4166.67               23333.33           10833.33

$H_0$: All population means are equal
$H_a$: There is some difference in the population means

$S = 3$, $df_{factor} = S - 1 = 2$, $df_{error} = \sum n_i - S = 10 - 3 = 7$

$\bar{x} = \frac{\sum n_i \bar{x}_i}{n} = \frac{3780 + 3100 + 2650}{10} = 953$

$s^2_{factor} = \frac{\sum n_i(\bar{x}_i - \bar{x})^2}{df_{factor}} = \frac{4(945 - 953)^2 + 3(1033.33 - 953)^2 + 3(883.33 - 953)^2}{2} = 17088.2267$

$s^2_{error} = \frac{\sum (n_i - 1)s_i^2}{df_{error}} = \frac{3(4166.67) + 2(23333.33) + 2(10833.33)}{7} = 11547.619$

$F_{data} = \frac{s^2_{factor}}{s^2_{error}} = \frac{17088.2267}{11547.619} = 1.480$

Critical value: $F_{2,7} = 4.7374$. Since $1.480 < 4.7374$: NO

If there is no difference, the chance we would find such strong or stronger evidence than we got that there is a difference is over 5%. This is assuming all conditions were met and the data were obtained in a proper fashion.

3E1) Difference of two means from independent samples use: t, df = min of the sample sizes - 1, with

$t = \frac{(\text{difference of the two sample means, subtracted in appropriate order}) - (\text{difference of the population means in } H_0 \text{, often 0})}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

$H_0: \mu_{other} \le \mu$ or $\mu_{other} - \mu \le 0$
$H_a: \mu_{other} > \mu$ or $\mu_{other} - \mu > 0$

[Picture: how $\bar{x}_{other} - \bar{x}$ would be distributed if $\mu_{other} - \mu = 0$ (using t with df = 9)]

$t_{data} = \frac{(1040 - 1000) - 0}{\sqrt{\frac{80^2}{20} + \frac{50^2}{10}}} = \frac{40}{23.8747} = 1.675$

NO

If the mean of the other company is not higher, the chance we would find such strong or stronger evidence than we got that it is higher is between 5% and 10%. This is assuming all conditions were met and the data were obtained in a proper fashion.
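As a rough check of the arithmetic in 3E1, the sketch below computes the two-sample t statistic from the summary values used in the solution. The function name `two_sample_t` and its parameter names are illustrative only, and attaching the pair (s = 80, n = 20) to the first sample is an assumption: the statistic is the same whichever sample that pair belongs to.

```python
from math import sqrt


def two_sample_t(mean1, s1, n1, mean2, s2, n2, null_diff=0.0):
    """Return (t_data, df, standard_error) for two independent samples.

    Uses the conservative df = min(n1, n2) - 1 stated in the solutions.
    """
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t_data = ((mean1 - mean2) - null_diff) / standard_error
    df = min(n1, n2) - 1
    return t_data, df, standard_error


# Summary values from 3E1; which sample has (80, 20) is assumed here.
t_data, df, se = two_sample_t(1040, 80, 20, 1000, 50, 10)
print(round(t_data, 3), df, round(se, 4))
```

The same statistic serves both the one-tail and the two-tail version of the test; only the tail area read from the t table with this df changes.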
3E2)
$H_0: \mu_{other} = \mu$ or $\mu_{other} - \mu = 0$
$H_a: \mu_{other} \ne \mu$ or $\mu_{other} - \mu \ne 0$

[Picture: how $\bar{x}_{other} - \bar{x}$ would be distributed if $\mu_{other} - \mu = 0$ (using t with df = 9)]

$t_{data} = \frac{(1040 - 1000) - 0}{\sqrt{\frac{80^2}{20} + \frac{50^2}{10}}} = \frac{40}{23.8747} = 1.675$

NO

If the mean of the other company is the same, the chance we would find such strong or stronger evidence than we got that it is different is between 10% and 20%. This is assuming all conditions were met and the data were obtained in a proper fashion.

3F) Comparing two variances use: Hypothesis test for ratio of two variances: Use $F_{data} = \frac{s_1^2}{s_2^2}$ and make sure the top is bigger than the bottom. Keep track of the two different df's. Use the F-table for the critical value(s). Note that because $F_{data} > 1$, even if you have two critical values as in a two-tail test, only the right-hand one matters.

Data summary:

                                                      Before = B   After = A
$\sum x$                                              240          233
$\sum x^2$                                            14402        13583
$s^2 = \frac{\sum x^2 - (\sum x)^2/n}{n - 1}$         .6667        3.5833

$H_0: \sigma_A \le \sigma_B$ or $\frac{\sigma_A}{\sigma_B} \le 1$ (note the A went on the top since A's sample standard deviation is bigger)
$H_a: \sigma_A > \sigma_B$ or $\frac{\sigma_A}{\sigma_B} > 1$

[Picture: how $\frac{s_A^2}{s_B^2}$ would be distributed assuming $\sigma_A = \sigma_B$. Note the tail is .05. The df for the top is 3 and for the bottom is 3.]

Critical value: $F_{3,3} = 9.2766$

$F_{data} = \frac{3.5833}{.6667} = 5.375$

NO

If the insert does not raise the variance, the chance we would find such strong or stronger evidence than we got that it does is over 5%. This is assuming all conditions were met and the data were obtained in a proper fashion.

G) Sample size needed for CI for proportion use: $n = \frac{z^2 pq}{E^2}$ (use $p = q = .5$ to guarantee the sample size is large enough, use $p'$ in place of $p$ to get a reasonable estimate)

$n = \frac{z^2 pq}{E^2} = \frac{1.960^2(.3)(.7)}{.015^2} = 3585.5$, so $n = 3586$

H1) Comparing two means from matched pairs use: t with df = n - 1, where the numerator is (sample mean of the differences) - (population mean difference in $H_0$), for the difference of means from 2 dependent samples (a.k.a.
matched pairs). The differences are subtracted in appropriate order, the mean difference in $H_0$ is often 0, and the test statistic is $t = \frac{\bar{x} - \mu_0}{\sqrt{s^2/n}}$, where $s$ is the s.d. of the differences.

          Power consumption before   Power consumption after   before - after
Bulb 1    60                         57                        3
Bulb 2    61                         58                        3
Bulb 3    59                         57                        2
Bulb 4    60                         61                        -1

For the differences: $\sum x = 7$, $\sum x^2 = 23$, $\bar{x} = \frac{\sum x}{n} = \frac{7}{4} = 1.75$

$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{23 - \frac{7^2}{4}}{3}} = 1.893$

$H_0: \mu_B = \mu_A$ or $\mu_B - \mu_A = 0$
$H_a: \mu_B \ne \mu_A$ or $\mu_B - \mu_A \ne 0$

[Picture: how $\bar{x}_B - \bar{x}_A$ would be distributed if $\mu_B - \mu_A = 0$ (using t with df = 3)]

$t_{data} = \frac{1.75 - 0}{1.893/\sqrt{4}} = 1.849$

NO

If the insert does not change power consumption, the chance we would find such strong or stronger evidence than we got that it does is between 10% and 20%. This is assuming all conditions were met and the data were obtained in a proper fashion.

H2)
$H_0: \mu_B \le \mu_A$ or $\mu_B - \mu_A \le 0$
$H_a: \mu_B > \mu_A$ or $\mu_B - \mu_A > 0$

[Picture: how $\bar{x}_B - \bar{x}_A$ would be distributed if $\mu_B - \mu_A = 0$ (using t with df = 3)]

$t_{data} = \frac{1.75 - 0}{1.893/\sqrt{4}} = 1.849$

NO

If the insert does not save power, the chance we would find such strong or stronger evidence than we got that it does is between 5% and 10%. This is assuming all conditions were met and the data were obtained in a proper fashion.

I1) HT for standard deviation use: Hypothesis test for population variance: Get critical value(s) from the $\chi^2$ table, and use $\chi^2_{data} = \frac{df \cdot s^2}{\sigma^2}$ where $\sigma^2$ is from $H_0$.

$H_0: \sigma = 50$
$H_a: \sigma \ne 50$

[Picture: how $\frac{(df)s^2}{\sigma^2}$ would be distributed assuming $\sigma = 50$. Note the tails are each .025. df = 3.]

Critical values: $\chi^2 = .216$ and $\chi^2 = 9.348$

From A: $s^2 = 4166.67$

$\chi^2_{data} = \frac{(df)(s^2)}{\sigma^2} = \frac{(3)(4166.67)}{50^2} = 5$

NO

If the standard deviation is 50, the chance we would find such strong or stronger evidence than we got that it is not 50 is between 20% and 100%. This is assuming all conditions were met and the data were obtained in a proper fashion.

I2)
$H_0: \sigma \le 50$
$H_a: \sigma > 50$

[Picture: how $\frac{(df)s^2}{\sigma^2}$ would be distributed assuming $\sigma = 50$.] Note the tail is .05. df = 3.
Critical value: $\chi^2 = 7.815$

From A: $s^2 = 4166.67$

$\chi^2_{data} = \frac{(df)(s^2)}{\sigma^2} = \frac{(3)(4166.67)}{50^2} = 5$

NO

If the standard deviation is not over 50, the chance we would find such strong or stronger evidence than we got that it is over 50 is between 10% and 90%. This is assuming all conditions were met and the data were obtained in a proper fashion.

J) This is not a HT or a CI; it is a probability question, so we need to find the area under the curve.

$z = \frac{1055 - 1000}{80} = .69$, $A = .5 - .2549 = .2451$

K) CI for proportion use: 1 sample z proportion, with

$z = \frac{(\text{sample proportion } p') - (\text{population proportion } p \text{ in } H_0)}{\text{standard deviation}}$, where $p'$ = number of successes / n

Standard deviation for HT: $\sqrt{\frac{pq}{n}}$. Standard deviation for CI: $\sqrt{\frac{p'q'}{n}}$.

$p' = \frac{22}{60} = .3667$

$.3667 \pm 1.960\sqrt{\frac{(.3667)(.6333)}{60}}$, or $36.67\% \pm 12.19\%$

L) This is not a HT or a CI; it is a probability question, so we need to find the area under the curve.

$z = \frac{900 - 1000}{80/\sqrt{4}} = -2.50$, $A = .5 + .4938 = .9938$

M1) Difference of two proportions (percentages, or probabilities of success) from 2 samples, use z, with

$z = \frac{(\text{difference of the two sample proportions, subtracted in appropriate order}) - (\text{difference in population proportions in } H_0 \text{, often 0})}{\text{standard deviation}}$

The 0 and non-0 differences have different standard deviations:

HT (0 case): $\sqrt{\frac{p'_{pool}q'_{pool}}{n_1} + \frac{p'_{pool}q'_{pool}}{n_2}}$ where $p'_{pool} = \frac{x_1 + x_2}{n_1 + n_2}$

CI & HT (non-0 case): $\sqrt{\frac{p'_1 q'_1}{n_1} + \frac{p'_2 q'_2}{n_2}}$

$H_0: p_{Other} = p$ or $p_{Other} - p = 0$
$H_a: p_{Other} \ne p$ or $p_{Other} - p \ne 0$

[Picture: how all $p'_{Other} - p'$ would be distributed if $p_{Other} - p = 0$. The best evidence that $H_a$ is true is in the shaded part that is in both tails. The total shaded area is .05.]

$p'_{pool} = \frac{22 + 20}{60 + 50} = .3818$

$z_{data} = \frac{\left(\frac{20}{50} - \frac{22}{60}\right) - 0}{\sqrt{\frac{(.3818)(.6182)}{50} + \frac{(.3818)(.6182)}{60}}} = \frac{.0333333333}{.0930289625} = .358$

NO

If the percentages are the same, the chance we would find such strong or stronger evidence than we got that they differ is .7188. This is assuming all conditions were met and the data were obtained in a proper fashion.

M2)
$H_0: p_{Other} \le p$ or $p_{Other} - p \le 0$
$H_a: p_{Other} > p$ or $p_{Other} - p > 0$

Picture of how all $p'_{Other} - p'$ would be distributed if $p_{Other} - p = 0$.
The best evidence that $H_a$ is true is in the right tail of .05.

$p'_{pool} = \frac{22 + 20}{60 + 50} = .3818$

$z_{data} = \frac{\left(\frac{20}{50} - \frac{22}{60}\right) - 0}{\sqrt{\frac{(.3818)(.6182)}{50} + \frac{(.3818)(.6182)}{60}}} = \frac{.0333333333}{.0930289625} = .358$

NO

If the percentage is not greater for the other company, the chance we would find such strong or stronger evidence than we got that it is higher is .3594. This is assuming all conditions were met and the data were obtained in a proper fashion.

N) Matched pairs, use: Difference of means from 2 dependent samples (a.k.a. matched pairs): t, df = n - 1, built from the sample mean of the differences (subtracted in appropriate order), the population mean difference in $H_0$ (often 0), and the standard deviation $\sqrt{\frac{s^2}{n}}$, where $s$ is the s.d. of the differences.

From earlier: $\bar{x} = \frac{\sum x}{n} = \frac{7}{4} = 1.75$ and $s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{23 - \frac{7^2}{4}}{3}} = 1.893$

$1.75 \pm 3.182\left(\frac{1.893}{\sqrt{4}}\right)$, or $1.75 \pm 3.012$

O) Difference of two proportions (percentages, or probabilities of success) from 2 samples, use z, built from the difference of the two sample proportions (subtracted in appropriate order) and the difference in population proportions in $H_0$ (often 0). The 0 and non-0 differences have different standard deviations:

HT (0 case): $\sqrt{\frac{p'_{pool}q'_{pool}}{n_1} + \frac{p'_{pool}q'_{pool}}{n_2}}$ where $p'_{pool} = \frac{x_1 + x_2}{n_1 + n_2}$

CI & HT (non-0 case): $\sqrt{\frac{p'_1 q'_1}{n_1} + \frac{p'_2 q'_2}{n_2}}$

"other - original": $p'_{other} = \frac{20}{50} = .40$, $p'_{original} = \frac{22}{60} = .3667$

$(.40 - .3667) \pm 1.960\sqrt{\frac{(.4)(.6)}{50} + \frac{(.3667)(.6333)}{60}}$, or $3.33\% \pm 18.25\%$

P) One mean, use: 1 sample mean t, df = n - 1, built from the sample mean, the population mean in $H_0$, and the standard deviation $\sqrt{\frac{s^2}{n}}$.

From A) $\sum x = 3780$, $\sum x^2 = 3584600$

$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{3584600 - \frac{3780^2}{4}}{3} = 4166.67$, $\bar{x} = \frac{3780}{4} = 945$

$945 \pm 3.182\sqrt{\frac{4166.67}{4}}$, or $945 \pm 102.70$

Q1) HT for proportion use: 1 sample z proportion, with $p'$ = number of successes / n as the sample proportion and $p$ the population proportion in $H_0$. Standard deviation for HT: $\sqrt{\frac{pq}{n}}$. Standard deviation for CI: $\sqrt{\frac{p'q'}{n}}$.

$H_0: p = \frac{1}{3}$
$H_a: p \ne \frac{1}{3}$

Picture of how all $p'$'s would be distributed if $p = \frac{1}{3}$. The best evidence for $p \ne \frac{1}{3}$ is in both tails. The total area of both tails is .05.
$z_{data} = \frac{\frac{22}{60} - \frac{1}{3}}{\sqrt{\frac{(.3333)(.6667)}{60}}} = \frac{.03333}{.06086} = .548$

NO

If the percentage is 1/3, the chance we would find such strong or stronger evidence than we got that it is not 1/3 is .5824 (twice the one-tail area of .2912). This is assuming all conditions were met and the data were obtained in a proper fashion.

Q2)
$H_0: p \le \frac{1}{3}$
$H_a: p > \frac{1}{3}$

[Picture: how all $p'$'s would be distributed if $p = \frac{1}{3}$. The best evidence for $p > \frac{1}{3}$ is in the right tail of .05.]

$z_{data} = \frac{\frac{22}{60} - \frac{1}{3}}{\sqrt{\frac{(.3333)(.6667)}{60}}} = \frac{.03333}{.06086} = .548$

NO

If the percentage is not over 1/3, the chance we would find such strong or stronger evidence than we got that it is over 1/3 is .2912. This is assuming all conditions were met and the data were obtained in a proper fashion.

R1) One mean, use: 1 sample mean t, df = n - 1, with

$t = \frac{(\text{sample mean}) - (\text{population mean in } H_0)}{\sqrt{\frac{s^2}{n}}}$

$H_0: \mu = 1000$
$H_a: \mu \ne 1000$

[Picture: how all $\bar{x}$'s would be distributed if $\mu = 1000$. The best evidence that $\mu \ne 1000$ is in both tails. Each tail is .025.]

From earlier: $\sum x = 3780$, $\sum x^2 = 3584600$

$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{3584600 - \frac{3780^2}{4}}{3} = 4166.67$, $\bar{x} = \frac{3780}{4} = 945$

$t_{data} = \frac{945 - 1000}{\sqrt{\frac{4166.67}{4}}} = \frac{-55}{32.27487} = -1.704$

NO

If the mean was 1000, the chance we would find such strong or stronger evidence than we got that it is not 1000 is between 10% and 20%. This is assuming all conditions were met and the data were obtained in a proper fashion.

R2)
$H_0: \mu \ge 1000$
$H_a: \mu < 1000$

[Picture: how all $\bar{x}$'s would be distributed if $\mu = 1000$. The best evidence that $\mu < 1000$ is in the left tail of .05.]

From earlier: $\sum x = 3780$, $\sum x^2 = 3584600$

$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{3584600 - \frac{3780^2}{4}}{3} = 4166.67$, $\bar{x} = \frac{3780}{4} = 945$

$t_{data} = \frac{945 - 1000}{\sqrt{\frac{4166.67}{4}}} = \frac{-55}{32.27487} = -1.704$

NO

If the mean was at least 1000, the chance we would find such strong or stronger evidence than we got that it is less than 1000 is between 5% and 10%. This is assuming all conditions were met and the data were obtained in a proper fashion.
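The one-sample t statistic in R1 and R2 can be recomputed directly from the four Company 1 lifetimes. The sketch below is only an illustrative check; the function name `one_sample_t` is made up here and is not part of the original solutions.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """Return (t_data, df) for testing a population mean against mu0."""
    n = len(data)
    standard_error = stdev(data) / sqrt(n)   # sqrt(s^2 / n)
    t_data = (mean(data) - mu0) / standard_error
    return t_data, n - 1


# Company 1 bulb lifetimes (sum 3780, sum of squares 3584600).
lifetimes = [990, 1010, 900, 880]
t_data, df = one_sample_t(lifetimes, 1000)
print(round(t_data, 3), df)
```

The statistic is identical for the two-tail test (R1) and the left-tail test (R2); only the tail area looked up in the t table differs.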
S) Difference of two means from independent samples use: t, df = min of the sample sizes - 1, built from the difference of the two sample means (subtracted in appropriate order), the difference of the population means in $H_0$ (often 0), and the standard deviation $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$.

"Other - Original": $(1040 - 1000) \pm 2.262\sqrt{\frac{80^2}{20} + \frac{50^2}{10}}$, or $40 \pm 54.005$

T) z

4. Show two characteristics are related, use: O and E stuff: Right tails only.

$\chi^2_{data} = \sum \frac{(O - E)^2}{E}$

$H_0$: two characteristics are independent
$H_a$: they are related
df = (r - 1)(c - 1)
E's are found by (row total)(column total)/(grand total)

$H_0$: color and year independent
$H_a$: color and year related

O's      black   white   blue   tan   Totals
2000     25      35      10     10    80
2001     60      60      16     24    160
2002     32      30      11     7     80
totals   117     125     37     41    320

E's      black                   white                   blue                   tan
2000     80(117)/320 = 29.25     80(125)/320 = 31.25     80(37)/320 = 9.25      80(41)/320 = 10.25
2001     160(117)/320 = 58.5     160(125)/320 = 62.5     160(37)/320 = 18.5     160(41)/320 = 20.5
2002     80(117)/320 = 29.25     80(125)/320 = 31.25     80(37)/320 = 9.25      80(41)/320 = 10.25

[Picture: how all $\sum \frac{(O - E)^2}{E}$'s would be distributed if color and year were independent.]

df = (4 - 1)(3 - 1) = 6, $\chi^2_{table} = 12.592$

$\chi^2_{data} = \sum \frac{(O - E)^2}{E} = \frac{4.25^2}{29.25} + \frac{3.75^2}{31.25} + \frac{.75^2}{9.25} + \frac{.25^2}{10.25} + \frac{1.5^2}{58.5} + \frac{2.5^2}{62.5} + \frac{2.5^2}{18.5} + \frac{3.5^2}{20.5} + \frac{2.75^2}{29.25} + \frac{1.25^2}{31.25} + \frac{1.75^2}{9.25} + \frac{3.25^2}{10.25} = 3.878$

NO

If there was no relationship, the chance we would find such strong or stronger evidence than we got that there is one is between 10% and 90%. This is assuming all conditions were met and the data were obtained in a proper fashion.

5. Showing data is not distributed a certain way, use: O and E stuff: Right tails only.
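Use $\chi^2_{data} = \sum \frac{(O - E)^2}{E}$

$H_0$: data distributed a certain way
$H_a$: it's not distributed that way
df = number of categories - 1
E's are found using $np$ where $p$ is the probability of being in a category in $H_0$

$H_0: p_{black} = p_{white} = .35$, $p_{blue} = p_{tan} = .15$
$H_a$: not as above

       black           white           blue            tan             Total
O's    25              35              10              10              80
E's    80(.35) = 28    80(.35) = 28    80(.15) = 12    80(.15) = 12    80

[Picture: how all $\sum \frac{(O - E)^2}{E}$'s would be distributed if the colors were distributed as stated in $H_0$.]

df = (4 - 1) = 3, $\chi^2_{table} = 7.815$

$\chi^2_{data} = \sum \frac{(O - E)^2}{E} = \frac{9}{28} + \frac{49}{28} + \frac{4}{12} + \frac{4}{12} = 2.738$

NO

If the colors were distributed 35-35-15-15, the chance we would find such strong or stronger evidence than we got that the data is not distributed that way is between 10% and 90%. This is assuming all conditions were met and the data were obtained in a proper fashion.

6.
A) [Scatter plot: production level (vertical axis, 0 to 80) against dexterity score (horizontal axis, 0 to 20).]

$\sum x = 70$, $\sum y = 306$, $\sum x^2 = 1006$, $\sum y^2 = 18984$, $\sum xy = 4362$, $n = 5$

$\sum x^2 - \frac{(\sum x)^2}{n} = 26$, $\sum y^2 - \frac{(\sum y)^2}{n} = 256.8$, $\sum xy - \frac{\sum x \sum y}{n} = 78$

B) $r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}} = \frac{78}{\sqrt{(26)(256.8)}} = .955$

C)
$H_0: \rho \le 0$
$H_a: \rho > 0$

$t_{table} = 2.353$

$t_{data} = r\sqrt{\frac{n - 2}{1 - r^2}} = .955\sqrt{\frac{3}{1 - .955^2}} = 5.58$

YES

D) $m = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \frac{78}{26} = 3$

$b = \frac{\sum y}{n} - m\frac{\sum x}{n} = 61.2 - 3(14) = 19.2$

$\hat{y} = 3x + 19.2$ or productivity = 3(dexterity) + 19.2

Plugging in x = 0 we get y = 19.2 and plugging in x = 14 we get y = 61.2. Next we plot the points and draw the line in the graph.

E) $s_e = \sqrt{\frac{\sum y^2 - b\sum y - m\sum xy}{n - 2}} = 2.757$

$\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum x^2 - \frac{(\sum x)^2}{n}}} = \sqrt{1 + \frac{1}{5} + \frac{(14 - 14)^2}{26}} = 1.095$

$61.2 \pm 3.182(2.757)(1.095)$ or $61.2 \pm 9.61$ (this form, with the extra 1 under the root, is the interval for an individual y at $x_0 = 14$)

F) $\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum x^2 - \frac{(\sum x)^2}{n}}} = \sqrt{\frac{1}{5} + \frac{(14 - 14)^2}{26}} = .447$

$61.2 \pm 3.182(2.757)(.447)$ or $61.2 \pm 3.92$ (this form is the interval for the mean of y at $x_0 = 14$)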
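The regression arithmetic in problem 6, parts B through F, can be checked from the five sums alone. The sketch below is only an illustration under stated assumptions: the variable names are made up here, and the critical value 3.182 is simply copied from the solution (t table, df = 3) rather than computed.

```python
from math import sqrt

# Sums taken from the solution to problem 6.
n = 5
sum_x, sum_y = 70, 306
sum_x2, sum_y2, sum_xy = 1006, 18984, 4362

# Corrected sums of squares and cross products.
sxx = sum_x2 - sum_x ** 2 / n        # 26
syy = sum_y2 - sum_y ** 2 / n        # 256.8
sxy = sum_xy - sum_x * sum_y / n     # 78

r = sxy / sqrt(sxx * syy)                    # correlation (part B)
t_data = r * sqrt((n - 2) / (1 - r ** 2))    # test statistic (part C)

m = sxy / sxx                                # slope (part D)
b = sum_y / n - m * sum_x / n                # intercept (part D)

# Standard error of the estimate, as in part E.
s_e = sqrt((sum_y2 - b * sum_y - m * sum_xy) / (n - 2))

x0 = 14
x_bar = sum_x / n
y_hat = m * x0 + b
t_crit = 3.182  # copied from the solution, not computed here

spread = (x0 - x_bar) ** 2 / sxx
margin_individual = t_crit * s_e * sqrt(1 + 1 / n + spread)
margin_mean = t_crit * s_e * sqrt(1 / n + spread)

print(round(r, 3), round(t_data, 2), m, round(b, 1), round(s_e, 3))
print(round(y_hat, 1), round(margin_individual, 2), round(margin_mean, 2))
```

Because the code keeps r unrounded, its t statistic comes out slightly smaller than a hand calculation that plugs in r = .955; either way it is well above the table value 2.353.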