Princeton University
Department of Operations Research and Financial Engineering
ORF 245 – Fundamentals of Engineering Statistics
Final Exam
May 22, 2008, 7:30pm-10:30pm

PLEASE DO NOT TURN THIS PAGE AND START THE EXAM UNTIL YOU ARE TOLD TO DO SO.

Instructions: This exam is open book and open notes. Calculators are allowed, but not computers or the use of statistical software packages. Write all your work in the space provided after each question. There are questions on both sides of each page. Explain as thoroughly and as clearly as possible all your steps in answering each question. Full or partial credit can only be granted if intermediate steps are clearly indicated.

Name: _______________________________________________________
Pledge: I pledge my honor that I have not violated the honor code during this examination.
Signature: ____________________________________________________

 1: (12) _______    6: (15) _______   11: (12) _______
 2: (06) _______    7: (20) _______   12: (10) _______
 3: (10) _______    8: (10) _______   13: (10) _______
 4: (05) _______    9: (20) _______   14: (12) _______
 5: (05) _______   10: (08) _______   15: (20) _______
Total: (175) ___________

Descriptive Statistics:

1) Let $\bar{x}_n$ and $s_n^2$ denote the sample mean and variance for the sample $x_1, \ldots, x_n$, and let $\bar{x}_{n+1}$ and $s_{n+1}^2$ denote these quantities when an additional observation $x_{n+1}$ is added to the sample.

a) (4 pts.) Show how $\bar{x}_{n+1}$ can be computed from $\bar{x}_n$ and $x_{n+1}$.

\[ \bar{x}_{n+1} = \frac{1}{n+1} \sum_{i=1}^{n+1} x_i = \frac{1}{n+1} \left( \sum_{i=1}^{n} x_i + x_{n+1} \right) = \frac{1}{n+1} \left( n \bar{x}_n + x_{n+1} \right) \]

b) (8 pts.) Show that
\[ n s_{n+1}^2 = (n-1) s_n^2 + \frac{n}{n+1} \left( x_{n+1} - \bar{x}_n \right)^2 \]
so that $s_{n+1}^2$ can be computed from $x_{n+1}$, $\bar{x}_n$, and $s_n^2$.

Item dropped – do not grade.

2) Consider the following histogram that shows the time in months that articles submitted to a certain scientific journal in 2002 took to be reviewed for publication.

a) (3 pts.) Which class interval contains the median review time?
Reading the approximate areas under the histogram (prob/cum):

Interval (months)   Prob.    Cum.
0-1                 0.27     0.27
1-2                 0.10     0.37
2-3                 0.105    0.475
3-4                 0.11     0.585
4-5                 0.07     0.655
5-6                 0.08     0.735
6-7                 0.075    0.810
7-8                 0.125    0.935
8-9                 0.065    1.0

The median review time falls in the 3 to 4 month category.

b) (3 pts.) Which class interval contains the third quartile of the review times?

The third quartile of the review times falls in the 6 to 7 month category. (An answer of 5 to 6 months is also acceptable, given the uncertainty in the reading of areas in the histogram; but 7 to 8 months is not acceptable.)

Probability:

3) Items are inspected for flaws by two quality inspectors. If a flaw is present, it will be detected by the first inspector with probability 0.9, and by the second inspector with probability 0.7. Assume that the inspectors function independently.

a) (4 pts.) If an item has a flaw, what is the probability that it will be found by at least one of the inspectors?

Let $I_i$ be the event that inspector $i$ finds a flaw, $i \in \{1, 2\}$. Then
$\Pr(I_1 \mid \text{flaw}) = 0.9$, $\Pr(I_1^c \mid \text{flaw}) = 0.1$, $\Pr(I_2 \mid \text{flaw}) = 0.7$, $\Pr(I_2^c \mid \text{flaw}) = 0.3$, and

\[ \Pr(\text{flaw found by at least one inspector}) = \Pr(I_1 \cup I_2 \mid \text{flaw}) = \Pr(I_1 \mid \text{flaw}) + \Pr(I_2 \mid \text{flaw}) - \Pr(I_1 \cap I_2 \mid \text{flaw}) = 0.9 + 0.7 - 0.9 \times 0.7 = 0.97 \]

b) (6 pts.) Assume that both inspectors inspect every item and that if an item has no flaw, then neither inspector will detect a flaw. Assume also that the probability that an item has a flaw is 0.10. If an item is passed by both inspectors, what is the probability that it actually has a flaw?
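As a cross-check on parts (a) and (b) of Problem 3, the sketch below evaluates both probabilities with exact fractions. The function name `inspection_probabilities` and its argument names are illustrative choices, not part of the exam. It assumes, as the problem states, that the inspectors act independently and that an unflawed item is never flagged.

```python
from fractions import Fraction


def inspection_probabilities(p1, p2, p_flaw):
    """Return (P(found by at least one | flaw), P(flaw | passed by both)).

    p1, p2 : detection probabilities of inspectors 1 and 2 given a flaw
    p_flaw : prior probability that an item is flawed
    Assumes independent inspectors and no false detections.
    """
    # Both inspectors miss a flawed item
    p_both_miss = (1 - p1) * (1 - p2)
    # Complement rule: at least one inspector finds the flaw
    p_found = 1 - p_both_miss
    # Total probability of passing both inspectors
    # (an unflawed item passes with probability 1)
    p_pass = p_both_miss * p_flaw + (1 - p_flaw)
    # Bayes' rule
    p_flaw_given_pass = p_both_miss * p_flaw / p_pass
    return p_found, p_flaw_given_pass


p_found, p_flaw_given_pass = inspection_probabilities(
    Fraction(9, 10), Fraction(7, 10), Fraction(1, 10)
)
print(p_found)            # 97/100
print(p_flaw_given_pass)  # 1/301
```

The value 1/301 is approximately 0.003322, so an item that passes both inspectors is flawed only about 0.33% of the time.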
$\Pr(I_1 \mid \text{no flaw}) = 0$, $\Pr(I_1^c \mid \text{no flaw}) = 1$, $\Pr(I_2 \mid \text{no flaw}) = 0$, $\Pr(I_2^c \mid \text{no flaw}) = 1$

\[ \Pr(\text{an item passed by both inspectors is actually flawed}) = \Pr(\text{flaw} \mid I_1^c \cap I_2^c) = \frac{\Pr(I_1^c \cap I_2^c \cap \text{flaw})}{\Pr(I_1^c \cap I_2^c)} \]

Denominator, by the law of total probability and the independence of the inspectors:
\[ \Pr(I_1^c \cap I_2^c) = \Pr(I_1^c \cap I_2^c \mid \text{flaw}) \Pr(\text{flaw}) + \Pr(I_1^c \cap I_2^c \mid \text{no flaw}) \Pr(\text{no flaw}) \]
\[ = \Pr(I_1^c \mid \text{flaw}) \Pr(I_2^c \mid \text{flaw}) \Pr(\text{flaw}) + \Pr(I_1^c \mid \text{no flaw}) \Pr(I_2^c \mid \text{no flaw}) \Pr(\text{no flaw}) = 0.1 \times 0.3 \times 0.1 + 1 \times 1 \times 0.9 = 0.903 \]

Numerator:
\[ \Pr(I_1^c \cap I_2^c \cap \text{flaw}) = \Pr(I_1^c \mid \text{flaw}) \Pr(I_2^c \mid \text{flaw}) \Pr(\text{flaw}) = 0.1 \times 0.3 \times 0.1 = 0.003 \]

Finally:
\[ \Pr(\text{flaw} \mid I_1^c \cap I_2^c) = \frac{0.003}{0.903} = 0.003322 \]

Alternative solution: a tree diagram combined with conditional probability.

4) (5 pts.) An urn contains 3 red balls and 7 black balls. Players A and B withdraw balls from the urn consecutively until a red ball is selected. Namely, A draws the first ball, then B draws the second one, then A again, and so on, until one of them draws a red ball. If there is no replacement of the drawn balls, find the probability that A selects the red ball.

Pr(A selects red ball) = Pr(red on 1st draw) + Pr(first red on 3rd draw) + Pr(first red on 5th draw) + Pr(first red on 7th draw)

\[ = \frac{3}{10} + \frac{7}{10} \times \frac{6}{9} \times \frac{3}{8} + \frac{7}{10} \times \frac{6}{9} \times \frac{5}{8} \times \frac{4}{7} \times \frac{3}{6} + \frac{7}{10} \times \frac{6}{9} \times \frac{5}{8} \times \frac{4}{7} \times \frac{3}{6} \times \frac{2}{5} \times \frac{3}{4} \]
\[ = \frac{3}{10} + \frac{7}{40} + \frac{1}{12} + \frac{1}{40} = \frac{7}{12} = 0.5833 \text{ or } 58.33\% \]

Alternative solution:
\[ \frac{3}{10} + \frac{\binom{7}{2}}{\binom{10}{2}} \times \frac{3}{8} + \frac{\binom{7}{4}}{\binom{10}{4}} \times \frac{3}{6} + \frac{\binom{7}{6}}{\binom{10}{6}} \times \frac{3}{4} = 0.5833 \]

Random Variables:

5) (5 pts.) Two types of coins are produced at a factory: a fair coin and a biased one that comes up heads 55 percent of the time. We have a coin from this factory but do not know whether it is a fair coin or a biased one. In order to ascertain which type of coin we have, we will perform the following statistical test: we will toss the coin 1000 times.
If the coin lands on heads 525 or more times, then we will conclude that it is a biased coin, whereas if it lands on heads fewer than 525 times, then we will conclude that it is the fair coin. If the coin is actually fair, what is the probability that we will reach a false conclusion? [Hint: use the Normal approximation with continuity correction.]

Let $X$ be the number of heads in 1000 tosses of a fair coin. Then $X \sim \text{Bin}(1000, 0.5) \Rightarrow X \approx N(500, 250)$, and

\[ \Pr(\text{test yields false conclusion}) = \Pr(X \ge 525) = \Pr\left( Z \ge \frac{525 - 0.5 - 500}{\sqrt{250}} \right) = 1 - \Phi(1.5495) = 1 - 0.9394 = 0.0606 \text{ or } 6.06\% \]

6) (15 pts.) A bus travels between two cities A and B, which are 100 miles apart. If the bus has a breakdown, the distance from the breakdown to city A has a uniform distribution over (0, 100). There is a bus service station in city A, in B, and in the center of the route between A and B. It is suggested that it would be more efficient to have the three stations located 25, 50, and 75 miles, respectively, from A. Do you agree? Why? [Hint: compare the expected distance that the bus would have to be towed, from the breakdown point to the nearest service station.]
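Following the hint in Problem 6, the comparison of expected towing distances can also be sketched numerically. The helper `expected_tow_distance` below is a hypothetical name introduced for illustration. It assumes a breakdown point uniform on (0, `length`) and integrates the distance to the nearest station exactly: each gap g between adjacent stations contributes two triangles of total area g²/4, and each stretch of length e between a route end and the first or last station contributes a triangle of area e²/2.

```python
from fractions import Fraction


def expected_tow_distance(stations, length=100):
    """Expected distance from a Uniform(0, length) breakdown point
    to the nearest station, computed as (area under the distance
    curve) / length."""
    s = sorted(Fraction(x) for x in stations)
    # Stretches between the route ends and the outermost stations
    area = s[0] ** 2 / 2 + (length - s[-1]) ** 2 / 2
    # Gaps between adjacent stations: nearest station switches at the midpoint
    for left, right in zip(s, s[1:]):
        area += (right - left) ** 2 / 4
    return area / length


current = expected_tow_distance([0, 50, 100])
proposed = expected_tow_distance([25, 50, 75])
print(float(current))   # 12.5
print(float(proposed))  # 9.375
```

Under these assumptions the proposed layout gives the smaller expected towing distance.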
Let $X$ be the distance from A to where the bus breaks down: $X \sim \text{Unif}(0, 100)$.

Let $Y$ be the distance from the breakdown point to the nearest service station in case 1 (stations at 0, 50, and 100 miles). Then

\[ Y = \begin{cases} X & \text{if } 0 \le X \le 25 \\ 50 - X & \text{if } 25 < X \le 50 \\ X - 50 & \text{if } 50 < X \le 75 \\ 100 - X & \text{if } 75 < X \le 100 \end{cases} \]

and $X$ is uniformly distributed within each of these intervals.

\[ EY = E[X \mid 0 \le X \le 25] \Pr(0 \le X \le 25) + E[50 - X \mid 25 < X \le 50] \Pr(25 < X \le 50) + E[X - 50 \mid 50 < X \le 75] \Pr(50 < X \le 75) + E[100 - X \mid 75 < X \le 100] \Pr(75 < X \le 100) \]
\[ = 12.5 \times 0.25 + (50 - 37.5) \times 0.25 + (62.5 - 50) \times 0.25 + (100 - 87.5) \times 0.25 \Rightarrow EY = 12.5 \]

Now let $Z$ be the distance from the breakdown point to the nearest service station in case 2 (stations at 25, 50, and 75 miles). Then

\[ Z = \begin{cases} 25 - X & \text{if } 0 \le X \le 25 \\ X - 25 & \text{if } 25 < X \le 37.5 \\ 50 - X & \text{if } 37.5 < X \le 50 \\ X - 50 & \text{if } 50 < X \le 62.5 \\ 75 - X & \text{if } 62.5 < X \le 75 \\ X - 75 & \text{if } 75 < X \le 100 \end{cases} \]

and $X$ is again uniformly distributed within each of these intervals.

\[ EZ = E[25 - X \mid 0 \le X \le 25] \Pr(0 \le X \le 25) + E[X - 25 \mid 25 < X \le 37.5] \Pr(25 < X \le 37.5) + E[50 - X \mid 37.5 < X \le 50] \Pr(37.5 < X \le 50) + E[X - 50 \mid 50 < X \le 62.5] \Pr(50 < X \le 62.5) + E[75 - X \mid 62.5 < X \le 75] \Pr(62.5 < X \le 75) + E[X - 75 \mid 75 < X \le 100] \Pr(75 < X \le 100) \]
\[ = (25 - 12.5) \times 0.25 + (31.25 - 25) \times 0.125 + (50 - 43.75) \times 0.125 + (56.25 - 50) \times 0.125 + (75 - 68.75) \times 0.125 + (87.5 - 75) \times 0.25 \Rightarrow EZ = 9.375 \]

As $EZ < EY$, having service stations at 25, 50, and 75 miles IS more efficient.

Alternate solutions: computing the expected values as integrals rather than conditional expectations; or graphing the distances and computing the areas under the graphs (but, in this case, the areas have to be proportional to the values above).

Joint Probability Distributions:

7) Choose a number $X$ at random from the set of numbers $\{1, 2, 3, 4, 5\}$. Now choose a number at random from the subset no larger than $X$, that is, from $\{1, \ldots, X\}$. Call this second number $Y$.

a) (10 pts.) Find the joint probability mass function of $X$ and $Y$.
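One way to tabulate this joint p.m.f. mechanically is sketched below, using $\Pr(X = x, Y = y) = \frac{1}{5} \times \frac{1}{x}$ for $1 \le y \le x \le 5$, which follows directly from the two-stage description of the experiment. The names `joint_pmf`, `p_y`, `mean_y`, and `var_y` are illustrative and not part of the exam.

```python
from fractions import Fraction


def joint_pmf(n=5):
    """Joint p.m.f. of (X, Y): X uniform on {1..n}, then Y uniform on {1..X}.
    Pairs with y > x have probability zero and are omitted."""
    return {
        (x, y): Fraction(1, n) * Fraction(1, x)
        for x in range(1, n + 1)
        for y in range(1, x + 1)
    }


pmf = joint_pmf()

# Marginal of Y: sum the joint p.m.f. over x for each fixed y
p_y = {y: sum(p for (x, yy), p in pmf.items() if yy == y) for y in range(1, 6)}

# Mean and variance of Y from its marginal
mean_y = sum(y * p for y, p in p_y.items())
var_y = sum((y - mean_y) ** 2 * p for y, p in p_y.items())

print(p_y[1])   # 137/300
print(mean_y)   # 2
print(var_y)    # 4/3
```

Exact fractions are used so that the marginals can be compared entry by entry with a hand-built table.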
            X = 1   X = 2   X = 3   X = 4   X = 5   p_Y(y)
Y = 1       1/5     1/10    1/15    1/20    1/25    137/300
Y = 2       0       1/10    1/15    1/20    1/25    77/300
Y = 3       0       0       1/15    1/20    1/25    47/300
Y = 4       0       0       0       1/20    1/25    9/100
Y = 5       0       0       0       0       1/25    1/25
p_X(x)      1/5     1/5     1/5     1/5     1/5     1

b) (7 pts.) Find the expected value and the variance of $Y$.

\[ EY = 1 \times \frac{137}{300} + 2 \times \frac{77}{300} + 3 \times \frac{47}{300} + 4 \times \frac{9}{100} + 5 \times \frac{1}{25} \Rightarrow EY = 2 \]
\[ Var(Y) = (1 - 2)^2 \times \frac{137}{300} + (2 - 2)^2 \times \frac{77}{300} + (3 - 2)^2 \times \frac{47}{300} + (4 - 2)^2 \times \frac{9}{100} + (5 - 2)^2 \times \frac{1}{25} = \frac{400}{300} \Rightarrow Var(Y) = 1.333 \]

c) (3 pts.) Are $X$ and $Y$ independent? Explain.

Note that
\[ p_{X,Y}(5, 5) = \frac{1}{25} \ne p_X(5) \times p_Y(5) = \frac{1}{5} \times \frac{1}{25} = \frac{1}{125} \]
Since there is at least one pair of values $(x, y)$ for which $p_{X,Y}(x, y) \ne p_X(x) \times p_Y(y)$, $X$ and $Y$ are NOT independent.

Statistical Estimation:

8) (10 pts.) Maximum likelihood estimates possess the property of functional invariance, which means that if $\hat{\theta}$ is the MLE of $\theta$, and $h(\theta)$ is any function of $\theta$, then $h(\hat{\theta})$ is the MLE of $h(\theta)$. Given a random sample $X_1, \ldots, X_n$ from a geometric distribution with parameter $p$, find the MLE of the odds ratio $p / (1 - p)$.

Let $X_1, X_2, \ldots, X_n$ be a random sample from a $\text{Geom}(p)$ distribution. Then:
\[ p_X(x) = (1 - p)^x p, \quad \text{for } x \ge 0 \]
The joint p.m.f. of $X_1, \ldots, X_n$ is given by:
\[ p_{X_1, \ldots, X_n}(x_1, \ldots, x_n; p) = (1 - p)^{x_1} p \times (1 - p)^{x_2} p \times \cdots \times (1 - p)^{x_n} p = (1 - p)^{\sum_i x_i} p^n \]
The log-likelihood function is thus:
\[ \ln[p_{X_1, \ldots, X_n}(x_1, \ldots, x_n; p)] = \left( \sum_i x_i \right) \ln(1 - p) + n \ln p \]
The MLE for the parameter $p$ is obtained by differentiating the log-likelihood function with respect to $p$ and setting the derivative to zero:
\[ \frac{d \ln[p_{X_1, \ldots, X_n}(x_1, \ldots, x_n; p)]}{dp} = 0 \Rightarrow -\frac{\sum_i x_i}{1 - \hat{p}} + \frac{n}{\hat{p}} = 0 \Rightarrow \frac{\hat{p}}{1 - \hat{p}} = \frac{n}{\sum_i x_i} \quad \text{or} \quad \frac{\hat{p}}{1 - \hat{p}} = \frac{1}{\bar{x}} \]
By functional invariance, $1 / \bar{x}$ is therefore the MLE of $p / (1 - p)$.

Confidence Intervals:

9) Let $X$ represent the number of events that are observed to occur in $n$ units of time or space, and assume that $X \sim \text{Poisson}(n\lambda)$, where $\lambda$ is the mean number of events that occur in one unit of time or space. Assume that $X$ is large, so that approximately $X \sim N(n\lambda, n\lambda)$.
A suitable estimator of $\lambda$ is given by $\hat{\lambda} = X / n$, with standard error $SE(\hat{\lambda}) = \sqrt{\lambda / n}$.

a) (4 pts.) Assuming that $X$ is large, what is the distribution of $\hat{\lambda}$? (Name the distribution and tell the values of its parameters.)

\[ E(\hat{\lambda}) = \frac{1}{n} EX = \lambda, \quad Var(\hat{\lambda}) = \frac{1}{n^2} Var(X) = \frac{\lambda}{n} \quad \Rightarrow \quad \hat{\lambda} \approx N\left( \lambda, \frac{\lambda}{n} \right) \]

b) (4 pts.) Use the distribution found in the previous item and the fact that $SE(\hat{\lambda}) \approx \sqrt{\hat{\lambda} / n}$ to derive an expression for the $100(1 - \alpha)\%$ confidence interval for $\lambda$.

Given that
\[ \frac{\hat{\lambda} - \lambda}{\sqrt{\hat{\lambda} / n}} \approx N(0, 1), \]
the $100(1 - \alpha)\%$ CI for $\lambda$ is given by:
\[ \left( \hat{\lambda} - z_{\alpha/2} \sqrt{\hat{\lambda} / n}, \; \hat{\lambda} + z_{\alpha/2} \sqrt{\hat{\lambda} / n} \right) \]

c) (4 pts.) A 5 mL sample of a certain suspension is found to contain 300 particles. The mean number of particles per mL in the suspension is __60__, give or take __3.464__.

\[ \hat{\lambda} = 300 / 5 = 60 \quad \text{and} \quad SE(\hat{\lambda}) \approx \sqrt{60 / 5} = \sqrt{12} = 3.464 \]

d) (4 pts.) After 4 minutes, a geologist counted 256 particles emitted from a certain radioactive rock. Find a 95% confidence interval for the rate of emissions in units of particles per minute.

\[ \hat{\lambda} = 256 / 4 = 64, \quad SE(\hat{\lambda}) \approx \sqrt{64 / 4} = 4, \quad \text{and} \quad z_{0.025} = 1.96 \]
Thus the 95% CI for $\lambda$ is: $(64 - 1.96 \times 4, \; 64 + 1.96 \times 4) = (56.16, 71.84)$

e) (4 pts.) For how many minutes should particles be counted so that the 95% confidence interval specifies the rate to within ±1 particle per minute?

We want
\[ z_{0.025} \sqrt{\hat{\lambda} / n} = 1 \Rightarrow n = \hat{\lambda} z_{0.025}^2 \Rightarrow n = 64 \times 1.96^2 = 245.9 \]
For 246 minutes.

10) A sample of seven concrete blocks had their compressive strength measured in MPa. The results were 1367.6, 1411.5, 1318.7, 1193.6, 1406.2, 1425.7, and 1572.4. Ten thousand bootstrap samples were generated from these data, and the bootstrap sample means were arranged in order. Refer to the smallest mean as $Y_1$, the second smallest as $Y_2$, and so on, with the largest being $Y_{10000}$.
Assume that $Y_{50} = 1283.4$, $Y_{51} = 1283.4$, $Y_{100} = 1291.5$, $Y_{101} = 1291.5$, $Y_{250} = 1305.5$, $Y_{251} = 1305.5$, $Y_{500} = 1318.5$, $Y_{501} = 1318.5$, $Y_{9500} = 1449.7$, $Y_{9501} = 1449.7$, $Y_{9750} = 1462.1$, $Y_{9751} = 1462.1$, $Y_{9900} = 1476.2$, $Y_{9901} = 1476.2$, $Y_{9950} = 1483.8$, and $Y_{9951} = 1483.8$.

a) (4 pts.) Compute the 95% bootstrap confidence interval for the mean compressive strength.

\[ \text{95\% CI for the mean} = \left( \frac{Y_{250} + Y_{251}}{2}, \; \frac{Y_{9750} + Y_{9751}}{2} \right) = (1305.5, 1462.1) \]

b) (4 pts.) Was this a parametric or a nonparametric bootstrap procedure? Explain.

Nonparametric: the ten thousand samples were generated through random sampling, with replacement, from the given sample, without any information on the distribution of the population underlying the sample.

Tests of Hypothesis:

11) An article by Abdel-Aty et al. in the Journal of Transportation Engineering presents a tabulation of types of car crashes by the age of the driver over a three-year period in Florida. Here is the table:

Age of drivers   Total # of accidents   # of accidents in driveways
15-24 years      82,486                 4,243
25-64 years      219,170                10,701

a) (4 pts.) The difference between the proportions of driveway accidents for drivers aged 15-24 and drivers aged 25-64 is __0.261__%, give or take __0.0896__%.

\[ \hat{p}_{15} = \frac{4243}{82486} = 0.05144, \quad \hat{p}_{25} = \frac{10701}{219170} = 0.04883, \quad \hat{p}_{15} - \hat{p}_{25} = 0.00261 \text{ or } 0.261\% \]
\[ SE(\hat{p}_{15} - \hat{p}_{25}) = \sqrt{\frac{\hat{p}_{15}(1 - \hat{p}_{15})}{n_{15}} + \frac{\hat{p}_{25}(1 - \hat{p}_{25})}{n_{25}}} = \sqrt{\frac{0.05144 \times 0.9486}{82486} + \frac{0.04883 \times 0.9512}{219170}} = 0.0008963 \text{ or } 0.0896\% \]

b) (4 pts.) Can you conclude that driveway accidents among 15-24 year-olds in FL are indeed likely to be proportionately higher than driveway accidents among 25-64 year-old Floridians? State the hypotheses clearly and answer this question using the P-value.
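The arithmetic in Problem 11 (a) and the one-sided test asked for in (b) can be reproduced with the sketch below. The function `two_proportion_test` is a hypothetical helper written for this illustration. It assumes the large-sample normal approximation, uses the unpooled standard error for the "give or take" figure and the pooled standard error for the test statistic, and obtains the upper-tail normal probability from `math.erfc` in the standard library.

```python
import math


def two_proportion_test(x1, n1, x2, n2):
    """One-sided large-sample test of H0: p1 - p2 <= 0 vs H1: p1 - p2 > 0.

    Returns (difference, unpooled SE, z statistic, P-value)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Unpooled SE: used to report the estimate's uncertainty
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Pooled SE: used for the test, since H0 treats the proportions as equal
    p_pool = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    # Upper-tail standard normal probability Pr(Z >= z)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return diff, se_unpooled, z, p_value


diff, se_unpooled, z, p_value = two_proportion_test(4243, 82486, 10701, 219170)
print(f"difference = {diff:.5f}, SE = {se_unpooled:.7f}")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

Because the code carries full precision rather than rounded intermediate values, its z statistic and P-value may differ in the last digit from a hand calculation.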
$H_0: p_{15} - p_{25} \le 0$
$H_1: p_{15} - p_{25} > 0$

\[ z = \frac{\hat{p}_{15} - \hat{p}_{25} - (p_{15} - p_{25})}{SE(\hat{p}_{pool})}, \quad \text{where} \quad \hat{p}_{pool} = \frac{4243 + 10701}{82486 + 219170} = 0.04954 \]
and $p_{15} - p_{25} = 0$ at the boundary of $H_0$.
\[ SE(\hat{p}_{pool}) = \sqrt{\hat{p}_{pool}(1 - \hat{p}_{pool}) \left( \frac{1}{n_{15}} + \frac{1}{n_{25}} \right)} = 0.0008864 \]
\[ z = \frac{0.00261}{0.0008864} = 2.94, \quad \text{thus P-value} = \Pr(Z \ge 2.94) = 0.0016 \text{ or } 0.16\% \]
$\Rightarrow$ reject $H_0$ at significance level 1%.

Thus: younger Floridians do have a higher rate of driveway accidents than older ones.

c) (4 pts.) Assuming that young drivers in Florida do present a higher proportion of driveway accidents than older drivers, does this mean that younger Floridian drivers should be required to take a special course on how to drive on driveways, but not older drivers? Explain.

Though statistically speaking younger Floridians do have a higher rate of driveway accidents than older ones, practically speaking the difference is too small (0.261%) to justify differentiated driving training for the two groups. This is thus a typical case in which statistical significance does not translate into practical significance.

12) An engineer claims that a new type of hard disk for laptops lasts longer than the old type. Independent random samples of 75 of each of the two types are chosen, and the sample means and standard deviations of their lifetimes are computed:

New: $\bar{X}_1 = 4387$ h, $s_1 = 252$ h
Old: $\bar{X}_2 = 4260$ h, $s_2 = 231$ h

a) (4 pts.) Can you conclude that the mean lifetime of new hard disks is greater than that of the old hard disks? State the hypotheses clearly and answer this question at the 1% significance level.

Item dropped – do not grade.

b) (4 pts.) If the new hard disks do have a mean lifetime 40 h longer than the old ones, what is the probability ($\beta$) that the test performed in the previous item will result in a type II error (that is, failing to reject $H_0$)?

Item dropped – do not grade.

c) (2 pts.) Recompute the probability of a type II error for the case of the new hard disks having a mean lifetime 80 h longer than the old ones.
Item dropped – do not grade.

Correlation and Linear Regression:

13) A chemical engineer is studying the effect of temperature and stirring rate on the yield of a certain product. The process is run 16 times, at the settings indicated in the following table. The units for yield are percent of a theoretical maximum. The matrix of sample correlation coefficients among the variables in question is as follows:

a) (5 pts.) Based on the analysis of sample correlation above, would you try to fit a multiple linear regression model in which the yield is the response variable and temperature and stirring rate are the covariates? Explain.

No, it is not advisable to fit a model where both covariates are used, because there is a high level of linear correlation between temperature and stirring rate (0.9064). This is known as multicollinearity, and it will confound the least squares estimation of the linear regression coefficients.

b) (5 pts.) Find the 95% confidence interval for the coefficient of correlation between the stirring rate and the yield. What assumptions did you make in order to compute this confidence interval?

Assuming that stirring rate and yield come from a bivariate normal distribution, then, by the Fisher transformation:
\[ V = \frac{1}{2} \ln \frac{1 + r}{1 - r} \sim N\left( \frac{1}{2} \ln \frac{1 + \rho}{1 - \rho}, \; \frac{1}{n - 3} \right) \]
and a 95% CI for $\rho$ will be given by
\[ \left( \frac{e^{2c_1} - 1}{e^{2c_1} + 1}, \; \frac{e^{2c_2} - 1}{e^{2c_2} + 1} \right), \quad \text{where } c_1 = v - z_{0.025} / \sqrt{n - 3} \text{ and } c_2 = v + z_{0.025} / \sqrt{n - 3} \]
Thus, for $v = \frac{1}{2} \ln \frac{1 + 0.7513}{1 - 0.7513} = 0.9759$ and $z_{0.025} = 1.96$:
\[ c_1 = 0.9759 - 1.96 / \sqrt{13} = 0.4321 \quad \text{and} \quad c_2 = 0.9759 + 1.96 / \sqrt{13} = 1.519 \]
And finally, the 95% CI for $\rho$ is: (0.407, 0.909).

14) The chemical engineer from the previous question has decided to calibrate a simple linear regression model with the yield as the response variable ($Y$) and stirring rate as the covariate ($X$). The results of the calibration obtained through Excel are:

a) (2 pts.)
What proportion of the observed variation in yield can be attributed to the simple linear regression relationship between yield and stirring rate?

\[ r^2 = 0.7513^2 = 0.564 \Rightarrow 56.4\% \]

b) (5 pts.) Can you say that an increase of 10 rpm in the stirring rate will produce an increase in yield of at least 2%? State the hypotheses clearly and answer this question at the 5% significance level.

$H_0: \beta_1 \le 0.2$
$H_1: \beta_1 > \frac{2}{10} = 0.2$

\[ t_{n-2} = \frac{\hat{\beta}_1 - \beta_{10}}{SE_{\hat{\beta}_1}} = \frac{0.3119 - 0.2}{0.07322} = 1.528 \]
P-value $= \Pr(T_{14} \ge 1.528) = 0.0744$ or 7.4% $\Rightarrow$ cannot reject $H_0$ at 5%.

Thus, we cannot say that 10 rpm will increase yield by at least 2%.

c) (5 pts.) Construct the 95% confidence interval for the prediction of the yield percentage that corresponds to a stirring rate of 55 rpm. In order to compute this interval, you may need the following additional information:

Given $x^* = 55$ and $\hat{y} = 61.5563 + 0.3119 x$, then $\hat{y}^* = 61.5563 + 0.3119 \times 55 = 78.71$.

The 95% CI for $y^*$ is $\hat{y}^* \pm t_{0.025, 14} \times SE_{pred}(\hat{y}^* \mid x^*)$, where
\[ SE_{pred}(\hat{y}^* \mid x^*) = \hat{\sigma} \left[ 1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}} \right]^{1/2} \]
Computing:
\[ \hat{\sigma} = \sqrt{\frac{S_{yy}(1 - r^2)}{n - 2}} = \sqrt{\frac{234.5 \times (1 - 0.564)}{14}} = 2.70 \]
\[ SE_{pred}(\hat{y}^* \mid x^* = 55) = 2.70 \left[ 1 + \frac{1}{16} + \frac{(55 - 45)^2}{1360} \right]^{1/2} = 2.88 \]
Finally:
\[ \hat{y}^* - t_{0.025, 14} \times SE_{pred}(\hat{y}^* \mid x^*) = 78.71 - 2.144 \times 2.88 = 72.5 \]
\[ \hat{y}^* + t_{0.025, 14} \times SE_{pred}(\hat{y}^* \mid x^*) = 78.71 + 2.144 \times 2.88 = 84.9 \]
Thus, the 95% CI for $(y^* \mid x^* = 55)$ is: (72.5, 84.9)

Multiple Linear Regression:

15) A study was made in which data was obtained to relate $y$ = specific surface area ($\text{cm}^3/\text{g}$) to $x_1$ = % NaOH used as a pretreatment chemical and $x_2$ = treatment time (min) for a batch of pulp. The following R output resulted from a request to fit the model $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$.

a) (6 pts.)
Fill in the blanks in the tables above by computing the following values: the coefficients of determination (regular and adjusted), the regression sum of squares, the mean sums of squares (regression and residuals), and the value of the F statistic. Show your computations.

Item dropped – do not grade.

b) (2 pts.) What proportion of observed variation in specific surface area can be explained by the model relationship?

Item dropped – do not grade.

c) (4 pts.) Does the chosen model appear to specify a useful relationship between the response and the covariates? Explain.

Item dropped – do not grade.

d) (4 pts.) Provided that % NaOH remains in the model, would you suggest that the covariate treatment time be eliminated? Explain.

Item dropped – do not grade.

e) (4 pts.) Calculate a 95% confidence interval for the expected change in specific surface area associated with an increase of 1% in NaOH when treatment time is held fixed.

Item dropped – do not grade.