Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Lesson 8 Chapter 7: Confidence and Prediction Intervals Michael Akritas Department of Statistics The Pennsylvania State University Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals 1 Introduction to Confidence Intervals 2 CIs for a Mean and a Proportion 3 CIs for the Regression Parameters 4 The issue of Precision 5 Prediction Intervals Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Bounding the Error of Estimation b are (at least By the CLT, if n is large, most estimators, θ, approximately) normally distributed, with mean equal to the true value, θ, of the parameter they estimate. Thus, · θ̂ ∼ N θ, σθ̂2 . The above fact provides probabilistic bounds on the size of the estimation error: θ̂ − θ ≤ 1.96σθ̂ holds 95% of the time. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals From Error Bounds to Confidence Intervals The probabilistic error bound, can be re-written as θ̂ − 1.96σθ̂ ≤ θ ≤ θ̂ + 1.96σθ̂ , i.e., an interval of plausible values for θ, with degree of plausibility approximately 95%. Such intervals are called confidence intervals (CI). In general, the 100(1 − α)% CI is of the form θ̂ − zα/2 σθ̂ ≤ θ ≤ θ̂ + zα/2 σθ̂ , or θ̂ ± zα/2 σθ̂ . Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Z Intervals 100(1 − α)% CIs that use percentiles of the standard normal distribution, zα/2 , as above, are called Z intervals. Z intervals for the mean require known variance, and either the assumption of normality or n ≥ 30. Typically, the variance is not known. Z intervals will be primarily used for proportions. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals The T Distribution and T Intervals When sampling from normal populations, an estimator θb of some parameter θ often satisfies, for all sample sizes n, θb − θ ∼ Tν , where σ bθ̂ is the estimated standard error, σ bθ̂ and Tν stands for T distribution with ν degrees of freedom. A T distribution is symmetric and its pdf tends to that of the standard normal as ν tends to infinity. The 100(1 − α/2)th percentile of the T distribution with ν degrees of freedom will be denoted by tν,α/2 . Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals p.d.f of the t-distr. with ν degrees of freedom area= α t ν,α Figure: PDF and Percentile of a T Distribution. As the DF ν gets large, tν,α/2 approaches zα/2 . For example, for ν = 9, 19, 60 and 120, tν,0.05 is: 1.833, 1.729, 1.671, 1.658, respectively, while z0.05 = 1.645. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Plots of N(0, 1) and T densities in R http: //stat.psu.edu/˜mga/401/fig/ComparTdensit.pdf Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Relation θ−θ σ bθ̂ ∼ tν , which also holds approximately when sampling non-normal populations provided n ≥ 30, leads to the following 1 − α bound on the error of estimation of θ b θ − θ bθ̂ , ≤ tν,α/2 σ b This error bound leads to the following (1 − α)100% CI for θ: b b θ − tν,α/2 σ bθ̂ , θ + tν,α/2 σ bθ̂ . (2.1) T intervals will be used for the mean, as well as for the regression parameters in the linear regression model. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Read Section 7.2 CI Semantics: The Meaning of “Confidence” Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals 0 10 20 CI count 30 40 50 Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals 0.2 0.4 0.6 0.8 End points of CIs Figure: 50 CIs for p. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals T CIs for the Mean: Proposition Let X1 , . . . , Xn be a simple r.s. from a population with mean µ and variance σ 2 , both unknown. Then X −µ √ ∼ tn−1 S/ n (3.1) holds exactly for any n if the population is normal, and holds approximately for non-normal populations provided n ≥ 30. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals The Proposition yields the 1 − α error bound S X − µ ≤ tn−1,α/2 √ n which leads to the following (1 − α)100% CI for the mean: S S X − tn−1,α/2 √ , X + tn−1,α/2 √ . n n Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Example The mean weight loss of n = 16 grinding balls after a certain length of time in mill slurry is 3.42g with S = 0.68g. Construct a 99% CI for the true mean weight loss. Solution. Because n < 30 we must assume that the (statistical) population of the mean weight loss is normal. (In a real life application, the normality assumption should be verified by the histogram or the Q-Q plot of the data.) For α = 0.01 and n − 1 = 15 DF, Table A.4 gives tn−1,α/2 = t15,0.005 = 2.947. Thus the desired 99% CI for µ is √ 3.42 ± 2.947(0.68/ 16), or 2.92 < µ < 3.92. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals R command for the t-interval for the mean With data in x, the commands ”lm(x ∼ 1), confint(lm(x ∼ 1))” S will return X , and pair of values X ± tn−1,0.025 √ , which is n the 95% CI for µ. For the 90% CI of µ use ”confint(lm(x ∼ 1), level=0.9)” S (∗ ) The pair of values X ± tn−1,0.025 √ can also be gotten n as ”mean(x) ± qt(0.975,df=length(x)-1)*sd(x)/sqrt(length(x))” Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Z CIs for Proportions CIs for p, however, are slightly different due to: 1 We are typically given only T = X1 + · · · + Xn , or X = p̂. 2 b(1 − p b). σ 2 is estimated by p 3 We use the percentiles from the normal distribution, using the approximate result · b(1 − p b)/n , b∼ p N p, p b ≥ 8 and n(1 − p b) ≥ 8, i.e. at least eight 1s which holds if np and at least eight 0s. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals The above approximate distribution of p̂ leads to the (approximate) (1 − α)100% CI r b(1 − p b) p b ± zα/2 p . n Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Example A Gallup Survey estimated the proportion of adults across the country who drink beer, wine, or hard liquor, at least occasionally. Of the 1516 adults interviewed, 985 said they drank. Find a 95% confidence interval for the proportion, p, of all Americans who drink. Solution: Here α = 0.05, and z0.025 = 1.96. Thus r 985 0.65 × 0.35 ± 1.96 = 0.65 ± 0.024 1516 1516 QUESTION: An interpretation of the above CI is that the probability is 0.95 that the true proportion of adults who drink lies in the interval you obtained. True or False? Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals R command for z-intervals for a proportion With T being the number of ”successes” in n trials, set ”phat=T/n” and use the commands ”phat ± qnorm(0.975)*sqrt(phat*(1-phat)/n)” to obtain the q 95% CI for p, i.e. the pair of values b ± z0.025 p b p(1−b p) . n To obtain 90% or other CIs, adjust the 0.975 in the above command accordingly. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals T CIs for the Slope of a Regression Line: Proposition Let (X1 , Y1 ), . . . , (Xn , Yn ), be iid satisfying E(Yi |Xi = x) = α1 +β1 x, and Var(Yi |Xi = x) = σε2 , same for all x. Then, v u u S2 σ bβ̂1 = u , where tP 1 P 2 2 Xi − ( Xi ) n " n # n n X X X 1 Yi2 − α̂1 Yi − β̂1 Xi Yi . S2 = n−2 i=1 i=1 i=1 is the estimator of the intrinsic variability. NOTE: σ bβ̂1 is also denoted by Sβ̂1 . Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals We saw that under the normality assumption, βb1 − β1 ∼ tn−2 . σ bβ̂1 This leads to the 100(1 − α)% error bound |βb1 − β1 | < tn−2,α/2 σ bβ̂1 , and corresponding 100(1 − α)% CI for β1 of: b b β1 − tn−2,α/2 σ bβ̂1 , β1 + tn−2,α/2 σ bβ̂1 Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Example (Y =propagation of stress wave, X =tensile strength) P P 2 In this study, n = 14, X = 890, i i P P 2 Pi Xi = 67, 182, i Yi = 37.6, i Yi = 103.54 and i Xi Yi = 2234.30. Let Y1 denote an observation made at X1 = 30, and Y2 denote an observation at X2 = 35. Construct a 95% CI for E(Y1 − Y2 ). Solution. Note that E(Y1 − Y2 ) = −5β1 . We will first construct a 95% CI for β1 . We have: βb1 = −0.0147209, α b1 = 3.6209072, and S 2 = 0.02187. Thus, v s u 2 u S 0.02187 = = 0.001414, σ bβb = u 1 t 2 1 P 2 1 P 2 67, 182 − 890 14 Xi − ( Xi ) n Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Example (Continued) so that, the 95% CI for β1 is βb1 ± t0.025,12 σ bβb 1 = −0.0147209 ± 2.179 × 0.001414 = −0.0147209 ± 0.00308 = (−0.0178, −0.01164). The 95% CI for −5β1 follows now easily: −5βb1 ± 5tα/2,n−2 σ bβb = 5(0.0147209) ± 5 × 2.179 × 0.001414. 1 Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals T CIs for the Regression Line Let (X1 , Y1 ), . . . , (Xn , Yn ), be iid satisfying E(Yi |Xi = x) = α1 +β1 x, and Var(Yi |Xi = x) = σ 2 , same for all x. Then, s 1 n(x − X )2 P + P 2 σ bµ̂Y |X =x = S , where n n Xi − ( Xi )2 S2 " n # n n X X X 1 2 = Yi − α̂1 Yi − β̂1 Xi Yi . n−2 i=1 i=1 i=1 is the estimator of the intrinsic variability. NOTE: σ bµ̂Y |X =x is also denoted by Sµ̂Y |X =x . Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals We saw that under the normality assumption, µ bY |X =x − µY |X =x ∼ tn−2 . σ bµ̂Y |X =x This leads to the 100(1 − α)% error bound |b µY |X =x − µY |X =x | < tn−2,α/2 σ bµ̂Y |X =x , and corresponding 100(1 − α)% CI for µY |X =x of: µ bY |X =x − tn−2,α/2 σ bµ̂Y |X =x , µ bY |X =x − tn−2,α/2 σ bµ̂Y |X =x Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals The n(x − X̄ )2 in the expression of σ bµ̂Y |X =x , means that confidence intervals for µY |X =x get wider as x get farther away from X . Figure: Confidence Intervals for µY |X =x Get Wider Away from X Estimation of µY |X =x for x < X(1) or x > X(n) is NOT recommended. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Example P P n data points yield Xi = 292.90, Yi = 69.03, P= 11 P P Xi2 = 8141.75, Xi Yi = 1890.2, Yi2 = 442.1903, µ bY |X =x = 2.22494 + .152119x, and S = 0.3444. Construct 95% CIs for µY |X =26.627 and µY |X =25 . [Note that X = 26.627.] Solution: First, s SµbY |X =x = 0.3444 1 11(x − 26.627)2 + , 11 11(8141.75) − (292.9)2 so that, Sµ̂Y |X =26.627 = 0.1038, and SµbY |X =25 = 0.1082. Thus, µ bY |X =25 ± t.025,9 0.1082 = 6.028 ± 0.245, µ bY |X =26.627 ± t.025,9 0.1038 = 6.275 ± 0.235, Michael Akritas CI at X = 25 CI at X = 26.627. Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals R commands for CIs in regression CIs for the intercept and slope: ”confint(lm(y ∼ x))” (or ”confint(lm(y ∼ x),level=0.95)”) gives 95% CIs for both α1 and β1 . ”confint(lm(y ∼ x),level=0.90)” gives 90% CIs for both α1 and β1 . ”confint(lm(y ∼ x),parm=”x”,level=0.90)” gives 90% CI only for β1 . ”confint(lm(y ∼ x),parm=”(Intercept)”,level=0.90)” gives 90% CI only for α1 . CIs for µY |X (x) at, e.g., x = 5.5: ”newx=data.frame(x=5.5)”, ”predict(lm(y ∼ x), newx, interval=”confidence”,level=0.9)” Use ”newx=data.frame(x=c(4.5,5.5))” above for multiple CIs. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Generalities Precision in estimation is quantified by the size of the probabilistic error bound, or by the length of the CI. Error bounds are of the form S X − µ ≤ tn−1,α/2 √ (unknown σ, normal case, or n > 30) n r p̂(1 − p̂) |p̂ − p| ≤ zα/2 (np̂ ≥ 8, n(1 − p̂) ≥ 8). n Thus error bounds depend on n, and α since, e.g., z.05 = 1.645 < z.025 = 1.96 < z.005 = 2.575 In improving precision, we do not want to adjust α. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals The Ideal Case: σ known To construct a (1 − α)100% CI having a prescribed length of L, the sample size n is found by solving the equation σ 2zα/2 √ = L. n The solution is: n= σ 2zα/2 L 2 . If the solution is not an integer (as is typically the case), the number is rounded up. Rounding up guarantees that the prescribed precision objective will be more than met. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals The Ideal Case: An Example Example The time to response (in milliseconds) to an editing command with a new operating system is normally distributed with an unknown mean µ and σ = 25. We want a 95% CI for µ of length L = 10 milliseconds. What sample size n should be used? Solution. For 95% CI, α/2 = .025 and z.025 = 1.96. Thus 25 2 = 96.04, n = 2 · (1.96) 10 which is rounded up to n = 97. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals The Realistic Case: σ unknown Sample size determination must rely a preliminary approximation, Sprl , of σ. Two common methods are: 1 If the range of population values is known, use Sprl = range range , or Sprl = . 3.5 4 This approximation is inspired by the standard deviation of a U(a, b) random variable, which is σ = (b − a)/3.464. 2 Use the standard deviation, Sprl , of a preliminary sample. This is somewhat cumbersome because it requires some trial-and-error iterations. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Sampe Size Determination for Estimating p Equating the length of the (1 − α)100% CI for p to L and solving for n gives the solution is: n= 2 p̂(1 − p̂) 4zα/2 L2 . Round up. Two commonly used methods for obtaining a preliminary approximation, p̂prl are: 1 2 Obtain p̂prl either from a small pilot sample or from expert opinion, and use it in the above formula. Replace p̂(1 − p̂) in the formula by 0.25. This gives 2 n = zα/2 /L2 . Michael Akritas Round up. Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Example A preliminary sample gave p̂prl = 0.91. How large should n be to estimate the probability of interest to within 0.01 with 95% confidence? Solution. “To within 0.01” is another way of saying that the 95% bound on the error of estimation should be 0.01, or the desired CI should have a width of 0.02. Since we have preliminary information, we use the first formula: n= 4(1.96)2 (0.91)(0.09) = 3146.27. (.02)2 This is rounded up to 3147. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Example A new method of pre-coating fittings used in oil, brake and other fluid systems in heavy-duty trucks is being studied. How large n is needed to estimate the proportion of fittings that leak to within .02 with 90% confidence? (No prior info available). Solution. Here we have no preliminary information about p. Thus, we apply the second formula and we obtain 2 n = zα/2 /L2 = (1.645)2 /(.04)2 = 1691.26. This is rounded up to 1692. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Prediction refers to estimating an observation. It is related to estimating the mean, but prediction intervals (PIs) are different from CIs. For example 1 2 Predicting the fat content of the hot dog you are about to eat is related to estimating the mean fat content of hot dog. But the PI is different from the CI. Predicting the failure time of your resistor from its resistance is related to estimating the mean failure time of all resistors having the same resistance. But the PI is different from the CI. In the first example, there was no explanatory variable. The second example involves a regression context. We begin with the case of no explanatory variable. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Prediction Based on a Univariate Sample To emphasize the difference between PIs and CIs, suppose that the amount of fat in a randomly selected hot dog is N(20, 9). Thus there are no unknown parameters to be estimated, and no need to construct a CI. Still the amount of fat, X , in the hot dog which one is about to eat is unknown, simply because it is a random variable. According to well-accepted criteria, the best point-predictor of a normal random variable with mean µ, is µ. A (1 − α)100% PI is an interval that contains the r.v. with probability 1 − α. Namely: µ ± zα/2 σ. In the hot dog example, X ∼ N(20, 9), so the best point predictor of X is 20 and a 95% PI is 20 ± (1.96)3. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Typically, µ, σ are unknown and are estimated from a sample X1 , . . . , Xn by X , S, respectively. Then, the best point predictor of a future observation, is X . The PI, however, must now take into account the variability of X , S as estimators of µ, σ. Assuming normality, the (1 − α)100% PI for the next X is: r r 1 1 X − tα/2,n−1 S 1 + , X + tα/2,n−1 S 1 + . n n 1 , and the n variability of S is accounted for by the use of the t-percentiles. The variability of X is accounted for by the Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Example The fat content measurements from a sample of size n = 10 hot dogs, gave sample mean and sample standard deviation of X = 21.9, and S = 4.134. Give a 95% PI for the fat content, X , of the next hot dog to be sampled. Solution: Assuming that the fat content of a randomly selected hot dog has the normal distribution, the best point predictor of X is X = 21.9 and the 95% PI is r 1 X ± t.025,9 S 1 + = (12.09, 31.71). n Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals PIs for the Normal Simple Linear Regression Model Let (X1 , Y1 ), . . . , (Xn , Yn ) be n observations that follow the normal simple linear regression model, i.e. Yi |Xi = xi ∼ N(α1 + β1 xi , σ 2 ). The point predictor for a future observation Y made at X = x is µ bY |X =x = α b1 + βb1 x. The 100(1 − α)% PI is s µ bY |X =x ± tα/2,n−2 S Michael Akritas 1+ 1 n(x − X )2 P + P 2 . n n Xi − ( Xi )2 Lesson 8 Chapter 7: Confidence and Prediction Intervals Outline Introduction to Confidence Intervals CIs for a Mean and a Proportion CIs for the Regression Parameters The issue of Precision Prediction Intervals Example P Consider again the where n =P 11, Xi = 292.90, P P study 2 P Yi2= 69.03, Xi = 8141.75, Xi Yi = 1890.200, Yi = 442.1903, µ bY |X = 2.22494 + .152119X , and S = 0.3444. Construct a 95% PI for a future observation, made at X = 25. Solution. The point predictor is µ bY |X =25 = 6.028, and the 95% PI at X = 25 is 6.028 ± 0.8165, as obtained from the formula s 1 11(1.627)2 P 2 P + . µ bY |X =25 ± t.025,9 (0.344) 1 + 11 11 Xi − ( Xi )2 The 95% CI for µY |X =25 was found to be 6.028 ± 0.245. This demonstrates that PIs are wider than CIs. Michael Akritas Lesson 8 Chapter 7: Confidence and Prediction Intervals