--- Unclassified ---

Quantitative Methods 2013
Confidence Interval Estimation

[Figure: interval estimates from several samples, each centred on its own sample mean x̄, drawn against the population mean µ]

Point Estimate for Population µ

A point estimate is a single-value estimate of a population parameter. The point estimate of the population mean, µ, is the sample mean, \(\bar{x}\).

Example: A random sample of 32 textbook prices is taken from a local bookstore. Find a point estimate for the population mean, µ.

    34   56   79    94
    34   65   86    95
    38   65   87    96
    45   66   87    98
    45   67   87    98
    45   67   88   101
    45   68   90   110
    54   74   90   121

\[ \bar{x} = \frac{\sum x}{n} = \frac{2375}{32} \approx 74.22 \]

The point estimate for the population mean price of textbooks in the bookstore is $74.22.

The problem with a point estimate is that it is presented as if it were exact, while the probability that it equals the population parameter precisely is low.

Interval Estimate

In practice it is more meaningful to give an interval estimate and to attach a probability level to it, which quantifies the error in the estimate.

[Figure: number line with the point estimate 74.22 at the centre of the interval estimate, bounded by the lower confidence limit and the upper confidence limit]

A point estimate is a single number. An interval estimate is a range within which the population parameter is likely to fall.

How confident do we want to be that the interval estimate contains the population mean, µ?

For the textbook prices, the point estimate is 74.22 and the interval estimate is 66.10 to 82.34: I am 95% confident that µ is between 66.10 and 82.34.

[Figure: a sample with mean 74.22 drawn from a population whose mean, µ, is unknown]

An interval estimate can be computed by adding and subtracting a margin of error to the point estimate.
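The point estimate for the textbook example, and the interval obtained by adding and subtracting a margin of error, can be reproduced with a few lines of Python. This is only a sketch: the helper name `interval_from_margin` is illustrative, and the margin of error of 8.12 is simply the figure implied by the 66.10 to 82.34 interval quoted above, not something derived at this stage.

```python
from statistics import mean

# The 32 textbook prices from the example.
prices = [
    34, 56, 79, 94,
    34, 65, 86, 95,
    38, 65, 87, 96,
    45, 66, 87, 98,
    45, 67, 87, 98,
    45, 67, 88, 101,
    45, 68, 90, 110,
    54, 74, 90, 121,
]

def interval_from_margin(point_estimate, margin):
    """Return (lower limit, upper limit) = point estimate -/+ margin of error."""
    return point_estimate - margin, point_estimate + margin

x_bar = mean(prices)  # point estimate of the population mean

# Assumption: margin of error of 8.12, taken from the interval quoted in the slides.
lower, upper = interval_from_margin(round(x_bar, 2), 8.12)

print(f"point estimate: {x_bar:.2f}")
print(f"interval estimate: {lower:.2f} to {upper:.2f}")
```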
The general form of an interval estimate of a population mean is:

\[ \text{point estimate} \pm \text{margin of error} \]

[Figure: number line with the point estimate 74.22 at the centre; the margin of error extends on each side to the lower and upper confidence limits, and the distance between the two limits is the width of the confidence interval]

An interval gives a range of values. It:
– takes into consideration variation in sample statistics from sample to sample;
– is based on observations from one sample;
– gives information about closeness to the unknown population parameter;
– is stated in terms of a level of confidence.

Level of Confidence

The level of confidence c is the probability that the interval estimate contains the population parameter. Since the sampling distribution shows how values of \(\bar{x}\) are distributed around the population mean µ, the sampling distribution of \(\bar{x}\) provides information about the possible differences between \(\bar{x}\) and µ.

[Figure: sampling distribution centred on µ with central area c; the remaining area in the two tails is 1 − c]

Interpretation: In the long run, 100c% of all the confidence intervals that can be constructed will contain the unknown true parameter. A specific interval either will contain or will not contain the true parameter.

[Figure: sampling distribution of \(\bar{x}\), with mean \(\mu_{\bar{x}} = \mu\) and central area c = 1 − α, above a stack of confidence intervals built from different sample means; 100c% of the intervals constructed contain µ and 100(1 − c)% do not]

Suppose the confidence level is 95%, also written (1 − α) = 0.95. α is called the level of significance. It is split into an area of α/2 in each tail, leaving 1 − α in the centre.

Confidence Intervals for the Mean (Large Samples)

Confidence Interval for µ (n ≥ 30 or σ Known)

Assumptions:
– n ≥ 30, or
– σ known with a normally distributed population.

Confidence interval estimate:

\[ \bar{x} \pm z_c \frac{\sigma}{\sqrt{n}} \]

When n ≥ 30, the sample standard deviation, s, can be used for σ.
In this formula, \(\bar{x}\) is the point estimate and \(\sigma/\sqrt{n}\) is the standard error.

Commonly used confidence levels are 90%, 95%, and 99%.

    Confidence level   Confidence coefficient c = 1 − α   z value
    80%                0.80                               1.28
    90%                0.90                               1.645
    95%                0.95                               1.96
    98%                0.98                               2.33
    99%                0.99                               2.58
    99.8%              0.998                              3.09
    99.9%              0.999                              3.29

Sampling Distribution of the Mean

[Figure: sampling distribution of \(\bar{x}\), with mean \(\mu_{\bar{x}} = \mu\), central area c = 1 − α and area α/2 in each tail, above a stack of confidence intervals]

Intervals extend from \(\bar{x} - z_c \sigma/\sqrt{n}\) to \(\bar{x} + z_c \sigma/\sqrt{n}\). 100c% of the intervals constructed contain µ; 100(1 − c)% do not.

Finding a Confidence Interval for a Population Mean (n ≥ 30 or σ known with a normally distributed population)

1. Find the sample statistic: \(\bar{x} = \frac{\sum x}{n}\).
2. Specify σ, if known. Otherwise, if n ≥ 30, find the sample standard deviation s and use it as an estimate for σ: \(s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}\).
3. Find the critical value \(z_c\) that corresponds to the given level of confidence. In Excel: =NORMSINV((1+c)/2).
4. Find the left and right endpoints and form the confidence interval.

Example: A random sample of 32 textbook prices is taken from a local college bookstore. The mean of the sample is 74.22 and the sample standard deviation is s = 23.44. Construct a 95% confidence interval for the mean price of all textbooks in the bookstore.

Since n ≥ 30, s can be substituted for σ. With \(\bar{x} = 74.22\), s = 23.44 and, for a 95% confidence level, \(z_c = 1.96\):

\[ \bar{x} \pm z_c \frac{s}{\sqrt{n}} = 74.22 \pm 1.96 \cdot \frac{23.44}{\sqrt{32}} \approx 74.22 \pm 8.12 \]

Left endpoint: 74.22 − 8.12 = 66.10
Right endpoint: 74.22 + 8.12 = 82.34

With 95% confidence we can say that the mean price of all textbooks in the bookstore is between $66.10 and $82.34.

Example: A random sample of 25 students had a grade point average with a mean of 2.86. Past studies have shown that the standard deviation is 0.15 and the population is normally distributed. Construct a 90% confidence interval for the population mean grade point average.
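The four steps above translate fairly directly into code. The sketch below is one possible Python version, assuming Python 3.8 or later for `statistics.NormalDist`; the function name `z_interval` is illustrative and not part of the course material. It is applied to the textbook example and to the grade point average example just stated.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sd, n, c):
    """Interval x_bar -/+ z_c * sd / sqrt(n) at confidence level c.

    sd is sigma when it is known, or s when n >= 30.
    """
    # Critical value: the same quantile Excel returns for NORMSINV((1+c)/2).
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    margin = z_c * sd / sqrt(n)  # margin of error = z_c * standard error
    return x_bar - margin, x_bar + margin

textbook = z_interval(74.22, 23.44, 32, 0.95)
gpa = z_interval(2.86, 0.15, 25, 0.90)

print(f"textbook prices, 95%: {textbook[0]:.2f} to {textbook[1]:.2f}")
print(f"grade point average, 90%: {gpa[0]:.2f} to {gpa[1]:.2f}")
```

Because the critical value is computed rather than read from a rounded table, the endpoints may differ from hand calculations in the last decimal place for other inputs.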
n = 25, \(\bar{x} = 2.86\), σ = 0.15, \(z_c = 1.645\)

\[ \bar{x} \pm z_c \frac{\sigma}{\sqrt{n}} = 2.86 \pm 1.645 \cdot \frac{0.15}{\sqrt{25}} \approx 2.86 \pm 0.05 \]

2.81 < µ < 2.91

With 90% confidence we can say that the mean grade point average for all students in the population is between 2.81 and 2.91.

Confidence Intervals for the Mean (Small Samples)

The t-Distribution

When the sample size is less than 30, σ is unknown, and the random variable X is approximately normally distributed, use a t-distribution:

\[ \bar{x} \pm t_c \frac{s}{\sqrt{n}} \]

Properties of the t-distribution

1. The t-distribution is bell shaped and symmetric about the mean.
2. The t-distribution is a family of curves, each determined by a parameter called the degrees of freedom. The degrees of freedom are the number of free choices left after a sample statistic such as \(\bar{x}\) is calculated. When you use a t-distribution to estimate a population mean, the degrees of freedom are equal to one less than the sample size: d.f. = n − 1.

Degrees of Freedom (df)

Idea: the number of observations that are free to vary after the sample mean has been calculated.

Example: Suppose the mean of 3 numbers is 8.0. Let \(X_1 = 7\) and \(X_2 = 8\). What is \(X_3\)? If the mean of these three values is 8.0, then \(X_3\) must be 9 (i.e., \(X_3\) is not free to vary). Here n = 3, so degrees of freedom = n − 1 = 3 − 1 = 2: two values can be any numbers, but the third is not free to vary for a given mean.

t-Distribution

As the degrees of freedom increase, the t-distribution approaches the normal distribution. After 30 d.f., the t-distribution is very close to the standard normal z-distribution. t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal.

[Figure: the standard normal curve (t with df = ∞) plotted together with t (df = 13) and t (df = 5)]

In Excel, =TINV(probability, df) returns the critical value of t for the given degrees of freedom, where probability is the total area in the two tails, α. Note the difference in the way you enter the arguments for the t and the normal distribution.
For the t-distribution you enter the area in the tails, whereas for the normal distribution you enter the cumulative area under the curve from the extreme left up to a value on the x-axis.

Critical Values of t

Example: Find the critical value \(t_c\) for 95% confidence when the sample size is 5.

d.f. = 5 − 1 = 4 and c = 0.95, so =TINV(0.05,4) gives \(t_c = 2.776\). 95% of the area under the t-distribution curve with 4 degrees of freedom lies between t = −2.776 and t = 2.776.

Constructing a Confidence Interval for the Mean: t-Distribution

1. Identify the sample statistics: \(\bar{x} = \frac{\sum x}{n}\) and \(s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}\).
2. Identify the degrees of freedom, the level of confidence c, and the critical value \(t_c\): d.f. = n − 1, =TINV(alpha, df).
3. Find the left and right endpoints and form the confidence interval: \(\bar{x} \pm t_c \frac{s}{\sqrt{n}}\).

Example: A random sample of n = 25 taken from a normal population has \(\bar{x} = 50\) and s = 8. Form a 95% confidence interval for µ.

d.f. = n − 1 = 24, so \(t_c = 2.0639\), from =TINV(0.05,24). The confidence interval is

\[ \bar{x} \pm t_c \frac{s}{\sqrt{n}} = 50 \pm 2.0639 \cdot \frac{8}{\sqrt{25}} \]

46.698 ≤ µ ≤ 53.302

Example: In a random sample of 20 customers at a local fast food restaurant, the mean waiting time to order is 95 seconds, and the standard deviation is 21 seconds. Assume the wait times are normally distributed and construct a 90% confidence interval for the mean wait time of all customers.

n = 20, \(\bar{x} = 95\), s = 21, d.f. = 19, \(t_c = 1.729\), from =TINV(0.1,19)

\[ t_c \frac{s}{\sqrt{n}} = 1.729 \cdot \frac{21}{\sqrt{20}} \approx 8.1 \]

86.9 < µ < 103.1

We are 90% confident that the mean wait time for all customers is between 86.9 and 103.1 seconds.

Normal or t-Distribution?

Is n ≥ 30?
    Yes: Use the normal distribution with \(\bar{x} \pm z_c \frac{\sigma}{\sqrt{n}}\). If σ is unknown, use s instead.
    No: Is the population normally, or approximately normally, distributed?
        No: You cannot use the normal distribution or the t-distribution.
        Yes: Is σ known?
            Yes: Use the normal distribution with \(\bar{x} \pm z_c \frac{\sigma}{\sqrt{n}}\).
            No: Use the t-distribution with \(\bar{x} \pm t_c \frac{s}{\sqrt{n}}\) and n − 1 degrees of freedom.

Example: Determine whether to use the normal distribution, the t-distribution, or neither.

a.) n = 50, the distribution is skewed, s = 2.5
The normal distribution would be used because the sample size is 50, which is at least 30.

b.) n = 25, the distribution is skewed, s = 52.9
Neither distribution would be used because n < 30 and the distribution is skewed.

c.) n = 25, the distribution is normal, σ = 4.12
The normal distribution would be used because, although n < 30, the population is normal and the population standard deviation is known.

Question: The 95% confidence interval for the mean employee age at a major corporation is 19 years to 44 years, based on a z-statistic. The population of employees is more than 5,000 and the sample size is 100. Assuming the population is normally distributed, the standard error of mean employee age is closest to:
A. 1.96
B. 11.58
C. 6.38
D. 12.50

Answer: C. The sample mean is the midpoint of the interval, 31.5 years. At the 95% confidence level, with n = 100, the appropriate test statistic is z = 1.96. Thus, the confidence interval is \(31.5 \pm 1.96\, s_{\bar{x}}\), where \(s_{\bar{x}}\) is the standard error of the sample mean. Taking the upper bound, \(31.5 + 1.96\, s_{\bar{x}} = 44\), so \(s_{\bar{x}} = 12.5 / 1.96 \approx 6.38\).

Question: An agricultural inspector wants to know the level of vitamin C in a load of kiwi fruits. The inspector took a random sample of 25 kiwis from the ship's hold and measured the vitamin C content (in milligrams).

Milligrams of vitamin C per kiwi sampled:
109 88 91 136 93 101 89 97 115 92 114 106 94 109 110 97 89 117 105 92 83 79 107 100 93

Estimate the average level of vitamin C in the kiwi fruits and give a 95% confidence interval for this estimate.

Lower confidence limit: 95.01
Upper confidence limit: 105.47

APPENDIX

Student's t Table

    1-tail    0.25    0.1     0.05    0.025    0.01     0.005    0.001
    2-tails   0.5     0.2     0.1     0.05     0.02     0.01     0.002
    d.f.
    1         1.000   3.078   6.314   12.706   31.821   63.657   318.309
    2         0.816   1.886   2.920    4.303    6.965    9.925    22.327
    3         0.765   1.638   2.353    3.182    4.541    5.841    10.215
    4         0.741   1.533   2.132    2.776    3.747    4.604     7.173

The body of the table contains t values, not probabilities.

Example: Let n = 3, so df = n − 1 = 2, and let α = 0.10, so α/2 = 0.05 in each tail. The table gives t = 2.920, which corresponds to c = 1 − α = 0.90.

[Figure: t-distribution centred on 0 with area α/2 = 0.05 to the right of t = 2.920]
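The small-sample workflow (deciding which distribution applies, then forming the interval) can be sketched in Python and checked against the kiwi question. This is an illustration under stated assumptions rather than a reference solution: the function names are hypothetical, the critical value 2.0639 is the =TINV(0.05,24) figure quoted in the slides and is entered by hand because the Python standard library has no t-distribution, and the vitamin C content is assumed to be approximately normally distributed, which the question does not state.

```python
from math import sqrt
from statistics import mean, stdev

def choose_distribution(n, sigma_known, population_normal):
    """Follow the 'Normal or t-Distribution?' flowchart."""
    if n >= 30:
        return "normal"
    if not population_normal:
        return "neither"
    return "normal" if sigma_known else "t"

def t_interval(x_bar, s, n, t_c):
    """Interval x_bar -/+ t_c * s / sqrt(n); t_c is supplied by the caller."""
    margin = t_c * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Vitamin C content (mg) of the 25 sampled kiwis.
kiwi = [109, 88, 91, 136, 93, 101, 89, 97, 115, 92, 114, 106, 94,
        109, 110, 97, 89, 117, 105, 92, 83, 79, 107, 100, 93]

# Assumption: two-tailed critical value for alpha = 0.05 and 24 d.f.,
# as given by =TINV(0.05,24) in the slides.
T_95_DF24 = 2.0639

# Assumption: the population is approximately normal; sigma is unknown.
which = choose_distribution(len(kiwi), sigma_known=False, population_normal=True)
lower, upper = t_interval(mean(kiwi), stdev(kiwi), len(kiwi), T_95_DF24)

print(f"distribution: {which}")
print(f"point estimate: {mean(kiwi):.2f} mg")
print(f"95% interval: {lower:.2f} to {upper:.2f} mg")
```

Passing the critical value in as an argument keeps the sketch dependency-free; with a statistics package available, the same value could be computed from the t-distribution's quantile function instead.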