Estimation
Chapter 7

“Farewell! Thou art too dear for my possessing,
And like enough thou know’st thy estimate.”
William Shakespeare, Sonnet 87

MGMT 242

Topics and Goals for Chapter 7
• Unbiased point estimators for the population mean:
  – sample mean
  – sample median
  – sample trimmed mean
• Interval estimate of a population mean
• Confidence interval for a proportion
• Sample size and confidence intervals
• What to do when the population variance is unknown
• Confidence intervals with the Student’s t-distribution
• Assumptions for the Student’s t-distribution

Unbiased Estimators of the Population Mean
• An estimator of the mean, µhat (µ with a caret over it), is unbiased if E(µhat) = µ, that is, if the long-run average of µhat equals the population mean.
• The sample mean, xbar, is an unbiased estimator of µ.
• The sample median, xm, is an unbiased estimator of µ when the population distribution is symmetric.
• The “trimmed sample mean” (which leaves out the top 10% and bottom 10% of values) is likewise an unbiased estimator of µ for a symmetric population.
• Any weighted average of the sample values whose weights sum to 1 is an unbiased estimator of µ (see also Problem 7.8, where the middle of the range is used to estimate µ).

Efficient Estimators of the Population Mean
• The most efficient estimator of the population mean is the unbiased estimator whose estimates have the smallest standard deviation.
• Example: Problems 7.1, 7.2, 7.8 (Electronic Reserve).

Confidence Interval for Population Mean-I
• Suppose we measure a sample mean; how close is this value to the population mean?
• If we know the standard deviation of the population, this is a straightforward problem; if we don’t, it’s more complicated. For now we’ll suppose we know the value of σ, the population standard deviation.
• The sampling distribution of the sample mean, for sample size N, approaches a normal distribution with mean µ and standard deviation of the mean σ_mean = σ/√N as N gets large (practically, for N greater than about 10 to 20).
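The unbiasedness and efficiency statements above can be explored with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal (hence symmetric), and mu = 100, sigma = 15, N = 20, the number of trials, the seed, and the helper names trimmed_mean and simulate are illustrative choices that do not come from the text.

```python
import random
import statistics


def trimmed_mean(values, fraction=0.10):
    """Mean after dropping the lowest and highest `fraction` of the values."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)  # number of values dropped from each end
    return statistics.fmean(ordered[k:len(ordered) - k])


def simulate(mu=100.0, sigma=15.0, n=20, trials=5000, seed=242):
    """Draw many samples of size n from an assumed normal population and
    return {estimator name: (average estimate, std. dev. of the estimates)}."""
    rng = random.Random(seed)
    estimators = {
        "mean": statistics.fmean,
        "median": statistics.median,
        "trimmed mean": trimmed_mean,
    }
    estimates = {name: [] for name in estimators}
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        for name, estimator in estimators.items():
            estimates[name].append(estimator(sample))
    return {
        name: (statistics.fmean(vals), statistics.stdev(vals))
        for name, vals in estimates.items()
    }


results = simulate()
for name, (average, spread) in results.items():
    # An average near mu suggests unbiasedness; a smaller spread means
    # a more efficient estimator.
    print(f"{name:>12}: average = {average:7.2f}, std. dev. = {spread:5.2f}")
```

Under these assumptions one would expect all three averages to land near mu, and the sample mean to show the smallest standard deviation, close to sigma/√N = 15/√20 ≈ 3.35. A skewed population would be a useful variation, since the median and trimmed mean are then no longer unbiased for µ.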
Confidence Interval for Population Mean-II
• There is a 95% probability that the measured value of the sample mean, xbar, lies within the range µ - 1.96 σ_mean to µ + 1.96 σ_mean (see board demo).
• This corresponds to the inequality
  µ - 1.96 σ_mean ≤ xbar ≤ µ + 1.96 σ_mean
• With a little manipulation, the inequality above can be changed to the one for the confidence interval (CI):
  xbar - 1.96 σ_mean ≤ µ ≤ xbar + 1.96 σ_mean, where σ_mean = σ/√N.
• Interpretation: in the long run, 95% of trials give a value of xbar within these limits of µ, so 95% of the intervals constructed this way contain µ (“Concepts” example).

Confidence Interval for Population Mean-III
• General case: confidence interval (CI) for level (1-α)*100% (e.g., α = 0.05 corresponds to the 95% level).
• Then the (1-α) CI is given by
  xbar - z(1-α/2) σ_mean ≤ µ ≤ xbar + z(1-α/2) σ_mean,
  where σ_mean = σ/√N and z(1-α/2) is the z-score for the (1-α/2) centile (see board diagram):

  Confidence Level    z(1-α/2)
  90%                 1.645
  95%                 1.960
  99%                 2.576

Interpretation of Confidence Interval
[Figure: “Confidence Intervals for Sample Means”, from the “Concepts” StatPlus add-in; the salary axis runs from $4,600 to $6,600.]
• µ, the “true” mean salary, equals $5600.
• The diagram shows the 95% CI for the given σ and N.

Confidence Interval for Proportion-I
• General case: confidence interval (CI) for level (1-α)*100% (e.g., α = 0.05 corresponds to the 95% level).
• Then the (1-α) CI is given by
  p - z(1-α/2) σ_p ≤ π ≤ p + z(1-α/2) σ_p,
  where z(1-α/2) is the z-score for the (1-α/2) centile; π is the population proportion (proportion of yeses in a yes/no questionnaire, proportion testing positive in a medical test, etc.); p = x/N is the sample estimate of π (x is the number of successes in a sample of size N); and σ_p, the standard deviation of the proportion, is estimated by σ_p = {p(1-p)/N}^(1/2).

Confidence Interval for Proportion-II
• The (1-α) CI for 1-α = 0.95, a 95% CI, is given by
  p - 1.96 σ_p ≤ π ≤ p + 1.96 σ_p, with σ_p = {p(1-p)/N}^(1/2)
• Example (Ex. 7.20, text): 84 out of 125 individuals are aware of a certain product. A 95% CI for this proportion is given by p = 84/125 = 0.672 and 1.96 σ_p = 1.96 × [p(1-p)/N]^(1/2) = 0.082, so the 95% CI is 0.672 - 0.082 to 0.672 + 0.082, or (0.590, 0.754).

Sample Size Required for Given CI Width
• We know that the CI gets narrower as the sample size, N, increases. Suppose we require (at a certain confidence level) a specific width, E, for the CI. Then E = 2 z(1-α/2) σ_mean and, since σ_mean = σ/√N, we get N = (2 z(1-α/2) σ/E)^2.
• Example (Exercise 7.25, text): we want the 95% CI for insurance claims to be $50 wide (= E), with σ estimated at $400. Then N = (2 × 1.96 × 400/50)^2 = 983.4, which rounds up to 984.

Student’s t-Distribution for Unknown σ
• The sample standard deviation, s, is used to estimate the population standard deviation, σ, if it is unknown:
  – s = {Σ(xi - xbar)^2/(N-1)}^(1/2) for a sample of size N with mean xbar
• The uncertainty in the standard deviation (estimated from the sample) is clearly bigger the smaller the sample size.
• We have to account for this uncertainty by using a new statistic, the “Student’s-t” variable:
  – t = (x - µ)/s, for an individual value of the sample, or
  – t = (xbar - µ)/SEM, with SEM = s/√N, for the sample mean.

Student’s t-Distribution, Continued I
• The Student’s t-distribution gives the probability of the t-statistic (see previous slide) occurring by chance.
• The distribution clearly depends on the sample size, N.
• The larger the sample size, the more nearly the sample standard deviation, s, approaches the population standard deviation, σ.
• The effective sample size is the “degrees of freedom” (abbreviated “df”); df = N-1.
• For small N, the probability of a large |t| is higher than the probability of the same value of |z|, because the t-distribution has heavier tails (see “Concepts” illustration).

[Figure: Student’s-t distribution compared with the standard normal; horizontal axis from -4.00 to 4.00.]
• The graph compares a Student’s-t distribution (solid blue line) with the standard normal bell curve (dotted red line) for df = 2 (N = 3). Note that the Student’s-t curve lies above the z-curve for statistic values greater than 2 or less than -2, so extreme values are more probable under Student’s-t.

Student’s-t Distribution, Example
• Ex. 7.34, text. Comparison shopping at 14 New York area department stores to get refrigerator prices yields the following results:
  $341, 347, 319, 331, 326, 298, 335, 351, 316, 307, 335, 320, 329, 346
• Find the 95% confidence interval (CI) for the price (from Excel):
  xbar = $328.64
  SEM = s/√N = 15.49/√14 = $4.14
  df = 14 - 1 = 13
  t13 = 2.160 (from Table 4, text, or Excel)
  95% CI: 328.64 - 2.160 × 4.14 to 328.64 + 2.160 × 4.14, or $319.70 to $337.58
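The three worked examples in this chapter (Ex. 7.20, Exercise 7.25, and Ex. 7.34) can be rechecked with a few lines of Python. This is a sketch, not the text's own method: it uses only the standard library, the variable names are illustrative, and the t critical value 2.160 is copied from the slide because the standard library has no Student's-t quantile function. If SciPy is available, scipy.stats.t.ppf(0.975, 13) should give the same critical value.

```python
import math
import statistics

# z-score for the 97.5th centile, i.e. the multiplier for a 95% CI
z = statistics.NormalDist().inv_cdf(0.975)

# Ex. 7.20: 95% CI for a proportion, 84 successes out of 125
x, n = 84, 125
p = x / n
sigma_p = math.sqrt(p * (1 - p) / n)  # estimated std. dev. of the proportion
prop_ci = (p - z * sigma_p, p + z * sigma_p)

# Exercise 7.25: sample size for a 95% CI of total width E = $50, sigma = $400
width, sigma = 50.0, 400.0
n_required = math.ceil((2 * z * sigma / width) ** 2)  # round up to a whole unit

# Ex. 7.34: 95% CI for the mean refrigerator price, sigma unknown
prices = [341, 347, 319, 331, 326, 298, 335, 351, 316, 307, 335, 320, 329, 346]
xbar = statistics.fmean(prices)
s = statistics.stdev(prices)          # sample std. dev., N-1 in the denominator
sem = s / math.sqrt(len(prices))      # standard error of the mean
t_crit = 2.160                        # t for df = 13, taken from the slide
t_ci = (xbar - t_crit * sem, xbar + t_crit * sem)

print(f"proportion CI: ({prop_ci[0]:.3f}, {prop_ci[1]:.3f})")
print(f"required N:    {n_required}")
print(f"xbar = {xbar:.2f}, s = {s:.2f}, SEM = {sem:.2f}")
print(f"price CI:      ({t_ci[0]:.2f}, {t_ci[1]:.2f})")
```

The results should agree with the slide values to rounding. Using the normal multiplier 1.96 in place of 2.160 in the last example would give a narrower interval, which is the cost of ignoring the uncertainty in s for a sample of only 14.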