Confidence Intervals for Population Mean
Business Statistics

Plan for Today
• Inferential Statistics
• Point and Interval Estimates
• Confidence Intervals
• Estimating the required sample size
• Examples

Inferential Statistics
• Goal = use information obtained from a sample to increase our knowledge about the population from which the sample was taken (i.e., to estimate or make inferences about the population)
• 2 types:
  – Estimating the value of a population parameter
  – Testing a hypothesis
• Using the Sampling Distribution of the Sample Mean (SDSM) is key

Estimating a population mean
• One of the purposes of randomly sampling a population is to get an estimate of the mean of the population
• Usually, the best estimate of a population mean is the sample mean. Example: the mean SEL test score for a group of 64 students is 77.4, thus 77.4 is the best estimate of the mean score for the population of all students who take the SEL test
• The logic behind it is that a sample mean of 77.4 is most likely to come from a population with a mean of 77.4: this is a point estimate

Point and Interval Estimates
• Point estimate is when you estimate a specific value of a population parameter
  – Accuracy of the point estimate = SD of the distribution of sample means (how much the values in this distribution typically vary)
• Interval estimate is when you estimate a range in which the population parameter is likely to fall
  – You can do this because the distribution of means is generally a normal curve, thus you know the percentage of values that lie in a given area of the distribution: about 68% of all sample means lie within the mean ± 1 SD

Terminology
• Point estimate: a single number designed to estimate a quantitative parameter of a population, usually the corresponding sample statistic
• Interval estimate: an interval bounded by two values that is calculated from the sample and that is used to estimate the value of a population parameter
• Confidence interval: an interval estimate with a specified level of confidence
• Level of confidence \(1 - \alpha\): the proportion of all interval estimates that include the parameter being estimated; usually 90%, 95%, 98% or 99%

Example
Take a city, like Trenton, NJ. We want to know how much time it takes workers living in Trenton to get to work and back: the commuting time
• Sample = 36 workers from Trenton
• Mean = 49 minutes
• This mean becomes the point estimate for the population of all Trenton workers
• \(\sigma\) = 15 minutes

Example: continued
• This mean should be close to the population mean, \(\mu\)
• SDSM and the CLT tell us how close this mean, a point estimate, is to the population mean, \(\mu\)
• Recall: with a large enough sample the SDSM will be close to normally distributed

Recall: the Empirical Rule
• For a normal distribution, about 95% of the values lie within 2 standard deviations of the mean

Example: continued
If we knew the value of \(\mu\), the population mean, then we could calculate an interval within which about 95% of the sample average commuting times should fall:
\[ \text{from } \mu - 2\sigma_{\bar{x}} \text{ to } \mu + 2\sigma_{\bar{x}}, \]
i.e.
\[ \text{from } \mu - 2\frac{\sigma}{\sqrt{n}} \text{ to } \mu + 2\frac{\sigma}{\sqrt{n}}, \]
i.e.
\[ \text{from } \mu - 2\cdot\frac{15}{\sqrt{36}} \text{ to } \mu + 2\cdot\frac{15}{\sqrt{36}}, \]
i.e. from \(\mu - 5\) to \(\mu + 5\) minutes

Sampling Distribution of \(\bar{x}\)'s, unknown \(\mu\)
In algebraic terms:
\[ P(\mu - 5 < \bar{x} < \mu + 5) \approx 95\% \]

Interval Estimates
• Interval estimate: an interval bounded by two values that is calculated from the sample and that is used to estimate the value of a population parameter
• Level of confidence \(1 - \alpha\): the proportion of all interval estimates that include the parameter being estimated
• Confidence interval: an interval estimate with a specified level of confidence

Example: continued
What are the bounds of the interval centered at \(\bar{x} = 49\) minutes?
From \(\bar{x} - 2\sigma_{\bar{x}}\) to \(\bar{x} + 2\sigma_{\bar{x}}\), i.e. from 49 − 5 to 49 + 5 minutes
This means that the 95.44% confidence interval for \(\mu\) is from 44 to 54 minutes.
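The commuting-time interval above can be reproduced in a few lines. This is a minimal sketch in Python, not part of the original slides: the helper name `interval_around_mean` is hypothetical, and the default multiplier of 2 is the Empirical Rule value used in the example rather than an exact table value.

```python
import math


def interval_around_mean(sample_mean, sigma, n, multiplier=2.0):
    """Return (lower, upper) = sample_mean -/+ multiplier * sigma / sqrt(n).

    sigma is the known population standard deviation and n the sample size.
    """
    standard_error = sigma / math.sqrt(n)  # sigma_xbar, the SD of the SDSM
    margin = multiplier * standard_error
    return sample_mean - margin, sample_mean + margin


# Trenton example: 36 workers, sample mean 49 minutes, sigma = 15 minutes
lower, upper = interval_around_mean(49, 15, 36)
print(lower, upper)  # 44.0 54.0
```

Here the standard error is 15/6 = 2.5 minutes, so two standard errors give the 5-minute margin used in the slides.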
Confidence Intervals

Summary: Calculating Confidence Intervals
• Sample Mean: \(\bar{x}\)
• Sample Size: \(n\)
• Population standard deviation: \(\sigma\)
• Level of confidence we wish to have: \(1 - \alpha\)
\((1 - \alpha) \cdot 100\%\) gives us an estimate of how confident you can be that the population mean falls within this interval
0.95 · 100% = 95%: you are 95% confident that the population mean falls within this interval

Step by step
Estimation of Mean \(\mu\) (\(\sigma\) known)
Assumption: either the general population has a bell-shaped symmetric distribution, or the sample size is at least 25.

Confidence Coefficient \(z(\alpha/2)\)
\(z(\alpha/2)\) is the z-value with area \(\alpha/2\) to its right under the standard normal curve, so that the central area between \(-z(\alpha/2)\) and \(z(\alpha/2)\) is \(1 - \alpha\)

Constructing a Confidence Interval
• Step 1: Set-Up
  – Describe the population parameter of interest
• Step 2: The Confidence Interval Criteria
  – Check the assumptions
  – Identify the probability distribution and the formula to be used
  – State the level of confidence \(1 - \alpha\)
• Step 3: The Sample Evidence
  – Collect the sample information
• Step 4: The Confidence Interval
  – Determine the confidence coefficient \(z(\alpha/2)\)
  – Find the error bound for a population mean: \(EBM = z(\alpha/2) \cdot \dfrac{\sigma}{\sqrt{n}}\)
  – Find the lower and upper confidence limits
• Step 5: State the confidence interval: from \(\bar{x} - EBM\) to \(\bar{x} + EBM\) (units)

The confidence coefficient
• Some useful numbers from the table:
  – If \(1 - \alpha = 0.80\) (80%), then \(z(\alpha/2) = 1.28\)
  – If \(1 - \alpha = 0.90\) (90%), then \(z(\alpha/2) = 1.645\)
  – If \(1 - \alpha = 0.94\) (94%), then \(z(\alpha/2) = 1.88\)
  – If \(1 - \alpha = 0.95\) (95%), then \(z(\alpha/2) = 1.96\)
  – If \(1 - \alpha = 0.96\) (96%), then \(z(\alpha/2) = 2.055\)
  – If \(1 - \alpha = 0.98\) (98%), then \(z(\alpha/2) = 2.33\)
  – If \(1 - \alpha = 0.99\) (99%), then \(z(\alpha/2) = 2.575\)
Check for yourself!

Example: textbook cost
A random sample of 60 students from X University has revealed that their average annual textbook spending is $928. From previous studies, it is known that the standard deviation for annual textbook costs can be taken as $230. Find a 95% confidence interval for the mean annual textbook costs for all students at X University.
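Before working through the textbook example, the confidence coefficients tabulated earlier can be checked numerically. The sketch below is an illustration rather than the slides' own method: it uses Python's standard-library `statistics.NormalDist` in place of a printed z-table, and the function name `confidence_coefficient` is a hypothetical helper.

```python
from statistics import NormalDist


def confidence_coefficient(confidence_level):
    """Return z(alpha/2): the z-value with area alpha/2 in the upper tail."""
    alpha = 1.0 - confidence_level
    # inv_cdf takes the cumulative (left-tail) area, i.e. 1 - alpha/2
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


# The confidence levels listed in the table
for level in (0.80, 0.90, 0.94, 0.95, 0.96, 0.98, 0.99):
    print(f"1 - alpha = {level:.2f}: z(alpha/2) = {confidence_coefficient(level):.3f}")
```

The computed values should agree with the table to about two decimal places; entries such as 2.055 and 2.575 reflect the common convention of taking the midpoint between two adjacent table rows, so they can differ slightly in the third decimal.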
Example: textbook costs
Step 1: What is the population parameter of interest? The mean annual textbook cost, \(\mu\), for all students at X University.
Step 2: \(\sigma\) = $230 is known. Is a sample of 60 students good enough? (We need the sampling distribution to be approximately normal.) Yes, the sample is large enough, so we will use the standard normal distribution; the level of confidence is \(1 - \alpha = 0.95\) (95%)
Step 3: \(n = 60\), \(\bar{x}\) = $928
Step 4: 0.95/2 = 0.475, \(z(\alpha/2) = 1.96\) (table)
\[ EBM = z(\alpha/2) \cdot \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{230}{\sqrt{60}} = 58.2 \]
\(\bar{x} - EBM = 869.8\), \(\bar{x} + EBM = 986.2\)
Step 5: The 95% confidence interval for the population mean \(\mu\) is: from $870 to $986 (same precision as the data)

How to decrease the error?
• To decrease the value of EBM (and thus to decrease the width of the confidence interval for \(\mu\)) there are two possibilities:
  (A) Decrease the confidence level. A smaller confidence level results in a smaller \(z(\alpha/2)\) and thus a smaller EBM.
  (B) Increase the size of the sample. A larger value of \(n\) means a larger value of \(\sqrt{n}\) and thus a smaller value of EBM.
• Tradeoffs: (A) less certain, (B) more costly

Example: practice
A survey by Future Shop involving 35 households in the area revealed mean spending of $850 on home electronics during the last year. Construct a 98% confidence interval for the average annual spending on home electronics for all households in the area, if the population standard deviation is known to be $300.
Answer: from $732 to $968.

Estimating the sample size
• If we wish the error EBM to be smaller than a certain value, \(\varepsilon\), while the confidence level is fixed at \(1 - \alpha\), we can choose the necessary sample size:
\[ \varepsilon > EBM = z(\alpha/2) \cdot \frac{\sigma}{\sqrt{n}} \]
Thus,
\[ n > \left( \frac{z(\alpha/2) \cdot \sigma}{\varepsilon} \right)^{2} \]
• The number \(\left( \dfrac{z(\alpha/2) \cdot \sigma}{\varepsilon} \right)^{2}\) rounded up to the nearest integer is denoted by \(n_{min}\): the minimum required sample size.
• Example: a supermarket manager needs to estimate the average weekly grocery spending by his customers at a 90% level of confidence and with an error not exceeding $10.
What is the minimum sample size needed, if he knows that the population standard deviation is $60?

Example: grocery shopping
Solution.
• Given: \(1 - \alpha = 0.9\), \(\sigma\) = $60, \(\varepsilon\) = $10
• Find: \(n_{min}\)
• First, we have \(z(\alpha/2) = 1.645\)
• Now, we compute:
\[ \left( \frac{z(\alpha/2) \cdot \sigma}{\varepsilon} \right)^{2} = \left( \frac{1.645 \cdot 60}{10} \right)^{2} = 97.4 \]
• Thus, the minimum required sample size is \(n_{min} = 98\) customers

Example: practice
An insurance company wants to estimate the average mileage driven by residents per week in Hamilton, so that the error does not exceed 20 km at the 99% level of confidence. From other studies they know that the population standard deviation can be taken as 100 km. Estimate the sample size needed for this study.
Answer: 166 drivers
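The sample-size rule, together with the error bound it is derived from, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names `error_bound` and `min_sample_size` are hypothetical, and the z-values are passed in by hand from the table of confidence coefficients rather than computed.

```python
import math


def error_bound(z, sigma, n):
    """EBM = z * sigma / sqrt(n) for a known population SD sigma."""
    return z * sigma / math.sqrt(n)


def min_sample_size(z, sigma, max_error):
    """Round (z * sigma / max_error)^2 up to the nearest integer."""
    return math.ceil((z * sigma / max_error) ** 2)


# Grocery example: 90% confidence (z = 1.645), sigma = $60, error at most $10
n_grocery = min_sample_size(1.645, 60, 10)
print(n_grocery)  # 98

# Insurance practice problem: 99% confidence (z = 2.575), sigma = 100 km, error at most 20 km
print(min_sample_size(2.575, 100, 20))  # 166

# Rounding up matters: one fewer customer pushes the error bound above $10
print(error_bound(1.645, 60, n_grocery) < 10)      # True
print(error_bound(1.645, 60, n_grocery - 1) < 10)  # False
```

The last two lines show why the result is rounded up rather than to the nearest integer: with 97 customers the error bound slightly exceeds the $10 target.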