DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS
Introduction to Business Statistics
QM 220
Chapter 8
Estimation of the mean and proportion
Spring 2008
Dr. Mohammad Zainal

Estimation: An introduction

• Estimation is a procedure by which a numerical value or values are assigned to a population parameter based on the information collected from a sample.
• In inferential statistics, μ is called the true population mean and p is called the true population proportion. There are many other population parameters, such as the median, mode, variance, and standard deviation.
• Examples of estimation:
  • mean fuel consumption for a particular model of a car
  • average time taken by new employees to learn a job
  • mean housing expenditure per month incurred by households

QM-220, M. Zainal

• If we can conduct a census each time we want to find the value of a population parameter, then the estimation procedures are not needed.
• For example, if the Kuwaiti Census Bureau can contact every household in Kuwait to find the mean housing expenditure of households, the result of the survey will actually be a census.
• However, conducting a census:
  • is too expensive,
  • is very time consuming,
  • is virtually impossible, because every member of the population must be contacted.
• That is why we usually take a sample from the population and calculate the value of the appropriate sample statistic. Then we assign a value or values to the corresponding population parameter based on the value of the sample statistic.
• For example, to estimate the mean housing expenditure per month of all households in Kuwait, the Census Bureau will
  • take a sample of certain households
  • collect the information on the housing expenditure per month
  • compute the value of the sample mean
  • assign values to the population mean
• The value assigned to a population parameter based on the value of a sample statistic is called an estimate of the population parameter.
• The sample statistic used to estimate a population parameter is called an estimator.
• The estimation procedure involves the following steps:
  1. Select a sample.
  2. Collect the required information from the members of the sample.
  3. Calculate the value of the sample statistic.
  4. Assign value(s) to the corresponding population parameter.

Point and interval estimates

• An estimate may be a point estimate or an interval estimate.

A Point Estimate
• The value of a sample statistic that is used to estimate a population parameter is called a point estimate.
• Suppose the Census Bureau takes a sample of 10,000 households and determines that the mean housing expenditure per month, \(\bar{x}\), for this sample is $1370. Then, using \(\bar{x}\) as a point estimate of μ, the bureau can state that the mean housing expenditure per month, μ, for all households is about $1370.
• Usually, whenever we use point estimation, we calculate the margin of error associated with that point estimate.
• For the estimation of the population mean, the margin of error is calculated as follows, where \(\sigma_{\bar{x}}\) is the standard deviation of the sample mean and \(s_{\bar{x}}\) is its estimate from the sample:
  \[ \text{Margin of error} = \pm 1.96\,\sigma_{\bar{x}} \quad \text{or} \quad \pm 1.96\,s_{\bar{x}} \]

An Interval Estimate
• In interval estimation, instead of assigning a single value to a population parameter, an interval is constructed around the point estimate.
• For the example, instead of saying that the mean housing expenditure per month for all households is $1370, we may obtain an interval by subtracting a number from $1370 and adding the same number to $1370.
• Then we say that this interval contains the population mean, μ.
• For purposes of illustration, suppose we subtract $240 from $1370 and add $240 to $1370. Consequently, we obtain the interval ($1370 - $240) to ($1370 + $240), or $1130 to $1610.
• Then we state that the interval $1130 to $1610 is likely to contain the population mean, μ, and that the mean housing expenditure per month for all households in the United States is between $1130 and $1610.
• This procedure is called interval estimation. The value $1130 is called the lower limit of the interval and $1610 is called the upper limit of the interval.
• The question is, what number should we add to and subtract from the point estimate?
• The answer to this question depends on two considerations:
  • The standard deviation of the sample mean
  • The level of confidence to be attached to the interval
• First, the larger the standard deviation, the greater the number subtracted from and added to the point estimate.
• Second, the quantity subtracted and added must be larger if we want to have a higher confidence in our interval.
• Confidence Level and Confidence Interval: Each interval is constructed with regard to a given confidence level and is called a confidence interval.
• The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter.
• The confidence level is denoted by (1 - α)100%, where α is the Greek letter alpha. When expressed as a probability, it is called the confidence coefficient and is denoted by 1 - α.
• α is called the significance level.
• Any value of the confidence level can be chosen to construct a confidence interval; the more common values are 90%, 95%, and 99%.
The corresponding confidence coefficients are .90, .95, and .99.

Interval estimation of a population mean: large samples

• If the population standard deviation σ is not known, then we use the sample standard deviation s, in which case
  \[ s_{\bar{x}} = \frac{s}{\sqrt{n}} \quad \text{is used instead of} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]
• The (1 - α)100% confidence interval for μ is
  \[ \bar{x} \pm z\,\sigma_{\bar{x}} \quad \text{if } \sigma \text{ is known} \]
  \[ \bar{x} \pm z\,s_{\bar{x}} \quad \text{if } \sigma \text{ is unknown} \]
• The value of z used here is read from the standard normal distribution table for the given confidence level.
• The quantity \(z\sigma_{\bar{x}}\) (or \(z s_{\bar{x}}\) when σ is not known) in the confidence interval formula is called the maximum error of estimate and is denoted by E.
• To find z:
  1. Divide (1 - α) by 2.
  2. Locate the answer in the body of the standard normal distribution table and record the corresponding value of z.

Example: A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market. The research department at the company took a sample of 36 comparable textbooks and collected information on their prices. This information produced a mean price of $90.50 for this sample. It is known that the standard deviation of the prices of all such textbooks is $7.50.
(a) What is the point estimate of the mean price of all such college textbooks? What is the margin of error for this estimate?
(b) Construct a 90% confidence interval for the mean price of all such college textbooks.

Example: According to CardWeb.com, the mean bank credit card debt for households was $7868 in 2004.
Assume that this mean was based on a random sample of 900 households and that the standard deviation of such debts for all households in 2004 was $2070. Make a 99% confidence interval for the 2004 mean bank credit card debt for all households.

• The width of a confidence interval depends on the size of the maximum error, E, which depends on the values of z, σ, and n. Why?
• But we have no control over σ. Why?
• So, the width depends only on:
  • The value of z, which depends on the confidence level
  • The sample size n
• The value of z increases as the confidence level increases.
• For the same value of σ, an increase in n decreases the value of \(\sigma_{\bar{x}} = \sigma/\sqrt{n}\), which in turn decreases the size of E when the confidence level remains unchanged.
• If we want to decrease the width of a confidence interval, we have two choices:
  • Lower the confidence level.
  • Increase the sample size.
• Lowering the confidence level is not a good choice because a lower confidence level may give less reliable results.
• Increasing the sample size n is the best way to decrease the width of a confidence interval.

Confidence level and the width of the confidence interval

Reconsider the last example. Suppose all the information given in that example remains the same. First, let us decrease the confidence level to 95%.
• From the normal distribution table, z = 1.96 for a 95% confidence level. Then, using z = 1.96 in the confidence interval, we obtain
  \[ \bar{x} \pm z\,\sigma_{\bar{x}} = 7868 \pm 1.96\,\frac{2070}{\sqrt{900}} = 7868 \pm 135.24 \]
  that is, $7732.76 to $8003.24.
• The 95% confidence interval is narrower than the 99% interval.

Sample size and the width of the confidence interval

Reconsider the last example.
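As a rough numerical companion to the CardWeb.com example (sample mean $7868, σ = $2070, n = 900), the following Python sketch computes the large-sample interval \(\bar{x} \pm z\sigma_{\bar{x}}\) at the 99% and 95% levels. The function name `z_interval` is illustrative and not part of the course material. It computes z with the standard library instead of reading it from the table, so the limits may differ from table-based answers in the last digit or two.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Large-sample confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    # z leaves an area of alpha/2 in the right tail of the standard normal curve
    z = NormalDist().inv_cdf(1 - alpha / 2)
    std_error = sigma / sqrt(n)      # standard deviation of the sample mean
    max_error = z * std_error        # maximum error of estimate, E
    return x_bar - max_error, x_bar + max_error


# CardWeb.com example: compare the 99% and 95% intervals for the same sample
for level in (0.99, 0.95):
    lower, upper = z_interval(7868, 2070, 900, level)
    print(f"{level:.0%} interval: {lower:.2f} to {upper:.2f} (width {upper - lower:.2f})")
```

Because n is an argument of the function, the same call can be repeated with another sample size to see how the width responds.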
Suppose we change n to 2500 and all other information remains the same.
• The width of the confidence interval for n = 2500 is smaller than that for n = 900.

Example: The standard deviation for a population is 6.30. A random sample selected from this population gave a mean equal to 81.90.
a. Make a 99% confidence interval for μ assuming n = 36.
b. Make a 99% confidence interval for μ assuming n = 81.
c. Make a 99% confidence interval for μ assuming n = 100.
d. Does the width of the confidence intervals constructed in parts a through c decrease as the sample size increases? Why?

Interval estimation of a population proportion: large samples

• Many times we want to estimate the population proportion.
• Examples:
  • The production manager of a company wants to estimate the proportion of defective items produced on a machine.
  • A bank manager may want to know the percentage of customers who are satisfied with the bank's services.
• Recall:
  • The sampling distribution of the sample proportion \(\hat{p}\) is (approximately) normal.
  • The mean of the sampling distribution of \(\hat{p}\) is equal to the population proportion p.
  • The standard deviation of the sampling distribution of the sample proportion is \(\sigma_{\hat{p}} = \sqrt{pq/n}\), where q = 1 - p. Because p and q are unknown, it is estimated by
    \[ s_{\hat{p}} = \sqrt{\frac{\hat{p}\,\hat{q}}{n}}, \quad \hat{q} = 1 - \hat{p} \]
• The margin of error is \( z\,s_{\hat{p}} \).
• The (1 - α)100% confidence interval for p is
  \[ \hat{p} \pm z\,s_{\hat{p}} \]

Example: According to a 2002 survey, 20% of Americans needed legal advice during the past year to resolve such thorny issues as family trusts and landlord disputes. Suppose a recent sample of 1000 adult Americans showed that 20% of them needed legal advice during the past year to resolve such family-related issues.
(a) What is the point estimate of the population proportion? What is the margin of error for this estimate?
(b) Construct a 99% confidence interval for the proportion of all adult Americans who needed legal advice during the past year.

Example: According to the analysis of a CNN-USA TODAY-Gallup poll conducted in October 2002, "Stress has become a common part of everyday life in the United States. The demands of work, family, and home place an increasing burden on the average American." According to this poll, 40% of Americans included in the survey indicated that they had a limited amount of time to relax (Gallup.com, November 8, 2002). The poll was based on a randomly selected national sample of 1502 adults aged 18 and older. Construct a 95% confidence interval for the corresponding population proportion.

Example:
a. A sample of 400 observations taken from a population produced a sample proportion of .63. Make a 95% confidence interval for p.
b. Another sample of 400 observations taken from the same population produced a sample proportion of .59. Make a 95% confidence interval for p.
c. Another sample of 400 observations taken from the same population produced a sample proportion of .67. Make a 95% confidence interval for p.

Determining the sample size for the estimation of mean

• The main reason we usually conduct a sample survey instead of a census is our limited resources.
• If a smaller sample can serve our purpose, then there is no need to take a bigger sample.
• Suppose we run a test to estimate the mean life of a battery. If 40 batteries can give us the required confidence interval, why should we waste our money by buying more batteries?
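To put rough numbers on the battery question, the sketch below tabulates the maximum error of estimate, E = zσ/√n, for several sample sizes. The standard deviation of battery life (5 hours), the 95% confidence level, and the name `max_error` are hypothetical choices made only for illustration; none of them come from the course material.

```python
from math import sqrt
from statistics import NormalDist


def max_error(sigma, n, confidence=0.95):
    """Maximum error of estimate E = z * sigma / sqrt(n) for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)


SIGMA_HOURS = 5.0  # hypothetical population standard deviation of battery life

# Each quadrupling of n only halves E, so extra batteries buy less and less precision
for n in (40, 160, 640):
    print(f"n = {n:4d}: E = {max_error(SIGMA_HOURS, n):.2f} hours")
```

Once E is small enough for the purpose at hand, a larger sample adds cost without adding much precision.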
• The question is how we can decide the minimum sample size needed to produce a confidence interval with a given confidence level (1 - α)100% and a given maximum error E.
• Recall that E is a function of z, σ, and n. That is,
  \[ E = z \cdot \frac{\sigma}{\sqrt{n}} \]
• If we fix z, σ, and E and solve for n, the sample size can be found using
  \[ n = \frac{z^{2}\sigma^{2}}{E^{2}} \]
• If we don't know σ, then s can be used instead by taking a pilot sample of any arbitrary size.

Example: An alumni association wants to estimate the mean debt of this year's college graduates. It is known that the population standard deviation of the debts of this year's college graduates is $11,800. How large a sample should be selected so that the estimate with a 99% confidence level is within $800 of the population mean?

Determining the sample size for the estimation of proportion

• As with the sample mean, we can determine the sample size for the sample proportion.
• The only difference is the standard deviation. The maximum error is
  \[ E = z \cdot \sqrt{\frac{pq}{n}} \]
  so the sample size can be found using
  \[ n = \frac{z^{2}\,p\,q}{E^{2}} \]
• If p is not known, we choose a conservative sample size n by using p = q = .50. Why?
• Alternatively, we estimate p using a preliminary sample.

Example: Lombard Electronics Company has just installed a new machine that makes a part that is used in clocks. The company wants to estimate the proportion of these parts produced by this machine that are defective. The company manager wants this estimate to be within .02 of the population proportion for a 95% confidence level. What is the most conservative estimate of the sample size that will limit the maximum error to within .02 of the population proportion?

Example: Consider the previous example again. Suppose a preliminary sample of 200 parts produced by this machine showed that 7% of them are defective. How large a sample should the company select so that the 95% confidence interval for p is within .02 of the population proportion?

Interval estimation of a population mean: small samples

• In a previous section, we considered estimating the population mean for large samples (n ≥ 30).
• Using the central limit theorem (CLT), we assumed that the sampling distribution of the sample mean is approximately normal regardless of the shape of the population and whether or not σ is known.
• Unfortunately, many times we are restricted to small samples due to the nature of the experiment.
• For instance:
  • Clinical trials
  • Space missions
• If we are dealing with small sample sizes, we will have two scenarios:
  1. The original population is normal and σ is known.
  2. The original population is (approximately) normal and σ is unknown.
• In the first scenario, we use the normal distribution to construct the confidence interval of μ.
• In the second scenario, we can't use the normal distribution to construct the confidence interval of μ. Instead, we will use another distribution called the t-distribution.

Conditions under which the t-distribution is used to make a confidence interval about μ:
1. The population from which the sample is drawn is (approximately) normally distributed.
2. The sample size is small (that is, n < 30).
3. The population standard deviation, σ, is not known.

The t distribution

• The t distribution is a specific type of bell-shaped distribution with a lower height and a wider spread than the standard normal distribution.
• As the sample size becomes larger, the t distribution approaches the standard normal distribution.
• The t distribution has only one parameter, called the degrees of freedom (df):
  \[ df = n - 1 \]
• The mean of the t distribution is equal to 0 and its standard deviation is \(\sqrt{df/(df - 2)}\), which is defined for df > 2.
• The units of the t distribution are denoted by t.

Example: Find the value of t for n = 10 and .05 area in the right tail. Also, find its standard deviation.
Solution: df = n - 1 = 9, so the standard deviation is \(\sqrt{9/7} = 1.134\). The required value of t for 9 df and .05 area in the right tail is t = 1.833.

Example: Find the value of t for n = 10 and .05 area in the left tail. Also, find its standard deviation.
Solution: Again df = 9 and the standard deviation is 1.134. Because the t distribution is symmetric about 0, the required value is t = -1.833.

Confidence interval for μ using the t distribution

If the following three conditions hold true, we use the t distribution to make a confidence interval about μ.
1. The population from which the sample is drawn is (approximately) normally distributed.
2. The sample size is small (that is, n < 30).
3. The population standard deviation, σ, is not known.

• The (1 - α)100% confidence interval for μ for small samples is
  \[ \bar{x} \pm t\,s_{\bar{x}}, \quad \text{where } s_{\bar{x}} = \frac{s}{\sqrt{n}} \]
• The value of t is obtained from the t distribution table for n - 1 df and the given confidence level.

Example: A doctor wanted to estimate the mean cholesterol level for all adult men living in Dasmah. He took a sample of 25 adult men from Hartford and found that the mean cholesterol level for this sample is 186 with a standard deviation of 12. Assume that the cholesterol levels for all adult men in Hartford are (approximately) normally distributed. Construct a 95% confidence interval for the population mean μ.

Example: Twenty-five randomly selected adults who buy books for general reading were asked how much they usually spend on books per year. The sample produced a mean of $1450 and a standard deviation of $300 for such annual expenses. Assume that such expenses for all adults who buy books for general reading have an approximate normal distribution. Determine a 99% confidence interval for the corresponding population mean μ.
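The two small-sample examples above can be checked with a short Python sketch of the interval \(\bar{x} \pm t\,s_{\bar{x}}\). The function name `t_interval` is illustrative. The t values are not given in the slides: they are the usual t-table entries for df = 24 (2.064 for a 95% level and 2.797 for a 99% level), supplied here as an assumption and passed in by hand, in the same way the slides read t from the table.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_value):
    """Small-sample confidence interval for the mean: x_bar +/- t * s / sqrt(n)."""
    std_error = s / sqrt(n)          # estimated standard deviation of the sample mean
    max_error = t_value * std_error
    return x_bar - max_error, x_bar + max_error


# Both examples have n = 25, so df = n - 1 = 24.
# Assumed t-table values for df = 24 (not listed in the slides):
T_95_DF24 = 2.064   # .025 area in each tail
T_99_DF24 = 2.797   # .005 area in each tail

lower, upper = t_interval(186, 12, 25, T_95_DF24)
print(f"Cholesterol, 95% interval: {lower:.2f} to {upper:.2f}")

lower, upper = t_interval(1450, 300, 25, T_99_DF24)
print(f"Book spending, 99% interval: ${lower:.2f} to ${upper:.2f}")
```

Passing t as an argument keeps the sketch free of third-party libraries. A statistics package that provides the t distribution could supply the value instead of the table.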