Estimation and Confidence Intervals

Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.

When you have completed this chapter, you will be able to:
• Define a point estimator, a point estimate, and desirable properties of a point estimator such as unbiasedness, efficiency, and consistency
• Define an interval estimator and an interval estimate
• Define a confidence interval, confidence level, margin of error, and a confidence interval estimate
• Construct a confidence interval for the population mean when the population standard deviation is known
• Construct a confidence interval for the population variance when the population is normally distributed
• Construct a confidence interval for the population mean when the population is normally distributed and the population standard deviation is unknown
• Construct a confidence interval for a population proportion
• Determine the sample size for attribute and variable sampling

Terminology

Point Estimate …is a single value (statistic) used to estimate a population value (parameter)
Interval Estimate …states the range within which a population parameter probably lies
Confidence Interval …is a range of values within which the population parameter is expected to occur

Desirable properties of a point estimator
• efficient …possible values are concentrated close to the value of the parameter
• consistent …as the sample size increases, the values of the estimator concentrate ever more closely around the value of the parameter
• unbiased …the expected value of the estimator equals the value of the population parameter being estimated. Otherwise, it is biased!
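Unbiasedness can be checked exactly on a small example. The sketch below is an illustration only: the six-value population and the function names are assumptions, not part of the slides. It lists every equally likely sample of size 2 drawn with replacement and compares the expected value of three estimators with the population values they estimate.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population, chosen only for illustration.
population = [1, 2, 3, 4, 5, 6]
N = len(population)
mu = Fraction(sum(population), N)                      # population mean
sigma2 = sum((x - mu) ** 2 for x in population) / N    # population variance

n = 2
# Every equally likely sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))


def expected(statistic):
    """Exact expected value of a statistic over all possible samples."""
    return sum(statistic(s) for s in samples) / len(samples)


def sample_mean(s):
    return Fraction(sum(s), len(s))


def variance_divisor_n_minus_1(s):
    m = sample_mean(s)
    return sum((x - m) ** 2 for x in s) / (len(s) - 1)


def variance_divisor_n(s):
    m = sample_mean(s)
    return sum((x - m) ** 2 for x in s) / len(s)


e_mean = expected(sample_mean)
e_var_unbiased = expected(variance_divisor_n_minus_1)
e_var_biased = expected(variance_divisor_n)

print("population mean:", mu, "| E[sample mean]:", e_mean)
print("population variance:", sigma2)
print("E[sample variance, divisor n-1]:", e_var_unbiased)
print("E[sample variance, divisor n]:", e_var_biased)
```

In this setting the sample mean and the sample variance with divisor n - 1 have expected values equal to the population mean and variance, while the divisor-n version falls short on average, so it is biased.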
Terminology

Standard error of the sample mean …is the standard deviation of the sampling distribution of the sample means. It is computed by

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

\(\sigma_{\bar{x}}\) …is the symbol for the standard error of the sample mean
\(\sigma\) …is the standard deviation of the population
\(n\) …is the size of the sample

Standard Error of the Means

If \(\sigma\) is not known and \(n > 30\), the standard deviation of the sample, \(s\), is used to approximate the population standard deviation. Computed by…

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} \]

Factors …that determine the width of a confidence interval are:
1. The sample size, \(n\)
2. The variability in the population, usually estimated by \(s\)
3. The desired level of confidence

Constructing Confidence Intervals

IN GENERAL, a confidence interval for a mean is computed by:

\[ \bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}} \]

Interpreting Confidence Intervals

Suppose that you read in The Globe that “…the average selling price of a family home in York Region is $200 000 +/- $15 000 at 95% confidence!” This means…what?

In statistical terms, this means that we are 95% sure that the interval estimate obtained contains the value of the population mean.
Lower confidence limit is $185 000 ($200 000 - $15 000)
Upper confidence limit is $215 000 ($200 000 + $15 000)

Also… your newspaper reports that “…the mean time to sell a family home in York Region is 40 days.”
You select a random sample of 36 homes sold during the past year, and determine a 90% confidence interval estimate for the population mean to be (31-39) days. Do your sample results support the paper’s claim?

Lower confidence limit is 31 days
Upper confidence limit is 39 days

Our evidence does not support the statement made by the newspaper, i.e., the population mean is not 40 days, when using a 90% interval estimate. There is a 10% chance (100% - 90%) that the interval estimate does not contain the value of the population mean!

90% Confidence Interval … 10% chance of falling outside this interval
…or, focus on tail areas … i.e. \(\alpha = 0.10\), with 0.05 in each tail
[Figure: normal curve with the central 90% between 31 and 39 days and 0.05 in each tail]
\(\alpha\) is the probability of a value falling outside the confidence interval

Find the appropriate value of z:

\[ P\left(\bar{X} - z\,\frac{s}{\sqrt{n}} < \mu < \bar{X} + z\,\frac{s}{\sqrt{n}}\right) = 0.92 \]

This is a 92% confidence interval. Locate the area on the normal curve: search in the centre of the table for an area of 0.46 (half of 0.92) to get the corresponding z-score.
z = +/- 1.75
[Figure: standard normal curve with an area of 0.92 between z = -1.75 and z = 1.75]

Constructing Confidence Intervals

Common Confidence Intervals
95% C.I. for the mean: \(\bar{x} \pm 1.96\,\frac{s}{\sqrt{n}}\)
99% C.I. for the mean: \(\bar{x} \pm 2.58\,\frac{s}{\sqrt{n}}\)

About 95% of the constructed intervals will contain the parameter being estimated.
Also, 95% of the sample means for a specified sample size will lie within 1.96 standard deviations of the hypothesized population mean.

Interval Estimates

If the population standard deviation is known or n > 30, use the z table…
\[ \bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}} \]
If the population standard deviation is unknown and n < 30, use the t-table…
\[ \bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} \]
More on this later…

The Dean of the Business School wants to estimate the mean number of hours worked per week by students. A sample of 49 students showed a mean of 24 hours with a standard deviation of 4 hours.

What is the population mean? Our best estimate is 24 hours. This is a point estimate.

Find the 95 percent confidence interval for the population mean.
Data: mean = 24, SD = 4, n = 49, z = +/- 1.96 (95% confidence, commonly denoted as \(1 - \alpha\))
Substitute values:
\[ 24 \pm 1.96\,\frac{4}{\sqrt{49}} = 24 \pm 1.12 \]
The confidence limits range from 22.88 to 25.12.

Interval Estimates
90% confidence level: \(1 - \alpha = 0.90\) or \(\alpha = 0.10\)
99% confidence level: \(1 - \alpha = 0.99\) or \(\alpha = 0.01\)

Student’s t-distribution …used for small sample sizes

Characteristics
…like z, the t-distribution is continuous
…it is unbounded, although for moderate degrees of freedom nearly all values fall between -4 and +4
…it is bell-shaped and symmetric about zero
…it is more spread out and flatter at the centre than the z-distribution
…for larger and larger values of degrees of freedom, the t-distribution becomes closer and closer to the standard normal distribution

Chart 9-1 Comparison of the Standard Normal Distribution and the Student’s t Distribution
[Figure: the t distribution is flatter and more spread out than the z distribution]
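The z-based calculations earlier in this chapter (the table lookups and the Dean's interval for n = 49) can be reproduced without a printed table. This is a minimal sketch using Python's standard library; the function names `z_critical` and `z_interval` are illustrative choices, not notation from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_interval(mean, sd, n, confidence):
    """Return (lower, upper) for mean +/- z * sd / sqrt(n)."""
    margin = z_critical(confidence) * sd / sqrt(n)
    return mean - margin, mean + margin


# Critical values for the confidence levels used in the slides.
for level in (0.90, 0.92, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")

# Dean's example: 49 students, mean 24 hours, standard deviation 4 hours.
lower, upper = z_interval(mean=24, sd=4, n=49, confidence=0.95)
print(f"95% CI for mean hours worked: {lower:.2f} to {upper:.2f}")
```

The computed critical values are slightly more precise than the two-decimal table values quoted in the slides (for example, about 2.576 rather than 2.58 at 99%), which does not change the Dean's limits at two decimal places.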
Student’s t-distribution

…with df = 9 and 0.10 area in the upper tail… t = 1.383
[Figure: t distribution with an area of 0.10 to the right of t = 1.383]

t-table

df     80%      90%      95%      98%      99%    (confidence interval)
      0.100    0.050    0.025    0.010    0.005   (level of significance, one-tailed test)
      0.20     0.10     0.05     0.02     0.01    (level of significance, two-tailed test)
 1    3.078    6.314   12.706   31.821   63.657
 2    1.886    2.920    4.303    6.965    9.925
 3    1.638    2.353    3.182    4.541    5.841
 4    1.533    2.132    2.776    3.747    4.604
 5    1.476    2.015    2.571    3.365    4.032
 6    1.440    1.943    2.447    3.143    3.707
 7    1.415    1.895    2.365    2.998    3.499
 8    1.397    1.860    2.306    2.896    3.355
 9    1.383    1.833    2.262    2.821    3.250
10    1.372    1.812    2.228    2.764    3.169
11    1.363    1.796    2.201    2.718    3.106

For df = 9 and a one-tailed level of significance of 0.100, t = 1.383.

When? …to use the z Distribution or the t Distribution

Population normal?
  NO  → n 30 or more?
          NO  → Use a nonparametric test (see Ch16)
          YES → Use the z distribution
  YES → Population standard deviation known?
          NO  → Use the t distribution
          YES → Use the z distribution

Student’s t-distribution

The Dean of the Business School wants to estimate the mean number of hours worked per week by students. A sample of only 12 students showed a mean of 24 hours with a standard deviation of 4 hours. Find the 95 percent confidence interval for the population mean.

n is small so use the t-distribution

Data: \(\bar{x} = 24\), \(s = 4\), \(n = 12\), df = 12 - 1 = 11
Formula:
\[ \bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} \]
Looking up 5% level of significance for a two-tailed test with 11 df, we find…
\(\alpha = 1 - 95\% = 0.05\)

t-table, row for df = 11 (level of significance for a two-tailed test):
  0.20     0.10     0.05     0.02     0.01
 1.363    1.796    2.201    2.718    3.106
…t = 2.201 (0.025 in each tail)

\[ 24 \pm 2.201\,\frac{4}{\sqrt{12}} = 24 \pm 2.54 \]

The confidence limits range from 21.46 to 26.54. Compare these with the earlier limits of 22.88 to 25.12.

Student’s t-distribution

The manager of the college cafeteria wants to estimate the mean amount spent per customer per purchase. A sample of 10 customers revealed the following amounts spent:
$4.45 $4.05 $4.95 $3.25 $4.68 $5.75 $6.01 $3.99 $5.25 $2.95
Determine the 99% confidence interval for the mean amount spent.

Step 1 Determine the sample mean and standard deviation.
\(\bar{x}\) = $4.53, \(s\) = $1.00

Step 2 Enter the key data into the appropriate formula.
n = 10, df = 10 - 1 = 9, \(\alpha\) = 1 - 99% = 0.01, t = 3.25
\[ \bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 4.53 \pm 3.25\,\frac{1.00}{\sqrt{10}} = 4.53 \pm 1.03 \]

We are 99% confident that the mean amount spent per customer is between $3.50 and $5.56.
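Both small-sample examples can be re-run in a few lines. The sketch below is an illustration, not the textbook's procedure: the name `t_interval` and the `T_TABLE` dictionary are assumptions, and the critical values are simply copied from the chapter's t-table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Two-tailed critical values copied from the chapter's t-table,
# keyed by (degrees of freedom, confidence level). Only the two
# entries needed for the worked examples are included.
T_TABLE = {
    (11, 0.95): 2.201,
    (9, 0.99): 3.250,
}


def t_interval(x_bar, s, n, confidence):
    """Return (lower, upper) for x_bar +/- t * s / sqrt(n), with df = n - 1."""
    t = T_TABLE[(n - 1, confidence)]
    margin = t * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Dean's example with only 12 students.
low_hours, high_hours = t_interval(x_bar=24, s=4, n=12, confidence=0.95)
print(f"hours worked: {low_hours:.2f} to {high_hours:.2f}")

# Cafeteria example: summary statistics computed from the raw amounts.
amounts = [4.45, 4.05, 4.95, 3.25, 4.68, 5.75, 6.01, 3.99, 5.25, 2.95]
low_spend, high_spend = t_interval(mean(amounts), stdev(amounts), len(amounts), 0.99)
print(f"amount spent: ${low_spend:.2f} to ${high_spend:.2f}")
```

`statistics.stdev` uses the n - 1 divisor, which matches the sample standard deviation used in the slides.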
Constructing Confidence Intervals for Population Proportions

A confidence interval for a population proportion is estimated by:
\[ p \pm z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \]
\(p\) …is the symbol for the sample proportion

A sample of 500 executives who own their own home revealed 175 planned to sell their homes and retire to Victoria. Develop a 98% confidence interval for the proportion of executives that plan to sell and move to Victoria.

n = 500, p = 175/500 = 0.35, z = 2.33
\[ \text{98\% CL} = 0.35 \pm 2.33\sqrt{\frac{0.35\,(1-0.35)}{500}} = 0.35 \pm 0.0497 \]

Finite-Population Correction Factor

Used when n/N is 0.05 or more
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \]
where \(\sqrt{(N-n)/(N-1)}\) is the correction factor.

The attendance at the college hockey game last night was 2700. A random sample of 250 of those in attendance revealed that the average number of drinks consumed per person was 1.8 with a standard deviation of 0.40. Develop a 90% confidence interval estimate for the mean number of drinks consumed per person.

Formula:
\[ \bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \]
N = 2700, n = 250, \(\bar{x}\) = 1.8, s = 0.40, \(\alpha/2\) = 0.05
Since 250/2700 > 0.05, use the correction factor
\[ \text{90\% CL} = 1.8 \pm 1.645\left(\frac{0.40}{\sqrt{250}}\right)\sqrt{\frac{2700-250}{2700-1}} = 1.8 \pm 0.04 \]

Selecting the Sample Size

Factors …that determine the sample size are:
1. The degree of confidence selected
2. The maximum allowable error
3. The variation in the population

Formula:
\[ n = \left(\frac{z_{\alpha/2}\,s}{E}\right)^{2} \]
\(E\) …is the allowable error
\(z\) …is the z-score for the chosen level of confidence
\(s\) …is the sample standard deviation from the pilot survey

A consumer group would like to estimate the mean monthly electricity charge for a single family house in July (within $5) using a 99 percent level of confidence. Based on similar studies the standard deviation is estimated to be $20.00. How large a sample is required?

\[ n = \left(\frac{2.58 \times 20}{5.00}\right)^{2} = (10.32)^{2} = 106.5 \]
A minimum of 107 homes must be sampled.

Selecting the Sample Size

The Kennel Club wants to estimate the proportion of children that have a dog as a pet. Assume a 95% level of confidence and that the club estimates that 30% of the children have a dog as a pet. If the club wants the estimate to be within 3% of the population proportion, how many children would they need to contact?
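The proportion interval, the finite-population correction and the two sample-size formulas can be cross-checked numerically. The following is a rough sketch under stated assumptions: the function names are invented for illustration, and the z-values are the rounded table values used in the slides (2.33, 1.645, 2.58, 1.96) rather than exact quantiles.

```python
from math import ceil, sqrt


def proportion_interval(successes, n, z):
    """Return (p_hat, margin) for p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin


def fpc_mean_interval(x_bar, s, n, N, z):
    """Return (x_bar, margin); the correction is applied only when n/N >= 0.05."""
    standard_error = s / sqrt(n)
    if n / N >= 0.05:
        standard_error *= sqrt((N - n) / (N - 1))
    return x_bar, z * standard_error


def sample_size_mean(z, s, E):
    """n = (z * s / E)^2, rounded up to the next whole observation."""
    return ceil((z * s / E) ** 2)


def sample_size_proportion(z, p, E):
    """n = p * (1 - p) * (z / E)^2, rounded up to the next whole observation."""
    return ceil(p * (1 - p) * (z / E) ** 2)


# Executives planning to retire to Victoria (98% confidence).
p_hat, p_margin = proportion_interval(successes=175, n=500, z=2.33)
print(f"executives: {p_hat:.2f} +/- {p_margin:.4f}")

# Drinks per person at the hockey game (90% confidence, N = 2700).
x_bar, m_margin = fpc_mean_interval(x_bar=1.8, s=0.40, n=250, N=2700, z=1.645)
print(f"drinks per person: {x_bar:.1f} +/- {m_margin:.2f}")

# Electricity charge (99% confidence, within $5).
n_homes = sample_size_mean(z=2.58, s=20.0, E=5.0)
print("homes to sample:", n_homes)

# Kennel Club question (95% confidence, within 3%).
n_children = sample_size_proportion(z=1.96, p=0.30, E=0.03)
print("children to contact:", n_children)
```

Rounding up with `ceil` reflects the convention in the slides that a fractional result such as 106.5 calls for the next whole observation.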
Selecting the Sample Size

The Kennel Club wants to estimate the proportion of children that have a dog as a pet. Assume a 95% level of confidence and that the club estimates that 30% of the children have a dog as a pet.

Formula:
\[ n = p(1-p)\left(\frac{z}{E}\right)^{2} = 0.3\,(1-0.3)\left(\frac{1.96}{0.03}\right)^{2} = (0.21)(65.33)^{2} = 896.4 \]
A minimum of 897 children must be sampled.

Test your learning…
www.mcgrawhill.ca/college/lind
Online Learning Centre for
• quizzes
• extra content
• data sets
• searchable glossary
• access to Statistics Canada’s E-Stat data
…and much more!

This completes Chapter 9