Chapter 8: Interval Estimation

Chapter 8 - Learning Objectives
• Explain the difference between a point estimate and an interval estimate.
• Construct and interpret confidence intervals:
  - with a z for the population mean or proportion;
  - with a t for the population mean.
• Determine the appropriate sample size to achieve specified levels of accuracy and confidence.

8.1 Introduction
Statistical inference is the process by which we acquire information about populations from samples. There are two ways to make inferences about a population parameter:
1. By providing estimates of the parameter
2. By testing a hypothesis about the parameter

8.2 Concept of Estimation
The main objective of estimation is to determine the value of a population parameter on the basis of a sample statistic. There are two types of estimation:
1. Point estimation
2. Interval estimation

Point Estimator
A point estimator draws an inference about a population parameter (say, the mean or the proportion) by computing a statistic from a sample. The sample statistic gives an estimate of the parameter at a single point (value), hence the name point estimate.

Interval Estimator
An interval estimator draws an inference about a population parameter by providing a range (interval) of values within which the unknown population parameter is expected to lie.

[Figure: population distribution (parameter), sample distribution, and interval estimator]

Example
You want to know the average weekly summer income of UMD students. Take a sample (say, 600 students) and compute the average weekly summer income in the sample; suppose the sample mean is $400.
Point estimate: µ is estimated at $400
Interval estimate: µ lies between $380 and $420

Interval estimators are used more often than point estimators for two reasons: (1) a point estimate is more prone to faulty inference, and (2) an interval estimate lets us state how confident we are in the estimate.
Characteristics of Estimators
In estimation (whether point or interval), we want to select the sample and the sample statistic that let us estimate a parameter with as small an error as possible. The choice of statistic depends on some important characteristics.

Desirable Characteristics of Estimators
1. Unbiasedness: An unbiased estimator is one whose expected value is equal to the parameter it estimates.
2. Consistency: An unbiased estimator is consistent if the difference between the estimator and the parameter grows smaller as the sample size increases.
3. Relative efficiency: Among two or more unbiased estimators, the one with the smaller variance is relatively more efficient.

8.3 Interval Estimation of the Population Mean and Proportion
8.3.1 When the Population Variance is Known
8.3.2 When the Population Variance is Unknown

8.3.1 Estimating the Population Mean when the Population Variance is Known
We can provide an interval estimate of a population mean or proportion based on the following characteristics of a sampling distribution:
1. We can draw a sample of size n from the population and calculate the sample mean or proportion.
2. By the central limit theorem, the sampling distribution of the sample mean or proportion is normal (or approximately normal), so we can make probability statements about the sample mean or proportion we compute.
3. Given the formula for standardizing a random variable, we can relate the standardized value from a normal distribution to the sample mean (or proportion) we are estimating:

\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

Margin of Error and the Interval Estimate
The general form of an interval estimate of a population mean is

\[ \bar{x} \pm \text{Margin of Error} \]

The margin of error is computed using the following formula:

\[ \text{Margin of Error} = z_{\alpha/2} \left( \frac{\sigma}{\sqrt{n}} \right) \]

Thus, the range (interval) that is taken to contain the true value of the unknown population parameter (say, the mean) is

\[ \bar{x} \pm z_{\alpha/2} \left( \frac{\sigma}{\sqrt{n}} \right) \]

where:
\(\bar{x}\) is the sample mean
\(z_{\alpha/2}\) is the standardized value that leaves an area of \(\alpha/2\) in one tail of the standard normal probability distribution
\(\sigma\) is the population standard deviation
n is the sample size
\(\alpha\) is the significance level, and \(1 - \alpha\) is the confidence coefficient at which we provide the interval estimate

The Confidence Interval for µ (σ is known)
In its expanded form, the interval can be stated as follows:

\[ P\left( \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha \]

Interpreting the Confidence Interval for µ
Based on the estimate, we can say with \(100(1 - \alpha)\) percent confidence that the interval

\[ \left[ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right] \]

contains the true value of the unknown population parameter.
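As a minimal sketch of the interval formula above, the z-based confidence interval can be computed in Python with the standard library alone. The function name `z_interval` and the input values (sample mean 50, σ = 10, n = 25) are hypothetical and chosen only for illustration; they do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Return (lower, upper) for xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1.0 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail
    # of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical inputs, for illustration only.
low, high = z_interval(xbar=50.0, sigma=10.0, n=25, confidence=0.95)
print(f"95% interval: [{low:.3f}, {high:.3f}]")
```

The critical value here is computed from the normal distribution rather than read from a table, so it can differ in the third decimal place from tabled values.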
Interval Estimation of a Population Proportion

\[ \bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \]

where:
\(1 - \alpha\) is the confidence coefficient
\(z_{\alpha/2}\) is the z value providing an area of \(\alpha/2\) in the upper tail of the standard normal probability distribution
\(\bar{p}\) is the sample proportion

The Confidence Interval for µ (σ is known)
Commonly used confidence levels and their corresponding z scores:

Confidence level (1 - α)    α      z (for α/2)
90%                         10%    z_0.05  = 1.645
95%                         5%     z_0.025 = 1.960
99%                         1%     z_0.005 = 2.575

Interval Estimate of Population Mean: σ Known: Procedure
Step 1: Identify the significance level (α) and the confidence coefficient (1 - α) at which the margin of error is to be computed (α = 1%, 5%, or 10%).
Step 2: Compute the corresponding margin of error for the selected confidence coefficient.
Step 3: Establish the interval estimate of µ by adding the margin of error to, and subtracting it from, the sample mean.

The Confidence Interval for µ (σ is known): Hands-On Practice Problems

Interval Estimate of Population Mean: σ Known: Example
A random sample of 81 credit card sales in a department store shows an average sale of about $68. From past data, it is known that the standard deviation of sales on the store's credit card is $27.
8.1) Provide the 90% confidence interval estimate of the store's sales on credit.
8.2) Provide the 95% confidence interval estimate of the store's sales on credit.
8.3) Provide the 99% confidence interval estimate of the store's sales on credit.

In all three parts: n = 81; sample mean = $68; σ = $27. The standard error is

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{27}{\sqrt{81}} = \frac{27}{9} = 3 \]

Solution 8.1 (90%)
Step 1: Confidence coefficient (1 - α) = 90%; α = 10%.
Step 2: z_0.05 = 1.645; margin of error = 1.645 x 3 = 4.935.
Step 3: 68 - 4.935 = 63.065; 68 + 4.935 = 72.935; interval [63.065, 72.935].
We are 90 percent confident that the average credit sale of the store lies between about $63 and $73.

Solution 8.2 (95%)
Step 1: Confidence coefficient (1 - α) = 95%; α = 5%.
Step 2: z_0.025 = 1.96; margin of error = 1.96 x 3 = 5.88.
Step 3: 68 - 5.88 = 62.12; 68 + 5.88 = 73.88; interval [62.12, 73.88].
We are 95 percent confident that the average credit sale of the store lies between about $62 and $74.

Solution 8.3 (99%)
Step 1: Confidence coefficient (1 - α) = 99%; α = 1%.
Step 2: z_0.005 = 2.575; margin of error = 2.575 x 3 = 7.725.
Step 3: 68 - 7.725 = 60.275; 68 + 7.725 = 75.725; interval [60.275, 75.725].
We are 99 percent confident that the average credit sale of the store lies between about $60 and $76.

Summary of results
8.1) The 90% confidence interval estimate of sales on credit cards: [63.065, 72.935]
8.2) The 95% confidence interval estimate of sales on credit cards: [62.12, 73.88]
8.3) The 99% confidence interval estimate of sales on credit cards: [60.275, 75.725]

Implications
As we increase the confidence coefficient (say, from 90% to 95% or to 99%), the interval that contains the mean of the population widens.
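The three credit card intervals can be reproduced with a short Python sketch. It uses the tabled z values quoted in the slides (1.645, 1.960, 2.575) rather than computed ones, so that the output matches the worked solutions; the variable names are illustrative assumptions.

```python
from math import sqrt

# Values from the credit card example in the slides.
n, xbar, sigma = 81, 68.0, 27.0
std_error = sigma / sqrt(n)  # 27 / 9 = 3

# Tabled critical values z_{alpha/2} for each confidence level.
z_table = {0.90: 1.645, 0.95: 1.960, 0.99: 2.575}

intervals = {}
for confidence, z in z_table.items():
    margin = z * std_error
    intervals[confidence] = (round(xbar - margin, 3), round(xbar + margin, 3))
    low, high = intervals[confidence]
    print(f"{confidence:.0%}: margin = {margin:.3f}, interval = [{low}, {high}]")
```

Each step up in confidence uses a larger critical value with the same standard error, which is why the interval grows from about 9.87 wide at 90% to 15.45 wide at 99%.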
There is a trade-off between the width of the interval and the confidence with which we can make the estimate.

Interval Estimation of µ when σ, the Population Standard Deviation, Is Unknown

The Confidence Interval for µ (when σ, the Population Standard Deviation, Is Unknown)
Recall that when the population variance is known, we use the following statistic to provide an interval estimate of a population mean:

\[ \bar{x} \pm z_{\alpha/2} \left( \frac{\sigma}{\sqrt{n}} \right) \]

The t-Statistic
However, information about the population variance is not always available. When the population variance is unknown, and provided that the sampled population is normally distributed, we use the variance estimated from the sample and a t statistic (Student t distribution) to make inferences about the population mean:

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \quad \text{in place of} \quad Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

The t distribution is mound-shaped and symmetrical around zero. The variance of a t distribution depends on the sample size. Generally, it has a higher variance than the standard normal distribution.

[Figure: t Distribution. Standard normal distribution compared with t distributions with 20 and 10 degrees of freedom; horizontal axis: z, t]

The variance (spread) of a t distribution, compared to that of the normal distribution, is determined by the degrees of freedom (n - 1, which depends on the sample size). When the degrees of freedom exceed 100, the standard normal z value provides a good approximation to the t value.

The interval estimate of the population mean is thus computed as:

\[ \bar{x} \pm t_{\alpha/2,\, n-1} \left( \frac{s}{\sqrt{n}} \right) \]

The Confidence Interval for µ (σ is unknown)
Example 8.2.1
In a random sample of 100 oil changes, it was found that it takes an average of 22 minutes to change oil for a given car, with a standard deviation of 5 minutes. Assuming that oil change time is normally distributed, provide the 99% confidence interval estimate of the average amount of time it takes to change oil on a typical car.
With 99 degrees of freedom the critical value is approximately 2.626, so the margin of error is 2.626 x 5/10 = 1.313.
[20.687, 23.313]

Example 8.2.2
Using the same information (n = 100; mean = 22), but assuming a standard deviation of 25 minutes, provide the 99% confidence interval estimate of the population mean (the average amount of time it takes to change oil on a car).
The margin of error is 2.626 x 25/10 = 6.565.
[15.435, 28.565]

Implications for the Width of the Confidence Interval
The width of the confidence interval is affected by
1. The confidence level (1 - α): The higher the confidence level, the wider the interval estimate.
2. The population standard deviation (σ): The higher the variance, the wider the interval estimate.

Example 8.2.3
Using a standard deviation of 5 and a sample mean of 22 minutes, but assuming a sample size of 400 oil changes, provide the 99% confidence interval estimate of the population mean (that is, the average amount of time it takes to change oil on a typical car).
The standard error is now 5/20 = 0.25. With 399 degrees of freedom the critical value is approximately 2.588, so the margin of error is about 0.647.
Approximately [21.353, 22.647]

Implications for the Width of the Confidence Interval (continued)
3. The sample size (n): The larger the sample size, the narrower the interval estimate.

The Width of the Confidence Interval
The width of the confidence interval is affected by the confidence level, the variance of the population, and the sample size.
1. Although we want both a high confidence level and a narrow interval estimate, there is a trade-off between the two.
2. Although a lower variance would give a narrower interval estimate, the variance of the population or sample is usually beyond our control.
3. Therefore, the only way to establish a narrow (more informative) interval while maintaining a high confidence level is to increase the sample size.
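The effect of the standard deviation and the sample size on the width of a t-based interval can be sketched as follows. The Python standard library has no t distribution, so the critical values are supplied by hand: 2.626 (99 degrees of freedom) and 2.588 (399 degrees of freedom) are approximate values read from a t table and should be treated as assumptions, as should the function name `t_interval`.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Approximate 99% two-sided critical values from a t table (assumed).
T_99_DF99 = 2.626   # n = 100
T_99_DF399 = 2.588  # n = 400

cases = {
    "s = 5,  n = 100": t_interval(22.0, 5.0, 100, T_99_DF99),
    "s = 25, n = 100": t_interval(22.0, 25.0, 100, T_99_DF99),
    "s = 5,  n = 400": t_interval(22.0, 5.0, 400, T_99_DF399),
}

for label, (low, high) in cases.items():
    print(f"{label}: [{low:.3f}, {high:.3f}]  width = {high - low:.3f}")
```

Multiplying the standard deviation by five multiplies the width by five, while quadrupling the sample size roughly halves it, because the standard error depends on the square root of n.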
The Sample Size
[Figure: 90% confidence level]
Determining the proper sample size is thus a critical component of establishing a narrow interval estimate.

8.3 Selecting the Sample Size
From the formula that we used to establish the interval estimate of the population parameter, we can derive a formula that allows us to determine the appropriate sample size. Two important requirements:
1. At what confidence level do we want to provide the interval estimate?
2. What interval width (w) do we need?

\[ n = \left( \frac{z_{\alpha/2}\, \sigma}{w} \right)^2 = \frac{(z_{\alpha/2})^2\, \sigma^2}{w^2} \]

where w is the margin of error we want to maintain, that is, the half-width of the interval (the estimate is to be within plus or minus w of the mean). Hence, to compute the sample size, we first need to determine w.

Selecting the Sample Size
Example 10.2
In order to estimate the amount of lumber that can be harvested from a tract of land with 99% confidence, the mean diameter of trees in the tract must be estimated to within one inch. Assuming that diameters are normally distributed with a standard deviation of 6 inches, how many trees should be sampled to provide the interval estimate of the mean diameter of the trees in the tract at the specified confidence level?

Solution
The required accuracy is +/- 1 inch; that is, w = 1.
The 99% confidence level gives α = .01, thus z_{α/2} = z_.005 = 2.575.
The standard deviation is given as 6.
Thus, we can compute the required sample size as follows:

\[ n = \left( \frac{z_{\alpha/2}\, \sigma}{w} \right)^2 = \left( \frac{2.575 \times 6}{1} \right)^2 = 238.7 \approx 239 \]

The result is rounded up to the next whole number.

Computing Interval Estimates: Summary
1. Determine the sample size and the values of the variables of interest (width, spread of the population or sample).
2. Select the confidence level for the interval estimate.
3. Compute the sample mean (the population variance may be known or unknown).
4. Determine the critical value (z from the standard normal table, or t from the t table).
5. Compute the confidence interval.

Summary of Interval Estimation Procedures for a Population Mean
Is the population standard deviation σ known?
Yes (σ known case): use

\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

No (σ unknown case): use the sample standard deviation s to estimate σ, and use

\[ \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} \]

Interval Estimation of a Population Proportion

\[ \bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \]

where:
\(1 - \alpha\) is the confidence coefficient
\(z_{\alpha/2}\) is the z value providing an area of \(\alpha/2\) in the upper tail of the standard normal probability distribution
\(\bar{p}\) is the sample proportion
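The proportion interval above, together with the sample-size formula from the Selecting the Sample Size slides, can be sketched in Python as follows. The function names are assumptions, and the proportion inputs (a sample proportion of 0.40 from 400 observations) are hypothetical since the slides give no worked proportion example. The sample-size call uses the lumber example's values (σ = 6, w = 1, z = 2.575).

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(confidence):
    """z_{alpha/2}: leaves alpha/2 in the upper tail of the standard normal."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def proportion_interval(p_bar, n, confidence):
    """Return (lower, upper) for p_bar +/- z * sqrt(p_bar * (1 - p_bar) / n)."""
    margin = z_critical(confidence) * sqrt(p_bar * (1.0 - p_bar) / n)
    return p_bar - margin, p_bar + margin


def required_sample_size(sigma, w, z):
    """n = (z * sigma / w)^2, rounded up to the next whole observation."""
    return ceil((z * sigma / w) ** 2)


# Hypothetical proportion data, for illustration only.
p_low, p_high = proportion_interval(p_bar=0.40, n=400, confidence=0.95)
print(f"95% interval for the proportion: [{p_low:.3f}, {p_high:.3f}]")

# Lumber example from the slides, using the tabled z value 2.575.
n_trees = required_sample_size(sigma=6.0, w=1.0, z=2.575)
print(f"Required sample size: {n_trees}")
```

Rounding up rather than to the nearest integer is a deliberate choice: a smaller n would give a margin of error slightly larger than the one requested.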