Confidence Intervals - Gordon State College
Confidence Intervals

Point Estimate
A specific numerical value used as an estimate of a parameter. The best point estimate for the population mean is the sample mean.

Properties of a Good Estimator
1. Unbiased
2. Consistent
3. Relatively Efficient

Since there's no way of knowing how good a point estimate is, statisticians generally prefer an Interval Estimate: a range of values used to estimate the parameter. The interval may or may not contain the parameter (but typically does). The most common kind is the Confidence Interval.

Confidence level = the probability that the interval estimate will contain the parameter.
Common confidence levels: 90%, 95%, 98%, 99%

Tradeoff: A greater confidence level comes from a wider interval.

Important z-values to remember:
• 90%: 1.645 (some use 1.65)
• 95%: 1.96
• 98%: 2.33
• 99%: 2.575 (some use 2.58)

The maximum error of estimate (E) is the maximum difference between the point estimate of a parameter and the actual value of the parameter.

Four steps:
– Compute 1 – conf. level = α.
– Divide α by 2.
– Look up the associated z-value(s) for α/2 and 1 – α/2.
– Compute the confidence interval.

NOTATION FOR PROPORTIONS
p = population proportion
$\hat{p} = \dfrac{x}{n}$ = sample proportion of x successes in a sample of size n
$\hat{q} = 1 - \hat{p}$ = sample proportion of failures in a sample of size n

MARGIN OF ERROR OF THE ESTIMATE FOR p
$$E = z_{\alpha/2} \sqrt{\frac{\hat{p}\,\hat{q}}{n}}$$
NOTE: n is the size of the sample.
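The first three of the four steps and the margin-of-error formula for a proportion can be sketched in Python. This is a minimal illustration, not part of the original slides: the function names and the sample counts (120 successes out of 400) are hypothetical, and the critical value is computed with the standard library's `statistics.NormalDist` rather than looked up in a table.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(conf_level):
    """Return z_{alpha/2} for a two-sided interval (hypothetical helper)."""
    alpha = 1 - conf_level                 # step 1: 1 - conf. level = alpha
    tail = alpha / 2                       # step 2: divide alpha by 2
    return NormalDist().inv_cdf(1 - tail)  # step 3: z-value with area 1 - alpha/2 to its left


def proportion_margin_of_error(x, n, conf_level):
    """E = z_{alpha/2} * sqrt(p_hat * q_hat / n) for x successes in n trials."""
    p_hat = x / n
    q_hat = 1 - p_hat
    return z_critical(conf_level) * sqrt(p_hat * q_hat / n)


# Hypothetical sample: 120 successes in 400 trials, 95% confidence.
e = proportion_margin_of_error(x=120, n=400, conf_level=0.95)
print(round(z_critical(0.95), 3), round(e, 4))
```

The computed critical values agree with the rounded z-values listed above (for example, about 1.96 at 95%).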
CONFIDENCE INTERVAL FOR THE POPULATION PROPORTION p
$$\hat{p} - E < p < \hat{p} + E \quad \text{where} \quad E = z_{\alpha/2} \sqrt{\frac{\hat{p}\,\hat{q}}{n}}$$
The confidence interval is often expressed in the following equivalent formats:
$\hat{p} \pm E$ or $(\hat{p} - E,\ \hat{p} + E)$

SAMPLE SIZES FOR ESTIMATING A PROPORTION p
When an estimate $\hat{p}$ is known:
$$n = \frac{(z_{\alpha/2})^2\, \hat{p}\,\hat{q}}{E^2}$$
When no estimate $\hat{p}$ is known:
$$n = \frac{(z_{\alpha/2})^2 \cdot 0.25}{E^2}$$

Confidence Intervals for the Mean
When the population standard deviation or variance is known, the standard normal distribution can be used, depending on the sample size and the shape of the original distribution.
When n ≤ 30, the original variable must be normally distributed.
When n > 30, the distribution of sample means will be approximately normal even if the original distribution isn't normal.

MARGIN OF ERROR FOR THE MEAN
The margin of error for the mean is the maximum likely difference observed between the sample mean $\bar{x}$ and the population mean μ, and is denoted by E. When the standard deviation, σ, for the population is known, the margin of error is given by
$$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
where 1 − α is the desired confidence level.

CONFIDENCE INTERVAL ESTIMATE OF THE POPULATION MEAN μ (WITH σ KNOWN and n > 30)
$$\bar{x} - E < \mu < \bar{x} + E \quad \text{where} \quad E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
or $\bar{x} \pm E$ or $(\bar{x} - E,\ \bar{x} + E)$

SAMPLE SIZE FOR ESTIMATING μ
$$n = \left[ \frac{z_{\alpha/2}\, \sigma}{E} \right]^2$$
where
$z_{\alpha/2}$ = critical z score based on desired confidence level
E = desired margin of error
σ = population standard deviation

To summarize, we use the standard normal distribution (z values from Table A-2) for these main reasons:
– σ is known and the original variable is normally distributed
OR
– σ is known and n > 30

Now, if σ is unknown, s can be substituted for σ, and we use a new distribution: the (Student) t distribution.

PROPERTIES OF THE STUDENT t DISTRIBUTION
The Student t distribution is different for different sample sizes (see Figure below for the cases n = 3 and n = 12).
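Before going further into the t distribution, the σ-known formulas above (interval for the mean and required sample size) can be sketched in Python. This is an illustrative sketch under stated assumptions: the function names and all input numbers are hypothetical, and rounding the sample size up to the next whole number is a common convention that the slides do not state explicitly.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(conf_level):
    """z_{alpha/2} for a two-sided interval (hypothetical helper)."""
    return NormalDist().inv_cdf(1 - (1 - conf_level) / 2)


def mean_ci_sigma_known(x_bar, sigma, n, conf_level):
    """Return (x_bar - E, x_bar + E) with E = z_{alpha/2} * sigma / sqrt(n)."""
    e = z_critical(conf_level) * sigma / sqrt(n)
    return x_bar - e, x_bar + e


def sample_size_for_mean(sigma, e, conf_level):
    """n = (z_{alpha/2} * sigma / E)^2, rounded up (assumed convention)."""
    return ceil((z_critical(conf_level) * sigma / e) ** 2)


# Hypothetical inputs: x_bar = 50, sigma = 10, n = 100, 95% confidence.
low, high = mean_ci_sigma_known(x_bar=50, sigma=10, n=100, conf_level=0.95)
print(round(low, 2), round(high, 2))

# Hypothetical planning question: sigma = 15, desired E = 2, 95% confidence.
n_needed = sample_size_for_mean(sigma=15, e=2, conf_level=0.95)
print(n_needed)
```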
Features of the t-distribution
• Bell-shaped
• Symmetrical about the mean
• The mean, median and mode are equal to 0 and located at the center.
• The curve never touches the x-axis.
• Variance is greater than 1.
• Actually a family of curves based on degrees of freedom (related to sample size)
• As d.f. increases, t approaches the standard normal distribution.

ASSUMPTIONS: σ NOT KNOWN
1. The sample is a simple random sample.
2. Either the sample is from a normally distributed population OR n > 30.
When σ is not known we will use the Student t Distribution.

THE STUDENT t DISTRIBUTION
If the distribution of a population is essentially normal, then the distribution of
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
is essentially a Student t distribution for all samples of size n, and is used to find critical values denoted by $t_{\alpha/2}$. The Student t distribution is often referred to as the t distribution.

The degrees of freedom are the number of values that are free to vary after a sample statistic has been computed.
d.f. = n – 1

Two steps:
– Use the appropriate confidence level and the appropriate degrees of freedom [d.f. = n – 1] to look up the associated t-values in the table (A-3).
– Compute the interval.

MARGIN OF ERROR ESTIMATE OF μ (WITH σ NOT KNOWN)
$$E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$$
where (1 − α) is the confidence level and $t_{\alpha/2}$ has n − 1 degrees of freedom.
NOTE: The values for $t_{\alpha/2}$ are found in Table A-3, which is found on page 606, inside the back cover, and on the Formulas and Tables card.

CONFIDENCE INTERVAL ESTIMATE OF THE POPULATION MEAN μ (WITH σ NOT KNOWN)
$$\bar{x} - E < \mu < \bar{x} + E \quad \text{where} \quad E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$$

CHOOSING THE APPROPRIATE DISTRIBUTION

CI for a Standard Deviation
1. Given sample values, estimate the population standard deviation σ or the population variance σ².
2. Determine the sample size required to estimate a population standard deviation or variance.

COMMENT: Estimating standard deviations is very useful in areas such as quality control in a manufacturing process.
This is because manufacturers want the products to be consistent.

ASSUMPTIONS
1. The sample is a simple random sample.
2. The population must have normally distributed values (even if the sample is large).

CHI-SQUARE DISTRIBUTION
To estimate a population variance we use the chi-square distribution.
$$\chi^2 = \frac{(n-1)\,s^2}{\sigma^2}$$
where
n = sample size
s² = sample variance
σ² = population variance

PROPERTIES OF THE CHI-SQUARE DISTRIBUTION
1. The chi-square distribution is not symmetric, unlike the normal and Student t distributions. As the number of degrees of freedom increases, the distribution becomes more symmetric.

Figure 7-8 Chi-Square Distribution
Figure 7-9 Chi-Square Distribution for df = 10 and df = 20

PROPERTIES (CONTINUED)
2. The values of chi-square can be zero or positive, but they cannot be negative.
3. The chi-square distribution is different for each number of degrees of freedom, which is df = n – 1 for this CI. As the number increases, the chi-square distribution approaches the shape of a normal distribution.

CRITICAL VALUES
In Table A-4, each critical value of χ² corresponds to an area given in the top row of the table, and that area represents the cumulative area located to the right of the critical value in the body of the table.
NOTE: Since the chi-square distribution is not symmetric, the left critical value $\chi^2_L$ and the right critical value $\chi^2_R$ are not just opposites of each other.

ESTIMATORS OF σ² and σ
The sample variance s² is the best point estimate of the population variance σ².
The sample standard deviation s is the best point estimate of the population standard deviation σ.

CONFIDENCE INTERVAL FOR POPULATION VARIANCE σ²
$$\frac{(n-1)\,s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)\,s^2}{\chi^2_L}$$
where $\chi^2_L$ is the left-tail critical value and $\chi^2_R$ is the right-tail critical value.

CONFIDENCE INTERVAL FOR POPULATION STANDARD DEVIATION σ
$$\sqrt{\frac{(n-1)\,s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n-1)\,s^2}{\chi^2_L}}$$

DETERMINING SAMPLE SIZE
To determine sample size, use Table 7-2 on page 364.
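The interval formulas for σ² and σ can be sketched in Python as follows. This is a minimal illustration with hypothetical inputs: a sample of size n = 10 with sample variance s² = 4.0. The critical values are passed in by the caller, mirroring the table lookup described above; the values used here (2.700 and 19.023) are the standard chi-square table entries for 9 degrees of freedom at 95% confidence. The function name is an assumption, not something from the slides.

```python
from math import sqrt


def variance_ci(n, s_squared, chi2_left, chi2_right):
    """Return ((var_low, var_high), (sd_low, sd_high)).

    chi2_left and chi2_right are the left- and right-tail critical values
    for n - 1 degrees of freedom, looked up by the caller in a chi-square table.
    """
    df = n - 1
    var_low = df * s_squared / chi2_right   # divide by the larger (right) value
    var_high = df * s_squared / chi2_left   # divide by the smaller (left) value
    return (var_low, var_high), (sqrt(var_low), sqrt(var_high))


# Hypothetical sample: n = 10, s^2 = 4.0, 95% confidence (df = 9).
(var_low, var_high), (sd_low, sd_high) = variance_ci(
    n=10, s_squared=4.0, chi2_left=2.700, chi2_right=19.023
)
print(round(var_low, 3), round(var_high, 3))
print(round(sd_low, 3), round(sd_high, 3))
```

Note that the right-tail critical value produces the lower bound and the left-tail value the upper bound, because each appears in the denominator.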