Confidence Intervals and Sample Size

Estimates
• Point Estimate: A specific numerical value used to estimate a parameter. The best point estimate of the population mean is the sample mean.
• Example: We want to estimate the average age of teachers in SHS. A sample of teachers is surveyed and the sample mean is 38.6 years. This is a point estimate.
• Interval Estimate: An interval or range of values used to estimate a parameter. This estimate may or may not contain the value of the parameter being estimated.
• Interval estimates are preferred because the sample mean usually differs somewhat from the population mean due to sampling error.
• Example: The average age might be 38.1 < 𝜇 < 39.1, which is 38.6 ± 0.5 years.

Properties of Good Estimators
• The estimator must be unbiased: the expected value, or mean, of the estimates obtained from samples is equal to the parameter being estimated.
• The estimator must be consistent: as the sample size increases, the value of the estimator approaches the value of the parameter being estimated.
• The estimator must be relatively efficient: of all the statistics that can be used to estimate a parameter, the relatively efficient estimator has the smallest variance.

Confidence Intervals
• The probability of being correct can be assigned before an interval estimate is made.
• For example: We may want to be 90%, 95%, or even 99% sure that the interval contains the true population mean. The higher the confidence level, the wider the interval must be.
• Confidence Interval: A specific interval estimate of a parameter determined by using data obtained from a sample and the specific confidence level of the estimate.
• Confidence Level: The probability that the interval estimate will contain the parameter.

Confidence Interval Formula for a Specific α
\( \bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \), where \( z_{\alpha/2} \) is read “z sub alpha over 2”.
• α: Represents the total area under both tails of the standard normal distribution curve.
• α/2: Represents the area in each one of the tails.
• The relationship between α and the confidence level: the confidence level is the percentage equivalent of the decimal value 1 − α, and vice versa. For example, a 95% confidence level corresponds to α = 0.05.
• The second term of the CI formula, \( z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \), is called the maximum error of estimate.
• Maximum error of estimate: The maximum difference between the point estimate of a parameter and the actual value of the parameter.

Steps to Find Z Sub Alpha Over 2
• Step 1: Subtract the decimal value of the confidence level from 1. This gives α.
• Step 2: Divide α by 2. This gives the area in each tail.
• Step 3: Subtract that area from 0.5.
• Step 4: Find the area from Step 3 in Table E and read off the corresponding z-value.
• Step 5: Use that z-value in the CI formula.

CI Examples
• Example 1: Dr. Purnell wishes to find the average age of teachers in the district. The standard deviation is known to be 4 years. A sample of 55 teachers had an average age of 32.5 years. Find the 95% confidence interval of the population mean.
• Example 2: The average annual wind speed in Kitty Hawk, NC is 15.6 mph. If a sample of 90 days was used to determine the average, find the 99% confidence interval of the mean. Assume the standard deviation is 2.3 mph.

Determining Sample Size
• Sample size determination is closely related to statistical estimation.
• The formula is derived from the maximum error of estimate formula.
• Always round the answer up to the next whole number if it has any fractional or decimal portion.
• Formula: \( n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2 \), where E is the maximum error of estimate.

Sample Size Examples
• Example 1: A researcher is interested in estimating the average salary of garbage men in a large town. He wants to be 95% sure that the estimate is correct. If the standard deviation is $950, how large a sample is needed to be accurate within $150?
• Example 2: A nurse wants to estimate the birth weights of babies. How large must the sample be if she wants to be 90% confident that the true mean is within 8 ounces of the sample mean? The standard deviation is known to be 7 ounces.

The T-Distribution
Similarities to the Normal Distribution
• Bell shaped.
• Symmetrical about the mean.
• The mean, median, and mode are equal to 0 and located at the center.
• The curve never touches the x-axis.

Differences from the Normal Distribution
• The variance is greater than 1.
• It is actually a family of curves based on the concept of degrees of freedom, which is related to sample size.
• As the sample size increases, it approaches the standard normal distribution curve.

d.f. = Degrees of Freedom
• d.f.: The number of values that are free to vary after a sample statistic has been computed.
• They tell the researcher which curve to use when a distribution consists of a family of curves.
• Example: Suppose the mean of 10 values is 60. Then 9 of the 10 values are free to vary. Once they have been selected, the last value must be a specific number to give a sum of 600, since 600 ÷ 10 = 60. Hence d.f. = n − 1.

Using Table F
• We need to find the correct value of t sub alpha over 2.
• Step 1: Find the correct d.f. along the left-hand side (d.f. = n − 1).
• Step 2: Find the correct confidence level along the top.
• Step 3: The value at their intersection is the answer.
• Step 4: Use that value in the CI formula on the next slide.
• There is no need to worry about “one tail” or “two tails”.

Confidence Interval Formula When 𝜎 Is Unknown and n < 30
\( \bar{X} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) \), with d.f. = n − 1, where s is the sample standard deviation.

T-Distribution Examples
• Example 1: For a group of 20 students taking a final exam, the mean heart rate was 96 beats per minute and the standard deviation was 5. Find the 95% confidence interval of the true mean.
• Example 2: A sample of 12 food servers showed an average weekly income of $340.40 with a standard deviation of $11. Find the 98% confidence interval of the true mean.

Confidence Interval for a Proportion
• As with means, the statistician, given the sample proportion, tries to estimate the population proportion.
• An interval estimate can be used for a proportion.
• The formula is given below: \( \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \)

The Symbols for Proportion Notation
• “p hat”: \( \hat{p} = X / n \), where X = the number of sample units that possess the characteristic of interest and n = the sample size.
• “q hat”: \( \hat{q} = 1 - \hat{p} \)

Rules for Using a CI for a Proportion
• #1: n · “p hat” and n · “q hat” must each be greater than or equal to 5, just like the binomial check.
• #2: Round off to 3 decimal places.

CI Examples for Proportions
• Example 1: In a recent study of 100 people, 78 said that they were satisfied with their current home. Find the 90% confidence interval of the true proportion of individuals who are satisfied with their current home.
• Example 2: A nutritionist found that in a survey of 60 families, 32% said they ate apples at least once a week. Find the 95% confidence interval of the true proportion of families who eat apples at least once per week.

Minimum Sample Size for an Interval Estimate of a Population Proportion
• Always round up to obtain a whole-number answer. No fractional or decimal answers are allowed.
• Formula: \( n = \hat{p}\,\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2 \), where E is the maximum error of estimate.
• If no “p hat” is known, use 0.5 for both “p hat” and “q hat”.

Minimum Sample Size Examples for a Population Proportion
• Example 1: A researcher wishes to estimate, with 98% confidence, the proportion of people who own an iPhone. A previous study shows that 42% of those interviewed had an iPhone. The researcher wishes to be accurate within 3% of the true proportion. Find the minimum sample size.
• Example 2: The same researcher wishes to estimate the proportion of people who also own an iPad. She wants to be 95% confident and accurate within 7% of the true proportion. Find the minimum sample size.

CI for Variances and Standard Deviations
• Variances and standard deviations are just as important as means.
• Example: The variance and standard deviation of the medication in a certain prescription play an important role in making sure the patient gets the proper dosage.
• Because both are important, confidence intervals for them are necessary.
• To calculate these intervals, a new distribution is needed, called the chi-square distribution.

Chi-Square Distribution
• It is similar to the t-distribution in that it too is a family of curves based on d.f.
• The symbol for chi-square is 𝜒² (pronounced “ki”).
• A chi-square variable cannot be negative, and the distributions are positively skewed. At roughly 100 d.f. the distribution becomes somewhat symmetrical.

Chi-Square Distribution: How to Read the Chi-Square Table
• Two different values are used in the formulas for variances and standard deviations. Those two numbers need to be found first.
• Step 1: Find α by subtracting the confidence level (as a decimal) from 1.
• Step 2: Divide α by 2. Find the intersection of the α/2 column and the d.f. row. This value is 𝜒²right.
• Step 3: Subtract α/2 from 1. Find the intersection of the 1 − α/2 column and the d.f. row. This value is 𝜒²left.
• Step 4: The answers from Steps 2 and 3 are used to find confidence intervals for variances and standard deviations.

Chi-Square Distribution Example
• Find the values of 𝜒²right and 𝜒²left for a 95% confidence interval when n = 18 (so d.f. = n − 1 = 17).
• Step 1: Find α by subtracting the confidence level (as a decimal) from 1.
• Step 2: Divide α by 2. Match the α/2 column to the d.f. row. This gives 𝜒²right.
• Step 3: Subtract α/2 from 1. Match the 1 − α/2 column to the d.f. row. This gives 𝜒²left.

Formulas for CI for Variances and Standard Deviations
• CI for Variance: \( \frac{(n-1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\text{left}}} \), with d.f. = n − 1.
• CI for Standard Deviation: \( \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{left}}}} \)
• Remember that s = sample standard deviation and s² = sample variance. A problem could give either one: if the standard deviation is given, square it; if the variance is given, plug it in directly.

CI Variance and Standard Deviation Examples
• Find the 99% CI for the variance and standard deviation of the weights of 5-gallon containers of paint if a sample of 14 containers has a standard deviation of 1.2 pounds. Assume the variable is normally distributed.
• Find the 90% CI for the variance and standard deviation for the lifetime of batteries if a sample of 25 batteries has a standard deviation of 2.1 months. Assume the variable is normally distributed.
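
As a rough check on the paint-container example, the Python sketch below computes the intervals from the standard chi-square formula (n − 1)s²/𝜒²right < 𝜎² < (n − 1)s²/𝜒²left and takes square roots for the standard deviation. The function name `variance_ci` is hypothetical. The critical values 29.819 and 3.565 are not given in the slides; they are assumed here as the entries a standard chi-square table lists for d.f. = 13 at α/2 = 0.005 and 1 − α/2 = 0.995.

```python
import math


def variance_ci(n, s, chi2_right, chi2_left):
    """Return ((var_low, var_high), (sd_low, sd_high)) for a normal population.

    n          -- sample size (d.f. = n - 1)
    s          -- sample standard deviation
    chi2_right -- table value for alpha/2 at d.f. = n - 1 (the larger value)
    chi2_left  -- table value for 1 - alpha/2 at d.f. = n - 1 (the smaller value)
    """
    if n < 2:
        raise ValueError("need at least two observations")
    if not chi2_right > chi2_left > 0:
        raise ValueError("expected chi2_right > chi2_left > 0")

    numerator = (n - 1) * s ** 2          # (n - 1) * s^2
    var_low = numerator / chi2_right      # dividing by the larger value gives the lower bound
    var_high = numerator / chi2_left      # dividing by the smaller value gives the upper bound
    return (var_low, var_high), (math.sqrt(var_low), math.sqrt(var_high))


# Paint example: n = 14, s = 1.2 pounds, 99% confidence.
# Assumed table values for d.f. = 13: 29.819 (right) and 3.565 (left).
var_ci, sd_ci = variance_ci(n=14, s=1.2, chi2_right=29.819, chi2_left=3.565)
print(f"{var_ci[0]:.2f} < variance < {var_ci[1]:.2f}")
print(f"{sd_ci[0]:.2f} < std dev < {sd_ci[1]:.2f}")
```

With those assumed table values, the variance interval comes out at roughly 0.63 to 5.25 and the standard deviation interval at roughly 0.79 to 2.29 pounds. The battery example can be checked the same way by supplying n = 25, s = 2.1, and the table values for d.f. = 24 at a 90% confidence level.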