Topic 6 - Confidence intervals based on a single sample
• Sampling distribution of the sample mean - pages 187 - 189
• Sampling distribution of the sample variance - pages 189 - 190
• Confidence interval for a population mean - pages 209 - 212
• Confidence interval for a population variance - pages 268 - 270
• Confidence interval for a population proportion - pages 278 - 281

Confidence Intervals
• To use the CLT in our examples, we had to know the population mean, $\mu$, and the population standard deviation, $\sigma$.
• This is okay if we have a huge amount of sample data to estimate these quantities with $\bar{X}$ and $s$, respectively.
• In most cases, the primary goal of the analysis of our sample data is to estimate these population values and to determine a range of likely values for them.

Confidence interval for $\mu$ with $\sigma$ known
• Suppose for a moment we know $\sigma$ but not $\mu$.
• The CLT says that $\bar{X}$ is approximately normal with a mean of $\mu$ and a variance of $\sigma^2/n$.
• So, $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ will be approximately standard normal.
• For the standard normal distribution, let $z_\alpha$ be such that $P(Z > z_\alpha) = \alpha$.

Confidence interval for $\mu$ with $\sigma$ known
• Since $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1-\alpha$, rearranging for $\mu$ shows that $\bar{X} \pm z_{\alpha/2}\, \sigma/\sqrt{n}$ is a $(1-\alpha)100\%$ confidence interval for $\mu$.
• Normal calculator

Confidence interval for $\mu$ with $\sigma$ unknown
• For large samples ($n \ge 30$), we can replace $\sigma$ with $s$, so that $\bar{X} \pm z_{\alpha/2}\, s/\sqrt{n}$ is an approximate $(1-\alpha)100\%$ confidence interval for $\mu$.
• For small samples ($n < 30$) from a normal population, $\bar{X} \pm t_{\alpha/2,\,n-1}\, s/\sqrt{n}$ is a $(1-\alpha)100\%$ confidence interval for $\mu$.
• The value $t_{\alpha,\,n-1}$ is the appropriate quantile from a t distribution with $n-1$ degrees of freedom, that is, the value with $P(T > t_{\alpha,\,n-1}) = \alpha$.
• t-distribution demonstration
• t-distribution calculator

Acid rain data
• The EPA states that any area where the average pH of rain is less than 5.6 has an acid rain problem.
• pH values collected at Shenandoah National Park are listed below.
• Calculate 95% and 99% confidence intervals for the average pH of rain in the park.

Light bulb data
• The lifetimes in days of 10 light bulbs of a certain variety are given below.
• Give a 95% confidence interval for the expected lifetime of a light bulb of this type. Do you trust the interval?

Interpreting confidence statements
• In making a confidence statement, the stated level of confidence applies to the procedure used to construct the interval: in repeated sampling, about $(1-\alpha)100\%$ of the intervals constructed this way would contain the true parameter value.
• Confidence interval demonstration

Sample size effect
• As sample size increases, the width of our confidence interval decreases.
• If $\sigma$ is known, the width of our interval for $\mu$ is $L = 2 z_{\alpha/2}\, \sigma/\sqrt{n}$.
• Solving for $n$, $n = (2 z_{\alpha/2}\, \sigma / L)^2$.
• If $\sigma$ is known or we have a good estimate, we can use this formula to decide the sample size we need to obtain a certain interval length.

Paint example
• Suppose we know from past data that the standard deviation of the square footage covered by a one-gallon can of paint is somewhere between 2 and 4 square feet. How many one-gallon cans do we need to test so that the width of a 95% confidence interval for the mean square footage covered will be at most 1 square foot?

Confidence interval for a variance
• In many cases, especially in manufacturing, understanding variability is very important.
• Often, the goal is to reduce the variability in a system.
• For samples from a normal population, $\left( \dfrac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}},\ \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}} \right)$ is a $(1-\alpha)100\%$ confidence interval for $\sigma^2$.
• The value $\chi^2_{\alpha/2,\,n-1}$ is the appropriate quantile from a $\chi^2$ distribution with $n-1$ degrees of freedom.
• $\chi^2$ calculator

Acid rain data
• Calculate a 95% confidence interval for the variance of pH of rain in the park.

Confidence interval for a proportion
• A binomial random variable, $X$, counts the number of successes in $n$ Bernoulli trials where the probability of success on each trial is $p$.
• In sampling studies, we are often interested in the proportion of items in the population that have a certain characteristic.
• We think of each sampled item, $X_i$, from the population as a Bernoulli trial: the selected item either has the characteristic ($X_i = 1$) or it does not ($X_i = 0$).
• The total number with the characteristic in the sample is then $X = \sum_{i=1}^{n} X_i$.

Confidence interval for a proportion
• The proportion in the sample with the characteristic is then $\hat{p} = X/n = \sum_{i=1}^{n} X_i / n$.
• The CLT says that the distribution of this sample proportion is approximately normal for large $n$ with
  – $E(X/n) = p$
  – $Var(X/n) = p(1-p)/n$
• For the CLT to work well here, we need
  – $X \ge 5$ and $n - X \ge 5$
  – $n$ to be much smaller than the population size

Confidence interval for a proportion
• How can we use this to develop a $(1-\alpha)100\%$ confidence interval for $p$, the population proportion with the characteristic?
• $\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$

Murder case example
• Find a 99% confidence interval for the proportion of African Americans in the jury pool (22 out of 295 in the sample were African American).
• StatCrunch

Sample size considerations
• The width of the confidence interval is $L = 2 z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$.
• To get the sample size required for a specific length, we have $n = (2 z_{\alpha/2}/L)^2\, \hat{p}(1-\hat{p})$.
• We might use prior information to estimate the sample proportion or substitute the most conservative value of ½.

Nurse employment case
• How many sample records should be considered if we want the 95% confidence interval for the proportion of all her records handled in a timely fashion to be 0.02 wide?

Prediction intervals
• Sometimes we are not interested in a confidence interval for a population parameter but rather in a prediction interval for a new observation.
• For our light bulb example, we might want a 95% prediction interval for the time a new light bulb will last.
• For our paint example, we might want a 95% prediction interval for the amount of square footage a new can of paint will cover.

Prediction intervals
• Given a random sample of size $n$, our best guess at a new observation, $X_{n+1}$, would be the sample mean, $\bar{X}$.
• Now consider the difference, $X_{n+1} - \bar{X}$.
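• $E(X_{n+1} - \bar{X}) = \mu - \mu = 0$
• $Var(X_{n+1} - \bar{X}) = \sigma^2 + \sigma^2/n = \sigma^2 (1 + 1/n)$, since the new observation is independent of the sample.

Prediction intervals
• If $\sigma$ is known and our population is normal, then using a pivoting procedure gives $\bar{X} \pm z_{\alpha/2}\, \sigma \sqrt{1 + 1/n}$ as a $(1-\alpha)100\%$ prediction interval for a new observation.
• If $\sigma$ is unknown and our population is normal, a $(1-\alpha)100\%$ prediction interval for a new observation is $\bar{X} \pm t_{\alpha/2,\,n-1}\, s \sqrt{1 + 1/n}$.

Acid rain example
• For the acid rain data, the sample mean pH was 4.577889 and the sample standard deviation was 0.2892. What is a 95% prediction interval for the pH of a new rainfall?
• Does this prediction interval apply for the light bulb data?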
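
Sketch: normal quantiles and the known-σ interval
The definition of $z_\alpha$ from the known-σ slides, $P(Z > z_\alpha) = \alpha$, can be evaluated without a normal calculator. The sketch below is one possible way to do it using only Python's standard library. The function names `z_upper` and `z_interval` are illustrative choices, not something taken from the course materials.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def z_upper(alpha: float) -> float:
    """Return z_alpha, the value with P(Z > z_alpha) = alpha for standard normal Z."""
    return NormalDist().inv_cdf(1.0 - alpha)


def z_interval(xbar: float, sigma: float, n: int, conf: float = 0.95) -> Tuple[float, float]:
    """Interval xbar +/- z_{alpha/2} * sigma / sqrt(n), assuming sigma is known."""
    alpha = 1.0 - conf
    margin = z_upper(alpha / 2.0) * sigma / sqrt(n)
    return xbar - margin, xbar + margin


if __name__ == "__main__":
    # Quantiles used for 95% and 99% intervals.
    print(round(z_upper(0.025), 4))  # 1.96
    print(round(z_upper(0.005), 4))  # 2.5758
```

For the small-sample case the same structure applies with a t quantile in place of `z_upper`. The standard library has no t distribution, so that quantile would have to come from a table or a statistics package.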
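
Sketch: sample size for the paint example
The sample-size formula $n = (2 z_{\alpha/2}\, \sigma / L)^2$ from the "Sample size effect" slide can be applied to the paint example as follows. This is a minimal sketch, not an official solution. It assumes we round up to the next whole can and evaluate both ends of the stated range for $\sigma$ (2 and 4 square feet), with the larger value giving the conservative answer. The function name `sample_size_for_mean` is made up for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma: float, width: float, conf: float = 0.95) -> int:
    """Smallest n for which 2 * z_{alpha/2} * sigma / sqrt(n) <= width."""
    alpha = 1.0 - conf
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return ceil((2.0 * z * sigma / width) ** 2)


if __name__ == "__main__":
    # Paint example: 95% interval, total width at most 1 square foot.
    print(sample_size_for_mean(sigma=2.0, width=1.0))  # 62  (optimistic sigma)
    print(sample_size_for_mean(sigma=4.0, width=1.0))  # 246 (conservative sigma)
```

Because $n$ grows with $\sigma^2$, doubling the assumed standard deviation roughly quadruples the number of cans needed.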
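
Sketch: proportion interval and sample size
The proportion formulas $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ and $n = (2 z_{\alpha/2}/L)^2\, \hat{p}(1-\hat{p})$ can be tried on the jury pool and nurse record questions. The sketch below uses only the standard library. The function names are illustrative, and the nurse calculation assumes the conservative value $\hat{p} = 1/2$ because the slides give no prior estimate.

```python
from math import ceil, sqrt
from statistics import NormalDist
from typing import Tuple


def proportion_interval(x: int, n: int, conf: float = 0.95) -> Tuple[float, float]:
    """Large-sample interval p_hat +/- z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    alpha = 1.0 - conf
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    p_hat = x / n
    margin = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size_for_proportion(width: float, conf: float = 0.95, p_guess: float = 0.5) -> int:
    """Smallest n giving total interval width <= width when p_hat equals p_guess."""
    alpha = 1.0 - conf
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return ceil((2.0 * z / width) ** 2 * p_guess * (1.0 - p_guess))


if __name__ == "__main__":
    # Jury pool: 22 of 295 sampled, 99% confidence (X >= 5 and n - X >= 5 both hold).
    low, high = proportion_interval(22, 295, conf=0.99)
    print(round(low, 3), round(high, 3))  # 0.035 0.114
    # Nurse records: 95% interval that is 0.02 wide, conservative p_guess = 0.5.
    print(sample_size_for_proportion(0.02, conf=0.95))  # 9604
```

If prior information suggested a proportion far from ½, passing it as `p_guess` would give a smaller required sample.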