Download Handout 7 - TAMU Stat

STAT 211 Handout 7 (Chapter 7: Statistical Intervals Based on a Single Sample)

A point estimate of a population characteristic is a single number that is based on sample data and represents a plausible value of the characteristic. The best statistic (MVUE) is the unbiased statistic with the smallest standard deviation. Since the point estimate is a single number, it does not provide information about the precision and reliability of estimation.

A confidence interval for a population characteristic (parameter) is an interval of plausible values for the characteristic. It is constructed so that, with a chosen degree of confidence, the value of the characteristic will be captured inside the interval.

The confidence level, $1-\alpha$, associated with a confidence interval estimate is the success rate of the method used to construct the interval. If we repeatedly sample from a population and calculate a confidence interval each time with the data available, then over the long run the proportion of the confidence intervals that actually contain the true value of the population characteristic will be $100(1-\alpha)\%$ (95%, 90%, or 99% for $\alpha = 0.05$, $0.10$, or $0.01$, respectively).

The general form of a confidence interval: (point estimate for a specified statistic) $\pm$ (critical value) $\cdot$ (standard error of the point estimate).

What is the best estimator for the parameters $\mu$, $\sigma^2$, $p$? _____________

The Empirical Rule tells you that about 95% of all values of $\bar{x}$ will be within 1.96 standard deviations of the mean.
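The long-run interpretation of the confidence level can be checked by simulation. The sketch below is only an illustration, not part of the handout: it assumes a normal population with a known standard deviation, uses the interval "sample mean ± critical value × standard error" as one instance of the general form, and counts how often the interval captures the true mean. The population values (`mu=10`, `sigma=2`), the sample size, and the function names are all hypothetical choices.

```python
import random
from statistics import NormalDist, mean


def z_interval(sample, sigma, conf=0.95):
    """Two-sided interval for the mean: xbar +/- z * sigma / sqrt(n), sigma known."""
    n = len(sample)
    # Critical value z_{alpha/2}; roughly 1.96 when conf = 0.95.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sigma / n ** 0.5
    xbar = mean(sample)
    return xbar - half_width, xbar + half_width


def coverage(mu=10.0, sigma=2.0, n=25, reps=2000, conf=0.95, seed=1):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = z_interval(sample, sigma, conf)
        if lo <= mu <= hi:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # The printed proportion should be close to the nominal level 0.95.
    print(coverage())
```

Changing `conf` to 0.90 or 0.99 should move the observed proportion toward 0.90 or 0.99 accordingly.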
For a 95% confidence interval: $1-\alpha = 0.95$, $\alpha = 0.05$, and $z_{\alpha/2} = 1.96$.

Confidence Interval for a Population Mean, $\mu$

(1) Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population with unknown population mean $\mu$ and known population standard deviation $\sigma$. Then the $100(1-\alpha)\%$ confidence interval for $\mu$ is

$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \quad\text{where}\quad P\left(-z_{\alpha/2} \le \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1-\alpha.$$

Thus, in 95% of all possible samples, $\mu$ will be captured in the calculated confidence interval $\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}$.

(2) Large Sample Confidence Interval for $\mu$: Let $X_1, X_2, \ldots, X_n$ be a random sample from a population distribution with mean $\mu$ and standard deviation $\sigma$. For a large sample size $n$, the CLT implies that $\bar{X}$ has approximately a normal distribution for any population distribution. The value of the population standard deviation $\sigma$ may not be known; instead, the value of the sample standard deviation $s$ may be known. If $n$ is sufficiently large ($n > 40$), the $100(1-\alpha)\%$ large sample confidence interval for $\mu$ is

$$\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right) \quad\text{where}\quad P\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right) \approx 1-\alpha.$$

Thus, in about 95% of all possible samples, $\mu$ will be captured in the calculated confidence interval $\bar{x} \pm 1.96\,\dfrac{s}{\sqrt{n}}$.

(3) Small Sample Confidence Interval for $\mu$: When the sample size is small ($n \le 40$), we have to make specific assumptions to find the confidence intervals.

Assumption: The population of interest is normal, so that $X_1, X_2, \ldots, X_n$ constitutes a random sample from a normal distribution with both $\mu$ and $\sigma$ unknown.
When $\bar{x}$ is the sample mean of a random sample of size $n$ from a normal distribution with mean $\mu$, the random variable

$$T = \frac{\bar{x}-\mu}{s/\sqrt{n}}$$

has a probability distribution called a t-distribution with $n-1$ degrees of freedom (properties of the t-distribution: discussion on page 300 of your textbook; the t-distribution table is on page 743, Table A.5).

The $100(1-\alpha)\%$ confidence interval for $\mu$ is

$$\left(\bar{x} - t_{\alpha/2;n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2;n-1}\frac{s}{\sqrt{n}}\right) \quad\text{where}\quad P\left(-t_{\alpha/2;n-1} \le \frac{\bar{x}-\mu}{s/\sqrt{n}} \le t_{\alpha/2;n-1}\right) = 1-\alpha.$$

Thus, in 95% of all possible samples, $\mu$ will be captured in the calculated confidence interval $\bar{x} \pm t_{0.025;n-1}\,\dfrac{s}{\sqrt{n}}$.

Choosing the sample size: With the known desired confidence level and interval width, we can determine the necessary sample size. Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population with unknown population mean $\mu$ and known population standard deviation $\sigma$. The width of the interval is $w = 2 z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ and the bound on the error of estimation is $z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$. That is, $\bar{x}$ will be within $z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ of $\mu$.

The sample size required to estimate a population mean $\mu$ to within an amount $B = z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ with $100(1-\alpha)\%$ confidence is

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2.$$

The same formula can be written using the interval width, $w = 2 z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$; then

$$n = \left(\frac{2 z_{\alpha/2}\,\sigma}{w}\right)^2.$$

Example 1: Each of the following is a confidence interval for the true average amount of time spent by patients using a physical therapy device, computed from the same sample data: (10.90, 25.44), (13.58, 22.76).
(a) What is the value of the sample mean time spent by the patients using the physical therapy device?
(b) The confidence level for one of these intervals is 95% and for the other is 99%. Which of the intervals has the 95% confidence level, and why?

Example 2: Suppose we want to estimate the average number of violent acts on TV per hour for a specific network. Data were collected from viewing a random selection of 50 prime time hours, and an average of 11.7 violent acts was recorded.
Suppose it is known that $\sigma = 5$ and the population distribution is normal.
The 95% CI for $\mu$ is (10.3141, 13.0859).
The 95% CI for $\mu$ if 100 prime time hours had been viewed, with the same mean and variance obtained, is (10.72, 12.68).
The 90% CI for $\mu$ is (10.5368, 12.8632).
The width of the 90% confidence interval for $\mu$ is 2.3264.
The bound on the error of estimation of the 90% confidence interval for $\mu$ is 1.1632.

Example 3: Investigators would like to estimate the average taxable income of apartment dwellers to within $500, using a 95% CI for the normally distributed data. Suppose that previous studies show that the standard deviation is $8000. How many people should they study? (Answer: 984)

Example 4: The brightness of a television picture tube can be evaluated by measuring the amount of current required to achieve a particular brightness. An engineer has designed a tube that he believes will require 300 microamps of current to produce the desired brightness level. A sample of 10 tubes results in an average of 317.2 and a standard deviation of 15.7. Using a 95% confidence interval, did he achieve the desired brightness?

Example 5: I want to see how long, on average, it takes Drano to unclog a sink. In a recent commercial, the stated claim was that it takes, on average, 15 minutes. I wanted to see if that claim was true, so I tested Drano on 64 randomly selected sinks. I found that it took an average of 18 minutes with a standard deviation of 2.5 minutes. Was their claim false?
The 99% CI for $\mu$ is (17.1953, 18.8047).
The 90% CI for $\mu$ is (17.4859, 18.5141).

Would my answer be different if I tested Drano on 25 randomly selected sinks and found that it took an average of 18 minutes with a standard deviation of 2.5 minutes?
The 99% CI for $\mu$ is (16.6015, 19.3985).
The 90% CI for $\mu$ is (17.1445, 18.8555).

Example 6: Students were weighed in kilograms at the beginning and end of a semester-long fitness class. Assume the population of weight changes follows a normal distribution.
A random sample of 12 female students yielded a mean of 0.45 and a standard deviation of 1.5. The 99% CI to estimate the true mean weight change is (-0.8949, 1.7949). Would you believe me if I claimed the average weight change was 0?

What is different in one-sided confidence intervals? Discussion

Example 7: Determine the confidence level for each of the following large sample one-sided confidence bounds.
(a) Upper bound: $\bar{x} + 0.93\,\dfrac{s}{\sqrt{n}}$ (Answer: 0.8238)
(b) Lower bound: $\bar{x} - 1.75\,\dfrac{s}{\sqrt{n}}$ (Answer: 0.9599)
Would your answer be different in small samples?

A General Large Sample Confidence Interval

When the estimator $\hat{\theta}$ satisfies the following properties,
a. the estimator has approximately a normal distribution,
b. it is (at least approximately) unbiased,
c. the standard deviation of the estimator, $\sigma_{\hat{\theta}}$, is known,
the confidence interval for $\theta$ can be constructed as

$$\hat{\theta} \pm z_{\alpha/2}\,\sigma_{\hat{\theta}} \quad\text{where}\quad P\left(-z_{\alpha/2} \le \frac{\hat{\theta}-\theta}{\sigma_{\hat{\theta}}} \le z_{\alpha/2}\right) \approx 1-\alpha.$$

Example 8: The large sample confidence interval for the parameter $\lambda$ of a Poisson distribution is

$$\left(\bar{x} - z_{\alpha/2}\sqrt{\frac{\bar{x}}{n}},\ \bar{x} + z_{\alpha/2}\sqrt{\frac{\bar{x}}{n}}\right) \quad\text{where}\quad P\left(-z_{\alpha/2} \le \frac{\bar{x}-\lambda}{\sqrt{\bar{x}/n}} \le z_{\alpha/2}\right) \approx 1-\alpha.$$

Large Sample Confidence Interval for a Population Proportion, $p$

If $n$ is sufficiently large, the $100(1-\alpha)\%$ large sample confidence interval for $p$ is

$$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right) \quad\text{where}\quad P\left(-z_{\alpha/2} \le \frac{\hat{p}-p}{\sqrt{\hat{p}(1-\hat{p})/n}} \le z_{\alpha/2}\right) \approx 1-\alpha.$$

Check whether $n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$ to see if you have a large sample. Otherwise, there is a formula (7.10) in your textbook that can be used without checking whether the sample is large; that is, formula (7.10) can be used for both large and small samples.

Choosing the sample size: With the known desired confidence level and interval width, we can determine the necessary sample size. The bound on the error of estimation is $z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$. That is, $\hat{p}$ will be within $z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ of $p$.
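As a rough illustration of the large-sample interval for a proportion and its sample-size check, here is a minimal sketch. The function name `proportion_ci` and the choice to raise an error when the check fails are assumptions made for this example; the inputs used in the demonstration (a sample proportion of 0.04 from 400 observations, 90% confidence) are the figures from Example 10 of this handout.

```python
from statistics import NormalDist


def proportion_ci(p_hat, n, conf=0.95):
    """Large-sample interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    # Large-sample check from the handout: n*p_hat >= 10 and n*(1 - p_hat) >= 10.
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("sample too small for the large-sample interval")
    # Critical value z_{alpha/2} from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    lo, hi = proportion_ci(0.04, 400, conf=0.90)
    print(round(lo, 4), round(hi, 4))  # prints 0.0239 0.0561
```

The result agrees with the answer quoted for Example 10. The half-width returned here is also the bound on the error of estimation, so the same function can be used to see how the bound shrinks as `n` grows.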
The sample size required to estimate a population proportion $p$ to within an amount $B = z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ with $100(1-\alpha)\%$ confidence is

$$n = \frac{z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{B^2}.$$

The same formula can be written using the interval width, $w = 2 z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$; then

$$n = \frac{4 z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{w^2}.$$

The conservative sample size is found by setting $\hat{p} = 1-\hat{p} = 0.5$.

What is different in one-sided confidence intervals? Discussion

Example 9: We are interested in the proportion of all students enrolled in Stat211 who listen to country music. Using our class as a random sample of Stat211 students, we see that ___________ out of ___________ listen to country music. Estimate the true proportion of all Stat211 students who listen to country music using a 90% confidence interval. What parameter are we estimating? _______________

Example 10: Scripps News Service reported that 4% of the members of the American Bar Association (ABA) are African American. Suppose that this figure is based on a random sample of 400 ABA members.
(a) Is the sample size large enough to justify the use of the large-sample confidence interval for a population proportion?
(b) Construct and interpret a 90% confidence interval for the true proportion of all ABA members who are African American. (Answer: (0.0239, 0.0561))

Example 11: I want to estimate the proportion of freshman Aggies who will drop out before graduation. How many Aggies should I include in my study in order to estimate $p$ to within 0.05 with 95% confidence? (Answer: 385)

A Prediction Interval for a Single Future Value: Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population distribution, and suppose we wish to predict the value of $X_{n+1}$, a single future observation.
The $100(1-\alpha)\%$ prediction interval for $X_{n+1}$ is

$$\left(\bar{x} - t_{\alpha/2;n-1}\, s\sqrt{1+\frac{1}{n}},\ \bar{x} + t_{\alpha/2;n-1}\, s\sqrt{1+\frac{1}{n}}\right) \quad\text{where}\quad P\left(-t_{\alpha/2;n-1} \le \frac{\bar{x}-x_{n+1}}{s\sqrt{1+\frac{1}{n}}} \le t_{\alpha/2;n-1}\right) = 1-\alpha.$$

Example 12: What is the 99% prediction interval for the weight change of an individual student from the population distribution in Example 6? (Answer: (-4.3992, 5.2992))

Tolerance Intervals: Let $k$ be a number between 0 and 100. A tolerance interval for capturing at least $k\%$ of the values in a normal distribution with a confidence level of $100(1-\alpha)\%$ has the form

$$\left(\bar{x} - (\text{critical value})\cdot s,\ \bar{x} + (\text{critical value})\cdot s\right).$$

Table A.6 (page 726) gives the tolerance critical values for $k = 90, 95, 99$ and $\alpha = 0.05, 0.01$ in one-sided and two-sided intervals.

Example 13: Use Example 6 and calculate an interval that includes at least 95% of the student weight changes in the population distribution using a confidence level of 99%. (Answer: (-5.355, 6.255))

Confidence Intervals for the Variance, $\sigma^2$, and Standard Deviation, $\sigma$, of a Normal Population: The population of interest is normal, so that $X_1, X_2, \ldots, X_n$ constitutes a random sample from a normal distribution with parameters $\mu$ and $\sigma^2$. Then the random variable

$$\frac{(n-1)s^2}{\sigma^2} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{\sigma^2}$$

has a chi-squared ($\chi^2$) probability distribution with $n-1$ degrees of freedom.

The $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is

$$\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2;n-1}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2;n-1}}\right) \quad\text{where}\quad P\left(\chi^2_{1-\alpha/2;n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2;n-1}\right) = 1-\alpha.$$

The details of the chi-squared ($\chi^2$) probability distribution will be discussed in class, and the table of critical values (Table A.7, page 727) will be demonstrated.

Example 14: Determine the following:
(a) The 95th percentile for the chi-squared distribution with n = 20.
(b) The 5th percentile for the chi-squared distribution with n = 20.
(c) $P(10.117 \le \chi^2 \le 30.143)$ where $\chi^2$ is a chi-squared r.v. with n = 20.
(d) $P(\chi^2 < 10.283 \text{ or } \chi^2 > 35.478)$ where $\chi^2$ is a chi-squared r.v. with n = 22.

Example 15 (Exercise 7.46 in the 6th edition, Exercise 7.44 in the 5th edition):
(a) Is it plausible to assume that the data come from a normal population distribution?

[Figure: normal probability plot for turbidity (Percent vs. Data); ML estimates: Mean 25.3133, StDev 1.52528]

Variable     n    Mean     Median   TrMean   StDev   SE Mean
turbidity    15   25.313   25.800   25.438   1.579   0.408

Variable     Minimum   Maximum   Q1       Q3
turbidity    21.700    27.300    24.100   26.700

(b) Calculate a 95% CI for the population standard deviation of turbidity.

The 95% CI for $\sigma$ is

$$\left(\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2;n-1}}},\ \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2;n-1}}}\right) = \left(\sqrt{\frac{14(1.579)^2}{\chi^2_{0.025;14}}},\ \sqrt{\frac{14(1.579)^2}{\chi^2_{0.975;14}}}\right) = (1.16,\ 2.49)$$

where $\chi^2_{0.025;14} = 26.119$ and $\chi^2_{0.975;14} = 5.629$.

(c) Calculate an upper confidence bound with confidence level 95% for the population standard deviation of turbidity.

$$\sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha;n-1}}} = \sqrt{\frac{14(1.579)^2}{\chi^2_{0.95;14}}} = 2.305 \quad\text{where}\quad \chi^2_{0.95;14} = 6.571.$$

Discussion on finding the confidence interval for a linear combination of population means

Example 16 (Exercise 7.53 in the 6th edition, Exercise 7.51 in the 5th edition): Four different groves of fruit trees are selected for experimentation. The first three groves are sprayed with pesticides and the fourth is treated with ladybugs. We would like to measure the difference in true average yields between treatment with pesticides and treatment with ladybugs. Compute the 95% CI for $\theta = \frac{1}{3}(\mu_1 + \mu_2 + \mu_3) - \mu_4$, where $\mu_i$ is the $i$th true average yield.
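Treatment        $n_i$   $\bar{x}_i$   $s_i$
1 (pesticide)    100     10.5          1.5
2 (pesticide)    90      10.0          1.3
3 (pesticide)    100     10.1          1.8
4 (ladybugs)     120     10.7          1.6

$$\hat{\theta} = \frac{1}{3}\left(\bar{x}_1 + \bar{x}_2 + \bar{x}_3\right) - \bar{x}_4 = \frac{1}{3}(10.5 + 10 + 10.1) - 10.7 = -0.5$$

$$\operatorname{Var}(\hat{\theta}) = \frac{1}{9}\left[\operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x}_2) + \operatorname{Var}(\bar{x}_3)\right] + \operatorname{Var}(\bar{x}_4) = \frac{1}{9}\left(\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} + \frac{\sigma_3^2}{n_3}\right) + \frac{\sigma_4^2}{n_4}$$

$$\text{Estimated } \operatorname{Var}(\hat{\theta}) = \frac{1}{9}\left(\frac{1.5^2}{100} + \frac{1.3^2}{90} + \frac{1.8^2}{100}\right) + \frac{1.6^2}{120} = 0.0295$$

The 95% CI for $\theta$ is $-0.5 \pm 1.96\sqrt{0.0295} = (-0.8366,\ -0.1634)$.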
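The arithmetic in Example 16 can be reproduced with a short sketch. The helper name `linear_combination_ci` is hypothetical, and the code assumes independent samples and the large-sample normal critical value, as in the calculation above. It works with unrounded values of the critical value and the estimated variance, so the endpoints can differ from the hand calculation in the fourth decimal place.

```python
from statistics import NormalDist


def linear_combination_ci(coeffs, means, sds, ns, conf=0.95):
    """Large-sample CI for sum(c_i * mu_i), assuming independent samples."""
    # Point estimate: the same linear combination applied to the sample means.
    estimate = sum(c * m for c, m in zip(coeffs, means))
    # Estimated variance: sum of c_i^2 * s_i^2 / n_i.
    variance = sum(c ** 2 * s ** 2 / n for c, s, n in zip(coeffs, sds, ns))
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * variance ** 0.5
    return estimate, variance, (estimate - half_width, estimate + half_width)


# Data from the Example 16 table: three pesticide groves, one ladybug grove.
coeffs = [1 / 3, 1 / 3, 1 / 3, -1]
means = [10.5, 10.0, 10.1, 10.7]
sds = [1.5, 1.3, 1.8, 1.6]
ns = [100, 90, 100, 120]

if __name__ == "__main__":
    est, var, (lo, hi) = linear_combination_ci(coeffs, means, sds, ns)
    print(round(est, 4), round(var, 4), round(lo, 3), round(hi, 3))
```

Because the whole interval lies below zero, the data suggest that the average pesticide yield is lower than the ladybug yield.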
