STAT 211 Handout 7 (Chapter 7: Statistical Intervals Based on a Single Sample)

A point estimate of a population characteristic is a single number that is based on sample data and represents a plausible value of the characteristic. The best statistic (MVUE) is the unbiased statistic with the smallest standard deviation.

A confidence interval for a population characteristic (parameter) is an interval of plausible values for the characteristic. It is constructed so that, with a chosen degree of confidence, the value of the characteristic will be captured inside the interval.

The confidence level, $1-\alpha$, associated with a confidence interval estimate is the success rate of the method used to construct the interval. If we repeatedly sample from a population and calculate a confidence interval each time with the data available, then over the long run the proportion of the confidence intervals that actually contain the true value of the population characteristic will be $100(1-\alpha)\%$ (95%, 90%, or 99% for $\alpha = 0.05$, $0.10$, or $0.01$, respectively).

The general form of a confidence interval: (point estimate for the specified parameter) $\pm$ (critical value) $\cdot$ (standard error of the point estimate).

What is the best estimator for each of the parameters $\mu$, $\sigma^2$, $p$? ______________

The Empirical Rule tells you that about 95% of all values of $\bar{x}$ will be within 1.96 standard deviations of the mean (here the standard deviation is that of $\bar{x}$, namely $\sigma/\sqrt{n}$).

What is $1-\alpha$ when you compute a 95% confidence interval? ___________________
What is $\alpha$ when you compute a 95% confidence interval? ___________________
What is $z_{\alpha/2}$ when you compute a 95% confidence interval? ___________________

Confidence Interval for a Population Mean, $\mu$

Suppose that the parameter of interest is the population mean, $\mu$, and that
a. the population distribution is normal
b. the value of the population standard deviation $\sigma$ is known

Let $X_1, X_2, \dots, X_n$ be a random sample.
Then a $100(1-\alpha)\%$ confidence interval for $\mu$ is

$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \quad\text{where}\quad P\left(-z_{\alpha/2} < \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1-\alpha$$

Thus, in 95% of all possible samples, $\mu$ will be captured in the calculated confidence interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$.

Choosing the sample size: The bound on the error of estimation is $B = z_{\alpha/2}\,\sigma/\sqrt{n}$; that is, $\bar{x}$ will be within $z_{\alpha/2}\,\sigma/\sqrt{n}$ of $\mu$. The sample size required to estimate a population mean $\mu$ to within an amount $B$ with $100(1-\alpha)\%$ confidence is

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2$$

rounded up to the next whole number. The same formula can be written using the interval width, $w = 2 z_{\alpha/2}\,\sigma/\sqrt{n}$; then

$$n = \left(\frac{2 z_{\alpha/2}\,\sigma}{w}\right)^2$$

Example 1: Each of the following is a confidence interval, computed from the same sample data, for the true average amount of time spent by patients using a physical therapy device: (10.90, 25.44), (13.58, 22.76)
(a) What is the value of the sample mean time spent by the patients using the physical therapy device?
(b) The confidence level for one of these intervals is 95% and for the other is 99%. Which of the intervals has the 95% confidence level, and why?

Example 2: Suppose we want to estimate the average number of violent acts on TV per hour for a specific network. Data were collected by viewing a random selection of 50 prime-time hours, and an average of 11.7 violent acts was recorded. Suppose it is known that $\sigma = 5$.
The 95% CI for $\mu$ is (10.3141, 13.0859).
The 95% CI for $\mu$ if 100 prime-time hours had been viewed, with the same mean and variance obtained, is (10.72, 12.68).
The 90% CI for $\mu$ is (10.5368, 12.8632).
The width of the 90% confidence interval for $\mu$ is 2.3264.
The bound on the error of estimation for the 90% confidence interval for $\mu$ is 1.1632.

Example 3: Investigators would like to estimate the average taxable income of apartment dwellers to within $500, using a 95% CI. Suppose that previous studies show that the standard deviation is $8000. How many people should they study? (Answer: 984)

Large Sample Confidence Interval for $\mu$

Suppose that the parameter of interest is the population mean, $\mu$, and that
a.
$X_1, X_2, \dots, X_n$ is a random sample from a population distribution with mean $\mu$ and standard deviation $\sigma$.
b. For a large sample size $n$, the CLT implies that $\bar{X}$ has approximately a normal distribution for any population distribution.
c. The value of the population standard deviation $\sigma$ may not be known. Instead, the value of the sample standard deviation $s$ may be known.

If $n$ is sufficiently large ($n > 40$), a $100(1-\alpha)\%$ large-sample confidence interval for $\mu$ is

$$\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right) \quad\text{where}\quad P\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right) \approx 1-\alpha$$

Example 4: One method for solving the electric power shortage employs the construction of floating nuclear power plants located a few miles offshore in the ocean. Because there is concern about the possibility of a ship colliding with a floating plant, an estimate of the density of ship traffic in the area is needed. The number of ships passing within 10 miles of the proposed power-plant location per day, recorded for 60 days during July and August, had sample mean 7.2 and sample variance 8.8.
(a) Find a 98% confidence interval for the mean number of ships passing within 10 miles of the proposed power-plant location during a one-day period. (Answer: (6.3077, 8.0923))
(b) Suppose a precision of estimation of 1 ship is desired in a 98% confidence interval for the mean number of ships passing within 10 miles of the proposed power-plant location during a one-day period. What sample size (number of days observed) is needed? (Answer: 48)

Example 5: I want to see how long, on average, it takes Drano to unclog a sink. In a recent commercial, the stated claim was that it takes 15 minutes on average. I wanted to see if that claim was true, so I tested Drano on 64 randomly selected sinks. I found that it took an average of 18 minutes with a standard deviation of 2.5 minutes. Was their claim false?
The 99% CI for $\mu$ is (17.1953, 18.8047).
The 90% CI for $\mu$ is (17.4859, 18.5141).

What is different in one-sided confidence intervals?
Discussion

Example 6: Determine the confidence level for each of the following large-sample one-sided confidence bounds:
(a) Upper bound: $\bar{x} + 0.93\, s/\sqrt{n}$ (Answer: 0.8238)
(b) Lower bound: $\bar{x} - 1.75\, s/\sqrt{n}$ (Answer: 0.9599)

A General Large Sample Confidence Interval

When the estimator $\hat{\theta}$ satisfies the following properties,
a. the estimator has approximately a normal distribution
b. it is (at least approximately) unbiased
c. the standard deviation of the estimator, $\sigma_{\hat{\theta}}$, is known

the confidence interval for $\theta$ can be constructed as $\hat{\theta} \pm z_{\alpha/2}\,\sigma_{\hat{\theta}}$, where

$$P\left(-z_{\alpha/2} < \frac{\hat{\theta}-\theta}{\sigma_{\hat{\theta}}} < z_{\alpha/2}\right) \approx 1-\alpha$$

Example 7: The large-sample confidence interval for the parameter $\lambda$ in the Poisson distribution is

$$\left(\bar{x} - z_{\alpha/2}\sqrt{\frac{\bar{x}}{n}},\ \ \bar{x} + z_{\alpha/2}\sqrt{\frac{\bar{x}}{n}}\right) \quad\text{where}\quad P\left(-z_{\alpha/2} < \frac{\bar{x}-\lambda}{\sqrt{\lambda/n}} < z_{\alpha/2}\right) \approx 1-\alpha$$

Large Sample Confidence Interval for a Population Proportion, $p$

If $n$ is sufficiently large, a $100(1-\alpha)\%$ large-sample confidence interval for $p$ is

$$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right) \quad\text{where}\quad P\left(-z_{\alpha/2} < \frac{\hat{p}-p}{\sqrt{\hat{p}(1-\hat{p})/n}} < z_{\alpha/2}\right) \approx 1-\alpha$$

Check whether $n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$ to see if you have a large sample. Otherwise, there is a formula (7.10) in your textbook that can be used without checking whether the sample is large; that is, formula (7.10) can be used for both large and small samples.

Choosing the sample size: The bound on the error of estimation is $B = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$; that is, $\hat{p}$ will be within $z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ of $p$. The sample size required to estimate a population proportion $p$ to within an amount $B$ with $100(1-\alpha)\%$ confidence is

$$n = \frac{z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{B^2}$$

The same formula can be written using the interval width, $w = 2 z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$; then

$$n = \frac{4 z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{w^2}$$

The conservative sample size is found by setting $\hat{p} = 1-\hat{p} = 0.5$.

What is different in one-sided confidence intervals? Discussion

Example 8: We are interested in the proportion of all students enrolled in Stat211 who listen to country music. Using our class as a random sample of Stat211 students, we see that ___________ out of ___________ of you listen to country music.
Estimate the true proportion of all Stat211 students who listen to country music using a 90% confidence interval. What parameter are we estimating? _______________

Example 9: Scripps News Service reported that 4% of the members of the American Bar Association (ABA) are African American. Suppose that this figure is based on a random sample of 400 ABA members.
(a) Is the sample size large enough to justify the use of the large-sample confidence interval for a population proportion?
(b) Construct and interpret a 90% confidence interval for the true proportion of all ABA members who are African American. (Answer: (0.0239, 0.0561))

Example 10: I want to estimate the proportion of freshman Aggies who will drop out before graduation. How many Aggies should I include in my study in order to estimate $p$ to within 0.05 with 95% confidence? (Answer: 385)

Intervals Based on a Normal Population Distribution

When the sample size is small, we have to make specific assumptions to find the confidence intervals.

Assumption: The population of interest is normal, so that $X_1, X_2, \dots, X_n$ constitutes a random sample from a normal distribution with both $\mu$ and $\sigma$ unknown.

When $\bar{x}$ is the sample mean and $s$ is the sample standard deviation of a random sample of size $n$ from a normal distribution with mean $\mu$, the random variable

$$T = \frac{\bar{x}-\mu}{s/\sqrt{n}}$$

has a probability distribution called a t-distribution with $n-1$ degrees of freedom.

Properties of the t-distribution: discussion (page 296 of your textbook). The t-distribution table is on page 725, Table A.5.

A $100(1-\alpha)\%$ confidence interval for $\mu$ is

$$\left(\bar{x} - t_{\alpha/2;\,n-1}\frac{s}{\sqrt{n}},\ \ \bar{x} + t_{\alpha/2;\,n-1}\frac{s}{\sqrt{n}}\right) \quad\text{where}\quad P\left(-t_{\alpha/2;\,n-1} < \frac{\bar{x}-\mu}{s/\sqrt{n}} < t_{\alpha/2;\,n-1}\right) = 1-\alpha$$

Example 11: Students were weighed in kilograms at the beginning and end of a semester-long fitness class. Assume the population of weight changes follows a normal distribution. A random sample of 12 female students yielded a mean weight change of 0.45 and a standard deviation of 1.5. The 99% CI for the true mean weight change is (-0.8949, 1.7949).
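The interval reported in Example 11 can be checked with a short calculation. The following is a minimal Python sketch, not part of the original handout: the function name `t_interval` is an illustrative assumption, and the critical value 3.106 for $t_{0.005;\,11}$ is assumed to be read from a t table such as Table A.5, since the Python standard library has no t quantile function.

```python
import math


def t_interval(xbar, s, n, t_crit):
    """Two-sided t interval: xbar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Assumption: t critical value for alpha/2 = 0.005 and n - 1 = 11 degrees
# of freedom, taken from a printed t table rather than computed here.
T_CRIT_99_DF11 = 3.106

# Example 11 inputs: sample mean 0.45, sample standard deviation 1.5, n = 12.
low, high = t_interval(xbar=0.45, s=1.5, n=12, t_crit=T_CRIT_99_DF11)
print(round(low, 4), round(high, 4))  # compare with the interval reported in Example 11
```

Because the interval is built from a table lookup, any other confidence level or sample size only requires changing the critical value passed to the function.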
Would you believe me if I claimed the average weight change was 0?

What is different in one-sided confidence intervals? Discussion

A Prediction Interval for a Single Future Value

Let $X_1, X_2, \dots, X_n$ be a random sample from a normal population distribution, and suppose we wish to predict the value of $X_{n+1}$, a single future observation. A $100(1-\alpha)\%$ prediction interval for $X_{n+1}$ is

$$\left(\bar{x} - t_{\alpha/2;\,n-1}\, s\sqrt{1+\frac{1}{n}},\ \ \bar{x} + t_{\alpha/2;\,n-1}\, s\sqrt{1+\frac{1}{n}}\right) \quad\text{where}\quad P\left(-t_{\alpha/2;\,n-1} < \frac{\bar{x}-x_{n+1}}{s\sqrt{1+\frac{1}{n}}} < t_{\alpha/2;\,n-1}\right) = 1-\alpha$$

Example 12: What is the 99% prediction interval for the weight change of an individual student from the population distribution in Example 11? (Answer: (-4.3992, 5.2992))

Tolerance Intervals

Let $k$ be a number between 0 and 100. A tolerance interval for capturing at least $k\%$ of the values in a normal distribution with a confidence level of $100(1-\alpha)\%$ has the form

$$\left(\bar{x} - (\text{critical value})\cdot s,\ \ \bar{x} + (\text{critical value})\cdot s\right)$$

Table A.6 (page 726) gives the tolerance critical values for $k = 90, 95, 99$ and $\alpha = 0.05, 0.01$, for one-sided and two-sided intervals.

Example 13: Use Example 11 and calculate an interval that includes at least 95% of the student weight changes in the population distribution, using a confidence level of 99%. (Answer: (-5.355, 6.255))

Confidence Intervals for the Variance, $\sigma^2$, and Standard Deviation, $\sigma$, of a Normal Population

The population of interest is normal, so that $X_1, X_2, \dots, X_n$ constitutes a random sample from a normal distribution with parameters $\mu$ and $\sigma^2$. Then the random variable

$$\frac{(n-1)s^2}{\sigma^2} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{\sigma^2}$$

has a chi-squared ($\chi^2$) probability distribution with $n-1$ degrees of freedom. A $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is

$$\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2;\,n-1}},\ \ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2;\,n-1}}\right) \quad\text{where}\quad P\left(\chi^2_{1-\alpha/2;\,n-1} < \frac{(n-1)s^2}{\sigma^2} < \chi^2_{\alpha/2;\,n-1}\right) = 1-\alpha$$

Taking square roots of the endpoints gives a confidence interval for $\sigma$.

The details of the chi-squared ($\chi^2$) probability distribution will be discussed in class, and the table of critical values (Table A.7, page 727) will be demonstrated.

Example 14: Determine the following:
(a) The 95th percentile for the chi-squared distribution with n = 20.
(b) The 5th percentile for the chi-squared distribution with n = 20.
(c) $P(10.117 \le \chi^2 \le 30.143)$, where $\chi^2$ is a chi-squared r.v. with n = 20.
(d) $P(\chi^2 < 10.283 \text{ or } \chi^2 > 35.478)$, where $\chi^2$ is a chi-squared r.v. with n = 22.

Exercise 7.44:
(a) Is it plausible to assume that the data come from a normal population distribution?
(b) Calculate an upper confidence bound, with confidence level 95%, for the population standard deviation of turbidity.
(c) Calculate a 95% CI for the population standard deviation of turbidity.

Variable     n    Mean     Median   TrMean   StDev   SE Mean
turbidity    15   25.313   25.800   25.438   1.579   0.408

Variable     Minimum   Maximum   Q1       Q3
turbidity    21.700    27.300    24.100   26.700

[Figure: Normal probability plot for turbidity (horizontal axis: Data; vertical axis: Percent). ML estimates: Mean 25.3133, StDev 1.52528.]

Discussion on finding the confidence interval for a linear combination of population means

Exercise 7.51: Find a 95% CI for $\frac{1}{3}(\mu_1+\mu_2+\mu_3) - \mu_4$, where $\mu_i$ is the $i$th true average yield.

Treatment        $n_i$   $\bar{x}_i$   $s_i$
1 (pesticide)    100     10.5          1.5
2 (pesticide)    90      10.0          1.3
3 (pesticide)    100     10.1          1.8
4 (ladybugs)     120     10.7          1.6
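The z-based interval and sample-size formulas in this handout can be checked numerically. The following is a minimal Python sketch, not part of the original handout: the function names (`z_crit`, `mean_ci`, `n_for_mean`, `prop_ci`, `n_for_prop`) are illustrative assumptions, and only the standard library's `statistics.NormalDist` is used. Because it computes $z_{\alpha/2}$ exactly instead of using rounded table values such as 1.645, its results may differ from the handout's answers in the fourth decimal place.

```python
import math
from statistics import NormalDist


def z_crit(conf):
    """Two-sided critical value z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1 - conf
    return NormalDist().inv_cdf(1 - alpha / 2)


def mean_ci(xbar, sd, n, conf):
    """z interval for a mean: xbar +/- z * sd / sqrt(n) (sd is sigma, or s for large n)."""
    margin = z_crit(conf) * sd / math.sqrt(n)
    return xbar - margin, xbar + margin


def n_for_mean(sd, bound, conf):
    """Smallest n with z * sd / sqrt(n) <= bound, i.e. n = (z * sd / B)^2 rounded up."""
    return math.ceil((z_crit(conf) * sd / bound) ** 2)


def prop_ci(p_hat, n, conf):
    """Large-sample interval for a proportion: p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    margin = z_crit(conf) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def n_for_prop(bound, conf, p_hat=0.5):
    """n = z^2 * p_hat(1-p_hat) / B^2 rounded up; p_hat = 0.5 is the conservative choice."""
    return math.ceil(z_crit(conf) ** 2 * p_hat * (1 - p_hat) / bound ** 2)


ex2 = mean_ci(11.7, 5, 50, 0.95)    # Example 2; handout reports (10.3141, 13.0859)
ex3 = n_for_mean(8000, 500, 0.95)   # Example 3; handout reports 984
ex9 = prop_ci(0.04, 400, 0.90)      # Example 9; handout reports (0.0239, 0.0561)
ex10 = n_for_prop(0.05, 0.95)       # Example 10; handout reports 385
print(ex2, ex3, ex9, ex10)
```

The sample-size helpers round up because rounding down would leave the bound on the error of estimation slightly larger than the requested $B$.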