Chapter 4: Statistical Inference

Estimation
-Confidence interval estimation for mean and proportion
-Determining sample size

Hypothesis Testing
-Test for one and two means
-Test for one and two proportions

Statistical Inference

Statistical inference is the process of drawing conclusions about the characteristics of a population based on information contained in a sample. Because populations are characterized by numerical descriptive measures called parameters, statistical inference is concerned with making inferences about population parameters.

ESTIMATION

Two terms should be understood first: estimator and estimate. An estimator is the rule (a statistic) used to compute a value from sample data; an estimate is the value it produces for a particular sample. An estimate of a population parameter may be expressed in two ways: point estimate and interval estimate.

Point Estimate

A point estimate of a population parameter is a single value of a statistic. For example, the sample mean $\bar{x}$ is a point estimate of the population mean $\mu$. Similarly, the sample proportion $\hat{p}$ is a point estimate of the population proportion $p$.

Interval Estimate

An interval estimate is defined by two numbers, between which a population parameter is said to lie. For example, $a < \mu < b$ is an interval estimate of the population mean $\mu$. It indicates that the population mean is greater than $a$ but less than $b$.

Point Estimators

Choosing the right point estimator for a parameter depends on the properties of the estimator itself. Four properties are looked for before an estimator is regarded as a best linear unbiased estimator:
• Unbiased
• Consistent
• Efficient
• Sufficient

Confidence Interval

• A range of values constructed from the sample data so that the population parameter is likely to occur within that range at a specified probability.
• The specified probability is called the level of confidence.
• The level of confidence states how much confidence we have that the interval contains the true population parameter. The confidence level is denoted by $(1-\alpha)100\%$.
• Example: a 95% level of confidence means that if 100 confidence intervals were constructed, each based on a different sample from the same population, we would expect about 95 of the intervals to contain the population mean.

To compute a confidence interval for the mean, we consider two situations:
i. We use sample data to estimate $\mu$ with $\bar{X}$, and the population standard deviation $\sigma$ is known.
ii. We use sample data to estimate $\mu$ with $\bar{X}$, and the population standard deviation $\sigma$ is unknown. In this case, we substitute the sample standard deviation $s$ for the population standard deviation $\sigma$.

Example 2.1: Find a 95% confidence interval for a population mean given these values:
a) $n = 36$, $\bar{x} = 13.3$, $s^2 = 3.42$
b) $n = 64$, $\bar{x} = 2.73$, $s^2 = 0.1047$

Solution for a):
1st step: $(1-\alpha)100\% = 95\%$, so $1-\alpha = 0.95$, $\alpha = 0.05$ and $\alpha/2 = 0.025$.
2nd step: Find $z_{\alpha/2}$ from the table on page 26: $z_{0.025} = 1.96$.
3rd step: Use the formula $CI = \bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$:
$CI = 13.3 \pm (1.96)\frac{1.8493}{\sqrt{36}} = 13.3 \pm 0.6041 = [12.6959, 13.9041]$
4th step: Conclusion: the 95% confidence interval for the mean lies between 12.6959 and 13.9041.

Example 2.2: The mean and standard deviation of the maximum loads supported by a sample of 60 cables are 11.09 tons and 0.73 tons. Find a 95% confidence interval for the mean of the maximum loads of all cables produced by the company.

Example 2.3: The brightness of a television picture tube can be evaluated by measuring the amount of current required to achieve a particular brightness level. A random sample of 10 tubes gave a sample mean of 317.2 microamps and a sample standard deviation of 15.7 microamps. Find (in microamps) a 99% confidence interval estimate for the mean current required to achieve a particular brightness level.
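The arithmetic in Example 2.1(a) can be checked with a short Python sketch. It assumes the interval $\bar{x} \pm (\text{critical value})\, s/\sqrt{n}$ used in that example; the helper names `z_critical` and `mean_ci` are illustrative and not part of the course material. The critical value is passed in explicitly, so the same helper also accepts a value read from a t table.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value z_{alpha/2}."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def mean_ci(xbar, s, n, critical):
    """Interval xbar +/- critical * s / sqrt(n)."""
    half_width = critical * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Example 2.1(a): n = 36, sample mean 13.3, sample variance 3.42
lower, upper = mean_ci(13.3, sqrt(3.42), 36, z_critical(0.95))
print(round(lower, 4), round(upper, 4))  # 12.6959 13.9041
```

For a small sample such as the one in Example 2.3, `mean_ci` would be called with the t-table value for $n-1$ degrees of freedom in place of `z_critical(...)`.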
Solution:
$s = 15.7$, $\bar{x} = 317.2$, $n = 10 < 30$

For a 99% CI: $(1-\alpha)100\% = 99\%$, so $1-\alpha = 0.99$, $\alpha = 0.01$ and $\alpha/2 = 0.005$.

From the t-distribution table: $t_{\alpha/2,\,n-1} = t_{0.005,\,9} = 3.250$

Hence the 99% CI is
$317.2 \pm t_{0.005,\,9}\,\frac{15.7}{\sqrt{10}} = 317.2 \pm 3.250\left(\frac{15.7}{\sqrt{10}}\right) = [301.0645, 333.3355]$ microamps

Thus, we are 99% confident that the mean current required to achieve a particular brightness level is between 301.0645 and 333.3355 microamps.

Exercise 2.1: Taking a random sample of 35 individuals waiting to be served by the teller, we find that the mean waiting time was 22.0 min and the standard deviation was 8.0 min. Using a 90% confidence level, estimate the mean waiting time for all individuals waiting in the service line.
Answer: [19.7757, 24.2243]

Confidence Interval Estimates for the Difference Between Two Population Means, $\mu_1 - \mu_2$

i) The variances $\sigma_1^2$ and $\sigma_2^2$ are known:
$(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

ii) If the population variances $\sigma_1^2$ and $\sigma_2^2$ are unknown, the formula depends on the sample sizes and on the assumption made about the population variances:

Unequal variances, $\sigma_1^2 \neq \sigma_2^2$:
• Large samples ($n_1 \geq 30$, $n_2 \geq 30$): $(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
• Small samples ($n_1 < 30$, $n_2 < 30$): $(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, with $v = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$

Equal variances, $\sigma_1^2 = \sigma_2^2$:
• Large samples ($n_1 \geq 30$, $n_2 \geq 30$): $(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
• Small samples ($n_1 < 30$, $n_2 < 30$): $(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2,\,v}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, with $v = n_1 + n_2 - 2$

In both equal-variance cases the pooled variance is $S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$.

Example 2.4: Two machines are used to fill plastic bottles with liquid laundry detergent. The standard deviations of fill volume are known to be 0.10 and 0.15 fluid ounces for the two machines, respectively. Two random samples are selected, $n_1 = 14$ bottles from machine 1 and $n_2 = 12$ bottles from machine 2, and the sample mean fill volumes are $\bar{x}_1 = 30.5$ and $\bar{x}_2 = 29.4$ fluid ounces. Construct a 90% confidence interval on the mean difference in fill volumes. Interpret the results.
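The two-sample formulas above reduce to a standard error and a critical value, which can be sketched in Python. This is a minimal illustration rather than the course's own method: the function names are hypothetical, the z value comes from the standard library, and any t value is assumed to be read from a table and passed in by hand. The numbers at the bottom are those of Example 2.4 (known standard deviations).

```python
from math import sqrt
from statistics import NormalDist


def se_unpooled(v1, n1, v2, n2):
    """sqrt(v1/n1 + v2/n2); v1 and v2 are variances (known or sample)."""
    return sqrt(v1 / n1 + v2 / n2)


def se_pooled(s1_sq, n1, s2_sq, n2):
    """S_p * sqrt(1/n1 + 1/n2), using the pooled variance S_p^2."""
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    return sqrt(sp_sq) * sqrt(1 / n1 + 1 / n2)


def welch_df(s1_sq, n1, s2_sq, n2):
    """Degrees of freedom v for unequal variances and small samples."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def diff_ci(diff, se, critical):
    """Interval diff +/- critical * se."""
    return diff - critical * se, diff + critical * se


# Example 2.4: sigma_1 = 0.10, sigma_2 = 0.15, n1 = 14, n2 = 12, 90% level
z = NormalDist().inv_cdf(0.95)  # z_{0.05}
se = se_unpooled(0.10 ** 2, 14, 0.15 ** 2, 12)
lower, upper = diff_ci(30.5 - 29.4, se, z)
print(round(lower, 4), round(upper, 4))  # 1.0163 1.1837
```

When the variances are unknown, `se_pooled` or `se_unpooled` is combined with a t-table value whose degrees of freedom are either $n_1 + n_2 - 2$ or the result of `welch_df` (which is generally not a whole number and has to be rounded before using a table).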
Solution:
$(1-\alpha)100\% = 90\%$, so $1-\alpha = 0.90$, $\alpha = 0.1$ and $\alpha/2 = 0.05$.
Machine 1: $\bar{x}_1 = 30.5$, $\sigma_1 = 0.10$, $n_1 = 14$
Machine 2: $\bar{x}_2 = 29.4$, $\sigma_2 = 0.15$, $n_2 = 12$

$(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (30.5 - 29.4) \pm z_{0.05}\sqrt{\frac{0.10^2}{14} + \frac{0.15^2}{12}} = 1.1 \pm 1.6449(0.0509) = [1.0163, 1.1837]$

We are 90% confident that the difference in mean fill volumes lies between 1.0163 and 1.1837 fluid ounces.

Exercise 2.2: 17 male and 20 female undergraduate students are randomly selected from the faculty of mechanical engineering. The results for Test 2 of SSM 3763 gave the following data:
Male: $\bar{X}_M = 82$, $S_M = 8$
Female: $\bar{X}_F = 76$, $S_F = 6$
Assume that both populations are normally distributed and have equal population variances. Construct a 95% confidence interval for the difference in the two means.
Answer: [1.3217, 10.6783]

Example 2.5: According to a poll, 40% of working women say that they feel stress at work. The poll was based on a random sample of 1502 working women aged 30 and above. Construct a 95% confidence interval for the corresponding population proportion.

Exercise 2.3: In a random sample of 70 automobiles registered in a certain state, 28 were found to have emission levels that exceed a state standard. Find a 95% confidence interval for the proportion of automobiles in the state whose emission levels exceed the standard.
Answer: [0.2852, 0.5148]

Example 2.5: Two separate surveys were carried out to investigate whether or not the users of the Plus highway were in favour of raising the speed limit on highways. Of the 250 car drivers interviewed, 220 were in favour of raising the speed limit, while of the 200 motorists interviewed, 180 were in favour of raising the speed limit. Find a 95% confidence interval for the difference in proportion between the car drivers and motorists who are in favour of raising the speed limit.

Exercise 2.4: In a test of the effect of dampness on electric connections, 100 electric connections were tested under damp conditions and 150 were tested under dry conditions.
Twenty of the damp connections failed and only 10 of the dry ones failed. Find a 90% confidence interval for the difference between the proportions of connections that fail when damp as opposed to dry.
Answer: [0.0591, 0.207]

Error of estimation and choosing the sample size

When we estimate a parameter, all we have is the estimate computed from the n measurements contained in the sample. Two questions usually arise:
(i) How far will our estimate lie from the true value of the parameter?
(ii) How many measurements should be included in the sample?

The distance between an estimate and the parameter being estimated is called the error of estimation. For example, if most estimates are within 1.96 standard deviations of the true value of the parameter, then we would expect the error of estimation to be less than 1.96 standard deviations of the estimator, with probability approximately equal to 0.95.

To estimate a mean to within an error $E$, the required sample size is
$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$, where n is rounded up to the nearest whole number.
To estimate a proportion to within a bound $B$, the required sample size is
$n = \frac{z_{\alpha/2}^2\, p(1-p)}{B^2}$

Example 2.6: The college president asks the statistics teacher to estimate the average age of the students at their college. The statistics teacher would like to be 99% confident that the estimate is accurate to within 1 year. From a previous study, the standard deviation of the ages is known to be 3 years. How large a sample is necessary?

Exercise 2.5: The diameter of a two-year-old Sentang tree is normally distributed with a standard deviation of 8 cm. How many trees should be sampled if it is required to estimate the mean diameter to within ± 1.5 cm with 95% confidence?
Answer: 110 trees

EXERCISES

Exercise 2.6: A tire manufacturer wishes to investigate the tread life of its tires. A sample of 10 tires driven 50,000 miles revealed a sample mean of 0.32 inches of tread remaining with a standard deviation of 0.09 inches. Construct a 95 percent confidence interval for the population mean.
Would it be reasonable for the manufacturer to conclude that after 50,000 miles the population mean amount of tread remaining is 0.30 inches?
Answer: [0.2556, 0.3844]

Exercise 2.7: Resin-based composites are used in restorative dentistry. A study compared the surface hardness of specimens cured for 40 seconds with constant power with that of specimens cured for 40 seconds with exponentially increasing power. 15 specimens were cured with each method. Those cured with constant power had an average surface hardness (in N/mm) of 400.9 with a standard deviation of 10.6. Those cured with exponentially increasing power had an average surface hardness of 367.2 with a standard deviation of 6.1. Find a 98% confidence interval for the difference in mean hardness between specimens cured by the two methods.
Answer: [25.7804, 41.6196]

Exercise 2.8: The wedding ceremony for a couple, Jamie and Robbin, will be held in Menara Kuala Lumpur. A survey has been carried out to determine the proportion of people who will come to the ceremony. Of 250 people invited, only 180 agreed to attend the ceremony. Find a 90% confidence interval estimate for the proportion of all people who will attend the ceremony.
Answer: [0.6733, 0.7667]
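Some of the stated answers for proportions and sample sizes can be checked numerically. The sketch below assumes the usual large-sample interval for a proportion, $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, which this transcript does not display but which reproduces the answer to Exercise 2.3, together with the sample-size formulas from the section on error of estimation. All function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value z_{alpha/2}."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)


def proportion_ci(successes, n, confidence):
    """Assumed large-sample interval p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    half_width = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def sample_size_mean(sigma, error, confidence):
    """n = (z * sigma / E)^2, rounded up to a whole number."""
    return ceil((z_critical(confidence) * sigma / error) ** 2)


def sample_size_proportion(p, bound, confidence):
    """n = z^2 * p(1 - p) / B^2, rounded up to a whole number."""
    return ceil(z_critical(confidence) ** 2 * p * (1 - p) / bound ** 2)


# Exercise 2.3: 28 of 70 automobiles exceed the standard, 95% level
lower, upper = proportion_ci(28, 70, 0.95)
print(round(lower, 4), round(upper, 4))  # 0.2852 0.5148

# Exercise 2.5: sigma = 8 cm, E = 1.5 cm, 95% level
print(sample_size_mean(8, 1.5, 0.95))  # 110
```

Rounding the sample size up rather than to the nearest integer keeps the error of estimation at or below the requested bound.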