Ch9. Inferences Concerning Proportions

Outline
* Estimation of proportions
* Hypothesis concerning one proportion
* Hypothesis concerning several proportions
* Analysis of r*c tables
* Goodness of fit

9.1 Estimation of proportions

In acceptance sampling we are concerned with the proportion of defectives in a lot, and in life testing we are concerned with the percentage of certain components which will perform satisfactorily during a stated period of time. It should be clear from these examples that problems concerning proportions, percentages, or probabilities are really equivalent.

Estimation

The point estimator of the population proportion is usually the sample proportion $X/n$. If the $n$ trials satisfy the assumptions underlying the binomial distribution (p. 105), the mean and the standard deviation of the number of successes are given by

$$np \quad\text{and}\quad \sqrt{np(1-p)}$$

Estimator

The mean and the standard deviation of the proportion of successes (namely, of the sample proportion) are given by

$$\frac{np}{n} = p \quad\text{and}\quad \frac{\sqrt{np(1-p)}}{n} = \sqrt{\frac{p(1-p)}{n}}$$

The first of these results shows that the sample proportion is an unbiased estimator of the binomial parameter $p$.

Confidence interval

Construction of a confidence interval for the binomial parameter $p$. We first define $x_0$ as the largest integer and $x_1$ as the smallest integer such that

$$\sum_{k=0}^{x_0} b(k; n, p) \le \alpha/2 \quad\text{and}\quad \sum_{k=x_1}^{n} b(k; n, p) \le \alpha/2$$

Thus, we can assert with a probability of approximately $1-\alpha$, and at least $1-\alpha$, that the inequality

$$x_0(p) < x < x_1(p)$$

is satisfied.

EX
Suppose we want to find approximate 95% confidence intervals for $p$ for samples of size $n = 20$. The values $x_0$ and $x_1$ are determined by

$$B(x_0; n, p) \le 0.025 \quad\text{and}\quad 1 - B(x_1 - 1; n, p) \le 0.025$$

p     0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9
x0    -    0    1    3    5    7    9    11   14
x1    6    9    11   13   15   17   19   20   -

A dash means that no value satisfies the condition.

Confidence interval for large n

When $n$ is large, we construct approximate confidence intervals for the binomial parameter $p$ by using the normal approximation to the binomial distribution. With probability approximately $1-\alpha$,

$$-z_{\alpha/2} < \frac{X - np}{\sqrt{np(1-p)}} < z_{\alpha/2}$$

Replacing $p$ under the square root by the estimate $x/n$, this yields

$$\frac{x}{n} - z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} < p < \frac{x}{n} + z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}}$$

EX
If $x = 36$ of $n = 100$ persons interviewed are familiar with the tax incentives for installing certain energy-saving devices, construct a 95% confidence interval for the corresponding true proportion.

Solution: $x/n = 36/100 = 0.36$, hence

$$0.36 - 1.96\sqrt{\frac{0.36(1-0.36)}{100}} < p < 0.36 + 1.96\sqrt{\frac{0.36(1-0.36)}{100}}$$

$$0.266 < p < 0.454$$

Maximum error

The error when we use $X/n$ as an estimator of $p$ is given by $|X/n - p|$. Again using the normal distribution, we can assert with probability approximately $1-\alpha$ that the inequality

$$\left|\frac{X}{n} - p\right| \le z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$

is satisfied.

Maximum error of estimate

$$E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$

EX
In a sample survey conducted in a large city, 136 of 400 persons answered yes to the question of whether their city's public transportation is adequate. With 99% confidence, what can we say about the maximum error if $x/n = 136/400 = 0.34$ is used as an estimate of the corresponding true proportion?

Substituting $x/n = 0.34$ for $p$:

$$E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} = 2.575\sqrt{\frac{(0.34)(0.66)}{400}} = 0.061$$

Sample size determination

If $p$ is known:

$$n = p(1-p)\left(\frac{z_{\alpha/2}}{E}\right)^2$$

If $p$ is unknown:

$$n = \frac{1}{4}\left(\frac{z_{\alpha/2}}{E}\right)^2$$

The second formula uses the fact that $p(1-p)$ is at most $1/4$.

9.2 Hypothesis concerning one proportion

The test of the null hypothesis that a proportion equals some specified constant is widely used in sampling inspection, quality control, and reliability verification.

Statistic for large-sample test concerning p

Null hypothesis: $p = p_0$

$$Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}}$$

Critical regions for testing $p = p_0$ (large sample)

Alternative hypothesis    Reject null hypothesis if
p < p0                    Z < -z_alpha
p > p0                    Z > z_alpha
p != p0                   Z < -z_{alpha/2} or Z > z_{alpha/2}

EX
In a study designed to investigate whether certain detonators used with explosives in coal mining meet the requirement that at least 90% will ignite the explosive when charged, it is found that 174 of 200 detonators function properly. Test the null hypothesis $p = 0.9$ against the alternative $p < 0.9$ at the 0.05 level of significance.

Solution
1. Null hypothesis: $p = 0.9$. Alternative hypothesis: $p < 0.9$.
2. Level of significance: $\alpha = 0.05$.
3. Criterion: reject the null hypothesis if $Z < -1.645$.
4. Calculation:
   $$Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}} = \frac{174 - 200(0.9)}{\sqrt{200(0.9)(0.1)}} = -1.41$$
5. Since $-1.41$ is not less than $-1.645$, the null hypothesis cannot be rejected.
6. P-value: 0.079 > level of significance 0.05.

9.3 Hypothesis concerning several proportions

Tests of this kind arise when we compare the consumer response to two different products, or when we decide whether the proportion of defectives of a given process remains constant from day to day.

Testing $p_1 = p_2 = \cdots = p_k = p$

Large-sample test

We require independent random samples of sizes $n_1, n_2, \ldots, n_k$. If the corresponding numbers of successes are $X_1, X_2, \ldots, X_k$, the test we use is based on the following facts:

1) For large samples, the sampling distribution of
   $$Z_i = \frac{X_i - n_i p_i}{\sqrt{n_i p_i(1-p_i)}}$$
   is approximately the standard normal distribution.
2) The square of a random variable having the standard normal distribution is a random variable having the chi-square distribution with 1 degree of freedom.
3) The sum of $k$ independent random variables having chi-square distributions with 1 degree of freedom is a random variable having the chi-square distribution with $k$ degrees of freedom.

(Proofs are not required.)

Cont.

$$\chi^2 = \sum_{i=1}^{k} \frac{(x_i - n_i p_i)^2}{n_i p_i(1-p_i)}$$

is a value of a random variable having approximately the chi-square distribution with $k$ degrees of freedom. In practice, we substitute for the $p_i$, which under the null hypothesis are all equal, the pooled estimate

$$\hat{p} = \frac{x_1 + x_2 + \cdots + x_k}{n_1 + n_2 + \cdots + n_k}$$

The null hypothesis should be rejected if the differences between the $x_i$ and the $n_i\hat{p}$ are large. The critical region is

$$\chi^2 > \chi^2_{\alpha}$$

where the number of degrees of freedom is $k-1$ (one degree of freedom is lost because $p$ is estimated from the data).

Another approach

             Sample 1    ...    Sample k    Total
Successes    x1          ...    xk          x
Failures     n1 - x1     ...    nk - xk     n - x
Total        n1          ...    nk          n

Define the observed cell frequencies $o_{ij}$, $i = 1, 2$, $j = 1, \ldots, k$. The expected numbers of successes and failures for the $j$-th sample are estimated by

$$e_{1j} = n_j\hat{p} \quad\text{and}\quad e_{2j} = n_j(1-\hat{p})$$

and thus

$$\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{k} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$$

EX
Samples of three kinds of materials, subjected to extreme temperature changes, produced the results shown in the following table.

             A      B      C      Total
Successes    41     27     22     90
Failures     79     53     78     210
Total        120    80     100    300

Use the 0.05 level of significance to test whether the probability of crumbling is the same for the three kinds of materials.

Solution
1. Null hypothesis: $p_1 = p_2 = p_3$. Alternative hypothesis: $p_1, p_2, p_3$ are not all equal.
2. Level of significance: $\alpha = 0.05$.
3. Criterion: reject the null hypothesis if $\chi^2 > 5.991$, the value of $\chi^2_{0.05}$ for 2 degrees of freedom.
4. Calculation:
   $$\hat{p} = \frac{90}{300} = \frac{3}{10}$$
   $$e_{11} = \frac{90 \cdot 120}{300} = 36, \quad e_{12} = \frac{90 \cdot 80}{300} = 24, \quad e_{13} = 30$$
   The expected failures follow by subtraction from the column totals: $e_{21} = 84$, $e_{22} = 56$, $e_{23} = 70$.
   $$\chi^2 = 4.575$$
5. Since $4.575$ does not exceed $5.991$, the null hypothesis cannot be rejected.

Statistic for test concerning the difference between two proportions

$$Z = \frac{\dfrac{X_1}{n_1} - \dfrac{X_2}{n_2}}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$$

For large samples, $Z$ is a random variable having approximately the standard normal distribution.

Confidence interval for $p_1 - p_2$

$$\frac{x_1}{n_1} - \frac{x_2}{n_2} \pm z_{\alpha/2}\sqrt{\frac{\frac{x_1}{n_1}\left(1-\frac{x_1}{n_1}\right)}{n_1} + \frac{\frac{x_2}{n_2}\left(1-\frac{x_2}{n_2}\right)}{n_2}}$$

9.4 Analysis of r*c tables

The key random variable

$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$$

has approximately the chi-square distribution with $(r-1)(c-1)$ degrees of freedom.

9.5 Goodness of fit

Goodness of fit: we compare an observed frequency distribution with the corresponding values of an expected, or theoretical, distribution.

$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$$

is a value of a random variable having approximately the chi-square distribution with $k-m$ degrees of freedom, where $k$ is the number of terms in the formula and $m$ is the number of quantities, obtained from the observed data, that are needed to calculate the expected frequencies.

EX Page 312~313.
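The worked examples in this chapter can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names (`proportion_ci`, `max_error`, `one_proportion_z`, `chi_square_proportions`, `normal_cdf`) are hypothetical helpers introduced here for illustration and are not part of the original slides. The critical values 1.96, 2.575, 1.645 and 5.991 are taken from the examples rather than computed.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function, via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_ci(x, n, z):
    """Large-sample confidence interval for p: x/n +/- z*sqrt((x/n)(1-x/n)/n)."""
    p_hat = x / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def max_error(p, n, z):
    """Maximum error of estimate E = z*sqrt(p(1-p)/n)."""
    return z * math.sqrt(p * (1.0 - p) / n)


def one_proportion_z(x, n, p0):
    """Large-sample test statistic Z = (x - n*p0) / sqrt(n*p0*(1-p0))."""
    return (x - n * p0) / math.sqrt(n * p0 * (1.0 - p0))


def chi_square_proportions(successes, sizes):
    """Chi-square statistic for H0: p1 = ... = pk, using the pooled estimate.

    Sums (o - e)^2 / e over both rows (successes and failures) of the 2-by-k table.
    """
    p_hat = sum(successes) / sum(sizes)
    chi2 = 0.0
    for x, n in zip(successes, sizes):
        cells = ((x, n * p_hat), (n - x, n * (1.0 - p_hat)))
        for observed, expected in cells:
            chi2 += (observed - expected) ** 2 / expected
    return chi2


if __name__ == "__main__":
    # Tax-incentive example: the slides report 0.266 < p < 0.454.
    low, high = proportion_ci(36, 100, 1.96)
    print(f"95% CI: {low:.3f} < p < {high:.3f}")

    # Public-transportation example: the slides report E = 0.061.
    print(f"Maximum error: {max_error(0.34, 400, 2.575):.3f}")

    # Detonator example: the slides report Z = -1.41 and a P-value of 0.079.
    z = one_proportion_z(174, 200, 0.9)
    print(f"Z = {z:.2f}, one-sided P-value = {normal_cdf(z):.3f}")

    # Materials example: the slides report chi-square = 4.575 (critical value 5.991).
    chi2 = chi_square_proportions([41, 27, 22], [120, 80, 100])
    print(f"chi-square = {chi2:.3f}, reject H0: {chi2 > 5.991}")
```

The P-value here is computed from the unrounded statistic, so it can differ in later decimal places from a value looked up in a table with Z rounded to two places.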