Oct. 16, 2007   LEC #6   ECON 140A/240A   Interval Estimation and Hypothesis Testing   L. Phillips

I. Introduction

From a simple random sample of voters we obtain the sample proportion supporting a candidate, but we do not know the proportion for the entire population of voters. Can we say that this population proportion lies in a specified interval with some likelihood? Of course the candidate hopes this interval is above 0.5 and that the likelihood is nearly certain.

Recall that our series of monthly rates of return for the UC stock index consisted of values that were independent of one another. From these data we can calculate a sample mean, but we do not know the mean rate of return for the underlying population generating these monthly observations. Once again, can we say the population mean lies in some interval and, if we used this procedure many times, what fraction of the time would we be right?

II. Confidence Intervals for Sample Proportions and Population Proportions

Suppose a simple random sample of 1000 likely California Democratic voters produces a sample proportion of 0.53 who support Hillary. Does the interval for the population proportion lie above 0.50? What fraction of the time, if we used this procedure again and again, would this interval be correct?

We know that proportions are distributed binomially and, since this is a large sample, we can approximate the binomial with the normal distribution. We also know that for the normal distribution approximately 68% of the observations lie within one standard deviation of the mean, and 95% of the observations lie within plus or minus 1.96 standard deviations of the mean, i.e.

$$P\left[-1.96 \le \frac{\hat{p} - p}{\sigma(\hat{p})} \le 1.96\right] \approx 0.95, \tag{1}$$

where $\sigma(\hat{p}) = \sqrt{p(1-p)/n}$.
With some manipulation (multiply the inequality by $\sigma(\hat{p})$ and add $p$ to each part) we can restate Eq. (1) as:

$$P\left[p - 1.96\,\sigma(\hat{p}) < \hat{p} < p + 1.96\,\sigma(\hat{p})\right] \approx 0.95. \tag{2}$$

This means that this sampling procedure, if repeatedly used, would produce sample proportions that lie within plus or minus 1.96 standard deviations of the population proportion 95% of the time. Alternatively, Eq. (1) can be expressed as:

$$P\left[\hat{p} - 1.96\,\sigma(\hat{p}) < p < \hat{p} + 1.96\,\sigma(\hat{p})\right] \approx 0.95, \tag{3}$$

which can be obtained by multiplying the inequality in Eq. (1) by $\sigma(\hat{p})$, subtracting $\hat{p}$, and multiplying by minus one, which reverses the inequality. Eq. (3) provides an interval that will contain the true population proportion 95% of the time. However, since we do not know the population proportion $p$, we use the sample proportion, $\hat{p}$, to calculate the standard deviation:

$$s(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n}, \tag{4}$$

for use in Eq. (3):

$$P\left[\hat{p} - 1.96\,s(\hat{p}) \le p \le \hat{p} + 1.96\,s(\hat{p})\right] \approx 0.95. \tag{5}$$

As a numerical example, take the sample proportion of 0.53 for a sample of 1000. The standard deviation is:

$$s(\hat{p}) = \sqrt{0.53(1-0.53)/1000} = 0.0158. \tag{6}$$

The 95% confidence interval for the population proportion of California voters supporting Hillary is:

$$P[0.499 < p < 0.561] = 0.95. \tag{7}$$

A politician supporting Hillary might be pleased with the interval, which mostly lies above 0.50, but nervous that 5% of the time the population proportion could lie outside of this interval and, in particular, that it could lie below 0.5. What sort of interval for $p$ would have a 99% confidence level?

$$P\left[\hat{p} - 2.58\,s(\hat{p}) < p < \hat{p} + 2.58\,s(\hat{p})\right] \approx 0.99, \tag{8}$$

which calculates to

$$P[0.489 < p < 0.571] \approx 0.99. \tag{9}$$

This wider confidence interval extends further above 0.50 but also further below it, making a politician supporting Hillary nervous. The only way we could do better is with a larger sample size, i.e. with a higher cost to obtain the larger sample of voters.
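The interval arithmetic in Eqs. (4) through (9) can be sketched in a few lines of Python. This is only an illustration under the text's values (sample proportion 0.53, n = 1000, multipliers 1.96 and 2.58); the function name `proportion_interval` and the larger sample sizes in the loop are hypothetical choices made here, not part of the lecture.

```python
from math import sqrt

def proportion_interval(p_hat, n, z):
    """Return (lower, upper) for p_hat +/- z * s(p_hat), as in Eq. (5)."""
    s = sqrt(p_hat * (1.0 - p_hat) / n)  # Eq. (4): estimated standard deviation
    return p_hat - z * s, p_hat + z * s

# Values from the text: p_hat = 0.53, n = 1000, z = 1.96 (95%) and 2.58 (99%).
ci95 = proportion_interval(0.53, 1000, 1.96)
ci99 = proportion_interval(0.53, 1000, 2.58)
print(f"95%: {ci95[0]:.3f} to {ci95[1]:.3f}")
print(f"99%: {ci99[0]:.3f} to {ci99[1]:.3f}")

# Hypothetical larger samples, holding p_hat at 0.53 purely for illustration.
for n in (1000, 4000, 16000):
    lower, upper = proportion_interval(0.53, n, 1.96)
    print(f"n = {n}: {lower:.3f} to {upper:.3f}")
```

The first two printed intervals should agree with Eqs. (7) and (9) up to rounding. The loop shows the 95% interval narrowing as n grows, with the half-width shrinking in proportion to one over the square root of n.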
Even with a larger sample there is no guarantee the sample proportion will be 0.53 or higher. It could be lower, given the random nature of the sample. Conditional on a value for the sample proportion, these formulae could be used to see how the standard deviation and the confidence interval vary with sample size, given the level of confidence chosen.

III. Confidence Intervals for Sample Means and Population Means

Using our example of a sample of twelve monthly rates of return for the UC stock index fund, with sample mean 1.61 and standard deviation 4.04, the variable

$$t = \frac{1.61 - \mu}{4.04/\sqrt{12}}$$

has the t distribution with eleven degrees of freedom. One degree of freedom in this sample of twelve observations has been used in the calculation of the sample mean, which in turn is used to calculate the sample standard deviation,

$$s = \left\{ \sum_j \left[r(j) - \bar{r}\right]^2 / (n-1) \right\}^{1/2}. \tag{10}$$

For eleven degrees of freedom, with a probability of 0.95, the population mean, $\mu$, falls in the interval:

$$P\left[\bar{r} - t_{0.025}\,(4.04/\sqrt{12}) < \mu < \bar{r} + t_{0.025}\,(4.04/\sqrt{12})\right] \approx 0.95. \tag{11}$$

For eleven degrees of freedom, $t_{0.025}$ is 2.20, i.e. 2.5 percent of the distribution lies above t = 2.2 and 2.5 percent of the distribution lies below t = -2.2. For our sample mean, $\bar{r} = 1.61$, the 95% confidence interval for the population mean is:

$$P[-0.96 < \mu < 4.18] = 0.95. \tag{12}$$

This interval for the mean monthly rate of return for the population is quite broad, and a larger sample would likely help. Compare the value of t = 2.2 needed for a 95% confidence interval with eleven degrees of freedom to the value of z = 1.96 needed for a 95% confidence interval from the normal distribution. Ignorance has its price: not knowing the variance of the population of monthly returns forces us to calculate the sample standard deviation and to use the t distribution, which widens the interval.

IV. Hypothesis Tests for Proportions

There are four steps to statistically testing a hypothesis.
The first step is to formulate all of the hypotheses, both the null (or maintained) hypothesis and the alternative hypothesis. The second step is to identify a test statistic that will assess the evidence against the null hypothesis. The third step is a probability statement that answers the question: if the null hypothesis were true, then what is the probability of observing a test statistic at least as extreme as the one observed? The fourth step is to compare this probability to some chosen critical level of significance, say 5%. This critical level, i.e. a willingness to bear a probability as high as 5% of rejecting the null hypothesis even if it were true, is designated $\alpha$.

An example of a hypothesis test can be formulated from our hypothetical sample of one thousand California voters, 530 of whom support Ms. Clinton. The null hypothesis is that the true population proportion is 0.5, i.e.

$$H_0: p = 0.5, \tag{13}$$

while the alternative hypothesis is that the true population proportion, $p$, lies below 0.5,

$$H_a: p < 0.5. \tag{14}$$

The test statistic, z, is the sample proportion, $\hat{p}$, minus its expected value under the null hypothesis, i.e. the population proportion, $p$, divided by the standard deviation of the sample proportion, $\sigma(\hat{p}) = \sqrt{p(1-p)/n}$:

$$z = \frac{\hat{p} - p}{\sigma(\hat{p})} = \frac{0.53 - 0.5}{\sqrt{(0.5)(0.5)/1000}} = \frac{0.03}{0.0158} = 1.90. \tag{15}$$

Using the normal distribution, the probability of getting a value of z greater than 1.90 is 0.029. The alternative hypothesis, however, is p < 0.5, so evidence against the null would be a value of z far in the lower tail. The probability of a z at or below 1.90 is 1 - 0.029 = 0.971, far larger than the significance criterion of 5%, so we do not reject the null hypothesis of p = 0.5 in favor of p < 0.5, and the camp supporting Hillary would be happy.

V. Hypothesis Test for a Sample Mean

Although we know that the interval for the mean monthly rate of return for the population includes zero for our sample of 12 monthly returns for the UC stock index fund, for practice we can test the hypothesis that the population mean is zero, i.e.

$$H_0: \mu = 0, \tag{16}$$

against the alternative that the mean differs from zero,

$$H_a: \mu \ne 0. \tag{17}$$

The test statistic is the t-value, equal to the sample mean, $\bar{r}$, minus the population mean under the null hypothesis, $\mu$, divided by the estimate of the standard deviation of the sample mean, $s/\sqrt{n}$:

$$t = \frac{\bar{r} - \mu}{s/\sqrt{n}} = \frac{1.61 - 0}{4.04/\sqrt{12}} = \frac{1.61}{1.166} = 1.38. \tag{18}$$

If we choose a critical level of 5%, for 11 degrees of freedom, $t_{0.025} = 2.2$. Since our observed t statistic is only 1.38, we cannot reject the null hypothesis that the mean monthly rate of return is zero. This does not jibe with our expectation that you can make money with stock index funds. Perhaps a larger sample would decrease the standard deviation of $\bar{r}$ and provide more precision for this test.

VI. Decision Theory

Life is full of tradeoffs, and hypothesis testing is another example. Consider the null hypothesis that the proportion of the population of voters supporting Hillary is 0.5 (more than 0.5 would have an equivalent consequence for the supporters' aspirations to carry California with a majority for this candidate) versus the alternative hypothesis that this proportion is less than 0.5. We do not know the true state of affairs before the election is held, which is what we are trying to guess, but consider two possibilities. One possibility is that the proportion is 0.5 (or more). The second possibility is that the proportion is less than 0.5. Table 1 cross-classifies four possible outcomes, depicting our decision to accept or reject the null hypothesis versus the true (but unknown to us) state of affairs (state of nature).
Our decision has two choices: accept or reject the null hypothesis. If we accept the null and it happens the null is true (a fact as yet unknown to us), then no problem. If we accept the null and the true state of affairs is that the null is false, then we make what is called a type II error. So there are two possible errors: accept the null when it is false, or reject the null when it is true. This latter error is called a type I error.

Table 1. Decision Theory and Two Types of Error

                        True State
Decision          p = 0.5          p < 0.5
Accept null       No Error         Type II Error
Reject null       Type I Error     No Error

Recall that step three of our hypothesis test procedure answered the question, "if the null were true, then what is the probability of observing a test statistic at least as extreme as the one observed?" The fourth step was to choose a critical significance level $\alpha$, such as one percent, to compare to this probability from step three. So steps three and four focus on a type I error, rejecting the null when it is true. We reject the null only if the test statistic, for example a z value or a t value, would have such an extreme value by chance only one percent of the time or less.

We could set a significance level of one tenth of one percent. Then the likelihood of getting a test statistic beyond the critical level by chance would only be one in a thousand. We are less likely to reject the null if the null is true, and so we are less likely to make a type I error. But if we reduce the type I error, what is happening to the type II error? If we are less likely to reject the null, it follows that we are more likely to accept the null, and the probability of accepting the null when the null is false increases. That is, we are more likely to make a type II error. Therein lies the tradeoff.
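The tradeoff can be made concrete with a rough numerical sketch. It assumes the lower-tail test of Table 1 (null p = 0.5 against p < 0.5) with the sample of 1000 voters, and it assumes a hypothetical true proportion of 0.46, a value chosen here only for illustration. The function name `type_two_error` is likewise an invention of this sketch.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def type_two_error(alpha, p_null=0.5, p_true=0.46, n=1000):
    """Return (critical z, probability of a type II error).

    Lower-tail test of p = p_null against p < p_null. The value of
    p_true is hypothetical, chosen only for illustration.
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    # Reject the null when the sample proportion falls below this cutoff.
    cutoff = p_null - z_crit * sqrt(p_null * (1.0 - p_null) / n)
    # Accepting the null means the sample proportion lands at or above the
    # cutoff even though the true proportion is p_true.
    z_true = (cutoff - p_true) / sqrt(p_true * (1.0 - p_true) / n)
    return z_crit, 1.0 - STD_NORMAL.cdf(z_true)

for alpha in (0.05, 0.01, 0.001):
    z_crit, prob = type_two_error(alpha)
    print(f"alpha = {alpha}: critical z = {z_crit:.2f}, P(type II error) = {prob:.2f}")
```

Under these assumptions the printed type II probabilities should rise as the significance level falls from 5% to 1% to 0.1%, which is the tradeoff described above.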
The probability of making a type II error, i.e. accepting the null when the alternative hypothesis is true, is designated $\beta$. The power of a test is defined as $1 - \beta$. It is the probability of rejecting the null hypothesis when the null is false. Referring to Table 1, there is no error in rejecting the null if the null is false. The operating characteristic curve of a test is a plot of the probability of making a type II error, $\beta$, as it depends on the value of the parameter that is the focus of the null hypothesis, for example the proportion of the population of California voters, $p$. The power function curve is a plot of the power of the test, $1 - \beta$, as it varies with this parameter that describes the true state of nature. An ideal power function is zero if the parameter determining the distribution corresponds to the null hypothesis and is one for the values of the parameter that correspond to the alternative hypothesis.

As an example, consider some what-if scenarios. Suppose the true proportion of the population of California voters supporting a proposition on the November ballot is 0.5 and we are in a camp supporting this proposition. The null hypothesis is that $p = 0.4999$ (or less) and the alternative hypothesis is that $p \ge 0.5$. We decide we can bear one chance in a hundred or less of making a type I error, i.e. $\alpha = 0.01$. From the normal distribution, the value of z beyond which we would reject the null even if true is 2.33, as illustrated in Figure 1.
[Figure 1: Probability of a Type I Error Equal to 1%. Normal density for p = 0.5 plotted against Z, with the critical value z(alpha = 1%) = 2.33 marked.]

From this value of z, as determined by our judgement of the critical level for the type I error, we can solve for $\hat{p}$:

$$z = 2.33 = \frac{\hat{p} - p}{\sigma(\hat{p})} = \frac{\hat{p} - 0.5}{\sqrt{(0.5)(0.5)/1000}} = \frac{\hat{p} - 0.5}{0.0158}, \tag{20}$$

so we would have a decision rule to reject the null if the sample proportion, $\hat{p}$, was greater than 0.5368, i.e. if 537 or more voters in the sample of 1000 support this proposition.

But what about the type II error? Pursuing our what-if scenario, suppose the true value of the population proportion is 0.54, and we use our decision rule to accept the null if the sample proportion is below 0.5368. What is $\beta$, the probability of accepting the null if the null is false? This situation is illustrated in Figure 2. The value of z used to calculate $\beta$ if the true population proportion is 0.54, with the threshold rounded down to 0.536, is

$$z = \frac{0.536 - p}{\sqrt{p(1-p)/1000}} = \frac{0.536 - 0.540}{\sqrt{(0.54)(0.46)/1000}} = \frac{-0.004}{0.0158} = -0.253. \tag{21}$$

[Figure 2: The Probability of a Type II Error = 40%. Densities for p = 0.50 and p = 0.54 plotted against the number of voters in the sample (440 to 560; axis labeled "Voters Supporting Davis"), with alpha = 1% and beta = 40% marked. Decision rule: reject the null if voters > 536.]

Using the cumulative distribution function of the normal distribution,

$$F(z) = F(-0.253) = 0.40, \tag{22}$$

i.e. the probability of a type II error, $\beta$, is 40% if p = 0.54. The values of $\beta$ calculated for various what-if scenarios of the true proportion of California voters supporting this proposition, $p$, are listed in Table 2. A plot of $\beta$ versus $p$ is called the operating characteristic curve and is shown in Figure 3. The corresponding plot of $1 - \beta$ versus $p$ is called the power function and is shown in Figure 4.

Table 2: Probability of Type II Error Versus Population Proportion

True Population
Proportion, p      z value*     F(z) = beta     1 - beta
0.51                 1.64          0.950          0.050
0.52                 1.01          0.844          0.156
0.53                 0.38          0.648          0.352
0.54                -0.25          0.400          0.600
0.55                -0.89          0.187          0.813
0.56                -1.53          0.063          0.937
0.57                -2.17          0.015          0.985
0.58                -2.82          0.002          0.998
0.59                -3.47          0.000          1.000
0.60                -4.13          0.000          1.000

* z = (0.536 - p)/sqrt[p*(1-p)/1000]

[Figure 3: Operating Characteristic Curve. Beta (0 to 1) plotted against the presumed population proportion, p (0.50 to 0.61).]

[Figure 4: Power Function of the Test. 1 - beta (0 to 1) plotted against the supposed population proportion, p (0.50 to 0.61), with the ideal power function shown for comparison.]
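The decision threshold of Eq. (20) and the entries of Table 2 can be recomputed with a short Python sketch. It assumes the normal approximation used throughout and the rounded cutoff of 0.536 from the table's footnote. The names `beta_for`, `CUTOFF` and `N` are hypothetical labels introduced here.

```python
from math import sqrt
from statistics import NormalDist

N = 1000        # sample size used throughout the example
CUTOFF = 0.536  # rounded decision threshold used in Eq. (21) and Table 2

def beta_for(p, cutoff=CUTOFF, n=N):
    """Return (z, beta) when the true population proportion is p."""
    z = (cutoff - p) / sqrt(p * (1.0 - p) / n)  # footnote formula of Table 2
    return z, NormalDist().cdf(z)               # F(z) = beta

# Threshold implied by alpha = 1% under the null p = 0.5, as in Eq. (20).
z_alpha = NormalDist().inv_cdf(0.99)
threshold = 0.5 + z_alpha * sqrt(0.5 * 0.5 / N)
print(f"z(alpha=1%) = {z_alpha:.2f}, reject if p_hat > {threshold:.4f}")

print("p      z       beta   1-beta")
for k in range(51, 61):
    p = k / 100
    z, beta = beta_for(p)
    print(f"{p:.2f}  {z:6.2f}  {beta:.3f}  {1.0 - beta:.3f}")
```

The printed rows should agree with Table 2 up to rounding. Plotting the beta column against p would trace the operating characteristic curve, and the last column the power function.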