Hypothesis Testing

Basic Problem: We are interested in deciding whether some data credits or discredits some "hypothesis" (often a statement about the value of a parameter or the relationship among parameters).

Suppose we consider the value of μ = (true) average lifetime of some battery of a certain cell size and for a specified usage, and hypothesize:

H0 : μ = 160
H1 : μ ≠ 160

This would usually involve a scenario in which either (1) 160 is a "standard," or (2) the previous value was 160 (has there been a change?), or something analogous.

H0 is called the "Null Hypothesis."
H1 is called the "Alternate Hypothesis."

We must make a decision whether to ACCEPT H0 or REJECT H0. (ACCEPT H0 is the same as REJECT H1; REJECT H0 is the same as ACCEPT H1.) We decide by looking at X̄ from a sample of size n.

Basic Logic (for H0 : μ = 160 vs. H1 : μ ≠ 160):
(1) Assume (for the moment) that H0 is true.
(2) Find the probability that X̄ would be "that far away" from 160 if, indeed, H0 is true.
(3) If that probability is small, we reject H0; if it isn't too small, we give the benefit of the doubt to H0 and accept H0.

BUT: what's "too small" or "not too small"? Essentially, you decide! You pick (somewhat arbitrarily) a value α, usually .01 ≤ α ≤ .10, and most often α = .05, called the SIGNIFICANCE LEVEL.

If the probability of getting "as far away" from the value alleged by H0 as we indeed got is greater than or equal to α, we say "the chance of getting what we got isn't that small, and the difference could well be due to sampling error; hence, we accept H0 (or, at least, do not reject H0)."

If the probability is < α, we say that the chance of getting the result we got is too small (beyond a reasonable doubt) to have been simply "sampling error," and hence we REJECT H0.

Example: Suppose we want to decide whether a coin is fair. We flip it 100 times.

H0 : p = 1/2 (coin is fair)
H1 : p ≠ 1/2 (coin is not fair)

Let X = number of heads.

Case 1) X = 49: Perfectly consistent with H0; this could easily happen if p = 1/2. ACCEPT H0.
Case 2) X = 81: Are you kidding?
If p = 1/2, the chance of getting what we got is about one in a billion! REJECT H0.
Case 3) X = 60: NOT CLEAR!

What is the chance that, if p = 1/2, we'd get "as much as" 10 away from the ideal (of 50 heads out of 100)? If this chance is < α, reject H0; if this chance is ≥ α, accept H0.

Important logic: H0 gets a huge benefit of the doubt; H1 has the "burden of proof"; we reject H0 only if the results are "overwhelming."

To tie together the value of α chosen and the X̄ values which lead to accepting (or rejecting) H0, we must figure out the probability law of X̄ if H0 is true. Assuming a NORMAL distribution (and the Central Limit Theorem suggests that this is overwhelmingly likely to be true), X̄ is centered at μ = 160.

We can find (using normal distribution tables) a region such that α = the probability of X̄ being outside the region, with α/2 in each tail: α/2 below 150.2 and α/2 above 169.8, centered at μ = 160. (I made up the values of 150.2 and 169.8.)

Note: logic suggests (in this example) a "rejection" region which is 2-sided; in experimental design, most regions are 1-sided.

150.2 ≤ X̄ ≤ 169.8 is called the Acceptance Region (AR).
X̄ < 150.2 or X̄ > 169.8 is called the Critical Region (CR).

Decision Rule: If X̄ is in the AR, accept H0; if X̄ is in the CR, reject H0.

X̄ is called the "TEST STATISTIC" (that function of the data whose value we examine to see whether it's in the AR or the CR). One-sided versions:

ONE-SIDED UPPER TAIL: H0 : μ ≤ 10, H1 : μ > 10; reject H0 if X̄ exceeds the critical value C.
ONE-SIDED LOWER TAIL: H0 : μ ≥ 20, H1 : μ < 20; reject H0 if X̄ falls below the critical value C.

α has another meaning, which in many contexts is important:

              we accept H0                    we reject H0
H0 true       Good! (Correct!)                Type I Error, or "α Error"
H0 false      Type II Error, or "β Error"     Good! (Correct!)

α = probability of Type I error = P(reject H0 | H0 true)
β = probability of Type II error = P(accept H0 | H0 false)

We often preset α. The value of β depends on the specifics of H1 (and most often, in the real world, we don't know those specifics).

EXAMPLE: H0 : μ ≤ 100, H1 : μ > 100. Suppose the critical value is C = 141, so we reject H0 when X̄ > 141. If the true mean is μ = 150, then

β = P(X̄ < 141 | μ = 150) = .3594.

What is β for other true values of μ?
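As a check on the arithmetic, this β can be computed from the normal CDF. A minimal Python sketch, using the standard deviation of 25 for X̄ that the slides attach to these numbers (the function name `beta` is illustrative, not from the slides):

```python
from statistics import NormalDist

# One-sided upper-tail test from the slides:
#   H0: mu <= 100  vs  H1: mu > 100, critical value C = 141,
# so H0 is accepted exactly when X-bar < 141.
C = 141
SD_XBAR = 25  # standard deviation of X-bar, as given in the slides

def beta(true_mu: float) -> float:
    """P(accept H0 | true mean) = P(X-bar < C), with X-bar ~ N(true_mu, SD_XBAR)."""
    return NormalDist(mu=true_mu, sigma=SD_XBAR).cdf(C)

print(round(beta(150), 4))  # 0.3594, matching the slides
```

The same function reproduces the β values the slides go on to list for true means of 160, 170, and 180.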
(These β values correspond to a standard deviation of 25 for X̄.)

μ = 150: β = P(X̄ < 141 | μ = 150) = .3594
μ = 160: β = P(X̄ < 141 | μ = 160) = .2236
μ = 170: β = P(X̄ < 141 | μ = 170) = .1230
μ = 180: β = P(X̄ < 141 | μ = 180) = .0594

In each case, β = P(X̄ < 141 | H0 false).

Note: Had α been preset at .025 (instead of .05), C would have been 149 (and β would be larger); had α been preset at .10, C would have been 132 (and β would be smaller). α and β "trade off."

In ANOVA, we have

H0 : μ1 = μ2 = · · · = μc
H1 : not all (column) means are equal.

The probability law of "Fcalc" in the ANOVA table, assuming H0 is true, is an F distribution with the appropriate degrees-of-freedom values, and the critical value C sits in its upper tail:

Fcalc = MSBcol / MSWerror, where E(MSBcol) = σ² + Vcol and E(MSWerror) = σ².

The larger the ratio Fcalc, the more suggestive that H0 is false. C is the value such that, if Vcol = 0 (all μ's equal), P(Fcalc > C) = α.

Note: What is β here?

β = P(Fcalc < C | the μ's are not all equal, i.e., the level of the factor does matter!)

Answer: It cannot be determined, because we would need an exact specification of the "non-equality." [Hardly ever known, in practice!]

HOWEVER: the fact that we cannot compute the numerical value of β in no way means it doesn't exist! And we can prove that, whatever β is, it still "trades off" with α.
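The closing point, that α and β trade off, can be illustrated numerically for the earlier one-sided test (H0 : μ ≤ 100, standard deviation of X̄ equal to 25, assumed true mean μ = 150). A sketch, not part of the slides:

```python
from statistics import NormalDist

MU0, SD, TRUE_MU = 100, 25, 150   # null mean, sd of X-bar, assumed true mean
Z = NormalDist()                  # standard normal, for critical z-values

for alpha in (0.025, 0.05, 0.10):
    C = MU0 + Z.inv_cdf(1 - alpha) * SD     # critical value for this alpha
    beta = NormalDist(TRUE_MU, SD).cdf(C)   # P(accept H0 | mu = TRUE_MU)
    print(f"alpha = {alpha:5.3f}  ->  C = {C:5.1f}, beta = {beta:.4f}")
```

The three critical values come out to about 149, 141, and 132, matching the slides, and β falls as α rises: shrinking one error probability inflates the other.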