http://xkcd.com/539/

Hypothesis Testing and Statistical Significance

Estimators and Correlation
Hypothesis Testing and Statistical Significance
Labels and other questions

General definition, continuous and discrete variables:
$E[X] = \int x \, f(x) \, dx$ (where $f(x)$ is the probability density function for $X$)
For discrete variables:
$E[X] = \sum_i x_i f(x_i) = \sum_i x_i P(X = x_i) = \frac{1}{n} \sum_i x_i = \bar{X}$ (when $P(x_1) = P(x_2) = \dots = P(x_n) = \frac{1}{n}$)

What is an estimator?
Often a trade-off between bias and variance
$\operatorname{bias}(\hat{\theta}) = E[\hat{\theta} - \theta] = E[\hat{\theta}] - \theta$

Variance defined:
$\operatorname{Var}(X) = E[(X - \mu)^2]$
Population variance (have all obs 1…N):
$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$
Two estimators of population variance:
$s_n^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$ vs. $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
(Typeset equations courtesy http://en.wikipedia.org/wiki/Variance)

Population covariance and correlation:
$\sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)] = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)$
$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
Sample estimates:
$\hat{\sigma}_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$
$r_{xy} = \frac{\hat{\sigma}_{xy}}{s_x s_y} = \frac{1}{n-1} \sum_{i=1}^{n} \frac{(x_i - \bar{x})}{s_x} \frac{(y_i - \bar{y})}{s_y} = \frac{1}{n-1} \sum_{i=1}^{n} z_{x_i} z_{y_i}$
$r_{xy} = \frac{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2}}$

Standard Deviation
▪ Spread of a list
▪ Single variables have SD
Standard Error
▪ Spread of a chance process
▪ Sampling distributions have SE
(Graphics: Wikipedia)

Remember that a z-score tells us where a score is located within a distribution: specifically, how many standard deviation units the score is above or below the mean.
$z = \frac{Y - \mu}{\sigma}$

For example, if we find a particular difference that is x standard errors wide, how confident are we that the difference is not just due to chance?
So we can use z-scores on a normal curve to interpret how likely a given outcome is (how likely is it to be due to chance?).

Example: you have a variable x with a mean of 500 and an S.D. of 15. How common is a score of 525?
$z = \frac{525 - 500}{15} = 1.67$
If we look up the z-statistic of 1.67 in a z-score table, we find that the proportion of scores less than our value is .9525.
Or, a score of 525 exceeds .9525 of the population.
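The worked z-score example above can be checked numerically. What follows is a minimal Python sketch, assuming the variable is normally distributed. The helper names `z_score` and `normal_cdf` are illustrative and do not come from the slides. The standard library's error function stands in for a printed z-table.

```python
import math


def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies above (+) or below (-) the mean."""
    return (value - mean) / sd


def normal_cdf(z):
    """Proportion of a standard normal distribution that falls below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Numbers from the example: mean 500, S.D. 15, observed score 525.
z = z_score(525, 500, 15)

# A printed z-table is indexed by z rounded to two decimals, so round first
# to mirror the table lookup described in the text.
z_table = round(z, 2)
below = normal_cdf(z_table)  # proportion of scores below 525
above = 1.0 - below          # proportion of scores at or above 525

print(f"z = {z_table:.2f}")
print(f"proportion below = {below:.4f}")
print(f"proportion above = {above:.4f}")
```

Rounding z before the lookup reproduces the table value. Using the unrounded z (about 1.667) gives a marginally smaller proportion below.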
For the score of 525 above, the proportion of scores at or above it is 1 − .9525 = .0475 (p < .05).

z is a test statistic
More generally:
$z = \frac{Y - \mu}{\sigma}$
$z = \frac{\text{observed} - \text{expected}}{SE}$
z tells us how many standard errors an observed value is from its expected value.

▪ A confidence interval is a range of scores above and below the mean.
▪ The width of the interval is measured in standard errors.
▪ It is the interval where we expect the true value to be.
▪ A confidence coefficient is the likelihood that a given interval contains the true value of the parameter.
▪ Sample value = true population value + error

One-tailed
▪ Directional hypothesis
▪ Probability at one end of the curve
Two-tailed
▪ Non-directional hypothesis
▪ Probability at both ends of the curve

One-tailed (directional):
Null Hypothesis: H0: μ1 = μc
▪ μ1 is the intervention population mean
▪ μc is the control population mean
Alternative Hypotheses: H1: μ1 < μc or H1: μ1 > μc

Two-tailed (non-directional):
Null Hypothesis: H0: μ1 = μc
▪ μ1 is the intervention population mean
▪ μc is the control population mean
Alternative Hypothesis: H1: μ1 ≠ μc

Do Berkeley students read more or less than 8 hours a week?
H0: μ = 8 (The mean for Berkeley students is equal to 8)
H1: μ ≠ 8 (The mean for Berkeley students is not equal to 8)

Do Berkeley students read more than 8 hours a week (the average for students across the country)?
H0: μ = 8 (There is no difference between Berkeley students and other students)
H1: μ > 8 (The mean for Berkeley students is higher than the mean for all students)

A p-value is the observed significance level (more on this in a moment): the chance of getting a test statistic at least as extreme as the one observed.
A test statistic depends on the data, as does p.
This chance assumes that the null hypothesis is correct.
Thus, the smaller the chance (p-value), the more likely it is that the null can be rejected.
The choice of a test statistic (e.g., z, t, F, χ²) depends on the model and the hypothesis being considered.
The basic process is exactly the same, however.
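To make the testing steps above concrete, here is a minimal Python sketch of a one-sample z-test for the Berkeley reading question. The sample figures (a mean of 8.6 hours, a known population S.D. of 2.5, n = 100) are hypothetical values chosen for illustration, not data from the slides, and the function names are likewise assumptions. The sketch computes z = (observed − expected) / SE, the one-tailed and two-tailed p-values, and a 95% confidence interval around the sample mean.

```python
import math


def normal_cdf(z):
    """Proportion of a standard normal distribution that falls below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_test(sample_mean, null_mean, sd, n):
    """Return (z, SE, one-tailed upper p, two-tailed p) for a one-sample z-test."""
    se = sd / math.sqrt(n)                     # standard error of the sample mean
    z = (sample_mean - null_mean) / se         # (observed - expected) / SE
    p_upper = 1.0 - normal_cdf(z)              # H1: mu > null_mean (one-tailed)
    p_two = 2.0 * (1.0 - normal_cdf(abs(z)))   # H1: mu != null_mean (two-tailed)
    return z, se, p_upper, p_two


# HYPOTHETICAL sample: these numbers are made up for the illustration.
sample_mean, null_mean, sd, n = 8.6, 8.0, 2.5, 100

z, se, p_upper, p_two = z_test(sample_mean, null_mean, sd, n)

# 95% confidence interval: sample mean plus or minus 1.96 standard errors.
ci_low = sample_mean - 1.96 * se
ci_high = sample_mean + 1.96 * se

print(f"z = {z:.2f}, SE = {se:.2f}")
print(f"one-tailed p = {p_upper:.4f}, two-tailed p = {p_two:.4f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With these made-up numbers the two-tailed p-value is exactly twice the one-tailed value, which is why a directional hypothesis reaches a given significance level more easily when the data fall in the predicted direction.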
When p-value > .10 → the observed difference is “not significant”
When p-value ≤ .10 → the observed difference is “marginally significant” or “borderline significant”
When p-value ≤ .05 → the observed difference is “significant”
When p-value ≤ .01 → the observed difference is “highly significant”

We cannot hypothesize the null.
As odd as it may seem at first, we reject or do not reject the null; a traditional hypothesis test tests against the null.
We never use the word “proof” with hypothesis testing and statistics; we reject or fail to reject.
“Prove” has a specific meaning in mathematics and philosophy, but the term is misleading in statistics.

Type I Error: falsely rejecting a null hypothesis (false positive)
Type II Error: failing to reject the null hypothesis when it is false (false negative)

(The auto data)
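The verbal labels for p-values listed earlier amount to a simple threshold lookup. A minimal Python sketch follows. The function name `significance_label` is an assumption for illustration, while the cutoffs and wording are taken from that list. The strictest threshold is checked first so that each p-value receives only its strongest label.

```python
def significance_label(p_value):
    """Map a p-value to the conventional verbal label for the observed difference."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value <= 0.01:
        return "highly significant"
    if p_value <= 0.05:
        return "significant"
    if p_value <= 0.10:
        return "marginally significant"
    return "not significant"


# A few illustrative p-values, including .0475 from the score-of-525 example.
for p in (0.20, 0.08, 0.0475, 0.004):
    print(f"p = {p:.4f}: {significance_label(p)}")
```

These labels are conventions rather than sharp boundaries: a p-value of .049 and one of .051 carry nearly the same evidence against the null.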