• Today: Quiz 4
• Tomorrow: Lab 3 – SN 4117
• Wed: A3 due
• Friday: Lab 3 due
• Mon Oct 1: Exam I, this room, 12 pm
• Mon Oct 1: No grad seminar

Key concepts so far
• Quantity
• Measurement scale
• Dimensions & Units
• Equations
• Data Equations
  – Sums of squared residuals quantify improvement in fit, compare models
• Quantify uncertainty through frequency distributions
  – Empirical
  – Theoretical
  – 4 forms, 4 uses

Today
• Selected examples from:
• Read lecture notes

Logic of Hypothesis Testing

Reject JUST LUCK Hypothesis
[Figure: anglers A, B, C, D and Izaak Walton; labels "Skill!" and "Just Luck!!"]

Reject JUST LUCK
• Compare the observed outcome to all possible outcomes
• More tractable: restrict to all possible outcomes given that the JUST LUCK hypothesis is true
• Arrangements of 8 fish such that IW catches 7?

Reject JUST LUCK
• Arrangements of 8 fish such that IW catches 7?
• Assign probabilities to each outcome, assuming that the H0 ‘JUST LUCK’ is true
• For each fish, there is a 1 in 5 chance that IW will catch it

Outcome        Probability   Value
IW=8           (1/5)^8       0.00000256
IW=7, A=1      (1/5)^7       0.0000128
IW=7, B=1      (1/5)^7       0.0000128
IW=7, C=1      (1/5)^7       0.0000128
IW=7, D=1      (1/5)^7       0.0000128

p = 0.00005376, i.e. about 5 times in 100,000

Hypothesis Testing
• Set of rules for making decisions in the face of uncertainty
• Logic is inductive: from specific to general
• Structure is binary

3 styles of statistical inference
• Likelihood, frequentist and Bayesian inference
• All based on the principle of maximum likelihood
Definition: a model that makes the data more probable (best predicts the observed data) is said to be more likely to have generated the data

3 styles of statistical inference
Likelihood inference
• ∑ res² = 0.1171
• ∑ res² = 0.0204
• Reduction in squared deviance: ∑ res² = 0.0966
• Which model is more likely to have generated the data?
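The likelihood comparison above can be made concrete with a small sketch. It assumes independent, normally distributed errors (an assumption not stated on the slide), in which case the likelihood ratio of two models fitted to the same n observations is (∑res²_A / ∑res²_B)^(n/2). The sample size `n_assumed` is hypothetical because the slide does not give n, and the function name `likelihood_ratio` is an illustrative choice.

```python
def likelihood_ratio(ss_res_a, ss_res_b, n):
    """Likelihood of model B relative to model A.

    Assumes independent normal errors with the variance estimated by
    maximum likelihood, so that LR = (SSres_A / SSres_B) ** (n / 2).
    """
    if n <= 0 or ss_res_a <= 0 or ss_res_b <= 0:
        raise ValueError("n and both residual sums of squares must be positive")
    return (ss_res_a / ss_res_b) ** (n / 2)


# Residual sums of squares quoted on the slide.
ss_res_a = 0.1171
ss_res_b = 0.0204

# HYPOTHETICAL sample size: the slide does not state n.
n_assumed = 10

# The slide reports 0.0966, presumably computed from unrounded values.
reduction = ss_res_a - ss_res_b
lr = likelihood_ratio(ss_res_a, ss_res_b, n_assumed)

print(f"Reduction in sum of squared residuals: {reduction:.4f}")
print(f"Likelihood ratio (B relative to A) for n = {n_assumed}: {lr:.1f}")
```

A ratio above 1 means the model with the smaller ∑ res² makes the data more probable; how much more depends strongly on n, which is why the sample size has to be known before the ratio is interpreted.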
Frequentist inference
• Use expected distribution of outcomes to calculate a probability

3 styles of statistical inference
Bayesian inference
• Find the probability that a hypothesis is true, given the observed data
• Contrast to: finding the probability of observing the data I observed (or more extreme data), assuming that the null hypothesis is true
• Integrates prior knowledge we have on the system with new observations to make an informed decision

3 styles of statistical inference
Bayesian inference
• e.g.: coin flip. Hypothesis: the coin is biased
• Observe flips: HTHHHTTHHHH

Frequentist approach

Null Hypothesis H0
• H0: just chance
• Research hypothesis (what we really care about) is stated as HA
• So, why work with H0 and not HA?
  – Easier to work out probabilities
  – Permits yes/no decision
• Working with H0 is not intuitive. The logic is backwards because we want to reject H0, not explain how the world functions through H0

Choice of HA
• Start with research hypothesis, then challenge it with H0
• HA/H0 defined with respect to population, not sample
• HA/H0 must be defined prior to analysis
• Choice of HA/H0 determines how we calculate p-value
• HA/H0 pair must be exhaustive
• HA/H0 must be mutually exclusive

Choice of HA
• How do we choose it?
• Often HA = effect, H0 = no effect
• BUT, more informative choices are available:
  G: growth rate of plants. c: control, t: treated with fertilizer
  1..
  2..
  3..
  ‘tails’ ‘scale’

Type I & Type II error
• Type I (α): rejecting H0 when it is true, a ‘false positive’
  e.g. in a trial, accused is innocent but goes to jail (H0: accused is innocent)
• Type II (β): not rejecting H0 when it is false, a ‘false negative’
  e.g. in a trial, accused is guilty but is set free (H0: accused is innocent)

Type I & Type II error
• Type I (α): rejecting H0 when it is true, a ‘false positive’
• Type II (β): not rejecting H0 when it is false, a ‘false negative’

                  Not rejecting H0       Reject H0
H0 True           correct decision       Type I error (α)
H0 False          Type II error (β)      correct decision

Type I & Type II error
[Figure: distribution of outcomes under a true H0; marked region: reject H0 when it is true]

Type I & Type II error
[Figure: distribution of outcomes under a true HA]
• Draw: not rejecting H0 when it is false, i.e. β
• Tradeoff between α and β
• Draw: rejecting H0 when H0 is false, i.e. power

Selected examples from:
• Will present 2 examples (if time allows)
• More examples in lecture notes

Table 7.1 Generic recipe for decision making with statistics
1. State population, conditions for taking sample
2. State the model or measure of pattern
3. State null hypothesis about population
4. State alternative hypothesis
5. State tolerance for Type I error
6. State frequency distribution that gives probability of outcomes when the Null Hypothesis is true. Choices:
   a) Permutations: distributions of all possible outcomes
   b) Empirical distribution obtained by random sampling of all possible outcomes when H0 is true
   c) Cumulative distribution function (cdf) that applies when H0 is true
      State assumptions when using a cdf such as Normal, F, t or chi-square
7. Calculate the statistic. This is the observed outcome
8. Calculate p-value for observed outcome relative to distribution of outcomes when H0 is true
9. If p is less than α, reject H0 in favour of HA
   If p is greater than α, do not reject H0
10. Report statistic, p-value, sample size
    Declare decision

Example: jackal bones
Length of bones from 10 female and 10 male jackals (Manly 1991)
L = length of mandible (mm) of Golden jackals

Male     Female
120      110
107      111
110      107
116      108
114      110
111      105
113      107
117      106
114      111
112      111
113.4    108.6    mean
13.82    5.16     var

Example: jackal bones
1. Population: All possible measurements on these bones
   All jackals in the world? Need to know if sample is representative
2. Measure of pattern: ST = D0 =
3. H0:
4. HA:
5. α =
6. Theoretical distribution of D0? Unknown
   Solution: construct empirical frequency distribution of D0 when H0 is true by randomization…

Example: jackal bones
2. D0 = mean(Lmale) - mean(Lfem)
3. H0: D0 <= 0
4. HA: D0 > 0
5. α = 5%
6. Empirical FD.
Randomization
   a) Assign bones randomly to 2 groups (forget M/F)
   b) Compute mean(gr1) and mean(gr2)
   c) D0,res = mean(gr1) - mean(gr2)
   d) Repeat many times (the more the better, continued later)
   e) Assemble random differences into a FD
7. Statistic. D0 = 113.4 – 108.6 = 4.8 mm

Example: jackal bones
2. D0 = mean(Lmale) - mean(Lfem)
3. H0: D0 <= 0
4. HA: D0 > 0
5. α = 5%
8. Compute p-value:
   100,000 values of D0,res
   360 values exceed 4.8
   p = 360/100000
   p = 0.0036
9. p = 0.0036 < α = 0.05
   Reject H0 in favour of HA (D0 > 0)
10. D0 = 4.8 mm, n = 20 (10 male, 10 female), p = 0.0036
    Male jackal mandible bones are significantly longer than those of females

Example: jackal bones
• This was laborious
• Can be made easier by using theoretical frequency distributions
• Trade off: must make assumptions

Example: jackal bones
6d) Repeat many times: 100,000 repetitions
[Figure: empirical frequency distribution of D0,res]

Example: jackal bones
6d) Repeat many times: 10,000 repetitions
[Figure: empirical frequency distribution of D0,res]

Example: jackal bones
6d) Repeat many times: 1,000 repetitions
[Figure: empirical frequency distribution of D0,res]

Example: Oat Yield data
Yield of oats in 2 groups
1. Control
2. Chemical seed treatment
Models: 1 common mean vs. 1 mean per group
Is the improvement better than random?

Example: Oat Yield data
1. Sample: 8 measurements
   Population: all possible measurements taken with a stated procedure
2. Measure of pattern: ST = SSmodel
3. H0: E(SSmodel) = 0
4. HA: E(SSmodel) > 0
5. α = 5%
6. Theoretical distribution of SSmodel? Unknown
   Solution: construct empirical frequency distribution of SSmodel when H0 is true by randomization…

Example: Oat Yield data
6. Empirical FD
   a) Assign yields randomly to 2 groups (forget treatment/control)
   b) Fit common mean model
   c) Fit 2 means model
   d) Calculate SSmodel
   e) Repeat many times (1000)
   f) Assemble random SSmodel values into a FD
7. Statistic. SSmodel = 192.08

Example: Oat Yield data
8. Compute p-value:
   1,000 values of SSmodel
   161 values exceed 192.08
   p = 161/1000
   p = 0.161
9. p = 0.161 > 0.05
   Do not reject H0
   The improvement is not better than random
10. SSmodel = 192.08, n = 8, p = 0.161
    We cannot reject the JUST LUCK hypothesis

QUIZ 4
Good luck!
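Steps 6 to 8 of the jackal bones example can be sketched in Python using the mandible lengths from the data table. This is a minimal illustration, not the procedure used to produce the slide's numbers: the function name `randomization_test`, the seed, and the default of 10,000 repetitions are assumptions, and the sketch counts random differences at least as large as the observed one (`>=`), whereas the slide counts values that exceed 4.8. The estimated p-value therefore varies with the seed and the number of repetitions.

```python
import random

# Mandible lengths (mm) of Golden jackals, from the data table (Manly 1991).
male = [120, 107, 110, 116, 114, 111, 113, 117, 114, 112]
female = [110, 111, 107, 108, 110, 105, 107, 106, 111, 111]


def mean(values):
    return sum(values) / len(values)


def randomization_test(group1, group2, n_iter=10_000, seed=42):
    """One-sided randomization test for mean(group1) - mean(group2) > 0.

    Returns the observed difference D0 and the proportion of random
    regroupings whose difference is at least as large as D0.
    """
    rng = random.Random(seed)  # seed is an arbitrary, illustrative choice
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    observed_sum = sum(group1)
    count = 0
    for _ in range(n_iter):
        # Steps a) to c): forget the M/F labels and regroup at random.
        rng.shuffle(pooled)
        # The pooled total is fixed, so the difference in means grows with
        # the sum of the first group; comparing integer sums avoids
        # floating-point ties.
        if sum(pooled[:n1]) >= observed_sum:
            count += 1
    observed_diff = mean(group1) - mean(group2)
    return observed_diff, count / n_iter


d0, p_value = randomization_test(male, female)
print(f"D0 = {d0:.1f} mm, estimated p = {p_value:.4f}")
```

Calling `randomization_test(male, female, n_iter=1_000)` and then with larger values of `n_iter` shows the point of step 6d: the more repetitions, the more stable the estimated p-value.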