* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Hypothesis Testing
Survey
Document related concepts
Transcript
Hypothesis Testing CSCE 587 Background Assumptions: – Samples reflect an underlying distribution – If two distributions are different, the samples should reflect this. – A difference should be testable Model Evaluation • Issue: limited observations • Questions: – What is the appropriate test – What level of significance is appropriate Test Selection • One sample test: – compare sample to population – Actually: sample mean vs population mean – Does the sample match the population • Two sample test: – compare two samples – Sample distribution of the difference of the means Selecting the Test Type of problem: – Comparing a sample to a population – “Does the sample come from the population?” – Test: One Sample t-test t sample population SDsample / # observations One Sample t-Test t sample population SDsample / # observations Test: One Sample t-test • Used only for tests of the population mean • Null hypothesis: the means are the same • Compare the mean of the sample with the mean of the population • What do we know about the population? Example • Historically, the mean score for juniors on a test is 81.3 • Six students take the test resulting in a mean score of 87.6 and a standard deviation of 13.14 • The t-value is: 87.6 81.3 6.3 t 1.176 13.14 / 6 5.354 Example Given t= 1.176, how do we test the hypothesis? Compare to critical values in t-table. To do so we need: – to know the degrees of freedom – To select a confidence interval – One-tailed or two-tailed test? One-Tailed vs Two-Tailed One-tailed: – allots all of the alpha in one direction – Makes sense if we want to know a difference in one direction – sample population or sample population One-Tailed vs Two-Tailed Two-tailed: we only care if the means are different – Splits alpha into two tails – Makes sense if we are only asking if the means are different sample population Example • The degrees of freedom – DoF for one sample test is n-1 = 5 in this case • To select a confidence interval – We select 95% confidence interval • One-tailed or two-tailed test? – – – – One-tail allots all of the alpha in one direction Makes sense if we want to know a difference in one direction Two-tailed: we only care if the means are different We choose two-tailed Example Parameters: t= 1.176, DoF=5, 95%, 2-tailed Decision Compare t= 1.176, and t = 2.571 Accept or reject the null hypothesis? If tc t Reject null hypothesis If p-value < Reject null hypothesis Steps We Took 1. 2. 3. 4. 5. Selected the test Selected confidence level Calculated the test statistic Determined the degree of freedom Compared the computed test statistic to the table value 6. Accepted/Rejected the hypothesis Test Selection • One sample test: – compare sample to population – Does the sample match the population • Two sample test: – compare two samples – Sample distribution of the difference of the means Two Sample Tests • Independent two-sample t-test – Independently sampled – Identically distributed – What does this mean? • Paired two-sample t-test – Paired, not independently sampled – Example: sample, treat, resample same subjects Independent two-sample t-test Must have same variance X1 X 2 t 1 1 S X1 X 2 n1 n2 Numerator: difference between samples Denominator: estimated standard error of the difference between the two means Note the above version allow unequal sample sizes Independent two-sample t-test Denominator: estimated standard error of the difference between the two means X1 X 2 t 1 1 S X1 X 2 n1 n2 Where: S X1 X 2 (n1 1) S12 (n2 1) S 22 n1 n2 2 2 2 S , S And 1 2 are the unbiased variances of the two samples Degrees of Freedom One sample test: n1 + n2 – 1 Two sample test: n1 + n2 – 2 Why the difference? – For a sample: DoF = n-1 – We have two samples: (n1-1) & (n2-1) Independent t-test, unequal sizes, unequal variances • Use Welch’s t-test • Difference is in the denominator t X1 X 2 2 1 2 2 S S n1 n2 Where, S12 , S 22 are the unbiased variances of the two samples