Basic t-test
10/6

Hypothesis Test for Population Mean
• Goal: Infer $\mu$ from $M$
• Null hypothesis: $\mu = \mu_0$, usually 0
• Change scores
  – Memory improvement, weight loss, etc.
• Sub-population within known, larger population
  – IQ of CU undergrads
• Approach:
  – Determine likelihood function for $M$, using CLT
  – Use actual sample mean to get p-value

Likelihood Function for M
• Probability distribution for $M$, according to $H_0$
• Central Limit Theorem:
  – Mean equals population mean: $\mu_M = \mu_0$
  – Standard deviation: $\sigma_M = \sigma/\sqrt{n}$
  – Shape: Normal
• z-score: $z_M = \dfrac{M - \mu_M}{\sigma_M} = \dfrac{M - \mu_0}{\sigma/\sqrt{n}}$
  – p-value: $p(Z > z_M)$
[Figure: likelihood function for $M$; horizontal axis $M$ with $\mu_0$ marked, vertical axis Probability, tail area $p$ beyond the observed $M$]

Example: IQ
• Are CU undergrads smarter than population?
  – Sample size 100, sample mean 103
• Likelihood function
  – If no difference, what are probabilities of sample means?
  – Null hypothesis: $\mu_0 = 100$
  – CLT: $\mu_M = \mu_0 = 100$, $\sigma_M = \sigma/\sqrt{n} = 15/10 = 1.5$
  – z-score: $(103 - 100)/1.5 = 2$
  – p-value in R:
    > 1-pnorm(2)
    [1] .0228
[Figure: likelihood function for $M$ with the observed $z_M$ marked and tail area $p$]

Problem: Unknown Variance
• Test statistic depends on population parameter: in $z_M = \dfrac{M - \mu_0}{\sigma/\sqrt{n}}$, $\sigma$ is unknown
  – Can only depend on data or values assumed by $H_0$
• Could include $\sigma$ in null hypothesis
  – $H_0$: $\mu = \mu_0$ & $\sigma = \sigma_0$
  – Usually no theoretical basis
  – Cannot tell which assumption fails
• Change test statistic
  – Replace population SD with sample SD
  – Depends only on data and $\mu_0$
  – $t = \dfrac{M - \mu_0}{s/\sqrt{n}}$

t Statistic
• Invented by "Student" at Guinness brewery
• $t = \dfrac{M - \mu_0}{s/\sqrt{n}}$
• Deviation of sample mean divided by estimated standard error
• Depends only on data and $\mu_0$
• Sampling distribution depends only on $n$

$$t = \frac{M - \mu_0}{s/\sqrt{n}} = \frac{(M - \mu_0)/(\sigma/\sqrt{n})}{\sqrt{s^2/\sigma^2}}, \qquad \frac{M - \mu_0}{\sigma/\sqrt{n}} \sim \text{Normal}(0,1), \qquad \frac{(n-1)\,s^2}{\sigma^2} \sim \chi^2_{n-1}$$

$$t_{n-1} = \frac{\text{Normal}(0,1)}{\sqrt{\chi^2_{n-1}/(n-1)}}$$

t Distribution
• Sampling distribution of t statistic
• Derived from ratio of Normal and (modified) $\chi^2$
• Depends only on sample size
  – Degrees of freedom: $df = n - 1$
  – Invariant with respect to $\mu$, $\sigma$
• Shaped like Normal, but with fatter tails
  – Reflects uncertainty in sample variance
  – Closer to Normal as $n$ increases
• Critical value decreases as $n$ increases

Critical values, $\alpha = .05$ (one-tailed)
df    t_crit      df    t_crit
1     6.31        5     2.02
2     2.92        10    1.81
3     2.35        30    1.70
4     2.13        ∞     1.64

Steps of t-test
1. State clearly the two hypotheses
2. Determine null and alternative hypotheses
   • $H_0$: $\mu = \mu_0$
   • $H_1$: $\mu \neq \mu_0$
3. Compute the test statistic t from the data
   • $t = \dfrac{M - \mu_0}{s/\sqrt{n}}$
4. Determine likelihood function for test statistic according to $H_0$
   • t distribution with $n-1$ degrees of freedom
5. Get p-value
   • R: 1-pt(t,n-1) (upper-tail probability)
6. Choose alpha level
7a. $p > \alpha$: Retain null hypothesis, $\mu = \mu_0$
7b. $p < \alpha$: Reject null hypothesis, $\mu \neq \mu_0$

Example: Rat Mazes
• Measure maze time on and off drug
  – Difference score: $\text{Time}_{\text{drug}} - \text{Time}_{\text{no drug}}$
  – Data (seconds): 5, 6, -8, -3, 7, -1, 1, 2
• Sample mean: $M = 9/8 = 1.125$
• Sample standard deviation: $s = 5.06$
• Standard error: $SE = s/\sqrt{n} = 5.06/2.83 = 1.79$
• $t = (M - \mu_0)/SE = (1.125 - 0)/1.79 = .63$
• $p = p(t_{n-1} \geq t) = .27$

Two-tailed Tests
• Interested in $\mu < \mu_0$ or $\mu > \mu_0$
• Leads to $t < 0$ or $t > 0$
• Need critical region to cover both tails
• p-value counts both tails
  – $p(|t_{n-1}| > |t|)$
  – $2\,p(t_{n-1} > t)$ if $t > 0$
  – $2\,p(t_{n-1} < t)$ if $t < 0$
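
As a rough check on the rat maze example and the two-tailed rule above, the following R sketch recomputes the sample mean, standard error, t statistic, and both the one-tailed and two-tailed p-values directly from the listed difference scores. The variable names (d, mu0, alpha, t_stat, and so on) and the choice alpha = .05 are illustrative assumptions, not part of the lecture. The call to t.test, R's built-in one-sample test, is included only as a cross-check on the hand computation.

```r
# Difference scores in seconds (Time_drug - Time_no_drug), as listed in the example
d     <- c(5, 6, -8, -3, 7, -1, 1, 2)
mu0   <- 0      # population mean assumed by H0
alpha <- 0.05   # assumed alpha level (illustrative choice)

n      <- length(d)
df     <- n - 1               # degrees of freedom
M      <- mean(d)             # sample mean
s      <- sd(d)               # sample SD (uses the n - 1 denominator)
se     <- s / sqrt(n)         # estimated standard error of M
t_stat <- (M - mu0) / se      # t statistic

# One-tailed p-value: p(t_{n-1} >= t)
p_upper <- 1 - pt(t_stat, df)

# Two-tailed p-value: p(|t_{n-1}| > |t|)
p_two <- 2 * (1 - pt(abs(t_stat), df))

# Two-tailed critical value and decision at the assumed alpha
t_crit <- qt(1 - alpha / 2, df)
reject <- abs(t_stat) > t_crit

# Cross-check against R's built-in one-sample t-test
check <- t.test(d, mu = mu0, alternative = "two.sided")

print(round(c(M = M, s = s, SE = se, t = t_stat,
              p_upper = p_upper, p_two = p_two, t_crit = t_crit), 3))
print(reject)
```

Comparing abs(t_stat) with t_crit and comparing p_two with alpha lead to the same decision; the critical-value form simply mirrors the table of critical values, while the p-value form mirrors steps 5 to 7 of the procedure.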