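Formula Sheet for Statistical Methods (201-DDD-05)

Five number summary: min, $Q_1$, median, $Q_3$, max
$Q_1$: median of smallest half
$Q_3$: median of largest half

Fourth spread
\[ f_s = Q_3 - Q_1 \]

Outliers
$x_i$ is an outlier if its distance from the closest fourth ($Q_1$ or $Q_3$) is $> 1.5 f_s$

Sample variance
\[ s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2 \]
\[ s^2 = \frac{1}{n-1} \left[ \sum x_i^2 - \frac{\left( \sum x_i \right)^2}{n} \right] \]

Sample standard deviation
\[ s = \sqrt{s^2} \]

Permutations
\[ P_{k,n} = \frac{n!}{(n-k)!} \]

Combinations
\[ C_{k,n} = \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \]

Addition rule
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]

Multiplication rule
\[ P(A \cap B) = P(A)\,P(B \mid A) \]

Independent events
$A$ and $B$ are independent if $P(B \mid A) = P(B)$, equivalently $P(A \cap B) = P(A)\,P(B)$

Law of Total Probability
$A_1, \dots, A_k$ mutually exclusive & exhaustive:
\[ P(B) = P(A_1 \cap B) + \cdots + P(A_k \cap B) \]
Special case: $P(E) + P(E') = 1$

De Morgan's laws
\[ (A \cup B)' = A' \cap B' \]
\[ (A \cap B)' = A' \cup B' \]

Expected value for a discrete r.v.
\[ E(X) = \mu_X = \sum x\,p(x) \]
\[ E(h(X)) = \sum h(x)\,p(x) \]

Expected value for a continuous r.v.
\[ E(X) = \mu_X = \int_{-\infty}^{\infty} x\,f(x)\,dx \]
\[ E(h(X)) = \int_{-\infty}^{\infty} h(x)\,f(x)\,dx \]

Variance and standard deviation
\[ V(X) = \sigma_X^2 = E(X^2) - E(X)^2 \]
\[ \sigma_X = \sqrt{V(X)} \]

Rule for expected value
\[ E(aX + b) = aE(X) + b \]

Rule for variance
\[ V(aX + b) = a^2 V(X) \]

Binomial distribution
$X \sim \mathrm{Bin}(n, p)$:
\[ p(x) = \binom{n}{x} p^x (1-p)^{n-x} \quad \text{for } x = 0, 1, \dots, n \]
\[ E(X) = np, \qquad V(X) = np(1-p) \]

Hypergeometric distribution
$n$ = sample size, $N$ = population size, $M$ = number of successes in population
\[ p(x) = \frac{\binom{M}{x} \binom{N-M}{n-x}}{\binom{N}{n}} \]
\[ E(X) = n \cdot \frac{M}{N}, \qquad V(X) = \frac{N-n}{N-1} \cdot n \cdot \frac{M}{N} \cdot \left( 1 - \frac{M}{N} \right) \]

Poisson distribution
$X \sim \mathrm{Poisson}(\lambda)$:
\[ p(x) = \frac{e^{-\lambda} \lambda^x}{x!} \quad \text{for } x = 0, 1, \dots \]
\[ E(X) = \lambda, \qquad V(X) = \lambda \]

Percentiles
$\eta$ = $100p$th percentile of $X$ (continuous r.v.):
\[ P(X \le \eta) = p \]

Normal distribution
If $X \sim N(\mu, \sigma)$ then
\[ \frac{X - \mu}{\sigma} \sim N(0, 1) \]
For $Z \sim N(0, 1)$ set $\Phi(z) = P(Z \le z)$
\[ \Phi(z_\alpha) = 1 - \alpha \]

Statistics
$X_1, \dots, X_n$ random sample:
\[ \bar{X} = \frac{1}{n} \sum X_i \quad \text{(sample mean)} \]
\[ S^2 = \frac{1}{n-1} \sum (X_i - \bar{X})^2 \quad \text{(sample variance)} \]

Sampling distributions
$X_1, \dots, X_n$ random sample, $X_i \sim$ distribution with mean $\mu$ and std. dev. $\sigma$:
\[ E(\bar{X}) = \mu, \qquad V(\bar{X}) = \sigma^2 / n \]
CLT:
\[ \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1) \quad (n > 30) \]

Regression and Correlation
\[ S_{xx} = \sum x_i^2 - \frac{\left( \sum x_i \right)^2}{n} \]
\[ S_{yy} = \sum y_i^2 - \frac{\left( \sum y_i \right)^2}{n} \]
\[ S_{xy} = \sum (x_i y_i) - \frac{\left( \sum x_i \right) \left( \sum y_i \right)}{n} \]
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}, \qquad \hat{\beta}_0 = \frac{\sum y_i - \hat{\beta}_1 \sum x_i}{n} \]
\[ SSE = \sum y_i^2 - \hat{\beta}_0 \sum y_i - \hat{\beta}_1 \sum x_i y_i \]
\[ SST = S_{yy} \]
\[ s^2 = \frac{SSE}{n-2} \]
\[ r = \frac{S_{xy}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}}}, \qquad r^2 = 1 - \frac{SSE}{SST} \]

HYPOTHESIS TESTING ($\alpha$ = significance level) AND CONFIDENCE INTERVALS ($100(1-\alpha)\%$ confidence level)

One mean
$H_0: \mu = \mu_0$
Test statistics:
\[ z^* = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim N(0, 1) \quad \text{(if } n > 30\text{)} \]
\[ t^* = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim t_{n-1} \quad \text{(if data normally distr.)} \]
Confidence intervals:
\[ \bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}} \quad \text{(if } n > 30\text{)} \]
\[ \bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} \quad \text{(if data normally distr.)} \]

Difference of two means
$H_0: \mu_1 - \mu_2 = \Delta_0$
Test statistics:
\[ z^* = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim N(0, 1) \quad \text{(if } n_1 > 30 \text{ and } n_2 > 30\text{)} \]
\[ t^* = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim t_\nu \quad \text{(if data normally distr.)} \]
where
\[ \nu = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}} \]
Confidence intervals:
\[ \bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \quad \text{(if } n_1 > 30 \text{ and } n_2 > 30\text{)} \]
\[ \bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,\nu} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \quad \text{(if data normally distr.)} \]

One variance
$H_0: \sigma^2 = \sigma_0^2$
Test statistic:
\[ \chi^2 = \frac{(n-1) s^2}{\sigma_0^2} \sim \chi^2_{n-1} \quad \text{(if data normally distr.)} \]
Confidence interval:
\[ \left( \frac{(n-1) s^2}{\chi^2_{\alpha/2,\,n-1}},\ \frac{(n-1) s^2}{\chi^2_{1-\alpha/2,\,n-1}} \right) \quad \text{(if data normally distr.)} \]

Ratio of two variances
$H_0: \sigma_1^2 = \sigma_2^2$
Test statistic:
\[ f^* = \frac{s_1^2}{s_2^2} \sim F_{n_1 - 1,\,n_2 - 1} \quad \text{(if data normally distr.)} \]
Property of critical $F$-values:
\[ F_{1-\alpha/2,\,\nu_1,\,\nu_2} = \frac{1}{F_{\alpha/2,\,\nu_2,\,\nu_1}} \]

One proportion
$H_0: p = p_0$
Test statistic:
\[ z^* = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}} \sim N(0, 1) \quad \text{(if } n \text{ large)} \]
Confidence interval:
\[ \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}} \quad \text{(if } n \text{ large)} \]

Difference of two proportions
$H_0: p_1 - p_2 = \Delta_0$
Test statistic:
\[ z^* = \frac{\hat{p}_1 - \hat{p}_2 - \Delta_0}{\sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}} \sim N(0, 1) \quad \text{(if } n_1, n_2 \text{ large)} \]
Confidence interval:
\[ \hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}} \quad \text{(if } n_1, n_2 \text{ large)} \]

Slope of regression line
$H_0: \beta_1 = \beta_{10}$
Test statistic:
\[ t^* = \frac{\hat{\beta}_1 - \beta_{10}}{s / \sqrt{S_{xx}}} \sim t_{n-2} \quad \text{(if data normally distr.)} \]
Confidence interval:
\[ \hat{\beta}_1 \pm t_{\alpha/2,\,n-2} \frac{s}{\sqrt{S_{xx}}} \quad \text{(if data normally distr.)} \]

Correlation coefficient
$H_0: \rho = 0$
Test statistic:
\[ t^* = \frac{r \sqrt{n-2}}{\sqrt{1 - r^2}} \sim t_{n-2} \quad \text{(if data normally distr.)} \]

Test of normality (Ryan-Joiner)
$H_0$: population distribution is normal
Test statistic: correlation coefficient $r$ from probability plot
If $r < r_c$, reject $H_0$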
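The regression and correlation formulas chain together: the three sums of squares feed the slope and intercept, which feed SSE, which feeds $s^2$, $r^2$ and the $t$ statistics. The following is a minimal Python sketch of that chain using only the shortcut formulas from the sheet. The function name `simple_linear_regression`, the dictionary keys and the five data points are illustrative assumptions, not part of the sheet, and the code assumes $|r| < 1$ so that the correlation test statistic is defined.

```python
import math


def simple_linear_regression(xs, ys):
    """Compute the sheet's regression and correlation quantities (illustrative helper)."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need paired samples with n >= 3")

    # Raw sums used by the shortcut formulas.
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Sxx, Syy, Sxy
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n

    # Least-squares slope and intercept.
    beta1 = s_xy / s_xx
    beta0 = (sum_y - beta1 * sum_x) / n

    # Error and total sums of squares, then the variance estimate.
    sse = sum_y2 - beta0 * sum_y - beta1 * sum_xy
    sst = s_yy
    s2 = sse / (n - 2)

    # Sample correlation and the t statistic for H0: rho = 0 (assumes |r| < 1).
    r = s_xy / math.sqrt(s_xx * s_yy)
    t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

    # t statistic for H0: beta1 = 0, i.e. beta10 = 0 in the sheet's notation.
    t_slope = beta1 / (math.sqrt(s2) / math.sqrt(s_xx))

    return {
        "Sxx": s_xx, "Syy": s_yy, "Sxy": s_xy,
        "beta0": beta0, "beta1": beta1,
        "SSE": sse, "SST": sst, "s2": s2,
        "r": r, "r2": 1 - sse / sst,
        "t_corr": t_corr, "t_slope": t_slope,
    }


# Made-up data, chosen so the arithmetic is easy to check by hand.
fit = simple_linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

For this data, working by hand gives $S_{xx} = 10$, $S_{yy} = 6$, $S_{xy} = 6$, so $\hat{\beta}_1 = 0.6$, $\hat{\beta}_0 = 2.2$, $SSE = 2.4$ and $r^2 = 0.6$. The two $t$ statistics agree with each other, as they should when $\beta_{10} = 0$. The shortcut formula for SSE subtracts quantities of similar size, so it can lose precision on large or poorly scaled data. Summing squared residuals directly is the more robust choice outside of hand calculation.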