Math 141
Lecture 6: Measures of Variability

Albyn Jones
Library 304
[email protected]
www.people.reed.edu/~jones/courses/141

Albyn Jones Math 141

Measures of Spread

The mean and median are measures of location, or some notion of the center or typical values of a distribution.
We now consider measures of spread, or how variable a population is.

First: Quantiles

We may define other quantiles in the same way we defined the median:

Definition: pth quantile
Let $X$ be a RV; then any number $q_p$ satisfying
    $P(X \le q_p) \ge p$ and $P(X \ge q_p) \ge 1 - p$
is a pth quantile.

Like the median, quantiles of a distribution may not be unique.

Example: Quartiles

The 25th percentile is $Q_1 = q_{.25}$, the median is $Q_2 = q_{.50}$, and the 75th percentile is $Q_3 = q_{.75}$.
For a Binomial(100, .5), the quartiles are given by the qbinom function:

    > qbinom(c(.25,.5,.75),100,.5)
    [1] 47 50 53
    #
    # check Q1: P(X <= 47) and P(X >= 47)
    #
    > pbinom(47,100,.5)
    [1] 0.3086497   # at least .25
    > pbinom(46,100,.5,lower.tail=FALSE)
    [1] 0.7579408   # at least .75
    #
    # probability of the interval [Q1, Q3]
    > sum(dbinom(47:53,100,.5))
    [1] 0.5158816

Binomial Quartiles

[Figure: Quartiles for the Binomial(100, 1/2)]

Example

Let $\Omega = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$ with equal probabilities.
What is $Q_1 = q_{.25}$?
What is $q_{.3}$?
What is $Q_2 = q_{.5}$?

Some Measures of Spread

InterQuartile Range (IQR): The distance from the 3rd quartile to the 1st: $Q_3 - Q_1$. The interval $[Q_1, Q_3]$ contains the central 50% of a distribution.
Variance ($\sigma^2$): Mean squared deviation from the mean; if $\mu = E(X)$ then
    $\mathrm{Var}(X) = E(X - \mu)^2 = \sum_i (x_i - \mu)^2 p_i$
Standard Deviation ($\sigma$): the square root of the mean squared deviation from the mean.
    $\mathrm{SD}(X) = \sigma = \sqrt{\mathrm{Var}(X)}$

Why work with ugly statistics???

Some statistics have issues:
For most populations, the tails, or extreme quantiles, are typically the least reliable measures in a sample.
For most populations, the IQR has complicated behavior in samples: there are no simple general formulas for the properties of the sample IQR.
We shall see that variance and standard deviation have some very nice properties.

Example: Y ∼ Bernoulli(p)

Reminder: Each trial results in either a 1 or a 0.
    $P(Y = 1) = p$
    $P(Y = 0) = 1 - p = q$
Expected Value:
    $E(Y) = (0 \cdot q) + (1 \cdot p) = p$
Variance:
    $\sigma_Y^2 = E(Y - p)^2 = (0 - p)^2 \cdot q + (1 - p)^2 \cdot p = p^2 q + q^2 p = pq(p + q) = pq = p(1 - p)$

Example: a Fair Die

Let $Y$ be the value face up after a die roll. $\Omega_y = \{1, 2, 3, 4, 5, 6\}$, each with probability 1/6.
Expected Value:
    $E(Y) = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5$
Variance:
    $\sigma_y^2 = E(Y - 3.5)^2 = \frac{(1 - 3.5)^2 + (2 - 3.5)^2 + \dots + (6 - 3.5)^2}{6}$
or
    $\sigma_y^2 = \frac{6.25 + 2.25 + 0.25 + 0.25 + 2.25 + 6.25}{6} = \frac{17.5}{6} \approx 2.92$

Interpretation

What does variance mean?

Mean Squared Deviation: average squared distance from the mean. Remember: variance is in squared units; for example, if we measure height in cm, then the variance is in cm$^2$.
Standard Deviation: For RVs, SD is the analog of Euclidean distance. (What does that mean?) The SD is in the original units.
Typical deviation: For symmetric, unimodal distributions, the standard deviation is a reasonable measure of typical deviation.
Mean Absolute Deviation: Another candidate for the title of typical. Not as well behaved mathematically.

Standard Deviation as Typical Deviation: Symmetric Distributions

[Figure: density of a symmetric, unimodal population; x-axis Z, y-axis Density]

Standard Deviation as Atypical Deviation: Skewed Distributions

[Figure: density of a skewed population; x-axis X, y-axis Density]

Example: Bernoulli(1/2)

Let $X$ be a Bernoulli RV, with probability $p = .5$. Then
    $\mathrm{Var}(X) = p(1 - p) = \left(\frac{1}{2}\right)^2 = \frac{1}{4}$
Thus
    $\mathrm{SD}(X) = \sqrt{\frac{1}{4}} = \frac{1}{2}$
In this case, $E(X) = 1/2$ and $\mathrm{SD}(X) = 1/2$, so the possible outcomes are exactly $0 = \mu - \sigma$ and $1 = \mu + \sigma$.
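
To make "typical deviation" concrete, the fair die from the earlier example can be worked through with both candidates. This is only an illustrative check; the label MAD for the mean absolute deviation $E|Y - \mu|$ is introduced here and is not notation from the lecture.

```latex
\begin{align*}
\sigma_Y^2 &= \frac{6.25 + 2.25 + 0.25 + 0.25 + 2.25 + 6.25}{6}
            = \frac{17.5}{6} \approx 2.92 \\
\mathrm{SD}(Y) &= \sqrt{17.5/6} \approx 1.71 \\
\mathrm{MAD}(Y) &= E\,|Y - 3.5|
            = \frac{2.5 + 1.5 + 0.5 + 0.5 + 1.5 + 2.5}{6}
            = \frac{9}{6} = 1.5
\end{align*}
```

Both values fall between the smallest (0.5) and largest (2.5) possible deviations of a die roll. The SD comes out somewhat larger because squaring gives the big deviations more weight. In the Bernoulli(1/2) example every outcome is exactly 1/2 from the mean, so there the two measures coincide.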
Variance is an average!

Since variance is the mean squared deviation, it shares properties of means; in particular, it is sensitive to outliers.
First, consider a RV $X$ with $\Omega_x = \{-1, 0, 1\}$, where the probabilities are (respectively) $\{.01, .98, .01\}$. $E(X) = 0$, so
    $\mathrm{Var}(X) = (-1 - 0)^2 (.01) + (0 - 0)^2 (.98) + (1 - 0)^2 (.01) = .02$
Compare to $Y$ with $\Omega_y = \{-100, 0, 100\}$, with the same probabilities, $\{.01, .98, .01\}$.
    $\mathrm{Var}(Y) = (-100 - 0)^2 (.01) + (0 - 0)^2 (.98) + (100 - 0)^2 (.01) = 200$

Example: Binomial(n,p)

Let $X \sim \mathrm{Binomial}(n, p)$; then
    $P(X = k) = \binom{n}{k} p^k q^{n-k}$
$E(X) = np$, so from the definition of variance we have
    $\mathrm{Var}(X) = \sum_{k=0}^{n} (k - np)^2 \binom{n}{k} p^k q^{n-k}$
An algebraically challenging, though in fact analytically computable sum.
As with $E(X)$, there is a nice relationship for sums of (independent) RVs. Next time!

First: Translation and Scaling

Let $X$ be a Bernoulli(1/2) trial. We know $E(X) = p = 1/2$. Let $Y = 2 \cdot X - 1$. We know
    $E(Y) = 2E(X) - 1 = (2 \cdot \frac{1}{2}) - 1 = 0$
What about the variance and SD? From the definition: $\Omega_y = \{2 \cdot 1 - 1, 2 \cdot 0 - 1\} = \{1, -1\}$, each with probability 1/2, so
    $\sigma^2 = E(Y - 0)^2 = (-1 - 0)^2 \cdot \frac{1}{2} + (1 - 0)^2 \cdot \frac{1}{2} = 1$
Thus $\mathrm{SD}(Y) = \sqrt{1} = 1$.

SD's scale naturally!

Continuing the example: $Y = 2 \cdot X - 1$.
    $\mathrm{Var}(Y) = 1 = 4 \cdot \frac{1}{4} = 2^2 \, \mathrm{Var}(X)$
or
    $\mathrm{SD}(Y) = \mathrm{SD}(2X) = 2\,\mathrm{SD}(X)$

Conclusion: Translation and Scaling

Suppose that $E(X) = \mu$. Then for constants $a$ and $b$, $E(a + bX) = a + b\mu$. Thus
    $\mathrm{Var}(a + bX) = E\big((a + bX) - E(a + bX)\big)^2 = E\big(a + bX - (a + b\mu)\big)^2 = E(bX - b\mu)^2 = E\big(b^2 (X - \mu)^2\big) = b^2 E(X - \mu)^2 = b^2 \, \mathrm{Var}(X)$
Taking the square root, we have
    $\mathrm{SD}(a + bX) = |b| \, \mathrm{SD}(X)$
Translation does not affect the spread. Scaling by a constant multiplies the standard deviation by the absolute value of that constant.
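
The scaling rule can be checked on a case where $b$ is negative, which is where the sign matters: a standard deviation is never negative, so the factor has to be $|b|$. As a small sketch, take $X$ to be the Bernoulli(1/2) trial from the example and let $W = 3 - 2X$, so $a = 3$ and $b = -2$. The variable $W$ is introduced here only for illustration.

```latex
\begin{align*}
\Omega_W &= \{3 - 2 \cdot 0,\; 3 - 2 \cdot 1\} = \{3, 1\},
          \quad \text{each with probability } \tfrac{1}{2} \\
E(W) &= 3 - 2\,E(X) = 3 - 2 \cdot \tfrac{1}{2} = 2 \\
\mathrm{Var}(W) &= (3 - 2)^2 \cdot \tfrac{1}{2} + (1 - 2)^2 \cdot \tfrac{1}{2}
          = 1 = (-2)^2 \cdot \tfrac{1}{4} = b^2\,\mathrm{Var}(X) \\
\mathrm{SD}(W) &= \sqrt{1} = 1 = |-2| \cdot \tfrac{1}{2} = |b|\,\mathrm{SD}(X)
\end{align*}
```

Using $b$ instead of $|b|$ would give $(-2) \cdot \tfrac{1}{2} = -1$, which cannot be a standard deviation. The shift $a = 3$ moves the mean from 1/2 to 2 but leaves the spread unchanged.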
Summary

Variance: expected squared deviation from the mean:
    $\mathrm{Var}(X) = E(X - \mu_x)^2 = \sigma^2$
Standard Deviation:
    $\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} = \sigma$
Properties:
    $\mathrm{Var}(a + bX) = b^2 \, \mathrm{Var}(X)$
    $\mathrm{SD}(a + bX) = |b| \, \mathrm{SD}(X)$
Interpretation: For a symmetric distribution, the standard deviation may be considered a typical deviation from the mean. For a strongly asymmetrical distribution, it is better to work with quantiles (percentiles of the distribution).
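
One direct use of the two properties in the summary is rescaling a RV so that it has mean 0 and SD 1. The sketch below assumes $\sigma > 0$; the name $Z$ for the rescaled variable is an assumption made here, chosen as $a + bX$ with $a = -\mu/\sigma$ and $b = 1/\sigma$.

```latex
\begin{align*}
Z &= \frac{X - \mu}{\sigma} = -\frac{\mu}{\sigma} + \frac{1}{\sigma}\,X \\
E(Z) &= -\frac{\mu}{\sigma} + \frac{1}{\sigma}\,\mu = 0 \\
\mathrm{Var}(Z) &= \left(\frac{1}{\sigma}\right)^2 \mathrm{Var}(X)
                = \frac{\sigma^2}{\sigma^2} = 1 \\
\mathrm{SD}(Z) &= \left|\frac{1}{\sigma}\right| \mathrm{SD}(X) = 1
\end{align*}
```

For the Bernoulli(1/2) example, $\mu = \sigma = 1/2$, so $Z = 2X - 1$, which is exactly the variable $Y$ with outcomes $\{-1, 1\}$ worked out in the translation and scaling example.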