AMS 102.7 Spring 2006 Jingyu Zou
Elements of Statistics
Lecture Notes # 18 (Last one)

1 Expectation

Definition 1.1 If $X$ is a discrete random variable taking on the values $x_1, x_2, \ldots, x_n$, with probability distribution $P\{X = x_1\} = p_1$, $P\{X = x_2\} = p_2$, $\ldots$, $P\{X = x_n\} = p_n$, then the expectation (or expected value, or mean) of $X$ is given by

$$E[X] = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n$$

Recall the mean of a set of $n$ observations from last class, which is defined as

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

It can actually be interpreted as an expectation. Assume that all $n$ observations have the same chance of being observed, namely $1/n$. So we have $P\{X = x_1\} = P\{X = x_2\} = \cdots = P\{X = x_n\} = 1/n$. Then

$$E[X] = \frac{1}{n} x_1 + \frac{1}{n} x_2 + \cdots + \frac{1}{n} x_n = \bar{x}$$

Thus expectation can also be viewed as a measure of center: it is a weighted average in which each value is weighted by its probability.

Example 1.2 What is the expected number of heads if we toss three fair coins?

$$\begin{aligned}
E[X] &= 0 \cdot P\{X = 0\} + 1 \cdot P\{X = 1\} + 2 \cdot P\{X = 2\} + 3 \cdot P\{X = 3\} \\
     &= 0 \cdot \frac{1}{8} + 1 \cdot \frac{3}{8} + 2 \cdot \frac{3}{8} + 3 \cdot \frac{1}{8} = \frac{3}{2}
\end{aligned}$$

Example 1.3 Toss two fair dice. What is the expected sum of the two dice? (7)

2 Variance and Deviation

As usual, we will first learn how to compute variance and deviation in the non-probabilistic sense. Variance is the average of the squared deviations of the observations from their mean.

Definition 2.1 If $x_1, x_2, \ldots, x_n$ are $n$ observations in a sample, then the sample variance is

$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$$

Definition 2.2 If $x_1, x_2, \ldots, x_n$ are all the $n$ points in the population, then the population variance is

$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}$$

Standard deviation is the positive square root of the variance. It is a measure of the average distance of the observations from the mean.

Example 2.3 The data below represent the number of children in a household for three households in a neighborhood.
$x_1 = 0$, $x_2 = 5$, $x_3 = 7$

(a) What is the sample variance?
(b) What is the sample standard deviation?

Question: Can we use

$$\frac{(x_1 - \bar{x}) + (x_2 - \bar{x}) + \cdots + (x_n - \bar{x})}{n}$$

to measure the average distance from the mean? Isn't that a more natural measure of average distance?

Definition 2.4 Let $X$ be a discrete random variable taking on the values $x_1, x_2, \ldots, x_k$ with probabilities $p_1, p_2, \ldots, p_k$, and let $\mu$ be the expectation of $X$. Then the variance of $X$ is given by

$$\begin{aligned}
\mathrm{Var}(X) = \sigma^2 &= E[(X - \mu)^2] \\
&= (x_1 - \mu)^2 p_1 + (x_2 - \mu)^2 p_2 + \cdots + (x_k - \mu)^2 p_k \\
&= E[X^2] - (E[X])^2 \\
&= (x_1^2 p_1 + x_2^2 p_2 + \cdots + x_k^2 p_k) - \mu^2
\end{aligned}$$

Note: Variance is always non-negative, whereas expectation can be any real value.

Definition 2.5 The standard deviation of $X$ is given by

$$\sigma = \sqrt{\mathrm{Var}(X)}$$

Example 2.6 Let $X$ be the number of heads if we toss three fair coins. What are its variance and standard deviation? ($\sigma^2 = \frac{3}{4}$)

3 Continuous Random Variables and Normal Distribution

A random variable $X$ that takes values in an interval, or a union of intervals, is said to be continuous. A probability cannot be assigned to each of the possible values because the outcomes in an interval cannot be counted. So we have to assign probabilities to intervals of outcomes, not to single values.

Definition 3.1 The probability density function (pdf) of a continuous random variable $X$ is a non-negative function such that the area under the function over an interval is equal to the probability that the random variable is in the interval. The total area under the curve must be 1.

One of the most common distributions for continuous random variables is the Normal distribution. The pdf of the normal distribution is symmetric, bell-shaped, and centered at the mean $\mu$. The Normal distribution with expectation $\mu$ and standard deviation $\sigma$ is denoted by $N(\mu, \sigma)$.

Example 3.2 The statistics from our 2nd midterm are $\mu = 71.09$, $\sigma = 24.6$. We can model the scores as a normal distribution. If we let the random variable $X$ be the test score, then $X$ is approximately $N(71.09, 24.6)$.
(a) What is the proportion of students having scores over 60?
(b) What is the proportion of students having scores over 71.09, which is exactly the average?

Theorem 3.3 If $X$ is $N(\mu, \sigma)$, then the standardized variable

$$Z = \frac{X - \mu}{\sigma}$$

is $N(0, 1)$.

Example 3.4 Suppose you're given a table of the distribution of $N(0, 1)$. How would you compute, as in Example 3.2, the proportion of students having scores over 60? ($P\{X > 60\} = P\{Z > -0.4508\} = P\{Z \le 0.4508\} = 0.6739$. The actual data is $38/55 = 0.6909$. They are pretty close!)

Definition 3.5 The z-score or standard score for an observed value tells us how many standard deviations the observed value is from the mean. It is computed as follows:

$$Z = \frac{X - \mu}{\sigma}$$

Example 3.6
(a) The distribution of IQ scores for 12-year-old kids is $N(100, 16)$. Jessica, who is 12 years old, had a score of 132. Compute her standardized score.
(b) Suppose Jessica has an older brother, Mike, who is 20 years old and has an IQ score of 144. It wouldn't make sense to directly compare Mike's score of 144 to Jessica's score of 132. The two scores come from different distributions due to the age difference. Assuming the distribution of IQ scores for 20-year-olds is $N(120, 20)$, what is Mike's standardized score?
(c) With respect to their age group, who had the higher IQ score: Jessica or Mike?

Theorem 3.7 For any normal distribution $N(\mu, \sigma)$, we have, approximately,

$$\begin{aligned}
P\{X \in [\mu - \sigma, \mu + \sigma]\} &= 0.68 \\
P\{X \in [\mu - 2\sigma, \mu + 2\sigma]\} &= 0.95 \\
P\{X \in [\mu - 3\sigma, \mu + 3\sigma]\} &= 0.997
\end{aligned}$$

A standardized score, along with the above rules, provides a useful frame of reference to assess whether an observation is usual or unusual. Seeing z-scores outside the range of $-2$ to $+2$ is somewhat unusual: only about 5% of the values fall outside it. A z-score outside the range of $-3$ to $+3$ is even more unusual, with only about 0.3% of the values expected to be this extreme.
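
The table lookups in Examples 3.4 and 3.6 and the rule in Theorem 3.7 can also be checked numerically. The following is a minimal sketch in Python, not part of the original notes; it assumes the standard library's `statistics.NormalDist` (Python 3.8 or later) as a stand-in for the printed $N(0, 1)$ table, and the helper names `z_score`, `prob_above`, and `prob_within` are hypothetical names chosen for illustration.

```python
from statistics import NormalDist

# Standard normal N(0, 1); its cdf stands in for the printed table (assumption).
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma


def prob_above(x, mu, sigma):
    """P{X > x} for X ~ N(mu, sigma), computed by standardizing first."""
    return 1.0 - STANDARD_NORMAL.cdf(z_score(x, mu, sigma))


def prob_within(k):
    """P{mu - k*sigma <= X <= mu + k*sigma}; identical for every normal."""
    return STANDARD_NORMAL.cdf(k) - STANDARD_NORMAL.cdf(-k)


# Example 3.4: midterm scores modeled as N(71.09, 24.6).
p_over_60 = prob_above(60, 71.09, 24.6)

# Example 3.6: compare each sibling with their own age group.
z_jessica = z_score(132, 100, 16)
z_mike = z_score(144, 120, 20)

# Theorem 3.7: coverage within 1, 2, and 3 standard deviations.
coverage = [prob_within(k) for k in (1, 2, 3)]

print(f"P(X > 60)   = {p_over_60:.4f}")
print(f"z (Jessica) = {z_jessica:.2f}")
print(f"z (Mike)    = {z_mike:.2f}")
print("coverage    =", [round(c, 4) for c in coverage])
```

Because both scores are converted to the same standardized scale, the two z-scores can be compared directly even though the raw IQ scores cannot. The coverage values come out slightly different from the rounded figures 0.68, 0.95, and 0.997, which are approximations.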