Dr. Neal, WKU
MATH 382

Discrete Uniform Random Variables

Let $X$ be a random variable with a finite range $\{x_1, x_2, \ldots, x_n\}$ such that each of the $n$ distinct values in the range occurs with equal probability $1/n$. That is, $P(X = x_i) = 1/n$ for $i = 1, 2, \ldots, n$. Then $X$ is called a discrete uniform random variable.

Example 1. Roll one fair six-sided die and let $X$ be the face value. Then Range $X = \{1, 2, 3, 4, 5, 6\}$. There are 6 values in the range, and $P(X = k) = 1/6$ for each $k \in$ Range $X$. So $X$ is discrete uniform.

Example 2. Roll two dice and let $X$ be the sum of the two dice. Then Range $X = \{2, \ldots, 12\}$, so there are 11 values in the range; but each does not occur with probability 1/11 (e.g., $P(X = 2) = 1/36$ and $P(X = 7) = 1/6$). So $X$ is not discrete uniform. It is, however, still a discrete random variable since the range is a finite set.

Example 3. On European roulette wheels, there is only one zero. So spin the wheel and let $X$ be the number on which the ball lands. Then Range $X = \{0, 1, \ldots, 36\}$. There are 37 values in the range and $P(X = k) = 1/37$ for each $k \in$ Range $X$. Thus $X$ is discrete uniform.

Example 4. Draw a card from a standard deck (no jokers) and let $X$ be the card value, where J, Q, K = 10 and Ace = 11. Then Range $X = \{2, \ldots, 11\}$. But $X$ is not uniform because $P(X = 2) = 1/13 \ne P(X = 10) = 4/13$.

Expected Value

If $X$ is discrete uniform with range $\{x_1, x_2, \ldots, x_n\}$, then the expected value (mean, or average value) of $X$ is simply the ordinary average:
\[
\mu = E[X] = \sum_{k \in \text{Range } X} k \, P(X = k) = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}.
\]

Note: When the range is $\{1, 2, \ldots, n\}$, then
\[
\mu = \frac{1 + 2 + \cdots + n}{n} = \frac{n(n+1)/2}{n} = \frac{n+1}{2}.
\]

Example 5. What is the average roll of one fair six-sided die? Because $X$ is discrete uniform with range $\{1, 2, 3, 4, 5, 6\}$, we have $\mu = (6 + 1)/2 = 3.5$.

Variance

Again let $X$ be a discrete uniform random variable with range $\{x_1, x_2, \ldots, x_n\}$.
To compute the variance, we first note that the expected value of $X^2$ is simply
\[
E[X^2] = \sum_{k \in \text{Range } X} k^2 \, P(X = k) = \frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}.
\]
Then,
\[
\sigma^2 = \mathrm{Var}(X) = E[X^2] - (E[X])^2 = \frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n} - \left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right)^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \mu^2.
\]
We often say that the variance is "the average of the squares minus the square of the average."

In particular, when the range is $\{1, 2, \ldots, n\}$, then
\[
\begin{aligned}
\sigma^2 &= \frac{1^2 + 2^2 + \cdots + n^2}{n} - \left(\frac{n+1}{2}\right)^2
= \frac{n(n+1)(2n+1)/6}{n} - \left(\frac{n+1}{2}\right)^2 \\
&= \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4}
= \frac{(n+1)\,[2(2n+1) - 3(n+1)]}{12} \\
&= \frac{(n+1)(n-1)}{12} = \frac{n^2 - 1}{12}.
\end{aligned}
\]
As usual, the standard deviation is $\sigma = \sqrt{\sigma^2}$.

A non-uniform discrete random variable can often be treated as uniform by simply listing the values in increasing order and repeating those that occur more than once, so that each listed entry is equally likely. The next example illustrates this concept.

Example 6. Let $X(\omega)$ be the weight of person $\omega$ for the 12 players on a girls' basketball team. These weights are as follows:

{120, 122, 125, 130, 130, 145, 150, 150, 155, 165, 180, 180}

Technically $X$ is not uniform because there are only 9 distinct values in the range and $P(X = 130) = 2/12 \ne 1/9$. However, we can still find the mean, variance, and standard deviation as before by applying the preceding formulas to the 12 listed values. Here $\mu = 1752/12 = 146$, $\sigma^2 = 260{,}784/12 - 146^2 = 416$, and $\sigma = \sqrt{416} \approx 20.396$.

Mode and Median

The mode of a random variable $X$ is the most likely value in the range of $X$ (that is, the value $k$ such that $P(X = k)$ is maximized). When the function values of $X$ are listed in increasing order, as in Example 6, the mode is simply the value that occurs most often. However, if $X$ is discrete uniform, then we do not really consider the mode because each value in the range is equally likely. Sometimes two or more values tie for most frequent, as in Example 6 above (130, 150, and 180 each occur twice). If two values are most likely, then $X$ is called bimodal. If three values occur most often, then $X$ is called trimodal.
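The rule "the average of the squares minus the square of the average" and the mode discussion above can be checked numerically. The following is a minimal Python sketch, assuming only the standard library; the helper name `uniform_stats` is hypothetical and is not part of the notes. It treats each listed value as equally likely, as in Example 6, and uses exact fractions so that the closed forms $(n+1)/2$ and $(n^2-1)/12$ can be compared without rounding.

```python
from collections import Counter
from fractions import Fraction
from math import sqrt


def uniform_stats(values):
    """Mean, variance, and modes of a list whose entries are equally likely.

    Hypothetical helper for illustration only.
    """
    n = len(values)
    mean = Fraction(sum(values), n)
    mean_of_squares = Fraction(sum(v * v for v in values), n)
    # Average of the squares minus the square of the average.
    variance = mean_of_squares - mean ** 2
    counts = Counter(values)
    top = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == top)
    return mean, variance, modes


# Example 6: the 12 listed weights.
weights = [120, 122, 125, 130, 130, 145, 150, 150, 155, 165, 180, 180]
mean, variance, modes = uniform_stats(weights)
print(mean, variance, round(sqrt(variance), 3), modes)
# prints: 146 416 20.396 [130, 150, 180]

# One fair die: range {1, ..., 6}, so n = 6.
die_mean, die_variance, _ = uniform_stats(list(range(1, 7)))
print(die_mean, die_variance)
# prints: 7/2 35/12
```

For the die, 7/2 agrees with $(n+1)/2$ and 35/12 agrees with $(n^2-1)/12$ at $n = 6$. For a truly uniform list every value ties for most frequent, which is why the third return value is ignored there.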
The median of a random variable $X$ is the smallest value $k$ such that $P(X \le k) \ge 0.50$ (that is, the first point at which the cumulative distribution reaches or exceeds 50%). This definition is the general form that can be used for all random variables. In the special case of a discrete uniform random variable, or when function values are listed in increasing order, another convention is used: the median is the middle value when there is an odd number of measurements listed, and it is the average of the two middle values when there is an even number of measurements listed. In Example 6 above, there are 12 measurements listed, so the median is the average of the 6th and 7th measurements, which is (145 + 150)/2 = 147.5.

The preceding methods for computing the mean, median, mode, and standard deviation are generally used for raw data of a population when a measurement is given for each person in the population.

Example 7. The following measurements are the Math ACT scores from a MATH 136 class at WKU in Spring 2014:

18 30 22 25 21 29 23 24 34 27 28
20 19 29 27 21 21 28 25 21 28 19
30 25 28 25 22 31 26 27 27

Find the mean $\mu$, the median, the mode, and the standard deviation $\sigma$ of this class.

Calculator Solution. TI-83/84: We enter the data into the STAT Edit screen under list L1.

[Calculator screens: enter data into L1; sort the data, then enter 1-Var Stats L1; output; scroll down]

We see that $\mu = 780/31 \approx 25.16129$, the median is 25 (which is the 16th measurement when all 31 measurements are listed in increasing order), and $\sigma \approx 3.9521$. The calculator does not compute the mode for us. But by scrolling down the sorted list, we can observe that four different measurements each occur 4 times, which is the most that any measurement occurs. Thus, we have modes of 21, 25, 27, and 28.

Note: (i) Because we have a measurement for each person in the class, the displayed value $\bar{x}$ is actually the true population mean $\mu$.
If the data were only a random sample from a larger population, then we would use $\bar{x} \approx 25.16129$ to denote the sample mean and $S \approx 4.0174$ to denote the sample standard deviation.

(ii) From the displayed values, we also can compute $\sigma^2$ and $\sigma$ by
\[
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \mu^2 = \frac{20110}{31} - \left(\frac{780}{31}\right)^2 \approx 15.61914672
\]
and $\sigma = \sqrt{\sigma^2} \approx 3.952106618$.

(iii) Other statistics shown are the minimum value 18, the maximum 34, the first quartile $Q_1 = 21$, and the third quartile $Q_3 = 28$. The first quartile is the median of the lower 15 values, and the third quartile is the median of the upper 15 values.

Frequency Charts

Often data sets with many measurements are given in a frequency chart that gives the number of occurrences for each measurement.

Measurement   Frequency
x_1           k_1
x_2           k_2
x_3           k_3
...           ...
x_m           k_m

Then $n = k_1 + \cdots + k_m$ is the total number of measurements. So the mean $\mu$ is actually a weighted average given by
\[
\mu = \frac{k_1 x_1 + \cdots + k_m x_m}{n}.
\]
In computing the variance $\sigma^2$, only the measurements are squared in the first part, not the frequencies:
\[
\sigma^2 = \frac{k_1 x_1^2 + \cdots + k_m x_m^2}{n} - \mu^2.
\]

Example 8. Suppose all the households in a neighborhood are surveyed as to how many children live at home. The responses are below:

Number of children   Number of households
0                    60
1                    42
2                    86
3                    59
4                    22
5                    4
6                    2

Find the mean $\mu$, the median, the mode, and the standard deviation $\sigma$.

Calculator Solution. We enter the measurements under list L1 and the frequencies under list L2. Then enter the command 1-Var Stats L1, L2.

We see that there are 275 measurements, with the average number of children per household being $\mu = 511/275 \approx 1.8582$. Because 275/2 = 137.5, the median is the 138th measurement, which is a 2. The mode is also 2 since it occurs most often. Lastly,
\[
\sigma^2 = \frac{1441}{275} - \left(\frac{511}{275}\right)^2 \approx 1.78716
\]
and $\sigma = \sqrt{\sigma^2} \approx 1.33685$.

Exercises

1. The grades (out of 100) on Test 1 for a MATH 317 class were as follows:

{48, 71, 72, 78, 84, 84, 86, 90, 94, 94, 94, 96, 100}

(a) Give the distribution of A's, B's, etc. on a standard scale of 90%, 80%, etc.

(b) Compute the mean, median, mode, and standard deviation for the grades on this test.

(c) Compute the distribution of grades on the following "normalized" scale:

Grade   Range
A       At least $\mu + \sigma$
B       $[\mu + 0.5\sigma,\ \mu + \sigma)$
C       $[\mu - 0.5\sigma,\ \mu + 0.5\sigma)$
D       $[\mu - \sigma,\ \mu - 0.5\sigma)$
F       Below $\mu - \sigma$

Now consider the Test 2 grades of the class. One student has dropped and the grades are (again out of 100):

{48, 58, 62, 69, 74, 75, 79, 80, 82, 84, 88, 89}

(d) Redo parts (a), (b), and (c) with these grades.

(e) In your opinion, should final class grades be assigned on a standard scale or on a normalized scale? Explain. When, if ever, should an individual student's grade be curved?

2. All the female models at an agency were measured for shoe size in order to be fitted for a new line of shoes. The sizes were:

Size   Models
5.5    2
6      5
6.5    12
7      14
7.5    16
8      18
8.5    17
9      4
9.5    2

(a) Compute the mean, median, mode, and standard deviation of these shoe sizes.

(b) What percentage of these models are above the agency average in female shoe size?

(c) What percentage of these models are within one standard deviation of the average female shoe size?
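The weighted formulas from the Frequency Charts section, together with the middle-value convention for the median, can also be sketched in code. The following is a minimal Python sketch, assuming only the standard library; the function name `frequency_stats` is hypothetical and is not a calculator command. It is applied to the data of Example 8 and, after tallying the raw scores into a frequency chart, to Example 7.

```python
from collections import Counter
from math import sqrt


def frequency_stats(values, freqs):
    """Count, mean, variance, median, and modes from a frequency chart.

    Hypothetical helper; `values` must be listed in increasing order.
    """
    n = sum(freqs)
    mean = sum(k * x for x, k in zip(values, freqs)) / n
    # Only the measurements are squared, not the frequencies.
    variance = sum(k * x * x for x, k in zip(values, freqs)) / n - mean ** 2

    def nth(position):
        """Measurement at a 1-based position in the sorted, repeated list."""
        running = 0
        for x, k in zip(values, freqs):
            running += k
            if position <= running:
                return x
        raise ValueError("position out of range")

    if n % 2 == 1:
        median = nth((n + 1) // 2)
    else:
        median = (nth(n // 2) + nth(n // 2 + 1)) / 2

    top = max(freqs)
    modes = [x for x, k in zip(values, freqs) if k == top]
    return n, mean, variance, median, modes


# Example 8: number of children per household.
children = [0, 1, 2, 3, 4, 5, 6]
households = [60, 42, 86, 59, 22, 4, 2]
n8, mean8, var8, median8, modes8 = frequency_stats(children, households)
print(n8, round(mean8, 4), round(var8, 5), round(sqrt(var8), 5), median8, modes8)

# Example 7: tally the raw ACT scores into a frequency chart first.
scores = [18, 30, 22, 25, 21, 29, 23, 24, 34, 27, 28,
          20, 19, 29, 27, 21, 21, 28, 25, 21, 28, 19,
          30, 25, 28, 25, 22, 31, 26, 27, 27]
tally = Counter(scores)
distinct = sorted(tally)
n7, mean7, var7, median7, modes7 = frequency_stats(distinct, [tally[v] for v in distinct])
print(n7, round(mean7, 5), round(sqrt(var7), 4), median7, modes7)
```

Under these assumptions, the printed values should agree with the calculator results quoted in Examples 7 and 8. The same function could be pointed at the shoe-size chart in Exercise 2, though the percentage questions there need a separate count.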