```Dr. Neal, WKU
MATH 382
Discrete Uniform Random Variables
Let X be a random variable with a finite range $\{x_1, x_2, \ldots, x_n\}$ such that each of the n
distinct values in the range occurs with equal probability 1/n. That is, $P(X = x_i) = 1/n$
for $i = 1, 2, \ldots, n$. Then X is called a discrete uniform random variable.
Example 1. Roll one fair six-sided die and let X be the face value. Then Range X = {1, 2,
3, 4, 5, 6}. There are 6 values in the range, and P(X = k) = 1/6 for each k ∈ Range X. So
X is discrete uniform.
Example 2. Roll two dice and let X be the sum of the two dice. Then Range X = {2, . . . ,
12}, so there are 11 values in the range; but not every value occurs with probability 1/11 (e.g.,
P(X = 2) = 1/36 and P(X = 7) = 1/6). So X is not discrete uniform. It is, however, still
a discrete random variable since the range is a finite set.
Example 3. On European Roulette wheels, there is only one zero. So, spin the wheel and
let X be the number on which the ball lands. Then Range X = {0, 1, . . . , 36}. There are 37 values in the
range and P(X = k) = 1/37 for each k ∈ Range X. Thus X is discrete uniform.
Example 4. Draw a card from a standard deck (no jokers) and let X be the card value,
where J, Q, K = 10 and Ace = 11. Then Range X = {2, . . ., 11}. But X is not uniform
because P(X = 2) = 1/13 ≠ P(X = 10) = 4/13.
Expected Value
If X is discrete uniform with range $\{x_1, x_2, \ldots, x_n\}$, then the expected value (mean, or
average value) of X is simply the normal average:
$$\mu = E[X] = \sum_{k \in \text{Range } X} k \, P(X = k) = \sum_{i=1}^{n} x_i \cdot \frac{1}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}.$$
Note: When the range is {1, 2, . . . , n}, then
$$\mu = \frac{1 + 2 + \cdots + n}{n} = \frac{n(n+1)/2}{n} = \frac{n+1}{2}.$$
Example 5. What is the average roll of one fair six-sided die? Because X is discrete
uniform with range {1, 2, 3, 4, 5, 6}, we have $\mu = \frac{6 + 1}{2} = 3.5$.
Variance
Again let X be a discrete uniform random variable with range $\{x_1, x_2, \ldots, x_n\}$. To
compute the variance, we first note that the expected value of $X^2$ is simply
$$E[X^2] = \sum_{k \in \text{Range } X} k^2 \, P(X = k) = \frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}.$$
Then,
$$\sigma^2 = \mathrm{Var}(X) = E[X^2] - (E[X])^2 = \frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n} - \left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right)^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \mu^2.$$
We often say that the variance is “the average of the squares minus the square of the
average.”
In particular, when the range is {1, 2, . . ., n}, then
$$\sigma^2 = \frac{1^2 + 2^2 + \cdots + n^2}{n} - \left(\frac{n+1}{2}\right)^2 = \frac{n(n+1)(2n+1)/6}{n} - \left(\frac{n+1}{2}\right)^2$$
$$= \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} = \frac{(n+1)[2(2n+1) - 3(n+1)]}{12}$$
$$= \frac{(n+1)(n-1)}{12} = \frac{n^2 - 1}{12}.$$
As usual, the standard deviation is $\sigma = \sqrt{\sigma^2}$.
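As a quick check of the closed form, here is a worked instance for the fair die of Example 5 (n = 6), computed once from the formula and once directly from "the average of the squares minus the square of the average":
$$\sigma^2 = \frac{6^2 - 1}{12} = \frac{35}{12} \approx 2.9167, \qquad \sigma = \sqrt{35/12} \approx 1.7078,$$
$$\frac{1^2 + 2^2 + \cdots + 6^2}{6} - \left(\frac{7}{2}\right)^2 = \frac{91}{6} - \frac{49}{4} = \frac{35}{12}.$$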
The same formulas also apply to a non-uniform discrete random variable defined on a
finite population: list the values in increasing order, repeating each value as many times
as it occurs, and treat each listed entry as equally likely.
The next example illustrates this concept.
Example 6. Let X(ω) be the weight of person ω for the 12 players on a girls' basketball
team. These weights are as follows:
{120, 122, 125, 130, 130, 145, 150, 150, 155, 165, 180, 180}
Technically X is not uniform because there are only 9 distinct values in the range
and P(X = 130) = 2/12 ≠ 1/9. However, we can still find the mean, variance, and
standard deviation as before by using the preceding formulas. Here µ = 146,
$\sigma^2 = 260{,}784/12 - 146^2 = 416$, and $\sigma = \sqrt{416} \approx 20.396$.
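For reference, the two sums behind these values, taken over the 12 listed weights, can be written out as
$$\sum x_i = 120 + 122 + \cdots + 180 = 1752, \qquad \mu = \frac{1752}{12} = 146,$$
$$\sum x_i^2 = 120^2 + 122^2 + \cdots + 180^2 = 260{,}784, \qquad \sigma^2 = 21{,}732 - 21{,}316 = 416.$$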
Mode and Median
The mode of a random variable X is the most likely value in the range of X (that is, the
value k such that P(X = k) is maximized). When the function values of X are listed in
increasing order, as in Example 6, then the mode is simply the value that occurs most
often. However, if X is discrete uniform, then we do not really consider the mode
because each value in the range is equally likely.
Sometimes two or more values occur equally often as in Example 6 above (130, 150
and 180). If two values are most likely, then X is called bimodal. If three values occur
most often, then X is called trimodal.
The median of a random variable X is the smallest value k such that P(X ≤ k) ≥ 0.50
(that is, the first point at which the cumulative distribution reaches or exceeds 50%). This
definition is the general form that can be used for all random variables. In the special
case of a discrete uniform random variable or when function values are listed in
increasing order, then another convention is used:
The median is the middle value when there is an odd number of
measurements listed, and it is the average of the two middle values when
there is an even number of measurements listed.
In Example 6 above, there are 12 measurements listed. So the median is the average of
the 6th and 7th measurements, which is given by (145 + 150)/2 = 147.5.
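Note that the two definitions need not give the same number. A worked comparison for Example 6, treating each of the 12 listed weights as equally likely:
$$P(X \le 130) = \frac{5}{12} < 0.50, \qquad P(X \le 145) = \frac{6}{12} = 0.50,$$
so the general definition gives a median of 145, while the listing convention gives 147.5.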
The preceding methods for computing the mean, median, mode and standard
deviation are generally used for raw data of a population when a measurement is given
for each person in the population.
Example 7. The following measurements are the Math ACT scores from a MATH 136
class at WKU in Spring 2014:
18  30  22  25  21  29  23  24
34  27  28  20  19  29  27  21
21  28  25  21  28  19  30  25
28  25  22  31  26  27  27
Find the mean µ , the median, the mode, and the standard deviation σ of this class.
Calculator Solution. TI-83/84: We enter the data into the STAT Edit screen under list L1.
[Calculator screens: enter data into L1; sort data, then enter 1–Var Stats L1; output; scroll down]
We see that µ = 780/31 ≈ 25.16129, the median is 25 (which is the 16th
measurement when all 31 measurements are listed in increasing order), and σ ≈ 3.9521.
The calculator does not compute the mode for us. But by scrolling down the sorted
list, we can observe that four different measurements each occur 4 times which is the
most that any measurement occurs. Thus, we have modes of 21, 25, 27, and 28.
Note: (i) Because we have a measurement for each person in the class, the displayed
value x̄ is actually the true population mean µ. If the data were only a random sample
from a larger population, then we would use x̄ ≈ 25.16129 to denote the sample mean
and we would use S ≈ 4.0174 to denote the sample standard deviation.
(ii) From the displayed values, we also can compute $\sigma^2$ and $\sigma$ by
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \mu^2 = \frac{20110}{31} - \left(\frac{780}{31}\right)^2 \approx 15.61914672$$
and $\sigma = \sqrt{\sigma^2} \approx 3.952106618$.
(iii) Other statistics shown are the minimum value 18, the maximum 34, the first
quartile Q1 = 21, and the third quartile Q3 = 28. The first quartile is the median of the
lower 15 values (those listed before the 16th), and the third quartile is the median of the upper 15 values.
Frequency Charts
Often data sets with many measurements are given in a frequency chart that gives the
number of occurrences for each measurement.
Measurement   Frequency
x_1           k_1
x_2           k_2
x_3           k_3
...           ...
x_m           k_m
Then $n = k_1 + \cdots + k_m$ is the total number of measurements. So the mean µ is actually a
weighted average given by
$$\mu = \frac{k_1 x_1 + \cdots + k_m x_m}{n}.$$
In computing the variance $\sigma^2$, only the measurements are squared in the first part,
not the frequencies:
$$\sigma^2 = \frac{k_1 x_1^2 + \cdots + k_m x_m^2}{n} - \mu^2.$$
Example 8. Suppose all the households in a neighborhood are surveyed as to how many
children live at home. The responses are below:
Number of children   Number of households
0                    60
1                    42
2                    86
3                    59
4                    22
5                    4
6                    2
Find the mean µ , the median, the mode, and the standard deviation σ .
Calculator Solution. We enter the measurements under list L1 and the frequencies under
list L2. Then enter the command 1–Var Stats L1, L2.
We see that there are 275 measurements, with the average number of children per
household being µ = 511/275 ≈ 1.8582.
Because 275/2 = 137.5, the median is the 138th measurement which is a 2. The
mode is also 2 since it occurs most often. Lastly,
$$\sigma^2 = \frac{1441}{275} - \left(\frac{511}{275}\right)^2 \approx 1.78716 \quad \text{and} \quad \sigma = \sqrt{\sigma^2} \approx 1.33685.$$
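The two sums used here come straight from the frequency chart, with only the measurements squared and not the frequencies:
$$\sum k_i x_i = 60(0) + 42(1) + 86(2) + 59(3) + 22(4) + 4(5) + 2(6) = 511,$$
$$\sum k_i x_i^2 = 60(0) + 42(1) + 86(4) + 59(9) + 22(16) + 4(25) + 2(36) = 1441.$$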
Exercises
1. The grades (out of 100) on Test 1 for a MATH 317 class were as follows:
{48, 71, 72, 78, 84, 84, 86, 90, 94, 94, 94, 96, 100}
(a) Give the distribution of A's, B's, etc. on a standard scale 90%, 80%, etc.
(b) Compute the mean, median, mode, and standard deviation for the grades on this
test.
(c) Compute the distribution of grades on the following “normalized” scale:
Grade   Range
A       At least µ + σ
B       [µ + 0.5σ, µ + σ)
C       [µ – 0.5σ, µ + 0.5σ)
D       [µ – σ, µ – 0.5σ)
F       Below µ – σ
Now consider the Test 2 grades of the class. One student has dropped and the
grades are (again out of 100): {48, 58, 62, 69, 74, 75, 79, 80, 82, 84, 88, 89}
(d) Redo parts (a), (b), and (c) with these grades.
(e) In your opinion, should final class grades be assigned on a standard scale
or on a normalized scale? Explain. When, if ever, should an individual student's grade be curved?
2. All the female models at an agency were measured for shoe size in order to be fitted
for a new line of shoes. The sizes were:
Size   Models
5.5    2
6      5
6.5    12
7      14
7.5    16
8      18
8.5    17
9      4
9.5    2
(a) Compute the mean, median, mode, and standard deviation of these shoe sizes.
(b) What percentage of these models are above the agency average in female shoe size?
(c) What percentage of these models are within one standard deviation of average
female shoe size?
```
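The calculator steps in Examples 6 through 8 can also be reproduced in a few lines of code. The following is a minimal sketch, not part of the original notes: the helper name `population_stats` is hypothetical, it uses the population formulas from the notes (dividing by n, not n − 1), and it uses the "middle value" listing convention for the median rather than the general cumulative definition.

```python
from collections import Counter
from fractions import Fraction
from math import sqrt


def population_stats(values, freqs=None):
    """Population mean, variance, sd, median, and modes of a finite data list.

    If freqs is given, values[i] is counted freqs[i] times (a frequency chart).
    (Hypothetical helper written for illustration only.)
    """
    if freqs is None:
        freqs = [1] * len(values)
    counts = Counter()
    for x, k in zip(values, freqs):
        counts[x] += k
    n = sum(counts.values())

    # Mean: weighted average of the distinct measurements.
    mean = sum(k * Fraction(x) for x, k in counts.items()) / n
    # Variance: average of the squares minus the square of the average.
    # Only the measurements are squared, not the frequencies.
    mean_sq = sum(k * Fraction(x) ** 2 for x, k in counts.items()) / n
    variance = mean_sq - mean ** 2

    # Median by the listing convention: middle value, or the average
    # of the two middle values when n is even.
    ordered = sorted(counts.elements())
    mid = n // 2
    if n % 2 == 1:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2

    # Modes: every value that attains the largest frequency.
    top = max(counts.values())
    modes = sorted(x for x, k in counts.items() if k == top)

    return {
        "n": n,
        "mean": mean,
        "variance": variance,
        "sd": sqrt(variance),
        "median": median,
        "modes": modes,
    }


# Example 6: the 12 listed weights.
weights = [120, 122, 125, 130, 130, 145, 150, 150, 155, 165, 180, 180]
ex6 = population_stats(weights)
print(ex6["mean"], ex6["variance"], ex6["median"], ex6["modes"])

# Example 8: frequency chart of children per household.
children = [0, 1, 2, 3, 4, 5, 6]
households = [60, 42, 86, 59, 22, 4, 2]
ex8 = population_stats(children, households)
print(ex8["n"], float(ex8["mean"]), float(ex8["variance"]), ex8["sd"])
```

Exact fractions are used for the mean and variance so that results such as 511/275 are not rounded before the final step. The same function can be applied to the exercise data by passing the sizes and model counts as `values` and `freqs`.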