RANDOM VARIABLES and PROBABILITY DISTRIBUTIONS

A random variable X is a function that assigns (real) numbers to the elements of the sample space S of a random experiment.

The value space V of a random variable is the set of all possible values of the r.v. X.

A discrete random variable is one whose value space is finite or countably infinite. A continuous r.v. is one whose value space is an interval of real numbers.

The probability function of a discrete random variable is the function that specifies the probabilities of the r.v. assuming each of the various values in the value space. That is, $p(x) = P(X = x)$.

The probability distribution of a discrete random variable consists of its value space together with its probability function. That is, the "possibilities" together with their "probabilities". It may be specified in function form or may be presented as a table of values or as a probability histogram.

EXAMPLE: A "hand" of 5 cards is dealt from a thoroughly shuffled deck of 52 cards. There are $\binom{52}{5} = 2{,}598{,}960$ different possible hands in the sample space S. Let X denote the number of "Hearts" in the hand so dealt. Then the value space is V = {0, 1, 2, 3, 4, 5}. (V lists the possibilities.)

The probabilities are:

$p(0) = P(X = 0) = \frac{\binom{13}{0}\binom{39}{5}}{\binom{52}{5}} \approx 0.2215$

$p(1) = P(X = 1) = \frac{\binom{13}{1}\binom{39}{4}}{\binom{52}{5}} \approx 0.4114$

$p(2) = P(X = 2) = \frac{\binom{13}{2}\binom{39}{3}}{\binom{52}{5}} \approx 0.2743$

$p(3) = P(X = 3) = \frac{\binom{13}{3}\binom{39}{2}}{\binom{52}{5}} \approx 0.0815$

$p(4) = P(X = 4) = \frac{\binom{13}{4}\binom{39}{1}}{\binom{52}{5}} \approx 0.0107$

$p(5) = P(X = 5) = \frac{\binom{13}{5}\binom{39}{0}}{\binom{52}{5}} \approx 0.0005$

The probability function for this r.v. X is

$p(x) = P(X = x) = \frac{\binom{13}{x}\binom{39}{5-x}}{\binom{52}{5}}$

The probability distribution of this r.v. X is

$p(x) = P(X = x) = \frac{\binom{13}{x}\binom{39}{5-x}}{\binom{52}{5}}$ for $x \in V$; i.e. for $x = 0, 1, 2, 3, 4, 5$.

Probability Distribution in table form

Possible Value: x =     0        1        2        3        4        5        Total
Probability: p(x) =     0.2215   0.4114   0.2743   0.0815   0.0107   0.0005   0.9999 or 1.0000

Basic Requirements of p(x)
a. $0 \le p(x) \le 1$ for each $x \in V$
b. $\sum_{\text{all } x \in V} p(x) = 1$

Histogram of Probability Distribution

[Figure: Probability Histogram for Number of Hearts in 5-Card Hand. Vertical axis: Probability (0.00 to 0.45). Horizontal axis: Number of Hearts (Value of X), 0 to 5.]

Question: What is the average number of Hearts in a 5-card hand selected this way?

Definition: The mean or expected value of a discrete random variable X is defined as

$\mu = E(X) = \sum_{\text{all } x \in V} x \, p(x)$

Example:

$\mu = E(X) = \sum_{\text{all } x \in V} x \, p(x) = \sum_{x=0}^{5} x \, p(x)$
$= 0(0.2215) + 1(0.4114) + 2(0.2743) + 3(0.0815) + 4(0.0107) + 5(0.0005)$
$= 0 + 0.4114 + 0.5486 + 0.2445 + 0.0428 + 0.0025$
$= 1.2498$

The exact value is 1.25; the result is off because the probabilities were rounded to 4 decimal places.

Calculating the Mean in Table Form

x =          0        1        2        3        4        5        Total
p(x) =       0.2215   0.4114   0.2743   0.0815   0.0107   0.0005   0.9999
x p(x) =     0.0000   0.4114   0.5486   0.2445   0.0428   0.0025   $\mu$ = 1.2498

Definition: The Variance of a discrete r.v. X is

$\sigma^2 = Var(X) = E[(X - \mu)^2] = \sum_{\text{all } x \in V} (x - \mu)^2 \, p(x)$

Example:

$\sigma^2 = Var(X) = E[(X - \mu)^2] = \sum_{x=0}^{5} (x - 1.25)^2 \, p(x)$
$= (0 - 1.25)^2(0.2215) + (1 - 1.25)^2(0.4114) + (2 - 1.25)^2(0.2743) + (3 - 1.25)^2(0.0815) + (4 - 1.25)^2(0.0107) + (5 - 1.25)^2(0.0005)$
$= 0.34609375 + 0.0257125 + 0.15429375 + 0.24959375 + 0.08091875 + 0.00703125$
$= 0.86364375$

[Based on other information, the correct value without rounding error is 0.86397.]

Comment: The variance is the expected value of the quantity $(X - \mu)^2$. It is a measure of the amount of variability or variation to be expected among the possible values of a random variable.
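The Hearts example can also be checked without rounding. The following is a minimal sketch, assuming Python's standard `math.comb` and `fractions.Fraction`; the function name `hearts_pmf` and its parameter names are illustrative choices, not part of the notes. It evaluates the probability function exactly and then applies the definitions of the mean and the variance.

```python
from fractions import Fraction
from math import comb


def hearts_pmf(x, hand=5, hearts=13, deck=52):
    """Exact P(X = x): probability of x Hearts in a `hand`-card deal.

    `hearts_pmf` and its parameter names are hypothetical helpers
    introduced for this sketch.
    """
    ways = comb(hearts, x) * comb(deck - hearts, hand - x)
    return Fraction(ways, comb(deck, hand))


V = range(6)                                   # value space {0, ..., 5}
p = {x: hearts_pmf(x) for x in V}              # probability function

total = sum(p.values())                        # requirement b: should be 1
mean = sum(x * p[x] for x in V)                # E(X) by the definition
variance = sum((x - mean) ** 2 * p[x] for x in V)  # E[(X - mu)^2]

for x in V:
    print(x, round(float(p[x]), 4))
print("sum of p(x):", total)
print("mean:", mean, "=", float(mean))
print("variance:", variance, "=", round(float(variance), 5))
```

Because the arithmetic is done with exact fractions, the mean comes out as 5/4 and the variance rounds to 0.86397, matching the unrounded values quoted in the notes.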
Definition: The expected value of $X^2$, the square of X, is

$E(X^2) = \sum_{\text{all } x \in V} x^2 \, p(x)$

Example:

$E(X^2) = \sum_{\text{all } x \in V} x^2 \, p(x) = \sum_{x=0}^{5} x^2 \, p(x)$

Calculating $E(X^2)$ in Table Form

x =            0        1        2        3        4        5        Total
p(x) =         0.2215   0.4114   0.2743   0.0815   0.0107   0.0005   0.9999
x^2 p(x) =     0.0000   0.4114   1.0972   0.7335   0.1712   0.0125   $E(X^2)$ = 2.4258

Theorem: The variance of any random variable X can be determined by

$\sigma^2 = Var(X) = E[(X - \mu)^2]$ (the definition)
$= E(X^2) - \mu^2$ (a useful calculation method)

Example:

$\sigma^2 = E(X^2) - \mu^2 = 2.4258 - (1.2498)^2 = 2.4258 - 1.5620 = 0.8638$

Actually, $E(X^2) = \frac{165}{68} = 2.426470588$, so that

$\sigma^2 = E(X^2) - \mu^2 = \frac{165}{68} - \left(\frac{5}{4}\right)^2 = 2.4265 - 1.5625 = 0.86397$

which is what we had stated earlier. The rounding error in $E(X^2)$ (2.4258 versus 2.4265) is larger than the rounding error in $E(X)$ (1.2498 versus 1.25) because each rounded probability is multiplied by $x^2$ rather than by $x$.

Definition: The Standard Deviation of a r.v. X is the square root of its variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{Var(X)}$

Example: $\sigma = \sqrt{\sigma^2} = \sqrt{0.86397} = 0.9295$

Comment: The standard deviation is the most commonly used measure of variation or variability of the values of a random variable.

Empirical Rule

If the shape of a probability distribution is mound-shaped and fairly symmetric, then the amount of probability between:
a. $\mu - \sigma$ and $\mu + \sigma$ is about 0.68
b. $\mu - 2\sigma$ and $\mu + 2\sigma$ is about 0.95
c. $\mu - 3\sigma$ and $\mu + 3\sigma$ is almost 1.00

For a discrete random variable, look at its histogram. The histogram for the above example of Hearts in a 5-card hand is not symmetric, but it does have a "mound" or high region. How well does the Empirical Rule apply in this case?

Here $\mu = 1.25$ and $\sigma = 0.93$. The interval between $\mu - \sigma$ and $\mu + \sigma$ is from $1.25 - 0.93 = 0.32$ to $1.25 + 0.93 = 2.18$. Thus,

$P(\mu - \sigma < X < \mu + \sigma) = P(0.32 < X < 2.18) = P(X = 1 \text{ or } 2) = 0.4114 + 0.2743 = 0.6857$

which is quite close to the predicted value of 0.68, even though the histogram is not symmetric and we would therefore not expect the Empirical Rule to work very well.

Similarly, the interval between $\mu - 2\sigma$ and $\mu + 2\sigma$ is from $1.25 - 1.86 = -0.61$ to $1.25 + 1.86 = 3.11$, and

$P(\mu - 2\sigma < X < \mu + 2\sigma) = P(-0.61 < X < 3.11) = P(X = 0 \text{ or } 1 \text{ or } 2 \text{ or } 3) = 0.2215 + 0.4114 + 0.2743 + 0.0815 = 0.9887$

This value is considerably higher than the Empirical Rule value of 0.95.
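The calculation method $E(X^2) - \mu^2$ and the Empirical Rule check can be reproduced from the rounded table of probabilities. This is only a sketch: the dictionary `p` copies the 4-decimal probabilities from the notes, and the helper name `prob_within` is an assumption made for illustration.

```python
from math import sqrt

# Rounded probabilities copied from the table in the notes.
p = {0: 0.2215, 1: 0.4114, 2: 0.2743, 3: 0.0815, 4: 0.0107, 5: 0.0005}

mu = sum(x * px for x, px in p.items())          # E(X) from rounded values
ex2 = sum(x ** 2 * px for x, px in p.items())    # E(X^2) from rounded values
var_shortcut = ex2 - mu ** 2                     # E(X^2) - mu^2
sigma = sqrt(var_shortcut)                       # standard deviation


def prob_within(k):
    """Probability strictly between mu - k*sigma and mu + k*sigma.

    `prob_within` is a hypothetical helper for this sketch.
    """
    lo, hi = mu - k * sigma, mu + k * sigma
    return sum(px for x, px in p.items() if lo < x < hi)


print("E(X)   =", round(mu, 4))
print("E(X^2) =", round(ex2, 4))
print("Var(X) =", round(var_shortcut, 4))
print("sigma  =", round(sigma, 4))
for k in (1, 2):
    print(f"within {k} sigma:", round(prob_within(k), 4))
```

Since the inputs are the rounded probabilities, the results agree with the hand calculations in the notes (1.2498, 2.4258, 0.8638, 0.6857 and 0.9887), not with the exact values.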
Another example with a perfectly symmetric and quite mound-shaped distribution follows.

Example: If one tosses a coin 10 times and counts the number of Heads observed in the 10 tosses, the probability distribution of the random variable Y = number of Heads in 10 tosses has the probability histogram given below. For this random variable, $\mu = 5.00$ and $\sigma = 1.58$.

[Figure: Number of Heads in 10 Tosses of a Coin. Vertical axis: Probability (0.00 to 0.25). Horizontal axis: Number of Heads (Value of Y), 0 to 10.]

Reading probabilities from the histogram as accurately as you can, check to see how well the Empirical Rule works in this case.

Example: A random variable X is defined as the number of accidents a randomly chosen Saskatchewan driver has in a one-year period. Using accident records maintained by SGI over the past ten years, the probability distribution for r.v. X was determined to be as follows.

Number: x =                  0      1      2      3      4
Probability: P[X = x] =      0.58   0.24   0.13   0.04   0.01

How many accidents does one expect a typical Saskatchewan driver to have in a 12-month period? How much variability does one expect to observe about this expected number?
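One way to check answers to the two examples above is sketched below. It assumes the coin is fair (consistent with the stated mean of 5.00 and standard deviation of 1.58) and computes the coin-toss probabilities from the binomial formula instead of reading them off the histogram. The helper names `summarize` and `prob_within` are illustrative, not from the notes.

```python
from math import comb, sqrt


def summarize(dist):
    """Return (mean, variance, std dev) of a {value: probability} table."""
    mu = sum(x * p for x, p in dist.items())
    var = sum(x ** 2 * p for x, p in dist.items()) - mu ** 2  # E(X^2) - mu^2
    return mu, var, sqrt(var)


def prob_within(dist, k):
    """Probability strictly between mu - k*sigma and mu + k*sigma."""
    mu, _, sigma = summarize(dist)
    return sum(p for x, p in dist.items() if mu - k * sigma < x < mu + k * sigma)


# Y = number of Heads in 10 tosses, ASSUMING a fair coin.
coin = {y: comb(10, y) / 2 ** 10 for y in range(11)}

# Accident distribution copied from the table in the notes.
accidents = {0: 0.58, 1: 0.24, 2: 0.13, 3: 0.04, 4: 0.01}

mu_y, var_y, sd_y = summarize(coin)
print("coin: mean", mu_y, "sd", round(sd_y, 2))
for k in (1, 2, 3):
    print(f"coin: within {k} sd:", round(prob_within(coin, k), 4))

mu_x, var_x, sd_x = summarize(accidents)
print("accidents: mean", round(mu_x, 2), "variance", round(var_x, 4),
      "sd", round(sd_x, 4))
```

The same two helpers work for any discrete distribution given as a table, so the Hearts example could be passed through them as well.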