STA301 – Statistics and Probability

Lecture no 23:
• Graphical Representation of the Distribution Function of a Discrete Random Variable
• Mathematical Expectation
• Mean, Variance and Moments of a Discrete Probability Distribution
• Properties of Expected Values

First, let us consider the concept of the DISTRIBUTION FUNCTION of a discrete random variable.

DISTRIBUTION FUNCTION:
The distribution function of a random variable X, denoted by F(x), is defined by F(x) = P(X ≤ x). The function F(x) gives the probability of the event that X takes a value LESS THAN OR EQUAL TO a specified value x. The distribution function is abbreviated to d.f. and is also called the cumulative distribution function (cdf), as it accumulates the probabilities of the random variable X from its smallest value up to a specific value x.

EXAMPLE:
Find the probability distribution and distribution function for the number of heads when 3 balanced coins are tossed. Depict both the probability distribution and the distribution function graphically.

Since the coins are balanced, the equiprobable sample space for this experiment is S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}. Let X be the random variable that denotes the number of heads. Then the values of X are 0, 1, 2 and 3.
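As a cross-check on the enumeration above, the following minimal Python sketch lists the sample space, counts the heads in each outcome, and evaluates F(x) = P(X ≤ x) directly from the definition. The names sample_space, pmf and cdf are illustrative choices, not part of the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 equally likely outcomes of tossing three balanced coins.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=3)]

# X = number of heads; count how many outcomes give each value of X.
counts = Counter(outcome.count("H") for outcome in sample_space)

# f(x) = (outcomes with x heads) / (total outcomes), kept as exact fractions.
pmf = {x: Fraction(counts[x], len(sample_space)) for x in sorted(counts)}


def cdf(x):
    """F(x) = P(X <= x): add f(k) over every possible value k not exceeding x."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))


for x in pmf:
    print(f"x = {x}: f(x) = {pmf[x]}, F(x) = {cdf(x)}")
```

Note that Fraction reduces its results to lowest terms, so 4/8 is printed as 1/2. Calling cdf with a non-integer argument such as 0.5 or 2.7 shows that F(x) stays constant between the integer values of X.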
Their probabilities are:

f(0) = P(X = 0) = P[{TTT}] = 1/8
f(1) = P(X = 1) = P[{HTT, THT, TTH}] = 3/8
f(2) = P(X = 2) = P[{HHT, HTH, THH}] = 3/8
f(3) = P(X = 3) = P[{HHH}] = 1/8

Expressing the above information in tabular form, we obtain the desired probability distribution of X:

Number of Heads (xi)    Probability f(xi)
0                       1/8
1                       3/8
2                       3/8
3                       1/8
Total                   1

The line chart of the above probability distribution is as follows:

[Figure: line chart of f(x) against x, for x = 0, 1, 2, 3]

In order to obtain the distribution function of this random variable, we compute the cumulative probabilities as follows:

Number of Heads (xi)    Probability f(xi)    Cumulative Probability F(xi)
0                       1/8                  1/8
1                       3/8                  1/8 + 3/8 = 4/8
2                       3/8                  4/8 + 3/8 = 7/8
3                       1/8                  7/8 + 1/8 = 1

Hence the desired distribution function is

$$
F(x) =
\begin{cases}
0,   & \text{for } x < 0 \\
1/8, & \text{for } 0 \le x < 1 \\
4/8, & \text{for } 1 \le x < 2 \\
7/8, & \text{for } 2 \le x < 3 \\
1,   & \text{for } x \ge 3
\end{cases}
$$

Why has the distribution function been expressed in this manner? The answer to this question is:

INTERPRETATION:
If x < 0, we have P(X ≤ x) = 0, the reason being that it is not possible for our random variable X to assume a value less than zero. (The minimum number of heads that we can have in tossing three coins is zero.)

If 0 < x < 1, we note that it is not possible for our random variable X to assume any value between zero and one. (We will have no head or one head, but we will NOT have 1/3 heads or 2/5 heads!) Hence the probabilities of all such values are zero, and we obtain a situation that can be explained through the following table:

Number of Heads (xi)    Probability f(xi)    Cumulative Probability F(xi)
0                       1/8                  1/8
0.2                     0                    1/8 + 0 = 1/8
0.4                     0                    1/8 + 0 = 1/8
0.6                     0                    1/8 + 0 = 1/8
0.8                     0                    1/8 + 0 = 1/8
1                       3/8                  1/8 + 3/8 = 4/8

The above table clearly shows that, for any value x from zero up to 0.9999…, the probability that X is LESS THAN OR EQUAL TO x is equal to the probability of X = 0, that is:
For 0 ≤ x < 1, we have

$$P(X \le x) = P(X = 0) = \frac{1}{8};$$

Similarly, for 1 ≤ x < 2, we have

$$P(X \le x) = P(X = 0) + P(X = 1) = \frac{1}{8} + \frac{3}{8} = \frac{4}{8};$$

For 2 ≤ x < 3, we have

$$P(X \le x) = P(X = 0) + P(X = 1) + P(X = 2) = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} = \frac{7}{8};$$

And, finally, for x ≥ 3, we have

$$P(X \le x) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = \frac{8}{8} = 1.$$

Hence, the graph of the DISTRIBUTION FUNCTION is as follows:

[Figure: graph of the distribution function F(x) against x]

As this graph resembles the steps of a staircase, it is known as a step function. It is also known as a jump function (as it takes jumps at integral values of X).

In some books, the graph of the distribution function is given as shown in the following figure:

[Figure: alternative drawing of the graph of F(x) against x]

In what way do we interpret the above distribution function from a REAL-LIFE point of view? If we toss three balanced coins, the probability that we obtain at most one head is 4/8, the probability that we obtain at most two heads is 7/8, and so on.

Let us consider another interesting example to illustrate the concepts of a discrete probability distribution and its distribution function:

EXAMPLE:
A large store places its last 15 clock radios in a clearance sale. Unknown to anyone, 5 of the radios are defective. If a customer tests 3 different clock radios selected at random, what is the probability distribution of X, where X represents the number of defective radios in the sample?

SOLUTION:
We have:

Type of Clock Radio    Number of Clock Radios
Good                   10
Defective              5
Total                  15

The total number of ways of selecting 3 radios out of 15 is $\binom{15}{3}$. Also, the total number of ways of selecting 3 good radios (and no defective radio) is $\binom{10}{3}\binom{5}{0}$. Hence, the probability of X = 0 is

$$P(X = 0) = \frac{\binom{10}{3}\binom{5}{0}}{\binom{15}{3}} = \frac{120}{455} \approx 0.26.$$

The probabilities of X = 1, 2, and 3 are computed in a similar way.
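The counting argument above can be sketched in Python as follows. This is a minimal illustration that uses math.comb for the binomial coefficients; the constant and function names (GOOD, DEFECTIVE, SAMPLE, prob_defective) are assumptions made for this sketch.

```python
from math import comb

GOOD, DEFECTIVE, SAMPLE = 10, 5, 3

# Number of ways of selecting 3 radios out of 15.
total_ways = comb(GOOD + DEFECTIVE, SAMPLE)


def prob_defective(x):
    """P(X = x): choose x of the 5 defective radios and 3 - x of the 10 good ones."""
    return comb(DEFECTIVE, x) * comb(GOOD, SAMPLE - x) / total_ways


pmf = {x: prob_defective(x) for x in range(SAMPLE + 1)}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p:.4f}")
```

The unrounded probabilities add up to exactly 1; rounding each of them to two decimal places is what makes a tabulated total fall slightly short of 1.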
Hence, we obtain the following probability distribution:

Number of defective clock radios in the sample (x)    Probability f(x)
0                                                     0.26
1                                                     0.49
2                                                     0.22
3                                                     0.02
Total                                                 0.99 ≈ 1

(The total is 0.99 rather than exactly 1 only because each probability has been rounded to two decimal places.)

The line chart of this distribution is:

[Figure: line chart of f(x) against x, for x = 0, 1, 2, 3]

As indicated by the above diagram, it is not necessary for a probability distribution to be symmetric; it can be positively or negatively skewed.

The distribution function of the above probability distribution is obtained as follows:

Number of defective clock radios in the sample (x)    f(x)        F(x)
0                                                     0.26        0.26
1                                                     0.49        0.75
2                                                     0.22        0.97
3                                                     0.02        0.99 ≈ 1
Total                                                 0.99 ≈ 1

INTERPRETATION:
The probability that the sample of 3 clock radios contains at most one defective radio is 0.75, the probability that the sample contains at most two defective radios is 0.97, and so on.

Next, we consider the concept of MATHEMATICAL EXPECTATION.

Let a discrete random variable X have possible values x1, x2, …, xn with corresponding probabilities f(x1), f(x2), …, f(xn) such that Σ f(xi) = 1. Then the mathematical expectation, or the expectation, or the expected value of X, denoted by E(X), is defined as

$$E(X) = x_1 f(x_1) + x_2 f(x_2) + \dots + x_n f(x_n) = \sum_{i=1}^{n} x_i f(x_i).$$

E(X) is also called the mean of X and is usually denoted by the letter $\mu$. The expression

$$E(X) = \sum_{i=1}^{n} x_i f(x_i)$$

may be regarded as a weighted mean of the variable's possible values x1, x2, …, xn, each being weighted by the respective probability. In case the values are equally likely,

$$E(X) = \frac{1}{n} \sum_{i=1}^{n} x_i,$$

which represents the ordinary arithmetic mean of the n possible values. It should be noted that E(X) is the average value of the random variable X over a VERY LARGE number of trials.

Let us now consider an interesting example:

EXAMPLE:
If it rains, an umbrella salesman can earn $30 per day. If it is fair, he can lose $6 per day. What is his expectation if the probability of rain is 0.3?
SOLUTION:
Let X represent the number of dollars the salesman earns. Then X is a random variable with possible values 30 and –6 (where –6 corresponds to the fact that the salesman loses), and the corresponding probabilities are 0.3 and 0.7 respectively. Hence, we have:

EVENT      AMOUNT EARNED ($) x    PROBABILITY P(x)
Rain       30                     0.3
No Rain    –6                     0.7
Total                             1

In order to compute the expected value of X, we carry out the following computation:

EVENT      AMOUNT EARNED ($) x    PROBABILITY P(x)    xP(x)
Rain       30                     0.3                 9.0
No Rain    –6                     0.7                 –4.2
Total                             1                   4.8

Hence E(X) = $4.80 per day, i.e. on the average, the salesman can expect to earn 4.8 dollars per day.

Until now, we have considered the mathematical expectation of the random variable X. But, in many situations, we may be interested in the mathematical expectation of some FUNCTION of X:

EXPECTATION OF A FUNCTION OF A RANDOM VARIABLE:
Let H(X) be a function of the random variable X. Then H(X) is also a random variable and also has an expected value (as any function of a random variable is also a random variable). If X is a discrete random variable with probability distribution f(x), then, since H(X) takes the value H(xi) when X = xi, the expected value of the function H(X) is

$$E[H(X)] = H(x_1) f(x_1) + H(x_2) f(x_2) + \dots + H(x_n) f(x_n) = \sum_{i} H(x_i) f(x_i),$$

provided the series converges absolutely.

In particular, if $H(X) = (X - \mu)^2$, where $\mu$ is the population mean, then

$$E(X - \mu)^2 = \sum_{i} (x_i - \mu)^2 f(x_i).$$

We call this expected value the variance and denote it by Var(X) or $\sigma^2$. And, since $E(X - \mu)^2 = E(X^2) - [E(X)]^2$, the short-cut formula for the variance is

$$\sigma^2 = E(X^2) - [E(X)]^2.$$

The positive square root of the variance, as before, is called the standard deviation.

More generally, if $H(X) = X^k$, k = 1, 2, 3, …, then

$$E(X^k) = \sum_{i} x_i^k f(x_i),$$

which we call the kth moment about the origin of the random variable X, and we denote it by $\mu'_k$.
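The definitions above can be tried out on the umbrella salesman's distribution. The following is a minimal Python sketch, not a definitive implementation; the helper name expect and the use of exact fractions for the probabilities 0.3 and 0.7 are choices made for this illustration.

```python
from fractions import Fraction
from math import sqrt

# Distribution of the salesman's daily earnings: value -> probability.
pmf = {30: Fraction(3, 10), -6: Fraction(7, 10)}


def expect(h, pmf):
    """E[H(X)] = sum of H(x_i) * f(x_i) over all possible values x_i."""
    return sum(h(x) * p for x, p in pmf.items())


mean = expect(lambda x: x, pmf)                           # E(X)

# Variance from the definition and from the short-cut formula.
var_definition = expect(lambda x: (x - mean) ** 2, pmf)   # E(X - mu)^2
var_shortcut = expect(lambda x: x ** 2, pmf) - mean ** 2  # E(X^2) - [E(X)]^2

std_dev = sqrt(var_shortcut)

# First three moments about the origin, E(X^k) for k = 1, 2, 3.
raw_moments = {k: expect(lambda x, k=k: x ** k, pmf) for k in (1, 2, 3)}

print("E(X) =", float(mean))
print("Var(X) =", float(var_definition), "=", float(var_shortcut))
print("Standard deviation =", round(std_dev, 2))
print("Moments about the origin:", {k: float(m) for k, m in raw_moments.items()})
```

Both variance computations give the same value, which illustrates the short-cut formula, and the first moment about the origin coincides with the mean.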
Similarly, if $H(X) = (X - \mu)^k$, k = 1, 2, 3, …, then we get an expected value, called the kth moment about the mean of the random variable X, which we denote by $\mu_k$. That is:

$$\mu_k = E(X - \mu)^k = \sum_{i} (x_i - \mu)^k f(x_i).$$

The skewness of a probability distribution is often measured by

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}$$

and kurtosis by

$$\beta_2 = \frac{\mu_4}{\mu_2^2}.$$

These moment-ratios assist us in determining the skewness and kurtosis of our probability distribution in exactly the same way as was discussed in the case of frequency distributions.

Next, we discuss some important properties of mathematical expectation.

PROPERTIES OF MATHEMATICAL EXPECTATION

1. If c is a constant, then E(c) = c. Thus the expected value of a constant is the constant itself.

This point can be understood easily by considering the following example: Suppose that a very difficult test was given to students by a professor, and that every student obtained 2 marks out of 20! It is obvious that the mean mark is also 2. Since the variable 'marks' was a constant, its expected value was equal to itself.

2. If X is a discrete random variable and if a and b are constants, then E(aX + b) = a E(X) + b.

Let us verify this from the following example:

EXAMPLE:
Let X represent the number of heads that appear when three fair coins are tossed. The probability distribution of X is:

x        P(x)
0        1/8
1        3/8
2        3/8
3        1/8
Total    1

The expected value of X is obtained as follows:

x        P(x)    xP(x)
0        1/8     0
1        3/8     3/8
2        3/8     6/8
3        1/8     3/8
Total    1       12/8 = 1.5

Hence, E(X) = 1.5.

Suppose that we are interested in finding the expected value of the random variable 2X + 3. Then we carry out the following computations:

x        2x + 3    P(x)    (2x + 3)P(x)
0        3         1/8     3/8
1        5         3/8     15/8
2        7         3/8     21/8
3        9         1/8     9/8
Total              1       48/8 = 6

Hence E(2X + 3) = 6.

It should be noted that E(2X + 3) = 6 = 2(1.5) + 3 = 2E(X) + 3, i.e. E(aX + b) = a E(X) + b.
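Both properties, and the moment-ratios defined earlier, can be checked numerically on the three-coin distribution. The sketch below is illustrative only; the helper names expect and central_moment are assumptions of this example, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Number of heads in three tosses of a fair coin: value -> probability.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}


def expect(h):
    """E[H(X)] = sum of H(x) * P(x) over the possible values of X."""
    return sum(h(x) * p for x, p in pmf.items())


def central_moment(k):
    """kth moment about the mean, E(X - mu)^k."""
    return expect(lambda x: (x - mean) ** k)


mean = expect(lambda x: x)

# Property 1: the expected value of a constant is the constant itself.
expected_constant = expect(lambda x: 2)

# Property 2: E(aX + b) = a E(X) + b, here with a = 2 and b = 3.
a, b = 2, 3
lhs = expect(lambda x: a * x + b)
rhs = a * mean + b

# Moment-ratios: beta_1 = mu_3^2 / mu_2^3 and beta_2 = mu_4 / mu_2^2.
beta1 = central_moment(3) ** 2 / central_moment(2) ** 3
beta2 = central_moment(4) / central_moment(2) ** 2

print("E(X) =", mean)
print("E(2) =", expected_constant)
print("E(2X + 3) =", lhs, "and 2 E(X) + 3 =", rhs)
print("beta_1 =", beta1, "and beta_2 =", beta2)
```

Because this distribution is symmetric about its mean, its third moment about the mean is zero, so the sketch reports a skewness measure of zero.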
Virtual University of Pakistan