
STA301 – Statistics and Probability
Virtual University of Pakistan

Lecture No. 24

• Chebychev's Inequality
• Concept of Continuous Probability Distribution
• Mathematical Expectation, Variance & Moments of a Continuous Probability Distribution

We begin with a discussion of Chebychev's Inequality in the case of a discrete probability distribution.

Chebychev's Inequality:
If X is a random variable having mean $\mu$ and variance $\sigma^2 > 0$, and k is any positive constant, then the probability that a value of X falls within k standard deviations of the mean is at least $1 - 1/k^2$. That is:

$$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}$$

Alternatively, we may state Chebychev's theorem as follows: Given the probability distribution of the random variable X with mean $\mu$ and standard deviation $\sigma$, the probability of observing a value of X that differs from $\mu$ by k or more standard deviations cannot exceed $1/k^2$.

As indicated earlier, this inequality is due to the Russian mathematician P.L. Chebychev (1821-1894), and it provides a means of understanding how the standard deviation measures variability about the mean of a random variable. It holds for all probability distributions having finite mean and variance.

Let us apply this concept to the example of the number of petals on the flowers of a particular species that we considered earlier:

EXAMPLE:
If a biologist is interested in the number of petals on a particular flower, this number may take the values 3, 4, 5, 6, 7, 8, 9, and each one of these numbers will have its own probability. The probability distribution of the random variable X is:

No. of Petals (x)    P(x)
3                    0.05
4                    0.10
5                    0.20
6                    0.30
7                    0.25
8                    0.075
9                    0.025
Total                1

The mean of this distribution is:

$$\mu = E(X) = \sum x\,P(x) = 5.925 \approx 5.9$$

And the standard deviation of this distribution is:

$$\sigma = \text{S.D.}(X) = \sqrt{36.925 - (5.925)^2} = \sqrt{36.925 - 35.106} = \sqrt{1.819} \approx 1.3$$

where $36.925 = \sum x^2 P(x)$.

According to Chebychev's inequality, the probability is at least $1 - 1/2^2 = 1 - 1/4 = 3/4 = 0.75$ that X will lie between $\mu - 2\sigma$ and $\mu + 2\sigma$, i.e. between 5.9 - 2(1.3) and 5.9 + 2(1.3), i.e. between 3.3 and 8.5.

Let us have another look at the probability distribution above. According to this distribution, the probability that X lies between 3.3 and 8.5 is 0.10 + 0.20 + 0.30 + 0.25 + 0.075 = 0.925, which is greater than 0.75 (as indicated by Chebychev's inequality).

Finally, and most importantly, we will use the concepts in Chebychev's Rule and the Empirical Rule to build the foundation for statistical inference-making. The method is illustrated in the next example.

EXAMPLE:
Suppose you invest a fixed sum of money in each of five business ventures. Assume you know that 70% of such ventures are successful, the outcomes of the ventures are independent of one another, and the probability distribution for the number, x, of successful ventures out of five is:

x       0      1      2      3      4      5
P(x)    .002   .029   .132   .309   .360   .168

a) Find $\mu = E(X)$. Interpret the result.
b) Find $\sigma = \sqrt{E[(X - \mu)^2]}$. Interpret the result.
c) Graph P(x).
d) Locate $\mu$ and the interval $\mu \pm 2\sigma$ on the graph. Use either Chebychev's Rule or the Empirical Rule to approximate the probability that x falls in this interval. Compare this result with the actual probability.
e) Would you expect to observe fewer than two successful ventures out of five?
SOLUTION:
a) Applying the formula,

$$\mu = E(X) = \sum x\,P(x) = 0(.002) + 1(.029) + 2(.132) + 3(.309) + 4(.360) + 5(.168) = 3.50$$

INTERPRETATION:
On average, the number of successful ventures out of five will equal 3.5. (It should be remembered that this expected value has meaning only when the experiment, investing in five business ventures, is repeated a large number of times.)

b) Now we calculate the variance of X. We know that

$$\sigma^2 = E[(X - \mu)^2] = \sum (x - \mu)^2 P(x)$$

Hence, we will need to construct a column of $x - \mu$:

x      P(x)    x - μ    (x - μ)²    (x - μ)² P(x)
0      .002    -3.5     12.25       0.02
1      .029    -2.5      6.25       0.18
2      .132    -1.5      2.25       0.30
3      .309    -0.5      0.25       0.08
4      .360     0.5      0.25       0.09
5      .168     1.5      2.25       0.38
Total                               1.05

Thus, the variance is $\sigma^2 = 1.05$ and the standard deviation is

$$\sigma = \sqrt{\sigma^2} = \sqrt{1.05} = 1.02$$

This value measures the spread of the probability distribution of X, the number of successful ventures out of five.

c), d) The graph of P(x) is shown in the following figure, with the mean $\mu$ and the interval $\mu \pm 2\sigma = 3.50 \pm 2(1.02) = 3.50 \pm 2.04 = (1.46, 5.54)$ marked on the graph.

[Figure: bar graph of p(x) against x = 0, 1, ..., 5, with $\mu = 3.5$ and the limits $\mu - 2\sigma = 1.46$ and $\mu + 2\sigma = 5.54$ marked on the x-axis.]

Note particularly that $\mu = 3.5$ locates the centre of the probability distribution. Since this distribution is a theoretical relative frequency distribution that is moderately mound-shaped, we expect (from Chebychev's Rule) at least 75% and, more likely (from the Empirical Rule), approximately 95% of observed x values to fall in the interval $\mu \pm 2\sigma$, that is, between 1.46 and 5.54.

It can be seen from the above figure that the actual probability that X falls in the interval $\mu \pm 2\sigma$ is the sum of P(x) for the values X = 2, X = 3, X = 4, and X = 5. This probability is

P(2) + P(3) + P(4) + P(5) = .132 + .309 + .360 + .168 = .969.
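The calculations above can be checked with a short script. This is only a sketch: the helper names `mean`, `variance` and `prob_within` are illustrative and not part of the lecture, and the probabilities are the rounded values from the tables in the two examples.

```python
import math

# Number of successful ventures out of five (table from the example).
ventures = {0: 0.002, 1: 0.029, 2: 0.132, 3: 0.309, 4: 0.360, 5: 0.168}
# Number of petals (table from the earlier example).
petals = {3: 0.05, 4: 0.10, 5: 0.20, 6: 0.30, 7: 0.25, 8: 0.075, 9: 0.025}

def mean(dist):
    """E(X) = sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """E[(X - mu)^2] = sum of (x - mu)^2 * P(x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def prob_within(dist, k):
    """Exact probability that X lies strictly within k standard deviations of the mean."""
    mu = mean(dist)
    sigma = math.sqrt(variance(dist))
    lo, hi = mu - k * sigma, mu + k * sigma
    return sum(p for x, p in dist.items() if lo < x < hi)

k = 2
chebychev_bound = 1 - 1 / k**2  # lower bound from Chebychev's inequality

for name, dist in [("ventures", ventures), ("petals", petals)]:
    mu = mean(dist)
    sigma = math.sqrt(variance(dist))
    exact = prob_within(dist, k)
    print(f"{name}: mean = {mu:.3f}, sd = {sigma:.3f}, "
          f"interval = ({mu - k * sigma:.2f}, {mu + k * sigma:.2f}), "
          f"exact = {exact:.3f}, bound = {chebychev_bound:.2f}")
```

Because the script does not round the standard deviation before forming the interval, the printed end points can differ in the second decimal place from the hand calculation, which rounds $\sigma$ first. The set of x values inside the interval, and hence the exact probability, is the same.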
Therefore, 96.9% of the probability distribution lies within 2 standard deviations of the mean. This percentage is CONSISTENT with both Chebychev's Rule and the Empirical Rule.

e) Fewer than two successful ventures out of five implies that x = 0 or x = 1. Since both these values of x lie outside the interval $\mu \pm 2\sigma$, we know from the Empirical Rule that such a result is unlikely (with approximate probability of only .05). The exact probability, P(x ≤ 1), is P(0) + P(1) = .002 + .029 = .031. Consequently, in a single experiment where we invest in five business ventures, we would not expect to observe fewer than two successful ones.

The key question: What is the significance of Chebychev's Inequality and the Empirical Rule?

The answer is that both these rules give us an IDEA of the amount of data lying between the mean minus a certain number of standard deviations and the mean plus that same number of standard deviations. Given any data-set, the moment we compute the mean and standard deviation, we HAVE an idea of the two points (i.e. mean minus two standard deviations, and mean plus two standard deviations) between which the BULK of our data lies. If our data-set is hump-shaped, we obtain this idea through the Empirical Rule, and if we don't have any reason to believe that our data-set is hump-shaped, then we obtain this idea through Chebychev's Rule.

Next, we begin the discussion of CONTINUOUS RANDOM VARIABLES. The first point to note is that up till now we have discussed discrete random variables, i.e. quantities that are countable. Continuous random variables, by contrast, are quantities that are measurable. As stated in the very first lecture, continuous variables result from measurement, and can therefore take any value within a certain range. For example, the height of a normal Pakistani adult male may take any value between 5 feet 4 inches and 6 feet.
The temperature at a place, the amount of rainfall, time to failure for an electronic system, etc. are all examples of continuous random variables. Formally speaking, a continuous random variable can be defined as follows:

CONTINUOUS RANDOM VARIABLE:
A random variable X is defined to be continuous if it can assume every possible value in an interval [a, b], a < b, where a and b may be $-\infty$ and $+\infty$ respectively.

The probabilities of such a variable are described by a function f(x), called the probability density function, abbreviated to p.d.f., or simply the density function of the random variable X.

[Figure: a continuous probability distribution, shown as a smooth curve f(x) plotted against X.]

A p.d.f. has the following properties:

i) $f(x) \ge 0$, for all x
ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
iii) The probability that X takes on a value in the interval [c, d], c < d, is given by

$$P(c < X < d) = \int_{c}^{d} f(x)\,dx$$

which is the area under the curve y = f(x) between X = c and X = d, as shown in the following figure:

[Figure: the curve f(x) with the area between X = c and X = d shaded, labelled P(c < X < d).]

The TOTAL area under the curve is 1.

In other words:
1) f(x) is a non-negative function,
2) the integration takes place over all possible values of the random variable X between the specified limits, and
3) the probabilities are given by appropriate areas under the curve.

It should be noted that the probability of a continuous random variable X taking any particular value k is always zero, since

$$P(X = k) = \int_{k}^{k} f(x)\,dx = 0$$

That is why probability for a continuous random variable is measurable only over a given interval.

Further, since for a continuous random variable X, P(X = x) = 0 for every x, the following four probabilities are regarded as the same: $P(c \le X \le d)$, $P(c \le X < d)$, $P(c < X \le d)$ and $P(c < X < d)$. They may be different for a discrete random variable.
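To see why the end points matter for a discrete random variable, consider the petal-count distribution from the earlier example. The sketch below is illustrative only; the function name `interval_prob` and the choice c = 4, d = 8 are assumptions made for this demonstration.

```python
# Petal-count distribution from the earlier discrete example.
petals = {3: 0.05, 4: 0.10, 5: 0.20, 6: 0.30, 7: 0.25, 8: 0.075, 9: 0.025}

def interval_prob(dist, c, d, include_c, include_d):
    """Sum P(x) over the values between c and d, with each end point
    included or excluded as requested."""
    total = 0.0
    for x, p in dist.items():
        above = x >= c if include_c else x > c
        below = x <= d if include_d else x < d
        if above and below:
            total += p
    return total

c, d = 4, 8
closed = interval_prob(petals, c, d, True, True)        # P(4 <= X <= 8)
half_open = interval_prob(petals, c, d, True, False)    # P(4 <= X < 8)
open_both = interval_prob(petals, c, d, False, False)   # P(4 < X < 8)

print(round(closed, 3), round(half_open, 3), round(open_both, 3))
```

Each end point carries positive probability here, so including or excluding it changes the answer. For a continuous random variable each end point contributes zero, which is why the four probabilities coincide.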
The values (expressed as intervals) of a continuous random variable and their associated probabilities can be expressed by means of a formula.

We now discuss the distribution function of a continuous random variable.

CONTINUOUS RANDOM VARIABLE:
A random variable X may also be defined as continuous if its distribution function F(x) is continuous and is differentiable everywhere except at isolated points in the given range.

In contrast with the graph of the distribution function of a discrete variable, the graph of F(x) in the case of a continuous variable has no jumps or steps but is a continuous function for all x-values, as shown in the following figure:

[Figure: the distribution function F(x) plotted against X, rising continuously from 0 to 1, with the values F(a) and F(b) marked.]

Since F(x) is a non-decreasing function of x, we have

i) $f(x) \ge 0$, for all x
ii) $F(x) = \int_{-\infty}^{x} f(t)\,dt$, for all x.

The relationship between f(x) and F(x) is as follows: f(x) is obtained by finding the derivative of F(x), i.e.

$$\frac{d}{dx} F(x) = f(x)$$

Let us now explain the above concepts with the help of an example:

EXAMPLE:
a) Find the value of k so that the function f(x) defined as follows may be a density function:

f(x) = kx,  0 ≤ x ≤ 2
     = 0,   elsewhere

b) Compute P(X = 1).
c) Compute P(X > 1).
d) Compute the distribution function F(x).
e) Compute $P(X \le 1/2 \mid 1/3 \le X \le 2/3)$.

SOLUTION:
a) The function f(x) will be a density function if

i) $f(x) \ge 0$ for every x, and
ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$

The first condition is satisfied when k ≥ 0. The second condition will be satisfied if

$$1 = \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{2} f(x)\,dx + \int_{2}^{\infty} f(x)\,dx$$

i.e. if

$$1 = \int_{-\infty}^{0} 0\,dx + \int_{0}^{2} kx\,dx + \int_{2}^{\infty} 0\,dx$$

i.e. if

$$1 = 0 + k\left[\frac{x^2}{2}\right]_{0}^{2} + 0 = 2k$$

This gives k = 1/2.

We had f(x) = kx for 0 ≤ x ≤ 2 and f(x) = 0 elsewhere, and since we have obtained k = 1/2:

f(x) = x/2,  for 0 ≤ x ≤ 2
     = 0,    elsewhere

b) Since X is a continuous random variable, P(X = 1) = 0.
c) P(X > 1) is obtained by computing the area under the curve (in this case, a straight line) between X = 1 and X = 2:

[Figure: the line f(x) = x/2 for X between 0 and 2, with the region between X = 1 and X = 2 shaded.]

This area is obtained as follows:

$$P(X > 1) = \text{area of shaded region} = \int_{1}^{2} f(x)\,dx = \int_{1}^{2} \frac{x}{2}\,dx = \left[\frac{x^2}{4}\right]_{1}^{2} = 1 - \frac{1}{4} = \frac{3}{4}$$

d) To compute the distribution function, we need to find:

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$

We do so step by step, as shown below.

For any x such that $-\infty < x < 0$,

$$F(x) = \int_{-\infty}^{x} 0\,dt = 0$$

If 0 ≤ x < 2, we have

$$F(x) = \int_{-\infty}^{0} 0\,dt + \int_{0}^{x} \frac{t}{2}\,dt = \left[\frac{t^2}{4}\right]_{0}^{x} = \frac{x^2}{4}$$

and, finally, for x ≥ 2 we have

$$F(x) = \int_{-\infty}^{0} 0\,dt + \int_{0}^{2} \frac{t}{2}\,dt + \int_{2}^{x} 0\,dt = 1$$

Hence

F(x) = 0,      for x < 0
     = x²/4,   for 0 ≤ x < 2
     = 1,      for x ≥ 2.

We will discuss the computation of the conditional probability $P(X \le 1/2 \mid 1/3 \le X \le 2/3)$.
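Parts (a), (c) and (d) of this example can be checked numerically. The sketch below uses only the standard library. The midpoint-rule helper `integrate` and the function names `f` and `F` are assumptions made for this check, not part of the lecture.

```python
def f(x, k=0.5):
    """Density from the example: f(x) = k*x on [0, 2], zero elsewhere."""
    return k * x if 0 <= x <= 2 else 0.0

def integrate(g, a, b, n=10_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def F(x):
    """Distribution function obtained in part (d)."""
    if x < 0:
        return 0.0
    if x < 2:
        return x * x / 4
    return 1.0

# (a) With k = 1 the area under kx on [0, 2] is 2, so k must equal 1/2
#     for the total area to be 1.
area_with_k1 = integrate(lambda x: f(x, k=1.0), 0, 2)
k = 1 / area_with_k1

total_area = integrate(f, 0, 2)   # total probability, should be close to 1
p_gt_1 = integrate(f, 1, 2)       # (c) P(X > 1) as an area
p_gt_1_from_F = 1 - F(1)          # the same probability from F(x)

print(k, total_area, p_gt_1, p_gt_1_from_F)
```

The last two printed values should agree, since $P(X > 1) = 1 - F(1)$ for a continuous random variable. Comparing `F(x)` with `integrate(f, 0, x)` at a few points between 0 and 2 is a further check on the formula $F(x) = x^2/4$.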
