Probability, Mean and Median

In the last section, we considered (probability) density functions. We went on to discuss their relationship with cumulative distribution functions. The goal of this section is to take a closer look at densities, introduce some common distributions and discuss the mean and median.

Recall, we define probabilities as follows:

$$\text{Proportion of population for which } x \text{ is between } a \text{ and } b \;=\; \text{Area under the graph of } p(x) \text{ between } a \text{ and } b \;=\; \int_a^b p(x)\,dx$$

The cumulative distribution function gives the proportion of the population that has values below $t$. That is,

$$P(t) = \int_{-\infty}^{t} p(x)\,dx = \text{Proportion of population having values of } x \text{ below } t$$

When answering some questions involving probabilities, both the density function and the cumulative distribution function can be used, as the next example illustrates.

Example 1: Consider the graph of the function $p(x)$.

Figure 1: The graph of the function p(x) (a triangle with vertices (0, 0), (6, 0.2) and (10, 0))

a. Explain why the function is a probability density function.
b. Use the graph to find $P(X < 3)$.
c. Use the graph to find $P(3 \le X \le 8)$.

Solution:

a. Recall, a function is a probability density function if the area under the curve is equal to 1 and all of the values of $p(x)$ are non-negative. It is immediately clear that the values of $p(x)$ are non-negative. To verify that the area under the curve is equal to 1, we recognize that the graph above can be viewed as a triangle. Its base is 10 and its height is 0.2. Thus its area is equal to $\frac{1}{2}(10)(0.2) = 1$.

b. There are two ways that we can solve this problem. Before we get started, though, we begin by drawing the shaded region.

[Figure: the graph of p(x) with the region between x = 0 and x = 3 shaded]

The first approach is to recognize that we can determine the area under the curve from 0 to 3 immediately. The shaded area is another triangle, with a base of 3 and a height of 0.1. Thus, the area is equal to $\frac{1}{2}(3)(0.1) = 0.15$.

A second approach would be to find the equations of the lines that form $p(x)$ and use the integral formula given above.
For the first line, notice that the line passes through the points (0, 0) and (6, 0.2). Using the point-slope formula, we see that the line is given by $p(x) = \frac{1}{30}x$. The second line passes through the points (6, 0.2) and (10, 0). Again, using the point-slope formula, we see that the line is given by $p(x) = -\frac{1}{20}x + \frac{1}{2}$. So, we have that

$$p(x) = \begin{cases} \frac{1}{30}x & \text{if } 0 \le x \le 6 \\ -\frac{1}{20}x + \frac{1}{2} & \text{if } 6 < x \le 10 \\ 0 & \text{otherwise} \end{cases}$$

Returning to the original question, we have that $P(X < 3)$ is given by the integral $\int_0^3 p(x)\,dx = P(3) - P(0)$. On $[0, 3)$, $p(x) = \frac{1}{30}x$. Notice that $P(t) = \frac{1}{60}t^2$ for $0 \le t \le 6$. So, we have that

$$P(X < 3) = P(3) - P(0) = \frac{1}{60}(3)^2 - \frac{1}{60}(0)^2 = \frac{9}{60} = 0.15.$$

c. Again, we have two ways that we can approach this problem. Again, we start by drawing out the shaded region.

[Figure: the graph of p(x) with the region between x = 3 and x = 8 shaded]

If we want to use triangles, it is easiest to use the fact that the area under the curve is equal to 1. The shaded region is thus equal to one minus the two triangles on the sides. In (b), we found that the area of the left triangle is equal to 0.15. The right triangle has base 2 and height $p(8) = 0.1$, so its area is equal to 0.1. So, the area of the shaded region is 1 - 0.15 - 0.1 = 0.75.

If instead we were to use integrals, notice that $p(x)$ changes formula at $x = 6$. Thus, in order to compute the integral $\int_3^8 p(x)\,dx$, we need to split it into two pieces. That is,

$$\int_3^8 p(x)\,dx = \int_3^6 p(x)\,dx + \int_6^8 p(x)\,dx.$$

$$\int_3^6 p(x)\,dx = \int_3^6 \frac{1}{30}x\,dx = \left[\frac{1}{60}x^2\right]_3^6 = 0.45.$$

$$\int_6^8 p(x)\,dx = \int_6^8 \left(-\frac{1}{20}x + \frac{1}{2}\right)dx = \left[-\frac{1}{40}x^2 + \frac{1}{2}x\right]_6^8 = 0.3.$$

So, we see that the shaded area is equal to 0.45 + 0.3 = 0.75, which agrees with the answer we found the other way.

Oftentimes, we are concerned with finding the "average" value of a distribution. There are two common measures that are used: the mean and the median.

The Mean

If a quantity has a density function $p(x)$, then we define the mean value of the quantity as

$$\int_{-\infty}^{\infty} x\,p(x)\,dx.$$

Example 2: Returning to the density function given in Example 1, compute its mean.

Solution: Notice that $p(x)$ changes formula at $x = 6$.
Thus, in order to compute the integral $\int_0^{10} x\,p(x)\,dx$, we will need to again split it into two pieces. Thus, we have that the mean is equal to

$$\int_0^{10} x\,p(x)\,dx = \int_0^6 x \cdot \frac{1}{30}x\,dx + \int_6^{10} x\left(-\frac{1}{20}x + \frac{1}{2}\right)dx = \left[\frac{1}{90}x^3\right]_0^6 + \left[-\frac{1}{60}x^3 + \frac{1}{4}x^2\right]_6^{10} = \frac{216}{90} + \frac{176}{60} = \frac{16}{3}.$$

The Median

A median of a quantity $x$ distributed through a population is a value $T$ such that half of the population has values of $x$ less than $T$ and half the population has values of $x$ greater than $T$. That is, $T$ satisfies the equation

$$\int_{-\infty}^{T} p(x)\,dx = \frac{1}{2}$$

where $p(x)$ is the density function of the quantity. In words, we have that half the area under the graph of $p(x)$ lies to the left of $T$ (and half lies to the right of $T$).

Example 3: Returning to the density function given in Example 1, compute its median.

Solution: Looking at Figure 1, notice that more than half of the area occurs in the left side of the triangle. Thus, the median will be a number between 0 and 6. Since we do not need to worry about the function changing (since it is the same on the interval [0, 6]), we have that

$$\int_0^T \frac{1}{30}x\,dx = \frac{1}{2}. \quad \text{That is,} \quad \frac{T^2}{60} = \frac{1}{2}.$$

Solving for $T$, we see that $T = \sqrt{30}$. Note: We did not use $-\sqrt{30}$ for $T$, since we know that $T$ is a positive number.

There are a number of important distributions that arise in a variety of situations. Below, we list three such distributions as well as associated properties.

The first important distribution we shall consider is the uniform distribution. We introduced this distribution in the previous section. The graph of the density function is constant on the interval $[a, b]$ and zero elsewhere.

Figure 2: The density of the uniform distribution on [a, b] (constant height 1/(b - a) between a and b)

Uniform Distribution

The density of the uniform distribution is given by

$$p(x) = \frac{1}{b - a}, \quad \text{for } a \le x \le b$$

The cumulative distribution function is given by

$$P(t) = \int_a^t p(x)\,dx = \frac{t - a}{b - a}, \quad \text{for } a \le t \le b$$

Another important distribution we shall consider is the exponential distribution. The graph of the density function is characterized by an exponential decay.
Figure 3: The density of the exponential distribution for c > 0.

Exponential Distribution

The density of the exponential distribution is given by

$$p(x) = c e^{-cx}, \quad \text{for } x \ge 0 \text{ and any constant } c > 0$$

The cumulative distribution function is given by

$$P(t) = \int_0^t p(x)\,dx = 1 - e^{-ct}, \quad \text{for } t \ge 0$$

Example 4: Suppose that the probability density function for the wait time in line at a counter is given by

$$p(x) = \begin{cases} 0 & \text{if } x < 0 \\ k e^{-x/5} & \text{if } x \ge 0 \end{cases}$$

a. What is the value of the constant $k$?
b. Determine the probability that a person will wait at least 3 minutes.
c. What is the mean wait time?

Solution:

a. Comparing the form of the density function with that given in the box above, we see that $c = 1/5$. Thus, we must have that $k = 1/5$. Another way to see this would be to do the integration and solve for $k$.

$$1 = \int_0^{\infty} k e^{-x/5}\,dx = \lim_{b \to \infty} \int_0^b k e^{-x/5}\,dx = \lim_{b \to \infty} \left[-5k e^{-x/5}\right]_0^b = \lim_{b \to \infty} \left(5k - 5k e^{-b/5}\right) = 5k$$

Dividing both sides by 5, we see that $k = 1/5$.

b. The probability that a person will wait at least 3 minutes is given by

$$\int_3^{\infty} p(x)\,dx = \lim_{b \to \infty} \int_3^b p(x)\,dx = \lim_{b \to \infty} \left(P(b) - P(3)\right) = 1 - P(3) = 1 - \left(1 - e^{-3/5}\right) = e^{-3/5}.$$

Here, we used the fact that $\lim_{b \to \infty} P(b) = 1$ to simplify the above expression.

c. The mean wait time is given by $\int_0^{\infty} \frac{x}{5} e^{-x/5}\,dx$. Using integration by parts, we have:

$$\int_0^{\infty} \frac{x}{5} e^{-x/5}\,dx = \lim_{b \to \infty} \int_0^b \frac{x}{5} e^{-x/5}\,dx = \lim_{b \to \infty} \left( \left[-x e^{-x/5}\right]_0^b + \int_0^b e^{-x/5}\,dx \right)$$

$$= \lim_{b \to \infty} \left[-x e^{-x/5} - 5 e^{-x/5}\right]_0^b = \lim_{b \to \infty} \left(-b e^{-b/5} - 5 e^{-b/5} + 5\right) = 5$$

Note: In general, if $p(x) = c e^{-cx}$ for $x \ge 0$, then $\int_0^{\infty} x\,p(x)\,dx = \frac{1}{c}$.

The final distribution which we shall examine is the normal distribution. The graph of its density function is a bell-shaped curve which peaks at its mean, denoted by $\mu$. The width of the curve is determined by the standard deviation, denoted by $\sigma$.

Figure 4: The density of the normal distribution with parameters μ and σ.

Normal Distribution

The density of the normal distribution is given by

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / (2\sigma^2)}, \quad \text{for } -\infty < x < \infty$$

where $\mu$ is the mean of the distribution and $\sigma$ is the standard deviation.

It is beyond the scope of this course to verify that $\int_{-\infty}^{\infty} p(x)\,dx = 1$.
However, we can see that $e^{-(x - \mu)^2 / (2\sigma^2)}$ is always positive and never exceeds 1 (it equals 1 only at $x = \mu$). Since $\frac{1}{\sigma\sqrt{2\pi}}$ is a positive scalar, it follows that $0 < p(x) \le \frac{1}{\sigma\sqrt{2\pi}}$ for all $x$. This upper bound is less than 1 whenever $\sigma > 1/\sqrt{2\pi} \approx 0.4$.

The normal density function does not have an elementary antiderivative. That is, its antiderivative cannot be written in closed form using elementary functions. But, as Figure 4 above illustrates, there is still area under the curve. To evaluate the integral, we use a calculator or a table of values.

Example 5: Lengths of human pregnancies are normally distributed with mean 268 days and standard deviation 15 days. What percentage of pregnancies last between 250 days and 280 days?

Solution: Using the fact that $\mu = 268$ and $\sigma = 15$, we have that the density function is given by

$$p(x) = \frac{1}{15\sqrt{2\pi}}\, e^{-(x - 268)^2 / (2(15)^2)}.$$

Finding the integral numerically, we have:

$$\text{Proportion of pregnancies lasting between 250 days and 280 days} = \int_{250}^{280} \frac{1}{15\sqrt{2\pi}}\, e^{-(x - 268)^2 / (2(15)^2)}\,dx \approx 0.673.$$

So about 67.3% of pregnancies last between 250 days and 280 days.
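
The numerical integration in Example 5 can be reproduced with a short script. The following is a minimal sketch in Python and is not part of the original notes. The helper names `normal_pdf`, `simpson` and `normal_cdf` are hypothetical, Simpson's rule is just one possible choice of numerical method, and the formula for the cumulative distribution function in terms of the error function `math.erf` is used only as an independent check.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with Simpson's rule (n subintervals, n even)."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # interior points alternate between weight 4 (odd i) and weight 2 (even i)
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def normal_cdf(t, mu, sigma):
    """Cumulative distribution function P(t), written with the error function."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))

# Parameters from Example 5 (assumed units: days)
mu, sigma = 268.0, 15.0

by_simpson = simpson(lambda x: normal_pdf(x, mu, sigma), 250.0, 280.0)
by_erf = normal_cdf(280.0, mu, sigma) - normal_cdf(250.0, mu, sigma)

print(round(by_simpson, 3), round(by_erf, 3))  # prints: 0.673 0.673
```

Both approaches agree with the value 0.673 found in Example 5. The second one is the computation P(280) - P(250), the same way P(X < 3) was written as P(3) - P(0) in Example 1.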