4.2 Probability Distributions

Definition. A random variable is a variable whose value is a numerical outcome of a random phenomenon. The probability distribution of a random variable tells us what the possible values of the variable are and how probabilities are assigned to those values.

Discrete Random Variables

Definition. A discrete random variable X has a finite number of possible values. The probability distribution of X lists the values and their probabilities:

Value of X     $x_1$   $x_2$   $x_3$   ...   $x_k$
Probability    $p_1$   $p_2$   $p_3$   ...   $p_k$

Example 4.9. A household is a group of people living together, regardless of their relationship to each other. Many sample surveys, such as the Current Population Survey, select a random sample of households. Choose a household at random, and let the random variable X be the number of people living there. Here is the distribution of X.

Household size   1      2      3      4      5      6      7
Probability      .251   .321   .171   .154   .067   .022   .014

The probability that a randomly chosen household has more than two members is

\[ P(X > 2) = P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) + P(X = 7) = .171 + .154 + .067 + .022 + .014 = .428. \]

Equally Likely Outcomes

Definition. If a random phenomenon has k possible outcomes, all equally likely, then each individual outcome has probability 1/k. The probability of any event A is

\[ P(A) = \frac{\text{count of outcomes in } A}{\text{count of all possible outcomes}} = \frac{\text{count of outcomes in } A}{k}. \]

Example 4.10. Roll two dice and record the pips (dots) on each of the two up-faces. Figure 4.8 (see TM-65) shows the 36 possible outcomes. If the dice are carefully made, all 36 outcomes are equally likely, so each has probability 1/36. Gamblers are often interested in the sum of the pips on the up-faces. What is the probability of rolling a 5? The event “roll a 5” contains the four outcomes (1,4), (2,3), (3,2), (4,1). The probability is therefore 4/36 = 1/9 = 0.111. What about the probability of rolling a 7?
In Figure 4.8 (TM-65) you will find six outcomes for which the sum of the pips is 7. The probability is 6/36 = 1/6 = 0.167.

The Mean and Standard Deviation of a Discrete Random Variable

Definition. Suppose that X is a discrete random variable whose distribution is:

Value of X     $x_1$   $x_2$   $x_3$   ...   $x_k$
Probability    $p_1$   $p_2$   $p_3$   ...   $p_k$

Find the mean of X by multiplying each possible value by its probability and adding over all the values:

\[ \mu = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = \sum_{i=1}^{k} x_i p_i . \]

Note. The mean of a random variable X is a single fixed number $\mu$. It gives the average value of X in several senses:

• The mean $\mu$ is the average of the possible values of X, each weighted by how likely it is to occur. That’s what the definition of $\mu$ says.

• The mean $\mu$ is the point at which the probability histogram of the distribution of X would balance if made of solid material. See Figure 4.9 (and TM-66). Recall that the mean $\mu$ of a density curve has this same property.

• If we actually repeat the random phenomenon many times, record the value of X each time, and average these observed values, this average will get closer and closer to $\mu$ as we make more and more repetitions. This fact is called the law of large numbers.

Definition. Suppose that X is a discrete random variable whose distribution is:

Value of X     $x_1$   $x_2$   $x_3$   ...   $x_k$
Probability    $p_1$   $p_2$   $p_3$   ...   $p_k$

and that $\mu$ is the mean of X. The variance of X is

\[ \sigma^2 = (x_1 - \mu)^2 p_1 + (x_2 - \mu)^2 p_2 + \cdots + (x_k - \mu)^2 p_k = \sum_{i=1}^{k} (x_i - \mu)^2 p_i . \]

The standard deviation $\sigma$ is the square root of the variance.

Continuous Random Variables

Definition. A continuous random variable X takes all values in an interval of numbers. The probability distribution of X is described by a density curve. The probability of any event is the area under the density curve and above the values of X that make up the event.

Note. The distribution of a continuous random variable assigns probabilities as areas under a density curve. See Figure 4.10 (and TM-67).
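Returning to the discrete case, the calculations in Examples 4.9 and 4.10 and the definitions of the mean and variance can be checked with a short Python sketch. The household probabilities and the dice setup come from the examples; the helper names (`event_probability`, `mean`, `variance`, `sample_mean`, `dice_sum_probability`), the sample size, and the random seed are illustrative assumptions rather than part of the text.

```python
from fractions import Fraction
from itertools import product
from math import sqrt
import random

# Household-size distribution from Example 4.9 (value -> probability).
household = {1: 0.251, 2: 0.321, 3: 0.171, 4: 0.154,
             5: 0.067, 6: 0.022, 7: 0.014}

def event_probability(dist, event):
    """Add the probabilities of the values x for which event(x) is true."""
    return sum(p for x, p in dist.items() if event(x))

def mean(dist):
    """mu = sum of x_i * p_i over all possible values."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """sigma^2 = sum of (x_i - mu)^2 * p_i over all possible values."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def sample_mean(dist, n, seed=0):
    """Average of n simulated draws (law of large numbers); seed is arbitrary."""
    rng = random.Random(seed)
    draws = rng.choices(list(dist), weights=list(dist.values()), k=n)
    return sum(draws) / n

def dice_sum_probability(total):
    """Equally likely outcomes: count favorable pairs out of all 36."""
    outcomes = list(product(range(1, 7), repeat=2))
    favorable = [pair for pair in outcomes if sum(pair) == total]
    return Fraction(len(favorable), len(outcomes))

print("P(X > 2) =", round(event_probability(household, lambda x: x > 2), 3))
print("mean     =", round(mean(household), 3))
print("std dev  =", round(sqrt(variance(household)), 3))
print("P(sum 5) =", dice_sum_probability(5))
print("P(sum 7) =", dice_sum_probability(7))
for n in (100, 10_000, 100_000):
    print(f"sample mean over {n} draws =", round(sample_mean(household, n), 3))
```

Under these assumptions the household distribution has a mean of about 2.587 people and a standard deviation of about 1.41, and the simulated averages should settle near that mean as the number of draws grows. Exact fractions are used for the dice so that 4/36 and 6/36 reduce to 1/9 and 1/6 without rounding.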
Definition (for those with some calculus background). Suppose that X is a continuous random variable with probability distribution P(x). The mean of X is

\[ \mu = \int x \, P(x) \, dx \]

and the variance of X is

\[ \sigma^2 = \int (x - \mu)^2 \, P(x) \, dx, \]

where the integrals are taken over all possible values of X. The standard deviation $\sigma$ is the square root of the variance.
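The integral definitions above can be approximated numerically. The following Python sketch is only an illustration: it assumes a uniform density on the interval [0, 1] (an example chosen here, not one given in the text) and uses a simple midpoint rule in a hypothetical helper `integrate`. For this density the exact values are a mean of 1/2 and a variance of 1/12.

```python
def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def density(x):
    """Assumed example density: uniform on [0, 1], so the curve has height 1."""
    return 1.0

a, b = 0.0, 1.0  # interval of possible values for this assumed density

# The total area under a density curve should be 1.
total_area = integrate(density, a, b)

# mu = integral of x * P(x) dx
mu = integrate(lambda x: x * density(x), a, b)

# sigma^2 = integral of (x - mu)^2 * P(x) dx
var = integrate(lambda x: (x - mu) ** 2 * density(x), a, b)
sigma = var ** 0.5

# P(0.3 <= X <= 0.7) is the area under the curve above that interval.
p_interval = integrate(density, 0.3, 0.7)

print("total area =", round(total_area, 6))
print("mean       =", round(mu, 6))
print("variance   =", round(var, 6))
print("std dev    =", round(sigma, 6))
print("P(0.3 <= X <= 0.7) =", round(p_interval, 6))
```

Swapping in a different `density` function and interval gives the same three quantities for another density curve, provided the curve still has total area 1 over the chosen interval.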