• A random variable is a variable whose values are numerical outcomes of a random experiment. That is, we consider all the outcomes in a sample space S and then associate a number with each outcome.
• Example: Toss a fair coin 4 times and let X = the number of Heads in the 4 tosses. We write the so-called probability distribution of X as a list of the values X takes on along with the corresponding probabilities that X takes on those values.
• The figure below (Fig. 4.6) and Example 4.23 show how to get the probability distribution of X. Each outcome has prob = 1/16 (HINT: use the “and” rule to show this), and then use the “or” rule to show that P(X=1) = P(TTTH or TTHT or THTT or HTTT) = 4/16, etc.
• There are two types of r.v.s: discrete and continuous. A r.v. X is discrete if the number of values X takes on is finite (or countably infinite). In the case of any discrete X, its probability distribution is simply a list of its values along with the corresponding probabilities X takes on those values.

    Values of X:   x1   x2   …   xk
    P(X):          p1   p2   …   pk

  NOTE: each value of p is between 0 and 1 and all the values of p sum to 1.
• We display probability distributions for discrete r.v.s with so-called probability histograms. The next slide shows the probability histogram for X = # of Hs in 4 tosses of a fair coin. The next slide gives a similar example...
• The probability distribution of a random variable X lists the values and their probabilities.
• The probabilities pi must add up to 1.
• A basketball player shoots three free throws. The random variable X is the number of baskets successfully made. Suppose he is a 50% free throw shooter...

    (Tree diagram of the outcomes: HHH, HHM, HMH, HMM, …)

    Value of X     0      1              2              3
    Probability    1/8    3/8            3/8            1/8
    Outcomes       MMM    HMM MHM MMH    HHM HMH MHH    HHH

• The probability of any event is the sum of the probabilities pi of the values of X that make up the event.
• A basketball player shoots three free throws. The random variable X is the number of baskets successfully made.
Suppose the player is a 50% free throw shooter.

    Value of X     0      1              2              3
    Probability    1/8    3/8            3/8            1/8
    Outcomes       MMM    HMM MHM MMH    HHM HMH MHH    HHH

What is the probability that the player successfully makes at least two baskets (“at least two” means “two or more”)? USE THE “OR” RULE!

    P(X≥2) = P(X=2) + P(X=3) = 3/8 + 1/8 = 1/2

What is the probability that the player successfully makes fewer than three baskets? USE THE “OR” RULE HERE TOO...!

    P(X<3) = P(X=0) + P(X=1) + P(X=2) = 1/8 + 3/8 + 3/8 = 7/8
  or
    P(X<3) = 1 – P(X=3) = 1 – 1/8 = 7/8   (THIS IS THE “NOT” RULE)

• A continuous r.v. X takes its values in an interval of real numbers. The probability distribution of a continuous X is described by a density curve, which lies on or above the horizontal axis, whose total area under the curve is 1, and where probabilities about X correspond to areas under the curve.
• The first example is the random variable which randomly chooses a number between 0 and 1 (perhaps using the spinner on page 253; go over Example 4.25). This r.v. is called the uniform random variable and has a density curve that is completely flat! Probabilities correspond to areas under the curve... see next slide for the computations...

A continuous random variable X takes all values in an interval.
Example: There is an infinity of numbers between 0 and 1 (e.g., 0.001, 0.4, 0.0063876).
How do we assign probabilities to events in an infinite sample space? We use density curves and compute probabilities for intervals. The probability of any event is the area under the density curve for the values of X that make up the event.

    (Figure: the uniform density curve for the variable X, with height 1 between 0 and 1.)

The probability that X falls between 0.3 and 0.7 is the area under the density curve for that interval (base x height for this density):

    P(0.3 ≤ X ≤ 0.7) = (0.7 – 0.3)*1 = 0.4

The probability of a single point is meaningless for a continuous random variable.
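The base x height computation for the uniform density can be checked with a short sketch. This is only an illustration under stated assumptions: the function name `uniform_prob` and its parameters are hypothetical (they do not come from the textbook), and the density is assumed to be flat on the interval from `low` to `high` and zero elsewhere.

```python
def uniform_prob(a, b, low=0.0, high=1.0):
    """Area under a flat density on [low, high] between a and b.

    Hypothetical helper for illustration only.
    """
    height = 1.0 / (high - low)      # total area must be 1, so height = 1 / width
    left = max(a, low)               # clip the interval to where the density is non-zero
    right = min(b, high)
    base = max(right - left, 0.0)    # an empty or single-point interval has base 0
    return base * height


p_interval = uniform_prob(0.3, 0.7)  # area of a rectangle with base 0.4 and height 1
p_point = uniform_prob(0.4, 0.4)     # a single point has no area above it

print(round(p_interval, 10), p_point)
```

The second call uses an interval of zero width, so its area is zero regardless of the height of the curve.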
Only intervals can have a non-zero probability, represented by the area under the density curve for that interval. The probability of a single point is zero since there is no area above a point! This makes the following statement true:

The probability of an interval is the same whether boundary values are included or excluded:

    P(0 ≤ X ≤ 0.5) = (0.5 – 0)*1 = 0.5
    P(0 < X < 0.5) = (0.5 – 0)*1 = 0.5
    P(0 ≤ X < 0.5) = (0.5 – 0)*1 = 0.5

    P(X < 0.5 or X > 0.8) = P(X < 0.5) + P(X > 0.8) = 0.5 + 0.2 = 0.7
  or
    P(X < 0.5 or X > 0.8) = 1 – P(0.5 ≤ X ≤ 0.8) = 1 – 0.3 = 0.7
    (You may use either the “OR” Rule or the “NOT” Rule...)

• The other example of a continuous r.v. that we’ve already seen is the normal random variable. See the next slide for a reminder of how we’ve used the normal and how it relates to probabilities under the normal curve...
• Go over Example 4.26 in detail! We saw earlier that p-hat had a sampling distribution which was approximately normal. Thus p-hat can be treated as a normal random variable… we have shown that the mean of p-hat is p and the standard deviation of p-hat is sqrt(p(1-p)/n). Now use this information to do Ex. 4.26…

Continuous random variable and population distribution
The shaded area under a density curve shows the proportion, or %, of individuals in a population with values of X between x1 and x2. Because the probability of drawing one individual at random depends on the frequency of this type of individual in the population, the probability is also the shaded area under the curve.

    (Figure: shaded area = % of individuals with X such that x1 < X < x2)

Mean of a random variable
• The mean x-bar of a set of observations is their arithmetic average.
• The mean µ of a random variable X is a weighted average of the possible values of X, reflecting the fact that all outcomes might not be equally likely.

A basketball player shoots three free throws. The random variable X is the number of baskets successfully made (“H”).

    Value of X     0      1              2              3
    Probability    1/8    3/8            3/8            1/8
    Outcomes       MMM    HMM MHM MMH    HHM HMH MHH    HHH

The mean of a random variable X is also called the expected value of X. What is the expected number of baskets made? Do the computations...

• We’ve already discussed the mean of a density curve as being the “balance point” of the curve… to establish this mathematically requires some higher level math… So we’ll think of the mean of a continuous r.v. in this way. For a discrete r.v., we’ll compute the mean (or expected value) as a weighted average of the values of X, the weights being the corresponding probabilities. E.g., the mean # of Hs in 4 tosses of a fair coin is computed as:

    (1/16)*0 + (4/16)*1 + (6/16)*2 + (4/16)*3 + (1/16)*4 = (32/16) = 2

• In either case (discrete or continuous), the interpretation of the mean is as the long-run average value of X (in a large number of repetitions of the experiment giving rise to X).
• Look at Example 4.27 on page 260… a simple “lottery” (pick 3), like the old numbers game… you pay $1 to play (pick a 3 digit number), and if your number comes up, you win $500; otherwise, the bookie keeps your $1. Note that in the long run, your average winnings per play are $500*(1/1000) + $0*(999/1000) = $0.50, which is less than the $1 you pay to play.
• Law of Large Numbers: Essentially states that if you sample from a population with mean = µ, then the sample mean (x-bar) will approximate µ for large sample sizes. Equivalently, the average of many independent observations on the variable gets close to µ.

CAREFULLY READ PAGES 273ff ON THE LAW OF LARGE NUMBERS AND ITS CONSEQUENCES!
Stop Chapter 4 at the bottom of page 266 ("Rules for means").
HW: Read sections 4.3 & 4.4. Do # 4.53-4.58, 4.61-4.63, 4.66, 4.74-4.76
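
As a check on the discrete computations in these notes, the sketch below rebuilds the distributions for the coin-toss and free throw examples by listing every outcome, then applies the “or” rule, the “not” rule, and the weighted-average formula for the mean. It is only an illustration under stated assumptions: the function names (`distribution_of_successes`, `expected_value`, `simulate_mean_heads`) are hypothetical, trials are assumed independent, and the simulation at the end is just one way to see the long-run average interpretation, with an arbitrary seed.

```python
import random
from fractions import Fraction
from itertools import product


def distribution_of_successes(n_trials, p_success=Fraction(1, 2)):
    """Return {k: P(X = k)}, X = number of successes in n_trials independent trials."""
    dist = {}
    for outcome in product([True, False], repeat=n_trials):
        # "and" rule: multiply the probabilities of the independent trials
        prob = Fraction(1)
        for success in outcome:
            prob *= p_success if success else 1 - p_success
        k = sum(outcome)  # value of X for this outcome
        # "or" rule: add the probabilities of disjoint outcomes with the same X
        dist[k] = dist.get(k, Fraction(0)) + prob
    return dist


def expected_value(dist):
    """Mean of a discrete r.v.: weighted average of values, weights = probabilities."""
    return sum(x * p for x, p in dist.items())


def simulate_mean_heads(n_repeats, n_tosses=4, seed=0):
    """Average number of heads over many repetitions (long-run average)."""
    rng = random.Random(seed)  # seed chosen arbitrarily for repeatability
    total = 0
    for _ in range(n_repeats):
        total += sum(rng.random() < 0.5 for _ in range(n_tosses))
    return total / n_repeats


coin = distribution_of_successes(4)     # X = # of Hs in 4 tosses of a fair coin
throws = distribution_of_successes(3)   # X = # of baskets in 3 free throws

p_at_least_two = throws[2] + throws[3]  # "or" rule
p_fewer_than_three = 1 - throws[3]      # "not" rule
lottery_mean = expected_value({500: Fraction(1, 1000), 0: Fraction(999, 1000)})

print(sorted(coin.items()))
print(p_at_least_two, p_fewer_than_three)
print(expected_value(coin), expected_value(throws), lottery_mean)
print(simulate_mean_heads(20000))
```

Exact fractions are used so the results can be compared directly with the values 1/2, 7/8, 2, and $0.50 worked out above; the simulated average should land close to the exact mean of 2 but will not match it exactly.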