Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Chapter 6 Section 6.2 Transforming and Combining Random Variables Warm up Weights of three-year-old females The weights of three-year-old females closely follow a Normal distribution with a mean of 30.7 pounds and a standard deviation of 3.6 pounds. Randomly choose one three-year-old female and call her weight X. What is the probability that a randomly selected three-year-old female weighs at least 30 pounds? The weight X of a randomly chosen three-year-old female has the N(30.7, 3.6) distribution. We want to find P(X 30) as shown in the figure below. Perform calculations–show your work! The standardized score for the boundary value is = −0.19. From Table A, P(Z < −0.19) = 0.4247, so P(Z −0.19) = 1 − 0.4247 = 0.5753. Using technology: The command normalcdf(lower:30, upper: 1000, : 30.7, :3.6) gives an area of 0.5771. There is about a 58% chance that the randomly selected threeyear-old female will weigh at least 30 pounds. Many random number generators allow users to specify the range of the random numbers to be produced. Suppose that you specify that the range is to be 0 < Y < 2. Then the density curve of the outcomes has constant height between 0 and 2, and height 0 elsewhere. a) What is the height of the density curve between 0 and 2? Draw a graph of the density curve. Ht=0.5 b) Use your graph to find P(Y<1). = 0.5 c) Find P(0.5<Y<1.3) = 0.4 d) Find P(Y>0.8) = 0.6 Linear Transformations In Section 6.1, we learned that the mean and standard deviation give us important information about a random variable. In Chapter 2, we studied the effects of linear transformations on the shape, center, and spread of a distribution of data. Recall: 1. Adding (or subtracting) a constant, a, to each observation: • Adds a to measures of center and location. • Does not change the shape or measures of spread. 2. Multiplying (or dividing) each observation by a constant, b: • Multiplies (divides) measures of center and location by b. • Multiplies (divides) measures of spread by |b|. • Does not change the shape of the distribution. Multiplying a Random Variable by a Constant Pete’s Jeep Tours offers a popular half-day trip in a tourist area. There must be at least 2 passengers for the trip to run, and the vehicle will hold up to 6 passengers. Define X as the number of passengers on a randomly selected day. Passengers xi 2 3 4 5 6 Probability pi 0.15 0.25 0.35 0.20 0.05 The mean of X is 3.75 and the standard deviation is 1.090. Pete charges $150 per passenger. The random variable C describes the amount Pete collects on a randomly selected day. Collected ci 300 450 600 750 900 Probability pi 0.15 0.25 0.35 0.20 0.05 The mean of C is $562.50 and the standard deviation is $163.50. Compare the shape, center, and spread of the two probability distributions. Multiplying a Random Variable by a Constant Effect on a Random Variable of Multiplying (Dividing) by a Constant Multiplying (or dividing) each value of a random variable by a number b: • Multiplies (divides) measures of center and location (mean, median, quartiles, percentiles) by b. • Multiplies (divides) measures of spread (range, IQR, standard deviation) by |b|. • Does not change the shape of the distribution. As with data, if we multiply a random variable by a negative constant b, our common measures of spread are multiplied by |b|. Adding a Constant to a Random Variable Consider Pete’s Jeep Tours again. We defined C as the amount of money Pete collects on a randomly selected day. Collected ci 300 450 600 750 900 Probability pi 0.15 0.25 0.35 0.20 0.05 The mean of C is $562.50 and the standard deviation is $163.50. It costs Pete $100 per trip to buy permits, gas, and a ferry pass. The random variable V describes the profit Pete makes on a randomly selected day. Profit vi 200 350 500 650 800 Probability pi 0.15 0.25 0.35 0.20 0.05 The mean of V is $462.50 and the standard deviation is $163.50. Compare the shape, center, and spread of the two probability distributions. Adding a Constant to a Random Variable Effect on a Random Variable of Adding (or Subtracting) a Constant Adding the same number a (which could be negative) to each value of a random variable: • Adds a to measures of center and location (mean, median, quartiles, percentiles). • Does not change measures of spread (range, IQR, standard deviation). • Does not change the shape of the distribution. Effect on a Linear Transformation on the Mean and Standard Deviation If Y = a + bX is a linear transformation of the Linear Transformations random variable X, then The probability distribution of Y has the same shape as the probability distribution of X. µY = a + bµX. σY = |b|σX (since b could be a negative number). Linear transformations have similar effects on other measures of center or location (median, quartiles, percentiles) and spread (range, IQR). Whether we’re dealing with data or random variables, the effects of a linear transformation are the same. Note: These results apply to both discrete and continuous random variables. Combining Random Variables Many interesting statistics problems require us to examine two or more random variables. Let’s investigate the result of adding and subtracting random variables. Let X = the number of passengers on a randomly selected trip with Pete’s Jeep Tours. Y = the number of passengers on a randomly selected trip with Erin’s Adventures. Define T = X + Y. What are the mean and variance of T? Passengers xi 2 3 4 5 6 Probability pi 0.15 0.25 0.35 0.20 0.05 Mean µX = 3.75 Standard Deviation σX = 1.090 Passengers yi 2 3 4 5 Probability pi 0.3 0.4 0.2 0.1 Mean µY = 3.10 Standard Deviation σY = 0.943 Combining Random Variables How many total passengers can Pete and Erin expect on a randomly selected day? Since Pete expects µX = 3.75 and Erin expects µY = 3.10 , they will average a total of 3.75 + 3.10 = 6.85 passengers per trip. We can generalize this result as follows: Mean of the Sum of Random Variables For any two random variables X and Y, if T = X + Y, then the expected value of T is E(T) = µT = µX + µY In general, the mean of the sum of several random variables is the sum of their means. Mrs. Richardson owns a company that sells really cool math t-shirts and she believes that the sales of her product are: X 1000 3000 5000 10,000 P(X) .1 .3 .4 .2 How many shirts is she expected to sell? 𝜇X = 1000 ∙ .1 + 3000 ∙ .3 + ⋯ = 5000 shirts 13 If the expected profit on each sale of Mrs. R’s shirts is $200 (they are super cool shirts!), what is the overall expected profit? 𝜇200X = 200 ∙ 5000 = $1,000,000 14 Mrs. Tran is starting another t-shirt company and she believes that the sales of her t-shirts are as follows. Y 300 500 750 P(Y) .4 .5 .1 𝜇Y = 300 ∙ .4 + 500 ∙ .5 + ⋯ = 445 shirts If the expected profit on each sale of Mrs. Tran’s shirts is $250 (her shirts are pretty cool too), what is her overall expected profit? 𝜇250Y = 250 ∙ 445 = $111,250 Mrs. R and Mrs. Tran want to join forces. What is the total expected profits combined of both shirts X and shirts Y? 𝜇250Y = 250 ∙ 445 = $111,250 𝜇200X = 200 ∙ 5000 = $1,000,000 𝜇200X+250Y = 1,000,000 + 111,250 = $1,111,250 16 Example (Linda cars) Cars Sold Probability 0 .3 1 .4 2 .2 3 .1 Trucks/SUV Probability 0 .4 1 .5 At her commission rate of 25% of gross profit on each vehicle she sells, Linda expects to earn $350 on each car and $400 on each truck/SUV sold. What are her average (expected) earnings? 2 .1 Let X be the number of cars Linda sells and Y the number of trucks and SUV’s. μx = (0)(.3) + (1)(.4) + (2)(.2) + (3)(.1) = 1.1 cars μy = (0)(.4) + (1)(.5) + (2)(.1) = .7 trucks and SUV’s Her mean earnings are μz = 350 μx + 400 μy = (350)(1.1) + (400)(.7) = $665 That’s her best estimate of her earnings for the day. Warmup SPCC considers a student to be full-time if he or she is taking between 12 and 18 units. The number of units X that a randomly selected SPCC full-time student is taking in the fall semester has the following distribution. Number of units: Probability: 12 13 14 15 16 17 18 0.25 0.10 0.05 0.30 0.10 0.05 0.15 a) At SPCC, the tuition for full-time students is $50 per unit. That is, if T = tuition charge for a randomly selected full-time student, T = 50X. Find the mean (μ50X) and standard deviation (50X). b) In addition to tuition charges, each full-time student at SPCC is assessed student fees of $100 per semester. If C = overall cost for a randomly selected full-time student, C = 100 + T. Find the mean (μ100+50X) and standard deviation (100+50X). a) T 600(0.25) 650(0.10) (900)(0.15) $732.50 T2 (600 732.5) 2 (0.25) (650 732.5) 2 (0.10) (900 732.5) 2 (0.15) 10,568 T 10,568 $103 b) 𝜇𝑐= 700 0.25 + ⋯ + 1000) 0.15) = $832.50 𝜎2𝑐= ( 700 − 832.5)2 0.25 + ⋯ + ( 1000 − 832.5)2 ( 0.15) = 10,568 𝜎2 = 10,568 = $103 Combining Random Variables The only way to determine the probability for any value of T is if X and Y are independent random variables. If knowing whether any event involving X alone has occurred tells us nothing about the occurrence of any event involving Y alone, and vice versa, then X and Y are independent random variables. Probability models often assume independence when the random variables describe outcomes that appear unrelated to each other. You should always ask whether the assumption of independence seems reasonable. In our investigation, it is reasonable to assume X and Y are independent if Paul and Erin operate their tours in different parts of the country. Combining Random Variables Let T = X + Y. Consider all possible combinations of the values of X and Y. Recall: µT = µX + µY = 6.85 sT2 = å(t i - mT ) 2 pi = (4 – 6.85)2(0.045) + … + (11 – 6.85)2(0.005) = 2.0775 Note: sX2 =1.1875 and sY2 = 0.89 What do you notice about the variance of T? Combining Random Variables As the preceding example illustrates, when we add two independent random variables, their variances add. Standard deviations do not add. Variance of the Sum of Random Variables For any two independent random variables X and Y, if T = X + Y, then the variance of T is σ T2 = σ 2X + σY2 In general, the variance of the sum of several independent random variables is the sum of their variances. ***** Remember that you can add variances only if the two random variables are independent, and that you can NEVER add standard deviations! Combining Random Variables We can perform a similar investigation to determine what happens when we define a random variable as the difference of two random variables. In summary, we find the following: Mean of the Difference of Random Variables For any two random variables X and Y, if D = X - Y, then the expected value of D is E(D) = µD = µX - µY In general, the mean of the difference of several random variables is the difference of their means. The order of subtraction is important! Variance of the Difference of Random Variables For any two random variables X and Y, if D = X - Y, then the variance of D is σ 2D = σ 2X + σY2 In general, the variance of the difference of two independent random variables is the sum of their variances. 2. Going back to the warm up problem, SPCC also has a campus in Matthews, specializing in just a few fields of study. Full-time students at the Matthews campus take only 3-unit classes. Let Y = number of units taken in the fall semester by a randomly selected full-time student at the downtown campus. Here is the probability distribution of Y and a probability histogram: Number of units: Probability: 12 0.3 15 0.4 18 0.3 Find the mean (μY) and standard deviation (Y). The mean of this distribution is Y = 15 units, and the standard deviation is Y = 2.3 units. Find the mean (μS) and standard deviation (S). S X Y 14.65 15 29.65 𝜎2 X+Y = 𝜎2 X + 𝜎2 Y = ( 2.06)2 + ( 2.3)2 = 9.63 σX+Y = 9.63 = 3.10 3. Let S = X + Y. It is reasonable to assume that X and Y are independent, because each student was selected at random. Find P(S = 24) P(S = 24) = P(X = 12 and Y = 12) = P(X = 12) × P(Y = 12) = (0.25)(0.3) = 0.075. That is, there is a 0.075 probability that the sum is 24 units. Let B = the amount spent on books in the fall semester for a randomly selected full-time student at SPCC. Suppose that µB = 153 andσB = 32. Recall from earlier that C = overall cost for tuition and fees for a randomly selected full-time student at SPCC. Find the mean and standard deviation of the cost of tuition, fees, and books (C + B) for a randomly selected full-time student at SPCC. CB C B = 832.50 + 153 = $985.50. The standard deviation cannot be calculated because the cost for tuition and fees and the cost for books are not independent. Students who take more units will typically have to buy more books. We defined the students at X = the number of units for a randomly selected full-time student at the main campus and Y = the number of units for a randomly selected full-time student at the Matthews campus. Also, and 𝜇𝑦= 15, 𝜎𝑦= 2.3 At the main campus, full-time students pay $50 per unit. At the Matthews campus, full-time students pay $55 per unit. Calculate the mean and standard deviation of the total amount of tuition for a randomly selected full-time student at the main campus and for a randomly selected full-time student at the Matthews campus. 𝜇50X+55Y = 50 14.65 + 55 15 = $1557.50 σ50X+55Y σ 50X = 50 2.06 = 102.8 σ 55Y = 55 2.3 = 127.6 𝜎2 50X = 102.82 =10,567.84 𝜎2 55Y = 127.62 =16,281.76 σ 50X+55Y = 10,567.84 + 16,281.76 = = 163.86 26,849.6 Warm up Most states and Canadian provinces have government-sponsored lotteries. Here is a simple lottery wager, from the Tri-State Pick 3 game that New Hampshire shares with Maine and Vermont. You choose a number with 3 digits from 0 to 9; the state chooses a three-digit winning number at random and pays you $500 if your number is chosen. Because there are 1000 numbers with three digits, you have probability 1/1000 of winning. Taking X to be the amount your ticket pays you, the probability distribution of X is (a) Find the mean and standard deviation of X (b) If you buy a Pick 3 ticket, your winnings are W = X − 1, because it costs $1 to play. Find the mean and standard deviation of W. Interpret each of these values in context. (c) Suppose you buy one Pick 3 ticket on each of two consecutive days. Find the expected value and standard deviation of your total winnings. Show your work. Combining Normal Random Variables If a random variable is Normally distributed, we can use its mean and standard deviation to compute probabilities. • Any sum or difference of independent Normal random variables is also Normally distributed. Mr. Starnes likes between 8.5 and 9 grams of sugar in his hot tea. Suppose the amount of sugar in a randomly selected packet follows a Normal distribution with mean 2.17 g and standard deviation 0.08 g. If Mr. Starnes selects 4 packets at random, what is the probability his tea will taste right? Let X = the amount of sugar in a randomly selected packet. Then, T = X1 + X2 + X3 + X4. We want to find P(8.5 ≤ T ≤ 9). µT = µX1 + µX2 + µX3 + µX4 = 2.17 + 2.17 + 2.17 +2.17 = 8.68 sT2 = sX2 + sX2 + sX2 + sX2 = (0.08) 2 + (0.08) 2 + (0.08) 2 + (0.08) 2 = 0.0256 1 2 3 4 sT = 0.0256 = 0.16 Normalcdf: lwr= 8.5, upr=9, µ=8.68, σ=0.16 P(-1.13 ≤ Z ≤ 2.00) = 0.8480 There is about an 85% chance Mr. Starnes’s tea will taste right. Example: A company manufactures and ships ipods. The times required to pack i-pods can be described by N(9 min, 1.5 min). The times required to ship i-pods can be described by N(6 min, 1 min). a) What is the probability that packing two consecutive i-pods takes more than 20 minutes? b) What percent of i-pods take longer to pack than to box? a) Let p1 = time for packing 1st ipod Let p2 = time for packing 2nd ipod μ = E(p1 + p2)=9 + 9 = 18 σ = σ(p1 + p2) = 1 52 + 1 52 = 2.12 . . Normalcdf: lwr=20, upr=1E99, µ=18, σ=2.12 P(X>20) = 0.17 So there is a 17% chance that it will take over 20 minutes to pack 2 consecutive ipods b) Packing: N(9, 1.5) Boxing: N(6,1) Let D = difference in times (Packing- Boxing) E(D) = 9 – 6= 3 SD(D) = = 1 .5 2 + 1 2 =1.8 Normalcdf: lwr=0, upr=1E99, µ=3, σ=1.8 P(D > 0)= 0.9525 So about 95% of ipods take longer to pack than box Example- golf Tom and George are playing in a tournament. Their scores vary as they play the course repeatedly. Tom’s score X has the N(110, 10) distribution, and George’s score Y varies from round to round according to the N(100,8) distribution. If they play independently, what is the probability that Tom will score lower than George and thus do better in the tournament? The difference X – Y between their scores is Normally distributed with mean and variance: • μx-y = μx – μy = 110 – 100 = 10 • σ2X-Y = σ2X+ σ2 Y= 102 + 82 = 164 • √(164) = 12.8, X – Y has the N(10, 12.8) distribution. Normalcdf: lwr = -1E99, upr = 0, µ = 10, σ = 12.8 The probability that Tom wins (lower score is better)is: P(D < 0) = .2177 Addition rule for variances of random variables that are not independent • If X and Y have a correlation , then 2 X+Y = 2X +2Y +2xY 2 X-Y = 2X + 2Y – 2xY • The correlation ρ is a number between –1 and 1 that measures the direction and strength of the linear relationship between two variables • The correlation between two independent random variables is zero 36 EX: SAT scores • • • • SAT Math X x = 625 x = 90 SAT Verbal Y Y = 590 Y= 100 Mean overall score = 625+590=1215 Variance cannot be computed because scores are not independent If we know the correlation of the SAT scores to be = .7,then we can apply Rule 3. 2 X+Y = 2X +2Y +2xY 2 X+Y = 2 +2 +2 2 X+Y = X+Y = 38