Transcript
```Chapter 6
Section 6.2
Transforming and Combining
Random Variables
Warm up
Weights of three-year-old females
The weights of three-year-old females closely follow a
Normal distribution with a mean of 30.7 pounds and
a standard deviation of 3.6 pounds. Randomly choose
one three-year-old female and call her weight X.
What is the probability that a randomly selected
three-year-old female weighs at least 30 pounds?
The weight X of a randomly chosen three-year-old female has the
N(30.7, 3.6) distribution. We want to find P(X ≥ 30) as shown in
the figure below.
Perform calculations and show your work! The standardized score
for the boundary value is
z = (30 − 30.7)/3.6 = −0.19. From Table A, P(Z <
−0.19) = 0.4247, so P(Z ≥ −0.19) = 1 − 0.4247 = 0.5753. Using
technology: The command normalcdf(lower: 30, upper: 1000, µ:
30.7, σ: 3.6) gives an area of 0.5771.
There is about a 58% chance that the randomly selected three-year-old female will weigh at least 30 pounds.
Many random number generators allow users to
specify the range of the random numbers to be
produced. Suppose that you specify that the range is
to be 0 < Y < 2. Then the density curve of the
outcomes has constant height between 0 and 2, and
height 0 elsewhere.
a) What is the height of the density curve between 0
and 2? Draw a graph of the density curve. Answer: height = 1/2 = 0.5, because the total area under the curve must be 1.
b) Use your graph to find P(Y < 1). Answer: 0.5
c) Find P(0.5 < Y < 1.3). Answer: 0.4
d) Find P(Y > 0.8). Answer: 0.6
Linear Transformations
In Section 6.1, we learned that the mean and standard
deviation give us important information about a
random variable.
In Chapter 2, we studied the effects of linear transformations on
the shape, center, and spread of a distribution of data. Recall:
1. Adding (or subtracting) a constant, a, to each observation:
• Adds a to measures of center and location.
• Does not change the shape or measures of spread.
2. Multiplying (or dividing) each observation by a constant, b:
• Multiplies (divides) measures of center and location by b.
• Multiplies (divides) measures of spread by |b|.
• Does not change the shape of the distribution.
Multiplying a Random Variable by a Constant
Pete’s Jeep Tours offers a popular half-day trip in a tourist area. There
must be at least 2 passengers for the trip to run, and the vehicle will hold
up to 6 passengers. Define X as the number of passengers on a randomly
selected day.
Passengers xi:   2     3     4     5     6
Probability pi:  0.15  0.25  0.35  0.20  0.05
The mean of X is 3.75 and the standard
deviation is 1.090.
Pete charges \$150 per passenger. The random variable C describes
the amount Pete collects on a randomly selected day.
Collected ci:    300   450   600   750   900
Probability pi:  0.15  0.25  0.35  0.20  0.05
The mean of C is \$562.50 and the standard
deviation is \$163.50.
Compare the shape, center, and spread of the two probability distributions.
Effect on a Random Variable of Multiplying (Dividing) by a Constant
Multiplying (or dividing) each value of a random variable by a number b:
• Multiplies (divides) measures of center and location (mean,
median, quartiles, percentiles) by b.
• Multiplies (divides) measures of spread (range, IQR, standard
deviation) by |b|.
• Does not change the shape of the distribution.
As with data, if we multiply a random variable by a negative constant b,
our common measures of spread are multiplied by |b|.
Adding a Constant to a Random Variable
Consider Pete’s Jeep Tours again. We defined C as the amount of
money Pete collects on a randomly selected day.
Collected ci:    300   450   600   750   900
Probability pi:  0.15  0.25  0.35  0.20  0.05
The mean of C is \$562.50 and the standard
deviation is \$163.50.
It costs Pete \$100 per trip to buy permits, gas, and a ferry pass. The
random variable V describes the profit Pete makes on a randomly selected
day.
Profit vi:       200   350   500   650   800
Probability pi:  0.15  0.25  0.35  0.20  0.05
The mean of V is \$462.50 and the standard
deviation is \$163.50.
Compare the shape, center, and spread of the two probability distributions.
Effect on a Random Variable of Adding (or Subtracting) a Constant
Adding the same number a (which could be negative) to each value of a
random variable:
• Adds a to measures of center and location (mean, median,
quartiles, percentiles).
• Does not change measures of spread (range, IQR, standard
deviation).
• Does not change the shape of the distribution.
Linear Transformations
Effect of a Linear Transformation on the Mean and Standard Deviation
If Y = a + bX is a linear transformation of the random variable X, then
• The probability distribution of Y has the same
shape as the probability distribution of X.
• µY = a + bµX.
• σY = |b|σX (since b could be a negative number).
Linear transformations have similar effects on other measures of center
or location (median, quartiles, percentiles) and spread (range, IQR).
Whether we’re dealing with data or random variables, the effects of a
linear transformation are the same.
Note: These results apply to both discrete and continuous random
variables.
Combining Random Variables
Many interesting statistics problems require us to examine two or more
random variables.
Let’s investigate the result of adding and subtracting random variables.
Let X = the number of passengers on a randomly selected trip with
Pete’s Jeep Tours. Y = the number of passengers on a randomly
selected trip with Erin’s Adventures. Define T = X + Y. What are the
mean and variance of T?
Passengers xi:   2     3     4     5     6
Probability pi:  0.15  0.25  0.35  0.20  0.05
Mean µX = 3.75 Standard Deviation σX = 1.090
Passengers yi:   2     3     4     5
Probability pi:  0.3   0.4   0.2   0.1
Mean µY = 3.10 Standard Deviation σY = 0.943
How many total passengers can Pete and Erin expect on a
randomly selected day?
Since Pete expects µX = 3.75 and Erin expects µY = 3.10 , they
will average a total of 3.75 + 3.10 = 6.85 passengers per trip. We
can generalize this result as follows:
Mean of the Sum of Random Variables
For any two random variables X and Y, if T = X + Y, then the expected
value of T is
E(T) = µT = µX + µY
In general, the mean of the sum of several random variables is the sum
of their means.
Mrs. Richardson owns a company that sells
really cool math t-shirts and she believes that
the sales of her product are:
X:     1000  3000  5000  10,000
P(X):  .1    .3    .4    .2
How many shirts is she expected to sell?
𝜇X = 1000 ∙ .1 + 3000 ∙ .3 + ⋯
= 5000 shirts
If the expected profit on each sale of Mrs. R’s
shirts is \$200 (they are super cool shirts!),
what is the overall expected profit?
𝜇200X = 200 ∙ 5000 = \$1,000,000
Mrs. Tran is starting another t-shirt company and
she believes that the sales of her t-shirts are as
follows.
Y:     300  500  750
P(Y):  .4   .5   .1
𝜇Y = 300 ∙ .4 + 500 ∙ .5 + ⋯ = 445 shirts
If the expected profit on each sale of Mrs. Tran’s shirts is
\$250 (her shirts are pretty cool too), what is her overall
expected profit?
𝜇250Y = 250 ∙ 445 = \$111,250
Mrs. R and Mrs. Tran want to join forces.
What is the combined expected profit from
both shirts X and shirts Y?
𝜇250Y = 250 ∙ 445 = \$111,250
𝜇200X = 200 ∙ 5000 = \$1,000,000
𝜇200X+250Y = 1,000,000 + 111,250
= \$1,111,250
Example (Linda cars)
Cars sold:         0    1    2    3
Probability:       .3   .4   .2   .1
Trucks/SUVs sold:  0    1    2
Probability:       .4   .5   .1
At her commission rate of 25% of gross profit on
each vehicle she sells, Linda expects to earn \$350
on each car and \$400 on each truck/SUV sold.
What are her average (expected) earnings?
Let X be the number of cars Linda sells and Y the number of
trucks and SUV’s.
μx = (0)(.3) + (1)(.4) + (2)(.2) + (3)(.1) = 1.1 cars
μy = (0)(.4) + (1)(.5) + (2)(.1) = .7 trucks and SUV’s
Her mean earnings are
μz = 350 μx + 400 μy
= (350)(1.1) + (400)(.7) = \$665
That’s her best estimate of her earnings for the day.
Warmup
SPCC considers a student to be full-time if he or she is taking
between 12 and 18 units. The number of units X that a randomly
selected SPCC full-time student is taking in the fall semester has the
following distribution.
Number of units:  12    13    14    15    16    17    18
Probability:      0.25  0.10  0.05  0.30  0.10  0.05  0.15
a) At SPCC, the tuition for full-time students is \$50 per unit. That is, if T =
tuition charge for a randomly selected full-time student, T = 50X. Find
the mean (μ50X) and standard deviation (σ50X).
b) In addition to tuition charges, each full-time student at SPCC is
assessed student fees of \$100 per semester. If C = overall cost for a
randomly selected full-time student, C = 100 + T. Find the mean
(μ100+50X) and standard deviation (σ100+50X).
a)
µT = 600(0.25) + 650(0.10) + … + 900(0.15) = \$732.50
σ²T = (600 − 732.5)²(0.25) + (650 − 732.5)²(0.10) + … + (900 − 732.5)²(0.15) = 10,568
σT = √10,568 = \$103
b)
µC = 700(0.25) + … + 1000(0.15) = \$832.50
σ²C = (700 − 832.5)²(0.25) + … + (1000 − 832.5)²(0.15) = 10,568
σC = √10,568 = \$103
Combining Random Variables
The only way to determine the probability for any value of T is if
X and Y are independent random variables.
If knowing whether any event involving X alone has occurred
tells us nothing about the occurrence of any event involving Y
alone, and vice versa, then X and Y are independent random
variables.
Probability models often assume independence when the
random variables describe outcomes that appear unrelated to
each other. You should always ask whether the assumption of
independence seems reasonable.
In our investigation, it is reasonable to assume X and Y are
independent if Pete and Erin operate their tours in different
parts of the country.
Let T = X + Y. Consider all possible combinations of the values of X
and Y.
Recall: µT = µX + µY = 6.85
σ²T = Σ(ti − µT)² pi
= (4 − 6.85)²(0.045) + … +
(11 − 6.85)²(0.005) = 2.0775
Note: σ²X = 1.1875 and σ²Y = 0.89
What do you notice about the
variance of T?
As the preceding example illustrates, when we add two independent
random variables, their variances add.
Variance of the Sum of Random Variables
For any two independent random variables X and Y, if T = X + Y, then
the variance of T is
σ²T = σ²X + σ²Y
In general, the variance of the sum of several independent random
variables is the sum of their variances.
***** Remember that you can add variances only if the
two random variables are independent, and that you can
never add their standard deviations.
We can perform a similar investigation to determine what happens when
we define a random variable as the difference of two random variables.
In summary, we find the following:
Mean of the Difference of Random Variables
For any two random variables X and Y, if D = X - Y, then the expected
value of D is
E(D) = µD = µX - µY
In general, the mean of the difference of several random variables is the
difference of their means. The order of subtraction is important!
Variance of the Difference of Random Variables
For any two independent random variables X and Y, if D = X - Y, then the variance of
D is
σ²D = σ²X + σ²Y
In general, the variance of the difference of two independent random
variables is the sum of their variances.
2. Going back to the warm up problem, SPCC also has a campus in
Matthews, specializing in just a few fields of study. Full-time
students at the Matthews campus take only 3-unit classes. Let Y =
number of units taken in the fall semester by a randomly selected
full-time student at the Matthews campus. Here is the probability
distribution of Y and a probability histogram:
Number of units:  12   15   18
Probability:      0.3  0.4  0.3
Find the mean (μY) and standard deviation (σY).
The mean of this distribution is
µY = 15 units, and the standard deviation is
σY = 2.3 units.
Find the mean (μS) and standard deviation (σS) of the sum S = X + Y.
µS = µX + µY = 14.65 + 15 = 29.65
σ²X+Y = σ²X + σ²Y = 4.2275 + 5.4 = 9.63 (using the unrounded variances; σX ≈ 2.06 and σY ≈ 2.3)
σX+Y = √9.63 = 3.10
3. Let S = X + Y. It is reasonable to assume that X
and Y are independent, because each student was
selected at random. Find P(S = 24)
P(S = 24) = P(X = 12 and Y = 12) =
P(X = 12) × P(Y = 12) = (0.25)(0.3) = 0.075.
That is, there is a 0.075 probability that the sum
is 24 units.
Let B = the amount spent on books in the fall
semester for a randomly selected full-time student
at SPCC. Suppose that µB = 153 and σB = 32. Recall
from earlier that C = overall cost for tuition and fees
for a randomly selected full-time student at SPCC.
Find the mean and standard deviation of the cost
of tuition, fees, and books (C + B) for a randomly
selected full-time student at SPCC.
µC+B = µC + µB
= 832.50 + 153 = \$985.50. The standard deviation cannot be
calculated because the cost for tuition and fees and the cost for
books are not independent. Students who take more units will
typically have to buy more books.
We defined X = the number of units for
a randomly selected full-time student at the main
campus and Y = the number of units for a randomly
selected full-time student at the Matthews campus.
Recall that µX = 14.65, σX = 2.06 and µY = 15, σY = 2.3.
At the main campus, full-time students pay \$50 per
unit. At the Matthews campus, full-time students pay
\$55 per unit.
Calculate the mean and standard deviation of the total
amount of tuition for a randomly selected full-time
student at the main campus and for a randomly
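selected full-time student at the Matthews campus.
µ50X+55Y = 50(14.65) + 55(15) = \$1557.50
To find σ50X+55Y, first scale each standard deviation (carrying an extra decimal place):
σ50X = 50(2.056) = 102.8
σ55Y = 55(2.32) = 127.6
σ²50X = 102.8² = 10,567.84
σ²55Y = 127.6² = 16,281.76
Because X and Y are independent, the variances add:
σ50X+55Y = √(10,567.84 + 16,281.76) = √26,849.6 = \$163.86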
Warm up
Most states and Canadian provinces have government-sponsored lotteries. Here is a simple
lottery wager, from the Tri-State Pick 3 game that New Hampshire shares with Maine and
Vermont. You choose a number with 3 digits from 0 to 9; the state chooses a three-digit
winning number at random and pays you \$500 if your number is chosen. Because there are
1000 numbers with three digits, you have probability 1/1000 of winning. Taking X to be the
amount your ticket pays you, the probability distribution of X is
Payoff X:     \$0     \$500
Probability:  0.999  0.001
(a) Find the mean and standard deviation of X
(b) If you buy a Pick 3 ticket, your winnings are W = X − 1, because it costs \$1 to play. Find
the mean and standard deviation of W. Interpret each of these values in context.
(c) Suppose you buy one Pick 3 ticket on each of two consecutive days. Find the expected
Combining Normal Random Variables
If a random variable is Normally distributed, we can use its mean and standard
deviation to compute probabilities.
• Any sum or difference of independent Normal random variables is also
Normally distributed.
Mr. Starnes likes between 8.5 and 9 grams of sugar in his hot tea. Suppose the
amount of sugar in a randomly selected packet follows a Normal distribution
with mean 2.17 g and standard deviation 0.08 g. If Mr. Starnes selects 4
packets at random, what is the probability his tea will taste right?
Let X = the amount of sugar in a randomly selected packet.
Then, T = X1 + X2 + X3 + X4. We want to find P(8.5 ≤ T ≤ 9).
µT = µX1 + µX2 + µX3 + µX4 = 2.17 + 2.17 + 2.17 + 2.17 = 8.68
σ²T = σ²X1 + σ²X2 + σ²X3 + σ²X4 = (0.08)² + (0.08)² + (0.08)² + (0.08)² = 0.0256
σT = √0.0256 = 0.16
Normalcdf: lwr= 8.5, upr=9, µ=8.68, σ=0.16
P(-1.13 ≤ Z ≤ 2.00) = 0.8480
There is about an 85% chance Mr. Starnes’s
tea will taste right.
Example: A company manufactures and ships iPods. The times required to pack iPods can be
described by N(9 min, 1.5 min). The times required
to box iPods can be described by N(6 min, 1 min).
a) What is the probability that packing two
consecutive iPods takes more than 20 minutes?
b) What percent of iPods take longer to pack than
to box?
a) Let p1 = time for packing 1st iPod
Let p2 = time for packing 2nd iPod
μ = E(p1 + p2) = 9 + 9 = 18
σ = σ(p1 + p2) = √(1.5² + 1.5²) = 2.12
Normalcdf: lwr=20, upr=1E99, µ=18, σ=2.12
P(p1 + p2 > 20) = 0.17
So there is a 17% chance that it will take over 20
minutes to pack 2 consecutive iPods.
b) Packing: N(9, 1.5)
Boxing: N(6,1)
Let D = difference in times (Packing − Boxing)
E(D) = 9 − 6 = 3
SD(D) = √(1.5² + 1²) = 1.8
Normalcdf: lwr=0, upr=1E99, µ=3, σ=1.8
P(D > 0) = 0.9525
So about 95% of iPods take longer to pack than to box.
Example- golf
Tom and George are playing in a tournament. Their scores vary
as they play the course repeatedly. Tom’s score X has the
N(110, 10) distribution, and George’s score Y varies from round
to round according to the N(100,8) distribution. If they play
independently, what is the probability that Tom will score lower
than George and thus do better in the tournament?
The difference X − Y between their scores is Normally distributed
with mean and variance:
• μX−Y = μX − μY = 110 − 100 = 10
• σ²X−Y = σ²X + σ²Y = 10² + 8² = 164
• σX−Y = √164 = 12.8, so X − Y has the N(10, 12.8) distribution.
Normalcdf: lwr = -1E99, upr = 0, µ = 10, σ = 12.8
The probability that Tom wins (lower score is better) is:
P(X − Y < 0) = .2177
Addition rule for variances of random
variables that are not independent
• If X and Y have correlation ρ,
then
σ²X+Y = σ²X + σ²Y + 2ρσXσY
σ²X−Y = σ²X + σ²Y − 2ρσXσY
• The correlation ρ is a number between –1 and
1 that measures the direction and strength of
the linear relationship between two variables
• The correlation between two independent
random variables is zero
EX: SAT scores
• SAT Math X: µX = 625, σX = 90
• SAT Verbal Y: µY = 590, σY = 100
• Mean overall score = 625 + 590 = 1215
• The variance cannot be computed by simply adding variances,
because the scores are not independent
If we know the correlation of the
SAT scores to be ρ = .7, then we can
apply Rule 3.
σ²X+Y = σ²X + σ²Y + 2ρσXσY
σ²X+Y = 90² + 100² + 2(.7)(90)(100)
σ²X+Y = 30,700
σX+Y = √30,700 ≈ 175
```
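The rules in the transcript (mean and standard deviation of a discrete random variable, the effect of a linear transformation a + bX, adding independent random variables, and combining independent Normal random variables) can be checked numerically. The following is a minimal Python sketch, not part of the original slides. The helper names `mean`, `variance`, `linear`, and `add_independent` are illustrative assumptions, and `statistics.NormalDist` requires Python 3.8 or later. The data are taken from the Pete's Jeep Tours, Erin's Adventures, sugar packet, and golf examples.

```python
from itertools import product
from math import sqrt
from statistics import NormalDist


def mean(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Variance: probability-weighted squared deviations from the mean."""
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())


def linear(dist, a, b):
    """Distribution of a + b*X (assumes b != 0 so values stay distinct)."""
    return {a + b * x: p for x, p in dist.items()}


def add_independent(dx, dy):
    """Distribution of X + Y, assuming X and Y are independent."""
    total = {}
    for (x, px), (y, py) in product(dx.items(), dy.items()):
        # Independence lets us multiply the two probabilities.
        total[x + y] = total.get(x + y, 0.0) + px * py
    return total


# Passenger distributions from the slides.
pete = {2: 0.15, 3: 0.25, 4: 0.35, 5: 0.20, 6: 0.05}
erin = {2: 0.3, 3: 0.4, 4: 0.2, 5: 0.1}

# Profit V = -100 + 150X: the mean shifts and scales, the SD only scales by |b|.
profit = linear(pete, a=-100, b=150)
print("mean X, SD X:", mean(pete), sqrt(variance(pete)))
print("mean V, SD V:", mean(profit), sqrt(variance(profit)))

# T = X + Y for independent X and Y: means add and variances add.
total = add_independent(pete, erin)
print("mean T, var T:", mean(total), variance(total))
print("var X + var Y:", variance(pete) + variance(erin))

# Sum of four independent N(2.17, 0.08) sugar packets.
sugar = NormalDist(mu=4 * 2.17, sigma=sqrt(4 * 0.08 ** 2))
p_tea = sugar.cdf(9) - sugar.cdf(8.5)
print("P(8.5 <= T <= 9):", p_tea)

# Golf: X - Y for independent Normal scores; NormalDist supports subtraction.
golf_diff = NormalDist(110, 10) - NormalDist(100, 8)
p_tom = golf_diff.cdf(0)
print("P(X - Y < 0):", p_tom)
```

Because the code does not round z-scores to two decimal places, the Normal probabilities it prints may differ slightly in the third decimal place from the Table A answers in the slides.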