Chapter 16 Random Variables
math2200
Life insurance
• Insurance company: a “death and
disability” policy
– Pay $10,000 when the client dies
– Pay $5,000 if the client is permanently
disabled
– Charge $50 per year
• Why $50?
– Using actuarial information, the company can
calculate the expected value of the policy.
Random variable
• The amount the company pays out on an
individual policy is called a random
variable.
• A random variable assumes a value based
on the outcome of a random event.
– Random variable is often denoted by a capital
letter, e.g., X
– A particular value that it can have is often
denoted by the corresponding lower case
letter, e.g., x
Random variable
• Discrete
– We can list all the outcomes (finite or
countable)
– E.g. the amount the insurance pays out is
either $10,000, $5,000 or $0
• Continuous
– Can take any numeric value within a range of values
• Example: the time it takes you to travel from home to school
Probability model
• The collection of all possible values and
the probabilities that they occur is called
the probability model for the random
variable.
Example
• Death rate in any year is 1 out of every 1000
people
• 2 out of 1000 suffer some kind of disability
• Probability model
Policyholder outcome    Payment (x)    Probability Pr(X=x)
Death                   10,000         1/1000
Disability              5,000          2/1000
Neither                 0              997/1000
What does the insurance company
expect?
• Suppose it insures exactly 1000 people
• In a year,
– 1 customer dies
– 2 are disabled
– The insurance company pays $10,000 +
$5,000*2 = $20,000
– Payment per customer: $20,000/1000 = $20
– Premium collected per customer: $50
– Profit: $50 - $20 = $30 per customer!
Expected value
• $20 is the expected payment per customer
• E(X) = 20
=(10000 * 1 + 5000 * 2 + 0*997) / 1000
=10000*(1/1000) + 5000*(2/1000) + 0*(997/1000)
• E(X) = Σ x * P(X=x)
– Center of the distribution
– A parameter of the model
Expected value
• Of particular interest is the value we expect a random
variable to take on, notated μ (for population mean) or
E(X) for expected value.
• The expected value of a (discrete) random variable can
be found by summing the products of each possible
value and the probability that it occurs:
\[ \mu = E(X) = \sum x \cdot P(x) \]
• Note: Be sure that every possible outcome is included
in the sum and verify that you have a valid probability
model to start with.
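As a minimal sketch of this calculation, the Python snippet below stores the insurance probability model as a dictionary, checks that it is a valid model, and then computes E(X). The names `model` and `expected_value` are illustrative and do not come from the slides.

```python
from fractions import Fraction

# Probability model for the payout X (values and probabilities from the slides).
model = {
    10_000: Fraction(1, 1000),   # death
    5_000: Fraction(2, 1000),    # disability
    0: Fraction(997, 1000),      # neither
}

def expected_value(model):
    """Return E(X) = sum of x * P(X = x) after validating the model."""
    if any(p < 0 for p in model.values()):
        raise ValueError("probabilities must be non-negative")
    if sum(model.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in model.items())

mu = expected_value(model)
print(mu)  # 20
```

Exact fractions are used here so that the "sums to 1" check is not disturbed by floating-point rounding.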
• Most of the time, the company makes $50
per customer
• But, with small probabilities, the company
needs to pay a lot ($10000 or $5000)
• The variation is big
• How to measure the variation?
Spread
• For data, we calculated the standard
deviation by first computing the deviation
from the mean and squaring it. We do that
with discrete random variables as well.
• The variance for a random variable is:
\[ \sigma^2 = \mathrm{Var}(X) = \sum (x - \mu)^2 \cdot P(x) \]
• The standard deviation for a random
variable is:
\[ \sigma = \mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} \]
Variance and standard deviation
Policyholder outcome    Payment (x)    Probability Pr(X=x)    Deviation x - E(X)
Death                   10,000         1/1000                 10000 - 20 = 9980
Disability              5,000          2/1000                 5000 - 20 = 4980
Neither                 0              997/1000               0 - 20 = -20
Var(X) = Σ [x - E(X)]^2 * P(X=x)
Variance = 9980^2 (1/1000) + 4980^2 (2/1000) + (-20)^2 (997/1000) = 149,600
Standard deviation = square root of variance
SD(X) = $386.78
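The variance and standard deviation of the payout can be reproduced with a few lines of Python. This is only a sketch: the dictionary `model` restates the probability model from the slides, and the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Payout X and its probabilities, as given in the slides.
model = {
    10_000: Fraction(1, 1000),   # death
    5_000: Fraction(2, 1000),    # disability
    0: Fraction(997, 1000),      # neither
}

mu = sum(x * p for x, p in model.items())                     # E(X)
variance = sum((x - mu) ** 2 * p for x, p in model.items())   # Var(X)
sd = sqrt(variance)                                           # SD(X)

print(variance)       # 149600
print(round(sd, 2))   # 386.78
```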
Properties of expected value and
standard deviation
• Shifting
– E(X+c) = E(X) + c
– Var(X+c) = Var(X)
– Example: Consider everyone in a company receiving
a $5000 increase in salary.
• Scaling
– E(aX) = aE(X)
– Var(aX) = a^2 Var(X)
– Example: Consider everyone in a company receiving
a 10% increase in salary.
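The shifting and scaling rules can be checked numerically. The sketch below uses a small set of hypothetical salaries (invented for illustration, not taken from the slides) and treats them as a whole population, so `pvariance` is the appropriate variance.

```python
from math import isclose
from statistics import mean, pvariance

# Hypothetical salaries (not from the slides), treated as a whole population.
salaries = [40_000, 50_000, 60_000, 90_000]

shifted = [s + 5_000 for s in salaries]   # everyone gets a $5000 raise
scaled = [1.10 * s for s in salaries]     # everyone gets a 10% raise

# Shifting: the mean moves by c, the variance is unchanged.
assert mean(shifted) == mean(salaries) + 5_000
assert pvariance(shifted) == pvariance(salaries)

# Scaling: the mean is multiplied by a, the variance by a^2.
assert isclose(mean(scaled), 1.10 * mean(salaries))
assert isclose(pvariance(scaled), 1.10 ** 2 * pvariance(salaries))
```

The scaled comparisons use `isclose` because multiplying by 1.10 introduces small floating-point rounding.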
Properties of expected value and
standard deviation
• Additivity
– E(X ± Y) = E(X) ± E(Y)
– If X and Y are independent
• Var(X ± Y) = Var(X) + Var(Y)
• SD(X+Y) is NOT SD(X)+SD(Y)
• Suppose the outcomes for two customers are
independent, what is the variance for the total
payment to these two customers?
– Var(X+Y) = Var(X)+Var(Y) = 149600 + 149600 =
299200
• If one customer is insured twice as much, the
variance is
– Var(2X) = 4Var(X) = 4*149600 = 598400
– SD(2X) = 2SD(X)
X+Y and 2X
• X + Y is not the same as 2X: random variables do
not simply add up!
– X and Y have the same probability model
– But they are not the same random variable
– So X + Y can NOT be written as X + X = 2X
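To make the difference concrete, the sketch below builds the full probability model of X + Y (two independent policyholders) and of 2X (one policyholder with doubled coverage) from the insurance model in the slides. The helper `variance` and the variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

model = {
    10_000: Fraction(1, 1000),   # death
    5_000: Fraction(2, 1000),    # disability
    0: Fraction(997, 1000),      # neither
}

def variance(dist):
    """Variance of a discrete model given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# X + Y: two independent policyholders, so joint probabilities multiply.
sum_dist = defaultdict(Fraction)
for (x, px), (y, py) in product(model.items(), repeat=2):
    sum_dist[x + y] += px * py

# 2X: one policyholder with doubled coverage, so only the values double.
double_dist = {2 * x: p for x, p in model.items()}

print(sorted(sum_dist))       # [0, 5000, 10000, 15000, 20000]
print(sorted(double_dist))    # [0, 10000, 20000]
print(variance(sum_dist))     # 299200
print(variance(double_dist))  # 598400
```

The two models have the same expected value but different possible values and different variances, which is why the notation must keep X and Y distinct.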
Example
• Sell used Isuzu Trooper and purchase a
new Honda motor scooter
– Selling the Isuzu: mean of $6940 with a
standard deviation of $250
– Purchasing a new scooter: mean of $1413
with a standard deviation of $11
• How much money do I expect to have
after the transaction? What is the standard
deviation?
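One way to work this example is sketched below. It assumes the sale price and the purchase price are independent, which the slide does not state explicitly; without that assumption the variances could not simply be added. Variable names are illustrative.

```python
from math import hypot

mean_sale, sd_sale = 6940, 250   # selling the Isuzu
mean_cost, sd_cost = 1413, 11    # buying the scooter

# Money left over = sale price - purchase price.
expected_left = mean_sale - mean_cost   # E(X - Y) = E(X) - E(Y)

# Assuming independence, Var(X - Y) = Var(X) + Var(Y),
# so SD(X - Y) = sqrt(250^2 + 11^2).
sd_left = hypot(sd_sale, sd_cost)

print(expected_left)       # 5527
print(round(sd_left, 2))   # 250.24
```

Note that the standard deviation is about $250.24, not 250 - 11 or 250 + 11: variances add even for a difference.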
Combining Random Variables (The
Bad News)
• It would be nice if we could go directly from
models of each random variable to a model
for their sum.
• But, the probability model for the sum of
two random variables is not necessarily the
same as the model we started with even
when the variables are independent.
• Thus, even though expected values may
add, the probability model itself is different.
Combining Random Variables (The
Good News)
• When two independent continuous
random variables have Normal models, so
does their sum or difference.
• This fact will let us apply our knowledge of
Normal probabilities to questions about the
sum or difference of independent random
variables.
Combining normal random
variables
• Example: packaging stereos
– Stage 1: packing
• Normal with mean 9 min and SD 1.5 min
– Stage 2: boxing
• Normal with mean 6 min and SD 1 min
• What is the probability that packing two
consecutive systems takes over 20 minutes?
• X1: time for packing the first system
– mean=9, sd = 1.5
• X2: time for packing the second system
• T=X1+X2: total time to pack two systems
– E(T) = E(X1)+E(X2) = 9+9=18
– Var(T) = Var(X1)+Var(X2) = 1.5^2 + 1.5^2 = 4.5 (assuming
independence)
– T is Normal with mean 18 and sd 2.12
• z-score = (20-18)/2.12 = 0.94
– P(T>20) = P(Z>0.94) = 0.1736
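The same calculation can be sketched with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). Because the code does not round the z-score to 0.94, it gives about 0.173 rather than the table value 0.1736. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean_total = 9 + 9                     # E(T) = E(X1) + E(X2)
sd_total = sqrt(1.5 ** 2 + 1.5 ** 2)   # variances add under independence

total_time = NormalDist(mu=mean_total, sigma=sd_total)
p_over_20 = 1 - total_time.cdf(20)     # P(T > 20)

print(round(sd_total, 2))    # 2.12
print(round(p_over_20, 3))   # 0.173
```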
• What percentage of the stereo systems
take longer to pack than to box?
– P: time for packing a system
– B: time for boxing a system
– D=P-B: difference in times to pack and box a
system
– The question is P(D>0)=?
– Assuming P and B are independent
• D is still Normal
• E(D) = E(P-B) = E(P)-E(B) = 9-6=3
• Var(D) = Var(P-B) = Var(P)+Var(B) = 1.5^2 + 1^2 =
3.25
• SD(D) = 1.80
• D is Normal with mean 3 and sd 1.80
• P(D>0) = 0.9525
• About 95% of all the stereo systems will require
more time for packing than for boxing
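As a cross-check, `statistics.NormalDist` objects can be subtracted directly. The subtraction treats the two variables as independent Normals, so the means subtract and the variances add, matching the steps above. Variable names are illustrative.

```python
from statistics import NormalDist

packing = NormalDist(mu=9, sigma=1.5)
boxing = NormalDist(mu=6, sigma=1)

# Subtracting two NormalDist objects assumes independence:
# the means subtract and the variances add.
difference = packing - boxing

p_packing_longer = 1 - difference.cdf(0)   # P(D > 0)

print(difference.mean)               # 3.0
print(round(difference.stdev, 2))    # 1.8
print(round(p_packing_longer, 2))    # 0.95
```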
Correlation and Covariance
• If X is a random variable with expected
value E(X)=µ and Y is a random variable
with expected value E(Y)=ν, then the
covariance of X and Y is defined as
Cov(X,Y)=E((X-µ)(Y- ν))
• The covariance measures how X and Y
vary together.
Some properties of covariance
• Cov(X,Y) = Cov(Y,X)
• Cov(X,X) = Var(X)
• Cov(cX,dY) = c*d*Cov(X,Y)
• Cov(X,Y) = E(XY) - µν
• If X and Y are independent, Cov(X,Y) = 0
– The converse is NOT true
• Var(X ± Y) = Var(X) + Var(Y) ± 2Cov(X,Y)
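Several of these properties can be verified on small joint probability models. The two models below are hypothetical (they are not from the slides) and the helper names `expect` and `cov` are illustrative. The second model shows that zero covariance does not imply independence.

```python
from fractions import Fraction

def expect(joint, f):
    """E[f(X, Y)] for a joint model given as {(x, y): probability}."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

def cov(joint, f, g):
    """Covariance of f(X, Y) and g(X, Y), computed from the definition."""
    mu, nu = expect(joint, f), expect(joint, g)
    return expect(joint, lambda x, y: (f(x, y) - mu) * (g(x, y) - nu))

def X(x, y): return x
def Y(x, y): return y
def S(x, y): return x + y   # the sum X + Y

# Hypothetical joint model with dependent X and Y.
half, quarter = Fraction(1, 2), Fraction(1, 4)
joint_a = {(0, 0): half, (1, 0): quarter, (1, 1): quarter}

cov_xy = cov(joint_a, X, Y)
print(cov_xy)   # 1/8

# Shortcut formula: Cov(X, Y) = E(XY) - E(X)E(Y)
assert cov_xy == expect(joint_a, lambda x, y: x * y) - expect(joint_a, X) * expect(joint_a, Y)
# Cov(X, X) is Var(X), so Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
assert cov(joint_a, S, S) == cov(joint_a, X, X) + cov(joint_a, Y, Y) + 2 * cov_xy

# Zero covariance without independence: Y = X^2 is fully determined by X.
joint_b = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}
print(cov(joint_b, X, Y))   # 0
```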
Correlation and Covariance
(cont.)
• Covariance, unlike correlation, doesn’t
have to be between -1 and 1.
• To fix the “problem” we can divide the
covariance by each of the standard
deviations to get the correlation:
\[ \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} \]
What Can Go Wrong?
• Don’t assume everything’s Normal.
– You must Think about whether the Normality
Assumption is justified.
• Watch out for variables that aren’t
independent:
– You can add expected values for any two
random variables, but
– you can only add variances of independent
random variables.
What Can Go Wrong? (cont.)
• Don’t forget: Variances of independent
random variables add. Standard
deviations don’t.
• Don’t forget: Variances of independent
random variables add, even when you’re
looking at the difference between them.
• Don’t write independent instances of a
random variable with notation that looks
like they are the same variables.