Chapter 16
Random Variables
math2200

Life insurance
• A life insurance policy:
  – Pay $10,000 when the client dies
  – Pay $5,000 if the client is permanently disabled
  – Charge $50 per year

Random variable
• We call the variable X a random variable if the numeric value of X is based on the outcome of a random event.
  – e.g. the amount the company pays out on one policy
  – A random variable is often denoted by a capital letter, e.g. X, Y and Z. A particular value of the variable is often denoted by the corresponding lower case letter, e.g. x, y and z.

Random variable
• Discrete
  – If we can list all the outcomes (finite or countable)
  – e.g. the amount the insurance pays out is either $10,000, $5,000 or $0
• Continuous
  – Any numeric value within a range of values
  – Example: the time you spend from home to school

Probability model
• The collection of all possible values and the probabilities that they occur is called the probability model for the random variable.

Example
• Death rate: 1 out of every 1000 people per year
• Disability rate: 2 out of 1000 per year
• Probability model

  Policyholder outcome   Payment (x)   Probability (Pr(X=x))
  Death                  10,000        1/1000
  Disability             5,000         2/1000
  Neither                0             997/1000

What does the insurance company expect?
• 1000 people insured and in a year,
  – 1 dies
  – 2 disabled
  – pays $10,000 + $5,000*2 = $20,000
  – payment per customer: $20,000/1000 = $20
  – charge per customer: $50
  – profit: $30 per customer!

Expected value
• $20 is the expected payment per customer
• E(X) = 20 = 10000 * 1/1000 + 5000 * 2/1000 + 0 * 997/1000
• If X is a discrete random variable,
  E(X) = Σ x * P(X = x)

How about spread?
• Most of the time, the company makes $50 per customer
• But, with small probabilities, the company needs to pay a lot ($10000 or $5000)
• The variation is big
• How to measure the variation?

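The expected-value calculation for the insurance policy can be checked with a few lines of Python. This is only a minimal sketch: the names `model`, `premium`, `expected_payout` and `expected_profit` are illustrative and do not come from the slides. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Probability model from the slides: payout x -> Pr(X = x)
model = {
    10_000: Fraction(1, 1000),   # death
    5_000: Fraction(2, 1000),    # disability
    0: Fraction(997, 1000),      # neither
}

# The probabilities of a valid probability model must sum to 1
assert sum(model.values()) == 1

# E(X) = sum of x * Pr(X = x) over all outcomes
expected_payout = sum(x * p for x, p in model.items())

premium = 50  # yearly charge per customer, in dollars
expected_profit = premium - expected_payout

print(expected_payout)  # 20
print(expected_profit)  # 30
```

The result matches the per-customer reasoning on the slide: an expected payout of $20 and an expected profit of $30.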
Spread
• The variance of a random variable is:
  Var(X) = Σ [x - E(X)]^2 * P(X = x)
• The standard deviation for a random variable is:
  SD(X) = √Var(X)

Variance and standard deviation

  Policyholder outcome   Payment (x)   Probability Pr(X=x)   Deviation
  Death                  10,000        1/1000                10000 - 20 = 9980
  Disability             5,000         2/1000                5000 - 20 = 4980
  Neither                0             997/1000              0 - 20 = -20

• Var(X) = Σ [x - E(X)]^2 * P(X = x)
• Variance = 9980^2 (1/1000) + 4980^2 (2/1000) + (-20)^2 (997/1000) = 149,600
• Standard deviation = square root of variance
• SD(X) = √149,600 ≈ $386.78

Properties of Expected value and Standard deviation
• Shifting
  – E(X+c) = E(X) + c
  – Var(X+c) = Var(X)
  – Example: Consider everyone in a company receiving a $5000 increase in salary.
• Rescaling
  – E(aX) = aE(X)
  – Var(aX) = a^2 Var(X)
  – Example: Consider everyone in a company receiving a 10% increase in salary.

Properties of expected value and standard deviation
• Additivity
  – E(X ± Y) = E(X) ± E(Y)
  – If X and Y are independent
    • Var(X ± Y) = Var(X) + Var(Y)
• If the payments for two customers are independent, the variance of the total payment to these two customers is
  Var(X+Y) = Var(X) + Var(Y) = 149600 + 149600 = 299200
• If one customer is insured for twice as much, the variance is
  – Var(2X) = 4Var(X) = 4*149600 = 598400
  – SD(2X) = 2SD(X)

Example: Combine Random Variables
• Sell a used Isuzu Trooper and purchase a new Honda motor scooter
  – Selling the Isuzu for a mean of $6940 with a standard deviation of $250
  – Purchasing a new scooter for a mean of $1413 with a standard deviation of $11
• How much money do I expect to have after the transaction? What is the standard deviation?

Combining Random Variables
• Bad news: the probability model for the sum of two variables is often different from what we start with.
• Good news: the magical Normal model
  – the probability model for the sum of independent Normal random variables is still Normal.

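The variance calculation and the combination rules on the preceding slides can be checked numerically. The following is a minimal sketch, not part of the original slides. It assumes that the Isuzu sale price and the scooter purchase price are independent, which the example does not state explicitly, and all variable names are illustrative.

```python
import math

# Probability model from the slides: payout x -> Pr(X = x)
model = {10_000: 1 / 1000, 5_000: 2 / 1000, 0: 997 / 1000}

# E(X), Var(X) and SD(X) straight from the definitions
mean = sum(x * p for x, p in model.items())
variance = sum((x - mean) ** 2 * p for x, p in model.items())
sd = math.sqrt(variance)
print(f"E(X) = {mean:.2f}, Var(X) = {variance:.2f}, SD(X) = {sd:.2f}")

# Two independent customers: variances add, standard deviations do not
var_two = variance + variance
print(f"SD(X + Y) = {math.sqrt(var_two):.2f}, but 2 * SD(X) = {2 * sd:.2f}")

# One customer insured for twice as much: Var(2X) = 4 * Var(X)
var_double = 4 * variance
print(f"Var(2X) = {var_double:.2f}")

# Isuzu sale minus scooter purchase, ASSUMING the two amounts are independent:
# means subtract, variances add
net_mean = 6940 - 1413
net_sd = math.sqrt(250 ** 2 + 11 ** 2)
print(f"Expected money left = {net_mean}, SD = {net_sd:.2f}")
```

Under the independence assumption, the transaction leaves an expected $5527 with a standard deviation of about $250.24. The scooter's small standard deviation barely changes the total because variances, not standard deviations, are what add.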
Example: Combining normal random variables
• Packaging stereos
  – Packing the system
    • Normal with mean 9 min and sd 1.5 min
  – Boxing the system
    • Normal with mean 6 min and sd 1 min
• What is the probability that packing two consecutive systems takes over 20 minutes?
• What percentage of the stereo systems take longer to pack than to box?

• X1: time to pack the first system, mean = 9, sd = 1.5
• X2: time to pack the second system, mean = 9, sd = 1.5
• T = X1 + X2: total time to pack two systems
  – E(T) = E(X1) + E(X2) = 9 + 9 = 18
  – Var(T) = Var(X1) + Var(X2) = 1.5^2 + 1.5^2 = 4.5 (assuming independence)
  – T is Normal with mean 18 and sd √4.5 ≈ 2.12
  – P(T>20) = normalcdf(20, 1E99, 18, 2.12) ≈ 0.173

• What percentage of the stereo systems take longer to pack than to box?
  – P: time for packing
  – B: time for boxing
  – D = P - B: difference in times to pack and box a system
  – The question is P(D>0) = ?
  – Assuming P and B are independent
    • E(D) = E(P-B) = E(P) - E(B) = 9 - 6 = 3
    • Var(D) = Var(P-B) = Var(P) + Var(B) = 1.5^2 + 1^2 = 3.25
    • SD(D) = √3.25 ≈ 1.80
    • D is Normal with mean 3 and sd 1.80
    • P(D>0) = normalcdf(0, 1E99, 3, 1.80) ≈ 0.952
    • About 95% of all the stereo systems will require more time for packing than for boxing

Correlation and Covariance (OPTIONAL)
• If E(X) = µ and E(Y) = ν, then the covariance of the random variables X and Y is defined as
  Cov(X,Y) = E((X-µ)(Y-ν))
• The covariance measures how X and Y vary together.

Properties of covariance
• Cov(X,Y) = Cov(Y,X)
• Cov(X,X) = Var(X)
• Cov(cX,dY) = c*d*Cov(X,Y)
• Cov(X,Y) = E(XY) - µν
• If X and Y are independent, Cov(X,Y) = 0
  – The converse is NOT true
• Var(X ± Y) = Var(X) + Var(Y) ± 2Cov(X,Y)

Correlation and Covariance (cont.)
• Covariance, unlike correlation, doesn’t have to be between -1 and 1.
• To fix the “problem” we can divide the covariance by each of the standard deviations to get the correlation coefficient:
  Corr(X,Y) = Cov(X,Y) / (σX σY), where σX = SD(X) and σY = SD(Y)

What Can Go Wrong?
• Don’t assume everything’s Normal.
  – You must Think about whether the Normality Assumption is justified.
• Watch out for variables that aren’t independent:
  – You can add expected values for any two random variables, but
  – you can only add variances of independent random variables.

What Can Go Wrong? (cont.)
• Don’t forget: Variances of independent random variables add. Standard deviations don’t.
• Don’t forget: Variances of independent random variables add, even when you’re looking at the difference between them.
• Don’t write independent instances of a random variable with notation that looks like they are the same variables.
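
The stereo example and the warnings about adding variances can be reproduced in Python. This is a sketch rather than part of the original slides: it uses the standard-library `statistics.NormalDist` as a stand-in for the calculator's normalcdf, keeps the slides' assumption that the times are independent and Normal, and uses illustrative variable names.

```python
from math import sqrt
from statistics import NormalDist

# Two independent packing times, each Normal with mean 9 and sd 1.5
sd_total = sqrt(1.5 ** 2 + 1.5 ** 2)      # variances add, then take the root
total = NormalDist(mu=9 + 9, sigma=sd_total)
p_over_20 = 1 - total.cdf(20)             # P(T > 20)

# Difference D = P - B: packing is Normal(9, 1.5), boxing is Normal(6, 1)
sd_diff = sqrt(1.5 ** 2 + 1.0 ** 2)       # variances still ADD for a difference
diff = NormalDist(mu=9 - 6, sigma=sd_diff)
p_pack_longer = 1 - diff.cdf(0)           # P(D > 0)

# Common mistake: subtracting standard deviations directly
wrong_sd_diff = 1.5 - 1.0

print(f"P(T > 20) = {p_over_20:.3f}")
print(f"SD(D) = {sd_diff:.2f} (not {wrong_sd_diff:.2f})")
print(f"P(D > 0) = {p_pack_longer:.3f}")
```

Values computed this way can differ in the third decimal place from answers read off a z-table, because the table rounds z to two decimals.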