Chapter 16 Random Variables
math2200

Life insurance
• Insurance company: a “death and disability” policy
– Pays $10,000 when the client dies
– Pays $5,000 if the client is permanently disabled
– Charges $50 per year
• Why $50?
– Using actuarial information, the company can calculate the expected value of the policy.

Random variable
• The amount the company pays out on an individual policy is called a random variable.
• A random variable assumes a value based on the outcome of a random event.
– A random variable is often denoted by a capital letter, e.g., X
– A particular value that it can take is often denoted by the corresponding lowercase letter, e.g., x

Random variable
• Discrete
– We can list all the outcomes (finite or countably infinite)
– E.g., the amount the insurance company pays out is either $10,000, $5,000, or $0
• Continuous
– Can take any numeric value within a range of values
– Example: the time it takes you to get from home to school

Probability model
• The collection of all possible values and the probabilities that they occur is called the probability model for the random variable.

Example
• The death rate in any year is 1 out of every 1000 people
• 2 out of every 1000 suffer some kind of disability
• Probability model

Policyholder outcome   Payment (x)   Probability Pr(X = x)
Death                  10,000        1/1000
Disability             5,000         2/1000
Neither                0             997/1000

What does the insurance company expect?
• Suppose it insures exactly 1000 people
• In a typical year,
– 1 customer dies
– 2 are disabled
– The insurance company pays $10,000 + $5,000 × 2 = $20,000
– Payment per customer: $20,000/1000 = $20
– Premium collected per customer: $50
– Profit: $30 per customer!

Expected value
• $20 is the expected payment per customer
• E(X) = 20 = (10000 × 1 + 5000 × 2 + 0 × 997) / 1000 = 10000 × (1/1000) + 5000 × (2/1000) + 0 × (997/1000)
• E(X) = Σ x × P(X = x)
– Center of the distribution
– A parameter of the model

Expected value
• Of particular interest is the value we expect a random variable to take on, denoted μ (for population mean) or E(X) (for expected value).
• The expected value of a (discrete) random variable can be found by summing the products of each possible value and the probability that it occurs:
\[ \mu = E(X) = \sum x \cdot P(X = x) \]
• Note: Be sure that every possible outcome is included in the sum, and verify that you have a valid probability model to start with.

• Most of the time, the company makes $50 per customer
• But, with small probabilities, the company needs to pay a lot ($10,000 or $5,000)
• The variation is big
• How do we measure the variation?

Spread
• For data, we calculated the standard deviation by first computing each deviation from the mean and squaring it. We do the same with discrete random variables.
• The variance of a random variable is:
\[ \sigma^2 = \mathrm{Var}(X) = \sum (x - \mu)^2 \cdot P(X = x) \]
• The standard deviation of a random variable is:
\[ \sigma = \mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} \]

Variance and standard deviation

Policyholder outcome   Payment (x)   Probability Pr(X = x)   Deviation
Death                  10,000        1/1000                  10000 - 20 = 9980
Disability             5,000         2/1000                  5000 - 20 = 4980
Neither                0             997/1000                0 - 20 = -20

• Var(X) = Σ [x - E(X)]^2 × P(X = x)
• Variance = 9980^2 × (1/1000) + 4980^2 × (2/1000) + (-20)^2 × (997/1000) = 149,600
• Standard deviation = square root of the variance
• SD(X) = √149,600 ≈ $386.78

Properties of expected value and standard deviation
• Shifting
– E(X + c) = E(X) + c
– Var(X + c) = Var(X)
– Example: Consider everyone in a company receiving a $5000 increase in salary.
• Scaling
– E(aX) = aE(X)
– Var(aX) = a^2 Var(X)
– Example: Consider everyone in a company receiving a 10% increase in salary.

Properties of expected value and standard deviation
• Additivity
– E(X ± Y) = E(X) ± E(Y)
– If X and Y are independent
  • Var(X ± Y) = Var(X) + Var(Y)
  • SD(X + Y) is NOT SD(X) + SD(Y)

• Suppose the outcomes for two customers are independent. What is the variance of the total payment to these two customers?
– Var(X + Y) = Var(X) + Var(Y) = 149600 + 149600 = 299200
• If one customer is insured for twice as much, the variance is
– Var(2X) = 4Var(X) = 4 × 149600 = 598400
– SD(2X) = 2SD(X)

X+Y and 2X
• Random variables do not simply add up together!
– X and Y have the same probability model
– But they are not the same random variable
– X + Y can NOT be written as X + X

Example
• Sell a used Isuzu Trooper and purchase a new Honda motor scooter
– Sell the Isuzu for a mean of $6940 with a standard deviation of $250
– Purchase a new scooter for a mean of $1413 with a standard deviation of $11
• How much money do I expect to have after the transaction? What is the standard deviation?

Combining Random Variables (The Bad News)
• It would be nice if we could go directly from models of each random variable to a model for their sum.
• But the probability model for the sum of two random variables is not necessarily the same as the model we started with, even when the variables are independent.
• Thus, even though expected values may add, the probability model itself is different.

Combining Random Variables (The Good News)
• When two independent continuous random variables have Normal models, so does their sum or difference.
• This fact will let us apply our knowledge of Normal probabilities to questions about the sum or difference of independent random variables.

Combining normal random variables
• Example: packaging stereos
– Stage 1: packing
  • Normal with mean 9 min and sd 1.5 min
– Stage 2: boxing
  • Normal with mean 6 min and sd 1 min
• What is the probability that packing two consecutive systems takes over 20 minutes?
• X1: time for packing the first system
– mean = 9, sd = 1.5
• X2: time for packing the second system
• T = X1 + X2: total time to pack two systems
– E(T) = E(X1) + E(X2) = 9 + 9 = 18
– Var(T) = Var(X1) + Var(X2) = 1.5^2 + 1.5^2 = 4.5 (assuming independence)
– T is Normal with mean 18 and sd √4.5 ≈ 2.12
• z-score = (20 - 18)/2.12 = 0.94
– P(T > 20) = P(Z > 0.94) = 0.1736
• What percentage of the stereo systems take longer to pack than to box?
– P: time for packing a system
– B: time for boxing a system
– D = P - B: difference in times to pack and box a system
– The question is P(D > 0) = ?
– Assuming P and B are independent
  • D is still Normal
  • E(D) = E(P - B) = E(P) - E(B) = 9 - 6 = 3
  • Var(D) = Var(P - B) = Var(P) + Var(B) = 1.5^2 + 1^2 = 3.25
  • SD(D) = √3.25 ≈ 1.80
  • D is Normal with mean 3 and sd 1.80
  • z-score = (0 - 3)/1.80 ≈ -1.67, so P(D > 0) = P(Z > -1.67) = 0.9525
• About 95% of all the stereo systems will require more time for packing than for boxing

Correlation and Covariance
• If X is a random variable with expected value E(X) = µ and Y is a random variable with expected value E(Y) = ν, then the covariance of X and Y is defined as Cov(X, Y) = E((X - µ)(Y - ν))
• The covariance measures how X and Y vary together.

Some properties of covariance
• Cov(X, Y) = Cov(Y, X)
• Cov(X, X) = Var(X)
• Cov(cX, dY) = c × d × Cov(X, Y)
• Cov(X, Y) = E(XY) - µν
• If X and Y are independent, Cov(X, Y) = 0
– The converse is NOT true
• Var(X ± Y) = Var(X) + Var(Y) ± 2Cov(X, Y)

Correlation and Covariance (cont.)
• Covariance, unlike correlation, doesn’t have to be between -1 and 1.
• To fix the “problem”, we can divide the covariance by the product of the two standard deviations to get the correlation:
\[ \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} \]

What Can Go Wrong?
• Don’t assume everything’s Normal.
– You must Think about whether the Normality Assumption is justified.
• Watch out for variables that aren’t independent:
– You can add expected values for any two random variables, but
– you can only add variances of independent random variables.

What Can Go Wrong? (cont.)
• Don’t forget: Variances of independent random variables add. Standard deviations don’t.
• Don’t forget: Variances of independent random variables add, even when you’re looking at the difference between them.
• Don’t write independent instances of a random variable with notation that looks like they are the same variables.
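
Checking the chapter's calculations
• The worked examples in this chapter (the insurance policy, the Isuzu and scooter transaction, and the stereo packing times) can be checked numerically. The following is a minimal Python sketch, not part of the original slides. The names payout_model, expected_value, variance, and normal_upper_tail are illustrative assumptions, and the Isuzu calculation assumes the sale price and the purchase price are independent, which the example does not state.

```python
import math

# Probability model for the insurance payout X, taken from the example:
# payout in dollars -> probability of that payout in a given year.
payout_model = {10_000: 1 / 1000, 5_000: 2 / 1000, 0: 997 / 1000}


def expected_value(model):
    """E(X): sum of x * P(X = x) over all outcomes."""
    return sum(x * p for x, p in model.items())


def variance(model):
    """Var(X): sum of (x - E(X))^2 * P(X = x) over all outcomes."""
    mu = expected_value(model)
    return sum((x - mu) ** 2 * p for x, p in model.items())


def normal_upper_tail(threshold, mean, sd):
    """P(T > threshold) for a Normal(mean, sd) variable, via the error function."""
    z = (threshold - mean) / sd
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


# A valid probability model must have probabilities that sum to 1.
assert math.isclose(sum(payout_model.values()), 1.0)

mu = expected_value(payout_model)
var = variance(payout_model)
sd = math.sqrt(var)
print(f"Insurance: E(X) = {mu:.2f}, Var(X) = {var:.0f}, SD(X) = {sd:.2f}")

# Isuzu sale minus scooter purchase (independence is an assumption here):
# means subtract, variances add.
net_mean = 6940 - 1413
net_sd = math.sqrt(250**2 + 11**2)
print(f"Transaction: mean = {net_mean}, SD = {net_sd:.2f}")

# Two independent packing times, each Normal(9, 1.5): variances add.
total_sd = math.sqrt(1.5**2 + 1.5**2)
p_total_over_20 = normal_upper_tail(20, 9 + 9, total_sd)

# Packing minus boxing, Normal(9, 1.5) and Normal(6, 1): variances still add.
diff_sd = math.sqrt(1.5**2 + 1.0**2)
p_pack_longer = normal_upper_tail(0, 9 - 6, diff_sd)

print(f"Stereos: P(T > 20) = {p_total_over_20:.4f}, P(D > 0) = {p_pack_longer:.4f}")
```

• The Normal probabilities printed here may differ from the slide values in the third or fourth decimal place, because the slides round the z-score to two decimals before using the Normal table, while the sketch uses the unrounded z-score.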