Discrete Distributions

Random Variable
A random variable X is a function that maps the possible outcomes of an experiment to real numbers. That is, X: C --> R, where C is the set of all outcomes of an experiment and R is the set of real numbers. The space of X is the set of real numbers $S = \{x : X(c) = x,\ c \in C\}$.

An Example of a Random Variable
If we toss a coin one time, then there are two possible outcomes, namely "head up" and "tail up". We can define a random variable X that maps "head up" to 1 and "tail up" to 0. We can also define a random variable Y that maps "head up" to 0 and "tail up" to 1. The spaces of both random variables X and Y are {0, 1}.

Further Illustration of Random Variables
A random variable corresponds to a quantitative interpretation of the outcomes of an experiment. For example, a company offers its employees a drawing in its year-end party. A computer will randomly select an employee for the first prize of $100,000 based on the employees' ID numbers, which range from 1 to 100. In addition, the computer will randomly select two more employees for the second and third prizes of $50,000 and $10,000, respectively. Assume that each employee can receive only one award and the drawing starts with the third prize and ends with the first prize. Then, there are in total 100 × 99 × 98 = 970,200 possible outcomes.
To Edward, whose employee ID number is 10, the random variable of interest is
X(<10, *, *>) = 10,000
X(<*, 10, *>) = 50,000
X(<*, *, 10>) = 100,000
X(all other outcomes) = 0.
To Grace, whose employee ID number is 30, the random variable of interest is
Y(<30, *, *>) = 10,000
Y(<*, 30, *>) = 50,000
Y(<*, *, 30>) = 100,000
Y(all other outcomes) = 0.
The outcome spaces of random variables X and Y are identical. However, X and Y map some outcomes to different real numbers. The spaces of X and Y are also identical; both are {0, 10000, 50000, 100000}. The probability functions of X and Y are also equal:
Prob(X = 10,000) = Prob(Y = 10,000) = 0.01
Prob(X = 50,000) = Prob(Y = 50,000) = 0.01
Prob(X = 100,000) = Prob(Y = 100,000) = 0.01
Prob(X = 0) = Prob(Y = 0) = 0.97.
The expected values of X and Y are equal: E[X] = E[Y] = 10,000 × 0.01 + 50,000 × 0.01 + 100,000 × 0.01 = 1,600.

Discrete Random Variables
Given a random variable X, let S denote the space of X. If S is a finite or countably infinite set, then X is said to be a discrete random variable.

Countably Infinite
A set is said to be countably infinite if it contains an infinite number of elements and there exists a one-to-one mapping between the elements of the set and the positive integers.

Examples of Countable / Uncountable Infinite Sets
The set of integers is countable. The set of rational numbers is countable. The set of real numbers is uncountable.

Probability Mass Function
The probability mass function (p.m.f.) of a discrete random variable X is defined to be
$P_X(k) = \mathrm{Prob}(X = k) = \sum_{q \in Q_k} \mathrm{Prob}(q)$,
where $Q_k$ contains all outcomes that are mapped to k by random variable X. In the previous drawing example,
$P_X(10{,}000) = \mathrm{Prob}(X = 10{,}000) = \sum_{\langle 10, i, j\rangle,\ i \neq 10,\ j \neq 10,\ i \neq j} \mathrm{Prob}(\langle 10, i, j\rangle) = \frac{99 \times 98}{100 \times 99 \times 98} = \frac{1}{100} = 0.01$.
In fact, the p.m.f. of a random variable is defined on a set of events of the experiment conducted. In the previous drawing example, the set of outcomes that are mapped to 10,000 by X is an event. Furthermore, in the previous drawing example, random variables X and Y map some outcomes to different real numbers. However, X and Y have the same distribution, i.e., the p.m.f. of X and the p.m.f. of Y are equal. More precisely, $P_X(k) = P_Y(k)$ for every $k \in \{0, 10000, 50000, 100000\}$.
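To make the drawing example concrete, here is a small Python simulation; it is not part of the original slides, and the function name draw_once and the trial count are illustrative choices. It estimates the p.m.f. and the expected value of X by repeating the three-prize drawing many times, and the estimates should land near the exact values 0.97, 0.01, 0.01, 0.01 and 1,600.

    import random

    # Hypothetical sketch of the prize drawing described above: 100 employee IDs,
    # three distinct winners drawn for the $10,000, $50,000, and $100,000 prizes.
    def draw_once(my_id=10, n_employees=100):
        third, second, first = random.sample(range(1, n_employees + 1), 3)
        if my_id == third:
            return 10_000
        if my_id == second:
            return 50_000
        if my_id == first:
            return 100_000
        return 0

    # Estimate the p.m.f. and expected value of X by simulation.
    trials = 200_000
    outcomes = [draw_once() for _ in range(trials)]
    for k in (0, 10_000, 50_000, 100_000):
        print(f"P(X={k}) is approximately {outcomes.count(k) / trials:.4f}")  # ~0.97, 0.01, 0.01, 0.01
    print("E[X] is approximately", sum(outcomes) / trials)                    # ~1,600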
Properties of the Probability Mass Function
The p.m.f. of a random variable X satisfies the following three properties:
(1) $P_X(x) \ge 0$ for every $x \in S$, the space of X, and $P_X(x) = 0$ for $x \notin S$.
(2) $\sum_{x_i \in S} P_X(x_i) = 1$.
(3) $\mathrm{Prob}(X \in A) = \sum_{x_j \in A} P_X(x_j)$, where $A \subseteq S$.

Probability Distribution Function
For a random variable X, we define its probability distribution function F as
$F_X(t) = \mathrm{Prob}(X \le t)$.

Properties of a Probability Distribution Function
1. $\lim_{t \to \infty} F_X(t) = 1$.
2. $\lim_{t \to -\infty} F_X(t) = 0$.
3. $F_X(w) \ge F_X(t)$ if $w \ge t$.
Any function that satisfies the conditions above can be a distribution function.

An Example of the Probability Distribution Function of a Discrete Random Variable
Assume that we toss a 4-sided die twice. Then, we have 16 possible outcomes:
(1,1), (1,2), (1,3), (1,4), (2,1), (2,2), (2,3), (2,4), (3,1), (3,2), (3,3), (3,4), (4,1), (4,2), (4,3), (4,4).
Let random variable X be the sum of the two tosses. Then,
Prob(X = 2) = 1/16, Prob(X = 3) = 2/16,
Prob(X = 4) = 3/16, Prob(X = 5) = 4/16,
Prob(X = 6) = 3/16, Prob(X = 7) = 2/16,
Prob(X = 8) = 1/16.
$F_X(5) = \mathrm{Prob}(X \le 5) = \frac{1}{16} + \frac{2}{16} + \frac{3}{16} + \frac{4}{16} = \frac{5}{8}$.

Operations on Random Variables
Let X and Y be two random variables defined on the same outcome space of an experiment. Then, we can define a new random variable Z = f(X, Y). For example, in the drawing example, if Edward and Grace are husband and wife, then we can define a new random variable Z = X + Y. We have
X(<30, 10, *>) = 50,000
Y(<30, 10, *>) = 10,000
Z(<30, 10, *>) = 60,000.

Functions of Random Variables
Let X be a random variable and G be a function. Then, random variable Y = G(X) maps an outcome ν in the outcome space of X to the value G(X(ν)). With respect to the probability distribution functions, if G is a monotonically increasing, one-to-one mapping, then
$F_Y(t) = \mathrm{Prob}(Y \le t) = \mathrm{Prob}(G(X) \le t) = \mathrm{Prob}(X \le G^{-1}(t)) = F_X(G^{-1}(t))$.

An Example of Functions of Random Variables
Let random variable X be the sum of two tosses of a 4-sided die and Y = X². Then,
$F_Y(16) = \mathrm{Prob}(Y \le 16) = \mathrm{Prob}(X^2 \le 16) = \mathrm{Prob}(X \le 4) = F_X(4) = P_X(4) + P_X(3) + P_X(2) = \frac{6}{16} = \frac{3}{8}$.

Expected Value of a Discrete Random Variable
Let X be a discrete random variable and S be its space. Then, the expected value of X is
$E[X] = \sum_{z \in C} \mathrm{Prob}(z)\, X(z) = \sum_{x_i \in S} x_i P_X(x_i)$.
μ is a widely used symbol for the expected value.

Expected Value of a Function of a Random Variable
Let X be a random variable and G be a function. Then, the expected value of random variable Y = G(X) is equal to
$E[Y] = \sum_{x_i \in S} G(x_i) P_X(x_i)$.
Proof:
$E[Y] = \sum_{y_i \in S'} y_i P_Y(y_i)$, where S' is the space of Y,
$= \sum_{y_i \in S'} y_i\, \mathrm{Prob}(Y = y_i)$
$= \sum_{y_i \in S'} \;\sum_{\text{all } x_j \text{ such that } G(x_j) = y_i} G(x_j)\, \mathrm{Prob}(X = x_j)$
$= \sum_{x_j \in S} G(x_j) P_X(x_j)$.
For example, let X correspond to the outcome of tossing a die once. Then,
$P_X(1) = P_X(2) = P_X(3) = P_X(4) = P_X(5) = P_X(6) = 1/6$ and E[X] = 3.5.
Suppose we are concerned about the difference between the observed outcome and the mean and define Y = |X − E[X]|. Then $P_Y(1/2) = 1/3$, $P_Y(3/2) = 1/3$, $P_Y(5/2) = 1/3$. Therefore,
$E[Y] = \frac{1}{2}\cdot\frac{1}{3} + \frac{3}{2}\cdot\frac{1}{3} + \frac{5}{2}\cdot\frac{1}{3} = \frac{9}{2}\cdot\frac{1}{3} = \frac{3}{2}$.
On the other hand, computing directly over the space of X,
$\sum_{x_i} |x_i - E[X]|\, P_X(x_i) = \sum_{x_i} |x_i - 3.5|\cdot\frac{1}{6} = \frac{1}{6}(2.5 + 1.5 + 0.5 + 0.5 + 1.5 + 2.5) = \frac{9}{6} = \frac{3}{2}$.
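The die examples above can be checked with a short Python sketch; it is my own, not from the slides, and it uses exact fractions so the results are not obscured by rounding. It rebuilds the p.m.f. of the sum of two 4-sided dice, evaluates F_X(5) = 5/8, and computes E[|X − E[X]|] = 3/2 for a single 6-sided die directly over the space of X.

    from fractions import Fraction
    from itertools import product

    # p.m.f. of X, the sum of two tosses of a 4-sided die (each outcome has probability 1/16).
    pmf = {}
    for a, b in product(range(1, 5), repeat=2):
        s = a + b
        pmf[s] = pmf.get(s, Fraction(0)) + Fraction(1, 16)

    for s in sorted(pmf):
        print(s, pmf[s])                              # 2 1/16, 3 1/8, 4 3/16, 5 1/4, 6 3/16, 7 1/8, 8 1/16
    print(sum(p for k, p in pmf.items() if k <= 5))   # F_X(5) = 5/8

    # Expected value of a function of a random variable, using a single 6-sided die:
    # E[|X - E[X]|] computed directly over the space of X.
    die = {k: Fraction(1, 6) for k in range(1, 7)}
    mean = sum(k * p for k, p in die.items())         # E[X] = 7/2
    print(sum(abs(k - mean) * p for k, p in die.items()))  # 3/2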
Theorems about the Expected Value
(a) If c is a constant, E[c] = c.
(b) If c is a constant and g is a function, E[c g(X)] = c E[g(X)].
(c) If c1 and c2 are constants and g1 and g2 are functions, then E[c1 g1(X) + c2 g2(X)] = c1 E[g1(X)] + c2 E[g2(X)].
Proof of (a): Trivial.
Proof of (b): $E[c\, g(X)] = \sum_{x_i \in S} c\, g(x_i) P_X(x_i)$, where S is the space of X and $P_X(x)$ is the p.m.f. of X, $= c \sum_{x_i \in S} g(x_i) P_X(x_i) = c\, E[g(X)]$.
Proof of (c):
$E[c_1 g_1(X) + c_2 g_2(X)] = \sum_{x_i \in S} \big(c_1 g_1(x_i) + c_2 g_2(x_i)\big) P_X(x_i)$
$= c_1 \sum_{x_i \in S} g_1(x_i) P_X(x_i) + c_2 \sum_{x_i \in S} g_2(x_i) P_X(x_i)$
$= c_1 E[g_1(X)] + c_2 E[g_2(X)]$.
An extension of (c):
$E\left[\sum_{i=1}^{k} c_i g_i(X)\right] = \sum_{i=1}^{k} c_i E[g_i(X)]$.

Variance of a Discrete Random Variable
The variance of a random variable is defined to be $E[(X - \mu)^2]$ and is typically denoted by σ². For a discrete random variable X,
$\sigma^2 = \mathrm{Var}[X] = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - \mu^2$.
σ is normally called the standard deviation.

Variance of a Discrete Random Variable
Let X be a random variable with mean μ_X and variance σ_X². Let Y = aX + b, where a and b are constants. Then,
$E[Y] = E[aX + b] = a E[X] + b = a\mu_X + b$,
$\mathrm{Var}[Y] = E[(Y - \mu_Y)^2] = E[(aX + b - a\mu_X - b)^2] = E[a^2 (X - \mu_X)^2] = a^2 E[(X - \mu_X)^2] = a^2 \sigma_X^2$.

Variance of a Random Variable
The variance of a random variable measures the deviation of its distribution from the mean. For example, in one drawing, Robert has a 0.1% chance to win $100,000, while in another drawing, he has a 0.01% chance to win $1,000,000. The expected amounts of the award in these two drawings are equal:
0.001 × 100,000 = 100
0.0001 × 1,000,000 = 100.
However, their variances are different:
0.001 × (100,000 − 100)² + 0.999 × (0 − 100)² = 9,990,000
0.0001 × (1,000,000 − 100)² + 0.9999 × (0 − 100)² = 99,990,000.
In many distributions, the mean and variance together uniquely determine the parameters of the random variable.

The Bernoulli Experiment and Distribution
A Bernoulli experiment is a random experiment, the outcome of which can be classified in one of two mutually exclusive and exhaustive ways, say, success and failure. A sequence of Bernoulli trials occurs when a Bernoulli experiment is performed several independent times, so that the probability of success, say p, remains the same from trial to trial.

The Bernoulli Distribution
Let X be a Bernoulli random variable. The p.m.f. of X can be written as
$P_X(k) = p^k (1 - p)^{1 - k}$,
where k = 0 or 1 and p is the probability of success. The expected value of X is
$\sum_{k=0}^{1} k\, p^k (1 - p)^{1 - k} = p$.
The variance of X is
$\sum_{k=0}^{1} (k - p)^2\, p^k (1 - p)^{1 - k} = p(1 - p)$.

The Binomial Distribution
Let X be the random variable corresponding to the number of successes in a sequence of n Bernoulli trials. Then,
$P_X(k) = \mathrm{Prob}(X = k) = C_k^n\, p^k (1 - p)^{n - k}$,
where n is the number of Bernoulli trials and p is the probability of success in one trial. X is said to have a binomial distribution, normally denoted by b(n, p).

Example of the Binomial Distribution
Assume that Tiger and Whale are the two teams that enter the championship series of the professional basketball league. Based on prior records, Tiger has a 60% chance of beating Whale in a single game. Larry, who is a fan of Tiger, makes a bet with Peter, who is a fan of Whale. According to their agreement, Larry will pay Peter $1000 should Whale win the 5-game series. In order to make a fair bet, how much should Peter pay Larry if Tiger wins the series?
The probability that Tiger wins the series is
$C_3^5 (0.6)^3 (0.4)^2 + C_4^5 (0.6)^4 (0.4) + C_5^5 (0.6)^5 \approx 0.6826$.
$Z \times 0.6826 = 1000 \times (1 - 0.6826) \;\Rightarrow\; Z \approx 465$.
If the championship series consists of 3 games, then what is the probability that Tiger wins the series?
$C_2^3 (0.6)^2 (0.4) + C_3^3 (0.6)^3 = 0.648 < 0.6826$.
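As a check on the championship example, the following Python sketch, with my own helper name series_win_prob, sums the binomial tail to reproduce the series-win probability of about 0.6826, the fair payout Z of roughly 465, and the 0.648 figure for a 3-game series.

    from math import comb

    # Treat the series as n Bernoulli trials with per-game win probability p; winning
    # the series is equivalent to winning a majority of the n games.
    def series_win_prob(p, n_games):
        need = n_games // 2 + 1
        return sum(comb(n_games, k) * p**k * (1 - p)**(n_games - k)
                   for k in range(need, n_games + 1))

    p5 = series_win_prob(0.6, 5)
    print(round(p5, 4))                        # 0.6826
    # Fair bet: Z * P(Tiger wins) = 1000 * P(Whale wins)  =>  Z ~ 465
    print(round(1000 * (1 - p5) / p5))         # 465
    print(round(series_win_prob(0.6, 3), 4))   # 0.648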
The Moment-Generating Function
Let X be a discrete random variable with p.m.f. $P_X(x)$ and space S. If there is a positive number h such that
$E[e^{tX}] = \sum_{x_i \in S} e^{t x_i} P_X(x_i)$
exists and is finite for −h < t < h, then the function of t defined by $M(t) = E[e^{tX}]$ is called the moment-generating function of X, often abbreviated as m.g.f.

The Moment-Generating Function
Let X and Y be two discrete random variables with the same space S. If $E[e^{tX}] = E[e^{tY}]$ for all t in an open interval around 0, then the probability mass functions of X and Y are equal.
Insight of the argument above: assume that S = {s1, s2, ..., sk} contains only positive integers. Then, we have
$P_X(s_1) e^{t s_1} + P_X(s_2) e^{t s_2} + \cdots + P_X(s_k) e^{t s_k} = P_Y(s_1) e^{t s_1} + P_Y(s_2) e^{t s_2} + \cdots + P_Y(s_k) e^{t s_k}$.
Viewing both sides as polynomials in $e^t$ with exponents $s_1, \ldots, s_k$ and matching coefficients, we get $P_X(s_i) = P_Y(s_i)$ for every i, i.e., X and Y have the same p.m.f.

The Moment-Generating Function
Let $M_X(t)$ be the m.g.f. of a discrete random variable X. Then
$\frac{d^K}{dt^K} M_X(t) = \sum_{x_i \in S} x_i^K e^{t x_i} P_X(x_i)$.
Furthermore,
$\frac{d^K}{dt^K} M_X(0) = \sum_{x_i \in S} x_i^K P_X(x_i) = E[X^K]$.
In particular, $\mu_X = M_X'(0)$ and $\sigma_X^2 = M_X''(0) - \big(M_X'(0)\big)^2$.

The Moment-Generating Function of the Binomial Distribution
Let X be b(n, p). The direct computations
$E[X] = \sum_{k=0}^{n} k\, C_k^n\, p^k (1 - p)^{n - k}$ and $E[X^2] = \sum_{k=0}^{n} k^2\, C_k^n\, p^k (1 - p)^{n - k}$
are both difficult to compute. On the other hand, we can easily derive the m.g.f. of a binomial distribution:
$M_X(t) = E[e^{tX}] = \sum_{k=0}^{n} e^{tk}\, C_k^n\, p^k (1 - p)^{n - k} = \sum_{k=0}^{n} C_k^n (p e^t)^k (1 - p)^{n - k} = (p e^t + 1 - p)^n$.

The Moment-Generating Function of the Binomial Distribution
$M_X'(t) = n (p e^t + 1 - p)^{n-1} p e^t$
$M_X''(t) = n(n-1)(p e^t + 1 - p)^{n-2} (p e^t)^2 + n (p e^t + 1 - p)^{n-1} p e^t$
$M_X'(0) = np$
$M_X''(0) = n(n-1) p^2 + np$.
Therefore,
$\mu_X = M_X'(0) = np$,
$\sigma_X^2 = M_X''(0) - \big(M_X'(0)\big)^2 = n^2 p^2 - np^2 + np - n^2 p^2 = np(1 - p)$.

The Poisson Process
A Poisson process models the number of times that a particular type of event occurs during a time interval. The Poisson process is based on the following three assumptions:
(1) The numbers of event occurrences in non-overlapping intervals are independent.
(2) For small Δt, Prob(one occurrence between times t and t + Δt) is approximately λΔt.
(3) As Δt → 0, the probability of two or more occurrences between times t and t + Δt becomes negligible relative to Δt.
λ is the only parameter of the Poisson process. One example of the Poisson process is to model the number of Web accesses that a Web server receives between 8 AM and 9 AM.

The Basis of the Assumptions of the Poisson Process
Assume that an ideal random number generator generates λ numbers in [0, 1]. If we divide [0, 1] evenly into n subintervals, then the probability that exactly one of the generated numbers falls in [0, 1/n] is
$C_1^{\lambda} \left(\frac{1}{n}\right) \left(1 - \frac{1}{n}\right)^{\lambda - 1} = \frac{\lambda}{n} \left(1 - \frac{1}{n}\right)^{\lambda - 1}$.
The probability that exactly two of the generated numbers fall in [0, 1/n] is
$C_2^{\lambda} \left(\frac{1}{n}\right)^2 \left(1 - \frac{1}{n}\right)^{\lambda - 2} = \frac{\lambda(\lambda - 1)}{2 n^2} \left(1 - \frac{1}{n}\right)^{\lambda - 2}$.
Let Δt = 1/n. Then,
$\lim_{\Delta t \to 0} \mathrm{Prob}(\text{one occurrence in } [0, \Delta t]) = \lim_{n \to \infty} \frac{\lambda}{n}\left(1 - \frac{1}{n}\right)^{\lambda - 1} = \lambda\, \Delta t$,
$\lim_{\Delta t \to 0} \mathrm{Prob}(\text{two occurrences in } [0, \Delta t]) = \lim_{n \to \infty} \frac{\lambda(\lambda - 1)}{2 n^2}\left(1 - \frac{1}{n}\right)^{\lambda - 2} = \frac{\lambda(\lambda - 1)}{2}\, \Delta t^2$,
which vanishes faster than Δt.

The Poisson Distribution
Assume that we are concerned about a Poisson process with parameter λ and want to count the number of event occurrences during one time interval. We can divide the time interval (from time 0 to time 1) evenly into n subintervals, each of length 1/n. The probability that the event occurs k times during the time interval is
$\lim_{n \to \infty} C_k^n \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n - k} = \lim_{n \to \infty} \frac{n!}{k!\,(n-k)!} \cdot \frac{\lambda^k}{n^k} \left(1 - \frac{\lambda}{n}\right)^{n - k}$.
Since
$\lim_{n \to \infty} \frac{n!}{(n-k)!\, n^k} = 1$ and $\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n - k} = e^{-\lambda}$,
the final result is
$\frac{\lambda^k}{k!} e^{-\lambda}$.

The Poisson Distribution
We say that a random variable X has a Poisson distribution if
$P_X(k) = \frac{\lambda^k}{k!} e^{-\lambda}$.
By the Maclaurin series, we have
$\sum_{k=0}^{\infty} \frac{\lambda^k}{k!} = e^{\lambda}$.
Therefore,
$\sum_{k=0}^{\infty} P_X(k) = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} e^{-\lambda} = e^{\lambda} e^{-\lambda} = 1$.

The Poisson Distribution
The moment-generating function of a random variable with the Poisson distribution is
$M_X(t) = E[e^{Xt}] = \sum_{k=0}^{\infty} e^{kt} \frac{\lambda^k}{k!} e^{-\lambda} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda e^t)^k}{k!} = e^{-\lambda} e^{\lambda e^t} = e^{\lambda(e^t - 1)}$.
$M_X'(t) = \lambda e^t\, e^{\lambda(e^t - 1)}$
$M_X''(t) = \lambda e^t\, e^{\lambda(e^t - 1)} + (\lambda e^t)^2\, e^{\lambda(e^t - 1)}$.
Therefore,
$\mu_X = M_X'(0) = \lambda$ and $\sigma_X^2 = M_X''(0) - \big(M_X'(0)\big)^2 = \lambda + \lambda^2 - \lambda^2 = \lambda$.
Therefore, λ is the average rate of event occurrences per unit of time.
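The limit argument and the moment computations above can be illustrated numerically with the following sketch; the parameter value λ = 2 and the helper poisson_pmf are my own choices. The binomial(n, λ/n) probability at k = 3 approaches the Poisson value as n grows, and the mean and variance of the Poisson p.m.f. both come out as λ.

    from math import comb, exp, factorial

    lam = 2.0

    def poisson_pmf(k, lam):
        return lam**k * exp(-lam) / factorial(k)

    # Binomial(n, lam/n) at k = 3 approaches the Poisson(lam) value (about 0.18045) as n grows,
    # mirroring the limit derivation above.
    for n in (10, 100, 10_000):
        p = lam / n
        binom_at_3 = comb(n, 3) * p**3 * (1 - p)**(n - 3)
        print(n, round(binom_at_3, 5), round(poisson_pmf(3, lam), 5))

    # Mean and variance, summing far enough into the tail for the remainder to be negligible.
    mean = sum(k * poisson_pmf(k, lam) for k in range(50))
    var = sum(k**2 * poisson_pmf(k, lam) for k in range(50)) - mean**2
    print(round(mean, 6), round(var, 6))   # 2.0 2.0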
Let Y be the random variable corresponding to the number of event occurrences during a time interval of length t. Then,
$P_Y(k) = \frac{(\lambda t)^k}{k!} e^{-\lambda t}$.

The Poisson Distribution
The probability that the event occurs k times during a time interval of length t is
$\lim_{n \to \infty} C_k^n \left(\frac{\lambda t}{n}\right)^k \left(1 - \frac{\lambda t}{n}\right)^{n - k} = \frac{(\lambda t)^k}{k!} e^{-\lambda t}$.

Joint Distributions

Joint Probability Mass Function
Let X and Y be two discrete random variables defined on the same outcome set. The probability that X = x and Y = y is denoted by $P_{X,Y}(x, y) = \mathrm{Prob}(X = x, Y = y)$ and is called the joint probability mass function (joint p.m.f.) of X and Y. $P_{X,Y}(x, y)$ satisfies the following three properties:
(1) $0 \le P_{X,Y}(x, y) \le 1$.
(2) $\sum_{(x, y) \in S} P_{X,Y}(x, y) = 1$.
(3) $\mathrm{Prob}((X, Y) \in A) = \sum_{(x, y) \in A} P_{X,Y}(x, y)$, where A is a subset of S, the space of (X, Y).

Example of Joint Distributions
Assume that a supermarket collected the following statistics of customers' purchasing behavior:

            Purchasing Wine    Not Purchasing Wine
  Male            45                  255
  Female          70                  630

            Purchasing Juice   Not Purchasing Juice
  Male            60                  240
  Female         210                  490

Let random variable M correspond to whether a customer is male, random variable W correspond to whether a customer purchases wine, and random variable J correspond to whether a customer purchases juice. The joint p.m.f. of M and W is
P_{M,W}(0,1) = 0.07    P_{M,W}(1,1) = 0.045
P_{M,W}(0,0) = 0.63    P_{M,W}(1,0) = 0.255.
The joint p.m.f. of M and J is
P_{M,J}(0,1) = 0.21    P_{M,J}(1,1) = 0.06
P_{M,J}(0,0) = 0.49    P_{M,J}(1,0) = 0.24.

Marginal Probability Mass Function
Let $P_{X,Y}(x, y)$ be the joint p.m.f. of discrete random variables X and Y. Then
$P_X(x) = \mathrm{Prob}(X = x) = \sum_{y_j} \mathrm{Prob}(X = x, Y = y_j) = \sum_{y_j} P_{X,Y}(x, y_j)$
is called the marginal p.m.f. of X. Similarly,
$P_Y(y) = \sum_{x_i} P_{X,Y}(x_i, y)$
is called the marginal p.m.f. of Y.

More on the Joint Probability Mass Function
Note that we can always create a common outcome set for any two or more random variables. For example, let X and Y correspond to the outcomes of the first and second tosses of a coin, respectively. Then, the outcome set of X is {head up, tail up} and the outcome set of Y is also {head up, tail up}. The common outcome set of X and Y is {(head up, head up), (head up, tail up), (tail up, head up), (tail up, tail up)}.

Independent Random Variables
Two discrete random variables X and Y are said to be independent if and only if, for all possible combinations of x and y,
$P_{X,Y}(x, y) = P_X(x) P_Y(y)$.
Otherwise, X and Y are said to be dependent.

Example of Independent Random Variables
Assume that a supermarket collected the following statistics of customers' purchasing behavior:

            Purchasing soft drinks   Not purchasing soft drinks
  Male             90                       210
  Female          210                       490

Let random variable M correspond to whether a customer is male and random variable S correspond to whether a customer purchases soft drinks. Then, M and S are independent, since for all possible combinations of the values of M and S we have Prob(M = i, S = j) = Prob(M = i) Prob(S = j).
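The supermarket tables above can be turned into joint and marginal p.m.f.s mechanically. The sketch below, whose helper and variable names are my own, does so from the raw counts and tests the independence condition, confirming that M and W are dependent while M and S are independent.

    from itertools import product

    # Build the joint p.m.f. from a table of counts keyed by (M, other variable),
    # derive the marginal p.m.f.s, and test P(x, y) = P_X(x) * P_Y(y) for all cells.
    def analyze(counts):
        total = sum(counts.values())
        joint = {k: v / total for k, v in counts.items()}
        p_m = {m: sum(p for (mm, _), p in joint.items() if mm == m) for m in (0, 1)}
        p_other = {w: sum(p for (_, ww), p in joint.items() if ww == w) for w in (0, 1)}
        independent = all(abs(joint[m, w] - p_m[m] * p_other[w]) < 1e-12
                          for m, w in product((0, 1), repeat=2))
        return joint, p_m, p_other, independent

    # Keys are (M, W) or (M, S): 1 = male / purchases, 0 = female / does not purchase.
    wine = {(1, 1): 45, (1, 0): 255, (0, 1): 70, (0, 0): 630}
    soft_drinks = {(1, 1): 90, (1, 0): 210, (0, 1): 210, (0, 0): 490}
    print(analyze(wine)[1:])          # marginals of M and W, then False (dependent)
    print(analyze(soft_drinks)[1:])   # marginals of M and S, then True (independent)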
Another Example of Joint Distributions

  Object   X     Y     Class      Object   X     Y     Class
  1        7.1   9.1   1          11       10.9  8.8   2
  2        6.7   10.2  1          12       10.8  10.3  2
  3        7.5   10.6  1          13       11.1  11    2
  4        7.6   8.8   1          14       12.3  9.1   2
  5        8.1   10.3  1          15       12.1  9.7   2
  6        8.0   11.0  1          16       12    10.9  2
  7        8.6   8.9   1          17       13.1  8.9   2
  8        8.7   9.8   1          18       12.8  10.1  2
  9        9.2   11.2  1          19       13.2  11.3  2
  10       6.5   10.1  1          20       13.7  9.9   2
  Average  7.8   10.0  -          Average  12.2  10.0  -

[Scatter plots omitted: the joint p.m.f. of X, Y, and C; the joint p.m.f. of X and C; the joint p.m.f. of Y and C.]

Joint Distribution Function
Let X and Y be two random variables. The joint distribution function is defined as
$F_{X,Y}(x, y) = \mathrm{Prob}(X \le x, Y \le y)$.
Note that this definition applies to both discrete and continuous random variables.

Joint Probability Density Function
Assume that X and Y are two continuous random variables defined on the same space S. The joint probability density function of X and Y is defined as
$f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\, \partial y}$.
X and Y are said to be independent if and only if
$f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$.
In some textbooks, it is defined that two random variables are independent if and only if
$F_{X,Y}(x, y) = F_X(x)\, F_Y(y)$.
We have
$F_{X,Y}(x, y) = F_X(x) F_Y(y) \;\Rightarrow\; f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\, \partial y} = \frac{d F_X(x)}{dx} \cdot \frac{d F_Y(y)}{dy} = f_X(x)\, f_Y(y)$.
The marginal p.d.f. of X is
$f_X(x) = \int f_{X,Y}(x, y)\, dy$
and the marginal p.d.f. of Y is
$f_Y(y) = \int f_{X,Y}(x, y)\, dx$.

Jointly Independent and Pairwise Independent
Note that even if we have
$P_{X,Y}(x, y) = P_X(x) P_Y(y)$,
$P_{Y,Z}(y, z) = P_Y(y) P_Z(z)$,
$P_{X,Z}(x, z) = P_X(x) P_Z(z)$,
it is not necessarily true that
$P_{X,Y,Z}(x, y, z) = P_X(x) P_Y(y) P_Z(z)$.

An Example of Pairwise Independence
Let X and Y be two random variables that correspond to tossing an unbiased coin two times, and let Z = X ⊕ Y (the exclusive-or of X and Y). Then
Prob(Z = 0) = Prob(X = 0, Y = 0) + Prob(X = 1, Y = 1) = 1/2,
Prob(X = 0, Z = 0) = Prob(X = 0, Y = 0) = 1/4 = Prob(X = 0) Prob(Z = 0).
Therefore, X, Y, and Z are pairwise independent. However,
Prob(X = 0, Y = 0, Z = 1) = 0, while Prob(X = 0) Prob(Y = 0) Prob(Z = 1) = 1/8.
Hence, X, Y, and Z are not jointly independent. On the other hand, joint independence implies pairwise independence. For example,
$P_{X,Y}(x, y) = \sum_z P_{X,Y,Z}(x, y, z) = \sum_z P_X(x) P_Y(y) P_Z(z) = P_X(x) P_Y(y) \sum_z P_Z(z) = P_X(x) P_Y(y)$.

Addition of Two Random Variables
Let X and Y be two random variables. Then, E[X + Y] = E[X] + E[Y]. Note that the above equation holds even if X and Y are dependent.
Proof of the discrete case:
$E[X + Y] = \sum_x \sum_y P_{X,Y}(x, y)(x + y)$
$= \sum_x \sum_y x\, P_{X,Y}(x, y) + \sum_x \sum_y y\, P_{X,Y}(x, y)$
$= \sum_x x \sum_y P_{X,Y}(x, y) + \sum_y y \sum_x P_{X,Y}(x, y)$
$= \sum_x x\, P_X(x) + \sum_y y\, P_Y(y) = E[X] + E[Y]$.
On the other hand,
$\mathrm{Var}[X + Y] = E[((X + Y) - (\mu_X + \mu_Y))^2]$
$= E[(X + Y)^2 - 2(X + Y)(\mu_X + \mu_Y) + (\mu_X + \mu_Y)^2]$
$= E[(X + Y)^2] - (\mu_X + \mu_Y)^2$
$= E[X^2] + E[Y^2] + 2E[XY] - \mu_X^2 - \mu_Y^2 - 2\mu_X \mu_Y$
$= (E[X^2] - \mu_X^2) + (E[Y^2] - \mu_Y^2) + 2(E[XY] - \mu_X \mu_Y)$
$= \mathrm{Var}[X] + \mathrm{Var}[Y] + 2(E[XY] - E[X]E[Y])$.
Note that if X and Y are independent, then
$E[XY] = \sum_x \sum_y xy\, P_{X,Y}(x, y) = \sum_x \sum_y xy\, P_X(x) P_Y(y) = \sum_x x\, P_X(x) \sum_y y\, P_Y(y) = E[X]\, E[Y]$.
Therefore, if X and Y are independent, then Var[X + Y] = Var[X] + Var[Y].

Covariance
Let X and Y be two random variables. Then, $E[(X - \mu_X)(Y - \mu_Y)]$ is called the covariance of X and Y and is denoted by $\sigma_{XY}$, where $\mu_X$ and $\mu_Y$ are the means of X and Y, respectively.
$E[(X - \mu_X)(Y - \mu_Y)] = E[XY - \mu_Y X - \mu_X Y + \mu_X \mu_Y] = E[XY] - \mu_Y E[X] - \mu_X E[Y] + \mu_X \mu_Y = E[XY] - \mu_X \mu_Y$.
Therefore, if X and Y are independent, then Cov[X, Y] = 0.

Examples of Correlated Random Variables
Assume that a supermarket collected the following statistics of customers' purchasing behavior:

            Purchasing Wine    Not Purchasing Wine
  Male            45                  255
  Female          70                  630

            Purchasing Juice   Not Purchasing Juice
  Male            60                  240
  Female         210                  490

Let random variable M correspond to whether a customer is male, random variable W correspond to whether a customer purchases wine, and random variable J correspond to whether a customer purchases juice. The joint p.m.f. of M and W is
P_{M,W}(0,1) = 0.07    P_{M,W}(1,1) = 0.045
P_{M,W}(0,0) = 0.63    P_{M,W}(1,0) = 0.255.
Cov(M, W) = E[MW] − E[M]E[W] = 0.045 − 0.3 × 0.115 = 0.0105 > 0, so M and W are positively correlated.
The joint p.m.f. of M and J is
P_{M,J}(0,1) = 0.21    P_{M,J}(1,1) = 0.06
P_{M,J}(0,0) = 0.49    P_{M,J}(1,0) = 0.24.
Cov(M, J) = E[MJ] − E[M]E[J] = 0.06 − 0.3 × 0.27 = −0.021 < 0, so M and J are negatively correlated.
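A quick numerical check of the two covariances above, computed from the joint p.m.f.s directly; the helper cov_from_joint is my own name, not from the slides.

    # Cov(X, Y) = E[XY] - E[X]E[Y] for two 0/1 random variables given their joint p.m.f.
    def cov_from_joint(joint):
        e_x = sum(x * p for (x, _), p in joint.items())
        e_y = sum(y * p for (_, y), p in joint.items())
        e_xy = sum(x * y * p for (x, y), p in joint.items())
        return e_xy - e_x * e_y

    pmw = {(1, 1): 0.045, (1, 0): 0.255, (0, 1): 0.07, (0, 0): 0.63}
    pmj = {(1, 1): 0.06,  (1, 0): 0.24,  (0, 1): 0.21, (0, 0): 0.49}
    print(round(cov_from_joint(pmw), 4))   # 0.0105 > 0: M and W positively correlated
    print(round(cov_from_joint(pmj), 4))   # -0.021 < 0: M and J negatively correlated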
Covariance of Independent Random Variables
Assume that the supermarket also collected the following statistics of customers' purchasing behavior:

            Purchasing soft drinks   Not purchasing soft drinks
  Male             90                       210
  Female          210                       490

The joint p.m.f. of M and S is
P_{M,S}(0,1) = 0.21    P_{M,S}(1,1) = 0.09
P_{M,S}(0,0) = 0.49    P_{M,S}(1,0) = 0.21.
Cov(M, S) = E[MS] − E[M]E[S] = 0.09 − 0.3 × 0.3 = 0, due to the fact that M and S are independent.

Correlation Coefficient
The correlation coefficient of two random variables X and Y is defined as
$\rho = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$.

Bounds of a Correlation Coefficient
Let
$K(b) = E\big[((Y - \mu_Y) - b(X - \mu_X))^2\big] = \sigma_Y^2 - 2b\rho\sigma_X\sigma_Y + b^2\sigma_X^2$.
We have
$K\!\left(\frac{\rho\sigma_Y}{\sigma_X}\right) = \sigma_Y^2 (1 - \rho^2)$.
Since K(b) is the expected value of a square, K(b) ≥ 0 for all b ∈ R. Therefore,
$-1 \le \rho \le 1$.

Implication of the Value of the Correlation Coefficient
Assume that the supermarket collected the following statistics of customers' purchasing behavior:

            Purchasing cosmetics   Not purchasing cosmetics
  Male            10                      290
  Female         260                      440

Let random variable M correspond to whether a customer is male and random variable C correspond to whether a customer purchases cosmetics. Then, the correlation coefficient of M and C is −0.349.
On the other hand, we also have the following dataset:

            Purchasing juice   Not purchasing juice
  Male            60                  240
  Female         210                  490

The correlation coefficient of M and J is −0.103.

Another Example of Correlation Coefficients

  Object   X     Y     Class      Object   X     Y     Class
  1        7.1   9.1   1          11       10.9  8.8   2
  2        6.7   10.2  1          12       10.8  10.3  2
  3        7.5   10.6  1          13       11.1  11    2
  4        7.6   8.8   1          14       12.3  9.1   2
  5        8.1   10.3  1          15       12.1  9.7   2
  6        8.0   11.0  1          16       12    10.9  2
  7        8.6   8.9   1          17       13.1  8.9   2
  8        8.7   9.8   1          18       12.8  10.1  2
  9        9.2   11.2  1          19       13.2  11.3  2
  10       6.5   10.1  1          20       13.7  9.9   2
  Average  7.8   10.0  -          Average  12.2  10.0  -

[Scatter plots omitted: the joint p.m.f. of X, Y, and C; the joint p.m.f. of X and C; the joint p.m.f. of Y and C.]

The correlation coefficient of X and C is
$\frac{E[XC] - E[X]E[C]}{\sigma_X \sigma_C} = \frac{16.1 - 10 \times 1.5}{2.379 \times 0.5} \approx 0.925$.
On the other hand, the covariance of Y and C is E[YC] − E[Y]E[C] = 15 − 10 × 1.5 = 0, and therefore the correlation coefficient of Y and C is 0. With respect to data analysis, random variable X provides valuable information about the class of an object. On the other hand, random variable Y essentially provides no information about the class of an object.

Example of Uncorrelated Random Variables
Assume X and Y have the following joint p.m.f.:
$P_{X,Y}(0, 1) = P_{X,Y}(1, 0) = P_{X,Y}(2, 1) = 1/3$.
We have the following marginal p.m.f.s:
$P_X(0) = \sum_y P_{X,Y}(0, y) = 1/3$; $P_X(1) = \sum_y P_{X,Y}(1, y) = 1/3$; $P_X(2) = \sum_y P_{X,Y}(2, y) = 1/3$;
$P_Y(0) = \sum_x P_{X,Y}(x, 0) = 1/3$; $P_Y(1) = \sum_x P_{X,Y}(x, 1) = 2/3$.
Since $P_{X,Y}(0, 1) = 1/3 \neq P_X(0) P_Y(1) = \frac{1}{3} \times \frac{2}{3} = \frac{2}{9}$, X and Y are not independent. However,
$\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y] = \left(0 \cdot 1 \cdot \tfrac{1}{3} + 1 \cdot 0 \cdot \tfrac{1}{3} + 2 \cdot 1 \cdot \tfrac{1}{3}\right) - 1 \times \tfrac{2}{3} = \tfrac{2}{3} - \tfrac{2}{3} = 0$.
Therefore, independence implies uncorrelatedness, but the converse is not true.
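Finally, the quoted correlation coefficients can be reproduced from the raw 2×2 tables with the sketch below; the helper corr_2x2 is my own name, and both variables are treated as 0/1 indicators, so each standard deviation is that of a Bernoulli variable, sqrt(p(1 − p)).

    from math import sqrt

    def corr_2x2(n11, n10, n01, n00):
        """n11: male & purchasing, n10: male & not, n01: female & purchasing, n00: female & not."""
        n = n11 + n10 + n01 + n00
        e_m, e_c = (n11 + n10) / n, (n11 + n01) / n          # P(M=1), P(C=1)
        cov = n11 / n - e_m * e_c                            # E[MC] - E[M]E[C]
        return cov / (sqrt(e_m * (1 - e_m)) * sqrt(e_c * (1 - e_c)))

    print(round(corr_2x2(10, 290, 260, 440), 3))   # -0.349  (cosmetics)
    print(round(corr_2x2(60, 240, 210, 490), 3))   # -0.103  (juice)
    print(round(corr_2x2(90, 210, 210, 490), 3))   # ~0      (soft drinks: M and S independent)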