Random Variables Handout
Xavier Vilà
Course 2004-2005

1 Discrete Random Variables.

1.1 Introduction

1.1.1 Definition of Random Variable

A random variable X is a function that maps each possible outcome of a random experiment to a real number. Hence, if Ω is the set of all possible outcomes¹ of a random experiment, then:

    X : Ω → ℝ

A random variable is said to be discrete if the set of values it takes is a finite or countably infinite set.

1.1.2 Probabilities associated to X.

Since a random variable takes different values depending on some random outcomes, the probability of each of these values equals the probability of the outcomes that induce that value:

    P(X = x) = P({ω_i ∈ Ω : X(ω_i) = x})

Example 1.1 For instance, if we toss two dice and consider the random variable X defined by the sum of the two top faces, we have

    P(X = 2) = P({ω_i ∈ Ω : X(ω_i) = 2}) = P({(1, 1)}) = 1/36
    P(X = 3) = P({ω_i ∈ Ω : X(ω_i) = 3}) = P({(1, 2), (2, 1)}) = 2/36
    ...
    P(X = 7) = P({ω_i ∈ Ω : X(ω_i) = 7}) = P({(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}) = 6/36

    etc.

1.1.3 Mass Probability Function of a discrete random variable.

Once we know all the values a random variable can take and the corresponding probabilities, we can somehow “forget” about the random experiment upon which the variable was constructed. The mass probability function contains all the information we need to know about the random variable.

Definition 1.2 Given a random variable X, and the probabilities of each of its values, the mass probability function f_X of X is defined as f_X : ℝ → [0, 1] such that

    f_X(x) = P(X = x)   if x is one of the values X takes
    f_X(x) = 0          otherwise

Notice that with this function we have all the information we need about the variable.

¹ You might recall that this set is what is called the sample space.
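As a quick check, the mass function of Example 1.1 can be built in a few lines of Python (the helper name dice_sum_pmf is ours, not from the handout; exact fractions avoid rounding):

```python
from fractions import Fraction
from itertools import product

# Mass probability function of Example 1.1: X = sum of two dice.
# Each of the 36 equally likely outcomes (i, j) contributes 1/36
# to the probability of the value i + j.
def dice_sum_pmf():
    pmf = {}
    for i, j in product(range(1, 7), repeat=2):
        x = i + j
        pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, 36)
    return pmf

pmf = dice_sum_pmf()
print(pmf[2], pmf[3], pmf[7])  # 1/36 1/18 1/6
print(sum(pmf.values()))       # 1  (a mass function sums to one)
```

This reproduces P(X = 2) = 1/36, P(X = 3) = 2/36 and P(X = 7) = 6/36 from the example.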
This function completely characterizes the random variable.

Example 1.3 For instance, the function

    f_X(x) = (6 − |7 − x|)/36   if 2 ≤ x ≤ 12
    f_X(x) = 0                  otherwise

is the mass probability function for Example 1.1.

1.2 Moments of a discrete random variable.

1.2.1 Expectation.

Definition 1.4 The expectation (expected value, or mean) of the discrete random variable X is defined as:

    E(X) = Σ_x f_X(x) · x = Σ_x P(X = x) · x

The expectation has the following properties:

(i) E(a) = a
(ii) E(aX) = a E(X)
(iii) E(X + Y) = E(X) + E(Y)
(iv) E(XY) = E(X) · E(Y) if X and Y are independent²

1.2.2 Variance.

Definition 1.5 The variance of the discrete random variable X is defined as:

    V(X) = Σ_x f_X(x) (x − E(X))² = E[(X − E(X))²]

The variance has the following properties:

(i) V(a) = 0
(ii) V(aX) = a² V(X)
(iii) V(X + Y) = V(X) + V(Y) if X and Y are independent
(iv) V(X + a) = V(X)
(v) V(X) ≥ 0

A different (and more useful) formula to compute the variance is:

    V(X) = E(X²) − (E(X))²

² Two random variables X and Y are independent if P(X = x ∩ Y = y) = P(X = x) P(Y = y) = f_X(x) · f_Y(y).

1.3 Main discrete distributions.

1.3.1 Bernoulli distribution.

Definition 1.6 A random variable X is said to have a Bernoulli distribution with parameter p (X ∼ Ber(p)) if it only takes the values 1 and 0 with probabilities p and 1 − p respectively:

    f_X(1) = p
    f_X(0) = 1 − p

In this case we have:

    E(X) = p
    V(X) = p(1 − p)

This distribution is the simplest case of a discrete random variable. It corresponds to a random experiment that may result in 1 (“success”) with probability p or 0 (“failure”) with probability 1 − p.
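The moment formulas of Definitions 1.4 and 1.5 can be checked numerically on the mass function of Example 1.3 (the variable names below are ours):

```python
from fractions import Fraction

# Mass function of Example 1.3: f_X(x) = (6 - |7 - x|)/36 for 2 <= x <= 12.
f = {x: Fraction(6 - abs(7 - x), 36) for x in range(2, 13)}

# Definition 1.4: E(X) = sum over x of f_X(x) * x.
mean = sum(p * x for x, p in f.items())

# Shortcut formula: V(X) = E(X^2) - (E(X))^2.
second_moment = sum(p * x**2 for x, p in f.items())
var = second_moment - mean**2

print(mean)  # 7
print(var)   # 35/6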
1.3.2 Binomial distribution.

Definition 1.7 A random variable X is said to have a Binomial distribution with parameters n and p (X ∼ B(n, p)) if it only takes the values 0, 1, 2, . . . , n with probabilities given by

    P(X = x) = C(n, x) p^x (1 − p)^(n−x)

where C(n, x) = n! / (x!(n − x)!) is the binomial coefficient. In this case we have:

    E(X) = np
    V(X) = np(1 − p)

This variable X counts the number of “successes” after n repetitions of an experiment that may result in “success” with probability p or “failure” with probability 1 − p. For instance, flip a coin 10 times (n = 10) and count the number of “heads” (p = 0.5). In other words, a Binomial random variable is the sum of n independent Bernoulli random variables.

1.3.3 Geometric distribution.

Definition 1.8 A random variable X is said to have a Geometric distribution with parameter p (X ∼ G(p)) if it only takes the values 1, 2, 3, . . . with probabilities given by:

    P(X = x) = p(1 − p)^(x−1)

In this case we have:

    E(X) = 1/p
    V(X) = (1 − p)/p²

This variable X counts how many times we need to repeat a Bernoulli experiment until the first “success” is obtained. For instance, how many times we need to flip a coin to get the first “head”.

1.3.4 Poisson distribution.

Definition 1.9 A random variable X is said to have a Poisson distribution with parameter λ (X ∼ P(λ)) if it only takes the values 0, 1, 2, . . . with probabilities given by:

    P(X = x) = e^(−λ) λ^x / x!

In this case we have:

    E(X) = λ
    V(X) = λ

This variable X is like a Binomial but without knowing the exact number of repetitions (or assuming infinitely many repetitions). That is, X counts how many times we will obtain “success” in a given time interval. For instance, how many phone calls we will get Friday afternoon if the average number of phone calls is 1.3 (λ = 1.3).
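A small sketch of Definition 1.7, using the coin-flip example (n = 10, p = 0.5); the helper name binomial_pmf is ours:

```python
from math import comb

# Binomial pmf of Definition 1.7: P(X = x) = C(n, x) p^x (1-p)^(n-x).
def binomial_pmf(n, p):
    return [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

n, p = 10, 0.5
pmf = binomial_pmf(n, p)

total = sum(pmf)                                          # should be 1
mean = sum(x * q for x, q in enumerate(pmf))              # E(X) = n*p = 5
var = sum(x**2 * q for x, q in enumerate(pmf)) - mean**2  # V(X) = n*p*(1-p) = 2.5
```

The same list could be produced by convolving ten Bernoulli(p) mass functions, which is exactly the "sum of n independent Bernoulli variables" remark above.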
2 Continuous Random Variables.

2.1 Introduction

2.1.1 Definition of a continuous random variable

Definition 2.1 A random variable X is said to be continuous if the set of values it takes with positive probability is a non-countable infinite set.

Example 2.2 For instance, to randomly choose a number in the interval [0, 1].

2.1.2 Probability Density Function.

Definition 2.3 Given a continuous random variable X, its probability density function f_X(x) is defined as f_X : ℝ → ℝ such that

    P(a ≤ X ≤ b) = ∫_a^b f_X(x) dx

Notice that, as in the case of a discrete random variable, the probability density function is equivalent to the mass probability function in the sense that it completely characterizes the random variable. In other words, f_X contains all the information we need about the random variable X.

One important difference with respect to discrete random variables is that in the case of a continuous random variable the probability that X equals a specific value is ALWAYS ZERO. This is so because a continuous random variable can take a “large” number of values (an infinite non-countable number of values, to be more precise) and, hence, the probability of each single value is zero.

2.1.3 Cumulative Distribution Function.

Definition 2.4 Given a continuous random variable X, its cumulative distribution function is defined as F_X : ℝ → ℝ such that

    F_X(x) = P(X ≤ x) = ∫_−∞^x f_X(t) dt

The cumulative distribution function has the following properties:

(i) 0 ≤ F_X(x) ≤ 1
(ii) lim_{x→∞} F_X(x) = 1;  lim_{x→−∞} F_X(x) = 0
(iii) x₁ ≤ x₂ ⇒ F_X(x₁) ≤ F_X(x₂)
(iv) P(a ≤ X ≤ b) = F_X(b) − F_X(a)
(v) F_X′(x) = f_X(x) (important property).

2.2 Moments of a continuous random variable.

2.2.1 Expectation.

Definition 2.5 The expectation (expected value, or mean) of the continuous random variable X is defined as:

    E(X) = ∫_{x∈ℝ} f_X(x) · x dx

The expectation has the same properties as in the case of discrete random variables:

(i) E(a) = a
(ii) E(aX) = a E(X)
(iii) E(X + Y) = E(X) + E(Y)
(iv) E(XY) = E(X) · E(Y) if X and Y are independent

Remember that it is possible to compute the expectation not only of a random variable X, but also of any continuous transformation g(X) of it. That is,

    E(g(X)) = ∫_{x∈ℝ} f_X(x) · g(x) dx
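Definitions 2.3–2.5 can be sketched numerically. The density below, f_X(x) = 2x on [0, 1], is our own illustrative choice (not from the handout), and the midpoint-rule integrator is a deliberately simple stand-in for proper quadrature:

```python
# Illustrative density (our choice): f_X(x) = 2x on [0, 1], 0 elsewhere.
def f(x):
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(g, a, b, steps=100_000):
    # Midpoint rule: approximates the integral of g over [a, b].
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

total = integrate(f, 0, 1)                  # ≈ 1: a density integrates to one
prob = integrate(f, 0.2, 0.5)               # P(0.2 ≤ X ≤ 0.5) = 0.5² − 0.2² = 0.21
mean = integrate(lambda x: x * f(x), 0, 1)  # E(X) = ∫ x · 2x dx = 2/3
```

The `prob` line is Definition 2.3 directly; equivalently, by property (iv) of the CDF, it equals F_X(0.5) − F_X(0.2) with F_X(x) = x² on [0, 1].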
2.2.2 Variance.

Definition 2.6 The variance of the continuous random variable X is defined as:

    V(X) = ∫_{x∈ℝ} f_X(x) (x − E(X))² dx = E[(X − E(X))²]

The variance has the same properties as in the case of discrete random variables:

(i) V(a) = 0
(ii) V(aX) = a² V(X)
(iii) V(X + Y) = V(X) + V(Y) if X and Y are independent
(iv) V(X + a) = V(X)
(v) V(X) ≥ 0

A different (and more useful) formula to compute the variance is:

    V(X) = E(X²) − (E(X))²

2.3 Main continuous distributions

2.3.1 Uniform distribution.

Definition 2.7 A random variable X is said to have a uniform distribution on the interval [a, b] (X ∼ U[a, b]) if its probability density function is

    f_X(x) = 1/(b − a)   if x ∈ [a, b]
    f_X(x) = 0           otherwise

In this case we have:

    E(X) = (a + b)/2
    V(X) = (b − a)²/12

2.3.2 Exponential distribution.

Definition 2.8 A random variable X is said to have an exponential distribution with parameter λ if its probability density function is:

    f_X(x) = λ e^(−λx)   if x ≥ 0
    f_X(x) = 0           if x < 0

In this case we have:

    E(X) = 1/λ
    V(X) = 1/λ²

2.3.3 Normal distribution.

Definition 2.9 A random variable X is said to have a normal distribution with parameters µ and σ (X ∼ N(µ, σ²)) if its probability density function is

    f_X(x) = (1/(σ√(2π))) e^(−(1/2)((x − µ)/σ)²),   −∞ < x < ∞

In this case we have:

    E(X) = µ
    V(X) = σ²

2.3.4 Standard Normal distribution.

Definition 2.10 A random variable X is said to have a standard normal distribution (X ∼ N(0, 1), that is, a Normal distribution with µ = 0 and σ = 1) if its probability density function is

    f_X(x) = (1/√(2π)) e^(−x²/2),   −∞ < x < ∞

In this case, we clearly have:

    E(X) = 0
    V(X) = 1

Fact 2.11 Let X be a Normal random variable, X ∼ N(µ, σ²). Then the random variable Z defined as follows has a Standard Normal distribution:

    Z = (X − µ)/σ ∼ N(0, 1)
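Fact 2.11 is what makes normal probability tables possible: any P(a ≤ X ≤ b) reduces to the standard normal CDF Φ. A minimal sketch, using the standard identity Φ(z) = (1 + erf(z/√2))/2 and example numbers of our own (µ = 10, σ = 2):

```python
from math import erf, sqrt

def phi(z):
    # Standard normal CDF via the error function:
    # Phi(z) = (1 + erf(z / sqrt(2))) / 2.
    return (1 + erf(z / sqrt(2))) / 2

def normal_prob(a, b, mu, sigma):
    # Fact 2.11: standardize, then P(a <= X <= b) = Phi((b-mu)/sigma) - Phi((a-mu)/sigma).
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

# X ~ N(10, 4), i.e. mu = 10, sigma = 2:
print(round(normal_prob(8, 14, 10, 2), 4))  # 0.8186
```

Here (8 − 10)/2 = −1 and (14 − 10)/2 = 2, so the answer is Φ(2) − Φ(−1), exactly the two-table-lookup computation the standardization is designed for.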
2.3.5 log-normal distribution.

Definition 2.12 A random variable X is said to have a log-normal distribution with parameters µ and σ if the variable Y = ln X has a Normal distribution with parameters µ and σ. The probability density function of a log-normal random variable is

    f_X(x) = (1/(xσ√(2π))) e^(−(1/2)((ln x − µ)/σ)²),   0 < x < ∞

In this case we have:

    E(X) = e^(µ + σ²/2)
    V(X) = (e^(σ²) − 1) e^(2µ + σ²)

2.3.6 chi-squared distribution.

Definition 2.13 A random variable X is said to have a chi-squared distribution with n degrees of freedom (X ∼ χ²_n) if it is the sum of n squared standard normal random variables, X = Y₁² + Y₂² + . . . + Y_n², where Y_i ∼ N(0, 1). The probability density function of a chi-squared random variable is

    f_X(x) = x^(n/2 − 1) e^(−x/2) / (Γ(n/2) 2^(n/2)),   0 ≤ x < ∞

where Γ(a) is the Gamma function. In this case we have:

    E(X) = n
    V(X) = 2n

2.3.7 t-Student distribution.

Definition 2.14 A random variable X is said to have a t-Student distribution with n degrees of freedom (X ∼ t_n) if

    X = Z / √(Y/n)

where Z ∼ N(0, 1) and Y ∼ χ²_n. The probability density function of a t-Student random variable is

    f_X(x) = Γ((n + 1)/2) / (√(nπ) Γ(n/2) (1 + x²/n)^((n+1)/2)),   −∞ < x < ∞

In this case we have:

    E(X) = 0
    V(X) = n/(n − 2)   (n > 2)
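As a closing sanity check on Definition 2.13, the chi-squared density can be evaluated with the standard library's Gamma function and integrated numerically; it should integrate to 1 and have mean n. The helper names and the choice n = 4 are ours, and the upper limit 80 is a truncation of the infinite range (the tail beyond it is negligible):

```python
from math import gamma, exp

# Chi-squared pdf of Definition 2.13: f_X(x) = x^(n/2-1) e^(-x/2) / (Gamma(n/2) 2^(n/2)).
def chi2_pdf(x, n):
    return x ** (n / 2 - 1) * exp(-x / 2) / (gamma(n / 2) * 2 ** (n / 2))

def integrate(g, a, b, steps=200_000):
    # Midpoint rule over [a, b]; crude but sufficient for a check.
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

n = 4
total = integrate(lambda x: chi2_pdf(x, n), 0.0, 80.0)     # ≈ 1
mean = integrate(lambda x: x * chi2_pdf(x, n), 0.0, 80.0)  # ≈ E(X) = n = 4
```

The same check applied to x² recovers E(X²) = n² + 2n, consistent with V(X) = 2n.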