2 Random Variables

2.1 Random Variables

Real-valued functions defined on the sample space are known as random variables (r.v.'s):

    X : S → R

Example.
• X is a randomly selected number from the set {1, 2, 4, 5, 6, 10}.
• Y is the number of heads that have occurred in tossing a coin 10 times.
• V is the height of a randomly selected student.
• U is a randomly selected number from the interval (0, 1).

Discrete and Continuous Random Variables

Random variables may take either a finite or a countable number of possible values. Such random variables are called discrete. However, there also exist random variables that take on a continuum of possible values. These are known as continuous random variables.

Example. Let X be the number of tosses needed to get the first head.

Example. Let U be a number randomly selected from the interval [0, 1].

Distribution Function

The cumulative distribution function (c.d.f.) (or simply the distribution function) of the random variable X, say F, is the function defined by

    F(x) = P(X ≤ x)   for all x ∈ R.

Some properties of the c.d.f. F:
(i) F(x) is a nondecreasing function,
(ii) lim_{x→∞} F(x) = 1,
(iii) lim_{x→−∞} F(x) = 0.

All probability questions about X can be answered in terms of the c.d.f. F. For instance,

    P(a < X ≤ b) = F(b) − F(a).

If we desire the probability that X is strictly smaller than b, we may calculate it by

    P(X < b) = lim_{h→0⁺} P(X ≤ b − h) = lim_{h→0⁺} F(b − h).

Remark. Note that P(X < b) does not necessarily equal F(b).

2.2 Discrete Random Variables

Definition. (Discrete Random Variable) A random variable that can take on at most a countable number of possible values is said to be discrete. For a discrete random variable X, we define the probability mass function (p.m.f.) of X by

    p(a) = P(X = a).

Let X be a random variable taking the values x₁, x₂, . . . . Then we must have

    Σ_{i=1}^{∞} p(xᵢ) = 1.
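As a small illustration (not part of the original notes), the p.m.f. and c.d.f. of the first example above, a number picked uniformly at random from {1, 2, 4, 5, 6, 10}, can be sketched in Python:

```python
from fractions import Fraction

# p.m.f. of X: each of the six values is equally likely.
values = [1, 2, 4, 5, 6, 10]
pmf = {v: Fraction(1, 6) for v in values}

# A mass function must sum to 1.
assert sum(pmf.values()) == 1

def cdf(x):
    """F(x) = P(X <= x): sum p(v) over all values v <= x."""
    return sum(p for v, p in pmf.items() if v <= x)

print(cdf(5))      # P(X <= 5) = 4/6 = 2/3
print(cdf(-1))     # 0: F(x) -> 0 as x -> -infinity
print(cdf(100))    # 1: F(x) -> 1 as x -> +infinity
```

Note that F is constant between the possible values and jumps by p(v) at each v, which is the general shape of a discrete c.d.f.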
The distribution function F can be expressed in terms of the mass function by

    F(a) = Σ_{xᵢ ≤ a} p(xᵢ).

Example. Let X be a number randomly selected from the set {0, 1, 2, 3, 4, 5}. Find P(X ≤ 4).

The Binomial Random Variable

Suppose that n independent trials, each of which results in a "success" with probability p and in a "failure" with probability 1 − p, are to be performed. If X represents the number of successes that occur in the n trials, then X is said to be a binomial random variable with parameters (n, p), denoted X ∼ B(n, p). Its probability mass function is given by

    P(X = k) = p(k) = C(n, k) p^k (1 − p)^(n − k),   k = 0, 1, 2, . . . , n,

where

    C(n, k) = n! / (k! (n − k)!).

Note that

    Σ_{k=0}^{n} p(k) = Σ_{k=0}^{n} C(n, k) p^k (1 − p)^(n − k) = (p + (1 − p))^n = 1.

Example. According to a CNN/USA Today poll, approximately 70% of Americans believe the IRS abuses its power. Let X equal the number of people who believe the IRS abuses its power in a random sample of n = 20 Americans. Assuming that the poll results are still valid, find the probability that
(a) X is at least 13,
(b) X is at most 11.

The Geometric Random Variable

Suppose that independent trials, each having probability p of being a success, are performed until a success occurs. If we let X be the number of trials required until the first success, then X is said to be a geometric random variable with parameter p. Its probability mass function is given by

    p(n) = P(X = n) = (1 − p)^(n − 1) p,   n = 1, 2, . . . .

Note that

    Σ_{n=1}^{∞} p(n) = p Σ_{n=1}^{∞} (1 − p)^(n − 1) = p · 1 / (1 − (1 − p)) = 1.

Example. Let X be the number of tosses needed to get the first head. Here p = 1/2, so the mass function of X is

    p(x) = P(X = x) = 1 / 2^x,   x = 1, 2, 3, . . . ,

and hence

    Σ_x p(x) = Σ_{x=1}^{∞} 1 / 2^x = 1.

Example. Signals are transmitted according to a Poisson process with rate λ. Each signal is successfully transmitted with probability p and lost with probability 1 − p.
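The poll example can be worked numerically. The sketch below (an illustration, not part of the notes) builds the binomial p.m.f. from the formula above with n = 20, p = 0.7:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.7

# Sanity check: the p.m.f. sums to 1 over k = 0, ..., n.
assert abs(sum(binom_pmf(k, n, p) for k in range(n + 1)) - 1) < 1e-12

# (a) P(X >= 13) and (b) P(X <= 11), each a sum of p.m.f. terms.
p_at_least_13 = sum(binom_pmf(k, n, p) for k in range(13, n + 1))
p_at_most_11 = sum(binom_pmf(k, n, p) for k in range(12))
print(p_at_least_13, p_at_most_11)
```

Since E(X) = np = 14 here, "at least 13" covers most of the distribution's mass while "at most 11" is a lower-tail event, so the first probability is much larger than the second.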
The fates of different signals are independent. What is the distribution of the number of signals lost before the first one is successfully transmitted?

The Poisson Random Variable

A random variable X, taking on one of the values 0, 1, 2, . . . , is said to be a Poisson random variable with parameter λ if

    p(k) = P(X = k) = e^{−λ} λ^k / k!,   k = 0, 1, 2, . . . .

This equation defines a probability mass function since

    Σ_{k=0}^{∞} p(k) = e^{−λ} Σ_{k=0}^{∞} λ^k / k! = e^{−λ} e^{λ} = 1.

A Poisson random variable typically arises from counting discrete events in a continuous "interval" of time, length, or space.

Example. Suppose that the number of typographical errors on a single page of a book has a Poisson distribution with parameter λ = 1. Calculate the probability that there is at least one error on a page.

Assume that the average number of occurrences of the event per unit of "time" is λ, and let N(t) be the number of occurrences of the event in t units of "time". Then N(t) is a Poisson random variable with parameter λt, that is,

    P(N(t) = k) = e^{−λt} (λt)^k / k!,   k = 0, 1, 2, . . . .

Example. People enter a casino at a rate of 1 for every 2 minutes.
(a) What is the probability that no one enters between 12:00 and 12:05?
(b) What is the probability that at least 4 people enter the casino during that time?

Theorem. Let X ∼ B(n, p). If n is large and p is small, with λ = np moderate, then

    P(X = x) = C(n, x) p^x (1 − p)^(n − x) ≈ e^{−λ} λ^x / x!.

In other words, B(n, p) ≈ Poisson(λ), where λ = np.

Proof. Write p = λ/n. Then

    P(X = x) = C(n, x) (λ/n)^x (1 − λ/n)^(n − x)
             = [n! / (x! (n − x)!)] · (λ^x / n^x) · (1 − λ/n)^n · (1 − λ/n)^{−x}
             = (λ^x / x!) · [n(n − 1) · · · (n − x + 1) / n^x] · (1 − λ/n)^n · (1 − λ/n)^{−x}.

As n → ∞ with x fixed, n(n − 1) · · · (n − x + 1) / n^x → 1, (1 − λ/n)^n → e^{−λ}, and (1 − λ/n)^{−x} → 1, so

    P(X = x) → e^{−λ} λ^x / x!.

Example. Suppose that the probability that a randomly chosen item is defective is 0.01, and 800 items are shipped to a warehouse. What is the probability that there will be at most 5 defective items among the 800?
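For the defective-items example, the theorem above says B(800, 0.01) ≈ Poisson(8). The sketch below (illustrative only) compares the exact binomial tail with its Poisson approximation:

```python
from math import comb, exp, factorial

n, p = 800, 0.01
lam = n * p  # λ = np = 8

# Exact binomial probability P(X <= 5).
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(6))

# Poisson approximation: P(X <= 5) = sum of e^{-λ} λ^k / k! for k = 0..5.
approx = sum(exp(-lam) * lam**k / factorial(k) for k in range(6))

print(exact, approx)  # the two values agree to about two decimal places
```

The approximation is attractive because the Poisson p.m.f. needs only λ, not the individual n and p, and avoids the large binomial coefficients.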
2.3 Continuous Random Variables

Let X be a random variable whose set of possible values is uncountable. Such a random variable is called continuous.

Definition. A random variable X is continuous if there exists a nonnegative function f(x), defined for all real x ∈ (−∞, ∞), having the property that for any set B of real numbers

    P(X ∈ B) = ∫_B f(x) dx.

The function f(x) is called the probability density function (p.d.f.) of the random variable X. A density function must satisfy

    ∫_{−∞}^{∞} f(x) dx = P(X ∈ (−∞, ∞)) = 1

and

    P(a ≤ X ≤ b) = ∫_a^b f(x) dx.

The relationship between the c.d.f. F(x) and the p.d.f. f(x) is expressed by

    d/dx F(x) = f(x).

Remark. The density function is not a probability. For small ε,

    P(a − ε ≤ X ≤ a + ε) = ∫_{a−ε}^{a+ε} f(x) dx ≈ 2ε f(a).

From this we see that f(a) is a measure of how likely it is that the random variable will be near a.

The Uniform Random Variable

A random variable is said to be uniformly distributed over the interval (0, 1) if its probability density function is given by

    f(x) = 1 for 0 < x < 1,   f(x) = 0 otherwise.

Note that the preceding is a density function since f(x) ≥ 0 and

    ∫_{−∞}^{∞} f(x) dx = ∫_0^1 dx = 1,

and for any 0 < a < b < 1,

    P(a ≤ X ≤ b) = ∫_a^b f(x) dx = ∫_a^b 1 dx = b − a.

In general, we say that X is a uniform random variable on the interval (α, β) if its p.d.f. is given by

    f(x) = 1 / (β − α) for α < x < β,   f(x) = 0 otherwise.

Exponential Random Variables

A continuous random variable whose p.d.f. is given, for some λ > 0, by

    f(x) = λ e^{−λx} for x ≥ 0,   f(x) = 0 for x < 0,

is said to be an exponential random variable with parameter λ. The c.d.f. of X is

    F(x) = ∫_0^x f(t) dt = ∫_0^x λ e^{−λt} dt = 1 − e^{−λx},   x ≥ 0.

2.4 Expectation of a Random Variable

The Discrete Case

If X is a discrete random variable having probability mass function p(x), then the expected value of X is defined by

    E(X) = Σ_x x p(x),

provided Σ_x |x| p(x) < ∞.

Lemma. If X is a non-negative integer-valued random variable, then

    E(X) = Σ_{k=0}^{∞} P(X > k).

Example.
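The tail-sum lemma can be checked on the geometric distribution: there P(X > k) = (1 − p)^k, so the lemma gives E(X) = Σ_{k≥0} (1 − p)^k = 1/p. A small numerical sketch (illustrative, with sums truncated at 200 terms):

```python
p = 0.5  # probability of heads; X = number of tosses until the first head

# Direct definition: E(X) = sum over n of n * P(X = n) = n (1-p)^{n-1} p.
direct = sum(n * (1 - p)**(n - 1) * p for n in range(1, 200))

# Tail-sum lemma: E(X) = sum over k >= 0 of P(X > k) = (1-p)^k.
tail_sum = sum((1 - p)**k for k in range(200))

print(direct, tail_sum)  # both are numerically 1/p = 2
```

The truncation error is of order (1 − p)^200, which is negligible here, so both computations agree with the closed form 1/p to machine precision.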
(a) (Expectation of a Binomial Random Variable) Let X ∼ B(n, p). Calculate E(X).
(b) (Expectation of a Geometric Random Variable) Calculate the expectation of a geometric random variable having parameter p.
(c) (Expectation of a Poisson Random Variable) Calculate the expectation of a Poisson random variable having parameter λ.

The Continuous Case

The expected value of a continuous random variable is defined by

    E(X) = ∫_{−∞}^{∞} x f(x) dx,

provided ∫_{−∞}^{∞} |x| f(x) dx < ∞.

Lemma. If X is a non-negative random variable, then

    E(X) = ∫_0^∞ P(X > x) dx.

Example.
(a) (Expectation of a Uniform Random Variable) Let X be uniform on (α, β). Calculate E(X).
(b) (Expectation of an Exponential Random Variable) Calculate the expectation of an exponential random variable having parameter λ.
(c) (Expectation of a Normal Random Variable) Calculate the expectation of a normal random variable having parameters µ and σ².

2.5 Expectation of a Function of a Random Variable

Now we are interested in calculating not the expected value of X, but the expected value of some function of X, say g(X).

Proposition 1.
(a) If X is a discrete random variable with probability mass function p(x), then for any real-valued function g,

    E[g(X)] = Σ_x g(x) p(x).

(b) If X is a continuous random variable with probability density function f(x), then for any real-valued function g,

    E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx.

Proposition 2. If a and b are constants, then

    E(aX + b) = a E(X) + b,

and

    E(X + Y) = E(X) + E(Y).

Variance of a Random Variable

The expected value of a random variable X, E(X), is also referred to as the mean or the first moment. The quantity E(X^n), n ≥ 1, is called the nth moment of X. The variance of X, denoted by Var(X), is defined by

    Var(X) = E[X − E(X)]².

A useful formula to compute the variance is

    Var(X) = E(X²) − [E(X)]².

2.6 Jointly Distributed Random Variables

Thus far we have concerned ourselves with the probability distribution of a single random variable.
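The variance identity Var(X) = E(X²) − [E(X)]² can be verified directly for any small discrete distribution, here a fair six-sided die (an illustrative sketch, not part of the notes):

```python
from fractions import Fraction

# A fair die: p(x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())              # E(X) = 7/2
second_moment = sum(x**2 * p for x, p in pmf.items())  # E(X^2) = 91/6

# Definition of variance: Var(X) = E[(X - E(X))^2].
var_def = sum((x - mean)**2 * p for x, p in pmf.items())

# Shortcut formula: Var(X) = E(X^2) - [E(X)]^2.
var_short = second_moment - mean**2

print(var_def, var_short)  # both equal 35/12
```

Using exact fractions rather than floats makes the two computations agree exactly, not just up to rounding.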
However, we are often interested in probability statements concerning two or more random variables.

Joint Distribution Function

To deal with probabilities of two random variables X and Y, we define the joint distribution function of X and Y by

    F_{X,Y}(a, b) = P(X ≤ a, Y ≤ b),   −∞ < a, b < ∞.

The distribution function of X can be obtained from the joint c.d.f. as follows:

    F_X(a) = P(X ≤ a, Y < ∞) = F(a, ∞).

Similarly, the c.d.f. of Y is given by

    F_Y(b) = P(X < ∞, Y ≤ b) = F(∞, b).

Joint Probability Mass Function

Let X and Y both be discrete random variables; then the joint mass function of X and Y is given by

    p(x, y) = P(X = x, Y = y).

The probability mass function of X may be obtained from p(x, y) by

    p_X(x) = Σ_y p(x, y),

and similarly, the mass function of Y is

    p_Y(y) = Σ_x p(x, y).

Joint Probability Density Function

We say that X and Y are jointly continuous if there exists a function f(x, y), defined for all real x and y, having the property that for all sets A and B of real numbers

    P(X ∈ A, Y ∈ B) = ∫_B ∫_A f(x, y) dx dy.

The function f(x, y) is called the joint probability density function of X and Y. The distributions of X and Y can be obtained from their joint p.d.f. by

    P(X ∈ A) = ∫_A ∫_{−∞}^{∞} f(x, y) dy dx

and

    P(Y ∈ B) = ∫_B ∫_{−∞}^{∞} f(x, y) dx dy.

The integrals

    f_X(x) = ∫_{−∞}^{∞} f(x, y) dy   and   f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx

are called the density functions of X and Y, respectively.

Expectation of a Function of Two Random Variables

If X and Y are random variables and g is a function of two variables, then

    E[g(X, Y)] = Σ_y Σ_x g(x, y) p(x, y)                        in the discrete case,
    E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy   in the continuous case.

For instance, if g(X, Y) = X + Y, then, in the continuous case,

    E(X + Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x + y) f(x, y) dx dy
             = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x f(x, y) dx dy + ∫_{−∞}^{∞} ∫_{−∞}^{∞} y f(x, y) dx dy
             = E(X) + E(Y).

Proposition. For any constants a and b,

    E(aX + bY) = a E(X) + b E(Y).

Example.
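The marginal formulas and the identity E(X + Y) = E(X) + E(Y) can be illustrated with a tiny joint mass function (the table values below are hypothetical, chosen only for illustration):

```python
from fractions import Fraction

F = Fraction
# A joint p.m.f. p(x, y) on {0, 1} x {0, 1}.
joint = {(0, 0): F(1, 8), (0, 1): F(3, 8), (1, 0): F(2, 8), (1, 1): F(2, 8)}
assert sum(joint.values()) == 1

# Marginals: p_X(x) = sum over y of p(x, y); p_Y(y) = sum over x of p(x, y).
p_X = {x: sum(p for (a, b), p in joint.items() if a == x) for x in (0, 1)}
p_Y = {y: sum(p for (a, b), p in joint.items() if b == y) for y in (0, 1)}

EX = sum(x * p for x, p in p_X.items())
EY = sum(y * p for y, p in p_Y.items())

# E(X + Y) computed from the joint distribution...
EXplusY = sum((x + y) * p for (x, y), p in joint.items())
# ...equals E(X) + E(Y), as the proposition states.
assert EXplusY == EX + EY
print(EX, EY, EXplusY)
```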
Let us compute the expectation of a binomial random variable with parameters n and p, X ∼ B(n, p).

Solution. Write X = X₁ + X₂ + · · · + Xₙ, where

    Xᵢ = 1 if the ith trial is a success,   Xᵢ = 0 if the ith trial is a failure.

Hence E(Xᵢ) = 0 · (1 − p) + 1 · p = p, and thus

    E(X) = E(X₁ + X₂ + · · · + Xₙ) = np.

Example. At a party, N men throw their hats into the center of a room. The hats are mixed up and each man randomly selects one. Find the expected number of men who select their own hat.

Solution. Let X denote the number of men who select their own hats, and define

    Xᵢ = 1 if the ith man selects his own hat,   Xᵢ = 0 otherwise.

Hence X = X₁ + X₂ + · · · + X_N and E(Xᵢ) = 1/N. Thus E(X) = 1. That is, no matter how many people are at the party, on average exactly one of them will select his own hat.

Example. Suppose there are 4 different types of coupons, and suppose that each time one obtains a coupon it is equally likely to be any one of the 4 types. Compute the expected number of different types that are contained in a set of 10 coupons.

Solution. Define

    Xᵢ = 1 if at least one type-i coupon is in the set of 10,   Xᵢ = 0 otherwise.

Hence X = X₁ + X₂ + X₃ + X₄. Now,

    E(Xᵢ) = P(Xᵢ = 1)
          = P(at least one type-i coupon is in the set of 10)
          = 1 − P(no type-i coupons are in the set of 10)
          = 1 − (3/4)¹⁰.

Hence,

    E(X) = E(X₁) + E(X₂) + E(X₃) + E(X₄) = 4[1 − (3/4)¹⁰].

2.7 Independent Random Variables

The random variables X and Y are said to be independent if, for all a, b,

    P(X ≤ a, Y ≤ b) = P(X ≤ a) P(Y ≤ b).

When X and Y are discrete, the condition of independence reduces to

    p(x, y) = p_X(x) p_Y(y),

and if X and Y are jointly continuous, independence reduces to

    f(x, y) = f_X(x) f_Y(y).

Proposition. If X and Y are independent, then for any functions h and g, g(X) and h(Y) are independent and

    E[g(X) h(Y)] = E[g(X)] E[h(Y)].

Remark. In general, E[g(X) h(Y)] = E[g(X)] E[h(Y)] does NOT imply independence.
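The hat-matching answer E(X) = 1 can be sanity-checked by simulation (an illustrative sketch; the choices N = 10 and 200,000 trials are arbitrary):

```python
import random

random.seed(0)
N = 10            # number of men (and hats)
trials = 200_000

total_matches = 0
for _ in range(trials):
    hats = list(range(N))
    random.shuffle(hats)  # a uniformly random assignment of hats to men
    # Count the men who got their own hat back.
    total_matches += sum(1 for i, h in enumerate(hats) if i == h)

avg = total_matches / trials
print(avg)  # close to 1, as the indicator argument predicts
```

Rerunning with a different N leaves the average near 1, matching the observation that E(X) = N · (1/N) = 1 does not depend on N.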
2.8 Covariance

The covariance of any two random variables X and Y, denoted by Cov(X, Y), is defined by

    Cov(X, Y) = E[(X − E(X))(Y − E(Y))].

The following is a useful formula to compute the covariance:

    Cov(X, Y) = E(XY) − E(X)E(Y).

Proposition. If X and Y are independent, then Cov(X, Y) = 0.

Properties of Covariance

For any random variables X, Y, Z, and a constant c,
• Cov(X, X) = Var(X),
• Cov(X, Y) = Cov(Y, X),
• Cov(cX, Y) = c Cov(X, Y),
• Cov(X, Y + Z) = Cov(X, Y) + Cov(X, Z).

Sums of Random Variables

• Let X₁, X₂, . . . , Xₙ be a sequence of random variables. Then

    Var(Σ_{i=1}^{n} Xᵢ) = Σ_{i=1}^{n} Var(Xᵢ) + 2 Σ_{i=2}^{n} Σ_{j=1}^{i−1} Cov(Xᵢ, Xⱼ).

• If X₁, X₂, . . . , Xₙ are independent, then

    Var(Σ_{i=1}^{n} Xᵢ) = Σ_{i=1}^{n} Var(Xᵢ).
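The covariance formula and the variance-of-a-sum identity can be checked on a small joint distribution (a hypothetical p.m.f., chosen for illustration; X and Y here are not independent):

```python
from fractions import Fraction

F = Fraction
# Joint p.m.f. p(x, y) on {0, 1} x {0, 1}.
joint = {(0, 0): F(1, 8), (0, 1): F(3, 8), (1, 0): F(2, 8), (1, 1): F(2, 8)}

def E(g):
    """Expectation of g(X, Y) under the joint p.m.f."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

EX, EY = E(lambda x, y: x), E(lambda x, y: y)

# Cov(X, Y) = E(XY) - E(X)E(Y).
cov = E(lambda x, y: x * y) - EX * EY

var_X = E(lambda x, y: x**2) - EX**2
var_Y = E(lambda x, y: y**2) - EY**2

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), the n = 2 case above.
var_sum = E(lambda x, y: (x + y)**2) - E(lambda x, y: x + y)**2
assert var_sum == var_X + var_Y + 2 * cov
print(cov, var_sum)
```

Here Cov(X, Y) is negative, so Var(X + Y) is smaller than Var(X) + Var(Y); the two agree only when the covariance term vanishes, e.g. under independence.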