Lecture 8: Random Variables and Their Distributions

• Toss a fair coin 3 times.
 – Let X stand for the number of HEADS in the 3 tosses.
 – Let Y stand for the number of TAILS in the 3 tosses.
 – Let Z stand for the difference between the number of HEADS and the number of TAILS in the 3 tosses.
• X, Y, and Z are examples of random variables.
 – The possible values of X are 0, 1, 2, 3.
 – The possible values of Y are 0, 1, 2, 3.
 – The possible values of Z are −3, −1, 1, 3.

What are random variables?
• A mathematician would define a random variable as a function from the sample space S to the set of real numbers.
• We will think of a random variable as a numerical quantity that we can compute if we know the outcome of the experiment.
 – It's a variable because it typically has many possible values.
 – It's random because the value it takes depends on the outcome of our random experiment.

Probability distribution
To work with random variables we need to know not only their possible values, but also the probability that the random variable takes each of those values.
• Let X stand for the number of HEADS in the 3 tosses, as before. Last week we computed the probability distribution of X:

 Value  Probability
   0       1/8
   1       3/8
   2       3/8
   3       1/8

• Another way to write this: P(X = 0) = 1/8; P(X = 1) = 3/8; P(X = 2) = 3/8; P(X = 3) = 1/8.
• A random variable X is said to be discrete if it takes either finitely many values or countably many values.
• A random variable X is said to be continuous if it takes all values in an interval.

Distribution function of a discrete r.v.: Let X be a discrete random variable taking values x1, x2, .... The probability distribution of X is the function f(xi) = P{X = xi}. Note that the probability distribution of X satisfies the following properties:
 (i) 0 ≤ f(xj) ≤ 1 for all j = 1, 2, ....
 (ii) Σj f(xj) = f(x1) + f(x2) + · · · = 1.

Example 5. See page 199. Exercise 5.9.
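The distribution table for X can be reproduced by direct enumeration of the 8 equally likely outcomes of 3 tosses. A minimal Python sketch (the names `outcomes` and `pmf` are illustrative, not from the text):

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# All 2^3 = 8 equally likely outcomes of tossing a fair coin 3 times.
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each value of X = number of HEADS.
counts = Counter(seq.count("H") for seq in outcomes)

# Probability distribution f(x) = P{X = x}, as exact fractions.
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}
# pmf is {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}, matching the table above.
```

The two defining properties hold automatically: each `pmf[x]` lies between 0 and 1, and `sum(pmf.values())` equals 1.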
5.4 Expectation and Variance of a Random Variable

Here is another way to generate a random variable with the same distribution as X in the coin-tossing example:
• Fill a basket with 8 balls:
 – one labelled 0;
 – three labelled 1;
 – three labelled 2;
 – one labelled 3.
• Choose a ball at random.
• We know that the mean of the 8 balls' labels is 1.5, so it makes sense to say that the mean of X is 1.5.
• Since the 8 balls represent the "population" in this context, we use µ or µX to denote the mean of X.
• Sometimes we write E(X) for the mean of X. (It stands for the expected value of X.)
• Similarly, we can associate a standard deviation with X; we call it σ or σX.

More precisely, let X be a discrete random variable taking values x1, x2, ..., xn, .... The probability distribution of X is given by f(xi) = P{X = xi}, i = 1, 2, ....

Expectation: The expectation of X is defined by
 E(X) = Σi xi P{X = xi} = Σi xi f(xi).

Example 7. See pages 208–209. Example 8. See page 209.

Variance: The variance of X is defined as
 Var(X) = Σi (xi − µ)² f(xi),
where µ = E(X). We usually use the notation σ² = Var(X).

Alternative formula: Var(X) = Σi xi² f(xi) − µ².

Standard deviation: The standard deviation of X is defined as σ = √Var(X).

Example 9. See pages 212–213.

5.5 Bernoulli Trials

An experiment (trial) is called a Bernoulli trial if it has two possible outcomes. We label one as success (S) and the other as failure (F). These are just the terms used in statistics and carry no practical meaning. We usually denote the probability of getting S by p; then P{F} = 1 − p.

We can associate a random variable X with a Bernoulli trial. For example, suppose we earn $100 if the outcome is S and lose $50 if the outcome is F. Let X be the amount of money we earn from one Bernoulli trial. Then X is a random variable taking the two values 100 and −50. Its probability distribution is
 P{X = 100} = p,  P{X = −50} = 1 − p.
The mean of X is E(X) = 100p − 50(1 − p).

By saying we run Bernoulli trials, we mean (i)
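The definitions of E(X) and Var(X), and the alternative variance formula, can be checked numerically on the coin-tossing distribution. A short Python sketch (the dictionary `pmf` below just restates the table for X):

```python
from fractions import Fraction

# Distribution of X = number of HEADS in 3 tosses of a fair coin.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# E(X) = Σ x_i f(x_i)
mu = sum(x * p for x, p in pmf.items())

# Var(X) = Σ (x_i − µ)² f(x_i)
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Alternative formula: Var(X) = Σ x_i² f(x_i) − µ²
var_alt = sum(x * x * p for x, p in pmf.items()) - mu ** 2

# mu = 3/2 (the value 1.5 from the basket-of-balls argument),
# and both variance formulas give 3/4.
```

Working in exact fractions makes the agreement of the two variance formulas an identity rather than a floating-point coincidence.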
each trial yields either S or F; (ii) for each trial, the probability of getting S is p and the probability of getting F is 1 − p; (iii) the trials are independent: the probability of success in a trial does not change given any information about the outcomes of the other trials.

Example 11. See page 219. Exercise 5.45. See page 221.
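The earnings example can also be checked by simulation: run many independent Bernoulli trials and compare the average earnings with E(X) = 100p − 50(1 − p). A hedged sketch (the function name, the choice p = 0.3, and the fixed seed are illustrative assumptions, not from the text):

```python
import random

def average_earnings(p, n_trials, seed=0):
    """Simulate n_trials independent Bernoulli trials with success
    probability p; earn $100 on S and lose $50 on F.
    Returns the average earnings per trial."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    total = sum(100 if rng.random() < p else -50 for _ in range(n_trials))
    return total / n_trials

p = 0.3
approx = average_earnings(p, 100_000)
exact = 100 * p - 50 * (1 - p)  # E(X) = 100p − 50(1 − p) = −5.0 here
```

With 100,000 trials the simulated average should fall well within a dollar of the exact mean, illustrating that E(X) is the long-run average earnings per trial.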