Chapter 2: Random Variables

In this chapter we will cover:
1. Discrete random variables (§2.1 Rice)
2. Continuous random variables (§2.2 Rice)
3. Functions of a random variable (§2.3 Rice)

Random Variables
1. A random variable is a number whose value is determined by chance.
2. The number of heads in three coin tosses is a random variable.
3. The time until the next magnitude 8 earthquake is a random variable.
4. Example 2 is a discrete random variable, since the answer must be an integer value, i.e., 0, 1, 2, .... Since time is continuous, Example 3 is a continuous random variable.

Example: coin toss
• For the three coin tosses the sample space is $\Omega = \{hhh, hht, htt, hth, ttt, tth, thh, tht\}$.
• The random variable $X$ (the number of heads) is then 3 when hhh occurs, and 2 when hht, thh or hth occurs.
• That is, $X = 2$ if and only if $\omega \in \{hht, thh, hth\}$, hence $P(X = 2) = P(\{hht, thh, hth\})$.
• We can therefore work out the probability of seeing $X = 0, 1, 2, 3$. These are
$$P(X = 0) = \frac{1}{8}, \quad P(X = 1) = \frac{3}{8}, \quad P(X = 2) = \frac{3}{8}, \quad P(X = 3) = \frac{1}{8}.$$
• This is called the probability mass function for $X$. It is also called the frequency function.

Probability mass function
• A general discrete random variable takes values $x_1, x_2, x_3, \dots$
• The probability mass function is $p(x_i) = P(X = x_i)$.
• From the rules of probability we must have
$$0 \le p(x_i) \le 1 \quad \text{and} \quad \sum_i p(x_i) = 1.$$

Cumulative distribution function
• As an alternative to the mass function you can also define the cumulative distribution function (cdf).
• It is defined by $F(x) = P(X \le x)$.

[Figure: two panels, "Prob. Mass Fn." and "Cumulative Distribution"; axes: x, probability.]

• Cumulative distribution functions are often denoted by capital letters, e.g. $F(x)$.
• Frequency functions are denoted by lowercase letters, e.g. $f(x)$.
• The cdf is non-decreasing and satisfies
$$\lim_{x \to -\infty} F(x) = 0, \qquad \lim_{x \to \infty} F(x) = 1.$$

Bernoulli Random variables
1. A Bernoulli random variable takes only two possible values, 0 or 1.
2. The probability it takes the value 1 is $p$; the probability it takes the value 0 is $1 - p$.
3. Its frequency function is
$$p(x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \\ 0 & \text{otherwise} \end{cases}$$
4. This can also be written as $p(x) = p^x (1 - p)^{1 - x}$ for $x = 0, 1$, and 0 otherwise.

Exercise
Sketch the frequency function and cdf for the random variable $X$ where
$$P(X = -1) = \frac{1}{10}, \quad P(X = 0) = \frac{1}{10}, \quad P(X = 1) = \frac{3}{10},$$
$$P(X = 2) = \frac{2}{10}, \quad P(X = 3) = 0, \quad P(X = 4) = \frac{3}{10}.$$

Exercise
What is the probability mass function for the cdf below?

[Figure: "Cumulative Distribution", a step-function cdf for x from 0 to 10; axes: x, probability.]

Recommended Questions
From §2.5 of Rice you should do 1, 3, 5 (a), 7.

Indicator random variables
1. If $A$ is an event, then there is a probability $p$ that the event happens, and $1 - p$ that it doesn't happen.
2. This can be coded as a random variable by taking the value 1 if it does happen and 0 if it does not.
3. Formally, define the indicator random variable $I_A(\omega)$ by
$$I_A(\omega) = \begin{cases} 1 & \omega \in A \\ 0 & \omega \notin A \end{cases}$$
4. Then $I_A(\omega)$ is a Bernoulli r.v. for any $A$.

The Binomial distribution
1. Suppose that $n$ independent experiments, each either a 'success' or a 'failure', are run.
2. Further suppose that for each experiment there is a fixed probability $p$ of 'success'.
3. The number of successes in $n$ experiments is called a binomial random variable.
4. Its frequency function is
$$p(k) = \binom{n}{k} p^k (1 - p)^{n - k}$$
for $k \in \{0, 1, \dots, n\}$.

Some frequency functions when $n = 10$ for different values of $p$:

[Figure: four panels of binomial frequency functions with n = 10, titled p=0.1, p=0.5, p=0.3 and p=0.9; axes: x, probability.]

1. The mode is the x value with the highest probability. What is it in each of the cases shown above?
2. What is the relationship between the p = 0.1 and p = 0.9 cases?

The Tay-Sachs disease
• Couples can be carriers of Tay-Sachs disease.
• Each child has a probability 0.25 of having the disease, and this is independent across different children.
• If the couple have 4 children, the number that will have the disease is Binomial(4, 0.25).
• The probabilities are
$$P(k = 0) = 0.316, \quad P(k = 1) = 0.422, \quad P(k = 2) = 0.211, \quad P(k = 3) = 0.047, \quad P(k = 4) = 0.004.$$
• What would these probabilities be if the probability of a single child having the disease is 0.5?
• What is the mode (i.e. the most likely number)?

The geometric distribution
• The geometric distribution is also constructed from independent Bernoulli trials.
• On each trial a 'success' occurs with probability $p$.
• The geometric random variable counts the number of trials up to and including the first success.
• The frequency function is
$$p(k) = (1 - p)^{k - 1} p$$
for $k = 1, 2, 3, \dots$

Here are some numerical examples for different values of $p$.

[Figure: four panels of geometric frequency functions, titled p=0.1, p=0.5, p=0.3 and p=0.6; axes: x, probability.]

Exercise
1. Which is more likely: (i) 9 heads from 10 throws or (ii) 18 heads from 20 throws, of a fair coin?
2. If $X$ is a geometric random variable with $p = 0.5$, for what value of $k$ is $P(X \le k) \approx 0.99$?

The hypergeometric distribution
• Suppose we have an urn with $n$ balls, $r$ black and $n - r$ white.
• Let $X$ be the number of black balls drawn when taking $m$ balls without replacement. $X$ has a hypergeometric distribution.
• Its frequency function is
$$P(X = k) = \frac{\binom{r}{k} \binom{n - r}{m - k}}{\binom{n}{m}}$$
• Thus the probability of winning a lottery can be computed from a hypergeometric distribution.

The Poisson distribution
• This has frequency function
$$P(X = k) = \frac{\lambda^k \exp(-\lambda)}{k!}$$
for $k = 0, 1, 2, \dots$
• This can be thought of as a limit of binomial distributions as $n$ gets large and $p$ gets small, with $\lambda = np$ held fixed.

The Poisson and binomial distributions
Comparing numerically some Poisson and binomial distributions; the black is the binomial, the red the Poisson.

[Figure: panels comparing binomial and Poisson frequency functions, titled "n=5, p=0.1, lambda=0.5", "n=20, p=0.5, lambda=10" and "n=100, p=0.1, lambda=10"; axes: x, probability.]

Examples
• Modelling the number of telephone calls coming into an exchange, if the exchange has a large number of customers who act more or less independently
• Modelling the number of α particles emitted from a radioactive source
• Modelling the number of large accidents seen by an insurance company

Recommended questions
From §2.5 Rice problems: 11, 13, 1, 7, 27, 31, 32.
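
The discrete frequency functions above can be checked numerically. The following is a minimal Python sketch using only the standard library; the function names (`binomial_pmf`, `geometric_pmf`, `poisson_pmf`, `max_gap`) are illustrative choices and do not come from Rice. It reproduces the Tay-Sachs probabilities and measures how close a binomial is to the Poisson with $\lambda = np$.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(k, p):
    """P(X = k), k = 1, 2, ...: first success occurs on trial k."""
    return (1 - p) ** (k - 1) * p


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)


def max_gap(n, p):
    """Largest pointwise difference between Binomial(n, p) and Poisson(n * p)."""
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, n * p)) for k in range(n + 1)
    )


# Tay-Sachs example: number of affected children out of 4, each with p = 0.25.
tay_sachs = [binomial_pmf(k, 4, 0.25) for k in range(5)]
print([round(q, 3) for q in tay_sachs])  # [0.316, 0.422, 0.211, 0.047, 0.004]

# Both cases have lambda = 10, but the approximation is better when p is small.
print(max_gap(20, 0.5), max_gap(100, 0.1))
```

The second print illustrates the limit statement: with the same $\lambda$, the gap shrinks as $n$ grows and $p$ falls.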
Continuous Random variables
• Suppose that the random variable of interest can take a continuum of values rather than lying in a discrete set.
• In such a case the frequency function is replaced by the density function $f(x)$, which satisfies $f(x) \ge 0$ and
$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$
• If $X$ is a random variable with density $f(x)$ then
$$P(a < X < b) = \int_a^b f(x)\,dx.$$
• For small $\delta$, if $f(x)$ is continuous then
$$P\left(x - \frac{\delta}{2} \le X \le x + \frac{\delta}{2}\right) = \int_{x - \delta/2}^{x + \delta/2} f(u)\,du \approx \delta f(x).$$
• The cumulative distribution function $F(x)$ is defined as
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du.$$
• By calculus we have that
$$f(x) = \frac{dF(x)}{dx}.$$

Uniform Random Variables
• If $X$ is uniformly distributed on the interval $[a, b]$ then
$$f(x) = \begin{cases} \dfrac{1}{b - a} & a \le x \le b \\ 0 & x < a \text{ or } x > b \end{cases}$$
• The cumulative distribution function is
$$F(x) = \begin{cases} 0 & x < a \\ \dfrac{x - a}{b - a} & a \le x \le b \\ 1 & x > b \end{cases}$$

The density and cdf for the uniform on [0, 1]:

[Figure: two panels, "Uniform density" and "Uniform CDF"; axes: x, density (left) and x, probability (right).]

The cdf
• When $F$ is continuous and strictly increasing, the inverse $F^{-1}$ is well-defined.
• The $p$th quantile of $F$ is defined to be $x_p$ such that $F(x_p) = p$, where $p \in [0, 1]$.
• When $p = 0.5$ the quantile is called the median; when $p$ is 0.25 or 0.75 it is called the lower or upper quartile of $F$.

Probabilities
If $X$ has a uniform [0, 1] distribution then $P(X \in (0.5, 0.6))$ is illustrated for both the density and cdf below.

[Figure: two panels, "Uniform density" and "Uniform CDF"; axes: x, density (left) and x, probability (right).]

Exercise
Sketch both the density and cdf for a uniform [−1, 1] random variable and indicate what corresponds to the probability that $X > 0$.

The exponential distribution
• The density function is
$$f(x) = \begin{cases} \lambda \exp(-\lambda x) & x \ge 0 \\ 0 & x < 0 \end{cases}$$
• The cdf is
$$F(x) = \begin{cases} 1 - \exp(-\lambda x) & x \ge 0 \\ 0 & x < 0 \end{cases}$$

'Memoryless property'
• The exponential distribution is often used to model lifetimes or waiting times.
• It has the following property: $P(T > t + s \mid T > s) = P(T > t)$; see page 49.
• What does this mean?

The Normal distribution
• Probably the most used distribution in statistics is called the normal.
• Its density function is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right), \qquad -\infty < x < \infty.$$
• The $\mu$ term is called the mean and $\sigma$ is called the standard deviation.
• The cdf does not have a nice formula, but Table 2, page A7 of Rice gives numerical values for a standard normal distribution.

The plot shows three normal distributions. The black has µ = 0, σ = 1 (often called a standard normal). The red has µ = 5, σ = 1, while the blue has µ = 0, σ = 3.

[Figure: "Normal densities"; axes: x, density.]

The figure shows the relationship between the shape of the normal density and the size of a standard deviation.

[Figure: "Normal density"; axes: standard deviations from mean, density.]

Recommended Questions
From §2.5 Rice look at questions 34, 40, 41, and 45. Also study the memoryless property of the exponential on page 49.

Functions of a random variable
• Suppose $X$ has a density function $f(x)$. What is the density function of $Y = g(X)$ for some function $g$?
• Since $X$ is a random variable (i.e., its value is determined by chance), the value of $g(X)$ is also determined by chance, hence it is also a random variable.
• The function $g(X)$ could be a linear function, i.e., $Y = g(X) = aX + b$.
• Alternatively it could be a non-linear function such as $Y = g(X) = X^2$.

Example Normal distribution
• Suppose $X \sim N(\mu, \sigma^2)$ (i.e. $X$ has a normal distribution with mean $\mu$ and standard deviation $\sigma$) and that $Y = aX + b$ where $a > 0$.
• Consider the cdf for $Y$:
$$F_Y(y) = P(Y \le y) = P(aX + b \le y) = P\left(X \le \frac{y - b}{a}\right) = F_X\left(\frac{y - b}{a}\right)$$
• Thus the density of $Y$ is
$$f_Y(y) = \frac{d}{dy} F_X\left(\frac{y - b}{a}\right) = \frac{1}{a} f_X\left(\frac{y - b}{a}\right)$$
• Thus
$$f_Y(y) = \frac{1}{a\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y - b - a\mu}{a\sigma}\right)^2\right]$$
so $Y \sim N(a\mu + b, a^2\sigma^2)$.

Example B page 59
• Let $X \sim N(\mu, \sigma^2)$. We want to find the probability that $X$ is less than $\sigma$ away from $\mu$, i.e. $P(|X - \mu| < \sigma)$.
• This probability is
$$P(-\sigma < X - \mu < \sigma) = P\left(-1 < \frac{X - \mu}{\sigma} < 1\right)$$
• Using the previous result we see that $Z = \frac{X - \mu}{\sigma}$ has a standard normal $N(0, 1)$ distribution.
• If $\Phi(x)$ is the cdf for the standard normal distribution, then we want $\Phi(1) - \Phi(-1) \approx 0.68$.

Example C page 59
• Find the density of $X = Z^2$ where $Z \sim N(0, 1)$.
• For $x \ge 0$ we have
$$F_X(x) = P(X \le x) = P(-\sqrt{x} \le Z \le \sqrt{x}) = \Phi(\sqrt{x}) - \Phi(-\sqrt{x})$$
• Find the density of $X$ by differentiating the cdf. Since $\Phi'(x) = \phi(x)$, the density for the standard normal, and $\phi(-u) = \phi(u)$, we get for $x > 0$
$$f_X(x) = \frac{1}{2} x^{-1/2} \phi(\sqrt{x}) + \frac{1}{2} x^{-1/2} \phi(-\sqrt{x}) = x^{-1/2} \phi(\sqrt{x})$$
• More explicitly this is
$$f_X(x) = \frac{x^{-1/2}}{\sqrt{2\pi}} \exp(-x/2), \qquad x > 0.$$

General rule
• Let $X$ be a continuous random variable with density $f_X(x)$ and let $Y = g(X)$ where $g$ is differentiable and strictly monotonic.
• The density of $Y$ is
$$f_Y(y) = f_X\left(g^{-1}(y)\right) \left|\frac{d}{dy} g^{-1}(y)\right|$$

Recommended Questions
From §2.5 Rice look at questions 53 (Hint: use the results on a function of a random variable to convert the r.v. to a standard normal, then use the tables at the back of the book), 55, 58, 59 and 67 (a, b).
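
The results of Examples B and C can be checked numerically without the tables. The sketch below is a minimal Python illustration using only the standard library. It assumes the standard identity $\Phi(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$, which is not stated in these notes, and the function names are hypothetical. It evaluates $\Phi(1) - \Phi(-1)$ and compares a finite-difference derivative of the cdf of $Z^2$ with the density formula from Example C.

```python
from math import erf, exp, pi, sqrt


def std_normal_cdf(x):
    """Phi(x), written in terms of the error function (assumed identity)."""
    return 0.5 * (1 + erf(x / sqrt(2)))


def cdf_of_square(x):
    """F_X(x) = Phi(sqrt(x)) - Phi(-sqrt(x)) for X = Z^2, x >= 0."""
    return std_normal_cdf(sqrt(x)) - std_normal_cdf(-sqrt(x))


def density_of_square(x):
    """f_X(x) = x^(-1/2) exp(-x/2) / sqrt(2 pi) for x > 0 (Example C)."""
    return x**-0.5 * exp(-x / 2) / sqrt(2 * pi)


# Example B: probability that a normal variable is within one sigma of its mean.
within_one_sigma = std_normal_cdf(1) - std_normal_cdf(-1)
print(round(within_one_sigma, 4))  # 0.6827

# Example C: a central difference of the cdf should match the density formula.
x, h = 1.5, 1e-5
numeric = (cdf_of_square(x + h) - cdf_of_square(x - h)) / (2 * h)
print(abs(numeric - density_of_square(x)))  # small discretisation error
```

The point `x = 1.5` and step `h` are arbitrary; any `x > 0` away from zero should give similar agreement.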