Lecture 16: Normal Random Variables
1. Definition
Definition: A continuous random variable X is said to have the normal distribution with
mean µ and variance σ² (denoted N(µ, σ²)) if the density of X is
\[
p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}, \qquad -\infty < x < \infty.
\]
We can show that the integral of p(x) over R is equal to 1 using the following trick. First, by
making the substitution y = (x − µ)/σ, we see that
\[
\int_{-\infty}^{\infty} p(x)\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2/2}\,dy.
\]
Then, letting \(I = \int_{-\infty}^{\infty} e^{-y^2/2}\,dy\), we obtain
\[
I^2 = \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right)\left( \int_{-\infty}^{\infty} e^{-y^2/2}\,dy \right)
= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy
= \int_0^{\infty} \int_0^{2\pi} r e^{-r^2/2}\,d\theta\,dr
= 2\pi \int_0^{\infty} r e^{-r^2/2}\,dr = 2\pi.
\]
Here we have changed to polar coordinates in passing from the second to the third line, i.e., we
set x = r cos(θ), y = r sin(θ), and dx dy = r dθ dr. This calculation shows that \(I = \sqrt{2\pi}\), which
confirms that p(x) is a probability density.
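As a sanity check, the claim that p(x) integrates to 1 can be verified numerically. The sketch below uses only the standard library; the helper names (`normal_pdf`, `integrate`) and the choice of µ = 2, σ = 3 are illustrative, not from any particular library.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def integrate(f, a, b, n=100_000):
    """Simple trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Truncating the real line at +/- 10 standard deviations discards a
# negligible amount of mass, so the result should be very close to 1.
mass = integrate(lambda x: normal_pdf(x, mu=2.0, sigma=3.0), 2.0 - 30.0, 2.0 + 30.0)
print(mass)
```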
2. Properties
A simple calculation using cumulative distribution functions and densities shows that an affine
transformation of a normal random variable maps it to another normal random variable.
Proposition: If X is normally distributed with parameters µ and σ², and a ≠ 0, then the random variable
Y = aX + b is normally distributed with parameters aµ + b and a²σ².
Remark: Notice that if X is normal with parameters µ and σ², then Z = (X − µ)/σ is normal
with parameters 0 and 1. Z is known as a standard normal random variable. This identity
has several applications.
For statistical applications of the normal distribution, we are often interested in probabilities of
the form P(X > x) = 1−P(X ≤ x). Although a simple analytical expression is unavailable, these
quantities can be calculated numerically and have been extensively tabulated for the standard
normal distribution (see Table 5.1 in Ross). Fortunately, because every normal random variable
can be expressed in terms of a standard normal random variable, we can use these lookup tables
for more general problems. For example, if X ∼ N(µ, σ²), then by defining Z = (X − µ)/σ, we
see that
P(X > x) = P(σZ + µ > x)
= 1 − P(Z ≤ (x − µ)/σ) = 1 − Φ((x − µ)/σ),
where Φ(x) is the CDF of the standard normal distribution:
\[
\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\,dy.
\]
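In place of a lookup table, Φ can be evaluated with the standard library via the identity Φ(x) = (1 + erf(x/√2))/2. A minimal sketch (the function names `phi` and `upper_tail` are illustrative):

```python
import math

def phi(x):
    """Standard normal CDF, via the error function: Phi(x) = (1 + erf(x/sqrt(2)))/2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def upper_tail(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma^2), by standardizing: 1 - Phi((x - mu)/sigma)."""
    return 1.0 - phi((x - mu) / sigma)

print(phi(1.0))                               # the tabulated value is 0.8413
print(upper_tail(80.0, mu=70.0, sigma=10.0))  # equals 1 - Phi(1)
```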
Another application of the relationship between the standard normal distribution and the other
normal distributions is illustrated in the proof of the following proposition.
Proposition: Let X be a normal random variable with parameters µ and σ². Then the expected value of X is µ, while the variance of X is σ².
Proof: Let Z = (X − µ)/σ, so that Z ∼ N (0, 1). Then
\[
E[Z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x e^{-x^2/2}\,dx = 0,
\]
while
\[
\mathrm{Var}(Z) = E[Z^2] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2 e^{-x^2/2}\,dx
= \frac{1}{\sqrt{2\pi}} \left( \left. -x e^{-x^2/2} \right|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right) = 1.
\]
Here the second equality follows from integration by parts; the boundary term vanishes, and the remaining integral equals \(\sqrt{2\pi}\).
Since X = µ + σZ, the proposition follows from the identities
\[
E[X] = \mu + \sigma E[Z] = \mu, \qquad \mathrm{Var}(X) = \sigma^2 \mathrm{Var}(Z) = \sigma^2.
\]
Remark: Notice that each normal distribution is uniquely determined by its mean and variance.
3. The Normal Approximation to the Binomial Distribution
One of the reasons that the normal distribution is so important in statistics is that it provides a
counterpart to the Poisson approximation for a binomial distribution with parameters n and p
when the success probability p is not small. Recall that if X1 , · · · , Xn are independent Bernoulli
random variables, each having parameter p, then the distribution of the sum Sn = X1 + · · · + Xn
is binomial with parameters (n, p).
DeMoivre-Laplace Limit Theorem: If Sn is a binomial random variable with parameters n
and p, and q = 1 − p, then
\[
\lim_{n \to \infty} P\left( a \le \frac{S_n - np}{\sqrt{npq}} \le b \right) = \Phi(b) - \Phi(a).
\]
Remark: One way to interpret this theorem is that it says that when n is large, the distribution of the random variable \(Z_n = (S_n - np)/\sqrt{npq}\) can be approximated by the normal distribution with
parameters (0, 1). In particular, notice that for all n, Zn has mean 0 and variance 1. This result
is a special case of a much more general result known as the Central Limit Theorem, which
states that the distribution of the sum of a large number of independent, identically-distributed
random variables, when suitably normalized, is approximately normal. In this particular case,
the individual random variables are all Bernoulli.
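The convergence in the DeMoivre-Laplace theorem can be observed directly by computing the exact probability P(a ≤ Zn ≤ b) from the binomial pmf and comparing it with Φ(b) − Φ(a) as n grows. A sketch, assuming an arbitrary choice of p = 0.3 and the interval [−1, 1] (function names are illustrative):

```python
import math

def phi(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def binom_pmf(n, p, k):
    """P(S_n = k) for S_n ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def prob_zn_in(n, p, a, b):
    """Exact P(a <= (S_n - np)/sqrt(npq) <= b)."""
    q = 1 - p
    s = math.sqrt(n * p * q)
    return sum(binom_pmf(n, p, k) for k in range(n + 1)
               if a <= (k - n * p) / s <= b)

p, a, b = 0.3, -1.0, 1.0
target = phi(b) - phi(a)  # the limiting value, roughly 0.6827
for n in (10, 100, 1000):
    print(n, prob_zn_in(n, p, a, b), target)
```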
Example: Suppose that a fair coin is tossed 100 times. What is the probability that the number
of heads obtained is between 45 and 55 (inclusive)?
If X denotes the number of heads obtained in 100 tosses, then X is a binomial random variable
with parameters (100, 1/2). By the DeMoivre-Laplace theorem, we know that
\[
P\{45 \le X \le 55\} = P\left( -1 \le \frac{X - 50}{5} \le 1 \right)
\approx \Phi(1) - \Phi(-1) = \Phi(1) - (1 - \Phi(1)) = 2\Phi(1) - 1 \approx 0.683,
\]
where Table 5.1 in Ross (p. 201) has been consulted for the numerical value of Φ(1) = 0.8413.
Notice that we have also made use of the identity
Φ(−x) = 1 − Φ(x),
which follows from the fact that if Z ∼ N(0, 1), then
\[
P\{Z \le -x\} = P\{Z > x\} = 1 - P\{Z \le x\}.
\]
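The quality of the approximation in this example can be checked by computing the exact binomial probability. The sketch below also applies the standard continuity correction, which replaces the interval [45, 55] by (44.5, 55.5) and tends to be noticeably closer for moderate n; the helper `phi` is the Φ function computed via the error function.

```python
import math

def phi(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Exact probability for X ~ Binomial(100, 1/2).
exact = sum(math.comb(100, k) for k in range(45, 56)) / 2 ** 100

# Normal approximation from the example: P(-1 <= Z <= 1) = 2*Phi(1) - 1.
approx = 2 * phi(1.0) - 1

# Continuity-corrected version: standardize 44.5 and 55.5 instead.
corrected = phi(5.5 / 5) - phi(-5.5 / 5)

print(exact, approx, corrected)
```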