Chapter 7: The Normal Distribution

Definition: The random variable $X$ is said to have a normal distribution with mean $\mu$ and variance $\sigma^2$ if it has the density function

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(1/2)[(x-\mu)/\sigma]^2}, \qquad -\infty < x < \infty.$$

In our text, the shorthand notation $X \sim N(\mu, \sigma^2)$ is used to indicate the normal distribution with mean $\mu$ and variance $\sigma^2$.

Properties of the Normal Distribution:

1. $\int_{-\infty}^{\infty} f(x)\,dx = 1$. (To prove this we need to make a clever move and rewrite the square of this integral as a double integral. After that we make a change of variables and integrate in polar coordinates.)
2. $f(x) \ge 0$ for all $x$.
3. The distribution is symmetric about the mean: $f(\mu + x) = f(\mu - x)$.
4. The maximum value of $f$ occurs at $x = \mu$.

Standard Normal Distribution: this is the special case of the normal distribution with mean 0 and standard deviation 1. (The variance is also 1.)

Notation: $X \sim N(0,1)$

Relation between Standard Normal Distribution and Normal Distribution

Suppose that $X \sim N(\mu, \sigma^2)$. Then the random variable

$$Z = \frac{X - \mu}{\sigma}$$

is normally distributed with mean 0 and standard deviation 1, i.e. $Z \sim N(0,1)$.

The CDF table for $N(0,1)$ is in the back of the book (page 601). To use the table, first compute the Z values, shade the area in question, and then, if necessary, use symmetry to find related areas (e.g. $Z < 0$ is not in the table).

Symmetry Notions for the Standard Normal Distribution

Because the standard normal density is symmetric about 0, $\Phi(-z) = 1 - \Phi(z)$, which gives the areas for $Z < 0$ that the table omits.

Note: The table for the CDF function

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du$$

is on page 601.

Note: The CDF function for normal distributions is available on the TI-83 and TI-84 calculators. We will use it to solve various probability problems.

WARNING: In many statistics texts and on the TI calculators, the syntax is $N(\mu, \sigma)$, with the parameters being the mean and the standard deviation (not the variance).
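For readers without a TI calculator, the density, the standard normal CDF, and the standardization $Z = (X - \mu)/\sigma$ can be sketched in Python. This is only an illustrative sketch: the function names `normal_pdf`, `std_normal_cdf`, and `normal_cdf` are hypothetical and do not come from the text, and the CDF is computed with the standard-library error function using the identity $\Phi(z) = \tfrac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$ rather than the table on page 601.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x. Note: sigma is the standard deviation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def std_normal_cdf(z):
    """Phi(z) for the standard normal, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), obtained by standardizing to Z."""
    return std_normal_cdf((x - mu) / sigma)

# Symmetry of the density about the mean: f(mu + x) = f(mu - x)
print(normal_pdf(1.3, mu=1.0, sigma=2.0), normal_pdf(0.7, mu=1.0, sigma=2.0))

# Symmetry of the CDF: Phi(-z) and 1 - Phi(z) agree
print(std_normal_cdf(-1.0), 1.0 - std_normal_cdf(1.0))
```

Passing the standard deviation rather than the variance follows the calculator convention described in the warning above.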
TI Syntax (2nd VARS key, DIST menu):

normalcdf(lowerbound, upperbound, mean, standard deviation)

Notes: If you want a lower bound of $-\infty$, you may use the value $-10^{99}$, and if you want an upper bound of $+\infty$, you may use $10^{99}$. If you do not specify the mean and standard deviation, the defaults are 0 and 1, respectively.

Examples:

Example 1. (See 7.1 in the book for table use.) The breaking strength of a fabric in Newtons is denoted $X$, and is distributed $X \sim N(\mu, \sigma^2)$, where $\mu = 800$ and $\sigma^2 = 144$. Find the probability that the strength of the fabric is at least 772 Newtons.

***Use $\sigma = 12$ (the standard deviation, not the variance).

TI: normalcdf(772, 10^99, 800, 12) = .99018

Example 2. (See 7.3 in the book for table use.) The diameter of a thread on a fitting is normally distributed with mean 0.4008 cm and standard deviation 0.0004 cm. The design specifications are $0.4000 \pm 0.0010$ cm. Find the probability that the specifications are met.

We want our random variable $X$ to satisfy $0.3990 \le X \le 0.4010$. So, we use the command

TI: normalcdf(.399, .401, .4008, .0004)

We find the final answer is .691459.

Problems in "reverse": the probability is predetermined and the cutoff value is requested.

Goal: Find the value $x$ so that $P(X \le x) = p$, where the value $p$ is given. This value is also referred to as the $100p$-th percentile.

Strategy (this works on both the TI-83 and TI-84):

1. Find the associated Z cutoff for the CDF of the standard normal distribution. Use the command invNorm(p).
2. Solve the equation $Z = \dfrac{X - \mu}{\sigma}$ for $X$.

Example 3: Suppose that IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. Find the 90th percentile IQ score.

We seek the score $x$ so that $P(X \le x) = .9$. The associated Z-score is invNorm(.9) = 1.282.

Solve $1.282 = \dfrac{X - \mu}{\sigma}$ for $X$. Here we have $1.282 = \dfrac{X - 100}{15}$, so $X = 1.282 \cdot 15 + 100 = 119.23$.

Reproductive Property of the Normal Distribution: Suppose that we have $n$ independent normal random variables $X_1, X_2, \dots, X_n$. Each is normally distributed with $X_i \sim N(\mu_i, \sigma_i^2)$.
Then the random variable $Y = X_1 + X_2 + \dots + X_n$ is normally distributed with mean

$$E(Y) = \mu_Y = \sum_{i=1}^{n} \mu_i$$

and variance

$$V(Y) = \sigma_Y^2 = \sum_{i=1}^{n} \sigma_i^2.$$

Remark: You must find the variance of the new distribution first and then take its square root to get the new standard deviation. (Variances add; standard deviations do not.)

The proof relies upon the use of Moment Generating Functions. We discuss these later.

Problems with multiple normal distributions:

Example 4. A test is taken in three parts. For the first part the mean is 50 and the standard deviation is 10, for the second part the mean is 70 and the standard deviation is 15, and for the final part the mean is 130 and the standard deviation is 18. Assume that scores on each part of the exam are normally distributed. Find the probability that a student scores more than 270 points on the three parts of the test altogether.

Consider $Y = X_1 + X_2 + X_3$ as the sum of the three test parts, and assume the parts are independent so that the reproductive property applies.

Use mean = 50 + 70 + 130 = 250.

Use variance = $10^2 + 15^2 + 18^2 = 649$; therefore the standard deviation of the total exam score is $\sqrt{649}$.

Answer: normalcdf(270, 10^99, 250, √649) = .2162

Example 5. (See Example 7-5 in the text for table use.) An assembly is built in three parts. For the first part the mean length is 12 cm and the variance is 0.02, for the second part the mean is 24 cm and the variance is 0.03, and for the final part the mean is 18 cm and the variance is 0.04. Assume that the lengths of the parts are independently and normally distributed. Find the probability that the total length of the assembly lies between 53.8 and 54.2 cm.

Consider $Y = X_1 + X_2 + X_3$ as the sum of the three assembly parts.

Use mean = 12 + 24 + 18 = 54 cm.
Use variance = .02 + .03 + .04 = .09, thus standard deviation = $\sqrt{.09} = .3$.

Answer: normalcdf(53.8, 54.2, 54, .3) = .495

Central Limit Theorem: If $X_1, X_2, \dots, X_n$ is a sequence of $n$ independent random variables with $E(X_i) = \mu_i$ and $V(X_i) = \sigma_i^2$, and $Y = X_1 + X_2 + \dots + X_n$, then under certain general conditions the random variable

$$Z_n = \frac{Y - \sum_{i=1}^{n} \mu_i}{\sqrt{\sum_{i=1}^{n} \sigma_i^2}}$$

has an approximate $N(0,1)$ distribution as $n \to \infty$.

Special case: Let all of the random variables have the same distribution, that is, $E(X_i) = \mu$ and $V(X_i) = \sigma^2$ for each $X_i$. Let $Y = X_1 + X_2 + \dots + X_n$. Then

$$Z_n = \frac{Y - n\mu}{\sigma\sqrt{n}}$$

has an approximate $N(0,1)$ distribution as $n \to \infty$.

In practice, what does $n \to \infty$ mean? Here are some numeric guidelines used as an industrial rule of thumb, depending on how the underlying distribution behaves: well behaved (use this approximation for $n \ge 4$), reasonably behaved ($n \ge 12$), ill behaved ($n \ge 100$).

Practical shortcut for the Special Case: $Y = X_1 + X_2 + \dots + X_n$ is approximately normal with mean $n\mu$ and standard deviation $\sigma\sqrt{n}$.

Example 6: Suppose the truncated portion of a number has a uniform distribution on the interval [0,1]. (A number is truncated by taking only the integer portion of its value. For example, we truncate 3.487 to 3 and consider .487 to be the truncated portion. This is also given by the function $x - \lfloor x \rfloor$, where $\lfloor x \rfloor$ denotes the floor or greatest integer function.) The truncated portion of 20 numbers is calculated. Estimate the probability that the total truncated amount is less than 9.1.

Solution: Consider this to be a sum of twenty independent, identically distributed uniform random variables $X_1, X_2, \dots, X_{20}$ on $[a, b] = [0, 1]$. Therefore each one has

Mean $= \dfrac{a + b}{2} = \dfrac{1}{2}$

Variance $= \dfrac{(b - a)^2}{12} = \dfrac{1}{12}$, thus the standard deviation is $\sigma = \dfrac{1}{\sqrt{12}}$.

From the special case, $Y = X_1 + X_2 + \dots + X_{20}$ is approximately normal with mean $20 \cdot \tfrac{1}{2} = 10$ and standard deviation $\sqrt{20/12}$.

Our problem requests $P(Y < 9.1)$. We can use the TI-83 calculator for this and compute

normalcdf(-10^99, 9.1, 10, √(20/12)) ≈ .243

To use the table in our book:

1. Compute the associated Z-value, $Z = \dfrac{9.1 - 10}{\sqrt{20/12}} \approx -0.70$.
2. Draw the associated area for the CDF function. Shade the region for $Z < -0.70$.
3. Use the symmetry of the curve and the table values to compute the correct probability.

The Normal Approximation to the Binomial Distribution

Recall that we can view a binomial random variable $Y$ as the sum of $n$ independent Bernoulli random variables: let $Y = X_1 + X_2 + \dots + X_n$. Then each $X_i$ has mean $p$ and variance $pq$, where $q = 1 - p$. The mean of $Y$ is $np$ and the variance is $npq$. Thus the standard deviation is $\sqrt{npq}$.

If we apply the Central Limit Theorem to $Y$, we see that a binomial distribution $Y$ with mean $np$ and variance $npq$ is roughly normally distributed with mean $np$ and standard deviation $\sqrt{npq}$.

Example 7: (See example 7.9 in the book for table use.) In sampling from a production process that makes items of which 20% are defective, a random sample of 100 items is selected. The number of defectives in the sample is denoted by $X$. Estimate $P(X < 15)$ using the normal distribution.

Here $n = 100$, $p = 0.2$, and $q = 0.8$. Thus $np = 20$ and $\sigma = \sqrt{npq} = \sqrt{100(.2)(.8)} = 4$.

The estimate is normalcdf(-10^99, 15, 20, 4) = 0.1056.

To get more precise estimates we may use half-interval corrections, also called corrections for continuity. These help account for the fact that we are using a continuous distribution to estimate a discrete distribution. For example, in the binomial distribution there is a positive probability associated with the event $X = 15$, whereas a continuous distribution assigns zero probability to $X = 15$.

Half-interval Continuity Corrections on the TI-83

Binomial distribution $X$ with $\lambda = np$ and s.d.
$\sigma = \sqrt{npq}$

| Quantity desired from binomial distribution | Continuity correction | Associated TI-83 command |
| --- | --- | --- |
| P(X = x) | P(x − 0.5 ≤ X ≤ x + 0.5) | normalcdf(x − 0.5, x + 0.5, λ, σ) |
| P(X ≤ x) | P(X ≤ x + 0.5) | normalcdf(−10^99, x + 0.5, λ, σ) |
| P(X < x) = P(X ≤ x − 1) | P(X ≤ x − 1 + 0.5) = P(X ≤ x − 0.5) | normalcdf(−10^99, x − 0.5, λ, σ) |
| P(X ≥ x) | P(X ≥ x − 0.5) | normalcdf(x − 0.5, 10^99, λ, σ) |
| P(X > x) = P(X ≥ x + 1) | P(X ≥ x + 1 − 0.5) = P(X ≥ x + 0.5) | normalcdf(x + 0.5, 10^99, λ, σ) |
| P(a ≤ X ≤ b) | P(a − 0.5 ≤ X ≤ b + 0.5) | normalcdf(a − 0.5, b + 0.5, λ, σ) |

For example, to compute the probability that $X = 15$, we would instead compute $P(14.5 < X < 15.5)$. This allows us to use the normal approximation.

The continuity adjustments either add a half interval or delete a half interval, depending on whether or not equality is included (e.g. < versus ≤). If equality is included in the original binomial probability, check that this value lies inside your adjusted interval. If equality is not included, check that this value lies outside your adjusted interval.

Example 8: Example 7 revisited and expanded. (See example 7.10 in the book for table use.) In sampling from a production process that makes items of which 20% are defective, a random sample of 100 items is selected. The number of defectives in the sample is denoted by $X$.

Here $n = 100$, $p = 0.2$, and $q = 0.8$. Thus $np = 20$ and $\sigma = \sqrt{npq} = \sqrt{100(.2)(.8)} = 4$.

a) Using the continuity corrections, estimate $P(X < 15)$ with the normal distribution.

Since $P(X < 15) = P(X \le 14)$, we compute $P(X < 14.5)$ for the normal distribution. We calculate normalcdf(-10^99, 14.5, 20, 4) = .08456.

b) Estimate $P(X = 15)$.

From the continuity correction rules, we approximate $P(X = 15)$ by $P(14.5 < X < 15.5)$. We calculate normalcdf(14.5, 15.5, 20, 4) = 0.0457.

c) Estimate $P(X \le 15)$.

By the half-interval continuity corrections this is approximated by $P(X < 15.5)$ for the normal distribution. This value is normalcdf(-10^99, 15.5, 20, 4) = 0.13029.
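As a rough check on the continuity corrections, the following Python sketch compares the normal estimates from Example 8 with exact binomial probabilities. It is an illustration under stated assumptions, not the calculator's implementation: `normalcdf` here is a hypothetical Python stand-in that mimics the TI argument order, `binom_pmf` and `binom_cdf` are helper names introduced only for this sketch, and `-1e99` plays the role of $-10^{99}$.

```python
import math

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the N(mean, sd^2) curve between lower and upper (TI-style arguments)."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
    return cdf(upper) - cdf(lower)

def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def binom_cdf(k, n, p):
    """Exact binomial probability P(X <= k)."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))

# Setup from Example 8: n = 100 items, 20% defective
n, p = 100, 0.2
mean = n * p                      # np = 20
sd = math.sqrt(n * p * (1 - p))   # sqrt(npq) = 4
NEG_INF = -1e99                   # stands in for -10^99

rows = [
    ("P(X < 15)",  binom_cdf(14, n, p), normalcdf(NEG_INF, 14.5, mean, sd)),
    ("P(X = 15)",  binom_pmf(15, n, p), normalcdf(14.5, 15.5, mean, sd)),
    ("P(X <= 15)", binom_cdf(15, n, p), normalcdf(NEG_INF, 15.5, mean, sd)),
]
for label, exact, approx in rows:
    print(f"{label:11s} exact = {exact:.4f}   corrected normal = {approx:.4f}")

# Uncorrected estimate from Example 7, for comparison with part a)
print("uncorrected P(X < 15):", round(normalcdf(NEG_INF, 15, mean, sd), 4))
```

Printing the uncorrected estimate next to the corrected one shows how much of the gap to the exact binomial value the half-interval adjustment closes for part a).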