Download Example

Navidi-38615 book December 4, 2008 Chapter 15:53 Confirming Pages 4 Commonly Used Distributions Introduction Statistical inference involves drawing a sample from a population and analyzing the sample data to learn about the population. In many situations, one has some knowledge about the probability mass function or probability density function of the population. In particular, the probability mass or density function can often be well approximated by one of several standard families of curves, or functions. In this chapter, we describe some of these standard families, and for each one we describe some conditions under which it is an appropriate model. 4.1 The Binomial Distribution Imagine an experiment that can result in one of two outcomes. One outcome is labeled “success” and the other outcome is labeled “failure.” The probability of success is denoted by p. The probability of failure is therefore 1 − p. Such a trial is called a Bernoulli trial with success probability p. The simplest Bernoulli trial is the toss of a coin. The two outcomes are heads and tails. If we define heads to be the success outcome, then p is the probability that the coin comes up heads. For a fair coin, p = 1/2. Another example of a Bernoulli trial is the selection of a component from a population of components, some of which are defective. If we define “success” as a defective component, then p is the proportion of defective components in the population. Many experiments consist of performing a sequence of Bernoulli trials. For example, we might sample several components from a very large lot and count the number of defectives among them. This amounts to conducting several independent Bernoulli trials and counting the number of successes. The number of successes is then a random variable, which is said to have a binomial distribution. 119 Navidi-38615 120 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions We now present a formal description of the binomial distribution. Assume that a series of n Bernoulli trials is conducted, each with the same success probability p. Assume further that the trials are independent—that is, that the outcome of one trial does not influence the outcomes of any of the other trials. Let the random variable X equal the number of successes in these n trials. Then X is said to have the binomial distribution with parameters n and p. The notation is X ∼ Bin(n, p). X is a discrete random variable, and its possible values are 0, 1, . . . , n. Summary If a total of n Bernoulli trials are conducted, and ■ ■ ■ The trials are independent Each trial has the same success probability p X is the number of successes in the n trials then X has the binomial distribution with parameters n and p, denoted X ∼ Bin(n, p). Example 4.1 A fair coin is tossed 10 times. Let X be the number of heads that appear. What is the distribution of X ? Solution There are 10 independent Bernoulli trials, each with success probability p = 0.5. The random variable X is equal to the number of successes in the 10 trials. Therefore X ∼ Bin(10, 0.5). Probability Mass Function of a Binomial Random Variable We now derive the probability mass function of a binomial random variable by considering an example. A biased coin has probability 0.6 of coming up heads. The coin is tossed three times. Let X be the number of heads. Then X ∼ Bin(3, 0.6). We will compute P(X = 2). There are three arrangements of two heads in three tosses of a coin, HHT, HTH, and THH. We first compute the probability of HHT. The event HHT is a sequence of independent events: H on the first toss, H on the second toss, T on the third toss. We know the probabilities of each of these events separately: P(H on the first toss) = 0.6, P(H on the second toss) = 0.6, P(T on the third toss) = 0.4 Since the events are independent, the probability that they all occur is equal to the product of their probabilities (Equation 3.11 in Section 3.2). Thus P(HHT) = (0.6)(0.6)(0.4) = (0.6)2 (0.4)1 Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.1 The Binomial Distribution 121 Similarly, P(HTH) = (0.6)(0.4)(0.6) = (0.6)2 (0.4)1 , and P(THH) = (0.4)(0.6)(0.6) = (0.6)2 (0.4)1 . It is easy to see that all the different arrangements of two heads and one tail have the same probability. Now P(X = 2) = P(HHT or HTH or THH) = P(HHT) + P(HTH) + P(THH) = (0.6)2 (0.4)1 + (0.6)2 (0.4)1 + (0.6)2 (0.4)1 = 3(0.6)2 (0.4)1 Examining this result, we see that the number 3 represents the number of arrangements of two successes (heads) and one failure (tails), 0.6 is the success probability p, the exponent 2 is the number of successes, 0.4 is the failure probability 1 − p, and the exponent 1 is the number of failures. We can now generalize this result to produce a formula for the probability of x successes in n independent Bernoulli trials with success probability p, in terms of x, n, and p. In other words, we can compute P(X = x) where X ∼ Bin(n, p). We can see that P(X = x) = (number of arrangements of x successes in n trials) · p x (1 − p)n−x (4.1) All we need to do now is to provide an expression for the number of arrangements of x successes in n trials. To describe this number, we need factorial notation. For any positive integer n, the quantity n! (read “n factorial”) is the number n! = (n)(n − 1)(n − 2) · · · (3)(2)(1) We also define 0! = 1. The number of arrangements of x successes in n trials is n!/[x!(n − x)!]. We can now define the probability mass function for a binomial random variable. If X ∼ Bin(n, p), the probability mass function of X is p(x) = P(X = x) =      n! p x (1 − p)n−x x!(n − x)! x = 0, 1, . . . , n 0 otherwise (4.2) Figure 4.1 (page 122) presents probability histograms for the Bin(10, 0.4) and Bin(20, 0.1) probability mass functions. Navidi-38615 book 122 December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions 0.3 0.3 0.25 0.25 0.2 0.2 0.15 0.15 0.1 0.1 0.05 0.05 0 0 1 2 3 4 5 (a) 6 7 8 9 10 0 0 2 4 6 8 10 12 14 16 18 20 (b) FIGURE 4.1 (a) The Bin(10, 0.4) probability histogram. (b) The Bin(20, 0.1) probability histogram. Example 4.2 The probability that a newborn baby is a girl is approximately 0.49. Find the probability that of the next five single births in a certain hospital, no more than two are girls. Solution Each birth is a Bernoulli trial with success probability 0.49. Let X denote the number of girls in 5 births. Then X ∼ Bin(5, 0.49). We need to find P(X ≤ 2). Using the probability mass function, P(X ≤ 2) = P(X = 0 or X = 1 or X = 2) = P(X = 0) + P(X = 1) + P(X = 2) 5! 5! (0.49)0 (0.51)5−0 + (0.49)1 (0.51)5−1 = 0!(5 − 0)! 1!(5 − 1)! 5! + (0.49)2 (0.51)5−2 2!(5 − 2)! = 0.0345 + 0.1657 + 0.3185 = 0.5187 Sampling from a Finite Population A manufacturing process produces 10,000 stainless steel rivets during the course of a day. It is known that 8% of the rivets manufactured by this process fail to meet a strength specification. A simple random sample of 20 rivets will be packaged for shipment to a customer. A quality engineer wants to know the probability that the shipment will contain exactly three defective rivets. To compute this probability, we can use the binomial distribution. Each of the 20 rivets in the shipment is a trial. A “success” occurs if the rivet is defective. It is clear that each trial has a success probability of 0.08. We want to know the probability that we Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.1 The Binomial Distribution 123 observe three successes in 20 trials. In order to use the binomial distribution, we must be sure that the trials are independent. To see that this is so, recall from the discussion of independence in Section 1.1 that when drawing a sample from a finite population, the sample items may be treated as independent if the sample comprises less than 5% of the population. Therefore, if X represents the number of defective rivets in the sample of 20, P(X = 3) = 20! (0.08)3 (0.92)17 = 0.1414 3!17! Summary Assume that a finite population contains items of two types, successes and failures, and that a simple random sample is drawn from the population. Then if the sample size is no more than 5% of the population, the binomial distribution may be used to model the number of successes. Example 4.3 A lot contains several thousand components, 10% of which are defective. Nine components are sampled from the lot. Let X represent the number of defective components in the sample. Find the probability that exactly two are defective. Solution Since the sample size is small compared to the population (i.e., less than 5%), the number of successes in the sample approximately follows a binomial distribution. Therefore we model X with the Bin(9, 0.1) distribution. The probability that exactly two are defective is 9! P(X = 2) = (0.1)2 (0.9)7 = 0.1722 2!(9 − 2)! Using the Binomial Table Table A.1 (in Appendix A) presents binomial probabilities of the form P(X ≤ x) for n ≤ 20 and selected values of p. Examples 4.4 and 4.5 illustrate the use of this table. Example 4.4 Of all the new vehicles of a certain model that are sold, 20% require repairs to be done under warranty during the first year of service. A particular dealership sells 14 such vehicles. What is the probability that fewer than five of them require warranty repairs? Solution Let X represent the number of vehicles that require warranty repairs. Then X ∼ Bin(14, 0.2). The probability that fewer than five vehicles require warranty repairs is P(X ≤ 4). We consult Table A.1 with n = 14, p = 0.2, and x = 4. We find that P(X ≤ 4) = 0.870. Navidi-38615 book 124 December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions Sometimes the best way to compute the probability of an event is to compute the probability that the event does not occur, and then subtract from 1. Example 4.5 provides an illustration. Example 4.5 Refer to Example 4.4. What is the probability that more than 2 of the 14 vehicles require warranty repairs? Solution Let X represent the number of invoices in the sample that receive discounts. We wish to compute the probability P(X > 2). Table A.1 presents probabilities of the form P(X ≤ x). We note that P(X > 2) = 1 − P(X ≤ 2). Consulting the table with n = 14, p = 0.2, x = 2, we find that P(X ≤ 2) = 0.448. Therefore P(X > 2) = 1 − 0.448 = 0.552. The Mean and Variance of a Binomial Random Variable With a little thought, it is easy to see how to compute the mean of a binomial random variable. For example, if a fair coin is tossed 10 times, we expect on the average to see five heads. The number 5 comes from multiplying the success probability (0.5) by the number of trials (10). This method works in general. If we perform n Bernoulli trials, each with success probability p, the mean number of successes is np. Therefore, if X ∼ Bin(n, p), then µ X = np. The variance of a binomial random variable is not so intuitive; it turns out to be np(1 − p). Summary If X ∼ Bin(n, p), then the mean and variance of X are given by µ X = np (4.3) σ X2 = np(1 − p) (4.4) Exercises for Section 4.1 1. Let X ∼ Bin(10, 0.6). Find a. P(X = 3) b. P(X = 6) c. P(X ≤ 4) d. P(X > 8) e. µ X f. σ X2 2. Let X ∼ Bin(5, 0.3). Find a. P(X < 3) b. P(X ≥ 1) c. P(1 ≤ X ≤ 3) d. P(2 < X < 5) e. P(X = 0) f. P(X = 3) g. µ X h. σ X2 Navidi-38615 book December 4, 2008 15:53 Confirming Pages 4.1 3. Find the following probabilities: a. b. c. d. P(X = 7) when X ∼ Bin(13, 0.4) P(X ≥ 2) when X ∼ Bin(8, 0.4) P(X < 5) when X ∼ Bin(6, 0.7) P(2 ≤ X ≤ 4) when X ∼ Bin(7, 0.1) 4. Ten percent of the items in a large lot are defective. A sample of six items is drawn from this lot. a. Find the probability that none of the sampled items are defective. b. Find the probability that one or more of the sampled items is defective. c. Find the probability that exactly one of the sampled items is defective. d. Find the probability that fewer than two of the sampled items are defective. 5. Of all the weld failures in a certain assembly, 85% of them occur in the weld metal itself, and the remaining 15% occur in the base metal. A sample of 20 weld failures is examined. a. What is the probability that exactly five of them are base metal failures? b. What is the probability that fewer than four of them are base metal failures? c. What is the probability that none of them are base metal failures? d. Find the mean number of base metal failures. e. Find the standard deviation of the number of base metal failures. 6. A large industrial firm allows a discount on any invoice that is paid within 30 days. Of all invoices, 10% receive the discount. In a company audit, 12 invoices are sampled at random. a. What is the probability that exactly four of them receive the discount? b. What is the probability that fewer than three of them receive the discount? c. What is the probability that none of them receive the discount? d. Find the mean number that receive the discount. e. Find the standard deviation of the number that receive the discount. 7. A fair coin is tossed eight times. The Binomial Distribution 125 a. What is the probability of obtaining exactly five heads? b. Find the mean number of heads obtained. c. Find the variance of the number of heads obtained. d. Find the standard deviation of the number of heads obtained. 8. In a large shipment of automobile tires, 10% have a flaw. Four tires are chosen at random to be installed on a car. a. What is the probability that none of the tires have a flaw? b. What is the probability that exactly one of the tires has a flaw? c. What is the probability that one or more of the tires has a flaw? 9. Of the bolts manufactured for a certain application, 85% meet the length specification and can be used immediately, 10% are too long and can be used after being cut, and 5% are too short and must be scrapped. a. Find the probability that a randomly selected bolt can be used (either immediately or after being cut). b. Find the probability that fewer than 9 out of a sample of 10 bolts can be used (either immediately or after being cut). 10. A distributor receives a large shipment of components. The distributor would like to accept the shipment if 10% or fewer of the components are defective and to return it if more than 10% of the components are defective. She decides to sample 10 components and to return the shipment if more than 1 of the 10 is defective. If the proportion of defectives in the batch is in fact 10%, what is the probability that she will return the shipment? 11. A k out of n system is one in which there is a group of n components, and the system will function if at least k of the components function. Assume the components function independently of one another. a. In a 3 out of 5 system, each component has probability 0.9 of functioning. What is the probability that the system will function? b. In a 3 out of n system, in which each component has probability 0.9 of functioning, what is the smallest value of n needed so that the probability that the system functions is at least 0.90? Navidi-38615 book 126 December 4, 2008 CHAPTER 4 15:53 Confirming Pages Commonly Used Distributions 12. Refer to Exercise 11 for the definition of a k out of n system. For a certain 4 out of 6 system, assume that on a rainy day each component has probability 0.7 of functioning and that on a nonrainy day each component has probability 0.9 of functioning. a. What is the probability that the system functions on a rainy day? b. What is the probability that the system functions on a nonrainy day? c. Assume that the probability of rain tomorrow is 0.20. What is the probability that the system will function tomorrow? 13. A certain large shipment comes with a guarantee that it contains no more than 15% defective items. If the proportion of defective items in the shipment is greater than 15%, the shipment may be returned. You draw a random sample of 10 items. Let X be the number of defective items in the sample. a. If in fact 15% of the items in the shipment are defective (so that the shipment is good, but just barely), what is P(X ≥ 7)? b. Based on the answer to part (a), if 15% of the items in the shipment are defective, would 7 defectives in a sample of size 10 be an unusually large number? c. If you found that 7 of the 10 sample items were defective, would this be convincing evidence that the shipment should be returned? Explain. d. If in fact 15% of the items in the shipment are defective, what is P(X ≥ 2)? e. Based on the answer to part (d), if 15% of the items in the shipment are defective, would 2 defectives in a sample of size 10 be an unusually large number? f. If you found that 2 of the 10 sample items were defective, would this be convincing evidence that the shipment should be returned? Explain. 14. An insurance company offers a discount to homeowners who install smoke detectors in their homes. A company representative claims that 80% or more of policyholders have smoke detectors. You draw a random sample of eight policyholders. Let X be the number of policyholders in the sample who have smoke detectors. a. If exactly 80% of the policyholders have smoke detectors (so the representative’s claim is true, but just barely), what is P(X ≤ 1)? b. Based on the answer to part (a), if 80% of the policyholders have smoke detectors, would one policyholder with a smoke detector in a sample of size 8 be an unusually small number? c. If you found that exactly one of the eight sample policyholders had a smoke detector, would this be convincing evidence that the claim is false? Explain. d. If exactly 80% of the policyholders have smoke detectors, what is P(X ≤ 6)? e. Based on the answer to part (d), if 80% of the policyholders have smoke detectors, would six policyholders with smoke detectors in a sample of size 8 be an unusually small number? f. If you found that exactly six of the eight sample policyholders had smoke detectors, would this be convincing evidence that the claim is false? Explain. 15. A message consists of a string of bits (0s and 1s). Due to noise in the communications channel, each bit has probability 0.3 of being reversed (i.e., a 1 will be changed to a 0 or a 0 to a 1). To improve the accuracy of the communication, each bit is sent five times, so, for example, 0 is sent as 00000. The receiver assigns the value 0 if three or more of the bits are decoded as 0, and 1 if three or more of the bits are decoded as 1. Assume that errors occur independently. a. A 0 is sent (as 00000). What is the probability that the receiver assigns the correct value of 0? b. Assume that each bit is sent n times, where n is an odd number, and that the receiver assigns the value decoded in the majority of the bits. What is the minimum value of n necessary so that the probability that the correct value is assigned is at least 0.90? 16. One design for a system requires the installation of two identical components. The system will work if at least one of the components works. An alternative design requires four of these components, and the system will work if at least two of the four components work. If the probability that a component works is 0.9, and if the components function independently, which design has the greater probability of functioning? Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.2 The Poisson Distribution 127 4.2 The Poisson Distribution The Poisson distribution arises frequently in scientific work. We will introduce this distribution by describing a classic application in physics: the number of particles emitted by a radioactive mass. A mass contains 10,000 atoms of a radioactive substance. The probability that a given atom will decay in a one-minute time period is 0.0002. Let X represent the number of atoms that decay in one minute. Now each atom can be thought of as a Bernoulli trial, where success occurs if the atom decays. Thus X is the number of successes in 10,000 independent Bernoulli trials, each with success probability 0.0002, so the distribution of X is Bin(10,000, 0.0002). Now assume that we want to compute the probability that exactly three atoms from this mass decay in one minute. Using the binomial probability mass function, we would compute as follows: P(X = 3) = 10,000! (0.0002)3 (0.9998)9997 = 0.18047 3! 9997! Computing the binomial probability involves some large factorials and large exponents. There is a simpler calculation that yields almost exactly the same result. Specifically, if n is large and p is small, and we let λ = np, it can be shown by advanced methods that for any non-negative integer x, n! λx p x (1 − p)n−x ≈ e−λ x!(n − x)! x! (4.5) We are led to define a new probability mass function, called the Poisson probability mass function. The Poisson probability mass function is defined by p(x) = P(X = x) =   e−λ  0 λx x! if x is a non-negative integer (4.6) otherwise If X is a random variable whose probability mass function is given by Equation (4.6), then X is said to have the Poisson distribution with parameter λ. The notation is X ∼ Poisson(λ). For the radioactive mass just described, we would use the Poisson mass function to approximate P(X = x) by substituting λ = (10, 000)(0.0002) = 2 into Equation (4.6). The result is P(X = 3) = e−2 23 = 0.18045 3! The Poisson probability agrees with the binomial probability to four decimal places. Navidi-38615 128 book December 4, 2008 CHAPTER 4 Example 4.6 Confirming Pages 15:53 Commonly Used Distributions If X ∼ Poisson(5), compute P(X = 3), P(X = 8), P(X = 0), P(X = −1), and P(X = 0.5). Solution Using the probability mass function (4.6), with λ = 5, we obtain Example 4.7 P(X = 3) = e−5 53 = 0.1404 3! P(X = 8) = e−5 58 = 0.0653 8! P(X = 0) = e−5 50 = 0.0067 0! P(X = −1) = 0 because −1 is not a non-negative integer P(X = 0.5) = 0 because 0.5 is not a non-negative integer If X ∼ Poisson(4), compute P(X ≤ 2) and P(X > 1). Solution P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = e−4 40 41 42 + e−4 + e−4 0! 1! 2! = 0.0183 + 0.0733 + 0.1465 = 0.2381 To find P(X > 1), we might try to start by writing P(X > 1) = P(X = 2) + P(X = 3) + · · · This leads to an infinite sum that is difficult to compute. Instead, we write P(X > 1) = 1 − P(X ≤ 1) = 1 − [P(X = 0) + P(X = 1)] 0 1 −4 4 −4 4 = 1− e +e 0! 1! = 1 − (0.0183 + 0.0733) = 0.9084 Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.2 The Poisson Distribution 129 Summary If X ∼ Poisson(λ), then ■ ■ ■ ■ X is a discrete random variable whose possible values are the non-negative integers. The parameter λ is a positive constant. The probability mass function of X is  x  e−λ λ if x is a non-negative integer x! p(x) = P(X = x) =  0 otherwise The Poisson probability mass function is very close to the binomial probability mass function when n is large, p is small, and λ = np. The Mean and Variance of a Poisson Random Variable The mean and variance of a Poisson random variable can be computed by using the probability mass function along with the definitions given by Equations (3.13) and (3.14) (in Section 3.3). It turns out that the mean and variance are both equal to λ. Mean and Variance of a Poisson Random Variable If X ∼ Poisson(λ), then the mean and variance of X are given by µX = λ (4.7) σ X2 = λ (4.8) Figure 4.2 presents probability histograms for the Poisson(1) and Poisson(10) probability mass functions. 0.35 0.35 0.3 0.3 0.25 0.25 0.2 0.2 0.15 0.15 0.1 0.1 0.05 0.05 0 0 0 2 4 6 8 10 12 14 16 18 20 22 24 (a) 0 2 4 6 8 10 12 14 16 18 20 22 24 (b) FIGURE 4.2 (a) The Poisson(1) probability histogram. (b) The Poisson(10) probability histogram. Navidi-38615 130 book December 4, 2008 CHAPTER 4 15:53 Confirming Pages Commonly Used Distributions One of the earliest industrial uses of the Poisson distribution involved an application to the brewing of beer. A crucial step in the brewing process is the addition of yeast culture to prepare mash for fermentation. The living yeast cells are kept suspended in a liquid medium. Because the cells are alive, their concentration in the medium changes over time. Therefore, just before the yeast is added, it is necessary to estimate the concentration of yeast cells per unit volume of suspension, so as to be sure to add the right amount. Up until the early part of the twentieth century, this posed a problem for brewers. They estimated the concentration in the obvious way, by withdrawing a small volume of the suspension and counting the yeast cells in it under a microscope. Of course, the estimates determined this way were subject to random variation, but no one knew how to measure the likely size of this variation. Therefore no one knew by how much the concentration in the sample was likely to differ from the actual concentration. William Sealy Gosset, a young man in his mid-twenties who was employed by the Guinness Brewing Company of Dublin, Ireland, discovered in 1904 that the number of yeast cells in a sampled volume of suspension follows a Poisson distribution. He was then able to develop methods to measure the concentration of yeast cells with greater precision. Gosset’s discovery not only enabled Guinness to produce a more consistent product, it showed that the Poisson distribution could have important applications in many situations. Gosset wanted to publish his result, but his managers at Guinness considered his discovery to be proprietary information. They required him to use a pseudonym, so that competing breweries would not realize how useful the results could be. Gosset chose the pseudonym “Student.” In Example 4.8, we will follow a train of thought that leads to Student’s result. Before we get to it though, we will mention that shortly after publishing this result, Student made another discovery that solved one of the most important outstanding problems in statistics, which has profoundly influenced work in virtually all fields of science ever since. We will discuss this result in Section 5.4. Example 4.8 Particles (e.g., yeast cells) are suspended in a liquid medium at a concentration of 10 particles per mL. A large volume of the suspension is thoroughly agitated, and then 1 mL is withdrawn. What is the probability that exactly eight particles are withdrawn? Solution So long as the volume withdrawn is a small fraction of the total, the solution to this problem does not depend on the total volume of the suspension but only on the concentration of particles in it. Let V be the total volume of the suspension, in mL. Then the total number of particles in the suspension is 10V . Think of each of the 10V particles as a Bernoulli trial. A particle “succeeds” if it is withdrawn. Now 1 mL out of the total of V mL is to be withdrawn. Therefore, the amount to be withdrawn is 1/V of the total, so it follows that each particle has probability 1/V of being withdrawn. Let X denote the number of particles withdrawn. Then X represents the number of successes in 10V Bernoulli trials, each with probability 1/V of success. Therefore X ∼ Bin(10V, 1/V ). Since V is large, 10V is large and 1/V is small. Thus, to a very close approximation, X ∼ Poisson(10). We compute P(X = 8) with the Poisson probability mass function: P(X = 8) = e−10 108 /8! = 0.1126. Navidi-38615 book December 4, 2008 15:53 Confirming Pages 4.2 The Poisson Distribution 131 In Example 4.8, λ had the value 10 because the mean number of particles in 1 mL of suspension (the volume withdrawn) was 10. Example 4.9 Particles are suspended in a liquid medium at a concentration of 6 particles per mL. A large volume of the suspension is thoroughly agitated, and then 3 mL are withdrawn. What is the probability that exactly 15 particles are withdrawn? Solution Let X represent the number of particles withdrawn. The mean number of particles in a 3 mL volume is 18. Therefore X ∼ Poisson(18). The probability that exactly 15 particles are withdrawn is 1815 P(X = 15) = e−18 15! = 0.0786 Note that for the solutions to Examples 4.8 and 4.9 to be correct, it is important that the amount of suspension withdrawn not be too large a fraction of the total. For example, if the total volume in Example 4.9 was 3 mL, so that the entire amount was withdrawn, it would be certain that all 18 particles would be withdrawn, so the probability of withdrawing 15 particles would be zero. Example 4.10 Grandma bakes chocolate chip cookies in batches of 100. She puts 300 chips into the dough. When the cookies are done, she gives you one. What is the probability that your cookie contains no chocolate chips? Solution This is another instance of particles in a suspension. Let X represent the number of chips in your cookie. The mean number of chips is 3 per cookie, so X ∼ Poisson(3). It follows that P(X = 0) = e−3 30 /0! = 0.0498. Examples 4.8 and 4.9 show that for particles distributed uniformly at random throughout a medium, the number of particles that happen to fall in a small portion of the medium follows a Poisson distribution. In these examples, the particles were actual particles, and the medium was spatial in nature. In many cases, however, the “particles” represent events, and the medium is time. We saw such an example previously, where the number of radioactive decay events in a fixed time interval turned out to follow a Poisson distribution. Example 4.11 presents another. Example 4.11 The number of email messages received by a computer server follows a Poisson distribution with a mean of 6 per minute. Find the probability that exactly 20 messages will be received in the next 3 minutes. Navidi-38615 book 132 December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions Solution Let X be the number of messages received in 3 minutes. The mean number of messages received in 3 minutes is (6)(3) = 18, so X ∼ Poisson(18). Using the Poisson(18) probability mass function, we find that P(X = 20) = e−18 1820 20! = 0.0798 Exercises for Section 4.2 1. Let X ∼ Poisson(3). Find a. b. c. d. e. f. P(X P(X P(X P(X µX σX = 2) = 0) < 3) > 2) 2. The concentration of particles in a suspension is 4 per mL. The suspension is thoroughly agitated, and then 2 mL is withdrawn. Let X represent the number of particles that are withdrawn. Find a. b. c. d. e. P(X = 6) P(X ≤ 3) P(X > 2) µX σX 3. Suppose that 0.2% of diodes in a certain application fail within the first month of use. Let X represent the number of diodes in a random sample of 1000 that fail within the first month. Find a. b. c. d. e. P(X = 4) P(X ≤ 1) P(1 ≤ X < 4) µX σX 4. The number of flaws in a given area of aluminum foil follows a Poisson distribution with a mean of 3 per m2 . Let X represent the number of flaws in a 1 m2 sample of foil. a. b. c. d. e. f. P(X P(X P(X P(X µX σX = 5) = 0) < 2) > 1) 5. The number of hits on a certain website follows a Poisson distribution with a mean rate of 4 per minute. a. What is the probability that five messages are received in a given minute? b. What is the probability that 9 messages are received in 1.5 minutes? c. What is the probability that fewer than three messages are received in a period of 30 seconds? 6. One out of every 5000 individuals in a population carries a certain defective gene. A random sample of 1000 individuals is studied. a. What is the probability that exactly one of the sample individuals carries the gene? b. What is the probability that none of the sample individuals carries the gene? c. What is the probability that more than two of the sample individuals carry the gene? d. What is the mean of the number of sample individuals that carry the gene? e. What is the standard deviation of the number of sample individuals that carry the gene? 7. A sensor network consists of a large number of microprocessors spread out over an area, in Navidi-38615 book December 4, 2008 15:53 Confirming Pages 4.2 communication with each other and with a base station. In a certain network, the probability that a message will fail to reach the base station is 0.005. Assume that during a particular day, 1000 messages are sent. a. What is the probability that exactly 3 of the messages fail to reach the base station? b. What is the probability that fewer than 994 of the messages reach the base station? c. What is the mean of the number of messages that fail to reach the base station? d. What is the standard deviation of the number of messages that fail to reach the base station? 8. Geologists estimate the time since the most recent cooling of a mineral by counting the number of uranium fission tracks on the surface of the mineral. A certain mineral specimen is of such an age that there should be an average of 6 tracks per cm2 of surface area. Assume the number of tracks in an area follows a Poisson distribution. Let X represent the number of tracks counted in 1 cm2 of surface area. Find a. b. c. d. e. P(X = 7) P(X ≥ 3) P(2 < X < 7) µX σX 9. A random variable X has a binomial distribution, and a random variable Y has a Poisson distribution. Both X and Y have means equal to 3. Is it possible to determine which random variable has the larger variance? Choose one of the following answers: i. Yes, X has the larger variance. ii. Yes, Y has the larger variance. iii. No, we need to know the number of trials, n, for X . iv. No, we need to know the success probability, p, for X . v. No, we need to know the value of λ for Y . 10. You have received a radioactive mass that is claimed to have a mean decay rate of at least 1 particle per The Poisson Distribution 133 second. If the mean decay rate is less than 1 per second, you may return the product for a refund. Let X be the number of decay events counted in 10 seconds. a. If the mean decay rate is exactly 1 per second (so that the claim is true, but just barely), what is P(X ≤ 1)? b. Based on the answer to part (a), if the mean decay rate is 1 particle per second, would one event in 10 seconds be an unusually small number? c. If you counted one decay event in 10 seconds, would this be convincing evidence that the product should be returned? Explain. d. If the mean decay rate is exactly 1 per second, what is P(X ≤ 8)? e. Based on the answer to part (d), if the mean decay rate is 1 particle per second, would eight events in 10 seconds be an unusually small number? f. If you counted eight decay events in 10 seconds, would this be convincing evidence that the product should be returned? Explain. 11. Someone claims that a certain suspension contains at least seven particles per mL. You sample 1 mL of solution. Let X be the number of particles in the sample. a. If the mean number of particles is exactly seven per mL (so that the claim is true, but just barely), what is P(X ≤ 1)? b. Based on the answer to part (a), if the suspension contains seven particles per mL, would one particle in a 1 mL sample be an unusually small number? c. If you counted one particle in the sample, would this be convincing evidence that the claim is false? Explain. d. If the mean number of particles is exactly 7 per mL, what is P(X ≤ 6)? e. Based on the answer to part (d), if the suspension contains seven particles per mL, would six particles in a 1 mL sample be an unusually small number? f. If you counted six particles in the sample, would this be convincing evidence that the claim is false? Explain. Navidi-38615 134 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions 4.3 The Normal Distribution The normal distribution (also called the Gaussian distribution) is by far the most commonly used distribution in statistics. This distribution provides a good model for many, although not all, continuous populations. Part of the reason for this is the Central Limit Theorem, which we shall discuss in Section 4.8. The normal distribution is continuous rather than discrete. The mean of a normal random variable may have any value, and the variance may have any positive value. The probability density function of a normal random variable with mean µ and variance σ 2 is given by 1 2 2 f (x) = √ e−(x−µ) /(2σ ) (4.9) σ 2π If X is a random variable whose probability density function is normal with mean µ and variance σ 2 , we write X ∼ N (µ, σ 2 ). Summary If X ∼ N (µ, σ 2 ), then the mean and variance of X are given by µX = µ σ X2 = σ 2 Figure 4.3 presents a plot of the normal probability density function with mean µ and standard deviation σ . The normal probability density function is sometimes called the normal curve. Note that the normal curve is symmetric around µ, so that µ is the median as well as the mean. It is also the case that for any normal population ■ ■ ■ About 68% of the population is in the interval µ ± σ . About 95% of the population is in the interval µ ± 2σ . About 99.7% of the population is in the interval µ ± 3σ . ≈ 99.7% ≈ 95% ≈ 68% m ⫺ 3s m ⫺ 2s m ⫺ 1s m m ⫹ 1s m ⫹ 2s m ⫹ 3s FIGURE 4.3 Probability density function of a normal population with mean µ and variance σ 2 . Navidi-38615 book December 4, 2008 15:53 Confirming Pages 4.3 The Normal Distribution 135 The proportion of a normal population that is within a given number of standard deviations of the mean is the same for any normal population. For this reason, when dealing with normal populations, we often convert from the units in which the population items were originally measured to standard units. Standard units tell how many standard deviations an observation is from the population mean. Example 4.12 Resistances in a population of wires are normally distributed with mean 20 m and standard deviation 3 m. The resistance of two randomly chosen wires are 23 m and 16 m. Convert these amounts to standard units. Solution A resistance of 23 m is 3 m more than the mean of 20, and 3 m is equal to one standard deviation. So 23 m is one standard deviation above the mean and is thus equivalent to one standard unit. A resistance of 16 m is 1.33 standard deviations below the mean, so 16 m is equivalent to −1.33 standard units. In general, we convert to standard units by subtracting the mean and dividing by the standard deviation. Thus, if x is an item sampled from a normal population with mean µ and variance σ 2 , the standard unit equivalent of x is the number z, where x −µ (4.10) z= σ The number z is sometimes called the “z-score” of x. The z-score is an item sampled from a normal population with mean 0 and standard deviation 1. This normal population is called the standard normal population. Example 4.13 The yield, in grams, of a dye manufacturing process is normally distributed with mean 1500 and standard deviation 50. The yield of a particular run is 1568 grams. Find the z-score. Solution The quantity 1568 is an observation from a normal population with mean µ = 1500 and standard deviation σ = 50. Therefore 1568 − 1500 z= 50 = 1.36 Example 4.14 Refer to Example 4.13. The yield of a certain run has a z-score of −1.4. Find the yield in the original units of grams. Solution We use Equation (4.10), substituting −1.4 for z and solving for x. We obtain x − 1500 −1.4 = 50 Solving for x yields x = 1430. The yield is 1430 grams. Navidi-38615 136 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions The proportion of a normal population that lies within a given interval is equal to the area under the normal probability density above that interval. This would suggest that we compute these proportions by integrating the normal probability density given in Equation (4.9). Interestingly enough, areas under this curve cannot be found by the method, taught in elementary calculus, of computing the antiderivative of the function and plugging in the limits of integration. This is because the antiderivative of this function is an infinite series and cannot be written down exactly. Instead, areas under this curve must be approximated numerically. Areas under the standard normal curve (mean 0, variance 1) have been extensively tabulated. A typical such table, called a standard normal table, or z table, is given as Table A.2 (in Appendix A). To find areas under a normal curve with a different mean and variance, we convert to standard units and use the z table. Table A.2 provides areas in the left-hand tail of the curve for values of z. Other areas can be calculated by subtraction or by using the fact that the total area under the curve is equal to 1. We now present several examples to illustrate the use of the z table. Example 4.15 Find the area under the normal curve to the left of z = 0.47. Solution From the z table, the area is 0.6808. See Figure 4.4. 0.6808 0 0.47 FIGURE 4.4 Solution to Example 4.15. Example 4.16 Find the area under the normal curve to the right of z = 1.38. Solution From the z table, the area to the left of z = 1.38 is 0.9162. Therefore the area to the right is 1 − 0.9162 = 0.0838. See Figure 4.5. 0.9162 0.0838 0 1.38 FIGURE 4.5 Solution to Example 4.16. Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.3 Example 4.17 The Normal Distribution 137 Find the area under the normal curve between z = 0.71 and z = 1.28. Solution From the z table, the area to the left of z = 1.28 is 0.8997. The area to the left of z = 0.71 is 0.7611. The area between z = 0.71 and z = 1.28 is therefore 0.8997−0.7611 = 0.1386. See Figure 4.6. 0.8997 0.7611 0.1386 – 0 = 1.28 0 0.71 0 0.71 1.28 FIGURE 4.6 Solution to Example 4.17. Recall that the pth percentile of a sample is the value that divides the sample so that p percent of the sample values are less than the pth percentile (see Section 1.2). This idea extends to populations as well. The pth percentile of a population is the value that divides the sample so that p percent of the population values are less than the pth percentile. For a normal population, therefore, the pth percentile is the value for which p percent of the area under the normal curve is to the left. Example 4.18 What z-score corresponds to the 75th percentile of a normal distribution? The 25th percentile? The median? Solution To answer this question, we use the z table in reverse. We need to find the z-score for which 75% of the area of the curve is to the left. From the body of the table, the closest area to 75% is 0.7486, corresponding to a z-score of 0.67. Therefore the 75th percentile is approximately 0.67. By the symmetry of the curve, the 25th percentile is z = −0.67 (this can also be looked up in the table directly). See Figure 4.7. The median is z = 0. ≈ 25% ≈ 75% 0 0.67 ⫺0.67 0 FIGURE 4.7 Solution to Example 4.18. Example 4.19 Lifetimes of batteries in a certain application are normally distributed with mean 50 hours and standard deviation 5 hours. Find the probability that a randomly chosen battery lasts between 42 and 52 hours. Navidi-38615 138 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions Solution Let X represent the lifetime of a randomly chosen battery. Then X ∼ N (50, 52 ). Figure 4.8 presents the probability density function of the N (50, 52 ) population. The shaded area represents P(42 < X < 52), the probability that a randomly chosen battery has a lifetime between 42 and 52 hours. To compute the area, we will use the z table. First we need to convert the quantities 42 and 52 to standard units. We have z= 42 − 50 = −1.60 5 z= 52 − 50 = 0.40 5 From the z table, the area to the left of z = −1.60 is 0.0548, and the area to the left of z = 0.40 is 0.6554. The probability that a battery has a lifetime between 42 and 52 hours is 0.6554 − 0.0548 = 0.6006. 0.6006 42 50 52 z = ⫺1.6 z = 0.4 FIGURE 4.8 Solution to Example 4.19. Example 4.20 Refer to Example 4.19. Find the 40th percentile of battery lifetimes. Solution From the z table, the closest area to 0.4000 is 0.4013, corresponding to a z-score of −0.25. The population of lifetimes has mean 50 and standard deviation 5. The 40th percentile is the point 0.25 standard deviations below the mean. We find this value by converting the z-score to a raw score, using Equation (4.10): −0.25 = x − 50 5 Solving for x yields x = 48.75. The 40th percentile of battery lifetimes is 48.75 hours. See Figure 4.9. ≈ 40% 48.75 50 z = ⫺0.25 FIGURE 4.9 Solution to Example 4.20. Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.3 Example 4.21 The Normal Distribution 139 A process manufactures ball bearings whose diameters are normally distributed with mean 2.505 cm and standard deviation 0.008 cm. Specifications call for the diameter to be in the interval 2.5 ± 0.01 cm. What proportion of the ball bearings will meet the specification? Solution Let X represent the diameter of a randomly chosen ball bearing. Then X ∼ N (2.505, 0.0082 ). Figure 4.10 presents the probability density function of the N (2.505, 0.0082 ) population. The shaded area represents P(2.49 < X < 2.51), which is the proportion of ball bearings that meet the specification. We compute the z-scores of 2.49 and 2.51: z= 2.49 − 2.505 = −1.88 0.008 z= 2.51 − 2.505 = 0.63 0.008 The area to the left of z = −1.88 is 0.0301. The area to the left of z = 0.63 is 0.7357. The area between z = 0.63 and z = −1.88 is 0.7357 − 0.0301 = 0.7056. Approximately 70.56% of the diameters will meet the specification. 0.7056 2.49 z = ⫺1.88 2.505 2.51 z = 0.63 FIGURE 4.10 Solution to Example 4.21. Linear Functions of Normal Random Variables If a normal random variable is multiplied by a nonzero constant or has a constant added to it, the resulting random variable is also normal, with a mean and variance that are determined by the original mean and variance and the constants. Specifically, Summary Let X ∼ N (µ, σ 2 ), and let a = 0 and b be constants. Then a X + b ∼ N (aµ + b, a 2 σ 2 ). Example 4.22 (4.11) A chemist measures the temperature of a solution in ◦ C. The measurement is denoted C and is normally distributed with mean 40◦ C and standard deviation 1◦ C. The measurement is converted to ◦ F by the equation F = 1.8C + 32. What is the distribution of F? Navidi-38615 140 book December 4, 2008 CHAPTER 4 15:53 Confirming Pages Commonly Used Distributions Solution Since C is normally distributed, so is F. Now µC = 40, so µ F = 1.8(40) + 32 = 104, and σC2 = 1, so σ F2 = 1.82 (1) = 3.24. Therefore F ∼ N (104, 3.24). Linear Combinations of Independent Normal Random Variables One of the remarkable features of the normal distribution is that linear combinations of independent normal random variables are themselves normal random variables. To be specific, suppose that X 1 ∼ N (µ1 , σ12 ), X 2 ∼ N (µ2 , σ22 ), . . . , X n ∼ N (µn , σn2 ) are independent normal random variables. Note that the means and variances of these random variables can differ from one another. Let c1 , c2 , . . . , cn be constants. Then the linear combination c1 X 1 + c2 X 2 + · · · + cn X n is a normally distributed random variable. The mean and variance of the linear combination are c1 µ1 + c2 µ2 + · · · + cn µn and c12 σ12 + c22 σ22 + · · · + cn2 σn2 , respectively (see Equations 3.32 and 3.36 in Section 3.4). Summary Let X 1 , X 2 , . . . , X n be independent and normally distributed with means µ1 , µ2 , . . . , µn and variances σ12 , σ22 , . . . , σn2 . Let c1 , c2 , . . . , cn be constants, and c1 X 1 + c2 X 2 + · · · + cn X n be a linear combination. Then c1 X 1 +c2 X 2 +· · ·+cn X n ∼ N (c1 µ1 +c2 µ2 +· · ·+cn µn , c12 σ12 +c22 σ22 +· · ·+cn2 σn2 ) (4.12) Example 4.23 In the article “Advances in Oxygen Equivalent Equations for Predicting the Properties of Titanium Welds” (D. Harwig, W. Ittiwattana, and H. Castner, The Welding Journal, 2001:126s–136s), the authors propose an oxygen equivalence equation to predict the strength, ductility, and hardness of welds made from nearly pure titanium. The equation is E = 2C + 3.5N + O, where E is the oxygen equivalence, and C, N , and O are the proportions by weight, in parts per million, of carbon, nitrogen, and oxygen, respectively (a constant term involving iron content has been omitted). Assume that for a particular grade of commercially pure titanium, the quantities C, N , and O are approximately independent and normally distributed with means µC = 150, µ N = 200, µ O = 1500, and standard deviations σC = 30, σ N = 60, σ O = 100. Find the distribution of E. Find P(E > 3000). Solution Since E is a linear combination of independent normal random variables, its distribution is normal. We must now find the mean and variance of E. Using Equation (4.12), Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.3 The Normal Distribution 141 we compute µ E = 2µC + 3.5µ N + 1µ O = 2(150) + 3.5(200) + 1(1500) = 2500 σ E2 = 22 σC2 + 3.52 σ N2 + 12 σ O2 = 22 (302 ) + 3.52 (602 ) + 12 (1002 ) = 57,700 We conclude that E ∼ N (2500, 57,700). √ To find P(E > 3000), we compute the z-score: z = (3000 − 2500)/ 57,700 = 2.08. The area to the right of z = 2.08 under the standard normal curve is 0.0188. So P(E > 3000) = 0.0188. If X 1 , . . . , X n is a random sample from any population with mean µ and variance σ 2 , then the sample mean X has mean µ X = µ and variance σ X2 = σ 2 /n. If the population is normal, then X is normal as well, because it is a linear combination of X 1 , . . . , X n with coefficients c1 = · · · = cn = 1/n. Summary Let X 1 , . . . , X n be independent and normally distributed with mean µ and variance σ 2 . Then σ2 X ∼ N µ, (4.13) n Other important linear combinations are the sum and difference of two random variables. If X and Y are independent normal random variables, the sum X + Y and the difference X − Y are linear combinations. The distributions of X + Y and X − Y can be determined by using Equation (4.12) with c1 = 1, c2 = 1 for X +Y and c1 = 1, c2 = −1 for X − Y . Summary Let X and Y be independent, with X ∼ N (µ X , σ X2 ) and Y ∼ N (µY , σY2 ). Then X + Y ∼ N (µ X + µY , σ X2 + σY2 ) (4.14) X − Y ∼ N (µ X − µY , σ X2 + σY2 ) (4.15) How Can I Tell Whether My Data Come from a Normal Population? In practice, we often have a sample from some population, and we must use the sample to decide whether the population distribution is approximately normal. If the sample Navidi-38615 book 142 December 4, 2008 CHAPTER 4 15:53 Confirming Pages Commonly Used Distributions is reasonably large, the sample histogram may give a good indication. Large samples from normal populations have histograms that look something like the normal density function—peaked in the center, and decreasing more or less symmetrically on either side. Probability plots, which will be discussed in Section 4.7, provide another good way of determining whether a reasonably large sample comes from a population that is approximately normal. For small samples, it can be difficult to tell whether the normal distribution is appropriate. One important fact is this: Samples from normal populations rarely contain outliers. Therefore the normal distribution should generally not be used for data sets that contain outliers. This is especially true when the sample size is small. Unfortunately, for small data sets that do not contain outliers, it is difficult to determine whether the population is approximately normal. In general, some knowledge of the process that generated the data is needed. Exercises for Section 4.3 1. Find the area under the normal curve a. b. c. d. To the right of z = −0.75. Between z = 0.40 and z = 1.15. Between z = −0.60 and z = 0.70. Outside z = 0.75 to z = 1.30. 2. Find the area under the normal curve a. b. c. d. To the right of z = 0.73. Between z = 2.12 and z = 2.57. Outside z = 0.96 to z = 1.62. Between z = −0.13 and z = 0.70. 3. Let Z ∼ N (0, 1). Find a constant c for which a. b. c. d. e. P(Z ≤ c) = 0.8413 P(0 ≤ Z ≤ c) = 0.3051 P(−c ≤ Z ≤ c) = 0.8664 P(c ≤ Z ≤ 0) = 0.4554 P(|Z | ≥ c) = 0.1470 4. If X ∼ N (3, 4), compute a. b. c. d. P(X ≥ 3) P(1 ≤ X < 8) P(−1.5 ≤ X < 1) P(−2 ≤ X − 3 < 4) 5. Scores on a standardized test are approximately normally distributed with a mean of 460 and a standard deviation of 80. a. What proportion of the scores are above 550? b. What is the 35th percentile of the scores? c. If someone’s score is 600, what percentile is she on? d. What proportion of the scores are between 420 and 520? 6. Weights of female cats of a certain breed are normally distributed with mean 4.1 kg and standard deviation 0.6 kg. a. What proportion of female cats have weights between 3.7 and 4.4 kg? b. A certain female cat has a weight that is 0.5 standard deviations above the mean. What proportion of female cats are heavier than this one? c. How heavy is a female cat whose weight is on the 80th percentile? d. A female cat is chosen at random. What is the probability that she weighs more than 4.5 kg? e. Six female cats are chosen at random. What is the probability that exactly one of them weighs more than 4.5 kg? 7. The lifetime of a light bulb in a certain application is normally distributed with mean µ = 1400 hours and standard deviation σ = 200 hours. a. What is the probability that a light bulb will last more than 1800 hours? b. Find the 10th percentile of the lifetimes. c. A particular battery lasts 1645 hours. What percentile is its lifetime on? d. What is the probability that the lifetime of a battery is between 1350 and 1550 hours? Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.4 8. At a certain university, math SAT scores for the entering freshman class averaged 650 and had a standard deviation of 100. The maximum possible score is 800. Is it possible that the scores of these freshmen are normally distributed? Explain. 9. The strength of an aluminum alloy is normally distributed with mean 10 gigapascals (GPa) and standard deviation 1.4 GPa. a. What is the probability that a specimen of this alloy will have a strength greater than 12 GPa? b. Find the first quartile of the strengths of this alloy. c. Find the 95th percentile of the strengths of this alloy. 10. The temperature recorded by a certain thermometer when placed in boiling water (true temperature 100◦ C) is normally distributed with mean µ = 99.8◦ C and standard deviation 0.1◦ C. a. What is the probability that the thermometer reading is greater than 100◦ C? b. What is the probability that the thermometer reading is within ±0.05◦ C of the true temperature? 11. Penicillin is produced by the Penicillium fungus, which is grown in a broth whose sugar content must be carefully controlled. The optimum sugar concentration is 4.9 mg/mL. If the concentration exceeds 6.0 mg/mL, the fungus dies and the process must be shut down for the day. a. If sugar concentration in batches of broth is normally distributed with mean 4.9 mg/mL and standard deviation 0.6 mg/mL, on what proportion of days will the process shut down? b. The supplier offers to sell broth with a sugar content that is normally distributed with mean 5.2 mg/mL and standard deviation 0.4 mg/mL. Will this broth result in fewer days of production lost? Explain. The Lognormal Distribution 143 12. The quality assurance program for a certain adhesive formulation process involves measuring how well the adhesive sticks a piece of plastic to a glass surface. When the process is functioning correctly, the adhesive strength X is normally distributed with a mean of 200 N and a standard deviation of 10 N. Each hour, you make one measurement of the adhesive strength. You are supposed to inform your supervisor if your measurement indicates that the process has strayed from its target distribution. a. Find P(X ≤ 160), under the assumption that the process is functioning correctly. b. Based on your answer to part (a), if the process is functioning correctly, would a strength of 160 N be unusually small? Explain. c. If you observed an adhesive strength of 160 N, would this be convincing evidence that the process was no longer functioning correctly? Explain. d. Find P(X ≥ 203), under the assumption that the process is functioning correctly. e. Based on your answer to part (d), if the process is functioning correctly, would a strength of 203 N be unusually large? Explain. f. If you observed an adhesive strength of 203 N, would this be convincing evidence that the process was no longer functioning correctly? Explain. g. Find P(X ≤ 195), under the assumption that the process is functioning correctly. h. Based on your answer to part (g), if the process is functioning correctly, would a strength of 195 N be unusually small? Explain. i. If you observed an adhesive strength of 195 N, would this be convincing evidence that the process was no longer functioning correctly? Explain. 4.4 The Lognormal Distribution For data that are highly skewed or that contain outliers, the normal distribution is generally not appropriate. The lognormal distribution, which is related to the normal distribution, is often a good choice for these data sets. The lognormal distribution is derived from the normal distribution as follows: If X is a normal random variable with mean µ and Navidi-38615 144 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions variance σ 2 , then the random variable Y = e X is said to have the lognormal distribution with parameters µ and σ 2 . Note that if Y has the lognormal distribution with parameters µ and σ 2 , then X = ln Y has the normal distribution with mean µ and variance σ 2 . Summary ■ ■ If X ∼ N (µ, σ 2 ), then the random variable Y = e X has the lognormal distribution with parameters µ and σ 2 . If Y has the lognormal distribution with parameters µ and σ 2 , then the random variable X = ln Y has the N (µ, σ 2 ) distribution. The probability density function of a lognormal random variable with parameters µ and σ is f (x) =      1 √ 1 exp − 2 (ln x − µ)2 2σ σ x 2π if x > 0 (4.16) if x ≤ 0 0 Figure 4.11 presents a graph of the lognormal density function with parameters µ = 0 and σ = 1. Note that the density function is highly skewed. This is the reason that the lognormal distribution is often used to model processes that tend to produce occasional large values, or outliers. 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 5 10 15 FIGURE 4.11 The probability density function of the lognormal distribution with parameters µ = 0 and σ = 1. Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.4 The Lognormal Distribution 145 It can be shown by advanced methods that if Y is a lognormal random variable with parameters µ and σ 2 , then the mean E(Y ) and variance V (Y ) are given by E(Y ) = eµ+σ 2 /2 2 V (Y ) = e2µ+2σ − e2µ+σ 2 (4.17) Note that if Y has the lognormal distribution, the parameters µ and σ 2 do not refer to the mean and variance of Y . They refer instead to the mean and variance of the normal random variable ln Y . In Equation (4.17) we used the notation E(Y ) instead of µY and V (Y ) instead of σY2 , in order to avoid confusion with µ and σ . To compute probabilities involving lognormal random variables, take logs and use the z table (Table A.2). Example 4.24 illustrates the method. Example 4.24 When a pesticide comes into contact with skin, a certain percentage of it is absorbed. The percentage that is absorbed during a given time period is often modeled with a lognormal distribution. Assume that for a given pesticide, the amount that is absorbed (in percent) within two hours of application is lognormally distributed with µ = 1.5 and σ = 0.5. Find the probability that more than 5% of the pesticide is absorbed within two hours. Solution Let Y represent the percent of pesticide that is absorbed. We need to find P(Y > 5). We cannot use the z table for Y , because Y is not normally distributed. However, ln Y is normally distributed; specifically, ln Y ∼ N (1.5, 0.52 ). We express P(Y > 5) as a probability involving ln Y : P(Y > 5) = P(ln Y > ln 5) = P(ln Y > 1.609) The z-score of 1.609 is 1.609 − 1.500 0.5 = 0.22 z= From the z table, we find that P(ln Y > 1.609) = 0.4129. We conclude that the probability that more than 5% of the pesticide is absorbed is approximately 0.41. How Can I Tell Whether My Data Come from a Lognormal Population? As stated previously, samples from normal populations rarely contain outliers. In contrast, samples from lognormal populations often contain outliers in the right-hand tail. That is, the samples often contain a few values that are much larger than the rest of the data. This of course reflects the long right-hand tail in the lognormal density function (Figure 4.11). For samples with outliers on the right, we transform the data, by taking the natural logarithm (or any logarithm) of each value. We then try to determine whether these logs come from a normal population, by plotting them on a histogram or on a probability plot. Probability plots will be discussed in Section 4.7. Navidi-38615 book 146 December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions Exercises for Section 4.4 1. The lifetime (in days) of a certain electronic component that operates in a high-temperature environment is lognormally distributed with µ = 1.2 and σ = 0.4. a. Find the mean lifetime. b. Find the probability that a component lasts between three and six days. c. Find the median lifetime. d. Find the 90th percentile of the lifetimes. 2. The article “Assessment of Dermopharmacokinetic Approach in the Bioequivalence Determination of Topical Tretinoin Gel Products” (L. Pershing, J. Nelson, et al., Journal of The American Academy of Dermatology, 2003:740–751) reports that the amount of a certain antifungal ointment that is absorbed into the skin can be modeled with a lognormal distribution. Assume that the amount (in ng/cm2 ) of active ingredient in the skin two hours after application is lognormally distributed with µ = 2.2 and σ = 2.1. a. Find the mean amount absorbed. b. Find the median amount absorbed. c. Find the probability that the amount absorbed is more than 100 ng/cm2 . d. Find the probability that the amount absorbed is less than 50 ng/cm2 . e. Find the 80th percentile of the amount absorbed. f. Find the standard deviation of the amount absorbed. 3. The body mass index (BMI) of a person is defined to be the person’s body mass divided by the square of the person’s height. The article “Influences of Parameter Uncertainties within the ICRP 66 Respiratory Tract Model: Particle Deposition” (W. Bolch, E. Farfan, et al., Health Physics, 2001:378–394) states that body mass index (in kg/m2 ) in men aged 25–34 is lognormally distributed with parameters µ = 3.215 and σ = 0.157. a. Find the mean BMI for men aged 25–34. b. Find the standard deviation of BMI for men aged 25–34. c. Find the median BMI for men aged 25–34. d. What proportion of men aged 25–34 have a BMI less than 22? e. Find the 75th percentile of BMI for men aged 25–34. 4. The article “Stochastic Estimates of Exposure and Cancer Risk from Carbon Tetrachloride Released to the Air from the Rocky Flats Plant” (A. Rood, P. McGavran, et al., Risk Analysis, 2001:675–695) models the increase in the risk of cancer due to exposure to carbon tetrachloride as lognormal with µ = −15.65 and σ = 0.79. a. b. c. d. e. Find the mean risk. Find the median risk. Find the standard deviation of the risk. Find the 5th percentile. Find the 95th percentile. 5. The prices of stocks or other financial instruments are often modeled with a lognormal distribution. An investor is considering purchasing stock in one of two companies, A or B. The price of a share of stock today is $1 for both companies. For company A, the value of the stock one year from now is modeled as lognormal with parameters µ = 0.05 and σ = 0.1. For company B, the value of the stock one year from now is modeled as lognormal with parameters µ = 0.02 and σ = 0.2. a. Find the mean of the price of one share of company A one year from now. b. Find the probability that the price of one share of company A one year from now will be greater than $1.20. c. Find the mean of the price of one share of company B one year from now. d. Find the probability that the price of one share of company B one year from now will be greater than $1.20. 6. A manufacturer claims that the tensile strength of a certain composite (in MPa) has the lognormal distribution with µ = 5 and σ = 0.5. Let X be the strength of a randomly sampled specimen of this composite. a. If the claim is true, what is P(X < 20)? b. Based on the answer to part (a), if the claim is true, would a strength of 20 MPa be unusually small? c. If you observed a tensile strength of 20 MPa, would this be convincing evidence that the claim is false? Explain. Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.5 d. If the claim is true, what is P(X < 130)? e. Based on the answer to part (d), if the claim is true, would a strength of 130 MPa be unusually small? The Exponential Distribution 147 f. If you observed a tensile strength of 130 MPa, would this be convincing evidence that the claim is false? Explain. 4.5 The Exponential Distribution The exponential distribution is a continuous distribution that is sometimes used to model the time that elapses before an event occurs. Such a time is often called a waiting time. A common example is the lifetime of a component, which is the time that elapses before the component fails. In addition, there is a close connection between the exponential distribution and the Poisson distribution. The probability density function of the exponential distribution involves a parameter, which is a positive constant λ whose value determines the density function’s location and shape. Definition The probability density function of the exponential distribution with parameter λ > 0 is λe−λx x >0 f (x) = (4.18) 0 x ≤0 Figure 4.12 presents the probability density function of the exponential distribution for various values of λ. If X is a random variable whose distribution is exponential with parameter λ, we write X ∼ Exp(λ). 4 ␭=4 3 2 ␭=2 1 ␭=1 0 0 0.5 1 1.5 2 2.5 3 FIGURE 4.12 Plots of the exponential probability density function for various values of λ. Navidi-38615 148 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions The cumulative distribution function of the exponential distribution is easy to compute. For x ≤ 0, F(x) = P(X ≤ x) = 0. For x > 0, the cumulative distribution function is x λe−λt dt = 1 − e−λx F(x) = P(X ≤ x) = 0 Summary If X ∼ Exp(λ), the cumulative distribution function of X is 1 − e−λx x >0 F(x) = P(X ≤ x) = 0 x ≤0 (4.19) The mean and variance of an exponential random variable can be computed by using integration by parts. The results follow. If X ∼ Exp(λ), then Example 4.25 µX = 1 λ (4.20) σ X2 = 1 λ2 (4.21) If X ∼ Exp(4), find µ X , σ X2 , and P(X ≤ 0.2). Solution We compute µ X and σ X2 from Equations (4.20) and (4.21), substituting λ = 2. We obtain µ X = 0.25, σ X2 = 0.0625. Using Equation (4.19), we find that P(X ≤ 0.2) = 1 − e−4(0.2) = 0.5507 The Poisson Process Recall from Section 4.2 that the Poisson distribution is often used to model the number of events that occur in a given region of time or space. We say that events follow a Poisson process with rate parameter λ when the numbers of events in disjoint intervals of time or space are independent, and the number X of events that occur in any interval of length t has a Poisson distribution with mean λt—in other words, when X ∼ Poisson(λt). We mentioned that the exponential distribution is sometimes used to model the waiting time to an event. It turns out that the exponential distribution is the correct model for waiting times whenever the events follow a Poisson process. The connection between the exponential distribution and the Poisson process is as follows: If events follow a Poisson process with rate parameter λ, and if T represents the waiting time from any starting point until the next event, then T ∼ Exp(λ). Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.5 Example 4.26 The Exponential Distribution 149 A radioactive mass emits particles according to a Poisson process at a mean rate of 15 particles per minute. At some point, a clock is started. What is the probability that more than 5 seconds will elapse before the next emission? What is the mean waiting time until the next particle is emitted? Solution We will measure time in seconds. Let T denote the time in seconds that elapses before the next particle is emitted. The mean rate of emissions is 0.25 per second, so the rate parameter is λ = 0.25, and T ∼ Exp(0.25). The probability that more than 5 seconds will elapse before the next emission is equal to P(T > 5) = 1 − P(T ≤ 5) = 1 − (1 − e−0.25(5) ) = e−1.25 = 0.2865 The mean waiting time is µT = 1 = 4 seconds. 0.25 Lack of Memory Property The exponential distribution has a property known as the lack of memory property, which we illustrate with Examples 4.27 and 4.28. Example 4.27 The lifetime of a transistor in a particular circuit has an exponential distribution with mean 1.25 years. Find the probability that the circuit lasts longer than two years. Solution Let T represent the lifetime of the circuit. Since µT = 1.25, λ = 1/1.25 = 0.8. We need to find P(T > 2). P(T > 2) = 1 − P(T ≤ 2) = 1 − (1 − e−0.8(2) ) = e−1.6 = 0.202 Example 4.28 Refer to Example 4.27. Assume the transistor is now three years old and is still functioning. Find the probability that it functions for more than two additional years. Compare this probability with the probability that a new transistor functions for more than two years, which was calculated in Example 4.27. Navidi-38615 book 150 December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions Solution We are given that the lifetime of the transistor will be more than three years, and we must compute the probability that the lifetime will be more than 3 + 2 = 5 years. The probability is given by P(T > 5 and T > 3) P(T > 3) If T > 5, then T > 3 as well. Therefore P(T > 5 and T > 3) = P(T > 5). It follows that P(T > 5) P(T > 5|T > 3) = P(T > 3) P(T > 5|T > 3) = e−0.8(5) e−0.8(3) = e−0.8(2) = = e−1.6 = 0.202 The probability that a 3-year-old transistor lasts 2 additional years is exactly the same as the probability that a new transistor lasts 2 years. Examples 4.27 and 4.28 illustrate the lack of memory property. The probability that we must wait an additional t units, given that we have already waited s units, is the same as the probability that we must wait t units from the start. The exponential distribution does not “remember” how long we have been waiting. In particular, if the lifetime of a component follows the exponential distribution, then the probability that a component that is s time units old will last an additional t time units is the same as the probability that a new component will last t time units. In other words, a component whose lifetime follows an exponential distribution does not show any effects of age or wear. The calculations in Examples 4.27 and 4.28 could be repeated for any values s and t in place of 3 and 2, and for any value of λ in place of 0.8. We now state the lack of memory property in its general form: Lack of Memory Property If T ∼ Exp(λ), and t and s are positive numbers, then P(T > t + s | T > s) = P(T > t) Exercises for Section 4.5 1. Let T ∼ Exp(0.5). Find a. b. c. d. µT σT2 P(T > 5) The median of T 2. The time between requests to a web server is exponentially distributed with mean 0.5 seconds. a. What is the value of the parameter λ? b. What is the median time between requests? c. What is the standard deviation? Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.6 a. What is the mean pore diameter? b. What is the standard deviation of the pore diameters? c. What proportion of the pores are less than 3 microns in diameter? d. What proportion of the pores are greater than 11 microns in diameter? e. What is the median pore diameter? f. What is the third quartile of the pore diameters? g. What is the 99th percentile of the pore diameters? 151 b. Find the probability that the distance between two flaws is between 8 and 20 m. c. Find the median distance. d. Find the standard deviation of the distances. e. Find the 65th percentile of the distances. d. What is the 80th percentile? e. Find the probability that more than one second elapses between requests. f. If there have been no requests for the past two seconds, what is the probability that more than one additional second will elapse before the next request? 3. A catalyst researcher states that the diameters, in microns, of the pores in a new product she has made follow the exponential distribution with parameter λ = 0.25. Some Other Continuous Distributions 5. A certain type of component can be purchased new or used. Fifty percent of all new components last more than five years, but only 30% of used components last more than five years. Is it possible that the lifetimes of new components are exponentially distributed? Explain. 6. A radioactive mass emits particles according to a Poisson process at a mean rate of 2 per second. Let T be the waiting time, in seconds, between emissions. 4. The distance between flaws on a long cable is exponentially distributed with mean 12 m. a. Find the probability that the distance between two flaws is greater than 15 m. a. b. c. d. e. f. What is the mean waiting time? What is the median waiting time? Find P(T > 2). Find P(T < 0.1). Find P(0.3 < T < 1.5). If 3 seconds have elapsed with no emission, what is the probability that there will be an emission within the next second? 4.6 Some Other Continuous Distributions The Uniform Distribution The continuous uniform distribution, which we will sometimes refer to just as the uniform distribution, is the simplest of the continuous distributions. It often plays an important role in computer simulation studies. The uniform distribution has two parameters, a and b, with a < b. Definition The probability density function of the continuous uniform distribution with parameters a and b is   1 a<x <b (4.22) f (x) = b − a  0 otherwise If X is a random variable with probability density function f (x), we say that X is uniformly distributed on the interval (a, b). Navidi-38615 152 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions Since the probability density function is constant on the interval (a, b), we can think of the probability as being distributed “uniformly” on the interval. If X is a random variable whose distribution is uniform on the interval (a, b), we write X ∼ U (a, b). The mean and variance of a uniform random variable can easily be computed from the definitions (Equations 3.19, 3.20, and 3.21 in Section 3.3). Let X ∼ U (a, b). Then a+b 2 (4.23) (b − a)2 12 (4.24) µX = σ X2 = The Gamma Distribution The gamma distribution is a continuous distribution, one of whose purposes is to extend the usefulness of the exponential distribution in modeling waiting times. It involves a certain integral known as the gamma function. We define the gamma function and state some of its properties. Definition For r > 0, the gamma function is defined by ∞ t r −1 e−t dt (r ) = (4.25) 0 The gamma function has the following properties: 1. 2. 3. If r is an integer, then (r ) = (r − 1)!. For any r , (r + 1) = r (r ). √ (1/2) = π. The gamma function is used to define the probability density function of the gamma distribution. The gamma probability density function has two parameters, r and λ, both of which are positive constants. Definition The probability density function of the gamma distribution with parameters r > 0 and λ > 0 is  r r −1 −λx λ x e   x >0 (r ) (4.26) f (x) =   0 x ≤0 Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.6 Some Other Continuous Distributions 153 1 0.8 r = 1, ␭ = 1 0.6 r = 3, ␭ = 2 0.4 r = 5, ␭ = 1 0.2 0 0 2 4 6 8 10 12 FIGURE 4.13 The gamma probability density function for various values of r and λ. If X is a random variable whose probability density function is gamma with parameters r and λ, we write X ∼ (r, λ). Note that when r = 1, the gamma distribution is the same as the exponential. In symbols, (1, λ) = Exp(λ). Figure 4.13 presents plots of the gamma probability density function for various values of r and λ. The mean and variance of a gamma random variable can be computed from the probability density function. The results are as follows. If X ∼ (r, λ), then µX = r λ (4.27) σ X2 = r λ2 (4.28) A gamma distribution for which the parameter r is a positive integer is sometimes called an Erlang distribution. If r = k/2 where k is a positive integer, the (r, 1/2) distribution is called a chi-square distribution with k degrees of freedom. The chisquare distribution is widely used in statistical inference. We will discuss some of its uses in Section 6.5. The Weibull Distribution The Weibull distribution is a continuous distribution that is used in a variety of situations. A common application of the Weibull distribution is to model the lifetimes of components such as bearings, ceramics, capacitors, and dielectrics. The Weibull probability density function has two parameters, both positive constants, that determine its location and Navidi-38615 154 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions 2 ␣ = 5, ␤ = 1 1.5 1 ␣ = 1, ␤ = 0.5 0.5 0 0 2 ␣ = 5, ␤ = 0.2 4 6 8 10 FIGURE 4.14 The Weibull probability density function for various choices of α and β. shape. We denote these parameters α and β. The probability density function of the Weibull distribution is α α−1 −(βx)α x >0 αβ x e (4.29) f (x) = 0 x ≤0 If X is a random variable whose probability density function is Weibull with parameters α and β, we write X ∼ Weibull(α, β). Note that when α = 1, the Weibull distribution is the same as the exponential distribution with parameter λ = β. In symbols, Weibull(1, β) = Exp(β). Figure 4.14 presents plots of the Weibull(α, β) probability density function for several choices of the parameters α and β. By varying the values of α and β, a wide variety of curves can be generated. Because of this, the Weibull distribution can be made to fit a wide variety of data sets. This is the main reason for the usefulness of the Weibull distribution. The Weibull cumulative distribution function can be computed by integrating the probability density function: F(x) = P(X ≤ x) =      x α αβ α t α−1 e−(βt) dt = 1 − e−(βx) α x >0 0 0 (4.30) x ≤0 This integral is not as hard as it looks. Just substitute u = (βt)α , and du = αβ α t α−1 dt. The mean and variance of the Weibull distribution are expressed in terms of the gamma function. Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.6 Some Other Continuous Distributions If X ∼ Weibull(α, β), then 1 1 µX = 1 + β α 2 1 1 2 2 σX = 2 1 + − 1+ β α α 155 (4.31) (4.32) In the special case that 1/α is an integer, then 2 2 1 1 1 1 2 µX = ! σX = 2 !− ! β α β α α If the quantity 1/α is an integer, then 1 + 1/α and 1 + 2/α are integers, so Property 1 of the gamma function can be used to compute µ X and σ X2 exactly. If the quantity 1/α is of the form n/2, where n is an integer, then in principle µ X and σ X2 can be computed exactly through repeated applications of Properties 2 and 3 of the gamma function. For other values of α, µ X and σ X2 must be approximated. Many computer packages can do this. Example 4.29 In the article “Snapshot: A Plot Showing Program through a Device Development Laboratory” (D. Lambert, J. Landwehr, and M. Shyu, Statistical Case Studies for Industrial Process Improvement, ASA-SIAM 1997), the authors suggest using a Weibull distribution to model the duration of a bake step in the manufacture of a semiconductor. Let T represent the duration in hours of the bake step for a randomly chosen lot. If T ∼ Weibull(0.3, 0.1), what is the probability that the bake step takes longer than four hours? What is the probability that it takes between two and seven hours? Solution We use the cumulative distribution function, Equation (4.30). Substituting 0.3 for α and 0.1 for β, we have P(T ≤ t) = 1 − e−(0.1t) 0.3 Therefore P(T > 4) = 1 − P(T ≤ 4) 0.3 = 1 − 1 − e−[(0.1)(4)] = e−(0.4) 0.3 = e−0.7597 = 0.468 Navidi-38615 book 156 December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions The probability that the step takes between two and seven hours is P(2 < T < 7) = P(T ≤ 7) − P(T ≤ 2) 0.3 0.3 = 1 − e−[(0.1)(7)] − 1 − e−[(0.1)(2)] = e−[(0.1)(2)] − e−[(0.1)(7)] 0.3 = e−(0.2) − e−(0.7) 0.3 0.3 0.3 = e−0.6170 − e−0.8985 = 0.132 Exercises for Section 4.6 1. A person arrives at a certain bus stop each morning. The waiting time, in minutes, for a bus to arrive is uniformly distributed on the interval (0, 10). a. Find the mean waiting time. b. Find the standard deviation of the waiting times. 2. Resistors are labeled 100 . In fact, the actual resistances are uniformly distributed on the interval (95, 103). a. Find the mean resistance. b. Find the standard deviation of the resistances. 3. Let T ∼ (6, 2). a. Find µT . b. Find σT . 4. Let T ∼ (r, λ). If µT = 8 and σT2 = 16, find r and λ. 5. Let T ∼ Weibull(0.5, 2). a. b. c. d. e. Find µT . Find σT2 . Find P(T ≤ 2). Find P(T > 3). Find P(1 < T ≤ 2). 6. If T is a continuous random variable that is always positive (such as a waiting time), with probability density function f (t) and cumulative distribution function F(t), then the hazard function is defined to be the function f (t) h(t) = 1 − F(t) The hazard function is the rate of failure per unit time, expressed as a proportion of the items that have not failed. a. If T ∼ Weibull(α, β), find h(t). b. For what values of α is the hazard rate increasing with time? For what values of α is it decreasing? 7. In the article “Parameter Estimation with Only One Complete Failure Observation” (W. Pang, P. Leung, et al., International Journal of Reliability, Quality, and Safety Engineering, 2001:109–122), the lifetime, in hours, of a certain type of bearing is modeled with the Weibull distribution with parameters α = 2.25 and β = 4.474 × 10−4 . a. Find the probability that a bearing lasts more than 1000 hours. b. Find the probability that a bearing lasts less than 2000 hours. c. Find the median lifetime of a bearing. d. The hazard function is defined in Exercise 6. What is the hazard at t = 2000 hours? 8. The lifetime of a certain battery is modeled with the Weibull distribution with α = 2 and β = 0.1. a. What proportion of batteries will last longer than 10 hours? b. What proportion of batteries will last less than 5 hours? c. What proportion of batteries will last longer than 20 hours? d. The hazard function is defined in Exercise 6. What is the hazard at t = 10 hours? 9. The lifetime of a cooling fan, in hours, that is used in a computer system has the Weibull distribution with α = 1.5 and β = 0.0001. Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.7 Probability Plots 157 b. Based on the answer to part (a), if the model is correct, would one day be an unusually short lifetime? Explain. c. If you observed a component that lasted one day, would you find this model to be plausible? Explain. d. If this model is correct, what is P(T ≤ 90)? e. Based on the answer to part (d), if the model is correct, would 90 days be an unusually short lifetime? An unusually long lifetime? Explain. f. If you observed a component that lasted 90 days, would you find this model to be plausible? Explain. a. What is the probability that a fan lasts more than 10,000 hours? b. What is the probability that a fan lasts less than 5000 hours? c. What is the probability that a fan lasts between 3000 and 9000 hours? 10. Someone suggests that the lifetime T (in days) of a certain component can be modeled with the Weibull distribution with parameters α = 3 and β = 0.01. a. If this model is correct, what is P(T ≤ 1)? 4.7 Probability Plots Scientists and engineers frequently work with data that can be thought of as a random sample from some population. In many such cases, it is important to determine a probability distribution that approximately describes that population. In some cases, knowledge of the process that generated the data can guide the decision. More often, though, the only way to determine an appropriate distribution is to examine the sample to find a probability distribution that fits. Probability plots provide a good way to do this. Given a random sample X 1 , . . . , X n , a probability plot can determine whether the sample might plausibly have come from some specified population. We will present the idea behind probability plots with a simple example. A random sample of size 5 is drawn, and we want to determine whether the population from which it came might have been normal. The sample, arranged in increasing order, is 3.01, 3.35, 4.79, 5.96, 7.89 Denote the values, in increasing order, by X 1 , . . . , X n (n = 5 in this case). The first thing to do is to assign increasing, evenly spaced values between 0 and 1 to the X i . There are several acceptable ways to do this; perhaps the simplest is to assign the value (i − 0.5)/n to X i . The following table shows the assignment for the given sample. i Xi (i − 0.5)/5 1 2 3 4 5 3.01 3.35 4.79 5.96 7.89 0.1 0.3 0.5 0.7 0.9 The value (i − 0.5)/n is chosen to reflect the position of X i in the ordered sample. There are i − 1 values less than X i , and i values less than or equal to X i . The quantity (i − 0.5)/n is a compromise between the proportions (i − 1)/n and i/n. Navidi-38615 book 158 December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions The goal is to determine whether the sample might have come from a normal population. The most plausible normal distribution is the one whose mean and standard deviation are the same as the sample mean and standard deviation. The sample mean is X = 5.00, and the sample standard deviation is s = 2.00. We will therefore determine whether this sample might have come from a N (5, 22 ) distribution. The first step is to compute the 100(i − 0.5)/5 percentiles (i.e., the 10th, 30th, 50th, 70th, and 90th percentiles) of the N (5, 22 ) distribution. We could approximate these values by looking up the z-scores corresponding to these percentiles and then converting to raw scores. In practice, the Q i are invariably calculated by a computer software package. The following table presents the X i and the Q i for this example. i Xi Qi 1 2 3 4 5 3.01 3.35 4.79 5.96 7.89 2.44 3.95 5.00 6.05 7.56 The probability plot consists of the points (X i , Q i ). Since the distribution that generated the Q i was a normal distribution, this is called a normal probability plot. If X 1 , . . . , X n do in fact come from the distribution that generated the Q i , the points should lie close to a straight line. Figure 4.15 presents a normal probability plot for the sample X 1 , . . . , X 5 . A straight line is superimposed on the plot, to make it easier to judge whether or not the points lie 12 0.999 10 0.99 8 0.95 0.9 0.75 6 0.5 4 0.25 2 0.1 0.05 0.01 0 0.001 ⫺2 3 4 5 6 7 8 3 4 5 6 7 8 FIGURE 4.15 Normal probability plots for the sample X 1 , . . . , X 5 . The plots are identical, except for the scaling on the vertical axis. The sample points lie approximately on a straight line, so it is plausible that they came from a normal population. Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.7 Probability Plots 159 close to a straight line. Two versions of the plot are presented; they are identical except for the scaling on the vertical axis. In the plot on the left, the values on the vertical axis represent the Q i . In the plot on the right, the values on the vertical axis represent the percentiles (as decimals, so 0.1 is the 10th percentile) of the Q i . For example, the 10th percentile of N (5, 22 ) is 2.44, so the value 0.1 on the right-hand plot corresponds to the value 2.44 on the left-hand plot. The 50th percentile, or median, is 5, so the value 0.5 on the right-hand plot corresponds to the value 5 on the left-hand plot. Computer packages often scale the vertical axis like the plot on the right. In Figure 4.15, the sample points are close to the line, so it is quite plausible that the sample came from a normal distribution. We remark that the points Q 1 , . . . , Q n are called quantiles of the distribution from which they are generated. Sometimes the sample points X 1 , . . . , X n are called empirical quantiles. For this reason the probability plot is sometimes called a quantile–quantile plot, or QQ plot. In this example, we used a sample of only five points to make the calculations clear. In practice, probability plots work better with larger samples. A good rule of thumb is to require at least 30 points before relying on a probability plot. Probability plots can still be used for smaller samples, but they will detect only fairly large departures from normality. Figure 4.16 shows two normal probability plots. The plot in Figure 4.16a is of the monthly productions of 255 gas wells. These data do not lie close to a straight line and thus do not come from a population that is close to normal. The plot in Figure 4.16b is of the natural logs of the monthly productions. These data lie much closer to a straight line, although some departure from normality can be detected. 0.999 0.999 0.99 0.99 0.95 0.9 0.95 0.9 0.75 0.75 0.5 0.5 0.25 0.25 0.1 0.05 0.1 0.05 0.01 0.01 0.001 0.001 0 500 1000 (a) 1500 3 4 5 (b) 6 7 FIGURE 4.16 Two normal probability plots. (a) A plot of the monthly productions of 255 gas wells. These data do not lie close to a straight line, and thus do not come from a population that is close to normal. (b) A plot of the natural logs of the monthly productions. These data lie much closer to a straight line, although some departure from normality can be detected. Navidi-38615 book 160 December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions Interpreting Probability Plots It’s best not to use hard-and-fast rules when interpreting a probability plot. Judge the straightness of the plot by eye. When deciding whether the points on a probability plot lie close to a straight line, do not pay too much attention to the points at the very ends (high or low) of the sample, unless they are quite far from the line. It is common for a few points at either end to stray from the line somewhat. However, a point that is very far from the line when most other points are close is an outlier, and deserves attention. Exercises for Section 4.7 1. Each of three samples has been plotted on a normal probability plot. For each, say whether the sample appears to have come from an approximately normal population. 0.999 0.999 0.99 0.95 0.9 0.75 0.5 0.25 0.1 0.05 0.01 0.99 0.95 0.9 0.75 0.5 0.25 0.1 0.05 0.01 0.001 0.001 (a) (b) 0.999 0.99 0.95 0.9 0.75 0.5 0.25 0.1 0.05 0.01 0.001 (c) 2. Construct a normal probability plot for the soapbars data in Exercise 1 in Section 1.3 (page 30). Do these data appear to come from an approximately normal distribution? 4.1 3.8 3.7 3.7 1.8 1.9 3.7 3.8 3.2 4.6 4.3 3.4 1.9 1.8 3.6 4.0 4.6 4.7 3.8 2.3 2.0 1.8 3.8 4.4 4.5 4.6 3.8 4.1 3.9 1.9 2.5 4.3 4.3 3.5 4.5 3.3 2.3 4.0 4.1 2.0 3. Below are the durations (in minutes) of 40 eruptions of the geyser Old Faithful in Yellowstone National Park. Construct a normal probability plot for these data. Do the data appear to come from an approximately normal distribution? Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.8 4. Below are the durations (in minutes) of 40 time intervals between eruptions of the geyser Old Faithful in Yellowstone National Park. 91 86 73 79 51 51 67 60 79 85 68 86 53 45 86 71 82 88 72 67 51 51 75 81 76 80 75 76 82 49 66 83 84 82 84 76 53 75 70 55 Construct a normal probability plot for these data. Do they appear to come from an approximately normal distribution? The Central Limit Theorem 161 5. Construct a normal probability plot for the PM data in Table 1.2 (page 17). Do the PM data appear to come from a normal population? 6. Construct a normal probability plot for the logs of the PM data in Table 1.2. Do the logs of the PM data appear to come from a normal population? 7. Can the plot in Exercise 6 be used to determine whether the PM data appear to come from a lognormal population? Explain. 4.8 The Central Limit Theorem The Central Limit Theorem is by far the most important result in statistics. Many commonly used statistical methods rely on this theorem for their validity. The Central Limit Theorem says that if we draw a large enough sample from a population, then the distribution of the sample mean is approximately normal, no matter what population the sample was drawn from. This allows us to compute probabilities for sample means using the z table, even though the population from which the sample was drawn is not normal. We now explain this more fully. Let X 1 , . . . , X n be a simple random sample from a population with mean µ and variance σ 2 . Let X = (X 1 + · · · + X n )/n be the sample mean. Now imagine drawing many such samples and computing their sample means. If one could draw every possible sample of size n from the original population and compute the sample mean for each one, the resulting collection would be the population of sample means. One could construct the probability density function of this population. One might think that the shape of this probability density function would depend on the shape of the population from which the sample was drawn. The surprising thing is that if the sample size is sufficiently large, this is not so. If the sample size is large enough, the distribution of the sample mean is approximately normal, no matter what the distribution of the population from which the sample was drawn. The Central Limit Theorem Let X 1 , . . . , X n be a simple random sample from a population with mean µ and variance σ 2 . X1 + · · · + Xn be the sample mean. Let X = n Let Sn = X 1 + · · · + X n be the sum of the sample observations. Then if n is sufficiently large, σ2 X ∼ N µ, n and Sn ∼ N (nµ, nσ 2 ) approximately (4.33) approximately (4.34) Navidi-38615 162 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions The Central Limit Theorem says that X and Sn are approximately normally distributed, if the sample size n is large enough. The natural question to ask is: How large is large enough? The answer depends on the shape of the underlying population. If the sample is drawn from a nearly symmetric distribution, the normal approximation can be good even for a fairly small value of n. However, if the population is heavily skewed, a fairly large n may be necessary. Empirical evidence suggests that for most populations, a sample size of 30 or more is large enough for the normal approximation to be adequate. For most populations, if the sample size is greater than 30, the Central Limit Theorem approximation is good. Example 4.30 Let X denote the number of flaws in a 1 in. length of copper wire. The probability mass function of X is presented in the following table. x P (X = x) 0 1 2 3 0.48 0.39 0.12 0.01 One hundred wires are sampled from this population. What is the probability that the average number of flaws per wire in this sample is less than 0.5? Solution The population mean number of flaws is µ = 0.66, and the population variance is σ 2 = 0.5244. (These quantities can be calculated using Equations 3.13 and 3.14 in Section 3.3.) Let X 1 , . . . , X 100 denote the number of flaws in the 100 wires sampled from this population. We need to find P(X < 0.5). Now the sample size is n = 100, which is a large sample. It follows from the Central Limit Theorem (expression 4.33) that X ∼ N (0.66, 0.005244). The z-score of 0.5 is therefore 0.5 − 0.66 = −2.21 z=√ 0.005244 From the z table, the area to the left of −2.21 is 0.0136. Therefore P(X < 0.5) = 0.0136, so only 1.36% of samples of size 100 will have fewer than 0.5 flaws per wire. See Figure 4.17. Note that in Example 4.30 we needed to know only the mean and variance of the population, not the probability mass function. Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.8 The Central Limit Theorem 163 0.0136 0.50 z = ⫺2.21 0.66 FIGURE 4.17 Solution to Example 4.30 Example 4.31 At a large university, the mean age of the students is 22.3 years, and the standard deviation is 4 years. A random sample of 64 students is drawn. What is the probability that the average age of these students is greater than 23 years? Solution Let X 1 , . . . , X 64 be the ages of the 64 students in the sample. We wish to find P(X > 23). Now the population from which the sample was drawn has mean µ = 22.3 and variance σ 2 = 16. The sample size is n = 64. It follows from the Central Limit Theorem (expression 4.33) that X ∼ N (22.3, 0.25). The z-score for 23 is z= 23 − 22.3 √ = 1.40 0.25 From the z table, the area to the right of 1.40 is 0.0808. Therefore P(X > 23) = 0.0808. See Figure 4.18. 0.0808 22.3 23.0 z = 1.4 FIGURE 4.18 Solution to Example 4.31 Normal Approximation to the Binomial Recall from Section 4.1 that if X ∼ Bin(n, p) then X represents the number of successes in n independent trials, each of which has success probability p. Furthermore, X has mean np and variance np(1 − p). If the number of trials is large enough, the Central Limit Theorem says that X ∼ N (np, np(1 − p)), approximately. Thus we can use the normal distribution to compute probabilities concerning X . Again the question arises: How large a sample is large enough? In the binomial case, the accuracy of the normal approximation depends on the mean number of successes np and on the mean number of failures n(1− p). The larger the values of np and n(1− p), the better the approximation. A common rule of thumb is to use the normal approximation Navidi-38615 164 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions whenever np > 5 and n(1 − p) > 5. A better and more conservative rule is to use the normal approximation whenever np > 10 and n(1 − p) > 10. Summary If X ∼ Bin(n, p), and if np > 10 and n(1 − p) > 10, then X ∼ N (np, np(1 − p)) approximately (4.35) To illustrate the accuracy of the normal approximation to the binomial, Figure 4.19 presents the Bin(100, 0.2) probability histogram with the N (20, 16) probability density function superimposed. While a slight degree of skewness can be detected in the binomial distribution, the normal approximation is quite close. 0.1 0.08 0.06 0.04 0.02 0 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 FIGURE 4.19 The Bin(100, 0.2) probability histogram, with the N (20, 16) probability density function superimposed. The Continuity Correction The binomial distribution is discrete, while the normal distribution is continuous. The continuity correction is an adjustment, made when approximating a discrete distribution with a continuous one, that can improve the accuracy of the approximation. To see how it works, imagine that a fair coin is tossed 100 times. Let X represent the number of heads. Then X ∼ Bin(100, 0.5). Imagine that we wish to compute the probability that X is between 45 and 55. This probability will differ depending on whether the endpoints, 45 and 55, are included or excluded. Figure 4.20 illustrates the case where the endpoints are included—that is, where we wish to compute P(45 ≤ X ≤ 55). The exact probability is given by the total area of the rectangles of the binomial probability histogram corresponding to the integers 45 to 55 inclusive. The approximating normal curve is superimposed. To get the best approximation, we should compute the area under the normal curve between 44.5 and 55.5. In contrast, Figure 4.21 illustrates the case where we wish to compute P(45 < X < 55). Here the endpoints are excluded. The Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.8 The Central Limit Theorem 165 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 40 45 50 55 60 FIGURE 4.20 To compute P(45 ≤ X ≤ 55), the areas of the rectangles corresponding to 45 and to 55 should be included. To approximate this probability with the normal curve, compute the area under the curve between 44.5 and 55.5. 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 40 45 50 55 60 FIGURE 4.21 To compute P(45 < X < 55), the areas of the rectangles corresponding to 45 and to 55 should be excluded. To approximate this probability with the normal curve, compute the area under the curve between 45.5 and 54.5. exact probability is given by the total area of the rectangles of the binomial probability histogram corresponding to the integers 46 to 54. The best normal approximation is found by computing the area under the normal curve between 45.5 and 54.5. In summary, to apply the continuity correction, determine precisely which rectangles of the discrete probability histogram you wish to include, and then compute the area under the normal curve corresponding to those rectangles. Example 4.32 If a fair coin is tossed 100 times, use the normal curve to approximate the probability that the number of heads is between 45 and 55 inclusive. Solution This situation is illustrated in Figure 4.20. Let X be the number of heads obtained. Then X ∼ Bin(100, 0.5). Substituting n = 100 and p = 0.5 into Equation (4.35), we obtain the normal approximation X ∼ N (50, 25). Since the endpoints 45 and 55 are to be Navidi-38615 166 book December 4, 2008 CHAPTER 4 Confirming Pages 15:53 Commonly Used Distributions included, we should compute the area under the normal curve between 44.5 and 55.5. The z-scores for 44.5 and 55.5 are 44.5 − 50 55.5 − 50 z= = −1.1, z= = 1.1 5 5 From the z table we find that the probability is 0.7286. See Figure 4.22. 0.7286 44.5 z = ⫺1.1 50 55.5 z = 1.1 FIGURE 4.22 Solution to Example 4.32. Example 4.33 If a fair coin is tossed 100 times, use the normal curve to approximate the probability that the number of heads is between 45 and 55 exclusive. Solution This situation is illustrated in Figure 4.21. Let X be the number of heads obtained. As in Example 4.32, X ∼ Bin(100, 0.5), and the normal approximation is X ∼ N (50, 25). Since the endpoints 45 and 55 are to be excluded, we should compute the area under the normal curve between 45.5 and 54.5. The z-scores for 45.5 and 54.5 are 54.5 − 50 45.5 − 50 = −0.9, z= = 0.9 z= 5 5 From the z table we find that the probability is 0.6318. See Figure 4.23. 0.6318 45.5 z = ⫺0.9 50 54.5 z = 0.9 FIGURE 4.23 Solution to Example 4.33. Accuracy of the Continuity Correction The continuity correction improves the accuracy of the normal approximation to the binomial distribution in most cases. For binomial distributions with large n and small p, however, when computing a probability that corresponds to an area in the tail of the distribution, use of the continuity correction can in some cases reduce the accuracy of the normal approximation somewhat. This results from the fact that the normal Navidi-38615 book December 4, 2008 Confirming Pages 15:53 4.8 The Central Limit Theorem 167 approximation is not perfect; it fails to account for a small degree of skewness in these distributions. In summary, use of the continuity correction makes the normal approximation to the binomial distribution better in most cases, but not all. Normal Approximation to the Poisson Recall that if X ∼ Poisson(λ), then X is approximately binomial with n large and np = λ. Recall also that µ X = λ and σ X2 = λ. It follows that if λ is sufficiently large, i.e., λ > 10, then X is approximately binomial, with np > 10. It follows from the Central Limit Theorem that X is also approximately normal, with mean and variance both equal to λ. Thus we can use the normal distribution to approximate the Poisson. Summary If X ∼ Poisson(λ), where λ > 10, then X ∼ N (λ, λ) approximately (4.36) Continuity Correction for the Poisson Distribution Since the Poisson distribution is discrete, the continuity correction can in principle be applied when using the normal approximation. For areas that include the central part of the curve, the continuity correction generally improves the normal approximation, but for areas in the tails the continuity correction sometimes makes the approximation worse. We will not use the continuity correction for the Poisson distribution. Example 4.34 The number of hits on a website follows a Poisson distribution, with a mean of 27 hits per hour. Find the probability that there will be 90 or more hits in three hours. Solution Let X denote the number of hits on the website in three hours. The mean number of hits in three hours is 81, so X ∼ Poisson(81). Using the normal approximation, we find that X ∼ N (81, 81). We wish to find P(X ≥ 90). We compute the z-score of 90, which is z= 90 − 81 √ = 1.00 81 Using the z table, we find that P(X ≥ 90) = 0.1587. See Figure 4.24. 0.1587 81 90 z=1 FIGURE 4.24 Solution to Example 4.34. Navidi-38615 book 168 December 4, 2008 CHAPTER 4 15:53 Confirming Pages Commonly Used Distributions Exercises for Section 4.8 1. Bottles filled by a certain machine are supposed to contain 12 oz of liquid. In fact, the fill volume is random, with mean 12.01 oz and standard deviation 0.2 oz. a. What is the probability that the mean volume of a random sample of 144 bottles is less than 12 oz? b. If the population mean fill volume is increased to 12.03 oz what is the probability that the mean volume of a sample of size 144 will be less than 12 oz? 2. Among all the income tax forms filed in a certain year, the mean tax paid was $2000, and the standard deviation was $500. In addition, for 10% of the forms, the tax paid was greater than $3000. A random sample of 625 tax forms is drawn. a. What is the probability that the average tax paid on the sample forms is greater than $1980? b. What is the probability that more than 60 of the sampled forms have a tax of greater than $3000? 3. In a galvanized coating process for pipes, the mean coating weight is 125 lb, with a standard deviation of 10 lb. A random sample of 50 pipes is drawn. a. What is the probability that the sample mean coating weight is greater than 128 lb? b. Find the 90th percentile of the sample mean coating weight. c. How large a sample size is needed so that the probability is 0.02 that the sample mean is less than 123? 4. A 500-page book contains 250 sheets of paper. The thickness of the paper used to manufacture the book has mean 0.08 mm and standard deviation 0.01 mm. a. What is the probability that a randomly chosen book is more than 20.2 mm thick (not including the covers)? b. What is the 10th percentile of book thicknesses? c. Someone wants to know the probability that a randomly chosen page is more than 0.1 mm thick. Is enough information given to compute this probability? If so, compute the probability. If not, explain why not. 5. In a process that manufactures bearings, 90% of the bearings meet a thickness specification. In a sample of 500 bearings, what is the probability that more than 440 meet the specification? 6. Among the adults in a large city, 30% have a college degree. A simple random sample of 100 adults is chosen. What is the probability that more than 35 of them have a college degree? 7. The breaking strength (in kg/mm) for a certain type of fabric has mean 1.86 and standard deviation 0.27. A random sample of 80 pieces of fabric is drawn. a. What is the probability that the sample mean breaking strength is less than 1.8 kg/mm? b. Find the 80th percentile of the sample mean breaking strength. c. How large a sample size is needed so that the probability is 0.01 that the sample mean is less than 1.8? 8. The amount of warpage in a type of wafer used in the manufacture of integrated circuits has mean 1.3 mm and standard deviation 0.1 mm. A random sample of 200 wafers is drawn. a. What is the probability that the sample mean warpage exceeds 1.305 mm? b. Find the 25th percentile of the sample mean. c. How many wafers must be sampled so that the probability is 0.05 that the sample mean exceeds 1.305? 9. A sample of 225 wires is drawn from the population of wires described in Example 4.30. Find the probability that fewer than 110 of these wires have no flaws. 10. A battery manufacturer claims that the lifetime of a certain type of battery has a population mean of 40 hours and a standard deviation of 5 hours. Let X represent the mean lifetime of the batteries in a simple random sample of size 100. a. If the claim is true, what is P(X ≤ 36.7)? b. Based on the answer to part (a), if the claim is true, is a sample mean lifetime of 36.7 hours unusually short? c. If the sample mean lifetime of the 100 batteries were 36.7 hours, would you find the manufacturer’s claim to be plausible? Explain. d. If the claim is true, what is P(X ≤ 39.8)? Navidi-38615 book December 4, 2008 15:53 Confirming Pages Supplementary Exercises for Chapter 4 e. Based on the answer to part (d), if the claim is true, is a sample mean lifetime of 39.8 hours unusually short? f. If the sample mean lifetime of the 100 batteries were 39.8 hours, would you find the manufacturer’s claim to be plausible? Explain. 11. A new process has been designed to make ceramic tiles. The goal is to have no more than 5% of the tiles be nonconforming due to surface defects. A random sample of 1000 tiles is inspected. Let X be the number of nonconforming tiles in the sample. a. If 5% of the tiles produced are nonconforming, what is P(X ≥ 75)? 169 b. Based on the answer to part (a), if 5% of the tiles are nonconforming, is 75 nonconforming tiles out of 1000 an unusually large number? c. If 75 of the sample tiles were nonconforming, would it be plausible that the goal had been reached? Explain. d. If 5% of the tiles produced are nonconforming, what is P(X ≥ 53)? e. Based on the answer to part (d), if 5% of the tiles are nonconforming, is 53 nonconforming tiles out of 1000 an unusually large number? f. If 53 of the sample tiles were nonconforming, would it be plausible that the goal had been reached? Explain. Supplementary Exercises for Chapter 4 1. An airplane has 100 seats for passengers. Assume that the probability that a person holding a ticket appears for the flight is 0.90. If the airline sells 105 tickets, what is the probability that everyone who appears for the flight will get a seat? 2. The number of large cracks in a length of pavement along a certain street has a Poisson distribution with a mean of 1 crack per 100 m. a. What is the probability that there will be exactly 8 cracks in a 500 m length of pavement? b. What is the probability that there will be no cracks in a 100 m length of pavement? c. Let T be the distance in meters between two successive cracks. What is the probability density function of T ? d. What is the probability that the distance between two successive cracks will be more than 50 m? 3. Pea plants contain two genes for seed color, each of which may be Y (for yellow seeds) or G (for green seeds). Plants that contain one of each type of gene are called heterozygous. According to the Mendelian theory of genetics, if two heterozygous plants are crossed, each of their offspring will have probability 0.75 of having yellow seeds and probability 0.25 of having green seeds. a. Out of 10 offspring of heterozygous plants, what is the probability that exactly 3 have green seeds? b. Out of 10 offspring of heterozygous plants, what is the probability that more than 2 have green seeds? c. Out of 100 offspring of heterozygous plants, what is the probability that more than 30 have green seeds? d. Out of 100 offspring of heterozygous plants, what is the probability that between 30 and 35 inclusive have green seeds? e. Out of 100 offspring of heterozygous plants, what is the probability that fewer than 80 have yellow seeds? 4. A simple random sample X 1 , . . . , X n is drawn from a population, and the quantities ln X 1 , . . . , ln X n are plotted on a normal probability plot. The points approximately follow a straight line. True or false: a. X 1 , . . . , X n come from a population that is approximately lognormal. b. X 1 , . . . , X n come from a population that is approximately normal. c. ln X 1 , . . . , ln X n come from a population that is approximately lognormal. d. ln X 1 , . . . , ln X n come from a population that is approximately normal. 5. The Environmental Protection Agency (EPA) has contracted with your company for equipment to monitor water quality for several lakes in your water district. A total of 10 devices will be used. Assume that each Navidi-38615 book 170 December 4, 2008 CHAPTER 4 15:53 Confirming Pages Commonly Used Distributions device has a probability of 0.01 of failure during the course of the monitoring period. pled resistors will have resistances between 9.3 and 10.7 ? a. What is the probability that none of the devices fail? b. What is the probability that two or more devices fail? c. If the EPA requires the probability that none of the devices fail to be at least 0.95, what is the largest individual failure probability allowable? 9. The intake valve clearances on new engines of a certain type are normally distributed with mean 200 µm and standard deviation 10 µm. 6. In the article “Occurrence and Distribution of Ammonium in Iowa Groundwater” (K. Schilling, Water Environment Research, 2002:177–186), ammonium concentrations (in mg/L) were measured at a large number of wells in Iowa. The mean concentration was 0.71, the median was 0.22, and the standard deviation was 1.09. Is it possible to determine whether these concentrations are approximately normally distributed? If so, say whether they are normally distributed, and explain how you know. If not, describe the additional information you would need to determine whether they are normally distributed. 7. Medication used to treat a certain condition is administered by syringe. The target dose in a particular application is µ. Because of the variations in the syringe, in reading the scale, and in mixing the fluid suspension, the actual dose administered is normally distributed with mean µ and variance σ 2 . a. What is the probability that the dose administered differs from the mean µ by less than σ ? b. If X represents the dose administered, find the value of z so that P(X < µ + zσ ) = 0.90. c. If the mean dose is 10 mg, the variance is 2.6 mg2 , and a clinical overdose is defined as a dose larger than 15 mg, what is the probability that a patient will receive an overdose? a. What is the probability that the clearance is greater than 215 µm? b. What is the probability that the clearance is between 180 and 205 µm? c. An engine has six intake valves. What is the probability that exactly two of them have clearances greater than 215 µm? 10. The stiffness of a certain type of steel beam used in building construction has mean 30 kN/mm and standard deviation 2 kN/mm. a. Is it possible to compute the probability that the stiffness of a randomly chosen beam is greater than 32 kN/mm? If so, compute the probability. If not, explain why not. b. In a sample of 100 beams, is it possible to compute the probability that the sample mean stiffness of the beams is greater than 30.2 kN/mm? If so, compute the probability. If not, explain why not. 11. In a certain process, the probability of producing an oversize assembly is 0.05. a. In a sample of 300 randomly chosen assemblies, what is the probability that fewer than 20 of them are oversize? b. In a sample of 10 randomly chosen assemblies, what is the probability that one or more of them is oversize? c. To what value must the probability of an oversize assembly be reduced so that only 1% of lots of 300 assemblies contain 20 or more that are oversize? 8. You have a large box of resistors whose resistances are normally distributed with mean 10 and standard deviation 1 . 12. A process for manufacturing plate glass leaves an average of three small bubbles per 10 m2 of glass. The number of bubbles on a piece of plate glass follows a Poisson distribution. a. What proportion of the resistors have resistances between 9.3 and 10.7 ? b. If you sample 100 resistors, what is the probability that 50 or more of them will have resistances between 9.3 and 10.7 ? c. How many resistors must you sample so that the probability is 0.99 that 50 or more of the sam- a. What is the probability that a piece of glass 3 × 5 m will contain more than two bubbles? b. What is the probability that a piece of glass 4 × 6 m will contain no bubbles? c. What is the probability that 50 pieces of glass, each 3 × 6 m, will contain more than 300 bubbles in total? Navidi-38615 book December 4, 2008 15:53 Confirming Pages Supplementary Exercises for Chapter 4 13. The lifetime of a bearing (in years) follows the Weibull distribution with parameters α = 1.5 and β = 0.8. a. What is the probability that a bearing lasts more than 1 year? b. What is the probability that a bearing lasts less than 2 years? 14. The length of time to perform an oil change at a certain shop is normally distributed with mean 29.5 minutes and standard deviation 3 minutes. What is the probability that a mechanic can complete 16 oil changes in an eight-hour day? 15. A cereal manufacturer claims that the gross weight (including packaging) of a box of cereal labeled as weighing 12 oz has a mean of 12.2 oz and a standard deviation of 0.1 oz. You gather 75 boxes and weigh them all together. Let S denote the total weight of the 75 boxes of cereal. a. If the claim is true, what is P(S ≤ 914.8)? b. Based on the answer to part (a), if the claim is true, is 914.8 oz an unusually small total weight for a sample of 75 boxes? c. If the total weight of the boxes were 914.8 oz, would you be convinced that the claim was false? Explain. 171 d. If the claim is true, what is P(S ≤ 910.3)? e. Based on the answer to part (d), if the claim is true, is 910.3 oz an unusually small total weight for a sample of 75 boxes? f. If the total weight of the boxes were 910.3 oz, would you be convinced that the claim was false? Explain. 16. Someone claims that the number of hits on his website has a Poisson distribution with mean 20 per hour. Let X be the number of hits in five hours. a. If the claim is true, what is P(X ≤ 95)? b. Based on the answer to part (a), if the claim is true, is 95 hits in a five-hour time period an unusually small number? c. If you observed 95 hits in a five-hour time period, would this be convincing evidence that the claim is false? Explain. d. If the claim is true, what is P(X ≤ 65)? e. Based on the answer to part (d), if the claim is true, is 65 hits in a five-hour time period an unusually small number? f. If you observed 65 hits in a five-hour time period, would this be convincing evidence that the claim is false? Explain.

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Example