Statistical Inference Theory
Lesson 28: The Central Limit Theorem

A population is a collection of numeric data. Each population can be considered a probability sample space with a given distribution. Inference theory allows one to take an appropriate sample of a given population and, from this sample, make specific judgements about the entire population. For example, assume you are a reporter on the newspaper of a local community college whose enrollment is twenty thousand, and your assignment is to find the average age $\mu$ of all the enrolled students. There are two ways you could proceed:

1. By some means, find the age of each of the 20,000 students and average their ages. If this can be done, then you would have computed $\mu$.
2. Take a representative sample of, say, 100 students. Ask each of these students their age and record it. From this sample you can compute the average age $\bar{X}$.

If the second method is used, then you will use the value $\bar{X}$ in place of the average $\mu$ of the whole population. Such a process is making an inference about the mean $\mu$ of the whole population, the entire student body. In order to use $\bar{X}$ as an estimator, we need the central limit theorem, which allows us to examine the distribution of $\bar{X}$ and other distributions.

28.1 - What is the Central Limit Theorem for $\bar{X}$?

Let $\{X_k\}$ be a sequence of mutually independent random variables (see the Supplementary Problems in Lesson 15 for the definition) with a common distribution, generated from a sample of size $n$ drawn from a population. Suppose that $\mu = E(X_k)$ and $\sigma^2 = \mathrm{Var}(X_k)$ are finite. Define the random variable

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}.$$

The central limit theorem states the following about the distribution of the random variable $\bar{X}$:

1. For a large sample ($n \ge 30$), $\bar{X}$ is approximately normally distributed.
2. The mean of $\bar{X}$ is $\mu_{\bar{X}} = \mu$.
3. The standard deviation of $\bar{X}$ is $\sigma_{\bar{X}} = \sigma/\sqrt{n}$; it is called the standard error of the mean.
4. If $\sigma$ is known, the random variable $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is approximately normally distributed with mean 0 and standard deviation 1.
5. If $\sigma$ is not known and $s$ is the standard deviation of the sample, we use $s$ in place of $\sigma$ and $s/\sqrt{n}$ in place of $\sigma/\sqrt{n}$, and the random variable $Z = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ is approximately normally distributed with mean 0 and standard deviation 1.

28.1 - Example 1: Past records of the student body at a large university show that the mean age is $\mu = 23.5$ years with a standard deviation $\sigma = 3.1$ years. A sample of $n = 100$ students is taken at random. We define $\bar{X}$ to be the average age of the sample. Find
(a) $\sigma_{\bar{X}}$.
(b) the probability that the average age of this sample is at least 24 years.
(c) the probability that the average age is between 23.1 and 24.1 years.
(d) the probability that the average age is at most 23 years.

Solutions:

(a) We are given that $\sigma = 3.1$ and the sample size taken is $n = 100$. From the Central Limit Theorem, we have $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 3.1/\sqrt{100} = 0.31$.

(b) Step 1: We use the formula $z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ to find the area under the normal distribution curve (fig. 1), with $\bar{X} = 24$, $\mu = 23.5$ and $\sigma_{\bar{X}} = 0.31$.
Step 2: $z = \dfrac{24 - 23.5}{0.31} \approx 1.61$.
Step 3: From the normal distribution tables, $P\{\bar{X} \ge 24\} = 0.5 - 0.4463 = 0.0537$.

(c) We use the formula $z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ (figs. 2 and 3).
Step 1: For $\bar{X} = 23.1$, $z = \dfrac{23.1 - 23.5}{0.31} \approx -1.29$.
Step 2: For $\bar{X} = 24.1$, $z = \dfrac{24.1 - 23.5}{0.31} \approx 1.94$.
From the normal distribution table, the area is $P\{23.1 \le \bar{X} \le 24.1\} = 0.4015 + 0.4738 = 0.8753$.

(d) We use the formula $z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ (fig. 4). For $\bar{X} = 23$, $z = \dfrac{23 - 23.5}{0.31} \approx -1.61$.
From the table, $P\{\bar{X} \le 23\} = 0.5 - 0.4463 = 0.0537$.

28.1 - Example 2: A local fish packing company packs 50-gallon containers with 100 pounds of fish. Assume that each month a government agency randomly selects 49 containers and computes the average weight. If the average weight of these containers is less than 100 pounds, then the company is fined. Find the value $\mu$ the company should strive for to ensure that it will not be fined more than 2% of the time.
Assume a standard deviation $\sigma = 5$ pounds.

Solution:
Step 1: To solve for $\mu$, we use the formula $\mu = \bar{X} - z\,\sigma_{\bar{X}}$.
Step 2: $\bar{X} = 100$.
Step 3: $\sigma_{\bar{X}} = 5/\sqrt{49} \approx 0.71$.
Step 4: From the figure (fig. 5), we need to look up the area 0.48 in the normal distribution table: $z = -2.05$.
Step 5: $\mu = 100 - (-2.05)(0.71) \approx 100 + 1.46 = 101.46$ pounds.

28.1 - Example 3: The American Bubble Company recently purchased a new machine to fill bottles with 16 ounces of spring water. To check whether the machine is filling the proper amount of water, they sample 100 bottles each hour. If the average fill of these bottles is less than $c^*$ ounces, then the machine is stopped and adjusted. Assuming $\sigma = 0.5$ ounces, find $c^*$ so that the chance the machine is stopped when properly functioning is 0.01.

Solution:
Step 1: To solve for $c^*$, we use the formula $c^* = \mu + z\,\sigma_{\bar{X}}$.
Step 2: $\mu = 16$.
Step 3: $\sigma_{\bar{X}} = 0.5/\sqrt{100} = 0.05$.
Step 4: From the figure (fig. 6), we need to look up the area 0.49 in the normal distribution table: $z = -2.33$.
Step 5: $c^* = \mu + z\,\sigma_{\bar{X}} = 16 - 2.33(0.05) \approx 15.88$ ounces.

Solved Problems

28.1 - Solved Problem 1: The average life of 100-watt light bulbs produced by a company is $\mu = 1{,}890$ hours with a standard deviation $\sigma = 150$ hours. A sample of $n = 400$ of these bulbs is selected at random. Find
(a) $\sigma_{\bar{X}}$.
(b) the probability that the average life of this sample is at most 1,900 hours.
(c) the probability that the average life is between 1,900 and 2,000 hours.
(d) the probability that the average life is greater than 1,950 or less than 1,875 hours.

Solutions:

(a) We are given that $\sigma = 150$ and the sample size taken is $n = 400$. From the Central Limit Theorem, we have $\sigma_{\bar{X}} = 150/\sqrt{400} = 7.5$.

(b) Step 1: We use the formula $z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ to find the area under the normal distribution curve (fig. 7), with $\bar{X} = 1{,}900$, $\mu = 1{,}890$ and $\sigma_{\bar{X}} = 7.50$.
Step 2: $z = \dfrac{1900 - 1890}{7.5} \approx 1.33$.
Step 3: From the normal distribution tables, $P\{\bar{X} \le 1900\} = P\{z \le 1.33\} = 0.4082 + 0.5 = 0.9082$.

(c) We use the same formula (fig. 8).
Step 1: For $\bar{X} = 1900$, $z = \dfrac{1900 - 1890}{7.5} \approx 1.33$.
Step 2: For $\bar{X} = 2000$, $z = \dfrac{2000 - 1890}{7.5} \approx 14.67$.
Step 3: From the normal distribution table, $P\{1900 \le \bar{X} \le 2000\} = P\{1.33 \le z \le 14.67\} = 0.5 - 0.4082 = 0.0918$.

(d) We use the formula $z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ (fig. 9).
Step 1: For $\bar{X} = 1950$, $z = \dfrac{1950 - 1890}{7.5} = 8$.
Step 2: For $\bar{X} = 1875$, $z = \dfrac{1875 - 1890}{7.5} = -2$.
$P\{\bar{X} \ge 1950\} + P\{\bar{X} \le 1875\} = P\{z \ge 8\} + P\{z \le -2\} = 0 + 0.5 - 0.4772 = 0.0228$.

28.1 - Solved Problem 2: A machine fills 1,000 cans hourly with 16 ounces of coffee. Each hour, a sample of 200 cans is randomly selected and checked for weight. If the average weight of these 200 cans is more than 16 ounces, the machine is stopped and adjusted. Assume a standard deviation $\sigma = 1.5$ ounces. To what value $\mu$ should the company set the machine to ensure that the process will be stopped no more than 5% of the time?

Solution:
Step 1: To solve for $\mu$, we use the formula $\mu = \bar{X} - z\,\sigma_{\bar{X}}$.
Step 2: $\bar{X} = 16$.
Step 3: $\sigma_{\bar{X}} = 1.5/\sqrt{200} \approx 0.106$.
Step 4: From the figure (fig. 10), we need to look up the area 0.45 in the normal distribution table: $z = 1.64$.
Step 5: $\mu = 16 - (1.64)(0.106) \approx 16 - 0.17 = 15.83$ ounces.

28.1 - Solved Problem 3: The American Bubble Company recently purchased a new machine to fill bottles with 16 ounces of spring water. To check whether the machine is filling the proper amount of water, they sample 100 bottles each hour. If the average fill of these bottles is more than $c^*$ ounces, then the machine is stopped and adjusted. Assuming $\sigma = 0.5$ ounces, find $c^*$ so that the chance the machine is stopped when properly functioning is 0.03.

Solution:
Step 1: To solve for $c^*$, we use the formula $c^* = \mu + z\,\sigma_{\bar{X}}$.
Step 2: $\mu = 16$.
Step 3: $\sigma_{\bar{X}} = 0.5/\sqrt{100} = 0.05$.
Step 4: From the figure (fig. 11), we need to look up the area 0.47 in the normal distribution table: $z = 1.88$.
Step 5: $c^* = \mu + z\,\sigma_{\bar{X}} = 16 + 1.88(0.05) = 16 + 0.094 = 16.094$ ounces.

Unsolved Problems with Answers

28.1 - Problem 1: A machine bores holes in a metal plate that average 1 cm with a standard deviation of 0.01 cm. A sample of 100 plates is taken. Find
(a) $\sigma_{\bar{X}}$.
(b)
the probability that the average hole size for this sample is greater than 1.002 cm.
(c) the probability that the average hole size for this sample is between 1.002 and 1.003 cm.
(d) the probability that the average hole size for this sample is between 0.999 and 1.003 cm.

Answers: (a) $\sigma_{\bar{X}} = 0.001$ (b) 0.0228 (c) 0.0215 (d) 0.84
Refer back to 28.1 - Example 1 and 28.1 - Solved Problem 1.

28.1 - Problem 2: A local fish packing company packs 50-gallon containers with 100 pounds of fish. Assume that each month the company randomly selects 36 containers and computes the average weight. If the average weight of these containers is more than 100 pounds, then the company has to repack the containers. Find the value $\mu$ that will cause the company to repack 10% of the time. Assume a standard deviation $\sigma = 6$ pounds.

Answer: $\mu = 98.72$ pounds
Refer back to 28.1 - Example 2 and 28.1 - Solved Problem 2.

28.1 - Problem 3: A fishing company catches all its fish using nets. Government regulations require that the average length of a fish caught is 15 inches. After each catch, the company samples the length of 49 fish from its nets. If the average length is less than $c^*$ inches, all the fish are returned to the water. Assume that on a given day the average length of the catch is 15 inches with a standard deviation of 1.4 inches. Find $c^*$ so that the chance is only 5% that the entire catch will be returned to the water.

Answer: $c^* = 14.67$ inches
Refer back to 28.1 - Example 3 and 28.1 - Solved Problem 3.

Supplementary Problems

1. The records of a local men's health club show that the average lifting weight is 178 pounds. A random sample of 100 club members shows that 40% of the men can lift more than 179 pounds. For all members, find the standard deviation $\sigma$.

2. A computer selects, with replacement, 36 numbers from the set {0, 1, 2, 3, 4, ..., 100}.
a. Using the formula [...], find $\mu$.
b. Using the formula [...], find $\sigma$.
c. For this sample, find the probability $P\{\bar{X} \ge 60\}$.
d. If only one number is selected at random, find $P\{X \ge 60\}$.
e. Use the Central Limit Theorem to find a sample size $N$ for which $\sigma_{\bar{X}} = 4.86$.
f. Find the smallest sample size where $P\{\bar{X} \ge 60\} = 0.01$.

3. A college's records show that the grade point average (G.P.A.) of all female students is 2.95 with a standard deviation of 0.2, and that the G.P.A. of all male students is 2.94 with a standard deviation of 0.25. A random sample of 200 female students and 100 male students was taken. Find the probability
a. that the average G.P.A. of the sampled female students and male students is greater than 2.97.
b. that the average G.P.A. of the sampled female students or male students is greater than 2.97.

4. The American Bubble Company recently purchased a new machine to fill bottles with 16 ounces of spring water. To check whether the machine is filling the proper amount of water, they sample 100 bottles each hour. If the average fill of these bottles is less than 15.85 ounces, then the machine is stopped and adjusted. Assuming $\sigma = 0.7$ ounces, find the probability that over a 5-hour period the machine will be stopped one time.

For any sequence of discrete random variables $X_1, X_2, \ldots, X_n$, we define the joint distribution of any subset $X_i, X_j, \ldots, X_r$ as

$$P\{X_i = x_k, X_j = x_w, \ldots, X_r = x_t\} = P[\{X_i = x_k\} \cap \{X_j = x_w\} \cap \cdots \cap \{X_r = x_t\}].$$

5. A fair die is tossed twice. Let $X_1$ equal the outcome on the first toss and $X_2$ the outcome on the second toss.
a. Compute the distribution of $\bar{X}$ by completing a table that lists each value of $\bar{X}$ together with its probability $P\{\bar{X}\}$.
b. Compute $\mu = E(X_1)$, $\mu = E(X_2)$ and $E(\bar{X})$.
c. Compute $\sigma^2$, [...].
d. Show $\sigma^2_{\bar{X}}$ [...].
e. Show $\sigma_{\bar{X}} =$ [...].
f. Compute $P\{X_1 > 3.5\}$ and $P\{\bar{X} > 3.5\}$.

6. Assume a binomial experiment with $N$ independent trials, where $p$ is the probability of success on each trial.
a. Show $\mu = Np$.
b. [...]

7. If $X$ and $Y$ are two discrete, independent random variables, show $E(XY) = E(X)E(Y)$.

8.
A sequence of mutually independent random variables is called a Bernoulli sequence if $P\{X_k = 1\} = p_k$ and $P\{X_k = 0\} = 1 - p_k = q_k$ $(k = 1, 2, \ldots, N)$.
a. If $S = X_1 + X_2 + \cdots + X_N$, show $E(S) = p_1 + p_2 + \cdots + p_N$.
b. Show [...].

9. Assume $\{X_k\}$ $(k = 1, \ldots, n)$ is a sequence of random variables satisfying the Central Limit Theorem.
a. Show $E(\bar{X}) = \mu$.
b. In Lesson 16, Problem 13, we showed [...]. Show [...].

10. Assume the following population: $S = \{2, 10\}$. A sample (with replacement) of size $N = 30$ is taken from this population, where $P\{X_k = 2\} = 1/2$ and $P\{X_k = 10\} = 1/2$ $(k = 1, \ldots, 30)$.
a. Find $\mu$ and $\sigma^2$.
b. Define [...] as the population of all averages $\bar{X}$ generated by all possible samples. Find the size of this population.
c. List all 31 distinct numbers of this population.
d. Find the distribution of $\bar{X}$ for this population.
e. Find a summation formula for [...].
f. Using the central limit theorem, evaluate the sum in d.
g. Assume a sample of $N = 30$ is taken. From the distribution of $\bar{X}$, find the probability that $4.9 \le \bar{X} \le 6.8$.
h. Use the central limit theorem to approximate $P\{4.9 \le \bar{X} \le 6.8\}$.

11. Show the random variable [...] has mean $\mu = 0$ and $\sigma = 1$.

12. Assume $s$ is the standard deviation computed from a sample of size $N$. Find $\mu$ and $\sigma$ of [...] where [...].

13. In a small European country the law permits a maximum of 4 automobiles per family. The department of transportation recently did a study and found the following distribution of the number of automobiles owned: 51% of the families own 1 automobile, 23% own 2 automobiles, 17% own 3 automobiles and 9% own 4 automobiles. Recently 100 families renewed their automobile registrations. Find
a. $\mu$.
b. [...].
c. For these 100 families, estimate the probability that on average they own at least 2 automobiles.

14. Assume the following game is played: a fair die is tossed once and the resulting value is recorded.
a. Write out the population.
b. Find $\mu$ and $\sigma$.
c. If this game is played 64 times, find the probability that the average score is between 4 and 5.

15. Assume the following game is played: five cards are drawn without replacement from an ordinary deck of cards and the number of diamonds is recorded.
a. Write out the population.
b. Find $\mu$ and $\sigma$.
c. If the game is played 100 times, find the probability that the average number of diamonds drawn is less than 2.
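
The table lookups used throughout this lesson can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist` class; the helper names `standard_error`, `prob_mean_between` and `cutoff_for_lower_tail` are illustrative and do not come from the lesson. It applies the normal approximation for the sample mean to 28.1 - Example 1 and 28.1 - Example 3. Because the lesson rounds z to two decimals before consulting the table, the printed values should agree with the worked answers only to about two or three decimal places.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_mean_between(mu, sigma, n, low=None, high=None):
    """Approximate P{low <= X-bar <= high} with the normal approximation.

    Passing low=None or high=None leaves that side unbounded.
    """
    dist = NormalDist(mu, standard_error(sigma, n))
    lower = dist.cdf(low) if low is not None else 0.0
    upper = dist.cdf(high) if high is not None else 1.0
    return upper - lower


def cutoff_for_lower_tail(mu, sigma, n, tail_prob):
    """Value c* such that P{X-bar < c*} = tail_prob."""
    return NormalDist(mu, standard_error(sigma, n)).inv_cdf(tail_prob)


# 28.1 - Example 1: mu = 23.5, sigma = 3.1, n = 100
print("(a) standard error:", standard_error(3.1, 100))
print("(b) P{X-bar >= 24}:", prob_mean_between(23.5, 3.1, 100, low=24))
print("(c) P{23.1 <= X-bar <= 24.1}:",
      prob_mean_between(23.5, 3.1, 100, low=23.1, high=24.1))
print("(d) P{X-bar <= 23}:", prob_mean_between(23.5, 3.1, 100, high=23))

# 28.1 - Example 3: mu = 16, sigma = 0.5, n = 100, lower-tail chance 0.01
print("c*:", cutoff_for_lower_tail(16, 0.5, 100, 0.01))
```

The same helpers can be reused for the other problems by changing the arguments; an upper-tail cutoff such as the one in 28.1 - Solved Problem 3 corresponds to calling `cutoff_for_lower_tail` with `1 - 0.03` as the tail probability.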