“Teach A Level Maths” Statistics 1
The Central Limit Theorem
© Christine Crisp

S1: The Central Limit Theorem (AQA)

Normal Distribution diagrams in this presentation have been drawn using FX Draw (available from Efofex at www.efofex.com).

"Certain images and/or photos on this presentation are the copyrighted property of JupiterImages and are being used with permission under license. These images and/or photos may not be copied or downloaded without permission from JupiterImages."

How Good are Estimates?

In the previous presentation we met a question where the sample was small, so we couldn't be sure that the estimate of the population mean was a good one. A numerical measure of the accuracy of an estimate can be made if we know the standard deviation of the population. To explain this, we'll look again at the diagrams showing the means of 1000 samples from a population of weights of hens' eggs.

[Figure: Population and 1000 sample means, for samples of size n = 5 and n = 20]

                                   Population   Samples of size 5   Samples of size 20
mean / mean of means               60           60.0                60.0
s.d. / s.d. of means               2.94         1.34                0.67

We want to concentrate on the standard deviations.

It can be shown that the standard deviation of the sample means is given by $\frac{\sigma}{\sqrt{n}}$.

For samples of size 5: $\frac{\sigma}{\sqrt{n}} = \frac{2.94}{\sqrt{5}} \approx 1.31$
For samples of size 20: $\frac{\sigma}{\sqrt{n}} = \frac{2.94}{\sqrt{20}} \approx 0.66$

(Our values were 1.34 and 0.67, but we didn't have all possible samples.)

The standard deviation of the distribution of the sample means is called the standard error of the sample mean, often shortened to the standard error (s.e.). The standard error is given by $\frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the population standard deviation and $n$ is the size of each sample.

Since the distribution of sample means is approximately Normal, approximately 68% of the sample means lie within 1 s.e. of the population mean and approximately 95% within 2 s.e.

We now know the following facts about the distribution of the means of samples of size $n$ from a population with an approximately Normal distribution:
The distribution of means is approximately Normal.
The mean of the means is equal to the population mean, $\mu$.
The standard deviation is called the standard error and is equal to the population standard deviation, $\sigma$, divided by $\sqrt{n}$.

We write $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.

Very importantly, the above also hold approximately when the population is not Normal, but in this case $n$ should be greater than 30. This is the Central Limit Theorem (C.L.T.).

e.g. 1. A General Studies test was given to 720 students in a college. The standard deviation of the marks is 20. The following marks are from a random sample of 12 students:
35, 23, 17, 38, 20, 25, 29, 32, 28, 31, 33, 24
(a) Estimate the mean mark of all the students.
(b) Find the standard error of your estimate.
(c) What size sample would be needed to halve the standard error?

The sample size is less than 30, so we must assume the population is Normal.

Solution:
(a) $\bar{x} = \frac{\sum x}{n} = \frac{35 + 23 + \dots + 24}{12} = \frac{335}{12} \approx 27.9$. This is the estimate of $\mu$.
(b) The standard error $= \frac{\sigma}{\sqrt{n}} = \frac{20}{\sqrt{12}} \approx 5.77$.
(c) We need $\frac{20}{\sqrt{n}} = \frac{5.77}{2} \approx 2.89$, so $\sqrt{n} = 2\sqrt{12} \approx 6.93$ and $n = 48$.

Part (c) of the last question illustrates a useful principle. We halved the standard error from 5.77 to 2.89 by increasing the sample size from 12 to 48. To halve the standard error we must multiply the sample size by 4.
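The arithmetic in e.g. 1 can be checked with a short script. This is only a sketch: the variable and function names (`marks`, `standard_error`, `required_n`) are made up for illustration and are not part of the presentation.

```python
import math
from statistics import mean

# Marks of the 12 sampled students, copied from e.g. 1.
marks = [35, 23, 17, 38, 20, 25, 29, 32, 28, 31, 33, 24]
sigma = 20  # population standard deviation stated in the question


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


# (a) The sample mean is the estimate of the population mean.
x_bar = mean(marks)

# (b) Standard error for a sample of size 12.
se = standard_error(sigma, len(marks))

# (c) Solve sigma / sqrt(n) = se / 2 for n, i.e. n = (sigma / target)^2.
target_se = se / 2
required_n = (sigma / target_se) ** 2

print(f"estimate of the mean: {x_bar:.1f}")
print(f"standard error:       {se:.2f}")
print(f"n for half the s.e.:  {round(required_n)}")
```

Rearranging the formula as n = (sigma / target)^2 works for any target standard error, not only half of the original.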
Can you say directly what the sample size would need to be if we wanted the standard error to be a third of its original value?

ANS: We need $9 \times 12 = 108$.

The rule is:
To divide the s.e. by 2, multiply the sample size by $2^2$ (= 4).
To divide the s.e. by 3, multiply the sample size by $3^2$ (= 9), etc.
The reason is that the formula for the s.e. contains division by $\sqrt{n}$.

e.g. 2. The heights of plants grown from a particular variety of seeds are claimed to have a Normal distribution with mean 90 cm and standard deviation 10 cm.
(a) Find the probability that a randomly selected plant is less than 100 cm tall.
(b) A random sample of 5 plants is selected. Find the probability that the sample mean is less than 85 cm.

Solution: Let $X$ be the r.v. "height of a plant (cm)". Then $X \sim N(90, 10^2)$.

(a) We have just 1 plant, so this part is not dealing with sample means. We want $P(X < 100)$.
Standardising, $z = \frac{100 - 90}{10} = 1$, so $P(X < 100) = P(Z < 1) = \Phi(1) = 0.8413$.
N.B. $z = 1$ because 100 is one s.d. above the mean.

(b) $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$. With a sample size of 5, $\bar{X} \sim N\left(90, \frac{10^2}{5}\right)$, i.e. $\bar{X} \sim N(90, 20)$. We want $P(\bar{X} < 85)$.
Standardising: $z = \frac{85 - 90}{\sqrt{20}} \approx -1.12$
$P(\bar{X} < 85) = P(Z < -1.12) = 1 - \Phi(1.12) = 1 - 0.8686 = 0.1314$

Exercise 1.
The length of telephone calls received by an organization is known to have a standard deviation of 13 mins. The table gives the lengths of 50 randomly selected telephone calls.

Length (min)   1-2   3-5   6-8   9-11   12-17   18-25
Frequency      14    12    10    8      4       2

(a) Use the sample to calculate an estimate of $\mu$, the mean length of calls.
(b) Find the standard error of your estimate.

Solution:
(a) Using the class mid-points 1.5, 4, 7, 10, 14.5 and 21.5, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{320}{50} = 6.4$.
(b) s.e. $= \frac{\sigma}{\sqrt{n}} = \frac{13}{\sqrt{50}} \approx 1.84$
N.B. The sample size is greater than 30. The Central Limit Theorem (C.L.T.) tells us we need not assume the population is Normal.

Exercise 2. A Normal distribution has a mean of 40 and a variance of 6. Find the probability that (a) the average of 10 observations exceeds 41 and (b) the average of 50 observations exceeds 41. Interpret your answers to (a) and (b), using sketches to help you.

Solution: Let $X$ be the r.v. Then $X \sim N(40, 6)$.
(a) $\bar{X} \sim N\left(40, \frac{6}{10}\right)$, i.e. $\bar{X} \sim N(40, 0.6)$. We want $P(\bar{X} > 41)$.
$z = \frac{41 - 40}{\sqrt{0.6}} \approx 1.29$
$P(Z > 1.29) = 1 - \Phi(1.29) = 1 - 0.9015 = 0.0985$
(b) Method as (a) with $\bar{X} \sim N(40, 0.12)$. Ans: 0.0019

[Figure: sketches of the distribution of $\bar{X}$ for n = 10 and n = 50, each with the area above 41 shaded (0.0985 and 0.0019 respectively)]

With a sample size of 10, about 10% of the sample means will lie above 41, but with a sample size of 50 only about 0.2% will do so.
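The effect of sample size on these two tail probabilities can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `prob_mean_exceeds` is hypothetical and chosen only for this illustration.

```python
import math
from statistics import NormalDist


def prob_mean_exceeds(mu, variance, n, threshold):
    """P(sample mean > threshold), taking the sample mean ~ N(mu, variance / n)."""
    # NormalDist expects a standard deviation, so take the square root
    # of the variance of the sample mean.
    sampling_dist = NormalDist(mu, math.sqrt(variance / n))
    return 1 - sampling_dist.cdf(threshold)


# Population N(40, 6), threshold 41, for the two sample sizes in the exercise.
p10 = prob_mean_exceeds(40, 6, 10, 41)
p50 = prob_mean_exceeds(40, 6, 50, 41)

print(f"n = 10: {p10:.4f}")
print(f"n = 50: {p50:.4f}")
```

Because no rounding of z to two decimal places takes place here, the printed values may differ from the table-based answers in the last decimal place or so.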
Exercise 3. The random variable $X$ has a distribution $X \sim N(50, 81)$.
(a) Write down the distribution of $\bar{X}$, the mean of random samples of size 9 taken from $X$.
(b) Find the probability that $\bar{X}$ is less than 45.

Solution:
(a) $\bar{X} \sim N\left(50, \frac{81}{9}\right)$, i.e. $\bar{X} \sim N(50, 9)$.
(b) We want $P(\bar{X} < 45)$.
$z = \frac{45 - 50}{\sqrt{9}} \approx -1.67$
$P(Z < -1.67) = 1 - \Phi(1.67) = 1 - 0.9525 = 0.0475$
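The table method used in the solutions to e.g. 2(b) and Exercise 3 (standardise, round z to two decimal places, then read off Φ) can be mimicked in a few lines. This is a sketch only: `table_style_probability` is a name invented for this illustration, and the standard Normal c.d.f. from `statistics.NormalDist` stands in for the printed tables.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def table_style_probability(mu, variance, n, threshold):
    """P(sample mean < threshold), rounding z to 2 d.p. as with printed tables."""
    standard_error = math.sqrt(variance / n)
    z = round((threshold - mu) / standard_error, 2)
    return z, STANDARD_NORMAL.cdf(z)


# e.g. 2(b): X ~ N(90, 10^2), samples of size 5, P(sample mean < 85).
z_plants, p_plants = table_style_probability(90, 10**2, 5, 85)

# Exercise 3(b): X ~ N(50, 81), samples of size 9, P(sample mean < 45).
z_ex3, p_ex3 = table_style_probability(50, 81, 9, 45)

print(f"e.g. 2(b):     z = {z_plants}, probability = {p_plants:.4f}")
print(f"Exercise 3(b): z = {z_ex3}, probability = {p_ex3:.4f}")
```

Rounding z before evaluating Φ is done here only to match the table method; dropping the `round` call gives the unrounded probabilities.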