Survey

MAS1401 Handout 5

Additional Notes:

Now in an actual experiment, we will usually collect a single sample of data from a population.

Example (artificial data generated by me!)

A random sample of 16 students taking MAS1401 sit an IQ test, giving the following results*:

124 114 116 119 123 124 137 115
118 120 105 115 111 134 132 140

Sample mean = 121.69
Sample standard deviation = 9.79

* For a sample of data x1, x2, …, xn, the calculations are:

$$\text{Sample mean: } \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

$$\text{Sample standard deviation: } s = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}}$$

What can these data tell you about the mean IQ of all students taking MAS1401? This is a good example of using a sample to try to answer a question about a population.

It is reasonable to assume that the IQs come from a Normal distribution. However, the mean, µ, and the standard deviation, σ, are liable to be different from those of the general population.

You are interested in µ, the population mean IQ for MAS1401. You can estimate its value using the sample mean, 121.69. This isn’t particularly useful, however, unless you can give some idea of how accurate the estimate is. It would be better to construct a “confidence interval” for µ.

So how did our sample mean value, 121.69, arise? Well, by the Central Limit Theorem, it came (at least approximately) from a Normal distribution with mean µ and standard deviation σ/√n = σ/√16 = σ/4.

Using the slide entitled “An important property of the Normal distribution” (copied at the end of this handout), we can be 95% confident that the population mean µ lies within 1.96 × (σ/4) of 121.69.

If you knew σ, you could use this to construct a 95% confidence interval for µ. In fact, for this example, I can tell you that σ = 9. So, straight away, you can construct a 95% confidence interval for µ as:

sample mean ± (1.96 × σ/√n)

which is 121.69 ± (1.96 × 9/4), i.e. the range of numbers

from: 121.69 − (1.96 × 9/4)
up to: 121.69 + (1.96 × 9/4)

which gives the 95% confidence interval (117.28, 126.10) for µ.

Additional Notes:

For your sample of 16 IQs, suppose you didn’t know σ. Then you would replace it by s = 9.79, and replace 1.96 by a value from the t-distribution with n − 1 degrees of freedom. Here n − 1 = 15, and the appropriate value, obtained from Minitab, is 2.13. So now the confidence interval would be the range of numbers

from: 121.69 − (2.13 × 9.79/4)
up to: 121.69 + (2.13 × 9.79/4)

which gives the 95% confidence interval (116.48, 126.90) for µ. Note how this is wider than before, reflecting the greater uncertainty we have now that σ has to be estimated from the sample.

Luckily, Minitab will handle all the calculations for us. The calculations we have just done are achieved using the procedure:

Stat > Basic Statistics > 1-sample t...

An Important Property of the Normal Distribution

For any Normal distribution, 95% of the probability lies within 1.96 standard deviations of the mean.

[Figure: Normal distribution. Vertical axis: probability density (0.0 to 0.4). Horizontal axis: standard deviations away from the mean (−3 to 3).]

In popular scientific wisdom, “the usual range of a population lies within 2 standard deviations of the mean”.
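As a cross-check on the arithmetic in this handout, here is a minimal Python sketch (not part of the original notes, which use Minitab) that recomputes the sample mean, the sample standard deviation and both 95% confidence intervals from the 16 IQ scores. The names `iqs` and `confidence_interval` are made up for this illustration. The multiplier 2.13 is simply hard-coded from the value the handout quotes from Minitab, and σ = 9 is the value given in the example.

```python
import math
import statistics

# IQ scores of the 16 sampled students, copied from the handout.
iqs = [124, 114, 116, 119, 123, 124, 137, 115,
       118, 120, 105, 115, 111, 134, 132, 140]

n = len(iqs)
xbar = statistics.mean(iqs)   # sample mean
s = statistics.stdev(iqs)     # sample standard deviation (divisor n - 1)


def confidence_interval(centre, multiplier, sd, n):
    """Return (lower, upper) for centre +/- multiplier * sd / sqrt(n)."""
    half_width = multiplier * sd / math.sqrt(n)
    return centre - half_width, centre + half_width


# Case 1: sigma known (the handout states sigma = 9).
# The Normal multiplier is the 97.5th percentile of N(0, 1), about 1.96.
z = statistics.NormalDist().inv_cdf(0.975)
z_interval = confidence_interval(xbar, z, 9, n)

# Case 2: sigma unknown, so use s and a t multiplier with n - 1 = 15
# degrees of freedom. 2.13 is the value quoted from Minitab in the
# handout; it is hard-coded here rather than computed.
t = 2.13
t_interval = confidence_interval(xbar, t, s, n)

print(f"mean = {xbar:.2f}, sd = {s:.2f}")
print(f"sigma known:   ({z_interval[0]:.2f}, {z_interval[1]:.2f})")
print(f"sigma unknown: ({t_interval[0]:.2f}, {t_interval[1]:.2f})")
```

Because the script keeps unrounded values of the mean and standard deviation throughout, its endpoints may differ from the hand calculation by about 0.01 in the second decimal place. The hand calculation rounds to 121.69 and 9.79 before multiplying.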