Measures of Spread
www.acoustics.salford.ac.uk
(C) University of Salford, All Rights Reserved

1.0 Aims

To introduce commonly used measures which quantify the amount of spread in measured data.

2.0 Learning Outcomes

You will be able to:
* Define and calculate range, variance, sample variance, standard deviation and sample standard deviation from a set of measured data.
* Describe how the standard deviation influences the normal distribution.

3.0 Introduction

It is important to know not only measures of location, but also to get some indication of the spread of the data. Measures of spread allow us to estimate errors in measured quantities. For example, if we measured the sensitivity of a set of microphones for a manufacturer and found them to give very different values, this would indicate a large variation in build quality. Crucially, we could not be very confident about a single microphone's sensitivity if only the mean value was given. We need information about the spread of the data to get a handle on the likely range of values that this type of microphone provides.

4.0 Variance and standard deviation

For a set of n measurements of a quantity x, the sample variance, $\sigma_{n-1}^2$, and sample standard deviation, $\sigma_{n-1}$, are given by:

$$\sigma_{n-1}^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2, \qquad \sigma_{n-1} = \sqrt{\sigma_{n-1}^2} \tag{1}$$

where $\bar{x}$ is the mean of the data. In spreadsheets these are available as var() and stdev().

Although the range of the data (highest value minus lowest value) gives some idea of the spread, it is unreliable because it depends on only two measurements, whereas the standard deviation uses all of the data.

The sample variance and standard deviation are used in calculating experimental errors. Equation 1 gives the best estimates of the variance and standard deviation, because the sample mean, $\bar{x}$, is only an estimate of the true population mean. Using n-1 in the denominator instead of n accounts for the use of the sample mean rather than the population mean, and results in a better estimate of the values.
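As an illustration of Equation 1, the following is a minimal Python sketch that computes the range, sample variance and sample standard deviation of a list of numbers. The function name spread_measures and the input values are hypothetical, chosen only for illustration. The cross-check uses the standard library's statistics.variance() and statistics.stdev(), which also divide by n-1.

```python
import math
import statistics


def spread_measures(data):
    """Return the range, sample variance and sample standard deviation.

    The sum of squared deviations from the sample mean is divided by
    n - 1, as in Equation 1.
    """
    n = len(data)
    mean = sum(data) / n
    sum_squared_deviations = sum((x - mean) ** 2 for x in data)
    variance = sum_squared_deviations / (n - 1)
    return {
        "range": max(data) - min(data),
        "variance": variance,
        "std": math.sqrt(variance),
    }


# Arbitrary illustrative values, not measurements from the text.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sample = spread_measures(values)
print(sample)

# Cross-check against the standard library, which also divides by n - 1.
assert math.isclose(sample["variance"], statistics.variance(values))
assert math.isclose(sample["std"], statistics.stdev(values))
```

The range here depends only on the values 2 and 9, whereas the variance and standard deviation change if any one of the eight values changes.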
If the population mean, $\mu$, is known, then the variance and standard deviation are better calculated by an alternative equation, with n rather than n-1 in the denominator:

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2, \qquad \sigma = \sqrt{\sigma^2} \tag{2}$$

In most cases the population mean isn't known, so a set of measurements yields the best-estimate statistics of Equation 1.

Standard deviation is the square root of the variance. It has the same units as the original data and for this reason is preferred when dealing with experimental error.

5.0 Confidence limits of single measurements

For a reasonable number of measurements which follow a normal distribution (N > 20, say), 95% of measurements lie within two standard deviations of the mean. Other values are shown in Table 1. We can say that the probability of any ONE measurement being within two standard deviations of the mean is 95%. This gives a measure of spread, and can be used to formulate experimental errors. The quantity $2\sigma_{n-1}$ is known as the 95% confidence limit for a single measurement.

Table 1. Percentage of measurements lying within a certain ± number of standard deviations of the mean

Number of standard deviations    % of measurements
0                                0
±0.5                             38
±1                               68
±1.5                             86.64
±2                               95
±2.5                             98.8
±3                               99.7

6.0 Confidence limits of mean measurements

If we have a mean measurement based on an average of several measurements, then the confidence limit is formed from the standard error. The standard error is given by:

$$\text{standard error} = \frac{\sigma_{n-1}}{\sqrt{n}} \tag{3}$$

The 95% confidence limits in the mean will usually be 2 standard errors.

6.1 Example

The following set of measurements for the length (in mm) of a vibrating metal bar were made:

641 681 659 636 687 672 697 683

a) Calculate the mean and the 95% confidence limits in the mean.

Mean = 669.5
Standard deviation = 22.13
Standard error = 7.82
95% confidence limits for mean = 15.65

For small sets of data, to get 95% confidence limits you shouldn't use exactly 2 standard errors, but look up the appropriate value in the t-distribution table. You can evaluate the correct value using tinv(0.05,N-1) in Excel, where N is the number of measurements and N-1 is the number of degrees of freedom. So in the case above, with N-1 = 7, you should use 2.365 times the standard error, so the answer is actually 18.50.
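The worked example in section 6.1 can be reproduced with a short Python sketch. The variable names are hypothetical. The standard library has no inverse t-distribution, so the multiplier 2.365 is entered by hand as the two-tailed 5% value from a t-table, assuming N-1 = 7 degrees of freedom, the usual convention when the mean is estimated from the same sample.

```python
import math
import statistics

# Bar lengths in mm, from the example in section 6.1.
lengths_mm = [641, 681, 659, 636, 687, 672, 697, 683]

n = len(lengths_mm)
mean = statistics.mean(lengths_mm)
sample_std = statistics.stdev(lengths_mm)   # divides by n - 1
standard_error = sample_std / math.sqrt(n)

# Rule of thumb: 95% confidence limits are 2 standard errors.
limit_2se = 2 * standard_error

# Small-sample correction. Assumption: two-tailed 5% t-value for
# n - 1 = 7 degrees of freedom, read from a standard t-table.
T_95_DF7 = 2.365
limit_t = T_95_DF7 * standard_error

print(f"Mean               = {mean:.1f} mm")
print(f"Standard deviation = {sample_std:.2f} mm")
print(f"Standard error     = {standard_error:.2f} mm")
print(f"95% limits (2 SE)  = {limit_2se:.2f} mm")
print(f"95% limits (t)     = {limit_t:.2f} mm")
```

For a different number of measurements the hand-entered multiplier must be replaced with the t-table value for the matching degrees of freedom.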