Measures of Spread
www.acoustics.salford.ac.uk
(C) University of Salford, All Rights Reserved
1.0 Aims
To introduce commonly used measures which quantify the amount of spread in measured data.
2.0 Learning Outcomes
You will be able to
Define and calculate range, variance, sample variance, standard deviation and sample
standard deviation from a set of measured data.
Describe how the standard deviation influences the normal distribution.
3.0 Introduction
It is important not only to know measures of location, but to get some indication of the spread of data.
Measures of spread lead us to be able to estimate errors in measured quantities. For example, if we
measured the sensitivity of a set of microphones for a manufacturer and found them to give very
different values, this would indicate a great variation in the build quality. Crucially, we could not be
very confident about a single microphone's sensitivity if only the mean value was given. We need
information about the spread of data to get a handle on the likely range of values that this type of
microphone provides.
4.0 Variance and standard deviation
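For a set of n measurements of a quantity x, the sample variance, $\sigma_{n-1}^2$, and sample standard
deviation, $\sigma_{n-1}$, are given by:

$$\sigma_{n-1}^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}, \qquad \sigma_{n-1} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} \qquad (1)$$

where $\bar{x}$ is the mean of the data.
In spreadsheets these are defined as var() and stdev().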
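As a minimal sketch of Equation 1, the Python below computes the sample variance and standard deviation directly from the definition and compares them with the standard library's `statistics.variance` and `statistics.stdev`, which also use the n-1 denominator. The function names and the readings are made up for illustration only.

```python
import math
import statistics

def sample_variance(values):
    """Sample variance: sum of squared deviations from the mean, divided by n-1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_std(values):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical readings (illustrative only, not from the text)
readings = [9.8, 10.1, 10.4, 9.9, 10.3]

print(sample_variance(readings), statistics.variance(readings))
print(sample_std(readings), statistics.stdev(readings))
```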
Although the range of the data (highest value minus lowest value) gives some idea of the spread, it is
unreliable because it relies on only two measurements, rather than on all the data as the standard deviation does.
The sample variance and standard deviation are used in calculating experimental errors. Equation 1
gives the best estimates of the variance and standard deviation, because the sample mean, $\bar{x}$, is only
an estimate of the true population mean. The use of n-1 in the denominator instead of n arises from
using the sample mean rather than the population mean, and results in a better estimate of the values. If the
population mean is known, then the variance and standard deviation are better calculated by an
alternative equation. In most cases the population mean isn’t known, so a set of measurements yields
best-estimate statistics. Standard deviation is the square root of the variance. It has the
same units as the original data and for this reason is preferred when dealing with experimental error.
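The difference between the two denominators can be sketched as follows. The text does not write out the alternative equation; the form assumed here is the standard population variance, that is, the sum of squared deviations from the known population mean divided by n. The readings and the value of `known_mean` are hypothetical.

```python
import statistics

# Hypothetical readings and an assumed known population mean (illustrative only)
readings = [9.8, 10.1, 10.4, 9.9, 10.3]
known_mean = 10.0

# Population mean unknown: deviations from the sample mean, divided by n-1
best_estimate_var = statistics.variance(readings)

# Population mean known: deviations from that mean, divided by n
known_mean_var = statistics.pvariance(readings, mu=known_mean)

print(best_estimate_var, known_mean_var)
```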
5.0 Confidence limits of single measurements
For a reasonable number of measurements which follow a normal distribution (say N > 20),
about 95% of measurements lie within two standard deviations of the mean. Other values are shown
in Table 1. We can say that the probability of any ONE measurement being within two
standard deviations of the mean is 95%. This gives a measure of spread and can be used to
formulate experimental errors. The quantity $2\sigma_{n-1}$, where $\sigma_{n-1}$ is the sample standard
deviation, is known as the 95% confidence limit for a single measurement.
Table 1: Percentage of measurements lying within a given number of standard deviations of the mean.

Number of standard deviations    % of measurements
0                                0
±0.5                             38
±1                               68
±1.5                             86.64
±2                               95
±2.5                             98.8
±3                               99.7
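The percentages in Table 1 can be checked numerically. For a normal distribution, the fraction of values within ±k standard deviations of the mean is erf(k/√2); the sketch below evaluates this with Python's `math.erf`. The helper name `fraction_within` is introduced here for illustration.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within ±k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (0, 0.5, 1, 1.5, 2, 2.5, 3):
    print(f"±{k}: {100 * fraction_within(k):.2f}%")
```

The computed values agree with the table to within rounding; ±2 standard deviations, for instance, covers slightly more than 95%.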
6.0 Confidence limits of mean measurements
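If we have a mean measurement based on an average of several measurements, then the confidence
limit is formed from the standard error. The standard error is given by:

$$\text{standard error} = \frac{\sigma_{n-1}}{\sqrt{n}} \qquad (3)$$

where $\sigma_{n-1}$ is the sample standard deviation and n is the number of measurements.
The 95% confidence limits in the mean will usually be ±2 standard errors.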
6.1 Example
The following measurements of the length (in mm) of a vibrating metal bar were made:
641, 681, 659, 636, 687, 672, 697, 683
a) Calculate the mean and the 95% confidence limits in the mean.
Mean = 669.5
Standard deviation = 22.13
Standard error = 7.82
95% confidence limits for mean = 15.65
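The figures in this example can be reproduced with a short Python sketch. It uses the eight bar lengths given above, takes the standard error as the sample standard deviation divided by the square root of the number of measurements, and applies the approximate factor of 2. The variable names are illustrative.

```python
import math
import statistics

# Bar lengths in mm, from the example above
lengths = [641, 681, 659, 636, 687, 672, 697, 683]

mean = statistics.mean(lengths)
std_dev = statistics.stdev(lengths)            # sample standard deviation (n-1)
std_error = std_dev / math.sqrt(len(lengths))  # standard error of the mean
limit_95 = 2 * std_error                       # approximate 95% confidence limit

print(f"Mean = {mean:.1f}")
print(f"Standard deviation = {std_dev:.2f}")
print(f"Standard error = {std_error:.2f}")
print(f"95% confidence limits for mean = ±{limit_95:.2f}")
```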
For small sets of data, you shouldn’t use exactly 2 standard errors to get 95% confidence limits,
but look up the appropriate value in the t-distribution table. You can evaluate the correct
value using tinv(0.05,N-1) in Excel, where N-1 is the number of degrees of freedom. So in the case
above (N = 8), you should use about 2.36x the standard error, so the answer is actually 18.50.
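For readers without a t-table to hand, the multiplier can be sketched in Python. The standard library has no inverse t-distribution, so the hypothetical helper `t_multiplier` below integrates the t density numerically and solves for the two-sided 5% point by bisection; it is an illustration under these assumptions, not a replacement for a statistics package. It assumes the usual n-1 degrees of freedom for a mean of n measurements (7 here), which gives a multiplier of about 2.36; 8 degrees of freedom would give about 2.31.

```python
import math
import statistics

def t_pdf(t, dof):
    """Probability density of Student's t-distribution with `dof` degrees of freedom."""
    coeff = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coeff * (1 + t * t / dof) ** (-(dof + 1) / 2)

def two_sided_tail(t, dof, steps=1000):
    """Probability that |T| exceeds t, using Simpson's rule on the density from 0 to t."""
    h = t / steps
    total = t_pdf(0, dof) + t_pdf(t, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central = 2 * total * h / 3  # probability of -t < T < t
    return 1 - central

def t_multiplier(alpha, dof):
    """Two-sided critical value: the t for which P(|T| > t) = alpha, found by bisection."""
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if two_sided_tail(mid, dof) > alpha:
            low = mid   # tail still too large, so the critical value is higher
        else:
            high = mid
    return (low + high) / 2

# Bar lengths in mm, from the example in section 6.1
lengths = [641, 681, 659, 636, 687, 672, 697, 683]
n = len(lengths)
std_error = statistics.stdev(lengths) / math.sqrt(n)

multiplier = t_multiplier(0.05, n - 1)  # assumes n-1 = 7 degrees of freedom
print(f"t multiplier = {multiplier:.3f}")
print(f"95% confidence limits for mean = ±{multiplier * std_error:.2f}")
```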