Measures of Dispersion
A measure of central tendency is of limited use if we do not have some knowledge of
how representative it is. Measures of Dispersion such as the Range, Inter-Quartile
Range, the Coefficient of Mean Deviation, the Standard Deviation etc., give a measure of
how the data is spread about the centre.
The Range:
This is simply the difference between the largest and smallest item of data. It is easy to
find and, where there are no extreme values, reasonably useful. However, it cannot be used for
further analysis.
The Inter-Quartile Range:
This eliminates the problem of extreme values as it is the difference between the upper
quartile, the value above which 25% of the data lies, and the lower quartile, below which
25% of the data lies. However it also cannot be used for further analysis. It is very often
used with the median.
$$\text{Lower Quartile} = L + \frac{\left(\frac{n}{4} - \text{total up to quartile interval}\right) \times \text{Class Width}}{\text{Number in quartile interval}}$$
where the quartile interval is the one in which the n/4 item lies and L is the lower boundary of that interval.
$$\text{Upper Quartile} = L + \frac{\left(\frac{3n}{4} - \text{total up to quartile interval}\right) \times \text{Class Width}}{\text{Number in quartile interval}}$$
where the quartile interval is the one in which the 3n/4 item lies.
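The interpolation for the quartiles can be sketched in Python. This is only an illustration: the function name `grouped_quartile` and the frequency table are hypothetical and do not come from the notes, and L is taken to be the lower boundary of the quartile interval.

```python
def grouped_quartile(boundaries, freqs, fraction):
    """Interpolate a quartile from a grouped frequency table.

    boundaries: class boundaries (one more entry than there are classes).
    freqs: number of items in each class.
    fraction: 0.25 for the lower quartile, 0.75 for the upper quartile.
    """
    n = sum(freqs)
    target = fraction * n      # position of the n/4 or 3n/4 item
    cumulative = 0             # total up to the current interval
    for i, f in enumerate(freqs):
        if f > 0 and cumulative + f >= target:
            lower = boundaries[i]                      # L in the formula
            width = boundaries[i + 1] - boundaries[i]  # class width
            return lower + (target - cumulative) * width / f
        cumulative += f
    raise ValueError("fraction must be between 0 and 1")


# Hypothetical data: 40 items in four classes of width 10.
boundaries = [0, 10, 20, 30, 40]
freqs = [4, 12, 16, 8]

q1 = grouped_quartile(boundaries, freqs, 0.25)
q3 = grouped_quartile(boundaries, freqs, 0.75)
print("Lower quartile:", q1)
print("Upper quartile:", q3)
print("Inter-quartile range:", q3 - q1)
```

For this made-up table the n/4 item (the 10th) falls in the 10 to 20 class and the 3n/4 item (the 30th) falls in the 20 to 30 class, so each quartile is interpolated within its own interval.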
The Standard Deviation:
This is the most useful and widely used measure of dispersion although it is somewhat
more difficult to find and understand than any of the other measures. It is always used
with the mean. Approximately two thirds of the data lies within one standard deviation either side
of the mean.
$$\text{Standard Deviation} = \sigma = \sqrt{\frac{\sum f x^2}{\sum f} - \bar{x}^2}$$
There is also another formula:
$$\sigma = \sqrt{\frac{\sum f (x - \bar{x})^2}{\sum f}} = \sqrt{\frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2}$$
However, the first is easier to calculate.
Walter Fleming
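As a rough check that the two standard deviation formulas agree, the sketch below evaluates both on a small frequency table. The function name `grouped_std` and the data are hypothetical, chosen only for illustration.

```python
from math import sqrt


def grouped_std(values, freqs):
    """Return the mean and the standard deviation computed two ways."""
    total_f = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / total_f

    # First formula: mean of the squares minus the square of the mean.
    mean_of_squares = sum(f * x * x for x, f in zip(values, freqs)) / total_f
    shortcut = sqrt(mean_of_squares - mean ** 2)

    # Other formula: average squared deviation from the mean.
    squared_devs = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    definition = sqrt(squared_devs / total_f)

    return mean, shortcut, definition


# Hypothetical frequency table: value x occurs f times.
values = [1, 2, 3, 4, 5]
freqs = [1, 2, 4, 2, 1]

mean, shortcut, definition = grouped_std(values, freqs)
print(f"mean = {mean}, sigma = {shortcut:.4f} (shortcut), {definition:.4f} (definition)")
```

Both routes give the same value apart from floating-point rounding; the first needs only the running totals of f, fx and fx², which is why it is easier to calculate by hand.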
Confidence Intervals
When we get a mean for a sample we often need to find out how close it is to the
unknown mean of the population. A confidence interval gives us a range of values within
which we can say the population mean lies, with a specified probability (normally 95% or
99%) that we are correct.
95% Confidence Interval = $\bar{X} \pm 1.96 \times \text{Standard Error}$
where the Standard Error is $\frac{\sigma}{\sqrt{n}}$, $\sigma$ is the standard deviation, and $n$ is the size of the sample.
Note: Use 2.58 for 99% level
Example:
In a survey of 500 students, the mean cost of text books was found to be €58 with a
standard deviation of €15. Find the 95% confidence limits for the mean cost of text
books for all students.
Confidence Limits = $58 \pm 1.96 \left( \frac{15}{\sqrt{500}} \right) = 58 \pm 1.31$
So there is a 95% probability that the mean cost for all students lies between €56.69 and
€59.31
The 99% confidence interval is $58 \pm 2.58 \left( \frac{15}{\sqrt{500}} \right) = 58 \pm 1.73$
i.e. there is a 99% probability that the mean cost for all students lies between €56.27 and
€59.73
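The arithmetic in this example can be reproduced with a short Python sketch. The helper name `mean_confidence_interval` is hypothetical; the multipliers 1.96 and 2.58 and the survey figures are the ones used in the notes.

```python
from math import sqrt


def mean_confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) confidence limits for a population mean."""
    standard_error = sd / sqrt(n)   # sigma divided by the root of n
    margin = z * standard_error
    return mean - margin, mean + margin


# Text book survey: 500 students, mean 58 euro, standard deviation 15 euro.
low95, high95 = mean_confidence_interval(58, 15, 500)
low99, high99 = mean_confidence_interval(58, 15, 500, z=2.58)

print(f"95%: {low95:.2f} to {high95:.2f}")
print(f"99%: {low99:.2f} to {high99:.2f}")
```

Rounded to two decimal places, this gives the same limits as the worked example.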
Confidence Intervals for Proportions
Instead of finding a mean and standard deviation, samples are often used to find what
proportion of a population has some characteristic. For example, a sample of 200
students is surveyed and it is found that 80 of them support abolishing the points system
in the Leaving Cert. We would like to know what proportion of all students hold this
opinion. Once again we cannot find the exact figure but we can find a 95% or a 99%
confidence interval in which we are confident this proportion lies.
A proportion is always represented as a decimal. In this case 80 out of 200 is 80/200
which is 0.4.
If the figure is given as a percentage, convert it to a decimal by dividing by 100. For
example, if the percentage of students supporting abolition was given as 34%, this is
0.34 as a decimal (proportion).
95% Confidence Interval for a proportion = $p \pm 1.96 \sqrt{\frac{p(1 - p)}{n}}$
where p is the proportion and n is the size of the sample
In the above example:
p= 80/200 = 0.4
1- p = 1 – 0.4 = 0.6
n = 200
Confidence Interval = $0.4 \pm 1.96 \sqrt{\frac{(0.4)(0.6)}{200}} = 0.4 \pm 0.07$
We are 95% confident that the true proportion lies between 0.33 and 0.47.
As percentages are often easier to understand, you can convert proportions to percents by
multiplying them by 100. So in this case the percentage of all students that support
abolishing the points system lies between 33% and 47%.
As with the means, the 99% interval is found by replacing 1.96 with 2.58.
99% Confidence Interval for a proportion = $p \pm 2.58 \sqrt{\frac{p(1 - p)}{n}}$
99% Confidence Interval = $0.4 \pm 2.58 \sqrt{\frac{(0.4)(0.6)}{200}} = 0.4 \pm 0.09$
We are 99% confident that the proportion is between 0.31 and 0.49 or in percent, between
31% and 49%.
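The proportion intervals can be checked in the same way. This is a minimal sketch; the function name `proportion_confidence_interval` is hypothetical, and the figures are those of the points system survey above.

```python
from math import sqrt


def proportion_confidence_interval(count, n, z=1.96):
    """Return (lower, upper) confidence limits for a population proportion."""
    p = count / n                        # sample proportion as a decimal
    margin = z * sqrt(p * (1 - p) / n)   # z times the standard error of p
    return p - margin, p + margin


# 80 of the 200 students surveyed support abolishing the points system.
low95, high95 = proportion_confidence_interval(80, 200)
low99, high99 = proportion_confidence_interval(80, 200, z=2.58)

print(f"95%: {low95:.2f} to {high95:.2f}")
print(f"99%: {low99:.2f} to {high99:.2f}")
```

Multiplying the limits by 100 gives the same intervals as percentages.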