Dispersion - Statistics Notes

Measures of Dispersion
Measures of central tendency do not provide complete information about the composition
and character of a series. To get a comprehensive picture of the series, we also use
measures of dispersion.
In the words of Spiegel, “the degree to which numerical data tend to spread about an
average value is called dispersion of the data.”
Objectives of measures of dispersion
To know about the composition of the series
To compare the disparity between two or more series
Absolute and relative measures of dispersion
Absolute measure – these measures are expressed in the units of the data, so they can be
used for comparison only if all the values are expressed in the same units.
Relative measure – these are calculated by dividing the absolute measure by a measure of
central tendency. They are also known as coefficients of dispersion and help in
comparing series expressed in different units.
Absolute measures        Relative measures
Range                    Coefficient of range
Quartile deviation       Coefficient of quartile deviation
Mean deviation           Coefficient of mean deviation
Standard deviation       Coefficient of standard deviation
Range – the difference between the largest value (L) and the smallest value (S).
Range = L – S
Coefficient of range = (L – S) / (L + S)
(Note: in a discrete series, range is the difference between the highest and lowest values
of the variable, not of the frequencies. In a continuous series, it is the difference
between the highest upper limit and the lowest lower limit.)
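The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `value_range` is made up for the example, and the data are the values of the first practice question at the end of these notes.

```python
def value_range(values):
    """Return (range, coefficient of range) for a list of numbers."""
    largest, smallest = max(values), min(values)
    rng = largest - smallest                 # Range = L - S
    coeff = rng / (largest + smallest)       # (L - S) / (L + S)
    return rng, coeff

data = [23, 56, 78, 98, 56, 7]
rng, coeff = value_range(data)
print(rng, round(coeff, 4))   # 91 0.8667
```

For a continuous series the same function can be given the lowest lower limit and the highest upper limit, for example `value_range([10, 50])` for classes running from 10-20 up to 40-50.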
Merits and demerits of range
Merits
It is simple to calculate and understand
It can be used in statistical series relating to quality control in production; control
charts are prepared on the basis of range.
Demerits
It is an unstable measure of dispersion because it depends only on the extreme values of the series.
It is not based on all values in the series.
It cannot be used in the case of open-ended classes.
Quartile deviation and its coefficient
Quartile deviation is half of the difference between the third quartile and the first
quartile.
Q.D = (Q3 – Q1) / 2
For the calculation of Q3 and Q1, refer to the previous chapter.
Coefficient of quartile deviation = (Q3 – Q1) / (Q3 + Q1)
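A short Python sketch of the quartile deviation and its coefficient follows. The notes do not show how Q1 and Q3 are located, so the example assumes the common textbook rule that Q1 is the (N + 1)/4-th item of the sorted series; the standard library's `statistics.quantiles` with `method="exclusive"` follows that rule. The data are hypothetical.

```python
from statistics import quantiles

def quartile_deviation(values):
    """Return (Q.D, coefficient of Q.D) for an individual series."""
    # Assumption: quartiles sit at the (N + 1)/4-th and 3(N + 1)/4-th items
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    qd = (q3 - q1) / 2                 # Q.D = (Q3 - Q1) / 2
    coeff = (q3 - q1) / (q3 + q1)      # (Q3 - Q1) / (Q3 + Q1)
    return qd, coeff

data = [10, 20, 30, 40, 50, 60, 70]    # hypothetical values
qd, coeff = quartile_deviation(data)
print(qd, coeff)   # 20.0 0.5
```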
Merits and demerits of quartile deviation
Merits
It is simple to calculate and understand
Less affected by extreme values.
Demerits
It is not based on all the values of the series.
Mean deviation
Mean deviation is the arithmetic average of the absolute deviations of all the values from
an average (mean or median) of the series, i.e. the signs of the deviations are ignored.
Calculation of mean deviation in individual series
Steps
• Find the average of the series (arithmetic mean or median).
• Find the absolute value of the deviation of each value from that average (mean
or median) and sum them up.
• Divide the sum of the absolute deviations by the number of observations.
i.e. MD = ∑│d│ / N
d = X – A.M or Median
N = number of observations
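The steps for an individual series can be sketched as below. The function name and the sample values are hypothetical; the `center` argument simply selects whether deviations are taken from the arithmetic mean or from the median.

```python
from statistics import mean, median

def mean_deviation(values, center="mean"):
    """MD = sum(|d|) / N, with d measured from the mean or the median."""
    avg = mean(values) if center == "mean" else median(values)
    return sum(abs(x - avg) for x in values) / len(values)

data = [4, 6, 8, 10, 22]                       # hypothetical values
md_mean = mean_deviation(data, "mean")         # deviations from A.M = 10
md_median = mean_deviation(data, "median")     # deviations from median = 8
print(md_mean, md_median)   # 4.8 4.4
```

The two results differ because the series is skewed by the value 22, which pulls the mean away from the median.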
Calculation of MD in discrete series
Steps
• Find the average of the series (arithmetic mean or median).
• Find the absolute value of the deviation of each value from the
average (arithmetic mean or median).
• Multiply each absolute deviation by its frequency and find the sum.
• Divide the sum by the sum of the frequencies.
i.e. MD = ∑f│d│ / ∑f
d = X – A.M or Median
∑f = total frequency
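For a discrete series the only change is the weighting by frequency. The sketch below takes deviations from the arithmetic mean and uses the wages and workers figures from the practice questions at the end of these notes; the function name is made up for the example.

```python
def mean_deviation_discrete(values, freqs):
    """Return (A.M, MD) where MD = sum(f * |d|) / sum(f), d = X - A.M."""
    total_f = sum(freqs)
    am = sum(f * x for x, f in zip(values, freqs)) / total_f
    md = sum(f * abs(x - am) for x, f in zip(values, freqs)) / total_f
    return am, md

wages = [25, 50, 75, 80, 85, 90]
workers = [4, 6, 9, 3, 2, 1]
am, md = mean_deviation_discrete(wages, workers)
print(am, md)   # 63.0 18.4
```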
Calculation of MD in continuous series
Steps
• Find the mid value of each class.
• Find the average of the series (arithmetic mean or median).
• Find the absolute value of the deviation of each mid value from the
average (arithmetic mean or median).
• Multiply each absolute deviation by its frequency and find the sum.
• Divide the sum by the sum of the frequencies.
i.e. MD = ∑f│d│ / ∑f
d = X – A.M or Median
X = mid value
∑f = total frequency
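In a continuous series each class is first replaced by its mid value, after which the calculation is the same as for a discrete series. A minimal sketch, assuming deviations are taken from the arithmetic mean and using the income classes from the practice questions (the function name is hypothetical):

```python
def mean_deviation_continuous(classes, freqs):
    """Return (mid values, A.M, MD) for class intervals given as (lower, upper)."""
    mids = [(lower + upper) / 2 for lower, upper in classes]   # X = mid value
    total_f = sum(freqs)
    am = sum(f * x for x, f in zip(mids, freqs)) / total_f
    md = sum(f * abs(x - am) for x, f in zip(mids, freqs)) / total_f
    return mids, am, md

classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
families = [46, 54, 42, 30]
mids, am, md = mean_deviation_continuous(classes, families)
print(mids, round(am, 2), round(md, 2))
```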
Coefficient of mean deviation
To find the coefficient of mean deviation, divide the mean deviation by the arithmetic
mean if the deviations are taken from the AM. If the deviations are taken from the median,
divide the MD by the median.
Coefficient of mean deviation = MD / AM (deviations taken from the AM)
or
Coefficient of mean deviation = MD / Median (deviations taken from the median)
Merits and demerits of mean deviation
Merits
It is simple to calculate
It is based on all observations
Less affected by extreme values
Demerits
Suffers from inaccuracy since the signs are ignored.
Not capable of further algebraic treatment
Standard deviation
SD is the most scientific method of calculating dispersion.
Standard deviation is defined as the square root of the arithmetic mean of the squares of
deviations of the items from their mean value.
Calculation of standard deviation in individual series
Standard deviation in individual series can be calculated using the following methods.
Direct method
Steps
• Calculate the arithmetic mean of the series.
• Find the deviations of the given values from the arithmetic mean of the series (d).
• Square each deviation and sum them up (∑d²).
• Divide the sum of the squared deviations by the number of items and find the square root of it.
SD = √(∑d² / N)
d = x – actual arithmetic mean
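The direct method follows the definition literally: the square root of the mean of the squared deviations from the arithmetic mean, SD = √(∑d² / N). A small sketch with hypothetical data (the function name is also made up):

```python
from math import sqrt

def sd_direct(values):
    """Standard deviation by the direct method, dividing by N."""
    n = len(values)
    am = sum(values) / n                          # actual arithmetic mean
    sum_d2 = sum((x - am) ** 2 for x in values)   # sum of squared deviations
    return sqrt(sum_d2 / n)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical values, mean = 5
sd = sd_direct(data)
print(sd)   # 2.0
```

Because the divisor is N, this corresponds to `statistics.pstdev` in the Python standard library rather than to `statistics.stdev`, which divides by N - 1.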
Short cut method
• Take one of the given items as the assumed mean (A).
• Find the deviations of the given values from the assumed mean (d).
• Square each deviation and sum them up (∑d²).
• Apply the formula
SD = √(∑d² / N – (∑d / N)²)
d = x – A (assumed mean)
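The short cut formula SD = √(∑d² / N – (∑d / N)²) gives the same result whatever value is chosen as the assumed mean, because the second term corrects for the difference between A and the actual mean. The sketch below checks this on hypothetical data; the function name is illustrative.

```python
from math import sqrt

def sd_shortcut(values, assumed_mean):
    """Standard deviation from deviations about an assumed mean A."""
    n = len(values)
    d = [x - assumed_mean for x in values]     # d = x - A
    sum_d = sum(d)
    sum_d2 = sum(di * di for di in d)
    return sqrt(sum_d2 / n - (sum_d / n) ** 2)

data = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical values, actual mean = 5
sd_a4 = sd_shortcut(data, 4)         # A = 4: sum d = 8, sum d^2 = 40
sd_a7 = sd_shortcut(data, 7)         # a different assumed mean
print(sd_a4, sd_a7)   # 2.0 2.0
```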
Calculation of standard deviation in discrete series
Direct method
Steps
• Calculate the arithmetic mean of the series.
• Find the deviations of the given values from the arithmetic mean of the series (d).
• Square each deviation (d²).
• Multiply d² by the frequency and sum them up (∑fd²).
• Apply the formula.
SD = √(∑fd² / ∑f)
d = x – actual arithmetic mean
Short cut method
• Take one of the given items as the assumed mean (A).
• Find the deviations of the given values from the assumed mean (d).
• Square each deviation (d²).
• Multiply d² by the frequency and sum them up (∑fd²).
• Apply the formula
SD = √(∑fd² / ∑f – (∑fd / ∑f)²)
d = x – A (assumed mean)
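For a discrete series, the direct and short cut methods can be compared side by side. The sketch uses the wages and workers figures from the practice questions; the function names and the choice A = 75 are only for illustration.

```python
from math import sqrt

def sd_discrete_direct(values, freqs):
    total_f = sum(freqs)
    am = sum(f * x for x, f in zip(values, freqs)) / total_f
    sum_fd2 = sum(f * (x - am) ** 2 for x, f in zip(values, freqs))
    return sqrt(sum_fd2 / total_f)              # sqrt(sum fd^2 / sum f)

def sd_discrete_shortcut(values, freqs, assumed_mean):
    total_f = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    sum_fd2 = sum(f * (x - assumed_mean) ** 2 for x, f in zip(values, freqs))
    return sqrt(sum_fd2 / total_f - (sum_fd / total_f) ** 2)

wages = [25, 50, 75, 80, 85, 90]
workers = [4, 6, 9, 3, 2, 1]
sd_direct_value = sd_discrete_direct(wages, workers)          # mean = 63
sd_shortcut_value = sd_discrete_shortcut(wages, workers, 75)  # A = 75
print(round(sd_direct_value, 2), round(sd_shortcut_value, 2))   # 20.64 20.64
```

With A = 75 the intermediate sums are ∑fd = –300 and ∑fd² = 14250, so the quantity under the root is 570 – 144 = 426, the same as ∑fd² / ∑f = 10650 / 25 in the direct method.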
Step deviation method
• Take one of the given items as the assumed mean (A).
• Find the deviations of the given values from the assumed mean (d) and
divide them by a common factor (c) to get the step deviations (d’).
• Square each step deviation (d’²).
• Multiply d’² by the frequency and sum them up (∑fd’²).
• Apply the formula
SD = √(∑fd’² / ∑f – (∑fd’ / ∑f)²) × c
d’ = (x – A) / c
A = assumed mean
c = common factor
Calculation of standard deviation in continuous series
Direct method
Steps
• Calculate the arithmetic mean of the series.
• Find the deviations of the mid points from the arithmetic mean of the series (d).
• Square each deviation (d²).
• Multiply d² by the corresponding frequency and sum them up (∑fd²).
• Apply the formula.
SD = √(∑fd² / ∑f)
d = x – actual arithmetic mean
x = mid value
Short cut method
• Take one of the mid values as the assumed mean (A).
• Find the deviations of the mid points from the assumed mean (d).
• Square each deviation (d²).
• Multiply d² by the corresponding frequency and sum them up (∑fd²).
• Apply the formula
SD = √(∑fd² / ∑f – (∑fd / ∑f)²)
d = x – A (assumed mean)
x = mid value
Step deviation method
• Take one of the mid values as the assumed mean (A).
• Find the deviations of the mid points from the assumed mean (d),
divide them by a common factor (c) and denote the result by d’.
• Square each step deviation (d’²).
• Multiply d’² by the corresponding frequency and sum them up (∑fd’²).
• Apply the formula
SD = √(∑fd’² / ∑f – (∑fd’ / ∑f)²) × c
d’ = (x – A) / c
x = mid value
c = common factor
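The step deviation method is convenient when the class width is constant, since dividing by c turns the deviations into small whole numbers. The sketch below uses the income classes from the practice questions with the illustrative choices A = 25 and c = 10, and compares the result with the direct method; the function names are hypothetical.

```python
from math import sqrt

def sd_step_deviation(classes, freqs, assumed_mean, c):
    mids = [(lower + upper) / 2 for lower, upper in classes]
    total_f = sum(freqs)
    d_step = [(x - assumed_mean) / c for x in mids]        # d' = (x - A) / c
    sum_fd = sum(f * d for d, f in zip(d_step, freqs))
    sum_fd2 = sum(f * d * d for d, f in zip(d_step, freqs))
    return sqrt(sum_fd2 / total_f - (sum_fd / total_f) ** 2) * c

def sd_continuous_direct(classes, freqs):
    mids = [(lower + upper) / 2 for lower, upper in classes]
    total_f = sum(freqs)
    am = sum(f * x for x, f in zip(mids, freqs)) / total_f
    return sqrt(sum(f * (x - am) ** 2 for x, f in zip(mids, freqs)) / total_f)

classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
families = [46, 54, 42, 30]
sd_step = sd_step_deviation(classes, families, assumed_mean=25, c=10)
sd_check = sd_continuous_direct(classes, families)
print(round(sd_step, 2), round(sd_check, 2))
```

Here d’ takes the values –1, 0, 1 and 2, so ∑fd’ = 56 and ∑fd’² = 208 can be worked out by hand before the final multiplication by c.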
Coefficient of variation
This relative measure is used for comparing the dispersion of two or more series.
Coefficient of variation = (SD / AM) × 100 (standard deviation divided by arithmetic mean, multiplied by 100)
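A brief sketch of why the coefficient of variation is useful for comparison: the two hypothetical series below have the same standard deviation but very different means, so their relative dispersion differs. `statistics.pstdev` is used because it divides by N, as the SD formulas in these notes do.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """CV = SD / AM * 100, with SD computed using N as the divisor."""
    return 100 * pstdev(values) / mean(values)

series_a = [2, 4, 4, 4, 5, 5, 7, 9]                   # SD = 2, mean = 5
series_b = [102, 104, 104, 104, 105, 105, 107, 109]   # SD = 2, mean = 105
cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)
print(round(cv_a, 2), round(cv_b, 2))   # 40.0 1.9
```

The series with the higher coefficient of variation (series_a here) is the more variable one relative to its average.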
Merits and demerits of standard deviation
Merits
• It is based on all values
• Capable of further statistical treatment
• Less affected by changes in samples
Demerits
• Can be used only by experts
• Extreme values tend to get greater importance.
Construction of Lorenz curve
• Find the cumulative totals of the given values and convert them into percentages (each
cumulative value divided by the total, multiplied by 100).
• Plot the cumulative percentages of frequencies on the X-axis of a graph and the
cumulative percentages of the variable on the Y-axis.
• Draw a line joining (0,0) and (100,100) to get the line of equality.
• Plot the actual data on the graph, join the points with a smooth freehand curve and
extend it to the points (0,0) and (100,100).
The curve of the actual distribution is called the Lorenz curve. The gap between the line of
equality and the actual curve shows the variation in the distribution.
The larger the gap, the greater the dispersion, and vice versa.
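The plotting points for a Lorenz curve can be prepared as sketched below. This follows the steps as worded above (cumulate the variable and the frequencies separately, then convert each to percentages); the function name and the data are hypothetical, and the actual drawing is left to any plotting tool or graph paper.

```python
from itertools import accumulate

def lorenz_points(values, freqs):
    """Return (x, y) points: cumulative % of frequencies vs cumulative % of values."""
    cum_f = list(accumulate(freqs))
    cum_v = list(accumulate(values))
    x = [100 * c / cum_f[-1] for c in cum_f]   # X-axis: cumulative % of frequencies
    y = [100 * c / cum_v[-1] for c in cum_v]   # Y-axis: cumulative % of the variable
    return [(0.0, 0.0)] + list(zip(x, y))      # the curve starts at the origin

values = [100, 200, 300, 400]    # hypothetical values of the variable
freqs = [40, 30, 20, 10]         # hypothetical frequencies
points = lorenz_points(values, freqs)
print(points)
# [(0.0, 0.0), (40.0, 10.0), (70.0, 30.0), (90.0, 60.0), (100.0, 100.0)]
```

Every point lies below the line of equality (y = x), and the further the points fall below it, the greater the dispersion.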
Find the range from the following
1. 23, 56, 78, 98, 56, 7
Ans = 91
2. 23, 56, 78, 98, 56, 100
Ans = 99
Further values printed with items 1 and 2 (row positions unclear): 34, 34, 90, 1, 34, 34, 87, 87, 34, 34, 56
3.
Wages      25   50   75   80   85   90
Workers     4    6    9    3    2    1
Ans = 65
4.
Income     10-20   20-30   30-40   40-50
Families      46      54      42      30
Ans = 40