Mean deviation

Transcript
Mathematics
Statistics
Session Objectives
1. Introduction
2. Mean deviation from the mean
3. Mean deviation from the median
4. Variance and standard deviation
5. Short-cut methods to find out the mean and
standard deviation
Introduction
We will study this topic based on your
knowledge of the earlier classes. This
includes knowledge of data
representation and measures of
central tendency — mean, median
and mode.
Mean of a continuous frequency distribution is given by
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i, \quad \text{where } N = \sum_{i=1}^{n} f_i$$
Median of a continuous frequency
distribution is given by
$$\text{Median} = l + \left(\frac{\frac{N}{2} - F}{f}\right) \times h$$
where $l$ = Lower limit of median class
$f$ = Frequency of median class
$h$ = Width of median class
$F$ = Cumulative frequency of the class
preceding median class
$N = \sum f_i$
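The median formula can be sketched in Python as follows. This is only a minimal illustration, assuming equal-width classes listed in increasing order; the function name `grouped_median` and the sample data are hypothetical.

```python
def grouped_median(lower_limits, width, freqs):
    """Median = l + ((N/2 - F) / f) * h for equal-width classes."""
    n_total = sum(freqs)
    if n_total <= 0:
        raise ValueError("total frequency must be positive")
    cumulative = 0  # F: cumulative frequency of the classes already passed
    for lower, f in zip(lower_limits, freqs):
        # The median class is the first one whose cumulative frequency
        # is just greater than N/2.
        if cumulative + f > n_total / 2:
            return lower + (n_total / 2 - cumulative) / f * width
        cumulative += f

# Hypothetical classes 0-10, 10-20, 20-30 with frequencies 4, 8, 8.
print(grouped_median([0, 10, 20], 10, [4, 8, 8]))  # 17.5
```

Here N/2 = 10, the median class is 10-20 (cumulative frequency 12), and the median is 10 + (10 - 4)/8 × 10 = 17.5.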
Mean Deviation from the Mean
Let us first understand what ‘mean
deviation’ is. Mean deviation is the
mean of the absolute deviations of a set
of observations, taken from a definite
central value (can be mean, median or
anything else).
The keyword to note in the above definition
is ‘absolute’ — only the numerical value of
the deviation is to be taken, ignoring the
sign.
Mean deviation from the mean
for raw data (unclassified):
In this case, the mean deviation
from the mean for a set of n
observations is given by
$$\text{M.D.}(\bar{x}) = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$$
Mean deviation from the mean
for grouped data (classified):
In this case, if the $x_i$ are the mid-points of classes with frequencies
$f_i$, then the mean deviation from
the mean is given by
$$\text{M.D.}(\bar{x}) = \frac{\sum_{i=1}^{n} f_i |x_i - \bar{x}|}{\sum_{i=1}^{n} f_i}$$
Mean Deviation from the Median
The only difference here is that the
mean is replaced by the value of
the median.
Mean deviation from the median
for raw data (unclassified):
In this case, the mean deviation from
the median for a set of n
observations is given by
$$\text{M.D.} = \frac{1}{n}\sum_{i=1}^{n} |x_i - \text{Median}|$$
Mean deviation from the median
for grouped data (classified):
In this case, if the $x_i$ are the
mid-points of classes with
frequencies $f_i$, then the mean
deviation from the median is
given by
$$\text{M.D.} = \frac{\sum_{i=1}^{n} f_i |x_i - \text{Median}|}{\sum_{i=1}^{n} f_i}$$
Variance and Standard Deviation
The variance of a set of observations
($x_i$) is the mean of the squares of the
deviations from the mean of the
observations ($\bar{x}$). The variance is
usually denoted by Var(X) or $\sigma^2$.
If you now look at the definition above,
there are three parts to it. So, for raw data
consisting of n observations:
(i) Deviations from the mean of the observations: $(x_i - \bar{x})$
(ii) Squares of the deviations from the mean: $(x_i - \bar{x})^2$
(iii) Mean of the squares of the deviations from the mean: $\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$
Standard deviation ($\sigma$) is defined as
the positive square root of the variance.
The values of the variance and standard
deviation for grouped data are given by
$$\text{Variance, } \sigma^2 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2}{\sum_{i=1}^{n} f_i}$$
and
$$\text{Standard deviation (S.D.), } \sigma = \sqrt{\frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2}{\sum_{i=1}^{n} f_i}}$$
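A direct translation of the grouped-data definition into Python could look like this. It is a sketch only; the function name `grouped_variance` and the sample data are assumptions.

```python
import math

def grouped_variance(midpoints, freqs):
    """Variance: mean of the squared deviations from the mean."""
    n_total = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n_total
    return sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n_total

# Hypothetical mid-points 5, 15, 25 with frequencies 1, 2, 1 (mean 15).
variance = grouped_variance([5, 15, 25], [1, 2, 1])
std_dev = math.sqrt(variance)  # positive square root of the variance
print(variance)           # 50.0
print(round(std_dev, 2))  # 7.07
```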
Short-cut Method to Find Out Mean ($\bar{x}$)
and Variance ($\sigma^2$)
In order to reduce the calculations involved in
finding out the values of mean and variance for
grouped data, the following algorithms can be
used.
Algorithm for finding out the mean ($\bar{x}$) for grouped data:
1. Write down the frequency table with a column giving
the class-marks (mid-points of class intervals).
2. Choose a number ‘A’ (usually the middle or almost
middle value of all the $x_i$) and take deviations
$d_i = x_i - A$ about A.
3. Divide each deviation by the class width h.
Hence you get $u_i = \frac{d_i}{h}$.
4. Multiply the frequencies ($f_i$) with the
corresponding $u_i$. Calculate the sum $\sum f_i u_i$.
5. Find the sum of all frequencies, $\sum_{i=1}^{n} f_i = N$.
6. Use the formula $\bar{x} = A + h\left(\frac{1}{N}\sum_{i=1}^{n} f_i u_i\right)$.
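The steps of the short-cut method for the mean can be sketched in Python as below. The function name `shortcut_mean`, its parameter names, and the sample table are assumptions for illustration; the arithmetic follows the six steps listed above.

```python
def shortcut_mean(midpoints, freqs, assumed_mean, width):
    """Mean via step deviations: A + h * (sum(f_i * u_i) / N)."""
    n_total = sum(freqs)                                # step 5: N
    u = [(x - assumed_mean) / width for x in midpoints]  # steps 2-3: u_i
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))      # step 4
    return assumed_mean + width * sum_fu / n_total       # step 6

# Hypothetical table: mid-points 10, 20, 30, 40 with frequencies 3, 5, 8, 4.
print(shortcut_mean([10, 20, 30, 40], [3, 5, 8, 4], assumed_mean=30, width=10))  # 26.5
```

The direct calculation gives the same value, (30 + 100 + 240 + 160) / 20 = 26.5, so the choice of A affects only the size of the intermediate numbers.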


Similarly, we can also use a short-cut
method to calculate the variance ($\sigma^2$)
for grouped data:
1. Write down the frequency table with a column giving the
class-marks (mid-points of class intervals).
2. Choose a number ‘A’ (usually the middle or almost middle
value of all the $x_i$) and take deviations $d_i = x_i - A$ about A.
3. Multiply the frequencies ($f_i$) with the corresponding $d_i$.
Calculate the sum $\sum f_i d_i$.
4. Obtain the squares of the deviations above ($d_i^2$).
5. Multiply the frequencies ($f_i$) with the
corresponding $d_i^2$. Calculate the sum $\sum f_i d_i^2$.
6. Find the sum of all frequencies, $\sum_{i=1}^{n} f_i = N$.
7. Use the formula $\sigma^2 = \frac{1}{N}\sum_{i=1}^{n} f_i d_i^2 - \left(\frac{1}{N}\sum_{i=1}^{n} f_i d_i\right)^2$.
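A possible Python sketch of the short-cut variance calculation follows. The function name `shortcut_variance` and the sample table are hypothetical; the two calls illustrate that the result does not depend on the chosen value of A.

```python
def shortcut_variance(midpoints, freqs, assumed_mean):
    """Variance via deviations d_i = x_i - A about an assumed mean A."""
    n_total = sum(freqs)
    d = [x - assumed_mean for x in midpoints]
    mean_d = sum(f * di for f, di in zip(freqs, d)) / n_total        # (1/N) sum f_i d_i
    mean_d2 = sum(f * di ** 2 for f, di in zip(freqs, d)) / n_total  # (1/N) sum f_i d_i^2
    return mean_d2 - mean_d ** 2

# Hypothetical table: mid-points 10, 20, 30, 40 with frequencies 3, 5, 8, 4.
print(shortcut_variance([10, 20, 30, 40], [3, 5, 8, 4], assumed_mean=30))  # 92.75
print(shortcut_variance([10, 20, 30, 40], [3, 5, 8, 4], assumed_mean=20))  # 92.75
```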

 

Class Test
Class Exercise - 1
The number of students absent in a
school was recorded every day for
147 days and the data is represented
in the following frequency table.
Number of students absent    Number of days
5                            1
6                            5
7                            11
8                            14
9                            16
10                           13
11                           10
12                           70
13                           4
15                           1
18                           1
20                           1
Obtain the median and
describe what information
it conveys. Also find the
mean deviation from the
median.
Solution
Calculation of median and mean deviation
$x_i$    $f_i$    Cumulative frequency    $|x_i - 12|$    $f_i |x_i - 12|$
5      1      1                       7             7
6      5      6                       6             30
7      11     17                      5             55
8      14     31                      4             56
9      16     47                      3             48
10     13     60                      2             26
11     10     70                      1             10
12     70     140                     0             0
13     4      144                     1             4
15     1      145                     3             3
18     1      146                     6             6
20     1      147                     8             8
Total  N = 147                                      253
Here, N = 147, so $\frac{N}{2} = 73.5$.
The cumulative frequency
just greater than $\frac{N}{2}$
is 140 and the corresponding value of
x is 12.
Hence, median = 12.
Solution contd..
The value of the median here
signifies that 12 is the middle value:
on roughly half of the days (70 of 147)
fewer than 12 students were absent, and
on the remaining days 12 or more were absent.
Mean deviation about median $= \frac{1}{N}\sum f_i |x_i - 12| = \frac{253}{147} = 1.72$
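As a numerical check, the exercise can be recomputed in a few lines of Python. This is only a sketch with variable names chosen for illustration; note that each absolute deviation is weighted by the class frequency $f_i$, not by the cumulative frequency.

```python
absent = [5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 18, 20]  # x_i
days = [1, 5, 11, 14, 16, 13, 10, 70, 4, 1, 1, 1]     # f_i
n_total = sum(days)                                    # N = 147

# Median: value whose cumulative frequency is just greater than N/2.
cumulative = 0
for x, f in zip(absent, days):
    cumulative += f
    if cumulative > n_total / 2:
        median = x
        break

total_abs_dev = sum(f * abs(x - median) for x, f in zip(absent, days))
print(median)                              # 12
print(total_abs_dev)                       # 253
print(round(total_abs_dev / n_total, 2))   # 1.72
```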
Class Exercise - 2
The following data represents the
expenditure pattern of a student
for the month of July. The student
gets Rs. 50 every day as pocket
money.
Expenditure (Rs.)    Frequency (No. of days)
0-10                 5
10-20                12
20-30                8
30-40                4
40-50                2
Calculate the mean and standard deviation.
Solution
Calculation of mean
Class    $x_i$    $f_i$    $f_i x_i$
0-10     5      5      25
10-20    15     12     180
20-30    25     8      200
30-40    35     4      140
40-50    45     2      90
Total           N = 31   635
Hence, mean $\bar{x} = \frac{1}{N}\sum f_i x_i = \frac{635}{31} = 20.48$
Solution contd..
Calculation of standard deviation
$x_i$    $f_i$    $u_i = \frac{x_i - 25}{10}$    $f_i u_i$    $u_i^2$    $f_i u_i^2$
5      5      -2                      -10        4        20
15     12     -1                      -12        1        12
25     8      0                       0          0        0
35     4      1                       4          1        4
45     2      2                       4          4        8
Total  N = 31                         -14                 44
Hence, variance
$$\sigma^2 = h^2\left[\frac{1}{N}\sum f_i u_i^2 - \left(\frac{1}{N}\sum f_i u_i\right)^2\right] = 100\left[\frac{44}{31} - \left(\frac{-14}{31}\right)^2\right] = 121.54$$
Hence, $\sigma = \sqrt{121.54} = 11.02$
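The figures in this exercise can be reproduced with the step-deviation approach in Python. The sketch below takes A = 25 and h = 10, as in the solution; the variable names are chosen for illustration only.

```python
import math

midpoints = [5, 15, 25, 35, 45]  # x_i
freqs = [5, 12, 8, 4, 2]         # f_i
A, h = 25, 10                    # assumed mean and class width
N = sum(freqs)                   # 31 days in July

u = [(x - A) / h for x in midpoints]                    # -2, -1, 0, 1, 2
sum_fu = sum(f * ui for f, ui in zip(freqs, u))         # -14
sum_fu2 = sum(f * ui ** 2 for f, ui in zip(freqs, u))   # 44

mean = A + h * sum_fu / N
variance = h ** 2 * (sum_fu2 / N - (sum_fu / N) ** 2)
print(round(mean, 2), round(variance, 2), round(math.sqrt(variance), 2))
# 20.48 121.54 11.02
```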
Class Exercise - 3
An absent-minded professor was
computing certain experimental
data to find the mean and standard
deviation of 100 observations. He
found the mean to be 40 and the
standard deviation to be 5.1.
His assistant later found that the
professor had, by mistake, read an
observation value as 61 instead
of the correct value of 91. Find the
correct mean and standard deviation
of the experimental data.
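Solution
Based on the incorrect data, mean $\bar{x} = \frac{\sum x_i}{n}$
or $40 = \frac{\sum x_i}{100}$
$\Rightarrow \sum x_i = 4000$
Hence, $\left(\sum x_i\right)_{\text{correct}} = 4000 - 61 + 91 = 4030$
$\therefore$ The correct mean, $\bar{x}_{\text{correct}} = \frac{4030}{100} = 40.3$
Similarly, for standard deviation, $\sigma = 5.1$.
Hence, $\sigma^2 = (5.1)^2 = 26.01$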
Solution contd...
$26.01 = \frac{\sum x_i^2}{100} - (40)^2$
or $\sum x_i^2 = (26.01 + 1600) \times 100 = 162601$
Now, the correct value would be
$\left(\sum x_i^2\right)_{\text{correct}} = 162601 - (61)^2 + (91)^2 = 162601 + 4560 = 167161$
Hence,
$$\sigma^2_{\text{correct}} = \frac{\left(\sum x_i^2\right)_{\text{correct}}}{100} - \left(\frac{\left(\sum x_i\right)_{\text{correct}}}{100}\right)^2 = \frac{167161}{100} - \left(\frac{4030}{100}\right)^2$$
= 1671.61 - 1624.09 = 47.52
So, the correct standard deviation,
$\sigma_{\text{correct}} = \sqrt{47.52} = 6.89$
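The correction can be checked numerically by rebuilding the two sums from the reported mean and standard deviation and then swapping the misread value. The Python sketch below uses illustrative variable names and the figures 40, 5.1, 61 and 91 from the exercise; note that $(40.3)^2 = 1624.09$.

```python
import math

n = 100
wrong_mean, wrong_sd = 40.0, 5.1
wrong_value, right_value = 61, 91

# Sums implied by the incorrect figures.
sum_x = wrong_mean * n                           # 4000
sum_x2 = (wrong_sd ** 2 + wrong_mean ** 2) * n   # about 162601

# Replace the misread observation with the correct one.
sum_x += right_value - wrong_value               # 4030
sum_x2 += right_value ** 2 - wrong_value ** 2    # about 167161

mean = sum_x / n
variance = sum_x2 / n - mean ** 2
print(round(mean, 2), round(variance, 2), round(math.sqrt(variance), 2))
# 40.3 47.52 6.89
```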
Thank you