Measures of Spread
Standard Deviation
The interquartile range is an effective measure of spread but it is awkward to calculate. Standard
deviation is a more useful measure of spread.
e.g. Find the mean of each set of data.
Set A: 30 70 80 110 10 42 78 90 22 68
Mean:
Set B: 60 58 62 61 61 58 63 59 60 58
Mean:
The mean for both sets is 60, but the data are dispersed (or spread) very differently.
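The two means can be checked with a short Python sketch. The variable names are illustrative only, and the minimum and maximum are printed simply to make the difference in spread visible.

```python
from statistics import mean

set_a = [30, 70, 80, 110, 10, 42, 78, 90, 22, 68]
set_b = [60, 58, 62, 61, 61, 58, 63, 59, 60, 58]

# Each set has 10 values that add up to 600.
mean_a = mean(set_a)
mean_b = mean(set_b)

print("Set A: mean =", mean_a, "min =", min(set_a), "max =", max(set_a))
print("Set B: mean =", mean_b, "min =", min(set_b), "max =", max(set_b))
```

Both means come out the same, yet Set A runs from 10 to 110 while Set B only runs from 58 to 63.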
We have two ways of measuring the spread of the data:
1) Variance ($\sigma^2$): If you square each deviation (distance from the mean) for the entire set of data,
and then find the mean of those squares, you get the variance. Variance is denoted by $\sigma^2$
(sigma squared) for a population and $s^2$ for a sample.
2) Standard Deviation ($\sigma$): This measures how closely the data are clustered around the mean. It is
found by taking the square root of the variance. Roughly speaking, it is the typical distance of the data
from the mean.
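As a quick check of these definitions, the sketch below uses Python's standard `statistics` module on Set A. Treating the ten values as a whole population is an assumption made here for illustration. `pvariance` and `pstdev` are the population versions (divide by n), while `variance` and `stdev` are the sample versions (divide by n - 1).

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev, variance

data = [30, 70, 80, 110, 10, 42, 78, 90, 22, 68]  # Set A

pop_var = pvariance(data)   # population variance, sigma squared
pop_sd = pstdev(data)       # population standard deviation, sigma
samp_var = variance(data)   # sample variance, s squared (divides by n - 1)
samp_sd = stdev(data)       # sample standard deviation, s

# In both cases the standard deviation is the square root of the variance.
print(pop_var, round(pop_sd, 2), round(sqrt(pop_var), 2))
print(round(samp_var, 2), round(samp_sd, 2), round(sqrt(samp_var), 2))
```

The sample values are slightly larger than the population values because the same sum of squared deviations is divided by 9 instead of 10.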
Formula for Calculating Standard Deviation:

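Ungrouped data: $\sigma = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$

Grouped data: $\sigma = \sqrt{\dfrac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n}}$

Here $\bar{x}$ is the mean and $n$ is the number of data values. For grouped data, $f_i$ is the frequency of the value $x_i$, $k$ is the number of distinct values, and $n = \sum f_i$.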
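The two formulas can be sketched as small Python functions. The function names and the eight-value data set below are hypothetical, chosen only so the arithmetic is easy to follow; both functions use the population form (dividing by n).

```python
from math import sqrt


def sd_ungrouped(values):
    """Population standard deviation of a plain list of values."""
    n = len(values)
    x_bar = sum(values) / n
    return sqrt(sum((x - x_bar) ** 2 for x in values) / n)


def sd_grouped(freq):
    """Population standard deviation from a {value: frequency} mapping."""
    n = sum(freq.values())  # total number of data values
    x_bar = sum(f * x for x, f in freq.items()) / n
    return sqrt(sum(f * (x - x_bar) ** 2 for x, f in freq.items()) / n)


# Hypothetical data: the same eight values written both ways.
raw = [2, 4, 4, 4, 5, 5, 7, 9]
grouped = {2: 1, 4: 3, 5: 2, 7: 1, 9: 1}

print(sd_ungrouped(raw), sd_grouped(grouped))
```

Both calls give the same answer, since the grouped formula only collects repeated values together before summing.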
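Calculate the variance and standard deviation for data set A.

Mean: $\bar{x} = \dfrac{\sum x}{n} = \dfrac{600}{10} = 60$

| Data ($x$) | Deviation from mean ($x - \bar{x}$) | $(x - \bar{x})^2$ |
|---|---|---|
| 30 | 30 - 60 = -30 | $(-30)^2$ = 900 |
| 70 | 10 | 100 |
| 80 | 20 | 400 |
| 110 | 50 | 2500 |
| 10 | -50 | 2500 |
| 42 | -18 | 324 |
| 78 | 18 | 324 |
| 90 | 30 | 900 |
| 22 | -38 | 1444 |
| 68 | 8 | 64 |

Sum of squared deviations: $\sum_{i=1}^{n} (x_i - \bar{x})^2 = 9456$

Variance: $\sigma^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} = \dfrac{9456}{10} = 945.6$

Standard Deviation: $\sigma = \sqrt{945.6} \approx 30.75$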
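The Set A working can be reproduced column by column in Python. This is a minimal sketch with illustrative variable names, using the population formula (dividing by n) as in the notes.

```python
from math import sqrt

set_a = [30, 70, 80, 110, 10, 42, 78, 90, 22, 68]

n = len(set_a)
x_bar = sum(set_a) / n                      # mean of the data
deviations = [x - x_bar for x in set_a]     # x minus the mean
squared = [d ** 2 for d in deviations]      # squared deviations

total_a = sum(squared)                      # sum of squared deviations
variance_a = total_a / n
sd_a = sqrt(variance_a)

# Print one row per data value, then the summary figures.
for x, d, sq in zip(set_a, deviations, squared):
    print(f"{x:>5} {d:>10.0f} {sq:>10.0f}")
print("sum of squares:", total_a)
print("variance:", variance_a)
print("standard deviation:", round(sd_a, 2))
```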
Calculate the variance and standard deviation for data set B.
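| Data ($x$) | Frequency ($f$) | $x - \bar{x}$ | $(x - \bar{x})^2$ | $f(x - \bar{x})^2$ |
|---|---|---|---|---|
| 60 | 2 | 60 - 60 = 0 | 0 | 0 |
| 58 | 3 | -2 | 4 | 12 |
| 62 | 1 | 2 | 4 | 4 |
| 61 | 2 | 1 | 1 | 2 |
| 63 | 1 | 3 | 9 | 9 |
| 59 | 1 | -1 | 1 | 1 |

Sum of weighted squared deviations: $\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 = 28$, where $k = 6$ is the number of distinct values and $n = \sum f_i = 10$.

Variance: $\sigma^2 = \dfrac{28}{10} = 2.8$

Standard Deviation: $\sigma = \sqrt{2.8} \approx 1.67$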
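The grouped working for Set B can be sketched the same way. Building the frequency table with `collections.Counter` is a choice made here for illustration; the notes tally the frequencies by hand.

```python
from collections import Counter
from math import sqrt

set_b = [60, 58, 62, 61, 61, 58, 63, 59, 60, 58]

freq = Counter(set_b)                 # value -> frequency
n = sum(freq.values())                # total number of data values
x_bar = sum(x * f for x, f in freq.items()) / n

# f * (x - mean)^2 for each distinct value
weighted_sq = {x: f * (x - x_bar) ** 2 for x, f in freq.items()}

total_b = sum(weighted_sq.values())
variance_b = total_b / n
sd_b = sqrt(variance_b)

for x, f in freq.items():
    print(f"{x:>4} {f:>3} {x - x_bar:>6.0f} {weighted_sq[x]:>6.0f}")
print("sum:", total_b)
print("variance:", variance_b)
print("standard deviation:", round(sd_b, 2))
```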
Conclusion
The standard deviation for Set B (1.67) is much smaller than that for Set A (30.75),
indicating that the data in Set B are more closely clustered around the
mean and, as such, are more consistent.