10 Standard deviations and variances
Standard deviations and variances of data.
If we have some measurements $x_1, x_2, \ldots, x_n$ then their average
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
is a measure of where their "center" is. The standard deviation and variance give us measures of how
spread out the data is. The variance, $s^2$, is the average of the squares of the differences of the data values
from the average, except that most people divide by $n-1$ instead of $n$, i.e.
$$s^2 = \frac{1}{n-1} \sum_{j=1}^{n} (x_j - \bar{x})^2 \tag{1}$$
Dividing by $n-1$ instead of $n$ simplifies certain formulas that involve variances and standard
deviations. The standard deviation, $s$, is the square root of the variance, i.e.
$$s = \sqrt{\frac{1}{n-1} \sum_{j=1}^{n} (x_j - \bar{x})^2} \tag{2}$$
Example 1.
A certain community is studying the times between fires. In a
certain period they measure the following times (in hours) between successive fires: $t_1 = 1.3$, $t_2 = 0.5$, $t_3 = 1.6$,
$t_4 = 9.3$, and $t_5 = 7.3$. Find the average, variance and standard deviation of these five measurements.
Solution.
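$$\bar{t} = \frac{1.3 + 0.5 + 1.6 + 9.3 + 7.3}{5} = \frac{20}{5} = 4$$
$$\begin{aligned}
s^2 &= \frac{(1.3 - 4)^2 + (0.5 - 4)^2 + (1.6 - 4)^2 + (9.3 - 4)^2 + (7.3 - 4)^2}{4} \\
&= \frac{(-2.7)^2 + (-3.5)^2 + (-2.4)^2 + (5.3)^2 + (3.3)^2}{4} \\
&= \frac{7.29 + 12.25 + 5.76 + 28.09 + 10.89}{4} = \frac{64.28}{4} = 16.07
\end{aligned}$$
$$s = \sqrt{16.07} \approx 4.00874$$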
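The hand computation in Example 1 can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the variable names (`times`, `mean`, `variance`, `std_dev`) are chosen here for illustration and are not part of the notes.

```python
import math
import statistics

# Times between successive fires (in hours), taken from Example 1.
times = [1.3, 0.5, 1.6, 9.3, 7.3]

n = len(times)
mean = sum(times) / n

# Sample variance as in formula (1): divide by n - 1, not n.
variance = sum((t - mean) ** 2 for t in times) / (n - 1)

# Standard deviation as in formula (2).
std_dev = math.sqrt(variance)

print(round(mean, 5), round(variance, 5), round(std_dev, 5))

# Cross-check: statistics.variance and statistics.stdev also divide by n - 1.
assert math.isclose(variance, statistics.variance(times))
assert math.isclose(std_dev, statistics.stdev(times))
```

The printed values should agree with the average 4, variance 16.07 and standard deviation 4.00874 found by hand, up to floating-point rounding.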
There is an alternative formula for the variance that is sometimes useful for computational purposes.
Proposition 1. Let $s^2$ be given by (1). Then
$$s^2 = \frac{1}{n-1} \sum_{j=1}^{n} (x_j)^2 - \frac{n}{n-1} (\bar{x})^2 \tag{3}$$
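Proof.
$$\begin{aligned}
s^2 &= \frac{1}{n-1} \sum_{j=1}^{n} (x_j - \bar{x})^2 = \frac{1}{n-1} \sum_{j=1}^{n} \left[ (x_j)^2 - 2 x_j \bar{x} + (\bar{x})^2 \right] \\
&= \frac{1}{n-1} \left[ \sum_{j=1}^{n} (x_j)^2 - 2 \bar{x} \sum_{j=1}^{n} x_j + \sum_{j=1}^{n} (\bar{x})^2 \right] \\
&= \frac{1}{n-1} \left[ \sum_{j=1}^{n} (x_j)^2 - 2n (\bar{x})^2 + n (\bar{x})^2 \right] = \frac{1}{n-1} \sum_{j=1}^{n} (x_j)^2 - \frac{n}{n-1} (\bar{x})^2.
\end{aligned}$$
//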
Example 1 (continued). Use the alternative formula (3) to compute the variance of the data in Example 1.
$$\begin{aligned}
s^2 &= \frac{1.3^2 + 0.5^2 + 1.6^2 + 9.3^2 + 7.3^2}{4} - \frac{5}{4} (4)^2 \\
&= \frac{1.69 + 0.25 + 2.56 + 86.49 + 53.29}{4} - 20 \\
&= \frac{144.28}{4} - 20 = 36.07 - 20 = 16.07.
\end{aligned}$$
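To see that the definition (1) and the alternative formula (3) give the same number, the two can be coded side by side. This is a sketch only; the function names `variance_definition` and `variance_shortcut` are hypothetical names introduced for this illustration.

```python
import math

def variance_definition(data):
    """Sample variance from formula (1): average squared deviation, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def variance_shortcut(data):
    """Sample variance from formula (3): sum of squares minus n times the squared mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(x * x for x in data) / (n - 1) - n / (n - 1) * mean ** 2

# Data from Example 1.
times = [1.3, 0.5, 1.6, 9.3, 7.3]

v1 = variance_definition(times)
v2 = variance_shortcut(times)
print(v1, v2)

# The two formulas agree up to floating-point rounding.
assert math.isclose(v1, v2, rel_tol=1e-9)
```

One caution when using formula (3) in floating-point arithmetic: it subtracts two numbers that can be nearly equal when the mean is large compared with the spread, so it may lose precision in that situation, while formula (1) does not have this problem.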
Remark. The mean absolute deviation about the mean, $\delta$, is another possible measure of how spread out
the data is. It is defined by
$$\delta = \frac{1}{n} \sum_{j=1}^{n} | x_j - \bar{x} | \tag{4}$$
Example 1 (continued). For the data in Example 1, the mean absolute deviation about the mean is
$$\begin{aligned}
\delta &= \frac{|1.3 - 4| + |0.5 - 4| + |1.6 - 4| + |9.3 - 4| + |7.3 - 4|}{5} = \frac{|-2.7| + |-3.5| + |-2.4| + |5.3| + |3.3|}{5} \\
&= \frac{2.7 + 3.5 + 2.4 + 5.3 + 3.3}{5} = \frac{17.2}{5} = 3.44
\end{aligned}$$
The mean absolute deviation is less than the standard deviation. More precisely, we have the following.
Proposition 2. If $\delta$ is defined by (4) and $s$ by (2) then
$$\delta \le \sqrt{\frac{n-1}{n}} \, s \tag{5}$$
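Proof. We write
$$\sum_{j=1}^{n} | x_j - \bar{x} | = \sum_{j=1}^{n} | x_j - \bar{x} | \cdot 1$$
and apply the Cauchy-Schwarz inequality
$$\sum_{j=1}^{n} u_j v_j \le \left( \sum_{j=1}^{n} u_j^2 \right)^{1/2} \left( \sum_{j=1}^{n} v_j^2 \right)^{1/2}$$
This gives
$$\sum_{j=1}^{n} | x_j - \bar{x} | \le \left( \sum_{j=1}^{n} | x_j - \bar{x} |^2 \right)^{1/2} n^{1/2} = \sqrt{n (n-1)} \left( \frac{1}{n-1} \sum_{j=1}^{n} | x_j - \bar{x} |^2 \right)^{1/2} = \sqrt{n (n-1)} \, s$$
Dividing both sides by n gives (5). //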
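Inequality (5) can be checked numerically on the data of Example 1. The following is a minimal sketch; the names `mean_abs_deviation`, `sample_std`, `delta` and `bound` are chosen here for illustration only.

```python
import math

def mean_abs_deviation(data):
    """Mean absolute deviation about the mean, as in formula (4)."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n

def sample_std(data):
    """Standard deviation with the n - 1 divisor, as in formula (2)."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Data from Example 1.
times = [1.3, 0.5, 1.6, 9.3, 7.3]
n = len(times)

delta = mean_abs_deviation(times)
s = sample_std(times)

# Right-hand side of inequality (5).
bound = math.sqrt((n - 1) / n) * s

print(delta, bound)
assert delta <= bound
```

For this data the mean absolute deviation is 3.44, and the bound on the right of (5) is a little below the standard deviation 4.00874, so the inequality holds with room to spare.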
Standard deviations and variances of random variables.
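If X is a random variable, then its mean or expected value is
$$\mu = E(X) = \begin{cases} \displaystyle \sum_{k} f(x_k) \, x_k & \text{if X is discrete with probability mass function } f(x) \\[2ex] \displaystyle \int_{-\infty}^{\infty} f(x) \, x \, dx & \text{if X is continuous with density function } f(x) \end{cases} \tag{6}$$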
The Law of Large Numbers connects the mean of a random variable with the average of data.
Theorem (Law of Large Numbers). Suppose $X_1, X_2, \ldots, X_n, \ldots$ is a sequence of independent random
variables all having the same probability mass or density function and $\mu$ is the common mean of all the $X_j$.
Let
$$\bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
be the "average" of $X_1, X_2, \ldots, X_n$ and E be the set of outcomes a in the sample space S such that $\bar{X}_n(a) \to \mu$
as $n \to \infty$. Then Pr{E} = 1.
So if the data consists of a set of independent samples with the same distribution and n is large then the
average of the data should be close to the mean of the underlying probability distribution.
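A small simulation illustrates this. The sketch below uses rolls of a fair six-sided die, which is an assumption made here for illustration and not an example from the notes; the mean of a single roll is 3.5. The function name `average_of_rolls` and the seed are likewise arbitrary choices.

```python
import random

def average_of_rolls(n, rng):
    """Average of n independent rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

# A seeded generator so that repeated runs give the same numbers.
rng = random.Random(0)

# The average should settle near the mean 3.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, average_of_rolls(n, rng))
```

For small n the average can be far from 3.5, while for large n it is typically very close, which is what the Law of Large Numbers leads one to expect.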
There is a similar connection between the variance and standard deviation of data and the corresponding
variance and standard deviation of a random variable.
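If X is a random variable, then the variance of X is
$$\sigma^2 = E( (X - \mu)^2 ) = \begin{cases} \displaystyle \sum_{k} f(x_k) (x_k - \mu)^2 & \text{if X is discrete} \\[2ex] \displaystyle \int_{-\infty}^{\infty} f(x) (x - \mu)^2 \, dx & \text{if X is continuous} \end{cases} \tag{7}$$
The standard deviation, $\sigma$, is the square root of the variance, i.e.
$$\sigma = \sqrt{\sigma^2} \tag{8}$$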
As in the case of data, there is an alternative formula for the variance.
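Proposition 3. Let $\sigma^2$ be given by (7). Then
$$\sigma^2 = E(X^2) - \mu^2 = \begin{cases} \displaystyle \sum_{k} f(x_k) (x_k)^2 - \mu^2 & \text{if X is discrete} \\[2ex] \displaystyle \int_{-\infty}^{\infty} f(x) \, x^2 \, dx - \mu^2 & \text{if X is continuous} \end{cases} \tag{9}$$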
Proof. $\sigma^2 = E( (X - \mu)^2 ) = E(X^2 - 2\mu X + \mu^2) = E(X^2) - E(2\mu X) + E(\mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - \mu^2$. //
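For a discrete random variable, definition (7) and the alternative formula (9) can be compared with exact arithmetic. The sketch below uses a fair six-sided die, which is an assumption introduced here for illustration and does not appear in the notes; the names `pmf`, `mu`, `var_definition` and `var_alternative` are also chosen here.

```python
from fractions import Fraction

# Probability mass function of a fair die: f(k) = 1/6 for k = 1, ..., 6.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

# Mean, as in formula (6).
mu = sum(p * k for k, p in pmf.items())

# Variance from the definition (7).
var_definition = sum(p * (k - mu) ** 2 for k, p in pmf.items())

# Variance from the alternative formula (9): E(X^2) - mu^2.
second_moment = sum(p * k ** 2 for k, p in pmf.items())
var_alternative = second_moment - mu ** 2

print(mu, var_definition, var_alternative)

# With exact fractions the two formulas agree exactly.
assert var_definition == var_alternative
```

Using `Fraction` avoids rounding, so the agreement is exact: the mean is 7/2 and both formulas give 35/12.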
Example 3. Suppose in Example 1 we model the times between fires by an exponential random variable
with rate $\lambda$ = ¼ fires per hour. Find the theoretical variance and standard deviation of such a random
variable.
Solution. We have seen previously that an exponential random variable T with rate $\lambda$ has mean $\mu = 1/\lambda$.
The density function is equal to $\lambda e^{-\lambda t}$ for t > 0 and zero for t < 0. Therefore
$$E(T^2) = \int_{0}^{\infty} t^2 \lambda e^{-\lambda t} \, dt.$$
If one integrates by parts twice then one obtains $E(T^2) = 2/\lambda^2$. So $\sigma^2 = E(T^2) - \mu^2 = 2/\lambda^2 - 1/\lambda^2 = 1/\lambda^2$ and $\sigma = 1/\lambda$.
In our example this means $\sigma^2 = 16$ and $\sigma = 4$.
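The value of the second moment can also be checked by numerical integration instead of integrating by parts. The following is a rough sketch using a midpoint rule; the truncation point `UPPER`, the number of steps, and the function names are assumptions chosen here, and the infinite upper limit is replaced by a finite one beyond which the integrand is negligible.

```python
import math

RATE = 0.25  # rate of 1/4 fires per hour, as in Example 3

def exponential_density(t, rate):
    """Density of an exponential random variable with the given rate."""
    return rate * math.exp(-rate * t) if t > 0 else 0.0

def midpoint_integral(func, lower, upper, steps):
    """Approximate the integral of func over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

# Assumption: cut the integral off at t = 200, where exp(-RATE * t) is about exp(-50).
UPPER = 200.0

second_moment = midpoint_integral(
    lambda t: t * t * exponential_density(t, RATE), 0.0, UPPER, 200_000
)

mean = 1 / RATE
variance = second_moment - mean ** 2  # alternative formula: E(T^2) - mean^2

print(second_moment, variance, math.sqrt(variance))
```

The computed second moment should be close to 32, giving a variance near 16 and a standard deviation near 4, in line with the solution above.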
As indicated above, the Law of Large Numbers provides the connection between the average of data and the
mean of a random variable. In a similar fashion it provides the connection between the variance and
standard deviation of data and the variance and standard deviation of a random variable. Suppose we
model a collection of data $x_1, \ldots, x_n$ by a sequence of independent random variables $X_1, X_2, \ldots, X_n$ all with
the same probability mass or density function. Then
$$\bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
is a model for the average $\bar{x}$ of $x_1, \ldots, x_n$. The Law of Large Numbers says $\bar{X}_n \to \mu$ as $n \to \infty$ with probability one, where $\mu$ is the mean of
the probability distribution for all the $X_j$. In a similar fashion
$$S_n^2 = \frac{1}{n-1} \sum_{j=1}^{n} (X_j - \bar{X}_n)^2$$
is a model for the variance of $x_1, \ldots, x_n$ and $S_n = \sqrt{S_n^2}$ is a model for the standard deviation of $x_1, \ldots, x_n$. The Law of Large
Numbers implies $S_n^2$ approaches the variance $\sigma^2$ of the probability distribution for all the $X_j$ with probability
one and $S_n$ approaches the standard deviation $\sigma$ of the probability distribution.
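This convergence can be illustrated by simulation. The sketch below draws samples from the exponential model of Example 3 (rate 1/4, so the theoretical variance is 16 and the standard deviation is 4) and computes the sample variance for increasing n. The seed, the sample sizes and the function name `sample_variance` are arbitrary choices made here.

```python
import math
import random

RATE = 0.25  # rate of 1/4 fires per hour, as in Example 3

def sample_variance(data):
    """Sample variance with the n - 1 divisor."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

# A seeded generator so that repeated runs give the same numbers.
rng = random.Random(2024)

for n in (5, 500, 50_000):
    # random.expovariate takes the rate (1 / mean) as its argument.
    draws = [rng.expovariate(RATE) for _ in range(n)]
    s2 = sample_variance(draws)
    print(n, round(s2, 3), round(math.sqrt(s2), 3))
```

With only five draws, as in Example 1, the sample variance can differ noticeably from 16; it is a coincidence of that data set that 16.07 is so close. For large n the sample variance is typically near 16 and the sample standard deviation near 4.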