3.4 Measures of Relative Location:
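The z-score is a quantity that measures the relative location of an observation within a data set.
The z-score, also referred to as the standardized value for observation i, is defined as
$$z_i = \frac{x_i - \bar{x}}{s},$$
where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation.
Note: $z_i$ is the number of standard deviations that $x_i$ lies from the mean $\bar{x}$.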
Example 2 (continued):
Factory 1:
x_i:  10.1    9.9     10.1    9.9     9.9     10.1    9.9     10.1    9.9     10.1
z_i:  0.948   -0.948  0.948   -0.948  -0.948  0.948   -0.948  0.948   -0.948  0.948
Factory 2:
x_i:  16      5       7       14      6       15      3       13      9       12
z_i:  1.305   -1.088  -0.652  0.870   -0.870  1.088   -1.523  0.652   -0.217  0.435
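The z-scores in Example 2 can be checked with a short script. The following is a minimal sketch, assuming the sample standard deviation (divisor n - 1) is intended; the function name `z_scores` is illustrative, and the Factory 1 values are ordered to match the listed z-scores.

```python
from statistics import mean, stdev


def z_scores(data):
    """Return (x - mean) / s for each x, where s is the sample standard deviation."""
    x_bar = mean(data)
    s = stdev(data)  # statistics.stdev divides by n - 1
    return [(x - x_bar) / s for x in data]


# Data from Example 2
factory_1 = [10.1, 9.9, 10.1, 9.9, 9.9, 10.1, 9.9, 10.1, 9.9, 10.1]
factory_2 = [16, 5, 7, 14, 6, 15, 3, 13, 9, 12]

for name, data in (("Factory 1", factory_1), ("Factory 2", factory_2)):
    print(name, [round(z, 3) for z in z_scores(data)])
```

Rounded values can differ from the listed ones in the last digit (for example 0.949 instead of 0.948), which suggests the listed values were truncated rather than rounded.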
There are two results related to the location of the data. The first result is Chebyshev’s
theorem.
Chebyshev’s Theorem:
For any population, at least
$$\left(1 - \frac{1}{k^2}\right) \times 100\%$$
of the data lie within k standard deviations of the mean, where k is any value greater than 1.
Based on Chebyshev’s theorem, for any data set, it can be roughly estimated that at least $\left(1 - \frac{1}{k^2}\right) \times 100\%$ of the data lie within k sample standard deviations of the mean.
Example 2 (continued):
For k = 2, Chebyshev’s theorem gives that at least
$$\left(1 - \frac{1}{2^2}\right) \times 100\% = 75\%$$
of the data are estimated to lie within 2 standard deviations of the mean. For the data from factory 1 and factory 2, all the data are within 2 sample standard deviations of the mean, i.e., all the data have z-scores with absolute value smaller than 2.
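The bound for k = 2 and the check against the factory data can be sketched in a few lines. The names `chebyshev_bound` and `share_within` are illustrative and not part of the original notes; the z-scores are copied from Example 2.

```python
def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is stated only for k > 1")
    return 1 - 1 / k**2


def share_within(z_values, k):
    """Observed fraction of z-scores with absolute value smaller than k."""
    return sum(abs(z) < k for z in z_values) / len(z_values)


# z-scores as listed in Example 2
z_factory_1 = [0.948, -0.948, 0.948, -0.948, -0.948,
               0.948, -0.948, 0.948, -0.948, 0.948]
z_factory_2 = [1.305, -1.088, -0.652, 0.870, -0.870,
               1.088, -1.523, 0.652, -0.217, 0.435]

print(chebyshev_bound(2))            # 0.75
print(share_within(z_factory_1, 2))  # 1.0
print(share_within(z_factory_2, 2))  # 1.0
```

Both observed shares (100%) are above the guaranteed minimum of 75%, as the theorem requires.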
The second result is the empirical rule, which applies especially when the data have a bell-shaped distribution. The empirical rule states:
• Approximately 68% of the data will be within one standard deviation of the mean ($-1 \le z_i \le 1$).
• Approximately 95% of the data will be within two standard deviations of the mean ($-2 \le z_i \le 2$).
• Almost all of the data will be within three standard deviations of the mean ($-3 \le z_i \le 3$).
Example 2 (continued):
For the data from factory 1, all the data are within one standard deviation of the mean, while for the data from factory 2, 60% of the data are within one standard deviation of the mean. The empirical rule is not applicable to these two data sets since neither is bell-shaped. However, for the following data,
2.11   -0.83  -1.43  1.35   -0.42  -0.69  -0.65  -0.29  -0.54  1.92
0.53   -0.27  1.7    0.88   1.25   0.32   -2.18  0.68   0.85   0.34
The histogram of the above data given below indicates that the data are roughly bell-shaped.
[Figure: histogram of the data (variable rn1); horizontal axis from -2 to 2, vertical axis from 0 to 4.]
Taking the mean as 0 and the standard deviation as 1, 13 of the 20 values (65%) lie between -1 and 1, that is, within one standard deviation of the mean. This is similar to the result based on the empirical rule (68%).
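The shares quoted in this example can be reproduced with a short sketch. Two assumptions are made here and are not stated in the notes: the factory shares use the sample mean and sample standard deviation, and the 20 listed values are treated as already standardized (mean 0, standard deviation 1), which is the reading that yields 65%. The function name `share_within_k_sd` is illustrative.

```python
from statistics import mean, stdev


def share_within_k_sd(data, k=1):
    """Fraction of observations with |z| < k, using the sample mean and sample s."""
    x_bar, s = mean(data), stdev(data)
    return sum(abs((x - x_bar) / s) < k for x in data) / len(data)


factory_1 = [10.1, 9.9, 10.1, 9.9, 9.9, 10.1, 9.9, 10.1, 9.9, 10.1]
factory_2 = [16, 5, 7, 14, 6, 15, 3, 13, 9, 12]
print(share_within_k_sd(factory_1))  # 1.0
print(share_within_k_sd(factory_2))  # 0.6

values = [2.11, -0.83, -1.43, 1.35, -0.42, -0.69, -0.65, -0.29, -0.54, 1.92,
          0.53, -0.27, 1.7, 0.88, 1.25, 0.32, -2.18, 0.68, 0.85, 0.34]
# Assumption: the values are treated as standardized, so "within one
# standard deviation of the mean" means lying between -1 and 1.
share_standardized = sum(-1 <= v <= 1 for v in values) / len(values)
print(share_standardized)  # 0.65
```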
Detecting Outliers:
To identify outliers, we can use either the box-plot or the z-score. The outliers identified by the box-plot are the data values above the upper limit or below the lower limit, while the outliers identified by the z-score are those with a z-score smaller than -3 or greater than 3.
Note: the outliers identified by the box-plot might differ from those identified by the z-score.
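The note that the two methods can disagree is easy to see on a small data set. The sketch below rests on assumptions not stated in the notes: the box-plot limits are taken as Q1 - 1.5 × IQR and Q3 + 1.5 × IQR (a common convention), the quartiles use the default method of Python's `statistics.quantiles`, and the data set `sample` is hypothetical.

```python
from statistics import mean, stdev, quantiles


def zscore_outliers(data, threshold=3):
    """Values whose z-score is smaller than -threshold or greater than threshold."""
    x_bar, s = mean(data), stdev(data)
    return [x for x in data if abs((x - x_bar) / s) > threshold]


def boxplot_outliers(data, whisker=1.5):
    """Values outside [Q1 - whisker*IQR, Q3 + whisker*IQR] (assumed limits)."""
    q1, _, q3 = quantiles(data, n=4)  # default "exclusive" quartile method
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    return [x for x in data if x < lower or x > upper]


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data, not from the notes
print(zscore_outliers(sample))   # []
print(boxplot_outliers(sample))  # [30]
```

Here the value 30 has a z-score of about 2.7, so the z-score rule does not flag it, while it lies above the assumed upper box-plot limit of 16.5.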
Example 4:
The flashlight batteries produced by one of the manufacturers are known to have an
average life of 60 hours with a standard deviation of 4 hours.
(a) At least what percentage of batteries will have a life of 54 to 66 hours?
(b) At least what percentage of the batteries will have a life of 52 to 68 hours?
(c) Determine an interval for the batteries’ lives that will be true for at least 80% of
the batteries.
[solution:]
Denote $\bar{x} = 60$, $s = 4$.
(a)
$$[54, 66] = 60 \pm 6 = \bar{x} \pm 1.5s$$
Thus, by Chebyshev’s theorem, within 1.5 standard deviations of the mean there are at least
$$\left(1 - \frac{1}{1.5^2}\right) \times 100\% \approx 55.56\%$$
of the batteries.
(b)
$$[52, 68] = 60 \pm 8 = \bar{x} \pm 2s$$
Thus, by Chebyshev’s theorem, within 2 standard deviations of the mean there are at least
$$\left(1 - \frac{1}{2^2}\right) \times 100\% = 75\%$$
of the batteries.
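(c)
$$\left(1 - \frac{1}{k^2}\right) \times 100\% = 80\% \;\Rightarrow\; 1 - \frac{1}{k^2} = 0.8 \;\Rightarrow\; k = \sqrt{5}$$
Thus, within $\sqrt{5}$ standard deviations of the mean there are at least 80% of the batteries. Therefore, the interval is
$$\bar{x} \pm \sqrt{5}\,s = 60 \pm \sqrt{5} \times 4 = 60 \pm 8.94 = [51.06, 68.94].$$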
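The three parts of Example 4 can be verified numerically. This is a minimal sketch; the helper names `chebyshev_bound` and `k_for_coverage` are illustrative, and `k_for_coverage` simply inverts the bound 1 - 1/k² = p for k.

```python
import math


def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations of the mean (k > 1)."""
    return 1 - 1 / k**2


def k_for_coverage(p):
    """Value of k for which the Chebyshev bound equals p (0 < p < 1)."""
    return 1 / math.sqrt(1 - p)


x_bar, s = 60, 4  # mean and standard deviation of battery life, in hours

k_a = (66 - x_bar) / s  # 1.5 standard deviations for [54, 66]
k_b = (68 - x_bar) / s  # 2.0 standard deviations for [52, 68]
print(round(100 * chebyshev_bound(k_a), 2))  # 55.56
print(round(100 * chebyshev_bound(k_b), 2))  # 75.0

k_c = k_for_coverage(0.80)  # sqrt(5), about 2.236
interval = (round(x_bar - k_c * s, 2), round(x_bar + k_c * s, 2))
print(interval)  # (51.06, 68.94)
```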
Online Exercise:
Exercise 3.4.1