Dr. Ka-fu Wong
ECON1003
Analysis of Economic Data
Ka-fu Wong © 2003
Chapter Four
Other Descriptive Measures (dispersion)
GOALS
1. Compute and interpret the range, the mean deviation, the variance, and the standard deviation of ungrouped data.
2. Compute and interpret the range, the variance, and the standard deviation from grouped data.
3. Explain the characteristics, uses, advantages, and disadvantages of each measure of dispersion.
4. Understand Chebyshev’s theorem and the Normal, or Empirical, Rule as they relate to a set of observations.
5. Compute and interpret quartiles and the interquartile range.
6. Construct and interpret box plots.
7. Compute and understand the coefficient of variation and the coefficient of skewness.
Range
• The range is the difference between the largest and the smallest value.
• Only two values are used in its calculation.
• It is influenced by an extreme value.
• It is easy to compute and understand.
Mean Deviation
• The Mean Deviation is the arithmetic mean of the absolute values of the deviations from the arithmetic mean.
• All values are used in the calculation.
• It is not influenced too much by large or small values.
• The absolute values are difficult to manipulate.
$$\mathrm{MD} = \frac{\sum |X - \bar{X}|}{n}$$
Mean deviation is also known as Mean Absolute Deviation (MAD).
EXAMPLE 1
• The weights of a sample of crates containing books for the bookstore (in pounds) are: 103, 97, 101, 106, 103
Find the range and the mean deviation.
Range = 106 - 97 = 9
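Example 1
The first step is to find the mean weight.
$$\bar{X} = \frac{\sum X}{n} = \frac{510}{5} = 102$$
The mean deviation is:
$$\mathrm{MD} = \frac{\sum |X - \bar{X}|}{n} = \frac{|103 - 102| + \dots + |103 - 102|}{5} = \frac{1 + 5 + 1 + 4 + 1}{5} = \frac{12}{5} = 2.4$$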
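The range and mean deviation from Example 1 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function names `value_range` and `mean_deviation` are illustrative and do not come from the slides.

```python
from statistics import fmean

def value_range(data):
    """Largest value minus smallest value."""
    return max(data) - min(data)

def mean_deviation(data):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    center = fmean(data)
    return fmean([abs(x - center) for x in data])

weights = [103, 97, 101, 106, 103]  # crate weights in pounds, from Example 1
print(value_range(weights))
print(mean_deviation(weights))
```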
Population Variance
• The population variance is the arithmetic mean of the squared deviations from the population mean.
• All values are used in the calculation.
• It is more likely to be influenced by extreme values than the mean deviation.
• The units are awkward: they are the square of the original units.
Variance
• The formula for the population variance is:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
• The formula for the sample variance is:
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
Note that in the sample variance formula the sum of squared deviations is divided by (n-1) instead of n. Although dividing by n may seem more natural, dividing by (n-1) yields an unbiased estimator of the population variance, whereas dividing by n yields a biased estimator.
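The effect of dividing by n - 1 can be illustrated by enumerating every possible sample from a tiny population. The sketch below assumes a four-value population (the ages used in Example 2) and draws all ordered samples of size 2 with replacement; the variable names are illustrative.

```python
from itertools import product
from statistics import fmean, pvariance, variance

population = [2, 18, 34, 42]     # small illustrative population
sigma2 = pvariance(population)   # population variance, divides by N

# Every ordered sample of size 2 drawn with replacement (16 samples in total).
samples = list(product(population, repeat=2))

# Average each estimator over all equally likely samples.
divide_by_n_minus_1 = fmean([variance(s) for s in samples])  # divides by n - 1
divide_by_n = fmean([pvariance(s) for s in samples])         # divides by n

print(sigma2, divide_by_n_minus_1, divide_by_n)
```

Averaged over all samples, the n - 1 version matches the population variance, while the n version falls short by the factor (n - 1)/n.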
EXAMPLE 2
• The ages of the Dunn family are: 2, 18, 34, 42
What is the population variance?
$$\mu = \frac{\sum X}{N} = \frac{96}{4} = 24$$
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{(2 - 24)^2 + \dots + (42 - 24)^2}{4} = \frac{944}{4} = 236$$
The Population Standard Deviation
• The population standard deviation (σ) is the square root of the population variance.
• For EXAMPLE 2, the population standard deviation is 15.36, found by
$$\sigma = \sqrt{\sigma^2} = \sqrt{236} = 15.36$$
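The figures in Example 2 can be reproduced with Python's standard `statistics` module. This is a minimal sketch; `pvariance` and `pstdev` are the population versions, which divide by N.

```python
from statistics import fmean, pstdev, pvariance

ages = [2, 18, 34, 42]  # Dunn family ages from Example 2, treated as a population

mu = fmean(ages)
sigma2 = pvariance(ages)  # mean of the squared deviations from mu
sigma = pstdev(ages)      # square root of the population variance

print(mu, sigma2, round(sigma, 2))
```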
EXAMPLE 3
The hourly wages earned by a sample of five students are: $7, $5, $11, $8, $6.
Find the variance.
$$\bar{X} = \frac{\sum X}{n} = \frac{37}{5} = 7.40$$
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{(7 - 7.4)^2 + \dots + (6 - 7.4)^2}{5 - 1} = \frac{21.2}{5 - 1} = 5.30$$
Sample Standard Deviation
• The sample standard deviation is the square root of the sample variance.
• In EXAMPLE 3, the sample standard deviation is 2.30
$$s = \sqrt{s^2} = \sqrt{5.30} = 2.30$$
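Example 3 can be checked the same way with the sample versions of the functions. This is a minimal sketch; `variance` and `stdev` in the standard `statistics` module divide by n - 1.

```python
from statistics import fmean, stdev, variance

wages = [7, 5, 11, 8, 6]  # hourly wages in dollars, from Example 3

x_bar = fmean(wages)
s2 = variance(wages)  # sum of squared deviations divided by n - 1
s = stdev(wages)      # square root of the sample variance

print(round(x_bar, 2), round(s2, 2), round(s, 2))
```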
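Sample Variance For Grouped Data
• The formula for the sample variance for grouped data is:
$$s^2 = \frac{\sum f (x - \bar{x})^2}{\sum f - 1}$$
Writing $n = \sum f$ and using $\sum f x = n\bar{x}$, this simplifies step by step:
$$s^2 = \frac{\sum f (x^2 - 2\bar{x}x + \bar{x}^2)}{n - 1}$$
$$= \frac{\sum f x^2 - 2\bar{x} \sum f x + \bar{x}^2 \sum f}{n - 1}$$
$$= \frac{\sum f x^2 - 2n\bar{x}^2 + n\bar{x}^2}{n - 1}$$
$$= \frac{\sum f x^2 - n\bar{x}^2}{n - 1}$$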
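The grouped-data definition and its shortcut form can be compared numerically. The sketch below uses a hypothetical frequency table (the midpoints and frequencies are made up for illustration) and cross-checks both forms against the ungrouped formula applied to the expanded data.

```python
from statistics import variance

# Hypothetical frequency table: class midpoints x and class frequencies f.
midpoints = [5, 15, 25, 35]
frequencies = [2, 5, 8, 5]

n = sum(frequencies)
x_bar = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Definition: sum of f * (x - x_bar)^2 divided by n - 1.
s2_definition = sum(f * (x - x_bar) ** 2 for f, x in zip(frequencies, midpoints)) / (n - 1)

# Shortcut: (sum of f * x^2 minus n * x_bar^2) divided by n - 1.
s2_shortcut = (sum(f * x ** 2 for f, x in zip(frequencies, midpoints)) - n * x_bar ** 2) / (n - 1)

# Cross-check: repeat each midpoint f times and use the ungrouped formula.
expanded = [x for f, x in zip(frequencies, midpoints) for _ in range(f)]
s2_expanded = variance(expanded)

print(s2_definition, s2_shortcut, s2_expanded)
```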
Interpretation and Uses of the Standard Deviation
• Chebyshev’s theorem: For any set of observations, the proportion of the values that lie within k standard deviations of the mean is at least:
$$1 - \frac{1}{k^2}$$
where k is any constant greater than 1.
Chebyshev’s theorem
Chebyshev’s theorem: For any set of observations, the proportion of the values that lie within k standard deviations of the mean is at least $1 - 1/k^2$.
k    Coverage
1    0%
2    75.00%
3    88.89%
4    93.75%
5    96.00%
6    97.22%
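The coverage column follows directly from the bound 1 - 1/k². A minimal sketch, with `chebyshev_lower_bound` as an illustrative function name:

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 1 - 1 / k ** 2

for k in range(1, 7):
    print(k, f"{chebyshev_lower_bound(k):.2%}")
```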
Interpretation and Uses of the Standard Deviation
• Empirical Rule: For any symmetrical, bell-shaped distribution:
  ◦ About 68% of the observations will lie within 1s of the mean.
  ◦ About 95% of the observations will lie within 2s of the mean.
  ◦ Virtually all the observations will be within 3s of the mean.
The Empirical Rule is also known as the Normal Rule.
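The percentages in the rule are rounded values of the normal distribution. The sketch below assumes an exactly normal distribution and uses `statistics.NormalDist` from the standard library; `normal_coverage` is an illustrative name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def normal_coverage(k):
    """Proportion of a normal distribution within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, f"{normal_coverage(k):.2%}")
```

Real data that are only roughly bell-shaped will match these proportions only approximately.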
Bell-shaped Curve showing the relationship between σ and μ
[Figure: bell-shaped curve with the horizontal axis marked at μ-3σ, μ-2σ, μ-1σ, μ, μ+1σ, μ+2σ, μ+3σ]
Why are we concerned about dispersion?
• Dispersion is used as a measure of risk.
• Consider two assets with the same expected (mean) return:
  ◦ -2%, 0%, +2%
  ◦ -4%, 0%, +4%
• The dispersion of returns of the second asset is larger than that of the first. Thus, the second asset is riskier.
• Thus, knowledge of dispersion is essential for investment decisions, and so is knowledge of expected (mean) returns.
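The comparison of the two assets can be made concrete. The sketch below assumes the three listed returns of each asset are equally likely outcomes, so it uses the population standard deviation.

```python
from statistics import fmean, pstdev

asset_a = [-2, 0, 2]  # returns in percent, first asset from the slide
asset_b = [-4, 0, 4]  # returns in percent, second asset from the slide

for name, returns in (("first asset", asset_a), ("second asset", asset_b)):
    print(name, fmean(returns), round(pstdev(returns), 2))
```

Both assets have a mean return of zero, but the second has twice the standard deviation of the first.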
Relative Dispersion
• The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage:
$$CV = \frac{s}{\bar{X}} (100\%)$$
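As an illustration, the coefficient of variation can be computed for the hourly wages of Example 3. This is a minimal sketch; `coefficient_of_variation` is an illustrative name, and it uses the sample standard deviation.

```python
from statistics import fmean, stdev

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the arithmetic mean."""
    return stdev(data) / fmean(data) * 100

wages = [7, 5, 11, 8, 6]  # hourly wages in dollars, from Example 3
print(round(coefficient_of_variation(wages), 1))
```

Because the result is a percentage, it can be compared across data sets measured in different units.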
Sharpe Ratio and Relative Dispersion
• The Sharpe Ratio is often used to measure the performance of investment strategies, with an adjustment for risk.
• If X is the return of an investment strategy in excess of the market portfolio, the inverse of the CV is the Sharpe Ratio.
• An investment strategy with a higher Sharpe Ratio is preferred.
http://www.stanford.edu/~wfsharpe/art/sr/sr.htm
Skewness
• Skewness is the measurement of the lack of symmetry of the distribution.
• The coefficient of skewness can range from -3.00 up to 3.00.
• A value of 0 indicates a symmetric distribution.
• It is computed as follows:
$$sk = \frac{3(\bar{x} - \text{median})}{s}$$
Or
$$sk = \frac{n}{(n - 1)(n - 2)} \sum \left( \frac{x - \bar{x}}{s} \right)^3$$
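Both skewness formulas can be applied to the same data to see how they differ. The sketch below uses the hourly wages of Example 3; the function names `pearson_skewness` and `software_skewness` are illustrative labels, not terms from the slides.

```python
from statistics import fmean, median, stdev

def pearson_skewness(data):
    """First formula: 3 * (mean - median) / s."""
    return 3 * (fmean(data) - median(data)) / stdev(data)

def software_skewness(data):
    """Second formula: n / ((n - 1)(n - 2)) times the sum of cubed standardized deviations."""
    n = len(data)
    x_bar, s = fmean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - x_bar) / s) ** 3 for x in data)

wages = [7, 5, 11, 8, 6]  # hourly wages in dollars, from Example 3
print(round(pearson_skewness(wages), 2), round(software_skewness(wages), 2))
```

The two coefficients are on different scales, so they agree in sign here (both indicate right skew) but not in magnitude.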
Why are we concerned about skewness?
• Skewness measures the degree of asymmetry in risk.
  ◦ Upside risk
  ◦ Downside risk
• Consider the distribution of asset returns:
  ◦ Right skewed implies higher upside risk than downside risk.
  ◦ Left skewed implies higher downside risk than upside risk.
Interquartile Range
• The interquartile range is the distance between the third quartile Q3 and the first quartile Q1.
• This distance will include the middle 50 percent of the observations.
• Interquartile range = Q3 - Q1
EXAMPLE 5
• For a set of observations the third quartile is 24 and the first quartile is 10. What is the interquartile range?
• The interquartile range is 24 - 10 = 14. Fifty percent of the observations will occur between 10 and 24.
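When the quartiles are not given, they have to be computed from the data first. The sketch below uses hypothetical observations and `statistics.quantiles` with its default method; other quartile conventions can give slightly different values for the same data.

```python
from statistics import quantiles

def interquartile_range(data):
    """Return (Q1, Q3, Q3 - Q1) using the standard library's default quartile method."""
    q1, _median, q3 = quantiles(data, n=4)
    return q1, q3, q3 - q1

# Hypothetical observations, chosen only to illustrate the calculation.
observations = [4, 8, 10, 13, 17, 19, 24, 26, 31]
q1, q3, iqr = interquartile_range(observations)
print(q1, q3, iqr)
```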
Box Plots
• A box plot is a graphical display, based on quartiles, that helps to picture a set of data.
• Five pieces of data are needed to construct a box plot: the Minimum Value, the First Quartile, the Median, the Third Quartile, and the Maximum Value.
EXAMPLE 6
• Based on a sample of 20 deliveries, Buddy’s Pizza determined the following information. The minimum delivery time was 13 minutes and the maximum 30 minutes. The first quartile was 15 minutes, the median 18 minutes, and the third quartile 22 minutes. Develop a box plot for the delivery times.
EXAMPLE 6 continued
[Figure: box plot of the delivery times on an axis running from 12 to 32 minutes, with min, Q1, median, Q3, and max marked]
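A rough text version of the box plot can be drawn from the five values in Example 6. This is only a sketch: `ascii_box_plot` is a hypothetical helper that assumes integer inputs and uses one character per minute on the axis.

```python
def ascii_box_plot(minimum, q1, median, q3, maximum, lo, hi):
    """Return a one-line text box plot on the integer axis lo..hi."""
    cells = [" "] * (hi - lo + 1)
    for v in range(minimum, maximum + 1):
        cells[v - lo] = "-"   # whiskers from min to max
    for v in range(q1, q3 + 1):
        cells[v - lo] = "="   # box from Q1 to Q3
    cells[minimum - lo] = "|"
    cells[maximum - lo] = "|"
    cells[q1 - lo] = "["
    cells[q3 - lo] = "]"
    cells[median - lo] = "M"
    return "".join(cells)

# Delivery times in minutes, from Example 6; axis from 12 to 32.
plot = ascii_box_plot(13, 15, 18, 22, 30, lo=12, hi=32)
print(plot)
```

The box is shorter to the left of the median than to the right, and the right whisker is longer, which suggests the delivery times are skewed to the right.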
Working with Mean and Standard Deviation
Set    Data                                  Mean     St Dev
(1)    19, 20, 21                            20.00    0.82
(2)    -1, 0, 1                              0.00     0.82
(3)    19, 20, 20, 21                        20.00    0.71
(4)    38, 40, 42                            40.00    1.63
(5)    57, 60, 63                            60.00    2.45
(6)    19, 19, 20, 20, 21, 21                20.00    0.82
(7)    3, 5, 8                               5.33     2.05
(8)    4, 7, 9                               6.67     2.05
(9)    7, 12, 17                             12.00    4.08
(10)   12, 20, 21, 27, 32, 35, 45, 56, 72    35.56    18.04
Working with Mean and Standard Deviation
Set    Data              Mean     St Dev
(1)    19, 20, 21        20.00    0.82
(2)    -1, 0, 1          0.00     0.82
(3)    19, 20, 20, 21    20.00    0.71
(4)    38, 40, 42        40.00    1.63
(5)    57, 60, 63        60.00    2.45
• (2) = (1) - mean(1):
  → Mean(2)=0; Stdev(2)=Stdev(1)
• (3) = (1) with mean(1) added as an extra observation:
  → Mean(3)=Mean(1); Stdev(3)<Stdev(1).
• (4) = (1)*2; (5) = (1)*3:
  → Mean(4)=mean(1)*2; mean(5)=mean(1)*3
  → Stdev(4)=stdev(1)*2; stdev(5)=stdev(1)*3
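The relationships among sets (1) to (5) can be verified numerically. In this sketch the variable names are illustrative; the St Dev column of the table matches the population formula (dividing by n), so `pstdev` is used.

```python
from statistics import fmean, pstdev

set1 = [19, 20, 21]
set2 = [x - fmean(set1) for x in set1]  # subtract the mean from every value
set3 = set1 + [fmean(set1)]             # add the mean as an extra observation
set4 = [2 * x for x in set1]            # double every value
set5 = [3 * x for x in set1]            # triple every value

for label, data in (("(1)", set1), ("(2)", set2), ("(3)", set3), ("(4)", set4), ("(5)", set5)):
    # pstdev divides by n, which reproduces the St Dev column of the table
    print(label, f"{fmean(data):.2f}", f"{pstdev(data):.2f}")
```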
Working with Mean and Standard Deviation
Set    Data                                  Mean     St Dev
(1)    19, 20, 21                            20.00    0.82
(6)    19, 19, 20, 20, 21, 21                20.00    0.82
(7)    3, 5, 8                               5.33     2.05
(8)    4, 7, 9                               6.67     2.05
(9)    7, 12, 17                             12.00    4.08
(10)   12, 20, 21, 27, 32, 35, 45, 56, 72    35.56    18.04
• (6) = (1) with every observation repeated the same number of times:
  → Mean(6)=Mean(1); Stdev(6)=Stdev(1).
• (9) = (7)+(8), added element by element:
  → Mean(9)=mean(7)+mean(8)
• (10) = (7)*(8), every pairwise product:
  → Mean(10)=mean(7)*mean(8)
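Sets (6), (9), and (10) can be rebuilt from sets (1), (7), and (8) to confirm the stated relationships. In this sketch the variable names are illustrative; set (9) is formed by element-by-element addition and set (10) by taking every pairwise product, which is how the tabulated values arise.

```python
from itertools import product
from math import isclose
from statistics import fmean, pstdev

set1 = [19, 20, 21]
set7 = [3, 5, 8]
set8 = [4, 7, 9]

set6 = sorted(set1 * 2)                                # each value of (1) repeated twice
set9 = [x + y for x, y in zip(set7, set8)]             # element-by-element sums
set10 = sorted(x * y for x, y in product(set7, set8))  # every pairwise product

print(set6, fmean(set6), round(pstdev(set6), 2))
print(set9, round(fmean(set9), 2))
print(set10, round(fmean(set10), 2))
print(isclose(fmean(set10), fmean(set7) * fmean(set8)))
```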
Further results about mean and variance of transformed variables
• E(X) = mean or expected value
• $V(X) = E[(X - E(X))^2] = E(X^2) - [E(X)]^2$
• E(a+bX) = a+bE(X)
• E(X+Y) = E(X) + E(Y)
• V(X+Y) = V(X) + V(Y) if X and Y are independent.
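These rules can be checked exactly on a small discrete example. The sketch below assumes two hypothetical variables, each taking three equally likely values, and models independence by treating every (x, y) pair as equally likely; the constants `a` and `b` are arbitrary.

```python
from itertools import product
from math import isclose
from statistics import fmean, pvariance

# Hypothetical equally likely outcomes for two independent variables X and Y.
x_values = [3, 5, 8]
y_values = [4, 7, 9]
a, b = 10, 2  # arbitrary constants for the linear transformation a + bX

# E(a + bX) = a + bE(X)
lhs_linear = fmean([a + b * x for x in x_values])
rhs_linear = a + b * fmean(x_values)

# V(X) = E(X^2) - E(X)^2
v_shortcut = fmean([x ** 2 for x in x_values]) - fmean(x_values) ** 2

# Independence: every (x, y) pair is equally likely, so enumerate all pairs.
sums = [x + y for x, y in product(x_values, y_values)]

print(isclose(lhs_linear, rhs_linear))
print(isclose(v_shortcut, pvariance(x_values)))
print(isclose(fmean(sums), fmean(x_values) + fmean(y_values)))
print(isclose(pvariance(sums), pvariance(x_values) + pvariance(y_values)))
```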
Further results about mean and variance of transformed variables
• E(a+bX) = a+bE(X)
• E(X+Y) = E(X) + E(Y)
• Suppose we invest $1 in two assets: $a in asset X and $(1-a) in asset Y. Their expected returns are E(X) and E(Y), respectively. We will expect a return of E(aX+(1-a)Y) = aE(X) + (1-a)E(Y) for this investment portfolio.
• If these two assets are independent or uncorrelated, so that the covariance C(X,Y) = 0, then the variance is $V(aX + (1-a)Y) = a^2 V(X) + (1-a)^2 V(Y)$.
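The two portfolio formulas translate directly into code. This is a minimal sketch with hypothetical inputs (the weight, expected returns, and variances below are not from the slides), and the variance function is valid only under the stated assumption that C(X,Y) = 0.

```python
def portfolio_mean(a, mean_x, mean_y):
    """Expected return of $a in asset X and $(1 - a) in asset Y."""
    return a * mean_x + (1 - a) * mean_y

def portfolio_variance(a, var_x, var_y):
    """Variance of the same portfolio, valid only when C(X, Y) = 0."""
    return a ** 2 * var_x + (1 - a) ** 2 * var_y

# Hypothetical inputs, not taken from the slides.
a = 0.2
mean_x, mean_y = 0.08, 0.03
var_x, var_y = 0.04, 0.01

print(portfolio_mean(a, mean_x, mean_y), portfolio_variance(a, var_x, var_y))
```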
Chapter Four
Other Descriptive Measures (dispersion)
- END -