Finance Math Refresher
This paper contains some basic math to refresh your human memory
and prepare you for the math content for the book
Correlation Risk Modeling and Management – An Applied Guide
including the Basel III Correlation Framework
There are problems at the end of each chapter. The solutions are available upon request.
For comments and questions, please email Gunter Meissner at
[email protected].
1. Number Theory
a) Natural numbers N, also called whole numbers, are numbers that take non-negative
values and have no decimals. If 0 is excluded they are termed positive natural
numbers; if 0 is included they are termed non-negative natural numbers. Formally,
for non-negative natural numbers, we have
N = {0, 1, 2, 3,…}
Graphically, the natural number set can be displayed as a vector
0 1 2 3 4 … ∞
For more on vectors see section 2.2.
b) Integers Z are the natural numbers together with their negatives. Formally
Z = {…,-3, -2, -1, 0, 1, 2, 3,…}
(Z comes from “Zahl”, which means ‘number’ in German, which is very important to
know )
c) Rational numbers Q are numbers that can be expressed as a quotient (also called
fraction) of two integers with a non-zero denominator. Formally,
Q = p / q where p, q ϵ Z (reads: p and q are elements of the Integer number set Z) and q
≠ 0. As a consequence, the decimals of a rational number either terminate or repeat.
d) Irrational numbers are numbers that cannot be expressed as a quotient of integers with a
denominator q ≠ 0. In other words, irrational numbers have infinitely many, non-repeating decimals. Examples of irrational numbers are Euler's number e = 2.71828…,
π = 3.1415926…, or \(\sqrt{2}\) = 1.414213…
e) Real numbers R include all number sets above, i.e. the Natural numbers set N, the
Integer number set Z, the rational number set Q, as well as irrational numbers.
f) Imaginary numbers are numbers whose square is less than 0. For example, \(\sqrt{-10}\)
is an imaginary number since its square, -10, is smaller than 0. A special imaginary number is i,
defined as \(i = \sqrt{-1}\). We do use imaginary numbers in finance, e.g. in Fourier analysis.
g) Complex numbers C are a combination of real numbers and imaginary numbers, i.e. a +
bi, where a, b ϵ R and \(i = \sqrt{-1}\). Complex numbers are cool: they expand the
one-dimensional number line by a second dimension, which creates a number
plane. Complex numbers provide a solution to equations where no real solution can be
found.
We can relate the numbers sets above by expressing them as subsets:
N ⊂ Z ⊂ Q ⊂ R ⊂ C
This reads: The Natural number set N is a subset of the Integer set Z, which is a subset of the
rational number set Q, which is a subset of the real number set R, which is a subset of the
complex number set C.
Dividing by zero
In standard algebra the quotient \(\frac{a}{0}\), a ϵ R, is not defined. So \(\frac{a}{0}\) is not an element of any of the
number sets above. Compare the calculus chapter 3, where we apply limits, i.e. a denominator
is decreasing to an infinitesimally small unit, see equation (3.1).
Problems for Chapter 1: Number Theory
(The answers are available upon request. Email Gunter Meissner at [email protected])
1.1 Name 3 examples of sets in the real world, which can only be measured (counted) by
natural numbers.
1.2 Negative numbers were rejected until the 17th century by many mathematicians.
Name 2 examples of negative numbers appearing in the financial world.
1.3 “Every natural number is also a rational number, but not vice versa”. Is this statement true?
1.4 Name a rational number which is not a natural number
1.5 Name a real number that is not a natural number
1.6 “Irrational numbers are irrational”. Comment on this statement
1.7 Name an irrational number that is not a rational number
1.9 “Imaginary numbers have no analytical solution. Therefore they don’t make sense”.
Comment on this statement.
1.10 Can a complex number be expressed as a real number?
1.11 “Only God can divide by zero, humans can’t”. Comment.
2. Algebra
There are six basic algebraic operations: Adding, subtracting; multiplication, division;
exponentiation and extracting roots. When these operations are combined, they have to be
performed in the order of: First exponentiation and extracting roots, then multiplication and
division, then adding and subtracting.
So if \(y = a + b\,c^d\), then \(c^d\) has to be performed first, then the multiplication with b, then ‘a’ is
added.
Example: What is the solution of \(y = 1 + 2 \times 3^4\)? It is 81 x 2 + 1 = 163. This is the only correct
solution.
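The order of operations can be checked quickly in code. The following is a minimal Python sketch (the choice of Python is ours, not part of the original text); Python applies the same precedence, with `**` for exponentiation evaluated before `*` and `+`.

```python
# Exponentiation first, then multiplication, then addition.
y = 1 + 2 * 3**4
print(y)  # 163

# Working strictly from left to right instead gives a different (wrong) number.
left_to_right = ((1 + 2) * 3) ** 4
print(left_to_right)  # 6561
```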
Many rules exist in Algebra. Here are the most applied ones.
\(a^0 = 1\), where a ϵ R and a ≠ 0
\((a + b)^2 = a^2 + 2ab + b^2\)
\((a - b)^2 = a^2 - 2ab + b^2\)
\(a^m a^n = a^{(n+m)}\)
\(a^m / a^n = a^{(m-n)}\)
\((a b)^n = a^n b^n\)
\(\left(\frac{a}{b}\right)^2 = \frac{a^2}{b^2}\)
\(\frac{1}{a^{-n}} = a^n\) and \(a^{-n} = \frac{1}{a^n}\)
\(a^{(1/p)} = \sqrt[p]{a}\)
\(a^{(q/p)} = \sqrt[p]{a^q}\)
For more algebraic rules, see http://orion.math.iastate.edu/dept/links/formulas/form1.pdf
2.1 On the Logarithm
Logarithms can help to solve algebraic equations. In particular, they can solve an
equation for the exponent. Logarithms in finance often help to more conveniently display
exponential growth rates. So let’s discuss them.
The idea of a logarithm is to reverse the operation of exponentiation. Simply put, the
logarithm asks the question: What is the exponent to which we have to raise a given number
(the base) to get another given number (x)?
The notation of a logarithm is
\(y = \log_b(x)\)   (2.1)
b is the base. We are trying to find y. y is the exponent to which b has to be raised to
find the given number x. So if b = 10 and x = 1,000, the logarithm is y = 3, since \(10^3\) = 1,000.
Formally, \(\log_{10}(1{,}000) = 3\). We read this as “the logarithm of 1,000 to the base 10 is 3”.
We often use the natural logarithm ‘ln’ in finance. This means the base b = e = 2.71828….
The notation is
\(y = \log_e(x)\) or y = ln(x)
Example: If x = 10, what is ln(x)? Well, we can just throw it into Excel and get ln(10) = 2.3026.
This is correct since the base e = 2.71828 raised to the power of 2.3026 is 10.
Logarithms are also quite convenient mathematically, i.e. they are typically easy to
differentiate and integrate. For example, if y = ln(x) ⇒ \(\frac{dy}{dx} = \frac{1}{x}\). (⇒ stands for ‘it follows that’).
For more on differentiation, see chapter 3.
Logarithmic Rules
There are some nice logarithmic rules, which come in handy to solve algebraic problems:
\(\ln(x^a) = a \ln(x)\), and in particular \(\ln(e^a) = a \ln(e) = a\)
This equation helps us solve for exponents. For example, if we have the equation \(y = x^a\) and y and x
are given, we can solve for the exponent a by using
\(\ln(y) = \ln(x^a)\) or ln(y) = a ln(x) or a = ln(y) / ln(x).
Example: What is the solution of \(100 = 7^a\)? It is ln(100) = a ln(7) or a = ln(100) / ln(7) = 2.36659.
(We can look up ln(100) and ln(7) easily on every calculator, Excel, MatLab, etc).
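Solving for an exponent with logarithms is easy to reproduce in code. Here is a small Python sketch; the helper name `solve_exponent` is our own invention for illustration, and `math.log` is Python's natural logarithm.

```python
import math

def solve_exponent(y, x):
    """Return a such that x**a == y, using a = ln(y) / ln(x)."""
    return math.log(y) / math.log(x)

a = solve_exponent(100, 7)
print(round(a, 5))       # 2.36659
print(round(7 ** a, 6))  # 100.0, i.e. raising 7 to the power a recovers 100
```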
Other helpful rules are
ln(ab) = ln(a) + ln(b)
ln(a/b) = ln(a) – ln(b)
We also use the natural logarithm to more conveniently display growth rates. Growth
rates are relative changes, expressed as (S1-S0)/S0, where St is the price of an asset at time t.
For example if S1 = 110, and S0 = 100, the relative change is (110-100)/100 = 0.1 = 10%. We
often approximate relative changes as
(S1-S0)/S0 ≈ ln (S1/S0)   (2.2)
This is a good approximation for small differences between S1 and S0. The values ln(S1/S0) are called
log-returns. The advantage of using log-returns is that they can be added over time. Relative
changes are not additive over time. Let’s show this in an example.
Example: A stock price at t0 is $100. From t0 to t1, the stock increases by 10%. Hence the stock
increases to $110. From t1 to t2 the stock increases again by 10%. So the stock price increases to
$110 x 1.1 = $121. This total increase of 21% is higher than the sum of the percentage increases,
10% + 10% = 20%. Hence percentage increases are not additive over time.
Let’s look at the log-returns. The log-return from t0 to t1 is ln(110/100) = 9.531%. From t1 to t2
the log-return is ln(121/110) = 9.531%. When adding these returns, we get 9.531% + 9.531% =
19.062%. This is the same as the log-return from t0 to t2, i.e. ln(121/100) = 19.062%. Hence log-returns are additive in time (see footnote 1). See the spreadsheet ‘Log returns’
at
www.wiley.com/go/correlationriskmodeling
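The stock example can also be replayed in a few lines of Python instead of a spreadsheet. This is only a sketch with the three prices from the example hard-coded; the variable names are ours.

```python
import math

prices = [100.0, 110.0, 121.0]  # stock price at t0, t1, t2

# Period-by-period relative changes and log-returns
simple = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]
logret = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Change over the whole period t0 to t2
total_simple = (prices[-1] - prices[0]) / prices[0]
total_log = math.log(prices[-1] / prices[0])

print(round(sum(simple), 4), round(total_simple, 4))  # 0.2 0.21    -> not additive
print(round(sum(logret), 5), round(total_log, 5))     # 0.19062 0.19062 -> additive
```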
Footnote 1: We could have also solved for the absolute value 121, which matches a logarithmic growth rate of 9.531%:
ln(x/110) = 9.531%, or, ln(x) - ln(110) = 9.531%, or, ln(x) = ln(110) + 9.531%. Taking the power of e we get \(e^{\ln(x)} = x = e^{\ln(110)+0.09531} = 121\).
We also often display exponential functions in finance on the logarithmic scale. If we
have a stock growing exponentially with \(e^x\), we have the graph in Figure 2.1.
Figure 2.1: \(y = e^x\) with respect to x (image not reproduced; x runs from 0 to 2.25, y from 0 to 10)
Displaying this graph on a natural logarithmic scale, applying \(\ln(e^x) = x\), we get a straight line:
Figure 2.2: The exponential stock price growth displayed on a logarithmic scale (image not reproduced; \(\ln(e^x)\) runs from 0 to 2.5, x from 0 to 2.25)
2.2 Vector and Matrix Algebra
We use vectors and especially matrices in finance. Figure 1 already displayed the natural
number set as a vector. We use matrices in investments and risk management to display the
covariance matrix of the assets in a portfolio. In finance, a covariance matrix measures how
asset prices move together in time. We also use default correlation matrices in risk management. A default correlation matrix measures how probable the joint default of two
entities is within a certain time period, for example a year.
A vector is a geometric entity, which is characterized by two properties: a) magnitude,
which is measured by its length, and b) direction. Typically vectors have an origin (starting point)
and an end point (but they can also be infinite, like our number set in Figure 1). In finance we
often apply row vectors and column vectors.
The notation for a row vector or horizontal vector is \(\begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix}\), where \(a_x\) ϵ R. The notation for a
column vector or vertical vector is \(\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}\), \(b_x\) ϵ R.
Multiplying a row vector with a column vector results in a scalar (which is just the term for a
single number in vector algebra, it comes from ‘scale’)
\[ \begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = a_1 b_1 + a_2 b_2 + a_3 b_3 \]
Example: What is the result of \(\begin{pmatrix} 1 & 2 & -3 \end{pmatrix} \begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix}\)? It is 1 x 2 + 2 x 4 – 3 x 6 = -8.
A row vector and a column vector are special cases of a matrix. The row vector is a one-row matrix and the column vector is a one-column matrix. However, matrices can have several
rows and columns. For example, a square matrix can have two rows and two columns, \(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\),
where {a, b, c, d} ϵ R. Matrices can only be multiplied if the number of columns of the first
matrix is identical to the number of rows of the second matrix. Matrix multiplication is done
by
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \times \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix} \]   (2.2)
Example:
\[ \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \times \begin{pmatrix} -2 & 3 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} -2 + 2 & 3 - 2 \\ -6 + 4 & 9 - 4 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -2 & 5 \end{pmatrix} \]
A matrix A can be transposed by exchanging the rows with the columns. The notation of a
transposed matrix is \(A^T\).
Example: What is the transpose of matrix \(A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\)? \(A^T = \begin{pmatrix} a & c \\ b & d \end{pmatrix}\)
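The vector and matrix operations above can be sketched in plain Python with matrices stored as lists of rows. The helper names `matmul` and `transpose` are our own, chosen for illustration; in practice a numerical library would normally be used.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    # zip(*B) walks through the columns of B
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    """Exchange rows and columns."""
    return [list(col) for col in zip(*A)]

A = [[1, 2], [3, 4]]
B = [[-2, 3], [1, -1]]
print(matmul(A, B))   # [[0, 1], [-2, 5]]
print(transpose(A))   # [[1, 3], [2, 4]]

# A row vector times a column vector is a 1x1 matrix, i.e. a scalar.
print(matmul([[1, 2, -3]], [[2], [4], [6]]))  # [[-8]]
```

Multiplying a 2x2 matrix with a three-element column vector raises the `ValueError`, because the column and row counts do not match.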
An Eigenvector (“eigen” comes from the German word for “self”, which is very important to know)
is a special type of vector. A vector x is an eigenvector of a matrix A if it satisfies the condition
A x = λ x, where A is a matrix, x is the eigenvector, and λ is the eigenvalue. Let’s look at an
example:
Let \(A = \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix}\) and \(x = \begin{pmatrix} 2 \\ -2 \end{pmatrix}\). x is an eigenvector since
\[ \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix} \times \begin{pmatrix} 2 \\ -2 \end{pmatrix} = \begin{pmatrix} 8 - 4 \\ 4 - 8 \end{pmatrix} = \begin{pmatrix} 4 \\ -4 \end{pmatrix} \]
It follows that the eigenvalue λ = 2, since \(\begin{pmatrix} 4 \\ -4 \end{pmatrix} = 2 \times \begin{pmatrix} 2 \\ -2 \end{pmatrix}\). Geometrically, multiplying an eigenvector x
with A leaves x unchanged, stretches it, shrinks it, flips it (so that it points in the exact opposite
direction), or flips and stretches or flips and shrinks it. These are the only changes that can occur
from multiplying a scalar (λ) and a vector (x).
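Whether a given vector is an eigenvector can be tested directly from the condition A x = λ x. The sketch below assumes the matrix with rows (4, 2) and (2, 4) and the vector (2, -2); the function names are hypothetical helpers, not a library API.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) with a column vector (list)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def eigenvalue_if_eigenvector(A, x, tol=1e-12):
    """Return lambda if A x = lambda x for the non-zero vector x, else None."""
    Ax = matvec(A, x)
    # Read lambda off the first non-zero component of x ...
    i = next(k for k, xk in enumerate(x) if xk != 0)
    lam = Ax[i] / x[i]
    # ... and confirm that the same lambda works for every component.
    ok = all(abs(axk - lam * xk) <= tol for axk, xk in zip(Ax, x))
    return lam if ok else None

A = [[4, 2], [2, 4]]
print(matvec(A, [2, -2]))                     # [4, -4]
print(eigenvalue_if_eigenvector(A, [2, -2]))  # 2.0
print(eigenvalue_if_eigenvector(A, [1, 0]))   # None, (1, 0) is not an eigenvector
```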
For more on matrices and vectors, go to www.wiley.com/go/correlationriskmodeling, ‘Matrix primer’.
Problems for Chapter 2: Algebra
(The answers are available upon request. Email Gunter Meissner at [email protected])
1.1 Solve 3 x 4 + 1 x 24-2
1.2 Solve 4y + 5z Is this a trick question? 
1.3 Solve 4y x 5z
1.4 Solve
1
2 x2
2
1.5 Solve \((2 + 3)^2\)
1.6 Solve \((3 - 4)^2\)
1.7 Solve \(2^2 \times 2^3 - 3^2\)
1.8 Solve
1.8 Solve
778
777
1
Another trick question? 
3
4
4
42
2
2
1.10
4 2
Solve   x  
2 4
1.12
 4 
Solve 

 4 4
1.12
Solve 20b = 9 for b
10
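1.13 Solve ln(3/2)
1.14 Solve \(\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 3 \\ 4 \end{pmatrix}\)
1.15 Solve \(\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 3 \\ 4 \\ 5 \end{pmatrix}\) Enough with the trick questions already!
1.16 Given is the matrix \(\begin{pmatrix} 8 & 4 \\ 4 & 8 \end{pmatrix}\). Is the vector \(x = \begin{pmatrix} 2 \\ -2 \end{pmatrix}\) an eigenvector? If yes, what is the
eigenvalue?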
3. Calculus
Finally calculus! Everyone likes calculus since we can deal with infinities and other cool stuff like
finding optima, calculating surfaces etc. Calculus has two main operations: 1) Differentiation
and 2) Integration. We use differentiation a lot in Finance, for example to calculate the risk parameters (called Greeks) of options, to see the marginal impact of a parameter change (such as
volatility or asset price) on a portfolio, to optimize a portfolio etc.
3.1 Differentiation
Definition: A mathematical derivative measures how one variable changes, if another variable
changes by an infinitesimally small amount.
The derivative of a function y(x) is the slope of that function for an infinitesimally small
change in x, formally \(\frac{dy}{dx}\). \(\frac{dy}{dx}\) is the ‘Leibniz notation’ by Gottfried Leibniz. Other notations are
the Lagrange notation f’(x) or y’, or the Newton notation \(\dot{y}\). Hence \(\frac{dy}{dx} \equiv f'(x) \equiv y' \equiv \dot{y}\), where ≡ stands
for ‘is equivalent to’.
Let’s derive \(\frac{dy}{dx}\) graphically in Figure 3.
Figure 3: The tangent of a function y(x). (image not reproduced; it shows y(x) with the increments ∆x and ∆y at the point x1)
\(\frac{\Delta y}{\Delta x}\) is the slope of the tangent in point x1. We can now find the derivative \(\frac{dy}{dx}\) by letting ∆x, the
discrete change of x, get smaller and smaller, formally
\[ \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \frac{dy}{dx} \]   (3.1)
Equation (3.1) reads: The limit of \(\frac{\Delta y}{\Delta x}\), if ∆x approaches 0, is \(\frac{dy}{dx}\).
dx is now an infinitesimally small change of x. The slope of the function y(x) in dx, i.e. in the
point x1, is \(\frac{dy}{dx}\).
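Equation (3.1) can be watched at work numerically: the discrete ratio ∆y/∆x settles down as ∆x shrinks. The following Python sketch uses the function y = x^3 at x1 = 2 purely as an illustration; both choices are ours. The exact slope there is 3 x 2^2 = 12.

```python
def difference_quotient(f, x, dx):
    """Discrete slope Δy/Δx of f between x and x + dx."""
    return (f(x + dx) - f(x)) / dx

def cube(x):
    return x ** 3

x1 = 2.0
exact = 3 * x1 ** 2  # known derivative of x^3, evaluated at x1
for dx in (1.0, 0.1, 0.001, 1e-6):
    approx = difference_quotient(cube, x1, dx)
    print(f"dx={dx:<8} slope={approx:.6f}  error={approx - exact:.6f}")
```

With ∆x = 1 the ratio is 19; by ∆x = 0.000001 it is within a few millionths of 12.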
3.1.1. Differentiation rules
There are several major rules for finding a derivative
1) Power rule
If \(y = x^n\) → \(\frac{dy}{dx} = n x^{n-1}\), n ≠ 0. Example: If \(y = x^3\) → \(\frac{dy}{dx} = 3x^2\)
2) Constant factor rule
This rule allows us to leave a constant ‘a’ unchanged when differentiating. Hence we have
If \(y = a x^n\) → \(\frac{dy}{dx} = a\, n\, x^{n-1}\), n ≠ 0. Example: If \(y = 2x^3\) → \(\frac{dy}{dx} = 2 \times 3x^2 = 6x^2\)
3) Sum rule
This rule states that in a function which consists of two or more sum terms, each sum term can
be differentiated individually. Hence for two sum terms, we have
If y(x) = u(x) + v(x) → \(\frac{dy}{dx} = \frac{du}{dx} + \frac{dv}{dx}\). Example: If \(y = 2x^3 + 3x^2\) → \(\frac{dy}{dx} = 6x^2 + 6x\)
4) Product rule
If y(x) = u(x) v(x) → \(\frac{dy}{dx} = \frac{du}{dx} v(x) + \frac{dv}{dx} u(x)\)
Example: \(y(x) = 2x^2 \cdot 3x\) → \(\frac{dy}{dx} = 4x \cdot 3x + 3 \cdot 2x^2 = 12x^2 + 6x^2 = 18x^2\)
How could we have derived this result faster? See problem 3.5
5) Quotient Rule
If \(y(x) = \frac{u(x)}{v(x)}\) → \(\frac{dy}{dx} = \frac{\frac{du}{dx} v(x) - \frac{dv}{dx} u(x)}{[v(x)]^2}\)
Example: If \(y(x) = \frac{3x^2}{2x}\) → \(\frac{dy}{dx} = \frac{6x \cdot 2x - 2 \cdot 3x^2}{4x^2} = \frac{12x^2 - 6x^2}{4x^2} = \frac{6x^2}{4x^2} = \frac{3x^2}{2x^2} = \frac{3}{2}\)
How could we have derived this result faster? See problem 3.7
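6) Chain rule
If y = f(u) and u = g(x) → \(\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}\)
Example:
If \(y(x) = \sqrt{1 + 2x^3} = (1 + 2x^3)^{\frac{1}{2}}\) → \(\frac{dy}{dx} = \frac{1}{2}(1 + 2x^3)^{-\frac{1}{2}}\, 6x^2 = (1 + 2x^3)^{-\frac{1}{2}}\, 3x^2 = \frac{3x^2}{(1 + 2x^3)^{\frac{1}{2}}} = \frac{3x^2}{\sqrt{1 + 2x^3}}\)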
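The worked examples for the product, quotient and chain rule can be sanity-checked numerically. The sketch below compares each analytic result with a central difference at an arbitrary point x = 1.5; the helper `central_diff` and the test point are our own choices, and the chain rule example is taken as y = sqrt(1 + 2x^3).

```python
import math

def central_diff(f, x, h=1e-6):
    """Numerical slope of f at x from a small symmetric step."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.5
checks = {
    # rule: (function from the example, analytic derivative at x)
    "product":  (lambda t: 2 * t**2 * 3 * t,        18 * x**2),
    "quotient": (lambda t: 3 * t**2 / (2 * t),      1.5),
    "chain":    (lambda t: math.sqrt(1 + 2 * t**3),
                 3 * x**2 / math.sqrt(1 + 2 * x**3)),
}
for name, (f, analytic) in checks.items():
    print(name, round(central_diff(f, x), 6), round(analytic, 6))
```

In each row the numerical and the analytic value should agree to about six decimals.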
7) Some specific Differentiation rules often applied in Finance
If y(x) = ln(x) → \(\frac{dy}{dx} = \frac{1}{x}\)
If \(f(x) = e^{y(x)}\) ⇒ \(\frac{d e^{y(x)}}{dx} = \frac{dy}{dx} e^{y(x)}\). Example: If \(f(x) = e^{2x}\) ⇒ \(\frac{d e^{2x}}{dx} = 2 e^{2x}\)
If \(y(x) = e^x\) ⇒ \(\frac{dy}{dx} = e^x\). Let’s look at this convenient derivative geometrically. The function \(y(x) = e^x\) is displayed in Figure 3.1
Figure 3.1: The function \(y(x) = e^x\) (image not reproduced)
From Figure 3.1 we can observe that if \(y(x) = e^x\) ⇒ \(\frac{dy}{dx} = e^x\).
In particular,
y(-5) ≈ 0 and \(\frac{dy}{dx}\) at x=-5 is ≈ 0 as well. I.e. at x=-5 the function \(e^x\) has a y-axis value of close to zero
and the slope at x=-5 is also close to zero, as seen from the tangent (in red).
y(0) = 1 and \(\frac{dy}{dx}\) at x=0 is 1 as well. I.e. at x=0 the function \(e^x\) has the y-axis value of 1, and the slope
at x=0 is also 1, as seen from the tangent (in blue).
y(1) = e and \(\frac{dy}{dx}\) at x=1 is e as well. I.e. at x=1 the function \(e^x\) has the y-axis value of e = 2.71828…
and the slope at x=1 is also e = 2.71828…, as seen from the tangent (in green).
3.1.2 Partial Mathematical Derivative
Definition: Given is a function with several independent variables. A partial mathematical
derivative is the derivative of that function with respect to one variable, assuming the
other variables are constant.
The notation for the partial derivative operator is typically ∂, pronounced as ‘d’ or ‘del’ or
‘partial’. ∂ is not a letter from the Greek alphabet and should not be confused with the
Greek delta δ or sigma σ.
Let’s look at an example of a partial derivative.
If y(x,z) = 2x + xz + 4z ⇒ \(\frac{\partial y}{\partial x} = 2 + z\). I.e. the partial derivative of the function y(x,z) = 2x + xz + 4z
with respect to x is 2 + z, assuming the variable z is constant.
If y(x,z) = 2x + xz + 4z ⇒ \(\frac{\partial y}{\partial z} = x + 4\). I.e. the partial derivative of the function y(x,z) = 2x + xz + 4z
with respect to z is x + 4, assuming the variable x is constant.
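"Holding the other variable constant" translates directly into code: bump one argument and leave the other untouched. A minimal Python sketch for y(x,z) = 2x + xz + 4z follows; the helper names and the evaluation point x = 1, z = 5 are arbitrary choices of ours.

```python
def y(x, z):
    return 2 * x + x * z + 4 * z

def partial_x(f, x, z, h=1e-6):
    """Bump x only; z is held constant."""
    return (f(x + h, z) - f(x - h, z)) / (2 * h)

def partial_z(f, x, z, h=1e-6):
    """Bump z only; x is held constant."""
    return (f(x, z + h) - f(x, z - h)) / (2 * h)

x, z = 1.0, 5.0
print(round(partial_x(y, x, z), 6))  # 7.0, which equals 2 + z
print(round(partial_z(y, x, z), 6))  # 5.0, which equals x + 4
```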
We use partial derivatives a lot in Finance. For example we partially differentiate the Nobel Prize-winning
Black-Scholes-Merton option model. For a call option, we have
\[ C = S_0 e^{-qT} N(d_1) - K e^{-rT} N(d_2) \]   (3.2)
where
\[ d_1 = \frac{\ln\left(\frac{S_0 e^{-qT}}{K e^{-rT}}\right) + \frac{1}{2}\sigma^2 T}{\sigma \sqrt{T}}, \qquad d_2 = d_1 - \sigma \sqrt{T} \]
The first partial derivative of equation (3.2) with respect to S, \(\frac{\partial C}{\partial S}\), is called ‘Delta’. It tells us how
the call price changes for an infinitesimally small change of S, assuming all other variables q, T, r,
and σ are constant. See problem 3.12.
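To get a feeling for the Delta, equation (3.2) can be coded and the stock price bumped by a tiny amount while everything else stays fixed. This is a sketch only: the input values are hypothetical, N(.) is computed with Python's `math.erf`, and the derivative is approximated numerically rather than derived analytically.

```python
import math

def norm_cdf(x):
    """Standard normal CDF N(x), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bsm_call(S0, K, T, r, q, sigma):
    """Call price of equation (3.2)."""
    d1 = (math.log((S0 * math.exp(-q * T)) / (K * math.exp(-r * T)))
          + 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return (S0 * math.exp(-q * T) * norm_cdf(d1)
            - K * math.exp(-r * T) * norm_cdf(d2))

def delta_by_bumping(S0, K, T, r, q, sigma, h=1e-4):
    """Numerical dC/dS: change S0 slightly, keep all other inputs constant."""
    up = bsm_call(S0 + h, K, T, r, q, sigma)
    down = bsm_call(S0 - h, K, T, r, q, sigma)
    return (up - down) / (2.0 * h)

# Hypothetical inputs, chosen only for illustration
args = dict(S0=100.0, K=100.0, T=1.0, r=0.05, q=0.0, sigma=0.2)
print(round(bsm_call(**args), 4))          # 10.4506
print(round(delta_by_bumping(**args), 4))  # 0.6368
```

A Delta of about 0.64 means that the call price rises by roughly 64 cents if the stock rises by one dollar, for these inputs.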
3.1.3 Finding the maximum and minimum of a function with a mathematical derivative
We can find the maximum or minimum of a function by differentiating the function, setting the
derivative to zero and solving for x. If the second derivative is >0, we found a minimum; if the
second derivative is <0, we found a maximum.
Let’s look at an example. We have the function \(y(x) = x^2\) → \(\frac{dy}{dx} = 2x\). We set this to zero, i.e.
2x = 0. The solution is x=0. So at x=0, we have a minimum or maximum of the function
\(y(x) = x^2\). The second derivative is \(\frac{d^2y}{dx^2} = 2\). Since 2 > 0, we have found a minimum at x=0,
which is verified by Figure 3.2.
Figure 3.2: The function \(y(x) = x^2\) with a minimum at x=0 (image not reproduced; x runs from -3 to 3, y from 0 to 10)
For the function \(y(x) = x^{-2}\), the function value grows without bound as x approaches 0, see problem 3.13.
3.2 Integration
Integration is the reverse operation to differentiation. It was developed by Isaac Newton
and Gottfried Leibniz in the 17th century. In fact, there was quite a quarrel between the two as
to who the primary inventor was. Leibniz published his results first, but may have peeked at
Newton’s notes while in London. This reminds us to be honest. You don’t want to go down in
history as a plagiarist…
There are several different types of integration concepts such as the Riemann Integral,
Lebesgue Integral, the Riemann-Stieltjes Integral and more. Generally, we can give a heuristic
(meaning informal) definition as
Definition: A mathematical Integral measures the area under a function, i.e. the area which is bounded
by the function y = f(x) and the x-axis (y = 0), and by the vertical lines x=a and x=b.
In Riemann notation we express an integral as
\[ \int_a^b f(x)\,dx \]   (3.2)
In (3.2) the integral sign ∫ is a stretched letter ‘s’, coming from the word ‘sum’. In fact, in a
sum we add discrete units ∆x, whereas in an integral we add infinitesimally small units dx.
Actually, the Riemann integral can be derived by starting with the ‘Riemann sum’, which adds
units of ∆x, and then shrinking the ∆x to get the dx. See
http://en.wikipedia.org/wiki/Riemann_integral for more details.
dx in the integral (3.2) is just notation, indicating that we are summing up the
infinitesimally small values dx. x is a place holder, also called a ‘dummy variable’, since it is
replaced by the limits a and b during the process of integration. x=a and x=b are vertical limits,
the beginning and end of the domain of integration.
Example: Let’s look again at the function \(y(x) = x^2\). Let’s graphically show the integral of
\(y(x) = x^2\) in the domain a=1 and b=2:
Figure 3.3: The integral of the function \(y(x) = x^2\) for the domain a=1, b=2 is \(\int_1^2 x^2\,dx\) (image not reproduced)
Let’s calculate the integral \(\int_1^2 x^2\,dx\). We have
\[ \int_a^b f(x)\,dx = F(b) - F(a) \]   (3.3)
where, for a power function \(f(x) = x^q\), F(x) is
\[ F(x) = \frac{x^{q+1}}{q+1} + C, \quad q \neq -1 \]   (3.4)
and C is the arbitrary constant of integration. Applying equation (3.4) to our example \(y(x) = x^2\),
we have q=2.
Let’s apply equations (3.3) and (3.4) to integrate the function \(y(x) = x^2\) in the domain a=1, b=2,
with C=0:
\[ \int_a^b f(x)\,dx = \int_1^2 x^2\,dx = F(2) - F(1) = \frac{2^3}{3} - \frac{1^3}{3} = \frac{7}{3} = 2\tfrac{1}{3} \]
This means that the area under the function \(y(x) = x^2\) from x=1 to x=2 is \(2\tfrac{1}{3}\), compare Figure
3.3.
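The idea of adding up thin strips can be tried out numerically. The sketch below compares the exact result from equations (3.3) and (3.4) with a Riemann sum for y = x^2 between 1 and 2; the function names and the midpoint variant of the sum are our own choices for illustration.

```python
def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum of f on [a, b] using n strips of width dx."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def F(x, q=2, C=0.0):
    """Antiderivative of x**q from equation (3.4), valid for q != -1."""
    return x ** (q + 1) / (q + 1) + C

exact = F(2) - F(1)  # equation (3.3): 7/3
print(exact)
for n in (1, 10, 1000):
    # the sum approaches 7/3 as the strips get thinner
    print(n, riemann_sum(lambda x: x ** 2, 1.0, 2.0, n))
```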
3.2.1 The one-dimensional Integral
The domain of the integral is often an area, so it is two-dimensional. However, the
domain of integration can also be higher dimensional, such as a volume (three-dimensional) or n-dimensional, n>3. Also, we can apply integration for a one-dimensional function, i.e. a real line.
We do this in Finance when we derive the expected value of an asset, which follows the
Geometric Brownian motion (GBM). The GBM is
\[ \frac{dS}{S} = \mu_S\,dt + \sigma_S\,\varepsilon_t \sqrt{dt} \]   (3.5)
where S is an asset price such as a stock price, hence dS/S is the relative change of S (see chapter 2), μS is
the average growth rate of S, σS is the volatility of S, and εt is a random drawing from a
standardized normal distribution at time t (see chapter 4 for more details).
Let’s find the expected value of the asset S in equation (3.5). The expected value of the
normally distributed variable ε is 0, formally E(ε) = 0. Equation (3.5) then reduces to
\[ \frac{dS}{S} = \mu_S\,dt \]   (3.6)
To derive the expected value of S at a future time T, E(ST), we sum up, i.e. integrate the
infinitesimally small units in time μS dt. Hence we have
\[ \int_0^T E\left(\frac{dS}{S}\right) = \int_0^T \mu_S\,dt \]   (3.7)
For the left side of equation (3.7), we apply \(\int \frac{dS}{S} = \ln(S)\). For the right side of equation (3.7), we
apply that the integral of a constant a is \(\int a\,dx = a x\) (see footnote 2). So for the right side we have
\(\int_0^T \mu_S\,dt\) = F(T) – F(0) = μS T – μS x 0 = μS T. Hence, when integrating equation (3.7), we derive
\[ \ln[E(S_T)] = \mu_S T + \ln(S_0) \]   (3.7)
where ln(S0) is the integral constant C.
Raising e to the power of both sides of equation (3.7), we derive
\[ e^{\ln[E(S_T)]} = e^{[\mu_S T + \ln(S_0)]} \]
Applying \(e^{\ln(x)} = x\) and \(e^{(x+y)} = e^x e^y\), we derive for the expected value of S at time T
\[ E(S_T) = S_0\, e^{\mu_S T} \]   (3.8)
Equation (3.8) states that the expected value of the asset S at time T is simply the starting value
S0 (today’s value) multiplied with \(e^{\mu_S T}\). For example, if a stock today has a price of S0=$100 and
the expected growth rate of the stock is 10%, in T=1 year the stock price is expected to be
E(ST) = 100 x 2.71828 ^ 0.1 = $110.52.
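Equation (3.8) is a one-liner in code. The Python sketch below reproduces the $110.52 from the example; the function name `expected_price` is ours.

```python
import math

def expected_price(S0, mu, T):
    """Expected asset value at time T from equation (3.8): S0 * e^(mu * T)."""
    return S0 * math.exp(mu * T)

print(round(expected_price(100.0, 0.10, 1.0), 2))  # 110.52

# Consistency check against the log form: ln E(S_T) = mu * T + ln(S0)
lhs = math.log(expected_price(100.0, 0.10, 1.0))
rhs = 0.10 * 1.0 + math.log(100.0)
print(abs(lhs - rhs) < 1e-12)  # True
```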
Footnote 2: We could apply equation (3.4) to show that \(\int a\,dx = a x\). From equation (3.4) with q = 0, we can write
\(F(x) = \frac{x^{0+1}}{0+1} = x\). Since \(x^0 = 1\), we have \(\int a\,x^0\,dx = a x + C\)
3.2.2 Some popular Integrals in Finance
Here is a list of some often applied Integrals in Finance:
\(\int a\,dx = a x + C\), where a is a constant, a ϵ R and C is the constant of integration
\(\int e^x\,dx = e^x + C\) (see chapter 3.1.1. for details)
\(\int e^{ax}\,dx = \frac{1}{a} e^{ax} + C\)
\(\int \ln(x)\,dx = x \ln(x) - x + C\)
\(\int \varphi(x)\,dx = \Phi(x) + C\), where φ is the pdf (probability density function) and Φ is the cdf
(cumulative distribution function) of a standard normal distribution, see chapter 4 for details.
Problems for Chapter 3: Calculus
(The answers are available upon request. Email Gunter Meissner at [email protected])
3.1 What does the derivative \(\frac{dy}{dx}\) tell us?
3.2 Explain the equation \(\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \frac{dy}{dx}\) briefly.
3
3.3 Differentiate x 2
3
3.4 Differentiate x 4  4x 2  2x  3
3.5 If \(y(x) = 2x^2 \cdot 3x\) → \(\frac{dy}{dx} = 4x \cdot 3x + 3 \cdot 2x^2 = 12x^2 + 6x^2 = 18x^2\). We could have derived this result
faster by using \(2x^2 \cdot 3x = 6x^3\). Differentiating \(6x^3\) = ?
3.6 Differentiate \(y(x) = x^3 \ln(x)\)
3.7 If \(y(x) = \frac{3x^2}{2x}\) → \(\frac{dy}{dx} = \frac{6x \cdot 2x - 2 \cdot 3x^2}{4x^2} = \frac{12x^2 - 6x^2}{4x^2} = \frac{6x^2}{4x^2} = \frac{3x^2}{2x^2} = \frac{3}{2}\). We could have derived
this result faster by using \(\frac{3x^2}{2x} = \frac{3}{2} x\). Differentiating \(\frac{3}{2} x\) = ?
3.8 Differentiate y(x) 
x3
.
2x  2
3.9 Differentiate y(x)  (2x  1)3
3.10 Differentiate y(x)  ex ln( x )
3.11 Differentiate y(x, z)  4x 2  3z partially with respect to z
3.12. OK. This is for the courageous student. In finance, we partially differentiate the Nobel
Prize-winning Black-Scholes-Merton option pricing model
\[ C = S_0 e^{-qT} N(d_1) - K e^{-rT} N(d_2) \]   (3.2)
where
\[ d_1 = \frac{\ln\left(\frac{S_0 e^{-qT}}{K e^{-rT}}\right) + \frac{1}{2}\sigma^2 T}{\sigma \sqrt{T}}, \qquad d_2 = d_1 - \sigma \sqrt{T} \]
to find the ‘Greeks’. The Greeks consist of the Delta \(\frac{\partial C}{\partial S}\), the Gamma \(\frac{\partial^2 C}{\partial S^2}\), the Vega \(\frac{\partial C}{\partial \sigma}\), and
the Theta \(\frac{\partial C}{\partial T}\). Give it a try… don’t get frustrated now
3.13 Plot the function \(y(x) = \frac{1}{x^2}\) in an Excel spreadsheet. Find the maximum of the function
by differentiating \(\frac{1}{x^2}\), setting it to zero and solving for x. Why did you find a maximum?
3.14 Solve \(\int_a^b 2x^2\,dx\) for the domain a = -1, b = 1
3.13 Solve \(\int a\,dx\), where a ϵ R
3.15 Solve \(\int 0\,dx\)
Is this a trick question? Ahhh, not really..
3.16 Solve \(\int e^{2x}\,dx\)
4. Statistics
Statistics can be fun. We get to draw colorful graphs called distributions, figure out the
math for them, integrate them and see if they fit the real world. Statistics can be divided into
two main branches 1) Descriptive statistics, which deals with collecting and interpreting data
and 2) Analytical or mathematical statistics, which is mainly probability theory, but also the
design of experiments such as how to forecast election results. In this refresher we will
concentrate on Descriptive Statistics.
4.1. Distributions and Moments
Informal Definition: A probability density function (pdf) is a function which assigns
probabilities to the outcomes of random events. Importantly, pdfs are non-negative
everywhere and the summation over all outcomes, i.e. the integral of the entire function, is 1.
In Finance, for convenience, we often use the normal distribution to model variables
such as stock prices, interest rates, commodity prices etc. The normal distribution, also called
the bell-shaped or Gaussian curve, named after Carl Friedrich Gauss, looks as follows:
Figure 4.1: PDF of the standard normal distribution
The normal distribution is quantified with
\[ f(x; \mu, \sigma^2) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} \]   (4.1)
where μ is the mean, σ2 is the variance, and σ is the standard deviation. In Figure 4.1, we see a
special case of the normal distribution, the standard normal distribution with a mean μ = 0 and
a variance σ2 = 1. In this case, equation (4.1) reduces to
\[ f(x; 0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} x^2} \]   (4.2)
In stochastic processes (stochastic means random, so non-deterministic), we often
sample from a standard normal distribution. This means that we randomly draw a sample from
the x-axis of a normal distribution. The notation for the sample is typically ε. It can be derived
as =normsinv(rand()) in Excel or randn() in MatLab. As seen from Figure 4.1, the expected value
of ε is E(ε) = 0. See also problem 4.7.
If we integrate a probability density function, we derive the cumulative distribution function
(CDF), as seen in Figure 4.2:
Figure 4.2: CDF of a standard normal distribution
Importantly, a CDF at a certain point x* gives the probability of the random variable falling in
the interval (-∞, x*]. In other words, it is the probability of the event to have a value of ≤ x*.
The CDF of a standard normal distribution shown in Figure 4.2 cannot be quantified with
elementary functions, but we can use the error function erf to derive it
\[ F(x; \mu, \sigma^2) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x - \mu}{\sigma \sqrt{2}}\right)\right] \]   (4.3)
where \(\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt\). t is just a place holder, the dummy variable of integration, which is
replaced with the limits 0 and x in the process of integration (the attentive reader remembers
this from chapter 3!!!)
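Equations (4.1) and (4.3) can be typed in almost verbatim. The following Python sketch uses `math.erf` from the standard library for the error function; the function names `normal_pdf` and `normal_cdf` are our own.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density, equation (4.1)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal CDF via the error function, equation (4.3)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

print(round(normal_pdf(0.0), 4))   # 0.3989, the peak of the bell curve
print(round(normal_cdf(0.0), 4))   # 0.5, half of the probability lies below the mean
print(round(normal_cdf(1.96), 4))  # 0.975
```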
In Finance we often use the log-normal distribution. The PDF of a lognormal distribution is
\[ f_l(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln(x) - \mu}{\sigma}\right)^2} \]   (4.4)
where μ and σ are the mean and standard deviation of ln(x).
Figure 4.3: The PDF of a lognormal distribution with μ=0 and σ=1. (image not reproduced; fl(x; μ,σ) runs from 0 to 0.6, x from 0.01 to 3.01)
We often assume in Finance that an asset such as a stock grows in time according to a log-normal distribution. In Figure 4.4 we observe that the asset S is expected to grow with the
growth rate μ to E(ST). The value of E(ST) was derived in equation (3.8). We also observe from
Figure 4.4 that the value which an asset price can take in the future falls within the lognormal
distribution. In particular, we observe from Figure 4.4 that the asset price S can increase
sharply, however with a low probability P1. The asset price can also decrease sharply, however
with the low probability P2. In a lognormal distribution the asset price cannot become negative,
which is in line with most asset prices such as stocks, bonds and commodities in reality.
Hence altogether, many researchers believe that the lognormal distribution is a good
representation of financial assets in the real world. However, this is an empirical question and
depends on the asset, time frame, and geography. Some researchers may disagree and prefer
the normal distribution to model assets. See problem 4.8 for more.
Figure 4.4: An asset S, represented in time with the log-normal distribution.
Moments
Statistical distributions are characterized by their moments. The first four moments are
1st Moment: Mean; represented by the expected value of a distribution, E(ST) in Figure 4.4
(for a calculation see below)
2nd Moment: Variance; loosely speaking how ‘wide’ the distribution is (for a calculation see
below)
3rd Moment: Skewness, i.e. how lopsided or asymmetric a distribution is
4th Moment: Excess Kurtosis, i.e. how fat the tails of a distribution are
By definition the standard normal distribution of Figure 4.1 and 4.2 has a first, third, and fourth
moment (defined as excess kurtosis, not kurtosis) of zero and the second moment is 1.
The standard lognormal distribution in Figure 4.3 and 4.4 has a
1st Moment (mean) of \(e^{\mu + \frac{1}{2}\sigma^2}\). Hence with μ=0 and σ=1, it follows that the mean is 1.65,
compare Figure 4.1.
2nd Moment (variance) of \((e^{\sigma^2} - 1)\, e^{2\mu + \sigma^2}\). Hence with μ=0 and σ=1, it follows that the variance
is 4.6708.
3rd Moment (skewness). From Figure 4.3. and 4.4 we observe that the skewness is bigger than
0, since the distribution is skewed to the right. In fact the skewness of a log-normal distribution
is \((e^{\sigma^2} + 2) \sqrt{e^{\sigma^2} - 1}\). Hence with μ=0 and σ=1, we derive a skewness of 6.1849.
4th Moment (excess kurtosis). From Figure 4.3 and 4.4, we can already conclude that the excess kurtosis of
the lognormal distribution is > 0, since the distribution shows a fat right tail. The excess kurtosis of a
log-normal distribution is \(e^{4\sigma^2} + 2e^{3\sigma^2} + 3e^{2\sigma^2} - 6\). For σ=1, this results in an excess kurtosis of 110.94,
showing indeed that the lognormal distribution has a very fat tail. We recommend a diet..
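The four lognormal moments are quick to verify with a few lines of Python. The sketch below simply evaluates the closed-form expressions for μ=0 and σ=1; the function name `lognormal_moments` is ours.

```python
import math

def lognormal_moments(mu=0.0, sigma=1.0):
    """Mean, variance, skewness and excess kurtosis of a lognormal distribution."""
    s2 = sigma ** 2
    mean = math.exp(mu + 0.5 * s2)
    variance = (math.exp(s2) - 1.0) * math.exp(2.0 * mu + s2)
    skewness = (math.exp(s2) + 2.0) * math.sqrt(math.exp(s2) - 1.0)
    excess_kurtosis = (math.exp(4 * s2) + 2 * math.exp(3 * s2)
                       + 3 * math.exp(2 * s2) - 6.0)
    return mean, variance, skewness, excess_kurtosis

mean, variance, skewness, excess_kurtosis = lognormal_moments()
print(round(mean, 4))             # 1.6487
print(round(variance, 4))         # 4.6708
print(round(skewness, 4))         # 6.1849
print(round(excess_kurtosis, 2))  # 110.94
```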
4.2 Time Series and Correlation
In Finance we often analyze time series of financial assets such as stocks, bonds, currencies,
commodities etc. We typically look at the correlation between these asset time series to assess
the profit potential and the risk.
Definition: Financial correlations measure how two or more financial assets move together in
time.
Let’s analyze two time series in an example: Let’s assume we have a portfolio of 2 stocks, A*
and B*. They have performed as in Table 4.1:
Year   Asset A*   Asset B*   Asset A* return in %   Asset B* return in %
2008   100        200
2009   120        230        20.00%                 15.00%
2010   108        460        -10.00%                100.00%
2011   190        410        75.93%                 -10.87%
2012   160        480        -15.79%                17.07%
2013   280        380        75.00%                 -20.83%
Table 4.1: Performance of a portfolio with two assets
The return of an asset is expressed as a percentage, i.e.
$\mathrm{Return}(S_t) = \frac{S_t - S_{t-1}}{S_{t-1}}$    (4.5)
where St is the price of asset S at time t. So the return of asset A* at the end of 2009 is
(120-100)/100= 0.2 = 20%. See also chapter 2.1 on the approximation of the return in equation
(4.5) with the logarithm.
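As an illustration of equation (4.5), the following Python sketch recomputes the return columns of Table 4.1 from the year-end prices. The helper name simple_returns is an assumption made for this example.

```python
# Year-end prices from Table 4.1 (2008 to 2013).
prices_a = [100, 120, 108, 190, 160, 280]  # Asset A*
prices_b = [200, 230, 460, 410, 480, 380]  # Asset B*


def simple_returns(prices):
    """Apply equation (4.5): (S_t - S_{t-1}) / S_{t-1} for consecutive prices."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


returns_a = simple_returns(prices_a)
returns_b = simple_returns(prices_b)

# One return per year from 2009 to 2013 (the first price has no predecessor).
for year, ra, rb in zip(range(2009, 2014), returns_a, returns_b):
    print(f"{year}: A = {ra:7.2%}, B = {rb:7.2%}")
```

Six prices yield only five returns, which is why the return columns of Table 4.1 start in 2009.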
Let’s define the return of asset A* as A, and the return of asset B* as B. The average
return of asset A* for the time frame 2009 to 2013 is µA = 29.03%; for asset B* the average
return is µB = 20.07%. If we assign a weight to asset A, wA, and a weight to asset B, wB, the
portfolio return is
µPort = wA µA + wB µB
(4.6)
where wA + wB = 1
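A short Python sketch of the average returns and of equation (4.6) follows. The returns are the rounded values from Table 4.1, and the equal weights wA = wB = 0.5 are an assumption chosen for illustration; the function name portfolio_return is hypothetical.

```python
# Annual returns 2009 to 2013, rounded as in Table 4.1.
returns_a = [0.2000, -0.1000, 0.7593, -0.1579, 0.7500]
returns_b = [0.1500, 1.0000, -0.1087, 0.1707, -0.2083]

# Average returns.
mu_a = sum(returns_a) / len(returns_a)
mu_b = sum(returns_b) / len(returns_b)


def portfolio_return(w_a, w_b, mu_a, mu_b):
    """Equation (4.6): weighted average of the asset returns, with w_a + w_b = 1."""
    if abs(w_a + w_b - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")
    return w_a * mu_a + w_b * mu_b


# Assumed equal weights, for illustration only.
mu_port = portfolio_return(0.5, 0.5, mu_a, mu_b)
print(f"mu_A = {mu_a:.2%}, mu_B = {mu_b:.2%}, mu_Port = {mu_port:.2%}")
```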
The standard deviation of returns, termed volatility, is derived for an asset A with
equation
$\sigma_A = \sqrt{\frac{1}{n-1} \sum_{t=1}^{n} (A_t - \mu_A)^2}$    (4.7)
where At is the return of asset A* at time t and n is the number of observed time units. A
standard deviation or volatility measures how far the numbers in the time series diverge from
their mean. From our example in Table 4.1, we find that the standard deviation of the returns of
asset A* is σA = 44.51% and that of asset B* is σB = 47.58% (try to derive this yourself. If you
can’t I can send a spreadsheet with the solution). The covariance of returns for assets A and B
is derived with equation
$COV_{AB} = \frac{1}{n-1} \sum_{t=1}^{n} (A_t - \mu_A)(B_t - \mu_B)$    (4.8)
The covariance tells us how asset return A and asset return B move together in time. For
our example in Table 4.1 we derive COVAB = -0.1568. Since the covariance is negative, we can
conclude that on average, if A increases, B decreases, and vice versa. An even easier to interpret
correlation measure is the Pearson correlation coefficient ρ, which is a standardized
covariance, i.e. it takes values between -1 and +1. The equation for ρ is
$\rho = \frac{COV_{AB}}{\sigma_A \sigma_B}$    (4.9)
For our example in Table 4.1, ρ = -0.1568 / (0.4451 x 0.4758) = -0.7403, confirming that the
asset returns A and B are negatively correlated. In fact the negative correlation is quite strong,
since the negative correlation -0.7403 is quite close to -1.
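The volatility, covariance and Pearson correlation of the example can be recomputed with a few lines of Python. This is a minimal sketch using the rounded returns of Table 4.1; the helper names sample_std and sample_cov are illustrative, and both use the n - 1 divisor of equations (4.7) and (4.8).

```python
import math

# Annual returns 2009 to 2013, rounded as in Table 4.1.
returns_a = [0.2000, -0.1000, 0.7593, -0.1579, 0.7500]
returns_b = [0.1500, 1.0000, -0.1087, 0.1707, -0.2083]


def mean(xs):
    return sum(xs) / len(xs)


def sample_cov(xs, ys):
    """Equation (4.8): sample covariance with divisor n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def sample_std(xs):
    """Equation (4.7): sample standard deviation (square root of the variance)."""
    return math.sqrt(sample_cov(xs, xs))


sigma_a = sample_std(returns_a)
sigma_b = sample_std(returns_b)
cov_ab = sample_cov(returns_a, returns_b)
rho = cov_ab / (sigma_a * sigma_b)  # Pearson correlation, equation (4.9)

print(f"sigma_A = {sigma_a:.2%}, sigma_B = {sigma_b:.2%}")
print(f"COV_AB = {cov_ab:.4f}, rho = {rho:.4f}")
```

Small differences in the last digit relative to the text can arise because the inputs here are rounded to four decimals.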
We can calculate the standard deviation for the returns of our two-asset portfolio as
$\sigma_{Port} = \sqrt{w_A^2 \sigma_A^2 + w_B^2 \sigma_B^2 + 2 w_A w_B COV_{AB}}$    (4.9)
With equal weights, i.e. wA = wB = 0.5, the example in Table 4.1 results in σPort = 16.66%.
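The equal-weight portfolio volatility can be reproduced as follows. This Python sketch starts from the rounded returns of Table 4.1 and uses the standard library's statistics.stdev for the sample standard deviation; the function name portfolio_volatility is an assumption for this example.

```python
import math
import statistics

# Annual returns 2009 to 2013, rounded as in Table 4.1.
returns_a = [0.2000, -0.1000, 0.7593, -0.1579, 0.7500]
returns_b = [0.1500, 1.0000, -0.1087, 0.1707, -0.2083]

# Sample standard deviations (divisor n - 1).
sigma_a = statistics.stdev(returns_a)
sigma_b = statistics.stdev(returns_b)

# Sample covariance (divisor n - 1).
mu_a = statistics.mean(returns_a)
mu_b = statistics.mean(returns_b)
cov_ab = sum(
    (a - mu_a) * (b - mu_b) for a, b in zip(returns_a, returns_b)
) / (len(returns_a) - 1)


def portfolio_volatility(w_a, w_b, sigma_a, sigma_b, cov_ab):
    """Two-asset portfolio standard deviation from weights, volatilities and covariance."""
    variance = (
        w_a ** 2 * sigma_a ** 2 + w_b ** 2 * sigma_b ** 2 + 2 * w_a * w_b * cov_ab
    )
    return math.sqrt(variance)


sigma_port = portfolio_volatility(0.5, 0.5, sigma_a, sigma_b, cov_ab)
print(f"sigma_Port = {sigma_port:.2%}")
```

The portfolio volatility is far below the volatility of either asset because the negative covariance term offsets most of the two variance terms.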
Importantly, the standard deviation (or its square, the variance), is interpreted in
finance as risk. The higher the standard deviation, the higher is the risk of an asset or a
portfolio. Is standard deviation a good measure of risk? The answer is: It’s not great, but it’s
pretty much the only one we have. A high standard deviation may mean high upside potential!
So it penalizes possible profits! But high standard deviation naturally also means high downside
risk. In particular, risk averse investors will not like a high standard deviation, i.e. high
fluctuation of their returns.
An informative performance measure of an asset or a portfolio is the risk-adjusted
return, i.e. the return/risk ratio. For a portfolio, it is µPort/σPort, which we derived in equations
(4.6) and (4.9). In Figure 4.5 we observe one of the few ‘free lunches’ in finance: The lower,
preferably negative, the correlation of the assets in a portfolio, the higher is the return/risk
ratio. For a rigorous proof, see Markowitz (1952) and Sharpe (1964).
[Figure: "Mue/Sigma with respect to Correlation". Vertical axis: Mue/Sigma, 0% to 250%. Horizontal axis: Correlation, -1 to 1.]
Figure 4.5: The negative relationship between return µ / risk σ with respect to the correlation of
the assets in the portfolio ρ.
Figure 4.5 shows the high impact of correlation on the return/risk ratio. A high negative
correlation results in a return/risk ratio of close to 250%, whereas a high positive correlation
results in a 50% ratio.
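The qualitative relationship between correlation and the return/risk ratio can be sketched in Python. The inputs below are the means and volatilities of the Table 4.1 example with assumed equal weights; the parameters behind Figure 4.5 are not stated in the text, so this sketch illustrates the direction of the effect and does not reproduce the figure's values. The function name return_risk_ratio is hypothetical.

```python
import math


def return_risk_ratio(mu_a, mu_b, sigma_a, sigma_b, rho, w_a=0.5):
    """Portfolio return divided by portfolio volatility for a given correlation rho."""
    w_b = 1.0 - w_a
    mu_port = w_a * mu_a + w_b * mu_b
    cov_ab = rho * sigma_a * sigma_b  # covariance implied by the Pearson correlation
    variance = (
        w_a ** 2 * sigma_a ** 2 + w_b ** 2 * sigma_b ** 2 + 2 * w_a * w_b * cov_ab
    )
    return mu_port / math.sqrt(variance)


# Assumed inputs: values quoted in the text for the Table 4.1 example.
mu_a, mu_b = 0.2903, 0.2007
sigma_a, sigma_b = 0.4451, 0.4758

correlations = [-1.0, -0.5, 0.0, 0.5, 1.0]
ratios = [return_risk_ratio(mu_a, mu_b, sigma_a, sigma_b, rho) for rho in correlations]

for rho, ratio in zip(correlations, ratios):
    print(f"rho = {rho:+.1f}: return/risk = {ratio:.2f}")
```

Holding everything else fixed, the ratio falls as the correlation rises, because only the covariance term in the portfolio variance changes.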
Problems for Chapter 4: Statistics
(The answers are available upon request. Email Gunter Meissner at [email protected])
4.1 What are the characteristics of a statistical distribution?
4.2 What is the difference between a pdf and cdf?
4.3 What are the 4 moments of a distribution and what information does each moment have?
4.4 What are the numerical values of the 4 moments of a standard normal distribution?
4.5 Let’s assume we have two assets A and B. They have performed as follows:
Year   Asset A   Asset B   Asset A return in %   Asset B return in %
2008   100       200
2009   90        230
2010   130       200
2011   180       220
2012   160       240
2013   200       200
4.5a Calculate the return of each asset.
4.5b Which asset has performed better? (calculate the mean return)
4.5c Which asset is riskier? (calculate the volatility, i.e. standard deviation of returns)
4.5d What is the correlation between the asset returns? (calculate the covariance and
Pearson’s ρ)
4.5e Calculate the overall portfolio risk, assuming equal weights.
4.5f Calculate the risk-adjusted return of the portfolio, i.e. µPort/σPort.
4.6 For extra credit (who gets that credit anyway?): Which weights wA and wB of asset A and B
minimize the portfolio risk?
Hint: Start with equation (4.9), $\sigma_{Port} = \sqrt{w_A^2 \sigma_A^2 + w_B^2 \sigma_B^2 + 2 w_A w_B COV_{AB}}$. Substitute
wB = 1 - wA, then differentiate equation (4.9) partially with respect to wA. Set the derivative to 0
and solve for wA. Input σA, σB and COVAB and voila, you will have wAmin, the weight of asset A
that minimizes the portfolio volatility.
4.7 Derive the random drawing from a standard normal distribution ε in an Excel spreadsheet
via =NORMSINV(RAND()). Build a histogram to show that ε is standard normally distributed.
4.8 There is a discussion which distribution fits asset prices better, the normal distribution or
the lognormal distribution. What tool could we provide traders to address this problem?