§3.8 Eigenvalues and eigenvectors
Definition 8.1: We say λ is an eigenvalue of an n × n matrix A if there exists an x ≠ 0
such that Ax = λx. In this case, x is called an eigenvector corresponding to the
eigenvalue λ.
From Definition 8.1, the matrix
A - λI
is singular and
det(A - λI) = 0.
Theoretically we may find the eigenvalues λ by solving the characteristic
polynomial of the matrix A,
f(λ) = det(A - λI) = 0.    (8.1)
By expanding the determinant, we have
f(λ) = (-1)^n λ^n + (-1)^{n-1}(a_11 + a_22 + ⋯ + a_nn)λ^{n-1} + ⋯ + det A.    (8.2)
Obviously f(λ) = 0 has n roots λ_1, λ_2, …, λ_n in ℂ (counting multiplicities),
or
f(λ) = (-1)^n (λ - λ_1)(λ - λ_2)⋯(λ - λ_n).
Then it follows that
λ_1 + λ_2 + ⋯ + λ_n = a_11 + a_22 + ⋯ + a_nn = trace A,    (8.3)
λ_1 λ_2 ⋯ λ_n = det A.    (8.4)
When λ is complex, the corresponding eigenvector x ∈ ℂ^n.
Example 8.1: For the diagonal matrix A = diag(d_1, …, d_n), the eigenvalues are
λ_1 = d_1, …, λ_n = d_n.
For lower or upper triangular matrices A, the eigenvalues are λ_1 = a_11, λ_2 = a_22, …,
λ_n = a_nn.
Properties of eigenvalues and eigenvectors:
Let λ_1, …, λ_n be the eigenvalues of A. Then
(i) λ_1^k, …, λ_n^k are the eigenvalues of A^k.
(ii) Suppose α is an eigenvalue of A and β is an eigenvalue of B. Then in
general αβ is not an eigenvalue of AB, and α + β is not an eigenvalue of
A + B.
(iii) If λ ≠ 0 is an eigenvalue of A, then λ^{-1} is an eigenvalue of A^{-1}.
(iv) A and A^T have the same eigenvalues.
(v) If A and B are similar, i.e., there exists a nonsingular matrix P such that
B = P^{-1}AP, then A and B have the same eigenvalues.
Proof:
(i) Ax = λx ⇒ A^k x = A^{k-1}(Ax) = A^{k-1}(λx) = λA^{k-1}x = ⋯ = λ^k x.
(ii) Ax = αx and By = βy; unless A and B share the same eigenvectors, i.e.,
Ax = αx and Bx = βx for some x, in general αβ is not an eigenvalue of AB.
(iii) Ax = λx ⇒ λ^{-1}Ax = x ⇒ λ^{-1}y = A^{-1}y, where y = Ax.
(iv) Since
det(A - λI) = det((A - λI)^T) = det(A^T - λI),
det(A - λI) = 0 implies det(A^T - λI) = 0.
(v) det(B - λI) = det(P^{-1}AP - λI)
= det(P^{-1}(A - λI)P) = det P^{-1} · det(A - λI) · det P
= (1/det P) · det(A - λI) · det P = det(A - λI).
Hence A and B have the same characteristic polynomial.
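Properties (iv) and (v) can be checked numerically. A sketch with random matrices (the matrices are arbitrary, and a random P is assumed to be nonsingular, which holds generically):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))   # generically nonsingular

B = np.linalg.inv(P) @ A @ P      # B = P^{-1} A P, similar to A

# Sort the (possibly complex) eigenvalues so the spectra can be compared.
eig_A = np.sort_complex(np.linalg.eigvals(A))
eig_B = np.sort_complex(np.linalg.eigvals(B))
eig_At = np.sort_complex(np.linalg.eigvals(A.T))

print(np.allclose(eig_A, eig_B, atol=1e-6))   # property (v): similar matrices
print(np.allclose(eig_A, eig_At, atol=1e-6))  # property (iv): A and A^T
```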
Diagonalization of a matrix A:
Theorem 8.1: Suppose that the n × n matrix A has n linearly independent
eigenvectors. Then if these vectors are chosen to be the columns of a matrix S, it follows
that S^{-1}AS is a diagonal matrix Λ with the eigenvalues of A along its diagonal:
S^{-1}AS = Λ = diag(λ_1, λ_2, …, λ_n).
Proof:
AS = A[x_1, x_2, …, x_n] = [Ax_1, Ax_2, …, Ax_n]
= [λ_1 x_1, λ_2 x_2, …, λ_n x_n] = [x_1, x_2, …, x_n] diag(λ_1, …, λ_n)
= SΛ,
or A = SΛS^{-1}.
Remark 8.1: Not all matrices possess n linearly independent eigenvectors, and
therefore not all matrices are diagonalizable. For example, let
A = [0 1; 0 0],  det(A - λI) = λ^2.
Hence λ_1 = λ_2 = 0 and 0 is a multiple eigenvalue. If x is an eigenvector, then it must
satisfy
[0 1; 0 0] x = 0,  or  x = (x_1, 0)^T.
In this case, λ = 0 is an eigenvalue with algebraic multiplicity 2, but it has only a
one-dimensional space of eigenvectors. The geometric multiplicity of λ = 0 is one.
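The gap between algebraic and geometric multiplicity in Remark 8.1 can be seen numerically:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

# Both eigenvalues are 0: algebraic multiplicity 2.
w = np.linalg.eigvals(A)
print(w)                                   # [0. 0.]

# The eigenspace is the null space of A - 0*I; its dimension is n - rank(A).
geometric_multiplicity = A.shape[0] - np.linalg.matrix_rank(A)
print(geometric_multiplicity)              # 1
```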
Now we state without proof the following important fact:
Theorem 8.2: If A ∈ ℝ^{n×n} is symmetric, then A is diagonalizable.
Example 8.2:
A = [ 0 -1 0
      1  0 0
      0  0 1 ]
Solve the characteristic polynomial of A: det(A - λI) = (1 - λ)(λ^2 + 1) = 0.
Then λ_1 = 1, λ_2 = i, λ_3 = -i.
λ_1 = 1:
(A - λ_1 I)x_1 = [ -1 -1 0
                    1 -1 0
                    0  0 0 ] x_1 = 0,  or  x_1 = (0, 0, 1)^T.
λ_2 = i:
(A - λ_2 I)x_2 = [ -i -1 0
                    1 -i 0
                    0  0 1-i ] x_2 = 0,  or  x_2 = (1, -i, 0)^T.
λ_3 = -i:
(A - λ_3 I)x_3 = [ i -1 0
                   1  i 0
                   0  0 1+i ] x_3 = 0,  or  x_3 = (1, i, 0)^T.
Then
S = [ 0  1  1
      0 -i  i
      1  0  0 ],   Λ = [ 1 0  0
                         0 i  0
                         0 0 -i ].
Q.E.D
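The complex eigenvalues in Example 8.2 can be confirmed with NumPy:

```python
import numpy as np

A = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])

# eig returns the eigenvalues and the eigenvectors as columns of V.
w, V = np.linalg.eig(A)

print(np.sort_complex(w))              # eigenvalues -i, i, 1
print(np.allclose(A @ V, V @ np.diag(w)))   # each column satisfies A x = lambda x
```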
Remark 8.2: One of the applications in biology is to compute A^k x_0, k = 1, 2, ….
Suppose S^{-1}AS = Λ. Then it is easy to verify that
(S^{-1}AS)^k = S^{-1}A^k S = Λ^k.
Then
A^k = SΛ^k S^{-1} = S diag(λ_1^k, …, λ_n^k) S^{-1}.
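A sketch of computing A^k through the diagonalization, compared against direct matrix powering (the matrix is illustrative):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
k = 10

w, S = np.linalg.eig(A)
# A^k = S Lambda^k S^{-1}: only the eigenvalues are raised to the k-th power.
Ak = S @ np.diag(w**k) @ np.linalg.inv(S)

print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True
```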
Example 8.3: Fibonacci's Sequence
In 1202, Fibonacci introduced a model of a hypothetical rabbit population which
illustrates the basic idea of current models of human populations. In this model, a single
pair of rabbits, one male and one female, starts the population. They will reproduce twice,
and at each reproduction produce a new pair, one male and one female. These will go on
to reproduce twice, etc. The following table summarizes the population's dynamics.

NUMBER OF RABBIT PAIRS
Time   Births   Pairs aged 1   Pairs aged 2   Pairs aged 3
  0      1          -              -              -
  1      1          1              -              -
  2      2          1              1              -
  3      3          2              1              1
  4      5          3              2              1
  5      8          5              3              2
  6     13          8              5              3
Let B_n be the birth rate in year n. Then we have
B_{n+1} = B_n + B_{n-1}.    (8.5)
Let
u_n = (B_{n+1}, B_n)^T.
Then
u_{n+1} = [1 1; 1 0] u_n = A u_n.
The eigenvalue  of A satisfies
det( A  I )  2    1  0 .
Then
1 
   
1 5
1 5
, 2 
have corresponding eigenvectors  1  ,  2  . It
2
2
1  1 

follows that A  SS 1   1
1
2  1
1   0
0   1  2  1
.
2   1 1  1  2
Then
 Bn1 
1 
n
n 1
 B   u n  A u 0  S S u 0 , u 0  0
 
 n 
Hence
 Bn 1  1
 B   1
 n  
 2  1 n

1   0
1 n
2 n
1  1  5   1  5  
 
 

Bn 


1   2 1   2
5  2   2  


n

0  1  1
 
 2 n   1 1   2
1 1 5 



5  2 
n
as n   .
n
Other method:
The two roots are r_± = 1/2 ± √5/2. Since B_1 = 1 and B_2 = 2, we have that
B_n = ((√5 + 1)/(2√5)) r_+^n + ((√5 - 1)/(2√5)) r_-^n.
Leslie Model: an age-structured model
A time unit T appropriate to the species being modeled and to the data available
is selected, and then the age structure of one sex at some time NT is described by the
vector
V_N = (v_{1,N}, …, v_{k,N})^T,
where each component v_{i,N} describes the number of individuals at time NT that
have ages (i-1)T ≤ a < iT. Thus, the total population at time NT is
Σ_{i=1}^k v_{i,N},
and the number of individuals having ages less than or equal to jT is
Σ_{i=1}^j v_{i,N}.
A typical model for human populations is to take T = 5 years and then introduce
16 age classes, k = 16, where
v_{1,N} = number of people with age a, 0 ≤ a < 5, …,
v_{16,N} = number of people with age a, 75 ≤ a < 80.
Here the total population having ages greater than 80 is ignored.
Modeling proceeds as before. A relation between the age structure in time period
N and in time period N + 1 is prescribed:
V_{N+1} = F(V_N),
where now the reproduction function is a vector-valued function of the components of
the vector V_N. However, such models are usually very difficult to determine in practice,
and in general, they are quite difficult to analyze. There is one class of reproduction
functions that has been successful, both in description of real populations (it has been
applied mainly to human populations) and in its mathematical tractability. These are
analogs of Malthus’ model, and they employ linear reproduction functions.
Many names are associated with the development of this kind of model, Lotka,
McKendrick, Bernardelli and Leslie, to name a few, and many others have extended and
applied the basic results. However, Leslie’s work (1944, 1948) has been most widely
distributed, and this presentation follows the work of Leslie, McKendrick (1926) and
Feller (1968).
A. The Reproduction Matrix
The dynamics of one sex, usually females, will be described. First, a time period
T and a number of age classes k are selected. Each age class may lose members
through death and may contribute births during one time period. The probabilities of
these events depend on the ages. These are specified in the following way:
Let
b_i = number of births per female in the i-th age class in one time period,
m_i = probability of survival to the next time period of those in the i-th age class.
Therefore, for i = 2, 3, …, k, we have
v_{i,N+1} = m_{i-1} v_{i-1,N},
which gives the number of survivors of the (i-1)-st age class after one time period,
and
v_{1,N+1} = b_1 v_{1,N} + b_2 v_{2,N} + ⋯ + b_k v_{k,N},
which gives the total number of births occurring in the population during the time
interval.
These data, m_1, …, m_{k-1} and b_1, …, b_k, are assumed to be known from population
death and birth statistics. In terms of these, the model can be formulated as
v_{1,N+1} = b_1 v_{1,N} + ⋯ + b_k v_{k,N}
v_{2,N+1} = m_1 v_{1,N}
⋮
v_{k,N+1} = m_{k-1} v_{k-1,N},
or in the more concise matrix notation as
V_{N+1} = M V_N    (Leslie Model)    (8.6)
where the reproduction matrix M is given by
M = [ b_1  b_2  ⋯  b_{k-1}  b_k
      m_1  0    ⋯  0        0
      0    m_2  ⋯  0        0
      ⋮               ⋱      ⋮
      0    0    ⋯  m_{k-1}  0 ].
Stable age distribution: The structure of M
The renewal theory can, of course, be applied directly to the full Leslie model to
describe the evolution of the population's age structure. These details are carried out in
the references Leslie (1945, 1948), Moran (1962), Keyfitz and Flieger (1970) and Keyfitz
(1968).
The Leslie model was developed above, and it was shown there that if V_N is the
age vector in the N-th time period, then
V_{N+1} = M V_N,
where the reproduction matrix M is given by (8.6).
Since V_N = M^N V_0, it is sufficient to analyze the matrices M^N for large N.
A complete analysis of the powers of M can be carried out by using the spectral
decomposition of M (see Leslie (1945, 1948)). However, the dominant behavior is
easy to determine without any complicated technicalities.
The characteristic roots of M are determined by the characteristic equation
det(M - λI) = 0.
In terms of the components of M, this becomes
λ^k - b_1 λ^{k-1} - b_2 m_1 λ^{k-2} - ⋯ - b_k m_1 ⋯ m_{k-1} = 0.
We say that M is honest if there is a unique positive real root λ_0 and all other
roots of the characteristic equation satisfy |λ| < λ_0. λ_0 is called the dominant root of
M. Let Φ_0 be the eigenvector corresponding to the dominant root.
Thus,
M Φ_0 = λ_0 Φ_0,    (8.7)
and Φ_0 can be taken as the vector
Φ_0 = ( λ_0^{k-1}/(m_1 m_2 ⋯ m_{k-1})
        λ_0^{k-2}/(m_2 ⋯ m_{k-1})
        ⋮
        λ_0/m_{k-1}
        1 ).    (8.8)
Perron-Frobenius Theorem
Let M = (m_ij) be a nonnegative n × n matrix, i.e., m_ij ≥ 0 for all
i, j = 1, 2, …, n. We denote a nonnegative matrix M by M ≥ 0. The spectral
radius of M, ρ(M), is defined as the largest modulus of the eigenvalues of M, i.e.,
ρ(M) = max{ |λ| : λ is an eigenvalue of M }. In the following we state the
Perron-Frobenius Theorem without proof ([ ], p. 508).
Theorem (Perron-Frobenius): Let M ∈ ℝ^{n×n} be irreducible and nonnegative.
Then
(a) ρ(M) > 0;
(b) ρ(M) is an eigenvalue of M;
(c) there is a positive vector x such that Mx = ρ(M)x;
(d) ρ(M) is a simple eigenvalue of M;
(e) if λ is an eigenvalue of M and its associated eigenvector is a positive vector,
then λ = ρ(M).
Let us consider (8.6), V_{N+1} = M V_N. From the Perron-Frobenius Theorem,
λ_1 = λ_0 = ρ(M) is a dominant eigenvalue of M, with the eigenvector Φ_0 given in
(8.8). Write
V_0 = c_1 Φ_1 + c_2 Φ_2 + ⋯ + c_k Φ_k,
where Φ_i is the eigenvector associated with the eigenvalue λ_i, i = 1, 2, …, k. Assume
c_1 ≠ 0. Then
M^N V_0 = c_1 M^N Φ_1 + c_2 M^N Φ_2 + ⋯ + c_k M^N Φ_k
        = c_1 λ_1^N Φ_1 + c_2 λ_2^N Φ_2 + ⋯ + c_k λ_k^N Φ_k    (8.9)
        = λ_1^N ( c_1 Φ_1 + c_2 (λ_2/λ_1)^N Φ_2 + ⋯ + c_k (λ_k/λ_1)^N Φ_k ).
Since λ_1 = λ_0 is a dominant eigenvalue, i.e., |λ_i/λ_1|^N → 0 as N → ∞ for
i = 2, 3, …, k, it follows from (8.9) that
M^N V_0 ≈ c_1 λ_0^N Φ_0.
This calculation shows that for large N the age distribution is growing at a
constant rate λ_0, but that the distribution between the age classes remains fixed. In fact,
the proportion of the population residing in any one age bracket is constant:
No. of i-th age class / Total population size
= V_{i,N} / Σ_{j=1}^k V_{j,N} ≈ (λ_0^N c_1 φ_{0,i}) / (λ_0^N c_1 Σ_{j=1}^k φ_{0,j})
= φ_{0,i} / Σ_{j=1}^k φ_{0,j},
where φ_{0,i} is the i-th component of Φ_0.
Because of this fact, the vector Φ_0 is called the stable age vector, and these
calculations show that a Malthusian population does approach a stable age distribution.
This fact is frequently used in population studies.
Example: Kimura's Two-Parameter Model
The assumption that all nucleotide substitutions occur randomly, as in the JC
model, is unrealistic in most cases. For example, transitions are generally more frequent
than transversions. To take this fact into account, Kimura (1980) proposed a
two-parameter model (Figure). In this scheme, the rate of transitional substitution at
each nucleotide site is α per unit time, whereas the rate of each of the two types of
transversional substitution is β per unit time.
Figure: Two-parameter model of nucleotide substitution. In this model, the rate of
transition (α) may not be equal to the rate of each of the two types of transversion (β).
From Li and Graur (1991).
As in the one-parameter model, let p_A(t) be the probability that the nucleotide at
the site under consideration is A at time t; p_T(t), p_C(t) and p_G(t) are similarly
defined. The probability that the nucleotide at the site is A in the next generation is
given by
p_A(t+1) = (1 - α - 2β) p_A(t) + β p_T(t) + β p_C(t) + α p_G(t).
To derive this equation we need to consider four possible scenarios: First, the nucleotide
at the site is A at time t and remains unchanged at t + 1. The probability that the
nucleotide at the site is A at time t is p_A(t), and the probability that it remains
unchanged at t + 1 is (1 - α - 2β). The product of these two probabilities constitutes
the first term in the equation. Second, the nucleotide at the site is T at time t but
changes to A at t + 1, a transversional change. The probabilities for the two events are
p_T(t) and β, respectively, and their product constitutes the second term. Third, the
nucleotide at the site is C at time t but changes to A at t + 1, a transversional
change. The probabilities for the two events are p_C(t) and β, respectively, and their
product constitutes the third term. Fourth, the nucleotide at the site is G at time t
but changes to A at t + 1, a transitional change. The probabilities for these two events
are p_G(t) and α, respectively, and their product constitutes the fourth term. In the
same manner, we can derive the following equations:
pT (t 1)  p A(t )  (1    2  ) pT (t )  pC (t )  pG (t )
pC (t 1)  p A(t )  pT (t )  (1    2  ) pC (t )  pG (t )
pG (t 1)  p A(t )  pT (t )  pC (t )  (1    2  ) pG (t )
To solve the above equations we need to set the initial condition. If the initial
nucleotide is A, then p_A(0) = 1 and p_T(0) = p_C(0) = p_G(0) = 0, and a solution can be
obtained for the above equations. Similarly, we can obtain a solution for the initial
conditions p_T(0) = 1 and p_A(0) = p_C(0) = p_G(0) = 0, and so on. For the details of the
solution, see Li (1986).
M = [ 1-α-2β   β        β        α
      β        1-α-2β   α        β
      β        α        1-α-2β   β
      α        β        β        1-α-2β ]    (3.18)
If we define the vector p(t) as p(t) = (p_A(t), p_T(t), p_C(t), p_G(t))^T, then Equations 3.13a-d
can be written as
p(t+1) = M p(t).    (3.19)
This equation can be studied by using linear algebra.
The matrix M in Equation (3.18) satisfies the following two conditions: (1) all
elements of the matrix are nonnegative and (2) the sum of the elements in each row is 1.
Such matrices are called stochastic matrices. The above formulation has in effect
assumed that the substitution process is a Markov chain, the essence of which is as
follows. Consider three time points in evolution, t_1 < t_2 < t_3. The Markovian model
assumes that the status of the nucleotide site at t_3 depends only on the status at t_2 but
not on the status at t_1, if the status of the site at t_2 is known. In other words, if the
nucleotide at the site is specified at t_2, then no additional information about the status
of the site in any past generation is required (or is of help) for predicting the nucleotide
at t_3. For example, if the nucleotide at time t is A, then p_A(t) = 1 and
p_T(t) = p_C(t) = p_G(t) = 0, and by using Equation (3.19) one can compute the
probability p_i(t+1) that the nucleotide will be i at time t + 1, i = A, T, C, or G;
and by iteration of the same procedure, one can compute p_i(τ) for any future generation
τ. For the theory of Markov chains, readers may consult Feller (1968).
Since the sum of each row of a stochastic matrix is 1, it is easy to see that λ = 1 is an
eigenvalue with eigenvector e = (1, 1, 1, 1)^T. Since the matrix M in (3.18) is an
irreducible nonnegative matrix, it follows from the Perron-Frobenius Theorem that
λ_0 = 1 is the dominant eigenvalue. Hence
M^N p(0) ≈ c_1 λ_0^N e = c_1 e,
so as N becomes large, M^N p(0) approaches the direction of e = (1, …, 1)^T.
In (3.19), if the sum of the elements in each column is 1, then (3.19) has eigenvalue
λ = 1 with left eigenvector (1, 1, 1, 1).
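This convergence can be checked numerically by iterating (3.19). The rates α and β below are illustrative, made-up values, not estimates from data:

```python
import numpy as np

alpha, beta = 0.05, 0.01           # illustrative transition / transversion rates
d = 1 - alpha - 2 * beta

# Kimura two-parameter matrix (3.18), rows and columns ordered A, T, C, G.
M = np.array([[d,     beta,  beta,  alpha],
              [beta,  d,     alpha, beta ],
              [beta,  alpha, d,     beta ],
              [alpha, beta,  beta,  d    ]])

p = np.array([1.0, 0.0, 0.0, 0.0])   # start from nucleotide A
for _ in range(1000):
    p = M @ p                        # one step of p(t+1) = M p(t)

# Because M is doubly stochastic, p(t) approaches the uniform distribution.
print(p)   # close to [0.25, 0.25, 0.25, 0.25]
```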