Semidefinite and Second Order Cone Programming Seminar
Fall 2001, Lecture 12
Instructor: Farid Alizadeh
Scribe: Anton Riabov
Date: 12/03/2001
1 Overview
In this lecture we continue to study the cone of squares: we define an inner product with respect to a Euclidean Jordan algebra and prove convexity, self-duality, homogeneity, and hence symmetry of the cone of squares. We also define direct sums and simple algebras. We state without proof a theorem that there are only five classes of simple Euclidean Jordan algebras, and describe these classes. Finally, we briefly outline the application of Jordan algebra theory to solving optimization problems over symmetric cones using interior point methods.
2 Cone of Squares
Suppose (E, ◦) is a Euclidean Jordan algebra. In the previous lecture we gave the following definition:
Definition 1 K_E := {x² : x ∈ E} is the cone of squares of the algebra (E, ◦).
K_E is a cone, since for all α ≥ 0 we have αx² = (√α x)² ∈ K_E. In this section we will show that the cones of squares of Euclidean Jordan algebras are convex, self-dual, and homogeneous.
2.1 Inner Product
Definition 2 For any x, y ∈ E the inner product ⟨x, y⟩ with respect to the Euclidean Jordan algebra (E, ◦) is defined as ⟨x, y⟩ := tr(x ◦ y).
Note that this inner product is bilinear and ⟨x, y⟩ = ⟨y, x⟩, so it conforms to the definition of an inner product.
Fact 1 The inner product is associative: ⟨x ◦ y, z⟩ = ⟨x, y ◦ z⟩.
The proof of this statement is not straightforward; in fact, we do not yet have the required machinery to prove it, so we accept it as a fact. The following definition may be needed in future discussions.
Definition 3 τ(x, y) := Trace(L(x ◦ y)).
Note that associativity holds here as well: τ(x ◦ y, z) = τ(x, y ◦ z).
Lemma 1 L(x) is a symmetric (self-adjoint) operator with respect to ⟨·, ·⟩.
Proof: We need to show that ⟨L(x)y, z⟩ = ⟨y, L(x)z⟩, i.e. ⟨x ◦ y, z⟩ = ⟨y, x ◦ z⟩, which follows from the commutativity of ◦ and the associativity of ⟨·, ·⟩. ∎
Example 1 (SOCP Algebra (Rⁿ⁺¹, ◦))
⟨x, y⟩ = tr(x ◦ y) = 2xᵀy,
so ⟨·, ·⟩ is, up to the factor 2, the usual inner product, and L(x) is a symmetric matrix in the usual sense.
Example 2 (Symmetric Matrices (Sⁿ, ◦))
⟨X, Y⟩ = Trace(X ◦ Y) = Trace(XY) = X • Y.
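As a quick sanity check, the following sketch (in Python with NumPy; the helper names are ours, not from the lecture) implements the SOCP Jordan product x ◦ y = (xᵀy; x₀ȳ + y₀x̄) from the earlier lectures and verifies numerically that tr(x ◦ y) = 2xᵀy.

```python
import numpy as np

def jordan_prod(x, y):
    """SOCP Jordan product: x o y = (x^T y ; x0*ybar + y0*xbar)."""
    return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))

def trace(x):
    """tr(x) = sum of the two eigenvalues x0 +- ||xbar||, i.e. 2*x0."""
    return 2.0 * x[0]

rng = np.random.default_rng(0)
x, y = rng.standard_normal(4), rng.standard_normal(4)
print(np.isclose(trace(jordan_prod(x, y)), 2 * x @ y))   # True: <x, y> = tr(x o y) = 2 x^T y
```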
2.2 Convexity and Self-Duality of K_E
To prove that K_E is convex we first note:
Proposition 1 If K is a cone, then K* is a convex cone.
Proof: By definition, K* = {y : ⟨x, y⟩ ≥ 0 for all x ∈ K}. For any y₁, y₂ ∈ K* we have
⟨y₁, x⟩ ≥ 0 for all x ∈ K,
⟨y₂, x⟩ ≥ 0 for all x ∈ K.
Adding these two inequalities, we obtain ⟨y₁ + y₂, x⟩ ≥ 0 for all x ∈ K; since K* is clearly closed under multiplication by nonnegative scalars, it is a convex cone. ∎
Corollary 1 Every self-dual cone is convex.
Now we only need to prove the self-duality of K_E, and convexity will follow.
Lemma 2 K_E* = {y : L(y) ⪰ 0}.
Proof:
y ∈ K_E*  ⟺  ⟨y, x²⟩ ≥ 0 for all x
          ⟺  ⟨y, x ◦ x⟩ ≥ 0 for all x
          ⟺  ⟨y ◦ x, x⟩ ≥ 0 for all x
          ⟺  ⟨L(y)x, x⟩ ≥ 0 for all x,
which means that L(y) ⪰ 0. ∎
Lemma 3 If c is an idempotent (i.e. c² = c), then the eigenvalues of c are 0 and 1, and L(c) ⪰ 0.
Proof: Since c is an idempotent, we can write c² − c = 0, so c satisfies the polynomial t² − t, whose roots are {0, 1}; hence every eigenvalue of c is 0 or 1. For instance, if c is a primitive idempotent and {c₁, ..., cᵣ} is a Jordan frame with c = c₁, then c = 1·c₁ + 0·c₂ + ... + 0·cᵣ, so it has one eigenvalue equal to one and r − 1 eigenvalues equal to zero.
In the previous lecture we derived the following equation:
L(y² ◦ z) + 2L(y)L(z)L(y) − 2L(y ◦ z)L(y) − L(y²)L(z) = 0.
Now we substitute y ← c, z ← c:
L(c³) + 2L³(c) − 2L(c²)L(c) − L(c²)L(c) = 0,
and since c² = c³ = c, this gives
L(c)[2L²(c) − 3L(c) + I] = 0,
i.e. L(c) satisfies t(2t² − 3t + 1) = 0, which is equivalent to t(2t − 1)(t − 1) = 0. Hence the eigenvalues of L(c) lie in the set {0, 1/2, 1}; in particular they are all nonnegative, and therefore L(c) ⪰ 0. ∎
Corollary 2 If x = Σᵢ λᵢcᵢ and λᵢ ≥ 0 for all i, then L(x) ⪰ 0.
Proof: L(x) = Σᵢ λᵢL(cᵢ), and we know that L(cᵢ) ⪰ 0. ∎
Fact 2 Let x = λ₁c₁ + ... + λᵣcᵣ. Then the eigenvalues of L(x) are (λᵢ + λⱼ)/2, and the eigenvalues of Q_x are λᵢλⱼ.
Proof: We accept this statement as a fact, since in the general case we need to know more about Jordan algebras to be able to prove it. However, we note that for the matrix algebra we have seen the proof in previous lectures, and for the SOCP case the proof is also easy. We refer the reader to the recent survey paper by F. Alizadeh and D. Goldfarb for more details.
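For the symmetric-matrix algebra, Fact 2 can be checked numerically. The sketch below (our own construction, not from the lecture) views L(X) and Q_X as operators on all n × n matrices, where in vectorized form L(X) corresponds to (kron(X, I) + kron(I, X))/2 and Q_X to kron(X, X); their eigenvalues are exactly (λᵢ + λⱼ)/2 and λᵢλⱼ over all pairs (i, j).

```python
import numpy as np

# Numerical check of Fact 2 in the symmetric-matrix algebra (S^n, o), with L(X)Y = (XY + YX)/2
# and Q_X Y = X Y X viewed as operators on all n x n matrices (so each pair (i, j) is counted).
n = 3
rng = np.random.default_rng(1)
A = rng.standard_normal((n, n))
X = (A + A.T) / 2
lam = np.linalg.eigvalsh(X)

I = np.eye(n)
L_op = (np.kron(X, I) + np.kron(I, X)) / 2   # vectorized form of L(X)
Q_op = np.kron(X, X)                         # vectorized form of Q_X

pred_L = sorted((lam[i] + lam[j]) / 2 for i in range(n) for j in range(n))
pred_Q = sorted(lam[i] * lam[j] for i in range(n) for j in range(n))
print(np.allclose(np.sort(np.linalg.eigvalsh(L_op)), pred_L))   # True
print(np.allclose(np.sort(np.linalg.eigvalsh(Q_op)), pred_Q))   # True
```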
Theorem 1 K_E = K_E*.
Proof: First, we will show that K_E ⊆ K_E*. Choose an element x ∈ K_E; then x = Σᵢ λᵢ²cᵢ. Hence L(x) = Σᵢ λᵢ²L(cᵢ), where L(cᵢ) ⪰ 0 and λᵢ² ≥ 0 for all i. Therefore L(x) ⪰ 0, and thus x ∈ K_E* by Lemma 2.
Now we will show that K_E* ⊆ K_E. Choose y ∈ K_E*. Then y = Σⱼ λⱼcⱼ. Hence,
⟨y, cᵢ⟩ = Σⱼ λⱼ⟨cⱼ, cᵢ⟩ = λᵢ⟨cᵢ, cᵢ⟩ = λᵢ tr(cᵢ²).
The second equality above follows from the fact that ⟨cⱼ, cᵢ⟩ = 0 if i ≠ j. Now we can obtain an expression for λᵢ:
λᵢ = ⟨y, cᵢ⟩ / tr(cᵢ²).
We know that tr(cᵢ²) > 0, so we only need to show that ⟨y, cᵢ⟩ ≥ 0. This will imply that all eigenvalues λᵢ of y are nonnegative and y ∈ K_E. Indeed,
⟨y, cᵢ⟩ = ⟨y, cᵢ ◦ cᵢ⟩ = ⟨L(y)cᵢ, cᵢ⟩ ≥ 0,
where the last inequality follows from L(y) ⪰ 0. This completes the proof. ∎
2.3 Homogeneous Cones
Definition 4 A cone K is homogeneous if it is proper and if for all x, y ∈ Int K there exists a linear transformation T such that T(x) = y and T(K) = K.
Example 3 (P^{n×n} = K_{(Sⁿ,◦)})
Choose any X ≻ 0, Y ≻ 0. Using the eigenvalue decomposition, write X = QΛQᵀ and Y = PΩPᵀ. The transformation T is then defined as the following sequence of linear transformations:
X --[Q⁻¹ • Q⁻ᵀ]--> Λ --[Λ^{-1/2} • Λ^{-1/2}]--> I --[Ω^{1/2} • Ω^{1/2}]--> Ω --[P • Pᵀ]--> Y,
where A • Aᵀ indicates multiplication by A from the left and by Aᵀ from the right.
Each of these steps maps P^{n×n} onto itself, because if A is nonsingular then AXAᵀ ⪰ 0 if and only if X ⪰ 0. Thus, P^{n×n} is homogeneous.
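The four congruence steps above compose into the single map T(Z) = MZMᵀ with M = PΩ^{1/2}Λ^{-1/2}Qᵀ. A minimal numerical sketch (variable names are ours, not from the lecture):

```python
import numpy as np

# Sketch: T(Z) = M Z M^T with M = P Omega^{1/2} Lambda^{-1/2} Q^T sends X to Y and,
# being a congruence by a nonsingular matrix, maps the PSD cone onto itself.
rng = np.random.default_rng(2)
n = 4
def random_pd(k):
    A = rng.standard_normal((k, k))
    return A @ A.T + k * np.eye(k)

X, Y = random_pd(n), random_pd(n)
lamX, Q = np.linalg.eigh(X)     # X = Q diag(lamX) Q^T
lamY, P = np.linalg.eigh(Y)     # Y = P diag(lamY) P^T
M = P @ np.diag(np.sqrt(lamY)) @ np.diag(1 / np.sqrt(lamX)) @ Q.T

T = lambda Z: M @ Z @ M.T
print(np.allclose(T(X), Y))                        # True: T(X) = Y
Z = random_pd(n)
print(np.all(np.linalg.eigvalsh(T(Z)) > 0))        # congruence preserves positive definiteness
```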
Example 4 (Second-Order Cones) The following set of 3 operations is sufficient to obtain the required transformation for any x and y in the interior of
the second-order cone, and therefore second-order cones are homogeneous.
[Figure: two points x and y on the same "circle" around the x₀ axis inside the second-order cone; axes labeled x₀, x₁, ..., xₙ.]
If the points x and y happen to be on the same "circle", as in the figure, a rotation transformation can be applied. It is easy to see that this transformation has all the required properties.
[Figure: two points x and y on the same ray from the origin inside the second-order cone; axes labeled x₀, x₁, ..., xₙ.]
If the points x and y are on the same ray, we can apply a dilation transformation, multiplying by (y₀/x₀)I. All the required properties are satisfied.
[Figure: a hyperbolic rotation moving points along a hyperbola whose asymptotes are the boundaries of the cone; axes labeled x₀, x₁, ..., xₙ.]
An operation called hyperbolic rotation can be constructed similarly to the usual rotation by replacing sin and cos with sinh and cosh. This operation can be constructed to "rotate" points along a hyperbola which has the cone boundaries as asymptotes.
Any point in the interior of the second-order cone can be transformed into another point in the interior by a combination of a rotation about the x₀ axis, a dilation, and a hyperbolic rotation, as follows. Let x = λ₁c₁ + λ₂c₂ and y = ω₁d₁ + ω₂d₂, where c₁, c₂ is a Jordan frame and likewise d₁, d₂ is a Jordan frame. To transform x to y, we first rotate c₁ to d₁; this automatically maps c₂ to d₂, because c₁ ⊥ c₂ and d₁ ⊥ d₂. So now we have x′ = λ₁d₁ + λ₂d₂. Next, in the plane spanned by d₁, d₂ the vector y has coordinates ω₁, ω₂ with respect to the basis d₁, d₂, and x′ has coordinates λ₁, λ₂. Applying the dilation √(ω₁ω₂/(λ₁λ₂)) I maps x′ to the point x″ = λ″₁d₁ + λ″₂d₂, where λ″₁ = λ₁√(ω₁ω₂/(λ₁λ₂)) and λ″₂ = λ₂√(ω₁ω₂/(λ₁λ₂)), so that λ″₁λ″₂ = ω₁ω₂. Now both y and x″ are on the same branch of the hyperbola a₁a₂ = ω₁ω₂; thus a hyperbolic rotation will map x″ to y.
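The spectral decompositions used here are concrete in the SOCP algebra: x = λ₁c₁ + λ₂c₂ with λ_{1,2} = x₀ ± ‖x̄‖ and c_{1,2} = ½(1; ±x̄/‖x̄‖). A small sketch, assuming x̄ ≠ 0 (helper names are ours, not from the lecture):

```python
import numpy as np

def jordan_prod(x, y):
    """SOCP Jordan product: x o y = (x^T y ; x0*ybar + y0*xbar)."""
    return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))

def spectral(x):
    """Jordan frame of x: lam_{1,2} = x0 +- ||xbar||, c_{1,2} = (1/2)(1; +-xbar/||xbar||)."""
    x0, xbar = x[0], x[1:]
    nrm = np.linalg.norm(xbar)            # assumes xbar != 0
    u = xbar / nrm
    c1 = 0.5 * np.concatenate(([1.0], u))
    c2 = 0.5 * np.concatenate(([1.0], -u))
    return x0 + nrm, x0 - nrm, c1, c2

x = np.array([2.0, 0.5, -0.3, 0.1])        # interior point: x0 > ||xbar||
l1, l2, c1, c2 = spectral(x)
print(l1 > 0 and l2 > 0)                               # True: x is in the interior
print(np.allclose(l1 * c1 + l2 * c2, x))               # spectral decomposition recovers x
print(np.allclose(jordan_prod(c1, c1), c1),            # c1 is idempotent
      np.allclose(jordan_prod(c1, c2), np.zeros(4)))   # c1 o c2 = 0
```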
We claim that the cone of squares K_E is homogeneous. But before we can prove this, we need the following theorem.
Theorem 2 If x is invertible, then Q_x(Int K_E) = Int K_E.
Proof: First note that the set of invertible elements is disconnected. For example, in the case of second-order cones there are three regions of invertible elements, separated by the boundary of the cone, as illustrated in the figure below.
In the algebra of symmetric matrices, the regions of invertible elements are the "orthants" of the eigenvalue space, i.e. the matrices with a fixed sign pattern of eigenvalues. Intuitively this is explained by the fact that if the eigenvalues of two symmetric matrices have different sign patterns, then on the segment joining them there is a matrix with an eigenvalue 0.
One of these connected regions is Int K_E. If y ∈ Int K_E, then Q_x y is also invertible, since
(Q_x y)⁻¹ = Q_{x⁻¹} y⁻¹.
[Figure: the three regions of invertible elements for the second-order cone, labeled 1, 2, 3.]
Therefore Q_x(Int K_E) cannot cross any of these boundaries, and it is either (a) contained in Int K_E entirely, i.e. Q_x(Int K_E) ⊆ Int K_E, or (b) disjoint from it, i.e. Q_x(Int K_E) ∩ Int K_E = ∅.
We know that Q_x e = x² ∈ Int K_E, since x is invertible and hence all eigenvalues of x² are positive. Thus (a) holds, and
Q_x(Int K_E) ⊆ Int K_E
for all invertible x.
For the reverse inclusion, take y ∈ Int K_E. Since x⁻¹ is also invertible, Q_{x⁻¹}y ∈ Int K_E by the inclusion just proved, and y = Q_x(Q_{x⁻¹}y) ∈ Q_x(Int K_E). Hence
Int K_E ⊆ Q_x(Int K_E). ∎
Corollary 3 K_E is a homogeneous cone.
Proof: Suppose we are given x², y² ∈ Int K_E. The following composition of linear transformations shows that K_E is homogeneous:
x² --[Q_{x⁻¹}]--> e --[Q_y]--> y².
Each of these steps maps Int K_E onto itself by Theorem 2. ∎
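In the symmetric-matrix algebra the quadratic representation is Q_X = 2L(X)² − L(X²), and it acts as Q_X Y = XYX; in particular Q_{X⁻¹}X² = I = e, which is exactly the first step used in the proof above. A small numerical sketch (our own, not from the lecture):

```python
import numpy as np

# Sketch (symmetric-matrix algebra): L(X)Y = (XY + YX)/2 and Q_X = 2 L(X)^2 - L(X^2)
# acts as Q_X Y = X Y X.  We also check the map from Corollary 3: Q_{X^{-1}} X^2 = e = I.
def L(X):
    return lambda Z: (X @ Z + Z @ X) / 2

def Q(X):
    return lambda Z: 2 * L(X)(L(X)(Z)) - L(X @ X)(Z)

rng = np.random.default_rng(3)
n = 4
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
X, Y = (A + A.T) / 2, (B + B.T) / 2

print(np.allclose(Q(X)(Y), X @ Y @ X))             # Q_X Y = X Y X
Xinv = np.linalg.inv(X)
print(np.allclose(Q(Xinv)(X @ X), np.eye(n)))      # Q_{X^{-1}} X^2 = e
```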
2.4 Symmetric Cones, Direct Sums and Simple Algebras
Definition 5 A cone is symmetric if it is proper, self-dual, and homogeneous.
Clearly, the cone of squares K_E of any Euclidean Jordan algebra (E, ◦) is symmetric. It turns out that the converse is also true.
Fact 3 If K is a symmetric cone, then it is the cone of squares of some Euclidean Jordan algebra.
In fact, there are not many significantly different classes of symmetric cones.
But before we can define these classes, we need to introduce direct sums.
Definition 6 Let (E₁, ∗) and (E₂, ⋄) be Euclidean Jordan algebras. The direct sum of these algebras is (E₁, ∗) ⊕ (E₂, ⋄) := (E₁ × E₂, ◦), where for all x₁, x₂ ∈ E₁ and y₁, y₂ ∈ E₂,
(x₁, y₁) ◦ (x₂, y₂) := (x₁ ∗ x₂, y₁ ⋄ y₂).
Proposition 2 Let (E₁ × E₂, ◦) be a direct sum of Euclidean Jordan algebras and let x ∈ E₁ and y ∈ E₂. The following properties hold (a small numerical check of some of them is sketched after the list):
1. L((x, y)) = L(x) ⊕ L(y), the block-diagonal operator with blocks L(x) and L(y)
2. Q_{(x, y)} = Q_x ⊕ Q_y, the block-diagonal operator with blocks Q_x and Q_y
3. p_{E₁⊕E₂}((x, y)) = p_{E₁}(x) · p_{E₂}(y), where p(·) is the corresponding characteristic polynomial
4. tr_{E₁⊕E₂}((x, y)) = tr_{E₁}(x) + tr_{E₂}(y)
5. det((x, y)) = det(x) · det(y)
6. ‖(x, y)‖²_{F/E₁⊕E₂} = ‖x‖²_{F/E₁} + ‖y‖²_{F/E₂}
7. ‖(x, y)‖_{2/E₁⊕E₂} = max{‖x‖_{2/E₁}, ‖y‖_{2/E₂}}
8. K_{E₁⊕E₂} = K_{E₁} × K_{E₂}
9. rk(E₁ ⊕ E₂) = rk(E₁) + rk(E₂)
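As promised above, here is a small numerical check of a few of these properties: the direct sum of two symmetric-matrix algebras is represented by block-diagonal matrices diag(X, Y), for which the trace is additive, the determinant is multiplicative, and the ranks add (the construction and names are ours, not from the lecture).

```python
import numpy as np

# Sketch: an element of S^2 (+) S^3 represented as a block-diagonal matrix diag(X, Y),
# checking properties 4, 5 and 9 of Proposition 2.
rng = np.random.default_rng(4)
def rand_sym(n):
    A = rng.standard_normal((n, n))
    return (A + A.T) / 2

X, Y = rand_sym(2), rand_sym(3)
Z = np.block([[X, np.zeros((2, 3))], [np.zeros((3, 2)), Y]])

print(np.isclose(np.trace(Z), np.trace(X) + np.trace(Y)))                  # trace is additive
print(np.isclose(np.linalg.det(Z), np.linalg.det(X) * np.linalg.det(Y)))   # det is multiplicative
print(Z.shape[0] == X.shape[0] + Y.shape[0])                               # ranks add (rank of S^k is k)
```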
Example 5 (Direct sums in SOCPs)
min  c₁ᵀx₁ + ... + cᵣᵀxᵣ
s.t. A₁x₁ + ... + Aᵣxᵣ = b
     xᵢ ⪰_Q 0,  1 ≤ i ≤ r.
The cone constraint in this SOCP restricts x = (x₁, ..., xᵣ) to a direct product of quadratic (second-order) cones: x ∈ Q₁ × Q₂ × ... × Qᵣ.
Example 6 (Direct sums in LPs) The usual boring algebra of real numbers (R, ·) is a Euclidean Jordan algebra, where "·" stands for ordinary multiplication. The algebra underlying linear programs is a direct sum of such algebras:
(Rⁿ, ∗) = (R, ·) ⊕ (R, ·) ⊕ ... ⊕ (R, ·).
The multiplication operator "∗" is defined componentwise:
(x₁, x₂, ..., xₙ)ᵀ ∗ (y₁, y₂, ..., yₙ)ᵀ := (x₁y₁, x₂y₂, ..., xₙyₙ)ᵀ.
Note that
L(x)y = Diag(x₁, x₂, ..., xₙ) y = x ∗ y,
i.e. L(x) is the diagonal matrix with the entries of x on its diagonal.
Since direct sums of Euclidean Jordan algebras are Euclidean Jordan algebras, the theory that we have developed covers any combination of these algebras: LP variables can be combined with SOCP variables, with SDP variables, and so on. It would be interesting to find out what the "minimal" algebras with respect to the direct sum are; in a sense, we want to find the "basis" of all possible Euclidean Jordan algebras. The following definition and a theorem (given without proof) answer these questions.
Definition 7 A Euclidean Jordan algebra is simple if it is not isomorphic to a direct sum of other Euclidean Jordan algebras.
Theorem 3 There exist only five different classes of simple Euclidean Jordan algebras.
In the remaining part of this subsection we briefly describe these classes.
1°. SOCP Algebra (Rⁿ⁺¹, ◦). This is the familiar algebra associated with SOCP where B = I. (If B is any symmetric positive definite matrix, the corresponding Jordan algebra is still Euclidean, and in fact isomorphic to the case where B = I.)
2°. Symmetric Matrices (S^{n×n}, ◦). Again, this is the familiar algebra of symmetric matrices that we have discussed in previous lectures.
3°. Complex Hermitian Matrices (H^{n×n}, ◦). A matrix X of complex numbers is Hermitian (X ∈ H^{n×n}) if X = X*. The operation (·)* denotes the conjugate transpose, defined as follows: if (X)_{lk} = a_{lk} + i b_{lk}, then (X*)_{lk} = a_{kl} − i b_{kl}.
For a complex matrix of size n × n, one can construct a real matrix of size 2n × 2n for which the algebra operations carry through
in exactly the same way. To achieve this, each complex entry is replaced by a 2 × 2 real matrix:

    a + ib  →  [  a   b ]
               [ −b   a ]

Consider an example:

    [    a     c + di ]       [ a   0   c   d ]
    [ c − di     b    ]  →    [ 0   a  −d   c ]        (1)
                              [ c  −d   b   0 ]
                              [ d   c   0   b ]
Therefore, it is easy to see that (H^{n×n}, ◦) is a subalgebra of S^{2n×2n}. Even though H^{n×n} is a subalgebra of S^{2n×2n}, its rank is only n. Let u be a unit-length complex vector. Transforming it to a real matrix by (1), we map u to a 2n × 2 matrix, and uu* to a rank-2 real matrix. This rank-2 real matrix is not a primitive idempotent within S^{2n×2n}, but it is primitive in H^{n×n}.
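The entrywise embedding (1) can be written down concretely. The sketch below (the function name embed is ours) checks that it sends Hermitian matrices to real symmetric matrices, that it preserves matrix products (so the algebra operations indeed carry through), and that the image of the rank-one projector uu* has real rank 2.

```python
import numpy as np

def embed(Z):
    """Entrywise map a + ib -> [[a, b], [-b, a]] applied to a complex matrix Z."""
    n, m = Z.shape
    R = np.zeros((2 * n, 2 * m))
    R[0::2, 0::2] =  Z.real
    R[0::2, 1::2] =  Z.imag
    R[1::2, 0::2] = -Z.imag
    R[1::2, 1::2] =  Z.real
    return R

rng = np.random.default_rng(5)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = (A + A.conj().T) / 2                       # Hermitian
Y = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

print(np.allclose(embed(X), embed(X).T))                 # Hermitian -> symmetric
print(np.allclose(embed(X @ Y), embed(X) @ embed(Y)))    # products carry over
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
u = (u / np.linalg.norm(u)).reshape(n, 1)
print(np.linalg.matrix_rank(embed(u @ u.conj().T)))      # 2
```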
4°. Hermitian Quaternion Matrices. Quaternions are an extension of the complex numbers. Each quaternion is a sum a + bi + cj + dk, where a, b, c, d ∈ R and i, j, k are such that:
i² = j² = k² = −1,
ij = k = −ji,
jk = i = −kj,
ki = j = −ik.
Analogous to how a complex number can be expressed as a pair of real numbers, a quaternion can be expressed as a pair of complex numbers: a + bi + cj + dk = (a + bi) + (c + di)j.
The conjugate transpose X* of a quaternion matrix X is defined as follows: if (X)_{pq} = a + bi + cj + dk, then (X*)_{qp} = a − bi − cj − dk. Hermitian quaternion matrices satisfy X = X*.
In this algebra, multiplication is defined as
X ◦ Y := (XY + YX)/2.
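One convenient way to compute with quaternions, not used in the lecture but handy for checking the relations above, is to represent 1, i, j, k by 2 × 2 complex matrices. The following sketch verifies the defining identities under that representation.

```python
import numpy as np

# Sketch: represent the quaternion units by 2 x 2 complex matrices and check
# i^2 = j^2 = k^2 = -1, ij = k = -ji, jk = i = -kj, ki = j = -ik.
I2 = np.eye(2, dtype=complex)
Qi = np.array([[1j, 0], [0, -1j]])
Qj = np.array([[0, 1], [-1, 0]], dtype=complex)
Qk = np.array([[0, 1j], [1j, 0]])

print(all(np.allclose(M @ M, -I2) for M in (Qi, Qj, Qk)))    # i^2 = j^2 = k^2 = -1
print(np.allclose(Qi @ Qj, Qk), np.allclose(Qj @ Qi, -Qk))   # ij = k = -ji
print(np.allclose(Qj @ Qk, Qi), np.allclose(Qk @ Qj, -Qi))   # jk = i = -kj
print(np.allclose(Qk @ Qi, Qj), np.allclose(Qi @ Qk, -Qj))   # ki = j = -ik
```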
5°. Hermitian Matrices of Octonions of Size 3 × 3. Octonions are an extension of quaternions in the same way as the quaternions are an extension of the complex numbers. Introduce a number l such that l² = −1. Then an octonion can be written as p₁ + lp₂, where p₁ and p₂ are quaternions. By definition,
(p₁ + lp₂)(q₁ + lq₂) := (p₁q₁ − q̄₂p₂) + (q₂p₁ + p₂q̄₁)l.
The main difference between octonions and quaternions is that multiplication of octonions is not associative. Thus, if we build matrices out of octonions, the matrix multiplication will not be associative either. However, by an amazing coincidence, the set of 3 × 3 Hermitian octonion matrices with the multiplication X ◦ Y := (XY + YX)/2 is a Euclidean Jordan algebra. It is, however, a Jordan algebra that is not induced by an associative algebra; in fact, it can be shown that it is isomorphic to no Jordan algebra that is a subalgebra of a Jordan algebra induced by an associative algebra. Therefore, this algebra is often called the exceptional Jordan algebra, or the Albert algebra, named after Adrian Albert who discovered it. It can be shown that this algebra has rank 3. The underlying vector space of this algebra is 27-dimensional: there are 3 real numbers on the diagonal and three octonions off the diagonal, and since octonions are 8-dimensional, the set of such matrices yields a 27-dimensional algebra.
3 Symmetric Cone LP
We will give a brief sketch of how this theory is applied to describe interior point methods.
Suppose we are given the program:
min  ⟨c, x⟩
s.t. ⟨aᵢ, x⟩ = bᵢ for each i
     x ⪰_{K_E} 0.
Its dual is:
max  bᵀy
s.t. Σᵢ yᵢaᵢ + z = c
     z ⪰_{K_E} 0.
Complementary slackness conditions: if x ⪰_{K_E} 0, z ⪰_{K_E} 0 and ⟨x, z⟩ = 0, then x ◦ z = 0.
As we discussed earlier, (− ln det x) is an appropriate barrier for the primal:
min  ⟨c, x⟩ − µ ln det x
s.t. Ax = b.
The Lagrangian is L(x, y) = ⟨c, x⟩ − µ ln det x + yᵀ(b − Ax). It can be shown that the gradient ∇ₓ ln det(x) = x⁻¹. Thus, the optimality conditions are
∇ₓL = c − µx⁻¹ − Aᵀy = 0,
∇ᵧL = b − Ax = 0.
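The claim ∇ₓ ln det(x) = x⁻¹ can be checked numerically in the symmetric-matrix algebra, where it reads d/dt ln det(X + tV)|_{t=0} = Trace(X⁻¹V). A minimal finite-difference sketch (the data below are made up):

```python
import numpy as np

# Sketch: central-difference check that the gradient of ln det at a positive definite X is X^{-1}.
rng = np.random.default_rng(6)
n = 4
A = rng.standard_normal((n, n)); X = A @ A.T + n * np.eye(n)   # positive definite
B = rng.standard_normal((n, n)); V = (B + B.T) / 2             # symmetric direction

t = 1e-6
numeric = (np.linalg.slogdet(X + t * V)[1] - np.linalg.slogdet(X - t * V)[1]) / (2 * t)
analytic = np.trace(np.linalg.inv(X) @ V)                       # <X^{-1}, V>
print(np.isclose(numeric, analytic, rtol=1e-5))                 # True
```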
Let z := µx⁻¹, so that z = c − Aᵀy. Then we have to solve
Ax = b,
Aᵀy + z = c,
together with one of the following equivalent forms of the centrality condition:
z − µx⁻¹ = 0,
x − µz⁻¹ = 0,
x ◦ z = µe.
If we replace x ← x + Δx, y ← y + Δy, z ← z + Δz as before and linearize, we obtain the Newton system

    [ A   0   0 ] [Δx]   [r_p]
    [ 0   Aᵀ  I ] [Δy] = [r_d]
    [ E   0   F ] [Δz]   [r_c]
And now we only need to define what E and F are, depending on which form of the centrality condition is linearized:
z − µx⁻¹ = 0   →   E = −µQ_{x⁻¹},  F = I
x − µz⁻¹ = 0   →   E = I,  F = −µQ_{z⁻¹}
x ◦ z = µe     →   E = L(z),  F = L(x)
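To make the block system concrete, the following sketch assembles and solves it in the simplest case, the LP algebra, using the third choice E = L(z) = Diag(z), F = L(x) = Diag(x). The problem data and the particular residual definitions are our own illustrative assumptions, not from the lecture.

```python
import numpy as np

# Sketch (LP case, i.e. E = R^n as a direct sum of (R, .)): assemble and solve the Newton
# system with the linearization of x o z = mu*e.  The data here are made up.
rng = np.random.default_rng(7)
m, n = 3, 6
A = rng.standard_normal((m, n))
x, z = rng.uniform(0.5, 2.0, n), rng.uniform(0.5, 2.0, n)   # strictly positive iterates
y = rng.standard_normal(m)
b, c = A @ x, A.T @ y + z            # choose data so that only the centrality residual is nonzero
mu = 0.1
e = np.ones(n)

r_p = b - A @ x                      # primal residual
r_d = c - A.T @ y - z                # dual residual
r_c = mu * e - x * z                 # centrality residual for x o z = mu*e

E, F = np.diag(z), np.diag(x)        # E = L(z), F = L(x) in the LP algebra
K = np.block([
    [A,                np.zeros((m, m)), np.zeros((m, n))],
    [np.zeros((n, n)), A.T,              np.eye(n)       ],
    [E,                np.zeros((n, m)), F               ],
])
sol = np.linalg.solve(K, np.concatenate([r_p, r_d, r_c]))
dx, dy, dz = sol[:n], sol[n:n + m], sol[n + m:]
print(np.allclose(A @ dx, r_p), np.allclose(A.T @ dy + dz, r_d),
      np.allclose(z * dx + x * dz, r_c))   # the step satisfies all three block equations
```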
These relations unify the LP, SOCP, and SDP formulations of interior point methods. In fact, in Jordan-algebraic notation we can express any optimization problem with any combination of nonnegativity, second-order, or semidefinite constraints. The analysis of interior point algorithms is also streamlined in the Jordan-algebraic formulation.