Separating Doubly Nonnegative and
Completely Positive Matrices
Kurt M. Anstreicher
Dept. of Management Sciences
University of Iowa
(joint work with Hongbo Dong)
INFORMS National Meeting, Austin, November 2010
CP and DNN matrices

Let Sn denote the set of n × n real symmetric matrices, Sn+ the cone of n × n real symmetric positive semidefinite matrices, and Nn the cone of symmetric nonnegative n × n matrices.

• The cone of n × n doubly nonnegative (DNN) matrices is then Dn = Sn+ ∩ Nn.

• The cone of n × n completely positive (CP) matrices is Cn = {X | X = AAT for some n × k nonnegative matrix A}.

• The dual of Cn is the cone of n × n copositive matrices, Cn∗ = {X ∈ Sn | yT X y ≥ 0 for all y ∈ R^n_+}.

Clearly

  Cn ⊆ Dn,    Dn∗ = Sn+ + Nn ⊆ Cn∗,

and these inclusions are in fact strict for n > 4.
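The definitions above translate directly into a numerical membership test for Dn (membership in Cn, by contrast, is hard). A minimal numpy sketch, with helper names of our own:

```python
import numpy as np

def is_dnn(X, tol=1e-9):
    """Check X in D^n = S^n_+ (intersect) N^n: symmetric, PSD, entrywise >= 0."""
    X = np.asarray(X, dtype=float)
    if not np.allclose(X, X.T, atol=tol):
        return False
    psd = np.linalg.eigvalsh(X).min() >= -tol   # positive semidefinite
    nonneg = X.min() >= -tol                    # entrywise nonnegative
    return psd and nonneg

# Any CP matrix X = A A^T with A >= 0 is automatically DNN:
A = np.array([[1.0, 2.0], [0.5, 0.0], [1.0, 1.0]])  # 3x2 nonnegative factor
X = A @ A.T
print(is_dnn(X))  # True
```

Note the one-way implication: `is_dnn` returning True for a matrix built as AAT with A ≥ 0 illustrates Cn ⊆ Dn, while a PSD matrix with a negative entry (e.g. [[1, −1], [−1, 1]]) fails the test.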
Goal: Given a matrix X ∈ Dn \ Cn, separate X from Cn using a matrix V ∈ Cn∗ having V • X < 0.

Why Bother?

• [Bur09] shows that a broad class of NP-hard problems can be posed as linear optimization problems over Cn.

• Dn is a tractable relaxation of Cn, so the solution of the relaxed problem can be expected to be a matrix X ∈ Dn \ Cn.

• Note that the least n for which this can occur is n = 5.
For X ∈ Sn let G(X) denote the undirected graph on vertices {1, . . . , n} with edges {{i, j} | i ≠ j, Xij ≠ 0}.

Definition 1. Let G be an undirected graph on n vertices. Then G is called a CP graph if any matrix X ∈ Dn with G(X) = G also has X ∈ Cn.

The main result on CP graphs is the following:

Proposition 1. [KB93] An undirected graph on n vertices is a CP graph if and only if it contains no odd cycle of length 5 or greater.
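Extracting G(X) from a matrix is a one-liner over the sparsity pattern. The following sketch (our own helper, not part of [KB93]) builds the edge set and checks degrees for a matrix whose pattern is a 5-cycle, the smallest graph excluded by Proposition 1:

```python
import numpy as np

def sparsity_graph(X, tol=1e-9):
    """Edge set of G(X): pairs {i, j}, i != j, with X_ij nonzero."""
    n = X.shape[0]
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(X[i, j]) > tol}

# A matrix whose pattern is the 5-cycle 0-1-2-3-4-0: every vertex has
# degree two, and an odd cycle of length 5 is not a CP graph.
X = np.eye(5)
for i in range(5):
    X[i, (i + 1) % 5] = X[(i + 1) % 5, i] = 0.5
E = sparsity_graph(X)
deg = [sum(1 for e in E if v in e) for v in range(5)]
print(sorted(E), deg)  # 5 edges, all degrees equal to 2
```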
In [BAD09] it is shown that:

• Extreme rays of D5 are either rank-one matrices in C5, or rank-three "extremely bad" matrices where G(X) is a 5-cycle (every vertex in G(X) has degree two).

• Any such extremely bad matrix can be separated from C5 by a transformation of the Horn matrix

       [  1 −1  1  1 −1 ]
       [ −1  1 −1  1  1 ]
  H := [  1 −1  1 −1  1 ]  ∈ C5∗ \ D5∗.
       [  1  1 −1  1 −1 ]
       [ −1  1  1 −1  1 ]
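Copositivity of the Horn matrix (yT H y ≥ 0 for all y ≥ 0) can be probed numerically by sampling the nonnegative orthant; sampling is evidence rather than a proof, but a quick sketch makes the property concrete:

```python
import numpy as np

# The Horn matrix: copositive, but not in D_5^* = S^5_+ + N^5.
H = np.array([[ 1, -1,  1,  1, -1],
              [-1,  1, -1,  1,  1],
              [ 1, -1,  1, -1,  1],
              [ 1,  1, -1,  1, -1],
              [-1,  1,  1, -1,  1]], dtype=float)

rng = np.random.default_rng(0)
Y = rng.random((100_000, 5))                # random points in R^5_+
vals = np.einsum('ki,ij,kj->k', Y, H, Y)    # y^T H y for each sample
print(vals.min() >= -1e-9)  # True: no copositivity violation found
```

The form does attain the value 0 on the boundary of the orthant (e.g. at y = (1, 1, 0, 0, 0)), which is consistent with H lying on the boundary of the copositive cone.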
In [DA10] it is shown that:

• A separation procedure based on the transformed Horn matrix applies to X ∈ D5 \ C5 where X has rank three and G(X) has at least one vertex of degree 2.

• A more general separation procedure applies to any X ∈ D5 \ C5 that is not componentwise strictly positive.

An even more general separation procedure that applies to any X ∈ D5 \ C5 is described in [BD10]. In this talk we will describe the procedure from [DA10] for X ∈ D5 \ C5, X ≯ 0, and its generalization to larger matrices having block structure.
A separation procedure for the 5 × 5 case

Assume that X ∈ D5, X ≯ 0. After a symmetric permutation and diagonal scaling, X may be assumed to have the form

      [ X11  α1  α2 ]
  X = [ α1T   1   0 ]   (1)
      [ α2T   0   1 ]

where X11 ∈ D3.

Theorem 1. [BX04, Theorem 2.1] Let X ∈ D5 have the form (1). Then X ∈ C5 if and only if there are matrices A11 and A22 such that X11 = A11 + A22, and

  [ Aii  αi ]
  [ αiT   1 ] ∈ D4,  i = 1, 2.

In [BX04], Theorem 1 is utilized only as a proof mechanism, but we now show that it has algorithmic consequences as well.
Theorem 2. Assume that X ∈ D5 has the form (1). Then X ∈ D5 \ C5 if and only if there is a matrix

      [ V11  β1  β2 ]
  V = [ β1T  γ1   0 ]
      [ β2T   0  γ2 ]

such that

  [ V11  βi ]
  [ βiT  γi ] ∈ D4∗,  i = 1, 2,

and V • X < 0.

• Suppose that X ∈ D5 \ C5 and V are as in Theorem 2. If X̃ ∈ C5 is another matrix of the form (1), then Theorem 1 implies that V • X̃ ≥ 0.

• However, we cannot conclude that V ∈ C5∗, because V • X̃ ≥ 0 only holds for X̃ of the form (1), in particular x̃45 = 0.

• Fortunately, by [HJR05, Theorem 1], V can easily be "completed" to obtain a copositive matrix that still separates X from C5.
Theorem 3. Suppose that X ∈ D5 \ C5 has the form (1), and V satisfies the conditions of Theorem 2. Define

         [ V11  β1  β2 ]
  V(s) = [ β1T  γ1   s ] .
         [ β2T   s  γ2 ]

Then V(s) • X < 0 for any s, and V(s) ∈ C5∗ for s ≥ √(γ1 γ2).
Separation for larger matrices with block structure

The procedure for the 5 × 5 case where X ≯ 0 can be generalized to larger matrices with block structure. Assume X has the form

      [ X11   X12   X13  . . .  X1k ]
      [ X12T  X22    0   . . .   0  ]
  X = [ X13T   0    X33  . . .   0  ]   (2)
      [  ...               . .    0  ]
      [ X1kT   0   . . .   0    Xkk ]

where k ≥ 3, each Xii is an ni × ni matrix, and n1 + · · · + nk = n.
Lemma 1. Suppose that X ∈ Dn has the form (2), k ≥ 3, and let

        [ X11   X1i ]
  X^i = [ X1iT  Xii ] ,  i = 2, . . . , k.

Then X ∈ Cn if and only if there are matrices Aii, i = 2, . . . , k, such that A22 + · · · + Akk = X11, and

  [ Aii   X1i ]
  [ X1iT  Xii ] ∈ Cn1+ni,  i = 2, . . . , k.

Moreover, if G(X^i) is a CP graph for each i = 2, . . . , k, then the above statement remains true with Cn1+ni replaced by Dn1+ni.
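Extracting the submatrices X^i from a block-arrow matrix of form (2) is just index bookkeeping. A small numpy sketch (helper names and the 4 × 4 example are ours, chosen only to illustrate the slicing):

```python
import numpy as np

def arrow_blocks(X, sizes):
    """Split a block-arrow matrix of form (2) into the submatrices
    X^i = [[X11, X1i], [X1i^T, Xii]], i = 2..k, used in Lemma 1."""
    offs = np.cumsum([0] + list(sizes))                    # block boundaries
    idx = [np.arange(offs[j], offs[j + 1]) for j in range(len(sizes))]
    blocks = []
    for i in range(1, len(sizes)):
        sel = np.concatenate([idx[0], idx[i]])             # rows/cols of blocks 1 and i
        blocks.append(X[np.ix_(sel, sel)])
    return blocks

# Hypothetical 4x4 example with k = 3, n1 = 2, n2 = n3 = 1.
X = np.array([[2.0, 1.0, 1.0, 1.0],
              [1.0, 2.0, 1.0, 1.0],
              [1.0, 1.0, 1.0, 0.0],   # X23 = 0: the arrow structure
              [1.0, 1.0, 0.0, 1.0]])
X2, X3 = arrow_blocks(X, [2, 1, 1])
print(X2.shape, X3.shape)  # (3, 3) (3, 3)
```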
Theorem 4. Suppose that X ∈ Dn \ Cn has the form (2), where G(X^i) is a CP graph, i = 2, . . . , k. Then there is a matrix V, also of the form (2), such that

  [ V11   V1i ]
  [ V1iT  Vii ] ∈ Dn1+ni∗,  i = 2, . . . , k,

and V • X < 0. Moreover, if γi = Diag(Vii)^(1/2), then the matrix

       [ V11   . . .  V1k ]
  Ṽ =  [  ...   . .    ... ] ,
       [ V1kT  . . .  Vkk ]

where Vij = γi γjT, 2 ≤ i ≠ j ≤ k, has Ṽ ∈ Cn∗ and Ṽ • X = V • X < 0.
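Reading γi as the vector of elementwise square roots of the diagonal of Vii, the completion step in Theorem 4 is mechanical. The sketch below (our own helper, with a toy 3 × 3 V, not data from the talk) fills the zero off-diagonal blocks with the outer products γi γjT; since X is zero in exactly those blocks, the fill does not change the value of V • X:

```python
import numpy as np

def complete_cut(V, sizes):
    """Fill the zero blocks V_ij (2 <= i != j <= k) of a cut V of
    form (2) with gamma_i gamma_j^T, gamma_i = sqrt(diag(V_ii)).
    Illustrative sketch; V is assumed to satisfy Theorem 4."""
    offs = np.cumsum([0] + list(sizes))
    gammas = [np.sqrt(np.diag(V[offs[i]:offs[i + 1], offs[i]:offs[i + 1]]))
              for i in range(len(sizes))]
    Vt = V.copy()
    k = len(sizes)
    for i in range(1, k):
        for j in range(1, k):
            if i != j:   # only the off-diagonal blocks below the arrow
                Vt[offs[i]:offs[i + 1], offs[j]:offs[j + 1]] = \
                    np.outer(gammas[i], gammas[j])
    return Vt

# Toy cut with k = 3, all blocks 1x1: gamma_2 = 2, gamma_3 = 3.
V = np.array([[1.0, -1.0, -2.0],
              [-1.0, 4.0,  0.0],
              [-2.0, 0.0,  9.0]])
Vt = complete_cut(V, [1, 1, 1])
print(Vt[1, 2], Vt[2, 1])  # 6.0 6.0
```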
Note that:

• The matrix X may have numerical entries that are small but not exactly zero. One can then apply Lemma 1 to a perturbed matrix X̃ in which the entries of X below a specified tolerance are set to zero. If a cut V separating X̃ from Cn is found, then V • X ≈ V • X̃ < 0, and V is very likely to also separate X from Cn.

• Theorem 4 may provide a cut separating a given X ∈ Dn \ Cn even when the sufficient conditions for generating such a cut are not satisfied. In particular, a cut may be found even when the condition that G(X^i) is a CP graph for each i is not satisfied.
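The tolerance-based perturbation in the first note amounts to a thresholding step; a two-line numpy sketch (our own helper name), with symmetrization kept explicit so X̃ stays in Sn:

```python
import numpy as np

def truncate_small(X, tol=1e-8):
    """Perturbed matrix X~: entries of X below the tolerance set to 0,
    before applying Lemma 1 to a nearly block-structured X."""
    Xt = np.where(np.abs(X) < tol, 0.0, X)
    return (Xt + Xt.T) / 2          # keep the result exactly symmetric

X = np.array([[1.0, 3e-9], [3e-9, 1.0]])
Xt = truncate_small(X)
print(Xt)  # off-diagonal entries zeroed, diagonal untouched
```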
A second case where block structure can be used to generate cuts for a matrix X ∈ Dn \ Cn is when X has the form

      [  I     X12   X13   . . .   X1k     ]
      [ X12T    I    X23   . . .   X2k     ]
  X = [ X13T  X23T    I    . . .    ...    ]   (3)
      [  ...   ...          . .   X(k−1)k  ]
      [ X1kT  X2kT  . . .  X(k−1)kT   I    ]

where k ≥ 2, each Xij is an ni × nj matrix, and n1 + · · · + nk = n. The structure in (3) corresponds to a partitioning of the vertices {1, 2, . . . , n} into k stable sets in G(X), of sizes n1, . . . , nk (note that ni = 1 is allowed).
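The correspondence between form (3) and stable sets is easy to check mechanically: a block of indices is a stable set in G(X) exactly when X restricted to that block has no nonzero off-diagonal entry. A sketch (our own helper, with a 4-cycle pattern as the example):

```python
import numpy as np

def is_stable_set_partition(X, parts, tol=1e-9):
    """True if each index block in `parts` is a stable set in G(X),
    i.e. X restricted to the block is diagonal; this is the
    identity-diagonal-block structure of form (3) after scaling."""
    for p in parts:
        B = X[np.ix_(p, p)]
        if np.abs(B - np.diag(np.diag(B))).max() > tol:
            return False
    return True

# Pattern of a 4-cycle 0-1-2-3-0: {0, 2} and {1, 3} are stable sets.
X = np.eye(4)
for i, j in [(0, 1), (1, 2), (2, 3), (0, 3)]:
    X[i, j] = X[j, i] = 0.3
print(is_stable_set_partition(X, [[0, 2], [1, 3]]))  # True
```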
Applications

Example 1: Consider the box-constrained QP problem

  (QPB)  max  xT Q x + cT x
         s.t. 0 ≤ x ≤ e.

To describe a CP formulation of QPB, define matrices

      [ 1  xT ]            [ 1  xT  sT ]
  Y = [ x  X  ] ,    Y + = [ x  X   Z  ] .
                           [ s  ZT  S  ]

By the result of [Bur09], QPB can then be written in the form

  (QPBCP)  max  Q • X + cT x
           s.t. x + s = e,  Diag(X + 2Z + S) = e,
                Y + ∈ C2n+1.
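For very small n, QPB itself can be sanity-checked by brute force, independently of the CP machinery. A grid-search sketch (helper of our own, only a crude approximation of the continuous box maximum):

```python
import numpy as np
from itertools import product

def qpb_grid(Q, c, m=51):
    """Brute-force approximation of QPB: max x^T Q x + c^T x over the
    box 0 <= x <= e, evaluated on a uniform grid with m points per
    coordinate.  Only sensible for tiny n."""
    pts = np.linspace(0.0, 1.0, m)
    best = -np.inf
    for x in product(pts, repeat=len(c)):
        x = np.array(x)
        best = max(best, x @ Q @ x + c @ x)
    return best

Q = np.array([[0.0, 1.0], [1.0, 0.0]])   # objective 2*x1*x2 on the unit box
c = np.zeros(2)
print(qpb_grid(Q, c))  # 2.0, attained at x = (1, 1)
```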
Replacing C2n+1 with D2n+1 gives a tractable DNN relaxation.

• It can be shown that the DNN relaxation is equivalent to the "SDP+RLT" relaxation, and for n = 2 the relaxation is equivalent to QPB [AB10].

• Constraints from the Boolean Quadric Polytope (BQP) are valid for the off-diagonal components of X [BL09]. For n = 3, the BQP is completely determined by triangle inequalities and RLT constraints. However, there can still be a gap when using the SDP+RLT+TRI relaxation.

• An example from [BL09] with n = 3 has optimal value 1.0 for QPB, but value 1.093 for the SDP+RLT+TRI relaxation. The solution matrix Y + has a 5 × 5 principal submatrix that is not strictly positive, and is not CP. One can obtain a cut from Theorem 3, re-solve the problem, and repeat.
[Figure 1: Gap to optimal value for the Burer–Letchford QPB problem (n = 3). Log-scale plot of the gap to the optimal value (from 1.00E+00 down to 1.00E−09) versus the number of CP cuts added (0 to 25).]
Example 2: Let A be the adjacency matrix of a graph G on n vertices, and let α be the maximum size of a stable set in G. It is known [dKP02] that

  1/α = min { (I + A) • X : eeT • X = 1, X ∈ Cn }.   (4)

Relaxing Cn to Dn results in the Lovász–Schrijver bound

  1/ϑ′ = min { (I + A) • X : eeT • X = 1, X ∈ Dn }.   (5)
Let G12 be the complement of the graph corresponding to the vertices of a regular icosahedron [BdK02]. Then α = 3 and ϑ′ ≈ 3.24.

• Using the cone K^1_12 to approximate the dual of (4) provides no improvement [BdK02].

• For the solution matrix X ∈ D12 from (5), we cannot find a cut based on the first block structure (2). However, we can find a cut based on (3). Adding this cut and re-solving, the gap to 1/α = 1/3 is approximately 2 × 10^−8.
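As a cross-check on the left-hand side of (4), α can be computed by brute force for tiny graphs (an exponential-time helper of our own, unrelated to the copositive formulation):

```python
import numpy as np
from itertools import combinations

def stability_number(A):
    """Brute-force alpha(G) from adjacency matrix A: the size of the
    largest vertex set with no edge inside it.  Tiny graphs only."""
    n = A.shape[0]
    for size in range(n, 0, -1):
        for S in combinations(range(n), size):
            if all(A[i, j] == 0 for i, j in combinations(S, 2)):
                return size
    return 0

# alpha(C_5) = 2: the 5-cycle that also drives the D_5 vs C_5 gap.
A = np.zeros((5, 5), dtype=int)
for i in range(5):
    A[i, (i + 1) % 5] = A[(i + 1) % 5, i] = 1
print(stability_number(A))  # 2
```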
References

[AB10]  Kurt M. Anstreicher and Samuel Burer, Computable representations for convex hulls of low-dimensional quadratic forms, Math. Prog. B 124 (2010), 33–43.

[BAD09] Samuel Burer, Kurt M. Anstreicher, and Mirjam Dür, The difference between 5 × 5 doubly nonnegative and completely positive matrices, Linear Algebra Appl. 431 (2009), 1539–1552.

[BD10]  Samuel Burer and Hongbo Dong, Separation and relaxation for cones of quadratic forms, Dept. of Management Sciences, University of Iowa (2010).

[BdK02] Immanuel M. Bomze and Etienne de Klerk, Solving standard quadratic optimization problems via linear, semidefinite and copositive programming, J. Global Optim. 24 (2002), 163–185.

[BL09]  Samuel Burer and Adam N. Letchford, On nonconvex quadratic programming with box constraints, SIAM J. Optim. 20 (2009), 1073–1089.

[Bur09] Samuel Burer, On the copositive representation of binary and continuous nonconvex quadratic programs, Math. Prog. 120 (2009), 479–495.

[BX04]  Abraham Berman and Changqing Xu, 5 × 5 completely positive matrices, Linear Algebra Appl. 393 (2004), 55–71.

[DA10]  Hongbo Dong and Kurt Anstreicher, Separating doubly nonnegative and completely positive matrices, Dept. of Management Sciences, University of Iowa (2010).

[dKP02] Etienne de Klerk and Dmitrii V. Pasechnik, Approximation of the stability number of a graph via copositive programming, SIAM J. Optim. 12 (2002), 875–892.

[HJR05] Leslie Hogben, Charles R. Johnson, and Robert Reams, The copositive completion problem, Linear Algebra Appl. 408 (2005), 207–211.

[KB93]  Natalia Kogan and Abraham Berman, Characterization of completely positive graphs, Discrete Math. 114 (1993), 297–304.