Separating Doubly Nonnegative and Completely Positive Matrices

Kurt M. Anstreicher, Dept. of Management Sciences, University of Iowa
(joint work with Hongbo Dong)

INFORMS National Meeting, Austin, November 2010

CP and DNN matrices

Let Sn denote the set of n × n real symmetric matrices, Sn+ the cone of n × n real symmetric positive semidefinite matrices, and Nn the cone of symmetric nonnegative n × n matrices.

• The cone of n × n doubly nonnegative (DNN) matrices is then Dn = Sn+ ∩ Nn.
• The cone of n × n completely positive (CP) matrices is Cn = {X | X = AA^T for some n × k nonnegative matrix A}.
• The dual of Cn is the cone of n × n copositive matrices, Cn* = {X ∈ Sn | y^T X y ≥ 0 for all y ∈ Rn+}.
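These definitions are easy to check numerically. The following NumPy sketch (an illustration, not part of the talk; the helper name `is_dnn` is ours) tests membership in Dn via an eigenvalue and sign check, and confirms that a matrix built as X = AA^T with A nonnegative, which is CP by construction, passes the test, since every CP matrix is PSD and entrywise nonnegative:

```python
import numpy as np

def is_dnn(X, tol=1e-9):
    """Numerically test X in Dn = Sn+ ∩ Nn: symmetric, PSD, entrywise nonnegative."""
    X = np.asarray(X, dtype=float)
    if not np.allclose(X, X.T, atol=tol):
        return False
    psd = np.linalg.eigvalsh(X).min() >= -tol   # X in Sn+
    nonneg = X.min() >= -tol                    # X in Nn
    return bool(psd and nonneg)

# A CP matrix X = A A^T with A >= 0 is PSD and entrywise nonnegative,
# so every CP matrix is doubly nonnegative.
A = np.random.default_rng(0).random((5, 3))     # random nonnegative 5 x 3 factor
assert is_dnn(A @ A.T)

# PSD alone is not enough: this matrix is PSD but has negative entries.
assert not is_dnn([[1.0, -0.5], [-0.5, 1.0]])
```

Note that the converse test (membership in Cn) is much harder; that gap is exactly the subject of this talk.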
It is clear that Cn ⊆ Dn and Dn* = Sn+ + Nn ⊆ Cn*, and these inclusions are in fact strict for n ≥ 5.

Goal: Given a matrix X ∈ Dn \ Cn, separate X from Cn using a matrix V ∈ Cn* having V • X < 0.

Why bother?

• [Bur09] shows that a broad class of NP-hard problems can be posed as linear optimization problems over Cn.
• Dn is a tractable relaxation of Cn, so the solution of the relaxed problem can be expected to be a matrix X ∈ Dn \ Cn.
• Note that the least n for which this problem occurs is n = 5.

For X ∈ Sn, let G(X) denote the undirected graph on vertices {1, . . . , n} with edges {{i, j} | i ≠ j, Xij ≠ 0}.

Definition 1. Let G be an undirected graph on n vertices. Then G is called a CP graph if any matrix X ∈ Dn with G(X) = G also has X ∈ Cn.
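To make Definition 1 concrete, here is a small NumPy sketch (illustrative only; the matrix and helper `graph_edges` below are not from the talk) that extracts the edge set of G(X) and exhibits a doubly nonnegative matrix whose graph is a 5-cycle, an odd cycle of length 5:

```python
import numpy as np

def graph_edges(X, tol=0.0):
    """Edge set of G(X): pairs (i, j) with i < j and X[i, j] != 0."""
    X = np.asarray(X)
    n = X.shape[0]
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(X[i, j]) > tol}

# A doubly nonnegative matrix supported on a 5-cycle:
# I + 0.5 * (adjacency matrix of the cycle 0-1-2-3-4-0).
P = np.roll(np.eye(5), 1, axis=1)            # cyclic shift matrix
X = np.eye(5) + 0.5 * (P + P.T)

assert np.linalg.eigvalsh(X).min() > 0       # positive definite
assert X.min() >= 0                          # entrywise nonnegative
assert graph_edges(X) == {(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)}
```

Since a 5-cycle is an odd cycle of length 5, membership of such matrices in C5 is not automatic.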
The main result on CP graphs is the following:

Proposition 1. [KB93] An undirected graph on n vertices is a CP graph if and only if it contains no odd cycle of length 5 or greater.

In [BAD09] it is shown that:

• Extreme rays of D5 are either rank-one matrices in C5, or rank-three "extremely bad" matrices where G(X) is a 5-cycle (every vertex in G(X) has degree two).
• Any such extremely bad matrix can be separated from C5 by a transformation of the Horn matrix

          [  1  -1   1   1  -1 ]
          [ -1   1  -1   1   1 ]
    H :=  [  1  -1   1  -1   1 ]  ∈ C5* \ D5*.
          [  1   1  -1   1  -1 ]
          [ -1   1   1  -1   1 ]

In [DA10] it is shown that:

• The separation procedure based on the transformed Horn matrix applies to X ∈ D5 \ C5 where X has rank three and G(X) has at least one vertex of degree 2.
• A more general separation procedure applies to any X ∈ D5 \ C5 that is not componentwise strictly positive.

An even more general separation procedure that applies to any X ∈ D5 \ C5 is described in [BD10]. In this talk we describe the procedure from [DA10] for X ∈ D5 \ C5, X ≯ 0, and its generalization to larger matrices having block structure.
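The Horn matrix is circulant, and its key properties are easy to spot-check numerically. The sketch below (an illustration, not from the talk) builds H, verifies that it has a negative eigenvalue (so it is not positive semidefinite), and samples its quadratic form over nonnegative vectors, consistent with copositivity:

```python
import numpy as np

# Horn matrix: circulant with first row (1, -1, 1, 1, -1).
row = np.array([1.0, -1.0, 1.0, 1.0, -1.0])
H = np.vstack([np.roll(row, i) for i in range(5)])

assert np.array_equal(H, H.T)            # symmetric
assert np.linalg.eigvalsh(H).min() < 0   # has a negative eigenvalue: H is not PSD

# Spot-check copositivity: y^T H y >= 0 for sampled nonnegative y.
rng = np.random.default_rng(1)
Y = rng.random((10000, 5))               # rows are nonnegative vectors
vals = np.einsum('ij,jk,ik->i', Y, H, Y)
assert vals.min() >= -1e-9
```

Random sampling can refute copositivity but never certify it; deciding copositivity exactly is co-NP-hard in general.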
A separation procedure for the 5 × 5 case

Assume that X ∈ D5, X ≯ 0. After a symmetric permutation and diagonal scaling, X may be assumed to have the form

            [ X11   α1   α2 ]
    (1) X = [ α1^T   1    0 ]
            [ α2^T   0    1 ]

where X11 ∈ D3.

Theorem 1. [BX04, Theorem 2.1] Let X ∈ D5 have the form (1). Then X ∈ C5 if and only if there are matrices A11 and A22 such that X11 = A11 + A22 and

    [ Aii   αi ]
    [ αi^T   1 ]  ∈ D4,   i = 1, 2.

In [BX04], Theorem 1 is utilized only as a proof mechanism, but we now show that it has algorithmic consequences as well.

Theorem 2. Assume that X ∈ D5 has the form (1). Then X ∈ D5 \ C5 if and only if there is a matrix

        [ V11   β1   β2 ]
    V = [ β1^T  γ1    0 ]
        [ β2^T   0   γ2 ]

such that

    [ V11   βi ]
    [ βi^T  γi ]  ∈ D4*,   i = 1, 2,

and V • X < 0.

• Suppose that X ∈ D5 \ C5 and V are as in Theorem 2. If X̃ ∈ C5 is another matrix of the form (1), then Theorem 1 implies that V • X̃ ≥ 0.
• However, we cannot conclude that V ∈ C5* because V • X̃ ≥ 0 only holds for X̃ of the form (1), in particular with x̃45 = 0.
• Fortunately, by [HJR05, Theorem 1], V can easily be "completed" to obtain a copositive matrix that still separates X from C5.

Theorem 3. Suppose that X ∈ D5 \ C5 has the form (1), and V satisfies the conditions of Theorem 2. Define

           [ V11   β1   β2 ]
    V(s) = [ β1^T  γ1    s ].
           [ β2^T   s   γ2 ]

Then V(s) • X < 0 for any s (since x45 = x54 = 0, the value of s does not affect V(s) • X), and V(s) ∈ C5* for s ≥ √(γ1γ2).

Separation for larger matrices with block structure

The procedure for the 5 × 5 case with X ≯ 0 can be generalized to larger matrices with block structure. Assume X has the form

            [ X11    X12   X13  . . .  X1k ]
            [ X12^T  X22    0   . . .   0  ]
    (2) X = [ X13^T   0    X33  . . .   0  ]
            [   .     .     .    .      .  ]
            [ X1k^T   0     0   . . .  Xkk ]

where k ≥ 3, each Xii is an ni × ni matrix, and n1 + · · · + nk = n.

Lemma 1. Suppose that X ∈ Dn has the form (2), k ≥ 3, and let

          [ X11    X1i ]
    X^i = [ X1i^T  Xii ],   i = 2, . . . , k.

Then X ∈ Cn if and only if there are matrices Aii, i = 2, . . . , k, such that A22 + · · · + Akk = X11 and

    [ Aii    X1i ]
    [ X1i^T  Xii ]  ∈ C(n1+ni),   i = 2, . . . , k.

Moreover, if G(X^i) is a CP graph for each i = 2, . . . , k, then the above statement remains true with C(n1+ni) replaced by D(n1+ni).

Theorem 4. Suppose that X ∈ Dn \ Cn has the form (2), where G(X^i) is a CP graph, i = 2, . . . , k. Then there is a matrix V, also of the form (2), such that

    [ V11    V1i ]
    [ V1i^T  Vii ]  ∈ D(n1+ni)*,   i = 2, . . . , k,

and V • X < 0. Moreover, if γi is the componentwise square root of the diagonal of Vii, then the matrix

         [ V11    . . .  V1k ]
    Ṽ =  [   .     .      .  ],
         [ V1k^T  . . .  Vkk ]

where Ṽij = γiγj^T for 2 ≤ i ≠ j ≤ k, has Ṽ ∈ Cn* and Ṽ • X = V • X < 0.

Note that:

• Matrix X may have numerical entries that are small but not exactly zero.
One can then apply Lemma 1 to the perturbed matrix X̃ in which the entries of X below a specified tolerance are set to zero. If a cut V separating X̃ from Cn is found, then V • X ≈ V • X̃ < 0, and V is very likely to also separate X from Cn.

• Theorem 4 may provide a cut separating a given X ∈ Dn \ Cn even when the sufficient conditions for generating such a cut are not satisfied. In particular, a cut may be found even when G(X^i) fails to be a CP graph for some i.

A second case where block structure can be used to generate cuts for a matrix X ∈ Dn \ Cn is when X has the form

            [   I      X12    X13    . . .      X1k    ]
            [ X12^T     I     X23    . . .      X2k    ]
    (3) X = [ X13^T   X23^T    I     . . .       .     ]
            [   .       .      .      .      X(k-1)k   ]
            [ X1k^T   X2k^T  . . .  X(k-1)k^T    I     ]

where k ≥ 2, each Xij is an ni × nj matrix, and n1 + · · · + nk = n. The structure in (3) corresponds to a partitioning of the vertices {1, 2, . . . , n} into k stable sets in G(X), of sizes n1, . . . , nk (note that ni = 1 is allowed).

Applications

Example 1: Consider the box-constrained QP problem

    (QPB)   max x^T Qx + c^T x   s.t.  0 ≤ x ≤ e.

To describe a CP formulation of QPB, define the matrices

        [ 1   x^T ]          [ 1   x^T  s^T ]
    Y = [ x    X  ],   Y+ =  [ x    X    Z  ].
                             [ s   Z^T   S  ]

By the result of [Bur09], QPB can then be written in the form

    (QPBCP)  max  Q • X + c^T x
             s.t. x + s = e,
                  Diag(X + 2Z + S) = e,
                  Y+ ∈ C(2n+1).

Replacing C(2n+1) with D(2n+1) gives a tractable DNN relaxation.
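As a sanity check on the QPBCP constraints, here is a minimal NumPy sketch (assuming nothing beyond the formulation above): for any feasible x of QPB with slack s = e − x, the rank-one matrix Y+ = (1, x, s)(1, x, s)^T is CP by construction, and the linear constraints hold automatically because the diagonal of X + 2Z + S is componentwise (x + s)² = e:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
x = rng.random(n)           # any feasible point of QPB: 0 <= x <= e
s = 1.0 - x                 # slack variables, so x + s = e

v = np.concatenate(([1.0], x, s))
Yp = np.outer(v, v)         # rank-one Y+ = (1, x, s)(1, x, s)^T, CP since v >= 0

X = Yp[1:n + 1, 1:n + 1]    # block X = x x^T
Z = Yp[1:n + 1, n + 1:]     # block Z = x s^T
S = Yp[n + 1:, n + 1:]      # block S = s s^T

# The linear constraints of QPBCP hold automatically:
assert np.allclose(x + s, 1.0)
# diag(X + 2Z + S) = x^2 + 2*x*s + s^2 = (x + s)^2 = e componentwise
assert np.allclose(np.diag(X + 2 * Z + S), 1.0)
```

Such rank-one points show that QPBCP relaxes QPB; the substance of [Bur09] is that the relaxation is in fact exact.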
• It can be shown that the DNN relaxation is equivalent to the "SDP+RLT" relaxation, and for n = 2 the relaxation is equivalent to QPB [AB10].
• Constraints from the Boolean Quadric Polytope (BQP) are valid for the off-diagonal components of X [BL09]. For n = 3, the BQP is completely determined by the triangle inequalities and RLT constraints. However, there can still be a gap when using the SDP+RLT+TRI relaxation.
• An example from [BL09] with n = 3 has optimal value 1.0 for QPB, but value 1.093 for the SDP+RLT+TRI relaxation. The solution matrix Y+ has a 5 × 5 principal submatrix that is not strictly positive and is not CP. One can obtain a cut from Theorem 3, re-solve the problem, and repeat.

Figure 1: Gap to optimal value for the Burer-Letchford QPB problem (n = 3); the gap (log scale, from 10^0 down to 10^-9) is plotted against the number of CP cuts added (0 to 25).

Example 2: Let A be the adjacency matrix of a graph G on n vertices, and let α be the maximum size of a stable set in G. It is known [dKP02] that
    (4)   α^(-1) = min { (I + A) • X : ee^T • X = 1, X ∈ Cn }.

Relaxing Cn to Dn results in the Lovász-Schrijver bound

    (5)   (ϑ′)^(-1) = min { (I + A) • X : ee^T • X = 1, X ∈ Dn }.

Let G12 be the complement of the graph corresponding to the vertices of a regular icosahedron [BdK02]. Then α = 3 and ϑ′ ≈ 3.24.

• Using the cone K12^1 to approximate the dual of (4) provides no improvement [BdK02].
• For the solution matrix X ∈ D12 from (5), no cut can be found based on the first block structure (2). However, a cut can be found based on (3). Adding this cut and re-solving, the gap to 1/α = 1/3 is approximately 2 × 10^(-8).

References

[AB10] Kurt M. Anstreicher and Samuel Burer, Computable representations for convex hulls of low-dimensional quadratic forms, Math. Prog. B 124 (2010), 33–43.
[BAD09] Samuel Burer, Kurt M. Anstreicher, and Mirjam Dür, The difference between 5 × 5 doubly nonnegative and completely positive matrices, Linear Algebra Appl. 431 (2009), 1539–1552.
[BD10] Samuel Burer and Hongbo Dong, Separation and relaxation for cones of quadratic forms, Dept. of Management Sciences, University of Iowa (2010).
[BdK02] Immanuel M. Bomze and Etienne de Klerk, Solving standard quadratic optimization problems via linear, semidefinite and copositive programming, J. Global Optim. 24 (2002), 163–185.
[BL09] Samuel Burer and Adam N. Letchford, On nonconvex quadratic programming with box constraints, SIAM J. Optim. 20 (2009), 1073–1089.
[Bur09] Samuel Burer, On the copositive representation of binary and continuous nonconvex quadratic programs, Math. Prog. 120 (2009), 479–495.
[BX04] Abraham Berman and Changqing Xu, 5 × 5 completely positive matrices, Linear Algebra Appl. 393 (2004), 55–71.
[DA10] Hongbo Dong and Kurt Anstreicher, Separating doubly nonnegative and completely positive matrices, Dept. of Management Sciences, University of Iowa (2010).
[dKP02] Etienne de Klerk and Dmitrii V. Pasechnik, Approximation of the stability number of a graph via copositive programming, SIAM J. Optim. 12 (2002), 875–892.
[HJR05] Leslie Hogben, Charles R. Johnson, and Robert Reams, The copositive completion problem, Linear Algebra Appl. 408 (2005), 207–211.
[KB93] Natalia Kogan and Abraham Berman, Characterization of completely positive graphs, Discrete Math. 114 (1993), 297–304.