Separating Doubly Nonnegative and Completely Positive Matrices

Kurt M. Anstreicher, Dept. of Management Sciences, University of Iowa
(joint work with Hongbo Dong)

INFORMS National Meeting, Austin, November 2010

CP and DNN matrices

Let Sn denote the set of n × n real symmetric matrices, Sn+ the cone of n × n real symmetric positive semidefinite matrices, and Nn the cone of symmetric nonnegative n × n matrices.

• The cone of n × n doubly nonnegative (DNN) matrices is then Dn = Sn+ ∩ Nn.
• The cone of n × n completely positive (CP) matrices is Cn = {X | X = A A^T for some n × k nonnegative matrix A}.
• The dual of Cn is the cone of n × n copositive matrices, Cn* = {X ∈ Sn | y^T X y ≥ 0 for all y ∈ R^n_+}.

Clearly Cn ⊆ Dn and Dn* = Sn+ + Nn ⊆ Cn*, and these inclusions are in fact strict for n ≥ 5.

Goal: Given a matrix X ∈ Dn \ Cn, separate X from Cn using a matrix V ∈ Cn* having V • X < 0.

Why bother?

• [Bur09] shows that a broad class of NP-hard problems can be posed as linear optimization problems over Cn.
• Dn is a tractable relaxation of Cn (membership is easy to test; see the sketch below), so we expect the solution of the relaxed problem to be some X ∈ Dn \ Cn.
• Note that the least n for which the problem occurs is n = 5.
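Since Dn = Sn+ ∩ Nn, membership in Dn reduces to a symmetry check, an entrywise sign check, and an eigenvalue computation, which is what makes optimization over Dn tractable. Below is a minimal numpy sketch of such a membership test; the helper name is_dnn and the tolerance handling are our own illustration, not part of the talk.

```python
import numpy as np

def is_dnn(X, tol=1e-9):
    """Test membership in Dn = Sn+ ∩ Nn: symmetric, entrywise nonnegative, PSD."""
    X = np.asarray(X, dtype=float)
    if not np.allclose(X, X.T, atol=tol):        # X in Sn
        return False
    if X.min() < -tol:                           # X in Nn
        return False
    return np.linalg.eigvalsh(X).min() >= -tol   # X in Sn+

# Any X = A A^T with A >= 0 is CP, hence also DNN:
A = np.array([[1.0, 2.0], [0.5, 0.0], [1.5, 3.0]])
print(is_dnn(A @ A.T))  # True
```

No comparably simple test is available for membership in Cn, which is exactly why separation procedures are of interest.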
For X ∈ Sn, let G(X) denote the undirected graph on vertices {1, . . . , n} with edges {{i, j} | i ≠ j, Xij ≠ 0}.

Definition 1. Let G be an undirected graph on n vertices. Then G is called a CP graph if any matrix X ∈ Dn with G(X) = G also has X ∈ Cn.

The main result on CP graphs is the following:

Proposition 1. [KB93] An undirected graph on n vertices is a CP graph if and only if it contains no odd cycle of length 5 or greater.

In [BAD09] it is shown that:

• Extreme rays of D5 are either rank-one matrices in C5, or rank-three "extremely bad" matrices where G(X) is a 5-cycle (every vertex in G(X) has degree two).
• Any such extremely bad matrix can be separated from C5 by a transformation of the Horn matrix

         [  1  -1   1   1  -1 ]
         [ -1   1  -1   1   1 ]
    H := [  1  -1   1  -1   1 ]  ∈ C5* \ D5*.
         [  1   1  -1   1  -1 ]
         [ -1   1   1  -1   1 ]
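Copositivity of H (y^T H y ≥ 0 for all y ≥ 0) can be spot-checked numerically by sampling nonnegative vectors. This is not a proof, only a quick sanity check useful for catching transcription errors; the sampling scheme below is our own illustration.

```python
import numpy as np

# The Horn matrix H from the slide above.
H = np.array([
    [ 1, -1,  1,  1, -1],
    [-1,  1, -1,  1,  1],
    [ 1, -1,  1, -1,  1],
    [ 1,  1, -1,  1, -1],
    [-1,  1,  1, -1,  1],
], dtype=float)

rng = np.random.default_rng(0)
Y = rng.random((100_000, 5))          # random nonnegative test vectors
quad = ((Y @ H) * Y).sum(axis=1)      # y^T H y for each sample y
print(quad.min() >= -1e-12)           # True: consistent with copositivity
print(np.linalg.eigvalsh(H).min())    # negative: H is not PSD
```

The last line confirms that H is indefinite; together with its negative entries this is consistent with the stated fact that H lies outside D5* = S5+ + N5.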
In [DA10] it is shown that:

• A separation procedure based on the transformed Horn matrix applies to X ∈ D5 \ C5 where X has rank three and G(X) has at least one vertex of degree 2.
• A more general separation procedure applies to any X ∈ D5 \ C5 that is not componentwise strictly positive.

An even more general separation procedure that applies to any X ∈ D5 \ C5 is described in [BD10]. In this talk we describe the procedure from [DA10] for X ∈ D5 \ C5, X ≯ 0, and its generalization to larger matrices having block structure.

A separation procedure for the 5 × 5 case

Assume that X ∈ D5, X ≯ 0. After a symmetric permutation and diagonal scaling, X may be assumed to have the form

        [ X11   α1  α2 ]
    X = [ α1^T   1   0 ]     (1)
        [ α2^T   0   1 ]

where X11 ∈ D3.

Theorem 1. [BX04, Theorem 2.1] Let X ∈ D5 have the form (1). Then X ∈ C5 if and only if there are matrices A11 and A22 such that X11 = A11 + A22 and

    [ Aii   αi ]
    [ αi^T   1 ]  ∈ D4,  i = 1, 2.

In [BX04], Theorem 1 is used only as a proof mechanism, but we now show that it has algorithmic consequences as well.

Theorem 2. Assume that X ∈ D5 has the form (1). Then X ∈ D5 \ C5 if and only if there is a matrix

        [ V11   β1  β2 ]                    [ V11   βi ]
    V = [ β1^T  γ1   0 ]     such that      [ βi^T  γi ]  ∈ D4*,  i = 1, 2,
        [ β2^T   0  γ2 ]

and V • X < 0.

• Suppose that X ∈ D5 \ C5 and V are as in Theorem 2. If X̃ ∈ C5 is another matrix of the form (1), then Theorem 1 implies that V • X̃ ≥ 0.
• However, we cannot conclude that V ∈ C5*, because V • X̃ ≥ 0 holds only for X̃ of the form (1), in particular with x̃45 = 0.
• Fortunately, by [HJR05, Theorem 1], V can easily be "completed" to obtain a copositive matrix that still separates X from C5.

Theorem 3. Suppose that X ∈ D5 \ C5 has the form (1), and V satisfies the conditions of Theorem 2. Define

           [ V11   β1  β2 ]
    V(s) = [ β1^T  γ1   s ] .
           [ β2^T   s  γ2 ]

Then V(s) • X < 0 for any s, and V(s) ∈ C5* for s ≥ √(γ1 γ2).
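Theorem 3 makes the completion step purely mechanical: take the V from Theorem 2 and fill in the missing (4,5) entry with s = √(γ1 γ2). Below is a small numpy sketch of this assembly; the helper name complete_cut and its argument layout are hypothetical.

```python
import numpy as np

def complete_cut(V11, b1, b2, g1, g2):
    """Assemble V(s) of Theorem 3 with s = sqrt(g1*g2), the smallest
    value for which Theorem 3 guarantees V(s) is copositive.
    V11 is 3x3 symmetric; b1, b2 are length-3 vectors; g1, g2 > 0."""
    s = np.sqrt(g1 * g2)
    V = np.zeros((5, 5))
    V[:3, :3] = V11
    V[:3, 3] = V[3, :3] = b1
    V[:3, 4] = V[4, :3] = b2
    V[3, 3], V[4, 4] = g1, g2
    V[3, 4] = V[4, 3] = s
    return V
```

Because x45 = x54 = 0 for X of the form (1), the value V(s) • X does not depend on s, so the completed matrix inherits V(s) • X = V • X < 0 while gaining copositivity.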
Separation for larger matrices with block structure

The procedure for the 5 × 5 case with X ≯ 0 can be generalized to larger matrices with block structure. Assume X has the form

        [ X11    X12   X13  . . .  X1k ]
        [ X12^T  X22    0   . . .   0  ]
    X = [ X13^T   0    X33  . . .   0  ]     (2)
        [  ...   ...   ...  . . .  ... ]
        [ X1k^T   0     0   . . .  Xkk ]

where k ≥ 3, each Xii is an ni × ni matrix, and n1 + · · · + nk = n.

Lemma 1. Suppose that X ∈ Dn has the form (2), k ≥ 3, and let

          [ X11    X1i ]
    X^i = [ X1i^T  Xii ] ,  i = 2, . . . , k.

Then X ∈ Cn if and only if there are matrices Aii, i = 2, . . . , k, such that A22 + · · · + Akk = X11 and

    [ Aii    X1i ]
    [ X1i^T  Xii ]  ∈ C(n1+ni),  i = 2, . . . , k.

Moreover, if G(X^i) is a CP graph for each i = 2, . . . , k, then the above statement remains true with C(n1+ni) replaced by D(n1+ni).

Theorem 4. Suppose that X ∈ Dn \ Cn has the form (2), where G(X^i) is a CP graph, i = 2, . . . , k. Then there is a matrix V, also of the form (2), such that

    [ V11    V1i ]
    [ V1i^T  Vii ]  ∈ D(n1+ni)*,  i = 2, . . . , k,

and V • X < 0. Moreover, if γi is the vector whose entries are the square roots of the diagonal entries of Vii, then the matrix

        [ V11    . . .  V1k ]
    Ṽ = [  ...   . . .  ... ] ,
        [ V1k^T  . . .  Vkk ]

where Vij = γi γj^T for 2 ≤ i ≠ j ≤ k, has Ṽ ∈ Cn* and Ṽ • X = V • X < 0.

Note that:

• The matrix X may have numerical entries that are small but not exactly zero. One can then apply Lemma 1 to a perturbed matrix X̃ in which entries of X below a specified tolerance are set to zero. If a cut V separating X̃ from Cn is found, then V • X ≈ V • X̃ < 0, and V is very likely to also separate X from Cn.
• Theorem 4 may provide a cut separating a given X ∈ Dn \ Cn even when the sufficient conditions for generating such a cut are not satisfied. In particular, a cut may be found even when the condition that G(X^i) is a CP graph for each i fails.

A second case where block structure can be used to generate cuts for a matrix X ∈ Dn \ Cn is when X has the form

        [   I     X12    X13   . . .  X1k ]
        [ X12^T    I     X23   . . .  X2k ]
    X = [ X13^T  X23^T    I    . . .  X3k ]     (3)
        [  ...    ...    ...   . . .  ... ]
        [ X1k^T  X2k^T  X3k^T  . . .   I  ]

where k ≥ 2, each Xij is an ni × nj matrix, and n1 + · · · + nk = n. The structure in (3) corresponds to a partitioning of the vertices {1, 2, . . . , n} into k stable sets in G(X), of sizes n1, . . . , nk (note that ni = 1 is allowed).

Applications

Example 1: Consider the box-constrained QP problem

    (QPB)  max x^T Q x + c^T x  s.t.  0 ≤ x ≤ e.

To describe a CP formulation of QPB, define matrices

        [ 1  x^T ]          [ 1  x^T  s^T ]
    Y = [ x   X  ] ,  Y+ =  [ x   X    Z  ] .
                            [ s  Z^T   S  ]

By the result of [Bur09], QPB can then be written in the form

    (QPBCP)  max Q • X + c^T x
             s.t. x + s = e,  diag(X + 2Z + S) = e,  Y+ ∈ C(2n+1).

Replacing C(2n+1) with D(2n+1) gives a tractable DNN relaxation.

• It can be shown that the DNN relaxation is equivalent to the "SDP+RLT" relaxation, and that for n = 2 it is equivalent to QPB [AB10].
• Constraints from the Boolean Quadric Polytope (BQP) are valid for the off-diagonal components of X [BL09]. For n = 3, the BQP is completely determined by triangle inequalities and RLT constraints. However, there can still be a gap when using the SDP+RLT+TRI relaxation.
• An example from [BL09] with n = 3 has optimal value 1.0 for QPB but value 1.093 for the SDP+RLT+TRI relaxation. The solution matrix Y+ has a 5 × 5 principal submatrix that is not strictly positive and is not CP. We can obtain a cut from Theorem 3, re-solve the problem, and repeat.

[Figure 1: Gap to optimal value for the Burer-Letchford QPB problem (n = 3), plotted on a log scale from 1e0 down to 1e-9, versus the number of CP cuts added (0 to 25).]

Example 2: Let A be the adjacency matrix of a graph G on n vertices, and let α be the maximum size of a stable set in G. It is known [dKP02] that

    α^{-1} = min { (I + A) • X : ee^T • X = 1, X ∈ Cn }.     (4)

Relaxing Cn to Dn results in the Lovász-Schrijver bound

    (ϑ′)^{-1} = min { (I + A) • X : ee^T • X = 1, X ∈ Dn }.     (5)

Let G12 be the complement of the graph corresponding to the vertices of a regular icosahedron [BdK02]. Then α = 3 and ϑ′ ≈ 3.24.
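The bound (5) is a single tractable conic program, so it can be computed directly with an off-the-shelf modeling tool. Below is a sketch using cvxpy (assumed installed with its default conic solver); the helper name theta_prime and the 5-cycle test graph are our own, chosen because the gap α = 2 versus ϑ′ = √5 ≈ 2.236 for the 5-cycle is well known.

```python
import numpy as np
import cvxpy as cp

def theta_prime(A):
    """Lovász-Schrijver bound via the DNN relaxation (5):
    minimize (I + A) • X  s.t.  ee^T • X = 1,  X in Dn."""
    n = A.shape[0]
    X = cp.Variable((n, n), symmetric=True)
    prob = cp.Problem(
        cp.Minimize(cp.trace((np.eye(n) + A) @ X)),
        [cp.sum(X) == 1,   # ee^T • X = 1
         X >> 0,           # X in Sn+
         X >= 0])          # X in Nn
    prob.solve()
    return 1.0 / prob.value

# 5-cycle: alpha = 2, but the bound is theta' = sqrt(5) ≈ 2.236.
A = np.zeros((5, 5))
for i in range(5):
    A[i, (i + 1) % 5] = A[(i + 1) % 5, i] = 1.0
print(theta_prime(A))
```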
• Using the cone K^1_12 (the first-order cone in the copositive approximation hierarchy) to approximate the dual of (4) provides no improvement [BdK02].
• For the solution matrix X ∈ D12 from (5), no cut based on the first block structure (2) can be found. However, a cut based on (3) can be found. Adding this cut and re-solving, the gap to 1/α = 1/3 is approximately 2 × 10^{-8}.

References

[AB10] Kurt M. Anstreicher and Samuel Burer, Computable representations for convex hulls of low-dimensional quadratic forms, Math. Prog. B 124 (2010), 33–43.
[BAD09] Samuel Burer, Kurt M. Anstreicher, and Mirjam Dür, The difference between 5 × 5 doubly nonnegative and completely positive matrices, Linear Algebra Appl. 431 (2009), 1539–1552.
[BD10] Samuel Burer and Hongbo Dong, Separation and relaxation for cones of quadratic forms, working paper, Dept. of Management Sciences, University of Iowa (2010).
[BdK02] Immanuel M. Bomze and Etienne de Klerk, Solving standard quadratic optimization problems via linear, semidefinite and copositive programming, J. Global Optim. 24 (2002), 163–185.
[BL09] Samuel Burer and Adam N. Letchford, On nonconvex quadratic programming with box constraints, SIAM J. Optim. 20 (2009), 1073–1089.
[Bur09] Samuel Burer, On the copositive representation of binary and continuous nonconvex quadratic programs, Math. Prog. 120 (2009), 479–495.
[BX04] Abraham Berman and Changqing Xu, 5 × 5 completely positive matrices, Linear Algebra Appl. 393 (2004), 55–71.
[DA10] Hongbo Dong and Kurt Anstreicher, Separating doubly nonnegative and completely positive matrices, working paper, Dept. of Management Sciences, University of Iowa (2010).
[dKP02] Etienne de Klerk and Dmitrii V. Pasechnik, Approximation of the stability number of a graph via copositive programming, SIAM J. Optim. 12 (2002), 875–892.
[HJR05] Leslie Hogben, Charles R. Johnson, and Robert Reams, The copositive completion problem, Linear Algebra Appl. 408 (2005), 207–211.
[KB93] Natalia Kogan and Abraham Berman, Characterization of completely positive graphs, Discrete Math. 114 (1993), 297–304.