Our three formalisms
• Linear programs
  – Algorithm: Simplex algorithm
• Integer linear programs
  – Algorithm: Branch-and-bound (based on simplex)
• Network flow
  – Network simplex + Klein's algorithm
• All these algorithms use exponential time on some instances.

Worst-case efficient algorithms
• Can we get algorithms running in polynomial time?
• We say that an algorithm runs in polynomial time if its running time is upper bounded by a polynomial in the size of the input.
• What exactly do we mean by this?

Polynomials
• $T(n) \le 12n^7 - 7n^4 + n$
• $T(n) = O(n^c)$ for some constant $c$.
• What is $n$?
  – The size of the input, but what is "size"?
• What is $T(n)$?
  – The running time, but how do we measure it?

Models of computation
Model 1:
• Our computer holds exact real values.
• The input is given by some real matrices. The size of the input is the number of entries in the matrices.
• We perform an arithmetic operation (+, -, *, /) in each step of computation.
• The output is the exact real result.

Models of computation
Model 2:
• Our computer holds bits (bytes, words).
• The input is given by rational matrices. The size of the input is the total number of bits in the matrices (each number being described in binary notation).
• We perform some logical bit operation in each step of computation (or word operations, but we "charge" for each bit operation).
• The output is the exact rational result.

Model 1 vs. Model 2
• Model 1 is closer to the way we usually think about algorithms operating with real numbers. It is a very useful abstraction.
• Model 2 is closer to reality. Model 1 cannot be implemented in a 100% faithful way. Model 2 can be (and is).
• An algorithm in Model 1 can be converted to an algorithm in Model 2 if it does not rely on very large numbers and we restrict the input to rational numbers.
• The term "polynomial time algorithm" standardly refers to Model 2 (bits and gates).
• The term "strongly polynomial time algorithm" refers to Model 1 (numbers and arithmetic).

State-of-the-art
• Network flow (and totally unimodular LPs in general):
  – Strongly polynomial time algorithms:
    • Flow fixing (Éva Tardos, 1985)
    • Minimum mean cost cycle canceling algorithm (Goldberg-Tarjan, 1989)
• Linear programming:
  – Polynomial time algorithms:
    • The ellipsoid algorithm (Khachiyan, 1979)
    • Interior point algorithms (Karmarkar, 1984)
      – Competitive with the simplex algorithm in practice
  – A strongly polynomial time algorithm is an open problem!
• Integer linear programming:
  – A polynomial time algorithm exists if and only if P=NP.
  – This result extends to many of the problems we model using ILPs
    • E.g. there is a polynomial time algorithm for TSP if and only if P=NP.
    • Much more in the course "Combinatorial Search"!

Worst case polynomial time algorithms for Linear Programming
The Ellipsoid algorithm (Khachiyan, 1979).
• Theoretical breakthrough.
• Algorithm far from being efficient in practice.
• Spurred new research into LP algorithms.
Interior Point algorithms (Karmarkar, 1984).
• Algorithm efficient in practice.
• Tons of follow-up research and many interior point algorithms developed.
• Best algorithms often beat the simplex algorithm.

Ellipsoid algorithm
Idea of algorithm:
• Enclose (possibly a subset of) all possible feasible points within a large ball (i.e. an ellipsoid).
• Test if the center of the ellipsoid is feasible.
• If not, the ellipsoid is partitioned into two parts by a hyperplane, with all possible feasible points in one part.
• Find a new smaller ellipsoid that encloses this half-ellipsoid (pick the smallest), and repeat.

Finding a half-ellipsoid
[figure]

Finding the new ellipsoid
[figure]

Why this works
• Each iteration decreases the volume by a factor $2^{-1/(2(n+1))}$.
• The first ellipsoid will have volume less than $(2n^2 2^{2L})^n$.
• If the system is feasible, the feasible region within the ball $B(0, n2^L)$ has volume at least $2^{-(n+2)L}$.
• Hence at most $K = 16n(n+1)L$ iterations are needed, since:
  $(2n^2 2^{2L})^n \cdot 2^{-K/(2(n+1))} = (2n^2 2^{2L})^n \cdot 2^{-8nL} < 2^{-(n+2)L}$.

Technical issue
• The feasible region may not be of full dimension, and hence can be non-empty but with volume 0!
• Solution: Replace the system $Ax \le b$ with $Ax < b'$, where $b_i' = b_i + 1/2^{2L}$.
• Then: $Ax \le b$ is feasible if and only if $Ax < b'$ is feasible.

Ellipsoids
An ellipsoid in $\mathbf{R}^n$ is defined as follows:
• Let $Q$ be a non-singular $n \times n$ matrix, and let $t$ be a vector.
• This defines an affine transformation $T(x) = Qx + t$.
• The ellipsoid determined by $T$ is the image $T(S_n)$ of the unit ball $S_n = \{x \mid x^\top x \le 1\}$.
• $T(S_n) = \{x \mid (x - t)^\top B^{-1} (x - t) \le 1\}$, where $B = QQ^\top$.
• A matrix of the form $B = QQ^\top$ with $Q$ non-singular is called a positive definite matrix.

The algorithm
Input: System $Ax < b$
1. let $t_0 := 0$, $B_0 := n^2 2^{2L} I$
2. for $j := 0$ to $K - 1$, where $K = 16n(n+1)L$, do
   a) (Current ellipsoid is given by $B_j$ and $t_j$)
   b) if $At_j < b$, return $t_j$
   c) else, find $i$ such that $a_i^\top t_j \ge b_i$
   d) let
      $t_{j+1} := t_j - \frac{1}{n+1} \cdot \frac{B_j a_i}{\sqrt{a_i^\top B_j a_i}}$
      and
      $B_{j+1} := \frac{n^2}{n^2 - 1} \left( B_j - \frac{2}{n+1} \cdot \frac{(B_j a_i)(B_j a_i)^\top}{a_i^\top B_j a_i} \right)$

Implementation issues
The algorithm involves taking a square root!
Solution:
• Compute with limited precision.
• In the analysis make each new ellipsoid slightly bigger to account for introduced errors.
• Double the number of iterations to account for larger volumes.

Unique feature of the Ellipsoid Algorithm
• We do not need an explicit description of the system of inequalities:
• We just need an efficient separation oracle.
  – An algorithm that, given an infeasible solution, finds a violated inequality.
• We can potentially solve systems with exponentially many inequalities!
• This is utilized in many (theoretical) applications of the Ellipsoid method.

Interior point algorithms
Idea of algorithm:
• A polyhedron has complex structure on the boundary, with exponentially many corners, edges, etc. Navigating there takes time.
• Well inside the polyhedron there is no structure. One can navigate directly towards the optimum point without obstacles.

The linear program and its dual.
P: Maximize $c^\top x$ subject to $Ax \le b$, $x \ge 0$.
D: Minimize $b^\top y$ subject to $A^\top y \ge c$, $y \ge 0$.
Add slack variables:
P: Maximize $c^\top x$ subject to $Ax + w = b$, $x, w \ge 0$.
D: Minimize $b^\top y$ subject to $A^\top y - z = c$, $y, z \ge 0$.

Conversion to barrier problem
Maximize $c^\top x$ subject to $Ax + w = b$, $x, w \ge 0$
Change to:
Maximize $c^\top x + \mu \sum_j \log x_j + \mu \sum_i \log w_i$ subject to $Ax + w = b$

The central path
[figure]

Lagrange multipliers
Barrier problem:
Maximize $c^\top x + \mu \sum_j \log x_j + \mu \sum_i \log w_i$ subject to $Ax + w = b$
Recall from Calculus: Lagrange multipliers allow us to maximize functions subject to equality constraints.

Result
After differentiations and manipulations:
  $Ax + w = b$
  $A^\top y - z = c$
  $XZe = \mu e$
  $YWe = \mu e$
Notation:
• $X = \mathrm{diag}(x_1, x_2, \ldots, x_n)$ (and similarly for $Y$, $Z$, $W$)
• $e$ is the all-1 vector.

Existence of solution
Theorem: If the primal feasible set has nonempty interior and is bounded, then for each $\mu > 0$ there is a unique solution $(x_\mu, w_\mu, y_\mu, z_\mu)$ to the system
  $Ax + w = b$
  $A^\top y - z = c$
  $XZe = \mu e$
  $YWe = \mu e$

Path following algorithm
Given current $(x, w, y, z) > 0$:
1. Estimate a value for $\mu$.
2. Compute step directions $(\Delta x, \Delta w, \Delta y, \Delta z)$ pointing approximately at the point $(x_\mu, w_\mu, y_\mu, z_\mu)$ on the central path.
3. Compute a step length $\theta$ such that $(\tilde{x}, \tilde{w}, \tilde{y}, \tilde{z}) = (x, w, y, z) + \theta \cdot (\Delta x, \Delta w, \Delta y, \Delta z) > 0$.
4. Replace $(x, w, y, z)$ by $(\tilde{x}, \tilde{w}, \tilde{y}, \tilde{z})$ and continue.

Key points
1. Balance making progress against staying away from the boundary at suboptimal solutions.
2. Set up the system of equations for the step directions and approximate it by dropping the non-linear terms. (This is equivalent to Newton's method.)

Convergence
Repeat until the current solution satisfies (up to a tolerance):
  Primal feasibility: $b - Ax - w = 0$
  Dual feasibility: $c - A^\top y + z = 0$
  Complementarity: $z^\top x + y^\top w = 0$

Convergence theorem
Essentially:
• Primal and dual infeasibility are decreased by a factor $1 - t$ in each iteration.
• Complementarity is decreased by a factor $1 - t$ in each iteration.
Further technical details omitted.
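
Sketch: path following in code
The path following steps above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the implementation used in the course: the function name `path_following`, the all-ones starting point, the choice $\mu = \delta \cdot (z^\top x + y^\top w)/(n+m)$ with `delta = 0.1`, and the step damping `r = 0.9` are all assumptions introduced here. The step directions come from the system $Ax + w = b$, $A^\top y - z = c$, $XZe = \mu e$, $YWe = \mu e$ with the non-linear terms $\Delta X \Delta Z e$ and $\Delta Y \Delta W e$ dropped.

```python
import numpy as np

def path_following(A, b, c, delta=0.1, r=0.9, tol=1e-8, max_iter=200):
    """Sketch: maximize c@x subject to A@x <= b, x >= 0.

    delta, r, tol, max_iter and the all-ones start are illustrative choices.
    """
    m, n = A.shape
    x, z = np.ones(n), np.ones(n)   # primal variables, dual slacks
    w, y = np.ones(m), np.ones(m)   # primal slacks, dual variables
    for _ in range(max_iter):
        rho = b - A @ x - w          # primal infeasibility
        sigma = c - A.T @ y + z      # dual infeasibility
        gamma = z @ x + y @ w        # complementarity
        if max(np.abs(rho).max(), np.abs(sigma).max(), gamma) < tol:
            break
        mu = delta * gamma / (n + m)  # step 1: estimate mu
        # Step 2: linearized system, unknowns ordered (dx, dw, dy, dz).
        M = np.block([
            [A, np.eye(m), np.zeros((m, m)), np.zeros((m, n))],
            [np.zeros((n, n)), np.zeros((n, m)), A.T, -np.eye(n)],
            [np.diag(z), np.zeros((n, m)), np.zeros((n, m)), np.diag(x)],
            [np.zeros((m, n)), np.diag(y), np.diag(w), np.zeros((m, n))],
        ])
        rhs = np.concatenate([rho, sigma, mu - x * z, mu - y * w])
        d = np.linalg.solve(M, rhs)
        # Step 3: step length keeping all variables strictly positive.
        v = np.concatenate([x, w, y, z])
        ratio = np.max(-d / v)
        theta = min(1.0, r / ratio) if ratio > 0 else 1.0
        # Step 4: move to the new point.
        dx, dw, dy, dz = np.split(d, [n, n + m, n + 2 * m])
        x, w = x + theta * dx, w + theta * dw
        y, z = y + theta * dy, z + theta * dz
    return x, w, y, z

# Small example LP (data chosen here for illustration).
A = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 2.0]])
b = np.array([4.0, 12.0, 18.0])
c = np.array([3.0, 5.0])
x, w, y, z = path_following(A, b, c)
print("x =", x, "objective =", c @ x)
```

The optimum of this small example is $x = (2, 6)$ with objective value 36, so the iterates should approach that point from the interior. Solving the full block system with a dense solver keeps the sketch short; practical codes reduce it to a smaller system first.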

Unique feature of interior point algorithms
• The polyhedral structure of the set of feasible solutions is largely irrelevant.
• What we need is a set of manageable equations expressing strong duality.
• We have strong duality for smooth convex non-linear optimization problems in general.
• This is utilized in many (practical) applications, such as support vector machines in machine learning.
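
Sketch: the ellipsoid iteration in code
Returning to the ellipsoid algorithm from earlier in these slides, its iteration can also be sketched in Python with NumPy. This is a minimal floating-point illustration, not the limited-precision version that the analysis requires: the function name `ellipsoid_feasibility` is hypothetical, and the parameters `radius` and `max_iter` are assumptions that stand in for the initial radius $n2^L$ and the iteration bound $K$. The update uses the standard central-cut formulas for the center $t$ and the matrix $B$, and assumes $n \ge 2$.

```python
import numpy as np

def ellipsoid_feasibility(A, b, radius, max_iter=1000):
    """Sketch: look for x with A @ x < b inside the ball of the given radius.

    Returns a feasible point, or None if none was found in max_iter steps.
    radius and max_iter are illustrative stand-ins for n*2^L and K.
    """
    m, n = A.shape
    assert n >= 2, "the factor n^2 / (n^2 - 1) needs n >= 2"
    t = np.zeros(n)                  # center of the current ellipsoid
    B = radius**2 * np.eye(n)        # ellipsoid {x : (x-t)^T B^-1 (x-t) <= 1}
    for _ in range(max_iter):
        violated = np.nonzero(A @ t >= b)[0]
        if violated.size == 0:
            return t                 # the center satisfies A @ t < b
        a = A[violated[0]]           # a violated inequality (separation step)
        Ba = B @ a
        aBa = a @ Ba
        # Move the center into the half-ellipsoid that can contain solutions.
        t = t - Ba / ((n + 1) * np.sqrt(aBa))
        # Smallest ellipsoid enclosing that half-ellipsoid.
        B = (n**2 / (n**2 - 1)) * (B - (2 / (n + 1)) * np.outer(Ba, Ba) / aBa)
    return None

# Example system (chosen here for illustration): x1 > 1, x2 > 1, x1 + x2 < 3.
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
b = np.array([-1.0, -1.0, 3.0])
point = ellipsoid_feasibility(A, b, radius=10.0)
print("feasible point:", point)
```

The lookup of a violated row plays the role of the separation oracle: the loop never needs the whole system at once, only one violated inequality per iteration.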