Mathematical Programming
Let Ω ⊆ Rⁿ and f : Ω → R. The optimization problems max { f(x) : x ∈ Ω } and
min { f(x) : x ∈ Ω } are called mathematical programs.
Ω is called the feasible set and f(x) the objective function.
In the problem max { f(x) : x ∈ Ω }, we seek x̂ ∈ Ω such that f(x̂) ≥ f(x) for all x ∈ Ω.
If such an optimal solution x̂ exists, then f(x̂) is called the optimal value of the
problem, and we can also write
f(x̂) = max { f(x) : x ∈ Ω }
x̂ = argmax { f(x) : x ∈ Ω }
Note that there can be several optimal solutions but only one optimal value.
Cases where a maximization problem fails to have an optimal solution:
1. Ω = ∅ (problem is infeasible)
2. For every k ∈ R there is some x ∈ Ω with f(x) > k (f(x) unbounded over Ω: problem unbounded)
3. f(x) bounded over Ω, but the maximum value is not attained.
Similar statements apply to the minimization problem min { f(x) : x ∈ Ω }.
Every maximization problem can be formulated as a minimization problem since
max { f(x) : x ∈ Ω } = −min { −f(x) : x ∈ Ω }
Examples of mathematical optimization problems
1. Linear program (LP):
• Ω is the set of solutions to a given system of linear inequalities and/or equations:
Ω = { x ∈ Rⁿ : gᵢ x ≤ kᵢ (for i = 1, …, r), hⱼ x = lⱼ (for j = 1, …, q) }
Here, each gᵢ and hⱼ is an n-dimensional (data) row vector, each kᵢ and lⱼ is a scalar
(data), and x is an n-dimensional column vector of variables.
• f(x) is a linear function of the variables x₁, x₂, …, xₙ: f(x) = Σⱼ₌₁ⁿ wⱼ xⱼ
Every linear program can be (re)formulated into any of the following forms:
max { c x : Ax ≤ b, x ≥ 0 } (a maximization LP in canonical form)
min { c x : Ax ≥ b, x ≥ 0 } (a minimization LP in canonical form)
max { c x : Ax = b, x ≥ 0 } (a maximization LP in standard form)
min { c x : Ax = b, x ≥ 0 } (a minimization LP in standard form)
where c is an n-dimensional row vector, x is an n-dimensional column vector,
matrix A has m rows and n columns, and b is an m-dimensional column vector.
Examples of LP's: transportation problem, network flow problems.
2. Nonlinear program (NLP):
• Ω is the set of solutions to a given system of inequalities and/or equations:
Ω = { x ∈ Rⁿ : gᵢ(x) ≤ 0 (for i = 1, …, p), hₖ(x) = 0 (for k = 1, …, q) }
where the gᵢ and hₖ are arbitrary functions.
3. (Linear) Integer program (IP):
• An LP with the additional requirement that all variables take integer values:
max { c x : Ax ≤ b, x ≥ 0, x ∈ Zⁿ }
4. (Linear) Mixed Integer program (MIP):
• An LP where some variables must take integer values:
max { c x + h y : Ax + Gy ≤ b, x ≥ 0, x ∈ Rⁿ, y ≥ 0, y ∈ Zᵖ }
Matrices A and G have n and p columns, respectively, and the same number of rows.
Examples: network design problems, facility location problems.
5. (Linear) Binary Integer program (BIP): an IP with 0–1 variables
max { c x : Ax ≤ b, x ∈ {0,1}ⁿ } = max { c x : Ax ≤ b, 0 ≤ x ≤ 1, x ∈ Zⁿ }
Remarks:
• Any IP with a bounded feasible set can be reformulated as a BIP, using
binary (base 2) representation of integers.
• Any BIP can be reformulated as an NLP, using the fact that
xᵢ² − xᵢ = 0 ⇔ xᵢ ∈ {0,1}
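The first remark can be sketched concretely: a bounded integer variable 0 ≤ x ≤ u is replaced by binary variables yₖ via the base-2 representation x = Σₖ 2ᵏ yₖ. A minimal sketch with made-up data; the helper names (num_binary_vars, to_binary, from_binary) are illustrative, not standard:

```python
# Sketch: replacing a bounded integer variable 0 <= x <= u by binary variables
# using the base-2 representation x = sum_k 2^k * y_k, with y_k in {0,1}.

def num_binary_vars(u):
    """Number of 0-1 variables needed to represent the range 0..u."""
    return max(1, u.bit_length())

def to_binary(x, u):
    """Encode the integer value x as a 0-1 vector y with x = sum 2^k y_k."""
    return [(x >> k) & 1 for k in range(num_binary_vars(u))]

def from_binary(y):
    """Recover x from its binary encoding."""
    return sum((2 ** k) * yk for k, yk in enumerate(y))

# Example: a variable bounded by u = 13 needs 4 binary variables,
# and x = 11 is encoded as [1, 1, 0, 1] (11 = 1 + 2 + 8).
y = to_binary(11, 13)
x = from_binary(y)
```

Each constraint row of the IP is rewritten accordingly: a coefficient a on x becomes coefficients a·2ᵏ on the yₖ.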
6. (Linear objective) Combinatorial Optimization Problems (CO):
Some examples:
Minimum Spanning Tree Problem (MST)
Assignment Problem
Shortest Path Problem
Traveling Salesman Problem (TSP)
Chinese Postman Problem (CPP)
Knapsack Problem (KP)
Typically, in a combinatorial optimization problem we have a finite set of n objects
(e.g., edges, vertices, items for a knapsack) that we identify with the set N = {1, 2, …, n}.
Additionally, there is a family F of subsets of N. Examples:
MST: N = E(G), F = { S ⊆ N : S induces a spanning tree of G }
TSP: N = E(G), F = { S ⊆ N : S induces a Hamiltonian cycle of G }
KP: N = set of items, F = { S ⊆ N : total weight of S ≤ knapsack capacity }
In a linear objective combinatorial optimization problem each j ∈ N has a value cⱼ.
The objective is to determine the maximum (or minimum) value of { Σ_{j∈S} cⱼ : S ∈ F }
and identify an (optimal) set Ŝ ∈ F that yields this optimal value. That is, find
Ŝ ∈ F such that
Σ_{j∈Ŝ} cⱼ = max { Σ_{j∈S} cⱼ : S ∈ F } (maximization problem)
Σ_{j∈Ŝ} cⱼ = min { Σ_{j∈S} cⱼ : S ∈ F } (minimization problem).
A combinatorial optimization problem can be formulated as an integer program
(typically a BIP) or an MIP, since any set S ⊆ N is completely described by its
incidence vector x ∈ {0,1}ⁿ, where xⱼ = 1 if j ∈ S, and xⱼ = 0 otherwise.
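As a small illustration of this correspondence, a tiny knapsack instance can be solved by enumerating all incidence vectors x ∈ {0,1}ⁿ. The data (values, weights, capacity) is made up, and brute-force enumeration is of course only viable for very small n:

```python
from itertools import product

# Sketch: solving a tiny knapsack problem (KP) by enumerating every
# incidence vector x in {0,1}^n -- exactly the sets S <-> vectors x above.

values = [6, 5, 4]    # c_j for each item j
weights = [3, 2, 2]   # weight of each item
capacity = 4

best_value, best_x = None, None
for x in product([0, 1], repeat=len(values)):               # all incidence vectors
    if sum(w * xj for w, xj in zip(weights, x)) <= capacity:  # feasibility check
        value = sum(c * xj for c, xj in zip(values, x))
        if best_value is None or value > best_value:
            best_value, best_x = value, x

# best_x = (0, 1, 1): take items 2 and 3, total value 9, total weight 4.
```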
Continuous optimization vs. discrete optimization:
• Continuous optimization: variables can vary continuously (e.g., LP, NLP).
• Discrete optimization: feasible set is discrete (e.g., CO, IP, and also LP, since
we can solve linear programs by considering only the finite set of extreme
points of the polyhedral feasible region).
Computational complexity of optimization problems:
Problems with polynomial time algorithms:
Assignment Problem
Minimum Spanning Tree
Shortest Path
Chinese Postman Problem
Linear Programming
Problems proven to be NP-Hard:
Traveling Salesman Problem
Knapsack Problem
Integer Programming
Importance of linear programming for discrete optimization:
• Linear programming theory has been used to develop polynomial-time
(combinatorial) algorithms for some CO problems (e.g., nonbipartite weighted
matching, network flow problems).
• Linear programming algorithms and theory also underlie many of the most
effective current algorithms for NP-Hard problems such as the TSP, the
Knapsack Problem, and general IP or MIP problems.
Difficulty of solving continuous optimization problems:
• Descent algorithms for minimization problems
(ascent algorithms for maximization problems).
• Local optimum vs. global optimum.
• Convexity theory identifies problems which are easier to solve.
Convexity: Definitions
1. Convex sets
Convex combinations
Extreme points
2. Convex functions and concave functions.
Linear functions are both convex and concave.
3. Convex (mathematical) programs:
min { f(x) : x ∈ Ω } where f is a convex function and Ω is a convex set.
max { f(x) : x ∈ Ω } where f is a concave function and Ω is a convex set.
Convexity results:
• In a convex program, a local optimum is a global optimum.
• If Ω is a convex set, f is a convex function, and max { f(x) : x ∈ Ω } has an
optimal solution, then there is an optimal solution that is an extreme point.
• The intersection of a collection of convex sets is a convex set.
• Hyperplanes and halfspaces are convex sets.
Polyhedral sets:
• Polyhedron: the solution set of a system of linear inequalities and/or equations.
The feasible region of a linear program is a polyhedron.
• A polyhedron is a convex set. Consequently:
1. Linear programs are convex programs (local optima are also global).
2. If an LP has an optimal solution, then it has one that is an
extreme point of the feasible region.
• Polytope: a bounded polyhedron.
• Representation theorem for polytopes.
Solvability of linear programs:
Given an LP, exactly one of the following must be true:
1. LP has an optimal solution.
2. LP is infeasible.
3. LP is unbounded (in this case, the feasible region must be unbounded).
Algorithms for solving linear programs:
1. Simplex Method (Dantzig, 1947): not polynomial, but very efficient in practice.
Views LP as a discrete problem, since the method focuses on the extreme points.
Primal Simplex and Dual Simplex are the two main variants of the Simplex Method.
2. Ellipsoid Method (Khachiyan, 1979): the first polynomial-time method, but not practical.
3. Interior Point Methods (Karmarkar, 1984): polynomial and practical.
These generate points in the interior of the polyhedron, treating LP as a continuous problem.
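The extreme-point view behind the Simplex Method can be illustrated by brute force on a tiny made-up 2-variable LP: enumerate all intersections of pairs of constraint boundaries, keep the feasible ones (the vertices), and pick the best. This is only an illustrative sketch, not a practical algorithm:

```python
from itertools import combinations

# Tiny LP (made-up data): max 2*x1 + 3*x2
# s.t. x1 + x2 <= 4, x1 + 2*x2 <= 6, x1 >= 0, x2 >= 0.
# Each row (a1, a2, d) encodes the constraint a1*x1 + a2*x2 <= d.
c = (2, 3)
rows = [(1, 1, 4), (1, 2, 6), (-1, 0, 0), (0, -1, 0)]

def solve2(r, s):
    """Intersection point of the boundary lines of constraints r and s
    (None if the lines are parallel), by Cramer's rule."""
    (a1, b1, d1), (a2, b2, d2) = r, s
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return ((d1 * b2 - d2 * b1) / det, (a1 * d2 - a2 * d1) / det)

def feasible(p):
    return all(a * p[0] + b * p[1] <= d + 1e-9 for a, b, d in rows)

# Vertices = feasible intersections of pairs of constraint boundaries.
vertices = [p for r, s in combinations(rows, 2)
            if (p := solve2(r, s)) is not None and feasible(p)]
best = max(vertices, key=lambda p: c[0] * p[0] + c[1] * p[1])
# best is (2.0, 2.0), with objective value 10, for this data.
```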
Review of matrix multiplication:
Product of a row vector w and a column vector x of dimension n: w x = Σⱼ₌₁ⁿ wⱼ xⱼ
A matrix of size (or order) m × n: an array with m rows and n columns

    A = ⎡ a₁₁ a₁₂ … a₁ₙ ⎤
        ⎢ a₂₁ a₂₂ … a₂ₙ ⎥
        ⎢  ⋮   ⋮       ⋮ ⎥
        ⎣ aₘ₁ aₘ₂ … aₘₙ ⎦

where aᵢⱼ denotes the entry in the i-th row and j-th column of matrix A
(the (i,j)-th entry of A). We will also refer to aᵢⱼ as Aᵢⱼ.
Vectors:
A column vector with n entries is a matrix of size n × 1.
A row vector with n entries is a matrix of size 1 × n.
i-th row of A: Aᵢ = [ aᵢ₁ aᵢ₂ … aᵢₙ ]
j-th column of A: Aʲ, the column vector with entries a₁ⱼ, a₂ⱼ, …, aₘⱼ.
Thus, we can write A with its rows A₁, A₂, …, Aₘ stacked on top of each other,
or with its columns A¹, A², …, Aⁿ placed side by side.
Matrix multiplication: If A is m × n and B is n × p, then
AB is the m × p matrix with (i,j)-th entry
[AB]ᵢⱼ = Aᵢ Bʲ
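As a quick sketch of this definition (with made-up matrices), the entry-by-entry rule can be written directly in code:

```python
# Sketch of the definition above: [AB]_ij is the product of the
# i-th row of A with the j-th column of B.

def matmul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must agree"
    # entry (i,j) = A_i B^j = sum_k a_ik * b_kj
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2], [3, 4]]   # 2 x 2
B = [[5, 6], [7, 8]]   # 2 x 2
matmul(A, B)           # [[19, 22], [43, 50]]
```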
Duality Theory of Linear Programming:
• Provides some theoretical insight into why LP has polynomial-time algorithms while
none have been found for IP (and other NP-Hard problems).
• LP duality theory is key in the development of polynomial-time algorithms for some
CO problems (e.g., the nonbipartite weighted matching problem).
Dual of a linear program:
Consider the (primal) LP max { c x : Ax ≤ b, x ≥ 0 }, where A is an m by n matrix.
Then x must be an n-dimensional column vector, c an n-dimensional row vector, and
b an m-dimensional column vector.
The dual of the LP above is the linear program min { y b : yA ≥ c, y ≥ 0 }, where, for
the products to be defined, y must be an m-dimensional row vector of variables.
Remember: the dual of a canonical maximization LP is a canonical minimization LP.
Weak Duality Lemma:
Let x̃ ∈ { x : Ax ≤ b, x ≥ 0 } and ỹ ∈ { y : yA ≥ c, y ≥ 0 }. Then c x̃ ≤ ỹ b.
• Corollary: If one of the above LP's is unbounded, then the other is infeasible.
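On a small made-up example the lemma is easy to check numerically; the points x̃ and ỹ below are arbitrary feasible points, not optimal ones:

```python
# Numeric illustration of weak duality on a made-up primal/dual pair:
# primal max{ c x : Ax <= b, x >= 0 }, dual min{ y b : yA >= c, y >= 0 }.

A = [[1, 1], [1, 2]]
b = [4, 6]
c = [2, 3]

x_t = [1, 1]   # a feasible primal point: Ax = [2, 3] <= b
y_t = [1, 1]   # a feasible dual point:   yA = [2, 3] >= c

assert all(sum(A[i][j] * x_t[j] for j in range(2)) <= b[i] for i in range(2))
assert all(sum(y_t[i] * A[i][j] for i in range(2)) >= c[j] for j in range(2))

cx = sum(c[j] * x_t[j] for j in range(2))   # 5
yb = sum(y_t[i] * b[i] for i in range(2))   # 10
assert cx <= yb   # every feasible cx is a lower bound on every feasible yb
```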
Strong Duality Theorem:
Consider max { c x : Ax ≤ b, x ≥ 0 } and its dual min { y b : yA ≥ c, y ≥ 0 }.
Then exactly one of the following must occur:
1. Both LP's are feasible and max { c x : Ax ≤ b, x ≥ 0 } = min { y b : yA ≥ c, y ≥ 0 }.
2. One LP is unbounded and the other is infeasible.
3. Both LP's are infeasible.
Dual of a general linear program:
Weak and strong duality results apply to any primal–dual pair of LP's.
The dual of the dual is the original primal LP (involutory property of duality).
Complementary Slackness Theorem:
Let A be the m by n constraint matrix, b the rhs vector, and c the objective function
vector of a (primal) LP. Let the column vector x̃ ∈ Rⁿ be a feasible solution to the
primal LP and let the row vector ỹ ∈ Rᵐ be a feasible solution to the dual LP.
Then x̃ and ỹ are respectively optimal for each of the problems if and only if:
(cⱼ − ỹ Aʲ) x̃ⱼ = 0 for j = 1, …, n
ỹᵢ (Aᵢ x̃ − bᵢ) = 0 for i = 1, …, m
Linear combinations
A vector b ∈ Rⁿ is a linear combination of vectors a₁, a₂, …, aₖ (also all in Rⁿ) if
b = Σⱼ₌₁ᵏ λⱼ aⱼ
for some choice of real numbers λ₁, λ₂, …, λₖ.
We also say that:
b is an affine combination of a₁, a₂, …, aₖ if b = Σⱼ₌₁ᵏ λⱼ aⱼ and Σⱼ₌₁ᵏ λⱼ = 1.
b is a nonnegative combination of a₁, a₂, …, aₖ if b = Σⱼ₌₁ᵏ λⱼ aⱼ and λⱼ ≥ 0 for all j.
b is a convex combination of a₁, a₂, …, aₖ if b = Σⱼ₌₁ᵏ λⱼ aⱼ, Σⱼ₌₁ᵏ λⱼ = 1, and λⱼ ≥ 0 for all j.
Σ λⱼ aⱼ will always denote a linear combination of a finite number of vectors.
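A small sketch (with made-up vectors, and a hypothetical helper name is_convex_combination) of checking the three conditions in the definition of a convex combination:

```python
# Verifying that b is a convex combination of a_1, ..., a_k: check that
# b = sum lambda_j a_j, that sum lambda_j = 1, and that every lambda_j >= 0.

def is_convex_combination(b, vectors, lambdas, tol=1e-9):
    if any(l < -tol for l in lambdas):        # nonnegativity
        return False
    if abs(sum(lambdas) - 1) > tol:           # multipliers sum to 1
        return False
    combo = [sum(l * a[i] for l, a in zip(lambdas, vectors))
             for i in range(len(b))]
    return all(abs(combo[i] - b[i]) <= tol for i in range(len(b)))

a1, a2, a3 = [0, 0], [2, 0], [0, 2]
is_convex_combination([1, 1], [a1, a2, a3], [0.0, 0.5, 0.5])   # True
is_convex_combination([3, 0], [a1, a2, a3], [-0.5, 1.5, 0.0])  # False: lambda < 0
```

The second call shows a point that is an affine combination (the multipliers sum to 1) but not a convex one, since one multiplier is negative.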
More definitions:
Let ∅ ≠ S ⊆ Rⁿ (the set S may have an infinite number of vectors). Then
linear span of S: lin S = { Σ λⱼ aⱼ : aⱼ ∈ S }
affine span of S: aff S = { Σ λⱼ aⱼ : aⱼ ∈ S, Σ λⱼ = 1 }
cone generated by S: cone S = { Σ λⱼ aⱼ : aⱼ ∈ S, λⱼ ≥ 0 ∀j }
convex hull of S: conv S = { Σ λⱼ aⱼ : aⱼ ∈ S, Σ λⱼ = 1, λⱼ ≥ 0 ∀j }
Let ∅ ≠ S ⊆ Rⁿ (S may have an infinite number of vectors). We say that
• S is a subspace if S = lin S
• S is an affine set if S = aff S
• S is a convex cone if S = cone S
• S is a convex set if S = conv S
Linear Independence: the set { a₁, a₂, …, aₖ } is linearly independent if
Σⱼ₌₁ᵏ λⱼ aⱼ = 0 implies that λⱼ = 0 for j = 1, …, k.
Equivalently, the set { a₁, a₂, …, aₖ } is linearly independent if no vector in the set can
be written as a linear combination of the remaining vectors.
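The definition can be tested computationally: a₁, …, aₖ are linearly independent exactly when Gaussian elimination applied to the matrix whose rows are the aⱼ produces k nonzero pivot rows (rank k). A minimal sketch; the helper names rank and linearly_independent are illustrative:

```python
# Testing linear independence by Gaussian elimination: the vectors are
# independent iff no row is reduced to the zero vector, i.e., rank = k.

def rank(rows, tol=1e-9):
    rows = [list(r) for r in rows]   # work on a copy
    rk, n = 0, len(rows[0])
    for col in range(n):
        # find a pivot row for this column among the unused rows
        piv = next((i for i in range(rk, len(rows))
                    if abs(rows[i][col]) > tol), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(rk + 1, len(rows)):    # eliminate below the pivot
            f = rows[i][col] / rows[rk][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def linearly_independent(vectors):
    return rank(vectors) == len(vectors)

linearly_independent([[1, 0, 0], [0, 1, 0], [1, 1, 0]])   # False: a3 = a1 + a2
linearly_independent([[1, 0, 0], [0, 1, 0], [0, 0, 1]])   # True
```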
Affine Independence: the set { a₁, a₂, …, aₖ } is affinely independent if
Σⱼ₌₁ᵏ λⱼ aⱼ = 0 and Σⱼ₌₁ᵏ λⱼ = 0 imply that λⱼ = 0 for j = 1, …, k.
Equivalently, the set { a₁, a₂, …, aₖ } is affinely independent if no vector in the set can
be written as an affine combination of the remaining vectors.
Spanning set of a subspace: (review definition)
Basis of a subspace: (review definition)
Dimension of a subspace: the number of vectors in a basis.
Dimension of a convex set Ω: max { |S| : S ⊆ Ω, S affinely independent } − 1.
Notation: |S| = cardinality of the (finite) set S.