ST 810, Advanced computing
Eric B. Laber & Hua Zhou
Department of Statistics, North Carolina State University
February 4, 2013
“Now all that’s left to do is minimize over θ.”
—Famous last words.
Convex optimization problems
- A general optimization problem on $\mathbb{R}^p$:
  \[ \min_{x \in \mathbb{R}^p} f(x) \quad \text{s.t.} \quad x \in S. \]
- If the optimization problem can be written so that $f(x)$ is a convex function and $S$ is a convex set, then the problem is said to be convex.
- Roadmap:
  - Convex sets and special cases (today)
  - Convex functions and special cases (next time)
  - Special cases of convex optimization problems (two weeks)
Convex sets
Definition
A set $C \subseteq \mathbb{R}^p$ is said to be affine if any linear combination of points in $C$ belongs to $C$, provided the coefficients sum to one. I.e., if $x_1, \ldots, x_k \in C$ and $\theta_1, \ldots, \theta_k \in \mathbb{R}$ satisfy $\sum_{i=1}^k \theta_i = 1$, then
\[ \sum_{i=1}^k \theta_i x_i \in C. \]
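Line segment and affine set
- Line segment between $x_1$ and $x_2$ is $\{\theta x_1 + (1 - \theta) x_2 : \theta \in [0, 1]\}$
- Line through $x_1$ and $x_2$ is $\{\theta x_1 + (1 - \theta) x_2 : \theta \in \mathbb{R}\}$
[Figure: the points x1 and x2]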
Affine sets
Claim
If C is an affine set and x0 ∈ C, then V ≜ C − x0 = {x − x0 : x ∈ C} is a subspace.
Proof.
Let ν1 , ν2 ∈ V and α, β ∈ R. We want to show that αν1 + βν2 ∈ V ,
which is equivalent to showing
αν1 + βν2 + x0 ∈ C .
The left hand side of the above display is equal to
α(ν1 + x0 ) + β(ν2 + x0 ) + (1 − α − β)x0 ,
which is an affine combination of the points ν1 + x0 , ν2 + x0 , and
x0 all of which belong to C . Since C is closed under affine
combinations the result is proved.
Affine sets cont’d
- Examples of affine sets
  - Solution set for linear equalities $\{x : Ax = b\}$ is affine.
    - How would we show this?
  - Any subspace is affine
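One way to answer the question about $\{x : Ax = b\}$ is the short derivation below. It is a sketch of one possible argument, not necessarily the one given in class; the points $x_1, \ldots, x_k$ and coefficients $\theta_i$ are the ones from the definition of an affine set.

```latex
\begin{aligned}
&\text{Suppose } A x_i = b \text{ for } i = 1, \ldots, k \text{ and } \textstyle\sum_{i=1}^k \theta_i = 1. \text{ Then} \\
&A\Big(\sum_{i=1}^k \theta_i x_i\Big)
  = \sum_{i=1}^k \theta_i A x_i
  = \Big(\sum_{i=1}^k \theta_i\Big) b
  = b.
\end{aligned}
```

So every affine combination of solutions is again a solution, which is exactly the defining property of an affine set.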
Definition
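The affine hull of a set $C \subseteq \mathbb{R}^p$ is the smallest affine set that contains $C$. It is given by
\[ \operatorname{aff} C \triangleq \left\{ \sum_{i=1}^k \theta_i x_i : x_i \in C,\ \theta_i \in \mathbb{R},\ i = 1, \ldots, k,\ \sum_{i=1}^k \theta_i = 1 \right\}. \]
What is the affine hull of $C = \{x \in \mathbb{R}^2 : \|x\|_2^2 = 1\}$?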
Convex sets
Definition
A set C ⊆ Rp is said to be convex if the line segment between any
two points in C is also in C . I.e., if x1 , x2 ∈ C then
{θx1 + (1 − θ)x2 : θ ∈ [0, 1]} ⊆ C .
Convex or not?
Convex combinations
Definition
Let $\theta_1, \ldots, \theta_k \in \mathbb{R}_+$ satisfy $\sum_{i=1}^k \theta_i = 1$; then $\sum_{i=1}^k \theta_i x_i$ is called a convex combination of $x_1, \ldots, x_k$.
Definition
The convex hull of a set $C$ is the smallest convex set that contains $C$. The convex hull is given by
\[ \operatorname{conv} C \triangleq \left\{ \sum_{i=1}^k \theta_i x_i : \theta_i \in \mathbb{R}_+,\ x_i \in C,\ i = 1, \ldots, k,\ \sum_{i=1}^k \theta_i = 1 \right\}. \]
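To make the definition of a convex combination concrete, here is a minimal Python sketch. The function name `convex_combination`, the tolerance `tol`, and the example triangle are illustrative assumptions, not part of the lecture.

```python
def convex_combination(points, weights, tol=1e-12):
    """Return sum_i weights[i] * points[i], after checking that the weights
    are nonnegative and sum to one (up to tol)."""
    if not points or len(points) != len(weights):
        raise ValueError("need exactly one weight per point")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to one")
    dim = len(points[0])
    return [sum(w * p[j] for w, p in zip(weights, points)) for j in range(dim)]


# Hypothetical example: a point inside the triangle with these three vertices.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
inside = convex_combination(triangle, [0.25, 0.25, 0.5])
```

Every point produced this way lies in the convex hull of `triangle`; weights that are negative or do not sum to one are rejected because the result would only be an affine or linear combination.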
Convex hull
[Figures: examples of convex hulls]
Infinite convex combinations
- Let $\theta_1, \theta_2, \ldots \in \mathbb{R}_+$ with $\sum_{i=1}^\infty \theta_i = 1$ and $x_1, x_2, \ldots \in C$; then $\sum_{i=1}^\infty \theta_i x_i$ is called a convex combination of $x_1, x_2, \ldots$, provided the series is convergent.
- If $f(x)$ is a density over $C$, then
  \[ \int_C x f(x)\,dx \]
  is called a convex combination of $C$, provided the integral is finite.
- Suppose $X$ is a random variable with support $C \subseteq \mathbb{R}^p$ and finite expectation. When is $\mathbb{E}X \in C$?
Cones
Definition
A set C ⊆ Rp is said to be a cone if for each x ∈ C the set
{θx : θ ≥ 0} is also in C .
A cone that is also convex is called a convex cone
Cones
[Figure: a convex cone and a nonconvex cone, each with apex at (0, 0)]
Cones cont’d
Claim
Let $C$ be a convex cone. If $x_1, x_2 \in C$ and $\theta_1, \theta_2 \in \mathbb{R}_+$, then $\theta_1 x_1 + \theta_2 x_2 \in C$.
Proof.
If $\theta_1 = \theta_2 = 0$ then we’re done because $0 \in C$. Otherwise note that
\[ \theta_1 x_1 + \theta_2 x_2 = (\theta_1 + \theta_2) \left[ \frac{\theta_1}{\theta_1 + \theta_2} x_1 + \frac{\theta_2}{\theta_1 + \theta_2} x_2 \right]. \]
The term inside the square brackets on the right-hand side of the above display is a convex combination of elements in $C$ and thus belongs to $C$ (since $C$ is convex). Because $C$ is a cone it is closed under nonnegative scaling, so the right-hand side belongs to $C$.
Conic combinations
Definition
Let $\theta_1, \ldots, \theta_k \in \mathbb{R}_+$; then $\sum_{i=1}^k \theta_i x_i$ is called a conic combination of the points $x_1, \ldots, x_k \in \mathbb{R}^p$.
How should we define the conic hull of a set C ?
Examples of convex sets
- Any singleton is affine and thus convex
- $\mathbb{R}^p$ is convex
- Any line is convex. If the line passes through the origin, it is a subspace and a convex cone.
- A ray $\{x_0 + \theta x : \theta \in \mathbb{R}_+\}$ with $x \neq 0$ is convex but not affine. A ray is a convex cone if its base point $x_0$ is the origin.
- Any subspace is affine and a convex cone
Working with definitions
Claim
If U ⊆ Rp is a subspace, then it’s affine and a convex cone.
Proof.
In class. . .
Hyperplanes and half-spaces
Definition
A hyperplane is a set of the form $\{x : a^\top x = b\}$ for some nonzero $a \in \mathbb{R}^p$ and $b \in \mathbb{R}$.
Equivalently we can express the foregoing hyperplane as $a^\perp + x_0$, where $x_0$ is any point in the hyperplane and $a^\perp \triangleq \{x : a^\top x = 0\}$.
- A hyperplane partitions the space $\mathbb{R}^p$ into two half-spaces $\{x : a^\top x \le b\}$ and $\{x : a^\top x > b\}$.
Quick review: norms
- A norm $\|\cdot\|$ on $\mathbb{R}^p$ is a map from $\mathbb{R}^p$ into $\mathbb{R}_+$ which satisfies:
  1. $\|x\| \ge 0$ and $\|x\| = 0 \Leftrightarrow x = 0$,
  2. $\|\theta x\| = |\theta| \, \|x\|$,
  3. $\|x + y\| \le \|x\| + \|y\|$
  for all $x, y \in \mathbb{R}^p$ and $\theta \in \mathbb{R}$.
- A norm ball of radius $r > 0$ centered at $x_c$ is defined as
  \[ B(x_c, r, \|\cdot\|) \triangleq \{x : \|x - x_c\| \le r\}. \]
  It is standard to suppress the norm and write $B(x_c, r)$.
Norm balls
Claim
Let || · || be any norm on Rp . Then for any r > 0 and xc ∈ Rp the
norm ball B(xc , r ) is convex.
Proof.
Let ν1 , ν2 ∈ B(xc , r ) and θ ∈ [0, 1]. We need to show
θν1 + (1 − θ)ν2 ∈ B(xc , r ). Note that we can write ν1 = xc + u1 ,
ν2 = xc + u2 with u1 , u2 ∈ B(0, r ). Then:
||θν1 + (1 − θ)ν2 − xc || = ||θu1 + (1 − θ)u2 ||
≤ θ||u1 || + (1 − θ)||u2 ||
≤ θr + (1 − θ)r = r .
Thus, θν1 + (1 − θ)ν2 ∈ B(xc , r ).
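The claim can also be checked numerically. The sketch below is an illustration, not a proof: it samples pairs of points in an $\ell_p$ ball and confirms that random convex combinations stay inside. The helper names, the sampling scheme, the center, and the radius are all assumptions made for this example. The last line shows why the triangle inequality matters: for $p = 1/2$ the same formula is not a norm, and the midpoint of two points of its unit "ball" falls outside.

```python
import random


def p_norm(x, p):
    """l_p norm of x for p >= 1; p = float('inf') gives the max norm."""
    if p == float("inf"):
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def sample_in_ball(center, r, p, rng):
    """Draw a point of B(center, r) by rescaling a random direction."""
    u = [rng.uniform(-1.0, 1.0) for _ in center]
    norm_u = p_norm(u, p)
    scale = rng.uniform(0.0, r) / norm_u if norm_u > 0 else 0.0
    return [c + scale * v for c, v in zip(center, u)]


def check_ball_convexity(center, r, p, trials=1000, seed=0, tol=1e-9):
    """Return True if every sampled convex combination stays in the ball."""
    rng = random.Random(seed)
    for _ in range(trials):
        v1 = sample_in_ball(center, r, p, rng)
        v2 = sample_in_ball(center, r, p, rng)
        theta = rng.random()
        mix = [theta * a + (1 - theta) * b for a, b in zip(v1, v2)]
        if p_norm([m - c for m, c in zip(mix, center)], p) > r + tol:
            return False
    return True


results = {p: check_ball_convexity([1.0, -2.0, 0.5], 2.0, p)
           for p in (1, 2, float("inf"))}

# Contrast: with p = 1/2 the formula is not a norm. (1, 0) and (0, 1) have
# value 1, but their midpoint has value greater than 1.
quasi_mid = p_norm([0.5, 0.5], 0.5)
```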
Norm balls
[Figures: norm balls in the (x1, x2) plane]
Norm cones
Definition
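The norm cone associated with a norm $\|\cdot\|$ on $\mathbb{R}^p$ is defined as $\{(x, t) : \|x\| \le t\} \subseteq \mathbb{R}^{p+1}$.
Claim
The norm cone $C = \{(x, t) : \|x\| \le t\} \subseteq \mathbb{R}^{p+1}$ is a convex cone.
Proof.
Let $\nu_1, \nu_2 \in C$ and $\theta \in [0, 1]$. We want to show $\theta \nu_1 + (1 - \theta) \nu_2 = (x, t) \in C$. Note that $\nu_i = (x_i, t_i)$ with $\|x_i\| \le t_i$ for $i = 1, 2$. Thus
\[ \begin{aligned} \|x\| &= \|\theta x_1 + (1 - \theta) x_2\| \\ &\le \theta \|x_1\| + (1 - \theta) \|x_2\| \\ &\le \theta t_1 + (1 - \theta) t_2 = t. \end{aligned} \]
Thus $\|x\| \le t$ and $(x, t) \in C$, so $C$ is convex. $C$ is also a cone: if $(x, t) \in C$ and $\theta \ge 0$, then $\|\theta x\| = \theta \|x\| \le \theta t$, so $(\theta x, \theta t) \in C$.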
Second-order cone
- When $\|\cdot\|$ is the Euclidean norm, the norm cone is called the second-order cone.
- In two dimensions (so $(x, t) \in \mathbb{R}^3$) the second-order cone is called the Lorentz cone or ice-cream cone.
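As a rough numerical illustration of the second-order cone in $\mathbb{R}^3$, the sketch below tests membership and checks that conic combinations of sampled cone points remain in the cone. The function names, the tolerance, and the sampling scheme are assumptions made for this example.

```python
import math
import random


def in_second_order_cone(x, t, tol=1e-9):
    """True if ||x||_2 <= t (up to tol), i.e. (x, t) is in the cone."""
    return math.sqrt(sum(v * v for v in x)) <= t + tol


def random_cone_point(p, rng):
    """Draw (x, t) with t >= ||x||_2 by adding nonnegative slack to the norm."""
    x = [rng.gauss(0.0, 1.0) for _ in range(p)]
    t = math.sqrt(sum(v * v for v in x)) + rng.random()
    return x, t


rng = random.Random(0)
all_inside = True
for _ in range(1000):
    (x1, t1), (x2, t2) = random_cone_point(2, rng), random_cone_point(2, rng)
    th1, th2 = rng.uniform(0, 5), rng.uniform(0, 5)  # any nonnegative weights
    x = [th1 * a + th2 * b for a, b in zip(x1, x2)]
    t = th1 * t1 + th2 * t2
    all_inside = all_inside and in_second_order_cone(x, t)
```

The weights are not required to sum to one, so the loop exercises both properties at once: convexity and closure under nonnegative scaling.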
Positive semidefinite cone
- Let $S^p$ denote the space of symmetric $p \times p$ real matrices and $S^p_+$ the set of symmetric nonnegative definite matrices
- In class: Show $S^p_+$ is a convex cone
Polyhedra
Definition
A polyhedron is defined as the solution set of a finite number of linear equalities and inequalities:
\[ P = \left\{ x : a_l^\top x \le b_l,\ l = 1, \ldots, m,\ \ c_l^\top x = d_l,\ l = 1, \ldots, r \right\}. \]
Claim
The intersection of convex sets is convex.
Proof.
In class...
Corollary
A polyhedron is convex.
Ex. polyhedron
- Succinctly write polyhedron as
  \[ \{x : Ax \preceq b,\ Cx = d\}, \]
  where $\preceq$ denotes componentwise inequality.
- Ex. the nonnegative orthant $\{x : x_j \ge 0,\ j = 1, \ldots, p\}$ is a polyhedron
Simplexes
Definition
The $k + 1$ points $x_0, x_1, \ldots, x_k \in \mathbb{R}^p$ are said to be affinely independent if $x_1 - x_0, \ldots, x_k - x_0$ are linearly independent.
Definition
Let $x_0, x_1, \ldots, x_k$ be affinely independent points in $\mathbb{R}^p$. The simplex determined by $x_0, x_1, \ldots, x_k$ is
\[ \operatorname{conv}\{x_0, x_1, \ldots, x_k\} = \left\{ \sum_{i=0}^k \theta_i x_i : \theta \succeq 0,\ \sum_{i=0}^k \theta_i = 1 \right\}. \]
- Simplexes are polyhedra
Simplex examples
- In class: what does a simplex look like in 1D? What about 2D?
- In 3D the simplex is a tetrahedron
- The unit simplex is the $p$-dimensional simplex determined by the zero vector and the unit vectors, $0, e_1, \ldots, e_p \in \mathbb{R}^p$:
  \[ \left\{ x : x \succeq 0,\ \sum_{j=1}^p x_j \le 1 \right\} \]
- The probability simplex is generated by $e_1, \ldots, e_p \in \mathbb{R}^p$:
  \[ \left\{ x : x \succeq 0,\ \sum_{j=1}^p x_j = 1 \right\} \]
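The two descriptions above translate directly into membership tests. This is a minimal sketch; the function names and the tolerance `tol` are assumptions introduced here to absorb floating-point rounding.

```python
def in_unit_simplex(x, tol=1e-12):
    """x >= 0 componentwise and sum(x) <= 1."""
    return all(v >= -tol for v in x) and sum(x) <= 1.0 + tol


def in_probability_simplex(x, tol=1e-12):
    """x >= 0 componentwise and sum(x) == 1 (up to tol)."""
    return all(v >= -tol for v in x) and abs(sum(x) - 1.0) <= tol


vertex = [0.0, 1.0, 0.0]     # e_2: a generator of both simplexes
interior = [0.2, 0.3, 0.1]   # sums to about 0.6, so only in the unit simplex
```

The only difference between the two tests is the inequality versus equality on the sum, which is why the probability simplex is one dimension lower than the unit simplex.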
Converting to standard form
- Ex. How to convert the probability simplex to standard form?
  - Use basis elements $e_1, \ldots, e_p \in \mathbb{R}^p$; these are affinely independent.
  - Convex hull of $e_1, \ldots, e_p$ is
    \[ S = \left\{ x : x = \sum_{j=1}^p \theta_j e_j,\ \theta \succeq 0,\ \sum_{j=1}^p \theta_j = 1 \right\}. \]
    Note that this corresponds to all discrete distributions over $p$ elements. The $i$th component of an element $x \in S$ is the probability of observing the $i$th element.
Writing a simplex as a polyhedron
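- Let $x_0, x_1, \ldots, x_k \in \mathbb{R}^p$ be affinely independent and let $S$ denote the generated simplex
  \[ \begin{aligned} S &\triangleq \left\{ \sum_{i=0}^k \theta_i x_i : \theta \succeq 0,\ \sum_{i=0}^k \theta_i = 1 \right\} \\ &= \left\{ x_0 + \sum_{i=1}^k \theta_i (x_i - x_0) : \theta \succeq 0,\ \sum_{i=1}^k \theta_i \le 1 \right\}. \end{aligned} \]
- Define $B = (x_1 - x_0, \ldots, x_k - x_0)$; then $\operatorname{Rank}(B) = k$ by affine independence, and there exists a nonsingular $A = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}$ so that
  \[ AB = \begin{bmatrix} A_1 B \\ A_2 B \end{bmatrix} = \begin{bmatrix} I \\ 0 \end{bmatrix}. \]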
Writing a simplex as a polyhedron, cont’d
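- For any $x \in S$ we have $x = x_0 + By$ with $y \succeq 0$ and $\sum_{i=1}^k y_i \le 1$; thus, multiplying by $A$ and matching the block equations above,
  \[ \begin{aligned} Ax &= Ax_0 + ABy, \\ A_1 x &= A_1 x_0 + y, \quad y \succeq 0, \quad \sum_{i=1}^k y_i \le 1, \\ A_2 x &= A_2 x_0. \end{aligned} \]
  Equivalently,
  \[ A_1 x \succeq A_1 x_0, \qquad A_2 x = A_2 x_0, \qquad \mathbf{1}^\top (A_1 x - A_1 x_0) \le 1. \]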
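A small worked instance may help fix the construction. The example below is an assumption chosen for illustration (a line segment in $\mathbb{R}^2$, so $k = 1$ and $p = 2$); the particular $A_1$ and $A_2$ are one valid choice among many. Coordinates of $x$ are written $(u, v)$ to avoid clashing with the points $x_0, x_1$.

```latex
\begin{aligned}
&x_0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad
 x_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad
 B = x_1 - x_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad
 A_1 = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad
 A_2 = \begin{bmatrix} 1 & -1 \end{bmatrix}, \\
&A = \begin{bmatrix} 1 & 0 \\ 1 & -1 \end{bmatrix} \ (\det A = -1 \neq 0), \qquad
 AB = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \\
&A_1 x \succeq A_1 x_0 \iff u \ge 0, \qquad
 A_2 x = A_2 x_0 \iff u - v = 0, \qquad
 \mathbf{1}^\top (A_1 x - A_1 x_0) \le 1 \iff u \le 1 .
\end{aligned}
```

The resulting polyhedron $\{(u, v) : u = v,\ 0 \le u \le 1\}$ is exactly the segment from $(0, 0)$ to $(1, 1)$.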
Building new sets from old
- Strategy for building classes of objects with a given property (e.g., convexity, finite VC-dimension, measurability, etc.)
  1. Find special cases
  2. Find property-preserving operations to transform and combine special cases
  3. Build up large classes of objects by transforming and combining simple cases
Operations that preserve convexity
- Arbitrary intersection: if $S_\alpha$ is convex for each $\alpha \in \mathcal{A}$, then $\bigcap_{\alpha \in \mathcal{A}} S_\alpha$ is convex.
  - Ex. the positive semidefinite cone $S^p_+$ can be written as
    \[ \bigcap_{z \neq 0} \{x \in S^p : z^\top x z \ge 0\}; \]
    since $x \mapsto z^\top x z$ is linear in $x$, $S^p_+$ is the intersection of half-spaces and hence convex.
- We will prove (time-permitting) the following converse: if $S$ is a closed convex set, then
  \[ S = \bigcap \{H : H \text{ a halfspace},\ S \subseteq H\}. \]
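The half-space description of the positive semidefinite cone suggests a simple numerical sanity check for $2 \times 2$ matrices. The sketch below builds random positive semidefinite matrices as $G^\top G$, forms nonnegative combinations, and evaluates the quadratic form at a random $z$. The helper names and sampling choices are assumptions for this illustration; it samples a few half-spaces and does not prove membership.

```python
import random


def quad_form(X, z):
    """z^T X z for a 2x2 matrix X stored as nested lists."""
    return sum(z[i] * X[i][j] * z[j] for i in range(2) for j in range(2))


def random_psd(rng):
    """G^T G is symmetric positive semidefinite for any 2x2 matrix G."""
    G = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(2)]
    return [[sum(G[k][i] * G[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


rng = random.Random(1)
min_value = float("inf")
for _ in range(500):
    X1, X2 = random_psd(rng), random_psd(rng)
    t1, t2 = rng.uniform(0, 3), rng.uniform(0, 3)  # nonnegative weights
    X = [[t1 * X1[i][j] + t2 * X2[i][j] for j in range(2)] for i in range(2)]
    z = [rng.gauss(0, 1), rng.gauss(0, 1)]
    min_value = min(min_value, quad_form(X, z))
```

Up to rounding error, `min_value` should never be negative, consistent with each constraint $z^\top x z \ge 0$ being preserved under nonnegative combinations.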
Operations that preserve convexity cont’d
- Affine functions: if $S$ is convex and $f : \mathbb{R}^p \to \mathbb{R}^m$ is affine (i.e., $f(x) = Ax + b$), then
  \[ f(S) \triangleq \{f(x) : x \in S\} \]
  is convex.
  - Ex. Scaling and/or translating.
  - Ex. Projection of a convex subset onto some of its coordinates. If $S \subseteq \mathbb{R}^p \times \mathbb{R}^m$, then writing $x = (x_1, x_2)$ with $x_1 \in \mathbb{R}^p$, $x_2 \in \mathbb{R}^m$, the map $f(x) = x_1$ is affine.
  - Ex. Sum of two convex sets. Let $S_1, S_2 \subseteq \mathbb{R}^p$ be convex; then $S_1 + S_2$ is convex. To see this note that $S_1 \times S_2$ is convex and the map $f(x_1, x_2) = x_1 + x_2$ is affine.
Operations that preserve convexity cont’d
- Inverse images of affine functions: if $S \subseteq \mathbb{R}^p$ is convex and $g : \mathbb{R}^m \to \mathbb{R}^p$ is affine, then
  \[ g^{-1}(S) \triangleq \{x : g(x) \in S\} \]
  is convex.
  - Ex. Linear matrix inequality. Let $A_1, \ldots, A_p, B \in S^p$, and define $A(x) = \sum_{j=1}^p x_j A_j$; then the set
    \[ \{x : A(x) \prec B\} \]
    is convex. Why?
Discrete probability distn examples
- Let $X$ be a real-valued discrete random variable with $P(X = a_i) = p_i$, $i = 1, \ldots, n$, with $a_1 < a_2 < \ldots < a_n$. Then $p$ belongs to the probability simplex
  \[ \mathcal{P} = \left\{ p : p \succeq 0,\ \sum_{i=1}^n p_i = 1 \right\}. \]
- Which of the following define convex subsets of $\mathcal{P}$? (Adapted from Problem 2.15 in Boyd and Vandenberghe.)
  1. $\alpha \le \mathbb{E} f(X) \le \beta$ for known function $f$
  2. $P(X > \alpha) \le \beta$
  3. $\mathbb{E}|X|^3 \le \alpha \mathbb{E}|X|$
  4. $\operatorname{Var}(X) \le \alpha$
  5. $q_\tau(X) \ge \alpha$ where $q_\tau(X) \triangleq \inf\{\beta : P(X \le \beta) \ge \tau\}$.
Discrete probability distn examples: parts 1-4
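Ans 1: $\mathbb{E} f(X) = \sum_{i=1}^n p_i f(a_i)$, hence the constraint is linear in $p$ and thus convex.
Ans 2: $P(X > \alpha) = \sum_{i : a_i > \alpha} p_i$, hence the constraint is linear and thus convex.
Ans 3: $\mathbb{E}|X|^3 - \alpha \mathbb{E}|X| = \sum_{i=1}^n p_i (|a_i|^3 - \alpha |a_i|)$, hence the constraint is linear and thus convex.
Ans 4: $\operatorname{Var}(X) = \sum_{i=1}^n p_i a_i^2 - \left( \sum_{i=1}^n p_i a_i \right)^2$, which we can write as
\[ \sum_{i=1}^n p_i a_i^2 - \sum_{i,j} p_i p_j a_i a_j. \]
If $n = 2$, $a_1 = 1$, $a_2 = 0$, then $\operatorname{Var}(X) \le \alpha$ becomes $p_1 - p_1^2 \le \alpha$, which does not define a convex set in general: for $0 \le \alpha < 1/4$, both $p_1 = 0$ and $p_1 = 1$ are feasible but $p_1 = 1/2$ is not.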
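The counterexample in Ans 4 can be verified with exact arithmetic. In the sketch below the bound `alpha = 1/10` is a hypothetical value chosen below $1/4$; the support points $(1, 0)$ are the ones used in the answer.

```python
from fractions import Fraction


def variance(p, a):
    """Var(X) = sum_i p_i a_i^2 - (sum_i p_i a_i)^2 for P(X = a_i) = p_i."""
    mean = sum(pi * ai for pi, ai in zip(p, a))
    return sum(pi * ai * ai for pi, ai in zip(p, a)) - mean * mean


a = (1, 0)                              # support points from Ans 4
alpha = Fraction(1, 10)                 # hypothetical bound, below 1/4
p_left = (Fraction(0), Fraction(1))     # X = 0 with probability one
p_right = (Fraction(1), Fraction(0))    # X = 1 with probability one
p_mid = tuple((u + v) / 2 for u, v in zip(p_left, p_right))

# Feasibility of Var(X) <= alpha at the two endpoints and their midpoint.
feasible = [variance(p, a) <= alpha for p in (p_left, p_mid, p_right)]
```

Both endpoints satisfy the constraint while their midpoint, with variance $1/4$, does not, so the feasible set is not convex for this `alpha`.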
Discrete probability distn example: part 5
Ans 5: Define $k^* \triangleq \max\{i : a_i < \alpha\}$; then $q_\tau(X) \ge \alpha$ is equivalent to $\sum_{i=1}^{k^*} p_i < \tau$. Thus, the constraint $q_\tau(X) \ge \alpha$ is convex.
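The equivalence in Ans 5 can be checked by brute force on a small grid using exact fractions. The support `a`, the grid of probabilities, and the tested values of `alpha` and `tau` are assumptions for this sketch. When no $a_i$ lies below $\alpha$, the sum is treated as empty (zero), a case the slide's $k^*$ leaves implicit.

```python
from fractions import Fraction
from itertools import product


def quantile(p, a, tau):
    """q_tau(X) = inf{beta : P(X <= beta) >= tau}: the smallest a_i whose
    cumulative probability reaches tau (assumes 0 < tau <= 1 and a sorted)."""
    cumulative = Fraction(0)
    for pi, ai in zip(p, a):
        cumulative += pi
        if cumulative >= tau:
            return ai
    raise ValueError("probabilities must sum to at least tau")


def linear_condition(p, a, alpha, tau):
    """P(X < alpha) < tau, a strict linear inequality in p."""
    mass_below = sum((pi for pi, ai in zip(p, a) if ai < alpha), Fraction(0))
    return mass_below < tau


a = (1, 2, 3)
grid = [Fraction(k, 4) for k in range(5)]
agree = all(
    (quantile(p, a, tau) >= alpha) == linear_condition(p, a, alpha, tau)
    for p in product(grid, repeat=3) if sum(p) == 1
    for alpha in (Fraction(1, 2), 1, Fraction(3, 2), 2, 3, 4)
    for tau in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), 1)
)
```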
Linear fractional functions
- Linear fractional functions are more general than affine functions yet still preserve convexity
- A linear fractional function $f : \mathbb{R}^p \to \mathbb{R}^m$ has the form
  \[ f(x) = \frac{Ax + b}{c^\top x + d}, \]
  with domain $\{x : c^\top x + d > 0\}$.
- If $C$ is convex and $C \subseteq \operatorname{dom} f$, then $f(C)$ is convex
- Ex. Suppose $X$ and $Y$ are discrete r.v.s taking values in $\{1, \ldots, n\}$ and $\{1, \ldots, m\}$ respectively, with joint probabilities $p_{ij} = P(X = i, Y = j)$. Then
  \[ P(X = i \mid Y = j) = \frac{p_{ij}}{\sum_{k=1}^n p_{kj}} \]
  is a linear fractional function of $p$.
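The conditional-probability example can be sketched in a few lines. The joint table below is hypothetical, and the function name is an assumption; the positivity check mirrors the domain restriction $c^\top x + d > 0$, which here reads $P(Y = j) > 0$.

```python
from fractions import Fraction


def conditional_given_y(joint, j):
    """Return [P(X = i | Y = j)]_i = p_ij / sum_k p_kj for a joint table
    stored as joint[i][j]; requires P(Y = j) > 0."""
    column = [row[j] for row in joint]
    total = sum(column)
    if total <= 0:
        raise ValueError("P(Y = j) must be positive (outside the domain)")
    return [v / total for v in column]


# Hypothetical 2x3 joint table p_ij = P(X = i, Y = j); entries sum to one.
joint = [
    [Fraction(1, 10), Fraction(2, 10), Fraction(1, 10)],
    [Fraction(3, 10), Fraction(1, 10), Fraction(2, 10)],
]
cond = conditional_given_y(joint, 0)
```

Numerator and denominator are both linear in the entries of `joint`, which is what makes the map linear fractional.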