
A NICE PROOF OF FARKAS LEMMA

DANIEL VICTOR TAUSK

Abstract. The goal of this short note is to present a nice proof of Farkas Lemma, which states that if $C$ is the convex cone spanned by a finite set and if $x$ is not in $C$, then there exists a linear functional $\alpha$ that is nonnegative on $C$ with $\alpha(x) < 0$. This implies in particular that the convex cone $C$ is closed.

1. Introduction

Let $V$ be a real vector space. Recall that a subset $C$ of $V$ is said to be convex if for all $x, y \in C$ and all $t \in [0, 1]$ we have $(1 - t)x + ty \in C$; a simple induction argument shows that $C$ is convex if and only if $\sum_{i=1}^k t_i x_i \in C$, for every integer $k \ge 1$, all $x_1, \ldots, x_k \in C$ and all $t_1, \ldots, t_k \ge 0$ with $\sum_{i=1}^k t_i = 1$. The set $C$ is said to be a cone if for all $x \in C$, $t \ge 0$ we have $tx \in C$. It is easy to show that $C$ is a convex cone if and only if $\sum_{i=1}^k t_i x_i \in C$, for every integer $k \ge 1$, all $x_1, \ldots, x_k \in C$ and all $t_1, \ldots, t_k \ge 0$. If $\{v_1, \ldots, v_k\}$ is a finite nonempty subset of $V$ then the set:

\[ \Big\{ \sum_{i=1}^k t_i v_i : t_1, \ldots, t_k \ge 0 \Big\} \tag{1.1} \]

is easily seen to be the smallest convex cone containing $\{v_1, \ldots, v_k\}$; we call it the convex cone spanned by $\{v_1, \ldots, v_k\}$. We have the following:

1.1. Theorem (Farkas lemma). Let $V$ be a real vector space and $\{v_1, \ldots, v_k\}$ a finite nonempty subset of $V$. If $x \in V$ is not in the convex cone (1.1) spanned by $\{v_1, \ldots, v_k\}$ then there exists a linear functional $\alpha \in V^*$ such that $\alpha(x) < 0$ and $\alpha(v_i) \ge 0$, for all $i = 1, \ldots, k$.

Clearly, if a linear functional $\alpha$ is nonnegative on $v_1, \ldots, v_k$ then $\alpha$ is nonnegative on the entire convex cone (1.1); in particular, the condition of $x$ not being in (1.1) is also necessary for the existence of the linear functional $\alpha$ with the properties stated in Theorem 1.1. The following is a simple consequence of Theorem 1.1:

1.2. Corollary. The convex cone spanned by a finite subset of $\mathbb{R}^n$ is closed in $\mathbb{R}^n$.

Proof.
If $C$ is the convex cone spanned by a finite subset of $\mathbb{R}^n$ and if $x \in \mathbb{R}^n$ is not in $C$, then by Theorem 1.1 there exists a linear functional $\alpha \in (\mathbb{R}^n)^*$ with $\alpha(x) < 0$ which is nonnegative on $C$. Hence the open half-space $\{y \in \mathbb{R}^n : \alpha(y) < 0\}$ is an open neighborhood of $x$ contained in the complement of $C$.

Date: October 2005.

Without Farkas Lemma, the statement of Corollary 1.2 apparently has no immediate proof, although it may seem to be a fairly intuitive result. The following example shows that one should be careful with intuition in this matter.

1.3. Example. If $C$ is an arbitrary compact convex subset of $\mathbb{R}^n$ then the cone:

\[ \{ tx : x \in C,\ t \ge 0 \} \tag{1.2} \]

is not necessarily closed in $\mathbb{R}^n$. For instance, let $C$ be the closed disc in $\mathbb{R}^2$ of radius 1 centered at the point $(1, 0)$. In this case, the cone (1.2) is the union of the open half-plane $\{(x, y) \in \mathbb{R}^2 : x > 0\}$ with the origin. Hence it is not closed in $\mathbb{R}^2$.

1.4. Remark. Corollary 1.2 actually holds when $\mathbb{R}^n$ is replaced by an arbitrary Hausdorff topological vector space $V$. Namely, the convex cone spanned by a finite subset of $V$ is contained in a finite-dimensional subspace $S$ of $V$, which is complete and thus closed in $V$. Moreover, $S$ is linearly homeomorphic to $\mathbb{R}^n$ and thus Corollary 1.2 implies that the convex cone is closed in $S$.

2. The proof of Farkas lemma

Let us prove Theorem 1.1. We start by observing that there is no loss of generality in assuming that the vector space $V$ is finite-dimensional. Namely, let $V_0$ be the subspace of $V$ spanned by $\{v_1, \ldots, v_k, x\}$; assuming the theorem valid for finite-dimensional spaces, we obtain a linear functional $\alpha \in V_0^*$ with $\alpha(x) < 0$ and $\alpha(v_i) \ge 0$, for all $i = 1, \ldots, k$. The conclusion is thus obtained by considering an arbitrary linear extension of $\alpha$ to $V$. We also observe that we can assume that the vectors $v_i$ are nonzero, since we can obviously delete the zero vectors from the list $v_1, \ldots, v_k$ (if every $v_i$ is zero, the cone is $\{0\}$, so $x \ne 0$ and any $\alpha$ with $\alpha(x) < 0$ will do).
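To make the statement of Theorem 1.1 concrete before turning to its proof, the following minimal Python sketch checks a candidate functional in $\mathbb{R}^2$. It is only an illustration under stated assumptions: the function names `evaluate` and `is_farkas_certificate`, the generators, the point $x$ and the functional $\alpha$ are hypothetical choices that do not appear in the original note, and a linear functional is represented by its coefficient vector with respect to the canonical basis.

```python
from itertools import product


def evaluate(alpha, v):
    """Evaluate the linear functional with coefficient vector alpha at v."""
    return sum(a * c for a, c in zip(alpha, v))


def is_farkas_certificate(alpha, generators, x):
    """Return True if alpha(x) < 0 and alpha(v) >= 0 for every generator v."""
    return evaluate(alpha, x) < 0 and all(
        evaluate(alpha, v) >= 0 for v in generators
    )


# Hypothetical data in R^2: the cone spanned by v1 = (1, 0) and v2 = (1, 1)
# is the wedge {(a, b) : 0 <= b <= a}.
generators = [(1, 0), (1, 1)]
x = (0, 1)        # a point outside the wedge
alpha = (1, -1)   # the functional alpha(a, b) = a - b

print(is_farkas_certificate(alpha, generators, x))  # prints True

# Nonnegativity on the generators propagates to the whole cone (1.1):
# sample a few combinations t1*v1 + t2*v2 with t1, t2 >= 0 and evaluate alpha.
samples = [
    tuple(t1 * a + t2 * b for a, b in zip(*generators))
    for t1, t2 in product([0, 0.5, 1, 3], repeat=2)
]
print(all(evaluate(alpha, y) >= 0 for y in samples))  # prints True
```

The second check only samples finitely many points of the cone, so it illustrates, rather than proves, the remark following Theorem 1.1 that a functional nonnegative on the generators is nonnegative on the entire cone.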
To prove Theorem 1.1 we need the following:

2.1. Lemma (supporting hyperplane). Let $V$ be a finite-dimensional nonzero real vector space, $C$ be a convex subset of $V$ and $x \in V$ a point not in $C$. Then there exists a nonzero linear functional $\alpha \in V^*$ such that $\alpha(y) \ge \alpha(x)$, for all $y \in C$.

We leave the proof of Lemma 2.1 to the appendix. Geometrically, the statement of the lemma can be interpreted as follows: if the point $x$ is not in the convex set $C$ then there exists a hyperplane $\{y \in V : \alpha(y) = c\}$ passing through $x$ such that $C$ is contained in the closed half-space $\{y \in V : \alpha(y) \ge c\}$. The following is a simple consequence of Lemma 2.1:

2.2. Corollary. Let $C$ be a convex cone in a finite-dimensional nonzero real vector space $V$ and let $x \in V$ be a point not in $C$. Then there exists a nonzero linear functional $\alpha \in V^*$ with $\alpha(x) \le 0$ and $\alpha(y) \ge 0$, for all $y \in C$.

Proof. If $C$ is empty the result is trivial, so assume $C$ is nonempty; this implies $0 \in C$. Let $\alpha \in V^*$ be as in the statement of Lemma 2.1; then $0 = \alpha(0) \ge \alpha(x)$, since $0 \in C$. Given $y \in C$, we claim that $\alpha(y) \ge 0$; otherwise, $\alpha(y) < 0$ and $t\alpha(y) < \alpha(x)$ for $t > 0$ sufficiently large. Then $\alpha(ty) < \alpha(x)$ and, since $ty \in C$, this contradicts the choice of $\alpha$.

We can now begin the proof of Theorem 1.1. The proof is by induction on the dimension of $V$. If $V$ is one-dimensional then the nonzero vector $v_1$ is a basis for $V$ and the hypothesis that $x$ is not in the convex cone spanned by $\{v_1, \ldots, v_k\}$ implies that $x = tv_1$ with $t < 0$; moreover, every $v_i$ is then a positive multiple of $v_1$, since otherwise the cone would contain all negative multiples of $v_1$ and hence $x$. We may then choose $\alpha \in V^*$ to be the linear functional with $\alpha(v_1) = 1$.

Assume now that $V$ is a real vector space with $\dim(V) > 1$ and assume the theorem valid for vector spaces of dimension less than $\dim(V)$. We will show that the theorem holds for $V$. This will be done by induction on the number $k$ of vectors on the list $v_1, \ldots, v_k$.
If $k = 1$, we consider two possibilities: if $v_1$ and $x$ are linearly independent, there exists a linear functional $\alpha \in V^*$ with $\alpha(v_1) = 1$ and $\alpha(x) = -1$ (since the set $\{v_1, x\}$ can be extended to a basis of $V$); otherwise, if $v_1$ and $x$ are linearly dependent, then $x = tv_1$ with $t < 0$ and thus it suffices to choose a linear functional $\alpha$ with $\alpha(v_1) = 1$.

Assume now that $k > 1$ and that the result holds when the convex cone is spanned by fewer than $k$ vectors. Let $x \in V$ be a point outside the convex cone spanned by $\{v_1, \ldots, v_k\}$ and let us show that there exists a linear functional $\alpha \in V^*$ with $\alpha(x) < 0$ and $\alpha(v_i) \ge 0$, for all $i = 1, \ldots, k$. Corollary 2.2 gives us a nonzero linear functional $\alpha \in V^*$ with $\alpha(x) \le 0$ and $\alpha(v_i) \ge 0$, for all $i = 1, \ldots, k$. If $\alpha(x) < 0$, we are done; thus, we only have to worry about the case where $\alpha(x) = 0$.

If $\alpha(v_i) = 0$ for all $i = 1, \ldots, k$ then $x$ and $v_1, \ldots, v_k$ are all in the kernel of $\alpha$, whose dimension is less than $\dim(V)$ because $\alpha$ is nonzero. But we are assuming the result to hold for spaces of dimension less than $\dim(V)$, so that there exists a linear functional $\beta$ defined on the kernel of $\alpha$ with $\beta(x) < 0$ and $\beta(v_i) \ge 0$, for $i = 1, \ldots, k$. The linear functional $\beta$ has a linear extension to the entire space $V$, and thus the theorem is proved in this case.

Let us assume that $\alpha(v_i) > 0$ for at least one index $i$; then the set:

\[ \{ v_i : \alpha(v_i) = 0 \} \]

has cardinality less than $k$, and $x$ is not in the convex cone it spans. Thus, by the induction hypothesis on $k$, there exists a linear functional $\beta \in V^*$ with $\beta(x) < 0$ and such that $\beta(v_i) \ge 0$ for all those $i$ with $\alpha(v_i) = 0$ (we know nothing about $\beta(v_i)$ for those $i$ with $\alpha(v_i) > 0$). If this set is empty, any $\beta \in V^*$ with $\beta(x) < 0$ will do; such $\beta$ exists because $x \ne 0$. Given $\varepsilon > 0$, consider the linear functional $\tilde\alpha = \alpha + \varepsilon\beta$. We have $\tilde\alpha(x) = \varepsilon\beta(x) < 0$ and $\tilde\alpha(v_i) = \varepsilon\beta(v_i) \ge 0$ for those $i$ with $\alpha(v_i) = 0$. But by choosing $\varepsilon > 0$ sufficiently small, we can ensure that $\tilde\alpha(v_i) = \alpha(v_i) + \varepsilon\beta(v_i)$ is positive for those $i$ with $\alpha(v_i) > 0$. Thus the linear functional $\tilde\alpha$ has the desired properties and the proof is complete.

Appendix A.
The proof of the supporting hyperplane lemma

In this appendix we prove Lemma 2.1. We observe that this lemma is just a consequence of a version of the Hahn-Banach Theorem stated in many Functional Analysis books. We start by proving this version of the Hahn-Banach Theorem.

A.1. Theorem (Hahn-Banach). Let $V$ be a real vector space and let $p : V \to \mathbb{R}$ be a nonnegative map satisfying the following two conditions:

(a) $p(x + y) \le p(x) + p(y)$, for all $x, y \in V$ (subadditivity);
(b) $p(cx) = cp(x)$, for all $x \in V$ and all $c \ge 0$ (positive homogeneity).

Let $\alpha \in S^*$ be a linear functional defined on a subspace $S$ of $V$. If $\alpha(x) \le p(x)$ for all $x \in S$ then $\alpha$ extends to a linear functional $\tilde\alpha \in V^*$ such that $\tilde\alpha(x) \le p(x)$ for all $x \in V$.

A.2. Remark. If $V$ is a real vector space then a nonnegative map $p : V \to \mathbb{R}$ satisfying the conditions $p(x + y) \le p(x) + p(y)$ and $p(cx) = |c|\,p(x)$, for all $x, y \in V$, $c \in \mathbb{R}$, is usually called a semi-norm; if, in addition, $p(x) > 0$ for all $x \ne 0$, then $p$ is called a norm. Note that conditions (a) and (b) in the statement of Theorem A.1 do not imply that $p$ is a semi-norm, since condition (b) only requires $p(cx) = cp(x)$ for nonnegative $c$. A nonnegative map $p : V \to \mathbb{R}$ is a semi-norm if and only if it satisfies (a), (b) and the additional condition $p(-x) = p(x)$, for all $x \in V$. We observe that if $p$ is a semi-norm and $\alpha \in V^*$ then the inequality $\alpha(x) \le p(x)$ holds for all $x \in V$ if and only if the inequality $|\alpha(x)| \le p(x)$ does. Many books state the Hahn-Banach theorem only for semi-norms, but the proof in this more general context is exactly the same.

The proof of Theorem A.1 uses the following:

A.3. Lemma (simple extension). Under the hypotheses of Theorem A.1, given a vector $v \in V$ not in $S$ and denoting by $S'$ the subspace spanned by $S$ and $v$, the functional $\alpha$ extends to a linear functional $\alpha'$ on $S'$ such that $\alpha'(x) \le p(x)$, for all $x \in S'$.

Proof.
A linear extension $\alpha'$ of $\alpha$ to $S'$ is uniquely determined by its value on the vector $v$; more explicitly, every vector of $S'$ can be uniquely written in the form $x + tv$, with $x \in S$, $t \in \mathbb{R}$, and setting $\alpha'(v) = c$ for some real number $c$ we obtain:

\[ \alpha'(x + tv) = \alpha(x) + tc, \]

for all $x \in S$, $t \in \mathbb{R}$. We wish to determine a value for $c \in \mathbb{R}$ that makes the inequality:

\[ \alpha(x) + tc \le p(x + tv) \tag{A.1} \]

true, for all $x \in S$, $t \in \mathbb{R}$. For $t = 0$, inequality (A.1) holds. For $t > 0$, (A.1) is equivalent to:

\[ \alpha\big(\tfrac{x}{t}\big) + c \le p\big(\tfrac{x}{t} + v\big) \tag{A.2} \]

and for $t < 0$, (A.1) is equivalent to:

\[ \alpha\big(\tfrac{x}{t}\big) + c \ge -p\big(-\tfrac{x}{t} - v\big). \tag{A.3} \]

Clearly (A.2) holds for all $x \in S$ and all $t > 0$ if and only if $\alpha(y) + c \le p(y + v)$, for all $y \in S$; this is equivalent to:

\[ c \le p(y + v) - \alpha(y), \quad \text{for all } y \in S. \]

Similarly, (A.3) holds for all $x \in S$ and all $t < 0$ if and only if:

\[ c \ge -p(z - v) + \alpha(z), \quad \text{for all } z \in S. \]

Thus, the proof will be concluded if we are able to find a number $c \in \mathbb{R}$ which is an upper bound for the set $\{-p(z - v) + \alpha(z) : z \in S\}$ and a lower bound for the set $\{p(y + v) - \alpha(y) : y \in S\}$. But this is possible if and only if the inequality:

\[ -p(z - v) + \alpha(z) \le p(y + v) - \alpha(y) \tag{A.4} \]

holds, for all $y, z \in S$. Inequality (A.4) is equivalent to:

\[ \alpha(z + y) \le p(y + v) + p(z - v), \tag{A.5} \]

and (A.5) follows from the subadditivity of $p$, as we show below:

\[ \alpha(z + y) \le p(z + y) = p\big((y + v) + (z - v)\big) \le p(y + v) + p(z - v). \]

Proof of Theorem A.1. If $V$ is finite-dimensional (which is the case that actually matters to us), one simply applies Lemma A.3 a finite number of times, extending $\alpha$ successively until we get the desired linear functional $\tilde\alpha$ on $V$. For the general case, one uses Zorn's lemma to obtain a maximal extension of $\alpha$ to a linear functional $\tilde\alpha : \tilde S \to \mathbb{R}$ satisfying $\tilde\alpha(x) \le p(x)$, for all $x \in \tilde S$; then Lemma A.3 implies that $\tilde S = V$, otherwise we would be able to find a proper extension of $\tilde\alpha$.

A.4. Definition.
A subset $C$ of a real vector space $V$ is called absorbent in $V$ if for all $x \in V$ there exists $\varepsilon > 0$ such that $tx \in C$ for $0 \le t \le \varepsilon$.

Obviously absorbent sets must contain the origin.

A.5. Example. If $C$ is a neighborhood of the origin in $\mathbb{R}^n$ then $C$ is absorbent; namely, for $x \in \mathbb{R}^n$, the continuous map $\mathbb{R} \ni t \mapsto tx \in \mathbb{R}^n$ sends $t = 0$ to the origin and thus we must have $tx \in C$ for $|t|$ sufficiently small.

A.6. Lemma. Let $V$ be a real vector space and let $C$ be an absorbent convex subset of $V$. Then there exists a nonnegative map $p : V \to \mathbb{R}$ satisfying conditions (a) and (b) in the statement of Theorem A.1, such that:

\[ \{x \in V : p(x) < 1\} \subset C \subset \{x \in V : p(x) \le 1\}. \tag{A.6} \]

Proof. For each $x \in V$, consider the set:

\[ I_x = \big\{ t > 0 : \tfrac{x}{t} \in C \big\}. \]

The fact that $C$ is absorbent implies that $I_x$ is nonempty; thus, it has a nonnegative infimum. Set $p(x) = \inf I_x$, for all $x \in V$. Observe that if $x = 0$ then $I_x = \left]0, +\infty\right[$ and thus $p(0) = 0$; this proves (b) for $c = 0$. If $c > 0$ then clearly $t \in I_x$ if and only if $ct \in I_{cx}$, which shows that $I_{cx} = cI_x$; thus:

\[ p(cx) = \inf I_{cx} = c \inf I_x = cp(x). \]

This proves (b). Let us prove (a). Let $x, y \in V$ be fixed, set $t = p(x)$, $s = p(y)$ and choose $\varepsilon > 0$. We know that there exist $t' \in I_x$, $s' \in I_y$ with $t \le t' < t + \varepsilon$ and $s \le s' < s + \varepsilon$. Thus $\frac{x}{t'} \in C$, $\frac{y}{s'} \in C$ and the convexity of $C$ implies that:

\[ \frac{x + y}{t' + s'} = \frac{t'}{t' + s'} \cdot \frac{x}{t'} + \frac{s'}{t' + s'} \cdot \frac{y}{s'} \]

is also in $C$; thus $t' + s' \in I_{x+y}$ and:

\[ p(x + y) \le t' + s' < t + s + 2\varepsilon = p(x) + p(y) + 2\varepsilon. \]

Since $\varepsilon > 0$ is arbitrary, we obtain (a). Finally, let us prove (A.6). Given $x \in V$ with $p(x) < 1$, there exists $t \in I_x$ with $p(x) \le t < 1$. Thus $\frac{x}{t} \in C$, $0 \in C$ and the convexity of $C$ implies that $x = t\,\frac{x}{t} + (1 - t)0$ is in $C$. Moreover, given $x \in C$ then obviously $1 \in I_x$, so that $p(x) \le 1$. This proves (A.6) and concludes the proof.

A.7. Remark. The map $p$ constructed in the proof of Lemma A.6 is not in general a semi-norm, unless $C$ is symmetric, i.e., $x \in C$ implies $-x \in C$.

A.8. Corollary.
Lemma 2.1 holds if we add the hypothesis that the convex set $C$ be absorbent (in which case the hypothesis of finite-dimensionality for $V$ is no longer necessary).

Proof. Choose $p$ as in the statement of Lemma A.6. Since $0 \in C$ and $x \notin C$, we have $x \ne 0$ and thus $\{x\}$ is a basis for a one-dimensional subspace $S$ of $V$. Let $\alpha \in S^*$ be the linear functional such that $\alpha(x) = 1$. Notice that $p(x) \ge 1$ for, otherwise, (A.6) would imply $x \in C$. Thus, for $t \ge 0$ we have:

\[ \alpha(tx) = t \le tp(x) = p(tx); \]

obviously, for $t < 0$ we have $\alpha(tx) = t < 0 \le p(tx)$. We have thus shown that $\alpha(y) \le p(y)$ for all $y \in S$, and now Theorem A.1 gives us a linear extension $\tilde\alpha$ of $\alpha$ to $V$ such that $\tilde\alpha(y) \le p(y)$, for all $y \in V$. For all $y \in C$, we have $p(y) \le 1$ and thus:

\[ \tilde\alpha(y) \le p(y) \le 1 = \tilde\alpha(x). \]

Hence $-\tilde\alpha(y) \ge -\tilde\alpha(x)$, for all $y \in C$, so that $-\tilde\alpha$ is the linear functional we are looking for.

A.9. Lemma. If $V$ is a finite-dimensional real vector space and $C$ is a nonempty convex subset of $V$ then there exist a point $p \in C$ and a subspace $S$ of $V$ such that the set $C - p = \{x - p : x \in C\}$ is contained in $S$ and is absorbent in $S$.

Proof. Choose an arbitrary point $q \in C$ and set $C' = C - q$, so that $C'$ is a convex subset of $V$ containing the origin. Now let $\{b_1, \ldots, b_k\}$ be a maximal linearly independent subset of $C'$; the existence of such a subset follows easily from the finite-dimensionality of $V$. The subspace $S$ spanned by $\{b_1, \ldots, b_k\}$ contains $C'$; otherwise, there would exist $x \in C' \setminus S$ and thus $\{b_1, \ldots, b_k, x\}$ would be a linearly independent subset of $C'$ properly containing $\{b_1, \ldots, b_k\}$. Let $L : \mathbb{R}^k \to S$ be the linear isomorphism that carries the canonical basis of $\mathbb{R}^k$ to $\{b_1, \ldots, b_k\}$. Then $L^{-1}(C')$ is a convex subset of $\mathbb{R}^k$ containing the origin and the canonical basis; this implies that $L^{-1}(C')$ also contains the set:

\[ \Delta_k = \Big\{ (t_1, \ldots, t_k) \in \mathbb{R}^k : \sum_{i=1}^k t_i \le 1,\ t_i \ge 0,\ i = 1, \ldots, k \Big\}. \]

The point $u = \big(\tfrac{1}{k+1}, \tfrac{1}{k+1}, \ldots, \tfrac{1}{k+1}\big) \in \mathbb{R}^k$ is clearly an interior point of $\Delta_k$, so that $\Delta_k - u$ is a neighborhood of the origin in $\mathbb{R}^k$. Thus $L^{-1}(C') - u$ is also a neighborhood of the origin in $\mathbb{R}^k$ and it is therefore an absorbent subset of $\mathbb{R}^k$ (recall Example A.5). The image of $L^{-1}(C') - u$ under the isomorphism $L : \mathbb{R}^k \to S$ is equal to $C' - L(u)$ and it is an absorbent subset of $S$. But $C' - L(u) = C - \big(q + L(u)\big)$ and the conclusion is obtained by setting $p = q + L(u)$; note that $p \in C$, since $L(u) \in C'$.

Proof of Lemma 2.1. If $C$ is empty, any nonzero linear functional $\alpha$ will do. Assume that $C$ is nonempty and choose $p$ and $S$ as in the statement of Lemma A.9. If the point $x - p$ is not in $S$ then there exists a linear functional $\alpha \in V^*$ such that $\alpha|_S = 0$ and $\alpha(x - p) = -1$. Then, for all $y \in C$ we have $y - p \in S$ and thus:

\[ \alpha(y) - \alpha(p) = \alpha(y - p) = 0 > -1 = \alpha(x - p) = \alpha(x) - \alpha(p), \]

which implies $\alpha(y) > \alpha(x)$, for all $y \in C$. On the other hand, if $x - p$ is in $S$ then, since $C - p$ is convex and absorbent in $S$ and does not contain $x - p$, Corollary A.8 gives us a nonzero linear functional $\alpha$ on $S$ with $\alpha(z) \ge \alpha(x - p)$, for all $z \in C - p$; this implies:

\[ \alpha(y) - \alpha(p) = \alpha(y - p) \ge \alpha(x - p) = \alpha(x) - \alpha(p), \]

and thus $\alpha(y) \ge \alpha(x)$, for all $y \in C$. The conclusion is obtained by considering an arbitrary linear extension of $\alpha$ to $V$.

Departamento de Matemática, Universidade de São Paulo, Brazil
E-mail address: [email protected]
URL: http://www.ime.usp.br/~tausk
