Smoothed Fischer-Burmeister Equation Methods for the Complementarity Problem¹

Houyuan Jiang
CSIRO Mathematical and Information Sciences
GPO Box 664, Canberra, ACT 2601, Australia
Email: [email protected]

Abstract: By introducing another variable and an additional equation, we describe a technique to reformulate the nonlinear complementarity problem as a square system of equations. Some useful properties of this new reformulation are explored. These properties show that the new reformulation compares favourably with pure nonsmooth equation reformulations and with smoothing reformulations, because it combines advantages of both nonsmooth equation based methods and smoothing methods. A damped generalized Newton method is proposed for solving the reformulated equation. Global and local superlinear convergence can be established under mild assumptions. Numerical results are reported for a set of standard test problems from the library MCPLIB.

AMS (MOS) Subject Classifications. 90C33, 65K10, 49M15.

Key Words. Nonlinear complementarity problem, Fischer-Burmeister functional, semismooth equation, Newton method, global convergence, superlinear convergence.

1 Introduction

We are concerned with the solution of the nonlinear complementarity problem (NCP) [35]. Let $F: \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable. Then the NCP is to find a vector $x \in \mathbb{R}^n$ such that

    $x \ge 0, \quad F(x) \ge 0, \quad F(x)^T x = 0.$        (1)

Reformulating the NCP as a constrained or unconstrained smooth optimization problem, or as a constrained or unconstrained system of smooth or nonsmooth equations, has been a popular strategy in the last decade. Based on these reformulations, many algorithms such as merit function methods, smooth or nonsmooth equation methods, smoothing methods, and interior point methods have been proposed. In almost all these methods, one tries to apply techniques from traditional nonlinear programming or from systems of smooth equations to the reformulated problem.
Different descent methods have been developed for the NCP by solving the system of nonsmooth equations obtained by means of the Fischer-Burmeister functional [18]; see for example [10, 16, 17, 19, 25, 26, 27, 37, 42, 45]. In particular, global convergence of the damped generalized Newton method and the damped modified Gauss-Newton method for the Fischer-Burmeister reformulation of the NCP has been established in [25].

¹ This work was carried out initially at The University of Melbourne and was supported by the Australian Research Council.

A number of researchers have proposed and studied different smoothing methods. We refer the reader to [1, 2, 3, 4, 5, 6, 7, 14, 15, 20, 21, 23, 29, 30, 31, 32, 33, 41, 43, 44] and references therein. The main feature of smoothing methods is to reformulate the NCP as a system of nonsmooth equations, and then to approximate this system by a sequence of systems of smooth equations obtained by introducing one or more parameters. Newton-type methods are applied to these smooth equations. Under certain assumptions, the solutions of the smooth systems converge to a solution of the NCP as the parameters are controlled appropriately. It seems that a great deal of effort is usually needed to establish global convergence of smoothing methods. The introduction of parameters results in underdetermined systems of equations, which, from our viewpoint, may be the reason that the global convergence analysis becomes complicated. The use of smoothing methods based on the Fischer-Burmeister functional starts from Kanzow [29] for the linear complementarity problem. It has now become one of the main smoothing tools for solving the NCP and related problems. In particular, Kanzow [30] and Xu [44] have proved global as well as local superlinear convergence of their respective smoothing methods for the NCP with uniform P-functions.
Burke and Xu [1] proved global linear convergence of their smoothing method for the linear complementarity problem under both the P-matrix and S-matrix properties. Global convergence and local fast convergence analysis is usually complicated because special techniques are required to drive the smoothing parameter to zero. This feature seems to be shared by the other smoothing methods mentioned in the last paragraph.

Motivated by the above points, we shall introduce a technique to approximate the system of nonsmooth equations by a square system of smooth equations. This is accomplished by introducing a new parameter and a new equation. The solvability of the generalized Newton equation of this system can be guaranteed under very mild conditions. Since the reformulated system still gives rise to a smooth merit function, global convergence of the generalized Newton method can be established by following the standard analysis with minor modifications. Moreover, the damped modified Gauss-Newton method for smooth equations can be extended to our system of nonsmooth equations without difficulty. We use the Fischer-Burmeister functional [18] to demonstrate the new technique, though it may be adapted to other smoothing methods.

In Section 2, the NCP is reformulated as a square system of equations by introducing a parameter and an additional equation and by using the Fischer-Burmeister functional. We then study various properties, including semismoothness of the new system, equivalence between the new system and the NCP, and differentiability of the least squares merit function of the new system. Section 3 is devoted to the study of sufficient conditions that ensure, respectively, nonsingularity of the generalized Newton equations, that a stationary point of the least squares merit function is a solution of the NCP, and boundedness of the level sets associated with the least squares merit function.
In Section 4, we propose a damped generalized Newton method for solving this new system. Its global and local superlinear convergence can be established under mild conditions. Numerical results are reported for a set of test problems from the library MCPLIB. We conclude the paper with some remarks in the last section.

The following notation is used throughout the paper. For vectors $x, y \in \mathbb{R}^n$, $x^T$ is the transpose of $x$ and thus $x^T y$ is the inner product of $x$ and $y$. $\|x\|$ denotes the Euclidean norm of the vector $x \in \mathbb{R}^n$. For a given matrix $M = (m_{ij}) \in \mathbb{R}^{n \times n}$ and index sets $\mathcal{I}, \mathcal{J} \subseteq \{1, \ldots, n\}$, $M_{\mathcal{I}\mathcal{J}}$ denotes the submatrix of $M$ associated with the row indexes in $\mathcal{I}$ and the column indexes in $\mathcal{J}$. For a continuously differentiable functional $f: \mathbb{R}^n \to \mathbb{R}$, its gradient at $x$ is denoted by $\nabla f(x)$. If the function $F: \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable at $x$, then $F'(x)$ denotes its Jacobian at $x$. If $F: \mathbb{R}^n \to \mathbb{R}^n$ is locally Lipschitz continuous at $x$, then $\partial F(x)$ denotes its Clarke generalized Jacobian at $x$ [8]. The notation $(A) \Longleftrightarrow (B)$ means that the statements $(A)$ and $(B)$ are equivalent.

2 Reformulations and Equivalence

In order to reformulate the NCP (1), let us recall two basic functions. The first one is now known as the Fischer-Burmeister functional [18], defined by $\phi: \mathbb{R}^2 \to \mathbb{R}$,

    $\phi(b, c) := \sqrt{b^2 + c^2} - (b + c).$

The second one, denoted by $\psi: \mathbb{R}^3 \to \mathbb{R}$, is a modification of $\phi$, or a variation of its counterpart in $\mathbb{R}^3$. More precisely, $\psi: \mathbb{R}^3 \to \mathbb{R}$ is defined by

    $\psi(a, b, c) := \sqrt{a^2 + b^2 + c^2} - (b + c).$

Note that the function $\psi$ was introduced for the study of linear complementarity problems by Kanzow in [29], where $a$ is treated as a parameter rather than an independent variable.

Using these two functionals, we define two functions associated with the NCP as follows. For any given $x \in \mathbb{R}^n$ and $\mu \in \mathbb{R}$, define $H: \mathbb{R}^n \to \mathbb{R}^n$ by

    $H(x) := \begin{pmatrix} \phi(x_1, F_1(x)) \\ \vdots \\ \phi(x_n, F_n(x)) \end{pmatrix}$

and $G: \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$, $\tilde{G}: \mathbb{R}^{n+1} \to \mathbb{R}^n$ by

    $G(\mu, x) := \begin{pmatrix} e^{\mu} - 1 \\ \psi(\mu, x_1, F_1(x)) \\ \vdots \\ \psi(\mu, x_n, F_n(x)) \end{pmatrix} = \begin{pmatrix} e^{\mu} - 1 \\ \tilde{G}(\mu, x) \end{pmatrix},$

where $e$ is the Euler constant (the base of the natural logarithm). Consequently, we may define two systems of equations:

    $H(x) = 0$        (2)

and

    $G(\mu, x) = 0.$        (3)

Note that the first system has been extensively studied for the NCP (see for example [10, 16, 17, 19, 25, 26, 27, 37, 42, 45] and the references therein). If the first equation is removed from the second system, it reduces to the system introduced by Kanzow [29] for proposing smoothing or continuation methods to solve the LCP. This smoothing technique has since been used for solving other related problems (see for example [1, 15, 20, 23, 29, 30, 31, 44]). The novelty of this paper is the introduction of the first equation, which makes (3) a square system. As will be seen later, this new feature overcomes some difficulties encountered by generalized Newton-type methods based on the system (2), and facilitates the analysis of global convergence, which is, from our point of view, usually complicated in smoothing methods. Some nice properties of the methods based on the system (2) can be established for similar methods based on (3). Moreover, our analysis is much closer to the spirit of the classical Newton method than that of smoothing methods. The global convergence analysis of the generalized Newton and the modified Gauss-Newton method for the system (2) has been done in [25]. In the sequel, the second system will be the main one considered, although some connections and differences between (2) and (3) are explored.

One may define other functions which can play the same role as $e^{\mu} - 1$. For simplicity of analysis, we use this special function in the sequel; see the discussion in Section 6 for more details on how such functions may be defined.

The least squares merit functions of $H$ and $G$ are denoted by $\Psi$ and $\Theta$, namely,

    $\Psi(x) := \frac{1}{2} \|H(x)\|^2, \qquad \Theta(\mu, x) := \frac{1}{2} \|G(\mu, x)\|^2.$

$\Psi$ and $\Theta$ are usually called merit functions. The definitions of the functions $H$ and $G$ depend heavily on the functionals $\phi$ and $\psi$ respectively.
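As a concrete illustration of the reformulation above, the following Python sketch (not part of the original paper; the toy function `F` and all names are illustrative) evaluates $\psi$, the square system $G$ of (3), and the merit function $\Theta$ for a one-dimensional NCP.

```python
import numpy as np

def psi(a, b, c):
    """Smoothed Fischer-Burmeister functional:
    psi(a, b, c) = sqrt(a^2 + b^2 + c^2) - (b + c)."""
    return np.sqrt(a*a + b*b + c*c) - (b + c)

def G(mu, x, F):
    """The square system (3): first component e^mu - 1,
    followed by psi(mu, x_i, F_i(x)) for i = 1..n."""
    Fx = F(x)
    return np.concatenate(([np.exp(mu) - 1.0], psi(mu, x, Fx)))

def Theta(mu, x, F):
    """Least squares merit function Theta(mu, x) = 0.5 * ||G(mu, x)||^2."""
    g = G(mu, x, F)
    return 0.5 * g @ g

# Toy example (not from the paper): F(x) = x - 1, whose NCP solution is x = 1.
F = lambda x: x - 1.0
x_star = np.array([1.0])
print(Theta(0.0, x_star, F))  # -> 0.0 at a solution with mu = 0
```

Consistent with Proposition 2.1 below, the merit function vanishes exactly at $(\mu, x) = (0, x^*)$ with $x^*$ a solution of the NCP.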
Certainly, the study of some fundamental properties of $\phi$ and $\psi$ will help to give more insight into the functions $H$ and $G$.

Let $E: \mathbb{R}^n \to \mathbb{R}^n$ be locally Lipschitz continuous at $x \in \mathbb{R}^n$. Then the Clarke generalized Jacobian $\partial E(x)$ of $E$ at $x$ is well-defined and can be characterized as the convex hull of the set

    $\{ \lim_{x^k \to x} E'(x^k) \mid E \text{ is differentiable at } x^k \in \mathbb{R}^n \}.$

$\partial E(x)$ is a nonempty, convex and compact set for any fixed $x$ [8]. $E$ is said to be semismooth at $x \in \mathbb{R}^n$ if it is directionally differentiable at $x$, i.e., $E'(x; d)$ exists for any $d \in \mathbb{R}^n$, and if

    $Vd - E'(x; d) = o(\|d\|)$

for any $d \to 0$ and $V \in \partial E(x + d)$. $E$ is said to be strongly semismooth at $x$ if it is semismooth at $x$ and

    $Vd - E'(x; d) = O(\|d\|^2).$

See [39, 36, 19] for other characterizations and for the differential calculus of semismoothness and strong semismoothness.

We now present some properties of $\psi$, $G$ and $\Theta$. Note that similar properties of $\phi$, $H$ and $\Psi$ have been studied in [10, 17, 18, 22, 27, 28].

Lemma 2.1 (i) When $a = 0$, then $\psi(a, b, c) = 0$ if and only if $b \ge 0$, $c \ge 0$ and $bc = 0$.

(ii) $\psi$ is locally Lipschitz, directionally differentiable and strongly semismooth on $\mathbb{R}^3$. Furthermore, if $a^2 + b^2 + c^2 > 0$, then $\psi$ is continuously differentiable at $(a, b, c) \in \mathbb{R}^3$; namely, $\psi$ is continuously differentiable except at $(0, 0, 0)$. The generalized Jacobian of $\psi$ at $(0, 0, 0)$ is

    $\partial \psi(0, 0, 0) = \Omega := \left\{ (\alpha, \beta, \gamma)^T \,\middle|\, \alpha^2 + (\beta + 1)^2 + (\gamma + 1)^2 \le 1 \right\}.$

(iii) $\psi^2$ is smooth on $\mathbb{R}^3$. The gradient of $\psi^2$ at $(a, b, c) \in \mathbb{R}^3$ is

    $\nabla \psi^2(a, b, c) = 2\, \psi(a, b, c)\, \partial \psi(a, b, c).$

(iv) $\partial_b \psi(a, b, c)\, \partial_c \psi(a, b, c) \ge 0$ for any $(a, b, c) \in \mathbb{R}^3$. If $\psi(0, b, c) \ne 0$, then $\partial_b \psi(0, b, c)\, \partial_c \psi(0, b, c) > 0$.

(v) $\psi(0, b, c) = 0 \Longleftrightarrow \partial_b \psi^2(0, b, c) = 0 \Longleftrightarrow \partial_c \psi^2(0, b, c) = 0 \Longleftrightarrow \partial_b \psi^2(0, b, c) = \partial_c \psi^2(0, b, c) = 0$.

Proof. (i) Note that $\psi(0, b, c) = \phi(b, c)$. The result can be verified easily.

(ii) Note that $\sqrt{a^2 + b^2 + c^2}$ is the Euclidean norm of the vector $(a, b, c)^T$. Then $\sqrt{a^2 + b^2 + c^2}$ is locally Lipschitz, directionally differentiable and strongly semismooth on $\mathbb{R}^3$. $-(b + c)$ is continuously differentiable on $\mathbb{R}^3$, hence locally Lipschitz, directionally differentiable and strongly semismooth on $\mathbb{R}^3$. Fischer [19] has proved that the composition of strongly semismooth functions is still strongly semismooth. Therefore, $\psi$ is locally Lipschitz, directionally differentiable and strongly semismooth on $\mathbb{R}^3$. If $a^2 + b^2 + c^2 > 0$, then $\sqrt{a^2 + b^2 + c^2}$ is continuously differentiable at $(a, b, c)$, and so is $\psi$.

Let $d \in \mathbb{R}^3$ and $d \ne 0$. Then $\psi$ is continuously differentiable at $td$ for any $t > 0$, and

    $\nabla \psi(td) = \left( \dfrac{d_1}{\sqrt{d_1^2 + d_2^2 + d_3^2}},\; \dfrac{d_2}{\sqrt{d_1^2 + d_2^2 + d_3^2}} - 1,\; \dfrac{d_3}{\sqrt{d_1^2 + d_2^2 + d_3^2}} - 1 \right)^T.$

For simplicity, let $\nabla \psi(td)$ be denoted by $(\alpha, \beta, \gamma)^T$. Clearly,

    $\alpha^2 + (\beta + 1)^2 + (\gamma + 1)^2 = 1.$

Let $t$ tend to zero. By the semicontinuity property of the Clarke generalized Jacobian, we obtain that $(\alpha, \beta, \gamma) \in \partial \psi(0, 0, 0)$. It follows from the convexity of the generalized Jacobian that $\Omega \subseteq \partial \psi(0, 0, 0)$. On the other hand, for any $(a, b, c) \ne 0$,

    $(\nabla_a \psi(a, b, c))^2 + (\nabla_b \psi(a, b, c) + 1)^2 + (\nabla_c \psi(a, b, c) + 1)^2 = 1.$

By the definition of the Clarke generalized Jacobian, one may conclude that $\partial \psi(0, 0, 0) \subseteq \Omega$. This shows that $\partial \psi(0, 0, 0) = \Omega$.

(iii) Since $\psi$ is smooth everywhere on $\mathbb{R}^3$ except at $(0, 0, 0)$, the origin is the only point at which $\psi^2$ is possibly not smooth. But it is easy to prove that $\psi^2$ is also smooth at $(0, 0, 0)$. Therefore, $\psi^2$ is smooth on $\mathbb{R}^3$. Furthermore,

    $\nabla \psi^2(a, b, c) = 2\, \psi(a, b, c)\, \partial \psi(a, b, c).$

Note that $2\, \psi(0, 0, 0)\, \partial \psi(0, 0, 0) = \{0\}$ is a singleton even though $\partial \psi(0, 0, 0) = \Omega$ is a set.

(iv) By (ii), for any $(a, b, c) \in \mathbb{R}^3$ and any $(\alpha, \beta, \gamma)^T \in \partial \psi(a, b, c)$, we have

    $\alpha^2 + (\beta + 1)^2 + (\gamma + 1)^2 \le 1.$

This shows that $\beta \gamma \ge 0$. Suppose $\psi(0, b, c) \ne 0$. Then either $\min\{b, c\} < 0$ or $bc \ne 0$. In both cases, (ii) implies that $\beta \ne 0$ and $\gamma \ne 0$. Consequently, $\beta \gamma > 0$.

(v) Clearly, if $\psi(0, b, c) = 0$, then (iii) implies all the other results. If either $\partial_b \psi^2(0, b, c) = 0$ or $\partial_c \psi^2(0, b, c) = 0$, then we must have $\psi(0, b, c) = 0$. Otherwise, (iv) implies that $\partial_b \psi^2(0, b, c)\, \partial_c \psi^2(0, b, c) > 0$, which is a contradiction. The proof is complete. □

Proposition 2.1 (i) If $(\mu, x)$ is a solution of (3), then $\mu = 0$. And $x$ is a solution of the NCP if and only if $(0, x)$ is a solution of (3), i.e., $G(0, x) = 0$.

(ii) $G$ is continuously differentiable at $(\mu, x)$ when $\mu \ne 0$ and $F$ is continuously differentiable at $x$. $G$ is semismooth on $\mathbb{R}^{n+1}$ if $F$ is continuously differentiable on $\mathbb{R}^n$, and $G$ is strongly semismooth on $\mathbb{R}^{n+1}$ if $F'(x)$ is Lipschitz continuous on $\mathbb{R}^n$. If $V \in \partial G(\mu, x)$, then $V$ has the form

    $V = \begin{pmatrix} e^{\mu} & 0 \\ C & D F'(x) + E \end{pmatrix},$

where $C \in \mathbb{R}^n$, and both $D$ and $E$ are diagonal matrices in $\mathbb{R}^{n \times n}$ satisfying

    $C_i = \dfrac{\mu}{\sqrt{\mu^2 + x_i^2 + F_i(x)^2}}, \quad D_{ii} = \dfrac{x_i}{\sqrt{\mu^2 + x_i^2 + F_i(x)^2}} - 1, \quad E_{ii} = \dfrac{F_i(x)}{\sqrt{\mu^2 + x_i^2 + F_i(x)^2}} - 1$

if $\mu^2 + x_i^2 + F_i(x)^2 > 0$, and

    $C_i = \alpha_i, \quad D_{ii} = \beta_i, \quad E_{ii} = \gamma_i, \quad \text{with } \alpha_i^2 + (\beta_i + 1)^2 + (\gamma_i + 1)^2 \le 1$

if $\mu^2 + x_i^2 + F_i(x)^2 = 0$.

(iii) $\Theta(\mu, x) \ge 0$ for any $(\mu, x) \in \mathbb{R}^{n+1}$. And when the NCP has a solution, $x$ is a solution of the NCP if and only if $(0, x)$ is a global minimizer of $\Theta$ over $\mathbb{R}^{n+1}$.

(iv) $\Theta$ is continuously differentiable on $\mathbb{R}^{n+1}$. The gradient of $\Theta$ at $(\mu, x)$ is

    $\nabla \Theta(\mu, x) = V^T G(\mu, x) = \begin{pmatrix} e^{\mu}(e^{\mu} - 1) + C^T \tilde{G}(\mu, x) \\ F'(x)^T D \tilde{G}(\mu, x) + E \tilde{G}(\mu, x) \end{pmatrix}$

for any $V \in \partial G(\mu, x)$.

(v) In (iv), for any $\mu$ and $x$,

    $(D \tilde{G}(\mu, x))_i (E \tilde{G}(\mu, x))_i \ge 0, \quad 1 \le i \le n.$

If $\tilde{G}_i(0, x) \ne 0$, then $(D \tilde{G}(0, x))_i (E \tilde{G}(0, x))_i > 0$.

(vi) The following four statements are equivalent:

    $\tilde{G}_i(0, x) = 0; \quad (D \tilde{G}(0, x))_i = 0; \quad (E \tilde{G}(0, x))_i = 0; \quad (D \tilde{G}(0, x))_i = (E \tilde{G}(0, x))_i = 0.$

Proof. (i) If $G(\mu, x) = 0$, then $e^{\mu} - 1 = 0$, i.e., $\mu = 0$. The rest follows from (i) of Lemma 2.1.

(ii) When $\mu \ne 0$ and $F$ is continuously differentiable at $x$, $\psi(\mu, x_i, F_i(x))$ is continuously differentiable at $(\mu, x)$ for $1 \le i \le n$. Hence $G$ is continuously differentiable at $(\mu, x)$. Note that the composition of any two semismooth functions or strongly semismooth functions is semismooth or strongly semismooth, respectively (see [19]).
Since $\psi$ is strongly semismooth on $\mathbb{R}^3$ by (ii) of Lemma 2.1, semismoothness or strong semismoothness of $G$ follows respectively if $F$ is smooth at $x$ or if $F'$ is Lipschitz continuous at $x$. The form of an element $V$ of $\partial G(\mu, x)$ follows from the chain rule (Theorem 2.3.9 of [8]) and the form of the generalized Jacobian of $\psi$ in (ii) of Lemma 2.1. It should be pointed out that, unlike for $\partial \psi$, we only manage to give an outer estimate of $\partial G(\mu, x)$. Nevertheless, this outer estimate will be enough for the following analysis.

(iii) Trivially, $\Theta(\mu, x) \ge 0$ for any $(\mu, x)$. If $x$ is a solution of the NCP, (i) shows that $G(0, x) = 0$, i.e., $(0, x)$ is a global minimizer of $\Theta$. Conversely, if the NCP has a solution, then the global minimum of $\Theta$ is zero. If in addition $(0, x)$ is a global minimizer of $\Theta$, then $\Theta(0, x) = 0$ and $G(0, x) = 0$. The desired result follows from (i) again.

(iv) $\Theta$ can be rewritten as follows:

    $\Theta(\mu, x) = \frac{1}{2} (e^{\mu} - 1)^2 + \frac{1}{2} \sum_{i=1}^n \psi(\mu, x_i, F_i(x))^2.$

The smoothness of $\Theta$ over $\mathbb{R}^{n+1}$ follows from the smoothness of $F$ and $\psi^2$. The form of $\nabla \Theta$ follows from the chain rule and the smoothness of $\psi^2$.

(v) and (vi) The proof is analogous to that of (iv) and (v) of Lemma 2.1 and is omitted. □

Remark. Let $W$ denote the set of all matrices $D F'(x) + E$ such that there exists a vector $C$ which makes the matrix

    $\begin{pmatrix} 1 & 0 \\ C & D F'(x) + E \end{pmatrix}$

an element of $\partial G(0, x)$. On the one hand, any element of $\partial G(0, x)$ is very much like an element of $\partial H(x)$, and $\partial H(x) \subseteq W$. Because of this similarity, some standard analysis of $\partial H(x)$ can be extended to $\partial G(0, x)$, as we shall see in the next section. On the other hand, we must be aware that $\partial H(x)$ and $W$ are not the same in general; see [8] for more details. Therefore, some extra care needs to be taken when we say that techniques for $\partial H$ can be extended to $W$ or $\partial G(0, x)$.

The results below reveal that $\psi$, $\tilde{G}$ and $\Theta$ reduce to $\phi$, $H$ and $\Psi$ when $\mu = 0$. Further relationships between them can be explored, but we do not proceed here.

Lemma 2.2 (i) $\psi(0, b, c) = \phi(b, c)$ for any $b, c \in \mathbb{R}$.

(ii) $\tilde{G}(0, x) = H(x)$ for any $x \in \mathbb{R}^n$.

(iii) $\Theta(0, x) = \Psi(x)$ for any $x \in \mathbb{R}^n$.

3 Basic Properties

In this section, some basic properties of the functions $G$ and $\Theta$ are investigated. These properties include nonsingularity of the generalized Jacobian of $G$, sufficient conditions for a stationary point of $\Theta$ to be a solution of the NCP, and boundedness of the level sets of the merit function $\Theta$.

In the context of nonlinear complementarity, the notions of monotone matrices, monotone functions and other related concepts play important roles. We review some of them in the following. A matrix $M \in \mathbb{R}^{n \times n}$ is called a P-matrix (P₀-matrix) if each of its principal minors is positive (nonnegative). A function $F: \mathbb{R}^n \to \mathbb{R}^n$ is said to be a P₀-function over the open set $S \subseteq \mathbb{R}^n$ if for any $x, y \in S$ with $x \ne y$, there exists $i$ such that $x_i \ne y_i$ and

    $(x_i - y_i)(F_i(x) - F_i(y)) \ge 0.$

$F$ is a uniform P-function over $S$ if there exists a positive constant $\kappa$ such that for any $x, y \in S$,

    $\max_{1 \le i \le n} (x_i - y_i)(F_i(x) - F_i(y)) \ge \kappa \|x - y\|^2.$

Obviously, a P-matrix must be a P₀-matrix, and a uniform P-function must be a P₀-function. It is well known that the Jacobian of a P₀-function is always a P₀-matrix and the Jacobian of a uniform P-function is a P-matrix (see [9, 34]). The following characterization of a P-matrix can be found in Theorem 3.4.2 of [9].

Lemma 3.1 A matrix $M \in \mathbb{R}^{n \times n}$ is a P-matrix if and only if for every nonzero $x$ there exists an index $i$ ($1 \le i \le n$) such that $x_i \ne 0$ and $x_i (M x)_i > 0$.

To guarantee nonsingularity of the generalized Jacobian of $G$ at a solution of (3), R-regularity, introduced by Robinson [40], will be proved to be one of the sufficient conditions. Suppose $x^*$ is a solution of the NCP (1). Define three index sets

    $\mathcal{I} := \{1 \le i \le n \mid x_i^* > 0 = F_i(x^*)\},$
    $\mathcal{J} := \{1 \le i \le n \mid x_i^* = 0 = F_i(x^*)\},$
    $\mathcal{K} := \{1 \le i \le n \mid x_i^* = 0 < F_i(x^*)\}.$

The NCP is said to be R-regular at $x^*$ if the submatrix $F'(x^*)_{\mathcal{I}\mathcal{I}}$ of $F'(x^*)$ is nonsingular and the Schur complement

    $F'(x^*)_{\mathcal{J}\mathcal{J}} - F'(x^*)_{\mathcal{J}\mathcal{I}} F'(x^*)_{\mathcal{I}\mathcal{I}}^{-1} F'(x^*)_{\mathcal{I}\mathcal{J}}$

is a P-matrix.

Proposition 3.1 (i) If $\mu \ne 0$ and $F'(x)$ is a P₀-matrix, then $V$ is nonsingular for any $V \in \partial G(\mu, x)$.

(ii) If $F'(x)$ is a P-matrix, then $V$ is nonsingular for any $V \in \partial G(\mu, x)$.

(iii) If $\mu = 0$ and the NCP is R-regular at $x^*$, then $V$ is nonsingular for any $V \in \partial G(0, x^*)$.

Proof. From the form of the generalized Jacobian of $G(\mu, x)$, it follows that for any $V \in \partial G(\mu, x)$, $V$ is nonsingular if and only if the following submatrix of $V$ is nonsingular:

    $D F'(x) + E.$

(i) If $\mu \ne 0$, then both $-D$ and $-E$ are positive definite diagonal matrices. The nonsingularity of $D F'(x) + E$ is equivalent to the nonsingularity of the matrix $F'(x) + D^{-1} E$, with $D^{-1} E$ a positive definite diagonal matrix. It follows that $F'(x) + D^{-1} E$ is a P-matrix, hence nonsingular, if $F'(x)$ is a P₀-matrix.

(ii) If $F'(x)$ is a P-matrix, as remarked after Proposition 2.1, the technique to prove nonsingularity of the matrix $D F'(x) + E$ is quite standard. We omit the details here and refer the reader to [27] for a proof.

(iii) If $\mu = 0$ and the NCP is R-regular at $x^*$, the techniques to prove nonsingularity of $D F'(x^*) + E$ are also standard; see for example [17]. Therefore, nonsingularity of $\partial G$ at $(0, x^*)$ follows from nonsingularity of $D F'(x^*) + E$. □

The next result provides a sufficient condition under which a stationary point of the least squares merit function yields a solution of the NCP.

Proposition 3.2 If $(\mu, x)$ is a stationary point of $\Theta$ and $F'(x)$ is a P₀-matrix, then $\mu = 0$ and $x$ is a solution of the NCP.

Proof. Suppose $(\mu, x)$ is a stationary point of $\Theta$, i.e., $\nabla \Theta(\mu, x) = 0$. By Proposition 2.1, $\nabla \Theta(\mu, x) = V^T G(\mu, x) = 0$ for any $V \in \partial G(\mu, x)$. We first prove that $\mu = 0$. Assume to the contrary that $\mu \ne 0$. Then $V$ is nonsingular by Proposition 3.1, which shows that $G(\mu, x) = 0$ and hence $\mu = 0$, a contradiction. Therefore, $\mu = 0$.

In this case, $V^T G(0, x) = 0$ implies that

    $F'(x)^T D \tilde{G}(0, x) + E \tilde{G}(0, x) = 0,$

and hence, multiplying the $i$-th component by $(D \tilde{G}(0, x))_i$,

    $(D \tilde{G}(0, x))_i (F'(x)^T D \tilde{G}(0, x))_i + (D \tilde{G}(0, x))_i (E \tilde{G}(0, x))_i = 0.$

Suppose $\tilde{G}_i(0, x) \ne 0$ for some index $i$. By (v) and (vi) of Proposition 2.1,

    $(D \tilde{G}(0, x))_i (F'(x)^T D \tilde{G}(0, x))_i < 0$

for any index $i$ such that $\tilde{G}_i(0, x) \ne 0$. By Lemma 3.1, $F'(x)^T$ and $F'(x)$ are not P₀-matrices. This is a contradiction. Therefore,

    $\tilde{G}(0, x) = 0,$

which, together with $\mu = 0$, shows that $G(\mu, x) = 0$. The desired result follows from (i) of Proposition 2.1. □

Lemma 3.2 If $F$ is a uniform P-function on $\mathbb{R}^n$ and $\{x^k\}$ is an unbounded sequence, then there exists $i$ ($1 \le i \le n$) such that both the sequences $\{x_i^k\}$ and $\{F_i(x^k)\}$ are unbounded.

Proof. See the proof of Proposition 4.2 of Jiang and Qi [27]. □

Lemma 3.3 Suppose that $\{(a^k, b^k, c^k)\}$ is a sequence such that $\{a^k\}$ is bounded while $\{b^k\}$ and $\{c^k\}$ are unbounded. Then $\{\psi(a^k, b^k, c^k)\}$ is unbounded.

Proof. Without loss of generality, we may assume that $|b^k| \to \infty$ and $|c^k| \to \infty$ as $k$ tends to infinity. By the definition of $\psi$, it is clear that $\psi(a^k, b^k, c^k) \to +\infty$ if either $b^k$ or $c^k$ tends to $-\infty$. Now assume that $b^k \to +\infty$ and $c^k \to +\infty$. Then for sufficiently large $k$, it follows that

    $|\psi(a^k, b^k, c^k)| = \dfrac{|2 b^k c^k - (a^k)^2|}{\sqrt{(a^k)^2 + (b^k)^2 + (c^k)^2} + b^k + c^k} = \dfrac{|2 \max\{b^k, c^k\} \min\{b^k, c^k\} - (a^k)^2|}{\sqrt{(a^k)^2 + (b^k)^2 + (c^k)^2} + b^k + c^k} \ge \dfrac{2 \max\{b^k, c^k\} \min\{b^k, c^k\} - (a^k)^2}{\sqrt{(a^k)^2 + 2 (\max\{b^k, c^k\})^2} + 2 \max\{b^k, c^k\}}.$

Hence, it follows from the boundedness of $\{a^k\}$ that $\{\psi(a^k, b^k, c^k)\}$ is unbounded. This completes the proof. □

Proposition 3.3 If $F$ is a uniform P-function on $\mathbb{R}^n$ and $\{\mu^k\}$ is bounded, then the level set

    $L(\tau) := \{(\mu^k, x) : \Theta(\mu^k, x) \le \tau\}$

is bounded for any $\tau \ge 0$.

Proof. Assume that $L(\tau)$ is unbounded. Then there exists an unbounded sequence $\{(\mu^k, x^k)\}$ such that $\Theta(\mu^k, x^k) \le \tau$. This implies that $\{x^k\}$ is unbounded, by the boundedness of $\{\mu^k\}$. By Lemma 3.2, there exists an index $i$ such that both $\{x_i^k\}$ and $\{F_i(x^k)\}$ are unbounded. Lemma 3.3 shows that $\{\psi(\mu^k, x_i^k, F_i(x^k))\}$ is unbounded.
Clearly, we obtain that $\{\Theta(\mu^k, x^k)\}$ is unbounded. This is a contradiction. Therefore, $L(\tau)$ is bounded for any $\tau \ge 0$. □

4 A Damped Generalized Newton Method and Convergence

In this section, we develop a generalized Newton method for the system (3). The method contains two main steps. The first is to define a search direction, which we call the Newton step, by solving the so-called generalized Newton equation

    $V d = -G(\mu, x),$        (4)

where $V \in \partial G(\mu, x)$. The generalized Newton equation can be rewritten as

    $e^{\mu} d_{\mu} = -(e^{\mu} - 1),$
    $C d_{\mu} + (D F'(x) + E) d_x = -\tilde{G}(\mu, x),$

where $\tilde{G}(\mu, x)$ is defined as in Section 2. The second main step is a line search along the generalized Newton step to decrease the merit function $\Theta$. The full description of our method is as follows. For simplicity, let $z = (\mu, x)$, $z^* = (\mu^*, x^*)$ and $z^k = (\mu^k, x^k)$; similarly, $d^k = (d_{\mu}^k, d_{x}^k)$, etc.

Algorithm 1 (Damped generalized Newton method)

Step 1 (Initialization) Choose an initial starting point $z^0 = (\mu^0, x^0) \in \mathbb{R}^{n+1}$ such that $\mu^0 > 0$, two scalars $\beta, \sigma \in (0, 1)$, and let $k := 0$.

Step 2 (Search direction) Choose $V_k \in \partial G(z^k)$ and solve the generalized Newton equation (4) with $\mu = \mu^k$, $z = z^k$ and $V = V_k$. Let $d^k$ be a solution of this equation. If $d^k = 0$ is a solution of the generalized Newton equation, the algorithm terminates. Otherwise, go to Step 3.

Step 3 (Line search) Let $\alpha_k = \beta^{i_k}$, where $i_k$ is the smallest nonnegative integer $i$ such that

    $\Theta(z^k + \beta^i d^k) - \Theta(z^k) \le \sigma \beta^i \nabla \Theta(z^k)^T d^k.$

Step 4 (Update) Let $z^{k+1} := z^k + \alpha_k d^k$ and $k := k + 1$. Go to Step 2.

The above generalized Newton method reduces to the classical damped Newton method if $G$ is smooth; see Dennis and Schnabel [11]. A similar algorithm for solving the system (2) is proposed in [25]. It has long been recognized that non-monotone line search strategies are superior to the monotone line search strategy from a numerical point of view. As will be seen later, we implement a non-monotone line search in our numerical experiments.
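The damped generalized Newton scheme described above can be sketched in Python. This is a minimal illustration, not the paper's MATLAB implementation: it uses the simpler monotone Armijo rule, dense linear algebra, and illustrative names (`mu0`, `beta`, `sigma`, `damped_newton`). Since $\mu^k$ stays positive (Lemma 4.2), $G$ is smooth at every iterate and its ordinary Jacobian can be used.

```python
import numpy as np

def damped_newton(F, Fprime, x0, mu0=1.0, beta=0.5, sigma=1e-4,
                  tol=1e-8, max_iter=100):
    """Sketch of the damped generalized Newton method for G(mu, x) = 0."""
    psi = lambda a, b, c: np.sqrt(a*a + b*b + c*c) - (b + c)
    n = len(x0)
    mu, x = mu0, np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        g = np.concatenate(([np.exp(mu) - 1.0], psi(mu, x, Fx)))
        if 0.5 * g @ g < tol:               # merit function small enough
            break
        r = np.sqrt(mu*mu + x*x + Fx*Fx)    # > 0 because mu > 0
        V = np.zeros((n + 1, n + 1))        # Jacobian of G at (mu, x)
        V[0, 0] = np.exp(mu)
        V[1:, 0] = mu / r                   # the vector C
        V[1:, 1:] = np.diag(x / r - 1.0) @ Fprime(x) + np.diag(Fx / r - 1.0)
        d = np.linalg.solve(V, -g)          # generalized Newton equation (4)
        grad_dot_d = (V.T @ g) @ d          # = grad Theta(z)^T d  (Step 3)
        t = 1.0
        while True:                         # Armijo backtracking line search
            mu_n, x_n = mu + t*d[0], x + t*d[1:]
            g_n = np.concatenate(([np.exp(mu_n) - 1.0],
                                  psi(mu_n, x_n, F(x_n))))
            if 0.5 * g_n @ g_n - 0.5 * g @ g <= sigma * t * grad_dot_d:
                break
            t *= beta
        mu, x = mu + t*d[0], x + t*d[1:]    # Step 4 (update)
    return mu, x
```

On the toy problem $F(x) = x - 1$ with $F'(x) = I$, the iterates drive $\mu$ toward zero (while keeping it positive) and $x$ toward the solution $x^* = 1$, consistent with Lemma 4.2 and Theorem 4.1 below.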
In a non-monotone version of the damped generalized Newton method, $\Theta(z^k)$ on the left-hand side of the inequality in Step 3 is replaced by

    $\max\{\Theta(z^k), \Theta(z^{k-1}), \ldots, \Theta(z^{k-l})\},$

where $l$ is a positive integer. When $l = 0$, the non-monotone line search coincides with the monotone line search.

Lemma 4.1 If $G(z) \ne 0$ and the generalized Newton equation (4) is solvable at $z$, then its solution $d$ is a descent direction of the merit function $\Theta$ at $z$, that is, $\nabla \Theta(z)^T d < 0$. Furthermore, the line search step is well-defined at $z$.

Proof. This follows immediately from the differentiability of $\Theta$ and the generalized Newton equation. □

Since $\Theta$ is continuously differentiable on $\mathbb{R}^{n+1}$, it is easy to see that Algorithm 1 is well-defined provided that the generalized Newton direction is well-defined at each step. In Step 2, the existence of the search direction depends on the solvability of the generalized Newton equation. By Proposition 3.1, the generalized Newton equation is solvable if $F'(x)$ is a P₀-matrix and $\mu \ne 0$. We repeat that the main difference between (2) and (3) is that (3) has one more variable and one more equation than (2). This additional variable must be driven to zero in order to obtain a solution of (3), or a solution of the NCP, from Algorithm 1. So we next present a result on $\mu$ and $d_{\mu}$.

Lemma 4.2 When $\mu > 0$, then $d_{\mu} \in (-\mu, 0)$. Moreover, $\mu + t d_{\mu} \in (0, \mu)$ for any $t \in (0, 1]$.

Proof. By the first equation of the generalized Newton equation (4) and the Taylor series, we have

    $d_{\mu} = -\dfrac{e^{\mu} - 1}{e^{\mu}} = -\dfrac{\sum_{i=1}^{\infty} \mu^i / i!}{\sum_{i=0}^{\infty} \mu^i / i!} = -\mu \, \dfrac{\sum_{i=0}^{\infty} \mu^i / (i+1)!}{\sum_{i=0}^{\infty} \mu^i / i!},$

which implies that $d_{\mu} \in (-\mu, 0)$ when $\mu > 0$. It is then easy to see that $\mu + t d_{\mu} \in (0, \mu)$ for any $t \in (0, 1]$. □

Simply speaking, the above result says that after each step the variable $\mu$ will be closer to zero than its previous value; that is, $\mu$ is driven to zero automatically. However, $\mu$ always remains positive. This has two important consequences. Firstly, $G$ is continuously differentiable at $z^k = (\mu^k, x^k)$, which is convenient. Secondly, the solvability of the generalized Newton equation is more easily achieved when $\mu \ne 0$ than when $\mu = 0$; see Proposition 3.1.

Theorem 4.1 Suppose the generalized Newton equation in Step 2 is solvable for each $k$. Assume that $z^* = (\mu^*, x^*)$ is an accumulation point of $\{z^k\}$ generated by the damped generalized Newton method. Then the following statements hold:

(i) $x^*$ is a solution of the NCP if $\{d^k\}$ is bounded.

(ii) $x^*$ is a solution of the NCP and $\{z^k\}$ converges to $z^*$ superlinearly if $\partial G(z^*)$ is nonsingular and $\sigma \in (0, \frac{1}{2})$. The convergence rate is quadratic if $F'$ is Lipschitz continuous on $\mathbb{R}^n$.

Proof. The proof is similar to that of Theorem 4.1 in [25], where the damped generalized Newton method is applied to the system (2). We omit the details. □

Corollary 4.1 Suppose $F$ is a P₀-function on $\mathbb{R}^n$ and $\sigma \in (0, \frac{1}{2})$. Then Algorithm 1 is well-defined. Assume $z^* = (\mu^*, x^*)$ is an accumulation point of $\{z^k\}$ and $\partial G(z^*)$ is nonsingular or $F'(x^*)$ is a P-matrix. Then $\mu^* = 0$, $x^*$ is a solution of the NCP, and $z^k$ converges to $(0, x^*)$ superlinearly. If $F'$ is Lipschitz continuous on $\mathbb{R}^n$, then the convergence rate is quadratic.

Proof. By Lemma 4.2, $\mu^k > 0$ for any $k$. Since $F$ is a P₀-function, it follows from Proposition 3.1 that $\partial G(\mu^k, x^k)$ is nonsingular, which implies that the generalized Newton equation is solvable for any $k$. The result follows from Theorem 4.1. □

Corollary 4.2 Suppose $F$ is a uniform P-function on $\mathbb{R}^n$ and $\sigma \in (0, \frac{1}{2})$. Then Algorithm 1 is well-defined, $\{z^k\}$ is bounded, and $z^k$ converges to $z^* = (0, x^*)$ superlinearly, with $x^*$ the unique solution of the NCP; the convergence rate is quadratic if $F'$ is Lipschitz continuous on $\mathbb{R}^n$.

Proof. The results follow from Proposition 3.3 and Corollary 4.1. □

Remark. One point worth mentioning concerns the calculation of the generalized Jacobian of $G(\mu, x)$, since we only managed to give an outer estimate of $\partial G(\mu, x)$ in Proposition 2.1. However, we never have to worry about this in Algorithm 1. The reason is that the parameter $\mu^k$ is never equal to zero for any $k$. This implies that $G$ is actually smooth at $(\mu^k, x^k)$ for any $k$. Therefore, the generalized Jacobian of $G$ reduces to the Jacobian of $G$, which is a singleton and easy to calculate.

5 Numerical Results

In this section, we present numerical experiments for Algorithm 1 of Section 4 with a non-monotone line search strategy. We chose $l = 3$ for $k \ge 4$ and $l = k - 1$ for $k = 2, 3$, where $k$ is the iteration index. We also made the following change in our implementation: $\mu^k$ is replaced by $10^{-6}$ when $\mu^k < 10^{-6}$, because our experience showed that numerical difficulties sometimes occur if $\mu^k$ is too close to zero.

Algorithm 1 was implemented in MATLAB and run on a Sun SPARC workstation. The following parameters were used for all the test problems: $\mu^0 = 10.0$, $\sigma = 10^{-4}$, $\beta = 0.5$. The default initial starting point was used for each test problem in the library MCPLIB [12, 13]. The algorithm is terminated when one of the following criteria is satisfied: (i) the iteration number reaches 500; (ii) the line search step is less than $10^{-10}$; (iii) the minimum of

    $\|\min(F(x^k), x^k)\|_{\infty}$ and $\|\nabla \Theta(z^k)\|$

is less than or equal to $10^{-6}$.

We tested the nonlinear and linear complementarity problems from the library MCPLIB [12, 13]. The numerical results are summarized in Table 1, where Dim denotes the number of variables in the problem, Iter the number of iterations (which is also equal to the number of Jacobian evaluations for the function $F$), NF the number of function evaluations for the function $F$, and $\varepsilon$ the final value of $\|\min(F(x^*), x^*)\|_{\infty}$ at the found solution $x^*$.

The algorithm initially failed to solve bishop, colvdual, powell and shubik. Therefore, we perturbed the Jacobian matrices for these problems by adding $\delta I$ to $F'(x^k)$, where $\delta > 0$ is a small constant and $I$ is the identity matrix. We used $\delta = 10^{-5}$ for bishop, powell and shubik, and $\delta = 10^{-2}$ for colvdual.
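The stopping measure and the Jacobian perturbation just described are straightforward to express. The following sketch (illustrative names only, not the original MATLAB code) shows both.

```python
import numpy as np

def ncp_residual(x, Fx):
    """Accuracy measure reported in Table 1: || min(F(x), x) ||_inf."""
    return np.max(np.abs(np.minimum(Fx, x)))

def perturbed_jacobian(Fp, delta):
    """Regularization used for bishop, colvdual, powell and shubik:
    replace F'(x^k) by F'(x^k) + delta * I."""
    return Fp + delta * np.eye(Fp.shape[0])

# At an exact solution the residual vanishes:
print(ncp_residual(np.array([1.0, 0.0]), np.array([0.0, 2.0])))  # -> 0.0
```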
Our code failed to solve tinloi within 500 iterations whether or not the Jacobian perturbation was used. However, our experiments showed that it did not make any meaningful progress from the 33rd iteration to the 500th iteration. In fact, $\varepsilon = 2.07 \times 10^{-6}$ at both iterations, a value very close to the tolerance $10^{-6}$ used for termination. All other problems were solved successfully. One may see that most problems were solved in a small number of iterations. One important observation is that the number of function evaluations is very close to the number of iterations for most of the test problems. This implies that full Newton steps are taken most of the time, and superlinear convergence follows.

Problem     Dim   Iter       NF           ε
bertsekas    15    16         17          1.08e-07
billups       1    14         15          6.09e-07
bishop     1645    83        176          4.87e-07
colvdual     20    17         18          2.40e-07
colvnlp      15    16         17          1.25e-08
cycle         1    14         15          1.23e-11
degen         2    14         15          1.21e-10
explcp       16    14         15          1.75e-10
hanskoop     14    22         33          7.03e-08
jel           6    14         15          7.20e-11
josephy       4    14         15          1.32e-10
kojshin       4    15         16          8.59e-07
mathinum      3    22         23          6.45e-07
mathisum      3    15         16          4.86e-07
nash         10    14         15          6.68e-11
pgvon106    106    39         71          3.44e-07
powell       16    15         23          2.22e-09
scarfanum    13    19         20          1.29e-08
scarfasum    14    21         23          1.83e-08
scarfbsum    40    22         32          4.12e-08
shubik       45   169       1093          7.45e-07
simple-red   13    14         15          2.27e-08
sppe         27    14         15          1.57e-10
tinloi      146    32 (500)  118 (14540)  2.07e-06
tobin        42    14         15          1.18e-10

Table 1: Numerical results for the problems from MCPLIB

6 Concluding Remarks

By introducing another variable and an additional equation, we have reformulated the NCP as a square system of nonsmooth equations. It has been proved that this reformulation shares desirable properties of both nonsmooth equation reformulations and smoothing techniques. The semismoothness of the equation and the smoothness of its least squares merit function enable us to propose the damped generalized Newton method, and to prove global as well as local superlinear convergence under mild conditions.
Encouraging numerical results have been reported.

The main feature of the proposed methods is the introduction of the additional equation e^μ − 1 = 0. As we have seen, {μ_k} is a monotonically decreasing positive sequence if μ_0 > 0. This property ensures the following important consequences: (i) the reformulated system is smooth at each iteration, which might not be so important for our methods since the system is semismooth everywhere; (ii) the linearized system has a unique solution at any iteration k under mild conditions such as the P_0-property; (iii) μ_k is driven to zero, which is usually required in order to ensure correct convergence (i.e., the accumulation point should be a solution of the equation or a stationary point of the least squares merit function).

One may find other functions which play a similar role. For example, e^μ + μ − 1 = 0 might be an alternative. In general, the equation e^μ − 1 = 0 can be replaced by the equation φ(μ) = 0, where φ satisfies the following conditions:
(i) φ : ℜ → ℜ is continuously differentiable with φ′(μ) > 0 for any μ;
(ii) φ(μ) = 0 if and only if μ = 0;
(iii) d_μ = −φ(μ)/φ′(μ) ∈ (−μ, 0) for any μ > 0.
Some comments on these requirements follow. Condition (i) ensures that φ is smooth and that d_μ is well-defined. Condition (ii) guarantees that G(μ, x) = 0 implies that μ = 0 and x is a solution of the NCP, and that a stationary point of the merit function is a solution of the NCP under some mild conditions; see Propositions 2.1 and 3.2. Condition (iii) implies that 0 < μ + t d_μ < μ for any t ∈ (0, 1], which is required in the Armijo line search of Algorithm 1, and which also ensures that μ always remains positive and in a bounded set.

In [38], Qi, Sun and Zhou also treated smoothing parameters as independent variables in their smoothing methods. In their algorithm, these smoothing parameters are updated according to both the line search rule and the quality of the approximate solution of the problem considered.
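Condition (iii), which requires the Newton step d_μ = −φ(μ)/φ′(μ) to lie in (−μ, 0) for every μ > 0, can be checked numerically. The sketch below tries both φ(μ) = e^μ − 1 and the suggested alternative φ(μ) = e^μ + μ − 1 over a range of μ values; Python is used here purely for illustration (the paper's implementation is in MATLAB).

```python
import numpy as np

def newton_step(phi, dphi, mu):
    """Newton step d_mu = -phi(mu)/phi'(mu) on the scalar equation phi(mu) = 0."""
    return -phi(mu) / dphi(mu)

def satisfies_iii(phi, dphi, mus):
    """Check condition (iii): d_mu must lie strictly in (-mu, 0) for mu > 0,
    so that mu stays positive and decreases monotonically."""
    return all(-mu < newton_step(phi, dphi, mu) < 0.0 for mu in mus)

mus = [1e-6, 1e-3, 0.1, 1.0, 10.0]
ok_exp = satisfies_iii(lambda m: np.exp(m) - 1.0, lambda m: np.exp(m), mus)
ok_alt = satisfies_iii(lambda m: np.exp(m) + m - 1.0,
                       lambda m: np.exp(m) + 1.0, mus)
```

For φ(μ) = e^μ − 1 the step is d_μ = −(1 − e^{−μ}), which lies in (−μ, 0) since 1 − e^{−μ} < μ for all μ > 0.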
See [38] for more details. As has been seen in Algorithm 1, our smoothing parameter is updated by the line search rule.

The techniques introduced in this paper seem to be applicable to variational inequalities, mathematical programs with equilibrium constraints, semidefinite mathematical programs and related problems. The technique of introducing an additional equation may also be useful in other methods for solving the NCP and related problems whenever such parameters need to be introduced. In an early version [24] of this paper, a damped modified Gauss-Newton method and another damped generalized Newton method based on a modified form of the Fischer-Burmeister functional were proposed, and global as well as local fast convergence results were established. The interested reader is referred to the report [24] for more details.

Acknowledgements. The author is grateful to Dr. Danny Ralph for his numerous motivating discussions and many constructive suggestions and comments, and to Dr. Steven Dirkse for providing the test problems and a MATLAB interface to access them. The author is also thankful to the anonymous referees and Professor Liqun Qi for their valuable comments.

References

[1] J. Burke and S. Xu, The global linear convergence of a non-interior path-following algorithm for linear complementarity problems, Mathematics of Operations Research 23 (1998) 719-735.
[2] B. Chen and X. Chen, A global and local superlinear continuation-smoothing method for P_0 + R_0 and monotone NCP, SIAM Journal on Optimization 9 (1999) 624-645.
[3] B. Chen and P.T. Harker, A continuation method for monotone variational inequalities, Mathematical Programming (Series A) 69 (1995) 237-254.
[4] B. Chen and P.T. Harker, Smooth approximations to nonlinear complementarity problems, SIAM Journal on Optimization 7 (1997) 403-420.
[5] C. Chen and O.L. Mangasarian, Smoothing methods for convex inequalities and linear complementarity problems, Mathematical Programming 71 (1995) 51-69.
[6] C. Chen and O.L.
Mangasarian, A class of smoothing functions for nonlinear and mixed complementarity problems, Computational Optimization and Applications 5 (1996) 97-138.
[7] X. Chen, L. Qi and D. Sun, Global and superlinear convergence of the smoothing Newton method and its application to general box constrained variational inequalities, Mathematics of Computation 67 (1998) 519-540.
[8] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.
[9] R.W. Cottle, J.-S. Pang and R.E. Stone, The Linear Complementarity Problem, Academic Press, New York, 1992.
[10] T. De Luca, F. Facchinei and C. Kanzow, A semismooth equation approach to the solution of nonlinear complementarity problems, Mathematical Programming 75 (1996) 407-439.
[11] J.E. Dennis and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice Hall, Englewood Cliffs, New Jersey, 1983.
[12] S.P. Dirkse, MPECLIB and MCPLIB models and a MATLAB interface, http://www.gams.com/mpec, 2001.
[13] S.P. Dirkse and M.C. Ferris, MCPLIB: A collection of nonlinear mixed complementarity problems, Optimization Methods and Software 5 (1995) 319-345.
[14] J. Eckstein and M.C. Ferris, Smooth methods of multipliers for complementarity problems, Mathematical Programming 86 (1999) 65-90.
[15] F. Facchinei, H. Jiang and L. Qi, A smoothing method for mathematical programs with equilibrium constraints, Mathematical Programming 85 (1999) 81-106.
[16] F. Facchinei and C. Kanzow, A nonsmooth inexact Newton method for the solution of large-scale nonlinear complementarity problems, Mathematical Programming (Series B) 76 (1997) 493-512.
[17] F. Facchinei and J. Soares, A new merit function for nonlinear complementarity problems and a related algorithm, SIAM Journal on Optimization 7 (1997) 225-247.
[18] A. Fischer, A special Newton-type optimization method, Optimization 24 (1992) 269-284.
[19] A.
Fischer, Solution of monotone complementarity problems with locally Lipschitzian functions, Mathematical Programming 76 (1997) 513-532.
[20] M. Fukushima, Z.-Q. Luo and J.-S. Pang, A globally convergent sequential quadratic programming algorithm for mathematical programs with linear complementarity constraints, Computational Optimization and Applications 10 (1998) 5-34.
[21] S.A. Gabriel and J.J. Moré, Smoothing of mixed complementarity problems, in: M.C. Ferris and J.-S. Pang, eds., Complementarity and Variational Problems, SIAM Publications, 1997, pp. 105-116.
[22] C. Geiger and C. Kanzow, On the resolution of monotone complementarity problems, Computational Optimization and Applications 5 (1996) 155-173.
[23] K. Hotta and A. Yoshise, Global convergence of a class of non-interior-point algorithms using Chen-Harker-Kanzow functions for nonlinear complementarity problems, Mathematical Programming 86 (1999) 105-133.
[24] H. Jiang, Smoothed Fischer-Burmeister equation methods for the complementarity problem, Manuscript, Department of Mathematics, The University of Melbourne, June 1997.
[25] H. Jiang, Global convergence analysis of the generalized Newton and Gauss-Newton methods for the Fischer-Burmeister equation for the complementarity problem, Mathematics of Operations Research 24 (1999) 529-543.
[26] H. Jiang, M. Fukushima, L. Qi and D. Sun, A trust region method for solving generalized complementarity problems, SIAM Journal on Optimization 8 (1998) 140-157.
[27] H. Jiang and L. Qi, A new nonsmooth equations approach to nonlinear complementarity problems, SIAM Journal on Control and Optimization 35 (1997) 178-193.
[28] C. Kanzow, An unconstrained optimization technique for large-scale linearly constrained convex minimization problems, Computing 53 (1994) 101-117.
[29] C. Kanzow, Some noninterior continuation methods for linear complementarity problems, SIAM Journal on Matrix Analysis and Applications 17 (1996) 851-868.
[30] C.
Kanzow, A new approach to continuation methods for complementarity problems with uniform P-functions, Operations Research Letters 20 (1997) 85-92.
[31] C. Kanzow and H. Jiang, A continuation method for (strongly) monotone variational inequalities, Mathematical Programming 81 (1998) 103-125.
[32] M. Kojima, N. Megiddo and S. Mizuno, A general framework of continuation methods for complementarity problems, Mathematics of Operations Research 18 (1993) 945-963.
[33] M. Kojima, S. Mizuno and T. Noma, Limiting behaviour of trajectories generated by a continuation method for monotone complementarity problems, Mathematics of Operations Research 15 (1990) 662-675.
[34] J.J. Moré and W.C. Rheinboldt, On P- and S-functions and related classes of n-dimensional nonlinear mappings, Linear Algebra and its Applications 6 (1973) 45-68.
[35] J.-S. Pang, Complementarity problems, in: R. Horst and P. Pardalos, eds., Handbook of Global Optimization, Kluwer Academic Publishers, Boston, 1994, pp. 271-338.
[36] L. Qi, Convergence analysis of some algorithms for solving nonsmooth equations, Mathematics of Operations Research 18 (1993) 227-244.
[37] L. Qi, Regular pseudo-smooth NCP and BVIP functions and globally and quadratically convergent generalized Newton methods for complementarity and variational inequality problems, Mathematics of Operations Research 24 (1999) 440-471.
[38] L. Qi, D. Sun and G. Zhou, A new look at smoothing Newton methods for nonlinear complementarity problems and box constrained variational inequalities, Mathematical Programming 87 (2000) 1-35.
[39] L. Qi and J. Sun, A nonsmooth version of Newton's method, Mathematical Programming 58 (1993) 353-368.
[40] S.M. Robinson, Strongly regular generalized equations, Mathematics of Operations Research 5 (1980) 43-61.
[41] H. Sellami and S.M. Robinson, Implementations of a continuation method for normal maps, Mathematical Programming (Series B) 76 (1997) 563-578.
[42] P.
Tseng, Growth behavior of a class of merit functions for the nonlinear complementarity problem, Journal of Optimization Theory and Applications 89 (1996) 17-38.
[43] P. Tseng, An infeasible path-following method for monotone complementarity problems, SIAM Journal on Optimization 7 (1997) 386-402.
[44] S. Xu, The global linear convergence of an infeasible non-interior path-following algorithm for complementarity problems with uniform P-functions, Mathematical Programming 87 (2000) 501-517.
[45] N. Yamashita and M. Fukushima, Modified Newton methods for solving a semismooth reformulation of monotone complementarity problems, Mathematical Programming 76 (1997) 469-491.

Attachment

Proof of Theorem 4.1: (i) The generalized Newton direction in Step 2 is well-defined by the solvability assumption on the generalized Newton equation. By the generalized Newton equation and the smoothness of θ, we have

∇θ(z^k)^T d^k = G(z^k)^T V_k d^k = −‖G(z^k)‖² = −2θ(z^k) < 0.

Since d^k ≠ 0 (d = 0 is not a solution of the generalized Newton equation), it follows that d^k is a descent direction of the merit function θ at z^k. Therefore, the well-definedness of the line search step (Step 3), and hence of the algorithm, follows from the differentiability of the merit function θ.

Without loss of generality, we may assume that z* is the limit of the subsequence {z^k}_{k∈K}, where K is a subsequence of {1, 2, ...}. If {α_k}_{k∈K} is bounded away from zero, then, by a standard argument based on the decrease of the merit function at each iteration and the nonnegativity of the merit function over ℜ^{n+1}, we obtain Σ_{k∈K} −α_k ∇θ(z^k)^T d^k < +∞, which implies that Σ_{k∈K} θ(z^k) < +∞. Hence lim_{k→∞, k∈K} θ(z^k) = θ(z*) = 0, and z* is a solution of (3). On the other hand, if {α_k}_{k∈K} has a subsequence converging to zero, we may pass to the subsequence and assume that lim_{k→∞, k∈K} α_k = 0. From the line search step, for all sufficiently large k ∈ K,

θ(z^k + α_k d^k) − θ(z^k) ≤ σ α_k ∇θ(z^k)^T d^k,
θ(z^k + β^{−1} α_k d^k) − θ(z^k) > σ β^{−1} α_k ∇θ(z^k)^T d^k.

Since {d^k} is bounded, by passing to a further subsequence we may assume that lim_{k→∞, k∈K} d^k = d*. By some algebraic manipulations and passing to the limit, we obtain ∇θ(z*)^T d* = σ ∇θ(z*)^T d*, which, since σ < 1, means that ∇θ(z*)^T d* = 0. By the generalized Newton equation, it follows that

G(z^k)^T G(z^k) + G(z^k)^T V_k d^k = G(z^k)^T G(z^k) + ∇θ(z^k)^T d^k = 0.

This shows that lim_{k→∞, k∈K} G(z^k)^T G(z^k) = G(z*)^T G(z*) = 0, namely, z* is a solution of (3).

(ii) Since ∂G(z*) is nonsingular, it follows that ‖(V_k)^{−1}‖ ≤ c for some positive constant c and all sufficiently large k ∈ K. The generalized Newton equation then implies that {d^k}_{k∈K} is bounded. Therefore, (i) implies that G(z*) = 0. We next turn to the convergence rate. By the semismoothness of G at z*, for any sufficiently large k ∈ K,

G(z^k + d^k) = G(z* + (z^k + d^k − z*)) − G(z*) = U_k (z^k + d^k − z*) + o(‖z^k + d^k − z*‖),

where U_k ∈ ∂G(z^k + d^k), and

G(z^k) = G(z* + (z^k − z*)) − G(z*) = V (z^k − z*) + o(‖z^k − z*‖),

where V ∈ ∂G(z^k). Let V = V_k in the last equality. Then the generalized Newton equation and the uniform nonsingularity of V_k (k ∈ K) imply that

‖z^k + d^k − z*‖ = o(‖z^k − z*‖)   (5)

and ‖d^k‖ = ‖z^k − z*‖ + o(‖z^k − z*‖), which implies that lim_{k→∞, k∈K} d^k = 0. Consequently, it follows from the nonsingularity of ∂G(z*) that, for all sufficiently large k ∈ K,

lim inf_{k→∞, k∈K} ‖G(z^k)‖ / ‖z^k − z*‖ > 0,   lim inf_{k→∞, k∈K} ‖G(z^k + d^k)‖ / ‖z^k + d^k − z*‖ > 0.

Hence, (5) shows that ‖G(z^k + d^k)‖ = o(‖G(z^k)‖). By the generalized Newton equation and σ ∈ (0, 1/2), we obtain that α_k = 1 for all sufficiently large k ∈ K, i.e., the full generalized Newton step is taken. In other words, when k is sufficiently large, both z^k and z^k + d^k lie in a small neighborhood of z* by (5), and the damped Newton method becomes the generalized Newton method. The convergence and the convergence rate then follow from Theorem 3.2 of [39]. □