Computing the Nondominated Nash Points of a
Normal Form Game with Two Players
Hadi Charkhgard^{∗,a}, Martin Savelsbergh^{b}, and Masoud Talebian^{a}
a School of Mathematical and Physical Sciences, The University of Newcastle, Australia
b School of Industrial and Systems Engineering, Georgia Institute of Technology, USA
January 20, 2016
Abstract
We investigate computing Nash equilibria of normal form games with two players using mixed
integer programming. We slightly modify the mixed integer programming formulation of Sandholm
et al. [24] and show that this modified formulation has superior performance when computing a Nash
equilibrium. We then define the concept of efficient (Pareto optimal) Nash equilibria. This concept
is an equilibrium refinement, but different from the traditional concept of Pareto optimality. A Nash
equilibrium is “efficient” if it is Pareto optimal with respect to all Nash equilibria (but not necessarily
with respect to all possible mixed strategy profiles) of a game. This definition ensures that any game
with at least one Nash equilibrium (in mixed strategies), such as normal form games with two players,
must have at least one efficient Nash equilibrium. We prove that the set of all points in the payoff
space of a normal form game with two players corresponding to the utilities of players in an efficient
Nash equilibrium, the so-called nondominated Nash points, is finite. We demonstrate that biobjective
mixed integer programming can be used to efficiently compute the set of nondominated Nash points.
Finally, we illustrate how the nondominated Nash points can be used to determine the disagreement
point of a bargaining problem.
Keywords: biobjective mixed integer programming, bimatrix game, equilibrium refinement, efficient
Nash equilibria, disagreement point
1 Introduction
Over sixty years ago, John Nash introduced the most influential concept of game theory to this date,
now known as the Nash equilibrium. In his groundbreaking paper in 1951 [19], Nash showed that
every game with a finite number of players and action profiles has at least one Nash equilibrium (in
mixed strategies). Unfortunately, the proof of this fundamental result was non-constructive. In fact,
computing a Nash equilibrium, even for games with two players, appears to be hard; it is PPAD-complete [7]. As a consequence, how to efficiently compute a single Nash equilibrium and how to
efficiently compute all Nash equilibria have become central questions in algorithmic game theory.
∗ Corresponding author. Tel.: +61 424 607 237. E-mail address: [email protected]
We focus on normal form games with two players, sometimes referred to as bimatrix games. The
most popular algorithm for efficiently computing a single Nash equilibrium is the Lemke-Howson algorithm [14]. The Lemke-Howson algorithm is a path-following method designed specifically to compute
a Nash equilibrium of a nondegenerate normal form game. A slight variation of the algorithm can be
used to solve degenerate games. It is worth mentioning that the Lemke-Howson algorithm does not
handle negative payoffs. The Porter, Nudelman, and Shoham (PNS) algorithm [23] is an alternative
approach that repeatedly guesses the support of a mixed strategy for each player and then checks
whether these strategies result in a Nash equilibrium [23]. Another alternative approach, especially
relevant to our research, was proposed by Audet et al. [2] and later explored further by Sandholm et al.
[24]. Audet et al. [2] observe that mixed integer programming techniques can be used effectively to
find a Nash equilibrium. Rather than developing a customized algorithm, they formulate the problem
of finding a Nash equilibrium as a mixed integer program (MIP) and use commercial MIP solvers (in
particular CPLEX) to find a Nash equilibrium. Sandholm et al. [24] computationally show that this
approach is competitive (in terms of efficiency) with the Lemke-Howson algorithm.
It is a well-known fact that a normal form game can have a single Nash equilibrium, multiple but
a finite number of Nash equilibria, or even an infinite number of Nash equilibria. When a normal form
game has an infinite number of Nash equilibria, the set of all Nash equilibria can still be completely
described by finitely many Nash equilibria, namely by all extreme Nash equilibria.
Not surprisingly, many researchers have argued that computing a single Nash equilibrium in a game
that has multiple Nash equilibria does not provide sufficient information for analyzing the game (see,
for example, [3, 4, 9, 12, 15, 16, 17]). In such situations, the preferred approach is to find a complete
description of all the Nash equilibria and then use one or more criteria to identify or choose the most
appropriate equilibrium. Such an approach requires an algorithm to find a complete description of
all the Nash equilibria. Most such algorithms are based on cleverly and efficiently enumerating the
vertices of the best-response polytope. Unfortunately, computing all the Nash equilibria (i.e., finding a
complete description of the set of all Nash equilibria) is only practical for games with a small set of
actions for each player. For instance, Avis et al. [4] observe that for games in which each player has
more than 25 actions, their algorithm, lrsNash, does not terminate within 24 hours for nondegenerate
normal form games.
Because a single Nash equilibrium provides insufficient information to analyze a game and it is
prohibitive to compute all Nash equilibria except for games with small action spaces, it is natural
to consider finding a restricted set of Nash equilibria, i.e., all Nash equilibria with certain (desirable)
characteristics. This concept is known as equilibrium refinement. Strong Nash equilibria, sub-game
perfect Nash equilibria, proper Nash equilibria, coalition-proof Nash equilibria are just some of the
well-known restricted sets that have been considered in the literature.
In this study, we define a different restricted set of Nash equilibria with two desirable characteristics:
(1) the set is nonempty if the game has at least one Nash equilibrium; (2) the set is computable with
multi-objective optimization techniques (at least for bimatrix games). More precisely, we focus on
finding (a minimal set of) efficient Nash equilibria. A Nash equilibrium is said to be “efficient” if it
is Pareto optimal with respect to all Nash equilibria (but not necessarily with respect to all possible
mixed strategy profiles) of a game. This definition of efficient ensures that any game with at least one
Nash equilibrium, such as bimatrix games, will have at least one efficient Nash equilibrium. To the best
of our knowledge, this concept of efficient Nash equilibria has, so far, received little attention in the
literature. We were able to find only one reference that uses this notion [13]. LaValle [13] argues that
efficient Nash equilibria will be preferred by intelligent players.
We prove the critical result that even though a bimatrix game can have an infinite number of efficient
Nash equilibria, the set of all points in the payoff space corresponding to the utilities of players in an
efficient Nash equilibrium, the so-called nondominated Nash points, is finite. This implies that the set
of efficient Nash equilibria can be partitioned into a finite number of subsets of efficient Nash equilibria
with the same player payoffs. We demonstrate that computing a single efficient Nash equilibrium
for each of the subsets of the partition can be done efficiently by formulating the problem of finding
an efficient Nash equilibrium as a biobjective mixed integer program (BOMIP) and computing the
nondominated frontier of that biobjective mixed integer program.
During our exploration of biobjective mixed integer programming approaches for the computation
of nondominated Nash points, we were able to enhance the mixed integer programming formulation
proposed by Sandholm et al. [24] for finding a single Nash equilibrium of a normal form game with two
players. The modification, even though relatively minor, resulted in a significant reduction in computing
time across a wide variety of instances (generated using GAMUT, see http://gamut.stanford.edu).
On average, computing times were reduced by a factor of four.
Finally, we show how nondominated Nash points can be used in determining the disagreement
point in a bargaining problem. Nash introduced his approach to solve the bargaining problem in 1950
and showed that the bargaining problem has a unique solution, called the Nash bargaining solution, if
certain axioms hold [18]. However, the Nash bargaining solution depends on the choice of the status quo
or the disagreement point for the bargaining problem. The disagreement point represents the payoff
for each of the players if no grand coalition is created, i.e., if negotiations break down. Consequently,
in games with a single Nash equilibrium, the obvious disagreement point is given by the equilibrium
payoffs. However, in games with multiple equilibria, the situation is more complicated and the choice
of the disagreement point is controversial. We believe that in such games, the natural candidates
for disagreement points are the nondominated Nash points, because these points will be preferred
by intelligent players under competition [13] (and when negotiations break down, competition starts).
Supposing that each of the nondominated Nash points is equally likely to occur (under competition),
the expected payoff for the players is the average of the utility values of the nondominated Nash points.
We argue and demonstrate that an appropriate disagreement point is given by these expected payoffs.
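The averaging rule described above can be sketched in a few lines. This is only an illustration under the stated equal-likelihood assumption; the function name `expected_disagreement_point` and the two payoff points are hypothetical and not taken from the paper.

```python
def expected_disagreement_point(nash_points):
    """Average the nondominated Nash points, assuming each is equally likely.

    nash_points -- list of (payoff of player 1, payoff of player 2) pairs
    """
    if not nash_points:
        raise ValueError("at least one nondominated Nash point is required")
    n = len(nash_points)
    avg_1 = sum(point[0] for point in nash_points) / n
    avg_2 = sum(point[1] for point in nash_points) / n
    return (avg_1, avg_2)


# Hypothetical nondominated Nash points, used only for illustration.
hypothetical_points = [(4.0, 1.0), (2.0, 3.0)]
disagreement = expected_disagreement_point(hypothetical_points)
```

With a single nondominated Nash point the function simply returns that point, which matches the single-equilibrium case discussed in the text.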
To summarize, the contributions of our research are that we
1. develop an efficient approach for computing a single Nash equilibrium;
2. demonstrate that a set of efficient Nash equilibria, one for each nondominated Nash point, can
be computed effectively (for bimatrix games); and
3. propose the use of nondominated Nash points to define the disagreement point in a bargaining
problem.
The rest of the paper is organized as follows. In Section 2, we introduce important concepts and notation
related to normal form games. In Section 3, we detail the logic of our new MIP for computing a single
Nash equilibrium and report the results of a set of computational experiments carried out to analyze
the performance of the new MIP. In Section 4, we formally introduce the notion of nondominated Nash
points and we conduct a comprehensive computational study in which we investigate the nondominated
Nash frontier of a large number of games. In Section 5, we demonstrate how the nondominated Nash
frontier can be exploited in Nash bargaining problems. Finally, in Section 6, we give some concluding
remarks.
2 Normal Form Games
We will now describe the basic concepts of normal form games. We refer the interested readers to [26]
for more information. A normal form game is a tuple (N, A, u) where
• N is a finite set of n players, indexed by i;
• A = (A1 , · · · , An ) where Ai is a finite set of actions (pure strategies) available to player i. Each
vector (a1 , · · · , an ) ∈ A is called an action profile;
• u = (u1 , · · · , un ) where ui : A → R is a real-valued utility (or payoff) function for player i.
There are two kinds of strategies for playing a game: pure strategies and mixed strategies. In a pure strategy,
a player always chooses a unique action from among the set of possible actions. However, in a mixed
strategy, a player chooses an action from among the set of possible actions according to some probability
distribution.
Definition 1. Let (N, A, u) be a normal form game, and for any finite set S, let Π(S) be the set of all
discrete probability distributions over S. Then the set of mixed strategies for player i is Pi = Π(Ai ).
Definition 2. The Cartesian product of the individual mixed strategy sets P1 × · · · × Pn is called the
set of mixed strategy profiles.
Definition 3. The support of a mixed strategy pi ∈ Pi for a player i is the set of actions with positive
probability, i.e., {ai ∈ Ai : pi (ai ) > 0}.
Definition 4. Let (N, A, u) be a normal form game. The expected utility $u_i^E$ for player i of a mixed
strategy profile p = (p1 , · · · , pn ) is defined as
\[
u_i^E(p) = \sum_{a \in A} \Bigl( \prod_{j=1}^{n} p_j(a_j) \Bigr) u_i(a).
\]
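As a minimal sketch of Definition 4, the expected utility can be evaluated by summing over all action profiles. The dictionary-based representation, the function name `expected_utility`, and the matching-pennies payoffs below are assumptions made for illustration only.

```python
from itertools import product


def expected_utility(utility, strategies):
    """Expected utility of one player under a mixed strategy profile.

    utility    -- dict mapping an action profile (tuple) to that player's payoff
    strategies -- list with one dict per player, mapping action -> probability
    """
    total = 0.0
    # Sum over every action profile a in A = A_1 x ... x A_n.
    for profile in product(*(list(s) for s in strategies)):
        prob = 1.0
        # Probability of the profile is the product of the players' probabilities.
        for p_j, a_j in zip(strategies, profile):
            prob *= p_j[a_j]
        total += prob * utility[profile]
    return total


# Hypothetical payoffs of player 1 in a matching-pennies game (illustration only).
u1 = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}
uniform = {"H": 0.5, "T": 0.5}
heads = {"H": 1.0, "T": 0.0}

value_mixed = expected_utility(u1, [uniform, uniform])
value_pure = expected_utility(u1, [heads, heads])
```

The loop is exponential in the number of players, which is acceptable here because the paper restricts attention to two players.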
Definition 5. Let p−i := (p1 , · · · , pi−1 , pi+1 , · · · , pn ). Player i's best response to the strategy profile
p−i is a mixed strategy $p_i^* \in P_i$ such that $u_i^E(p_i^*, p_{-i}) \ge u_i^E(p_i, p_{-i})$ for all strategies $p_i \in P_i$.
Definition 6. A strategy profile p = (p1 , · · · , pn ) is a Nash equilibrium if for all agents i, pi is the best
response to p−i .
Theorem 7. Every game with a finite number of players and action profiles has at least one Nash
equilibrium [19].
For the remainder of the paper, we restrict ourselves to normal form games with two players (which we
sometimes refer to as bimatrix games). Next, we show how a Nash equilibrium can be computed.
Definition 5 implies that if the mixed strategy p2 of the second player is known, the best response
of the first player can be obtained by solving the following Linear Program (LP),
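\[
\begin{aligned}
\max \quad & \sum_{a_1 \in A_1} \sum_{a_2 \in A_2} p_1(a_1)\, p_2(a_2)\, u_1(a_1, a_2) \\
\text{subject to} \quad & \sum_{a_1 \in A_1} p_1(a_1) = 1, \\
& p_1(a_1) \ge 0 \quad \forall\, a_1 \in A_1 .
\end{aligned}
\]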
Similarly, if the mixed strategy p1 of the first player is known, the best response by player two can be
obtained by solving the following LP,
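\[
\begin{aligned}
\max \quad & \sum_{a_1 \in A_1} \sum_{a_2 \in A_2} p_1(a_1)\, p_2(a_2)\, u_2(a_1, a_2) \\
\text{subject to} \quad & \sum_{a_2 \in A_2} p_2(a_2) = 1, \\
& p_2(a_2) \ge 0 \quad \forall\, a_2 \in A_2 .
\end{aligned}
\]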
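Because each of these LPs maximizes a linear function over a probability simplex, an optimum is always attained at a pure action, so a best response can be found by comparing the expected payoff of each action. The sketch below illustrates this observation; the matrix representation `u[a1][a2]`, the function names, and the tolerance are assumptions, and the payoffs are those of the prisoner's dilemma instance used later in the paper (Table 2), with index 0 for N and index 1 for C.

```python
def action_values(u, opponent_strategy, player):
    """Expected payoff of each pure action of `player` (0 or 1).

    u                 -- payoff matrix of that player, indexed as u[a1][a2]
    opponent_strategy -- mixed strategy of the other player (list of probabilities)
    """
    if player == 0:
        return [sum(opponent_strategy[a2] * u[a1][a2] for a2 in range(len(u[0])))
                for a1 in range(len(u))]
    return [sum(opponent_strategy[a1] * u[a1][a2] for a1 in range(len(u)))
            for a2 in range(len(u[0]))]


def best_response(u, opponent_strategy, player, tol=1e-9):
    """Optimal LP value and the pure actions that attain it."""
    values = action_values(u, opponent_strategy, player)
    best = max(values)
    return best, [a for a, v in enumerate(values) if abs(v - best) <= tol]


# Prisoner's dilemma payoffs (index 0 = N, index 1 = C).
U1 = [[-1, -8], [0, -4]]
U2 = [[-1, 0], [-8, -4]]

value_1, actions_1 = best_response(U1, [0.5, 0.5], player=0)
value_2, actions_2 = best_response(U2, [0.5, 0.5], player=1)
```

Any mixture over the returned actions is also optimal for the LP, which is why a best response need not be unique.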
Definition 6 implies that to obtain a Nash equilibrium, the optimality conditions of these two LPs
must be achieved at the same time. The KKT optimality conditions imply the following feasibility
problem, which is known as the Linear Complementarity Problem (LCP):
\[
\begin{aligned}
& \sum_{a_2 \in A_2} p_2(a_2)\, u_1(a_1, a_2) + r_1(a_1) = u_1^E && \forall\, a_1 \in A_1 \\
& \sum_{a_1 \in A_1} p_1(a_1)\, u_2(a_1, a_2) + r_2(a_2) = u_2^E && \forall\, a_2 \in A_2 \\
& \sum_{a_1 \in A_1} p_1(a_1) = 1 \\
& \sum_{a_2 \in A_2} p_2(a_2) = 1 \\
& p_1(a_1)\, r_1(a_1) = 0 && \forall\, a_1 \in A_1 \\
& p_2(a_2)\, r_2(a_2) = 0 && \forall\, a_2 \in A_2 \\
& r_1(a_1) \ge 0 && \forall\, a_1 \in A_1 \\
& r_2(a_2) \ge 0 && \forall\, a_2 \in A_2 \\
& p_1(a_1) \ge 0 && \forall\, a_1 \in A_1 \\
& p_2(a_2) \ge 0 && \forall\, a_2 \in A_2 ,
\end{aligned}
\]
where $u_1^E$ and $u_2^E$ are free decision variables corresponding to the expected utility value of each player
in a Nash equilibrium [14]. Moreover, r1 (a1 ) and r2 (a2 ) for a1 ∈ A1 and a2 ∈ A2 are the slack variables
for the first two constraints. They can be interpreted as the regret of action a1 and a2 for a1 ∈ A1 and
a2 ∈ A2 when they are not in the support of a strategy of player one and two, respectively.
The LCP has an intuitive interpretation. It implies that in a Nash equilibrium, each player must
make the other player indifferent (the expected utility is exactly the same) between the choice of the
actions (pure strategies) in his support. This means that, in a Nash equilibrium, any action in the
support of a player must have zero regret.
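The zero-regret reading of the LCP suggests a simple check for whether a given pair of mixed strategies is a Nash equilibrium: set each player's expected utility to the best achievable action value, compute the regrets as slacks, and test complementarity. The sketch below follows that idea; the function name `lcp_check`, the tolerance, and the matching-pennies payoffs are illustrative assumptions.

```python
def lcp_check(U1, U2, p1, p2, tol=1e-9):
    """Check the LCP conditions for a candidate strategy pair (p1, p2).

    U1, U2 -- payoff matrices indexed as U[a1][a2]
    Returns (is_equilibrium, (uE1, uE2), r1, r2).
    """
    m, n = len(U1), len(U1[0])
    # Expected payoff of each pure action against the opponent's strategy.
    v1 = [sum(p2[j] * U1[i][j] for j in range(n)) for i in range(m)]
    v2 = [sum(p1[i] * U2[i][j] for i in range(m)) for j in range(n)]
    # With nonnegative regrets, the expected utility must equal the best value.
    uE1, uE2 = max(v1), max(v2)
    r1 = [uE1 - v for v in v1]
    r2 = [uE2 - v for v in v2]
    # Complementarity: an action played with positive probability has zero regret.
    complementary = (all(p * r <= tol for p, r in zip(p1, r1))
                     and all(p * r <= tol for p, r in zip(p2, r2)))
    return complementary, (uE1, uE2), r1, r2


# Hypothetical matching-pennies game (illustration only).
U1 = [[1, -1], [-1, 1]]
U2 = [[-1, 1], [1, -1]]

mixed_ok, mixed_values, _, _ = lcp_check(U1, U2, [0.5, 0.5], [0.5, 0.5])
pure_ok, _, pure_r1, pure_r2 = lcp_check(U1, U2, [1.0, 0.0], [1.0, 0.0])
```

In the pure profile, player two plays an action with regret 2, so complementarity fails and the profile is rejected.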
Obviously, to obtain a Nash equilibrium the LCP has to be solved, which can be done exactly using
the Lemke-Howson algorithm [14] or mixed integer programming [24], or heuristically using the PNS
algorithm [23].
The Lemke-Howson algorithm is one of the most efficient algorithms for solving the LCP [26]. The
algorithm is a path-following method which pivots using complementary feasible bases and terminates
when it finds a Nash equilibrium. Sandholm et al. [24] transform the LCP into a mixed integer program
(MIP) and solve it using a powerful commercial MIP solver. They show that the MIP approach is
competitive with the Lemke-Howson algorithm.
The PNS algorithm proceeds differently. The algorithm repeatedly guesses the support of each
player’s strategy so as to reduce the LCP to a linear program, which can then be checked for feasibility.
Sandholm et al. [24] observed that there exist classes of normal form games where the PNS algorithm
tends to struggle to obtain a Nash equilibrium because it must guess and evaluate a large number of
supports.
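To make the idea of guessing supports concrete, the following sketch enumerates supports for a 2x2 game only. It is a simplified illustration and not the PNS algorithm itself: it checks the four pure support pairs directly and solves the two indifference equations in closed form for the full-support pair, ignoring mixed-size supports (which matter only in degenerate games). The function names and the battle-of-the-sexes payoffs are hypothetical, and payoffs are assumed to be integers so that exact fractions can be used.

```python
from fractions import Fraction


def pure_equilibria(U1, U2):
    """Supports of size one: keep action pairs with no profitable deviation."""
    found = []
    for i in range(2):
        for j in range(2):
            best_for_1 = all(U1[i][j] >= U1[k][j] for k in range(2))
            best_for_2 = all(U2[i][j] >= U2[i][k] for k in range(2))
            if best_for_1 and best_for_2:
                p1 = tuple(Fraction(int(a == i)) for a in range(2))
                p2 = tuple(Fraction(int(a == j)) for a in range(2))
                found.append((p1, p2))
    return found


def full_support_equilibrium(U1, U2):
    """Full supports: each player mixes so that the other is indifferent."""
    d1 = U1[0][0] - U1[0][1] - U1[1][0] + U1[1][1]
    d2 = U2[0][0] - U2[1][0] - U2[0][1] + U2[1][1]
    if d1 == 0 or d2 == 0:
        return None  # indifference equations have no unique solution
    q = Fraction(U1[1][1] - U1[0][1], d1)  # probability of player 2's action 0
    p = Fraction(U2[1][1] - U2[1][0], d2)  # probability of player 1's action 0
    if 0 < p < 1 and 0 < q < 1:
        return (p, 1 - p), (q, 1 - q)
    return None  # guessed support is infeasible


def support_guessing_2x2(U1, U2):
    equilibria = pure_equilibria(U1, U2)
    mixed = full_support_equilibrium(U1, U2)
    if mixed is not None:
        equilibria.append(mixed)
    return equilibria


# Hypothetical battle-of-the-sexes payoffs (illustration only).
U1 = [[2, 0], [0, 1]]
U2 = [[1, 0], [0, 2]]
equilibria = support_guessing_2x2(U1, U2)
```

For larger games the closed-form step is replaced by a linear feasibility problem per support pair, and the number of support pairs grows exponentially, which is the difficulty noted in the text.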
3 Computing a Nash Equilibrium
In this section, we present a new MIP to obtain a Nash equilibrium. Our formulation is a simplified
version of the best formulation of Sandholm et al. [24] with, importantly, two additional payoff-based
valid inequalities:
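\[
\begin{aligned}
\max \quad & u_1^E + u_2^E \\
& u_1^E \le u_1^{\max} \\
& u_2^E \le u_2^{\max} \\
& \sum_{a_2 \in A_2} p_2(a_2)\, u_1(a_1, a_2) + r_1(a_1) = u_1^E && \forall\, a_1 \in A_1 \\
& \sum_{a_1 \in A_1} p_1(a_1)\, u_2(a_1, a_2) + r_2(a_2) = u_2^E && \forall\, a_2 \in A_2 \\
& \sum_{a_1 \in A_1} p_1(a_1) = 1 \\
& \sum_{a_2 \in A_2} p_2(a_2) = 1 \\
& p_1(a_1) \le b_1(a_1) && \forall\, a_1 \in A_1 \\
& p_2(a_2) \le b_2(a_2) && \forall\, a_2 \in A_2 \\
& r_1(a_1) \le M_1 (1 - b_1(a_1)) && \forall\, a_1 \in A_1 \\
& r_2(a_2) \le M_2 (1 - b_2(a_2)) && \forall\, a_2 \in A_2 \\
& r_1(a_1) \ge 0 && \forall\, a_1 \in A_1 \\
& r_2(a_2) \ge 0 && \forall\, a_2 \in A_2 \\
& p_1(a_1) \ge 0 && \forall\, a_1 \in A_1 \\
& p_2(a_2) \ge 0 && \forall\, a_2 \in A_2 \\
& b_1(a_1) \in \{0, 1\} && \forall\, a_1 \in A_1 \\
& b_2(a_2) \in \{0, 1\} && \forall\, a_2 \in A_2 ,
\end{aligned}
\]
where $u_1^{\max} = \max_{a \in A} u_1(a)$, $u_2^{\max} = \max_{a \in A} u_2(a)$, $M_1 = \max_{a^h, a^l \in A} u_1(a^h) - u_1(a^l)$ and
$M_2 = \max_{a^h, a^l \in A} u_2(a^h) - u_2(a^l)$.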
The MIP is a linearization of LCP, in which the constraints
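\[
\begin{aligned}
& p_1(a_1)\, r_1(a_1) = 0 && \forall\, a_1 \in A_1 \\
& p_2(a_2)\, r_2(a_2) = 0 && \forall\, a_2 \in A_2
\end{aligned}
\]
are replaced by
\[
\begin{aligned}
& p_1(a_1) \le b_1(a_1) && \forall\, a_1 \in A_1 \\
& p_2(a_2) \le b_2(a_2) && \forall\, a_2 \in A_2 \\
& r_1(a_1) \le M_1 (1 - b_1(a_1)) && \forall\, a_1 \in A_1 \\
& r_2(a_2) \le M_2 (1 - b_2(a_2)) && \forall\, a_2 \in A_2 .
\end{aligned}
\]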
That is, the nonlinear constraints are replaced by linear constraints at the price of the introduction
of binary variables b1 (a1 ) for a1 ∈ A1 and b2 (a2 ) for a2 ∈ A2 and disjunctive constraints r1 (a1 ) ≤
M1 (1 − b1 (a1 )) for a1 ∈ A1 and r2 (a2 ) ≤ M2 (1 − b2 (a2 )) for a2 ∈ A2 .
Moreover, two valid inequalities defining upper bounds on the expected utility values of the players
are added, i.e., $u_1^E \le u_1^{\max}$ and $u_2^E \le u_2^{\max}$. Note that any feasible solution to the MIP represents
a Nash equilibrium. The objective function, which maximizes the social welfare, is added because
computational experiments revealed that CPLEX solves the MIP faster when this objective function
is added [24].
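The constants of the formulation and the effect of the big-M linearization can be illustrated without a solver. The sketch below computes $u_1^{\max}$, $u_2^{\max}$, $M_1$ and $M_2$ from the payoff matrices and checks, by brute force over the binary indicators, whether given probabilities and regrets can satisfy the linearized constraints. The function names and tolerance are assumptions; the payoffs are those of the prisoner's dilemma instance of Table 2 (index 0 = N, index 1 = C).

```python
from itertools import product


def mip_constants(U1, U2):
    """Payoff upper bounds and big-M values used in the formulation."""
    flat1 = [v for row in U1 for v in row]
    flat2 = [v for row in U2 for v in row]
    return {
        "u1_max": max(flat1),
        "u2_max": max(flat2),
        "M1": max(flat1) - min(flat1),
        "M2": max(flat2) - min(flat2),
    }


def linearized_ok(p, r, b, big_m, tol=1e-9):
    """Check p <= b, r <= M(1 - b), r >= 0 and p >= 0 for one player."""
    return all(
        -tol <= p_a <= b_a + tol and -tol <= r_a <= big_m * (1 - b_a) + tol
        for p_a, r_a, b_a in zip(p, r, b)
    )


def exists_indicator(p, r, big_m):
    """True if some binary vector b makes (p, r) feasible for the linearization."""
    return any(linearized_ok(p, r, b, big_m)
               for b in product((0, 1), repeat=len(p)))


U1 = [[-1, -8], [0, -4]]
U2 = [[-1, 0], [-8, -4]]
constants = mip_constants(U1, U2)

# Player 1 confesses with probability one; regret 4 for not confessing.
feasible = exists_indicator([0.0, 1.0], [4.0, 0.0], constants["M1"])
# An action with positive probability and positive regret violates p * r = 0.
infeasible = exists_indicator([0.5, 0.5], [4.0, 0.0], constants["M1"])
```

The second call fails for every choice of b, mirroring the fact that the linear constraints enforce the complementarity condition of the LCP.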
To demonstrate the efficiency of the new MIP, a computational study was conducted. The instances
in the study are generated using GAMUT. We generated 5 classes of instances: C20, C40, C60, C80
and C100, where the class identifier Cm embeds the size of the game, i.e., m = |A1 | = |A2 |. Each
class has 20 subclasses, each having 5 instances. Each subclass is a different game distribution in
GAMUT. The 20 distributions can be found in Table 1. These distributions are used frequently in
previous studies, see for instance Sandholm et al. [24] or Porter et al. [23]. (It is worth mentioning that
distributions related to polymatrix and graphical games can be used to generate instances with more
than two players. However, the focus of our study is on instances with only two players.)
Table 1: Distributions of GAMUT.
D1    BertrandOligopoly       D11   MinimumEffortGame
D2    BidirectionalLEG-CG     D12   PolymatrixGame-CG
D3    BidirectionalLEG-RG     D13   PolymatrixGame-RG
D4    CovariantGame-Pos       D14   PolymatrixGame-Road
D5    CovariantGame           D15   PolymatrixGame-SW
D6    CovariantGame-Zero      D16   RandomGame
D7    GraphicalGame-RG        D17   UniformLEG-CG
D8    GraphicalGame-Road      D18   UniformLEG-RG
D9    GraphicalGame-SG        D19   UniformLEG-SG
D10   GraphicalGame-SW        D20   LocationGame
CPLEX 12.5.1 is used to solve the MIPs. All instances were run on a Dell PowerEdge R710 with
dual hex core 3.06GHz Intel Xeon X5675 processors and 96GB RAM running Red Hat Enterprise Linux
6, and only a single thread was used. In all experiments, a runtime limit of 1800 seconds was imposed.
Figure 1 shows the performance profile of the runtime of CPLEX for three formulations: the (best)
formulation proposed by Sandholm et al. [24] (S), the new formulation (N), and the new formulation
without the two valid inequalities defining upper bounds on the expected utility values of the players (W). A performance profile [10] presents cumulative distribution functions for a set of solution
approaches being compared with respect to a specific performance metric. The runtime performance
profile for a set of solution approaches is constructed by computing for each solution approach and for
each instance the ratio of the runtime of the solution approach on the instance and the minimum of
the runtimes of all solution approaches on the instance. The runtime performance profile then shows
the ratios on the horizontal axis and, on the vertical axis, for each solution approach, the fraction of
instances with a ratio that is less than or equal to the ratio on the horizontal axis. This implies
that values in the upper left-hand corner of the graph indicate the best performance.
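The construction of a runtime performance profile can be sketched as follows, assuming the usual convention that the curve at a ratio τ reports the fraction of instances whose runtime ratio is at most τ. The function name `performance_profile` and the runtimes are hypothetical; instances that hit the time limit are not treated specially in this sketch.

```python
def performance_profile(runtimes, taus):
    """Cumulative distribution of runtime ratios for each solution approach.

    runtimes -- dict mapping approach name -> list of runtimes, one per instance
                (all lists use the same instance order)
    taus     -- ratios at which the profile is evaluated
    """
    approaches = list(runtimes)
    n_instances = len(runtimes[approaches[0]])
    # Minimum runtime over all approaches, per instance.
    best = [min(runtimes[s][k] for s in approaches) for k in range(n_instances)]
    profile = {}
    for s in approaches:
        ratios = [runtimes[s][k] / best[k] for k in range(n_instances)]
        profile[s] = [sum(1 for r in ratios if r <= tau) / n_instances
                      for tau in taus]
    return profile


# Hypothetical runtimes in seconds for two formulations on four instances.
hypothetical_runtimes = {
    "S": [4.0, 8.0, 2.0, 10.0],
    "N": [1.0, 2.0, 2.0, 5.0],
}
profile = performance_profile(hypothetical_runtimes, taus=[1, 2, 4])
```

In this made-up data, formulation N is never slower than S, so its curve sits at 1.0 from τ = 1 onward, that is, in the upper left-hand corner.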
Clearly, the new formulation performs much better than the other formulations; in most cases, it is
about 4 times faster. The reason for this improvement is the introduction of the two valid inequalities
defining upper bounds on the expected utility values of the players. These inequalities are exploited by
the solution techniques embedded in CPLEX to find feasible solutions, which is clearly beneficial. The
payoff-based valid inequalities can also remove fractional solutions in situations where both negative
and positive payoffs exist for at least one of the players. In these situations, either $M_1 - u_1^{\max} > 0$ or
$M_2 - u_2^{\max} > 0$ or both. Consequently, when we add the payoff-based valid inequalities, we implicitly
define a tighter bound for variables of the form $r_1(a_1)$ or $r_2(a_2)$. To see this, suppose that $M_1 - u_1^{\max} > 0$
and consider the constraints of the form $\sum_{a_2 \in A_2} p_2(a_2)\, u_1(a_1, a_2) + r_1(a_1) = u_1^E$. It is clear that if
$\sum_{a_2 \in A_2} p_2(a_2)\, u_1(a_1, a_2) > 0$ and $u_1^E \le u_1^{\max}$ is added then $r_1(a_1)$ cannot be freely set to any value less
than or equal to $M_1$.
CPLEX allows users to tune the performance by adjusting parameters. One of these parameters is the
MIP emphasis switch, which “controls trade-offs between speed, feasibility, optimality, and moving
bounds in MIP”. Its default setting balances between optimality and feasibility. However, because
our interest is finding a feasible solution, i.e., a single Nash equilibrium, as quickly as possible, we
experimented with setting the parameter to emphasize feasibility over optimality. We found that
the resulting performance profile is similar, i.e., the relative performance of the formulations does not
change, but the runtime increased significantly for all formulations (almost doubled).
Figure 1: Performance profile of the runtime of three formulations (S, N and W). Horizontal axis: ratio of runtime to the minimum runtime; vertical axis: ratio of solved instances.
4 Biobjective Mixed Integer Programming and Bimatrix
Games
As mentioned in the introduction, in addition to developing faster mixed integer programming approaches for finding a single Nash equilibrium, we are interested in developing biobjective integer
programming approaches for computing a desirable subset of Nash equilibria. The latter is the focus
of this section. We start by introducing the concept of efficient (or Pareto optimal) Nash equilibria.
Again, we note that the traditional concept of “Pareto optimality”, which is widely used in the
literature (see for instance [26]), is different from the concept that we will formally define in this
section. The existing concept defines Pareto optimality over the set of all possible mixed strategy
profiles. In other words, a mixed strategy profile is Pareto optimal if it is impossible to improve the
utility value of at least one of the players without a deterioration in the utility value of any of the other
players.
Based on this definition, there is no relationship between Pareto optimal mixed strategy profiles
and Nash equilibria. In other words, a Pareto optimal mixed strategy profile is not necessarily a Nash
equilibrium. For example, consider an instance of the prisoner’s dilemma game (see Table 2). There
are two prisoners (P1 and P2) and two actions are available to each player: confessing (C) and not
confessing (N). If both prisoners confess, they go to jail for 4 years. If only one of them confesses, the
one who confessed will be released and the other will go to jail for 8 years. If neither of them confesses,
both of them will go to jail for only 1 year. The only Nash equilibrium of this game is the case where
both players confess. However, it is not Pareto optimal, because, for example, when neither player
confesses, both players obtain a higher payoff.
Table 2: Prisoner’s dilemma
                    P2
Strategy        N         C
P1      N     -1,-1     -8,0
        C     0,-8      -4,-4
Note that because our goal is to compute a (desirable) subset of all Nash equilibria, using the
traditional definition of Pareto optimality is not appropriate, because a Pareto optimal mixed strategy
profile is not necessarily a Nash equilibrium. Therefore, we define the concept of Pareto optimality
or efficiency over the set of all Nash equilibria rather than over the set of all possible mixed strategy
profiles.
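The prisoner's dilemma argument can be checked mechanically for pure action profiles. The sketch below uses the payoffs of Table 2, tests every action pair for profitable unilateral deviations, and then lists the action profiles whose payoffs dominate those of the equilibrium. It covers pure strategies only, and the helper names are assumptions made for illustration.

```python
ACTIONS = ["N", "C"]
# Payoffs (P1, P2) from Table 2, keyed by (action of P1, action of P2).
PAYOFFS = {
    ("N", "N"): (-1, -1),
    ("N", "C"): (-8, 0),
    ("C", "N"): (0, -8),
    ("C", "C"): (-4, -4),
}


def is_pure_nash(a1, a2):
    """True if neither player gains from a unilateral deviation."""
    u1, u2 = PAYOFFS[(a1, a2)]
    no_deviation_1 = all(PAYOFFS[(d, a2)][0] <= u1 for d in ACTIONS)
    no_deviation_2 = all(PAYOFFS[(a1, d)][1] <= u2 for d in ACTIONS)
    return no_deviation_1 and no_deviation_2


def dominates(x, y):
    """x is at least as good as y for every player and differs from y."""
    return all(a >= b for a, b in zip(x, y)) and tuple(x) != tuple(y)


pure_nash = [(a1, a2) for a1 in ACTIONS for a2 in ACTIONS if is_pure_nash(a1, a2)]
dominating_profiles = [profile for profile in PAYOFFS
                       if dominates(PAYOFFS[profile], PAYOFFS[("C", "C")])]
```

The only pure equilibrium is mutual confession, and it is dominated by mutual silence, which is not an equilibrium. This is exactly why the efficiency concept is restricted to the set of Nash equilibria.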
Let E denote the set of all Nash equilibria. Moreover, let the feasible set in the payoff space, U, be
the image of E under the vector-valued function u = {u1 , · · · , un }, i.e., U := u(E) := {y ∈ R^n : y = u(p)
for some p ∈ E}.
Definition 8. A Nash equilibrium p′ ∈ E is called weakly efficient, if there is no other Nash equilibrium
p ∈ E such that ui (p) > ui (p′) for i = 1, · · · , n. If p′ is weakly efficient, then u(p′) is called a weakly
nondominated Nash point.
Definition 9. A Nash equilibrium p′ ∈ E is called efficient or Pareto optimal, if there is no other Nash
equilibrium p ∈ E such that ui (p) ≥ ui (p′) for i = 1, · · · , n and u(p) ≠ u(p′). If p′ is efficient, then u(p′)
is called a nondominated Nash point. The set of all efficient Nash equilibria p′ ∈ E is denoted by $E_E$.
The set of all nondominated Nash points u(p′) ∈ U for some p′ ∈ $E_E$ is denoted by $U_N$ and referred to
as the nondominated Nash frontier or efficient Nash frontier.
Definition 10. A Nash equilibrium p′ ∈ E is called ideal if it simultaneously maximizes the expected
utility of all players. If p′ is ideal, then $u^I$ = u(p′) is called the ideal Nash point.
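Definitions 8 to 10 can be illustrated on a finite list of equilibrium payoff vectors. The sketch below filters such a list into weakly nondominated points, nondominated points, and (if it exists) the ideal point. The function names and the payoff vectors are hypothetical; in particular, the list is assumed to be given and is not computed from a game here.

```python
def dominates(x, y):
    """x >= y in every coordinate and x differs from y (Definition 9)."""
    return all(a >= b for a, b in zip(x, y)) and tuple(x) != tuple(y)


def strictly_dominates(x, y):
    """x > y in every coordinate (Definition 8)."""
    return all(a > b for a, b in zip(x, y))


def nondominated_points(points):
    unique = sorted(set(points))
    return [y for y in unique if not any(dominates(x, y) for x in unique)]


def weakly_nondominated_points(points):
    unique = sorted(set(points))
    return [y for y in unique if not any(strictly_dominates(x, y) for x in unique)]


def ideal_point(points):
    """For a finite list, an ideal point exists iff the frontier is a single point."""
    frontier = nondominated_points(points)
    return frontier[0] if len(frontier) == 1 else None


# Hypothetical equilibrium payoff vectors (illustration only).
payoffs = [(2.0, 1.0), (1.0, 2.0), (0.5, 0.5), (1.0, 1.0)]
frontier = nondominated_points(payoffs)
weak_frontier = weakly_nondominated_points(payoffs)
```

Here (1.0, 1.0) is weakly nondominated but not nondominated, since (2.0, 1.0) is at least as good for both players and strictly better for one.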
To compute the set of all efficient Nash equilibria or nondominated Nash points of a normal form
game with two players, the following BOMIP must be solved,
\[
\max_{p \in E} \; u^E(p) := \{ u_1^E(p),\, u_2^E(p) \}.
\]
We sometimes refer to this formulation as BOMIP-NFG2. Note that E is the set of all Nash
equilibria and can be expressed by the constraints of the MIP introduced in Section 3.
Theorem 11. The number of nondominated Nash points of a normal form game with two players
({1, 2}, A, u) is finite.
Proof. Let supp(pi ) := {ai ∈ Ai : pi (ai ) > 0} be the support of a mixed strategy pi ∈ Pi of player
i ∈ {1, 2}. We observe that supp(pi ) ⊆ Ai and supp(pi ) ≠ ∅. We denote by Si the set of all possible
supp(pi ) where pi ∈ Pi and i ∈ {1, 2}. It is evident that since the game is finite (in terms of the number
of players and the number of action profiles), Si is also finite,
\[
|S_i| = \sum_{l=1}^{|A_i|} \binom{|A_i|}{l} = 2^{|A_i|} - 1.
\]
Next, consider a support vector (s1 , s2 ) ∈ S1 × S2 . We now try to simplify the BOMIP-NFG2 based
on (s1 , s2 ). A specific support vector corresponds to a specific set of values of the binary variables in
the BOMIP-NFG2, i.e., b1 (a1 ) = 1 for all a1 ∈ s1 , b1 (a1 ) = 0 for all a1 ∈ A1 \ s1 , b2 (a2 ) = 1 for all
a2 ∈ s2 , and b2 (a2 ) = 0 for all a2 ∈ A2 \ s2 . Note that this implies that we also have p1 (a1 ) = 0 for all
a1 ∈ A1 \ s1 , r1 (a1 ) = 0 for all a1 ∈ s1 , p2 (a2 ) = 0 for all a2 ∈ A2 \ s2 , and r2 (a2 ) = 0 for all a2 ∈ s2 . As
a consequence, the BOMIP-NFG2 reduces to a biobjective linear program (BOLP) since all the binary
variables are removed. It is easy to see that the constraints of the BOLP can be decomposed into two
independent sets, i.e.,
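Set 1:
\[
\begin{aligned}
& \sum_{a_2 \in A_2} p_2(a_2)\, u_1(a_1, a_2) + r_1(a_1) = u_1^E && \forall\, a_1 \in A_1 \\
& \sum_{a_2 \in A_2} p_2(a_2) = 1 \\
& r_1(a_1) \ge 0 && \forall\, a_1 \in A_1 \setminus s_1 \\
& r_1(a_1) = 0 && \forall\, a_1 \in s_1 \\
& p_2(a_2) > 0 && \forall\, a_2 \in s_2 \\
& p_2(a_2) = 0 && \forall\, a_2 \in A_2 \setminus s_2
\end{aligned}
\]
Set 2:
\[
\begin{aligned}
& \sum_{a_1 \in A_1} p_1(a_1)\, u_2(a_1, a_2) + r_2(a_2) = u_2^E && \forall\, a_2 \in A_2 \\
& \sum_{a_1 \in A_1} p_1(a_1) = 1 \\
& r_2(a_2) \ge 0 && \forall\, a_2 \in A_2 \setminus s_2 \\
& r_2(a_2) = 0 && \forall\, a_2 \in s_2 \\
& p_1(a_1) > 0 && \forall\, a_1 \in s_1 \\
& p_1(a_1) = 0 && \forall\, a_1 \in A_1 \setminus s_1 .
\end{aligned}
\]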
If the support vector (s1 , s2 ) does not result in a Nash equilibrium, then the BOLP is infeasible. If
the support vector does result in a Nash equilibrium, then the BOLP must be feasible. However, we
know that the constraints of the BOLP are decomposed into two independent sets. This implies that
$u_1^E$ and $u_2^E$ can be maximized simultaneously, and thus there exists a single, ideal Nash point for the
BOLP. So, since there is a finite number of support vectors, i.e., S1 × S2 is finite, there is only a finite
number of nondominated Nash points.
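The counting step of the proof can be verified directly for small action sets. The sketch below enumerates all nonempty supports of an action set and confirms that their number equals 2^{|A_i|} − 1, so the number of support vectors, and hence an upper bound on the number of nondominated Nash points implied by the proof, is the product of the two counts. The function name `supports` and the action labels are illustrative assumptions.

```python
from itertools import combinations


def supports(actions):
    """All nonempty subsets of an action set, i.e., all possible supports."""
    actions = list(actions)
    return [frozenset(subset)
            for size in range(1, len(actions) + 1)
            for subset in combinations(actions, size)]


# Hypothetical action sets for the two players.
A1 = ["a", "b", "c"]
A2 = ["x", "y"]

S1 = supports(A1)
S2 = supports(A2)
# Each support vector (s1, s2) contributes at most one nondominated Nash point.
support_vector_bound = len(S1) * len(S2)
```

The bound grows exponentially with the number of actions, so it establishes finiteness rather than a practical enumeration scheme.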
Theorem 11 is a critical result that shows that even though a bimatrix game may have an infinite
number of efficient Nash equilibria, the set of nondominated Nash points is always finite. So, we can
compute one efficient Nash equilibrium corresponding to each nondominated Nash point to construct
a desirable subset of Nash equilibria.
It is worth mentioning that determining the nondominated frontier of a BOMIP is not easy in
general. The primary reason for this is that the nondominated frontier may have continuous segments,
i.e., parts in which all points of a line segment are nondominated. However, when a BOMIP has a finite
number of nondominated points (and the nondominated frontier does not have continuous segments),
then several effective algorithms for computing the nondominated frontier exist. One of these algorithms
is the balanced box method (see Boland et al. [5]), which we will use in our computational study.
As observed earlier, multiple Nash equilibria may exist for a single nondominated Nash point. The
balanced box method returns only one of them. Thus, when the algorithm terminates, it reports the set
of nondominated Nash points and one efficient Nash equilibrium for each of them. It is worth noting,
too, that the balanced box method can be customized to exploit the structure of the payoff-based valid
inequalities (see Section 3), which in many cases decreases the runtime of the algorithm substantially.
Detailed performance statistics of the balanced box method can be found in Tables 3 to 7, i.e.,
the number of nondominated Nash points (#NDP ), the time required to determine the efficient Nash
frontier (Time (sec.)) and the number of instances for which the complete efficient Nash frontier was
not determined within the time limit of 1800 seconds (#NS ).
Observe that the number of points in the efficient Nash frontier is very small and that in more
than half of the instances the efficient Nash frontier consists of a single point, i.e., there is an ideal
point. Moreover, the time required to determine the complete nondominated frontier is quite small (by
using CPLEX). For example, for the largest classes of instances, the complete nondominated frontier is
determined in about 5 minutes. For only 22 out of 500 instances the complete nondominated frontier
could not be determined in 30 minutes of computing; most of these belong to subclasses D1 and D5.
The payoff values of instances in subclasses D1 and D5 are symmetric, which causes the performance
of the single objective commercial MIP solvers to deteriorate.
Table 3: Overall results of the balanced box method on class C20.
Subclass    #NDP                  Time (sec.)                      #NS
            Max    Avg    Min     Max        Avg        Min
D1          3      1.4    1       0.87       0.40       0.06       0
D2          3      1.4    1       0.18       0.06       0.01       0
D3          1      1      1       0.05       0.02       0.00       0
D4          2      1.2    1       0.12       0.08       0.06       0
D5          3      1.6    1       25.92      9.46       2.51       0
D6          2      1.4    1       1.08       0.61       0.14       0
D7          3      2      1       1.21       0.73       0.37       0
D8          3      2      1       1.04       0.63       0.36       0
D9          4      2      1       1.63       0.54       0.08       0
D10         3      2      1       1.01       0.80       0.19       0
D11         1      1      1       0.16       0.14       0.11       0
D12         4      2.4    1       1.80       0.75       0.16       0
D13         4      2.6    1       1.14       0.64       0.24       0
D14         3      2.2    1       0.69       0.48       0.36       0
D15         5      1.8    1       2.30       0.62       0.08       0
D16         4      2.2    1       1.14       0.62       0.20       0
D17         1      1      1       0.07       0.05       0.02       0
D18         1      1      1       0.10       0.06       0.04       0
D19         1      1      1       0.02       0.01       0.00       0
D20         1      1      1       0.05       0.04       0.02       0
Avg         2.60   1.61   1.00    2.03       0.84       0.25       0.00
Table 4: Overall results of the balanced box method on class C40.
           #NDP                 Time (sec.)
Subclass   Max    Avg    Min    Max       Avg       Min       #NS
D1         1      1      1      2.12      1.50      0.96      0
D2         1      1      1      0.20      0.14      0.05      0
D3         1      1      1      0.09      0.05      0.03      0
D4         2      1.4    1      0.56      0.40      0.23      0
D5         0      0      0      1801.00   1800.99   1800.98   5
D6         3      1.8    1      3.62      2.41      0.96      0
D7         4      2.4    1      39.15     12.31     1.10      0
D8         6      2.6    1      17.08     7.53      1.45      0
D9         5      3      1      24.37     11.77     3.20      0
D10        4      2.4    1      33.49     9.97      1.80      0
D11        1      1      1      0.18      0.13      0.07      0
D12        4      2      1      39.43     10.57     1.04      0
D13        6      2.6    1      20.35     8.72      2.22      0
D14        4      2.4    1      62.03     17.07     0.74      0
D15        4      2.2    1      55.34     21.55     3.09      0
D16        4      2.2    1      25.07     8.36      2.45      0
D17        1      1      1      0.15      0.12      0.07      0
D18        1      1      1      0.20      0.09      0.04      0
D19        1      1      1      0.12      0.05      0.01      0
D20        1      1      1      0.18      0.14      0.12      0
Avg        2.70   1.65   0.95   106.24    95.69     91.03     0.25
Table 5: Overall results of the balanced box method on class C60.
           #NDP                 Time (sec.)
Subclass   Max    Avg    Min    Max       Avg       Min       #NS
D1         1      1      1      4.93      1.83      0.01      0
D2         2      1.2    1      1.03      0.73      0.41      0
D3         1      1      1      0.52      0.22      0.09      0
D4         2      1.6    1      1.24      0.99      0.70      0
D5         2      0.6    0      1801.00   1184.38   128.06    3
D6         4      2.4    1      86.82     46.99     7.52      0
D7         5      2      1      112.76    27.26     2.37      0
D8         6      3.2    1      118.94    43.75     2.68      0
D9         3      1.8    1      145.10    54.20     11.88     0
D10        2      1.6    1      10.89     5.77      1.99      0
D11        1      1      1      0.54      0.45      0.35      0
D12        3      1.8    1      45.63     18.12     3.99      0
D13        3      1.8    1      33.75     13.34     3.29      0
D14        5      3.2    2      98.30     34.12     4.29      0
D15        8      4      2      101.15    44.91     3.92      0
D16        5      3      2      78.50     37.36     4.66      0
D17        1      1      1      0.33      0.24      0.22      0
D18        3      1.6    1      0.75      0.31      0.06      0
D19        1      1      1      0.39      0.16      0.02      0
D20        1      1      1      0.35      0.27      0.21      0
Avg        2.95   1.79   1.10   132.15    75.77     8.84      0.15
Table 6: Overall results of the balanced box method on class C80.
           #NDP                 Time (sec.)
Subclass   Max    Avg    Min    Max       Avg       Min       #NS
D1         3      1.2    0      1800.99   1188.34   26.55     3
D2         1      1      1      3.39      1.25      0.23      0
D3         2      1.2    1      2.00      1.07      0.12      0
D4         3      1.6    1      3.93      2.44      1.38      0
D5         4      1.8    0      1800.98   772.01    16.00     2
D6         3      1.8    1      99.20     31.63     3.53      0
D7         8      3.8    1      788.14    250.41    12.45     0
D8         5      2.6    1      131.31    60.05     6.11      0
D9         4      2.8    1      217.55    84.38     4.24      0
D10        3      1.6    1      113.52    56.85     16.41     0
D11        1      1      1      0.86      0.55      0.37      0
D12        4      2.4    1      1203.91   298.55    12.70     0
D13        3      2      1      157.04    54.58     8.24      0
D14        3      2      0      1800.95   405.83    15.68     1
D15        5      3      1      204.61    91.04     29.93     0
D16        4      2.6    1      281.55    90.20     14.38     0
D17        2      1.2    1      0.78      0.52      0.40      0
D18        1      1      1      0.37      0.26      0.07      0
D19        1      1      1      1.46      0.40      0.04      0
D20        1      1      1      0.53      0.44      0.27      0
Avg        3.05   1.83   0.85   430.65    169.54    8.46      0.30
Table 7: Overall results of the balanced box method on class C100.
           #NDP                 Time (sec.)
Subclass   Max    Avg    Min    Max       Avg       Min       #NS
D1         1      0.2    0      1800.99   1442.38   8.17      4
D2         1      1      1      3.69      1.44      0.28      0
D3         1      1      1      2.95      1.06      0.12      0
D4         4      2.6    1      10.58     6.55      2.56      0
D5         3      2.4    2      15.18     10.00     6.74      0
D6         3      1.6    1      714.78    224.47    27.97     0
D7         6      2.6    0      1800.96   673.02    32.91     1
D8         3      1.8    1      943.14    216.09    15.84     0
D9         4      2.6    1      530.01    234.17    7.43      0
D10        6      3.8    2      909.26    332.86    22.67     0
D11        2      1.2    1      1.06      0.74      0.46      0
D12        5      3.2    1      1055.44   350.64    74.62     0
D13        6      2.6    0      1800.97   717.26    79.09     1
D14        8      3.4    1      1610.19   370.83    16.57     0
D15        5      3      2      1707.69   581.36    34.14     0
D16        3      1.4    0      1800.98   953.23    108.79    2
D17        2      1.2    1      1.30      0.74      0.44      0
D18        1      1      1      0.99      0.61      0.17      0
D19        1      1      1      0.37      0.25      0.09      0
D20        1      1      1      1.00      0.86      0.73      0
Avg        3.30   1.93   0.95   735.58    305.93    21.99     0.40
5
Bargaining problem and the Nash Efficient Frontier
In this section, we discuss how knowledge of the efficient Nash frontier can be used in a bargaining
problem (or bargaining game) to define the disagreement point.
A bargaining problem is a cooperative game in which players agree to create a grand coalition instead
of competing with each other to get a higher payoff [25]. A fundamental question in this context is
what the payoff of each player should be in a grand coalition. One solution to a bargaining problem was
proposed by Nash and has become known as the Nash bargaining solution [18, 20]. Next, we explain
the Nash bargaining solution when there are only two players.
Let Y be the set of feasible points in the (two-dimensional) payoff space. In fact, set Y contains all
possible expected utility values of players. We assume that Y is compact and convex. Let $Y_N$ be the
set of nondominated points of Y, i.e., if $y \in Y_N$, then there exists no point $y' \in Y$ such that $y'_1 \ge y_1$,
$y'_2 \ge y_2$, and $y' \ne y$, and let $d = (d_1, d_2) \in Y$ be the disagreement point, representing the payoff for
each of the players if no grand coalition is created.
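The dominance relation used in this definition can be illustrated on a finite set of points. The following minimal Python sketch assumes the points are given as a list of (y1, y2) pairs; the function name `nondominated` and the sample points are hypothetical and chosen only for illustration. The set Y itself is a continuum, so the sketch applies only to finite sets such as the payoff vectors of a finite list of equilibria.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def nondominated(points: List[Point]) -> List[Point]:
    """Keep each y for which no other y' has y'_1 >= y_1, y'_2 >= y_2 and y' != y."""
    result = []
    for y in points:
        dominated = any(
            other != y and other[0] >= y[0] and other[1] >= y[1]
            for other in points
        )
        if not dominated:
            result.append(y)
    return result

# Hypothetical sample: (2, 2) is dominated by (2, 4), and (1, 4) by (1, 5).
sample = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (2.0, 2.0), (1.0, 4.0)]
print(nondominated(sample))  # [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0)]
```

The pairwise comparison is quadratic in the number of points, which is adequate for the handful of points that typically make up an efficient Nash frontier.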
We note that determining the disagreement point of a bargaining problem is, in general, non-trivial
[8]. One reason is that when negotiations break down, players may decide to forgo (some of their) own
rewards and try to punish the other player, e.g., by minimizing the maximum expected payoff of their
opponent. But even if the players do not sacrifice their own rewards, determining the disagreement
point is not straightforward when there are multiple Nash equilibria.
Two classical axioms imposing restrictions on the solution to a Nash bargaining problem are
• Individual Rationality: None of the players accepts a payoff lower than the one which is guaranteed
to him under disagreement, i.e., $u^E_1 \ge d_1$ and $u^E_2 \ge d_2$.
• Pareto Optimality: The solution must be such that the payoff for a single player cannot be
increased without decreasing the payoff of the other player.
Let $Y^* = \{u^E \in Y : u^E_1 \ge d_1,\ u^E_2 \ge d_2\}$ and $Y^*_N = \{u^E \in Y_N : u^E_1 \ge d_1,\ u^E_2 \ge d_2\}$. To satisfy the
classical axioms, a bargaining solution $u^*$ must be in $Y^*_N$. However, in general, $Y^*_N$ still contains an
infinite number of points. Nash introduced three additional axioms:
• Symmetry: If $Y^*$ is symmetric, i.e., for any vector $(y, y') \in Y^*$, the vector $(y', y)$ is also in $Y^*$,
then in a bargaining solution we must have $u^*_1 = u^*_2$.
• Linear Invariance: Let u∗ be a solution of bargaining game G. Moreover, let Ĝ be a bargaining
game obtained from G by an order-preserving linear transformation T of one player’s utility
function. The solution û∗ to the bargaining game Ĝ has to be the image of u∗ under T , i.e.,
û∗ = T u∗ .
• Independence of Irrelevant Alternatives: Let d be the disagreement point and u∗ be a solution of
bargaining game G. Moreover, let Ĝ be a bargaining game that is obtained from G by restricting
Y to Ŷ, i.e., Ŷ ⊂ Y. If d ∈ Ŷ and u∗ ∈ Ŷ, then u∗ must be the solution of Ĝ.
Nash [18] proved that there exists a unique solution $u^* = (u^*_1, u^*_2)$ to the bargaining problem that
satisfies all five axioms, namely
$$u^* = \arg\max\{(u^E_1 - d_1)(u^E_2 - d_2) : u^E \in Y,\ u^E_1 \ge d_1,\ u^E_2 \ge d_2\}. \tag{1}$$
Note that the solution depends on the choice of the disagreement point d.
Next, we discuss how Y and d can be defined for a bimatrix game. Since Y must contain the
expected utility values of players, Y can be defined as the set of convex combinations of outcomes in
the bimatrix game, i.e.,
$$Y = \Big\{(u^E_1, u^E_2) \in \mathbb{R}^2 : u^E_1 = \sum_{a \in A} \lambda(a)\,u_1(a),\ u^E_2 = \sum_{a \in A} \lambda(a)\,u_2(a),\ \sum_{a \in A} \lambda(a) = 1,\ \lambda(a) \ge 0\ \ \forall a \in A\Big\}.$$
When a game has a unique Nash equilibrium, that equilibrium can be chosen as the disagreement
point. Unfortunately, when there are multiple Nash equilibria, there is no obvious way to choose
d. However, since the points on the efficient Nash frontier represent points with maximal payoffs for
each player under competition, they are the natural candidates for the disagreement point. Thus, if
the efficient Nash frontier consists of a single point, then that point can be chosen as the disagreement
point. However, as the results of our computational experiments in the previous section show, in many
games the efficient Nash frontier consists of more than one point (implying that it is not possible to
maximize both players’ payoffs simultaneously under competition). Because all points on the Nash
efficient frontier have the same chance of being the final outcome of the game under competition, the
expected best payoff of the game under competition is the average of the utility values of the players
over the points on the Nash efficient frontier. Therefore, one option is to set the disagreement point to
be equal to
$$(d_1, d_2) = \left(\frac{\sum_{u^E \in U_N} u^E_1}{|U_N|},\ \frac{\sum_{u^E \in U_N} u^E_2}{|U_N|}\right).$$
Because d is the expected best payoff of the game under competition, if creating a coalition results
in a greater payoff for the players, they have an incentive to collaborate. (Note that with this choice
of disagreement point, the payoff for each player in the Nash bargaining solution will be at least the
expected best payoff under competition.)
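The averaging rule for the disagreement point amounts to a coordinate-wise mean over the nondominated Nash points. As a minimal Python sketch, the utilities below are the three nondominated Nash points of the example reported in Table 9; the function name `average_disagreement_point` is hypothetical.

```python
# Utility values (u1, u2) of the three nondominated Nash points in Table 9.
nash_points = [(-23.626, 146.228), (7.157, 72.371), (77.995, 67.114)]

def average_disagreement_point(points):
    """Average each player's utility over the given nondominated Nash points."""
    n = len(points)
    d1 = sum(u1 for u1, _ in points) / n
    d2 = sum(u2 for _, u2 in points) / n
    return d1, d2

d = average_disagreement_point(nash_points)
print(d)
```

Up to rounding of the tabulated utilities, the result agrees with the disagreement point d = (20.508, 95.238) used in the example.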
Next, we illustrate the Nash bargaining solution that results from this choice of disagreement point
by means of a small example. We generated a game with |A1 | = |A2 | = 6 using the RandomGame
distribution of GAMUT with the utility values of both players restricted to be in the interval [−150, 150].
The payoff matrix of this game is shown in Table 8.
This game has three nondominated Nash points. These points together with an associated Nash
equilibrium are shown in Table 9.
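The utility of a nondominated Nash point can be recomputed from the mixed strategies of its associated equilibrium by weighting each joint action with the product of the two players' probabilities. The Python sketch below does this for NDP 3, whose equilibrium in Table 9 places positive probability only on actions a4 and a6 for both players, so only four entries of Table 8 are needed. The function and variable names are hypothetical.

```python
# Entries of Table 8 restricted to actions a4 and a6 of both players.
# payoff[(i, j)] = (utility of Player 1, utility of Player 2) when
# Player 1 plays action i and Player 2 plays action j.
payoff = {
    ("a4", "a4"): (63.984, 145.307),
    ("a4", "a6"): (100.871, 61.205),
    ("a6", "a4"): (81.699, -125.064),
    ("a6", "a6"): (71.947, 81.638),
}

# Mixed strategies of NDP 3 from Table 9 (zero-probability actions omitted).
p1 = {"a4": 0.711, "a6": 0.289}
p2 = {"a4": 0.620, "a6": 0.380}

def expected_utilities(payoff, p1, p2):
    """Expected utilities of both players under independent mixed strategies."""
    u1 = u2 = 0.0
    for i, prob_i in p1.items():
        for j, prob_j in p2.items():
            weight = prob_i * prob_j
            u1 += weight * payoff[(i, j)][0]
            u2 += weight * payoff[(i, j)][1]
    return u1, u2

u1, u2 = expected_utilities(payoff, p1, p2)
print(round(u1, 2), round(u2, 2))
```

Because the probabilities in Table 9 are rounded to three decimals, the computed values match the reported utilities (77.995, 67.114) only to within a few hundredths.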
Now suppose that the players agree to create a grand coalition. The proposed disagreement point is
d = (20.508, 95.238). As a result, the Nash bargaining solution for this game is u∗ = (63.984, 145.307),
which implies both players should choose action a4 . We see that, in this example, with cooperation
each player can obtain a payoff that is close to its maximum possible payoff under competition (and
significantly larger than the expected best payoff under competition).
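The Nash bargaining solution of this example can be checked numerically. The sketch below is one simple way to do so and is not the procedure used in the paper: since the product in (1) is nondecreasing in each coordinate over the region u ≥ d, a maximizer lies on the boundary of Y, and every boundary edge of Y joins two outcomes of the bimatrix game. It therefore suffices to maximize the product in closed form on each segment between two outcomes. The payoff pairs are transcribed from Table 8; all function and variable names are hypothetical.

```python
from itertools import combinations

# Utility pairs (u1, u2) transcribed from Table 8.
# Rows: actions a1..a6 of Player 1; columns: actions a1..a6 of Player 2.
TABLE_8 = [
    [(20.846, -33.357), (58.954, -50.672), (54.404, 139.851),
     (-124.681, 147.785), (-23.458, -42.076), (110.561, -60.093)],
    [(-115.218, 123.302), (-128.268, -126.595), (-6.944, -27.046),
     (-125.045, 12.209), (120.311, -31.287), (105.222, -122.293)],
    [(-53.927, 28.03), (124.872, -115.63), (-91.372, -121.989),
     (-93.674, -46.08), (-2.735, -17.662), (-100.213, -147.231)],
    [(94.957, 1.705), (54.368, -97.699), (-91.275, 150.0),
     (63.984, 145.307), (-82.081, -77.016), (100.871, 61.205)],
    [(12.604, -45.033), (-150.0, 40.32), (-54.68, 131.402),
     (-100.583, -70.608), (41.558, -84.65), (17.320, 59.588)],
    [(-7.921, 147.105), (88.603, -63.644), (-121.96, -137.767),
     (81.699, -125.064), (-121.368, -96.295), (71.947, 81.638)],
]
OUTCOMES = [u for row in TABLE_8 for u in row]

def best_on_segment(p, q, d):
    """Maximize (u1 - d1)(u2 - d2) over u = p + t (q - p), t in [0, 1], u >= d."""
    a, b = p[0] - d[0], q[0] - p[0]
    c, e = p[1] - d[1], q[1] - p[1]
    lo, hi = 0.0, 1.0
    # Intersect [0, 1] with the constraints a + b t >= 0 and c + e t >= 0.
    for const, slope in ((a, b), (c, e)):
        if slope > 0:
            lo = max(lo, -const / slope)
        elif slope < 0:
            hi = min(hi, -const / slope)
        elif const < 0:
            return None
    if lo > hi:
        return None
    candidates = [lo, hi]
    if b * e != 0:
        t_star = -(a * e + b * c) / (2 * b * e)  # stationary point of the quadratic
        if lo <= t_star <= hi:
            candidates.append(t_star)
    t = max(candidates, key=lambda s: (a + b * s) * (c + e * s))
    point = (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
    return (a + b * t) * (c + e * t), point

def nash_bargaining_solution(outcomes, d):
    """Best point over all segments between pairs of outcomes, or None if infeasible."""
    best = None
    for p, q in combinations(outcomes, 2):
        result = best_on_segment(p, q, d)
        if result is not None and (best is None or result[0] > best[0]):
            best = result
    return None if best is None else best[1]

u_star = nash_bargaining_solution(OUTCOMES, (20.508, 95.238))
print(u_star)
```

Under these assumptions the sketch should return a point that agrees with u∗ = (63.984, 145.307) up to floating-point error. Other disagreement points can be examined by passing a different d.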
Table 8: Utility values of the example.
(Rows: actions of Player 1; columns: actions of Player 2.)
      a1                    a2                    a3                    a4                    a5                    a6
a1    (20.846,-33.357)      (58.954,-50.672)      (54.404,139.851)      (-124.681,147.785)    (-23.458,-42.076)     (110.561,-60.093)
a2    (-115.218,123.302)    (-128.268,-126.595)   (-6.944,-27.046)      (-125.045,12.209)     (120.311,-31.287)     (105.222,-122.293)
a3    (-53.927,28.03)       (124.872,-115.63)     (-91.372,-121.989)    (-93.674,-46.08)      (-2.735,-17.662)      (-100.213,-147.231)
a4    (94.957,1.705)        (54.368,-97.699)      (-91.275,150)         (63.984,145.307)      (-82.081,-77.016)     (100.871,61.205)
a5    (12.604,-45.033)      (-150,40.32)          (-54.68,131.402)      (-100.583,-70.608)    (41.558,-84.65)       (17.320,59.588)
a6    (-7.921,147.105)      (88.603,-63.644)      (-121.96,-137.767)    (81.699,-125.064)     (-121.368,-96.295)    (71.947,81.638)

Table 9: Nondominated Nash utility points of the example.
          NDP 1               NDP 2               NDP 3
Action    p1        p2        p1        p2        p1        p2
a1        0.372     0.000     0.000     0.000     0.000     0.000
a2        0.000     0.000     0.000     0.000     0.000     0.000
a3        0.000     0.564     0.000     0.366     0.000     0.000
a4        0.628     0.436     0.730     0.634     0.711     0.620
a5        0.000     0.000     0.000     0.000     0.000     0.000
a6        0.000     0.000     0.270     0.000     0.289     0.380
Utility   -23.626   146.228   7.157     72.371    77.995    67.114

[Figure 2 omitted: plot in the (uE1, uE2) payoff space; legend entries: Convex Hull of Y, Status Quo, Nash Solution (u∗), Nondominated Points (NDP 1, NDP 2, NDP 3).]
Figure 2: Representation of set Y and the Nash bargaining solution of the example.
An illustration of set Y, of the Nash bargaining solution, and the three nondominated Nash points
of the example can be found in Figure 2. An examination of the figure reveals the following:
• The Nash bargaining solution does not change if we choose NDP 2 as disagreement point, i.e.,
u∗ = (63.984, 145.307). This is easily seen after recognizing that the area of the rectangle defined
by d and u∗ has to be maximal in a solution to optimization problem (1);
• If we choose NDP 1 as the disagreement point, then the Nash bargaining solution is (4.944, 147.092).
That is, the expected utility value of the second player will be slightly larger than 145.307, but
the expected utility value of the first player will be significantly smaller than 63.984. To achieve
u∗ = (4.944, 147.092), players should choose actions (a4 , a4 ) with probability 0.62 and actions
(a4 , a3 ) with probability 0.38; and
• If we choose NDP 3 as the disagreement point, then the Nash bargaining solution is (88.137, 90.238).
That is, the expected utility value of the first player will be larger than 63.984, but the expected utility value of the second player will be significantly smaller than 145.307. To achieve
u∗ = (88.137, 90.238), players should choose actions (a4 , a4 ) with probability 0.345 and actions
(a4 , a6 ) with probability 0.655.
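The lotteries over joint actions quoted in the last two items are convex combinations of entries of Table 8, in the sense of the definition of Y. The following Python sketch recomputes the resulting expected utilities; the dictionary holds only the three entries of Table 8 that are needed, and the names are hypothetical.

```python
# Selected entries of Table 8, keyed by (Player 1 action, Player 2 action).
payoff = {
    ("a4", "a3"): (-91.275, 150.0),
    ("a4", "a4"): (63.984, 145.307),
    ("a4", "a6"): (100.871, 61.205),
}

def lottery_utilities(weights):
    """Expected utilities of a lottery {joint action: probability}."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "probabilities must sum to 1"
    u1 = sum(prob * payoff[a][0] for a, prob in weights.items())
    u2 = sum(prob * payoff[a][1] for a, prob in weights.items())
    return u1, u2

# Disagreement point NDP 1: reported solution (4.944, 147.092).
from_ndp1 = lottery_utilities({("a4", "a4"): 0.62, ("a4", "a3"): 0.38})
# Disagreement point NDP 3: reported solution (88.137, 90.238).
from_ndp3 = lottery_utilities({("a4", "a4"): 0.345, ("a4", "a6"): 0.655})
print(from_ndp1, from_ndp3)
```

Since the quoted probabilities are rounded, the recomputed utilities agree with the reported solutions only to within a few hundredths.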
We see that the combined payoff of both players in the Nash bargaining solution obtained when using
the proposed disagreement point is the largest (and it also best reflects the relative proportions of the
expected best payoffs).
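For reference, the combined payoffs follow directly from the bargaining solutions reported for each choice of disagreement point:

```latex
\begin{aligned}
\text{proposed } d \text{ (and NDP 2):} &\quad 63.984 + 145.307 = 209.291,\\
\text{NDP 1:} &\quad 4.944 + 147.092 = 152.036,\\
\text{NDP 3:} &\quad 88.137 + 90.238 = 178.375.
\end{aligned}
```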
6
Conclusion
We investigated the use of integer programming techniques for computing Nash equilibria of bimatrix
games. Our main contributions are: (1) by slightly modifying the mixed integer programming formulation
of Sandholm et al. [24], we are able to improve its runtime efficiency by a factor of four (on average)
and, as a result, to find a single Nash equilibrium more efficiently; (2) by focusing on the concept of
efficient Nash equilibria, we are able to exploit biobjective integer programming techniques to
construct a subset of Nash equilibria with certain desirable characteristics; and (3) by computing and
exploiting properties of nondominated Nash points, we are able to determine a disagreement point of
a bargaining problem.
We note that we have studied games in which each player wants to maximize a single objective.
In the literature, we also find multiobjective (multi-criteria) games in which each player wants to
(simultaneously) maximize one or more objectives. The definitions of Nash equilibria and efficiency in
this context [1, 6, 11, 21, 22, 27, 28] differ significantly from the ones used in this paper. A possible
topic for further research is investigating whether nondominated Nash points for such games can be
found using mixed integer programming techniques.
References
[1] Altman, E., Boulogne, T., El-Azouzi, R., Jiménez, T., and Wynter, L. (2006). A survey on networking games in telecommunications. Computers & Operations Research, 33(2):286–311.
[2] Audet, C., Belhaiza, S., and Hansen, P. (2006). Enumeration of all the extreme equilibria in
game theory: Bimatrix and polymatrix games. Journal of Optimization Theory and Applications,
129(3):349–372.
[3] Audet, C., Hansen, P., Jaumard, B., and Savard, G. (2001). Enumeration of all extreme equilibria
of bimatrix games. SIAM Journal on Scientific Computing, 23(1):323–328.
[4] Avis, D., Rosenberg, G. D., Savani, R., and von Stengel, B. (2010). Enumeration of Nash equilibria
for two-player games. Economic Theory, 42(1):9–37.
[5] Boland, N., Charkhgard, H., and Savelsbergh, M. (2015). A criterion space search algorithm for
biobjective integer programming: The balanced box method. INFORMS Journal on Computing,
27(4):735–754.
[6] Borm, P., Vermeulen, D., and Voorneveld, M. (2003). The structure of the set of equilibria for two
person multicriteria games. European Journal of Operational Research, 148(3):480–493.
[7] Chen, X. and Deng, X. (2006). Settling the complexity of two-player Nash equilibrium. 47th Annual
IEEE Symposium on Foundations of Computer Science, pages 261–272.
[8] Chun, Y. and Thomson, W. (1990). Nash solution and uncertain disagreement points. Games and
Economic Behavior, 2(3):213–223.
[9] Dickhaut, J. and Kaplan, T. (1991). A program for finding Nash equilibria. The Mathematica
Journal, 1:87–93.
[10] Dolan, E. D. and Moré, J. J. (2002). Benchmarking optimization software with performance
profiles. Mathematical Programming, 91(2):201–213.
[11] Ghose, D. (1991). A necessary and sufficient condition for Pareto-optimal security strategies in
multicriteria matrix games. Journal of Optimization Theory and Applications, 68(3):463–481.
[12] Jansen, M. J. M. (1981). Maximal Nash subsets for bimatrix games. Naval Research Logistics
Quarterly, 28(1):147–152.
[13] LaValle, S. M. (2006). Planning Algorithms. Cambridge University Press.
[14] Lemke, C. E. and Howson, J. T. (1964). Equilibrium points of bimatrix games. Journal of the
Society for Industrial and Applied Mathematics, 12(2):413–423.
[15] Mangasarian, O. L. (1964). Equilibrium points of bimatrix games. Journal of the Society for
Industrial and Applied Mathematics, 12(4):778–780.
[16] McKelvey, R. D. and McLennan, A. (1996). Computation of equilibria in finite games. In Handbook
of Computational Economics, pages 87–142. Elsevier.
[17] Mills, H. (1960). Equilibrium points in finite games. Journal of the Society for Industrial and
Applied Mathematics, 8(2):397–402.
[18] Nash, J. F. (1950). The bargaining problem. Econometrica, 18:155–162.
[19] Nash, J. F. (1951). Non-cooperative games. Annals of Mathematics, 54:286–295.
[20] Nash, J. F. (1953). Two person cooperative games. Econometrica, 21:128–140.
[21] Nishizaki, I. and Notsu, T. (2007). Nondominated equilibrium solutions of a multiobjective two-person nonzero-sum game and corresponding mathematical programming problem. Journal of Optimization Theory and Applications, 135(2):217–239.
[22] Nishizaki, I. and Notsu, T. (2008). Nondominated equilibrium solutions of a multiobjective two-person nonzero-sum game in extensive form and corresponding mathematical programming problem.
Journal of Global Optimization, 42(2):201–220.
[23] Porter, R., Nudelman, E., and Shoham, Y. (2008). Simple search methods for finding a Nash
equilibrium. Games and Economic Behavior, 63(2):642–662.
[24] Sandholm, T., Gilpin, A., and Conitzer, V. (2005). Mixed-integer programming methods for
finding Nash equilibria. In Proceedings of the 20th National Conference on Artificial Intelligence,
volume 2 of AAAI’05, pages 495–501, Pittsburgh, Pennsylvania. AAAI Press.
[25] Serrano, R. (2005). Fifty years of the Nash program 1953-2003. Investigaciones Economicas, pages
219–258.
[26] Shoham, Y. and Leyton-Brown, K. (2009). Multiagent Systems: Algorithmic, Game-Theoretic,
and Logical Foundations. Cambridge University Press.
[27] Voorneveld, M., Grahn, S., and Dufwenberg, M. (2000). Ideal equilibria in noncooperative multicriteria games. Mathematical Methods of Operations Research, 52(1):65–77.
[28] Zeleny, M. (1975). Games with multiple payoffs. International Journal of Game Theory, 4(4):179–
191.