Jiang Zun – A Xiang Qi Application
Herbert Lee
CS 491B
December 8, 2006
Abstract
Jiang Zun is a computerized version of the ancient Chinese game of Xiang Qi.
Using the Java programming language, the developers have created an interface for
playing this game, as well as a computer player based on the Minimax algorithm, which
is a brute-force algorithm that can be applied to all zero-sum, perfect-information, turn-based games, including Xiang Qi. The developers discuss every stage of the creation of
this application, from the user interface to the artificial intelligence portion, with special
emphasis being placed on the latter. The implementation of the minimax algorithm is
given special examination. In addition, the developers analyze their application and
reveal weaknesses in the implementation of the artificial intelligence portion, and offer
possible fixes to those flaws.
Introduction
Jiang Zun is a web-based application for the computerized play of Xiang Qi, or
Chinese Chess. Xiang Qi is a turn-based, two-player strategy game similar to traditional,
or Western, Chess. The object of both games is to trap the enemy leader: the king in Western Chess, or the
general in Xiang Qi. The name Jiang Zun comes from the Chinese expression for
checkmate. Many Xiang Qi pieces are analogous to those found in Western Chess, but
have different movement and attack patterns. Because of these differences, Xiang Qi
games are generally more quick-paced than Western Chess matches.
As in Western Chess, there are two players in the game, and the objective of the
game is to place the enemy's general in a position from which it cannot escape, known
as checkmate. The board, which is a nine by ten grid of points, is divided into two halves by a
river. On each side, there is a two by two block of squares (three by three points), centered horizontally and
placed at the furthest possible distance from the river, designated by a pair of diagonal
lines that form an X. This region is known as the castle, or fortress. There are seven
different types of pieces: soldier, cannon, chariot, horse, elephant, advisor, and general.
Similarly to chess, each piece has its own set of rules for movement. The general may
move one space horizontally or vertically. The advisor may move one space diagonally.
The cannon and chariot may move any number of spaces horizontally or vertically. The
horse moves horizontally or vertically one space, and then diagonally one space. The
elephant moves two spaces diagonally. No piece may jump over another piece with the
sole exception of the cannon, which must jump over exactly one piece in order to attack.
The general and his advisors must remain in the castle.
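The horse rule and the no-jumping rule described above can be illustrated with a minimal sketch. It is not taken from the Jiang Zun source: the class name HorseMoveSketch, the boolean occupancy array, and the coordinate convention (x from 0 to 8, y from 0 to 9) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class HorseMoveSketch {
    static final int COLUMNS = 9;  // assumed convention: x runs 0..8
    static final int ROWS = 10;    // assumed convention: y runs 0..9

    static boolean onBoard(int x, int y) {
        return x >= 0 && x < COLUMNS && y >= 0 && y < ROWS;
    }

    /**
     * Returns the points a horse at (x, y) can reach. occupied[x][y] is true
     * when any piece stands on that point. Whether the destination holds a
     * friendly piece is deliberately not checked in this sketch.
     */
    static List<int[]> destinations(boolean[][] occupied, int x, int y) {
        int[][] orthogonalSteps = { {1, 0}, {-1, 0}, {0, 1}, {0, -1} };
        List<int[]> result = new ArrayList<>();
        for (int[] step : orthogonalSteps) {
            int legX = x + step[0];
            int legY = y + step[1];
            // The horse cannot jump: an occupied first point blocks both continuations.
            if (!onBoard(legX, legY) || occupied[legX][legY]) {
                continue;
            }
            for (int turn = -1; turn <= 1; turn += 2) {
                // Continue one point diagonally, away from the starting point.
                int toX = legX + step[0] + (step[0] == 0 ? turn : 0);
                int toY = legY + step[1] + (step[1] == 0 ? turn : 0);
                if (onBoard(toX, toY)) {
                    result.add(new int[] {toX, toY});
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[][] occupied = new boolean[COLUMNS][ROWS];
        System.out.println(destinations(occupied, 1, 0).size()); // open board
        occupied[1][1] = true; // a piece directly in front of the horse
        System.out.println(destinations(occupied, 1, 0).size()); // forward moves blocked
    }
}
```

Checking the first orthogonal point before generating the two diagonal continuations is what distinguishes this horse from the knight in Western Chess, which can always jump.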
In addition to a client for hot-seat player versus player matches, Jiang Zun also
incorporates an artificial intelligence component. Players are able to test their skills
against a computer opponent. The computer opponent has one skill setting, and
implements the Minimax algorithm, which makes its next move by evaluating the board
position that it would be in after each of the moves available to it and making the move
which maximizes its board position. While not particularly sophisticated, the computer
opponent still provides a challenge to inexperienced to moderately skilled human players.
Xiang Qi has a game-tree complexity of ten to the power one hundred fifty, compared to
ten to the power one hundred twenty-three in Western chess.
We were motivated to develop this software for several reasons. While Xiang Qi
is a very popular board game, it lacks the international attention garnered by Western
Chess, especially in the artificial intelligence area. Because it is so similar to Western
Chess from a programming standpoint, there has not been a lot of effort to develop Xiang
Qi playing programs in the same way that there has been for Western Chess. Thus,
though the two games are very similar, there has not yet been a Xiang Qi program written
that can defeat the world's best Xiang Qi player. Because there is such a lamentable
dearth of computerized Xiang Qi efforts, creating software to help fill both these voids
was of interest to us. In addition, we were interested in general artificial intelligence, and
developing a computer program for Xiang Qi seemed a fascinating challenge.
Technological Background
The programming language used throughout the development of Jiang Zun is Java
2, Standard Edition 1.5.0. Java was used primarily for the ability to create applets, which
allow the software to be run from a web browser. This minimizes the effort required of the user to access
the software. Java was also used for its graphical user interface (GUI)
support, which is well defined and easy to use. The project was developed in the Eclipse
Software Developer’s Kit, a development environment for Java. Eclipse was chosen due
to its open-source nature, which was important since our project has a strict budgetary
limit.
Jiang Zun’s artificial intelligence component makes use of the minimax
algorithm. Sometimes referred to as min-max, the algorithm is best suited to two-player, perfect-information, zero-sum games. Perfect information means that there are no
unknown factors to either opponent. Zero-sum means that every gain by one player is
matched by an equivalent loss by the other player. Xiang Qi matches both of these
definitions, which makes minimax relatively easy to implement for Xiang Qi. The
minimax algorithm gets its name from minimum-maximum, which refers to the process
by which the algorithm finds the optimal move. The algorithm finds all of the possible
moves for one player, each of which generates a game state. The game states can be seen
as the nodes of a tree, while the moves are the paths between them. The algorithm
assumes that the opponent will always make the move that is best for the opponent, and seeks to
minimize the value of that move, hence the name minimax. Ideally, the algorithm recurses through
the game tree until it reaches a winning or losing state. This strategy was proven optimal by John
von Neumann in 1928 for all such two-person zero-sum games. Though this approach
works for simpler games, such as tic-tac-toe, the time constraint is prohibitive for a game
like Xiang Qi or Western chess. Hence, we must stop the search at some predetermined
depth and use heuristics to evaluate the game at that stage. The particular variant of
minimax used is the negamax algorithm, which is identical to the minimax algorithm in
operation, but is less awkward to program.
System Overview
The system can be divided into several components. At the core of the system is
the game engine, which is the software that displays the game board and pieces and enforces
the legality of moves. Surrounding the game engine is the graphical user interface, which
allows users to change game settings and view game information, such as what pieces
have been captured and whose turn it is. The system also includes the artificial
intelligence component, which makes moves for the computer opponent in single-player
games.
Design
Within the game, each player will see the game board. The game board will be
oriented so that the user is always closest to their own general. There will be a flag next
to the game board that indicates whose turn it is. The active player can select a piece to
move, which highlights the piece. They can then choose a place on the board to move to.
The game engine checks whether the move is legal. If it is, the engine moves the piece to the
new position and passes the turn to the other player. If the move is not valid, it displays a
message to the active player and de-selects the piece. A valid move is one that does not
violate game rules and that also does not leave one's own side in check. If there are no
valid moves left, it displays a message to both players and ends the game.
For a one-player game, the player will make their move as if they were playing
against another human player. However, rather than waiting for their opponent to move,
they will wait for the computer to select a move and make it. The computer will make its
moves by following the negamax algorithm, a simple brute-force algorithm which is a
variation on the more well-known minimax or min-max algorithm. The minimax
algorithm for turn-based strategy games operates on the assumption that the opponent
will always select the move which benefits him or her the most. The algorithm looks at
all of its moves and then computes what all of its opponent's moves will be. It then
recursively calls itself on each one of its opponent's moves, generating the search tree.
For the levels of the tree in which it is looking at its own moves, it will choose the
maximum value, and for the levels of the tree in which it is looking at its opponent's
moves it will choose the minimum value. As described earlier, this search should be
done until an end-game state has been reached for optimal results. However, since this is
impossible, searches are generally performed to some pre-determined depth. In this
application, a depth of four is the maximum depth that can be searched within the allotted
time. Anything more than that and the human opponent will stop playing out of
frustration. Once the predetermined depth has been reached, the algorithm will evaluate
the game state and return that value.
The minimax algorithm has two subfunctions, min and max, which call each other
recursively. By always scoring a position from the point of view of the side to move, and negating the value returned by each recursive call,
we can use the max function alone and eliminate the need for two separate methods. The resulting algorithm is called the
negamax algorithm. It is functionally identical to the minimax algorithm, but somewhat
more straightforward to program.
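The equivalence of the two formulations can be checked on a small hand-built tree. The following is a minimal sketch, not the Jiang Zun code: the Node class, the sample tree, and the method names are hypothetical, and leaf scores are assumed to be stated from the first player's point of view.

```java
final class NegamaxSketch {
    /** A node is either a leaf carrying a score or an interior node with children. */
    static final class Node {
        final int score;
        final Node[] children;

        Node(int score) {
            this.score = score;
            this.children = new Node[0];
        }

        Node(Node... children) {
            this.score = 0;
            this.children = children;
        }
    }

    /** Classic formulation: the two players alternate between max and min. */
    static int minimax(Node node, boolean maximizing) {
        if (node.children.length == 0) {
            return node.score;
        }
        int best = maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (Node child : node.children) {
            int value = minimax(child, !maximizing);
            best = maximizing ? Math.max(best, value) : Math.min(best, value);
        }
        return best;
    }

    /** Negamax: one function; sign is +1 when the first player is to move, -1 otherwise. */
    static int negamax(Node node, int sign) {
        if (node.children.length == 0) {
            return sign * node.score; // score seen from the side to move
        }
        int best = Integer.MIN_VALUE;
        for (Node child : node.children) {
            best = Math.max(best, -negamax(child, -sign));
        }
        return best;
    }

    /** Two moves for the first player, each answered by two replies. */
    static Node sampleTree() {
        return new Node(
            new Node(new Node(3), new Node(5)),
            new Node(new Node(2), new Node(9)));
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(minimax(root, true)); // prints 3
        System.out.println(negamax(root, 1));    // prints 3
    }
}
```

Both functions return 3 for this tree: the opponent would answer the first move with the reply worth 3 and the second with the reply worth 2, so the first move is preferred.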
To reduce the number of nodes that the algorithm must search through, we
employ a technique known as alpha-beta pruning. Alpha-beta pruning eliminates
subtrees which have as their root a move which is known to be suboptimal. To do this,
the negamax function returns early as soon as a move is shown to be worse than another move
already evaluated at the same level. By doing so, we cut down on the number of game
positions that must be examined. For alpha-beta pruning to be most effective, however,
we must employ some sorting algorithm to order the moves by their likely strength. In
general, attacks are stronger moves than non-attacks, and so we should examine them
first. Also, it is usually advantageous to move to the center of the board rather than to the
edges, and so moves which move to the middle of the board are considered first.
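The ordering heuristic described above (attacks first, then moves toward the center) could be expressed as a comparator. This is a sketch under assumptions: the CandidateMove class, its fields, and the distance measure are hypothetical and are not taken from the Jiang Zun source.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class MoveOrderingSketch {
    /** Hypothetical stand-in for a move: a destination point and a capture flag. */
    static final class CandidateMove {
        final String label;
        final int toX;
        final int toY;
        final boolean capture;

        CandidateMove(String label, int toX, int toY, boolean capture) {
            this.label = label;
            this.toX = toX;
            this.toY = toY;
            this.capture = capture;
        }
    }

    /**
     * Distance of the destination from the middle of the board, in half-point
     * units. Assumes x runs 0..8 (middle 4) and y runs 0..9 (middle 4.5).
     */
    static int distanceFromCenter(CandidateMove move) {
        return Math.abs(2 * move.toX - 8) + Math.abs(2 * move.toY - 9);
    }

    /** Captures sort before non-captures; ties are broken by closeness to the center. */
    static final Comparator<CandidateMove> STRONGEST_FIRST = (a, b) -> {
        if (a.capture != b.capture) {
            return a.capture ? -1 : 1;
        }
        return Integer.compare(distanceFromCenter(a), distanceFromCenter(b));
    };

    static void order(List<CandidateMove> moves) {
        moves.sort(STRONGEST_FIRST);
    }

    public static void main(String[] args) {
        List<CandidateMove> moves = new ArrayList<>();
        moves.add(new CandidateMove("quiet-edge", 0, 0, false));
        moves.add(new CandidateMove("quiet-center", 4, 4, false));
        moves.add(new CandidateMove("capture-edge", 8, 9, true));
        moves.add(new CandidateMove("capture-center", 4, 5, true));
        order(moves);
        for (CandidateMove move : moves) {
            System.out.println(move.label);
        }
    }
}
```

The ordering does not change the value that the search returns; it only makes it more likely that a strong move is examined early, so that later siblings can be pruned.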
The negamax function takes as its inputs a game position, a depth, the side in
question, and an alpha and beta value, and returns an integer. If the depth equals zero, or
if the game is over, it will evaluate the game position and return that value. Otherwise, it
will recursively call itself for each one of the moves. It will pass the game position
resulting from applying the move to the current game position, the depth decremented,
and the opposite of the side. It will also pass the negative of the beta value as alpha and
the negative of the alpha value as beta, and it negates the value returned by each recursive call.
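A function with the inputs listed above might look like the following sketch. It searches a small hand-built tree rather than a real Xiang Qi position; the Node class, the INFINITY bound, the leaf counter, and the sample tree are assumptions for illustration and do not come from the Jiang Zun source.

```java
final class AlphaBetaSketch {
    static final int INFINITY = 1_000_000; // larger than any score in the sketch

    /** Hypothetical game position: a static score plus the positions reachable from it. */
    static final class Node {
        final int score; // from the first player's point of view
        final Node[] children;

        Node(int score) {
            this.score = score;
            this.children = new Node[0];
        }

        Node(Node... children) {
            this.score = 0; // interior nodes get a neutral static score here
            this.children = children;
        }
    }

    static int leavesEvaluated = 0; // counts static evaluations, to make pruning visible

    /** side is +1 when the first player is to move and -1 otherwise. */
    static int negamax(Node position, int depth, int side, int alpha, int beta) {
        if (depth == 0 || position.children.length == 0) {
            leavesEvaluated++;
            return side * position.score;
        }
        int best = -INFINITY;
        for (Node child : position.children) {
            // The window is swapped and negated for the other side.
            int value = -negamax(child, depth - 1, -side, -beta, -alpha);
            best = Math.max(best, value);
            alpha = Math.max(alpha, value);
            if (alpha >= beta) {
                break; // the opponent already has a better option elsewhere
            }
        }
        return best;
    }

    static Node sampleTree() {
        return new Node(
            new Node(new Node(3), new Node(5)),
            new Node(new Node(2), new Node(9)));
    }

    public static void main(String[] args) {
        leavesEvaluated = 0;
        int value = negamax(sampleTree(), 2, 1, -INFINITY, INFINITY);
        System.out.println(value);           // prints 3
        System.out.println(leavesEvaluated); // prints 3
    }
}
```

On this tree only three of the four leaves are evaluated: once the second move is seen to allow a reply worth 2, it cannot beat the first move (worth 3), so the remaining reply is skipped.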
Implementation
The graphical user interface for Jiang Zun is created using the Java Swing
components. Swing provides all of the graphical elements needed to draw the game
board and the pieces, as well as to create the GUI for the application. Java also provides
methods to display images, which we use to display the characters for the pieces.
Originally, we attempted to use the built-in methods for displaying strings graphically,
but due to the lack of Big5 fonts on the server, we soon changed to using the image
displaying methods. The images for the game were taken from an instructional site on
The game board is represented as a two-dimensional nine by ten array, as seen in Figure
1. The game is initially set to the one-player mode.
Fig. 1
There are three classes of importance to the core of the game engine: the
GamePanel class, the Move class, and the Piece class. The GamePanel class extends the
JPanel class, which comes from Java Swing components, and implements the
MouseListener interface. It contains the list of all the pieces on the board, the flag that
sets whose turn it is, the dimensions of the screen, and the size of the pieces in pixel
units. The GamePanel class also contains functions to paint the board and pieces, remove
pieces from the board, convert from board units to pixel units, and to detect mouse clicks.
The GamePanel class also contains functions that check whether a space is empty, to
return the Piece at a given x, y point, and to check whether the space between two points
is empty. In short, the GamePanel class deals with everything that involves looking at
the board as a whole, rather than piece by piece. The GamePanel also has a stack to hold
the history of the game, so that moves can be undone. It also has methods for undoing
moves and for cloning itself. When the clone method is called, it populates the new table
with clones of each Piece as well.
The Move class is a class which represents a move on the board. Every Move has
a Piece, an X coordinate, a Y coordinate, and a GamePanel member. The Piece is the
piece which is going to be moved, the X and Y coordinates are the coordinates of the
point that the piece will be moving to, and the GamePanel is the GamePanel on which the
Piece is found. The Move class also contains a method to determine validity of the move,
which checks the X and Y coordinates on the GamePanel against the Piece's rules for
movement.
The Piece class, on the other hand, represents a piece. Each piece has a pair of
coordinates, a character value, an integer value, and a side. The coordinates determine
where the piece is on the board, the value determines the piece’s rules for movement, and
the side determines who controls the piece. The integer value is used by the artificial
intelligence portion of the program in determining the optimal move. The Piece class
contains methods to determine whether the piece is in the castle, and whether the piece
has crossed the river. The last two methods are used for the movements for advisors,
generals, and elephants. It also contains methods to clone the piece.
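The castle and river tests might be written as below. This is a sketch under assumptions, since the paper does not give the coordinate convention: it supposes x from 0 to 8 and y from 0 to 9, with red starting on rows 0 to 4 and black on rows 5 to 9. The class and method names are hypothetical.

```java
final class PieceSketch {
    enum Side { RED, BLACK }

    final int x;
    final int y;
    final Side side;

    PieceSketch(int x, int y, Side side) {
        this.x = x;
        this.y = y;
        this.side = side;
    }

    /** The castle spans the three middle columns and the three rows furthest from the river. */
    boolean isInCastle() {
        boolean inColumns = x >= 3 && x <= 5;
        boolean inRows = (side == Side.RED) ? (y >= 0 && y <= 2) : (y >= 7 && y <= 9);
        return inColumns && inRows;
    }

    /** The river is assumed to lie between rows 4 and 5. */
    boolean hasCrossedRiver() {
        return (side == Side.RED) ? y >= 5 : y <= 4;
    }

    public static void main(String[] args) {
        PieceSketch redGeneral = new PieceSketch(4, 0, Side.RED);
        PieceSketch blackElephant = new PieceSketch(2, 5, Side.BLACK);
        System.out.println(redGeneral.isInCastle());         // prints true
        System.out.println(blackElephant.hasCrossedRiver()); // prints false
    }
}
```

A move generator could then reject any general or advisor destination for which isInCastle() is false, and any elephant destination for which hasCrossedRiver() is true.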
The game paints the board and the pieces in the starting configuration at the start
of the game. After every move, the program will repaint the images. It flags the red side
as going first, as is the tradition. When the user chooses a piece to move, it checks
whether the player has control over that piece, as well as whether the move is legal or
not. A legal move is one in which the piece follows the movement rules. The program
does not consider the possibility of a draw, although the rules for a draw are present in
the standard Xiang Qi tournament rules.
In order to be able to undo moves, a GameStack class and a PieceBag class were
also created. A PieceBag is a container for a set of pieces. The GameStack class is a
standard implementation of a stack data structure. A GameStack holds a PieceBag in it
which represents the board position after each move. Whenever a move is made, a copy
of each of the pieces still on the GamePanel is made and put into the PieceBag, which is
pushed onto the GameStack. Whenever a move is undone, the GameStack pops the most
recent PieceBag, which the pieces in the current GamePanel are then set to.
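The snapshot-based undo mechanism can be sketched as follows. The sketch uses java.util.ArrayDeque and a plain list in place of the GameStack and PieceBag classes, and it takes the snapshot just before each move is applied; both choices, along with all names, are assumptions rather than a description of the actual classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class UndoSketch {
    static final class Piece {
        final char kind;
        int x;
        int y;

        Piece(char kind, int x, int y) {
            this.kind = kind;
            this.x = x;
            this.y = y;
        }

        Piece copy() {
            return new Piece(kind, x, y);
        }
    }

    private List<Piece> pieces = new ArrayList<>();
    private final Deque<List<Piece>> history = new ArrayDeque<>();

    void place(char kind, int x, int y) {
        pieces.add(new Piece(kind, x, y));
    }

    Piece pieceAt(int x, int y) {
        for (Piece piece : pieces) {
            if (piece.x == x && piece.y == y) {
                return piece;
            }
        }
        return null;
    }

    /** Copies every piece still on the board, like filling a PieceBag. */
    private List<Piece> snapshot() {
        List<Piece> bag = new ArrayList<>();
        for (Piece piece : pieces) {
            bag.add(piece.copy());
        }
        return bag;
    }

    /** Moves a piece without any rule checking, capturing whatever stands on the target. */
    boolean move(int fromX, int fromY, int toX, int toY) {
        Piece mover = pieceAt(fromX, fromY);
        if (mover == null) {
            return false;
        }
        history.push(snapshot()); // remember the position before the move
        Piece captured = pieceAt(toX, toY);
        if (captured != null && captured != mover) {
            pieces.remove(captured);
        }
        mover.x = toX;
        mover.y = toY;
        return true;
    }

    /** Restores the most recent snapshot; returns false when there is nothing to undo. */
    boolean undo() {
        if (history.isEmpty()) {
            return false;
        }
        pieces = history.pop();
        return true;
    }
}
```

Storing full copies keeps undo simple, including the restoration of captured pieces, at the cost of copying every piece on each move.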
The Board class, which does not directly affect the game, is the class which
contains our main() method. It extends the JFrame class, and generates a frame that
holds the GamePanel, as well as methods for starting a new game, changing the game's
appearance, and undoing moves. When starting a new game, the user is presented with
the option to play either a two-player hotseat game or a single-player game against a
computer opponent.
The artificial intelligence portion of the project is composed of two other main
classes: Game and AIPlayer. The Game class essentially takes a GamePanel and applies
a Move to it. However, to keep the move from actually being made on the real
GamePanel, it applies the move to a copy of the GamePanel instead. The Game class
also evaluates the current value of the game using the following formula:
Current Value = Σ (values of pieces controlled by player)
              - Σ (values of pieces controlled by opponent)
              + Σ (values of pieces that can be taken by player)
              - Σ (values of pieces that can be taken by opponent)
The values for the pieces are as follows:
Piece            Value
General          1000
Advisor          2
Elephant         2
Chariot          10
Horse            4
Cannon           6
Pawn             1
Promoted Pawn    2
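The formula and piece values above can be combined into a short sketch. The piece values are those listed in the table; the Kind enum, the method names, and the use of plain arrays to describe a position are assumptions made for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

final class EvaluationSketch {
    enum Kind { GENERAL, ADVISOR, ELEPHANT, CHARIOT, HORSE, CANNON, PAWN, PROMOTED_PAWN }

    static final Map<Kind, Integer> VALUES = new EnumMap<>(Kind.class);

    static {
        VALUES.put(Kind.GENERAL, 1000);
        VALUES.put(Kind.ADVISOR, 2);
        VALUES.put(Kind.ELEPHANT, 2);
        VALUES.put(Kind.CHARIOT, 10);
        VALUES.put(Kind.HORSE, 4);
        VALUES.put(Kind.CANNON, 6);
        VALUES.put(Kind.PAWN, 1);
        VALUES.put(Kind.PROMOTED_PAWN, 2);
    }

    static int sum(Kind... kinds) {
        int total = 0;
        for (Kind kind : kinds) {
            total += VALUES.get(kind);
        }
        return total;
    }

    /** Material balance plus the balance of pieces that each side could capture. */
    static int currentValue(Kind[] own, Kind[] opponent,
                            Kind[] capturableByPlayer, Kind[] capturableByOpponent) {
        return sum(own) - sum(opponent)
             + sum(capturableByPlayer) - sum(capturableByOpponent);
    }

    public static void main(String[] args) {
        Kind[] own = { Kind.CHARIOT, Kind.HORSE };
        Kind[] opponent = { Kind.CANNON, Kind.CANNON };
        Kind[] capturableByPlayer = { Kind.CANNON };
        Kind[] capturableByOpponent = { Kind.HORSE };
        // (10 + 4) - (6 + 6) + 6 - 4
        System.out.println(currentValue(own, opponent, capturableByPlayer, capturableByOpponent)); // prints 4
    }
}
```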
The current value of the game is used as part of the decision-making algorithm,
whose implementation is also contained in the Game class. The negamax function
applies the negamax algorithm to the Game. If the depth has been reached, the current
value is returned. Otherwise, negamax is recursively called. For this project, a depth of 3
is used, starting with the opponent's moves. This means that the program is looking
ahead two full turns. Any more moves than that and the program takes too long in
deciding what move to make.
The AIPlayer class acts as the opposing player in single player games. It contains
one method of interest, which decides on a move to make. The method generates a list of
Moves and a list of Games. The list of Moves consists of all valid moves for the computer's side,
and the list of Games consists of each one of those Moves being applied to the current
game position. The value of each one of those moves is then computed using the
negamax function. The move with the highest value is then made.
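The selection step can be sketched generically. In this sketch the scoring function is passed in as a parameter standing in for the call to negamax on the resulting position; the class name, the method name, and the use of java.util.function.ToIntFunction are assumptions, not the AIPlayer code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

final class BestMoveSketch {
    /**
     * Returns the move with the highest value, or null if there are no moves.
     * When several moves share the highest value, the first one found is kept.
     */
    static <M> M chooseBest(List<M> moves, ToIntFunction<M> valueAfterMove) {
        M best = null;
        int bestValue = Integer.MIN_VALUE;
        for (M move : moves) {
            int value = valueAfterMove.applyAsInt(move);
            if (best == null || value > bestValue) {
                best = move;
                bestValue = value;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Strings stand in for moves, and their lengths stand in for search values.
        List<String> moves = Arrays.asList("a", "bbb", "cc");
        System.out.println(chooseBest(moves, String::length)); // prints bbb
    }
}
```

If the scoring function returns the negamax value of the position after the move, that value is from the opponent's point of view and would need to be negated before the comparison.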
System Evaluation
At time of writing, players can play hot-seat games and in single-player mode
against the computer. The single-player mode was determined to be at a difficulty level
appropriate for beginning players. However, the moves made were often poor, especially
in the early stages of the game. There were several different situations in which
the computer opponent made moves that a relatively skilled human player would easily
exploit. As far as we have been able to ascertain, this is due primarily to the inability to
look ahead more than one full turn due to time constraints. The evaluation function used
is also inadequate, as it fails to take into consideration many factors that a human player
learns about through experience.
The most glaring play error occurs at the beginning of the game, when the
computer opponent always moves a horse forward and to the edge of the board
(Fig. 2). This is due to both the way that the evaluation function works and the look-ahead limitations. At the beginning of a Xiang Qi game, all of the horses are vulnerable
to enemy cannons, which can jump over the opposing cannons and take the horses.
However, this is generally considered one of the worst starting moves, since the enemy
can simply take the offending cannons with their chariots. While the computer opponent
recognizes this as a poor move for itself because it looks ahead two turns, it does not see
that this would be an equally poor move for its opponent, because it does not look that
far ahead. Thus, it will move its horse out of the way, because it views the cannon capture as
the opponent's best move.
Fig. 2
There was also another, more subtle, but equally poor decision made as the result
of the naïve evaluation function. This decision was made when the computer player was
able to “fork” two opposing pieces, placing both in danger at once. In this situation, if
the opponent did not move one of the pieces out of danger, the computer player failed to
attack either one of the pieces. If we examine the game state evaluation function, it is
easy to see why. A piece that can be captured is worth the same as a piece that has been
removed from the board, which means that a position from which two pieces can be taken
is going to have a higher value than a position from which neither piece can be taken, as
can be seen in Figure 3:
Fig. 3
In this case, the computer player had threatened both of the opponent's cannons
with its chariot. However, when the human player did not move either cannon out of
danger, the computer player failed to take either cannon, instead moving its elephant out
of the cannon's attack route. This is because the evaluation function calculates this
position out to be a better one than one in which the cannon is captured. In order to fix
this problem and similar other ones, it would be necessary to write a new evaluation
function. Unfortunately, due to the brute-force nature of the minimax algorithm, it does
not seem possible to extend the look-ahead without making the time delay between
moves too long. Hence, a better evaluation function must be written.
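The arithmetic behind the fork problem can be made concrete with the evaluation formula and the cannon value of 6 given earlier. The sketch below is illustrative only: the material totals of 50 are made-up numbers, and it compares static values directly, ignoring the effect of look-ahead.

```java
final class ForkValueSketch {
    static final int CANNON = 6; // value from the piece table

    /** The evaluation formula, with each sum already totalled. */
    static int value(int ownMaterial, int opponentMaterial,
                     int capturableByPlayer, int capturableByOpponent) {
        return ownMaterial - opponentMaterial + capturableByPlayer - capturableByOpponent;
    }

    public static void main(String[] args) {
        int material = 50; // hypothetical material total for each side

        // The chariot stays put and keeps threatening both cannons.
        int keepFork = value(material, material, 2 * CANNON, 0);

        // The chariot captures one cannon and no longer threatens the other.
        int capture = value(material, material - CANNON, 0, 0);

        // The chariot captures one cannon and still threatens the other.
        int captureAndThreaten = value(material, material - CANNON, CANNON, 0);

        System.out.println(keepFork);           // prints 12
        System.out.println(capture);            // prints 6
        System.out.println(captureAndThreaten); // prints 12
    }
}
```

Because a threatened piece counts the same as a captured one, capturing a cannon gains 6 in material but gives up at least 6 in threat value, so under this formula the capture never scores strictly higher than keeping the fork.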
The time taken per move was not yet at the quality desired, either, as each move
took between 5 and 10 seconds to be decided, even with such a short look-ahead. This is
likely because of the way that the board finds moves. For each piece, it looks at every
possible position on the game board and determines its validity. At the beginning of the
game, there are 16 pieces in play for each player. There are also 90 coordinate points on
the game board. Thus, the computer opponent will look at 1440 moves and determine
their validity. For each one of those moves, the opponent will look at each one of the
opponent's moves. At this point, we are only at one full turn of look-ahead and the
computer is already having to examine over 2 million moves, most of which are invalid,
but which must be considered anyway. In order to look 2 full turns ahead, which most
human players can do quite easily, we must examine on the order of 10^12 moves and determine their
validity, which is not possible within the time frame. Even looking only one full turn
ahead, the decision-making process may be too slow.
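The counts quoted above follow from multiplying 16 pieces by 90 board points at every level of the search. A short sketch of the arithmetic, with a hypothetical class and method name:

```java
final class SearchCostSketch {
    /**
     * Number of candidate moves checked when every piece is tried against
     * every board point at each of the given number of plies (half-turns).
     */
    static long candidates(int pieces, int points, int plies) {
        long perPly = (long) pieces * points;
        long total = 1;
        for (int i = 0; i < plies; i++) {
            total *= perPly;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(candidates(16, 90, 1)); // prints 1440
        System.out.println(candidates(16, 90, 2)); // prints 2073600 (one full turn)
        System.out.println(candidates(16, 90, 4)); // prints 4299816960000 (two full turns)
    }
}
```

The figure assumes that both sides keep all 16 pieces, so it is an upper bound for this generation scheme. Most of the candidates are invalid moves, which suggests that the cost comes largely from the way moves are generated rather than from the search alone.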
Conclusions
Although the project may not have fulfilled all of our expectations, we the
developers have learned much from the experience. In the process of researching
artificial intelligence algorithms for two-player zero-sum games, we read about many
different artificial intelligence and machine learning processes, which was fascinating. In
particular, we learned all about different types of brute-force algorithms, generation and
searching of game trees, and the ins and outs of the minimax and negamax algorithms. In
addition, we re-learned a great deal of Java Swing programming. Most importantly, we
learned about how to design a project like this from the ground up, which was a new
experience.
In order to prepare this application for public use, several things would need to be
changed slightly, while other things would need a complete overhaul. First, the original
intention of playing over the Internet would definitely need to be developed. In these
modern times, an application which does not have the capability for online play would
have a very small player base. In addition to that, the computer opponent would need a
major reworking. A new, more efficient algorithm would need to be used, as any brute-force algorithm like minimax or negamax will necessarily have to be limited in scope. A
machine learning approach or pattern analysis approach might have better results.
Finally, the user interface could use some work, as it is not up to current standards of
computer games.
Acknowledgments
We would like to thank Dr. Chengyu Sun for his advice and guidance throughout the
project.