Jiang Zun – A Xiang Qi Application
Herbert Lee
CS 491B
December 8, 2006

Abstract

Jiang Zun is a computerized version of the ancient Chinese game of Xiang Qi. Using the Java programming language, the developers have created an interface for playing this game, as well as a computer player based on the minimax algorithm, a brute-force algorithm that can be applied to all zero-sum, perfect-information, turn-based games, including Xiang Qi. The developers discuss every stage of the creation of this application, from the user interface to the artificial intelligence portion, with special emphasis on the latter and in particular on the implementation of the minimax algorithm. In addition, the developers analyze their application, reveal weaknesses in the implementation of the artificial intelligence portion, and offer possible fixes for those flaws.

Introduction

Jiang Zun is a web-based application for the computerized play of Xiang Qi, or Chinese Chess. Xiang Qi is a turn-based, two-player strategy game similar to traditional, or Western, Chess. The goal in both games is to capture the enemy's leader: the king in Western Chess, the general in Xiang Qi. The name Jiang Zun comes from the Chinese expression for checkmate. Many Xiang Qi pieces are analogous to those found in Western Chess, but have different movement and attack patterns. Because of these differences, Xiang Qi games are generally faster-paced than Western Chess matches.

As in Western Chess, there are two players, and the objective is to place the enemy's general in a position from which it cannot escape, known as checkmate. The board, which is a nine by ten grid, is divided into two halves by a river. On each side there is a two by two grid, centered horizontally and placed at the furthest possible distance from the river, marked by a pair of diagonal lines that form an X. This region is known as the castle, or fortress.
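The board geometry described above can be sketched in a few lines of Java. This is only an illustration, not code from Jiang Zun: the class name BoardGeometry, the coordinate convention (columns 0 to 8, rows 0 to 9, red at the low-numbered rows), and the choice to treat pieces as sitting on intersections are all assumptions. Under that convention the two by two block of squares that forms the castle covers a three by three set of points.

```java
// Hypothetical helper, not part of Jiang Zun's source.
// Assumes pieces sit on intersections: columns 0..8, rows 0..9,
// with red's half at rows 0..4 and black's half at rows 5..9.
final class BoardGeometry {
    static final int COLUMNS = 9;
    static final int ROWS = 10;

    // True if the point lies on the nine by ten board.
    static boolean onBoard(int col, int row) {
        return col >= 0 && col < COLUMNS && row >= 0 && row < ROWS;
    }

    // The castle spans the three middle columns and the three rows
    // farthest from the river on the given side.
    static boolean inCastle(int col, int row, boolean redSide) {
        boolean middleColumns = col >= 3 && col <= 5;
        boolean homeRows = redSide ? (row >= 0 && row <= 2) : (row >= 7 && row <= 9);
        return middleColumns && homeRows;
    }

    // The river runs between rows 4 and 5, so a red piece has crossed
    // once it stands on row 5 or beyond, and a black piece on row 4 or below.
    static boolean hasCrossedRiver(int row, boolean redSide) {
        return redSide ? row >= 5 : row <= 4;
    }

    public static void main(String[] args) {
        System.out.println(inCastle(4, 0, true));     // centre point of red's back row
        System.out.println(hasCrossedRiver(4, true)); // red piece still on its own bank
    }
}
```

Checks of this kind are what restrict the general and advisors to the castle and distinguish a soldier or elephant on its own side of the river from one that has crossed.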
There are seven different types of pieces: soldier, cannon, chariot, horse, elephant, advisor, and general. As in chess, each piece has its own rules for movement. The general may move one space horizontally or vertically. The advisor may move one space diagonally. The cannon and chariot may move any number of spaces horizontally or vertically. The horse moves one space horizontally or vertically, and then one space diagonally. The elephant moves two spaces diagonally. No piece may jump over another piece, with the sole exception of the cannon, which must jump over exactly one piece in order to attack. The general and his advisors must remain in the castle.

In addition to a client for hot-seat player-versus-player matches, Jiang Zun also incorporates an artificial intelligence component, so that players can test their skills against a computer opponent. The computer opponent has one skill setting and implements the minimax algorithm: it chooses its next move by evaluating the board position that would result from each of the moves available to it and making the move that leads to the best-valued position. While not particularly sophisticated, the computer opponent still provides a challenge to inexperienced and moderately skilled human players. Xiang Qi has a game-tree complexity of about 10^150, compared with about 10^123 for Western Chess.

We were motivated to develop this software for several reasons. While Xiang Qi is a very popular board game, it lacks the international attention garnered by Western Chess, especially in the artificial intelligence area. Because it is so similar to Western Chess from a programming standpoint, there has not been as much effort to develop Xiang Qi playing programs as there has been for Western Chess. Thus, though the two games are very similar, there has not yet been a Xiang Qi program written that can defeat the world's best Xiang Qi player.
Because there is such a lamentable dearth of computerized Xiang Qi efforts, creating software to help fill this void was of interest to us. In addition, we were interested in artificial intelligence in general, and developing a computer program for Xiang Qi seemed a fascinating challenge.

Technological Background

The programming language used throughout the development of Jiang Zun is Java 2, Standard Edition 1.5.0. Java was used primarily for the ability to create applets, which allow the software to be run from a web browser. This minimizes the effort required of the user to access the software. Java was also used for its graphical user interface (GUI) support, which is well defined and easy to use. The project was developed in the Eclipse Software Developer's Kit, a development environment for Java. Eclipse was chosen because it is open source, which was important since our project had a strict budgetary limit.

Jiang Zun's artificial intelligence component makes use of the minimax algorithm. Sometimes referred to as min-max, the algorithm is best suited to two-player, perfect-information, zero-sum games. Perfect information means that there are no unknown factors for either opponent. Zero-sum means that every gain by one player is matched by an equivalent loss by the other player. Xiang Qi matches both of these definitions, which makes minimax relatively easy to implement for Xiang Qi.

The minimax algorithm gets its name from minimum-maximum, which refers to the process by which the algorithm finds the optimal move. The algorithm finds all of the possible moves for one player, each of which generates a game state. The game states can be seen as the nodes of a tree, while the moves are the paths between them. The algorithm assumes that the opponent will always make the maximal move, and seeks to minimize the value of that move, hence the name minimax. Ideally, the algorithm recurses through the game tree until it reaches a winning or losing state.
This strategy was proven optimal by John von Neumann in 1928 for all such two-person zero-sum games. Though this approach works for simpler games, such as tic-tac-toe, the time required is prohibitive for a game like Xiang Qi or Western Chess. Hence, we must stop the search at some predetermined depth and use heuristics to evaluate the game at that stage. The particular variant of minimax used is the negamax algorithm, which is identical to the minimax algorithm in operation, but is less awkward to program.

System Overview

The system can be divided into several components. At the core of the system is the game engine, which is the software that displays the game board and pieces and enforces the legality of moves. Surrounding the game engine is the graphical user interface, which allows users to change game settings and view game information, such as which pieces have been captured and whose turn it is. The system also includes the artificial intelligence component, which makes moves for the computer opponent in single-player games.

Design

Within the game, each player will see the game board. The game board will be oriented so that the user is always closest to their own general. There will be a flag next to the game board that indicates whose turn it is. The active player can select a piece to move, which highlights the piece. They can then choose a place on the board to move to. The game engine checks whether the move is legal. If it is, the engine moves the piece to the new position and passes the turn to the other player. If the move is not valid, it displays a message to the active player and de-selects the piece. A valid move is one that does not violate game rules and that also does not leave one's own side in check. If the active player has no valid moves left, the engine displays a message to both players and ends the game.

For a one-player game, the player will make their move as if they were playing against another human player.
However, rather than waiting for their opponent to move, they will wait for the computer to select a move and make it. The computer will make its moves by following the negamax algorithm, a simple brute-force algorithm which is a variation on the better-known minimax, or min-max, algorithm.

The minimax algorithm for turn-based strategy games operates on the assumption that the opponent will always select the move which benefits him or her the most. The algorithm looks at all of its own moves and then computes all of the opponent's possible replies. It then recursively calls itself on each one of its opponent's moves, generating the search tree. For the levels of the tree in which it is looking at its own moves, it will choose the maximum value, and for the levels of the tree in which it is looking at its opponent's moves it will choose the minimum value. As described earlier, for optimal results this search should continue until an end-game state has been reached. However, since this is impractical, searches are generally performed to some predetermined depth. In this application, a depth of four is the maximum depth that can be searched within the allotted time. Anything more than that and the human opponent will stop playing out of frustration. Once the predetermined depth has been reached, the algorithm will evaluate the game state and return that value.

The minimax algorithm has two subfunctions, min and max, which call each other recursively. By negating the value returned from each recursive call, so that the sign alternates between levels, we can eliminate the need for two separate methods. The resulting algorithm is called the negamax algorithm. It is functionally identical to the minimax algorithm, but somewhat more straightforward to program.

To reduce the number of nodes that the algorithm must search through, we employ a technique known as alpha-beta pruning. Alpha-beta pruning eliminates subtrees which have as their root a move which is known to be suboptimal.
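As a rough illustration of the negamax search with alpha-beta pruning described in this section, consider the following minimal sketch. It is not Jiang Zun's code: the Position interface, the Negamax class, and the TreeNode test tree are hypothetical names introduced here, and the sketch assumes that evaluate() scores a position from the point of view of the side to move, which replaces the explicit side parameter mentioned in the paper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a game position (not one of Jiang Zun's classes).
interface Position {
    boolean isGameOver();
    int evaluate();               // assumed: score from the side to move's point of view
    List<Position> successors();  // positions reachable in one move
}

final class Negamax {
    static final int INF = Integer.MAX_VALUE; // -INF is safe to negate, unlike MIN_VALUE

    // Value of pos for the side to move, searching depth plies ahead.
    static int search(Position pos, int depth, int alpha, int beta) {
        if (depth == 0 || pos.isGameOver()) {
            return pos.evaluate();
        }
        int best = -INF;
        for (Position next : pos.successors()) {
            // The opponent's best result, negated, is this move's value for us.
            int score = -search(next, depth - 1, -beta, -alpha);
            best = Math.max(best, score);
            alpha = Math.max(alpha, score);
            if (alpha >= beta) {
                break; // the opponent has a better alternative higher up: prune
            }
        }
        return best;
    }

    // Index of the successor with the highest negamax value, or -1 if there is none.
    static int bestMoveIndex(Position root, int depth) {
        List<Position> moves = root.successors();
        int bestIndex = -1;
        int bestScore = -INF;
        for (int i = 0; i < moves.size(); i++) {
            int score = -search(moves.get(i), depth - 1, -INF, INF);
            if (bestIndex == -1 || score > bestScore) {
                bestIndex = i;
                bestScore = score;
            }
        }
        return bestIndex;
    }
}

// Tiny fixed game tree used only to exercise the search.
final class TreeNode implements Position {
    static int evaluations = 0;   // counts leaf evaluations, to make pruning visible
    private final int value;
    private final List<Position> children;

    private TreeNode(int value, List<Position> children) {
        this.value = value;
        this.children = children;
    }
    static TreeNode leaf(int value) { return new TreeNode(value, new ArrayList<Position>()); }
    static TreeNode of(Position... kids) { return new TreeNode(0, Arrays.asList(kids)); }

    public boolean isGameOver() { return children.isEmpty(); }
    public int evaluate() { evaluations++; return value; }
    public List<Position> successors() { return children; }
}
```

A search is started with the widest possible window, for example Negamax.search(root, 2, -Negamax.INF, Negamax.INF). On a two-ply tree whose first move leads to leaves 3 and 5 and whose second move leads to leaves 2 and 9, the search returns 3 and never evaluates the leaf 9, because the reply worth 2 already shows the second move to be worse than the first.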
To do this, the negamax function stops examining a move as soon as its value is shown to be worse than that of a move already evaluated at the same level. By doing so, we cut down on the number of game positions that must be examined. For alpha-beta pruning to be most effective, however, we must sort the moves by their likely strength. In general, attacks are stronger moves than non-attacks, and so we should examine them first. Also, it is usually advantageous to move to the center of the board rather than to the edges, and so moves toward the middle of the board are considered first.

The negamax function takes as its inputs a game position, a depth, the side in question, and an alpha and a beta value, and returns an integer. If the depth equals zero, or if the game is over, it will evaluate the game position and return that value. Otherwise, it will recursively call itself for each one of the moves. It will pass the game position resulting from applying the move to the current game position, the depth decremented by one, and the opposite side. It will also pass the negative of the beta value as alpha and the negative of the alpha value as beta.

Implementation

The graphical user interface for Jiang Zun is created using the Java Swing components. Swing provides all of the graphical elements needed to draw the game board and the pieces, as well as to create the GUI for the application. Java also provides methods to display images, which we use to display the characters for the pieces. Originally, we attempted to use the built-in methods for displaying strings graphically, but due to the lack of Big5 fonts on the server, we soon changed to using the image displaying methods. The images for the game were taken from an instructional site on

The game board is represented as a two-dimensional nine by ten array, as seen in Figure 1. The game is initially set to the one-player mode.

Fig. 1

There are three classes of importance to the core of the game engine: the GamePanel class, the Move class, and the Piece class.

The GamePanel class extends the JPanel class, which comes from the Java Swing components, and implements the MouseListener interface. It contains the list of all the pieces on the board, the flag that sets whose turn it is, the dimensions of the screen, and the size of the pieces in pixel units. The GamePanel class also contains functions to paint the board and pieces, remove pieces from the board, convert from board units to pixel units, and detect mouse clicks. It further contains functions to check whether a space is empty, to return the Piece at a given x, y point, and to check whether the space between two points is empty. In short, the GamePanel class deals with everything that involves looking at the board as a whole, rather than piece by piece. The GamePanel also has a stack to hold the history of the game, so that moves can be undone, along with methods for undoing moves and for cloning itself. When the clone method is called, it populates the new copy with clones of each Piece as well.

The Move class represents a move on the board. Every Move has a Piece, an X coordinate, a Y coordinate, and a GamePanel member. The Piece is the piece which is going to be moved, the X and Y coordinates are the coordinates of the point that the piece will be moving to, and the GamePanel is the GamePanel on which the Piece is found. The Move class also contains a method to determine the validity of the move, which checks the X and Y coordinates on the GamePanel against the Piece's rules for movement.

The Piece class, on the other hand, represents a piece. Each piece has a pair of coordinates, a character value, an integer value, and a side. The coordinates determine where the piece is on the board, the character value determines the piece's rules for movement, and the side determines who controls the piece.
The integer value is used by the artificial intelligence portion of the program in determining the optimal move. The Piece class contains methods to determine whether the piece is in the castle and whether the piece has crossed the river. These two methods are used for the movement of advisors, generals, and elephants. It also contains a method to clone the piece.

The game paints the board and the pieces in the starting configuration at the start of the game. After every move, the program repaints the images. It flags the red side as going first, as is the tradition. When the user chooses a piece to move, the program checks whether the player controls that piece, as well as whether the move is legal. A legal move is one in which the piece follows the movement rules. The program does not consider the possibility of a draw, although rules for a draw are present in the standard Xiang Qi tournament rules.

In order to be able to undo moves, a GameStack class and a PieceBag class were also created. A PieceBag is a container for a set of pieces. The GameStack class is a standard implementation of a stack data structure. A GameStack holds PieceBags, each of which represents the board position after a move. Whenever a move is made, a copy of each of the pieces still on the GamePanel is made and put into a PieceBag, which is pushed onto the GameStack. Whenever a move is undone, the GameStack pops the most recent PieceBag, and the pieces in the current GamePanel are set to its contents.

The Board class, which does not directly affect the game, is the class which contains our main() method. It extends the JFrame class and generates a frame that holds the GamePanel, as well as methods for starting a new game, changing the game's appearance, and undoing moves. When starting a new game, the user is presented with the option to play either a two-player hot-seat game or a single-player game against a computer opponent.
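The snapshot-based undo mechanism can be sketched as follows. This is an illustration under stated assumptions rather than the actual GameStack and PieceBag code: SimplePiece and UndoHistory are hypothetical names, the piece fields are guesses at a minimal representation, and the standard java.util.ArrayDeque stands in for a hand-written stack. The sketch saves the position before each move so that popping restores it; the paper describes each snapshot as the position after a move, so the exact timing in Jiang Zun may differ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simplified stand-in for the paper's Piece class (fields are assumptions).
final class SimplePiece {
    final char kind;
    int x;
    int y;
    final boolean red;

    SimplePiece(char kind, int x, int y, boolean red) {
        this.kind = kind;
        this.x = x;
        this.y = y;
        this.red = red;
    }

    // Independent copy, so that later moves do not change stored snapshots.
    SimplePiece copy() {
        return new SimplePiece(kind, x, y, red);
    }
}

// Each stored list plays the role of a PieceBag; the deque plays the role of the GameStack.
final class UndoHistory {
    private final Deque<List<SimplePiece>> snapshots = new ArrayDeque<List<SimplePiece>>();

    // Deep-copies the pieces currently on the board and pushes the copy.
    void push(List<SimplePiece> pieces) {
        List<SimplePiece> bag = new ArrayList<SimplePiece>();
        for (SimplePiece p : pieces) {
            bag.add(p.copy());
        }
        snapshots.push(bag);
    }

    // Returns the most recently saved position, or null if nothing is saved.
    List<SimplePiece> pop() {
        return snapshots.poll();
    }

    boolean isEmpty() {
        return snapshots.isEmpty();
    }
}
```

Copying every piece on each move is simple and makes undo trivial, at the cost of allocating a full set of pieces per move; storing only the moved and captured pieces would be a lighter alternative.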
The artificial intelligence portion of the project is composed of two other main classes: Game and AIPlayer. The Game class essentially takes a GamePanel and applies a Move to it. However, to keep the move from actually being made on the real game board, it applies the move to a copy of the GamePanel instead. The Game class also evaluates the current value of the game using the following formula:

Current Value = Σ (values of pieces controlled by player)
              - Σ (values of pieces controlled by opponent)
              + Σ (values of pieces that can be taken by player)
              - Σ (values of pieces that can be taken by opponent)

The values for the pieces are as follows:

Piece           Value
General         1000
Advisor         2
Elephant        2
Chariot         10
Horse           4
Cannon          6
Pawn            1
Promoted Pawn   2

The current value of the game is used as part of the decision-making algorithm, whose implementation is also contained in the Game class. The negamax function applies the negamax algorithm to the Game. If the depth limit has been reached, the current value is returned. Otherwise, negamax is recursively called. For this project, a depth of 3 is used, starting with the opponent's moves. This means that the program is looking ahead two full turns. Any more than that and the program takes too long to decide what move to make.

The AIPlayer class acts as the opposing player in single-player games. It contains one method of interest, which decides on a move to make. The method generates a list of Moves and a list of Games. The list of Moves consists of all valid moves for the computer's side, and the list of Games consists of the positions that result from applying each of those Moves to the current game position. The value of each of those moves is then computed using the negamax function, and the move with the highest value is made.

System Evaluation

At the time of writing, players can play hot-seat games and single-player games against the computer. The single-player mode was determined to be at a difficulty level appropriate for beginning players.
However, the moves made were often poor, especially in the early stages of the game. There were several situations in which the computer opponent made moves which a relatively skilled human player would easily exploit. As far as we have been able to ascertain, this is due primarily to the inability to look ahead more than one full turn because of time constraints. The evaluation function used is also inadequate, as it fails to take into consideration many factors that a human player learns about through experience.

The most glaring play error occurs at the beginning of the game, when the computer opponent always moves a horse forward and to the edge of the board (Fig. 2). This is due both to the way that the evaluation function works and to the lookahead limitations. At the beginning of a Xiang Qi game, all of the horses are vulnerable to enemy cannons, which can jump over the opposing cannons and take the horses. However, this is generally considered one of the worst starting moves, since the enemy can simply take the offending cannons with their chariots. While the computer opponent recognizes this as a poor move for itself because it looks ahead two turns, it does not see that this would be an equally poor move for its opponent, because it does not look that far ahead. Thus, it will move its horse out of the way, because it views the capture as the opponent's best move.

Fig. 2

There was also another, more subtle, but equally poor decision made as a result of the naïve evaluation function. This decision was made when the computer player was able to "fork" two opposing pieces, placing both in danger at once. In this situation, if the opponent did not move one of the pieces out of danger, the computer player failed to attack either one of the pieces. If we examine the game state evaluation function, it is easy to see why.
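To make the effect concrete, the evaluation formula from the Implementation section can be sketched with the four sums passed in directly. This is a simplification for illustration, not the Game class itself: the class name Evaluation and the two example positions are hypothetical, and the full-side total of 1053 is simply the sum of the piece values in the table for a complete set of sixteen pieces (with five unpromoted pawns).

```java
// Sketch of the evaluation formula; the sums are supplied by the caller
// rather than computed from a real board (an assumption made for brevity).
final class Evaluation {
    static final int CANNON = 6;       // piece values taken from the paper's table
    static final int ELEPHANT = 2;
    static final int FULL_SIDE = 1053; // 1000 + 2*2 + 2*2 + 2*10 + 2*4 + 2*6 + 5*1

    static int value(int ownMaterial, int opponentMaterial,
                     int ownThreats, int opponentThreats) {
        return ownMaterial - opponentMaterial + ownThreats - opponentThreats;
    }

    public static void main(String[] args) {
        // Hypothetical position A: keep both cannons forked, own elephant moved to safety.
        int holdFork = value(FULL_SIDE, FULL_SIDE, 2 * CANNON, 0);
        // Hypothetical position B: capture one cannon, still threaten the other,
        // but leave the elephant under attack.
        int capture = value(FULL_SIDE, FULL_SIDE - CANNON, CANNON, ELEPHANT);
        System.out.println("hold fork: " + holdFork + ", capture: " + capture);
    }
}
```

With these illustrative numbers, holding the fork scores 12 and capturing scores 10. Capturing a cannon gains 6 in material but gives up 6 in threats, so the capture itself is worth nothing to this formula, and any small side effect is enough to tip the choice away from it.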
A piece that can be captured is worth the same as a piece that has been removed from the board, which means that a position from which two pieces can be taken is going to have a higher value than a position from which neither piece can be taken, as can be seen in Figure 3.

Fig. 3

In this case, the computer player had threatened both of the opponent's cannons with its chariot. However, when the human player did not move either cannon out of danger, the computer player failed to take either cannon, instead moving its elephant out of the cannon's attack route. This is because the evaluation function scores this position higher than one in which a cannon is captured. In order to fix this problem and others like it, it would be necessary to write a new evaluation function. Unfortunately, due to the brute-force nature of the minimax algorithm, it does not seem possible to extend the look-ahead without making the time delay between moves too long. Hence, a better evaluation function must be written.

The time taken per move was also longer than desired, as each move took between 5 and 10 seconds to be decided, even with such a short look-ahead. This is likely because of the way that the board finds moves. For each piece, it looks at every possible position on the game board and determines whether a move there is valid. At the beginning of the game, there are 16 pieces in play for each player. There are also 90 coordinate points on the game board. Thus, the computer opponent will look at 1440 moves and determine their validity. For each one of those moves, the computer will then look at each one of the opponent's moves. At this point, we are only at one full turn of look-ahead and the computer is already having to examine over 2 million moves, most of which are invalid, but which must be considered anyway.
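The arithmetic behind these counts can be checked with a short sketch. The class name MoveCount is hypothetical, and the calculation assumes, as the text does, that every piece is tested against every point on the board and that all 16 pieces remain in play at every level, which makes it an upper bound rather than an exact count.

```java
// Counts candidate (piece, destination) pairs under the brute-force scheme
// described in the text. Assumes no captures reduce the piece count.
final class MoveCount {
    // Candidates examined for one side at a single ply.
    static long candidatesPerPly(int pieces, int points) {
        return (long) pieces * points;
    }

    // Candidate sequences examined when searching the given number of plies.
    static long candidates(int pieces, int points, int plies) {
        long perPly = candidatesPerPly(pieces, points);
        long total = 1;
        for (int i = 0; i < plies; i++) {
            total *= perPly;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(candidates(16, 90, 1)); // one side's move
        System.out.println(candidates(16, 90, 2)); // one full turn
        System.out.println(candidates(16, 90, 4)); // two full turns
    }
}
```

One ply gives 16 × 90 = 1440 candidates, one full turn gives 1440² = 2,073,600, and two full turns give 1440⁴ = 4,299,816,960,000, which is on the order of 10^12.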
In order to look 2 full turns ahead, which most human players can do quite easily, we must examine on the order of 10^12 moves and determine their validity, which is not possible within the time frame. Even looking only one full turn ahead, the decision-making process may be too slow.

Conclusions

Although the project may not have fulfilled all of our expectations, we the developers have learned much from the experience. In the process of researching artificial intelligence algorithms for two-player zero-sum games, we read about many different artificial intelligence and machine learning processes, which was fascinating. In particular, we learned about different types of brute-force algorithms, the generation and searching of game trees, and the ins and outs of the minimax and negamax algorithms. In addition, we re-learned a great deal of Java Swing programming. Most importantly, we learned how to design a project like this from the ground up, which was a new experience.

In order to prepare this application for public use, several things would need to be changed slightly, while other things would need a complete overhaul. First, the original intention of playing over the Internet would need to be realized. An application without the capability for online play would have a very small player base. In addition, the computer opponent would need a major reworking. A new, more efficient algorithm would need to be used, as any brute-force algorithm like minimax or negamax will necessarily be limited in how far it can look ahead. A machine learning approach or pattern analysis approach might have better results. Finally, the user interface could use some work, as it is not up to the current standards of computer games.

Acknowledgments

We would like to thank Dr. Chengyu Sun for his advice and guidance throughout the project.