First Semester Project Summary Guideline

Name: James Mannion
Project: Temporal Difference Learning in Chess

WHAT IS IT?

Once I am done, my program will show that it is possible for a computer to learn an evaluation function for chess. As of right now, I have two working programs: one that allows two humans to play a full game of chess against each other, and one that allows a human to play a full game against a computer player that picks its moves at random. I have begun implementing the necessary parts of my heuristic function. I wrote a main text file that stores the heuristic function as it currently stands and that can be overwritten, and I wrote a text file for each piece-square table. Currently every value in the piece-square tables is 1. This is deliberate: the program needs to start out playing poorly, otherwise there would be no point in having it learn.

HOW IT WORKS

This project follows the rules of chess (except for slight variations I made to the stalemate rules for the purpose of simplicity). It allows a human user to input moves directly into a terminal, and gives a text-based representation of a chessboard. Currently, a human can play either another human or a program that picks its moves at random. The Chessboard class checks whether a given move is legal by generating a list of the selected piece’s legal moves (based on the rules of its movement) and checking whether the desired target square is in that list. If so, the program makes the move. If not, the program simply prompts the user for a different move.

HOW TO USE IT

To run a human vs. human game, the user simply needs to run chess.py in a command terminal. To run a human vs. random AI game, the user needs to run chess_ai.py. Once in the game, the user enters moves into the terminal using the following syntax: *letter of piece**rank of piece* *letter of target space**rank of target space*.
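As an illustration of this input syntax, here is a minimal sketch of how such a move string could be parsed. The names MOVE_PATTERN and parse_move are hypothetical and are not taken from chess.py, and the zero-based (file, rank) index pairs are an assumed board representation.

```python
import re

# Hypothetical helper: these names are illustrative, not taken from chess.py.
# Two squares (file letter a-h, rank 1-8) separated by whitespace.
MOVE_PATTERN = re.compile(r"^([a-h])([1-8])\s+([a-h])([1-8])$")


def parse_move(text):
    """Convert input such as "a2 a4" into two zero-based (file, rank) pairs.

    Returns None when the text does not match the expected syntax,
    so that the caller can prompt the user again.
    """
    match = MOVE_PATTERN.match(text.strip().lower())
    if match is None:
        return None
    from_file, from_rank, to_file, to_rank = match.groups()
    start = (ord(from_file) - ord("a"), int(from_rank) - 1)
    target = (ord(to_file) - ord("a"), int(to_rank) - 1)
    return start, target
```

A parser like this only validates the form of the input. Whether the move is legal is still decided by the board, as described under HOW IT WORKS.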
For example, the input “a2 a4” moves the piece at a2 to the space a4, assuming this is a legal move.

THINGS TO NOTICE

The text-based program shows the players when a king is in check and will not allow anyone to make an illegal move. This includes disallowing moves that would put your own king in check. Figuring out how to code that part took me forever, both because it was very hard to visualize and because it required hours of testing and debugging to make sure that it worked in as many scenarios as I could throw at it.

THINGS TO TRY

Play a full game against the random computer to make sure there are no bugs.

EXTENDING THE MODEL OR PROGRAM

Now that I have the kinks sorted out, over the next few weeks I will be drastically increasing the scope of my program. The first step is to finish writing a basic 3-ply minimax search program with alpha-beta pruning. It will have access to, and will parse, the text files I have written that contain the initial heuristic function information; however, it will not be able to change those files. This program will be the control for my experiments because it will remain constant throughout the testing period. Writing the minimax algorithm should be relatively easy, since I have already written one for Othello and I have figured out how my code is going to be structured. After this is done, I will be able to implement temporal difference learning. Since it uses a fairly complicated mathematical equation, I am still figuring out the actual mechanics of how I am going to alter the weights of the heuristic function in the text files. I do, however, have a very clear vision of how I am going to test the effectiveness of the learning algorithm.

PROGRAMMING FEATURES

One thing that took me a while to come up with was the way I check for moves that would put one’s own king in check. I make a deep copy of the current board and apply the move under consideration to the copy, resulting in a board that is one move into the future.
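A minimal sketch of this copy-then-test approach, including the check test on the copied board, might look like the following. It assumes a toy board (a dict mapping (file, rank) to (color, kind)) that models only rooks and kings. The names rook_attacks, in_check, apply_move, and drop_self_check_moves are hypothetical; none of them come from chess.py or the Chessboard class.

```python
import copy

# Toy board: a dict mapping (file, rank) to (color, kind). Only rooks ("R")
# and kings ("K") are modeled; this is an assumption to keep the sketch short.


def rook_attacks(board, square, target):
    """True if a rook on `square` attacks `target` along a clear rank or file."""
    (f1, r1), (f2, r2) = square, target
    if square == target or (f1 != f2 and r1 != r2):
        return False
    step_f = (f2 > f1) - (f2 < f1)
    step_r = (r2 > r1) - (r2 < r1)
    f, r = f1 + step_f, r1 + step_r
    while (f, r) != target:
        if (f, r) in board:
            return False  # a piece blocks the line
        f, r = f + step_f, r + step_r
    return True


def in_check(board, color):
    """True if the king of `color` is attacked by an enemy rook or king."""
    king_sq = next(sq for sq, piece in board.items() if piece == (color, "K"))
    for sq, (owner, kind) in board.items():
        if owner == color:
            continue
        if kind == "R" and rook_attacks(board, sq, king_sq):
            return True
        if kind == "K" and max(abs(sq[0] - king_sq[0]),
                               abs(sq[1] - king_sq[1])) == 1:
            return True
    return False


def apply_move(board, start, target):
    """Move the piece on `start` to `target`, capturing whatever is there."""
    board[target] = board.pop(start)


def drop_self_check_moves(board, color, candidate_moves):
    """Keep only the moves that do not leave `color`'s own king in check."""
    safe = []
    for start, target in candidate_moves:
        future = copy.deepcopy(board)  # the real board is never modified
        apply_move(future, start, target)
        if not in_check(future, color):
            safe.append((start, target))
    return safe
```

With a white king on e1, a white rook on e2, and a black rook on e8, the white rook is pinned: a sideways move such as e2 to a2 is dropped, while moves along the e-file (including the capture on e8) are kept. Working on a deep copy means a rejected move never has to be undone on the real board.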
Then I run my check methods on that future board. If the king is in check, this information is passed back to the original board and the move is deleted from the list of valid moves.

RELATED MODELS

I’m borrowing many of the basic structural ideas, such as the minimax search and alpha-beta pruning, from the Othello project in AI.

CREDITS AND REFERENCES

1. Shannon, Claude E. “Programming a Computer for Playing Chess.” 1950.
2. Beal, D.F. and Smith, M.C. “Temporal Difference Learning for Heuristic Search and Game Playing.” 1999.
3. Moriarty, David E. and Miikkulainen, Risto. “Discovering Complex Othello Strategies Through Evolutionary Neural Networks.”
4. Huang, Shiu-li and Lin, Fu-ren. “Using Temporal-Difference Learning for Multi-Agent Bargaining.” 2007.
5. Russell, Stuart and Norvig, Peter. Artificial Intelligence: A Modern Approach. Second Edition. 2003.