First Semester Project Summary Guideline
Name: James Mannion
Project: Temporal Difference Learning in Chess
Once I am done, my program will show that it is possible for a computer to learn an evaluation function
for chess. As of right now, I have two working programs: one that allows two humans to play a full
game of chess against each other and one that allows a human to play a full game of chess against a
computer player that picks its moves at random. I have begun implementing necessary aspects of my
heuristic function. I wrote a main text file that will store the heuristic function as it stands and that can
be written over, and I wrote text files for each piece-square table. Currently the piece-square tables
have all 1’s as their values. This is necessary to make sure that the program starts out not being very
effective; otherwise there would be no point in having it learn.
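A minimal sketch of how a piece-square table could be stored in a text file and read back is shown here. The file layout (eight lines of eight whitespace-separated numbers) and the function names are assumptions made for illustration; the actual format of my files may differ.

```python
from pathlib import Path


def write_uniform_table(path, value=1.0, size=8):
    """Write a size x size table in which every square has the same value."""
    row = " ".join(str(value) for _ in range(size))
    Path(path).write_text("\n".join(row for _ in range(size)) + "\n")


def read_table(path):
    """Parse a whitespace-separated table file into a list of rows of floats."""
    lines = Path(path).read_text().splitlines()
    return [[float(tok) for tok in line.split()] for line in lines if line.strip()]
```

Starting every table from the same value means the initial evaluation gives no preference to any square, which is what leaves room for learning.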
This project follows the rules of chess (except for slight variations I made to the stalemate rules for the
purpose of simplicity). It allows a human user to input moves directly into a terminal, and gives a text-based representation of a chessboard. Currently, a human can either play another human or a random
chess program. The Chessboard class checks to see whether a given move is legal by generating a list
of the desired piece’s legal moves (based on the rules of its movement) and seeing whether the desired
target is in that list of legal moves. If so, the program will make the move. If the move is not legal,
then the program will simply prompt the user for a different move.
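The "generate the list, then test membership" idea can be sketched for a single piece type. This example covers only a knight and ignores check; the names `knight_targets` and `is_legal` and the (file, rank) tuple representation are hypothetical, not taken from my Chessboard class.

```python
# The eight (file, rank) jumps a knight can make.
KNIGHT_OFFSETS = [(1, 2), (2, 1), (2, -1), (1, -2),
                  (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def knight_targets(square, own_squares):
    """List the squares a knight on `square` may move to.

    `square` is (file, rank) with both values in 0..7, and `own_squares`
    is the set of squares occupied by friendly pieces.
    """
    f, r = square
    targets = []
    for df, dr in KNIGHT_OFFSETS:
        nf, nr = f + df, r + dr
        on_board = 0 <= nf < 8 and 0 <= nr < 8
        if on_board and (nf, nr) not in own_squares:
            targets.append((nf, nr))
    return targets


def is_legal(square, target, own_squares):
    """A move is accepted only if the target is in the generated list."""
    return target in knight_targets(square, own_squares)
```

For example, `is_legal((1, 0), (2, 2), {(3, 1)})` asks whether a knight on b1 may go to c3 when a friendly piece stands on d2.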
To run a human vs. human game, the user simply needs to run in a command terminal. To run
a human vs. random AI game, the user needs to run Once in the game, the user enters their
moves into the terminal using the following syntax: *file letter of piece’s square**rank of piece’s square*
*file letter of target square**rank of target square*. So, the input “a2 a4” would move the piece at a2 to the
square a4, assuming this is a legal move.
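Parsing this input format could look roughly like the sketch below. The function names and the zero-based (file, rank) tuples are assumptions for illustration, not necessarily how my program represents squares.

```python
def parse_square(text):
    """Turn a square such as 'a2' into zero-based (file, rank) indices."""
    if len(text) != 2 or text[0] not in "abcdefgh" or text[1] not in "12345678":
        raise ValueError(f"not a square: {text!r}")
    return (ord(text[0]) - ord("a"), int(text[1]) - 1)


def parse_move(line):
    """Turn input such as 'a2 a4' into a (source, target) pair of squares."""
    parts = line.strip().lower().split()
    if len(parts) != 2:
        raise ValueError(f"expected two squares, got: {line!r}")
    return parse_square(parts[0]), parse_square(parts[1])
```

Raising `ValueError` on malformed input lets the game loop catch the error and prompt the user again, the same way it does for an illegal move.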
The text-based program shows the players when a king is in check and will not allow anyone to make
any illegal moves. This includes disallowing moves that would put your own king in check. Figuring
out how to code that part took me forever because it was very hard to visualize and also because it
required hours of testing/debugging to make sure that it worked in as many scenarios as I could throw at it.
Play a full game against the random computer to make sure there are no bugs.
As I said above, now that I have the kinks sorted out, over the next few weeks I will be drastically
increasing the scope of my program. The first step is to finish writing a basic 3-ply minimax search program with alpha-beta pruning. It will have access to and will parse the text files I have written that contain the
initial heuristic function information; however, it will not be able to change the files. This program will
be the control for my experiments because it will remain constant throughout the testing period.
Writing the minimax algorithm should be relatively easy since I have already written one for Othello,
and I have figured out how my code is going to be structured. After this is done, I will be able to
implement temporal difference learning. Since it uses a fairly complicated mathematical equation, I am
still figuring out the actual mechanics of how I am going to alter the weights of the heuristic function in
the text files. I have a very clear vision of how I am going to test the effectiveness of the learning
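Since I have not settled on the mechanics yet, the following is only a sketch of the general TD(lambda) weight update for an evaluation function that is a weighted sum of features. The function name, the parameter values, and the use of the final game result as the last target are assumptions for illustration, not the method my program will necessarily use.

```python
def td_lambda_update(weights, features, final_outcome, alpha=0.01, lam=0.7):
    """Return new weights after one game, using a linear evaluation V = w . x.

    `features` holds one feature vector per position in the game, in order.
    `final_outcome` is the value the last position is compared against.
    """
    # Evaluate every position with the current weights.
    values = [sum(w * x for w, x in zip(weights, xs)) for xs in features]
    values.append(final_outcome)
    # Temporal differences between successive evaluations.
    deltas = [values[t + 1] - values[t] for t in range(len(features))]
    new_weights = list(weights)
    for t, xs in enumerate(features):
        # Later differences count less, discounted by lam per step.
        future = sum(lam ** (k - t) * deltas[k] for k in range(t, len(deltas)))
        for i, x in enumerate(xs):
            # For a linear evaluation the gradient w.r.t. weight i is x.
            new_weights[i] += alpha * x * future
    return new_weights
```

The updated weights would then be written back over the values stored in the text files.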
One thing that took me a while to come up with was the way I check for moves that would put one’s
own king in check. I make a deep copy of the current board and apply the move that is being looked
at to the copy, resulting in a board that is one step into the future. Then I run my check
methods on that future board. If the king is in check, then this information is passed back to the original
board and that move is deleted from the list of valid moves.
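The copy-and-test approach can be sketched with a toy board. The dictionary representation, the function names, and the check test (which only detects rook attacks) are simplifications assumed for this example; my real check methods cover every piece.

```python
import copy


def apply_move(board, move):
    """Move the piece on the source square to the target square, in place."""
    src, dst = move
    board[dst] = board.pop(src)


def rook_attacks_king(board, color):
    """Toy check test: is `color`'s king attacked by an enemy rook?"""
    king = next(sq for sq, piece in board.items() if piece == color + "K")
    for df, dr in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
        f, r = king[0] + df, king[1] + dr
        while 0 <= f < 8 and 0 <= r < 8:
            piece = board.get((f, r))
            if piece is not None:
                if piece[0] != color and piece[1] == "R":
                    return True
                break  # any other piece blocks this line
            f, r = f + df, r + dr
    return False


def moves_not_leaving_king_in_check(board, moves, color,
                                    in_check=rook_attacks_king):
    """Keep only the moves after which `color`'s king is not in check."""
    safe = []
    for move in moves:
        future = copy.deepcopy(board)  # the original board is never modified
        apply_move(future, move)
        if not in_check(future, color):
            safe.append(move)
    return safe
```

Because each candidate move is tried on its own copy, a rejected move leaves no trace on the board the players actually see.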
I’m borrowing many of the basic structural ideas such as the minimax search and alpha-beta pruning
from the Othello project in AI.
Shannon, Claude E. “Programming a Computer for Playing Chess.” 1950.
Beal, D.F. and Smith, M.C. “Temporal Difference Learning for Heuristic Search and Game
Playing.” 1999.
Moriarty, David E. and Miikkulainen, Risto. “Discovering Complex Othello Strategies Through
Evolutionary Neural Networks.”
Huang, Shiu-li and Lin, Fu-ren. “Using Temporal-Difference Learning for Multi-Agent
Bargaining.” 2007.
Russell, Stuart and Norvig, Peter. Artificial Intelligence: A Modern Approach. Second Edition.