Artificial Intelligence
Adversarial Search (Game Playing 2)
Chapter 6 (6.5 – 6.8)
Games of Chance
Some games involve chance, for example:
– Roll of a die
– Spin of a game wheel
– Deal of cards from a shuffled deck
Stochastic Game Environments
Extend the game tree representation:
– Computer moves
– Opponent moves
– Chance nodes
Backgammon
Backgammon is a two-player game with uncertainty
Players roll dice to determine what moves to make
White has just rolled 5 and 6 and has four legal moves:
• 5-10, 5-11
• 5-11, 19-24
• 5-10, 10-16
• 5-11, 11-16
Such games are good for exploring decision making in adversarial problems involving skill and luck
Games with Chance
Chance nodes, in addition to the Max and Min nodes, influence the tree
– Branches indicate possible variations
– Each branch indicates the outcome and its likelihood
Rolling Dice 1)
36 ways to roll two dice
– The same likelihood for all of them
– Due to symmetry, there are only 21 distinct rolls
– Six doubles have a 1/36 chance
– The other fifteen have a 1/18 chance
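These probabilities are easy to verify by enumeration; a small Python sketch (illustrative only, not part of the slides):
```python
from fractions import Fraction

# Enumerate the rolls of two dice, collapsing (a, b) and (b, a) into one roll.
# Doubles occur one way (1/36); every other distinct roll occurs two ways (1/18).
roll_probs = {}
for d1 in range(1, 7):
    for d2 in range(1, 7):
        roll = (min(d1, d2), max(d1, d2))        # order of the dice does not matter
        roll_probs[roll] = roll_probs.get(roll, Fraction(0)) + Fraction(1, 36)

print(len(roll_probs))                           # 21 distinct rolls
print(roll_probs[(3, 3)], roll_probs[(5, 6)])    # 1/36 and 1/18
print(sum(roll_probs.values()))                  # probabilities sum to 1
```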
Game Trees with Chance Nodes
Chance nodes (shown as circles) represent the dice rolls
Each chance node has 21 distinct children, with a probability associated with each
We can use minimax to compute the values for the MAX and MIN nodes
[Figure: game tree with alternating Max, dice-roll (chance), and Min levels]
Game Trees with Chance Nodes
Use expected values for chance nodes
For chance nodes over a max node, as in C, we compute:
expectimax(C) = Σi P(di) * maxvalue(i)
For chance nodes over a min node we compute:
expectimin(C) = Σi P(di) * minvalue(i)
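A minimal sketch of this recursion in Python, assuming a hypothetical state interface (is_terminal, utility, to_move, successors, chance_outcomes) that the slides do not specify:
```python
def expectiminimax(state):
    """Expectiminimax value of a game state with MAX, MIN, and chance nodes.

    Assumes a hypothetical state interface (not defined in the slides):
      state.is_terminal() -> bool
      state.utility()     -> numeric payoff at terminal states
      state.to_move       -> "MAX", "MIN", or "CHANCE"
      state.successors()       -> iterable of successor states
      state.chance_outcomes()  -> iterable of (probability, successor) pairs
    """
    if state.is_terminal():
        return state.utility()
    if state.to_move == "CHANCE":
        # Expected value: weight each dice-roll branch by its probability.
        return sum(p * expectiminimax(s) for p, s in state.chance_outcomes())
    if state.to_move == "MAX":
        return max(expectiminimax(s) for s in state.successors())
    return min(expectiminimax(s) for s in state.successors())
```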
Decisions with Chance
The next step is to understand how to make correct decisions:
– We still want to pick the move that leads to the best position
– However, the resulting positions do not have definite minimax values
– Instead, we can only calculate the expected value, where the expectation is taken over all the possible dice rolls that could occur
Stochastic Game Environments
Weight each score by the probability that the move occurs
Use the expected value for a move: the probability-weighted sum over the possible random outcomes
Choose the move with the highest expected value
Decisions with Chance
This leads us to generalize the minimax value for deterministic games to an expectiminimax value for games with chance nodes
Terminal nodes and MAX and MIN nodes (for which the dice roll is known) work exactly the same way as before
Chance nodes are evaluated by:
– Taking the weighted average of the values resulting from all possible dice rolls
– The successor function for a chance node n simply augments the state of n with each possible dice roll to produce each successor s, and P(s) is the probability that that dice roll occurs
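Collecting these cases into one definition (a restatement of the bullets above, with Succ(n) denoting the successors of node n):
```latex
\mathrm{EXPECTIMINIMAX}(n) =
\begin{cases}
\mathrm{UTILITY}(n) & \text{if } n \text{ is a terminal node} \\
\max_{s \in \mathrm{Succ}(n)} \mathrm{EXPECTIMINIMAX}(s) & \text{if } n \text{ is a MAX node} \\
\min_{s \in \mathrm{Succ}(n)} \mathrm{EXPECTIMINIMAX}(s) & \text{if } n \text{ is a MIN node} \\
\sum_{s \in \mathrm{Succ}(n)} P(s)\,\mathrm{EXPECTIMINIMAX}(s) & \text{if } n \text{ is a chance node}
\end{cases}
```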
Properties of Expectiminimax
Complexity of O(b^m n^m)
– Where m is the search depth and n is the number of distinct chance events
– Example backgammon: n = 21, b ≈ 20
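For a rough sense of scale (an illustrative calculation, not from the slides): with b ≈ 20 and n = 21, searching to depth m = 3 already costs on the order of 20^3 · 21^3 ≈ 7.4 × 10^7 node evaluations, and m = 4 pushes this past 10^10, which is why deep look-ahead quickly becomes impractical.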
Evaluation Function
The evaluation function is used to evaluate the "goodness" of a game position
The zero-sum assumption allows us to use a single evaluation function to describe the goodness of a board with respect to both players:
– f(n) > 0: position n is good for me and bad for you
– f(n) < 0: position n is bad for me and good for you
– f(n) near 0: position n is a neutral position
– f(n) >> 0: win for me
– f(n) << 0: win for you
Stochastic Game Environments
Stochastic elements increase the branching factor
– 21 possible number rolls with 2 dice
– The value of look-ahead diminishes: as depth increases, the probability of reaching a particular node decreases
Alpha-beta pruning is less effective
The practical approximation with expectiminimax is to cut the search off at some point and apply an evaluation function to each leaf
Position Evaluation (with chance nodes)
Evaluation functions for games such as backgammon differ from evaluation functions for chess
The presence of chance nodes means that one has to be more careful about what the evaluation values mean
Evaluation Function
The evaluation function is a heuristic function, and it is where the domain experts' knowledge resides
Example of an evaluation function for Tic-Tac-Toe: f(n) = [# of 3-lengths open for me] – [# of 3-lengths open for you], where a 3-length is a complete row, column, or diagonal
For chess, f(n) = w(n)/b(n), where w(n) is the sum of the point values of white's pieces and b(n) is the sum for black
A program behaves totally differently with different evaluation functions, so the evaluation function must be specified carefully
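A concrete illustration of the Tic-Tac-Toe count above (the flat-list board encoding with 'X'/'O'/None is an assumed representation, not specified in the slides):
```python
# Tic-Tac-Toe evaluation from the slide:
# f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you].
# Board representation (assumed): a flat list of 9 cells holding 'X', 'O', or None.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def open_lines(board, player):
    """Count rows/columns/diagonals that contain no opponent piece."""
    opponent = 'O' if player == 'X' else 'X'
    return sum(1 for line in LINES
               if all(board[i] != opponent for i in line))

def evaluate(board, me='X'):
    you = 'O' if me == 'X' else 'X'
    return open_lines(board, me) - open_lines(board, you)

print(evaluate([None] * 9))                                    # empty board: 8 - 8 = 0
print(evaluate(['X' if i == 4 else None for i in range(9)]))   # X in the centre: 8 - 4 = 4
```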
Evaluation Function
The board evaluation function estimates how good the current board state is for the computer
It is a heuristic function of the features of the board
– i.e. function(feat1, feat2, feat3, …, featn)
Example features for chess are piece count, piece placement, squares controlled, etc.
A linear evaluation function of the features is a weighted sum of feat1, feat2, feat3, …, featn
Linear Evaluation Functions
Most evaluation functions are specified as a weighted sum of position features:
f(n) = w1*feat1(n) + w2*feat2(n) + ... + wk*featk(n)
– feat1, feat2, …, featk are the features
– w1, w2, …, wk are their weights
More important features get more weight
Deep Blue has about 8,000 features in its evaluation function
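A minimal sketch of such a weighted sum (the two toy features, their weights, and the position encoding below are made-up placeholders, not Deep Blue's):
```python
# A linear evaluation function f(n) = w1*feat1(n) + ... + wk*featk(n).

def linear_eval(position, features, weights):
    """Weighted sum of feature values for a position."""
    return sum(w * feat(position) for w, feat in zip(weights, features))

features = [
    lambda pos: pos["my_pieces"] - pos["their_pieces"],                  # material balance
    lambda pos: pos["my_center_squares"] - pos["their_center_squares"],  # placement
]
weights = [1.0, 0.25]   # material weighted more heavily than placement

position = {"my_pieces": 14, "their_pieces": 13,
            "my_center_squares": 3, "their_center_squares": 1}
print(linear_eval(position, features, weights))   # 1.0*1 + 0.25*2 = 1.5
```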
Linear Evaluation Functions
The quality of play depends directly on the quality of the evaluation function
To build an evaluation function we have to:
– Construct good features using expert knowledge of the game
– Choose good weights… or learn them
Learning Checkers
A. L. Samuel, "Some Studies in Machine Learning Using the Game of Checkers," IBM Journal of Research and Development, 3(3):210–229, 1959
Learned linear weights by playing copies of itself thousands of times
Used only an IBM 704 with 10,000 words of RAM, magnetic tape, and a clock speed of about 1 kHz
Successful enough to be competitive in human tournaments
Learning Backgammon
G. Tesauro and T. J. Sejnowski, "A Parallel Network that Learns to Play Backgammon," Artificial Intelligence, 39(3):357–390, 1989
Also learned by playing copies of itself
Used a non-linear evaluation function: a neural network
Rates among the top three players in the world
AI in Modern Computer Games
Genetic algorithms and genetic programming have been used and shown some success in "evolving" realistically-acting agents for games
– Certainly appropriate for "Sim"-type games
Learning Weights
How can we learn the weights for a linear evaluation function?
Play lots of games against an opponent!
– For every move (or game), error = true outcome - evaluation function
– If error is positive (underestimating), adjust weights to increase the evaluation function
– If error is zero, do nothing
– If error is negative (overestimating), adjust weights to decrease the evaluation function
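A minimal sketch of this update rule for a linear evaluation function (the learning rate alpha and the example numbers are illustrative assumptions, not from the slides):
```python
# Nudge each weight so the evaluation moves toward the true game outcome.

def update_weights(weights, feature_values, true_outcome, alpha=0.01):
    """One gradient-style update for a linear evaluation function."""
    evaluation = sum(w * f for w, f in zip(weights, feature_values))
    error = true_outcome - evaluation        # > 0: underestimate, < 0: overestimate
    return [w + alpha * error * f for w, f in zip(weights, feature_values)]

weights = [0.5, -0.2]
weights = update_weights(weights, feature_values=[1.0, 2.0], true_outcome=1.0)
print(weights)   # each weight nudged so the evaluation moves toward +1
```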
Games and Computers
State of the art for some game programs:
– Chess
– Checkers
– Othello
– Backgammon
– Go
Chess: IBM's Deep Blue
Kasparov vs. Deep Blue, May 1997
– A parallel computer with 30 IBM RS/6000 processors running the "software search"
– 480 custom VLSI chess processors that performed move generation (including move ordering)
– Deep Blue searched 126 million nodes per second on average, with a peak speed of 330 million nodes per second
– It generated up to 30 billion positions per move, reaching depth 14 routinely
– The heart of the machine is a standard iterative-deepening alpha-beta search with a transposition table and very sophisticated heuristics
Chess: IBM's Deep Blue
In some cases the search reached a depth of 40 plies. The evaluation function had over 8,000 features, many of them describing highly specific patterns of pieces.
Massive database with games from the literature
– Database of 700,000 grandmaster games from which consensus recommendations could be extracted
The system also used a large endgame database of solved positions
Note: Deep Blue still searches "brute force," and still plays with little in common with the intuition and strategy humans use
Checkers 1)
Chinook becomes world champion in 1994
Chinook runs on regular PCs, alpha-beta search, and an end-game database for six-piece positions
Othello 1)
Many programs play far better than humans
– Smaller search space than chess
– Little evaluation expertise available
Backgammon 1)
Neural-network based program ranks among the best players in the world
– Improves its own evaluation function through learning techniques
– Systematic search-based methods are practically hopeless
• Chance elements, branching factor
Go
Go is the most popular board game in Asia, requiring at least as much discipline from its professionals as chess
Humans play far better
– Large branching factor (around 360)
• Daunting for systematic search methods
Up to 1997 there were no competent programs at all, but now programs often play respectable moves
Go
Combine pattern recognition techniques (successor(pattern of pieces, move))
Limited search (decide whether these pieces can be captured, staying within the local area)
Go is an area that is likely to benefit from intensive investigation using more sophisticated reasoning methods
References
1) Franz J. Kurfess, Associate Professor, Computer Science Department, California Polytechnic State University, USA
Summary
In many games, there is a degree of unpredictability through random elements
– Throwing dice, card distribution, roulette wheel, …
This requires chance nodes in addition to the Max and Min nodes
– Branches indicate possible variations
– Each branch indicates the outcome and its likelihood
Summary
The evaluation value of a position depends on the random element
– The definite minimax value must be replaced by an expected value
Calculation of expected values:
– Utility function for terminal nodes
– For all other nodes:
• Calculate the value for each chance event
• Weigh it by the chance that the event occurs
• Add up the individual weighted values
Summary
Evaluation functions estimate the quality of a given board configuration for each player
– Negative: good for the opponent
– Positive: good for the computer
– Near 0: neutral
For many well-known games, computer algorithms using heuristic search can match or out-perform human world experts
Summary
An evaluation function gives an estimate of the utility of a state when a complete search is impractical
State of the art for some game programs:
– Chess
– Checkers
– Othello
– Backgammon
– Go
Next Time!
Logical Agents
Chapter 7