Poker as a Testbed for Machine Intelligence Research
By Darse Billings, Dennis Papp, Jonathan Schaeffer, Duane Szafron
Presented by: Debraj Manna, Gada Kekin Dhiraj, Raunak Pillani
CONTENT

Introduction
Characteristics of Poker Game
Texas Hold’Em
Requirements From Players
Lokibot
Experiment
Future Work
INTRODUCTION

Game researchers have used chess and other board games as testbeds.
Poker can be a better testbed for decision-making problems.
POKER

Game of imperfect knowledge, involving:
Risk management
Agent modelling
Unreliable information
Deception

The heuristic search and evaluation methods employed in chess are not helpful here.
AI PROBLEM CHARACTERISTICS

General Application Problem | Problem Realization in Poker
Imperfect knowledge | Opponents' hands are hidden
Multiple competing agents | Many competing players
Risk management | Betting strategies and their consequences
Agent modeling | Identifying patterns in opponent's play and exploiting them
Deception | Bluffing and varying style of play
Unreliable information | Taking into account your opponents' deceptive plays
TEXAS HOLD 'EM

Pre-Flop – Each player is dealt two cards face down.
Community cards are dealt in 3 stages:
Flop – 3 cards are dealt face up.
Turn – the 4th community card is dealt face up.
River – the 5th and last community card is dealt face up.
A round of betting is held at each stage.
Showdown – the player holding the best 5-card hand wins.
BETTING STRATEGY

FOLD – Withdraw from the current hand
CALL – Match the current bet
RAISE – Increase the current outstanding bet

At most 3 raises are allowed in a betting round.
REQUIREMENT

Hand Strength – the strength of your hand compared to your opponents' hands.
Hand Potential – the probability of the hand improving as additional cards appear.
Betting Strategy – determining the optimal betting strategy.
Bluffing – allows you to make a profit even on weak hands.
REQUIREMENT (contd.)

Opponent Modeling – determining a probability distribution for each opponent's strategy.
Unpredictability – making it difficult for opponents to model your strategy.
Lokibot
(later changed to Pokibot)
Pre-flop Evaluation

There are {52 choose 2} = 1,326 possible combinations for the two starting cards.
The income rate of each starting hand is approximated by simulating 1,000,000 poker games against nine random opponents.
Highest income rate: a pair of aces
Lowest income rate: 2 and 7 (of different suits)
This is a one-time evaluation.
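
The count of starting hands can be checked with a few lines of Python. This is only an illustrative sketch, not Lokibot's code: the card encoding and the helper names `starting_hands` and `hand_class` are assumptions made for this example. It also groups the 1,326 hands into the standard classes (pairs, suited, offsuit) that a one-time pre-flop table would typically be indexed by; that grouping is an assumption here, not something the slides state.

```python
from itertools import combinations
from math import comb

RANKS = "23456789TJQKA"   # low to high
SUITS = "cdhs"            # clubs, diamonds, hearts, spades
DECK = [r + s for r in RANKS for s in SUITS]  # e.g. "Ah", "7c"

def starting_hands():
    """Return every unordered two-card starting hand from a 52-card deck."""
    return list(combinations(DECK, 2))

def hand_class(hand):
    """Map a two-card hand to a class label such as 'AA', 'AKs' or '72o'.

    Hypothetical helper: 's' marks suited hands, 'o' marks offsuit hands.
    """
    (r1, s1), (r2, s2) = hand
    # Put the higher rank first so that 'AK' and 'KA' share one label.
    hi, lo = sorted((r1, r2), key=RANKS.index, reverse=True)
    if hi == lo:
        return hi + lo
    return hi + lo + ("s" if s1 == s2 else "o")

hands = starting_hands()
classes = {hand_class(h) for h in hands}

print(len(hands), comb(52, 2))  # both are 1326
print(len(classes))             # 13 pairs + 78 suited + 78 offsuit = 169
```

The two extremes named above map to the classes `AA` (highest income rate) and `72o` (lowest income rate).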
Hand Evaluation

1. Hand Strength
An assessment of the current strength of the hand.
Enumeration techniques can provide an accurate estimate of the probability of currently holding the strongest hand.

2. Hand Potential
Potential changes in hand strength as further cards are dealt.
Hand Strength

Example: we hold an A-Q starting hand and the flop has been dealt (the card images for the hand and the flop are not preserved in this transcript).
There are 47 remaining unknown cards, so an opponent might hold any of {47 choose 2} = 1,081 possible hands.
Hand strength is estimated by simply counting the number of possible opponent hands that are:
better than ours (any pair, two pair, A-K, or three of a kind): 444 hands
equal to ours (the 9 possible remaining A-Q combinations): 9 hands
worse than ours: 628 hands
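
The counts can be turned into a single number as follows. This is a minimal sketch under two stated assumptions: ties are counted as half a win (the same convention the Ppot and Npot formulas use for tied cases), and hand strength against n opponents is approximated by raising the one-opponent value to the n-th power. The function names are hypothetical.

```python
def hand_strength(ahead, tied, behind):
    """Fraction of opponent hands we beat, counting each tie as half a win.

    ahead  -- number of opponent hands worse than ours
    tied   -- number of opponent hands equal to ours
    behind -- number of opponent hands better than ours
    """
    total = ahead + tied + behind
    return (ahead + tied / 2) / total

def hand_strength_n(hs, n_opponents):
    """Assumed adjustment for several opponents: treat them as independent."""
    return hs ** n_opponents

# Counts from the example: 628 worse hands, 9 equal hands, 444 better hands.
hs = hand_strength(ahead=628, tied=9, behind=444)
print(round(hs, 3))                      # 0.585
print(round(hand_strength_n(hs, 5), 3))  # 0.069
```

Under these assumptions the example hand is ahead of a single random opponent roughly 58.5% of the time, but ahead of five random opponents simultaneously only about 7% of the time.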
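Hand Potential

Hand strength alone is insufficient to assess the quality of a hand.
Example: a hand, a flop, and possible next cards (the card images are not preserved in this transcript).
Positive potential (Ppot): the chance that a hand that is currently behind pulls ahead.
Negative potential (Npot): the chance that a hand that is currently ahead falls behind.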
Hand Potential (contd.)

Rows: status after 5 cards. Columns: status after 7 cards.

5 cards \ 7 cards | Ahead | Tied | Behind | Sum
Ahead | 449,005 | 3,211 | 169,504 | 621,720 = 628x990
Tied | 0 | 8,370 | 540 | 8,910 = 9x990
Behind | 91,981 | 1,036 | 346,543 | 439,560 = 444x990
Sum | 540,986 | 12,617 | 516,587 | 1,070,190 = 1,081x990
Hand Potential (contd.)

If T{row,col} refers to the values in the table (B, T, A, and S are Behind, Tied, Ahead, and Sum, resp.) then Ppot and Npot are calculated by:
Ppot = ( T{B,A} + T{B,T}/2 + T{T,A}/2 ) / ( T{B,S} + T{T,S}/2 )
Npot = ( T{A,B} + T{A,T}/2 + T{T,B}/2 ) / ( T{A,S} + T{T,S}/2 )

Ppot = 0.208 and Npot = 0.274
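
The two formulas can be checked against the table values with a short script. This is a sketch for verification only; the nested-dictionary layout and the names `row_sum` and `potentials` are assumptions made for this example. The first key is the status after 5 cards (row) and the second key is the status after 7 cards (column).

```python
# T[row][col]: row = status after 5 cards, col = status after 7 cards.
# A = Ahead, T = Tied, B = Behind. Values are taken from the table.
T = {
    "A": {"A": 449_005, "T": 3_211, "B": 169_504},
    "T": {"A": 0,       "T": 8_370, "B": 540},
    "B": {"A": 91_981,  "T": 1_036, "B": 346_543},
}

def row_sum(table, row):
    """Sum of one row, i.e. the table's 'Sum' column T{row,S}."""
    return sum(table[row].values())

def potentials(table):
    """Return (Ppot, Npot) computed exactly as in the two formulas."""
    ppot = (table["B"]["A"] + table["B"]["T"] / 2 + table["T"]["A"] / 2) / (
        row_sum(table, "B") + row_sum(table, "T") / 2
    )
    npot = (table["A"]["B"] + table["A"]["T"] / 2 + table["T"]["B"] / 2) / (
        row_sum(table, "A") + row_sum(table, "T") / 2
    )
    return ppot, npot

ppot, npot = potentials(T)
print(round(ppot, 3), round(npot, 3))  # 0.208 0.274
```

The row sums also reproduce the table's Sum column: 621,720 for Ahead, 8,910 for Tied, and 439,560 for Behind.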
Betting Strategy

Hand strength and potential are combined into effective hand strength (EHS):
EHS = HSn + (1 - HSn) x Ppot
where HSn is the adjusted hand strength for n opponents and Ppot is the positive potential.

EHS estimates the probability that we are either ahead now (HSn), or behind now (1 - HSn) but pull ahead as further cards are dealt (Ppot).

Pot odds compare the cost of calling with the size of the pot after the call:
pot_odds = bets_to_us / ( bets_in_pot + bets_to_us )

Call when Ppot > pot_odds
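
As a rough illustration of how these quantities combine, the sketch below evaluates EHS and the calling rule. The function names and the bet sizes in the usage lines are hypothetical; the hand strength (0.585) and Ppot (0.208) are the example values from the earlier slides, used here for a single opponent.

```python
def effective_hand_strength(hs_n, ppot):
    """EHS = HSn + (1 - HSn) x Ppot."""
    return hs_n + (1 - hs_n) * ppot

def pot_odds(bets_to_us, bets_in_pot):
    """Cost of calling as a fraction of the pot after we call."""
    return bets_to_us / (bets_in_pot + bets_to_us)

def should_call(ppot, bets_to_us, bets_in_pot):
    """Calling rule from the slide: call when Ppot > pot_odds."""
    return ppot > pot_odds(bets_to_us, bets_in_pot)

ehs = effective_hand_strength(hs_n=0.585, ppot=0.208)
print(round(ehs, 3))  # 0.671

# Hypothetical pots: one bet to call into a pot of 5 bets, then of 3 bets.
print(should_call(0.208, bets_to_us=1, bets_in_pot=5))  # True  (odds 0.167)
print(should_call(0.208, bets_to_us=1, bets_in_pot=3))  # False (odds 0.25)
```

With the same Ppot, the larger pot justifies a call and the smaller one does not, which is the point of comparing potential against pot odds rather than against a fixed threshold.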
Experiment

Player A is the most advanced version of the program.
Player B lacks an appropriate weighting of subcases, using a uniform distribution for all possible opponent hands.
Player C uses a simplistic pre-flop hand selection method, rather than the advanced system which accounts for player position and number of opponents.
Player D lacks the computation of hand potential, which is used in modifying the effective hand strength and calling with proper pot odds.
Player E is a basic player.
Experiment (contd.)


The bot was also run against other poker-playing bots and human players over the Internet.
In its current state, the bot showed losses against advanced players.
Work In Progress

Lokibot is a predictable player: it reacts the same way in a given situation, irrespective of any historical information.
Opponent modeling: when Lokibot is better able to infer likely holdings for the opponent, it will be capable of much better decisions.
Betting strategy: bluff with high-potential hands and occasionally bet a strong hand weakly.
Work Done After The Paper

Later versions used simulation to discover the correct action to take: for each action Lokibot could choose, they simulated the resulting actions of the other players, as estimated by the opponent model.
They included selective sampling simulation. Opponent modelling consisted of weights for each hole-card combination describing the probabilities of each action (bet, call, fold), and opponents were measured by their rate of each action.
The most recent work has concerned other approaches to poker game-tree search methods, as well as ways to evaluate the performance of agents.
Contributions Of This Paper

Showing that poker can be a testbed of real-world decision making,
Identifying the major requirements of high-performance poker,
Presenting new enumeration techniques for hand-strength and potential, and
Demonstrating a working program that successfully plays "real" poker.
REFERENCE


Billings D., Papp D., Schaeffer J. and Szafron D. "Poker as a Testbed for Machine Intelligence Research." In Advances in Artificial Intelligence (Mercer R. and Neufeld E. eds.), Springer-Verlag, pp 1-15, 1998.
http://www.poker-academy.com