Download State-Space Search for Planning: Algorithms (Invited Lecture)

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
State-Space Search for Planning: Algorithms
(Invited Lecture)
Héctor Geffner
ICREA and Universitat Pompeu Fabra
Barcelona, Spain
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
1
State Models
A classical planner is a solver for state-models of the form:
1.
2.
3.
4.
5.
6.
state space S
initial state s0 ∈ S
goal states SG ⊆ S
actions A(s) applicable in each s
transition function s0 = f (a, s), a ∈ A(s)
(Optional) Action Costs c(a, s) of doing a in s
• Models expressed in compact form through suitable languages
• A solution (plan) is an applicable action sequence a0, . . . , an that maps the
initial state s0 into a goal state sn+1 ∈ SG; si+1 = f (ai, si) and ai ∈ A(si)
P
• Optimal plans minimize the sum of action costs i=0,n c(ai, si)
• Other forms of planning involve slightly different models: incomplete info,
probabilistic planning, . . . .
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
2
The Strips Language (Fikes and Nilsson 71)
• Strips is one of the oldest, simplest, and most used planning languages, even
if not too flexible
• A (ground) problem in Strips is a tuple hA, O, I, Gi where
–
–
–
–
A stands for set of all atoms (boolean vars)
O stands for set of all operators (ground actions)
I ⊆ A stands for initial situation
G ⊆ A stands for goal situation
• Operators o ∈ O represented by three lists
– the Add list Add(o) ⊆ A
– the Delete list Del(o) ⊆ A
– the Precondition list P re(o) ⊆ A
• Intuitively, P re(o) encodes the atoms that must be true for o to be applicable,
Add(o) encodes the atoms that o makes true, and Del(o) encodes the atoms
that o makes false
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
3
From Language to Models: Semantics of Strips
Strips problem P = hA, O, I, Gi determines state model S(P )
• the states s ∈ S are collections of atoms
• the initial state s0 is I
• the goal states s ∈ SG are such that G ⊆ s
• the actions in s are the op ∈ O s.t. P rec(op) ⊆ s
• the state that results from doing op in s is s0 = s − Del(op) + Add(op)
• action costs c(op, s) are all 1
The (optimal) solution of planning problem P is the (optimal) solution of State
Model S(P )
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
4
Example: Blocks in Strips (PDDL Syntax)
(define (domain BLOCKS)
(:requirements :strips) ...
(:action pick_up
:parameters (?x)
:precondition (and (clear ?x) (ontable ?x) (handempty))
:effect (and (not (ontable ?x)) (not (clear ?x)) (not (handempty)) (hol
(:action put_down
:parameters (?x)
:precondition (holding ?x)
:effect (and (not (holding ?x)) (clear ?x) (handempty) (ontable ?x)))
(:action stack
:parameters (?x ?y)
:precondition (and (holding ?x) (clear ?y))
:effect (and (not (holding ?x)) (not (clear ?y)) (clear ?x)(handempty)
(on ?x ?y))) ...
(define (problem BLOCKS_6_1)
(:domain BLOCKS)
(:objects F D C E B A)
(:init (CLEAR A) (CLEAR B) ... (ONTABLE B) ... (HANDEMPTY))
(:goal (AND (ON E F) (ON F C) (ON C B) (ON B A) (ON A D))))
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
5
Example: Logistics in Strips/PDDL
(define (domain logistics)
(:requirements :strips :typing :equality)
(:types airport - location truck airplane - vehicle vehicle packet - thing thin
(:predicates (loc-at ?x - location ?y - city) (at ?x - thing ?y - location) (in ?x
(:action load
:parameters (?x - packet ?y - vehicle)
:vars (?z - location)
:precondition (and (at ?x ?z) (at ?y ?z))
:effect (and (not (at ?x ?z)) (in ?x ?y)))
(:action unload ..)
(:action drive
:parameters (?x - truck ?y - location)
:vars (?z - location ?c - city)
:precondition (and (loc-at ?z ?c) (loc-at ?y ?c) (not (= ?z ?y)) (at ?x ?z))
:effect (and (not (at ?x ?z)) (at ?x ?y)))
...
(define (problem log3_2)
(:domain logistics)
(:objects packet1 packet2 - packet truck1 truck2 truck3 - truck airplane1 - airp
(:init (at packet1 office1) (at packet2 office3) ...)
(:goal (and (at packet1 office2) (at packet2 office2))))
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
6
Computation: How to solve Strips planning problems?
• Partial Order (POCL) Planning (70’s, 80’s): work on any subgoal, resolve
threats; UCPOP 1992
• Graphplan (1995 – . . . ): build graph containing all possible parallel plans up
to certain length; then extract plan by searching the graph backward from
Goal
• SatPlan (1996 – . . . ): map planning problem given horizon into SAT problem;
use state-of-the-art SAT solver
• Heuristic Search Planning (1996 – . . . ): search state space S(P ) with heuristic function h extracted from problem P
• Model Checking Planning (1998 – . . . ): search state space S(P ) with ‘symbolic’ BrFS where sets of states represented by formulas implemented by
BDDs
• ...
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
7
Planning as State-Space Search: Progression
• Most basic approach to planning: reduces the search for plans to a search
for a path in a directed graph.
• In progression or forward search, the nodes in the graph are sets of atoms
representing states in S(P ), and
–
–
–
–
root node is I (the initial state)
goal nodes are the sets of atoms σ such that G ⊆ σ (goal states)
edges s → s0 iff s0 = s − Del(a) + Add(a) and P re(a) ⊆ s
edge costs c(s, s0) = c(a, s)
• This graph also called the forward or progression search space is the simplest
search space (isomorphic to S(P )) but not the only one
• Other search spaces also used: regression space, plan space, etc
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
8
Planning as State-Space Search: Regression
In regression or backwards search, the nodes in the graph are still sets of
atoms but they represent sets of states in S(P ), and
• root node σ0 is the goal G
• goal nodes are the sets of atoms σ such that σ ⊆∈ I
• edge σ → σ 0 iff σ 0 = σ − Add(a) + P re(a), provided Add(a) ∩ σ 6= ∅, and
Del(a) ∩ σ 0 = ∅
• edge costs c(σ, σ 0) = c(a, s)
Correctness/Completeness: The paths connecting the initial node to a goal
node in this graph (regression space) encode the plans for P that map the initial
state into a goal state (in reverse)
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
9
Computation: How to search for paths in these graphs?
• Blind search/Brute force algorithms
– Goal plays passive role in the search
e.g., Depth First Search (DFS), Breadth-first search (BrFS), Uniform Cost
(Dijkstra), Iterative Deepening (ID)
• Informed/Heuristic Search Algorithms
– Search uses a function h(s) that estimates ‘distance’ (cost) from state s to
SG to guide search
e.g., A*, IDA*, Hill Climbing, Best First Search (BFS), Branch & Bound
The latter can be much more effective: the question is how to get informed
heuristic functions h(s) quickly and automatically . . .
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
10
Heuristic Functions
• Heuristics derived as optimal cost function of relaxed problems
• Simple relaxations used in Planning
–
–
–
–
ignore delete lists
reduce large goal sets to subsets
ignore certain atoms
...
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
11
Additive Heuristic
• For all atoms p:
def
g(p; s) =
0
if p ∈ s, else
mina∈O(p)[1 + g(P rec(a); s)]
• For sets of atoms C, assume also independence:
def
g(C; s) =
X
g(r; s)
r∈C
• Resulting heuristic function hadd(s):
def
hadd(s) = g(Goals; s)
Heuristic not admissible but informative and fast
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
12
Max Heuristic
• For all atoms p:
def
g(p; s) =
0
if p ∈ s, else
mina∈O(p)[1 + g(P rec(a); s)]
• For sets of atoms C, replace sum by max
def
g(C; s) = maxr∈C g(r; s)
• Resulting heuristic function hmax(s):
def
hmax(s) = g(Goals; s)
Heuristic admissible but not very informative . . .
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
13
Max Heuristic and Planning Graphs
• Build reachability graph P0, A0, P1, A1, . . .
P0 = {p ∈ s}
Ai = {a ∈ O | P rec(a) ⊆ Pi}
Pi+1 = Pi ∪ {p ∈ Add(a) | a ∈ Ai}
...
...
P0
A0
P1
A1
...
– Graph implicitly represents max heuristic:
hmax(s) = min i such that G ⊆ Pi
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
14
Some Heuristic Search Planners
• Graphplan, 1995: makes plan graph heuristic hG more informed by keeping
track of mutex relations among pairs of atoms and actions in layered graph.
It uses this heuristic in a IDA* regression search from the goal
• HSP, 1998: it does a WA* progression search guided by additive heuristic
computed from scratch for every visited state s
• FF: 2000 it does a hill-climbing progression search guided by heuristic given
by the number of actions in ’relaxed plan’ (set of actions responsible for presence of goals in lowest layer).
• TEXT: Malik Ghallab et. al. Automated Planning: Theory & Practice. Morgan Kaufmann 2004.
• Graphplan: Blum, A., and Furst, M. Fast planning through planning graph analysis. In Proceedings of IJCAI-95,
1636–1642. 1995
• HSP: Bonet, B., and Geffner, H. Planning as heuristic search. Artificial Intelligence 129(1–2):5–33. 2001.
• FF: Hoffmann, J., and Nebel, B. The FF planning system: Fast plan generation through heuristic search. JAIR
2001:253–302.
Hector Geffner, State-Space Planning, Edinburgh, 10/2007
15