Phase transition behaviour
Toby Walsh
Dept of CS
University of York
Outline
• What have phase transitions to do with computation?
• How can you observe such behaviour in your favourite problem?
• Is it confined to random and/or NP-complete problems?
• Can we build better algorithms using knowledge about phase transition behaviour?
• What open questions remain?
Health warning
• To aid the clarity of my exposition, credit may not always be given where it is due
• Many active researchers in this area: Achlioptas, Chayes, Dunne, Gent, Gomes, Hogg, Hoos, Kautz, Mitchell, Prosser, Selman, Smith, Stergiou, Stutzle, … Walsh
Before we begin
A little history ...
Where did this all start?
• At least as far back as the 60s with Erdős & Rényi
  - thresholds in random graphs
• Late 80s
  - pioneering work by Karp, Purdom, Kirkpatrick, Huberman, Hogg, …
• Flood gates burst
  - Cheeseman, Kanefsky & Taylor's IJCAI-91 paper
In '91, I had just finished my PhD and was looking for some new research topics!
Phase transitions
Enough of the history, what has this got to do with computation?
Ice melts. Steam condenses. Now that's a proper phase transition ...
An example phase transition
• Propositional satisfiability (SAT)
  - does a truth assignment exist that satisfies a propositional formula?
  - NP-complete
    (x1 v x2) & (-x2 v x3 v -x4)
    x1/True, x2/False, ...
• 3-SAT
  - formulae in clausal form with 3 literals per clause
  - remains NP-complete
Random 3-SAT
• Random 3-SAT
  - sample uniformly from the space of all possible 3-clauses
  - n variables, l clauses
• Which are the hard instances?
  - around l/n = 4.3
What happens with larger problems?
Why are some dots red and others blue?
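As a concrete illustration, here is a minimal Python sketch of this ensemble (my own code, not from the talk; the function name and defaults are assumptions): each of the l clauses picks 3 distinct variables and negates each with probability 1/2.

    import random

    def random_3sat(n, l, rng=random):
        # Sample l clauses uniformly: each picks 3 distinct variables,
        # and each literal is negated with probability 1/2.
        clauses = []
        for _ in range(l):
            variables = rng.sample(range(1, n + 1), 3)
            clauses.append([v if rng.random() < 0.5 else -v for v in variables])
        return clauses

    # the hard region is around l/n = 4.3:
    instance = random_3sat(n=100, l=430)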
Random 3-SAT
• Varying problem size, n
• Complexity peak appears to be largely invariant of algorithm
  - backtracking algorithms like Davis-Putnam
  - local search procedures like GSAT
What's so special about 4.3?
Random 3-SAT
• Complexity peak coincides with the solubility transition
  - l/n < 4.3: problems under-constrained and SAT
  - l/n > 4.3: problems over-constrained and UNSAT
  - l/n = 4.3: problems on a "knife-edge" between SAT and UNSAT
"But it doesn't occur in X?"
• X = some NP-complete problem
• X = real problems
• X = some other complexity class
Little evidence yet to support any of these claims!
"But it doesn't occur in X?"
• X = some NP-complete problem
• Phase transition behaviour seen in:
  - the TSP (decision, not optimization)
  - Hamiltonian circuits (but NOT a complexity peak)
  - number partitioning
  - graph colouring
  - independent set
  - ...
"But it doesn't occur in X?"
• X = real problems
  - no, you just need a suitable ensemble of problems to sample from!
• Phase transition behaviour seen in:
  - job shop scheduling problems
  - TSP instances from TSPLib
  - exam timetables @ Edinburgh
  - Boolean circuit synthesis
  - Latin squares (alias sports scheduling)
  - ...
"But it doesn't occur in X?"
• X = some other complexity class
  - ignoring trivial cases (like O(1) algorithms)
• Phase transition behaviour seen in:
  - polynomial problems like arc-consistency
  - PSPACE problems like QSAT and modal K
  - ...
"But it doesn't occur in X?"
• X = theorem proving
• Consider k-colouring planar graphs
  - k=3: simple counter-example
  - k=4: large proof
  - k=5: simple proof (in fact, a false proof of the k=4 case)
Locating phase transitions
How do you identify phase transition behaviour in your favourite problem?
What's your favourite problem?
• Choose a problem
  - e.g. number partitioning: dividing a bag of numbers into two so their sums are as balanced as possible
• Construct an ensemble of problem instances
  - n numbers, each uniformly chosen from (0, l]
  - other distributions work (Poisson, …)
Number partitioning
• Identify a measure of constrainedness
  - more numbers => less constrained
  - larger numbers => more constrained
  - could try some measures out at random (l/n, log(l)/n, log(l)/sqrt(n), …)
• Better still, use kappa!
  - an (approximate) theory of constrainedness
  - based upon some simplifying assumptions, e.g. it ignores structural features that cluster solutions together
Theory of constrainedness
• Consider the state space searched
  - e.g. the 10-d hypercube of 2^10 truth assignments for a 10-variable SAT problem
• Compute the expected number of solutions, <Sol>
  - independence assumptions often useful and harmless!
Theory of constrainedness
• Constrainedness is given by:
    kappa = 1 - log2(<Sol>)/n
  where n is the dimension of the state space
• kappa lies in the range [0, infty):
  - kappa = 0:     <Sol> = 2^n, under-constrained
  - kappa = infty: <Sol> = 0,   over-constrained
  - kappa = 1:     <Sol> = 1,   critically constrained (the phase boundary)
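As a worked instance of this definition (my own derivation, using the independence assumption above): a random 3-clause rules out exactly 1 of the 8 assignments to its three variables, so for random 3-SAT

    <Sol> = 2^n · (7/8)^l
    kappa = 1 - log2(<Sol>)/n = (l/n) · log2(8/7) ≈ l/5.19n

which matches the kappa = l/5.2n quoted for 3-SAT on the "Examples of kappa" slide below.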
Phase boundary
• Markov inequality
  - prob(Sol >= 1) <= <Sol>
  - now, kappa > 1 implies <Sol> < 1
  - hence, kappa > 1 implies prob(Sol >= 1) < 1
• Phase boundary typically at values of kappa slightly smaller than kappa = 1
  - skew in the distribution of solutions (e.g. 3-SAT)
  - non-independence
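To spell out the middle step (my rearrangement of the definition above): inverting kappa = 1 - log2(<Sol>)/n gives

    <Sol> = 2^(n(1 - kappa))

so kappa > 1 forces <Sol> < 1, and the Markov inequality then gives prob(Sol >= 1) <= <Sol> < 1; for fixed kappa > 1 the expectation, and with it the probability of solubility, vanishes exponentially in n.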
Examples of kappa
• 3-SAT
  - kappa = l/5.2n
  - phase boundary at kappa = 0.82
• 3-COL
  - kappa = e/2.7n
  - phase boundary at kappa = 0.84
• number partitioning
  - kappa = log2(l)/n
  - phase boundary at kappa = 0.96
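Restating those formulas as trivial Python helpers (the constants are the slide's; the function names are mine):

    import math

    def kappa_3sat(l, n):       # l clauses, n variables
        return l / (5.2 * n)

    def kappa_3col(e, n):       # e edges, n nodes
        return e / (2.7 * n)

    def kappa_partition(l, n):  # n numbers drawn from (0, l]
        return math.log2(l) / n

    # at the observed random 3-SAT transition l/n = 4.3:
    print(kappa_3sat(430, 100))  # ~0.83, near the quoted boundary of 0.82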
Number partition phase transition
Prob(perfect partition) against kappa
Finite-size scaling
• A simple "trick" from statistical physics
  - around the critical point, problems are indistinguishable except for a change of scale given by a simple power law
• Define the rescaled parameter
    gamma = ((kappa - kappac)/kappac) · n^(1/v)
  - estimate kappac and v empirically
  - e.g. for number partitioning, kappac = 0.96, v = 1
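In code, the rescaling is a one-liner (a sketch under the slide's definitions; the defaults are the number-partitioning estimates quoted above):

    def rescale(kappa, n, kappa_c=0.96, v=1.0):
        # gamma = (kappa - kappa_c)/kappa_c * n^(1/v)
        return (kappa - kappa_c) / kappa_c * n ** (1.0 / v)

Plotting Prob(perfect partition) against gamma rather than kappa should then collapse the curves for different n onto one another, as the next slide shows.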
Rescaled phase transition
Prob(perfect partition) against gamma
Rescaled search cost
Optimization cost against gamma
Easy-Hard-Easy?
• Search cost only easy-hard here?
  - this is optimization, not decision, search cost!
  - easy if there are a (large number of) perfect partitions
  - otherwise little pruning (search scales as 2^(0.85n))
• Phase transition behaviour less well understood for optimization than for decision
  - sometimes optimization = a sequence of decision problems (e.g. branch & bound)
  - BUT lots of subtle issues lurking?
Algorithms at the phase boundary
What do we understand about problem hardness at the phase boundary?
How can this help build better algorithms?
Looking inside search
• Three key insights
  - the constrainedness "knife-edge"
  - backbone structure
  - 2+p-SAT
• Suggests branching heuristics
  - also gives insight into branching mistakes
Inside SAT phase transition
• Random 3-SAT, l/n = 4.3
• Davis-Putnam algorithm
  - tree search through the space of partial assignments
  - unit propagation
• Clause-to-variable ratio l/n drops as we search
  => problems become less constrained
Aside: can anyone explain this simple scaling?
l/n against depth/n
Inside SAT phase transition
• But (average) clause length k also drops
  => problems become more constrained
• Which factor wins, l/n or k?
  - look at kappa, which includes both!
Aside: why is there again such simple scaling?
Clause length k against depth/n
Constrainedness knife-edge
kappa against depth/n
Constrainedness knife-edge
• Seen in other problem domains
  - number partitioning, …
• Seen on "real" problems
  - exam timetabling (alias graph colouring)
• Suggests a branching heuristic
  - "get off the knife-edge as quickly as possible"
  - minimize- or maximize-kappa heuristics
  - these must take the branching rate into account; maximizing kappa is therefore often not a good move!
Minimize constrainedness
• Many existing heuristics minimize kappa
  - or proxies for it
• For instance
  - Karmarkar-Karp heuristic for number partitioning (sketched below)
  - Brelaz heuristic for graph colouring
  - fail-first heuristic for constraint satisfaction
  - …
• Can be used to design new heuristics
  - removing some of the "black art"
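For illustration, a minimal Python sketch of the Karmarkar-Karp differencing heuristic (my own code; the talk only names the heuristic): it repeatedly replaces the two largest numbers with their difference, implicitly committing them to opposite sides of the partition.

    import heapq

    def karmarkar_karp(numbers):
        # Differencing: repeatedly replace the two largest numbers by
        # their difference (committing them to opposite sides).
        heap = [-x for x in numbers]   # max-heap via negation
        heapq.heapify(heap)
        while len(heap) > 1:
            a, b = -heapq.heappop(heap), -heapq.heappop(heap)
            heapq.heappush(heap, -(a - b))
        return -heap[0] if heap else 0  # final partition difference

    print(karmarkar_karp([8, 7, 6, 5, 4]))  # 2; the optimum here is 0 ({8,7} v {6,5,4})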
Backbone
• Variables which take fixed values in all solutions
  - alias unit prime implicates
• Let fk be the fraction of variables in the backbone
  - l/n < 4.3: fk vanishing (otherwise adding a clause could make the problem unsat)
  - l/n > 4.3: fk > 0
  - a discontinuity at the phase boundary!
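For intuition, a brute-force Python sketch (mine, workable only for tiny n) that computes the backbone by enumerating all assignments; clauses use the signed-integer format of the earlier generator:

    from itertools import product

    def backbone(n, clauses):
        # Enumerate all 2^n assignments (tiny n only!) and return the
        # variables fixed to the same value in every satisfying assignment.
        models = [bits for bits in product([False, True], repeat=n)
                  if all(any(bits[abs(v) - 1] == (v > 0) for v in c)
                         for c in clauses)]
        if not models:
            return None   # unsatisfiable: backbone undefined
        return {i + 1: models[0][i] for i in range(n)
                if all(m[i] == models[0][i] for m in models)}

    print(backbone(2, [[1], [1, 2]]))   # {1: True}: x1 is backboned, x2 is free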
Backbone
• Search cost correlated with backbone size
  - if fk is non-zero, then we can easily assign a variable the "wrong" value
  - such mistakes are costly if made at the top of the search tree
• Backbones seen in other problems
  - graph colouring
  - TSP
  - …
Can we make algorithms that identify and exploit the backbone structure of a problem?
2+p-SAT
• Morph between 2-SAT and 3-SAT
  - fraction p of 3-clauses
  - fraction (1-p) of 2-clauses
• 2-SAT is polynomial (linear)
  - phase boundary at l/n = 1
  - but no backbone discontinuity there!
• 2+p-SAT maps from P to NP
  - for p > 0, 2+p-SAT is NP-complete
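A minimal generator for this ensemble (my own sketch; here each clause is independently a 3-clause with probability p, which approximates the exact fraction-p split for large l):

    import random

    def random_2p_sat(n, l, p, rng=random):
        # Each of the l clauses is a 3-clause with probability p,
        # and a 2-clause with probability 1-p.
        clauses = []
        for _ in range(l):
            k = 3 if rng.random() < p else 2
            variables = rng.sample(range(1, n + 1), k)
            clauses.append([v if rng.random() < 0.5 else -v for v in variables])
        return clauses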
2+p-SAT
• fk only becomes discontinuous above p = 0.4
  - but the problem is NP-complete for all p > 0!
• search cost shifts from linear to exponential at p = 0.4
• recent work on backbone fragility
Search cost against n
Structure
Can we model structural features not found in uniform random problems?
How does such structure affect our algorithms and phase transition behaviour?
The real world isn't random?
• Very true! Can we identify structural features common in real world problems?
• Consider graphs met in real world situations
  - social networks
  - electricity grids
  - neural networks
  - ...
Real versus Random
• Real graphs tend to be sparse
  - dense random graphs contain lots of (rare?) structure
• Real graphs tend to have short path lengths
  - as do random graphs
• Real graphs tend to be clustered
  - unlike sparse random graphs
Measures: L, the average path length; C, the clustering coefficient (fraction of neighbours connected to each other, a cliqueness measure); mu, the proximity ratio, C/L normalized by that of a random graph of the same size and density.
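A sketch of this proximity ratio in Python, assuming the networkx library and a connected input graph dense enough that the random samples have non-zero clustering (the function name, trial count, and averaging scheme are my choices):

    import networkx as nx

    def proximity_ratio(G, trials=20, seed=0):
        # mu = (C/L) for G, normalised by the same ratio averaged over
        # random graphs of the same size and density.
        C = nx.average_clustering(G)
        L = nx.average_shortest_path_length(G)
        c_rand, l_rand = [], []
        for t in range(trials):
            R = nx.gnm_random_graph(G.number_of_nodes(),
                                    G.number_of_edges(), seed=seed + t)
            if nx.is_connected(R):   # path length undefined otherwise
                c_rand.append(nx.average_clustering(R))
                l_rand.append(nx.average_shortest_path_length(R))
        return (C / L) / ((sum(c_rand) / len(c_rand)) /
                          (sum(l_rand) / len(l_rand)))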
Small world graphs
• Sparse, clustered, short path lengths
• Six degrees of separation
  - Stanley Milgram's famous 1967 postal experiment
  - recently revived by Watts & Strogatz
  - shown to apply to:
    - the actors database
    - the US electricity grid
    - the neural net of a worm
    - ...
An example
• The 1994 exam timetable at Edinburgh University
  - 59 nodes, 594 edges, so relatively sparse
  - but contains a 10-clique
• Less than a 10^-10 chance in a random graph
  - assuming the same size and density
• The clique totally dominated the cost of solving the problem
Small world graphs
• To construct an ensemble of small world graphs
  - morph between a regular graph (like a ring lattice) and a random graph
  - with probability p include an edge from the ring lattice, with probability 1-p one from the random graph (see the sketch below)
  - real problems often contain similar structure and stochastic components?
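A minimal sketch of that morphing construction (my code; the node and parameter names are assumptions), with p = 1 giving the pure ring lattice and p = 0 a random graph of the same density:

    import random

    def morph_graph(n, k, p, rng=random):
        # Take each edge of a ring lattice (node i joined to its k nearest
        # neighbours) with probability p; otherwise substitute a
        # uniformly random edge.
        edges = set()
        for i in range(n):
            for j in range(1, k // 2 + 1):
                if rng.random() < p:
                    edge = [i, (i + j) % n]
                else:
                    edge = rng.sample(range(n), 2)
                edges.add(tuple(sorted(edge)))
        return edges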
Small world graphs
• the ring lattice is clustered but has long paths
• random edges provide shortcuts without destroying clustering
Colouring small world graphs
Small world graphs
• Other bad news
  - disease spreads more rapidly in a small world
• Good news
  - cooperation breaks out quicker in the iterated Prisoner's dilemma
Other structural features
It's not just small world graphs that have been studied:
• Large-degree graphs
  - Barabási et al's power-law model
• Ultrametric graphs
  - Hogg's tree-based model
• Numbers following Benford's Law
  - 1 is much more common than 9 as a leading digit!
  - prob(leading digit = i) = log10(1 + 1/i)
  - such clustering makes number partitioning much easier (see the sketch below)
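A small sketch of sampling from that distribution by inverse-CDF (my code; note the nine Benford probabilities sum to exactly 1):

    import math, random

    def benford_digit(rng=random):
        # Inverse-CDF sampling of prob(leading digit = i) = log10(1 + 1/i).
        r = rng.random()
        for d in range(1, 10):
            r -= math.log10(1 + 1 / d)
            if r < 0:
                return d
        return 9   # guard against floating-point rounding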
The future?
What open questions remain?
Where to next?
Open questions
• Prove the random 3-SAT transition occurs at l/n = 4.3
  - the random 2-SAT transition is proved to be at l/n = 1
  - the random 3-SAT transition is proved to lie in the range 3.003 < l/n < 4.506
  - the random 3-SAT phase transition is proved to be "sharp"
• 2+p-SAT
  - a heuristic argument based on replica symmetry predicts the discontinuity at p = 0.4
  - prove it exactly!
Open questions
• Impact of structure on phase transition behaviour
  - some initial work on quasigroups (alias Latin squares/sports tournaments)
  - morphing is a useful tool (e.g. small worlds, 2-d to 3-d TSP, …)
• Optimization v decision
  - some initial work by Slaney & Thiebaux
  - problems in which the optimized quantity appears in the control parameter, and those in which it does not
Open questions
• Does phase transition behaviour give insights to help answer P=NP?
  - it certainly identifies hard problems!
  - problems like 2+p-SAT and ideas like the backbone also show promise
• But problems away from the phase boundary can be hard to solve
  - the over-constrained 3-SAT region has exponential resolution proofs
  - the under-constrained 3-SAT region can throw up occasional hard problems (early mistakes?)
Summary
That’s nearly all from me!
Conclusions
• Phase transition behaviour is ubiquitous
  - decision/optimization/...
  - NP/PSPACE/P/…
  - random/real
• Phase transition behaviour gives insight into problem hardness
  - suggests new branching heuristics
  - ideas like the backbone help us understand branching mistakes
Conclusions
• AI becoming more of an experimental science?
  - theory and experiment complement each other well
  - increasing use of approximate/heuristic theories to keep theory in touch with rapid experimentation
• Phase transition behaviour is FUN
  - lots of nice graphs, as promised
  - and it is teaching us lots about complexity and algorithms!
Very partial bibliography
Cheeseman, Kanefsky & Taylor, Where the Really Hard Problems Are, Proc. of IJCAI-91.
Gent et al., The Constrainedness of Search, Proc. of AAAI-96.
Gent & Walsh, The TSP Phase Transition, Artificial Intelligence, 88:349-358, 1996.
Gent & Walsh, Analysis of Heuristics for Number Partitioning, Computational Intelligence, 14(3), 1998.
Gent & Walsh, Beyond NP: The QSAT Phase Transition, Proc. of AAAI-99.
Gent et al., Morphing: Combining Structure and Randomness, Proc. of AAAI-99.
Hogg & Williams (eds.), special issue of Artificial Intelligence, 88(1-2), 1996.
Mitchell, Selman & Levesque, Hard and Easy Distributions of SAT Problems, Proc. of AAAI-92.
Monasson et al., Determining Computational Complexity from Characteristic 'Phase Transitions', Nature, 400, 1999.
Walsh, Search in a Small World, Proc. of IJCAI-99.
Watts & Strogatz, Collective Dynamics of 'Small-World' Networks, Nature, 393, 1998.