Random Walk on Graphs and its Algorithmic Applications
Shengyu Zhang
Winter School, ITCSC@CUHK, 2009
Random walk on graphs
• On an undirected graph G:
  - Start from a vertex v0.
  - Repeat for a number of steps: go to a random neighbor.
• Simple but powerful.
Road map: Random walk
Parameters/Properties → Algorithms
• Hitting time → k-SAT; st-connectivity
• Mixing time → PageRank; approximate counting; error reduction
Road map: Quantum walk
• Short intro to the math model of quantum mechanics
Types → Algorithms
• Discrete QW → Element Distinctness
• Continuous QW → Formula Evaluation
PART I. Random Walk
Key parameter 1: Hitting time
Hitting time
• Recall the process of random walk on a graph G:
  - Start at a vertex v0.
  - Repeat for a number of steps: go to a random neighbor.
• Hitting time:
  H(i,j) = expected time to visit j (for the first time), starting at i.
Undirected graphs
• Complete graph: H(i,j) = n−1 (i≠j).
• Line 0, 1, 2, …, n−1: H(i,j) = j²−i² (i<j).
  - In particular, H(0,n−1) = (n−1)².
• General graph: H(i,j) = O(n³).
  - The worst case is the "lollipop": an (n/2)-complete graph attached to an (n/2)-line, with i in the complete graph and j at the far end of the line.
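As a quick sanity check (ours, not from the slides), a short Python simulation of the walk on the line reproduces H(0, n−1) = (n−1)²:

    import random

    # Empirical check of H(0, n-1) = (n-1)^2 on the line 0, 1, ..., n-1.
    def hit_time_line(n, trials=2000):
        total = 0
        for _ in range(trials):
            v, steps = 0, 0
            while v != n - 1:
                # vertex 0 has the single neighbor 1; interior vertices
                # move to a uniformly random neighbor
                v = 1 if v == 0 else v + random.choice((-1, 1))
                steps += 1
            total += steps
        return total / trials

    print(hit_time_line(10))  # ~ 81 = (10-1)^2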
Algorithm 1: 2-SAT
k-SAT: satisfiability of k-CNF formulas
• n variables x1, …, xn ∈ {0,1}
• m clauses, each being the OR of k literals
  - Literal: xi or ¬xi
  - e.g. (k=3): (¬x1) ∨ x5 ∨ x7
• 3-CNF formula: AND of these m clauses
  - e.g. ((¬x1) ∨ x5 ∨ x7) ∧ (x2 ∨ (¬x5) ∨ (¬x7)) ∧ (x1 ∨ x7 ∨ x8)
• 3-SAT problem: given a 3-CNF formula, decide whether there is an assignment of the variables s.t. the formula evaluates to 1.
  - For the above example, Yes: x5=1, x7=0, x1=1.
P vs. NP
• P: problems that can be easily solved.
  - "easily": in polynomial time.
• NP: problems that can be easily verified.
  - Formally: ∃ a polynomial-time verifier V s.t. for any input x,
    - if the answer is YES, then ∃y s.t. V(x,y) = 1;
    - if the answer is NO, then ∀y, V(x,y) = 0.
• The question of TCS: Is P = NP?
  - Intuitively, no: NP should be much larger.
    - It's much easier to verify (with help) than to solve (by yourself): mathematical proofs, appreciation of good music/food, …
  - Formal proof? We don't know yet.
    - One of the 7 Millennium Problems by CMI.①
① http://www.claymath.org/millennium/P_vs_NP/
NP-completeness
• k-SAT is NP-complete, for any k ≥ 3.
• NP-complete:
  - in NP;
  - all other problems in NP can be reduced to it in poly. time.
  - So NP-complete problems are the hardest ones in NP.
• 3-SAT is in NP:
  - witness: a satisfying assignment A;
  - verification: evaluate the formula with the variables assigned by A.
• [Cook-Levin] 3-SAT is NP-complete.
How about 2-SAT?
• While 3-SAT is the hardest problem in NP, 2-SAT is solvable in polynomial time.
• Here we present a very simple randomized algorithm, which has polynomial expected running time.
Algorithm for 2-SAT
• 2-SAT: each clause has two literals (variables or their negations),
  e.g. (x1∨x2) ∧ (x2∨¬x3) ∧ (¬x4∨x3) ∧ (x5∨x1).
• Alg [Papadimitriou]:
  - Pick any assignment, e.g. (x1, x2, x3, x4, x5) = (0, 1, 0, 1, 0).
  - Repeat O(n²) times:
    - If all clauses are satisfied, done.
    - Else:
      - Pick any unsatisfied clause.
      - Pick one of its two literals, each with probability ½, and flip the assignment of that variable (e.g. x1: 0 → 1).
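A minimal Python sketch of this flipping process (the ±i literal encoding and the constant in the step count are our own choices, not from the talk):

    import random

    def solve_2sat(clauses, n, steps=None):
        """Papadimitriou's random-walk 2-SAT algorithm (a sketch).
        clauses: list of pairs of nonzero ints; literal +i means x_i,
        -i means NOT x_i. Returns an assignment dict, or None."""
        if steps is None:
            steps = 100 * n * n  # O(n^2) flips; constant chosen arbitrarily
        assign = {i: random.randint(0, 1) for i in range(1, n + 1)}
        sat = lambda lit: assign[abs(lit)] == (1 if lit > 0 else 0)
        for _ in range(steps):
            unsat = [c for c in clauses if not (sat(c[0]) or sat(c[1]))]
            if not unsat:
                return assign                          # all clauses satisfied
            lit = random.choice(random.choice(unsat))  # clause, then literal
            assign[abs(lit)] ^= 1                      # flip that variable
        return None                                    # (probably) unsatisfiable

    # (x1 v x2) ^ (x2 v -x3) ^ (-x4 v x3) ^ (x5 v x1)
    print(solve_2sat([(1, 2), (2, -3), (-4, 3), (5, 1)], n=5))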
Analysis
• Running example: (x1∨x2) ∧ (x2∨¬x3) ∧ (¬x4∨x3) ∧ (x5∨x1), current assignment (x1, …, x5) = (0, 1, 0, 1, 0).
• If the formula is unsatisfiable: we never find a satisfying assignment.
• If it is satisfiable, there exists a satisfying assignment x.
  - If our initially picked assignment x′ is satisfying, then done.
  - Otherwise, for any unsatisfied clause, at least one of its two variables is assigned a value different from that in x.
  - So randomly picking one of the two variables and flipping its value increases the # of correct assignments by 1 w.p. ≥ ½.
Analysis (continued)
• Consider a line of n+1 points, where point k represents "we've assigned k variables correctly".
  - "correctly": with the same value as in x.
• Last slide: randomly picking one of the two variables and flipping its value increases the # of correct assignments by 1 w.p. ≥ ½.
• Thus the algorithm is actually a random walk on the line of n+1 points, with Pr[going right] ≥ ½.
  - Hitting time (i → n): O(n²).
• So by repeating this flipping process for O(n²) steps, we'll reach x with high probability.
Algorithm 2: st-connectivity
st-Connectivity
• Problem: given a graph G and two vertices s and t in it, decide whether they are connected.
• BFS/DFS (starting at s) solves the problem in linear time.
  - But it uses linear space as well.
• Question: can we use much less space?
Simple random walk algorithm
• A randomized algorithm takes O(log n) space:
  - Starting at s, perform the random walk for O(n³) steps. If we ever see t, output YES and stop.
  - Otherwise output NO.
• Why does it work? H(s,t) = O(n³).
  - If s can reach t, then we should see t within O(n³) steps in expectation; by Markov's inequality, this happens with constant probability, which repetition can boost.
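A sketch of this algorithm in Python (the constant c multiplying n³ is an arbitrary choice to boost the success probability):

    import random

    def st_connected(adj, s, t, c=10):
        """Random-walk st-connectivity: O(log n) working space.
        adj: dict vertex -> list of neighbors (undirected)."""
        n = len(adj)
        v = s
        for _ in range(c * n ** 3):
            if v == t:
                return True              # saw t: connected
            v = random.choice(adj[v])    # go to a random neighbor
        return False                     # (probably) not connected

    adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}  # two components
    print(st_connected(adj, 0, 2), st_connected(adj, 0, 3))  # True False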
Key parameter 2: Mixing time
Convergence
• Now let's study the probability distribution of the particle after a long time.
  - Example: walk on a triangle u, v, w, starting at u.

    t:      0    1    2    3    4     5      6
    Pr[u]:  1    0    2/4  2/8  6/16  10/32  22/64
    Pr[v]:  0    1/2  1/4  3/8  5/16  11/32  21/64
    Pr[w]:  0    1/2  1/4  3/8  5/16  11/32  21/64

• What do we observe?
  - The distribution converges to uniform…
  - …wherever it starts.
• Q: Does this hold in general?
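The table can be reproduced with a few lines of numpy (our own check, not from the slides):

    import numpy as np

    P = np.array([[0, .5, .5],
                  [.5, 0, .5],
                  [.5, .5, 0]])   # transition matrix of the triangle
    p = np.array([1., 0., 0.])    # start at u
    for t in range(7):
        print(t, p)               # tends to (1/3, 1/3, 1/3)
        p = p @ P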
Convergence
• Well, yes and no. Consider the following cases on undirected graphs:
  - Case 1: the graph is disconnected (the walk never reaches the other components).
  - Case 2: the graph is bipartite (the distribution oscillates between the two sides).
• [Thm] For any connected non-bipartite graph, and any starting point, the random walk converges.
Converges to what?
• In the previous triangle example: uniform.
• In general?
• As a result of the convergence, the distribution is not changed by one more step of the walk:
  - if the particle is distributed on the graph according to this distribution, then a further random walk step results in the same distribution.
• We call it the stationary distribution.
  - [Fact] It's unique.
Stationary distribution
• [Fact] It's the following distribution: π(v) = d(v)/2m
  - d(v): degree of v, i.e. # of neighbors;
  - m = |E|, i.e. # of edges.
• [Proof] Consider one step of the walk:
  π′(v) = ∑u: (u,v)∈E π(u)·[1/d(u)]
        = ∑u: (u,v)∈E [d(u)/2m]·[1/d(u)]
        = ∑u: (u,v)∈E 1/2m
        = d(v)/2m
        = π(v).
  So π is the stationary distribution.
• For regular graphs, π is the uniform distribution.
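A numeric check of this fact on a small example (ours, not from the slides):

    import numpy as np

    # Path 0-1-2-3: pi(v) = d(v)/2m should satisfy pi P = pi.
    A = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    d = A.sum(axis=1)               # degrees d(v)
    P = A / d[:, None]              # P[u,v] = 1/d(u) if (u,v) in E
    pi = d / d.sum()                # d(v) / 2m
    print(np.allclose(pi @ P, pi))  # True: pi is stationary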
Proportional to degree
• Note: the stationary distribution π(v) = d(v)/2m is proportional to the degree of v.
• What's the intuition? The more neighbors you have, the more chance you'll be reached.
• We will see another natural interpretation shortly.
Speed of the convergence
• We've seen that the random walk converges to the stationary distribution.
• Next question: how fast is the convergence?
• Let's define the mixing time as
  min {T : ∥p_i(T) − π∥ ≤ ε},
  where
  - p_i(T): the distribution at time T, starting at i;
  - ∥·∥: some norm.
• We'll see the reason for mixing later. First, let's see an example you run into every day.
Algorithm 3: PageRank
PageRank
• Google gives each webpage a number for its "importance".
  - [IT] Google: 10, Microsoft: 9, Apple: 9
  - [Media] NYTimes: 9, CNN: 10, sohu: 8, newsmth: 7
  - [Sports] NBA: 7, CBA: 7, CFA: 7
  - [University] MIT: 9, CUHK: 8, …, Tsinghua, PKU, Fudan, IIT(B): 9
• When you search for something by making a query, a large number of related webpages are retrieved.
  - Which webpages to retrieve? That's Information Retrieval, an orthogonal issue.
Ranking
• How to present this big corpus to you?
• Search engines rank the pages by "importance" and return them in descending order.
  - Thus presumably the first page contains the 10 most important webpages related to your query.
• Question: how to rank?
• PageRank: use the vast link structure as an indicator of an individual page's value.
Reference system
• Webpage A has a link to webpage B ⇒ A thinks B is useful.
  - Think of it as A writing a recommendation letter for B.
• So a webpage with a lot of other pages pointing to it is probably important.
  - A guy getting a lot of letters is strong.
• Further, pointers from pages that are themselves important bear more weight.
  - Letters by Noga, László, Sasha, Avi, Andy, … mean a lot.
• But the importance of those pages also needs to be calculated… so we have a recurrence equation.
Furthermore
• If page A has a lot of links, then each link means less.
  - OK, you got Einstein's letter, but you know what, last year everyone on the market got his letter.
• So let's assume that page A's reference weight is divided evenly among all pages B that A links to.
• Recurrence equation:
  R(A) = ∑B: B→A R(B)/d(B),
  where d(B) = # of pages C that B links to.
Sink issue
• R(A) = ∑B: B→A R(B)/d(B) has a problem.
• There are some "sink" pages that contain no links to other pages.
  - Sinks accumulate weight without giving any out.
  - The recurrence equation then only has solutions with all weight on sinks, losing the original intention of indicating the importance of all pages.
• To handle this, we modify the recursion:
  R(A) = (1−α)/N + α ∑B: B→A R(B)/d(B)
  - Force each page to send a (1−α)-fraction of its weight (evenly) to all N pages.
  - So each page A also receives a weight of (1−α)/N from all pages.
  - α: around 0.85.
Random Walk view
• Note that the recursion
  R(A) = (1−α)/N + α ∑B: B→A R(B)/d(B)
  is exactly the random walk on the graph where, at each page A, we
  - w/ prob. α, follow a random link;
  - w/ prob. 1−α, go to a random page.
• Question: how to solve this recurrence equation?
  - # of webpages: ~30 billion (and counting…)
Algorithm
• Recall: the random walk converges to the stationary distribution!
• Algorithm: start from any distribution, run a few iterations of random walk, and output the result.
  - It's a bit different since it's a random walk on a directed graph, but this PageRank matrix has all the good properties we need, so the random walk also converges to the solution.
  - Google: 50-100 iterations, which need a few days.
• The result should be close to the stationary distribution, which serves as the indicator of the importance of pages.
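A power-iteration sketch in Python (the page encoding and the handling of sinks by spreading their weight over all pages are our assumptions; real systems differ):

    import numpy as np

    def pagerank(links, alpha=0.85, iters=100):
        """PageRank by iterating the random walk (a sketch).
        links: dict page -> list of pages it links to, pages = 0..N-1."""
        N = len(links)
        R = np.full(N, 1.0 / N)                # any starting distribution
        for _ in range(iters):                 # one iteration = one walk step
            new = np.full(N, (1 - alpha) / N)  # the (1-alpha)/N term
            for B, outs in links.items():
                targets = outs if outs else range(N)  # sink: link to all
                for A in targets:
                    new[A] += alpha * R[B] / len(targets)
            R = new
        return R

    print(pagerank({0: [1, 2], 1: [2], 2: [0], 3: [2]}))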
Next
• We'll see a bit of the math behind the mixing story.
Mathematics behind the mixing
• [Eigenvalue decomposition] A symmetric N×N matrix M can be written as M = ∑i=1,…,n λi ξi ξiᵀ
  - λi: eigenvalues. Order them: |λ1| ≥ |λ2| ≥ |λ3| ≥ … ≥ |λn|.
  - ξi: (column) eigenvectors, i.e. M ξi = λi ξi.
• The eigenvectors are orthonormal:
  - ∥ξi∥2 = 1;
  - ⟨ξi, ξj⟩ = 0 for all i ≠ j.
• M² = (∑i=1,…,n λi ξi ξiᵀ)(∑j=1,…,n λj ξj ξjᵀ) = ∑i,j λi λj ⟨ξi, ξj⟩ ξi ξjᵀ = ∑i λi² ξi ξiᵀ.
• More generally, Mᵗ = ∑i λiᵗ ξi ξiᵀ.
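A quick numpy verification of Mᵗ = ∑i λiᵗ ξi ξiᵀ (our own check):

    import numpy as np

    M = np.array([[2., 1., 0.],
                  [1., 3., 1.],
                  [0., 1., 2.]])    # a symmetric matrix
    lam, Xi = np.linalg.eigh(M)     # columns of Xi: orthonormal eigenvectors
    t = 5
    Mt = sum(lam[i] ** t * np.outer(Xi[:, i], Xi[:, i]) for i in range(3))
    print(np.allclose(Mt, np.linalg.matrix_power(M, t)))  # True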
Random walk in matrix form
• A: adjacency matrix. (Aij = 1 if (i,j) ∈ E; 0 otherwise.)
• P: probability transition matrix.
  - Pij = 1/di if (i,j) ∈ E; 0 otherwise.
  - P = D⁻¹A, where D = diag(d1, …, dn) (di = deg(i)).
  - For any distribution p, Pᵀp is the distribution after one step of random walk.
• N = D^{-1/2} A D^{-1/2} = D^{1/2} P D^{-1/2}
  - Nij = 1/(di dj)^{1/2} if (i,j) ∈ E; 0 otherwise.
  - N is symmetric, so N can be written as N = ∑i=1,…,n λi ξi ξiᵀ.
• Random walk for t steps:
  Pᵗ = (D^{-1/2} N D^{1/2})ᵗ = D^{-1/2} Nᵗ D^{1/2} = D^{-1/2} (∑i λiᵗ ξi ξiᵀ) D^{1/2} = ∑i λiᵗ D^{-1/2} ξi ξiᵀ D^{1/2}
Why mixing? Why to π?
• [Thm] For a connected non-bipartite graph, N has
  - λ1 = 1, with ξ1 = π^{1/2} = ((d1/2m)^{1/2}, …, (dn/2m)^{1/2}), the entrywise square root of the stationary distribution (so that its ℓ2 norm is 1);
  - |λ2| < 1 (and so are all the other |λi|, i ≥ 2).
• Random walk for t steps: Pᵗ = ∑i λiᵗ D^{-1/2} ξi ξiᵀ D^{1/2}
• First term:
  λ1ᵗ D^{-1/2} ξ1 ξ1ᵀ D^{1/2} = D^{-1/2} (1/2m)[(di dj)^{1/2}]ij D^{1/2} = (1/2m)[dj]ij
  - For any starting distribution p, ((1/2m)[dj]ij)ᵀ p = π.
Speed of the convergence
• All the other terms: since |λi| < 1, λiᵗ D^{-1/2} ξi ξiᵀ D^{1/2} → 0!
• So the speed of convergence depends on how close |λ2| is to 1.
  - PageRank matrix: |λ2| = α (≈ 0.85).
  - Expander: 1 − |λ2| = Ω(1).
The rest of the talk
• Only ideas are given.
• Many details are omitted.
• I may cheat a bit to illustrate the main steps.
Algorithm 4: Approximate counting
Approximate counting
• Task: estimate the size of an exponentially large set V.
  - Example: given G, count the # of perfect matchings.
• Approach: find a chain V0 ⊆ V1 ⊆ … ⊆ Vm = V where
  - |V0| is easy to compute;
  - m = poly(n) layers;
  - each ratio |Vi+1|/|Vi| = poly(n) is easy to estimate.
• Then |V| = |V0|·(|V1|/|V0|)·(|V2|/|V1|)⋯(|Vm|/|Vm−1|) can be estimated.
• Question: how to estimate |Vi+1|/|Vi|?
Estimate the ratio
• Estimate by random sampling:
  - Generate random elements uniformly distributed in Vi+1, and see how often they hit Vi.
• Question: how to sample uniformly from Vi+1?
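A sampling sketch in Python (the names are ours; `sample_big` stands in for the random-walk sampler of the next slide):

    import random

    def estimate_ratio(sample_big, in_small, trials=10000):
        """Estimate |V_i| / |V_{i+1}| by uniform sampling from V_{i+1}.
        sample_big(): uniform element of V_{i+1}; in_small(x): x in V_i?"""
        hits = sum(in_small(sample_big()) for _ in range(trials))
        return hits / trials

    # Toy example: V_{i+1} = {0..99}, V_i = {0..49}; the true ratio is 0.5.
    print(estimate_ratio(lambda: random.randrange(100), lambda x: x < 50))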
By random walk
• We construct a regular graph with vertex set Vi+1.
• The algorithm runs efficiently (i.e. in poly(n) time) if the walk converges to uniform rapidly (i.e. in poly(n) time).
• This is the case for the perfect matching counting problem.
Algorithm 5: Error Reduction with efficient randomness
Error reduction
• Task: reduce the error of a randomized algorithm A.
  - Error: Pr_{r∈{0,1}^m}[r is bad for A] = 1/3 → ε?
  - (Picture: the set B of bad random strings inside {0,1}^m.)
• Naïve approach:
  - Draw k random strings r1, …, rk.
  - Run algorithm A using r1, …, rk and get k answers.
  - Output the majority of the answers.
• [Fact] The error drops to 2^{−Θ(k)}. (Chernoff bound.)
• Randomness complexity: mk.
• Question: can we reduce the error probability w/o using too many additional random bits?
Expander graph
• Expander: the eigenvalue gap is large, i.e. Ω(1).
  - Random walk converges very fast: in O(log n) steps.
  - Highly connected: diameter O(log n).
  - Large boundary: any subset has lots of edges going out.
  - Can be sparse: ∃ constant-degree expanders, and we know how to construct them explicitly.
Algorithm for error reduction
• New algorithm A′ (a sketch in code follows below):
  - Construct an expander with vertex set V = {0,1}^m.
  - Start from a random vertex r1.
  - Perform a random walk (r1, r2, …, rk) of length k.
  - Run algorithm A using r1, r2, …, rk and get k answers.
  - Output the majority of the answers.
• Thm: the error probability of A′ is also 2^{−Θ(k)}.
• How many random bits are used? m + O(k).
  - The expander is of constant degree.
• But the strings are highly dependent! Why does it work?
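A code sketch of A′, assuming an explicit expander is given by a hypothetical `neighbors(r, c)` function (the c-th neighbor of vertex r in a degree-d expander on {0,1}^m; such explicit constructions exist but are not spelled out here):

    import random

    def error_reduce(A, x, m, k, neighbors, d=8):
        """Error reduction via an expander walk (a sketch).
        A(x, r): base randomized algorithm, r an m-bit integer.
        neighbors(r, c): assumed explicit expander on {0,1}^m, degree d.
        Randomness: m bits for r1, plus log2(d) = O(1) bits per step."""
        r = random.getrandbits(m)                  # random start vertex r1
        answers = []
        for _ in range(k):
            answers.append(A(x, r))
            r = neighbors(r, random.randrange(d))  # one walk step
        return max(set(answers), key=answers.count)  # majority answer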
Expander: large boundary
• For simplicity, consider the one-sided error case:
  - algorithm A′ is wrong ↔ the whole walk stays in B, the set of all bad random strings in {0,1}^m.
• Expander: every set has a large boundary.
  - So at every step, the walk leaves B w.p. Ω(1).
  - Pr[k steps never leave B] = 2^{−Θ(k)}.
• (Picture: in a non-expander, a walk can get stuck inside B; in an expander, B has a large boundary.)
PART II. Quantum Walk
Quantum mechanics in one slide
Physics ↔ Math:
• Physical system ↔ unit vector
• Evolution ↔ unitary matrix
• Measurement ↔ projection
• Composition ↔ tensor product

• A classical bit: 0 or 1.
• A quantum bit (qubit): α|0⟩ + β|1⟩ with |α|² + |β|² = 1; α, β are the amplitudes.
• Measuring by |0⟩ and |1⟩:
  - get 0 w.p. |α|²; the system → |0⟩;
  - get 1 w.p. |β|²; the system → |1⟩.
• State space for 2 classical bits: {00, 01, 10, 11} (all combinations).
• State space for 2 qubits: the space span{|00⟩, |01⟩, |10⟩, |11⟩}.
Quantum walk
• Many things become quite tricky,
  - even the definition of quantum walk.
• We'll skip the formal definitions here and only present some results.
Type 1: Discrete Quantum Walk
Coin register
• The most natural quantum counterpart of random walk would be the following: each vertex v gives its neighbors (equally) some "mass", so that the ℓ2 norm of the total mass is still 1.
• However, simply accumulating all the mass from the neighbors does not work:
  - the operator is not unitary.
• To overcome this, we need a register to remember where the mass came from.
  - It amounts to adding a coin register.
  - Flip the quantum coin: |v⟩|c⟩ → |v⟩H|c⟩, where H is the Hadamard matrix
    H = (1/√2) [ 1  1 ]
               [ 1 −1 ]
  - Walk accordingly: |v⟩|c⟩ → |n_c(v)⟩|c⟩, where n_c(v) is the c-th neighbor of v.
General graph
• Given any random walk on an undirected graph G, Szegedy proposed a method to transform it into a quantum walk.
• Set hitting time:
  - A target set T ⊆ V is hidden in the graph.
  - Starting from the uniform distribution, what's the expected time to hit some point in T?
• Classical set hitting time = O((ε∆)⁻¹)
  - ε: density of the target set, i.e. |T|/|V|;
  - ∆: eigenvalue gap of the matrix P.
• [Szegedy] Quantum set hitting time = O(√(classical set hitting time)) = O((ε∆)^{-1/2})
T
Quantum algorithm for
Element Distinctness


Element Distinctness: To decide whether all input
integers are distinct.
[Ambainis] Discrete quantum walk on the following
graph:
(r = n2/3)
V = {S⊆[n]: |S| = r};



E = {(S,T): |S∩T| = r-2}
Eigenvalue gap: ∆ = 1/r
Target set density: ε = r2/n2
Quantum running time = setup time + hitting time
= O(r) + O((ε∆)-1/2) = O(r + n/√r) = O(n2/3).
Type 2: Continuous Quantum Walk
Continuous quantum walk on a graph
• Given an undirected graph G, and a Hermitian matrix H that respects the graph structure:
  - Hermitian: (Hᵀ)* = H;
  - respects G: Hij = 0 if (i,j) ∉ E.
• The dynamics |ψ(t)⟩ = ∑i αi(t)|i⟩ for H is governed by the Schrödinger equation
  i (d/dt) αi(t) = ∑j: (i,j)∈E Hij αj(t),
  which has the solution |ψ(t)⟩ = e^{iHt}|ψ(0)⟩.
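A small numerical sketch of this dynamics using scipy's matrix exponential (the graph, the choice H = adjacency matrix, and the evolution time are arbitrary illustrative choices):

    import numpy as np
    from scipy.linalg import expm

    # Continuous quantum walk on the path 0-1-2, with H = adjacency matrix.
    H = np.array([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])
    psi0 = np.array([1, 0, 0], dtype=complex)  # start at vertex 0
    psi_t = expm(1j * H * 1.0) @ psi0          # |psi(t)> = e^{iHt}|psi(0)>
    print(np.abs(psi_t) ** 2)                  # measurement probabilities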
Formula Evaluation
• Grover search: find a marked point in an n-point set in time O(n^{1/2}).
  - Equivalently: evaluate OR in time O(n^{1/2}).
• Generalizations:
  - (balanced) AND-OR trees: alternating levels of ∧ and ∨ gates;
  - general AND-OR-NOT formulas (general game trees).
Formula evaluation
• Motivation:
  - a natural generalization of OR (Grover search);
  - a well-studied subject in TCS;
  - a matching lower bound has been known for a long time (the same up to log(n) factors);
  - game playing: min-max trees (alternating max and min gates).
Breakthroughs
• [FGG, 07] An O(n^{1/2}) algorithm for the balanced AND-OR tree.
  - Method: scattering theory…
  - Not really understandable for computer scientists…
• [ACRŠZ, FOCS'07] An O(n^{1/2+o(1)}) algorithm for general AND-OR-NOT formulas.
  - Method: phase estimation + quantum walk.
  - Simpler algorithm, simpler proof.
  - For special cases like the balanced AND-OR tree: O(n^{1/2}).
Classical implications
• [OS, STOC'03] Does every formula f of size n have polynomial threshold degree thr(f) = O(n^{1/2})?
• Our result:
  thr(f) ≤ deg̃(f) ≤ Q(f) ≤ n^{1/2+o(1)}
  - thr(f) ≤ deg̃(f) (the approximate degree): by the definition of thr [Klivans-Servedio, STOC'01; KOS, FOCS'02];
  - deg̃(f) ≤ Q(f): [BBCMdW, JACM'01].
• If a class C of Boolean functions has thr(f) ≤ r for all f∈C, then C can be learned in time n^{O(r)} (with both PAC and adversarially generated examples).
• [Implication] Formulas are learnable in time 2^{n^{1/2+o(1)}}.
• This is very interesting because the study of quantum algorithms solves a purely classical open problem!
  - This is not uncommon in quantum computing!
Sketch of the algorithm
• {AND, OR, NOT} ⇔ {NAND}.
• For the formula f and an input x, construct a graph Gx.
  - (Figure: a NAND tree with leaves x1:0, x2:1, x3:0, x4:1, x5:1.)
• Let A be the adjacency matrix of Gx; consider A's spectrum.
Key observation
• (Figure: red marks the function value at a vertex; green marks the amplitude of a 0-eigenvector.)
• Function values and the amplitudes of a 0-eigenvector propagate in the same way.
• If the function evaluates to 0, then ∃ a 0-eigenvector with support on the root.
• If the function evaluates to 1, then every 0-eigenvector has no support on the root.
Phase Estimation
• Thus, qualitatively, it's enough to know whether there is a 0-eigenvector with support on the root.
• Phase estimation: given a unitary operator U (to use) and an eigenvector |ψ⟩, find the phase θ in the corresponding eigenvalue e^{iθ}.
  - Unitary matrix: all eigenvalues have absolute value 1.
  - For e^{iHt}, the phase is nothing but the corresponding eigenvalue of H (scaled by t). The quantum walk is here!
• Thus, using phase estimation, we can find an eigenvalue of H.
  - Depending on whether it's 0, output the answer to f(x).
A high-level structure of the algorithm
• We need to carefully choose the weights hij on the edges (i,j) s.t. the above qualitative connection also works quantitatively:
  - for f(x) = 0, ∃ a 0-eigenvector with Ω(1) support on the root;
  - for f(x) = 1, every eigenvector with support on the root has eigenvalue Ω(n^{-1/2}).
• Algorithm: starting from the root, use the quantum walk to carry out phase estimation, and output the answer depending on whether the phase is 0.
Summary
• Random walk is a simple but powerful tool.
  - There will surely be more important algorithmic applications to be found.
• Theories of quantum walk have been rapidly developed in the past couple of years.
  - A lot of fundamental issues are yet to be better understood.
  - Significant breakthrough algorithms using quantum walk are ahead!
References
• For random walk on graphs:
  - Lovász, Random Walks on Graphs: A Survey, Combinatorics, 353-398, 1996.
  - Chung, Spectral Graph Theory, American Mathematical Society, 1997.
• For quantum walk:
  - Quantum computing in general:
    - the textbook by Nielsen and Chuang;
    - lecture notes by Vazirani, by Ambainis, and by Childs.
  - Quantum walk:
    - surveys by Kempe and by Santha;
    - all of the above lecture notes have nice chapters on quantum walk.
Thanks!