The Learnability of
Quantum States
Scott Aaronson
University of Waterloo
Quantum State Tomography
Suppose we have a physical process that produces a
quantum mixed state ρ
By applying the process repeatedly, we can prepare as
many copies of ρ as we want
To each copy, we then apply a binary measurement E,
obtaining ‘1’ with probability Tr(Eρ) and ‘0’ otherwise
Our goal is to learn an approximate description of ρ
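The setup above can be simulated classically for a tiny example. The sketch below is only an illustration: the one-qubit state and the two measurements are hypothetical choices, not taken from the talk. It computes the probability of outcome ‘1’ as the trace of the product of the measurement operator and the state, then estimates that probability by "measuring" many independent copies.

```python
import random

def trace_of_product(a, b):
    """Return Tr(a b) for two square matrices stored as nested lists."""
    size = len(a)
    return sum(a[i][j] * b[j][i] for i in range(size) for j in range(size))

def outcome_one_probability(effect, rho):
    """Probability of outcome '1'; real whenever both matrices are Hermitian."""
    return complex(trace_of_product(effect, rho)).real

def estimate_from_copies(effect, rho, copies, rng):
    """Measure on `copies` fresh copies of rho; return the observed frequency of '1'."""
    p_one = outcome_one_probability(effect, rho)
    ones = sum(1 for _ in range(copies) if rng.random() < p_one)
    return ones / copies

# Hypothetical one-qubit example: rho = 0.75 |0><0| + 0.25 |1><1|.
RHO = [[0.75, 0.0], [0.0, 0.25]]
E_ZERO = [[1.0, 0.0], [0.0, 0.0]]  # projector onto |0>
E_PLUS = [[0.5, 0.5], [0.5, 0.5]]  # projector onto (|0> + |1>)/sqrt(2)

if __name__ == "__main__":
    rng = random.Random(0)
    for name, effect in (("E_ZERO", E_ZERO), ("E_PLUS", E_PLUS)):
        exact = outcome_one_probability(effect, RHO)
        estimate = estimate_from_copies(effect, RHO, 20000, rng)
        print(name, exact, estimate)
```

Each measurement consumes one copy of the state, which is why the number of copies (and hence of experiments) is the resource being counted.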
EXPERIMENTALISTS
ACTUALLY DO THIS
To learn about chemical reactions (Skovsen et al.
2003), test equipment (D’Ariano et al. 2002), study
decoherence mechanisms (Resch et al. 2005), …
But there’s a problem…
[Speech bubble] Fear not! Why would he be raising this “problem” if he wasn’t gonna demolish it?
To do tomography on an entangled state of n qubits, we
need Ω(4^n) measurements
The current record: 8 qubits (Häffner et al. 2005),
requiring 656,100 experiments (!)
Does this mean that a generic 10,000-particle state can
never be “learned” within the lifetime of the universe?
If so, this would call into question the operational status of
quantum states themselves (and make quantum
computing skeptics extremely happy)…
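The scale of the problem can be checked with a few lines of arithmetic. This is a rough sketch: the count 4^n is simply the number of entries of a 2^n × 2^n density matrix, and the factorization of 656,100 as 3^8 settings times 100 repetitions is an assumption that happens to match the quoted figure, not something stated on the slide.

```python
import math

def density_matrix_entries(n_qubits):
    """Number of entries in a 2^n x 2^n density matrix, i.e. 4^n."""
    return 4 ** n_qubits

def decimal_digits_of_4_to_n(n_qubits):
    """Number of decimal digits of 4^n, computed via logarithms."""
    return math.floor(n_qubits * math.log10(4)) + 1

# Assumed breakdown: 3 measurement bases per qubit on 8 qubits,
# with every setting repeated 100 times.
SETTINGS_8_QUBITS = 3 ** 8
EXPERIMENTS_8_QUBITS = SETTINGS_8_QUBITS * 100

if __name__ == "__main__":
    print("experiments for 8 qubits:", EXPERIMENTS_8_QUBITS)
    print("4^8 =", density_matrix_entries(8))
    print("digits in 4^10000:", decimal_digits_of_4_to_n(10_000))
```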
The Quantum Occam’s
Razor Theorem
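Let ρ be an n-qubit mixed state. Let D be a distribution
over two-outcome measurements. Suppose we draw
m measurements E1,…,Em independently from D, and
then output a “hypothesis state” σ such that
|Tr(Eiσ)-Tr(Eiρ)|≤η for all i. Then provided η≤γε/10 and
$$m = \Omega\!\left(\frac{1}{\gamma^2\varepsilon^2}\left(\frac{n}{\gamma^2\varepsilon^2}\log^2\frac{1}{\gamma\varepsilon}+\log\frac{1}{\delta}\right)\right),$$
we’ll have
$$\Pr_{E\in D}\big[\,|\mathrm{Tr}(E\sigma)-\mathrm{Tr}(E\rho)|\le\gamma\,\big]\ge 1-\varepsilon$$
with probability at least 1-δ over E1,…,Em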
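To get a feel for the theorem, one can evaluate a bound of the shape (1/(γε)²)·(n/(γε)²·log²(1/(γε)) + log(1/δ)) numerically. The sketch below assumes that shape, and its `constant` parameter is a hypothetical stand-in for the unspecified constant hidden by the asymptotic notation, so the absolute numbers mean little; only the growth in n is informative.

```python
import math

def occam_sample_bound(n_qubits, gamma, eps, delta, constant=1.0):
    """Evaluate constant/(g*e)^2 * (n/(g*e)^2 * log^2(1/(g*e)) + log(1/delta)).

    `constant` is a placeholder; the theorem does not fix its value here.
    """
    ge_squared = (gamma * eps) ** 2
    log_squared = math.log(1.0 / (gamma * eps)) ** 2
    inner = n_qubits / ge_squared * log_squared + math.log(1.0 / delta)
    return math.ceil(constant * inner / ge_squared)

if __name__ == "__main__":
    for n in (10, 20, 40, 10_000):
        print(n, occam_sample_bound(n, gamma=0.5, eps=0.5, delta=0.1))
```

Doubling n roughly doubles the bound, in contrast to the 4^n scaling of full tomography.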
Remarks
Implies that we can do “pretty good tomography,” using
a number of measurements that grows only linearly (!)
with the number of qubits n
Result says nothing about the computational
complexity of preparing a hypothesis state that agrees
with measurement results
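Can make dependence on γ and ε more reasonable,
at the cost of a log²n factor:
$$m = O\!\left(\frac{1}{\varepsilon}\left(\frac{n}{\gamma^2}\log^2\frac{n}{\gamma\varepsilon}+\log\frac{1}{\delta}\right)\right)$$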
The above bound is nearly tight:
$$m = \Omega\!\left(\frac{1}{\varepsilon}\left(\frac{n}{\gamma^2}+\log\frac{1}{\delta}\right)\right)$$
To prove the theorem, we need a notion
introduced by Kearns and Schapire called
Fat-Shattering Dimension
Let C be a class of functions from S to [0,1]. We say a set
{x1,…,xk}⊆S is γ-shattered by C if there exist reals a1,…,ak
such that, for all 2^k possible statements of the form
f(x1)≤a1-γ ∧ f(x2)≥a2+γ ∧ … ∧ f(xk)≤ak-γ,
there’s some f∈C that satisfies the statement.
Then fatC(γ), the γ-fat-shattering dimension of C, is the
size of the largest set γ-shattered by C.
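For a finite function class and fixed witness values a1,…,ak, the definition can be checked by brute force over all 2^k patterns. The sketch below does exactly that. The two-point domain and the four-function class are hypothetical examples chosen so the arithmetic is exact in floating point. A full computation of the fat-shattering dimension would also have to search over witness values and subsets, which this sketch does not attempt.

```python
from itertools import product

def is_gamma_shattered(points, functions, witnesses, gamma):
    """Check gamma-shattering of `points` for the given witness values a_i."""
    for pattern in product((False, True), repeat=len(points)):
        # pattern[i] True demands f(x_i) >= a_i + gamma,
        # pattern[i] False demands f(x_i) <= a_i - gamma.
        satisfied = any(
            all(
                (f(x) >= a + gamma) if high else (f(x) <= a - gamma)
                for x, a, high in zip(points, witnesses, pattern)
            )
            for f in functions
        )
        if not satisfied:
            return False
    return True

# Hypothetical class: every function from {0, 1} to {0.25, 0.75}.
POINTS = [0, 1]
FUNCTIONS = [
    (lambda x, bits=bits: 0.75 if bits[x] else 0.25)
    for bits in product((0, 1), repeat=2)
]

if __name__ == "__main__":
    print(is_gamma_shattered(POINTS, FUNCTIONS, [0.5, 0.5], 0.25))
    print(is_gamma_shattered(POINTS, FUNCTIONS, [0.5, 0.5], 0.5))
```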
Small Fat-Shattering Dimension
Implies Small Sample Complexity
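Let C be a class of functions from S to [0,1], and let f∈C.
Suppose we draw m elements x1,…,xm independently from
some distribution D, and then output a hypothesis h∈C
such that |h(xi)-f(xi)|≤η for all i. Then provided η≤γε/7 and
$$m = \Omega\!\left(\frac{1}{\gamma^2\varepsilon^2}\left(\mathrm{fat}_C\!\left(\frac{\gamma\varepsilon}{35}\right)\log^2\frac{1}{\gamma\varepsilon}+\log\frac{1}{\delta}\right)\right),$$
we’ll have
$$\Pr_{x\in D}\big[\,|h(x)-f(x)|\le\gamma\,\big]\ge 1-\varepsilon$$
with probability at least 1-δ over x1,…,xm.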
Proof uses a 1996 result of Bartlett and Long—building on
Alon et al., building on Blumer et al., building on Valiant
Upper-Bounding the Fat-Shattering
Dimension of Quantum States
Nayak 1999: If we want to “encode” k classical bits into
n qubits, in such a way that any bit can be recovered
with probability 1-p, then we need n≥(1-H(p))k
Corollary (“turning Nayak’s result on its head”):
Let Cn be the set of functions that map an n-qubit
measurement E to Tr(Eρ), for some ρ. Then
$$\mathrm{fat}_{C_n}(\gamma) = O\!\left(\frac{n}{\gamma^2}\right).$$
[Speech bubble] No need to thank me!
Quantum Occam’s Razor Theorem
follows easily…
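One way to spell out "follows easily" is the following sketch, which is a reconstruction and not necessarily the argument as presented. Take S to be the set of two-outcome measurements, and let f and h be the functions that map a measurement E to its acceptance probability on the true state ρ and on the hypothesis state σ, respectively. Assuming the fat-shattering dimension of this class at scale γ is O(n/γ²), evaluating it at scale γε/35 gives

$$\mathrm{fat}_{C_n}\!\left(\frac{\gamma\varepsilon}{35}\right) = O\!\left(\frac{n}{(\gamma\varepsilon/35)^2}\right) = O\!\left(\frac{n}{\gamma^2\varepsilon^2}\right),$$

and substituting this into a sample-complexity bound of the form

$$m = \Omega\!\left(\frac{1}{\gamma^2\varepsilon^2}\left(\mathrm{fat}_{C}\!\left(\frac{\gamma\varepsilon}{35}\right)\log^2\frac{1}{\gamma\varepsilon}+\log\frac{1}{\delta}\right)\right)$$

shows that

$$m = \Omega\!\left(\frac{1}{\gamma^2\varepsilon^2}\left(\frac{n}{\gamma^2\varepsilon^2}\log^2\frac{1}{\gamma\varepsilon}+\log\frac{1}{\delta}\right)\right)$$

measurements suffice, with all constants absorbed into the asymptotic notation.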
Simple Application of Quantum Occam’s
Razor Theorem to Communication Complexity
[Diagram: Alice, holding x, sends a one-way message to Bob, holding y, who outputs f(x,y)]
f: Boolean function mapping Alice’s N-bit string x and
Bob’s M-bit string y to a binary output
D1(f), R1(f), Q1(f): Deterministic, randomized, and
quantum one-way communication complexities of f
How much can quantum communication save?
• It’s known that D1(f)=O(M Q1(f)) for all total f
• In 2004 I showed that for all f,
D1(f)=O(M Q1(f)logQ1(f))
Theorem: R1(f)=O(M Q1(f))
for all f, partial or total
Proof: Fix Alice’s input x
By Yao’s minimax principle, Alice can consider a worst-case distribution D over Bob’s input y
Alice’s classical message will consist of y1,…,yT drawn
from D, together with f(x,y1),…,f(x,yT), where T=Θ(Q1(f))
Bob searches for a quantum message σ that yields the
right answers on y1,…,yT
By the Quantum Occam’s Razor Theorem, with high
probability such a σ yields the right answers on most y
drawn from D
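The sample-then-fit structure of this proof can be mimicked with a purely classical toy, sketched below under loud assumptions: the function f is taken to be inner product mod 2, the distribution D is uniform, and Bob's "hypothesis" is a classical bit string standing in for the quantum message. None of these choices come from the talk, and the toy says nothing about the actual quantum search Bob performs.

```python
import random
from itertools import product

def f(x, y):
    """Toy Boolean function (assumed): inner product of bit tuples mod 2."""
    return sum(a & b for a, b in zip(x, y)) % 2

def alice_message(x, sampled_inputs):
    """Alice sends the sampled y_1..y_T together with the labels f(x, y_i)."""
    return [(y, f(x, y)) for y in sampled_inputs]

def bob_fit(message, n_bits):
    """Bob brute-forces any hypothesis that reproduces every label sent."""
    for candidate in product((0, 1), repeat=n_bits):
        if all(f(candidate, y) == label for y, label in message):
            return candidate
    return None

def agreement(hypothesis, x, inputs):
    """Fraction of `inputs` on which the hypothesis answers as x would."""
    return sum(f(hypothesis, y) == f(x, y) for y in inputs) / len(inputs)

if __name__ == "__main__":
    rng = random.Random(0)
    n_bits, num_samples = 6, 12
    x = tuple(rng.randint(0, 1) for _ in range(n_bits))
    samples = [tuple(rng.randint(0, 1) for _ in range(n_bits))
               for _ in range(num_samples)]
    hypothesis = bob_fit(alice_message(x, samples), n_bits)
    all_inputs = list(product((0, 1), repeat=n_bits))
    print(hypothesis, agreement(hypothesis, x, all_inputs))
```

Bob's hypothesis need not equal Alice's input; what matters is that it agrees with her answers on the sampled inputs, and the theorem is what licenses generalizing from those to most inputs drawn from D.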
What about computational complexity?
BQP/qpoly: Class of problems solvable in quantum
polynomial time, with help from poly-size “quantum
advice state” that depends only on input length n
A. 2004: BQP/qpoly ⊆ PP/poly
“Classical advice can always simulate quantum advice,
provided we use exponentially more computation”
Can this result be improved to BQP/qpoly ⊆ QMA/poly?
(QMA: Quantum Merlin-Arthur)
Theorem: HeurBQP/qpoly ⊆ HeurQMA/poly
Or in English: We can use trusted classical advice to verify
that untrusted quantum advice will work on most inputs
Proof Idea: The classical advice to the HeurQMA/poly
verifier will consist of “training inputs” x1,…,xm where
m=poly(n), as well as whether xi∈L for all i
Given a purported quantum advice state |ψ⟩, the verifier
first checks that |ψ⟩ yields the right answers on the training
inputs, and only then uses it on its real input x
By the Quantum Occam’s Razor
Theorem, if |ψ⟩ passes the initial test,
then w.h.p. it works on most inputs
Technical part is to do the verification
without destroying |ψ⟩
Stronger Result: HeurBQP/qpoly = HeurYQP/poly
Here YQP (“Yoda Quantum Polynomial-Time”) is like
QMA∩coQMA, except that a single witness must work for all
inputs of length n
Open Problems
Computationally-efficient learning algorithms
Experimental implementation!
Tighter bounds on number of measurements
Does BQP/qpoly = YQP/poly?
Is D1(f) = O(M Q1(f))?