An information-theoretic perspective on the foundations of
quantum mechanics
Ekin Doğuş Çubuk
March 16, 2010
Preface
This thesis contains the work the author did as a research assistant from the summer of 2009 to the present, under the supervision of Professor John Boccio of the Department
of Physics and Astronomy at Swarthmore College. The nature of the work is along the
lines of a "library thesis", as it aims to survey and summarize current research in quantum
information theory. The author's goal is to bring together different information-theoretic
perspectives on the foundations of quantum mechanics, in order to present a current picture
of the relevance of quantum computational and communicational progress to the fundamental questions. The author's main contribution is to draw parallels between the different
formalisms and approaches that try to answer the fundamental question: "What is so special about the dynamics of quantum mechanics that makes it the underlying theory of our
universe?" The author has tried to present different approaches under the same mathematical framework, and has filled in some of the gaps in derivations and connections between
these different approaches.
I would like to thank Professor John Boccio for his guidance during this research. I
am grateful to him for introducing me to the different perspectives on the study of the
foundations of quantum mechanics, while giving me the freedom to work on the approach
I was interested in the most. I would also like to thank Professor David Cohen for reading
the first draft of this thesis and giving me valuable feedback on it.
Contents
1 Introduction
2 Classical correlation polytopes
    2.1 Two independent events
    2.2 Three independent events
    2.3 Correlation polytopes for an arbitrary number of events
    2.4 The CHSH polytope
    2.5 Observations
3 Bipartite correlation boxes and inequalities
    3.1 Bipartite correlation boxes
    3.2 The CHSH inequalities revisited
    3.3 The local polytope ($\mathcal{L}$)
    3.4 The causal polytope ($\mathcal{C}$)
    3.5 Nonlocal boxes (PR boxes)
    3.6 Nonlocality
    3.7 Quantum boxes
        3.7.1 Nonlocal correlations in QM
        3.7.2 The bound on the nonlocality of correlations in (Q)
        3.7.3 The set of quantum correlations (Q)
4 Some features of quantum information theory
    4.1 The fundamental unit of quantum information
        4.1.1 Information content of a qubit
    4.2 The no-cloning theorem
    4.3 Some applications of quantum nonlocality
        4.3.1 Quantum gates and operations
        4.3.2 Superdense coding
        4.3.3 Teleportation
        4.3.4 Entanglement swapping
5 Algebra of correlations
    5.1 Algebra of wirings
    5.2 Irreversibility
    5.3 Nonlocality distillation
        5.3.1 Nonlocality distillation 1 (using an NL and a local box)
        5.3.2 Nonlocality distillation 2 (using an L and an NL box)
    5.4 Closure under wirings for polytopes
        5.4.1 Limiting the CHSH value
        5.4.2 Noisy NL boxes
        5.4.3 The Uffink set
6 Information-theoretic axioms of QM
    6.1 Quantum nonlocality as an axiom
    6.2 Nonlocality and communication complexity
        6.2.1 Communication complexity
        6.2.2 Maximal nonlocality and the inner product
        6.2.3 Distributive decision problems and the IP N
        6.2.4 Other nonlocal boxes
        6.2.5 Discussion
    6.3 The optimality of quantum correlations
        6.3.1 Product states and entangled states
        6.3.2 Three theorems about states in C
        6.3.3 The impossibility of entanglement swapping in C
        6.3.4 The impossibility of teleportation in C
        6.3.5 Further study in the optimality of Q
7 Conclusions
1
Introduction
Quantum mechanics (QM) is weird, and nobody understands it, according to R. Feynman [15]. Despite being a fundamental physical theory, it is not deterministic, and it predicts (correctly) the existence of nonlocal correlations in our universe. Although it has agreed with every experiment performed to date, it remains a very confusing theory, as Feynman points out. At the heart of this confusion is the paradigm around which QM has been constructed. Unlike other fundamental theories such as special relativity, the postulates of QM are purely mathematical, involving complex vectors in a Hilbert space [29, 25]. In special relativity, physical constraints like the speed of light, and philosophically satisfying principles like the isotropy and homogeneity of space-time [23], make the theory friendlier. Physicists have yet to find comparably physical fundamental principles that would describe QM.
The weirdness of QM has been an area of study since the beginnings of quantum theory.
Quantum information theory (QIT), which is a relatively recent field, approaches QM from
a different, task-oriented viewpoint: what can one do with systems at the quantum scale,
to invent new computational and communicational methods?
Despite its original motivation as a practical line of research, QIT has turned out to be a very fruitful area of study for deepening our understanding of QM. Feynman pointed out that classical computers are very inefficient at simulating quantum systems, and that building a computer that works with QM dynamics could make simulations of quantum systems feasible [16]. Following this revolutionary idea, people showed that quantum computers could be used for important computational tasks other than solving physics problems. Clever algorithms have been discovered that would allow a quantum computer to solve problems that are considered not to be solvable by classical computers [21]. Along similar lines, communication and cryptography protocols that can only be executed by using quantum correlations have been developed [29, 21]. This area of research has also mapped out the boundaries of the information-processing power of quantum correlations, which promises to single out information-theoretic principles that might show us why QM is so special.
This thesis surveys some of the recent developments in QIT, to look at some of the emergent properties of QM from an information-theoretic viewpoint. This work focuses on the study of probabilistic systems in general: in the classical, quantum and post-quantum¹ universes. By placing QM amongst other probabilistic theories, we are able to see the physical properties unique to QM. Since Popescu and Rohrlich showed that there are theories more nonlocal than QM that still obey special relativity [24], many physicists have tried to discover the ways in which QM differs from other causal probabilistic theories. By finding information-theoretic principles that are unique to QM, we can redefine QM with more "physical" axioms.
This thesis outlines the recent developments in QIT that offer new perspectives on the foundational problems in quantum theory. The recent developments are interesting because they are mostly based on relatively simple mathematics, like Bell's theorem. The fact that simple ideas (simple in terms of mathematical complexity) can still lead to such important discoveries in QM shows that we still understand very little of the quantum world. The study of correlations, which is free from the formalism of QM since it only deals with probabilities, promises to be a useful area of study, both for understanding the foundations of QM better and for finding new practical uses of this weird behavior.
Chapters 2 and 3 introduce the general formalism that is used to investigate probabilistic theories. Chapter 4 introduces the basic concepts in quantum information theory and the important communication protocols that exploit the non-classical behavior of quantum systems. Chapter 5 introduces some of the tools that are used to investigate probabilistic theories that are more nonlocal than QM. Finally, in Chapter 6, the probabilistic theories that are more nonlocal than QM are investigated, and their similarities to and differences from QM are presented. The final remarks are presented in the Conclusions section.
¹ Post-quantum refers to correlations that are "more nonlocal" than is allowed by QM.
2
Classical correlation polytopes
In this chapter, we will present a formalism that will be used to investigate correlations.
We are mainly interested in correlations and joint probabilities in classical, quantum and
post-quantum universes. These quantities will be investigated with geometric and algebraic
tools.
It is appropriate to define some of the geometric terms that will be used. A polytope is a
geometric object with flat sides that can exist in spaces of different dimensions. Polygon and
polyhedron are the specific names for polytopes in two and three dimensions, respectively.
The convex hull of a set of points X is the smallest convex set that contains X. A convex
set is a set that curves out, in the sense that it contains all the line segments between each
pair of its points.
Furthermore, a convex polytope can be defined as the convex hull of a finite set of
points or as a bounded intersection of a finite set of half-spaces [34]. This will be explained
further as a consequence of the Weyl-Minkowski theorem [23].
2.1
Two independent events
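Now we show how to represent a set of probabilities describing a set of events geometrically, using a simple example. Consider two statements $S_1$ and $S_2$ that are independent of each other, and their respective probabilities of being true, $p_1$ and $p_2$. We also define $p_{12}$ as the probability that both statements are true. By common sense, a joint probability cannot exceed either of the individual probabilities, and every probability lies between zero and one, so we can write the following inequalities:
$$0 \le p_{12} \le p_1 \le 1, \tag{2.1}$$
$$0 \le p_{12} \le p_2 \le 1. \tag{2.2}$$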
To completely describe the system, we need another inequality. Equations (2.1) and (2.2) do not restrict the probability that the first statement or the second statement is true, which cannot be larger than unity. To see this, consider the probability assignment $p_1 = p_2 = 0.6$, $p_{12} = 0.1$. Then the probability that statement 1 or statement 2 is true is $0.6 + 0.6 - 0.1 = 1.1 > 1$. Thus the following inequality is needed:
$$p_1 + p_2 - p_{12} \le 1, \tag{2.3}$$
where $p_{12}$ is subtracted because it is counted twice in $p_1 + p_2$. Equations (2.1), (2.2) and (2.3) are necessary and sufficient conditions to describe a system with two independent statements.
These three inequalities also have a geometrical representation. Consider a three-dimensional real space parameterized by $p_1$, $p_2$ and $p_{12}$. Equations (2.1), (2.2) and (2.3) define the volume in which those inequalities are satisfied (Figure 1).
The vertices of this polytope are the rows of the deterministic truth table for statement 1 and statement 2 ($S_1$ and $S_2$, respectively), as shown in Table 1. This follows from the Weyl-Minkowski theorem, which states that every convex polytope can be described either by its vertices or by its facets. In the first description, the polytope in Figure 1 is the convex hull of the vertices (the rows of Table 1): a vector is an element of the polytope if and only if it is a convex combination of the vertices represented as vectors. A convex combination is a linear combination of points in which every coefficient is non-negative and the coefficients add up to 1. In the second description, the polytope is the volume that satisfies the inequalities (2.1), (2.2) and (2.3).
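The two descriptions can be compared numerically. The following is a minimal sketch, not part of the original derivation; it assumes that (2.1) and (2.2) read $0 \le p_{12} \le p_1 \le 1$ and $0 \le p_{12} \le p_2 \le 1$, and the function names are hypothetical. It checks that the truth-table vertices and one convex combination of them satisfy the inequalities, while the assignment $p_1 = p_2 = 0.6$, $p_{12} = 0.1$ does not.

```python
from itertools import product


def two_event_vertices():
    """Rows of the deterministic truth table: (S1, S2, S1 AND S2)."""
    return [(s1, s2, s1 & s2) for s1, s2 in product((0, 1), repeat=2)]


def in_polytope(p1, p2, p12, tol=1e-12):
    """Facet description: assumed forms of (2.1) and (2.2), plus (2.3)."""
    holds_2_1 = -tol <= p12 <= p1 + tol and p1 <= 1 + tol
    holds_2_2 = -tol <= p12 <= p2 + tol and p2 <= 1 + tol
    holds_2_3 = p1 + p2 - p12 <= 1 + tol
    return holds_2_1 and holds_2_2 and holds_2_3


def convex_combination(weights, points):
    """Weighted sum of points; weights are non-negative and sum to 1."""
    dim = len(points[0])
    return tuple(sum(w * pt[k] for w, pt in zip(weights, points)) for k in range(dim))


vertices = two_event_vertices()
# An arbitrary illustrative mixture of the four vertices.
mix = convex_combination([0.1, 0.3, 0.2, 0.4], vertices)
```

Here `in_polytope(*mix)` holds because `mix` is built from the vertices, whereas `in_polytope(0.6, 0.6, 0.1)` fails on (2.3), matching the example in the text.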
Figure 1: This polytope contains all the $(p_1, p_2, p_{12})$ vectors that could represent our two-statement system.
S_1   S_2   S_1 ∧ S_2
0     0     0
1     0     0
0     1     0
1     1     1
Table 1: $S_1 \wedge S_2$ denotes the logical operation $S_1$ and $S_2$.
2.2
Three independent events
Extending this geometrical formalism to three distinct events (statements) is straightforward. Consider three independent events. The truth table is simple to calculate, and is presented as Table 2.

S_1   S_2   S_3   S_1 ∧ S_2   S_1 ∧ S_3   S_2 ∧ S_3
0     0     0     0           0           0
0     0     1     0           0           0
0     1     0     0           0           0
0     1     1     0           0           1
1     0     0     0           0           0
1     0     1     0           1           0
1     1     0     1           0           0
1     1     1     1           1           1
Table 2: Three independent events.

The columns are the truth values of the events and of the joint pairs of events. We see that this polytope will be six dimensional, with the coordinates $p_1$, $p_2$, $p_3$, $p_{12}$, $p_{13}$ and $p_{23}$. The rows of Table 2 will be the vertices, and the polytope will be the convex hull of those vertices.
Usually the description of a polytope by its vertices is the easier description to write down, but also the less insightful one. Below we find the description of this same polytope in terms of inequalities that restrict the six probabilities.
Similar to equations (2.1) and (2.2), the joint probabilities are less than or equal to the corresponding probabilities of the individual events, and all the probabilities are between zero and one, inclusive. In other words,
$$0 \le p_{ij} \le p_i \le 1, \qquad 0 \le p_{ij} \le p_j \le 1 \qquad \text{for } \{i,j\} \in \{1,2,3\}^2 \text{ and } i < j, \tag{2.4}$$
where $\{i,j\} \in \{1,2,3\}^2$ means that $\{i,j\}$ belongs to the set of all pairs of elements of $\{1,2,3\}$. Thus in this case, $\{1,2,3\}^2 = \{\{1,1\},\{1,2\},\{1,3\},\{2,1\},\{2,2\},\{2,3\},\{3,1\},\{3,2\},\{3,3\}\}$.
Then we state the requirement that the probability that event $i$ or event $j$ will occur is less than or equal to unity, in an analogous fashion to inequality (2.3) (to constrain the probability that at least one event of a pair will occur):
$$p_i + p_j - p_{ij} \le 1 \qquad \text{for } \{i,j\} \in \{1,2,3\}^2 \text{ and } i < j. \tag{2.5}$$
In addition to these, we need an inequality that constrains the probability that event one or event two or event three will occur to lie between zero and one. So we have
$$0 \le p_1 + p_2 + p_3 - p_{12} - p_{13} - p_{23} + p_{123} \le 1, \tag{2.6}$$
since the expression in the middle is the probability that event one or event two or event three will occur. The inequality above would be enough; however, it contains $p_{123}$, the probability that $S_1$ and $S_2$ and $S_3$ will occur, which is not a variable in our six-dimensional space. We get around the problem by introducing more inequalities without the variable $p_{123}$. The inequality
$$p_1 + p_2 + p_3 - p_{12} - p_{13} - p_{23} \le 1 \tag{2.7}$$
holds since $p_{123}$ is non-negative. Since $S_1$ is an arbitrary event, this analysis could apply to $\bar{S}_1$ (not $S_1$) just as well. So we can replace $p_1$ with $1 - p_1$, $p_{12} = p_1 p_2$ with $(1 - p_1) p_2 = p_2 - p_{12}$, and similarly $p_{13} = p_1 p_3$ with $(1 - p_1) p_3 = p_3 - p_{13}$. Then inequality (2.7) becomes
$$(1 - p_1) + p_2 + p_3 - p_2 + p_{12} - p_3 + p_{13} - p_{23} = 1 - p_1 + p_{12} + p_{13} - p_{23} \le 1,$$
that is,
$$p_1 - p_{12} - p_{13} + p_{23} \ge 0. \tag{2.8}$$
Similarly, modifying inequality (2.7) for $S_2$ and $S_3$, we get the following two inequalities, respectively:
$$p_2 - p_{12} - p_{23} + p_{13} \ge 0, \tag{2.9}$$
and
$$p_3 - p_{13} - p_{23} + p_{12} \ge 0. \tag{2.10}$$
The inequalities (2.4), (2.5), (2.7), (2.8), (2.9) and (2.10) describe the facets of this polytope. The interested reader can refer to [23] for a proof that these inequalities are necessary and sufficient to describe the polytope that has the vertices listed in Table 2.
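As a sanity check on the facet description, the sketch below (an illustration, not the proof in [23]) verifies that all eight truth-table vertices satisfy the inequalities, taking (2.4), (2.5) and (2.7) to (2.10) in the forms $0 \le p_{ij} \le p_i, p_j \le 1$, $p_i + p_j - p_{ij} \le 1$, and the three-event inequalities derived above. The point `suspicious` is a hypothetical example chosen to show that (2.8) is not implied by (2.4), (2.5) and (2.7).

```python
from itertools import product


def facets_hold(p, tol=1e-12):
    """p = (p1, p2, p3, p12, p13, p23); True if (2.4), (2.5), (2.7)-(2.10) hold."""
    p1, p2, p3, p12, p13, p23 = p
    single = {1: p1, 2: p2, 3: p3}
    joint = {(1, 2): p12, (1, 3): p13, (2, 3): p23}
    checks = []
    for (i, j), pij in joint.items():
        # (2.4): 0 <= p_ij <= p_i <= 1 and 0 <= p_ij <= p_j <= 1
        checks += [pij >= -tol, pij <= single[i] + tol, pij <= single[j] + tol,
                   single[i] <= 1 + tol, single[j] <= 1 + tol]
        # (2.5): p_i + p_j - p_ij <= 1
        checks.append(single[i] + single[j] - pij <= 1 + tol)
    checks.append(p1 + p2 + p3 - p12 - p13 - p23 <= 1 + tol)  # (2.7)
    checks.append(p1 - p12 - p13 + p23 >= -tol)               # (2.8)
    checks.append(p2 - p12 - p23 + p13 >= -tol)               # (2.9)
    checks.append(p3 - p13 - p23 + p12 >= -tol)               # (2.10)
    return all(checks)


# Vertices: rows of the three-event truth table (S1, S2, S3, S1&S2, S1&S3, S2&S3).
three_event_vertices = [
    (a, b, c, a & b, a & c, b & c) for a, b, c in product((0, 1), repeat=3)
]

# Hypothetical point: satisfies (2.4), (2.5) and (2.7) but violates (2.8).
suspicious = (0.5, 0.5, 0.5, 0.5, 0.5, 0.0)
```

The point `suspicious` asks for $S_1$ to coincide with both $S_2$ and $S_3$ while $S_2$ and $S_3$ never occur together, which no classical assignment can realize; `facets_hold(suspicious)` is accordingly false.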
2.3
Correlation polytopes for an arbitrary number of events
For $n$ events, there are at most $\binom{n}{2}$ distinct pairs of events. Define $S$ as a subset of pairs of numbers from 1 to $n$, such that $S \subseteq \{\{i,j\} \mid 1 \le i < j \le n\}$, which guarantees that we do not double count the pairs. Then $|S| \le \binom{n}{2}$. $C(n, S)$ denotes a correlation polytope involving the probabilities $p_1, p_2, \ldots, p_n$ and $p_{ij}$ for $\{i,j\} \in S$, in an $(n + |S|)$-dimensional real space.
As mentioned above, an arbitrary polytope $C(n, S)$ is in an $(n + |S|)$-dimensional space since there are that many different probabilities to consider. Since the number of vertices is equal to the number of rows of the truth table for $n$ events, there are $2^n$ vertices. Our first example, with vertices from Table 1, is a $C(2, S)$ where $S = \{\{1,2\}\}$. Table 2 listed the vertices of a $C(3, S)$ where $S = \{\{1,2\},\{1,3\},\{2,3\}\}$. Next, we discuss a polytope with $n = 4$.
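The vertex description of a general $C(n, S)$ is easy to generate mechanically. The following sketch (the function name and the 1-based pair convention are assumptions made for illustration) builds one vertex per row of the truth table, with $n$ single-event coordinates followed by $|S|$ joint coordinates.

```python
from itertools import combinations, product


def correlation_polytope_vertices(n, pairs):
    """Vertices of C(n, S): truth values of n events, then S_i AND S_j for each pair in S.

    `pairs` holds 1-based index pairs (i, j) with i != j.
    """
    ordered_pairs = sorted(tuple(sorted(pair)) for pair in pairs)
    vertices = []
    for row in product((0, 1), repeat=n):
        joint = tuple(row[i - 1] & row[j - 1] for i, j in ordered_pairs)
        vertices.append(row + joint)
    return vertices


v2 = correlation_polytope_vertices(2, [(1, 2)])                          # Table 1
v3 = correlation_polytope_vertices(3, list(combinations(range(1, 4), 2)))  # Table 2
v4 = correlation_polytope_vertices(4, [(1, 3), (1, 4), (2, 3), (2, 4)])   # n = 4 example
```

Each call returns $2^n$ vertices of length $n + |S|$, in agreement with the counting argument above.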
2.4
The CHSH polytope
The CHSH polytope (also referred to as the local polytope in this document) is of special interest to us, due to its relation to the CHSH experiment. The CHSH experiment was an experimental verification that QM violates Bell's inequalities. Given two entangled particles with spin, the correlations between the outcomes of the spin measurements on these particles when they are separated were too strong to be local. As will be explained later, the CHSH experiment showed that the quantum universe cannot be explained by a local, realistic theory.
The CHSH polytope is a $C(4, S)$, where $S = \{\{1,3\},\{1,4\},\{2,3\},\{2,4\}\}$. The pairs $\{1,2\}$ and $\{3,4\}$ are not considered. Since $|S| = 4$, this polytope is in an eight ($n + |S|$) dimensional space, and it has $2^n = 2^4 = 16$ vertices. To describe this polytope by its vertices, we construct the truth table for four independent events in Table 3.
S_1   S_2   S_3   S_4   S_1 ∧ S_3   S_1 ∧ S_4   S_2 ∧ S_3   S_2 ∧ S_4
0     0     0     0     0           0           0           0
0     0     0     1     0           0           0           0
0     0     1     0     0           0           0           0
0     1     0     0     0           0           0           0
1     0     0     0     0           0           0           0
0     0     1     1     0           0           0           0
0     1     0     1     0           0           0           1
1     0     0     1     0           1           0           0
0     1     1     0     0           0           1           0
1     0     1     0     1           0           0           0
1     1     0     0     0           0           0           0
0     1     1     1     0           0           1           1
1     1     0     1     0           1           0           1
1     0     1     1     1           1           0           0
1     1     1     0     1           0           1           0
1     1     1     1     1           1           1           1
Table 3: Four independent events. We are only interested in the joint probabilities $p_{13}$, $p_{14}$, $p_{23}$ and $p_{24}$.
Next we proceed to describe the polytope by its inequalities (facets). Analogous to inequalities (2.1), (2.2) and (2.4), we first constrain each probability to lie between zero and unity. We also make sure that the probability that a single event will occur is not less than the joint probability that that event and another specific event will occur:
$$0 \le p_{ij} \le p_i \le 1, \qquad 0 \le p_{ij} \le p_j \le 1 \qquad \text{for } i \in \{1,2\},\ j \in \{3,4\}. \tag{2.11}$$
Then we write the inequality about an event $i$ or another event $j$ occurring, analogous to inequalities (2.3) and (2.5):
$$p_i + p_j - p_{ij} \le 1 \qquad \text{for } i \in \{1,2\},\ j \in \{3,4\}. \tag{2.12}$$
Finally, analogous to inequality (2.8), we have the inequalities arising from the requirement that the probability that one or more of the four events will occur is less than or equal to one:
$$-1 \le p_{13} + p_{14} + p_{24} - p_{23} - p_1 - p_4 \le 0, \tag{2.13}$$
$$-1 \le p_{23} + p_{24} + p_{14} - p_{13} - p_2 - p_4 \le 0, \tag{2.14}$$
$$-1 \le p_{14} + p_{13} + p_{23} - p_{24} - p_1 - p_3 \le 0, \tag{2.15}$$
$$-1 \le p_{24} + p_{23} + p_{13} - p_{14} - p_2 - p_3 \le 0. \tag{2.16}$$
The final four inequalities are called the Clauser-Horne inequalities, and they will be very useful in showing the nonlocal behavior of quantum mechanics.
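The Clauser-Horne inequalities can be checked directly against the 16 vertices. The sketch below assumes the coordinate order $(p_1, p_2, p_3, p_4, p_{13}, p_{14}, p_{23}, p_{24})$ and the forms of (2.13) to (2.16) with bounds $-1$ and $0$; the point `pr_like` is a hypothetical example, not taken from the text, that respects the pairwise constraints (2.11) and (2.12) yet violates (2.15).

```python
from itertools import product


def clauser_horne_values(v):
    """Middle expressions of (2.13)-(2.16) for v = (p1, p2, p3, p4, p13, p14, p23, p24)."""
    p1, p2, p3, p4, p13, p14, p23, p24 = v
    return (
        p13 + p14 + p24 - p23 - p1 - p4,  # (2.13)
        p23 + p24 + p14 - p13 - p2 - p4,  # (2.14)
        p14 + p13 + p23 - p24 - p1 - p3,  # (2.15)
        p24 + p23 + p13 - p14 - p2 - p3,  # (2.16)
    )


# The 16 vertices: rows of the four-event truth table with the four joint columns.
chsh_vertices = [
    (s1, s2, s3, s4, s1 & s3, s1 & s4, s2 & s3, s2 & s4)
    for s1, s2, s3, s4 in product((0, 1), repeat=4)
]

# Every vertex keeps all four expressions between -1 and 0.
vertices_ok = all(
    -1 <= value <= 0 for v in chsh_vertices for value in clauser_horne_values(v)
)

# Hypothetical point outside the polytope: (2.15) evaluates to 0.5 > 0.
pr_like = (0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.0)
```

Because the inequalities are linear, their validity on the vertices extends to every convex combination, which is why checking the 16 rows suffices for the whole polytope.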
2.5
Observations
Pitowsky's formalism [23] of using polytopes to represent correlations and probabilities has been described. It is important to note that only simple logical ideas that can be considered common sense have been used to construct the polytopes described above. Each of these polytopes includes all the probability vectors that can exist for the given number of independent and joint events, according to classical logic. Since the inequalities were not constructed using any physical considerations, these correlation polytopes should be able to represent any kind of system with the corresponding number of independent events and joint events, irrespective of the dynamics underlying them. These inequalities arise from classical logic alone.
3
Bipartite correlation boxes and inequalities
3.1
Bipartite correlation boxes
We define a bipartite correlation box (or just a "box") as a black box that two space-like separated observers have access to. Each party has a set of possible inputs and a set of possible outputs. The kind of box determines the joint probabilities of getting certain joint outputs for certain joint inputs. We assume that the order in which the parties enter their inputs does not change the outcome. In addition, each party receives an output as soon as he or she enters an input. Finally, a correlation box is a one-time measurement device: it can only be used once.
To make these boxes more concrete, assume two space-like separated parties: Alice and Bob. We will consider a binary box. Then Alice and Bob each have two possible inputs to, and two possible outputs from, their parts of the box. Alice's (Bob's) input is denoted by $i$ ($j$), and her (his) output is denoted by $x$ ($y$). Since we are considering a binary box, we have
$$i, j, x, y \in \{0, 1\}.$$
Next, we define the joint probabilities. $\mathrm{Prob}(m_i^A = x, m_j^B = y)$ (or $P_{ij|xy}$ in short) denotes the joint probability of Alice and Bob getting the results $x$ and $y$ if they have input $i$ and $j$, respectively, from their measurements using the box. Thus $P_{00|11}$ would refer to the probability of Alice and Bob both receiving the output 1 if they have both entered the input 0.
Regardless of the kind of dynamics the box has, we can state some of the constraints on the joint probabilities. We know that each of these joint probabilities will be nonnegative and no larger than unity. Thus
$$0 \le P_{ij|xy} \le 1 \quad \text{for } ijxy \in \{0,1\}^4. \tag{3.1}$$
Moreover, these joint probabilities must be normalized, by which we mean that for any set of inputs, the probabilities of getting an output must add up to unity. Thus
$$\sum_{x,y} P_{ij|xy} = 1 \quad \text{for } ij \in \{0,1\}^2. \tag{3.2}$$
Furthermore, since Alice and Bob are space-like separated, one party's choice of input should not affect the output of the other party. For example, the probability of Alice measuring the output 1 when she inputs 0 should not depend on Bob's input.
In other words, the probability of Alice measuring the output 1 if she inputs a zero and Bob inputs $j$ can be written as a sum over the probabilities of the different outcomes for Bob:
$$P_{0j|1?} = \sum_{y} P_{0j|1y}. \tag{3.3}$$
Then we can say that $P_{0j|1?} = P_{0j'|1?}$, where $j'$ is the bitwise complement of $j$. This ensures that Bob choosing to input $j$ or $j'$ does not affect the probability of Alice measuring a 1 when she inputs a zero.
More generally,
$$\sum_{y} P_{ij|xy} = \sum_{y} P_{ij'|xy}. \tag{3.4}$$
Similarly, Bob's measurements should not depend on Alice's measurement setting (input). Thus
$$\sum_{x} P_{ij|xy} = \sum_{x} P_{i'j|xy}. \tag{3.5}$$
Otherwise Alice (Bob) could send information to Bob (Alice) by her (his) choice of input for certain kinds of boxes. This requirement makes sure the measurement results (the correlations) are causal.
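The constraints of this section can be tested on concrete boxes. The sketch below stores a binary box as a dictionary keyed by `(i, j, x, y)` (a representation chosen here for illustration) and assumes that normalization sums over the outputs for each input pair, and that (3.4) and (3.5) equate each party's marginal across the other party's two inputs. Both example boxes are hypothetical.

```python
from itertools import product

BITS = (0, 1)


def is_normalized(box, tol=1e-12):
    """For every input pair (i, j), the four output probabilities sum to 1."""
    return all(
        abs(sum(box[i, j, x, y] for x, y in product(BITS, repeat=2)) - 1) <= tol
        for i, j in product(BITS, repeat=2)
    )


def is_no_signaling(box, tol=1e-12):
    """Each party's marginal must not depend on the other party's input."""
    for i, x in product(BITS, repeat=2):  # Alice's marginal vs. Bob's input j
        m0 = sum(box[i, 0, x, y] for y in BITS)
        m1 = sum(box[i, 1, x, y] for y in BITS)
        if abs(m0 - m1) > tol:
            return False
    for j, y in product(BITS, repeat=2):  # Bob's marginal vs. Alice's input i
        m0 = sum(box[0, j, x, y] for x in BITS)
        m1 = sum(box[1, j, x, y] for x in BITS)
        if abs(m0 - m1) > tol:
            return False
    return True


# Uniformly random outputs: normalized and causal.
uniform_box = {key: 0.25 for key in product(BITS, repeat=4)}

# Alice's output copies Bob's input (x = j), Bob always outputs 0:
# normalized, but Bob can signal to Alice, so it is not causal.
signaling_box = {
    (i, j, x, y): float(x == j and y == 0) for i, j, x, y in product(BITS, repeat=4)
}
```

The second box shows why normalization and positivity alone are not enough: it is a valid set of conditional probabilities, yet Alice learns Bob's input from her own output.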
3.2
The CHSH inequalities revisited
Consider the $C(4, S)$ polytope where $S = \{\{1,3\},\{1,4\},\{2,3\},\{2,4\}\}$. This polytope is defined by the inequalities (2.11), (2.12), (2.13), (2.14), (2.15) and (2.16).
Now we show that the binary bipartite correlation boxes introduced in Section 3.1 can be represented by the CHSH polytope, which is a general space for all probability vectors of systems with four independent events and the four joint probabilities that are listed in $S$. We define the four independent events and their probabilities as:
• $p_1$ is the probability of Alice measuring the output 1 when she inputs a 0, or $P_{i=0|x=1}$.
• $p_2$ is the probability of Alice measuring the output 1 when she inputs a 1, or $P_{i=1|x=1}$.
• $p_3$ is the probability of Bob measuring the output 1 when he inputs a 0, or $P_{j=0|y=1}$.
• $p_4$ is the probability of Bob measuring the output 1 when he inputs a 1, or $P_{j=1|y=1}$.
Then the joint probabilities become:
$$\begin{aligned}
p_{13} &= p_1 \cdot p_3 = P_{i=0|x=1} \cdot P_{j=0|y=1} = P_{00|11}, \\
p_{23} &= p_2 \cdot p_3 = P_{i=1|x=1} \cdot P_{j=0|y=1} = P_{10|11}, \\
p_{14} &= p_1 \cdot p_4 = P_{i=0|x=1} \cdot P_{j=1|y=1} = P_{01|11}, \\
p_{24} &= p_2 \cdot p_4 = P_{i=1|x=1} \cdot P_{j=1|y=1} = P_{11|11}.
\end{aligned} \tag{3.6}$$
Since the probability of an event $k$ not occurring is $1 - p_k$,
$$(1 - p_1) = P_{i=0|x=0}, \quad (1 - p_2) = P_{i=1|x=0}, \quad (1 - p_3) = P_{j=0|y=0}, \quad (1 - p_4) = P_{j=1|y=0} \tag{3.7}$$
follows from the definitions of $p_1$, $p_2$, $p_3$ and $p_4$. Then the remaining joint probabilities are:
$$\begin{aligned}
P_{00|00} &= (1 - p_1) \cdot (1 - p_3) = 1 + p_{13} - p_1 - p_3, \\
P_{10|00} &= (1 - p_2) \cdot (1 - p_3) = 1 + p_{23} - p_2 - p_3, \\
P_{01|00} &= (1 - p_1) \cdot (1 - p_4) = 1 + p_{14} - p_1 - p_4, \\
P_{11|00} &= (1 - p_2) \cdot (1 - p_4) = 1 + p_{24} - p_2 - p_4.
\end{aligned} \tag{3.8}$$
This ends the mapping from $p_k$ to $P_{ij|xy}$. Note that only the probabilities $p_1$, $p_2$, $p_3$, $p_4$, $p_{13}$, $p_{23}$, $p_{14}$ and $p_{24}$ are going to be the coordinates of the polytope. Thus we only need the joint probabilities listed in equations (3.6). The probabilities in equations (3.7) and (3.8) can be extracted from the former.
For a given measurement setting $(i, j)$, a correlation value $C(i, j)$ is defined as:
$$C(i, j) = P_{ij|00} + P_{ij|11} - P_{ij|10} - P_{ij|01}. \tag{3.9}$$
We see that $C(i, j)$ stands for the correlation between the measurement results of Alice and Bob, if they input $i$ and $j$, respectively. We can write the correlation in terms of the notation of equations (3.6), (3.7) and (3.8). For example $C(1, 1)$, the correlation between the measurements if Alice and Bob both input 1, is:
$$C(1, 1) = P_{11|00} + P_{11|11} - P_{11|10} - P_{11|01} = 4p_{24} - 2p_2 - 2p_4 + 1. \tag{3.10}$$
Similarly:
$$C(0, 0) = 4p_{13} - 2p_1 - 2p_3 + 1, \tag{3.11}$$
$$C(0, 1) = 4p_{14} - 2p_1 - 2p_4 + 1, \tag{3.12}$$
$$C(1, 0) = 4p_{23} - 2p_2 - 2p_3 + 1. \tag{3.13}$$
Now we define the value CHSH as:
$$\mathrm{CHSH} = C(0, 0) + C(0, 1) + C(1, 0) - C(1, 1). \tag{3.14}$$
Using equations (3.10), (3.11), (3.12) and (3.13), CHSH can be written as:
$$\mathrm{CHSH} = 4\,(p_{13} + p_{14} + p_{23} - p_{24} - p_1 - p_3) + 2. \tag{3.15}$$
We see that inequality (2.15) can be rewritten using the equality in (3.15) as
$$-1 \le \frac{\mathrm{CHSH} - 2}{4} \le 0, \tag{3.16}$$
or
$$-2 \le \mathrm{CHSH} \le 2. \tag{3.17}$$
This is the well-known CHSH inequality, one of the more popular Bell-type inequalities. Following the same derivation, inequalities (2.13), (2.14) and (2.16) also lead to similar Bell-type inequalities.
It is worth emphasizing again that these inequalities are imposed by the classical axioms of probability. They are independent of any physical system; they follow from the purely mathematical character of probabilities and propositional logic.
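A short numerical sketch can confirm the algebra of this section. It assumes that the correlation value is $C(i,j) = P_{ij|00} + P_{ij|11} - P_{ij|10} - P_{ij|01}$, as in (3.10), and that the closed form of CHSH in terms of the polytope coordinates is $4(p_{13} + p_{14} + p_{23} - p_{24} - p_1 - p_3) + 2$, which is what (3.11) to (3.14) give when combined. The helper `product_box` is a hypothetical construction of a box with independent outputs, matching the product form used in (3.6).

```python
from itertools import product

BITS = (0, 1)


def correlation(box, i, j):
    """C(i, j): probability of equal outputs minus probability of different outputs."""
    return sum(
        (1 if x == y else -1) * box[i, j, x, y] for x, y in product(BITS, repeat=2)
    )


def chsh(box):
    """CHSH value as defined in (3.14)."""
    return (correlation(box, 0, 0) + correlation(box, 0, 1)
            + correlation(box, 1, 0) - correlation(box, 1, 1))


def product_box(p1, p2, p3, p4):
    """Independent outputs: p1, p2 = Alice's P(x=1) for i=0,1; p3, p4 = Bob's P(y=1) for j=0,1."""
    alice = {0: p1, 1: p2}
    bob = {0: p3, 1: p4}
    return {
        (i, j, x, y): (alice[i] if x else 1 - alice[i]) * (bob[j] if y else 1 - bob[j])
        for i, j, x, y in product(BITS, repeat=4)
    }


def chsh_from_coordinates(p1, p2, p3, p4):
    """Closed form in the polytope coordinates, with p_ij = p_i * p_j as in (3.6)."""
    p13, p14, p23, p24 = p1 * p3, p1 * p4, p2 * p3, p2 * p4
    return 4 * (p13 + p14 + p23 - p24 - p1 - p3) + 2


# CHSH for the 16 deterministic assignments (the rows of the truth table).
deterministic_values = [chsh(product_box(*bits)) for bits in product(BITS, repeat=4)]
```

Every deterministic assignment gives a CHSH value of exactly $+2$ or $-2$, so the bound (3.17) is saturated at the vertices and, by linearity, holds for all their convex combinations.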
3.3
The local polytope ($\mathcal{L}$)
In Section 3.1, we introduced an abstract way of defining correlation sources. The bipartite correlation boxes that were defined can be used to represent any kind of correlations between measurements made by two space-like separated parties. The constraints in that section required the joint probabilities to be normalized, nonnegative and causal. To be more specific, inequality (3.1) constrained the joint probabilities to be between 0 and 1, and equation (3.2) made sure the joint probabilities were normalized. Equations (3.4) and (3.5), however, made sure that Alice and Bob could not send information to each other faster than the speed of light. This is also called the no-signaling condition in the literature. These constraints are the trivial facets of the CHSH polytope.
Following these constraints, in Section 3.2 we expressed in terms of the CHSH value another inequality, which was derived in Section 2.4. This constraint on the CHSH value was derived from the fact that the probability of one or more of the four independent events occurring cannot be larger than unity. The CHSH inequalities are the nontrivial facets of the local polytope.
Any probability vector (set of joint probabilities or set of correlations) that exists in the CHSH polytope (Section 2.4) is called a local box. These boxes obey all the restrictions that arise from the classical axioms of probability. For this reason these boxes are considered to obey locality; the justification will be explained later.
Each of these boxes is represented by a point (or a vector to that point) in our eight-dimensional probability space. These points form the CHSH polytope, which was shown to have 16 vertices. We will call these vertices the local vertices. Remember that the local vertices are the rows of the deterministic truth table (Table 3), and thus their coordinates are either zero or one.
Since these boxes obey the classical axioms of probability, we can write them as [10]:
$$P_{ij|xy} = \int d\lambda \, p_\lambda \, P_{i|x}(\lambda) \, P_{j|y}(\lambda), \tag{3.18}$$
where $\lambda$ is the shared random data, which is continuous, $p_\lambda$ is the probability that a certain $\lambda$ will occur, and $P_{i|x}(\lambda)$ and $P_{j|y}(\lambda)$ are the local output probabilities of Alice and Bob for a given $\lambda$.
3.4
The causal polytope (C)
Now consider the causal polytope, which is in the same space as the CHSH polytope. It is bounded by the same inequalities except for the CHSH inequalities ((2.13), (2.14), (2.15) and (2.16)). This polytope only has the trivial facets of the local polytope, and other trivial facets that preserve causality and normalization.
Figure 2 shows a schematic representation of the local polytope ($\mathcal{L}$) and the causal polytope ($\mathcal{C}$). The dashed lines represent the nontrivial facets of the local polytope, which are the CHSH inequalities. The solid lines represent the causality, positivity and normalization constraints.
Figure 2: A schematic representation of the two polytopes [7].
The deterministic local boxes (the local vertices) are the rows of Table 3. Something interesting to note here is that these local boxes are all equivalent to each other. Since our assignment of the $p_k$ values to the bipartite joint probabilities is arbitrary, as seen in equations (3.6), the local boxes can be converted to each other with reversible operations. For example, instead of assigning $p_1$ to $P_{i=0|x=1}$, we could have assigned it to $P_{j=1|y=0}$. The dynamics of the box would be identical. Another way to see this is that we would just be rotating the polytope around some axis that goes through the origin. Because of the exchange symmetry within the four events, the polytope has certain rotational symmetries. Thus with reversible operations we can convert a local box to any other local box.
Then instead of using the rows of Table 3, we can use the general form [7]:
$$P^{\alpha\beta\gamma\delta}_{ij|xy} = \begin{cases} 1 & \text{if } x = \alpha \cdot i \oplus \beta \text{ and } y = \gamma \cdot j \oplus \delta, \\ 0 & \text{else,} \end{cases} \tag{3.19}$$
where $\alpha\beta\gamma\delta \in \{0,1\}^4$, and $\oplus$ denotes addition modulo 2. This follows from the fact that the vertices must represent deterministic correlations, which follows from the Weyl-Minkowski theorem (Section 2.1).
The 16 local vertices (denoted by L in Figure 2) of $\mathcal{L}$ are also vertices of $\mathcal{C}$. In addition to these local boxes, $\mathcal{C}$ has other vertices, denoted by NL in Figure 2. Similar to the local boxes, the nonlocal boxes are also equivalent to each other, and can be converted to each other. These NL boxes can be expressed generally as [7]:
$$P^{\alpha\beta\gamma}_{ij|xy} = \begin{cases} 1/2 & \text{if } x \oplus y = i \cdot j \oplus \alpha \cdot i \oplus \beta \cdot j \oplus \gamma, \\ 0 & \text{else,} \end{cases} \tag{3.20}$$
where $\alpha\beta\gamma \in \{0,1\}^3$. The justification for the name will be presented later.
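Both families of vertices can be generated and checked against the constraints of Section 3.1. The sketch below assumes the forms $x = \alpha \cdot i \oplus \beta$, $y = \gamma \cdot j \oplus \delta$ for the 16 local vertices and $x \oplus y = i \cdot j \oplus \alpha \cdot i \oplus \beta \cdot j \oplus \gamma$ for the 8 nonlocal vertices; the dictionary representation and function names are illustrative choices.

```python
from itertools import product

BITS = (0, 1)


def local_vertex(alpha, beta, gamma, delta):
    """Deterministic box: x = alpha*i XOR beta and y = gamma*j XOR delta."""
    return {
        (i, j, x, y): 1.0 if (x == ((alpha * i) ^ beta) and y == ((gamma * j) ^ delta)) else 0.0
        for i, j, x, y in product(BITS, repeat=4)
    }


def nonlocal_vertex(alpha, beta, gamma):
    """NL box: probability 1/2 on outputs with x XOR y = i*j XOR alpha*i XOR beta*j XOR gamma."""
    return {
        (i, j, x, y): 0.5 if (x ^ y) == ((i * j) ^ (alpha * i) ^ (beta * j) ^ gamma) else 0.0
        for i, j, x, y in product(BITS, repeat=4)
    }


def output_sum(box, i, j):
    """Normalization check: total probability for the input pair (i, j)."""
    return sum(box[i, j, x, y] for x, y in product(BITS, repeat=2))


def marginal_alice(box, i, j, x):
    """Probability that Alice sees x, given inputs (i, j)."""
    return sum(box[i, j, x, y] for y in BITS)


local_vertices = [local_vertex(*bits) for bits in product(BITS, repeat=4)]
nonlocal_vertices = [nonlocal_vertex(*bits) for bits in product(BITS, repeat=3)]
```

For every nonlocal vertex, Alice's marginal is 1/2 regardless of Bob's input, so these boxes are causal even though the joint outputs are perfectly correlated or anticorrelated.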
3.5
Nonlocal boxes (PR boxes)
One of the four nontrivial inequalities in Section 2.4, (2.15) to be more specific, led to the
well-known CHSH inequality (3.17) derived in Section 3.2. What about the other three
nontrivial inequalities in the same section?
Further analysis shows that there are in fact seven more CHSH-like values we can define,
as in (3.14), that will also lead to an expression:

$$-2 \le \mathrm{CHSH}_k \le 2 \quad \text{for } 2 \le k \le 8. \tag{3.21}$$
First define $\mathrm{CHSH}_2$ as $-\mathrm{CHSH}_1$. It is straightforward to show that equation (3.21) holds
for $\mathrm{CHSH}_2$. Next define $\mathrm{CHSH}_3$ as

$$\mathrm{CHSH}_3 = -C(0,0) + C(0,1) + C(1,0) + C(1,1). \tag{3.22}$$
(3.23)
Then we can write equation (2.14) as $-2 \le \mathrm{CHSH}_{3,4} \le 2$, where $\mathrm{CHSH}_4 = -\mathrm{CHSH}_3$.
Equations (2.13) and (2.16) can also be put in the same form, in terms of $\mathrm{CHSH}_k$ for
$k \in \{5,6\}$ and $k \in \{7,8\}$, respectively. Put in the more general form,

$$-2 \le \mathrm{CHSH}_k = (-1)^{\gamma} C(0,0) + (-1)^{\beta+\gamma} C(0,1) + (-1)^{\gamma+\alpha} C(1,0) + (-1)^{\gamma+\alpha+\beta+1} C(1,1) \le 2 \tag{3.24}$$

for $1 \le k \le 8$ and $\alpha\beta\gamma \in \{0,1\}^3$. These eight CHSH inequalities are precisely the eight
nontrivial facets of $\mathcal{L}$, or the 8 hyperplane boundaries between $\mathcal{L}$ and $\mathcal{C}$.
Note the one-to-one correspondence between the nonlocal vertices defined
by equation (3.20) and the nontrivial facets defined in equation (3.24). Moreover,
the nonlocal vertices each seem to violate one of the nontrivial facets maximally. For
example, consider the $\mathrm{CHSH}_1$ facet. For this inequality $\alpha = \beta = \gamma = 0$, and thus
equation (3.20) requires that $\mathcal{P}_{ij|xy}$ is 1/2 iff $x \oplus y = i \cdot j$. In this case, $\mathrm{CHSH}_1$, defined
by equation (3.14), becomes 4, which is the algebraic maximum for that value.
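The correspondence between equations (3.20) and (3.24) can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and since equation (3.14) is not reproduced in this section, the correlator C(i, j) is assumed to be P(x = y | i, j) - P(x ≠ y | i, j).

```python
from itertools import product

def pr_box(alpha, beta, gamma):
    """Nonlocal box of equation (3.20)."""
    box = {}
    for i, j, x, y in product((0, 1), repeat=4):
        hit = (x ^ y) == ((i * j) ^ (alpha * i) ^ (beta * j) ^ gamma)
        box[(i, j, x, y)] = 0.5 if hit else 0.0
    return box

def correlator(box, i, j):
    """Assumed form of C(i, j): P(x = y | i, j) - P(x != y | i, j)."""
    return sum((1 if x == y else -1) * box[(i, j, x, y)]
               for x, y in product((0, 1), repeat=2))

def chsh(box, alpha, beta, gamma):
    """CHSH_k of equation (3.24) for the facet labeled (alpha, beta, gamma)."""
    return ((-1) ** gamma * correlator(box, 0, 0)
            + (-1) ** (beta + gamma) * correlator(box, 0, 1)
            + (-1) ** (gamma + alpha) * correlator(box, 1, 0)
            + (-1) ** (gamma + alpha + beta + 1) * correlator(box, 1, 1))

labels = list(product((0, 1), repeat=3))
# chsh_table[vertex][facet] is the CHSH value of that vertex on that facet.
chsh_table = {v: {f: chsh(pr_box(*v), *f) for f in labels} for v in labels}
```

Under this assumed correlator, each nonlocal vertex reaches the value 4 on the facet carrying the same label and on no other facet.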
This is quite intriguing. There is a one-to-one mapping from the nontrivial facets to
the nonlocal vertices, and these vertices each violate the corresponding facet (inequality)
maximally. I suspect this is related to why the nonlocal boxes trivialize communication
complexity, which will be discussed in Section 6.2.5. More importantly, this provides us with a
way of quantifying the nonlocality of correlations. Since the nonlocal vertices give algebraically
maximal $\mathrm{CHSH}_k$ values, it makes sense to say these vertices are maximally nonlocal. Then
the larger the $\mathrm{CHSH}_k$ value, the more nonlocal the given set of correlations is. This also
seems to be the most common way of quantifying nonlocality in the literature.
3.6 Nonlocality
We used the term nonlocality several times in the previous sections, but only in a mathematical sense. We defined any probability vector that is outside the boundaries of the CHSH
facets as nonlocal. Looking at the geometric structure for this set of probability vectors
(basically points in $\mathcal{C} \setminus \mathcal{L}$) may seem unnecessary, since these joint probabilities can never
be observed in the universe.
It is in fact misleading to call the set of correlation boxes (probability vectors) in $\mathcal{C} \setminus \mathcal{L}$
nonlocal. A set of correlations can be observed in a local physical system and yet still
exist in $\mathcal{C} \setminus \mathcal{L}$. The fact that a causal box is not in $\mathcal{L}$ can be explained by the fact that the
classical axioms of probability do not hold for that box, since we have only defined $\mathcal{L}$
using those axioms.
One example of such an explanation is the following. Throughout our derivation in
Chapter 2, we have assumed that the probability that event 1 and event 2 both occur is equal to
$P_{12} = P_1 \cdot P_2$. However, this assumption does not hold if a measurement involving event
1 interferes with the measurement of event 2. This would explain why some probability
vectors do not obey the CHSH inequalities. However, further thought shows that since the
parties are space-like separated, they could not possibly interfere with each other's measurements instantly in a causal system. In general, arguments involving the applicability
of classical logic operations turn out to be non-causality arguments in disguise.
It turns out that it is hard to define the physical meaning of the term nonlocality,
the way it is used in the literature today. Causality and determinism, by contrast, have straightforward
physical interpretations. In the rest of the document, we will use the term "nonlocal" as
it is used in the literature today: a set of joint probabilities that violate Bell-type inequalities.
We will see that the nonlocal correlations in the universe are worth investigating because
they allow algorithms for processing information that are not possible with local correlations.
In the next chapter, we discuss "nonlocality" in the context of quantum mechanics. We
see how parties that possess bipartite correlation boxes that can be simulated by QM can
outperform classical methods for communication.
3.7 Quantum boxes
3.7.1 Nonlocal correlations in QM
It is well known that there exist correlations in QM that are not in the local polytope $\mathcal{L}$. We
now give a concrete example of QM violating the CHSH inequality. Consider the following
quantum state as a binary bipartite correlation box:
(3.25)
where $|\uparrow\rangle_z$ and $|\downarrow\rangle_z$ represent the states of up and down spin in the Z direction, respectively.
The first spin corresponds to the particle that Alice has, and the other corresponds to the
particle that Bob has. These two entangled particles are the bipartite box that is shared by
Alice and Bob. Their choices of input settings are:
• i=0 represents Alice measuring the spin of her particle along the Z axis,
• i=1 represents Alice measuring the spin of her particle along the X axis,
• j=0 represents Bob measuring the spin of his particle along the $-\frac{z+x}{\sqrt{2}}$ axis,
• j=1 represents Bob measuring the spin of his particle along the $\frac{z-x}{\sqrt{2}}$ axis,
and for the outputs, measuring an up spin corresponds to the output being 1, and 0 else.
A quick calculation shows that:

$$\mathrm{CHSH}_1 = -C(1,1) + C(0,0) + C(0,1) + C(1,0) = \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}} = 2\sqrt{2} > 2.$$
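A calculation of this kind can be reproduced with a few lines of Python. The sketch below is not the exact configuration of the text: it assumes the state $(|00\rangle + |11\rangle)/\sqrt{2}$, Alice measuring Z or X, and Bob measuring $(Z+X)/\sqrt{2}$ or $(Z-X)/\sqrt{2}$, and it takes C(i, j) to be the expectation value of the product of the two ±1-valued observables. All names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
Z = [[1, 0], [0, -1]]
X = [[0, 1], [1, 0]]

def combine(a, m1, b, m2):
    """Return the 2x2 observable a*m1 + b*m2."""
    return [[a * m1[r][c] + b * m2[r][c] for c in range(2)] for r in range(2)]

def kron(m1, m2):
    """Kronecker product of two 2x2 matrices (a 4x4 matrix)."""
    return [[m1[r // 2][c // 2] * m2[r % 2][c % 2] for c in range(4)]
            for r in range(4)]

def expectation(state, op):
    """<state| op |state> for a real 4-component state vector."""
    return sum(state[r] * op[r][c] * state[c]
               for r in range(4) for c in range(4))

# Assumed state (|00> + |11>)/sqrt(2), basis order |00>, |01>, |10>, |11>.
phi_plus = [S, 0, 0, S]
alice = [Z, X]                                      # settings i = 0, 1
bob = [combine(S, Z, S, X), combine(S, Z, -S, X)]   # settings j = 0, 1

def C(i, j):
    """Correlator for settings (i, j) under the assumptions above."""
    return expectation(phi_plus, kron(alice[i], bob[j]))

chsh_1 = C(0, 0) + C(0, 1) + C(1, 0) - C(1, 1)
```

With these assumed settings each of the four terms contributes $1/\sqrt{2}$, so `chsh_1` evaluates to $2\sqrt{2}$ up to floating-point rounding.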
This was a major concern in the early days of QM, as voiced in the well-known paper by
Einstein, Podolsky and Rosen [14]. EPR claimed that this nonlocality had two possible
explanations:
• Space-like separated measurements affect each other instantly, violating special relativity.
• QM as a theory is incomplete, in the sense that there are physical properties of a
quantum system that quantum theory cannot account for.
There is still debate about whether quantum systems can be explained by a deterministic
theory [28, 27, 23]. If quantum systems are deterministic, then QM is incomplete. Otherwise, QM is complete in the sense that it predicts everything that can be predicted about
physical systems.
If deterministic theories can be constructed for quantum systems, then we know that
they will not obey the classical axioms of logic, as they cannot obey the Bell-type inequalities
and still be causal. The reason is that some quantum correlations exist outside of $\mathcal{L}$,
which means that inequalities like equation (2.15) are not obeyed by a set of joint
probabilities in the universe. Some have tried to explain this by the argument shown in
the previous section, which basically states that $P_{12} \ne P_1 \cdot P_2$. However, this would violate
causality as discussed before.
It is the view of the author that, for this reason, calling correlations outside of $\mathcal{L}$ nonlocal
is misleading. It is true that these correlations do not exist in the classical world, and that
they lead to computational and communicational algorithms that are otherwise impossible. However, these correlations are not actually nonlocal. They are causal: no influences
between space-like separated systems are allowed. They are simply stronger correlations than the
ones in $\mathcal{L}$.
Supporting this view is the fact that local hidden variable theories can actually be
constructed [23]. However, these theories cannot be classical, in the sense that they cannot
satisfy the classical axioms about logic (as we have seen from the CHSH inequalities). The
local hidden variable theories, despite being complete and local, are not necessarily more
satisfactory than QM. They are not physically more intuitive, since the classical axioms are abandoned. In addition, like any other hidden variable theory, they do not predict different
measurements or dynamics than QM.
For the rest of this document, the term "nonlocality" will be used as it is commonly
used in the literature today. A nonlocal set of correlations will still refer to points in C,
and will be causal. They will only violate Bell-type inequalities. Now we investigate Q,
the set of quantum correlations.
3.7.2 The bound on the nonlocality of correlations in ($\mathcal{Q}$)
Below we present the elementary proof of Tsirelson's bound on the nonlocality of quantum
correlations [30]. Tsirelson showed that the CHSH value of $2\sqrt{2}$ achieved by the EPR
states is the maximum CHSH value that can be observed in quantum mechanics.
Consider the 4 unitary operators $A_1$, $A_2$, $B_1$ and $B_2$. $A_k$ and $B_l$ are measurements on
separated systems, thus they commute. Alice performing a measurement $A$ does not affect
the outcome probabilities for Bob. Consider the $\mathrm{CHSH}_1$ value for these four measurements:
(3.26)
Using the fact that $A_k B_l = B_l A_k$, the CHSH value can be written as (after some algebra):
Each term after the first one is negative, being the negative of a squared value. Then
$\mathrm{CHSH}_1 \le \frac{1}{\sqrt{2}} (A_1^2 + A_2^2 + B_1^2 + B_2^2)$. Since the operators are unitary and, being observables, Hermitian, each of them squares to the identity, so $A_1^2 + A_2^2 + B_1^2 + B_2^2 = 4$.
Then $\mathrm{CHSH}_1 \le 2\sqrt{2}$. The other $\mathrm{CHSH}_k$ can also be shown to be limited by $2\sqrt{2}$ if a
similar calculation is done using equation (3.24).
3.7.3 The set of quantum correlations ($\mathcal{Q}$)
Figure 3: (Color in online version) A schematic representation of the two polytopes and the quantum
set. The curved boundary encloses $\mathcal{Q}$, which is depicted in pink. The brown lines represent the CHSH
inequalities. Note that $\mathcal{L} \subset \mathcal{Q} \subset \mathcal{C}$ [7].
The set of quantum correlations, $\mathcal{Q}$, contains all the correlations that can be obtained
from a bipartite quantum state, like the one in the previous section. $\mathcal{Q}$ is convex, but
not a polytope. Its boundary is curved, with an infinite number of vertices [30, 20, 23]. $\mathcal{Q}$
is a strict subset of the polytope that is bounded by $\mathrm{CHSH}_k = 2\sqrt{2}$ for $1 \le k \le 8$.
Thus there are boxes outside of $\mathcal{Q}$ that have CHSH values less than $2\sqrt{2}$ (Figure 4). The
fact that quantum correlations cannot achieve CHSH values larger than $2\sqrt{2}$ was proven
independently in [30, 20] and [19], and Tsirelson's proof was presented in the previous
section. The correlations in $\mathcal{Q} \setminus \mathcal{L}$ lead to dynamics very different from correlations that are
in $\mathcal{L}$. Use of correlations in $\mathcal{Q}$ allows one to compute things efficiently that cannot be computed efficiently
with classical computers [21, 16]. Similarly, quantum communication properties
such as no-broadcasting, teleportation and superdense coding exist in $\mathcal{Q} \setminus \mathcal{L}$ and not in
$\mathcal{L}$ [32, 29, 21].

Figure 4: (Color in online version) Note that $\mathrm{CHSH} = 2\sqrt{2}$ is the maximal CHSH value quantum correlations can reach. However, there are boxes that have CHSH values smaller than $2\sqrt{2}$, even some with CHSH
value close to 2, that are post-quantum [15].
4 Some features of quantum information theory
4.1 The fundamental unit of quantum information
In classical information theory, a bit is the fundamental unit of information. A bit can have
two distinct stable states, which are usually denoted by the binary values 0 and 1. Classical
computers process information by storing and manipulating bits. Usually the voltage of a
node in a circuit is discretely considered to be on or off, depending on its value being above
or below a certain threshold value. Consequently, a bit at a given time is either 1 or 0.
The quantum mechanical analogue of a bit is a quantum bit, or qubit for short. A
qubit is fundamentally different from a bit, which is an important underlying reason
why quantum information processing is so different from its classical counterpart. To first
define a qubit from an abstract perspective, recall that a classical bit can be thought
of as a two-state system represented by the binary values 0 and 1. Similarly, a qubit can
be thought of as a superposition of two states represented by a linear combination of two
vectors. The usual notation is as follows:

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \tag{4.1}$$

where $\alpha$ and $\beta$ are complex numbers, $|0\rangle$ represents the first state, and $|1\rangle$ the second.
They are analogous to the classical states 0 and 1. The difference is that, unlike
a bit, a qubit does not have to exist in one of the two states. It can exist in a superposition of the
two different states. Physically, the spin degree of freedom of an electron, or the polarization
of a photon, can be represented by a qubit.
In classical computation, the value of a bit can be measured. We know that a node in a
circuit is either at a high or low potential before we measure it. In quantum computation,
however, a qubit does not have to be in either state until we measure it. The complex
coefficients $\alpha$ and $\beta$ determine the probabilities of measuring the system to be in either of
the states. The major difference is that we cannot always examine a qubit to determine
its state. For example, when the spin of an electron in the state $\alpha|\uparrow\rangle + \beta|\downarrow\rangle$ is measured,
there is a $|\alpha|^2$ chance that it will be in the up state and a $|\beta|^2$ chance that it will be in
the down state. After that one measurement, however, the information about the original state
is lost, and further measurements reveal nothing more about it. Thus doing the measurement does not
necessarily tell us the values of $\alpha$ and $\beta$.
4.1.1 Information content of a qubit
After the measurement of its spin, the electron will be found in either the up or the down
state; thus $|\alpha|^2 + |\beta|^2 = 1$. $\alpha$ and $\beta$ can assume a continuous range of values satisfying this
condition. It is useful to think of a qubit as a point on a sphere. Consider rewriting the
state $|\psi\rangle$ as:

$$|\psi\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\phi} \sin\frac{\theta}{2}\,|1\rangle, \tag{4.2}$$

up to a phase factor, which does not affect the probabilities. Then we can think of each
qubit as a point on the unit sphere, with the spherical coordinates $(1, \theta, \phi)$. This representation of a qubit is called the Bloch sphere [21]. Since a qubit can be represented with two
angles that can have a continuous range of values on the sphere, one could be tempted to
conclude that a qubit can store an infinite amount of information. A bit can have only
two values, whereas $\theta$ and $\phi$ can be any real numbers from 0 to $\pi$ and 0 to $2\pi$, respectively.
Not surprisingly, this conclusion turns out to be inaccurate. As mentioned in the previous
subsection, we generally cannot determine the coefficients of a quantum system, since after
one measurement it collapses into one of the states.
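The gap between specifying a qubit and reading it out can be illustrated with a small simulation. The sketch below assumes the standard Bloch-sphere parametrization $\cos(\theta/2)|0\rangle + e^{i\phi}\sin(\theta/2)|1\rangle$ and models each measurement as a random draw on a fresh, identically prepared copy. The function names, the seed and the chosen angles are hypothetical.

```python
import cmath
import math
import random

def bloch_state(theta, phi):
    """Amplitudes (alpha, beta) of cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)

def measure(alpha, beta, rng):
    """One projective measurement: returns 0 with probability |alpha|^2, else 1."""
    return 0 if rng.random() < abs(alpha) ** 2 else 1

rng = random.Random(0)  # fixed seed so the sketch is reproducible
alpha, beta = bloch_state(theta=math.pi / 3, phi=math.pi / 4)

# A single measurement yields one bit and carries no information about phi.
single_shot = measure(alpha, beta, rng)

# Estimating |alpha|^2 = cos^2(pi/6) = 0.75 needs many identical copies.
shots = [measure(alpha, beta, rng) for _ in range(10_000)]
estimate = shots.count(0) / len(shots)
```

The estimate only becomes accurate because the simulation is allowed to prepare the same state ten thousand times, which is exactly what a single transmitted qubit does not permit.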
Thus there seem to be two different kinds of information associated with quantum
systems. It takes an infinite amount of information to specify a qubit, in the sense that
there are infinitely many $(\theta, \phi)$ pairs to choose from when one is describing a qubit.
However, the amount of information that can be extracted from that qubit is bounded.
These two kinds of information seem to coincide for classical information, where the number
of bits needed to specify a state is equal to the number of bits one can learn about that state if the
information is transmitted. On the other hand, consider the case in which Alice sends only
one qubit to Bob. Alice has an infinite number of states to choose her qubit from. However,
once Bob receives this qubit, the best he can do is to measure it with respect to one
pair of orthogonal states. Then Bob would only get one bit of information: whether the
qubit was measured to be in one of the orthogonal states or the other. It was proven by
Holevo that one bit of classical information is the maximum amount of information that
can be transmitted by sending a qubit [18]. We see that although it seems like a qubit
contains an unbounded amount of information in the coefficients, transmission of a qubit
translates into sending one bit of classical information.
One way to get around this problem would be to clone $|\psi\rangle$. Then we could do the
measurement on every clone of $|\psi\rangle$, and get a better estimate of the coefficient values with
every measurement by applying statistics. That would allow us to send an unbounded
number of bits of classical information by sending only one qubit. In the following section,
we will see how QM does not allow this.
4.2 The no-cloning theorem
As mentioned in the previous subsection, if one could clone a qubit, then it would be
possible to get more and more information about that qubit with every measurement
made on a different copy of the qubit. It turns out that this is not possible, since an arbitrary unknown pure
quantum state cannot be cloned [35]. Below we present a proof for a cloner device that
is supposed to clone two arbitrary pure states $|\psi_A\rangle$ and $|\phi_A\rangle$. We will start with a system A and a
system B. System A will have the qubit we want to clone, and system B will start with
an arbitrary initial state $|Y_B\rangle$ that will evolve into the qubit in system A at the end of the
procedure. We want:

$$|\psi_A\rangle |Y_B\rangle \rightarrow |\psi_A\rangle |\psi_B\rangle, \tag{4.3}$$

which is to say that we want to start with the states $\psi$ and $Y$ for systems A and B, and
end with the state $\psi$ for both A and B. As mentioned in the axioms of QM, the evolution
needs to be unitary. We have:

$$U_c |\psi_A\rangle |Y_B\rangle = |\psi_A\rangle |\psi_B\rangle \tag{4.4}$$

for some unitary matrix $U_c$. Since we do not know the state of the system we want to clone, our cloning procedure
should work for any arbitrary state, so we also have

$$U_c |\phi_A\rangle |Y_B\rangle = |\phi_A\rangle |\phi_B\rangle. \tag{4.5}$$

Taking the Hermitian conjugate of equation (4.4) and
multiplying it with equation (4.5), we get:

$$\langle Y_B| \langle \psi_A| U_c^\dagger U_c |\phi_A\rangle |Y_B\rangle = \langle \psi_A | \phi_A \rangle \langle \psi_B | \phi_B \rangle. \tag{4.6}$$

Since $|Y_B\rangle$ is a pure state and $U_c$ is unitary, we get:

$$\langle \psi_A | \phi_A \rangle = \langle \psi_A | \phi_A \rangle^2. \tag{4.7}$$

Then $\langle \psi_A | \phi_A \rangle$ has to be either unity or zero. Physically this means that $|\phi_A\rangle$ and $|\psi_A\rangle$
are either the same state or orthogonal to each other. Since we had started with
the assumption that the cloner would clone any arbitrary state, equation (4.7) presents a
contradiction. Thus a cloning device that clones arbitrary pure states cannot exist.
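The contradiction can be made concrete with one specific candidate cloner. The sketch below uses the controlled-NOT operation (flip the second qubit when the first is $|1\rangle$) as that candidate; this choice, and all function names, are assumptions made for illustration. It copies the two orthogonal basis states but fails on their superposition.

```python
import math

def cnot(state):
    """CNOT on amplitudes ordered |00>, |01>, |10>, |11> (first qubit is the control)."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]

def tensor(u, v):
    """Two-qubit product state |u>|v> from two single-qubit amplitude pairs."""
    return [u[0] * v[0], u[0] * v[1], u[1] * v[0], u[1] * v[1]]

def overlap(u, v):
    """Inner product <u|v> for real amplitude vectors."""
    return sum(x * y for x, y in zip(u, v))

zero, one = [1.0, 0.0], [0.0, 1.0]
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]

# The candidate cloner works on the two orthogonal basis states...
copies_zero = cnot(tensor(zero, zero)) == tensor(zero, zero)
copies_one = cnot(tensor(one, zero)) == tensor(one, one)

# ...but not on their superposition: the output is entangled, not |+>|+>.
attempt = cnot(tensor(plus, zero))
fidelity = overlap(attempt, tensor(plus, plus)) ** 2

# Condition (4.7), s = s^2, fails for the overlap s = <0|+>.
s = overlap(zero, plus)
```

The squared overlap between the attempted copy and a true pair of copies is about 1/2, in line with the requirement that only identical or orthogonal states can be handled by the same device.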
Further study of the possibility of cloning quantum systems could ask if it is possible
to clone mixed states, as opposed to the pure states on which the proof above was based.
Another useful question would be whether it is possible to obtain an approximate clone
of a system, with good enough similarity. Investigation of these questions has shown
that it is impossible to clone any quantum system without a finite amount of information
loss about the system to be cloned [21].
This is a counter-intuitive result, since we are used to cloning classical information very
often in daily life. In some sense it is comforting to know that cloning quantum systems is
not possible, since otherwise one could store an unboundedly large amount of information
in a qubit and be able to transmit all that information to another party.
In many sources, the no-cloning theorem is presented as a feature of quantum correlations and not of classical correlations [6, 29]. This is misleading, however, since a probabilistic
system in the classical world is also not cloneable. Consider the classical, probabilistic two-state
system:

$$CP = p \cdot S_1 + (1 - p) \cdot S_2, \tag{4.8}$$

where $S_1$ and $S_2$ are the two possible states, and $p$ is the probability that the system is
in state $S_1$. Since we do not know the value of $p$, this system $CP$ is not cloneable either
[32]. In this classical setting, however, the system does not collapse to one of the possible
outcomes after a measurement, and thus can be measured repeatedly to learn the state
of the system to arbitrary precision. Thus the no-cloning theorem is not specific to the
quantum world. However, being able to clone would let us treat quantum
systems as classical systems, since we could clone the quantum system many times and do
repeated measurements on the copies, as if it were a classical system that does not collapse to
a result.
In later chapters we will see how the no-cloning theorem prevents communication faster
than the speed of light, and how it can also be observed in super-quantum systems.
4.3 Some applications of quantum nonlocality
So far we have seen that although qubits are very different than bits in terms of their
structure, the amount of classical information transmitted by sending a single qubit is not
different than that of a bit. In addition, the no-cloning theorem is not necessarily unique to
non-classical systems. To see the crucial difference between quantum and classical systems,
we need to look at some clever computational and communicational algorithms that exploit
the superposition and interference properties of quantum systems to accomplish tasks that
would not be possible with only classical bits. Before we investigate these algorithms, we
need to define some of the tools in quantum computation.
4.3.1 Quantum gates and operations
One way to represent quantum algorithms is to represent them as circuits consisting of
gates. Recall that in digital logic, we have the following gates:
$\mathrm{NOT}(a) = a \oplus 1$,
$\mathrm{AND}(a, b) = a \cdot b$,
$\mathrm{OR}(a, b) = a \oplus b \oplus a \cdot b$,
where $a$ and $b$ are classical bits. Similarly, in quantum computation there are several gates.
As mentioned before as a postulate of QM, the evolution of a quantum system is described
by a unitary transformation. It is straightforward to show that all of the quantum gates
below are unitary. Below we list some of them that we will use later in the chapter:

$$\mathrm{IDENTITY}|a\rangle = I|a\rangle = |a\rangle, \quad \mathrm{NOT}|0\rangle = |1\rangle, \quad \mathrm{NOT}|1\rangle = |0\rangle, \tag{4.9}$$
$$\mathrm{SWAP}(\alpha|a\rangle + \beta|b\rangle) = \beta|a\rangle + \alpha|b\rangle, \tag{4.10}$$
$$\mathrm{PAULI\text{-}Z}(\alpha|a\rangle + \beta|b\rangle) = Z(\alpha|a\rangle + \beta|b\rangle) = \alpha|a\rangle - \beta|b\rangle, \tag{4.11}$$
$$\mathrm{HADAMARD}(\alpha|a\rangle + \beta|b\rangle) = H(\alpha|a\rangle + \beta|b\rangle) = \alpha\frac{|a\rangle + |b\rangle}{\sqrt{2}} + \beta\frac{|a\rangle - |b\rangle}{\sqrt{2}}. \tag{4.12}$$

The quantum gates listed above are for single qubits. In digital logic, there is only
one nontrivial gate that can be applied to a single bit, which is the NOT gate shown above. In
quantum computation, there are several single-qubit gates, some of which are listed above.
Of the multiple-qubit gates, one is very important and we will use it in a following section.
It is called the CNOT gate, and any other multiple-qubit gate can be built from it together with single-qubit gates [21].
We present it below:
$$\mathrm{ControlledNOT}(|\phi\rangle, |\chi\rangle) = \mathrm{CNOT}(|\phi\rangle, |\chi\rangle) = (|\phi\rangle, |\chi \oplus \phi\rangle). \tag{4.13}$$

Notice that unlike the single-qubit quantum gates, the CNOT gate acts on two qubits.
Thus $|\phi\rangle$ and $|\chi\rangle$ both have two coefficients associated with them. More explicitly, $|\phi\rangle = a_1|0\rangle + b_1|1\rangle$ and $|\chi\rangle = a_2|0\rangle + b_2|1\rangle$ for complex $a_1$, $a_2$, $b_1$ and $b_2$. Looking at equation
(4.13), we see that the first qubit is not affected by the gate, and comes out as it came in.
The second qubit, however, is inverted if the first qubit is $|1\rangle$, and stays unaffected if
the first qubit is $|0\rangle$. Below we present all the possibilities:

$$\mathrm{CNOT}(|0\rangle|0\rangle) = |0\rangle|0\rangle,$$
$$\mathrm{CNOT}(|0\rangle|1\rangle) = |0\rangle|1\rangle,$$
$$\mathrm{CNOT}(|1\rangle|0\rangle) = |1\rangle|1\rangle,$$
$$\mathrm{CNOT}(|1\rangle|1\rangle) = |1\rangle|0\rangle.$$

Thus, given that $|\phi\rangle = a_1|0\rangle + b_1|1\rangle$, there is a $|a_1|^2$ chance that $|\chi\rangle$ will not be inverted,
and a $|b_1|^2$ chance that $|\chi\rangle$ will be inverted. This is because there is a $|a_1|^2$ chance
that $|\phi\rangle$ is $|0\rangle$, and a $|b_1|^2$ chance that $|\phi\rangle$ is $|1\rangle$. Notice that a measurement on $|\phi\rangle$ is not
applied during the execution of the gate, thus $|\chi\rangle$ is inverted and not inverted at the same
time, in superposition. We know that this gate is possible because the operation is unitary,
and it has been implemented physically in many labs [21].
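The gates of equations (4.9) to (4.13) can be written as matrices and their unitarity checked directly. The sketch below is one possible representation, assuming the basis order ($|0\rangle$, $|1\rangle$) for single qubits and $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ for two qubits; the helper names `apply` and `is_unitary` are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
# Single-qubit gates in the basis (|0>, |1>); names follow equations (4.9)-(4.12).
GATES = {
    "IDENTITY": [[1, 0], [0, 1]],
    "SWAP": [[0, 1], [1, 0]],      # exchanges the two amplitudes, same matrix as NOT
    "PAULI-Z": [[1, 0], [0, -1]],
    "HADAMARD": [[S, S], [S, -S]],
}
# CNOT in the basis |00>, |01>, |10>, |11>, with the first qubit as control.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

def apply(gate, state):
    """Matrix-vector product: the gate acting on a list of amplitudes."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]

def is_unitary(gate, tol=1e-12):
    """Check U^dagger U = I entry by entry (valid for complex entries too)."""
    n = len(gate)
    for r in range(n):
        for c in range(n):
            entry = sum(complex(gate[k][r]).conjugate() * gate[k][c] for k in range(n))
            if abs(entry - (1 if r == c else 0)) > tol:
                return False
    return True

all_unitary = all(is_unitary(g) for g in list(GATES.values()) + [CNOT])

alpha, beta = 0.6, 0.8j
z_out = apply(GATES["PAULI-Z"], [alpha, beta])   # amplitudes (alpha, -beta)
cnot_out = apply(CNOT, [0, 0, 1, 0])             # |10> is mapped to |11>
```

Note that the single-qubit SWAP of equation (4.10) and the NOT gate of equation (4.9) share the same matrix in this representation.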
Now that we have defined some of the tools for constructing quantum algorithms, we will
investigate some well-known procedures that bring out the power of quantum correlations.
First, we consider superdense coding.
4.3.2 Superdense coding
As mentioned before, Holevo's bound states that only one bit of classical information can
be communicated between two parties by sending one qubit. The protocol of superdense
coding allows the transmission of two classical bits of information by sending only one
qubit, provided that the two parties share entangled particles beforehand [8].
Consider this scenario, where Alice and Bob share a maximally entangled quantum box,
by which we mean that Alice and Bob each have one qubit of an entangled pair. Alice wants to send two classical bits to Bob by sending her qubit to him. Equation
(3.25) describes one bipartite correlation box that is a maximally entangled quantum box.
For this algorithm, we use another one that is equally entangled, which we will denote by

$$|EPR_2\rangle = \frac{|\downarrow_z\rangle_A |\downarrow_z\rangle_B + |\uparrow_z\rangle_A |\uparrow_z\rangle_B}{\sqrt{2}}. \tag{4.14}$$

The $|EPR_k\rangle$ are states with CHSH values equal to $2\sqrt{2}$. They are the most nonlocal
states that can be observed in our universe ($\mathcal{Q}$), named after Einstein, Podolsky and Rosen,
who were amongst the first people to appreciate the mystery of these states. Now we outline the
protocol:
• If Alice wants to send the bits 00, she does nothing to her qubit and sends it to Bob.
• If Alice wants to send the bits 01, she applies the quantum gate PAULI-Z (Z) to
her qubit, and then sends it to Bob.
• If Alice wants to send the bits 10, she applies the quantum gate SWAP to her qubit,
and then sends it to Bob.
• If Alice wants to send the bits 11, she applies the quantum gates Z and SWAP to
her qubit, and then sends it to Bob.
Then Bob can do a measurement on the two qubits he has, which are entangled. We call
the state of the entangled pair after Alice applied her gates $|EPR^*\rangle$. There are four
possible outcomes for $|EPR^*\rangle$, depending on what Alice wants to send to Bob. These
four possibilities form an orthonormal basis for the two-qubit state space. Thus Bob can
successfully determine which state $|EPR^*\rangle$ is in by measuring in that basis. Below we give an example of the procedure, which will hopefully make the idea
clearer.
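Assume Alice wants to send the bits 01. Then she applies the Z gate to her qubit. This
is equivalent to saying that $Z \otimes I$ is applied to $|EPR_2\rangle$. As mentioned before, $(Z \otimes I)(|v\rangle \otimes |w\rangle) = Z|v\rangle \otimes I|w\rangle$. Recall that originally Alice's qubit was $(|\downarrow_z\rangle_A + |\uparrow_z\rangle_A)/\sqrt{2}$. After she
applies the Z gate, it becomes $(|\downarrow_z\rangle_A - |\uparrow_z\rangle_A)/\sqrt{2}$. Then $|EPR^*\rangle$ in this case is:

$$|EPR^*\rangle = \frac{|\downarrow_z\rangle_A |\downarrow_z\rangle_B - |\uparrow_z\rangle_A |\uparrow_z\rangle_B}{\sqrt{2}}. \tag{4.15}$$

Since this is one of the orthogonal basis vectors of Bob's measurement device, he can detect
this state accurately and conclude that Alice sent him the bits 01. Below we list all the
$|EPR^*\rangle_{mn}$, where $mn$ denotes the classical bits that Alice wants to send to Bob.

$$|EPR^*\rangle_{00} = \frac{|\downarrow_z\rangle_A |\downarrow_z\rangle_B + |\uparrow_z\rangle_A |\uparrow_z\rangle_B}{\sqrt{2}}. \tag{4.16}$$
$$|EPR^*\rangle_{01} = \frac{|\downarrow_z\rangle_A |\downarrow_z\rangle_B - |\uparrow_z\rangle_A |\uparrow_z\rangle_B}{\sqrt{2}}. \tag{4.17}$$
$$|EPR^*\rangle_{10} = \frac{|\downarrow_z\rangle_A |\uparrow_z\rangle_B + |\uparrow_z\rangle_A |\downarrow_z\rangle_B}{\sqrt{2}}. \tag{4.18}$$
$$|EPR^*\rangle_{11} = \frac{|\downarrow_z\rangle_A |\uparrow_z\rangle_B - |\uparrow_z\rangle_A |\downarrow_z\rangle_B}{\sqrt{2}}. \tag{4.19}$$

As mentioned before, the $|EPR^*\rangle_{mn}$ with $mn \in \{00, 01, 10, 11\}$ form an orthonormal basis.
These four states are also known as the Bell states or EPR pairs [21]. They all have the
highest CHSH value observable in the universe in a two-qubit state space. They also play
a very important role in teleportation, which we will discuss in the next section.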
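The protocol can be traced with a short simulation. The sketch below assumes the shared state $(|00\rangle + |11\rangle)/\sqrt{2}$ with $|0\rangle$ standing for spin down and $|1\rangle$ for spin up, orders the amplitudes with Alice's qubit first, and, for the bits 11, applies SWAP before Z (the other order differs only by a global sign). Bob's measurement is modeled as picking the codeword with unit overlap. All names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]
SWAP = [[0, 1], [1, 0]]  # the text's single-qubit SWAP gate, equation (4.10)

def matmul(a, b):
    """Product of two 2x2 matrices: b is applied first, then a."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def on_alice(gate, state):
    """Apply (gate tensor I) to amplitudes ordered |00>, |01>, |10>, |11> (Alice first)."""
    out = [0.0] * 4
    for a_in in range(2):
        for b in range(2):
            for a_out in range(2):
                out[2 * a_out + b] += gate[a_out][a_in] * state[2 * a_in + b]
    return out

def overlap(u, v):
    """Inner product for real amplitude vectors."""
    return sum(x * y for x, y in zip(u, v))

epr2 = [S, 0.0, 0.0, S]  # (|00> + |11>)/sqrt(2)
ENCODERS = {"00": I2, "01": Z, "10": SWAP, "11": matmul(Z, SWAP)}
codewords = {bits: on_alice(g, epr2) for bits, g in ENCODERS.items()}

def decode(state):
    """Bob's Bell-basis measurement: pick the codeword the state coincides with."""
    return max(codewords, key=lambda bits: overlap(codewords[bits], state) ** 2)
```

Because the four codewords are mutually orthogonal, `decode` recovers both classical bits from the pair of qubits Bob holds after receiving Alice's qubit.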
Notice that superdense coding does not violate Holevo's bound, since it involves
the two parties sharing an entangled state in addition to the communication of a qubit. It
is a powerful result, and it was one of the first indications that the nonlocal correlations in
$\mathcal{Q}$ lead to algorithms that are unintuitive.
It should be noted that a pair of photons can carry more than one qubit. Two photons
can be hyper-entangled [5], where they are entangled in various degrees of freedom. These
degrees of freedom can include spin, frequency, time, etc. By creating photons that are
hyper-entangled with each other, it is possible to encode up to seven qubits in a pair of
photons. Then the protocol of superdense coding can be used to send more than two
classical bits with only one pair of photons. Note that this still does not violate Holevo's
bound, as although two photons are sent, they are equivalent to more than one qubit from
an information-theoretic perspective.
Next we discuss teleportation, which in some sense is the exact opposite of superdense
coding in functionality.
4.3.3 Teleportation
The superdense coding procedure allows us to send two classical bits by sending one qubit
and using a shared entangled two-qubit system. Teleportation allows us to "teleport" a
qubit by sending two classical bits and using a shared entangled two-qubit system [9].
By teleportation, we mean that the qubit that needs to be transmitted is completely
reconstructed on the receiving end. The qubit itself is not sent, as the only shared channels
of communication in this algorithm are classical. However, the qubit to be sent appears
on the receiving end, and disappears from the source, after this protocol is executed.
Below we explain the protocol.
Alice and Bob share a maximally entangled two-qubit system. They each have one of
the qubits of the state $|EPR_2\rangle$ as defined in equation (4.14). Alice has a qubit $|q\rangle$ at her
disposal, where $|q\rangle = q_0|0\rangle + q_1|1\rangle$ with complex constants $q_0$ and $q_1$. She wants to send $|q\rangle$
to Bob. The problem is, she does not have any information about the state of $|q\rangle$, by which
we mean that she does not know $q_0$ and $q_1$. To make matters worse, she does not have
a quantum communication channel between her and Bob, thus she cannot send qubits to
him like she could in the previous scenario with superdense coding. As mentioned before,
it takes an infinite amount of information to specify $|q\rangle$, so it would take her an infinite
amount of time to send the description of $|q\rangle$ to Bob using classical communication channels,
even if she had the information.
It turns out that Alice and Bob can transmit the information of the qubit using the
protocol called "quantum teleportation" [9]. First we outline the algorithm, and then we
will explain it in detail. Alice does a measurement on $|q\rangle$ and her qubit of the entangled
pair $|EPR_2\rangle$. Then the entanglement of Alice's and Bob's qubits in $|EPR_2\rangle$ is lost, but this
measurement entangles $|q\rangle$ with Bob's qubit of $|EPR_2\rangle$. Alice sends two bits of classical
information to Bob, indicating to Bob what gates he needs to apply to his own qubit
to make his qubit equal to $|q\rangle$. Thus, using the two bits of classical information he receives
from Alice, he can obtain $|q\rangle$. Below we explain it in detail.
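The outline above can be followed step by step in a small state-vector simulation. The sketch below is one possible rendering under stated assumptions: the three qubits are ordered (q, A, B), the shared pair is $(|00\rangle + |11\rangle)/\sqrt{2}$, Alice applies a CNOT (q controlling A) and then a Hadamard on q before measuring, and Bob corrects by exchanging his amplitudes when the second bit is 1 and then applying Z when the first bit is 1. The function name and the test amplitudes are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

def teleport_branches(q0, q1):
    """Return {(m_q, m_A): Bob's corrected qubit} for all four measurement results."""
    # Initial state |q> (x) (|00> + |11>)/sqrt(2); index = 4*q + 2*A + B.
    state = [0j] * 8
    for q, amp in ((0, q0), (1, q1)):
        for ab in (0b00, 0b11):
            state[4 * q + ab] = amp * S

    # CNOT with q as control and A as target: flip A wherever q = 1.
    after_cnot = [0j] * 8
    for idx, amp in enumerate(state):
        q, a, b = idx >> 2, (idx >> 1) & 1, idx & 1
        after_cnot[4 * q + 2 * (a ^ q) + b] += amp

    # Hadamard on q: |0> -> (|0> + |1>)/sqrt(2), |1> -> (|0> - |1>)/sqrt(2).
    after_h = [0j] * 8
    for idx, amp in enumerate(after_cnot):
        q, rest = idx >> 2, idx & 0b11
        after_h[rest] += S * amp
        after_h[4 + rest] += (-S if q else S) * amp

    branches = {}
    for m_q in (0, 1):
        for m_a in (0, 1):
            # Each outcome (m_q, m_a) has probability 1/4; renormalize Bob's qubit.
            base = 4 * m_q + 2 * m_a
            b0, b1 = 2 * after_h[base], 2 * after_h[base + 1]
            if m_a:   # second bit set: exchange the amplitudes (the text's SWAP)
                b0, b1 = b1, b0
            if m_q:   # first bit set: apply Z
                b1 = -b1
            branches[(m_q, m_a)] = (b0, b1)
    return branches

branches = teleport_branches(0.6, 0.8j)
```

In every one of the four branches Bob ends up with the amplitudes (q0, q1), although only two classical bits were sent.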
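The initial state of the whole system is $|q\rangle |EPR_2\rangle = (q_0|0\rangle_q + q_1|1\rangle_q) \left( \frac{|0\rangle_A |0\rangle_B + |1\rangle_A |1\rangle_B}{\sqrt{2}} \right)$,
where we named up-spin in the z direction $|1\rangle$ and down-spin in the z direction $|0\rangle$. We use the
subscript q to denote the qubit q, subscript A to denote Alice's qubit in the entangled
pair, and subscript B to denote Bob's qubit in the entangled pair. The state of the whole
system can be written as:

$$|q\rangle |EPR_2\rangle = \frac{1}{\sqrt{2}} \left[ q_0|0\rangle_q (|00\rangle_{AB} + |11\rangle_{AB}) + q_1|1\rangle_q (|00\rangle_{AB} + |11\rangle_{AB}) \right]. \tag{4.20}$$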
Alice wants to operate on the two qubits that are at her location (q and A) so that the
entanglement is swapped from qubits A and B to qubits q and B. To achieve this, Alice
applies CNOT and H to her two qubits. After the CNOT gate (equation (4.13)), the
state becomes:

$$(\mathrm{CNOT}_{qA} \otimes I_B) \left[ |q\rangle \otimes |EPR_2\rangle \right] = (\mathrm{CNOT}_{qA} \otimes I_B) \frac{1}{\sqrt{2}} \left[ q_0|0\rangle_q (|00\rangle_{AB} + |11\rangle_{AB}) + q_1|1\rangle_q (|00\rangle_{AB} + |11\rangle_{AB}) \right]$$
$$= \frac{1}{\sqrt{2}} \left[ q_0|0\rangle_q (|00\rangle_{AB} + |11\rangle_{AB}) + q_1|1\rangle_q (|10\rangle_{AB} + |01\rangle_{AB}) \right].$$

Notice that Alice's qubit flipped in the second term because of the CNOT gate. Then
Alice applies the Hadamard gate (H, equation (4.12)) only to the qubit to be transferred
($|q\rangle$).
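$$(H_q \otimes I_A \otimes I_B)(\mathrm{CNOT}_{qA} \otimes I_B) \left[ |q\rangle \otimes |EPR_2\rangle \right] = (H_q \otimes I_A \otimes I_B) \frac{1}{\sqrt{2}} \left[ q_0|0\rangle_q (|00\rangle_{AB} + |11\rangle_{AB}) + q_1|1\rangle_q (|10\rangle_{AB} + |01\rangle_{AB}) \right]$$
$$= \frac{1}{2} \left[ q_0 (|0\rangle_q + |1\rangle_q)(|00\rangle_{AB} + |11\rangle_{AB}) + q_1 (|0\rangle_q - |1\rangle_q)(|10\rangle_{AB} + |01\rangle_{AB}) \right],$$

which can also be written as:

$$\frac{1}{2} \left[ |00\rangle_{qA} (q_0|0\rangle_B + q_1|1\rangle_B) + |01\rangle_{qA} (q_0|1\rangle_B + q_1|0\rangle_B) + |10\rangle_{qA} (q_0|0\rangle_B - q_1|1\rangle_B) + |11\rangle_{qA} (q_0|1\rangle_B - q_1|0\rangle_B) \right]$$

by regrouping the tensor products. At this point, Alice does a measurement
on the two qubits that she has (q and A). Then she sends the results to Bob as two classical
bits. Depending on the bits Bob receives, he will know what gates to apply to his qubit B
to complete the teleportation. A breakdown of the cases will make it clear.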
• If Alice measures both of the qubits to be $|0\rangle$:
Then Bob's qubit collapses to the $q_0|0\rangle_B + q_1|1\rangle_B$ state. This is the state that q was
originally in. Then Bob will receive 00 from Alice, and he will know that the qubit he
has is precisely the qubit Alice wanted to send to him. The teleportation is complete.
• If Alice measures q and A to be $|0\rangle$ and $|1\rangle$ respectively:
Then Bob's qubit collapses to the $q_0|1\rangle_B + q_1|0\rangle_B$ state. Alice will send 01 to
Bob. He will know to apply the SWAP gate (equation (4.10)) to his state. Since
$\mathrm{SWAP}[q_0|1\rangle_B + q_1|0\rangle_B] = q_0|0\rangle_B + q_1|1\rangle_B = |q\rangle$, the teleportation is complete.
• If Alice measures q and A to be $|1\rangle$ and $|0\rangle$ respectively:
Then Bob's qubit collapses to the $q_0|0\rangle_B - q_1|1\rangle_B$ state. Alice will send 10 to Bob.
He will know to apply the Z gate (equation (4.11)) to his state. Since $Z[q_0|0\rangle_B - q_1|1\rangle_B] = q_0|0\rangle_B + q_1|1\rangle_B = |q\rangle$, the teleportation is complete.
• If Alice measures both of the qubits to be $|1\rangle$:
Then Bob's qubit collapses to the $q_0|1\rangle_B - q_1|0\rangle_B$ state. Alice will send 11 to Bob. He
will know to apply the SWAP gate and then the Z gate to his state. Since $Z[\mathrm{SWAP}[q_0|1\rangle_B - q_1|0\rangle_B]] = q_0|0\rangle_B + q_1|1\rangle_B = |q\rangle$, the teleportation is complete.
4.3.4 Entanglement swapping
Entanglement can be realized by local means, either by having two particles emerge from the same source or by having two particles interact with each other. Another source of entanglement is what is called entanglement swapping, where the entanglement between two particles is passed on to a different pair of particles. This third realization of entanglement is not by local means, and it can also be referred to as "teleportation of entanglement" [11]. We will see shortly that this is a very appropriate name.
Entanglement swapping is basically two particles becoming entangled while they are separated, thus it is a nonlocal effect. For example, assume Alice has a particle A that is entangled with a particle $B_1$ that belongs to Bob. In addition, imagine Bob has another particle $B_2$ that is entangled with another particle C that belongs to Carol. Assume both of the entangled pairs are in the $|EPR_2\rangle$ state (equation (4.14)). Then the whole system is in state:
\[
\begin{aligned}
|EPR_2\rangle_{AB_1} \otimes |EPR_2\rangle_{B_2C}
&= \frac{|0\rangle_A|0\rangle_{B_1} + |1\rangle_A|1\rangle_{B_1}}{\sqrt{2}} \otimes \frac{|0\rangle_{B_2}|0\rangle_C + |1\rangle_{B_2}|1\rangle_C}{\sqrt{2}} \\
&= \tfrac{1}{2}\big[\,|0000\rangle + |0011\rangle + |1100\rangle + |1111\rangle\,\big],
\end{aligned}
\]
where all the kets in the last line are ordered as $A B_1 B_2 C$ (the subscripts are omitted).
Now, if Bob does a measurement on the two qubits that he has (B1 and B2), the general
state collapses into one of the following:
Thus although Alice's and Carol's qubits have never interacted, they have become entangled.
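To illustrate the claim, the following plain-Python sketch projects Bob's two qubits onto each of the four Bell states and inspects the state left on A and C. It assumes that Bob's measurement is a Bell-basis measurement, which is not stated explicitly above; the helper names `project` and `is_entangled` are illustrative and do not come from the original text.

```python
import math

S = 1 / math.sqrt(2)

# Joint state of qubits (A, B1, B2, C): |EPR2>_{A B1} (x) |EPR2>_{B2 C}.
state = {(0, 0, 0, 0): 0.5, (0, 0, 1, 1): 0.5, (1, 1, 0, 0): 0.5, (1, 1, 1, 1): 0.5}

# Bell basis for Bob's pair (B1, B2), written as amplitude dictionaries.
BELL = {
    "phi+": {(0, 0): S, (1, 1): S},
    "phi-": {(0, 0): S, (1, 1): -S},
    "psi+": {(0, 1): S, (1, 0): S},
    "psi-": {(0, 1): S, (1, 0): -S},
}

def project(bell):
    """Project (B1, B2) onto a Bell state; return (probability, normalised A-C state)."""
    ac = {}
    for (a, b1, b2, c), amp in state.items():
        coeff = bell.get((b1, b2), 0.0)  # amplitudes are real, so no conjugation is needed
        if coeff:
            ac[(a, c)] = ac.get((a, c), 0.0) + coeff * amp
    prob = sum(v * v for v in ac.values())
    norm = math.sqrt(prob)
    return prob, {k: v / norm for k, v in ac.items()}

def is_entangled(ac):
    """A two-qubit pure state is a product state iff its 2x2 amplitude matrix has zero determinant."""
    det = ac.get((0, 0), 0) * ac.get((1, 1), 0) - ac.get((0, 1), 0) * ac.get((1, 0), 0)
    return abs(det) > 1e-12

outcomes = {name: project(bell) for name, bell in BELL.items()}

if __name__ == "__main__":
    for name, (prob, ac) in outcomes.items():
        print(name, round(prob, 3), ac, is_entangled(ac))
```

Under this assumption each Bell outcome occurs with probability 1/4 and leaves A and C in a maximally entangled state. A measurement of $B_1$ and $B_2$ in the computational basis would instead leave A and C in a product state.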
Another way to achieve entanglement swapping is to teleport half of an entangled state. Assume Alice and Bob share an entangled state, and similarly Bob and Carol share an entangled state. If Bob teleports his part of the system he shares with Alice to Carol, by using the entangled state he shares with Carol and the teleportation protocol described in the previous section, then Carol's new qubit is Bob's initial qubit he shared with Alice. Thus now Carol shares an entangled state with Alice. This protocol is identical to the protocol described above. Thus the name "teleportation of entanglement" is very suitable.
Since entanglement is a fundamental resource in quantum computation and information [21], being able to swap it between parties is very useful. Entanglement swapping can be used for several applications in quantum computation and cryptography [11].
5
Algebra of correlations
The idea of investigating closed sets of correlations has been recently introduced [1]. Correlations are "added" to each other by "wiring" the boxes together. Wiring can refer to different kinds of classical logical operations applied to inputs and outputs of boxes, in order to combine boxes to produce a new box.
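[Figure 5 diagram: the inputs x and y enter the first box as $x_1(x)$ and $y_1(y)$, giving outputs $a_1$ and $b_1$; the second box receives $x_2(x, a_1)$ and $y_2(y, b_1)$, giving outputs $a_2$ and $b_2$; the outputs of the new box are $a(x, a_1, a_2)$ and $b(y, b_1, b_2)$.]
Figure 5: A schematic representation of two different boxes wired to each other to produce a new box [1].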
As seen in Figure 5, boolean operations can be applied to inputs and outputs of boxes to be fed into the other boxes. This formalism lets us investigate correlations from a different perspective. Through this process, it is checked whether a certain set of correlations is closed or not. A set of boxes A is defined to be closed under an operation W if all boxes obtainable by operating on A with W are also contained in A.
A set of correlations for which a physical principle holds must be a closed set. For example, $\mathcal{L}$, the local polytope, must be a closed set. This is because in the classical world of joint probabilities, we cannot combine local correlations to produce nonlocal ones. Thus it is physically required that $\mathcal{L}$ is a closed set. Similarly, $\mathcal{Q}$, the set of quantum correlations, must be a closed set as well. Otherwise we would be able to wire quantum systems together to get correlations that are stronger than QM allows. The same holds for the causal polytope $\mathcal{C}$.
Along similar lines, different information-theoretic principles should correspond to different closed sets. For example, the set of correlations that do not trivialize communication complexity (Section 6.2.5) must be a closed set for similar reasons.
This idea might prove very useful, as being able to classify different closed sets of
correlations would allow us to identify information-theoretic principles that are genuinely
different. Moreover, it would allow us to identify physical theories that could exist other
than QM.
5.1
Algebra of wirings
A box is defined by the joint probabilities its dynamics present (Section 3.1). Then a box is basically a tensor that contains the information about the joint probabilities. Thus a box, denoted by $B(ij|xy)$, must have the information about 16 joint probabilities: $P_{ij|xy}$ for $ijxy \in \{0, 1\}^4$.
An example is a nonlocal vertex, $B(ij|xy)^{NL}$, whose probability tensor is defined in the equation (3.20).
If $N$ boxes are going to be wired, we have $B(ij|xy)_k$ for $1 \le k \le N$, each with inputs $i_k$ and $j_k$, and outputs $x_k$ and $y_k$. The $k$th inputs and outputs are functions of the $k - 1$ inputs and outputs before them, respectively. In other words:
\[ \ldots \quad \text{for } 1 \le k \le N, \tag{5.1} \]
and $i$ and $j$ are the inputs to the newly produced box, and $x$ and $y$ are its outputs.
To make this concept more concrete, consider the AND wirings of outputs of $N$ identical boxes. In this example we set all $i_k$ and $j_k$ equal to $i$ and $j$, respectively. Thus the inputs to the wired new box are also the inputs to all the boxes inside that are being wired. The outputs are the logical AND operation applied to all $N$ output bits. In other words:
\[
\begin{aligned}
i_k &= i, \quad j_k = j \quad \text{for } 1 \le k \le N, \\
x &= \mathrm{AND}(x_1, x_2, \ldots, x_N) = \prod_{k=1}^{N} x_k, \\
y &= \mathrm{AND}(y_1, y_2, \ldots, y_N) = \prod_{k=1}^{N} y_k.
\end{aligned}
\]
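We are given $N$ identical boxes $B$. Assume their probability tensor is $P_{ij|xy}$. We apply the AND wiring procedure, and produce a new box $B'$ with the probability tensor $P'_{ij|xy}$. To solve for $P'_{ij|xy}$, first consider what happens to $P'_{ij|11}$. The $x$ of $B'$ will be one only if all $x_k = 1$. The same holds for $y$. Then we can write:
\[ P'_{ij|11} = \left(P_{ij|11}\right)^N, \tag{5.2} \]
since the probability that both $x_k$ and $y_k$ equal 1 is $P_{ij|11}$, and all $N$ of them have to be 1 at the same time. Similarly, we compute $P'_{ij|10}$ as:
\[ P'_{ij|10} = \left(P_{ij|10} + P_{ij|11}\right)^N - \left(P_{ij|11}\right)^N, \tag{5.3} \]
since we need $x_k = 1$ for all $k$, and $y_k = 0$ for at least one of the $k$'s. Only then $xy = 10$. Similarly for $P'_{ij|01}$,
\[ P'_{ij|01} = \left(P_{ij|01} + P_{ij|11}\right)^N - \left(P_{ij|11}\right)^N. \tag{5.4} \]
Finally, by the equation (3.2),
\[ P'_{ij|00} = 1 - P'_{ij|11} - P'_{ij|10} - P'_{ij|01}. \tag{5.5} \]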
Noting that all joint probabilities are between 0 and 1, we see that equation (5.2) shows that $P'_{ij|11}$ will get very small for large $N$, if $P_{ij|11} \neq 1$. The same holds for equations (5.3) and (5.4): for large $N$, $P'_{ij|10}$ and $P'_{ij|01}$ will also get very small. Then only $P'_{ij|00}$ will remain, which will make correlations very strong. For this reason, this process is called "nonlocality distillation", because the nonlocality of the box can go as high as the algebraic maximum, depending on the boxes $B$ that are being wired.
AND wiring is a useful way of checking if a given set of correlations is closed. By applying this simple wiring procedure, we can check if any correlations outside of the initial set have been reached.
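The AND wiring of $N$ independent, identical boxes can be evaluated directly from the verbal argument above: $x = 1$ requires every $x_k = 1$, and likewise for $y$. The sketch below is an illustration in plain Python, with the function names `and_wiring` and `and_wiring_brute_force` chosen here for convenience. It computes one row of the wired box (a fixed input pair $ij$) in closed form and compares it with a brute-force enumeration over all output combinations.

```python
from itertools import product

def and_wiring(p, n):
    """Closed form for one input pair (i, j); p maps (x, y) -> probability."""
    p11 = p[(1, 1)] ** n                            # every box outputs x_k = y_k = 1
    p10 = (p[(1, 0)] + p[(1, 1)]) ** n - p11        # all x_k = 1, at least one y_k = 0
    p01 = (p[(0, 1)] + p[(1, 1)]) ** n - p11        # all y_k = 1, at least one x_k = 0
    p00 = 1.0 - p11 - p10 - p01                     # normalisation
    return {(0, 0): p00, (0, 1): p01, (1, 0): p10, (1, 1): p11}

def and_wiring_brute_force(p, n):
    """Enumerate all output combinations of n independent copies of the box."""
    out = {(x, y): 0.0 for x in (0, 1) for y in (0, 1)}
    for outcomes in product(p, repeat=n):
        weight = 1.0
        for o in outcomes:
            weight *= p[o]
        x = int(all(o[0] for o in outcomes))
        y = int(all(o[1] for o in outcomes))
        out[(x, y)] += weight
    return out

if __name__ == "__main__":
    row = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
    print(and_wiring(row, 3))
    print(and_wiring_brute_force(row, 3))
```

The brute-force version scales as $4^N$ and is only meant as a cross-check for small $N$.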
5.2
Irreversibility
When there is a closed set inside another closed set, irreversible transitions occur from the larger set to the smaller set. For example, consider the closed sets $\mathcal{L}$ and $\mathcal{Q}$. $R = \mathcal{Q} \setminus \mathcal{L}$ forms an island [1], in the sense that if a box is mapped out of $R$, it can never get mapped back into $R$ again. This is because $\mathcal{Q}$ is a closed set, so anything that is mapped out of $R$ must end up in $\mathcal{L}$. However $\mathcal{L}$ is closed, too. Thus the box can never be mapped back into $R$ again.
5.3
Nonlocality distillation
5.3.1
Nonlocality distillation 1 (using an NL and a local box)
As shown above, AND wiring can be used to distill some boxes to stronger nonlocality. There are other distillation procedures, each with a different set of boxes they can distill maximally. Below we outline a distillation procedure proposed in [13], which again wires two identical boxes to produce a more nonlocal box. The domain of this procedure is:
\[ B^{NL}_{PC}(\epsilon) = \epsilon\, B^{NL} + (1 - \epsilon)\, B^{PC}, \tag{5.6} \]
where $0 \le \epsilon \le 1$, $B^{NL}$ is defined as in the equation (3.20), and $B^{PC}$ is defined below. $B^{NL}_{PC}(\epsilon)$ denotes the set of boxes that is the domain of this protocol. Clearly $\epsilon$ must be between 0 and 1, otherwise some probabilities of the new box $B^{NL}_{PC}$ would be larger than 1. PC stands for perfectly correlated, because $B^{PC}$ denotes the probabilities:
\[
P_{ij|xy} =
\begin{cases}
\tfrac{1}{2} & \text{if } x = y, \\
0 & \text{if } x \neq y.
\end{cases}
\tag{5.7}
\]
Notice that the CHSH value of this PC box is 2, since $C(i, j) = 1$ for all inputs, because x and y are perfectly correlated. This is different from the NL boxes, since NL boxes are maximally nonlocal, with $C(0, 0) = C(0, 1) = C(1, 0) = -C(1, 1)$, which gives a CHSH value of 4.
Now we outline the wiring of the two boxes:
• $i_1 = i$, $i_2 = i \cdot x_1$,
• $x = x_1 \oplus x_2$,
• $j_1 = j$, $j_2 = j \cdot y_1$, and
• $y = y_1 \oplus y_2$.
This protocol distills (increases the CHSH value of) any $B^{NL}_{PC}(\epsilon)$ for $0 < \epsilon < 1$. Then, applying this procedure repeatedly, any such box can be distilled to maximum nonlocality.
This will be useful in Section 6.2.4.
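The effect of this wiring can be checked by direct enumeration. The sketch below is illustrative only. It assumes that the NL box is the vertex with $x \oplus y = i \cdot j$ and that the PC box outputs perfectly correlated, uniformly random bits; the function names are chosen here and do not come from [13]. Two copies of $\epsilon\,\mathrm{NL} + (1-\epsilon)\,\mathrm{PC}$ are wired as in the list above and the CHSH values before and after are compared.

```python
def nl_box(i, j, x, y):
    """Assumed NL vertex: outputs satisfy x XOR y = i*j, each with probability 1/2."""
    return 0.5 if (x ^ y) == (i & j) else 0.0

def pc_box(i, j, x, y):
    """Perfectly correlated box: x = y with probability 1/2 each, for all inputs."""
    return 0.5 if x == y else 0.0

def mix(eps):
    """Convex combination eps * NL + (1 - eps) * PC."""
    return lambda i, j, x, y: eps * nl_box(i, j, x, y) + (1 - eps) * pc_box(i, j, x, y)

def distill(box):
    """Wire two copies: i1 = i, i2 = i*x1, x = x1 XOR x2, and the same on Bob's side."""
    def wired(i, j, x, y):
        total = 0.0
        for x1 in (0, 1):
            for y1 in (0, 1):
                x2, y2 = x ^ x1, y ^ y1  # second-box outputs needed to produce (x, y)
                total += box(i, j, x1, y1) * box(i & x1, j & y1, x2, y2)
        return total
    return wired

def chsh(box):
    """CHSH = C(0,0) + C(0,1) + C(1,0) - C(1,1)."""
    def corr(i, j):
        return sum((1 if x == y else -1) * box(i, j, x, y)
                   for x in (0, 1) for y in (0, 1))
    return corr(0, 0) + corr(0, 1) + corr(1, 0) - corr(1, 1)

if __name__ == "__main__":
    for eps in (0.1, 0.3, 0.9):
        print(eps, chsh(mix(eps)), chsh(distill(mix(eps))))
```

Under these assumptions the CHSH value goes from $2 + 2\epsilon$ to $2 + 3\epsilon - \epsilon^2$, an increase of $\epsilon(1 - \epsilon)$, which is positive for $0 < \epsilon < 1$.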
5.3.2
Nonlocality distillation 2 (using an L and an NL box)
In the previous subsection, a nonlocality distillation protocol was presented, which used a nonlocal box and a local box (NL and PC boxes). Now we present another distillation protocol that uses a nonlocal vertex and a local vertex. Allcock et al. have introduced this distillation procedure [2]. It is very useful in showing that certain sets are not closed. We outline this procedure below (it is again for wiring two boxes):
• $x = x_1 \oplus x_2 \oplus 1$,
• $j_1 = j$, $j_2 = j \cdot y_1$, and
• $y = y_1 \oplus y_2 \oplus 1$.
This protocol can distill the boxes presented below to an algebraic maximum (to the NL boxes):
\[ B^{NL}_{L}(\epsilon) = \epsilon\, B^{NL} + (1 - \epsilon)\, B^{L}, \tag{5.8} \]
where the nonlocal vertex $B^{NL}$ and the local vertex $B^{L}$ are defined in the equations (3.19) and (3.20), and $B^{NL}_{L}(\epsilon)$ for $0 \le \epsilon \le 1$ denotes the set of boxes that is the domain of this protocol. Clearly $\epsilon$ must be between 0 and 1, otherwise some probabilities of the new box $B^{NL}_{L}$ would be larger than 1.
Note the similarity between $B^{NL}_{PC}(\epsilon)$ and $B^{NL}_{L}(\epsilon)$, defined in the equations (5.6) and (5.8) respectively. They are both in between two boxes, and their positions between those two boxes are determined by $\epsilon$.
$B^{NL}_{L}(\epsilon)$ has the CHSH value of $4\epsilon + 2(1 - \epsilon) = 2 + 2\epsilon$ before the protocol is applied, since $B^{NL}$ has a CHSH value of 4 and $B^{L}$ has a CHSH value of 2. After the protocol, $B^{NL}_{L}(\epsilon')$ is obtained, where $\epsilon' = 2\epsilon - \epsilon^2$. We see that $\epsilon' > \epsilon$ if $0 < \epsilon < 1$. Then the CHSH value of the new box is $2 + 2\epsilon' = 2 + 4\epsilon - 2\epsilon^2$, which is larger than the CHSH value of the original box. Thus nonlocality is distilled for arbitrary $0 < \epsilon < 1$.
5.4
Closure under wirings for polytopes
We have previously listed three closed polytopes of correlations: $\mathcal{L}$, $\mathcal{Q}$ and $\mathcal{C}$. Surprisingly, no other genuinely different closed polytopes are currently known. We will present failed attempts at finding closed polytopes, which give insight into the structure of $\mathcal{Q}$.
Figure 6: On the left is the polytope investigated in Section 5.4.1. On the right is the polytope investigated in Section 5.4.2. Figure originally from [1].
5.4.1
Limiting the CHSH value
A simple idea to construct a closed polytope is to limit the CHSH value, and thus impose artificial CHSH facets. This is basically what the local polytope is, but there the restriction is not artificial; it is the one that is imposed by classical logic at CHSH = 2.
We know that this idea does not work for CHSH $\le 2$, since within $\mathcal{L}$ there exist logically complete sets of operations that can take any deterministic truth function to any other truth function. As mentioned in Section 2.1, any point in a correlation polytope can be expressed as a convex sum of its vertices (which are deterministic truth functions). Thus, to go from any point in $\mathcal{L}$ to any other point in $\mathcal{L}$, we express the original point as a convex sum of the vertices, and then find the logical operations needed to get to the target point. Thus $\mathcal{L}$ is the smallest closed convex polytope. Then we try to find a larger convex polytope, smaller than $\mathcal{C}$.
Then we proceed to construct a polytope $S$ with $\mathrm{CHSH}_S < S$ for some $S$ between 2 and 4. This polytope has the trivial facets of $\mathcal{L}$, and the nontrivial facets of $\mathcal{L}$, but with a different CHSH value $S$. Then $S$ has the 16 local boxes as vertices (vertices of $\mathcal{L}$), and 64 nonlocal boxes that are convex combinations of the vertices of $\mathcal{L}$ and $\mathcal{C}$ as shown below:
(5.9)
where fJvrw-/31 E {O, 1}6 , and 0 = (ex EEl fJ) . h EEl v) EEl f3 EEl
(Y.
This 0 value corresponds to
the vertices of $\mathcal{L}$ on the CHSH facet below each NL box (end of Section 3.5). The $\mathrm{CHSH}_S$ value is $4\epsilon + 2(1 - \epsilon) = 2 + 2\epsilon$.
Note that the vertices of $S$ are precisely the kind of boxes the protocol in Section 5.3 can be used for. Then it follows that no $S$ with $2 < \mathrm{CHSH}_S < 4$ can be closed, since the nonlocality distillation protocol can be used on vertices of $S$ to create boxes with values larger than $\mathrm{CHSH}_S$. This is quite interesting. $\mathcal{L}$ is the only closed polytope that has CHSH facets as its nontrivial facets. This means that no closed polytopes other than $\mathcal{L}$ exist that are solely bounded by nonlocality.
5.4.2
Noisy NL boxes
Instead of putting a limit on the CHSH value, we define new vertices to construct a polytope smaller than $\mathcal{C}$. This is equivalent to adding white noise (noise with maximally mixed frequencies such that all outcomes are equally probable for all inputs) to the NL boxes, such that they are not as nonlocal. This defines a "noisy" causal polytope, $\mathcal{C}_{noisy}$, with nonlocal vertices moved inward compared to $\mathcal{C}$.
We define the new NL vertices as:
It has been shown that this polytope is not closed either, by solving for the facets using the LRS software [3]. It was shown that 64 boxes that are not in the original polytope can be reached using AND wirings. Something interesting that the authors have done was to add those 64 boxes to the original polytope, and then ask if the convex hull of these boxes is closed. They iterated this process with the additional boxes from each step. Their conjecture is that the closure of this polytope under wirings has a curved boundary.
5.4.3
The Uffink set
Uffink's set was introduced in 2002 by Jos Uffink [31]. It contains all boxes that satisfy
\[ \big(C(0, 0) + C(1, 0)\big)^2 + \big(C(0, 1) - C(1, 1)\big)^2 \le 4. \tag{5.10} \]
It has been shown recently that the inequality (5.10) is a bound relating to information causality [2]. Any box violating this inequality also violates information causality. Information causality states that communication of k classical bits should not cause a gain of more than k bits of information for the receiving party. This is quite exciting, and information causality promises to be a defining property of QM if it can be shown that $\mathcal{Q}$ is the largest set that satisfies this inequality, which would single out quantum correlations.
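As a small numerical illustration of inequality (5.10), the sketch below evaluates its left-hand side for boxes whose correlators satisfy $C(0,0) = C(0,1) = C(1,0) = -C(1,1) = \mathrm{CHSH}/4$. The helper names `uffink_lhs` and `isotropic` are introduced here for illustration only, and the restriction to this one-parameter family is an assumption made to keep the example short.

```python
import math

def uffink_lhs(c00, c01, c10, c11):
    """Left-hand side of (5.10): (C(0,0) + C(1,0))^2 + (C(0,1) - C(1,1))^2."""
    return (c00 + c10) ** 2 + (c01 - c11) ** 2

def isotropic(chsh_value):
    """Correlators (C(0,0), C(0,1), C(1,0), C(1,1)) of a noisy NL box with the given CHSH value."""
    c = chsh_value / 4
    return (c, c, c, -c)

def satisfies_uffink(correlators, tol=1e-9):
    """True if the correlators satisfy (5.10), up to a small numerical tolerance."""
    return uffink_lhs(*correlators) <= 4 + tol

if __name__ == "__main__":
    for value in (2.0, 2 * math.sqrt(2), 3.0, 4.0):
        print(value, uffink_lhs(*isotropic(value)), satisfies_uffink(isotropic(value)))
```

Within this family the left-hand side equals $\mathrm{CHSH}^2/2$, so the inequality is saturated exactly at $\mathrm{CHSH} = 2\sqrt{2}$ and violated by the maximally nonlocal box with $\mathrm{CHSH} = 4$.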
Using the ideas presented in this chapter, it has been shown that Uffink's set is not closed (again using the LRS software [3]). This implies that information causality is not violated by some correlations that are not in $\mathcal{Q}$. We would expect information causality to be preserved for a closed set of boxes. Then there must exist correlations that satisfy the inequality (5.10) but are not in that closed set. Thus there exist correlations stronger than quantum correlations that preserve information causality.
6
Information-theoretic axioms of QM
It has been argued that QM does not have an underlying physical law that explains the weird nature of its dynamics [23]. It has been contrasted with the theory of special relativity, which relies mainly on the principle of constant speed of light in every inertial reference frame. The axioms of QM are very abstract and mathematical, and do not offer physical insight into the unexpected behavior of quantum systems.
There have been disagreements about how important or useful it would be to have physical axioms that define QM, as opposed to the mathematical ones used today [29]. In this chapter we discuss some candidates for such axioms. The first one is related to nonlocality.
6.1
Quantum nonlocality as an axiom
Popescu and Rohrlich have investigated in several papers if causality and maximum nonlocality together could define QM [24]. As we have seen in the previous chapter, the answer turns out to be negative. However that formalism was not very popular at the time, and Popescu and Rohrlich brought it to the attention of many researchers. That series of papers started the research of correlations that are more nonlocal than QM. Below we summarize their argument, and follow with different candidates for axioms.
As shown before, Bell's inequality gives a bound on the strength of correlations between separated events.
Consider two parties, Alice and Bob, who are again space-like separated. They share a distributed system. Each party can independently perform one of two different measurements on his or her part of the system. Alice (Bob) can set her (his) measurement setting to 0 or 1. The result of a measurement she (he) makes is either 0 or 1. We define the result Alice (Bob) gets when she (he) sets her (his) measurement setting to $i$ ($j$) as $m_i^A$ ($m_j^B$). Similarly, the probability of Alice getting the result $x$ and Bob getting the result $y$ when they set the measurement settings to $i$ and $j$ respectively is given by:
\[ \mathrm{Prob}\left(m_i^A = x,\; m_j^B = y\right), \]
or shortly $P_{ij|xy}$.
For every possible set of inputs Alice and Bob can choose, the probabilities of getting distinct outputs must sum to 1. The following equality must hold:
\[ \sum_{x, y \in \{0, 1\}} P_{ij|xy} = 1 \quad \text{for } ij \in \{00, 01, 10, 11\}. \tag{6.1} \]
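The correlation between two measurements $m_i^A$ and $m_j^B$, $C(m_i^A, m_j^B)$, is defined as:
\[ C(m_i^A, m_j^B) = P_{ij|00} + P_{ij|11} - P_{ij|01} - P_{ij|10}. \tag{6.2} \]
Then, using the equation (6.1), the equation (6.2) can be rewritten as:
\[ C(m_i^A, m_j^B) = 2\left(P_{ij|00} + P_{ij|11}\right) - 1, \tag{6.3} \]
or similarly
\[ C(m_i^A, m_j^B) = 1 - 2\left(P_{ij|01} + P_{ij|10}\right). \tag{6.4} \]
To write the equations (6.3) and (6.4) more compactly, notice that $(P_{ij|00} + P_{ij|11}) = P(m_i^A \oplus m_j^B = 0)$ and $(P_{ij|01} + P_{ij|10}) = P(m_i^A \oplus m_j^B = 1)$, where $\oplus$ denotes addition modulo 2. Then:
\[ C(m_i^A, m_j^B) = 2P\left(m_i^A \oplus m_j^B = i \cdot j\right) - 1 \quad \text{for } ij \in \{00, 01, 10\}, \tag{6.5} \]
\[ C(m_i^A, m_j^B) = 1 - 2P\left(m_i^A \oplus m_j^B = i \cdot j\right) \quad \text{for } ij = 11. \tag{6.6} \]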
This gives us a different way of writing the CHSH inequalities. Recall that $-2 \le \mathrm{CHSH}_1 \le 2$, that is,
\[ -2 \le C(m_0^A, m_0^B) + C(m_0^A, m_1^B) + C(m_1^A, m_0^B) - C(m_1^A, m_1^B) \le 2. \tag{6.7} \]
Using the equations (6.5) and (6.6), we can rewrite the inequality (6.7) as:
\[
-2 \le 2P(m_0^A \oplus m_0^B = i \cdot j) + 2P(m_0^A \oplus m_1^B = i \cdot j) + 2P(m_1^A \oplus m_0^B = i \cdot j) + 2P(m_1^A \oplus m_1^B = i \cdot j) - 4 \le 2,
\]
which simplifies to
\[ 1 \le \sum_{i, j \in \{0, 1\}} P\left(m_i^A \oplus m_j^B = i \cdot j\right) \le 3. \tag{6.8} \]
It is convenient to write the CHSH inequality in this form for the rest of the chapter. We know that correlations in QM can violate this inequality; they satisfy instead
\[ -2\sqrt{2} \le \mathrm{CHSH}_{QM} \le 2\sqrt{2}, \tag{6.9} \]
which can also be written as, using a similar argument from above:
\[ 2 - \sqrt{2} \le \sum_{i, j \in \{0, 1\}} P\left(m_i^A \oplus m_j^B = i \cdot j\right) \le 2 + \sqrt{2}. \tag{6.10} \]
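The equivalence between the correlator form and the probability form of the CHSH expression can be verified numerically. The sketch below stores a box as a dictionary `p[(i, j, x, y)]` holding $P_{ij|xy}$ and checks that $\mathrm{CHSH} = 2\sum_{i,j} P(m_i^A \oplus m_j^B = i \cdot j) - 4$. The family `noisy_box(v)`, a mixture of the maximally nonlocal box with uniform noise, is a hypothetical test case introduced here and is not taken from the text.

```python
from itertools import product

BITS = (0, 1)

def correlation(p, i, j):
    """C = P_{ij|00} + P_{ij|11} - P_{ij|01} - P_{ij|10}."""
    return sum((1 if x == y else -1) * p[(i, j, x, y)] for x in BITS for y in BITS)

def chsh(p):
    """C(0,0) + C(0,1) + C(1,0) - C(1,1)."""
    return (correlation(p, 0, 0) + correlation(p, 0, 1)
            + correlation(p, 1, 0) - correlation(p, 1, 1))

def win_sum(p):
    """Sum over the four input pairs of P(x XOR y = i*j)."""
    return sum(p[(i, j, x, y)]
               for i, j, x, y in product(BITS, repeat=4)
               if (x ^ y) == (i & j))

def noisy_box(v):
    """Hypothetical test family: v * (maximally nonlocal box) + (1 - v) * (uniform noise)."""
    return {(i, j, x, y): v * (0.5 if (x ^ y) == (i & j) else 0.0) + (1 - v) * 0.25
            for i, j, x, y in product(BITS, repeat=4)}

if __name__ == "__main__":
    for v in (0.0, 0.5, 2 ** -0.5, 1.0):
        box = noisy_box(v)
        print(v, chsh(box), 2 * win_sum(box) - 4)
```

For this family $\mathrm{CHSH} = 4v$, so $v = 1/2$, $v = 1/\sqrt{2}$ and $v = 1$ sit at the upper ends of (6.8), (6.10) and the algebraic maximum respectively.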
It is interesting to ask why $\mathrm{CHSH}_{QM}$ is bounded by $2\sqrt{2}$ [24]. Algebraically, the expression in the inequality (6.7) can get as large as 4 in magnitude. One possible answer is to claim that if QM was more nonlocal (if $\mathrm{CHSH}_{QM}$ was larger than $2\sqrt{2}$), then superluminal signaling would be possible. If that were the case, then QM could be based on two physical axioms: causality (in the sense that special relativity is not violated) and maximal nonlocality. Unfortunately this turns out not to be the case, as there are simple correlations one can make up that are still causal and more nonlocal than QM. Consider the following set of correlations:
\[ P_{00|00} = P_{10|00} = P_{01|00} = P_{00|11} = P_{10|11} = P_{01|11} = P_{11|01} = P_{11|10} = \tfrac{1}{2}, \tag{6.11} \]
and
\[ P_{00|01} = P_{00|10} = P_{01|01} = P_{01|10} = P_{10|01} = P_{10|10} = P_{11|11} = P_{11|00} = 0. \tag{6.12} \]
The reader will notice that the set of correlations given by the equations (6.11) and (6.12) is one of the vertices of $\mathcal{C}$, $B(ij|xy)^{NL}_{000}$ more precisely (as defined by the equation (3.20)). Thus this set of correlations is causal, and has the largest CHSH value allowed by algebra, four. This shows that there are correlations that are more nonlocal than QM and yet still causal. Then what is special about the nonlocality limit $\mathrm{CHSH}_{QM}$ of QM?
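Both properties of the correlations (6.11) and (6.12), causality and $\mathrm{CHSH} = 4$, can be confirmed by a short calculation. In the sketch below, causality is read as the no-signaling condition: each party's marginal distribution must not depend on the other party's input. The names `PR_BOX`, `SIGNALING_BOX` and `is_no_signaling` are illustrative, and the signaling box is a made-up counterexample in which Alice's output simply copies Bob's input.

```python
from itertools import product

BITS = (0, 1)

# Equations (6.11) and (6.12): P_{ij|xy} = 1/2 when x XOR y = i*j, and 0 otherwise.
PR_BOX = {(i, j, x, y): 0.5 if (x ^ y) == (i & j) else 0.0
          for i, j, x, y in product(BITS, repeat=4)}

# Hypothetical counterexample: Alice outputs Bob's input, Bob always outputs 0.
SIGNALING_BOX = {(i, j, x, y): 1.0 if (x == j and y == 0) else 0.0
                 for i, j, x, y in product(BITS, repeat=4)}

def is_no_signaling(p, tol=1e-12):
    """Check that each party's marginal is independent of the other party's input."""
    for i, x in product(BITS, repeat=2):
        alice = [sum(p[(i, j, x, y)] for y in BITS) for j in BITS]
        if abs(alice[0] - alice[1]) > tol:
            return False
    for j, y in product(BITS, repeat=2):
        bob = [sum(p[(i, j, x, y)] for x in BITS) for i in BITS]
        if abs(bob[0] - bob[1]) > tol:
            return False
    return True

def chsh(p):
    """C(0,0) + C(0,1) + C(1,0) - C(1,1), with C = P(x = y) - P(x != y)."""
    def corr(i, j):
        return sum((1 if x == y else -1) * p[(i, j, x, y)] for x in BITS for y in BITS)
    return corr(0, 0) + corr(0, 1) + corr(1, 0) - corr(1, 1)

if __name__ == "__main__":
    print(is_no_signaling(PR_BOX), chsh(PR_BOX))
    print(is_no_signaling(SIGNALING_BOX))
```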
6.2
Nonlocality and communication complexity
To offer another possible explanation, Wim van Dam considered the following question: if the set of correlations given by the equations (6.11) and (6.12) were possible, how would it affect the communication complexity of different functions [33]? This turned out to be a good question to ask, and gave new insights into what is special about QM.
6.2.1
Communication complexity
Communication complexity of a function f(x,y) is determined by the number of bits of information that need to be communicated between Alice and Bob if only Alice has the information x and only Bob has the information y, and they want to compute f(x,y). It is sufficient for one of the parties to be able to compute f(x,y). This does not make a difference since we only consider decision problems, thus whichever party computes the function can just send one bit to the other party that represents the result. As an example, consider the following function:
\[
\mathrm{EvenOrOdd}(x, y) =
\begin{cases}
0 & \text{if } (x + y) \text{ is even,} \\
1 & \text{if } (x + y) \text{ is odd.}
\end{cases}
\tag{6.13}
\]
The function EvenOrOdd(x,y) has a communication complexity of 1. This is because Alice
can just send one bit to Bob, indicating if x is even or odd.
Then Bob can decide if
(x+y) is even or odd by considering x's parity he learned from Alice and y's parity that he
knows himself. Then we say that EvenOrOdd(x,y) has trivial communication complexity
as one bit is the minimum possible number of bits two parties have to communicate to do
distributed computation. On the other hand, the inner product function has maximum
possible communication complexity, which means that one of the parties has to send the
whole string of information to the other party before inner product can be calculated [32].
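The one-bit protocol for EvenOrOdd can be written out explicitly. In the sketch below the functions `alice_message` and `bob_decides` are illustrative names, and the convention that the function returns 0 for an even sum and 1 for an odd sum is an assumption made for the example.

```python
def alice_message(x):
    """Alice sends a single bit: the parity of her number x."""
    return x % 2

def bob_decides(alice_bit, y):
    """Bob combines Alice's parity bit with the parity of his own number y."""
    return (alice_bit + y) % 2  # assumed convention: 0 if x + y is even, 1 if odd

def even_or_odd(x, y):
    """Reference definition that uses both inputs directly."""
    return (x + y) % 2

if __name__ == "__main__":
    for x, y in ((12, 30), (7, 4), (9, 15)):
        print(x, y, bob_decides(alice_message(x), y))
```

However large x and y are, only the single bit returned by `alice_message` crosses from Alice to Bob.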
6.2.2
Maximal nonlocality and the inner product
In this section we show that maximal nonlocality of the correlations presented in equations (6.11) and (6.12) can be used to reduce the communication complexity of the inner product to 1. Looking at the equation (6.11), we see that all the nonzero probabilities satisfy $m_i^A \oplus m_j^B = i \cdot j$, which is how we get maximal nonlocality. We see that:
\[ P\left(m_i^A \oplus m_j^B = i \cdot j\right) = 1. \tag{6.14} \]
The inner product function for binary numbers x and y of length N can be written as:
\[ IP_N(x, y) = \sum_{m=1}^{N} x_m \cdot y_m. \tag{6.15} \]
Assume Alice and Bob had at least N entangled particles at their disposal. Then they each could use each bit of their number as a measurement setting to a corresponding entangled particle. Thus $\{i_1, i_2, \ldots, i_N\} = \{x_1, x_2, \ldots, x_N\}$ and $\{j_1, j_2, \ldots, j_N\} = \{y_1, y_2, \ldots, y_N\}$. By the equation (6.14), we know that $(m_i^A)_m \oplus (m_j^B)_m = i_m \cdot j_m = x_m \cdot y_m$. Then to solve the equation (6.15):
\[
IP_N(x, y) = \sum_{m=1}^{N} x_m \cdot y_m = \sum_{m=1}^{N} (m_i^A)_m \oplus (m_j^B)_m
= \sum_{m=1}^{N} (m_i^A)_m \oplus \sum_{m=1}^{N} (m_j^B)_m,
\tag{6.16}
\]
where all sums are taken modulo 2.
The first (second) term in the equation (6.16) can be calculated by Alice (Bob), as she (he) can sum the results from each measurement she (he) makes on the entangled particles from her (his) own side. Then one of the parties can send the result of their summation as one bit to the other party, and then the other party can add that to his or her own summation result to find the result of the inner product.
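The protocol can be simulated classically to see that a single communicated bit suffices. In the sketch below, `nl_box` is an illustrative helper that samples the correlations of equations (6.11) and (6.12). It needs both inputs at once, so it is only a simulation of the correlations and not a local implementation of them.

```python
import random

def nl_box(i, j, rng):
    """Sample outputs (a, b) with a XOR b = i*j and uniformly random marginals."""
    a = rng.randint(0, 1)
    return a, a ^ (i & j)

def inner_product_one_bit(x_bits, y_bits, rng=None):
    """Compute the inner product modulo 2 using one box per bit pair and one communicated bit."""
    rng = rng or random.Random()
    alice_parity = 0
    bob_parity = 0
    for xm, ym in zip(x_bits, y_bits):
        a, b = nl_box(xm, ym, rng)   # bit m of each party is used as a measurement setting
        alice_parity ^= a            # Alice's local sum modulo 2
        bob_parity ^= b              # Bob's local sum modulo 2
    message = alice_parity           # the only bit sent from Alice to Bob
    return message ^ bob_parity

if __name__ == "__main__":
    x = [1, 0, 1, 1]
    y = [1, 1, 0, 1]
    print(inner_product_one_bit(x, y), sum(a * b for a, b in zip(x, y)) % 2)
```

The individual outputs are random, but the XOR of the two local parities always equals $\sum_m x_m \cdot y_m$ modulo 2.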
This result is quite remarkable. The function $IP_N(x, y)$, which would have a communication complexity of N in the classical or quantum world, has trivial communication complexity in a universe where the maximally nonlocal correlations are allowed. Next, we show that every distributive decision problem with N bit input from each party can be reduced to the inner product function.
6.2.3
Distributive decision problems and the IP N
Formally, any function $f : \{0, 1\}^N \otimes \{0, 1\}^N \to \{0, 1\}$ can be reduced to the $IP_N(x, y)$, where x and y are binary strings of length N. Recall that any logical proposition with N variables can be written as a polynomial with N variables which has at most $2^N$ terms. Some of these N variables will be from Alice and some will be from Bob. We can group them accordingly, and then the function with N variables ends up being an inner product function of $2^N$ terms. As an example, the logical operation NAND can be represented as follows:
\[ \mathrm{NAND}(x, y) = 1 \oplus x \cdot y. \]
The NAND gate is universal, so the expression above can be used to write any logical proposition as a polynomial with arithmetic modulo 2.
A general and common example is the equivalence function (EQUIV). This function compares the strings of Alice and Bob and returns a 1 if the strings are the same. Assume $x = x_1 x_2$ and $y = y_1 y_2$. Then $\mathrm{EQUIV}(x, y) = (1 \oplus x_1 \oplus y_1) \cdot (1 \oplus x_2 \oplus y_2)$. Notice that $(1 \oplus x_{1(2)} \oplus y_{1(2)})$ returns a one if $x_{1(2)} = y_{1(2)}$ and a 0 otherwise. Then EQUIV(x,y) returns a 1 if $x = y$, and a 0 otherwise. EQUIV(x,y) can also be written as:
\[ \mathrm{EQUIV}(x, y) = \cdots \tag{6.17} \]
\[ \phantom{\mathrm{EQUIV}(x, y)} = \sum_{m=1}^{5} I_m(x_1, x_2)\, J_m(y_1, y_2). \tag{6.18} \]
We see that each term in the equation (6.17) can be written as $I_m(x_1, x_2)\, J_m(y_1, y_2)$. Thus the initial EQUIV(x,y) problem is reduced to an inner product function with each term being a multiplication of a bit by Alice ($I_m(x_1, x_2)$) and a bit by Bob ($J_m(y_1, y_2)$). Notice that the initial problem required two input bits from each party, however the final inner product function requires 5 input bits from both Alice and Bob.
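One way to carry out this reduction explicitly is sketched below. The five-term grouping used here is obtained by expanding $(1 \oplus x_1 \oplus y_1) \cdot (1 \oplus x_2 \oplus y_2)$ modulo 2 and collecting Alice's and Bob's variables. It is one possible choice of the functions $I_m$ and $J_m$ and is not necessarily the grouping intended in equation (6.17); the names `I_TERMS` and `J_TERMS` are illustrative.

```python
from itertools import product

def equiv(x1, x2, y1, y2):
    """EQUIV(x, y) = (1 XOR x1 XOR y1) AND (1 XOR x2 XOR y2)."""
    return (1 ^ x1 ^ y1) & (1 ^ x2 ^ y2)

# Assumed grouping: 1*1 + (x1+x2+x1x2)*1 + 1*(y1+y2+y1y2) + x1*y2 + x2*y1 (mod 2).
I_TERMS = [
    lambda x1, x2: 1,
    lambda x1, x2: x1 ^ x2 ^ (x1 & x2),
    lambda x1, x2: 1,
    lambda x1, x2: x1,
    lambda x1, x2: x2,
]
J_TERMS = [
    lambda y1, y2: 1,
    lambda y1, y2: 1,
    lambda y1, y2: y1 ^ y2 ^ (y1 & y2),
    lambda y1, y2: y2,
    lambda y1, y2: y1,
]

def equiv_as_inner_product(x1, x2, y1, y2):
    """Evaluate the sum over m of I_m(x1, x2) * J_m(y1, y2), modulo 2."""
    total = 0
    for f, g in zip(I_TERMS, J_TERMS):
        total ^= f(x1, x2) & g(y1, y2)
    return total

if __name__ == "__main__":
    for bits in product((0, 1), repeat=4):
        print(bits, equiv(*bits), equiv_as_inner_product(*bits))
```

Alice can evaluate all five $I_m$ bits from her own input, and Bob all five $J_m$ bits from his, before any communication takes place.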
6.2.4
Other nonlocal boxes
After Wim van Dam showed that correlations with CHSH = 4 can be used to trivialize communication complexity, quantum communication complexity received more attention as a foundational area of study in QM. Later, in 2006, it was shown that boxes with $\mathrm{CHSH} > 4\sqrt{2/3} \approx 3.27$ allow communication with trivial complexity [12].
Subsequently, Brunner and Skrzypczyk showed that the first distillation protocol outlined in Section 5.3 can be used to distill boxes of the form $B^{NL}_{PC}(\epsilon)$ to CHSH values larger than $4\sqrt{2/3}$, and thus trivialize communication complexity. Figure 7 summarizes the findings.
A very important point here is that some boxes outside of $\mathcal{Q}$ with CHSH values very close to 2 have been shown to trivialize communication complexity. This shows that nontrivial communication complexity is not a principle restricting the CHSH value, but it could be a principle singling out quantum correlations. It has been claimed that the CHSH inequality is not an adequate measure of nonlocality, and this result supports that claim.
[Figure 7 plot, titled "Post-quantum models making communication complexity trivial", with an axis labelled CHSH2.]
Figure 7: (Color in online version) All boxes with $\mathrm{CHSH} > 4\sqrt{2/3}$ (light blue) trivialize communication complexity [12]. Boxes shown in dark blue can be distilled to maximum nonlocality, which would also allow trivial communication complexity [13]. The solid black line is the boundary of $\mathcal{Q}$. Figure originally from [13].
It is not clear if all post-quantum boxes trivialize communication complexity, however
correlation algebra offers new tools that are promising in this area of research.
6.2.5
Discussion
We showed how any distributive decision problem of length N can be reduced to an inner product function of length $2^N$. We had also shown that the inner product of two numbers of arbitrary length has trivial communication complexity if maximally nonlocal correlations exist. Thus we conclude that any distributed decision problem would have trivial communication complexity if QM was maximally nonlocal.
This might offer an answer to why $\mathrm{CHSH}_{QM}$ does not reach the algebraic maximum of 4. Just as the limits in special relativity and computational classes offer us physical intuition, not having trivial communication complexity can be a physical reason for why $\mathrm{CHSH}_{QM}$ is bounded. There is a hierarchy of different classes in distributive problems, similar to the computational complexity classes [4]. It would be against physical intuition from an information-theoretic perspective if communication complexity classes were all reduced to trivial complexity.
To conclude that maximal nonlocality trivializes the communication complexity of all problems overlooks an important drawback. Since Alice and Bob might need up to $2^N$ entangled particles for inputs of N bits, the number of entangled particles and the computation each party has to do on his or her own to put the data in the inner product form bring exponential overhead to this algorithm. It is true that correlations given in the equations (6.11) and (6.12) reduce the number of bits that need to be communicated to 1, which itself is quite significant. However it is important to note that it comes with an exponentially increasing price of more entangled particles and more computation. It is the opinion of the author that this exponential overhead for computation is not paid much attention to in the literature.
It is still uncertain if communication complexity is trivialized by all post-quantum correlations. There is still a gap between $\mathcal{Q}$ and correlations that allow trivial communication complexity. It is uncertain if that small gap will be bridged. If it is shown that all post-quantum correlations trivialize communication complexity, then QM can be founded on three physical assumptions about the universe: causality, maximum nonlocality, and nontrivial communication complexity.
6.3
The optimality of quantum correlations
In references [6, 25] it has been claimed that there is an inverse relationship between the number of states and the number of measurements that are allowed in a non-classical probabilistic theory. For sets of correlations that are larger than $\mathcal{Q}$ (for example $\mathcal{C}$), although there are more states than are allowed in $\mathcal{Q}$, the allowed set of measurements and thus the dynamics are more limited. For example, it has been shown that entanglement swapping, teleportation and dense coding do not exist in a universe where all correlations in $\mathcal{C}$ are allowed [6, 25]. The authors Anthony Short and Jonathan Barrett explain this with the existence of a trade-off between allowed states and allowed measurements. Another result that supports this claim is that if we restrict the set of allowed correlations to the convex hull of the local boxes (vertices of $\mathcal{L}$) and the nonlocal box $NL_{000}$, as opposed to all of $\mathcal{C}$, we recover the dynamics of entanglement effects such as teleportation and entanglement swapping [26]. This is an interesting idea, which views $\mathcal{Q}$ as optimal due to the nice balance it has between the broad range of dynamics and allowable correlations. Below we outline the reasoning of Short and Barrett in [25].
Building on the formalism presented earlier, we aim to present an "operational model" of a set of correlation boxes by listing the probabilities of possible inputs to a system and the possible outputs. Consider the state of a box with a binary input and binary output. Then this system has four different probabilities that we can specify, which would define the system. The four probabilities needed would be:
\[ P_{0|0},\quad P_{0|1},\quad P_{1|0},\quad P_{1|1}, \tag{6.19} \]
or in vector form:
\[
\mathbf{P} =
\begin{pmatrix}
P_{0|0} \\ P_{0|1} \\ P_{1|0} \\ P_{1|1}
\end{pmatrix}.
\tag{6.20}
\]
Note that equation (6.20) states all the probabilities that are associated with a binary input binary output state. In addition, these states are normalized, meaning that $P_{0|0} + P_{0|1} = 1$ and $P_{1|0} + P_{1|1} = 1$.
The set of measurements that is sufficient to completely describe a system is called the fiducial measurements [17]. The idea is that for any measurement on the system, the probabilities can be stated in terms of the measurement probabilities of the fiducial measurements. The fiducial measurements of a system need not be unique. The next example will make clear what we mean by fiducial measurements.
A physical example of a simple system is a qubit. There are infinitely many
different measurements one can perform on a qubit, so a vector $\vec{P}$ listing all of them
would have infinitely many rows. However, the fiducial measurements for this system can be
taken to be only three measurements, those corresponding to the Pauli operators $\sigma_x$,
$\sigma_y$ and $\sigma_z$, with one row for the probability of each outcome. The idea is that the
information about any measurement on the system can be characterized by the outcome
probabilities of the fiducial measurements, since the measurement probabilities along the
Pauli matrices completely determine the state of the system. Thus we write the fiducial
measurement vector as:
\[ \vec{P} = \begin{pmatrix} P_{\sigma_x|+1} \\ P_{\sigma_x|-1} \\ P_{\sigma_y|+1} \\ P_{\sigma_y|-1} \\ P_{\sigma_z|+1} \\ P_{\sigma_z|-1} \end{pmatrix} \tag{6.21} \]
where $P_{\sigma_x|-1}$ is the probability of getting the outcome $-1$ from a measurement corresponding
to $\sigma_x$.
Now assume we want to write the measurement probabilities corresponding to
$\sigma_{\pi/4} = (\sigma_x + \sigma_z)/\sqrt{2}$. Hardy has shown that in this formalism, the probability Prob(r) of
obtaining a particular outcome r can be expressed as a linear function of the measurement
probabilities of the fiducial measurements [17]. Thus there exists an $\vec{R}_r$ such that:
\[ \mathrm{Prob}(r) = \vec{R}_r \cdot \vec{P} = \sum_{i,j,x,y} R_{ij|xy} P_{ij|xy}. \tag{6.22} \]
As an example, we now find the $\vec{R}_r$ vectors for the measurement
$\sigma_{\pi/4} = (\sigma_x + \sigma_z)/\sqrt{2}$, namely $\vec{R}_{+1}$ and $\vec{R}_{-1}$.
The former gives the probability of measuring +1 as a
linear function of the probabilities of the fiducial measurements (equation (6.21)). Writing
the arbitrary qubit state in the form of equation (4.2):
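\[ |\psi\rangle = \alpha|0\rangle + \beta|1\rangle = \cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle
= \cos\frac{\theta}{2}|0\rangle + \cos\phi\,\sin\frac{\theta}{2}|1\rangle + i\sin\phi\,\sin\frac{\theta}{2}|1\rangle. \]
Then the first component of the vector in equation (6.21) is:
\[ P_{\sigma_x|+1} = \left| \frac{\langle 0| + \langle 1|}{\sqrt{2}} |\psi\rangle \right|^2
= \frac{1}{2}\left| \cos\frac{\theta}{2} + e^{i\phi}\sin\frac{\theta}{2} \right|^2
= \frac{1}{2}\left( \cos\frac{\theta}{2} + e^{i\phi}\sin\frac{\theta}{2} \right)\left( \cos\frac{\theta}{2} + e^{-i\phi}\sin\frac{\theta}{2} \right)
= \frac{1 + \sin\theta\cos\phi}{2}. \]
Similarly, the second component is:
\[ P_{\sigma_x|-1} = \left| \frac{\langle 0| - \langle 1|}{\sqrt{2}} |\psi\rangle \right|^2
= \frac{1}{2}\left| \cos\frac{\theta}{2} - e^{i\phi}\sin\frac{\theta}{2} \right|^2
= \frac{1}{2}\left( \cos\frac{\theta}{2} - e^{i\phi}\sin\frac{\theta}{2} \right)\left( \cos\frac{\theta}{2} - e^{-i\phi}\sin\frac{\theta}{2} \right)
= \frac{1 - \sin\theta\cos\phi}{2}. \]
The third and fourth components are:
\[ P_{\sigma_y|\pm 1} = \left| \frac{\langle 0| \mp i\langle 1|}{\sqrt{2}} |\psi\rangle \right|^2
= \frac{1}{2}\left| \cos\frac{\theta}{2} \mp i\cos\phi\,\sin\frac{\theta}{2} \pm \sin\phi\,\sin\frac{\theta}{2} \right|^2
= \frac{1}{2}\left( \left( \cos\frac{\theta}{2} \pm \sin\phi\,\sin\frac{\theta}{2} \right)^2 + \cos^2\phi\,\sin^2\frac{\theta}{2} \right)
= \frac{1 \pm \sin\theta\sin\phi}{2}. \]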
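Finally, the last two components are easiest to calculate:
\[ P_{\sigma_z|+1} = |\langle 0|\psi\rangle|^2 = \cos^2(\theta/2), \qquad P_{\sigma_z|-1} = |\langle 1|\psi\rangle|^2 = \sin^2(\theta/2). \]
Thus for an arbitrary one-qubit system (equation (4.2)), the fiducial measurement vector
defined in equation (6.21) is equal to:
\[ \vec{P} = \begin{pmatrix} P_{\sigma_x|+1} \\ P_{\sigma_x|-1} \\ P_{\sigma_y|+1} \\ P_{\sigma_y|-1} \\ P_{\sigma_z|+1} \\ P_{\sigma_z|-1} \end{pmatrix}
= \begin{pmatrix} (1 + \sin\theta\cos\phi)/2 \\ (1 - \sin\theta\cos\phi)/2 \\ (1 + \sin\theta\sin\phi)/2 \\ (1 - \sin\theta\sin\phi)/2 \\ \cos^2(\theta/2) \\ \sin^2(\theta/2) \end{pmatrix} \tag{6.23} \]
To find the $\vec{R}_{\pm 1}$ vectors for the measurement corresponding to
$\sigma_{\pi/4} = (\sigma_x + \sigma_z)/\sqrt{2}$, we
need to find the probabilities of getting the measurement outcomes +1 and -1 in terms of $\theta$ and
$\phi$, and then express them as linear functions of the components of the vector $\vec{P}$ in equation
(6.23). The probability of getting the outcome +1 from the measurement corresponding to
$\sigma_{\pi/4}$ is (after going through the same steps as above):
\[ P_{\sigma_{\pi/4}|+1} = \frac{2\cos^2(\theta/2) + \sin\theta\cos\phi + \sqrt{2} - 1}{2\sqrt{2}}. \tag{6.24} \]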
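The "same steps" can be spelled out as follows. This is a sketch that assumes the +1 eigenvector of $\sigma_{\pi/4}$ is written as $\cos(\pi/8)|0\rangle + \sin(\pi/8)|1\rangle$; that form is a standard result but is not stated in the text, and the symbol $|{+}_{\pi/4}\rangle$ is introduced here only for illustration.

```latex
\begin{align*}
P_{\sigma_{\pi/4}|+1}
  &= \left| \langle +_{\pi/4} | \psi \rangle \right|^2
   = \left| \cos\tfrac{\pi}{8}\cos\tfrac{\theta}{2}
          + e^{i\phi}\sin\tfrac{\pi}{8}\sin\tfrac{\theta}{2} \right|^2 \\
  &= \cos^2\tfrac{\pi}{8}\cos^2\tfrac{\theta}{2}
   + \sin^2\tfrac{\pi}{8}\sin^2\tfrac{\theta}{2}
   + 2\sin\tfrac{\pi}{8}\cos\tfrac{\pi}{8}\,
     \sin\tfrac{\theta}{2}\cos\tfrac{\theta}{2}\cos\phi \\
  &= \frac{2+\sqrt{2}}{4}\cos^2\tfrac{\theta}{2}
   + \frac{2-\sqrt{2}}{4}\sin^2\tfrac{\theta}{2}
   + \frac{\sin\theta\cos\phi}{2\sqrt{2}} \\
  &= \frac{1}{2} + \frac{\cos\theta}{2\sqrt{2}}
   + \frac{\sin\theta\cos\phi}{2\sqrt{2}}
   = \frac{2\cos^2(\theta/2) + \sin\theta\cos\phi + \sqrt{2} - 1}{2\sqrt{2}} .
\end{align*}
```

The third line uses $\cos^2(\pi/8) = (2+\sqrt{2})/4$, $\sin^2(\pi/8) = (2-\sqrt{2})/4$ and $2\sin(\pi/8)\cos(\pi/8) = 1/\sqrt{2}$; the last step uses $\cos\theta = 2\cos^2(\theta/2) - 1$.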
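Next, we write this expression in terms of the components of $\vec{P}$. Using trigonometric identities we get:
\[ P_{\sigma_{\pi/4}|+1} = \frac{2\cos^2(\theta/2) + \sin\theta\cos\phi + \sqrt{2} - 1}{2\sqrt{2}} \]
\[ = \frac{1}{2\sqrt{2}}\,\frac{(1 + \sin\theta\cos\phi)}{2} - \frac{1}{2\sqrt{2}}\,\frac{(1 - \sin\theta\cos\phi)}{2}
 + \frac{1}{2}\,\frac{(1 + \sin\theta\sin\phi)}{2} + \frac{1}{2}\,\frac{(1 - \sin\theta\sin\phi)}{2}
 + \frac{1}{2\sqrt{2}}\cos^2(\theta/2) - \frac{1}{2\sqrt{2}}\sin^2(\theta/2). \]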
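Thus, we find that
\[ \vec{R}_{+1} = \begin{pmatrix} \frac{1}{2\sqrt{2}} \\ \frac{-1}{2\sqrt{2}} \\ \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2\sqrt{2}} \\ \frac{-1}{2\sqrt{2}} \end{pmatrix} \tag{6.25} \]
Similarly, the $\vec{R}_{-1}$ vector is:
\[ \vec{R}_{-1} = \begin{pmatrix} \frac{-1}{2\sqrt{2}} \\ \frac{1}{2\sqrt{2}} \\ \frac{1}{2} \\ \frac{1}{2} \\ \frac{-1}{2\sqrt{2}} \\ \frac{1}{2\sqrt{2}} \end{pmatrix} \tag{6.26} \]
Given these $\vec{R}_{\pm 1}$ vectors, we can now write:
\[ \mathrm{Prob}(+1) = \vec{R}_{+1} \cdot \vec{P} \quad \text{and} \quad \mathrm{Prob}(-1) = \vec{R}_{-1} \cdot \vec{P} \tag{6.27} \]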
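Equation (6.27) can be checked numerically. The sketch below builds the fiducial vector of equation (6.21) from the Born rule and compares $\vec{R}_{\pm 1} \cdot \vec{P}$ with the probabilities obtained directly from the eigenstates of $\sigma_{\pi/4}$. All function and constant names are hypothetical, and the explicit eigenvectors (for example $\cos(\pi/8)|0\rangle + \sin(\pi/8)|1\rangle$ for the +1 outcome) are standard results assumed here rather than taken from the text.

```python
import cmath
import math

A = 1.0 / (2.0 * math.sqrt(2.0))
R_PLUS = [A, -A, 0.5, 0.5, A, -A]    # equation (6.25)
R_MINUS = [-A, A, 0.5, 0.5, -A, A]   # equation (6.26)

S = 1.0 / math.sqrt(2.0)
# Eigenstates (as kets) for the outcomes +1 and -1 of sigma_x, sigma_y, sigma_z.
FIDUCIAL_KETS = [
    (S, S), (S, -S),            # sigma_x: +1, -1
    (S, 1j * S), (S, -1j * S),  # sigma_y: +1, -1
    (1.0, 0.0), (0.0, 1.0),     # sigma_z: +1, -1
]
# Eigenstates of sigma_{pi/4} = (sigma_x + sigma_z) / sqrt(2).
C8, S8 = math.cos(math.pi / 8), math.sin(math.pi / 8)
KET_PLUS, KET_MINUS = (C8, S8), (-S8, C8)


def qubit_state(theta, phi):
    """Amplitudes of cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return (complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2))


def born(ket_outcome, psi):
    """Born-rule probability |<outcome|psi>|^2."""
    overlap = sum(complex(a).conjugate() * b for a, b in zip(ket_outcome, psi))
    return abs(overlap) ** 2


def fiducial_vector(theta, phi):
    """Six-component vector of equation (6.21) for the state (theta, phi)."""
    psi = qubit_state(theta, phi)
    return [born(ket, psi) for ket in FIDUCIAL_KETS]


def dot(r, p):
    """Plain dot product of two equal-length sequences."""
    return sum(rj * pj for rj, pj in zip(r, p))


theta, phi = 1.1, 0.4
p = fiducial_vector(theta, phi)
psi = qubit_state(theta, phi)
print(dot(R_PLUS, p), born(KET_PLUS, psi))    # the two numbers should agree
print(dot(R_MINUS, p), born(KET_MINUS, psi))  # likewise
```

The agreement for arbitrary angles illustrates Hardy's point that the six fiducial probabilities are enough to predict the statistics of a measurement that is not itself fiducial.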
To extend this idea, we can define the set $\{\vec{R}_r\}$ to include all the allowed measurements on a given system, with their different outcomes r. Thus the $\vec{R}_{\pm 1}$ in equations
(6.25) and (6.26) would be the elements of $\{\vec{R}_r\}$ corresponding to the measurement $\theta = 45^\circ$.
Then $\{\vec{R}_r\}$ would include all $\vec{R}_{\pm 1}$ corresponding to all allowed measurements with different
$\theta$ and $\phi$. Each $\{\theta, \phi\}$ pair denotes a measurement along the
corresponding axis. The set $\{\vec{R}_{\pm 1}\}$ has to obey some constraints on any system
[25]. The first main constraint is:
\[ 0 \le \vec{R}_r \cdot \vec{P} \le 1 \quad \text{for all } r \text{ and } \vec{P}. \tag{6.28} \]
This constraint arises from the fact that the probability of getting any outcome on a
system must lie between zero and one, for any probability distribution that describes the system.
It is analogous to the constraint equation (3.1).
Another constraint is normalization. Analogous to equation (3.2), the probabilities of the possible results of a measurement must sum to unity [25]. In other words:
\[ \sum_r \vec{R}_r \cdot \vec{P} = 1 \quad \text{for all } \vec{P}. \tag{6.29} \]
Note that not every $\vec{P}$ vector is allowed for every system. For example, a qubit system
does not allow the $\vec{P}$ vector that assigns probability 1 to the outcome +1 of every
Pauli measurement. Thus the following state vector is not allowed in Q:
\[ \vec{P} = \begin{pmatrix} P_{\sigma_x|+1} \\ P_{\sigma_x|-1} \\ P_{\sigma_y|+1} \\ P_{\sigma_y|-1} \\ P_{\sigma_z|+1} \\ P_{\sigma_z|-1} \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} \tag{6.30} \]
although it is normalized for each measurement. It is not allowed in Q; however,
it is allowed in C.
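One way to see why the vector in equation (6.30) is excluded from Q is to look at its Bloch vector. The sketch below assumes the standard criterion that a qubit state has a Bloch vector of length at most one; this criterion is not derived in the text, and the function names are hypothetical.

```python
import math


def is_normalized(p, tol=1e-12):
    """Each (+1, -1) pair must be non-negative and sum to one."""
    pairs = [(p[0], p[1]), (p[2], p[3]), (p[4], p[5])]
    return all(a >= -tol and b >= -tol and abs(a + b - 1.0) <= tol for a, b in pairs)


def bloch_vector(p):
    """Expectation values of sigma_x, sigma_y, sigma_z: P(+1) - P(-1) for each."""
    return (p[0] - p[1], p[2] - p[3], p[4] - p[5])


def allowed_in_Q(p, tol=1e-12):
    """Assumed criterion: normalized and Bloch vector length <= 1."""
    length = math.sqrt(sum(c * c for c in bloch_vector(p)))
    return is_normalized(p, tol) and length <= 1.0 + tol


state_630 = [1, 0, 1, 0, 1, 0]            # equation (6.30)
state_zero = [0.5, 0.5, 0.5, 0.5, 1, 0]   # the qubit state |0>

print(is_normalized(state_630), allowed_in_Q(state_630))
print(is_normalized(state_zero), allowed_in_Q(state_zero))
```

For the vector in equation (6.30) the Bloch vector is (1, 1, 1), whose length is $\sqrt{3}$, so it passes the normalization check but fails the qubit check.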
6.3.1 Product states and entangled states
A multi-partite state is a product state if and only if it satisfies
(6.31)
A state is separable if and only if it can be written as a convex combination of product
states. Otherwise it is entangled [17, 25].
Having introduced Hardy's formalism [17] for describing measurements and states, we
can now present Short and Barrett's optimality argument [25]. Looking at the constraint
equations (6.28) and (6.29), we can see that the more states ($\vec{P}$ vectors) are allowed in a
universe, the harder it is for the $\vec{R}_r$ vectors to satisfy the constraints. For example, the
state in equation (6.30) does not exist in Q, yet it exists in C. Thus it makes sense that some
measurements that are allowed in Q may not be allowed in C, since the measurements in
C would have to satisfy the constraints while including the state in equation (6.30). Using
this idea, Short and Barrett prove that in the universe C, teleportation and entanglement
swapping are not possible. Since many more states are allowed in C, the
measurements that lead to teleportation and entanglement swapping are not allowed, as
they do not satisfy the constraints (6.28) and (6.29). Below we outline the proofs, building
on some assumptions. These assumptions are also proved in [25]; however, the proofs
need not be presented here.
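The trade-off can be made concrete with the vectors already derived. The sketch below applies the $\vec{R}_{\pm 1}$ vectors of equations (6.25) and (6.26) to qubit states of the form (6.23) and to the state of equation (6.30). Using the $\sigma_{\pi/4}$ measurement as the test case is a choice made here for illustration; [25] treats the general case, and the helper names are hypothetical.

```python
import math

A = 1.0 / (2.0 * math.sqrt(2.0))
R_PLUS = [A, -A, 0.5, 0.5, A, -A]    # equation (6.25)
R_MINUS = [-A, A, 0.5, 0.5, -A, A]   # equation (6.26)
STATE_630 = [1, 0, 1, 0, 1, 0]       # equation (6.30), allowed in C but not in Q


def dot(r, p):
    """Plain dot product of two equal-length sequences."""
    return sum(rj * pj for rj, pj in zip(r, p))


def qubit_fiducial(theta, phi):
    """Closed-form fiducial vector of equation (6.23)."""
    sc = math.sin(theta) * math.cos(phi)
    ss = math.sin(theta) * math.sin(phi)
    return [(1 + sc) / 2, (1 - sc) / 2, (1 + ss) / 2, (1 - ss) / 2,
            math.cos(theta / 2) ** 2, math.sin(theta / 2) ** 2]


def satisfies_constraints(p, tol=1e-12):
    """Check constraints (6.28) and (6.29) for the sigma_{pi/4} measurement on state p."""
    plus, minus = dot(R_PLUS, p), dot(R_MINUS, p)
    in_range = -tol <= plus <= 1 + tol and -tol <= minus <= 1 + tol
    return in_range and abs(plus + minus - 1.0) <= tol


# Every qubit state on a coarse grid of angles passes the check.
grid = [(i * math.pi / 12, j * math.pi / 6) for i in range(13) for j in range(12)]
print(all(satisfies_constraints(qubit_fiducial(t, f)) for t, f in grid))

# The state (6.30) gives "probabilities" outside [0, 1] for the same measurement.
print(dot(R_PLUS, STATE_630), dot(R_MINUS, STATE_630))
print(satisfies_constraints(STATE_630))
```

For the state (6.30) the two dot products are $1/2 + 1/\sqrt{2} \approx 1.207$ and $1/2 - 1/\sqrt{2} \approx -0.207$. They still sum to one, so (6.29) holds, but (6.28) fails; a universe that contains this state therefore cannot also contain the $\sigma_{\pi/4}$ measurement in this form.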
6.3.2 Three theorems about states in C
• Theorem 1: All components of any $\vec{R}$ vector are between zero and unity, inclusive.
• Theorem 2: The probability of any measurement outcome can be expressed as a
convex combination of the probabilities of fiducial measurements.
• Theorem 3: All allowed measurements can be simulated using fiducial measurements and post-selection.
As mentioned above, these three theorems are proved in [25]. On the surface they agree
with our analysis of probabilities as convex combinations.
6.3.3 The impossibility of entanglement swapping in C
As in the teleportation protocol described in Section 4.3.4, Alice and Bob share an entangled two-qubit system, and so do Bob and Carol. Alice and Bob share the state P,
which is composed of A and B1 (corresponding to Alice's and Bob's qubits, respectively).
Similarly, Bob and Carol share the state Q, which is composed of B2 and C (corresponding
to Bob's and Carol's qubits, respectively). Assume Bob performs a measurement $j_1$ and gets
an outcome $y_1$. Then P collapses to a state $P^{j_1|y_1}_{i|x}$, where i and x are Alice's input and
output bits, respectively. The collapsed P state can be written as:
(6.32)
Similarly, after a measurement $j_2$ that Bob performs on Q with the outcome $y_2$, $Q^{j_2|y_2}_{k|z}$ is the
collapsed state of Q, where k and z are Carol's input and output bits, respectively. The
collapsed state can be written as:
(6.33)
So, following the entanglement swapping protocol in Section 4.3.4, Bob performs a joint
measurement ($j_1$ and $j_2$) with the outcome r on both of his qubits, with corresponding
outcomes $y_1$ and $y_2$. Following that, Alice and Carol perform the measurements i and k and
get the outcomes x and z. The probability of this event (Alice, Bob and Carol getting the
outcomes x, r and z respectively) is:
\[ \mathrm{Prob}_{ik|rxz} = \sum_{j_1 j_2 y_1 y_2} R_r(j_1 j_2|y_1 y_2)\, P(i j_1|x y_1)\, Q(j_2 k|y_2 z) \tag{6.34} \]
where $R_r$ is the measurement vector, P and Q are the state vectors, and the indices in the parentheses denote the
components of the vectors. Rewriting $\mathrm{Prob}_{ik|rxz}$ as
\[ \mathrm{Prob}_{ik|rxz} = \sum_{j_1 j_2 y_1 y_2} \Theta^r_{j_1 j_2 y_1 y_2}\, P^{j_1|y_1}_{i|x}\, Q^{j_2|y_2}_{k|z} \tag{6.35} \]
where
(6.36)
and $P^{j_1|y_1}_{i|x}$, $Q^{j_2|y_2}_{k|z}$ are the collapsed states as defined in equations (6.32) and (6.33). We
can also define the collapsed state of the AC system, similarly to the collapsed states defined
before. For Bob's outcome r,
\[ P^{j_1 j_2|r}_{ik|xz} = \frac{\mathrm{Prob}_{ik|rxz}}{\mathrm{Prob}_{ik|r}} = \frac{\mathrm{Prob}_{ik|rxz}}{\sum_{xz} \mathrm{Prob}_{ik|rxz}}. \tag{6.37} \]
Using equation (6.35), equation (6.37) becomes
\[ P^{j_1 j_2|r}_{ik|xz} = \sum_{j_1 j_2 y_1 y_2} \lambda^r_{j_1 j_2 y_1 y_2}\, P^{j_1|y_1}_{i|x}\, Q^{j_2|y_2}_{k|z} \tag{6.38} \]
where
(6.39)
Due to Theorem 1, the components of the $\vec{R}$ vector are all non-negative
($R_r(j_1 j_2|y_1 y_2) \ge 0$).
Then $\Theta^r_{j_1 j_2 y_1 y_2}$ and $\lambda^r_{j_1 j_2 y_1 y_2}$ are all non-negative. Also note that
\[ \sum_{j_1 j_2 y_1 y_2} \lambda^r_{j_1 j_2 y_1 y_2} = 1. \tag{6.40} \]
Then equation (6.38) means that the collapsed state of AC is not entangled, since it is
a convex combination of collapsed states of Alice and Carol, given that all $\lambda^r_{j_1 j_2 y_1 y_2}$ are
non-negative and $\sum_{j_1 j_2 y_1 y_2} \lambda^r_{j_1 j_2 y_1 y_2} = 1$. Thus any joint measurement $j_1 j_2$ Bob performs on
B1B2 cannot introduce entanglement between Alice and Carol, which means entanglement
cannot be swapped.
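The consequence of equation (6.38) can be illustrated numerically: any convex combination of products of single-party boxes for Alice and Carol stays within the local CHSH bound of 2, whatever non-negative weights are used. The sketch below is only an illustration under stated assumptions. The random boxes and weights stand in for the collapsed states and the $\lambda$ coefficients, the CHSH test is used as a convenient witness of nonlocal correlations, and all names are hypothetical.

```python
import random


def random_local_box(rng):
    """Random single-party box: box[i][x] is the probability of output x for input i."""
    box = []
    for _ in range(2):
        p0 = rng.random()
        box.append([p0, 1.0 - p0])
    return box


def mix_product_boxes(weights, alice_boxes, carol_boxes):
    """Joint box sum_n lambda_n * A_n[i][x] * C_n[k][z], stored as joint[i][k][x][z]."""
    total = sum(weights)
    lambdas = [w / total for w in weights]  # normalized so that they sum to one
    joint = [[[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)] for _ in range(2)]
    for lam, a, c in zip(lambdas, alice_boxes, carol_boxes):
        for i in range(2):
            for k in range(2):
                for x in range(2):
                    for z in range(2):
                        joint[i][k][x][z] += lam * a[i][x] * c[k][z]
    return joint


def chsh(joint):
    """|E(0,0) + E(0,1) + E(1,0) - E(1,1)| with E(i,k) = sum_{x,z} (-1)^(x+z) P(ik|xz)."""
    def corr(i, k):
        return sum((-1) ** (x + z) * joint[i][k][x][z]
                   for x in range(2) for z in range(2))
    return abs(corr(0, 0) + corr(0, 1) + corr(1, 0) - corr(1, 1))


rng = random.Random(0)
n_terms = 16  # one term per (j1, j2, y1, y2) when all four are bits
weights = [rng.random() for _ in range(n_terms)]
alice = [random_local_box(rng) for _ in range(n_terms)]
carol = [random_local_box(rng) for _ in range(n_terms)]
print(chsh(mix_product_boxes(weights, alice, carol)) <= 2.0)
```

Each product term contributes a CHSH value of at most 2, and averaging with non-negative weights that sum to one cannot exceed that bound.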
6.3.4 The impossibility of teleportation in C
Recall that in Section 4.3.4 two methods of swapping entanglement were introduced, and
the latter used teleportation. Bob could teleport his half of the entangled pair
to Carol, and thus Alice and Carol could share an entangled state. However, since it was
shown that entanglement swapping is not allowed in C, teleportation cannot be allowed
either: if teleportation were allowed, then entanglement swapping would be allowed too.
An interesting point to note here is that, as mentioned at the beginning of Section 6.3,
teleportation and entanglement swapping have been shown to be possible if the set of
correlations is restricted to the convex hull of the local boxes and one nonlocal box ($NL_{000}$)
[26]. This shows that these protocols are allowed for sets of correlations that include more
nonlocal correlations than QM. It is a good example of the trade-off between allowed states
and allowed measurements. By restricting the set of allowed states, the measurements that
lead to entanglement swapping and teleportation have been made possible.
6.3.5 Further study in the optimality of Q
Ultimately this line of research aims to prove the uniqueness of computers that process information using correlations in Q (quantum computers) [6]. Note that physicists
became interested in quantum computation when it was realized that classical
computers cannot simulate quantum systems efficiently (with polynomial overhead), which
led to the idea that a computer with quantum mechanical dynamics could be superior to a classical computer. Today we know that a quantum computer can simulate a
classical computer efficiently, while the converse is believed not to hold. Building on this idea, if
it can be shown that quantum computers can efficiently simulate all other computers with different
sets of correlations, then Q can be considered the most powerful set for computation amongst causal theories. This would be a big step in identifying the physical
properties of QM.
7 Conclusions
Several ideas that might be considered as information-theoretic axioms of QM have been
presented. Both the investigation of the optimality of quantum correlations from an information-theoretic point of view and the principle of nontrivial communication complexity are
promising. For either to be considered an axiom, it has to be violated by all
post-quantum boxes while being satisfied by all quantum boxes. Both principles are
relatively recent, and both show promise as physical foundations for QM.
Another similar information-theoretic axiom proposed for QM is the principle of information causality. Information causality is broadly described as a generalization of the
no-signaling principle. The principle states that a transmission of k classical bits cannot
cause a gain of more than k bits for the receiving party [22]. It was shown that all boxes
in Q obey this principle, while all boxes in C with a CHSH value larger than $2\sqrt{2}$ violate it.
If it can be shown that post-quantum boxes with a CHSH value less than $2\sqrt{2}$ also violate
information causality, then this principle would be a single defining feature of the boxes
observed in the universe. However, as argued before, since the set of boxes that preserve
information causality is not found to be closed, it cannot be equal to Q. It must be larger
than Q, and still smaller than the polytope that satisfies CHSH $\le 2\sqrt{2}$. Thus, although
the information causality principle cannot be an axiom for QM, it still gives us a convincing
argument about Tsirelson's bound.
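The three CHSH regimes referred to above can be reproduced in a few lines. This sketch does not implement information causality itself; it only evaluates the CHSH expression for a local deterministic box, for a quantum strategy that reaches $2\sqrt{2}$, and for a Popescu-Rohrlich box. The correlator formulas (outputs satisfying a XOR b = x AND y for the nonlocal box, and $\cos(a - b)$ for a maximally entangled pair measured in one plane) are standard results assumed here, and the names and angle choices are hypothetical.

```python
import math


def chsh(correlator):
    """|E(0,0) + E(0,1) + E(1,0) - E(1,1)| for a correlator E(x, y) of binary inputs."""
    return abs(correlator(0, 0) + correlator(0, 1)
               + correlator(1, 0) - correlator(1, 1))


def local_deterministic(x, y):
    """Both parties always output 0, so the outputs always agree."""
    return 1.0


def pr_box(x, y):
    """Outputs satisfy a XOR b = x AND y: anticorrelated only when x = y = 1."""
    return -1.0 if (x and y) else 1.0


ALICE_ANGLES = [0.0, math.pi / 2]
BOB_ANGLES = [math.pi / 4, -math.pi / 4]


def quantum(x, y):
    """Assumed correlator cos(a_x - b_y) for a maximally entangled pair."""
    return math.cos(ALICE_ANGLES[x] - BOB_ANGLES[y])


print(chsh(local_deterministic))  # local bound, 2
print(chsh(quantum))              # Tsirelson's bound, 2*sqrt(2)
print(chsh(pr_box))               # algebraic maximum, 4
```

The gap between the second and third values is the region occupied by post-quantum boxes.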
It has been argued that physical axioms of QM will not necessarily be more satisfying than the present mathematical axioms. The distinction was made by Einstein, who
categorized theories as either principle or constructive theories. Special relativity can be
considered a principle theory, since constraints derived from a general physical principle
are used to find out the dynamics of a system. The kinetic theory of gases, in contrast, is
constructive according to Einstein, as it seeks to explain complex phenomena using relatively
simple elements and their dynamics. Timpson, for example, suggests that special
relativity as a principle theory was needed because physicists did not have an understanding of
how space-time behaved [29]. Today, however, since we already have QM, which predicts the
evolution of quantum systems, on this view an axiomatization of QM is not needed. On the other hand,
some believe that the nonphysical foundation of QM today is the main reason it is so poorly
understood and its dynamics are considered to be "weird" [23, 15].
Even if QM is better off being a constructive theory, it is still undeniable that the new
developments in quantum information theory offer new insights into QM,
and constraints on the quantum computation and quantum communication systems that
might be realized in the future. Discoveries like Bell's inequalities and Popescu-Rohrlich's
nonlocal boxes were made quite late, considering how simple their underlying mathematical formalism is. The fact that such simple and revolutionary ideas took so long
to discover suggests that a better understanding of the foundations of QM is sorely needed.
The context-free approach of quantum information theory, which focuses only on correlations and probabilities, provides a different perspective that was needed. With the recently
acquired tools and principles, this area of research promises to offer many new insights.
Further research should build on the mathematical tools for investigating
correlation boxes, and differentiate between the information processing properties of correlations in Q and correlations in C - Q in more detail. With recent experiments that can
measure the limit on nonlocality very precisely, it is interesting to see the different behavior
of probabilistic boxes inside and outside that boundary. In addition, this area of research
sets the foundation for the computer science of the future if quantum computers can be
realized efficiently.
References
[1] Jonathan Allcock, Nicolas Brunner, Noah Linden, Sandu Popescu, Paul Skrzypczyk,
and Tamas Vertesi. Closed sets of non-local correlations. Physical Review A, 80:062107, 2009.
[2] Jonathan Allcock, Nicolas Brunner, Marcin Pawlowski, and Valerio Scarani. Recovering part of quantum boundary from information causality. Physical Review A, 80:040103, 2009.
[3] D. Avis. Lrs. http://cgm.cs.mcgill.ca/~avis/C/lrs.html.
[4] Laszlo Babai, Phyllis G. Frankl, and Janos Simon. Complexity classes in communication complexity theory. In Proceedings of the 27th IEEE Symposium on Foundations
of Computer Science, pages 337-347, Los Angeles, Ca., USA, October 1986. IEEE
Computer Society Press.
[5] J. T. Barreiro, T. C. Wei, and Paul Kwiat. Beating the channel capacity limit for
linear photonic superdense coding. Nature Physics, 4:282-286, 2008.
[6] Jonathan Barrett. Information processing in generalized probabilistic theories. Physical Review A, 75:032304, 2007.
[7] Jonathan Barrett, Noah Linden, Serge Massar, Stefano Pironio, Sandu Popescu, and
David Roberts. Nonlocal correlations as an information-theoretic resource. Physical
Review A, 71:022101, 2005. arXiv:quant-ph/0404097.
[8] Charles Bennett and Stephen Wiesner. Communication via one- and two-particle
operators on Einstein-Podolsky-Rosen states. Physical Review Letters, 69:2881-2884,
1992.
[9] Charles Bennett, Gilles Brassard, Claude Crepeau, Richard Jozsa, Asher Peres, and
William Wootters. Teleporting an unknown quantum state via dual classical and
Einstein-Podolsky-Rosen channels. Physical Review Letters, 70:1895-1899, 1993.
[10] John Boccio. Quantum mechanics: Mathematical structure, physical structure, and
applications in the physical world. (Manuscript).
[11] Dirk Bouwmeester, Anton Zeilinger, and Artur K. Ekert, editors. The Physics of
Quantum Information: Quantum Cryptography, Quantum Teleportation, Quantum
Computation. Springer, 2000.
[12] Gilles Brassard, Harry Buhrman, Noah Linden, Andre Allan Methot, Alain Tapp, and
Falk Unger. Limit on nonlocality in any world in which communication complexity is
not trivial. Physical Review Letters, 96:250401, 2006.
[13] Nicolas Brunner and Paul Skrzypczyk. Non-locality distillation and post-quantum
theories with trivial communication complexity. Physical Review Letters, 102:160403,
2009. arXiv:0901.4070.
[14] A. Einstein, B. Podolsky, and N. Rosen. Can quantum-mechanical description of
physical reality be considered complete? Physical Review, 47:777-780, 1935.
[15] Richard Feynman. The Character of Physical Law. Modern Library, 1994. ISBN
0679601279.
[16] Richard P. Feynman. Simulating physics with computers. International Journal of
Theoretical Physics, 21(6-7):467-488, June 1982.
[17] L. Hardy. Quantum theory from five reasonable axioms. arXiv:quant-ph/0101012,
September 2001.
[18] Alexander Holevo. Bounds for the quantity of information transmitted by a quantum
communication channel. Problems in Information Transmission, 9:177-183, 1973.
[19] Lawrence J. Landau. Empirical two-point correlation functions. Foundations of
Physics, 18:449-460, 1988.
[20] Ll Masanes. Necessary and sufficient condition for quantum-generated correlations.
arXiv:quant-ph/0309137.
[21] Michael Nielsen and Isaac Chuang. Quantum Computation and Quantum Information.
Cambridge University Press, 2000.
[22] Marcin Pawlowski, Tomasz Paterek, Dagomir Kaszlikowski, Valerio Scarani, Andreas
Winter, and Marek Zukowski. A new physical principle: Information causality. Nature,
461:1101-1104, October 2009. arXiv:0905.2292v2.
[23] Itamar Pitowsky. Quantum Probability - Quantum Logic. Springer-Verlag, 1989.
[24] Sandu Popescu and Daniel Rohrlich. Quantum nonlocality as an axiom. Foundations
of Physics, 23:379-385, 1994.
[25] Anthony J. Short and Jonathan Barrett. Strong nonlocality: A trade-off between
states and measurements. arXiv:0909.2601v1 [quant-ph], September 2009.
[26] Paul Skrzypczyk, Nicolas Brunner, and Sandu Popescu. Emergence of quantum correlations from nonlocality swapping. Physical Review Letters, 102:110402, 2009.
[27] Gerard 't Hooft. Determinism beneath quantum mechanics. In Quo Vadis Quantum
Mechanics, Philadelphia, 2002.
[28] Gerard 't Hooft. The mathematical basis for deterministic quantum mechanics. J.
Phys.: Conf. Ser., 67:012015, 2007.
[29] Christopher G. Timpson. Contemporary Philosophy of Physics. The Ashgate Companion, 2008. arXiv:quant-ph/0611187.
[30] B. S. Tsirelson. Quantum generalizations of Bell's inequality. Letters in Mathematical
Physics, 4:93-100, 1980.
[31] Jos Uffink. Quadratic Bell inequalities as tests for multipartite entanglement. Physical
Review Letters, 88:230406, 2002.
[32] Wim van Dam. Nonlocality & Communication Complexity. PhD thesis, University of
Oxford, 1999.
[33] Wim van Dam. Implausible consequences of superstrong nonlocality.
arXiv:quant-ph/0501159, January 2005.
[34] Eric W. Weisstein. Polytope. MathWorld - A Wolfram Web Resource. URL
http://mathworld.wolfram.com/Polytope.html.
[35] W. K. Wootters and W. H. Zurek. A single quantum cannot be cloned. Nature, 299:
802-803, 1982.