Download Schumacher Compression

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Renormalization wikipedia , lookup

Topological quantum field theory wikipedia , lookup

Relativistic quantum mechanics wikipedia , lookup

Double-slit experiment wikipedia , lookup

Ensemble interpretation wikipedia , lookup

Basil Hiley wikipedia , lookup

Bohr–Einstein debates wikipedia , lookup

Theoretical and experimental justification for the Schrödinger equation wikipedia , lookup

Scalar field theory wikipedia , lookup

Algorithmic cooling wikipedia , lookup

Particle in a box wikipedia , lookup

Delayed choice quantum eraser wikipedia , lookup

Quantum decoherence wikipedia , lookup

Quantum electrodynamics wikipedia , lookup

Quantum field theory wikipedia , lookup

Probability amplitude wikipedia , lookup

Max Born wikipedia , lookup

Path integral formulation wikipedia , lookup

Hydrogen atom wikipedia , lookup

Copenhagen interpretation wikipedia , lookup

Quantum dot wikipedia , lookup

Measurement in quantum mechanics wikipedia , lookup

Bell test experiments wikipedia , lookup

Coherent states wikipedia , lookup

Quantum fiction wikipedia , lookup

Many-worlds interpretation wikipedia , lookup

Symmetry in quantum mechanics wikipedia , lookup

Orchestrated objective reduction wikipedia , lookup

Bell's theorem wikipedia , lookup

History of quantum field theory wikipedia , lookup

Density matrix wikipedia , lookup

Quantum computing wikipedia , lookup

Quantum entanglement wikipedia , lookup

Interpretations of quantum mechanics wikipedia , lookup

EPR paradox wikipedia , lookup

Quantum group wikipedia , lookup

Quantum machine learning wikipedia , lookup

Canonical quantization wikipedia , lookup

Quantum channel wikipedia , lookup

Quantum state wikipedia , lookup

Quantum cognition wikipedia , lookup

Hidden variable theory wikipedia , lookup

Quantum teleportation wikipedia , lookup

Quantum key distribution wikipedia , lookup

T-symmetry wikipedia , lookup

Transcript
CHAPTER 17
Schumacher Compression
One of the fundamental tasks in classical information theory is the compression of information.
Given access to many uses of a noiseless classical channel, what is the best that a sender and receiver can make of this resource for compressed data transmission? Shannon’s compression theorem demonstrates that the Shannon entropy is the fundamental limit for the compression rate in
the IID setting (recall the development in Section 13.4). That is, if one compresses at a rate above
the Shannon entropy, then it is possible to recover the compressed data perfectly in the asymptotic
limit, and otherwise, it is not possible to do so.1 This theorem establishes the prominent role of the
entropy in Shannon’s theory of information.
In the quantum world, it very well could be that one day a sender and a receiver would have many
uses of a noiseless quantum channel available,2 and the sender could use this resource to transmit
compressed quantum information. But what exactly does this mean in the quantum setting? A simple model of a quantum information source is an ensemble of quantum states {p X (x) , ∣ψ x ⟩}, i.e., the
source outputs the state ∣ψ x ⟩ with probability p X (x), and the states {∣ψ x ⟩} do not necessarily have to
form an orthonormal basis. Let’s suppose for the moment that the classical data x is available as well,
even though this might not necessarily be the case in practice. A naive strategy for compressing this
quantum information source would be to ignore the quantum states coming out, handle the classical data instead, and exploit Shannon’s compression protocol from Section 13.4. That is, the sender
compresses the sequence x n emitted from the quantum information source at a rate equal to the
Shannon entropy H (X), sends the compressed classical bits over the noiseless quantum channels,
the receiver reproduces the classical sequence x n at his end, and finally reconstructs the sequence
∣ψ x n ⟩ of quantum states corresponding to the classical sequence x n .
The above strategy will certainly work, but it makes no use of the fact that the noiseless quantum
channels are quantum! It is clear that noiseless quantum channels will be expensive in practice, and
the above strategy is wasteful in this sense because it could have merely exploited classical channels
(channels that cannot preserve superpositions) to achieve the same goals. Schumacher compression
1
Technically, we did not prove the converse part of Shannon’s data compression theorem, but the converse of this
chapter suffices for Shannon’s classical theorem as well.
2
How we hope so! If working, coherent fault-tolerant quantum computers come along one day, they stand to benefit
from quantum compression protocols.
389
390
CHAPTER 17. SCHUMACHER COMPRESSION
is a strategy that makes effective use of noiseless quantum channels to compress a quantum information source down to a rate equal to the von Neumann entropy. This has a great benefit from a
practical standpoint—recall from Exercise 11.9.2 that the von Neumann entropy of a quantum information source is strictly lower than the source’s Shannon entropy if the states in the ensemble
are non-orthogonal. In order to execute the protocol, the sender and receiver simply need to know
the density operator ρ ≡ ∑x p X (x) ∣ψ x ⟩ ⟨ψ x ∣ of the source. Furthermore, Schumacher compression
is provably optimal in the sense that any protocol that compresses a quantum information source
of the above form at a rate below the von Neumann entropy cannot have a vanishing error in the
asymptotic limit.
Schumacher compression thus gives an operational interpretation of the von Neumann entropy
as the fundamental limit on the rate of quantum data compression. Also, it sets the term “qubit”
on a firm foundation in an information-theoretic sense as a measure of the amount of quantum
information “contained” in a quantum information source.
We begin this chapter by giving the details of the general information processing task corresponding to quantum data compression. We then prove that the von Neumann entropy is an achievable rate of compression and follow by showing that it is optimal (these two respective parts are the
direct coding theorem and converse theorem for quantum data compression). We illustrate how
much savings one can gain in quantum data compression by detailing a specific example. The final
section of the chapter closes with a presentation of more general forms of Schumacher compression.
17.1 The Information Processing Task
We first overview the general task that any quantum compression protocol attempts to accomplish.
Three parameters n, R, and є corresponding to the length of the original quantum data sequence, the
rate, and the error, respectively, characterize any such protocol. An (n, R + δ, є) quantum compression code consists of four steps: state preparation, encoding, transmission, and decoding. Figure 17.1
depicts a general protocol for quantum compression.
An
State Preparation. The quantum information source outputs a sequence ∣ψ x n ⟩ of quantum
states according to the ensemble {p X (x) , ∣ψ x ⟩} where
∣ψ x n ⟩
An
≡ ∣ψ x1 ⟩ 1 ⊗ ⋯ ⊗ ∣ψ x n ⟩
A
An
.
(17.1)
The density operator, from the perspective of someone ignorant of the classical sequence x n , is equal
to the tensor power state ρ⊗n where
ρ ≡ ∑ p X (x) ∣ψ x ⟩ ⟨ψ x ∣ .
(17.2)
x
Also, we can think about the purification of the above density operator. That is, an equivalent mathematical picture is to imagine that the quantum information source produces states of the form
∣φ ρ ⟩RA ≡ ∑
√
p X (x) ∣x⟩ ∣ψ x ⟩ ,
R
A
x
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
(17.3)
391
17.2. THE QUANTUM DATA COMPRESSION THEOREM
Figure 17.1: The most general protocol for quantum compression. Alice begins with the output of some quantum information source whose density operator is ρ ⊗n on some system An . The inaccessible reference system holds the purification of this density operator. She performs some CPTP encoding map E, sends the compressed qubits through 2nR uses
of a noiseless quantum channel, and Bob performs some CPTP decoding map D to decompress the qubits. The scheme
is successful if the difference between the initial state and the final state is negligible in the asymptotic limit n → ∞.
where R is the label for an inaccessible reference system (not to be confused with the rate R!). The
resulting IID state produced is (∣φ ρ ⟩RA)⊗n .
n
Encoding. Alice encodes the systems An according to some CPTP compression map E A →W
where W is a quantum system of size 2nR . Recall that R is the rate of compression:
R=
1
log dW − δ,
n
(17.4)
where dW is the dimension of system W and δ is an arbitrarily small positive number.
Transmission. Alice transmits the system W to Bob using n (R + δ) noiseless qubit channels.
n
Decoding. Bob sends the system W through a decompression map D W→Â .
The protocol has є error if the compressed and decompressed state is є-close in trace distance to
the original state (∣φ ρ ⟩RA)⊗n :
⊗n
∥(φ RA
ρ )
n →W
− (D W→Â ○ E A
n
⊗n
)((φ RA
ρ ) )∥ ≤ є.
1
(17.5)
17.2 The Quantum Data Compression Theorem
We say that a quantum compression rate R is achievable if there exists an (n, R + δ, є) quantum
compression code for all δ, є > 0 and all sufficiently large n. Schumacher’s compression theorem
establishes the von Neumann entropy as the fundamental limit on quantum data compression.
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
392
CHAPTER 17. SCHUMACHER COMPRESSION
Figure 17.2: Schumacher’s compression protocol. Alice begins with many copies of the output of the quantum information source. She performs a measurement onto the typical subspace corresponding to the state ρ and then performs a
compression isometry of the typical subspace to a space of size 2n[H(ρ)+δ] qubits. She transmits these compressed qubits
over n [H (ρ) + δ] uses of a noiseless quantum channel. Bob performs the inverse of the isometry to uncompress the
qubits. The protocol is successful in the asymptotic limit due to the properties of typical subspaces.
Theorem 17.2.1 (Quantum Data Compression). Suppose that ρ A is the density operator of the quantum information source. Then the von Neumann entropy H (A)ρ is the smallest achievable rate R for
quantum data compression:
inf {R ∶ R is achievable} = H (A)ρ .
17.2.1
(17.6)
The Direct Coding Theorem
Schumacher’s compression protocol demonstrates that the von Neumann entropy H (A)ρ is an
achievable rate for quantum data compression. It is remarkably similar to Shannon’s compression
protocol from Section 13.4, but it has some subtle differences that are necessary for the quantum
setting. The basic steps of the encoding are to perform a typical subspace measurement and an
isometry that compresses the typical subspace. The decoder then performs the inverse of the isometry to decompress the state. The protocol is successful if the typical subspace measurement successfully projects onto the typical subspace, and it fails otherwise. Just like in the classical case, the
law of large numbers guarantees that the protocol is successful in the asymptotic limit as n → ∞.
Figure 17.2 provides an illustration of the protocol, and we now provide a rigorous argument.
⊗n
Alice begins with many copies of the state (φ RA
ρ ) . Suppose that the spectral decomposition of
ρ is as follows:
ρ = ∑ p Z (z) ∣z⟩ ⟨z∣ ,
(17.7)
z
where p Z (z) is some probability distribution, and {∣z⟩} is some orthonormal basis. Her first step
n
n
E1A →YA is to perform a typical subspace measurement of the form in (14.1.4) onto the typical subspace of An , where the typical projector is with respect to the density operator ρ. The action of
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
393
17.2. THE QUANTUM DATA COMPRESSION THEOREM
n →An Y
E1A
n
on a general state σ A is
n →YAn
E1A
Y
Y
(σ A ) ≡ ∣0⟩ ⟨0∣ ⊗ (I − Π δA ) σ A (I − Π δA ) + ∣1⟩ ⟨1∣ ⊗ Π δA σ A Π δA ,
n
n
n
n
n
n
n
(17.8)
n
and the classically-correlated flag bit Y indicates whether the typical subspace projection Π δA is
successful or unsuccessful. Recall from the Shannon compression protocol in Section 13.4 that we
exploited a function f that mapped from the set of typical sequences to a set of binary sequences
n[H(ρ)+δ]
{0, 1}
. Now, we can construct an isometry U f that is a coherent version of this classical
An
W
function f . It simply maps the orthonormal basis {∣z n ⟩ } to the basis {∣ f (z n )⟩ }:
An
U f ≡ ∑ ∣ f (z n )⟩ ⟨z n ∣ .
W
z n ∈TδZ
(17.9)
n
The above operator is an isometry because the input space is a subspace of size at most 2n[H(ρ)+δ]
(recall Property 14.1.2) embedded in a larger space of size 2n (at least for qubits) and the output
n
space is of size at most 2n[H(ρ)+δ] . So her next step E2YA →YW is to perform the isometric compression
n
conditional on the flag bit Y being equal to one, and the action of E2YA →YW on a general classicaln
n
n
Y
Y
quantum state σ YA ≡ ∣0⟩ ⟨0∣ ⊗ σ0A + ∣1⟩ ⟨1∣ ⊗ σ1A is as follows:
E2YA
n →YW
(σ YA ) ≡ ∣0⟩ ⟨0∣Y ⊗ Tr {σ0A } ∣e⟩ ⟨e∣W + ∣1⟩ ⟨1∣Y ⊗ U f σ1A U †f ,
n
n
n
where ∣e⟩ is some error flag orthogonal to all of the states {∣ f (ϕ x n )⟩ }ϕ
W
W
pletes the details of her encoder E
n →YW
EA
An →YW
qn
x n ∈Tδ
(17.10)
. This last step com-
, and the action of it on the initial state is
⊗n
YA
((φ RA
ρ ) ) ≡ (E2
n →YW
n →YAn
○ E1A
⊗n
)((φ RA
ρ ) ).
(17.11)
Alice then transmits all of the compressed qubits over n [H (ρ) + δ] + 1 uses of the noiseless qubit
channel.
n
Bob’s decoding D YW→A performs the inverse of the isometry conditional on the flag bit being
An
equal to one and otherwise maps to some other state ∣e⟩ outside of the typical subspace. The action
Y
Y
of the decoder on some general classical-quantum state σ YW ≡ ∣0⟩ ⟨0∣ ⊗ σ0W + ∣1⟩ ⟨1∣ ⊗ σ1W is
n
Y
A
Y
D1YW→YA (σ YW ) ≡ ∣0⟩ ⟨0∣ ⊗ Tr {σ0W } ∣e⟩ ⟨e∣ + ∣1⟩ ⟨1∣ ⊗ U †f σ1W U f .
n
(17.12)
The final part of the decoder is to discard the classical flag bit: D2YA →A ≡TrY {⋅}. Then D YW→A ≡
n
n
n
D2YA →A ○ D1YW→YA .
We now can analyze how this protocol performs with respect to our performance criterion
in (17.5). Consider the following chain of inequalities:
n
⊗n
∥(φ RA
ρ )
n →YW
− (D YW→A ○ E A
n
⊗n
⊗n
≤ ∥∣1⟩ ⟨1∣ ⊗ (φ RA
ρ )
Y
⊗n
= ∥∣1⟩ ⟨1∣ ⊗ (φ RA
ρ )
Y
n
− (D1YW→A ○ E A
n
n →YW
− (∣0⟩ ⟨0∣ ⊗ Tr {(I −
Y
n
⊗n
) ((φ RA
ρ ) )∥
(17.13)
1
n →YW
YW→A
= ∥TrY {∣1⟩ ⟨1∣ ⊗ (φ RA
○ EA
ρ ) } − (D
Y
n
⊗n
) ((φ RA
ρ ) )∥
⊗n
(17.14)
1
) ((φ RA
ρ ) )∥
1
An
An
RA ⊗n
Π δ ) (φ ρ ) } ∣e⟩ ⟨e∣
(17.15)
Y
+ ∣1⟩ ⟨1∣ ⊗ Π δA (φ RA
ρ )
n
⊗n
Π δA )∥
n
1
(17.16)
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
394
CHAPTER 17. SCHUMACHER COMPRESSION
⊗n
Y
The first equality follows by adding a flag bit ∣1⟩ to (φ RA
and tracing it out. The first inequality
ρ )
follows from monotonicity of trace distance under the discarding of subsystems (Corollary 9.1.2).
⊗n
n
n
The second equality follows by evaluating the map D1YW→A ○ E A →YW on the state (φ RA
ρ ) . Continuing, we have
⊗n
≤ ∥∣1⟩ ⟨1∣ ⊗ (φ RA
ρ )
Y
Y
− ∣1⟩ ⟨1∣ ⊗ Π δA (φ RA
ρ )
n
+ ∥∣0⟩ ⟨0∣ ⊗ Tr {(I −
⊗n
⊗n
− Π δA (φ RA
ρ )
n
Π δA ∥
n
1
An
An
RA ⊗n
Π δ ) (φ ρ ) } ∣e⟩ ⟨e∣ ∥
Y
= ∥(φ RA
ρ )
√
≤ 2 є+є
⊗n
1
(17.17)
⊗n
Π δA ∥ + Tr {(I − Π δA ) (φ RA
ρ ) }
n
n
(17.18)
1
(17.19)
The first inequality follows from the triangle inequality for trace distance (Lemma 9.1.2). The equality uses the facts ∥ρ ⊗ σ − ω ⊗ σ∥1 = ∥ρ − ω∥1 ∥σ∥1 = ∥ρ − ω∥1 and ∥bρ∥1 = ∣b∣ ∥ρ∥1 for some density
operators ρ, σ, and ω and a constant b. The final inequality follows from the first property of typical
subspaces:
⊗n
n
An ⊗n
(17.20)
Tr {Π δA (φ RA
ρ ) } = Tr {Π δ ρ } ≥ 1 − є,
and the Gentle Operator Lemma (Lemma 9.4.2).
We remark that it is important for the typical subspace measurement in (17.8) to be implemented
as a coherent quantum measurement. That is, the only information that this measurement should
learn is whether the state is typical or not. Otherwise, there would be too much disturbance to the
quantum information, and the protocol would fail at the desired task of compression. Such precise
control on so many qubits is possible in principle, but it is of course rather daunting to implement
in practice!
17.2.2
The Converse Theorem
We now prove the converse theorem for quantum data compression by considering the most general compression protocol that meets the success criterion in (17.5) and demonstrating that such
an asymptotically error-free protocol should have its rate of compression above the von Neumann
entropy of the source. Alice would like to compress a state σ that lives on a Hilbert space An . The
n n
purification ϕ R A of this state lives on the joint systems An and R n where R n is the purifying system
(again, we should not confuse reference system R n with rate R). If she can compress any system
on An and recover it faithfully, then she should be able to do so for the purification of the state. An
(n, R + δ, є) compression code has the property that it can compress at a rate R with only error є.
The quantum data processing is
An E W D Ân ,
(17.21)
Ð
→
Ð
→
and the following inequality holds for a successful quantum compression protocol:
∥ω R
where
ωR
n Ân
n Ân
− ϕR
n Ân
∥ ≤ є,
≡ D(E(ϕ R
1
n An
(17.22)
)).
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
(17.23)
395
17.3. QUANTUM COMPRESSION EXAMPLE
Consider the following chain of inequalities:
2nR = log2 (2nR ) + log2 (2nR )
(17.24)
≥ ∣H (W)ω ∣ + ∣H (W∣R )ω ∣
≥ ∣H (W)ω − H (W∣R n )ω ∣
= I (W; R n )ω
n
(17.25)
(17.26)
(17.27)
The first inequality follows from the fact that both the quantum entropy H (W) and the conditional
quantum entropy H (W∣R n ) cannot be larger than the logarithm of the dimension of the system W
(see Property 11.1.3 and Theorem 11.5.1). The second inequality is the triangle inequality. The second
equality is from the definition of quantum mutual information. Continuing, we have
≥ I (Ân ; R n )ω
(17.28)
≥ I (Ân ; R n )ϕ − nє′
(17.29)
= I (An ; R n )ϕ − nє′
(17.30)
= H (A )ϕ + H (R )ϕ − H (A R )ϕ − nє
n
n
n
n
′
= 2H (An )ϕ − nє′
(17.31)
(17.32)
The first inequality follows from the quantum data processing inequality (Bob processes W with the
decoder to get Ân ). The second inequality follows from applying the Alicki-Fannes’ inequality (see
Exercise 11.9.7) to the success criterion in (17.22) and setting є′ ≡ 6єR + 4H2 (є)/n. The first equality
follows because the systems Ân and An are isomorphic (they have the same dimension). The second
equality is by the definition of quantum mutual information, and the last equality follows because
the entropies of each half of a pure, bipartite state are equal and their joint entropy vanishes. In the
case that the state ϕ is an IID state of the form (∣φ ρ ⟩RA)⊗n from before, the von Neumann entropy
is additive so that
H (An )∣φ ρ ⟩⊗n = nH (A)∣φ ρ ⟩ ,
(17.33)
and this shows that the rate R must be greater than the entropy H (A) if the error of the protocol
vanishes in the asymptotic limit.
17.3 Quantum Compression Example
We now highlight a particular example where Schumacher compression gives a big savings in compression rates if noiseless quantum channels are available. Suppose that the ensemble is of the following form:
1
1
{( , ∣0⟩) , ( , ∣+⟩)} .
(17.34)
2
2
This ensemble is known as the Bennett-92 ensemble because it is useful in Bennett’s protocol for
quantum key distribution. The naive strategy would be for Alice and Bob to exploit Shannon’s compression protocol. That is, Alice would ignore the quantum nature of the states, and supposing that
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
396
CHAPTER 17. SCHUMACHER COMPRESSION
the classical label for them were available, she would encode the classical label. Though, the entropy
of the uniform distribution on two states is equal to one bit, and she would have to transmit classical
messages at a rate of one bit per channel use.
A far wiser strategy is to employ Schumacher compression. The density operator of the above
ensemble is
1
1
∣0⟩ ⟨0∣ + ∣+⟩ ⟨+∣ ,
(17.35)
2
2
which has the following spectral decomposition:
cos2 (π/8) ∣+′ ⟩ ⟨+′ ∣ + sin2 (π/8) ∣−′ ⟩ ⟨−′ ∣ ,
(17.36)
∣+′ ⟩ ≡ cos (π/8) ∣0⟩ + sin (π/8) ∣1⟩ ,
∣−′ ⟩ ≡ sin (π/8) ∣0⟩ − cos (π/8) ∣1⟩ .
(17.37)
(17.38)
where
The binary entropy H2 (cos2 (π/8)) of the distribution [cos2 (π/8) , sin2 (π/8)] is approximately
equal to
0.6009 qubits,
(17.39)
and thus they can save a significant amount in terms of compression rate by employing Schumacher compression. This type of savings will always occur whenever the ensemble includes nonorthogonal quantum states.
Exercise 17.3.1 In the above example, suppose that Alice associates a classical label with the states,
so that the ensemble instead is
1
1
{( , ∣0⟩ ⟨0∣ ⊗ ∣0⟩ ⟨0∣) , ( , ∣1⟩ ⟨1∣ ⊗ ∣+⟩ ⟨+∣)} .
2
2
(17.40)
Does this help in reducing the amount of qubits she has to transmit to Bob?
17.4 Variations on the Schumacher Theme
We can propose several variations on the Schumacher compression theme. For example, suppose
that the quantum information source corresponds to the following ensemble instead:
{p X (x) , ρ xA} ,
(17.41)
where each ρ x is a mixed state. Then the situation is not as “clear-cut” as in the simpler model for
a quantum information source because the techniques exploited in the converse proof do not apply
here. Thus, the entropy of the source does not serve as a lower bound on the ultimate compressibility
rate.
Let us consider a special example of the above situation. Suppose that the mixed states ρ x live
on orthogonal subspaces, and let ρ A = ∑x p X (x) ρ x denote the expected density operator of the
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
397
17.4. VARIATIONS ON THE SCHUMACHER THEME
ensemble. These states are perfectly distinguishable by a measurement whose projectors project
onto the different orthogonal subspaces. Alice could then perform this measurement and associate
classical labels with each of the states:
ρ XA ≡ ∑ p X (x) ∣x⟩ ⟨x∣ ⊗ ρ xA .
X
(17.42)
x
Furthermore, she can do this in principle without disturbing the state in any way, and therefore the
entropy of the state ρ XA is equivalent to the original entropy of the state ρ A:
H (A)ρ = H (XA)ρ .
(17.43)
Naively applying Schumacher compression to such a source is actually not a great strategy here.
The compression rate would be equal to
H (XA)ρ = H (X)ρ + H (A∣X)ρ ,
(17.44)
and in this case, H (A∣X)ρ ≥ 0 because the conditioning system is classical. For this case, a much
better strategy than Schumacher compression is for Alice to measure the classical variable X, compress it with Shannon compression, and transmit to Bob so that he can reconstruct the quantum
states at his end of the channel. The rate of compression here is equal to the Shannon entropy H (X)
which is provably lower than H (XA)ρ for this example.
How should we handle the mixed source case in general? Let’s consider the direct coding theorem and the converse theorem. The direct coding theorem for this case is essentially equivalent to
Schumacher’s protocol for quantum compression—there does not appear to be a better approach in
the general case. The density operator of the source is equal to
ρ A = ∑ p X (x) ρ xA .
(17.45)
x
A compression rate R > H (A)ρ is achievable if we form the typical subspace measurement from the
n
⊗n
typical subspace projector Π δA onto the state (ρ A) . Although the direct coding theorem stays the
same, the converse theorem changes somewhat. A purification of the above density operator is as
follows:
√
RA
X
X′
XX ′ RA
(17.46)
∣ϕ⟩
≡ ∑ p X (x) ∣x⟩ ∣x⟩ ∣ϕ ρ x ⟩ ,
x
RA
where each ∣ϕ ρ x ⟩ is a purification of ρ xA. So the purifying system is the joint system XX ′ R. Let ω XX R Â
be the actual state generated by the protocol:
′
ω XX R Â ≡ D(E(ϕ XX RA)).
′
′
(17.47)
We can now provide an alternate converse proof:
nR = log2 dW
≥ H (W)ω
≥ H (W)ω − H (W∣X)ω
= I (W; X)ω
≥ I (A; X)ω
≥ I (A; X)ϕ − nє′
(17.48)
(17.49)
(17.50)
(17.51)
(17.52)
(17.53)
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
398
CHAPTER 17. SCHUMACHER COMPRESSION
The first equality follows from evaluating the logarithm of the dimension of system W. The first
inequality follows because the von Neumann entropy is less than the logarithm of the dimension of
the system. The second inequality follows because H (W∣X)ω ≥ 0—the system X is classical when
tracing over X ′ and R. The second equality follows from the definition of quantum mutual information. The third inequality follows from quantum data processing, and the final follows from applying
the Alicki-Fannes’ inequality (similar to the way that we did for the converse of the quantum data
XX ′ RA
compression theorem). Tracing over X ′ and R of ∣ϕ⟩
gives the following state
∑ p X (x) ∣x⟩ ⟨x∣ ⊗ ρ xA ,
X
(17.54)
x
demonstrating that the ultimate lower bound on the compression rate of a mixed source is the
Holevo information of the ensemble. The next exercise asks you to verify that ensembles of mixed
states on orthogonal subspaces saturate this bound.
Exercise 17.4.1 Show that the Holevo information of an ensemble of mixed states on orthogonal
subspaces has its Shannon information equal to its Holevo information. Thus, this is an example of
a class of ensembles that meet the above lower bound on compressibility.
17.5 Concluding Remarks
Schumacher compression was the first quantum Shannon-theoretic result discovered and is the simplest one that we encounter in this book. The proof is remarkably similar to the proof of Shannon’s
noiseless coding theorem, with the main difference that we should be more careful in the quantum
case not to be learning any more information than necessary when performing measurements. The
intuition that we gain for future quantum protocols is that it often suffices to consider only what
happens to a high probability subspace rather than the whole space itself if our primary goal is to
have a small probability of error in a communication task. In fact, this intuition is the same needed
for understanding information processing tasks such as entanglement concentration, classical communication, private classical communication, and quantum communication.
The problem of characterizing the lower and upper bounds for the quantum compression rate
of a mixed state quantum information source still remains open, despite considerable efforts in this
direction. It is only in special cases, such as the example mentioned in Section 17.4, that we know of
a matching lower and upper bound as in Schumacher’s original theorem.
17.6 History and Further Reading
Schumacher introduced typical subspaces and proved the quantum data compression theorem in
Ref. [203]. Jozsa and Schumacher later generalized this proof [160], and Lo further generalized
the theorem to mixed state sources [177]. There are other generalizations in Refs. [139, 13]. Several schemes for universal quantum data compression exist [158, 159, 30], in which the sender does
not need to have a description of the quantum information source in order to compress its output.
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
17.6. HISTORY AND FURTHER READING
399
There are also practical schemes for quantum data compression in work about quantum Huffman
codes [45].
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
400
CHAPTER 17. SCHUMACHER COMPRESSION
©2011 Mark M. Wilde—This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License