Quantum Signal Processing by Single-Qubit Dynamics
by
Guang Hao Low
M.Sci. Physics
University of Cambridge, 2012
B.A. Natural Sciences
University of Cambridge, 2012
Submitted to the Department of Physics
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Physics
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2017
© Massachusetts Institute of Technology 2017. All rights reserved.
Author: Signature redacted
Department of Physics
August 9, 2017
Certified by: Signature redacted
Isaac L. Chuang
Professor of Physics
Professor of Electrical Engineering and Computer Science
Thesis Supervisor
Accepted by: Signature redacted
Nergis Mavalvala
Curtis and Kathleen Marble Professor of Astrophysics
Associate Department Head of Physics
Quantum Signal Processing by Single-Qubit Dynamics
by
Guang Hao Low
Submitted to the Department of Physics
on August 9, 2017, in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy in Physics
Abstract
Quantum computation is the most powerful realizable model of computation, and is uniquely
positioned to solve specialized problems intractable to classical computers. This quantum
advantage arises from directly exploiting the strangeness of quantum mechanics that is
fundamental to reality. As such, one expects our understanding of quantum processes in
physical systems to be indispensable to the design and execution of quantum algorithms.
We present quantum signal processing, which exploits the dynamics of simple quantum
systems to perform non-trivial computations. Such systems, applied as computational modules in larger quantum algorithms, offer a natural physical alternative to standard tasks
such as the calculation of elementary functions with integer arithmetic. The quantum advantage of this approach, based on simple physics, is of significant practical relevance. In
some cases, arbitrary bits of precision may be emulated using only constant space. Moreover,
the simplicity and performance of quantum signal processing are such that it is the final
missing ingredient for realizing a number of optimal quantum algorithms, particularly in
Hamiltonian simulation.
Quantum signal processing realizes a useful fusion of analog and digital models of quantum computation. At the physical level, we focus on how even a simple two-level system,
the qubit, computes through optimal discrete-time quantum control. Whereas quantum
control is typically used to synthesize unitary quantum gates, we solve the synthesis problem of unitary quantum functions with a full characterization of achievable functions, and
efficient techniques for their implementation.
This furnishes a surprisingly rich framework in the analog model of quantum computation
for computing functions. The generality of this model is realized by many applications, often
with no modification, to quantum algorithms designed for digital quantum computers, in
particular for matrix manipulation. In this manner, we solve a number of open problems
related to optimal amplitude amplification algorithms, optimally computing on matrices
with a quantum computer, and the simulation of physical systems.
Thesis Supervisor: Isaac L. Chuang
Title: Professor of Physics
Professor of Electrical Engineering and Computer Science
Acknowledgments
This thesis is possible only through the influence of many people.
My research advisor Isaac Chuang has consistently been a deep well of insight and
guidance, not just in the intricacies of quantum computing, but also in the larger picture of
what it means to be a researcher and beyond. Our story goes a little further back - I met Ike
on the recommendation of my academic advisor Thomas Greytak during my undergraduate
exchange at MIT.
I am very fortunate to have had good mentors. The seeds for my time in graduate school
were sown in those early days in Ike's lab, also under the supervision of Peter Herskind and
Shannon Wang. That amazingly positive experience was a pivotal moment. From the
beginning, Ike gave me the freedom to pursue my interest, and I fondly remember our lunch
discussions.
I am grateful to my thesis committee members Eddie Farhi and Aram Harrow for their
support. The all-too-frequently unsung heroes of administrative staff in the Physics department, particularly Catherine Modica and Sydney Miller, also deserve special mention for
their unwavering assistance in my moments of need.
I learned much from discussions with my collaborators, particularly Ted Yoder in laying
the foundations of this thesis through our work on composite pulse sequences in Chapter 2.
Valuable experience was gained through working with Shelby Kimmel, Cedric Lin, Michael
Gutierrez, Helena Zhang, Richard Rines, Maris Ozols, Kuan-Yu Lin, Robert McConnell,
Tailin Wu, and John Chiaverini. Beyond these, Amira Eltony, Curtis Northcutt, Murphy
Niu, Sam Buercklin, Molu Shi, Dax Koh, Mischa Wood, Yuan Su, and many others not
mentioned but no less deserving.
I would also like to thank other teachers in my journey through physics: Steve McMahon,
Yeo Ye, Quek Hoon Khim, Chen Geok Loo, and Byran Poon.
Most of all, my parents for their love, and the opportunity to pursue my dreams.
Contents
1 Introduction
1.1 An outline of quantum computation
1.1.1 Universality
1.1.2 Complexity
1.1.3 Analog quantum computation
1.1.4 Fault-tolerance
1.1.5 Digital quantum computation
1.1.6 Analog-digital hybrid models of quantum computation
1.2 Quantum signal processing

2 Analog computation on a single qubit
2.1 Introduction
2.1.1 Attributions and contributions
2.2 A model of analog quantum computation
2.3 Analog quantum computation on a single-qubit
2.3.1 Representation of QSP
2.4 Systematic and efficient design of optimal composite gates
2.4.1 Polynomial characterization of quantum response functions
2.4.2 Fourier characterization of quantum response functions
2.4.3 Implementation of quantum response functions
2.4.4 Computation of quantum response functions
2.4.5 Selection of quantum response functions
2.4.6 The methodology of composite quantum gates
2.5 Examples
2.5.1 Composite population inversion gates
2.5.2 Broadband compensated NOT gates
2.5.3 Composite quantum gates with sub-wavelength spatial selectivity
2.6 Conclusion

3 Amplitude amplification by quantum signal processing
3.1 Introduction
3.1.1 Attributions and contributions
3.2 Quantum algorithms and query complexity
3.3 Quantum search and amplitude amplification
3.4 Amplitude amplification by partial reflections
3.5 Flexible amplitude amplification
3.6 Conclusion

4 Sparse Hamiltonian simulation by quantum signal processing
4.1 Introduction
4.1.1 Attributions and contributions
4.2 Quantum walks in sparse Hamiltonian simulation
4.3 Eigenphase transformations by quantum signal processing
4.4 Optimal sparse Hamiltonian simulation

5 Standard-form Hamiltonian simulation by qubitization
5.1 Introduction
5.1.1 Attributions and contributions
5.2 The standard-form encoding of matrices
5.2.1 Matrices from a linear combination of unitaries
5.2.2 Density matrices
5.2.3 Sparse matrices
5.3 Qubitization
5.3.1 Proof of construction
5.4 Operator function design
5.4.1 Ancilla-free quantum signal processing
5.4.2 Single-ancilla flexible quantum signal processing
5.4.3 Single-ancilla quantum signal processing on arbitrary unitaries
5.4.4 Single-ancilla quantum signal processing on controlled-qubiterates
5.4.5 Double-ancilla quantum signal processing
5.4.6 Operator functions of normal matrices
5.5 Hamiltonian simulation by qubitization
5.6 Conclusion

6 Uniform spectral amplification
6.1 Introduction
6.1.1 Our Results
6.1.2 Organization
6.1.3 Attributions and contributions
6.2 Uniform Spectral Amplification by Quantum Signal Processing
6.3 Amplitude Multiplication
6.4 Uniform Spectral Amplification by Amplitude Multiplication
6.4.1 Matrix Elements as State Overlaps
6.4.2 Amplitude Multiplication of Overlap States
6.4.3 Reduction to Sparse Matrices
6.4.4 Lower Bound on Sparse Hamiltonian Simulation
6.5 Universality of the Standard-Form
6.6 Construction of Polynomials
6.6.1 Polynomial Approximations to a Truncated Linear Function
6.6.2 Polynomials for Low-Energy Uniform Spectral Amplification
6.7 Conclusions
List of Figures
1-1 Dependencies of chapters.
2-1 Plots of equiripple polynomials DCL,I, ML,T.
2-2 Worst-case infidelity of equiripple NOT gates as function of target bandwidth.
2-3 Infidelity of spatially selective equiripple composite gates as function of distance.
3-1 Quantum circuit for amplitude amplification variants.
4-1 Quantum circuit for eigenphase transformations by quantum signal processing.
5-1 Quantum circuits for standard-form encoding of matrices described by common oracles.
5-2 Quantum circuit for the qubitization of a standard-form encoding.
5-3 Quantum circuit for the phased qubiterate of a standard-form encoding.
5-4 Quantum circuit for the flexible qubiterate.
List of Tables
1.1 Truth tables of primitive Boolean logic gates NOT, OR, NAND, and Toffoli.
1.2 Representations of primitive quantum gates {Had, T, CNOT}.
5.1 Six example problems solvable using the quantum signal processing and qubitization combined.
5.2 Performance comparison of state-of-art with our new approaches for Hamiltonian simulation.
Chapter 1
Introduction
Machines designed to aid computation have long been of interest to civilization [121]. One
of the earliest examples is the abacus (~500 BCE), which can execute simple arithmetical algorithms. Ancient history is rife with more sophisticated constructions, such as the
Greek Antikythera gear mechanism (~100 BCE) for astronomy of the sun and moon, which
is also the earliest known analogue computer. It is hard to overstate the utility of these
devices. Charles Babbage's difference engine (1822 CE), a mechanical calculator for tabulating polynomials, could replace hours of laborious computation by hand with several
turns of a crank. After the invention of vacuum tubes (~1900 CE), it became common to
find electronic circuits for solving linear differential equations and performing convolutions
in real-time. Today, modern digital computers built upon billions of transistors per device
are capable of simulating astoundingly complex natural phenomena with astonishing speed,
limited only by the approximations made to the underlying laws of physics, and the effort
invested in their numerical propagation forward in time.
These are all examples of classical computers, with operating principles rooted in the
classical laws of physics. The earlier mechanical calculators rely upon Newton's laws of
motion, and the later electronic computers are possible due to Maxwell's equations for electromagnetism. Future computers could also be based upon different physical systems, such
as fluid dynamics [66], chemical reactions [117], and even DNA [106]. These speculative
thrusts are motivated in part by the desire to approach one of the holy grails of classical irreversible computation: Landauer's principle [74], which sets the thermodynamic limit of kT ln 2 joules of
energy per bit of information erased. In most cases, the minimum benchmark for any such
architecture is demonstrating the ability to implement primitive Boolean logic gates that are
universal for classical computation, such as NOT and OR, or NAND, or Toffoli (Table 1.1),
as these may be composed to synthesize arbitrary Boolean functions $f(x) \in \{0, 1\}$, where x
is a natural number represented by an n-bit string.
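As a rough numerical illustration of the kT ln 2 limit quoted above, the following sketch evaluates it at an assumed room temperature of 300 K. The function name and the choice of temperature are illustrative assumptions and do not come from the thesis.

```python
import math

# Boltzmann constant in joules per kelvin (exact in the 2019 SI definition).
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_limit_joules(temperature_kelvin: float) -> float:
    """Minimum energy dissipated per erased bit, k * T * ln(2)."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # 300 K is an assumed room temperature, chosen only for illustration.
    print(f"{landauer_limit_joules(300.0):.3e} J per erased bit")
```

At this assumed temperature the limit is a few zeptojoules per bit.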
NOT          OR             NAND           Toffoli
In  Out      In    Out      In    Out      In      Out
0   1        0 0   0        0 0   1        0 0 0   0 0 0
1   0        0 1   1        0 1   1        0 0 1   0 0 1
             1 0   1        1 0   1        0 1 0   0 1 0
             1 1   1        1 1   0        0 1 1   0 1 1
                                           1 0 0   1 0 0
                                           1 0 1   1 0 1
                                           1 1 0   1 1 1
                                           1 1 1   1 1 0
Table 1.1: Truth tables of primitive Boolean logic gates NOT, OR, NAND, and Toffoli.
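The universality claim behind these truth tables can be spot-checked by composing gates. The sketch below builds NOT and OR from NAND alone, and NAND from a Toffoli gate with its target bit fixed to 1. The function names are illustrative and not taken from the thesis.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    """NAND of two bits."""
    return 1 - (a & b)


def not_from_nand(a: int) -> int:
    """NOT a = NAND(a, a)."""
    return nand(a, a)


def or_from_nand(a: int, b: int) -> int:
    """De Morgan: a OR b = NAND(NOT a, NOT b)."""
    return nand(nand(a, a), nand(b, b))


def toffoli(a: int, b: int, c: int) -> tuple:
    """Reversible gate: flips the target c only when both controls are 1."""
    return a, b, c ^ (a & b)


def nand_from_toffoli(a: int, b: int) -> int:
    """With the target initialised to 1, the third output is NAND(a, b)."""
    return toffoli(a, b, 1)[2]


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, or_from_nand(a, b), nand_from_toffoli(a, b))
```

Applying `toffoli` twice returns the original three bits, which is the reversibility that lets the same truth table double as a quantum gate.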
Computability: Regardless of the underlying physical principle, universal classical
computers are all surprisingly equivalent in terms of computability. As Boolean logic may be
used to construct the prototypical Turing machine for performing mechanical computations,
the forward direction of this equivalence is formalized by the Church-Turing thesis [116], which
hypothesizes that any function f(x) may be computed by a Turing machine if and only if it is
computable by a human following some algorithm, ignoring resource limitations in space and
time. The reverse direction is furnished by the stronger Church-Turing-Deutsch thesis [36],
which hypothesizes that any physical process may be simulated by a Turing machine. To
date, no violations of these hypotheses have been found. In other words, if one is only
concerned with the set of problems to which one may compute solutions, the underlying
physical architecture is largely irrelevant as every universal classical computer may simulate
any other universal classical computer.
Complexity: Equivalence in computability, however, does not imply equivalence in
the more useful metric of computational complexity [114]. The power of universal classical computation is encapsulated by the complexity class P. This is the set of all functions
$f(x) \in \{0, 1\}$ on an n-bit string x efficiently computable by a Turing machine, that is, using
polynomial O(poly(n)) time and space, which in turn is simulatable using O(poly(n)) universal Boolean gates and bits. Allowance for O(poly(n)) additional random bits and computing
the correct answer f(x) with probability $\geq 2/3$ generalizes this to BPP, which stands for
Bounded-error Probabilistic Polynomial time. Within BPP, different architectures have
unique strengths that allow them to solve certain specialized problems significantly faster,
with less error, or more cost-effectively. While these at-most polynomial or constant factor improvements are generally less exciting to understanding complexity classes, they are
of immense practical relevance. This is well-illustrated by a comparison between analog
and digital computation. Analog computers compute directly through dynamics governed
by physical laws of motion. Thus their inputs and outputs are continuous physical variables, such as voltage, or position. Historically, this made them particularly suitable for fast
real-time applications that interface with physical systems, such as control engineering. In
contrast, digital computers are based on discrete variables, or abstract bits of information,
upon which computation proceeds through Boolean logic. This layer of abstraction, the logical level, is simulated by the underlying physical dynamics of the physical level, and comes
with a significant practically-relevant constant- or polynomial-overhead in space and time.
Often, considerably more components are required to mimic a simple analog computation.
Fault-tolerance: The primary advantage of digital computation, however, is robustness against unreliable physical components [131] that fail with some probability. Whereas
bit-flip errors in digital bits may be fixed by error correction codes, the simplest being redundancy followed by a majority vote, no such mechanism exists for the continuous errors
found in continuous variables. Through error correction and carefully designed fault-tolerant
hardware, the accumulation of errors may be controlled to enable arbitrarily long computations. Notably, error rates in consumer-grade computing hardware are actually extremely
low, on the order of a $\sim 10^{-19}$ probability of failure per logic gate [96], thus diminishing the
necessity of error correction. However, error correction remains highly relevant to information storage where error rates due to radiation are considerably higher, and to inherently
noisy environments such as in telecommunications or outer space. In fact, with the continual
reduction in the size of transistors in silicon leading to more fault-prone behavior, there may
soon be a point where error correction becomes necessary to realize some optimal trade-off
between feature density and effective error rate even in logic gates.
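The redundancy-plus-majority-vote scheme mentioned above can be made concrete with a short calculation. This sketch assumes each copy of a bit flips independently with probability p and that the number of copies is odd; both assumptions, and the function names, are illustrative.

```python
from itertools import product


def majority(bits) -> int:
    """Return 1 if more than half of the bits are 1, else 0."""
    return int(sum(bits) > len(bits) // 2)


def logical_error_probability(p: float, copies: int = 3) -> float:
    """Probability that a majority vote over `copies` noisy copies is wrong.

    Assumes independent flips with probability p and an odd number of copies.
    """
    total = 0.0
    for flips in product((0, 1), repeat=copies):
        if majority(flips) == 1:  # more than half of the copies were flipped
            weight = sum(flips)
            total += p ** weight * (1 - p) ** (copies - weight)
    return total


if __name__ == "__main__":
    for p in (0.1, 0.01, 0.001):
        print(p, logical_error_probability(p))
```

For three copies the result is 3p^2(1 - p) + p^3, which is smaller than p whenever p < 1/2, so redundancy helps only when the components are already reasonably reliable.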
Quantum computation: An even more dramatic separation in complexity is found
between computers based on classical laws of physics, and those on quantum mechanics. In
Section 1.1 we outline the basics of quantum computation following similar themes: universality, complexity, analog quantum computation, fault-tolerance, and digital quantum computation. The main content of this thesis is outlined in Section 1.1.6, where we motivate a
uniquely quantum hybrid of analog and digital quantum computation that merges the speed
and elegance of computation at the physical level with compatibility to the fault-tolerance
of the logical level. We then apply this approach, which we call quantum signal processing,
to the development of quantum simulation algorithms with remarkably low overhead and
optimal performance.
1.1 An outline of quantum computation
Richard Feynman was the first to recognize in 1982 [43] that quantum mechanics could be
exploited to build a quantum computer. This was motivated by the quantum simulation
problem - whereas classical systems are easy to simulate on a classical computer, a similar
feat for quantum systems is notoriously difficult. A system of n classical particles is fully
described by O(n) numbers for their positions and momenta, thus its dynamics are simulatable in O(poly(n)) time. In contrast, the states of quantum particles combine by a tensor
product and so their joint state is described by O(exp(n)) numbers, a peculiar property known as
the 'curse of dimensionality'. In other words, the problem of simulating quantum mechanics
with a classical computer, while certainly computable, requires exponential time and space.
Thus, simulating quantum many-body phenomena on a classical computer, such as high-temperature superconductivity, is completely infeasible, and even for modestly-sized
systems, it may be more practical to purpose-build the quantum experiment, which is essentially
an analog quantum computer.
To the best of our knowledge, quantum mechanics underlies physical reality at small
length scales, and certain fundamental properties unique to quantum computation have no
analogue in classical computation.
* Curse of dimensionality: A bit always assumes one of two discrete values, and lives
in the space of binary numbers {0, 1}. Given n bits, there are $2^n$ possible binary
states in $\{0, 1\}^n$. However, at any instant, the current binary state is only one of these
possibilities, and describable using exactly n bits of information. In contrast, a single
quantum bit, the qubit with state $|\psi\rangle$, is describable by three real numbers, and is a
unit vector in the two-dimensional complex vector space, $|\psi\rangle \in \mathbb{C}^2$. Given n qubits,
these combine by a tensor product to form $2^n$ orthonormal basis vectors. Unlike bits,
the composite quantum state is a unit vector in the $L^2$ norm living in $\mathbb{C}^{2^n}$ and cannot
be represented by any fewer than $O(2^n)$ complex numbers, or $O(2^n)$ bits. This also
enables non-local correlations, or entanglement, where the measurement outcomes of
individual qubits in multi-qubit states have statistics that cannot be described by
correlated classical random variables.
* Quantum interference: An n-qubit quantum state $|\psi\rangle$ is a unit vector in $\mathbb{C}^{2^n}$, and
each basis vector may be indexed with n bits. Thus $|\psi\rangle = \sum_j a_j |j\rangle$ may exist as a
superposition with complex amplitudes over all binary states, with a probability $|a_j|^2$
of measuring the basis state $|j\rangle$. As all quantum time-evolution is unitary, unitary
operators on $|\psi\rangle$ allow for the constructive and destructive interference of amplitudes
in various binary states. In contrast, classical probability distributions evolve under the
action of stochastic matrices, which only allows for the addition of probabilities, and
never subtraction. From a computational perspective, quantum interference allows for
the reinforcement of beneficial computational branches, and the pruning of unwanted
threads.
Thus a quantum computer [98] is also the most powerful model of computation that can be
feasibly constructed, potentially more capable than universal classical computers.
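The contrast between interfering amplitudes and adding probabilities can be seen on a single qubit. The sketch below applies a Hadamard gate twice to the state 0 and, for comparison, applies a fair-coin stochastic matrix twice to the corresponding probability distribution. The fair-coin matrix is an illustrative classical counterpart chosen here, not something defined in the thesis.

```python
import math


def apply(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]           # unitary: entries may be negative
FAIR_COIN = [[0.5, 0.5], [0.5, 0.5]]   # stochastic: entries are probabilities


def hadamard_twice():
    """Measurement probabilities after two Hadamards on the state 0."""
    state = apply(HADAMARD, apply(HADAMARD, [1.0, 0.0]))
    return [abs(a) ** 2 for a in state]


def coin_twice():
    """Distribution after two fair-coin randomizations of the bit 0."""
    return apply(FAIR_COIN, apply(FAIR_COIN, [1.0, 0.0]))


def amplitudes_needed(n_qubits: int) -> int:
    """Number of complex amplitudes in a general n-qubit state."""
    return 2 ** n_qubits


if __name__ == "__main__":
    print(hadamard_twice())  # amplitude on 1 cancels, so 0 is certain again
    print(coin_twice())      # probabilities only add, so the bit stays random
    print(amplitudes_needed(50))
```

After one step both systems give outcome 0 or 1 with probability 1/2. Only the quantum one can undo the randomization, because the two paths leading to outcome 1 carry opposite signs.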
1.1.1 Universality
Many theoretical developments in quantum computation parallel those of classical computation. Similar to classical computers, quantum computers can be realized by different
implementations of quantum physics. At the physical level, qubits are formed from any
two energy levels of any effective particle with quantum properties, such as nuclei [133],
ions [53], photons [100], superconductors [30], or even quantum fields [63]. The dynamics
of the time-dependent quantum state $|\psi(t)\rangle$ of these qubits then evolve in continuous-time
according to the system Hamiltonian H(t) and the famous Schrödinger equation
$$i \frac{\partial |\psi(t)\rangle}{\partial t} = H(t) |\psi(t)\rangle. \tag{1.1}$$
Though each system may differ in their Hamiltonian, all universal quantum computers are
polynomially equivalent in their computational power. Analogous to the universal classical
gates, judicious control over the Hamiltonian allows these systems to simulate a discrete set
of universal quantum gates that act on qubits, such as Hadamard and T and CNOT, or
Hadamard and Toffoli [2], all represented by unitary matrices in Table 1.2. The famous
Solovay-Kitaev theorem [71] provides a recipe to synthesize any arbitrary single-qubit or
two-qubit unitary, so-called 'primitive' quantum gates, to arbitrarily high precision $\epsilon$ using
a product of just $O(\mathrm{polylog}(1/\epsilon))$ universal quantum gates. These may in turn be composed
to synthesize any unitary gate of arbitrary dimension. As Lloyd's Hamiltonian simulation
algorithm for local Hamiltonians [81] allows any universal quantum computer to efficiently
simulate time-evolution by any physical n-particle quantum system using O(poly(n)) primitive quantum gates and qubits, any universal quantum computer is capable of simulating
any other universal quantum computer with a polynomial overhead in time and space.
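Gates: Had, T, CNOT

Unitary matrices (CNOT written in the basis $|x\rangle_b |\psi\rangle_a$ ordered 00, 01, 10, 11, with register b as control):
$$\mathrm{Had} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}, \quad \mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$

Quantum circuits:
Had: $|\psi\rangle_a \to \mathrm{Had}|\psi\rangle_a$
T: $|\psi\rangle_a \to T|\psi\rangle_a$
CNOT: $|x\rangle_b |\psi\rangle_a \to |x\rangle_b |x \oplus \psi\rangle_a$

Table 1.2: Representations of primitive quantum gates {Had, T, CNOT} as (middle row)
unitary matrices, and (bottom row) quantum circuits acting on input states $|\psi\rangle_a$, $|x\rangle_b$ from
the left, where the subscript indicates the qubit register. In the case of CNOT, the output
represented as modular addition is only valid on computational basis states e.g. $|\psi\rangle_a \in
\{|0\rangle_a, |1\rangle_a\}$. Note that the representation of Toffoli is identical to the Boolean case.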
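A quick numerical check of these gate matrices may help fix conventions. The sketch below writes the three matrices in their standard textbook form, verifies that each is unitary, and confirms that CNOT acts as modular addition on computational basis states. The basis ordering (control bit first, index 2x + y) and all helper names are assumptions made for this illustration.

```python
import cmath
import math

S = 1 / math.sqrt(2)
HAD = [[S, S], [S, -S]]
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]
# Basis ordered as |x y> with index 2*x + y; x is the control, y the target.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def is_unitary(u, tol=1e-12):
    """Check that U^dagger U equals the identity up to a tolerance."""
    prod = matmul(dagger(u), u)
    n = len(u)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))


def cnot_on_basis(x: int, y: int) -> tuple:
    """Output bits (x, x XOR y) read off from the column of CNOT for |x y>."""
    column = 2 * x + y
    row = next(r for r in range(4) if CNOT[r][column] == 1)
    return row >> 1, row & 1


if __name__ == "__main__":
    print(all(is_unitary(g) for g in (HAD, T, CNOT)))
    print([cnot_on_basis(x, y) for x in (0, 1) for y in (0, 1)])
```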
1.1.2 Complexity
This equivalence motivates one to define the complexity class of universal quantum computation BQP, which stands for Bounded-error Quantum Polynomial time. This is the set of
all functions $f(x) \in \{0, 1\}$ for which a unitary quantum circuit Q, acting on an n-qubit
input state $|x\rangle$, outputs the correct one-bit answer with probability $\geq 2/3$, using O(poly(n))
qubits and universal quantum gates. It is widely believed, though not proven, that universal quantum computers are strictly more powerful than their classical counterparts. In
other words, $\mathrm{P} \subseteq \mathrm{BPP} \subseteq \mathrm{BQP}$. Note that $\mathrm{BPP} \subseteq \mathrm{BQP}$ is particularly easy to prove as
Toffoli alone is universal for classical computation, and random bits may be obtained by
measuring a uniform quantum superposition of binary states. Though originally motivated
by the quantum simulation problem, evidence for the quantum advantage speaks through
the performance of remarkable quantum algorithms that surpass the best classical algorithms.
The most well-known quantum speedups can be found in Shor's algorithm [9], which factors
an n-bit number using $O(n^2)$ quantum gates versus the best classical factoring algorithm,
which takes $O(e^{1.9 n^{1/3}})$ classical gates. In the weaker query model where one assumes access
to a black-box circuit that outputs bits of information one by one, the broadly applicable
Grover's algorithm [50] makes $O(n^{1/2})$ queries to this circuit using a quantum superposition
of states to search an unsorted database of size n for a single marked element. In contrast,
the naive but optimal classical approach has to check every single element with O(n) queries.
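The square-root query scaling of Grover's algorithm can be observed in a small state-vector simulation. This is a minimal sketch using the usual textbook formulation (oracle phase flip followed by inversion about the mean) with real amplitudes. The function names and the database size of 16 are illustrative choices.

```python
import math


def grover_success_probability(n_items: int, marked: int, iterations: int) -> float:
    """Probability of measuring the marked index after the given iterations."""
    amp = [1 / math.sqrt(n_items)] * n_items  # uniform superposition
    for _ in range(iterations):
        amp[marked] = -amp[marked]            # oracle query: phase flip
        mean = sum(amp) / n_items
        amp = [2 * mean - a for a in amp]     # diffusion: inversion about mean
    return amp[marked] ** 2


def optimal_iterations(n_items: int) -> int:
    """Roughly (pi / 4) * sqrt(n) oracle queries."""
    return math.floor(math.pi / 4 * math.sqrt(n_items))


if __name__ == "__main__":
    n = 16
    k = optimal_iterations(n)
    print(k, grover_success_probability(n, marked=5, iterations=k))
```

For 16 items, three oracle queries already locate the marked element with probability above 0.95, whereas a classical search needs up to 16 queries in the worst case.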
1.1.3 Analog quantum computation
Universal quantum gate sets are a useful tool for comparison between physical systems as
they provide a worst-case estimate of the number of quantum gates and qubits required
to execute quantum algorithms. However, the cost of implementing this abstract layer can
hide large constant factors or even polynomial factors. Just like classical analog computers,
the underlying dynamics of each quantum system may be predisposed to solving specialized
problems. For instance, an n-qubit controlled phase gate may be implemented in a single
time step by some superconducting qubit architectures [135], whereas its naive synthesis
with standard universal gate sets would require O(n) time. Beyond the gate model, physical intuition may provide more natural models of universal quantum computation, such as
the ground state of certain two-body Hamiltonians [21], that yield insight into how nature
performs computations. Perhaps the most famous example is the adiabatic quantum computer [41, 3, 4]. In adiabatic algorithms, the time-dependent Hamiltonian H(t) transitions
from a simple Hamiltonian $H(0) = H_i$ with an easily prepared ground state, to a final
Hamiltonian $H(T) = H_f$ whose ground state encodes the solution to some computational
problem, such as search [108] or k-SAT [41]. Provided that this transition is slower than
the inverse energy gap to the first excited state, the initial ground state evolves into that of
H(T) with high probability. Though this evolution could certainly be simulated by universal
gate sets, the speed and elegance of natively physical analog quantum computation is highly
appealing [34], and accessible to current technology.
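The role of the energy gap can be illustrated on the smallest possible example. The sketch below assumes a linear interpolation H(s) = (1 - s) H_i + s H_f on one qubit, with the illustrative choices H_i = -X and H_f = -Z. Neither the interpolation schedule nor these Hamiltonians are specified in the text.

```python
import math


def interpolated_hamiltonian(s: float):
    """H(s) = (1 - s) * (-X) + s * (-Z) as a real symmetric 2x2 matrix."""
    return [[-s, -(1 - s)], [-(1 - s), s]]


def energy_gap(h) -> float:
    """Difference between the two eigenvalues of a real symmetric 2x2 matrix."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    return math.sqrt((a - d) ** 2 + 4 * b ** 2)


def minimum_gap(steps: int = 1000) -> float:
    """Smallest gap along the interpolation, sampled on a uniform grid in s."""
    return min(energy_gap(interpolated_hamiltonian(k / steps))
               for k in range(steps + 1))


if __name__ == "__main__":
    gap = minimum_gap()
    print(gap, 1 / gap)  # minimum gap and the inverse-gap time scale
```

In this toy case the gap is 2 at both endpoints and narrows to the square root of 2 halfway through, so the evolution must be slowest near s = 1/2. For hard problem instances the minimum gap can shrink rapidly with system size, which is what limits adiabatic run times.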
1.1.4 Fault-tolerance
Unfortunately, quantum computers scalable to a large number of qubits and capable of long
computations are notoriously difficult to build. Qubits are often single particles highly susceptible to weak environmental noise, unlike classical bits which result from the collective
phenomena of many particles. As a consequence of this fragility, the best quantum computers struggle to achieve better than a $10^{-3}$ probability of failure per two-qubit gate [6]
at the physical level, which greatly limits the problems in which any advantage over classical computation may be demonstrated, even in analog quantum computation. Moreover,
quantum bits are not discrete in the same way as classical bits. Whereas a quantum state
is represented by a discrete set of continuous variables, a Boolean state is represented by a
discrete set of discrete variables. Due to the lack of success in classical analog error correction, one of the greatest triumphs and surprises of quantum computation was the discovery
that quantum error correction and fault-tolerant quantum computation are possible [19], and
are now regarded as essential to any long-term vision of quantum computation.
The basic idea behind quantum error correction [47] is to encode the two basis states of a logical qubit into two specially designed entangled states, the quantum codewords, that are
distributed over a larger number of n physical qubits. Each time-step of evolution at the
physical level then applies an error, which is some quantum channel that may be decomposed into a linear combination of the n-fold tensor product of single-qubit Pauli operators
$\{I, \hat{\sigma}_x, \hat{\sigma}_y, \hat{\sigma}_z\}$. Working on the assumption that noise acts locally on physical qubits,
errors will with overwhelming probability be caused by a discrete set of low-weight operators,
where weight is the number of non-identity single-qubit Pauli operators. All possible operators of sufficiently low weight then transform the initial two codewords into other
codewords that are guaranteed to be mutually orthogonal, which allows the error that occurred to be uniquely identified by measurement, and then corrected. Quantum gates may
be applied to logical qubits using high weight operators on the physical qubits that preserve
these codewords, and any local errors may be corrected in a similar manner. Great care
must be taken in fault-tolerant quantum computation to organize these logical quantum
gates, measurements, and correction routines in such a way that low-weight errors do not
propagate throughout to become uncorrectable high-weight errors. Notably, only a finite
set of logical quantum gates may be implemented in this manner. Fortunately, some error
correction codes allow for a fault-tolerant implementation of universal quantum gate sets
such as Toffoli and Hadamard [137].
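The notion of weight used above can be made concrete with a minimal sketch. It assumes that an n-qubit Pauli operator is written as a string over the letters I, X, Y, Z; the function name `pauli_weight` and the example strings are hypothetical and do not appear in the text.

```python
def pauli_weight(pauli_string: str) -> int:
    """Count the non-identity factors in an n-qubit Pauli string such as 'IXIZI'."""
    if not set(pauli_string) <= set("IXYZ"):
        raise ValueError("expected only the letters I, X, Y, Z")
    return sum(1 for p in pauli_string if p != "I")


# Hypothetical error operators on n = 5 physical qubits.
errors = ["IIIII", "IXIII", "IXIZI", "XYZXY"]
weights = [pauli_weight(e) for e in errors]
```

Under the local-noise assumption, the strings with small weight are the ones an error-correcting code is designed to identify and correct.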
1.1.5 Digital quantum computation
This layer of fault-tolerant error correction with universal quantum gates acting on logical
qubits abstracts away the underlying physical dynamics, and is also known as digital quantum computation. Provided that physical error probabilities are below a certain threshold,
the threshold theorem [47] guarantees that logical error probabilities may be made exponentially smaller with the number of concatenations of the fault-tolerant layer. However, a
heavy price in overhead is paid for this robustness to noise. Realistic estimates [37] suggest
that tens of physical qubits and up to $10^5$ physical gate operations may be required per
logical single-qubit gate. This is compounded by the up-to polynomial overhead in time and
space of compiling fast analog quantum algorithms into sequences of universal gates. Though
these estimates can be expected to go down if physical error rates decrease or if more efficient fault-tolerant error correction protocols are found, digital quantum computation is
currently somewhat beyond the reach of existing technology.
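As a rough illustration of how concatenation suppresses logical errors, the sketch below uses a commonly quoted textbook estimate for a concatenated code that corrects one error per block, p_k = p_th (p / p_th)^(2^k). This formula and the numerical values are assumptions chosen for illustration; they are not taken from the text.

```python
def logical_error(p_phys: float, p_th: float, levels: int) -> float:
    """Textbook estimate of the logical error probability after `levels` concatenations.

    Assumes p_k = p_th * (p_phys / p_th) ** (2 ** k), which only decreases
    with k when p_phys is below the threshold p_th.
    """
    return p_th * (p_phys / p_th) ** (2 ** levels)


# Hypothetical numbers: physical error 1e-3, threshold 1e-2.
estimates = [logical_error(1e-3, 1e-2, k) for k in range(4)]
```

With these hypothetical numbers, each additional level squares the ratio to the threshold, which is where both the robustness and the heavy overhead come from.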
1.1.6 Analog-digital hybrid models of quantum computation
The fundamental physics of quantum mechanics underlies quantum computation. Just as
intuition about classical physics heavily influenced analog computation, intuition about
quantum physics has driven the invention of analog quantum algorithms, such as the
adiabatic algorithm [42, 95] inspired by adiabaticity, and quantum walks [22, 24] inspired by
locality. These algorithms can operate directly at the physical level of quantum hardware,
and thus come with low overhead and provide insight into how nature intrinsically enables
computation.
It is also widely recognized that a fault-tolerant logical layer of digital quantum computation will ultimately become necessary. The discrete set of universal logical quantum gates
implementable at this level motivates a model of quantum computation analogous to that
of the discrete set of universal classical Boolean gates. On one hand, many highly optimized
reversible Boolean compilations of common operations such as integer arithmetic and optimization may be implemented directly on a quantum computer using, say, the Toffoli gate,
which is universal for classical computation. This transfer of knowledge is of incalculable
value, and is commonly exploited by many quantum algorithms. On the other hand, this has
also motivated a more abstract understanding of how the mathematics of quantum physics
enables quantum computation, independent of the physical layer.
However, quantum algorithms designed for the logical layer lose the physical intuition of
analog quantum algorithms, and more pragmatically, lose their native speed and low overhead. One could certainly simulate analog quantum algorithms with the logical level [7], as
assured by universality of quantum computation. However, this is through digital Hamiltonian simulation algorithms with some polynomial overhead, and is not a uniquely quantum
phenomenon. After all, digital classical computers too have no fundamental difficulty simulating continuous classical physics. Whereas analog and digital classical computation are
mutually exclusive - one either computes on continuous variables, or on binary variables,
digital quantum computation still has some remnant analog behavior in how its unitary
operators live in a continuous space of $\mathbb{C}^{2^n}$. The 'analogness' of digital quantum computation lends hope that some kind of meaningful fusion between analog and digital models
of quantum computation is possible.
Figure 1-1: Dependencies of chapters. Chapter 2: Single-qubit analog quantum computation. Chapter 3: Amplitude amplification by QSP. Chapter 4: Sparse Hamiltonian simulation by QSP. Chapter 5: Standard-form Hamiltonian simulation by qubitization. Chapter 6: Uniform spectral amplification.
1.2 Quantum signal processing
In this thesis, we present a broad approach we call quantum signal processing that succeeds
at transferring some of the power of computing with physical systems to the digital realm.
We start in Chapter 2 with its original motivation as a tool that exploits the continuous-time dynamics of the simplest of two-level quantum systems to design a certain useful class
of quantum response functions QSP at the physical level. Following a thorough analysis
of what unitary functions in QSP are possible, and then solving the synthesis problem of
how they may be selected, we slowly introduce successively refined applications to quantum
algorithms. Of particular surprise is that QSP too may be directly implemented on a digital
quantum computer without modification or simulation.
For instance, in Chapter 3, QSP naturally maps to the dynamics of the amplitude
amplification algorithm, which allows us to solve open problems regarding its design, and to
invent useful generalizations. Of greater interest is the application of QSP to the Hamiltonian simulation problem of simulating quantum physics on a digital quantum computer.
As it turns out in Chapter 4, QSP is the last missing ingredient that allows us to obtain
optimal simulation algorithms, which are also of pleasing elegance and simplicity, for sparse
Hamiltonians.
The usefulness of QSP motivates us to extend its applicability. Our solutions to the
problems of amplitude amplification and sparse Hamiltonian simulation were only possible as
those coincidentally had an underlying SU(2) qubit symmetry matching that of QSP, whereas
such structure need not exist in general quantum algorithms. We resolve this in Chapter 5
with a very general 'qubitization' procedure that imposes an SU(2) structure on a wide variety
of ways in which descriptions of Hamiltonians are accessed by a quantum computer. On one
hand, this transfers our optimal simulation algorithms to a broader class of structured matrices beyond the sparse model. On the other hand, qubitization in combination with QSP
furnishes a very natural framework for efficiently and optimally computing on exponentially
large matrices on a quantum computer, of which Hamiltonian simulation is an example. We
further study this possibility in Chapter 6, and obtain simulation algorithms optimal under
even more general circumstances.
The following chapter summaries outline in more detail our development and application
of quantum signal processing. The dependencies of these chapters are shown in Figure 1-1.
1 Somewhat confusingly, the term 'quantum signal processing' is used in the opposite context in [39],
where classical signal processing techniques are inspired by quantum mechanics, but in no way involve any
kind of quantum mechanical system. We believe our use of 'quantum' here is more traditional.
Chapter 2 draws the analogy between quantum control in physical systems and programs
that compute unitary functions on multiple variables. This analogy is formalized as a
general analog model of quantum computation, and applied to the simplest of quantum
physics - a single qubit. Despite this simplicity, a surprisingly structured family of
composite unitaries parameterized by a single input variable $\theta$ emerges through the
technique of composite pulse sequences. There, L fixed single-qubit rotations by angle
$\theta$ are interspersed between carefully designed sequences of arbitrary phase gates. We
call the set of all unitary functions generated by this model of quantum computation
$\mathrm{QSP}_L$.
Though composite pulse sequences are an old technology, most composite gates are designed ad hoc either through geometric intuition or brute-force numerics, which limits
their effectiveness to either very specialized classes of functions, or short L sequences
with simple behavior. Any practical application of quantum signal processing to the
computation of functions must satisfy two criteria. First, the design space of achievable
functions must be known. Second, there must exist efficient algorithms for compiling
desired functions into primitive quantum gates. We fulfill this by rigorously characterizing $\mathrm{QSP}_L$. We prove that the quadratures of any of its composite unitaries are
degree-L trigonometric polynomials in $\theta$, subject to certain necessary and sufficient
constraints. We also provide necessary and sufficient constraints on polynomials describing subsets of these quadratures. Given any such polynomial(s), we provide an
efficient classical algorithm running in $O(\mathrm{poly}(L))$ time for compiling a member of
$\mathrm{QSP}_L$ consistent with these specifications into its implementation with $O(L)$ single-qubit gates.
Chapter 3 generalizes the quantum algorithm of amplitude amplification using $\mathrm{QSP}_L$. In
general, quantum algorithms involve multi-qubit gates distributed over an arbitrary
number of qubits. Thus it is surprising that quantum signal processing, especially that
on a single-qubit, is at all relevant. The natural solution is to find quantum algorithms
with an intrinsic structure isomorphic to single-qubit rotations. Amplitude amplification, which rotates quantum states between two subspaces, is one such algorithm, with
applications in quantum state preparation. In the original algorithm, the amplitude
$\lambda$ of any marked quantum state $|\psi\rangle$ prepared by some oracle may be boosted by a factor L using $O(L)$ queries. More advanced variants such as fixed-point Grover search
construct more desirable behavior through the same geometric intuition or brute-force numerics
found in composite pulse sequences. However, these are all special cases of the generalization, which prepares $|\psi\rangle$ with an amplitude described by arbitrary polynomial
functions of $\lambda$ that may be designed systematically, and compiled efficiently using
exactly the same techniques applied to compile $\mathrm{QSP}_L$.
Chapter 4 returns to the original motivation of quantum computation: the quantum simulation problem of synthesizing an $\epsilon$-approximation to the time-evolution operator
$e^{-i\hat{H}t}$ of any Hamiltonian $\hat{H}$. Unitary time-evolution is a natural consequence of
Schrödinger's equation at the physical level, and this suggests that digital Hamiltonian simulation algorithms that incorporate quantum signal processing could lead to
better performance. This is indeed the case - a simple multi-qubit generalization of
$\mathrm{QSP}_L$ turns out to be exceedingly useful at computing functions on the eigenphases of
arbitrary unitary operators, and, in combination with quantum walks, is the last missing piece of the puzzle for creating optimal simulation algorithms, at least in the query
model for d-sparse Hamiltonians. These Hamiltonians have at most d non-zero elements
in every row, and are particularly relevant to the simulation of physical systems and
the design of quantum algorithms. On one hand, particles in many realistic physical
systems interact locally with their neighbors. On the other hand, quantum walks that
exploit local Hamiltonians are fundamental to many quantum algorithms, through
which lower bounds on the complexity of Hamiltonian simulation may be derived.
Our algorithm has a query complexity that exactly matches the best-known lower bound
of $\Omega\left(td\|\hat{H}\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ in all parameters, is extremely simple, and moreover reduces the qubit overhead to just an additive constant $O(1)$, compared to prior
art with an ancilla cost scaling with some function of $t$, $\log(1/\epsilon)$, and a suboptimal
query complexity, though only by logarithmic factors.
Chapter 5 considers general Hamiltonians beyond the sparse model. We introduce a
standard-form encoding of Hamiltonians $\hat{H}/\alpha = (\langle G| \otimes \hat{I})\hat{U}(|G\rangle \otimes \hat{I})$ that are obtained by projecting, with some normalization constant $\alpha$, unitary signal oracles $\hat{U}$
that describe $\hat{H}$, onto a subspace spanned by some ancilla state $|G\rangle$. Many prior models of Hamiltonian simulation such as sparse Hamiltonians, Hamiltonians described by
a linear combination of unitaries, or Hamiltonians that are density matrices, turn out
to be special cases of this model. We introduce a procedure called qubitization that
imposes a qubit structure on the standard-form. This enables the direct application
of $\mathrm{QSP}_L$ to computing arbitrary matrix polynomial functions of $\hat{H}$ with optimal cost
and great simplicity, of which the unitary time-evolution function is a special case.
This leads to an optimal Hamiltonian simulation algorithm using $O\left(t\alpha + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries,
with the same space and query improvements over prior art as in the sparse case.
Chapter 6 motivates a systematic approach to understanding and exploiting structure,
where Hamiltonians are encoded in the standard-form through the signal oracle $\hat{U}$.
We define a uniform spectral amplification problem on this framework for expanding the spectrum of an encoded Hamiltonian with exponentially small distortion. We
present general solutions to uniform spectral amplification in a hierarchy where factoring $\hat{U}$ into n = 1, 2, 3 unitary oracles represents increasing structural knowledge
of the encoding. Combined with structural knowledge of the Hamiltonian, specializing these results allows us to simulate time-evolution by d-sparse Hamiltonians using $O\left(t(d\|\hat{H}\|_{\max}\|\hat{H}\|_1)^{1/2}\log(t\|\hat{H}\|/\epsilon)\right)$ queries given the norms $\|\hat{H}\| \le \|\hat{H}\|_1 \le d\|\hat{H}\|_{\max}$. Up to logarithmic factors, this is a strict polynomial improvement upon
prior art using $O\left(td\|\hat{H}\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ or $O\left(t^{3/2}(d\|\hat{H}\|_{\max}\|\hat{H}\|_1\|\hat{H}\|/\epsilon)^{1/2}\right)$ queries.
In the process, we also prove a matching lower bound of $\Omega\left(t(d\|\hat{H}\|_{\max}\|\hat{H}\|_1)^{1/2}\right)$
queries, present a distortion-free generalization of spectral gap amplification, and an
amplitude amplification algorithm that performs multiplication on unknown amplitudes.
Chapter 2
Analog computation on a single qubit
2.1 Introduction
In this chapter, we consider a simple model of analog quantum computation inspired by
composite pulse sequences [129, 45], also known as composite quantum gates. Composite
quantum gates are indispensable to many important quantum technologies, such as nuclear
magnetic resonance [77, 45, 67], magnetic resonance imaging, quantum sensing [5, 52, 88, ?],
and implementing robust gates in quantum computation [125, 94, 88, 134, 33, 18, 127, 132,
62, 87, 115, ?]. Their versatility arises from cunningly chosen sequences of L primitive
quantum gates that produce an effective quantum gate $\hat{U}$ with a more desirable dependence
on some parameters of interest $\theta$, such as drive amplitude or background magnetic fields. As
a function of $\theta$, the quantum response function $\hat{U}(\theta)$ can be tailored to amplify weak signals
beyond the statistics of repetition, and suppress noise without measurement. Finding such
useful composite gates is thus the subject of intense research, and the discovery of other
applications would be expedited if a useful characterization of all achievable $\hat{U}(\theta)$ could be
found.
To our knowledge, while composite quantum gates have always been a useful means to
an end, they have never been considered as a model of computation. Part of the difficulty is
that even though one could choose a response function $\hat{U}(\theta)$ that performs some desirable
computation on the input $\theta$, its realization as a composite gate must be found. Only with
rare exceptions [127, 125, 88] and great effort are optimal, arbitrary length examples found
in closed-form. Though celebrated techniques including gradient ascent algorithms [67] and
pseudospectral methods [110, 109] allow us to formulate this as a systematic optimization
problem that can be solved by brute force, this is unfortunately with an exponential worst-case runtime $O(e^L)$ for finding optimally short L approximations. Finding efficient solutions
to various control problems would expand the potential of long composite gates, for which
the most sophisticated quantum response functions can be constructed.
A tantalizing similarity is seen in discrete-time signal processing [101]. Optimal finite-impulse response filters [54] can be designed simply by choosing the lowest degree-L polynomial that is the optimal approximation to a desired frequency response, from which an
optimal and exact implementation is computed - made possible by efficient algorithms
for both steps. It is recognized that composite gates implement a filter on physical parameters [115, 64], and the use of polynomials in quantum response functions is well known [78, 79]. Unfortunately, quantum constraints can render computing these polynomials and their optimal implementation a hard problem. It would be a tremendous advance
if efficient solutions to these problems could be found, and even more so if the countless
results from the exalted history of classical discrete-time signal processing were transferable
to the quantum realm.
In Section 2.2, we define the inputs and outputs of a general model of analog quantum
computation that is compiled by optimal quantum control [20, 69]. This turns out to be
intractable for arbitrary physical systems, and so we consider, in Section 2.3, its simplest non-trivial member: a single-qubit system. Its discrete-time approximation leads to composite
quantum gates and motivates in Section 2.3 the intuitive concept of choosing polynomials to
explicitly define the quantum response function $\hat{U}(\theta)$. This is made rigorous and tractable
in Section 2.4.1 by a simple characterization of the space of achievable $\hat{U}(\theta)$, and by showing in
Section 2.4.3 how an optimal implementation of any such $\hat{U}(\theta)$ can be efficiently computed.
We then show in Section 2.4.4 how an achievable $\hat{U}(\theta)$ can be efficiently computed from
a partial specification with polynomials that describe only the composite gate fidelity or
transition probability response functions. This enables in Section 2.4.5 the efficient design
of achievable $\hat{U}(\theta)$ by inheriting from discrete-time signal processing existing polynomials
and efficient polynomial design algorithms. Together, these provide a methodology outlined
in Section 2.4.6 for the systematic and efficient design of composite quantum gates. Use
of this methodology is demonstrated in Section 2.5 with the creation of optimal bandwidth
compensated gates in Section 2.5.2 that provide an optimal solution in Section 2.5.3 to
the problem of implementing sub-wavelength spatially selective arbitrary quantum gates.
Further directions are discussed in Section 2.6.
2.1.1 Attributions and contributions
The majority of this chapter is taken from the preprint of published joint work [89] with
Theodore J. Yoder and Isaac L. Chuang. Theodore J. Yoder provided the proof of Lem. 2.13,
and the rest of the manuscript was written by myself with helpful discussions and suggestions
from collaborators. Note that Section 2.2 is new and Section 2.3 has been significantly
rewritten. G.H.Low acknowledges funding by NSF RQCC Project No.1111337 and NRO.
We thank Alan Oppenheim and Tom Baran for inspiring discussions, and connections made
possible by their 6.341x open online MITx course. We thank Yuan Su for useful comments
on the paper.
2.2 A model of analog quantum computation
Every model of computation must have well-defined inputs, outputs, and a procedure that
maps inputs to outputs. Our procedure is the time-dependent Schrödinger equation that
underlies continuous-time quantum dynamics,
$i\hbar\,\partial_t \hat{U}(t) = \hat{H}(t)\hat{U}(t), \quad \hat{U}(0) = \hat{I}.$ (2.1)
In the following, we use units where $\hbar = 1$, and the Hamiltonian $\hat{H}(t) \in \mathbb{C}^{N \times N}$ is a dimension N
Hermitian time-dependent matrix. This outputs a unitary time-evolution operator with the
formal solution
$\hat{U}(t) = \mathcal{T}\exp\left[-i\int_0^t \hat{H}(t')\,dt'\right],$ (2.2)
where $\mathcal{T}$ is the time-ordering operator.
The inputs to our computation are then d unknown real coefficients $\vec{\delta} \in \mathbb{R}^d$ that are part
of the Hamiltonian
$\hat{H}(t; \vec{\delta}) = \hat{H}_0(t) + \sum_{j=1}^{d} \delta_j \hat{H}_j(t),$ (2.3)
for some time-dependent Hamiltonians $\hat{H}_j(t)$. It is useful to bound the spectral norms of
these components $\|\hat{H}_j(t)\| = O(1)$ to avoid arbitrarily fast time-evolution. In any realistic
setting, $\hat{H}_j(t)$ will be highly structured. For instance, it may have a maximum slew-rate
$\max_t \|\frac{d}{dt}\hat{H}_j(t)\|$, represent local interactions, or even be constant. In some cases, the experimentalist may have control over some components $\hat{H}_j(t)$ and be able to modify them, whereas
others may be uncontrollable. These constraints all limit the space of possible output unitary functions $\hat{U}(t; \vec{\delta})$ of the input variables $\vec{\delta}$. However, for simplicity, we assume that all
$\hat{H}_j(t)$ are directly under our control.
More formally, these ingredients define a model of analog quantum computation that
illustrates key features.
Definition 2.1 (Analog quantum computation for unitary function problems).
Input: An integer number d > 0 of real parameters $\vec{\delta} \in \mathbb{R}^d$.
Program: Time-evolution described by the Schrödinger equation with a time-dependent
Hamiltonian $\hat{H}(t; \vec{\delta}) = \hat{H}_0(t) + \sum_{j=1}^{d} \delta_j \hat{H}_j(t)$ of dimension N comprised of an integer number d + 1 of time-dependent Hamiltonians $\hat{H}_j(t)$, with bounded spectral norm
$\|\hat{H}_j\| \le 1$, and defined over $t \in [0, T]$.
Cost: The time complexity is T > 0, and the space complexity is N > 0.
Output: A unitary function $\hat{U}(\vec{\delta}) = \mathcal{T}\exp\left[-i\int_0^T \hat{H}(t'; \vec{\delta})\,dt'\right]$ of $\vec{\delta}$.
Though unconventional, the model in Def. 2.1 may be massaged to resemble that of
universal quantum computation BQP with a few modifications. It simulates the standard
gate model as follows. (1) A sequence of arbitrary discrete single- and two-qubit gates may be implemented
by a piecewise-constant choice of $\hat{H}_0(t)$. (2) We may treat the $\delta_j$ as Boolean rather
than real variables, hence choosing $\hat{H}_j(t)$ to also be piecewise-constant allows us to generate
quantum circuits dependent on the problem input. (3) We may recover the decision version
of BQP by performing a measurement on one qubit of the output of $\hat{U}(\vec{\delta})|\psi\rangle$, where $|\psi\rangle$ is
some initial state in the computational basis.
However, Def. 2.1 motivates a different interpretation of how physical systems compute.
Rather than performing measurements, we focus on the $N^2$ components of $\hat{U}(\vec{\delta})$, which each
compute some complex-valued function $U_{jk}(\vec{\delta})$. By choosing an appropriate input state $|\psi\rangle$
and measurement basis $|\chi\rangle$, our interest lies in the amplitude function $f : \mathbb{R}^d \to \mathbb{C}$, where
$f(\vec{\delta}) = \langle\chi|\hat{U}(\vec{\delta})|\psi\rangle.$ (2.4)
Given a target function $f'(\vec{\delta})$ that is to be uniformly approximated with error $\epsilon$ on all
input values, this becomes an instance in quantum control of finding the optimal control
policy for every $\hat{H}_j(t)$ that solves the min-max optimization problem
$\min_{\langle\chi|, \hat{U}, |\psi\rangle}\ \max_{\vec{\delta} \in [-1,1]^d} |f(\vec{\delta}) - f'(\vec{\delta})| \le \epsilon.$ (2.5)
Though one may attack Eq. 2.5 using general techniques such as gradient ascent algorithms [67], it is essentially intractable as the optimization is over a continuous space of
functions with no obvious structure. However, if the solution is found, it provides a fully
quantum and extremely fast analog computation of the target function. As is, this model is
too general to be useful, which suggests searching for a simpler starting point.
2.3 Analog quantum computation on a single-qubit
Let us now consider the simplest non-trivial model: a generic single-qubit system resonantly
driven by a constant Rabi frequency $\delta$ and time-dependent phase $\phi(t)$. The Hamiltonian of
this system is
$\hat{H}(t) = \frac{\delta}{2}\hat{\sigma}_{\phi(t)},$ (2.6)
where $\hat{\sigma}_\phi = \cos(\phi)\hat{\sigma}_x + \sin(\phi)\hat{\sigma}_y$ and $\hat{\sigma}_{x,y,z}$ are Pauli matrices. Whereas $\phi(t)$ is assumed
to be completely under our control, the input $\delta$ is a constant that is not known beforehand.
We may define for this system a reasonable-looking model of analog quantum computation.
Definition 2.2 (Analog quantum computation with a single qubit).
Input: A real parameter $\delta \in \mathbb{R}$.
Program: Time-evolution described by the Schrödinger equation with a time-dependent
Hamiltonian $\hat{H}(t; \delta) = \frac{\delta}{2}\hat{\sigma}_{\phi(t)}$ of dimension 2 described by a time-dependent real function $\phi(t)$ defined over $t \in [0, T]$.
Cost: The time complexity is T > 0, and the space complexity is 2.
Output: A unitary function $\hat{U}(\delta) = \mathcal{T}\exp\left[-i\int_0^T \hat{H}(t'; \delta)\,dt'\right]$ of $\delta$.
This is not the most general single-qubit Hamiltonian, but specialization ultimately
enables the tractable design of $\hat{U}(\delta)$. We simplify this further by taking $\phi(t) = \phi$ to be
piecewise-constant over time segments of duration $\tau$. This generates the primitive single-qubit rotation
$\hat{R}_\phi(\theta) = e^{-i\frac{\theta}{2}\hat{\sigma}_\phi}, \quad \theta = \delta\tau,$ (2.7)
which is periodic in $\theta$ with period $4\pi$. By partitioning the total time of evolution T into
$L = T/\tau$ discrete segments, the continuous-time evolution of Def. 2.2 is discretized into a
product of L rotations, each with the same rotation amplitude $\theta$, but with varying phases
$\vec{\phi} = (\phi_1, \ldots, \phi_L)$. Though now operating with discrete-time quantum gates, this still leads
to a model of quantum computation that is natural to the underlying physics.
Definition 2.3 (Discrete-time analog quantum computation with a single qubit).
Input: A real parameter $\theta \in \mathbb{R}$.
Program: A sequence of integer L unitary operators
$\hat{U}(\theta) = \hat{R}_{\phi_L}(\theta)\hat{R}_{\phi_{L-1}}(\theta)\cdots\hat{R}_{\phi_1}(\theta),$ (2.8)
of dimension 2 described by a vector of phases $\vec{\phi} \in \mathbb{R}^L$.
Cost: The gate complexity is L > 0, and the space complexity is 2.
Output: A unitary function $\hat{U}(\theta)$ of $\theta$.
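The program of Def. 2.3 can be sketched numerically. The following is a minimal illustration using only the Python standard library, assuming the primitive rotation is exp(-i (theta/2)(cos(phi) X + sin(phi) Y)), which equals cos(theta/2) I - i sin(theta/2)(cos(phi) X + sin(phi) Y). The helper names `rotation`, `matmul`, `composite` and the example phases are hypothetical.

```python
import cmath
import math


def rotation(phi, theta):
    """Primitive rotation exp(-i*theta/2*(cos(phi)*X + sin(phi)*Y)) as a 2x2 tuple."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    # cos(phi)*X + sin(phi)*Y has off-diagonal entries exp(-i*phi) and exp(+i*phi).
    return ((complex(c), -1j * s * cmath.exp(-1j * phi)),
            (-1j * s * cmath.exp(1j * phi), complex(c)))


def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def composite(phases, theta):
    """Composite gate R_{phi_L}(theta) ... R_{phi_1}(theta) for phases (phi_1, ..., phi_L)."""
    u = ((1 + 0j, 0j), (0j, 1 + 0j))
    for phi in phases:
        u = matmul(rotation(phi, theta), u)  # later phases multiply from the left
    return u


example_phases = [0.3, 1.1, -0.7]  # hypothetical phases, L = 3
u = composite(example_phases, 0.9)
```

Evaluating `composite` on a grid of theta values for fixed phases traces out one response function; the gate complexity is the length of the phase list.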
Such products of single-qubit rotations are also widely known as composite pulse sequences [129, 45], or composite quantum gates. Despite the simplicity of the formulation,
there is still enough hidden complexity within Eq. 2.8 to obtain highly non-trivial response
functions $\hat{U}(\theta)$. We formalize this with a complexity class.
Definition 2.4 (Quantum signal processing with a single qubit). Let the complexity class
$\mathrm{QSP}_L$ be the set of all unitary functions $\hat{U}(\theta)$ that are the output of any L-gate program of Def. 2.3.
Then QSP is the union of $\mathrm{QSP}_L$ for all integer L > 0.
Unfortunately, a systematic understanding of what $\mathrm{QSP}_L$ contains appears lacking, and
the design of $\hat{U}(\theta)$ often proceeds through a brute-force optimization over the phases $\vec{\phi}$. However, one noteworthy step in this direction is the Shinnar-LeRoux algorithm [113, 104] and its
refinements [60, 76, 49], which have so far been restricted to the field of magnetic resonance
imaging. There, $\theta$ represents the amplitude of background magnetic fields and manifests as
an off-resonant rotation. Given otherwise perfect and arbitrary single-spin control, this approach enables the efficient design of $\hat{U}(\theta)$ by a connection to finite-impulse response filters.
Unfortunately, extending the concept to situations with different controls and additional
restrictions, such as this case of on-resonant compensating pulse sequences, appears to have
been difficult.
2.3.1 Representation of QSP
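However, there are hints that members of $\mathrm{QSP}_L$ are actually highly structured. By expanding $\hat{U}(\theta)$, the quantum response function has the form
$\hat{U}(\theta) = \sum_{j=0}^{L} (-i)^j \sin^j(\theta/2)\cos^{L-j}(\theta/2)\,\hat{\Phi}_{L,j},$ (2.9)
where $\hat{\Phi}_{L,j} = \hat{\sigma}_x^j(\mathrm{Re}[\Phi_{L,j}]\hat{I} + i\,\mathrm{Im}[\Phi_{L,j}]\hat{\sigma}_z)$, and the phase sums $\Phi_{L,j}$ are defined through
the recurrence [87]
$\Phi_{k,j} = \Phi_{k-1,j} + \Phi_{k-1,j-1}\,e^{i(-1)^{j+1}\phi_k}, \quad \Phi_{0,0} = 1, \quad \Phi_{0,j\neq 0} = 0,$ (2.10)
performed over j = 0, 1, ... , k, then k = 1, 2, ... , L.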
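Now, observe that $\hat{U}(\theta)$ is a polynomial of degree L in $x = \cos(\theta/2)$ and $y = \sin(\theta/2)$
with a particularly elegant representation. Using the trigonometric relation $x^2 + y^2 = 1$,
$\hat{U}(\theta)$ has the form
$\hat{U}(\theta) = \begin{cases} A(x)\hat{I} + iB(x)\hat{\sigma}_z + iC(y)\hat{\sigma}_x + iD(y)\hat{\sigma}_y, & L \text{ odd}, \\ A(x)\hat{I} + iB(x)\hat{\sigma}_z + ixC(y)\hat{\sigma}_x + ixD(y)\hat{\sigma}_y, & L \text{ even}, \end{cases}$ (2.11)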
where A(x), B(x), C(y), D(y) are polynomials of degree at most L with coefficients ak, bk, ck, dk
(k = 0,1, ... , L) respectively. In the following, A, B, C, D without arguments are understood
to be functions of the x, y seen in Eq. 2.11. As the tuple (A, B, C, D) is an equivalent representation of $\hat{U}(\theta)$, we refer to both interchangeably. In particular, achievable tuples are
those that can be realized by some composite gate of Eq. 2.9:
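Definition 2.5 (Achievable polynomial tuples). A tuple of polynomials (A, B, C, D) is
achievable if $\exists L, \vec{\phi} \in \mathbb{R}^L$ s.t. $\hat{U}(\theta) = \hat{R}_{\phi_L}(\theta)\hat{R}_{\phi_{L-1}}(\theta)\cdots\hat{R}_{\phi_1}(\theta)$ has the form of Eq. 2.11.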
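The tuple representation can be read off numerically from a composite gate. The sketch below assumes the decomposition U = a I + i b Z + i c X + i d Y with real a, b, c, d, so that U[0][0] = a + i b and U[0][1] = d + i c; for even L the returned c and d correspond to x C(y) and x D(y). All function names and phases are hypothetical.

```python
import cmath
import math


def rotation(phi, theta):
    """exp(-i*theta/2*(cos(phi)*X + sin(phi)*Y)) as a 2x2 nested tuple."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((complex(c), -1j * s * cmath.exp(-1j * phi)),
            (-1j * s * cmath.exp(1j * phi), complex(c)))


def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def composite(phases, theta):
    """R_{phi_L}(theta) ... R_{phi_1}(theta) for phases (phi_1, ..., phi_L)."""
    u = ((1 + 0j, 0j), (0j, 1 + 0j))
    for phi in phases:
        u = matmul(rotation(phi, theta), u)
    return u


def quadratures(phases, theta):
    """Real coefficients (a, b, c, d) with U = a*I + i*b*Z + i*c*X + i*d*Y."""
    u = composite(phases, theta)
    return u[0][0].real, u[0][0].imag, u[0][1].imag, u[0][1].real


phases = [0.3, 1.1, -0.7]  # hypothetical phases
a, b, c, d = quadratures(phases, 0.9)
norm = a * a + b * b + c * c + d * d  # equals 1 up to rounding, by unitarity
```

Sampling `quadratures` over theta gives the values of the four polynomials on the unit circle x^2 + y^2 = 1, which is often all an objective function needs.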
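We are often interested in only a few components of (A, B, C, D). For example, the partial tuple (A, ·, C, ·) fully defines the gate fidelity response function $F_\chi(\theta) = \left|\frac{1}{2}\mathrm{Tr}[\hat{R}_0^\dagger(\chi)\hat{U}(\theta)]\right|^2$
with respect to some target gate $\hat{R}_0(\chi)$,
$F_\chi(\theta) = \begin{cases} \left|\cos\left(\frac{\chi}{2}\right)A - \sin\left(\frac{\chi}{2}\right)C\right|^2, & L \text{ odd}, \\ \left|\cos\left(\frac{\chi}{2}\right)A - x\sin\left(\frac{\chi}{2}\right)C\right|^2, & L \text{ even}. \end{cases}$ (2.12)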
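Similarly, (A, B, ·, ·) or (·, ·, C, D) fully defines the transition probability response function
$p(\theta) = |\langle 0|\hat{U}(\theta)|1\rangle|^2$,
$p(\theta) = 1 - A^2 - B^2 = (C^2 + D^2) \times \begin{cases} 1, & L \text{ odd}, \\ x^2, & L \text{ even}. \end{cases}$ (2.13)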
We refer to a tuple with n empty slots as an n-partial tuple. An n-partial tuple is achievable
if it is consistent with some achievable tuple.
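The transition probability of Eq. 2.13 can be checked with a short numerical sketch. It assumes the same rotation convention as Def. 2.3 and takes p(theta) to be the squared modulus of the off-diagonal entry of the composite gate; the function names and phases are hypothetical.

```python
import cmath
import math


def rotation(phi, theta):
    """exp(-i*theta/2*(cos(phi)*X + sin(phi)*Y)) as a 2x2 nested tuple."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((complex(c), -1j * s * cmath.exp(-1j * phi)),
            (-1j * s * cmath.exp(1j * phi), complex(c)))


def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def composite(phases, theta):
    """R_{phi_L}(theta) ... R_{phi_1}(theta) for phases (phi_1, ..., phi_L)."""
    u = ((1 + 0j, 0j), (0j, 1 + 0j))
    for phi in phases:
        u = matmul(rotation(phi, theta), u)
    return u


def transition_probability(phases, theta):
    """p(theta) = |<0|U(theta)|1>|^2, the squared modulus of the (0, 1) entry."""
    return abs(composite(phases, theta)[0][1]) ** 2


phases = [0.3, 1.1, -0.7]  # hypothetical phases
theta = 0.9
p = transition_probability(phases, theta)
u00 = composite(phases, theta)[0][0]  # equals A + iB
p_from_ab = 1 - u00.real ** 2 - u00.imag ** 2
```

For a single rotation (L = 1) this reduces to sin^2(theta/2) regardless of the phase, which is a convenient sanity check.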
A brute-force approach to composite gate design is minimizing an objective function
for $\hat{U}(\theta)$ over a space $L \in \mathbb{N}$, $\vec{\phi} \in \mathbb{R}^L$. Though useful examples have been discovered in
this manner, such an approach is highly unappealing. In addition to being inefficient with
a runtime $O(e^L)$, there is no guarantee that a globally optimal solution will be found.
Furthermore, the procedure provides little of the necessary insight into possible $\hat{U}(\theta)$ for
envisioning further novel applications.
2.4 Systematic and efficient design of optimal composite gates
The functional form of $\hat{U}(\theta)$ hints at a powerful methodology for composite gate design via
choices of the polynomials (A, B, C, D) of degree L. This ambition must solve long-standing
problems:
(P1) An insightful characterization of achievable (A, B, C, D) to eliminate the traditional
guesswork in envisioning novel quantum response functions and their dependence on $\vec{\phi}$.
(P2) An efficient algorithm to compute the optimal $\vec{\phi}$ implementing an achievable (A, B, C, D),
in contrast to the intractable random search in time $O(e^L)$ of current state-of-art [88].
(P3) An efficient algorithm to compute an achievable (A, B, C, D) from achievable partial
tuples e.g. (A, ·, C, ·), as might be encountered with common objective functions for
Eq. 2.12, 2.13.
(P4) An efficient algorithm for computing achievable partial tuples optimal for some objective function.
Our main technical advances are precisely the resolution of problems (P1-P4). We describe
in a simple and intuitive manner the set of achievable (A, B, C, D), and provide efficient
algorithms for solving what has traditionally been the hardest aspects of composite gate
design. In particular, a beautiful connection is made with the historic field of discrete-time
signal processing that allows us to inherit much of its prior work in polynomial design.
In this manner, the inspired art of composite gates is transformed into a systematic science.
Optimal composite gates are simply polynomials optimal for the objective function, and
these polynomials can be found efficiently.
2.4.1 Polynomial characterization of quantum response functions
We characterize here achievable choices of quantum response functions (A, B, C, D) in a
manner independent of $\vec{\phi}$, hence resolving problem (P1). By providing insight into the forms
of possible $\hat{U}(\theta)$, we also obtain a quantitative explanation for the remarkable versatility of
composite gates. Achievability constraints on the polynomials (A, B, C, D) are as follows:
Theorem 2.6 (Achievable tuples). A tuple of polynomials (A, B, C, D) of degree at most L
is achievable iff all the following are true:
(1) A, B, C, D are real.
(2) A(1) = 1 or B(1) = 0.
(3) A, B, C, D are odd for L odd; A, B are even and C, D are odd for L even.
(4) $1 = \begin{cases} A(x)^2 + B(x)^2 + C(y)^2 + D(y)^2, & L \text{ odd}, \\ A(x)^2 + B(x)^2 + x^2 C(y)^2 + x^2 D(y)^2, & L \text{ even}. \end{cases}$
Proof. In the forward direction, (1) and (3) are true by applying the trigonometric substitution
$x^2 + y^2 = 1$ in Eq. 2.9 and collecting coefficients of $\hat{I}, \hat{\sigma}_x, \hat{\sigma}_y, \hat{\sigma}_z$. (2) is true as $\hat{U}(0) = \hat{I}$ in Eq. 2.9.
(4) is true as $\hat{U}$ is unitary so $\hat{U}^\dagger\hat{U} = \hat{I}$, and $\frac{1}{2}\mathrm{Tr}[\hat{U}^\dagger\hat{U}]$ evaluated via Eq. 2.11 produces
$1 = \begin{cases} A(x)^2 + B(x)^2 + C(y)^2 + D(y)^2, & L \text{ odd}, \\ A(x)^2 + B(x)^2 + x^2 C(y)^2 + x^2 D(y)^2, & L \text{ even}. \end{cases}$ (2.14)
In the reverse direction, we need to show that any (A, B, C, D) satisfying (1-4) is achievable
in the sense of Def. 2.5. We leave these steps to Lem. 2.12. $\square$
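The parity condition (3) can be spot-checked numerically. The sketch below relies on the observation that the map theta -> 2*pi - theta sends x = cos(theta/2) to -x while leaving y = sin(theta/2) unchanged, so for odd L the I and Z coefficients should flip sign while the X and Y coefficients stay fixed, and the reverse for even L. The rotation convention follows Def. 2.3; all function names and phases are hypothetical.

```python
import cmath
import math


def rotation(phi, theta):
    """exp(-i*theta/2*(cos(phi)*X + sin(phi)*Y)) as a 2x2 nested tuple."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((complex(c), -1j * s * cmath.exp(-1j * phi)),
            (-1j * s * cmath.exp(1j * phi), complex(c)))


def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def quadratures(phases, theta):
    """Real (a, b, c, d) with U = a*I + i*b*Z + i*c*X + i*d*Y for the composite gate."""
    u = ((1 + 0j, 0j), (0j, 1 + 0j))
    for phi in phases:
        u = matmul(rotation(phi, theta), u)
    return u[0][0].real, u[0][0].imag, u[0][1].imag, u[0][1].real


def parity_holds(phases, theta, tol=1e-9):
    """Check the sign changes of the quadratures under x -> -x at fixed y."""
    a, b, c, d = quadratures(phases, theta)
    a2, b2, c2, d2 = quadratures(phases, 2 * math.pi - theta)
    s = -1 if len(phases) % 2 else 1  # sign picked up by the I and Z coefficients
    return (abs(a2 - s * a) < tol and abs(b2 - s * b) < tol
            and abs(c2 + s * c) < tol and abs(d2 + s * d) < tol)


odd_ok = parity_holds([0.3, 1.1, -0.7], 0.9)  # hypothetical phases, L = 3
even_ok = parity_holds([0.3, 1.1], 0.9)       # hypothetical phases, L = 2
```

Such a check only samples the conditions at chosen points; it does not replace the proof, but it is a quick way to catch a sign or ordering mistake in an implementation.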
Conditions (1-4) for achievable $(A, B, C, D)$ appear fairly general, which allows for great
flexibility in choosing arbitrary response functions. They are also understandable and intuitive. A characterization of achievable partial tuples is also useful. Not all quadratures of
$\hat{U}(\theta)$ might be relevant to an objective function, and optimizing over a subset of $(A, B, C, D)$
could be easier. In the following, we examine how the unitarity constraint of condition (4)
is weakened for all possible 2-partial tuples.
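Theorem 2.7 (Achievable 2-partial tuples). Assuming $A, B, C, D$ satisfy conditions (1-3)
of Thm. 2.6,
(1) $(A, \cdot, C, \cdot)$, $(A, \cdot, \cdot, C)$ is achievable iff
(1a) $\forall \theta \in \mathbb{R}$: $A(x)^2 + C(y)^2 \le 1$ for $L$ odd; $A(x)^2 + x^2 C(y)^2 \le 1$ for $L$ even.
(2) $(\cdot, B, C, \cdot)$, $(\cdot, B, \cdot, C)$ is achievable iff
(2a) $\forall \theta \in \mathbb{R}$: $B(x)^2 + C(y)^2 \le 1$ for $L$ odd; $B(x)^2 + x^2 C(y)^2 \le 1$ for $L$ even.
(3) $(A, B, \cdot, \cdot)$ is achievable iff
(3a) $\forall \theta \in \mathbb{R}$, $A(x)^2 + B(x)^2 \le 1$, and
(3b) $\forall x \ge 1$, $A(x)^2 + B(x)^2 \ge 1$, and
(3c) $\forall L$ even, $\forall x \ge 0$, $A(ix)^2 + B(ix)^2 \ge 1$.
(4) $(\cdot, \cdot, C, D)$ is achievable iff
(4a) $\forall \theta \in \mathbb{R}$: $C(y)^2 + D(y)^2 \le 1$ for $L$ odd; $x^2 C(y)^2 + x^2 D(y)^2 \le 1$ for $L$ even, and
(4b) $\forall L$ odd, $\forall y \ge 1$, $C(y)^2 + D(y)^2 \ge 1$.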
Proof. In the forward direction, all the (a) conditions are true from Eq. 2.14 using the
fact that $A, B, C, D$ are all real, hence their squares are non-negative. (3b) is true by considering
Eq. 2.14 with the substitution $x = \sqrt{\lambda}$, $y = \sqrt{1 - \lambda}$, and computing $1 - A^2(\sqrt{\lambda}) - B^2(\sqrt{\lambda})$.
Note that the $x, y$ here are complex. Using the odd/even symmetry of $C, D$, the RHS
factorizes into a positive term times $(1 - \lambda)$ or $\lambda(1 - \lambda)$.
This is negative $\forall \lambda > 1$ so
$A^2(\sqrt{\lambda}) + B^2(\sqrt{\lambda}) \ge 1$. (3c) is similarly proven by considering $\lambda < 0$. The RHS factorizes
into $\lambda(1 - \lambda)$ and a positive term. (4b) is proven with the substitution $x = \sqrt{1 - \lambda}$, $y = \sqrt{\lambda}$
and by considering $\lambda > 1$. In the reverse direction, we need to show that assuming these
conditions enables the computation of an achievable $(A, B, C, D)$. We leave these steps to
Lems. 2.13, 2.14. □
Note that C, D are interchangeable in Thm. 2.7 as their constraints in Thm. 2.6 are
identical. We also characterize all possible 3-partial tuples.
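Theorem 2.8 (Achievable 3-partial tuples). Assuming $A, B, C, D$ satisfy conditions (1-3)
of Thm. 2.6, the following are achievable under their respective conditions
(1) $(A, \cdot, \cdot, \cdot)$ iff $\forall \theta \in \mathbb{R}$, $A(x)^2 \le 1$.
(2) $(\cdot, B, \cdot, \cdot)$ iff $\forall \theta \in \mathbb{R}$, $B(x)^2 \le 1$.
(3) $(\cdot, \cdot, C, \cdot)$ iff $\forall \theta \in \mathbb{R}$: $C^2(y) \le 1$ for $L$ odd; $x^2 C^2(y) \le 1$ for $L$ even.
(4) $(\cdot, \cdot, \cdot, D)$ iff $\forall \theta \in \mathbb{R}$: $D^2(y) \le 1$ for $L$ odd; $x^2 D^2(y) \le 1$ for $L$ even.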
Proof. The forward direction follows by definition and from Eq. 2.14 where $A, B, C, D$ are
all real $\forall \theta \in \mathbb{R}$, hence their squares are non-negative. The reverse direction is true from setting
the unspecified polynomial to 0 in one of the 2-partial tuples (1), (2) in Thm. 2.7. □
These simple characterizations show how one can in principle encode almost any arbitrary
desired function into quadratures of $\hat{U}(\theta)$. Consider $(A, \cdot, \cdot, \cdot)$, which aside from symmetry
and $A(1) = 1$, only needs to satisfy $\forall |x| \le 1$, $A^2(x) \le 1$. The famous Stone-Weierstrass
theorem [119] assures us that $A(x)$ of sufficiently large degree $L$ can approximate arbitrarily
well any continuous real function that satisfies these constraints on the interval
$|x| \le 1$. This ability to create almost arbitrary quantum response functions helps explain
the applicability of composite gates to many diverse problems.
2.4.2
Fourier characterization of quantum response functions
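Whereas the achievable quantum response functions in Thms. 2.6, 2.7, and 2.8 are characterized in terms of polynomial functions of $x = \cos(\theta/2)$ and $y = \sin(\theta/2)$, they may also
be expressed as Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$, which we denote with scripted letters. In some
cases, this Fourier representation is simpler than that as polynomials. We rewrite Eq. 2.11
as
$$\hat{U}(\theta) = \begin{cases} A(x)\hat{1} + iB(x)\hat{\sigma}_z + iC(y)\hat{\sigma}_x + iD(y)\hat{\sigma}_y, & L \text{ odd}, \\ A(x)\hat{1} + iB(x)\hat{\sigma}_z + ixC(y)\hat{\sigma}_x + ixD(y)\hat{\sigma}_y, & L \text{ even}, \end{cases}$$
$$\phantom{\hat{U}(\theta)} = \mathcal{A}(\theta)\hat{1} + i\mathcal{B}(\theta)\hat{\sigma}_z + i\mathcal{C}(\theta)\hat{\sigma}_x + i\mathcal{D}(\theta)\hat{\sigma}_y. \tag{2.15}$$
Eq. 2.15 defines the relationship between the polynomials $(A, B, C, D)$ and the Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$. This relationship may be worked out by a straightforward application of
Chebyshev polynomials $T_n(\cos\theta) = \cos(n\theta)$ and $U_n(\cos\theta) = \sin((n+1)\theta)/\sin\theta$ of the first and
second kind respectively.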
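In the case of $L$ even,
$$\mathcal{A}(\theta) = \sum_{j=0}^{L/2} a'_j \cos(j\theta) = \sum_{j=0}^{L/2} a'_j T_{2j}(\cos(\theta/2)) = \sum_{j=0}^{L/2} a_{2j} \cos^{2j}(\theta/2) = \sum_{j=0}^{L/2} a_{2j} x^{2j},$$
$$\mathcal{C}(\theta) = \sum_{j=1}^{L/2} c'_j \sin(j\theta) = \sum_{j=1}^{L/2} c'_j \sin(\theta/2)\, U_{2j-1}(\cos(\theta/2)) = x \sum_{j=1}^{L/2} c_{2j-1} y^{2j-1}, \tag{2.16}$$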
where we have made use of the identity $x^2 + y^2 = 1$ and the fact that $T_n, U_n$ are even or odd
polynomials depending on $n$. We use the primed variables $(a'_j, b'_j, c'_j, d'_j)$ to indicate Fourier
coefficients. Note that $\mathcal{A}(\theta)$ is an even Fourier series in $\theta$ with periodicity $\theta \in [0, 2\pi)$ and
degree $L/2$, and $\mathcal{C}(\theta)$ is an odd Fourier series in $\theta$ with periodicity $\theta \in [0, 2\pi)$ and degree $L/2$.
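The two Chebyshev identities behind the $L$ even map, $\cos(j\theta) = T_{2j}(\cos(\theta/2))$ and $\sin(j\theta) = \sin(\theta/2)\,U_{2j-1}(\cos(\theta/2))$, can be verified numerically. The sketch below uses the standard three-term recurrences; the helper names are illustrative and not taken from the text.

```python
import math

def cheb_T(n, x):
    """Chebyshev polynomial of the first kind via T_{k+1} = 2x T_k - T_{k-1}."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return t_curr

def cheb_U(n, x):
    """Chebyshev polynomial of the second kind, with U_0 = 1 and U_1 = 2x."""
    u_prev, u_curr = 1.0, 2 * x
    if n == 0:
        return u_prev
    for _ in range(n - 1):
        u_prev, u_curr = u_curr, 2 * x * u_curr - u_prev
    return u_curr

def max_identity_error(j_max=4, samples=50):
    """Largest deviation from the two identities over a grid of theta values."""
    worst = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        x, y = math.cos(theta / 2), math.sin(theta / 2)
        for j in range(1, j_max + 1):
            worst = max(worst, abs(math.cos(j * theta) - cheb_T(2 * j, x)))
            worst = max(worst, abs(math.sin(j * theta) - y * cheb_U(2 * j - 1, x)))
    return worst
```

Since $U_{2j-1}$ is an odd polynomial, $y\,U_{2j-1}(x)$ is $x$ times an odd polynomial in $y$ once $x^2 = 1 - y^2$ is substituted, which is the origin of the overall factor of $x$ in $\mathcal{C}(\theta)$.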
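In the case of $L$ odd,
$$\mathcal{A}(\theta) = \sum_{j=0}^{(L-1)/2} a'_j \cos\!\left(\frac{(2j+1)\theta}{2}\right) = \sum_{j=0}^{(L-1)/2} a'_j T_{2j+1}(\cos(\theta/2)) = \sum_{j=0}^{(L-1)/2} a_{2j+1} x^{2j+1},$$
$$\mathcal{C}(\theta) = \sum_{j=0}^{(L-1)/2} c'_j \sin\!\left(\frac{(2j+1)\theta}{2}\right) = \sum_{j=0}^{(L-1)/2} c'_j \sin(\theta/2)\, U_{2j}(\cos(\theta/2)) = \sum_{j=0}^{(L-1)/2} c_{2j+1} y^{2j+1}. \tag{2.17}$$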
Note that $\mathcal{A}(\theta)$ is an even Fourier series in $\theta/2$ with periodicity $\theta \in [0, 4\pi)$, degree $(L - 1)/2$,
and is odd about $\theta = \pi$, that is, $\mathcal{A}(\pi + \theta) = -\mathcal{A}(\pi - \theta)$. $\mathcal{C}(\theta)$ is an odd Fourier series
in $\theta/2$ with periodicity $\theta \in [0, 4\pi)$, degree $(L - 1)/2$, and is even about $\theta = \pi$, that is,
$\mathcal{C}(\pi + \theta) = \mathcal{C}(\pi - \theta)$. The derivations for $\mathcal{B}$ and $\mathcal{D}$ are identical.
One may similarly derive a map from polynomials to Fourier series by using De Moivre's
formula and a binomial expansion. For instance, $\cos^3(\theta/2) = \left(\frac{e^{i\theta/2} + e^{-i\theta/2}}{2}\right)^3$. This is a
straightforward, though tedious, calculation and so we omit it. The net effect is that any
set of real coefficients $a'_j, b'_j, c'_j, d'_j$ may be solved for the same number of real $a_j, b_j, c_j, d_j$,
and vice-versa. This allows us to prove results equivalent to Thms. 2.6, 2.7, and 2.8.
Corollary 2.9 (Achievable Fourier tuples). A tuple of Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ of finite
degree is achievable iff all the following are true:
(1) $\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D}$ are real.
(2) $\mathcal{A}(0) = 1$ or $\mathcal{B}(0) = 0$.
(3) $\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D}$ are of the form described by Eqs. 2.16 and 2.17.
(4) $1 = \mathcal{A}(\theta)^2 + \mathcal{B}(\theta)^2 + \mathcal{C}(\theta)^2 + \mathcal{D}(\theta)^2$.
Proof. We map the conditions of Thm. 2.6 to the ones here. In both directions, (1-3) are
true by the above discussion for the map between Fourier series and polynomials. Condition
(4) is true by the definition in Eq. 2.15. □
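Corollary 2.10 (Achievable 2-partial Fourier tuples). Assuming $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ satisfy conditions (1-3) of Cor. 2.9,
(1) $(\mathcal{X}, \cdot, \mathcal{Y}, \cdot)$, $(\mathcal{X}, \cdot, \cdot, \mathcal{Y})$, $(\cdot, \mathcal{X}, \mathcal{Y}, \cdot)$, $(\cdot, \mathcal{X}, \cdot, \mathcal{Y})$ are achievable iff
(1a) $\forall \theta \in \mathbb{R}$, $\mathcal{X}(\theta)^2 + \mathcal{Y}(\theta)^2 \le 1$.
(2) $(\mathcal{A}, \mathcal{B}, \cdot, \cdot)$ is achievable iff
(2a) $\forall \theta \in \mathbb{R}$, $\mathcal{A}(\theta)^2 + \mathcal{B}(\theta)^2 \le 1$, and
(2b) $\forall L$ even, $\forall |x| \ge 1$, $\left(\sum_{j=0}^{L/2} a'_j T_j(x)\right)^2 + \left(\sum_{j=0}^{L/2} b'_j T_j(x)\right)^2 \ge 1$,
(2c) $\forall L$ odd, $\forall x \ge 1$, $\left(\sum_{j=0}^{(L-1)/2} a'_j T_{2j+1}(x)\right)^2 + \left(\sum_{j=0}^{(L-1)/2} b'_j T_{2j+1}(x)\right)^2 \ge 1$.
(3) $(\cdot, \cdot, \mathcal{C}, \mathcal{D})$ is achievable iff
(3a) $\forall \theta \in \mathbb{R}$, $\mathcal{C}(\theta)^2 + \mathcal{D}(\theta)^2 \le 1$, and
(3b) $\forall L$ odd, $\forall y \ge 1$, $C(y)^2 + D(y)^2 \ge 1$.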
Proof. We map the conditions of Thm. 2.7 to the ones here and vice-versa. The only nontrivial changes are the map from (3b) and (3c) of Thm. 2.7 to (2b) here, which we prove in
detail. The others follow directly from the equivalence between polynomials $(A, B, C, D)$
and Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ in Eqs. 2.16 and 2.17. Note that condition (4b) in Thm. 2.7
is left unchanged as its expression in terms of Fourier coefficients is not particularly illuminating.
When $L$ is even, we use Eq. 2.16 and the semigroup property of Chebyshev polynomials
$T_n(T_m(x)) = T_{nm}(x)$ to express $A(x) = \mathcal{A}(\theta)$, and similarly for $B(x)$, as
$$A(x) = \sum_{j \text{ even} = 0}^{L} a_j x^j = \mathcal{A}(\theta) = \sum_{j=0}^{L/2} a'_j T_{2j}(x) = \sum_{j=0}^{L/2} a'_j T_j(T_2(x)). \tag{2.18}$$
Using $T_2(x) = 2x^2 - 1$, $T_2 : \{ix \mid x \ge 0\} \to \{x \mid x \le -1\}$ and $T_2 : \{x \mid x \ge 1\} \to \{x \mid x \ge 1\}$.
Thus in the forward direction, (3b) and (3c) in Thm. 2.7 imply (2b) here. In the opposite
direction, let us assume (2b) to be true. We may then substitute a re-parameterization of
the domains $\{x \mid x \le -1\} = \{T_2(ix) \mid x \ge 0\}$ and $\{x \mid x \ge 1\} = \{T_2(x) \mid x \ge 1\}$ to recover
(3b) and (3c). □
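The two facts used in this proof, the semigroup property $T_n(T_m(x)) = T_{nm}(x)$ and the way $T_2$ maps the imaginary axis onto $x \le -1$, are easy to check numerically. The following is an illustrative sketch only; the function names are not from the text.

```python
def cheb_T(n, x):
    """Chebyshev polynomial of the first kind by recurrence; x may be complex."""
    t_prev, t_curr = 1, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return t_curr

def semigroup_gap(n, m, x):
    """Absolute difference between T_n(T_m(x)) and T_{nm}(x)."""
    return abs(cheb_T(n, cheb_T(m, x)) - cheb_T(n * m, x))

# T_2(ix) = -2x^2 - 1 is real and at most -1 for real x >= 0.
image_of_imaginary_point = cheb_T(2, 0.7j)
# T_2(x) = 2x^2 - 1 is at least 1 for real x >= 1.
image_of_real_point = cheb_T(2, 1.3)
```

This is why conditions stated for $x \ge 1$ and for $ix$ with $x \ge 0$ in the polynomial picture merge into a single condition on $|x| \ge 1$ after the change of variable through $T_2$.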
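Corollary 2.11 (Achievable 3-partial Fourier tuples). Assuming $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ satisfy conditions (1-3) of Cor. 2.9, $(\mathcal{X}, \cdot, \cdot, \cdot)$, $(\cdot, \mathcal{X}, \cdot, \cdot)$, $(\cdot, \cdot, \mathcal{X}, \cdot)$, $(\cdot, \cdot, \cdot, \mathcal{X})$ are achievable iff $\forall \theta \in \mathbb{R}$, $\mathcal{X}(\theta)^2 \le 1$.
Proof. This follows from Thm. 2.8 and the equivalence between polynomials $(A, B, C, D)$
and Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ in Eqs. 2.16 and 2.17. □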
2.4.3
Implementation of quantum response functions
Unleashing the potential of arbitrarily sophisticated choices of achievable $(A, B, C, D)$ requires an efficient computation of their implementation $\vec{\phi}$. It is clear that a random
search is wholly inadequate as the degree $L$ could be very large. Nevertheless, achievability leads to a certain structure that resolves this problem (P2). This is encapsulated in the
following lemma, which is proven constructively and furnishes the reverse direction proof of
Thm. 2.6.
Lemma 2.12 (Optimal quantum response compilation). Exactly $L$ phases $\vec{\phi} \in \mathbb{R}^L$ are
required to implement an achievable $(A, B, C, D)$ of degree at most $L$, and these $L$ phases
can be computed in time $O(\mathrm{poly}(L))$.
Proof. A minimum of $L$ phases $\phi_j$ are required to implement a given $(A, B, C, D)$ of degree
at most $L$ as each application of $\hat{R}_\phi(\theta)$ only increases the degree of $(A, B, C, D)$ by one.
We now show that $(A, B, C, D)$ can be implemented with at most $L$ phases $\vec{\phi}$. Due to the
even/odd symmetry of real $A, B, C, D$ from Thm. 2.6 conditions (1) and (3), we can compute
its unique phase sum representation in Eq. 2.9 via the invertible transformation
dn)(L(-n)/2J),
L(ic?(-n/2
(L,j = ij I
/2
n-O (an+ ibn) (
j odd,
j even.
(2.19)
Let us take the ansatz $\hat{U}(\theta) = \hat{R}_{\phi_L}(\theta)\hat{V}(\theta)$ where $\hat{V}(\theta)$ is unitary and $\hat{V}(0) = \hat{1}$ as
$(A, B, C, D)$ represents a unitary from Thm. 2.6 condition (4). Thus $\hat{V}(\theta)$ also has a phase
sum representation $\Phi_{L-1,j}$. These two phase sums are related by the linear map of Eq. 2.10,
with inverse
4
-e-(~l)ji0L
j+ kodd,
i,
j + k even.
'L~k
bL-1,j
k=O
j + k odd,
(2.20)
By choosing
1 odd
eiOL
Zk=
C2FL/21-1 + id 2 FL/ 2 -1
L,k
)Lk
(_1)FL/2]
even
(aL + ibL)
(2.21)
we satisfy the necessary condition $\Phi_{L-1,L} = 0$ from Eq. 2.10. In particular, $\phi_L$ is real, as
Eq. 2.14 has the trailing term $\left((a_L^2 + b_L^2) - (c_{2\lceil L/2 \rceil - 1}^2 + d_{2\lceil L/2 \rceil - 1}^2)\right) \sin^{2L}(\theta/2) = 0$. Hence
the RHS of Eq. 2.21 has absolute value 1. By recursively reducing the degree of $\hat{V}(\theta)$, we
obtain all $L$ phases $\vec{\phi}$. The terminal case at $L = 1$ must be consistent with Eq. 2.10 where
$\Phi_{0,0} = 1$. When evaluated with Eqs. 2.19, 2.20, this is satisfied only if $A(1) = 1$ (Thm. 2.6
condition (2)), which is true for achievable $(A, B, C, D)$. All steps in this procedure can
be computed in time $O(\mathrm{poly}(L))$, and there are only $L$ recursions, leading to a runtime of
$O(\mathrm{poly}(L))$. □
Note that one may derive a decomposition directly from Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ to
phases $\vec{\phi}$ without going through the intermediary of polynomials $(A, B, C, D)$; we leave this
to future work.
2.4.4
Computation of quantum response functions
A consequence of Lem. 2.12 is that designing a composite gate is no more difficult than
finding the $(A, B, C, D)$ to describe the quantum response function $\hat{U}(\theta)$.
Optimizing
$(A, B, C, D)$ for some objective function is far more intuitive than the prior art of a random
search over $\vec{\phi}$. However, this is still a difficult problem. The unitary constraint Eq. 2.14 represents a system of quadratic multinomial equations that would have to be solved at each
step of the optimization to obtain an achievable $(A, B, C, D)$. Solving such systems is in
general an NP-complete task. This is the essence of problem (P3): it would be much easier
to optimize a subset of $(A, B, C, D)$, and doing so is often the problem of practical interest
anyway.
This subset optimization is illustrated by the response functions F. (0), p(6) of Eq. 2.12,2.13
which depend on only two polynomials. Optimizing just these for some objective function
offers more freedom as the unitary constraint Eq. 2.14 is weakened to that of Thm. 2.7.
Ultimately, we must compute some achievable $(A, B, C, D)$ from a partial specification in
order to find the phases $\vec{\phi}$.
Fortunately, the structure of achievable partial tuples can be exploited to derive algorithms analogous to prior art [113] based on polynomial sum-of-squares problems [91], but
specialized to the symmetries of Thm. 2.7. We present results for $(A, B, \cdot, \cdot)$, $(A, \cdot, C, \cdot)$ of odd
degree and show how they apply to all achievable 2-partial tuples. As these primarily serve
to show that the necessary conditions in Thms. 2.6, 2.7, 2.8 are also sufficient, the details
of the proofs for Lems. 2.13, 2.14, which also furnish constructive algorithms for computing
$(A, B, C, D)$ from partial tuples, may be skipped by the casual reader.
Lemma 2.13 (Transition probability sum-of-squares). $\forall$ 2-partial tuples $(A, B, \cdot, \cdot)$ of odd
degree at most $L$ that satisfy conditions (1-3) of Thm. 2.6 and (3a, 3b) of Thm. 2.7, $\exists$
achievable $(A, B, C, D)$ of degree at most $L$ that can be computed in time $\mathrm{poly}(L)$.
Proof. Consider the polynomial of degree at most $L$
$$f(\lambda) = 1 - A^2(\sqrt{1 - \lambda}) - B^2(\sqrt{1 - \lambda}), \quad \lambda \in \mathbb{R}, \tag{2.22}$$
with roots $S = \{s \mid f(s) = 0\} \in \mathbb{C}^L$ ($S$ contains duplicates if a root is degenerate). Since $A$,
$B$ are odd polynomials, $f(\lambda)$ is real for all real $\lambda$. Because $f(\lambda)$ is real, complex roots $s, s^*$
occur in pairs. Thus we can group subsets of $S$ without loss of generality as:
$$S_0 = \{s \in S \mid s = 0\}, \quad S_c = \{s \in S \mid \mathrm{Im}[s] > 0\}, \quad S_r = \{s \in S \mid \mathrm{Re}[s] \ne 0 \wedge \mathrm{Im}[s] = 0\}. \tag{2.23}$$
Observe that $S_{0,r}$ are real, and $S_c$ is complex. Thus
$$f(\lambda) = K^2 \lambda^{|S_0|} \prod_{s \in S_r} (\lambda - s) \prod_{s \in S_c} \left((\lambda - \mathrm{Re}[s])^2 + \mathrm{Im}[s]^2\right), \tag{2.24}$$
with scale constant $K \in \mathbb{R}$. Using (3b), $f(\lambda) \le 0$, $\forall \lambda \le 0$. Hence, all negative roots in $S_r$
occur with even multiplicity. Using (3a), $f(\lambda) \in [0, 1]$, $\forall \lambda \in [0, 1]$. As $f(\lambda)$ changes sign at
$\lambda = 0$, $|S_0|$ is odd. Using the oddness of $A, B$, $f(\lambda) \ge 1$, $\forall \lambda \ge 1$. Since $f(\lambda) \ge 0$, $\forall \lambda \ge 0$,
all positive roots in $S_r$ occur with even multiplicity. Thus, all real roots excluding $s = 0$
occur with even multiplicity. By repeated application of the two-squares identity
$$(r^2 + s^2)(t^2 + u^2) = (rt \pm su)^2 + (ru \mp st)^2, \tag{2.25}$$
the complex factors can be simplified like
$$\prod_{s \in S_c} \left((\lambda - \mathrm{Re}[s])^2 + \mathrm{Im}[s]^2\right) = g^2(\lambda) + h^2(\lambda), \tag{2.26}$$
where $g, h$ are real polynomials in $\lambda$. Thus $f(\lambda) = C^2(\sqrt{\lambda}) + D^2(\sqrt{\lambda})$ where
$$\begin{pmatrix} C(y) \\ D(y) \end{pmatrix} = K y^{|S_0|} \prod_{s \in S_r} (y^2 - s)^{1/2} \begin{pmatrix} g(y^2) \\ h(y^2) \end{pmatrix}, \tag{2.27}$$
and $C, D$ are odd real polynomials of degree at most $L$. Note that different choices of signs
in Eq. 2.25 generate a finite number of different valid solutions. Computing the roots of $f(\lambda)$
is the most difficult step of this algorithm, but can be done in time $O(\mathrm{poly}(L))$ [97]. □
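The repeated use of the two-squares identity in Eq. 2.26 can be carried out mechanically. A minimal sketch, assuming the complex roots are already known: since $(r + is)(t + iu) = (rt - su) + i(ru + st)$, multiplying the complex linear factors $(\lambda - s)$ and splitting the result into real and imaginary parts applies Eq. 2.25 with one particular choice of signs. Swapping any root $s$ for $s^*$ corresponds to the other sign choice. The function names below are illustrative and not part of the text.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sum_of_two_squares(roots_upper):
    """Return real coefficient lists (g, h) such that
    prod_s ((lam - Re s)^2 + (Im s)^2) = g(lam)^2 + h(lam)^2,
    where roots_upper holds one root from each complex-conjugate pair."""
    prod = [1 + 0j]
    for s in roots_upper:
        prod = poly_mul(prod, [-s, 1])      # multiply by (lam - s)
    g = [c.real for c in prod]
    h = [c.imag for c in prod]
    return g, h

# Illustrative roots, chosen arbitrarily.
g, h = sum_of_two_squares([1 + 2j, -0.5 + 0.3j])
```

For real $\lambda$, $g(\lambda)^2 + h(\lambda)^2 = |\prod_s (\lambda - s)|^2$, which reproduces the left-hand side of Eq. 2.26 factor by factor.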
The proof for even $L$, and tuples $(\cdot, \cdot, C, D)$ carries through with minor modification.
The stated conditions in Thm. 2.7 guarantee that the various factors of $\lambda$, $(1 - \lambda)$ necessary
for the correct symmetry of the unspecified polynomials occur with the right multiplicity,
and that all other real roots occur with even multiplicity. Some additional processing for the
$(\cdot, \cdot, C, D)$ case is required as the output $(A, B, C, D)$ is not guaranteed to satisfy $A(1) = 1$.
However, $A(1)^2 + B(1)^2 = 1$ is still true so by computing $\gamma = \mathrm{Arg}[A(1) + iB(1)]$, we can
form an achievable $(A \cos\gamma + B \sin\gamma, B \cos\gamma - A \sin\gamma, C, D)$.
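The final rotation by $\gamma$ acts linearly on the coefficients of $A$ and $B$, so it can be applied directly to coefficient lists. A small sketch under the assumption that $A$ and $B$ are stored as equal-length lists of coefficients, lowest degree first; the names are illustrative.

```python
import math

def evaluate(coeffs, z):
    """Evaluate a polynomial from its coefficient list, lowest degree first."""
    return sum(c * z**k for k, c in enumerate(coeffs))

def fix_identity_condition(A, B):
    """Rotate (A, B) so that the new pair satisfies A(1) = 1 and B(1) = 0.
    Assumes A(1)^2 + B(1)^2 = 1 and len(A) == len(B)."""
    gamma = math.atan2(evaluate(B, 1.0), evaluate(A, 1.0))   # Arg[A(1) + iB(1)]
    cg, sg = math.cos(gamma), math.sin(gamma)
    A_new = [a * cg + b * sg for a, b in zip(A, B)]
    B_new = [b * cg - a * sg for a, b in zip(A, B)]
    return A_new, B_new

# Illustrative input: A(x) = 0.6 x, B(x) = -0.8 x, so A(1)^2 + B(1)^2 = 1.
A_new, B_new = fix_identity_condition([0.0, 0.6], [0.0, -0.8])
```

Because the map is an orthogonal rotation of the pair $(A, B)$, the sum $A(x)^2 + B(x)^2$ is unchanged for every $x$, so the unitarity constraint of Eq. 2.14 continues to hold with the same $C, D$.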
We now present the analogous algorithm for $(A, \cdot, C, \cdot)$.
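Lemma 2.14 (Fidelity response sum-of-squares). $\forall$ 2-partial tuples $(A, \cdot, C, \cdot)$ of odd degree
at most $L$ that satisfy conditions (1-3) of Thm. 2.6 and (1a) of Thm. 2.7, $\exists$ achievable
$(A, B, C, D)$ of degree at most $L$ that can be computed in time $\mathrm{poly}(L)$.
Proof. With the Weierstrass substitution $\forall t \in \mathbb{R}$, $x = (1 - t^2)/(1 + t^2)$, $y = 2t/(1 + t^2)$,
define the real polynomials
$$(\tilde{A}(t), \tilde{B}(t), \tilde{C}(t), \tilde{D}(t)) = (1 + t^2)^L (A, B, C, D). \tag{2.28}$$
These polynomials have extremely useful symmetries which we indicate with angled brackets
$\langle \cdot \rangle$. $\langle \tilde{A} \rangle = \langle \tilde{B} \rangle = \langle EN \rangle$ are Even (E) aNtipalindromes (N) while $\langle \tilde{C} \rangle = \langle \tilde{D} \rangle = \langle OP \rangle$ are
Odd (O) Palindromes (P). Antipalindromes satisfy $\tilde{A}(t) = -t^{2L}\tilde{A}(t^{-1})$ whereas palindromes
satisfy $\tilde{C}(t) = t^{2L}\tilde{C}(t^{-1})$. Note that $\langle E \rangle, \langle O \rangle$ and $\langle P \rangle, \langle N \rangle$ polynomials with multiplication
form a group isomorphic to $\mathbb{Z}_2 \times \mathbb{Z}_2$. For example, $\langle EN \rangle \langle OP \rangle = \langle ON \rangle$.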
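These symmetries can be confirmed numerically for a concrete odd polynomial. The sketch below evaluates $\tilde{A}(t)$ and $\tilde{C}(t)$ straight from the Weierstrass substitution of Eq. 2.28; the sample coefficients and helper names are illustrative assumptions.

```python
def evaluate(coeffs, z):
    """Evaluate a polynomial from its coefficient list, lowest degree first."""
    return sum(c * z**k for k, c in enumerate(coeffs))

def a_tilde(coeffs, L, t):
    """(1 + t^2)^L * A(x) with x = (1 - t^2) / (1 + t^2)."""
    return (1 + t * t)**L * evaluate(coeffs, (1 - t * t) / (1 + t * t))

def c_tilde(coeffs, L, t):
    """(1 + t^2)^L * C(y) with y = 2t / (1 + t^2)."""
    return (1 + t * t)**L * evaluate(coeffs, 2 * t / (1 + t * t))

L = 3
odd_poly = [0.0, 0.5, 0.0, 0.5]      # 0.5 z + 0.5 z^3, an odd polynomial
t = 0.7

# <EN>: even in t, and antipalindromic: A~(t) = -t^(2L) A~(1/t)
even_gap = abs(a_tilde(odd_poly, L, t) - a_tilde(odd_poly, L, -t))
antipalindrome_gap = abs(a_tilde(odd_poly, L, t) + t**(2 * L) * a_tilde(odd_poly, L, 1 / t))

# <OP>: odd in t, and palindromic: C~(t) = t^(2L) C~(1/t)
odd_gap = abs(c_tilde(odd_poly, L, t) + c_tilde(odd_poly, L, -t))
palindrome_gap = abs(c_tilde(odd_poly, L, t) - t**(2 * L) * c_tilde(odd_poly, L, 1 / t))
```

The origin of the symmetries is that $t \to 1/t$ sends $x \to -x$ and leaves $y$ fixed, while $t \to -t$ leaves $x$ fixed and sends $y \to -y$.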
Consider the positive, palindromic polynomial
$$f(t) = (1 + t^2)^{2L} - \tilde{A}^2(t) - \tilde{C}^2(t) = K^2 \prod_{s \in S} (t - s), \tag{2.29}$$
with scale constant $K \in \mathbb{R}$, and roots $S = \{s \mid f(s) = 0\} \in \mathbb{C}^{4L - |S_0|}$, where $|S_0|$ is the
multiplicity of the zero roots. Note the degree of $f(t)$ is $4L - |S_0|$, not $4L$, because the first
$|S_0|$ coefficients being zero implies the last $|S_0|$ are as well. Due to the $\langle EP \rangle$ symmetry of
$f(t)$, $\forall$ roots $s \ne 0$, $\exists$ roots $s^*$, $-s$, and $s^{-1}$. Thus we group subsets of these roots without
any loss of information as follows:
$$\begin{aligned}
S_0 &= \{s \in S \mid s = 0\}, \\
S_1 &= \{s \in S \mid s = 1\}, \\
S_r &= \{s \in S \mid \mathrm{Re}[s] > 1 \wedge \mathrm{Im}[s] = 0\}, \\
S_i &= \{s \in S \mid \mathrm{Re}[s] = 0 \wedge \mathrm{Im}[s] = 1\}, \\
S_t &= \{s \in S \mid \mathrm{Re}[s] = 0 \wedge \mathrm{Im}[s] > 1\}, \\
S_u &= \{s \in S \mid |s| = 1 \wedge 0 < \mathrm{Arg}[s] < \pi/2\}, \\
S_c &= \{s \in S \mid |s| > 1 \wedge 0 < \mathrm{Arg}[s] < \pi/2\}.
\end{aligned} \tag{2.30}$$
Observe that $S_{0,1,r}$ are real, $S_{i,t}$ are imaginary and $S_{u,c}$ are complex. From the real roots,
we construct the factor
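$$f_r = t^{|S_0|/2} (t^2 - 1)^{|S_1|/2} \prod_{s \in S_r} \left(t^4 - t^2(s^2 + s^{-2}) + 1\right)^{1/2}, \qquad \langle f_r \rangle = \langle OP \rangle^{|S_0|/2} \langle EN \rangle^{|S_1|/2} \langle EP \rangle^{|S_r|/2}. \tag{2.31}$$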
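The positiveness of $f(t)$ means that all real factors have even multiplicity. Thus $f_r$ is a
polynomial. From the complex roots, we form
$$\begin{aligned}
f_i &= \left((t^2 - 1)^2 + (2t)^2\right)^{|S_i|/2}, \\
f_t &= \prod_{s \in S_t} \left[(t^2 - 1)^2 + \left(t(\mathrm{Im}[s] + \mathrm{Im}[s]^{-1})\right)^2\right], \\
f_u &= \prod_{s \in S_u} \left[(t^2 - 1)^2 + \left(2t \sin(\mathrm{Arg}[s])\right)^2\right], \\
f_c &= \prod_{s \in S_c} \left[\left(t^4 - t^2\left(|s|^{-2} - 4\sin^2(\mathrm{Arg}[s]) + |s|^2\right) + 1\right)^2 + \left(2(t^3 + t)\,\mathrm{Im}[s]\,(1 - |s|^{-2})\right)^2\right].
\end{aligned} \tag{2.32}$$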
The symmetry of terms under the squares is one of (EP), (EN), (OP), (ON), and occur in a
combination that forms a group under repeated application of the two-squares identity of
Eq. 2.25. Thus we can construct
fift fufc = g2 + h 2 ,
(2.33)
(g) = (EN) i+1Su+1SU
2
(Kfrg) 2 + (Kfrh)
.
f(t)
(h) = (OP)Yil+1Su+1S'I,
,
For some combinations of multiplicities, this decomposition will not produce polynomials
with the symmetry (EN), (OP) required by b, D. However, summing the multiplicities of
these roots shows that ISji is even and that such combinations do not exist. From this
decomposition, we compute B(x), D(y) using
bk =
L
b2 n [Z
0 (-1)m(n
d2k+1 = -2rL
2p-m
L2J(
,
(2.34)
1m (p+)m~
.
(L-n--1) ( n)
Im
(L-n)
As with Lem. 2.13, different choices of signs in the two-squares identity lead to multiple
valid solutions. Computing the roots of $f(t)$ is still the most difficult step, but can be done
in time $O(\mathrm{poly}(L))$. □
The case of even $L$ replaces Eq. 2.29 with $f(t) = (1 + t^2)^{2L} - \tilde{A}^2(t) - \left((1 - t^2)/(1 + t^2)\right)^2 \tilde{C}^2(t)$
and we find $\tilde{B}$ with $\langle EP \rangle$ and $(1 - t^2)\tilde{D}$ with $\langle ON \rangle$ symmetry. A similar root-counting
argument guarantees the existence of such solutions. The coefficients of $B(x)$, $D(y)$ are then
computed also using Eq. 2.34. This procedure carries through without modification for the
other tuples (1), (2) of Thm. 2.7.
2.4.5
Selection of quantum response functions
It should be clear that optimal composite gate design is a systematic process no more difficult
than choosing one or two polynomials optimal for some objective function. Nevertheless,
problem (P4) is that computing these optimal polynomials could still be a difficult task.
However, the constraints on achievable partial tuples in Thms. 2.7, 2.8 seem fairly lax, which
lends hope that this could be done efficiently. In fact these constraints are consistent with
textbook problems in approximation theory [93].
It is at this point where a close connection with discrete-time signal processing [101] is
made. Efficient algorithms [92, 65, 48, 75, 56] for designing polynomials optimal for arbitrary
objective functions under a variety of optimality criteria have been extensively studied for
finite-impulse response filters [54]. We thus inherit much of this machinery, and in many
cases, existing polynomials consistent with achievability have already been found and are
directly transferable.
A most common optimality criterion is the Chebyshev norm: Let $P_0(x)$ be the objective
function, with continuous weight function $W(x) > 0$, to be approximated by a polynomial
$P(x)$ of degree $L$ on a bounded subset $B$ of the closed interval $B \subseteq [-1, 1]$ with the smallest
Chebyshev error norm
$$\epsilon = \max_{x \in B} \left|W(x)\left(P(x) - P_0(x)\right)\right|. \tag{2.35}$$
The unique best approximation can be computed efficiently by Remez-type exchange algorithms [44]. Many variants exist such as where $P(x)$ is a trigonometric polynomial [92],
bounded [48], subject to other unary or linear constraints [75], and even complex [65]. Linear programming methods [75] provide an alternate solution. Efficient algorithms for other
optimality criteria such as least squares are also available [80, 128].
These algorithms efficiently solve the problem of optimization over achievable quantum
response functions $\hat{U}(\theta)$ where the objective functions are 2-partial or 3-partial tuples. Optimization for a 3-partial objective function involves a single quadrature from $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$
together with a single real objective function $P_0(\theta)$. Thus we optimize over $P(\theta)$ for $P_0(\theta)$
in Eq. 2.35 subject to the constraints of Thm. 2.8 for the corresponding quadrature. The
slightly more complicated 2-partial case instead specifies two quadratures and real objective functions $P_{0,1}(\theta), P_{0,2}(\theta)$. Thus we define $P_0(\theta) = P_{0,1}(\theta) + iP_{0,2}(\theta)$, and optimize
over $P(\theta) = P_1(\theta) + iP_2(\theta)$ for $P_0(\theta)$ subject to the constraints of Thm. 2.7 for the corresponding quadratures. Note that the unitarity inequality constraint poses no difficulty as
$|P(\theta)|^2 = P_1^2(\theta) + P_2^2(\theta)$.
2.4.6
The methodology of composite quantum gates
Our efforts lead us to a methodology for the design of single spin quantum response functions
$\hat{U}(\theta)$ through composite quantum gates built from a sequence of $L$ primitive gates all rotating
by $\theta$, but each with its own phase $\vec{\phi} = (\phi_1, \ldots, \phi_L)$. The procedure is systematic, flexible,
and most importantly, provably efficient:
Problem statement Given $L > 1$ and objective function $\hat{U}_0(\theta)$ for either 3-partial or 2-partial tuples, find the composite quantum gate that implements through $\vec{\phi}$ the optimal
$\epsilon$-approximation to $\hat{U}_0(\theta)$.
Solution procedure
(S1) Check that $\hat{U}_0(\theta)$ is consistent with achievability.
-Satisfies conditions of Thms. 2.7, 2.8.
(S2) Choose optimality criterion.
-The Chebyshev norm is most common.
(S3) Execute polynomial optimization algorithm over achievable partial tuples.
-Remez-type algorithms are efficient.
(S4) Compute achievable tuple from partial tuple.
-This can be done efficiently by Lems. 2.13, 2.14.
(S5) Compute phases $\vec{\phi}$.
-This can be done efficiently by Lem. 2.12.
2.5
Examples
Using the methodology in Section 2.4.6, composite quantum gates with response function
$\hat{U}(\theta)$ that minimize the error with respect to arbitrary objective functions $\hat{U}_0(\theta)$ can be
efficiently designed. We illustrate this process with three examples of independent scientific
interest: compensated population inversion gates, compensated broadband NOT gates, and
compensated narrowband quantum gates.
Population inversion gates rotate states $|0\rangle$ to $|1\rangle$ and vice-versa, and come in two flavors.
The broadband variant implements this rotation with high probability across the widest
bandwidth of $\theta \in B$, meaning that the transition probability response function $p(\theta)$ from
Eq. 2.13 is close to 1. The narrowband variant instead implements this rotation with low
probability so $p(\theta) \approx 0$, except at a single point $p(\pi) = 1$. We discuss optimal design of
these gates in Section 2.5.1. As closed-form solutions for these gates are already known, and
used extensively in NMR spectroscopy, they help build familiarity with the methodology in
Section 2.4.6 when it is used to solve open questions in the next two examples.
Broadband compensated NOT gates implement the rotation $\hat{R}_0(\pi)$ with high fidelity over
the widest bandwidth of $\theta$ parameters. Whereas population inversion gates only succeed on
initial states $|0\rangle$ to $|1\rangle$, NOT gates apply a $\pi$ rotation with a known phase for all input
states. Such gates have been extensively studied for applying uniform rotations in the
presence of drive field inhomogeneities, particularly in quantum computing applications,
and our methodology, presented in Section 2.5.2, solves open questions regarding the scaling
of bandwidth with sequence length as well as their efficient synthesis.
A complementary design problem addressed in Section 2.5.3 is that of narrowband compensated quantum gates. These instead apply a desired arbitrary rotation $\hat{R}_0(\chi)$ at a single
$\theta$ value, and the identity rotation elsewhere over the widest bandwidth of $\theta$ parameters. Such
gates are highly relevant to minimizing crosstalk in the selective addressing of spins in arrays, particularly when spin-spin distances are below the diffraction limit, as might be found
in scalable architectures for ion-trap quantum computation.
2.5.1
Composite population inversion gates
Population inversion gates maximize the bandwidth $B$ over which the transition probability
response function $p(\theta)$ from Eq. 2.13 is close to 1 for the broadband variant, or close to 0
for the narrowband variant. Note that in both cases, perfect population inversion occurs at
$\theta = \pi$ for $L$ odd, owing to the fact that $A(0) = 0$. Moreover, the optimal polynomials and
phases for both variants turn out to be related by a simple transformation, so it suffices for
us to consider only the broadband case.
Composite gates with these properties have been studied extensively for nuclear magnetic
resonance and quantum computing applications. One approach to obtaining broadband
behavior is with the maximally flat ansatz $p(\theta) = 1 - O((\theta - \pi)^{2n})$ [125]. This exponentially
suppresses errors in the transition probability to order $n$, thus $p(\theta) \approx 1$ over a wide range of $\theta$.
Remarkably, the $\vec{\phi}$ that implement this profile can be found in closed form [130] with optimal
sequence lengths $L = n$. More recently, a second approach has emerged [88], motivated
by the following observation: as the flat ansatz $p(\theta) = 1 - O((\theta - \pi)^{2n})$ only increases
bandwidth indirectly through the suppression order $n$, better results can be obtained by
directly optimizing for bandwidth, while ensuring that the worst-case error $\mathcal{I}$ remains
bounded.
The procedure of Section 2.4.6 for odd $L$ formalizes this task as a straightforward optimization problem:
(S1) Choose the objective function $\forall \theta \in B = \pi + [-|B|/2, |B|/2]$, $\mathcal{A}_0(\theta) = 0$ for the
$(A, 0, \cdot, \cdot)$ 2-partial tuple. Since $p(\theta) = 1 - A^2$ is close to 1 over $B$, the unitarity constraint
$C^2 + D^2 = 1 - A^2$ implies that a rotation $\hat{R}_\phi(\pi)$ is approximated over $B$, with an unspecified
phase $\phi = \mathrm{Arg}[C + iD]$ that varies with $\theta$. As consistency with Thm. 2.6 requires that
$A(1) = 1$, this implies that identity is applied at $\theta = 0$, thus $B$ must not contain $\theta = 0$.
(S2) Choose the Chebyshev optimality criterion, where the best $A$ solves the minimax
optimization problem
$$\epsilon = \min_{A} \max_{\theta \in B} |A(x)|, \qquad \epsilon^2 = \mathcal{I}, \tag{2.36}$$
where the worst-case transition probability over $B$ is $1 - \mathcal{I}$.
(S3) Find the function $A$ that solves Eq. 2.36. For consistency with Thm. 2.6, the optimization is over real odd polynomials $A$ bounded by $\forall |x| \le 1$, $|A(x)| \le 1$.
(S4) Using Lem. 2.13, compute the achievable tuple $(A, 0, C, D)$ from the partial specification $(A, 0, \cdot, \cdot)$.
(S5) Compute $\vec{\phi}$ from $(A, 0, C, D)$ using Lem. 2.12.
The solution to (S3) is the Dolph-Chebyshev window function [38, 90] famous in discrete-time signal processing,
$$DC_{L,\mathcal{I}}(x) = \sqrt{\mathcal{I}}\, T_L(\beta_{L,\mathcal{I}}\, x), \qquad \beta_{L,\mathcal{I}} = T_{L^{-1}}(\mathcal{I}^{-1/2}), \tag{2.37}$$
where $T_n(x) = \cos(n \arccos(x))$ are Chebyshev polynomials. Note the ripples of $DC^2_{L,\mathcal{I}}(x)$
bounded by $\mathcal{I}$ in Fig. 2-1. This is in contrast to monotonic increase of the limiting function,
indicated by the subscript $f$,
$$DC_{L,f}(x) = \lim_{\mathcal{I} \to 0} DC_{L,\mathcal{I}}(x) = x^L, \tag{2.38}$$
which is maximally flat at $x = 0$, but has significantly narrower bandwidth.
Figure 2-1: $DC_{L,\mathcal{I}}$ (black), $M_{L,\mathcal{I}}$ (teal) polynomials plotted for $L = 9$ and target worst-case
infidelity $\mathcal{I} = 10^{-2}$ (solid) and $\mathcal{I} \to 0$ (dashed), indexed by $f$. The observed ripples are
a generic feature of bandwidth optimized polynomials, unlike those optimized for maximal
flatness $DC_f$, $M_f$. The inset plots their squares and defines the bandwidth $B$ in $x$ coordinates.
Using $x = \cos(\theta/2)$, the bandwidth in $\theta$ coordinates is to order $O(\mathcal{I})$
LB -- 4f - 41|B31 = 23-I
(2.39)
Ef
Given the same target bandwidth, the worst-case error of $DC_{L,\mathcal{I}}$ is exponentially smaller
than $DC_{L,f}$. Note also the quadratic difference in the scaling with $L$ of the bandwidth over
which $DC_{L,\mathcal{I}}$ does not approximate $F(x) = 0$.
= 4arcsech
+0(l
s
Bf 4
(2.40)
+2
The ripples in the amplitude are a generic feature of best polynomial approximations to
functions in the Chebyshev norm. By sacrificing flatness, much smaller absolute variations
in error $\epsilon$ can be achieved over some specified bandwidth $B$. This is a common theme that
will be revisited in the subsequent example.
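The trade between flatness and bandwidth can be made concrete by evaluating the Dolph-Chebyshev polynomial of Eq. 2.37 directly. The following is a minimal sketch, assuming $T_{L^{-1}}(z)$ is understood as $\cosh(\operatorname{arccosh}(z)/L)$ for $z \ge 1$ and that the bandwidth in $x$ coordinates is the region where the squared polynomial stays below $\mathcal{I}$. Function names are illustrative.

```python
import math

def cheb_T(n, x):
    """T_n(x) for integer n and real x, using the cos form inside [-1, 1]
    and the cosh form outside."""
    if abs(x) <= 1:
        return math.cos(n * math.acos(x))
    sign = 1.0 if x > 0 else (-1.0) ** n
    return sign * math.cosh(n * math.acosh(abs(x)))

def dolph_chebyshev(L, infidelity):
    """Return the function x -> sqrt(I) * T_L(beta * x) together with beta."""
    beta = math.cosh(math.acosh(infidelity ** -0.5) / L)
    def dc(x):
        return math.sqrt(infidelity) * cheb_T(L, beta * x)
    return dc, beta

def flat_band_edge(L, infidelity):
    """Largest |x| with (x^L)^2 <= I for the maximally flat limit x^L."""
    return infidelity ** (1.0 / (2 * L))

dc, beta = dolph_chebyshev(9, 1e-2)
rippled_band_edge = 1 / beta              # |DC(x)| <= sqrt(I) for |x| <= 1/beta
flat_edge = flat_band_edge(9, 1e-2)
```

For $L = 9$ and $\mathcal{I} = 10^{-2}$, the rippled polynomial stays within $\sqrt{\mathcal{I}}$ out to a larger $|x|$ than $x^L$ does, while still satisfying $DC_{L,\mathcal{I}}(1) = 1$.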
Finding the phases that implement $(DC_{L,\mathcal{I}}(x), 0, \cdot, \cdot)$ is then a straightforward computation through (S4), (S5), and the results can be compared to the closed-form solutions
from [136, 88]: $\phi_k = \phi_{L-k+1}$ where $\phi_1 = 0$ and
Ok+1 = Ok + 2 tan-
The phases
[tn(L)V1
- 8ZL2]
(2.41)
for the narrowband variant (-, -, DCL,I(X), 0) are obtained by a simple 'toggling'
transformation [134] 4j = -(-)5o 3 - E
- 20k540
2.5.2
Broadband compensated NOT gates
Broadband compensated NOT gates maximize the bandwidth $B$ over which the fidelity
response function with respect to the target gate $\hat{R}_0(\pi)$ is close to 1. One option consistent
with this goal is the choice of fidelity response functions F,(0) = I- 0((0 -7r) 2n+ 2 ) that are
maximally flat with respect to $(\theta - \pi)$. When the correction order $n$ increases, deviations
from $\theta = \pi$ are exponentially suppressed, resulting in improved approximations of the target
gate over wider ranges of $\theta \in B$. The central difficulty of this pursuit is finding the phases
$\vec{\phi}$ that maximize $n$ for any given $L$. Unlike the population inversion gates of Section 2.5.1,
this appears to be significantly more difficult; optimal length solutions for the $\vec{\phi}$ have only
been found in closed-form for small $n \le 4$ [87].
This problem has been attacked over the course of two decades, starting with Wimperis [134] who found the $\vec{\phi}$ in closed form for BB$_1$, an $L = 5$ sequence with $n = 2$. This was
extended by Brown et al. [18] with SK$n$ for arbitrary $L = O(n^{3.09})$ through a recursive
construction, and then by Jones [62, 59] with F$n$ to $L = O(n^{1.59})$ in closed-form through
sequence concatenation. The most recent effort [87] proved a lower bound of $L = \Omega(n)$
and conjectured that the sequence BB$n$ (W$n$ in [59]) with $L = 2n + 1$ is optimal through
brute-force up to $L = 25$. Using our methodology, we can easily prove this conjecture and
efficiently compute its implementation $\vec{\phi}$.
Moreover, our methodology enables a second option. Instead of optimizing for correction
order, it is possible to directly minimize the worst-case infidelity I, which is the experimental
quantity of interest, over a target bandwidth B. We find that doing so leads to an improvement in I that scales exponentially with L over the maximally flat case. To prove these
statements, we proceed with the design outline of Section 2.4.6 for odd L:
(S1) Choose the objective function $\forall \theta \in B = \pi + [-|B|/2, |B|/2]$, $\hat{U}_0(\theta) = \hat{R}_0(\pi) = -i\hat{\sigma}_x$
for the $(\cdot, \cdot, C, \cdot)$ 3-partial tuple. Provided that $B$ does not contain the point $\theta = 0$, this is
consistent with the constraints of Thm. 2.8. This corresponds to finding a fidelity response
function F, (0) = C2 (sin (2)) that is close to 1 across B.
(S2) The best fidelity response function for the maximally flat approach in prior art is
obtained from the function C that maximizes the correction order
n = max{n I C(y) =1 - 0((1 - y)f+l)},
(2.42)
C
I = 1 - min F, (0),
y = sin (6/2),
0E13
where I is the worst-case infidelity over the bandwidth B. It is easy to verify that any
such C satisfies F,(0) =1 -- 0((0 - 7r)2n+2). The more direct approach uses the Chebyshev
optimality criterion, where the best C solves the minimax optimization problem
$$\epsilon = \min_{C} \max_{\theta \in B} |C(y) - 1|, \qquad \mathcal{I} = 1 - (1 - \epsilon)^2. \tag{2.43}$$
(S3) Find the function $C$ that solves Eqs. 2.42, 2.43. For consistency with Thm. 2.6, the
optimization is over real, odd polynomials $C$ bounded by $|C(y)| \le 1$, $\forall y \in [-1, 1]$.
(S4) Using Lem. 2.14, compute the achievable tuple $(A, 0, C, D)$ from the partial specification $(\cdot, 0, C, \cdot)$.
(S5) Compute $\vec{\phi}$ from $(A, 0, C, D)$ using Lem. 2.12.
We now present the solutions to (S3) of this procedure. This is the most difficult step, as
once $C$ is provided, the implementation $\vec{\phi}$ is a straightforward calculation. Eq. 2.42 is solved
by the odd polynomial that satisfies the following $n + 1$ independent linear constraints:
(
dk 0
dkC(y)
C(1)1,
k = 1, 2, . . , n.
(2.44)
As a degree L odd polynomial has L--1 free parameters, a degree L
necessary and sufficient. This is solved by the polynomial
2n + 1 polynomial is
dyk
0,
(2.45)
2_
(
ML,f (Y)
Y=J
j=0
with an example M 9 ,f plotted in Fig. 2-1. The index L indicates the degree, and the subscript
f indicates that this is a maximally flat polynomial. As ML,f(y) is monotonically decreasing
from y < 1, the relation between infidelity I and bandwidth B is obtained by solving
I=
1 - Mf (cos (1B/4)) to leading order:
|B| L+ 1 2 L+5/2 [
E = (L8M
(2.46)
1
Thus given some target bandwidth $\mathcal{B}$ of high-fidelity operation, the composite quantum gate represented by BBn $= (\cdot, 0, M_{2n+1,f}(y), \cdot)$ implements NOT with a worst-case infidelity that decreases exponentially with sequence length. This proves the $L = 2n + 1$ conjecture of [87].
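The flatness constraints of Eq. 2.44 can be checked numerically. The following is a minimal sketch, assuming the maximally flat polynomial takes the closed form $y \sum_{j \le n} \binom{2j}{j} ((1-y^2)/4)^j$ and that the leading-order infidelity is $2^{L+5/2} (|\mathcal{B}|/8)^{L+1} / \sqrt{\pi L}$; the helper name `maximally_flat` and the choices `n = 4`, `bandwidth = 1.0` are illustrative, not taken from the text.

```python
from math import comb, cos, pi, sqrt

import numpy as np
from numpy.polynomial import Polynomial


def maximally_flat(n):
    """Return M(y) = y * sum_{j=0}^{n} C(2j, j) * ((1 - y^2) / 4)^j, of degree 2n + 1."""
    x = Polynomial([1.0, 0.0, -1.0]) / 4.0      # (1 - y^2) / 4
    series = Polynomial([0.0])
    for j in range(n + 1):
        series = series + comb(2 * j, j) * x**j
    return Polynomial([0.0, 1.0]) * series


n = 4
L = 2 * n + 1
M = maximally_flat(n)

# Flatness constraints: M(1) = 1 and the first n derivatives vanish at y = 1.
value_at_one = M(1.0)
derivs_at_one = [M.deriv(k)(1.0) for k in range(1, n + 1)]

# Exact worst-case infidelity at the band edge versus the assumed leading-order estimate.
bandwidth = 1.0
exact = 1.0 - M(cos(bandwidth / 4.0)) ** 2
estimate = 2 ** (L + 2.5) / sqrt(pi * L) * (bandwidth / 8.0) ** (L + 1)

print(value_at_one, np.max(np.abs(derivs_at_one)), exact, estimate)
```

The estimate is only expected to agree with the exact value up to corrections of relative order $1/L$ and $|\mathcal{B}|^2$, so the two printed infidelities should be of the same magnitude rather than equal.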
The odd polynomials of degree L that satisfy the Chebyshev error norm optimality
criterion in Eq. 2.43 can also be found. We label these polynomials ML,I, where L indicates
the degree, and I is the worst-case infidelity, which is directly related to the bandwidth B.
For L = 5, we have a complicated looking expression
3
-(4y3+3y2+2yi +1)y +(2y5+4y4+6y'+3y2)y
M5
+
(y1 -1)
2y5(y1+1) 2
3
(1+3y1+y2) 2 (1-2yi-4y )(3+9y1+8y)
3125y6(1+yl)4(1+2y1)3
3
(2.47)
parameterized implicitly through $y_1 \in [\cos(\pi/5), 1]$. For larger $L$, such as $M_{9,10^{-2}}$ in Fig. 2-1, the $M_{L,I}$ can always be computed numerically through the famous Parks-McClellan algorithm [92] for finite impulse response filters. Remarkably, the Chebyshev error of this approximation problem is known [40], and to leading order
$$ I \approx \frac{8\cos^2(|\mathcal{B}|/8)\,\tan^{L+1}(|\mathcal{B}|/8)}{\sqrt{\frac{\pi}{2}(L-1)\cos(|\mathcal{B}|/4)}} \approx \frac{8\sqrt{2}}{\sqrt{\pi L}} \left(\frac{|\mathcal{B}|}{8}\right)^{L+1}. \tag{2.48} $$
By comparing Eqs. 2.46 and 2.48 in Fig. 2-2, it can be seen that for any target $\mathcal{B}$ and sequence length $L$, the composite quantum gate OBn $= (\cdot, 0, M_{2n+1,I}(y), \cdot)$ has a worst-case infidelity that improves on BBn by an exponential factor $O(2^{1-L})$.
In contrast to the BBn sequences that are fixed for each $n$, OBn allows for an optimal design trade-off between bandwidth $\mathcal{B}$ and infidelity $I$. As seen in Fig. 2-1, this occurs by introducing equiripples of equal amplitude bounded by $I$, similar to the $DC_{L,I}$ polynomials for population inversion gates. Thus, given the same performance targets, an extremely short OBn gate can perform just as well as a significantly longer BBn gate. In other words, maximizing the correction order only improves the achieved bandwidth indirectly, leading to a poor trade-off between $I$ and $\mathcal{B}$, whereas better results are naturally achieved by optimizing for polynomials that directly solve Eq. 2.43 by minimizing infidelity over a target bandwidth.

[Figure 2-2 plot: worst-case infidelity $I$ versus target bandwidth $|\mathcal{B}|/2\pi$ for $M_{L,I}$ (solid) and $M_{L,f}$ (dashed).]

I          L    $\vec\phi$, where $\phi_k = \phi_{L-k+1}$
Eq. 2.47   5    closed form in $\tan^{-1} t_1$, $\tan^{-1} t_2$
10^{-2}    9    (2.987, 5.166, 4.021, 1.678, 2.815, ...)
10^{-4}    9    (2.889, 5.334, 4.042, 1.490, 2.926, ...)
10^{-6}    9    (2.844, 5.381, 4.034, 1.414, 2.976, ...)
10^{-2}    13   (2.390, 0.771, 2.791, 2.824, 2.115, 4.573, 4.888, ...)
10^{-4}    13   (2.233, 0.455, 2.853, 2.862, 1.838, 4.558, 5.041, ...)
10^{-6}    13   (2.159, 0.314, 2.874, 2.877, 1.677, 4.495, 5.092, ...)

Figure 2-2: Worst-case infidelity $I$ of NOT gates OBn $= (\cdot, 0, M_{2n+1,I}(\sin(\theta/2)), \cdot)$ (solid, Eq. 2.48) optimized for target bandwidth $\theta \in \mathcal{B}$ compared to flatness-optimized NOT gate BBn $= (\cdot, 0, M_{2n+1,f}(\sin(\theta/2)), \cdot)$ (dashed, Eq. 2.46), plotted for $L = 2n + 1 = 5, 9, \ldots, 25$ (from top). Observe that $I$ for OBn is exponentially smaller by factor $\approx 4^n$ than BBn. Alternatively, an OBn gate can approximate NOT with infidelity at most $I$ over a much wider bandwidth than BBn. The table provides examples of $\vec\phi$ for OBn rounded to 3 decimal places.
2.5.3
Composite quantum gates with sub-wavelength spatial selectivity
Narrowband compensated gates maximize the bandwidth $\mathcal{B}$ over which the fidelity response function with respect to identity $\hat{1}$ is close to 1, except at a single point $\theta_0$ where an arbitrary target rotation $\hat{R}_0(\chi)$ is applied. Although the direct approach is to compute new polynomials $(A, \cdot, C, \cdot)$ that satisfy these properties, we can reuse the polynomials $M_{L,I}$ from Section 2.5.2 by making certain assumptions on the physical system. In the following, we also assume that $|\chi| < \pi$.
Consider a Gaussian beam of fixed width $\lambda$. As a function of position $r$, this beam has a spatially-varying Rabi frequency $\Omega(r) = \Omega_0 e^{-r^2/2\lambda^2}$. Thus when applied for time $t_0$, a primitive gate $\hat{R}_\phi(\theta(r))$ that also varies as a function of position is generated, where $\theta(r) = \theta_0 e^{-r^2/2\lambda^2}$ and $\theta_0 = \Omega_0 t_0$. At $r = 0$, one can choose $t_0, \phi$ such that the target rotation $\chi = \theta_0$ is implemented, and due to the exponential decay of the Gaussian beam, moving away from the beam center approximates the identity gate with infidelity $I(r) = \sin^2(\theta(r)/2)$. Thus at distance $r/\lambda > d/\lambda = \beta_1 \approx \log^{1/2}(\theta_0^2/4I)$ from the beam center, the worst-case infidelity is $I$. As the minimum possible beam width $\lambda$ is the wavelength of light, selective addressing below the diffraction limit appears impossible. However, even this can be overcome with a carefully designed composite quantum gate.
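The single-gate baseline can be made concrete with a short numerical sketch. It assumes a Gaussian rotation-angle profile $\theta(r) = \theta_0 e^{-r^2/2\lambda^2}$ and an infidelity with respect to identity of $\sin^2(\theta(r)/2)$; the function names and the example values ($\theta_0 = \pi$, target infidelity $10^{-4}$) are illustrative choices.

```python
import numpy as np


def single_gate_infidelity(r_over_width, theta0):
    """Infidelity with respect to identity of one primitive gate at distance r (in beam widths)."""
    theta = theta0 * np.exp(-0.5 * r_over_width**2)
    return np.sin(theta / 2.0) ** 2


def selectivity_radius(theta0, target_infidelity):
    """Distance (in beam widths) beyond which the infidelity stays below the target."""
    # Largest rotation angle in [0, pi] whose infidelity equals the target.
    theta_max = 2.0 * np.arcsin(np.sqrt(target_infidelity))
    return np.sqrt(2.0 * np.log(theta0 / theta_max))


theta0 = np.pi
target = 1e-4
radius = selectivity_radius(theta0, target)
leading_order = np.sqrt(np.log(theta0**2 / (4.0 * target)))
print(radius, leading_order, single_gate_infidelity(radius, theta0))
```

For small target infidelity the exact radius and the leading-order expression nearly coincide, which is the regime the composite-gate construction is meant to improve upon.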
Narrowband composite gates of length L applicable to this scenario have been widely
studied. For instance, [134, 94] report beam width reductions by factor d ~ 0.73 1 [134, 94].
Further reduction is possible with longer composite gates [87], but with poor scaling $\beta_{\text{space}} = O(L^{-1/4})$.
A better narrowband composite gate results from using the broadband identity gate ID $= (M_{L,I}(x), 0, \cdot, \cdot)$ designed from the $M_{L,I}$ polynomial presented in Section 2.5.2. Then, the fidelity response function with respect to identity is $F_0(\theta) = M_{L,I}^2(x)$, which, as we now show, corresponds to a quadratic improvement of $\beta_{\text{space}} = O(L^{-1/2})$.
Let us compose ID with the Gaussian beam to produce the spatially-varying quantum response function
$$ \hat{U}_{\text{space}}(r) = \text{ID}(\theta_0 e^{-r^2/2\lambda^2}) = \text{ID}(\theta_0) + O(r^2), \tag{2.49} $$
for some choice $|\theta_0| < \pi$. Note that $\hat{U}_{\text{space}}(r)$ is stable with respect to beam-pointing errors in $r$ due to the vanishing first derivative. The degree of spatial selectivity is computed from the bandwidth in Eq. 2.48 by substituting $|\mathcal{B}| = 2\theta_0 e^{-\beta_{\text{space}}^2/2}$ and solving for $r/\lambda$. Thus, identity is implemented with infidelity at most $I$ at all $r \ge d = \lambda\beta_{\text{space}}$ as seen in Fig. 2-3,
where to leading order $O(L^{-1/2})$,
Bspace
=
A = 2
log (1/I)+ log (27/(Lr))
+l
4
n -
(2.50)
[Figure 2-3 plot: infidelity versus distance from beam center $r/\lambda$; inset: effective beam width $\beta_{\text{space}}$ versus $L$.]

I          L    $\vec\phi$, where $\phi_k = \phi_{L-k+1}$
Eq. 2.47   5    closed form in $\tan^{-1} t_1$, $\tan^{-1} t_2$
10^{-2}    9    (0, 0.772, 4.357, 2.827, 3.886, ...)
10^{-4}    9    (0, 1.087, 4.501, 2.707, 3.961, ...)
10^{-6}    9    (0, 1.235, 4.601, 2.695, 4.029, ...)
10^{-2}    13   (0, 1.450, 3.683, 2.501, 3.220, 5.577, 5.728, ...)
10^{-4}    13   (0, 1.872, 4.326, 2.844, 3.602, 6.271, 0.257, ...)
10^{-6}    13   (0, 2.077, 4.616, 2.978, 3.742, 0.250, 0.578, ...)

Figure 2-3: Infidelity of spatially selective composite gates $(M_{L,10^{-4}}(\cos(\frac{1}{2}\theta_0 e^{-r^2/2\lambda^2})), 0, \cdot, \cdot)$ plotted for $\theta_0 = \pi$ and $L = 1, \ldots, 25$ (solid, from right). The effective beam width $\beta_{\text{space}} = O(L^{-1/2})$ (inset) beyond which the identity gate is well-approximated is dramatically reduced over that of a single gate $\beta_1$. By varying $\theta_0$, arbitrary unitary gates can be applied at $r = 0$ with high beam-pointing stability. Poorer scaling $\beta_{\text{space}} = O(L^{-1/4})$ results from using the flat $(M_{L,f}, 0, \cdot, \cdot)$ (dashed). The table provides examples of $\vec\phi$ to 3 decimal places.
Meanwhile at $r = 0$, we obtain the gate
$$ \hat{U}_{\text{space}}(0) = \hat{R}_\gamma\left(2\cos^{-1}(M_{L,I}(\cos(\theta_0/2)))\right), \tag{2.51} $$
where $\gamma = \text{Arg}[C(\sin(\theta_0/2)) + iD(\sin(\theta_0/2))]$. The desired rotation $\hat{R}_0(\chi)$ is thus obtained by choosing $\theta_0$ such that $\cos(\chi/2) = M_{L,I}(\cos(\theta_0/2))$ and rotating all phases $\phi_k \leftarrow \phi_k + \gamma$, which follows from $e^{-i\gamma\hat\sigma_z/2}\,\hat{U}_{\text{space}}(0)\,e^{i\gamma\hat\sigma_z/2} = \hat{R}_0(\chi)$.
The optimality of these results follows from the construction of $M_{L,I}$ as optimal bandwidth polynomials. In particular, using the flat polynomial $M_{L,f}(x)$ leads to the scaling $\beta_{\text{space}} = O(L^{-1/4})$ found in prior art and Fig. 2-3 (inset).
2.6
Conclusion
We have presented and applied a methodology, analogous to the Shinnar-LeRoux algorithm
but with different controls, for the systematic design of resonant equiangular composite
quantum gates of length L on a single spin. In particular, we show that all steps are efficient with time complexity O(poly(L)), and provide an extremely rigorous characterization
of achievable quantum response functions. Moreover, the elegant and practical connection
made with discrete-time signal processing allows us to inherit and adapt many existing algorithms and polynomials used in the design of classical response functions for this quantum
problem. Much potential remains untapped there, and interdisciplinary exchange could spur
the discovery of further connections, leading to the development of previously intractable applications. Indeed, this relationship has already proven fruitful in surprising directions, such
as recent work furnishing optimal algorithms for important problems such as Hamiltonian
simulation [86, 84] on a quantum computer.
Various thought provoking extensions are also motivated. The set of achievable quantum
response functions is changed by introducing elements such as additional (possibly continuous) control parameters, disturbances, coupled spins [124, 94, 61], or open systems [68, 115].
These all enable their own unique applications, but also appear difficult to solve somehow
systematically and intuitively. Our success in the case of composite gates contributes supporting evidence that a useful characterization as well as efficient methods for these more
complex design problems could exist.
Chapter 3
Amplitude amplification by quantum
signal processing
3.1
Introduction
In Chapter 2, we introduced QSP$_L$, a model of analog computation based on single-qubit rotations. We now make strides towards the hope of fast analog-digital computation by demonstrating one way in which QSP$_L$ results transfer to the digital realm without modification. We do so through an application to quantum query algorithms, starting in this section with the quantum search problem, which is a special case of amplitude amplification, which we further generalize.
Amplitude amplification is a staple quantum subroutine for state preparation that is used in many quantum algorithms. The basic version is based on reflections about initial and target states that lead to sinusoidal modulation in the final amplitude of the prepared target state. These oscillations depend on the time $N$ for which one runs the algorithm as well as the initial amplitude of the target state. This leads to the well-known 'soufflé' problem. If the initial amplitude is not known beforehand, one also does not know what $N$ should be. Undercooking is running it for too little time, which prevents the amplitude from reaching the first oscillation peak, and overcooking is running it for too long, which can cause the amplitude to drop. One approach is to estimate $\theta$ by performing a binary search over values of $N = 1, 2, 4, \ldots$; the cost of doing so turns out to not affect the asymptotic scaling of Grover search. However, a more elegant and general approach would be to replace sinusoidal modulation with some other custom function.
This can be achieved by a simple generalization known as phase-matching [83, 50, 17, 58], where instead of performing reflections, one applies partial reflections with some identity component. These partial reflections allow interference between states from different computational time-steps, leading to more sophisticated relations between initial and final state amplitudes. One such non-trivial function enables fixed-point quantum search [51, 136], which solves the soufflé problem without requiring any repetitions for estimation. However, these results, especially in choosing the right sequence of partial reflections, tend to follow either from geometric arguments, or a brute-force numerical search. What is lacking is a general theory that classifies the space of such possible functions, as well as procedures for compiling their implementation from some specification. The lack of knowledge about this design space limits potential applications of the amplitude amplification framework.
Surprisingly, the model of analog computation QSP$_L$ from quantum signal processing on a single qubit provides this general theory, and essentially solves the design problem. Even though amplitude amplification is in general an algorithm on multiple qubits, its underlying SU(2) structure allows the results of QSP$_L$ to apply without modification. Thus one may compute functions of amplitudes with the same speed and simplicity as at the physical level of QSP$_L$, at least in terms of query complexity.
In Section 3.2, we define the quantum query model, which is applied in Section 3.3 to outline the famous quantum search algorithm [50], also known as Grover's algorithm, for searching a database of size $n$ in $O(\sqrt{n})$ time, and its generalization as amplitude amplification. The most common generalization of amplitude amplification replaces reflections with partial reflections, which allow one to design functions for obtaining non-trivial non-sinusoidal modulation. In Section 3.4, we completely solve this design problem. We prove that any such function is a polynomial subject to certain necessary and sufficient constraints, and show how these partial reflections may be precomputed in classical polynomial time. We then generalize this in Section 3.5 to obtain an even more flexible variant that relaxes some constraints on these polynomials.
3.1.1
Attributions and contributions
The results in this section are taken from the preprint of joint work [85] with Isaac L.
Chuang.
3.2
Quantum algorithms and query complexity
The quantum query model is a useful tool for making meaningful comparisons between the performance of classical algorithms and that of quantum algorithms. The basic idea is to provide information on the problem to both classical and quantum circuits in a certain standardized manner. In the standard approach, this is done through a black-box Boolean function $O: \{0,1\}^n \to \{0,1\}$ that encodes some information on the $2^n$-bit string $x = x_1 x_2 \ldots x_{2^n} \in \{0,1\}^{2^n}$. In the simplest case, computing on index $j \in \{0,1\}^n$ through $O(j) = x_j$ returns the value of the $j$th bit of $x$. With multiple applications of this function, one could then determine functions $f$ of $x$, such as its parity, or whether there exist any bits that are one, using some other logic circuit that computes on the obtained information. As $O$ can always be synthesized by some classical circuit of Boolean logic, black-box means that we treat this classical circuit as a single entity, the oracle, and count the number of times we query this oracle.
This normalizes the inputs to classical and quantum algorithms as $O$ could just as well be synthesized by Toffoli gates, which are universal for classical computation, and can be realized by quantum unitaries. Of course, the Toffoli gates have an equal number of inputs and outputs, whereas $O$ has just one output. Nevertheless, the single output bit $O(j)$ will be one of the output bits of a larger oracle $O_{\text{reversible}}: \{0,1\}^n \to \{0,1\}^n$. This is completely equivalent as we may set $O(j) = O_{\text{reversible}}(j)_1$ and disregard all the other output bits. Using standard techniques to uncompute the irrelevant output bits, this is also equivalent to having access to the oracle $O: \{0,1\} \times \{0,1\}^n \to \{0,1\} \times \{0,1\}^n$, where
$$ O(z, j) = \{z \oplus x_j, j\}, \quad \forall z \in \{0,1\},\ j \in \{0,1\}^n. \tag{3.1} $$
As we may implement classical Toffoli gates with quantum Toffoli gates, it is fair to assume that a quantum algorithm has access to a unitary black-box quantum oracle
$$ \hat{O}|z\rangle|j\rangle = |z \oplus x_j\rangle|j\rangle, \quad \forall z \in \{0,1\},\ j \in \{0,1\}^n, \tag{3.2} $$
which may be queried by any arbitrary superposition of states.
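As a small illustration of Eq. 3.2, the oracle can be written out as a permutation matrix. This is a sketch under assumptions: the hidden string `x` is an arbitrary example, its entries are indexed directly by an integer `j` rather than by an $n$-bit label, and the basis ordering `index = z * len(x) + j` is a choice made here for convenience.

```python
import numpy as np


def oracle_matrix(x):
    """Matrix of |z>|j> -> |z XOR x_j>|j>, with basis index = z * len(x) + j."""
    size = len(x)
    U = np.zeros((2 * size, 2 * size))
    for z in (0, 1):
        for j in range(size):
            U[(z ^ x[j]) * size + j, z * size + j] = 1.0
    return U


x = [0, 1, 0, 0, 1, 0, 0, 0]          # example hidden bit string with two marked entries
O = oracle_matrix(x)

# Query with a uniform superposition over j and the flag bit z = 0.
size = len(x)
state = np.zeros(2 * size)
state[:size] = 1.0 / np.sqrt(size)
out = O @ state

# Probability of finding the flag bit equal to 1 after a single query.
prob_flag = np.sum(out[size:] ** 2)
print(prob_flag)
```

Because the map is a permutation of basis states that is its own inverse, the matrix is unitary and a single superposed query already correlates the flag bit with every marked index at once.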
A generic $Q$-query quantum algorithm $\hat{U}$ then intersperses queries to $\hat{O}$ between $Q + 1$ arbitrary unitaries $\hat{U}_k$, that do not depend on properties of $\hat{O}$, like
$$ \hat{U} = \hat{U}_Q \hat{O} \hat{U}_{Q-1} \hat{O} \cdots \hat{U}_1 \hat{O} \hat{U}_0. \tag{3.3} $$
In other cases, one may assume that $\hat{O}$ computes different functions instead of simply returning the value of some built-in bit-string $x$. It may be that $\hat{O} \neq \hat{O}^\dagger$, thus Eq. 3.3 would be modified to insert queries to $\hat{O}^\dagger$ in any order, each with the same cost as $\hat{O}$ as reversing the order of quantum operations is easy on a universal quantum computer. However, in all cases, it is important to keep in mind that counting queries to $\hat{O}$ is only an abstraction of the true quantum gate and qubit cost of any quantum algorithm. In the decision model of quantum computation, $\hat{U}$ acts on a computational basis state $|0\rangle|0\rangle$, and the algorithm succeeds if $f(x)$, encoded in the $z$ bit, is computed with success probability bounded away from $1/2$, e.g. greater than $2/3$. The query complexity of a function $f$ in this model is then the smallest number of queries $Q$ where this is so. Note that in more general quantum algorithms, $f$ need not be a Boolean function. It could, for example, be a desired quantum state, or some unitary.
In the following, our convention for gate complexity counts the number of single-qubit
and two-qubit rotations required to implement a quantum algorithm, excluding that for
implementing 0. Our convention for space complexity counts the number of additional
qubits required, excluding those already used by 0.
3.3
Quantum search and amplitude amplification
The query model may be applied to prove a famous square-root speedup of quantum algorithms over classical algorithms in the problem of searching an unsorted database of size $n$ for a marked element. The quantum algorithm is known as Grover search, and works by preparing a special quantum state that encodes the index of the marked element. This procedure is generalized by the amplitude amplification algorithm for preparing quantum states in general. We review both in this section.
The database of size $n$ in the search problem is modeled as an $n$-bit string $x = x_1 x_2 \ldots x_n$, where marked elements correspond to bits of $x$ being set to 1, and all other bits being set to 0. If there are $m$ marked elements, let the set of indices corresponding to marked elements be $M$ of size $|M| = m$. The oracle provided for this problem returns the value of the $j$th bit $O(j) = x_j$. Solving the search problem entails returning the index $j \in [n]$ of any one such marked element by making some $Q$ queries to $O$. The optimal classical algorithm is to select, without replacement, random numbers $j$ from the set $[n]$. Then with probability $m/n$, the oracle returns $O(j) = 1$, which indicates success. Thus the classical algorithm requires $O(n/m)$ queries on average.
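The quantum algorithm works differently, and relies on querying $\hat{O}$ with a quantum superposition of states in the computational basis. Let the uniform superposition over all states be $|s\rangle$, the uniform superposition over all marked states be $|M\rangle$, and the uniform superposition over all unmarked states be $|M'\rangle$. Note the orthogonality relation $\langle M|M'\rangle = 0$. Thus
$$ |s\rangle = \frac{1}{\sqrt{n}}\sum_{j=1}^{n}|j\rangle, \quad |M\rangle = \frac{1}{\sqrt{m}}\sum_{j\in M}|j\rangle, \quad |M'\rangle = \frac{1}{\sqrt{n-m}}\sum_{j\notin M}|j\rangle. \tag{3.4} $$
On input $|s\rangle_a|0\rangle_b$, where the ancilla register $b$ is of dimension 2 and the system register $a$ is of dimension $n$, the quantum oracle prepares the state
$$ \begin{aligned} |s\rangle_{ab} = \hat{O}|s\rangle_a|0\rangle_b &= \frac{1}{\sqrt{n}}\Big(\sum_{j\in M}|j\rangle_a|1\rangle_b + \sum_{j\notin M}|j\rangle_a|0\rangle_b\Big) \\ &= \sqrt{\frac{m}{n}}\,|M\rangle_a|1\rangle_b + \sqrt{\frac{n-m}{n}}\,|M'\rangle_a|0\rangle_b \\ &= \sin(\theta)|M\rangle_a|1\rangle_b + \cos(\theta)|M'\rangle_a|0\rangle_b \\ &= \sin(\theta)|t\rangle_{ab} + \cos(\theta)|t'\rangle_{ab}, \end{aligned} \tag{3.5} $$
where we have defined $\sin(\theta) = \sqrt{m/n}$ and similarly for $\cos(\theta)$, and in the last line we define the target state $|t\rangle_{ab} = |M\rangle_a|1\rangle_b$, and the orthogonal $|t'\rangle_{ab} = |M'\rangle_a|0\rangle_b$. One can see that measuring the $b$ register returns the state $|1\rangle$ with probability $m/n$, identical to the classical algorithm.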
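To go further, consider the reflection operators
$$ \text{Ref}_{|s\rangle} = \hat{I}_{ab} - 2|s\rangle\langle s|_{ab}, \quad \text{Ref}_{|t\rangle} = \hat{I}_{ab} - 2|t\rangle\langle t|_{ab}, \tag{3.6} $$
which flip the sign of only the states $|s\rangle_{ab}$, $|t\rangle_{ab}$. The product $\text{Ref}_{|s\rangle}\text{Ref}_{|t\rangle}$ is known as the Grover iterate, and has the special property of performing rotations only within the subspace spanned by the two states $|t\rangle_{ab}$, $|t'\rangle_{ab}$. By direct computation, one can represent the state $|s\rangle_{ab}$ and these rotations as matrices in the $|t\rangle_{ab}$, $|t'\rangle_{ab}$ basis:
$$ |s\rangle_{ab} = \begin{pmatrix} \sin(\theta) \\ \cos(\theta) \end{pmatrix}, \quad \text{Ref}_{|s\rangle}\text{Ref}_{|t\rangle} = -\begin{pmatrix} \cos(2\theta) & \sin(2\theta) \\ -\sin(2\theta) & \cos(2\theta) \end{pmatrix}. \tag{3.7} $$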
Thus by direct multiplication of the matrices, we obtain a well-known result. A sequence of $N$ Grover iterates acting on $|s\rangle$ prepares, up to a global sign $(-1)^N$, the state
$$ (\text{Ref}_{|s\rangle}\text{Ref}_{|t\rangle})^N|s\rangle_{ab} = \sin((2N+1)\theta)|t\rangle_{ab} + \cos((2N+1)\theta)|t'\rangle_{ab}. \tag{3.8} $$
By choosing $N \approx \frac{\pi}{4\theta} - \frac{1}{2} = O(1/\theta) = O(\sqrt{n/m})$, the amplitude $\sin((2N+1)\theta)$ of the target state $|t\rangle$ is close to one. Thus measuring the $b$ register of this state returns $|1\rangle$, leaving the $a$ register in the uniform superposition of marked states $|M\rangle$, with probability $O(1)$. Measuring $|M\rangle$ then returns one of the indices of the marked states, selected at random.
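The closed form for $N$ Grover iterates can be checked with a direct state-vector simulation. The sketch below is an illustration only: the database size `n = 64`, the marked indices, and the register ordering `index = 2 * j + b` are arbitrary choices, and the reflections are built as dense matrices rather than as circuits.

```python
import numpy as np

n, marked = 64, [3, 17, 42]
m = len(marked)
theta = np.arcsin(np.sqrt(m / n))

# Register a has n levels and the ancilla b has 2 levels; basis index = 2 * j + b.
x = np.zeros(n, dtype=int)
x[marked] = 1
s = np.zeros(2 * n)
s[2 * np.arange(n) + x] = 1.0 / np.sqrt(n)        # |s>_ab = O |s>_a |0>_b
t = np.zeros(2 * n)
t[2 * np.array(marked) + 1] = 1.0 / np.sqrt(m)    # |t>_ab = |M>_a |1>_b

ref_s = np.eye(2 * n) - 2.0 * np.outer(s, s)
ref_t = np.eye(2 * n) - 2.0 * np.outer(t, t)

# Number of iterates that brings (2N + 1) * theta closest to pi / 2.
N = int(round(np.pi / (4.0 * theta) - 0.5))
state = np.linalg.matrix_power(ref_s @ ref_t, N) @ s

amp_t = t @ state                                  # overlap with the target state
predicted = (-1) ** N * np.sin((2 * N + 1) * theta)
print(N, amp_t, predicted)
```

The sign factor $(-1)^N$ is a global phase of the state and has no effect on measurement probabilities; only the magnitude $|\sin((2N+1)\theta)|$ matters for the success probability.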
Thus this sequence of reflections solves the search problem, and all that remains is to evaluate the query complexity, and the gate complexity of its implementation. The query complexity of $\text{Ref}_{|s\rangle}$ is 2, as seen by rewriting it as
$$ \text{Ref}_{|s\rangle} = \hat{I}_{ab} - 2|s\rangle\langle s|_{ab} = \hat{O}\left(\hat{I}_{ab} - 2|s\rangle\langle s|_a \otimes |0\rangle\langle 0|_b\right)\hat{O}^\dagger. \tag{3.9} $$
Thus the query complexity of the procedure Eq. 3.8 is $Q = 2N + 1 = O(\sqrt{n/m})$, which recovers the famous square-root speedup over the classical case.
The gate complexity is obtained by adding the cost of preparing the uniform superposition $|s\rangle$, and the other reflections. First, note that $|s\rangle$ may be prepared using $\log_2 n$ Hadamard gates, $|s\rangle_a = \text{Had}^{\otimes \log_2 n}|0\rangle_a$. When $n$ is not a power of 2, the scaling in terms of primitive gates is still $O(\log n)$. Second, note that the reflection
$$ \left(\hat{I}_{ab} - 2|s\rangle\langle s|_a \otimes |0\rangle\langle 0|_b\right) = \text{Had}^{\otimes \log_2 n}\left(\hat{I}_{ab} - 2|0\rangle\langle 0|_a \otimes |0\rangle\langle 0|_b\right)\text{Had}^{\otimes \log_2 n} \tag{3.10} $$
may be implemented using $O(\log n)$ Hadamard gates, and a multiply-controlled phase gate over $2n$ dimensions. This multiply-controlled phase gate may be implemented using $O(\log n)$ single and two-qubit gates. Third, the reflection
$$ \text{Ref}_{|t\rangle} = \hat{I}_{ab} - 2|t\rangle\langle t|_{ab} = \hat{I}_{ab} - 2|M\rangle\langle M|_a \otimes |1\rangle\langle 1|_b \tag{3.11} $$
appears to need prior knowledge about the state $|M\rangle$, which cannot be prepared beforehand. However, $\text{Ref}_{|t\rangle}$ only needs to perform reflections in the subspace of $|t\rangle$ and $|t'\rangle$, as these are the only two states that appear in Eq. 3.5. Thus it may be simplified to
$$ \text{Ref}_{|t\rangle} = \hat{I}_a \otimes \left(\hat{I}_b - 2|1\rangle\langle 1|_b\right), \tag{3.12} $$
and may be implemented using a single-qubit gate. Adding these together, the total gate complexity of Grover search is $O(Q \log n)$, and the space complexity over that of the oracle $\hat{O}$ is 0 ancilla qubits.
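The Hadamard conjugation in Eq. 3.10 can be confirmed on a small register. This is a minimal sketch assuming `k = 3` qubits in register $a$ (so $n = 8$) with the single-qubit ancilla $b$ as the second tensor factor; the variable names are illustrative.

```python
from functools import reduce

import numpy as np

k = 3                                   # number of qubits in register a, so n = 2**k
dim = 2**k
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
Hk = reduce(np.kron, [H] * k)           # Hadamard on every qubit of register a
Hk_full = np.kron(Hk, np.eye(2))        # identity on the ancilla b

e0 = np.zeros(dim)
e0[0] = 1.0                             # |0>_a
s = np.full(dim, 1.0 / np.sqrt(dim))    # uniform superposition |s>_a
P0b = np.diag([1.0, 0.0])               # |0><0| on the ancilla b

lhs = np.eye(2 * dim) - 2.0 * np.kron(np.outer(s, s), P0b)
rhs = Hk_full @ (np.eye(2 * dim) - 2.0 * np.kron(np.outer(e0, e0), P0b)) @ Hk_full
print(np.allclose(lhs, rhs))
```

The inner operator on the right-hand side only changes the sign of one computational basis state, which is why it reduces to a single multiply-controlled phase gate.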
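The generalization to amplitude amplification is obtained by treating the terms that appear in the Grover search problem abstractly. Let us define the state preparation oracle
$$ |s\rangle_{ab} = \hat{G}|0\rangle_{ab} = \sin(\theta)|t\rangle_a|1\rangle_b + \cos(\theta)|t'\rangle_{ab}, \tag{3.13} $$
which prepares the state $|s\rangle_{ab}$, from which the target state $|t\rangle_a$, marked by the ancilla $|1\rangle_b$, may be obtained with probability $|\sin(\theta)|^2$. Note that $|t'\rangle_{ab}$ has no support on the ancilla state $|1\rangle_b$, but is otherwise arbitrary. Then we define the reflection operators
$$ \text{Ref}_{|s\rangle} = \hat{I}_{ab} - 2|s\rangle\langle s|_{ab} = \hat{G}\left(\hat{I}_{ab} - 2|0\rangle\langle 0|_{ab}\right)\hat{G}^\dagger = \hat{G}\,\text{Ref}_{|0\rangle}\hat{G}^\dagger, \tag{3.14} $$
$$ \text{Ref}_{|t\rangle} = \hat{I}_a \otimes \left(\hat{I}_b - 2|1\rangle\langle 1|_b\right). \tag{3.15} $$
Thus a sequence of $N$ iterates $\text{Ref}_{|s\rangle}\text{Ref}_{|t\rangle}$ produces, up to a global sign, the state
$$ (\text{Ref}_{|s\rangle}\text{Ref}_{|t\rangle})^N \hat{G}|0\rangle_{ab} = \sin((2N+1)\theta)|t\rangle_a|1\rangle_b + \cos((2N+1)\theta)|t'\rangle_{ab}, \tag{3.16} $$
with a query complexity of $Q = 2N + 1$, and gate complexity $O(Q \log(n))$.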
3.4
Amplitude amplification by partial reflections
The most common generalization of amplitude amplification replaces reflections with partial reflections. This allows one to obtain more general functions of $\theta$ in the amplitude of the target state. Our starting point is Eq. 3.17. We use a slightly different convention here with the state preparation oracle
$$ |s\rangle_{ab} = \hat{G}|0\rangle_a|0\rangle_b = \sin(\theta)|t\rangle_a|0\rangle_b + \cos(\theta)|t'\rangle_{ab}, \tag{3.17} $$
where the target state $|t\rangle_a$ is marked by the $|0\rangle_b$ state. This makes little difference as it can be accomplished by a single Pauli $\hat{X}$ gate. We find it convenient to define $|t\rangle_{ab} = |t\rangle_a|0\rangle_b$. Let us define partial reflections parameterized by phases $\alpha$, $\beta$:
$$ \text{Ref}_{\alpha,|s\rangle} = \hat{I}_{ab} - (1 - e^{-i\alpha})|s\rangle\langle s|_{ab}, \quad \text{Ref}_{\beta,|t\rangle} = \hat{I}_{ab} - (1 - e^{-i\beta})|t\rangle\langle t|_{ab}. \tag{3.18} $$
This leads to the generalized iterate $\text{Ref}_{\alpha,|s\rangle}\text{Ref}_{\beta,|t\rangle}$ which has query cost 2. An $N = 2n + 1$ query sequence of these iterates produces the state
$$ \prod_{k=1}^{n}\text{Ref}_{\alpha_k,|s\rangle}\text{Ref}_{\beta_k,|t\rangle}|s\rangle_{ab} = (iC(\theta) + D(\theta))|t\rangle_{ab} + (A(\theta) - iB(\theta))|t'\rangle_{ab}, \tag{3.19} $$
where $\text{Ref}_{\alpha_1,|s\rangle}\text{Ref}_{\beta_1,|t\rangle}$ acts first on the input, and $A, B, C, D$ are real functions parameterized by $\vec\alpha$, $\vec\beta$. Unfortunately, the dependence of $\vec\alpha$, $\vec\beta$ on any arbitrary choice of $A, B, C, D$ appears quite mysterious. Only in very few cases can the $A, B, C, D$ be specified for arbitrary $N$ and then inverted to obtain a consistent set of $\vec\alpha$, $\vec\beta$ in closed form [136]. For instance, standard amplitude amplification corresponds to $\alpha_k = \beta_k = \pi$.
We resolve this mystery by proving the following result.
Theorem 3.1 (Amplitude amplification with partial reflections). Let $\hat{G}$ be a state preparation unitary acting on the computational basis states $|0\rangle_a \in \mathbb{C}^d$, $|0\rangle_b \in \mathbb{C}^2$ such that $\hat{G}|0\rangle_a|0\rangle_b = \sin(\theta)|t\rangle_a|0\rangle_b + \cos(\theta)|t'\rangle_{ab}$, where $|t'\rangle_{ab}$ has no support on $|0\rangle_b$. Then there exists a quantum circuit $\hat{V}_{\vec\phi}$ that requires $N + 1$ queries to $\hat{G}$, $N$ queries to $\hat{G}^\dagger$, and $O(N \log(d))$ primitive quantum gates precomputed in classical $O(\text{poly}(N))$ time, to prepare the state
$$ \hat{V}_{\vec\phi}|0\rangle_a|0\rangle_b = (iC(\theta) + D(\theta))|t\rangle_a|0\rangle_b + (A(\theta) - iB(\theta))|t'\rangle_{ab}, \tag{3.20} $$
where $C(\theta)$, $D(\theta)$ are any choice of real functions satisfying all the following conditions:
(1) $C(\theta) = C(y)$, $D(\theta) = D(y)$, where $C$, $D$ are odd real polynomials in $y = \sin(\theta)$ of degree at most $2N + 1$;
(2) $\forall y \in [-1, 1]$, $C^2(y) + D^2(y) \le 1$;
(3) $\forall y \ge 1$, $C^2(y) + D^2(y) \ge 1$,
and $A$, $B$ are functions of lesser interest.
This result is quite remarkable as the constraints are lax and allow for many interesting functions. For instance, choosing $C(y) = (-1)^N T_{2N+1}(y) = \sin((2N+1)\theta)$, where $T_{2N+1}$ are Chebyshev polynomials of the first kind, and $D(y) = 0$, recovers the baseline amplitude amplification algorithm.
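The Chebyshev example can be verified numerically, including the sign convention: for odd degree, $T_{2N+1}(\sin\theta) = (-1)^N \sin((2N+1)\theta)$, so the factor $(-1)^N$ is needed to match $\sin((2N+1)\theta)$ exactly. The sketch below uses NumPy's Chebyshev series class; the helper name `baseline_C` and the value `N = 3` are illustrative.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def baseline_C(N):
    """Return (-1)^N * T_{2N+1} as a Chebyshev series on the default domain [-1, 1]."""
    coeffs = np.zeros(2 * N + 2)
    coeffs[2 * N + 1] = (-1) ** N
    return cheb.Chebyshev(coeffs)


N = 3
C = baseline_C(N)
theta = np.linspace(-np.pi / 2, np.pi / 2, 201)
y = np.sin(theta)

# Condition (1): odd polynomial in y that reproduces sin((2N + 1) * theta).
matches_sine = np.allclose(C(y), np.sin((2 * N + 1) * theta))
# Condition (2): bounded by 1 on [-1, 1]. Condition (3): at least 1 in magnitude for y >= 1.
bounded_inside = np.all(C(y) ** 2 <= 1.0 + 1e-12)
large_outside = np.all(C(np.linspace(1.0, 2.0, 50)) ** 2 >= 1.0 - 1e-12)
print(matches_sine, bounded_inside, large_outside)
```

With $D = 0$, the three checks correspond one-to-one to conditions (1) to (3) of the theorem.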
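Proof of Thm. 3.1. Consider the $Q = 2N + 1$ query sequence of Eq. 3.19. Let us re-express the generalized reflection in Eq. 3.18 as:
$$ \text{Ref}_{\alpha,|s\rangle} = \hat{I}_{ab} - (1 - e^{-i\alpha})\hat{G}|0\rangle\langle 0|_{ab}\hat{G}^\dagger = \hat{G}\left(\hat{I}_{ab} - (1 - e^{-i\alpha})|0\rangle\langle 0|_{ab}\right)\hat{G}^\dagger = \hat{G}\,\text{Ref}_{\alpha,|0\rangle}\hat{G}^\dagger. \tag{3.21} $$
If $|0\rangle_{ab}$ is of dimension $2d$, $\text{Ref}_{\alpha,|0\rangle}$ is a conditional phase gate and may be implemented with $O(\log(d))$ primitive gates. As span$\{|t'\rangle_{ab}, |t\rangle_a|0\rangle_b\}$ is an invariant subspace of $\text{Ref}_{\alpha,|s\rangle}\text{Ref}_{\beta,|t\rangle}$, we may represent it equivalently with Pauli matrices $\hat\sigma_{x,y,z}$ through the replacements
$$ \hat{G} \to e^{-i\hat\sigma_y\theta} = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}, \quad \hat{G}^\dagger \to e^{i\hat\sigma_y\theta} = \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix}, $$
$$ \text{Ref}_{\alpha,|0\rangle} \to e^{-i\alpha/2}e^{-i\hat\sigma_z\alpha/2} = e^{-i\alpha/2}\begin{pmatrix} e^{-i\alpha/2} & 0 \\ 0 & e^{i\alpha/2} \end{pmatrix}, \quad \text{Ref}_{\beta,|t\rangle} \to e^{-i\beta/2}e^{i\hat\sigma_z\beta/2} = e^{-i\beta/2}\begin{pmatrix} e^{i\beta/2} & 0 \\ 0 & e^{-i\beta/2} \end{pmatrix}. \tag{3.22} $$
Thus $\text{Ref}_{\alpha,|s\rangle}\text{Ref}_{\beta,|t\rangle} \to e^{-i(\alpha+\beta)/2}e^{-i\hat\sigma_y\theta}e^{-i\hat\sigma_z\alpha/2}e^{i\hat\sigma_y\theta}e^{i\hat\sigma_z\beta/2}$ in this subspace. Though applying $\hat{G}^\dagger$ in general takes us out of the subspace, this operator is always paired with $\hat{G}$ in the Grover iterate and never occurs in isolation, so the representation is faithful. This sequence of alternating $\hat\sigma_{y,z}$ rotations motivates us to define the operator for rotations by angle $\theta$ about an axis in the $\hat\sigma_x$-$\hat\sigma_y$ plane of the Bloch sphere: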
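$$ e^{-i\hat\sigma_\phi\theta} = e^{-i\hat\sigma_z(\phi - \pi/2)/2}e^{-i\hat\sigma_y\theta}e^{i\hat\sigma_z(\phi - \pi/2)/2} = \begin{pmatrix} \cos(\theta) & -ie^{-i\phi}\sin(\theta) \\ -ie^{i\phi}\sin(\theta) & \cos(\theta) \end{pmatrix}, \tag{3.23} $$
where $\hat\sigma_\phi = \cos(\phi)\hat\sigma_x + \sin(\phi)\hat\sigma_y$. We would like to express Eq. 3.19 as a product of just these $Q = 2N + 1$ rotations $e^{-i\hat\sigma_{\phi_k}\theta}$. Thus we replace the input state $\hat{G}|0\rangle_{ab} = \hat{G}e^{i\alpha_0}\text{Ref}_{\alpha_0,|0\rangle}|0\rangle_{ab}$, and add a final reflection $e^{i\beta_{N+1}}\text{Ref}_{\beta_{N+1},|t\rangle}$ to obtain
$$ \hat{V}_{\vec\alpha\vec\beta} = e^{i(\alpha_0 + \beta_{N+1})}\,\text{Ref}_{\beta_{N+1},|t\rangle}\Big(\prod_{k=1}^{N}\text{Ref}_{\alpha_k,|s\rangle}\text{Ref}_{\beta_k,|t\rangle}\Big)\hat{G}\,\text{Ref}_{\alpha_0,|0\rangle}. \tag{3.24} $$
Promised that $\hat{V}_{\vec\alpha\vec\beta}$ always acts on input state $|0\rangle_{ab}$, the fact $\hat{G}|0\rangle_{ab} = e^{-i\hat\sigma_y\theta}|t'\rangle$ permits the representation
$$ \hat{V}_{\vec\alpha\vec\beta} \to e^{i(\alpha_0 + \beta_{N+1} - \sum_{k=1}^{N}(\alpha_k + \beta_k))/2}\; e^{i\hat\sigma_z\beta_{N+1}/2}\Big(\prod_{k=1}^{N}e^{-i\hat\sigma_y\theta}e^{-i\hat\sigma_z\alpha_k/2}e^{i\hat\sigma_y\theta}e^{i\hat\sigma_z\beta_k/2}\Big)e^{-i\hat\sigma_y\theta}e^{-i\hat\sigma_z\alpha_0/2}. \tag{3.25} $$
Since we have the identity $e^{i\hat\sigma_y\theta} = e^{-i\hat\sigma_z\pi/2}e^{-i\hat\sigma_y\theta}e^{i\hat\sigma_z\pi/2}$, and all $e^{-i\hat\sigma_y\theta}$ in Eq. 3.25 are sandwiched between $\hat\sigma_z$ rotations, we replace these with the $\hat\sigma_x$-$\hat\sigma_y$ rotations of Eq. 3.23 and define the composite iterate $\hat{V}_{\vec\phi}$ in Fig. 3-1
$$ \hat{V}_{\vec\phi} = e^{i\Phi}\hat{V}_{\vec\alpha\vec\beta} = \prod_{k=1}^{2N+1}e^{-i\hat\sigma_{\phi_k}\theta} = A(\theta) + iB(\theta)\hat\sigma_z + iC(\theta)\hat\sigma_x + iD(\theta)\hat\sigma_y, \tag{3.26} $$
where $\Phi$, which depends only on $\vec\alpha$, $\vec\beta$, is chosen to cancel the global phase of $\hat{V}_{\vec\alpha\vec\beta}$, $\vec\phi$ depends linearly on $\vec\alpha$, $\vec\beta$ as seen in Figure 3-1, and the decomposition into the Pauli basis is always possible for SU(2) matrices.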
By replacing the product of two-parameter generalized Grover iterates in Eq. 3.19 with a product of more fundamental and simpler one-parameter single-qubit rotations in Eq. 3.26, the structure underlying generalized amplitude amplification is made clearer. As these single-qubit rotations are isomorphic to those considered in QSP$_{2N+1}$, we may apply Lem. 3.2 that characterizes any achievable $(C, D)$. Other choices from Chapter 2, $(A, B)$, $(A, C)$ etc., are also possible. □

[Figure 3-1: three circuit diagrams (top, middle, bottom).]

Figure 3-1: (top) Circuit diagram for amplitude amplification in Eq. 3.16. (middle) Circuit diagram for amplitude amplification by phase-matching in Eq. 3.19. (bottom) Circuit diagram for amplitude amplification $\hat{V}_{\vec\phi}$ by quantum signal processing in Eq. 3.26. Note that we abbreviate the reflection operators as R and drop the state subscript here. The query complexity in all cases is $Q = 2N + 1$, and the gate complexity is $O(Q \log(d))$.
Lemma 3.2 (Achievable $(C, D)$ - Thm. 2.7). For any odd integer $N > 0$, a choice of functions $C$, $D$ in Eq. 3.26 is achievable by some $\vec\phi \in \mathbb{R}^N$ if and only if all the following are true:
(1) $C(\theta) = C(y)$, $D(\theta) = D(y)$, where $C$, $D$ are odd real polynomials of degree at most $N$;
(2) $\forall y \in [-1, 1]$, $C^2(y) + D^2(y) \le 1$;
(3) $\forall y \ge 1$, $C^2(y) + D^2(y) \ge 1$.
Moreover, $\vec\phi \in \mathbb{R}^N$ can be computed in classical $O(\text{poly}(N))$ time.
Flexible amplitude amplification
The application of Thm. 3.1 requires finding a good polynomial approximation, say $D$, to the target function. However, it is not always clear how constraint (3) on properties of the polynomial outside the interval of interest may always be satisfied. We rectify this by adding an additional ancilla qubit to stage a cancellation of the $C$ term. This leads to Thm. 3.3. Subject only to parity and being bounded, we can then implement, without approximation, any arbitrary polynomial of degree exactly equal to the number of queries to the state preparation operator $\hat{G}$. This enables us to compute any real function with a query complexity exactly that of its best polynomial approximation, thus allowing us to transfer powerful results from approximation theory [93] to quantum computation.
Theorem 3.3 (Flexible amplitude amplification). Given a state preparation unitary $\hat{G}$ acting on the computational basis states $|0\rangle_a \in \mathbb{C}^d$, $|0\rangle_b \in \mathbb{C}^2$ such that $\hat{G}|0\rangle_a|0\rangle_b = \sin(\theta)|t\rangle_a|0\rangle_b + \cos(\theta)|t'\rangle_{ab}$, where $|t'\rangle_{ab}$ has no support on $|0\rangle_b$, let $D(\theta)$ be any function that satisfies all the following conditions:
(1) $D(\theta) = D(y)$, where $D$ is an odd real polynomial in $y = \sin(\theta)$ of degree at most $2N + 1$;
(2) $\forall y \in [-1, 1]$, $D^2(y) \le 1$.
Then there exists a quantum circuit $\hat{W}_{\vec\phi}$ that requires $N + 1$ queries to $\hat{G}$, $N$ queries to $\hat{G}^\dagger$, $O(N \log(d))$ primitive quantum gates pre-computed in classical $O(\text{poly}(N))$ time, and an additional single-qubit ancilla $c$, to prepare the state
$$ \hat{W}_{\vec\phi}|0\rangle_a|0\rangle_b|0\rangle_c = D(\theta)|t\rangle_a|0\rangle_b|0\rangle_c + A(\theta)|t'\rangle_{ab}|0\rangle_c + iC(\theta)|t\rangle_a|0\rangle_b|1\rangle_c - iB(\theta)|t'\rangle_{ab}|1\rangle_c, \tag{3.27} $$
for some real functions $A$, $B$, $C$ of lesser interest.
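Proof of Thm. 3.3. Consider the composite iterate in Eq. 3.26 controlled by a single-qubit ancilla register indexed by subscript $c$:
$$ \hat{W}_{\vec\phi} = |+\rangle\langle +|_c \otimes \hat{V}_{\vec\phi} + |-\rangle\langle -|_c \otimes \hat{V}_{\vec\pi - \vec\phi}, \tag{3.28} $$
where $|\pm\rangle = \frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$. Note that this can be implemented by controlling $\text{Ref}_{\alpha_k,|0\rangle}$, $\text{Ref}_{\beta_k,|t\rangle}$ in $\hat{V}_{\vec\phi}$. The number of queries to $\hat{G}$, $\hat{G}^\dagger$ is unchanged and $\hat{G}$, $\hat{G}^\dagger$ need not be controlled unitaries. Thus $\hat{W}_{\vec\phi}$ still has query complexity $Q = 2N + 1$ equal to $\hat{V}_{\vec\phi}$. From the similarity transformation $\hat\sigma_y e^{-i\hat\sigma_\phi\theta}\hat\sigma_y = e^{-i\hat\sigma_{\pi - \phi}\theta}$,
$$ \hat{V}_{\vec\pi - \vec\phi} = A(\theta) - iB(\theta)\hat\sigma_z - iC(\theta)\hat\sigma_x + iD(\theta)\hat\sigma_y, \tag{3.29} $$
where $\vec\pi$ is the vector where all elements are $\pi$. This allows us to stage a cancellation of $C$ when $\hat{W}_{\vec\phi}$ is controlled by the ancilla state $|0\rangle_c$:
$$ \hat{W}_{\vec\phi}|0\rangle_a|0\rangle_b|0\rangle_c = D(\theta)|t\rangle_a|0\rangle_b|0\rangle_c + A(\theta)|t'\rangle_{ab}|0\rangle_c + iC(\theta)|t\rangle_a|0\rangle_b|1\rangle_c - iB(\theta)|t'\rangle_{ab}|1\rangle_c, \tag{3.30} $$
where $|t\rangle_a|0\rangle_b|0\rangle_c$ is our new target state. Thus the amplitude of $D$ on the target state is completely independent of $A$, $B$, $C$ regardless of what they may be. This allows us to directly apply the following result for achievable $D$ in Lem. 3.4. □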
Lemma 3.4 (Achievable (D) - Thm. 2.8). For any odd integer $N > 0$, a choice of function
$D$ in Eq. 5.25 is achievable by some $\vec\phi \in \mathbb{R}^N$ if and only if all the following are true:
(1) $D(\theta) = D(y)$, where $D$ is an odd real polynomial of degree at most $N$;
(2) $\forall y \in [-1, 1]$, $D^2(y) \le 1$.
Moreover, $\vec\phi \in \mathbb{R}^N$ can be computed in classical $O(\mathrm{poly}(N))$ time.
3.6
Conclusion
Our success in solving the generalized amplitude amplification problem is evidence that results
from analog models of quantum computation may transfer seamlessly to understanding and
implementing algorithms meant for digital models of quantum computation. In the query
model, the SU(2) space of a single qubit, as we study here, is isomorphic to the SU(2)
subspace spanned by a uniform superposition of marked states and a uniform superposition
of unmarked states. More general query algorithms beyond quantum search can be built to
calculate a boolean function $f : \{0, 1\}^n \to \{0, 1\}$ that depends only on the number of marked
states (i.e. $f(x) = \tilde{f}(|x|)$ for some $\tilde{f} : \{0, 1, \ldots, n\} \to \{0, 1\}$) and do so with a Grover-type
algorithm of partial reflections. Thus, the same methods introduced here also give a way to
determine how many reflections (analogous to our $L$) and what reflections (analogous to our
$\vec\phi$) are required to compute any particular symmetric boolean function, achieving the known
lower bounds for this problem, which (not) coincidentally are also derived using polynomials
[8]. As examples of this correspondence, among the polynomials presented in Chapter 2,
DCL,I is an optimal solution for OR [136] whereas ML,I is optimal for Majority.
Chapter 4
Sparse Hamiltonian simulation by
quantum signal processing
4.1
Introduction
The simulation of physical systems should be one of the easiest problems solvable on a quantum computer. After all, unitary time-evolution is a natural consequence of Schrödinger's
equation in continuous-time, and nature performs this feat instantaneously with every tick
of the universe. Despite this, the Hamiltonian simulation problem is surprisingly non-trivial
on a digital quantum computer, which appears to be one of the unavoidable consequences of
moving from an analog model of quantum computation based on Hamiltonians to a digital
model based on discrete quantum gates. Ever since Feynman's particularly astute observation,
this field of study has been subject to intense ongoing research.
The first explicit quantum algorithms for Hamiltonian simulation were discovered by
Lloyd [81] for local Hamiltonians, which is particularly notable as most physical systems
interact locally. This was then generalized by Aharonov and Ta-Shma [3] to d-sparse Hamiltonians with at most d non-zero elements in every row. As inputs to the sparse Hamiltonian
model are standard quantum oracles, this model is particularly suited to designing quantum
algorithms, often through quantum walks. Moreover, its use of these oracles allows lower
bounds on the difficulty of the simulation problem to be proven. This versatility has led
to intense interest in improving sparse Hamiltonian simulation algorithms, with many celebrated results over the years [11, 23, 26, 28, 13, 10, 14]. Other modern developments also
move beyond the sparse Hamiltonian model, such as with Hamiltonians described by a linear
combination of unitary matrices [28, 14], or density matrices [82, 70].
Problem 1 (The Hamiltonian simulation problem). Let $t > 0$ be the time of simulation,
and $\epsilon > 0$ be the target approximation error. Given access to a unitary quantum oracle
$\hat{O}$ that describes the Hamiltonian $\hat{H}$, construct a $Q$-query quantum circuit $\hat{V}$ such that
$\|\hat{V} - e^{-i\hat{H}t}\| \le \epsilon$, with the smallest possible $Q$.
The solution depends strongly on how $\hat{O}$ makes information about $\hat{H}$ available. In the
sparse Hamiltonian model, a number of lower bounds on the query cost of approximating
$e^{-i\hat{H}t}$ with error $\epsilon$ are well-known. The "no-fast-forwarding" theorem [25, 10] demands at
least $\Omega(\tau)$ queries independent of $\epsilon$, where $\tau = td\|\hat{H}\|_{\max}$ and $\|\hat{H}\|_{\max}$ is the largest element
of $\hat{H}$ in absolute value, and impressive recent work [13, 10] proved an exact error scaling
of $\Theta\left(\frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ for $\tau = O(1)$. Though this suggests a naive additive lower bound $\Omega\left(\tau + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$
[10], the best prior algorithms approach these factors multiplicatively with either
linear scaling in time $O(\tau\,\mathrm{polylog}(\tau/\epsilon))$ [23] or sub-logarithmic scaling in error $O\left(\tau\frac{\log(\tau/\epsilon)}{\log\log(\tau/\epsilon)}\right)$ [31, 13].
Long unanswered is the existence of an algorithm that is unconditionally optimal, with
implications for the relation between continuous and discrete-time models of physics, and of
interest in problems [120] where $\tau$, $\log(1/\epsilon)$ scale together.
In this chapter, we present an application of quantum signal processing to sparse Hamiltonian simulation. Unlike amplitude amplification in Chapter 3, which has a natural structure isomorphic to the unitaries QSPL, the Hamiltonian simulation problem has no such
symmetry in general. Indeed, a Hamiltonian with only two energy levels would not be the
most interesting system to simulate. Nevertheless, we find that the dynamics of QSPL
are naturally suited to the particular task of computing on eigenvalues of generic unitaries,
which leads to our results by performing this computation on a certain quantum walk model of
Childs [22] and Berry [12]. On one hand, this allows us to obtain the first Hamiltonian simulation algorithm optimal in all parameters with query complexity $O\left(td\|\hat{H}\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$,
which matches the best lower bound. On the other hand, the resulting algorithm is extremely
simple, and has a very small constant space overhead, unlike prior art with an ancilla overhead scaling monotonically with $t$, $\epsilon$.
In Section 4.2, we define the standard quantum oracles that describe sparse Hamiltonians,
and outline the simulation technique of quantum walks used in prior art. In Section 4.3,
we apply the QSPL to obtain a powerful technique for performing computations on the
eigenphases of generic unitary operators. In Section 4.4, we prove our result on Hamiltonian
simulation by combining this technique with that of quantum walks.
4.1.1
Attributions and contributions
The results of this chapter are taken from a significantly restructured preprint of published
joint work [86] with Isaac L. Chuang. The manuscript was written by myself with helpful
discussions and suggestions from my advisor. We thank Cedric Yen-Yu Lin, Robin Kothari,
and Matthew Hastings for insightful discussions, and acknowledge funding by the ARO
Quantum Algorithms Program, the NSF CUA, and NSF RQCC Project No.1111337.
4.2
Quantum walks in sparse Hamiltonian simulation
The difficulty of Hamiltonian simulation depends strongly on the quantum oracle $\hat{O}$ that
provides a description of $\hat{H}$. For instance, in the trivial case, the oracle might describe the
Hamiltonian through its time-evolution operator in the form $\hat{O}|\psi\rangle = e^{-i\hat{H}t}|\psi\rangle$, which solves
the problem. The issues within designing a suitable oracle can be subtle. In general, a
d-sparse matrix $\hat{H}$ on $n$ qubits of exponential dimension $2^n$ has up to $d2^n$ non-zero matrix
elements. If these $O(d2^n)$ numbers are completely random, recording them in a classical circuit would already take exponential time, which would negate the purpose of quantum
simulation.
Thus the standard oracles for sparse Hamiltonians implicitly assume that the $\hat{H}$ are
so structured that there exists an efficient classical algorithm built from $O(\mathrm{poly}(n))$ universal
Boolean gates to compute the matrix elements of $\hat{H}$, given a target row and column index.
Moreover, they also assume that there exists an efficient classical algorithm to compute the
column indices of non-zero elements in every row. Given these classical circuits, one can
efficiently construct quantum oracles with the following properties:
Definition 4.1 (Sparse matrix oracles [12]). Sparse matrices with at most d non-zero elements in every row are specified by two oracles. The oracle $\hat{O}_H|j\rangle|k\rangle|z\rangle = |j\rangle|k\rangle|z \oplus \hat{H}_{jk}\rangle$
queried by $j \in [2^n]$ row and $k \in [2^n]$ column indices returns the value $\hat{H}_{jk} = \langle j|\hat{H}|k\rangle$, with
maximum absolute value $\|\hat{H}\|_{\max} = \max_{jk}|\hat{H}_{jk}|$. The oracle $\hat{O}_F|j\rangle|l\rangle = |j\rangle|f(j, l)\rangle$ queried
by $j \in [2^n]$ row and $l \in [d]$ column indices computes in-place the column index $f(j, l)$ of the
$l$th non-zero entry of the $j$th row.
In this model, the value $\hat{H}_{jk}$ is returned in m-bit binary. For simplicity in evaluating
the query complexity, it is convention to assume that $\hat{H}_{jk}$ is exact by ignoring errors in any
binary approximations. However, the gate complexity of arithmetic operations will account
for finite $m$. Thus $\hat{O}_H$ acts on $2n + m$ qubits, and $\hat{O}_F$ acts on $2n$ qubits.
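As a rough illustration of how these two oracles act on computational basis states, the following sketch models them classically for a toy matrix. The ring-shaped Hamiltonian, the register sizes, and all function names are assumptions made for this example and do not come from the text; only the input/output convention follows Definition 4.1.

```python
# Toy classical model of the sparse-matrix oracles acting on basis states.
# Assumption: H is the adjacency matrix of a ring on 2**n sites (d = 2).
N_QUBITS = 3            # n: the matrix has dimension 2**n
DIM = 2 ** N_QUBITS
SPARSITY = 2            # d: non-zero entries per row
M_BITS = 4              # m: bits used to encode each matrix element


def column_index(j, l):
    """f(j, l): column of the l-th non-zero entry in row j (ring neighbours)."""
    return (j + 1) % DIM if l == 0 else (j - 1) % DIM


def matrix_element(j, k):
    """H_jk as an m-bit integer: 1 for ring neighbours, 0 otherwise."""
    return 1 if (j - k) % DIM in (1, DIM - 1) else 0


def oracle_f(j, l):
    """O_F |j>|l> = |j>|f(j, l)> on computational basis states."""
    return j, column_index(j, l)


def oracle_h(j, k, z):
    """O_H |j>|k>|z> = |j>|k>|z XOR H_jk> on computational basis states."""
    return j, k, z ^ matrix_element(j, k)


# Walk over the non-zero entries of row 5 using only the two oracles.
row = 5
entries = []
for l in range(SPARSITY):
    _, col = oracle_f(row, l)
    _, _, value = oracle_h(row, col, 0)
    entries.append((col, value))
```

Because the value is written by bitwise XOR, applying `oracle_h` twice restores the `z` register, which is what makes the corresponding quantum operation reversible.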
Quantum walks are a technique applicable to transforming the description of a Hamiltonian $\hat{H}$ provided by oracles in Def. 4.1, into a unitary quantum walk operator $\hat{W}$ with
eigenphases non-linearly related to the eigenvalues of $\hat{H}|\lambda\rangle = \lambda|\lambda\rangle$. We state the key features
of a particular quantum walk defined by Childs [23] and Berry [12].
In this quantum walk, one query each to $\hat{O}_F$, $\hat{O}_H$, $\hat{O}_H^\dagger$ and $O(n + m\,\mathrm{poly}(\log m))$
primitive gates suffice to implement an isometry $\hat{T}$ that maps every state $|\lambda\rangle|0\rangle^{\otimes n+m+2}$ onto
two eigenstates $|\lambda_\pm\rangle$ of $\hat{W}$:
$$\hat{T}|\lambda\rangle = (|\lambda_+\rangle + |\lambda_-\rangle)/\sqrt{2}. \tag{4.1}$$
Moreover, $\hat{T}$ is constructed such that the walk operator $\hat{W} = i\hat{S}(2\hat{T}\hat{T}^\dagger - \hat{I})$ has eigenvalues
$\hat{W}|\lambda_\pm\rangle = e^{i\theta_{\lambda\pm}}|\lambda_\pm\rangle$ defined by
$$\theta_{\lambda\pm} = \pm\arcsin\left(\lambda/(\|\hat{H}\|_{\max}d)\right) + (1 \mp 1)\pi/2, \tag{4.2}$$
that depend on the $\hat{H}$ eigenvalues $\lambda$. As $\hat{W}$ corresponds to reflection about $\hat{T}\hat{T}^\dagger$ followed
by swapping $(2n + 2)$-qubit registers with $\hat{S}$, its query and gate complexities are identical
to $\hat{T}$ up to constant factors.
Hamiltonian simulation is achieved by creatively applying $\hat{W}$ some number of times in
order to implement a state-dependent eigenphase $|\lambda_\pm\rangle \to e^{-i\lambda t}|\lambda_\pm\rangle$ independent of the $\pm$
index. Uncomputing with $\hat{T}^\dagger$ then maps $|\lambda_\pm\rangle$ back onto $|\lambda\rangle|0\rangle^{\otimes n+m+2}$ with the desired
phase evolution. However, some difficulties arise. First, the applied phase $\theta_{\lambda\pm}$ is nonlinear
in $\lambda$. Second, each eigenstate $|\lambda_\pm\rangle$ evolves under $\hat{W}$ with phases in opposite directions. Thus
uncomputing with $\hat{T}^\dagger$ does not map $\hat{W}\hat{T}|\lambda\rangle|0\rangle^{\otimes n+m+2}$ back onto the basis $|\lambda\rangle|0\rangle^{\otimes n+m+2}$.
In [10], these are overcome by approximating the unitary transformation in Eq. 4.4 with
target function
$$h(\theta) = -\tau\sin(\theta) \;\Rightarrow\; h(\theta_{\lambda\pm}) = -\lambda t, \tag{4.3}$$
resulting in the desired phase, but implemented using a technique combining a linear combination of $N$ controlled-$\hat{W}^{1,\ldots,N}$ such that the success probability decays with $N$. This
necessitates amplitude amplification on shorter segments $e^{-i\hat{H}t/M}$ each with error $\epsilon/M$,
thus invariably mixing sub-optimal factors of $\tau$ and $\epsilon$ in the query complexity.
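The claim that the target function of Eq. 4.3 yields the same phase $-\lambda t$ on both walk eigenstates can be checked numerically. The sketch below evaluates Eq. 4.2 and Eq. 4.3 for one eigenvalue; the numerical values of $d$, $\|\hat{H}\|_{\max}$, $t$ and $\lambda$ are arbitrary assumptions chosen only for illustration.

```python
import math


def walk_eigenphases(lam, d, h_max):
    """Eigenphases (theta_plus, theta_minus) of the walk operator, following Eq. 4.2."""
    base = math.asin(lam / (h_max * d))
    theta_plus = base                # +arcsin(...) + (1 - 1) * pi / 2
    theta_minus = -base + math.pi    # -arcsin(...) + (1 + 1) * pi / 2
    return theta_plus, theta_minus


def target_phase(theta, tau):
    """Target function h(theta) = -tau * sin(theta) of Eq. 4.3."""
    return -tau * math.sin(theta)


# Hypothetical example parameters.
d, h_max, t = 4, 0.5, 3.0
tau = t * d * h_max
lam = 0.7

theta_plus, theta_minus = walk_eigenphases(lam, d, h_max)
h_plus = target_phase(theta_plus, tau)
h_minus = target_phase(theta_minus, tau)
```

Both `h_plus` and `h_minus` agree with `-lam * t` up to floating-point error, even though the two eigenphases themselves differ.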
4.3
Eigenphase transformations by quantum signal processing
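Given an arbitrary unitary oracle $\hat{W}$ with eigenstates $\hat{W}|u_\lambda\rangle = e^{i\theta_\lambda}|u_\lambda\rangle$, we now consider
the general problem of approximating a quantum circuit $\hat{V}_{\mathrm{ideal}}$ whose eigenphases are those
of $\hat{W}$ transformed by some arbitrary real periodic function $h$:
$$\hat{W} = \sum_\lambda e^{i\theta_\lambda}|u_\lambda\rangle\langle u_\lambda| \;\to\; \hat{V}_{\mathrm{ideal}} = \sum_\lambda e^{ih(\theta_\lambda)}|u_\lambda\rangle\langle u_\lambda| \tag{4.4}$$
by querying the controlled-$\hat{W}$ operator. The main result of this section is a solution obtained
by an unusual application of the class of unitary functions QSPN.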
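Theorem 4.2 (Eigenphase transformations by QSPN). $\forall$ real odd periodic functions $h :
(-\pi, \pi] \to (-\pi, \pi]$ and even $N > 0$, let $(A(\theta), C(\theta))$ be real Fourier series in $(\cos(k\theta), \sin(k\theta))$,
where $k = 0, \ldots, N/2$, that approximate
$$\max_{\theta\in\mathbb{R}} |A(\theta) + iC(\theta) - e^{ih(\theta)}| \le \epsilon. \tag{4.5}$$
Given the unitary $\hat{W} = \sum_\lambda e^{i\theta_\lambda}|u_\lambda\rangle\langle u_\lambda|$, and functions $A, C$, there exists a unitary quantum
circuit $\hat{V}$ that requires $N/2$ queries to controlled-$\hat{W}$, $N/2$ queries to controlled-$\hat{W}^\dagger$, and
$O(N)$ single-qubit gates, such that $\langle+|\hat{V}|+\rangle$ approximates $\hat{V}_{\mathrm{ideal}} = \sum_\lambda e^{ih(\theta_\lambda)}|u_\lambda\rangle\langle u_\lambda|$ with
success probability $p \ge 1 - 16\epsilon$ and trace distance
$$\epsilon_{\mathrm{Tr}} = \max_{|\psi\rangle} \left\|\left(\langle+|\hat{V}|+\rangle - \hat{V}_{\mathrm{ideal}}\right)|\psi\rangle\right\| \le 8\epsilon. \tag{4.6}$$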
Two key properties distinguish Thm. 4.2 from routines that can effect similar transformations, such as quantum phase estimation [98] or linear-combination-of-unitaries [28, 10, 14],
which require a large number of ancilla qubits. First is its intuitive use of just a single ancilla
qubit. Second, the query complexity of the methodology is exactly the degree $N$ of optimal
trigonometric polynomial approximations to $e^{ih(\theta)}$ with error $\epsilon$ [92, 102, 105, 126], without
the decaying success probability of prior art. These are analogous to digital filter design
techniques in discrete-time signal processing [101], and allow for some transfer of knowledge
from the vast field of function approximation [105].
The remainder of this section is dedicated to proving Thm. 4.2. Let us recall the class
of unitary functions QSPL in Def. 2.4 constructed from a length $L$ sequence of single-qubit
rotations
$$\hat{V}(\theta) = \hat{R}_{\phi_L}(\theta)\hat{R}_{\phi_{L-1}}(\theta)\cdots\hat{R}_{\phi_1}(\theta) = A(\theta)\hat{I} + iB(\theta)\hat\sigma_z + iC(\theta)\hat\sigma_x + iD(\theta)\hat\sigma_y, \tag{4.7}$$
where $\hat{R}_\phi(\theta) = e^{-i\frac{\theta}{2}(\hat\sigma_x\cos\phi + \hat\sigma_y\sin\phi)}$. The complete classification of possible functions $A, B, C, D$
is described in Chapter 2. Using Chebyshev polynomials of the first and second kind
$T_k(\cos(\theta)) = \cos(k\theta)$ and $U_k(\cos(\theta)) = \sin((k+1)\theta)/\sin(\theta)$, we may map the results in Thm. 2.7,
from trigonometric polynomials to a Fourier series:
Lemma 4.3 (Achievable (A,C) adapted from Cor. 2.10). $\forall$ even $N > 0$, a choice of real
functions $A, C$ can be implemented by some $\vec\phi \in \mathbb{R}^N$ if and only if all these are true:
(1) $\forall\theta \in \mathbb{R}$, $A^2(\theta) + C^2(\theta) \le 1$.
(2) $A(0) = 1$.
(3) $A(\theta) = \sum_{k=0}^{N/2} a_k\cos(k\theta)$, $\{a_k\} \in \mathbb{R}^{N/2+1}$.
(4) $C(\theta) = \sum_{k=1}^{N/2} c_k\sin(k\theta)$, $\{c_k\} \in \mathbb{R}^{N/2}$.
Moreover, $\vec\phi$ can be efficiently computed from the functions $A, C$.
Figure 4-1: Quantum circuits mapping (a) a sequence of single-qubit rotations $\hat{V}(\theta)$ to (d)
quantum signal processing $\hat{V}$. Each single-qubit rotation $\hat{R}_\phi(\theta)$ is replaced by (b) $\hat{U}_\phi$, built
from Hadamard gates and controlled-$\hat{W}$ with eigenstates $\hat{W}|u_\lambda\rangle = e^{i\theta_\lambda}|u_\lambda\rangle$. Thus (c) $\hat{U}_\phi$ on
input $|u_\lambda\rangle$ reduces to a single-qubit rotation $\hat{R}_\phi(\theta_\lambda)$. By linearity, $\hat{V}$ on an arbitrary input
$|\psi\rangle$ may be understood as rotations $\hat{V}(\theta_\lambda)$ controlled by a superposition of $|u_\lambda\rangle$. By some
choice of single-qubit input state and measurement basis, coefficients of the $|u_\lambda\rangle$ are then
rescaled by the components of the function $\hat{V}(\theta_\lambda)$ programmed by $\vec\phi$.
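These results map, in three steps, to a quantum circuit that approximates Equation 4.4.
(a) Signal transduction of $\hat{W}$ into a signal unitary classically controlled by $\phi \in \mathbb{R}$:
$$\hat{U}_\phi = \sum_\lambda \hat{R}_\phi(\theta_\lambda) \otimes |u_\lambda\rangle\langle u_\lambda|. \tag{4.8}$$
This is implemented in Fig. 4-1b with one controlled-$\hat{W}$, which is always possible on a
quantum computer in the worst-case by replacing all of its gates with controlled versions,
and $O(1)$ single-qubit rotations:
$$\hat{U}_\phi = (e^{-i\phi\hat\sigma_z/2} \otimes \hat{I})\,\hat{U}_0\,(e^{i\phi\hat\sigma_z/2} \otimes \hat{I}), \quad \hat{U}_0 = |+\rangle\langle+| \otimes \hat{I} + |-\rangle\langle-| \otimes \hat{W} = \sum_\lambda e^{i\theta_\lambda/2}\hat{R}_0(\theta_\lambda) \otimes |u_\lambda\rangle\langle u_\lambda|, \tag{4.9}$$
where $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$. As $\hat{U}_\phi$ acting on $|u_\lambda\rangle$ selects the rotation $\hat{R}_\phi(\theta_\lambda) = \langle u_\lambda|\hat{U}_\phi|u_\lambda\rangle$ as seen
in Fig. 4-1c, these are precisely the single-qubit rotations in Eq. 4.7 with rotation angle $\theta_\lambda$
controlled by the $\lambda$ index, but with an additional global phase $e^{i\theta_\lambda/2}$.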
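The identity in Eq. 4.9 can be checked on a single eigenstate, where the controlled walk operator reduces to a 2x2 matrix on the ancilla. The following is a minimal sketch under that assumption; the helper names and the test angles are illustrative and not part of the text.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rotation(phi, theta):
    """R_phi(theta) = exp(-i (theta/2) (sigma_x cos(phi) + sigma_y sin(phi)))."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s * cmath.exp(-1j * phi)],
            [-1j * s * cmath.exp(1j * phi), c]]


def signal_block(phi, theta):
    """Ancilla block of U_phi on an eigenstate of W with eigenphase theta."""
    e = cmath.exp(1j * theta)
    # U_0 restricted to the eigenstate: |+><+| + e^{i theta} |-><-|
    u0 = [[(1 + e) / 2, (1 - e) / 2],
          [(1 - e) / 2, (1 + e) / 2]]
    z_left = [[cmath.exp(-1j * phi / 2), 0], [0, cmath.exp(1j * phi / 2)]]
    z_right = [[cmath.exp(1j * phi / 2), 0], [0, cmath.exp(-1j * phi / 2)]]
    return matmul(z_left, matmul(u0, z_right))


phi, theta = 0.8, 1.3   # arbitrary test angles
block = signal_block(phi, theta)
expected = [[cmath.exp(1j * theta / 2) * x for x in row]
            for row in rotation(phi, theta)]
max_diff = max(abs(block[i][j] - expected[i][j])
               for i in range(2) for j in range(2))
```

Here `max_diff` is at the level of floating-point rounding, consistent with the block being the rotation $\hat{R}_\phi(\theta_\lambda)$ multiplied by the phase $e^{i\theta_\lambda/2}$.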
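(b) Signal transformation by computing unitary functions $\hat{V}(\theta_\lambda)$ over a superposition
of $\theta_\lambda$ on the single-qubit ancilla through the simple circuit of Fig. 4-1d:
$$\hat{V} = \hat{U}^\dagger_{\phi_N+\pi}\hat{U}_{\phi_{N-1}}\cdots\hat{U}^\dagger_{\phi_2+\pi}\hat{U}_{\phi_1}, \quad \vec\phi \in \mathbb{R}^N, \; N \text{ even}. \tag{4.10}$$
As this invokes $\hat{W}$ a number $N$ times, its query cost is $O(N)$. Note that the unwanted phase
$e^{i\theta_\lambda/2}$ can be uncomputed by alternating between $\hat{U}_\phi$ and $\hat{U}^\dagger_{\phi+\pi}$, since $\hat{R}_\phi(\theta) = \hat{R}^\dagger_{\phi+\pi}(\theta)$ and
$N$ is even.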
(c) Signal projection of the ancilla onto some basis, to select desired components
of $\hat{V}(\theta_\lambda)$ in Eq. 4.7. As the desired phase transformation can be implemented through
$A(\theta), C(\theta)$, consider the input state $|+\rangle|u_\lambda\rangle$, and postselect on measuring $\langle+|$. Other choices
are of course possible. This applies onto state $|u_\lambda\rangle$ the coefficient
$$\langle+|\hat{V}|+\rangle|u_\lambda\rangle = (A(\theta_\lambda) + iC(\theta_\lambda))|u_\lambda\rangle, \quad p = \min_{\theta\in\mathbb{R}}|\langle+|\hat{V}(\theta)|+\rangle|^2 = \min_{\theta\in\mathbb{R}}|A(\theta) + iC(\theta)|^2, \tag{4.11}$$
with worst-case success probability $p$. Thus (a)-(c) provide a reduction from finding quantum algorithms for approximating $\hat{V}_{\mathrm{ideal}}$ to finding Fourier approximations $A(\theta) + iC(\theta)$
to $e^{ih(\theta)}$.
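Steps (b) and (c) can be illustrated on a single eigenstate: alternating $\hat{U}_\phi$ with $\hat{U}^\dagger_{\phi+\pi}$ should reproduce the plain product of rotations in Eq. 4.7 with no leftover phase, and projecting onto $|+\rangle$ should return $A + iC$. The sketch below assumes the 2x2 ancilla blocks from step (a); the phase list and the eigenphase are arbitrary test values, not taken from the text.

```python
import cmath
import math

IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def rotation(phi, theta):
    """R_phi(theta) = exp(-i (theta/2) (sigma_x cos(phi) + sigma_y sin(phi)))."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s * cmath.exp(-1j * phi)],
            [-1j * s * cmath.exp(1j * phi), c]]


def signal_block(phi, theta):
    """Ancilla block of U_phi: the rotation times the phase e^{i theta / 2}."""
    phase = cmath.exp(1j * theta / 2)
    return [[phase * x for x in row] for row in rotation(phi, theta)]


def qsp_block(phis, theta):
    """Ancilla block of V in Eq. 4.10; phis[0] is applied first, len(phis) is even."""
    v = IDENTITY
    for k, phi in enumerate(phis, start=1):
        if k % 2 == 1:
            u = signal_block(phi, theta)
        else:
            u = dagger(signal_block(phi + math.pi, theta))
        v = matmul(u, v)
    return v


def rotation_product(phis, theta):
    """Plain product of single-qubit rotations, as in Eq. 4.7."""
    v = IDENTITY
    for phi in phis:
        v = matmul(rotation(phi, theta), v)
    return v


phis = [0.3, 1.1, -0.7, 2.0]   # arbitrary phases, N = 4
theta = 0.9                    # arbitrary eigenphase of W
circuit = qsp_block(phis, theta)
ideal = rotation_product(phis, theta)

# Pauli components V = A I + i B sigma_z + i C sigma_x + i D sigma_y.
A = (circuit[0][0] + circuit[1][1]) / 2
C = (circuit[0][1] + circuit[1][0]) / 2j
plus_amplitude = sum(circuit[i][j] for i in range(2) for j in range(2)) / 2
```

In this sketch `A` and `C` come out real up to rounding, and `plus_amplitude` equals `A + 1j * C`, which is the coefficient applied to $|u_\lambda\rangle$ in Eq. 4.11.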
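The error of approximation may now be evaluated. Given $A(\theta), C(\theta)$ that satisfy Eq. 4.5,
conditions (1), (2) of Lem. 4.3 will not generally be satisfied. (1) is violated as the maximum
of $|A(\theta)|, |C(\theta)|$ is at worst $1 + \epsilon$. Thus we rescale
$$A_1(\theta) = A(\theta)/(1 + \epsilon), \quad C_1(\theta) = C(\theta)/(1 + \epsilon), \quad |(A_1(\theta) + iC_1(\theta)) - e^{ih(\theta)}| \le \epsilon/(1 + \epsilon) + \epsilon < 2\epsilon, \tag{4.12}$$
at the cost of a slightly larger error $2\epsilon$. Note that $A_1^2 + C_1^2 \ge \left(\frac{1-\epsilon}{1+\epsilon}\right)^2$. (2) is violated as
$A_1(0) = \cos\delta \ge \frac{1-\epsilon}{1+\epsilon}$ for some $\delta \in \mathbb{R}$. Fixing this is more involved. As $\hat{V}(\theta)$ is unitary,
$A_1^2 + B^2 + C_1^2 + D^2 = 1$. We can apply the prescription in [89] using polynomial sum-of-squares to compute the unspecified $B, D$ from $A_1, C_1$ such that $B, D$ are of the form (3) and
(4) respectively. Thus $A_1^2(0) + B^2(0) = 1$, and $|B(0)| = |\sin\delta|$. Define
$$A_2(\theta) = A_1(\theta)\cos\delta + B(\theta)\sin\delta, \quad |A_2(\theta) - A_1(\theta)| \le \frac{2\epsilon}{1 + \epsilon} + |B(\theta)|\frac{2\sqrt{\epsilon}}{1 + \epsilon} < 6\epsilon. \tag{4.13}$$
This introduces an additional error by using the triangle inequality and $B^2 \le 1 - A_1^2 - C_1^2 \le
1 - \left(\frac{1-\epsilon}{1+\epsilon}\right)^2 = \frac{4\epsilon}{(1+\epsilon)^2}$. By construction, $A_2(0) = 1$. The functions $A_2(\theta), C_1(\theta)$ thus satisfy
Lem. 4.3. By adding the errors in Eqs. 4.12, 4.13, the distance of $\langle+|\hat{V}|+\rangle$ from $\hat{V}_{\mathrm{ideal}}$ in
Eq. 4.4 and the worst-case success probability in Eq. 4.11 are
$$\epsilon_{\mathrm{Tr}} \le \max_{\theta\in\mathbb{R}}|A_2(\theta) + iC_1(\theta) - e^{ih(\theta)}| \le 8\epsilon, \quad p \ge (1 - 8\epsilon)^2 > 1 - 16\epsilon. \tag{4.14}$$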
4.4
Optimal sparse Hamiltonian simulation
Hamiltonian simulation by applying quantum signal processing in Thm. 4.2 to the quantum
walk of Section 4.2 requires a good Fourier approximation to
$$A(\theta) + iC(\theta) \approx e^{-i\tau\sin(\theta)}, \tag{4.15}$$
which is provided by the Jacobi-Anger expansion [1]
$$\cos(\tau\sin(\theta)) = J_0(\tau) + 2\sum_{k\,\mathrm{even}>0}^{\infty} J_k(\tau)\cos(k\theta), \quad \sin(\tau\sin(\theta)) = 2\sum_{k\,\mathrm{odd}>0}^{\infty} J_k(\tau)\sin(k\theta), \tag{4.16}$$
where $J_k(\tau)$ are Bessel functions of the first kind. Note that these Fourier series are already
in the form required by conditions (3), (4) of Lem. 4.3. As $|J_k(\tau)| \le \frac{1}{|k|!}\left|\frac{\tau}{2}\right|^{|k|}$ [1] decays
rapidly with $k$, good approximations are obtained by truncating Eq. 4.16 at $k > N/2$. This
approximates $e^{-i\tau\sin(\theta)}$ with error shown in [10] for $\tau \le N/2 = q - 1$ to be
$$\epsilon \le \sum_{k=q}^{\infty} 2|J_k(\tau)| \le \frac{4\tau^q}{2^q q!} = O\left(\left(\frac{e\tau}{2q}\right)^q\right). \tag{4.17}$$
Inserting into Thm. 4.2, the query complexity of Hamiltonian simulation follows by solving
Eq. 4.17 for $N$, using the implementation of $\hat{U}_\phi$ in Eq. 4.9 with $O(1)$ queries, and that $\hat{V}$ in
Eq. 4.10 contains $N$ applications of $\hat{U}_\phi$.
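The truncation error of the Jacobi-Anger series can be compared numerically against the bound $4\tau^q/(2^q q!)$. The sketch below is a rough check rather than the procedure of [10]: it evaluates Bessel functions from their power series (adequate for small $\tau$), and the values of $\tau$ and $N/2$ are arbitrary assumptions.

```python
import cmath
import math


def bessel_j(k, x, terms=40):
    """Bessel function J_k(x), k >= 0, from its power series (fine for small x)."""
    return sum((-1) ** m * (x / 2) ** (2 * m + k)
               / (math.factorial(m) * math.factorial(m + k))
               for m in range(terms))


def truncated_series(theta, coeffs):
    """Jacobi-Anger approximation to exp(-i tau sin(theta)) using J_0 .. J_{N/2}."""
    a = coeffs[0]
    s = 0.0
    for k in range(1, len(coeffs)):
        if k % 2 == 0:
            a += 2 * coeffs[k] * math.cos(k * theta)
        else:
            s += 2 * coeffs[k] * math.sin(k * theta)
    return complex(a, -s)   # cos(tau sin) - i sin(tau sin)


def max_error(tau, half_n, samples=2001):
    """Largest deviation from exp(-i tau sin(theta)) on a uniform grid of theta."""
    coeffs = [bessel_j(k, tau) for k in range(half_n + 1)]
    worst = 0.0
    for i in range(samples):
        theta = -math.pi + 2 * math.pi * i / (samples - 1)
        exact = cmath.exp(-1j * tau * math.sin(theta))
        worst = max(worst, abs(truncated_series(theta, coeffs) - exact))
    return worst


tau, half_n = 2.0, 6        # hypothetical values with tau <= N/2
q = half_n + 1
error = max_error(tau, half_n)
bound = 4 * tau ** q / (2 ** q * math.factorial(q))
```

For these parameters the measured `error` stays below `bound`, as Eq. 4.17 predicts whenever $\tau \le q - 1$.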
The optimality of this result for all input parameters follows from known lower bounds.
Specifically, Eq. 4.17 is matched with a corresponding lower bound $N = \Omega(q)$ [10, 73] for
any $q$ satisfying
C< 2
sin
(4.18)
Note that Eqs. 4.17, 4.18 are solved by the Lambert W-function [32] which captures
the detailed trade-off between $\tau$ and $\epsilon$. Its asymptotic behavior may be understood by
substituting $q = (\tau + \gamma)$, where $\tau, \gamma > 0$. When $\tau = O\left(\frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$, one finds $\gamma = O\left(\frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$.
Thus we express the complexity of Hamiltonian simulation as
Theorem 4.4 (Optimal sparse Hamiltonian simulation). A d-sparse Hamiltonian $\hat{H}$ on $n$
qubits with matrix elements specified to $m$ bits of precision can be simulated for time-interval
$t$, error $\epsilon$, and success probability at least $1 - 2\epsilon$ with $O\left(td\|\hat{H}\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries and
a factor $O(n + m\,\mathrm{polylog}(m))$ additional quantum gates.
This is valid for $\tau = O\left(\frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ and stronger than prior art [13, 10] which assumes
$\tau = O(1)$. Unlike most Hamiltonian simulation algorithms, the query cost is additive in the
simulation length $\tau$ and the target error $\epsilon$. As such, the $\tau$ term matches the lower bound
$\Omega(\tau)$ [10] with no multiplicative dependence on error.
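To get a feel for the additive trade-off between $\tau$ and $\epsilon$, one can solve the truncation bound $4\tau^q/(2^q q!) \le \epsilon$ of Eq. 4.17 for the smallest admissible $q$ by direct search. This is only an illustrative sketch of that step (it does not use the Lambert W-function), and the sample values of $\tau$ and $\epsilon$ are assumptions.

```python
import math


def truncation_bound(tau, q):
    """Upper bound 4 tau^q / (2^q q!) on the Jacobi-Anger truncation error."""
    return 4 * (tau / 2) ** q / math.factorial(q)


def queries_needed(tau, eps):
    """Smallest even N with N/2 = q - 1 >= tau whose error bound is at most eps."""
    q = max(1, math.ceil(tau) + 1)   # the bound assumes tau <= q - 1
    while truncation_bound(tau, q) > eps:
        q += 1
    return 2 * (q - 1)


# Hypothetical sweep: the count grows with tau and with log(1/eps).
table = {(tau, eps): queries_needed(tau, eps)
         for tau in (1.0, 10.0, 100.0)
         for eps in (1e-3, 1e-6, 1e-9)}
```

For example, `queries_needed(1.0, 1e-6)` evaluates to 14 under this bound, while for large `tau` the count is dominated by the linear term and changes little as `eps` shrinks.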
Chapter 5
Standard-form Hamiltonian
simulation by qubitization
5.1
Introduction
Previously in Chapter 4, we presented a simulation algorithm based on QSPL for sparse
Hamiltonians that was optimal with respect to all parameters: time $t$, sparsity $d$, max-norm
$\|\hat{H}\|_{\max}$, and error $\epsilon$. However, the quantum oracles that describe sparse Hamiltonians are
only one of several viable alternatives. For instance, some Hamiltonians are more naturally
described by a linear combination of unitary matrices [28, 14] or density matrices [82, 70].
It could be possible that approximating time-evolution by those Hamiltonians would require fundamentally different simulation techniques. Thus QSPL, which relied crucially on
properties of a particular quantum walk, might not be applicable.
In this Chapter, we present a very general 'standard-form' encoding of Hamiltonians $\hat{H}$
that is compatible with most known input models. More precisely, the standard-form may be
simulated exactly with a constant number of queries to those other quantum oracles that
describe Hamiltonians. Using a procedure we call 'qubitization', we impose an SU(2)-like
structure analogous to quantum walks, which enables the application of QSPL for computing
on eigenvalues of $\hat{H}$. Its application to Hamiltonian simulation generalizes our previous
optimal sparse Hamiltonian simulation algorithm, and also furnishes simulation algorithms
for other input models that significantly improve upon prior art in both performance and
simplicity. Whereas Chapter 4 focused on implementing unitary operator functions of $\hat{H}$,
we consider in greater generality the classes of non-unitary functions of $\hat{H}$ that may be
computed, which enables new approaches to problems in Table 5.1 such as quantum linear
systems and Gibbs sampling, which essentially compute functions $\hat{H}^{-1}$ and $e^{-\beta\hat{H}}$.
In Section 5.2, we introduce the standard-form encoding of matrices and show how a
number of common oracles describing matrices map easily to it. However, it is not clear how
computations on the encoded matrices may be performed. This is rectified in Section 5.3,
where we introduce a procedure called qubitization that imposes a qubit structure onto
the standard-form. In Section 5.4, this qubit structure allows the application of QSPL
to computing broad classes of polynomial functions of $\hat{H}$, which we enumerate in detail.
As this qubit structure also resembles that of quantum walks, it also allows in Section 5.5
the computation of the time-evolution operator using the same quantum signal processing
techniques of Chapter 4.
Problem      BCCKS [10]                  d-sparse [86]         Evolution by $\rho$    QLSP [27]    Gibbs [29]
$\hat{H}$          Hamiltonian                 Hamiltonian           Density matrix       Matrix       Hamiltonian
$\hat{U}$          Selects $\hat{U}_i$               Isometry $\hat{T}$          SWAP                 Any          Any
$|G\rangle$        $\hat{U}_j$ coefficients          Identity              Purified $\rho$          Any          Any
Solution     $e^{-i\hat{H}t}$                   $e^{-i\hat{H}t}$                                                $e^{-\beta\hat{H}}$
Table 5.1: List of six example problems (top row), solvable using quantum signal processing
and qubitization to compute an operator function $f[\cdot]$ of $\hat{H}$, the Hermitian component of
$\hat{C} = (\langle G|_a \otimes \hat{I}_s)\hat{U}(|G\rangle_a \otimes \hat{I}_s)$. Through qubitization, the scope of inputs to Quantum Linear
Systems Problem (QLSP) and Gibbs Sampling (Gibbs) can be any $\hat{H}$ of this form, either
indirectly through Hamiltonian simulation, or directly through quantum signal processing
on the standard-model.
5.1.1
Attributions and contributions
This section is based on submitted joint work [84] with Isaac L. Chuang, but significantly
rewritten with some new content. The manuscript was written by myself with helpful
discussions and suggestions from my advisor. We thank Robin Kothari, Yuan Su, and
Andrew Childs for insightful discussions, and acknowledge funding by the ARO Quantum
Algorithms Program and NSF RQCC Project No.1111337.
5.2
The standard-form encoding of matrices
The standard-form encoding provides a natural model through which information about
some matrix of interest is made available to a quantum computer. We assume that a
complex matrix C is encoded within a unitary quantum oracle U, the signal oracle, in the
following manner
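Definition 5.1. A matrix $\hat{C} \in \mathbb{C}^{n\times n}$ acting on the system register $s$ is encoded in standard-form-$(\hat{C}, \alpha, \hat{U}, d)$ with normalization $\alpha \ge \|\hat{C}\|$ by the computational basis state $|0\rangle_a \in \mathbb{C}^d$ on
the ancilla register $a$ and signal unitary $\hat{U} \in \mathbb{C}^{dn\times dn}$ if $(\langle 0|_a \otimes \hat{I}_s)\hat{U}(|0\rangle_a \otimes \hat{I}_s) = \hat{C}/\alpha$. If $\hat{C}$
is also Hermitian, this is called a Hermitian standard-form encoding.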
One reason why the standard-form should be considered natural is that it is no more or
less than the steps of generalized measurement [98], which is fundamental to discrete-time
quantum computation: On measurement outcome $|0\rangle_a$ with best-case success probability
$(\|\hat{C}\|/\alpha)^2 \le 1$, a measurement operator $\hat{C}/\alpha$ is applied on the system. Note that while
the normalization constant $\alpha$ could always be absorbed into a redefinition of $\hat{C}$, it is useful
in some cases to leave it explicit. This naturalness is further supported by how matrices
described through other common quantum oracles easily map to the standard-form, such as
in Figure 5-1.
5.2.1
Matrices from a linear combination of unitaries
Figure 5-1: Quantum circuits for (a) the standard-form encoding, and the standard-form
encodings of matrices that are (b) a linear combination of unitaries, (c) a density matrix,
and (d) sparse.
Matrices formed from a linear combination of unitaries are considered in [28] and [14]. We
present a slightly more general version of their encoding. Suppose that the matrix $\hat{C}$, acting
on the system register $s$, has the decomposition
$$\hat{C} = \sum_{j=1}^{d} \alpha_j\hat{U}_j, \quad \alpha = \sum_{j=1}^{d} |\alpha_j|, \tag{5.1}$$
for some number of $d$ arbitrary complex coefficients $\alpha_j$, some arbitrary unitaries $\hat{U}_j$, and let
$\alpha$ be the sum of absolute values of the $\alpha_j$. Assume that there exists a selector oracle $\hat{V}$ and
two state preparation oracles $\hat{G}_1$ and $\hat{G}_2$ defined as
j)(jla (9(T,j~j O1i|0)a =l,(00t
fr =
j=1
j=1
=_
(ja.
(5.2)
j=1
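Note that the definition of $\hat{G}_2$ through its inverse is intentional, to avoid ambiguities in the
principal value of the square root. Then it is easy to verify that
$$((\langle 0|_a\hat{G}_2^\dagger) \otimes \hat{I}_s)\hat{V}((\hat{G}_1|0\rangle_a) \otimes \hat{I}_s) = \hat{C}/\alpha. \tag{5.3}$$
Thus the signal oracle $\hat{U} = (\hat{G}_2^\dagger \otimes \hat{I}_s)\hat{V}(\hat{G}_1 \otimes \hat{I}_s)$ encodes $\hat{C}$ in standard-form-$(\hat{C}, \alpha, \hat{U}, d)$,
using 1 query each to $\hat{V}$, $\hat{G}_1$, and $\hat{G}_2$.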
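A small numerical check of this encoding is sketched below for $d = 2$ and a single system qubit. The specific amplitudes chosen for $\hat{G}_1|0\rangle_a$ and $\langle 0|_a\hat{G}_2^\dagger$ are an assumption made for this example (magnitudes $\sqrt{|\alpha_j|/\alpha}$, with the complex phase of $\alpha_j$ carried by the left vector), as are the unitaries and coefficients; the text's own definition may distribute the phases differently.

```python
import math

# Assumed toy instance: C = a_0 X + a_1 Z on one system qubit.
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
unitaries = [X, Z]
coeffs = [0.6, -0.3 + 0.4j]
alpha = sum(abs(a) for a in coeffs)

# Selector V = sum_j |j><j|_a (x) U_j as a 4x4 matrix, index = 2 * j + s.
dim = 4
V = [[0j] * dim for _ in range(dim)]
for j, U in enumerate(unitaries):
    for s in range(2):
        for s2 in range(2):
            V[2 * j + s][2 * j + s2] = U[s][s2]

# Assumed state-preparation amplitudes:
# right holds the components of G_1|0>, left those of the bra <0|G_2^dagger.
right = [math.sqrt(abs(a) / alpha) for a in coeffs]
left = [(a / abs(a)) * math.sqrt(abs(a) / alpha) for a in coeffs]

# Top-left block (left (x) I_s) V (right (x) I_s) on the system register.
block = [[sum(left[j] * V[2 * j + s][2 * j2 + s2] * right[j2]
              for j in range(2) for j2 in range(2))
          for s2 in range(2)] for s in range(2)]

# Expected result C / alpha.
target = [[sum(a * U[s][s2] for a, U in zip(coeffs, unitaries)) / alpha
           for s2 in range(2)] for s in range(2)]
```

With these choices `block` matches `target` entry by entry, which is the content of Eq. 5.3 for this toy case.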
5.2.2
Density matrices
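Matrices that are density matrices are considered in [82] and [70]. In general, a density
matrix $\hat{H}$ is Hermitian, and has the representation, in some n-dimensional basis on the
system register,
$$\hat{H} = \sum_{j=1}^{d} \alpha_j|\psi_j\rangle\langle\psi_j|_s, \tag{5.4}$$
where $\alpha_j$ are probabilities that sum to 1 and $|\psi_j\rangle$ are arbitrary quantum states. There, they
assume access to a quantum channel $\mathcal{E}$ that on input state $|0\rangle\langle 0|$, of some dimension greater
than $\hat{H}$, produces the density matrix $\mathcal{E}(|0\rangle\langle 0|) = \hat{H}$. We present a slightly different version
of this encoding that assumes we can simulate $\mathcal{E}$ as a unitary process. Assume that there
exists a state preparation oracle $\hat{G}$ that prepares any purification of $\hat{H}$, that is
$$\hat{G}|0\rangle_a = \hat{G}|0\rangle_{a_1}|0\rangle_{a_2} = |G\rangle_a = \sum_{j=1}^{d}\sqrt{\alpha_j}\,|j\rangle_{a_1}|\psi_j\rangle_{a_2}. \tag{5.5}$$
Note that the register $a_2$ is of the same dimension as $s$. One can also verify that $\mathrm{Tr}_{a_1}[\hat{G}|0\rangle\langle 0|_a\hat{G}^\dagger] =
\hat{H}$ as expected. Let the unitary operation $\hat{S}$ swap the $a_2$ and $s$ registers, and let $\{|\lambda\rangle\}$ be
an arbitrary complete basis on the system. Then
$$(\langle G|_a \otimes \hat{I}_s)\hat{S}(|G\rangle_a \otimes \hat{I}_s) = (\langle G|_a \otimes \hat{I}_s)\hat{S}(|G\rangle_a \otimes \hat{I}_s)\sum_\lambda|\lambda\rangle\langle\lambda|_s = \sum_\lambda\sum_j \alpha_j|\psi_j\rangle_s\langle\psi_j|\lambda\rangle\langle\lambda|_s = \hat{H}. \tag{5.6}$$
Thus the signal oracle $\hat{U} = (\hat{G}^\dagger \otimes \hat{I}_s)\hat{S}(\hat{G} \otimes \hat{I}_s)$ encodes $\hat{H}$ in standard-form-$(\hat{H}, \alpha, \hat{U}, nd)$,
using 2 queries to $\hat{G}$, and $O(\log(n))$ primitive quantum gates.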
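The swap-based encoding can be verified directly for a small ensemble by tracking how each basis component of the purification moves under the swap. The sketch below does this for $d = 2$ and a single system qubit; the probabilities and states are arbitrary assumptions chosen for the example.

```python
import math

# Assumed toy ensemble: probabilities and (non-orthogonal) single-qubit states.
probs = [0.25, 0.75]
states = [[1.0, 0.0], [math.sqrt(0.5), 1j * math.sqrt(0.5)]]

# Purification |G> = sum_j sqrt(p_j) |j>_{a1} |psi_j>_{a2};
# G[j][x] is the amplitude of the basis state |j>_{a1} |x>_{a2}.
G = [[math.sqrt(p) * amp for amp in psi] for p, psi in zip(probs, states)]


def encoded_block(G):
    """(<G|_a (x) I_s) SWAP_{a2,s} (|G>_a (x) I_s), one basis state at a time."""
    n = len(G[0])
    block = [[0j] * n for _ in range(n)]
    for s_in in range(n):                   # system input |s_in>
        for j, row in enumerate(G):
            for x, amp in enumerate(row):   # component amp |j>|x>|s_in>
                # SWAP exchanges a2 and s: |j>|x>|s_in> -> |j>|s_in>|x>
                a2_out, s_out = s_in, x
                # Project the ancilla onto <G|: overlap with |j>|a2_out>
                block[s_out][s_in] += G[j][a2_out].conjugate() * amp
    return block


# Density matrix sum_j p_j |psi_j><psi_j| built directly from the ensemble.
rho = [[sum(p * psi[r] * psi[c].conjugate() for p, psi in zip(probs, states))
        for c in range(2)] for r in range(2)]
block = encoded_block(G)
```

The resulting `block` coincides with `rho`, in line with Eq. 5.6, and it has unit trace as expected of a density matrix.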
5.2.3
Sparse matrices
In Chapter 4, we considered sparse matrices. Let us recall the definition of the sparse matrix
oracles:
Definition 5.2 (Sparse matrix oracles [12]). Sparse matrices with at most d non-zero elements in every row are specified by two oracles. The oracle $\hat{O}_H|j\rangle|k\rangle|z\rangle = |j\rangle|k\rangle|z \oplus \hat{H}_{jk}\rangle$
queried by $j \in [n]$ row and $k \in [n]$ column indices returns the value $\hat{H}_{jk} = \langle j|\hat{H}|k\rangle$, with
maximum absolute value $\|\hat{H}\|_{\max} = \max_{jk}|\hat{H}_{jk}|$. The oracle $\hat{O}_F|j\rangle|l\rangle = |j\rangle|f(j, l)\rangle$ queried
by $j \in [n]$ row and $l \in [d]$ column indices computes in-place the column index $f(j, l)$ of the
$l$th non-zero entry of the $j$th row.
By using integer arithmetic to compute elementary functions, one can transform values of
$\hat{H}_{jk}$ encoded in m-bit binary into quantum gates that implement rotations by some function
of $\hat{H}_{jk}$. Thus given a state $|j\rangle$, indicating the row index of a matrix $\hat{H}$ acting on the system
register, and an upper bound $\Lambda_{\max} \ge \|\hat{H}\|_{\max}$, one can follow a procedure similar to that in
[12] to construct unitary operators $\hat{U}_{\mathrm{row}}$ and $\hat{U}_{\mathrm{col}}$ that prepare the states
Ucoil0)alj)s =
(0Fa(kjU
k,)as = E
j)slp)ai
(ks(q|akk
(Xk las =
qEF
(
P 10)a2 +
AFdmax
1-
Amax
(lqk q + (1 - kq)Hq (01a2 + 1F
Amax
m
1)a2
(57)
,
(21a
2Ama2J
m
where $\delta_{jk}$ is the Kronecker delta function, and $F_j = \{k : k = f(j, l),\; l \in [d]\}$ is the set of
non-zero column indices in row $j$. The procedure requires $O(1)$ queries to $\hat{O}_H$ and $\hat{O}_F$, and
uses $O(\mathrm{poly}(m))$ gates for arithmetic, and $O(\log(d))$ gates for creating the superposition
over $F_j$. Note that our definition of the isometry Eq. 5.7 is an improvement over [12]
as it avoids ambiguity in both the principal range of the square-roots when $\hat{H}_{jk} < 0$ and
a sign problem when $\hat{H}_{jj} < 0$. From [12], the gate complexity of $\hat{U}_{\mathrm{col}}$, $\hat{U}_{\mathrm{row}}$, and $\hat{U}_{\mathrm{mix}}$
combined is $O(\log(n) + \mathrm{poly}(m))$, where $m$ is the number of bits of precision of $\hat{H}_{jk}$. The
contribution from $\mathrm{poly}(m) = O(m^{5/2})$ is due to integer arithmetic for computing square-roots and trigonometric functions.
Let S, which requires O(logn) primitive gates, be a unitary operator that swaps the
registers s and a,. Thus one can verify that
(xjISI'k) = dAmaxI
(&0 (1a2 0S)
Thus the signal oracle U = (Urowt 0 Is)(a 0
0 &S dAmax
)(Uco
0 ( I) encodes H in standard-form-
(H, a, U, 3n), using 0(1) queries to OH and OF, and O(log (n)+poly(m)) primitive quantum
gates.
5.3
Qubitization
The generality of the standard form means that it is not immediately obvious how structured
computation is possible on it. Several problems are apparent. First, $\hat{C}$ is not in general
unitary, thus the standard technique of quantum phase estimation to encode its eigenvalues
in binary on a control register is not possible. Second, $\hat{C}$ is only defined through a projection
of the signal oracle by the states $|0\rangle_a$. Thus $\hat{U}$ does not immediately yield information on the
eigenvalues of $\hat{C}$ and quantum phase estimation on $\hat{U}$ is ineffective in general. Third, these
limitations suggest that the only available option is to perform repeated measurements of
$\hat{C}^n|\psi\rangle_s$, on some input state $|\psi\rangle_s$, but this is not very exciting and succeeds with probability
vanishing exponentially with $n$.
Our result of 'qubitization' overcomes these problems. Given only access to oracles
encoding $\hat{C}$ in the standard form, this step translates the encoding to unitary evolution
with eigenvalues and eigenstates that depend directly on $\hat{C}$, using at most one additional
ancilla qubit. The signal oracle $\hat{U}$ is queried to obtain a Grover-like search parallelized over
all eigenvalues $\lambda$ of the Hermitian component $\hat{H} = (\hat{C} + \hat{C}^\dagger)/2$ through the unitary qubiterate
$\hat{W} = \bigoplus_\lambda e^{-i\hat\sigma_y^\lambda\cos^{-1}(\lambda)}$ in some basis. Just as how the standard-model generalizes the inputs
to various problems in Section 5.2, qubitization appears to generalize a number of other
quantum algorithms of foundational importance. For instance, the gap $\Delta$ of eigenvalues
$\lambda = 1 - \Delta$ of $\hat{H}$ is amplified to $\cos^{-1}(\lambda) = O(\sqrt{\Delta})$ in the phase of the iterate, which
resembles spectral gap amplification [118], the quantization of stochastic matrices [122],
as well as Szegedy's [123] and Childs' [23] quantum walk. The key difference lies in the
extremely general encoding of the signal through any signal oracle $\hat{U}$ of the standard form.
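The square-root amplification of the gap is easy to see numerically. The sketch below assumes the single-eigenvalue picture in which the iterate acts as a rotation with eigenvalues $\lambda \pm i\sqrt{1 - \lambda^2}$, and compares the resulting phase with $\sqrt{2\Delta}$; the sample gaps are arbitrary.

```python
import cmath
import math


def iterate_phase(lam):
    """Phase of the eigenvalue lam + i sqrt(1 - lam^2) of a rotation by arccos(lam)."""
    g = math.sqrt(1 - lam * lam)
    return cmath.phase(complex(lam, g))


gaps = [1e-2, 1e-4, 1e-6]
phases = [iterate_phase(1 - gap) for gap in gaps]
ratios = [phase / math.sqrt(2 * gap) for phase, gap in zip(phases, gaps)]
```

The entries of `ratios` approach 1 as the gap shrinks, so a gap of $10^{-6}$ in $\lambda$ appears as a phase of roughly $1.4 \times 10^{-3}$ in the iterate.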
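Let us consider the standard-form encoding in Def. 5.1 in more detail. Given an encoding
of a complex matrix $\hat{C}$ in standard-form-$(\hat{C}, 1, \hat{U}, d)$, the unitary signal oracle $\hat{U} : \mathcal{H}_a \otimes \mathcal{H}_s \to
\mathcal{H}_a \otimes \mathcal{H}_s$ acts on system ($s$) and ancilla ($a$) registers. Conditional on the measurement
outcome $|0\rangle_a$, this implements some non-unitary signal operator $\hat{C} : \mathcal{H}_s \to \mathcal{H}_s$ on some n-qubit input system state $|\psi\rangle_s \in \mathcal{H}_s$. This divides $\hat{U}$ into two subspaces: $\mathcal{H}_G = |0\rangle_a \otimes \mathcal{H}_s$,
where $\hat{U}|0\rangle|\psi\rangle$ may be projected onto $|0\rangle\hat{C}|\psi\rangle$ with probability $\|\hat{C}|\psi\rangle\|^2$ for all $|\psi\rangle$, and its
orthogonal complement $\mathcal{H}_{G^\perp}$. In other words,
$$\hat{U}|0\rangle_a|\psi\rangle_s = |0\rangle_a\hat{C}|\psi\rangle_s + \sqrt{1 - \|\hat{C}|\psi\rangle\|^2}\,|0_\psi^\perp\rangle_{as}, \tag{5.9}$$
where the signal operator $\hat{C}$ has spectral norm $\|\hat{C}\| = \|(\langle 0|_a \otimes \hat{I}_s)\hat{U}(|0\rangle_a \otimes \hat{I}_s)\| \le 1$, and
$(\langle 0|_a \otimes \hat{I}_s)|0_\psi^\perp\rangle_{as} = 0$. Whenever the context is clear, we drop the ancilla and system subscripts, and use $|0\rangle_a \otimes \hat{I}_s$ and $|0\rangle_a$ interchangeably. We represent $\hat{U}$ such that the top-left
block is precisely $\hat{C}$ and acts on an input state $|0_\psi\rangle = |0\rangle|\psi\rangle \in \mathcal{H}_G$, whereas the undefined
parts of $\hat{U}$ transform $|0_\psi\rangle$ into some orthogonal state $|0_\psi^\perp\rangle \in \mathcal{H}_{G^\perp}$ of lesser interest.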
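In the following, we consider the case where $\hat{C}$ is a normal matrix $\hat{N}$, and will very soon
further specialize to Hermitian $\hat{H}$. Thus the action of $\hat{U}$ on $|0_\psi\rangle$ in Eq. 5.9 can be more easily
understood by decomposing $|\psi\rangle$ as a linear combination of eigenstates $\hat{N}|\lambda\rangle = \lambda e^{i\chi_\lambda}|\lambda\rangle$,
where $\lambda, \chi_\lambda \in \mathbb{R}$:
$$\hat{U}|0_\lambda\rangle = \lambda e^{i\chi_\lambda}|0_\lambda\rangle + g(\lambda)|0_\lambda^\perp\rangle, \quad g(\lambda) = \sqrt{1 - \lambda^2}. \tag{5.10}$$
Later on, operator functions indicated by $[\cdot]$ are defined by $f[\hat{N}] = \sum_\lambda f(\lambda e^{i\chi_\lambda})|\lambda\rangle\langle\lambda|$ for
any scalar function $f(\cdot)$. For each eigenstate $|\lambda\rangle$, we also find it useful to define the subspace
$\mathcal{H}_\lambda = \mathrm{span}\{|0_\lambda\rangle, |0_\lambda^\perp\rangle\} = \mathrm{span}\{|0_\lambda\rangle, \hat{U}|0_\lambda\rangle\}$ and its Pauli operator basis $\hat\sigma_x^\lambda, \hat\sigma_y^\lambda, \hat\sigma_z^\lambda$, where
$\hat\sigma_z^\lambda|0_\lambda^\perp\rangle = -|0_\lambda^\perp\rangle$ and the $\lambda$ in the exponent are labels. Note that these subspaces are disjoint
as all $|0_\lambda\rangle$ have zero mutual overlap and $\hat{U}$, which defines $|0_\lambda^\perp\rangle$, is unitary. Note the trivial
case $\lambda = 1$, where $\mathcal{H}_\lambda$ is one-dimensional.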
In the simplest case, one might wish to apply $\hat{N}$ multiple times to generate higher moments. When $\hat{N}$ is proportional to a unitary, this corresponds to phase accumulation, which is essential to precision measurements at the Heisenberg limit. Alternatively, if $\hat{N} = \hat{H}$ is Hermitian, it is a quantum observable, thus $\hat{H}^2$ would allow a direct estimate of variance, and so on. Unfortunately, the subspace $\mathcal{H}_\lambda$ for each eigenstate $|\lambda\rangle$ is not invariant under $\hat{U}$ in general. As a result, repeated applications of $\hat{U}$ in this basis do not produce higher moments of $\hat{H}$ due to leakage out of $\mathcal{H}_\lambda$. The manner in which they leak depends on the undefined components of $\hat{U}$ and must be analyzed on a case-by-case basis, and thus is of limited utility.
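The leakage described above can be seen numerically. The sketch below is only an illustration under stated assumptions: the signal oracle is a random unitary drawn from a QR decomposition (a hypothetical stand-in for a real oracle), the ancilla is a single qubit, and the signal operator is read off as the top-left block.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # system dimension; the ancilla is one qubit, so U is 2n x 2n

# Hypothetical signal oracle: a random unitary obtained from a QR decomposition.
z = rng.normal(size=(2 * n, 2 * n)) + 1j * rng.normal(size=(2 * n, 2 * n))
U, _ = np.linalg.qr(z)

# Signal operator: top-left block, C = (<0|_a (x) I_s) U (|0>_a (x) I_s).
C = U[:n, :n]

# Operator encoded in the same block after two applications of U.
C2_from_U = (U @ U)[:n, :n]

# Nonzero difference: amplitude that left the |0>_a block and came back.
leakage = np.linalg.norm(C2_from_U - C @ C)
print("||<0|U^2|0> - C^2|| =", leakage)
```

For a generic oracle the printed norm is far from zero, so two queries do not simply encode the square of the signal operator.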
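Order can be restored to this undefined behavior by stemming the leakage. The simplest possibility that preserves the signal operator of Eq. 5.9 replaces $\hat{U}$ with a unitary ansatz, the qubitization iterate or qubiterate for short, that on an input in $|0\rangle_a \otimes |\lambda\rangle_s \in \mathcal{H}_G$ has the form
$$\hat{W} = \bigoplus_\lambda \begin{pmatrix} \lambda e^{i\chi_\lambda} & -g(\lambda) \\ g(\lambda) & \lambda e^{-i\chi_\lambda} \end{pmatrix}_\lambda = \bigoplus_\lambda e^{i\hat{\sigma}_z^\lambda \chi_\lambda/2}\, e^{-i\hat{\sigma}_y^\lambda \cos^{-1}(\lambda)}\, e^{i\hat{\sigma}_z^\lambda \chi_\lambda/2}. \tag{5.11}$$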
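Note that $\hat{H}$ remains encoded in the standard form. For each eigenstate $|\lambda\rangle$, $\hat{W}$ performs a rotation in SU(2) on disjoint two-dimensional subspaces $\mathcal{H}_\lambda = \mathrm{span}\{|0_\lambda\rangle, |0_\lambda^\perp\rangle\} = \mathrm{span}\{|0_\lambda\rangle, \hat{W}|0_\lambda\rangle\}$, with basis states in $\bigoplus_\lambda \mathcal{H}_\lambda$ related by a basis transformation. To avoid ambiguity in the basis of $\hat{W}$, we explicitly define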
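$$\begin{pmatrix} \lambda e^{i\chi_\lambda} & -g(\lambda) \\ g(\lambda) & \lambda e^{-i\chi_\lambda} \end{pmatrix}_\lambda = \lambda e^{i\chi_\lambda}|0_\lambda\rangle\langle 0_\lambda| - g(\lambda)|0_\lambda\rangle\langle 0_\lambda^\perp| + g(\lambda)|0_\lambda^\perp\rangle\langle 0_\lambda| + \lambda e^{-i\chi_\lambda}|0_\lambda^\perp\rangle\langle 0_\lambda^\perp|. \tag{5.12}$$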
In the following, $\hat{W}$ will always be applied to states in the subspace $\bigoplus_\lambda \mathcal{H}_\lambda$, thus its action on states outside it need not be defined. The usefulness of this construct is evident; due to its invariant subspace, multiple applications of the iterate result in highly structured behavior. However, implementing $\hat{W}$ requires $g[\hat{H}]$, which appears difficult to compute efficiently in general. In prior art [35], this was approximated using phase estimation. The qubitization problem then concerns efficiently constructing $\hat{W}$ without approximation, and using $\mathcal{O}(1)$ queries to the signal oracle of the standard-form.
5.3.1
Proof of construction
We now solve the qubitization problem. Using the signal oracle $\hat{U}$, and arbitrary unitary operations on only the ancilla register, we provide necessary and sufficient conditions in Thm. 5.3 for the case of Hermitian $\hat{H}$ for when the qubiterate $\hat{W}$ can be implemented exactly using only one query to $\hat{U}$. As these conditions are somewhat restrictive, we then prove in Thm. 5.4 that qubitization is unconditionally possible by instead using the controlled-$\hat{U}$ oracle in a quantum circuit that generates the same signal operator and satisfies these conditions. We describe a similar construction for normal operators as well in Section 5.4.6.
In this subsection only, we consider a slightly more general signal oracle. Rather than the standard-form encoding $\hat{H} = \langle 0|_a \hat{U}|0\rangle_a$, let us assume that the signal oracle factors into three components $\hat{H} = \langle 0|_a(\hat{G}^\dagger \otimes \hat{I}_s)\hat{U}(\hat{G} \otimes \hat{I}_s)|0\rangle_a$, where $\hat{G}$ acts on the ancilla register only. This may be interpreted as replacing the computational basis state $|0\rangle_a$ with a modified state $|G\rangle_a = \hat{G}|0\rangle_a$. Note that we may always set $\hat{G} = \hat{I}_a$ to recover the original situation.
As $\hat{U}$ is a black-box oracle, we must find a unitary $\hat{S}' \otimes \hat{I}_s$ acting only on the ancilla register such that the iterate $\hat{W} = \hat{S}'\hat{U}$ of Eq. 5.11 is obtained. For the case of Hermitian $\hat{H}|\lambda\rangle_s = \lambda|\lambda\rangle_s$, we now determine necessary and sufficient conditions on what $\hat{S}'$ must be. As $\hat{S}'$ is otherwise arbitrary, we use without loss of generality the ansatz of $\hat{S}'$ being a product of a reflection about $|G\rangle_a$ and another arbitrary unitary $\hat{S}$ on the ancilla:
$$\hat{W} = ((2|G\rangle\langle G| - \hat{I})_a \otimes \hat{I}_s)\hat{S}\hat{U}, \tag{5.13}$$
$$|G_\lambda\rangle_{as} = |G\rangle_a|\lambda\rangle_s, \qquad |G_\lambda^\perp\rangle_{as} = \frac{\lambda|G_\lambda\rangle_{as} - \hat{S}\hat{U}|G_\lambda\rangle_{as}}{\sqrt{1 - \lambda^2}}. \tag{5.14}$$
Note that the reflection about $|G\rangle$ can be implemented using two queries to $\hat{G}$ and a multiply-controlled phase gate that can be compiled into $\mathcal{O}(\log(d))$ primitive quantum gates.
Theorem 5.3 (Conditions on Hermitian qubitization). For all signal oracles $\hat{U}$ that implement the signal operator $\hat{H}$, the unitary $\hat{S}$ in Eq. 5.13 creates a unitary iterate $\hat{W}$ with the same signal operator in the same basis, but in an SU(2) invariant subspace containing $|G\rangle$ if and only if
$$\langle G|_a \hat{S}\hat{U}|G\rangle_a = \hat{H} \quad \text{and} \quad \langle G|_a \hat{S}\hat{U}\hat{S}\hat{U}|G\rangle_a = \hat{I}. \tag{5.15}$$
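Proof. In the forward direction, we assume Eq. 5.11, then compute and compare with Eq. 5.13: $\lambda = \langle G_\lambda|\hat{W}|G_\lambda\rangle = \langle G_\lambda|\hat{S}\hat{U}|G_\lambda\rangle$. By using this result repeatedly together with the fact that $\hat{S}\hat{U}$ is unitary, Gram-Schmidt orthonormalization of $\hat{W}|G_\lambda\rangle$ furnishes the state
$$|G_\lambda^\perp\rangle = \frac{\hat{W}|G_\lambda\rangle - |G_\lambda\rangle\langle G_\lambda|\hat{W}|G_\lambda\rangle}{\left|\hat{W}|G_\lambda\rangle - |G_\lambda\rangle\langle G_\lambda|\hat{W}|G_\lambda\rangle\right|} = \frac{\lambda|G_\lambda\rangle - \hat{S}\hat{U}|G_\lambda\rangle}{\sqrt{1 - \lambda^2}}$$
orthogonal to $|G_\lambda\rangle$. By similarly computing and comparing
$$\lambda = \langle G_\lambda^\perp|\hat{W}|G_\lambda^\perp\rangle = \lambda\,\frac{\langle G_\lambda|(\hat{S}\hat{U})^2|G_\lambda\rangle - \lambda^2}{1 - \lambda^2},$$
we obtain $\langle G_\lambda|(\hat{S}\hat{U})^2|G_\lambda\rangle = 1$. As these must be true for all eigenvectors $|\lambda\rangle$, the conditions in Eq. 5.15 are necessary. That these are also sufficient follows from assuming Eq. 5.15 and attempting to recover the components of Eq. 5.11 using the definitions of Eq. 5.13. By applying $\langle G|_a\hat{S}\hat{U}|G\rangle_a = \hat{H}$, we compute $\hat{W}|G_\lambda\rangle = 2|G\rangle_a\langle G|_a\hat{S}\hat{U}|G_\lambda\rangle - \hat{S}\hat{U}|G_\lambda\rangle = 2\lambda|G_\lambda\rangle - \hat{S}\hat{U}|G_\lambda\rangle$. In the basis of $|G_\lambda\rangle$ and $|G_\lambda^\perp\rangle$, $\langle G_\lambda|\hat{W}|G_\lambda\rangle = 2\lambda - \langle G_\lambda|\hat{S}\hat{U}|G_\lambda\rangle = \lambda$ and
$$\langle G_\lambda^\perp|\hat{W}|G_\lambda\rangle = \frac{\left(\lambda\langle G_\lambda| - \langle G_\lambda|(\hat{S}\hat{U})^\dagger\right)\left(2\lambda|G_\lambda\rangle - \hat{S}\hat{U}|G_\lambda\rangle\right)}{\sqrt{1 - \lambda^2}} = \frac{2\lambda^2 - \lambda^2 - 2\lambda^2 + 1}{\sqrt{1 - \lambda^2}} = \sqrt{1 - \lambda^2}.$$
A similar calculation for the remaining components requires $\langle G|_a\hat{S}\hat{U}\hat{S}\hat{U}|G\rangle_a = \hat{I}$ and reveals that $\langle G_\lambda^\perp|\hat{W}|G_\lambda^\perp\rangle = \lambda$ and $\langle G_\lambda|\hat{W}|G_\lambda^\perp\rangle = -\sqrt{1 - \lambda^2}$. As this must be true for all $\lambda$, we may indeed represent
$$\hat{W} = \bigoplus_\lambda \begin{pmatrix} \lambda & -\sqrt{1 - \lambda^2} \\ \sqrt{1 - \lambda^2} & \lambda \end{pmatrix}_\lambda. \qquad \square$$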
In hindsight, these results are manifest. After all, $\langle G|\hat{S}\hat{U}\hat{S}\hat{U}|G\rangle = \hat{I}$ implies that $\hat{S}\hat{U}$ is a reflection when controlled by input state $|G\rangle$, and it is well-known that a Grover iterate [50, 136] is the product of two reflections about start and target subspaces. Nevertheless, the sufficiency of these conditions highlights that this is the simplest method to extract controllable and predictable behavior out of $\hat{U}$. In particular, these conditions are automatically satisfied in the trivial case with $\hat{S} = \hat{I}_a$ when $\hat{U}$ only has eigenvalues $\pm 1$, such as when it is a controlled-Pauli operator.
Unfortunately, a solution to Eq. 5.15 may not exist for more general $\hat{U}$. Thm. 5.3 amounts to choosing $\hat{S}$ such that $\hat{S}\hat{U}\hat{S}$ is the inverse $\hat{U}^\dagger$ whilst preserving the signal operator $\langle G|\hat{S}\hat{U}|G\rangle = \hat{H}$. Given that $\hat{S}$ only acts on the ancilla register, it is hard to see how this is always possible. Even if so, $\hat{S}$ may be difficult to implement as it is an arbitrary unitary acting on a potentially large ancilla register. The solution is to construct a different quantum circuit $\hat{U}'$ that contains $\hat{U}$ but still implements the same signal operator, and crucially always has an extremely simple solution $\hat{S}$. We now show how this can be done in all cases using only one query to controlled-$\hat{U}$ and controlled-$\hat{U}^\dagger$ as shown in Figure 5-2.
Theorem 5.4 (Existence of Hermitian qubitization). For all $q$ qubit signal unitaries $\hat{U}$ that implement the signal operator $\langle G|\hat{U}|G\rangle = \hat{H}$, there exists a $q + 1$ qubit quantum circuit $\hat{U}'$ that queries controlled-$\hat{U}$ and controlled-$\hat{U}^\dagger$ once to implement the Hermitian component $\frac{1}{2}(\hat{H} + \hat{H}^\dagger)$ as the signal operator, such that the conditions of Eq. 5.15 can be satisfied.
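Proof. We prove this by an explicit construction. Let the controlled-$\hat{U}$ operators be $\hat{Q}_1 = |0\rangle\langle 0|_b \otimes \hat{I}_{as} + |1\rangle\langle 1|_b \otimes \hat{U}^\dagger$ and $\hat{Q}_2 = |0\rangle\langle 0|_b \otimes \hat{U} + |1\rangle\langle 1|_b \otimes \hat{I}_{as}$. Thus the extra qubit states $|0\rangle_b, |1\rangle_b$ are flags that select either $\hat{U}$ or $\hat{U}^\dagger$. By multiplying, $\hat{U}' = \hat{Q}_1\hat{Q}_2 = |0\rangle\langle 0|_b \otimes \hat{U} + |1\rangle\langle 1|_b \otimes \hat{U}^\dagger$. Now consider the ancilla state $|G'\rangle_{ab} = \frac{1}{\sqrt{2}}(|0\rangle_b + |1\rangle_b)|G\rangle_a$, and choose $\hat{S} = (|0\rangle\langle 1|_b + |1\rangle\langle 0|_b) \otimes \hat{I}_{as}$. It is easy to verify that the conditions of Eq. 5.15 are satisfied:
$$\langle G'|\hat{S}\hat{U}'|G'\rangle = \langle G'|\hat{U}'|G'\rangle = \frac{1}{2}(\hat{H} + \hat{H}^\dagger), \qquad \langle G'|\hat{S}\hat{U}'\hat{S}\hat{U}'|G'\rangle = \langle G'|\hat{U}'^\dagger\hat{U}'|G'\rangle = \hat{I}, \tag{5.16}$$
where we have used the fact that $\hat{S}|G'\rangle = |G'\rangle$ is an eigenstate, and that $\hat{S}$ swaps the $|0\rangle, |1\rangle$ ancilla states in $\hat{U}'$, thus transforming it into its inverse. $\square$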
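[Figure 5-2: circuit diagram on registers $|0\rangle_b$, $|0\rangle_a$ and the system, built from Had, $\hat{G}$, controlled-$\hat{U}$ and a reflection.]
Figure 5-2: Quantum circuits for the qubitization $\hat{W}$ of a standard-form encoding-$(\hat{H}, 1, (\hat{G}^\dagger \otimes \hat{I}_s)\hat{U}(\hat{G} \otimes \hat{I}_s), d)$. The qubiterate $\hat{W}$ encodes $\hat{H}$ in standard-form-$(\hat{H}, 1, \hat{W}, 2d)$. Note that Had is the Hadamard gate, and we define the reflection $\mathrm{Ref}_{|0\rangle_a|0\rangle_b} = \hat{I} - 2|0\rangle\langle 0|_a \otimes |0\rangle\langle 0|_b$, so this circuit ignores a global $-1$ phase. The gate complexity of $\hat{W}$ is $\mathcal{O}(\log(d))$.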
Even if we are given a $\hat{U}$ for which there is no solution to Eq. 5.15, we can always apply Thm. 5.4 to construct a $\hat{U}'$ that does, with minimal overhead. Furthermore, our proof uses no information about the detailed structure of $|G\rangle$, and so we may treat $\hat{G}$ as a black-box, and absorb any other unitary acting only on the ancilla states into an appropriate definition of $\hat{U}'$. Thus without loss of generality, we can assume that any signal oracle $\hat{U}$ provided has already been qubitized.
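The construction in the proof of Thm. 5.4 can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not the thesis' implementation: the oracle is a random unitary, the ancilla $a$ is a single qubit with $|G\rangle_a = |0\rangle_a$, and registers are ordered as $b$, $a$, $s$. It verifies both conditions of Eq. 5.15 and that powers of the qubiterate encode Chebyshev polynomials of the Hermitian component.

```python
import numpy as np

rng = np.random.default_rng(1)
n, da = 3, 2            # system and ancilla-a dimensions (illustrative)
dim = da * n

def random_unitary(d):
    # Hypothetical stand-in for a signal oracle.
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

U = random_unitary(dim)              # acts on ancilla a (x) system s
G = np.zeros(da)
G[0] = 1.0                           # |G>_a = |0>_a

# U' = |0><0|_b (x) U + |1><1|_b (x) U^dagger, register order b, a, s.
P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
Up = np.kron(P0, U) + np.kron(P1, U.conj().T)
S = np.kron(np.array([[0.0, 1.0], [1.0, 0.0]]), np.eye(dim))  # swaps |0>_b, |1>_b

# Isometry |G'>_ab (x) I_s with |G'> = (|0>_b + |1>_b)|G>_a / sqrt(2).
Gp = np.kron(np.ones(2) / np.sqrt(2), G)
V = np.kron(Gp.reshape(-1, 1), np.eye(n))        # shape (2*dim, n)

H = V.conj().T @ S @ Up @ V                      # <G'| S U' |G'>
C = U[:n, :n]                                    # <G| U |G>
second = V.conj().T @ S @ Up @ S @ Up @ V        # <G'| S U' S U' |G'>

refl = 2 * V @ V.conj().T - np.eye(2 * dim)      # (2|G'><G'| - I) (x) I_s
W = refl @ S @ Up                                # qubiterate of Eq. 5.13

# <G'| W^k |G'> compared with Chebyshev polynomials T_k[H].
T2 = 2 * H @ H - np.eye(n)
T3 = 4 * H @ H @ H - 3 * H
W2_block = V.conj().T @ W @ W @ V
W3_block = V.conj().T @ W @ W @ W @ V
print(np.allclose(W2_block, T2), np.allclose(W3_block, T3))
```

In contrast to repeated queries to the bare oracle, the projected powers of $\hat{W}$ here depend only on the encoded Hermitian component.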
5.4
Operator function design
Qubitization explicitly imposes a qubit structure onto any matrix $\hat{H}$ encoded in the standard form. This allows us to transfer knowledge from QSPL to the computation of functions $f[\hat{H}]$ of encoded matrices. Though one such class, the unitary functions of $\hat{H}$, was considered in Chapter 4, many other non-unitary functions are possible by exploiting details in the construction of the qubiterate $\hat{W}$.
In fact, standard techniques such as quantum phase estimation are applicable on the
qubiterate, which has well-defined eigenphases that depend on the eigenvalues of H in a
predictable manner. As quantum phase estimation exploits additional ancilla qubits to
encode an increasingly precise phase in binary, it is clear that the set of continuous functions
computable on the eigenvalues of H scales with the number of ancilla qubits. This motivates
the complexity class
Definition 5.5 (Ancilla-Assisted Function Computation-m (AAFC-m)). Given a Hermitian standard-form-$(\hat{H}, 1, \hat{U}, d)$, let AAFC-$m$ be the family of complex functions $f$ that can be computed by a unitary operator $\hat{V}$ with at most $m$ additional ancilla qubits initialized in the computational basis $|0\rangle_b$, over that required for qubitization, such that $\hat{V}$ encodes $f[\hat{H}]$ in standard-form-$(f[\hat{H}], \mathcal{O}(1), \hat{V}, d2^m)$.
In this section, we show that at most three ancilla qubits are ever needed to compute arbitrary continuous functions in the standard-model. In other words, AAFC-0 $\subseteq$ AAFC-1 $\subseteq$ AAFC-2 $\subseteq$ AAFC-3 = AAFC-$\infty$, in contrast to quantum phase estimation and the more recent 'linear-combination-of-unitaries algorithm' [72] in which $m$ can still be arbitrarily large. From the perspective of Boolean logic, this is a complete surprise: a finite number of bits can represent arbitrary precision. At the baseline of AAFC-0, this is achieved by directly exploiting the SU(2) structure of the qubiterate to engineer arbitrary target functions $f(\lambda)$ by Grover-like rotations with angles dependent on the eigenvalue $\lambda$, and the procedures for AAFC-1 to AAFC-3 are an analogous generalization to higher dimensions. These results implement polynomial functions of $\hat{H}$ with an optimal query complexity that exactly matches polynomial lower bounds for function approximation. Indeed, the extension to any function stems simply from the fact that polynomials are dense in the space of real functions [119].
[Figure 5-3: circuit diagram on registers $|0\rangle_b$, $|0\rangle_a$ and the system, built from Had, $\hat{G}$, controlled-$\hat{U}$ and phased reflections.]
Figure 5-3: Quantum circuit for the phased qubiterate $\hat{W}_\phi$ of a standard-form encoding-$(\hat{H}, 1, (\hat{G}^\dagger \otimes \hat{I}_s)\hat{U}(\hat{G} \otimes \hat{I}_s), d)$. The phased qubiterate encodes $\hat{H}$ in standard-form-$(\hat{H}, 1, \hat{W}_\phi, 2d)$. Note that we define the reflection $\mathrm{Ref}_{\phi,|0\rangle_a|0\rangle_b} = \hat{I} - (1 - e^{-i\phi})|0\rangle\langle 0|_a \otimes |0\rangle\langle 0|_b$, so this circuit omits a global $-1$ phase. The gate complexity of $\hat{W}_\phi$ is $\mathcal{O}(\log(d))$.
These powerful tools for target operator processing, made possible by qubitization, are
agnostic to the underlying oracles that describe the signal operator H. As seen in Section 5.2, converting various oracles that describe Hamiltonians to the standard form is quite
straightforward. This motivates the conjecture that the standard-form is most natural for
quantum algorithms involving matrices, and furnishes an optimal approach to implementing
arbitrary quantum measurements and their operator transformations when combined with
our quantum signal processing algorithms.
We provide algorithms with different overheads for large classes of transformations. 'Ancilla-free quantum signal processing' in Section 5.4.1 implements target operators where the real and imaginary parts have the same parity with respect to $\hat{H}$. This is extended by 'single-ancilla quantum signal processing' in Section 5.4.3 and Section 5.4.4 for more general target operators, and then by 'double-ancilla quantum signal processing' in Section 5.4.5 for completely arbitrary target operators.
5.4.1
Ancilla-free quantum signal processing
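Without any additional ancilla qubits beyond those required for qubitization, the possibilities for applying the qubiterate in Eq. 5.11 seem limited. For instance, $\langle 0|_{ab}\hat{W}^N|0\rangle_{ab} = T_N[\hat{H}]$ produces only Chebyshev polynomials [27]; note the additional ancilla register $b$, as we assume $\hat{W}$ is constructed via Thm. 5.4. To go further, additional control parameters on $\hat{W}$ are necessary. Thus we introduce the phased qubiterate in Figure 5-3, with the same invariant subspace as $\hat{W}$:
$$\hat{W}_\phi = \begin{pmatrix} \hat{H} & -ie^{-i\phi}g[\hat{H}] \\ -ie^{i\phi}g[\hat{H}] & \hat{H} \end{pmatrix} = \bigoplus_\lambda \begin{pmatrix} \lambda & -ie^{-i\phi}g(\lambda) \\ -ie^{i\phi}g(\lambda) & \lambda \end{pmatrix}_\lambda = \bigoplus_\lambda e^{-i\hat{\sigma}_\phi^\lambda \theta_\lambda}, \tag{5.17}$$
where $\hat{\sigma}_\phi^\lambda = \cos(\phi)\hat{\sigma}_x^\lambda + \sin(\phi)\hat{\sigma}_y^\lambda$ and the phase $\theta_\lambda = \cos^{-1}(\lambda)$.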
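Lemma 5.6. The phased qubiterate $\hat{W}_\phi$ in Eq. 5.17 is equal to $\hat{W}_\phi = \hat{Z}_{\phi - \pi/2}\hat{W}\hat{Z}_{-\phi + \pi/2}$, where $\hat{Z}_\phi = (\hat{I} - (1 - e^{-i\phi})|0\rangle\langle 0|_{ab})$ is a partial reflection about $|0\rangle_{ab}$ by angle $\phi \in \mathbb{R}$ that implements a relative phase between the $|0_\lambda\rangle_{abs}$ and $|0_\lambda^\perp\rangle_{abs}$ subspaces. In block form,
$$\hat{Z}_\phi = \bigoplus_\lambda \begin{pmatrix} e^{-i\phi} & 0 \\ 0 & 1 \end{pmatrix}_\lambda. \tag{5.18}$$
Proof. Let us compute the phase applied to states $|0_\lambda\rangle, |0_\lambda^\perp\rangle$: $\hat{Z}_\phi|0_\lambda^\perp\rangle = |0_\lambda^\perp\rangle$ and $\hat{Z}_\phi|0_\lambda\rangle = (1 - (1 - e^{-i\phi}))|0_\lambda\rangle = e^{-i\phi}|0_\lambda\rangle$. As this is true for all $\lambda$, Eq. 5.18 follows. Combining the representation of $\hat{W}$ from Eq. 5.11 with this leads to Eq. 5.17,
$$\hat{W}_\phi = \hat{Z}_{\phi - \pi/2}\hat{W}\hat{Z}_{-\phi + \pi/2} = \bigoplus_\lambda \begin{pmatrix} ie^{-i\phi} & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} \lambda & -g(\lambda) \\ g(\lambda) & \lambda \end{pmatrix}\begin{pmatrix} -ie^{i\phi} & 0 \\ 0 & 1 \end{pmatrix}. \qquad \square \tag{5.19}$$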
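The identity in Lem. 5.6 can be checked on a single two-dimensional block. This is a minimal sketch with an arbitrarily chosen eigenvalue and phase (both are illustrative assumptions); it compares the conjugated qubiterate block with the rotation $e^{-i\hat{\sigma}_\phi\theta}$, using $\hat{\sigma}_\phi^2 = \hat{I}$ to write the exponential in closed form.

```python
import numpy as np

lam, phi = 0.3, 0.7                       # illustrative eigenvalue and phase
g = np.sqrt(1 - lam**2)
theta = np.arccos(lam)

# Qubiterate block for a Hermitian signal operator, basis (|0_lam>, |0_lam^perp>).
W = np.array([[lam, -g], [g, lam]], dtype=complex)

def Z(angle):
    # Partial reflection: phase e^{-i angle} on |0_lam>, identity on |0_lam^perp>.
    return np.diag([np.exp(-1j * angle), 1.0])

W_phi = Z(phi - np.pi / 2) @ W @ Z(-phi + np.pi / 2)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
s_phi = np.cos(phi) * sx + np.sin(phi) * sy

# exp(-i theta s_phi) = cos(theta) I - i sin(theta) s_phi, since s_phi^2 = I.
rotation = np.cos(theta) * np.eye(2) - 1j * np.sin(theta) * s_phi
print(np.allclose(W_phi, rotation))
```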
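In Eq. 4.7, we considered the sequence of single-qubit rotations
$$\hat{V}_{\vec\phi}(\theta) = e^{-i\theta\hat{\sigma}_{\phi_N}/2} \cdots e^{-i\theta\hat{\sigma}_{\phi_2}/2}e^{-i\theta\hat{\sigma}_{\phi_1}/2} = A(\theta)\hat{I} + iB(\theta)\hat{\sigma}_z + iC(\theta)\hat{\sigma}_x + iD(\theta)\hat{\sigma}_y. \tag{5.20}$$
These may be compared to a sequence of $N$ phased qubiterates
$$\hat{W}_{\vec\phi} = \hat{W}_{\phi_N} \cdots \hat{W}_{\phi_2}\hat{W}_{\phi_1} = \bigoplus_\lambda e^{-i\hat{\sigma}_{\phi_N}^\lambda\theta_\lambda} \cdots e^{-i\hat{\sigma}_{\phi_2}^\lambda\theta_\lambda}e^{-i\hat{\sigma}_{\phi_1}^\lambda\theta_\lambda}, \qquad \theta_\lambda = \cos^{-1}(\lambda). \tag{5.21}$$
In each subspace $\mathcal{H}_\lambda$, this is a product of SU(2) rotations, identical to the single-qubit case, except that the rotation angle is doubled. Thus we may decompose this in the Pauli basis $\hat{I}^\lambda, \hat{\sigma}^\lambda$ like
$$\hat{W}_{\vec\phi} = \bigoplus_\lambda A(2\theta_\lambda)\hat{I}^\lambda + iB(2\theta_\lambda)\hat{\sigma}_z^\lambda + iC(2\theta_\lambda)\hat{\sigma}_x^\lambda + iD(2\theta_\lambda)\hat{\sigma}_y^\lambda, \tag{5.22}$$
where $(A, B, C, D)$ are real functions of $2\theta_\lambda$. As we can only prepare and measure the ancilla computational basis states $|0\rangle_{ab}$, we consider the component $\langle 0|_{ab}\hat{W}_{\vec\phi}|0\rangle_{ab} = \sum_\lambda (A(2\theta_\lambda) + iB(2\theta_\lambda))|\lambda\rangle\langle\lambda|_s$. We find it useful to define the functions $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ of $\lambda$ related by a variable substitution, e.g. $A(\theta) = \mathcal{A}(\cos(\theta/2))$. Thus $\hat{W}_{\vec\phi}$ encodes the matrix $\mathcal{A}[\hat{H}] + i\mathcal{B}[\hat{H}]$ in standard-form-$(\mathcal{A}[\hat{H}] + i\mathcal{B}[\hat{H}], 1, \hat{W}_{\vec\phi}, 2d)$.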
Any choice of phases $\vec\phi \in \mathbb{R}^N$ generates sophisticated interference effects between elements of the sequence, leading to $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ with some non-trivial functional dependence on $\hat{H}$. Though the dependence of the output on $\vec\phi$ seems hard to intuit, the phases nevertheless specify a program for computing functions of $\hat{H}$, similar to how a list of numbers might specify a polynomial. Fortunately, we may apply the characterization of QSPL in Chapter 2 to determine achievable functions $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$, and their implementation by some choice of $\vec\phi$ that may be obtained from any valid partial specification of $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ by some efficient classical algorithm. For instance, we have the following result regarding Eq. 5.20.
Lemma 5.7 (Achievable (A, B) - from Thm. 2.7). For any integer $N > 0$, a choice of functions $A, B$ in Eq. 5.20 is achievable by some $\vec\phi \in \mathbb{R}^N$ if and only if all the following are true:
(1) $A(\theta) = \mathcal{A}(x)$, $B(\theta) = \mathcal{B}(x)$, where $\mathcal{A}, \mathcal{B}$ are real parity-($N$ mod 2) polynomials in $x = \cos(\theta/2)$ of degree at most $N$;
(2) $\mathcal{A}(1) = 1$;
(3) $\forall x \in [-1, 1]$, $\mathcal{A}^2(x) + \mathcal{B}^2(x) \le 1$;
(4) $\forall x \ge 1$, $\mathcal{A}^2(x) + \mathcal{B}^2(x) \ge 1$;
(5) $\forall N$ even, $x \ge 0$, $\mathcal{A}^2(ix) + \mathcal{B}^2(ix) \ge 1$.
Moreover, $\vec\phi \in \mathbb{R}^N$ can be computed in classical $\mathcal{O}(\mathrm{poly}(N))$ time.
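This implies the quantum signal processing result
Theorem 5.8 (Ancilla-Free Quantum Signal Processing). Given Hermitian standard-form-$(\hat{H}, 1, (\hat{G}^\dagger \otimes \hat{I}_s)\hat{U}(\hat{G} \otimes \hat{I}_s), d)$, let any $\mathcal{A}, \mathcal{B}$ be degree $N$ polynomials that satisfy the conditions of Lem. 5.7. Then there exists a standard-form-$(\mathcal{A}[\hat{H}] + i\mathcal{B}[\hat{H}], 1, \hat{W}_{\vec\phi}, 2d)$, where $\hat{W}_{\vec\phi}$ requires $\mathcal{O}(N)$ queries to controlled-$\hat{U}$, and $\mathcal{O}(N\log(d))$ primitive quantum gates.
Proof. This follows from the identity $A(2\theta_\lambda) = \mathcal{A}(\cos(\theta_\lambda)) = \mathcal{A}(\lambda)$, and similarly for the $B$ term. $\square$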
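The polynomial structure asserted by Lem. 5.7 and Thm. 5.8 can be observed numerically for a single block. The sketch below uses randomly chosen phases (an assumption for illustration; it does not solve for phases that realize a given target) and checks that the top-left element of a product of $N$ phased qubiterate blocks is a polynomial in $\lambda$ of degree at most $N$ with parity $N$ mod 2, equal to one at $\lambda = 1$ and bounded by one in magnitude.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly

rng = np.random.default_rng(2)
N = 5
phis = rng.uniform(0, 2 * np.pi, size=N)    # arbitrary illustrative phases

def phased_block(lam, phi):
    # 2x2 block of the phased qubiterate for eigenvalue lam.
    g = np.sqrt(1 - lam**2)
    return np.array([[lam, -1j * np.exp(-1j * phi) * g],
                     [-1j * np.exp(1j * phi) * g, lam]])

def top_left(lam):
    M = np.eye(2, dtype=complex)
    for phi in phis:                        # W_{phi_N} ... W_{phi_1}
        M = phased_block(lam, phi) @ M
    return M[0, 0]                          # script-A(lam) + i script-B(lam)

lams = np.linspace(-1, 1, 41)
vals = np.array([top_left(x) for x in lams])

# Least-squares fit of degree-N polynomials to the real and imaginary parts.
coef_A = npoly.polyfit(lams, vals.real, N)
coef_B = npoly.polyfit(lams, vals.imag, N)
fit = npoly.polyval(lams, coef_A) + 1j * npoly.polyval(lams, coef_B)

print("max fit residual:", np.max(np.abs(fit - vals)))
print("even-order coefficients (N odd):", coef_A[0::2], coef_B[0::2])
```

With $N$ odd, the even-order coefficients vanish to numerical precision, reflecting the definite parity of both components.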
As polynomials form a complete basis on bounded real intervals, these results imply the query complexity of approximating any real function $\mathcal{A}[\hat{H}]$ with error $\epsilon$ is exactly that of its best polynomial $\epsilon$-approximation satisfying the constraints of Thm. 5.8, and similarly for the complex case.
5.4.2
Single-ancilla flexible quantum signal processing
Thm. 5.8 would be more useful if we could drop the unintuitive constraints (4,5) that impose restrictions on what the target functions must be outside the domain of interest. In the following Thm. 5.9, we present a generalization that computes functions with only one component $\mathcal{B}[\hat{H}] = (\langle 0|_{abc} \otimes \hat{I}_s)\hat{V}_{\vec\phi}(|0\rangle_{abc} \otimes \hat{I}_s)$ without those constraints, using an additional single-qubit ancilla register $c$. Note that this does not follow immediately from Thm. 5.8 as the constraint $\mathcal{A}(1) = 1$ means there will always be some $\mathcal{A}$ component, even if the characterizations of other partial specifications of $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ are used. The trick is to exploit the structure of single-qubit rotations in Eq. 5.20 to stage a perfect cancellation of the $\mathcal{A}[\hat{H}]$ term by taking a linear combination of two standard-form encodings for $(\langle 0|_{ab} \otimes \hat{I}_s)\hat{W}_{\pm\vec\phi}(|0\rangle_{ab} \otimes \hat{I}_s) = \mathcal{A}[\hat{H}] \pm i\mathcal{B}[\hat{H}]$.
Theorem 5.9 (Flexible quantum signal processing). Given a Hermitian standard-form-$(\hat{H}, 1, \hat{U}, d)$, let $\mathcal{B}$ be any function that satisfies all the following conditions:
(1) $\mathcal{B}(x) = \sum_{j=0}^{N} b_j x^j$ is a real parity-($N$ mod 2) polynomial of degree at most $N$;
(2) $\mathcal{B}(0) = 0$;
(3) $\forall x \in [-1, 1]$, $\mathcal{B}^2(x) \le 1$.
Then there exists a Hermitian standard-form-$(\mathcal{B}[\hat{H}], 1, \hat{V}_{\vec\phi}, 4d)$, where $\mathcal{B}[\hat{H}] = \sum_{j=0}^{N} b_j \hat{H}^j$, and $\hat{V}_{\vec\phi}$ requires $\mathcal{O}(N)$ queries to controlled-$\hat{U}$ and $\mathcal{O}(N\log(d))$ primitive quantum gates precomputed in classical $\mathcal{O}(\mathrm{poly}(N))$ time.
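Proof of Thm. 5.9. Consider the composite qubiterate in Eq. 5.22 controlled by a single-qubit ancilla $c$. Let
$$\hat{V}'_{\vec\phi} = (-i\hat{\sigma}_z \otimes \hat{I}_{abs})\left(|0\rangle\langle 0|_c \otimes \hat{W}_{\vec\phi} + |1\rangle\langle 1|_c \otimes \hat{W}_{-\vec\phi}\right). \tag{5.23}$$
[Figure 5-4: two circuit diagrams on registers $|0\rangle_c$, $|0\rangle_b$, $|0\rangle_a$ and the system.]
Figure 5-4: (top) Circuit diagram for the flexible qubiterate $\hat{V}_\phi = |0\rangle\langle 0|_c \otimes \hat{W}_\phi + |1\rangle\langle 1|_c \otimes \hat{W}_{-\phi}$, where $\hat{R}_\phi = \hat{I}_{ab} - (1 - e^{-i\phi})|0\rangle\langle 0|_a \otimes |0\rangle\langle 0|_b$. (bottom) Circuit diagram for the flexible composite qubiterate $\hat{V}_{\vec\phi}$ used to encode a standard-form-$(\mathcal{B}[\hat{H}], 1, \hat{V}_{\vec\phi}, 4d)$. The query complexity of $\hat{V}_{\vec\phi}$ is $\mathcal{O}(N)$ to $\hat{G}$, controlled-$\hat{U}$, and their inverses. Its gate complexity is $\mathcal{O}(N\log(d))$.
Note that details in the construction of $\hat{W}_{\vec\phi}$ actually allow for the implementation of $\hat{V}'_{\vec\phi}$ with the same query complexity, as seen in Figure 5-4. By applying the similarity transformation $\hat{\sigma}_x e^{-i\theta\hat{\sigma}_\phi}\hat{\sigma}_x = e^{-i\theta\hat{\sigma}_{-\phi}}$, with $\hat{\sigma}_x\hat{\sigma}_z\hat{\sigma}_x = -\hat{\sigma}_z$ and $\hat{\sigma}_x\hat{\sigma}_y\hat{\sigma}_x = -\hat{\sigma}_y$,
$$\hat{W}_{-\vec\phi}|0\rangle_{ab}|\lambda\rangle_s = e^{-i\hat{\sigma}_{-\phi_N}^\lambda\theta_\lambda} \cdots e^{-i\hat{\sigma}_{-\phi_1}^\lambda\theta_\lambda}|0\rangle_{ab}|\lambda\rangle_s = \left(\mathcal{A}(\lambda)\hat{I}^\lambda - i\mathcal{B}(\lambda)\hat{\sigma}_z^\lambda + i\mathcal{C}(\lambda)\hat{\sigma}_x^\lambda - i\mathcal{D}(\lambda)\hat{\sigma}_y^\lambda\right)|0\rangle_{ab}|\lambda\rangle_s. \tag{5.24}$$
Thus using the ancilla state $|+\rangle_c|0\rangle_{ab}$, where $|\pm\rangle = \frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$, as the input to $\hat{V}'_{\vec\phi}$ results in:
$$\hat{V}'_{\vec\phi}|+\rangle_c|0\rangle_{ab}|\lambda\rangle_s = \left(-i\mathcal{A}(\lambda)|-\rangle_c + \mathcal{B}(\lambda)|+\rangle_c\right)|0\rangle_{ab}|\lambda\rangle_s + \left(\mathcal{C}(\lambda)|-\rangle_c + i\mathcal{D}(\lambda)|+\rangle_c\right)|0_\lambda^\perp\rangle_{abs}. \tag{5.25}$$
Thus $(\langle +|_c\langle 0|_{ab} \otimes \hat{I}_s)\hat{V}'_{\vec\phi}(|+\rangle_c|0\rangle_{ab} \otimes \hat{I}_s) = \mathcal{B}[\hat{H}]$ encodes $\mathcal{B}[\hat{H}]$ in standard-form. Note that this is independent of all the other functions $\mathcal{A}, \mathcal{C}, \mathcal{D}$, which are in general non-zero. Thus we may apply Lem. 5.10 on achievable $(B)$ even though all other components are in general non-zero. Finally, let $\hat{V}_{\vec\phi} = (\mathrm{Had} \otimes \hat{I}_{abs})\hat{V}'_{\vec\phi}(\mathrm{Had} \otimes \hat{I}_{abs})$. $\square$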
Lemma 5.10 (Achievable (B) - from Thm. 2.8). For any integer $N > 0$, a choice of function $B$ in Eq. 5.20 is achievable by some $\vec\phi \in \mathbb{R}^N$ if and only if all the following are true:
(1) $B(\theta) = \mathcal{B}(x)$, where $\mathcal{B}$ is a real parity-($N$ mod 2) polynomial in $x = \cos(\theta/2)$ of degree at most $N$;
(2) $\mathcal{B}(0) = 0$;
(3) $\forall x \in [-1, 1]$, $\mathcal{B}^2(x) \le 1$.
Moreover, $\vec\phi \in \mathbb{R}^N$ can be computed in classical $\mathcal{O}(\mathrm{poly}(N))$ time.
With Thm. 5.9, we are assured that any degree $N$ bounded matrix polynomial that goes to zero at the origin can be implemented exactly on a quantum computer using $\mathcal{O}(N)$ queries, $\mathcal{O}(N)$ additional primitive quantum gates, and $\mathcal{O}(1)$ additional ancilla qubits. Using a similar procedure, one may instead extract the $\mathcal{A}[\hat{H}]$ component.
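The cancellation used in Thm. 5.9 rests on the fact that reversing the sign of every phase conjugates the encoded component, so that the two encodings carry the same real part and opposite imaginary parts. The sketch below checks this on one block with randomly chosen phases and an arbitrary eigenvalue (both illustrative assumptions); the half-sum and half-difference are classical post-processing here, standing in for the linear combination that the ancilla $c$ realizes coherently.

```python
import numpy as np

rng = np.random.default_rng(3)
phis = rng.uniform(0, 2 * np.pi, size=4)    # illustrative phases

def composite(lam, phases):
    # Product of phased qubiterate blocks W_{phi_N} ... W_{phi_1} for eigenvalue lam.
    g = np.sqrt(1 - lam**2)
    M = np.eye(2, dtype=complex)
    for phi in phases:
        M = np.array([[lam, -1j * np.exp(-1j * phi) * g],
                      [-1j * np.exp(1j * phi) * g, lam]]) @ M
    return M

lam = 0.42
plus = composite(lam, phis)[0, 0]       # encodes A + iB
minus = composite(lam, -phis)[0, 0]     # encodes A - iB

A_isolated = (plus + minus) / 2         # the B contribution cancels
B_isolated = (plus - minus) / 2j        # the A contribution cancels
print(A_isolated, B_isolated)
```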
5.4.3
Single-ancilla quantum signal processing on arbitrary unitaries
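By exploiting an additional ancilla qubit, Thm. 5.9 presents one approach for computing matrix polynomials with looser constraints. However, other constructions are possible. In this section, we restate Thm. 4.2 for eigenphase transformation, which provides one such alternative. Given any unitary $\hat{P}$ with eigenstates $\hat{P}|\lambda\rangle_s = e^{i\theta_\lambda}|\lambda\rangle_s$ and $\hat{P}_0 = |+\rangle\langle +|_c \otimes \hat{I}_s + |-\rangle\langle -|_c \otimes \hat{P}$ controlled by the single-qubit ancilla register $c$, where $\hat{\sigma}_x|\pm\rangle_c = \pm|\pm\rangle_c$, consider the sequence
$$\hat{P}_{\vec\phi} = \prod_{k\ \mathrm{odd} \ge 1}^{N} \hat{P}^\dagger_{\phi_{k+1}+\pi}\hat{P}_{\phi_k} = \hat{P}^\dagger_{\phi_N+\pi}\hat{P}_{\phi_{N-1}} \cdots \hat{P}^\dagger_{\phi_2+\pi}\hat{P}_{\phi_1}, \tag{5.26}$$
$$\hat{P}_\phi = (e^{-i\phi\hat{\sigma}_z/2} \otimes \hat{I}_s)\hat{P}_0(e^{i\phi\hat{\sigma}_z/2} \otimes \hat{I}_s) = \bigoplus_\lambda e^{i\theta_\lambda/2}e^{-i\theta_\lambda\hat{\sigma}_\phi/2} \otimes |\lambda\rangle\langle\lambda|_s.$$
For each eigenstate $|\lambda\rangle$, a product of single-qubit operators $e^{-i\theta_\lambda\hat{\sigma}_{\phi_N}/2} \cdots e^{-i\theta_\lambda\hat{\sigma}_{\phi_1}/2}$ similar to Eq. 5.21 is obtained, and these only act on the ancilla $c$. Thus the choice of $\vec\phi$ determines the effective single-qubit ancilla operator that is controlled by $|\lambda\rangle$:
$$\hat{P}_{\vec\phi} = \bigoplus_\lambda \left(\hat{I}_2 A(\theta_\lambda) + i\hat{\sigma}_z B(\theta_\lambda) + i\hat{\sigma}_x C(\theta_\lambda) + i\hat{\sigma}_y D(\theta_\lambda)\right) \otimes |\lambda\rangle\langle\lambda|_s. \tag{5.27}$$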
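The reduction of the controlled sequence to single-qubit rotations on the ancilla can be checked for one eigenphase. In this sketch the eigenphase and the two control phases are arbitrary illustrative values; the function `P` is the ancilla operator that the phased controlled unitary applies on an eigenstate, and `rot` is the rotation $e^{-i\theta\hat{\sigma}_\phi/2}$.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
plus = np.array([1, 1]) / np.sqrt(2)
minus = np.array([1, -1]) / np.sqrt(2)

def rot(theta, phi):
    # exp(-i theta sigma_phi / 2) with sigma_phi = cos(phi) sx + sin(phi) sy.
    s_phi = np.cos(phi) * sx + np.sin(phi) * sy
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * s_phi

def P(theta, phi):
    # Ancilla operator of P_phi restricted to an eigenstate with eigenphase theta.
    P0 = np.outer(plus, plus) + np.exp(1j * theta) * np.outer(minus, minus)
    zrot = np.diag([np.exp(-1j * phi / 2), np.exp(1j * phi / 2)])  # exp(-i phi sz / 2)
    return zrot @ P0 @ zrot.conj().T

theta, phi1, phi2 = 0.9, 0.3, 1.7   # illustrative values

# One pair of the sequence: P^dagger_{phi2 + pi} P_{phi1}.
pair = P(theta, phi2 + np.pi).conj().T @ P(theta, phi1)
expected = rot(theta, phi2) @ rot(theta, phi1)
print(np.allclose(pair, expected))
```

The global phases $e^{\pm i\theta/2}$ of the two factors cancel within each pair, which is why the sequence alternates between a phased unitary and the adjoint of one with its phase shifted by $\pi$.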
In the Hamiltonian simulation problem, we are concerned with implementing eigenphase transformations and so select the components $\langle +|_c\hat{P}_{\vec\phi}|+\rangle_c = \sum_\lambda (A(\theta_\lambda) + iC(\theta_\lambda))|\lambda\rangle\langle\lambda|_s$. Note that $A, B$ are even Fourier series and $C, D$ are odd Fourier series, all of degree at most $N/2$. Thus by using a Fourier series to approximate any target function, we are able to fully exploit any smooth or analytic behavior, with a query complexity exactly twice the degree of best Fourier approximations. It follows from Cor. 2.9 for partial tuples $(A, \cdot, C, \cdot)$ that
Theorem 5.11 (Achievable (A,C)). The target operator $\langle +|_c\hat{P}_{\vec\phi}|+\rangle_c = \sum_\lambda (A(\theta_\lambda) + iC(\theta_\lambda))|\lambda\rangle\langle\lambda|$ can be implemented by some $\vec\phi \in \mathbb{R}^N$, where $N$ is even, if and only if all the following are true:
(1) $\forall\theta \in \mathbb{R}$, $A^2(\theta) + C^2(\theta) \le 1$;
(2) $A(0) = 1$;
(3) $A(\theta) = \sum_{k=0}^{N/2} a_k\cos(k\theta)$;
(4) $C(\theta) = \sum_{k=1}^{N/2} c_k\sin(k\theta)$.
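This enables eigenphase transformations $A(\theta_\lambda) + iC(\theta_\lambda) \approx e^{ih(\theta_\lambda)}$ through
Theorem 5.12 (Eigenphase transformation, restated from Thm. 4.2). $\forall$ real odd periodic functions $h : (-\pi, \pi] \rightarrow (-\pi, \pi]$ and even $N > 0$, let $(A[\theta], C[\theta])$ be real Fourier series in $(\cos(k\theta), \sin(k\theta))$, $k = 0, \ldots, N/2$, that approximate $\max_{\theta \in \mathbb{R}}|A[\theta] + iC[\theta] - e^{ih(\theta)}| \le \epsilon/8$. Given $A[\theta], C[\theta]$, one can efficiently compute the $\vec\phi$ such that $\langle +|_c\hat{P}_{\vec\phi}|+\rangle_c$ in Eq. 5.26 applies $\hat{P}_\phi$ a number $N$ times to approximate $\hat{P}_{\mathrm{ideal}} = \sum_\lambda e^{ih(\theta_\lambda)}|\lambda\rangle\langle\lambda|$ with trace distance $\|\langle +|_c\hat{P}_{\vec\phi}|+\rangle_c - \hat{P}_{\mathrm{ideal}}\| \le \epsilon$ and success probability $\ge 1 - 2\epsilon$.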
5.4.4
Single-ancilla quantum signal processing on controlled-qubiterates
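Setting these arbitrary controlled-unitaries $\hat{P}$ in Eq. 5.27 to be controlled-qubiterates $\hat{W}$ leads to interesting results not captured by Thm. 5.8 or Thm. 5.9. Observe from Eq. 5.11 that $\hat{W}$ can be diagonalized to obtain its eigenvalues $e^{\mp i\cos^{-1}(\lambda)}$ and eigenvectors $|0_{\lambda\pm}\rangle_{abs}$:
$$\hat{W}|0_{\lambda\pm}\rangle_{abs} = e^{i\theta_{\lambda\pm}}|0_{\lambda\pm}\rangle_{abs}, \qquad \theta_{\lambda\pm} = \mp\cos^{-1}(\lambda), \qquad |0_{\lambda\pm}\rangle_{abs} = \frac{|0_\lambda\rangle_{abs} \pm i|0_\lambda^\perp\rangle_{abs}}{\sqrt{2}}. \tag{5.28}$$
By a judicious choice of ancilla state, more flexible target function transformations are possible. As $|0_\lambda\rangle = \frac{1}{\sqrt{2}}(|0_{\lambda+}\rangle + |0_{\lambda-}\rangle)$, projecting the sequence Eq. 5.26 onto $|0\rangle_{ab}$ leads to a linear combination of Eq. 5.27 with eigenphases $\theta_{\lambda\pm} = \mp\cos^{-1}(\lambda)$. Writing $\theta_\lambda = \cos^{-1}(\lambda)$,
$$\langle 0|_{ab}\hat{P}_{\vec\phi}|0\rangle_{ab} = \bigoplus_\lambda \frac{1}{2}\Big[\hat{I}_2\left(A(\theta_\lambda) + A(-\theta_\lambda)\right) + i\hat{\sigma}_z\left(B(\theta_\lambda) + B(-\theta_\lambda)\right) + i\hat{\sigma}_x\left(C(\theta_\lambda) + C(-\theta_\lambda)\right) + i\hat{\sigma}_y\left(D(\theta_\lambda) + D(-\theta_\lambda)\right)\Big] \otimes |\lambda\rangle\langle\lambda|_s$$
$$= \bigoplus_\lambda \left(\hat{I}_2 A(\theta_\lambda) + i\hat{\sigma}_z B(\theta_\lambda)\right) \otimes |\lambda\rangle\langle\lambda|_s = \hat{I}_2 \otimes \mathcal{A}[\hat{H}] + i\hat{\sigma}_z \otimes \mathcal{B}[\hat{H}], \tag{5.29}$$
where in the second line, the parity of $(A, B, C, D)$ leads to a cancellation of $(C, D)$, followed by a change of variables $\theta_\lambda = \cos^{-1}(\lambda)$ using Chebyshev polynomials $T_n(\cos(\theta)) = \cos(n\theta)$ of the first kind: $(A, B)$ are even Fourier series, so $A(\theta_\lambda) = \sum_{n=0}^{N/2} a_n T_n(\cos(\theta_\lambda)) = \sum_{n=0}^{N/2} a_n T_n(\lambda) = \mathcal{A}(\lambda)$. Whereas $(\mathcal{A}, \mathcal{B})$ in Thm. 5.8 are restricted to be of definite parity, $(\mathcal{A}, \mathcal{B})$ in Eq. 5.29 have no such restriction.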
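The cancellation of the $C$ and $D$ components in Eq. 5.29 can be reproduced on the ancilla qubit alone. In this sketch the phases and the eigenvalue are illustrative assumptions. Since the state $|0_\lambda\rangle$ overlaps equally with the two eigenvectors of the qubiterate, the projected ancilla operator is the average of the rotation sequences at eigenphases $+\theta$ and $-\theta$.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

def sequence(theta, phis):
    # Product exp(-i theta sigma_{phi_N}/2) ... exp(-i theta sigma_{phi_1}/2).
    M = np.eye(2, dtype=complex)
    for phi in phis:
        s_phi = np.cos(phi) * sx + np.sin(phi) * sy
        M = (np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * s_phi) @ M
    return M

rng = np.random.default_rng(4)
phis = rng.uniform(0, 2 * np.pi, size=6)     # N = 6, illustrative phases
theta = np.arccos(0.35)                      # eigenphase magnitude for lambda = 0.35

full = sequence(theta, phis)                 # A I + iB sz + iC sx + iD sy
averaged = 0.5 * (full + sequence(-theta, phis))

# Off-diagonal entries carry C and D; they cancel in the average.
print(np.round(averaged, 12))
```

The averaged operator is diagonal, with entries $A \pm iB$, which is the structure $\hat{I}_2 A + i\hat{\sigma}_z B$ of the second line of Eq. 5.29.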
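Theorem 5.13. The target operator $\langle 0|_c\langle 0|_{ab}\hat{P}_{\vec\phi}|0\rangle_{ab}|0\rangle_c = \mathcal{A}[\hat{H}] + i\mathcal{B}[\hat{H}]$ can be implemented by some $\vec\phi \in \mathbb{R}^N$, where $N$ is even, if and only if all the following are true:
(1) $\mathcal{A}(\lambda), \mathcal{B}(\lambda)$ are real degree $N/2$ polynomials;
(2) $\mathcal{A}(1) = 1$;
(3) $\forall\lambda \in [-1, 1]$, $\mathcal{A}^2(\lambda) + \mathcal{B}^2(\lambda) \le 1$;
(4) $\forall\lambda \le -1$ or $\lambda \ge 1$, $\mathcal{A}^2(\lambda) + \mathcal{B}^2(\lambda) \ge 1$.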
Proof. The sequence of rotations $e^{-i\theta_\lambda\hat{\sigma}_{\phi_N}/2} \cdots e^{-i\theta_\lambda\hat{\sigma}_{\phi_2}/2}e^{-i\theta_\lambda\hat{\sigma}_{\phi_1}/2}$ is identical to those in Thm. 5.8 but with a halved rotation angle $\theta_\lambda \rightarrow \theta_\lambda/2$. Thus all conditions of Lem. 5.7 on the polynomial in $\cos(\theta/2)$ apply to Eq. 5.29. Using the Chebyshev polynomial $\cos(\theta) = T_2(\cos(\theta/2))$, we equate the polynomial of Lem. 5.7 in $x = \cos(\theta/2)$ with $\mathcal{A}(T_2(x))$. In the forward direction, condition (5) of Lem. 5.7 on the domain $\forall y \ge 0$, $x = iy$ is mapped onto $\forall y \ge 0$, $T_2(iy) \le -1$. In the reverse direction, solving $T_2(y) = 2y^2 - 1 = x$ for $x \le -1$ leads to $y = \pm\sqrt{1 + x}/\sqrt{2}$, which is imaginary and matches condition (5) of Lem. 5.7. Note the $\pm$ indexing two possible solutions is squared as $\mathcal{A}^2 + \mathcal{B}^2$ are even polynomials. $\square$
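If concerned with just selecting $\mathcal{A}[\hat{H}]$ or $\mathcal{B}[\hat{H}]$ alone, we may further relax the constraints on behavior for $|\lambda| > 1$, which is a domain of lesser interest as $\|\hat{H}\| \le 1$.
Theorem 5.14. The target operator $\langle +|_c\langle 0|_{ab}\hat{P}_{\vec\phi}|0\rangle_{ab}|+\rangle_c = \mathcal{A}[\hat{H}]$ can be implemented by some $\vec\phi \in \mathbb{R}^N$, where $N$ is even, if and only if all the following are true:
(1) $\mathcal{A}(\lambda)$ is a real degree $N/2$ polynomial;
(2) $\mathcal{A}(1) = 1$;
(3) $\forall\lambda \in [-1, 1]$, $\mathcal{A}^2(\lambda) \le 1$.
Theorem 5.15. The target operator $\langle +|_c\langle 0|_{ab}(-i\hat{\sigma}_z \otimes \hat{I}_{abs})\hat{P}_{\vec\phi}|0\rangle_{ab}|+\rangle_c = \mathcal{B}[\hat{H}]$ can be implemented by some $\vec\phi \in \mathbb{R}^N$, where $N$ is even, if and only if all the following are true:
(1) $\mathcal{B}(\lambda)$ is a real degree $N/2$ polynomial;
(2) $\mathcal{B}(1) = 0$;
(3) $\forall\lambda \in [-1, 1]$, $\mathcal{B}^2(\lambda) \le 1$.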
Proof. Thm. 5.14 follows directly from Thm. 2.8 for the partial tuple $(A, \cdot, \cdot, \cdot)$ by observing that $A$ is a Fourier series $A(\theta) = \sum_{n=0}^{N/2} a_n\cos(n\theta) = \sum_{n=0}^{N/2} a_n T_n(\cos(\theta))$ of degree $N/2$. Since $\lambda = \cos(\theta)$, $A(\theta) = \sum_{n=0}^{N/2} a_n T_n(\lambda) = \mathcal{A}(\lambda)$ is a polynomial of degree $N/2$. The proof for Thm. 5.15 is almost identical. Note that our use of the state $|+\rangle_c$ ensures that only one quadrature, either $\hat{I}_2$ or $\hat{\sigma}_z$, is selected. $\square$
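We may generalize the results of Section 5.4.3 slightly by adding arbitrary single-qubit rotations $e^{-i\Phi\hat{\sigma}_x/2}$ on the ancilla. We define
$$\hat{P}_{\Phi,\vec\phi} = \prod_{k\ \mathrm{odd} \ge 1}^{N} \hat{P}^\dagger_{\Phi,\phi_{k+1}+\pi}\hat{P}_{\Phi,\phi_k}, \qquad \hat{P}_{\Phi,\phi} = (e^{-i\phi\hat{\sigma}_z/2} \otimes \hat{I}_s)(e^{-i\Phi\hat{\sigma}_x/2} \otimes \hat{I}_s)\hat{P}_0(e^{i\phi\hat{\sigma}_z/2} \otimes \hat{I}_s). \tag{5.30}$$
Using $e^{-i\Phi\hat{\sigma}_x/2}|\pm\rangle = e^{\mp i\Phi/2}|\pm\rangle$, this modifies Eq. 5.27 for arbitrary controlled unitaries to
$$\hat{P}_{\Phi,\vec\phi} = \bigoplus_\lambda \left(\hat{I}_2 A(\Phi + \theta_\lambda) + i\hat{\sigma}_z B(\Phi + \theta_\lambda) + i\hat{\sigma}_x C(\Phi + \theta_\lambda) + i\hat{\sigma}_y D(\Phi + \theta_\lambda)\right) \otimes |\lambda\rangle\langle\lambda|_s. \tag{5.31}$$
This is equivalent to Eq. 5.26 with controlled-$(e^{i\Phi}\hat{P})$ instead of controlled-$\hat{P}$. Thus the results of Thms. 4.3, 5.12 still hold but with all phases shifted by $\Phi$.
Replacing $\hat{P}$ with the iterate $\hat{W}$ now leads to
$$\langle 0|_{ab}\hat{P}_{\Phi,\vec\phi}|0\rangle_{ab} = \bigoplus_\lambda \frac{1}{2}\Big[\hat{I}_2\left(A(\Phi + \theta_\lambda) + A(\Phi - \theta_\lambda)\right) + i\hat{\sigma}_z\left(B(\Phi + \theta_\lambda) + B(\Phi - \theta_\lambda)\right) + i\hat{\sigma}_x\left(C(\Phi + \theta_\lambda) + C(\Phi - \theta_\lambda)\right) + i\hat{\sigma}_y\left(D(\Phi + \theta_\lambda) + D(\Phi - \theta_\lambda)\right)\Big] \otimes |\lambda\rangle\langle\lambda|_s.$$
As a result, it is no longer clear what a complete characterization of $X(\Phi + \theta_\lambda) + X(\Phi - \theta_\lambda)$ for any $X \in \{A, B, C, D\}$ for arbitrary $\Phi$ is. Our strategy is to choose $\Phi$ and then consider cases where $X(\Phi + \theta_\lambda) = X(\Phi - \theta_\lambda)$. This allows us to apply the results of Thm. 2.7 and Thm. 2.8, but with additional constraints on the form of $(A, B, C, D)$. For instance, the cases $\Phi = 0, \pi$ are trivial as they reduce to Eq. 5.29 where $A(\theta_\lambda) = A(-\theta_\lambda)$. However, $A, B$ are already even Fourier series so no additional constraints arise.
In the case $\Phi = \pi/2$, we see that $X(\pi/2 + \theta_\lambda) = X(\pi/2 - \theta_\lambda)$ only if $X$ is symmetric about $\theta = \pi/2$. As $A, B$ are even Fourier series with period $2\pi$, imposing this symmetry halves their period to $\pi$, whereas the odd Fourier series $C, D$ now only have Fourier components of odd degree.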
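Theorem 5.16. The target operator $\langle +|_c\langle 0|_{ab}\hat{P}_{\pi/2,\vec\phi}|0\rangle_{ab}|+\rangle_c = \mathcal{A}[\hat{H}] + i\mathcal{C}[\hat{H}]$ can be implemented by some $\vec\phi \in \mathbb{R}^N$, where $N$ is even, if all the following are true:
(1) $\mathcal{A}(\lambda)$ is a real even polynomial of degree at most $N/2$;
(2) $\mathcal{C}(\lambda)$ is a real odd polynomial of degree at most $N/2$;
(3) $\forall\lambda \in [-1, 1]$, $\mathcal{A}^2(\lambda) + \mathcal{C}^2(\lambda) \le 1$;
(4) $\mathcal{A}(0) = 1$.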
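Proof. We start from the conditions of Lem. 4.3 for the partial tuple $(A, \cdot, C, \cdot)$. Assume that $X(\pi/2 + \theta) = X(\pi/2 - \theta)$ for $X \in \{A, C\}$. Thus $A(\theta) = \sum_{n\ \mathrm{even}}^{N/2} a_n\cos(n\theta) = \sum_{n\ \mathrm{even}}^{N/2} a_n T_n(\cos(\theta))$. Combining $\theta = \pi/2 \pm \cos^{-1}(\lambda) \Rightarrow \cos(\theta) = \mp\sqrt{1 - \lambda^2}$ and that $A(\theta)$ is an even polynomial in $\cos(\theta)$, we obtain $A(\theta) = \sum_{n\ \mathrm{even}}^{N/2} a'_n\lambda^n = \mathcal{A}[\lambda]$, which is an even polynomial in $\lambda$. Similarly, $C(\theta) = \sum_{n\ \mathrm{odd}}^{N/2} c_n\sin(n\theta) = \sum_{n\ \mathrm{odd}}^{N/2} c_n\sin(\theta)U_{n-1}(\cos(\theta)) = \sum_{n\ \mathrm{odd}}^{N/2} c'_n\lambda^n$, where $U_n(\cos(\theta)) = \frac{\sin((n+1)\theta)}{\sin(\theta)}$ are Chebyshev polynomials of the second kind. This proves conditions (1) and (2). Lem. 4.3 also states $\forall\theta \in \mathbb{R}$, $A^2(\theta) + C^2(\theta) \le 1$. Since $A^2(\theta) + C^2(\theta)$ is a Fourier series in $2\theta$, this is equivalent to $\forall\theta \in [0, \pi]$ and $\forall\chi \in \mathbb{R}$, $A^2(\theta + \chi) + C^2(\theta + \chi) \le 1$. Since $(\pi/2 \pm \cos^{-1}(\lambda)) : [-1, 1] \rightarrow \pi/2 \pm [0, \pi]$, a change of variables to $\lambda$ proves condition (3). Solving $\pi/2 \pm \cos^{-1}(\lambda) = 0$ leads to $\lambda = 0$, so the condition $A(0) = 1$ of Lem. 4.3 proves condition (4). $\square$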
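Whereas $\mathcal{A}(1) = 1$ in Thms. 5.13, 5.14, $\mathcal{A}(0) = 1$ in Thm. 5.16; these represent different families of target functions. An analogous characterization for other target operators such as $\langle 0|_c\langle 0|_{ab}\hat{P}_{\pi/2,\vec\phi}\hat{\sigma}_x|0\rangle_{ab}|0\rangle_c = \mathcal{D}[\hat{H}] + i\mathcal{C}[\hat{H}]$ and $\langle 0|_c\langle 0|_{ab}\hat{P}_{\pi/2,\vec\phi}|0\rangle_{ab}|0\rangle_c = \mathcal{A}[\hat{H}] + i\mathcal{B}[\hat{H}]$ follows from similar manipulations of Thm. 2.7.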
5.4.5
Double-ancilla quantum signal processing
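The target operators implementable by quantum signal processing on iterates $\hat{W}$ in Sections 5.4.1 and 5.4.4 are evidently very flexible, but nevertheless subject to some lax constraints. For instance, Thms. 5.16 and 5.14 fix $\mathcal{A}(\lambda) = 1$ at some $\lambda = 0, 1$, or impose a definite parity on $\mathcal{A}, \mathcal{C}$, and so on. The ability to implement completely arbitrary target operators would be invaluable. This can indeed be done with a simple modification of Thm. 5.15.
The solution is to take a linear combination of $\mathcal{B}[\hat{H}]$ and identity. Define a signal operator $\hat{P}'_{\vec\phi,\pm}$ controlled by yet another ancilla register $d$ and the ancilla state
$$|\alpha\rangle_{abcd} = |0\rangle_{ab}|+\rangle_c\left(\sqrt{\alpha}|0\rangle_d + \sqrt{1 - \alpha}|1\rangle_d\right), \tag{5.32}$$
where $1 \ge \alpha \ge 0$ and
$$\hat{P}'_{\vec\phi,\pm} = \pm|0\rangle\langle 0|_d \otimes \hat{I}_{abcs} + |1\rangle\langle 1|_d \otimes \left((-i\hat{\sigma}_z \otimes \hat{I}_{abs})\hat{P}_{\vec\phi}\right), \qquad \langle\alpha|_{abcd}\hat{P}'_{\vec\phi,\pm}|\alpha\rangle_{abcd} = (1 - \alpha)\mathcal{B}[\hat{H}] \pm \alpha\hat{I}. \tag{5.33}$$
Theorem 5.17. The target operator $\langle\alpha|_{abcd}\hat{P}'_{\vec\phi,\mathrm{sign}(f(1))}|\alpha\rangle_{abcd} = f[\hat{H}]/3$ can be implemented by some $\vec\phi \in \mathbb{R}^N$, where $N$ is even and $1 \ge \alpha \ge 0$, if all the following are true: (1) $f(\lambda)$ is a real degree $N/2$ polynomial; (2) $\forall\lambda \in [-1, 1]$, $|f(\lambda)| \le 1$.
Proof. Given any real polynomial $f(\lambda)$ of degree $N/2$ such that $\forall\lambda \in [-1, 1]$, $|f(\lambda)| \le 1$, apply Thm. 5.15 to find $\vec\phi \in \mathbb{R}^N$ such that $\langle +|_c\langle 0|_{ab}(-i\hat{\sigma}_z \otimes \hat{I}_{abs})\hat{P}_{\vec\phi}|0\rangle_{ab}|+\rangle_c = \mathcal{B}[\hat{H}]$, where $\mathcal{B}(\lambda) = \frac{f(\lambda) - f(1)}{3 - |f(1)|}$ is achievable as $\mathcal{B}(1) = 0$ and $\forall\lambda \in [-1, 1]$, $\mathcal{B}^2(\lambda) \le 1$. Choosing $\alpha = |f(1)|/3$,
$$\langle\alpha|_{abcd}\hat{P}'_{\vec\phi,\mathrm{sign}(f(1))}|\alpha\rangle_{abcd} = \left(1 - \frac{|f(1)|}{3}\right)\mathcal{B}[\hat{H}] + \frac{f(1)}{3}\hat{I} = \frac{f[\hat{H}]}{3},$$
where $\mathrm{sign}(x)$ returns the sign of $x$, with $\mathrm{sign}(0) = +$. $\square$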
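The arithmetic behind the linear combination in Thm. 5.17 can be checked at the level of scalar functions. In the sketch below the cubic target `f` is a hypothetical example chosen so that its magnitude is at most one on the interval of interest; the code forms the shifted and rescaled polynomial of the proof and confirms that mixing it with the identity reproduces one third of the target.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical target: f(x) = -0.2 + 0.3 x - 0.5 x^3, bounded by 1 on [-1, 1].
f = Polynomial([-0.2, 0.3, 0.0, -0.5])

f1 = f(1.0)
sign = 1.0 if f1 >= 0 else -1.0          # convention sign(0) = +
alpha = abs(f1) / 3.0                    # weight of the identity branch
B = (f - f1) / (3.0 - abs(f1))           # shifted, rescaled polynomial with B(1) = 0

lams = np.linspace(-1, 1, 201)
combination = (1 - alpha) * B(lams) + sign * alpha

print("max |combination - f/3| =", np.max(np.abs(combination - f(lams) / 3)))
print("max |B| on [-1, 1]      =", np.max(np.abs(B(lams))))
```

The factor of one third is the price of guaranteeing that the rescaled polynomial stays bounded by one for every admissible value of the target at $\lambda = 1$.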
Note that by taking yet another linear combination of two different $f_1, f_2$ created by Thm. 5.17 in a very similar manner, one can then implement in the standard form $f_1[\hat{H}] + if_2[\hat{H}]$, which is a completely arbitrary complex function.
5.4.6
Operator functions of normal matrices
The results of Thms. 5.4, 5.3 for qubitization can be extended to normal operators, though we will not classify their possible computable functions. It is well known that any normal matrix has a polar decomposition
$$\hat{H} = \hat{H}_U\hat{H}_H, \tag{5.34}$$
where $\hat{H}_U$ is unitary, $\hat{H}_H$ is positive-semidefinite, $[\hat{H}_U, \hat{H}_H] = 0$ commute, and the eigenvalues are $\hat{H}|\lambda\rangle = e^{i\theta_\lambda}\lambda|\lambda\rangle$, where $\lambda \ge 0$, $\theta_\lambda \in \mathbb{R}$. This reduces to a Hermitian operator when $\hat{H}_U$ has eigenvalues $\pm 1$, and reduces to a unitary operator when $\hat{H}_H$ has all eigenvalues equal to 1. The trivial approach to qubitization applies to any complex matrix. We simply use the construction of Thm. 5.4 to implement the Hermitian signal operator $\frac{1}{2}(\hat{H} + \hat{H}^\dagger)$.
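A small numerical example may help fix the notation for normal matrices. The sketch below builds a normal matrix from a randomly chosen eigenbasis, magnitudes and phases (all illustrative assumptions), and checks that its polar factors commute and that the Hermitian component targeted by the trivial approach has eigenvalues $\lambda\cos(\theta_\lambda)$.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4

# Random orthonormal eigenbasis (illustrative) from a QR decomposition.
z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
V, _ = np.linalg.qr(z)

lam = rng.uniform(0, 1, size=n)              # magnitudes lambda >= 0
theta = rng.uniform(-np.pi, np.pi, size=n)   # phases theta_lambda

H_U = V @ np.diag(np.exp(1j * theta)) @ V.conj().T   # unitary factor
H_H = V @ np.diag(lam) @ V.conj().T                  # positive-semidefinite factor
H = H_U @ H_H                                        # normal matrix

# Hermitian component encoded by the construction of Thm. 5.4.
herm = 0.5 * (H + H.conj().T)
herm_eigs = np.sort(np.linalg.eigvalsh(herm))
print(herm_eigs, np.sort(lam * np.cos(theta)))
```

Only the real parts of the eigenvalues survive in the Hermitian component, which is why this approach discards the phase information that the alternating-sequence construction retains.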
Another possibility uses two phased qubiterates in an alternating sequence on input state
JG)a R0)S,
WO=
where
form
Z0-7/ 2 (2|G)(G| - I)U Z-0+ / 2 ,
+ = $ and U
vo+ =
_ie -'S0
1
W_
=
(5.35)
Ut. For each eigenstate JA), the separate iterates have the block
2
.
AetOA
-
4_
=
-ie
4
0
2
/1-A
0
(5.36)
(53
-
where the first column corresponds to input states {|G)IA), IG) UIA), G) r+A)1. The
subspace spanned by these states is not invariant under any repeated application of an iterate
of the same sign. However, the product WW,+
has an invariant subspace containing
G). With the understanding that we only consider alternating sequences, each WO
the representation
/1
-ie
eT
-
A2
(537)
/
(
SA
-ieN/1 - A 2
has
Note that when all eigenvalues $\lambda$ are identical and $\phi = \pi/2$, this reduces to oblivious amplitude amplification [12], and we recover Hermitian qubitization when all $\theta_\lambda = 0$. While
this approach uses one less ancilla qubit than the construction of Thm. 5.4, quantum signal
processing must be applied with care here as only even-length sequences of the iterates have an invariant subspace. This limitation can be relevant in some cases, such as Hamiltonian simulation using
controlled-qubiterates individually.
5.5 Hamiltonian simulation by qubitization
The cost of simulating the time evolution operator $e^{-i\hat H t}$ depends on several factors: the
number of system qubits $n$, evolution time $t$, target error $\epsilon$, and how information on the
Hamiltonian $\hat H$ is made available. This field has progressed rapidly following groundbreaking work in the fractional query model [13] achieving query complexities that depend logarithmically on error. This was generalized by Berry, Childs, Cleve, Kothari, and Somma
(BCCKS) [14] to the case where $\hat H = \sum_{j=1}^{d} \alpha_j \hat U_j$ is a linear combination of $d$ unitaries
and the $|\alpha_j|$ sum to $\alpha$ - such a decomposition always exists - with an algorithm using
$O\left(\log(d)\frac{\log(\alpha t/\epsilon)}{\log\log(\alpha t/\epsilon)}\right)$ ancilla qubits and only
$O\left(\alpha t\frac{\log(\alpha t/\epsilon)}{\log\log(\alpha t/\epsilon)}\right)$ queries. Subsequently [10],
an extension to d-sparse Hamiltonians was made, where $\hat H$ has $\le d$ non-zero elements
per row with max-norm $\|\hat H\|_{\max}$, to achieve a quadratic improvement in sparsity with
$O\left(dt\|\hat H\|_{\max}\frac{\log(dt\|\hat H\|_{\max}/\epsilon)}{\log\log(dt\|\hat H\|_{\max}/\epsilon)}\right)$
queries. A prominent open question featured in all these works
was whether the additive lower bound $\Omega\left(t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ was achievable for any of these
models.
In Chapter 4, we presented a procedure achieving the optimal trade-off between all
parameters, with query complexity $\Theta\left(dt\|\hat H\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$. The strictly linear-time performance with additive complexity is a quadratic improvement for precision simulations
$t \sim \log(1/\epsilon)$,
and the constant number of $n + m + 3$ ancilla qubits significantly improves
on prior art which depends on $t$, $\epsilon$ like $O(\log(t/\epsilon))$.
Unfortunately, the d-sparse model is less appealing in practical implementations for
several reasons. First, it is exponentially slower than BCCKS when the $\hat U_j$ are of high weight
with sparsity $O(2^n)$. Second, its black-box oracles can be challenging to realize. Avoiding
the $O(2^n)$ blowup by exploiting sparsity requires that positions of non-zero elements are
efficiently row-computable, which is not always the case. Third, the Childs quantum walk
requires a doubling of the n system qubits, which is not required by BCCKS.
Ideally, the best features of these two algorithms could be combined. For example,
given the decomposition $\hat H = \sum_{j=1}^{d} \alpha_j \hat U_j$, one would like the optimal additive complexity of
sparse Hamiltonian simulation, but with the BCCKS oracles that are more straightforward to
implement. Furthermore, one could wish for a constant ancilla overhead, of say $\lceil\log_2(d)\rceil + 2$,
superior to either algorithm.
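To make the gap between the multiplicative and additive scalings concrete, the following is a minimal sketch that evaluates both expressions with every hidden constant set to 1. The function names and the unit constants are illustrative assumptions, not part of the thesis; only the functional forms are taken from the discussion above.

```python
import math


def log_over_loglog(x):
    """Return log(x) / log(log(x)); only meaningful for x well above e."""
    return math.log(x) / math.log(math.log(x))


def multiplicative_queries(alpha, t, eps):
    """BCCKS-style scaling alpha*t * log(alpha*t/eps) / loglog(alpha*t/eps), constant assumed 1."""
    return alpha * t * log_over_loglog(alpha * t / eps)


def additive_queries(alpha, t, eps):
    """Additive scaling alpha*t + log(1/eps) / loglog(1/eps), constant assumed 1."""
    return alpha * t + log_over_loglog(1.0 / eps)


def constant_ancilla_overhead(d):
    """The hoped-for ancilla count ceil(log2(d)) + 2, computed with integer arithmetic."""
    return (d - 1).bit_length() + 2


if __name__ == "__main__":
    alpha, t, eps = 1.0, 100.0, 1e-6
    print("multiplicative:", multiplicative_queries(alpha, t, eps))
    print("additive:      ", additive_queries(alpha, t, eps))
    print("ancillas, d=8: ", constant_ancilla_overhead(8))
```

In the additive form the precision term is paid once, whereas in the multiplicative form it is paid for every unit of $\alpha t$.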
The SU(2) structure of the qubiterate is identical to that of the quantum walks used
for sparse Hamiltonian simulation in Chapter 4. Thus the technique of approximating
time-evolution by a sequence of controlled-quantum walks is directly applicable without
modification. This realizes the optimistic fusion of best-case results in prior art, and
motivates new formulations of Hamiltonian simulation. In this section, we prove the optimal
Hamiltonian simulation algorithm in Thm. 5.18 that uses the standard-form encoding as
the input, apply this simulation algorithm to Hamiltonians described in Section 5.2, and
summarize a comparison of the results with prior art in Table 5.2.
Theorem 5.18 (Hamiltonian simulation by qubitization). Given Hermitian standard-form-$(\hat H, \alpha, \hat U, d)$,
there exists a standard-form-$(\hat X, 1, \hat V, 4d)$ such that $\|\hat X - e^{-i\hat H t}\| \le \epsilon$, where $\hat V$
requires $Q = O\left(t\alpha + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries to controlled-$\hat U$ and $O(Q\log(d))$ primitive gates¹.
The optimality of the procedure follows by using the qubitized variant of Childs' quantum walk for $\hat U$. Furthermore, the transparent nature of Thm. 4.4 significantly expedites
the development of new useful formulations of Hamiltonian simulation. For instance, we
easily obtain a new result for the scenario where $\hat H$ is a density matrix $\hat\rho$. Whereas $\hat\rho$ can
be produced by discarding the ancilla of some output from a quantum circuit $\hat G$, we instead keep this ancilla, leading to an unconditional quadratic improvement in time scaling,
and an exponential improvement in error scaling over the sample-based Lloyd, Mohseni, and
Rebentrost (LMR) model [82, 70], as summarized in Table. 5.2. Indeed, most quantum prin'As error E occurs only in logarithms, it may refer to the trace distance, failure probability, or any other
polynomially-related distance without affecting the complexity scaling.
Algorithm
Model
__ ___
__ __ __ __ __
__(-()
Sparse [86]
(9(-)
_
dtIHI|max +
n+ m + 3
Hik
log (1/)
LMR [82]
Mixed p
n+ 1
log log (at/c)
t 2 /6
Thm. 5.18
Cor. 5.20
Con
(0IUIO)
.20log
E. ajUj
[log 2 (d)] + 2
[log 2 (d)] + 2
.2
t + log(1/6)
log (1/)
at + loglog(1
log (1/0)
Cor. 5.21
Purified p
n + [log 2 (d)] + 2
t + logloglog(1/f)
(1/E)lon
log log (at/0)
JJ.2
Gates per Query
n + poly(m)
at log (cd/c)
0 (log(d) log(at/E)
ja
BCCKS [14]
Query Complexity
Ancilla qubits
logrn
Table 5.2: Performance comparison of state-of-art with our new approaches (bottom three
lines), for Hamiltonian simulation $e^{-i\hat H t}$ of $\hat H \in \mathbb{C}^{2^n \times 2^n}$ with error $\epsilon$. The d-sparse simulation
oracle describes entries of $\hat H$ with maximum absolute value $\|\hat H\|_{\max}$ to $m$ bits of precision.
The BCCKS oracle provides the decomposition $\hat H = \sum_{j=1}^{d} \alpha_j \hat U_j$, $\alpha = \sum_{j=1}^{d} |\alpha_j|$ and
each $\hat U_j$ has cost $O(1)$. The LMR query complexity refers to samples of the density matrix
$\hat H$. This work generalizes the above in Thm. 5.18 with oracles $\hat G|0\rangle = |G\rangle \in \mathbb{C}^d$, $\hat U$ such
that $\langle G|\hat U|G\rangle = \hat H$, where $\|\hat H\| \le 1$. A new model, Corollary 5.21, is provided where the oracle
outputs the purification $|\rho\rangle = \sum_j \sqrt{\alpha_j}\,|j\rangle_a|\psi_j\rangle$, $\mathrm{Tr}_a[|\rho\rangle\langle\rho|] = \hat\rho$.
ciple component analysis applications for machine learning as well as quantum semidefinite
programming [16], are compatible with this form, and are thus enhanced.
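We now proceed with the proof of Thm. 5.18. Note that the normalization $\alpha$ may be
absorbed by rescaling Hamiltonian $\hat H$. For simplicity, the proof assumes $\alpha = 1$. Using
linearity, it suffices to prove our results on a single eigenstate $\hat H|\lambda\rangle = \lambda|\lambda\rangle$. From Eq. 5.11,
the qubiterate with an additional global phase $e^{i\Phi}$ acting on a single eigenstate of $\hat H$ simplifies
to
$$e^{i\Phi}\hat W = e^{i\Phi}\begin{pmatrix}\lambda & -\sqrt{1-\lambda^2}\\ \sqrt{1-\lambda^2} & \lambda\end{pmatrix} = e^{i\Phi}e^{-i\hat Y\cos^{-1}(\lambda)}, \qquad (5.38)$$
in the basis $|G_\lambda\rangle = |G\rangle|\lambda\rangle$, $|G_\lambda^\perp\rangle$. Thus $\hat W$ has eigenstates $|G_{\lambda\pm}\rangle = \frac{1}{\sqrt{2}}(|G_\lambda\rangle \pm i|G_\lambda^\perp\rangle)$ with eigenvalues
$\hat W|G_{\lambda\pm}\rangle = e^{\mp i\cos^{-1}(\lambda)}|G_{\lambda\pm}\rangle$. As the input state in the Hamiltonian simulation problem is
$|G_\lambda\rangle = \frac{1}{\sqrt{2}}(|G_{\lambda+}\rangle + |G_{\lambda-}\rangle)$, the application of a sequence of phased qubiterates implements phase
evolution on these states with opposite signs.
$$e^{i\Phi}\hat W|G_\lambda\rangle = \frac{1}{\sqrt{2}}\left(e^{i\Phi - i\cos^{-1}(\lambda)}|G_{\lambda+}\rangle + e^{i\Phi + i\cos^{-1}(\lambda)}|G_{\lambda-}\rangle\right). \qquad (5.39)$$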
This can be contrasted with the requirements of Hamiltonian simulation where a phase $e^{-i\lambda t}$
must be applied on both states. Hamiltonian simulation is accomplished by linearizing the
phase implemented by the qubiterate with some function
$$h(\Phi \mp \cos^{-1}(\lambda)) = -\lambda t. \qquad (5.40)$$
This is accomplished with the choice
$$\Phi = -\pi/2, \qquad h(\theta) = \sin(\theta)t. \qquad (5.41)$$
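The linearization can be checked numerically. The sketch below assumes the qubiterate acts on each eigenstate as the rotation $e^{-i\hat Y\cos^{-1}(\lambda)}$, so that its eigenphases are $\mp\cos^{-1}(\lambda)$, and verifies that the choice $\Phi = -\pi/2$, $h(\theta) = \sin(\theta)t$ maps both eigenphases to $-\lambda t$. The helper names are hypothetical and chosen only for this illustration.

```python
import cmath
import math


def apply_qubiterate(lam, vec):
    """Apply the 2x2 rotation [[lam, -s], [s, lam]], s = sqrt(1 - lam^2), to a 2-vector."""
    s = math.sqrt(1.0 - lam * lam)
    a, b = vec
    return (lam * a - s * b, s * a + lam * b)


def h(theta, t):
    """The linearizing function h(theta) = sin(theta) * t."""
    return t * math.sin(theta)


def linearized_phases(lam, t, big_phi=-math.pi / 2):
    """Return h evaluated at Phi - arccos(lam) and at Phi + arccos(lam)."""
    theta = math.acos(lam)
    return h(big_phi - theta, t), h(big_phi + theta, t)


if __name__ == "__main__":
    lam, t = 0.3, 2.0
    print(linearized_phases(lam, t), "target:", -lam * t)
    # (1, i) is, up to normalization, an eigenvector with eigenvalue exp(-i arccos(lam)).
    print(apply_qubiterate(lam, (1.0, 1j)), cmath.exp(-1j * math.acos(lam)))
```

Both branches agree because $\sin(-\pi/2 \mp \theta) = -\cos(\theta)$, which is why a single function $h$ suffices for the two eigenstates.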
This function may be approximated using Thm. 5.12, which also underlies sparse Hamiltonian simulation in Chapter 4. Its application requires a good Fourier approximation to
the function $e^{ih(\theta)}$, which is provided by truncating the Jacobi-Anger expansion $e^{i\cos(z)t} = \sum_{k=-\infty}^{\infty} i^k J_k(t)e^{ikz}$ [1], where $J_k(t)$ are Bessel functions of the first kind.
$$\cos(\sin(\theta_\lambda)t) = J_0(t) + 2\sum_{k\ \mathrm{even}>0} J_k(t)\cos(k\theta_\lambda), \qquad \sin(\sin(\theta_\lambda)t) = 2\sum_{k\ \mathrm{odd}>0} J_k(t)\sin(k\theta_\lambda). \qquad (5.42)$$
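As a sanity check on the truncated expansion, the following sketch sums the even and odd Bessel series up to a cutoff and compares the result with $e^{i\sin(\theta)t}$ on a grid of angles. It is only an illustration: the Bessel functions are evaluated from their power series with a fixed number of terms, which is adequate for moderate $t$ but is an assumption of this example rather than a method used in the thesis.

```python
import math


def bessel_j(k, t, terms=40):
    """Bessel function of the first kind J_k(t), integer k >= 0, from its power series."""
    half = t / 2.0
    return sum(
        (-1) ** m * half ** (2 * m + k) / (math.factorial(m) * math.factorial(m + k))
        for m in range(terms)
    )


def truncated_jacobi_anger(theta, t, k_max):
    """Return (A, C), the series for cos(t sin theta) and sin(t sin theta) keeping k <= k_max."""
    a = bessel_j(0, t)
    c = 0.0
    for k in range(1, k_max + 1):
        if k % 2 == 0:
            a += 2.0 * bessel_j(k, t) * math.cos(k * theta)
        else:
            c += 2.0 * bessel_j(k, t) * math.sin(k * theta)
    return a, c


def max_truncation_error(t, k_max, samples=400):
    """Largest |A + iC - exp(i t sin theta)| over a uniform grid of angles."""
    worst = 0.0
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        a, c = truncated_jacobi_anger(theta, t, k_max)
        exact = complex(math.cos(t * math.sin(theta)), math.sin(t * math.sin(theta)))
        worst = max(worst, abs(complex(a, c) - exact))
    return worst


if __name__ == "__main__":
    for k_max in (4, 10, 20):
        print(k_max, max_truncation_error(5.0, k_max))
```

The error stays large while the cutoff is below roughly $t$ and then collapses super-exponentially, which is the behaviour the query-complexity bound relies on.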
As done in [10, 86], this allows us to identify the truncated Fourier series of Eq. 5.42 as
the target functions $A(\theta) \approx \cos(\sin(\theta)t)$, $C(\theta) \approx \sin(\sin(\theta)t)$ in Thm. 4.3. The error
$\epsilon = \max_\theta |A(\theta) + iC(\theta) - e^{i\sin(\theta)t}|$ from truncating this expansion for $k > N/2$ is a sum of
$|J_k(t)|$ that was bounded in [10]:
$$\epsilon \le \sum_{k=q}^{\infty} 2|J_k(t)| \le \frac{4t^q}{2^q q!} = O\left(\left(\frac{et}{2q}\right)^q\right), \qquad q = \frac{N}{2} + 1. \qquad (5.43)$$
The rapid convergence by truncation arises as $e^{it\sin(\theta)}$ is an entire analytic function [15].
Thus Thm. 5.12 allows us to implement the time evolution operator with trace distance
$\|\langle +|_c\langle 0|_{ab}\hat P_{\vec\phi}|0\rangle_{ab}|+\rangle_c - e^{-i\hat H t}\| = O(\epsilon)$ and failure probability $O(\epsilon)$. Solving for $N$ then
furnishes the number of queries to $\hat W$ required to simulate $e^{-i\hat H t}$:
$$N = O\left(t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right), \qquad (5.44)$$
and the $\alpha$ normalization constant may be restored by rescaling $\hat H$ again. This achieves the
upper bound in Thm. 4.4. To prove that it is optimal, we show that sparse Hamiltonian
simulation is a special case.
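The additive scaling of $N$ can be seen by solving the truncation bound numerically. The sketch below assumes the tail bound $4t^q/(2^q q!)$ with the first dropped order $q = N/2 + 1$ (terms with $k > N/2$ are discarded), and searches for the smallest $q$ meeting a target error. The function names are hypothetical, and the comparison is done in log-space to avoid overflow.

```python
import math


def log_tail_bound(t, q):
    """Natural log of the tail bound 4 t^q / (2^q q!) on the truncation error."""
    return math.log(4.0) + q * math.log(t / 2.0) - math.lgamma(q + 1)


def queries_needed(t, eps):
    """Smallest even N such that the bound with q = N/2 + 1 is at most eps (t > 0)."""
    q = 1
    while log_tail_bound(t, q) > math.log(eps):
        q += 1
    return 2 * (q - 1)


if __name__ == "__main__":
    for t in (10.0, 100.0):
        for eps in (1e-3, 1e-6, 1e-12):
            print(f"t={t:6.1f} eps={eps:.0e} N={queries_needed(t, eps)}")
```

Tightening $\epsilon$ by many orders of magnitude adds only a modest number of queries on top of the term linear in $t$, in line with Eq. 5.44.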
Corollary 5.19 (Hamiltonian Simulation of a Sparse Hermitian Matrix). Given access to
the oracles in Section 5.2.3 that describe a d-sparse Hamiltonian $\hat H$ with max-norm $\|\hat H\|_{\max}$,
time evolution by $\hat H$ can be simulated for time $t$ and error $\epsilon$ with $O\left(dt\|\hat H\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$
queries.
Proof. This follows from the standard-form encoding Eq. 5.8.
The case where $\hat H$ decomposes into a linear combination of unitaries is an immediate
application:
Corollary 5.20 (Hamiltonian Simulation of a Linear Combination of Unitaries). Given
access to the oracles in Section 5.2.3 that describe a Hamiltonian $\hat H = \sum_{j=1}^{d} \alpha_j \hat U_j$ defined
by a linear combination of unitaries, time evolution by $\hat H$ can be simulated for
time $t$ and error $\epsilon$ with $O\left(\alpha t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$
queries to $\hat G$, $\hat U$.
Proof. This follows from the standard-form encoding Eq. 5.3.
The intuitiveness of Thm. 4.4 allows us to swiftly devise new models of Hamiltonian
simulation.
Corollary 5.21 (Hamiltonian Simulation of a Purified Density Matrix). Given access to
the oracles in Section 5.2.3 that describe a Hamiltonian $\hat H$ that is a density matrix, time
evolution by $\hat\rho$ can be simulated for time $t$ and error $\epsilon$ with $O\left(t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$
queries to $\hat G$.
Proof. This follows from the standard-form encoding Eq. 5.6.
5.6 Conclusion
The standard-form encoding for matrix inputs to quantum computation is very flexible
and generalizes common input models, such as d-sparse oracles or a linear-combination-of-unitaries. As one is always free to impose a preferred basis on this standard form, it
illuminates an intuitive and straightforward path to other as-yet undiscovered input models
of interest. For instance, our definition of the purified density matrix input model for
the problem of Hamiltonian simulation led to a quadratic improvement in time and an
exponential improvement in error over the sample-based model - the proof of which consisted
of just a few lines.
The greater value of operator design through quantum signal processing and qubitization
lies in providing a unified approach to understanding a variety of quantum algorithms, and
what fundamentally determines their performance. In particular, important problems that
essentially rely on computing matrix functions, such as Hamiltonian simulation, quantum
linear systems, and Gibbs state preparation, have previously required a case-by-case analysis
for various input models, each representing a major breakthrough. These
are now shown to be special cases of qubitization combined with quantum signal processing,
wherein finding an algorithm that succeeds on the standard form automatically implies
algorithms with equal performance for all other input models.
Through qubitization and quantum signal processing, we characterize the set of matrix
functions that can be implemented on the standard-form, and find that completely arbitrary
complex matrix polynomials can be implemented exactly using at most $O(1)$ ancilla qubits.
As these functions are also implemented with query complexity exactly that of optimal
polynomial approximation [93], these standard form algorithms can represent significant
improvements, as illustrated through our applications to Hamiltonian simulation.
Chapter 6
Uniform spectral amplification
6.1 Introduction
Quantum algorithms for matrix operations are among the most exciting applications of quantum computers. In the best cases, they promise exponential speedups over classical
approaches for problems such as matrix inversion [55] and Hamiltonian simulation, which
is matrix exponentiation. Intuitively, any arbitrary unitary matrix applied to a q-qubit
quantum state is 'exponentially fast' due to a state space of dimension $n = 2^q$. However, if
these matrix elements are presented as a classical list of $O(n^2)$ numbers, simply encoding
the data into a quantum circuit already takes exponential time. Thus the extent of this
speedup is sensitive to both the properties of the Hamiltonian and the input model defining
how that information is made accessible to a quantum computer.
Broad classes of Hamiltonians H, structured so as to enable this exponential speedup,
are well-known. The most-studied examples include local Hamiltonians [81] built from a
sum of terms each acting on a constant number of qubits, and its generalization as d-sparse
matrices [3] with at most d non-zero entries in every row, whose values and positions must
all be efficiently computable. More recent innovations consider matrices that are a linear
combination of unitaries [28, 14, 99] or density matrices [82, 70]. Though different classes
define different input models, that is unitary quantum oracles that encode H, it is still helpful
to quantify the cost of various quantum matrix algorithms through the query complexity,
which in turn depends on various structural descriptors of H, such as, but not limited to,
its spectral norm $\|\hat H\|$, induced 1-norm $\|\hat H\|_1$, max-norm $\|\hat H\|_{\max}$, rank, or sparsity.
A challenging open problem is how knowledge of any structure may be maximally exploited to accelerate quantum algorithms. As the time-evolution operator $e^{-i\hat H t}$ underlies
numerous such quantum algorithms, one common benchmark is the Hamiltonian simulation
problem of converting this description of $\hat H$ into a quantum circuit that approximates $e^{-i\hat H t}$
for time $t$ with some error $\epsilon$. In Chapter 4, we provided an algorithm with optimal query
complexity $O\left(td\|\hat H\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ [86] in all parameters for sparse matrices [23, 13, 10],
based on quantum signal processing techniques. Though this settles the worst-case situation where only $d$ and the max-norm $\|\hat H\|_{\max}$ are known in advance, there exist algorithms that exploit additional knowledge of the spectral norm $\|\hat H\|$ and induced one-norm
$\|\hat H\|_1$ to achieve simulation with $O\left(t^{3/2}\left(d\|\hat H\|_{\max}\|\hat H\|_1\|\hat H\|/\epsilon\right)^{1/2}\right)$ [12] queries. Though this
square-root scaling in sparsity alone is optimal, it is currently unknown whether the significant penalty in the time and error scaling is unavoidable. Motivated by the inequalities
$\|\hat H\|_{\max} \le \|\hat H\| \le \|\hat H\|_1 \le d\|\hat H\|_{\max}$ [25], one could hope for a best-case algorithm in
Claim 6.1 that interpolates between these possibilities.
Claim 6.1 (Sparse Hamiltonian simulation). Given the standard quantum oracles that return values of d-sparse matrix elements of the Hamiltonian $\hat H$, there exists a quantum circuit
that approximates time-evolution $e^{-i\hat H t}$
with error $\epsilon$ using $Q = O\left(t(d\|\hat H\|_{\max}\|\hat H\|_1)^{1/2} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$
queries and $O(Q\log(n))$ single and two-qubit quantum gates.
The challenge is exacerbated by how unitary time-evolution, though a natural consequence of Schrödinger's equation in continuous-time, is not natural to the gate model of
discrete-time quantum computation. In some cases, such as quantum matrix inversion [72],
algorithms that are more efficient as well as considerably simpler in both execution and
concept can be obtained by creatively bypassing Hamiltonian simulation as an intermediate step. The need to disentangle the problem of exploiting structure from that of finding
the best simulation algorithms is highlighted by celebrated Hamiltonian simulation techniques
such as Lie-product formulas [81], quantum walks [23], and truncated-Taylor series [14], each
radically different and specialized to some class of structured matrices.
A unifying approach to exploiting the structure of Hamiltonians, independent of any
specific quantum algorithm, is hinted at in Chapter 5 through Hamiltonian simulation by
qubitization. There, we focus on a standard-form encoding of matrices (Def. 6.2), which,
in addition to generalizing a number of prior input models, also appears more natural. On
measurement outcome $|0\rangle_a$ with best-case success probability $(\|\hat H\|/\alpha)^2 \le 1$, a Hermitian
measurement operator $\hat H/\alpha$ is applied on the system - thus the standard-form is no more
or less than the fundamental steps of generalized measurement [98]. Treating this quantum
circuit as a unitary oracle amounts to possessing no structural information whatsoever
about $\hat H$. In this situation, we provided an optimal simulation algorithm (Thm. 6.3), notably
with only $O(1)$ ancilla overhead.
Definition 6.2 (Standard-form matrix encoding). A matrix $\hat H \in \mathbb{C}^{n\times n}$ acting on the system register $s$ is encoded in standard-form-$(\hat H, \alpha, \hat U, d)$ with normalization $\alpha \ge \|\hat H\|$
by the
computational basis state $|0\rangle_a \in \mathbb{C}^d$ on the ancilla register $a$ and signal unitary $\hat U \in \mathbb{C}^{dn\times dn}$
if $(\langle 0|_a \otimes \hat I_s)\hat U(|0\rangle_a \otimes \hat I_s) = \hat H/\alpha$.¹
If $\hat H$ is also Hermitian, this is called a Hermitian
standard-form encoding.
Theorem 6.3 (Hamiltonian simulation by qubitization, restated from Thm 5.18). Given
Hermitian standard-form-$(\hat H, \alpha, \hat U, d)$, there exists a standard-form-$(\hat X, 1, \hat V, 4d)$ such that
$\|\hat X - e^{-i\hat H t}\| \le \epsilon$, where $\hat V$ requires $Q = O\left(t\alpha + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$
queries to controlled-$\hat U$ and
$O(Q\log(d))$ primitive gates².
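A small worked instance may help fix the meaning of the standard-form encoding in Def. 6.2. The sketch below builds a 4x4 unitary whose top-left 2x2 block equals $\hat H/\alpha$ for a traceless single-qubit Hermitian $\hat H$, using a one-qubit ancilla ($d = 2$). The construction $\hat U = \begin{pmatrix} G & sI \\ sI & -G\end{pmatrix}$ with $G = \hat H/\alpha$ is an illustrative choice that works because $G^2$ is proportional to the identity in this special case; it is not the thesis's general construction, and the helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(a):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))] for i in range(len(a[0]))]


def encode_single_qubit(h, alpha):
    """Return a 4x4 unitary U with top-left block h/alpha.

    Assumes h is a traceless Hermitian 2x2 matrix, so h^2 = c^2 * I, and alpha >= c.
    The ancilla qubit is the most significant index of the 4-dimensional space.
    """
    c_squared = (h[0][0] * h[0][0] + h[0][1] * h[1][0]).real
    s = math.sqrt(1.0 - c_squared / alpha ** 2)
    g = [[h[i][j] / alpha for j in range(2)] for i in range(2)]
    return [
        [g[0][0], g[0][1], s, 0.0],
        [g[1][0], g[1][1], 0.0, s],
        [s, 0.0, -g[0][0], -g[0][1]],
        [0.0, s, -g[1][0], -g[1][1]],
    ]


if __name__ == "__main__":
    hamiltonian = [[0.3, 0.4], [0.4, -0.3]]  # eigenvalues +/- 0.5
    unitary = encode_single_qubit(hamiltonian, alpha=2.0)
    for row in unitary:
        print(row)
```

Projecting the ancilla onto $|0\rangle_a$ before and after $\hat U$ picks out the top-left block, which is exactly the condition $(\langle 0|_a \otimes \hat I_s)\hat U(|0\rangle_a \otimes \hat I_s) = \hat H/\alpha$.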
This motivates the standard-form encoding as the appropriate endpoint when structural
information about $\hat H$ is provided, though it does not exclude the possibility of superior
simulation algorithms not based on the standard-form. As Thm. 6.3 is the optimal simulation
algorithm, any exploitation of structure should manifest in minimizing the normalization $\alpha$
of a Hamiltonian encoded in Def. 6.2. In order to avoid accumulating polynomial factors of
errors, this must only be with an exponentially small distortion to its spectrum. Moreover,
the cost of the procedure should allow for a favorable trade-off in the query complexity
of Hamiltonian simulation. Thus manipulation of the standard-form and any additional
structural information to this end is what we call the uniform spectral amplification problem.
¹The unitary $\hat G$ defined in Chapter 5 such that $((\langle 0|\hat G^\dagger) \otimes \hat I)\hat U((\hat G|0\rangle) \otimes \hat I) = \hat H/\alpha$, which encodes $\hat H$
with normalization $\alpha$, may be absorbed into a redefinition of $\hat U$. Moreover, for any $\beta > 0$, this is identical
to encoding $\hat H\beta$ with normalization $\alpha\beta$.
²As error $\epsilon$ occurs only in logarithms, it may refer to the trace distance, failure probability, or any other
polynomially-related distance without affecting the complexity scaling.
Problem 2 (Uniform spectral amplification). Given Hermitian standard-form-$(\hat H, \alpha, \hat U, d)$,
and an upper bound $\Lambda \in [\|\hat H\|, \alpha]$ on the spectral norm, exploit any additional information
about $\hat H$ or the signal unitary $\hat U$ to construct a Q-query quantum circuit that encodes $\hat H_{\mathrm{amp}}$
in standard-form with normalization $\Lambda$, such that $\|\hat H_{\mathrm{amp}} - \hat H\| \le \epsilon$, and $Q = o(\alpha/\Lambda)\,O(\mathrm{polylog}(1/\epsilon))$.
Uniform spectral amplification is non-trivial as it precludes a number of standard techniques. First, amplitude amplification is precluded as the success probability must be
boosted for all input states to the system. Second, oblivious amplitude amplification [13, 14]
is also precluded as $\hat H$ is not in general unitary, or even close to unitary. Third, spectral
gap amplification [118] is precluded as it distorts the spectrum. As such, solving this problem would be of broad interest beyond Hamiltonian simulation. For instance, spectral gap
amplification is fundamental to adiabatic state preparation and understanding properties
of condensed matter systems. Moreover, the prevalence of generalized measurements means
that this could also be applicable to quantum observable estimation in metrology and repeat-until-success gate synthesis [103].
Some forms of spectral gap amplification have an underlying structure that resembles
the amplitude amplification algorithm for quantum state preparation. This suggests that at
least one possible solution to uniform spectral amplification could be obtained by solving a
related non-trivial amplitude multiplication problem, and vice-versa.
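Problem 3 (Amplitude multiplication). Given a quantum state preparation oracle $\hat G|0\rangle_a|0\rangle_b = \lambda|t\rangle_a|0\rangle_b + \sqrt{1-\lambda^2}\,|t^\perp\rangle_{ab}$, and an upper bound $\Gamma \in [\lambda, 1]$ on the target state overlap, construct
a Q-query quantum circuit $\hat G'$ that prepares $\hat G'|0\rangle_a|0\rangle_b = \lambda_{\mathrm{amp}}|t\rangle_a|0\rangle_b + \cdots|t^\perp\rangle_{ab}$ such that
$|\lambda_{\mathrm{amp}} - \lambda/\Gamma| \le \epsilon$, and $Q = O(\Gamma^{-1}\log(1/\epsilon))$.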
Amplitude multiplication is particularly interesting as amplitude amplification and its
many other variations [136] amplify target states with the same optimal scaling $O(\lambda^{-1})$,
but with a highly non-linear dependence on the initial overlap. In contrast, Problem 3
performs arithmetic multiplication on the amplitudes with exponentially small error, notably
independent of, and without any prior knowledge of their values.
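The contrast between the two behaviours can be illustrated numerically. The sketch below uses the textbook expression $\sin((2k+1)\arcsin\lambda)$ for the overlap after $k$ rounds of standard amplitude amplification, and compares it with the linear target $\lambda/\Gamma$ of Problem 3. The function names are hypothetical, and the example only evaluates the target map; it does not implement the amplitude multiplication circuit.

```python
import math


def amplified_amplitude(lam, rounds):
    """Overlap after `rounds` rounds of standard amplitude amplification."""
    return math.sin((2 * rounds + 1) * math.asin(lam))


def multiplied_amplitude(lam, gamma):
    """Target of amplitude multiplication: lam / gamma, assuming |lam| <= gamma."""
    return lam / gamma


if __name__ == "__main__":
    for lam in (0.05, 0.1, 0.2):
        gain_amplification = amplified_amplitude(lam, rounds=3) / lam
        gain_multiplication = multiplied_amplitude(lam, gamma=0.25) / lam
        print(f"lam={lam}: amplification gain={gain_amplification:.3f}, "
              f"multiplication gain={gain_multiplication:.3f}")
```

The gain of amplitude amplification depends on the unknown initial overlap, whereas the multiplication target rescales every overlap by the same factor $1/\Gamma$.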
6.1.1 Our Results
We present quantum algorithms for Hamiltonian simulation based on the general principle
of finding solutions to the uniform spectral amplification Problem 2, which may be broadly
categorized as follows. In 'uniform spectral amplification by quantum signal processing', we
make no assumptions on the form of the signal unitary in the standard-form encoding of $\hat H$,
and thus treat it as a single unitary oracle. In 'uniform spectral amplification by amplitude
multiplication', we assume that the signal unitary has the structure of factoring into two or
three unitary oracles, and by solving amplitude multiplication in Problem 3, also approach
the sparse simulation results of Claim 6.1. We then provide a unifying perspective in
'universality of the standard-form' which further motivates the standard-form encoding of
Hamiltonians as a fundamental ingredient in quantum computation. In greater detail, these
results are as follows.
Uniform Spectral Amplification by Quantum Signal Processing
If we make no assumptions on the form of the signal unitary $\hat U$ that realizes the standard-form encoding, we treat $\hat U$ as a black-box oracle, which we call the standard-form oracle.
In this situation, the first result is uniform spectral amplification in Thm. 6.4 that reduces
the normalization $\alpha$ of encoded Hamiltonians to $O(\Lambda)$ using $O(\alpha\Lambda^{-1}\log(1/\epsilon))$ queries. This
produces a quadratic improvement in success probability when the standard-form is applied
to perform quantum measurement, but serves no advantage to Hamiltonian simulation.
Theorem 6.4 (Uniform spectral amplification by spectral multiplication). Given Hermitian standard-form-$(\hat H, \alpha, \hat U, d)$, let $\Lambda \in [\|\hat H\|, \alpha]$. Then for any $\epsilon \le O(\Lambda/\alpha)$, there
exists a standard-form-$(\hat H_{\mathrm{amp}}, 2\Lambda, \hat V, 4d)$ such that $\|\hat H_{\mathrm{amp}} - \hat H\| \le \epsilon$, and $\hat V$ requires
$O(\alpha\Lambda^{-1}\log(1/\epsilon))$ queries to controlled-$\hat U$.
The second result is uniform spectral amplification of only the low-energy subspace in
Thm. 6.5, of $\hat H$ with eigenvalues $\in [-\alpha, -\alpha(1-\Delta)]$, which is of interest to quantum chemistry
and adiabatic computation. There, the effective normalization is reduced to $O(1)$ using
$O(\Delta^{-1/2}\log^{3/2}(1/\epsilon))$
queries. This is a generalization of spectral gap amplification [118]
with the distinction of preserving the relative energy spacing of all relevant states, and
of applying to any Hamiltonian encoded in standard-form. When applied to Hamiltonian
simulation, an acceleration to $O(t\alpha\sqrt{\Delta}\log^{3/2}(t\alpha/\epsilon))$ queries is obtained in Cor. 6.13.
Theorem 6.5 (Uniform spectral amplification of low-energy subspaces). Given Hermitian
standard-form-(H,a,U,d) with eigenstates H/aIA) = AlA), let A E (0,1) be a positive
constant, and H = eq_ _
IA)(A| be a projector onto the low-energy subspace of H.
Then there exists a standard-form-(Hamp,Aa, V, 4d) such that |Ift i
1 2
31 2 (E))
e, and V requires
/
1(A- queries to controlled-U.
-
) )bl|
<
These results stem primarily from constructing polynomials with desirable properties,
which we implement using the technique of flexible quantum signal processing in Thm. 6.6.
The advantage of quantum signal processing over the related technique of linear-combination-of-unitaries [10] is its avoidance of Hamiltonian simulation as an intermediate step. This
reduces overhead in space, query complexity, and error, and leads to an extremely simple
algorithm that directly implements polynomial functions of $\hat H$ without any approximation.
Theorem 6.6 (Flexible quantum signal processing, restated from Thm. 5.9). Given a Hermitian standard-form-$(\hat H, 1, \hat U, d)$, let $B$ be any function that satisfies all the following conditions:
(1) $B(x) = \sum_{j=0}^{N} b_j x^j$ is a real parity-($N$ mod 2) polynomial of degree at most $N$;
(2) $B(0) = 0$;
(3) $\forall x \in [-1, 1]$, $B^2(x) \le 1$.
Then there exists a Hermitian standard-form-$(B[\hat H], 1, \hat V, 4d)$, where $B[\hat H] = \sum_{j=0}^{N} b_j \hat H^j$,
and $\hat V$ requires $O(N)$ queries to controlled-$\hat U$ and $O(N\log(d))$ primitive quantum gates precomputed in classical $O(\mathrm{poly}(N))$ time.
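Before handing a candidate polynomial to a theorem of this kind, it is convenient to screen it numerically. The following sketch checks two of the stated requirements for a coefficient list $b_0, \dots, b_N$: definite parity $N$ mod 2, and $B^2(x) \le 1$ on a grid over $[-1, 1]$. The helper names and the grid-based test are assumptions of this example; a grid check is a screen, not a proof of boundedness.

```python
def evaluate(coeffs, x):
    """Evaluate B(x) = sum_j coeffs[j] * x**j with Horner's rule."""
    result = 0.0
    for b in reversed(coeffs):
        result = result * x + b
    return result


def has_parity(coeffs):
    """True if every coefficient whose index has parity opposite to N = len(coeffs) - 1 is zero."""
    n = len(coeffs) - 1
    return all(b == 0 for j, b in enumerate(coeffs) if j % 2 != n % 2)


def bounded_on_grid(coeffs, samples=2001, tol=1e-12):
    """True if B(x)^2 <= 1 (up to tol) at `samples` equally spaced points of [-1, 1]."""
    return all(
        evaluate(coeffs, -1.0 + 2.0 * i / (samples - 1)) ** 2 <= 1.0 + tol
        for i in range(samples)
    )


if __name__ == "__main__":
    chebyshev_t3 = [0, -3, 0, 4]  # T_3(x) = 4x^3 - 3x, odd and bounded by 1
    print(has_parity(chebyshev_t3), bounded_on_grid(chebyshev_t3))
    print(has_parity([0, 2]), bounded_on_grid([0, 2]))  # 2x has odd parity but exceeds 1
```

Chebyshev polynomials are a natural test case since they have definite parity and saturate the bound at several points of the interval.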
Uniform Spectral Amplification by Amplitude Multiplication
Alternatively, we here assume that the signal unitary $\hat U$ that realizes the standard-form
encoding factors into two or three unitary quantum oracles $\hat U_{\mathrm{row}}$, $\hat U_{\mathrm{col}}$, and $\hat U_{\mathrm{mix}}$, which we
also call standard-form oracles. When the signal unitary factors into two components $\hat U = \hat U_{\mathrm{row}}^\dagger \hat U_{\mathrm{col}}$,
this constrains the representation of matrices in the standard-form to have matrix
elements of $\hat H$ that are exactly the overlap of appropriately defined quantum states, and
generalizes the sparse matrix model first introduced by Childs [23] for quantum walks. When
the signal unitary factors into three components $\hat U = \hat U_{\mathrm{row}}^\dagger \hat U_{\mathrm{mix}} \hat U_{\mathrm{col}}$, amplitude amplification
can be applied to obtain non-trivial Hamiltonians.
Note that amplitude amplification had been previously considered in the context of
sparse Hamiltonian simulation [12]. However, its non-linearity introduced a polynomial
dependence on error, which compounded into a polynomial overhead in scaling with respect
to time and error. In contrast, our solution to the amplitude multiplication problem achieves
uniform spectral amplification by multiplying all state overlaps by the same constant factor.
Specializing the general result Lem. 6.15 to the case of sparse Hamiltonians, which are
described by standard black-box quantum oracles (Def. 6.7) for their non-zero matrix elements
and positions, furnishes a simulation algorithm matching the complexity of Claim 6.1, up to
logarithmic factors. Modulo these logarithmic factors, this is an improvement over prior art,
with either a best-case square-root improvement in sparsity [86], or a polynomial improvement
in time and exponential improvement in precision [12].
Definition 6.7 (Sparse matrix oracles [12]). Sparse matrices with at most $d$ non-zero elements in every row are specified by two oracles. The oracle $\hat O_H|j\rangle|k\rangle|z\rangle = |j\rangle|k\rangle|z \oplus \hat H_{jk}\rangle$
queried by $j \in [n]$ row and $k \in [n]$ column indices returns the value $\hat H_{jk} = \langle j|\hat H|k\rangle$, with
maximum absolute value $\|\hat H\|_{\max} = \max_{jk}|\hat H_{jk}|$. The oracle $\hat O_F|j\rangle|l\rangle = |j\rangle|f(j,l)\rangle$ queried
by $j \in [n]$ row and $l \in [d]$ column indices computes in-place the column index $f(j, l)$ of the
$l$th non-zero entry of the $j$th row.
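The information content of the two oracles can be mimicked classically. The sketch below stores a sparse Hermitian matrix as a dictionary of rows and exposes two functions playing the roles of the value oracle and the position oracle; it also computes the sparsity, max-norm, and induced 1-norm (maximum absolute row sum). The data layout, the 0-based indexing of $l$, and all names are assumptions made for illustration; the quantum oracles act reversibly on superpositions, which this classical emulation does not capture.

```python
def make_oracles(rows):
    """rows maps a row index j to a dict {column index k: non-zero value H_jk}."""

    def oracle_h(j, k):
        """Value oracle: return H_jk, or 0 if the entry is not stored."""
        return rows.get(j, {}).get(k, 0.0)

    def oracle_f(j, l):
        """Position oracle: column index of the l-th (0-based) non-zero entry of row j."""
        return sorted(rows[j])[l]

    return oracle_h, oracle_f


def norms(rows):
    """Return (sparsity d, max-norm, induced 1-norm) of the stored matrix."""
    d = max(len(r) for r in rows.values())
    max_norm = max(abs(v) for r in rows.values() for v in r.values())
    one_norm = max(sum(abs(v) for v in r.values()) for r in rows.values())
    return d, max_norm, one_norm


if __name__ == "__main__":
    example = {
        0: {1: 0.5, 2: -0.25},
        1: {0: 0.5},
        2: {0: -0.25, 3: 1.0},
        3: {2: 1.0},
    }
    oracle_h, oracle_f = make_oracles(example)
    print(oracle_h(2, 3), oracle_f(0, 1), norms(example))
```

The example matrix has max-norm 1, induced 1-norm 1.25, and sparsity 2, so the 1-norm sits strictly between the max-norm and $d$ times the max-norm.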
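Theorem 6.8 (Sparse Hamiltonian simulation by amplified state overlap). Given the d-sparse matrix oracles in Def. 6.7 for the Hamiltonian $\hat H$, let $\|\hat H\|_{\max} = \max_{jk}|\hat H_{jk}|$ be the
max-norm, $\|\hat H\|_1 = \max_j \sum_k |\hat H_{jk}|$ be the induced 1-norm, and $\|\hat H\|$ be the spectral norm. Then
$\forall t > 0$, $\epsilon > 0$, the operator $e^{-i\hat H t}$ can be approximated with error $\epsilon$ using
$$O\left(t(d\|\hat H\|_{\max}\|\hat H\|_1)^{1/2}\log\left(\frac{t\|\hat H\|}{\epsilon}\right)\left(1 + \frac{1}{t\|\hat H\|_1}\frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)\right) \qquad (6.1)$$
queries.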
Observe that in the asymptotic limit of large $t\|\hat H\|_1 \gg \log(1/\epsilon)$, the query complexity
simplifies to $O(t(d\|\hat H\|_{\max}\|\hat H\|_1)^{1/2}\log(t\|\hat H\|/\epsilon))$.
The algorithm of Thm. 6.8 is particularly
flexible. If none of the above norms are known, they may be replaced by any upper bound,
such as determined by the inequalities $\|\hat H\|_{\max} \le \|\hat H\| \le \|\hat H\|_1 \le d\|\hat H\|_{\max}$ [25]. Even in
the worst case, the results are similar to previous optimal simulation algorithms. Moreover,
the scaling in these parameters is optimal as we prove a matching lower bound in Thm. 6.9 by
finding a Hamiltonian that solves PARITY o OR.
Theorem 6.9. For any $d \ge 1$, $s \ge 1$, and $t > 0$, there exists a Hamiltonian $\hat H$ with sparsity
$\Theta(d)$, $\|\hat H\|_{\max} = \Theta(1)$, and $\|\hat H\|_1 = \Theta(s)$, such that approximating time evolution $e^{-i\hat H t}$
with constant error requires $\Omega(t\sqrt{ds})$ queries.
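To see when knowledge of the induced 1-norm pays off, the sketch below compares the worst-case scaling $td\|\hat H\|_{\max} + \log(1/\epsilon)/\log\log(1/\epsilon)$ with the large-time scaling $t(d\|\hat H\|_{\max}\|\hat H\|_1)^{1/2}\log(t\|\hat H\|/\epsilon)$. All hidden constants are set to 1 and the simplified large-time form is assumed, so the numbers indicate trends only; the function names are hypothetical.

```python
import math


def worst_case_queries(t, d, max_norm, eps):
    """Scaling t*d*max_norm + log(1/eps)/loglog(1/eps), constant assumed 1."""
    return t * d * max_norm + math.log(1.0 / eps) / math.log(math.log(1.0 / eps))


def amplified_queries(t, d, max_norm, one_norm, spectral_norm, eps):
    """Large-time scaling t*sqrt(d*max_norm*one_norm)*log(t*spectral_norm/eps), constant assumed 1."""
    return t * math.sqrt(d * max_norm * one_norm) * math.log(t * spectral_norm / eps)


if __name__ == "__main__":
    t, d, eps = 100.0, 65536, 1e-3
    print("worst case:           ", worst_case_queries(t, d, 1.0, eps))
    print("small 1-norm (4):     ", amplified_queries(t, d, 1.0, 4.0, 2.0, eps))
    print("saturated 1-norm (d): ", amplified_queries(t, d, 1.0, float(d), float(d), eps))
```

When $\|\hat H\|_1$ is far below $d\|\hat H\|_{\max}$ the square-root form wins by a wide margin; when the inequalities are saturated it matches the worst case up to the logarithmic factor.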
Some of these results stem from constructing polynomials with desirable properties,
which we implement using the technique of flexible amplitude amplification from Chapter 3.
Amplitude multiplication in Thm. 6.10 is then a special case that solves Problem 3 up to a
factor of $\frac{1}{2}$ in the range of the input and output amplitudes.
Theorem 6.10 (Amplitude multiplication algorithm). $\forall \lambda \in [-1/2, 1/2]$, $\Gamma \in (|\lambda|, 1/2]$, $\epsilon \le O(\Gamma)$, let $\hat G$ be a state preparation unitary acting on the computational basis states $|0\rangle_a \in \mathbb{C}^d$,
$|0\rangle_b \in \mathbb{C}^2$ such that $\hat G|0\rangle_a|0\rangle_b = \lambda|t\rangle_a|0\rangle_b + \sqrt{1-\lambda^2}\,|t^\perp\rangle_{ab}$, where $|t^\perp\rangle_{ab}$ has no support on
$|0\rangle_b$. Then there exists a quantum circuit $\hat G'$ such that $\left|\langle t|_a\langle 0|_b\langle 0|_c\hat G'|0\rangle_a|0\rangle_b|0\rangle_c - \frac{\lambda}{2\Gamma}\right| \le \epsilon$
using $Q = O(\Gamma^{-1}\log(1/\epsilon))$ queries to $\hat G$, $\hat G^\dagger$,
$O(Q\log(d))$ primitive quantum gates, and an
additional ancilla qubit $c$.
Universality of the Standard-Form
Uniform spectral amplification is motivated by the idea that structure in the signal unitary
and its encoded Hamiltonian can be fully exploited by focusing only on manipulating the
standard-form, independent of any later application such as Hamiltonian simulation. This
is supported by the simulation algorithm Thm. 6.3 which is optimal with respect to all
parameters when the standard-form is provided as a black-box oracle. This perspective
would be further justified if one could rule out, to a reasonable extent, the existence of
superior simulation algorithms not based on the standard-form.
We show a certain universality of the standard-form by proving an equivalence between
quantum circuits for simulation and those for quantum measurement, up to a logarithmic
overhead in time and a constant overhead in space. Where Thm. 6.3 transforms a measurement of $\hat H$ to time-evolution by $e^{-i\hat H t}$, we prove the converse in Thm. 6.11 which transforms
time-evolution $e^{-i\hat H t}$ back into measurement $\hat H$. In particular, this is with an exponential
improvement in precision over standard techniques based on quantum phase estimation.
Thus any non-standard-form simulation algorithm for $e^{-i\hat H t}$ that exploits structure can
always be mapped in this manner onto the standard-form with a small overhead.
Theorem 6.11 (Standard-form encoding by Hamiltonian simulation). Given oracle access
to the controlled time-evolution $e^{-i\hat H}$ such that $\|\hat H\| \le 1/2$, there exists a standard-form-$(\hat H_{\mathrm{lin}}, 1, \hat U, 4)$ such that $\|\hat H_{\mathrm{lin}} - \hat H\| \le \epsilon$, where $\hat U$ requires $Q = O(\log(1/\epsilon))$ queries and
$O(Q)$ primitive quantum gates.
This is proven through the flexible quantum signal processing Thm. 5.9 using a particular
choice of polynomial. It is important to note however the caveat that our equivalence limits
$\|\hat H t\| = O(1)$, and also fails when time-evolution can be approximated with $o(t)$ queries.
Fortunately, the latter scenario can be disregarded with limited loss as 'no-fast-forwarding'
theorems [25] prove the necessity of $\Omega(t\|\hat H\|)$ queries for generic computational problems
and physical systems.
One useful application of this reverse direction is an alternate technique Cor. 6.12 for simulating time evolution by a sum of $d$ Hermitian components $\sum_{j=1}^{d}\hat H_j$, given their controlled-exponentials $e^{-i\hat H_j t_j}$. This approach is considerably simpler than that of compressed fractional queries [13], and essentially works by using Thm. 6.11 to map each $e^{-i\hat H_j t_j}$, where
$\|\hat H_j t_j\| = O(1)$, to a standard-form encoding of $\hat H_j t_j$.
Corollary 6.12 (Hamiltonian simulation with exponentials).
Given the standard-form-(E _I aje-ifi, a, GaUSa, d), where U that prepares the state |G)a
j 1)(j Ula
aj and signal oracle U =
1, there exists a standard-form-(k, 1, V, 4d) such that |k - eit||
<cE,
_=
1 vaj/alj)a with a3 ;> 0, normalization a =
e-ika, with ||$j4|
V
requires 0 (at log (at/c) +
primitive quantum gates.
where
og log (at/E)
controlled-queries, and 0(Q log (d))
6.1.2 Organization
Our results are structured in the remainder of this manuscript as follows.
Part I is where we achieve uniform spectral amplification by quantum signal processing.
In Sec. 6.2, we treat the signal unitary as a single unitary oracle, and apply flexible
quantum signal processing to prove the solutions Thm. 6.4 and Thm. 6.5 to the uniform
spectral amplification problem.
Part II is where we achieve uniform spectral amplification by amplitude multiplication. We
prove in Sec. 6.3 the amplitude multiplication algorithm of Thm. 6.10. Subsequently
in Sec. 6.4, we consider signal unitaries that factor into two or three unitary oracles.
This motivates a general model of Hamiltonians encoded by state overlaps, where
uniform spectral amplification in Lem. 6.15 is enabled by amplitude multiplication.
Applying these results to the special case of sparse matrices leads to the simulation
algorithm Thm. 6.8, which matches the lower bound Thm. 6.9.
Sec. 6.5 is where we offer a unifying perspective of simulation algorithms and prove a certain
universality of the standard-form. This is through the equivalence between quantum
circuits for simulation and those for measurement described by Thm. 6.11, and leads
to the simulation algorithm Cor. 6.12.
Sec. 6.6 is where we constructively prove the properties of useful polynomials used in various
proofs.
We conclude in Sec. 6.7.
6.1.3 Attributions and contributions
The results in this section are from the preprint of joint work [85] with Isaac L. Chuang. The
manuscript was written by myself with helpful discussions and suggestions from my advisor.
G.H. Low is funded by the NSF RQCC Project No.1111337 and ARO quantum algorithms
project. We thank Aram Harrow and Robin Kothari for suggesting PARITY ∘ OR as a
possible lower bound.
6.2 Uniform Spectral Amplification by Quantum Signal Processing
When provided with no information on any structure in the standard-form encoding $(\langle 0|_a \otimes I_s)\,U\,(|0\rangle_a \otimes I_s) = H/\alpha$ of the Hermitian matrix $H$, all we have is access to the signal oracle $U$. Thus our only option is to apply quantum signal processing and study the polynomial functions $f[\cdot]$ of $H/\alpha$ that achieve uniform spectral amplification. In this setting, Thm. 6.4 performs uniform spectral amplification, though the trade-off between its implementation cost and the achieved reduction of $\alpha$ provides no advantage to Hamiltonian simulation. However, a speedup is possible through Thm. 6.5 when interested only in the lower energy subspace of $H$.
As the normalization $\alpha$ is always greater than or equal to $\|H\|$, any input state $|\psi\rangle$ on the system has support only on eigenstates $(H/\alpha)|\lambda\rangle = \lambda|\lambda\rangle$ with eigenvalues $|\lambda| \le \|H\|/\alpha \le 1$. Given an upper bound $\Lambda \in [\|H\|, \alpha]$ on the spectral norm, this means that in any polynomial function $p(x)$ that we construct, only its restriction to the domain $x \in [-\Lambda/\alpha, \Lambda/\alpha]$ is of interest, so long as $|p(x)|$ remains bounded by 1 over $x \in [-1, 1]$. Thus one approach to minimizing the normalization is to use quantum signal processing to encode a polynomial with the property $p[H/\alpha] = H/(2\Lambda)$ in standard-form. We should therefore find a polynomial that approximates a truncated linear function, such as
$$f_{\text{lin},\Gamma}(x) \begin{cases} = \dfrac{x}{2\Gamma}, & |x| \in [0, \Gamma], \\ \in [-1, 1], & |x| \in (\Gamma, 1]. \end{cases}$$
In Thm. 6.18 of Sec. 6.6.1, we approximate $f_{\text{lin},\Gamma}(x)$ with a polynomial with the following properties: $\forall\, \Gamma \in [0, 1/2]$ and $\epsilon \le O(\Gamma)$, the odd polynomial $p_{\text{lin},\Gamma,n}$ of degree $n = O(\Gamma^{-1} \log(1/\epsilon))$ satisfies
$$\forall x \in [-\Gamma, \Gamma],\quad \left| p_{\text{lin},\Gamma,n}(x) - \frac{x}{2\Gamma} \right| \le \frac{\epsilon |x|}{2\Gamma} \quad\text{and}\quad \max_{x \in [-1,1]} |p_{\text{lin},\Gamma,n}(x)| \le 1. \tag{6.3}$$
This polynomial satisfies the conditions of flexible quantum signal processing in Thm. 5.9, and provides us with the solution Thm. 6.4 to uniform spectral amplification.
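Proof of Thm. 6.4. Given Hermitian standard-form-$(H, \alpha, U, d)$ and an upper bound $\Lambda \in [\|H\|, \alpha]$, define $\Gamma = \Lambda/\alpha \le 1$. Using Thm. 6.6 with the polynomial $p_{\text{lin},\Gamma,n}$, encode $p_{\text{lin},\Gamma,n}[H/\alpha]$ in Hermitian standard-form-$(p_{\text{lin},\Gamma,n}[H/\alpha], 1, V, 4d)$. This requires $O(n)$ queries, and is identical to the Hermitian standard-form-$(2\Lambda\, p_{\text{lin},\Gamma,n}[H/\alpha], 2\Lambda, V, 4d)$. Define $H_{\text{amp}} = 2\Lambda\, p_{\text{lin},\Gamma,n}[H/\alpha]$. Then the error of approximation is
$$\|H_{\text{amp}} - H\| = 2\Lambda \left\| p_{\text{lin},\Gamma,n}[H/\alpha] - \frac{H}{2\Lambda} \right\| \le 2\Lambda \max_{x \in [-\Gamma, \Gamma]} \left| p_{\text{lin},\Gamma,n}(x) - \frac{x}{2\Gamma} \right| \le 2\Lambda \max_{x \in [-\Gamma,\Gamma]} \frac{\epsilon_1 |x|}{2\Gamma} = \Lambda \epsilon_1. \tag{6.4}$$
Finally, note that $p_{\text{lin},\Gamma,n}$ requires $\epsilon_1 \le O(\Gamma)$, and has degree scaling like $n = O(\Gamma^{-1} \log(1/\epsilon_1))$, so let us define $\epsilon = \ldots$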
Unfortunately, this provides no advantage to Hamiltonian simulation, as the decrease in normalization by a factor $\alpha/\Lambda$ is exactly balanced by an increase in query complexity by the same factor $\alpha/\Lambda$. Nevertheless, Thm. 6.4 may be of use to applications involving measurement, such as quantum metrology and repeat-until-success circuits, as the success probability is improved by a quadratic factor $(\alpha/\Lambda)^2$. This is analogous to oblivious amplitude amplification, which only applies to matrices that are approximately unitary [13].
One workable possibility is highlighted by the deep connection between quantum signal processing and the properties of polynomials. Thm. 6.4 uses a degree $O(\Lambda^{-1})$ polynomial with maximum gradient $O(\Lambda^{-1})$. Yet a famous inequality by Markov indicates a best-case quadratic advantage in the gradient $p'$ of any degree $n$ polynomial: $\max_{x \in [-1,1]} |p'(x)| \le n^2 \max_{x \in [-1,1]} |p(x)|$. Thus we have not fully exhausted the capabilities of polynomials. As this inequality becomes an equality for Chebyshev polynomials of the first kind $T_L(x) = \cos(L \cos^{-1}(x))$ at $x = 1$, this suggests that a speedup is possible if we are only concerned with time evolution on eigenstates with eigenvalues $|\lambda| \in [1 - \Delta, 1]$ where $\Delta \ll 1$. With this assumption, we may prove Thm. 6.5.
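The saturation of Markov's inequality by Chebyshev polynomials can be checked numerically. The following is a minimal sketch using NumPy's Chebyshev module; the helper name `markov_ratio` and the grid resolution are illustrative choices, not part of the original text.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def markov_ratio(L, num_points=20001):
    """Return max|T_L'(x)| / (L^2 * max|T_L(x)|) over a grid on [-1, 1].

    Markov's inequality says this ratio is at most 1 for any degree-L
    polynomial; for the Chebyshev polynomial T_L it should equal 1,
    with the maximum gradient attained at the endpoints x = +/-1.
    """
    grid = np.linspace(-1.0, 1.0, num_points)   # includes both endpoints
    coeffs = np.zeros(L + 1)
    coeffs[L] = 1.0                             # Chebyshev series of T_L
    dcoeffs = C.chebder(coeffs)                 # Chebyshev series of T_L'
    max_p = np.max(np.abs(C.chebval(grid, coeffs)))
    max_dp = np.max(np.abs(C.chebval(grid, dcoeffs)))
    return max_dp / (L**2 * max_p)


ratios = {L: markov_ratio(L) for L in (1, 2, 5, 10)}
for L, r in ratios.items():
    print(f"L = {L:2d}: max|T_L'| / (L^2 max|T_L|) = {r:.12f}")
```

The gradient of $T_L$ near $x = \pm 1$ therefore grows like $L^2$, whereas a generic bounded polynomial of the same degree only achieves gradient $O(L)$ in the interior of the interval.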
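Proof of Thm. 6.5. Consider the truncated linear function
$$f_{\text{gap},\Delta}(x) \begin{cases} = \dfrac{x + 1 - \Delta}{\Delta}, & x \in [-1, -1 + \Delta], \\ \in [-1, 1], & \text{otherwise}. \end{cases} \tag{6.5}$$
As $\Pi \left( f_{\text{gap},\Delta}[H/\alpha] - \frac{H/\alpha + (1 - \Delta) I}{\Delta} \right) \Pi = 0$, the theorem is proven by finding a degree $n$ odd polynomial $p_{\text{gap},\Delta,n}(x)$ that uniformly approximates $f_{\text{gap},\Delta}(x)$ with error $\max_{x \in [-1, -1+\Delta]} |p_{\text{gap},\Delta,n}(x) - f_{\text{gap},\Delta}(x)| \le \epsilon$ and also satisfies all the conditions of quantum signal processing Thm. 6.6. We provide such a polynomial of degree $O(\Delta^{-1/2} \log^{3/2}(1/\epsilon))$ in Lem. 6.31 of Sec. 6.6.2. And so we define $H_{\text{amp}} = p_{\text{gap},\Delta,n}[H/\alpha]$, which approximates the desired amplified Hamiltonian with error
$$\left\| \Pi \left( H_{\text{amp}} - \frac{H/\alpha + (1 - \Delta) I}{\Delta} \right) \Pi \right\| \le \max_{x \in [-1, -1+\Delta]} \left| p_{\text{gap},\Delta,n}(x) - f_{\text{gap},\Delta}(x) \right| \le \epsilon.$$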
As energy gaps in an interval of width $\Delta$ are stretched by a factor $\Delta^{-1}$ using only $O(\Delta^{-1/2})$ queries, a quadratic advantage in normalization is achieved. This is essentially spectral gap amplification [118] with two important distinctions: first, it applies to any Hamiltonian through the standard-form, though as highlighted in [118], only those encoded with $\alpha = \|H\|$, such as frustration-free Hamiltonians, can fully exploit the effect. Second, it amplifies the spectral gap of all eigenvalues uniformly, rather than non-uniformly. By combining with Thm. 6.3, one obtains a Hamiltonian simulation algorithm for low-energy subspaces, relevant to quantum chemistry and adiabatic computation.
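Corollary 6.13 (Hamiltonian simulation of low-energy subspaces). Given Hermitian standard-form-$(H, \alpha, U, d)$ with eigenstates $(H/\alpha)|\lambda\rangle = \lambda|\lambda\rangle$, let $\Delta \in (0, 1)$ be a positive constant, and $\Pi = \sum_{\lambda \in [-1, -1+\Delta]} |\lambda\rangle\langle\lambda|$ be a projector onto the low-energy subspace of $H$. Then time-evolution $e^{-iHt}$ on eigenstates with eigenvalues $\lambda \in [-1, -1 + \Delta]$ can be approximated with error $\epsilon$ using
$$O\!\left( \left( t\alpha\Delta + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)} \right) \Delta^{-1/2} \log^{3/2}\!\left( \frac{t\alpha\Delta}{\epsilon} \right) \right)$$
queries to controlled-$U$.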
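Proof. This follows from multiplying the query complexities of Thm. 6.3 with Thm. 6.5, similar to the proof of Cor. 6.12, to obtain a cost of $O\!\left(t\alpha\Delta + \frac{\log(1/\epsilon_1)}{\log\log(1/\epsilon_1)}\right) O\!\left(\Delta^{-1/2} \log^{3/2}(1/\epsilon_2)\right)$ queries for approximating $e^{-iHt}$ with error $\epsilon_1 + t\alpha\Delta\epsilon_2$. Thus we choose $\epsilon_1 = \epsilon/2$ and $t\alpha\Delta\epsilon_2 = \epsilon/2$. □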
It is worth mentioning that Thm. 6.5 also performs uniform spectral amplification on high energy states. This follows from the polynomial $p_{\text{gap},\Delta,n}(x)$ being odd. Thus its ability to stretch eigenvalues $\lambda \in [-1, -1 + \Delta]$ applies to those $\lambda \in [1 - \Delta, 1]$ as well.
6.3 Amplitude Multiplication
Amplitude amplification is a staple quantum subroutine for state preparation that is used in many quantum algorithms. The basic version and its generalization are based on reflections, and are described in Chapter 3. The result that concerns us is flexible amplitude amplification, which we restate now.
Theorem 6.14 (Flexible amplitude amplification). Given a state preparation unitary $G$ acting on the computational basis states $|0\rangle_a \in \mathbb{C}^d$, $|0\rangle_b \in \mathbb{C}^2$ such that $G|0\rangle_a|0\rangle_b = \lambda |t\rangle_a|0\rangle_b + \sqrt{1 - \lambda^2}\, |t^\perp\rangle_{ab}$, where $|t^\perp\rangle_{ab}$ has no support on $|0\rangle_b$, let $D$ be any function that satisfies all the following conditions:
(1) $D$ is an odd real polynomial in $\lambda$ of degree at most $2N + 1$;
(2) $\forall \lambda \in [-1, 1]$, $D^2(\lambda) \le 1$.
Then there exists a quantum circuit $W$ such that $\langle t|_a \langle 0|_b \langle 0|_c\, W\, |0\rangle_a |0\rangle_b |0\rangle_c = D(\lambda)$, using $N + 1$ queries to $G$, $N$ queries to $G^\dagger$, $O(N \log(d))$ primitive quantum gates precomputed from $D$ in classical $O(\text{poly}(N))$ time, and an additional qubit ancilla $c$.
The proof of amplitude multiplication follows from flexible amplitude amplification by
an appropriate choice of polynomials for D.
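Proof of Thm. 6.10. The amplitude multiplication algorithm is a special case of Thm. 6.14 where $D$ is a polynomial that approximates the truncated linear function
$$f_{\text{lin},\Gamma}(x) \begin{cases} = \dfrac{x}{2\Gamma}, & |x| \in [0, \Gamma], \\ \in [-1, 1], & |x| \in (\Gamma, 1]. \end{cases} \tag{6.6}$$
In Thm. 6.18 of Sec. 6.6.1, we approximate $f_{\text{lin},\Gamma}(x)$ with a polynomial with the following properties: $\forall\, \Gamma \in [0, 1/2]$ and $\epsilon \le O(\Gamma)$, the odd polynomial $p_{\text{lin},\Gamma,n}$ of degree $n = O(\Gamma^{-1} \log(1/\epsilon))$ satisfies
$$\forall x \in [-\Gamma, \Gamma],\quad \left| p_{\text{lin},\Gamma,n}(x) - \frac{x}{2\Gamma} \right| \le \frac{\epsilon |x|}{2\Gamma} \quad\text{and}\quad \max_{x \in [-1,1]} |p_{\text{lin},\Gamma,n}(x)| \le 1. \tag{6.7}$$
As this polynomial satisfies the conditions of Thm. 6.14, there exists a state preparation unitary $W$ with
$$W|0\rangle_a|0\rangle_b|0\rangle_c = p_{\text{lin},\Gamma,n}(\sin\theta)\,|t\rangle_a|0\rangle_b|0\rangle_c + A(\theta)\,|t^\perp\rangle_{ab}|0\rangle_c + iC(\theta)\,|t\rangle_a|0\rangle_b|1\rangle_c - iB(\theta)\,|t^\perp\rangle_{ab}|1\rangle_c,$$
where the functions $A, B, C$ are of lesser interest, that consists of $O(n)$ queries to $G$, $G^\dagger$ and $O(n \log(d))$ primitive gates. Assuming that $\Gamma \in [|\sin\theta|, 1/2]$ is an upper bound on $|\sin\theta|$, the amplitude of the target state satisfies
$$\left| \langle t|_a \langle 0|_b \langle 0|_c\, W\, |0\rangle_a |0\rangle_b |0\rangle_c - \frac{\sin\theta}{2\Gamma} \right| \le \frac{\epsilon |\sin\theta|}{2\Gamma}.$$
In other words, all initial target state amplitudes $\sin(\theta)$ are divided by a constant factor $2\Gamma$ with a multiplicative error $\epsilon$ that can be made exponentially small. □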
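Note that if one is interested in multiplication by a factor less than one, trivial solutions exist. For any $\Gamma > 1/2$, one could prepare an ancilla state $|\Gamma\rangle_c = \frac{1}{2\Gamma}|0\rangle_c + \sqrt{1 - \frac{1}{4\Gamma^2}}\,|1\rangle_c$ and simply define the target state to be $|t\rangle_a|0\rangle_b|0\rangle_c$ in the prepared state $G|0\rangle_a|0\rangle_b|\Gamma\rangle_c = \frac{\sin(\theta)}{2\Gamma}\,|t\rangle_a|0\rangle_b|0\rangle_c + \cdots$.
6.4 Uniform Spectral Amplification by Amplitude Multiplication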
We now consider a certain kind of structure within the signal unitary $U$ that encodes some Hamiltonian in standard-form. Whereas Sec. 6.2 treats $U$ as a single oracle, we now assume that it factors into other unitaries, say $U = U_{\text{row}}^\dagger U_{\text{col}}$, or $U = U_{\text{row}}^\dagger U_{\text{mix}} U_{\text{col}}$, that we assume access to as oracles. This factorization imposes in Sec. 6.4.1 the interpretation that encoded Hamiltonians have matrix elements defined by the overlap between some set of quantum states. We investigate in Sec. 6.4.2 how this structure may be exploited for uniform spectral amplification. By applying amplitude multiplication, this is possible through Lem. 6.15 in a fairly general setting. In Sec. 6.4.3, we specialize this to sparse Hamiltonian simulation, which leads to the improved simulation algorithm Thm. 6.8. In Sec. 6.4.4, this algorithm is proven to be optimal in all parameters, at least up to logarithmic factors, through a matching lower bound Thm. 6.9.
6.4.1 Matrix Elements as State Overlaps
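Decomposing the signal unitary into factors motivates a different interpretation of the standard-form
$$(\langle 0|_a \otimes I_s)\, U\, (|0\rangle_a \otimes I_s) = (\langle 0|_a \otimes I_s)\, U_{\text{row}}^\dagger U_{\text{col}}\, (|0\rangle_a \otimes I_s). \tag{6.8}$$
By definition, any unitary operator implements a basis transformation $U = \sum_k |B_k\rangle\langle A_k|_{as}$ between complete orthonormal sets of basis states $\{|B_k\rangle_{as}\}$ and $\{|A_k\rangle_{as}\}$, and similarly for $U_{\text{row}}$, $U_{\text{col}}$. Now consider a set of basis states $\{|j\rangle_a\}$ on the ancilla register, and a set of basis states $\{|u_j\rangle_s\}$ on the system register. Without loss of generality, we may represent $U_{\text{row}} = \sum_k |\chi_{0,k}\rangle_{as} \langle 0|_a \langle u_k|_s + \sum_{j \ne 0} \sum_k |\chi_{j,k}\rangle_{as} \langle j|_a \langle u_k|_s$ and $U_{\text{col}} = \sum_k |\psi_{0,k}\rangle_{as} \langle 0|_a \langle u_k|_s + \sum_{j \ne 0} \sum_k |\psi_{j,k}\rangle_{as} \langle j|_a \langle u_k|_s$ for some sets of basis states $\{|\chi_{j,k}\rangle_{as}\}$, $\{|\psi_{j,k}\rangle_{as}\}$. Let us substitute this into Eq. 6.8 and drop the 0 subscript.
$$\frac{H_{jk}}{\alpha} = \frac{\langle u_j| H |u_k\rangle}{\alpha} = \left( \langle 0|_a \langle u_j|_s U_{\text{row}}^\dagger \right) \left( U_{\text{col}} |0\rangle_a |u_k\rangle_s \right) = \langle \chi_{0,j} | \psi_{0,k} \rangle_{as} = \langle \chi_j | \psi_k \rangle_{as}. \tag{6.9}$$
In other words, elements of $H$ in the $|u_j\rangle_s$ basis may always be interpreted as the overlap of appropriately defined quantum states $|\psi_j\rangle_{as}$, $|\chi_k\rangle_{as}$, which we call overlap states. Moreover, $H$ need not be unitary when the dimension of these states is greater than that of $H$.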
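More generally, we may factor the signal unitary into three unitaries $U = U_{\text{row}}^\dagger U_{\text{mix}} U_{\text{col}}$. If we preserve the interpretation of $U_{\text{row}}$ and $U_{\text{col}}$ as preparing appropriately defined quantum states, the third unitary $U_{\text{mix}}$ is a new component that mixes these states to encode the following Hamiltonian in standard-form
$$\frac{H}{\alpha} = (\langle 0|_a \otimes I_s)\, U_{\text{row}}^\dagger U_{\text{mix}} U_{\text{col}}\, (|0\rangle_a \otimes I_s), \qquad \frac{H_{jk}}{\alpha} = \langle \chi_j|_{as}\, U_{\text{mix}}\, |\psi_k\rangle_{as}. \tag{6.10}$$
Note that this reduces to Eq. 6.9 by choosing $U_{\text{mix}}$ to be identity, or by absorbing it into the definition of either $U_{\text{row}}$ or $U_{\text{col}}$. Combined with Thm. 6.3, time evolution by $e^{-iHt}$ may be approximated with error $\epsilon$ using $O\!\left(t\alpha + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries to $U_{\text{row}}$, $U_{\text{mix}}$, and $U_{\text{col}}$.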
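However, the ability to efficiently prepare arbitrary quantum states represents an extremely powerful model of computation. For instance, arbitrary temperature Gibbs state preparation is QMA-complete [46]. That not all states may be prepared in $O(1)$ queries to commonly used quantum oracles can be built into the definition of the overlap states by splitting them into 'good' components $|\psi_j\rangle_{a_1 s}$, $|\chi_j\rangle_{a_1 s}$ marked by an ancilla state $|0\rangle_{a_2}$, and 'bad' components that are discarded. Difficult states then have a small amplitude in the $|0\rangle_{a_2}$ subspace. Thus
$$|\psi_j\rangle_{as} = \sqrt{\lambda_\beta}\,\beta_j\, |\psi_j\rangle_{a_1 s} |0\rangle_{a_2} + \sqrt{1 - \lambda_\beta \beta_j^2}\, |\psi_{\text{bad},j}\rangle_{a_1 s} |1\rangle_{a_2},$$
$$|\chi_j\rangle_{as} = \sqrt{\lambda_\gamma}\,\gamma_j\, |\chi_j\rangle_{a_1 s} |0\rangle_{a_2} + \sqrt{1 - \lambda_\gamma \gamma_j^2}\, |\chi_{\text{bad},j}\rangle_{a_1 s} |2\rangle_{a_2}. \tag{6.11}$$
Note that the ancilla register $a_1 a_2$ has the same dimension as $a$. The coefficients $\lambda_\gamma, \lambda_\beta \in (0, 1]$ represent a slowdown factor due to the difficulty of state preparation, and the coefficients $\beta_j, \gamma_j \in [0, 1]$, normalized to $\max_j \beta_j = 1$, $\max_j \gamma_j = 1$, represent how the amplitude in good states can be index-dependent by design. By restricting $U_{\text{mix}}$ to be identity on the register $a_2$, this encodes the following Hamiltonian in standard-form
$$\frac{H}{\alpha} = (\langle 0|_a \otimes I_s)\, U_{\text{row}}^\dagger U_{\text{mix}} U_{\text{col}}\, (|0\rangle_a \otimes I_s), \qquad \frac{H_{jk}}{\alpha} = \langle \chi_j|_{as}\, U_{\text{mix}}\, |\psi_k\rangle_{as} = \sqrt{\lambda_\beta \lambda_\gamma}\, \gamma_j \beta_k\, \langle \chi_j|_{a_1 s}\, U_{\text{mix}}\, |\psi_k\rangle_{a_1 s}. \tag{6.12}$$
By explicitly including the slowdown factor $\sqrt{\lambda_\beta \lambda_\gamma}$, the spectral norm $\|H\| \le \alpha \sqrt{\lambda_\beta \lambda_\gamma}$ is also reduced.
6.4.2 Amplitude Multiplication of Overlap States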
This state overlap encoding of Hamiltonians motivates the use of amplitude amplification. As the amplitudes of all states $|\psi_j\rangle$ are attenuated by a constant factor $\sqrt{\lambda_\beta}$, the intuition is that one requires $O(1/\sqrt{\lambda_\beta})$ queries to the state preparation operator $U_{\text{col}}$ to boost the amplitude in the subspace marked by $|0\rangle_{a_2}$ by a factor $O(1/\sqrt{\lambda_\beta})$, and similarly for $|\chi_j\rangle$. Thus $O(1/\sqrt{\lambda_\beta} + 1/\sqrt{\lambda_\gamma})$ queries appear sufficient to reduce the normalization $\alpha$ by a factor $\sqrt{\lambda_\beta \lambda_\gamma}$. This suggests that the query complexity of Hamiltonian simulation could be improved to $O(t\alpha(\sqrt{\lambda_\beta} + \sqrt{\lambda_\gamma}) + \cdots)$, which is most advantageous when $\lambda_\beta$ and $\lambda_\gamma$ are both small. However, realizing this speedup is non-trivial.
In the context of prior art in sparse Hamiltonian simulation, attempts have been made to exploit amplitude amplification [12]. There, it was discovered that the sinusoidal non-linearity of amplitude amplification introduces large errors. As these errors accumulate over long simulation times $t$, controlling them led to query complexity scaling like $O(t^{3/2}/\sqrt{\epsilon})$, which is polynomially worse than what intuition suggests. In the following, we avoid these issues by introducing a linearized version of amplitude amplification, which we call the amplitude multiplication algorithm.
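Before proceeding, note that amplitude amplification also imposes additional restrictions on the form of the overlap states in Eq. 6.11. Amplitude amplification requires the ability to perform reflections $\text{Ref}_{|0\rangle_{a_2}}$ about the subspace marked by $|0\rangle_{a_2}$, as well as reflections $\text{Ref}_\psi$ on any arbitrary superposition of initial states $|\psi_j\rangle$, that is $\forall j$, $\text{Ref}_\psi U_{\text{col}} |0\rangle_a |u_j\rangle_s = -U_{\text{col}} |0\rangle_a |u_j\rangle_s$, and $\text{Ref}_\psi$ performs identity for any other ancilla state. The case for $U_{\text{row}}$ and $|\chi_j\rangle$ is identical. Whereas the first operation
$$\text{Ref}_{|0\rangle_{a_2}} = (I_{a_2} - 2|0\rangle\langle 0|_{a_2}) \otimes I_{a_1 s} \tag{6.13}$$
is easy using $O(1)$ primitive gates, the second operation requires $U_{\text{col}}$ to represent controlled state preparation. In other words, with the input $|u_j\rangle$ on the system register, the overlap state has the decomposition
$$U_{\text{col}} |0\rangle_a |u_j\rangle_s = |\psi_j\rangle_{as} = \left( \sqrt{\lambda_\beta}\,\beta_j\, |\psi_j\rangle_{a_1} |0\rangle_{a_2} + \sqrt{1 - \lambda_\beta \beta_j^2}\, |\psi_{\text{bad},j}\rangle_{a_1} |1\rangle_{a_2} \right) |u_j\rangle_s, \tag{6.14}$$
thus encoding the following Hamiltonian in standard-form
$$\frac{H}{\alpha} = (\langle 0|_a \otimes I_s)\, U_{\text{row}}^\dagger U_{\text{mix}} U_{\text{col}}\, (|0\rangle_a \otimes I_s), \qquad \frac{H_{jk}}{\alpha} = \sqrt{\lambda_\beta \lambda_\gamma}\, \gamma_j \beta_k\, \langle \chi_j|_{a_1} \langle u_j|_s\, U_{\text{mix}}\, |\psi_k\rangle_{a_1} |u_k\rangle_s, \tag{6.15}$$
and allowing us to construct the controlled-reflection operator
$$\text{Ref}_\psi = \sum_j \left( I_a - 2\, |0\rangle_{a_2} |\psi_j\rangle_{a_1} \langle \psi_j|_{a_1} \langle 0|_{a_2} \right) \otimes |u_j\rangle\langle u_j|_s = U_{\text{col}} \left( (I_a - 2|0\rangle\langle 0|_a) \otimes I_s \right) U_{\text{col}}^\dagger, \tag{6.16}$$
using 2 queries and $O(\log d)$ primitive gates.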
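The error introduced by a naive application of amplitude amplification is illustrated by an explicit calculation. Using a sequence of $m \ge 0$ controlled-Grover iterates $\text{Ref}_\psi \text{Ref}_{|0\rangle_{a_2}}$ making $O(m)$ queries, one can prepare the state
$$|\psi_{\text{amp},j}\rangle = \left( \text{Ref}_\psi \text{Ref}_{|0\rangle_{a_2}} \right)^m U_{\text{col}} |0\rangle_a |u_j\rangle_s = \left( \sin\!\left( (2m+1) \sin^{-1}\!\left( \sqrt{\lambda_\beta}\,\beta_j \right) \right) |\psi_j\rangle_{a_1} |0\rangle_{a_2} + \cdots |1\rangle_{a_2} \right) |u_j\rangle_s. \tag{6.17}$$
With the choice $m = O(\lambda_\beta^{-1/2})$, we are guaranteed that all amplitudes satisfy $\sin\!\left( (2m+1) \sin^{-1}\!\left( \sqrt{\lambda_\beta}\,\beta_j \right) \right) \ge \sqrt{\lambda_\beta}\,\beta_j$. Though this improves the normalization, it also specifies an erroneous Hamiltonian, as the matrix elements $\langle \chi_{\text{amp},j}| U_{\text{mix}} |\psi_{\text{amp},k}\rangle$ are larger than those of $H_{jk}$ by an index-dependent factor.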
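The index dependence of the amplification factor can be seen numerically. The sketch below compares the gain of standard amplitude amplification, $\sin((2m+1)\sin^{-1}a)/a$, against the constant gain $1/(2\Gamma)$ of an ideal truncated linear function. The values of `lam_beta`, `beta`, and `m` are hypothetical and chosen only for illustration.

```python
import numpy as np


def amplified_amplitude(a, m):
    """Amplitude after m Grover iterates, starting from amplitude a."""
    return np.sin((2 * m + 1) * np.arcsin(a))


# Hypothetical slowdown factor and index-dependent coefficients beta_j.
lam_beta = 0.01
beta = np.array([0.1, 0.5, 1.0])
a = np.sqrt(lam_beta) * beta          # initial good-state amplitudes
m = 3                                 # number of Grover iterates

# Standard amplitude amplification: the gain depends on the index j.
gain_amp = amplified_amplitude(a, m) / a

# Ideal linear target x / (2 * Gamma): the gain is the same for every j.
Gamma = np.sqrt(lam_beta)             # upper bound on all amplitudes a
gain_lin = (a / (2 * Gamma)) / a

print("amplitude amplification gains:", gain_amp)
print("linear (multiplication) gains: ", gain_lin)
```

Because the first set of gains differs across $j$, the matrix elements of the encoded Hamiltonian are rescaled non-uniformly, which is the distortion that a linearized procedure is meant to avoid.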
In contrast, amplitude multiplication in Thm. 6.10 avoids this non-linearity and allows us to boost the normalization of the encoded Hamiltonian with only an exponentially small distortion to its spectrum. This leads to
Lemma 6.15 (Uniform spectral amplification by multiplied state overlaps). Let the Hamiltonian $H$ be encoded in the standard-form of Eq. 6.15 with normalization $\alpha$. Given upper bounds $\Lambda_\beta \in [\lambda_\beta, 1/2]$, $\Lambda_\gamma \in [\lambda_\gamma, 1/2]$ on the slowdown factors, and a target error $\epsilon \in (0, \min\{\Lambda_\beta, \Lambda_\gamma\})$, the Hamiltonian $H_{\text{lin}}$ can be encoded in standard-form with normalization $4\alpha\sqrt{\Lambda_\beta \Lambda_\gamma}$ such that $\|H_{\text{lin}} - H\| \le \frac{5}{2}\epsilon \|H\| \le \frac{5}{2}\alpha\epsilon\sqrt{\lambda_\beta \lambda_\gamma}$, using $Q = O((\Lambda_\beta^{-1/2} + \Lambda_\gamma^{-1/2}) \log(1/\epsilon))$ queries, $O(Q \log(d))$ primitive gates, and 1 additional ancilla qubit.
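Proof. Let us apply Thm. 6.10, which requires one additional ancilla qubit, to the state overlap model Eq. 6.14. We identify $U_{\text{col}}$ as the state preparation operator that prepares the target state marked by $|0\rangle_{a_2}$ with overlap $\sqrt{\lambda_\beta}\,\beta_j$. Assume that $\lambda_\beta \le 1/2$, and let $\Lambda_\beta \in [\lambda_\beta, 1/2]$ be an upper bound on the slowdown factor. Then there exists a quantum circuit $U'_{\text{col}}$ that makes $Q_\beta = O(\Lambda_\beta^{-1/2} \log(1/\epsilon))$ queries to $U_{\text{col}}$ and uses $O(Q_\beta \log(d))$ primitive gates, and similarly for $U_{\text{row}}$, to prepare the states
$$|\psi_{\text{lin},j}\rangle = U'_{\text{col}} |0\rangle_a |u_j\rangle_s = \left( \frac{\sqrt{\lambda_\beta}\,\beta_j}{2\sqrt{\Lambda_\beta}} (1 + \epsilon_{\beta,j})\, |\psi_j\rangle_{a_1} |0\rangle_{a_2} + \cdots |1\rangle_{a_2} \right) |u_j\rangle_s,$$
$$|\chi_{\text{lin},j}\rangle = U'_{\text{row}} |0\rangle_a |u_j\rangle_s = \left( \frac{\sqrt{\lambda_\gamma}\,\gamma_j}{2\sqrt{\Lambda_\gamma}} (1 + \epsilon_{\gamma,j})\, |\chi_j\rangle_{a_1} |0\rangle_{a_2} + \cdots |2\rangle_{a_2} \right) |u_j\rangle_s, \tag{6.18}$$
where $|\epsilon_{\beta,j}|, |\epsilon_{\gamma,j}| \le \epsilon \in (0, \min\{\Lambda_\beta, \Lambda_\gamma\}) < 1/2$ are state-dependent errors in the amplitude. Let us define the Hamiltonian $H_{\text{lin}}$ encoded in standard-form with normalization $4\alpha\sqrt{\Lambda_\beta \Lambda_\gamma}$ as follows
$$\frac{H_{\text{lin}}}{4\alpha\sqrt{\Lambda_\beta \Lambda_\gamma}} = (\langle 0|_a \otimes I_s)\, U_{\text{row}}'^{\dagger} U_{\text{mix}} U'_{\text{col}}\, (|0\rangle_a \otimes I_s) = \sum_{jk} \frac{H_{jk}}{4\alpha\sqrt{\Lambda_\beta \Lambda_\gamma}} (1 + \epsilon_{\gamma,j})(1 + \epsilon_{\beta,k})\, |u_j\rangle\langle u_k|_s. \tag{6.19}$$
We may now evaluate the error of $H_{\text{lin}}$ from that of the original Hamiltonian $H$, following a similar approach from [12]. Let $\epsilon_\beta$ be a diagonal matrix with elements $\epsilon_{\beta,j}$, and similarly for $\epsilon_\gamma$. Then
$$H_{\text{lin}} = H + \epsilon_\gamma H + H \epsilon_\beta + \epsilon_\gamma H \epsilon_\beta,$$
$$\|H_{\text{lin}} - H\| \le \|H\| \left( \|\epsilon_\beta\| + \|\epsilon_\gamma\| + \|\epsilon_\beta\| \|\epsilon_\gamma\| \right) \le \|H\| (2\epsilon + \epsilon^2) \le \frac{5}{2} \|H\| \epsilon \le \frac{5}{2} \alpha \sqrt{\lambda_\beta \lambda_\gamma}\, \epsilon, \tag{6.20}$$
where the second-last inequality is due to $\epsilon < 1/2$, and the last inequality applies the upper bound $\|H\| \le \alpha\sqrt{\lambda_\beta \lambda_\gamma}$. Summing up $Q = Q_\beta + Q_\gamma + 1$ leads to the claimed query and gate complexities. □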
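The norm bound of Eq. 6.20 can be sanity-checked numerically on a random instance. The sketch below is only an illustration: the matrix size, the random seed, and the value of `eps` are arbitrary assumptions, and the diagonal error matrices are drawn at random rather than produced by any amplitude multiplication circuit.

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 8, 0.1                       # eps must be below 1/2 for the 5/2 bound

# Random Hermitian matrix standing in for H.
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (A + A.conj().T) / 2

# Diagonal, state-dependent multiplicative errors with |entries| <= eps.
e_beta = np.diag(rng.uniform(-eps, eps, size=n))
e_gamma = np.diag(rng.uniform(-eps, eps, size=n))

# H_lin has elements H_jk (1 + e_gamma_j)(1 + e_beta_k).
I = np.eye(n)
H_lin = (I + e_gamma) @ H @ (I + e_beta)


def spectral_norm(M):
    """Largest singular value of M."""
    return np.linalg.norm(M, 2)


err = spectral_norm(H_lin - H)
bound = 2.5 * eps * spectral_norm(H)
print(f"||H_lin - H|| = {err:.4f} <= (5/2) eps ||H|| = {bound:.4f}")
```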
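Combining with Thm. 6.3 then furnishes the following result on Hamiltonian simulation.
Lemma 6.16 (Hamiltonian simulation by multiplied state overlaps). Let the Hamiltonian $H$ be encoded in the standard-form of Eq. 6.15 with normalization $\alpha$. Given upper bounds $\Lambda_\beta \in [\lambda_\beta, 1/2]$, $\Lambda_\gamma \in [\lambda_\gamma, 1/2]$ on the slowdown factors, $\Lambda \ge \|H\|$, and a target error $\epsilon \in (0, \min\{\Lambda_\beta, \Lambda_\gamma\})$, time-evolution $e^{-iHt}$ can be approximated with error $\epsilon$ using
$$Q = O\!\left( t\alpha \left( \sqrt{\Lambda_\beta} + \sqrt{\Lambda_\gamma} \right) \log\!\left( \frac{t\Lambda}{\epsilon} \right) + \left( \Lambda_\beta^{-1/2} + \Lambda_\gamma^{-1/2} \right) \frac{\log(1/\epsilon)\, \log(t\Lambda/\epsilon)}{\log\log(1/\epsilon)} \right)$$
queries, $O(Q \log(d))$ primitive gates, and $O(1)$ additional ancilla qubits.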
Proof. From Lem. 6.15, we may encode $H_{\text{lin}}$ in standard-form with normalization $4\alpha\sqrt{\Lambda_\beta \Lambda_\gamma}$ and error $\|H_{\text{lin}} - H\| = O(\|H\| \epsilon_0) = O(\Lambda \epsilon_0)$. This requires $Q_0 = O((\Lambda_\beta^{-1/2} + \Lambda_\gamma^{-1/2}) \log(1/\epsilon_0))$ queries to $U_{\text{row}}$, $U_{\text{mix}}$, $U_{\text{col}}$ and their inverses, $O(Q_0 \log(d))$ primitive gates, and 1 additional ancilla qubit. Using the fact $\|e^{iA} - e^{iB}\| \le \|A - B\|$, the error of $e^{-iH_{\text{lin}} t}$ from ideal time-evolution is $\|e^{-iH_{\text{lin}} t} - e^{-iHt}\| \le \|H_{\text{lin}} t - H t\| = O(t\Lambda\epsilon_0)$. By combining with Thm. 6.3, time-evolution by $e^{-iH_{\text{lin}} t}$ can be approximated with error $\epsilon_1$ using $Q_1 = O\!\left( t\alpha\sqrt{\Lambda_\beta \Lambda_\gamma} + \frac{\log(1/\epsilon_1)}{\log\log(1/\epsilon_1)} \right)$ queries to controlled-$U_{\text{row}}'^{\dagger} U_{\text{mix}} U'_{\text{col}}$ and its inverse, $O(Q_1 \log(d))$ additional primitive gates, and $O(1)$ additional ancilla qubits. Thus time-evolution by $e^{-iHt}$ can be approximated with error $\epsilon = O(\epsilon_1 + t\Lambda\epsilon_0)$ using $Q = Q_0 Q_1$ queries to controlled-$U_{\text{row}}$, $U_{\text{mix}}$, $U_{\text{col}}$ and their inverses, and $O(Q_1 \log(d) + Q_0 Q_1 \log(d)) = O(Q \log(d))$ primitive gates. We can control the error by choosing $\epsilon_1 = O(\epsilon)$ and $\epsilon_0 = O(\epsilon/(t\Lambda))$. Substituting into $Q$ produces the claimed query complexity. □
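The bookkeeping in this proof can be written out as a small calculation. The sketch below evaluates the scaling $Q = Q_0 Q_1$ with all constant factors inside the big-O set to 1, which is an assumption made purely for illustration, and compares it with the unamplified scaling $t\alpha + \log(1/\epsilon)/\log\log(1/\epsilon)$ of Thm. 6.3. The function names and the sample parameters are hypothetical.

```python
import math


def qubitization_scaling(t_alpha, eps):
    """Scaling t*alpha + log(1/eps)/loglog(1/eps), constants set to 1.

    Requires eps < 1/e so that log(log(1/eps)) is positive.
    """
    return t_alpha + math.log(1 / eps) / math.log(math.log(1 / eps))


def multiplied_overlap_scaling(t, alpha, Lam, Lam_beta, Lam_gamma, eps):
    """Scaling Q = Q0 * Q1 from the proof, constants set to 1."""
    eps0 = eps / (t * Lam)            # error budget for encoding H_lin
    eps1 = eps                        # error budget for the simulation step
    Q0 = (Lam_beta ** -0.5 + Lam_gamma ** -0.5) * math.log(1 / eps0)
    Q1 = qubitization_scaling(t * alpha * math.sqrt(Lam_beta * Lam_gamma), eps1)
    return Q0 * Q1


# Hypothetical parameters: small slowdown factors favour amplification.
t, alpha, Lam, eps = 1e4, 100.0, 1.0, 1e-3
plain = qubitization_scaling(t * alpha, eps)
amplified = multiplied_overlap_scaling(t, alpha, Lam, 1e-4, 1e-4, eps)
print(f"unamplified scaling: {plain:.3e}")
print(f"amplified scaling:   {amplified:.3e}")
```

With these sample values the amplified scaling is smaller, reflecting that the leading term improves from $t\alpha$ to roughly $t\alpha(\sqrt{\Lambda_\beta} + \sqrt{\Lambda_\gamma})$ up to logarithmic factors.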
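In the asymptotic limit of large $t \gg \log(1/\epsilon)$, the query complexity may be simplified to $O\!\left( t\alpha \left( \sqrt{\Lambda_\beta} + \sqrt{\Lambda_\gamma} \right) \log\!\left( \frac{t\Lambda}{\epsilon} \right) \right)$ queries.
6.4.3 Reduction to Sparse Matrices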
The results of Sec. 6.4, presented in a general setting, apply to the special case of sparse matrices. The reduction follows by making three additional assumptions. First, assume that the dimension of $|0\rangle_a \in \mathbb{C}^{3n}$ is larger than that of $|u_j\rangle_s \in \mathbb{C}^n$. Second, assume that $\forall j \in [n]$, $|u_j\rangle_s$ is the computational basis state $|j\rangle_s$. Third, we assume that there exist oracles in Def. 6.7 that describe $d$-sparse matrices [12]. With these oracles and an upper bound $\Lambda_{\max} \ge \|H\|_{\max}$, we described in Sec. 5.2.3 how $O(1)$ queries suffice to implement the isometry represented by $U_{\text{row}} |0\rangle_a$ and $U_{\text{col}} |0\rangle_a$, with output states
UcoiIO)alj)s = lbj)as =
)+
(0|a(ksUtro
{ksq_
= (Xklas =
a1__g__g + (I -
(ksd
k
fk
ftk
a
dAmax
1
max
)
+ (1
(6.21)
,
/1q)
ak 2
1-
-
max
qCFA
(xi|Umix|'k)-HkHk
1 -- H
max
PEFj
amx
{ |a-
where $\delta_{jk}$ is the Kronecker delta function, and $F_j = \{k : k = f(j, l),\ l \in [d]\}$ is the set of non-zero column indices in row $j$. We also choose $U_{\text{mix}}$ to swap the registers $s$ and $a_1$. The gate complexity of $U_{\text{col}}$, $U_{\text{row}}$, and $U_{\text{mix}}$ combined is $O(\log(n) + \text{poly}(m))$, where $m = O(\log(t\|H\|/\epsilon))$ is the number of bits of precision of $H_{jk}$. The contribution from $\text{poly}(m) = O(m^{5/2})$ is due to integer arithmetic for computing square-roots and trigonometric functions. This combined with Thm. 6.3 recovers the previous best result on sparse Hamiltonian simulation in Chapter 4 using $Q = O\!\left( t d \Lambda_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)} \right)$ queries, and $O(Q(\log(n) + \text{poly}(m)))$ primitive gates.
To see how Lem. 6.16 improves on this, we rewrite Eq. 6.21 in the format of Eq. 6.11 by collecting coefficients of the subspace marked by $|0\rangle_{a_2}$.
dAax
(XkI as
where o-
=
>pEFj
'7i'1
EI
S
lj)slP)ai
I0)a2 + - --
VZax
pEFj
U
dAmax
Uk
(
Ekqk + (1 -- k)Hq(j(
Wikj,
and the induced one-norm
qk
qE Fy
IIHII
(6.22)
)sll)a2 ,
1
)
(0ka 2 +
= maxj oj.
(ks(2la2
Note that
,
IVj)as =
4'j) =
sIP)ai, and similarly for xj). From this, we obtain our main result on sparse
Hamiltonian simulation Thm. 6.8.
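Proof of Thm. 6.8. Comparison of Eq. 6.22 with Eq. 6.11 yields $\beta_j = \gamma_j = \sqrt{\sigma_j / \|H\|_1}$ and $\lambda_\beta = \lambda_\gamma = \|H\|_1 / (d\Lambda_{\max})$. Thus we have the upper bound $\Lambda_\beta = \Lambda_\gamma = \Lambda_1 / (d\Lambda_{\max}) \ge \lambda_\beta = \lambda_\gamma$. Moreover, from Eq. 6.21, the normalization constant is $\alpha = d\Lambda_{\max}$. The claimed query complexity is obtained by substitution into Lem. 6.16. □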
This result strictly improves upon prior art, modulo logarithmic factors, by exploiting additional structural information. In the asymptotic limit of large $\Lambda_1 t \gg \log(1/\epsilon)$, the query complexity may be simplified to $O\!\left( t\sqrt{d\Lambda_{\max}\Lambda_1}\, \log\!\left( \frac{t\Lambda}{\epsilon} \right) \right)$. Using the inequality $\|H\| \le \|H\|_1 \le d\|H\|_{\max}$, the worst case occurs when these norms are all equal, thus $\Lambda = \Lambda_1 = d\Lambda_{\max}$. There, the query complexity of Thm. 6.8 up to logarithmic factors is $O(t d \Lambda_{\max})$, equal to that of prior art [86]. However, the best case $\|H\|_1 = O(\|H\|_{\max})$ leads to a quadratic improvement in sparsity with query complexity $O(t\sqrt{d}\Lambda_{\max})$, also ignoring logarithmic factors.
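The worst and best cases above can be illustrated on small explicit matrices. The following sketch computes the spectral norm, induced one-norm, max-norm, and sparsity with NumPy and compares the two leading-order scalings $t d \Lambda_{\max}$ and $t\sqrt{d \Lambda_{\max} \Lambda_1}$, taking the exact norms as the upper bounds. The helper names and the two test matrices are illustrative assumptions.

```python
import numpy as np


def matrix_norms(H):
    """Return (spectral norm, induced 1-norm, max-norm, sparsity d) of H."""
    spectral = np.linalg.norm(H, 2)
    induced_one = np.abs(H).sum(axis=1).max()      # max row sum (H Hermitian)
    max_norm = np.abs(H).max()
    sparsity = int((H != 0).sum(axis=1).max())     # max non-zeros per row
    return spectral, induced_one, max_norm, sparsity


def leading_costs(H, t):
    """Leading-order scalings t*d*Lmax and t*sqrt(d*Lmax*L1), logs dropped."""
    _, one, mx, d = matrix_norms(H)
    return t * d * mx, t * np.sqrt(d * mx * one)


n, t = 16, 10.0
# Worst case: a ring with unit couplings, where ||H||_1 = d ||H||_max.
ring = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
# Favourable case: same sparsity, but one dominant element per row at most.
skewed = 1e-3 * ring
skewed[0, 1] = skewed[1, 0] = 1.0

for name, H in (("ring", ring), ("skewed", skewed)):
    prior, new = leading_costs(H, t)
    print(f"{name:7s} t*d*Lmax = {prior:.3f}, t*sqrt(d*Lmax*L1) = {new:.3f}")
```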
Another approach implicit in [12] assumes that the $\sigma_j$ are provided by the quantum oracle $O_\sigma |j\rangle_s |z\rangle_c = |j\rangle_s |z \oplus \sigma_j\rangle_c$ when queried with the row index $j \in [n]$. This allows us to exactly compensate for the sinusoidal non-linearity of amplitude amplification by modifying initial state amplitudes by some $j$-dependent multiplicative factor. Thus $H$ may be encoded in standard-form with normalization $O(\sqrt{d\Lambda_{\max}\Lambda_1})$ exactly without any error, leading to a Hamiltonian simulation algorithm with query complexity $Q = O\!\left( t (d \|H\|_{\max} \|H\|_1)^{1/2} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)} \right)$. While this improves on Thm. 6.8 by logarithmic factors, and matches the complexity of Claim 6.1, $O_\sigma$ is in general difficult to construct.
6.4.4 Lower Bound on Sparse Hamiltonian Simulation
In this section, we prove the lower bound Thm. 6.9 on sparse Hamiltonian simulation, given information on the sparsity, max-norm, and induced one-norm. The lower bounds in prior art are obtained by constructing Hamiltonians that compute well-known functions. When applied to our situation, one obtains $\Omega(t\|H\|_1)$ queries through the PARITY problem [10], and $\Omega(\sqrt{d})$ queries through OR [12]. This leads to an additive lower bound $\Omega(t\|H\|_1 + \sqrt{d})$. Using similar techniques, we obtain a stronger lower bound $\Omega(t(d\|H\|_{\max}\|H\|_1)^{1/2})$ by creating a Hamiltonian that computes the solution to the composed function PARITY ∘ OR. Specifically, we combine a Hamiltonian that solves PARITY on $n$ bits with constant error using at least $\Omega(s\|H\|_{\max} t)$ queries, where $t = \Theta\!\left( \frac{n}{s\|H\|_{\max}} \right)$, with a Hamiltonian that solves OR on $m$ bits exactly, with the promise that at most 1 bit is non-zero, using at least $\Omega(\sqrt{m})$ queries. Note that in all cases, the query complexity with respect to error is at least an additive term $\Omega\!\left( \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)} \right)$.
The Hamiltonian $H_{\text{PARITY}}$ that solves PARITY on $n$ bits is well-known [10], and is based on the Hamiltonian $H_{\text{spin}}$ for perfect state transfer in spin chains. For completeness, we outline the procedure. Consider a Hamiltonian $H_{\text{spin}}$ of dimension $n + 1$, with matrix elements in the computational basis $\{|j\rangle_s : j \in [n + 1]\}$ defined as
$$\langle j - 1|_s H_{\text{spin}} |j\rangle_s = \sqrt{j(n - j + 1)}/n. \tag{6.23}$$
Note that this Hamiltonian has sparsity 1, max-norm $\Theta(1)$, and 1-norm $\Theta(1)$. Time evolution by this Hamiltonian, $e^{-iH_{\text{spin}} n\pi/2} |0\rangle_s = |n\rangle_s$, exactly transfers the state $|0\rangle$ to $|n\rangle$ in time $n\pi/2$.
One way to speed up these dynamics is to uniformly increase the value of all matrix elements. However, any increase in $\|H\|_{\max}$ is trivial as it simply decreases $t$ by a proportionate amount. Another way is to boost the sparsity of $H_{\text{spin}}$, by taking a tensor product with a Hamiltonian $H_{\text{complete}}$ of dimension $s$ where all matrix elements are 1 in the computational basis $\{|j\rangle_c : j \in [s]\}$.
$$\langle i|_c H_{\text{complete}} |j\rangle_c = 1, \quad \forall i \in [s],\ j \in [s]. \tag{6.24}$$
One of the eigenstates of $H_{\text{complete}}$ is the uniform superposition $|u\rangle_c = \frac{1}{\sqrt{s}} \sum_{j \in [s]} |j\rangle_c$ with eigenvalue $H_{\text{complete}} |u\rangle_c = s|u\rangle_c$. Thus we define the Hamiltonian
$$H_{sc} = H_{\text{spin}} \otimes H_{\text{complete}}. \tag{6.25}$$
Note that $H_{sc}$ has sparsity $s$, max-norm $\Theta(1)$, and 1-norm $\Theta(s)$. One can see that $H_{sc}$ performs faster state transfer like $e^{-iH_{sc} n\pi/(2s)} |0\rangle_s |u\rangle_c = |n\rangle_s |u\rangle_c$ in time $t = \frac{n\pi}{2s}$. We find it useful to define the state $|j\rangle_{sc} = |j\rangle_s |u\rangle_c$.
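Perfect state transfer, and its speedup by a tensor product with the all-ones matrix, can be verified numerically for small sizes. The sketch below builds $H_{\text{spin}}$ from Eq. 6.23 and $H_{sc}$ from Eq. 6.25 with NumPy and evolves by exact diagonalization; the function names and the sizes `n = 5`, `s = 3` are illustrative assumptions.

```python
import numpy as np


def h_spin(n):
    """(n+1)-dimensional state-transfer Hamiltonian of Eq. 6.23."""
    H = np.zeros((n + 1, n + 1))
    for j in range(1, n + 1):
        H[j - 1, j] = H[j, j - 1] = np.sqrt(j * (n - j + 1)) / n
    return H


def evolve(H, t, psi0):
    """Apply exp(-i H t) to psi0 by diagonalizing the Hermitian matrix H."""
    w, V = np.linalg.eigh(H)
    return V @ (np.exp(-1j * w * t) * (V.conj().T @ psi0))


n, s = 5, 3
e0 = np.zeros(n + 1); e0[0] = 1.0          # |0>_s
en = np.zeros(n + 1); en[n] = 1.0          # |n>_s
u = np.ones(s) / np.sqrt(s)                # uniform superposition |u>_c

# Transfer |0> -> |n> under H_spin in time n*pi/2 (up to a global phase).
p_spin = abs(np.vdot(en, evolve(h_spin(n), n * np.pi / 2, e0))) ** 2

# Transfer under H_sc = H_spin (x) H_complete in the shorter time n*pi/(2s).
H_sc = np.kron(h_spin(n), np.ones((s, s)))
psi_sc = evolve(H_sc, n * np.pi / (2 * s), np.kron(e0, u))
p_sc = abs(np.vdot(np.kron(en, u), psi_sc)) ** 2

print(f"transfer probability, H_spin: {p_spin:.12f}")
print(f"transfer probability, H_sc:   {p_sc:.12f}")
```

Both probabilities should be 1 up to floating-point error, with the second evolution taking a factor $s$ less time.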
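Adding another qubit to this composite Hamiltonian together with some slight modification solves PARITY. Given an $n$-bit string $x = x_0 x_1 \ldots x_{n-1}$, let us consider the Hamiltonian of dimension 2 that computes the NOT function on the computational basis $\{|j\rangle_{\text{output}} : j \in [2]\}$,
$$H_{\text{NOT},j} = \begin{pmatrix} x_j \oplus 1 & x_j \\ x_j & x_j \oplus 1 \end{pmatrix}. \tag{6.26}$$
One can see that $H_{\text{NOT},j} |0\rangle_{\text{output}} = |x_j\rangle_{\text{output}}$ and $H_{\text{NOT},j} |1\rangle_{\text{output}} = |x_j \oplus 1\rangle_{\text{output}}$, as expected of a NOT function. In the basis $|j\rangle_{sc}$, we define the Hamiltonian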
(
conjugate. (6.27)
1 +1)(j8 $® NOT~J +HermitianHPARITY
Vj(n - j +)
s
This Hamiltonian also performs perfect state transfer, but since the path of each transition between the states $|0\rangle_{\text{output}}$ and $|1\rangle_{\text{output}}$ is gated by a NOT function on the bit $x_j$, the output state of time-evolution is $e^{-iH_{\text{PARITY}} n\pi/(2s)} |0\rangle_s |u\rangle_c |0\rangle_{\text{output}} = |n\rangle_s |u\rangle_c |\bigoplus_j x_j\rangle_{\text{output}}$. In the computational basis, $H_{\text{PARITY}}$ has sparsity $2s$, max-norm $\Theta(1)$, and 1-norm $\Theta(s)$. Even though $H_{\text{NOT},j}$ has only one non-zero element per row, the sparsity increases by a factor 2 as we cannot compute beforehand the column index of the non-zero element. Thus measuring the output register returns the parity of $x$,
$$\text{PARITY}(x) = \bigoplus_{j=0}^{n-1} x_j, \tag{6.28}$$
after evolving for time $t = \frac{n\pi}{2s}$. It is well-known that the parity on $n$ bits cannot be computed with fewer than $\Omega(n)$ quantum queries, thus the query complexity of simulating time-evolution by $H_{\text{PARITY}}$ for time $t$ is at least $\Omega(ts)$. As sparsity and 1-norm exhibit the same scaling, and in general $\|H\|_1 \le d\|H\|_{\max}$, the more accurate statement here, if given information on $\|H\|_1$, is the lower bound of $\Omega(t\|H\|_1)$ queries. In contrast, the lower bound of [10] quotes $\Omega(\text{sparsity} \times t)$ as they consider the case where one is given information only on the sparsity.
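The parity computation by time-evolution can be checked on small instances. The sketch below assumes the tensor-product form $\sum_j \frac{\sqrt{(j+1)(n-j)}}{n} |j+1\rangle\langle j|_s \otimes H_{\text{complete}} \otimes H_{\text{NOT},j} + \text{h.c.}$ for the PARITY Hamiltonian, which is one reading of the construction described above rather than a verbatim transcription of Eq. 6.27; the function names and bit strings are illustrative.

```python
import numpy as np


def h_parity(x, s):
    """Assumed PARITY Hamiltonian on registers (spin chain, clique, output)."""
    n = len(x)
    X = np.array([[0.0, 1.0], [1.0, 0.0]])
    complete = np.ones((s, s))                 # H_complete of Eq. 6.24
    dim = (n + 1) * s * 2
    H = np.zeros((dim, dim))
    for j in range(n):
        coupling = np.sqrt((j + 1) * (n - j)) / n
        hop = np.zeros((n + 1, n + 1))
        hop[j + 1, j] = 1.0                    # |j+1><j| on the chain
        not_j = X if x[j] else np.eye(2)       # H_NOT,j: flip output iff x_j = 1
        term = coupling * np.kron(np.kron(hop, complete), not_j)
        H += term + term.T                     # add the Hermitian conjugate
    return H


def output_one_probability(x, s):
    """Probability of measuring 1 on the output after time n*pi/(2s)."""
    n = len(x)
    w, V = np.linalg.eigh(h_parity(x, s))
    e0 = np.zeros(n + 1); e0[0] = 1.0
    u = np.ones(s) / np.sqrt(s)
    psi0 = np.kron(np.kron(e0, u), np.array([1.0, 0.0]))
    psi = V @ (np.exp(-1j * w * n * np.pi / (2 * s)) * (V.T @ psi0))
    return float(np.sum(np.abs(psi.reshape(n + 1, s, 2)[:, :, 1]) ** 2))


for bits in ((1, 0, 1, 1), (1, 1, 0, 0)):
    p1 = output_one_probability(bits, s=2)
    print(bits, "parity =", sum(bits) % 2, " P(output = 1) =", round(p1, 9))
```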
We now present the extension to creating a Hamiltonian that solves PARITY ∘ OR. Notably, this Hamiltonian allows one to vary sparsity and 1-norm independently.
Proof of Thm. 6.9. The first step is to construct a Hamiltonian that solves the OR function on $m$ bits $x_0 x_1 \ldots x_{m-1}$, promised that at most 1 bit is non-zero. This Hamiltonian of dimension $2m$, in the computational basis $\{|k\rangle_{\text{output}} |j\rangle_o : k \in [2],\ j \in [m]\}$, is
$$H_{\text{OR}} = \begin{pmatrix} C_1 & C_0 \\ C_0^\dagger & C_1 \end{pmatrix}. \tag{6.29}$$
Note that our construction is based on a modification of [13], where $C_1$ there is the zero matrix. Here, $C_1$ mimics the top-left component of $H_{\text{NOT}}$ in that it performs a bit-flip on the output register if OR$(x) = 0$, and $C_0$ mimics the top-right component of $H_{\text{NOT}}$ in that it performs a bit-flip on the output register if OR$(x) = 1$. These matrices are defined as follows:
Co
=
X1
...
Xm-1
/1
(Xm-i
X0
...
Xm-2
1001+I
xm-2
Xm-1
-.
-
m-3
X1
X2
...
XO
1
1
1
Co...
,
+
X-
a~ t
(6.30)
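Note that the non-Hermitian matrix $C_0$ has rows formed from cyclic shifts of $x$, whereas $C_1$ is Hermitian. Let us define the uniform superposition $|u\rangle_o = \frac{1}{\sqrt{m}} \sum_{j \in [m]} |j\rangle_o$. It is easy to verify that if at most one bit in $x$ is non-zero, $C_0 |u\rangle_o = \text{OR}(x) |u\rangle_o$. Similarly, $C_1 |u\rangle_o = (\text{OR}(x) \oplus 1) |u\rangle_o$. Thus $H_{\text{OR}} |j\rangle_{\text{output}} |u\rangle_o = |j \oplus \text{OR}(x)\rangle_{\text{output}} |u\rangle_o$. Note that $H_{\text{OR}}$ has sparsity $2m$, max-norm $\Theta(1)$, and 1-norm $\Theta(1)$.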
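The eigenvalue property of the cyclic-shift matrix is easy to confirm numerically. The sketch below only covers $C_0$, built as a matrix whose $k$-th row is $x$ cyclically shifted by $k$ positions as described in the text; the helper name `c0` and the test strings are illustrative assumptions.

```python
import numpy as np


def c0(x):
    """Matrix whose k-th row is the bit string x cyclically shifted by k."""
    x = np.asarray(x, dtype=float)
    return np.array([np.roll(x, k) for k in range(len(x))])


def acts_as_or(x):
    """Check C0|u> = OR(x)|u> and C0^T|u> = OR(x)|u> on the uniform state."""
    m = len(x)
    u = np.ones(m) / np.sqrt(m)
    or_x = float(any(x))
    C = c0(x)
    return np.allclose(C @ u, or_x * u) and np.allclose(C.T @ u, or_x * u)


# Promise: at most one bit of x is non-zero.
print(acts_as_or([0, 0, 1, 0]))   # OR(x) = 1
print(acts_as_or([0, 0, 0, 0]))   # OR(x) = 0
```

Every row and column of `c0(x)` sums to the Hamming weight of $x$, which equals OR$(x)$ under the promise; without the promise the eigenvalue would be the Hamming weight instead.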
Given an $nm$-bit string $x_{0,0} x_{0,1} \ldots x_{0,m-1} x_{1,0} \ldots x_{n-1,m-1}$, the Hamiltonian $H_{\text{PARITY} \circ \text{OR}}$ that computes the $n$-bit PARITY of $n$ $m$-bit OR functions is similar to $H_{\text{PARITY}}$ in Eq. 6.27, except that instead of composing with NOT Hamiltonians defined by the bit $x_j$ for each $j \in [n]$, we compose with OR Hamiltonians defined by the bits $x_{j,0} x_{j,1} \ldots x_{j,m-1}$ for each $j \in [n]$. By defining $H_{\text{OR},j}$ as the Hamiltonian defined by those bits,
HPARITYoOR
-
jE[n]
n
+(n
j + 1) j +1)(Ujsc
OfRj
+ Hermitian conjugate. (6.31)
On the input state $|0\rangle_s |u\rangle_c |u\rangle_o |0\rangle_{\text{output}}$, time-evolution by $e^{-iH_{\text{PARITY} \circ \text{OR}}\, n\pi/(2s)}$ produces
$$e^{-iH_{\text{PARITY} \circ \text{OR}}\, n\pi/(2s)} |0\rangle_s |u\rangle_c |u\rangle_o |0\rangle_{\text{output}} = |n\rangle_s |u\rangle_c |u\rangle_o \left| \bigoplus_j \text{OR}(x_{j,0} x_{j,1} \ldots x_{j,m-1}) \right\rangle_{\text{output}}. \tag{6.32}$$
Thus measuring the output register returns
$$\text{PARITY} \circ \text{OR}(x) = \bigoplus_{j=0}^{n-1} \text{OR}(x_{j,0} x_{j,1} \ldots x_{j,m-1}), \tag{6.33}$$
after time-evolution by $t = n\pi/(2s)$. Note that $H_{\text{PARITY} \circ \text{OR}}$ has sparsity $d = 2sm$, max-norm $\Theta(1)$, and 1-norm $\Theta(s)$. It is well-known that the constant-error quantum query complexity of PARITY ∘ OR [107] is the product of the query complexity of PARITY with that of OR. As at least $\Omega(\sqrt{m})$ queries are required to compute the OR of $m$ bits, PARITY ∘ OR$(x)$ requires at least $\Omega(n\sqrt{m})$ queries. Thus any algorithm for simulating time-evolution by $H_{\text{PARITY} \circ \text{OR}}$ requires at least $\Omega(n\sqrt{m}) = \Omega(t\sqrt{ds})$ queries.
6.5 Universality of the Standard-Form
We now establish an equivalence between simulation and measurement that justifies our focus on directly manipulating the standard-form encoding of structured Hamiltonians. This equivalence, proven using Thm. 6.11, allows us to inter-convert quantum circuits that implement time-evolution $e^{-iH}$ for $\|H\| = O(1)$ and quantum circuits that implement measurement $H$, with a query complexity that is logarithmic in error and a constant overhead in space. An application of this result to Hamiltonian simulation is Cor. 6.12 for Hamiltonians that are a sum of Hermitian terms, given access only to their exponentials.
An intuitive picture of when simulation is possible emerges by interpreting the standard-form matrix encoding Def. 6.2 as a quantum circuit that implements a measurement. To see this explicitly, consider a Hermitian matrix encoded in standard-form-$(\hat{H}, \alpha, U, d)$. Thus for any arbitrary input state $|\psi\rangle_s$, the standard-form applies
$$U|G\rangle_a|\psi\rangle_s = |G\rangle_a\frac{\hat{H}}{\alpha}|\psi\rangle_s + |\Phi\rangle_{as}, \qquad \big|\langle\Phi|_{as}\big(|G\rangle_a\otimes I_s\big)\big| = 0. \tag{6.34}$$
Note that in this section, we find it helpful to leave $|G\rangle$ explicit, similar to Sec. 5.3.1. So upon measurement outcome $|G\rangle$ on the ancilla, which occurs with best-case probability $\max_{|\psi\rangle}\|(\hat{H}/\alpha)|\psi\rangle\|^2 = (\|\hat{H}\|/\alpha)^2$, the measurement operator $\hat{H}/\alpha$ is implemented on the system. As all measurement outcomes orthogonal to $|G\rangle$ do not concern us, we represent their output with some orthogonal unnormalized quantum state $|\Phi\rangle_{as}$. Combined with the Hamiltonian simulation by qubitization results of Thm. 6.3, one concludes that whenever one has access to a quantum circuit that implements a generalized measurement with measurement operator $\hat{H}/\alpha$ corresponding to one of the measurement outcomes, time-evolution using $O\big(t\alpha + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\big)$ queries is possible.
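The converse of approximating measurements given $e^{-i\hat{H}t}$ is a standard application of quantum phase estimation. The proof sketch is (1) assume $t$ is chosen such that $\|\hat{H}t\| \le c < 1$ for some absolute constant $c$ and define $\hat{H}' = \hat{H}t$. (2) Perform quantum phase estimation using $O(1/\epsilon)$ queries to controlled $e^{-i\hat{H}'}$ to encode the eigenphases $\lambda$ of its eigenstates $\hat{H}'|\lambda\rangle = \lambda|\lambda\rangle$ to precision $\epsilon$ in binary format $\tilde{\lambda}$ in an $m$-qubit ancilla register $\mathcal{H}_b$, where $m = O(\log(1/\epsilon))$. (3) Perform a controlled rotation on the single-qubit ancilla $|0\rangle_a$ to reduce the amplitude of $|\lambda\rangle$ by factor $\tilde{\lambda}$. (4) Uncompute the binary register by running quantum phase estimation in reverse. This implements the sequence
$$|0\rangle_b|0\rangle_a|\lambda\rangle_s \to |\tilde{\lambda}\rangle_b|0\rangle_a|\lambda\rangle_s \to |\tilde{\lambda}\rangle_b\Big(\tilde{\lambda}|0\rangle_a + \sqrt{1-|\tilde{\lambda}|^2}\,|1\rangle_a\Big)|\lambda\rangle_s \to |0\rangle_b\Big(\tilde{\lambda}|0\rangle_a + \sqrt{1-|\tilde{\lambda}|^2}\,|1\rangle_a\Big)|\lambda\rangle_s. \tag{6.35}$$
Thus projecting onto the state $|0\rangle_b|0\rangle_a$ implements the measurement operator $\hat{H}'$ with error $\max_\lambda|\tilde{\lambda}-\lambda| = O(\epsilon)$, and best-case success probability $\|\hat{H}'\|^2$.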
As Eq. 6.35 is a standard-form encoding of $\hat{H}'$ with the signal unitary defined by steps (2-4), this establishes one direction in the equivalence between measurement and simulation up to polynomial error and logarithmic space. Ignoring these factors, our study of Hamiltonian simulation reduces to that of generalized measurements except in one edge case: this equivalence does not hold with respect to $t$ when $e^{-i\hat{H}t}$ can be simulated with $o(t)$ queries. However, this case is less interesting as no-fast-forwarding theorems [25] show that $\Omega(t)$ queries are necessary for Hamiltonians that solve generic problems.
We strengthen this equivalence in the opposite direction with Thm. 6.11 for approximating measurement operators $\hat{H}'$ using $O(\log(1/\epsilon))$ queries to $e^{-i\hat{H}'}$ and $O(1)$ ancilla qubits. The idea is to use quantum signal processing techniques to approximate two operator transformations: $\hat{H}_1 = \frac{i}{2}\big(e^{-i\hat{H}'} - e^{i\hat{H}'}\big) = \sin(\hat{H}')$ and $\hat{H}_2 = \sin^{-1}(\hat{H}_1)$. Thus $\sin^{-1}\big(\frac{i}{2}(e^{-i\hat{H}'} - e^{i\hat{H}'})\big) = \hat{H}'$. All that remains is finding a degree $n = O(\log(1/\epsilon))$ polynomial approximation to $\sin^{-1}(x)$ with uniform error $\epsilon$. However, this seems impossible: $\sin^{-1}(x)$ is not analytic at $x = \pm 1$, thus its uniform polynomial approximation has degree $n = O(\mathrm{poly}(1/\epsilon))$. Fortunately, this can be overcome due to the restricted domain $\|\hat{H}'\| \le c$.
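The benefit of the restricted domain can be illustrated numerically. The sketch below does not use the Saff and Totik construction invoked in the proof; it swaps in the plain Taylor polynomial of $\sin^{-1}(x)$, which is an assumption made purely for illustration and does not enforce the bound on $[-1,1]$. It shows only that on $[-1/2,1/2]$ the degree needed for uniform error $\epsilon$ grows roughly like $\log(1/\epsilon)$. All function names are hypothetical.

```python
import math

def arcsin_taylor(x, degree):
    """Odd Taylor polynomial of arcsin(x), truncated at the given odd degree."""
    total = 0.0
    for j in range((degree - 1) // 2 + 1):
        coeff = math.comb(2 * j, j) / (4 ** j * (2 * j + 1))
        total += coeff * x ** (2 * j + 1)
    return total

def max_error_on_half_interval(degree, samples=201):
    """Largest error over an evenly spaced grid on [-1/2, 1/2]."""
    xs = [-0.5 + i / (samples - 1) for i in range(samples)]
    return max(abs(arcsin_taylor(x, degree) - math.asin(x)) for x in xs)

def smallest_degree(eps):
    """Smallest odd degree whose grid error on [-1/2, 1/2] is at most eps."""
    degree = 1
    while max_error_on_half_interval(degree) > eps:
        degree += 2
    return degree

for eps in (1e-3, 1e-6, 1e-12):
    print(eps, smallest_degree(eps))
```

Squaring the target accuracy (from $10^{-6}$ to $10^{-12}$) should roughly double the printed degree, which is the logarithmic scaling claimed for the restricted domain.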
Lemma 6.17 (Polynomial approximation to $\sin^{-1}(x)$). $\forall\,\epsilon\in(0, O(1)]$, there exists an odd polynomial $p_{\arcsin,n}$ of degree $n = O(\log(1/\epsilon))$ such that
$$\max_{x\in[-1/2,1/2]}\big|p_{\arcsin,n}(x) - \sin^{-1}(x)\big| \le \epsilon, \quad\text{and}\quad \max_{x\in[-1,1]}\big|p_{\arcsin,n}(x)\big| \le 1. \tag{6.36}$$
Proof. We restate Thm. 3 of [112] by Saff and Totik: Let $\beta$ be any number satisfying $\beta > 1$ and let $f\in C^k[-1,1]$ be a piecewise analytic function on $m > 0$ closed intervals $[-1,1] = \bigcup_{j=0}^{m-1}[x_j, x_{j+1}]$, $-1 = x_0 < x_1 < \ldots < x_{m-1} < x_m = 1$, where the restriction of $f$ to any of the closed intervals $[x_j, x_{j+1}]$ is analytic, and $f$ is not analytic at each point $x_1,\ldots,x_{m-1}$. Then there exist constants $g, G > 0$ that depend only on $f$, and degree $n > 0$ polynomials $p_n$ such that for every $x\in[-1,1]$, $|p_n(x) - f(x)| \le G e^{-g n d^\beta(x)}$, where $d(x) = \min_{0<j<m}|x - x_j|$. Let us now apply this theorem. Define the function
$$f_{\arcsin}(x) = \begin{cases}\sin^{-1}(x), & x\in[-3/4, 3/4],\\ \mathrm{sgn}(x)\sin^{-1}(3/4), & \text{otherwise},\end{cases} \tag{6.37}$$
where $\mathrm{sgn}(x) = x/|x|$. $f_{\arcsin}(x)$ is continuous but not differentiable at $x = \pm 3/4$. Thus $f\in C^0[-1,1]$, $d(x)\ge 1/4$ for all $x\in[-1/2,1/2]$, and there exist absolute constants $G', g' > 0$ and polynomials $p_n$ such that $\max_{x\in[-1/2,1/2]}|p_n(x) - f_{\arcsin}(x)| \le G'e^{-g'n/4} = \epsilon$. Hence $n = O(\log(1/\epsilon))$. Since $e^{-g'n d^\beta} \le 1$ and $|\sin^{-1}(3/4)| < 0.85$, there exists a constant $n_0 > 0$ such that for all $n > n_0$, $\max_{x\in[-1,1]}|f_{\arcsin}(x) - p_n(x)| < 0.15$, thus $|p_n(x)| < 1$. If $p_n(x)$ is not odd, replace it with its antisymmetric component $p_n(x) \leftarrow \frac{p_n(x) - p_n(-x)}{2}$, which is odd with at worst the same error. Now let $p_{\arcsin,n} = p_n$. □
We now apply this polynomial approximation of $\sin^{-1}(x)$ to the proof of Thm. 6.11.
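Proof of Thm. 6.11. The transformation from time evolution $e^{-i\hat{H}t}$ to measurement $\hat{H}t$ takes three steps. First, encode the Hermitian operator $\hat{H}_1 = \sin(\hat{H}t)$ in standard-form. This can be done with one query to the controlled time-evolution operator $\hat{U}_0 = |0\rangle\langle 0|\otimes\hat{I} + |1\rangle\langle 1|\otimes e^{-i\hat{H}t}$ and its inverse $\hat{U}_0^\dagger$.
$$\hat{U}_1 = \hat{U}_0^\dagger(\hat{\sigma}_x\otimes\hat{I})\hat{U}_0 = |1\rangle\langle 0|\otimes e^{i\hat{H}t} + |0\rangle\langle 1|\otimes e^{-i\hat{H}t}, \quad |G\rangle = e^{i\hat{\sigma}_x\pi/4}|0\rangle, \quad \hat{H}_1 = (\langle G|\otimes\hat{I})\hat{U}_1(|G\rangle\otimes\hat{I}) = \sin(\hat{H}t). \tag{6.38}$$
Second, approximate $\hat{H}_2 = \sin^{-1}(\hat{H}_1)$ using quantum signal processing. As the polynomial $p_{\arcsin,N}(x)$ of Lem. 6.17 satisfies the conditions of Thm. 5.9, the operator transformation $\hat{H}_{\mathrm{lin}} = p_{\arcsin,N}[\hat{H}_1]$ can be implemented exactly with $O(N)$ queries to $\hat{U}_0$. This encodes $\hat{H}_{\mathrm{lin}}$ in standard-form with normalization 1. Now choose $t$ such that $\|\hat{H}t\| \le c = 1/2$. Then $\|\sin(\hat{H}t)\| \le \|\hat{H}t\| \le 1/2$ as $|\sin(x)| \le |x|$. Third, evaluate the approximation error using Lem. 6.17: $\|\hat{H}_{\mathrm{lin}} - \hat{H}t\| \le \max_{x\in[-1/2,1/2]}|p_{\arcsin,N}(x) - \sin^{-1}(x)| \le \epsilon$, for $N = O(\log(1/\epsilon))$. □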
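The first step of the proof can be checked on a single eigenvalue. The sketch below assumes the signal state is $|G\rangle = e^{i\sigma_x\pi/4}|0\rangle = (|0\rangle + i|1\rangle)/\sqrt{2}$ and that, restricted to an eigenstate of $\hat{H}t$ with eigenvalue $\lambda$, the encoding unitary acts on the ancilla as $|1\rangle\langle 0|e^{i\lambda} + |0\rangle\langle 1|e^{-i\lambda}$. The function name is hypothetical.

```python
import cmath
import math

def encoded_value(eigenphase):
    """<G| U_1 |G> on an eigenstate of Ht with eigenvalue `eigenphase`."""
    # Ancilla block of U_1 = |1><0| e^{+i lambda} + |0><1| e^{-i lambda}
    # in the basis {|0>, |1>}; rows index the output state.
    u1 = [[0, cmath.exp(-1j * eigenphase)],
          [cmath.exp(1j * eigenphase), 0]]
    # |G> = exp(i sigma_x pi/4)|0> = (|0> + i|1>)/sqrt(2).
    g = [1 / math.sqrt(2), 1j / math.sqrt(2)]
    u1_g = [sum(u1[r][c] * g[c] for c in range(2)) for r in range(2)]
    return sum(g[r].conjugate() * u1_g[r] for r in range(2))

for lam in (-0.5, 0.1, 0.3):
    value = encoded_value(lam)
    # The encoded amplitude should be sin(lambda); arcsin then recovers lambda.
    print(lam, value.real, math.asin(value.real))
```

Because $|\lambda| \le 1/2$, applying $\sin^{-1}$ to the encoded amplitude returns $\lambda$ itself, which is the transformation that the polynomial of Lem. 6.17 approximates.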
Incidentally, the equivalence between simulation and measurement also provides a simulation algorithm for Hamiltonians built from a sum of $d$ Hermitian components $\hat{H} = \sum_{j=1}^{d}\alpha_j\hat{H}_j$, where one only has access to these components through an oracle for their controlled exponentials $e^{-i\hat{H}_jt_j}$, for any $t_j\in\mathbb{R}$. Though results with similar scaling can be obtained through the techniques of compressed fractional queries [13], this approach has two main advantages. First, the queries $\hat{H}_j$ are not restricted to only have eigenvalues $\pm 1$. Second, it is significantly simpler both in concept and in implementation.
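Proof of Cor. 6.12. From Thm. 6.11, $O(\log(1/\epsilon_1))$ queries to $U$ suffice to encode $\hat{H}_{\mathrm{controlled}} = \sum_{j=1}^{d}|j\rangle\langle j|_a\otimes\hat{H}'_j = (\langle G'|_b\otimes I_{as})U'(|G'\rangle_b\otimes I_{as})$ in standard-form with some state $|G'\rangle_b$ and signal oracle $U'$, where $\max_j\|\hat{H}'_j - \hat{H}_j\| \le \epsilon_1$. Thus $(\langle G|_a\langle G'|_b\otimes I_s)U'(|G\rangle_a|G'\rangle_b\otimes I_s) = \hat{H}_{\mathrm{approx}}/\alpha$ encodes $\hat{H}_{\mathrm{approx}}$ in standard-form and acts on the system register, where $\|\hat{H}_{\mathrm{approx}} - \hat{H}\| = \big\|\sum_{j=1}^{d}\alpha_j(\hat{H}'_j - \hat{H}_j)\big\| \le \alpha\epsilon_1$. Using the fact $\|e^{i\hat{A}} - e^{i\hat{B}}\| \le \|\hat{A} - \hat{B}\|$ [13], we have $\|e^{-i\hat{H}_{\mathrm{approx}}t} - e^{-i\hat{H}t}\| \le t\alpha\epsilon_1$. By applying Thm. 6.3, $e^{-i\hat{H}_{\mathrm{approx}}t}$ can be approximated with error $\epsilon_2$ using $O\big(t\alpha + \frac{\log(1/\epsilon_2)}{\log\log(1/\epsilon_2)}\big)O(\log(1/\epsilon_1))$ queries to $U$. By the triangle inequality, this approximates $e^{-i\hat{H}t}$ with error $\le t\alpha\epsilon_1 + \epsilon_2$. Thus choose $\epsilon_1 = \frac{\epsilon}{2t\alpha}$ and $\epsilon_2 = \epsilon/2$. □

6.6 Construction of Polynomials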
In this section, we constructively prove the existence of the polynomials used in proving
spectral multiplication, spectral amplification of low-energy subspaces, and amplitude multiplication.
6.6.1 Polynomial Approximations to a Truncated Linear Function
The proofs of Thm. 6.4 and Thm. 6.10 require a polynomial approximation $p_{\mathrm{lin},\Gamma,n}$ to the truncated linear function
$$f_{\mathrm{lin},\Gamma}(x) = \begin{cases}\frac{x}{2\Gamma}, & |x|\in[0,\Gamma],\\ \in[-1,1], & |x|\in(\Gamma,1].\end{cases} \tag{6.39}$$
The remainder of this section is dedicated to constructively proving the existence of $p_{\mathrm{lin},\Gamma,n}$ with the following properties:
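Theorem 6.18 (Polynomial for linear amplitude amplification). $\forall\,\Gamma\in[0,1/2]$, $\epsilon\in(0, O(\Gamma)]$, there exists an odd polynomial $p_{\mathrm{lin},\Gamma,n}$ of degree $n = O(\Gamma^{-1}\log(1/\epsilon))$ such that
$$\forall\,x\in[-\Gamma,\Gamma],\ \Big|p_{\mathrm{lin},\Gamma,n}(x) - \frac{x}{2\Gamma}\Big| \le \frac{\epsilon|x|}{2\Gamma} \quad\text{and}\quad \max_{x\in[-1,1]}|p_{\mathrm{lin},\Gamma,n}(x)| \le 1. \tag{6.40}$$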
As close-to-optimal uniform polynomial approximations may be obtained by the Chebyshev truncation of entire functions, our strategy is to find an entire function $f_{\mathrm{lin},\Gamma,\epsilon}$ that approximates $f_{\mathrm{lin},\Gamma}$ over the domain $x\in[-\Gamma,\Gamma]$ with error $\epsilon$. We construct $f_{\mathrm{lin},\Gamma,\epsilon}(x)$ in three steps. First, approximate the sign function $\mathrm{sgn}(x)$ with an error function, which is entire. Second, approximate the rectangular function $\mathrm{rect}(x)$ with a sum of two error functions $\frac{1}{2}(\mathrm{erf}(k(x+\delta)) + \mathrm{erf}(k(-x+\delta)))$. Third, multiply this by $\frac{x}{2\Gamma}$ to approximate $f_{\mathrm{lin},\Gamma}(x)$ with some error $\epsilon$. The approximation error of this sequence is described by Lems. 6.19, 6.20, 6.21:
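Lemma 6.19 (Entire approximation to the sign function $\mathrm{sgn}(x)$). $\forall\,\kappa > 0$, $x\in\mathbb{R}$, $\epsilon\in(0,\sqrt{2/(e\pi)}]$, let $k = \frac{\sqrt{2}}{\kappa}\log^{1/2}\big(\frac{2}{\pi\epsilon^2}\big)$. Then the function $f_{\mathrm{sgn},\kappa,\epsilon}(x) = \mathrm{erf}(kx)$ satisfies
$$1 \ge |f_{\mathrm{sgn},\kappa,\epsilon}(x)|, \qquad \epsilon \ge \max_{|x|\ge\kappa/2}|f_{\mathrm{sgn},\kappa,\epsilon}(x) - \mathrm{sgn}(x)|, \qquad \mathrm{sgn}(x) = \begin{cases}1, & x > 0,\\ -1, & x < 0,\\ 0, & x = 0.\end{cases} \tag{6.41}$$
Proof. We apply elementary upper bounds on the complementary error function $\mathrm{erfc}(x) = 1 - \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_x^\infty e^{-y^2}dy \le \frac{2}{\sqrt{\pi}}\int_x^\infty\frac{y}{x}e^{-y^2}dy = \frac{e^{-x^2}}{\sqrt{\pi}x}$ for any $x > 0$. Thus $\max_{x\ge\kappa/2}|\mathrm{erf}(kx) - 1| \le \epsilon$ where $\frac{2}{\sqrt{\pi}k\kappa}e^{-(k\kappa)^2/4} = \epsilon$, and similarly for $x \le -\kappa/2$. This is solved by $k = \frac{\sqrt{2}}{\kappa}W^{1/2}\big(\frac{2}{\pi\epsilon^2}\big)$, where $W(x)$ is the Lambert-W function. From the upper bound $\log x - \log\log x \le W(x) \le \log x - \frac{1}{2}\log\log x$ for $x \ge e$ [57], any choice of $k \ge \frac{\sqrt{2}}{\kappa}\log^{1/2}\big(\frac{2}{\pi\epsilon^2}\big)$ with $\frac{2}{\pi\epsilon^2} \ge e$ ensures that $\mathrm{erf}(kx)$ is $\epsilon$-close to $\pm 1$ over $|x| \ge \kappa/2$. □

Lemma 6.20 (Entire approximation to the rect function). $\forall\,\kappa > 0$, $w > 0$, $x\in\mathbb{R}$, $\epsilon\in(0,\sqrt{2/(e\pi)}]$, let $k = \frac{\sqrt{2}}{\kappa}\log^{1/2}\big(\frac{2}{\pi\epsilon^2}\big)$, $\delta = (w+\kappa)/2$. Then the function $f_{\mathrm{rect},w,\kappa,\epsilon}(x) = \frac{1}{2}(\mathrm{erf}(k(x+\delta)) + \mathrm{erf}(k(-x+\delta)))$ satisfies
$$1 \ge |f_{\mathrm{rect},w,\kappa,\epsilon}(x)|, \qquad \epsilon \ge \max_{|x|\in[0,w/2]\cup[w/2+\kappa,\infty)}|f_{\mathrm{rect},w,\kappa,\epsilon}(x) - \mathrm{rect}(x/w)|, \qquad \mathrm{rect}(x) = \begin{cases}1, & |x| < 1/2,\\ 0, & |x| > 1/2,\\ 1/2, & |x| = 1/2.\end{cases} \tag{6.42}$$
Proof. This follows from the definition of the rect function $\mathrm{rect}(x/w) = \frac{1}{2}(\mathrm{sgn}(x+w/2) + \mathrm{sgn}(-x+w/2))$. Thus we choose $\delta = (w+\kappa)/2$ and apply the error estimates of Lem. 6.19. □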
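The two entire approximations can be sanity-checked numerically. The sketch below assumes the steepness $k = \frac{\sqrt{2}}{\kappa}\log^{1/2}\big(\frac{2}{\pi\epsilon^2}\big)$ and shift $\delta = (w+\kappa)/2$, and uses the standard-library `math.erf`. The function names and the sample values of $\kappa$, $w$ and $\epsilon$ are illustrative only.

```python
import math

def steepness(kappa, eps):
    """k = (sqrt(2)/kappa) * log^{1/2}(2 / (pi eps^2)), assumed as in Lem. 6.19."""
    return math.sqrt(2) / kappa * math.sqrt(math.log(2 / (math.pi * eps ** 2)))

def f_sgn(x, kappa, eps):
    """Entire approximation erf(kx) to sgn(x), accurate for |x| >= kappa/2."""
    return math.erf(steepness(kappa, eps) * x)

def f_rect(x, w, kappa, eps):
    """Sum of two shifted error functions approximating rect(x/w)."""
    k = steepness(kappa, eps)
    delta = (w + kappa) / 2
    return 0.5 * (math.erf(k * (x + delta)) + math.erf(k * (-x + delta)))

kappa, w, eps = 0.2, 0.6, 1e-3
grid = [i / 100 for i in range(-100, 101)]

# Worst-case error of the sign approximation outside the transition region.
sgn_error = max(abs(f_sgn(x, kappa, eps) - math.copysign(1.0, x))
                for x in grid if abs(x) >= kappa / 2)
# Worst-case error of the rect approximation on the plateau and in the tails.
plateau_error = max(abs(f_rect(x, w, kappa, eps) - 1.0)
                    for x in grid if abs(x) <= w / 2)
tail_error = max(abs(f_rect(x, w, kappa, eps))
                 for x in grid if abs(x) >= w / 2 + kappa)
print(sgn_error, plateau_error, tail_error)
```

All three printed errors should fall below the requested $\epsilon$; the transition regions of width $\kappa$ are deliberately excluded, as in the lemmas.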
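Lemma 6.21 (Entire approximation to the truncated linear function). $\forall\,\Gamma > 0$, $x\in\mathbb{R}$, $\epsilon\in(0,\sqrt{2/(e\pi)}]$, the function $f_{\mathrm{lin},\Gamma,\epsilon}(x) = \frac{x}{2\Gamma}f_{\mathrm{rect},2\Gamma,2\Gamma,\epsilon}(x)$ satisfies
$$|f_{\mathrm{lin},\Gamma,\epsilon}(x)| \le 1, \qquad \Big|f_{\mathrm{lin},\Gamma,\epsilon}(x) - \frac{x}{2\Gamma}\Big| \le \frac{\epsilon|x|}{2\Gamma}\ \text{ for } |x|\in[0,\Gamma]. \tag{6.43}$$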
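Proof. Consider the domain $|x|\in[0,\Gamma]$. There, Lem. 6.20 gives the approximation error $|f_{\mathrm{rect},2\Gamma,2\Gamma,\epsilon}(x) - 1| \le \epsilon$. Multiplying both sides by $\frac{x}{2\Gamma}$ gives the stated result. Now consider the domain $|x|\in[0,2\Gamma]$. There, $|f_{\mathrm{rect},2\Gamma,2\Gamma,\epsilon}(x)| \le 1$ and $\big|\frac{x}{2\Gamma}\big| \le 1$. Thus the product is bounded by 1. Now consider the domain $x > 2\Gamma$. Let us maximize $f_{\mathrm{lin},\Gamma,\epsilon}(x)$ over $x, \epsilon$.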
Define 1/E' =
log (_)
> 1. Thus flin,r,E(x) =
(erf( x
, ) + erf( 2
, )). We make use
of the upper bounds erfc(x) = 1 - erf(x) <
_e_X2 and erfc(x) < e _X.
has the bounds 1 > erf(x2
( x+2r
term has the bounds -1
)
-
+
> erf(
21
>
-,) > -1.
1
p-
The first term
( x+2r' )2
The second
By adding these together and extremizing the upper and lower bounds separately, $f_{\mathrm{lin},\Gamma,\epsilon}(x)\in[-0.0011, 0.56]$ independent of $\Gamma$ and for all $\epsilon'\in[0,1]$. These bounds apply to $x < -2\Gamma$ with a minus sign as $f_{\mathrm{lin},\Gamma,\epsilon}(x)$ is an odd function. □
However, the required polynomial must have a non-uniform error $\big|p_{\mathrm{lin},\Gamma,n}(x) - \frac{x}{2\Gamma}\big|$ proportional to $|x|$. Though $f_{\mathrm{lin},\Gamma,\epsilon}$ of Lem. 6.21 has that property, its Chebyshev truncation results in a worst-case uniform error $\epsilon$ for all values of $x$. This is overcome by approximating $p_{\mathrm{lin},\Gamma,n}(x)$ as the product of a Chebyshev truncation of the entire approximation to $\mathrm{rect}(x)$ with $\frac{x}{2\Gamma}$.
We now evaluate the scaling of the degree of the Chebyshev truncation of $f_{\mathrm{lin},\Gamma,\epsilon}$ in Lem. 6.21 with respect to its parameters and the desired approximation error. Our starting point is the Jacobi-Anger expansion of the exponential decay function:
$$f_{\exp,\beta}(x) = e^{-\beta(x+1)} = e^{-\beta}\Big(I_0(\beta) + 2\sum_{j=1}^{\infty}I_j(\beta)T_j(-x)\Big), \tag{6.44}$$
where $I_j(\beta)$ are modified Bessel functions of the first kind. The domain of this function and all the following are assumed to be $x\in[-1,1]$. By truncating this expansion above $j > n$, we obtain a degree $n$ polynomial approximation $p_{\exp,\beta,n}(x)$ with truncation error $\epsilon_{\exp,\beta,n}$:
$$p_{\exp,\beta,n}(x) = e^{-\beta}\Big(I_0(\beta) + 2\sum_{j=1}^{n}I_j(\beta)T_j(-x)\Big), \tag{6.45}$$
$$\epsilon_{\exp,\beta,n} = \max_{x\in[-1,1]}|p_{\exp,\beta,n}(x) - f_{\exp,\beta}(x)| = 2e^{-\beta}\sum_{j=n+1}^{\infty}|I_j(\beta)|. \tag{6.46}$$
Note that the equality in the rightmost term of Eq. 6.46 arises as all the coefficients $I_j(\beta) > 0$ when $\beta > 0$. Thus $\epsilon_{\exp,\beta,n}$ is maximized when the $|T_j(-x)|$ are all simultaneously maximized, which occurs at $x = -1 \Rightarrow T_j(-x) = 1$. By solving $\epsilon_{\exp,\beta,n} = \epsilon$, one can in principle obtain the required degree $n$ as a function of $\beta, \epsilon$.
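The truncation error of Eq. 6.46 can be evaluated directly. The following is a minimal sketch using only the standard library: the modified Bessel functions are summed from their power series (adequate for the moderate $\beta$ used here, not a general-purpose implementation), and the infinite tail is cut after a fixed number of terms. All function names and the values $\beta = 4$, $n = 8$ are illustrative assumptions.

```python
import math

def bessel_i(order, beta, terms=40):
    """Modified Bessel function I_order(beta) from its power series."""
    return sum((beta / 2) ** (2 * m + order)
               / (math.factorial(m) * math.factorial(m + order))
               for m in range(terms))

def chebyshev_t(j, x):
    """Chebyshev polynomial T_j(x) for x in [-1, 1]."""
    return math.cos(j * math.acos(x))

def p_exp(x, beta, n):
    """Degree-n truncation of the Jacobi-Anger expansion of exp(-beta (x + 1))."""
    series = bessel_i(0, beta) + 2 * sum(
        bessel_i(j, beta) * chebyshev_t(j, -x) for j in range(1, n + 1))
    return math.exp(-beta) * series

def tail_sum(beta, n, extra=40):
    """2 e^{-beta} sum_{j>n} I_j(beta), with the tail cut after `extra` terms."""
    return 2 * math.exp(-beta) * sum(
        bessel_i(j, beta) for j in range(n + 1, n + 1 + extra))

beta, n = 4.0, 8
grid = [i / 100 for i in range(-100, 101)]
errors = [abs(p_exp(x, beta, n) - math.exp(-beta * (x + 1))) for x in grid]
print(max(errors), errors[0], tail_sum(beta, n))
```

The largest grid error should occur at $x = -1$ (the first grid point) and agree with the Bessel tail sum, which is the content of the equality in Eq. 6.46.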
Error estimates for various degree $n$ polynomial approximations to the exponential decay function can be found in the literature. However, these approximations are constructed using other methods. For instance, a Taylor expansion leads to scaling linear in $\beta$, and none explicitly bound the sum $\epsilon_{\exp,\beta,n}$. Fortunately, one particular error estimate in prior art is good enough and can be shown, with a little work, to implicitly bound $\epsilon_{\exp,\beta,n}$. We first sketch the proof of this estimate, then later show how it bounds $\epsilon_{\exp,\beta,n}$.
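Lemma 6.22 (Polynomial approximation to exponential decay $e^{-\beta(x+1)}$ adapted from [111]). $\forall\,\beta > 0$, $\epsilon\in(0,1/2]$, there exists a polynomial $p_n$ of degree
$$n = \Big\lceil\sqrt{2\lceil\max[\beta e^2, \log(2/\epsilon)]\rceil\log(4/\epsilon)}\Big\rceil \tag{6.47}$$
such that
$$\max_{x\in[-1,1]}|p_n(x) - e^{-\beta(x+1)}| \le \epsilon. \tag{6.48}$$
Proof. Consider the Chebyshev expansion of the monomial
$$x^s = 2^{1-s}\sideset{}{'}\sum_{j=0,\ s-j\ \mathrm{even}}^{s}\binom{s}{(s-j)/2}T_j(x) = \mathbb{E}[T_{D_s}(x)], \tag{6.49}$$
where $s \ge 0$ is an integer and $\sum'$ means the $j = 0$ term is halved. The representation as an expectation over the random variable $D_s = \sum_{j=1}^{s}Y_j$, where $Y_j = \pm 1$ with equal probabilities, follows from the identity $xT_j(x) = \frac{1}{2}(T_{j-1}(x) + T_{j+1}(x))$. They show that the Chebyshev truncation of the monomial has error
$$p_{\mathrm{mon},s,n}(x) = 2^{1-s}\sideset{}{'}\sum_{j=0,\ s-j\ \mathrm{even}}^{\min(s,n)}\binom{s}{(s-j)/2}T_j(x), \qquad \epsilon_{\mathrm{mon},s,n} = \max_{x\in[-1,1]}|p_{\mathrm{mon},s,n}(x) - x^s| \le 2^{1-s}\sum_{j=n+1,\ s-j\ \mathrm{even}}^{s}\binom{s}{(s-j)/2} \le 2e^{-n^2/(2s)}, \tag{6.50}$$
which follows from the triangle inequality with $|T_j(x)| \le 1$ and the Chernoff bound $P(|D_s| > n) \le 2e^{-n^2/(2s)}$. By replacing each monomial up to degree $t$ in the Taylor expansion of $e^{-\beta(x+1)} = e^{-\beta}\sum_{j=0}^{\infty}\frac{(-\beta)^j}{j!}x^j$ with $p_{\mathrm{mon},j,n}$, they obtain the degree $n$ polynomial $p_n(x) = e^{-\beta}\sum_{j=0}^{t}\frac{(-\beta)^j}{j!}p_{\mathrm{mon},j,n}(x)$. They show the error $\epsilon_{\mathrm{sach},\beta,n}$ of this approximation is split into two terms:
$$\max_{x\in[-1,1]}|p_n(x) - e^{-\beta(x+1)}| \le \epsilon_1 + \epsilon_2, \qquad \epsilon_1 = e^{-\beta}\sum_{j=n+1}^{t}\frac{\beta^j}{j!}|p_{\mathrm{mon},j,n} - x^j| \le 2e^{-n^2/(2t)}, \qquad \epsilon_2 = e^{-\beta}\sum_{j=t+1}^{\infty}\frac{\beta^j}{j!}|x^j| \le e^{-\beta-t}. \tag{6.51}$$
By choosing $n = \lceil\sqrt{2t\log(4/\epsilon)}\rceil$ and $t = \lceil\max\{\beta e^2, \log(4/\epsilon)\}\rceil$, $\epsilon_1 + \epsilon_2 \le \epsilon$. □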
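The Chernoff step in Eq. 6.50 can be checked by exact enumeration. The sketch below computes the weight of the discarded Chebyshev coefficients of $x^s$, which equals $P(|D_s| > n)$ for a sum $D_s$ of $s$ independent $\pm 1$ variables, and compares it with $2e^{-n^2/(2s)}$. The function names are hypothetical.

```python
import math

def monomial_truncation_error(s, n):
    """2^{1-s} * sum of C(s, (s-j)/2) over n < j <= s with s - j even."""
    discarded = sum(math.comb(s, (s - j) // 2)
                    for j in range(n + 1, s + 1) if (s - j) % 2 == 0)
    return 2 ** (1 - s) * discarded

def chernoff_bound(s, n):
    """Tail bound 2 exp(-n^2 / (2 s)) on P(|D_s| > n)."""
    return 2 * math.exp(-n ** 2 / (2 * s))

for s, n in ((20, 10), (100, 30), (400, 60)):
    print(s, n, monomial_truncation_error(s, n), chernoff_bound(s, n))
```

The bound is loose by a modest factor but captures the key behaviour: a monomial of degree $s$ is well approximated by a polynomial of degree only about $\sqrt{s\log(1/\epsilon)}$.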
We now demonstrate how this upper bounds $\epsilon_{\exp,\beta,n}$.
Lemma 6.23 (Chebyshev truncation error of exponential decay $e^{-\beta(x+1)}$). $\forall\,\beta > 0$, $\epsilon\in(0,1/2]$, the choice $n = \big\lceil\sqrt{2\lceil\max[\beta e^2, \log(2/\epsilon)]\rceil\log(4/\epsilon)}\big\rceil = O\big(\sqrt{(\beta + \log(1/\epsilon))\log(1/\epsilon)}\big)$ guarantees that $\epsilon_{\exp,\beta,n} \le \epsilon$.
Proof. This result follows essentially from how truncating the Jacobi-Anger expansion in Eq. 6.44 discards fewer coefficients, all of which are positive, than the procedure of Lem. 6.22. Hence the maximum truncation error occurs at $x = 1$ and is monotonically increasing with the number of coefficients omitted in the truncation. Observe that the first inequality in Eq. 6.50 is actually an equality $\epsilon_{\mathrm{mon},s,n} = 2^{1-s}\sum_{j=n+1,\ s-j\ \mathrm{even}}^{s}\binom{s}{(s-j)/2}$. This follows from the same logic as Eq. 6.46: all coefficients are positive, thus the maximum error occurs at $x = 1$, which simultaneously maximizes all $T_j(x = 1) = 1$. Similarly, the first inequality in Eq. 6.51 is also actually an equality. Let us express the truncation error $\epsilon_{\mathrm{sach},\beta,n}$ as a Chebyshev expansion in full
$$\epsilon_{\mathrm{sach},\beta,n} = 2e^{-\beta}\max_{x\in[-1,1]}\Bigg|\sum_{j=n+1}^{t}\frac{(\beta/2)^j}{j!}\sum_{k=n+1,\ j-k\ \mathrm{even}}^{j}\binom{j}{(j-k)/2}T_k(x) + \sum_{j=t+1}^{\infty}\frac{(\beta/2)^j}{j!}\sideset{}{'}\sum_{k=0,\ j-k\ \mathrm{even}}^{j}\binom{j}{(j-k)/2}T_k(x)\Bigg|. \tag{6.52}$$
Note that we have used $(-\beta)^jT_k(-x) = \beta^jT_k(x)$ as all pairs $j - k$ are even. Thus $\epsilon_{\mathrm{sach},\beta,n}$ is maximized at $T_k(x = 1) = 1$ in the sum above. This can be compared with
$$\epsilon_{\exp,\beta,n} = \max_{x\in[-1,1]}2e^{-\beta}\Bigg|\sum_{j=n+1}^{\infty}\frac{(\beta/2)^j}{j!}\sum_{k=n+1,\ j-k\ \mathrm{even}}^{j}\binom{j}{(j-k)/2}T_k(x)\Bigg| = \epsilon_{\mathrm{sach},\beta,n} - 2e^{-\beta}\sum_{j=t+1}^{\infty}\frac{(\beta/2)^j}{j!}\sideset{}{'}\sum_{k=0,\ j-k\ \mathrm{even}}^{n}\binom{j}{(j-k)/2} \le \epsilon_{\mathrm{sach},\beta,n}. \tag{6.53}$$
More intuitively, both $\epsilon_{\exp,\beta,n}$ and $\epsilon_{\mathrm{sach},\beta,n}$ sum over all coefficients $j > n$ in the Chebyshev expansion, but $\epsilon_{\mathrm{sach},\beta,n}$ in addition sums over some positive coefficients corresponding to $j \le n$. Thus the upper bound of Lem. 6.22 on $\epsilon_{\mathrm{sach},\beta,n}$ applies to $\epsilon_{\exp,\beta,n}$. □
In the following, we will bound all errors of our polynomial approximations in terms of $\epsilon_{\exp,\beta,n}$, a partial sum over Bessel functions.
Corollary 6.24 (Polynomial approximation to the Gaussian function $e^{-(\gamma x)^2}$). $\forall\,\gamma \ge 0$, $\epsilon\in(0,1/2]$, the even polynomial $p_{\mathrm{gauss},\gamma,n}$ of even degree $n = O\big(\sqrt{(\gamma^2 + \log(1/\epsilon))\log(1/\epsilon)}\big)$ satisfies
$$p_{\mathrm{gauss},\gamma,n}(x) = p_{\exp,\gamma^2/2,n/2}(2x^2 - 1) = e^{-\gamma^2/2}\Big(I_0(\gamma^2/2) + 2\sum_{j=1}^{n/2}I_j(\gamma^2/2)(-1)^jT_{2j}(x)\Big),$$
$$\epsilon_{\mathrm{gauss},\gamma,n} = \max_{x\in[-1,1]}|p_{\mathrm{gauss},\gamma,n}(x) - e^{-(\gamma x)^2}| = \epsilon_{\exp,\gamma^2/2,n/2} \le \epsilon. \tag{6.54}$$
Proof. This follows from Eq. 6.44 by a simple change of variables. Let $x' = T_2(x) = 2x^2 - 1$, which maps the domain $x\in[-1,1]$ of $e^{-(\gamma x)^2}$ to that of $f_{\exp,\beta}(x')$. As $e^{-\beta(x'+1)} = e^{-2\beta x^2}$, the choice $\beta = \gamma^2/2$ results in the definition Eq. 6.54. Using the Chebyshev semigroup property $T_j(-T_2(x)) = (-1)^jT_{2j}(x)$, $p_{\mathrm{gauss},\gamma,n}$ is an even polynomial of degree $n$ and its approximation error is obtained by substitution into Eq. 6.46. □
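A polynomial approximation to the error function follows immediately by integrating $p_{\mathrm{gauss},\gamma,n}$.
Corollary 6.25 (Polynomial approximation to the error function $\mathrm{erf}(kx)$). $\forall\,k > 0$, $\epsilon\in(0, O(1)]$, the odd polynomial $p_{\mathrm{erf},k,n}$ of odd degree $n = O\big(\sqrt{(k^2 + \log(1/\epsilon))\log(1/\epsilon)}\big)$ satisfies
$$p_{\mathrm{erf},k,n}(x) = \frac{2ke^{-k^2/2}}{\sqrt{\pi}}\Bigg(I_0(k^2/2)\,x + \sum_{j=1}^{(n-1)/2}I_j(k^2/2)(-1)^j\Big(\frac{T_{2j+1}(x)}{2j+1} - \frac{T_{2j-1}(x)}{2j-1}\Big)\Bigg),$$
$$\epsilon_{\mathrm{erf},k,n} = \max_{x\in[-1,1]}|p_{\mathrm{erf},k,n}(x) - \mathrm{erf}(kx)| \le \frac{4k}{\sqrt{\pi}n}\epsilon_{\mathrm{gauss},k,n-1} \le \epsilon. \tag{6.55}$$
Proof. From the definition of the error function $\mathrm{erf}(kx) = \frac{2}{\sqrt{\pi}}\int_0^{kx}e^{-y^2}dy = \frac{2k}{\sqrt{\pi}}\int_0^{x}e^{-(ky)^2}dy$, the polynomial $p_{\mathrm{erf},k,n+1}(x) = \frac{2k}{\sqrt{\pi}}\int_0^{x}p_{\mathrm{gauss},k,n}(y)\,dy$ follows directly from integrating Eq. 6.54 term-by-term using the identity $\int_0^{x}T_j(y)\,dy = \frac{1}{2}\Big(\frac{T_{j+1}(x)}{j+1} - \frac{T_{j-1}(x)}{j-1}\Big)$ for even $j \ge 2$. The error of the remaining terms is bounded through
$$\epsilon_{\mathrm{erf},k,n} = \frac{2ke^{-k^2/2}}{\sqrt{\pi}}\max_{x\in[-1,1]}\Bigg|\sum_{j=(n+1)/2}^{\infty}I_j(k^2/2)(-1)^j\Big(\frac{T_{2j+1}(x)}{2j+1} - \frac{T_{2j-1}(x)}{2j-1}\Big)\Bigg| \le \frac{4ke^{-k^2/2}}{\sqrt{\pi}}\sum_{j=(n+1)/2}^{\infty}\frac{|I_j(k^2/2)|}{2j-1} \le \frac{4k}{\sqrt{\pi}n}\epsilon_{\mathrm{gauss},k,n-1}. \tag{6.56}$$
The error of $\epsilon_{\mathrm{erf},k,n}$ is thus at most $\frac{4k}{\sqrt{\pi}n}\epsilon_{\exp,k^2/2,(n-1)/2}$. However, $n = \Omega(k\log^{1/2}(1/\epsilon))$. Thus $\frac{4k}{\sqrt{\pi}n} = O(\log^{-1/2}(1/\epsilon)) = O(1)$ and does not make the scaling any worse. □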
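The term-by-term integration can be tried out numerically. The sketch below builds an odd polynomial from the Jacobi-Anger expansion of the Gaussian with $\beta = k^2/2$, integrated with $\int_0^x T_{2j} = \frac{1}{2}\big(\frac{T_{2j+1}}{2j+1} - \frac{T_{2j-1}}{2j-1}\big)$, and compares it against the standard-library `math.erf`. The power-series Bessel routine, the function names, and the values $k = 2$, $n = 21$ are illustrative assumptions.

```python
import math

def bessel_i(order, z, terms=40):
    """Modified Bessel function I_order(z) from its power series."""
    return sum((z / 2) ** (2 * m + order)
               / (math.factorial(m) * math.factorial(m + order))
               for m in range(terms))

def chebyshev_t(j, x):
    """Chebyshev polynomial T_j(x) for x in [-1, 1]."""
    return math.cos(j * math.acos(x))

def p_erf(x, k, n):
    """Odd degree-n polynomial from integrating the Gaussian expansion term by term."""
    z = k * k / 2
    series = bessel_i(0, z) * x
    for j in range(1, (n - 1) // 2 + 1):
        series += bessel_i(j, z) * (-1) ** j * (
            chebyshev_t(2 * j + 1, x) / (2 * j + 1)
            - chebyshev_t(2 * j - 1, x) / (2 * j - 1))
    return 2 * k * math.exp(-z) / math.sqrt(math.pi) * series

k, n = 2.0, 21
grid = [i / 50 for i in range(-50, 51)]
max_error = max(abs(p_erf(x, k, n) - math.erf(k * x)) for x in grid)
print(max_error)
```

For these values the printed error should be far below $10^{-8}$, and raising $n$ by a few units should shrink it rapidly, consistent with the Bessel tail that controls Eq. 6.56.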
A polynomial approximation to the shifted error function follows by a change of variables.
Corollary 6.26 (Polynomial approximation to the shifted error function $\mathrm{erf}(k(x-\delta))$). $\forall\,k > 0$, $\delta\in[-1,1]$, $\epsilon\in(0, O(1)]$, the polynomial $p_{\mathrm{erf},k,\delta,n}(x) = p_{\mathrm{erf},2k,n}((x-\delta)/2)$ of odd degree $n = O\big(\sqrt{(k^2 + \log(1/\epsilon))\log(1/\epsilon)}\big)$ satisfies
$$\epsilon_{\mathrm{erf},k,\delta,n} = \max_{x\in[-1,1]}|p_{\mathrm{erf},k,\delta,n}(x) - \mathrm{erf}(k(x-\delta))| \le \epsilon_{\mathrm{erf},2k,n} \le \epsilon. \tag{6.57}$$
Proof. This follows trivially from $\mathrm{erf}(k(x-\delta)) = \mathrm{erf}\big(2k\frac{x-\delta}{2}\big)$. Note that we have doubled the degree of our polynomials in order to double the width of the domain, which we exploit to allow translations. □
This polynomial approximation of the shifted error function is the basic ingredient we use to construct the more complicated functions sgn and rect through Lems. 6.19, 6.20.
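Corollary 6.27 (Polynomial approximation to the sign function $\mathrm{sgn}(x-\delta)$). $\forall\,\kappa > 0$, $\delta\in[-1,1]$, $\epsilon\in(0, O(1)]$, the polynomial $p_{\mathrm{sgn},\kappa,\delta,n}(x) = p_{\mathrm{erf},k,\delta,n}(x)$ of odd degree $n = O\big(\frac{1}{\kappa}\log(1/\epsilon)\big)$, where $k = \frac{\sqrt{2}}{\kappa}\log^{1/2}\big(\frac{2}{\pi\epsilon_1^2}\big)$, satisfies
$$\epsilon_{\mathrm{sgn},\kappa,\delta,n} = \max_{x\in[-1,\delta-\kappa/2]\cup[\delta+\kappa/2,1]}|p_{\mathrm{erf},k,\delta,n}(x) - \mathrm{sgn}(x-\delta)| \le \epsilon_{\mathrm{erf},k,\delta,n} + \epsilon_1 \le 2\epsilon_{\mathrm{erf},k,\delta,n} \le \epsilon. \tag{6.58}$$
Proof. The equation for $k$ comes from Lem. 6.19. We then choose $\epsilon_1 = \epsilon_{\mathrm{erf},k,\delta,n}$, which defines an implicit equation for $\epsilon_1$ and doubles the error. □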
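Corollary 6.28 (Polynomial approximation to the rectangular function $\mathrm{rect}(x/w)$). $\forall\,\kappa\in(0,2]$, $w\in[0,2-\kappa]$, $\epsilon\in(0, O(1)]$, the even polynomial
$$p_{\mathrm{rect},w,\kappa,n}(x) = \frac{1}{2}\big(p_{\mathrm{sgn},\kappa,-(w+\kappa)/2,n+1}(x) + p_{\mathrm{sgn},\kappa,-(w+\kappa)/2,n+1}(-x)\big) \tag{6.59}$$
of even degree $n = O\big(\frac{1}{\kappa}\log(1/\epsilon)\big)$ satisfies
$$\epsilon_{\mathrm{rect},w,\kappa,n} = \max_{|x|\in[0,w/2]\cup[w/2+\kappa,1]}|p_{\mathrm{rect},w,\kappa,n}(x) - \mathrm{rect}(x/w)| \le \epsilon_{\mathrm{sgn},\kappa,-(w+\kappa)/2,n+1} \le \epsilon. \tag{6.60}$$
Proof. This follows from the construction of a rectangular function with two sign functions in Lem. 6.20. □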
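Corollary 6.29 (Polynomial approximation to the truncated linear function $f_{\mathrm{lin},\Gamma}(x)$). $\forall\,\Gamma\in(0,1/2]$, $\epsilon\in(0, O(\Gamma)]$, the odd polynomial $p_{\mathrm{lin},\Gamma,n}(x) = \frac{x}{2\Gamma}p_{\mathrm{rect},2\Gamma,2\Gamma,n-1}(x)$ of odd degree $n = O\big(\frac{1}{\Gamma}\log(1/\epsilon)\big)$ satisfies
$$\epsilon_{\mathrm{lin},\Gamma,n} = \max_{|x|\in[0,\Gamma]}\frac{2\Gamma}{|x|}\Big|p_{\mathrm{lin},\Gamma,n}(x) - \frac{x}{2\Gamma}\Big| \le \epsilon_{\mathrm{rect},2\Gamma,2\Gamma,n-1} \le \epsilon. \tag{6.61}$$
Proof. This follows from multiplying a rectangular function with a linear function in Lem. 6.21. One subtlety arises here: The error of $p_{\mathrm{lin},\Gamma,n}$ is bounded by $\epsilon_{\mathrm{rect},2\Gamma,2\Gamma,n-1}$ in the domain $|x|\in[3\Gamma,1]$. Thus multiplying by $\frac{x}{2\Gamma}$ increases this error to at most $\frac{\epsilon_{\mathrm{rect},2\Gamma,2\Gamma,n-1}}{2\Gamma}$. However, the quantum signal processing conditions in Thm. 3.4 require all polynomials to be bounded by 1. This implicitly constrains us to choose $n$ such that $\epsilon_{\mathrm{rect},2\Gamma,2\Gamma,n-1} \le 2\Gamma$ is also satisfied. □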
In all the above cases, the entire functions that are being approximated are bounded by 1. When the approximation error is $\epsilon$, the resulting polynomial is then bounded by $1 + \epsilon$. In such an event, we simply rescale these polynomials by a factor $\frac{1}{1+\epsilon}$. At worst, this only doubles the error of the approximation. We also emphasize that our proposed sequence of polynomial transformations serves primarily to prove their asymptotic scaling. In practice, close-to-optimal constant factors in the degree of these polynomials can be obtained by a direct Chebyshev truncation of the entire functions.
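The claim that rescaling at worst doubles the error follows from a one-line estimate. Assuming only that the target function satisfies $|f(x)| \le 1$ and the polynomial satisfies $|p(x) - f(x)| \le \epsilon$ on the domain of interest:

```latex
\begin{align}
\left|\frac{p(x)}{1+\epsilon} - f(x)\right|
  = \frac{\left|\big(p(x) - f(x)\big) - \epsilon f(x)\right|}{1+\epsilon}
  \le \frac{\epsilon + \epsilon\,|f(x)|}{1+\epsilon}
  \le \frac{2\epsilon}{1+\epsilon}
  \le 2\epsilon .
\end{align}
```

At the same time $|p(x)|/(1+\epsilon) \le 1$, so the rescaled polynomial meets the boundedness condition required by quantum signal processing.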
6.6.2 Polynomials for Low-Energy Uniform Spectral Amplification
The proof of Thm. 6.5 requires a polynomial approximation $p_{\mathrm{gap},\Delta,n}(x)$ (Lem. 6.31) to the truncated linear function
$$f_{\mathrm{gap},\Delta}(x) = \begin{cases}\frac{x+1-\Delta}{\Delta}, & x\in[-1,-1+\Delta],\\ \in[-1,1], & \text{otherwise}.\end{cases} \tag{6.62}$$
Our strategy is to construct an entire function $f_{\mathrm{gap},\Delta,\epsilon}$ that approximates $f_{\mathrm{gap},\Delta}$ with error $\epsilon$ over the domain of interest. Entire functions are desirable as they are analytic on the entire complex plane. This implies that truncating their expansion $f_{\mathrm{gap},\Delta,\epsilon}(x) = \sum_{j=0}^{\infty}a_jT_j(x)$ in the Chebyshev basis produces polynomials with a uniform approximation error that scales almost optimally with the degree $n$ [126]. We build $f_{\mathrm{gap},\Delta,\epsilon}$ by using the entire approximation to the sign function $\mathrm{sgn}(x)$ in Lem. 6.19 and some intermediate results on the error function $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^{x}e^{-y^2}dy$.
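Lemma 6.30 (Entire approximation to the gapped linear function $f_{\mathrm{gap},\Delta}(x)$). $\forall\,\Delta\in[0,1/2]$, $x\in[-1,\infty]$, $\epsilon\in\big(0,\frac{1}{\sqrt{2e\pi}}\big]$. Then the function
$$f_{\mathrm{gap},\Delta,\epsilon}(x) = \frac{x+1-\Delta}{\Delta}\cdot\frac{1 - f_{\mathrm{sgn},\Delta,2\epsilon}(x+1-3\Delta/2)}{2}$$
satisfies
$$\epsilon \ge \max_{x\in[-1,-1+\Delta]}\Big|f_{\mathrm{gap},\Delta,\epsilon}(x) - \frac{x+1-\Delta}{\Delta}\Big|, \qquad 0 \le f_{\mathrm{gap},\Delta,\epsilon}(x) \le 1\ \text{ for } x\in[-1+\Delta,\infty], \qquad \epsilon/10 \ge \max_{x\in[1-\Delta,1]}|f_{\mathrm{gap},\Delta,\epsilon}(x)|. \tag{6.63}$$
Proof. Let us derive bounds on the following regions:
$x\in[-1,-1+\Delta]$: From Lem. 6.19, $\frac{1}{2}(1 - f_{\mathrm{sgn},\Delta,2\epsilon}(x+1-3\Delta/2))$ approximates the function 1 with error $\epsilon$. By multiplying both sides with $\frac{x+1-\Delta}{\Delta}$, $\big|f_{\mathrm{gap},\Delta,\epsilon}(x) - \frac{x+1-\Delta}{\Delta}\big| \le \epsilon\big|\frac{x+1-\Delta}{\Delta}\big| \le \epsilon$.
$x\in[-1+\Delta,-1+3\Delta/2]$: From Lem. 6.19, $\frac{1}{2}(1 - f_{\mathrm{sgn},\Delta,2\epsilon}(x+1-3\Delta/2))\in[1/2,1]$. In this region, $\frac{x+1-\Delta}{\Delta}\in[0,1/2]$. Thus by multiplying, $f_{\mathrm{gap},\Delta,\epsilon}(x)\in[0,1/2]$.
$x\in[-1+3\Delta/2,1-\Delta]$: From the upper bound $\mathrm{erfc}(x) \le e^{-x^2}$, one obtains $f_{\mathrm{gap},\Delta,\epsilon}(x) \le \frac{x+1-\Delta}{\Delta}e^{-k^2(x+1-3\Delta/2)^2}$ where $k = \frac{\sqrt{2}}{\Delta}\log^{1/2}\big(\frac{1}{2\pi\epsilon^2}\big)$. The worst case occurs when $k$ is smallest, hence when $\epsilon = \frac{1}{\sqrt{2e\pi}}$ is largest. Thus the upper bound is maximized with value $\frac{1+\sqrt{5}}{4}e^{(\sqrt{5}-3)/4} < 0.7$ at $x = -1 + \frac{5+\sqrt{5}}{4}\Delta < -1 + 2\Delta$.
$x\in[1-\Delta,\infty)$: The upper bound obtained for $x\in[-1+3\Delta/2,1-\Delta]$ still applies here and is monotonically decreasing with $x$. Thus it is maximized when $\Delta = 1/2$ is largest and at $x = 1-\Delta$. With this upper bound, $f_{\mathrm{gap},1/2,\epsilon}(1/2) \le 2e^{-9k^2/16} = 32\sqrt{2}\pi^{9/2}\epsilon^9 \le \epsilon/10$, by substituting $k$ and then using the fact $\epsilon \le \frac{1}{\sqrt{2e\pi}}$.
$x\in[-1+\Delta,\infty]$: $\frac{x+1-\Delta}{\Delta}$ and $\frac{1}{2}(1 - f_{\mathrm{sgn},\Delta,2\epsilon}(x+1-3\Delta/2))$ are both positive, thus $f_{\mathrm{gap},\Delta,\epsilon}(x)$ is positive. □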
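We now construct a degree $n$ polynomial approximation to $f_{\mathrm{gap},\Delta}(x)$.
Lemma 6.31 (Polynomial approximation to the gapped linear function $f_{\mathrm{gap},\Delta}(x)$). $\forall\,\epsilon \le O(1)$, there exists an odd polynomial $p_{\mathrm{gap},\Delta,n}$ of degree $n = O(\Delta^{-1/2}\log^{3/2}(1/(\Delta\epsilon)))$ such that
$$\max_{x\in[-1,-1+\Delta]}\Big|p_{\mathrm{gap},\Delta,n}(x) - \frac{x+1-\Delta}{\Delta}\Big| \le \epsilon \quad\text{and}\quad \max_{x\in[-1,1]}|p_{\mathrm{gap},\Delta,n}(x)| \le 1. \tag{6.64}$$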
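Proof. Let us expand $f_{\mathrm{gap},\Delta,\epsilon_1}(x) = \sum_{j=0}^{\infty}a_jT_j(x)$ in the Chebyshev basis. Then the truncation error of $p_n(x) = \sum_{j=0}^{n}a_jT_j(x)$ has a well-known upper bound from Thm. 8.2 of [126]:
$$\max_{x\in[-1,1]}|p_n(x) - f_{\mathrm{gap},\Delta,\epsilon_1}(x)| \le \epsilon_2 = \frac{2M\rho^{-n}}{\rho-1}, \qquad M = \max_{z\in E_\rho}|f_{\mathrm{gap},\Delta,\epsilon_1}(z)|, \tag{6.65}$$
for any elliptical radius $\rho > 1$, where $E_\rho = \{z : z = \frac{1}{2}(\rho e^{i\theta} + \rho^{-1}e^{-i\theta}),\ \theta\in[0,2\pi)\}$ is the Bernstein ellipse. We will need an upper bound on $|\mathrm{erf}(re^{i\phi})|$ for $r > 0$, $\phi\in[0,2\pi)$: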
2 _j
V
2 2
2
= V
J'0
2
2
er
vi
cos(24)dr
Jk(z + 1 - 3A/2)|
M~
zEE
= 2r max{1, e-r 2 cos (24) }
2r max{1, eRe(-(reO) 2
2
+ p2 2 + 2 cos (26)) < p 2 . Let k =2log'/ 2
k(IzI + 1 + 3A/2) 5 k(p + 1 + 3A/2). Then
+ 1- A
=
(6.66)
Jr
0VI_
We also need the upper bounds z2
Mmaxz
er2 cos (2$)e-ir 2 sin (24)dr
e_ i dr =
er
)
erf(rei")I =
A
- erf(k(z + I - 3A/2)
2a
2
IZI+ I+ A(1 + jerf(k(z + I
max
3A/2)|)
2A
zE~p
(6.67)
z EE(
1
+
< O(poly(p, A-')) max
21k(z + 1 - 3A/2)(
Vi(1
+ eRe(-(k(z+1-3A/2)2))
)
O(poly(p, A-')) max eRe(-(k(z+1-3A/2) 2
z EEp
By taking derivatives with respect to 0, the maximum value of the exponent
1)(2 - (2 - 3A)2 p2 + 2p 4
8p 2(1+ p4
(6.68)
)
Oe [0,27r)
)
a = max Re(-(k(z + 1 - 3A/2) 2 ) =k 2(p2
)en //1og(1/Ei)
C2 = 0 (poly(A
- n
0 (A-1/2 log 3/ 2 (max1
2
)
Let us choose p = e', where a = 0(1/vk 2 A). Then a = 0(1). Substituting the value of k,
we have a = 0(\A/log (1/ci)), and M = 0 (poly(A~1)). Thus from Eq. 6.65,
,i, (6.69)
-
where the last equation applies log (poly(A-')/c) = 0(log(-;)). Thus the total approximation error is maxxE[-1-1+A]IPn(X) _ x+1--A
1 +< 2. Let Pgap,sym,A,n(X) =(Pn(X)
Pn(-x)) be the odd component of pn(x). Using the bounds of Lem. 6.30, this increases
the error in x c [-1, -1 + A] to at most 'j(cE + 62). By subtracting these bounds, we
also have maxxE[1,1] IPgap,sym,A,n(X)j
1 + To(6 + E2). Thus we rescale this to obtain
Pgap,A,nkX) =
h(
).
I
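Using $\big|\frac{1}{1+x} - 1\big| \le x$ for $x \ge 0$, this increases the error by at most a constant factor, $\max_{x\in[-1,-1+\Delta]}\big|p_{\mathrm{gap},\Delta,n}(x) - \frac{x+1-\Delta}{\Delta}\big| = O(\epsilon_1 + \epsilon_2)$, so choose $\epsilon_1 = \epsilon_2 = O(\epsilon)$. □

6.7 Conclusions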
We have combined ideas from qubitization and quantum signal processing to solve, in a general setting, the uniform spectral amplification problem of implementing a low-distortion expansion of the spectrum of Hamiltonians. The most surprising application of our results is the simulation of sparse Hamiltonians, where we obtain an algorithm with complexity linear in $t(d\Lambda_{\max}\Lambda_1)^{1/2}$, excluding logarithmic factors. This is particularly important as the best-case scaling $O(\sqrt{d})$ is essential to an optimal realization of the fundamental quantum search algorithm. However, this improvement also appears impossible as prior art claims that $\Omega(td\|\hat{H}\|_{\max})$ queries is optimal. Nevertheless, the two are actually consistent. In the situation where information on $\|\hat{H}\|_1$ is unavailable, previous results are recovered as one may simply choose the worst-case $\Lambda_1 = d\Lambda_{\max} = d\|\hat{H}\|_{\max}$. This naturally leads to the question of whether further improvement is possible. For instance, if information on $\|\hat{H}\|$ rather than $\|\hat{H}\|_1$ is made available, our lower bound is consistent with the stronger statement of $\Omega(t(d\|\hat{H}\|_{\max}\|\hat{H}\|)^{1/2})$ queries.
More generally, the universality of our results motivates related future directions. Thus far, a large number of common oracles used to describe Hamiltonians to quantum computers map to the standard-form without much difficulty. Rather than focusing on improving Hamiltonian simulation algorithms in the query model, perhaps an emphasis on improving the quality of encoding, through a reduced normalization constant, would be more insightful, easier, and also lead to greater generality. Combined with the extremely low overhead of our techniques, algorithms obtained in this manner could be practical on digital quantum computers sooner rather than later.
Bibliography
[1] Milton Abramowitz, Irene A Stegun, et al. Handbook of mathematical functions. Applied mathematics series, 55:62, 1966.
[2] Dorit Aharonov. A simple proof that Toffoli and Hadamard are quantum universal.
arXiv preprint quant-ph/0301040, 2003.
[3] Dorit Aharonov and Amnon Ta-Shma. Adiabatic quantum state generation and statistical zero knowledge. In Proceedings of the 35th Annual ACM Symposium on Theory
of Computing, STOC '03, pages 20-29, New York, NY, USA, 2003. ACM.
[4] Dorit Aharonov, Wim Van Dam, Julia Kempe, Zeph Landau, Seth Lloyd, and Oded
Regev. Adiabatic quantum computation is equivalent to standard quantum computation. SIAM Rev., 50(4):755-787, 2008.
[5] K. Arai, C. Belthangady, H. Zhang, N. Bar-Gill, S. J. DeVience, P. Cappellaro, A. Yacoby, and R. L. Walsworth. Fourier magnetic imaging with nanoscale resolution
and compressed sensing speed-up using electronic spins in diamond. Nat. Nano.,
10(10):859-864, October 2015.
[6] R. Barends, J. Kelly, A. Megrant, A. Veitia, D. Sank, E. Jeffrey, T. C. White, J. Mutus,
A. G. Fowler, B. Campbell, Y. Chen, Z. Chen, B. Chiaro, A. Dunsworth, C. Neill,
P. O'Malley, P. Roushan, A. Vainsencher, J. Wenner, A. N. Korotkov, A. N. Cleland,
and John M. Martinis. Superconducting quantum circuits at the surface code threshold
for fault tolerance. Nature, 508(7497):500-503, April 2014.
[7] R. Barends, A. Shabani, L. Lamata, J. Kelly, A. Mezzacapo, U. Las Heras, R. Babbush, A. G. Fowler, B. Campbell, Yu Chen, Z. Chen, B. Chiaro, A. Dunsworth,
E. Jeffrey, E. Lucero, A. Megrant, J. Y. Mutus, M. Neeley, C. Neill, P. J. J. O'Malley, C. Quintana, P. Roushan, D. Sank, A. Vainsencher, J. Wenner, T. C. White,
E. Solano, H. Neven, and John M. Martinis. Digitized adiabatic quantum computing
with a superconducting circuit. Nature, 534(7606):222-226, June 2016.
[8] Robert Beals, Harry Buhrman, Richard Cleve, Michele Mosca, and Ronald de Wolf.
Quantum lower bounds by polynomials. J. Assoc. Comput. Mach., 48(4):778-797,
July 2001.
[9] David Beckman, Amalavoyal N. Chari, Srikrishna Devabhaktuni, and John Preskill.
Efficient networks for quantum factoring. Phys. Rev. A, 54:1034-1063, Aug 1996.
[10] D. W. Berry, A. M. Childs, and R. Kothari. Hamiltonian simulation with nearly
optimal dependence on all parameters. In 2015 IEEE 56th Annual Symposium on
Foundations of Computer Science (FOCS), pages 792-809, Oct 2015.
[11] Dominic W. Berry, Graeme Ahokas, Richard Cleve, and Barry C. Sanders. Efficient quantum algorithms for simulating sparse Hamiltonians. Comm. Math. Phys.,
270(2):359-371, 2007.
[12] Dominic W. Berry and Andrew M. Childs. Black-box Hamiltonian simulation and
unitary implementation. Quantum Info. Comput., 12(1-2):29-62, January 2012.
[13] Dominic W. Berry, Andrew M. Childs, Richard Cleve, Robin Kothari, and Rolando D.
Somma. Exponential improvement in precision for simulating sparse Hamiltonians. In
Proceedings of the 46th Annual ACM Symposium on Theory of Computing, STOC '14,
pages 283-292, New York, NY, USA, 2014. ACM.
[14] Dominic W. Berry, Andrew M. Childs, Richard Cleve, Robin Kothari, and Rolando D.
Somma. Simulating Hamiltonian dynamics with a truncated Taylor series. Phys. Rev.
Lett., 114:090502, Mar 2015.
[15] John P. Boyd. Rootfinding for a transcendental equation without a first guess: Polynomialization of Kepler's equation through Chebyshev polynomial expansion of the
sine. Appl. Numer. Math., 57(1):12 - 18, 2007.
[16] Fernando GSL Brandao and Krysta Svore. Quantum speed-ups for semidefinite programming. arXiv preprint arXiv:1609.05537, 2016.
[17] Gilles Brassard, Peter Hoyer, and Alain Tapp. Quantum counting. In Automata, Languages and Programming, pages 820-831. Springer, 1998.
[18] Kenneth R. Brown, Aram W. Harrow, and Isaac L. Chuang. Arbitrarily accurate
composite pulse sequences. Phys. Rev. A, 70:052318, Nov 2004.
[19] A. R. Calderbank and Peter W. Shor. Good quantum error-correcting codes exist.
Phys. Rev. A, 54:1098-1105, Aug 1996.
[20] T. Caneva, M. Murphy, T. Calarco, R. Fazio, S. Montangero, V. Giovannetti, and G. E.
Santoro. Optimal control at the quantum speed limit. Phys. Rev. Lett., 103:240501,
Dec 2009.
[21] Xie Chen, Bei Zeng, Zheng-Cheng Gu, Beni Yoshida, and Isaac L Chuang. Gapped
two-body Hamiltonian whose unique ground state is universal for one-way quantum
computation. Phys. Rev. Lett., 102(22):220501, 2009.
[22] Andrew M. Childs. Universal computation by quantum walk. Phys. Rev. Lett.,
102:180501, May 2009.
[23] Andrew M. Childs. On the relationship between continuous- and discrete-time quantum
walk. Commun. Math. Phys., 294(2):581-603, 2010.
[24] Andrew M. Childs, David Gosset, and Zak Webb. Universal computation by multiparticle quantum walk. Science, 339(6121):791-794, 2013.
[25] Andrew M. Childs and Robin Kothari. Limitations on the simulation of non-sparse
Hamiltonians. Quantum Info. Comput., 10(7):669-684, July 2010.
[26] Andrew M. Childs and Robin Kothari. Theory of Quantum Computation, Communication, and Cryptography, pages 94-103. Springer Berlin Heidelberg, 2011.
[27] Andrew M Childs, Robin Kothari, and Rolando D Somma. Quantum linear systems algorithm with exponentially improved dependence on precision. arXiv preprint
arXiv:1511.02306, 2015.
[28] Andrew M. Childs and Nathan Wiebe. Hamiltonian simulation using linear combinations of unitary operations. Quantum Info. Comput., 12(11-12):901-924, November
2012.
[29] Anirban Narayan Chowdhury and Rolando D Somma. Quantum algorithms for gibbs
sampling and hitting-time estimation. Quantum Info. Comput., 17(1 & 2):0041-0064,
2017.
[30] John Clarke and Frank K Wilhelm. Superconducting quantum bits. Nature,
453(7198):1031-1042, 2008.
[31] Richard Cleve, Daniel Gottesman, Michele Mosca, Rolando D. Somma, and David
Yonge-Mallo. Efficient discrete-time simulations of continuous-time quantum query
algorithms. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC '09, pages 409-416, New York, NY, USA, 2009. ACM.
[32] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth. On the
Lambert W function. Adv. Comput. Math., 5(1):329-359, 1996.
[33] Holly K. Cummins, Gavin Llewellyn, and Jonathan A. Jones. Tackling systematic
errors in quantum logic gates with composite rotations. Phys. Rev. A, 67:042308, Apr
2003.
[34] Arnab Das and Bikas K. Chakrabarti. Colloquium: Quantum annealing and analog
quantum computation. Rev. Mod. Phys., 80:1061-1081, Sep 2008.
[35] Ammar Daskin and Sabre Kais. An ancilla-based quantum simulation framework for
non-unitary matrices. Quant. Inform. Process., 16(1):33, 2016.
[36] D. Deutsch. Quantum theory, the Church-Turing principle and the universal quantum
computer. Proceedings of the Royal Society of London A: Mathematical, Physical and
Engineering Sciences, 400(1818):97-117, 1985.
[37] Simon J Devitt, William J Munro, and Kae Nemoto. Quantum error correction for
beginners. Rep. Progr. Phys., 76(7):076001, 2013.
[38] C. L. Dolph. A current distribution for broadside arrays which optimizes the relationship between beam width and side-lobe level. Proc. IRE, 34(6):335-348, June
1946.
[39] Yonina C Eldar and Alan V Oppenheim. Quantum signal processing. IEEE Signal
Processing Magazine, 19(6):12-32, 2002.
[40] Alexandre Eremenko and Peter Yuditskii. Uniform approximation of sgn x by polynomials and entire functions. J. Anal. Math., 101(1):313-324, 2007.
[41] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, Joshua Lapan, Andrew Lundgren,
and Daniel Preda. A quantum adiabatic evolution algorithm applied to random instances of an NP-complete problem. Science, 292(5516):472-475, 2001.
[42] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Michael Sipser. Quantum computation by adiabatic evolution. arXiv preprint quant-ph/0001106, 2000.
[43] Richard P. Feynman. Simulating physics with computers. Int. J. Theor. Phys.,
21(6):467-488, 1982.
[44] W. Fraser. A survey of methods of computing minimax and near-minimax polynomial
approximations for functions of a single independent variable. J. ACM, 12(3):295-314,
July 1965.
[45] R Freeman. Spin choreography. Oxford University Press Oxford, UK, 1998.
[46] Sevag Gharibian, Yichen Huang, Zeph Landau, Seung Woo Shin, et al. Quantum
Hamiltonian complexity. Found. Trends Theo. Comp. Sci., 10(3):159-282, 2015.
[47] Daniel Gottesman. An introduction to quantum error correction and fault-tolerant
quantum computation. In Proceedings of Symposia in Applied Mathematics on Quantum information science and its contributions to mathematics, volume 68, pages 13-58,
2009.
[48] Francis Grenez. Design of linear or minimum-phase FIR filters by constrained Chebyshev approximation. Signal Process., 5(4):325-332, 1983.
[49] William A. Grissom, Zhipeng Cao, and Mark D. Does. |B1+|-selective excitation pulse
design using the Shinnar-Le Roux algorithm. J. Mag. Res., 242:189 - 196, 2014.
[50] Lov K. Grover. A fast quantum mechanical algorithm for database search. pages
212-219, 1996.
[51] Lov K. Grover. Fixed-point quantum search. Phys. Rev. Lett., 95:150501, Oct 2005.
[52] T. Häberle, D. Schmid-Lorch, K. Karrai, F. Reinhard, and J. Wrachtrup. High-dynamic-range imaging of nanoscale magnetic fields using optimal control of a single
qubit. Phys. Rev. Lett., 111:170801, Oct 2013.
[53] Hartmut Häffner, Christian F Roos, and Rainer Blatt. Quantum computing with
trapped ions. Phys. Rep., 469(4):155-203, 2008.
[54] F.J. Harris. On the use of windows for harmonic analysis with the discrete Fourier
transform. Proc. IEEE, 66(1):51-83, Jan 1978.
[55] Aram W. Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for linear
systems of equations. Phys. Rev. Lett., 103:150502, Oct 2009.
[56] E. Hofstetter, A. V. Oppenheim, and J. Siegel. A new technique for the design of
nonrecursive digital filters. 5th Annu. Princeton Conf. Informat. Sci. Syst., pages
62-72, March 1971.
[57] Abdolhossein Hoorfar and Mehdi Hassani. Inequalities on the Lambert W function and
hyperpower function. J. Inequal. Pure and Appl. Math, 9(2):5-9, 2008.
[58] Peter Hoyer. Arbitrary phases in quantum amplitude amplification. Phys. Rev. A,
62:052304, Oct 2000.
[59] Sami Husain, Minaru Kawamura, and Jonathan A. Jones. Further analysis of some
symmetric and antisymmetric composite pulses for tackling pulse strength errors. J.
Mag. Res., 230:145 - 154, 2013.
[60] Vasiliki N Ikonomidou and George D Sergiadis. Improved Shinnar-Le Roux algorithm.
J. Mag. Res., 143(1):30 - 34, 2000.
[61] Svetoslav S. Ivanov and Nikolay V. Vitanov. Composite two-qubit gates. Phys. Rev.
A, 92:022333, Aug 2015.
[62] Jonathan A. Jones. Nested composite NOT gates for quantum computation. Phys.
Lett. A, 377(40):2860 - 2862, 2013.
[63] Stephen P Jordan, Hari Krovi, Keith SM Lee, and John Preskill. BQP-completeness
of scattering in scalar quantum field theory. arXiv preprint arXiv:1703.00454, 2017.
[64] Chingiz Kabytayev, Todd J. Green, Kaveh Khodjasteh, Michael J. Biercuk, Lorenza
Viola, and Kenneth R. Brown. Robustness of composite pulses to time-dependent
control noise. Phys. Rev. A, 90:012316, Jul 2014.
[65] Lina J. Karam and James H. McClellan. Chebyshev digital FIR filter design. Signal
Process., 76(1):17 - 36, 1999.
[66] Georgios Katsikis, James S. Cybulski, and Manu Prakash. Synchronous universal
droplet logic and control. Nat. Phys., 11(7):588-596, July 2015.
[67] Navin Khaneja, Timo Reiss, Cindie Kehlet, Thomas Schulte-Herbrüggen, and Steffen J. Glaser. Optimal control of coupled spin dynamics: design of NMR pulse sequences by gradient ascent algorithms. J. Mag. Res., 172(2):296 - 305, 2005.
[68] Kaveh Khodjasteh, Daniel A. Lidar, and Lorenza Viola. Arbitrarily accurate dynamical control in open quantum systems. Phys. Rev. Lett., 104:090501, Mar 2010.
[69] Kaveh Khodjasteh and Lorenza Viola. Dynamically error-corrected gates for universal
quantum computation. Phys. Rev. Lett., 102:080501, Feb 2009.
[70] Shelby Kimmel, Cedric Yen-Yu Lin, Guang Hao Low, Maris Ozols, and Theodore J.
Yoder. Hamiltonian simulation with optimal sample complexity. Npj Quantum Inf.,
3(1):13, 2017.
[71] A Yu Kitaev. Quantum computations: algorithms and error correction. Russ. Math.
Surv., 52(6):1191-1249, 1997.
[72] Robin Kothari. Efficient algorithms in quantum query complexity. PhD thesis, 2014.
[73] Robin Kothari. private communication.
[74] R. Landauer. Irreversibility and heat generation in the computing process. IBM
Journal of Research and Development, 5(3):183-191, July 1961.
[75] Mathias Lang. Algorithms for the Constrained Design of Digital Filters with Arbitrary
Magnitude and Phase Responses. PhD thesis, Vienna University of Technology, 1999.
[76] Kuan J. Lee. General parameter relations for the Shinnar-Le Roux pulse design
algorithm. J. Mag. Res., 186(2):252 - 258, 2007.
[77] Malcolm H. Levitt. Composite Pulses. John Wiley & Sons, Ltd, 2007.
[78] J. S. Li and N. Khaneja. Ensemble control of Bloch equations. IEEE Trans. Autom.
Control, 54(3):528-536, March 2009.
[79] J. Shin Li, Justin Ruths, Tsyr Yan Yu, Haribabu Arthanari, and Gerhard Wagner. Optimal pulse design in quantum control: A unified computational method. Proceedings
of the Natl. Acad. Sci. U.S.A., 108(5):1879-1884, 2011.
[80] Y. C. Lim, J. H. Lee, C. K. Chen, and R. H. Yang. A weighted least squares algorithm
for quasi-equiripple FIR and IIR digital filter design. IEEE Trans. Signal Process.,
40(3):551-558, Mar 1992.
[81] Seth Lloyd. Universal quantum simulators. Science, 273(5278):1073, Aug 23 1996.
[82] Seth Lloyd, Masoud Mohseni, and Patrick Rebentrost. Quantum principal component
analysis. Nat. Phys., 10(9):631-633, September 2014.
[83] Gui Lu Long, Yan Song Li, Wei Lin Zhang, and Li Niu. Phase matching in quantum
searching. Phys. Lett., 262(1):27 - 34, 1999.
[84] Guang Hao Low and Isaac L Chuang. Hamiltonian simulation by qubitization. arXiv
preprint arXiv:1610.06546, 2016.
[85] Guang Hao Low and Isaac L Chuang. Hamiltonian simulation by uniform spectral
amplification. arXiv preprint arXiv:1707.05391, 2017.
[86] Guang Hao Low and Isaac L. Chuang. Optimal Hamiltonian simulation by quantum
signal processing. Phys. Rev. Lett., 118:010501, Jan 2017.
[87] Guang Hao Low, Theodore J. Yoder, and Isaac L. Chuang. Optimal arbitrarily accurate composite pulse sequences. Phys. Rev. A, 89:022341, Feb 2014.
[88] Guang Hao Low, Theodore J. Yoder, and Isaac L. Chuang. Quantum imaging by
coherent enhancement. Phys. Rev. Lett., 114:100801, Mar 2015.
[89] Guang Hao Low, Theodore J. Yoder, and Isaac L. Chuang. Methodology of resonant
equiangular composite quantum gates. Phys. Rev. X, 6:041067, Dec 2016.
[90] Peter Lynch. The Dolph-Chebyshev window: A simple optimal filter. Mon. Wea. Rev.,
125(4):655-660, 1997.
[91] Murray Marshall. Positive polynomials and sums of squares. Number 146. American
Mathematical Soc., 2008.
[92] J. McClellan, T. Parks, and L. Rabiner. A computer program for designing optimum
FIR linear phase digital filters. IEEE Trans. Audio Electroacoust., 21(6):506-526, Dec
1973.
[93] Günter Meinardus. Approximation of functions: Theory and numerical methods, volume 13. Springer, Berlin, 1967.
[94] J. T. Merrill, S. C. Doret, Grahame Vittorini, J. P. Addison, and Kenneth R. Brown.
Transformed composite sequences for improved qubit addressing. Phys. Rev. A,
90:040301, Oct 2014.
[95] Ari Mizel, Daniel A. Lidar, and Morgan Mitchell. Simple proof of equivalence between
adiabatic quantum computation and the circuit model. Phys. Rev. Lett., 99:070502,
Aug 2007.
[96] Shubhendu S Mukherjee, Joel Emer, Tryggve Fossum, and Steven K Reinhardt. Cache
scrubbing in microprocessors: Myth or necessity? In Proceedings. 10th IEEE Pacific
Rim International Symposium on Dependable Computing, pages 37-42. IEEE, 2004.
[97] C. Andrew Neff and John H. Reif. An efficient algorithm for the complex roots problem.
J. Complex, 12(2):81 - 115, 1996.
[98] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, 1 edition, January 2004.
[99] Leonardo Novo and Dominic W Berry. Improved hamiltonian simulation via a truncated Taylor series and corrections. arXiv preprint arXiv:1611.10033, 2016.
[100] Jeremy L O'Brien. Optical quantum computing. Science, 318(5856):1567-1570, 2007.
[101] A.V. Oppenheim and R.W. Schafer. Discrete-time Signal Processing (3rd Ed.).
Prentice-Hall signal processing series. Prentice Hall, 2010.
[102] Ricardo Pachón and Lloyd N. Trefethen. Barycentric-Remez algorithms for best polynomial approximation in the chebfun system. BIT Numer. Math., 49(4):721-741, 2009.
[103] Adam Paetznick and Krysta M. Svore. Repeat-until-success: Non-deterministic decomposition of single-qubit unitaries. Quantum Info. Comput., 14(15-16):1277-1301,
November 2014.
[104] J. Pauly, P. Le Roux, D. Nishimura, and A. Macovski. Parameter relations for the
Shinnar-Le Roux selective excitation pulse design algorithm [NMR imaging]. IEEE
Trans. Med. Imag., 10(1):53-65, Mar 1991.
[105] Michael James David Powell. Approximation theory and methods. Cambridge University Press, 1981.
[106] Lulu Qian, David Soloveichik, and Erik Winfree. Efficient Turing-Universal Computation with DNA Polymers, pages 123-140. Springer Berlin Heidelberg, Berlin,
Heidelberg, 2011.
[107] Ben W Reichardt. Reflections for quantum query algorithms. In Proceedings of the
28th annual ACM-SIAM symposium on Discrete Algorithms, pages 560-569. Society
for Industrial and Applied Mathematics, 2011.
[108] Jérémie Roland and Nicolas J Cerf. Quantum search by local adiabatic evolution.
Phys. Rev. A, 65(4):042308, 2002.
[109] I. Michael Ross and Mark Karpenko. A review of pseudospectral optimal control:
From theory to flight. Annu. Rev. Control, 36(2):182 - 197, 2012.
[110] J. Ruths and J. S. Li. Optimal control of inhomogeneous ensembles. IEEE Trans.
Autom. Control, 57(8):2021-2032, Aug 2012.
[111] Sushant Sachdeva and Nisheeth K. Vishnoi. Faster algorithms via approximation
theory. Found. Trends Theo. Comp. Sci., 9(2):125-210, 2014.
[112] EB Saff and V Totik. Polynomial approximation of piecewise analytic functions. J.
Lond. Math. Soc., 2(3):487-498, 1989.
[113] Meir Shinnar, Scott Eleff, Harihara Subramanian, and John S. Leigh. The synthesis of
pulse sequences yielding arbitrary magnetization vectors. Mag. Res. Med., 12(1):74-80,
1989.
[114] Michael Sipser. Introduction to the Theory of Computation, volume 2. Thomson
Course Technology Boston, 2006.
[115] A. Soare, H. Ball, D. Hayes, J. Sastrawan, M. C. Jarratt, J. J. McLoughlin, X. Zhen,
T. J. Green, and M. J. Biercuk. Experimental noise filtering by quantum control. Nat.
Phys., 10(11):825-829, November 2014.
[116] Robert I. Soare. Turing oracle machines, online computing, and three displacements in
computability theory. Ann. Pure Appl. Logic., 160(3):368 - 399, 2009. Computation
and Logic in the Real World: CiE 2007.
[117] David Soloveichik, Matthew Cook, Erik Winfree, and Jehoshua Bruck. Computation
with finite stochastic chemical reaction networks. Nat. Comp., 7(4):615-633, December
2008.
[118] R. D. Somma and S. Boixo. Spectral gap amplification. SIAM J. Comput., 42(2):593-610, 2013.
[119] M. H. Stone. The generalized Weierstrass approximation theorem. Math. Mag.,
21(4):167-184, 1948.
[120] Gerald Jay Sussman and Jack Wisdom. Numerical evidence that the motion of Pluto
is chaotic. Science, 241(4864):433-437, 1988.
[121] Eric G. Swedin and David L. Ferro. Computers: The Life Story of a Technology
(Greenwood Technographies). Greenwood Press, Westport, CT, USA, 2005.
[122] M. Szegedy. Quantum speed-up of Markov chain based algorithms. In Proceedings of
the 45th Annual IEEE Symposium on Foundations of Computer Science, pages 32-41,
Oct 2004.
[123] Mario Szegedy. Spectra of quantized walks and a $\sqrt{\delta\epsilon}$ rule. arXiv preprint
quant-ph/0401053, 2004.
[124] Y Tomita, J T Merrill, and K R Brown. Multi-qubit compensation sequences. New J.
Phys., 12(1):015002, 2010.
[125] Boyan T. Torosov and Nikolay V. Vitanov. Smooth composite pulses for high-fidelity
quantum information processing. Phys. Rev. A, 83:053420, May 2011.
[126] Lloyd N Trefethen. Approximation theory and approximation practice. SIAM, Philadelphia, 2013.
[127] Götz S. Uhrig. Keeping a quantum bit alive by optimized π-pulse sequences. Phys.
Rev. Lett., 98:100504, Mar 2007.
[128] P. Vaidyanathan and Truong Nguyen. Eigenfilters: A new approach to least-squares
FIR filter design and applications including Nyquist filters. IEEE Trans. Circuits
Syst., 34(1):11-23, Jan 1987.
[129] L. M. K. Vandersypen and I. L. Chuang. NMR techniques for quantum control and
computation. Rev. Mod. Phys., 76:1037-1069, Jan 2005.
[130] Nikolay V. Vitanov. Arbitrarily accurate narrowband composite pulse sequences. Phys.
Rev. A, 84:065404, Dec 2011.
[131] John Von Neumann and Arthur Walter Burks. Theory of self-reproducing automata.
University of Illinois Press Urbana, 1996.
[132] Xin Wang, Lev S. Bishop, Edwin Barnes, J. P. Kestner, and S. DasSarma. Robust
quantum gates for singlet-triplet spin qubits using composite pulses. Phys. Rev. A,
89:022310, Feb 2014.
[133] Warren S Warren. The usefulness of NMR quantum computing. Science,
277(5332):1688-1690, 1997.
[134] S. Wimperis. Broadband, narrowband, and passband composite pulses for use in
advanced NMR experiments. J. Mag. Res., Series A, 109(2):221 - 231, 1994.
[135] Chui-Ping Yang and Siyuan Han. n-qubit-controlled phase gate with superconducting
quantum-interference devices coupled to a resonator. Phys. Rev. A, 72(3):032311,
2005.
[136] Theodore J. Yoder, Guang Hao Low, and Isaac L. Chuang. Fixed-point quantum
search with an optimal number of queries. Phys. Rev. Lett., 113:210501, Nov 2014.
[137] Theodore J. Yoder, Ryuji Takagi, and Isaac L. Chuang. Universal fault-tolerant gates
on concatenated stabilizer codes. Phys. Rev. X, 6:031039, Sep 2016.