
Quantum Signal Processing by Single-Qubit Dynamics
by
Guang Hao Low
M.Sci. Physics, University of Cambridge, 2012
B.A. Natural Sciences, University of Cambridge, 2012

Submitted to the Department of Physics in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Physics at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY, September 2017

© Massachusetts Institute of Technology 2017. All rights reserved.

Author (signature redacted): Department of Physics, August 9, 2017
Certified by (signature redacted): Isaac L. Chuang, Professor of Physics, Professor of Electrical Engineering and Computer Science, Thesis Supervisor
Accepted by (signature redacted): Nergis Mavalvala, Curtis and Kathleen Marble Professor of Astrophysics, Associate Department Head of Physics

Quantum Signal Processing by Single-Qubit Dynamics
by Guang Hao Low
Submitted to the Department of Physics on August 9, 2017, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Physics

Abstract

Quantum computation is the most powerful realizable model of computation, and is uniquely positioned to solve specialized problems intractable to classical computers. This quantum advantage arises from directly exploiting the strangeness of quantum mechanics that is fundamental to reality. As such, one expects our understanding of quantum processes in physical systems to be indispensable to the design and execution of quantum algorithms. We present quantum signal processing, which exploits the dynamics of simple quantum systems to perform non-trivial computations. Such systems, applied as computational modules in larger quantum algorithms, offer a natural physical alternative to standard tasks such as the calculation of elementary functions with integer arithmetic.
The quantum advantage of this approach, based on simple physics, is of significant practical relevance. In some cases, arbitrary bits of precision may be emulated using only constant space. Moreover, the simplicity and performance of quantum signal processing are such that it is the final missing ingredient for realizing a number of optimal quantum algorithms, particularly in Hamiltonian simulation. Quantum signal processing realizes a useful fusion of analog and digital models of quantum computation. At the physical level, we focus on how even a simple two-level system, the qubit, computes through optimal discrete-time quantum control. Whereas quantum control is typically used to synthesize unitary quantum gates, we solve the synthesis problem of unitary quantum functions with a full characterization of achievable functions, and efficient techniques for their implementation. This furnishes a surprisingly rich framework in the analog model of quantum computation for computing functions. The generality of this model is realized by many applications, often with no modification, to quantum algorithms designed for digital quantum computers, in particular for matrix manipulation. In this manner, we solve a number of open problems related to optimal amplitude amplification algorithms, optimally computing on matrices with a quantum computer, and the simulation of physical systems.

Thesis Supervisor: Isaac L. Chuang
Title: Professor of Physics, Professor of Electrical Engineering and Computer Science

Acknowledgments

This thesis is possible only through the influence of many people. My research advisor Isaac Chuang has consistently been a deep well of insight and guidance, not just in the intricacies of quantum computing, but also in the larger picture of what it means to be a researcher and beyond. Our story goes a little further back: I met Ike on the recommendation of my academic advisor Thomas Greytak during my undergraduate exchange at MIT.
I am very fortunate to have had good mentors. The seeds for my time in graduate school were sown in those early days in Ike's lab, also under the supervision of Peter Herskind and Shannon Wang. That amazingly positive experience was a pivotal moment. From the beginning, Ike gave me the freedom to pursue my interest, and I fondly remember our lunch discussions. I am grateful to my thesis committee members Eddie Farhi and Aram Harrow for their support. The all-too-frequently unsung heroes of administrative staff in the Physics department, particularly Catherine Modica and Sydney Miller, also deserve special mention for their unwavering assistance in my moments of need. I learned much from discussions with my collaborators, particularly Ted Yoder in laying the foundations of this thesis through our work on composite pulse sequences in Chapter 2. Valuable experience was gained through working with Shelby Kimmel, Cedric Lin, Michael Gutierrez, Helena Zhang, Richard Rines, Maris Ozols, Kuan-Yu Lin, Robert McConnell, Tailin Wu, and John Chiaverini. Beyond these, Amira Eltony, Curtis Northcutt, Murphy Niu, Sam Buercklin, Molu Shi, Dax Koh, Mischa Wood, Yuan Su, and many others not mentioned but no less deserving. I would also like to thank other teachers in my journey through physics: Steve McMahon, Yeo Ye, Quek Hoon Khim, Chen Geok Loo, and Byran Poon. Most of all, my parents for their love, and the opportunity to pursue my dreams.

Contents

1 Introduction
  1.1 An outline of quantum computation
    1.1.1 Universality
    1.1.2 Complexity
    1.1.3 Analog quantum computation
    1.1.4 Fault-tolerance
    1.1.5 Digital quantum computation
    1.1.6 Analog-digital hybrid models of quantum computation
  1.2 Quantum signal processing
2 Analog computation on a single qubit
  2.1 Introduction
    2.1.1 Attributions and contributions
  2.2 A model of analog quantum computation
  2.3 Analog quantum computation on a single-qubit
    2.3.1 Representation of QSP
  2.4 Systematic and efficient design of optimal composite gates
    2.4.1 Polynomial characterization of quantum response functions
    2.4.2 Fourier characterization of quantum response functions
    2.4.3 Implementation of quantum response functions
    2.4.4 Computation of quantum response functions
    2.4.5 Selection of quantum response functions
    2.4.6 The methodology of composite quantum gates
  2.5 Examples
    2.5.1 Composite population inversion gates
    2.5.2 Broadband compensated NOT gates
    2.5.3 Composite quantum gates with sub-wavelength spatial selectivity
  2.6 Conclusion
3 Amplitude amplification by quantum signal processing
  3.1 Introduction
    3.1.1 Attributions and contributions
  3.2 Quantum algorithms and query complexity
  3.3 Quantum search and amplitude amplification
  3.4 Amplitude amplification by partial reflections
  3.5 Flexible amplitude amplification
  3.6 Conclusion
4 Sparse Hamiltonian simulation by quantum signal processing
  4.1 Introduction
    4.1.1 Attributions and contributions
  4.2 Quantum walks in sparse Hamiltonian simulation
  4.3 Eigenphase transformations by quantum signal processing
  4.4 Optimal sparse Hamiltonian simulation
5 Standard-form Hamiltonian simulation by qubitization
  5.1 Introduction
    5.1.1 Attributions and contributions
  5.2 The standard-form encoding of matrices
    5.2.1 Matrices from a linear combination of unitaries
    5.2.2 Density matrices
    5.2.3 Sparse matrices
  5.3 Qubitization
    5.3.1 Proof of construction
  5.4 Operator function design
    5.4.1 Ancilla-free quantum signal processing
    5.4.2 Single-ancilla flexible quantum signal processing
    5.4.3 Single-ancilla quantum signal processing on arbitrary unitaries
    5.4.4 Single-ancilla quantum signal processing on controlled-qubiterates
    5.4.5 Double-ancilla quantum signal processing
    5.4.6 Operator functions of normal matrices
  5.5 Hamiltonian simulation by qubitization
  5.6 Conclusion
6 Uniform spectral amplification
  6.1 Introduction
    6.1.1 Our Results
    6.1.2 Organization
    6.1.3 Attributions and contributions
  6.2 Uniform Spectral Amplification by Quantum Signal Processing
  6.3 Amplitude Multiplication
  6.4 Uniform Spectral Amplification by Amplitude Multiplication
    6.4.1 Matrix Elements as State Overlaps
    6.4.2 Amplitude Multiplication of Overlap States
    6.4.3 Reduction to Sparse Matrices
    6.4.4 Lower Bound on Sparse Hamiltonian Simulation
  6.5 Universality of the Standard-Form
  6.6 Construction of Polynomials
    6.6.1 Polynomial Approximations to a Truncated Linear Function
    6.6.2 Polynomials for Low-Energy Uniform Spectral Amplification
  6.7 Conclusions

List of Figures

1-1 Dependencies of chapters.
2-1 Plots of equiripple polynomials DCL,I, ML,T.
2-2 Worst-case infidelity of equiripple NOT gates as function of target bandwidth.
2-3 Infidelity of spatially selective equiripple composite gates as function of distance.
3-1 Quantum circuit for amplitude amplification variants.
4-1 Quantum circuit for eigenphase transformations by quantum signal processing.
5-1 Quantum circuits for standard-form encoding of matrices described by common oracles.
5-2 Quantum circuit for the qubitization of a standard-form encoding.
5-3 Quantum circuit for the phased qubiterate of a standard-form encoding.
5-4 Quantum circuit for the flexible qubiterate.

List of Tables

1.1 Truth tables of primitive Boolean logic gates NOT, OR, NAND, and Toffoli.
1.2 Representations of primitive quantum gates {Had, T, CNOT}.
5.1 Six example problems solvable using the quantum signal processing and qubitization combined.
5.2 Performance comparison of state-of-art with our new approaches for Hamiltonian simulation.

Chapter 1 Introduction

Machines designed to aid computation have long been of interest to civilization [121]. One of the earliest examples is the abacus (~500 BCE), which can execute simple arithmetical algorithms. Ancient history is rife with more sophisticated constructions, such as the Greek Antikythera gear mechanism (~100 BCE) for astronomy of the sun and moon, which is also the earliest known analogue computer. It is hard to overstate the utility of these devices.
Charles Babbage's difference engine (1822 CE), a mechanical calculator for tabulating polynomials, could replace hours of laborious computation by hand with several turns of a crank. After the invention of vacuum tubes (~1900 CE), it became common to find electronic circuits for solving linear differential equations and performing convolutions in real-time. Today, modern digital computers built upon billions of transistors per device are capable of simulating astoundingly complex natural phenomena with astonishing speed, limited only by the approximations made to the underlying laws of physics, and the effort invested in their numerical propagation forward in time. These are all examples of classical computers, with operating principles rooted in the classical laws of physics. The earlier mechanical calculators rely upon Newton's laws of motion, and the later electronic computers are possible due to Maxwell's equations for electromagnetism. Future computers could also be based upon different physical systems, such as fluid dynamics [66], chemical reactions [117], and even DNA [106]. These speculative thrusts are motivated in part by the aim of approaching one of the holy grails of classical irreversible computation: Landauer's principle [74], which is the thermodynamic limit of $kT \ln 2$ joules of energy per bit of information erased. In most cases, the minimum benchmark for any such architecture is demonstrating the ability to implement primitive Boolean logic gates that are universal for classical computation, such as NOT and OR, or NAND, or Toffoli (Table 1.1), as these may be composed to synthesize arbitrary Boolean functions $f(x) \in \{0, 1\}$, where $x$ is a natural number represented by an n-bit string.
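The claim that these gates compose into other Boolean functions can be checked directly for small cases. The following is a minimal sketch and not part of the original text: the function names (`nand`, `or_from_nand`, `nand_from_toffoli`, `truth_table`) are illustrative assumptions. It builds NOT, AND, and OR from NAND alone, and recovers NAND from a Toffoli gate whose third input bit is fixed to 1.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    """NAND gate: outputs 0 only when both inputs are 1."""
    return 1 - (a & b)


def not_from_nand(a: int) -> int:
    """NOT built from a single NAND with duplicated input."""
    return nand(a, a)


def and_from_nand(a: int, b: int) -> int:
    """AND is NAND followed by NOT."""
    return not_from_nand(nand(a, b))


def or_from_nand(a: int, b: int) -> int:
    """OR via De Morgan: a OR b = NAND(NOT a, NOT b)."""
    return nand(not_from_nand(a), not_from_nand(b))


def toffoli(a: int, b: int, c: int) -> tuple:
    """Reversible Toffoli gate: flips c only when a = b = 1."""
    return a, b, c ^ (a & b)


def nand_from_toffoli(a: int, b: int) -> int:
    """Fixing the third input to 1 makes the third output NAND(a, b)."""
    return toffoli(a, b, 1)[2]


def truth_table(gate, n_inputs: int) -> dict:
    """Map every input bit tuple to the gate output."""
    return {bits: gate(*bits) for bits in product((0, 1), repeat=n_inputs)}
```

For instance, `truth_table(or_from_nand, 2)` can be compared entry by entry against the usual OR truth table, and `truth_table(nand_from_toffoli, 2)` against that of NAND.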
NOT          OR           NAND         Toffoli
In  Out      In  Out      In  Out      In   Out
0   1        00  0        00  1        000  000
1   0        01  1        01  1        001  001
             10  1        10  1        010  010
             11  1        11  0        011  011
                                       100  100
                                       101  101
                                       110  111
                                       111  110

Table 1.1: Truth tables of primitive Boolean logic gates NOT, OR, NAND, and Toffoli.

Computability: Regardless of the underlying physical principle, universal classical computers are all surprisingly equivalent in terms of computability. As Boolean logic may be used to construct the prototypical Turing machine for performing mechanical computations, the forward direction of this equivalence is formalized by the Church-Turing thesis [116], which hypothesizes that any function $f(x)$ may be computed by a Turing machine if and only if it is computable by a human following some algorithm, ignoring resource limitations in space and time. The reverse direction is furnished by the stronger Church-Turing-Deutsch thesis [36], which hypothesizes that any physical process may be simulated by a Turing machine. To date, no violations of these hypotheses have been found. In other words, if one is only concerned with the set of problems to which one may compute solutions, the underlying physical architecture is largely irrelevant, as every universal classical computer may simulate any other universal classical computer.

Complexity: Equivalence in computability, however, does not imply equivalence in the more useful metric of computational complexity [114]. The power of universal classical computation is encapsulated by the complexity class P. This is the set of all functions $f(x) \in \{0, 1\}$ on an n-bit string $x$ efficiently computable by a Turing machine, that is, using polynomial $O(\mathrm{poly}(n))$ time and space, which in turn is simulatable using $O(\mathrm{poly}(n))$ universal Boolean gates and bits. Allowing for $O(\mathrm{poly}(n))$ additional random bits, and requiring only that the correct answer $f(x) = 1$ be computed with probability $\geq 2/3$, generalizes this to BPP, which stands for Bounded-error Probabilistic Polynomial time.
Within BPP, different architectures have unique strengths that allow them to solve certain specialized problems significantly faster, with less error, or more cost-effectively. While these at-most polynomial or constant factor improvements are generally less exciting to understanding complexity classes, they are of immense practical relevance. This is well-illustrated by a comparison between analog and digital computation. Analog computers compute directly through dynamics governed by physical laws of motion. Thus their inputs and outputs are continuous physical variables, such as voltage or position. Historically, this made them particularly suitable for fast real-time applications that interface with physical systems, such as control engineering. In contrast, digital computers are based on discrete variables, or abstract bits of information, upon which computation proceeds through Boolean logic. This layer of abstraction, the logical level, is simulated by the underlying physical dynamics of the physical level, and comes with a practically significant constant or polynomial overhead in space and time. Often, considerably more components are required to mimic a simple analog computation.

Fault-tolerance: The primary advantage of digital computation, however, is robustness against unreliable physical components [131] that fail with some probability. Whereas bit-flip errors in digital bits may be fixed by error correction codes, the simplest being redundancy followed by a majority vote, no such mechanism exists for the continuous errors found in continuous variables. Through error correction and carefully designed fault-tolerant hardware, the accumulation of errors may be controlled to enable arbitrarily long computations. Notably, error rates in consumer-grade computing hardware are actually extremely low, on the order of a $\sim 10^{-19}$ probability of failure per logic gate [96], thus diminishing the necessity of error correction.
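The simplest code mentioned above, redundancy followed by a majority vote, can be illustrated with a short calculation. This is a hedged sketch under the assumption that each of n copies of a bit flips independently with probability p; the function names `majority_vote` and `logical_error_probability` are hypothetical and introduced only for illustration.

```python
from math import comb


def majority_vote(bits):
    """Decode a repetition code: return the value held by most copies."""
    return 1 if sum(bits) > len(bits) // 2 else 0


def logical_error_probability(p: float, n: int) -> float:
    """Probability that a majority of n copies flip, so the vote decodes wrongly.

    Assumes n is odd and that each copy flips independently with probability p.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so that the vote cannot tie")
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )
```

With three copies the decoded bit is wrong only if at least two copies flip, which happens with probability $3p^2(1-p) + p^3$. For small p this is far below p itself, and under these assumptions adding more copies suppresses it further.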
However, error correction remains highly relevant to information storage, where error rates due to radiation are considerably higher, and to inherently noisy environments such as telecommunications or outer space. In fact, with the continual reduction in the size of transistors in silicon leading to more fault-prone behavior, there may soon be a point where error correction becomes necessary to realize some optimal trade-off between feature density and effective error rate even in logic gates.

Quantum computation: An even more dramatic separation in complexity is found between computers based on the classical laws of physics and those based on quantum mechanics. In Section 1.1 we outline the basics of quantum computation following similar themes: universality, complexity, analog quantum computation, fault-tolerance, and digital quantum computation. The main content of this thesis is outlined in Section 1.1.6, where we motivate a uniquely quantum hybrid of analog and digital quantum computation that merges the speed and elegance of computation at the physical level with compatibility with the fault-tolerance of the logical level. We then apply this approach, which we call quantum signal processing, to the development of quantum simulation algorithms with remarkably low overhead and optimal performance.

1.1 An outline of quantum computation

Richard Feynman was the first to recognize in 1982 [43] that quantum mechanics could be exploited to build a quantum computer. This was motivated by the quantum simulation problem: whereas classical systems are easy to simulate on a classical computer, a similar feat for quantum systems is notoriously difficult. A system of n classical particles is fully described by $O(n)$ numbers for their positions and momenta, and thus its dynamics are simulatable in $O(\mathrm{poly}(n))$ time.
In contrast, the states of quantum particles combine by a tensor product, and so their joint state is described by $O(\exp(n))$ numbers, a peculiar property known as the 'curse of dimensionality'. In other words, the problem of simulating quantum mechanics with a classical computer, while certainly computable, requires exponential time and space. Thus, simulating quantum many-body phenomena such as high-temperature superconductivity on a classical computer is completely infeasible, and even for modestly-sized systems it may be more practical to purpose-build the quantum experiment, which is essentially an analog quantum computer. To the best of our knowledge, quantum mechanics underlies physical reality at small length scales, and certain fundamental properties unique to quantum computation have no analogue in classical computation.

* Curse of dimensionality: A bit always assumes one of two discrete values, and lives in the space of binary numbers $\{0, 1\}$. Given n bits, there are $2^n$ possible binary states in $\{0, 1\}^n$. However, at any instant, the current binary state is only one of these possibilities, and is describable using exactly n bits of information. In contrast, a single quantum bit, the qubit with state $|\psi\rangle$, is describable by three real numbers, and is a unit vector in the two-dimensional complex vector space $|\psi\rangle \in \mathbb{C}^2$. Given n qubits, these combine by a tensor product to form $2^n$ orthonormal basis vectors. Unlike bits, the composite quantum state is a unit vector in the $L^2$ norm living in $\mathbb{C}^{2^n}$ and cannot be represented by any fewer than $O(2^n)$ complex numbers, or $O(2^n)$ bits. This also enables non-local correlations, or entanglement, where the measurement outcomes of individual qubits in multi-qubit states have statistics that cannot be described by correlated classical random variables.

* Quantum interference: An n-qubit quantum state $|\psi\rangle$ is a unit vector in $\mathbb{C}^{2^n}$, and each basis vector may be indexed with n bits.
Thus $|\psi\rangle = \sum_j a_j |j\rangle$ may exist as a superposition with complex amplitudes over all binary states, with a probability $|a_j|^2$ of measuring the basis state $|j\rangle$. As all quantum time-evolution is unitary, unitary operators on $|\psi\rangle$ allow for the constructive and destructive interference of amplitudes in various binary states. In contrast, classical probability distributions evolve under the action of stochastic matrices, which only allow for the addition of probabilities, and never subtraction. From a computational perspective, quantum interference allows for the reinforcement of beneficial computational branches, and the pruning of unwanted threads.

Thus a quantum computer [98] is also the most powerful model of computation that can be feasibly constructed, potentially more capable than universal classical computers.

1.1.1 Universality

Many theoretical developments in quantum computation parallel those of classical computation. Similar to classical computers, quantum computers can be realized by different implementations of quantum physics. At the physical level, qubits are formed from any two energy levels of any effective particle with quantum properties, such as nuclei [133], ions [53], photons [100], superconductors [30], or even quantum fields [63]. The time-dependent quantum state $|\psi(t)\rangle$ of these qubits then evolves in continuous time according to the system Hamiltonian $H(t)$ and the famous Schrödinger equation

$$ i \frac{\partial |\psi(t)\rangle}{\partial t} = H(t)\,|\psi(t)\rangle. \qquad (1.1) $$

Though each system may differ in its Hamiltonian, all universal quantum computers are polynomially equivalent in their computational power. Analogous to the universal classical gates, judicious control over the Hamiltonian allows these systems to simulate a discrete set of universal quantum gates that act on qubits, such as Hadamard and T and CNOT, or Hadamard and Toffoli [2], all represented by unitary matrices in Table 1.2.
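The contrast drawn above between unitary and stochastic evolution can be made concrete with a two-state example. The following is a minimal sketch and not taken from the original text: it uses the standard Hadamard matrix as the unitary and, as an assumed classical counterpart, a fair-coin stochastic matrix; the names `apply`, `HADAMARD`, and `COIN` are illustrative.

```python
from math import sqrt


def apply(matrix, vector):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


s = 1 / sqrt(2)
HADAMARD = [[s, s], [s, -s]]      # unitary: entries may be negative
COIN = [[0.5, 0.5], [0.5, 0.5]]   # stochastic: entries are probabilities

# Quantum: start in basis state 0 and apply the Hadamard gate twice.
# The two paths leading to state 1 carry amplitudes +1/2 and -1/2 and cancel.
amplitudes = apply(HADAMARD, apply(HADAMARD, [1.0, 0.0]))
quantum_probs = [abs(a) ** 2 for a in amplitudes]

# Classical: start in state 0 and flip a fair coin twice.
# Probabilities only add, so the distribution stays uniform.
classical_probs = apply(COIN, apply(COIN, [1.0, 0.0]))
```

After one step both processes give equal odds of observing 0 or 1. After the second step the quantum amplitudes for state 1 interfere destructively and the system returns to state 0, while the classical distribution remains uniform.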
The famous Solovay-Kitaev theorem [71] provides a recipe to synthesize any arbitrary single-qubit or two-qubit unitary, so-called 'primitive' quantum gates, to arbitrarily high precision $\epsilon$ using a product of just $O(\mathrm{polylog}(1/\epsilon))$ universal quantum gates. These may in turn be composed to synthesize any unitary gate of arbitrary dimension. As Lloyd's Hamiltonian simulation algorithm for local Hamiltonians [81] allows any universal quantum computer to efficiently simulate time-evolution by any physical n-particle quantum system using $O(\mathrm{poly}(n))$ primitive quantum gates and qubits, any universal quantum computer is capable of simulating any other universal quantum computer with a polynomial overhead in time and space.

Gate:    Had | T | CNOT
Matrix:  $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ | $\begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}$ | $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$
Circuit: $|\psi\rangle_a \to \mathrm{Had}|\psi\rangle_a$ | $|\psi\rangle_a \to T|\psi\rangle_a$ | $|\psi\rangle_a |x\rangle_b \to |x \oplus \psi\rangle_a |x\rangle_b$

Table 1.2: Representations of primitive quantum gates {Had, T, CNOT} as (middle row) unitary matrices, and (bottom row) quantum circuits acting on input states $|\psi\rangle_a$, $|x\rangle_b$ from the left, where the subscript indicates the qubit register. In the case of CNOT, the output represented as modular addition is only valid on computational basis states, e.g. $|\psi\rangle_a \in \{|0\rangle_a, |1\rangle_a\}$. Note that the representation of Toffoli is identical to the Boolean case.

1.1.2 Complexity

This equivalence motivates one to define the complexity class of universal quantum computation BQP, which stands for Bounded-error Quantum Polynomial time. This is the set of all functions $f(x) \in \{0, 1\}$ for which a unitary quantum circuit $Q$, acting on an n-qubit input state $|x\rangle$, outputs the correct one-bit answer with probability $\geq 2/3$, using $O(\mathrm{poly}(n))$ qubits and universal quantum gates. It is widely believed, though not proven, that universal quantum computers are strictly more powerful than their classical counterparts. In other words, $\mathrm{P} \subseteq \mathrm{BPP} \subseteq \mathrm{BQP}$.
Note that $\mathrm{BPP} \subseteq \mathrm{BQP}$ is particularly easy to prove, as Toffoli alone is universal for classical computation, and random bits may be obtained by measuring a uniform quantum superposition of binary states. Though universal quantum computation was originally motivated by the quantum simulation problem, evidence for the quantum advantage comes from the performance of remarkable quantum algorithms that surpass the best classical algorithms. The most well-known quantum speedup can be found in Shor's algorithm [9], which factors an n-bit number using $O(n^2)$ quantum gates, versus the best classical factoring algorithm, which takes $O(e^{1.9 n^{1/3}})$ classical gates. In the weaker query model, where one assumes access to a black-box circuit that outputs bits of information one by one, the broadly applicable Grover's algorithm [50] makes $O(n^{1/2})$ queries to this circuit using a quantum superposition of states to search an unsorted database of size n for a single marked element. In contrast, the naive but optimal classical approach has to check every single element with $O(n)$ queries.

1.1.3 Analog quantum computation

Universal quantum gate sets are a useful tool for comparison between physical systems, as they provide a worst-case estimate of the number of quantum gates and qubits required to execute quantum algorithms. However, the cost of implementing this abstract layer can hide large constant factors or even polynomial factors. Just like classical analog computers, the underlying dynamics of each quantum system may be predisposed to solving specialized problems. For instance, an n-qubit controlled phase gate may be implemented in a single time step by some superconducting qubit architectures [135], whereas its naive synthesis with standard universal gate sets would require $O(n)$ time. Beyond the gate model, physical intuition may provide more natural models of universal quantum computation, such as the ground state of certain two-body Hamiltonians [21], that yield insight into how nature performs computations. Perhaps the most famous example is the adiabatic quantum computer [41, 3, 4]. In adiabatic algorithms, the time-dependent Hamiltonian $H(t)$ transitions from a simple Hamiltonian $H(0) = H_i$ with an easily prepared ground state, to a final Hamiltonian $H(T) = H_f$ whose ground state encodes the solution to some computational problem, such as search [108] or k-SAT [41]. Provided that this transition is slower than the inverse energy gap to the first excited state, the initial ground state evolves into that of $H(T)$ with high probability. Though this evolution could certainly be simulated by universal gate sets, the speed and elegance of natively physical analog quantum computation is highly appealing [34], and accessible to current technology.

1.1.4 Fault-tolerance

Unfortunately, quantum computers scalable to a large number of qubits and capable of long computations are notoriously difficult to build. Qubits are often single particles highly susceptible to weak environmental noise, unlike classical bits, which result from the collective phenomena of many particles. As a consequence of this fragility, the best quantum computers struggle to achieve better than a $10^{-3}$ probability of failure per two-qubit gate [6] at the physical level, which greatly limits the problems in which any advantage over classical computation may be demonstrated, even in analog quantum computation. Moreover, quantum bits are not discrete in the same way as classical bits. Whereas a quantum state is represented by a discrete set of continuous variables, a Boolean state is represented by a discrete set of discrete variables. Given the lack of success in classical analog error correction, one of the greatest triumphs and surprises of quantum computation was the discovery that quantum error correction and fault-tolerant quantum computation are possible [19]. Both are now regarded as essential to any long-term vision of quantum computation.
The basic idea behind quantum error correction [47] is to encode the two basis states of a logical qubit into two specially designed entangled states, the quantum codewords, that are distributed over a larger number of n physical qubits. Each time-step of evolution at the physical level then applies an error, which is some quantum channel that may be decomposed into a linear combination of the n-fold tensor product of single-qubit Pauli operators $\{I, \hat{\sigma}_x, \hat{\sigma}_y, \hat{\sigma}_z\}$. Working on the assumption that noise acts locally on physical qubits, errors will with overwhelming probability be caused by a discrete set of low-weight operators, where weight is the number of non-identity single-qubit Pauli operators. All possible operators of sufficiently low weight then transform the initial two codewords into other codewords that are guaranteed to be mutually orthogonal, which allows the error that occurred to be uniquely identified by measurement, and then corrected. Quantum gates may be applied to logical qubits using high-weight operators on the physical qubits that preserve these codewords, and any local errors may be corrected in a similar manner. Great care must be taken in fault-tolerant quantum computation to organize these logical quantum gates, measurements, and correction routines in such a way that low-weight errors do not propagate to become uncorrectable high-weight errors. Notably, only a finite set of logical quantum gates may be implemented in this manner. Fortunately, some error correction codes allow for a fault-tolerant implementation of universal quantum gate sets such as Toffoli and Hadamard [137].

1.1.5 Digital quantum computation

This layer of fault-tolerant error correction with universal quantum gates acting on logical qubits abstracts away the underlying physical dynamics, and is also known as digital quantum computation.
Provided that physical error probabilities are below a certain threshold, the threshold theorem [47] guarantees that logical error probabilities may be made exponentially smaller with the number of concatenations of the fault-tolerant layer. However, a heavy price in overhead is paid for this robustness to noise. Realistic estimates [37] suggest that tens of physical qubits and up to $10^5$ physical gate operations may be required per logical single-qubit gate. This is compounded by the up-to-polynomial overhead in time and space of compiling fast analog quantum algorithms into sequences of universal gates. Though these estimates can be expected to go down if physical error rates decrease or if more efficient fault-tolerant error correction protocols are found, digital quantum computation is currently somewhat beyond the reach of existing technology.

1.1.6 Analog-digital hybrid models of quantum computation

The fundamental physics of quantum mechanics underlies quantum computation. Just as intuition about classical physics heavily influenced analog computation, intuition about quantum physics has driven the invention of analog quantum algorithms, such as the adiabatic algorithm [42, 95] inspired by adiabaticity, and quantum walks [22, 24] inspired by locality. These algorithms can operate directly at the physical level of quantum hardware, and thus come with low overhead and provide insight into how nature intrinsically enables computation. It is also widely recognized that a fault-tolerant logical layer of digital quantum computation will ultimately become necessary. The discrete set of universal logical quantum gates implementable at this level motivates a model of quantum computation analogous to that of the discrete set of universal classical Boolean gates.
On one hand, many highly optimized reversible Boolean compilations of common operations such as integer arithmetic and optimization may be implemented directly on a quantum computer using, say, the Toffoli gate, which is universal for classical computation. This transfer of knowledge is of incalculable value, and is commonly exploited by many quantum algorithms. On the other hand, this has also motivated a more abstract understanding of how the mathematics of quantum physics enables quantum computation, independent of the physical layer. However, quantum algorithms designed for the logical layer lose the physical intuition of analog quantum algorithms, and more pragmatically, lose their native speed and low overhead. One could certainly simulate analog quantum algorithms at the logical level [7], as assured by the universality of quantum computation. However, this is done through digital Hamiltonian simulation algorithms with some polynomial overhead, and is not a uniquely quantum phenomenon. After all, digital classical computers too have no fundamental difficulty simulating continuous classical physics. Analog and digital classical computation are mutually exclusive: one computes either on continuous variables or on binary variables. In contrast, digital quantum computation retains some remnant analog behavior, as its unitary operators still live in a continuous space of $\mathbb{C}^{2^n}$. The 'analogness' of digital quantum computation lends hope that some kind of meaningful fusion between analog and digital models of quantum computation is possible.

[Figure 1-1: Dependencies of chapters. Nodes: Chapter 2, Single-qubit analog quantum computation; Chapter 3, Amplitude amplification by QSP; Chapter 4, Sparse Hamiltonian simulation by QSP; Chapter 5, Standard-form Hamiltonian simulation by qubitization; Chapter 6, Uniform spectral amplification.]
1.2 Quantum signal processing

In this thesis, we present a broad approach we call quantum signal processing¹ that succeeds at transferring some of the power of computing with physical systems to the digital realm. We start in Chapter 2 with its original motivation as a tool that exploits the continuous-time dynamics of the simplest of two-level quantum systems to design a certain useful class of quantum response functions QSP at the physical level. Following a thorough analysis of what unitary functions in QSP are possible, and then solving the synthesis problem of how they may be selected, we introduce successively refined applications to quantum algorithms. Of particular surprise is that QSP too may be directly implemented on a digital quantum computer without modification or simulation. For instance, in Chapter 3, QSP naturally maps to the dynamics of the amplitude amplification algorithm, which allows us to solve open problems regarding its design, and to invent useful generalizations. Of greater interest is the application of QSP to the Hamiltonian simulation problem of simulating quantum physics on a digital quantum computer. As it turns out in Chapter 4, QSP is the last missing ingredient that allows us to obtain optimal simulation algorithms, which are also of pleasing elegance and simplicity, for sparse Hamiltonians.

The usefulness of QSP motivates us to extend its applicability. Our solutions to the problems of amplitude amplification and sparse Hamiltonian simulation were only possible as those coincidentally had an underlying SU(2) qubit symmetry matching that of QSP, whereas such structure need not exist in general quantum algorithms. We resolve this in Chapter 5 with a very general 'qubitization' procedure that imposes an SU(2) structure on a wide variety of ways in which descriptions of Hamiltonians are accessed by a quantum computer.
On one hand, this transfers our optimal simulation algorithms to a broader class of structured matrices beyond the sparse model. On the other hand, qubitization in combination with QSP furnishes a very natural framework for efficiently and optimally computing on exponentially large matrices on a quantum computer, of which Hamiltonian simulation is an example. We further study this possibility in Chapter 6, and obtain simulation algorithms optimal under even more general circumstances. The following chapter summaries outline in more detail our development and application of quantum signal processing. The dependencies of these chapters are shown in Figure 1-1.

¹ Somewhat confusingly, the term 'quantum signal processing' is used in the opposite context in [39], where classical signal processing techniques are inspired by quantum mechanics, but in no way involve any kind of quantum mechanical system. We believe our use of 'quantum' here is more traditional.

Chapter 2 draws the analogy between quantum control in physical systems and programs that compute unitary functions on multiple variables. This analogy is formalized as a general analog model of quantum computation, and applied to the simplest of quantum systems: a single qubit. Despite this simplicity, a surprisingly structured family of composite unitaries parameterized by a single input variable $\theta$ emerges through the technique of composite pulse sequences. There, $L$ fixed single-qubit rotations by angle $\theta$ are interspersed between carefully designed sequences of arbitrary phase gates. We call the set of all unitary functions generated by this model of quantum computation $\text{QSP}_L$. Though composite pulse sequences are an old technology, most composite gates are designed ad hoc either through geometric intuition or brute-force numerics, which limits their effectiveness to either very specialized classes of functions, or short-$L$ sequences with simple behavior.
Any practical application of quantum signal processing to the computation of functions must satisfy two criteria. First, the design space of achievable functions must be known. Second, there must exist efficient algorithms for compiling desired functions into primitive quantum gates. We fulfill both by rigorously characterizing $\text{QSP}_L$. We prove that the quadratures of any of its composite unitaries are degree-$L$ trigonometric polynomials in $\theta$, subject to certain necessary and sufficient constraints. We also provide necessary and sufficient constraints on polynomials describing subsets of these quadratures. Given any such polynomial(s), we provide an efficient classical algorithm running in $\mathcal{O}(\text{poly}(L))$ time for compiling a member of $\text{QSP}_L$ consistent with these specifications into its implementation with $\mathcal{O}(L)$ single-qubit gates.

Chapter 3 generalizes the quantum algorithm of amplitude amplification using $\text{QSP}_L$. In general, quantum algorithms involve multi-qubit gates distributed over an arbitrary number of qubits. Thus it is surprising that quantum signal processing, especially that on a single qubit, is at all relevant. The natural solution is to find quantum algorithms with an intrinsic structure isomorphic to single-qubit rotations. Amplitude amplification, which rotates quantum states between two subspaces, is one such algorithm, with applications in quantum state preparation. In the original algorithm, the amplitude $A$ of any marked quantum state $|\psi\rangle$ prepared by some oracle may be boosted by a factor $L$ using $\mathcal{O}(L)$ queries. More advanced variants such as fixed-point Grover search construct more desirable behavior through the same geometric intuition or brute-force search found in composite pulse sequences. However, these are all special cases of the generalization, which prepares $|\psi\rangle$ with an amplitude described by arbitrary polynomial functions of $A$ that may be designed systematically, and compiled efficiently using exactly the same techniques applied to compile $\text{QSP}_L$.
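The linear boost in amplitude quoted for the original amplitude amplification algorithm can be illustrated with a small numerical sketch. It assumes the standard two-dimensional picture in which each iteration is a product of two reflections, so that $k$ iterations map an initial amplitude $\sin\varphi$ to $\sin((2k+1)\varphi)$. The function name `amplified_amplitude` is hypothetical and is not taken from the thesis.

```python
import numpy as np

def amplified_amplitude(a, k):
    """Marked-state amplitude after k amplitude-amplification iterations.

    Illustrative sketch: the state is tracked in the two-dimensional
    (unmarked, marked) basis, where the initial marked amplitude is a.
    """
    angle = np.arcsin(a)
    initial = np.array([np.cos(angle), np.sin(angle)])
    # Reflection that flips the sign of the marked component.
    reflect_marked = np.diag([1.0, -1.0])
    # Reflection about the initial state.
    reflect_initial = 2.0 * np.outer(initial, initial) - np.eye(2)
    step = reflect_initial @ reflect_marked
    state = initial.copy()
    for _ in range(k):
        state = step @ state
    return state[1]

# For small a, the amplitude grows roughly linearly in the number of iterations.
boost = amplified_amplitude(0.01, 5) / 0.01
```

Each iteration uses a constant number of oracle queries, so a boost by a factor of order $L$ costs $\mathcal{O}(L)$ queries, consistent with the summary above.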
Chapter 4 returns to the original motivation of quantum computation: the quantum simulation problem of synthesizing an $\epsilon$-approximation to the time-evolution operator $e^{-i\hat{H}t}$ of any Hamiltonian $\hat{H}$. Unitary time-evolution is a natural consequence of Schrödinger's equation at the physical level, and this suggests that digital Hamiltonian simulation algorithms that incorporate quantum signal processing could lead to better performance. This is indeed the case: a simple multi-qubit generalization of $\text{QSP}_L$ turns out to be exceedingly useful at computing functions on the eigenphases of arbitrary unitary operators, and, in combination with quantum walks, is the last missing piece of the puzzle for creating optimal simulation algorithms, at least in the query model for $d$-sparse Hamiltonians. These Hamiltonians, which have at most $d$ nonzero elements in every row, are particularly relevant to the simulation of physical systems and the design of quantum algorithms. On one hand, particles in many realistic physical systems interact locally with their neighbors. On the other hand, quantum walks that exploit local Hamiltonians are fundamental to many quantum algorithms, through which lower bounds on the complexity of Hamiltonian simulation may be derived. The query complexity of our algorithm exactly matches the best-known lower bound of $\Omega\left(td\|\hat{H}\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ in all parameters. The algorithm is extremely simple, and moreover reduces the qubit overhead to just an additive constant $\mathcal{O}(1)$, compared to prior art with an ancilla cost scaling with some function of $t$ and $\log(1/\epsilon)$, and a query complexity that is suboptimal, though only by logarithmic factors.

Chapter 5 considers general Hamiltonians beyond the sparse model. We introduce a standard-form encoding of Hamiltonians $\hat{H}/\alpha = (\langle G| \otimes \hat{1})\hat{U}(|G\rangle \otimes \hat{1})$ that are obtained by projecting, with some normalization constant $\alpha$, unitary signal oracles $\hat{U}$ that describe $\hat{H}$ onto a subspace spanned by some ancilla state $|G\rangle$.
Many prior models of Hamiltonian simulation, such as sparse Hamiltonians, Hamiltonians described by a linear combination of unitaries, or Hamiltonians that are density matrices, turn out to be special cases of this model. We introduce a procedure called qubitization that imposes a qubit structure on the standard-form. This enables the direct application of $\text{QSP}_L$ to computing arbitrary matrix polynomial functions of $\hat{H}$ with optimal cost and great simplicity, of which the unitary time-evolution function is a special case. This leads to an optimal Hamiltonian simulation algorithm using $\mathcal{O}\left(t\alpha + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries, with the same space and query improvements over prior art as in the sparse case.

Chapter 6 motivates a systematic approach to understanding and exploiting structure, where Hamiltonians are encoded in the standard-form through the signal oracle $\hat{U}$. We define a uniform spectral amplification problem on this framework for expanding the spectrum of the encoded Hamiltonian with exponentially small distortion. We present general solutions to uniform spectral amplification in a hierarchy where factoring $\hat{U}$ into $n = 1, 2, 3$ unitary oracles represents increasing structural knowledge of the encoding. Combined with structural knowledge of the Hamiltonian, specializing these results allows us to simulate time-evolution by $d$-sparse Hamiltonians using $\mathcal{O}\left(t(d\|\hat{H}\|_{\max}\|\hat{H}\|_1)^{1/2}\log(t\|\hat{H}\|/\epsilon)\right)$ queries, given the norms $\|\hat{H}\| \le \|\hat{H}\|_1 \le d\|\hat{H}\|_{\max}$. Up to logarithmic factors, this is a strict polynomial improvement upon prior art using $\mathcal{O}\left(td\|\hat{H}\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ or $\mathcal{O}\left(t^{3/2}(d\|\hat{H}\|_{\max}\|\hat{H}\|_1\|\hat{H}\|/\epsilon)^{1/2}\right)$ queries. In the process, we also prove a matching lower bound of $\Omega\left(t(d\|\hat{H}\|_{\max}\|\hat{H}\|_1)^{1/2}\right)$ queries, present a distortion-free generalization of spectral gap amplification, and an amplitude amplification algorithm that performs multiplication on unknown amplitudes.
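As an illustration of the standard-form encoding summarized for Chapter 5, the following sketch builds the encoding for a Hamiltonian given as a linear combination of unitaries with positive coefficients. It assumes the usual construction in which the signal oracle is the block-diagonal select operator $\sum_i |i\rangle\langle i| \otimes \hat{U}_i$ and $|G\rangle$ has amplitudes $\sqrt{\alpha_i/\alpha}$. The helper names `lcu_standard_form` and `project` are hypothetical and chosen only for this example.

```python
import numpy as np

# Pauli matrices used as example unitaries.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def lcu_standard_form(coeffs, unitaries):
    """Return (alpha, G, U) for H = sum_i coeffs[i] * unitaries[i], coeffs[i] > 0.

    Assumed construction: U = sum_i |i><i| (x) U_i and G_i = sqrt(coeffs[i] / alpha).
    """
    coeffs = np.asarray(coeffs, dtype=float)
    alpha = coeffs.sum()
    G = np.sqrt(coeffs / alpha)
    m, n = len(coeffs), unitaries[0].shape[0]
    U = np.zeros((m * n, m * n), dtype=complex)
    for i, Ui in enumerate(unitaries):
        U[i * n:(i + 1) * n, i * n:(i + 1) * n] = Ui  # i-th diagonal block
    return alpha, G, U

def project(G, U, n):
    """Compute (<G| (x) 1) U (|G> (x) 1) on the n-dimensional system."""
    P = np.kron(G.reshape(-1, 1), np.eye(n))  # |G> (x) 1, shape (m*n, n)
    return P.conj().T @ U @ P

coeffs = [1.5, 0.5]
H = coeffs[0] * X + coeffs[1] * Z
alpha, G, U = lcu_standard_form(coeffs, [X, Z])
encoded = project(G, U, 2)  # equals H / alpha up to rounding
```

The projected block reproduces $\hat{H}/\alpha$ with $\alpha$ equal to the sum of the coefficients, which is the sense in which the linear-combination-of-unitaries model is a special case of the standard-form.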
Chapter 2 Analog computation on a single qubit

2.1 Introduction

In this chapter, we consider a simple model of analog quantum computation inspired by composite pulse sequences [129, 45], also known as composite quantum gates. Composite quantum gates are indispensable to many important quantum technologies, such as nuclear magnetic resonance [77, 45, 67], magnetic resonance imaging, quantum sensing [5, 52, 88, ?], and implementing robust gates in quantum computation [125, 94, 88, 134, 33, 18, 127, 132, 62, 87, 115, ?]. Their versatility arises from cunningly chosen sequences of $L$ primitive quantum gates that produce an effective quantum gate $\hat{U}$ with a more desirable dependence on some parameters of interest $\theta$, such as drive amplitude or background magnetic fields. As a function of $\theta$, the quantum response function $\hat{U}(\theta)$ can be tailored to amplify weak signals beyond the statistics of repetition, and suppress noise without measurement. Finding such useful composite gates is thus the subject of intense research, and the discovery of other applications would be expedited if a useful characterization of all achievable $\hat{U}(\theta)$ could be found.

To our knowledge, while composite quantum gates have always been a useful means to an end, they have never been considered as a model of computation. Part of the difficulty is that even though one could choose a response function $\hat{U}(\theta)$ that performs some desirable computation on the input $\theta$, its realization as a composite gate must still be found. Only with rare exceptions [127, 125, 88] and great effort are optimal, arbitrary-length examples found in closed form. Celebrated techniques including gradient ascent algorithms [67] and pseudospectral methods [110, 109] allow us to formulate this as a systematic optimization problem that can be solved by brute force. Unfortunately, this comes with an exponential worst-case runtime $\mathcal{O}(e^L)$ for finding optimally short length-$L$ approximations.
Finding efficient solutions to various control problems would expand the potential of long composite gates, for which the most sophisticated quantum response functions can be constructed. A tantalizing similarity is seen in discrete-time signal processing [101]. Optimal finite-impulse response filters [54] can be designed simply by choosing the lowest degree-$L$ polynomial that is the optimal approximation to a desired frequency response, from which an optimal and exact implementation is computed. This is made possible by efficient algorithms for both steps. It is recognized that composite gates implement a filter on physical parameters [115, 64], and the use of polynomials in quantum response functions is well-known [78, 79]. Unfortunately, quantum constraints can render computing these polynomials and their optimal implementation a hard problem. It would be a tremendous advance if efficient solutions to these problems could be found, and even more so if the countless results from the exalted history of classical discrete-time signal processing were transferable to the quantum realm.

In Section 2.2, we define the inputs and outputs of a general model of analog quantum computation that is compiled by optimal quantum control [20, 69]. This turns out to be intractable for arbitrary physical systems, and so we consider, in Section 2.3, its simplest nontrivial member: a single-qubit system. Its discrete-time approximation leads to composite quantum gates and motivates in Section 2.3 the intuitive concept of choosing polynomials to explicitly define the quantum response function $\hat{U}(\theta)$. This is made rigorous and tractable in Section 2.4.1 by a simple characterization of the space of achievable $\hat{U}(\theta)$, and by showing in Section 2.4.3 how an optimal implementation of any such $\hat{U}(\theta)$ can be efficiently computed.
We then show in Section 2.4.4 how an achievable $\hat{U}(\theta)$ can be efficiently computed from a partial specification with polynomials that describe only the composite gate fidelity or transition probability response functions. This enables in Section 2.4.5 the efficient design of achievable $\hat{U}(\theta)$ by inheriting from discrete-time signal processing existing polynomials and efficient polynomial design algorithms. Together, these provide a methodology outlined in Section 2.4.6 for the systematic and efficient design of composite quantum gates. Use of this methodology is demonstrated in Section 2.5 with the creation of optimal bandwidth compensated gates in Section 2.5.2 that provide an optimal solution in Section 2.5.3 to the problem of implementing sub-wavelength spatially selective arbitrary quantum gates. Further directions are discussed in Section 2.6.

2.1.1 Attributions and contributions

The majority of this chapter is taken from the preprint of published joint work [89] with Theodore J. Yoder and Isaac L. Chuang. Theodore J. Yoder provided the proof of Lem. 2.13, and the rest of the manuscript was written by myself with helpful discussions and suggestions from collaborators. Note that Section 2.2 is new and Section 2.3 has been significantly rewritten. G. H. Low acknowledges funding by NSF RQCC Project No. 1111337 and NRO. We thank Alan Oppenheim and Tom Baran for inspiring discussions, and connections made possible by their 6.341x open online MITx course. We thank Yuan Su for useful comments on the paper.

2.2 A model of analog quantum computation

Every model of computation must have well-defined inputs, outputs, and a procedure that maps inputs to outputs. Our procedure is the time-dependent Schrödinger equation that underlies continuous-time quantum dynamics,

$$i\hbar\,\frac{\partial \hat{U}(t)}{\partial t} = \hat{H}(t)\hat{U}(t), \quad \hat{U}(0) = \hat{1}. \tag{2.1}$$

In the following, we use units where $\hbar = 1$, and the Hamiltonian $\hat{H}(t) \in \mathbb{C}^{N \times N}$ is a dimension-$N$ Hermitian time-dependent matrix.
This outputs a unitary time-evolution operator with the formal solution

$$\hat{U}(t) = \mathcal{T}\exp\left[-i\int_0^t \hat{H}(t')\,dt'\right], \tag{2.2}$$

where $\mathcal{T}$ is the time-ordering operator. The inputs to our computation are then $d$ unknown real coefficients $\vec{\delta} \in \mathbb{R}^d$ that are part of the Hamiltonian

$$\hat{H}(t; \vec{\delta}) = \hat{H}_0(t) + \sum_{j=1}^{d} \delta_j \hat{H}_j(t), \tag{2.3}$$

for some time-dependent Hamiltonians $\hat{H}_j(t)$. It is useful to bound the spectral norms of these components $\|\hat{H}_j(t)\| = \mathcal{O}(1)$ to avoid arbitrarily fast time-evolution. In any realistic setting, $\hat{H}_j(t)$ will be highly structured. For instance, it may have a maximum slew-rate $\max_t \|\frac{d}{dt}\hat{H}_j(t)\|$, represent local interactions, or even be constant. In some cases, the experimentalist may have control over some components $\hat{H}_j(t)$ and be able to modify them, whereas others may be uncontrollable. These constraints all limit the space of possible output unitary functions $\hat{U}(t; \vec{\delta})$ of the input variables $\vec{\delta}$. However, for simplicity, we assume that all $\hat{H}_j(t)$ are directly under our control. More formally, these ingredients define a model of analog quantum computation that illustrates key features.

Definition 2.1 (Analog quantum computation for unitary function problems).
Input: An integer number $d > 0$ of real parameters $\vec{\delta} \in \mathbb{R}^d$.
Program: Time-evolution described by the Schrödinger equation with a time-dependent Hamiltonian $\hat{H}(t; \vec{\delta}) = \hat{H}_0(t) + \sum_{j=1}^{d} \delta_j \hat{H}_j(t)$ of dimension $N$, comprised of an integer number $d + 1$ of time-dependent Hamiltonians $\hat{H}_j(t)$ with bounded spectral norm $\|\hat{H}_j\| \le 1$, and defined over $t \in [0, T]$.
Cost: The time complexity is $T > 0$, and the space complexity is $N > 0$.
Output: A unitary function $\hat{U}(\vec{\delta}) = \mathcal{T}\exp\left[-i\int_0^T \hat{H}(t'; \vec{\delta})\,dt'\right]$ of $\vec{\delta}$.

Though unconventional, the model in Def. 2.1 may be massaged to resemble that of universal quantum computation BQP with a few modifications. The standard gate model is simulated as follows. (1) A sequence of arbitrary discrete single- and two-qubit gates may be implemented by a piecewise-constant choice of $\hat{H}_0(t)$.
(2) We may treat the $\delta_j$ as Boolean rather than real variables; choosing $\hat{H}_j(t)$ to also be piecewise-constant then allows us to generate quantum circuits dependent on the problem input. (3) We may recover the decision version of BQP by performing a measurement on one qubit of the output of $\hat{U}(\vec{\delta})|\psi\rangle$, where $|\psi\rangle$ is some initial state in the computational basis.

However, Def. 2.1 motivates a different interpretation of how physical systems compute. Rather than performing measurements, we focus on the $N^2$ components of $\hat{U}(\vec{\delta})$, which each compute some complex-valued function $U_{jk}(\vec{\delta})$. By choosing an appropriate input state $|\psi\rangle$ and measurement basis $|\chi\rangle$, our interest lies in the amplitude function $f : \mathbb{R}^d \to \mathbb{C}$, where

$$f(\vec{\delta}) = \langle\chi|\hat{U}(\vec{\delta})|\psi\rangle. \tag{2.4}$$

Given a target function $f'(\vec{\delta})$ that is to be uniformly approximated with error $\epsilon$ on all input values, this becomes an instance in quantum control of finding the optimal control policy for every $\hat{H}_j(t)$ that solves the min-max optimization problem

$$\min_{\langle\chi|\hat{U}|\psi\rangle}\ \max_{\vec{\delta} \in [-1,1]^d} \left|f(\vec{\delta}) - f'(\vec{\delta})\right| \le \epsilon. \tag{2.5}$$

Though one may attack Eq. 2.5 using general techniques such as gradient ascent algorithms [67], it is essentially intractable as the optimization is over a continuous space of functions with no obvious structure. However, if the solution is found, it provides a fully quantum and extremely fast analog computation of the target function. As is, this model is too general to be useful, which suggests searching for a simpler starting point.

2.3 Analog quantum computation on a single-qubit

Let us now consider the simplest non-trivial model: a generic single-qubit system resonantly driven with a constant Rabi frequency $\delta$ and time-dependent phase $\phi(t)$. The Hamiltonian of this system is

$$\hat{H}(t) = \frac{\delta}{2}\hat{\sigma}_{\phi(t)}, \tag{2.6}$$

where $\hat{\sigma}_\phi = \cos(\phi)\hat{\sigma}_x + \sin(\phi)\hat{\sigma}_y$ and $\hat{\sigma}_{x,y,z}$ are Pauli matrices. Whereas $\phi(t)$ is assumed to be completely under our control, the input $\delta$ is a constant that is not known beforehand.
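The time-ordered exponential of Eq. 2.2 for the single-qubit Hamiltonian of Eq. 2.6 can be approximated numerically by slicing time into short intervals on which the phase is treated as constant. The sketch below is one simple way to do this under that assumption, and is not a method prescribed by the text. The function name `evolve`, the midpoint sampling, and the default number of slices are choices made only for this illustration.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
ID = np.eye(2, dtype=complex)

def evolve(delta, phase_fn, T, steps=2000):
    """Approximate U(T) for H(t) = (delta / 2) * sigma_{phi(t)}, with hbar = 1.

    Each slice of duration dt is replaced by the exact exponential of the
    Hamiltonian frozen at the midpoint phase (an approximation for varying phi).
    """
    dt = T / steps
    U = ID.copy()
    for k in range(steps):
        phi = phase_fn((k + 0.5) * dt)
        sigma_phi = np.cos(phi) * SX + np.sin(phi) * SY
        angle = delta * dt  # rotation angle accumulated during this slice
        slice_U = np.cos(angle / 2) * ID - 1j * np.sin(angle / 2) * sigma_phi
        U = slice_U @ U  # later times multiply on the left (time ordering)
    return U

# Example: a linearly swept phase, with delta playing the role of the unknown input.
U_sweep = evolve(delta=1.7, phase_fn=lambda t: 0.5 * t, T=2.0)
```

For a constant phase the slices commute and the product reduces exactly to a single rotation by angle $\delta T$, which is a convenient sanity check on the implementation.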
We may define for this single-qubit system a reasonable-looking model of analog quantum computation.

Definition 2.2 (Analog quantum computation with a single qubit).
Input: A real parameter $\delta \in \mathbb{R}$.
Program: Time-evolution described by the Schrödinger equation with a time-dependent Hamiltonian $\hat{H}(t; \delta) = \frac{\delta}{2}\hat{\sigma}_{\phi(t)}$ of dimension 2 described by a time-dependent real function $\phi(t)$ defined over $t \in [0, T]$.
Cost: The time complexity is $T > 0$, and the space complexity is 2.
Output: A unitary function $\hat{U}(\delta) = \mathcal{T}\exp\left[-i\int_0^T \hat{H}(t'; \delta)\,dt'\right]$ of $\delta$.

This is not the most general single-qubit Hamiltonian, but specialization ultimately enables the tractable design of $\hat{U}(\delta)$. We simplify this further by taking $\phi(t) = \phi$ to be piecewise-constant over time segments of duration $\tau$. This generates the primitive single-qubit rotation

$$\hat{R}_\phi(\theta) = e^{-i\frac{\theta}{2}\hat{\sigma}_\phi}, \quad \theta = \delta\tau, \tag{2.7}$$

which is periodic in $\theta$ with period $4\pi$. By partitioning the total time of evolution $T$ into $L = T/\tau$ discrete segments, the continuous-time evolution of Def. 2.2 is discretized into a product of $L$ rotations, each with the same rotation amplitude $\theta$, but with varying phases $\vec{\phi} = (\phi_1, \ldots, \phi_L)$. Though now operating with discrete-time quantum gates, this still leads to a model of quantum computation that is natural to the underlying physics.

Definition 2.3 (Discrete-time analog quantum computation with a single qubit).
Input: A real parameter $\theta \in \mathbb{R}$.
Program: A sequence of an integer number $L$ of unitary operators

$$\hat{U}(\theta) = \hat{R}_{\phi_L}(\theta)\hat{R}_{\phi_{L-1}}(\theta)\cdots\hat{R}_{\phi_1}(\theta), \tag{2.8}$$

of dimension 2 described by a vector of phases $\vec{\phi} \in \mathbb{R}^L$.
Cost: The gate complexity is $L > 0$, and the space complexity is 2.
Output: A unitary function $\hat{U}(\theta)$ of $\theta$.

Such products of single-qubit rotations are also widely known as composite pulse sequences [129, 45], or composite quantum gates. Despite the simplicity of the formulation, there is still enough hidden complexity within Eq. 2.8 to obtain highly non-trivial response functions $\hat{U}(\theta)$.
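The composite gate of Eq. 2.8 is straightforward to evaluate numerically, which is useful for experimenting with candidate phase sequences. The following sketch assumes the rotation convention of Eq. 2.7. The function names `rotation` and `composite_gate` and the example phases are hypothetical and introduced here only for illustration.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
ID = np.eye(2, dtype=complex)

def rotation(theta, phi):
    """R_phi(theta) = exp(-i * theta / 2 * sigma_phi), using sigma_phi^2 = 1."""
    sigma_phi = np.cos(phi) * SX + np.sin(phi) * SY
    return np.cos(theta / 2) * ID - 1j * np.sin(theta / 2) * sigma_phi

def composite_gate(theta, phases):
    """U(theta) = R_{phi_L}(theta) ... R_{phi_1}(theta) for phases = (phi_1, ..., phi_L)."""
    U = ID.copy()
    for phi in phases:  # phi_1 acts first, so later rotations multiply on the left
        U = rotation(theta, phi) @ U
    return U

# Arbitrary example phases (not an optimized sequence).
example_phases = [0.0, 1.2, -0.7]
U_example = composite_gate(0.4, example_phases)
```

When all phases are equal, the $L$ rotations share an axis and the composite gate collapses to a single rotation by $L\theta$. Non-trivial response functions arise only from varying the phases.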
We formalize the set of achievable response functions with a complexity class.

Definition 2.4 (Quantum signal processing with a single qubit). Let the complexity class $\text{QSP}_L$ be the set of all unitary functions $\hat{U}(\theta)$ that are the output of any $L$-gate program of Def. 2.3. Then QSP is the union of $\text{QSP}_L$ for all integer $L > 0$.

Unfortunately, a systematic understanding of what $\text{QSP}_L$ contains appears lacking, and the design of $\hat{U}(\theta)$ often proceeds through a brute-force optimization over the phases $\vec{\phi}$. However, one noteworthy step in this direction is the Shinnar-Le Roux algorithm [113, 104] and its refinements [60, 76, 49], which have so far been restricted to the field of magnetic resonance imaging. There, $\theta$ represents the amplitude of background magnetic fields and manifests as an off-resonant rotation. Given otherwise perfect and arbitrary single-spin control, this approach enables the efficient design of $\hat{U}(\theta)$ by a connection to finite-impulse response filters. Unfortunately, extending the concept to situations with different controls and additional restrictions, such as this case of on-resonant compensating pulse sequences, appears to have been difficult.

2.3.1 Representation of QSP

However, there are hints that members of $\text{QSP}_L$ are actually highly structured. By expanding $\hat{U}(\theta)$, the quantum response function has the form

$$\hat{U}(\theta) = \sum_{j=0}^{L} (-i)^j \sin^j(\theta/2)\cos^{L-j}(\theta/2)\,\hat{\Phi}_{L,j}, \tag{2.9}$$

where $\hat{\Phi}_{L,j} = \hat{\sigma}_x^j\left(\text{Re}[\Phi_{L,j}]\hat{1} + i\,\text{Im}[\Phi_{L,j}]\hat{\sigma}_z\right)$, and the phase sums $\Phi_{L,j}$ are defined through the recurrence [87]

$$\Phi_{k,j} = \Phi_{k-1,j} + \Phi_{k-1,j-1}\,e^{i(-1)^{j+1}\phi_k}, \quad \Phi_{0,0} = 1, \quad \Phi_{0,j\neq 0} = 0, \tag{2.10}$$

performed over $j = 0, 1, \ldots, k$, then $k = 1, 2, \ldots, L$. Now, observe that $\hat{U}(\theta)$ is a polynomial of degree $L$ in $x = \cos(\theta/2)$ and $y = \sin(\theta/2)$ with a particularly elegant representation. Using the trigonometric relation $x^2 + y^2 = 1$, $\hat{U}(\theta)$ has the form

$$\hat{U}(\theta) = \begin{cases} A(x)\hat{1} + iB(x)\hat{\sigma}_z + iC(y)\hat{\sigma}_x + iD(y)\hat{\sigma}_y, & L \text{ odd}, \\ A(x)\hat{1} + iB(x)\hat{\sigma}_z + ixC(y)\hat{\sigma}_x + ixD(y)\hat{\sigma}_y, & L \text{ even}, \end{cases} \tag{2.11}$$

where $A(x), B(x), C(y), D(y)$ are polynomials of degree at most $L$ with coefficients $a_k, b_k, c_k, d_k$ ($k = 0, 1, \ldots, L$) respectively.
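The structure of Eq. 2.11 can be probed numerically without computing the polynomials explicitly. The sketch below extracts the four real quadratures of a composite gate by taking traces against the Pauli matrices, then checks the behaviour under $\theta \to 2\pi - \theta$, which sends $x \to -x$ while leaving $y$ unchanged. The function names and the random example phases are assumptions made for this illustration.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
ID = np.eye(2, dtype=complex)

def composite_gate(theta, phases):
    """U(theta) = R_{phi_L}(theta) ... R_{phi_1}(theta), as in Eq. 2.8."""
    U = ID.copy()
    for phi in phases:
        sigma_phi = np.cos(phi) * SX + np.sin(phi) * SY
        U = (np.cos(theta / 2) * ID - 1j * np.sin(theta / 2) * sigma_phi) @ U
    return U

def quadratures(U):
    """Real (a, b, c, d) with U = a*1 + i*b*sigma_z + i*c*sigma_x + i*d*sigma_y."""
    a = np.trace(U).real / 2
    b = np.trace(SZ @ U).imag / 2
    c = np.trace(SX @ U).imag / 2
    d = np.trace(SY @ U).imag / 2
    return np.array([a, b, c, d])

rng = np.random.default_rng(0)
theta = 0.9
odd_phases = rng.uniform(-np.pi, np.pi, size=5)   # L = 5
even_phases = rng.uniform(-np.pi, np.pi, size=4)  # L = 4

q_odd = quadratures(composite_gate(theta, odd_phases))
q_odd_flipped = quadratures(composite_gate(2 * np.pi - theta, odd_phases))
q_even = quadratures(composite_gate(theta, even_phases))
q_even_flipped = quadratures(composite_gate(2 * np.pi - theta, even_phases))
```

For odd $L$ the first two quadratures change sign and the last two do not, whereas for even $L$ the pattern is reversed. This is the behaviour expected if $A, B$ depend only on $x$ with the parity of $L$, and the $\hat{\sigma}_x, \hat{\sigma}_y$ quadratures are $C(y), D(y)$ or $xC(y), xD(y)$.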
In the following, $A, B, C, D$ without arguments are understood to be functions of the $x, y$ seen in Eq. 2.11. As the tuple $(A, B, C, D)$ is an equivalent representation of $\hat{U}(\theta)$, we refer to both interchangeably. In particular, achievable tuples are those that can be realized by some composite gate of Eq. 2.9:

Definition 2.5 (Achievable polynomial tuples). A tuple of polynomials $(A, B, C, D)$ is achievable if $\exists L \in \mathbb{N}, \vec{\phi} \in \mathbb{R}^L$ s.t. $\hat{U}(\theta) = \hat{R}_{\phi_L}(\theta)\hat{R}_{\phi_{L-1}}(\theta)\cdots\hat{R}_{\phi_1}(\theta)$ has the form of Eq. 2.11.

We are often interested in only a few components of $(A, B, C, D)$. For example, the partial tuple $(A, \cdot, C, \cdot)$ fully defines the gate fidelity response function $F_\chi(\theta) = \frac{1}{4}\left|\text{Tr}[\hat{R}_0^\dagger(\chi)\hat{U}(\theta)]\right|^2$ with respect to some target gate $\hat{R}_0(\chi)$,

$$F_\chi(\theta) = \begin{cases} \left(\cos\left(\frac{\chi}{2}\right)A - \sin\left(\frac{\chi}{2}\right)C\right)^2, & L \text{ odd}, \\ \left(\cos\left(\frac{\chi}{2}\right)A - x\sin\left(\frac{\chi}{2}\right)C\right)^2, & L \text{ even}. \end{cases} \tag{2.12}$$

Similarly, $(A, B, \cdot, \cdot)$ or $(\cdot, \cdot, C, D)$ fully defines the transition probability response function $p(\theta) = |\langle 0|\hat{U}|1\rangle|^2$,

$$p(\theta) = 1 - A^2 - B^2 = (C^2 + D^2) \times \begin{cases} 1, & L \text{ odd}, \\ x^2, & L \text{ even}. \end{cases} \tag{2.13}$$

We refer to a tuple with $n$ empty slots as an $n$-partial tuple. An $n$-partial tuple is achievable if it is consistent with some achievable tuple.

A brute-force approach to composite gate design is minimizing an objective function for $\hat{U}(\theta)$ over the space $L \in \mathbb{N}$, $\vec{\phi} \in \mathbb{R}^L$. Though useful examples have been discovered in this manner, such an approach is highly unappealing. In addition to being inefficient with a runtime $\mathcal{O}(e^L)$, there is no guarantee that a globally optimal solution will be found. Furthermore, the procedure provides little of the necessary insight into possible $\hat{U}(\theta)$ for envisioning further novel applications.

2.4 Systematic and efficient design of optimal composite gates

The functional form of $\hat{U}(\theta)$ hints at a powerful methodology for composite gate design via choices of the polynomials $(A, B, C, D)$ of degree $L$.
This ambition must solve long-standing problems:
(P1) An insightful characterization of achievable $(A, B, C, D)$ to eliminate the traditional guesswork in envisioning novel quantum response functions and their dependence on $\vec{\phi}$.
(P2) An efficient algorithm to compute the optimal $\vec{\phi}$ implementing an achievable $(A, B, C, D)$, in contrast to the intractable random search in time $\mathcal{O}(e^L)$ of the current state of the art [88].
(P3) An efficient algorithm to compute an achievable $(A, B, C, D)$ from achievable partial tuples, e.g. $(A, \cdot, C, \cdot)$, as might be encountered with common objective functions for Eqs. 2.12, 2.13.
(P4) An efficient algorithm for computing achievable partial tuples optimal for some objective function.

Our main technical advances are precisely the resolution of problems (P1-P4). We describe in a simple and intuitive manner the set of achievable $(A, B, C, D)$, and provide efficient algorithms for solving what have traditionally been the hardest aspects of composite gate design. In particular, a beautiful connection is made with the historic field of discrete-time signal processing that allows us to inherit much of its prior work in polynomial design. In this manner, the inspired art of composite gates is transformed into a systematic science. Optimal composite gates are simply polynomials optimal for the objective function, and these polynomials can be found efficiently.

2.4.1 Polynomial characterization of quantum response functions

We characterize here achievable choices of quantum response functions $(A, B, C, D)$ in a manner independent of $\vec{\phi}$, hence resolving problem (P1). By providing insight into the forms of possible $\hat{U}(\theta)$, we also obtain a quantitative explanation for the remarkable versatility of composite gates. Achievability constraints on the polynomials $(A, B, C, D)$ are as follows:

Theorem 2.6 (Achievable tuples). A tuple of polynomials $(A, B, C, D)$ of degree at most $L$ is achievable iff all the following are true: (1) $A, B, C, D$ are real.
(2) $A(1) = 1$ or $B(1) = 0$.
(3) $A, B, C, D$ are odd for $L$ odd; $A, B$ are even and $C, D$ are odd for $L$ even.
(4) $1 = A(x)^2 + B(x)^2 + C(y)^2 + D(y)^2$ for $L$ odd; $1 = A(x)^2 + B(x)^2 + x^2C(y)^2 + x^2D(y)^2$ for $L$ even.

Proof. In the forward direction, (1) and (3) are true by applying the trigonometric substitution $x^2 + y^2 = 1$ in Eq. 2.9 and collecting coefficients of $\hat{1}, \hat{\sigma}_{x,y,z}$. (2) is true as $\hat{U}(0) = \hat{1}$ in Eq. 2.9. (4) is true as $\hat{U}$ is unitary, so $\hat{U}^\dagger\hat{U} = \hat{1}$, and $\frac{1}{2}\text{Tr}[\hat{U}^\dagger\hat{U}]$ evaluated via Eq. 2.11 produces

$$x^2 + y^2 = 1 = \begin{cases} A(x)^2 + B(x)^2 + C(y)^2 + D(y)^2, & L \text{ odd}, \\ A(x)^2 + B(x)^2 + x^2C(y)^2 + x^2D(y)^2, & L \text{ even}. \end{cases} \tag{2.14}$$

In the reverse direction, we need to show that any $(A, B, C, D)$ satisfying (1-4) is achievable in the sense of Def. 2.5. We leave these steps to Lem. 2.12. □

Conditions (1-4) for achievable $(A, B, C, D)$ appear fairly general, which allows for great flexibility in choosing arbitrary response functions. They are also understandable and intuitive. A characterization of achievable partial tuples is also useful. Not all quadratures of $\hat{U}(\theta)$ might be relevant to an objective function, and optimizing over a subset of $(A, B, C, D)$ could be easier. In the following, we examine how the unitarity constraint of condition (4) is weakened for all possible 2-partial tuples.

Theorem 2.7 (Achievable 2-partial tuples). Assuming $A, B, C, D$ satisfy conditions (1-3) of Thm. 2.6,
(1) $(A, \cdot, C, \cdot)$, $(A, \cdot, \cdot, C)$ is achievable iff
(1a) $\forall\theta \in \mathbb{R}$: $A(x)^2 + C(y)^2 \le 1$ for $L$ odd; $A(x)^2 + x^2C(y)^2 \le 1$ for $L$ even.
(2) $(\cdot, B, C, \cdot)$, $(\cdot, B, \cdot, C)$ is achievable iff
(2a) $\forall\theta \in \mathbb{R}$: $B(x)^2 + C(y)^2 \le 1$ for $L$ odd; $B(x)^2 + x^2C(y)^2 \le 1$ for $L$ even.
(3) $(A, B, \cdot, \cdot)$ is achievable iff
(3a) $\forall\theta \in \mathbb{R}$, $A(x)^2 + B(x)^2 \le 1$, and
(3b) $\forall x \ge 1$, $A(x)^2 + B(x)^2 \ge 1$, and
(3c) $\forall L$ even, $\forall x \ge 0$, $A(ix)^2 + B(ix)^2 \ge 1$.
(4) $(\cdot, \cdot, C, D)$ is achievable iff
(4a) $\forall\theta \in \mathbb{R}$: $C(y)^2 + D(y)^2 \le 1$ for $L$ odd; $x^2\left(C(y)^2 + D(y)^2\right) \le 1$ for $L$ even, and
(4b) $\forall L$ odd, $\forall y \ge 1$, $C(y)^2 + D(y)^2 \ge 1$.

Proof. In the forward direction, all the (a) conditions are true from Eq.
2.14 using the fact that $A, B, C, D$ are all real, hence their squares are positive. (3b) is true by considering Eq. 2.14 with the substitution $x = \sqrt{\lambda}$, $y = \sqrt{1 - \lambda}$, and computing $1 - A^2(\sqrt{\lambda}) - B^2(\sqrt{\lambda})$. Note that the $x, y$ here are complex. Using the odd/even symmetry of $C, D$, the RHS factorizes into a positive term times $(1 - \lambda)$ or $\lambda(1 - \lambda)$. This is negative $\forall \lambda > 1$, so $A^2(\sqrt{\lambda}) + B^2(\sqrt{\lambda}) \ge 1$. (3c) is similarly proven by considering $\lambda < 0$. The RHS factorizes into $\lambda(1 - \lambda)$ and a positive term. (4b) is proven with the substitution $x = \sqrt{1 - \lambda}$, $y = \sqrt{\lambda}$ and by considering $\lambda > 1$. In the reverse direction, we need to show that assuming these conditions enables the computation of an achievable $(A, B, C, D)$. We leave these steps to Lems. 2.13, 2.14. □

Note that $C, D$ are interchangeable in Thm. 2.7 as their constraints in Thm. 2.6 are identical. We also characterize all possible 3-partial tuples.

Theorem 2.8 (Achievable 3-partial tuples). Assuming $A, B, C, D$ satisfy conditions (1-3) of Thm. 2.6, the following are achievable under their respective conditions:
(1) $(A, \cdot, \cdot, \cdot)$ iff $\forall \theta \in \mathbb{R}$, $A(x)^2 \le 1$.
(2) $(\cdot, B, \cdot, \cdot)$ iff $\forall \theta \in \mathbb{R}$, $B(x)^2 \le 1$.
(3) $(\cdot, \cdot, C, \cdot)$ iff $\forall \theta \in \mathbb{R}$, $C^2(y) \le 1$ for $L$ odd, and $x^2 C^2(y) \le 1$ for $L$ even.
(4) $(\cdot, \cdot, \cdot, D)$ iff $\forall \theta \in \mathbb{R}$, $D^2(y) \le 1$ for $L$ odd, and $x^2 D^2(y) \le 1$ for $L$ even.

Proof. The forward direction follows by definition and from Eq. 2.14, where $A, B, C, D$ are all real $\forall \theta \in \mathbb{R}$, hence their squares are positive. The reverse direction is true from setting the unspecified polynomial to 0 in one of the 2-partial tuples (1), (2) in Thm. 2.7. □

These simple characterizations show how one can in principle encode almost any arbitrary desired function into quadratures of $\hat{U}(\theta)$. Consider $(A, \cdot, \cdot, \cdot)$, which aside from symmetry and $A(1) = 1$, only needs to satisfy $\forall |x| \le 1$, $A^2(x) \le 1$.
The famous Stone-Weierstrass theorem [119] assures us that $A(x)$ of sufficiently large degree $L$ can approximate arbitrarily well any continuous real function that satisfies these constraints on the interval $|x| \le 1$. This ability to create almost arbitrary quantum response functions helps explain the applicability of composite gates to many diverse problems.

2.4.2 Fourier characterization of quantum response functions

Whereas the achievable quantum response functions in Thms. 2.6, 2.7, and 2.8 are characterized in terms of polynomial functions of $x = \cos(\theta/2)$ and $y = \sin(\theta/2)$, they may also be expressed as Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$, which we denote with scripted letters. In some cases, this Fourier representation is simpler than that as polynomials. We rewrite Eq. 2.11 as
\[ \hat{U}(\theta) = \begin{cases} A(x)\hat{1} + iB(x)\hat{\sigma}_z + iC(y)\hat{\sigma}_x + iD(y)\hat{\sigma}_y, & L \text{ odd}, \\ A(x)\hat{1} + iB(x)\hat{\sigma}_z + ixC(y)\hat{\sigma}_x + ixD(y)\hat{\sigma}_y, & L \text{ even}, \end{cases} \;=\; \mathcal{A}(\theta)\hat{1} + i\mathcal{B}(\theta)\hat{\sigma}_z + i\mathcal{C}(\theta)\hat{\sigma}_x + i\mathcal{D}(\theta)\hat{\sigma}_y. \tag{2.15} \]
Eq. 2.15 defines the relationship between the polynomials $(A, B, C, D)$ and the Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$. This relationship may be worked out by a straightforward application of Chebyshev polynomials $T_n(\cos\theta) = \cos(n\theta)$ and $U_n(\cos\theta) = \sin((n+1)\theta)/\sin\theta$ of the first and second kind respectively. In the case of $L$ even,
\[ \mathcal{A}(\theta) = \sum_{j=0}^{L/2} a'_j \cos(j\theta) = \sum_{j=0}^{L/2} a'_j T_{2j}(\cos(\theta/2)) = \sum_{j=0}^{L/2} a_{2j} \cos^{2j}(\theta/2) = \sum_{j=0}^{L/2} a_{2j} x^{2j}, \]
\[ \mathcal{C}(\theta) = \sum_{j=1}^{L/2} c'_j \sin(j\theta) = \sum_{j=1}^{L/2} c'_j \sin(\theta/2)\, U_{2j-1}(\cos(\theta/2)) = x \sum_{j=1}^{L/2} c_{2j-1} y^{2j-1}, \tag{2.16} \]
where we have made use of the identity $x^2 + y^2 = 1$ and the fact that $T_n, U_n$ are even or odd polynomials depending on $n$. We use the primed variables $(a'_j, b'_j, c'_j, d'_j)$ to indicate Fourier coefficients. Note that $\mathcal{A}(\theta)$ is an even Fourier series in $\theta$ with periodicity $\theta \in [0, 2\pi)$ and degree $L/2$, and $\mathcal{C}(\theta)$ is an odd Fourier series in $\theta$ with periodicity $\theta \in [0, 2\pi)$ and degree $L/2$.
In the case of $L$ odd,
\[ \mathcal{A}(\theta) = \sum_{j=0}^{(L-1)/2} a'_j \cos\left(\frac{(2j+1)\theta}{2}\right) = \sum_{j=0}^{(L-1)/2} a'_j T_{2j+1}(\cos(\theta/2)) = \sum_{j=0}^{(L-1)/2} a_{2j+1} x^{2j+1}, \]
\[ \mathcal{C}(\theta) = \sum_{j=0}^{(L-1)/2} c'_j \sin\left(\frac{(2j+1)\theta}{2}\right) = \sum_{j=0}^{(L-1)/2} c'_j \sin(\theta/2)\, U_{2j}(\cos(\theta/2)) = \sum_{j=0}^{(L-1)/2} c_{2j+1} y^{2j+1}. \tag{2.17} \]
Note that $\mathcal{A}(\theta)$ is an even Fourier series in $\theta/2$ with periodicity $\theta \in [0, 4\pi)$, degree $(L-1)/2$, and is odd about $\theta = \pi$, that is, $\mathcal{A}(\pi + \theta) = -\mathcal{A}(\pi - \theta)$. $\mathcal{C}(\theta)$ is an odd Fourier series in $\theta/2$ with periodicity $\theta \in [0, 4\pi)$, degree $(L-1)/2$, and is even about $\theta = \pi$, that is, $\mathcal{C}(\pi + \theta) = \mathcal{C}(\pi - \theta)$. The derivations for $\mathcal{B}$ and $\mathcal{D}$ are identical.

One may similarly derive a map from polynomials to Fourier series by using De Moivre's formula and a binomial expansion. For instance, $\cos^3(\theta/2) = \left(\frac{e^{i\theta/2} + e^{-i\theta/2}}{2}\right)^3$. This is a straightforward, though tedious, calculation and so we omit it. The net effect is that any set of real coefficients $a_j, b_j, c_j, d_j$ may be solved for the same number of real $a'_j, b'_j, c'_j, d'_j$, and vice-versa. This allows us to prove results equivalent to Thms. 2.6, 2.7, and 2.8.

Corollary 2.9 (Achievable Fourier tuples). A tuple of Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ of finite degree is achievable iff all the following are true:
(1) $\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D}$ are real.
(2) $\mathcal{A}(0) = 1$ or $\mathcal{B}(0) = 0$.
(3) $\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D}$ are of the form described by Eqs. 2.16 and 2.17.
(4) $1 = \mathcal{A}(\theta)^2 + \mathcal{B}(\theta)^2 + \mathcal{C}(\theta)^2 + \mathcal{D}(\theta)^2$.

Proof. We map the conditions of Thm. 2.6 to the ones here. In both directions, (1-3) are true by the above discussion for the map between Fourier series and polynomials. Condition (4) is true by the definition in Eq. 2.15. □

Corollary 2.10 (Achievable 2-partial Fourier tuples). Assuming $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ satisfy conditions (1-3) of Cor. 2.9,
(1) $(\mathcal{X}, \cdot, \mathcal{Y}, \cdot)$, $(\mathcal{X}, \cdot, \cdot, \mathcal{Y})$, $(\cdot, \mathcal{X}, \mathcal{Y}, \cdot)$, $(\cdot, \mathcal{X}, \cdot, \mathcal{Y})$ are achievable iff
(1a) $\forall \theta \in \mathbb{R}$, $\mathcal{X}(\theta)^2 + \mathcal{Y}(\theta)^2 \le 1$.
(2) $(\mathcal{A}, \mathcal{B}, \cdot, \cdot)$ is achievable iff
(2a) $\forall \theta \in \mathbb{R}$, $\mathcal{A}(\theta)^2 + \mathcal{B}(\theta)^2 \le 1$, and
(2b) $\forall L$ even, $\forall |x| \ge 1$, $\left(\sum_{j=0}^{L/2} a'_j T_j(x)\right)^2 + \left(\sum_{j=0}^{L/2} b'_j T_j(x)\right)^2 \ge 1$, and
(2c) $\forall L$ odd, $\forall x \ge 1$, $\left(\sum_{j=0}^{(L-1)/2} a'_j T_{2j+1}(x)\right)^2 + \left(\sum_{j=0}^{(L-1)/2} b'_j T_{2j+1}(x)\right)^2 \ge 1$.
(3) $(\cdot, \cdot, \mathcal{C}, \mathcal{D})$ is achievable iff
(3a) $\forall \theta \in \mathbb{R}$, $\mathcal{C}(\theta)^2 + \mathcal{D}(\theta)^2 \le 1$, and
(3b) $\forall L$ odd, $\forall y \ge 1$, $C(y)^2 + D(y)^2 \ge 1$.

Proof. We map the conditions of Thm. 2.7 to the ones here and vice-versa. The only nontrivial changes are the map from (3b) and (3c) of Thm. 2.7 to (2b) here, which we prove in detail. The others follow directly from the equivalence between polynomials $(A, B, C, D)$ and Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ in Eqs. 2.16 and 2.17. Note that condition (4b) in Thm. 2.7 is left unchanged as its expression in terms of Fourier coefficients is not particularly illuminating. When $L$ is even, we use Eq. 2.16 and the semigroup property of Chebyshev polynomials $T_n(T_m(x)) = T_{nm}(x)$ to express $A(x) = \mathcal{A}(\theta)$, and similarly for $B(x)$, as
\[ A(x) = \sum_{j \text{ even} = 0}^{L} a_j x^j = \mathcal{A}(\theta) = \sum_{j=0}^{L/2} a'_j T_{2j}(x) = \sum_{j=0}^{L/2} a'_j T_j(T_2(x)). \tag{2.18} \]
Using $T_2(x) = 2x^2 - 1$, we have $T_2 : \{ix \mid x \ge 0\} \to \{x \mid x \le -1\}$ and $T_2 : \{x \mid x \ge 1\} \to \{x \mid x \ge 1\}$. Thus in the forward direction, (3b) and (3c) in Thm. 2.7 imply (2b) here. In the opposite direction, let us assume (2b) to be true. We may then substitute a re-parameterization of the domains $\{x \mid x \le -1\} = \{T_2(ix) \mid x \ge 0\}$ and $\{x \mid x \ge 1\} = \{T_2(x) \mid x \ge 1\}$ to recover (3b) and (3c). □

Corollary 2.11 (Achievable 3-partial Fourier tuples). Assuming $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ satisfy conditions (1-3) of Cor. 2.9, $(\mathcal{X}, \cdot, \cdot, \cdot)$, $(\cdot, \mathcal{X}, \cdot, \cdot)$, $(\cdot, \cdot, \mathcal{X}, \cdot)$, $(\cdot, \cdot, \cdot, \mathcal{X})$ are achievable iff $\forall \theta \in \mathbb{R}$, $\mathcal{X}(\theta)^2 \le 1$.

Proof. This follows from Thm. 2.8 and the equivalence between polynomials $(A, B, C, D)$ and Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ in Eqs. 2.16 and 2.17. □

2.4.3 Implementation of quantum response functions

Unleashing the potential of arbitrarily sophisticated choices of achievable $(A, B, C, D)$ requires an efficient computation of their implementation $\vec{\phi}$. It is clear that a random search is wholly inadequate as the degree $L$ could be very large. Nevertheless, achievability leads to a certain structure that resolves this problem (P2).
This is encapsulated in the following lemma, which is proven constructively and furnishes the reverse direction proof of Thm. 2.6. Lemma 2.12 (Optimal quantum response compilation). Exactly L phases RL are required to implement an achievable (A, B, C, D) of degree at most L, and these L phases can be computed in time ((poly(L)). Proof. A minimum of L phases j are required to implement a given (A, B, C, D) of degree at most L as each application of R, (6) only increases the degree of (A, B, C, D) by one. We now show that (A, B, C, D) can be implemented with at most L phases q. Due to the even/odd symmetry of real A, B, C, D from Thm. 2.6 conditions (1) and (3), we can compute its unique phase sum representation in Eq. 2.9 via the invertible transformation dn)(L(-n)/2J), L(ic?(-n/2 (L,j = ij I /2 n-O (an+ ibn) ( j odd, j even. (2.19) Let us take the ansatz U(6) = N4 ,(6)V(6) where f(6) is unitary and V(0) = 1 as (A, B, C, D) represents a unitary from Thm. 2.6 condition (4). Thus f(0) also has a phase sum representation 4L-1,j. These two phase sums are related by the linear map of Eq. 2.10, with inverse 4 -e-(~l)ji0L j+ kodd, i, j + k even. 'L~k bL-1,j k=O j + k odd, (2.20) By choosing 1 odd eiOL Zk= C2FL/21-1 + id 2 FL/ 2 -1 L,k )Lk (_1)FL/2] even (aL + ibL) (2.21) we satisfy the necessary condition (DL-1,L = 0 from Eq. 2.10. In particular, /L is real, as Eq. 2.14 has the trailing term ((a2 + b2,) - (cFL/2 y + d2L/ 2 -) sin 2 L (0/2) = 0. Hence the RHS of Eq. 2.21 has absolute value 1. By recursively reducing the degree of V(6), we obtain all L phases #. The terminal case at L = 1 must be consistent with Eq. 2.10 where o,o = 1. When evaluated with Eq. 2.19, 2.20, this is satisfied only if A(1) = 1 (Thm. 2.6 condition (2)), which is true for achievable (A, B, C, D). All steps in this procedure can be computed in time O(poly(L)), and there are only L recursions, leading to a runtime of O(poly(L)). 
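The forward direction of this correspondence, from phases to quadratures, is easy to check numerically. The following is a minimal sketch, not the compilation algorithm of Lem. 2.12. It assumes the primitive gate convention $\hat{R}_\phi(\theta) = \exp[-i\frac{\theta}{2}(\cos\phi\,\hat{\sigma}_x + \sin\phi\,\hat{\sigma}_y)]$ with later phases applied later, since Eq. 2.9 is not reproduced in this part of the text. The function names are illustrative only.

```python
import cmath
import math

def primitive_gate(theta, phi):
    """Assumed convention: exp(-i*theta/2*(cos(phi)*sx + sin(phi)*sy)) as a 2x2 list."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s * cmath.exp(-1j * phi)],
            [-1j * s * cmath.exp(1j * phi), c]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def composite_gate(theta, phases):
    """Product of primitive gates; each new gate multiplies on the left (assumed ordering)."""
    u = [[1, 0], [0, 1]]
    for phi in phases:
        u = matmul2(primitive_gate(theta, phi), u)
    return u

def quadratures(u):
    """Coefficients in U = a*1 + i*b*sz + i*c*sx + i*d*sy (real for SU(2) matrices)."""
    a = (u[0][0] + u[1][1]) / 2
    b = (u[0][0] - u[1][1]) / 2j
    c = (u[0][1] + u[1][0]) / 2j
    d = (u[0][1] - u[1][0]) / 2
    return a, b, c, d

# Example: three arbitrary phases (L = 3) evaluated at one rotation angle.
phases = [0.3, 1.1, 2.0]
a, b, c, d = quadratures(composite_gate(1.234, phases))
```

Evaluating `quadratures` over a range of `theta` gives the Fourier-series form of the response function, which can then be compared against conditions such as the unit sum of squares in Cor. 2.9.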
Note that one may derive a decomposition directly from Fourier series $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$ to phases $\vec{\phi}$ without going through the intermediary of polynomials $(A, B, C, D)$; we leave this to future work.

2.4.4 Computation of quantum response functions

A consequence of Lem. 2.12 is that designing a composite gate is no more difficult than finding the $(A, B, C, D)$ to describe the quantum response function $\hat{U}(\theta)$. Optimizing $(A, B, C, D)$ for some objective function is far more intuitive than the prior art of a random search over $\vec{\phi}$. However, this is still a difficult problem. The unitary constraint Eq. 2.14 represents a system of quadratic multinomial equations that would have to be solved at each step of the optimization to obtain an achievable $(A, B, C, D)$. Solving such systems is in general an NP-complete task.

This is the essence of problem (P3): it would be much easier to optimize a subset of $(A, B, C, D)$, and doing so is often the problem of practical interest anyway. This subset optimization is illustrated by the response functions $F_\phi(\theta)$, $p(\theta)$ of Eqs. 2.12, 2.13, which depend on only two polynomials. Optimizing just these for some objective function offers more freedom as the unitary constraint Eq. 2.14 is weakened to that of Thm. 2.7. Ultimately, we must compute some achievable $(A, B, C, D)$ from a partial specification in order to find the phases $\vec{\phi}$.

Fortunately, the structure of achievable partial tuples can be exploited to derive algorithms analogous to prior art [113] based on polynomial sum-of-squares problems [91], but specialized to the symmetries of Thm. 2.7. We present results for $(A, B, \cdot, \cdot)$, $(A, \cdot, C, \cdot)$ of odd degree and show how they apply to all achievable 2-partial tuples. As these primarily serve to show that the necessary conditions in Thms. 2.6, 2.7, 2.8 are also sufficient, the details of the proofs for Lems. 2.13, 2.14, which also furnish constructive algorithms for computing $(A, B, C, D)$ from partial tuples, may be skipped by the casual reader.
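The core of such sum-of-squares constructions can be sketched in a few lines. A real polynomial that is a product of factors $(\lambda - \mathrm{Re}[s])^2 + \mathrm{Im}[s]^2$ equals $|q(\lambda)|^2$ for $q(\lambda) = \prod_s (\lambda - s)$, so the real and imaginary parts of the coefficients of $q$ give one valid pair $(g, h)$ with $g^2 + h^2$ equal to the product. The sketch below assumes the complex roots are already known (root finding is omitted), ignores the symmetry bookkeeping of the lemmas, and uses illustrative function names.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_eval(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) by Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def sum_of_squares_from_roots(upper_roots):
    """Return real coefficient lists (g, h) with
    prod_s ((x - Re s)^2 + (Im s)^2) == g(x)^2 + h(x)^2 for real x."""
    q = [1 + 0j]
    for s in upper_roots:
        q = poly_mul(q, [-s, 1])          # multiply by (x - s)
    g = [c.real for c in q]
    h = [c.imag for c in q]
    return g, h

# Example with two hypothetical complex roots.
roots = [1 + 2j, -0.5 + 0.3j]
g, h = sum_of_squares_from_roots(roots)
```

Replacing any root by its complex conjugate leaves the product unchanged but yields a different $(g, h)$, which mirrors the freedom of sign choices in the two-squares identity.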
Lemma 2.13 (Transition probability sum-of-squares). $\forall$ 2-partial tuples $(A, B, \cdot, \cdot)$ of odd degree at most $L$ that satisfy conditions (1-3) of Thm. 2.6 and (3a, 3b) of Thm. 2.7, $\exists$ achievable $(A, B, C, D)$ of degree at most $L$ that can be computed in time $\mathrm{poly}(L)$.

Proof. Consider the polynomial of degree at most $L$
\[ f(\lambda) = 1 - A^2(\sqrt{1 - \lambda}) - B^2(\sqrt{1 - \lambda}), \quad \lambda \in \mathbb{R}, \tag{2.22} \]
with roots $S = \{s \mid f(s) = 0\} \in \mathbb{C}^L$ ($S$ contains duplicates if a root is degenerate). Since $A, B$ are odd polynomials, $f(\lambda)$ is real for all real $\lambda$. Because $f(\lambda)$ is real, complex roots $s, s^*$ occur in pairs. Thus we can group subsets of $S$ without loss of generality as:
\[ S_0 = \{s \in S \mid s = 0\}, \quad S_c = \{s \in S \mid \mathrm{Im}[s] > 0\}, \quad S_r = \{s \in S \mid \mathrm{Re}[s] \ne 0 \wedge \mathrm{Im}[s] = 0\}. \tag{2.23} \]
Observe that $S_{0,r}$ are real, and $S_c$ is complex. Thus
\[ f(\lambda) = K^2 \lambda^{|S_0|} \prod_{s \in S_r} (\lambda - s) \prod_{s \in S_c} \left((\lambda - \mathrm{Re}[s])^2 + \mathrm{Im}[s]^2\right), \tag{2.24} \]
with scale constant $K \in \mathbb{R}$. Using (3b), $f(\lambda) \le 0$, $\forall \lambda < 0$. Hence, all negative roots in $S_r$ occur with even multiplicity. Using (3a), $f(\lambda) \in [0, 1]$, $\forall \lambda \in [0, 1]$. As $f(\lambda)$ changes sign at $\lambda = 0$, $|S_0|$ is odd. Using the oddness of $A, B$, $f(\lambda) \ge 1$, $\forall \lambda > 1$. Since $f(\lambda) \ge 0$, $\forall \lambda \ge 0$, all positive roots in $S_r$ occur with even multiplicity. Thus, all real roots excluding $s = 0$ occur with even multiplicity. By repeated application of the two-squares identity
\[ (r^2 + s^2)(t^2 + u^2) = (rt \pm su)^2 + (ru \mp st)^2, \tag{2.25} \]
the complex factors can be simplified like
\[ \prod_{s \in S_c} \left((\lambda - \mathrm{Re}[s])^2 + \mathrm{Im}[s]^2\right) = g^2(\lambda) + h^2(\lambda), \tag{2.26} \]
where $g, h$ are real polynomials in $\lambda$. Thus $f(\lambda) = C^2(\sqrt{\lambda}) + D^2(\sqrt{\lambda})$, where
\[ \begin{Bmatrix} C(y) \\ D(y) \end{Bmatrix} = K y^{|S_0|} \prod_{s \in S_r} (y^2 - s)^{1/2} \begin{Bmatrix} g(y^2) \\ h(y^2) \end{Bmatrix}, \tag{2.27} \]
and $C, D$ are odd real polynomials of degree at most $L$. Note that different choices of signs in Eq. 2.25 generate a finite number of different valid solutions. Computing the roots of $f(\lambda)$ is the most difficult step of this algorithm, but can be done in time $O(\mathrm{poly}(L))$ [97]. □

The proof for even $L$, and tuples $(\cdot, \cdot, C, D)$, carries through with minor modification. The stated conditions in Thm.
2.7 guarantee that the various factors of $\lambda$, $(1 - \lambda)$ necessary for the correct symmetry of the unspecified polynomials occur with the right multiplicity, and that all other real roots occur with even multiplicity. Some additional processing for the $(\cdot, \cdot, C, D)$ case is required as the output $(A, B, C, D)$ is not guaranteed to satisfy $A(1) = 1$. However, $A(1)^2 + B(1)^2 = 1$ is still true, so by computing $\gamma = \mathrm{Arg}[A(1) + iB(1)]$, we can form an achievable $(A\cos\gamma + B\sin\gamma,\; B\cos\gamma - A\sin\gamma,\; C,\; D)$. We now present the analogous algorithm for $(A, \cdot, C, \cdot)$.

Lemma 2.14 (Fidelity response sum-of-squares). $\forall$ 2-partial tuples $(A, \cdot, C, \cdot)$ of odd degree at most $L$ that satisfy conditions (1-3) of Thm. 2.6 and (1a) of Thm. 2.7, $\exists$ achievable $(A, B, C, D)$ of degree at most $L$ that can be computed in time $\mathrm{poly}(L)$.

Proof. With the Weierstrass substitution $\forall t \in \mathbb{R}$, $x = (1 - t^2)/(1 + t^2)$, $y = 2t/(1 + t^2)$, define the real polynomials
\[ (\tilde{A}(t), \tilde{B}(t), \tilde{C}(t), \tilde{D}(t)) = (1 + t^2)^L (A, B, C, D). \tag{2.28} \]
These polynomials have extremely useful symmetries, which we indicate with angled brackets $\langle \cdot \rangle$. $\langle \tilde{A} \rangle = \langle \tilde{B} \rangle = \langle EN \rangle$ are Even (E) aNtipalindromes (N), while $\langle \tilde{C} \rangle = \langle \tilde{D} \rangle = \langle OP \rangle$ are Odd (O) Palindromes (P). Antipalindromes satisfy $\tilde{A}(t) = -t^{2L}\tilde{A}(t^{-1})$, whereas palindromes satisfy $\tilde{C}(t) = t^{2L}\tilde{C}(t^{-1})$. Note that $\langle E \rangle, \langle O \rangle$ and $\langle P \rangle, \langle N \rangle$ polynomials with multiplication form a group isomorphic to $\mathbb{Z}_2 \times \mathbb{Z}_2$. For example, $\langle EN \rangle \langle OP \rangle = \langle ON \rangle$. Consider the positive, palindromic polynomial
\[ f(t) = (1 + t^2)^{2L} - \tilde{A}^2(t) - \tilde{C}^2(t) = K^2 \prod_{s \in S} (t - s), \tag{2.29} \]
with scale constant $K \in \mathbb{R}$, and roots $S = \{s \mid f(s) = 0\} \in \mathbb{C}^{4L - |S_0|}$, where $|S_0|$ is the multiplicity of the zero roots. Note the degree of $f(t)$ is $4L - |S_0|$, not $4L$, because the first $|S_0|$ coefficients being zero implies the last $|S_0|$ are as well. Due to the $\langle EP \rangle$ symmetry of $f(t)$, $\forall$ roots $s \ne 0$, $\exists$ roots $s^*$, $-s$, and $s^{-1}$.
Thus we group subsets of these roots without 35 any loss of information as follows: So={sCSjs=0}, S = sES s =1}, (2.30) Sr = {s E S | Re[s] > 1 A Im[s] = 0}, Si = {s c S Re[s] = 0 A Im[s] = 1}, S, ={s E S I Re[s] = 0 A Im[s] > 1}, Su = {s E S I Is= 1 A 0 < Arg[s] < 7r/2}, Se = {s E S I IsI > 1 A 0 < Arg[s] < 7r/2}. Observe that So,1,r are real, Si,, are imaginary and Su,, are complex. From the real roots, we construct the factor fI f tIS0 ( 2 fr=t t 1 t2(S2 + s-2) + 1) 1), (2_4 (fr) = (OP) 2 (2.31) (EP) 2 (EN) The positiveness of f(t) means that all real factors have even multiplicity. polynomial. From the complex roots, we form ((t2 _ 1)2 + (2t)2) Thus fr is a (2.32) 2 , f t = HsES (t2 _ 1)2 + (t(Im[s] + Im[s]-1)) 2 , fu = HfsES(t 2 _ 1)2 + (2t sin (Arg[s])) 2 fc = HsCs (t4 - t 2 (1- 2 - 4 sin 2 (Arg[s]) + s12) + 1)2 + (2(t 3 + t) Im[s] (1 --Is1-2)) 2 The symmetry of terms under the squares is one of (EP), (EN), (OP), (ON), and occur in a combination that forms a group under repeated application of the two-squares identity of Eq. 2.25. Thus we can construct fift fufc = g2 + h 2 , (2.33) (g) = (EN) i+1Su+1SU 2 (Kfrg) 2 + (Kfrh) . f(t) (h) = (OP)Yil+1Su+1S'I, , For some combinations of multiplicities, this decomposition will not produce polynomials with the symmetry (EN), (OP) required by b, D. However, summing the multiplicities of these roots shows that ISji is even and that such combinations do not exist. From this decomposition, we compute B(x), D(y) using bk = L b2 n [Z 0 (-1)m(n d2k+1 = -2rL 2p-m L2J( , (2.34) 1m (p+)m~ . (L-n--1) ( n) Im (L-n) As with Lem. 2.13, different choices of signs in the two-squares identity lead to multiple valid solutions. Computing the roots of f(t) is still the most difficult step, but can be done 36 in time ((poly(L)). D The case of even L replaces Eq. 2.29 with f(t) = (1+t2 ) 2 L-A 2 (t)-((1-t 2 )/(1+t2 )) 2 0 2 (t) and we find b with (EP) and (1 - t2 )D with (ON) symmetry . 
A similar root-counting argument guarantees the existence of such solutions. The coefficients of $B(x)$, $D(y)$ are then computed also using Eq. 2.34. This procedure carries through without modification for the other tuples (1), (2) of Thm. 2.7.

2.4.5 Selection of quantum response functions

It should be clear that optimal composite gate design is a systematic process no more difficult than choosing one or two polynomials optimal for some objective function. Nevertheless, problem (P4) is that computing these optimal polynomials could still be a difficult task. However, the constraints on achievable partial tuples in Thms. 2.7, 2.8 seem fairly lax, which lends hope that this could be done efficiently. In fact, these constraints are consistent with textbook problems in approximation theory [93].

It is at this point where a close connection with discrete-time signal processing [101] is made. Efficient algorithms [92, 65, 48, 75, 56] for designing polynomials optimal for arbitrary objective functions under a variety of optimality criteria have been extensively studied for finite-impulse response filters [54]. We thus inherit much of this machinery, and in many cases, existing polynomials consistent with achievability have already been found and are directly transferable.

A most common optimality criterion is the Chebyshev norm: Let $P_0(x)$ be the objective function, with continuous weight function $W(x) > 0$, to be approximated by a polynomial $P(x)$ of degree $L$ on a bounded subset $\mathcal{B}$ of the closed interval, $\mathcal{B} \subseteq [-1, 1]$, with the smallest Chebyshev error norm
\[ \epsilon = \max_{x \in \mathcal{B}} |W(x)(P(x) - P_0(x))|. \tag{2.35} \]
The unique best approximation can be computed efficiently by Remez-type exchange algorithms [44]. Many variants exist, such as where $P(x)$ is a trigonometric polynomial [92], bounded [48], subject to other unary or linear constraints [75], and even complex [65]. Linear programming methods [75] provide an alternate solution.
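As a concrete illustration of the error norm in Eq. 2.35, the sketch below evaluates it on a finite grid standing in for $\mathcal{B}$, which only approximates the true maximum. It uses the textbook fact that $\frac{3}{4}x$ is the best approximation of $x^3$ by a polynomial of lower degree on $[-1, 1]$, since $x^3 - \frac{3}{4}x = T_3(x)/4$. The function names and the grid are choices made here for illustration, and this is not a Remez exchange implementation.

```python
def poly_eval(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) by Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def chebyshev_error(p_coeffs, target, weight, grid):
    """Grid estimate of max |W(x) * (P(x) - P0(x))| over the sample points."""
    return max(abs(weight(x) * (poly_eval(p_coeffs, x) - target(x))) for x in grid)

# Uniform grid on [-1, 1]; a real design would restrict this to the band of interest.
grid = [-1 + 2 * k / 2000 for k in range(2001)]
cube = lambda x: x ** 3
unit_weight = lambda x: 1.0

best = chebyshev_error([0.0, 0.75], cube, unit_weight, grid)   # equiripple residual T_3/4
worse = chebyshev_error([0.0, 0.70], cube, unit_weight, grid)  # perturbed slope
```

The residual of the best approximation touches its maximum magnitude with alternating sign at several points, which is the equiripple behaviour that exchange algorithms search for.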
Efficient algorithms for other optimality criteria such as least squares are also available [80, 128].

These algorithms efficiently solve the problem of optimization over achievable quantum response functions $\hat{U}(\theta)$ where the objective functions are 2-partial or 3-partial tuples. Optimization for a 3-partial objective function involves a single quadrature from $(A, B, C, D)$ together with a single real objective function $P_0(\theta)$. Thus we optimize over $P(\theta)$ for $P_0(\theta)$ in Eq. 2.35 subject to the constraints of Thm. 2.8 for the corresponding quadrature. The slightly more complicated 2-partial case instead specifies two quadratures and real objective functions $P_{0,1}(\theta)$, $P_{0,2}(\theta)$. Thus we define $P_0(\theta) = P_{0,1}(\theta) + iP_{0,2}(\theta)$, and optimize over $P(\theta) = P_1(\theta) + iP_2(\theta)$ for $P_0(\theta)$ subject to the constraints of Thm. 2.7 for the corresponding quadratures. Note that the unitarity inequality constraint poses no difficulty as $|P(\theta)|^2 = P_1^2(\theta) + P_2^2(\theta)$.

2.4.6 The methodology of composite quantum gates

Our efforts lead us to a methodology for the design of single spin quantum response functions $\hat{U}(\theta)$ through composite quantum gates built from a sequence of $L$ primitive gates all rotating by $\theta$, but each with its own phase $\vec{\phi} = (\phi_1, \ldots, \phi_L)$. The procedure is systematic, flexible, and most importantly, provably efficient:

Problem statement
Given $L > 1$ and objective function $\hat{U}_0(\theta)$ for either 3-partial or 2-partial tuples, find the composite quantum gate that implements through $\vec{\phi}$ the optimal $\epsilon$-approximation to $\hat{U}_0(\theta)$.

Solution procedure
(S1) Check that $\hat{U}_0(\theta)$ is consistent with achievability. It must satisfy the conditions of Thms. 2.7, 2.8.
(S2) Choose the optimality criterion. The Chebyshev norm is most common.
(S3) Execute the polynomial optimization algorithm over achievable partial tuples. Remez-type algorithms are efficient.
(S4) Compute the achievable tuple from the partial tuple. This can be done efficiently by Lems. 2.13, 2.14.
(S5) Compute the phases $\vec{\phi}$. This can be done efficiently by Lem. 2.12.
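A consistency check in the spirit of step (S1) can be sketched for the simplest case, a single quadrature $(A, \cdot, \cdot, \cdot)$. The sketch tests the degree bound, the parity required by Thm. 2.6, the normalization $A(1) = 1$, and the bound $A^2(x) \le 1$ of Thm. 2.8 on a finite grid. The grid test is a numerical approximation and not a proof of the bound, and the function name, sample count and tolerance are assumptions made for this illustration.

```python
def poly_eval(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) by Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def is_achievable_single_A(coeffs, L, samples=2001, tol=1e-9):
    """Grid-based check of a real candidate A(x) for the 3-partial tuple (A, ., ., .).

    coeffs[k] is the real coefficient of x**k.
    """
    # Degree at most L.
    degree = max((k for k, c in enumerate(coeffs) if c != 0), default=0)
    if degree > L:
        return False
    # Parity: only odd powers for L odd, only even powers for L even.
    if any(c != 0 and k % 2 != L % 2 for k, c in enumerate(coeffs)):
        return False
    # Normalization A(1) = 1.
    if abs(poly_eval(coeffs, 1.0) - 1.0) > tol:
        return False
    # |A(x)| <= 1 sampled on [-1, 1].
    grid = (-1 + 2 * k / (samples - 1) for k in range(samples))
    return all(abs(poly_eval(coeffs, x)) <= 1 + tol for x in grid)

# Chebyshev polynomial T_3(x) = 4x^3 - 3x passes; 5x^3 - 4x overshoots inside the interval.
ok = is_achievable_single_A([0.0, -3.0, 0.0, 4.0], L=3)
bad = is_achievable_single_A([0.0, -4.0, 0.0, 5.0], L=3)
```

A candidate that passes such a check would still need steps (S4) and (S5) to obtain the remaining quadratures and the phases.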
2.5 Examples

Using the methodology in Section 2.4.6, composite quantum gates with response function $\hat{U}(\theta)$ that minimize the error with respect to arbitrary objective functions $\hat{U}_0(\theta)$ can be efficiently designed. We illustrate this process with three examples of independent scientific interest: compensated population inversion gates, compensated broadband NOT gates, and compensated narrowband quantum gates.

Population inversion gates rotate states $|0\rangle$ to $|1\rangle$ and vice-versa, and come in two flavors. The broadband variant implements this rotation with high probability across the widest bandwidth of $\theta \in \mathcal{B}$, meaning that the transition probability response function $p(\theta)$ from Eq. 2.13 is close to 1. The narrowband variant instead implements this rotation with low probability so $p(\theta) \approx 0$, except at a single point $p(\pi) = 1$. We discuss optimal design of these gates in Section 2.5.1. As closed-form solutions for these gates are already known, and used extensively in NMR spectroscopy, they help build familiarity with the methodology in Section 2.4.6 before it is used to solve open questions in the next two examples.

Broadband compensated NOT gates implement the rotation $\hat{R}_0(\pi)$ with high fidelity over the widest bandwidth of $\theta$ parameters. Whereas population inversion gates only succeed on initial states $|0\rangle$ to $|1\rangle$, NOT gates apply a $\pi$ rotation with a known phase for all input states. Such gates have been extensively studied for applying uniform rotations in the presence of drive field inhomogeneities, particularly in quantum computing applications, and our methodology, presented in Section 2.5.2, solves open questions regarding the scaling of bandwidth with sequence length as well as their efficient synthesis.

A complementary design problem addressed in Section 2.5.3 is that of narrowband compensated quantum gates. These instead apply a desired arbitrary rotation $\hat{R}_0(\chi)$ at a single $\theta$ value, and the identity rotation elsewhere over the widest bandwidth of $\theta$ parameters.
Such gates are highly relevant to minimizing crosstalk in the selective addressing of spins in arrays, particularly when spin-spin distances are below the diffraction limit, as might be found in scalable architectures for ion-trap quantum computation.

2.5.1 Composite population inversion gates

Population inversion gates maximize the bandwidth $\mathcal{B}$ over which the transition probability response function $p(\theta)$ from Eq. 2.13 is close to 1 for the broadband variant, or close to 0 for the narrowband variant. Note that in both cases, perfect population inversion occurs at $\theta = \pi$ for $L$ odd, owing to the fact that $A(0) = 0$. Moreover, the optimal polynomials and phases for both variants turn out to be related by a simple transformation, so it suffices for us to consider only the broadband case.

Composite gates with these properties have been studied extensively for nuclear magnetic resonance and quantum computing applications. One approach to obtaining broadband behavior is with the maximally flat ansatz $p(\theta) = 1 - O((\theta - \pi)^{2n})$ [125]. This exponentially suppresses errors in the transition probability to order $n$, thus $p(\theta) \approx 1$ over a wide range of $\theta$. Remarkably, the $\vec{\phi}$ that implement this profile can be found in closed form [130] with optimal sequence lengths $L = n$.

More recently, a second approach has emerged [88], motivated by the following observation: as the flat ansatz $p(\theta) = 1 - O((\theta - \pi)^{2n})$ only increases bandwidth indirectly through the suppression order $n$, better results can be obtained by directly optimizing for bandwidth, while ensuring that the worst-case error $I$ remains bounded. The procedure of Section 2.4.6 for odd $L$ formalizes this task as a straightforward optimization problem:

(S1) Choose the objective function to be $0$ for all $\theta \in \mathcal{B} = \pi + [-|\mathcal{B}|/2, |\mathcal{B}|/2]$, for the $(A, 0, \cdot, \cdot)$ 2-partial tuple.
Since $p(\theta) = 1 - A^2$ is close to 1 over $\mathcal{B}$, the unitarity constraint $C^2 + D^2 = 1 - A^2$ implies that a rotation $\hat{R}_\phi(\pi)$ is approximated over $\mathcal{B}$, with an unspecified phase $\phi = \mathrm{Arg}[C + iD]$ that varies with $\theta$. As consistency with Thm. 2.6 requires that $A(1) = 1$, this implies that identity is applied at $\theta = 0$, thus $\mathcal{B}$ must not contain $\theta = 0$.
(S2) Choose the Chebyshev optimality criterion, where the best $A$ solves the minimax optimization problem
\[ \epsilon = \min_A \max_{\theta \in \mathcal{B}} |A(x)|, \quad \epsilon^2 = I, \tag{2.36} \]
where the worst-case transition probability over $\mathcal{B}$ is $1 - I$.
(S3) Find the function $A$ that solves Eq. 2.36. For consistency with Thm. 2.6, the optimization is over real odd polynomials $A$ bounded by $\forall |x| \le 1$, $|A(x)| \le 1$.
(S4) Using Lem. 2.13, compute the achievable tuple $(A, 0, C, D)$ from the partial specification $(A, 0, \cdot, \cdot)$.
(S5) Compute $\vec{\phi}$ from $(A, 0, C, D)$ using Lem. 2.12.

The solution to (S3) is the Dolph-Chebyshev window function [38, 90] famous in discrete-time signal processing,
\[ DC_{L,I}(x) = \sqrt{I}\, T_L(\beta_{L,I}\, x), \quad \beta_{L,I} = T_{L^{-1}}(I^{-1/2}), \tag{2.37} \]
where $T_n(x) = \cos(n \arccos(x))$ are Chebyshev polynomials. Note the ripples of $DC^2_{L,I}(x)$ bounded by $I$ in Fig. 2-1. This is in contrast to the monotonic increase of the limiting function, indicated by the subscript $f$,
\[ DC_{L,f}(x) = \lim_{I \to 0} DC_{L,I}(x) = x^L, \tag{2.38} \]
which is maximally flat at $x = 0$, but has significantly narrower bandwidth.

Figure 2-1: $DC_{L,I}$ (black), $M_{L,I}$ (teal) polynomials plotted for $L = 9$ and target worst-case infidelity $I = 10^{-2}$ (solid) and $I \to 0$ (dashed), indexed by $f$. The observed ripples are a generic feature of bandwidth optimized polynomials, unlike those optimized for maximal flatness $DC_f$, $M_f$. The inset plots their squares and defines the bandwidth $\mathcal{B}$ in $x$ coordinates.

Using $x =$
cos (0/2), the bandwidth in 0 coordinates is to order O(I) LB -- 4f - 41|B31 = 23-I (2.39) Ef Given the same target bandwidth, the worst-case error of DCL,I is exponentially smaller than DCL,f. Note also the quadratic difference in the scaling with L of the bandwidth over which DCL,I does not approximate F(x) = 0. = 4arcsech +0(l s Bf 4 (2.40) +2 The ripples in the amplitude are a generic feature of best polynomial approximations to functions in the Chebyshev norm. By sacrificing flatness, much smaller absolute variations in error e can be achieved over some specified bandwidth B. This is a common theme that will be revisited in the subsequent example. Finding the phases that implement (DCL,I(X), 0, , -) is then a straightforward computation through (S4), (S5), and the results can be compared to the closed-form solutions from [136, 88]: Ok = OL-k+1 where #1 = 0 and Ok+1 = Ok + 2 tan- The phases [tn(L)V1 - 8ZL2] (2.41) for the narrowband variant (-, -, DCL,I(X), 0) are obtained by a simple 'toggling' transformation [134] 4j = -(-)5o 3 - E - 20k540 2.5.2 Broadband compensated NOT gates Broadband compensated NOT gates maximize the bandwidth B over which the fidelity response function with respect to the target gate No(7r) is close to 1. One option consistent with this goal is the choice of fidelity response functions F,(0) = I- 0((0 -7r) 2n+ 2 ) that are maximally flat with respect to (0 - 7r). When the correction order n increases, deviations from 0 = 7r are exponentially suppressed, resulting in improved approximations of the target gate over wider ranges of 0 E B. The central difficulty of this pursuit is finding the phases # that maximize n for any given L. Unlike the population inversion gates of Section 2.5.1, this appears to be significantly more difficult; optimal length solutions for the # have only been found in closed-form for small n < 4 [87]. 
This problem has been attacked over the course of two decades, starting with Wimperis [134], who found the $\vec{\phi}$ in closed form for BB1, an $L = 5$ sequence with $n = 2$. This was extended by Brown et al. [18] with SKn for arbitrary $L = O(n^{3.09})$ through a recursive construction, and then by Jones [62, 59] with Fn to $L = O(n^{1.59})$ in closed form through sequence concatenation. The most recent effort [87] proved a lower bound of $L = \Omega(n)$ and conjectured that the sequence BBn (Wn in [59]) with $L = 2n + 1$ is optimal through brute-force up to $L = 25$. Using our methodology, we can easily prove this conjecture and efficiently compute its implementation $\vec{\phi}$.

Moreover, our methodology enables a second option. Instead of optimizing for correction order, it is possible to directly minimize the worst-case infidelity $I$, which is the experimental quantity of interest, over a target bandwidth $\mathcal{B}$. We find that doing so leads to an improvement in $I$ that scales exponentially with $L$ over the maximally flat case.

To prove these statements, we proceed with the design outline of Section 2.4.6 for odd $L$:

(S1) Choose the objective function $\forall \theta \in \mathcal{B} = \pi + [-|\mathcal{B}|/2, |\mathcal{B}|/2]$, $\hat{U}_0(\theta) = \hat{R}_0(\pi) = -i\hat{\sigma}_x$ for the $(\cdot, \cdot, C, \cdot)$ 3-partial tuple. Provided that $\mathcal{B}$ does not contain the point $\theta = 0$, this is consistent with the constraints of Thm. 2.8. This corresponds to finding a fidelity response function $F_\phi(\theta) = C^2(\sin(\theta/2))$ that is close to 1 across $\mathcal{B}$.
(S2) The best fidelity response function for the maximally flat approach in prior art is obtained from the function $C$ that maximizes the correction order
\[ n = \max_C \{ n \mid C(y) = 1 - O((1 - y)^{n+1}) \}, \quad I = 1 - \min_{\theta \in \mathcal{B}} F_\phi(\theta), \quad y = \sin(\theta/2), \tag{2.42} \]
where $I$ is the worst-case infidelity over the bandwidth $\mathcal{B}$. It is easy to verify that any such $C$ satisfies $F_\phi(\theta) = 1 - O((\theta - \pi)^{2n+2})$. The more direct approach uses the Chebyshev optimality criterion, where the best $C$ solves the minimax optimization problem
\[ \epsilon = \min_C \max_{\theta \in \mathcal{B}} |C(y) - 1|, \quad I = 1 - (1 - \epsilon)^2. \]
(2.43) (S3) Find the function C that solves Eqs. 2.42, 2.43. For consistency with Thin. 2.6, the optimization is over real, odd polynomials C bounded by IC(y)I < 1 ,Vy E [-1, 11. (S4) Using Lem. 2.14, compute the achievable tuple (A, 0, C, D) from the partial specification (-, 0, C,-). (S5) Compute q from (A, 0, C, D) using Lem. 2.12. We now present the solutions to (S3)_of this procedure. This is the most difficult step, as once C is provided, the implementation # is a straightforward calculation. Eq. 2.42 is solved 41 by the the odd polynomial that satisfies the following n + 1 independent linear constraints: ( dk 0 dkC(y) C(1)1, k = 1, 2, . . , n. (2.44) As a degree L odd polynomial has L--1 free parameters, a degree L necessary and sufficient. This is solved by the polynomial 2n + 1 polynomial is dyk 0, (2.45) 2_ ( ML,f (Y) Y=J j=0 with an example M 9 ,f plotted in Fig. 2-1. The index L indicates the degree, and the subscript f indicates that this is a maximally flat polynomial. As ML,f(y) is monotonically decreasing from y < 1, the relation between infidelity I and bandwidth B is obtained by solving I= 1 - Mf (cos (1B/4)) to leading order: |B| L+ 1 2 L+5/2 [ E = (L8M (2.46) 1 Thus given some target bandwidth B of high-fidelity operation, the composite quantum gate represented by BBn = (, 0, M 2n+ 1,f(y), -) implements NOT with a worst-case fidelity that decreases exponentially with sequence length. This proves the L = 2n + 1 conjecture of [87]. The odd polynomials of degree L that satisfy the Chebyshev error norm optimality criterion in Eq. 2.43 can also be found. We label these polynomials ML,I, where L indicates the degree, and I is the worst-case infidelity, which is directly related to the bandwidth B. For L = 5, we have a complicated looking expression 3 -(4y3+3y2+2yi +1)y +(2y5+4y4+6y'+3y2)y M5 + (y1 -1) 2y5(y1+1) 2 3 (1+3y1+y2) 2 (1-2yi-4y )(3+9y1+8y) 3125y6(1+yl)4(1+2y1)3 3 (2.47) parameterized implicitly through yi E [cos (r/5), 1]. 
For larger $L$, such as $M_{9,10^{-2}}$ in Fig. 2-1, the $M_{L,I}$ can always be computed numerically through the famous Parks-McClellan algorithm [92] for finite impulse response filters. Remarkably, the Chebyshev error of this approximation problem is known [40]:

[Eq. 2.48: closed-form expression for the worst-case infidelity $I$ of $M_{L,I}$ in terms of $|\mathcal{B}|$ and $L$; not recoverable from the extracted text.]

By comparing Eqs. 2.46, 2.48 in Fig. 2-2, it can be seen that for any target $\mathcal{B}$ and sequence length $L$, the composite quantum gate OB$n$ $= (\cdot, 0, M_{2n+1,I}(y), \cdot)$ has a worst-case infidelity that improves on BB$n$ by an exponential factor $O(2^{1-L})$. In contrast to the BB$n$ sequences that are fixed for each $n$, OB$n$ allows for an optimal design trade-off between bandwidth $\mathcal{B}$ and infidelity $I$. As seen in Fig. 2-1, this occurs by introducing equiripples of equal amplitude bounded by $I$, similar to the DC$_{L,I}$ polynomials for population inversion gates. Thus, given the same performance targets, an extremely short OB$n$ gate can perform just as well as a significantly longer BB$n$ gate. In other words, maximizing the correction order only improves the achieved bandwidth indirectly, leading to a poor trade-off between $I$ and $\mathcal{B}$, whereas better results are naturally achieved by optimizing for polynomials that directly solve Eq. 2.43 by minimizing infidelity over a target bandwidth.

[Figure 2-2: plot of worst-case infidelity $I$ against target bandwidth $|\mathcal{B}|/2\pi$ for $M_{L,I}$ (solid) and $M_{L,f}$ (dashed), with a table of example phases.]

Example phases $\vec\phi$ for OB$n$, where $\phi_k = \phi_{L-k+1}$:

I          L    phases
Eq. 2.47   5    closed form (not recoverable from the extracted text)
10^-2      9    (2.987, 5.166, 4.021, 1.678, 2.815, ...)
10^-4      9    (2.889, 5.334, 4.042, 1.490, 2.926, ...)
10^-6      9    (2.844, 5.381, 4.034, 1.414, 2.976, ...)
10^-2      13   (2.390, 0.771, 2.791, 2.824, 2.115, 4.573, 4.888, ...)
10^-4      13   (2.233, 0.455, 2.853, 2.862, 1.838, 4.558, 5.041, ...)
10^-6      13   (2.159, 0.314, 2.874, 2.877, 1.677, 4.495, 5.092, ...)

Figure 2-2: Worst-case infidelity $I$ of NOT gates OB$n$ $= (\cdot, 0, M_{2n+1,I}(\sin(\theta/2)), \cdot)$ (solid, Eq. 2.48) optimized for target bandwidth $\theta \in \mathcal{B}$ compared to the flatness-optimized NOT gate BB$n$ $= (\cdot, 0, M_{2n+1,f}(\sin(\theta/2)), \cdot)$ (dashed, Eq. 2.46), plotted for $L = 2n + 1 = 5, 9, \dots, 25$ (from top). Observe that $I$ for OB$n$ is exponentially smaller, by a factor $\sim 4^n$, than for BB$n$. Alternatively, an OB$n$ gate can approximate NOT with infidelity at most $I$ over a much wider bandwidth than BB$n$. The table provides examples of $\vec\phi$ for OB$n$ rounded to 3 decimal places.

2.5.3 Composite quantum gates with sub-wavelength spatial selectivity

Narrowband compensated gates maximize the bandwidth $\mathcal{B}$ over which the fidelity response function with respect to the identity $\hat{1}$ is close to 1, except at a single point $\theta$ where an arbitrary target rotation $\hat{R}_0(\chi)$ is applied. Although the direct approach is to compute new polynomials $(A, \cdot, C, \cdot)$ that satisfy these properties, we can reuse the polynomials $M_{L,I}$ from Section 2.5.2 by making certain assumptions on the physical system. In the following, we also assume that $|\chi| < \pi$.

Consider a Gaussian beam of fixed width $\Delta$. As a function of position $r$, this beam has a spatially varying Rabi frequency $\Omega(r) = \Omega_0 e^{-r^2/2\Delta^2}$. Thus when applied for time $t_0$, a primitive gate $\hat{R}_0(\theta(r))$ that also varies as a function of position is generated, where $\theta(r) = \theta_0 e^{-r^2/2\Delta^2}$ and $\theta_0 = \Omega_0 t_0$. At $r = 0$, one can choose $t_0$ such that the target rotation $\chi = \theta_0$ is implemented, and due to the exponential decay of the Gaussian beam, moving away from the beam center approximates the identity gate with infidelity $I(r) = \sin^2(\theta(r)/2)$. Thus at distances $r/\Delta > d/\Delta = \mathcal{B}_1$ from the beam center, where $d$ solves $\sin^2(\theta(d)/2) = I$, the worst-case infidelity is $I$. As the minimum possible beam width $\Delta$ is the wavelength of light, selective addressing below the diffraction limit appears impossible. However, even this can be overcome with a carefully designed composite quantum gate.
Narrowband composite gates of length $L$ applicable to this scenario have been widely studied. For instance, [134, 94] report beam width reductions by a factor of about 0.73 relative to $\mathcal{B}_1$. Further reduction is possible with longer composite gates [87], but with poor scaling $\mathcal{B}_{\text{space}} = O(L^{-1/4})$. A better narrowband composite gate results from using the broadband identity gate ID $= (M_{L,I}(x), 0, \cdot, \cdot)$ designed from the $M_{L,I}$ polynomial presented in Section 2.5.2. Then, the fidelity response function with respect to identity is $F_0(\theta) = M_{L,I}^2(x)$, which, as we now show, corresponds to a quadratic improvement to $\mathcal{B}_{\text{space}} = O(L^{-1/2})$.

Let us compose ID with the Gaussian beam to produce the spatially varying quantum response function

$$ \hat{U}_{\text{space}}(r) = \text{ID}(\theta_0 e^{-r^2/2\Delta^2}) = \text{ID}(\theta_0) + O(r^2), \tag{2.49} $$

for some choice $|\theta_0| < \pi$. Note that $\hat{U}_{\text{space}}(r)$ is stable with respect to beam-pointing errors in $r$ due to the vanishing first derivative. The degree of spatial selectivity is computed from the bandwidth in Eq. 2.48 by substituting $|\mathcal{B}| = 2\theta_0 e^{-\mathcal{B}_{\text{space}}^2/2}$ and solving for $r/\Delta$. Thus, identity is implemented with infidelity at most $I$ at all $r > d = \Delta\mathcal{B}_{\text{space}}$, as seen in Fig. 2-3, where to leading order $O(L^{-1/2})$,

[Eq. 2.50: leading-order expression for $\mathcal{B}_{\text{space}} = d/\Delta$ in terms of $\log(1/I)$, $\theta_0$ and $L$; not recoverable from the extracted text.]

[Figure 2-3: plot of infidelity against distance from beam center $r/\Delta$, with an inset showing $\mathcal{B}_{\text{space}}$ against $L$ and a table of example phases.]

Example phases $\vec\phi$, where $\phi_k = \phi_{L-k+1}$:

I          L    phases
Eq. 2.47   5    closed form (not recoverable from the extracted text)
10^-2      9    (0, 0.772, 4.357, 2.827, 3.886, ...)
10^-4      9    (0, 1.087, 4.501, 2.707, 3.961, ...)
10^-6      9    (0, 1.235, 4.601, 2.695, 4.029, ...)
10^-2      13   (0, 1.450, 3.683, 2.501, 3.220, 5.577, 5.728, ...)
10^-4      13   (0, 1.872, 4.326, 2.844, 3.602, 6.271, 0.257, ...)
10^-6      13   (0, 2.077, 4.616, 2.978, 3.742, 0.250, 0.578, ...)

Figure 2-3: Infidelity of spatially selective composite gates $(M_{L,10^{-4}}(\cos(\tfrac{1}{2}\theta_0 e^{-r^2/2\Delta^2})), 0, \cdot, \cdot)$ plotted for $\theta_0 = \pi$ and $L = 1, \dots, 25$ (solid, from right).
The effective beam width $\mathcal{B}_{\text{space}} = O(L^{-1/2})$ (inset) beyond which the identity gate is well-approximated is dramatically reduced over that of a single gate $\mathcal{B}_1$. By varying $\theta_0$, arbitrary unitary gates can be applied at $r = 0$ with high beam-pointing stability. Poorer scaling $\mathcal{B}_{\text{space}} = O(L^{-1/4})$ results from using the flat $(M_{L,f}, 0, \cdot, \cdot)$ (dashed). The table provides examples of $\vec\phi$ to 3 decimal places.

Meanwhile at $r = 0$, we obtain the gate

$$ \hat{U}_{\text{space}}(0) = \hat{R}_\gamma\!\left(2\cos^{-1}\!\big(M_{L,I}(\cos(\theta_0/2))\big)\right), \tag{2.51} $$

where $\gamma = \text{Arg}[C(\sin(\theta_0/2)) + iD(\sin(\theta_0/2))]$. The desired rotation $\hat{R}_0(\chi)$ is thus obtained by choosing $\theta_0$ such that $\cos(\chi/2) = M_{L,I}(\cos(\theta_0/2))$ and rotating all phases $\phi_k \leftarrow \phi_k + \gamma$, which follows from conjugating $\hat{U}_{\text{space}}(0)$ by a $\hat\sigma_z$ rotation through $\gamma$ to obtain $\hat{R}_0(\chi)$. The optimality of these results follows from the construction of $M_{L,I}$ as optimal bandwidth polynomials. In particular, using the flat polynomial $M_{L,f}(x)$ leads to the scaling $\mathcal{B}_{\text{space}} = O(L^{-1/4})$ found in prior art and Fig. 2-3 (inset).

2.6 Conclusion

We have presented and applied a methodology, analogous to the Shinnar-Le Roux algorithm but with different controls, for the systematic design of resonant equiangular composite quantum gates of length $L$ on a single spin. In particular, we show that all steps are efficient with time complexity $O(\text{poly}(L))$, and provide a rigorous characterization of achievable quantum response functions. Moreover, the elegant and practical connection made with discrete-time signal processing allows us to inherit and adapt many existing algorithms and polynomials used in the design of classical response functions for this quantum problem. Much potential remains untapped there, and interdisciplinary exchange could spur the discovery of further connections, leading to the development of previously intractable applications. Indeed, this relationship has already proven fruitful in surprising directions, such as recent work furnishing optimal algorithms for important problems such as Hamiltonian simulation [86, 84] on a quantum computer.

Various thought-provoking extensions are also motivated. The set of achievable quantum response functions is changed by introducing elements such as additional (possibly continuous) control parameters, disturbances, coupled spins [124, 94, 61], or open systems [68, 115]. These all enable their own unique applications, but also appear difficult to solve systematically and intuitively. Our success in the case of composite gates contributes supporting evidence that a useful characterization as well as efficient methods for these more complex design problems could exist.

Chapter 3
Amplitude amplification by quantum signal processing

3.1 Introduction

In Chapter 2, we introduced QSP$_L$, a model of analog computation based on single-qubit rotations. We now make strides towards the hope of fast analog-digital computation by demonstrating one way in which QSP$_L$ results transfer to the digital realm without modification. We do so through an application to quantum query algorithms, starting in this section with the quantum search problem, which is a special case of amplitude amplification, which we further generalize.

Amplitude amplification is a staple quantum subroutine for state preparation that is used in many quantum algorithms. The basic version is based on reflections about initial and target states that lead to sinusoidal modulation in the final amplitude of the prepared target state. These oscillations depend on the time $N$ for which one runs the algorithm as well as the initial amplitude of the target state. This leads to a well-known 'soufflé' problem. If the initial amplitude is not known beforehand, one also does not know what $N$ should be.
Undercooking means running it for too little time, which prevents the amplitude from reaching the first oscillation peak, and overcooking means running it for too long, which can cause the amplitude to drop. One approach is to estimate $\theta$ by performing a binary search over values of $N = 1, 2, 4, \dots$; the cost of doing so turns out not to affect the asymptotic scaling of Grover search. However, a more elegant and general approach would be to replace sinusoidal modulation with some other custom function. This can be achieved by a simple generalization known as phase-matching [83, 50, 17, 58], where instead of performing reflections, one applies partial reflections with some identity component. These partial reflections allow interference between states from different computational time-steps, leading to more sophisticated relations between initial and final state amplitudes. One such non-trivial function enables fixed-point quantum search [51, 136], which solves the soufflé problem without requiring any repetitions for estimation.

However, these results, especially in choosing the right sequence of partial reflections, tend to follow either from geometric arguments or from a brute-force numerical search. What is lacking is a general theory that classifies the space of such possible functions, as well as procedures for compiling their implementation from some specification. The lack of knowledge about this design space limits potential applications of the amplitude amplification framework.

Surprisingly, the model of analog computation QSP$_L$ from quantum signal processing on a single qubit provides this general theory, and essentially solves the design problem. Even though amplitude amplification is in general an algorithm on multiple qubits, its underlying SU(2) structure allows the results of QSP$_L$ to apply without modification.
Thus one may compute functions of amplitudes with the same speed and simplicity as at the physical level of QSP$_L$, at least in terms of query complexity.

In Section 3.2, we define the quantum query model, which is applied in Section 3.3 to outline the famous quantum search algorithm [50], also known as Grover's algorithm, for searching a database of size $n$ in $O(\sqrt{n})$ time, and its generalization as amplitude amplification. The most common generalization of amplitude amplification replaces reflections with partial reflections, which allow one to design functions for obtaining non-trivial, non-sinusoidal modulation. In Section 3.4, we completely solve this design problem. We prove that any such function is a polynomial subject to certain necessary and sufficient constraints, and show how these partial reflections may be precomputed in classical polynomial time. We then generalize this in Section 3.5 to obtain an even more flexible variant that relaxes some constraints on these polynomials.

3.1.1 Attributions and contributions

The results in this section are taken from the preprint of joint work [85] with Isaac L. Chuang.

3.2 Quantum algorithms and query complexity

The quantum query model is a useful tool for proving meaningful comparisons between the performance of classical algorithms and that of quantum algorithms. The basic idea is to provide information on the problem to both classical and quantum circuits in a certain standardized manner. In the standard approach, this is done through a black-box Boolean function $O : \{0,1\}^n \to \{0,1\}$ that encodes some information on the $2^n$-bit string $x = x_1 x_2 \dots x_{2^n} \in \{0,1\}^{2^n}$. In the simplest case, computing on index $j \in \{0,1\}^n$ through $O(j) = x_j$ returns the value of the $j$th bit of $x$. With multiple applications of this function, one could then determine functions $f$ of $x$, such as its parity, or whether there exist any bits that are one, using some other logic circuit that computes on the obtained information.
As $O$ can always be synthesized by some classical circuit of Boolean logic, black-box means that we treat this classical circuit as a single entity, the oracle, and count the number of times we query this oracle. This normalizes the inputs to classical and quantum algorithms, as $O$ could just as well be synthesized by Toffoli gates, which are universal for classical computation and can be realized by quantum unitaries. Of course, the Toffoli gates have an equal number of inputs and outputs, whereas $O$ has just one output. Nevertheless, the single output bit $O(j)$ will be one of the output bits of a larger oracle $O_{\text{reversible}} : \{0,1\}^{n'} \to \{0,1\}^{n'}$. This is completely equivalent, as we may set $O(j) = O_{\text{reversible}}(j)_1$ and disregard all the other output bits. Using standard techniques to uncompute the irrelevant output bits, this is also equivalent to having access to the oracle $O : \{0,1\} \times \{0,1\}^n \to \{0,1\} \times \{0,1\}^n$, where

$$ O(z, j) = \{z \oplus x_j,\, j\}, \qquad \forall z \in \{0,1\},\ j \in \{0,1\}^n. \tag{3.1} $$

As we may implement classical Toffoli gates with quantum Toffoli gates, it is fair to assume that a quantum algorithm has access to a unitary black-box quantum oracle

$$ \hat{O}|z\rangle|j\rangle = |z \oplus x_j\rangle|j\rangle, \qquad \forall z \in \{0,1\},\ j \in \{0,1\}^n, \tag{3.2} $$

which may be queried by any arbitrary superposition of states. A generic $Q$-query quantum algorithm $\hat{U}$ then intersperses queries to $\hat{O}$ between $Q + 1$ arbitrary unitaries $\hat{U}_k$ that do not depend on properties of $\hat{O}$, like

$$ \hat{U} = \hat{U}_Q \hat{O} \hat{U}_{Q-1} \hat{O} \cdots \hat{U}_1 \hat{O} \hat{U}_0. \tag{3.3} $$

In other cases, one may assume that $\hat{O}$ computes different functions instead of simply returning the value of some built-in bit-string $x$. It may be that $\hat{O} \neq \hat{O}^\dagger$, in which case Eq. 3.3 would be modified to insert queries to $\hat{O}^\dagger$ in any order, each with the same cost as $\hat{O}$, as reversing the order of quantum operations is easy on a universal quantum computer. However, in all cases, it is important to keep in mind that counting queries to $\hat{O}$ is only an abstraction of the true quantum gate and qubit cost of any quantum algorithm.
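To make the bit oracle of Eq. 3.2 concrete, the following is a minimal NumPy sketch that builds it as an explicit permutation matrix for a short bit string. The function names `bit_oracle` and `marked_probability`, the flat basis ordering `index = z * n + j`, and the choice to give the index register dimension $n$ (rather than encoding it in qubits) are illustrative assumptions, not notation from the text.

```python
import numpy as np


def bit_oracle(x):
    """Matrix of O|z>|j> = |z XOR x_j>|j> for a bit string x of length n.

    Assumed (hypothetical) basis ordering: index = z * n + j, with the
    single output bit z as the most significant label.
    """
    n = len(x)
    oracle = np.zeros((2 * n, 2 * n))
    for z in (0, 1):
        for j in range(n):
            oracle[(z ^ int(x[j])) * n + j, z * n + j] = 1.0
    return oracle


def marked_probability(x):
    """Query with a uniform superposition over j (and z = 0), then
    return the probability of finding the output bit z = 1."""
    n = len(x)
    state = np.zeros(2 * n)
    state[:n] = 1.0 / np.sqrt(n)
    out = bit_oracle(x) @ state
    return float(np.sum(out[n:] ** 2))


if __name__ == "__main__":
    x = [0, 1, 1, 0]
    O = bit_oracle(x)
    print("unitary:", np.allclose(O @ O.T, np.eye(2 * len(x))))
    print("P(z = 1):", marked_probability(x))
```

In this simplest case the oracle is a real permutation matrix that is its own inverse, so queries to $\hat{O}$ and $\hat{O}^\dagger$ coincide; a single superposition query already yields the marked fraction of bits as a measurement probability.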
In the decision model of quantum computation, $\hat{U}$ acts on a computational basis state $|0\rangle|0\rangle$, and the algorithm succeeds if $f(x)$, encoded in the $z$ bit, is computed with success probability $1/2 + \Omega(1)$, i.e. bounded away from $1/2$, e.g. greater than $2/3$. The query complexity of a function $f$ in this model is then the smallest number of queries $Q$ for which this is so. Note that in more general quantum algorithms, $f$ need not be a Boolean function. It could, for example, be a desired quantum state, or some unitary.

In the following, our convention for gate complexity counts the number of single-qubit and two-qubit rotations required to implement a quantum algorithm, excluding those for implementing $\hat{O}$. Our convention for space complexity counts the number of additional qubits required, excluding those already used by $\hat{O}$.

3.3 Quantum search and amplitude amplification

The query model may be applied to prove a famous square-root speedup of quantum algorithms over classical algorithms in the problem of searching an unsorted database of size $n$ for a marked element. The quantum algorithm is known as Grover search, and works by preparing a special quantum state that encodes the index of the marked element. This procedure is generalized by the amplitude amplification algorithm for preparing quantum states in general. We review both in this section.

The database of size $n$ in the search problem is modeled as an $n$-bit string $x = x_1 x_2 \dots x_n$, where marked elements correspond to bits of $x$ being set to 1, and all other bits being set to 0. If there are $m$ marked elements, let the set of indices corresponding to marked elements be $M$, of size $|M| = m$. The oracle provided for this problem returns the value of the $j$th bit, $O(j) = x_j$. Solving the search problem entails returning the index $j \in [n]$ of any one such marked element by making some $Q$ queries to $O$. The optimal classical algorithm is to select, without replacement, random numbers $j$ from the set $[n]$.
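Then with probability $m/n$, the oracle returns $O(j) = 1$, which indicates success. Thus the classical algorithm requires $O(n/m)$ queries on average.

The quantum algorithm works differently, and relies on querying $\hat{O}$ with a quantum superposition of states in the computational basis. Let the uniform superposition over all states be $|s\rangle$, the uniform superposition over all marked states be $|M\rangle$, and the uniform superposition over all unmarked states be $|M'\rangle$. Note the orthogonality relation $\langle M|M'\rangle = 0$. Thus

$$ |s\rangle = \frac{1}{\sqrt{n}}\sum_{j=1}^{n}|j\rangle, \qquad |M\rangle = \frac{1}{\sqrt{m}}\sum_{j \in M}|j\rangle, \qquad |M'\rangle = \frac{1}{\sqrt{n-m}}\sum_{j \notin M}|j\rangle. \tag{3.4} $$

On input $|s\rangle_a|0\rangle_b$, where the ancilla register $b$ is of dimension 2 and the system register $a$ is of dimension $n$, the quantum oracle prepares the state

$$ \begin{aligned} |s\rangle_{ab} = \hat{O}|s\rangle_a|0\rangle_b &= \frac{1}{\sqrt{n}}\Big(\sum_{j \in M}|j\rangle_a|1\rangle_b + \sum_{j \notin M}|j\rangle_a|0\rangle_b\Big) \\ &= \sqrt{\tfrac{m}{n}}\,|M\rangle_a|1\rangle_b + \sqrt{\tfrac{n-m}{n}}\,|M'\rangle_a|0\rangle_b \\ &= \sin(\theta)|M\rangle_a|1\rangle_b + \cos(\theta)|M'\rangle_a|0\rangle_b \\ &= \sin(\theta)|t\rangle_{ab} + \cos(\theta)|t'\rangle_{ab}, \end{aligned} \tag{3.5} $$

where we have defined $\sin(\theta) = \sqrt{m/n}$ and similarly for $\cos(\theta)$, and in the last line we define the target state $|t\rangle_{ab} = |M\rangle_a|1\rangle_b$ and the orthogonal state $|t'\rangle_{ab} = |M'\rangle_a|0\rangle_b$. One can see that measuring the $b$ register returns the state $|1\rangle$ with probability $m/n$, identical to the classical algorithm.

To go further, consider the reflection operators

$$ \text{Ref}_{|s\rangle} = I_{ab} - 2|s\rangle\langle s|_{ab}, \qquad \text{Ref}_{|t\rangle} = I_{ab} - 2|t\rangle\langle t|_{ab}, \tag{3.6} $$

which flip the sign of only the states $|s\rangle_{ab}$, $|t\rangle_{ab}$ respectively. The product $\text{Ref}_{|s\rangle}\text{Ref}_{|t\rangle}$ is known as the Grover iterate, and has the special property of performing rotations only within the subspace spanned by the two states $|t\rangle_{ab}$, $|t'\rangle_{ab}$. By direct computation, one can represent the state $|s\rangle_{ab}$ and these rotations as matrices in the $|t\rangle_{ab}$, $|t'\rangle_{ab}$ basis:

$$ |s\rangle_{ab} = \begin{pmatrix}\sin(\theta) \\ \cos(\theta)\end{pmatrix}, \qquad \text{Ref}_{|s\rangle}\text{Ref}_{|t\rangle} = -\begin{pmatrix}\cos(2\theta) & \sin(2\theta) \\ -\sin(2\theta) & \cos(2\theta)\end{pmatrix}. \tag{3.7} $$

Thus by direct multiplication of the matrices, we obtain a well-known result. A sequence of $N$ Grover iterates acting on $|s\rangle$ prepares, up to an unobservable global sign $(-1)^N$, the state

$$ (\text{Ref}_{|s\rangle}\text{Ref}_{|t\rangle})^N|s\rangle_{ab} = \sin((2N+1)\theta)|t\rangle_{ab} + \cos((2N+1)\theta)|t'\rangle_{ab}. \tag{3.8} $$

By choosing $N \approx \pi/(4\theta) = O(1/\theta) = O(\sqrt{n/m})$, the amplitude $\sin((2N+1)\theta)$ of the target state $|t\rangle$ is close to one.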
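The rotation picture of the Grover iterate can be checked numerically in the two-dimensional basis $\{|t\rangle, |t'\rangle\}$. The sketch below is an illustration under our own assumptions: the function names and the example sizes are hypothetical, and the rule used to pick the number of iterates (make $(2N+1)\theta$ as close as possible to $\pi/2$) is one reasonable reading of the choice $N = O(1/\theta)$, not a prescription from the text.

```python
import numpy as np


def grover_amplitudes(theta, num_iterates):
    """Apply (Ref_s Ref_t)^N to |s> in the (|t>, |t'>) basis."""
    s = np.array([np.sin(theta), np.cos(theta)])
    ref_s = np.eye(2) - 2.0 * np.outer(s, s)   # I - 2|s><s|
    ref_t = np.diag([-1.0, 1.0])               # I - 2|t><t|
    iterate = ref_s @ ref_t
    return np.linalg.matrix_power(iterate, num_iterates) @ s


def target_probability(theta, num_iterates):
    """Probability of the target state after N Grover iterates."""
    return float(grover_amplitudes(theta, num_iterates)[0] ** 2)


def best_num_iterates(theta):
    """Integer N >= 0 making (2N + 1) * theta closest to pi / 2."""
    return max(0, int(round(np.pi / (4.0 * theta) - 0.5)))


if __name__ == "__main__":
    n, m = 1024, 1                       # database size, marked items
    theta = np.arcsin(np.sqrt(m / n))
    N = best_num_iterates(theta)
    print("iterates:", N)
    print("success probability:", target_probability(theta, N))
    # Overcooking: doubling the number of iterates rotates past the target.
    print("after 2N iterates:", target_probability(theta, 2 * N))
```

The last line illustrates the soufflé problem described in Section 3.1: running for twice the optimal number of iterates rotates the state almost entirely back out of the target.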
Thus measuring the $b$ register of this state projects the $a$ register onto the uniform superposition of marked states $|M\rangle$ with probability $O(1)$. Measuring $|M\rangle$ then returns one of the indices of the marked states, selected at random. Thus this sequence of reflections solves the search problem, and all that remains is to evaluate the query complexity and the gate complexity of its implementation.

The query complexity of $\text{Ref}_{|s\rangle}$ is 2, as seen by rewriting it as

$$ \text{Ref}_{|s\rangle} = I_{ab} - 2|s\rangle\langle s|_{ab} = \hat{O}\,(I_{ab} - 2|s\rangle\langle s|_a \otimes |0\rangle\langle 0|_b)\,\hat{O}^\dagger. \tag{3.9} $$

Thus the query complexity of the procedure in Eq. 3.8 is $Q = 2N + 1 = O(\sqrt{n/m})$, which recovers the famous square-root speedup over the classical case.

The gate complexity is obtained by adding the cost of preparing the uniform superposition $|s\rangle$ and the other reflections. First, note that $|s\rangle$ may be prepared using $\log_2 n$ Hadamard gates, $|s\rangle_a = \text{Had}^{\otimes \log_2 n}|0\rangle_a$. When $n$ is not a power of 2, the scaling in terms of primitive gates is still $O(\log n)$. Second, note that the reflection

$$ (I_{ab} - 2|s\rangle\langle s|_a \otimes |0\rangle\langle 0|_b) = \text{Had}^{\otimes \log_2 n}\,(I_{ab} - 2|0\rangle\langle 0|_a \otimes |0\rangle\langle 0|_b)\,\text{Had}^{\otimes \log_2 n} \tag{3.10} $$

may be implemented using $O(\log n)$ Hadamard gates and a multiply-controlled phase gate over $2n$ dimensions. This multiply-controlled phase gate may be implemented using $O(\log n)$ single- and two-qubit gates. Third, the reflection

$$ \text{Ref}_{|t\rangle} = I_{ab} - 2|t\rangle\langle t|_{ab} = I_{ab} - 2|M\rangle\langle M|_a \otimes |1\rangle\langle 1|_b \tag{3.11} $$

appears to need prior knowledge about the state $|M\rangle$, which cannot be prepared beforehand. However, $\text{Ref}_{|t\rangle}$ only needs to perform reflections in the subspace of $|t\rangle$ and $|t'\rangle$, as these are the only two states that appear in Eq. 3.5. Thus it may be simplified to

$$ \text{Ref}_{|t\rangle} = I_a \otimes (I_b - 2|1\rangle\langle 1|_b), \tag{3.12} $$

and may be implemented using a single-qubit gate. Adding these together, the total gate complexity of Grover search is $O(Q \log n)$, and the space complexity over that of the oracle $\hat{O}$ is 0 ancilla qubits.

The generalization to amplitude amplification is obtained by treating the terms that appear in the Grover search problem abstractly.
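Let us define the state preparation oracle

$$ |s\rangle_{ab} = \hat{G}|0\rangle_{ab} = \sin(\theta)|t\rangle_a|1\rangle_b + \cos(\theta)|t'\rangle_{ab}, \tag{3.13} $$

which prepares the state $|s\rangle_{ab}$, from which the target state $|t\rangle_a$, marked by the ancilla $|1\rangle_b$, may be obtained with probability $|\sin(\theta)|^2$. Note that $|t'\rangle_{ab}$ has no support on the ancilla state $|1\rangle_b$, but is otherwise arbitrary. Then we define the reflection operators

$$ \text{Ref}_{|s\rangle} = I_{ab} - 2|s\rangle\langle s|_{ab} = \hat{G}\,(I_{ab} - 2|0\rangle\langle 0|_{ab})\,\hat{G}^\dagger = \hat{G}\,\text{Ref}_{|0\rangle}\,\hat{G}^\dagger, \tag{3.14} $$

$$ \text{Ref}_{|t\rangle} = I_a \otimes (I_b - 2|1\rangle\langle 1|_b). \tag{3.15} $$

Thus a sequence of $N$ iterates $\text{Ref}_{|s\rangle}\text{Ref}_{|t\rangle}$ produces the state

$$ (\text{Ref}_{|s\rangle}\text{Ref}_{|t\rangle})^N \hat{G}|0\rangle_{ab} = \sin((2N+1)\theta)|t\rangle_a|1\rangle_b + \cos((2N+1)\theta)|t'\rangle_{ab}, \tag{3.16} $$

with a query complexity of $Q = 2N + 1$ and gate complexity $O(Q \log(n))$.

3.4 Amplitude amplification by partial reflections

The most common generalization of amplitude amplification replaces reflections with partial reflections. This allows one to obtain more general functions of $\theta$ in the amplitude of the target state. Our starting point is Eq. 3.16. We use a slightly different convention here with the state preparation oracle

$$ |s\rangle_{ab} = \hat{G}|0\rangle_a|0\rangle_b = \sin(\theta)|t\rangle_a|0\rangle_b + \cos(\theta)|t'\rangle_{ab}, \tag{3.17} $$

where the target state $|t\rangle_a$ is marked by the $|0\rangle_b$ state. This makes little difference, as it can be accomplished by a single Pauli $\hat\sigma_x$ gate on the ancilla. We find it convenient to define $|t\rangle_{ab} = |t\rangle_a|0\rangle_b$. Let us define partial reflections parameterized by phases $\alpha$, $\beta$:

$$ \text{Ref}_{\alpha,|s\rangle} = I_{ab} - (1 - e^{-i\alpha})|s\rangle\langle s|_{ab}, \qquad \text{Ref}_{\beta,|t\rangle} = I_{ab} - (1 - e^{-i\beta})|t\rangle\langle t|_{ab}. \tag{3.18} $$

This leads to the generalized iterate $\text{Ref}_{\alpha,|s\rangle}\text{Ref}_{\beta,|t\rangle}$, which has query cost 2. An $N = 2n + 1$ query sequence of these iterates produces the state

$$ \prod_{k=1}^{n} \text{Ref}_{\alpha_k,|s\rangle}\text{Ref}_{\beta_k,|t\rangle}\,|s\rangle_{ab} = (iC(\theta) + D(\theta))|t\rangle_{ab} + (A(\theta) - iB(\theta))|t'\rangle_{ab}, \tag{3.19} $$

where $\text{Ref}_{\alpha_1,|s\rangle}\text{Ref}_{\beta_1,|t\rangle}$ acts first on the input, and $A, B, C, D$ are real functions parameterized by $\vec\alpha$, $\vec\beta$. Unfortunately, the dependence of $\vec\alpha$, $\vec\beta$ on any arbitrary choice of $A, B, C, D$ appears quite mysterious.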
Only in very few cases can $A, B, C, D$ be specified for arbitrary $N$ and then inverted to obtain a consistent set of $\vec\alpha$, $\vec\beta$ in closed form [136]. For instance, standard amplitude amplification corresponds to $\alpha_k = \beta_k = \pi$. We resolve this mystery by proving the following result.

Theorem 3.1 (Amplitude amplification with partial reflections). Let $\hat{G}$ be a state preparation unitary acting on the computational basis states $|0\rangle_a \in \mathbb{C}^d$, $|0\rangle_b \in \mathbb{C}^2$ such that $\hat{G}|0\rangle_a|0\rangle_b = \sin(\theta)|t\rangle_a|0\rangle_b + \cos(\theta)|t'\rangle_{ab}$, where $|t'\rangle_{ab}$ has no support on $|0\rangle_b$. Then there exists a quantum circuit $V_{\vec\phi}$ that requires $N + 1$ queries to $\hat{G}$, $N$ queries to $\hat{G}^\dagger$, and $O(N \log(d))$ primitive quantum gates precomputed in classical $O(\text{poly}(N))$ time, to prepare the state

$$ V_{\vec\phi}|0\rangle_a|0\rangle_b = (iC(\theta) + D(\theta))|t\rangle_a|0\rangle_b + (A(\theta) - iB(\theta))|t'\rangle_{ab}, \tag{3.20} $$

where $C(\theta)$, $D(\theta)$ are any choice of real functions satisfying all the following conditions:
(1) $C(\theta) = C(y)$, $D(\theta) = D(y)$, where $C$, $D$ are odd real polynomials in $y = \sin(\theta)$ of degree at most $2N + 1$;
(2) $\forall y \in [-1, 1]$, $C^2(y) + D^2(y) \le 1$;
(3) $\forall y \ge 1$, $C^2(y) + D^2(y) \ge 1$,
and $A$, $B$ are functions of lesser interest.

This result is quite remarkable, as the constraints are lax and allow for many interesting functions. For instance, choosing $C(y) = (-1)^N T_{2N+1}(y) = \sin((2N+1)\theta)$, where $T_{2N+1}$ is the Chebyshev polynomial of the first kind, and $D(y) = 0$ recovers the baseline amplitude amplification algorithm.

Proof of Thm. 3.1. Consider the $Q = 2N + 1$ query sequence of Eq. 3.19. Let us re-express the generalized reflection in Eq. 3.18 as:

$$ \text{Ref}_{\alpha,|s\rangle} = I_{ab} - (1 - e^{-i\alpha})\,\hat{G}|0\rangle\langle 0|_{ab}\hat{G}^\dagger = \hat{G}\,(I_{ab} - (1 - e^{-i\alpha})|0\rangle\langle 0|_{ab})\,\hat{G}^\dagger = \hat{G}\,\text{Ref}_{\alpha,|0\rangle}\,\hat{G}^\dagger. \tag{3.21} $$

If $|0\rangle_{ab}$ is of dimension $2d$, $\text{Ref}_{\alpha,|0\rangle}$ is a conditional phase gate and may be implemented with $O(\log(d))$ primitive gates.
As $\text{span}\{|t'\rangle_{ab}, |t\rangle_a|0\rangle_b\}$ is an invariant subspace of $\text{Ref}_{\alpha,|s\rangle}\text{Ref}_{\beta,|t\rangle}$, we may represent it equivalently with Pauli matrices $\hat\sigma_{x,y,z}$ through the replacements

$$ \hat{G} \to e^{-i\hat\sigma_y\theta} = \begin{pmatrix}\cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta)\end{pmatrix}, \qquad \hat{G}^\dagger \to e^{i\hat\sigma_y\theta} = \begin{pmatrix}\cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta)\end{pmatrix}, $$
$$ \text{Ref}_{\alpha,|0\rangle} \to e^{-i\alpha/2}e^{-i\hat\sigma_z\alpha/2}, \qquad \text{Ref}_{\beta,|t\rangle} \to e^{-i\beta/2}e^{i\hat\sigma_z\beta/2}. \tag{3.22} $$

Thus $\text{Ref}_{\alpha,|s\rangle}\text{Ref}_{\beta,|t\rangle} = e^{-i(\alpha+\beta)/2}\,e^{-i\hat\sigma_y\theta}e^{-i\hat\sigma_z\alpha/2}e^{i\hat\sigma_y\theta}e^{i\hat\sigma_z\beta/2}$ in this subspace, where the first basis vector represents $|t'\rangle_{ab}$ and the second represents $|t\rangle_{ab}$. Though applying $\hat{G}^\dagger$ in general takes us out of the subspace, this operator is always paired with $\hat{G}$ in the Grover iterate and never occurs in isolation, so the representation is faithful. This sequence of alternating $\hat\sigma_{y,z}$ rotations motivates us to define the operator for rotations by angle $\theta$ about an axis in the $\hat\sigma_x$-$\hat\sigma_y$ plane of the Bloch sphere:

$$ e^{-i\hat\sigma_\phi\theta} = e^{-i\hat\sigma_z(\phi-\pi/2)/2}\,e^{-i\hat\sigma_y\theta}\,e^{i\hat\sigma_z(\phi-\pi/2)/2} = \begin{pmatrix}\cos(\theta) & -ie^{-i\phi}\sin(\theta) \\ -ie^{i\phi}\sin(\theta) & \cos(\theta)\end{pmatrix}, \tag{3.23} $$

where $\hat\sigma_\phi = \cos(\phi)\hat\sigma_x + \sin(\phi)\hat\sigma_y$. We would like to express Eq. 3.19 as a product of just these $Q = 2N + 1$ rotations $e^{-i\hat\sigma_{\phi_k}\theta}$. Thus we replace the input state and add a final partial reflection with phase $\beta_{N+1}$ to obtain the operator $V_{\vec\alpha,\vec\beta}$:

[Eq. 3.24: definition of $V_{\vec\alpha,\vec\beta}|0\rangle_{ab}$ as the sequence of Eq. 3.19 with an initial phase $\alpha_0$ and a final partial reflection $\text{Ref}_{\beta_{N+1},|t\rangle}$; not recoverable from the extracted text.]

Promised that $V_{\vec\alpha,\vec\beta}$ always acts on the input state $|0\rangle_{ab}$, the fact that $\hat{G}|0\rangle_{ab} = e^{-i\hat\sigma_y\theta}|t'\rangle$ permits the representation

[Eq. 3.25: $V_{\vec\alpha,\vec\beta}$ written as a global phase times an alternating product of $e^{\pm i\hat\sigma_y\theta}$ rotations and $\hat\sigma_z$ rotations through $\alpha_k/2$, $\beta_k/2$; not recoverable from the extracted text.]

Since we have the identity $e^{i\hat\sigma_y\theta} = e^{-i\hat\sigma_z\pi/2}e^{-i\hat\sigma_y\theta}e^{i\hat\sigma_z\pi/2}$, and all $e^{\pm i\hat\sigma_y\theta}$ in Eq. 3.25 are sandwiched between $\hat\sigma_z$ rotations, we replace these with the $\hat\sigma_x$-$\hat\sigma_y$ rotations of Eq. 3.23 and define the composite iterate $V_{\vec\phi}$ in Fig. 3-1:

$$ V_{\vec\phi} = e^{i\Phi}V_{\vec\alpha,\vec\beta} = \prod_{k=1}^{2N+1} e^{-i\hat\sigma_{\phi_k}\theta} = A(\theta) + iB(\theta)\hat\sigma_z + iC(\theta)\hat\sigma_x + iD(\theta)\hat\sigma_y, \tag{3.26} $$

where $\Phi$, which depends only on $\vec\alpha$, $\vec\beta$, is chosen to cancel the global phase of $V_{\vec\alpha,\vec\beta}$, $\vec\phi$ depends linearly on $\vec\alpha$, $\vec\beta$ as seen in Figure 3-1, and the decomposition into the Pauli basis is always possible for SU(2) matrices.

By replacing the product of two-parameter generalized Grover iterates in Eq. 3.19 with a product of more fundamental and simpler one-parameter single-qubit rotations in Eq. 3.26, the structure underlying generalized amplitude amplification is made clearer. As these single-qubit rotations are isomorphic to those considered in QSP$_{2N+1}$, we may apply Lem. 3.2, which characterizes any achievable $(C, D)$. Other choices from Chapter 2, such as $(A, B)$, $(A, C)$ etc., are also possible. $\square$

[Figure 3-1: three circuit diagrams on registers $|0\rangle_a$, $|0\rangle_b$ built from $\hat{G}$, $\hat{G}^\dagger$ and reflections R.]

Figure 3-1: (top) Circuit diagram for amplitude amplification in Eq. 3.16. (middle) Circuit diagram for amplitude amplification by phase-matching in Eq. 3.19. (bottom) Circuit diagram for amplitude amplification $V_{\vec\phi}$ by quantum signal processing in Eq. 3.26. Note that we abbreviate the reflection operators as R and drop the state subscript here. The query complexity in all cases is $Q = 2N + 1$, and the gate complexity is $O(Q \log(d))$.

Lemma 3.2 (Achievable $(C, D)$ - Thm. 2.7). For any odd integer $N > 0$, a choice of functions $C$, $D$ in Eq. 3.26 is achievable by some $\vec\phi \in \mathbb{R}^N$ if and only if all the following are true:
(1) $C(\theta) = C(y)$, $D(\theta) = D(y)$, where $C$, $D$ are odd real polynomials of degree at most $N$;
(2) $\forall y \in [-1, 1]$, $C^2(y) + D^2(y) \le 1$;
(3) $\forall y \ge 1$, $C^2(y) + D^2(y) \ge 1$.
Moreover, $\vec\phi \in \mathbb{R}^N$ can be computed in classical $O(\text{poly}(N))$ time.

3.5 Flexible amplitude amplification

The application of Thm. 3.1 requires finding a good polynomial approximation, say $D$, to the target function. However, it is not always clear how constraint (3), on properties of the polynomial outside the interval of interest, may always be satisfied. We rectify this by adding an additional ancilla qubit to stage a cancellation of the $C$ term. This leads to Thm. 3.3. Subject only to parity and being bounded, we can then implement, without approximation, any arbitrary polynomial of degree exactly equal to the number of queries to the state preparation operator $\hat{G}$.
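This enables us to compute any real function with a query complexity exactly that of its best polynomial approximation, thus allowing us to transfer powerful results from approximation theory [93] to quantum computation.

Theorem 3.3 (Flexible amplitude amplification). Given a state preparation unitary $\hat{G}$ acting on the computational basis states $|0\rangle_a \in \mathbb{C}^d$, $|0\rangle_b \in \mathbb{C}^2$ such that $\hat{G}|0\rangle_a|0\rangle_b = \sin(\theta)|t\rangle_a|0\rangle_b + \cos(\theta)|t'\rangle_{ab}$, where $|t'\rangle_{ab}$ has no support on $|0\rangle_b$, let $D(\theta)$ be any function that satisfies all the following conditions:
(1) $D(\theta) = D(y)$, where $D$ is an odd real polynomial in $y = \sin(\theta)$ of degree at most $2N + 1$;
(2) $\forall y \in [-1, 1]$, $D^2(y) \le 1$.
Then there exists a quantum circuit $W_{\vec\phi}$ that requires $N + 1$ queries to $\hat{G}$, $N$ queries to $\hat{G}^\dagger$, $O(N \log(d))$ primitive quantum gates pre-computed in classical $O(\text{poly}(N))$ time, and an additional single-qubit ancilla $c$, to prepare the state

$$ W_{\vec\phi}|0\rangle_a|0\rangle_b|0\rangle_c = D(\theta)|t\rangle_a|0\rangle_b|0\rangle_c + A(\theta)|t'\rangle_{ab}|0\rangle_c + iC(\theta)|t\rangle_a|0\rangle_b|1\rangle_c - iB(\theta)|t'\rangle_{ab}|1\rangle_c, \tag{3.27} $$

for some real functions $A$, $B$, $C$ of lesser interest.

Proof of Thm. 3.3. Consider the composite iterate in Eq. 3.26 controlled by a single-qubit ancilla register indexed by subscript $c$:

$$ W_{\vec\phi} = V_{\vec\phi} \otimes |+\rangle\langle +|_c + V_{\vec\pi - \vec\phi} \otimes |-\rangle\langle -|_c, \tag{3.28} $$

where $|\pm\rangle = \frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$. Note that this can be implemented by controlling $\text{Ref}_{\alpha,|0\rangle}$, $\text{Ref}_{\beta,|t\rangle}$ in $V_{\vec\phi}$. The number of queries to $\hat{G}$, $\hat{G}^\dagger$ is unchanged, and $\hat{G}$, $\hat{G}^\dagger$ need not be controlled unitaries. Thus $W_{\vec\phi}$ still has query complexity $N = 2n + 1$ equal to that of $V_{\vec\phi}$. From the similarity transformation $\hat\sigma_y e^{-i\hat\sigma_\phi\theta}\hat\sigma_y = e^{-i\hat\sigma_{\pi-\phi}\theta}$,

$$ V_{\vec\pi - \vec\phi} = A(\theta) - iB(\theta)\hat\sigma_z - iC(\theta)\hat\sigma_x + iD(\theta)\hat\sigma_y, \tag{3.29} $$

where $\vec\pi$ is the vector where all elements are $\pi$. This allows us to stage a cancellation of $C$ when $W_{\vec\phi}$ is controlled by the ancilla state $|0\rangle_c$:

$$ W_{\vec\phi}|0\rangle_a|0\rangle_b|0\rangle_c = D(\theta)|t\rangle_a|0\rangle_b|0\rangle_c + A(\theta)|t'\rangle_{ab}|0\rangle_c + iC(\theta)|t\rangle_a|0\rangle_b|1\rangle_c - iB(\theta)|t'\rangle_{ab}|1\rangle_c, \tag{3.30} $$

where $|t\rangle_a|0\rangle_b|0\rangle_c$ is our new target state. Thus the amplitude $D$ on the target state is completely independent of $A$, $B$, $C$, regardless of what they may be.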
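The cancellation used in this proof can be checked numerically in the single-qubit picture. The following NumPy sketch is an illustration under stated assumptions rather than the construction of the text: the helper names, the random phases, the ordering "control qubit first" in the tensor product, and the identification of the first basis vector with the input state are all our own choices, and the overall signs of individual amplitudes depend on these conventions (here the $D$ term appears as $-D$).

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def rotation(phi, theta):
    """exp(-i * theta * (cos(phi) X + sin(phi) Y))."""
    axis = np.cos(phi) * SX + np.sin(phi) * SY
    return np.cos(theta) * I2 - 1j * np.sin(theta) * axis


def composite(phis, theta):
    """Product of rotations sharing the angle theta; phis[0] acts first."""
    v = I2.copy()
    for phi in phis:
        v = rotation(phi, theta) @ v
    return v


def pauli_components(v):
    """Real (A, B, C, D) with v = A + iB Z + iC X + iD Y, for v in SU(2)."""
    a = np.trace(v).real / 2
    b = np.trace(SZ @ v).imag / 2
    c = np.trace(SX @ v).imag / 2
    d = np.trace(SY @ v).imag / 2
    return a, b, c, d


def flexible_iterate(phis, theta):
    """|+><+| (x) V_phi + |-><-| (x) V_{pi - phi}, control qubit first."""
    plus = np.array([[1, 1], [1, 1]], dtype=complex) / 2
    minus = np.array([[1, -1], [-1, 1]], dtype=complex) / 2
    v = composite(phis, theta)
    v_flipped = composite(np.pi - np.asarray(phis), theta)
    return np.kron(plus, v) + np.kron(minus, v_flipped)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phis = rng.uniform(0.0, 2.0 * np.pi, size=5)   # hypothetical phases
    theta = 0.3
    a, b, c, d = pauli_components(composite(phis, theta))
    out = flexible_iterate(phis, theta)[:, 0]      # input: control |0>, system e0
    # Control |0> branch carries only A and D; control |1> carries B and C.
    print("control 0 branch:", out[:2], "expected", (a, -d))
    print("control 1 branch:", out[2:], "expected", (1j * b, 1j * c))
```

Averaging $V_{\vec\phi}$ and $V_{\vec\pi - \vec\phi}$ on the $|0\rangle_c$ branch removes the $\hat\sigma_x$ and $\hat\sigma_z$ components, which is why the amplitude associated with $D$ no longer depends on $C$.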
This allows us to directly apply the following result for achievable $D$ in Lem. 3.4. $\square$

Lemma 3.4 (Achievable $(D)$ - Thm. 2.8). For any odd integer $N > 0$, a choice of function $D$ in Eq. 3.26 is achievable by some $\vec\phi \in \mathbb{R}^N$ if and only if all the following are true:
(1) $D(\theta) = D(y)$, where $D$ is an odd real polynomial of degree at most $N$;
(2) $\forall y \in [-1, 1]$, $D^2(y) \le 1$.
Moreover, $\vec\phi \in \mathbb{R}^N$ can be computed in classical $O(\text{poly}(N))$ time.

3.6 Conclusion

Our success in solving the generalized amplitude amplification problem is evidence that results from analog models of quantum computation may transfer seamlessly to understanding and implementing algorithms meant for digital models of quantum computation. In the query model, the SU(2) space of a single qubit, as we study here, is isomorphic to the SU(2) subspace spanned by a uniform superposition of marked states and a uniform superposition of unmarked states.

More general query algorithms beyond quantum search can be built to calculate a Boolean function $f : \{0,1\}^n \to \{0,1\}$ that depends only on the number of marked states (i.e. $f(x) = \tilde{f}(|x|)$ for some $\tilde{f} : \{0, 1, \dots, n\} \to \{0,1\}$) and do so with a Grover-type algorithm of partial reflections. Thus, the same methods introduced here also give a way to determine how many reflections (analogous to our $L$) and what reflections (analogous to our $\vec\phi$) are required to compute any particular symmetric Boolean function, achieving the known lower bounds for this problem, which (not) coincidentally are also derived using polynomials [8]. As examples of this correspondence, among the polynomials presented in Chapter 2, DC$_{L,I}$ is an optimal solution for OR [136], whereas $M_{L,I}$ is optimal for Majority.

Chapter 4
Sparse Hamiltonian simulation by quantum signal processing

4.1 Introduction

The simulation of physical systems should be one of the easiest problems solvable on a quantum computer.
After all, unitary time-evolution is a natural consequence of Schrödinger's equation in continuous time, and nature performs this feat instantaneously with every tick of the universe. Despite this, the Hamiltonian simulation problem is surprisingly non-trivial on a digital quantum computer, which appears to be one of the unavoidable consequences of moving from an analog model of quantum computation based on Hamiltonians to a digital model based on discrete quantum gates. Ever since Feynman's particularly astute observation, this field of study has been subject to intense ongoing research.

The first explicit quantum algorithms for Hamiltonian simulation were discovered by Lloyd [81] for local Hamiltonians, which is particularly notable as most physical systems interact locally. This was then generalized by Aharonov and Ta-Shma [3] to $d$-sparse Hamiltonians with at most $d$ non-zero elements in every row. As inputs to the sparse Hamiltonian model are standard quantum oracles, this model is particularly suited to designing quantum algorithms, often through quantum walks. Moreover, its use of these oracles allows lower bounds on the difficulty of the simulation problem to be proven. This versatility has led to intense interest in improving sparse Hamiltonian simulation algorithms, with many celebrated results over the years [11, 23, 26, 28, 13, 10, 14]. Other modern developments also move beyond the sparse Hamiltonian model, such as with Hamiltonians described by a linear combination of unitary matrices [28, 14], or density matrices [82, 70].

Problem 1 (The Hamiltonian simulation problem). Let $t > 0$ be the time of simulation, and $\epsilon > 0$ be the target approximation error. Given access to a unitary quantum oracle $\hat O$ that describes the Hamiltonian $\hat H$, construct a $Q$-query quantum circuit $\hat V$ such that $\|\hat V - e^{-i\hat H t}\| \le \epsilon$, with the smallest possible $Q$.

The solution depends strongly on how $\hat O$ makes information about $\hat H$ available.
In the sparse Hamiltonian model, a number of lower bounds on the query cost of approximating $e^{-i\hat H t}$ with error $\epsilon$ are well-known. The "no-fast-forwarding" theorem [25, 10] demands at least $\Omega(\tau)$ queries independent of $\epsilon$, where $\tau = t d \|\hat H\|_{\max}$ and $\|\hat H\|_{\max}$ is the largest element of $\hat H$ in absolute value, and impressive recent work [13, 10] proved an exact error scaling of $\Theta\!\left(\frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ for $\tau = O(1)$. Though this suggests a naive additive lower bound $\Omega\!\left(\tau + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ [10], the best prior algorithms approach these factors multiplicatively with either linear scaling in time $O(\tau/\sqrt{\epsilon})$ [23] or sub-logarithmic scaling in error $O\!\left(\tau\frac{\log(\tau/\epsilon)}{\log\log(\tau/\epsilon)}\right)$ [31, 13]. Long unanswered is the existence of an algorithm that is unconditionally optimal, with implications for the relation between continuous and discrete-time models of physics, and of interest in problems [120] where $\tau$ and $\log(1/\epsilon)$ scale together.

In this chapter, we present an application of quantum signal processing to sparse Hamiltonian simulation. Unlike amplitude amplification in Chapter 3, which has a natural structure isomorphic to the unitaries QSP$_L$, the Hamiltonian simulation problem has no such symmetry in general. Indeed, a Hamiltonian with only two energy levels would not be the most interesting system to simulate. Nevertheless, we find that the dynamics of QSP$_L$ are naturally suited to the particular task of computing on eigenvalues of generic unitaries, which leads to our results by performing this computation on a certain quantum walk model of Childs [22] and Berry [12]. On one hand, this allows us to obtain the first Hamiltonian simulation algorithm optimal in all parameters, with query complexity $O\!\left(t d \|\hat H\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$, which matches the best lower bound. On the other hand, the resulting algorithm is extremely simple, with a very small constant space overhead, unlike prior art with an ancilla overhead scaling monotonically with $t, \epsilon$.
In Section 4.2, we define the standard quantum oracles that describe sparse Hamiltonians, and outline the simulation technique of quantum walks used in prior art. In Section 4.3, we apply QSP$_L$ to obtain a powerful technique for performing computations on the eigenphases of generic unitary operators. In Section 4.4, we prove our result on Hamiltonian simulation by combining this technique with that of quantum walks.

4.1.1 Attributions and contributions

The results of this chapter are taken from a significantly restructured preprint of published joint work [86] with Isaac L. Chuang. The manuscript was written by myself with helpful discussions and suggestions from my advisor. We thank Cedric Yen-Yu Lin, Robin Kothari, and Matthew Hastings for insightful discussions, and acknowledge funding by the ARO Quantum Algorithms Program, the NSF CUA, and NSF RQCC Project No. 1111337.

4.2 Quantum walks in sparse Hamiltonian simulation

The difficulty of Hamiltonian simulation depends strongly on the quantum oracle $\hat O$ that provides a description of $\hat H$. For instance, in the trivial case, the oracle might describe the Hamiltonian through its time-evolution operator in the form $\hat O|\psi\rangle = e^{-i\hat H t}|\psi\rangle$, which solves the problem. The issues in designing a suitable oracle can be subtle. In general, a $d$-sparse matrix $\hat H$ on $n$ qubits of exponential dimension $2^n$ has up to $d2^n$ non-zero matrix elements. If these $O(d2^n)$ numbers are completely random, recording them in a classical circuit would already take exponential time, which would negate the purpose of quantum simulation. Thus the standard oracles for sparse Hamiltonians implicitly assume that $\hat H$ is so structured that there exists an efficient classical algorithm built from $O(\mathrm{poly}(n))$ universal Boolean gates to compute the matrix elements of $\hat H$, given a target row and column index. Moreover, they also assume that there exists an efficient classical algorithm to compute the column indices of non-zero elements in every row.
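As a concrete illustration of the kind of structure assumed here, the following Python sketch gives the two classical routines for a toy 2-sparse Hamiltonian: nearest-neighbour hopping on a ring of $2^n$ sites with $n \ge 2$. The model, the function names `column_index` and `matrix_element`, and the hopping strength are illustrative assumptions and do not appear in the text; the point is only that each routine runs in time polynomial in $n$ even though the matrix has $2^n$ rows.

```python
def column_index(j: int, l: int, n: int) -> int:
    """Column of the l-th non-zero entry (l = 0 or 1) in row j, on a ring of 2**n sites."""
    size = 1 << n
    if l not in (0, 1):
        raise ValueError("this toy model is 2-sparse")
    # l = 0 selects the left neighbour, l = 1 the right neighbour (periodic wrap-around).
    return (j - 1) % size if l == 0 else (j + 1) % size


def matrix_element(j: int, k: int, n: int, hop: float = -1.0) -> float:
    """Entry H_jk of the hypothetical hopping Hamiltonian, computed with O(n)-bit arithmetic."""
    size = 1 << n
    return hop if (j - k) % size in (1, size - 1) else 0.0


n = 40  # 2**40 rows: far too many to tabulate, yet each entry is cheap to compute
j = 123_456_789
neighbours = [column_index(j, l, n) for l in range(2)]
values = [matrix_element(j, k, n) for k in neighbours]
```

Routines of this form are what the oracles of the sparse model would evaluate reversibly on a superposition of indices.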
Given these classical circuits, one can efficiently construct quantum oracles with the following properties:

Definition 4.1 (Sparse matrix oracles [12]). Sparse matrices with at most $d$ non-zero elements in every row are specified by two oracles. The oracle $\hat O_H|j\rangle|k\rangle|z\rangle = |j\rangle|k\rangle|z \oplus \hat H_{jk}\rangle$ queried by $j \in [2^n]$ row and $k \in [2^n]$ column indices returns the value $\hat H_{jk} = \langle j|\hat H|k\rangle$, with maximum absolute value $\|\hat H\|_{\max} = \max_{jk}|\hat H_{jk}|$. The oracle $\hat O_F|j\rangle|l\rangle = |j\rangle|f(j,l)\rangle$ queried by $j \in [2^n]$ row and $l \in [d]$ column indices computes in-place the column index $f(j,l)$ of the $l$-th non-zero entry of the $j$-th row.

In this model, the value $\hat H_{jk}$ is returned in $m$-bit binary. For simplicity in evaluating the query complexity, it is convention to assume that $\hat H_{jk}$ is exact by ignoring errors in any binary approximations. However, the gate complexity of arithmetic operations will account for finite $m$. Thus $\hat O_H$ acts on $2n + m$ qubits, and $\hat O_F$ acts on $2n$ qubits.

Quantum walks are a technique for transforming the description of a Hamiltonian $\hat H$ provided by the oracles in Def. 4.1 into a unitary quantum walk operator $\hat W$ with eigenphases non-linearly related to the eigenvalues of $\hat H|\lambda\rangle = \lambda|\lambda\rangle$. We state the key features of a particular quantum walk defined by Childs [23] and Berry [12]. In this quantum walk, one query each to $\hat O_F$, $\hat O_H$, $\hat O_H^\dagger$ and $O(n + m\,\mathrm{poly}(\log m))$ primitive gates suffice to implement an isometry $\hat T$ that maps every state $|\lambda\rangle|0\rangle^{\otimes n+m+2}$ onto two eigenstates $|\lambda_\pm\rangle$ of $\hat W$:
$$\hat T|\lambda\rangle = (|\lambda_+\rangle + |\lambda_-\rangle)/\sqrt{2}. \tag{4.1}$$
Moreover, $\hat T$ is constructed such that the walk operator $\hat W = i\hat S(2\hat T\hat T^\dagger - \hat I)$ has eigenvalues $\hat W|\lambda_\pm\rangle = e^{i\theta_{\lambda\pm}}|\lambda_\pm\rangle$ defined by
$$\theta_{\lambda\pm} = \pm\arcsin\!\left(\frac{\lambda}{\|\hat H\|_{\max} d}\right) + (1 \mp 1)\frac{\pi}{2}, \tag{4.2}$$
that depend on the $\hat H$ eigenvalues $\lambda$. As $\hat W$ corresponds to reflection about $\hat T\hat T^\dagger$ followed by swapping $(2n+2)$-qubit registers with $\hat S$, its query and gate complexities are identical to $\hat T$ up to constant factors.
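The two branches of Eq. 4.2 can be checked numerically. The sketch below assumes the eigenphases read $\theta_{\lambda\pm} = \pm\arcsin(\lambda/(\|\hat H\|_{\max} d)) + (1 \mp 1)\pi/2$ and uses $\tau = t d\|\hat H\|_{\max}$ from Section 4.1; the function name `walk_eigenphases` and the numerical values are illustrative only. It confirms that both eigenphases share the same sine, so any phase function of $\sin(\theta)$ treats $|\lambda_+\rangle$ and $|\lambda_-\rangle$ identically.

```python
import math


def walk_eigenphases(lam: float, d: int, h_max: float) -> tuple:
    """Eigenphases (theta_plus, theta_minus) of the walk operator for an eigenvalue lam of H."""
    x = lam / (h_max * d)
    theta_plus = math.asin(x)              # +arcsin(x) + (1 - 1) * pi / 2
    theta_minus = -math.asin(x) + math.pi  # -arcsin(x) + (1 + 1) * pi / 2
    return theta_plus, theta_minus


d, h_max, t = 4, 1.5, 2.0
tau = t * d * h_max
for lam in (-3.0, -0.7, 0.0, 1.2, 5.9):  # any |lam| <= d * h_max
    tp, tm = walk_eigenphases(lam, d, h_max)
    # The two branches differ in phase but share the same sine ...
    assert math.isclose(math.sin(tp), math.sin(tm), abs_tol=1e-12)
    # ... so -tau * sin(theta) reproduces the linear phase -lam * t on both.
    assert math.isclose(-tau * math.sin(tp), -lam * t, abs_tol=1e-12)
    assert math.isclose(-tau * math.sin(tm), -lam * t, abs_tol=1e-12)
```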
Hamiltonian simulation is achieved by creatively applying $\hat W$ some number of times in order to implement a state-dependent eigenphase $|\lambda_\pm\rangle \to e^{-i\lambda t}|\lambda_\pm\rangle$ independent of the $\pm$ index. Uncomputing with $\hat T^\dagger$ then maps $|\lambda_\pm\rangle$ back onto $|\lambda\rangle|0\rangle^{\otimes n+m+2}$ with the desired phase evolution. However, some difficulties arise. First, the applied phase is nonlinear in $\lambda$. Second, each eigenstate $|\lambda_\pm\rangle$ evolves under $\hat W$ with phases in opposite directions. Thus uncomputing with $\hat T^\dagger$ does not map $\hat W\hat T|\lambda\rangle|0\rangle^{\otimes n+m+2}$ back onto the basis $|\lambda\rangle|0\rangle^{\otimes n+m+2}$. In [10], these are overcome by approximating the unitary transformation in Eq. 4.4 with target function
$$h(\theta) = -\tau\sin(\theta) \;\Rightarrow\; h(\theta_{\lambda\pm}) = -\lambda t, \tag{4.3}$$
resulting in the desired phase, but implemented using a technique combining a linear combination of $N$ controlled-$\hat W^{1,\ldots,N}$ such that the success probability decays with $N$. This necessitates amplitude amplification on shorter segments $e^{-i\hat H t/M}$, each with error $\epsilon/M$, thus invariably mixing sub-optimal factors of $\tau$ and $\epsilon$ in the query complexity.

4.3 Eigenphase transformations by quantum signal processing

Given an arbitrary unitary oracle $\hat W$ with eigenstates $\hat W|u_\lambda\rangle = e^{i\theta_\lambda}|u_\lambda\rangle$, we now consider the general problem of approximating a quantum circuit $\hat V_{\text{ideal}}$ whose eigenphases are those of $\hat W$ transformed by some arbitrary real periodic function $h$:
$$\hat W = \sum_\lambda e^{i\theta_\lambda}|u_\lambda\rangle\langle u_\lambda| \;\mapsto\; \hat V_{\text{ideal}} = \sum_\lambda e^{ih(\theta_\lambda)}|u_\lambda\rangle\langle u_\lambda| \tag{4.4}$$
by querying the controlled-$\hat W$ operator. The main result of this section is a solution obtained by an unusual application of the class of unitary functions QSP$_N$.

Theorem 4.2 (Eigenphase transformations by QSP$_N$). $\forall$ real odd periodic functions $h : (-\pi, \pi] \to (-\pi, \pi]$ and even $N > 0$, let $(A(\theta), C(\theta))$ be real Fourier series in $(\cos(k\theta), \sin(k\theta))$, where $k = 0, \ldots, N/2$, that approximate
$$\max_{\theta\in\mathbb{R}} |A(\theta) + iC(\theta) - e^{ih(\theta)}| \le \epsilon. \tag{4.5}$$
Given the unitary $\hat W = \sum_\lambda e^{i\theta_\lambda}|u_\lambda\rangle\langle u_\lambda|$ and functions $A, C$, there exists a unitary quantum circuit $\hat V$ that requires $N/2$ queries to controlled-$\hat W$, $N/2$ queries to controlled-$\hat W^\dagger$, and $O(N)$ single-qubit gates, such that $\langle +|\hat V|+\rangle$ approximates $\hat V_{\text{ideal}} = \sum_\lambda e^{ih(\theta_\lambda)}|u_\lambda\rangle\langle u_\lambda|$ with success probability $p \ge 1 - 16\epsilon$ and trace distance
$$\epsilon_{\mathrm{Tr}} = \max_{|\psi\rangle} \left\|\left(\langle +|\hat V|+\rangle - \hat V_{\text{ideal}}\right)|\psi\rangle\right\| \le 8\epsilon. \tag{4.6}$$

Two key properties distinguish Thm. 4.2 from routines that can effect similar transformations, such as quantum phase estimation [98] or linear-combination-of-unitaries [28, 10, 14], which require a large number of ancilla. First is its intuitive use of just a single ancilla qubit. Second, the query complexity of the methodology is exactly the degree $N$ of optimal trigonometric polynomial approximations to $e^{ih(\theta)}$ with error $\epsilon$ [92, 102, 105, 126], without the decaying success probability of prior art. These are analogous to digital filter design techniques in discrete-time signal processing [101], and allow for some transfer of knowledge from the vast field of function approximation [105].

The remainder of this section is dedicated to proving Thm. 4.2. Let us recall the class of unitary functions QSP$_L$ in Def. 2.4 constructed from a length $L$ sequence of single-qubit rotations
$$\hat V_{\vec\phi}(\theta) = \hat R_{\phi_L}(\theta)\hat R_{\phi_{L-1}}(\theta)\cdots\hat R_{\phi_1}(\theta) = A(\theta)\hat I + iB(\theta)\hat\sigma_z + iC(\theta)\hat\sigma_x + iD(\theta)\hat\sigma_y, \tag{4.7}$$
where $\hat R_\phi(\theta) = e^{-i\frac{\theta}{2}(\hat\sigma_x\cos\phi + \hat\sigma_y\sin\phi)}$. The complete classification of possible functions $A, B, C, D$ is described in Chapter 2. Using Chebyshev polynomials of the first and second kind, $T_k(\cos\theta) = \cos(k\theta)$ and $U_k(\cos\theta) = \sin((k+1)\theta)/\sin(\theta)$, we may map the results in Thm. 2.7 from trigonometric polynomials to a Fourier series:

Lemma 4.3 (Achievable $(A, C)$, adapted from Cor. 2.10). $\forall$ even $N > 0$, a choice of real functions $A, C$ can be implemented by some $\vec\phi \in \mathbb{R}^N$ if and only if all these are true:
(1) $\forall\theta \in \mathbb{R}$, $A^2(\theta) + C^2(\theta) \le 1$.
(2) $A(0) = 1$.
(3) $A(\theta) = \sum_{k=0}^{N/2} a_k\cos(k\theta)$, $\{a_k\} \in \mathbb{R}^{N/2+1}$.
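(4) $C(\theta) = \sum_{k=1}^{N/2} c_k\sin(k\theta)$, $\{c_k\} \in \mathbb{R}^{N/2}$.
Moreover, $\vec\phi$ can be efficiently computed from the functions $A, C$.

Figure 4-1: Quantum circuits mapping (a) a sequence of single-qubit rotations $\hat V_{\vec\phi}(\theta)$ to (d) quantum signal processing $\hat V_{\vec\phi}$. Each single-qubit rotation $\hat R_\phi(\theta)$ is replaced by (b) $\hat U_\phi$, built from Hadamard gates and controlled-$\hat W$ with eigenstates $\hat W|u_\lambda\rangle = e^{i\theta_\lambda}|u_\lambda\rangle$. Thus (c) $\hat U_\phi$ on input $|u_\lambda\rangle$ reduces to a single-qubit rotation $\hat R_\phi(\theta_\lambda)$. By linearity, $\hat V_{\vec\phi}$ on an arbitrary input $|\psi\rangle$ may be understood as rotations $\hat V_{\vec\phi}(\theta_\lambda)$ controlled by a superposition of $|u_\lambda\rangle$. By some choice of single-qubit input state and measurement basis, coefficients of the $|u_\lambda\rangle$ are then rescaled by the components of the function $\hat V_{\vec\phi}(\theta_\lambda)$ programmed by $\vec\phi$.

These results map, in three steps, to a quantum circuit that approximates Eq. 4.4.

(a) Signal transduction of $\hat W$ into a signal unitary classically controlled by $\phi \in \mathbb{R}$:
$$\hat U_\phi = \sum_\lambda \hat R_\phi(\theta_\lambda) \otimes |u_\lambda\rangle\langle u_\lambda|. \tag{4.8}$$
This is implemented in Fig. 4-1b with one controlled-$\hat W$, which is always possible on a quantum computer in the worst case by replacing all of its gates with controlled versions, and $O(1)$ single-qubit rotations:
$$\hat U_\phi = (e^{-i\phi\hat\sigma_z/2} \otimes \hat I)\,\hat U_0\,(e^{i\phi\hat\sigma_z/2} \otimes \hat I), \qquad \hat U_0 = |+\rangle\langle+| \otimes \hat I + |-\rangle\langle-| \otimes \hat W = \sum_\lambda e^{i\theta_\lambda/2}\hat R_0(\theta_\lambda) \otimes |u_\lambda\rangle\langle u_\lambda|, \tag{4.9}$$
where $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$. As $\hat U_\phi$ acting on $|u_\lambda\rangle$ selects the rotation $\hat R_\phi(\theta_\lambda) = \langle u_\lambda|\hat U_\phi|u_\lambda\rangle$ as seen in Fig. 4-1c, these are precisely the single-qubit rotations in Eq. 4.7 with rotation angle $\theta_\lambda$ controlled by the $\lambda$ index, but with an additional global phase $e^{i\theta_\lambda/2}$.

(b) Signal transformation by computing unitary functions $\hat V_{\vec\phi}(\theta_\lambda)$ over a superposition of $\theta_\lambda$ on the single-qubit ancilla through the simple circuit of Fig. 4-1d:
$$\hat V_{\vec\phi} = \hat U^\dagger_{\phi_N+\pi}\hat U_{\phi_{N-1}}\cdots\hat U^\dagger_{\phi_2+\pi}\hat U_{\phi_1}, \qquad \vec\phi \in \mathbb{R}^N,\ N \text{ even}. \tag{4.10}$$
As this invokes $\hat W$ a number $N$ times, its query cost is $O(N)$. Note that the unwanted phase $e^{i\theta_\lambda/2}$ can be uncomputed by alternating between $\hat U_\phi$ and $\hat U^\dagger_{\phi+\pi}$, since $\hat R^\dagger_{\phi+\pi}(\theta) = \hat R_\phi(\theta)$ and $N$ is even.

(c) Signal projection of the ancilla onto some basis, to select desired components of $\hat V_{\vec\phi}(\theta_\lambda)$ in Eq.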
4.7. As the desired phase transformation can be implemented through $A(\theta), C(\theta)$, consider the input state $|+\rangle|u_\lambda\rangle$, and postselect on measuring $\langle +|$. Other choices are of course possible. This applies onto state $|u_\lambda\rangle$ the coefficient
$$\langle +|\hat V_{\vec\phi}|+\rangle|u_\lambda\rangle = (A(\theta_\lambda) + iC(\theta_\lambda))|u_\lambda\rangle, \qquad p = \min_{\theta\in\mathbb{R}} |\langle +|\hat V_{\vec\phi}(\theta)|+\rangle|^2 = \min_{\theta\in\mathbb{R}} |A(\theta) + iC(\theta)|^2, \tag{4.11}$$
with worst-case success probability $p$. Thus (a)-(c) provide a reduction from finding quantum algorithms for approximating $\hat V_{\text{ideal}}$ to finding Fourier approximations $A(\theta) + iC(\theta)$ to $e^{ih(\theta)}$.

The error of approximation may now be evaluated. Given $A(\theta), C(\theta)$ that satisfy Eq. 4.5, conditions (1), (2) of Lem. 4.3 will not generally be satisfied. (1) is violated as the maximum of $|A(\theta)|, |C(\theta)|$ is at worst $1 + \epsilon$. Thus we rescale
$$A_1(\theta) = \frac{A(\theta)}{1+\epsilon}, \quad C_1(\theta) = \frac{C(\theta)}{1+\epsilon}, \quad |(A_1(\theta) + iC_1(\theta)) - e^{ih(\theta)}| \le \frac{\epsilon}{1+\epsilon} + \epsilon < 2\epsilon, \tag{4.12}$$
at the cost of a slightly larger error $2\epsilon$. Note that $A_1^2 + C_1^2 \ge \left(\frac{1-\epsilon}{1+\epsilon}\right)^2$. (2) is violated as $A_1(0) = \cos\delta \ge \frac{1-\epsilon}{1+\epsilon}$ for some $\delta \in \mathbb{R}$. Fixing this is more involved. As $\hat V(\theta)$ is unitary, $A_1^2 + B^2 + C_1^2 + D^2 = 1$. We can apply the prescription in [89] using polynomial sum-of-squares to compute the unspecified $B, D$ from $A_1, C_1$ such that $B, D$ are of the form (3) and (4) respectively. Thus $A_1^2(0) + B^2(0) = 1$, and $|B(0)| = |\sin\delta|$. Define
$$A_2(\theta) = A_1(\theta)\cos\delta + B(\theta)\sin\delta, \qquad |A_2(\theta) - A_1(\theta)| \le \frac{2\epsilon}{1+\epsilon} + |B(\theta)|\frac{2\sqrt{\epsilon}}{1+\epsilon} < 6\epsilon. \tag{4.13}$$
This introduces an additional error, bounded by using the triangle inequality and $B^2 \le 1 - A_1^2 - C_1^2 \le 1 - \left(\frac{1-\epsilon}{1+\epsilon}\right)^2 = \frac{4\epsilon}{(1+\epsilon)^2}$. By construction, $A_2(0) = 1$. The functions $A_2(\theta), C_1(\theta)$ thus satisfy Lem. 4.3. By adding the errors in Eqs. 4.12, 4.13, the distance of $\langle +|\hat V|+\rangle$ from $\hat V_{\text{ideal}}$ in Eq. 4.4 and the worst-case success probability in Eq. 4.11 are
$$\epsilon_{\mathrm{Tr}} \le \max_{\theta\in\mathbb{R}} |A_2(\theta) + iC_1(\theta) - e^{ih(\theta)}| \le 8\epsilon, \qquad p \ge (1 - 8\epsilon)^2 \ge 1 - 16\epsilon. \tag{4.14}$$

4.4 Optimal sparse Hamiltonian simulation

Hamiltonian simulation by applying quantum signal processing in Thm. 4.2 to the quantum walk of Section 4.2 requires a good Fourier approximation $A(\theta) + iC(\theta)$
$\approx e^{-i\tau\sin(\theta)}$, (4.15)
which is provided by the Jacobi-Anger expansion [1]
$$\cos(\tau\sin(\theta)) = J_0(\tau) + 2\sum_{k\ \mathrm{even} > 0} J_k(\tau)\cos(k\theta), \qquad \sin(\tau\sin(\theta)) = 2\sum_{k\ \mathrm{odd} > 0} J_k(\tau)\sin(k\theta), \tag{4.16}$$
where $J_k(\tau)$ are Bessel functions of the first kind. Note that these Fourier series are already in the form required by conditions (3), (4) of Lem. 4.3. As $|J_k(\tau)| \le \frac{1}{k!}\left|\frac{\tau}{2}\right|^k$ [1] decays rapidly with $k$, good approximations are obtained by truncating Eq. 4.16 at $k > N/2$. This approximates $e^{-i\tau\sin(\theta)}$ with error shown in [10] for $\tau \le N/2 = q - 1$ to be
$$\epsilon \le \sum_{k=q}^{\infty} 2|J_k(\tau)| \le \frac{4\tau^q}{2^q q!}. \tag{4.17}$$
Inserting into Thm. 4.2, the query complexity of Hamiltonian simulation follows by solving Eq. 4.17 for $N$, using the implementation of $\hat U_\phi$ in Eq. 4.9 with $O(1)$ queries, and that $\hat V_{\vec\phi}$ in Eq. 4.10 contains $N$ applications of $\hat U_\phi$.

The optimality of this result for all input parameters follows from known lower bounds. Specifically, Eq. 4.17 is matched with a corresponding lower bound $N = \Omega(q)$ [10, 73] for any $q$ satisfying C< 2 sin (4.18) Note that Eqs. 4.17, 4.18 are solved by the Lambert W-function [32], which captures the detailed trade-off between $\tau$ and $\epsilon$. Its asymptotic behavior may be understood by substituting q =(T +-y), where -r, y > 0. When T O (), one finds y = 0 (log(()) Thus we express the complexity of Hamiltonian simulation as

Theorem 4.4 (Optimal sparse Hamiltonian simulation). A $d$-sparse Hamiltonian $\hat H$ on $n$ qubits with matrix elements specified to $m$ bits of precision can be simulated for time-interval $t$, error $\epsilon$, and success probability at least $1 - 2\epsilon$ with $O\!\left(t d\|\hat H\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries and a factor $O(n + m\,\mathrm{polylog}(m))$ additional quantum gates.

This is valid for T = 0( logloglog(1/(1 /E) )n ) and stronger than prior art [13, 10] which assumes $\tau = O(1)$. Unlike most Hamiltonian simulation algorithms, the query cost is additive in the simulation length $\tau$ and the target error $\epsilon$. As such, the $\tau$ term matches the lower bound $\Omega(\tau)$ [10] with no multiplicative dependence on error.
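The truncation argument of this section can be tried out numerically. The following Python sketch is illustrative only: it evaluates $J_k$ from its power series (adequate for moderate $\tau$), builds the truncated series $A(\theta) + iC(\theta)$ for $e^{-i\tau\sin(\theta)}$ from the Jacobi-Anger expansion, and searches for the smallest $q \ge \tau + 1$ whose bound, assumed here to be $4\tau^q/(2^q q!)$ as in Eq. 4.17, falls below a target $\epsilon$. All function names and the sample values of $\tau$ and $\epsilon$ are assumptions made for the example.

```python
import cmath
import math


def bessel_j(k: int, x: float, terms: int = 60) -> float:
    """Bessel function J_k(x), k >= 0, summed from its power series."""
    return sum(
        (-1) ** m / (math.factorial(m) * math.factorial(m + k)) * (x / 2) ** (2 * m + k)
        for m in range(terms)
    )


def truncated_phase(theta: float, tau: float, q: int) -> complex:
    """A(theta) + i C(theta): Jacobi-Anger series for exp(-i tau sin(theta)), keeping k < q."""
    a = bessel_j(0, tau) + 2 * sum(
        bessel_j(k, tau) * math.cos(k * theta) for k in range(2, q, 2)
    )
    # exp(-i x) = cos(x) - i sin(x), hence the minus sign on the sine series.
    c = -2 * sum(bessel_j(k, tau) * math.sin(k * theta) for k in range(1, q, 2))
    return complex(a, c)


def error_bound(tau: float, q: int) -> float:
    """Assumed form of the truncation bound: 4 tau^q / (2^q q!)."""
    return 4 * tau ** q / (2 ** q * math.factorial(q))


def smallest_q(tau: float, eps: float) -> int:
    """Smallest q >= tau + 1 with error_bound(tau, q) <= eps, by linear search."""
    q = max(1, math.ceil(tau) + 1)
    while error_bound(tau, q) > eps:
        q += 1
    return q


tau, eps = 5.0, 1e-6
q = smallest_q(tau, eps)
grid = [2 * math.pi * s / 400 for s in range(400)]
worst = max(
    abs(truncated_phase(th, tau, q) - cmath.exp(-1j * tau * math.sin(th))) for th in grid
)
assert worst <= error_bound(tau, q) <= eps
```

In this sketch $q$ plays the role of $N/2 + 1$, so the number of queries grows only additively with $\tau$ and with the number of digits of precision requested.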
Chapter 5

Standard-form Hamiltonian simulation by qubitization

5.1 Introduction

Previously in Chapter 4, we presented a simulation algorithm based on QSP$_L$ for sparse Hamiltonians that was optimal with respect to all parameters: time $t$, sparsity $d$, max-norm $\|\hat H\|_{\max}$, and error $\epsilon$. However, the quantum oracles that describe sparse Hamiltonians are only one of several viable alternatives. For instance, some Hamiltonians are more naturally described by a linear combination of unitary matrices [28, 14] or density matrices [82, 70]. It could be possible that approximating time-evolution by those Hamiltonians would require fundamentally different simulation techniques. Thus QSP$_L$, which relied crucially on properties of a particular quantum walk, might not be applicable.

In this chapter, we present a very general 'standard-form' encoding of Hamiltonians $\hat H$ that is compatible with most known input models. More precisely, the standard-form may be simulated exactly with a constant number of queries to those other quantum oracles that describe Hamiltonians. Using a procedure we call 'qubitization', we impose an SU(2)-like structure analogous to quantum walks, which enables the application of QSP$_L$ for computing on eigenvalues of $\hat H$. Its application to Hamiltonian simulation generalizes our previous optimal sparse Hamiltonian simulation algorithm, and also furnishes simulation algorithms for other input models that significantly improve upon prior art in both performance and simplicity. Whereas Chapter 4 focused on implementing unitary operator functions of $\hat H$, we consider in greater generality the classes of non-unitary functions of $\hat H$ that may be computed, which enables new approaches to problems in Table 5.1 such as quantum linear systems and Gibbs sampling, which essentially compute the functions $\hat H^{-1}$ and $e^{-\beta\hat H}$.

In Section 5.2, we introduce the standard-form encoding of matrices and show how a number of common oracles describing matrices map easily to it.
However, it is not clear how computations on the encoded matrices may be performed. This is rectified in Section 5.3, where we introduce a procedure called qubitization that imposes a qubit structure onto the standard-form. In Section 5.4, this qubit structure allows the application of QSP$_L$ to computing broad classes of polynomial functions of $\hat H$, which we enumerate in detail. As this qubit structure also resembles that of quantum walks, it also allows in Section 5.5 the computation of the time-evolution operator using the same quantum signal processing techniques of Chapter 4.

Problem | BCCKS [10] | d-sparse [86] | Evolution by $\hat\rho$ | QLSP [27] | Gibbs [29]
$\hat H$ | Hamiltonian | Hamiltonian | Density matrix | Matrix | Hamiltonian
$\hat U$ | Selects $\hat U_j$ | Isometry $\hat T$ | SWAP | Any | Any
$\hat G$ | $\alpha_j$ coefficients | Identity | Purified $\hat\rho$ | Any | Any
Solution | $e^{-i\hat H t}$ | $e^{-i\hat H t}$ | $e^{-i\hat\rho t}$ | $\hat H^{-1}$ | $e^{-\beta\hat H}$

Table 5.1: List of example problems (top row), solvable using quantum signal processing and qubitization to compute an operator function $f[\cdot]$ of $\hat H$, the Hermitian component of $\hat C = (\langle G|_a \otimes \hat I_s)\hat U(|G\rangle_a \otimes \hat I_s)$. Through qubitization, the scope of inputs to the Quantum Linear Systems Problem (QLSP) and Gibbs Sampling (Gibbs) can be any $\hat H$ of this form, either indirectly through Hamiltonian simulation, or directly through quantum signal processing on the standard-model.

5.1.1 Attributions and contributions

This section is based on submitted joint work [84] with Isaac L. Chuang, but significantly rewritten with some new content. The manuscript was written by myself with helpful discussions and suggestions from my advisor. We thank Robin Kothari, Yuan Su, and Andrew Childs for insightful discussions, and acknowledge funding by the ARO Quantum Algorithms Program and NSF RQCC Project No. 1111337.

5.2 The standard-form encoding of matrices

The standard-form encoding provides a natural model through which information about some matrix of interest is made available to a quantum computer.
We assume that a complex matrix $\hat C$ is encoded within a unitary quantum oracle $\hat U$, the signal oracle, in the following manner.

Definition 5.1. A matrix $\hat C \in \mathbb{C}^{n\times n}$ acting on the system register $s$ is encoded in standard-form-$(\hat C, \alpha, \hat U, d)$ with normalization $\alpha \ge \|\hat C\|$ by the computational basis state $|0\rangle_a \in \mathbb{C}^d$ on the ancilla register $a$ and signal unitary $\hat U \in \mathbb{C}^{dn\times dn}$ if $(\langle 0|_a \otimes \hat I_s)\hat U(|0\rangle_a \otimes \hat I_s) = \hat C/\alpha$. If $\hat C$ is also Hermitian, this is called a Hermitian standard-form encoding.

One reason why the standard-form should be considered natural is that it is no more or less than the steps of generalized measurement [98], which is fundamental to discrete-time quantum computation: on measurement outcome $|0\rangle_a$ with best-case success probability $(\|\hat C\|/\alpha)^2 \le 1$, a measurement operator $\hat C/\alpha$ is applied on the system. Note that while the normalization constant $\alpha$ could always be absorbed into a redefinition of $\hat C$, it is useful in some cases to leave it explicit. This naturalness is further supported by how matrices described through other common quantum oracles easily map to the standard-form, such as in Figure 5-1.

Figure 5-1: Quantum circuits for (a) the standard-form encoding, and the standard-form encodings of matrices that are (b) a linear combination of unitaries, (c) a density matrix, and (d) sparse.

5.2.1 Matrices from a linear combination of unitaries

Matrices formed from a linear combination of unitaries are considered in [28] and [14]. We present a slightly more general version of their encoding. Suppose that the matrix $\hat C$, acting on the system register $s$, has the decomposition
$$\hat C = \sum_{j=1}^{d} \alpha_j\hat U_j, \qquad \alpha = \sum_{j=1}^{d} |\alpha_j|, \tag{5.1}$$
for some number $d$ of arbitrary complex coefficients $\alpha_j$ and arbitrary unitaries $\hat U_j$, and let $\alpha$ be the sum of absolute values of the $\alpha_j$. Assume that there exists a selector oracle $\hat V$ and two state preparation oracles $\hat G_1$ and $\hat G_2$ defined as
$$\hat V = \sum_{j=1}^{d} |j\rangle\langle j|_a \otimes \hat U_j, \qquad \hat G_1|0\rangle_a = \sum_{j=1}^{d} \sqrt{\frac{\alpha_j}{\alpha}}\,|j\rangle_a, \qquad \langle 0|_a\hat G_2^\dagger = \sum_{j=1}^{d} \sqrt{\frac{\alpha_j}{\alpha}}\,\langle j|_a.$$
(5.2) Note that the definition of $\hat G_2$ through its inverse is intentional, to avoid ambiguities in the principal value of the square root. Then it is easy to verify that
$$((\langle 0|_a\hat G_2^\dagger) \otimes \hat I_s)\,\hat V\,((\hat G_1|0\rangle_a) \otimes \hat I_s) = \hat C/\alpha. \tag{5.3}$$
Thus the signal oracle $\hat U = (\hat G_2^\dagger \otimes \hat I_s)\hat V(\hat G_1 \otimes \hat I_s)$ encodes $\hat C$ in standard-form-$(\hat C, \alpha, \hat U, d)$, using 1 query each to $\hat V$, $\hat G_1$, and $\hat G_2$.

5.2.2 Density matrices

Matrices that are density matrices are considered in [82] and [70]. In general, a density matrix $\hat H$ is Hermitian, and has the representation, in some $n$-dimensional basis on the system register,
$$\hat H = \sum_{j=1}^{d} \alpha_j|\psi_j\rangle\langle\psi_j|_s, \tag{5.4}$$
where $\alpha_j$ are probabilities that sum to 1 and $|\psi_j\rangle$ are arbitrary quantum states. There, they assume access to a quantum channel $\mathcal{E}$ that on input state $|0\rangle\langle 0|$, of some dimension greater than $\hat H$, produces the density matrix $\mathcal{E}(|0\rangle\langle 0|) = \hat H$. We present a slightly different version of this encoding that assumes we can simulate $\mathcal{E}$ as a unitary process. Assume that there exists a state preparation oracle $\hat G$ that prepares any purification of $\hat H$, that is
$$\hat G|0\rangle_a = \hat G|0\rangle_{a_1}|0\rangle_{a_2} = |G\rangle_a = \sum_{j=1}^{d} \sqrt{\alpha_j}\,|j\rangle_{a_1}|\psi_j\rangle_{a_2}. \tag{5.5}$$
Note that the register $a_2$ is of the same dimension as $s$. One can also verify that $\mathrm{Tr}_{a_1}[\hat G|0\rangle\langle 0|_a\hat G^\dagger] = \hat H$ as expected. Let the unitary operation $\hat S$ swap the $a_2$ and $s$ registers, and let $\{|\lambda\rangle\}$ be an arbitrary complete basis on the system. Then
$$(\langle G|_a \otimes \hat I_s)\hat S(|G\rangle_a \otimes \hat I_s) = \sum_\lambda (\langle G|_a \otimes \hat I_s)\hat S(|G\rangle_a \otimes |\lambda\rangle\langle\lambda|_s) = \sum_\lambda\sum_j \alpha_j|\psi_j\rangle_s\langle\psi_j|\lambda\rangle\langle\lambda|_s = \hat H. \tag{5.6}$$
Thus the signal oracle $\hat U = (\hat G^\dagger \otimes \hat I_s)\hat S(\hat G \otimes \hat I_s)$ encodes $\hat H$ in standard-form-$(\hat H, \alpha, \hat U, nd)$, using 2 queries to $\hat G$, and $O(\log(n))$ primitive quantum gates.

5.2.3 Sparse matrices

In Chapter 4, we considered sparse matrices. Let us recall the definition of the sparse matrix oracles.

Definition 5.2 (Sparse matrix oracles [12]). Sparse matrices with at most $d$ non-zero elements in every row are specified by two oracles.
The oracle OH 1j)Ik)Iz) lj)lk)lz ( IJk) queried by j c [n] row and k E [n] column indices returns the value Hk = (j|HIk), with maximum absolute value ||$||max = maxk |Hjkl. The oracle OFljl) =i)If(j,1)) queried by j G [n] row and 1 E [d] column indices computes in-place the column index f(j, 1) of the l1 h non-zero entry of the jth row. By using integer arithmetic to compute elementary functions, one can transform values of Hjk encoded in m-bit binary into quantum gates that implement rotations by some function of Hjk. Thus given a state lj), indicating the row index of a matrix H acting on the system register, and an upper bound Amax > lIftllrax, one can follow a procedure similar to that in [12] to construct unitary operators Urow and Uci that prepare the states Ucoil0)alj)s = (0Fa(kjU k,)as = E j)slp)ai (ks(q|akk (Xk las = qEF ( P 10)a2 + AFdmax 1- Amax (lqk q + (1 - kq)Hq (01a2 + 1F Amax m 68 1)a2 (57) , (21a 2Ama2J m where 6 jk is the Kronecker delta function, and F= {k : k = f(j, 1) , 1 E [d]} is the set of non-zero column indices in row j. The procedure requires 0(1) queries to OH and F, and uses O(poly(m)) gates for arithmetic, and O(log (d)) gates for creating the superposition over Fj. Note that our definition of the isometry Eq. 6.21 is an improvement over [12] as it avoids ambiguity in both the principal range of the square-roots when Hjk < 0 and a sign problem when Hjj < 0. From [12], the gate complexity of Ucoi, Urow, and Umix combined is O(log (n) + poly(m)), where m is the number of bits of precision of Stk. The contribution from poly(m) = O(m 5 / 2 ) is due to integer arithmetic for computing squareroots and trigonometric functions. Let S, which requires O(logn) primitive gates, be a unitary operator that swaps the registers s and a,. 
Thus one can verify that (xjISI'k) = dAmaxI (&0 (1a2 0S) Thus the signal oracle U = (Urowt 0 Is)(a 0 0 &S dAmax )(Uco 0 ( I) encodes $\hat H$ in standard-form-$(\hat H, \alpha, \hat U, 3n)$, using $O(1)$ queries to $\hat O_H$ and $\hat O_F$, and $O(\log(n) + \mathrm{poly}(m))$ primitive quantum gates.

5.3 Qubitization

The generality of the standard form means that it is not immediately obvious how structured computation is possible on it. Several problems are apparent. First, $\hat C$ is not in general unitary, thus the standard technique of quantum phase estimation to encode its eigenvalues in binary on a control register is not possible. Second, $\hat C$ is only defined through a projection of the signal oracle by the states $|0\rangle_a$. Thus $\hat U$ does not immediately yield information on the eigenvalues of $\hat C$, and quantum phase estimation on $\hat U$ is ineffective in general. Third, these limitations suggest that the only available option is to perform repeated measurements of $\hat C^n|\psi\rangle_s$ on some input state $|\psi\rangle_s$, but this is not very exciting and succeeds with probability vanishing exponentially with $n$.

Our result of 'qubitization' overcomes these problems. Given only access to oracles encoding $\hat C$ in the standard form, this step translates the encoding to unitary evolution with eigenvalues and eigenstates that depend directly on $\hat C$, using at most one additional ancilla qubit. The signal oracle $\hat U$ is queried to obtain a Grover-like search parallelized over all eigenvalues $\lambda$ of the Hermitian component $\hat H = \frac{1}{2}(\hat C + \hat C^\dagger)$ through the unitary qubiterate $\hat W = e^{-i\hat\sigma_y \otimes \cos^{-1}[\hat H]}$ in some basis. Just as how the standard-model generalizes the inputs to various problems in Section 5.2, qubitization appears to generalize a number of other quantum algorithms of foundational importance. For instance, the gap $\Delta$ of eigenvalues $\lambda = 1 - \Delta$ of $\hat H$ is amplified to $\cos^{-1}(\lambda) = O(\sqrt{\Delta})$ in the phase of the iterate, which resembles spectral gap amplification [118], the quantization of stochastic matrices [122], as well as Szegedy's [123] and Childs' [23] quantum walks.
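The key difference lies in the extremely general encoding of the signal through any signal oracle $\hat U$ of the standard form.

Let us consider the standard-form encoding in Def. 5.1 in more detail. Given an encoding of a complex matrix $\hat C$ in standard-form-$(\hat C, 1, \hat U, d)$, the unitary signal oracle $\hat U : \mathcal{H}_a \otimes \mathcal{H}_s \to \mathcal{H}_a \otimes \mathcal{H}_s$ acts on system ($s$) and ancilla ($a$) registers. Conditional on the measurement outcome $|0\rangle_a$, this implements some non-unitary signal operator $\hat C : \mathcal{H}_s \to \mathcal{H}_s$ on some $n$-qubit input system state $|\psi\rangle_s \in \mathcal{H}_s$. This divides $\hat U$ into two subspaces: $\mathcal{H}_G = |0\rangle_a \otimes \mathcal{H}_s$, where $\hat U|0\rangle|\psi\rangle$ may be projected onto $|0\rangle\hat C|\psi\rangle$ with probability $\|\hat C|\psi\rangle\|^2$ for all $|\psi\rangle$, and its orthogonal complement $\mathcal{H}_G^\perp$. In other words,
$$\hat U|0\rangle_a|\psi\rangle_s = |0\rangle_a\hat C|\psi\rangle_s + \sqrt{1 - \|\hat C|\psi\rangle\|^2}\,|0_\psi^\perp\rangle_{as}, \tag{5.9}$$
where the signal operator $\hat C$ has spectral norm $\|\hat C\| = \|(\langle 0|_a \otimes \hat I_s)\hat U(|0\rangle_a \otimes \hat I_s)\| \le 1$, and $(\langle 0|_a \otimes \hat I_s)|0_\psi^\perp\rangle_{as} = 0$. Whenever the context is clear, we drop the ancilla and system subscripts, and use $|0\rangle_a \otimes \hat I_s$ and $|0\rangle_a$ interchangeably. We represent $\hat U$ such that the top-left block is precisely $\hat C$ and acts on an input state $|0_\psi\rangle = |0\rangle|\psi\rangle \in \mathcal{H}_G$, whereas the undefined parts of $\hat U$ transform $|0_\psi\rangle$ into some orthogonal state $|0_\psi^\perp\rangle \in \mathcal{H}_G^\perp$ of lesser interest.

In the following, we consider the case where $\hat C$ is a normal matrix $\hat N$, and will very soon further specialize to Hermitian $\hat H$. Thus the action of $\hat U$ on $|0_\psi\rangle$ in Eq. 5.9 can be more easily understood by decomposing $|\psi\rangle$ as a linear combination of eigenstates $\hat N|\lambda\rangle = \lambda e^{i\chi_\lambda}|\lambda\rangle$, where $\lambda, \chi_\lambda \in \mathbb{R}$:
$$\hat U|0_\lambda\rangle = \lambda e^{i\chi_\lambda}|0_\lambda\rangle + g(\lambda)|0_\lambda^\perp\rangle, \qquad g(\lambda) = \sqrt{1 - \lambda^2}. \tag{5.10}$$
Later on, operator functions indicated by $[\cdot]$ are defined by $f[\hat N] = \sum_\lambda f(\lambda e^{i\chi_\lambda})|\lambda\rangle\langle\lambda|$ for any scalar function $f(\cdot)$. For each eigenstate $|\lambda\rangle$, we also find it useful to define the subspace $\mathcal{H}_\lambda = \mathrm{span}\{|0_\lambda\rangle, |0_\lambda^\perp\rangle\} = \mathrm{span}\{|0_\lambda\rangle, \hat U|0_\lambda\rangle\}$ and its Pauli operator basis $\hat\sigma_x^\lambda, \hat\sigma_y^\lambda, \hat\sigma_z^\lambda$, where $\hat\sigma_z^\lambda|0_\lambda\rangle = |0_\lambda\rangle$, $\hat\sigma_z^\lambda|0_\lambda^\perp\rangle = -|0_\lambda^\perp\rangle$, and the $\lambda$ in the exponent are labels. Note that these subspaces are disjoint as all $|0_\lambda\rangle$ have zero mutual overlap and $\hat U$, which defines $|0_\lambda^\perp\rangle$, is unitary.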
Note the trivial case A = 1, where W/\ is one-dimensional. In the simplest case, one might wish to apply N multiple times to generate higher moments. When N is proportional to a unitary, this corresponds to phase accumulation, which is essential to precision measurements at the Heisenberg limit. Alternatively, if N = H is Hermitian, it is a quantum observable, thus H 2 would allow a direct estimate of variance, and so on. Unfortunately, the subspace l-t x for each eigenstate IA) is not invariant under U in general. As a result, repeated applications in this basis do not produce higher moments of H due to leakage out of W,\. The manner they do so depends on the undefined components of U, must be analyzed on a case-by-case basis, and thus is of limited utility. Order can be restored to this undefined behavior by stemming the leakage. The simplest possibility that preserves the signal operator of Eq. 5.9 replaces U with a unitary ansatz, the qubitization iterate or qubiterate for short, that on an input in 10) E ax IA) E HG has the form = fw ~ t Jg[H] Jkt it t) - eiX>, _g(A)) (A g(A) Ae--2x; 0 A=DeZ2e &N e&)C0#1(e i&cos A (5.11) Note that H remains encoded in the standard form. For each eigenstate IA), W performs a rotation in SU(2) on disjoint two-dimensional subspaces 1X = span{ l0x), l0 )}= span{l0x), WI0x)}, with basis states in (D 'x related by the basis transformation J 70 (A &A. To avoid ambiguity in the basis of W, we explicitly define (Aeix -g(A) g(A) Ae = AeixA 0 )(OxI - g(A)0 )(0 1+ g(A)0 )(0 \ + AeixAI O )(O -L. (5.12) In the following, W will always be applied to states in the subspace @ 71a, thus its action on states outside it need not be defined. The usefulness of this construct is evident; due to its invariant subspace, multiple applications of the iterate result in highly structured behavior. However, implementing W requires g[], which appears difficult to compute efficiently in general. 
In prior art [35], $g[\cdot]$ was approximated using phase estimation. The qubitization problem then concerns efficiently constructing $\hat W$ without approximation, using $O(1)$ queries to the signal oracle of the standard form.

5.3.1 Proof of construction

We now solve the qubitization problem. Using the signal oracle $\hat U$, and arbitrary unitary operations on only the ancilla register, we provide necessary and sufficient conditions in Thm. 5.3 for the case of Hermitian $\hat H$ for when the qubiterate $\hat W$ can be implemented exactly using only one query to $\hat U$. As these conditions are somewhat restrictive, we then prove in Thm. 5.4 that qubitization is unconditionally possible by instead using the controlled-$\hat U$ oracle in a quantum circuit that generates the same signal operator and satisfies these conditions. We describe a similar construction for normal operators as well in Section 5.4.6.

In this subsection only, we consider a slightly more general signal oracle. Rather than the standard-form encoding $\hat H = \langle 0|_a\hat U|0\rangle_a$, let us assume that the signal oracle factors into three components $\hat H = \langle 0|_a(\hat G^\dagger \otimes \hat I_s)\hat U(\hat G \otimes \hat I_s)|0\rangle_a$, where $\hat G$ acts on the ancilla register only. This may be interpreted as replacing the computational basis state $|0\rangle_a$ with a modified state $|G\rangle_a = \hat G|0\rangle_a$. Note that we may always set $\hat G = \hat I_a$ to recover the original situation. As $\hat U$ is a black-box oracle, we must find a unitary $\hat S' \otimes \hat I_s$, acting only on the ancilla register, such that the iterate $\hat W = \hat S'\hat U$ of Eq. 5.11 is obtained. For the case of Hermitian $\hat H|\lambda\rangle_s = \lambda|\lambda\rangle_s$, we now determine necessary and sufficient conditions on what $\hat S'$ must be.
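As $\hat S'$ is otherwise arbitrary, we use without loss of generality the ansatz of $\hat S'$ being a product of a reflection about $|G\rangle_a$ and another arbitrary unitary $\hat S$ on the ancilla:

$$\hat W = ((2|G\rangle\langle G| - \hat I)_a \otimes \hat I_s)\hat S\hat U, \quad |G_\lambda\rangle_{as} = |G\rangle_a|\lambda\rangle_s, \tag{5.13}$$

$$|G_\lambda^\perp\rangle_{as} = \frac{\lambda|G_\lambda\rangle_{as} - \hat S\hat U|G_\lambda\rangle_{as}}{\sqrt{1-\lambda^2}}. \tag{5.14}$$

Note that the reflection about $|G\rangle$ can be implemented using two queries to $\hat G$ and a multiply-controlled phase gate that can be compiled into $O(\log(d))$ primitive quantum gates.

Theorem 5.3 (Conditions on Hermitian qubitization). For all signal oracles $\hat U$ that implement the signal operator $\hat H$, the unitary $\hat S$ in Eq. 5.13 creates a unitary iterate $\hat W$ with the same signal operator in the same basis, but in an SU(2) invariant subspace containing $|G\rangle$, if and only if

$$\langle G|_a\hat S\hat U|G\rangle_a = \hat H \quad \text{and} \quad \langle G|_a\hat S\hat U\hat S\hat U|G\rangle_a = \hat I. \tag{5.15}$$

Proof. In the forward direction, we assume Eq. 5.11, then compute and compare with Eq. 5.13: $\lambda = \langle G_\lambda|\hat W|G_\lambda\rangle = \langle G_\lambda|\hat S\hat U|G_\lambda\rangle$. By using this result repeatedly together with the fact that $\hat S\hat U$ is unitary, Gram-Schmidt orthonormalization of $\hat W|G_\lambda\rangle$ furnishes the state

$$|G_\lambda^\perp\rangle = \frac{\hat W|G_\lambda\rangle - |G_\lambda\rangle\langle G_\lambda|\hat W|G_\lambda\rangle}{\big|\hat W|G_\lambda\rangle - |G_\lambda\rangle\langle G_\lambda|\hat W|G_\lambda\rangle\big|} = \frac{\lambda|G_\lambda\rangle - \hat S\hat U|G_\lambda\rangle}{\sqrt{1-\lambda^2}}$$

orthogonal to $|G_\lambda\rangle$. By similarly computing and comparing

$$\lambda = \langle G_\lambda^\perp|\hat W|G_\lambda^\perp\rangle = \lambda\,\frac{\langle G_\lambda|(\hat S\hat U)^2|G_\lambda\rangle - \lambda^2}{1-\lambda^2},$$

we obtain $\langle G_\lambda|(\hat S\hat U)^2|G_\lambda\rangle = 1$. As these must be true for all eigenvectors $|\lambda\rangle$, the conditions in Eq. 5.15 are necessary.

That these are also sufficient follows from assuming Eq. 5.15 and attempting to recover the components of Eq. 5.11 using the definitions of Eq. 5.13. By applying $\langle G|_a\hat S\hat U|G\rangle_a = \hat H$, we compute $\hat W|G_\lambda\rangle = 2|G\rangle_a\langle G|_a\hat S\hat U|G_\lambda\rangle - \hat S\hat U|G_\lambda\rangle = 2\lambda|G_\lambda\rangle - \hat S\hat U|G_\lambda\rangle$. In the basis of $|G_\lambda\rangle$ and $|G_\lambda^\perp\rangle$, $\langle G_\lambda|\hat W|G_\lambda\rangle = 2\lambda - \langle G_\lambda|\hat S\hat U|G_\lambda\rangle = \lambda$ and

$$\langle G_\lambda^\perp|\hat W|G_\lambda\rangle = \frac{(\lambda\langle G_\lambda| - \langle G_\lambda|(\hat S\hat U)^\dagger)(2\lambda|G_\lambda\rangle - \hat S\hat U|G_\lambda\rangle)}{\sqrt{1-\lambda^2}} = \frac{2\lambda^2 - 2\lambda^2 - \lambda^2 + 1}{\sqrt{1-\lambda^2}} = \sqrt{1-\lambda^2}.$$

A similar calculation for the remaining components requires $\langle G|_a\hat S\hat U\hat S\hat U|G\rangle_a = \hat I$ and reveals that $\langle G_\lambda^\perp|\hat W|G_\lambda^\perp\rangle = \lambda$ and $\langle G_\lambda|\hat W|G_\lambda^\perp\rangle = -\sqrt{1-\lambda^2}$. As this must be true for all $\lambda$, we may indeed represent

$$\hat W = \bigoplus_\lambda \begin{pmatrix} \lambda & -\sqrt{1-\lambda^2} \\ \sqrt{1-\lambda^2} & \lambda \end{pmatrix}_\lambda.$$

In hindsight, these results are manifest.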
After all, $\langle G|\hat S\hat U\hat S\hat U|G\rangle = \hat I$ implies that $\hat S\hat U$ is a reflection when controlled by input state $|G\rangle$, and it is well-known that a Grover iterate [50, 136] is the product of two reflections about start and target subspaces. Nevertheless, the sufficiency of these conditions highlights that this is the simplest method to extract controllable and predictable behavior out of $\hat U$. In particular, these conditions are automatically satisfied in the trivial case with $\hat S = \hat I_a$ when $\hat U$ only has eigenvalues $\pm 1$, such as when it is a controlled-Pauli operator.

Unfortunately, a solution to Eq. 5.15 may not exist for more general $\hat U$. Thm. 5.3 amounts to choosing $\hat S$ such that $\hat S\hat U\hat S$ is the inverse $\hat U^\dagger$ whilst preserving the signal operator $\langle G|\hat S\hat U|G\rangle = \hat H$. Given that $\hat S$ only acts on the ancilla register, it is hard to see how this is always possible. Even if so, $\hat S$ may be difficult to implement as it is an arbitrary unitary acting on a potentially large ancilla register. The solution is to construct a different quantum circuit $\hat U'$ that contains $\hat U$ but still implements the same signal operator, and crucially always has an extremely simple solution $\hat S$. We now show how this can be done in all cases using only one query to controlled-$\hat U$ and controlled-$\hat U^\dagger$, as shown in Figure 5-2.

Theorem 5.4 (Existence of Hermitian qubitization). For all $q$-qubit signal unitaries $\hat U$ that implement the signal operator $\langle G|\hat U|G\rangle = \hat H$, there exists a $(q+1)$-qubit quantum circuit $\hat U'$ that queries controlled-$\hat U$ and controlled-$\hat U^\dagger$ once to implement the Hermitian component $\frac{1}{2}(\hat H + \hat H^\dagger)$ as the signal operator, such that the conditions of Eq. 5.15 can be satisfied.

Proof. We prove this by an explicit construction. Let the controlled-$\hat U$ operators be $\hat Q_1 = |0\rangle\langle 0|_b \otimes \hat I_{as} + |1\rangle\langle 1|_b \otimes \hat U$ and $\hat Q_2 = |0\rangle\langle 0|_b \otimes \hat U^\dagger + |1\rangle\langle 1|_b \otimes \hat I_{as}$. Thus the extra qubit states $|0\rangle_b, |1\rangle_b$ are flags that select either $\hat U^\dagger$ or $\hat U$. By multiplying, $\hat U' = \hat Q_1\hat Q_2 = |0\rangle\langle 0|_b \otimes \hat U^\dagger + |1\rangle\langle 1|_b \otimes \hat U$. Now consider the ancilla state $|G'\rangle_{ab} = \frac{1}{\sqrt 2}(|0\rangle_b + |1\rangle_b)|G\rangle_a$, and choose $\hat S = (|0\rangle\langle 1|_b + |1\rangle\langle 0|_b) \otimes \hat I_{as}$.
It is easy to verify that the conditions of Eq. 5.15 are satisfied:

$$\langle G'|\hat S\hat U'|G'\rangle = \langle G'|\hat U'|G'\rangle = \tfrac{1}{2}(\hat H + \hat H^\dagger), \quad \langle G'|\hat S\hat U'\hat S\hat U'|G'\rangle = \langle G'|\hat U'^\dagger\hat U'|G'\rangle = \hat I, \tag{5.16}$$

where we have used the fact that $\hat S|G'\rangle = |G'\rangle$ is an eigenstate, and that $\hat S$ swaps the $|0\rangle_b, |1\rangle_b$ ancilla states in $\hat U'$, thus transforming it into its inverse. $\square$

Figure 5-2: Quantum circuits for the qubitization $\hat W$ of a standard-form encoding-$(\hat H, 1, (\hat G^\dagger \otimes \hat I_s)\hat U(\hat G \otimes \hat I_s), d)$. The qubiterate $\hat W$ encodes $\hat H$ in standard-form-$(\hat H, 1, \hat W, 2d)$. Note that Had is the Hadamard gate, and we define the reflection $\mathrm{Ref}_{|0\rangle_a|0\rangle_b} = \hat I - 2|0\rangle\langle 0|_a \otimes |0\rangle\langle 0|_b$, so this circuit ignores a global $-1$ phase. The gate complexity of $\hat W$ is $O(\log(d))$.

Even if we are given a $\hat U$ for which there is no solution to Eq. 5.15, we can always apply Thm. 5.4 to construct a $\hat U'$ that does, with minimal overhead. Furthermore, our proof uses no information about the detailed structure of $|G\rangle$, so we may treat $\hat G$ as a black-box and absorb any other unitary acting only on the ancilla states into an appropriate definition of $\hat U'$. Thus without loss of generality, we can assume that any signal oracle $\hat U$ provided has already been qubitized.

5.4 Operator function design

Qubitization explicitly imposes a qubit structure onto any matrix $\hat H$ encoded in the standard form. This allows us to transfer knowledge from QSPL to the computation of functions $f[\hat H]$ of encoded matrices. Though one such class, the unitary functions of $\hat H$, was considered in Chapter 4, many other non-unitary functions could be possible by exploiting details in the construction of the qubiterate $\hat W$. In fact, standard techniques such as quantum phase estimation are applicable on the qubiterate, which has well-defined eigenphases that depend on the eigenvalues of $\hat H$ in a predictable manner.
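The construction of Thm. 5.4 and the resulting predictable eigenphases can be checked on the smallest possible instance. The sketch below assumes a trivial (one-dimensional) system register, so that the signal operator is a single complex number $h$ and the oracle is a 2x2 unitary with $\langle 0|\hat U|0\rangle = h$. It then builds $\hat U'$ from $\hat U$ and $\hat U^\dagger$, takes $\hat S$ to be a bit-flip on the flag qubit and $|G'\rangle = |+\rangle_b|0\rangle_a$, and forms $\hat W = (2|G'\rangle\langle G'| - \hat I)\hat S\hat U'$. All function names are illustrative, and which flag value selects $\hat U$ versus $\hat U^\dagger$ does not affect the check.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def signal_oracle(h):
    """A 2x2 unitary on the ancilla with <0|U|0> = h (assumed toy oracle, |h| <= 1)."""
    h = complex(h)
    g = math.sqrt(1.0 - abs(h) ** 2)
    return [[h, -g], [g, h.conjugate()]]

def block_diag(a, b):
    """4x4 matrix |0><0|_b (x) a + |1><1|_b (x) b, with index = 2*b + a."""
    zero = [0j, 0j]
    return [a[0] + zero, a[1] + zero, zero + b[0], zero + b[1]]

# S = X on the flag qubit b, identity on the ancilla a.
S = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
# |G'> = |+>_b |0>_a; the amplitudes are real, so no conjugation is needed below.
G_PRIME = [1 / math.sqrt(2), 0.0, 1 / math.sqrt(2), 0.0]

def reflection(v):
    """2|v><v| - I for a real unit vector v."""
    n = len(v)
    return [[2 * v[i] * v[j] - (1 if i == j else 0) for j in range(n)]
            for i in range(n)]

def expectation(m, v):
    """<v|m|v> for a real vector v."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))

def qubiterate(h):
    """Return (W, S U') for the toy oracle encoding h."""
    u = signal_oracle(h)
    u_prime = block_diag(dagger(u), u)
    su = matmul(S, u_prime)
    return matmul(reflection(G_PRIME), su), su

if __name__ == "__main__":
    w, su = qubiterate(0.3 + 0.4j)
    print(expectation(su, G_PRIME))                     # Hermitian part of h
    print(expectation(matmul(w, matmul(w, w)), G_PRIME))
    print(math.cos(3 * math.acos(0.3)))                 # T_3 of the Hermitian part
```

In this toy case $\hat S\hat U'$ squares to the identity for any $h$, which is the second condition of Eq. 5.15, and three applications of $\hat W$ encode $T_3(\mathrm{Re}\,h)$ in the $|G'\rangle$ component.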
As quantum phase estimation exploits additional ancilla qubits to encode an increasingly precise phase in binary, it is clear that the set of continuous functions computable on the eigenvalues of $\hat H$ scales with the number of ancilla qubits. This motivates the complexity class

Definition 5.5 (Ancilla-Assisted Function Computation-m (AAFC-m)). Given a Hermitian standard-form-$(\hat H, 1, \hat U, d)$, let AAFC-$m$ be the family of complex functions $f$ that can be computed by a unitary operator $\hat V$ with at most $m$ additional ancilla qubits initialized in the computational basis $|0\rangle_b$, over that required for qubitization, such that $\hat V$ encodes $f[\hat H]$ in standard-form-$(f[\hat H], O(1), \hat V, d2^m)$.

In this section, we show that at most three ancilla qubits are ever needed to compute arbitrary continuous functions in the standard-model. In other words, AAFC-0 $\subset$ AAFC-1 $\subset$ AAFC-2 $\subset$ AAFC-3 $=$ AAFC-$\infty$, in contrast to quantum phase estimation and the more recent 'linear-combination-of-unitaries algorithm' [72] in which $m$ can still be arbitrarily large. From the perspective of Boolean logic, this is a complete surprise: a finite number of bits can represent arbitrary precision. At the baseline of AAFC-0, this is achieved by directly exploiting the SU(2) structure of the qubiterate to engineer arbitrary target functions $f(\lambda)$ by Grover-like rotations with angles dependent on the eigenvalue $\lambda$, and the procedures for AAFC-1 to AAFC-3 are analogous generalizations to higher dimensions.

Figure 5-3: Quantum circuit for the phased qubiterate $\hat W_\varphi$ of a standard-form encoding-$(\hat H, 1, (\hat G^\dagger \otimes \hat I_s)\hat U(\hat G \otimes \hat I_s), d)$. The phased qubiterate encodes $\hat H$ in standard-form-$(\hat H, 1, \hat W_\varphi, 2d)$. Note that we define the reflection $\mathrm{Ref}_{\varphi,|0\rangle_a|0\rangle_b} = \hat I - (1 - e^{-i\varphi})|0\rangle\langle 0|_a \otimes |0\rangle\langle 0|_b$, so this circuit omits a global $-1$ phase. The gate complexity of $\hat W_\varphi$ is $O(\log(d))$.

These results implement polynomial functions of $\hat H$ with an optimal query complexity that exactly matches
polynomial lower bounds for function approximation. Indeed, the extension to any function stems simply from the fact that polynomials are dense in the space of real functions [119].

These powerful tools for target operator processing, made possible by qubitization, are agnostic to the underlying oracles that describe the signal operator $\hat H$. As seen in Section 5.2, converting various oracles that describe Hamiltonians to the standard form is quite straightforward. This motivates the conjecture that the standard form is most natural for quantum algorithms involving matrices, and furnishes an optimal approach to implementing arbitrary quantum measurements and their operator transformations when combined with our quantum signal processing algorithms.

We provide algorithms with different overheads for large classes of transformations. 'Ancilla-free quantum signal processing' in Section 5.4.1 implements target operators where the real and imaginary parts have the same parity with respect to $\hat H$. This is extended by 'single-ancilla quantum signal processing' in Section 5.4.3 and Section 5.4.4 for more general target operators, and then by 'double-ancilla quantum signal processing' in Section 5.4.5 for completely arbitrary target operators.

5.4.1 Ancilla-free quantum signal processing

Without any additional ancilla qubits beyond those required for qubitization, the possibilities for applying the qubiterate in Eq. 5.11 seem limited. For instance, $\langle 0|_{ab}\hat W^N|0\rangle_{ab} = T_N[\hat H]$ produces only Chebyshev polynomials [27] - note the additional ancilla register $b$, as we assume $\hat W$ is constructed via Thm. 5.4. To go further, additional control parameters on $\hat W$ are necessary. Thus we introduce the phased qubiterate $\hat W_\varphi$ in Figure 5-3, with the same invariant subspace as $\hat W$:

$$\hat W_\varphi = \begin{pmatrix} \hat H & -ie^{-i\varphi}g[\hat H]\hat J^\dagger \\ -ie^{i\varphi}\hat J g[\hat H] & \hat J\hat H\hat J^\dagger \end{pmatrix} = \bigoplus_\lambda \begin{pmatrix} \lambda & -ie^{-i\varphi}g(\lambda) \\ -ie^{i\varphi}g(\lambda) & \lambda \end{pmatrix}_\lambda = \bigoplus_\lambda e^{-i\hat\sigma_\varphi^\lambda\theta_\lambda}, \tag{5.17}$$

where $\hat\sigma_\varphi^\lambda = \cos(\varphi)\hat\sigma_x^\lambda + \sin(\varphi)\hat\sigma_y^\lambda$ and the phase $\theta_\lambda = \cos^{-1}(\lambda)$.

Lemma 5.6. The phased qubiterate $\hat W_\varphi$ in Eq.
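5.17 is equal to $\hat W_\varphi = \hat Z_{\varphi-\pi/2}\hat W\hat Z_{-\varphi+\pi/2}$, where $\hat Z_\varphi = (\hat I - (1 - e^{-i\varphi})|0\rangle\langle 0|_{ab})$ is a partial reflection about $|0\rangle_{ab}$ by angle $\varphi \in \mathbb R$ that implements a relative phase between the $|0_\lambda\rangle_{abs}$ and $|0_\lambda^\perp\rangle_{abs}$ subspaces. In block form,

$$\hat Z_\varphi = \bigoplus_\lambda \begin{pmatrix} e^{-i\varphi} & 0 \\ 0 & 1 \end{pmatrix}_\lambda. \tag{5.18}$$

Proof. Let us compute the phase applied to states $|0_\lambda\rangle, |0_\lambda^\perp\rangle$: $\hat Z_\varphi|0_\lambda^\perp\rangle = |0_\lambda^\perp\rangle$ and $\hat Z_\varphi|0_\lambda\rangle = (1 - (1 - e^{-i\varphi}))|0_\lambda\rangle = e^{-i\varphi}|0_\lambda\rangle$. As this is true for all $\lambda$, Eq. 5.18 follows. Combining the representation of $\hat W$ from Eq. 5.11 with this leads to Eq. 5.17,

$$\hat W_\varphi = \hat Z_{\varphi-\pi/2}\hat W\hat Z_{-\varphi+\pi/2} = \bigoplus_\lambda \begin{pmatrix} \lambda & -ie^{-i\varphi}g(\lambda) \\ -ie^{i\varphi}g(\lambda) & \lambda \end{pmatrix}_\lambda. \tag{5.19}$$

$\square$

In Eq. 4.7, we considered the sequence of single-qubit rotations

$$\hat V(\theta) = e^{-i\hat\sigma_{\varphi_N}\theta/2} \cdots e^{-i\hat\sigma_{\varphi_2}\theta/2} e^{-i\hat\sigma_{\varphi_1}\theta/2} = A(\theta)\hat I + iB(\theta)\hat\sigma_z + iC(\theta)\hat\sigma_x + iD(\theta)\hat\sigma_y. \tag{5.20}$$

These may be compared to a sequence of $N$ phased qubiterates

$$\hat W_{\vec\varphi} = \hat W_{\varphi_N} \cdots \hat W_{\varphi_2}\hat W_{\varphi_1} = \bigoplus_\lambda e^{-i\hat\sigma_{\varphi_N}^\lambda\theta_\lambda} \cdots e^{-i\hat\sigma_{\varphi_2}^\lambda\theta_\lambda} e^{-i\hat\sigma_{\varphi_1}^\lambda\theta_\lambda}, \quad \theta_\lambda = \cos^{-1}(\lambda). \tag{5.21}$$

In each subspace $\mathcal H_\lambda$, this is a product of SU(2) rotations, identical to the single-qubit case, except that the rotation angle is doubled. Thus we may decompose this in the Pauli basis $\hat\sigma_{x,y,z}^\lambda$ like

$$\hat W_{\vec\varphi} = \bigoplus_\lambda \left( A(2\theta_\lambda)\hat I_\lambda + iB(2\theta_\lambda)\hat\sigma_z^\lambda + iC(2\theta_\lambda)\hat\sigma_x^\lambda + iD(2\theta_\lambda)\hat\sigma_y^\lambda \right), \tag{5.22}$$

where $(A, B, C, D)$ are real functions of $2\theta_\lambda$. As we can only prepare and measure the ancilla computational basis states $|0\rangle_{ab}$, we consider the component $\langle 0|_{ab}\hat W_{\vec\varphi}|0\rangle_{ab} = \sum_\lambda (A(2\theta_\lambda) + iB(2\theta_\lambda))|\lambda\rangle\langle\lambda|_s$. We find it useful to define the functions $(\tilde A, \tilde B, \tilde C, \tilde D)$ of $\lambda$ related by a variable substitution, e.g. $A(\theta) = \tilde A(\cos(\theta/2))$. Thus $\hat W_{\vec\varphi}$ encodes the matrix $\tilde A[\hat H] + i\tilde B[\hat H]$ in standard-form-$(\tilde A[\hat H] + i\tilde B[\hat H], 1, \hat W_{\vec\varphi}, 2d)$.

Any choice of phases $\vec\varphi \in \mathbb R^N$ generates sophisticated interference effects between elements of the sequence, leading to $(A, B, C, D)$ with some non-trivial functional dependence on $\hat H$. Though the dependence of the output on $\vec\varphi$ seems hard to intuit, the phases nevertheless specify a program for computing functions of $\hat H$, similar to how a list of numbers might specify a polynomial.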
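To make the idea of phases as a "program" concrete, the following sketch multiplies phased qubiterate blocks for a single eigenvalue $\lambda$ and reads off the encoded value $\langle 0|\hat W_{\varphi_N}\cdots\hat W_{\varphi_1}|0\rangle$. It assumes the 2x2 block $\begin{pmatrix}\lambda & -ie^{-i\varphi}g(\lambda)\\ -ie^{i\varphi}g(\lambda) & \lambda\end{pmatrix}$ as read from Eq. 5.17; the phase values in the usage lines are arbitrary and the function names are illustrative.

```python
import cmath
import math

def phased_block(lam, phi):
    """2x2 block of the phased qubiterate for eigenvalue lam (assumed form of Eq. 5.17)."""
    g = math.sqrt(1.0 - lam * lam)
    return [[lam, -1j * cmath.exp(-1j * phi) * g],
            [-1j * cmath.exp(1j * phi) * g, lam]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def encoded_function(lam, phases):
    """Top-left entry of W_{phi_N} ... W_{phi_1} in the lam subspace."""
    prod = [[1.0, 0.0], [0.0, 1.0]]
    for phi in phases:  # phases[0] is applied first
        prod = matmul2(phased_block(lam, phi), prod)
    return prod[0][0]

if __name__ == "__main__":
    phases = [0.3, 1.1, -0.7]  # arbitrary example phases, N = 3
    for lam in (-1.0, -0.4, 0.0, 0.4, 1.0):
        print(lam, encoded_function(lam, phases))
```

Three features of the output can be seen directly from the block structure: the value at $\lambda = 1$ is always 1, the magnitude never exceeds 1 on $[-1, 1]$ because each block is unitary, and the function has parity $N \bmod 2$ in $\lambda$.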
Fortunately, we may apply the characterization of QSPL in Chapter 2 to determine achievable functions $(A, B, C, D)$, and their implementation by some choice of $\vec\varphi$ that may be obtained from any valid partial specification of $(A, B, C, D)$ by some efficient classical algorithm. For instance, we have the following result regarding Eq. 5.20.

Lemma 5.7 (Achievable $(A, B)$ - from Thm. 2.7). For any integer $N > 0$, a choice of functions $A, B$ in Eq. 5.20 is achievable by some $\vec\varphi \in \mathbb R^N$ if and only if all the following are true:
(1) $A(\theta) = \tilde A(x)$, $B(\theta) = \tilde B(x)$, where $\tilde A, \tilde B$ are real parity-$(N \bmod 2)$ polynomials in $x = \cos(\theta/2)$ of degree at most $N$;
(2) $\tilde A(1) = 1$;
(3) $\forall x \in [-1, 1]$, $\tilde A^2(x) + \tilde B^2(x) \le 1$;
(4) $\forall x \ge 1$, $\tilde A^2(x) + \tilde B^2(x) \ge 1$;
(5) $\forall N$ even, $x \ge 0$, $\tilde A^2(ix) + \tilde B^2(ix) \ge 1$.
Moreover, $\vec\varphi \in \mathbb R^N$ can be computed in classical $O(\mathrm{poly}(N))$ time.

This implies the quantum signal processing result

Theorem 5.8 (Ancilla-Free Quantum Signal Processing). Given Hermitian standard-form-$(\hat H, 1, (\hat G^\dagger \otimes \hat I_s)\hat U(\hat G \otimes \hat I_s), d)$, let any $\tilde A, \tilde B$ be degree $N$ polynomials that satisfy the conditions of Lem. 5.7. Then there exists a standard-form-$(\tilde A[\hat H] + i\tilde B[\hat H], 1, \hat W_{\vec\varphi}, 2d)$, where $\hat W_{\vec\varphi}$ requires $O(N)$ queries to controlled-$\hat U$, and $O(N\log(d))$ primitive quantum gates.

Proof. This follows from the identity $A(2\theta_\lambda) = \tilde A(\cos(\theta_\lambda)) = \tilde A(\lambda)$, and similarly for the $B$ term. $\square$

As polynomials form a complete basis on bounded real intervals, these results imply that the query complexity of approximating any real function $\tilde A[\hat H]$ with error $\epsilon$ is exactly that of its best polynomial $\epsilon$-approximation satisfying the constraints of Thm. 5.8, and similarly for the complex case.

5.4.2 Single-ancilla flexible quantum signal processing

Thm. 5.8 would be more useful if we could drop the unintuitive constraints (4, 5) that impose restrictions on what the target functions must be outside the domain of interest. In the following Thm.
5.9, we present a generalization that computes functions with only one component $\tilde B[\hat H] = (\langle 0|_{abc} \otimes \hat I_s)\hat V'_{\vec\varphi}(|0\rangle_{abc} \otimes \hat I_s)$ without those constraints, using an additional single-qubit ancilla register $c$. Note that this does not follow immediately from Thm. 5.8, as the constraint $\tilde A(1) = 1$ means there will always be some $\tilde A$ component, even if the characterizations of other partial specifications of $(A, B, C, D)$ are used. The trick is to exploit the structure of single-qubit rotations Eq. 5.20 to stage a perfect cancellation of the $\tilde A[\hat H]$ term by taking a linear combination of two standard-form encodings for $(\langle 0|_{ab} \otimes \hat I_s)\hat W_{\pm\vec\varphi}(|0\rangle_{ab} \otimes \hat I_s) = \tilde A[\hat H] \pm i\tilde B[\hat H]$.

Theorem 5.9 (Flexible quantum signal processing). Given a Hermitian standard-form-$(\hat H, 1, \hat U, d)$, let $\tilde B$ be any function that satisfies all the following conditions:
(1) $\tilde B(x) = \sum_{j=0}^N b_j x^j$ is a real parity-$(N \bmod 2)$ polynomial of degree at most $N$;
(2) $\tilde B(0) = 0$;
(3) $\forall x \in [-1, 1]$, $\tilde B^2(x) \le 1$.
Then there exists a Hermitian standard-form-$(\tilde B[\hat H], 1, \hat V'_{\vec\varphi}, 4d)$, where $\tilde B[\hat H] = \sum_{j=0}^N b_j\hat H^j$, and $\hat V'_{\vec\varphi}$ requires $O(N)$ queries to controlled-$\hat U$ and $O(N\log(d))$ primitive quantum gates precomputed in classical $O(\mathrm{poly}(N))$ time.

Proof of Thm. 5.9. Consider the composite qubiterate in Eq. 5.22 controlled by a single-qubit ancilla $c$. Let

$$\hat V_{\vec\varphi} = (-i\hat\sigma_z \otimes \hat I_{abs})\left(|0\rangle\langle 0|_c \otimes \hat W_{\vec\varphi} + |1\rangle\langle 1|_c \otimes \hat W_{-\vec\varphi}\right). \tag{5.23}$$

Figure 5-4: (top) Circuit diagram for the flexible qubiterate $\hat V_\varphi = |0\rangle\langle 0|_c \otimes \hat W_\varphi + |1\rangle\langle 1|_c \otimes \hat W_{-\varphi}$, where $\hat R_\varphi = \hat I_{ab} - (1 - e^{-i\varphi})|0\rangle\langle 0|_a \otimes |0\rangle\langle 0|_b$. (bottom) Circuit diagram for the flexible composite qubiterate $\hat V_{\vec\varphi}$ used to encode a standard-form-$(\tilde B[\hat H], 1, \hat V'_{\vec\varphi}, 4d)$. The query complexity of $\hat V_{\vec\varphi}$ is $O(N)$ to $\hat G$, controlled-$\hat U$, and their inverses. Its gate complexity is $O(N\log(d))$.

Note that details in the construction of $\hat W_\varphi$ actually allow for the implementation of $\hat V_\varphi$ with the same query complexity, as seen in Figure 5-4. By applying the similarity transformation $\hat\sigma_x\hat\sigma_z\hat\sigma_x = -\hat\sigma_z$ and $\hat\sigma_x\hat\sigma_y\hat\sigma_x = -\hat\sigma_y$,

$$\hat W_{-\vec\varphi}|0\rangle_{ab}|\lambda\rangle_s = e^{-i\hat\sigma_{-\varphi_N}^\lambda\theta_\lambda} \cdots e^{-i\hat\sigma_{-\varphi_1}^\lambda\theta_\lambda}|0\rangle_{ab}|\lambda\rangle_s = \left(\tilde A(\lambda)\hat I_\lambda - i\tilde B(\lambda)\hat\sigma_z^\lambda + i\tilde C(\lambda)\hat\sigma_x^\lambda - i\tilde D(\lambda)\hat\sigma_y^\lambda\right)|0\rangle_{ab}|\lambda\rangle_s. \tag{5.24}$$

Thus using the ancilla state $|+\rangle_c|0\rangle_{ab}$, where $|\pm\rangle = \frac{1}{\sqrt 2}(|0\rangle \pm |1\rangle)$, as the input to $\hat V_{\vec\varphi}$ results in:

$$\hat V_{\vec\varphi}|+\rangle_c|0\rangle_{ab}|\lambda\rangle_s = \left(-i\tilde A(\lambda)|-\rangle_c + \tilde B(\lambda)|+\rangle_c\right)|0\rangle_{ab}|\lambda\rangle_s + \left(\tilde C(\lambda)|-\rangle_c + i\tilde D(\lambda)|+\rangle_c\right)|0_\lambda^\perp\rangle_{abs}. \tag{5.25}$$

Thus $(\langle +|_c\langle 0|_{ab} \otimes \hat I_s)\hat V_{\vec\varphi}(|+\rangle_c|0\rangle_{ab} \otimes \hat I_s) = \tilde B[\hat H]$ encodes $\tilde B[\hat H]$ in standard-form. Note that this is independent of all the other functions $A, C, D$, which are in general non-zero. Thus we may apply Lem. 5.10 on achievable $(B)$ even though all other components are in general non-zero. Finally, let $\hat V'_{\vec\varphi} = (\mathrm{Had} \otimes \hat I_{abs})\hat V_{\vec\varphi}(\mathrm{Had} \otimes \hat I_{abs})$. $\square$

Lemma 5.10 (Achievable $(B)$ - from Thm. 2.8). For any integer $N > 0$, a choice of function $B$ in Eq. 5.20 is achievable by some $\vec\varphi \in \mathbb R^N$ if and only if all the following are true:
(1) $B(\theta) = \tilde B(x)$, where $\tilde B$ is a real parity-$(N \bmod 2)$ polynomial in $x = \cos(\theta/2)$ of degree at most $N$;
(2) $\tilde B(0) = 0$;
(3) $\forall x \in [-1, 1]$, $\tilde B^2(x) \le 1$.
Moreover, $\vec\varphi \in \mathbb R^N$ can be computed in classical $O(\mathrm{poly}(N))$ time.

With Thm. 5.9, we are assured that any degree $N$ bounded matrix polynomial that goes to zero at the origin can be implemented exactly on a quantum computer using $O(N)$ queries, $O(N)$ additional primitive quantum gates, and $O(1)$ additional ancilla qubits. Using a similar procedure, one may instead extract the $\tilde A[\hat H]$ component.

5.4.3 Single-ancilla quantum signal processing on arbitrary unitaries

By exploiting an additional ancilla qubit, Thm. 5.9 presents one approach for computing matrix polynomials with looser constraints. However, other constructions are possible. In this section, we restate Thm. 4.2 for eigenphase transformation, which provides one such alternative.
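Given any unitary $\hat P$ with eigenstates $\hat P|\lambda\rangle_s = e^{i\theta_\lambda}|\lambda\rangle_s$ and $\hat P_0 = |+\rangle\langle +|_c \otimes \hat I_s + |-\rangle\langle -|_c \otimes \hat P$ controlled by the single-qubit ancilla register $c$, where $\hat\sigma_x|\pm\rangle_c = \pm|\pm\rangle_c$, consider the sequence

$$\hat P_{\vec\varphi} = \prod_{k\ \mathrm{odd} \ge 1}^{N} \hat P^\dagger_{\varphi_{k+1}+\pi}\hat P_{\varphi_k} = \hat P^\dagger_{\varphi_N+\pi}\hat P_{\varphi_{N-1}} \cdots \hat P^\dagger_{\varphi_2+\pi}\hat P_{\varphi_1}, \tag{5.26}$$

$$\hat P_\varphi = (e^{-i\varphi\hat\sigma_z/2} \otimes \hat I_s)\hat P_0(e^{i\varphi\hat\sigma_z/2} \otimes \hat I_s) = \bigoplus_\lambda e^{i\theta_\lambda/2}e^{-i\hat\sigma_\varphi\theta_\lambda/2} \otimes |\lambda\rangle\langle\lambda|_s.$$

For each eigenstate $|\lambda\rangle$, a product of single-qubit operators $e^{-i\hat\sigma_{\varphi_N}\theta_\lambda/2} \cdots e^{-i\hat\sigma_{\varphi_1}\theta_\lambda/2}$ similar to Eq. 5.21 is obtained, and these only act on the ancilla $c$. Thus the choice of $\vec\varphi$ determines the effective single-qubit ancilla operator that is controlled by $|\lambda\rangle$:

$$\hat P_{\vec\varphi} = \bigoplus_\lambda \left(\hat I A(\theta_\lambda) + i\hat\sigma_z B(\theta_\lambda) + i\hat\sigma_x C(\theta_\lambda) + i\hat\sigma_y D(\theta_\lambda)\right) \otimes |\lambda\rangle\langle\lambda|_s. \tag{5.27}$$

In the Hamiltonian simulation problem, we are concerned with implementing eigenphase transformations and so select the components $\langle +|_c\hat P_{\vec\varphi}|+\rangle_c = \sum_\lambda (A(\theta_\lambda) + iC(\theta_\lambda))|\lambda\rangle\langle\lambda|_s$. Note that $A, B$ are even Fourier series and $C, D$ are odd Fourier series, all of degree at most $N/2$. Thus by using a Fourier series to approximate any target function, we are able to fully exploit any smooth or analytic behavior, with a query complexity exactly twice the degree of best Fourier approximations. It follows from Cor. 2.9 for partial tuples $(A, \cdot, C, \cdot)$ that

Theorem 5.11 (Achievable $(A, C)$). The target operator $\langle +|_c\hat P_{\vec\varphi}|+\rangle_c = \sum_\lambda (A(\theta_\lambda) + iC(\theta_\lambda))|\lambda\rangle\langle\lambda|$ can be implemented by some $\vec\varphi \in \mathbb R^N$, where $N$ is even, if and only if all the following are true:
(1) $\forall\theta \in \mathbb R$, $A^2(\theta) + C^2(\theta) \le 1$;
(2) $A(0) = 1$;
(3) $A(\theta) = \sum_{k=0}^{N/2} a_k\cos(k\theta)$;
(4) $C(\theta) = \sum_{k=1}^{N/2} c_k\sin(k\theta)$.

This enables eigenphase transformations $A(\theta_\lambda) + iC(\theta_\lambda) \approx e^{ih(\theta_\lambda)}$ through

Theorem 5.12 (Eigenphase transformation, restated from Thm. 4.2). $\forall$ real odd periodic functions $h : (-\pi, \pi] \to (-\pi, \pi]$ and even $N > 0$, let $(A[\theta], C[\theta])$ be real Fourier series in $(\cos(k\theta), \sin(k\theta))$, $k = 0, \ldots, N/2$, that approximate $\max_{\theta\in\mathbb R}|A[\theta] + iC[\theta] - e^{ih(\theta)}| \le \epsilon/8$. Given $A[\theta], C[\theta]$, one can efficiently compute the $\vec\varphi$ such that $\langle +|_c\hat P_{\vec\varphi}|+\rangle_c$ in Eq. 5.26 applies the controlled unitary a number $N$ times to approximate $\hat P_{\mathrm{ideal}} = \sum_\lambda e^{ih(\theta_\lambda)}|\lambda\rangle\langle\lambda|$ with trace distance $\|\langle +|_c\hat P_{\vec\varphi}|+\rangle_c - \hat P_{\mathrm{ideal}}\| \le \epsilon$ and success probability $\ge 1 - 2\epsilon$.

5.4.4 Single-ancilla quantum signal processing on controlled-qubiterates

Setting these arbitrary controlled-unitaries $\hat P$ in Eq. 5.27 to be controlled-qubiterates $\hat W$ leads to interesting results not captured by Thm. 5.8 or Thm. 5.9. Observe from Eq. 5.11 that $\hat W$ can be diagonalized to obtain its eigenvalues $e^{\mp i\cos^{-1}(\lambda)}$ and eigenvectors $|0_{\lambda\pm}\rangle_{abs}$:

$$\hat W|0_{\lambda\pm}\rangle = e^{\mp i\cos^{-1}(\lambda)}|0_{\lambda\pm}\rangle, \quad |0_{\lambda\pm}\rangle_{abs} = \frac{|0_\lambda\rangle_{abs} \pm i|0_\lambda^\perp\rangle_{abs}}{\sqrt 2}. \tag{5.28}$$

By a judicious choice of ancilla state, more flexible target function transformations are possible. As $|0_\lambda\rangle = \frac{1}{\sqrt 2}(|0_{\lambda+}\rangle + |0_{\lambda-}\rangle)$, projecting the sequence Eq. 5.26 onto $|0\rangle_{ab}$ leads to a linear combination of Eq. 5.27 with eigenphases $\theta_{\lambda\pm} = \mp\cos^{-1}(\lambda)$:

$$\langle 0|_{ab}\hat P_{\vec\varphi}|0\rangle_{ab} = \bigoplus_\lambda \left(\hat I\,\frac{A(\theta_\lambda) + A(-\theta_\lambda)}{2} + i\hat\sigma_z\frac{B(\theta_\lambda) + B(-\theta_\lambda)}{2} + i\hat\sigma_x\frac{C(\theta_\lambda) + C(-\theta_\lambda)}{2} + i\hat\sigma_y\frac{D(\theta_\lambda) + D(-\theta_\lambda)}{2}\right) \otimes |\lambda\rangle\langle\lambda|_s$$

$$= \bigoplus_\lambda \left(\hat I A(\theta_\lambda) + i\hat\sigma_z B(\theta_\lambda)\right) \otimes |\lambda\rangle\langle\lambda|_s = \hat I \otimes \tilde A[\hat H] + i\hat\sigma_z \otimes \tilde B[\hat H], \tag{5.29}$$

where in the second line, the parity of $(A, B, C, D)$ leads to a cancellation of $(C, D)$, followed by a change of variables $\theta_\lambda = \cos^{-1}(\lambda)$ using Chebyshev polynomials $T_n(\cos(\theta)) = \cos(n\theta)$ of the first kind - $(A, B)$ are even Fourier series so $A(\theta_\lambda) = \sum_{n=0}^{N/2} a_n T_n(\cos(\theta_\lambda)) = \sum_{n=0}^{N/2} a_n T_n(\lambda) = \tilde A(\lambda)$. Whereas $(\tilde A, \tilde B)$ in Thm. 5.8 are restricted to be of definite parity, $(\tilde A, \tilde B)$ in Eq. 5.29 have no such restriction.

Theorem 5.13. The target operator $\langle 0|_c\langle 0|_{ab}\hat P_{\vec\varphi}|0\rangle_{ab}|0\rangle_c = \tilde A[\hat H] + i\tilde B[\hat H]$ can be implemented by some $\vec\varphi \in \mathbb R^N$, where $N$ is even, if and only if all the following are true:
(1) $\tilde A(\lambda), \tilde B(\lambda)$ are real degree $N/2$ polynomials;
(2) $\tilde A(1) = 1$;
(3) $\forall\lambda \in [-1, 1]$, $\tilde A^2(\lambda) + \tilde B^2(\lambda) \le 1$;
(4) $\forall\lambda \le -1$ or $\lambda \ge 1$, $\tilde A^2(\lambda) + \tilde B^2(\lambda) \ge 1$.

Proof. The sequence of rotations $e^{-i\theta_\lambda\hat\sigma_{\varphi_N}/2} \cdots e^{-i\theta_\lambda\hat\sigma_{\varphi_2}/2}e^{-i\theta_\lambda\hat\sigma_{\varphi_1}/2}$ is identical to those in Thm. 5.8 but with a halved rotation angle $\theta_\lambda \to \theta_\lambda/2$. Thus all conditions of Lem. 5.7 on the polynomial in $x = \cos(\theta/2)$ apply to Eq. 5.29. Using the Chebyshev polynomial $\cos(\theta) = T_2(\cos(\theta/2))$, the polynomial in $x$ of Lem. 5.7 equals $\tilde A(T_2(x))$, with $\tilde A(\lambda)$ the polynomial of this theorem. In the forward direction, condition (5) of Thm. 5.8 on the domain $\forall y \ge 0$, $x = iy$ is mapped onto $\forall y \ge 0$, $T_2(iy) \le -1$.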
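In the reverse direction, solving $T_2(y) = 2y^2 - 1 = x$ for $x \le -1$ leads to $y = \pm\sqrt{1 + x}/\sqrt 2$, which is imaginary and matches condition (5) of Thm. 5.8. Note that the sign $\pm$ indexing the two possible solutions is squared away, as $\tilde A^2 + \tilde B^2$ are even polynomials. $\square$

If concerned with just selecting $\tilde A[\hat H]$ or $\tilde B[\hat H]$ alone, we may further relax the constraints on behavior for $|\lambda| > 1$, which is a domain of lesser interest as $\|\hat H\| \le 1$.

Theorem 5.14. The target operator $\langle +|_c\langle 0|_{ab}\hat P_{\vec\varphi}|0\rangle_{ab}|+\rangle_c = \tilde A[\hat H]$ can be implemented by some $\vec\varphi \in \mathbb R^N$, where $N$ is even, if and only if all the following are true:
(1) $\tilde A(\lambda)$ is a real degree $N/2$ polynomial;
(2) $\tilde A(1) = 1$;
(3) $\forall\lambda \in [-1, 1]$, $\tilde A^2(\lambda) \le 1$.

Theorem 5.15. The target operator $\langle +|_c\langle 0|_{ab}(-i\hat\sigma_z \otimes \hat I_{abs})\hat P_{\vec\varphi}|0\rangle_{ab}|+\rangle_c = \tilde B[\hat H]$ can be implemented by some $\vec\varphi \in \mathbb R^N$, where $N$ is even, if and only if all the following are true:
(1) $\tilde B(\lambda)$ is a real degree $N/2$ polynomial;
(2) $\tilde B(1) = 0$;
(3) $\forall\lambda \in [-1, 1]$, $\tilde B^2(\lambda) \le 1$.

Proof. Thm. 5.14 follows directly from Thm. 2.8 for the partial tuple $(A, \cdot, \cdot, \cdot)$ by observing that $A$ is a Fourier series $A(\theta) = \sum_{n=0}^{N/2} a_n\cos(n\theta) = \sum_{n=0}^{N/2} a_n T_n(\cos(\theta))$ of degree $N/2$. Since $\lambda = \cos(\theta)$, $A(\theta) = \sum_{n=0}^{N/2} a_n T_n(\lambda) = \tilde A(\lambda)$ is a polynomial of degree $N/2$. The proof for Thm. 5.15 is almost identical. Note that our use of the state $|+\rangle_c$ ensures that only one quadrature - either $\hat I$ or $\hat\sigma_z$ - is selected. $\square$

We may generalize the results of Section 5.4.3 slightly by adding arbitrary single-qubit rotations $e^{-i\Phi\hat\sigma_x/2}$ on the ancilla. We define

$$\hat P_{\Phi,\vec\varphi} = \prod_{k\ \mathrm{odd} \ge 1}^{N} \hat P^\dagger_{\Phi,\varphi_{k+1}+\pi}\hat P_{\Phi,\varphi_k}, \quad \hat P_{\Phi,\varphi} = (e^{-i\varphi\hat\sigma_z/2} \otimes \hat I_s)(e^{-i\Phi\hat\sigma_x/2} \otimes \hat I_s)\hat P_0(e^{i\varphi\hat\sigma_z/2} \otimes \hat I_s). \tag{5.30}$$

Using $e^{-i\Phi\hat\sigma_x/2}|\pm\rangle = e^{\mp i\Phi/2}|\pm\rangle$, this modifies Eq. 5.27 for arbitrary controlled unitaries to

$$\hat P_{\Phi,\vec\varphi} = \bigoplus_\lambda \left(\hat I A(\Phi + \theta_\lambda) + i\hat\sigma_z B(\Phi + \theta_\lambda) + i\hat\sigma_x C(\Phi + \theta_\lambda) + i\hat\sigma_y D(\Phi + \theta_\lambda)\right) \otimes |\lambda\rangle\langle\lambda|_s. \tag{5.31}$$

This is equivalent to Eq. 5.26 with controlled-$(e^{i\Phi}\hat P)$ instead of controlled-$\hat P$. Thus the results of Thms. 4.3, 5.12 still hold, but with all phases shifted by $\Phi$. Replacing $\hat P$ with the iterate $\hat W$ now leads to

$$\langle 0|_{ab}\hat P_{\Phi,\vec\varphi}|0\rangle_{ab} = \bigoplus_\lambda \left(\hat I\,\frac{A(\Phi + \theta_\lambda) + A(\Phi - \theta_\lambda)}{2} + i\hat\sigma_z\frac{B(\Phi + \theta_\lambda) + B(\Phi - \theta_\lambda)}{2} + i\hat\sigma_x\frac{C(\Phi + \theta_\lambda) + C(\Phi - \theta_\lambda)}{2} + i\hat\sigma_y\frac{D(\Phi + \theta_\lambda) + D(\Phi - \theta_\lambda)}{2}\right) \otimes |\lambda\rangle\langle\lambda|_s.$$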
As a result, it is no longer clear what a complete characterization of $X(\Phi + \theta_\lambda) + X(\Phi - \theta_\lambda)$ for any $X \in \{A, B, C, D\}$ and arbitrary $\Phi$ is. Our strategy is to choose $\Phi$ and then consider cases where $X(\Phi + \theta_\lambda) = X(\Phi - \theta_\lambda)$. This allows us to apply the results of Thm. 2.7 and Thm. 2.8, but with additional constraints on the form of $(A, B, C, D)$. For instance, the cases $\Phi = 0, \pi$ are trivial as they reduce to Eq. 5.29 where $A(\theta_\lambda) = A(-\theta_\lambda)$. However, $A$ are already even Fourier series so no additional constraints arise. In the case $\Phi = \pi/2$, we see that $X(\pi/2 + \theta_\lambda) = X(\pi/2 - \theta_\lambda)$ only if $X$ is symmetric about $\pi/2$. As $A, B$ are even Fourier series with period $2\pi$, imposing this symmetry halves their period to $\pi$, whereas the odd Fourier series $C, D$ now only have Fourier components of odd degree.

Theorem 5.16. The target operator $\langle +|_c\langle 0|_{ab}\hat P_{\pi/2,\vec\varphi}|0\rangle_{ab}|+\rangle_c = \tilde A[\hat H] + i\tilde C[\hat H]$ can be implemented by some $\vec\varphi \in \mathbb R^N$, where $N$ is even, if all the following are true:
(1) $\tilde A(\lambda)$ is a real even polynomial of degree at most $N/2$;
(2) $\tilde C(\lambda)$ is a real odd polynomial of degree at most $N/2$;
(3) $\forall\lambda \in [-1, 1]$, $\tilde A^2(\lambda) + \tilde C^2(\lambda) \le 1$;
(4) $\tilde A(0) = 1$.

Proof. We start from the conditions of Lem. 4.3 for the partial tuple $(A, \cdot, C, \cdot)$. Assume that $X(\pi/2 + \theta) = X(\pi/2 - \theta)$ for $X \in \{A, B\}$. Thus $A(\theta) = \sum_{n\ \mathrm{even}}^{N/2} a_n\cos(n\theta) = \sum_{n\ \mathrm{even}}^{N/2} a_n T_n(\cos(\theta))$. Combining $\theta = \pi/2 \pm \cos^{-1}(\lambda) \Rightarrow \cos(\theta) = \mp\sqrt{1 - \lambda^2}$ with the fact that $A(\theta)$ is an even polynomial in $\cos(\theta)$, we obtain that $A(\theta) = \sum_{n\ \mathrm{even}}^{N/2} a'_n\lambda^n = \tilde A[\lambda]$ is an even polynomial in $\lambda$. Similarly, $C(\theta) = \sum_{n\ \mathrm{odd}}^{N/2} c_n\sin(n\theta) = \sum_{n\ \mathrm{odd}}^{N/2} c_n\sin(\theta)U_{n-1}(\cos(\theta)) = \sum_{n\ \mathrm{odd}}^{N/2} c'_n\lambda^n$, where $U_n(\cos(\theta)) = \frac{\sin((n+1)\theta)}{\sin(\theta)}$ are Chebyshev polynomials of the second kind. This proves condition (1). Lem. 4.3 also states $\forall\theta \in \mathbb R$, $A^2(\theta) + C^2(\theta) \le 1$. Since $A^2(\theta) + C^2(\theta)$ is a Fourier series in $2\theta$, this is equivalent to $\forall\theta \in [0, \pi]$ and $\forall\chi \in \mathbb R$, $A^2(\theta + \chi) + C^2(\theta + \chi) \le 1$. Since $(\pi/2 \pm \cos^{-1}(\lambda)) : [-1, 1] \to \pi/2 \pm [0, \pi]$, a change of variables to $\lambda$ proves condition (2).
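Solving $\pi/2 \pm \cos^{-1}(\lambda) = 0$ leads to $\lambda = 0$, so the condition $A(0) = 1$ of Lem. 4.3 proves condition (3). $\square$

Whereas $\tilde A(1) = 1$ in Thms. 5.13, 5.14, $\tilde A(0) = 1$ in Thm. 5.16 - these represent different families of target functions. An analogous characterization for other target operators such as $\langle 0|_c\langle 0|_{ab}\hat P_{\pi/2,\vec\varphi}\hat\sigma_x|0\rangle_{ab}|0\rangle_c = \tilde D[\hat H] + i\tilde C[\hat H]$ and $\langle 0|_c\langle 0|_{ab}\hat P_{\pi/2,\vec\varphi}|0\rangle_{ab}|0\rangle_c = \tilde A[\hat H] + i\tilde B[\hat H]$ follows from similar manipulations of Thm. 2.7.

5.4.5 Double-ancilla quantum signal processing

The target operators implementable by quantum signal processing on iterates $\hat W$ in Sections 5.4.1, 5.4.4 are evidently very flexible, but nevertheless subject to some lax constraints. For instance, Thms. 5.16, 5.14 fix $\tilde A(\lambda) = 1$ at some $\lambda = 0, 1$, or impose a definite parity on $\tilde A, \tilde C$, and so on. The ability to implement completely arbitrary target operators would be invaluable. This can indeed be done with a simple modification of Thm. 5.14. The solution is to take a linear combination of $\tilde B[\hat H]$ and the identity. Define a signal operator $\hat P'_{\pm,\vec\varphi}$ controlled by yet another ancilla register $d$ and the ancilla state

$$|\alpha\rangle_{abcd} = |0\rangle_{ab}|+\rangle_c\left(\sqrt\alpha|0\rangle_d + \sqrt{1 - \alpha}|1\rangle_d\right), \tag{5.32}$$

where $1 \ge \alpha \ge 0$ and

$$\hat P'_{\pm,\vec\varphi} = \pm|0\rangle\langle 0|_d \otimes \hat I_{abcs} + |1\rangle\langle 1|_d \otimes \left((-i\hat\sigma_z \otimes \hat I_{abs})\hat P_{\vec\varphi}\right), \quad \langle\alpha|_{abcd}\hat P'_{\pm,\vec\varphi}|\alpha\rangle_{abcd} = (1 - \alpha)\tilde B[\hat H] \pm \alpha\hat I. \tag{5.33}$$

Theorem 5.17. The target operator $\langle\alpha|_{abcd}\hat P'_{\pm,\vec\varphi}|\alpha\rangle_{abcd} = f[\hat H]/3$ can be implemented by some $\vec\varphi \in \mathbb R^N$, where $N$ is even, $1 \ge \alpha \ge 0$, and sign $\pm$, if all the following are true:
(1) $f(\lambda)$ is a real degree $N/2$ polynomial;
(2) $\forall\lambda \in [-1, 1]$, $|f(\lambda)| \le 1$.

Proof. Given any real polynomial $f(\lambda)$ of degree $N/2$ such that $\forall\lambda \in [-1, 1]$, $|f(\lambda)| \le 1$, apply Thm. 5.15 to find $\vec\varphi \in \mathbb R^N$ such that $\langle +|_c\langle 0|_{ab}(-i\hat\sigma_z \otimes \hat I_{abs})\hat P_{\vec\varphi}|0\rangle_{ab}|+\rangle_c = \tilde B[\hat H]$, where $\tilde B(\lambda) = \frac{f(\lambda) - f(1)}{3 - |f(1)|}$ is achievable as $\tilde B(1) = 0$ and $\forall\lambda \in [-1, 1]$, $\tilde B^2(\lambda) \le \left(\frac{2}{3 - |f(1)|}\right)^2 \le 1$. Choosing $\alpha = |f(1)|/3$,

$$\langle\alpha|_{abcd}\hat P'_{\mathrm{sign}(f(1)),\vec\varphi}|\alpha\rangle_{abcd} = (1 - \alpha)\tilde B[\hat H] + \mathrm{sign}(f(1))\,\alpha\hat I = \frac{f[\hat H]}{3},$$

where $\mathrm{sign}(x)$ returns the sign of $x$, with $\mathrm{sign}(0) = +$. $\square$

Note that by taking yet another linear combination of two different $f_1, f_2$ created by Thm.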
5.17 in a very similar manner, one can then implement in the standard form an operator proportional to $f_1[\hat H] + if_2[\hat H]$, which is a completely arbitrary complex function.

5.4.6 Operator functions of normal matrices

The results of Thms. 5.4, 5.3 for qubitization can be extended to normal operators, though we will not classify their possible computable functions. It is well known that any normal matrix has a polar decomposition

$$\hat H = \hat H_U\hat H_H, \tag{5.34}$$

where $\hat H_U$ is unitary, $\hat H_H$ is positive-semidefinite, $\hat H_U$ and $\hat H_H$ commute, $[\hat H_U, \hat H_H] = 0$, and the eigenvalues are $\hat H|\lambda\rangle = e^{i\theta_\lambda}\lambda|\lambda\rangle$, where $\lambda \ge 0$, $\theta_\lambda \in \mathbb R$. This reduces to a Hermitian operator when $\hat H_U$ has eigenvalues $\pm 1$, and reduces to a unitary operator when $\hat H_H$ has all eigenvalues equal to 1.

The trivial approach to qubitization applies to any complex matrix. We simply use the construction of Thm. 5.4 to implement the Hermitian signal operator $\frac{1}{2}(\hat H + \hat H^\dagger)$. Another possibility uses two phased qubiterates in an alternating sequence on input state $|G\rangle_a|\lambda\rangle_s$,

$$\hat W_{\varphi,\pm} = \hat Z_{\varphi-\pi/2}(2|G\rangle\langle G| - \hat I)\hat U_\pm\hat Z_{-\varphi+\pi/2}, \quad \hat U_+ = \hat U, \quad \hat U_- = \hat U^\dagger. \tag{5.35}$$

For each eigenstate $|\lambda\rangle$, the separate iterates have the block form

$$\hat W_{\varphi,\pm} = \begin{pmatrix} \lambda e^{\pm i\theta_\lambda} & \cdot \\ -ie^{i\varphi}\sqrt{1 - \lambda^2} & \cdot \end{pmatrix}, \tag{5.36}$$

where the first column corresponds to the input state $|G\rangle|\lambda\rangle$ and entries marked $\cdot$ are left undefined. The subspace spanned by the input and output states is not invariant under any repeated application of an iterate of the same sign. However, the product $\hat W_{\varphi_2,-}\hat W_{\varphi_1,+}$ has an invariant subspace containing $|G\rangle$. With the understanding that we only consider alternating sequences, each $\hat W_{\varphi,\pm}$ has the representation

$$\hat W_{\varphi,\pm} = \begin{pmatrix} \lambda e^{\pm i\theta_\lambda} & -ie^{-i\varphi}\sqrt{1 - \lambda^2} \\ -ie^{i\varphi}\sqrt{1 - \lambda^2} & \lambda e^{\mp i\theta_\lambda} \end{pmatrix}. \tag{5.37}$$

Note that when all eigenvalues $\lambda$ are identical and $\varphi = \pi/2$, this reduces to oblivious amplitude amplification [12], and we recover Hermitian qubitization when all $\theta_\lambda = 0$. While this approach uses one less ancilla qubit than the construction of Thm. 5.4, quantum signal processing must be applied with care here, as only even-length sequences $\hat W_{\vec\varphi}$ have an invariant subspace.
This limitation can be relevant in some cases, such as Hamiltonian simulation using controlled-qubiterates individually.

5.5 Hamiltonian simulation by qubitization

The cost of simulating the time evolution operator $e^{-i\hat Ht}$ depends on several factors: the number of system qubits $n$, evolution time $t$, target error $\epsilon$, and how information on the Hamiltonian $\hat H$ is made available. This field has progressed rapidly following groundbreaking work in the fractional query model [13] achieving query complexities that depend logarithmically on error. This was generalized by Berry, Childs, Cleve, Kothari, and Somma (BCCKS) [14] to the case where $\hat H = \sum_{j=1}^d \alpha_j\hat U_j$ is a linear combination of $d$ unitaries and the $|\alpha_j|$ sum to $\alpha$ - such a decomposition always exists - with an algorithm using $O\!\left(\frac{\log(d)\log(\alpha t/\epsilon)}{\log\log(\alpha t/\epsilon)}\right)$ ancilla qubits and only $O\!\left(\frac{\alpha t\log(\alpha t/\epsilon)}{\log\log(\alpha t/\epsilon)}\right)$ queries. Subsequently [10], an extension to $d$-sparse Hamiltonians was made, where $\hat H$ has $\le d$ non-zero elements per row with max-norm $\|\hat H\|_{\max}$, to achieve a quadratic improvement in sparsity with $O\!\left(dt\|\hat H\|_{\max}\frac{\log(dt\|\hat H\|_{\max}/\epsilon)}{\log\log(dt\|\hat H\|_{\max}/\epsilon)}\right)$ queries. A prominent open question featured in all these works was whether the additive lower bound $\Omega\!\left(t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ was achievable for any of these models.

In Chapter 4, we presented a procedure achieving the optimal trade-off between all parameters, with query complexity $\Theta\!\left(dt\|\hat H\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$. The strictly linear-time performance with additive complexity is a quadratic improvement for precision simulations $t \sim \log(1/\epsilon)$, and the constant number of $n + m + 3$ ancilla qubits significantly improves on prior art, which depends on $t, \epsilon$ like $O(\log(t/\epsilon))$.

Unfortunately, the $d$-sparse model is less appealing in practical implementations for several reasons. First, it is exponentially slower than BCCKS when the $\hat U_j$ are of high weight with sparsity $O(2^n)$. Second, its black-box oracles can be challenging to realize.
Avoiding the $O(2^n)$ blowup by exploiting sparsity requires that positions of non-zero elements are efficiently row-computable, which is not always the case. Third, the Childs quantum walk requires a doubling of the $n$ system qubits, which is not required by BCCKS.

Ideally, the best features of these two algorithms could be combined. For example, given the decomposition $\hat H = \sum_{j=1}^{d}\alpha_j\hat U_j$, one would like the optimal additive complexity of sparse Hamiltonian simulation, but with the BCCKS oracles that are more straightforward to implement. Furthermore, one could wish for a constant ancilla overhead, of say $\lceil\log_2(d)\rceil + 2$, superior to either algorithm.

The SU(2) structure of the qubiterate is identical to that of the quantum walks used for sparse Hamiltonian simulation in Chapter 4. Thus the technique of approximating time-evolution by a sequence of controlled-quantum walks is directly applicable without modification. This realizes the optimistic fusion of best-case results in prior art, and motivates new formulations of Hamiltonian simulation.

In this section, we prove the optimal Hamiltonian simulation algorithm in Thm. 5.18 that uses the standard-form encoding as the input, apply this simulation algorithm to Hamiltonians described in Section 5.2, and summarize a comparison of the results with prior art in Table 5.2.

Theorem 5.18 (Hamiltonian simulation by qubitization). Given Hermitian standard-form-$(\hat H, \alpha, \hat U, d)$, there exists a standard-form-$(\hat X, 1, \hat V, 4d)$ such that $\|\hat X - e^{-i\hat H t}\| \le \epsilon$, where $\hat V$ requires $Q = O\left(t\alpha + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries to controlled-$\hat U$ and $O(Q\log(d))$ primitive gates.¹

The optimality of the procedure follows by using the qubitized variant of Childs' quantum walk for $\hat U$. Furthermore, the transparent nature of Thm. 4.4 significantly expedites the development of new useful formulations of Hamiltonian simulation. For instance, we easily obtain a new result for the scenario where $\hat H$ is a density matrix $\hat\rho$.
Whereas $\hat\rho$ can be produced by discarding the ancilla of some output from a quantum circuit $\hat G$, we instead keep this ancilla, leading to an unconditional quadratic improvement in time scaling, and an exponential improvement in error scaling over the sample-based Lloyd, Mohseni, and Rebentrost (LMR) model [82, 70], as summarized in Table 5.2. Indeed, most quantum principal component analysis applications for machine learning, as well as quantum semidefinite programming [16], are compatible with this form, and are thus enhanced.

¹ As error $\epsilon$ occurs only in logarithms, it may refer to the trace distance, failure probability, or any other polynomially-related distance without affecting the complexity scaling.

| Algorithm | Model | Ancilla qubits | Query complexity | Gates per query |
| BCCKS [14] | $\sum_j \alpha_j\hat U_j$ | $O\left(\frac{\log(d)\log(\alpha t/\epsilon)}{\log\log(\alpha t/\epsilon)}\right)$ | $O\left(\frac{\alpha t\log(\alpha t/\epsilon)}{\log\log(\alpha t/\epsilon)}\right)$ | |
| Sparse [86] | $d$-sparse $\hat H_{jk}$ | $n + m + 3$ | $O\left(dt\|\hat H\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ | $n + \mathrm{poly}(m)$ |
| LMR [82] | Mixed $\hat\rho$ | $n + 1$ | $O(t^2/\epsilon)$ | $\log n$ |
| Thm. 5.18 | $\langle G|\hat U|G\rangle$ | $\lceil\log_2(d)\rceil + 2$ | $O\left(t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ | |
| Cor. 5.20 | $\sum_j \alpha_j\hat U_j$ | $\lceil\log_2(d)\rceil + 2$ | $O\left(\alpha t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ | |
| Cor. 5.21 | Purified $\hat\rho$ | $n + \lceil\log_2(d)\rceil + 2$ | $O\left(t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ | |

Table 5.2: Performance comparison of state-of-art with our new approaches (bottom three lines), for Hamiltonian simulation $e^{-i\hat H t}$ of $\hat H \in \mathbb{C}^{2^n\times 2^n}$ with error $\epsilon$. The $d$-sparse simulation oracle describes entries of $\hat H$ with maximum absolute value $\|\hat H\|_{\max}$ to $m$ bits of precision. The BCCKS oracle provides the decomposition $\hat H = \sum_{j=1}^{d}\alpha_j\hat U_j$, $\|\hat H\| \le \alpha = \sum_{j=1}^{d}|\alpha_j|$, and each $\hat U_j$ has cost $O(1)$. The LMR query complexity refers to samples of the density matrix $\hat H$. This work generalizes the above in Thm. 5.18 with oracles $\hat G|0\rangle = |G\rangle \in \mathbb{C}^d$, $\hat U$ such that $\langle G|\hat U|G\rangle = \hat H$, where $\|\hat H\| \le 1$. A new model, Corollary 5.21, is provided where the oracle outputs the purification $|\rho\rangle = \sum_j \alpha_j|j\rangle_a|\psi_j\rangle$, $\mathrm{Tr}_a[|\rho\rangle\langle\rho|] = \hat\rho$.

We now proceed with the proof of Thm. 5.18. Note that the normalization $\alpha$ may be absorbed by rescaling the Hamiltonian $\hat H$.
For simplicity, the proof assumes $\alpha = 1$. Using linearity, it suffices to prove our results on a single eigenstate $\hat H|\lambda\rangle = \lambda|\lambda\rangle$. From Eq. 5.11, the qubiterate with an additional global phase $e^{i\Phi}$ acting on a single eigenstate of $\hat H$ simplifies to
$$e^{i\Phi}\hat W_\lambda = e^{i\Phi}e^{-i\hat Y_\lambda\cos^{-1}(\lambda)} \tag{5.38}$$
in the basis $|G_\lambda\rangle = |G\rangle|\lambda\rangle$, $|G_\lambda^\perp\rangle$. Thus $\hat W$ has eigenstates $|G_{\lambda\pm}\rangle = \frac{1}{\sqrt{2}}(|G_\lambda\rangle \pm i|G_\lambda^\perp\rangle)$ with eigenvalues $\hat W|G_{\lambda\pm}\rangle = e^{\mp i\cos^{-1}(\lambda)}|G_{\lambda\pm}\rangle$. As the input state in the Hamiltonian simulation problem is $|G_\lambda\rangle = \frac{1}{\sqrt{2}}(|G_{\lambda+}\rangle + |G_{\lambda-}\rangle)$, the application of a sequence of phased qubiterates implements phase evolution on these states with opposite signs:
$$e^{i\Phi}\hat W|G_\lambda\rangle = \frac{1}{\sqrt{2}}\left(e^{i\Phi - i\cos^{-1}(\lambda)}|G_{\lambda+}\rangle + e^{i\Phi + i\cos^{-1}(\lambda)}|G_{\lambda-}\rangle\right). \tag{5.39}$$
This can be contrasted with the requirements of Hamiltonian simulation, where a phase $e^{-i\lambda t}$ must be applied on both states. Hamiltonian simulation is accomplished by linearizing the phase implemented by the qubiterate with some function
$$h(\Phi \pm \cos^{-1}(\lambda)) = -\lambda t. \tag{5.40}$$
This is accomplished with the choice
$$\Phi = -\pi/2, \quad h(\theta) = \sin(\theta)t. \tag{5.41}$$
This function may be approximated using Thm. 5.12, which also underlies sparse Hamiltonian simulation in Chapter 4. Its application requires a good Fourier approximation to the function $e^{ih(\theta)}$, which is provided by truncating the Jacobi-Anger expansion $e^{i\cos(z)t} = \sum_{k=-\infty}^{\infty} i^k J_k(t)e^{ikz}$ [1], where $J_k(t)$ are Bessel functions of the first kind:
$$\cos(\sin(\theta_\lambda)t) = J_0(t) + 2\sum_{k\ \mathrm{even}>0} J_k(t)\cos(k\theta_\lambda), \qquad \sin(\sin(\theta_\lambda)t) = 2\sum_{k\ \mathrm{odd}>0} J_k(t)\sin(k\theta_\lambda). \tag{5.42}$$
As done in [10, 86], this allows us to identify the truncated Fourier series of Eq. 5.42 as the target functions $A(\theta) \approx \cos(\sin(\theta)t)$, $C(\theta) \approx \sin(\sin(\theta)t)$ in Thm. 4.3. The error $\epsilon = \max_\theta|A(\theta) + iC(\theta) - e^{i\sin(\theta)t}|$ from truncating this expansion for $k > N/2$ is a sum of $|J_k(t)|$ that was bounded in [10]:
$$\epsilon \le \sum_{k=q}^{\infty} 2|J_k(t)| \le \frac{4t^q}{2^q q!} = O\left(\left(\frac{et}{2q}\right)^q\right), \tag{5.43}$$
where $q$ is the index of the first omitted term. The rapid convergence by truncation arises as $e^{it\sin(\theta)}$ is an entire analytic function [15]. Thus Thm.
5.12 allows us to implement the time evolution operator $e^{-i\hat H t}$ with trace distance $O(\epsilon)$ and failure probability $O(\epsilon)$. Solving for $N$ then furnishes the number of queries to $\hat W$ required to simulate $e^{-i\hat H t}$:
$$N = O\left(t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right), \tag{5.44}$$
and the $\alpha$ normalization constant may be restored by rescaling $\hat H$ again. This achieves the upper bound in Thm. 4.4. To prove that it is optimal, we show that sparse Hamiltonian simulation is a special case.

Corollary 5.19 (Hamiltonian Simulation of a Sparse Hermitian Matrix). Given access to the oracles in Section 5.2.3 that describe a $d$-sparse Hamiltonian $\hat H$ with max-norm $\|\hat H\|_{\max}$, time evolution by $\hat H$ can be simulated for time $t$ and error $\epsilon$ with $O\left(dt\|\hat H\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries.

Proof. This follows from the standard-form encoding Eq. 5.8.

The case where $\hat H$ decomposes into a linear combination of unitaries is an immediate application:

Corollary 5.20 (Hamiltonian Simulation of a Linear Combination of Unitaries). Given access to the oracles $\hat G$, $\hat U$, defined in Section 5.2.3, that describe a Hamiltonian $\hat H = \sum_{j=1}^{d}\alpha_j\hat U_j$ by a linear combination of unitaries, time evolution by $\hat H$ can be simulated for time $t$ and error $\epsilon$ with $O\left(\alpha t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries to $\hat G$, $\hat U$.

Proof. This follows from the standard-form encoding Eq. 5.3.

The intuitiveness of Thm. 4.4 allows us to swiftly devise new models of Hamiltonian simulation.

Corollary 5.21 (Hamiltonian Simulation of a Purified Density Matrix). Given access to the oracles in Section 5.2.3 that describe a Hamiltonian $\hat H$ that is a density matrix $\hat\rho$, time evolution by $\hat\rho$ can be simulated for time $t$ and error $\epsilon$ with $O\left(t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries to $\hat G$.

Proof. This follows from the standard-form encoding Eq. 5.6.

5.6 Conclusion

The standard-form encoding for matrix inputs to quantum computation is very flexible and generalizes common input models, such as $d$-sparse oracles or a linear-combination-of-unitaries.
As one is always free to impose a preferred basis on this standard form, it illuminates an intuitive and straightforward path to other as-yet undiscovered input models of interest. For instance, our definition of the purified density matrix input model for the problem of Hamiltonian simulation led to a quadratic improvement in time and an exponential improvement in error over the sample-based model, the proof of which consisted of just a few lines.

The greater value of operator design through quantum signal processing and qubitization lies in providing a unified approach to understanding a variety of quantum algorithms, and what fundamentally determines their performance. In particular, important problems that essentially rely on computing matrix functions, such as Hamiltonian simulation, quantum linear systems, and Gibbs state preparation, have previously required a case-by-case analysis for different input models, each of which represented a major breakthrough. These are now shown to be special cases of qubitization combined with quantum signal processing, wherein finding an algorithm that succeeds on the standard form automatically implies algorithms with equal performance for all other input models.

Through qubitization and quantum signal processing, we characterize the set of matrix functions that can be implemented on the standard-form, and find that completely arbitrary complex matrix polynomials can be implemented exactly using at most $O(1)$ ancilla qubits. As these functions are also implemented with query complexity exactly that of optimal polynomial approximation [93], these standard-form algorithms can represent significant improvements, as illustrated through our applications to Hamiltonian simulation.

Chapter 6
Uniform spectral amplification

6.1 Introduction

Quantum algorithms for matrix operations are among the most exciting applications of quantum computers.
In the best cases, they promise exponential speedups over classical approaches for problems such as matrix inversion [55] and Hamiltonian simulation, which is matrix exponentiation. Intuitively, any arbitrary unitary matrix applied to a $q$-qubit quantum state is 'exponentially fast' due to a state space of dimension $n = 2^q$. However, if these matrix elements are presented as a classical list of $O(n^2)$ numbers, simply encoding the data into a quantum circuit already takes exponential time. Thus the extent of this speedup is sensitive to both the properties of the Hamiltonian and the input model defining how that information is made accessible to a quantum computer.

Broad classes of Hamiltonians $\hat H$, structured so as to enable this exponential speedup, are well-known. The most-studied examples include local Hamiltonians [81] built from a sum of terms each acting on a constant number of qubits, and their generalization as $d$-sparse matrices [3] with at most $d$ non-zero entries in every row, whose values and positions must all be efficiently computable. More recent innovations consider matrices that are a linear combination of unitaries [28, 14, 99] or density matrices [82, 70]. Though different classes define different input models, that is, unitary quantum oracles that encode $\hat H$, it is still helpful to quantify the cost of various quantum matrix algorithms through the query complexity, which in turn depends on various structural descriptors of $\hat H$, such as, but not limited to, its spectral norm $\|\hat H\|$, induced 1-norm $\|\hat H\|_1$, max-norm $\|\hat H\|_{\max}$, rank, or sparsity.

A challenging open problem is how knowledge of any structure may be maximally exploited to accelerate quantum algorithms. As the time-evolution operator $e^{-i\hat H t}$ underlies numerous such quantum algorithms, one common benchmark is the Hamiltonian simulation problem of converting this description of $\hat H$ into a quantum circuit that approximates $e^{-i\hat H t}$ for time $t$ with some error $\epsilon$.
In Chapter 4, we provided an algorithm with optimal query complexity $O\left(td\|\hat H\|_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ [86] in all parameters for sparse matrices [23, 13, 10], based on quantum signal processing techniques. Though this settles the worst-case situation where only $d$ and the max-norm $\|\hat H\|_{\max}$ are known in advance, there exist algorithms that exploit additional knowledge of the spectral norm $\|\hat H\|$ and induced one-norm $\|\hat H\|_1$ to achieve simulation with $O\left(t^{3/2}(d\|\hat H\|_{\max}\|\hat H\|_1\|\hat H\|/\epsilon)^{1/2}\right)$ [12] queries. Though this square-root scaling in sparsity alone is optimal, it is currently unknown whether the significant penalty in the time and error scaling is unavoidable. Motivated by the inequalities $\|\hat H\|_{\max} \le \|\hat H\| \le \|\hat H\|_1 \le d\|\hat H\|_{\max}$ [25], one could hope for a best-case algorithm in Claim 6.1 that interpolates between these possibilities.

Claim 6.1 (Sparse Hamiltonian simulation). Given the standard quantum oracles that return values of $d$-sparse matrix elements of the Hamiltonian $\hat H$, there exists a quantum circuit that approximates time-evolution $e^{-i\hat H t}$ with error $\epsilon$ using $Q = O\left(t(d\|\hat H\|_{\max}\|\hat H\|_1)^{1/2} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries and $O(Q\log(n))$ single and two-qubit quantum gates.

The challenge is exacerbated by how unitary time-evolution, though a natural consequence of Schrödinger's equation in continuous-time, is not natural to the gate model of discrete-time quantum computation. In some cases, such as quantum matrix inversion [72], algorithms that are more efficient as well as considerably simpler in both execution and concept can be obtained by creatively bypassing Hamiltonian simulation as an intermediate step. The need to disentangle the problem of exploiting structure from that of finding the best simulation algorithms is highlighted by celebrated Hamiltonian simulation techniques such as Lie-product formulas [81], quantum walks [23], and truncated-Taylor series [14], each radically different and specialized to some class of structured matrices.
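The norm inequalities quoted above, and the gap between the worst-case factor $d\|\hat H\|_{\max}$ and the factor $(d\|\hat H\|_{\max}\|\hat H\|_1)^{1/2}$ hoped for in Claim 6.1, can be checked numerically on a small example. The following is only an illustrative sketch: the helper names `banded_hermitian` and `structural_norms` are hypothetical, and a random banded matrix stands in for a generic $d$-sparse Hamiltonian.

```python
import numpy as np


def banded_hermitian(n, d, seed=0):
    """Hypothetical helper: random n x n Hermitian matrix with at most d non-zeros per row."""
    rng = np.random.default_rng(seed)
    w = (d - 1) // 2  # half-bandwidth, so each row holds at most 2*w + 1 <= d entries
    H = np.zeros((n, n), dtype=complex)
    for j in range(n):
        for k in range(j, min(n, j + w + 1)):
            # Real diagonal, complex off-diagonal, mirrored to keep H Hermitian.
            z = rng.normal() if k == j else rng.normal() + 1j * rng.normal()
            H[j, k] = z
            H[k, j] = np.conj(z)
    return H


def structural_norms(H):
    """Return (sparsity d, max-norm, induced 1-norm, spectral norm) of H."""
    d = int(np.max(np.count_nonzero(H, axis=1)))
    max_norm = float(np.max(np.abs(H)))
    one_norm = float(np.max(np.sum(np.abs(H), axis=0)))
    spectral = float(np.linalg.norm(H, 2))
    return d, max_norm, one_norm, spectral


H = banded_hermitian(n=64, d=7)
d, max_norm, one_norm, spectral = structural_norms(H)
print("||H||_max <= ||H|| <= ||H||_1 <= d ||H||_max:")
print(max_norm, spectral, one_norm, d * max_norm)
print("worst-case factor d ||H||_max        :", d * max_norm)
print("Claim 6.1 factor sqrt(d ||H||_max ||H||_1):", np.sqrt(d * max_norm * one_norm))
```

Because $\|\hat H\|_1 \le d\|\hat H\|_{\max}$, the second factor never exceeds the first; how much smaller it is depends on how unevenly the weight of $\hat H$ is spread across each row.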
A unifying approach to exploiting the structure of Hamiltonians, independent of any specific quantum algorithm, is hinted at in Chapter 5 through Hamiltonian simulation by qubitization. There, we focus on a standard-form encoding of matrices (Def. 6.2), which, in addition to generalizing a number of prior input models, also appears more natural. On measurement outcome $|0\rangle_a$ with best-case success probability $(\|\hat H\|/\alpha)^2 \le 1$, a Hermitian measurement operator $\hat H/\alpha$ is applied on the system; thus the standard-form is no more or less than the fundamental steps of generalized measurement [98]. Treating this quantum circuit as a unitary oracle amounts to possessing no structural information whatsoever about $\hat H$. In this situation, we provided an optimal simulation algorithm (Thm. 6.3), notably with only $O(1)$ ancilla overhead.

Definition 6.2 (Standard-form matrix encoding). A matrix $\hat H \in \mathbb{C}^{n\times n}$ acting on the system register $s$ is encoded in standard-form-$(\hat H, \alpha, \hat U, d)$ with normalization $\alpha \ge \|\hat H\|$ by the computational basis state $|0\rangle_a \in \mathbb{C}^d$ on the ancilla register $a$ and signal unitary $\hat U \in \mathbb{C}^{dn\times dn}$ if $(\langle 0|_a \otimes \hat I_s)\hat U(|0\rangle_a \otimes \hat I_s) = \hat H/\alpha$.¹ If $\hat H$ is also Hermitian, this is called a Hermitian standard-form encoding.

Theorem 6.3 (Hamiltonian simulation by qubitization, restated from Thm. 5.18). Given Hermitian standard-form-$(\hat H, \alpha, \hat U, d)$, there exists a standard-form-$(\hat X, 1, \hat V, 4d)$ such that $\|\hat X - e^{-i\hat H t}\| \le \epsilon$, where $\hat V$ requires $Q = O\left(t\alpha + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries to controlled-$\hat U$ and $O(Q\log(d))$ primitive gates.²

This motivates the standard-form encoding as the appropriate endpoint when structural information about $\hat H$ is provided, though it does not exclude the possibility of superior simulation algorithms not based on the standard-form. As Thm. 6.3 is the optimal simulation algorithm, any exploitation of structure should manifest in minimizing the normalization $\alpha$ of a Hamiltonian encoded in Def. 6.2.
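To make Def. 6.2 concrete, the sketch below builds one possible signal unitary for a small Hermitian matrix and checks the defining block condition numerically. It assumes a single ancilla qubit ($d = 2$) and uses the textbook unitary dilation $\hat U = \begin{pmatrix} A & \sqrt{I - A^2} \\ \sqrt{I - A^2} & -A \end{pmatrix}$ with $A = \hat H/\alpha$; this is an illustrative choice, not the construction used elsewhere in this thesis, and the function name `standard_form_dilation` is hypothetical.

```python
import numpy as np


def standard_form_dilation(H, alpha):
    """Hypothetical helper: a 2n x 2n unitary whose top-left n x n block is H / alpha.

    Assumes H is Hermitian and alpha >= ||H|| (spectral norm).
    """
    A = H / alpha
    evals, evecs = np.linalg.eigh(A)
    if np.max(np.abs(evals)) > 1 + 1e-12:
        raise ValueError("alpha must be at least the spectral norm of H")
    # sqrt(I - A^2), computed in the eigenbasis of A so that it commutes with A.
    s = np.sqrt(np.clip(1.0 - evals**2, 0.0, None))
    S = (evecs * s) @ evecs.conj().T
    return np.block([[A, S], [S, -A]])


rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2                  # a random Hermitian matrix
alpha = 1.5 * np.linalg.norm(H, 2)        # any normalization alpha >= ||H||
U = standard_form_dilation(H, alpha)

n = H.shape[0]
print("unitary:", np.allclose(U.conj().T @ U, np.eye(2 * n)))
# With the ancilla as the leading tensor factor, (<0|_a x I_s) U (|0>_a x I_s)
# is simply the top-left n x n block of U.
print("block equals H/alpha:", np.allclose(U[:n, :n], H / alpha))
```

In this sketch, measuring the ancilla in $|0\rangle_a$ after applying $\hat U$ to $|0\rangle_a|\psi\rangle_s$ succeeds with probability $\|(\hat H/\alpha)|\psi\rangle\|^2 \le (\|\hat H\|/\alpha)^2$, which is why reducing $\alpha$ toward $\|\hat H\|$ is the quantity of interest.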
In order to avoid accumulating polynomial factors of errors, this must only be with an exponentially small distortion to its spectrum. Moreover, the cost of the procedure should allow for a favorable trade-off in the query complexity of Hamiltonian simulation. Thus manipulation of the standard-form and any additional structural information to this end is what we call the uniform spectral amplification problem.

¹ The unitary $\hat G$ defined in Chapter 5 such that $((\langle 0|\hat G^\dagger) \otimes \hat I)\hat U((\hat G|0\rangle) \otimes \hat I) = \hat H/\alpha$, which encodes $\hat H$ with normalization $\alpha$, may be absorbed into a redefinition of $\hat U$. Moreover, for any $\beta > 0$, this is identical to encoding $\hat H\beta$ with normalization $\alpha\beta$.

² As error $\epsilon$ occurs only in logarithms, it may refer to the trace distance, failure probability, or any other polynomially-related distance without affecting the complexity scaling.

Problem 2 (Uniform spectral amplification). Given Hermitian standard-form-$(\hat H, \alpha, \hat U, d)$, and an upper bound $\Lambda \in [\|\hat H\|, \alpha]$ on the spectral norm, exploit any additional information about $\hat H$ or the signal unitary $\hat U$ to construct a $Q$-query quantum circuit that encodes $\hat H_{\text{amp}}$ in standard-form with normalization $\Lambda$, such that $\|\hat H_{\text{amp}} - \hat H\| \le \epsilon$, and $Q = o(\alpha/\Lambda)\,O(\mathrm{polylog}(1/\epsilon))$.

Uniform spectral amplification is non-trivial as it precludes a number of standard techniques. First, amplitude amplification is precluded as the success probability must be boosted for all input states to the system. Second, oblivious amplitude amplification [13, 14] is also precluded as $\hat H$ is not in general unitary, or even close to unitary. Third, spectral gap amplification [118] is precluded as it distorts the spectrum. As such, solving this problem would be of broad interest beyond Hamiltonian simulation. For instance, spectral gap amplification is fundamental to adiabatic state preparation and understanding properties of condensed matter systems.
Moreover, the prevalence of generalized measurements means that this could also be applicable to quantum observable estimation in metrology and repeat-until-success gate synthesis [103].

Some forms of spectral gap amplification have an underlying structure that resembles the amplitude amplification algorithm for quantum state preparation. This suggests that at least one possible solution to uniform spectral amplification could be obtained by solving a related non-trivial amplitude multiplication problem, and vice-versa.

Problem 3 (Amplitude multiplication). Given a quantum state preparation oracle $\hat G|0\rangle_a|0\rangle_b = \lambda|t\rangle_a|0\rangle_b + \sqrt{1-\lambda^2}\,|t^\perp\rangle_{ab}$, and an upper bound $\Gamma \in [\lambda, 1]$ on the target state overlap, construct a $Q$-query quantum circuit $\hat V$ that prepares $\hat V|0\rangle_a|0\rangle_b = \lambda_{\text{amp}}|t\rangle_a|0\rangle_b + \sqrt{1-\lambda_{\text{amp}}^2}\,|t^{\perp\prime}\rangle_{ab}$ such that $|\lambda_{\text{amp}} - \lambda/\Gamma| \le \epsilon$, and $Q = O(\Gamma^{-1}\log(1/\epsilon))$.

Amplitude multiplication is particularly interesting as amplitude amplification and its many other variations [136] amplify target states with the same optimal scaling $O(\lambda^{-1})$, but with a highly non-linear dependence on the initial overlap. In contrast, Problem 3 performs arithmetic multiplication on the amplitudes with exponentially small error, notably independent of, and without any prior knowledge of, their values.

6.1.1 Our Results

We present quantum algorithms for Hamiltonian simulation based on the general principle of finding solutions to the uniform spectral amplification Problem 2, which may be broadly categorized as follows. In 'uniform spectral amplification by quantum signal processing', we make no assumptions on the form of the signal unitary in the standard-form encoding of $\hat H$, and thus treat it as a single unitary oracle. In 'uniform spectral amplification by amplitude multiplication', we assume that the signal unitary has the structure of factoring into two or three unitary oracles, and by solving amplitude multiplication in Problem 3, also approach the sparse simulation results of Claim 6.1.
We then provide a unifying perspective in 'universality of the standard-form' which further motivates the standard-form encoding of Hamiltonians as a fundamental ingredient in quantum computation. In greater detail, these results are as follows.

Uniform Spectral Amplification by Quantum Signal Processing

If we make no assumptions on the form of the signal unitary $\hat U$ that realizes the standard-form encoding, we treat $\hat U$ as a black-box oracle, which we call the standard-form oracle. In this situation, the first result is uniform spectral amplification in Thm. 6.4 that reduces the normalization $\alpha$ of encoded Hamiltonians to $O(\Lambda)$ using $O(\alpha\Lambda^{-1}\log(1/\epsilon))$ queries. This produces a quadratic improvement in success probability when the standard-form is applied to perform quantum measurement, but serves no advantage to Hamiltonian simulation.

Theorem 6.4 (Uniform spectral amplification by spectral multiplication). Given Hermitian standard-form-$(\hat H, \alpha, \hat U, d)$, let $\Lambda \in [\|\hat H\|, \alpha]$. Then for any $\epsilon \le O(\Lambda/\alpha)$, there exists a standard-form-$(\hat H_{\text{amp}}, 2\Lambda, \hat V, 4d)$ such that $\|\hat H_{\text{amp}} - \hat H\| \le \epsilon$, and $\hat V$ requires $O(\alpha\Lambda^{-1}\log(1/\epsilon))$ queries to controlled-$\hat U$.

The second result is uniform spectral amplification of only the low-energy subspace in Thm. 6.5, of $\hat H$ with eigenvalues $\in [-\alpha, -\alpha(1-\Delta)]$, which is of interest to quantum chemistry and adiabatic computation. There, the effective normalization is reduced to $O(1)$ using $O(\Delta^{-1/2}\log^{3/2}(1/\epsilon))$ queries. This is a generalization of spectral gap amplification [118] with the distinction of preserving the relative energy spacing of all relevant states, and of applying to any Hamiltonian encoded in standard-form. When applied to Hamiltonian simulation, an acceleration to $O(t\alpha\sqrt{\Delta}\log^{3/2}(t\alpha/\epsilon))$ queries is obtained in Cor. 6.13.

Theorem 6.5 (Uniform spectral amplification of low-energy subspaces).
Given Hermitian standard-form-$(\hat H, \alpha, \hat U, d)$ with eigenstates $\hat H/\alpha|\lambda\rangle = \lambda|\lambda\rangle$, let $\Delta \in (0,1)$ be a positive constant, and $\hat\Pi = \sum_{\lambda\in[-1,-1+\Delta]}|\lambda\rangle\langle\lambda|$ be a projector onto the low-energy subspace of $\hat H$. Then there exists a standard-form-$(\hat H_{\text{amp}}, \Delta\alpha, \hat V, 4d)$ such that |Ift i - ) )bl| < e, and $\hat V$ requires $O(\Delta^{-1/2}\log^{3/2}(1/\epsilon))$ queries to controlled-$\hat U$.

These results stem primarily from constructing polynomials with desirable properties, which we implement using the technique of flexible quantum signal processing in Thm. 6.6. The advantage of quantum signal processing over the related technique of linear-combination-of-unitaries [10] is its avoidance of Hamiltonian simulation as an intermediate step. This reduces overhead in space, query complexity, and error, and leads to an extremely simple algorithm that directly implements polynomial functions of $\hat H$ without any approximation.

Theorem 6.6 (Flexible quantum signal processing, restated from Thm. 5.9). Given a Hermitian standard-form-$(\hat H, 1, \hat U, d)$, let $B$ be any function that satisfies all the following conditions: (1) $B(x) = \sum_{j=0}^{N} b_j x^j$ is a real parity-($N$ mod 2) polynomial of degree at most $N$; (2) $B(0) = 0$; (3) $\forall x \in [-1,1]$, $B^2(x) \le 1$. Then there exists a Hermitian standard-form-$(B[\hat H], 1, \hat V, 4d)$, where $B[\hat H] = \sum_{j=0}^{N} b_j\hat H^j$, and $\hat V$ requires $O(N)$ queries to controlled-$\hat U$ and $O(N\log(d))$ primitive quantum gates precomputed in classical $O(\mathrm{poly}(N))$ time.

Uniform Spectral Amplification by Amplitude Multiplication

Alternatively, we here assume that the signal unitary $\hat U$ that realizes the standard-form encoding factors into two or three unitary quantum oracles $\hat U_{\text{row}}$, $\hat U_{\text{col}}$, and $\hat U_{\text{mix}}$, which we also call standard-form oracles. When the signal unitary factors into two components $\hat U = \hat U_{\text{row}}^\dagger\hat U_{\text{col}}$, this constrains the representation of matrices in the standard-form to have matrix elements of $\hat H$ that are exactly the overlap of appropriately defined quantum states, and generalizes the sparse matrix model first introduced by Childs [23] for quantum walks.
When the signal unitary factors into three components $\hat U = \hat U_{\text{row}}^\dagger\hat U_{\text{mix}}\hat U_{\text{col}}$, amplitude amplification can be applied to obtain non-trivial Hamiltonians. Note that amplitude amplification had been previously considered in the context of sparse Hamiltonian simulation [12]. However, its non-linearity introduced a polynomial dependence on error, which compounded into a polynomial overhead in scaling with respect to time and error. In contrast, our solution to the amplitude multiplication problem achieves uniform spectral amplification by multiplying all state overlaps by the same constant factor.

Specializing the general result Lem. 6.15 to the case of sparse Hamiltonians, which are described by standard black-box quantum oracles (Def. 6.7) to their non-zero matrix elements and positions, furnishes a simulation algorithm matching the complexity of Claim 6.1, up to logarithmic factors. Modulo these logarithmic factors, this is an improvement over prior art, with either a best-case square-root improvement in sparsity [86], or a polynomial improvement in time and exponential improvement in precision [12].

Definition 6.7 (Sparse matrix oracles [12]). Sparse matrices with at most $d$ non-zero elements in every row are specified by two oracles. The oracle $\hat O_H|j\rangle|k\rangle|z\rangle = |j\rangle|k\rangle|z \oplus \hat H_{jk}\rangle$ queried by $j \in [n]$ row and $k \in [n]$ column indices returns the value $\hat H_{jk} = \langle j|\hat H|k\rangle$, with maximum absolute value $\|\hat H\|_{\max} = \max_{jk}|\hat H_{jk}|$. The oracle $\hat O_F|j\rangle|l\rangle = |j\rangle|f(j,l)\rangle$ queried by $j \in [n]$ row and $l \in [d]$ column indices computes in-place the column index $f(j,l)$ of the $l$-th non-zero entry of the $j$-th row.

Theorem 6.8 (Sparse Hamiltonian simulation by amplified state overlap). Given the $d$-sparse matrix oracles in Def. 6.7 for the Hamiltonian $\hat H$, let $\|\hat H\|_{\max} = \max_{jk}|\hat H_{jk}|$ be the max-norm, $\|\hat H\|_1 = \max_j\sum_k|\hat H_{jk}|$ be the induced 1-norm, and $\|\hat H\|$ be the spectral norm.
Then $\forall t > 0$, $\epsilon > 0$, the operator $e^{-i\hat H t}$ can be approximated with error $\epsilon$ using
$$O\left(t(d\|\hat H\|_{\max}\|\hat H\|_1)^{1/2}\log\left(\frac{t\|\hat H\|}{\epsilon}\right)\left(1 + \frac{1}{t\|\hat H\|_1}\frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)\right) \tag{6.1}$$
queries.

Observe that in the asymptotic limit of large $t\|\hat H\|_1 \gg \log(1/\epsilon)$, the query complexity simplifies to $O\left(t(d\|\hat H\|_{\max}\|\hat H\|_1)^{1/2}\log(t\|\hat H\|/\epsilon)\right)$. The algorithm of Thm. 6.8 is particularly flexible. If none of the above norms are known, they may be replaced by any upper bound, such as determined by the inequalities $\|\hat H\|_{\max} \le \|\hat H\| \le \|\hat H\|_1 \le d\|\hat H\|_{\max}$ [25]. Even in the worst case, the results are similar to previous optimal simulation algorithms. Moreover, the scaling in these parameters is optimal, as we prove a matching lower bound in Thm. 6.9 by finding a Hamiltonian that solves PARITY o OR.

Theorem 6.9. For any $d \ge 1$, $s \ge 1$, and $t > 0$, there exists a Hamiltonian $\hat H$ with sparsity $\Theta(d)$, $\|\hat H\|_{\max} = \Theta(1)$, and $\|\hat H\|_1 = \Theta(s)$, such that approximating time evolution $e^{-i\hat H t}$ with constant error requires $\Omega(t\sqrt{ds})$ queries.

Some of these results stem from constructing polynomials with desirable properties, which we implement using the technique of flexible amplitude amplification from Chapter 3. Amplitude multiplication in Thm. 6.10 is then a special case that solves Problem 3 up to a factor of $1/2$ in the range of the input and output amplitudes.

Theorem 6.10 (Amplitude multiplication algorithm). $\forall\lambda \in [-1/2, 1/2]$, $\Gamma \in (|\lambda|, 1/2]$, $\epsilon \le O(\Gamma)$, let $\hat G$ be a state preparation unitary acting on the computational basis states $|0\rangle_a \in \mathbb{C}^d$, $|0\rangle_b \in \mathbb{C}^2$ such that $\hat G|0\rangle_a|0\rangle_b = \lambda|t\rangle_a|0\rangle_b + \sqrt{1-\lambda^2}\,|t^\perp\rangle_{ab}$, where $|t^\perp\rangle_{ab}$ has no support on $|0\rangle_b$. Then there exists a quantum circuit $\hat G'$ such that $\left|\langle t|_a\langle 0|_b\langle 0|_c\hat G'|0\rangle_a|0\rangle_b|0\rangle_c - \frac{\lambda}{2\Gamma}\right| \le \frac{|\lambda|}{2\Gamma}\epsilon$, using $Q = O(\Gamma^{-1}\log(1/\epsilon))$ queries to $\hat G$, $\hat G^\dagger$, $O(Q\log(d))$ primitive quantum gates, and an additional ancilla qubit $c$.

Universality of the Standard-Form

Uniform spectral amplification is motivated by the idea that structure in the signal unitary and its encoded Hamiltonian can be fully exploited by focusing only on manipulating the standard-form, independent of any later application such as Hamiltonian simulation. This is supported by the simulation algorithm Thm. 6.3, which is optimal with respect to all parameters when the standard-form is provided as a black-box oracle. This perspective would be further justified if one could rule out, to a reasonable extent, the existence of superior simulation algorithms not based on the standard-form.

We show a certain universality of the standard-form by proving an equivalence between quantum circuits for simulation and those for quantum measurement, up to a logarithmic overhead in time and a constant overhead in space. Where Thm. 6.3 transforms a measurement of $\hat H$ to time-evolution by $e^{-i\hat H t}$, we prove the converse in Thm. 6.11, which transforms time-evolution $e^{-i\hat H t}$ back into measurement of $\hat H$. In particular, this is with an exponential improvement in precision over standard techniques based on quantum phase estimation. Thus any non-standard-form simulation algorithm for $e^{-i\hat H t}$ that exploits structure can always be mapped in this manner onto the standard-form with a small overhead.

Theorem 6.11 (Standard-form encoding by Hamiltonian simulation). Given oracle access to the controlled time-evolution $e^{-i\hat H}$ such that $\|\hat H\| \le 1/2$, there exists a standard-form-$(\hat H_{\text{lin}}, 1, \hat U, 4)$ such that $\|\hat H_{\text{lin}} - \hat H\| \le \epsilon$, where $\hat U$ requires $Q = O(\log(1/\epsilon))$ queries and $O(Q)$ primitive quantum gates.

This is proven through the flexible quantum signal processing Thm. 5.9 using a particular choice of polynomial. It is important to note, however, the caveat that our equivalence limits $\|\hat H t\| = O(1)$, and also fails when time-evolution can be approximated with $o(t)$ queries.
Fortunately, the latter scenario can be disregarded with limited loss as 'no-fast-forwarding' theorems [25] prove the necessity of $\Omega(t)$ queries for generic computational problems and physical systems.

One useful application of this reverse direction is an alternate technique, Cor. 6.12, for simulating time evolution by a sum of $d$ Hermitian components $\sum_{j=1}^{d}\hat H_j$, given their controlled-exponentials $e^{-i\hat H_j t_j}$. This approach is considerably simpler than that of compressed fractional queries [13], and essentially works by using Thm. 6.11 to map each $e^{-i\hat H_j t_j}$, where $\|\hat H_j t_j\| = O(1)$, to a standard-form encoding of $\hat H_j t_j$.

Corollary 6.12 (Hamiltonian simulation with exponentials). Given the standard-form-(E _I aje-ifi, a, GaUSa, d), where U that prepares the state |G)a j 1)(j Ula aj and signal oracle U = 1, there exists a standard-form-(k, 1, V, 4d) such that |k - eit|| <cE, _= 1 vaj/alj)a with a3 ;> 0, normalization a = e-ika, with ||$j4| V requires 0 (at log (at/c) + primitive quantum gates. where og log (at/E) controlled-queries, and 0(Q log (d))

6.1.2 Organization

Our results are structured in the remainder of this manuscript as follows. Part I is where we achieve uniform spectral amplification by quantum signal processing. In Sec. 6.2, we treat the signal unitary as a single unitary oracle, and apply flexible quantum signal processing to prove the solutions Thm. 6.4 and Thm. 6.5 to the uniform spectral amplification problem. Part II is where we achieve uniform spectral amplification by amplitude multiplication. We prove in Sec. 6.3 the amplitude multiplication algorithm of Thm. 6.10. Subsequently in Sec. 6.4, we consider signal unitaries that factor into two or three unitary oracles. This motivates a general model of Hamiltonians encoded by state overlaps, where uniform spectral amplification in Lem. 6.15 is enabled by amplitude multiplication. Applying these results to the special case of sparse matrices leads to the simulation algorithm Thm.
6.8, which matches the lower bound Thm. 6.9. Sec. 6.5 is where we offer a unifying perspective of simulation algorithms and prove a certain universality of the standard-form. This is through the equivalence between quantum circuits for simulation and those for measurement described by Thm. 6.11, and leads to the simulation algorithm Cor. 6.12. Sec. 6.6 is where we constructively prove the properties of useful polynomials used in various proofs. We conclude in Sec. 6.7.

6.1.3 Attributions and contributions

The results in this section are from the preprint of joint work [85] with Isaac L. Chuang. The manuscript was written by myself with helpful discussions and suggestions from my advisor. G.H. Low is funded by the NSF RQCC Project No.1111337 and ARO quantum algorithms project. We thank Aram Harrow and Robin Kothari for suggesting PARITY o OR as a possible lower bound.

6.2 Uniform Spectral Amplification by Quantum Signal Processing

When provided with no information on any structure in the standard-form encoding $(\langle 0|_a \otimes \hat I_s)\hat U(|0\rangle_a \otimes \hat I_s) = \hat H/\alpha$ of the Hermitian matrix $\hat H$, all we have is access to the signal oracle $\hat U$. Thus our only option is to apply quantum signal processing and study the polynomial functions $f[\cdot]$ of $\hat H/\alpha$ that achieve uniform spectral amplification. In this setting, Thm. 6.4 performs uniform spectral amplification, though the trade-off between its implementation cost and the achieved reduction of $\alpha$ provides no advantage to Hamiltonian simulation. However, a speedup is possible through Thm. 6.5 when interested only in the low-energy subspace of $\hat H$.

As the normalization $\alpha$ is always greater than or equal to $\|\hat H\|$, any input state $|\psi\rangle$ on the system has support only on eigenstates $\hat H/\alpha|\lambda\rangle = \lambda|\lambda\rangle$ with eigenvalues $|\lambda| \le \|\hat H\|/\alpha \le 1$.
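Given an upper bound $\Lambda \in [\|\hat{H}\|, \alpha]$ on the spectral norm, this means that in any polynomial function $p(x)$ that we construct, only its restriction to the domain $x \in [-\Lambda/\alpha, \Lambda/\alpha]$ is of interest, so long as $|p(x)|$ remains bounded by 1 over $x \in [-1, 1]$. Thus one approach to minimizing the normalization is to use quantum signal processing to encode a polynomial with the property $p[\hat{H}/\alpha] \approx \hat{H}/(2\Lambda)$ in standard-form. Thus, we should find a polynomial that approximates a truncated linear function, such as

$f_{\text{lin},\Gamma}(x) = \begin{cases} \frac{x}{2\Gamma}, & |x| \in [0, \Gamma], \\ \in [-1, 1], & |x| \in (\Gamma, 1]. \end{cases}$

In Thm. 6.18 of Sec. 6.6.1, we approximate $f_{\text{lin},\Gamma}(x)$ with a polynomial with the following properties: $\forall\, \Gamma \in [0, 1/2]$ and $\epsilon \le O(\Gamma)$, the odd polynomial $p_{\text{lin},\Gamma,n}$ of degree $n = O(\Gamma^{-1}\log(1/\epsilon))$ satisfies

$\forall x \in [-\Gamma, \Gamma],\ \left|p_{\text{lin},\Gamma,n}(x) - \frac{x}{2\Gamma}\right| \le \frac{\epsilon|x|}{2\Gamma}$ and $\max_{x\in[-1,1]}|p_{\text{lin},\Gamma,n}(x)| \le 1$. (6.3)

This polynomial satisfies the conditions of flexible quantum signal processing in Thm. 5.9, and provides us with the solution Thm. 6.4 to uniform spectral amplification.

Proof of Thm. 6.4. Given Hermitian standard-form-$(\hat{H}, \alpha, \hat{U}, d)$ and an upper bound $\Lambda \in [\|\hat{H}\|, \alpha]$, define $\Gamma = \Lambda/\alpha \le 1$. Using Thm. 6.6 with the polynomial $p_{\text{lin},\Gamma,n}$, encode $\hat{H}_{\text{lin}} = p_{\text{lin},\Gamma,n}[\hat{H}/\alpha]$ in Hermitian standard-form-$(p_{\text{lin},\Gamma,n}[\hat{H}/\alpha], 1, \hat{V}, 4d)$. This requires $O(n)$ queries, and is identical to the Hermitian standard-form-$(2\Lambda p_{\text{lin},\Gamma,n}[\hat{H}/\alpha], 2\Lambda, \hat{V}, 4d)$. Define $\hat{H}_{\text{amp}} = 2\Lambda p_{\text{lin},\Gamma,n}[\hat{H}/\alpha]$. Since $\hat{H}/(2\Lambda) = (\hat{H}/\alpha)/(2\Gamma)$, the error of approximation is

$\|\hat{H}_{\text{amp}} - \hat{H}\| = 2\Lambda\left\|p_{\text{lin},\Gamma,n}[\hat{H}/\alpha] - \frac{\hat{H}}{2\Lambda}\right\| \le 2\Lambda \max_{x\in[-\Gamma,\Gamma]}\left|p_{\text{lin},\Gamma,n}(x) - \frac{x}{2\Gamma}\right| \le \Lambda\epsilon_1$. (6.4)

Finally, note that $p_{\text{lin},\Gamma,n}$ requires $\epsilon_1 \le O(\Gamma)$ and has degree scaling like $n = O(\Gamma^{-1}\log(1/\epsilon_1))$, so let us define $\epsilon$ in terms of $\epsilon_1$ accordingly. □

Unfortunately, this provides absolutely no advantage to Hamiltonian simulation, as the decrease in normalization by a factor $\alpha/\Lambda$ is exactly balanced by an increase in query complexity by a factor $\alpha/\Lambda$. Nevertheless, Thm.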
6.4 may be of use to applications involving measurement such as quantum metrology and repeat-until-success circuits, as the success probability $\|(\hat{H}/\alpha)|\psi\rangle\|^2$ is improved by a quadratic factor $(\alpha/\Lambda)^2$. This is analogous to oblivious amplitude amplification, which only applies to matrices that are approximately unitary [13].

One workable possibility is highlighted by the deep connection between quantum signal processing and the properties of polynomials. Thm. 6.4 uses a degree $O(\Gamma^{-1})$ polynomial with maximum gradient $O(\Gamma^{-1})$. Yet a famous inequality by Markov indicates a best-case quadratic advantage in the gradient $p'$ of any degree $n$ polynomial: $\max_{x\in[-1,1]}|p'(x)| \le n^2 \max_{x\in[-1,1]}|p(x)|$. Thus we have not fully exhausted the capabilities of polynomials. As this inequality becomes an equality for Chebyshev polynomials of the first kind $T_L(x) = \cos(L\cos^{-1}(x))$ at $x = \pm 1$, this suggests that a speedup is possible if we are only concerned with time evolution on eigenstates with eigenvalues $|\lambda| \in [1 - \Delta, 1]$ where $\Delta \ll 1$. With this assumption, we may prove Thm. 6.5.

Proof of Thm. 6.5. Consider the truncated linear function

$f_{\text{gap},\Delta}(x) = \begin{cases} \frac{x + 1 - \Delta}{\Delta}, & x \in [-1, -1 + \Delta], \\ \in [-1, 1], & \text{otherwise}. \end{cases}$ (6.5)

As $f_{\text{gap},\Delta}[\hat{H}/\alpha]$ acts on eigenstates with $\lambda \in [-1, -1+\Delta]$ as $(\hat{H}/\alpha + (1 - \Delta)\hat{I})/\Delta$, the theorem is proven by finding a degree $n$ odd polynomial $p_{\text{gap},\Delta,n}(x)$ that uniformly approximates $f_{\text{gap},\Delta}(x)$ with error $\max_{x\in[-1,-1+\Delta]}|p_{\text{gap},\Delta,n}(x) - f_{\text{gap},\Delta}(x)| \le \epsilon$ and also satisfies all the conditions of quantum signal processing Thm. 6.6. We provide such a polynomial of degree $O(\Delta^{-1/2}\log^{3/2}(1/\epsilon))$ in Lem. 6.31 of Sec. 6.6.2. And so we define $\hat{H}_\Delta = p_{\text{gap},\Delta,n}[\hat{H}/\alpha]$, which approximates the desired amplified Hamiltonian on the subspace of interest with error at most $\max_{x\in[-1,-1+\Delta]}|p_{\text{gap},\Delta,n}(x) - f_{\text{gap},\Delta}(x)| \le \epsilon$. □

As energy gaps in an interval of width $\Delta$ are stretched by a factor $\Delta^{-1}$ using only $O(\Delta^{-1/2})$ queries, a quadratic advantage in normalization is achieved.
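The Markov bound invoked above can be checked numerically. The following is a minimal sketch, not part of the thesis: the function names are hypothetical, and the maximum over $[-1, 1]$ is estimated on a finite grid, which is an assumption of the sketch. It evaluates $T_L$ and its derivative by the standard three-term recurrences and confirms that the gradient of $T_L$ reaches $L^2$ at $x = 1$ while $|T_L(x)| \le 1$.

```python
def chebyshev_t(L, x):
    """Chebyshev polynomial of the first kind T_L(x), via T_{k+1} = 2x T_k - T_{k-1}."""
    t_prev, t_curr = 1.0, x  # T_0, T_1
    if L == 0:
        return t_prev
    for _ in range(L - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr


def chebyshev_t_derivative(L, x):
    """Derivative T_L'(x) = L * U_{L-1}(x), with U the second-kind Chebyshev polynomial."""
    if L == 0:
        return 0.0
    if L == 1:
        return 1.0
    u_prev, u_curr = 1.0, 2.0 * x  # U_0, U_1
    for _ in range(L - 2):
        u_prev, u_curr = u_curr, 2.0 * x * u_curr - u_prev
    return L * u_curr


def max_abs_on_grid(f, n_points=2001):
    """Estimate max |f(x)| over [-1, 1] on a uniform grid that includes both endpoints."""
    return max(abs(f(-1.0 + 2.0 * k / (n_points - 1))) for k in range(n_points))


if __name__ == "__main__":
    for L in (1, 5, 10):
        grad_at_one = chebyshev_t_derivative(L, 1.0)
        sup_norm = max_abs_on_grid(lambda x, L=L: chebyshev_t(L, x))
        print(L, grad_at_one, L ** 2 * sup_norm)
```

For a degree-$L$ polynomial bounded by 1, the gradient $L^2$ attained near $x = \pm 1$ is what permits stretching an interval of width $\Delta$ with degree scaling like $\Delta^{-1/2}$ rather than $\Delta^{-1}$.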
The gap stretching of Thm. 6.5 is essentially spectral gap amplification [118] with two important distinctions: first, it applies to any Hamiltonian through the standard-form, though as highlighted in [118], only those encoded with $\alpha = \|\hat{H}\|$, such as frustration-free Hamiltonians, can fully exploit the effect. Second, it amplifies the spectral gap of all eigenvalues uniformly, rather than non-uniformly. By combining with Thm. 6.3, one obtains a Hamiltonian simulation algorithm for low-energy subspaces, relevant to quantum chemistry and adiabatic computation.

Corollary 6.13 (Hamiltonian simulation of low-energy subspaces). Given Hermitian standard-form-$(\hat{H}, \alpha, \hat{U}, d)$ with eigenstates $\hat{H}/\alpha|\lambda\rangle = \lambda|\lambda\rangle$, let $\Delta \in (0, 1)$ be a positive constant, and $\hat{\Pi} = \sum_{\lambda\in[-1,-1+\Delta]}|\lambda\rangle\langle\lambda|$ be a projector onto the low-energy subspace of $\hat{H}$. Then time-evolution $e^{-i\hat{H}t}$ on eigenstates with eigenvalues $\lambda \in [-1, -1 + \Delta]$ can be approximated with error $\epsilon$ using $O\left(\left(t\alpha\Delta + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)\Delta^{-1/2}\log^{3/2}(t\alpha\Delta/\epsilon)\right)$ queries to controlled-$\hat{U}$.

Proof. This follows from multiplying the query complexities of Thm. 6.3 with Thm. 6.5, similar to the proof of Cor. 6.12, to obtain a cost of $O\left(t\alpha\Delta + \frac{\log(1/\epsilon_1)}{\log\log(1/\epsilon_1)}\right) O\left(\Delta^{-1/2}\log^{3/2}(1/\epsilon_2)\right)$ queries for approximating $e^{-i\hat{H}t}$ with error $\epsilon_1 + t\alpha\Delta\epsilon_2$. Thus we choose $\epsilon_1 = \epsilon/2$ and $t\alpha\Delta\epsilon_2 = \epsilon/2$. □

It is worth mentioning that Thm. 6.5 also performs uniform spectral amplification on high-energy states. This follows from the polynomial $p_{\text{gap},\Delta,n}(x)$ being odd. Thus its ability to stretch eigenvalues $\lambda \in [-1, -1 + \Delta]$ applies to those $\lambda \in [1 - \Delta, 1]$ as well.

6.3 Amplitude Multiplication

Amplitude amplification is a staple quantum subroutine for state preparation that is used in many quantum algorithms. The basic version and its generalization are based on reflections, and are described in Chapter 3. The result that concerns us is flexible amplitude amplification, which we restate now.

Theorem 6.14 (Flexible amplitude amplification).
Given a state preparation unitary $\hat{G}$ acting on the computational basis states $|0\rangle_a \in \mathbb{C}^d$, $|0\rangle_b \in \mathbb{C}^2$ such that $\hat{G}|0\rangle_a|0\rangle_b = \lambda|t\rangle_a|0\rangle_b + \sqrt{1 - \lambda^2}|t^\perp\rangle_{ab}$, where $|t^\perp\rangle_{ab}$ has no support on $|0\rangle_b$, let $D$ be any function that satisfies all the following conditions: (1) $D$ is an odd real polynomial in $\lambda$ of degree at most $2N + 1$; (2) $\forall \lambda \in [-1, 1]$, $D^2(\lambda) \le 1$. Then there exists a quantum circuit $\hat{W}$ such that $\langle t|_a\langle 0|_b\langle 0|_c \hat{W}|0\rangle_a|0\rangle_b|0\rangle_c = D(\lambda)$, using $N + 1$ queries to $\hat{G}$, $N$ queries to $\hat{G}^\dagger$, $O(N\log(d))$ primitive quantum gates precomputed from $D$ in classical $O(\text{poly}(N))$ time, and an additional qubit ancilla $c$.

The proof of amplitude multiplication follows from flexible amplitude amplification by an appropriate choice of polynomials for $D$.

Proof of Thm. 6.10. The amplitude multiplication algorithm is a special case of Thm. 6.14 where $D$ is a polynomial that approximates the truncated linear function

$f_{\text{lin},\Gamma}(x) = \begin{cases} \frac{x}{2\Gamma}, & |x| \in [0, \Gamma], \\ \in [-1, 1], & |x| \in (\Gamma, 1]. \end{cases}$ (6.6)

In Thm. 6.18 of Sec. 6.6.1, we approximate $f_{\text{lin},\Gamma}(x)$ with a polynomial with the following properties: $\forall\, \Gamma \in [0, 1/2]$ and $\epsilon \le O(\Gamma)$, the odd polynomial $p_{\text{lin},\Gamma,n}$ of degree $n = O(\Gamma^{-1}\log(1/\epsilon))$ satisfies

$\forall x \in [-\Gamma, \Gamma],\ \left|p_{\text{lin},\Gamma,n}(x) - \frac{x}{2\Gamma}\right| \le \frac{\epsilon|x|}{2\Gamma}$ and $\max_{x\in[-1,1]}|p_{\text{lin},\Gamma,n}(x)| \le 1$. (6.7)

As this polynomial satisfies the conditions of Thm. 6.14, there exists a state preparation unitary

$\hat{W}|0\rangle_a|0\rangle_b|0\rangle_c = p_{\text{lin},\Gamma,n}(\sin(\theta))|t\rangle_a|0\rangle_b|0\rangle_c + A(\theta)|t^\perp\rangle_{ab}|0\rangle_c + iC(\theta)|t\rangle_a|0\rangle_b|1\rangle_c - iB(\theta)|t^\perp\rangle_{ab}|1\rangle_c$,

where the functions $A, B, C$ are of lesser interest, that consists of $O(n)$ queries to $\hat{G}$, $\hat{G}^\dagger$ and $O(n\log(d))$ primitive gates. Assuming that $\Gamma \in [|\sin(\theta)|, 1/2]$ is an upper bound on $|\sin(\theta)|$, the amplitude of the target state satisfies

$\left|\langle t|_a\langle 0|_b\langle 0|_c \hat{W}|0\rangle_a|0\rangle_b|0\rangle_c - \frac{\sin(\theta)}{2\Gamma}\right| \le \frac{\epsilon|\sin(\theta)|}{2\Gamma}$.

In other words, all initial target state amplitudes $\sin(\theta)$ are divided by a constant factor $2\Gamma$ with a multiplicative error $\epsilon$ that can be made exponentially small. □

Note that if one is interested in multiplication by a factor less than one, trivial solutions exist.
For any $\Gamma > 1/2$, one could prepare an ancilla state $|\Gamma\rangle_c = \frac{1}{2\Gamma}|0\rangle_c + \sqrt{1 - \frac{1}{4\Gamma^2}}|1\rangle_c$ and simply define the target state to be $|t\rangle_a|0\rangle_b|0\rangle_c$ in the prepared state $\hat{G}|0\rangle_a|0\rangle_b|\Gamma\rangle_c = \frac{\sin(\theta)}{2\Gamma}|t\rangle_a|0\rangle_b|0\rangle_c + \cdots$.

6.4 Uniform Spectral Amplification by Amplitude Multiplication

We now consider a certain kind of structure within the signal unitary $\hat{U}$ that encodes some Hamiltonian in standard-form. Whereas Sec. 6.2 treats $\hat{U}$ as a single oracle, we now assume that it factors into other unitaries, say $\hat{U} = \hat{U}_{\text{row}}^\dagger\hat{U}_{\text{col}}$, or $\hat{U} = \hat{U}_{\text{row}}^\dagger\hat{U}_{\text{mix}}\hat{U}_{\text{col}}$, that we assume access to as oracles. This factorization imposes in Sec. 6.4.1 the interpretation that encoded Hamiltonians have matrix elements defined by the overlap between some set of quantum states. We investigate in Sec. 6.4.2 how this structure may be exploited for uniform spectral amplification. By applying amplitude multiplication, this is possible through Lem. 6.15 in a fairly general setting. In Sec. 6.4.3, we specialize this to sparse Hamiltonian simulation, which leads to the improved simulation algorithm Thm. 6.8. In Sec. 6.4.4, this algorithm is proven to be optimal in all parameters, at least up to logarithmic factors, through a matching lower bound Thm. 6.9.

6.4.1 Matrix Elements as State Overlaps

Decomposing the signal unitary into factors motivates a different interpretation of the standard-form

$(\langle 0|_a \otimes \hat{I}_s)\hat{U}(|0\rangle_a \otimes \hat{I}_s) = (\langle 0|_a \otimes \hat{I}_s)\hat{U}_{\text{row}}^\dagger\hat{U}_{\text{col}}(|0\rangle_a \otimes \hat{I}_s)$. (6.8)

By definition, any unitary operator implements a basis transformation $\hat{U} = \sum_k |B_k\rangle\langle A_k|_{as}$ between complete orthonormal sets of basis states $\{|B_k\rangle_{as}\}$ and $\{|A_k\rangle_{as}\}$, and similarly for $\hat{U}_{\text{row}}$, $\hat{U}_{\text{col}}$. Now consider a set of basis states $\{|j\rangle_a\}$ on the ancilla register, and a set of basis states $\{|u_j\rangle_s\}$ on the system register. Without loss of generality, we may represent $\hat{U}_{\text{row}} = \sum_k |\chi_{0,k}\rangle_{as}\langle 0|_a\langle u_k|_s + \sum_{j\neq 0}\sum_k |\chi_{j,k}\rangle_{as}\langle j|_a\langle u_k|_s$ and $\hat{U}_{\text{col}} = \sum_k |\psi_{0,k}\rangle_{as}\langle 0|_a\langle u_k|_s + \sum_{j\neq 0}\sum_k |\psi_{j,k}\rangle_{as}\langle j|_a\langle u_k|_s$ for some sets of basis states $\{|\chi_{j,k}\rangle_{as}\}$, $\{|\psi_{j,k}\rangle_{as}\}$. Let us substitute this into Eq. 6.8 and drop the 0 subscript.
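$\frac{H_{jk}}{\alpha} = \frac{\langle u_j|\hat{H}|u_k\rangle}{\alpha} = \left(\langle 0|_a\langle u_j|_s\hat{U}_{\text{row}}^\dagger\right)\left(\hat{U}_{\text{col}}|0\rangle_a|u_k\rangle_s\right) = \langle\chi_{0,j}|\psi_{0,k}\rangle_{as} = \langle\chi_j|\psi_k\rangle_{as}$. (6.9)

In other words, elements of $\hat{H}$ in the $|u_j\rangle_s$ basis may always be interpreted as the overlap of appropriately defined quantum states $|\psi_k\rangle_{as}$, $|\chi_j\rangle_{as}$, which we call overlap states. Moreover, $\hat{H}$ need not be unitary when the dimension of these states is greater than that of $\hat{H}$.

More generally, we may factor the signal unitary into three unitaries $\hat{U} = \hat{U}_{\text{row}}^\dagger\hat{U}_{\text{mix}}\hat{U}_{\text{col}}$. If we preserve the interpretation of $\hat{U}_{\text{row}}$ and $\hat{U}_{\text{col}}$ as preparing appropriately defined quantum states, the third unitary $\hat{U}_{\text{mix}}$ is a new component that mixes these states to encode the following Hamiltonian in standard-form

$\frac{\hat{H}}{\alpha} = (\langle 0|_a \otimes \hat{I}_s)\hat{U}_{\text{row}}^\dagger\hat{U}_{\text{mix}}\hat{U}_{\text{col}}(|0\rangle_a \otimes \hat{I}_s), \quad \frac{H_{jk}}{\alpha} = \langle\chi_j|_{as}\hat{U}_{\text{mix}}|\psi_k\rangle_{as}$. (6.10)

Note that this reduces to Eq. 6.9 by choosing $\hat{U}_{\text{mix}}$ to be identity, or by absorbing it into the definition of either $\hat{U}_{\text{row}}$ or $\hat{U}_{\text{col}}$. Combined with Thm. 6.3, time evolution by $e^{-i\hat{H}t}$ may be approximated with error $\epsilon$ using $O\left(t\alpha + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries to $\hat{U}_{\text{row}}$, $\hat{U}_{\text{mix}}$, and $\hat{U}_{\text{col}}$.

However, the ability to efficiently prepare arbitrary quantum states represents an extremely powerful model of computation. For instance, arbitrary temperature Gibbs state preparation is QMA-complete [46]. That not all states may be prepared in $O(1)$ queries to commonly used quantum oracles can be built into the definition of the overlap states by splitting them into 'good' components $|\psi_j\rangle_{a_1 s}$, $|\chi_j\rangle_{a_1 s}$ marked by an ancilla state $|0\rangle_{a_2}$, and 'bad' components that are discarded. Difficult states then have a small amplitude in the $|0\rangle_{a_2}$ subspace. Thus

$|\psi_j\rangle_{as} = \sqrt{\lambda_\beta\beta_j}\,|\psi_j\rangle_{a_1 s}|0\rangle_{a_2} + \sqrt{1 - \lambda_\beta\beta_j}\,|\psi_{\text{bad},j}\rangle_{a_1 s}|1\rangle_{a_2}$,
$|\chi_j\rangle_{as} = \sqrt{\lambda_\gamma\gamma_j}\,|\chi_j\rangle_{a_1 s}|0\rangle_{a_2} + \sqrt{1 - \lambda_\gamma\gamma_j}\,|\chi_{\text{bad},j}\rangle_{a_1 s}|2\rangle_{a_2}$. (6.11)

Note that the dimension of the ancilla register $a_1 a_2$ is equal to $d$.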
The coefficients $\lambda_\gamma, \lambda_\beta \in (0, 1]$ represent a slowdown factor due to the difficulty of state preparation, and the coefficients $\beta_j, \gamma_j \in [0, 1]$, normalized to $\max_j \beta_j = 1$, $\max_j \gamma_j = 1$, represent how the amplitude in good states can be index-dependent by design. By restricting $\hat{U}_{\text{mix}}$ to be identity on the register $a_2$, this encodes the following Hamiltonian in standard-form

$\frac{\hat{H}}{\alpha} = (\langle 0|_a \otimes \hat{I}_s)\hat{U}_{\text{row}}^\dagger\hat{U}_{\text{mix}}\hat{U}_{\text{col}}(|0\rangle_a \otimes \hat{I}_s), \quad \frac{H_{jk}}{\alpha} = \langle\chi_j|_{as}\hat{U}_{\text{mix}}|\psi_k\rangle_{as} = \sqrt{\lambda_\beta\lambda_\gamma\beta_k\gamma_j}\,\langle\chi_j|_{a_1 s}\hat{U}_{\text{mix}}|\psi_k\rangle_{a_1 s}$. (6.12)

By explicitly including the slowdown factor $\sqrt{\lambda_\gamma\lambda_\beta}$, the spectral norm $\|\hat{H}\| \le \alpha\sqrt{\lambda_\gamma\lambda_\beta}$ is also reduced.

6.4.2 Amplitude Multiplication of Overlap States

This state overlap encoding of Hamiltonians motivates the use of amplitude amplification. As the amplitudes of all states $|\psi_j\rangle$ are attenuated by a constant factor $\sqrt{\lambda_\beta}$, the intuition is that one requires $O(1/\sqrt{\lambda_\beta})$ queries to the state preparation operator $\hat{U}_{\text{col}}$ to boost the amplitude in the subspace marked by $|0\rangle_{a_2}$ by a factor $O(1/\sqrt{\lambda_\beta})$, and similarly for $|\chi_j\rangle$. Thus $O(1/\sqrt{\lambda_\beta} + 1/\sqrt{\lambda_\gamma})$ queries appear sufficient to reduce the normalization $\alpha$ by a factor $\sqrt{\lambda_\beta\lambda_\gamma}$. This suggests that the query complexity of Hamiltonian simulation could be improved to $O\left(t\alpha(\sqrt{\lambda_\beta} + \sqrt{\lambda_\gamma}) + \cdots\right)$, which is most advantageous when $\lambda_\beta$ and $\lambda_\gamma$ are both small.

However, realizing this speedup is non-trivial. In the context of prior art in sparse Hamiltonian simulation, attempts have been made to exploit amplitude amplification [12]. There, it was discovered that the sinusoidal nonlinearity of amplitude amplification introduces large errors. As these errors accumulate over long simulation times $t$, controlling them led to query complexity scaling like $O(t^{3/2}/\sqrt{\epsilon})$, which is polynomially worse than what intuition suggests. In the following, we avoid these issues by introducing a linearized version of amplitude amplification, which we call the amplitude multiplication algorithm.

Before proceeding, note that amplitude amplification also imposes additional restrictions on the form of the overlap states in Eq. 6.11.
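The sinusoidal nonlinearity mentioned above can be made concrete with a small numerical sketch. This is an illustration only: the function names and the sample amplitudes are hypothetical, and the ideal linear map $a \mapsto a/(2\Gamma)$ stands in for the polynomial $p_{\text{lin},\Gamma,n}$ with its $\epsilon$ error neglected. It compares the gain applied to several initial amplitudes by $m$ rounds of standard amplitude amplification against the gain of an ideal amplitude multiplication.

```python
import math


def amplified_amplitude(a, m):
    """Target amplitude after m Grover iterates on initial amplitude a: sin((2m+1) asin(a))."""
    return math.sin((2 * m + 1) * math.asin(a))


def multiplied_amplitude(a, gamma):
    """Idealized amplitude multiplication a -> a / (2*gamma), valid for |a| <= gamma <= 1/2."""
    return a / (2.0 * gamma)


def gain_spread(amplitudes, transform):
    """Difference between the largest and smallest gain transform(a)/a over the inputs."""
    gains = [transform(a) / a for a in amplitudes]
    return max(gains) - min(gains)


if __name__ == "__main__":
    # Hypothetical index-dependent amplitudes, all bounded by gamma = 0.2.
    amplitudes = [0.05, 0.1, 0.2]
    print("amplification gain spread :", gain_spread(amplitudes, lambda a: amplified_amplitude(a, 2)))
    print("multiplication gain spread:", gain_spread(amplitudes, lambda a: multiplied_amplitude(a, 0.2)))
```

An index-dependent gain rescales each matrix element of the encoded Hamiltonian differently, which is the distortion that a uniform gain avoids.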
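Amplitude amplification requires the ability to perform reflections $\text{Ref}_{|0\rangle_{a_2}}$ about the subspace marked by $|0\rangle_{a_2}$, as well as reflections $\text{Ref}_\psi$ on any arbitrary superposition of initial states $|\psi_j\rangle$, that is $\forall j$, $\text{Ref}_\psi\hat{U}_{\text{col}}|0\rangle_a|u_j\rangle_s = -\hat{U}_{\text{col}}|0\rangle_a|u_j\rangle_s$, and $\text{Ref}_\psi$ performs identity for any other ancilla state. The case for $\hat{U}_{\text{row}}$ and $|\chi_j\rangle$ is identical. Whereas the first operation

$\text{Ref}_{|0\rangle_{a_2}} = (\hat{I}_{a_2} - 2|0\rangle\langle 0|_{a_2}) \otimes \hat{I}_{a_1 s}$, (6.13)

is easy using $O(1)$ primitive gates, the second operation requires $\hat{U}_{\text{col}}$ to represent controlled state preparation. In other words, with the input $|u_j\rangle_s$ on the system register, the overlap state has the decomposition

$\hat{U}_{\text{col}}|0\rangle_a|u_j\rangle_s = |\psi_j\rangle_{as} = \left(\sqrt{\lambda_\beta\beta_j}\,|\psi_j\rangle_{a_1}|0\rangle_{a_2} + \sqrt{1 - \lambda_\beta\beta_j}\,|\psi_{\text{bad},j}\rangle_{a_1}|1\rangle_{a_2}\right)|u_j\rangle_s$, (6.14)

thus encoding the following Hamiltonian in standard-form

$\frac{\hat{H}}{\alpha} = (\langle 0|_a \otimes \hat{I}_s)\hat{U}_{\text{row}}^\dagger\hat{U}_{\text{mix}}\hat{U}_{\text{col}}(|0\rangle_a \otimes \hat{I}_s), \quad \frac{H_{jk}}{\alpha} = \sqrt{\lambda_\beta\lambda_\gamma\beta_k\gamma_j}\,\left(\langle\chi_j|_{a_1}\langle u_j|_s\right)\hat{U}_{\text{mix}}\left(|\psi_k\rangle_{a_1}|u_k\rangle_s\right)$, (6.15)

and allowing us to construct the controlled-reflection operator

$\text{Ref}_\psi = \sum_j \left(\hat{I}_a - 2|\psi_j\rangle\langle\psi_j|_a\right) \otimes |u_j\rangle\langle u_j|_s = \hat{U}_{\text{col}}\left((\hat{I}_a - 2|0\rangle\langle 0|_a) \otimes \hat{I}_s\right)\hat{U}_{\text{col}}^\dagger$, (6.16)

where $|\psi_j\rangle_a$ denotes the ancilla state in parentheses in Eq. 6.14, using 2 queries and $O(\log d)$ primitive gates.

The error introduced by a naive application of amplitude amplification is illustrated by an explicit calculation. Using a sequence of $m \ge 0$ controlled-Grover iterates $\text{Ref}_\psi\text{Ref}_{|0\rangle_{a_2}}$ making $O(m)$ queries, one can prepare the state

$|\psi_{\text{amp},j}\rangle = \left(\text{Ref}_\psi\text{Ref}_{|0\rangle_{a_2}}\right)^m\hat{U}_{\text{col}}|0\rangle_a|u_j\rangle_s = \left(\sin\left((2m + 1)\sin^{-1}\left(\sqrt{\lambda_\beta\beta_j}\right)\right)|\psi_j\rangle_{a_1}|0\rangle_{a_2} + \cdots|1\rangle_{a_2}\right)|u_j\rangle_s$. (6.17)

With a choice of $m = O(\lambda_\beta^{-1/2})$, we are guaranteed that all amplified amplitudes are at least $\sqrt{\lambda_\beta\beta_j}$. Though this improves the normalization, it also specifies an erroneous Hamiltonian, as the matrix elements $\langle\chi_{\text{amp},j}|\hat{U}_{\text{mix}}|\psi_{\text{amp},k}\rangle$ are larger than those of $H_{jk}$ by an index-dependent factor. In contrast, amplitude multiplication in Thm. 6.10 avoids this non-linearity and allows us to boost the normalization of the encoded Hamiltonian with only an exponentially small distortion to its spectrum. This leads to

Lemma 6.15 (Uniform spectral amplification by multiplied state overlaps).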
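Let the Hamiltonian $\hat{H}$ be encoded in the standard-form of Eq. 6.15 with normalization $\alpha$. Given upper bounds $\Lambda_\beta \in [\lambda_\beta, 1/2]$, $\Lambda_\gamma \in [\lambda_\gamma, 1/2]$ on the slowdown factors, and a target error $\epsilon \in (0, \min\{\Lambda_\beta, \Lambda_\gamma\})$, the Hamiltonian $\hat{H}_{\text{lin}}$ can be encoded in standard-form with normalization $4\alpha\sqrt{\Lambda_\beta\Lambda_\gamma}$ such that $\|\hat{H}_{\text{lin}} - \hat{H}\| \le \frac{5}{2}\epsilon\|\hat{H}\| \le \frac{5}{2}\alpha\epsilon\sqrt{\lambda_\beta\lambda_\gamma}$, using $Q = O\left((\Lambda_\beta^{-1/2} + \Lambda_\gamma^{-1/2})\log(1/\epsilon)\right)$ queries, $O(Q\log(d))$ primitive gates, and 1 additional ancilla qubit.

Proof. Let us apply Thm. 6.10, which requires one additional ancilla qubit, to the state overlap model Eq. 6.14. We identify $\hat{U}_{\text{col}}$ as the state preparation operator that prepares the target state marked by $|0\rangle_{a_2}$ with overlap $\sqrt{\lambda_\beta\beta_j}$, and let $\Lambda_\beta \in [\lambda_\beta, 1/2]$ be an upper bound on the slowdown factor. Then there exists a quantum circuit $\hat{U}'_{\text{col}}$ that makes $Q_\beta = O(\Lambda_\beta^{-1/2}\log(1/\epsilon))$ queries to $\hat{U}_{\text{col}}$ and uses $O(Q_\beta\log(d))$ primitive gates, and similarly for $\hat{U}_{\text{row}}$, to prepare the states

$|\psi_{\text{lin},j}\rangle = \hat{U}'_{\text{col}}|0\rangle_a|u_j\rangle_s = \left(\sqrt{\frac{\lambda_\beta\beta_j}{4\Lambda_\beta}}(1 + \epsilon_{\beta,j})|\psi_j\rangle_{a_1}|0\rangle_{a_2} + \cdots|1\rangle_{a_2}\right)|u_j\rangle_s$,
$|\chi_{\text{lin},j}\rangle = \hat{U}'_{\text{row}}|0\rangle_a|u_j\rangle_s = \left(\sqrt{\frac{\lambda_\gamma\gamma_j}{4\Lambda_\gamma}}(1 + \epsilon_{\gamma,j})|\chi_j\rangle_{a_1}|0\rangle_{a_2} + \cdots|2\rangle_{a_2}\right)|u_j\rangle_s$, (6.18)

where $|\epsilon_{\beta,j}|, |\epsilon_{\gamma,j}| < \epsilon \in (0, \min\{\Lambda_\beta, \Lambda_\gamma\}) < 1/2$ are state-dependent errors in the amplitude. Let us define the Hamiltonian $\hat{H}_{\text{lin}}$ encoded in standard-form with normalization $4\alpha\sqrt{\Lambda_\beta\Lambda_\gamma}$ as follows

$(\langle 0|_a \otimes \hat{I}_s)\hat{U}'^\dagger_{\text{row}}\hat{U}_{\text{mix}}\hat{U}'_{\text{col}}(|0\rangle_a \otimes \hat{I}_s) = \frac{\hat{H}_{\text{lin}}}{4\alpha\sqrt{\Lambda_\beta\Lambda_\gamma}} = \sum_{jk}\frac{H_{jk}(1 + \epsilon_{\gamma,j})(1 + \epsilon_{\beta,k})}{4\alpha\sqrt{\Lambda_\beta\Lambda_\gamma}}|u_j\rangle\langle u_k|_s$. (6.19)

We may now evaluate the error of $\hat{H}_{\text{lin}}$ from that of the original Hamiltonian $\hat{H}$, following a similar approach from [12]. Let $\hat{\epsilon}_\beta$ be a diagonal matrix with elements $\epsilon_{\beta,j}$ and similarly for $\hat{\epsilon}_\gamma$. Then

$\hat{H}_{\text{lin}} = \hat{H} + \hat{\epsilon}_\gamma\hat{H} + \hat{H}\hat{\epsilon}_\beta + \hat{\epsilon}_\gamma\hat{H}\hat{\epsilon}_\beta$,
$\|\hat{H}_{\text{lin}} - \hat{H}\| \le \|\hat{H}\|\left(\|\hat{\epsilon}_\beta\| + \|\hat{\epsilon}_\gamma\| + \|\hat{\epsilon}_\beta\|\|\hat{\epsilon}_\gamma\|\right) \le \|\hat{H}\|(2\epsilon + \epsilon^2) \le \frac{5}{2}\|\hat{H}\|\epsilon \le \frac{5}{2}\alpha\sqrt{\lambda_\beta\lambda_\gamma}\,\epsilon$, (6.20)

where the second-last inequality is due to $\epsilon < 1/2$, and the last inequality applies the upper bound $\|\hat{H}\| \le \alpha\sqrt{\lambda_\beta\lambda_\gamma}$. Summing up $Q = Q_\beta + Q_\gamma + 1$ leads to the claimed query and gate complexities. □

Combining with Thm. 6.3 then furnishes the following result on Hamiltonian simulation.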
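Lemma 6.16 (Hamiltonian simulation by multiplied state overlaps). Let the Hamiltonian $\hat{H}$ be encoded in the standard-form of Eq. 6.15 with normalization $\alpha$. Given upper bounds $\Lambda_\beta \in [\lambda_\beta, 1/2]$, $\Lambda_\gamma \in [\lambda_\gamma, 1/2]$ on the slowdown factors, $\Lambda \ge \|\hat{H}\|$, and a target error $\epsilon \in (0, \min\{\Lambda_\beta, \Lambda_\gamma\})$, time-evolution $e^{-i\hat{H}t}$ can be approximated with error $\epsilon$ using $Q = O\left(t\alpha(\sqrt{\Lambda_\beta} + \sqrt{\Lambda_\gamma})\log(t\Lambda/\epsilon) + (\Lambda_\beta^{-1/2} + \Lambda_\gamma^{-1/2})\frac{\log(1/\epsilon)\log(t\Lambda/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries, $O(Q\log(d))$ primitive gates, and $O(1)$ additional ancilla qubits.

Proof. From Lem. 6.15, we may encode $\hat{H}_{\text{lin}}$ in standard-form with normalization $4\alpha\sqrt{\Lambda_\beta\Lambda_\gamma}$ and error $\|\hat{H}_{\text{lin}} - \hat{H}\| = O(\|\hat{H}\|\epsilon_0) = O(\Lambda\epsilon_0)$. This requires $Q_0 = O((\Lambda_\beta^{-1/2} + \Lambda_\gamma^{-1/2})\log(1/\epsilon_0))$ queries to $\hat{U}_{\text{row}}$, $\hat{U}_{\text{mix}}$, $\hat{U}_{\text{col}}$ and their inverses, $O(Q_0\log(d))$ primitive gates, and 1 additional ancilla qubit. Using the fact $\|e^{i\hat{A}} - e^{i\hat{B}}\| \le \|\hat{A} - \hat{B}\|$, the error of $e^{-i\hat{H}_{\text{lin}}t}$ from ideal time-evolution is $\|e^{-i\hat{H}_{\text{lin}}t} - e^{-i\hat{H}t}\| \le \|\hat{H}_{\text{lin}}t - \hat{H}t\| = O(t\Lambda\epsilon_0)$. By combining with Thm. 6.3, time-evolution by $e^{-i\hat{H}_{\text{lin}}t}$ can be approximated with error $\epsilon_1$ using $Q_1 = O\left(t\alpha\sqrt{\Lambda_\beta\Lambda_\gamma} + \frac{\log(1/\epsilon_1)}{\log\log(1/\epsilon_1)}\right)$ queries to controlled-$\hat{U}'^\dagger_{\text{row}}\hat{U}_{\text{mix}}\hat{U}'_{\text{col}}$ and its inverse, $O(Q_1\log(d))$ additional primitive gates, and $O(1)$ additional ancilla qubits. Thus time-evolution by $e^{-i\hat{H}t}$ can be approximated with error $\epsilon = O(\epsilon_1 + t\Lambda\epsilon_0)$ using $Q = Q_0 Q_1$ queries to controlled-$\hat{U}_{\text{row}}$, $\hat{U}_{\text{mix}}$, $\hat{U}_{\text{col}}$ and their inverses, and $O(Q_1\log(d) + Q_0 Q_1\log(d)) = O(Q\log(d))$ primitive gates. We can control the error by choosing $\epsilon_1 = O(\epsilon)$ and $\epsilon_0 = O(\epsilon/(t\Lambda))$. Substituting into $Q$ produces the claimed query complexity. □

In the asymptotic limit of large $t \gg \log(1/\epsilon)$, the query complexity may be simplified to $O\left(t\alpha(\sqrt{\Lambda_\beta} + \sqrt{\Lambda_\gamma})\log(t\Lambda/\epsilon)\right)$ queries.

6.4.3 Reduction to Sparse Matrices

The results of Sec. 6.4, presented in a general setting, apply to the special case of sparse matrices. The reduction follows by making three additional assumptions. First, assume that the dimension of $|0\rangle_a \in \mathbb{C}^{3n}$ is larger than that of $|u_j\rangle_s \in \mathbb{C}^n$.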
Second, assume that $\forall j \in [n]$, $|u_j\rangle_s$ is the computational basis $|j\rangle_s$. Third, we assume that there exist oracles in Def. 6.7 that describe $d$-sparse matrices [12]. With these oracles and an upper bound $\Lambda_{\max} \ge \|\hat{H}\|_{\max}$, we described in Sec. 5.2.3 how $O(1)$ queries suffice to implement the isometry represented by $\hat{U}_{\text{row}}|0\rangle_a$ and $\hat{U}_{\text{col}}|0\rangle_a$, with output states

$\hat{U}_{\text{col}}|0\rangle_a|j\rangle_s = |\psi_j\rangle_{as} = \cdots, \quad \langle 0|_a\langle k|_s\hat{U}_{\text{row}}^\dagger = \langle\chi_k|_{as} = \cdots, \quad \langle\chi_j|\hat{U}_{\text{mix}}|\psi_k\rangle = \frac{H_{jk}}{d\Lambda_{\max}}$, (6.21)

where $\delta_{jk}$ is the Kronecker delta function, and $F_j = \{k : k = f(j, l), l \in [d]\}$ is the set of non-zero column indices in row $j$. We also choose $\hat{U}_{\text{mix}}$ to swap the registers $s$ and $a_1$. The gate complexity of $\hat{U}_{\text{col}}$, $\hat{U}_{\text{row}}$, and $\hat{U}_{\text{mix}}$ combined is $O(\log(n) + \text{poly}(m))$, where $m = O(\log(t\|\hat{H}\|/\epsilon))$ is the number of bits of precision of $H_{jk}$. The contribution from $\text{poly}(m) = O(m^{5/2})$ is due to integer arithmetic for computing square-roots and trigonometric functions. This combined with Thm. 6.3 recovers the previous best result on sparse Hamiltonian simulation in Chapter 4 using $Q = O\left(td\Lambda_{\max} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries, and $O(Q(\log(n) + \text{poly}(m)))$ primitive gates.

To see how Lem. 6.16 improves on this, we rewrite Eq. 6.21 in the format of Eq. 6.11 by collecting coefficients of the subspace marked by $|0\rangle_{a_2}$:

$|\psi_j\rangle_{as} = \sqrt{\frac{\sigma_j}{d\Lambda_{\max}}}\,|\psi_j\rangle_{a_1 s}|0\rangle_{a_2} + \cdots|1\rangle_{a_2}, \quad \langle\chi_k|_{as} = \sqrt{\frac{\sigma_k}{d\Lambda_{\max}}}\,\langle\chi_k|_{a_1 s}\langle 0|_{a_2} + \cdots\langle 2|_{a_2}$, (6.22)

where $\sigma_j = \sum_{k\in F_j}|H_{jk}|$, and the induced one-norm $\|\hat{H}\|_1 = \max_j \sigma_j$. Note that the normalized good components $|\psi_j\rangle_{a_1 s}$ are superpositions of the states $|j\rangle_s|p\rangle_{a_1}$ with $p \in F_j$, and similarly for $|\chi_j\rangle$. From this, we obtain our main result on sparse Hamiltonian simulation Thm. 6.8.

Proof of Thm. 6.8. Comparison of Eq. 6.22 with Eq. 6.11 yields $\beta_j = \gamma_j = \sigma_j/\|\hat{H}\|_1$, $\lambda_\beta = \lambda_\gamma = \|\hat{H}\|_1/(d\Lambda_{\max})$. Thus we have the upper bound $\Lambda_\beta = \Lambda_\gamma = \Lambda_1/(d\Lambda_{\max}) \ge \lambda_\beta = \lambda_\gamma$. Moreover, from Eq. 6.21, the normalization constant $\alpha = d\Lambda_{\max}$. The claimed query complexity is obtained by substitution into Lem. 6.16.
□

This result is quite remarkable as it strictly improves upon prior art, modulo logarithmic factors, by exploiting additional structural information. In the asymptotic limit of large $\Lambda_1 t \gg \log(1/\epsilon)$, the query complexity may be simplified to $O\left(t\sqrt{d\Lambda_{\max}\Lambda_1}\log(t\Lambda/\epsilon)\right)$. Using the inequality $\|\hat{H}\| \le \|\hat{H}\|_1 \le d\|\hat{H}\|_{\max}$, the worst case occurs when these norms are all equal, thus $\Lambda = \Lambda_1 = d\Lambda_{\max}$. There, the query complexity of Thm. 6.8 up to logarithmic factors is $O(td\Lambda_{\max})$, equal to that of prior art [86]. However, the best case $\|\hat{H}\|_1 = O(\|\hat{H}\|_{\max})$ leads to a quadratic improvement in sparsity with query complexity of $O(t\sqrt{d}\Lambda_{\max})$, also ignoring logarithmic factors.

Another approach implicit in [12] assumes that the $\sigma_j$ are provided by the quantum oracle $\hat{O}_\sigma|j\rangle_s|z\rangle_c = |j\rangle_s|z \oplus \sigma_j\rangle_c$, when queried with the $j \in [n]$ row index. This allows us to exactly compensate for the sinusoidal non-linearity of amplitude amplification by modifying initial state amplitudes by some $j$-dependent multiplicative factor. Thus $\hat{H}$ may be encoded in standard-form with normalization $O(\sqrt{d\Lambda_{\max}\Lambda_1})$ exactly without any error, leading to a Hamiltonian simulation algorithm with query complexity $Q = O\left(t(d\|\hat{H}\|_{\max}\|\hat{H}\|_1)^{1/2} + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$. While this improves on Thm. 6.8 by logarithmic factors, and matches the complexity of Claim 6.1, $\hat{O}_\sigma$ is in general difficult to construct.

6.4.4 Lower Bound on Sparse Hamiltonian Simulation

In this section, we prove the lower bound Thm. 6.9 on sparse Hamiltonian simulation, given information on the sparsity, max-norm, and induced one-norm. The lower bounds in prior art are obtained by constructing Hamiltonians that compute well-known functions. When applied to our situation, one obtains $\Omega(t\|\hat{H}\|_1)$ queries through the PARITY problem [10], and $\Omega(\sqrt{d})$ queries through OR [12]. This leads to an additive lower bound $\Omega(t\|\hat{H}\|_1 + \sqrt{d})$. Using similar techniques, we obtain a stronger lower bound $\Omega(t(d\|\hat{H}\|_{\max}\|\hat{H}\|_1)^{1/2})$ by creating a Hamiltonian that computes the solution to the composed function PARITY $\circ$ OR.
Specifically, we combine a Hamiltonian that solves PARITY on $n$ bits with constant error using at least $\Omega(s\|\hat{H}\|_{\max}t)$ queries, where $t = \Theta(n/s)$, with a Hamiltonian that solves OR on $m$ bits exactly, with the promise that at most 1 bit is non-zero, using at least $\Omega(\sqrt{m})$ queries. Note that in all cases, the query complexity with respect to error is at least an additive term $\frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}$.

The Hamiltonian $\hat{H}_{\text{PARITY}}$ that solves PARITY on $n$ bits is well-known [10], and is based on the Hamiltonian $\hat{H}_{\text{spin}}$ for perfect state transfer in spin chains. For completeness, we outline the procedure. Consider a Hamiltonian $\hat{H}_{\text{spin}}$ of dimension $n + 1$, with matrix elements in the computational basis $\{|j\rangle_s : j \in [n + 1]\}$ defined as

$\langle j - 1|_s\hat{H}_{\text{spin}}|j\rangle_s = \sqrt{j(n - j + 1)}/n$. (6.23)

Note that this Hamiltonian has sparsity 1, max-norm $\Theta(1)$, and 1-norm $\Theta(1)$. Time evolution by this Hamiltonian, $e^{-i\hat{H}_{\text{spin}}n\pi/2}|0\rangle_s = |n\rangle_s$, exactly transfers the state $|0\rangle$ to $|n\rangle$ in time $n\pi/2$.

One way to speed up these dynamics is to uniformly increase the value of all matrix elements. However, any increase in $\|\hat{H}\|_{\max}$ is trivial as it simply decreases $t$ by a proportionate amount. Another way is to boost the sparsity of $\hat{H}_{\text{spin}}$ by taking a tensor product with a Hamiltonian $\hat{H}_{\text{complete}}$ of dimension $s$ where all matrix elements are 1 in the computational basis $\{|j\rangle_c : j \in [s]\}$:

$\langle i|_c\hat{H}_{\text{complete}}|j\rangle_c = 1, \quad \forall i \in [s], j \in [s]$. (6.24)

One of the eigenstates of $\hat{H}_{\text{complete}}$ is the uniform superposition $|u\rangle_c = \frac{1}{\sqrt{s}}\sum_{j\in[s]}|j\rangle_c$ with eigenvalue $\hat{H}_{\text{complete}}|u\rangle_c = s|u\rangle_c$. Thus we define the Hamiltonian

$\hat{H}_{sc} = \hat{H}_{\text{spin}} \otimes \hat{H}_{\text{complete}}$. (6.25)

Note that $\hat{H}_{sc}$ has sparsity $s$, max-norm $\Theta(1)$, and 1-norm $\Theta(s)$. One can see that $\hat{H}_{sc}$ performs faster state transfer like $e^{-i\hat{H}_{sc}n\pi/(2s)}|0\rangle_s|u\rangle_c = |n\rangle_s|u\rangle_c$ in time $t = n\pi/(2s)$. We find it useful to define the state $|j\rangle_{sc} = |j\rangle_s|u\rangle_c$.

Adding another qubit to this composite Hamiltonian together with some slight modification solves PARITY.
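The perfect state transfer property of $\hat{H}_{\text{spin}}$ in Eq. 6.23 can be checked numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the matrix exponential is applied with a truncated Taylor series (adequate only for the small $n$ used here), and only the transfer probability is examined, since the final state may carry a global phase.

```python
import math


def spin_chain_hamiltonian(n):
    """Dense (n+1)x(n+1) matrix with <j-1|H|j> = <j|H|j-1> = sqrt(j(n-j+1))/n, as in Eq. 6.23."""
    dim = n + 1
    H = [[0.0] * dim for _ in range(dim)]
    for j in range(1, dim):
        coupling = math.sqrt(j * (n - j + 1)) / n
        H[j - 1][j] = coupling
        H[j][j - 1] = coupling
    return H


def evolve(H, state, t, terms=80):
    """Apply exp(-i H t) to a state vector using a truncated Taylor series."""
    dim = len(H)
    result = [complex(a) for a in state]
    term = [complex(a) for a in state]
    for k in range(1, terms):
        # term_k = (-i t H / k) term_{k-1}
        term = [(-1j * t / k) * sum(H[r][c] * term[c] for c in range(dim)) for r in range(dim)]
        result = [a + b for a, b in zip(result, term)]
    return result


if __name__ == "__main__":
    n = 4
    initial = [1.0] + [0.0] * n  # the state |0>
    final = evolve(spin_chain_hamiltonian(n), initial, n * math.pi / 2)
    print([round(abs(a) ** 2, 6) for a in final])  # populations after time n*pi/2
```

The populations printed by the sketch concentrate on the last basis state, consistent with transfer from $|0\rangle$ to $|n\rangle$ in time $n\pi/2$.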
Given an $n$-bit string $x = x_0 x_1 \ldots x_{n-1}$, let us consider the Hamiltonian of dimension 2 that computes the NOT function on the computational basis $\{|j\rangle_{\text{output}} : j \in [2]\}$,

$\hat{H}_{\text{NOT},j} = \begin{pmatrix} x_j \oplus 1 & x_j \\ x_j & x_j \oplus 1 \end{pmatrix}$. (6.26)

One can see that $\hat{H}_{\text{NOT},j}|0\rangle_{\text{output}} = |x_j\rangle_{\text{output}}$ and $\hat{H}_{\text{NOT},j}|1\rangle_{\text{output}} = |x_j \oplus 1\rangle_{\text{output}}$, as expected of a NOT function. In the basis $|j\rangle_{sc}$, we define the Hamiltonian

$\hat{H}_{\text{PARITY}} = \sum_{j\in[n]}\frac{\sqrt{(j + 1)(n - j)}}{n}|j + 1\rangle\langle j|_{sc} \otimes \hat{H}_{\text{NOT},j} + \text{Hermitian conjugate}$. (6.27)

This Hamiltonian also performs perfect state transfer, but since the path of each transition between the states $|0\rangle_{\text{output}}$ and $|1\rangle_{\text{output}}$ is gated by a NOT function on the bit $x_j$, the output state of time-evolution is $e^{-i\hat{H}_{\text{PARITY}}n\pi/(2s)}|0\rangle_s|u\rangle_c|0\rangle_{\text{output}} = |n\rangle_s|u\rangle_c|\bigoplus_j x_j\rangle_{\text{output}}$. In the computational basis, $\hat{H}_{\text{PARITY}}$ has sparsity $2s$, max-norm $\Theta(1)$, and 1-norm $\Theta(s)$. Even though $\hat{H}_{\text{NOT},j}$ has only one non-zero element in each row, the sparsity increases by a factor 2 as we cannot compute beforehand the column index of the non-zero element. Thus measuring the output register returns the parity of $x$,

$\text{PARITY}(x) = \bigoplus_{j=0}^{n-1}x_j$, (6.28)

after evolving for time $t = n\pi/(2s)$. It is well-known that the parity on $n$ bits cannot be computed with less than $\Omega(n)$ quantum queries, thus the query complexity of simulating time-evolution by $\hat{H}_{\text{PARITY}}$ for time $t$ is at least $\Omega(ts)$. As sparsity and 1-norm exhibit the same scaling and in general $\|\hat{H}\|_1 \le d\|\hat{H}\|_{\max}$, the more accurate statement here, if given information on $\|\hat{H}\|_1$, is the lower bound of $\Omega(t\|\hat{H}\|_1)$ queries. In contrast, the lower bound of [10] quotes $\Omega(\text{sparsity} \times t)$ as they consider the case where one is given information only on the sparsity.

We now present the extension to creating a Hamiltonian that solves PARITY $\circ$ OR. Notably, this Hamiltonian allows one to vary sparsity and 1-norm independently.

Proof of Thm. 6.9. The first step is to construct a Hamiltonian that solves the OR function on $m$ bits $x_0 x_1 \ldots x_{m-1}$, promised that at most 1 bit is non-zero. This Hamiltonian of dimension $2m$, in the computational basis $\{|k\rangle_{\text{output}}|j\rangle_o : k \in [2], j \in [m]\}$, is

$\hat{H}_{\text{OR}} = \begin{pmatrix} \hat{C}_1 & \hat{C}_0 \\ \hat{C}_0^\dagger & \hat{C}_1 \end{pmatrix}$. (6.29)

Note that our construction is based on a modification of [13], where $\hat{C}_1$ there is the zero matrix. Here, $\hat{C}_1$ mimics the top-left component of $\hat{H}_{\text{NOT}}$ in that it acts on the output register only if $\text{OR}(x) = 0$, and $\hat{C}_0$ mimics the top-right component of $\hat{H}_{\text{NOT}}$ in that it performs a bit-flip on the output register if $\text{OR}(x) = 1$. These matrices are defined as follows:

$\hat{C}_0 = \begin{pmatrix} x_0 & x_1 & \cdots & x_{m-1} \\ x_{m-1} & x_0 & \cdots & x_{m-2} \\ x_{m-2} & x_{m-1} & \cdots & x_{m-3} \\ \vdots & \vdots & \ddots & \vdots \\ x_1 & x_2 & \cdots & x_0 \end{pmatrix}, \quad \hat{C}_1 = \frac{1}{m}\begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix} - \frac{\hat{C}_0 + \hat{C}_0^\dagger}{2}$. (6.30)

Note that the non-Hermitian matrix $\hat{C}_0$ has rows formed from cyclic shifts of $x$, whereas $\hat{C}_1$ is Hermitian. Let us define the uniform superposition $|u\rangle_o = \frac{1}{\sqrt{m}}\sum_{j\in[m]}|j\rangle_o$. It is easy to verify that if at most one bit in $x$ is non-zero, $\hat{C}_0|u\rangle_o = \text{OR}(x)|u\rangle_o$. Similarly, $\hat{C}_1|u\rangle_o = (\text{OR}(x) \oplus 1)|u\rangle_o$. Thus $\hat{H}_{\text{OR}}|j\rangle_{\text{output}}|u\rangle_o = |j \oplus \text{OR}(x)\rangle_{\text{output}}|u\rangle_o$. Note that $\hat{H}_{\text{OR}}$ has sparsity $2m$, max-norm $\Theta(1)$, and 1-norm $\Theta(1)$.

Given an $nm$-bit string $x_{0,0}x_{0,1}\ldots x_{0,m-1}x_{1,0}\ldots x_{n-1,m-1}$, the Hamiltonian $\hat{H}_{\text{PARITY}\circ\text{OR}}$ that computes the $n$-bit PARITY of a number $n$ of $m$-bit OR functions is similar to $\hat{H}_{\text{PARITY}}$ in Eq. 6.27, except that instead of composing with NOT Hamiltonians defined by the bit $x_j$ for each $j \in [n]$, we compose with OR Hamiltonians defined by the bits $x_{j,0}x_{j,1}\ldots x_{j,m-1}$ for each $j \in [n]$. By defining $\hat{H}_{\text{OR},j}$ as the Hamiltonian defined by those bits,

$\hat{H}_{\text{PARITY}\circ\text{OR}} = \sum_{j\in[n]}\frac{\sqrt{(j + 1)(n - j)}}{n}|j + 1\rangle\langle j|_{sc} \otimes \hat{H}_{\text{OR},j} + \text{Hermitian conjugate}$. (6.31)

On the input state $|0\rangle_s|u\rangle_c|u\rangle_o|0\rangle_{\text{output}}$, time-evolution by $e^{-i\hat{H}_{\text{PARITY}\circ\text{OR}}n\pi/(2s)}$ produces

$e^{-i\hat{H}_{\text{PARITY}\circ\text{OR}}n\pi/(2s)}|0\rangle_s|u\rangle_c|u\rangle_o|0\rangle_{\text{output}} = |n\rangle_s|u\rangle_c|u\rangle_o\left|\bigoplus_j\text{OR}(x_{j,0}x_{j,1}\ldots x_{j,m-1})\right\rangle_{\text{output}}$. (6.32)

Thus measuring the output register returns

$\text{PARITY}\circ\text{OR}(x) = \bigoplus_{j=0}^{n-1}\text{OR}(x_{j,0}x_{j,1}\ldots x_{j,m-1})$, (6.33)

after time-evolution by $t = n\pi/(2s)$. Note that $\hat{H}_{\text{PARITY}\circ\text{OR}}$ has sparsity $d = 2sm$, max-norm $\Theta(1)$, and 1-norm $\Theta(s)$. It is well-known that the constant-error quantum query complexity of PARITY $\circ$ OR [107] is the product of the query complexity of PARITY with that of OR.
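The action of the cyclic-shift matrix $\hat{C}_0$ on the uniform superposition can be verified directly for small $m$. The sketch below is illustrative only: the function names are hypothetical, and the form used for $\hat{C}_1$, namely the all-ones matrix divided by $m$ minus the symmetrized $\hat{C}_0$, is an assumption chosen so that $\hat{C}_1|u\rangle_o = (\text{OR}(x) \oplus 1)|u\rangle_o$ holds under the promise.

```python
import math


def c0_matrix(x):
    """Matrix whose row r is the bit string x cyclically shifted right by r positions."""
    m = len(x)
    return [[x[(c - r) % m] for c in range(m)] for r in range(m)]


def c1_matrix(x):
    """Assumed form J/m - (C0 + C0^T)/2, with J the all-ones matrix (C0 is real here)."""
    m = len(x)
    c0 = c0_matrix(x)
    return [[1.0 / m - 0.5 * (c0[r][c] + c0[c][r]) for c in range(m)] for r in range(m)]


def mat_vec(M, v):
    """Plain matrix-vector product for lists of lists."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in M]


def uniform_state(m):
    """Uniform superposition |u> over m basis states."""
    return [1.0 / math.sqrt(m)] * m


if __name__ == "__main__":
    for x in ([0, 0, 0, 0], [0, 0, 1, 0]):
        u = uniform_state(len(x))
        print(x, mat_vec(c0_matrix(x), u), mat_vec(c1_matrix(x), u))
```

If the promise is violated, for example with two non-zero bits, each row of $\hat{C}_0$ sums to 2 and $|u\rangle_o$ acquires eigenvalue 2 rather than $\text{OR}(x) = 1$, which is why the promise of at most one non-zero bit is needed.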
As at least $\Omega(\sqrt{m})$ queries are required to compute the OR of $m$ bits, PARITY $\circ$ OR$(x)$ requires at least $\Omega(n\sqrt{m})$ queries. Thus any algorithm for simulating time-evolution by $\hat{H}_{\text{PARITY}\circ\text{OR}}$ requires at least $\Omega(n\sqrt{m}) = \Omega(t\sqrt{ds})$ queries. □

6.5 Universality of the Standard-Form

We now establish an equivalence between simulation and measurement that justifies our focus on directly manipulating the standard-form encoding of structured Hamiltonians. This equivalence, proven using Thm. 6.11, allows us to inter-convert quantum circuits that implement time-evolution $e^{-i\hat{H}}$ for $\|\hat{H}\| = O(1)$ and quantum circuits that implement measurement $\hat{H}/\|\hat{H}\|$ with a query complexity that is logarithmic in error, and a constant overhead in space. An application of this result to Hamiltonian simulation is Cor. 6.12 for Hamiltonians that are a sum of Hermitian terms, given access only to their exponentials.

An intuitive picture of when simulation is possible emerges by interpreting the standard-form matrix encoding Def. 6.2 as a quantum circuit that implements a measurement. To see this explicitly, consider a Hermitian matrix encoded in standard-form-$(\hat{H}, \alpha, \hat{U}, d)$. Thus for any arbitrary input state $|\psi\rangle_s$, the standard-form applies

$\hat{U}|G\rangle_a|\psi\rangle_s = |G\rangle_a\frac{\hat{H}}{\alpha}|\psi\rangle_s + |\Phi^\perp\rangle_{as}, \quad \left(\langle G|_a \otimes \hat{I}_s\right)|\Phi^\perp\rangle_{as} = 0$. (6.34)

Note that in this section, we find it helpful to leave $|G\rangle$ explicit, similar to Sec. 5.3.1. So upon measurement outcome $|G\rangle$ on the ancilla, which occurs with best-case probability $\max_{|\psi\rangle}\|(\hat{H}/\alpha)|\psi\rangle\|^2 = (\|\hat{H}\|/\alpha)^2$, the measurement operator $\hat{H}/\alpha$ is implemented on the system. As all measurement outcomes orthogonal to $|G\rangle$ do not concern us, we represent their output with some orthogonal unnormalized quantum state $|\Phi^\perp\rangle_{as}$. Combined with the Hamiltonian simulation by qubitization results of Thm. 6.3, one concludes that whenever one has access to a quantum circuit that implements a generalized measurement with measurement operator $\hat{H}/\alpha$ corresponding to one of the measurement outcomes, time-evolution using $O\left(t\alpha + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right)$ queries is possible.
The converse of approximating measurements given $e^{-i\hat{H}t}$ is a standard application of quantum phase estimation. The proof sketch is: (1) Assume $t$ is chosen such that $\|\hat{H}t\| \le c < 1$ for some absolute constant $c$ and define $\hat{H}' = \hat{H}t$. (2) Perform quantum phase estimation using $O(1/\epsilon)$ queries to controlled-$e^{-i\hat{H}'}$ to encode the eigenphases $\lambda$ of its eigenstates $\hat{H}'|\lambda\rangle = \lambda|\lambda\rangle$ to precision $\epsilon$ in binary format $\tilde{\lambda}$ in an $m$-qubit ancilla register $|\tilde{\lambda}\rangle_b$, where $m = O(\log(1/\epsilon))$. (3) Perform a controlled rotation on the single-qubit ancilla $|0\rangle_a$ to reduce the amplitude of $|\lambda\rangle$ by factor $\tilde{\lambda}$. (4) Uncompute the binary register by running quantum phase estimation in reverse. This implements the sequence

$|0\rangle_b|0\rangle_a|\lambda\rangle_s \to |\tilde{\lambda}\rangle_b|0\rangle_a|\lambda\rangle_s \to |\tilde{\lambda}\rangle_b\left(\tilde{\lambda}|0\rangle_a + \sqrt{1 - |\tilde{\lambda}|^2}|1\rangle_a\right)|\lambda\rangle_s \to |0\rangle_b\left(\tilde{\lambda}|0\rangle_a + \sqrt{1 - |\tilde{\lambda}|^2}|1\rangle_a\right)|\lambda\rangle_s$. (6.35)

Thus projecting onto the state $|0\rangle_b|0\rangle_a$ implements the measurement operator $\hat{H}'$ with error $\max_\lambda|\tilde{\lambda} - \lambda| = O(\epsilon)$, and best-case success probability $\|\hat{H}'\|^2$. As Eq. 6.35 is a standard-form encoding of $\hat{H}'$ with the signal unitary defined by steps (2-4), this establishes one direction in the equivalence between measurement and simulation up to polynomial error and logarithmic space. Ignoring these factors, our study of Hamiltonian simulation reduces to that of generalized measurements except in one edge case: this equivalence does not hold with respect to $t$ when $e^{-i\hat{H}t}$ can be simulated with $o(t)$ queries. However, this case is less interesting as no-fast-forwarding theorems [25] show that $\Omega(t)$ queries are necessary for Hamiltonians that solve generic problems.

We strengthen this equivalence in the opposite direction in Thm. 6.11 for approximating measurement operators $\hat{H}'$ using $O(\log(1/\epsilon))$ queries to $e^{-i\hat{H}'}$ and $O(1)$ ancilla qubits. The idea is to use quantum signal processing techniques to approximate two operator transformations: $\hat{H}_1 = \frac{i}{2}\left(e^{-i\hat{H}'} - e^{i\hat{H}'}\right)$, $\hat{H}_2 = \sin^{-1}(\hat{H}_1)$. Thus $\sin^{-1}\left(\frac{i}{2}\left(e^{-i\hat{H}'} - e^{i\hat{H}'}\right)\right) = \hat{H}'$. All that remains is finding a degree $n$ polynomial approximation to $\sin^{-1}(x)$ with uniform error $\epsilon$, where $n = O(\log(1/\epsilon))$.
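Since both transformations act eigenvalue by eigenvalue, the identity above can be sanity-checked on scalar eigenphases. This is a minimal sketch with hypothetical function names, which uses the exact `math.asin` in place of any polynomial approximation.

```python
import cmath
import math


def sine_from_exponentials(lam):
    """First transformation on an eigenphase: (i/2)(e^{-i lam} - e^{i lam}), which equals sin(lam)."""
    return 0.5j * (cmath.exp(-1j * lam) - cmath.exp(1j * lam))


def recover_eigenphase(lam):
    """Second transformation: arcsin of the real part recovers lam whenever |lam| <= pi/2."""
    return math.asin(sine_from_exponentials(lam).real)


if __name__ == "__main__":
    for lam in (-0.5, -0.1, 0.0, 0.3, 0.5):
        h1 = sine_from_exponentials(lam)
        print(lam, h1.real, recover_eigenphase(lam))
```

The check only confirms the algebra of the two-step map on the restricted domain $|\lambda| \le 1/2$; the cost of the quantum algorithm is set by the degree of the polynomial that replaces the exact arcsine.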
However, this seems impossible: $\sin^{-1}(x)$ is not analytic at $x = \pm 1$, thus its uniform polynomial approximation has degree $n = O(\mathrm{poly}(1/\epsilon))$. Fortunately, this can be overcome due to the restricted domain $\|\hat{H}'\| \le c$.

Lemma 6.17 (Polynomial approximation to $\sin^{-1}(x)$). $\forall\,\epsilon \in (0, O(1)]$, there exists an odd polynomial $p_{\arcsin,n}$ of degree $n = O(\log(1/\epsilon))$ such that

$$
\max_{x\in[-1/2,1/2]}\left|p_{\arcsin,n}(x) - \sin^{-1}(x)\right| \le \epsilon
\quad\text{and}\quad
\max_{x\in[-1,1]}\left|p_{\arcsin,n}(x)\right| \le 1.
\tag{6.36}
$$

Proof. We restate Thm. 3 of [112] by Saff and Totik: Let $\beta$ be any number satisfying $\beta > 1$ and let $f \in C^k[-1,1]$ be a piecewise analytic function on $m > 0$ closed intervals $[-1,1] = \bigcup_{j=0}^{m-1}[x_j, x_{j+1}]$, $-1 = x_0 < x_1 < \cdots < x_{m-1} < x_m = 1$, where the restriction of $f$ to any of the closed intervals $[x_j, x_{j+1}]$ is analytic, and $f$ is not analytic at each point $x_1, \ldots, x_{m-1}$. Then there exist constants $g, G > 0$ that depend only on $f$, and degree $n > 0$ polynomials $p_n$ such that for every $x \in [-1,1]$, $|p_n(x) - f(x)| \le G e^{-g n d^{\beta}(x)}$, where $d(x) = \min_{0\le j\le m}|x - x_j|$.

Let us now apply this theorem. Define the function

$$
f_{\arcsin}(x) =
\begin{cases}
\sin^{-1}(x), & x \in [-3/4, 3/4], \\
\mathrm{sgn}(x)\sin^{-1}(3/4), & \text{otherwise},
\end{cases}
\tag{6.37}
$$

where $\mathrm{sgn}(x) = x/|x|$. $f_{\arcsin}(x)$ is continuous but not differentiable at $x = \pm 3/4$. Thus $f \in C^0[-1,1]$, $\min_{x\in[-1/2,1/2]} d(x) \ge 1/4$, and there exist absolute constants $G', g' > 0$ and polynomials $p_n$ such that $\max_{x\in[-1/2,1/2]}|p_n(x) - f_{\arcsin}(x)| \le G' e^{-g' n/4} = \epsilon$. Hence $n = O(\log(1/\epsilon))$. Since $e^{-g' n d^{\beta}(x)} \le 1$ and $|\sin^{-1}(3/4)| < 0.85$, there exists a constant $n_0 > 0$ such that for all $n > n_0$, $\max_{x\in[-1,1]}|f_{\arcsin}(x) - p_n(x)| < 0.15$, thus $|p_n(x)| \le 1$. If $p_n(x)$ is not odd, replace it with its antisymmetric component $p_n(x) \leftarrow \frac{p_n(x) - p_n(-x)}{2}$, which is odd with at worst the same error. Now let $p_{\arcsin,n} = p_n$. $\square$

We now apply this polynomial approximation of $\sin^{-1}(x)$ to the proof of Thm. 6.11.

Proof of Thm. 6.11. The transformation from time evolution $e^{-i\hat{H}t}$ to measurement $\hat{H}t$ takes three steps. First, encode the Hermitian operator $\hat{H}_1 = \sin(\hat{H}t)$ in standard-form.
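This can be done with one query to the controlled time-evolution operator $\hat{U}_0 = |0\rangle\langle 0| \otimes \hat{I} + |1\rangle\langle 1| \otimes e^{-i\hat{H}t}$ and its inverse $\hat{U}_0^\dagger$:

$$
\begin{aligned}
\hat{U}_1 &= \hat{U}_0^\dagger(\hat{\sigma}_x \otimes \hat{I})\hat{U}_0 = |1\rangle\langle 0| \otimes e^{i\hat{H}t} + |0\rangle\langle 1| \otimes e^{-i\hat{H}t}, \qquad |G\rangle = e^{i\hat{\sigma}_x\pi/4}|0\rangle, \\
\hat{H}_1 &= (\langle G| \otimes \hat{I})\,\hat{U}_1\,(|G\rangle \otimes \hat{I}) = \sin(\hat{H}t).
\end{aligned}
\tag{6.38}
$$

Second, approximate $\hat{H}_2 = \sin^{-1}(\hat{H}_1)$ using quantum signal processing. As the polynomial $p_{\arcsin,N}(x)$ of Lem. 6.17 satisfies the conditions of Thm. 5.9, the operator transformation $\hat{H}_{\mathrm{lin}} = p_{\arcsin,N}[\hat{H}_1]$ can be implemented exactly with $O(N)$ queries to $\hat{U}_0$. This encodes $\hat{H}_{\mathrm{lin}}$ in standard-form with normalization 1. Now choose $t$ such that $\|\hat{H}t\| \le c = 1/2$. Then $\|\sin(\hat{H}t)\| \le \|\hat{H}t\| \le 1/2$ as $|\sin(x)| \le |x|$. Third, evaluate the approximation error using Lem. 6.17: $\|\hat{H}_{\mathrm{lin}} - \hat{H}t\| \le \max_{x\in[-1/2,1/2]}|p_{\arcsin,N}(x) - \sin^{-1}(x)| \le \epsilon$, for $N = O(\log(1/\epsilon))$. $\square$

Incidentally, the equivalence between simulation and measurement also provides a simulation algorithm for Hamiltonians built from a sum of $d$ Hermitian components $\hat{H} = \sum_{j=1}^{d}\alpha_j\hat{H}_j$ where one only has access to these components through an oracle for their controlled exponentials $e^{-i\hat{H}_j t_j}$, for any $t_j \in \mathbb{R}$. Though results with similar scaling can be obtained through the techniques of compressed fractional queries [13], this approach has two main advantages. First, the queries $\hat{H}_j$ are not restricted to only have eigenvalues $\pm 1$. Second, it is significantly simpler both in concept and in implementation.

Proof of Cor. 6.12. From Thm. 6.11, $O(\log(1/\epsilon_1))$ queries to $\hat{U}$ suffice to encode $\hat{H}_{\mathrm{controlled}} = \sum_{j=1}^{d}|j\rangle\langle j|_a \otimes \hat{H}'_j = (\langle G'|_b \otimes \hat{I}_{as})\,\hat{U}'\,(|G'\rangle_b \otimes \hat{I}_{as})$ in standard-form with some state $|G'\rangle_b$ and signal oracle $\hat{U}'$, where each $\hat{H}'_j$ acts on the system register and $\max_j\|\hat{H}'_j - \hat{H}_j\| \le \epsilon_1$. Thus $(\langle G|_a\langle G'|_b \otimes \hat{I}_s)\,\hat{U}'\,(|G\rangle_a|G'\rangle_b \otimes \hat{I}_s) = \hat{H}_{\mathrm{approx}}/\alpha$ encodes $\hat{H}_{\mathrm{approx}}$ in standard-form, where $\|\hat{H}_{\mathrm{approx}} - \hat{H}\| = \left\|\sum_{j=1}^{d}\alpha_j(\hat{H}'_j - \hat{H}_j)\right\| \le \alpha\epsilon_1$. Using the fact $\|e^{i\hat{A}} - e^{i\hat{B}}\| \le \|\hat{A} - \hat{B}\|$ [13], we have $\|e^{-i\hat{H}_{\mathrm{approx}}t} - e^{-i\hat{H}t}\| \le t\alpha\epsilon_1$. By applying Thm. 6.3, $e^{-i\hat{H}_{\mathrm{approx}}t}$ can be approximated with error $\epsilon_2$ using $O\!\left(t\alpha + \frac{\log(1/\epsilon_2)}{\log\log(1/\epsilon_2)}\right) O(\log(1/\epsilon_1))$ queries to $\hat{U}$.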
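By the triangle inequality, this approximates $e^{-i\hat{H}t}$ with error $\le t\alpha\epsilon_1 + \epsilon_2$. Thus choose $\epsilon_1 = \frac{\epsilon}{2t\alpha}$ and $\epsilon_2 = \epsilon/2$. $\square$

6.6 Construction of Polynomials

In this section, we constructively prove the existence of the polynomials used in proving spectral multiplication, spectral amplification of low-energy subspaces, and amplitude multiplication.

6.6.1 Polynomial Approximations to a Truncated Linear Function

The proofs of Thm. 6.4 and Thm. 6.10 require a polynomial approximation $p_{\mathrm{lin},\Gamma,n}$ to the truncated linear function

$$
f_{\mathrm{lin},\Gamma}(x)
\begin{cases}
= \dfrac{x}{2\Gamma}, & |x| \in [0, \Gamma], \\[4pt]
\in [-1, 1], & |x| \in (\Gamma, 1].
\end{cases}
\tag{6.39}
$$

The remainder of this section is dedicated to constructively proving the existence of $p_{\mathrm{lin},\Gamma,n}$ with the following properties:

Theorem 6.18 (Polynomial for linear amplitude amplification). $\forall\,\Gamma \in [0, 1/2]$, $\epsilon \in (0, O(\Gamma)]$, there exists an odd polynomial $p_{\mathrm{lin},\Gamma,n}$ of degree $n = O(\Gamma^{-1}\log(1/\epsilon))$ such that

$$
\forall\, x \in [-\Gamma, \Gamma],\quad \left|p_{\mathrm{lin},\Gamma,n}(x) - \frac{x}{2\Gamma}\right| \le \frac{\epsilon|x|}{2\Gamma}
\quad\text{and}\quad
\max_{x\in[-1,1]}\left|p_{\mathrm{lin},\Gamma,n}(x)\right| \le 1.
\tag{6.40}
$$

As close-to-optimal uniform polynomial approximations may be obtained by the Chebyshev truncation of entire functions, our strategy is to find an entire function $f_{\mathrm{lin},\Gamma,\epsilon}$ that approximates $f_{\mathrm{lin},\Gamma}$ over the domain $x \in [-\Gamma, \Gamma]$ with error $\epsilon$. We construct $f_{\mathrm{lin},\Gamma,\epsilon}(x)$ in three steps. First, approximate the sign function $\mathrm{sgn}(x)$ with an error function, which is entire. Second, approximate the rectangular function $\mathrm{rect}(x)$ with a sum of two error functions $\frac{1}{2}\left(\mathrm{erf}(k(x+\delta)) + \mathrm{erf}(k(-x+\delta))\right)$. Third, multiply this by $\frac{x}{2\Gamma}$ to approximate $f_{\mathrm{lin},\Gamma}(x)$ with some error $\epsilon$. The approximation error of this sequence is described by Lems. 6.19, 6.20, 6.21:

Lemma 6.19 (Entire approximation to the sign function $\mathrm{sgn}(x)$). $\forall\,\kappa > 0$, $x \in \mathbb{R}$, $\epsilon \in (0, \sqrt{2/(e\pi)}]$, let $k = \frac{\sqrt{2}}{\kappa}\log^{1/2}\!\left(\frac{2}{\pi\epsilon^2}\right)$. Then the function $f_{\mathrm{sgn},\kappa,\epsilon}(x) = \mathrm{erf}(kx)$ satisfies

$$
1 \ge \left|f_{\mathrm{sgn},\kappa,\epsilon}(x)\right|,
\qquad
\epsilon \ge \max_{|x|\ge\kappa/2}\left|f_{\mathrm{sgn},\kappa,\epsilon}(x) - \mathrm{sgn}(x)\right|,
\qquad
\mathrm{sgn}(x) =
\begin{cases}
1, & x > 0, \\
-1, & x < 0, \\
1/2, & x = 0.
\end{cases}
\tag{6.41}
$$

Proof.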
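We apply elementary upper bounds on the complementary error function

$$
\mathrm{erfc}(x) = 1 - \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_x^\infty e^{-y^2}\,dy \le \frac{2}{\sqrt{\pi}}\int_x^\infty \frac{y}{x}e^{-y^2}\,dy = \frac{e^{-x^2}}{\sqrt{\pi}\,x}
$$

for any $x > 0$. Thus $\max_{x\ge\kappa/2}|\mathrm{erf}(kx) - 1| \le \epsilon$ where $\frac{2}{\sqrt{\pi}\,k\kappa}e^{-(k\kappa)^2/4} = \epsilon$, and similarly for $x \le -\kappa/2$. This is solved by $k = \frac{\sqrt{2}}{\kappa}W^{1/2}\!\left(\frac{2}{\pi\epsilon^2}\right)$, where $W(x)$ is the Lambert-W function. From the upper bound $\log x - \log\log x \le W(x) \le \log x - \frac{1}{2}\log\log x$ for $x \ge e$ [57], any choice of $k \ge \frac{\sqrt{2}}{\kappa}\log^{1/2}\!\left(\frac{2}{\pi\epsilon^2}\right)$ ensures that $\mathrm{erf}(kx)$ is close to $\pm 1$ over $|x| \ge \kappa/2$. $\square$

Lemma 6.20 (Entire approximation to the rect function). $\forall\,\kappa > 0$, $w > 0$, $x \in \mathbb{R}$, $\epsilon \in (0, \sqrt{2/(e\pi)}]$, let $k = \frac{\sqrt{2}}{\kappa}\log^{1/2}\!\left(\frac{2}{\pi\epsilon^2}\right)$, $\delta = (w + \kappa)/2$. Then the function $f_{\mathrm{rect},w,\kappa,\epsilon}(x) = \frac{1}{2}\left(\mathrm{erf}(k(x+\delta)) + \mathrm{erf}(k(-x+\delta))\right)$ satisfies

$$
1 \ge \left|f_{\mathrm{rect},w,\kappa,\epsilon}(x)\right|,
\qquad
\epsilon \ge \max_{|x|\in[0,w/2]\cup[w/2+\kappa,\infty)}\left|f_{\mathrm{rect},w,\kappa,\epsilon}(x) - \mathrm{rect}(x/w)\right|,
\qquad
\mathrm{rect}(x) =
\begin{cases}
1, & |x| < 1/2, \\
0, & |x| > 1/2, \\
1/2, & |x| = 1/2.
\end{cases}
\tag{6.42}
$$

Proof. This follows from the definition of the rect function $\mathrm{rect}(x/w) = \frac{1}{2}\left(\mathrm{sgn}(x + w/2) + \mathrm{sgn}(-x + w/2)\right)$. Thus we choose $\delta = (w + \kappa)/2$ and apply the error estimates of Lem. 6.19. $\square$

Lemma 6.21 (Entire approximation to the truncated linear function). $\forall\,\Gamma > 0$, $x \in \mathbb{R}$, $\epsilon \in (0, \sqrt{2/(e\pi)}]$, the function $f_{\mathrm{lin},\Gamma,\epsilon}(x) = \frac{x}{2\Gamma}f_{\mathrm{rect},2\Gamma,2\Gamma,\epsilon}(x)$ satisfies

$$
\left|f_{\mathrm{lin},\Gamma,\epsilon}(x)\right| \le 1,
\qquad
\left|f_{\mathrm{lin},\Gamma,\epsilon}(x) - \frac{x}{2\Gamma}\right| \le \frac{\epsilon|x|}{2\Gamma}\ \text{ for } |x| \in [0, \Gamma].
\tag{6.43}
$$

Proof. Consider the domain $|x| \in [0, \Gamma]$. There, Lem. 6.20 gives the approximation error $|f_{\mathrm{rect},2\Gamma,2\Gamma,\epsilon}(x) - 1| \le \epsilon$. Multiplying both sides by $\frac{x}{2\Gamma}$ gives the stated result. Now consider the domain $|x| \in [0, 2\Gamma]$. There, $|f_{\mathrm{rect},2\Gamma,2\Gamma,\epsilon}(x)| \le 1$ and $\left|\frac{x}{2\Gamma}\right| \le 1$. Thus the product is bounded by 1. Now consider the domain $x > 2\Gamma$. Let us maximize $f_{\mathrm{lin},\Gamma,\epsilon}(x)$ over $x, \epsilon$. Define $1/\epsilon' = \log^{1/2}\!\left(\frac{2}{\pi\epsilon^2}\right) \ge 1$. Thus

$$
f_{\mathrm{lin},\Gamma,\epsilon}(x) = \frac{x}{4\Gamma}\left(\mathrm{erf}\!\left(\frac{x+2\Gamma}{\sqrt{2}\,\Gamma\epsilon'}\right) + \mathrm{erf}\!\left(\frac{-x+2\Gamma}{\sqrt{2}\,\Gamma\epsilon'}\right)\right).
$$

We make use of the upper bounds $\mathrm{erfc}(x) = 1 - \mathrm{erf}(x) \le \frac{1}{\sqrt{\pi}\,x}e^{-x^2}$ and $\mathrm{erfc}(x) \le e^{-x^2}$. The first term has the bounds $1 \ge \mathrm{erf}\!\left(\frac{x+2\Gamma}{\sqrt{2}\,\Gamma\epsilon'}\right) \ge 1 - e^{-\left(\frac{x+2\Gamma}{\sqrt{2}\,\Gamma\epsilon'}\right)^2}$. The second term has the bounds $-1 + e^{-\left(\frac{x-2\Gamma}{\sqrt{2}\,\Gamma\epsilon'}\right)^2} \ge \mathrm{erf}\!\left(\frac{-x+2\Gamma}{\sqrt{2}\,\Gamma\epsilon'}\right) \ge -1$. By adding these together and extremizing the upper and lower bounds separately, $f_{\mathrm{lin},\Gamma,\epsilon}(x) \in [-0.0011, 0.56]$ independent of $\Gamma$ and for all $\epsilon' \in [0, 1]$.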
These bounds apply to $x < -2\Gamma$ with a minus sign as $f_{\mathrm{lin},\Gamma,\epsilon}(x)$ is an odd function. $\square$

However, the required polynomial must have a non-uniform error $\left|p_{\mathrm{lin},\Gamma,n}(x) - \frac{x}{2\Gamma}\right|$ proportional to $|x|$. Though $f_{\mathrm{lin},\Gamma,\epsilon}$ of Lem. 6.21 has that property, its Chebyshev truncation results in a worst-case uniform error $\epsilon$ for all values of $x$. This is overcome by constructing $p_{\mathrm{lin},\Gamma,n}(x)$ as the product of a Chebyshev truncation of the entire approximation to $\mathrm{rect}(x)$ with $\frac{x}{2\Gamma}$.

We now evaluate the scaling of the degree of the Chebyshev truncation of $f_{\mathrm{lin},\Gamma,\epsilon}$ in Lem. 6.20 with respect to their parameters and the desired approximation error. Our starting point is the Jacobi-Anger expansion of the exponential decay function:

$$
f_{\exp,\beta}(x) = e^{-\beta(x+1)} = e^{-\beta}\left(I_0(\beta) + 2\sum_{j=1}^{\infty} I_j(\beta)\,T_j(-x)\right),
\tag{6.44}
$$

where $I_j(\beta)$ are modified Bessel functions of the first kind. The domain of this function and all the following are assumed to be $x \in [-1, 1]$. By truncating this expansion above $j > n$, we obtain a degree $n$ polynomial approximation $p_{\exp,\beta,n}(x)$ with truncation error $\epsilon_{\exp,\beta,n}$:

$$
p_{\exp,\beta,n}(x) = e^{-\beta}\left(I_0(\beta) + 2\sum_{j=1}^{n} I_j(\beta)\,T_j(-x)\right),
\tag{6.45}
$$

$$
\epsilon_{\exp,\beta,n} = \max_{x\in[-1,1]}\left|p_{\exp,\beta,n}(x) - f_{\exp,\beta}(x)\right| = 2e^{-\beta}\sum_{j=n+1}^{\infty}\left|I_j(\beta)\right|.
\tag{6.46}
$$

Note that the equality in the rightmost term of Eq. 6.46 arises as all the coefficients $I_j(\beta) > 0$ when $\beta > 0$. Thus $\epsilon_{\exp,\beta,n}$ is maximized when the $|T_j(-x)|$ are all simultaneously maximized, which occurs at $x = -1 \Rightarrow T_j(-x) = 1$. By solving $\epsilon_{\exp,\beta,n} = \epsilon$, one can in principle obtain the required degree $n$ as a function of $\beta, \epsilon$.

Error estimates for various degree $n$ polynomial approximations to the exponential decay function can be found in the literature. However, these approximations are constructed using other methods. For instance, a Taylor expansion leads to scaling linear in $\beta$, and none explicitly bound the sum $\epsilon_{\exp,\beta,n}$. Fortunately, one particular error estimate in prior art is good enough and can be shown, with a little work, to implicitly bound $\epsilon_{\exp,\beta,n}$. We first sketch the proof of this estimate, then later show how it bounds $\epsilon_{\exp,\beta,n}$.
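The truncation in Eqs. 6.45 and 6.46 can be sanity-checked numerically. The following is a minimal sketch, not part of the original construction: it assumes illustrative values $\beta = 5$, $n = 12$, sums the modified Bessel functions from their power series using only the Python standard library, and cuts the infinite tail sum at a hypothetical cutoff `jmax`. The helper names (`bessel_i`, `cheb_t`, `p_exp`, `tail_sum`) are our own.

```python
import math

def bessel_i(j, beta, terms=60):
    """Modified Bessel function I_j(beta), summed from its power series."""
    half = beta / 2.0
    term = 1.0
    for i in range(1, j + 1):          # m = 0 term: (beta/2)^j / j!
        term *= half / i
    total = term
    for m in range(1, terms):          # term ratio is (beta/2)^2 / (m (m + j))
        term *= half * half / (m * (m + j))
        total += term
    return total

def cheb_t(j, x):
    """Chebyshev polynomial T_j(x) on [-1, 1] via T_j(cos t) = cos(j t)."""
    return math.cos(j * math.acos(max(-1.0, min(1.0, x))))

def p_exp(coeffs, beta, x):
    """Truncated Jacobi-Anger series of Eq. 6.45; coeffs[j] = I_j(beta), j = 0..n."""
    s = coeffs[0]
    for j in range(1, len(coeffs)):
        s += 2.0 * coeffs[j] * cheb_t(j, -x)
    return math.exp(-beta) * s

def tail_sum(beta, n, jmax=80):
    """Right-hand side of Eq. 6.46 with the infinite sum cut at jmax (assumption)."""
    return 2.0 * math.exp(-beta) * sum(bessel_i(j, beta) for j in range(n + 1, jmax + 1))

beta, n = 5.0, 12                      # illustrative values only
coeffs = [bessel_i(j, beta) for j in range(n + 1)]
grid = [-1.0 + 2.0 * i / 2000 for i in range(2001)]
max_err = max(abs(p_exp(coeffs, beta, x) - math.exp(-beta * (x + 1.0))) for x in grid)
err_at_minus_one = abs(p_exp(coeffs, beta, -1.0) - 1.0)
tail = tail_sum(beta, n)

print(f"max error on grid : {max_err:.3e}")
print(f"error at x = -1   : {err_at_minus_one:.3e}")
print(f"Bessel tail sum   : {tail:.3e}")
```

Under these assumptions the error at $x = -1$ should coincide with the Bessel tail sum up to rounding, and no grid point should exceed it, which is the content of the equality in Eq. 6.46.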
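Lemma 6.22 (Polynomial approximation to exponential decay $e^{-\beta(x+1)}$ adapted from [111]). $\forall\,\beta > 0$, $\epsilon \in (0, 1/2]$, there exists a polynomial $p_n$ of degree

$$
n = \left\lceil\sqrt{2\left\lceil\max[\beta e^2, \log(2/\epsilon)]\right\rceil\log(4/\epsilon)}\,\right\rceil
\tag{6.47}
$$

such that

$$
\max_{x\in[-1,1]}\left|p_n(x) - e^{-\beta(x+1)}\right| \le \epsilon.
\tag{6.48}
$$

Proof. Consider the Chebyshev expansion of the monomial

$$
x^s = 2^{1-s}\sideset{}{'}\sum_{j=0,\ s-j\ \mathrm{even}}^{s}\binom{s}{(s-j)/2}T_j(x) = \mathbb{E}\left[T_{D_s}(x)\right],
\tag{6.49}
$$

where $s \ge 0$ is an integer and $\sum'$ means the $j = 0$ term is halved. The representation as an expectation over the random variable $D_s = \sum_{j=1}^{s}Y_j$, where $Y_j = \pm 1$ with equal probabilities, follows from the identity $xT_j(x) = \frac{1}{2}\left(T_{j-1}(x) + T_{j+1}(x)\right)$. They show that the Chebyshev truncation of the monomial has error

$$
\begin{aligned}
p_{\mathrm{mon},s,n}(x) &= 2^{1-s}\sideset{}{'}\sum_{j=0,\ s-j\ \mathrm{even}}^{\min(s,n)}\binom{s}{(s-j)/2}T_j(x), \\
\epsilon_{\mathrm{mon},s,n} &= \max_{x\in[-1,1]}\left|p_{\mathrm{mon},s,n}(x) - x^s\right| \le 2^{1-s}\sum_{j=n+1,\ s-j\ \mathrm{even}}^{s}\binom{s}{(s-j)/2} \le 2e^{-n^2/(2s)},
\end{aligned}
\tag{6.50}
$$

which follows from the triangle inequality with $|T_j(x)| \le 1$ and the Chernoff bound $P(|D_s| > n) \le 2e^{-n^2/(2s)}$. By replacing each monomial up to degree $t$ in the Taylor expansion of $e^{-\beta(x+1)} = e^{-\beta}\sum_{j=0}^{\infty}\frac{(-\beta)^j}{j!}x^j$ with $p_{\mathrm{mon},j,n}$, they obtain the degree $n$ polynomial $p_n(x) = e^{-\beta}\sum_{j=0}^{t}\frac{(-\beta)^j}{j!}p_{\mathrm{mon},j,n}(x)$. They show the error of this approximation is split into two terms:

$$
\begin{aligned}
\epsilon_{\mathrm{sach},\beta,n} &= \max_{x\in[-1,1]}\left|p_n(x) - e^{-\beta(x+1)}\right| \le \epsilon_1 + \epsilon_2, \\
\epsilon_1 &= e^{-\beta}\sum_{j=n+1}^{t}\frac{\beta^j}{j!}\max_{x\in[-1,1]}\left|p_{\mathrm{mon},j,n}(x) - x^j\right| \le 2e^{-n^2/(2t)}, \\
\epsilon_2 &= e^{-\beta}\sum_{j=t+1}^{\infty}\frac{\beta^j}{j!}\max_{x\in[-1,1]}\left|x^j\right| \le e^{-\beta-t}.
\end{aligned}
\tag{6.51}
$$

By choosing $n = \left\lceil\sqrt{2t\log(4/\epsilon)}\,\right\rceil$ and $t = \left\lceil\max\{\beta e^2, \log(4/\epsilon)\}\right\rceil$, $\epsilon_1 + \epsilon_2 \le \epsilon$. $\square$

We now demonstrate how this upper bounds $\epsilon_{\exp,\beta,n}$.

Lemma 6.23 (Chebyshev truncation error of exponential decay $e^{-\beta(x+1)}$). $\forall\,\beta > 0$, $\epsilon \in (0, 1/2]$, the choice $n = \left\lceil\sqrt{2\left\lceil\max[\beta e^2, \log(2/\epsilon)]\right\rceil\log(4/\epsilon)}\,\right\rceil = O\!\left(\sqrt{(\beta + \log(1/\epsilon))\log(1/\epsilon)}\right)$ guarantees that $\epsilon_{\exp,\beta,n} \le \epsilon$.

Proof. This result follows essentially from how truncating the Jacobi-Anger expansion in Eq. 6.44 discards fewer coefficients, all of which are positive, than the procedure of Lem. 6.22. Hence the maximum truncation error occurs at $x = -1$ and is monotonically increasing with the number of coefficients omitted in the truncation. Observe that the first inequality in Eq. 6.50 is actually an equality, $\epsilon_{\mathrm{mon},s,n} = 2^{1-s}\sum_{j=n+1,\ s-j\ \mathrm{even}}^{s}\binom{s}{(s-j)/2}$. This follows from the same logic as Eq. 6.46: all coefficients are positive, thus the maximum error occurs at $x = 1$, which simultaneously maximizes all $T_j(x = 1) = 1$. Similarly, the first inequality in Eq. 6.51 is also actually an equality. Let us express the truncation error $\epsilon_{\mathrm{sach},\beta,n}$ as a Chebyshev expansion in full:

$$
\epsilon_{\mathrm{sach},\beta,n} = 2e^{-\beta}\max_{x\in[-1,1]}\left|\sum_{j=n+1}^{t}\frac{(\beta/2)^j}{j!}\sum_{k=n+1,\ j-k\ \mathrm{even}}^{j}\binom{j}{(j-k)/2}T_k(-x) + \sum_{j=t+1}^{\infty}\frac{(\beta/2)^j}{j!}\sideset{}{'}\sum_{k=0,\ j-k\ \mathrm{even}}^{j}\binom{j}{(j-k)/2}T_k(-x)\right|.
\tag{6.52}
$$

Note that we have used $(-\beta)^jT_k(x) = \beta^jT_k(-x)$ as all pairs $j - k$ are even. Thus $\epsilon_{\mathrm{sach},\beta,n}$ is maximized at $x = -1$, where $T_k(-x) = 1$ in the sum above. This can be compared with

$$
\begin{aligned}
\epsilon_{\exp,\beta,n} &= \max_{x\in[-1,1]}2e^{-\beta}\left|\sum_{j=n+1}^{\infty}\frac{(\beta/2)^j}{j!}\sum_{k=n+1,\ j-k\ \mathrm{even}}^{j}\binom{j}{(j-k)/2}T_k(-x)\right| \\
&= \epsilon_{\mathrm{sach},\beta,n} - 2e^{-\beta}\sum_{j=t+1}^{\infty}\frac{(\beta/2)^j}{j!}\sideset{}{'}\sum_{k=0,\ j-k\ \mathrm{even}}^{n}\binom{j}{(j-k)/2} \le \epsilon_{\mathrm{sach},\beta,n}.
\end{aligned}
\tag{6.53}
$$

More intuitively, both $\epsilon_{\exp,\beta,n}$ and $\epsilon_{\mathrm{sach},\beta,n}$ sum over all coefficients $j > n$ in the Chebyshev expansion, but $\epsilon_{\mathrm{sach},\beta,n}$ in addition sums over some positive coefficients corresponding to $j \le n$. Thus the upper bound of Lem. 6.22 on $\epsilon_{\mathrm{sach},\beta,n}$ applies to $\epsilon_{\exp,\beta,n}$. $\square$

In the following, we will bound all errors of our polynomial approximations in terms of $\epsilon_{\exp,\beta,n}$, a partial sum over Bessel functions.

Corollary 6.24 (Polynomial approximation to the Gaussian function $e^{-(\gamma x)^2}$). $\forall\,\gamma > 0$, $\epsilon \in (0, 1/2]$ the even polynomial $p_{\mathrm{gauss},\gamma,n}$ of even degree $n = O\!\left(\sqrt{(\gamma^2 + \log(1/\epsilon))\log(1/\epsilon)}\right)$ satisfies

$$
\begin{aligned}
p_{\mathrm{gauss},\gamma,n}(x) &= p_{\exp,\gamma^2/2,n/2}(2x^2 - 1) = e^{-\gamma^2/2}\left(I_0(\gamma^2/2) + 2\sum_{j=1}^{n/2}I_j(\gamma^2/2)(-1)^jT_{2j}(x)\right), \\
\epsilon_{\mathrm{gauss},\gamma,n} &= \max_{x\in[-1,1]}\left|p_{\mathrm{gauss},\gamma,n}(x) - e^{-(\gamma x)^2}\right| \le \epsilon_{\exp,\gamma^2/2,n/2} \le \epsilon.
\end{aligned}
\tag{6.54}
$$

Proof. This follows from Eq. 6.44 by a simple change of variables. Let $x' = T_2(x) = 2x^2 - 1$. Thus $e^{-\beta(x'+1)} = e^{-2\beta x^2} = e^{-(\gamma x)^2}$ with $\beta = \gamma^2/2$. As $x'$ maps $[-1, 1]$ to the domain $[-1, 1]$ of $f_{\exp,\beta}(x)$, the definition Eq. 6.54 results. Using the Chebyshev semigroup property $T_j(-T_2(x)) = (-1)^jT_{2j}(x)$, $p_{\mathrm{gauss},\gamma,n}$ is an even polynomial of degree $n$ and its approximation error is obtained by substitution into Eq.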
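6.46. $\square$

A polynomial approximation to the error function follows immediately by integrating $p_{\mathrm{gauss},\gamma,n}$.

Corollary 6.25 (Polynomial approximation to the error function $\mathrm{erf}(kx)$). $\forall\,k > 0$, $\epsilon \in (0, O(1)]$ the odd polynomial $p_{\mathrm{erf},k,n}$ of odd degree $n = O\!\left(\sqrt{(k^2 + \log(1/\epsilon))\log(1/\epsilon)}\right)$ satisfies

$$
\begin{aligned}
p_{\mathrm{erf},k,n}(x) &= \frac{2ke^{-k^2/2}}{\sqrt{\pi}}\left(I_0(k^2/2)\,x + \sum_{j=1}^{(n-1)/2}I_j(k^2/2)(-1)^j\left(\frac{T_{2j+1}(x)}{2j+1} - \frac{T_{2j-1}(x)}{2j-1}\right)\right), \\
\epsilon_{\mathrm{erf},k,n} &= \max_{x\in[-1,1]}\left|p_{\mathrm{erf},k,n}(x) - \mathrm{erf}(kx)\right| \le \frac{4k}{\sqrt{\pi}\,n}\,\epsilon_{\mathrm{gauss},k,n-1} \le \epsilon.
\end{aligned}
\tag{6.55}
$$

Proof. From the definition of the error function $\mathrm{erf}(kx) = \frac{2}{\sqrt{\pi}}\int_0^{kx}e^{-y^2}\,dy = \frac{2k}{\sqrt{\pi}}\int_0^{x}e^{-(ky)^2}\,dy$, the polynomial $p_{\mathrm{erf},k,n+1}(x) = \frac{2k}{\sqrt{\pi}}\int_0^{x}p_{\mathrm{gauss},k,n}(y)\,dy$ follows directly from integrating Eq. 6.54 term-by-term using the identity $\int T_j(x)\,dx = \frac{1}{2}\left(\frac{T_{j+1}(x)}{j+1} - \frac{T_{j-1}(x)}{j-1}\right)$. The error of the remaining terms is bounded through

$$
\begin{aligned}
\epsilon_{\mathrm{erf},k,n} &= \frac{2ke^{-k^2/2}}{\sqrt{\pi}}\max_{x\in[-1,1]}\left|\sum_{j=(n+1)/2}^{\infty}I_j(k^2/2)(-1)^j\left(\frac{T_{2j+1}(x)}{2j+1} - \frac{T_{2j-1}(x)}{2j-1}\right)\right| \\
&\le \frac{4ke^{-k^2/2}}{\sqrt{\pi}\,n}\sum_{j=(n+1)/2}^{\infty}\left|I_j(k^2/2)\right| \le \frac{4k}{\sqrt{\pi}\,n}\,\epsilon_{\mathrm{gauss},k,n-1}.
\end{aligned}
\tag{6.56}
$$

The error of $p_{\mathrm{erf},k,n}$ is thus at most $\frac{4k}{\sqrt{\pi}\,n}\,\epsilon_{\exp,k^2/2,(n-1)/2}$. However, $n = \Omega(k\log^{1/2}(1/\epsilon))$. Thus $\frac{4k}{\sqrt{\pi}\,n} = O(\log^{-1/2}(1/\epsilon)) = O(1)$ and does not make the scaling any worse. $\square$

A polynomial approximation to the shifted error function follows by a change of variables.

Corollary 6.26 (Polynomial approximation to the shifted error function $\mathrm{erf}(k(x - \delta))$). $\forall\,k > 0$, $\delta \in [-1, 1]$, $\epsilon \in (0, O(1)]$ the polynomial $p_{\mathrm{erf},k,\delta,n}(x) = p_{\mathrm{erf},2k,n}((x - \delta)/2)$ of odd degree $n = O\!\left(\sqrt{(k^2 + \log(1/\epsilon))\log(1/\epsilon)}\right)$ satisfies

$$
\epsilon_{\mathrm{erf},k,\delta,n} = \max_{x\in[-1,1]}\left|p_{\mathrm{erf},k,\delta,n}(x) - \mathrm{erf}(k(x - \delta))\right| \le \epsilon_{\mathrm{erf},2k,n} \le \epsilon.
\tag{6.57}
$$

Proof. This follows trivially from $\mathrm{erf}(k(x - \delta)) = \mathrm{erf}\!\left(2k\,\frac{x - \delta}{2}\right)$. Note that we have doubled the degree of our polynomials in order to double the width of the domain, which we exploit to allow translations. $\square$

This polynomial approximation of the shifted error function is the basic ingredient we use to construct the more complicated functions sgn and rect through Lems. 6.19, 6.20.

Corollary 6.27 (Polynomial approximation to the sign function $\mathrm{sgn}(x - \delta)$).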
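$\forall\,\kappa > 0$, $\delta \in [-1, 1]$, $\epsilon \in (0, O(1)]$ the polynomial $p_{\mathrm{sgn},\kappa,\delta,n}(x) = p_{\mathrm{erf},k,\delta,n}(x)$ of odd degree $n = O\!\left(\frac{1}{\kappa}\log(1/\epsilon)\right)$, where $k = \frac{\sqrt{2}}{\kappa}\log^{1/2}\!\left(\frac{2}{\pi\epsilon_1^2}\right)$, satisfies

$$
\epsilon_{\mathrm{sgn},\kappa,\delta,n} = \max_{x\in[-1,\delta-\kappa/2]\cup[\delta+\kappa/2,1]}\left|p_{\mathrm{erf},k,\delta,n}(x) - \mathrm{sgn}(x - \delta)\right| \le \epsilon_{\mathrm{erf},k,\delta,n} + \epsilon_1 \le 2\epsilon_{\mathrm{erf},k,\delta,n} \le \epsilon.
\tag{6.58}
$$

Proof. The equation for $k$ comes from Lem. 6.19. We then choose $\epsilon_1 = \epsilon_{\mathrm{erf},k,\delta,n}$, which defines an implicit equation for $\epsilon_1$ and doubles the error. $\square$

Corollary 6.28 (Polynomial approximation to the rectangular function $\mathrm{rect}(x/w)$). $\forall\,\kappa \in (0, 2]$, $w \in [0, 2 - \kappa]$, $\epsilon \in (0, O(1)]$, the even polynomial

$$
p_{\mathrm{rect},w,\kappa,n}(x) = \frac{1}{2}\left(p_{\mathrm{sgn},\kappa,-(w+\kappa)/2,n+1}(x) + p_{\mathrm{sgn},\kappa,-(w+\kappa)/2,n+1}(-x)\right)
\tag{6.59}
$$

of even degree $n = O\!\left(\frac{1}{\kappa}\log(1/\epsilon)\right)$ satisfies

$$
\epsilon_{\mathrm{rect},w,\kappa,n} = \max_{|x|\in[0,w/2]\cup[w/2+\kappa,1]}\left|p_{\mathrm{rect},w,\kappa,n}(x) - \mathrm{rect}(x/w)\right| \le \epsilon_{\mathrm{sgn},\kappa,-(w+\kappa)/2,n+1} \le \epsilon.
\tag{6.60}
$$

Proof. This follows from the construction of a rectangular function with two sign functions in Lem. 6.20. $\square$

Corollary 6.29 (Polynomial approximation to the truncated linear function $f_{\mathrm{lin},\Gamma}(x)$). $\forall\,\Gamma \in (0, 1/2]$, $\epsilon \in (0, O(\Gamma)]$, the odd polynomial $p_{\mathrm{lin},\Gamma,n}(x) = \frac{x}{2\Gamma}\,p_{\mathrm{rect},2\Gamma,2\Gamma,n-1}(x)$ of odd degree $n = O\!\left(\frac{1}{\Gamma}\log(1/\epsilon)\right)$ satisfies

$$
\epsilon_{\mathrm{lin},\Gamma,n} = \max_{|x|\in[0,\Gamma]}\frac{2\Gamma}{|x|}\left|p_{\mathrm{lin},\Gamma,n}(x) - \frac{x}{2\Gamma}\right| \le \epsilon_{\mathrm{rect},2\Gamma,2\Gamma,n-1} \le \epsilon.
\tag{6.61}
$$

Proof. This follows from multiplying a rectangular function with a linear function in Lem. 6.21. One subtlety arises here: the error of $p_{\mathrm{rect},2\Gamma,2\Gamma,n-1}$ is bounded by $\epsilon_{\mathrm{rect},2\Gamma,2\Gamma,n-1}$ in the domain $|x| \in [3\Gamma, 1]$. Thus multiplying by $\frac{x}{2\Gamma}$ increases this error to at most $\frac{\epsilon_{\mathrm{rect},2\Gamma,2\Gamma,n-1}}{2\Gamma}$. However, the quantum signal processing conditions in Thm. 3.4 require all polynomials to be bounded by 1. This implicitly constrains us to choose $n$ such that $\epsilon_{\mathrm{rect},2\Gamma,2\Gamma,n-1} \le 2\Gamma$ is also satisfied. $\square$

In all the above cases, the entire functions that are being approximated are bounded by 1. When the approximation error is $\epsilon$, the resulting polynomial is then bounded by $1 + \epsilon$. In such an event, we simply rescale these polynomials by a factor $\frac{1}{1+\epsilon}$. At worst, this only doubles the error of the approximation.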
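The entire-function constructions behind these corollaries (sign, then rect, then truncated linear, as in Lems. 6.19 to 6.21) can be checked numerically before any Chebyshev truncation. The sketch below is illustrative only. It assumes the steepness $k = \frac{\sqrt{2}}{\kappa}\log^{1/2}\!\left(\frac{2}{\pi\epsilon^2}\right)$ as our reading of Lem. 6.19, picks hypothetical values $\Gamma = 0.1$ and $\epsilon = 10^{-3}$, and uses helper names (`steepness`, `f_sgn`, `f_rect`, `f_lin`) that are our own.

```python
import math

def steepness(kappa, eps):
    """Assumed k of Lem. 6.19: (sqrt(2)/kappa) * log^{1/2}(2 / (pi eps^2))."""
    return math.sqrt(2.0) / kappa * math.sqrt(math.log(2.0 / (math.pi * eps ** 2)))

def f_sgn(x, kappa, eps):
    """Entire approximation erf(k x) to sgn(x), accurate for |x| >= kappa/2."""
    return math.erf(steepness(kappa, eps) * x)

def f_rect(x, w, kappa, eps):
    """Entire approximation to rect(x/w) built from two shifted error functions."""
    k = steepness(kappa, eps)
    delta = (w + kappa) / 2.0
    return 0.5 * (math.erf(k * (x + delta)) + math.erf(k * (-x + delta)))

def f_lin(x, gamma, eps):
    """Entire approximation to the truncated linear function x / (2 gamma)."""
    return x / (2.0 * gamma) * f_rect(x, 2.0 * gamma, 2.0 * gamma, eps)

gamma, eps = 0.1, 1e-3                 # illustrative values only
kappa = 2.0 * gamma
grid = [-1.0 + 2.0 * i / 4000 for i in range(4001)]
inner = [x for x in grid if abs(x) <= gamma]          # where f_rect should be ~1
outer = [x for x in grid if abs(x) >= 3.0 * gamma]    # where f_rect should be ~0

sgn_err = max(abs(f_sgn(x, kappa, eps) - math.copysign(1.0, x))
              for x in grid if abs(x) >= kappa / 2.0)
rect_err_inner = max(abs(f_rect(x, 2 * gamma, kappa, eps) - 1.0) for x in inner)
rect_err_outer = max(abs(f_rect(x, 2 * gamma, kappa, eps)) for x in outer)
lin_sup = max(abs(f_lin(x, gamma, eps)) for x in grid)
nonuniform_ok = all(
    abs(f_lin(x, gamma, eps) - x / (2.0 * gamma)) <= eps * abs(x) / (2.0 * gamma) + 1e-15
    for x in inner
)

print(sgn_err, rect_err_inner, rect_err_outer, lin_sup, nonuniform_ok)
```

With these assumed parameters, the sign and rect errors should stay below $\epsilon$ on their respective domains, the linear approximation should remain bounded by 1 on $[-1, 1]$, and its error on $|x| \le \Gamma$ should scale with $|x|$, which is the non-uniform property the polynomial $p_{\mathrm{lin},\Gamma,n}$ is designed to inherit.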
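We also emphasize that our proposed sequence of polynomial transformations serves primarily to prove their asymptotic scaling. In practice, close-to-optimal constant factors in the degree of these polynomials can be obtained by a direct Chebyshev truncation of the entire functions.

6.6.2 Polynomials for Low-Energy Uniform Spectral Amplification

The proof of Thm. 6.5 requires a polynomial approximation $p_{\mathrm{gap},\Delta,n}(x)$ (Lem. 6.31) to the truncated linear function

$$
f_{\mathrm{gap},\Delta}(x)
\begin{cases}
= \dfrac{x + 1 - \Delta}{\Delta}, & x \in [-1, -1 + \Delta], \\[4pt]
\in [-1, 1], & \text{otherwise}.
\end{cases}
\tag{6.62}
$$

Our strategy is to construct an entire function $f_{\mathrm{gap},\Delta,\epsilon}$ that approximates $f_{\mathrm{gap},\Delta}$ with error $\epsilon$ over the domain of interest. Entire functions are desirable as they are analytic on the entire complex plane. This implies that truncating their expansion $f_{\mathrm{gap},\Delta,\epsilon}(x) = \sum_{j=0}^{\infty}a_jT_j(x)$ in the Chebyshev basis produces polynomials with a uniform approximation error that scales almost optimally with the degree $n$ [126]. We build $f_{\mathrm{gap},\Delta,\epsilon}$ by using the entire approximation to the sign function $\mathrm{sgn}(x)$ in Lem. 6.19 and some intermediate results on the error function $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-y^2}\,dy$.

Lemma 6.30 (Entire approximation to the gapped linear function $f_{\mathrm{gap},\Delta}(x)$). $\forall\,\Delta \in [0, 1/2]$, $x \in [-1, \infty)$, $\epsilon \in (0, 1/\sqrt{2e\pi}]$, the function $f_{\mathrm{gap},\Delta,\epsilon}(x)$ satisfies

$$
\begin{aligned}
f_{\mathrm{gap},\Delta,\epsilon}(x) &= \frac{x + 1 - \Delta}{\Delta}\cdot\frac{1 - f_{\mathrm{sgn},\Delta,2\epsilon}(x + 1 - 3\Delta/2)}{2}, \\
\epsilon &\ge \max_{x\in[-1,-1+\Delta]}\left|f_{\mathrm{gap},\Delta,\epsilon}(x) - \frac{x + 1 - \Delta}{\Delta}\right|, \\
0 &\le f_{\mathrm{gap},\Delta,\epsilon}(x) \le 1 \ \text{ for } x \in [-1 + \Delta, \infty), \\
\epsilon/10 &\ge \max_{x\in[1-\Delta,1]}\left|f_{\mathrm{gap},\Delta,\epsilon}(x)\right|.
\end{aligned}
\tag{6.63}
$$

Proof. Let us derive bounds on the following regions.

$x \in [-1, -1 + \Delta]$: From Lem. 6.19, $\frac{1}{2}\left(1 - f_{\mathrm{sgn},\Delta,2\epsilon}(x + 1 - 3\Delta/2)\right)$ approximates the function 1 with error $\epsilon$. By multiplying both sides with $\frac{x + 1 - \Delta}{\Delta}$, $\left|f_{\mathrm{gap},\Delta,\epsilon}(x) - \frac{x + 1 - \Delta}{\Delta}\right| \le \epsilon$.

$x \in [-1 + \Delta, -1 + 3\Delta/2]$: From Lem. 6.19, $\frac{1}{2}\left(1 - f_{\mathrm{sgn},\Delta,2\epsilon}(x + 1 - 3\Delta/2)\right) \in [1/2, 1]$. In this region, $\frac{x + 1 - \Delta}{\Delta} \in [0, 1/2]$. Thus by multiplying, $f_{\mathrm{gap},\Delta,\epsilon}(x) \in [0, 1/2]$.

$x \in [-1 + 3\Delta/2, 1 - \Delta]$: From the upper bound $\mathrm{erfc}(x) \le e^{-x^2}$, one obtains $f_{\mathrm{gap},\Delta,\epsilon}(x) \le \frac{x + 1 - \Delta}{\Delta}e^{-k^2(x + 1 - 3\Delta/2)^2}$, where $k = \frac{\sqrt{2}}{\Delta}\log^{1/2}\!\left(\frac{1}{2\pi\epsilon^2}\right)$. The worst case occurs when $k$ is smallest, hence when $\epsilon = 1/\sqrt{2e\pi}$ is largest.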
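Thus the upper bound is maximized with value $\frac{1+\sqrt{5}}{4}e^{(\sqrt{5}-3)/4} \approx 0.7$ at $x = -1 + \frac{5+\sqrt{5}}{4}\Delta < -1 + 2\Delta$.

$x \in [1 - \Delta, \infty)$: The upper bound obtained for $x \in [-1 + 3\Delta/2, 1 - \Delta]$ still applies here and is monotonically decreasing with $x$. Thus it is maximized when $\Delta = 1/2$ is largest and at $x = 1 - \Delta$. With this upper bound, $f_{\mathrm{gap},1/2,\epsilon}(1/2) \le 2e^{-9k^2/16} = 32\sqrt{2}\,\pi^{9/2}\epsilon^{9} \le \epsilon/10$ by substituting $k$ and then using the fact $\epsilon \le 1/\sqrt{2e\pi}$.

$x \in [-1 + \Delta, \infty)$: $\frac{x + 1 - \Delta}{\Delta}$ and $\frac{1}{2}\left(1 - f_{\mathrm{sgn},\Delta,2\epsilon}(x + 1 - 3\Delta/2)\right)$ are both positive, thus $f_{\mathrm{gap},\Delta,\epsilon}(x)$ is positive. $\square$

We now construct a degree $n$ polynomial approximation to $f_{\mathrm{gap},\Delta}(x)$.

Lemma 6.31 (Polynomial approximation to the gapped linear function $f_{\mathrm{gap},\Delta}(x)$). $\forall\,\epsilon \le O(1)$, there exists an odd polynomial $p_{\mathrm{gap},\Delta,n}$ of degree $n = O\!\left(\Delta^{-1/2}\log^{3/2}\!\left(\frac{1}{\Delta\epsilon}\right)\right)$ such that

$$
\max_{x\in[-1,-1+\Delta]}\left|p_{\mathrm{gap},\Delta,n}(x) - \frac{x + 1 - \Delta}{\Delta}\right| \le \epsilon
\quad\text{and}\quad
\max_{x\in[-1,1]}\left|p_{\mathrm{gap},\Delta,n}(x)\right| \le 1.
\tag{6.64}
$$

Proof. Let us expand $f_{\mathrm{gap},\Delta,\epsilon_1}(x) = \sum_{j=0}^{\infty}a_jT_j(x)$ in the Chebyshev basis. Then the truncation error of $p_n(x) = \sum_{j=0}^{n}a_jT_j(x)$ has a well-known upper bound from Thm. 8.2 of [126]:

$$
\max_{x\in[-1,1]}\left|p_n(x) - f_{\mathrm{gap},\Delta,\epsilon_1}(x)\right| \le \epsilon_2 = \frac{2M\rho^{-n}}{\rho - 1},
\qquad
M = \max_{z\in E_\rho}\left|f_{\mathrm{gap},\Delta,\epsilon_1}(z)\right|,
\tag{6.65}
$$

for any elliptical radius $\rho > 1$, where $E_\rho = \{z : z = \frac{1}{2}(\rho e^{i\theta} + \rho^{-1}e^{-i\theta}),\ \theta \in [0, 2\pi)\}$ is the Bernstein ellipse. We will need an upper bound on $|\mathrm{erf}(re^{i\phi})|$ for $r > 0$, $\phi \in [0, 2\pi)$:

$$
\begin{aligned}
\left|\mathrm{erf}(re^{i\phi})\right| &= \frac{2}{\sqrt{\pi}}\left|\int_0^r e^{-(r'e^{i\phi})^2}\,dr'\right| = \frac{2}{\sqrt{\pi}}\left|\int_0^r e^{-r'^2\cos(2\phi)}e^{-ir'^2\sin(2\phi)}\,dr'\right| \\
&\le \frac{2}{\sqrt{\pi}}\int_0^r e^{-r'^2\cos(2\phi)}\,dr' \le \frac{2r}{\sqrt{\pi}}\max\{1, e^{-r^2\cos(2\phi)}\} = \frac{2r}{\sqrt{\pi}}\max\{1, e^{\mathrm{Re}(-(re^{i\phi})^2)}\}.
\end{aligned}
\tag{6.66}
$$

We also need the upper bounds $|z|^2 = \frac{1}{4}\left(\rho^2 + \rho^{-2} + 2\cos(2\theta)\right) \le \rho^2$. Let $k = \frac{\sqrt{2}}{\Delta}\log^{1/2}\!\left(\frac{1}{2\pi\epsilon_1^2}\right)$, so that $|k(z + 1 - 3\Delta/2)| \le k(|z| + 1 + 3\Delta/2) \le k(\rho + 1 + 3\Delta/2)$. Then

$$
\begin{aligned}
M &= \max_{z\in E_\rho}\left|\frac{z + 1 - \Delta}{\Delta}\cdot\frac{1 - \mathrm{erf}(k(z + 1 - 3\Delta/2))}{2}\right| \\
&\le \max_{z\in E_\rho}\frac{|z| + 1 + \Delta}{2\Delta}\left(1 + \left|\mathrm{erf}(k(z + 1 - 3\Delta/2))\right|\right) \\
&\le O(\mathrm{poly}(\rho, \Delta^{-1}))\max_{z\in E_\rho}\left(1 + \frac{2|k(z + 1 - 3\Delta/2)|}{\sqrt{\pi}}\left(1 + e^{\mathrm{Re}(-(k(z + 1 - 3\Delta/2))^2)}\right)\right) \\
&\le O(\mathrm{poly}(\rho, \Delta^{-1}))\max_{z\in E_\rho}e^{\mathrm{Re}(-(k(z + 1 - 3\Delta/2))^2)}.
\end{aligned}
\tag{6.67}
$$

By taking derivatives with respect to $\theta$, the maximum value of the exponent is

$$
a = \max_{\theta\in[0,2\pi)}\mathrm{Re}\left(-(k(z + 1 - 3\Delta/2))^2\right) = k^2\,\frac{(\rho^2 - 1)^2\left(2 - (2 - 3\Delta)^2\rho^2 + 2\rho^4\right)}{8\rho^2(1 + \rho^4)}.
\tag{6.68}
$$

Let us choose $\rho = e^{\alpha}$, where $\alpha = O(1/\sqrt{k^2\Delta})$. Then $a = O(1)$. Substituting the value of $k$, we have $\alpha = O\!\left(\sqrt{\Delta/\log(1/\epsilon_1)}\right)$, and $M = O(\mathrm{poly}(\Delta^{-1}))$. Thus from Eq. 6.65,

$$
\epsilon_2 = O\!\left(\mathrm{poly}(\Delta^{-1})\,e^{-n\sqrt{\Delta/\log(1/\epsilon_1)}}\right)
\ \Rightarrow\
n = O\!\left(\Delta^{-1/2}\log^{3/2}\!\left(\frac{\max\{1/\epsilon_1, 1/\epsilon_2\}}{\Delta}\right)\right),
\tag{6.69}
$$

where the last equation applies $\log(\mathrm{poly}(\Delta^{-1})/\epsilon) = O\!\left(\log\frac{1}{\Delta\epsilon}\right)$. Thus the total approximation error is $\max_{x\in[-1,-1+\Delta]}\left|p_n(x) - \frac{x + 1 - \Delta}{\Delta}\right| \le \epsilon_1 + \epsilon_2$. Let $p_{\mathrm{gap,sym},\Delta,n}(x) = p_n(x) - p_n(-x)$, which is odd. Using the bounds of Lem. 6.30, this increases the error in $x \in [-1, -1 + \Delta]$ to at most $c(\epsilon_1 + \epsilon_2)$ for some constant $c = O(1)$. By combining these bounds, we also have $\max_{x\in[-1,1]}|p_{\mathrm{gap,sym},\Delta,n}(x)| \le 1 + c(\epsilon_1 + \epsilon_2)$. Thus we rescale this to obtain $p_{\mathrm{gap},\Delta,n}(x) = \frac{p_{\mathrm{gap,sym},\Delta,n}(x)}{1 + c(\epsilon_1 + \epsilon_2)}$. Using $\left|\frac{1}{1+x} - 1\right| \le x$ for $x \ge 0$, this increases the error by at most a constant factor, $\max_{x\in[-1,-1+\Delta]}\left|p_{\mathrm{gap},\Delta,n}(x) - \frac{x + 1 - \Delta}{\Delta}\right| = O(\epsilon_1 + \epsilon_2)$, so choose $\epsilon_1 = \epsilon_2 = O(\epsilon)$. $\square$

6.7 Conclusions

We have combined ideas from qubitization and quantum signal processing to solve, in a general setting, the uniform spectral amplification problem of implementing a low-distortion expansion of the spectrum of Hamiltonians. One most surprising application of our results is the simulation of sparse Hamiltonians, where we obtain an algorithm with linear complexity in $O(t(d\Lambda_{\max}\Lambda_1)^{1/2})$, excluding logarithmic factors. This is particularly important as the best-case scaling $O(\sqrt{d})$ is essential to an optimal realization of the fundamental quantum search algorithm.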
However, this improvement also appears impossible as prior art claims that $\Theta(td\|\hat{H}\|_{\max})$ queries is optimal. Nevertheless, the two are actually consistent. In the situation where information on $\|\hat{H}\|_1$ is unavailable, previous results are recovered as one may simply choose the worst-case $\Lambda_1 = d\Lambda_{\max} = d\|\hat{H}\|_{\max}$. This naturally leads to the question of whether further improvement is possible. For instance, if information on $\|\hat{H}\|$ rather than $\|\hat{H}\|_1$ is made available, our lower bound is consistent with the stronger statement of $\Omega(t(d\|\hat{H}\|_{\max}\|\hat{H}\|)^{1/2})$ queries.

More generally, the universality of our results motivates related future directions. Thus far, a large number of common oracles used to describe Hamiltonians to quantum computers map to the standard-form without much difficulty. Rather than focusing on improving Hamiltonian simulation algorithms in the query model, perhaps an emphasis on improving the quality of encoding, through a reduced normalization constant, would be more insightful, easier, and also lead to greater generality. Combined with the extremely low overhead of our techniques, algorithms obtained in this manner could be practical on digital quantum computers sooner rather than later.

Bibliography

[1] Milton Abramowitz, Irene A Stegun, et al. Handbook of mathematical functions. Applied mathematics series, 55:62, 1966.
[2] Dorit Aharonov. A simple proof that Toffoli and Hadamard are quantum universal. arXiv preprint quant-ph/0301040, 2003.
[3] Dorit Aharonov and Amnon Ta-Shma. Adiabatic quantum state generation and statistical zero knowledge. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing, STOC '03, pages 20-29, New York, NY, USA, 2003. ACM.
[4] Dorit Aharonov, Wim Van Dam, Julia Kempe, Zeph Landau, Seth Lloyd, and Oded Regev. Adiabatic quantum computation is equivalent to standard quantum computation. SIAM Rev., 50(4):755-787, 2008.
[5] K. Arai, C. Belthangady, H. Zhang, N. Bar-Gill, S. J. DeVience, P. Cappellaro, A.
Yacoby, and R. L. Walsworth. Fourier magnetic imaging with nanoscale resolution and compressed sensing speed-up using electronic spins in diamond. Nat. Nano., 10(10):859-864, October 2015.
[6] R. Barends, J. Kelly, A. Megrant, A. Veitia, D. Sank, E. Jeffrey, T. C. White, J. Mutus, A. G. Fowler, B. Campbell, Y. Chen, Z. Chen, B. Chiaro, A. Dunsworth, C. Neill, P. O'Malley, P. Roushan, A. Vainsencher, J. Wenner, A. N. Korotkov, A. N. Cleland, and John M. Martinis. Superconducting quantum circuits at the surface code threshold for fault tolerance. Nature, 508(7497):500-503, April 2014.
[7] R. Barends, A. Shabani, L. Lamata, J. Kelly, A. Mezzacapo, U. Las Heras, R. Babbush, A. G. Fowler, B. Campbell, Yu Chen, Z. Chen, B. Chiaro, A. Dunsworth, E. Jeffrey, E. Lucero, A. Megrant, J. Y. Mutus, M. Neeley, C. Neill, P. J. J. O'Malley, C. Quintana, P. Roushan, D. Sank, A. Vainsencher, J. Wenner, T. C. White, E. Solano, H. Neven, and John M. Martinis. Digitized adiabatic quantum computing with a superconducting circuit. Nature, 534(7606):222-226, June 2016.
[8] Robert Beals, Harry Buhrman, Richard Cleve, Michele Mosca, and Ronald de Wolf. Quantum lower bounds by polynomials. J. Assoc. Comput. Mach., 48(4):778-797, July 2001.
[9] David Beckman, Amalavoyal N. Chari, Srikrishna Devabhaktuni, and John Preskill. Efficient networks for quantum factoring. Phys. Rev. A, 54:1034-1063, Aug 1996.
[10] D. W. Berry, A. M. Childs, and R. Kothari. Hamiltonian simulation with nearly optimal dependence on all parameters. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS), pages 792-809, Oct 2015.
[11] Dominic W. Berry, Graeme Ahokas, Richard Cleve, and Barry C. Sanders. Efficient quantum algorithms for simulating sparse Hamiltonians. Comm. Math. Phys., 270(2):359-371, 2007.
[12] Dominic W. Berry and Andrew M. Childs. Black-box Hamiltonian simulation and unitary implementation. Quantum Info. Comput., 12(1-2):29-62, January 2012.
[13] Dominic W. Berry, Andrew M.
Childs, Richard Cleve, Robin Kothari, and Rolando D. Somma. Exponential improvement in precision for simulating sparse Hamiltonians. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing, STOC '14, pages 283-292, New York, NY, USA, 2014. ACM.
[14] Dominic W. Berry, Andrew M. Childs, Richard Cleve, Robin Kothari, and Rolando D. Somma. Simulating Hamiltonian dynamics with a truncated Taylor series. Phys. Rev. Lett., 114:090502, Mar 2015.
[15] John P. Boyd. Rootfinding for a transcendental equation without a first guess: Polynomialization of Kepler's equation through Chebyshev polynomial expansion of the sine. Appl. Numer. Math., 57(1):12-18, 2007.
[16] Fernando GSL Brandao and Krysta Svore. Quantum speed-ups for semidefinite programming. arXiv preprint arXiv:1609.05537, 2016.
[17] Gilles Brassard, Peter Hoyer, and Alain Tapp. Quantum counting. In Automata, Languages and Programming, pages 820-831. Springer, 1998.
[18] Kenneth R. Brown, Aram W. Harrow, and Isaac L. Chuang. Arbitrarily accurate composite pulse sequences. Phys. Rev. A, 70:052318, Nov 2004.
[19] A. R. Calderbank and Peter W. Shor. Good quantum error-correcting codes exist. Phys. Rev. A, 54:1098-1105, Aug 1996.
[20] T. Caneva, M. Murphy, T. Calarco, R. Fazio, S. Montangero, V. Giovannetti, and G. E. Santoro. Optimal control at the quantum speed limit. Phys. Rev. Lett., 103:240501, Dec 2009.
[21] Xie Chen, Bei Zeng, Zheng-Cheng Gu, Beni Yoshida, and Isaac L Chuang. Gapped two-body Hamiltonian whose unique ground state is universal for one-way quantum computation. Phys. Rev. Lett., 102(22):220501, 2009.
[22] Andrew M. Childs. Universal computation by quantum walk. Phys. Rev. Lett., 102:180501, May 2009.
[23] Andrew M. Childs. On the relationship between continuous- and discrete-time quantum walk. Commun. Math. Phys., 294(2):581-603, 2010.
[24] Andrew M. Childs, David Gosset, and Zak Webb. Universal computation by multiparticle quantum walk. Science, 339(6121):791-794, 2013.
[25] Andrew M. Childs and Robin Kothari. Limitations on the simulation of non-sparse Hamiltonians. Quantum Info. Comput., 10(7):669-684, July 2010.
[26] Andrew M. Childs and Robin Kothari. Theory of Quantum Computation, Communication, and Cryptography, pages 94-103. Springer Berlin Heidelberg, 2011.
[27] Andrew M Childs, Robin Kothari, and Rolando D Somma. Quantum linear systems algorithm with exponentially improved dependence on precision. arXiv preprint arXiv:1511.02306, 2015.
[28] Andrew M. Childs and Nathan Wiebe. Hamiltonian simulation using linear combinations of unitary operations. Quantum Info. Comput., 12(11-12):901-924, November 2012.
[29] Anirban Narayan Chowdhury and Rolando D Somma. Quantum algorithms for Gibbs sampling and hitting-time estimation. Quantum Info. Comput., 17(1 & 2):0041-0064, 2017.
[30] John Clarke and Frank K Wilhelm. Superconducting quantum bits. Nature, 453(7198):1031-1042, 2008.
[31] Richard Cleve, Daniel Gottesman, Michele Mosca, Rolando D. Somma, and David Yonge-Mallo. Efficient discrete-time simulations of continuous-time quantum query algorithms. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC '09, pages 409-416, New York, NY, USA, 2009. ACM.
[32] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth. On the Lambert W function. Adv. Comput. Math., 5(1):329-359, 1996.
[33] Holly K. Cummins, Gavin Llewellyn, and Jonathan A. Jones. Tackling systematic errors in quantum logic gates with composite rotations. Phys. Rev. A, 67:042308, Apr 2003.
[34] Arnab Das and Bikas K. Chakrabarti. Colloquium: Quantum annealing and analog quantum computation. Rev. Mod. Phys., 80:1061-1081, Sep 2008.
[35] Ammar Daskin and Sabre Kais. An ancilla-based quantum simulation framework for non-unitary matrices. Quant. Inform. Process., 16(1):33, 2016.
[36] D. Deutsch. Quantum theory, the Church-Turing principle and the universal quantum computer.
Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 400(1818):97-117, 1985.
[37] Simon J Devitt, William J Munro, and Kae Nemoto. Quantum error correction for beginners. Rep. Progr. Phys., 76(7):076001, 2013.
[38] C. L. Dolph. A current distribution for broadside arrays which optimizes the relationship between beam width and side-lobe level. Proc. IRE, 34(6):335-348, June 1946.
[39] Yonina C Eldar and Alan V Oppenheim. Quantum signal processing. IEEE Signal Processing Magazine, 19(6):12-32, 2002.
[40] Alexandre Eremenko and Peter Yuditskii. Uniform approximation of sgn x by polynomials and entire functions. J. Anal. Math., 101(1):313-324, 2007.
[41] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, Joshua Lapan, Andrew Lundgren, and Daniel Preda. A quantum adiabatic evolution algorithm applied to random instances of an NP-complete problem. Science, 292(5516):472-475, 2001.
[42] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Michael Sipser. Quantum computation by adiabatic evolution. arXiv preprint quant-ph/0001106, 2000.
[43] Richard P. Feynman. Simulating physics with computers. Int. J. Theor. Phys., 21(6):467-488, 1982.
[44] W. Fraser. A survey of methods of computing minimax and near-minimax polynomial approximations for functions of a single independent variable. J. ACM, 12(3):295-314, July 1965.
[45] R Freeman. Spin choreography. Oxford University Press Oxford, UK, 1998.
[46] Sevag Gharibian, Yichen Huang, Zeph Landau, Seung Woo Shin, et al. Quantum Hamiltonian complexity. Found. Trends Theo. Comp. Sci., 10(3):159-282, 2015.
[47] Daniel Gottesman. An introduction to quantum error correction and fault-tolerant quantum computation. In Proceedings of Symposia in Applied Mathematics on Quantum information science and its contributions to mathematics, volume 68, pages 13-58, 2009.
[48] Francis Grenez. Design of linear or minimum-phase FIR filters by constrained Chebyshev approximation.
Signal Process., 5(4):325-332, 1983. [491 William A. Grissom, Zhipeng Cao, and Mark D. Does. IB+I-selective excitation pulse design using the ShinnaraA;Le Roux algorithm. J. Mag. Res., 242:189 - 196, 2014. [50] Lov K. Grover. A fast quantum mechanical algorithm for database search. pages 212-219, 1996. [51] Lov K. Grover. Fixed-point quantum search. Phys. Rev. Lett., 95:150501, Oct 2005. [52] T. Hdberle, D. Schmid-Lorch, K. Karrai, F. Reinhard, and J. Wrachtrup. Highdynamic-range imaging of nanoscale magnetic fields using optimal control of a single qubit. Phys. Rev. Lett., 111:170801, Oct 2013. [53] Hartmut Hdffner, Christian F Roos, and Rainer Blatt. Quantum computing with trapped ions. Phys. Rep., 469(4):155-203, 2008. [54] F.J. Harris. On the use of windows for harmonic analysis with the discrete Fourier transform. Proc. IEEE, 66(1):51-83, Jan 1978. [55] Aram W. Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for linear systems of equations. Phys. Rev. Lett., 103:150502, Oct 2009. [56] E. Hofstetter, A. V. Oppenheim, and J. Siegel. A new technique for the design of nonrecursive digital filters. 5th Annu. Princeton Conf. Informat. Sci. Syst., pages 62-72, March 1971. [57] Abdolhossein Hoorfar and Mehdi Hassani. Inequalities on the lambert w function and hyperpower function. J. Inequal. Pure and Appl. Math, 9(2):5-9, 2008. [58] Peter Hoyer. Arbitrary phases in quantum amplitude amplification. 62:052304, Oct 2000. 120 Phys. Rev. A, [59] Sami Husain, Minaru Kawamura, and Jonathan A. Jones. Further analysis of some symmetric and antisymmetric composite pulses for tackling pulse strength errors. J. Mag. Res., 230:145 - 154, 2013. [60] Vasiliki N Ikonomidou and George D Sergiadis. Improved Shinnar-Le Roux algorithm. J. Mag. Res., 143(1):30 - 34, 2000. [61] Svetoslav S. Ivanov and Nikolay V. Vitanov. Composite two-qubit gates. Phys. Rev. A, 92:022333, Aug 2015. [62] Jonathan A. Jones. Nested composite NOT gates for quantum computation. Phys. Lett. 
A, 377(40):2860 - 2862, 2013. [63] Stephen P Jordan, Hari Krovi, Keith SM Lee, and John Preskill. BQP-completeness of scattering in scalar quantum field theory. arXiv preprint arXiv:1703.00454, 2017. [64] Chingiz Kabytayev, Todd J. Green, Kaveh Khodjasteh, Michael J. Biercuk, Lorenza Viola, and Kenneth R. Brown. Robustness of composite pulses to time-dependent control noise. Phys. Rev. A, 90:012316, Jul 2014. [65] Lina J. Karam and James H. McClellan. Chebyshev digital FIR filter design. Signal Process., 76(1):17 - 36, 1999. [66] Georgios Katsikis, James S. Cybulski, and Manu Prakash. Synchronous universal droplet logic and control. Nat. Phys., 11(7):588-596, July 2015. [67] Navin Khaneja, Timo Reiss, Cindie Kehlet, Thomas Schulte-Herbriiggen, and Steffen J. Glaser. Optimal control of coupled spin dynamics: design of NMR pulse sequences by gradient ascent algorithms. J. Mag. Res., 172(2):296 - 305, 2005. [68] Kaveh Khodjasteh, Daniel A. Lidar, and Lorenza Viola. Arbitrarily accurate dynamical control in open quantum systems. Phys. Rev. Lett., 104:090501, Mar 2010. [69] Kaveh Khodjasteh and Lorenza Viola. Dynamically error-corrected gates for universal quantum computation. Phys. Rev. Lett., 102:080501, Feb 2009. [70] Shelby Kimmel, Cedric Yen-Yu Lin, Guang Hao Low, Maris Ozols, and Theodore J. Yoder. Hamiltonian simulation with optimal sample complexity. Npj Quantum Inf., 3(1):13, 2017. [71] A Yu Kitaev. Quantum computations: algorithms and error correction. Russ. Math. Surv., 52(6):1191-1249, 1997. [72] Robin Kothari. Efficient algorithms in quantum query complexity. PhD thesis, 2014. [73] Robin Kothari. private communication. [74] R. Landauer. Irreversibility and heat generation in the computing process. Journal of Research and Development, 5(3):183-191, July 1961. IBM [75] Mathias Lang. Algorithms for the Constrained Design of Digital Filters with Arbitrary Magnitude and Phase Responses. PhD thesis, Vienna University of Technology, 1999. 121 [76] Kuan J. Lee. 
General parameter relations for the Shinnar-Le Roux pulse design algorithm. J. Mag. Res., 186(2):252 - 258, 2007. [77] Malcolm H. Levitt. Composite Pulses. John Wiley & Sons, Ltd, 2007. [78] J. S. Li and N. Khaneja. Ensemble control of Bloch equations. IEEE Trans. Autom. Control, 54(3):528-536, March 2009. [79] J. Shin Li, Justin Ruths, Tsyr Yan Yu, Haribabu Arthanari, and Gerhard Wagner. Optimal pulse design in quantum control: A unified computational method. Proceedings of the Natl. Acad. Sci. U.S.A., 108(5):1879-1884, 2011. [80] Y. C. Lim, J. H. Lee, C. K. Chen, and R. H. Yang. A weighted least squares algorithm for quasi-equiripple FIR and IIR digital filter design. IEEE Trans. Signal Process., 40(3):551-558, Mar 1992. [81] Seth Lloyd. Universal quantum simulators. Science, 273(5278):1073, Aug 23 1996. [82] Seth Lloyd, Masoud Mohseni, and Patrick Rebentrost. Quantum principal component analysis. Nat. Phys., 10(9):631-633, September 2014. [83] Gui Lu Long, Yan Song Li, Wei Lin Zhang, and Li Niu. Phase matching in quantum searching. Phys. Lett., 262(1):27 - 34, 1999. [84] Guang Hao Low and Isaac L Chuang. Hamiltonian simulation by qubitization. arXiv preprint arXiv:1610.06546, 2016. [85] Guang Hao Low and Isaac L Chuang. Hamiltonian simulation by uniform spectral amplification. arXiv preprint arXiv:1707.05391, 2017. [86] Guang Hao Low and Isaac L. Chuang. Optimal Hamiltonian simulation by quantum signal processing. Phys. Rev. Lett., 118:010501, Jan 2017. [87] Guang Hao Low, Theodore J. Yoder, and Isaac L. Chuang. Optimal arbitrarily accurate composite pulse sequences. Phys. Rev. A, 89:022341, Feb 2014. [88] Guang Hao Low, Theodore J. Yoder, and Isaac L. Chuang. Quantum imaging by coherent enhancement. Phys. Rev. Lett., 114:100801, Mar 2015. [89] Guang Hao Low, Theodore J. Yoder, and Isaac L. Chuang. Methodology of resonant equiangular composite quantum gates. Phys. Rev. X, 6:041067, Dec 2016. [90] Peter Lynch. The Dolph-Chebyshev window: A simple optimal filter. 
Mon. Wea. Rev., 125(4):655-660, 1997. [91] Murray Marshall. Positive polynomials and sums of squares. Number 146. American Mathematical Soc., 2008. [92] J. McClellan, T. Parks, and L. Rabiner. A computer program for designing optimum FIR linear phase digital filters. IEEE Trans. Audio Electroacoust., 21(6):506-526, Dec 1973. [93] Giinter Meinardus. Approximation of functions: Theory and numerical methods, volume 13. Springer, Berlin, 1967. 122 [941 J. T. Merrill, S. C. Doret, Grahame Vittorini, J. P. Addison, and Kenneth R. Brown. Transformed composite sequences for improved qubit addressing. Phys. Rev. A, 90:040301, Oct 2014. [95] Ari Mizel, Daniel A. Lidar, and Morgan Mitchell. Simple proof of equivalence between adiabatic quantum computation and the circuit model. Phys. Rev. Lett., 99:070502, Aug 2007. [96] Shubhendu S Mukherjee, Joel Emer, Tryggve Fossum, and Steven K Reinhardt. Cache scrubbing in microprocessors: Myth or necessity? In Proceedings. 10th IEEE Pacific Rim InternationalSymposium on Dependable Computing, pages 37-42. IEEE, 2004. [97] C.Andrew Neff and John H. Reif. An efficient algorithm for the complex roots problem. J. Complex, 12(2):81 - 115, 1996. [98] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, 1 edition, January 2004. [99] Leonardo Novo and Dominic W Berry. Improved hamiltonian simulation via a truncated Taylor series and corrections. arXiv preprint arXiv:1611.10033, 2016. [100] Jeremy L O'Brien. Optical quantum computing. Science, 318(5856):1567-1570, 2007. [101] A.V. Oppenheim and R.W. Schafer. Discrete-time Signal Processing (3rd Ed.). Prentice-Hall signal processing series. Prentice Hall, 2010. [102] Ricardo Pach6n and Lloyd N. Trefethen. Barycentric-remez algorithms for best polynomial approximation in the chebfun system. Bit Numer. Math., 49(4):721-741, 2009. [103] Adam Paetznick and Krysta M. Svore. 
Repeat-until-success: Non-deterministic decomposition of single-qubit unitaries. Quantum Info. Comput., 14(15-16):1277-1301, November 2014. [104] J. Pauly, P. Le Roux, D. Nishimura, and A. Macovski. Parameter relations for the Shinnar-Le Roux selective excitation pulse design algorithm [NMR imaging]. IEEE Trans. Med. Imag., 10(1):53-65, Mar 1991. [105] Michael James David Powell. Approximation theory and methods. Cambridge University Press, 1981. [106] Lulu Qian, David Soloveichik, and Erik Winfree. Efficient Turing-Universal Computation with DNA Polymers, pages 123-140. Springer Berlin Heidelberg, Berlin, Heidelberg, 2011. [107] Ben W Reichardt. Reflections for quantum query algorithms. In Proceedings of the 28th annual ACM-SIAM symposium on Discrete Algorithms, pages 560-569. Society for Industrial and Applied Mathematics, 2011. [108 J6r6mie Roland and Nicolas J Cerf. Quantum search by local adiabatic evolution. Phys. Rev. A, 65(4):042308, 2002. [109] I. Michael Ross and Mark Karpenko. A review of pseudospectral optimal control: From theory to flight. Annu. Rev. Control, 36(2):182 - 197, 2012. 123 [110] J. Ruths and J. S. Li. Optimal control of inhomogeneous ensembles. IEEE Trans. Autom. Control, 57(8):2021-2032, Aug 2012. [111] Sushant Sachdeva and Nisheeth K. Vishnoi. Faster algorithms via approximation theory. Found. Trends Theo. Comp. Sci., 9(2):125-210, 2014. [112] EB Saff and V Totik. Polynomial approximation of piecewise analytic functions. J. Lond. Math. Soc., 2(3):487-498, 1989. [113] Meir Shinnar, Scott Eleff, Harihara Subramanian, and John S. Leigh. The synthesis of pulse sequences yielding arbitrary magnetization vectors. Mag. Res. Med., 12(1):74-80, 1989. [114] Michael Sipser. Introduction to the Theory of Computation, volume 2. Course Technology Boston, 2006. Thomson [115] A. Soare, H. Ball, D. Hayes, J. Sastrawan, M. C. Jarratt, J. J. McLoughlin, X. Zhen, T. J. Green, and M. J. Biercuk. Experimental noise filtering by quantum control. Nat. 
Phys., 10(11):825-829, November 2014. [1161 Robert I. Soare. Turing oracle machines, online computing, and three displacements in computability theory. Ann. Pure Appl. Logic., 160(3):368 - 399, 2009. Computation and Logic in the Real World: CiE 2007. [117] David Soloveichik, Matthew Cook, Erik Winfree, and Jehoshua Bruck. Computation with finite stochastic chemical reaction networks. Nat. Comp., 7(4):615-633, December 2008. [118] R. D. Somma and S. Boixo. Spectral gap amplification. SIAM J. Comput., 42(2):593610, 2013. [119] M. H. Stone. The generalized Weierstrass approximation theorem. 21(4):167-184, 1948. Math. Mag., [120] Gerald Jay Sussman and Jack Wisdom. Numerical evidence that the motion of Pluto is chaotic. Science, 241(4864):433-437, 1988. [121] Eric G. Swedin and David L. Ferro. Computers: The Life Story of a Technology (Greenwood Technographies). Greenwood Press, Westport, CT, USA, 2005. [122] M. Szegedy. Quantum speed-up of Markov chain based algorithms. In Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, pages 32-41, Oct 2004. [123] Mario Szegedy. Spectra of quantized walks and a vI5K rule. arXiv preprint quant- ph/0401053, 2004. [124] Y Tomita, J T Merrill, and K R Brown. Multi-qubit compensation sequences. New J. Phys., 12(1):015002, 2010. [125] Boyan T. Torosov and Nikolay V. Vitanov. Smooth composite pulses for high-fidelity quantum information processing. Phys. Rev. A, 83:053420, May 2011. 124 [126] Lloyd N Trefethen. Approximation theory and approximationpractice. Siam, Philadelphia, 2013. [127] G6tz S. Uhrig. Keeping a quantum bit alive by optimized ir-pulse sequences. Phys. Rev. Lett., 98:100504, Mar 2007. [128] P. Vaidyanathan and Truong Nguyen. Eigenfilters: A new approach to least-squares FIR filter design and applications including Nyquist filters. IEEE Trans. Circuits Syst., 34(1):11-23, Jan 1987. [129] L. M. K. Vandersypen and I. L. Chuang. NMR techniques for quantum control and computation. Rev. Mod. 
Phys., 76:1037-1069, Jan 2005. [130] Nikolay V. Vitanov. Arbitrarily accurate narrowband composite pulse sequences. Phys. Rev. A, 84:065404, Dec 2011. [131] John Von Neumann and Arthur Walter Burks. Theory of self-reproducing automata. University of Illinois Press Urbana, 1996. [132] Xin Wang, Lev S. Bishop, Edwin Barnes, J. P. Kestner, and S. DasSarma. Robust quantum gates for singlet-triplet spin qubits using composite pulses. Phys. Rev. A, 89:022310, Feb 2014. [1331 Warren S Warren. The usefulness of NMR quantum computing. 277(5332):1688-1690, 1997. Science, [134] S. Wimperis. Broadband, narrowband, and passband composite pulses for use in advanced NMR experiments. J. Mag. Res., Series A, 109(2):221 - 231, 1994. [135] Chui-Ping Yang and Siyuan Han. n-qubit-controlled phase gate with superconducting quantum-interference devices coupled to a resonator. Phys. Rev. A, 72(3):032311, 2005. [136] Theodore J. Yoder, Guang Hao Low, and Isaac L. Chuang. Fixed-point quantum search with an optimal number of queries. Phys. Rev. Lett., 113:210501, Nov 2014. [137] Theodore J. Yoder, Ryuji Takagi, and Isaac L. Chuang. Universal fault-tolerant gates on concatenated stabilizer codes. Phys. Rev. X, 6:031039, Sep 2016. 125
