Time-dependent Hamiltonian simulation with $L^1$-norm scaling

The difficulty of simulating quantum dynamics depends on the norm of the Hamiltonian. When the Hamiltonian varies with time, the simulation complexity should only depend on this quantity instantaneously. We develop quantum simulation algorithms that exploit this intuition. For sparse Hamiltonian simulation, the gate complexity scales with the $L^1$ norm $\int_{0}^{t}\mathrm{d}\tau\left\lVert H(\tau)\right\lVert_{\max}$, whereas the best previous results scale with $t\max_{\tau\in[0,t]}\left\lVert H(\tau)\right\lVert_{\max}$. We also show analogous results for Hamiltonians that are linear combinations of unitaries. Our approaches thus provide an improvement over previous simulation algorithms that can be substantial when the Hamiltonian varies significantly. We introduce two new techniques: a classical sampler of time-dependent Hamiltonians and a rescaling principle for the Schr\"{o}dinger equation. The rescaled Dyson-series algorithm is nearly optimal with respect to all parameters of interest, whereas the sampling-based approach is easier to realize for near-term simulation. These algorithms could potentially be applied to semi-classical simulations of scattering processes in quantum chemistry.


Introduction
Simulating the Hamiltonian dynamics of a quantum system is one of the most promising applications of a quantum computer. The apparent classical intractability of simulating quantum dynamics led Feynman [25] and others to propose the idea of quantum computation. Quantum computers can simulate various physical systems, including condensed matter physics [3], quantum field theory [28], and quantum chemistry [2,14,37,48]. The study of quantum simulation has also led to the discovery of new quantum algorithms, such as algorithms for linear systems [27], differential equations [10], semidefinite optimization [11], formula evaluation [23], quantum walk [17], and ground-state and thermal-state preparation [20,41].
Let $H(\tau)$ be a Hamiltonian defined for $0 \le \tau \le t$. The problem of Hamiltonian simulation is to approximate the evolution $\exp_{\mathcal{T}}\left(-i\int_{0}^{t}\mathrm{d}\tau\,H(\tau)\right)$ using a quantum circuit comprised of elementary quantum gates, where $\exp_{\mathcal{T}}$ denotes the time-ordered matrix exponential. If the Hamiltonian $H(\tau) = H$ does not depend on time, the evolution operator can be represented in closed form as $e^{-itH}$. The problem is then greatly simplified, and it has been thoroughly studied by previous works on quantum simulation [1,4,5,7-9,13,16,18,19,29,31-33,35,36].
Simulating a general time-dependent Hamiltonian $H(\tau)$ naturally subsumes the time-independent case, and can be applied to devising quantum control schemes [38,40], describing quantum chemical reactions [12], and implementing adiabatic quantum algorithms [22]. However, the problem becomes considerably harder, and fewer quantum algorithms are available. Wiebe, Berry, Høyer, and Sanders designed a time-dependent Hamiltonian simulation algorithm based on higher-order product formulas [50]. They assume that $H(\tau)$ is smooth up to a certain order, and they give an example in which a desired approximation cannot be achieved due to the non-differentiability of the Hamiltonian. The smoothness assumption is relaxed in subsequent work by Poulin, Qarry, Somma, and Verstraete [42] based on techniques of Hamiltonian averaging and Monte Carlo estimation. The fractional-query algorithm of Berry, Childs, Cleve, Kothari, and Somma can also simulate time-dependent Hamiltonians [6], with an exponentially improved dependence on precision and only logarithmic dependence on the derivative of the Hamiltonian. A related quantum algorithm for time-dependent Hamiltonian simulation was suggested by Berry, Childs, Cleve, Kothari, and Somma based on the truncated Dyson series [8], which is analyzed explicitly in [29,36].
In this paper, we study time-dependent Hamiltonian simulation based on a simple intuition: the difficulty of simulating a quantum system should depend on the integrated norm of the Hamiltonian. To elaborate, first consider the special case of simulating a time-independent Hamiltonian. The complexity of such a simulation depends on $t\left\lVert H\right\rVert$ [15], where $\left\lVert\cdot\right\rVert$ is a matrix norm that quantifies the size of the Hamiltonian. It is common to express the complexity in terms of the spectral norm $\left\lVert H\right\rVert_{\infty}$ (i.e., the Schatten $\infty$-norm), which quantifies the maximum energy of $H$.
In the general case where the Hamiltonian $H(\tau)$ is time dependent, we expect a quantum simulation algorithm to depend on the Hamiltonian locally in time, and therefore to have complexity that scales with the integrated norm $\int_0^t\mathrm{d}\tau\left\lVert H(\tau)\right\rVert$. This is the $L^1$ norm of $\left\lVert H(\tau)\right\rVert$ when viewed as a function of $\tau$, so we say such an algorithm has $L^1$-norm scaling. Surprisingly, existing analyses of quantum simulation algorithms fail to achieve this complexity; rather, their gate complexity scales with the worst-case cost $t\max_{\tau\in[0,t]}\left\lVert H(\tau)\right\rVert$. It is therefore reasonable to question whether our intuition is correct, or if there exist faster time-dependent Hamiltonian simulation algorithms that can exploit this intuition. Our work answers this question by providing multiple faster quantum algorithms for time-dependent Hamiltonian simulation. These algorithms have gate complexity that scales with $\int_0^t\mathrm{d}\tau\left\lVert H(\tau)\right\rVert$, in contrast to the best previous scaling of $t\max_{\tau\in[0,t]}\left\lVert H(\tau)\right\rVert$. As the norm inequality $\int_0^t\mathrm{d}\tau\left\lVert H(\tau)\right\rVert \le t\max_{\tau\in[0,t]}\left\lVert H(\tau)\right\rVert$ always holds but is not saturated in general, these algorithms provide a strict speedup over existing algorithms. We further analyze an application to simulating scattering processes in quantum chemistry, showing that our improvement can be favorable in practice.
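To make the gap concrete, the following sketch (our own illustration, not from the paper) numerically compares the $L^1$ norm with the worst-case bound for a hypothetical norm profile $\left\lVert H(\tau)\right\rVert$ that spikes briefly, as in a scattering event:

```python
import numpy as np

# Hypothetical norm profile ||H(tau)||: a brief spike on a small background,
# loosely mimicking a scattering event. All numbers are illustrative.
t = 10.0
taus = np.linspace(0.0, t, 100001)
norm = 0.1 + 5.0 * np.exp(-((taus - 5.0) ** 2) / (2 * 0.1 ** 2))

# L^1 norm: int_0^t ||H(tau)|| dtau (trapezoidal rule).
l1_norm = np.sum(0.5 * (norm[1:] + norm[:-1]) * np.diff(taus))
# Worst-case bound: t * max_tau ||H(tau)||.
worst_case = t * norm.max()

assert l1_norm <= worst_case   # the norm inequality always holds
print(l1_norm, worst_case)     # here the L^1 norm is far smaller
```

For this profile, the worst-case bound exceeds the $L^1$ norm by more than an order of magnitude, which is the regime where $L^1$-scaling algorithms win.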
We introduce notation and terminology and state our assumptions in Section 2. Following standard assumptions about quantum simulation, we consider two different input models of Hamiltonians. The first is the sparse matrix (SM) model common for analyzing Hamiltonian simulation in general, in which the Hamiltonian is assumed to be sparse and access to the locations and values of nonzero matrix elements is provided by oracles. We quantify the complexity of a simulation algorithm by the number of queries and additional gates it uses. The second model, favorable for practical applications such as condensed matter physics and quantum chemistry simulation, assumes that the Hamiltonian can be explicitly decomposed as a linear combination of unitaries (LCU), where the coefficients are efficiently computable on a classical computer and the summands can be exponentiated and controlled on a quantum computer. We ignore the cost of implementing the coefficient oracle and focus mainly on the gate complexity. Quantum simulation algorithms can sometimes work more efficiently in other input models, but we study these two models since they are versatile and provide a fair comparison of the gate complexity.
Reference [6] claims that the fractional-query algorithm can simulate time-dependent Hamiltonians with $L^\infty$-norm scaling. However, it is not hard to see that its query complexity in fact scales with the $L^1$ norm. While we do not show how to achieve this scaling for the gate complexity, our analysis is simple and suggests that such a result might be possible. We analyze the query complexity of the fractional-query algorithm in Section 2.5.
We develop two new techniques to simulate time-dependent Hamiltonians with $L^1$-norm scaling. Our first technique is a classical sampling protocol for time-dependent Hamiltonians. In this protocol, we randomly sample a time $\tau\in[0,t]$ and evolve under the time-independent Hamiltonian $H(\tau)$, where the probability distribution is designed to favor those $\tau$ with large $\left\lVert H(\tau)\right\rVert$. Campbell introduced a discrete sampling scheme for time-independent Hamiltonian simulation [13], and our protocol can be viewed as its continuous analog, which we call "continuous qDRIFT". We show that continuous qDRIFT is universal, in the sense that any Hamiltonian simulable by [13] can be simulated by continuous qDRIFT with the same complexity. In addition, we shave off a multiplicative factor in the analysis of [13] by explicitly evaluating the rate of change of the evolution with respect to scaling the Hamiltonian. Continuous qDRIFT and its analysis are detailed in Section 3. Our algorithm is also similar in spirit to the approach of Poulin et al. [43] based on Hamiltonian averaging and Monte Carlo estimation, although their algorithm does not have $L^1$-norm scaling. We discuss the relationship between these two approaches in Appendix A.
We also present a general principle for rescaling the Schrödinger equation in Section 4. In the rescaled Schrödinger equation, the time-dependent Hamiltonian has the same norm for all $\tau\in[0,t]$, so the norm inequality $\int_0^t\mathrm{d}\tau\left\lVert H(\tau)\right\rVert \le t\max_{\tau\in[0,t]}\left\lVert H(\tau)\right\rVert$ holds with equality. Using this principle, we show that the simulation algorithm based on the truncated Dyson series [8,29,36] can also be improved to have $L^1$-norm scaling.
To illustrate how our results might be applied, we identify a specific problem in quantum chemistry for which our $L^1$-norm improvement is advantageous: semi-classical scattering of molecules in a chemical reaction. For such a simulation, $H(\tau)$ changes dramatically throughout the evolution, so its $L^1$ norm can be significantly smaller than its $L^\infty$ norm. We discuss this application further in Section 5.
Finally, we conclude in Section 6 with a brief discussion of the results and some open questions.

Time-dependent Hamiltonian evolution
Let $H(\tau)$ be a time-dependent Hamiltonian defined for $0 \le \tau \le t$. By default, we assume that $H(\tau)$ is continuously differentiable and $H(\tau) \neq 0$ everywhere, and we defer the discussion of pathological cases to Section 6. If the Hamiltonian $H(\tau) = H$ is time independent, the evolution is given in closed form by the matrix exponential $e^{-itH}$. However, there exists no such closed-form expression for a general $H(\tau)$, and we instead represent the evolution by the time-ordered matrix exponential
$$\exp_{\mathcal{T}}\left(-i\int_{0}^{t}\mathrm{d}\tau\,H(\tau)\right).$$
If $G(\tau)$ is another time-dependent Hamiltonian, the evolutions generated by $H(\tau)$ and $G(\tau)$ have distance bounded by the following lemma.

Lemma 1 (Distance between evolutions). Let $H(\tau)$ and $G(\tau)$ be time-dependent Hamiltonians defined for $0 \le \tau \le t$. Then
$$\left\lVert \exp_{\mathcal{T}}\left(-i\int_{0}^{t}\mathrm{d}\tau\,H(\tau)\right) - \exp_{\mathcal{T}}\left(-i\int_{0}^{t}\mathrm{d}\tau\,G(\tau)\right) \right\rVert_{\infty} \le \int_{0}^{t}\mathrm{d}\tau\left\lVert H(\tau)-G(\tau)\right\rVert_{\infty}.$$
We will abbreviate the evolution operator as $E(t,s) := \exp_{\mathcal{T}}\left(-i\int_s^t\mathrm{d}\tau\,H(\tau)\right)$ when there is no ambiguity. In the special case where $H(\tau)=H$ is time independent, the evolution $e^{-itH}$ only depends on the time duration, so we denote $E(t) := E(t,0)$. The evolution operator satisfies the differential (Schrödinger) equation
$$\frac{\mathrm{d}}{\mathrm{d}t}E(t,0) = -iH(t)E(t,0), \qquad E(0,0) = I.$$
We may further obtain an integral representation of $E(t,0)$. To this end, we apply the fundamental theorem of calculus to the Schrödinger equation and find that $E(t,0)$ satisfies the integral equation
$$E(t,0) = I - i\int_0^t \mathrm{d}\tau\,H(\tau)E(\tau,0).$$
For any $0 \le s \le t$, the evolution operator satisfies the multiplicative property
$$E(t,0) = E(t,s)E(s,0).$$
The operator $E(0,t)$ with $t \ge 0$ is understood as the inverse evolution operator $E(0,t) := E^{\dagger}(t,0)$. For a thorough mathematical treatment of time-dependent Hamiltonian evolution, we refer the reader to [21]. Finally, the quantum channel corresponding to the unitary evolution $E(t,0)$ is denoted $\mathcal{E}(t,0)$ and is defined by
$$\mathcal{E}(t,0)(\rho) = E(t,0)\,\rho\,E^{\dagger}(t,0).$$
For time-independent Hamiltonians, we denote $\mathcal{E}(t) := \mathcal{E}(t,0)$.
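Numerically, the time-ordered exponential and its multiplicative property can be checked by discretizing the evolution into short time-independent steps. The following sketch (our own illustration, using a hypothetical single-qubit $H(\tau)$) demonstrates this:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 2-level time-dependent Hamiltonian for illustration.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = lambda tau: np.cos(tau) * X + 0.5 * tau * Z

def evolve(t_start, t_end, steps=2000):
    """Approximate exp_T(-i int H) by a product of short time-independent steps."""
    dt = (t_end - t_start) / steps
    E = np.eye(2, dtype=complex)
    for k in range(steps):
        mid = t_start + (k + 0.5) * dt   # midpoint rule for each short step
        E = expm(-1j * H(mid) * dt) @ E  # later times multiply on the left (time ordering)
    return E

E_t0 = evolve(0.0, 1.0)
# Unitarity: E(t,0)^dagger E(t,0) = I.
assert np.allclose(E_t0.conj().T @ E_t0, np.eye(2), atol=1e-8)
# Multiplicative property: E(t,0) = E(t,s) E(s,0).
assert np.allclose(E_t0, evolve(0.5, 1.0) @ evolve(0.0, 0.5), atol=1e-4)
```

The time ordering shows up in the code as the convention that later-time factors multiply from the left.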

Notation for norms
We introduce norm notation for vectors, matrices, operator-valued functions, and linear maps on the space of matrices. Let $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_L) \in \mathbb{C}^L$ be an $L$-dimensional vector. We use $\left\lVert\alpha\right\rVert_p$ to represent the vector $p$-norm of $\alpha$. Thus,
$$\left\lVert\alpha\right\rVert_p = \left(\sum_{l=1}^{L}\left|\alpha_l\right|^p\right)^{1/p} \ \ (1 \le p < \infty), \qquad \left\lVert\alpha\right\rVert_\infty = \max_{1\le l\le L}\left|\alpha_l\right|.$$
For a matrix $A$, we define $\left\lVert A\right\rVert_p$ to be the Schatten $p$-norm of $A$ [47,51]. We have
$$\left\lVert A\right\rVert_p = \left(\mathrm{Tr}\left|A\right|^p\right)^{1/p} \ \ (1 \le p < \infty),$$
so $\left\lVert A\right\rVert_1$ is the trace norm, $\left\lVert A\right\rVert_2$ is the Frobenius norm, and $\left\lVert A\right\rVert_\infty$ is the spectral norm. Finally, if $f\colon [0,t]\to\mathbb{C}$ is a continuous function, we use $\left\lVert f\right\rVert_p$ to mean the $L^p$ norm of the function $f$. Thus,
$$\left\lVert f\right\rVert_p = \left(\int_0^t\mathrm{d}\tau\left|f(\tau)\right|^p\right)^{1/p} \ \ (1\le p<\infty), \qquad \left\lVert f\right\rVert_\infty = \max_{\tau\in[0,t]}\left|f(\tau)\right|.$$
We combine these norms to obtain norms for vector-valued and operator-valued functions. Let $\alpha\colon[0,t]\to\mathbb{C}^L$ be a continuous vector-valued function, with the $j$th coordinate at time $\tau$ denoted $\alpha_j(\tau)$. We use $\left\lVert\alpha\right\rVert_{p,q}$ to mean that we take the vector $p$-norm $\left\lVert\alpha(\tau)\right\rVert_p$ for every $\tau$ and compute the $L^q$ norm of the resulting scalar function. For example,
$$\left\lVert\alpha\right\rVert_{\infty,1} = \int_0^t\mathrm{d}\tau\max_{1\le l\le L}\left|\alpha_l(\tau)\right|.$$
Note that $\left\lVert\alpha(\tau)\right\rVert_p$ is continuous as a function of $\tau$, so $\left\lVert\alpha\right\rVert_{p,q}$ is well defined and is indeed a norm for vector-valued functions. Similarly, we also define $\left\lVert A\right\rVert_{p,q}$ for a continuous operator-valued function by taking the Schatten $p$-norm $\left\lVert A(\tau)\right\rVert_p$ for every $\tau$ and computing the $L^q$ norm of the resulting scalar function. In rare cases, we will also encounter time-dependent linear combinations of operators of the form $A(\tau) = \sum_{l=1}^{L} A_l(\tau)$, and we write $\left\lVert A\right\rVert_{p,q,r}$ to mean that we take the Schatten $p$-norm $\left\lVert A_l(\tau)\right\rVert_p$ of each summand, and apply the vector $q$-norm and the $L^r$ norm to the resulting vector-valued function. We also define $\left\lVert A\right\rVert_{\max}$ as the largest matrix element of $A$ in absolute value,
$$\left\lVert A\right\rVert_{\max} = \max_{j,k}\left|A_{jk}\right|.$$
The norm $\left\lVert A\right\rVert_{\max}$ is a vector norm of $A$ but does not satisfy the submultiplicative property of a matrix norm. It relates to the spectral norm by the inequality [15, Lemma 1]
$$\left\lVert A\right\rVert_{\max} \le \left\lVert A\right\rVert_{\infty} \le d\left\lVert A\right\rVert_{\max}$$
for a $d$-sparse matrix $A$. If $A$ is a continuous operator-valued function, we interpret $\left\lVert A\right\rVert_{\max,q}$ in a similar way as above. Therefore,
$$\left\lVert A\right\rVert_{\max,1} = \int_0^t\mathrm{d}\tau\left\lVert A(\tau)\right\rVert_{\max}.$$
Finally, we define a norm for linear maps on the space of matrices. Let $\mathcal{E}\colon A\mapsto\mathcal{E}(A)$ be a linear map on the space of matrices on $\mathcal{H}$. The diamond norm of $\mathcal{E}$ is
$$\left\lVert\mathcal{E}\right\rVert_\diamond = \max_{\left\lVert B\right\rVert_1\le 1}\left\lVert\left(\mathcal{E}\otimes\mathbb{1}\right)(B)\right\rVert_1,$$
where the maximization is taken over all matrices $B$ on $\mathcal{H}\otimes\mathcal{H}$ satisfying $\left\lVert B\right\rVert_1\le 1$.
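The combined norms can be evaluated numerically. The sketch below (our own illustration; the helper names are ours) computes $\left\lVert A\right\rVert_{p,q}$ by taking the Schatten $p$-norm on a time grid and then applying a quadrature rule for the $L^q$ norm:

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm of a matrix, computed from its singular values."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.max(s) if np.isinf(p) else np.sum(s ** p) ** (1.0 / p)

def combined_norm(A_of_tau, taus, p, q):
    """||A||_{p,q}: Schatten p-norm at each time, then L^q norm in tau."""
    vals = np.array([schatten_norm(A_of_tau(tau), p) for tau in taus])
    if np.isinf(q):
        return vals.max()
    # trapezoidal rule for the L^q integral
    return np.sum(0.5 * (vals[1:] ** q + vals[:-1] ** q) * np.diff(taus)) ** (1.0 / q)

X = np.array([[0, 1], [1, 0]], dtype=complex)
A = lambda tau: np.sin(tau) * X          # example operator-valued function
taus = np.linspace(0, np.pi, 10001)

# ||A||_{inf,1} = int_0^pi |sin(tau)| dtau * ||X||_inf = 2
print(combined_norm(A, taus, np.inf, 1))
```

The same helper gives $\left\lVert A\right\rVert_{\infty,\infty}=1$ for this example, which is the worst-case norm appearing in previous analyses.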
Below is a useful bound on the diamond-norm distance between two unitary channels.
Lemma 2 (Diamond-norm distance between unitary channels [9, Lemma 7]). Let $V$ and $U$ be unitary matrices, with associated quantum channels $\mathcal{V}\colon\rho\mapsto V\rho V^{\dagger}$ and $\mathcal{U}\colon\rho\mapsto U\rho U^{\dagger}$. Then
$$\left\lVert\mathcal{V}-\mathcal{U}\right\rVert_\diamond \le 2\left\lVert V-U\right\rVert_\infty.$$
The sampling-based algorithm (Section 3) produces a channel close to $\mathcal{E}(t,0)$, and its error is naturally quantified by the diamond-norm distance. Other simulation algorithms, such as the Dyson-series approach (Section 4), produce operators that are close to the unitary $\exp_{\mathcal{T}}\left(-i\int_0^t\mathrm{d}\tau\,H(\tau)\right)$, and we quantify their error in terms of the spectral norm. For a fair comparison, one may instead describe all simulation algorithms using quantum channels and use the diamond-norm distance as the unified error metric. By Lemma 2, we lose at most a factor of 2 in this conversion.

Hamiltonian input models
Quantum simulation algorithms may have different performance depending on the choice of the input model of Hamiltonians. In this section, we describe two input models that are extensively used in previous works: the sparse matrix (SM) model and the linear-combination-of-unitaries (LCU) model. We also discuss other input models that will be used in later sections.
Let $H(\tau)$ be a time-dependent Hamiltonian defined for $0 \le \tau \le t$. In the SM model, we assume that $H(\tau)$ is $d$-sparse in the sense that the number of nonzero matrix elements within each row and column throughout the entire interval $[0,t]$ is at most $d$. We assume that the locations of the nonzero matrix elements are time independent. Access to the Hamiltonian is given through the oracles
$$O_{\mathrm{loc}}\left|j,s\right\rangle = \left|j,\mathrm{col}(j,s)\right\rangle, \qquad O_{\mathrm{val}}\left|\tau,j,k,z\right\rangle = \left|\tau,j,k,z\oplus H_{jk}(\tau)\right\rangle.$$
Here, $\mathrm{col}(j,s)$ returns the column index of the $s$th element in the $j$th row that may be nonzero over the entire time interval $[0,t]$. We quantify the complexity of a quantum simulation algorithm by the number of queries it makes to $O_{\mathrm{loc}}$ and $O_{\mathrm{val}}$, together with the number of additional elementary gates it requires. Such a model includes many realistic physical systems and is well motivated from a theoretical perspective [27]. As the following lemma shows, a $d$-sparse Hamiltonian can be efficiently decomposed as a sum of 1-sparse terms.
In the LCU model, we assume that the Hamiltonian is given as a linear combination of unitaries,
$$H(\tau) = \sum_{l=1}^{L}\alpha_l(\tau)H_l,$$
where the coefficients $\alpha_l(\tau)\ge 0$ are continuously differentiable and nonzero everywhere, and the matrices $H_l$ are both unitary and Hermitian. We assume that the coefficients $\alpha_l(\tau)$ can be efficiently computed by a classical oracle $O_{\mathrm{coeff}}$, and we ignore the classical cost of implementing such an oracle. We further assume that each $\left|0\right\rangle\!\left\langle 0\right|\otimes I + \left|1\right\rangle\!\left\langle 1\right|\otimes H_l$ can be implemented with gate complexity $g_c$, and each $\left|0\right\rangle\!\left\langle 0\right|\otimes I + \left|1\right\rangle\!\left\langle 1\right|\otimes e^{-i\tau H_l}$ for an arbitrarily large $\tau$ can be performed with $g_e$ gates. Such a setting is common in the simulation of condensed matter physics and quantum chemistry. We quantify the complexity of a simulation algorithm by the number of elementary gates it uses.

Quantum simulation algorithms can sometimes work in other input models. For example, the qDRIFT protocol introduced in Section 3 requires only that the Hamiltonian have the form $H(\tau) = \sum_{l=1}^{L} H_l(\tau)$, a sum of time-dependent terms each of which can be directly exponentiated on a quantum computer; we call this the linear-combination (LC) model. On the other hand, the Dyson-series algorithm can be described in terms of the Select operation irrespective of how this operation is implemented. We consider the SM and LCU models for all the time-dependent simulation algorithms so that we can give a fair comparison of their complexity.

Table 1: Complexity comparison of previous algorithms (top three) and the algorithms introduced in this paper (bottom two) for simulating time-dependent Hamiltonians. Logarithmic factors are suppressed by $\widetilde{O}$ notation and the (non-query) gate complexities are compared. The product-formula algorithm of [50] is omitted as its gate complexity scales polynomially with high-order derivatives and is not directly comparable to other algorithms in the table. The complexity of the full Monte Carlo estimation algorithm [42] is not analyzed explicitly; only its first step is compared. The fractional-query algorithm [6] does not have an explicit implementation for Hamiltonians in the LCU model, and its implementation in the SM model is streamlined by the Dyson-series approach [8,29,36].

Simulation algorithms with L 1 -norm scaling
We now explain the meaning of $L^1$-norm scaling in the SM and the LCU models. Let $H(\tau)$ be a time-dependent Hamiltonian defined for $0\le\tau\le t$. We say that an algorithm in the SM model simulates $H(\tau)$ with $L^1$-norm scaling if, given any continuously differentiable upper bound $\Lambda_{\max}(\tau)\ge\left\lVert H(\tau)\right\rVert_{\max}$, the algorithm has query complexity and gate complexity that scale with $\left\lVert\Lambda_{\max}\right\rVert_1 = \int_0^t\mathrm{d}\tau\,\Lambda_{\max}(\tau)$ up to logarithmic factors. The norm bound $\Lambda_{\max}(\tau)$, together with other auxiliary information, must be accessed by the quantum simulation algorithm; we assume such quantities can be computed efficiently.
In the LCU model, we are given a time-dependent Hamiltonian with the decomposition $H(\tau)=\sum_{l=1}^{L}\alpha_l(\tau)H_l$. We say that an algorithm has $L^1$-norm scaling if, for any continuously differentiable vector-valued function $\Lambda(\tau)$ with $\Lambda_l(\tau)\ge\alpha_l(\tau)$, the algorithm has query and gate complexity that scale with $\left\lVert\Lambda\right\rVert_{\infty,1}=\int_0^t\mathrm{d}\tau\max_l\Lambda_l(\tau)$ up to logarithmic factors.
For better readability, we express the complexity of simulation algorithms in terms of norms of the original Hamiltonian, such as $\left\lVert H\right\rVert_{\max,1}$ and $\left\lVert\alpha\right\rVert_{\infty,1}$, instead of the upper bounds $\left\lVert\Lambda_{\max}\right\rVert_1$ and $\left\lVert\Lambda\right\rVert_{\infty,1}$. We use standard asymptotic notation, with $O$, $\Omega$, and $\Theta$ representing asymptotic upper, lower, and tight bounds, respectively. We also suppress logarithmic factors using the $\widetilde{O}$ notation when the complexity expression becomes too complicated. Table 1 compares the results of this paper with previous results on simulating time-dependent Hamiltonians.
Our goal is to develop simulation algorithms that scale with the $L^1$ norm with respect to the time variable $\tau$, for both query complexity and gate complexity. We start by reexamining the fractional-query approach. It was mentioned in [6] that this approach can simulate time-dependent Hamiltonians with $L^\infty$-norm scaling, but we find that its query complexity in fact scales with the $L^1$ norm. We give this improved analysis in the next section.

Query complexity with L 1 -norm scaling
We begin by reviewing the result of [6] for simulating time-independent Hamiltonians. We assume that the Hamiltonian is given by a linear combination of unitaries $G=\sum_{l=1}^{L}\beta_l G_l$ with nonnegative coefficients $\beta_l$. Here, the $G_l$ are both unitary and Hermitian, so they are reflections and their eigenvalues are $\pm 1$.
We say that a quantum operation is a fractional-query algorithm if it is of the form
$$U_m Q^{\tau_m} U_{m-1} Q^{\tau_{m-1}} \cdots U_1 Q^{\tau_1} U_0,$$
where $Q$ is unitary with eigenvalues $\pm 1$, $Q^{\tau_j}$ denotes a fractional power of the oracle $Q$, the $U_j$ are unitary operations, and $\tau_j \ge 0$. Here, we regard $Q$ as the oracle and the $U_j$ as non-query operations, so this algorithm has fractional-query complexity $\sum_{j=1}^{m}\tau_j$. A quantum algorithm that makes (discrete) queries to $Q$ is a fractional-query algorithm with $\tau_j = 1$. Conversely, any fractional-query algorithm can be efficiently simulated in the discrete query model. In particular, an algorithm with fractional-query complexity $T$ can be simulated with error at most $\epsilon$ using
$$O\left(T\,\frac{\log(T/\epsilon)}{\log\log(T/\epsilon)}\right)$$
discrete queries [6]. To apply the fractional-query approach, we approximate the evolution under $G$ using the first-order product formula
$$e^{-itG} \approx \left(e^{-i\beta_1 G_1 t/r}\,e^{-i\beta_2 G_2 t/r}\cdots e^{-i\beta_L G_L t/r}\right)^r.$$
Observe that the $e^{-i\pi G_l}$ are unitary operations with eigenvalues $\pm 1$, and each factor $e^{-i\beta_l G_l t/r}$ is a fractional power of $e^{-i\pi G_l}$, so the right-hand side can be viewed as a fractional-query algorithm with query complexity $O(\left\lVert\beta\right\rVert_1 t)$, provided that we can make fractional queries to the multiple oracles $e^{-i\pi G_1},\ldots,e^{-i\pi G_L}$. This can be realized by a standard fractional-query algorithm accessing a single oracle with the same query complexity [6, Theorem 4.1].
To simulate with accuracy $\epsilon$, we set $r = O\left(\left(\left\lVert\beta\right\rVert_1 t\right)^2/\epsilon\right)$ so that the error of the first-order product formula is at most $\epsilon$. We then convert this multi-oracle algorithm to a single-oracle algorithm with the same fractional-query complexity $T = O(\left\lVert\beta\right\rVert_1 t)$ and, with precision $O(\epsilon)$, implement it in the discrete query model. Altogether, this approach makes queries to the operation $\mathrm{Select}(\mathrm{Exp}\text{-}G)=\sum_{l=1}^{L}\left|l\right\rangle\!\left\langle l\right|\otimes e^{-i\pi G_l}$. As mentioned in [6], the fractional-query approach can also be used to simulate time-dependent Hamiltonians by replacing the above product formula with a product-formula decomposition of the time-ordered evolution. However, [6] only gives a brief discussion of this issue, and the claimed complexity has only $L^\infty$ scaling. We now give an improved analysis of this algorithm for the SM model, showing that its query complexity achieves $L^1$-norm scaling.
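The first-order product formula at the heart of this construction is easy to test numerically. The sketch below (our own toy example with Pauli summands, not the paper's implementation) verifies that the Trotter error decreases like $1/r$, consistent with choosing $r = O((\left\lVert\beta\right\rVert_1 t)^2/\epsilon)$:

```python
import numpy as np
from scipy.linalg import expm

# Toy G = sum_l beta_l G_l with Hermitian unitary summands (Pauli matrices).
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
betas, Gs = [0.8, 0.3], [X, Z]

t = 1.0
exact = expm(-1j * t * sum(b * G for b, G in zip(betas, Gs)))

def trotter1(r):
    """First-order product formula with r segments."""
    step = np.eye(2, dtype=complex)
    for b, G in zip(betas, Gs):
        step = expm(-1j * b * t * G / r) @ step
    return np.linalg.matrix_power(step, r)

err = lambda r: np.linalg.norm(trotter1(r) - exact, 2)  # spectral-norm error
# Error of the first-order formula decreases like 1/r.
assert err(64) < err(8) / 4
```

Each factor $e^{-i\beta_l G_l t/r}$ here is a fractional power of the reflection $e^{-i\pi G_l}$, which is what allows the fractional-query machinery to apply.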
Theorem 4 (Fractional-query algorithm with $L^1$-norm scaling (SM)). A $d$-sparse time-dependent Hamiltonian $H(\tau)$ acting on $n$ qubits can be simulated for time $t$ with accuracy $\epsilon$ using $\widetilde{O}\left(d^2\left\lVert H\right\rVert_{\max,1}\right)$ queries, where $\widetilde{O}$ suppresses factors logarithmic in the parameters of interest.

Proof. For readability, we assume that $\left\lVert H\right\rVert_{\max,1}$, $\left\lVert H\right\rVert_{\max,\infty}$, and $\left\lVert H\right\rVert_{\infty,\infty}$ are the norm upper bounds provided to the algorithm. We first decompose $\exp_{\mathcal{T}}\left(-i\int_0^t\mathrm{d}\tau\,H(\tau)\right)$ into a product of evolutions under the time-independent Hamiltonians $H(kt/r)$, each evolving for time $t/r$. By Lemma 1, the error of this discretization is at most
$$\sum_{k=0}^{r-1}\int_{kt/r}^{(k+1)t/r}\mathrm{d}\tau\left\lVert H(\tau)-H(kt/r)\right\rVert_{\infty}.$$
To approximate with precision $\epsilon$, it suffices to choose $r$ as in (30). Note that we use $O$ here because we can choose $r$ to be the minimum integer satisfying (30), giving an upper bound on the number of steps that suffice to achieve error at most $\epsilon$. We then decompose the evolution under each time-independent sparse Hamiltonian $H(kt/r)$ into a sum of 1-sparse terms $G_j$. Each $G_j$ is 1-sparse and Hermitian with eigenvalues $\pm 1$, and the value and location of each nonzero matrix element of $G_j$ can be accessed using $O(1)$ queries to $H$.
In this bound, we have used the inequality between the spectral norm and the max norm, together with the bound (32) on the max norm. We then apply the first-order product formula to each time-independent evolution; it is possible to choose $r$ as in (35) such that the error of the first-order product-formula decomposition is at most $\epsilon/r$ in each step. By choosing $r$ as the maximum of (31) and (35), we ensure that the error in each of the $r$ time steps is $O(\epsilon/r)$, so the total error is $O(\epsilon)$. Altogether, we obtain a fractional-query algorithm with total fractional-query complexity $O\left(d^2\left\lVert H\right\rVert_{\max,1}\right)$ and error $O(\epsilon)$. We now convert this multi-oracle algorithm to a single-oracle algorithm with the same fractional-query complexity and, with precision $O(\epsilon)$, implement it in the discrete query model. The single oracle for the standard fractional-query algorithm now encodes the time dependence of $H$ in an ancilla register. The operators $U_j$ in the fractional-query algorithm then need to increment the time register.
Altogether, we make $\widetilde{O}\left(d^2\left\lVert H\right\rVert_{\max,1}\right)$ discrete queries. We now show how the query complexity of this approach achieves $L^1$-norm scaling. The intuition is that the total query complexity is governed by the Riemann sum $\sum_{k=0}^{r-1}\left\lVert H(kt/r)\right\rVert_{\max}t/r$, which converges to the integral $\left\lVert H\right\rVert_{\max,1}$ as $r\to\infty$. To approximate this integral to within an additive error of $\delta$, it suffices to choose $r$ sufficiently large. Since $\delta$ can be made arbitrarily close to 0, we obtain the total query complexity $\widetilde{O}\left(d^2\left\lVert H\right\rVert_{\max,1}\right)$, as claimed.
The above analysis shows that the fractional-query algorithm can simulate a time-dependent Hamiltonian with query complexity that scales with the $L^1$ norm. However, this approach does not directly give a useful result for the gate complexity. The difficulty arises from the factor of $g$ in [6, Proof of Theorem 1.1], which corresponds to the complexity of applying a sequence of driving operations $U_j$. These operations need to increment $k$ (indexing the time), as well as $l$, which takes $\eta(k)$ values depending on $k$. Applying the sequence of operations $U_j$ therefore requires determining new values of $l$ and $k$, which can depend on the sum of $\eta(k)$ over a long sequence of values of $k$. This introduces significant gate complexity, so a fast algorithm would require a more efficient procedure for implementing the driving operations.
Instead, we develop other quantum algorithms that achieve $L^1$-norm scaling not only for the query complexity but also for the gate complexity. We employ two main techniques: the continuous qDRIFT sampling protocol (Section 3) and a rescaling principle for the Schrödinger equation (Section 4).

Continuous qDRIFT
We show in Section 3.2 that continuous qDRIFT is universal, in the sense that any time-independent Hamiltonian simulable by the algorithm of [13] can be simulated by our protocol. We then discuss the simulation complexity in both the SM and the LCU models in Section 3.3.
The continuous qDRIFT protocol also has similarities with the approach of Poulin et al. [43] based on Hamiltonian averaging and Monte Carlo sampling, although their approach does not have L 1 -norm scaling. We give a detailed comparison between these two approaches in Appendix A.

A classical sampler of time-dependent Hamiltonians
Let $H(\tau)$ be a time-dependent Hamiltonian defined for $0\le\tau\le t$. For this section only, we relax our requirements on the Hamiltonian: we assume that $H(\tau)$ is nonzero everywhere and is continuous except at a finite number of points. We further suppose that each $H(\tau)$ can be directly exponentiated on a quantum computer. The ideal evolution under $H(\tau)$ for time $t$ is given by $E(t,0)=\exp_{\mathcal{T}}\left(-i\int_0^t\mathrm{d}\tau\,H(\tau)\right)$, and the corresponding quantum channel is $\mathcal{E}(t,0)(\rho)=E(t,0)\,\rho\,E^{\dagger}(t,0)$. The high-level idea of the sampling algorithm is to approximate the ideal channel by a mixed unitary channel
$$\mathcal{U}(t,0)(\rho)=\int_0^t\mathrm{d}\tau\,p(\tau)\,e^{-iH(\tau)/p(\tau)}\,\rho\,e^{iH(\tau)/p(\tau)},$$
where $p(\tau)$ is a probability density function defined for $0\le\tau\le t$. This channel can be realized by a classical sampling protocol. With a proper choice of $p(\tau)$, this channel approximates the ideal channel and can thus be used for quantum simulation.
We begin with a full definition of $\mathcal{U}(t,0)$. Inspired by [13], we choose $p(\tau)$ to be biased toward times at which the Hamiltonian has large norm:
$$p(\tau)=\frac{\left\lVert H(\tau)\right\rVert_{\infty}}{\left\lVert H\right\rVert_{\infty,1}}.$$
Note that $\mathcal{U}(t,0)$ is a valid quantum channel (in particular, $p(\tau)$ can never be zero). Furthermore, it can be implemented with unit cost: for any input state $\rho$, we randomly sample a value $\tau$ according to $p(\tau)$ and perform $e^{-iH(\tau)/p(\tau)}$. Note also that $H(\tau)/p(\tau)$ in the exponential implicitly depends on $t$. Indeed, $\left\lVert H\right\rVert_{\infty,1}$ includes an integral over time, so $p(\tau)$ decreases with the total evolution time $t$. We call this classical sampling protocol and the channel it implements "continuous qDRIFT". This protocol assumes that the spectral norm $\left\lVert H(\tau)\right\rVert_\infty$ is known a priori and that we can efficiently sample from the distribution $p(\tau)$. In practice, it is often easier to obtain a spectral-norm upper bound $\Lambda(\tau)\ge\left\lVert H(\tau)\right\rVert_\infty$. Such an upper bound can also be used to implement continuous qDRIFT, provided that it has only finitely many discontinuities. Specifically, we define
$$p_{\Lambda}(\tau)=\frac{\Lambda(\tau)}{\int_0^t\mathrm{d}\tau'\,\Lambda(\tau')},$$
so $p_{\Lambda}(\tau)$ is a probability density function. Using $p_\Lambda$ to implement continuous qDRIFT, we obtain the channel
$$\mathcal{U}_{\Lambda}(t,0)(\rho)=\int_0^t\mathrm{d}\tau\,p_{\Lambda}(\tau)\,e^{-iH(\tau)/p_{\Lambda}(\tau)}\,\rho\,e^{iH(\tau)/p_{\Lambda}(\tau)},$$
whose analysis is similar to that presented here. For readability, we assume that we can efficiently sample from $p(\tau)=\left\lVert H(\tau)\right\rVert_\infty/\left\lVert H\right\rVert_{\infty,1}$, and we analyze $\mathcal{U}(t,0)$. We show that continuous qDRIFT approximates the ideal channel with error that depends on the $L^1$ norm.
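As a classical sanity check (our own sketch; the helper names and the single-qubit $H(\tau)$ are hypothetical), the continuous qDRIFT channel can be simulated by averaging over sampled unitaries, with $\tau$ drawn by rejection sampling from $p(\tau)\propto\left\lVert H(\tau)\right\rVert_\infty$:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(7)

# Hypothetical single-qubit time-dependent Hamiltonian.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = lambda tau: 0.2 * np.cos(tau) * X + 0.1 * Z
spec = lambda tau: np.linalg.norm(H(tau), 2)     # ||H(tau)||_inf (spectral norm)

t = 0.5
grid = np.linspace(0, t, 4001)
norms = np.array([spec(tau) for tau in grid])
L1 = np.sum(0.5 * (norms[1:] + norms[:-1]) * np.diff(grid))  # ||H||_{inf,1}

def sample_tau():
    """Rejection-sample tau with density p(tau) = ||H(tau)||_inf / ||H||_{inf,1}."""
    cmax = norms.max()
    while True:
        tau = rng.uniform(0, t)
        if rng.uniform(0, cmax) < spec(tau):
            return tau

def qdrift_channel(rho, samples=2000):
    """Monte Carlo average of the mixed unitary channel U(t,0)."""
    out = np.zeros_like(rho)
    for _ in range(samples):
        tau = sample_tau()
        U = expm(-1j * H(tau) * (L1 / spec(tau)))  # e^{-i H(tau)/p(tau)}
        out += U @ rho @ U.conj().T
    return out / samples

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
rho1 = qdrift_channel(rho0)
assert abs(np.trace(rho1) - 1) < 1e-8            # the channel is trace preserving

# Compare with the ideal evolution (fine product discretization).
E = np.eye(2, dtype=complex)
for k in range(4000):
    E = expm(-1j * H((k + 0.5) * t / 4000) * (t / 4000)) @ E
rho_ideal = E @ rho0 @ E.conj().T
assert np.linalg.norm(rho1 - rho_ideal) < 0.1    # close when ||H||_{inf,1} is small
```

The tolerance in the final check is loose because the output carries both the channel's $O(\left\lVert H\right\rVert_{\infty,1}^2)$ bias and Monte Carlo sampling noise.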
Theorem 5 ($L^1$-norm error bound for continuous qDRIFT, short-time version). Let $H(\tau)$ be a time-dependent Hamiltonian defined for $0\le\tau\le t$; assume it is continuous except at a finite number of points and nonzero everywhere. Define $E(t,0)=\exp_{\mathcal{T}}\left(-i\int_0^t\mathrm{d}\tau\,H(\tau)\right)$ and let $\mathcal{E}(t,0)(\cdot)=E(t,0)(\cdot)E^{\dagger}(t,0)$ be the corresponding quantum channel. Let $\mathcal{U}(t,0)$ be the continuous qDRIFT channel with $p(\tau)=\left\lVert H(\tau)\right\rVert_\infty/\left\lVert H\right\rVert_{\infty,1}$. Then
$$\left\lVert\mathcal{E}(t,0)-\mathcal{U}(t,0)\right\rVert_\diamond = O\left(\left\lVert H\right\rVert_{\infty,1}^{2}\right).$$
Note that this bound is only useful when $t$ is small enough that the right-hand side is less than 1 (the norm $\left\lVert H\right\rVert_{\infty,1}$ involves an integral over $t$, so it increases with $t$).
To prove this theorem, we need a formula that computes the rate at which the evolution operator changes when the Hamiltonian is rescaled. To illustrate the idea, consider the degenerate case where the Hamiltonian $H$ is time independent. Then the evolution under the rescaled Hamiltonian $sH$ for time $t$ is given by $e^{-itsH}$. A direct calculation shows that
$$\frac{\mathrm{d}}{\mathrm{d}s}e^{-itsH}=-itHe^{-itsH},$$
so the rate is $-itHe^{-itsH}$ in the time-independent case. This calculation becomes significantly more complicated for a time-dependent Hamiltonian. Let $E_s(t,v)$ denote the evolution generated by the rescaled Hamiltonian $sH(\tau)$ from time $v$ to time $t$. The following lemma gives an explicit formula for $\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)$. We sketch the proof of this formula for completeness, but refer the reader to [21, p. 35] for mathematical justifications that are beyond the scope of this paper.
Lemma 6 (Hamiltonian scaling). Let $H(\tau)$ be a time-dependent Hamiltonian defined for $0\le\tau\le t$ with finitely many discontinuities, and let $E_s(t,v)$ be the evolution generated by $sH(\tau)$ from time $v$ to time $t$. Then
$$\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)=-i\int_v^t\mathrm{d}\tau\,E_s(t,\tau)H(\tau)E_s(\tau,v).$$
Proof sketch. We first consider the special case where $H(\tau)$ is continuous in $\tau$. We invoke the variation-of-parameters formula [30, Theorem 4.9] to construct the claimed integral representation for $\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)$. To this end, we need to find a differential equation satisfied by $\frac{\mathrm{d}}{\mathrm{d}t}\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)$ and the corresponding initial condition $\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)\big|_{t=v}$. Differentiating the Schrödinger equation with respect to $s$, we obtain
$$\frac{\mathrm{d}}{\mathrm{d}t}\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)=-isH(t)\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)-iH(t)E_s(t,v).$$
Invoking the variation-of-parameters formula, we find an integral representation of $\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)$ in terms of this equation and the initial condition. It thus remains to find the initial condition $\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)\big|_{t=v}$. We start from the Schrödinger equation $\frac{\mathrm{d}}{\mathrm{d}t}E_s(t,v)=-isH(t)E_s(t,v)$ and apply the fundamental theorem of calculus with initial condition $E_s(v,v)=I$, obtaining the integral representation
$$E_s(t,v)=I-is\int_v^t\mathrm{d}\tau\,H(\tau)E_s(\tau,v).$$
Differentiating this equation with respect to $s$ and setting $t=v$ shows that the initial condition vanishes. Combining these two facts establishes the claimed integral representation for $\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)$. Now consider the case where $H(\tau)$ is piecewise continuous with one discontinuity at $t_1\in[v,t]$. We use the multiplicative property to break the evolution at $t_1$, so that each subevolution is generated by a continuous Hamiltonian. We have
$$\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,v)=\left(\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,t_1)\right)E_s(t_1,v)+E_s(t,t_1)\left(\frac{\mathrm{d}}{\mathrm{d}s}E_s(t_1,v)\right).$$
The general case of finitely many discontinuities follows by induction.
Note that our argument implicitly assumes the existence of the derivatives and that we can interchange the order of d ds and d dt . A rigorous justification of these assumptions is beyond the scope of the paper; we refer the reader to [21, p. 35] for details.
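The scaling-rate formula can also be verified numerically by finite differences. The following sketch (our own check, with a hypothetical single-qubit $H(\tau)$) compares $\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,0)$ with $-i\int_0^t\mathrm{d}\tau\,E_s(t,\tau)H(\tau)E_s(\tau,0)$:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical single-qubit time-dependent Hamiltonian for the check.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = lambda tau: np.cos(tau) * X + 0.3 * Z

t, N = 1.0, 100
dt = t / N

def E(s, a, b):
    """Evolution under the rescaled Hamiltonian s*H(tau) from time a to b
    (midpoint product discretization)."""
    steps = max(1, round((b - a) / dt))
    h = (b - a) / steps
    U = np.eye(2, dtype=complex)
    for k in range(steps):
        U = expm(-1j * s * H(a + (k + 0.5) * h) * h) @ U
    return U

s, ds = 0.6, 1e-5
# Left-hand side: central finite difference of E_s(t,0) in s.
lhs = (E(s + ds, 0, t) - E(s - ds, 0, t)) / (2 * ds)
# Right-hand side: -i * int_0^t dtau E_s(t,tau) H(tau) E_s(tau,0), midpoint rule.
mids = np.arange(0, t, dt) + 0.5 * dt
rhs = sum(E(s, m, t) @ (-1j * H(m)) @ E(s, 0, m) for m in mids) * dt

assert np.linalg.norm(lhs - rhs, 2) < 1e-2
```

The residual is dominated by the $O(\mathrm{d}t^2)$ discretization of the evolutions and the quadrature, not by the formula itself.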
Invoking Lemma 6 at $s=0$, where $E_0(t,0)=I$, we have
$$\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,0)\Big|_{s=0}=-i\int_0^t\mathrm{d}\tau\,H(\tau).$$
Thus, the first derivatives of $\mathcal{E}_s(t,0)(\rho)$ and $\mathcal{U}_s(t,0)(\rho)$ at $s=0$ agree with each other:
$$\frac{\mathrm{d}}{\mathrm{d}s}\mathcal{E}_s(t,0)(\rho)\Big|_{s=0}=\frac{\mathrm{d}}{\mathrm{d}s}\mathcal{U}_s(t,0)(\rho)\Big|_{s=0}=-i\int_0^t\mathrm{d}\tau\left[H(\tau),\rho\right].$$
Applying the fundamental theorem of calculus twice, we can therefore bound the distance between the two channels in terms of the second derivatives of $\mathcal{E}_s(t,0)$ and $\mathcal{U}_s(t,0)$ with respect to $s$. By properties of the Schatten norms and the definition $p(\tau)=\left\lVert H(\tau)\right\rVert_\infty/\left\lVert H\right\rVert_{\infty,1}$, we find that the contribution of $\mathcal{U}_s(t,0)$ is $O\left(\left\lVert H\right\rVert_{\infty,1}^2\right)$.

Lemma 6 immediately yields the upper bound $\left\lVert\frac{\mathrm{d}}{\mathrm{d}s}E_s(t,0)\right\rVert_\infty\le\int_0^t\mathrm{d}\tau\left\lVert H(\tau)\right\rVert_\infty=\left\lVert H\right\rVert_{\infty,1}$.
It thus remains to bound $\left\lVert\frac{\mathrm{d}^2}{\mathrm{d}v^2}E_v(t,0)\right\rVert_\infty$. Using Lemma 6 twice, we obtain two terms, each containing two integrals of the Hamiltonian, which implies
$$\left\lVert\frac{\mathrm{d}^2}{\mathrm{d}v^2}E_v(t,0)\right\rVert_\infty\le 2\left\lVert H\right\rVert_{\infty,1}^2.$$
We finally obtain the desired bound (69) as claimed.
The above error bound works well for a short-time evolution. When $t$ is large, in order to control the simulation error, we divide the entire evolution into segments $[t_j, t_{j+1}]$ with $0 = t_0 < t_1 < \cdots < t_r = t$ and apply continuous qDRIFT within each. We employ a variable-time scheme to segment the evolution, so that our $L^1$-norm scaling result generalizes to a long-time evolution. Specifically, let $p(\tau) = \lVert H(\tau)\rVert_\infty / \lVert H\rVert_{\infty,1}$. Then, for any positive integer $r$, there exists a division $0 = t_0 < t_1 < \cdots < t_r = t$ such that the stated error bound holds; to ensure that the simulation error is at most $\epsilon$, it thus suffices to choose $r$ accordingly. Proof. The times $t_1, \ldots, t_{r-1}$ are selected as follows. We aim to simulate each segment with the same accuracy. To achieve this, we define $t_1, \ldots, t_{r-1}$ so that each segment carries an equal share of the $L^1$ norm, as in (74). The existence of such times is guaranteed by the intermediate value theorem. By telescoping, we find from Theorem 5 that the segment errors add, which establishes the claimed error bound.
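The variable-time segmentation can be carried out classically by root-finding. The sketch below constructs the division guaranteed by the intermediate value theorem for a hypothetical norm profile $\lVert H(\tau)\rVert$ with a sharp pulse; the profile and the number of segments are illustrative.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

# Hypothetical instantaneous norm ||H(tau)||: a sharp pulse on a flat background.
norm_H = lambda tau: 1.0 + 5.0 * np.exp(-((tau - 2.0) ** 2) / 0.1)
t, r = 4.0, 8

f = lambda tau: quad(norm_H, 0.0, tau)[0]   # f(tau) = int_0^tau ||H(u)|| du
total = f(t)                                # the L1 norm ||H||_{inf,1}

# f is continuous and strictly increasing, so by the intermediate value
# theorem each equation f(tau) = j * total / r has a unique root in [0, t].
cuts = [0.0] + [brentq(lambda tau, j=j: f(tau) - j * total / r, 0.0, t)
                for j in range(1, r)] + [t]

# Every segment [t_j, t_{j+1}] now carries the same share of the L1 norm.
shares = [quad(norm_H, a, b)[0] for a, b in zip(cuts, cuts[1:])]
```

Segments near the pulse come out short while those far from it are long, which is exactly the variable-time behavior the scheme exploits.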

Universality
We now show that the continuous qDRIFT method introduced above can be applied in the far more general LC model, where the Hamiltonian is a sum of time-dependent terms; in this sense it can be regarded as a universal method. Recall from Section 2.3 that in the general LC model, the Hamiltonian can be expressed as $H(\tau) = \sum_{l=1}^{L} H_l(\tau)$, where each $H_l(\tau)$ is continuous and can be efficiently exponentiated on a quantum computer. This includes many familiar models as special cases. The continuous qDRIFT sampler draws the pair $(l, \tau)$ according to a probability distribution $p_l(\tau)$ proportional to the instantaneous norm of $H_l(\tau)$. To analyze the performance of this sampler, we could adapt the analyses of Theorem 5 and Theorem 7, but this becomes more complicated as we are now sampling from a discrete-continuous probability distribution $p_l(\tau)$. Fortunately, a significant amount of effort can be saved with the help of the following universal property.

Theorem 8 (Universality of continuous qDRIFT). Let $H(\tau) = \sum_{l=1}^{L} H_l(\tau)$ be a time-dependent Hamiltonian defined for $0 \le \tau \le t$ that is nonzero everywhere. Assume that each $H_l(\tau)$ is continuous and nonzero everywhere. Define the probability distribution $p_l(\tau) := \lVert H_l(\tau)\rVert_\infty / \lVert H\rVert_{\infty,1,1}$.
Then there exists a time-dependent Hamiltonian $G(\tau)$ defined for $0 \le \tau \le t$ with finitely many discontinuities, such that the claimed correspondence holds between the discrete-continuous distribution $p_l(\tau)$ and the continuous probability distribution $q(\tau) := \lVert G(\tau)\rVert_\infty / \lVert G\rVert_{\infty,1}$.
Before presenting the proof, we explain how Theorem 8 can be applied to simulation in the LC model. We expect that the mixed-unitary channel defined by sampling $p_l(\tau)$ approximates the ideal evolution with $L^1$-norm scaling as in Theorem 5 and Theorem 7, but a direct analysis would be considerably more complicated. However, universality (Statement 3 of Theorem 8) shows that this channel is the same as the continuous qDRIFT channel for $G(\tau)$ with distribution $q(\tau)$. Thus, the analysis of Section 3.1 can be applied with the help of Theorem 8.
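As an illustration of this application (a numerical sketch, not the paper's construction), the mixed-unitary channel averaged deterministically over $p_l(\tau)$ can be compared against the ideal LC evolution for a hypothetical two-term qubit Hamiltonian:

```python
import numpy as np
from scipy.linalg import expm

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Hypothetical two-term LC Hamiltonian on [0, 1]; both terms are nonzero there.
H_terms = [lambda tau: 0.15 * np.cos(tau) * X,
           lambda tau: 0.08 * (1 + tau) * Z]
t, N = 1.0, 2000
dt = t / N
mids = np.linspace(0.0, t, N, endpoint=False) + dt / 2

# ||H||_{inf,1,1} = sum_l int ||H_l(tau)|| dtau (midpoint rule)
norms = np.array([[np.linalg.norm(Hl(u), 2) for u in mids] for Hl in H_terms])
lam = norms.sum() * dt

# Ideal evolution under H = H_1 + H_2 via short time slices.
U = np.eye(2, dtype=complex)
for u in mids:
    U = expm(-1j * (H_terms[0](u) + H_terms[1](u)) * dt) @ U
rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)   # |+><+|
rho_exact = U @ rho @ U.conj().T

# Mixed-unitary channel: average over (l, tau) ~ p_l(tau) = ||H_l(tau)|| / lam.
rho_ch = np.zeros((2, 2), dtype=complex)
for l, Hl in enumerate(H_terms):
    for k, u in enumerate(mids):
        p = norms[l, k] / lam
        V = expm(-1j * Hl(u) / p)
        rho_ch += dt * p * (V @ rho @ V.conj().T)

err = 0.5 * np.abs(np.linalg.eigvalsh(rho_exact - rho_ch)).sum()  # trace distance
```

The error is again second order in $\lVert H\rVert_{\infty,1,1}$, as the universality argument predicts.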
Proof of Theorem 8. We define $G(\tau)$ to be a piecewise Hamiltonian built from the terms $H_l(\tau)$, using an abbreviation for the marginal probability distribution. Statements 1 and 2 can both be proved by directly evaluating the integrals $\int_0^t d\tau\, \lVert G(\tau)\rVert_\infty$ and $\int_0^t d\tau\, G(\tau)$ (83). We then use Statement 1 to deduce Statement 3, which completes the proof.

Define two parametrized quantum channels $E_s(t,0)$ and $G_s(t,0)$, and observe that (91) holds. For readability, we only consider the trace norm $\lVert E_1(t,0)(\rho) - G_1(t,0)(\rho)\rVert_1$; the analysis can be easily adapted to bound $\lVert (E_1(t,0)\otimes\mathbb{1})(\sigma) - (G_1(t,0)\otimes\mathbb{1})(\sigma)\rVert_1$ and thus the diamond-norm distance $\lVert E_1(t,0) - G_1(t,0)\rVert_\diamond$. By Lemma 6 and Theorem 8, the first derivatives of $E_s(t,0)(\rho)$ and $G_s(t,0)(\rho)$ at $s=0$ agree with each other (92). Thus, we can apply the fundamental theorem of calculus twice to obtain an error bound in terms of the probability distribution $p_l(\tau) := \lVert H_l(\tau)\rVert_\infty / \lVert H\rVert_{\infty,1,1}$. Then, for any positive integer $r$, there exists a division $0 = t_0 < t_1 < \cdots < t_r = t$ such that each segment is simulated with the desired accuracy; to ensure that the total simulation error is at most $\epsilon$, it suffices to choose $r$ accordingly. The proof follows from Theorem 5 using the same reasoning as that used to prove Theorem 7.

Complexity of the continuous qDRIFT algorithm
As an immediate consequence of universality, we obtain the complexity of the continuous qDRIFT algorithm for simulating time-dependent Hamiltonians in both the SM and the LCU models.
where the second equality follows because $H_j(\tau)$ is 1-sparse, and the inequality follows from Lemma 3. Assuming $\lVert H_j(\tau)\rVert_\infty / \lVert H\rVert_{\infty,1,1}$ can be sampled efficiently, Theorem 7 implies the sample complexity, and thus the query complexity, of the algorithm. For each elementary exponentiation, we initialize a quantum register in the computational basis state $|\tau, j\rangle$ and use it to control the 1-sparse term we need to simulate. This can be done with gate complexity $O(n)$. Since the number of 1-sparse simulations equals the query complexity, we obtain the gate complexity as claimed.
Our above argument assumes that $\lVert H_j(\tau)\rVert_\infty$ is known a priori and that the distribution $\lVert H_j(\tau)\rVert_\infty / \lVert H\rVert_{\infty,1,1}$ can be efficiently sampled. However, the argument still works if we replace each $\lVert H_j(\tau)\rVert_\infty$ by an efficiently computable upper bound. A similar result holds in the LCU model, assuming that the probability distribution $p_l(\tau) := \alpha_l(\tau)/\lVert \alpha\rVert_{1,1}$ can be efficiently sampled.
Proof. For any $H(\tau) = \sum_{l=1}^{L} \alpha_l(\tau) H_l$, we estimate the relevant norms, and the claimed complexity then follows from Theorem 7.

Rescaled Dyson-series algorithm
In this section, we propose a general principle for rescaling the Schrödinger equation (Section 4.1). We then apply this principle to improve the Dyson-series algorithm (Section 4.2), achieving $L^1$-norm scaling.

A rescaling principle for the Schrödinger equation
Let $H(\tau)$ be a time-dependent Hamiltonian defined for $0 \le \tau \le t$. The evolution under $H(\tau)$ for time $t$ is given by the unitary operator $E(t,0) = \exp_{\mathcal{T}}\big({-i}\int_0^t d\tau\, H(\tau)\big)$, which satisfies the Schrödinger equation. We now propose a rescaling principle that helps to achieve $L^1$-norm scaling. The goal is to effectively have a Hamiltonian with constant spectral norm. Recall that for a time-independent Hamiltonian, one can multiply the time by a constant and divide the Hamiltonian by the same constant to obtain the same time evolution. We can achieve something similar for a time-dependent Hamiltonian by rescaling the total evolution time to $s = \lVert H\rVert_{\infty,1} = \int_0^t d\tau\, \lVert H(\tau)\rVert_\infty$ and using the rescaled Hamiltonian $\overline{H}(\varsigma) := H\big(f^{-1}(\varsigma)\big)/\big\lVert H\big(f^{-1}(\varsigma)\big)\big\rVert_\infty$, where $f(\tau) := \int_0^\tau du\, \lVert H(u)\rVert_\infty$ is the change of variable. From this definition, it is obvious that the rescaled Hamiltonian has constant norm $\lVert \overline{H}(\varsigma)\rVert_\infty = 1$. Moreover, the rescaled time-evolution operator satisfies the Schrödinger equation after the change of variable; solving this equation shows that we can obtain exactly the same time-evolution operator using the rescaled time and Hamiltonian. We also have the norm equality $s \max_\varsigma \lVert \overline{H}(\varsigma)\rVert_\infty = \lVert H\rVert_{\infty,1}$, so any algorithm that simulates the rescaled Hamiltonian $\overline{H}(\varsigma)$ with complexity that scales with the $L^\infty$ norm can simulate the original Hamiltonian with $L^1$-norm scaling. While our discussion above uses the spectral norm $\lVert\cdot\rVert_\infty$, other norms may be used depending on the input model of the Hamiltonian. Indeed, in the analysis of the SM model below we use the max-norm instead of the spectral norm.
Note that it may be hard in practice to compute the exact value of $\lVert H(\tau)\rVert$. However, we can instead use the change of variable defined by $\Lambda(\tau)$, where $\Lambda(\tau) \ge \lVert H(\tau)\rVert$ is any upper bound on the norm that can be efficiently computed.
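The rescaling principle is easy to verify numerically. The sketch below builds $f$, inverts it by interpolation, and checks that evolving under $\overline{H}(\varsigma)$ for the rescaled time reproduces the original evolution; the Hamiltonian and discretization are illustrative choices.

```python
import numpy as np
from scipy.linalg import expm

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Hypothetical H(tau) whose norm varies strongly over [0, t].
H = lambda tau: (1.0 + 0.9 * np.sin(3 * tau)) * X + 0.4 * Z
t, N = 2.0, 4000
dt = t / N
grid = np.linspace(0.0, t, N + 1)

# Change of variable f(tau) = int_0^tau ||H(u)|| du (cumulative trapezoid rule).
norms = np.array([np.linalg.norm(H(u), 2) for u in grid])
f_vals = np.concatenate(([0.0], np.cumsum(0.5 * (norms[1:] + norms[:-1]) * dt)))
s_total = f_vals[-1]                          # rescaled total time ||H||_{inf,1}
f_inv = lambda s: np.interp(s, f_vals, grid)  # f is monotone, so interpolation inverts it

def Hbar(sigma):                              # rescaled Hamiltonian, unit spectral norm
    M = H(f_inv(sigma))
    return M / np.linalg.norm(M, 2)

# Midpoint products of short exponentials approximate each time-ordered evolution.
U_orig = np.eye(2, dtype=complex)
for u in grid[:-1]:
    U_orig = expm(-1j * H(u + dt / 2) * dt) @ U_orig

ds = s_total / N
U_resc = np.eye(2, dtype=complex)
for k in range(N):
    U_resc = expm(-1j * Hbar((k + 0.5) * ds) * ds) @ U_resc

diff = np.linalg.norm(U_orig - U_resc, 2)
```

Up to discretization error, the two unitaries coincide, while the rescaled Hamiltonian has unit norm at every rescaled time.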

Complexity of the rescaled Dyson-series algorithm
In this section, we show how the Dyson-series algorithm [8, 29, 36] can be rescaled to achieve $L^1$-norm scaling. We address the SM model of Hamiltonian access first, before handling the LCU model (see Section 2.3 for definitions of these models).
Unlike continuous qDRIFT, the rescaled Dyson-series algorithm requires additional oracle access to the input Hamiltonian. Specifically, we need an oracle $O_{\mathrm{var}}$ that implements the inverse change of variable $f^{-1}(\varsigma)$ and an oracle $O_{\mathrm{norm}}$ that computes the max-norm $\lVert H(\tau)\rVert_{\max}$. A quantum computer with access to these oracles can simulate time-dependent Hamiltonians with $L^1$-norm scaling. Note that because $f(\tau)$ increases monotonically, we can use binary search to compute $f^{-1}(\varsigma)$ up to precision $\delta$ using $O(\log(t/\delta))$ queries to $f$, so we expect the oracle $O_{\mathrm{var}}$ to be straightforward to implement in practice.
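A classical analogue of this binary-search inversion might look as follows, with a hypothetical upper-bound function standing in for the norm:

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical efficiently computable bound Lambda(tau) >= ||H(tau)||.
Lam = lambda tau: 1.0 + np.cos(tau) ** 2
f = lambda tau: quad(Lam, 0.0, tau)[0]      # monotone since Lam > 0

def f_inverse(s, t, delta):
    """Invert f on [0, t] to precision delta using O(log(t/delta)) queries to f."""
    lo, hi, queries = 0.0, t, 0
    while hi - lo > delta:
        mid = (lo + hi) / 2
        queries += 1
        if f(mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, queries

t, delta = 10.0, 1e-6
s = f(t) * 0.37                             # some target value in [0, f(t)]
tau_star, queries = f_inverse(s, t, delta)
```

The query count matches the stated $O(\log(t/\delta))$ bound: roughly $\lceil \log_2(t/\delta) \rceil$ evaluations of $f$.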
This construction is similar to [36, Lemma 8], except that the Hamiltonian is rescaled. Specifically, we use the oracles $O_{\mathrm{var}}$ and $O_{\mathrm{norm}}$ to implement a transformation from which we obtain the rescaled Hamiltonian: we query $O_{\mathrm{val}}$ and renormalize the result by $\lVert H(f^{-1}(\varsigma))\rVert_{\max}$ to compute $\overline{H}_{jk}(\varsigma)$. We then uncompute the ancilla registers storing $f^{-1}(\varsigma)$ and $\lVert H(f^{-1}(\varsigma))\rVert_{\max}$. Overall, this implements a rescaled oracle. The remaining algorithm proceeds as in [36].
The gate complexity of preparing the time registers is $\log M$ times the query complexity, where we have shown that $M$ is polynomial in norms of $H$ and its derivative. Since this is a logarithmic cost, the contribution to the complexity from preparing the time registers is $O\big(d \lVert H\rVert_{\max,1}\big)$ up to logarithmic factors.
The remaining contribution to the gate complexity comes from acting on the system itself. The cost of this is $O(n)$ for each of the oracle queries, which gives gate complexity $O\big(d \lVert H\rVert_{\max,1}\, n\big)$ (this is the dominant cost in (117)).
Thus, the rescaled Dyson-series algorithm can simulate time-dependent Hamiltonians in the SM model with $L^1$-norm scaling. Next we turn our attention to the LCU model. For an input Hamiltonian $H(\tau) = \sum_{l=1}^{L} \alpha_l(\tau) H_l$, this approach assumes quantum access to the coefficient oracle $O_{\mathrm{coeff}}$. The rescaled Hamiltonian takes the form $\overline{H}(\varsigma) = \sum_{l=1}^{L} \overline{\alpha}_l(\varsigma) H_l$, where $\overline{\alpha}_l(\tau) := \alpha_l(\tau)/\lVert \alpha(\tau)\rVert_\infty$. Therefore, we have a global upper bound on the absolute value of the coefficients: $\lVert \overline{\alpha}\rVert_{\infty,\infty} \le 1$.
The remaining construction is similar to [29, Section V C], except that the Hamiltonian is rescaled. Specifically, we use the oracles $O_{\mathrm{var}}$ and $O_{\mathrm{norm}}$ to implement a transformation from which we obtain the rescaled coefficients by querying $O_{\mathrm{coeff}}$ and doing arithmetic. We then uncompute the ancilla registers storing $f^{-1}(\varsigma)$ and $\lVert \alpha(f^{-1}(\varsigma))\rVert_\infty$. Overall, this implements a rescaled coefficient oracle, and the remaining algorithm proceeds as in [29].
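Classically, the arithmetic performed by this rescaled oracle can be sketched as follows; the coefficient functions and grid are hypothetical, and we divide by $\lVert \alpha(\tau)\rVert_\infty$ as in the text.

```python
import numpy as np

# Hypothetical LCU coefficients alpha_l(tau) for L = 3 unitaries.
alpha = lambda tau: np.array([1.2 + np.sin(tau), 0.5 * np.cos(tau), 0.3])
t, N = 3.0, 3000
dt = t / N
grid = np.linspace(0.0, t, N + 1)

# Change of variable f(tau) built from the instantaneous coefficient norm.
norm_inf = np.array([np.max(np.abs(alpha(u))) for u in grid])
f_vals = np.concatenate(([0.0], np.cumsum(0.5 * (norm_inf[1:] + norm_inf[:-1]) * dt)))
f_inv = lambda s: np.interp(s, f_vals, grid)   # f is monotone, so interpolation inverts it

def alpha_rescaled(sigma):
    """Rescaled coefficients at rescaled time sigma (the arithmetic behind the oracle)."""
    a = alpha(f_inv(sigma))
    return a / np.max(np.abs(a))               # divide by ||alpha(tau)||_inf

checks = [np.max(np.abs(alpha_rescaled(s)))
          for s in np.linspace(0.0, f_vals[-1], 50)]
```

Every rescaled coefficient vector has max-norm exactly one, giving the global bound $\lVert \overline{\alpha}\rVert_{\infty,\infty} \le 1$.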
That complexity may be obtained from [18,Lemma G.7], or the unary iteration procedure of [2, Section III A].

Applications to chemistry and scattering theory
There are numerous cases in physics where one needs to simulate time-dependent quantum systems. Indeed, the pulse sequences that constitute individual quantum gates, as well as adiabatic sweeps, are described by time-dependent Hamiltonians. Here, we consider the particular case of simulating semi-classical scattering of molecules within a chemical reaction as an example of time-dependent Hamiltonian dynamics [24, 46]. Chemical scattering problems involve colliding reagents. As the molecules move closer, the electronic configuration changes due to strengthening Coulomb interactions, which is ultimately responsible for the reagents either forming a bond or flying apart, depending on the initial conditions and the nature of the reagents. In the non-relativistic case, the Hamiltonian for two colliding atoms $A$ and $B$ at positions $x_A$ and $x_B$, respectively, and $M$ electrons with positions $\hat{x}_m$ for $m = 1, \ldots, M$, can be expressed as a sum of nuclear and electronic terms. Here the electronic positions $\hat{x}_m$ are three-dimensional vectors of operators, whereas the corresponding nuclear terms (such as $x_A$ and $x_B$) are three-dimensional vectors of scalars; we further define $|\hat{x}_m - \hat{x}_{m'}|$ to be the corresponding distance operator. The wave function can be thought of as having a nuclear as well as an electronic component. First, we assume that the nuclear and the electronic wave functions are decoupled [26, 49]. This approximation is justified by the fact that the nuclear mass is substantially greater than the electronic mass. We then follow the time-dependent self-consistent field (TDSCF) approximation, which further treats $x_A$ and $x_B$ as classical degrees of freedom $x_A(t)$ and $x_B(t)$ with conjugate momenta $(p_A(t), p_B(t))$. This simplification is justified by Ehrenfest's theorem, which states that for a sufficiently narrow quantum wave packet, the equation of motion for the centroid follows the classical trajectory (to leading order in $\hbar$).
Under this approximation, the electronic dynamics satisfy a Schrödinger equation generated by the electronic Hamiltonian, where we have suppressed the implicit dependence of the electronic wave function on $\hat{x}_1, \ldots, \hat{x}_M$. The equation of motion for the two nuclear positions in the time-dependent self-consistent field approximation within the Ehrenfest method is given by the Hamilton-Jacobi equation for $x_A$, and similarly for $x_B$. The function $H_{\mathrm{nuc}}(t)$ here is simply the Hamiltonian $H_{\mathrm{nuc}}$ with the classical substitutions $x_A \to x_A(t)$, $p_A \to p_A(t)$, and similarly for $x_B$ and $p_B$; likewise, we define $H_{\mathrm{elec}}(t)$ to be the electronic Hamiltonian under this classical substitution. The evolution in the Ehrenfest method is thus governed by a pair of tightly coupled quantum and classical dynamical equations, wherein the full Schrödinger equation only needs to be solved for part of the dynamics of the system. Indeed, since the Born-Oppenheimer approximation holds instantaneously under the above approximations, we can further express the electronic dynamics within a second-quantized framework with respect to a basis of orthonormal molecular orbitals $\psi_p(x; t)$ (which are implicitly time dependent if these basis functions are chosen to be functions of the nuclear positions, as would be appropriate for an atomic orbital basis). Thus, under the above approximations, the dynamics that need to be simulated take the form of a standard second-quantized simulation of chemistry, except that the Hamiltonian is time dependent. Consider the case where two reagents move towards each other from distant points with large momenta. To get an intuitive understanding of this evolution, it is instructive to examine the case of two molecules colliding under a classical force field, which gives a qualitatively accurate expression for $x_A(t)$ and $x_B(t)$. To do this, we use a Lennard-Jones potential to model the interaction between two helium nuclei.
The potential as a function of the separation $r(t) = |x_A(t) - x_B(t)|$ between the nuclei is assumed to be of the form $V(r) = \epsilon\left(\frac{r_m^{12}}{r^{12}} - \frac{2 r_m^{6}}{r^{6}}\right)$, where $\epsilon \approx 10\,\mathrm{K}$ and $r_m \approx 2.6\,$Å [44]. Setting the initial radial velocity to be approximately the root-mean-square (RMS) velocity of helium at 25 °C, we solve the classical equations of motion to find the trajectory shown in Figure 1.
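A minimal sketch of this classical collision, in reduced Lennard-Jones units rather than the physical helium parameters, can be integrated as follows:

```python
import numpy as np

# Reduced Lennard-Jones units: epsilon = r_m = reduced mass = 1.
eps, rm, mu = 1.0, 1.0, 1.0
V  = lambda r: eps * (rm**12 / r**12 - 2 * rm**6 / r**6)
dV = lambda r: eps * (-12 * rm**12 / r**13 + 12 * rm**6 / r**7)

r, v = 5.0, -2.0             # start far apart, moving inward (head-on collision)
dt, nsteps = 1e-4, 60000
E0 = 0.5 * mu * v**2 + V(r)

traj = np.empty(nsteps)
for k in range(nsteps):
    # velocity Verlet for the radial coordinate
    a = -dV(r) / mu
    r += v * dt + 0.5 * a * dt**2
    v += 0.5 * (a - dV(r) / mu) * dt
    traj[k] = r
E1 = 0.5 * mu * v**2 + V(r)
```

The trajectory shows a brief close approach inside the well minimum followed by separation, with the interaction energy spiking only near the point of closest approach.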
From Figure 1, we find that the interaction appears as a brief but intense kick between the two systems. As a result, the norm of the Hamiltonian changes dramatically throughout the evolution and we expect simulation algorithms with L 1 -norm scaling to be advantageous over previous approaches. We leave a detailed study of such an advantage as a subject for future work.
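The coupled quantum-classical Ehrenfest equations discussed above can also be prototyped directly. The toy model below (our own illustrative sketch, not the paper's construction) couples a single classical coordinate to a two-level electronic system via a mean-field (Hellmann-Feynman) force; all parameters are hypothetical.

```python
import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Toy Ehrenfest dynamics: classical coordinate x coupled to a two-level system.
m, g, delta = 10.0, 0.5, 1.0                  # illustrative, not physical, parameters
H_el = lambda x: 0.5 * delta * sz + g * x * sx
dH_dx = g * sx

x, p = 1.0, 0.0
psi = np.array([1.0, 0.0], dtype=complex)
dt, steps = 1e-3, 5000

force = lambda psi: -np.real(np.vdot(psi, dH_dx @ psi))  # mean-field force
energy = lambda x, p, psi: p**2 / (2 * m) + np.real(np.vdot(psi, H_el(x) @ psi))

E0 = energy(x, p, psi)
for _ in range(steps):
    p += 0.5 * dt * force(psi)            # half kick on the classical momentum
    x += dt * p / m                       # classical drift
    psi = expm(-1j * H_el(x) * dt) @ psi  # exact short-time electronic propagation
    p += 0.5 * dt * force(psi)            # second half kick
E1 = energy(x, p, psi)
```

The mean-field dynamics conserve the total (classical plus quantum) energy, which the integrator reproduces to within discretization error, and the unitary electronic step preserves the norm of the wave function exactly.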

Discussion
We have shown that a time-dependent Hamiltonian $H(\tau)$ can be simulated over the time interval $0 \le \tau \le t$ with gate complexity that scales with the $L^1$ norm $\int_0^t d\tau\, \lVert H(\tau)\rVert$. We designed new algorithms based on classical sampling and improved the previous Dyson-series approach to achieve this scaling. This is a polynomial speedup in terms of the norm dependence, an advantage that can be favorable in practice. In particular, our result has potential applications to simulating scattering processes in quantum chemistry. Our analysis also matches the intuition that the difficulty of simulating a quantum system should depend on the instantaneous norm of the Hamiltonian. This interpretation suggests that the $L^1$-norm dependence of our result cannot be significantly improved. However, further speedup might be possible if we know a priori the energy range of the initial state, as is suggested in [34, 43].
The rescaled Dyson-series approach is nearly optimal with respect to all parameters of interest. Indeed, a lower bound of $\Omega\big(d \lVert H\rVert_{\max} t + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\big)$ was given in [9, Theorem 2] for simulating time-independent sparse Hamiltonians, which of course also holds for the more general time-dependent case. The query complexity (116) of the rescaled Dyson-series approach matches this dependence on $d \lVert H\rVert_{\max} t$ and on $\epsilon$, except that it scales with the product of the two terms instead of their sum (so, as in all quantum simulation algorithms prior to the advent of quantum signal processing [33], it does not achieve the optimal tradeoff between $t$ and $\epsilon$). However, this approach requires computing the rescaling function (114) and the Hamiltonian norm (115) in quantum superposition, which may introduce large overhead in practice. In comparison, continuous qDRIFT relies on classical sampling and may be better suited to near-term simulation. Its complexity has no dependence on the parameter $L$ in the LCU decomposition (Corollary 9), which is advantageous for Hamiltonians consisting of many terms.
For most of our analysis, we have assumed that the Hamiltonian $H(\tau)$ is continuously differentiable. This assumption can be relaxed to allow finitely many discontinuities. Indeed, if $H(\tau)$ is discontinuous at the times $0 = t_0 < t_1 < \cdots < t_r = t$ but otherwise continuously differentiable, we may divide the evolution into $r$ segments and apply a time-dependent Hamiltonian simulation algorithm within each time interval $[t_j, t_{j+1}]$. For the Dyson-series approach, the complexity depends linearly on the $L^1$ norm, so concatenation gives a simulation of the entire evolution with $L^1$-norm scaling. The assumptions on the Hamiltonian can be relaxed even further: the continuous qDRIFT algorithm works properly provided only that $H(\tau)$ is Lebesgue integrable. Further discussion of this point is beyond the scope of this paper; we refer the reader to [21] for details.
Our analysis can also be adapted to simulate time-dependent Hamiltonians that have countably many zeros. Indeed, since the equation $\lVert H(\tau)\rVert = 0$ has at most countably many solutions, we can find $c \in \mathbb{R}$ such that $H(\tau) + cI$ is nonzero everywhere; the evolutions under $H(\tau)$ and $H(\tau) + cI$ then differ only by a global phase. Note that this assumption can be completely dropped if we use continuous qDRIFT: we define the exceptional set $B_0$ of times at which the Hamiltonian vanishes and redefine $U(t,0)$ accordingly. We note that $U(t,0)$ is a valid quantum channel and can be implemented with unit cost. Indeed, for any input state $\rho$, we randomly sample a value $\tau$ according to $p(\tau)$ and perform $e^{-iH(\tau)/p(\tau)}$ if $\tau \in [0,t]\setminus B_0$, and the identity operation otherwise. The remaining analysis proceeds as in Section 3. The qDRIFT protocol that we analyzed here only achieves first-order accuracy. It is natural to ask whether sampling a different probability distribution could lead to an algorithm with better performance. The answer seems to be "no" if we only use a univariate distribution. To see this, consider the discrete case where $H = \sum_{l=1}^{L} H_l$ is a Hamiltonian consisting of $L$ terms, and suppose we sample the terms according to a probability vector $p$ (153). By Jensen's inequality, the resulting error bound is minimized, with equality if and only if all $\lVert H_l\rVert_\infty / p_l$ are equal, implying that the probability distribution $p_l := \lVert H_l\rVert_\infty / \lVert H\rVert_{\infty,1}$ is optimal. A similar optimality result holds for continuous qDRIFT (though the proof is more involved). However, this does not preclude the existence of a higher-order qDRIFT protocol using more complicated sampling [39]. For example, besides the basic evolutions $e^{-itH_l/p_l}$, one could evolve under commutators $[H_j, H_k]$ or anticommutators $\{H_j, H_k\}$. We could also use a multivariate distribution and correlate different steps of the qDRIFT protocol. For future work, it would be interesting to find a higher-order protocol, or to prove that no such protocol exists.
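The Jensen-inequality argument is easy to check numerically. The sketch below uses hypothetical term norms and verifies that no probability vector beats $p_l \propto \lVert H_l\rVert_\infty$ for the cost proxy $\sum_l \lVert H_l\rVert_\infty^2 / p_l$, which governs the first-order qDRIFT error.

```python
import numpy as np

# Hypothetical term norms ||H_l|| for a four-term Hamiltonian H = H_1 + ... + H_4.
norms = np.array([3.0, 1.0, 0.5, 0.25])
lam = norms.sum()                      # ||H||_{inf,1} in the discrete setting

# Jensen's inequality lower-bounds sum_l ||H_l||^2 / p_l by lam^2,
# with equality iff all ||H_l|| / p_l are equal, i.e. p_l ∝ ||H_l||.
cost = lambda p: np.sum(norms**2 / p)

p_opt = norms / lam                    # the qDRIFT distribution
p_uni = np.full(len(norms), 1.0 / len(norms))

rng = np.random.default_rng(0)
random_costs = []
for _ in range(1000):                  # random distributions never do better
    q = rng.random(len(norms))
    q /= q.sum()
    random_costs.append(cost(q))
```

Uniform sampling, and indeed every randomly drawn distribution, incurs a strictly larger cost than the norm-biased choice.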
The fractional-query algorithm described in Section 2.5 provides a natural approach to simulating time-dependent Hamiltonians whose query complexity scales with the $L^1$ norm. While we believe such scaling also holds for the gate complexity, giving an explicit implementation would be highly nontrivial. In any case, the fractional-query approach is streamlined by the Dyson-series approach, and the latter can be rescaled to achieve $L^1$-norm scaling.
The rescaling principle that we proposed can potentially be applied to improve other quantum simulation algorithms. For example, we can use the product-formula algorithm [50] to simulate the rescaled Hamiltonian $\overline{H}(\varsigma) := H\big(f^{-1}(\varsigma)\big)/\big\lVert H\big(f^{-1}(\varsigma)\big)\big\rVert_\infty$ for time $s = \lVert H\rVert_{\infty,1}$. The difficulty here is that the derivative of the rescaled Hamiltonian can be larger than that of the original one, making the rescaled algorithm perform worse. We leave a thorough study of this issue as a subject for future work.
Finally, it would be interesting to identify further applications of our $L^1$-norm scaling result, such as designing new quantum algorithms and improving the performance of quantum chemistry simulation. It might also be of interest to demonstrate these approaches experimentally, for applications such as implementing adiabatic algorithms with quantum circuits.

The Monte Carlo estimation approach of [42] can also be improved by a better analysis of the same algorithm. Indeed, that approach uses a uniform distribution to pick random times during the Monte Carlo estimation. Such sampling ignores the instantaneous norm $\lVert H(\tau)\rVert_\infty$ of the Hamiltonian, so the resulting algorithm cannot scale with the $L^1$ norm $\int_0^t d\tau\, \lVert H(\tau)\rVert_\infty$. Instead, continuous qDRIFT uses a probability distribution biased toward times with larger instantaneous norm. In Section 3, we proved that such sampling gives a direct simulation of time-dependent Hamiltonians with complexity that scales with the $L^1$ norm. We now give an indirect implementation: (i') we show in Appendix A.1 that the error of replacing $\exp_{\mathcal{T}}\big({-i}\int_0^t d\tau\, H(\tau)\big)$ by an ordinary matrix exponential of the average Hamiltonian $H_{\mathrm{av}}$ scales like $O\big(\lVert H\rVert_{\infty,1}^2\big)$, improving the analysis of [42]; (ii') we further prove in Appendix A.2 that the average Hamiltonian can be simulated by continuous qDRIFT with $L^1$-norm scaling. Combining these two steps, we see that the Monte Carlo estimation approach of [42] is superseded by continuous qDRIFT.
To handle the second term, we use Lemma 1. Observe that the generator of $E(s,0)$ is $H(u)$ for $0 \le u \le s$, whereas the generator of $E(\tau,0)$ is $H(u)$ for $0 \le u \le \tau$, so the two generators differ only on the interval $\big[\min\{s,\tau\}, \max\{s,\tau\}\big]$. Combining these observations yields a bound on the spectral-norm error, which can be converted to a bound on the diamond-norm error using Lemma 2.

A.2 Implementing Hamiltonian averaging by continuous qDRIFT
Let $H(\tau)$ be a time-dependent Hamiltonian defined for $0 \le \tau \le t$ and assume that it is continuous and nonzero everywhere. We have shown that the ideal evolution can be approximated by an evolution under the average Hamiltonian with error that scales with the $L^1$ norm. We now show that this Hamiltonian averaging can itself be implemented by continuous qDRIFT, again with $L^1$-norm scaling. This improves over the algorithm of [42], which scales with the $L^\infty$ norm.
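As a numerical sanity check of the claimed $O\big(\lVert H\rVert_{\infty,1}^2\big)$ scaling, the sketch below compares the time-ordered evolution with the ordinary exponential of the average Hamiltonian for a hypothetical qubit Hamiltonian, and verifies the quadratic scaling by doubling the Hamiltonian strength:

```python
import numpy as np
from scipy.linalg import expm

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def error_vs_average(c, t=1.0, N=2000):
    """Spectral-norm error of replacing exp_T(-i int H) by exp(-i t H_av),
    for a hypothetical H(tau) of overall strength c."""
    H = lambda tau: c * (np.cos(tau) * X + np.sin(tau) * Z)
    dt = t / N
    mids = np.linspace(0.0, t, N, endpoint=False) + dt / 2
    U = np.eye(2, dtype=complex)
    for u in mids:
        U = expm(-1j * H(u) * dt) @ U
    H_av = sum(H(u) for u in mids) * (dt / t)   # average Hamiltonian
    return np.linalg.norm(U - expm(-1j * t * H_av), 2)

# Doubling the strength (and hence ||H||_{inf,1}) should roughly quadruple the error.
e1, e2 = error_vs_average(0.1), error_vs_average(0.2)
```

The observed error ratio is close to four, consistent with second-order scaling in $\lVert H\rVert_{\infty,1}$.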
Theorem 12 improves the constant prefactor from 8 to 4.
Optimizing over σ proves the claimed bound.