Improved Accuracy for Trotter Simulations Using Chebyshev Interpolation

Quantum metrology allows for measuring properties of a quantum system at the optimal Heisenberg limit. However, when the relevant quantum states are prepared using digital Hamiltonian simulation, the accrued algorithmic errors will cause deviations from this fundamental limit. In this work, we show how algorithmic errors due to Trotterized time evolution can be mitigated through the use of standard polynomial interpolation techniques. Our approach is to extrapolate to zero Trotter step size, akin to zero-noise extrapolation techniques for mitigating hardware errors. We perform a rigorous error analysis of the interpolation approach for estimating eigenvalues and time-evolved expectation values, and show that the Heisenberg limit is achieved up to polylogarithmic factors in the error. Our work suggests that accuracies approaching those of state-of-the-art simulation algorithms may be achieved using Trotter and classical resources alone for a number of relevant algorithmic tasks.


Introduction
Quantum simulation has become, arguably, the most promising application of quantum computing in the near term [1,2], with the potential to provide exponential speedups for a host of problems, ranging from electronic structure [2,3,4,5] to the simulation of scattering dynamics within quantum field theories [6,7,8]. The central challenge of digital Hamiltonian simulation is, given a fixed Hamiltonian, simulation time, and error tolerance, to provide a minimal-length sequence of quantum gates that approximates the unitary dynamics within that error tolerance.
Major strides have been made in the last several years towards this goal. Several classes of methods, such as Linear Combinations of Unitaries (LCU) methods [9,10,11,12,13], qubitization [14,15], and Trotter-based methods [16,17,18,19,20,21], have emerged as leading approaches for simulating quantum dynamics. Unlike the other aforementioned methods, Trotter methods yield a complexity that scales with the commutators of the Hamiltonian terms [18], which can lead to substantial performance improvements for simulations of local Hamiltonians [19]. However, Trotter methods scale super-polynomially worse with the error tolerance than other existing methods [22,11]. This makes such methods inferior to other strategies in cases where high accuracy is required.
It would be interesting if the gap in simulation accuracy between Trotter and other approaches could be bridged in a minimal way, such as classical post-processing. Richardson extrapolation has been proposed as a method to mitigate algorithmic and hardware errors for simulation and linear systems [23,24]. Recently, the approach has been demonstrated on quantum hardware [25] and yielded improvements compared to Trotter simulation alone. However, there remains a gap in the theoretical understanding of the errors, though one may be able to adapt existing results from the theory of multiproduct formulas.
The present work takes a different approach to the problem, achieving improved accuracy using polynomial interpolation, hereafter referred to simply as "interpolation." In this scheme, we take Trotter simulation data at various time step sizes, then interpolate this data with a polynomial. Our estimator is then the value of this polynomial at the desired ideal of zero step size. By choosing our nodes so that the interpolation is well-conditioned, we show that a Heisenberg-limited 1/ϵ scaling can be attained for estimating observables, up to logarithmic factors. This is in contrast to the 1/ϵ^{1+1/p} scaling obtained with a single Trotter estimate alone, where p is the order of the Trotter formula. Interestingly, our result also holds in cases where only the lowest-order Trotter formula is used, providing the first method with polylogarithmic overhead in the error that uses a constant number of ancillary qubits. We study this method in two cases. The first uses phase estimation to extract eigenvalues of the Hamiltonian, whereas the second uses dynamical simulation and amplitude estimation to learn expectation values of observables. We further validate these methods numerically using a newly proposed unbiased phase estimation method that we call Gaussian phase estimation.
The layout of the paper is as follows. In Section 2, we go over the basic setup of our work, including assumptions and notation. We review Suzuki-Trotter (ST) formulas and polynomial interpolation, state an important lemma, and present the well-conditioned interpolation formulas that we use to enable our approach. Section 3 contains our analysis and applications of interpolation to eigenvalue estimation. Specifically, Section 3.1 takes a perturbative approach to deriving error bounds and complexity, and applies the results to the problem of estimating Trotter error on a quantum computer. Section 3.2 approaches the problem through complex analysis and the Bernstein ellipse, and applies the analysis to the newly proposed Gaussian phase estimation. In Section 4, we present our main results involving extrapolation of expectation values of time-evolved observables. This case is conceptually distinct from the case of phase estimation, because here the time evolutions are generally long. In Section 5 we validate our claims numerically for small instances of transverse Ising models. Finally, we conclude in Section 6 and discuss future avenues of research.

Setup and Notation
The algorithms presented in this paper are simply an application of standard polynomial interpolation to data obtained from Trotter simulations. These two pieces are, in many respects, disjoint, and the interpolation can be thought of as a form of classical post processing. Thus, we will begin this section with a review of the quantum part, Trotterization, then move to interpolation.

Trotter
Consider a Hamiltonian H on a collection of n qubits, decomposed in a specified way into a sum of m terms,
H = Σ_{j=1}^{m} H_j. (2.1)
Notice that both the Hamiltonian and its decomposition are specified, as the decomposition is not unique and can have implications for algorithmic performance. Questions about optimal decompositions are entirely neglected in this work, as are questions of how one appropriately maps a quantum system of interest (such as a molecule or crystal) onto a set of n qubits. In our results, as a proxy for the true simulation cost, we will quote the number of exponentials U_j = e^{−iH_j t} used to approximate e^{−iHt}. Of course, this only indicates the true cost when each U_j is computationally cheap. In a number of relevant situations, the Hamiltonian is k-local or sparse, and it is possible to identify decompositions such that each U_j has constant cost and m grows polynomially with the number of qubits n. For such Hamiltonians, there exist efficient quantum algorithms to approximate the time evolution operator U(t) = e^{−iHt} to some desired precision ϵ, many of which rely on Trotter formulas.
Standard examples of Trotter formulas include the first-order formula
S_1(t) := ∏_{j=1}^{m} e^{−iH_j t}, (2.2)
the second-order symmetric formula
S_2(t) := ∏_{j=1}^{m} e^{−iH_j t/2} ∏_{j=m}^{1} e^{−iH_j t/2}, (2.3)
and, more generally, the order-2k symmetric ST formula, defined recursively as
S_{2k}(t) := S_{2k−2}(u_k t)^2 S_{2k−2}((1 − 4u_k)t) S_{2k−2}(u_k t)^2 (2.4)
for every k ∈ Z^+ \ {1}, where u_k := (4 − 4^{1/(2k−1)})^{−1} [26]. Though many kinds of Trotter formulas exist, the ST formulas (2.4) will be our primary tool in this work for several reasons. First, they are symmetric under naive¹ time reversal,
S_{2k}(−t) = S_{2k}(t)^†, (2.5)
which will allow us to effectively double the number of interpolation points. Second, the order of the formula 2k can be taken arbitrarily large, and therefore our results will apply to quite a general class. Finally, the lowest-order k = 1 symmetric formula is actually practical, having small constant factors, and therefore our bounds will have something to say about actual simulations. Trotter formulas approximate U(t) only in a neighborhood around t = 0.
Thus, the standard trick is to divide the time interval [0, t] into r subintervals, each sufficiently small that the Trotter approximation is valid, and then string together these simulations. For the simple case of a uniform mesh of r subintervals, this becomes
U(t) = S_{2k}(t/r)^r + O(t^{2k+1}/r^{2k}), (2.6)
where the big O is understood as taking r large. Clearly, we have that lim_{r→∞} S_{2k}(t/r)^r = U(t).
Rather than thinking about the number of steps r tending to infinity, it is simpler for our subsequent analysis to consider s = 1/r as a "dimensionless step size," and instead think about s → 0. In terms of s, we define
Ũ_s(t) := S_{2k}(st)^{1/s} (2.7)
as the approximate evolution operator for s ≠ 0. The discontinuity at s = 0 in (2.7) may be filled by the exact evolution, Ũ_0(t) := U(t). Though we defined s as a reciprocal integer, definition (2.7) together with Ũ_0(t) suggests an extension allowing s to be real-valued. In fact, the resulting function Ũ_s is smooth on a neighborhood of s = 0, a fact that will allow us to precisely characterize the interpolation error. For our purposes, we will only consider |s| ≤ 1. When 1/s is not an integer, we may implement Ũ_s using fractional queries [27] by splitting 1/s into integer and fractional parts.
Here, r = rnd(1/s) ∈ Z is 1/s rounded to the nearest integer, and f := 1/s − r ∈ [−1/2, 1/2] is the fractional remainder. Finally, we note that Ũ_s is an even function of s, which we will make use of to cut the number of interpolation points in half by reflecting across s = 0.
1 Not to be confused with actual time reversal, which is an antilinear operator.
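As a concrete classical illustration of the dimensionless step size s = 1/r, the following minimal Python sketch (our own toy example, not from the paper) simulates the two-term Hamiltonian H = X + Z with the k = 1 symmetric formula S_2 and checks that the error in Ũ_s(t) shrinks roughly as s², i.e., by a factor of about 4 each time s is halved:

```python
import math

# 2x2 complex matrices as nested tuples; enough for a single-qubit toy model
I2 = ((1+0j, 0j), (0j, 1+0j))
X  = ((0j, 1+0j), (1+0j, 0j))
Z  = ((1+0j, 0j), (0j, -1+0j))

def mul(A, B):
    return tuple(tuple(sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)) for i in range(2))

def add(A, B):
    return tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))

def expmi(M, t):
    # exp(-i t M) for a traceless Hermitian 2x2 M, via exp(-i a n.sigma) = cos(a) I - i sin(a) n.sigma
    nrm = math.sqrt(abs(M[0][0])**2 + abs(M[0][1])**2)
    c, sn = math.cos(nrm*t), math.sin(nrm*t)
    return tuple(tuple(c*I2[i][j] - 1j*(sn/nrm)*M[i][j] for j in range(2)) for i in range(2))

def fro_dist(A, B):
    return math.sqrt(sum(abs(A[i][j] - B[i][j])**2 for i in range(2) for j in range(2)))

def S2(t):
    # symmetric second-order (k = 1) Suzuki-Trotter step for H = X + Z
    return mul(expmi(X, t/2), mul(expmi(Z, t), expmi(X, t/2)))

def U_trotter(t, r):
    # \tilde U_s(t) with s = 1/r: r repetitions of S2(t/r)
    out, step = I2, S2(t/r)
    for _ in range(r):
        out = mul(out, step)
    return out

H = add(X, Z)
t = 1.0
U_exact = expmi(H, t)
errs = {r: fro_dist(U_trotter(t, r), U_exact) for r in (4, 8, 16)}
# second-order formula: error ~ s^2 = 1/r^2, so halving s cuts the error ~4x
```

The ratio errs[r]/errs[2r] approaching 4 is the s² scaling that the interpolation scheme will later exploit.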
Prior work has demonstrated the value of considering the effective Trotter Hamiltonian in the analysis of Trotter formulas [28]. This approach also helps us calculate the high-order derivatives of Ũ_s needed for our error bounds. We thus define an effective Hamiltonian H̃_s through
Ũ_s(t) =: e^{−iH̃_s t}. (2.10)
Note that H̃_s depends on t as well, though this dependence will be left implicit. For the purposes of bounding the interpolation error, we require bounds on the derivatives of H̃_s. These are supplied by the following lemma.
Lemma 1. In the notation introduced above, let s be chosen such that the condition below is satisfied. Then the following bound on the derivatives of H̃_s with respect to s holds.
The proof of this lemma is technical, so it is relegated to Appendix A. Note that our bounds are uniformly worse for larger k, suggesting that low-order formulas are all that need to be considered for our approach. Whether this is an artifact of our bound or a real feature of the method is left to be understood.

Polynomial Interpolation
Having introduced ST formulas and stated a needed lemma, we turn our attention to interpolation. Essentially, our goal is to use interpolation to "extrapolate" to the ideal of s = 0.² There are many quantities that we could be interested in extrapolating, including the eigenvalues λ_i(s) of the effective Hamiltonian, and expectation values of time-evolved observables.
These will be the primary focus of the present work. While the interpolation is classical and independent of how the data is generated, for the purposes of subsequent cost analysis we will assume the data was obtained from ST formulas on a quantum computer; in principle, however, our approach should work for any Trotter scheme. We assume all quantum operations are executed perfectly, including the exponentials exp(−iH_j t) used for simulation. Thus, the primary sources of error are the interpolation error and the error in the calculation of the data points (e.g., the Hamiltonian energies or expectation values at various points s_i). Error in the data points may arise from hardware noise, but even in its absence, a measurement protocol such as phase estimation induces a systematic error that cannot be removed. In this work, we only account for the latter, algorithmic errors. This is not to say that the interpolation method could not be applied to noisy quantum systems, but rather that our cost analysis does not account for it.
We now describe the interpolation framework.
For any s ∈ [−a, a], standard results in polynomial interpolation [29] tell us that the signed error is given by
E_n(s) := f(s) − P_{n−1}f(s) = (f^{(n)}(ξ)/n!) ω_n(s)
for some ξ ∈ I_s, where I_s ⊂ [−a, a] is the smallest interval containing s and the interpolation points {s_i}. Throughout this work, superscripts such as in f^{(n)} will refer to nth-order derivatives. The nth-degree nodal polynomial ω_n(s) := ∏_{i=1}^{n}(s − s_i) is the unique monic polynomial with zeros at the interpolation points.
Our estimate for f(0) is P_{n−1}f(0). Since we are interested in s = 0, ω_n(0) becomes a (signed) product of the interpolation points. We can bound the interpolation error E_n(0) in a way that is independent of the precise value of ξ (which is unknown and difficult to find) by maximizing over ξ ∈ I_s:
|E_n(0)| ≤ (max_{ξ∈I_0} |f^{(n)}(ξ)|/n!) ∏_{i=1}^{n} |s_i|. (2.16)
This allows us to specialize our interpolation error in the manner described in the following lemma.
Lemma 2. Let s_i = a cos((2i − 1)π/(2n)), i = 1, . . . , n, (2.17) be the collection of Chebyshev interpolation points on the interval [−a, a]. In the notation above, we have
|E_n(0)| ≤ (2/n!) (πa/(2e))^n max_{ξ∈[−a,a]} |f^{(n)}(ξ)|.
Proof. For n odd, s = 0 is one of the interpolation points, so the error is zero and the bound holds automatically. Hereafter, we only consider n even (which will be the case of practical interest). Using the generic bound (2.16) with the Chebyshev nodes,
|E_n(0)| ≤ (max_{ξ} |f^{(n)}(ξ)|/n!) a^n ∏_{i=1}^{n} |cos((2i − 1)π/(2n))|. (2.19)
To obtain the lemma, we just need to appropriately bound the product of cosines. Since n is even, n = 2m for some m ∈ Z^+. Moreover, we have a reflectional symmetry about m, in the sense that
|cos((2i − 1)π/(2n))| = |cos((2(n − i + 1) − 1)π/(2n))|. (2.20)
Hence, we only need to take the product over i = 1, . . . , m and square it.
To proceed further, let's reindex the remaining product by i → m − i + 1. This gives
∏_{i=1}^{m} cos((2(m − i + 1) − 1)π/(4m)) = ∏_{i=1}^{m} sin((2i − 1)π/(4m)) ≤ ∏_{i=1}^{m} (2i − 1)π/(4m), (2.21)
where we used the fact that sin(x) ≤ x for all x ≥ 0. Factoring out the denominator from the product, the remaining terms become a double factorial:
∏_{i=1}^{m} (2i − 1)π/(4m) = (π/(4m))^m (2m − 1)!!.
The double factorial can be bounded as follows:
(2m − 1)!! = (2m)!/(2^m m!) ≤ √2 (2m/e)^m.
Returning to the original product of equation (2.21), and reintroducing n = 2m, the resulting bound is
∏_{i=1}^{n} |cos((2i − 1)π/(2n))| ≤ 2 (π/(2e))^n.
Reinserting this result into the last line of equation (2.19) gives the bound stated in the lemma.
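The product-of-nodes factor in this argument is easy to check numerically. The Python sketch below computes |ω_n(0)| = ∏_i |s_i| for Chebyshev nodes and compares it against the closed form 2(a/2)^n for even n and against the bound 2(πa/(2e))^n; both comparison values follow our reading of the derivation above and should be treated as such:

```python
import math

def cheb_nodes(n, a=1.0):
    # nth-degree Chebyshev nodes on [-a, a]: s_i = a*cos((2i - 1)*pi/(2n))
    return [a*math.cos((2*i - 1)*math.pi/(2*n)) for i in range(1, n + 1)]

def omega_at_zero(n, a=1.0):
    # |omega_n(0)| = product of |s_i|, the factor controlling the extrapolation error
    prod = 1.0
    for s in cheb_nodes(n, a):
        prod *= abs(s)
    return prod

results = {}
for n in (4, 8, 12):  # even n, so s = 0 is not itself a node
    exact = 2*(0.5)**n                       # closed form 2(a/2)^n with a = 1
    bound = 2*(math.pi/(2*math.e))**n        # the (looser) bound from the derivation
    results[n] = (omega_at_zero(n), exact, bound)
```

Since π/(2e) ≈ 0.578 > 1/2, the bound is slightly loose but captures the exponential decay in n.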
Though Chebyshev interpolation enjoys nice mathematical properties, it presents a challenge for Trotter simulation because of the need for noninteger time steps in equation (2.7). In the face of this, there are several options one could take: restricting to integer time steps, or performing fractional queries using, say, the Quantum Singular Value Transformation (QSVT).
First, consider restricting to integer time steps by gathering data at the reciprocal integer 1/r nearest to each Chebyshev node s. For a symmetric interval [−a, a], this distance goes as O(a²) as a → 0. From here, one could either (a) take the estimate for f(1/r) as the estimate for f(s), accruing some error in the process, or (b) perform the interpolation at the approximate Chebyshev nodes given by the collection of points 1/r_i. Unfortunately, for our purposes, option (a) leads to unacceptable errors of order O(a) in the data, eliminating accuracy gains. As for option (b), it is possible to use robustness results on Chebyshev interpolation [30] to argue that almost-Chebyshev nodes should be almost as well-conditioned. Again, however, we find that our scaling of the number of nodes is such that the node displacements must be quite small, leading again to poor scaling.
Because of this, for most of this work we choose to invoke access to fractional queries using the QSVT [27]. A notable exception is the application to Gaussian phase estimation in Section 3.2.3, where the error in the fractional part can be mitigated. While fractional queries increase the overhead compared to Trotter alone, this overhead is constant.

Stability Analysis and Interpolation
Polynomial interpolation is a valuable numerical tool, but some implementations can lead to numerical instability [31]. However, the situation is not as bad as often presented in textbooks [32]. While linear algebraic approaches involving Vandermonde matrices suffer instability for high degree polynomials [33], methods such as barycentric formulas are provably stable with respect to floating point arithmetic [34].
A particularly important consideration is the choice of interpolation nodes. It is well known that equally spaced nodes can lead to the Runge phenomenon: rapid oscillations near the ends of the interval that grow with polynomial degree [29]. These oscillations can be overcome with a superior choice of nodes, such as the zeros of the Chebyshev polynomials. Interpolants through this set of nodes are guaranteed to converge, as n → ∞, to any function that is Lipschitz continuous. Moreover, they are well-conditioned, in the sense of being robust to small errors in the data values. Finally, because they anti-cluster around s = 0, they are relatively cheap to compute with Trotter formulas. In this work, we will always interpolate at the nth-degree Chebyshev nodes, or approximations thereof, on a symmetric interval [−a, a] about the origin, defined in (2.17). We choose even n so as to avoid the origin (which has infinite cost to compute), and also to utilize the reflectional symmetry of f(s).
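The contrast between equispaced and Chebyshev nodes is easy to reproduce classically. A minimal Python sketch, using the standard Runge example f(x) = 1/(1 + 25x²) (an illustration of ours, not an example from the paper):

```python
import math

def lagrange_eval(xs, ys, x):
    # evaluate the interpolating polynomial through (xs, ys) at x, in Lagrange form
    total = 0.0
    for j in range(len(xs)):
        term = ys[j]
        for k in range(len(xs)):
            if k != j:
                term *= (x - xs[k])/(xs[j] - xs[k])
        total += term
    return total

runge = lambda x: 1.0/(1.0 + 25.0*x*x)
npts = 24
equi = [-1.0 + 2.0*k/(npts - 1) for k in range(npts)]                    # equally spaced
cheb = [math.cos((2*k - 1)*math.pi/(2*npts)) for k in range(1, npts + 1)]  # Chebyshev zeros

grid = [i/200.0 for i in range(-200, 201)]
equi_err = max(abs(lagrange_eval(equi, [runge(x) for x in equi], x) - runge(x)) for x in grid)
cheb_err = max(abs(lagrange_eval(cheb, [runge(x) for x in cheb], x) - runge(x)) for x in grid)
# equispaced nodes oscillate wildly near the endpoints (Runge phenomenon);
# Chebyshev nodes converge
```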
To compute the interpolant P_{n−1}f linear-algebraically, we overcome the limitations of the standard Vandermonde approach by expanding in terms of orthonormal Chebyshev polynomials rather than monomials x^j,
P_{n−1}f(s) = Σ_{j=0}^{n−1} c_j p_j(s). (2.26)
Here, p_j is defined by
p_0 := T_0/√n, p_j := √(2/n) T_j for 1 ≤ j < n, (2.27)
where T_j is the standard jth Chebyshev polynomial,
T_j(x) := cos(j cos^{−1} x). (2.28)
By orthonormality, we are referring to the condition [35]
Σ_{k=1}^{n} p_i(s_k) p_j(s_k) = δ_{ij} (2.29)
for all 0 ≤ i, j < n, with s_k being the zeros of T_n given in (2.17). This immediately implies that the matrix V with entries V_{kj} := p_j(s_k) is orthogonal, and therefore has condition number κ(V) := ∥V∥ ∥V^{−1}∥ equal to one. This is the source of well-conditioning in our approach. The coefficients c = (c_0, c_1, . . . , c_{n−1}) in equation (2.26) satisfy
Vc = y (2.31)
for the vector of values y = (f(s_1), f(s_2), . . . , f(s_n)), since P_{n−1}f is an interpolant. Hence, c = V^T y gives the vector of coefficients. We now develop our argument for well-conditioning, in terms of the weight vector
d(s) := Vp(s), (2.33)
which satisfies P_{n−1}f(s) = d(s)^T y. Next, we bound the terms of the sum ∥d(0)∥₁ = Σ_k |d_k(0)|.
Lemma 3. In the notation above, the weight vector d(0) satisfies ∥d(0)∥₁ < (2/π) log(n + 1) + 1. In the following, H_n denotes the nth harmonic number and ψ(n) is the digamma function; we prove the result on [−1, 1], and the general result follows from a rescaling by 1/a.
We have a reflectional symmetry about k → n − k + 1, allowing us to cut the sum in half and remove the absolute value sign. Here, m ≡ n/2. We observe that the summands increase as k approaches m, due to the first-order pole of tan(x) at x = π/2. We can upper bound tan(x), and therefore the sum above, as follows.
In the last line, we reindexed by j = m − k. This sum can be evaluated exactly in terms of the digamma function ψ at half-integer values.
Using a well-known upper bound on ψ, and finally resubstituting n = 2m, dropping the negative term for an upper bound, and noting that 2γ/π < 1, we arrive at the result stated in the lemma:
∥d(0)∥₁ < (2/π) log(n + 1) + 1. (2.47)
The benefit of well-conditioning comes from relaxing the need for exquisitely precise data to achieve good interpolations. This property is captured by the following theorem.
Theorem 5. Let P_{n−1}f be the Chebyshev interpolant of exact data y, and let P̃_{n−1}f be the interpolant of approximate data ỹ whose components each satisfy |y_i − ỹ_i| ≤ ϵ/((2/π) log(n + 1) + 1), each with probability at least 1 − δ/n. Then |P_{n−1}f(0) − P̃_{n−1}f(0)| ≤ ϵ with probability at least 1 − δ.
Proof. First, observe that P_{n−1}f(s) = p(s)^T c = p(s)^T V^T y by the discussion surrounding (2.31). Hence, the deviation of the interpolant due to the approximate data is |p(s)^T V^T (y − ỹ)|. By Hölder's inequality,
|p(s)^T V^T (y − ỹ)| ≤ ∥Vp(s)∥₁ ∥y − ỹ∥_∞.
From Lemma 3, and from the assumptions on the distance between y and ỹ,
∥Vp(s)∥₁ ∥y − ỹ∥_∞ ≤ ((2/π) log(n + 1) + 1) · ϵ/((2/π) log(n + 1) + 1) = ϵ (2.50)
with probability Pr = (1 − δ/n)^n. In fact, since the probability of each component of ỹ exceeding the specified distance is δ/n, by the union bound the total probability of at least one component exceeding this distance is less than n × (δ/n) = δ. Thus, the inequality is satisfied with probability Pr ≥ 1 − δ. This completes the proof.
Theorem 5 suggests that our interpolation approach may achieve accuracy improvements without increasing costs compared to standard Trotter. It tells us that the error in the Trotter data can be as large as the error of the final estimate, up to a factor which is only logarithmic in the number of interpolation points, and therefore the data ỹ_i can be computed "cheaply enough." Theorem 5 thus plays an important role in the proofs of Appendices B and E, and in the corresponding results presented below.
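Both ingredients of the preceding argument — the orthogonality of V and the logarithmic bound (2.47) on ∥d(0)∥₁ — can be verified numerically. Below is a small Python sketch; the normalization p_0 = T_0/√n, p_j = √(2/n) T_j is our reading of the discrete orthonormality condition:

```python
import math

def cheb_nodes(n):
    # zeros of T_n: s_k = cos((2k - 1)*pi/(2n)), k = 1..n
    return [math.cos((2*k - 1)*math.pi/(2*n)) for k in range(1, n + 1)]

def p_poly(j, x, n):
    # discretely orthonormal Chebyshev polynomials: p_0 = T_0/sqrt(n), p_j = sqrt(2/n)*T_j
    Tj = math.cos(j*math.acos(max(-1.0, min(1.0, x))))
    return Tj/math.sqrt(n) if j == 0 else math.sqrt(2.0/n)*Tj

def gram_defect(n):
    # max deviation of V^T V from the identity, with V_{kj} = p_j(s_k)
    s = cheb_nodes(n)
    V = [[p_poly(j, s[k], n) for j in range(n)] for k in range(n)]
    return max(abs(sum(V[k][i]*V[k][j] for k in range(n)) - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

def weight_l1(n):
    # ||d(0)||_1 with d(0) = V p(0): the conditioning factor for extrapolation to s = 0
    s = cheb_nodes(n)
    p0 = [p_poly(j, 0.0, n) for j in range(n)]
    return sum(abs(sum(p_poly(j, s[k], n)*p0[j] for j in range(n))) for k in range(n))

checks = {n: (gram_defect(n), weight_l1(n), (2/math.pi)*math.log(n + 1) + 1)
          for n in (8, 16, 32)}
```

The logarithmic growth of ∥d(0)∥₁ is what keeps the data-precision requirement mild as the number of nodes increases.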
In the applications to Gaussian phase estimation we consider in Section 3.2.3, we will need a probabilistic statement of well-conditioning that takes into account confidence intervals for the data. This is supplied by the following lemma.

Lemma 6.
Let {Y_j} be normally distributed random variables for the n Chebyshev interpolation points, with central values µ_j = f(s_j) and variances σ_j². Then the random variable R_{n−1}(s) associated with the polynomial interpolant at value s is normally distributed with mean P_{n−1}(s) and variance σ_R² bounded as follows. To bound the variance, observe that
σ_R² = Σ_j d_j(s)² σ_j² = ∥Σ d(s)∥₂²,
where Σ = diag(σ_1, σ_2, . . . , σ_n), and the last inner product was identified as a squared two-norm. We can bound this using the spectral norm: σ_R² ≤ ∥Σ∥² ∥d(s)∥₂² = (max_j σ_j)² ∥d(s)∥₂².
At first, it may seem that the results of this lemma are quite restrictive, only applying to unbiased, normally distributed estimates of the function values f (s i ). However, we will find occasion to apply this lemma in the context of the Gaussian phase estimation considered in Section 3.2.3.
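A quick Monte Carlo sketch in Python illustrates Lemma 6 in this spirit: Gaussian noise on the data propagates through the fixed extrapolation weights d(0), so the extrapolated value is Gaussian with a predictable variance. The test function cos(s) and the noise level are illustrative choices of ours:

```python
import math, random

random.seed(7)
n, sigma = 8, 0.01
s = [math.cos((2*k - 1)*math.pi/(2*n)) for k in range(1, n + 1)]  # Chebyshev nodes

def cardinal_at_zero(k):
    # Lagrange cardinal function l_k evaluated at the extrapolation point s = 0
    w = 1.0
    for m in range(n):
        if m != k:
            w *= (0.0 - s[m])/(s[k] - s[m])
    return w

d0 = [cardinal_at_zero(k) for k in range(n)]
var_pred = sigma**2*sum(w*w for w in d0)        # sum_k d_k(0)^2 sigma_k^2

f = math.cos                                    # smooth stand-in for the Trotter data f(s)
trials = [sum(d0[k]*(f(s[k]) + random.gauss(0.0, sigma)) for k in range(n))
          for _ in range(20000)]
mean = sum(trials)/len(trials)
var = sum((x - mean)**2 for x in trials)/(len(trials) - 1)
# the extrapolated estimator is unbiased around f(0) = 1 with variance ~ var_pred
```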

Approach 1: Error Analysis through Perturbation Theory
Having laid the groundwork for well-conditioned polynomial interpolation, we apply our framework to the task of phase estimation. The idea is to perform logarithmically many phase estimation experiments evaluated at the Chebyshev nodes (2.17). We then bound the error using the interpolation theory of the previous section. The following theorem bounds the performance of such an algorithm.
Theorem 7. Let H(t) = Σ_{j=1}^{m} a_j(t) H_j, where each H_j is a Hermitian matrix and each a_j(t) is real-valued and in C^n for positive integer n. For each t ∈ R, let |ℓ(t)⟩ be an eigenstate of H(t) with eigenvalue λ_ℓ(t), and assume oracular preparation of |ℓ(t)⟩. Further, assume that there exists a minimum gap γ(t) > 0 such that for all ℓ′ ≠ ℓ and t > 0, |λ_ℓ(t) − λ_{ℓ′}(t)| ≥ γ(t). Finally, assume that the evolution time t used for all such experiments satisfies 2k(5/3)^{k−1} m max_{l∈[1,m]} ∥H_l∥ t ≤ π/20. It is then possible to use an n-point polynomial interpolation formula to estimate λ_ℓ(0) within error ϵ and failure probability at most 1/3, using a number of operator exponentials and queries bounded above by the expression derived in Appendix B.
The proof of this theorem is technical and relegated to Appendix B. As a brief sketch, the proof proceeds by using perturbation theory to evaluate the derivatives of the eigenvalues and eigenvectors of the effective Hamiltonian H̃_s, together with multiple applications of the triangle inequality and similar linearizing approximations.

The above result applies generically to any piecewise analytic Hamiltonian H(t).
To give a better intuition for how it could be used, let us focus our attention on a phase estimation protocol. There are multiple phase estimation procedures that can be employed, but the shared basic idea is that one provides a quantum state |ψ⟩ = Σ_j c_j(t) |j(t)⟩ and applies a series of evolutions of the form e^{−iH(t)t} to this state to yield an estimate of one of the eigenvalue phases, exp(−iλ_l(t)t). Each eigenvalue is randomly sampled with probability |c_j(t)|². We will usually demand that the variance of this estimate is bounded above by t² and that the expected error is at most ϵ for such a procedure, but it is also common for the accuracy guarantees to be given in terms of a probability of failure.

Application: Estimation of Trotter Error
One important capability that our method enables is the estimation of the error in ST formulas. This is important because existing error bounds are typically not tight, and leading-order expansions for the error are prohibitively expensive even for short evolutions [18]. Here we provide a way to address this via a method for computing a Frobenius distance ∥U − V∥_F between two unitaries U, V, where ∥A∥_F := √(Tr(A†A)) is the Frobenius norm. More specifically, we fix an ST order k and let Ũ_s be as in equation (2.7). Our method applies to any product formula, but our theoretical analysis restricts us to consider ST formulas. To estimate the Trotter error, we compute
∥Ũ_s(t) − Ũ_1(t)∥_F² (3.4)
using the circuit of Figure 1. Observe that Ũ_1 ≡ S_{2k} is a single Trotter step, while Ũ_0(t) = U(t) is the exact time evolution operator. Therefore, by estimating (3.4) at s = 0, we obtain a metric of the Trotter error. As usual, we will estimate (3.4) by extrapolating to s = 0. A challenge that arises is that the circuit of Figure 1 only gives a normalized Frobenius distance, corresponding to the average square singular value of Ũ_s(t) − Ũ_1(t).
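The extrapolation of (3.4) can be mimicked entirely classically on a toy example. The Python sketch below (toy Hamiltonian H = X + Z, k = 1, and integer step counts in place of fractional queries — all simplifications of ours) interpolates the squared Frobenius distance in the variable u = s², exploiting the evenness of Ũ_s, and extrapolates to u = 0:

```python
import math

I2 = ((1+0j, 0j), (0j, 1+0j))
X  = ((0j, 1+0j), (1+0j, 0j))
Z  = ((1+0j, 0j), (0j, -1+0j))
H  = ((1+0j, 1+0j), (1+0j, -1+0j))             # toy Hamiltonian H = X + Z

def mul(A, B):
    return tuple(tuple(sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)) for i in range(2))

def expmi(M, t):
    # exp(-i t M) for a traceless Hermitian 2x2 M
    nrm = math.sqrt(abs(M[0][0])**2 + abs(M[0][1])**2)
    c, sn = math.cos(nrm*t), math.sin(nrm*t)
    return tuple(tuple(c*I2[i][j] - 1j*(sn/nrm)*M[i][j] for j in range(2)) for i in range(2))

def S2(t):
    # second-order symmetric Trotter step for H = X + Z
    return mul(expmi(X, t/2), mul(expmi(Z, t), expmi(X, t/2)))

def steps(t, r):
    out, step = I2, S2(t/r)
    for _ in range(r):
        out = mul(out, step)
    return out

def fro2(A, B):
    return sum(abs(A[i][j] - B[i][j])**2 for i in range(2) for j in range(2))

t = 1.0
target = fro2(S2(t), expmi(H, t))              # ||S_2(t) - e^{-iHt}||_F^2, the s -> 0 limit
rs = [2, 3, 4, 5]
us = [1.0/(r*r) for r in rs]                   # interpolate in u = s^2 (even function of s)
gs = [fro2(steps(t, r), S2(t)) for r in rs]    # data ||U~_s(t) - U~_1(t)||_F^2 at s = 1/r

def lagrange_at_zero(xs, ys):
    tot = 0.0
    for j in range(len(xs)):
        w = ys[j]
        for m in range(len(xs)):
            if m != j:
                w *= (0.0 - xs[m])/(xs[j] - xs[m])
        tot += w
    return tot

extrap = lagrange_at_zero(us, gs)
# extrapolation recovers the true Trotter error far better than any raw data point
```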

Corollary 8.
Under the assumptions of Theorem 7, there exists a quantum algorithm that can compute the quantity ∥S_{2k}(t) − e^{−iHt}∥_F² for H = Σ_{j=1}^{m} H_j with ϵ error and failure probability (also) ϵ using a number of operator exponentials of the H_j that matches the scaling of Theorem 7 up to constant factors. Proof. Consider the circuit of Figure 1, with U = Ũ_s and V = Ũ_1. Let SELECT refer to the control operations in this LCU-type circuit. We first demonstrate that the probability of measuring the first qubit to be zero is Tr[ρ (Ũ_s − Ũ_1)†(Ũ_s − Ũ_1)]/4. This fact follows from the analysis of the LCU lemma [36]; however, that analysis is typically performed for pure states, so we generalize it here. Let |0⟩⟨0| ⊗ ρ be the input state of the circuit. Then the gate operations proceed as follows.
We then have that the probability of measuring the first qubit to be zero is Tr[ρ (Ũ_s − Ũ_1)†(Ũ_s − Ũ_1)]/4. Taking ρ = 1/2^n in particular yields
P(0) = ∥Ũ_s(t) − Ũ_1(t)∥_F² / (4 · 2^n).
Thus, the probability of measuring 0 on the first qubit gives a normalized Frobenius distance between the two operators. The cost of doing this is O(1) queries to the underlying ST formulas, each of which boils down to O(m5^k) operator exponentials. Using Amplitude Estimation on the target, we can construct an operator W such that the eigenvalues of W within the subspace supporting the initial state encode this probability through a phase, as in equation (3.8). We can then invoke Theorem 7 to show that the number of exponentials needed to learn the extrapolated phase, within error ϵ′ with probability greater than 2/3, satisfies the stated bound. The remaining question is how small ϵ′ must be to guarantee that the error in the squared Frobenius distance is at most ϵ. Let φ̃ denote the estimate of the phase that is returned by our protocol, which has error at most ϵ′. Our estimate of the squared Frobenius distance, D̃², is then given by the phase in equation (3.8).
Thus, we can estimate the mean square singular value of the difference between the two operators within error ϵ, with ϵ and ϵ′ related by a constant; the cost for learning this quantity therefore matches the scaling of Theorem 7 up to constant factors for all k > 0. Further, even with no such promise about the variance, the following weaker bound holds.
Proof. The result follows from the Chebyshev and Markov inequalities.
The above corollaries show that we can use our interpolation procedure to estimate the largest eigenvalue of the error operator. In particular, let the probability of an eigenvalue being greater than the estimate be O(1/2^n). Then the Chebyshev bound (3.14) implies that it suffices to take k ∈ O(√(2^n)). Thus, with high probability, all of the eigenvalues of the square of the error operator will be at most D̃²/2^n + O(√(ξ 2^n)). Thus, if ξ ∈ o(D̃⁴/2^{3n}), then the estimate yielded by this procedure will also estimate the spectral norm.

Approach 2: Error Analysis through Bernstein Ellipses and Analyticity
In this section, we propose an alternative analysis for eigenvalue estimation, as well as a specific approach to phase estimation. We first prepare each effective eigenstate through the procedure in [37], except using a single qubit and the semi-classical QFT, as summarized in Theorem 15. Then we perform a new Gaussian phase estimation (Theorem 16) on said eigenstate and interpolate the resulting estimates. The main result is summarized in Theorem 17.
To analyze the error and complexity of this approach, we use a different formalism that relies on complex analyticity [38], which allows us to estimate the convergence rate of the interpolation in terms of the range of analyticity rather than derivatives of the effective Hamiltonian. This is beneficial since the bounds using derivatives might become unmanageable near level crossings (see Appendix B).

Ancillary Lemmas
To begin, we need to introduce some notions from complex analysis. For each ρ > 1, let B ρ ⊂ C be the Bernstein ellipse, which is an ellipse with foci at ±1 and semimajor axis (ρ + ρ −1 )/2. The following lemma bounds the Chebyshev interpolation error for analytic functions on B ρ .
Lemma 10 ([38]). Let f be analytic on the Bernstein ellipse B_ρ and satisfy |f(z)| ≤ C there. Then the Chebyshev interpolation error satisfies
∥f − P_n f∥_∞ ≤ 4Cρ^{−n}/(ρ − 1)
for each degree n > 0 of the interpolant through the n + 1 Chebyshev nodes.
Lemma 10 shows that the interpolation error shrinks exponentially in n. We would like to apply this lemma to analyze the eigenvalues λ(z) of the effective HamiltonianH z , which is a continuation ofH s to the complex plane. However, we need to understand the domain under which λ(z) is analytic, i.e. free of level-crossings.
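The exponential shrinkage promised by Lemma 10 is straightforward to check numerically. The Python sketch below uses our own example function f(x) = 1/(x − 2), which has a pole at z = 2 and is therefore analytic on B_ρ for any ρ < 2 + √3; with the conservative choice ρ = 3 and C = 3 (the maximum of |f| on B_3, whose semimajor axis is 5/3), the observed error stays below 4Cρ^{−n}/(ρ − 1):

```python
import math

def cheb_nodes(npts):
    return [math.cos((2*k - 1)*math.pi/(2*npts)) for k in range(1, npts + 1)]

def lagrange_eval(xs, ys, x):
    tot = 0.0
    for j in range(len(xs)):
        w = ys[j]
        for m in range(len(xs)):
            if m != j:
                w *= (x - xs[m])/(xs[j] - xs[m])
        tot += w
    return tot

f = lambda x: 1.0/(x - 2.0)   # pole at z = 2, outside [-1, 1]
rho, C = 3.0, 3.0             # any rho below rho* = 2 + sqrt(3); C bounds |f| on B_rho
grid = [i/200.0 for i in range(-200, 201)]

errs = {}
for npts in (6, 10, 14):
    xs = cheb_nodes(npts)
    ys = [f(x) for x in xs]
    err = max(abs(lagrange_eval(xs, ys, x) - f(x)) for x in grid)
    d = npts - 1              # degree of the interpolant through npts nodes
    errs[npts] = (err, 4*C*rho**(-d)/(rho - 1))
```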
To characterize this domain, we first utilize a result from Bauer and Fike [39] which bounds the shift in the eigenvalues under a shift in the operator.
Lemma 11 (Bauer–Fike [39]). Let A be a Hermitian matrix, and let λ̃ be an eigenvalue of the perturbed matrix A + E. Then
|λ̃ − λ_k| ≤ ∥E∥
for at least one eigenvalue λ_k of A, where ∥·∥ is the spectral norm.
In order to determine the radius of analyticity, we assume a specific form for the upper bound on ∥H − H̃_z∥, where H̃_z is the analytic continuation of H̃_s to the complex plane. The result is the following lemma.
Lemma 12. Let λ(τ) be an eigenvalue of H̃_z, where τ = zt for z ∈ C and t ∈ R^+, and let γ_0 be a lower bound on the spectral gap of H̃_0 = H. Let α, β ∈ R^+ be constants such that the assumed bound on ∥H − H̃_z∥ holds. Then λ(τ) is analytic on any origin-centered disc of radius r provided that r ≤ r_max, where r_max is expressed in terms of W_0, the principal branch of the Lambert W function.
The above lemma gives an upper bound r_max on the radius of analyticity. From this, assuming r_max > 1, we can calculate the largest allowed B_ρ by equating the semimajor axis with the maximum radius,
(ρ + ρ^{−1})/2 = r_max,
which has solution
ρ = r_max + √(r_max² − 1).
The following lemmas give a clue as to what α and β should look like in terms of commutators between the terms of H, and also with respect to its overall spectral norm. First, however, we introduce a bound on the error of the effective total Hamiltonian evolution in terms of the "instantaneous" Hamiltonian error E, which is defined through the complex-time Schrödinger equation. Let S_p(τ) be a pth-order product formula approximating exp(iHτ). For this purpose, we also introduce the "accumulated" error Ω(τ) := log S_p(τ), where log S_p(τ) can be defined through the complex-time Magnus expansion (see [40] for a compilation of proofs; the complex-time extension is straightforward), integrated along a path P going from τ₁ = 0 to τ₁ = τ. This, like the real-time Magnus expansion, converges provided ∥Ω∥ ≤ π. With this, we obtain an upper bound on ∥E(τ)∥, stated through the following lemma:

Lemma 13. Given the error definition above, for complex τ we can bound its norm as follows.
We bound the norm of E(τ) in Appendix D.3, making use of the quantity
α_comm(A_s, . . . , A_1, B) := Σ_{q_1+···+q_s = p} (p; q_1, · · ·, q_s) ∥ad_{A_s}^{q_s} · · · ad_{A_1}^{q_1}(B)∥,
where (p; q_1, · · ·, q_s) denotes the multinomial coefficient.
While the above result is a useful theoretical bound, a problem arises because bounding the commutators is often computationally impractical. In such cases, α_comm can be upper bounded through the triangle inequality and submultiplicativity of the norm.
With this set of lemmas at our disposal, we can now provide a cost estimate for preparing all the ground states of H̃_{s_k}.
Figure 2: Circuit to implement a Gaussian m-qubit phase estimation algorithm with (m − q) qubits for spectral interpolation. The Ũ′_{s_k} operator is evolved by an effective Hamiltonian H̃_{s_k}, with a Trotter step size ts_k (see Theorem 15) and corresponding evolution time.

State Preparation Cost
Theorem 15 (State preparation cost). Let H = Σ_{j=1}^{m} H_j satisfy Σ_{j=1}^{m} ∥H_j∥ ≤ 1, and let γ ∈ R^+ be a lower bound on the spectral gap of H. Let s_1, . . . , s_n be the nth-degree Chebyshev nodes on [−1, 1] for even n. Then there exists an algorithm to prepare the ground state of H̃_{s_k} to precision ϵ using controlled queries of Ũ′_{s_k} = (S_p(ts_k))^{e′_k}, where t ∈ Θ(γ^{1/p}) and e′_k = sgn(s_k)⌈s_1/|s_k|⌉, with the following total gate count and a single auxiliary qubit.
Now that the cost of state preparation is covered, we assume exact state preparation for our next gate-count upper bound, which concerns eigenvalue estimation. First, however, we introduce a new phase estimation algorithm as one of our methods for eigenvalue estimation.

Interlude: Gaussian Quantum Phase Estimation
The first step in the phase estimation protocol involves preparing a Gaussian distribution in the ancillary register of m qubits. Let p(w; σ) be the probability density function for a Gaussian distribution with zero mean (µ = 0) and variance σ 2 .
where T > 0 plays the role of a sampling rate, and N (σ, T, m) > 0 is the normalization constant for the discrete, truncated Gaussian.
An exact preparation of |p_G⟩ can be achieved through the methods proposed in Ref. [41]. However, within a target error, we can prepare a coarser Gaussian distribution and then perform an upsampling through QFT, zero-padding, and then QFT⁻¹. This is an established method for upsampling/interpolation of discrete signals in classical digital signal processing (DSP), and was introduced for quantum distribution preparation in Ref. [42]. In Figure 2 we illustrate the circuit used for this method. The operator Z plays the role of the zero-padding ubiquitous in classical DSP for interpolation in the conjugate space. The errors introduced by truncation and the finite sampling rate in the time domain are estimated in Appendix C, and can be summarized with the following theorem.
Theorem 16. Let p̂ denote the Fourier transform of the (square root of the) Gaussian p. Then the spectral-norm error of the prepared wave function in the Fourier domain is

Using these error bounds on the approximate state constructed in the Fourier domain, we can perform a sequence of unitaries controlled on the Gaussian window. This approach is similar in spirit to the Kaiser-window approach taken in [43]. However, here we use a Gaussian window which, by swapping the conjugate-space labels in Theorem 16, we know we can prepare efficiently with cost O(polylog(1/ε)/σ), where ε is the vector error on the window state.
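The QFT → zero-pad → QFT⁻¹ upsampling described above has a direct classical analogue, sketched below with numpy's FFT; the grid size, Gaussian width, and upsampling factor are illustrative choices, not parameters from the text:

```python
import numpy as np

def upsample_via_fft(x: np.ndarray, factor: int) -> np.ndarray:
    """Band-limited upsampling: FFT, zero-pad the spectrum symmetrically,
    inverse FFT. Classical-DSP analogue of the QFT -> zero-pad -> QFT^{-1}
    circuit described in the text."""
    n = len(x)
    X = np.fft.fftshift(np.fft.fft(x))
    pad = (factor * n - n) // 2
    Xp = np.pad(X, (pad, pad))            # zero-pad in the conjugate space
    return factor * np.fft.ifft(np.fft.ifftshift(Xp))

# coarse samples of a centered Gaussian
n, sigma = 16, 2.0
t = np.arange(n) - n // 2
coarse = np.exp(-t**2 / (2 * sigma**2))
fine = upsample_via_fft(coarse, 4)
# the original samples reappear on the fine grid
assert np.allclose(fine[::4].real, coarse, atol=1e-6)
```

For even-length inputs a more careful implementation splits the Nyquist bin; for the rapidly decaying Gaussian spectrum used here the effect is negligible.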

Phase Estimation Cost
Via the above theorem, we can now bound the cost of extrapolating eigenvalues obtained using Gaussian phase estimation. Through Theorem 5 and Lemma 6, we estimate the effect that uncertainty propagation through the interpolation has on the final interpolant, for both Gaussian and one-qubit phase estimation. The result goes as follows:

Theorem 17 (Phase estimation cost). The gate count of estimating the eigenvalues of H̃_{s_k} using Gaussian phase estimation is as stated, where σ_P is the standard deviation of the interpolated observable at s = 0. The number of auxiliary qubits required is O(log(1/σ_P)). Alternatively, a single-ancillary-qubit approach may be used, where ϵ > 0 is such that max_s |λ(s) − P_{n−1}λ̃(s)| ≤ ϵ, and P_{n−1}λ̃ is the (unique) (n−1)-degree polynomial interpolating the data λ̃_i at the n interpolation points s_i.
The bound for C_est,gpe can also be cast in terms of a confidence interval ϵ = wσ_P around the mean, which introduces an error rate that decreases super-exponentially with w.
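The super-exponential decrease of the error rate with the interval width w can be illustrated with the standard Gaussian tail bound; the erfc expression below is the textbook two-sided tail, used here as a sketch rather than a quantity derived in the text:

```python
import math

def gaussian_failure_rate(w: float) -> float:
    """Two-sided tail probability that a Gaussian estimate deviates from
    its mean by more than w standard deviations: erfc(w / sqrt(2))."""
    return math.erfc(w / math.sqrt(2.0))

# the failure rate drops super-exponentially with the interval width w
rates = [gaussian_failure_rate(w) for w in (1, 2, 4, 6)]
assert all(a > b for a, b in zip(rates, rates[1:]))
assert gaussian_failure_rate(6) < 1e-8
```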

Interpolation for Expectation Values
We now consider the application of Chebyshev interpolation to the estimation of expectation values, a fundamental task in quantum computation. The setting is as follows: given a quantum state ρ and an observable O, the expectation value is ⟨O⟩ = Tr(ρO). We evolve our system according to a 2k-th order ST formula Ũ_s given by (2.7). The time-evolved expectation value of interest is captured by the function below, where O_s(t) is given by equation (2.13). We normalize the expectation values by ∥O∥ because the relative error is a useful and natural metric, and also because the normalized operators may be block encoded for amplitude estimation. Alternatively, one may simply restrict attention to normalized observables with ∥O∥ = 1. The interpolation algorithm we propose can be summarized as follows.
1. Given Hamiltonian H, simulation time t, and tolerance ϵ for the estimate of ⟨O(t)⟩/ ∥O∥, choose the appropriate interpolation interval [−a, a] and an even number n of Chebyshev nodes. We neglect the cost of this step. The error analysis we will perform subsequently will inform the choices of a and n.
2. Compute estimates ỹ_i of the expectation values ⟨O_{s_i}(t)⟩ for each s_i with i = 1, …, n/2, to an accuracy depending on ϵ and n. We assume this step is done with Iterative Quantum Amplitude Estimation (IQAE) [44], a recent approach to amplitude estimation that exhibits low quantum overhead. Our metric of cost is the number of H_j exponentials executed on a quantum circuit, where H = Σ_j H_j. Note that by symmetry, we need not compute ỹ_i at the remaining (negative) nodes.
3. Perform the polynomial fit P̃_{n−1}f through the points (s_i, ỹ_i) using a Chebyshev expansion (2.26). Note that P̃_{n−1}f will automatically be even. This fit is well-conditioned, and we neglect the cost of this step.
To summarize, one performs amplitude estimation to acquire the time-evolved expectation value at each Chebyshev node, then performs a polynomial interpolation of the data. The estimate is the value of the interpolant at s = 0.
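The classical post-processing in the steps above can be sketched as follows, with a toy even function standing in for the Trotter data (the function, node count, and tolerance are illustrative assumptions):

```python
import numpy as np

def chebyshev_nodes(n: int, a: float = 1.0) -> np.ndarray:
    """n Chebyshev nodes on [-a, a]."""
    k = np.arange(1, n + 1)
    return a * np.cos((2 * k - 1) * np.pi / (2 * n))

def extrapolate_to_zero(nodes: np.ndarray, data: np.ndarray) -> float:
    """Fit the unique degree-(n-1) polynomial through (s_i, y_i) in the
    Chebyshev basis (well conditioned) and evaluate it at s = 0."""
    coeffs = np.polynomial.chebyshev.chebfit(nodes, data, len(nodes) - 1)
    return float(np.polynomial.chebyshev.chebval(0.0, coeffs))

# toy even "Trotter data": f(s) = f(0) + O(s^2), here f(s) = cos(s)
s = chebyshev_nodes(12)
estimate = extrapolate_to_zero(s, np.cos(s))
assert abs(estimate - 1.0) < 1e-10   # interpolant at s = 0 recovers f(0)
```

In the actual algorithm the data points ỹ_i come from amplitude estimation and carry statistical error, which the well-conditioned Chebyshev fit only amplifies logarithmically in n.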
Given an even number n of Chebyshev nodes {s_1, …, s_n}, and making use of Lemma 2, the interpolation error E_{n−1} assuming perfect data points follows. With a suitable bound on ∂_s^n O(t), we can then provide an upper bound on the interpolation error at s = 0. This bound is provided by the following lemma.
The proof of this lemma is an exercise in repeated use of the triangle inequality and the combinatorics of large derivatives, and is left to Appendix E. Note that once the derivative bound holds, the interpolation error bound follows immediately from Lemma 2.
One motivation for these bounds is to derive asymptotic expressions for the algorithmic complexity. The following theorem gives an asymptotic query complexity for the number N_exp of Trotter exponentials exp(−iH_jτ). This is not the total cost, however, since these circuits must be repeated to perform the appropriate measurement protocols. Since O can be block encoded, the expectation value can be obtained via an amplitude estimation protocol. By the well-conditioning of Lemma 5, each data point needs to be within ϵ of the exact Trotter value, up to a logarithmic factor in n. This robustness is why our result maintains an O(ϵ⁻¹) scaling. In our proof, we assume the IQAE protocol is used, requiring only a single qubit of overhead. The fractional queries for the noninteger time step also require O(1) overhead, so the total overhead is O(L) due to the block encoding. To relate n and a to the required precision ϵ, simulation time t, and Hamiltonian H, Lemma 18 can be used; thus we can relate N_exp to these basic parameters. We carry out the formal proof in Appendix E.
As advertised, we see a "near-Heisenberg" scaling of 1/ϵ, up to logarithmic factors. However, there is an unsavory quadratic scaling in the simulation time in cases without high accuracy demands. We believe this can be improved: our approach forces n to scale linearly in T, which we believe is overly pessimistic, and a log(∥H∥T) scaling is more likely. Finally, our results suggest the best performance when using low-order formulas, since our bounds are strictly worse for increasing ST order k.

Numerical Experiments
In the previous sections, we presented theorems, with proofs in the appendices, showing that interpolation of Trotter data can lead to higher accuracies for eigenvalues and expectation values than Trotter alone. This section provides numerical evidence backing these analytic claims, showing not only that improved scaling is possible, but also that high-order Trotter formulas need not provide better error scaling when used in conjunction with interpolation. Specifically, we demonstrate this improved scaling for phase estimation using interpolation with second-order and fourth-order ST formulas. For our demonstration we consider the transverse-field Ising model of two spins.
This is a minimal example with just two non-commuting terms, yielding a nonzero Trotter error. Our aim is to estimate the ground state energy of the effective Hamiltonian H̃_s in the limit s → 0, and to seek numerical evidence of the improved performance relative to low-order Trotter formulas that Theorems 15 and 17 suggest. In order to avoid aliasing with our Fourier spectral methods, we must satisfy a bound in which the padding ∆_pad suppresses the probability leakage wrapping around the boundary. For simplicity, we define a dimensionless effective Hamiltonian

In this way, the spectrum of h̃_{s_k} lies within the domain [−1/2 + ∆_pad/2, 1/2 − ∆_pad/2], in line with the conventions of the Fourier spectral methods in [37]. With this, we now define the normalized Hamiltonian, where t is sufficiently small to fulfill Equation (5.5). We must also make sure that s is chosen according to the constraints of Section 3.2, such that there are no level crossings throughout the interpolation interval. We now describe the results. Figure 3 displays the results of Chebyshev interpolation with second-order Trotter formulas using various (even) numbers of points. For each H′_{s_k}, we numerically simulate the ground state preparation protocol to a state error ϵ_state; this is efficient because the algorithm scales as O(log 1/ϵ_state). We then simulate the Gaussian phase estimation on these states. Moreover, we have used zero-padding, as described in Section 3.2.3, to upsample the spectrum and thereby ignore digitization errors. The spectral upsampling through zero-padding is efficient, since its cost scales as O(q²), where q is the total number of qubits after padding. The blue bands in these plots show an uncertainty equivalent to 0.01 × σ_P(s): we have down-scaled the statistical uncertainties by a factor of 100 so that changes in the interpolation error remain noticeable as the number of nodes varies.
Next, in Figure 4, we show the exact systematic error from our simulations as well as the upper bounds of the Bernstein ellipse analysis of Lemmas 14, 12, and 10. We see from this data that interpolation indeed improves the quality of the estimate of the energy as anticipated, even when relatively low-precision estimates are used at s = 1. As discussed previously, only positive values of s k are computed due to reflection symmetry.
Finally, in Figure 5 we perform ground state energy estimation using second- and fourth-order formulas without interpolation, in order to compare the costs when plotted against the error. From the plots, we observe that beyond n = 2 the extrapolation methods already outperform second-order product formulas alone. Specifically, we see clear indications from this data that the bias in the interpolated error (for large 1/ϵ) scales logarithmically for these Gaussian phase estimation experiments. This agrees with our expectations from Theorem 17, wherein the systematic errors from interpolating phase estimation experiments are expected to scale as polylog(1/ϵ). In contrast, the bias from a fixed experiment is expected to be O(1/ϵ^{3/2}) and O(1/ϵ^{5/4}), respectively.
It is worth noting that this does not violate the Heisenberg limit, since here we are only showing the interpolation error, with data computed to machine precision. The exponential reduction in the uncertainty with the number of interpolation points is independently shown in Figure 4. Note that for this data we do not see an improvement from transitioning to higher-order formulas, which is consistent with our prior expectations. These results show that extrapolation tends to outperform transitioning to a higher-order product formula, and that the latter does not necessarily lead to improved scaling for the problem of phase estimation.
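A classical emulation of this kind of experiment can be sketched as follows; the couplings and base step size are illustrative choices, not the exact parameters used in the figures, and the effective Hamiltonian is obtained from the matrix logarithm of the second-order formula:

```python
import numpy as np
from scipy.linalg import expm, logm

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# two-spin transverse-field Ising model; couplings are illustrative choices
A = -np.kron(Z, Z)                             # ZZ coupling
B = -0.8 * (np.kron(X, I2) + np.kron(I2, X))   # transverse field
H = A + B
t = 0.1                                        # illustrative base step size

def ground_energy_trotter(s):
    """Ground energy of H_s = (i/(s t)) log S_2(s t), where
    S_2(tau) = e^{-iA tau/2} e^{-iB tau} e^{-iA tau/2}."""
    tau = s * t
    S2 = expm(-0.5j * tau * A) @ expm(-1j * tau * B) @ expm(-0.5j * tau * A)
    H_eff = 1j * logm(S2) / tau
    H_eff = 0.5 * (H_eff + H_eff.conj().T)     # remove numerical non-Hermitian noise
    return float(np.min(np.linalg.eigvalsh(H_eff)))

exact = float(np.min(np.linalg.eigvalsh(H)))
k = np.arange(1, 9)
nodes = np.cos((2 * k - 1) * np.pi / 16)       # 8 Chebyshev nodes on [-1, 1]
data = [ground_energy_trotter(s) for s in nodes]
coeffs = np.polynomial.chebyshev.chebfit(nodes, data, 7)
interp = float(np.polynomial.chebyshev.chebval(0.0, coeffs))

# the s -> 0 extrapolation beats the largest-step Trotter estimate
assert abs(interp - exact) < abs(ground_energy_trotter(1.0) - exact)
```

Here the exact diagonalization stands in for the quantum phase estimation data; on hardware each λ(s_k) would instead be estimated to finite precision.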

Conclusion
In this work, we present a framework for achieving improved accuracy of Trotter simulations using polynomial interpolation. We apply our framework to the problems of eigenvalue and expectation value estimation, and derive query complexities assuming the Trotter data is computed quantumly. We find that our methods achieve a Heisenberg scaling of Õ(1/ϵ) up to polylogarithmic factors. This is an improvement relative to a single Trotter calculation, which would yield Õ(1/ϵ^{1+1/p}) for order-p formulas. For the low-order formulas used in practice, where p is not so large, this improvement may be significant.

Figure 4: On the left, we used a second-order formula and on the right a fourth-order one. As we can see from this log-log plot, the convergence is exponential as expected.

Figure 5: The gate cost from using second- (left) and fourth-order (right) formulas for ground state estimation, plotted against the systematic error on the eigenvalue estimation. In blue, the cost of using single long-depth circuits with product formulas alone. In orange, the gate cost of using multiple shallower circuits together with the extrapolation methods presented in this work.
An interesting feature of polynomial interpolation in this context is that it provides a uniform approximation of the function f (s) on the interpolation interval, thereby giving a characterization of the Trotter error for various step sizes. In contrast, similar techniques such as Richardson extrapolation only attempt to estimate f (0). Whether this global characterization of the Trotter error can be usefully employed is a question we leave open.
There are several areas in which improvements in our analysis can likely be made. First, in most of our claims, we resort to fractional queries in order to produce step sizes at the values of the Chebyshev nodes. We suspect this machinery is not needed in practice, and that improved analysis, coupled with mathematical results on the robustness of the Lebesgue constant to node perturbations, will show that interpolating at nearby points is sufficient. Second, the quadratic dependence on simulation time that arises for expectation value interpolation in certain regimes is unsavory. The root cause of this in the analysis appears to be a linear relation between the simulation time and the number of interpolation nodes, which seems overly cautious. Numerical studies beyond our own, as well as implementations on quantum hardware, may reveal a rosier picture than our naive bounds suggest. Finally, our results do not show commutator scaling, though the method itself almost certainly does. For Hamiltonians with almost commuting terms, this would manifest in the data being nearly flat, resulting in an approximately straight line fit to the correct answer. Our difficulty in showing this expected relationship arises from the messy expressions for high-order derivatives of the effective Hamiltonian, but this should not be a fundamental issue.
Our work shows that polynomial interpolation can be a powerful supplement to quantum subroutines using low-order Trotter simulations. Given that our results ameliorate the worst aspects of low order formulas, we believe that investigation of highly efficient low-order formulas for simulation is likely to be an increasingly important topic. Improvements that arise from the superior constant factors that low order formulas can provide may lead to a new generation of more efficient algorithms for simulating chemistry and physics using quantum computers.

Acknowledgements
We thank Christoph Lehner, Taku Izubuchi and Yuta Kikuchi for the insightful discussions. During the inception of this work G. R. was supported through Brookhaven National Laboratory.

A Proof of Lemma 1

for s ∈ ℝ \ {0}, with H̃_0 := lim_{s→0} H̃_s = H. We will understand log S_2k(st) through a power series expansion about the identity.
This series converges precisely when the stated norm condition holds. Using the fundamental theorem of calculus, we can derive a suitable condition for convergence as a neighborhood about s = 0. The condition above implies a bound in which H_{j_l} is some Hamiltonian piece H_j indexed by l, and the derivative can be upper bounded as shown, where τ = (τ_l)_{l=1}^{N_k} is the vector of ST coefficients, and in going to the second line we used a Hölder inequality. We have ∥τ∥_1 ≤ N_k max_l |τ_l|, and from Appendix A of [45] we have a bound on max_l |τ_l|. Thus, the requirement for convergence of the logarithm becomes the stated condition, where we used the expression N_k = (2m)5^{k−1} for the number of ST exponentials.
We now assume s lies within the symmetric interval defined by (A.8), so that (A.2) is convergent. Since log S_2k(0) = 0, s = 0 is a zero of order at least one. We want to absorb the diverging 1/s term and better understand the leading dependence on s. To facilitate this, we write H̃_s in terms of the quantity ∆S_2k defined below. Note that ∆S_2k is analytic in s and is a finite difference around s = 0. Through the series expansion (A.9) we may bound derivatives of H̃_s via bounds on derivatives of ∆S_2k. We first obtain a power series for ∆S_2k by Taylor expanding every term in the product formula S_2k. Regrouping in powers of st, the result involves the multinomial coefficient, with the sum over all values of J = (j_1, …, j_{N_k}) such that Σ_k j_k = j. The derivatives with respect to s are now easy to compute. Using the fact that ∂_s^n s^p = (p!/(p−n)!) s^{p−n} for p ≥ n (and zero otherwise), we obtain a bound in terms of Λ := max_j ∥H_j∥ and τ_max := max_l |τ_l|. Factoring out powers of n + 1 and reindexing, we are left with the following bound on the derivatives of ∆S_2k.
This expression is quite elegant; it is as if we were taking n + 1 derivatives of the exponential e^{cs}, with c as defined below. Factors of c will occur frequently in what follows, so we find it convenient to adopt this symbol as shorthand. We now return to bounding the derivatives of powers of ∆S_2k(st) as in equation (A.9).
We reduce this to the previous case by performing a multinomial expansion, where we have used the sum property of the n_l where appropriate. The remaining sum over the multinomial coefficient is given by (j + 1)^n, and the bound follows. Notice that, when j = 0, this is consistent with equation (A.15).
With result (A.20) in hand, we return to the power series (A.9). Differentiating term by term and performing a binomial expansion for each term allows us to apply our previous results. It will be helpful to consider two cases separately, j ≤ n and j > n; these regimes are qualitatively different, since the derivatives of s^j may or may not vanish depending on the number of derivatives. Focusing on the case j ≤ n, note that the sum runs only to j, not n. Taking a triangle-inequality upper bound using (A.20), we may upper bound (A.23), where we have factored out terms not involving q from the sum and manipulated the factorials for reasons that will be seen presently. Taking the upper bound n!/(n−q)! < n^q, and factoring out n−j powers of (j+1)c, we may upper bound the expression further. Thus, with some minor polishing, we may express the bound on (A.22) for j ≤ n. Now let us move on to the j > n case. Here, there are not enough derivatives to kill off the s^j term, so the binomial sum in (A.23) runs from q = 0 to n.
Similar to before, we use the bound (A.20) to obtain (A.28), where B_l and B_h refer to bounds on the "low" and "high" parts of the series, employing the bounds from equations (A.26) and (A.30). Let us begin by simplifying the bound on B_l. At this point we assume that s is sufficiently small that cs < 1; this will necessarily factor into the cost later. The remaining sum can be bounded by (n + 1)^n, where the convention 0⁰ = 1 handles the edge case. Let us turn our attention to B_h. We start by reindexing so that the series begins at j = 0 in (A.33). To be concrete, let us take e·cs·e^{cs} < 1/2, which is implied by cs < π/20. Coupled with the inequality in (A.16), this condition can be met provided the condition in the statement of Lemma 1 holds, which implies the rest of the conditions required for this proof to carry through. This allows us to upper bound B_h further as B_h ≤ 2e^{π/20}(cs)^{n+1}(3e^{π/20}/2)^n (n+2)^{n−1} ≤ 4(cs)^{n+1}(9/5)^n(n+2)^{n−1}. (A.42) Since (n+2)^{n−1} ≤ e² n^n/2 (using 0⁰ := 1 for the edge case n = 0), we have, altogether, using s ≤ 1, ∥∂_s^n H̃_s∥ t ≤ n^n (e²c)^{n+1}(1 + 2(9/(5e²))^n) ≤ 2n^n (e²c)^{n+1}. The final result then follows from substituting for c and noting that the duration of each time step is at most 2k/3^{k−1} using the results of [17].

B Proof of Theorem 7
In order to use the result of Lemma 1 in the proof of Theorem 7, we need to reexpress the derivatives of the eigenvalues in terms of the effective HamiltonianH t (we use t rather than s due to the extrapolation parameter taking the role of time). The following lemma provides such a bound on the derivatives under the assumption that the Hamiltonian is gapped.
Lemma 20. Let H̃_t : ℝ → ℂ^{2^n × 2^n} be a Hamiltonian on n qubits with p continuous derivatives. Suppose H̃_t has a minimum eigenvalue gap, where λ_j(t) and λ_k(t) are eigenvalues of H̃_t. Define Λ_γ as below, which has dimensions of 1/t. Then the stated bound holds for all eigenvalues λ_k(t).

Proof. The essence of this proof is to examine the derivatives of λ_k(t) by expressing them in terms of derivatives of H̃_t via perturbation theory; the bound of Lemma 1 on H̃_t then lets us bound the eigenvalue derivatives in turn. Let |k(t)⟩ be the eigenvector for eigenvalue λ_k(t), and assume that the eigenvectors' phases are chosen such that ⟨k(t)|∂_t k(t)⟩ = 0 for all t; such a choice is always possible. Standard results from perturbation theory [46] give expressions for ∂_t λ_k(t) and ∂_t |k(t)⟩. Taking the Euclidean norm of the second of these, where γ(t) is the minimum eigenvalue gap of the effective Hamiltonian at time t, yields a bound on ∥∂_t |k(t)⟩∥. We now consider second- and higher-order derivatives of λ_k(t). By repeatedly taking derivatives of λ_k(t) in the first line of (B.1), we obtain an expression of the general form below.
Note that there is a pattern to this series: if there are m factors H^{(p)} (for p ≥ 0) present in the expansion, there must be m − 1 powers of the gaps present in the terms. Before going into more detail, let us define the "degree" of a term to be the number of differentiable factors (eigenvectors, Hamiltonians, or inverse gaps) present in a product. For example, ⟨x(t)| ∂_t H̃_t |x(t)⟩ has degree 3.
Let A be a term of degree deg(A). It follows from (B.1), since by the product rule the derivative of a product of factors is the sum over the factors of the product with that factor differentiated, that if |λ_k⟩ or (λ_j − λ_k)⁻¹ is differentiated, the degree increases by at most 4. While differentiating the effective Hamiltonian does not increase the degree, the increase is still at most 4 in that case, so in the worst-case scenario the degree increases by 4. As deg(∂_t λ_k(t)) = 3 ≤ 4, it is straightforward to see that deg(∂_t^q λ_k(t)) ≤ 4q. (B.5) Next, let us assume that B is a sum of N(B) products of the above factors. We then have from (B.1) a recursion for the number of terms. We now show that the number of terms present in ∂_t^q λ_k(t) obeys the stated bound, which we prove by induction. The number of terms present when q = 1 is 1, which demonstrates the base case. Assume that the induction hypothesis holds for some value q′. Then from (B.5) and (B.6) the bound holds for q′ + 1, which proves the relation by induction.
We then see that if any factor is differentiated ν times, its norm is bounded above by Λ_γ^{ν+1}. Since the maximum value of the derivative is the number of terms multiplied by the maximum norm of the product of all the factors, the stated bound follows.

Proof of Theorem 7. First, from the Baker–Campbell–Hausdorff formula, log(S_2k(t))/t ∈ O(1) as t → 0. Thus an eigenvalue E_eff(t) exists for H̃_t for all t > 0, which demonstrates the first claim. Since H̃_t has p continuous derivatives, ∂_t^p λ_k(t) exists and is bounded above as in Lemma 20. The value of Λ_γ can then be bounded above using Lemma 1, under the assumption that 2k(5/3)^{k−1} m max_{l∈[1,m]} ∥H_l∥ t ≤ π/20 for all t. For t ∈ [−a, a], we have from Lemma 2 a bound for an (n − 1)-point interpolation formula (B.14). Given that we wish to have the interpolation error at most ϵ_int, we can solve for the value of a such that t ∈ [−a, a] ensures the error bound. The number of exponentials needed in the degree-n polynomial interpolation then follows, using the fact that the error at −t equals the error at t, that the jth Chebyshev node location scales like a/j, and that the cost of phase estimation within error ϵ_PE with probability of failure at most δ_PE scales accordingly. There are two competing tendencies in the cost: the number of operator exponentials increases polynomially with n, but the scaling with the error tolerance improves exponentially with n.
Setting these two contributions equal to each other to estimate the optimal scaling yields a transcendental relation. Under the assumption that ϵ_int ≤ 5/3, the solution takes the stated form. Finally, we have from Theorem 5 that the error in the extrapolated energy, as a function of the number of points used for the measurement in an n-point formula, should be at most ϵ, which is implied by the stated condition. Thus the total number of exponentials needed to estimate each Trotter data point within error ϵ_PE, and with probability of failure δ_PE = 1/(3n), follows. By the union bound, the total probability of failure is at most 1/3. By Theorem 5, our estimate P_{n−1}λ̃(0) will be within ϵ of the true value.
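The first-derivative ingredient of the preceding proof, the Hellmann–Feynman relation ∂_t λ_k = ⟨k(t)|∂_t H̃_t|k(t)⟩ together with the bound |∂_t λ_k| ≤ ∥∂_t H̃_t∥, can be checked numerically; the random Hermitian family below is purely illustrative and is not the effective Hamiltonian of the text:

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A, B = A + A.conj().T, B + B.conj().T   # Hermitian generators

def H(t):
    """Smooth Hermitian family H(t) = A + t B, so dH/dt = B."""
    return A + t * B

def ground_energy(t):
    return float(np.linalg.eigvalsh(H(t))[0])

t0, h = 0.3, 1e-6
w, v = np.linalg.eigh(H(t0))
k = v[:, 0]                                  # ground eigenvector at t0
hf = float(np.real(k.conj() @ (B @ k)))      # Hellmann-Feynman derivative
fd = (ground_energy(t0 + h) - ground_energy(t0 - h)) / (2 * h)
assert abs(hf - fd) < 1e-4                   # matches finite differences
assert abs(hf) <= np.linalg.norm(B, 2) + 1e-12  # |d lambda/dt| <= ||dH/dt||
```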

C Proof of Theorem 16
Per the statement of the theorem, we would like to estimate the error between the Discrete Fourier Transform (DFT) and the corresponding samples of the Fourier transform of the continuous Gaussian distribution. We split this analysis into three parts: calculating the time-domain truncation errors, estimating the aliasing errors from the nonzero sampling rate, and finally calculating the error coming from renormalization. Figure 6 gives a schematic representation of the analysis, including the resulting big-O bounds. For reference, we collect some useful definitions which we will use in the proof to follow.
It will also be convenient to define the Fourier dual sampling rates and widths, respectively.

C.1 Truncation Error
The error between the scaled Discrete-Time Fourier Transform (DTFT) (C.10) and the untruncated DFT X[k] of Equation (C.7) can be estimated as follows. We can upper bound ε̃_trunc using a right-Riemann-sum approximation of the error function, together with the simple bound on erfc from [47] (C.14). Finally, we can express the truncation error ϵ_trunc using (C.11) and the fact that N(σ, T, m) ∈ O(1/T).

C.2 Aliasing error
Now that we have estimated the error on the DTFT from time-domain truncation, we would also like to estimate the aliasing error, that is, the difference between the DTFT and the Fourier transform over a period 1/T. This error arises from the finite sampling rate of a function that is not band-limited in the frequency domain. In order to estimate the aliasing error, we must look at the Fourier transform of a Gaussian distribution, which is another Gaussian.
Now, the DTFT can be expressed in terms of the Fourier transform by periodization; the aliasing error then follows. The aliasing error has a critical point at f = 0 and its second derivative is strictly positive throughout f ∈ [−1/(2T), 1/(2T)], so the error is largest at the boundaries of the DTFT period, f = ±1/(2T). The error at these two points is the same by symmetry, so we choose f = 1/(2T) in order to bound the error from above. The resulting bound involves θ_2, the second Jacobi theta constant. For e^{−(2πσ/T)²} ≤ 1 and e^{−4π²(σ/T)²} ≤ 1/2, the error ϵ_alias between the DTFT and the Fourier transform can be estimated using (C.21). Thus, with a suitable choice of parameters, both sources of error can be bounded with (C.26). This means that after the DFT, the register holds an amplitude whose total error ϵ_DFT accounts for both truncation and aliasing; we bound it by summing (C.21) and (C.10) to find (C.28).
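By Poisson summation, the T-scaled DTFT of the sampled Gaussian is the periodization of its continuous Fourier transform, so the aliasing error peaks at the period boundaries f = ±1/(2T). A numerical sketch (σ and T below are illustrative choices):

```python
import numpy as np

sigma, T = 1.0, 0.5   # Gaussian width and sampling period (illustrative)

def fourier_gaussian(f):
    """Continuous Fourier transform of exp(-t^2 / (2 sigma^2))."""
    return np.sqrt(2 * np.pi) * sigma * np.exp(-2 * (np.pi * sigma * f) ** 2)

def dtft_gaussian(f, n_terms=2000):
    """T-scaled DTFT of the sampled Gaussian; by Poisson summation this
    equals sum_k Xhat(f - k/T), the periodized continuous transform."""
    t = T * np.arange(-n_terms, n_terms + 1)
    return T * (np.sum(np.exp(-t**2 / (2 * sigma**2))
                       * np.exp(-2j * np.pi * f * t))).real

# aliasing error within one period, sampled on a grid
fs = np.linspace(-0.5 / T, 0.5 / T, 101)
err = np.array([abs(dtft_gaussian(f) - fourier_gaussian(f)) for f in fs])
# the error is maximal at the period boundaries f = +-1/(2T)
assert err.argmax() in (0, len(fs) - 1)
# and is dominated by the nearest alias Xhat(1/(2T) - 1/T)
assert err.max() < 2.2 * fourier_gaussian(0.5 / T)
```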

C.3 Renormalization error and ultimate additive error
The signal after the DFT has to be normalized, and we know it is locally ϵ_DFT-close to Equation (C.23), which is not necessarily normalized. Therefore, for the norm of samples of Equation (C.23) we obtain the following upper bound on the normalization ratios, where we have used the assumption ∥⃗ϵ_DFT∥ ≤ 1. The lower bound follows similarly. Taking the full inequality and its square root, we then divide every side by N(σ_f, F, q).
We multiply these inequalities by the maximum amplitude, take the absolute value to obtain a two-sided inequality, and use the definitions (C.8) and (C.9).
It is worth noting that, for the purpose of estimating (T N(σ, T, m))^{1/2}, we can increase 2^q arbitrarily (or decrease F arbitrarily). Through this, we see that for our choice of σ_T this would mean (C.38).
Therefore, the amplitudes that we get on the ancillary register are

D Proofs of State Preparation and Ground State Energy Extrapolation Costs
Here we prove results related to the costs of state preparation and phase estimation for extrapolation of the ground state energy, from Theorem 15 and Theorem 17 respectively.

D.1 Proof of Lemma 12
Proof. Let λ_{m−1}, λ_m, and λ_{m+1} be consecutive simple eigenvalues of H. This guarantee can be satisfied by the stated constraint; solving the last inequality for r gives the result. If λ_m is the ground state or the highest state of H, we repeat a similar proof, except there is only one state above or below, respectively.

D.2 Proof of Lemma 13
Proof. We take a look at the complex-time Schrödinger equation for S p (τ ).
We now recall that the derivative can be expressed through the complex-time Magnus expansion. We then re-express E(τ) and expand Ω′, where P is any path in the complex plane going from τ_1 = 0 to τ_1 = τ. We now expand out the first application of ad_Ω on A and separate the n = 1 term from the sum. Following this, we make use of the estimation lemma (the "ML" inequality), where s parametrizes the path P and |dτ_1/ds| ds is the arc-length differential. Next, we make use of the bound ∥ad_Ω∥ ≤ 2∥Ω∥, and conclude with one final estimate.

D.3 Proof of Lemma 14
Our task here is to show that a bound of the form shown in (3.17) actually exists, and for this we closely follow the work done in [48]. In order to make assertions over the general complex plane, we extend their results to complex time. The extension is straightforward because e^{iτA} has no singularities except for non-removable singularities at Im(τ) ∈ {−∞, ∞} for Hermitian A, and at Im(τ) = −∞ when A is positive semi-definite. Thus, we can replace all the real-axis integrals ∫_0^t f(t_1) dt_1 by contour integrals along any path P from 0 to a generally complex τ, as long as that path avoids Im(τ) ∈ {−∞, ∞}. Our results are summarized by the following lemma.

Proof. First, we define a general product formula as below, where the coefficients a_(υ,m) are real numbers. The parameter Υ denotes the number of stages of the product formula; for example, for the ST formula S_2k(t), we have Υ = 2 · 5^{k−1}. The permutation π_υ controls the ordering of operator summands within stage υ of the product formula. We then consider the error in the effective Hamiltonian with the help of the definition of the path-ordered exponential. Using this definition and Lemma 13, we can write the operator error on the Hamiltonian as follows, where s in this context is |τ_1|. Let ⪰ be a lexicographic (dictionary) ordering, and consider a product formula with ←−(υ,m) dictating that the operations are applied from right to left in ascending order according to ⪰; let −→(υ,m) refer to descending order.
After differentiating the evolution operator, and some algebra, one obtains that E(τ) can be expressed as (D.34). We can bound the norm of this term through the Taylor expansion e^{iτA_s} ⋯ e^{iτA_2} e^{iτA_1} B e^{−iτA_1} e^{−iτA_2} ⋯ e^{−iτA_s} = C_0 + C_1 τ + ⋯ + C_{p−1} τ^{p−1} + remainder. We can bound the norm of the last term in Equation (D.35) through the triangle inequality for contour integrals; for simplicity, we fix the contour C going straight from 0 to τ. We also assume for the moment that the matrices A_j are positive definite. Thus, the upper bound on ∥E(τ)∥ follows, where (υ, m + 1) is identified with (υ + 1, 1) at stage boundaries.

D.4 Avoiding fractional queries
For the Gaussian QPEA defined in Section 3.2.3, we must define the $\tilde{U}_{s_k}$ operator. A first attempt looks the following way:
$$\tilde{U}_{s_k} = \left(S_p(t s_k)\right)^{e_k} = \exp\!\left(-i t s_k \tilde{H}_{s_k} e_k\right), \qquad (D.43)$$
where the $s_k$ are the Chebyshev nodes with $a = 1$, and
$$e_k = \frac{s_1}{s_k}, \qquad (D.44)$$
such that the total simulation time
$$T = s_1 t \qquad (D.45)$$
is the same for each $\tilde{U}_{s_k}$. This, however, will require fractional powers of (that is, a non-integer number of queries of) $S_p(t s_k)$. Although this can be achieved efficiently through quantum signal processing and one extra ancillary qubit, within this section we would like to stay as rudimentary as possible, minimizing quantum overhead. Hence, we will instead use the evolution operator $\tilde{U}'_{s_k} = \left(S_p(t s_k)\right)^{e'_k}$ with an integer number of queries $e'_k$. We note that the equivalent evolution time $T'_k$ now varies for different $k$, but this has a negligible effect on state preparation (see Theorem 15) and on eigenenergy estimation (see Theorem 17) with respect to the effective Hamiltonian $\tilde{H}_{s_k}$.
When it comes to query cost, this has the effect of increasing the resolution and the cost of QPEA by a factor of
$$\frac{|e'_k|}{|e_k|} \le 2. \qquad (D.50)$$
As a last step before jumping into the proof of Theorem 15, we first discuss some estimates for the coefficients of the terms in the extrapolation. The first such result is given in Lemma 3. In the framework we propose here, most algorithms' cost will scale proportionally to $\|e'\|_1 \le 2\|e\|_1$.
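A quick numerical sanity check of the factor-of-2 bound (D.50): the sketch below takes $e'_k = \lceil e_k \rceil$ as a hypothetical integer rounding (the particular rounding rule is our assumption) and confirms that, because $e_k \ge 1$, both the per-node ratio and $\|e'\|_1/\|e\|_1$ stay below 2.

```python
import numpy as np

# Chebyshev nodes on [-1, 1] (a = 1); by symmetry only the positive half
# is needed, listed in descending order so s_1 is the node closest to 1.
n = 16
k = np.arange(1, n + 1)
s = np.cos((2 * k - 1) * np.pi / (2 * n))
s = s[s > 0]

e = s[0] / s           # fractional query counts e_k = s_1 / s_k >= 1
e_int = np.ceil(e)     # hypothetical integer rounding e'_k

print(np.max(e_int / e))       # per-node ratio, <= 2 since e_k >= 1
print(e_int.sum() / e.sum())   # ||e'||_1 / ||e||_1, also <= 2
```

Since $e_k \ge 1$, rounding up adds at most $1 \le e_k$ queries per node, which is exactly why the overhead can never exceed a factor of two.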

D.5 Proof of Theorem 15
Proof of Theorem 15. In the framework we propose here, most algorithms' cost will scale, from Lemma 3, as $\|e'\|_1 \le 2\|e\|_1 = 2 s_1 \sum_k 1/s_k$. For example, the ground state preparation methods detailed in [37] would scale proportionally to $\|e\|_1$. More precisely, the cost of state preparation scales with the minimum eigenvalue gap as in (D.52), where, from Lemma 11 and the fact that the spectral gap is bounded below by $\gamma_0$, the corresponding gap is bounded as well. Solving for $n$, and using that the radius of the Bernstein ellipse needed satisfies $\rho_{\max} = \Theta(r_{\max})$ from Lemma 12, we find
$$n = O\!\left(\frac{p \log(1/\epsilon)}{\log(\gamma_0 / t^p) + \log\!\left((p+1)!/2^{p+1}\right)}\right). \qquad (D.58)$$
It is clear that a constraint on $t$ is needed, since the radius must be a quantity greater than $1$ and there is no advantage in going to a much smaller $t$. Finally, the cost for state preparation follows; note that the terms in the cost function $L_0$ go like $1/\sigma_k$ due to Fourier duality and Theorem 16, and the other factor comes from the cost of a single $U_k$ implementation. After doing the minimization we are left with the stated bound. In the second line we brought out $n$ factors of $ec$ using the condition on the indices $D$, and we identified $Y_n$, evaluated at the same value in every argument, with the single-variable Bell (or Touchard) polynomial $B_n$. The size of $B_n(\sqrt{2/\pi c})$ can be bounded using [54].

There is an interpolation error from the polynomial $P_{n-1} f$ fitting $f$, assuming perfect interpolation points $(s_i, f(s_i))$. But $f(s_i)$ can only be estimated; let us call this estimate $\tilde{y}_i$. The error in $\tilde{y}_i$ in our analysis comes from the statistical error inherent in the estimation protocol as well as from the error in the fractional query procedure for a $1/s$ evolution. We can consider the interpolation error and the data error independently via the triangle inequality.
Here $L_n$ is the Lebesgue constant of the interpolation, essentially a condition number, and $\epsilon_{\mathrm{data}}$ is an upper bound on the error in the data; $\tilde{P}_{n-1} f$ is the fit to the imperfect data and $P_{n-1} f$ the fit to the perfect data $(s_i, f(s_i))$. For generic interpolation nodes, $L_n$ can grow rapidly; however, for the set of Chebyshev nodes we obtain a near-optimal value [55]:
$$L_n \le \frac{2}{\pi}\log(n+1) + 1. \qquad (E.19)$$
Since we want the total error to be within a threshold $\epsilon$, we can require the error budget split of (E.20). Given these error bounds, we can now turn to the cost of acquiring the data points. Let us assume the expectation value can be encoded as an amplitude estimation problem. This occurs, for example, when $O$ can be block encoded in a unitary $U$ with scaling parameter $\gamma$ such that $\|O\|/\gamma \le 1$. Performing a Hadamard test gives the amplitude
$$\frac{1 + \langle O_{s_i}(t)\rangle/\gamma}{2}. \qquad (E.21)$$
If we estimate this amplitude to within accuracy $\epsilon_{\mathrm{data}}\|O\|/2\gamma$, we can estimate $f(s_i)$ within $\epsilon_{\mathrm{data}}$. Using Iterative Quantum Amplitude Estimation [44], we can obtain this estimate using a Grover iterate $G$ constructed from two Hadamard test oracles. The number of Grover oracles $N_G$ required is given in (E.22), where $N_k = 2m\,5^{k-1}$, and where the logarithm comes from the need for fractional queries with QSVT. There is also an $O(1)$ overhead associated with the fractional queries and IQAE. Therefore, plugging in (E.22) for $N_G$ and summing over $1/s_i$ using Lemma 3, the total number $N_{\exp}$ of exponentials for generating all $n/2$ data points (we only need half due to symmetry) satisfies
$$N_{\exp} \in O\!\left(\frac{\gamma N_k L_n n}{\|O\|\,\epsilon\, a}(\log n)\,\log\frac{2n}{\delta}\,\log\!\left(\frac{2\gamma L_n}{\pi\|O\|\epsilon}\right)\log(1/\epsilon_{\mathrm{data}})\right) \subset \tilde{O}\!\left(\frac{n}{a\epsilon}\log(1/\delta)\right), \qquad (E.26)$$
where $\tilde{O}$ suppresses factors logarithmic in $n$ and $\epsilon$. We also assume $\gamma/\|O\| \in O(1)$. The number of nodes $n$ and the interpolation interval $[-a,a]$ will be determined by $\epsilon_{\mathrm{int}}$, the interpolation error assuming perfect data. To apply our error bounds from the previous subsection, we work in a regime where $a$ satisfies the conditions of Lemma
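Both (E.19) and the triangle-inequality error split can be checked numerically. The sketch below (with $f = \cos$ as a stand-in for the extrapolated quantity and a uniform data-noise model, both our assumptions) computes the Lebesgue constant of the Chebyshev nodes directly from the Lagrange basis and verifies $\|\tilde{P}_{n-1}f - f\|_\infty \le \|P_{n-1}f - f\|_\infty + L_n \epsilon_{\mathrm{data}}$:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

rng = np.random.default_rng(0)

f = np.cos            # stand-in for the quantity f(s) being extrapolated
n = 12                # number of Chebyshev nodes on [-1, 1] (a = 1)
eps_data = 1e-4       # per-point data error bound

s = np.cos((2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n))  # Chebyshev nodes
y_noisy = f(s) + rng.uniform(-eps_data, eps_data, size=n)

# Degree-(n-1) fits through the perfect and the noisy data
p_exact = C.chebfit(s, f(s), n - 1)
p_noisy = C.chebfit(s, y_noisy, n - 1)

grid = np.linspace(-1, 1, 2001)
err_interp = np.max(np.abs(C.chebval(grid, p_exact) - f(grid)))
err_total = np.max(np.abs(C.chebval(grid, p_noisy) - f(grid)))

# Lebesgue constant of the nodes, from the Lagrange basis l_i(x)
L = np.zeros_like(grid)
for i in range(n):
    others = np.delete(s, i)
    L += np.abs(np.prod((grid[:, None] - others) / (s[i] - others), axis=1))
Ln = L.max()

bound = 2 / np.pi * np.log(n + 1) + 1            # (E.19)
print(Ln <= bound)                               # holds for Chebyshev nodes
print(err_total <= err_interp + Ln * eps_data)   # triangle-inequality split
```

The slow, logarithmic growth of $L_n$ is what makes the data-error term benign: tightening $\epsilon_{\mathrm{data}}$ directly tightens the total extrapolation error, with only a mild $n$-dependent amplification.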