Randomizing multi-product formulas for Hamiltonian simulation

Quantum simulation, the simulation of quantum processes on quantum computers, suggests a path forward for the efficient simulation of problems in condensed-matter physics, quantum chemistry, and materials science. While the majority of quantum simulation algorithms are deterministic, a recent surge of ideas has shown that randomization can greatly benefit algorithmic performance. In this work, we introduce a scheme for quantum simulation that unites the advantages of randomized compiling on the one hand and higher-order multi-product formulas, as they are used for example in linear-combination-of-unitaries (LCU) algorithms or quantum error mitigation, on the other hand. In doing so, we propose a framework of randomized sampling that is expected to be useful for programmable quantum simulators and present two new multi-product formula algorithms tailored to it. Our framework reduces the circuit depth by circumventing the need for oblivious amplitude amplification required by the implementation of multi-product formulas using standard LCU methods, rendering it especially useful for early quantum computers used to estimate the dynamics of quantum systems instead of performing full-fledged quantum phase estimation. Our algorithms achieve a simulation error that shrinks exponentially with the circuit depth. To corroborate their functioning, we prove rigorous performance bounds as well as the concentration of the randomized sampling procedure. We demonstrate the functioning of the approach for several physically meaningful examples of Hamiltonians, including fermionic systems and the Sachdev-Ye-Kitaev model, for which the method provides a favorable scaling in the effort.


Introduction
The simulation of quantum processes on quantum computers is one of the most eagerly anticipated use cases for quantum computing. The ability to simulate a system's time evolution promises to provide insights into the dynamics of interacting quantum systems in situations where approximate classical simulation methods fail and constitutes one of the cornerstones of quantum technologies [1].
This work aims at improving algorithms that simulate the dynamics of expectation values of observables. The need for developing such machinery stems from the observation that state-of-the-art quantum devices and early quantum computers are still rather limited in their realizable circuit depths and control. We, therefore, assume only access to a quantum-oracle machine that implements single-qubit state preparation, controlled time evolution and quantum measurements and strive for minimizing the required depth of suitable quantum algorithms. Such a setting explicitly allows for the use of product formulas, which are the earliest algorithms proposed for the simulation of time-independent local Hamiltonians [2].
Such Trotter-Suzuki methods, as they are called, have evolved from comparably simple prescriptions for local Hamiltonians to sophisticated schemes able to capture more general sparse time-independent Hamiltonians [3][4][5] as well as time-dependent Hamiltonians [6,7] and open quantum systems [8,9]. Despite their relative simplicity, product formulas are still at the forefront of Hamiltonian simulation Algorithm Trotter-Suzuki [26] (Section 2)

Childs-Wiebe [13] (Section 2)
Novel multi-product formulas [here] (Section 3.2) Max. depth 2Lr · 5 χ−1 2L(K + 1) · 5 χ−1 2LR · 5 χ−1 Query complexity per sample Sampling overhead 1 2χR q=0 C q ν (r) , b (r) 2 Table 1: Comparison of the standard Trotter-Suzuki formula, the multi-product formula used by Childs and Wiebe and the closed-form/matching multi-product formula introduced in this work for approximating the time evolution operator exp(−iHt) of a Hamiltonian H = L k=1 h k . We use Λ = k h k and provide detailed error bounds in Section 2 and Theorem 3. Each algorithm provides the choice of some free parameters, dictating their performance and resource requirements. The multi-product formulas proposed by Childs and Wiebe in Ref. [13] and those introduced here further come with additional parameters depending on the choice of these free parameters. Since we aim for a randomized implementation using the framework introduced in Algorithm 1, the maximal circuit depths, query complexities, and error scalings are slightly different from their implementation in a deterministic, coherent fashion. Furthermore, implementing them in the randomized sampling framework comes with the stated sampling overheads. Note that their errors can also exhibit a commutator scaling as discussed in Refs. [11,14] for all of these methods. and actual performance on several physically plausible and interesting Hamiltonian models of strongly correlated quantum systems, for which we find a favorable performance over known schemes of quantum simulation.

Multi-product formulas
Multi-product formulas constitute the main ingredient of our work, and hence we will briefly review the underlying ideas, starting with product formulas. The goal of quantum simulations is to approximate the quantum dynamics of a complex quantum system described by a many-body Hamiltonian decomposed as composed of Hermitian Hamiltonian terms {h k } of neither necessarily small nor geometrically local support defined on a quantum lattice equipped with a Hilbert space H = (C d ) ⊗n .
Here, n is the number of degrees of freedom of finite local dimension d. To access the Hamiltonian, we assume to have oracles O k (t) implementing time evolutions under each term in the Hamiltonian O k (t) |ψ = e −ih k t |ψ (2) for any k ∈ [1, L] and t ∈ R. In digital quantum simulation, these oracles are built from Clifford gates and phase rotations, but we do not assume that such a decomposition is available to us, as the oracles could be implemented by the time evolution of a programmable device. We only require to have control over the implementation, i.e., that we can apply the oracle O k (t) depending on the state of a subsystem encoded as a qubit as Now, quantum simulation aims at making reliable predictions of the expectation values of observables O (which in the ideal case are local with a small support on the lattice, but again, this is not a necessity) at later times T > 0 O(T ) := tr(U (T )ρU † (T )O) (4) for initial quantum states ρ, where U is the time evolution operator. In practice, this commonly means to establish ways to approximate the corresponding time evolution operator Although it is worth stressing that a-priori information about the time T and properties of the initial state ρ, as well as features of the locality of the underlying Hamiltonian H can be exploited to come up with highly specialized approximations, we do not assume any underlying structure or limitations in this work. A simple, yet effective approach to approximating the time evolution operator is that of using product formulas, which make use of N consecutive evolutions of individual Hamiltonian terms h kj by associated short time intervals α j t, defined with real coefficients α j such that j α j = 1. Partial backward evolutions are explicitly allowed, i.e., α j < 0 for some j, even though t > 0. In general, such a formula is defined as a product of time evolutions where each single term evolution can be implemented via O k (t). Here, we distinguish between a direct evolution by time t and a repeated evolution of short time slices t/r < 1 such that The most accurate known formulas of this type are Trotter-Suzuki formulas [26]. They are recursively defined for any positive integer χ and any time t by with s 2χ := (4 − 4 1/(2χ+1) ) −1 for any positive integer χ. This specific choice of s 2χ ensures that the Taylor series of S 2χ (t) matches that of the actual time evolution U (t) up to O(t 2χ+1 ), which makes it a good approximation for t 1. It is important to stress that constructing an O(t 2χ+1 ) approximation requires the application of 2 · 5 χ−1 (L − 1) + 1 individual oracles O k (t), where L is again the number of Hamiltonian terms. More generally, we would represent them by oracle calls and we will refer to the number of exponentials N exp . Using repetitions as in Eq. (7), the number of exponentials of the algorithm is of O(rL5 χ−1 ) to approximate the actual time evolution U (T ) up to an accuracy of ε ∈ O(r(T /r) 2χ+1 ) in operator distance.
To enhance the comparability of product formulas with more involved multi-product formulas whose bounds have not been improved to the same extent, we use the following simple tail bound and note that better bounds exist, e.g. via commutator relations [11]: Here, g χ = (5/3) χ−1 4χ/3 originates in the exponential tail of the Trotter-Suzuki terms [13] and Λ = L k=1 h k .
The exponential dependence of the circuit depth on χ can pose a challenge for real-world implementations, and especially so within the NISQ regime. Altogether, the number of oracle calls needed to simulate the evolution up to an error ε ≤ 1 is upper bounded by as proven in Ref. [4]. An alternative complexity bound using commutator relations of the individual Hamiltonian terms can be found in Ref. [11].
A known technique to decrease the number of required short time evolutions, i.e., oracle calls, is the use of multi-product formulas [27,28]. While the Trotter-Suzuki approximation cancels erroneous contributions of higher-order terms by adding backward evolutions, multi-product formulas achieve the same cancellations by superposing different product formulas. Conventionally, one employs multi-product formulas that describe the same time evolution (up to a fixed order of t), but whose erroneous higher-order contributions are of different strength and can thus be made to cancel. This is achieved by approximating the time evolution to the same Trotter-Suzuki order, but considering product formulas different in the number of time slices [27,29]. The Childs and Wiebe multi-product formula discussed in Ref. [13] is of the form where K is an integer defining a cutoff and { q } is a set of pairwise different integers. The coefficients ensuring the error terms in the multi-product formula vanish up to O(t 2(K+χ) ), resulting in where we have again stated a simple tail bound for comparability and note that it could also exhibit a commutator scaling as discussed in Refs. [11,14]. Unlike Childs and Wiebe, we will henceforth use the simplest version achieved by setting q = q for all q since this choice is the most favorable for a randomized implementation, which will become clearer in Section 3.
In Definitions 2 and 3 and Section 4, we employ a different approach for multi-product formulas and develop two techniques whose errors scale with O(t 2χR+1 ), and where R is comparable to K + 1.
Multi-product formulas were firstly used for quantum simulation by Childs and Wiebe [13], who developed the linear-combinations-of-unitaries (LCU) approach to directly implement multi-product formulas on a quantum system. Note that sums of unitaries are not inherently physical operations since the unitary group is not closed under addition. Childs and Wiebe use a non-deterministic approach to implement multi-product formulas that can lead to large algorithmic depth. Berry et al. [15] have implemented LCU for a truncated Taylor series nearly perfectly but their use of oblivious amplitude amplification requires an additional register and a complex state preparation procedure. Recently, Ref. [14] has improved the condition number of multi-product formulas and thus extended the use of LCU by amplitude amplification. However, these improvements have no impact on their asymptotic performance and are less favorable for randomized implementations.
Using randomized sampling such as proposed in Algorithm 1, we can circumvent the need for these potentially deep circuits. Randomized compiling of multi-product formulas entails sampling among the individual terms of Eq. (13) requiring the implementation of just twice the number of oracle calls, rather than the entire linear combination and the overhead for its implementation.
Step 1: Prepare multi-product formula for importance sampling: Step 2: Repeat the following two steps N times (see Algorithm 1): ∼ {p k , V k } via importance sampling, i.e., draw V k with probability p k .
(b) Run the following circuit and measure outcome : Overview of the randomized sampling framework for estimating the dynamics of observables. After preparing a multi-product formula for importance sampling, we can run Algorithm 1 to implement it in a simple, randomized fashion. While this overview already hints toward its use for time evolution and estimating the dynamics of observables, these results apply to general multi-product formulas.
3 Randomized sampling using multi-product formulas LCU-type algorithms, despite having a finite failure probability, are deterministic when successfully applied. In contrast to such deterministic implementations, there is a recent interest in randomized algorithms [21][22][23][24]. These novel results improve the performance of product formulas by randomizing the order of the short time evolution and introducing the alternative setting of randomized compiling. With the goal of reducing the circuit depth required for the implementation of multi-product formulas, we build upon randomized compiling and propose the setting of randomized sampling: Given an observable O and a quantum system in the initial state ρ, our goal is to find the expectation value of the observable after a time evolution U = e −iHt governed by the Hamiltonian H = L k=1 h k . That is, we wish to compute In the following, we describe a randomized algorithm for such a task, give convergence guarantees, and present novel multi-product formulas suited for this framework. Note that we will require one part of our system to be encoded as a single qubit, while the rest acts as the simulator.

The randomized sampling framework
We begin by proposing the randomized sampling framework summarized in Figure 2. After rescaling the coefficients of a given multi-product formula, we can apply the following simple algorithm to estimate Eq. (16) in a randomized fashion based upon importance sampling:

Algorithm 1 (Randomized sampling). Given an observable O and an ensemble
of M unitaries {V k } and corresponding probabilities {p k }, we consider N independent repetitions of the following protocol: Note that it is in general sufficient to measure X and O separately and multiply the outcomes, rather than performing a joint measurement. This could be useful e.g. when the measurement of O is available in a black-box fashion. As we show in more detail in Section 4, we find the following result for infinitely many runs of the algorithm:

Corollary 1 (Sample mean convergence). Let O be an observable, ρ be an initial state and let
Then, the sample mean of Algorithm 1 converges to the expectation value of the random measurement outcomes {o j } j , given by Furthermore, since an infinite number of measurements is infeasible, we give an expression for the number N of repetitions of Algorithm 1 sufficient for achieving the desired accuracy and confidence for the goal of randomized sampling. Using Hoeffding's inequality [30], we can formulate the following result:

Theorem 1 (Randomized implementation of sums of unitaries). Let O be an observable, ρ an initial state and let
Then, for a fixed accuracy ε ∈ (0, 1) and confidence δ ∈ (0, 1), a total of repetitions of Algorithm 1 suffice to accurately approximate the expectation value tr OV ρV † using the sampling mean estimator, i.e., with probability at least 1 − δ.
So far, this framework is fairly general: Given a set V, the algorithm samples from the linear transform with the unitary V . For quantum simulation, V (a decent approximation to U ) and V still have to be found. This is where multi-product formulas come into play. Multi-product formulas approximate U by a weighted sum of product formulas, and so the featured product formulas and their weights could make up the set V. The problem is that we require {p k } to form a probability distribution, i. e., p k > 0 and M k=1 p k = 1. This disqualifies us from identifying the sets {p k } with {C q } straightforwardly. While q C q = 1 holds for any multi-product formula per construction (see the first component of Eq. (14)), the sign of some C q will be negative [31]. We thus have to absorb the signs of these negative C q into their unitaries V q , but then we find that the operation is not normalized as q |C q | > 1. At this point we introduce the quantity Ξ := q |C q |, that we call resolution factor. The resolution is practically used to define a probability distribution via p k = |C k | /Ξ such that Algorithm 1 samples from As the time evolution is now approximated by ΞV rather than V , Ξ factors into the number of required circuit evaluations. With Ξ > 1 requiring to run the algorithm more often to achieve an error comparable with the situation where Ξ = 1, this resolution factor can be regarded as a penalty. This penalty is further amplified if a direct time evolution by time t is insufficient and a repeated time evolution of short time slices t/r < 1 is required. Since each of such repetitions will result in additional factors of Ξ leading to an exponential scaling of the resolution factor in the number of repetitions, this behavior prohibits the use of a large number of repetitions as is usual for product formulas. On the other hand, this randomized implementation does not rely on post-processing and the increase of some success probability via wellconditioning as is the case for the deterministic, block-encoding implementation, which in turn would increase the number of required circuit evaluations drastically. However, especially in the regime of early quantum computers with capabilities between those of NISQ devices and large fault-tolerant quantum computers, where one would use few repetitions, a deeper circuit may pose a bottleneck much tighter than a slightly higher number of circuit evaluations would. We, therefore, envision these results to be of particular interest in this intermediate regime, where one might be interested in simulating the dynamics of quantum systems instead of performing full-fledged quantum phase estimation. We can make the following guarantees for approximating time evolutions in randomized sampling with a nontrivial resolution and a finite number of measurements: Theorem 2 (Approximating unitaries with a finite number of measurements,). Let U be a unitary, O an observable, ρ some initial state and let Then, it is sufficient to run repetitions of Algorithm 1 to achieve with probability at least 1 − δ.

Custom multi-product formulas
The multi-product formulas of Childs and Wiebe can easily be adapted to this randomized sampling scheme. Since previous improvements of multi-product formulas focused on increasing success probabilities by changing their condition number [14], which has no impact on their asymptotic error scaling, we can ignore constraints encountered in LCU techniques. Our sole interest lies in a low algorithmic depth and a moderate resolution, so we choose to set q = q for all q = 1, . . . , K + 1 in Eq. (13) and (14). However, those multi-product formulas make relatively little use of the Trotter-Suzuki order 2χ of its components. In fact, the only reason not to minimize the depth by using S 2 (·) blocks is the high resolution factor. To this end, we present a family of novel multi-product formulas tailored toward the randomized sampling framework with improved scaling in the Trotter-Suzuki order that can be optimized for their resolution factor. A brief overview of the process leading to a suitable multi-product formula is presented in Figure 3 While Childs and Wiebe's approach manipulates higher-order error terms, we employ a construction to modulate the entire Taylor expansion.
Definition 1 (Linear combination of time evolution operators). Let χ ≥ 1 and R ≥ 1 be integers and S 2χ (t) a Trotter-Suzuki product formula approximation to U (t) as defined in Eq.
Use the matching multi-product formula introduced in Definition 2 Use the closed-form multi-product formula introduced in Definition 3 Attempt solving the system of nonlinear equations defined in Eq. (29) Numerically optimize the parameters b using a loss function balancing error bound (see Theorem 3) and resolution factor Ξ = q |C q | impacting the number of circuit evaluations. This involves inverting the Vandermonde matrix defined in Eq. (27) Obtain a multi-product formula with q C q V q ≈ e −iHt Implement MPF using the newly proposed randomized sampling framework (Section 3.1 and Figure 2) S u c c e s s F a il u r e Figure 3: Overview of the steps necessary to arrive at a suitable multi-product formula as discussed in the following section. While the matching multi-product formula is superior to the closed-form version, the required solution to this system of nonlinear equations might not always be available for some Here, the coefficients {C q } are related to the parameters b and ν via the inverse of the corresponding Vandermonde matrix in Eq. (27) as is discussed in detail in Section 4. The parameters ν will have an important role in the following construction, whereas b can be tuned to strike a balance between the resolution factor Ξ, error bound ε, and condition number. While previous approaches focused on optimizing b for the condition number relevant for the success probability when using block-encodings and amplitude amplification, we instead numerically optimize them for the resolution factor and obtain improved error bounds along the way. Note that a similar construction has recently been used to approximate energy measurements in the fully analog setting [32]. We can now turn toward the definition of the first new multi-product formula.
Definition 2 (Matching multi-product formula). Let χ ≥ 1 and R ≥ 1 be integers and L 2χ,R (ν, b, t) as defined in Definition 1. Then, for any t ∈ R, we define the multi-product formula M with ν for all 0 ≤ k ≤ 2χR, whereas the parameters {b (r) } can be chosen arbitrarily.
Multiplying the coefficients C q of the individual building blocks L 2χ,R (ν (r) , b (r) , t) of the matching multi-product formula results in a resolution factor of For the time evolution to work, the set of vectors ν (r) has to be found satisfying the constraint in Eq. (29) for different R and 2χ. It is important to note that obtaining the coefficients ν for the matching multi-product formula requires solving the system of nonlinear equations defined by Eq. (29), whose complexity scales at least exponentially in χR. In this work, we have used numerical solvers which converge in a few seconds for χR ≤ 10, which we view as reasonably high as they provide an approximation up to O(t 21 ), which we view as suitable for early quantum computers. When solving the system of nonlinear equations becomes unfeasible, be it due to slow convergence or the desire for significantly better approximations, we can employ a second version of the multi-product formula in which the ν (r) are already determined at the cost of a slightly larger resolution factor Ξ: Definition 3 (Closed-form multi-product formula). Let χ ≥ 1 and R ≥ 1 be integers and L 2χ,R (ν, b, t) as defined in Definition 1. Then for any t ∈ R, we define the multi-product formula M with for all 0 ≤ k ≤ 2χR and 1 < n ≤ R. The parameters {b (r) } can be chosen arbitrarily.
Multiplying the coefficients of the individual L 2χ,R (ν (r) , b (r) , t), results in a resolution factor for the closed-form multi-product formula of For these multi-product formulas, we can prove the following two results: be a multi-product formula constructed according to Definition 2 or 3. We then find that Theorem 3 (Error bound for custom multi-product formulas). For M 2χ,R (t) being a multi-product formula constructed according to Definitions 2 or 3, we have Figure 4: Quantum circuit for randomized sampling with a multi-product formula such that the circuit samples from E( O ). Again, it is in general sufficient to multiply the measurement outcomes instead of performing a joint measurement.
with g χ := 4χ for matching and closed-form multi-product formulas respectively.
Consequently, these formulas can be used for randomized sampling via Theorem 2. A numerical comparison of the error bounds of all presented formulas can be found in Fig. 5, while an overview of all relevant quantities is provided in Table 1.
While the formula dependent ζ χ,R themselves are not too illuminating, they can be upper bounded by the corresponding resolution factors multiplied by a term depending on the magnitudes of the used parameters b. However, these upper bounds are not tight which is why their exact versions are used in any explicit calculations. Nevertheless, they showcase that a bad choice of b can lead to significant penalties in the performance of the resulting multi-product formula, further stressing the need for optimizing with respect to these parameters.
The matching and closed-form multi-product formulas both rely on products of L 2χ,R (ν (r) , b (r) , t). For Algorithm 1, these products can either be expanded and the resulting operators and coefficients be identified with the sets {V k }, {p k }, or sets of (V (1) ) be drawn and applied as in •/• has been sampled. Furthermore, since the above framework randomizes only over product formulas, we are not necessarily bound to Trotter-Suzuki formulas and one could also think of a doubly-stochastic version in which the short-term evolutions comprising the product formulas are sampled randomly as well. In that sense, it is allowed to construct the multi-product formulas with building blocks of rather than (9). This is not possible for Childs and Wiebe's multi-product formulas, which require symmetric building blocks and are thus limited to Trotter-Suzuki formulas.

Proofs
In this section, we provide proofs of the statements presented above. In Section 4.1, we are proving our statements regarding the errors and uncertainty of the randomized sampling procedure. We then turn our attention to the matching and closed-form multi-product formula, providing the intuition behind their construction and verifying their error scaling in Section 4.2. We finally analyze the upper bounds for their error (which is the error of the average time evolution they describe) in Section 4.3.

Randomized sampling
Let us start by following Algorithm 1 step by step. The controlled and anti-controlled applications of V • and V • are defined as and the initial state ρ will be transformed to after the third step of the protocol. The expectation value for the succeeding measurement of X ⊗ I is then given by Here, we note that is in general not necessary to perform a joint measurement but sufficient to multiply the measurement results. This might be beneficial for cases in which the measurement of O is only available in a black-box fashion. We now use that V • and V • are sampled independently from the same ensemble such that Then, we can combine the quantum average form Eq. (47) with the classical average from Eq. (49) to conclude that the expectation value of the random single-shot measurement outcome o is given by which proves Corollary 1.
For real-world applications, we will never achieve a perfect expectation value. It is therefore vital to inspect the single-shot behavior of Algorithm 1 and give an estimate for the number of iterations required to achieve the desired precision. Since the possible outcomes of measuring X on the auxiliary qubit are For a fixed accuracy ε ∈ (0, 1) and confidence δ ∈ (0, 1), it is therefore sufficient to perform repetitions of Algorithm 1 to achieve the desired accuracy and confidence, concluding the proof.
Given recent developments, for example in Ref. [33], one might wonder whether substantial improvements are possible by using a more refined estimator, most notably median-of-means. Unfortunately, this is not very realistic in most scenarios. For observables that obey O 2 = I, such as local and global Pauli observables, the variance of a single-shot outcome o j ∈ [−1

, 1] becomes Var [o j ] = 1−E [o j ] = O(1).
In this case, the variance is of the same order as the magnitude and it is impossible to (asymptotically) improve over Hoeffding's inequality (asymptotic normality) [34]. However, median-of-means could still be used in this setting to take advantage of additional information about the variance, in cases where it is available. Now, we additionally assume that V times the resolution Ξ approximates the time evolution operator U , i.e., Furthermore, we will use the following result.

Lemma 1 (Closeness of expectation values
Proof. We begin by defining U as the difference between the exact and approximated time evolution According to Eq. (53) we find that U ≤ ε. Consequently, By combining the insights from the previous discussion, we can now tackle Theorem 2: Proof of Theorem 2. Add and subtract Ξ 2 tr OV ρV † to find By applying Theorem 1 with accuracy ε/Ξ 2 to the first term and using Lemma 1, we find that this upper bound is given by (3 O + 1)ε.

Construction of the new multi-product formulas
In Definitions 2 and 3, we propose two alternatives to Childs and Wiebe's multi-product formulas, which reduce the number of circuit evaluations required for a randomized implementation and have improved error scaling. In the following, we will motivate their construction and stress the advantages they provide. First of all, note that every approximation of the exact time evolution U (t) ≈ e −itH can be written as a sum of operatorsÂ k such that withÂ 0 = I. For any approximated time evolution with an error of O(t m+1 ), we find thatÂ k = (iH) k /k! for all k ≤ m. In other words, the Taylor expansion of the exact time evolution and its approximation has the same Taylor expansion for the first m non-trivial terms.
For the standard Trotter-Suzuki formula with U (t) = S 2χ (t), we havê for all k ≤ 2χ. For k > 2χ, these operatorsÂ k resemble uncontrolled, erroneous operators. In Definition 1, we propose a superposition of approximations with differently scaled times S 2χ (b n t), introducing controllable modulation parameters ν k up to a Taylor order of 2χR. Its Taylor expansion is given by where C q , b q ∈ R and ν k = ( q C q b k q ) is fixed via the linear transformation of the coefficients with the Vandermonde matrix B j,k = b j−1 k , BC = ν as in Eq. (27). The condition number of the Vandermonde matrix as the product of its Hilbert-Schmidt norm and the norm of the pseudo-inverse can be bounded from above and below by explicit expressions involving the vector defining the Vandermonde matrix [35]. Choosing the vector b, we can calculate the coefficients C for a fixed solution vector ν using the inverse Vandermonde matrix B −1 , which is found exactly to be [36,37] where the sum runs over all binary strings a = (a 0 a 1 . . . a 2χR−1 ) of length 2χR and Hamming weight 2χR−k. However, we have found that for numerical purposes, a matrix inversion of B clearly outmatches the analytical computation of B −1 in terms of runtime. Using Eq. (64), we can now manipulate the Taylor expansion of a Trotter-Suzuki block S 2χ (t) by replacing it with some L 2χ,R (ν, b, t). The key insight here is that while the operatorsÂ k are only correct for Taylor orders k ≤ 2χ, we gain control of the prefactors up to order 2χR. Consequently, we can eliminate all erroneousÂ k for 2χ < k ≤ 2χR by setting the corresponding ν k to zero. This allows us to construct the matching and closed-form multi-product formulas using blocks of L 2χ,R (ν, b, t) with different ν (and possibly b).

Matching multi-product formula
The first version of our proposed multi-product formula builds upon the multiplication of R multi-product building blocks L 2χ,R (ν (r) , b (r) , t) that according to Eq. (64) can be written as where we can eliminate the second sum by setting ν k = 0 for 2χ < k ≤ 2χR. Their product now yields To mimic the exact time evolution up to O(t 2χR+1 ), we require µ k = 1/k!, leading to Definition 2.

Closed-form multi-product formula
The closed-form version of the presented multi-product formula sums products of building blocks L. To motivate the specific construction, it is useful to take a look at the specific case of 2χ = 4 and R = 3 for some choice of b and t using the shorthand to get where the orders 1, . . . , 2χ have been visually highlighted for clarity. In this example, the eleventh order in t is revealed by multiplying the corresponding terms from the Taylor expansion of L in Eq. (62), with the corresponding coefficients in (70) The first term of Eq. (70) takes care of the zeroth and the first 2χ order of U . The second term multiplies all terms of orders t 1 . . . t 2χ with t 2χ and so takes care of the next 2χ terms of the expansion. The third term multiplies with a O(t 2χ ) term twice, taking care of the subsequent 2χ terms -a pattern emerges. In a general setting (with arbitrary 2χ and R) we can write this sum as where we have used Eq. (64) in Eq. (72) and ν (0) j = δ 2χ,j from Definition 3 to collapse the first factor into (−iHt) 2χ(r−1) /(2χ)! in Eq. (73). Considering also that ν (0) j = 1 for 0 ≤ j ≤ 2χ, the r = 1 term (that we recognize are the first 2χ orders of the time evolution) can be separated from the sum. Discarding all terms that vanish due to ν (r) for which we consult Definition 3, a last time resolving the remaining ν (r) k . This leaves us with the correct time evolution up to order 2χR + 1, thus proving the definition.

Error of the averaged operators
Following upon the insights from Section 4.2 we find Corollary 2 already proven by Eq. (66) and Eq. (74) as long as M 2χ,R is constructed according to Definition 2 or 3. The proof of Theorem 3 requires a more involved error analysis.
Proof of Theorem 3. We first need to bound the remainder terms of the Taylor series expansions of U (t) and M (m) . In the following, R (f ) denotes the remainder term of the Taylor series of an operator-valued function f truncated at -th order in t. We thus find that Moving forward, we employ some recently established results on the theory of Trotter errors. Specifically, we make use of the 'Trotter error with 1-norm scaling' lemma of Ref. [11]. Building upon these insights in this fresh context, we find that the exponential remainder of a product formula such as in Eq. (6) can be bounded by This bound can now be applied to (75). For the first term, we obtain while the second term can be bounded by where we have introduced the factor to group all terms relating to χ from the Trotter-Suzuki decomposition appearing in Eq. (53) of Ref. [13]. At the same time, we have defined Λ := k h k . Consequently, we can bound the error of the matching multi-product formula via with ζ (m) The error analysis of the closed-form multi-product formula follows along similar lines: Again, we bound the remainder terms of the Taylor series expansions of U (t) and M (cf) , finding where the second term can be bounded by we find concluding the proof.

Comparison and numerical validation
We will now compare the Trotter-Suzuki product formula algorithm and multi-product formulas in the randomized sampling framework. To make comparisons as fair as possible, we will assume that each algorithm uses at most R-fold sequences of S 2χ (·) blocks. Product formulas may use repetitions as in Eq. (7) achieving an error of O((t/R) 2χ+1 ), which is a reliable way to approximate time evolutions for longer times R τ > 1. However, repetitions improve the accuracy of shorter time evolutions only minimally. The situation is different if R is an integer power of five, which allows us to build the next higher Trotter-Suzuki order and approximate the time evolution up to a leading order of 2(χ+log 5 R)+1 in t. This means improving the leading power of t comes at an exponential cost for the circuit depth. The situation can be remedied by sampling from a multi-product formula. Remarkably, the statement for sampling observable eigenvalues with product formulas is very similar to Theorem 2.
Since the sampling errors of all of these methods are comparable, we disregard them in this comparison and instead only think about asymptotic limits. The number of required circuit evaluations for the same accuracy then follows directly from the corresponding resolution factors. The main difference between product and multi-product formulas is that Trotter-Suzuki algorithms do not have a resolution   (11) and (15) and Theorem 3. We have chosen 2χ = 4 and r = R = K + 1 = 3 to fix the depth of all methods to be 30 oracle calls. The parameters {b (r) } have been numerically optimized for the matching and closed-form multi-product formulas with an initial guess of b = { 1, −1, 2, −2, . . . , 7 }, while we employ q = q for Childs and Wiebe's formula as the choice with the lowest resolution factor. We find the resolution factors for the multi-product formulas to be Ξ (CW) ≈ 3.13, Ξ (m) ≈ 1.22 and Ξ (cf) ≈ 1.36. Consequently, the matching and closed-form multi-product formulas would allow for four to five repetitions before they require the same amount of circuit evaluations as Childs and Wiebe's formula. In the black box optimization, we have used the error bound as the objective function with the modification of using (Ξ (m) ) 20 and (Ξ (cf) ) 10 , instead of ζχ,R respectively, to ensure reasonable resolution factors.
factor. This is equivalent to Ξ = 1 with consequences for sampling complexity and error bounds. This advantage is, however, quickly outweighed as Childs and Wiebe's multi-product formula delivers an improved approximation that is exact up to a leading order of 2(χ + R) − 1 in t. Modifying the leading power of t is now possible by adding exponentially fewer terms. Here, we consider only the simplest form of Childs and Wiebe's multi-product formula with q = q since their improvements, e.g. those proposed in Ref. [14], focus on increasing the success probability of their deterministic implementation without improving their error scaling and lead to significantly worse resolution factors. Furthermore, note that they require the same circuit depth for R = K + 1 in the randomized sampling framework due to the final term in Eq. (13). With matching and closed-form multi-product formulas, the approximation can be further improved to leading order 2χR + 1 in t. This means that only one additional S 2χ (·) block improves the order by 2χ rather than 2 as for multi-product formulas of the prior art. While the scaling in t of the novel multi-product formulas is superior to those of Childs and Wiebe, their theoretical error bound is not as tight, leading to a crossover of error bounds at |h k |t < 1. While it is true that the error bounds cross over at comparably small errors, it is important to note that the significantly reduced resolution factors allow for multiple repetitions, i.e., r evolutions of shorter times t/r, until the same number of circuit evaluations is required as for Childs and Wiebe's formulas.
We have plotted the error bounds of all formulas for fixed circuit depth between all methods for one particular optimization of the multi-product formula parameters in Fig. 5. Optimizing the set of parameters {b (r) }, we tend to achieve noticeably lower resolution factors than obtained using Childs and Wiebe's multi-product formula. We generally find the matching multi-product formula to have a smaller resolution factor than the closed-form multi-product formula. While the matching multi-product formula has a better resolution, it requires an additional layer of classical optimization.
It is important to note that since we did not find useful bounds relating b to any of the resolution factors, further improvements of the bounds in Fig. 5 seem possible. In the absence of relations Ξ(b), it is necessary to numerically optimize all b (r) for a chosen loss function, which is not the case for Childs and Wiebe's multi-product formula due to existing, analytical relations. We found that global optimization using basin hopping combined with Nelder-Mead optimization yields the best results since the optimization landscape exhibits a large number of local minima. The optimization is also sensitive to the initial guess for those parameters, with an equal spread of positive and negative integers, i.e., b init = (1, −1, 2, −2, 3, −3, . . . , χR + 1), leading to better results. It is also fruitful to vary the loss function used for optimization. On the one hand, using the error bound for a fixed τ yields the lowest error, with the downside that the corresponding resolution factors are too large to be practical. On the other hand, solely optimizing for the resolution factor might result in resolution factors arbitrarily close to one with the downside of significantly worse error bounds. We, therefore, found the most fruitful loss function to be the error bound with the modification of using the resolution factor Ξ directly instead of γ and amplifying its impact on the loss function by using Ξ p for some power p, which allows us to balance both quantities. Nevertheless, the use of different loss functions could be explored further. We have also found that bounding each parameter b by the total maximum leads to more robust results.
To corroborate the functioning of the newly proposed multi-product formulas beyond theoretical bounds, we also compare the actual performance with that of the conventional multi-product formula and Trotter-Suzuki. Here, we compare the actual operator distance to the ideal time evolution for the following five physically plausible and meaningful Hamiltonians, with the results shown in Fig. 6: First, we consider a standard Heisenberg Hamiltonian with periodic boundary conditions, described by This is a Hamiltonian that plays an important role in condensed matter physics, as a prototypical model capturing ferromagnetism. Trotter-Suzuki and Childs-Wiebe formulas perform extremely well, as the Hamiltonian comprises many commuting terms. Since their performance guarantees rely on nested commutators [11], this behavior is to be expected and demonstrates the superior performance of these well studied formulas for lattice Hamiltonians, which is not reached by the newly proposed multi-product formulas. However, they are less optimal for Hamiltonians with fewer commuting terms. Motivated by these findings, we now turn to investigate a Hamiltonian comprising mutually anti-commuting terms, defined as Anti-commuting terms play an important role when using quantum simulation algorithms to investigate fermionic quantum models. Here, we find a notable and significant advantage of the newly proposed multiproduct formulas over Trotter-Suzuki and Childs-Wiebe already for quite large parameters of τ = |h k |t. Consequently, we expect our formulas to work well and exceed previous methods for fermionic systems, which, once transformed into qubit Hamiltonians using the Jordan-Wigner transformation, will have fewer commuting terms than standard lattice spin Hamiltonians.
This behavior is confirmed and further corroborated by our results of the Sachdev-Ye-Kitaev (SYK) model as defined in Ref. [38], whose Hamiltonian is given by where N is the number of Majorana fermion mode operators γ p and the J p,q,r,s are real-valued scalars drawn randomly from a normal distribution with variance σ 2 = 3!/N 3 . This is an intricate local, but not geometrically local, model that is believed to provide insights into instances of strongly correlated quantum materials. It is used in the study of scrambling dynamics and has a close relation with discrete models that capture aspects of holography in the black hole context. Again invoking the Jordan-Wigner transformation, N Majorana fermion mode operators can be mapped onto N/2 qubits. For our simulations, we thus chose N = 10 and N = 14, leading to a five and seven-qubit model, respectively. Here, we again find a notable and in instances substantial advantage for large τ and even for large times t. This is a physically highly plausible and interesting model for which our new simulation methods fare well.
As a further fermionic system, we look at the spinful Hubbard model on two by two sites, defined in Ref. [39] as  with spin σ, tunneling amplitude t = 2, Coulomb potential U = 2, magnetic field h = 0.5 and chemical potential µ = 0.25. Also, a and a † represent fermionic annihilation and creation operators. These are comparably small system sizes, but already show the substantial potential of the proposed simulation method. Finally, as a last family of examples, in order to gauge the performance of our methods for larger system sizes, we investigate a system of 200 non-interacting ("free") fermions with nearest neighbor interactions and periodic boundary conditions with h i,j = 1 if the respective fermions are nearest neighbors and h i,j = 0 otherwise. Note that for gauging the performance for free fermions, we do not compare the time evolution operator U (t) = e −iHt to its approximation, but the Greens function propagator G(t) = e −iht to its approximation. All comparisons are done in the randomized sampling framework for a Trotter-Suzuki order of 2χ = 4, the number of repetitions R = 3 and corresponding parameters for all other algorithms, ensuring an equal depth measured in the number of the required oracle calls. Although the straightforward comparison of their bounds in Fig. 5 suggests an advantage of our multi-product formulas at about τ = 0.1, we find that this is the case for much larger τ already on actual systems as shown in Fig. 6. Note also that since the Hamiltonians we consider are not necessarily geometrically local, known classical simulation techniques will be heavily challenged even for comparably short simulation times While the performance of the simple multi-product formula and Trotter-Suzuki is also much better than their bounds indicate, we find that the presented bounds for our proposed multi-product formulas are comparably looser. It is also worth noting that the performance of the matching version is slightly better than that of the closed-form version, although it comes with the penalty of an additional, classical optimization loop. We also find that the advantages of our multi-product formulas are visible even for τ > 1; in the case of the SYK model, this holds even for actual simulation times t > 1. Additionally, we find that this advantage can also be maintained for larger system sizes, as indicated by their performance on the model of free fermions. The presented numerical studies, therefore, provide strong arguments for the functioning of our proposed multi-product formulas and their advantage in the presented regimes. It is also important to note that the corresponding resolution factors required for the randomized sampling scheme of Ξ (cf) ≈ 1.36 and Ξ (m) ≈ 1.22 are significantly better than the Ξ (CW) ≈ 3.13 of the multiproduct formula proposed by Childs and Wiebe in Ref. [13] and would thus allow for at least four repetitions before they require the same overhead in circuit evaluations.

Discussion and conclusion
In this work, we have brought together two main ingredients of methods of quantum simulation. The results are notably more resource-efficient ways of performing short-time Hamiltonian simulation. These are on the one hand higher-order multi-product formulas [13,14], on the other an element which has long been underappreciated but recently been of high interest: the element of randomness [20][21][22][23][24]24]. Overcoming the prejudice that the time evolution has to be completed in each run of a compilation, we have introduced a novel framework for implementing multi-product formulas, to estimate expectation values of time-evolved observables.
Concretely, we have proposed a randomized sampling approach that focuses on the time evolution of the observable, not the state. When implementing multi-product formulas in a randomized fashion rather than via block encodings in the LCU framework, we can circumvent the need for additional amplitude amplification or post-selection. The results presented here have been obtained by only requiring access to a quantum-oracle machine that implements single-qubit state preparation, controlled time evolution, and quantum measurements. They are thus especially relevant in the regime of early quantum computers in which NISQ algorithms reach their limits but where full-fledged, digital, long-time evolution algorithms on fault-tolerant quantum computers are not yet available. Consequently, this work may be seen as targeting a regime in between the digital and analog setting, where we have some form of parametric control over a simulator system allowing us to compile the target time evolution with sequences of the simulator's time evolutions [40,41]. This programmable regime then constitutes a departure from the analog setting with relatively little control over the simulator and is not as strict as digital simulation, where the control over the quantum system is strong enough to fashion its interactions into quantum gates.
Within this randomized sampling framework, we have proposed two new multi-product formulas. These schemes have been equipped with full rigorous performance guarantees. Furthermore, we have included a detailed estimation of the number of circuit evaluations that are required, a vital metric for randomized approaches. Comparing the error bounds of these newly introduced algorithms with Trotter-Suzuki product formulas and Childs and Wiebe's multi-product formula, we find that they outperform the latter for a fixed circuit depth in a practically relevant regime.
The use of multiple repetitions of short time evolutions by t/r instead of a single, long time evolution by t, elementary to achieving long simulation times, comes with a penalty in the number of circuit evaluations which scales exponentially in r. However, the base of this exponential, the resolution factor in our case, is only slightly larger than one and could be optimized even further. Even for our toy examples, it was in the range of 1.2 − 1.4 for the newly proposed multi-product formulas as compared to 3.13 for Childs and Wiebe type multi-product formulas. Thus, at least a few repetitions are still in reach, rendering these results especially relevant for early quantum computers since we do not solely consider geometrically local Hamiltonians, for which even comparably short simulation times are a highly difficult task for known classical simulation techniques.
Benchmarking them on five different Hamiltonians, all of which are physically well motivated and each interesting in its own right, we have found that this advantage can be expected already at comparably large simulation times. While lattice Hamiltonians with many commuting terms are most likely best approximated using Trotter-Suzuki or Childs-Wiebe formulas, the newly proposed multi-product formulas show a clear advantage for fermionic Hamiltonians and those with a small number of commuting terms. This insight points to the direction that there might not be a universally optimal quantum simulation algorithm for digital quantum simulation. Instead, some algorithms could be better suited to capture the specifics of a given local Hamiltonian model. The downside of the proposed methods is that more measurements are required to reach the desired precision through the resolution factor. This resolution factor can be optimized using a classical black-box optimization, which is de facto a requirement for the functioning of the proposed multi-product formulas.
The present work is essentially bridging the gap between analog and (perhaps error-corrected), fully digital quantum technology: Not only do we expect there to be other randomized sampling schemes in digital quantum simulation, but once one can replace the element of randomness with block encodings, one can switch from these expectation value based algorithm to algorithms based on quantum phase estimation. Overall, the method introduced gives rise to a less resource-demanding way of performing Hamiltonian simulation, while also remaining conceptually and technologically simpler than for instance qubitization [18], bringing such ideas to an extent closer to the resources available in early quantum computers.
Looking ahead, it remains an open problem to relate the parameters b in Definition 1 to the resolution factor in a way that would eliminate the need for black-box optimization. So far, we can only connect the two quantities analytically, and that involves the complicated product with the Vandermonde matrix (27). Alternatively, one could improve the optimization rather than replace it. We have used only simple optimizers and loss functions, and expect possible improvements for more involved loss functions and optimization algorithms. Furthermore, the presented constructions are just two of the plethora of new multi-product formulas that could be constructed with Definition 1 and might exhibit better error bounds and resolution.

Acknowledgments
This work has been supported by the DFG (CRC 183 project B01 and A03, EI 519/21-1). This work has also received funding from the European Unions Horizon 2020 research and innovation program under grant agreement No. 817482 (PASQuanS), specifically dedicated to programmable quantum simulators. It has also been supported by the BMWK (PlanQK and EniQmA), the BMBF (DAQC on notions of digital-analog quantum simulation and FermiQP on fermionic quantum processors), the Munich Quantum Valley (K8), and the Einstein Foundation (Einstein Research Unit on Quantum Devices) . M. K. acknowledges funding from ARC Centre of Excellence for Quantum Computation and Communication Technology (CQC2T), project number CE170100012. The authors endorse Scientific CO 2 nduct [42] and provide a CO 2 emission table in the appendix.