Fault-tolerant quantum computation of molecular observables

Over the past three decades significant reductions have been made to the cost of estimating ground-state energies of molecular Hamiltonians with quantum computers. However, comparatively little attention has been paid to estimating the expectation values of other observables with respect to said ground states, which is important for many industrial applications. In this work we present a novel expectation value estimation (EVE) quantum algorithm which can be applied to estimate the expectation values of arbitrary observables with respect to any of the system's eigenstates. In particular, we consider two variants of EVE: std-EVE, based on standard quantum phase estimation, and QSP-EVE, which utilizes quantum signal processing (QSP) techniques. We provide rigorous error analysis for both variants and minimize the number of individual phase factors for QSP-EVE. These error analyses enable us to produce constant-factor quantum resource estimates for both std-EVE and QSP-EVE across a variety of molecular systems and observables. For the systems considered, we show that QSP-EVE reduces (Toffoli) gate counts by up to three orders of magnitude and reduces qubit width by up to 25% compared to std-EVE. While estimated resource counts remain far too high for the first generations of fault-tolerant quantum computers, our estimates mark a first of their kind for both the application of expectation value estimation and modern QSP-based techniques.


Introduction
Calculating molecular energies is the most popular potential application of fault-tolerant quantum computers in quantum chemistry. Several studies have estimated and optimized the computational resources required to compute the ground state energy of classically challenging systems, reducing the required number of non-Clifford gates (e.g. Toffoli gates) from ∼10^16 to ∼10^10 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. However, calculating ground state energies alone will have limited practical applicability. For example, in drug design, the calculation of molecular forces with respect to the ground state is required to simulate molecular dynamics [11,12,13], and electric multipole moments can be used to determine the permeability of drugs [14].
Calculating the expectation values of general observables on a quantum computer requires more complex circuits than those for energy calculations [15,16]. Energies are eigenvalues of the Hamiltonian and can be evaluated with quantum phase estimation (QPE) directly [17,18]. For operators which …
FIG. 1. Overview of the quantum algorithm for expectation value estimation. (a) The first of two registers is initialized as a window function using the routine init. In the standard formulation of quantum phase estimation, init would just apply Hadamard gates to every qubit, but more refined circuits can be used to achieve optimal phase readouts [31,32]. In the second register, we prepare a target state with an Ansatz state preparation (ASP) routine, for instance a preparation of a Slater determinant state, followed by qubitization-based quantum phase estimation with which we postselect on the right energy. Both registers are fed into an outer quantum phase estimation (oQPE) routine and an expression for the expectation value is obtained by measuring all qubits of the upper register in the computational basis.
The general structure of the expectation value estimation algorithm is presented in Fig. 1. At the highest hierarchical level, we divide a number of qubits into two quantum registers. State preparation initializes the first register in a QPE filter or window function with the routine init, while an ansatz state preparation (ASP) initializes a subset of the lower register in one of the qubitization eigenstates associated with the Hamiltonian ground state |ψ_G⟩. There is no need for the preparation to be perfect, but the squared overlap with the target state is a factor in the success probability of the entire algorithm. Note that in contrast to some prior art [33,23], this state preparation is not part of the main routine and thus appears only O(1) times: we could therefore allow ASP to contain more costly state preparation routines such as in Refs. [18,20,21]. However, since the success probability of the ASP is decoupled from the success probability of the remaining algorithm, we can use (traditional) qubitized phase estimation on an initially-prepared state (using e.g. a Hartree-Fock state or a sum of Slater determinants) to yield the ground state on one register, and an estimate of the ground state energy which we can read out and store classically. The precision required for this estimate can be the same as the precision of the ground state reflections in later steps: it scales linearly with the energy gap ∆ between the ground state and the first excited state.
These prepared states are then fed into an outer quantum phase estimation (oQPE), which entangles eigenstates of the oracle U on the lower register with expressions for the eigenphase θ of their corresponding eigenvalues e^{i2πθ} on the registers above, see Fig. 1(b). By measurement of the upper register at the end of the circuit in Fig. 1(a) (and up to discretization errors), we project the lower register into an eigenstate of U and learn the eigenphase from the configuration of the upper register, which encodes θ as a binary fixed-point number in its computational basis. With high probability, we have projected into an eigenstate whose eigenphase θ = θ_± contains information on the desired expectation value [15]: … Fig. 1(c), therefore does not require the additional calculations as necessary in the overlap estimation routine [15]. U is divided into two subcircuits. The subcircuit in Fig. 1(e) is a block encoding of the operator F, and the subcircuit (d) uses block encodings of the Hamiltonian Ĥ inside an inner QPE (iQPE), as well as information about the ground state energy G and the spectral gap ∆ of Ĥ. The 1-norms of Ĥ and F after factorization are λ_H and λ_F, respectively. We can estimate the complexity of the oQPE by counting the repetitions of its main element, the block encoding of the Hamiltonian Ĥ.
As iQPE needs to be precise enough to distinguish the eigenphase that qubitization makes with respect to the Hamiltonian ground state |ψ_G⟩ from the qubitization eigenphase of the first excited state, it makes at least O(λ_H/∆) queries to the block encoding of Ĥ (following arguments in Ref. [34]). However, due to QPE discretization errors, the accuracy of iQPE has to be improved to satisfy the target precision of the observable. To achieve a target error ε̃ in the inner phase estimation, it is necessary to increase its complexity by a factor of κ(ε̃). Note that the ASP is not affected by this bit discretization error as much as the iQPE, as the ASP only needs to prepare |ψ_G⟩ with a fidelity close to 1, but not necessarily as close as 1 − ε̃. In later sections, we will introduce two versions of the expectation value estimation algorithm that allow for κ(ε̃) = 1/ε̃ and κ(ε̃) = log(1/ε̃), respectively. The target error of the observable also shapes the number of qubits in the oQPE. To estimate the expectation value up to an error ε, we have to repeat the iterate U a total of O(λ_F/ε) times. Setting ε̃ = O(ε/λ_F) also for the target precision of iQPE, we find that oQPE has a query complexity of … Considering the use of asymptotically optimal algorithms for the ASP [21], we present the complexity of the algorithm in Table 1, where γ is the overlap amplitude of the initial state feeding into the ASP before oQPE. The table shows a two-fold advantage of the expectation value estimation over sampling and related approaches: a quadratic improvement of the sampling complexity and an additive, rather than a multiplicative, dependence on the overlap amplitude γ between the Hamiltonian ground state and its initial approximation. The upper register in Fig. 1(a) features O(log(λ_F/ε)) qubits. The lower register is split into four sub-registers phase, enc[H], enc[F] and sim that we describe in Fig. 1(f). When necessary for clarity, we will decorate states with the registers they are supported on, e.g. |ψ_G⟩_sim.
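The scaling argument of this paragraph can be tabulated in a few lines of Python. This is only a rough sketch: every constant factor hidden in the O(·) notation is set to 1, the function names are hypothetical, and the numerical inputs are illustrative rather than taken from the resource tables.

```python
import math

def oqpe_queries(lam_F, lam_H, gap, eps, kappa):
    """Order-of-magnitude count of Hamiltonian block-encoding queries.

    Assumptions (all O(.) constants set to 1):
      outer repetitions of U   ~ lam_F / eps
      queries per inner QPE    ~ (lam_H / gap) * kappa(eps_tilde)
      inner target error       eps_tilde = eps / lam_F
    """
    eps_tilde = eps / lam_F
    return (lam_F / eps) * (lam_H / gap) * kappa(eps_tilde)

def kappa_std(e):
    # overhead of the first variant: kappa = 1 / eps_tilde
    return 1.0 / e

def kappa_qsp(e):
    # overhead of the second variant: kappa = log(1 / eps_tilde)
    return math.log(1.0 / e)

# Illustrative inputs only (hypothetical, not values from the tables).
params = dict(lam_F=10.0, lam_H=100.0, gap=0.1, eps=1e-3)
q_std = oqpe_queries(**params, kappa=kappa_std)
q_qsp = oqpe_queries(**params, kappa=kappa_qsp)
print(f"kappa = 1/eps_tilde      : {q_std:.2e} queries")
print(f"kappa = log(1/eps_tilde) : {q_qsp:.2e} queries")
```

With these placeholder inputs the two counts differ only through the factor κ(ε̃), which is the single point at which the two variants enter the estimate.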
Table 1: Complexity of sampling (ideal and early fault-tolerant), the original overlap estimation algorithm, together with the improved versions (including QSP-EVE), where we include the complexity of state preparation routines. Here, ε is the target accuracy of the observable value, ∆ is the spectral gap of the Hamiltonian, λ_H and λ_F are norms of the Hamiltonian and observable, respectively, and γ is the overlap amplitude between the Hamiltonian ground state |ψ_G⟩ and its initial approximation. For instance, if we initialize the system in a Hartree-Fock state |HF⟩, we find γ = |⟨ψ_G|HF⟩|. The state preparation itself can be a state-of-the-art optimal routine such as that of Ref. [21]. The original overlap estimation algorithm assumes that a reflection about the state |ψ_G⟩ is implemented constructively, meaning one has to resort to the aforementioned Ansatz state preparation. The dots in the second line indicate that no logarithmic dependence has been given in the source material.

Algorithm details
We show that the algorithm of Fig. 1(a) computes the desired expectation value within the chosen error. The Hamiltonian Ĥ and the operator F have the spectral decompositions
$$\hat{H} = \sum_{E} E \, |\psi_E\rangle\langle\psi_E| , \tag{4}$$
$$F = \sum_{\eta} \eta \, |\phi_\eta\rangle\langle\phi_\eta| , \tag{5}$$
where |ψ_E⟩ and |ϕ_η⟩ are eigenstates corresponding to the energies E and eigenvalues η of Ĥ and F, respectively, with both bounded between −1 and +1. Knowing the complete spectrum of Ĥ and F, i.e. the exact values of E and η in the sums of Eq. (4) and Eq. (5), is not required. We only require knowledge of the ground state energy E = G and a lower bound on the spectral gap ∆.
Since the expectation value estimation algorithm is a form of phase estimation, we can verify its function by showing that the iterate in Fig. 1(c) has the eigenphases corresponding to Eq. (2). The iterate U consists of two subcircuits: each subcircuit (Fig. 1(d) and (e)) is a reflection, i.e. a self-inverse unitary operator. We verify the eigenphases of U in three steps: first, we consider the eigenphases of iterates consisting of any two reflections in Section 3.1. Second, we detail the first reflection of Fig. 1(d) and its use of quantum phase estimation in Section 3.2. Finally, we identify the first and second reflection with the general reflections of Section 3.1 in Section 3.3, showing that θ_± are among the eigenphases of U.

General iterates of two reflections
We define an iterate as the product of two reflection operators R_π and R_τ, which are given through their respective projectors π and τ: … The projectors are not necessarily known; we only need to know how to construct the resulting reflections. Furthermore, the projectors can be of arbitrary rank. We are interested in the singular values of their product: … The left singular vectors |t_k⟩ and right singular vectors |p_k⟩ of the same singular value w_k > 0 can now be used to construct two eigenstates |v_{k+}⟩ and |v_{k−}⟩ of the iterate R_τ R_π. The technical details are in Appendix A. The eigenstates of the iterate are … where … The corresponding eigenvalues of |v_{k±}⟩ are expressed in terms of w_k as … Using the iterate U = R_τ R_π within phase estimation would thus estimate the eigenphases ±2 arccos w_k. The expectation value estimation ensures that at least one of the singular values w_k is a function of the expectation value ⟨ψ_G|F|ψ_G⟩. Therefore, by estimating the singular values w_k, we can obtain the expectation value.
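The relation between the singular values of the projector product and the eigenphases of the iterate can be checked numerically. The sketch below draws two random projectors, assumes the sign convention R = 1 − 2·(projector) for both reflections (the convention of the lost display equations may differ by an overall sign), and compares the eigenphases of R_τ R_π with ±2 arccos w_k. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_projector(dim, rank):
    """Orthogonal projector onto a random rank-dimensional subspace."""
    mat = rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank))
    q, _ = np.linalg.qr(mat)
    return q @ q.conj().T

dim = 8
proj_pi = random_projector(dim, 2)
proj_tau = random_projector(dim, 3)

# Assumed convention: a reflection is R = 1 - 2 * projector.
R_pi = np.eye(dim) - 2 * proj_pi
R_tau = np.eye(dim) - 2 * proj_tau

# Nonzero singular values w_k of the product of the two projectors.
w = np.linalg.svd(proj_tau @ proj_pi, compute_uv=False)
w = w[w > 1e-9]

# Eigenphases of the iterate R_tau R_pi.
phases = np.angle(np.linalg.eigvals(R_tau @ R_pi))

# Every predicted phase +/- 2 arccos(w_k) should appear among them (mod 2 pi).
predicted = np.concatenate([2 * np.arccos(w), -2 * np.arccos(w)])
mismatch = [np.min(np.abs(np.exp(1j * phases) - np.exp(1j * p))) for p in predicted]
print("singular values:", w)
print("largest mismatch:", max(mismatch))
```

The remaining eigenvalues of the iterate are ±1 and belong to directions on which at least one of the projectors acts trivially.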

Reflections with quantum phase estimation inside
The subcircuit of Fig.
Eqs. (10)-(13) of Ref. [34] are a good reference for the above. With iQPE operating on the phase register, it would ideally output computational basis states |Θ_{E,±}⟩ relating to integers …, where n is the number of qubits in the phase register: … We use this property to implement reflections on certain qubitization eigenstates |Q_{E,σ}⟩ indirectly by dressing a reflection on the phase register with iQPE circuits. To that end, we introduce Refl, an arithmetic reflection that tags a set M of integers m ∈ [0, 2^n − 1] in the computational basis of the phase register by means of Toffoli gates and data-loaders, such that … where |m⟩ is a computational basis state encoding the integer m just like |Θ_{E,±}⟩ encodes Θ_{E,±} (see the circuit in Fig. 2). Knowing the integers Θ_{E,σ} for energies E and signs σ = ± allows us to implement a reflection on the corresponding qubitization eigenstates, … Combining Eq. (15) with Eq. (14) we see that ϱ(E, σ) is a projector onto the all-zero state only in certain cases: … We will use that property in the next section, where M will only contain expressions for the ground state energy G. This approach is less costly than implementing 1 − 2|ψ_G⟩⟨ψ_G| starting from an initial reflection 1 − 2|HF⟩⟨HF| by acting with ASP routines on the Hartree-Fock state |HF⟩. We will also see that this approach uses fewer individual QSP phase factors than when implementing 1 − 2|ψ_G⟩⟨ψ_G| directly as in Ref. [11].

Expectation value estimation
We now formulate the expectation value estimation routine as an instance of the singular value estimation of Section 3.1. In Fig. 1(c) the subcircuit in panel (d) shall now correspond to the reflection R_π and the subcircuit in panel (e) shall correspond to R_τ in Eq. (6). The latter reflection is a block encoding B[F] of the observable F on the sim and enc[F] registers. Being self-inverse, the block encoding B[F] is a reflection on the subspace spanned by the states |ω_η⟩, associated with the eigenvalues η, such that … From the fact that ⟨0|B[F]|0⟩_enc[F] = F, we find that |ω_η⟩ has the form … with … where |ϕ_η⟩ are the eigenstates of the observable, see Eq. (5), and |ϕ_η; 0^⊥⟩ is the component orthogonal to |ϕ_η; 0⟩ within |ω_η⟩. Since the projection of |ϕ_η; 0^⊥⟩ onto the all-zero state of the enc[F] register vanishes, we find that … Keeping in mind that B[F] and |ω_η⟩ are supported on the sim and enc[F] registers, we find that the reflection is … Therefore, its projector is … Using Eq. (15), the projector of R_π is … Going from Eq. (23) to Eq. (24), the projector |0⟩⟨0| has been factored out of the tensor product and is subsequently expanded with 1 = Σ_η |ϕ_η⟩⟨ϕ_η| on the sim register. Using Eqs. (19), (12), and (20), the product of the two projectors is … We can obtain the square of the singular values by solving the eigenvalue problem of … where we have used the spectral decomposition of the observable from Eq. (5). When M in Eq. (15) only contains Θ_{G,s} for a fixed σ = s (determined during the ASP), then, by Eq. (16), we have ϱ(G, s) … One solution to Eq. (26) is therefore … with its eigenvalue equal to … Using Eq. (11) we find that a phase estimation with the iterate R_τ R_π allows us to estimate the phase angles of Eq. (2) when the input is the state of Eq. (28).
There are other variations of this iterate. Here we use Eq. (15) to replace a reflection 1 − 2|ψ_G⟩⟨ψ_G| on the sim register. Such a reflection could be implemented with QSP/QSVT techniques directly, resulting in a circuit depicted in Fig. 3(a). However, this would require us to solve a large-scale optimization problem to obtain the phase factors, which we want to avoid. One could also conceive a version of the circuit where iQPE is used within R_π, but with a Hamiltonian simulation algorithm different from qubitization as in Fig. 3(b). While qubitization has an asymptotically optimal scaling for normalized Hamiltonians, there are instances for which other methods are more efficient [37]. A last variation of the iterate is depicted in Fig. 1(c), but the set M in Refl is made to include Θ_{G,+} and Θ_{G,−} at the same time. All three variations of the iterate would allow us to estimate the angles 2πθ_± = ± arccos⟨ψ_G|F|ψ_G⟩, which is twice as much signal as in Eq. (2). So what would disqualify the third variation when it offers a better sensitivity? As shown in the next section, we have to consider the discretization error of the iQPE routines as a limiting factor for the accuracy of the estimated expectation value. We have deliberately chosen to dismiss the third variation of the iterate, as we would, in this case, have to contend with ϱ(G, +) ≠ ϱ(G, −), introducing an additional source of error on top of the errors already considered.
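The angles ± arccos⟨ψ_G|F|ψ_G⟩ quoted for the direct-reflection variation can be reproduced with small dense matrices. The following sketch makes several assumptions that are not part of the text: it uses one particular self-inverse block encoding of a random Hermitian F, a random state as a stand-in for |ψ_G⟩, and the sign convention 2|ψ;0⟩⟨ψ;0| − 1 for the state reflection.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4

# Random Hermitian observable F, rescaled so its spectral norm is below 1.
a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
F = (a + a.conj().T) / 2
F /= 1.25 * np.linalg.norm(F, 2)

# sqrt(1 - F^2), built from the eigendecomposition of F.
eta, phi = np.linalg.eigh(F)
sqrt_term = phi @ np.diag(np.sqrt(1 - eta**2)) @ phi.conj().T

# Self-inverse block encoding: the block with the ancilla in |0> equals F.
B = np.block([[F, sqrt_term], [sqrt_term, -F]])

# Stand-in for |psi_G>: any normalized state on the system register.
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)
psi0 = np.concatenate([psi, np.zeros(dim)])  # ancilla in |0>, system in |psi>

# Assumed sign convention for the state reflection: 2|psi;0><psi;0| - 1.
R_state = 2 * np.outer(psi0, psi0.conj()) - np.eye(2 * dim)
U = B @ R_state

expectation = np.real(psi.conj() @ F @ psi)
phases = np.angle(np.linalg.eigvals(U))
closest = phases[np.argmin(np.abs(np.abs(phases) - np.arccos(expectation)))]
print("arccos<F>          :", np.arccos(expectation))
print("matching eigenphase:", abs(closest))
```

With the opposite sign convention for the state reflection, the eigenphases shift by π, so the reading of the expectation value changes sign accordingly.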

Errors and failure probability
Unfortunately, unlike the ideal formulation in Eq. (13), an actual implementation of a quantum phase estimation will create errors propagating through the entire circuit. Every quantum phase estimation suffers from errors due to the discretization of the eigenphase expressions: a quantum phase estimation circuit with n qubits outputs the eigenphases σ arccos E of the eigenstate |Q_{E,σ}⟩ without error as long as Θ_{E,σ} is an integer. Discrete eigenphases are generally unlikely, and for non-integer Θ_{E,σ} quantum phase estimation outputs a distribution p_n(•) of integer expressions k ∈ [0, 2^n − 1] for pseudo-eigenphases 2πk/2^n in n qubits: … where |k⟩ is the computational basis state encoding the integer k. In general, the pseudo-eigenphase distribution causes the projectors ϱ(E, σ) on the phase register to "smear out" such that we can no longer expect Eq. (16) to hold. Since Θ_{G,s} is not guaranteed to be an integer, we have to choose an integer number m such that m and m + 1 make up the set M in Eq. (14), in order to lower- and upper-bound the target eigenphase with m ≤ Θ_{G,s} ≤ m + 1. Here, M has only two elements, but it could generally include any r configurations, and in consequence, the projectors ϱ(E, σ) would have rank r. As pseudo-eigenphases of all |Q_{E,σ}⟩ could overlap with configurations in M, the states that we utilize to estimate ⟨ψ_G|F|ψ_G⟩ can be contaminated with excited states of Ĥ. While it is challenging to find expressions for the errors in estimating the expectation value without knowledge of the complete spectrum of Ĥ, we can estimate the largest error contribution caused by the contamination of the ground state |ψ_G⟩ with the first excited state |ψ_E⟩ of energy E.
Let us quantify this contamination: we can write the singular value decomposition of the product as … where Ω_j are the singular values, and |G^s_j⟩, |E^s_j⟩ are left and right singular vectors, respectively. Contamination of |Q_{G,s}⟩ with |Q_{E,s}⟩ happens when there are (non-zero) singular values Ω_j > 0. Without knowing the spectrum of F, we can bound the difference between F_est, the estimated value of F with respect to the contaminated state, and the actual expectation value as … The contamination additionally causes the expectation value algorithm to fail with a non-zero probability. This is because the state in Eq. (28) does not completely overlap with the solutions that allow us to estimate the expectation value up to the error in Eq. (31). We find that having prepared the state in Eq. (28), the success probability of the expectation value estimation algorithm is between ⟨0|ϱ(G, s)|0⟩/2 and ⟨0|ϱ(G, s)|0⟩. The success probability depends on the observable and approaches its maximum value fast if … The proof for the observable error and the success probability can be found in Appendix B.
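The smearing of the phase register can be reproduced numerically for textbook QPE. The sketch below assumes the rectangular window (Hadamard gates on every phase qubit) and an exact inverse quantum Fourier transform, evaluates the outcome distribution for a hypothetical non-integer Θ, and sums the weight that lands on M = {m, m + 1}. The function name and all numbers are illustrative; this is not the classical routine of Appendix C.

```python
import numpy as np

def qpe_distribution(theta, n):
    """Outcome probabilities p_n(k) of textbook n-qubit QPE for eigenphase 2*pi*theta.

    Assumes the rectangular (all-Hadamard) window and an exact inverse QFT.
    """
    size = 2**n
    j = np.arange(size)
    k = np.arange(size)
    # amplitude(k) = (1 / 2^n) * sum_j exp(2*pi*i * j * (theta - k / 2^n))
    amps = np.exp(2j * np.pi * np.outer(theta - k / size, j)).sum(axis=1) / size
    return np.abs(amps) ** 2

n = 6
big_theta = 20.5                   # hypothetical non-integer Theta (worst case)
p = qpe_distribution(big_theta / 2**n, n)

m = int(np.floor(big_theta))
mass_in_M = p[m] + p[m + 1]        # weight captured by M = {m, m + 1}
print("total probability :", p.sum())
print("weight on {m, m+1}:", mass_in_M)
```

For a half-integer Θ the two neighbouring integers together capture slightly more than 8/π² ≈ 0.81 of the weight, and the remainder leaks onto other configurations of the phase register.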
The error and failure probability of the expectation value algorithm can be decreased by improving the accuracy of the iQPE routines, either by increasing the number of qubits in the phase register or by using QSP routines. The two approaches span different versions of the same algorithm, which we shall call std-EVE (standard expectation value estimation) and QSP-EVE (QSP expectation value estimation). Both versions are discussed in the following:
Standard version (std-EVE) - The number of phase qubits must initially be chosen according to the norm and the spectral gap of Ĥ. However, one might choose to add more qubits to the phase register in order to deal with the discretization error. The idea is that the discretization error only really affects the less significant bits of the QPE, and adding qubits raises the significance of the bits distinguishing Θ_{G,s} from Θ_{E,s}. Concerning the success probability, it has been demonstrated in Appendix C of Ref. [38] how increasing the number of qubits increases the success probability of QPE. In the next section we will demonstrate how it decreases the expectation value error of our algorithm. To increase the success probability of std-EVE, we could add more elements to M, but this would typically increase the expectation value error again, which in turn calls for a further increase of the phase register. Adding a single qubit of precision to a phase estimation routine doubles its complexity, so the cost of the phase estimation increases exponentially with the number of additional qubits.
QSP-version (QSP-EVE) - The individual qubits in the phase register are labeled by phase[x] for x = 1 … n, where phase[1] is the most significant bit and phase[n] is the least significant bit in the phase estimation. The textbook version of quantum phase estimation used in std-EVE then consists of the sequence … where the routine V_x sets the value of qubit phase[x] by calling the controlled iterate U a total of 2^{x−1} times (or an equivalent double kickback series of two reflections). The routine V_x uses the feedback from the less significant phase qubits phase[y] for y > x, as depicted in Fig. 4. The reason for pseudo-eigenphases is that the routines V_x are not setting the qubit phase[x] to either |0⟩ or |1⟩ but to a superposition of the two. Refs. [30,29] attempt to round the qubit phase[x] to the most likely computational basis state using QSVT techniques. This is also the idea behind the version of iQPE used in the QSP-EVE algorithm, but there we specifically use QSP techniques rather than the more general QSVT. In this QSP version of QPE, the rounding is achieved with a symmetric QSP [40] circuit, depicted in Fig. 5. Using d_x queries to V_x and d_x + 1 Z-rotations on qubit phase[x], we define a sequence … The angles φ_{x,k} are carefully chosen such that a similarly-structured sequence, where the V_x routines are replaced with X-rotations about the angle ϑ, yields … with some function Q_x(z) and the degree-d_x polynomial function P_x(z) approximating a step function S(z) defined as … The deviation of P_x(z) from S(z) is largest in a κ_x-region around the discontinuities at z = ±1/√2.
The maximum value of |P_x(z) − S(z)| outside these regions is defined as ∆_x. This maximum difference can be decreased exponentially fast. The degree of the polynomial that matches the required characteristics can be found as d_x = O(κ_x^{−1} log(∆_x^{−1})) [24,28,41]. The iQPE routine in QSP-EVE is then defined as the sequential application of the W_x subroutines, starting with the subroutine setting the least significant bit: … Different approximations to S(z) will give different d_x. With any choices of P_x, we can compute the singular values Ω_j and expectation values ⟨0|ϱ(G, s)|0⟩ with an efficient classical routine defined in Appendix C, allowing for an analysis of errors and success probabilities.
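The trade-off between polynomial degree and plateau error can be illustrated with a simplified stand-in. The sketch below is not the QSP construction of the text: it assumes a single discontinuity at the origin instead of z = ±1/√2, smooths the step with an error function of hypothetical sharpness k, and obtains the polynomial by Chebyshev interpolation rather than by optimizing phase factors. It then measures the largest deviation from the ideal step outside a window of width κ = 0.25.

```python
import math
import numpy as np
from numpy.polynomial import Chebyshev

k = 20.0        # hypothetical sharpness of the erf-smoothed step
kappa = 0.25    # width of the excluded window around the discontinuity

def smoothed_step(x):
    # erf(k * x) tends to sign(x) away from x = 0
    return np.vectorize(math.erf)(k * x)

def max_plateau_error(degree):
    """Largest |P(x) - sign(x)| outside the window |x| < kappa / 2."""
    poly = Chebyshev.interpolate(smoothed_step, degree)
    x = np.linspace(-1.0, 1.0, 4001)
    outside = np.abs(x) >= kappa / 2
    return np.max(np.abs(poly(x[outside]) - np.sign(x[outside])))

errors = {d: max_plateau_error(d) for d in (32, 64, 128)}
for degree, err in errors.items():
    print(f"degree {degree:4d}: max plateau deviation {err:.2e}")
```

In this toy setting the deviation at high degree is limited by how closely erf(k·x) itself matches the step at the window edge, so sharper targets (larger k) need higher degrees before the plateau error saturates.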

Comparison of std-EVE and QSP-EVE
For a meaningful comparison between std-EVE and QSP-EVE, we will consider a baseline QPE querying the qubitization iterate with n_0 qubits, thereby querying the Hamiltonian block encoding O(2^{n_0}) times. Here, n_0 is the number of qubits high enough to separate the ground and first excited state eigenphases, ± arccos G and ± arccos E, by at least 2^{−n_0+1} π. This separation is chosen such that consecutive numbers {m, m + 1} ∈ M can be used to tag states in the Refl routine, where … for a given sign s ∈ {1, −1}. Counts for std-EVE and QSP-EVE are then expressed in units of the baseline QPE complexities, c_bQPE.
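The separation condition on n_0 translates directly into a small calculation. The sketch below is a minimal reading of that condition, assuming energies already rescaled to lie in [−1, 1]; the function name and the example energies are hypothetical.

```python
import math

def baseline_qubits(ground, excited):
    """Smallest n0 with |arccos(G) - arccos(E)| >= 2**(-n0 + 1) * pi.

    `ground` and `excited` are energies rescaled to lie in [-1, 1].
    """
    separation = abs(math.acos(ground) - math.acos(excited))
    return math.ceil(1 + math.log2(math.pi / separation))

# Illustrative rescaled energies (hypothetical, not from the text).
G, E = -0.50, -0.40
n0 = baseline_qubits(G, E)
separation = abs(math.acos(G) - math.acos(E))
print("separation of eigenphases:", separation)
print("baseline qubits n0       :", n0)
```

Because the rescaling divides the spectrum by the Hamiltonian norm, a smaller gap or a larger norm shrinks the separation and raises n_0 logarithmically.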
In std-EVE, we would add n_x additional qubits to separate the eigenphases of the Hamiltonian ground state from the eigenphase of the first excited state. Every qubit added to the iQPE circuit with respect to the baseline QPE roughly doubles its complexity, and so we find its complexity to be upper-bounded by 2^{n_x} c_bQPE.
In QSP-EVE, we would use a symmetric QSP sequence [40] to round every qubit to its closest value [29]. We choose the same target polynomial P(•) for all QSP routines W_x associated with the rounding of the qubit phase[x], i.e. we pick P_x(•) = P(•) for all x. The polynomial P(•) is a degree-d approximation of … where erf is the Gaussian error function. The variable k is inversely related to the κ_x-region (Eq. 73 in Ref. [28]) … where we have chosen κ_x = 0.25. Note that it is conceivable to relax the degree requirements for polynomials P_y(•) of more significant qubits phase[y], but this would, at most, improve the overall complexity by a factor of 2. With a uniform P(•), the complexity of the iQPE can be upper-bounded by 2d · c_bQPE. The factor of 2 is due to one extra qubit being added to iQPE in QSP-EVE, a modification we have found to be necessary even when the polynomial is the perfect step function, i.e. P(•) = S(•).
The achievable precisions for std-EVE and QSP-EVE in terms of the target error of the observable (relative to λ_F) are depicted in Fig. 6(a) as a function of their algorithmic complexities in units of multiples of the baseline QPE. The dashed curve of QSP-EVE in the graph denotes the expectation value errors with respect to the target polynomial P(•), while the data points denote expectation value errors obtained from QSP circuits using pre-computed phase factors with finite bit precision. The phase angles were obtained through optimization using the LBFGS solver in QSPPACK [42]. While the first expectation value errors match the theoretical predictions of the dashed curve, we see a clear difference after d = 128. We attribute this behavior to a suboptimality of the phase factors due to the increasing hardness of the underlying optimization problem for larger d. Infinite QSP, a different optimization method [43], which performs well for target functions such as sine and cosine, is ineffective here, as the Chebyshev coefficients of P(•) do not decay rapidly enough. For our results in Fig. 6(a), we have chosen a baseline QPE with n_0 = 20 qubits. The presented data is relatively robust against varying numbers of baseline qubits n_0, as long as n_0 is sufficiently large, since the discretization error only affects the first few insignificant qubits.
The data shows a rapid decay of the expectation value error with increasing complexity of QSP-EVE, a decay that turns out to be exponential. We attribute this behavior to our ability to exponentially suppress the deviation of P(•) from S(•) at their plateaus. At the same time, the expectation value error decays roughly linearly with the complexity of iQPE in std-EVE. It follows that the overall complexities of both algorithms, see Eq. (3), become … for std-EVE and QSP-EVE, respectively. It should be noted that for very large choices of target error (up until roughly 6 · 10^{−3}), std-EVE makes fewer queries to the baseline QPE than QSP-EVE. This target error regime is insufficient for our application, but in principle, if one could relax the target error sufficiently, the overhead introduced by the use of QSP makes QSP-EVE less competitive; this may be relevant in other applications where QPE is used on its own rather than as part of a larger routine such as in this algorithm, where the iQPE is used in a reflection about the ground state. Note also that the maximal success probability of std-EVE is always fixed to roughly 8/π^2 ≈ 81%, while it quickly approaches 1 in QSP-EVE as depicted in Fig. 6(b). Through the use of QSP-EVE we reduce the number of individual phase factors from ∆^{−1} log ∆^{−1} ∼ 10^5 in Ref. [33] to d ∼ 10^2.

Resource estimates for molecular systems
In the following, we calculate resource requirements for computing nuclear forces [11], electric dipole moments [44], and kinetic energies using std-EVE and QSP-EVE, relying on the theoretical curve for the latter. The choice of observables covers both one- and two-body operators and thus allows us to cost our algorithms for different tasks. We note that the forces and dipole moments are energy derivatives and could also be estimated using ground state energy calculations together with finite-difference formulas [11]. As molecular test systems, we use H_2, Be, a single water molecule, ammonia, and p-benzyne with molecular geometries defined in Tab. 5 in Appendix E. All calculations are performed in the cc-pVDZ basis set. The Hamiltonian matrix elements are obtained with the PySCF code [45,46]. For Be and H_2 we calculate the spectral gap using the FCI routine of PySCF [45,46], for ammonia and water we use the Block2 DMRG code [47] with a bond dimension M = 1000 on the full space, and for p-benzyne in a (30e, 30o) active space. The data is processed and visualized with an in-house software package.
The number of logical qubits required for the execution of each algorithm, as well as the required number of Toffoli gates, is then estimated by matching the precision O(λ_F/ε) with requirements for the complexity of iQPE in the curves of Fig. 6(a). A detailed description of the procedure can be found in Appendix D. We summarize the results of our calculations in the following paragraphs and report tables with detailed resource estimates in Appendix F.
Accurate nuclear forces are required for many applications, such as optimizing molecular geometries or simulating molecular dynamics [48]. For a molecule with N_a atoms, there are 3N_a nuclear forces. The i-th force component of the A-th nucleus can be represented by a two-body force operator … with the one- and two-body integrals f^(1)_pq and f^(2)_pqrs being defined as in Eq. (22) of Ref. [11]. There is no universally defined notion of chemical accuracy as target precision for the force operator. Instead we follow Ref. [11] and use ε_force = 5 mHa/Å as target precision. We calculate the resource requirements for all 3N_a operators and report the results in Tab. 2. We find that QSP-EVE requires ∼10^15 to ∼10^19 Toffoli gates and thousands of logical qubits to reach the target precision, compared to ∼10^17 to ∼10^23 Toffoli gates for std-EVE. For all force components, QSP-EVE significantly outperforms std-EVE in terms of gate counts and number of logical qubits, showing at most a ∼4875× reduction in gate count and a ∼26% reduction in qubit count (for p-benzyne).
As a second example, we calculate the resource requirements for estimating the expectation value of the dipole moment operator. The i-th component of the dipole moment can be represented as a one-body operator where the index i denotes the Cartesian coordinate. For typical applications, one aims to estimate the dipole moment up to a relative error of a few percent [44]. Assuming 1 Debye as the typical strength of the dipole moment and a relative error of 1% motivates our choice of $\varepsilon_{\rm dip.} = 10$ mDebye for this observable. We calculate the resource requirements for all components of the dipole operator. We note that for diatomic molecules, we only calculate the resource requirements for the non-vanishing dipole moment along the internal coordinate. The results for the dipole moment are shown in Tab. 3. Notably, we find that QSP-EVE provides up to a ∼2376× reduction in gate count and a ∼25% reduction in qubit count when compared to std-EVE (for p-benzyne).
As a second one-body operator, we study the kinetic energy operator of the electronic structure Hamiltonian. The expectation value of the kinetic energy operator can serve as a check of the correctness of the wave function through the virial theorem [49]. As the kinetic energy operator is an energy operator, we choose chemical accuracy as target precision, i.e., $\varepsilon_{\rm kin.} = 1.6$ mHa. The results for the kinetic energy are shown in Tab. 4. Similarly to the previous observables, we find that QSP-EVE provides up to a ∼4564× reduction in gate count and a ∼26% reduction in qubit count when compared to std-EVE (for p-benzyne). For all three observables, we find that QSP-EVE results in Toffoli gate counts around 2-4 orders of magnitude lower than those of std-EVE, as well as up to an approx. 25% reduction in qubit counts. In combination, these improvements allow QSP-EVE to reduce the circuit volume by up to five orders of magnitude for the systems studied.

Table 3: Quantum resources for computing dipole moments using EVE. For the water, ammonia, and p-benzyne systems, we present the number of spin orbitals $N$, the norm $\lambda_H$ of the Hamiltonian in Ha, the gap of the Hamiltonian $\Delta_H$ in Ha, and the norm of the dipole operator $\lambda_{D^{(i)}}$ in Debye, along with the resulting numbers of Toffoli gates and logical qubits for std-EVE and QSP-EVE instances estimating the expectation values of dipole operators along different axes. Note that the z component is omitted for water due to the system's geometrical symmetry. For complete datasets, including subroutine cost breakdowns, see Appendix F.

We summarize all our resource estimates in Fig. 7(a), displaying the logical-qubit and magic-state requirements. The complete data is also listed in Appendix F. We confirm that the resource requirements for QSP-EVE are lower than for std-EVE in all cases considered. The relative advantage of QSP-EVE over std-EVE in the number of Toffoli gates is separately depicted in Fig. 7(b) for the listed observables with respect to their relative 1-norms $\Lambda_F = \lambda_F/\varepsilon$. Eq. (41) indicates that the advantage should scale as $O(\Lambda_F/\log\Lambda_F)$. A fit through the data points roughly confirms this behavior.
An overview of the cost distribution of QSP-EVE can be found in Figure 8, where the gate counts for estimating the kinetic energy of a p-benzyne system are split into the different subroutines, down to the block encodings of $\hat{H}$ and $\hat{F}$. The majority of the cost (not just for QSP-EVE but also for std-EVE, as can be seen in Appendix D) comes from the block encodings of the Hamiltonian, B[H], not because a single call is expensive but due to the large number of calls throughout the algorithm. The cost for a repeated phase estimation within the Ansatz state preparation (the ASP routine in Fig. 1) is included in the resource counts as well as in the callgraph in Fig. 8. We assume a pessimistic overlap of 1% between the Hamiltonian ground state $|\psi_G\rangle$ and the Hartree-Fock state that serves as input on the sim register.

Conclusion
In this work, we have presented two quantum algorithms for expectation value estimation, std-EVE and QSP-EVE, and calculated their computational costs. Comparing the two algorithms, we have shown that by exploiting the latest developments in quantum signal processing (QSP) it is possible to improve the asymptotic quantum resource requirements. Additionally, we have provided explicit resource estimates for calculating the expectation values of molecular forces, dipole moments, and kinetic energies for different exemplary molecules, requiring between $\sim 10^{3}$ and $\sim 10^{4}$ logical qubits and between $\sim 10^{15}$ and $\sim 10^{19}$ Toffoli gates. Furthermore, we have presented a breakdown of the contributions of the algorithms' subroutines to the total computational cost in Figure 9. We found that the inner phase estimation within the reflections in the EVE algorithms represents the most significant source of computational cost, driven by the direct dependence on the Hamiltonian norm and the spectral gap.

Figure 8: Callgraph depicting the estimated quantum resources required for computing the kinetic energy of a p-benzyne molecule using QSP-EVE. The graph displays the distribution of the costs among various subroutines, named as in Fig. 1, and where B[H] is the block encoding of the Hamiltonian and Rτ is the block encoding of the observable (i.e. B[F]). All costs are given in terms of Toffoli gate counts, with each subroutine node depicting the per-call cost, the total number of calls and cost when taken over the full algorithm (i.e. over all parent calls). Note that some routines are deliberately omitted in the count, as they contribute very little. Edge numbers define the number of calls of the target routine within a single call of its parent routine. Darker shading indicates a greater total gate cost for the subroutine. The ASP comprises several low-precision phase estimation routines in the repeat-until-success circuit. The initial state overlap of the Hartree-Fock state with the Hamiltonian ground state is assumed to be 1% and is subsequently lifted to above 91% (via a simple repeat-until-success scheme). Details on the resource estimation of all subroutines can be found in Appendix D. The callgraph diagram for the same problem instance in std-EVE can be found in Fig. 9 in Appendix F.

Complementary to the resource estimates, we have obtained the QSP phase angles that would be required at the time of the algorithm's compilation. Although the results of the numerical optimization are currently unsatisfactory for larger problem instances, we have lowered the required numbers of individual phase factors sufficiently for us to believe that the QSP phase angles can be obtained even for larger instances with more computing power or through further research into optimization methods. Our research makes clear that, although QSP techniques are the current state of the art in fault-tolerant algorithms, they do imply a heavy classical precomputation. It is important to balance this increase in classical computational cost against the reduction of quantum resource requirements when making algorithm design choices.
We acknowledge that the current gate counts are still too high to run on mid-term fault-tolerant hardware. Once algorithmic improvements yield more feasible resource estimates, architecture-based optimization might drive the cost down further. Most importantly, our results show that adopting ideas from QSP while limiting the complexity of its optimization within the EVE algorithm can bring up to three orders of magnitude improvement when it comes to the number of Toffoli gates.
Consequently, future algorithmic improvements together with more efficient Hamiltonian compression schemes [50,5] could yield orders of magnitude improvements, as already seen for estimating the Hamiltonian's ground-state energy [1,2,3,4,5,6,7,8,9,10]. We identified the Hamiltonian simulation subroutine as the most cost-determining part of our algorithm, and any progress in this area would directly reduce our resource requirements as well.
A final observation concerns the use of QSP for QPE as a subroutine, related to the comment made in Section 5.1 where we compare the resource costs of QSP-EVE and std-EVE.For large target errors, the number of extra phase bits to use in std-EVE is small enough such that it outperforms QSP-EVE (in terms of queries to the baseline QPE).While for all examples considered here the precision requirements favor the use of QSP-EVE, there may be contexts with less stringent target error requirements in which std-EVE may be a better choice.For instance, in ground-state energy estimation where one makes a single query to QPE, an upper bound on the probability of success for estimating an eigenenergy to some precision is already 81%.In this scenario, one may not need to account for bit discretization error, and can instead opt to run standard QPE a handful of times rather than a potentially more expensive QSP QPE.This work highlights the importance of selecting appropriate variants of QPE within different coherent and non-coherent settings or for different required precisions, and the significant resultant impact upon resource requirements.
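The 81% figure agrees with the textbook value 8/π² ≈ 0.81 quoted in the caption of Fig. 6. As a minimal sketch, assuming a standard QPE with a Hadamard-initialized phase register (no window function) and counting a run as successful when it returns one of the two grid points nearest to the true phase, the value can be reproduced numerically for a phase lying halfway between two grid points. The function names are illustrative and not part of the algorithms in this work.

```python
import cmath
import math

def qpe_outcome_probability(phase, k, n_bits):
    """Probability that textbook n-bit QPE returns the integer k.

    `phase` is the eigenphase as a fraction of a full turn, in [0, 1).
    Assumes a uniform (Hadamard) initial state on the phase register.
    """
    size = 2 ** n_bits
    amplitude = sum(
        cmath.exp(2j * math.pi * j * (phase - k / size)) for j in range(size)
    ) / size
    return abs(amplitude) ** 2

def halfway_success_probability(n_bits):
    """Success probability for a phase halfway between grid points 0 and 1."""
    size = 2 ** n_bits
    phase = 0.5 / size
    return (qpe_outcome_probability(phase, 0, n_bits)
            + qpe_outcome_probability(phase, 1, n_bits))

p_success = halfway_success_probability(8)
print(f"success probability: {p_success:.4f}, 8/pi^2 = {8 / math.pi ** 2:.4f}")
```

For this halfway phase the two nearest outcomes are equally likely, which is why the value is twice 4/π².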

A Singular value estimation details
In this Section, we provide details about the eigenvalues in Eq. (11) and eigenstates in Eq. (8) of the iterate featuring the product of two general reflections. We start by recalling Eq. (6) and Eq. (7), and conclude that the projectors have the form such that $\langle t_j|p_k\rangle = \langle t_j|\tau\cdot\pi|p_k\rangle = \delta_{jk} w_k$ and where $\pi_\perp$ and $\tau_\perp$ are projectors orthogonal to all $|p_k\rangle$ and $|t_k\rangle$, as well as $\pi_\perp\cdot\tau_\perp = 0$. In that notation, we can make an Ansatz vector for an eigenstate of $R_\tau R_\pi$: with coefficients $a_k$ and $b_k$. For $R_\tau R_\pi|v_k\rangle$ we find If $|v_k\rangle$ is an eigenstate of $R_\tau R_\pi$, then the last line must be equal to $\lambda_k|v_k\rangle$, where $\lambda_k$ is an eigenvalue.
Comparing coefficients, we retrieve the following system of equations: Solving (48) for $a_k$ yields which can be used to eliminate $b_k$ from Eq. (49): We then convert Eq. (51) into a quadratic equation. ( The quadratic equation can be solved for $\lambda_k$ by k − 1. Now we turn our attention to the eigenstate: making the eigenvalue in Eq. (53) explicit in Eq. (50), we obtain where we have used $1 + e^{-2ix} = 2e^{-ix}\cos x$ for arbitrary $x$. Plugging the new expression for $a_k$ into (7) gives us Finally, substituting ), we prove Eq. (8).
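As a numerical sanity check of this kind of result, the following minimal sketch builds the product of two reflections for a single pair of states in two dimensions and compares its eigenvalues with $e^{\pm 2i\arccos w}$. It assumes the convention $R = 2\Pi - 1$ for both reflections and rank-one projectors with real overlap $w = \langle t|p\rangle$; these conventions and the function name are illustrative assumptions and may differ from Eqs. (6) and (7).

```python
import cmath
import math

def iterate_eigenvalues(w):
    """Eigenvalues of R_tau R_pi for rank-one projectors with overlap w.

    Assumed setup: |p> = (1, 0), |t> = (w, s) with s = sqrt(1 - w^2),
    and reflections of the form R = 2|v><v| - 1.
    """
    s = math.sqrt(1.0 - w * w)
    r_pi = [[1.0, 0.0], [0.0, -1.0]]
    r_tau = [[2 * w * w - 1, 2 * w * s], [2 * w * s, 2 * s * s - 1]]
    # 2x2 matrix product U = R_tau R_pi.
    u = [[sum(r_tau[i][k] * r_pi[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    # Eigenvalues of a 2x2 matrix from its trace and determinant.
    trace = u[0][0] + u[1][1]
    det = u[0][0] * u[1][1] - u[0][1] * u[1][0]
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2

w = 0.6  # hypothetical overlap
lam_plus, lam_minus = iterate_eigenvalues(w)
print(lam_plus, cmath.exp(2j * math.acos(w)))
```

In this two-dimensional picture the iterate is a rotation by twice the angle between the two states, so its eigenphases encode the overlap $w$.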

B Details about errors and failure probabilities
In this Section, we prove the following two statements.

I. Expectation value error
The systematic error to the estimated expectation value in the approximation outlined in Section 4 is $2\max_j \Omega_j$.
Let us start with the first statement by recalling the singular value decomposition in Eq. (30) while also noting that where the matrices $[\,\cdot\,]_j$ are formulated with respect to the basis states From here on we will use the notation has the eigenvalues The eigenvalues $w^2_{j+}$ and $w^2_{j-}$ belong to different solutions to the problem. They are squares of the singular values $w_{j\pm}$, but only one of them will be used to estimate the value of $F_G$. Note that Eq. (59) is related to a representation of $\hat{F}$ in the subspace of $|\psi_G\rangle$ and $|\psi_E\rangle$: which has the eigenvalues $\lambda_\pm$: Inspecting both Eq. (60) and Eq. (62), we find which is smaller than 1, considering that $\|\hat{F}\| \le 1$ due to its block encoding implementation. For a nonzero $\Omega_j$, we would estimate $F_G$ with the solution closest to $(1-F_G)/4$: by using $\sqrt{a^2+b^2} \le |a|+|b|$ for any real numbers $a$ and $b$ we find which means that we spoil the estimate of $F_G$ by at most $\pm|\Omega_j z|$. By inspecting Eq. (62) we can see that $|z|$ can, for some observables, be as big as 2 without their eigenvalues being unbounded. The accuracy of $F_G$ is therefore upper-bounded by proving the first statement.
We will now verify the statement about the success probability. Let us say that the matrix in Eq. (59) is solved by a vector $(c_{G,j}, c_{E,j})^\top$. The left singular vector of $\tau\cdot\pi$ is therefore With the shorthand we describe the success probability with the overlaps of the initial state $|Q_{G,s}; 0; 0\rangle$ with the two viable solutions of the expectation value estimation: The overlap with $|p^\perp_k\rangle$ is small and can be neglected: to define $|p^\perp_j\rangle$ we find $|t_j\rangle$ as What is more, we find which equals $\omega_j^2\langle 0|G^s_j\rangle$ in the limit of $c_{G,j}\to 1$, $c_{E,j}\to 0$ and $\omega_j \to (1-F_G)/4$. In the same limit, $\langle Q_{G,s}; 0; 0|p_j\rangle = \langle 0|G^s_j\rangle$ and so The contribution of $|p^\perp_j\rangle$ thus vanishes in the limit of the observable error being small. The success probability can, therefore, be lower-bounded with So how big is $|c_{G,j}|$?
From the multiplication of the first row of the matrix in Eq. (59) with the vector $(c_{G,j}, c_{E,j})^\top$, we learn that such that we find that the ratio of $|c_{G,j}|$ and $|c_{E,j}|$ is In the worst case of $F_G - F_E = 0$, both coefficients are of the same size such that $|c_{G,j}|^2 = 1/2$, but in the ideal case where $\Omega_j = 0$ we would have $|c_{G,j}|^2 = 1$. The worst-case success probability is, therefore, and the best-case success probability is double that, proving the statement about the success probability. Note that more refined statements about the success probability would be possible given an estimate of the gap $F_G - F_E$.

C Numerical framework for iQPE
In this section, we provide a numerical framework that will allow us to estimate the singular values $\Omega_j$ and overlaps $\langle 0|\varrho(G,s)|0\rangle$ depending on the set of functions $P_x(\cdot)$, $Q_x(\cdot)$ for all $x = 1 \ldots n$, where $n$ is the number of qubits in the phase register. For this to be possible, we need a basis for the projectors $\varrho(E,\sigma)$ with respect to a fixed tuple $(E,\sigma)$. Let us call the basis states $|X^m_{E,\sigma}\rangle$, where $m$ are the integer labels in M, such that We would need to be able to make statements about the overlaps $\langle 0|X^m_{E,\sigma}\rangle$ for us to compute $\langle 0|\varrho(G,s)|0\rangle$. To attain $\Omega_j$, however, there is no need for us to expand $|X^m_{E,\sigma}\rangle$ into the computational basis. The set of states $|X^m_{E,\sigma}\rangle$ itself can function as the basis of dimension $r$: the squared singular values $\Omega_j^2$ can be computed by $\varrho(G,s)\cdot\varrho(E,s)\cdot\varrho(G,s)$ in the basis of $\varrho(G,s)$, if we are given a suitable relation of the overlap of two basis states for different tuples $(E,\sigma)$. This section has, therefore, three goals: Before we start, it will be necessary to establish some notation. Throughout this section we will write $b_\ell$ for the binary representation of integers $\ell$ between 0 and $2^n - 1$: and represent $|\ell\rangle$ by where $b_{\ell,1}$ is the most-significant bit and $b_{\ell,n}$ is the least-significant bit. We will access the functions $P_x(\cdot)$, $Q_x(\cdot)$ via the matrix-valued function $M^{(x)}$ of arbitrary input angles $\vartheta$: Let us also introduce a function $Z_x(\cdot)$ returning the remainder of an $n$-bit integer after the $x$-th significant bit as a fixed-point number: Now we can begin working towards the goals I-III. Let us start with a look at the spectral decomposition of the iQPE operator: with Eq. (83) and we can say where $\vartheta_{E,\sigma} = \sigma(\arccos E)/(2\pi) \bmod 1$. Immediately we find where The overlap of arbitrary states is now When $(E,\sigma) = (\tilde{E},\tilde{\sigma})$ and $j = k$, the complex exponential equals one, and so Since the $Z_y(j) = Z_y(k)$ we find $\vartheta = \vartheta'$ which causes the term in Eq.
(93) to vanish regardless of b j,y .It follows that and so we achieve goal I.With access to the matrix elements M is equal to ϱ(E, σ) phase in Eq. (79), the maximum success probability can be computed by which fulfills the criteria for goal II.Errors and success probabilities for QSP-EVE can now be computed when expressions P x (z) and Q x (z) can be accessed numerically for arbitrary inputs z.This trivially contains std-EVE numbers following P x (z) = z and Q

D Details about resource estimates
In this section we describe the details that went into calculating the resource estimates. We have calculated the resources within a 10-step procedure.
1. Error budgeting: The target error of the observable shall be $\varepsilon_{\rm targ} = \varepsilon/\lambda_F$. Now we have to distribute this error between the systematic error of the expectation value stemming from the discretization error of the inner QPE, and the finite resolution of the outer QPE. We find and want to minimize the complexity of the algorithm, which is proportional to the function where the function $\kappa(\cdot)$ is a multiplier to the complexity of the inner phase estimation routine, in order to compensate the discretization error. The function $\kappa$ was initially defined in Eq. (3); our results show that $\kappa(x) = 1/x$ for std-EVE and $\kappa(x) = \log(1/x)$ for QSP-EVE. Optimizing (100) under the condition (99) using the Lagrange method, we find that $\varepsilon_{\rm in} = \varepsilon_{\rm out} = \varepsilon_{\rm targ}/2$ in the case of std-EVE. For QSP-EVE we find where $W_{-1}(\cdot)$ is the $(-1)$-branch of the Lambert function and $e$ is the Euler number.
4. Hamiltonian and observable block encoding: We decompose the Hamiltonian and observable through double factorization [9,5] to obtain the fermionic bases and low-rank tensors for $B[\hat{H}]$ and $B[\hat{F}]$. The vast majority of the gate and auxiliary qubit complexity for implementing the block encoding of the double-factorized Hamiltonian comes from implementing the basis-transforming Givens rotations. Each rotation must be performed to some precision β, and the angles for each set of Givens rotations per leaf in the rank decomposition must be loaded coherently by a data-loading circuit, or a QROM [51]. For a system of N orbitals, there will be N Givens rotations (which must then be uncomputed). For a double-factorized rank $M = O(N^2)$, there will be M different sets of N Givens rotations. This means that we must load M different sets of N rotations, each with β bits of precision, and then we must actually perform the rotations themselves (which we can do via addition into a phase gradient state of size β).
The gate complexity of the aforementioned steps is therefore O(M/k + N βk) for the data loading, and O(N β) to implement addition into a phase gradient state, where k is a tunable parameter to trade off between gate and qubit complexity. The qubit complexity for these steps is O(N β) + O( M/β). The parameter β is usually between 10 and 20 bits. The bits of precision in all Givens rotations can be determined numerically or analytically. Here, we choose the more conservative, analytic estimate given in Eq. 76 in [9], which scales logarithmically with the number of spin-orbitals. Practically, one would often truncate the factorization, omitting singular vectors with singular values below a certain threshold. There is no clear consensus on how to assess the impact of an arbitrarily set threshold, and as we want our counts to represent the worst case, we only truncate singular vectors in the single factorization for singular values under $10^{-15}$.
The number of qubits and gates used in these steps is a plurality of all gates and qubits used per iteration of the inner phase estimation, and thus constitutes the majority of the qubit and gate costs for the entire algorithm (since the number of phase qubits for the inner and outer QPEs will be much smaller than N β).
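The trade-off controlled by k in the data-loading cost O(M/k + N βk) can be illustrated with a short sketch. It assumes unit prefactors for both terms, which the asymptotic expression does not provide, so the numbers only indicate the shape of the trade-off; the function names and the example values of N and β are hypothetical. Under this assumption the continuous minimum lies at $k = \sqrt{M/(N\beta)}$ with cost $2\sqrt{MN\beta}$.

```python
import math

def data_loading_cost(m_rank, n_orbitals, beta, k):
    """Leading-order data-loading gate count M/k + N*beta*k.

    Unit prefactors are an assumption made for illustration only.
    """
    return m_rank / k + n_orbitals * beta * k

def best_integer_k(m_rank, n_orbitals, beta, k_max=64):
    """Integer k in [1, k_max] that minimizes the assumed cost model."""
    return min(range(1, k_max + 1),
               key=lambda k: data_loading_cost(m_rank, n_orbitals, beta, k))

# Hypothetical instance: N = 48 orbitals, beta = 16 bits, rank M = N^2.
n_orbitals, beta = 48, 16
m_rank = n_orbitals ** 2
k_continuous = math.sqrt(m_rank / (n_orbitals * beta))
k_best = best_integer_k(m_rank, n_orbitals, beta)
print(k_continuous, k_best, data_loading_cost(m_rank, n_orbitals, beta, k_best))
```

Larger k lowers the first term at the price of the second, which mirrors the gate-versus-qubit trade-off described in the text.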

Baseline QPE costs:
The cost for $B[\hat{H}]$ is used to determine the cost $c_{\rm bQPE}$ of a baseline QPE, running qubitization with an $n_0$-sized phase register big enough to resolve the conditions (38) and (39), where and the complexity $c_{\rm bQPE}$ roughly $2^{n_0-1}$ times the cost of the Hamiltonian block encoding.

F Comprehensive resource data

Figure 9: Callgraph depicting the estimated quantum resources required for computing the kinetic energy of a p-benzyne molecule using std-EVE. Routine costs are given in terms of Toffoli gate counts, with each subroutine node depicting the per-call cost, the total number of calls, and the cost when taken over the full algorithm. Note that some routines are deliberately omitted in the count, as they contribute very little. Numbered edges define the number of calls of the target routine within a single call of its parent routine. Darker shading indicates a greater total gate cost for the subroutine. The ASP comprises several low-precision phase estimation routines in the repeat-until-success circuit. The initial state overlap of the Hartree-Fock state with the Hamiltonian ground state is assumed to be 1% and is subsequently lifted to above 91%. Details on the resource estimation of all subroutines can be found in Appendix D.

E Molecular geometries

Figure 1: Overview of the quantum algorithm for expectation value estimation.(a)The first of two registers is initialized as a window function using the routine init.In the standard formulation of quantum phase estimation, init would just apply Hadamard gates to every qubit, but more refined circuits can be used to achieve optimal phase readouts[31,32].In the second register, we prepare a target state with an Ansatz state preparation (ASP) routine, for instance a preparation of a Slater determinant state, followed by qubitization-based quantum phase estimation with which we postselect on the right energy.Both registers are fed into an outer quantum phase estimation (oQPE) routine and an expression for the expectation value is obtained by measuring all qubits of the upper register in the computational basis.(b) Zooming in on the oQPE routine.Here the upper register has been split into single qubits.Note that the size of the upper register is arbitrary; it only has four qubits in this case.Here U are the phase oracle iterates applied to the lower register for phase kickback and QFT denotes the quantum Fourier transform.(c) Zooming in on the phase oracle U, which consists of two reflections in panels (d) and (e) acting on the four registers phase, enc[H], sim, and enc[F ].(d) Reflection featuring inner quantum phase estimation routines (iQPE) using qubitization of the Hamiltonian Ĥ.The iQPE routines write expressions of the eigenphases into the phase register, and the computational-state reflection Refl, controlled on the all-zero state of the enc[F ] register, tags eigenstates associated with |ψG⟩ in the sim register.The reflection is identified with Rπ in Section 3. (e) Block encoding of F , conditioned on the all-zero state in the enc[H] register.The reflection is identified with Rτ in Section 3. (f ) Descriptions of the qubit registers that the iterate U acts on.λH is the norm of the Hamiltonian H and ∆ is its spectral gap.

Figure 2: Example of a computational state reflection Refl. The reflection is the equivalent of a number of multiqubit Toffoli gates, where each provides a phase flip (−1) on a specified computational subspace of qubits they act on. As per convention, empty control symbols indicate that the specified subspace for that qubit is |0⟩, and filled control symbols indicate that the subspace is |1⟩. This figure shows a reflection on the computational states |0111⟩ and |1000⟩, corresponding to integers m = 7 and m = 8, respectively (see Eq. (14)). These gates can be cheaply realized using the elbow circuits shown in Figures 4 and 5 of [3].

Figure 3: Variations of the expectation value estimation iterate in Figure 1(c). (a) A more general version of the iterate, in which the reflection Rπ is implemented with quantum singular value transformation (QSVT), quantum signal processing (QSP) or quantum eigenvalue transformation (QET) techniques [36] without the need for an iQPE. When block encodings $B[\hat{H}]$ are used, this version still requires an enc[H] register, but without the need to condition the block encoding $B[\hat{F}]$ on the all-zero state of the enc[H] qubits within Rτ. No phase register is required. (b) An implementation of the iterate in which the iQPE routines feature a Hamiltonian simulation technique that does not use block encodings. We, therefore, do not require an enc[H] register or control of the block encoding $B[\hat{F}]$.

Figure 4: Building blocks of a QPE for qubitization.(a) Quantum phase estimation building block that rotates the qubit phase[x] according to the phase kickback of the oracles Q on the last register.Here, Q is the qubitization iterate, and the last register is the combination of the registers sim and enc[H].One can apply a double-kickback trick discussed in [3], Eq. (16).We denote the Hadamard gates with H, and R k are phase rotations R k = |0⟩⟨0| + exp(−iπ/2 k−1 )|1⟩⟨1| controlled on the less significant phase qubits for phase feedback.An inexpensive way to implement the latter is to feed the qubits phase[y] for y > x into an adder circuit with a phase-gradient state (see Ref.[39], page 4), flipping the value of all qubits phase[y] controlled on the value of qubit phase[x] before and after the addition.A circuit Vx with an n-qubit phase register makes an equivalent of 2 x−1 queries to Q for all x = 1 . . .n.(b) Qubitization iterate Q, featuring the block encoding of the Hamiltonian Ĥ and a reflection on the all-zero state of the enc[H] register.

Figure 5: QSP-version of the inner QPE in QSP-EVE.X(α) and Z(β) are X and Z rotations about the angles α and β, respectively, where X(α) = cos αI + i sin αX and Z(β) = cos βI + i sin βZ.The shaded areas are repeated d times with individual angles φ k , where d is the degree of a suitable polynomial approximation to the step function S(•), see Eq. (36).The presented circuits are a symmetric QSP implementation [40] according to the Chebychev polynomial in Eq. (40) such that Px(•) = P (•) for all phase qubits phase[x].As symmetric QSP only allows fixing the real part of Px(•) in Eq. (35) but never yields an imaginary part in the off-diagonal function Qx(•), we can effectively transfer the imaginary part of Px(•) onto Qx(•) by a basis transform of the qubit phase[x], which is done by the Clifford rotations X(±π/4).The subroutines Vx are defined in Figure 4.

Figure 6: Performance of std-EVE and QSP-EVE algorithms in general.(a) Comparing the expectation value errors in std-EVE and QSP-EVE with respect to the complexity of the iQPE in terms of c bQPE , the complexity of a baseline QPE with n0 = 20 qubits.These results are relatively robust against change in the number of qubits n0 and are relative to the cost of a Hamiltonian block encoding.In std-EVE, the expectation value errors decrease roughly linearly with the complexity of iQPE, which is set by the number of extra qubits nx: 2 nx c bQPE .In QSP-EVE, the expectation value error decreases exponentially with the degree d of the target polynomial P (•).The complexity of iQPE in QSP-EVE is given by individual degrees as 2d•c bQPE .The dashed line presents the "emulated" QSP-EVE curve, which is the curve for an ideal implementation of the target polynomial P (•).The singular data points represent the errors obtained with pre-computed phase factors.Every such data point is decorated with its corresponding degree d.The QSP angles used in the data points have a precision of 21 bits.We can see that the data points follow the theoretical prediction but diverge after d = 128 due to the suboptimality of the computed phase factors.(b) Minimum failure probability 1 − ⟨0|ϱ(G, s)|0⟩ of QSP-EVE different from unity, as a function of the polynomial degree d.The dashed curve outlines the theoretically achievable curve, whereas the data points denote results based on numerically attained phase factors.The success probability climbs quickly from roughly 8/π 2 ≈ 0.81 at d = 1 towards 1 as d increases.At d = 128, the failure probability is 7 • 10 −6 .The underlying iQPE routine has a total of 20 phase qubits, and the bit precision for QSP angles is 21.The minimum failure probability of std-EVE is constant at 19%.

Figure 7: Resource analysis for all observables considered (3 force components, kinetic energies and dipole moments) of various molecules in cc-pVDZ basis.(a) Logical qubit counts plotted against Toffoli gate counts for the estimation for std-EVE and QSP-EVE algorithms.Marker sizes are proportional to observable 1-norms, and each data point represents one of the observables.(b) Advantage of QSP-EVE over std-EVE.Toffoli gate count of the std-EVE algorithm relative to a run of QSP-EVE with respect to different normalized observable 1-norms ΛF .That is, the absolute 1-norm of the observable λF , relative to the target error ε, which is 1.6 mHa for kinetic energies, 10 mDebye for dipole moments and 5 mHa/ Å for the forces, such that ΛF = λF /ε.This quantity has been chosen as it indicates the complexity of the outer QPE.The linear fit (dashed line) in the plot roughly confirms the scaling of O(ΛF / log ΛF ) expected from Eq. (41), with a gradient of roughly 0.91ΛF .

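I To find the states $|X^m_{E,\sigma}\rangle$ and prove that they form a basis by showing that $\langle X^j_{E,\sigma}|X^k_{E,\sigma}\rangle = \delta_{jk}$;
II To provide an expression for $\langle 0|\varrho(G, s)|0\rangle$ that solely depends on $P_x(\cdot)$ and $Q_x(\cdot)$;
III To provide expressions for the overlaps $\langle X^j_{\tilde E,\tilde\sigma}|X^k_{E,\sigma}\rangle$ of two states from different bases, $(E, \sigma) \neq (\tilde E, \tilde\sigma)$, in terms of the functions $P_x(\cdot)$ and $Q_x(\cdot)$.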
y is the last bit in which $b_k$ and $b_j$ differ, i.e. $b_{k,x} = b_{j,x}$ for $x > y$ and $b_{k,y} \neq b_{j,y}$, the $y$-th factor in Eq. (92) turns into b M (x) bj,y, b (ϑ) • M (x) * bj,y+1, b (ϑ ab (ϑ) for arbitrary inputs $\vartheta$, we can calculate the overlaps $\langle X^j_{\tilde E,\tilde\sigma}|X^k_{E,\sigma}\rangle$ with Eq. (92) for the cases where $(E, \sigma) \neq (\tilde E, \tilde\sigma)$, achieving goal III.

2. Qubits in the phase register of the outer QPE: $\varepsilon_{\mathrm{out}}$ is used to set the bit precision $n_{\mathrm{out}}$ of oQPE by $n_{\mathrm{out}} = \log \pi$

3. Parameters of the inner QPE: $\varepsilon_{\mathrm{in}}$ is used to determine either the number of extra qubits $n_x$, in case we are using std-EVE, or the degree $d$ of the polynomial function, in case of QSP-EVE, from Figure 6(a). The results in the figure have been precomputed on the basis of a QPE with the spectral gap conditions of Eqs. (38) and (39). The expectation values are attained using Eq. (31) and the procedure outlined in Appendix C. An important ingredient for the calculation are the values $\arccos G$ and $\arccos E$ in the ranges based on the ground- and excited-state energies, $G$ and $E$. While $\arccos G$ is found in a closed range between delimiters $m$ and $m + 1$ in Eq. (38), $\arccos E$ is defined on an open range after $m + 2$ in Eq. (39). Since we only care for the relative distances of $\pm\arccos G$ and $\pm\arccos E$ to the angle $2^{-n+1} \pi m$, where $n$ is the total number of phase qubits and $m$ an integer chosen to be close to $\pm\arccos G$, we can e.g. set $m = 0$ and redefine the ranges of $\arccos G$ and $\arccos E$ accordingly. The results in Figure 6 are attained by maximizing Eq. (31) over variations of $\arccos G$ and $\arccos E$ in their newly defined ranges.

Table 2: Quantum resources for computing molecular forces using EVE. For the water, ammonia, and p-benzyne systems, we present the number of spin orbitals $N$, the norm $\lambda_H$ of the Hamiltonian in Ha, the gap of the Hamiltonian $\Delta_H$ in Ha, and the norm of the force operator $\lambda_{F_A^{(i)}}$ in Ha/Å, together with the resulting Toffoli and logical qubit counts for std-EVE and QSP-EVE instances estimating the expectation values of force operators. The force operators of different atoms and different components have different norms. Therefore, we report only the components associated with minimum and maximum gate costs here. For complete datasets, including subroutine cost breakdowns, see Appendix F.

Table 4: Quantum resources for computing kinetic energies using EVE. For the H2, Be, water, ammonia, and p-benzyne systems, we present the number of spin orbitals $N$, the norm of the Hamiltonian $\lambda_H$ in Ha, the gap of the Hamiltonian $\Delta_H$ in Ha, and the norm of the kinetic energy operator $\lambda_K$ in Ha, along with the resulting number of Toffoli gates and logical qubits for std-EVE and QSP-EVE instances estimating the expectation values of kinetic energy operators. For complete datasets, including subroutine cost breakdowns, see Appendix F.

6. Inner QPE costs: The baseline QPE cost $c_{\mathrm{bQPE}}$ is modified with the inner QPE parameter from the third step, which is either $n_x$ or $d$, to obtain the cost $c_{\mathrm{iQPE}}$ of the inner QPE: either $c_{\mathrm{iQPE}} \approx 2^{n_x} c_{\mathrm{bQPE}}$ in case of std-EVE or $c_{\mathrm{iQPE}} \approx 2d\,c_{\mathrm{bQPE}}$ in case of QSP-EVE.

7. Outer QPE iterate cost: The costs of Refl and $B[F]$, where the latter is obtained through double factorization, are combined to obtain the cost $c_U$ of an oQPE iterate $U$. We find

$$c_U = 2c_{\mathrm{iQPE}} + c_{\mathrm{Refl}} + c_{B[F]}, \qquad (103)$$

where $c_{\mathrm{Refl}}$ and $c_{B[F]}$ are the gate costs of Refl and the observable block encoding, respectively.

8. Outer QPE cost: The total cost $c_{\mathrm{oQPE}}$ of oQPE is obtained from the costs of the iterates $U$, taking into account state-of-the-art tricks like double phase-kickback. The number of times $U$, or parts thereof, are queried is set by the number of qubits in the oQPE phase register obtained in step two. A rough estimate would be

$$c_{\mathrm{oQPE}} \approx 2^{n_{\mathrm{out}}-1} c_U. \qquad (104)$$

9. Ansatz state preparation costs: The cost of ASP is estimated by multiple executions of what is essentially a baseline QPE with one additional qubit. The number of repetitions depends on $1/\gamma^2$, the squared inverse overlap amplitude of the Hartree-Fock state $|\mathrm{HF}\rangle$ with the ground state $|\psi_G\rangle$, $\gamma = |\langle \mathrm{HF}|\psi_G\rangle|$, which contributes as a multiplicative constant to the cost of ASP. When the overlap of the Hartree-Fock state with the ground state is fixed, we can assume that in the worst case, the rest of the state is the first excited state. After having projected the state into one of the qubitization eigenstates, $|\mathrm{HF}\rangle \to |\Psi\rangle$, $|\Psi\rangle$ has the form

$$|\Psi\rangle = a_G |Q_{G,s}\rangle + a_E |Q_{E,s}\rangle$$

for some random $s = \pm 1$, where $a_G$ and $a_E$ are constants. Using the notation of Appendix B for the QPE routine of the ASP, we find

The ASP circuit looks similar to the reflection featuring the inner QPE, but we flip an auxiliary qubit conditionally on Refl, and after the circuit concludes we measure this qubit, expecting a flip, while also measuring the phase register, expecting $|0\rangle$. The state fidelity $f_{\mathrm{ASP}}$ after a successful projection is then

$$f_{\mathrm{ASP}} = |\langle \Psi; 0|\mathrm{Proj}|Q_{G,s}\rangle \otimes |0\rangle_{\mathrm{phase}}|$$

Setting $|a_G| = \gamma$, $|a_E|^2 = 1 - \gamma^2$, we can obtain $p_{\mathrm{ASP}}$ and $f_{\mathrm{ASP}}$ numerically. For $\gamma^2 = 0.01$, we find that $f_{\mathrm{ASP}} \approx 91\%$ and the average gate cost $c_{\mathrm{ASP}}$ of the ASP is $c_{\mathrm{ASP}} = 4c_{\mathrm{bQPE}}/\gamma$.

10. Total costs: The costs of oQPE and ASP are combined to yield the total gate cost, while smaller cost factors like the creation of phase gradient states are neglected on account of their comparatively small size.
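Steps 6 to 10 can be chained into a rough cost calculator. The following Python sketch is an illustration under stated assumptions rather than the procedure used for the reported estimates: it uses only the approximate relations quoted above (the inner QPE cost of step 6, Eq. (103), the rough estimate of Eq. (104), and $c_{\mathrm{ASP}} = 4c_{\mathrm{bQPE}}/\gamma$), it ignores refinements such as double phase-kickback and the neglected smaller cost factors, and all names (`eve_costs`, `EveCosts`, and the argument names) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EveCosts:
    c_iqpe: float  # inner QPE cost (step 6)
    c_u: float     # cost of one oQPE iterate U, Eq. (103)
    c_oqpe: float  # rough outer QPE cost, Eq. (104)
    c_asp: float   # average ansatz state preparation cost (step 9)
    total: float   # c_oqpe + c_asp (step 10)


def eve_costs(c_bqpe, c_refl, c_block_f, n_out, gamma, *, n_x=None, d=None):
    """Combine gate costs following steps 6-10.

    Pass n_x (extra qubits) for std-EVE or d (polynomial degree) for
    QSP-EVE, but not both. All costs share the unit of c_bqpe.
    """
    if (n_x is None) == (d is None):
        raise ValueError("pass exactly one of n_x (std-EVE) or d (QSP-EVE)")

    # Step 6: inner QPE cost from the baseline QPE cost.
    if n_x is not None:
        c_iqpe = 2**n_x * c_bqpe
    else:
        c_iqpe = 2 * d * c_bqpe

    # Step 7, Eq. (103): one outer QPE iterate.
    c_u = 2 * c_iqpe + c_refl + c_block_f

    # Step 8, Eq. (104): rough outer QPE cost.
    c_oqpe = 2 ** (n_out - 1) * c_u

    # Step 9: average ASP cost as quoted in the text.
    c_asp = 4 * c_bqpe / gamma

    # Step 10: total, neglecting smaller contributions.
    return EveCosts(c_iqpe, c_u, c_oqpe, c_asp, c_oqpe + c_asp)


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    print(eve_costs(1000, 10, 90, n_out=3, gamma=0.5, d=4))
    print(eve_costs(1000, 10, 90, n_out=3, gamma=0.5, n_x=3))
```

The input values in the usage lines are made up; actual values of $c_{\mathrm{bQPE}}$, $c_{\mathrm{Refl}}$, $c_{B[F]}$, $n_{\mathrm{out}}$ and $\gamma$ would come from the earlier steps of the workflow.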

Table 5: Molecular geometries used for all calculations presented in this work.