Heisenberg-limited quantum phase estimation of multiple eigenvalues with few control qubits

Quantum phase estimation is a cornerstone in quantum algorithm design, allowing for the inference of eigenvalues of exponentially-large sparse matrices. The maximum rate at which these eigenvalues may be learned, known as the Heisenberg limit, is constrained by bounds on the circuit complexity required to simulate an arbitrary Hamiltonian. Single-control-qubit variants of quantum phase estimation that do not require coherence between experiments have garnered interest in recent years due to lower circuit depth and minimal qubit overhead. In this work we show that these methods can achieve the Heisenberg limit, {\em also} when one is unable to prepare eigenstates of the system. Given a quantum subroutine which provides samples of a `phase function' $g(k)=\sum_j A_j e^{i \phi_j k}$ with unknown eigenphases $\phi_j$ and overlaps $A_j$ at quantum cost $O(k)$, we show how to estimate the phases $\{\phi_j\}$ with (root-mean-square) error $\delta$ for total quantum cost $T=O(\delta^{-1})$. Our scheme combines the idea of Heisenberg-limited multi-order quantum phase estimation for a single eigenvalue phase [Higgins et al. (2009); Kimmel et al. (2015)] with so-called dense quantum phase estimation subroutines, which use classical processing via time-series analysis for the QEEP problem [Somma (2019)] or via the matrix pencil method. For our algorithm, which adaptively fixes the choice of $k$ in $g(k)$, we prove Heisenberg-limited scaling when we use the time-series/QEEP subroutine. We present numerical evidence that the algorithm can achieve Heisenberg-limited scaling with the matrix pencil technique as well.


Introduction
For quantum computers to overcome the 50+ year head start in research and development enjoyed by their classical competition, quantum algorithms must eke out every inch of quantum speedup over their classical counterparts. Quantum phase estimation (QPE) of a unitary operator $U$, a BQP-complete problem [4], plays a central or supporting role in many promising quantum applications [5][6][7]. However, not all flavours of quantum phase estimation are equally powerful. To perform QPE with accuracy (root-mean-square) error $\delta$, initial implementations of quantum phase estimation [8,9] used an $O(\log(\delta^{-1}))$-qubit control register and required computation time scaling as $T = O(\delta^{-2})$, known as the sampling noise limit, when contributions from outlying (unlikely) data are taken into account [1]. Much work has been undertaken over the succeeding years to improve QPE estimation rates to the theoretically optimal Heisenberg limit $T = O(\delta^{-1})$ [10][11][12] and to reduce the control overhead.
The requirement to perform applications of $U$ conditioned on a large entangled control register is technically challenging and has strong coherence requirements. It has been known for a long time [13] that the control register in QPE can be replaced by a single qubit using classical feedback and re-preparation of the control qubit, also known as iterative QPE [14]. For estimating the phase of a single eigenstate, assuming the preparation of this eigenstate, the sampling noise limit can thus be achieved using a single control qubit. In [1] iterative QPE was extended to achieve the Heisenberg limit $T = O(\delta^{-1})$. This Heisenberg limit can be shown, via Cramér-Rao bounds [1], to be a lower bound on the cost of phase estimation, assuming one cannot fast-forward the unitary $U$ [15]. This type of estimation has additionally demonstrated a relative robustness to error [2,16], which is of interest for NISQ applications. Other analyses of QPE use maximum-likelihood [17] or Bayesian [16,18] inference.
The requirement to prepare eigenstates of the unitary $U$ cannot be met in most applications. It is well known that the 'textbook' QPE algorithm succeeds for any initial state, i.e. the output is always an accurate estimate of one of the eigenphases of $U$ [8]. However, the performance of few-ancilla QPE on starting states that are not eigenstates has only been examined recently. In [19] it was demonstrated numerically that one may infer single eigenvalues from mixed or superposed initial eigenstates using standard classical signal processing techniques [20]. Under some additional constraints on the system, it was recently found that these techniques could be performed in the absence of any control qubits or the need to apply controlled unitary operations [21,22], a further significant saving. Due to the need to 'densely sample' the phase function $g(k) = \sum_j A_j e^{i\phi_j k}$, and a lack of optimization of the classical post-processing techniques, Ref. [19] only achieved sampling-noise-limited scaling, not Heisenberg-limited scaling. By dense sampling we mean that we draw samples from $g(k)$ with $k$ a sequence of integers, $k = 0, 1, \ldots, K$ (as opposed to, say, choosing $k = 2^d$, i.e. exponentially increasing $k$, as used in textbook QPE and iterative QPE). By a clever adjustment of the quantum phase estimation problem to target estimation of the spectral function, Eq. (3), of the input state, Ref. [3] was able to prove rigorous results, with bounds that were subsequently improved in Ref. [23]. Still, these results fall short of reaching "the Heisenberg limit for the problem of estimating multiple phases", however that should be defined.
In this work, we demonstrate single-control-qubit quantum phase estimation at a so-called Heisenberg limit. We extend the methods of Refs. [1,2], which obtain Heisenberg-limited scaling for single eigenphases, to the multiple-phase setting by the use of a multi-order scheme and phase-matching subroutines between different orders. We show that making this phase matching unambiguous requires the sampling scheme to be adaptive, i.e. the next choice of $k$ in $g(k)$ depends on the current phase estimates. At each order the multi-order algorithm requires input from a dense phase estimation method: for a given order $k_d$ we use samples from $g(k_d k)$ with $k = 0, 1, \ldots, K$. As we require the freedom to choose $k_d$ a real number, to be applicable to a completely general $U$ we must invoke the quantum singular value transformation of Ref. [24], which requires $O(1)$ additional control qubits. Using the time-series or QEEP analysis of Ref. [3] as classical processing subroutine, we are able to obtain a rigorous proof of Heisenberg-limited scaling of our multi-order scheme. Using the matrix pencil method analysed in Ref. [19] as such a dense subroutine, we are able to show numerical results consistent with the Heisenberg limit, with a performance improvement over the time-series analysis results.
In essence, our paper is concerned with what choices of $k$ in $g(k)$ and what classical processing are needed to enable Heisenberg-limited scaling, i.e. scaling which minimizes the total number of applications $T$ of (controlled-)$U$ (which we refer to as the quantum cost) given a targeted error $\delta$ with which to estimate multiple eigenvalue phases of $U$ present in some input state $|\Psi\rangle$. It can thus be viewed as purely solving a problem of classical signal processing. This does not mean that such questions are trivial: for example, the question of how to estimate phases if one is only allowed single samples from $g(k)$ for a set of randomly chosen $k$ relates to the dihedral hidden subgroup problem in quantum information theory [25]. We note that other work, based on a Monte Carlo extension of [3], achieving Heisenberg scaling (up to polylog factors) was recently presented in [26]. Another recent work also demonstrated numerical evidence for Heisenberg-limited phase estimation using Bayesian methods [27]. We also note that the information-theoretically optimal method in [17], which picks random $k$ in $g(k)$, has a classical processing cost which is linear in the quantum cost $T$, and Theorem 1 in [17] can be converted to bound the mean-squared error in estimating a single phase; see comments below Theorem 2.5 in Section 2.2. In principle this information-theoretic method can be extended to the case of $n_\phi$ eigenvalue phases, but the classical processing cost will be exponential in $n_\phi$ as one iterates over the possible values of the phases, while our Algorithm 4.1 has a polynomial (but superlinear) quantum and classical cost in terms of the number of phases $n_\phi$.

Outline
We begin in Sec. 2 by defining a few mathematical objects. We separate the quantum part of a phase estimation problem, namely sampling of the phase (or signal) function $g(k)$ in Eq. (2), given an input unitary $U$ and input state $|\Psi\rangle$, by running some quantum circuits (Definition 2.2), from the classical processing of samples from $g(k)$ to extract the eigenvalue data of $U$. In Sec. 2.1 we state and prove some needed properties of the distance between phases. In Sec. 2.2 we prove several Cramér-Rao bounds on the scaling of the error versus the total quantum cost for the estimation of a single eigenvalue phase, Theorem 2.5. We state the previous result on getting Heisenberg-limited scaling for a single eigenvalue phase in Algorithm 2.6. Table 1 contains a glossary of the symbols used in this paper.
In Sec. 3 we properly define the multi-eigenvalue phase estimation problem (Def. 3.1). We discuss algorithms that can be used to extract multiple phases from a densely-sampled signal $g(k)$. We state error bounds satisfied by the output of Alg. A.3 (Lemma 3.2), which is used as the fixed-order subroutine in our final Algorithm 4.1, which achieves Heisenberg scaling.
In Section 4 we present our Heisenberg-limited algorithm. We discuss a critical aliasing problem which occurs when estimating multiple eigenvalues. We show that an adaptive choice for $k$ in $g(k)$ can solve this issue, and we prove that such an adaptive choice always exists in Lemma 4.2. In Lemma 4.3 and Lemma 4.4 we prove some properties of Algorithm 4.1 which will be helpful in proving the final Theorem 4.5.
Thus in Theorem 4.5 we prove that Algorithm 4.1 achieves Heisenberg-limited scaling, which is the main result of this work. In Sec. 5 we numerically compare this rigorous implementation to an implementation using the matrix pencil method, used in Ref. [19], for which we are unable to find a rigorous proof of Heisenberg scaling. We finish the paper with a discussion in Section 6.

The classical and quantum tasks of phase estimation
One may separate quantum phase estimation into the extraction of a signal which consists of oscillations at eigenvalue frequencies $\phi_j$ at a chosen time $k$, and the processing of this signal to resolve the frequencies. Let us first define the following:

Definition 2.1 (Signal or Phase Function, and Spectral Function). Let $U \in U(2^N)$ be an $N$-qubit unitary operator, and $|\Psi\rangle \in \mathbb{C}^{2^N}$ an $N$-qubit state. We label the eigenstates $|\phi_j\rangle$ of $U$ by their phase, $U|\phi_j\rangle = e^{i\phi_j}|\phi_j\rangle$. We can decompose $|\Psi\rangle = \sum_j a_j |\phi_j\rangle$ in terms of these eigenstates, and write the overlaps $A_j := |a_j|^2$, $\sum_j A_j = 1$.
We define the phase function, also called the signal, $g(k)$ for $k \in \mathbb{R}$ of a state $|\Psi\rangle$ under $U$ as
$$g(k) := \sum_j A_j e^{i \phi_j k}. \qquad (2)$$
The spectral function $A(\phi)$ is defined as
$$A(\phi) := \sum_j A_j \,\delta(\phi - \phi_j). \qquad (3)$$
Note that $\int_0^{2\pi} d\phi\, A(\phi) = 1$, and $g(k) = \int_0^{2\pi} d\phi\, e^{ik\phi} A(\phi)$; i.e. the phase function sets the Fourier coefficients of the spectral function.
Note that one may change seamlessly between the description of a unitary $U$ and its eigenvalues and a Hermitian operator $H$ and its eigenvalues using the transformation $U = e^{iHt}$ for an appropriate choice of $t$. Note that since $g(-k) = g^*(k)$ we can restrict ourselves to $k \ge 0$.
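As a concrete illustration, the phase function of Eq. (2) can be evaluated classically for a known toy spectrum (a minimal sketch; the function name and example values are our own):

```python
import numpy as np

def phase_function(k, phases, overlaps):
    """g(k) = sum_j A_j exp(i * phi_j * k), Eq. (2), for a scalar k."""
    phases = np.asarray(phases, dtype=float)
    overlaps = np.asarray(overlaps, dtype=float)
    return np.sum(overlaps * np.exp(1j * phases * k))

# two eigenphases with overlaps summing to 1
g = phase_function(2.5, [0.3, 1.7], [0.6, 0.4])
```

Note that $g(0) = \sum_j A_j = 1$ and $g(-k) = g^*(k)$, as used above.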
One may consider algorithms estimating $g(k)$ at integer $k \in \mathbb{Z}^+$, which require the quantum circuits in Fig. 1 using controlled-$U^k$ with $k \in \mathbb{Z}^+$. In our final Alg. 4.1 we will however use $k \in \mathbb{R}^+$ (in practice $k \in \mathbb{Q}^+$). In order to implement $U^k$, we can write $k = \lfloor k \rfloor + \alpha$ with $\alpha \in [0, 1)$, and we can simulate $U^k$ in time $O(k)$ if we can simulate $U^\alpha$ in time independent of $k$. The accuracy of this fractional query to $U^\alpha$ can be independent of the final error in our phase estimation (as long as it is sufficiently small). Simulating $U^\alpha$ is not a significant issue for Hamiltonian simulation methods such as Trotter decompositions [28], which allow simulation of $e^{iHt}$ for arbitrary $t \in \mathbb{R}$. If we instead have access to a circuit implementation of a unitary $U$, or a block-encoding of a Hamiltonian $H$, we can implement a fractional query $U^\alpha$ via the quantum singular value transform [24, Corollary 34]. The circuit implementing $U^\alpha$ to error $\epsilon$ requires $O(1)$ additional ancilla qubits, and $O(\Delta_{\max}^{-1} \log(1/\epsilon))$ implementations of controlled-$U$, where $\Delta_{\max}$ is the largest gap in the spectrum of $U$ (see footnote 1). We will assume in this work that our states have support on at most $n_\phi$ phases, so that we can bound $\Delta_{\max} \ge \pi/n_\phi$. The cost of implementing $U^\alpha$ will thus not be a significant part of the cost of implementing $U^k$ under the assumption $k \gg \Delta_{\max}^{-1} \log(1/\epsilon)$. To simplify our remaining analysis, we assume herein that we can implement $U^k$ at a total quantum cost $k$ for all $k \in \mathbb{R}^+$.
The following task summarizes the quantum subroutine for phase estimation, which is to be executed with the quantum circuits in Fig. 1.

Definition 2.2 (Phase Function Estimation, PFE). Given $k \in \mathbb{Z}^+$, an error $\epsilon > 0$, and a confidence $0 < p \le 1$, PFE outputs an estimate $\tilde{g}(k)$ such that $|\tilde{g}(k) - g(k)| \le \epsilon$ with probability at least $p$, using $M = O(\epsilon^{-2} |\log(1-p)|)$ repetitions of each of the circuits in Fig. 1, at total quantum cost $O(k\, \epsilon^{-2} |\log(1-p)|)$. Our assumption that the cost of implementing $U^k$ for $k \in \mathbb{R}^+$ scales as $O(k)$ (see footnote 2) implies that the above scaling for the cost of phase function estimation holds when $k \in \mathbb{R}^+$.
Note that estimating the quantum cost of the subroutine in Def. 2.2 as linear in $k$ is consistent with general no-fast-forwarding statements [15], which state that for general Hamiltonians one cannot implement $U^t = \exp(itH)$ in time sublinear in $t$. It is expected that phase function estimation is hard to do efficiently on a classical computer, as it allows one, via classical post-processing, to sample from the eigenvalue distribution of the input state, which can be reformulated as a BQP-complete problem [4]. The quantum subroutine for PFE proceeds by executing the circuits in Fig. 1 for the given $k$. The control qubit is prepared in the state $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$, and is used to control $k$ applications of the unitary $U$ on the system register prepared in $|\Psi\rangle$.

Footnote 1: The $\Delta_{\max}$ dependence in our circuit comes from the requirement in [24, Corollary 34] that our unitary have spectrum on $[-\pi + \Delta_{\max}, \pi - \Delta_{\max}]$. Though this will not immediately be the case, we are free to rotate the spectrum of our unitary to satisfy this requirement, as long as the spectral gap exists. We also only need to consider those eigenstates with support on our initial state in this algorithm; eigenstates with zero weight can be adjusted as part of implementing the quantum singular value transformation without affecting the phase function $g(k)$.

Footnote 2: In practice, the cost of phase function estimation for $k \in \mathbb{R}^+$ will scale as $M(\lceil k \rceil + O(n_\phi \log(1/\epsilon)))$ when using quantum signal processing techniques. One can confirm that the effect this has on our result is to effectively increase the cost of calling the QEEP subroutine, Alg. A.3, from $O(\epsilon^{-6})$ to $O(\epsilon^{-6} \log(1/\epsilon))$. This will only change the prefactor of our Heisenberg-limited phase estimation algorithm (Alg. 4.1), and does not prevent it achieving the Heisenberg limit.
The reduced density matrix of the control qubit then takes the form
$$\rho_c = \frac{1}{2}\begin{pmatrix} 1 & g^*(k) \\ g(k) & 1 \end{pmatrix}. \qquad (4)$$
The phase function $g(k)$ is extracted by state tomography of the control qubit: one estimates the real and imaginary parts of $g(k) = g^{(r)}(k) + i g^{(i)}(k)$ by $M$ repetitions of the circuit in Fig. 1 and measurements of the control qubit in the $X$- or $Y$-basis respectively. We will ignore any dependence of phase function estimation on $N$.
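The extraction of $g(k)$ from $M$ shots per basis can be mimicked classically; the sketch below (our own naming, with a noiseless simulated circuit) draws the $X$- and $Y$-basis outcomes from the Bernoulli distributions implied by the control-qubit state:

```python
import numpy as np

def estimate_g(k, phases, overlaps, M, rng):
    """Estimate g(k) from M X-basis and M Y-basis control-qubit shots."""
    g = np.sum(np.asarray(overlaps) * np.exp(1j * np.asarray(phases) * k))
    p_x = (1 + g.real) / 2  # probability of outcome +1 in the X basis
    p_y = (1 + g.imag) / 2  # probability of outcome +1 in the Y basis
    g_re = 2 * rng.binomial(M, p_x) / M - 1
    g_im = 2 * rng.binomial(M, p_y) / M - 1
    return g_re + 1j * g_im

rng = np.random.default_rng(0)
g_est = estimate_g(1.0, [0.3, 1.7], [0.6, 0.4], M=100_000, rng=rng)
```

The statistical error on each quadrature scales as $1/\sqrt{M}$, which is the source of the sampling noise discussed below.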

Phase distance on the circle
Quantum phase estimation describes a series of protocols to estimate the eigenphases $\phi_j$. As these eigenvalues are defined on the circle $[0, 2\pi)$, we need a notion of distance which respects this periodicity:

Definition 2.3 (Phase distance). For phases $\phi, \theta \in [0, 2\pi)$, the phase distance is
$$|\phi - \theta|_T := \min_{m \in \mathbb{Z}} |\phi - \theta + 2\pi m|.$$
Clearly, the distance obeys the triangle inequality. The following lemma (Lemma 2.4) addresses a technical issue in the proof of performance of Algorithm 4.1. For integer $k \in \mathbb{Z}^+$, we have $|kx|_T = \min_{m \in \mathbb{Z}} |kx + 2\pi m| = k \min_{m \in \mathbb{Z}} |x + \frac{2\pi m}{k}|$, implying Eq. (8) below directly. However, for $k \in \mathbb{R}^+$ (or rational numbers $k \in \mathbb{Q}^+$) we need to specify a range of $x$ for which such a statement holds (Eq. (7)): for any $\theta$ we may write $k\phi - \theta = \Delta + 2\pi m$ with $m \in \{0, \ldots, \lceil k \rceil - 1\}$ and $\Delta \in [-\pi, \pi)$, so that $|k\phi - \theta|_T = |\Delta|$, and the minimum over $m$ can be achieved by $m = n$.
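The phase distance has a compact closed form; a minimal sketch (our own helper name):

```python
import math

def phase_dist(phi, theta):
    """Phase distance |phi - theta|_T: min over integers m of |phi - theta + 2*pi*m|."""
    d = (phi - theta) % (2 * math.pi)
    return min(d, 2 * math.pi - d)
```

For example, phases $0.1$ and $2\pi - 0.1$ sit at distance $0.2$, not $2\pi - 0.2$, reflecting the wrap-around at the branch cut.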

Limits for single-eigenvalue phase estimation
For the special case of estimating a single eigenvalue phase, the Cramér-Rao theorem can be used to lower bound the quantum cost $T$ to learn the phase with accuracy $\delta$, known as the Heisenberg limit. The problem of estimating multiple phases $\phi_j$, in the presence of unknown overlaps $A_j$, is not easily amenable to such Fisher information analysis as it is a multi-parameter estimation problem. However, it can be expected that the cost $T$ of this task is at least as high as that of single phase estimation, hence it is of interest to review these bounds here. The following theorem also proves a dense signal limit which sits in between Heisenberg and sampling noise scaling:

Theorem 2.5 (The Heisenberg, Dense Signal and Sampling Limits). The (root-mean-square) error $\delta$ of an estimator $\tilde{\phi}$ of the eigenvalue phase $\phi$ employing the circuits in Fig. 1 on an eigenstate of $U$ is always lower bounded as
$$\delta \ge \frac{c}{T},$$
where $T$ is the quantum cost of implementing the circuits. If we choose to use only quantum circuits with $k = 1$, the sampling noise limit holds:
$$\delta \ge \frac{c}{\sqrt{T}}.$$
If we choose circuits with $k = 0, 1, \ldots, K$ with a fixed number of repetitions $M$ for each circuit, we are bound by a so-called 'dense signal' limit:
$$\delta \ge \frac{c\, M^{1/4}}{T^{3/4}}.$$
In these statements $c$ is some constant.
Proof. Let $\tilde{\phi}$ be an estimator of $\phi$ which is inferred from the data $x$. Here the data $x$ is the string of outcomes of the ancilla qubit measurements for all the experiments using the left and right circuits in Fig. 1. We have by the Cramér-Rao theorem [29,30]
$$\delta^2 \ge \frac{1}{I(\phi)}, \qquad (13)$$
where the Fisher information is defined as
$$I(\phi) := \mathbb{E}\left[\left(\frac{\partial \log P(x|\phi)}{\partial \phi}\right)^2\right]. \qquad (14)$$
Thus $I(\phi)$ limits the information we may learn about $\phi$ given a dataset $x$ drawn from $P(x|\phi)$, and we can calculate $I(\phi)$. Let $M^r_k$ be the number of experiments using the circuit with the $X$ measurement, and $M^i_k$ the number of experiments using the circuit with the $Y$ measurement, for a certain chosen $k$. The Fisher information for all independent experiments together is additive,
$$I(\phi) = \sum_k \left( M^r_k\, I(\phi|k, r) + M^i_k\, I(\phi|k, i) \right),$$
with $I(\phi|k,r)$ and $I(\phi|k,i)$ the Fisher information of a single experiment, and $\sum_k$ the sum over the chosen set of $k$s. For a single experiment we can calculate, using Eq. (14) and the probability for the output bit given in Fig. 1, that $I(\phi|k,r) = I(\phi|k,i) = k^2$ and thus
$$I(\phi) = \sum_k (M^r_k + M^i_k)\, k^2. \qquad (15)$$
At the same time the total quantum cost of all experiments is
$$T = \sum_k (M^r_k + M^i_k)\, k. \qquad (16)$$
The key insight here is that the relative dependence on $k$ is different between $T$ and $I$, and this implies that the trend of the number of experimental runs $M^{r/i}_k$ as a function of $k$ will affect the maximum rate of estimation. If we choose only $k = 1$ we see that $\delta^2 \ge \frac{1}{T}$, which is the sampling noise limit.
The biggest value for $I(\phi)$ for a given $T$ is obtained when we choose a single experiment at the largest possible $k = T$, giving $I(\phi) = T^2$ and hence $\delta \ge \frac{1}{T}$, which is the Heisenberg limit. For the dense signal strategy we take $M^r_k = M^i_k = M$ for each $k = 0, 1, \ldots, K$, so that
$$I(\phi) = 2M \sum_{k=0}^{K} k^2 = \frac{M K (K+1)(2K+1)}{3},$$
while the total quantum cost is $T = 2M \sum_{k=0}^{K} k = M K(K+1)$. Thus $\delta \ge I(\phi)^{-1/2} = O(M^{1/4} T^{-3/4})$, which is the dense signal limit.
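The trade-off between Fisher information and quantum cost in this proof can be checked numerically; the sketch below (our own helper) tabulates the additive Fisher information and cost for the dense strategy and confirms that $\delta_{\min} = I^{-1/2}$ tracks $M^{1/4} T^{-3/4}$:

```python
import numpy as np

def fisher_and_cost(ks, M):
    """Fisher information and total quantum cost for M X- and M Y-shots at each k."""
    ks = np.asarray(ks, dtype=float)
    return 2 * M * np.sum(ks**2), 2 * M * np.sum(ks)

# dense strategy k = 0..K: the ratio T^{3/4} / sqrt(I) approaches a constant,
# so the minimal error I^{-1/2} scales as T^{-3/4} at fixed M
for K in (100, 1000):
    I, T = fisher_and_cost(np.arange(K + 1), M=1)
    print(K, T**0.75 / np.sqrt(I))
```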
Remark: We note that a randomized version of the dense signal choice can potentially scale in near-Heisenberg-limited fashion. In this method, one would draw $k$ at random from $1, \ldots, K$, repeating this $S$ times to generate random variables $k_1, \ldots, k_S$, and repeat experiments with fixed $M$ for each such $k_i$. With the right choice of $S \times M = \mathrm{polylog}(K)$, one can argue, using the Cramér-Rao lower bound analysis above, that the expected error obeys
$$\mathbb{E}(\delta) \ge \frac{\mathrm{polylog}(\mathbb{E}(T))}{\mathbb{E}(T)},$$
where $\mathbb{E}(T)$ is the expected quantum cost. The algorithm in [17] uses such a strategy with $M = 1$.

Clearly, the sampling noise limit can be achieved by choosing $k = 1$ in the circuits of Fig. 1. However, one can ask whether the dense signal limit or the Heisenberg limit can also be achieved, in particular when we demand that the classical post-processing is computationally efficient, meaning that this processing is polynomial in the quantum cost $T$. For the dense signal limit one needs a classical method to process the estimates of $g(k)$ at $k = 1, \ldots, K$ to estimate $\tilde{\phi}$. Using perturbation theory in the noise, the matrix pencil method has been claimed to achieve this for a single eigenvalue [31].
Achieving the Heisenberg limit is non-trivial due to phase aliasing: the phase function $g(k)$ obtained by the experiments using $U^k$ remains invariant if the phase $\phi$ is shifted by $\frac{2\pi}{k}$. This implies that a strategy of estimating $\phi$ from a single point $g(k)$ at large $k$ will fail unless $\phi$ is already known to sit within a window of width $\frac{2\pi}{k}$. This issue is circumvented by sampling $g(k)$ at multiple orders $k = 2^d$ to 'gradually zoom in' on $\phi$.
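The 'zooming in' idea can be illustrated in the noiseless single-phase case; the sketch below (our own minimal implementation, given exact oracle access to $g(k)$) resolves the $\frac{2\pi}{k}$ aliasing at each order by matching against the previous estimate:

```python
import cmath, math

def _dist(a, b):
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def zoom_estimate(g, d_max):
    """Refine an estimate of phi from arg g(2^d), d = 0..d_max, for g(k) = e^{i*phi*k}."""
    phi = cmath.phase(g(1)) % (2 * math.pi)  # coarse estimate at k = 1
    for d in range(1, d_max + 1):
        k = 2**d
        theta = cmath.phase(g(k)) % (2 * math.pi)  # equals k*phi mod 2*pi
        # of the k candidate phases spaced 2*pi/k apart, keep the one nearest phi
        phi = min(((theta + 2 * math.pi * m) / k for m in range(k)),
                  key=lambda c: _dist(c, phi))
    return phi

phi_true = 0.7341
phi_hat = zoom_estimate(lambda k: cmath.exp(1j * phi_true * k), d_max=10)
```

Without the matching step, the estimate at order $d$ alone would only determine $\phi$ modulo $\frac{2\pi}{2^d}$.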
To get Heisenberg scaling, one lets the number of samples $M$, and thus the confidence, depend on the order, so that the most significant bits of $\phi$ are determined with the highest confidence. Methods for doing this were first introduced in Ref. [1], and improved in Ref. [2] for the purpose of gate calibration. Here we state the result:

Algorithm 2.6 (Heisenberg Algorithm For Single Eigenvalue Phase [1,2]). Given a targeted error $\delta > 0$ and numbers $\alpha, \gamma \in \mathbb{Z}^+$, the Heisenberg algorithm, which outputs an estimate $\tilde{\phi}$ for $\phi$, proceeds as follows: at each order $d = 0, 1, \ldots, d_f$, run the circuits of Fig. 1 with $k = 2^d$ to obtain an estimate $\tilde{\phi}^{(d)}$ consistent with the previous-order estimate.

Return $\tilde{\phi} = \tilde{\phi}^{(d_f)}$ as an estimate for $\phi$.
It was proven in [2] that for some choices of $\alpha$ and $\gamma$ the root-mean-square error $\delta$ on the final estimate $\tilde{\phi}$ scales as $\delta = O(T^{-1})$, thus reaching the Heisenberg limit.
One might consider the effect of experimental noise on these limits. The algorithm given in [2] was proven to be robust against noise that affects the estimation of any $g(k)$ by no more than $\frac{1}{\sqrt{8}}$. However, realistic noise tends to scale with the circuit depth, eventually breaking this bound.
In the presence of a uniform depolarizing channel, it is possible to extend the above calculation of Fisher information to optimize the recovery of a single phase $\phi$; however, in the limit $\delta \to 0$ only the sampling-noise limit can be obtained:

Lemma 2.7. The (root-mean-square) error $\delta$ of an estimator $\tilde{\phi}$ of the eigenvalue phase $\phi$ employing the circuits in Fig. 1 on an eigenstate of $U$, in the presence of a pure depolarizing channel with a fixed failure probability $p = e^{-1/\tau}$ per iteration of $U$, is bounded as
$$\delta \ge \frac{c}{\sqrt{\tau T}}.$$

Proof. A pure depolarizing channel sends the off-diagonal element of the reduced density matrix in Eq. (4) to $p^k g(k)$. This adjustment can be propagated directly through to the Fisher information (Eq. (15)), which becomes
$$I(\phi) = \sum_k (M^r_k + M^i_k)\, k^2 e^{-2k/\tau},$$
while the total quantum cost (Eq. (16)) remains the same. Let us consider the relative contribution to $I$ versus the contribution to $T$ of a single choice of $k$; if we write $I(\phi) = \sum_k I_k$ and $T = \sum_k T_k$, we have $I_k / T_k = k e^{-2k/\tau}$. Differentiating w.r.t. $k$ and setting equal to zero yields
$$\frac{d}{dk}\left(k e^{-2k/\tau}\right) = \left(1 - \frac{2k}{\tau}\right) e^{-2k/\tau} = 0.$$
Optimizing $I(\phi)$ with respect to $T$ thus requires setting $k = \frac{\tau}{2}$ and increasing $M_k = M^r_k + M^i_k$, which yields
$$I(\phi) = M_k \frac{\tau^2}{4} e^{-1} = \frac{\tau T}{2e}.$$
Substituting this into the Cramér-Rao bound $\delta \ge I(\phi)^{-1/2}$ yields the desired result.
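The optimal circuit depth $k = \tau/2$ from this optimization is easy to confirm numerically; a sketch with an arbitrary choice of $\tau$:

```python
import numpy as np

tau = 37.0
ks = np.linspace(0.01, 5 * tau, 200_000)
info_per_cost = ks * np.exp(-2 * ks / tau)  # I_k / T_k from the proof
k_opt = ks[np.argmax(info_per_cost)]
# the maximizer sits at k = tau/2, where Fisher information per unit cost peaks
```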
Remark: Note that as we fix T ∼ τ in the above, we in effect have Heisenberg-limited scaling in τ . However, if we treat τ as a constant this is only the sampling noise limit as defined above. This result holds only for the simplest-possible noise case; more complicated noise is difficult to analyse, but numerical results show it may prevent estimation beyond some minimum value using standard techniques [19]. We assume herein that all circuits are noiseless (and will not use Lemma 2.7 in the rest of this work).
3 Defining the task of multiple-eigenvalue phase estimation

In this section we define the goal of estimating multiple eigenvalue phases of some unitary $U$. When the input state $|\Psi\rangle$ is supported on multiple eigenstates, choosing a single $k$ does not suffice, simply because knowing Eq. (2) at a single point $k$ does not give a unique solution $\{A_j, \phi_j\}$ [19]. A simple way to circumvent this problem is thus to estimate 'densely': estimating $g(k)$ for all integers $0 \le k \le K$ would allow us to fit up to $O(K)$ $(\phi_j, A_j)$ pairs. However, this does not saturate the Heisenberg limit, as shown in Theorem 2.5, hence we need to come up with a different method.
Separate from this, the full eigenspectrum of an arbitrary $N$-qubit unitary $U$ has up to $2^N$ unique values, making it impossible to describe in time polynomial in $N$. In addition, the spectral content of the input state $|\Psi\rangle$ could be very dense, with many eigenvalues clustered together instead of separated by gaps, and the overlaps $A_j$ for these eigenvalues could be sharply concentrated or uniformly spread. To deal with general input states, Ref. [3] thus formulated the quantum eigenvalue estimation problem (QEEP): instead of estimating individual phases, the focus is on estimating the spectral function $A(\phi)$ in Eq. (3) with some resolution. We recall the precise definition of this problem in App. A, Def. A.1. In our case, we focus on the setting where the initial state has non-zero overlap with only a small number $n_\phi$ of eigenvectors of $U$, and we want to estimate the eigenphases corresponding to each of them.
Here is our precise definition of the problem to be solved:

Definition 3.1 (Multiple eigenvalue estimation problem). Fix an error bound $\delta > 0$ and an overlap bound $A > 0$. For a unitary $U$ and state $|\Psi\rangle$, we assume that $A_j > A$ for exactly $n_\phi$ phases $\phi_j$ and $A_j = 0$ for all other phases, so that $n_\phi \le A^{-1}$. Let $g(k) = \sum_j A_j e^{ik\phi_j}$ be the phase function in Def. 2.1 and assume access to the PFE quantum subroutine in Def. 2.2 for any $k \in \mathbb{R}^+$, generating data $x$. The task is to output a set $\{\tilde{\phi}_l\}$ of $n_\phi$ or fewer estimates of the phases $\{\phi_j\}$ such that, if we take the closest estimate $\tilde{\phi}_{l(j)}$ to each phase $\phi_j$, the accuracy error $\delta_j := |\phi_j - \tilde{\phi}_{l(j)}|_T$ is bounded as $\delta_j \le \delta$ for all $j = 1, \ldots, n_\phi$.
Remark: Def. 3.1 allows us the freedom to assign a single estimate to multiple phases when calculating the final mean-squared-error.
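The success criterion of Def. 3.1 (each true phase lying within $\delta$ of its closest estimate) can be written as a small checker, using the circular distance of Sec. 2.1 (a sketch with our own helper name):

```python
import math

def worst_case_error(true_phases, estimates):
    """Largest distance from any true phase to its closest estimate (Def. 3.1)."""
    def dist(a, b):
        d = (a - b) % (2 * math.pi)
        return min(d, 2 * math.pi - d)
    return max(min(dist(p, e) for e in estimates) for p in true_phases)

# one estimate may serve several nearby phases, as the remark allows
err = worst_case_error([0.10, 3.00], [0.12, 2.95])
```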

Methods of dense signal phase estimation
Achieving the Heisenberg limit for multiple eigenvalues requires solving the problem considered in Def. 3.1 with a total quantum cost $T = O(\delta^{-1})$. We intend to accomplish that goal with a multi-order estimation scheme. At each order $d$, we will use a data processing method to estimate multiple eigenphases $\theta^{(d)}_j$ of $U^{k_d}$ to within some error $\epsilon$, from data generated by PFE (analogous to step 2b of Alg. 2.6). (We will stitch the estimates of the phases $\theta^{(d)}_j$ together to give Heisenberg-limited estimates of the corresponding $\phi_j$ in a manner similar to step 2d of Alg. 2.6.) We can offload the estimation of $\theta^{(d)}_j$ to a subroutine; we will show later that we can afford a subroutine with superlinear scaling in $\epsilon^{-1}$, as the final error $\delta$ in our multi-order scheme can be made arbitrarily small even at fixed $\epsilon$. In this section we discuss the two subroutines that we will consider in this work: the matrix pencil method (Alg. 5.1), first studied for QPE in Ref. [19], and the 'time-series analysis' proposed in Ref. [3] to solve the QEEP problem mentioned above.
We detail our implementation of the matrix pencil method in Alg. 5.1; this is a well-known algorithm in signal processing, known to achieve the dense-sampling limit for a single eigenvalue [30,31]. However, bounding the error of the matrix pencil method in estimating many phases typically requires a minimal gap $\Delta = \min_{i \ne j} |\phi_i - \phi_j|_T$ between these phases, and that we query the PFE to obtain estimates of $g(k)$ at $k > \frac{1}{\Delta}$. This is a proven necessary condition to estimate multiple $\phi_j$ to error $\le c\Delta$ [32], for some constant $c$. In our case, we need to allow for two phases $\theta^{(d)}_j$ to lie closer together than this. In principle this is not forbidden by the result of [32] (and numerical simulation confirms that this indeed works), but we do not know of a formal statement about the scaling of the matrix pencil method in this situation. Instead, we opt for a different method for the purposes of forming a rigorous proof of the Heisenberg limit, and test the matrix pencil method in numerics only.
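A minimal version of the matrix pencil subroutine (our own sketch of the standard signal-processing algorithm; Alg. 5.1 may differ in details such as the pencil parameter and noise filtering) recovers the $e^{i\phi_j}$ as eigenvalues of a pair of shifted Hankel matrices:

```python
import numpy as np

def matrix_pencil(g_samples, n_phases):
    """Estimate eigenphases from noiseless samples g(0), g(1), ..., g(K)."""
    g = np.asarray(g_samples)
    K = len(g) - 1
    L = K // 2  # pencil parameter; need n_phases <= L and n_phases <= K - L + 1
    # Hankel matrices of the signal and of the signal shifted by one step
    Y0 = np.array([[g[i + j] for j in range(L)] for i in range(K - L + 1)])
    Y1 = np.array([[g[i + j + 1] for j in range(L)] for i in range(K - L + 1)])
    vals = np.linalg.eigvals(np.linalg.pinv(Y0) @ Y1)
    # the n_phases eigenvalues nearest the unit circle carry the phases
    vals = sorted(vals, key=lambda z: abs(abs(z) - 1))[:n_phases]
    return sorted(float(np.angle(v)) % (2 * np.pi) for v in vals)

phases, overlaps = [0.5, 2.0], [0.7, 0.3]
g = [sum(a * np.exp(1j * p * k) for a, p in zip(overlaps, phases)) for k in range(21)]
est = matrix_pencil(g, n_phases=2)
```

With noisy samples of $g(k)$ one would additionally truncate the singular value decomposition of $Y_0$ before inverting, which is where the gap dependence discussed above enters.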
Ref. [3] proved rigorously that the QEEP can be solved from the densely sampled signal $g(k)$ generated with the PFE in Def. 2.2 using a 'time-series analysis' algorithm. We review the results of Ref. [3] in detail in App. A. However, the solution to the QEEP is an estimate $\tilde{A}(\phi)$ of a discretization of the spectral function $A(\phi)$ (Eq. (3)), rather than a set of estimates $\{\tilde{\theta}_l\}$. To use the time-series analysis algorithm as a subroutine in our multi-order phase estimation algorithm then requires converting from one form to the other. This is achieved by Alg. A.3, the Conservative QEEP Eigenvalue Extraction algorithm. This algorithm is designed so that its output fulfills the following guarantees whenever the time-series analysis algorithm succeeds (defined as $\|\tilde{A}(\phi) - A(\phi)\|_1 \le \epsilon$):

Lemma 3.2. Fix a confidence bound $0 < p < 1$, an overlap bound $A$, a number of phases $n_\theta < \frac{1}{A}$, and an error bound $0 < \epsilon < \frac{A}{3}$. Let $g(k) = \sum_j A_j e^{ik\theta_j}$ be the phase function for a unitary $V$, with $A_j > A$ for exactly $n_\theta$ phases $\theta_j$, and $A_j = 0$ for all other phases. Let $\{\tilde{\theta}_l\}$ be a set of estimates of $\{\theta_j\}$ generated by Alg. A.3 with error $\epsilon$ and confidence bound $p$. With probability at least $p$, the following statements are true:
1. For each phase $\theta_j$ with $A_j > 0$, there exists at least one estimate $\tilde{\theta}_l$ such that $|\theta_j - \tilde{\theta}_l|_T \le \epsilon$.
2. For each estimate $\tilde{\theta}_l$, there exists at least one phase $\theta_j$ with $A_j > 0$ such that $|\theta_j - \tilde{\theta}_l|_T \le \epsilon$.
3. The number of estimates $|\{\tilde{\theta}_l\}| \le n_\theta$.
See App. A for a proof. The total quantum cost of Alg. A.3 is inherited directly from the cost of the time series analysis algorithm (as it involves no additional quantum circuitry), which is $O(\epsilon^{-6} |\log(1 - p)|)$.

4 Multiple eigenvalues: multi-order estimation and the phase matching problem

To achieve Heisenberg-limited scaling for multiple phases, we combine the dense signal algorithms of the previous section, which can resolve multiple phases, with the single-phase Heisenberg-limited algorithm, Algorithm 2.6, which achieves the correct scaling.
A natural way to achieve such a combination is to estimate the phases $\{\theta^{(d)}_j\}$ of $U^{2^d}$ at successive orders $d$. If we would manage to get an estimate of $\theta^{(d)}_j = \phi_j 2^d \bmod 2\pi$ at each order $d$ with error $\epsilon$, and be able to combine these estimates in an unequivocal manner, then reaching the Heisenberg limit for multiple phases may be feasible. Note that the error in the final $d_f$-th estimate in this case would be $\delta \sim \epsilon / 2^{d_f}$, with $\epsilon$ the error in Algorithm A.3. One could thus achieve arbitrarily small $\delta$ for fixed $\epsilon$ by making $d_f$ arbitrarily large. This allows us to use (possibly non-optimal) routines such as the QEEP algorithm, since the scaling with $\epsilon$ does not propagate to a scaling in $\delta$ for a sufficiently small $\epsilon$.
However, combining phase information at different orders may not be feasible for multiple phases when we use V = U^{2^d} for increasing d as in Algorithm 2.6. The reason is that, with multiple phases, the previous-order estimates provide a set of 'ballpark' intervals, and it may not be clear or unambiguous which interval to choose in order to convert a new estimate θ̃^{(d)} into a refined estimate of some φ_j (as in step 2d of Algorithm 2.6). We would like the choice of the next order to be such that this 'matching with a previous estimate' can be done unambiguously.
For this, rather than estimating the multiple eigenphases with powers U^{2^d}, we use an adaptive strategy for choosing the next multiplier κ_d in Alg. 4.1. That is, the algorithm will determine a κ_d based on the estimates φ̃_j^{(d−1)} from the previous round. Although this scheme requires some classical processing of the experimental data before the experiment is finished, it is not an adaptive scheme in the same sense as iterative QPE [14], as we do not require feedback within the coherent lifetime of a single experiment.
The generalization from using U^{2^d} to U^{k_d} for k_d ∈ R^+ presents one small additional complication. In order to prove bounds on the estimation at each order we will need to invoke Lemma 2.4. However, this requires that our phases φ_j satisfy Eq. (7) (unless k_d ∈ Z^+). If a phase φ_j does not satisfy Eq. (7), one can construct a situation where two corresponding estimates φ̃_j^{(d)} are found on either side of the branch cut at 2π, and where we cannot guarantee that our algorithm would choose the 'correct' one (without knowledge of the hidden φ_j). To solve this issue, we note that one may shift the phases of U by a constant χ by performing phase estimation on U e^{−iχ} instead of U. This need not even be done on the quantum device, as one simply multiplies estimates of g(k) by e^{−ikχ}. As we assume the existence of only n_φ phases, we can always find some U e^{−iχ} with phases in some window [φ_min, φ_max) with φ_min ≥ π/k and φ_max ≤ π(2k − 1)/k when k ≥ 3n_φ. This will allow us to invoke Lem. 2.4 to match estimates of eigenphases of U e^{−iχ} and estimates of eigenphases of (U e^{−iχ})^k as we require. We also note that the above issue can be circumvented when U = e^{iHt} by a suitable choice of t.
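The classical nature of the spectral shift is easy to verify numerically: multiplying the sampled phase function by e^{−ikχ} is exactly equivalent to phase estimation on U e^{−iχ}. A minimal check with toy eigenphases and overlaps of our own choosing:

```python
import numpy as np

# Shifting the spectrum classically: g(k)*exp(-1j*k*chi) equals the phase
# function of U*exp(-1j*chi), whose eigenphases are phi_j - chi.
phis = np.array([0.3, 2.0, 5.5])   # hypothetical eigenphases
amps = np.array([0.5, 0.3, 0.2])   # overlaps A_j
chi = 0.4

ks = np.arange(20)
g = (amps[None, :] * np.exp(1j * np.outer(ks, phis))).sum(axis=1)

g_shifted = g * np.exp(-1j * ks * chi)                                   # classical shift
g_direct = (amps[None, :] * np.exp(1j * np.outer(ks, phis - chi))).sum(axis=1)
print(np.allclose(g_shifted, g_direct))  # True
```

No additional quantum resources are consumed, since the shift acts on the already-measured samples of g(k).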

Heisenberg-limited algorithm for multiple phases
We now describe our Heisenberg-limited phase estimation algorithm. This algorithm targets a final error δ = O(δ_c), where δ_c is a fixed input to the algorithm itself (we will calculate the constant of proportionality in the proof of Theorem 4.5). The Heisenberg limit will be achieved by making δ_c smaller while keeping the error ε of the phase extraction subroutine, Alg. A.3, constant. Let the confidence parameter p_d be given by Eq. (28) for some real numbers α > 0, γ > 2, with multipliers k_d to be chosen below. The algorithm proceeds as follows:

1. Let d = 0 and k_{d=0} = 1. Use Alg. A.3 to find a set of first estimates {φ̃_j^{(0)}} of eigenvalues of U with error parameter ε_0 in Eq. (26), overlap bound A, and confidence p_{d=0} in Eq. (28). If this set is empty or has more than n_φ elements, return {0} (this is a failure mode).

2. Find the point ζ lying midway in the largest gap between the phase estimates φ̃_j^{(0)}, and let d_ζ be half the size of this largest gap. Shift the phases accordingly, i.e. perform all subsequent estimation on U e^{−iζ} (by multiplying estimates of g(k) by e^{−ikζ}).

3. Choose the first multiplier k_1 obeying the conditions in Eqs. (31) and (32).

4. While k_d < 2ε/δ_c:
(a) Set d → d + 1.
(b) Use Alg. A.3 on U^{k_d} with error parameter ε and confidence p_d to obtain estimates {θ̃_l^{(d)}}.
(c) Exit, returning the previous-order estimates, if Eq. (33) holds, or there exists some θ̃_l^{(d)} for which Eq. (34) holds, or the number of estimates |{θ̃_l^{(d)}}| exceeds n_φ.
(d) Match each θ̃_l^{(d)} to a previous-order estimate via the optimization in Eq. (35), producing refined estimates {φ̃_j^{(d)}}.
(e) Exit, returning the previous-order estimates, if the current estimates do not lie in the region for which Eq. (7) holds with the given k_d.
(f) Choose the next multiplier κ_{d+1} (and hence k_{d+1} = κ_{d+1} k_d) such that for each pair of estimates either Eq. (36) or Eq. (37) holds.

5. Return {φ̃_j^{(d)}}.
In principle the first few orders d could be skipped given accurate prior knowledge of the phases φ_j. However, as the largest circuits are executed at the later values of d, this would only change the constant factor of the algorithm, rather than the asymptotic scaling with δ.
In the rest of this section, we prove that Alg. 4.1 can achieve the Heisenberg limit. The first use of the QEEP subroutine requires a potentially smaller error parameter (ε_0, bounded by Eq. (26)) than subsequent uses (where ε needs only to be bounded by Eq. (27)). (Invoking Alg. A.3 requires that the error parameters ε and ε_0 are at most A/3 ≤ 1/(3n_φ), which is fulfilled by both bounds.) This relates to a technical issue: we require k_1 ≥ 3n_φ in order for Lemma C.1 in Appendix C, and thus Lemma 2.4, to apply. For later rounds k_d ≥ 3n_φ holds automatically, and the multiplier κ_d is no longer constrained in this way, which indirectly allows us to relax the region of valid choices for ε. The first step in proving the performance of Algorithm 4.1 is to show that multipliers obeying the desired conditions can be chosen in the first round (step 3 in Alg. 4.1) and in subsequent rounds (step 4f in Alg. 4.1). This is accomplished by Lemma 4.2, which is proved in Appendix B. Remarks: the probability with which a multiplier obeying the desired property can be found is rather arbitrary in this Lemma and can be increased by choosing a smaller ε. Note that it is easy to verify whether the desired conditions hold for a randomly chosen multiplier. Importantly, the validity of this Lemma does not depend on whether the phase estimates are actually accurate; it only depends on the number of phases n_φ. In practice, we do not generate a random multiplier κ_{d+1} through this Lemma, but simply search exhaustively for a valid κ_{d+1} starting at the maximal value, see Section 5.
The reason to adaptively choose the multiplier κ_{d+1} for d = 0, 1, . . . is that two (estimated) phases in principle need to lead to separate estimates at the next order: this is expressed in Eq. (36). An exception occurs when the (estimated) phases are still close enough, as in Eq. (37), so that their refined estimates could merge at the next order, see Fig. 2. Phase estimates can thus split and merge over the multiple orders: they split when sufficient accuracy is available at the next order to distinguish them, and they may stay together or merge when such accuracy is not yet required at the given order.
In what follows we will assume, purely for simplicity of the proof, that the error parameter ε is bounded by ε_crit,0 in Eq. (26) for all rounds, and is the same for all rounds, including the first one.

Bounding the error with and without failures
In this section we state and prove the two key intermediate lemmas, Lemma 4.3 and Lemma 4.4, on our way towards proving that Alg. 4.1 reaches the Heisenberg limit. Together, these lemmas allow us to bound the error in Alg. 4.1, assuming that the phase extraction subroutine succeeds for the first d rounds, to within O(ε/k_d).
These Lemmas deal with the issue of 'aliasing', i.e. the correct matching of new estimates with older estimates, which is solved by the specific choice of κ_{d+1} in step 4f of Alg. 4.1, see also Fig. 2. When none of the failure modes is encountered, d_f is set by the first k_d such that k_d ≥ ε/δ_c (since the next k_{d+1} ≥ 2ε/δ_c, as κ_{d+1} ≥ 2). Since κ_d ≥ 2, we observe that d_f ≤ log_2(2ε/δ_c).

Figure 2 (legend: hidden phases; promised estimate regions; estimates, separated; estimates, merged; estimates, aliased): Estimates φ̃^{(d)}_{→j} (solid lines) lying within the promised estimate region about φ_j (coloured boxes). By matching phases at subsequent orders (arrows), the algorithm is able to converge to an ever-more accurate estimate of each φ_j. The phase extraction subroutine only promises that each region will contain at least one phase (and that the total number of estimates at each order is bounded by n_φ); when two regions overlap, the subroutine may merge the phases to give a single estimate (green and purple lines). Estimates at subsequent orders may continue to separate and even re-merge until the regions separate, at which point the algorithm promises, with high confidence, a precise estimate for each hidden phase. The estimate φ̃^{(d)}_{→j} at each order is only known mod 2π/k_d, leading to a set of potential aliases (dotted lines at d = 4) for each phase. We do not know a priori which alias is correct, and must rely on the fact that the true estimate is close to the previous-order estimate φ̃^{(d−1)}_{→j}. By carefully choosing each k_d, we can guarantee that no alias will satisfy this condition (so long as Alg. A.3 succeeds), and our phase matching will be unambiguous.

Lemma 4.3. Suppose the phase extraction subroutine, Alg. A.3, succeeds in every round up to and including order d. Then:
• (Property 1a) For every phase φ_j there exists an estimate φ̃_l^{(d)} that is 2ε/k_d-close to φ_j.
• (Property 1b) For every estimate φ̃_l^{(d)} there exists a phase φ_j that is 2ε/k_d-close to it.
Proof. We prove this Lemma by induction. Consider the first round d = 0 (k_{d=0} = 1), i.e. step 1 of Alg. 4.1. If the QEEP subroutine, Alg. A.3, succeeds (with probability p_0), then Lemma 3.2 holds: for each φ_j there exists an estimate φ̃_l^{(0)} close to it, and for each estimate φ̃_l^{(0)} there exists a nearby phase φ_j. Hence the statement to be proven holds at d = 0. In the next steps we work with the shifted phases, but for simplicity we do not introduce new notation and keep referring to them as φ_j and to the estimates as φ̃_l^{(d)}. For the induction step, assume:
• (Assumption 1a) For every phase φ_j there exists an estimate φ̃_l^{(d)} that is 2ε/k_d-close to φ_j.
• (Assumption 1b) For every estimate φ̃_l^{(d)} there exists a phase φ_j that is 2ε/k_d-close to it.
Note that these assumptions certainly imply that one can apply Lemma 2.4 to the estimates φ̃_l^{(d)}. That is, given that the real phases φ_j are 2ε/k_d-close to these estimates and that the (shifted) φ_j obey Eq. (110), Eq. (8) can be used with k ≥ 3n_φ (which is the case for all rounds d ≥ 1).
We consider the QEEP subroutine, Alg. A.3, with a given choice of κ_d obeying the conditions in step 3 (for d = 1) or step 4f (for higher d), executed in step 4b with confidence p_d. In the math below we refer to the conditions on κ_{d>1}, namely Eq. (36) and Eq. (37), but the conditions on κ_1 in Eq. (31) and Eq. (32) are of identical form (so we do not make a separate argument for the d = 0 → d = 1 induction step).
Thus let {θ̃_l^{(d+1)}} be a set of estimates of the eigenphases {θ_j^{(d+1)} = k_{d+1} φ_j mod 2π} of U^{k_{d+1}}, for which Lemma 3.2 gives:
• (Assumption 2a) For every phase θ_j^{(d+1)} there exists an estimate θ̃_l^{(d+1)} that is 2ε-close to it.
• (Assumption 2b) For every estimate θ̃_l^{(d+1)} there exists a phase θ_j^{(d+1)} that is 2ε-close to it.
To prove the induction step, we thus need to show that the set {φ̃_l^{(d+1)}} obeys:
• (Property 1a) For every phase φ_j there exists an estimate φ̃_l^{(d+1)} that is 2ε/k_{d+1}-close to φ_j.
• (Property 1b) For every estimate φ̃_l^{(d+1)} there exists a phase φ_j that is 2ε/k_{d+1}-close to it.
First consider Assumption 2a: for every phase φ_j there exists a θ̃_l^{(d+1)} that is 2ε-close to θ_j^{(d+1)} = k_{d+1} φ_j mod 2π. In this proof we will use the label l = →j for the θ̃_l^{(d+1)} associated with φ_j. Thus, we also use Eq. (8), with n^{ideal}_{j,→j} the integer which achieves the minimum.
Similarly, by Assumption 1a, there is some φ̃^{(d)}_{→j} which is 2ε/k_d-close to φ_j, again using a label which shows this association. Consider the optimization in Eq. (35) at step 4d in Alg. 4.1, and define n_l as the integer achieving the minimum in Eq. (35). The goal is thus to prove that for each φ_j, using the corresponding θ̃^{(d+1)}_{→j}, we have n_{→j} = n^{ideal}_{j,→j}, which directly implies Property 1a.
We can bound ξ_{→j,→j} using Eq. (6) and then Eq. (8), Assumptions 1a and 2a, and the optimality of n_{→j,→j}. Now if Eq. (36) holds for some other m ≠ →j, we claim on the other hand that Eq. (50) holds, hence matching θ_m with m ≠ →j is non-optimal and will not be chosen by the algorithm. To see that Eq. (36) indeed implies Eq. (50), we can calculate using Eq. (8). To see that Eq. (37) indeed implies Eq. (53), note that it is sufficient to prove Eq. (54), as one can then show for n ≠ n_{→j,→j} ∈ {0, . . . , k_{d+1} − 1} that n_{→j,→j} is optimal. Using Eq. (6), Eq. (37) and Eq. (60), we can then prove Eq. (54). We have thus shown that for each φ_j there is a θ̃^{(d+1)}_{→j} such that step 4d will output (θ̃^{(d+1)}_{→j} + 2πn_{→j,→j})/k_{d+1}, with n_{→j,→j} defined in Eq. (48), related to the previous-order estimate φ̃^{(d)}_{→j} which was already close to φ_j. The last step is to show that n_{→j,→j} = n^{ideal}_{j,→j} using Property 1a. Here we use that 4ε(κ_{d+1} + 1) < π: indeed for d ≥ 1, ε < π/16 (with κ_{d+1} ≤ 3), and for d = 0, ε < π/(4(3n_φ + 2)) (with κ_1 ≤ 3n_φ + 1), given the upper bounds on ε in Eqs. (27) and (26). This implies, through the same argument as in Eq. (56), that the n^{ideal}_{j,→j} achieving the minimum in Eq. (43) equals n_{→j,→j}, and hence we obtain Property 1a. Now let us prove Property 1b. Given a θ̃^{(d+1)}_l, let φ_{j=→l} be the real phase which, by Assumption 2b, is 2ε-close to it at order d + 1. To prove Property 1b, we need to show that n_l = n^{ideal}_{→l,l}, with n_l defined in Eq. (48). Let also φ̃^{(d)}_{→j} be the previous-order 2ε/k_d-close estimate of φ_{j=→l}, by Assumption 1a. The idea is that the optimization step of the algorithm then leads to a better estimate of the phase φ_{j=→l}.
Given a θ̃^{(d+1)}_l we can deduce, as before, the analogous bounds. Using the previous arguments, all other ξ_{m,l} are either larger or give the same integer n_{→j,l}, and thus n_l = n_{→j,l}. In addition, we can bound the resulting error using this equality and Assumption 1a, implying that n_l = n_{→j,l} = n^{ideal}_{→l,l}, as desired.
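The matching step analyzed above can be sketched numerically. The snippet below is our own simplified rendering of step 4d: it uses a plain (non-circular) distance and toy numbers, whereas the algorithm's actual criterion is the optimization in Eq. (35):

```python
import numpy as np

# Simplified matching step: given a new estimate theta_new of
# (k_new * phi) mod 2pi, pick the alias and the previous-order estimate
# that minimize the mutual distance xi.
def match(theta_new, k_new, prev_estimates):
    n = np.arange(int(np.ceil(k_new)))
    candidates = (theta_new + 2 * np.pi * n) / k_new        # all aliases
    # distance of every alias to every previous-order estimate
    xi = np.abs(candidates[:, None] - np.asarray(prev_estimates)[None, :])
    i, j = np.unravel_index(np.argmin(xi), xi.shape)
    return candidates[i], j   # refined estimate and matched-phase index

phis = [1.1, 2.9]            # hidden phases (toy example)
prev = [1.12, 2.88]          # previous-order estimates, error ~0.02
k_new = 7.5                  # next-order multiplier
theta = (k_new * phis[0]) % (2 * np.pi)
refined, j = match(theta, k_new, prev)
print(j, refined)            # matches phase 0, refined estimate ~1.1
```

With the separation between the two previous estimates much larger than the alias spacing 2π/k_new allows for confusion, the correct previous estimate is matched and the refined estimate inherits the finer accuracy of the new order.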
Algorithm 4.1 has a few failure modes, namely steps 1, 4c and 4e, where we exit and return an estimate of lower order. Arguments in Lemma 4.3 show that these failure modes are only encountered when the QEEP subroutine, Alg. A.3, fails at some order. 'Failure' here is not complete failure; regardless of whether the algorithm fails, it will return a set of estimates φ̃_j, and the error in these estimates will contribute to Eq. (25). To achieve the Heisenberg limit we must make sure both that the probability of failure is small, and that the estimates φ̃_j from a failed instance of the algorithm still lie close to the true values φ_j, to minimize their contribution to Eq. (25). The probability p_d with which the QEEP subroutine succeeds at the dth order is bounded by the parameters α, γ given as input to Alg. 4.1 (Eq. (28)). The probability that the subroutine succeeds up to and including the dth order is then bounded from below by ∏_{d′=0}^{d} p_{d′}. It is crucial for the success of our algorithm that we do not encounter any exit modes other than those mentioned above, which we can now prove given the machinery developed in Lem. 4.3.
and we argued previously, via induction, that this does not happen when Alg. A.3 succeeds up to order d, as min_j ξ_{j,l} is upper-bounded as in Eq. (49).
which cannot happen due to the success of Alg. A.3, which implies the bound in Eq. (49). Consider lastly step 4e, which exits if the current dth-order estimates do not lie in the region for which Eq. (7) holds with the given k_d. We have argued in Lemma 4.3 that, assuming success of the subroutines implementing Alg. A.3, Eq. (7) holds for the phase estimates at all orders.
Now let us consider failures of the QEEP subroutine, Alg. A.3, which do not lead to exiting. Suppose that the first failure occurs at some order d_0. We want to make sure that continuing with higher orders after such a failure still leads to an error of order ∼ ε/k_{d_0−1}, even though the failure (and any subsequent failure) is not detected.
To show this, we check that if Alg. 4.1 exits at some later round, namely during round d = d_f + 1, and outputs estimates φ̃_j^{(d_f)}, then these are sufficiently close to the estimates obtained right before the failure, that is, to the set of phases φ̃_j^{(d_0−1)}. Then, by Lem. 4.3, these estimates will also be sufficiently close to the true phases φ_j.
Vice versa, for each estimate there exists a nearby phase:
• (Property 1b) For every estimate φ̃ there exists a phase φ_j close to it.
Then, since the algorithm does not exit at step 4c through Eq. (33), nor at step 4e, at any order, the second equality follows from being allowed to apply Eq. (8) (which is validated by passing the test at step 4e). This implies that in step 4d of Alg. 4.1 at round d, for a given θ̃_l, either Eq. (36) holds, in which case the corresponding ξ_{m,l} is not optimal, or Eq. (37) holds. The proofs of these claims are exactly the same as in the proof of Lemma 4.3, i.e. using Eqs. (54), (56), (57). Now let us prove Eq. (64). Given a phase φ_j, we can use Property 1a to find an associated estimate φ̃ (which we label with →j again). Finally, with probability less than the residual failure probability, the algorithm fails altogether. Combining our bounds then yields a total quantum cost T = O(δ^{−1}), which is the Heisenberg limit. Note that a dependence on the number of phases n_φ enters the scaling of δ via ε, which needs to be bounded as in Eqs. (26) and (27).

Numerical implementation
Thm. 4.5 requires using the QEEP algorithm (Theorem A.2 and Algorithm A.3) in order to obtain provable bounds. Instead of analytic bounds, we now turn to a numerical demonstration, giving the opportunity to implement and test Algorithm 4.1 with a few modifications. We test the algorithm using two different subroutines, one based on the matrix pencil method [31], and one based on the QEEP time-series analysis of Theorem A.2, as described in Algorithm A.3. Code to implement all simulations can be found at https://github.com/alicjadut/qpe. To improve the practical performance of Alg. 4.1, we make the following two small changes. Firstly, instead of choosing κ_{d+1} in step 4f in the ranges declared in Lem. 4.2, we choose the largest κ_{d+1} consistent with Eq. (36) and Eq. (37) for all pairs of estimates φ̃_l. (The maximum such κ_{d+1} is bounded above by the boundaries of the sets R^{(n)}_{j,l} defined in Eq. (92), and we find the largest possible κ_{d+1} by iterating backwards through these boundaries until a gap is found.) Secondly, as the bounds for ε and ε_0 in Lem. 4.2 are rather loose, and our performance scales rather badly in both, we choose the largest ε = ε_0 that allows all simulations to find a value of κ_{d+1} > 2 at each order.
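The backward search for the largest valid multiplier can be sketched as follows. The true validity conditions are Eqs. (36)/(37); since their explicit form is not reproduced here, the snippet below uses a simplified stand-in criterion (scaled pair separations must stay a margin away from every multiple of 2π, unless the pair is still unresolved), and all thresholds are illustrative choices of ours, not the paper's:

```python
import numpy as np

# Simplified stand-in for the exhaustive backward search over kappa_{d+1}.
def largest_valid_kappa(prev_estimates, k_d, margin=0.3, kappa_max=3.0, step=1e-3):
    deltas = [abs(a - b) for i, a in enumerate(prev_estimates)
              for b in prev_estimates[i + 1:]]
    kappa = kappa_max
    while kappa >= 2.0:
        ok = True
        for delta in deltas:
            wrapped = (k_d * kappa * delta) % (2 * np.pi)
            dist_to_alias = min(wrapped, 2 * np.pi - wrapped)
            # valid if the pair clearly separates, or is still unresolved (may merge)
            if dist_to_alias <= margin and k_d * kappa * delta >= margin:
                ok = False
                break
        if ok:
            return kappa
        kappa -= step
    return None

print(largest_valid_kappa([1.10, 2.88], k_d=4.0))
```

Starting at the maximal value and stepping downwards mimics the "largest κ_{d+1} first" strategy described above; in this toy instance the maximal multiplier is already valid.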
When using the matrix pencil processing subroutine, we follow the implementation described in Ref. [19]:

Algorithm 5.1. The matrix pencil method takes as input estimates of the phase function g(k) = Σ_j A_j e^{ikθ_j} for a unitary V at points k = 0, 1, . . . , K and an overlap bound A, and proceeds as follows:

1. Construct the Hankel matrices G_a, a = 0, 1, from the samples g(k), with indices running over {0, . . . , 2K − L_K} and L_K = ⌊(K + 1)/2⌋, using g(−k) = g*(k).

2. Construct the matrix T from the matrix pencil (G_0, G_1).
3. Calculate the eigenvalues of T, λ_j = |λ_j| e^{iθ̃_j}, and from there the phase estimates θ̃_j.

4. Calculate the overlap estimates Ã_j by least-squares minimization of the residual between the sampled signal and the model Σ_j Ã_j e^{ikθ̃_j}.
5. Return the phase estimates θ̃_j for which the corresponding overlap estimate Ã_j ≥ A.
To use this algorithm as a subroutine in Alg. 4.1 (in place of Alg. A.3), we apply it to the matrix V = U^{k_d}, which requires implementing V^k = U^{k k_d} for a range of integer k on a quantum device.
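A self-contained NumPy sketch of Algorithm 5.1 on noiseless data follows. The pencil T = G_0^+ G_1, the explicit `rcond` cutoff, and the eigenvalue-magnitude filter are our own implementation choices (standard for matrix pencil methods); the paper's exact construction may differ in details:

```python
import numpy as np

# Matrix pencil sketch: recover phases and overlaps from samples g(0..K).
def matrix_pencil(g, A_min):
    K = len(g) - 1
    full = np.concatenate([np.conj(g[:0:-1]), g])   # g(-K..K) via g(-k) = g(k)*
    L = (K + 1) // 2                                # pencil parameter L_K
    rows = len(full) - L - 1
    # Hankel matrices, shifted by one sample relative to each other
    G0 = np.array([[full[i + j] for j in range(L + 1)] for i in range(rows)])
    G1 = np.array([[full[i + j + 1] for j in range(L + 1)] for i in range(rows)])
    eigs = np.linalg.eigvals(np.linalg.pinv(G0, rcond=1e-8) @ G1)
    eigs = eigs[np.abs(eigs) > 0.5]                 # drop numerically-zero eigenvalues
    thetas = np.angle(eigs) % (2 * np.pi)
    # least-squares fit of the overlaps to the sampled signal
    ks = np.arange(-K, K + 1)
    amps, *_ = np.linalg.lstsq(np.exp(1j * np.outer(ks, thetas)), full, rcond=None)
    keep = np.real(amps) >= A_min
    return thetas[keep], np.real(amps[keep])

ks = np.arange(40)
g = 0.5 * np.exp(1j * 0.8 * ks) + 0.5 * np.exp(1j * 2.5 * ks)
thetas, amps = matrix_pencil(g, A_min=0.25)
print(np.sort(thetas))   # close to [0.8, 2.5]
```

On noiseless two-phase data the signal eigenvalues sit on the unit circle at e^{iθ_j}, so both phases and both overlaps are recovered essentially exactly.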
To isolate the performance of the estimation routine from the generation of the signal itself, we do not test our protocols on data generated by simulating or approximating a particular unitary. Instead, we test the ability of the protocols to estimate n_φ = 2 and n_φ = 4 randomly-chosen phases φ_j ∈ [0, 2π] when sampling from the true phase function g(k). We take all phases with equal weight, A_j = 1/n_φ. We simulate the sampling of g(k) in Algorithm 5.1 or Algorithm A.3 for some V = U^{k_d} by simulating the readout of a control qubit with the reduced density matrix of Eq. (4) (in practice this would be generated by the quantum circuit in Fig. 1), where θ_j = k_d φ_j mod 2π are the eigenvalues of V. We then return the fraction of +1s drawn as estimates for the real and imaginary parts of g(k) respectively. This is repeated at all points k = k_d k′ for k′ = 0, 1, . . . , K. Following the discussion in Sec. 2.2 and using the notation from Eq. (74), we sum the total quantum cost for the algorithm over all requested g(k) queries. (We ignore the subleading correction from the final term in Eq. (74), as this will not affect the scaling of our result.) For the signal length K and the number of points M_d at which to sample each g(k), we follow the bounds given in Ref. [3] (both when using the QEEP and the matrix pencil subroutines):
• signal length: K = 0.1 L ln² L, with L the number of bins used in the QEEP subroutine (Def. A.1).
• number of measurements of each circuit: M_d, chosen according to the confidence p_d given in Eq. (28) for a given k_d. This requires fixing a choice of α and γ; across all experiments we take α = 2 and γ = 2.1.
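The control-qubit sampling described above can be sketched as follows (the sign conventions relating outcome probabilities to Re g and Im g are ours, and the phases, weights, and shot count are toy values):

```python
import numpy as np

# Simulate M-shot estimation of g(k): the control qubit measured in the X
# (resp. Y) basis yields +1 with probability (1 + Re g(k))/2 (resp. (1 + Im g(k))/2).
rng = np.random.default_rng(1)

def sample_g(phis, amps, k, M):
    g = np.sum(np.array(amps) * np.exp(1j * np.array(phis) * k))
    p_re = (1 + g.real) / 2
    p_im = (1 + g.imag) / 2
    re = 2 * rng.binomial(M, p_re) / M - 1   # fraction of +1s -> Re g estimate
    im = 2 * rng.binomial(M, p_im) / M - 1   # fraction of +1s -> Im g estimate
    return re + 1j * im

phis, amps = [0.7, 3.1], [0.6, 0.4]
g_hat = sample_g(phis, amps, k=5, M=100000)
g_true = 0.6 * np.exp(1j * 0.7 * 5) + 0.4 * np.exp(1j * 3.1 * 5)
print(abs(g_hat - g_true))  # shot noise ~ 1/sqrt(M)
```

The deviation from the exact g(k) is set by binomial shot noise, which is what the choice of M_d above controls.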
To demonstrate that our methods achieve the Heisenberg limit, in Fig. 3 we plot the error as a function of the total quantum cost for a set of simulations using the methods described above. Each simulation draws a different set of n_φ random phases and a final error δ_c ∈ [10^{−5}, 10^{−2}] (for each choice of δ_c we use the same 50 sets of phases). We plot the error for each phase estimate separately in Fig. 3 (i.e. each simulation corresponds to n_φ points in the plot). As both the total quantum cost and the error differ between simulations, we bin all experiments within a range of T values, and calculate the root-mean-square error and root-mean-square total quantum cost. This gives a good approximation to the accuracy error defined in Def. 3.1 for the restricted data set used.
For the QEEP subroutine, we observe a clear fit of the data (blue points) to a δ ∼ T^{−1} trend, as expected from Thm. 4.5, but with a rather large constant factor; we find T ∼ 10^{10} δ^{−1} for estimating 2 phases and T ∼ 10^{15} δ^{−1} for estimating 4. Further optimization of the QEEP algorithm for these purposes may yet improve this constant factor. However, as the methods of Ref. [3] were not designed for estimating individual phases, it is perhaps to be expected that this method performs somewhat poorly here, so we have not pursued this further.
Simulations using the matrix pencil subroutine outperform simulations using the QEEP subroutine by a factor of 10^4–10^6, and clearly demonstrate Heisenberg-limited scaling δ ∼ T^{−1} as well. We take this result, in place of an analytic proof, as strong numerical evidence for Heisenberg-limited scaling when a version of Alg. 4.1 is constructed with the matrix pencil method as a subroutine. We notice that the error in two phase estimates in the first bin of the matrix pencil method for n_φ = 2 is significantly above the remainder of the population (by about a factor of 100), which blows up our error bars for this bin. Further investigation shows that the two phases in question come from the same simulation and are separated by only 1.5 × 10^{−4}. By contrast, for the simulation in question (at d = 1) our algorithm sets k_1 K ∼ 3 × 10^3 < (1.5 × 10^{−4})^{−1} (where k_1 K is the largest argument at which the phase function g(k) is sampled during this simulation). This implies that our signal lies within the region where improving the estimation accuracy by increasing the number of shots M_d is exponentially hard [32]. In later simulations containing these two phases we see that the estimation error regresses to results similar to all other estimates.

Conclusion
In this work we studied Heisenberg-limited quantum phase estimation using a single control qubit. In this form of phase estimation, we rely on classical signal processing to extract eigenvalue data from the phase function g(k) in Eq. (2).
It has been an open question whether these methods can achieve the Heisenberg limit in the case of multiple phases. Ref. [26] answered this question up to log factors with a Heisenberg-limited Monte Carlo algorithm, providing a sampling of the spectral function A(φ) in Eq. (3) from which to estimate the phases. In this work we answered this question in the affirmative exactly, with a new adaptive multi-order phase estimation algorithm for which we prove Heisenberg scaling when it uses a QEEP phase extraction subroutine. We also numerically demonstrate the performance of this algorithm when, instead of a QEEP subroutine, one uses the matrix pencil method to extract phase estimates from the phase function g(k). This result complements the previous work discussed in the introduction by closing the question of whether there exists a gap between classical post-processing of phase function data and fully quantum phase estimation.
In obtaining our results we encountered at least two details of quantum phase estimation that we have not seen discussed in the literature. The first is the dense-signal limit, Eq. (12) in Thm. 2.5: sampling g(k) at all integer points k = 1, . . . , K is sub-optimal regardless of the method used to process the data. However, we also briefly argued that by picking points among k = 1, . . . , K at random one may go beyond this limit, and one could interpret this as underlying the results obtained in [26], in which such randomized choices for k are taken.
The second is the need for adaptive choices of k_d to solve the phase matching problem. It is unclear to us how far this problem extends; although Lemma 4.2 provides a practical solution, others may still exist. Another open question with respect to Algorithm 4.1 is whether one can remove the need to choose real-valued multipliers κ_d and restrict to integer choices. Restricting to κ_d ∈ N would significantly simplify some technical issues, e.g. the applicability of Lemma 2.4 and the need for shifting phases in step 2 of Alg. 4.1, but we do not know how to prove a version of Lemma 4.2 for κ_d ∈ N. In fact, we do not know whether there is a fundamental difference in performance between only using data obtained with integer k in g(k) versus data obtained with real-valued (in practice rational) k in g(k).
We have assumed in our problem description, Def. 3.1, that the spectrum in the input state is discrete, consisting of n_φ phases with amplitudes above some cut-off. In practice this condition may not be fulfilled, and studying the performance of the algorithms on more typical spectra, induced by many-body Hamiltonians and easy-to-prepare input states, will thus be of interest. The scaling of our (provable) Heisenberg-limited algorithm in n_φ is also rather poor [O(n_φ^{24}) as described], as we have not attempted to optimize this aspect. This should in principle be immediately reducible to O(n_φ^3 δ^{−1} + n_φ^6) under the assumption that the matrix pencil method continues to achieve the dense-sampling limit when estimating multiple eigenvalues; however, we do not know a proof of this. In principle linear scaling with n_φ should be achievable (or even sub-linear if the methods of [33] could be applied in this setting); optimizing this is a clear target for future work.
A direction for future research is to make this algorithm efficient in practice (i.e. improve the parameter dependence and the practical run time), or to devise yet-different Heisenberg-scaling algorithms and examine their performance in the presence of experimentally-noisy signals g(k).

One takes K = O(ε^{−1} ln²(ε^{−1})) and M = Õ(|ln(1 − p)| ε^{−4}) for each k = 1, 2, . . . , K. The total quantum cost for U is then bounded as T = O(MK²) = Õ(|ln(1 − p)| ε^{−6}).
We note that the approximate indicator functions f_l(·) in Eq. (85) are designed to have a quickly decaying Fourier series, which is required to achieve polynomial-time scaling. We also refer the reader to Ref. [23], which has extended the result by relaxing the requirement that f_l(φ) + f_{l−1}(φ) = 1 on the interval V_l ∩ V_{l−1}. The idea of this QEEP algorithm is as follows. Since b_l = Σ_j A_j f_l(φ_j), using Eq. (3), periodically extending f_l(φ) beyond [0, 2π) and Fourier decomposing gives b_l = Σ_j A_j Σ_{k∈Z} e^{ikφ_j} f̂_l(k) = Σ_{k∈Z} g(k) f̂_l(k). At the same time, the fact that f_l(·) is an (approximate) indicator function ensures that b_l ≈ Σ_{φ_j ∈ V_l} A_j. Thus knowledge of g(k) and the Fourier coefficients f̂_l(k) for a range of k allows one to estimate the weights b_l. The requirement to estimate the spectral function to within a 1-norm ε, Eq. (84), is very stringent, hence the scaling of T with the error is quite costly, T = O(ε^{−6}). It is possible that one can improve this scaling by re-examining the analysis in [3].
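The identity b_l = Σ_k f̂_l(k) g(k) can be checked numerically. The paper's f_l are specially designed; the Hann-window bump below merely shares the fast-decaying-Fourier-series property and is our illustrative stand-in (bin location, width, and the cutoff K_max are toy choices):

```python
import numpy as np

# Numerical check of b_l = sum_k fhat_l(k) g(k) for one bin, with a Hann
# bump standing in for the approximate indicator function f_l.
center, width, Kmax = 2.0, 0.8, 40

def f_l(phi):  # smooth approximate indicator of the bin around `center`
    dist = np.minimum(np.abs(phi - center), 2 * np.pi - np.abs(phi - center))
    return np.where(dist < width, 0.5 * (1 + np.cos(np.pi * dist / width)), 0.0)

# Fourier coefficients fhat(k) = (1/2pi) * integral f_l(phi) exp(-i k phi) dphi
N = 4096
grid = 2 * np.pi * np.arange(N) / N
ks = np.arange(-Kmax, Kmax + 1)
fhat = np.array([(f_l(grid) * np.exp(-1j * k * grid)).mean() for k in ks])

phis, amps = np.array([1.9, 4.0]), np.array([0.7, 0.3])
g = np.array([np.sum(amps * np.exp(1j * phis * k)) for k in ks])

b_from_g = np.real(np.sum(g * fhat))    # estimate from the phase function
b_direct = np.sum(amps * f_l(phis))     # sum_j A_j f_l(phi_j)
print(b_from_g, b_direct)               # agree up to Fourier truncation error
```

Truncating the Fourier sum at |k| ≤ K_max is what limits the accuracy, which is why a quickly decaying Fourier series for f_l keeps the required signal length K polynomial in 1/ε.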

A.3 Proof of Lemma 3.2
Lemma 3.2 relates the performance of Alg. A.3 to the performance of the QEEP subroutine; we now prove it. Note that {θ_j} is an ordered list of n_θ phases, while {θ̃_l} is an ordered list that, as we will show, contains at most n_θ estimates.
Proof of Lemma 3.2. Our proof proceeds by showing that the output of Alg. A.3 satisfies these statements whenever Eq. (84) holds. This yields our confidence bound, as Eq. (84) holds with probability p by Def. A.1.
To see that statement 2 holds when Eq. (84) is satisfied, note that if V_l contains no phases θ_j with A_j > A and Eq. (84) is satisfied, then b̃_l < ε < A/3, and l will not be added to the set S in Alg. A.3. This implies that when Eq. (84) holds, if l ∈ S there exists some θ_j ∈ V_l with A_j > A, in which case |θ_j − θ̃_l|_T is at most a constant multiple of the bin width.
To see that statement 3 holds when Eq. (84) is satisfied, note that θ_j ∈ V_l ∩ V_{(l+1) mod L} for exactly one l (and θ_j ∉ V_m for m ≠ l, (l+1) mod L). Then, when Eq. (84) holds, b̃_l + b̃_{(l+1) mod L} > 2A/3, so max(b̃_l, b̃_{(l+1) mod L}) > A/3, and either l, or (l + 1) mod L, or both will be added to the set S during step 2 of Alg. A.3 for each phase θ_j. Then, step 4 of Alg. A.3 will remove (l + 1) mod L if l remains in S, so each phase θ_j can contribute only one final estimate θ̃_{l′}, and the number of estimates is bounded from above by the number of phases.
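The pruning of neighbouring bins can be sketched as follows; this is our own reconstruction of the step-4 rule described in the proof (drop the right neighbour of any bin that remains in S), with toy bin indices:

```python
# Conservative pruning of the candidate-bin set S (reconstruction of the
# rule in Alg. A.3, step 4): (l+1) mod L is removed whenever l remains in S.
def prune(S, L):
    S = set(S)
    for l in sorted(S):                 # snapshot of the original set
        if l in S and ((l + 1) % L) in S:
            S.remove((l + 1) % L)
    return sorted(S)

print(prune({3, 4, 10}, L=12))  # -> [3, 10]
```

A phase straddling the boundary of bins 3 and 4 lights up both; pruning keeps only bin 3, so the phase contributes a single final estimate, as required for statement 3.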
To see that statement 1 holds, we use the point from the previous paragraph that, when Eq. (84) is satisfied, each phase θ_j with A_j > A adds either l, or (l + 1) mod L, or both to the set S during step 2 of Alg. A.3. Then, in step 4 of Alg. A.3, l is removed from S only if (l − 1) mod L remains in S, and (l + 1) mod L is removed from S only if l remains in S. This implies that the distance from θ_j to an estimate is bounded by a constant number of bin widths.

B Proof of Lemma 4.2

Let us first prove the existence of a κ_{d+1} ∈ [2, κ_max] with κ_max = 3 that satisfies our conditions (the proof for k_1 is similar), for small enough ε in Eq. (27). Note that Eq. (27) implies ε ≤ π/300 ≈ 0.01.
Given some pair of estimates φ̃_j^{(d)}, φ̃_l^{(d)} with j < l, let ∆_{j,l} = |φ̃_j^{(d)} − φ̃_l^{(d)}|, and let R_{j,l} be the set of κ_{d+1} such that neither Eq. (37) nor Eq. (36) holds for the chosen phases. We call ∪_{j,l} R_{j,l} the forbidden region, and want to show that we can choose a value for κ_{d+1} ∈ [2, 3] outside this forbidden region if ε is sufficiently small. We do this by bounding the size of the forbidden region from above and showing that it is smaller than the region [2, 3], leaving room to choose κ_{d+1}.
Note that R_{j,l} is nonempty only if ∆_{j,l} obeys a lower bound, using Eq. (88). We may write the set R_{j,l} as the union ∪_n R^{(n)}_{j,l}, where R^{(n)}_{j,l} is the set of κ_{d+1} for which |k_d κ_{d+1} ∆_{j,l}|_T = k_d κ_{d+1} ∆_{j,l} − 2πn. The size of each interval R^{(n)}_{j,l} can then be calculated and bounded. Here, n_max = n_max(j, l) is the largest index of a set R^{(n)}_{j,l} in Eq. (92) for which κ_{d+1} ∈ [2, κ_max]; since κ_{d+1} ≤ 3, Eq. (92) bounds n_max. Now using Eq. (90) and Eq. (88) gives

|R_{j,l}| ≤ [1/(1 − (4ε/(k_d ∆_{j,l}))²)] × 30ε(864000ε² + 248160πε + 11899π²)/(5329π³) ≤ 3ε(864000ε² + 248160πε + 11899π²)/(532π³) ≤ 0.24,

where the last inequality used Eq. (88). As there are n_φ ≥ 2 phases, the length of the total forbidden region ∪_{j,l} R_{j,l} is bounded from above by (n_φ choose 2) max_{j,l} |R_{j,l}|. We want this interval to be, say, at most 1/4, so that by choosing κ_{d+1} randomly we have a 75% chance of not landing in the forbidden interval. For larger n_φ we thus should use a smaller ε, leading to Eq. (27). We now repeat the above approach for the special case of finding the multiplier in the first round, κ_1 = k_1. Consider thus k_1 ∈ [3n_φ, κ_max] with κ_max = 3n_φ + 1. The key difference here is the stricter lower bound on this multiplier, so that ε_0 needs to be chosen smaller, depending on n_φ, namely as expressed in Eq. (26). Given some pair φ̃_j^{(0)}, φ̃_l^{(0)}, j < l, again let ∆_{j,l} = |φ̃_j^{(0)} − φ̃_l^{(0)}|, and let R_{j,l} be the set of k_1 such that neither Eq. (37) nor Eq. (36) holds for the chosen phases. We again call ∪_{j,l} R_{j,l} the forbidden region and want to show that we can choose a value for k_1 ∈ [3n_φ, 3n_φ + 1] outside this forbidden region, assuming that ε_0 is small enough. Note that the logic of the first few inequalities in Eq. (90) still holds in this new calculation, where the second inequality used Eq. (99) and the value of κ_max. Note that for large n_φ this allows ∆_{j,l} to decrease like ∼ 1/n_φ, while previously ∆_{j,l} was lower-bounded by a constant, Eq. (90).
This time, we may write the set R_{j,l} as ∪_n R^{(n)}_{j,l}, where R^{(n)}_{j,l} is the set of k_1 satisfying